A novel approach for computing glueball masses and matrix elements in Yang-Mills theories on the lattice

We make use of the global symmetries of the Yang-Mills theory on the lattice to design a new computational strategy for extracting glueball masses and matrix elements which achieves an exponential reduction of the statistical error with respect to standard techniques. By generalizing our previous work on the parity symmetry, the partition function of the theory is decomposed into a sum of path integrals each giving the contribution from multiplets of states with fixed quantum numbers associated to parity, charge conjugation, translations, rotations and central conjugations Z_N^3. Ratios of path integrals and correlation functions can then be computed with a multi-level Monte Carlo integration scheme whose numerical cost, at a fixed statistical precision and at asymptotically large times, increases power-like with the time extent of the lattice. The strategy is implemented for the SU(3) Yang--Mills theory, and a full-fledged computation of the mass and multiplicity of the lightest glueball with vacuum quantum numbers is carried out at a lattice spacing of 0.17 fm.


Introduction
The existence of glueballs is a distinctive property of quantized non-Abelian gauge theories [1][2][3]. Since the first Monte Carlo simulations of lattice field theories, glueballs have been the focus of many studies (see Ref. [4] and reference therein). The main difficulty for isolating their contribution in correlation functions, and thus computing their masses and matrix elements, was identified almost immediately: the signal-to-noise ratio of suitable two-point correlation functions [5] decreases exponentially with the time separation of the sources, and in practice it is very difficult to find a window where statistical and systematic errors are both under control [6,7]. A widely used strategy to mitigate this problem is to reduce the contamination from excited states in the correlators by constructing interpolating fields with a small overlap over them [8,9]. The lowest energy is then extracted at short time-distances by assuming a negligible contribution from excited states, sometimes also with the help of anisotropic lattices [10]. The most comprehensive studies of glueball masses and matrix elements performed with these techniques can be found in Refs. [8,[11][12][13][14][15]. This approach is not entirely satisfactory from a conceptual and a practical point of view. The problem of the exponential degradation of the signal-to-noise ratio remains unsolved, and the functional form of the sources is usually optimized so that correlators or combinations of them show a single exponential decay in the short time-range allowed by the statistical accuracy. A solid evidence that a single state dominates, i.e. a long exponential decay over many orders of magnitude, is thus missing. In most of the cases the computation of more involved correlation functions remains inaccessible.
Recently we proposed a new approach to solve the problem [16,17] (see also [18,19]). By using the transfer matrix formalism, it is possible to introduce in the partition function projectors which select the contributions from states with a given set of quantum numbers only. The composition rules of the corresponding transfer matrix elements can then be exploited to implement a hierarchical multi-level integration procedure. By iterating over several levels, the numerical cost for computing ratios of partition functions or correlation functions grows, at asymptotically large times, with a power of the time extent of the lattice rather than exponentially.
The aim of this paper is to design a new strategy for computing the lightest glueball masses, multiplicities and their matrix elements in each super-selection sector of the Hilbert space of the Yang-Mills theory. This is achieved by generalizing the analysis in Ref. [16] to all discrete symmetries of the lattice theory in finite volume: parity, charge conjugation, rotations, translations and central conjugations Z 3 N . We assume some familiarity with Ref. [16], especially in Section 6, where the numerical algorithm is described. Being based on the transfer matrix formalism, the approach is rather general and can be applied to a wide class of bosonic theories.
We implement the strategy in the SU(3) Yang-Mills theory on lattices with a spacing of roughly 0.17 fm, spatial volumes up to 5 fm 3 , and time extent up to 2 fm. A full-fledged computation of the mass and multiplicity of the lightest glueball with vacuum quantum numbers is then carried out. The algorithm behaves as expected, and the multi-level integration scheme achieves an exponential enhancement of the signalto-noise ratio for the quantities considered. As a result we can extract with confidence the contribution from the lightest glueball and, from a lattice with a time extent of 2 fm, determine its mass with a precision of few percent. Its multiplicity turns out to be one within statistical errors, and the latter are small enough to exclude all other values allowed by the underlying group theory.

Preliminaries and basic notation
For definiteness we focus on the SU(3) Yang-Mills theory discretized by the standard plaquette Wilson action. The theoretical discussion, however, can be applied to other bosonic field theories, and it is mostly independent on the discretization details. The theory is set up on a finite four-dimensional lattice of volume V = T × L 3 with a spacing a and periodic boundary conditions 1 . The gauge-invariant action is defined as (all unexplained notation can be found in Ref. [16]) where the trace is over the color index, and β = 6/g 2 0 with g 0 being the bare coupling constant. The plaquette is defined as a function of the gauge links U µ (x) as with µ, ν = 0, . . . , 3,μ is the unit vector along the direction µ, and x is the space-time coordinate. The path integral is defined as usual where DU is the invariant Haar measure on the SU(3) group, which throughout the paper will be always normalized so that DU = 1. The "coordinate" basis in the Hilbert space of the theory is the set of vectors which diagonalize the field operators at all spatial points, so that on a given time-slice of the latticê As the notation suggests, the operator eigenvalues U k (x 0 , x) are identified with the spatial links on the time-slice x 0 at the spatial coordinate x. The matrix elements of the transfer operatorT are whereP G is the projector onto the gauge invariant states. In the following, for notational simplicity, this projector is always included in the definition ofT. For the Wilson action the transfer matrix elements are [20][21][22][23] T where the explicit form of the Lagrangian L can be found in Appendix A. From this formula one can define the transfer matrix elements associated to a given thick timeslice, i.e. the ensemble of points in the sub-lattice with time coordinates in a given interval [x 0 , y 0 ] and bounded by the equal-time hyper-planes at times x 0 and y 0 , as (2.7) By identifying the gauge transformations Ω in Eq. (2.6) with the links in the temporal direction, the path integral can thus be written as

Decomposition of path integrals
The partition function Z can be decomposed into a sum of path integrals each giving the contribution from multiplets of states which transform as an irreducible representation of a symmetry group of the theory. The phase space of the theory can indeed be divided into regular representations of the group, which in turn can be decomposed into irreducible ones by applying the standard group-theory machinery [24,25]. The invariance of the transfer operatorT under the group transformations then guarantees the decomposition of the partition function. An analogous decomposition applies to the correlation functions of the theory.
To carry out this program in detail, let us assume that the theory is invariant under the transformations of a generic discrete 2 symmetry group G of order g. Its elements R i , with i = 1, . . . , g, act on a generic vector |U of the coordinate basis of the Hilbert space asΓ where U R i is the gauge field obtained by applying the group transformation R i to the original one (see next section). The vectors |U R i form a regular representation, i.e.
The non-equivalent irreducible representations Γ (µ) (R i ) are labeled by µ = 1, . . . , N r , where N r is the number of classes in which the group can be divided, their dimensions n µ satisfy Nr µ=1 n 2 µ = g , (3.4) and their characters are given by For a given realization of an irreducible representation µ, the "projector" operators are defined as usual byP and they satisfŷ In particularP (µ) jj is the projector onto states of the Hilbert space that transform as the component j of an irreducible representation µ. The path integral can then be decomposed as By inserting a complete set of eigenstates of the Hamiltonian we can also write where µ = 1 corresponds to the invariant singlet representation. In these expressions E 0 is the vacuum energy, E

Partition function
The transfer matrix elements among states belonging to irreducible representations can be written as For a thick time-slice the analogous ones are defined by exploiting the orthogonality of the projectors which also implies where the integration is restricted to the active links of the thick time-slice. The relations (3.11) and (3.12) are the basic building blocks for the practical implementation of the multi-level integration algorithm described in Section 6. Finally, by repeatedly using Eq. (3.11), it is possible to rewrite the path integral for a given representation as It is interesting to notice that even though the transfer matrix formalism inspired the construction, the above considerations hold independently of the existence of a positive self-adjoint transfer operator. The insertion of T (µ) jj U x 0 +1 , U x 0 in the path integral plays the same rôle asP

Correlators of composite operators
We are interested in irreducible tensor operators, which under the group transform aŝ is a functional of the links. The transformation rule (3.14) implies that By using the Wigner-Eckart theorem, we can define the reduced matrix elements corresponding to an operator insertion on the time-slice x 0 inside a thick time-slice of size d as k are the Clebsch-Gordan coefficients, k = 1, . . . , n µµ 2 µ 1 and n µµ 2 µ 1 is the number of times that the irreducible representation µ 1 appears in the direct product representation Γ (µ) ⊗ Γ (µ 2 ) . Thanks to the orthogonality properties of the Clebsch-Gordan coefficients, a two-point correlation function can be written as where z 0 +d < w 0 . The generalization to n-point correlation functions is straightforward.

Discrete symmetries of the Yang-Mills theory
The SU(3) Yang-Mills theory on a lattice of finite volume with periodic boundary conditions is invariant under parity, charge conjugation, translations, rotations and central conjugations Z 3 3 . In this section we set the notation for these groups, and we briefly review the properties which are relevant for the paper.

Parity
The group is of order 2. A regular representation is two-dimensional, and it is spanned by the vectors where on a generic time-slice x 0 The two irreducible representations of dimension one are Γ (±) (R 1 ) = ±Γ (±) (R 2 ) = 1, where the phase convention is the same as in Ref. [16].

Charge conjugation
The group is of order 2. A regular representation is two-dimensional and it is spanned by The two irreducible representations of dimension one are Γ (±) (R 1 ) = ±Γ (±) (R 2 ) = 1.

Translations
The group of translations is a direct product of three Abelian groups, one for each space direction. Its elements are labeled by a three dimensional vector of integers m = (m 1 , m 2 , m 3 ), with m i = 0, . . . , L − 1, where each component labels the elements of the Abelian group in the corresponding direction. A regular representation is L 3dimensional and is spanned by Since the group is Abelian, each element forms its own class and there are L 3 inequivalent irreducible representations of dimension 1 which are labeled by momentum vectors p = 2π L [n 1 , n 2 , n 3 ], with n i = 0, . . . , L − 1.

Rotations
The octahedral group is of order 24. Its elements are listed in Appendix B. They form 5 equivalence classes. A regular representation is 24-dimensional and is spanned by The inequivalent irreducible representations are two singlets A 1 and A 2 , one doublet E and two triplets T 1 and T 2 . Their expressions and their characters are given in Appendix B while the Clebsch-Gordan coefficients can be found in Ref. [24,25].

Central conjugations Z 3 3
The presence of this symmetry, first described by 't Hooft in Ref. [26] (see also Ref. [27] for a review), is due to the choice of periodic boundary conditions and it disappears in the infinite volume limit. The group is a direct product of three Z 3 , one for each spatial direction. It is of order 27, and its elements are labeled by a three dimensional vector of integers ν = (ν 1 , ν 2 , ν 3 ), with ν i = 0, 1, 2, where each component labels the elements of the Abelian group in the corresponding direction. A regular representation is 27-dimensional and is spanned by and where W is a 3 × 3 diagonal matrix with elements W αα = (1 − 3 δ α3 ). Since the group is Abelian, each element forms its own class, and there are 27 non-equivalent irreducible ones of dimension 1

Glueball masses, multiplicities and matrix elements
Dynamical properties of glueballs can be extracted from ratios of partition functions and correlators, which in turn can be computed efficiently by judiciously putting together the tools developed in the previous sections. As usual, the various super-selection sectors of the Hilbert space are identified by the quantum numbers associated to a complete set of operators which commute among themselves and with the transfer operator. In the zero-momentum sector (p = 0), glueball states are classified by their transformation properties under rotations (µ, j), parity (P) and charge conjugation (C). Moreover in finite volume a null electric flux vector (e = 0) identifies the "physical sector" of the theory, i.e. the one which survives in the infinite volume limit. The corresponding projectors are given byP The transfer matrix elements associated to a given thick-time slice are defined as T (µ,P,C) jj where χ (µ,P,C) and for each group the sum is on all its elements. The corresponding path integrals are finally given by In each sector, thanks to Eqs. defined analogously as in Eq. (5.5), and extract the mass and the multiplicity of the lightest state. If, in the continuum limit, the mass turns out to be heavier than the one computed in the first step, the latter is the mass of the lightest glueball with vacuum quantum numbers, and it is also the mass gap of the theory.
By generalizing Eq. (3.18) to the case of multiple quantum numbers, correlation functions of composite operators can be calculated analogously to ratios of partition functions discussed above. It is interesting to notice that the approach described here allows to determine efficiently also the mass and the multiplicity of states in the "unphysical" sectors (e = 0) [28].

Numerical algorithm
The formulae in Eqs. (5.3) and (5.5) require some manipulation before they can be implemented in a numerical simulation.

Thick-time slice with fixed quantum numbers
In the computation of the transfer matrix elements (normalized to the standard one) given in Eq. (5.3), the basic building blocks are the ratios Once written as in Eq. (3.12), they are computed numerically through the telescopic algorithm described in Section 4.1 of Ref. [16], the latter being generalized to all possible combinations of group transformations. Their calculation is the most expensive part of the multi-level procedure. It is therefore worthwhile to optimize on the number of them and/or on their numerical precision. To this aim it is relevant to notice that a weighted average appears in Eq. (5.3), with the weights given by the appropriate products of characters. The larger is the number of significant terms averaged over, the lower is the statistical precision required on each term so to achieve a given accuracy on the sum. In the case of the singlet under all symmetry groups, for instance, the precision on each thick-time slice ratio can be reduced proportionally to the square root of the number of addenda. A large number of terms therefore does not automatically implies a more expensive numerical computation. The statistical error on each ratio (6.1), however, cannot be arbitrarily large for these arguments to apply. Therefore it also pays off to reduce the number of ratios to be determined. This can be achieved by promoting each group index to a stochastic variable with integer values. To each of the index combinations is thus associated a probability distribution Π R i ,p,c,m,ν so that Moreover if we sum over a subset of indices, for instance R i , then is still a probability distribution in the remaining variables. If, for instance, we choose Π R i ,p,c,m,ν = const, then the thick-time slice with fixed quantum numbers can be written as T (µ,P,C) jj The sum on the r.h.s can be computed stochastically by extracting, for each thick-time slice, a series of "configurations" for the set of indices (R i , p, c, m, ν) and averaging over them. There is clearly a great freedom in choosing the best sampling procedure. An efficient one is to project exactly on the quantum numbers which eliminate the contributions from states lighter than the one of interest, while treating stochastically the indices which project out heavier states. One can also extract the indices with a distribution different from a constant one, or implement a more involved sampling procedure. Once the estimate of the l.h.s. of Eq. (6.4) is inserted in the multi-level algorithm (see below), the final result is independent from the particular procedure implemented and from the statistical accuracy on the thick time-slice matrix elements. The algorithm is design to be always exact, but the variance and therefore its efficiency depends on details of the implementation.

Multi-level integration scheme
The ratio of partition functions Z (µ,P,C) /Z can then be calculated by implementing the hierarchical two-level integration formula (see Figure 1) In each ratio on the r.h.s of Eq. (6.6) the stochastic indices are generated, independently on each thick-time slice, with chosen distribution Π R i ,p,c,m,ν . The procedure can be generalized easily. For a three-level scheme, for instance, each ratio can be computed with a two-level scheme by exploiting the composition rule in Eq. (3.11). While the result does not depend on the particular integration scheme implemented, its statistical error does. The algorithm therefore requires an optimization which allows one to exploit the expected spectral properties of the theory. For instance if d is chosen large enough, i.e. larger than 1/T c with T c being the critical temperature, only a few of the physical states give a sizeable contribution to each ratio T (µ,P,C) jj . The latter is therefore expected to be of order e −E (µ,P,C) 1 d , the magnitude of the product is of the order of e −E (µ,P,C) 1 T for each configuration of the boundary fields, and the statistical fluctuations are reduced to this level [16,17].

Correlators of composite operators
The formula in Eq. (3.18) and its generalization to the case of multiple quantum numbers can be implemented by inserting the sources in a thick time-slice and then following the same steps that lead to Eq. (6.5). It is important to stress, however, that once all the thick time-slice ratios T (µ,P,C)) jj only the first integral of the telescopic expansion in Eq. (4.2) of Ref. [16] needs to be re-done for the thick time-slices where a source is inserted. The extra numerical burden is therefore negligible. As in the previous subsection, the statistical error associated to the estimate of the correlation function is comparable to the signal if the algorithm is properly optimized.

Numerical results
To demonstrate the practical feasibility of the strategy proposed in this paper, we have carried out a full-fledged computation of the mass and multiplicity of the lightest glueball with vacuum quantum numbers. We have simulated the SU(3) gauge theory at β = 6/g 2 0 = 5.7, which corresponds to a spacing of 0.17 fm if the reference scale r 0 = 0.5 fm is used to calibrate the lattice [29]. The spatial lengths of the lattices are 1.4 and 1.7 fm, while their time dimension extends up to 2 fm. A list of the runs, the number of configurations generated and some details of the multi-level algorithm implemented is reported in Table 1.
The primary quantity that we have calculated numerically is the ratio of partition functions Z (p,+) /Z defined in section 5, with p 2,3 = 0, p 1 = (2π/L) n 1 and n 1 = 1, 2. The thick time-slice transfer matrix elements associated to the projector in Eq. (5.6) have been computed as described in section 6: the sum on the group indices of charge conjugation and translations along direction 1 has been done exactly, while the one over the remaining indices (p 2 , p 3 , ν) has been carried out stochastically by extracting    Table 2, and those of the A lattices are plotted in Figure 2 as a function of T (left panel). The data show a clear exponential decay of the ratio Z (p,+) /Z (0,+) over more than 6 orders of magnitude. We fit the results of the A series to a single exponential, i.e. ln Z (p,+) where A = ln ω + , with ω + being the multiplicity of the state, and B = E (p,+) eff . This function fits well the last four points, and the best fit gives 4 A = −0.6(4) and B = 1.15(6) (χ 2 /dof = 1.5). Group theory predicts the multiplicity to be an integer between 0 and 3. The null value is excluded by the data, the multiplicity 1 is within 1.5σ of the central value given by the fit, while 2 and 3 are 3.2 σ and 4.2σ away. As a further check we also fix the multiplicity to one of the possible integers, and we obtain χ 2 /dof = 1.7, 4.4 and 6.8 for ω + = 1, 2 and 3 respectively. We therefore conclude that the data strongly prefer multiplicity 1, the value expected for a singlet under the octahedral group. We stress once again that the computation of the multiplicity is new because this quantity is not accessible to the standard technique.
By imposing the multiplicity to be 1, the energy of the state can be determined at each T as which yields the results given in Table 2 and shown in the right panel of Figure 2. It is interesting to notice that a precision of a few percent is reached with only 50-100 Lattice Z (0,+) /Z n 1 Z (p,+) /Z Z (p,+) /Z (0,+) E (p,+) eff  configurations of the uppermost algorithmic level. To be on the conservative side, we take as our best estimate for the energy the value at T = 12 reported in Table 2.
The result of the fit to a constant of the last four data points gives 1.233 (14), a value compatible with our best one but with half the statistical error. Finite volume effects in the energy values are expected to be exponentially suppressed at asymptotically large values of L. Lattice B 3 serves the purpose of assessing their magnitude. It has the same lattice spacing of the A series but a linear extension of L = 10. The results for n 1 = 1, 2 are reported in Table 2, and are plotted as a function of the momentum squared in Figure 3. The dashed line is a linear interpolation of the two black points (circles) of the lattice B 3 , while the red point (square) is our best result for the A series. It is rather clear that, within our statistical precision, finite volume effects are not visible in our data. It is also interesting to notice that, even if the values of the momenta are rather large, the continuum dispersion relation is well reproduced within our statistical errors.
The glueball mass can finally be extracted, up to O(a 2 ) discretization errors, by using the continuum dispersion relation which, in units of the lattice spacing, yields to M + = 0.935 ± 0.042 . (7.4) This is one of the main numerical results of this paper. When converted in physical units by using r 0 , it gives M + = 1.08(5) GeV. We remark that the value in Eq. (7.4) is fully compatible with 0.955 (15), the mass computed with the standard method by using the same action at the same lattice spacing [12]. On the other hand its value in physical units is quite smaller than the continuum extrapolated one estimated, for instance, in Ref. [14]. This can be easily explained by discretization effects which still contribute to the value in Eq. (7.4) [12]. Removing them goes beyond the scope of this paper.
To identify the state of mass M + with the 0 ++ glueball we still need to verify that the lightest state in the sector orthogonal to the vacuum, i.e. the one selected by the projector in Eq. (5.7), is heavier. Since the multiplicity turns out to be one, we can further restrict the orthogonal sector by projecting onto the singlet representations of the octahedral group A 1 and A 2 . The ratio of such a partition function over the standard one has been computed in a dedicated run at T = 6 and L = 8, and our best estimate turns out to be 2.6(26) · 10 −4 . This result, even if compatible with zero, puts an upper bound on this quantity much smaller than the one expected if a state lighter than M + were present. This suggests to identify M + with the mass of the A ++ 1 glueball. More simulations are needed to corroborate this result, and to have a solid numerical evidence of the existence of a mass gap in the theory.   Figure 3: The effective energy squared from the A 5 (red square) and the B 3 (black circles) lattices. The line is a linear interpolation of the black points (circles).

Conclusions
The relative contributions to the partition function, due to states carrying a given set of quantum numbers associated with the exact symmetries of a field theory, can be expressed by ratios of path integrals with different boundary conditions in the time direction. From a theoretical point of view these are very clean quantities, which have a finite and universal continuum limit once the bare parameters in the action have been renormalized. From an algorithmic point of view, the composition properties of the projectors can be exploited to implement a hierarchical multi-level integration procedure which solves the problem of the exponential (in time) degradation of the signal-to-noise ratio. The numerical study presented in this paper demonstrates that glueball masses can be extracted from these observables (for the SU(3) Yang-Mills theory) with the present generation of computers. We have been able to follow the exponential decay of a ratio of partition functions dominated by the lightest glueball with vacuum quantum numbers for more than 6 orders of magnitude. A fit to a single exponential in the time range 0.85 -2 fm has allowed us to determine unambiguously the multiplicity of the lightest state for the first time. The mass has then been extracted with a precision of few percent at a time distance of 2 fm, where the contamination from excited states is negligible.
The ideas presented here can also lead to a solid quantitative evidence of the existence of a mass gap in the theory. To this aim a continuum limit extrapolation of the numerical results is mandatory. At present, however, the scaling of the algorithm with the square of the number of spatial points, together with the limited numerical resources at our disposal, prevents us to simulate the theory at small enough lattice spacings for such a limit to be reliably taken. Another intriguing application is the computation of the thermodynamic potentials at non-zero temperature as suggested in Ref. [30]. No vacuum subtraction or renormalization constant is required, and the method can be applied at arbitrary high and low temperatures. The ratios introduced here may also turn out to be useful in the study of QCD-like theories, where isolating the contributions from single states in correlation functions may not be easy.

A Lagrangian of the Wilson action
The Lagrangian L which enters the definition of the transfer matrix elements in Eq. (2.6) is given by where the kinetic and the potential contributions are defined as and respectively.

B Irreducible representations of the octahedral group
In this appendix we report our conventions for the octahedral group and its irreducible representations. The group has 24 elements R i , 3 × 3 orthogonal matrices (R T i R i = 1 1), grouped in five conjugacy classes.   The characters of the various representations are reported in Table 3.