Numerical properties of staggered quarks with a taste-dependent mass term

The numerical properties of staggered Dirac operators with a taste-dependent mass term proposed by Adams [1,2] and by Hoelbling [3] are compared with those of ordinary staggered and Wilson Dirac operators. In the free limit and on (quenched) interacting configurations, we consider their topological properties, their spectrum, and the resulting pion mass. Although we also consider the spectral structure, topological properties, locality, and computational cost of an overlap operator with a staggered kernel, we call attention to the possibility of using the Adams and Hoelbling operators without the overlap construction. In particular, the Hoelbling operator could be used to simulate two degenerate flavors without additive mass renormalization, and thus without fine-tuning in the chiral limit.


JHEP04(2012)142
Such formulation, which is one of the various approaches aiming at minimally doubled fermions [24][25][26][27][28][29][30][31], could combine the advantages of the overlap construction with the computational efficiency of a staggered kernel. Furthermore, this formulation appears to be particularly attractive from the point of view of topological properties [1].
Using a staggered operator with a "flavored" mass term as the kernel in an overlap construction is a very appealing idea, but the properties of such operators (with various tastedependent mass terms) are interesting on their own. In fact, while the overlap construction completely removes the need for fine tuning to achieve massless fermions, it still leads to a considerable computational overhead. In contrast, using a staggered operator with tastedependent massà la Wilson requires fine tuning to obtain exactly massless modes, but, by virtue of the reduced size of the operator, may still be a computationally competitive alternative to the usual Wilson discretization, while avoiding the rooting prescription.
This motivation led us to address a numerical investigation of different operators of this type that we present here (preliminary results have appeared in [22,23]). In the following, we present a systematic classification of the possible taste-dependent mass terms, discuss their analytical features in the free theory, and then move on to the interacting case that we study via numerical simulations. We perform an elementary measurement of the pion mass on a set of quenched configurations, and verify the expected PCAC behaviour as one approaches the chiral limit. In an appendix, we also explore the properties of the staggered overlap operator proposed in [1], in comparison with the usual overlap based on the Wilson kernel. In particular, we compare the locality of the operators, and the computational cost of applying them to a vector and of solving for the quark propagator.
The structure of this paper is as follows. First, in section 2 we recall theoretical aspects of the construction of taste-dependent mass terms, and discuss their spectral structure in the free field case. Then, we address the interacting case, presenting our numerical studies in section 3. We summarize our findings and discuss their implications for possible future, large-scale applications of these operators in section 4. Finally, in the appendix A, we report on our study of an overlap operator based on a staggered kernel, as proposed in ref. [1].

Theoretical formulation and general features
The staggered operator [14] with η µ (x) = (−1) ν<µ xν and (V µ ) x,y = U µ (x)δ x+aμ,y , is a computationally very efficient way to discretize the massless QCD Dirac operator on a d-dimensional Euclidean hypercubic lattice of spacing a. This operator is invariant under a global U(1) symmetry, which can be interpreted as a remnant of chiral symmetry: in fact, D KS anticommutes with the operator Γ 55 defined by (Γ 55 ) x,y = (−1) d ν=1 xν . In the free theory, one can easily see that in four dimensions the operator Γ 55 has γ 5 ⊗ γ 5 structure in spin-taste space [33][34][35][36]. The construction of D KS is based on a local spin diagonalization, which, for the four-dimensional JHEP04(2012)142 case, allows one to reduce the number of fermion components by a factor of 4 with respect to the naive operator, and yields four tastes in the continuum limit. The degeneracy between these four tastes is explicitly broken by gauge interactions at finite lattice spacing a, but is recovered in the continuum limit a → 0.
It is highly desirable to preserve the symmetry Γ 55 DΓ 55 = D † , because it guarantees that det D is real, and non-negative (in the absence of real negative eigenvalues), thus avoiding a "sign problem" in the measure [38]. This symmetry is satisfied only if the hermitian mass term connects sites of the same parity. Thus, we do not consider the 1-link or a 3-link mass terms further.
This leaves three possibilities: 0-, 2-and 4-link mass terms. 1 The 0-link mass term corresponds to the usual staggered operator, with a taste-independent bare mass (2. 2) The staggered operator with a 2-link mass term, which was discussed in refs. [3,22,23], can be written in the form where (following the notation of ref. [3])   3)), which includes a taste-dependent mass term with tensor-like structure in taste space (i.e., a 2-link mass term). Right panel: free spectrum of operator D 4 (eq. (2.8)), which includes a taste-dependent mass term with γ 5 structure in taste space (i.e., a 4-link mass term).
Finally, a staggered operator featuring a mass term with γ 5 structure in taste space [1] can be written as while C is the average of four-link parallel transporters joining sites at opposite corners of the elementary lattice hypercubes Note that the mass term appearing on the r.h.s. of eq. (2.8) is hermitian and commutes with Γ 55 . To understand the properties of these three different types of operators it is instructive to start by discussing their spectra in the free limit. The three panels in figure 1 show the structure of the spectrum of eigenvalues for D KS (for D 0 , the spectrum is just trivially shifted by m), for D 2 , and for D 4 in the non-interacting case.
In the free limit the eigenvalues of D 0 on a lattice with N µ sites along the µ direction read

JHEP04(2012)142
with eight degenerate eigenvalues of both signs, and where ε µ = 0 (1/2) if the fermionic field satisfies (anti-)periodic boundary conditions along the µ direction. For D 2 the free eigenvalues take the form (for ρ = 1) and in which the ± signs are chosen independently and the eigenvalues are doubly degenerate, and having defined

15)
where c µ = cos p µ , and c = c 1 c 2 c 3 c 4 . Expanding for small momenta gives 18) so that at low momenta, the eigenmodes corresponding to λ 2 get a mass of ±2, while the eigenmodes corresponding to λ 1 are massless. Finally, the free spectrum of D 4 reads: Note that, in the continuum limit, the point where the spectrum of the D KS operator intersects the real axis corresponds to four massless modes. By contrast, D 2 leads to one mode in each of the two intersections away from the origin, and two at the origin. Finally, for D 4 one obtains two modes at each of the two intersections of the spectrum with the real axis. The taste chirality of the eigenmodes Ψ of D 4 and D 2 , is given by (ΨΓ 55 Γ 5 Ψ), where Γ 55 = γ 5 ⊗ γ 5 exactly, and Γ 5 = γ 5 ⊗ 1 + O(a) in spin ⊗ taste. The taste chirality of the eigenmodes corresponding to eigenvalues λ 1 is c, while that of the λ 2 -eigenvectors is −c. This is also depicted in figure 2; one can see that the taste chirality of the real eigenmodes is ±1, and is the same (+1 or −1) for all modes in a given branch of the D 4 or D 2 spectrum. The implications of a well-defined taste chirality have been stressed in [2]: if Γ 55 Γ 5 ≈ ±1, then Γ 55 ≈ ±Γ 5 , so that the spin chirality of the real eigenmodes can be probed by Γ 55 . This is the reason why the index theorem applies to D 2 and D 4 , while it does not for D KS (where both the ±1 taste chiralities lay on the same single branch.) A shift of the spectra by a real value can thus lead to chiral low-momentum zero modes in each branch, and hence to the possibility of constructing an appropriate index. A common way to study the index consists of looking at the flow of eigenvalues λ(m) of: (2.20)
An alternative way to look at the spectral flow was proposed in ref. [1] for the D 4 operator, by studying the eigenvalues of 2  the latter case, the comparison of the two flow definitions shows that, with the standard definition, the region around the real axis is populated by a large number of eigenvalues, preventing one from identifying the crossing with accuracy. Next, it is interesting to compare the identification of the index, using the spectral flow defined from eq. (2.20), for staggered fermions with a taste-dependent mass term, and for conventional Wilson fermions. This is shown in figure 4: the left, central and right plot in each row show the spectral flow for D 4 , D 2 and a standard Wilson operator, respectively, while the three different rows, from top to bottom, refer to a cold configuration, to a cooled Q = 1 configuration, and to a non-cooled Q = −1 quenched configuration at β = 6. It is interesting to observe that, as expected, the spectral flow on a cooled instanton configuration clearly reveals N f × Q crossings. However, one already sees that in the plots of the β = 6 configuration the gap tends to close. This is especially the case for the D 4 operator, and is related to the properties that will be discussed in section 3.
The overall message that can already be drawn from these observations (before addressing a full-fledged numerical investigation) is that the gauge field fluctuations in interacting configurations reduce the width of the gap in the spectrum, and blur the distinction between light modes and doublers.

Numerical investigation on interacting configurations
As we showed in the previous section, the fluctuations in typical interacting configurations lead to a filling of the gap in the spectral flow for the various lattice Dirac operators that we are considering, making a proper identification of the index difficult. A related effect can also be seen directly in the spectra of the operators: the panels in figure 5  This can be understood from the fact that, since D 4 involves 4-link parallel transporters, it is more sensitive to the gauge field fluctuations in interacting configurations than D 2 , which involves 2-link transporters only. 3 However, for practical applications in large-scale simulations, it is important to remark that, as usual, the effect of gauge fluctuations can be considerably reduced through some suitably optimized smearing procedure.
Next, we considered the effectiveness of these operators for spectroscopy calculations. To this end, we performed a simple test, by studying the mass m P S of the lightest meson in the pseudoscalar channel (the pion). We computed the quark propagator G(x, y, z, t) from a point source, on quenched configurations at β = 6 on a lattice of size 16 3 × 32, then we evaluated the p = 0 correlation function  and extracted am P S searching for the large-time plateau in the effective mass plot, as a function of t. Monitoring the behavior of (am P S ) 2 as a function of (am), one can study the partially conserved axial current and the issues related to mass renormalization. Figure 6 shows the correlators obtained on a free configuration, for D 0 (left panel), for D 2 (central panel) and for D 4 (right panel). As expected, the D 2 operator leads to a massless pion for both am = 0 and am = 2.
For an interacting configuration (at β = 6), the comparison between D 2 and D 4 shown in figure 7 reveals that for D 2 one obtains a massless pion at approximately am ∼ 1.15, while for D 4 the same happens for am ∼ 0.25. Comparing these numbers with the values of the bare masses corresponding to a massless pseudoscalar state in the free limit (1 and 2 respectively), these results give an indication that the mass renormalization is more pronounced for D 4 than for D 2 . Quantitatively, one can observe that the renormalization factor grows exponentially with the length of the parallel transporters used: (0.25/1) 1/4 ∼ (1.15/2) 1/2 , in agreement with the fact that D 4 involves 4-link terms, as opposed to D 2 , in which the mass term is constructed from 2-link terms.
Remarkably, with the D 2 operator, the pion mass shows a square-root behaviour of  three different kinds: one can approach the critical bare quark mass am 0 ∼ 1.15 from the left or from the right, i.e. from the inside or the outside of the D 2 spectrum (the behaviour is square-root-like even though the theory describes one flavour only -it is caused by the approach to the Aoki phase). In addition, one can also approach the other critical quark mass am = 0, corresponding to the central branch of the spectrum, which remains zero as in the free case by symmetry of the average spectrum. The transition from one branch to another seems rather abrupt, and the scaling of the pion mass can be observed over a broad range of quark masses approaching zero. The lesson is that D 2 may provide a cost-effective way to simulate N f = 2 light quark species, without fine-tuning of the bare quark mass to approach the chiral limit.

Conclusions
In this work, we performed a numerical study of staggered Dirac operators with a tastedependent mass term. We restricted our attention to operators including mass terms with tensor or pseudoscalar structure in taste space: their γ 5 -hermiticity properties are such, that their eigenvalues come in complex conjugate pairs (as is the case for the usual staggered Dirac operator), leading to a real fermionic determinant, which is non-negative in the absence of negative real eigenvalues.

JHEP04(2012)142
Such operators were proposed by Adams [1,2] and by Hoelbling [3]. We compared their properties both in the free limit and on interacting configurations at typical values of the gauge coupling.
Our results show that these operators can indeed be used to separate the low-lying modes and reduce the number of tastes, in a way characterized by well-defined topological properties. Our study of the spectral flow reveals that, for the 4-link operator (with a taste-pseudoscalar mass term), the gap in the eigenvalue spectrum tends to close rather early, obstructing an easy identification of the eigenvalue crossings, which are related to the index. As one might have expected, the 2-link operator shows markedly more robustness to gauge fluctuations.
We also performed an elementary study of pion propagators, which shows that the lightest meson is rather easy to isolate without explicitly disentangling spin and taste degrees of freedom. Approaching the chiral limit requires in general the fine-tuning of an additive mass term, as for Wilson fermions. One important exception occurs for the 2-link operator: if one chooses the middle branch of the spectrum, one can study a theory with two tastes, where the additive mass renormalization vanishes due to the symmetry of the spectrum. Therefore, no fine-tuning is needed.
Although the 2-link operator was designed to produce a single taste (with a fine-tuned additive mass), it may well be that its most promising use is to simulate two tastes without additive mass renormalization. Note that the heavy doubler modes do not completely decouple in that situation. In the background of a topological charge Q, they contribute real eigenvalues ∼ (+1/a) Q and (−1/a) Q , making the determinant negative when Q is odd. The θ-parameter is thus equal to π. This sign (−1) Q should be removed by hand (or simply ignored) in order to simulate the θ = 0 theory.
Finally, we studied the properties of an overlap operator with a D 4 kernel (see appendix). We found that its locality properties are similar to those of the operator based on a Wilson kernel. As it concerns the computational cost for a quark propagator calculation, we found that, in the free limit or on very smooth gauge configurations, the inversion of the operator based on a kernel with a four-link mass term is almost one order of magnitude faster than using an overlap with Wilson kernel. However, we also observed a significant loss of efficiency on interacting (quenched) configurations at β = 6, where the operator with the D 4 kernel is only approximately twice as fast as that with a Wilson kernel. The reason for this can probably be traced back to the fact that the four-link transporters in the mass term are more sensitive to the effect of the fluctuations in gauge configurations on coarser lattices. Our crude assessment indicates that this new, staggered, overlap operator does not bring a major computational advantage over a Wilson kernel, while producing two degenerate flavors, but without the full SU(2) flavor symmetry.
Two copies of an overlap operator with a kernel based on Hoelbling's 2-link operator would give more flexibility, e.g. that of simulating two flavors with unequal masses, for a similar computer effort.
Note added. After this paper was completed, a difficulty with the Hoelbling operator D 2 eq. (2.3) was pointed out by Steve Sharpe, and clarified by David Adams, during the

JHEP04(2012)142
Yukawa Institute Workshop "New Types of Fermions on the Lattice". It appears that the Hoelbling operator lacks sufficient rotational symmetry, so that fine-tuned Wilson loop counterterms will presumably be needed to maintain hypercubic rotational symmetry in unquenched simulations. Adams' operator D 4 eq. (2.8) does not suffer from this problem.

A Staggered overlap operator
We also studied the properties of an overlap operator based on a staggered D 4 kernel, as originally proposed in [1]. The construction is completely standard: and leads to two exactly massless physical fermions in the continuum limit, without finetuning. As compared to a conventional overlap operator based on a Wilson kernel, the potential advantages of this construction are related to the reduced kernel size (D 4 is a matrix of size four times smaller than a Wilson operator on the same lattice). We take ρ = 1.
To understand the effectiveness of an overlap operator with a D 4 kernel, the first important issue to be discussed is the locality of the operator. As is well-known, an overlap operator is not ultra-local [41]. Its locality properties can be studied by looking at the decay of its matrix elements between source and sink at sites x and y (which we denote as M x,y , where, for simplicity, we only show the indices corresponding to the site coordinates), as a function of the distance between x and y [42]. To this end, in the two plots at the top of figure 8 we show the decay of |M x,y | against |x−y| 1 , the 1-norm distance (or "Manhattan distance") between the sites x and y, comparing the matrix elements of an overlap operator obtained using a D 4 kernel (left panel) or a conventional Wilson kernel (right panel). Although the D 4 kernel is less local than the Wilson kernel, the locality properties of the corresponding overlap operators are comparable. This appears quite clearly in the plots displayed at the bottom of the figure, in which the results for the two operators are shown together, for a cold configuration (left panel) and for a configuration at β = 6 (right panel).  Another important factor in the efficiency of a lattice Dirac operator is the cost of applying the operator to a vector. The multiplication by the kernel is about twice as fast, if one uses D 4 instead of a Wilson kernel (staggered fields do not have an explicit spinor index but D 4 has twice as many non-zero elements as the Wilson operator). In the computation of the sign ofH = Γ 55 D 4 , using the conjugate gradient (CG) method, and no deflation, the gain with respect to a conventional Wilson kernel is a factor ranging from approximately 2-3 to about 8. However, these numbers are only gross estimates, and could be improved, e.g., by optimizing the parameters of the D 4 kernel. Similarly, an improved form for the kinetic operator, link smearing (for the kinetic and/or the mass term), and standard tricks related to deflation, preconditioning, etc. . . could be applied.
To discuss the computational cost of the inversion of the operator, we compared the two overlap operators on the same pure-glue, β = 6, background on a 12 4 lattice, using the same, basic, inner/outer CG algorithm. In our computation, we evaluated the propagator as the solution of the equation: with ma = 0.1, using a conjugate gradient (CG) iterative solver: at each iteration, sign(H) is applied to a vector v through a 2-pass Lanczos process. One builds a tridiagonal matrix  T and takes the signs of its eigenvalues (which are representative of those of H), then reconstructs sign(H)v, as described in ref. [45]. The results are displayed at the top of figure 8: the three plots (from left to right) show the computational cost for the outer CG iteration, for the matrix-times-vector multiplication, and the total CPU cost. For comparison, we also show the analogous results in the free-field case, in the plots at the bottom of the figure. This comparison shows that the computational advantages expected from elementary arguments, and observed in the free limit, turn out to be dramatically reduced in "realistic" interacting configurations. Again, this reduction points to a reduction in the eigenvalue gap of D 4 .
Open Access. This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.