Improved lattice fermion action for heavy quarks

We develop an improved lattice action for heavy quarks based on Brillouin-type fermions, that have excellent energy-momentum dispersion relation. The leading discretization errors of $O(a)$ and $O(a^2)$ are eliminated at tree-level. We carry out a scaling study of this improved Brillouin fermion action on quenched lattices by calculating the charmonium energy-momentum dispersion relation and hyperfine splitting. We present a comparison to standard Wilson fermions and domain-wall fermions.


Introduction
Charm and bottom quarks have substantially shorter Compton wave-lengths than the typical length scale of Quantum Chromodynamics (QCD), 1/Λ QCD . This poses a problem for numerical simulations of QCD on the lattice. The resolution of the lattice, the lattice spacing a, is chosen such that a is sufficiently smaller than 1/Λ QCD while the entire lattice size L has to be much larger than the length scale of the inverse pion mass, the lightest particle in the system. For the lattices that can be generated with currently available computational resources, the charm quark mass m c is similar to 1/a, and the bottom quark mass m b is even larger. This was the motivation for introducing the static or nonrelativistic effective theories for heavy quarks, which allow for disentangling the relevant physical scales in these calculations. The clear scale separation helps in the control of the systematics, but the effective theory approaches require an increasing number of extra terms and tuning their associated parameters in order to achieve more precise calculations.
As an alternative to the effective heavy quark theories, in this work we perform an extensive feasibility study of different relativistic approaches to the heavy quark physics from the lattice.
Treating heavy quarks on the lattice with the conventional relativistic formulation has the advantage that the calculation can be made more and more precise as smaller lattice spacings become available. Currently, the finest lattices have 1/a ≃ 4 GeV and the attempts are being made to raise it to 5-6 GeV in the coming years. The use of the relativistic fermion formulations is therefore a promising option in the near future. For that to be really useful, it is essential to use improved fermion discretizations that allow to make precise predictions even when m is not much smaller than 1/a. One successful example is the Highly Improved Staggered Quark (HISQ) formulation [1], for which the staggered fermion formulation is improved by introducing higher dimensional operators, and the leading discretization error is of order (am) 4 for heavy quarks. This formulation has been applied for a number of calculations of phenomenologically important quantities, such as D (s) and B (s) meson decay constants and other form factors [2][3][4][5][6].
Among other relativistic actions, which do not involve the complication due to the fermion doubling of the staggered fermion formulations, the widely used formulations still contain the discretization effects of O(a 2 ), which have to be eliminated to achieve a similar level of precision to that of the HISQ formulation. This can be done in a systematic way according to the recipe of the Symanzik improvement program [7,8], and some attempts were made in the past [9][10][11][12] but they have not been used extensively except for the minimal one, i.e. the O(a)-improved (or clover) action [9], mainly because the non-perturbative tuning of improvement parameters requires a lot of effort.
The goal of this work is to study the scaling of relativistic heavy-quark formulations in the quenched approximation, before dynamical configurations with similar parameters become available. In particular, we present the scaling study of heavy-heavy meson correlators, while the scaling of the heavy-strange systems will be presented in a future publication.
In this paper, we mainly describe a study of the fermion formulation based on the improved covariant derivative and Laplacian operators [13]. We compare this Brillouin fermion formulation to more standard lattice fermions, such as the non-improved Wilson fermion formulation and the Möbius domain Wall fermions (non-smeared and smeared) [14]. In order to investigate the scaling towards the continuum limit, we generate lattice gauge ensembles in the range of 1/a = 2.0-5.6 GeV in the quenched approximation and perform the measurements of heavy-heavy correlators.
Among many options explored in [13], we consider a combination of the "isotropic" covariant derivative (iso) and "Brillouin" Laplacian (bri). This so-called Brillouin fermion is designed such that the violation of four-dimensional rotational symmetry is minimized. By such modification, it turned out that the energy-momentum dispersion relation of a massless fermion is much closer to the continuum one compared to that of a standard Wilson fermion [13]. In the context of the Symanzik improvement, this is not obvious since the leading discretization error of O(a) remains with this prescription. But, as far as the tree-level dispersion relation is concerned, the improvement seems to be achieved including higher orders of the lattice spacing a. Once the dispersion relation is improved, one can expect that interaction terms are also improved, since the form of fermion and gauge field interaction is highly constrained in the gauge theory. Namely, one simply replaces the tree-level derivative terms by the corresponding covariant derivatives by inserting the gauge links.
In this work, we consider a further improvement of the Brillouin-based fermion formulation according to the Symanzik improvement program. We design the lattice action such that the discretization effects of O(a) and O(a 2 ) are eliminated at the tree-level. With our choice we find that the continuum-like energy-momentum dispersion relation is satisfied very precisely for quark masses up to am ∼ 0.5.
Another virtue of Brillouin fermions can be seen in its eigenvalue distribution in the complex plane. Unlike the standard Wilson fermion formulation, the Brillouin fermion has eigenvalues which lie very closely on the unit circle which the Ginsparg-Wilson relation [15] requires. It suggests that this fermion formulation has an approximate chiral symmetry without explicitly constructing the overlap operator of [16,17]. It also means that the Brillouin-Dirac operator is suitable as a kernel of the overlap operator and relatively small numerical effort is needed to build the overlap operator. We mention this possibility and its improvement beyond O(a 2 ).
The mentioned properties of the Brillouin-type fermions are not guaranteed to be satisfied beyond tree-level, and a non-perturbative study is needed to test the size of the scaling violations in the interacting case. In this work we explicitly check the scaling towards the continuum limit by taking some basic non-perturbative quantities, such as the heavy-meson dispersion relation and hyperfine splitting. This paper is organized as follows. In Section 2 we review the construction of the Brillouin-type fermion and study its improvement according to the Symanzik improvement program. At tree-level, we compare the energy-momentum dispersion relation and complex eigenvalue spectrum of the Dirac operator of various formulations. The improved Brillouin fermion has a limitation on the values of quark mass due to a violation of the reflection positivity property as discussed in Section 3. Section 4 describes a non-perturbative scaling study of the improved Brillouin fermion and its comparison to the standard Wilson fermion and domain-wall fermions. We then conclude in Section 5.

Brillouin operators
The Brillouin-type covariant derivative and Laplacian operators were introduced in [13]. (See also, [18], which introduced similar types of operators in a different context.) We write the lattice Dirac operator as where ∇ µ (n, m) and △(n, m) are the generalized covariant derivative term and Laplacian, respectively. The Sheikholeslami-Wohlert (or clover) term [9] could also be introduced with a coefficient c sw when one introduces the field rotation for the O(a)-improvement, but we do not consider this possibility in this paper. For the standard Wilson fermion, the derivative operators are at tree-level; the gauge interaction is introduced by promoting the hopping terms δ n±μ,m to a covariant derivative including a gauge link. In momentum space, they are given as The leading discretization effects are ones from ∇ std µ of O(a 2 ), as well as those of a△ std , which is O(a). We note that the O(a 2 ) term of ∇ std µ violates rotational and Lorentz symmetry.
In order to make these operators gauge covariant, we have to insert gauge links for each hop. This should be done so that the rotational symmetry under the cubic group is respected. We average the shortest possible paths in the taxi-driver distance. For two-hop terms there are two paths; three-hop terms have six paths. The most complicated four-hop terms have 24 shortest paths to be averaged. For practical implementation of them, see Appendix A.
In momentum space (at tree level), they have the form The derivative operator ∇ iso µ has O(a 2 ) discretization effects which are invariant under rotation, thus the name of "iso".
The Brillouin-type Laplacian (2.8) has the interesting structure that the doublers on the edges of the Brillouin zone have the same mass. Indeed, for (non-zero) momenta ap µ = (±π, ±π, ±π, ±π), the form in the momentum space (2.10) implies that the induced mass is always 2/a. Figure 1 shows△ std and△ bri in two-dimensional space. It apparently shows that the Brillouin-type Laplacian shows a flat tail at the edge of the Brillouin zone. This can also be seen from the eigenvalue spectrum of the Dirac operator. Figure 2 shows the eigenvalues of the Dirac operator D(n, m) for the Wilson and the Brillouin operators calculated on a free gauge field background. The eigenvalues of the Wilson-Dirac operator plotted on a complex plane show five branches on the real axis, corresponding to the doublers of masses 0, 2/a, 4/a, 6/a and 8/a. For the Brillouin operator the doublers are all degenerate at 2/a. Apart from the real axis, the eigenvalues roughly lie on a single orbit, very similar to those of the overlap-Dirac operator. It suggests that the operator is close to the overlap operator and the Ginsparg-Wilson relation is satisfied with good accuracy at least in the free field case. Among similar lattice fermion formulations which involve hopping terms within the 3 4 hypercube [19][20][21][22], the Brillouin fermion is advantageous for both the continuum-like dispersion relation and the eigenvalue spectrum that approximately respects the Ginsparg-Wilson relation.

Tree-level dispersion relation
One useful measures of the discretization effect is the energy-momentum dispersion relation. It is defined through a pole of the fermion propagator, and takes the form E = m 2 + p 2 in the continuum theory. For the lattice Dirac operator (2.2), the pole is a solution of   for specific forms of ∇ µ and △. The poles exist in the Minkowski region that is identified by assigning the "energy" E as p 4 = iE. There are more than one poles due to the doublers which are heavier than the physical mode by O(1/a). In the following we only show the dispersion relation for the physically relevant pole unless otherwise stated.
The tree-level dispersion relations are shown in Figure 3 and 4 for Wilson and Brillouin fermions, respectively. As we have an application to heavy fermions in mind, we show the results for the massive case am = 0.5 (right panel) as well as those in the massless limit (left). Lattice momenta are taken in three directions parallel to (1,0,0), (1,1,0) and (1,1,1), in order to see discretization effects which may violate rotational symmetry. The continuum relation E = m 2 + p 2 is shown by a solid line, as well.
In the massless limit (left panels), the discretization effect is quite significant for Wilson fermions beyond |ap| 0.5, while the dispersion relation for the Brillouin fermion closely follows that of the continuum theory up to |ap| ≃ 1.5. For the massive case, am = 0.5 (right     panels), the deviation from the continuum curve is sizable for both Wilson fermions and Brillouin fermions already at |ap| = 0. If we shift the overall energy such that the dispersion relation agrees with the continuum one as adopted in the non-relativistic effective theory approaches, the deviation would become visible above |ap| ∼ 0.6. Still, the dispersion relation of the massive Brillouin fermion closely follows that of the continuum compared to the Wilson fermion. The closeness to the continuum theory is quantified by Taylorexpanding the dispersion relation. Up to fifth order of a, we obtain for Wilson fermions. The first line corresponds to an expansion of the exact relation aE = ln(1 + am), which contains O(a) discretization effects. The third line represents the terms that violate rotational symmetry. On the other hand, the expansion for the Brillouin fermion gives There is no difference in the first term, since∇ iso µ (ap = 0) =∇ std µ (ap = 0) and△ bri (ap = 0) =△ std (ap = 0). For finite momenta the Brillouin fermion is improved: the coefficient of (ap) 2 does not have terms of O((am) 2 ), and the rotational symmetry violating term is suppressed by another order of a. The second property follows from the fact that∇ iso µ (p) has only an isotropic error at O(a 2 ).

D34 action
One may wonder whether the improvement obtained with the Brillouin fermion might also be achieved by more traditional improved actions which include next-to-nearest neighbor interactions, such as those of Eguchi and Kawamoto [10] or Hamber and Wu [11]. We call them the D34 action following the terminology of [12]. The Dirac operator is given as where c D34 is a free parameter. ∇ std µ is already defined in (2.3) and △ std µ is given by Note that the D34 action is defined without the fermion field rotation. Following the steps of calculating the energy-momentum dispersion, we obtain an expansion for small am and ap up to a 5 as Therefore, it is improved so that there is no O(a) and O(a 2 ) term, as designed, while the Brillouin fermion contains errors of O(a) and O(a 2 ) in the term of vanishing momenta (the first line). In this sense, the D34 is even better. The dispersion relation for the D34 action is shown in Figure 5 for the massless (left panel) and massive (right) cases. (We take c D34 = 1/6 as in [12].) Although they closely follow the continuum curve for small ap, the solution disappears beyond |ap| ∼ 1. It is understood that the solution of the equation (2.11) becomes complex, which is due to the lack of reflection positivity. It is potentially dangerous since the Wick rotation to the Minkowski space is not doable in such a situation and one has to assume that the reflection positivity is recovered if the continuum limit is taken first. It may have a practical problem that some instability occurs at relatively low momenta, especially for the massive case, as we discuss in the following sections.

Improved Brillouin operator
So far, we have shown that Brillouin fermion have some advantageous properties, even though it still contains the discretization effect of O(a). In the following, we attempt to eliminate these leading discretization errors by modifying the action.
Since the O(a 2 ) error of ∇ iso µ keeps the rotational symmetry, its improvement is relatively simple. For instance, we may construct an improved Brillouin action as where we multiply the Laplacian operator from both sides of ∇ iso µ in order to preserve the γ 5 -hermiticity property. The second term is simply squared with an arbitrary (positive) parameter c imp .
This form of the improved action resembles the D34 action, but using ∇ iso µ and △ bri as building blocks the energy-momentum dispersion relation is improved. As shown in Figure 6, the dispersion relation gives a good approximation of the continuum up to |ap| ∼ 1.5. The Taylor expansion gives  Figure 6. Same as Figure 3, but for the improved Brillouin action. which has the leading correction of O(a 3 ) as expected and it does not contain the possible term of O(a 3 ) that violates the rotational symmetry. This is because the building blocks ∇ iso µ and △ bri themselves reduce the Lorentz violating effects. The eigenvalue spectrum of the Dirac operator on the free background gauge field is shown in Figure 7 for the improved Brillouin fermion (blue) together with those of Wilson (red) and Brillouin (green) fermions. The improved Brillouin eigenvalues form a circle structure similar to that of the Brillouin operator, but the circle is slightly squashed and pressed on the imaginary axis and approaches the continuum limit where the eigenvalues are purely imaginary.
The improved Brillouin operator D imp defined in (2.17) involves multiple applications of △ bri , and therefore is numerically more expensive. Instead, we may consider a less expensive operator by using the standard operators for the terms introduced to cancel the O(a 2 ) errors. Namely, we define  .17) is also plotted in green for comparison.
where △ std is the standard lattice Laplacian operator. The energy-momentum dispersion relation for this modified operator is shown in Figure 8. Unlike the original improved Brillouin action (2.17), the departure from the continuum relation is apparent already around a|p| 1.2. Furthermore, the eigenvalue distribution shown in Figure 9 demonstrates that the doubler spectrum splits as in the standard Wilson fermion. It is therefore expected that it requires more conjugate gradient iterations than the original improved Brillouin action to obtain the inverse. (See the discussions at the end of this section.)

Overlap operators
Since the Brillouin-Dirac operator has an eigenvalue distribution very similar to that of overlap fermions as demonstrated in Figure 2, it may be an interesting option to use it as a kernel operator for the overlap-Dirac operator. Projection of eigenvalues to the unit circle in the complex plane would then require minimal numerical effort, i.e. the order of the Chebyshev polynomial or the Zolotarev rational function is relatively lower. Another advantage of the overlap fermion is that the discretization effect of the massless Dirac operator is restricted to even powers of a due to its exact chiral symmetry. For instance, if the standard Wilson-Dirac operator is used as a kernel of the overlap construction, the O(a) error of Wilson fermions is eliminated and the leading error becomes O(a 2 ). If the kernel operator is improved up to O(a 2 ), then the discretization effect of the corresponding overlap operator starts from O(a 4 ). The massless overlap-Dirac operator can be defined as where X is a kernel operator with a large (negative) mass ρ and R is often taken to be proportional to the unit matrix. Then, D ov satisfies the Ginsparg-Wilson relation {D ov , γ 5 } = RaD ov γ 5 D ov . Introducing a mass, the operator is modified to It is straight-forward to write down the propagator and solve the pole to obtain the energy-momentum dispersion relation. With the standard Wilson kernel and ρ = 1, the relation at ap = 0 is This implies that the leading discretization effect is indeed O(a 2 ). For finite momenta, we plot the dispersion relation in Figure 10. One can see that the dispersion relation is very similar to that of the kernel operator, which is in this case the Wilson-Dirac operator, shown in Figure 3. With the Brillouin operator as a kernel, the dispersion relation is improved as shown in Figure 11. Improving the overlap fermion action beyond the O(a 2 ) discretization effects, one has to modify the construction of the overlap operator of (2.20), because the Ginsparg-Wilson relation of the form {D ov , where D ov is that of (2.20). In order to eliminate the O(a 2 ) effects, it has to be used with an O(a 2 ) improved kernel operator. We calculate the dispersion relation for this improved overlap-Dirac operator with the improved Brillouin operator as a kernel. The result is shown in Figure 12, where we observe that a good approximation for the dispersion relation is maintained up to |ap| ∼ π/2. At zero spatial momentum, the relation E = m is satisfied up to an error of O(a 4 ). The overlap operator (2.20) is usually constructed using a rational approximation, which is numerically expensive. The number of terms to be included in the rational approximation depends on the range of eigenvalues to be treated and on the desired precision. When the kernel operator is already close to the overlap operator as in the case with the Brillouin operator, it is expected that a minimal order of the rational function would achieve a sufficient level of approximation. This property is still to be confirmed with actual numerical calculations.

Numerical cost
Although the advantage of the Brillouin-type Dirac operators is clear, its numerical cost is substantially higher than that of the standard (or improved) Wilson fermion action. This is simply because the isotropic derivative and the Brillouin Laplacian involves an interaction to 3 4 − 1 = 80 neighboring points, which is ten times larger than the number of nearest neighbor points, 4 × 2 = 8. Moreover, the reduction of numerical operation by a factor of two through taking advantage of the special γ matrix combination 1±γ µ works only for the Wilson fermion. Therefore, we expect at least twenty times larger computational costs for the Brillouin operator, and in practice it is several times more, especially when we use the O(a 2 )-improved version in (2.17). Therefore, in practical applications the improved Brillouin fermion could be used only for heavy quarks, for which the fermion matrix inversion can be carried out with small number of conjugate gradient iterations. In Figure 13 we compare the numerical costs for various Dirac operators by measuring the elapsed time to solve the heavy quark propagator corresponding to the pseudo-scalar meson mass m P S ≃ 3.0 GeV. A quenched 16 3 × 32 lattice of 1/a ≃ 2.0 GeV is chosen for the test and the numerical computation is done on a 32-node partition of the IBM Blue Gene/Q machine. For the solver we employ the conjugate gradient method for D † (m)D(m). From Figure 13 we can see that the Brillouin fermion takes only five-times more time than Wilson fermions does, despite the above expectation. Likewise, the improved Brillouin fermion is only ten-times slower than Wilson fermions. For the Brillouin fermion, two implementations are attempted, i.e. the overall smearing strategy (OSS) and the recursive formula (Rec) as described in the Appendix A. The OSS implementation has an additional cost, which we did not account for here, due to an uncounted cost to setup diagonal gauge links.
We also notice that the performance of the computation on the Blue Gene/Q is different for different fermion actions. In our implementation, the number of floating point operation per second (GFlops) per node is about 4.0 for Wilson fermions, while that for other actions is around 10, which is compared to the peak performance 200 GFlops, because they are more compute-intensive. The elapsed time is thus relatively shorter for the actions other than Wilson. (In other applications, the JLQCD collaboration uses a highly optimized code for the Wilson fermion, which performs much better than 15 GFlops depending on the condition, but for the comparison in Figure 13 we used a more primitive version of the Wilson-Dirac operator in order to make a fair comparison. Optimization of the Brillouin operators is yet to be done.) This relative speed-up of the Brillouin fermion is explained by the number of conjugate gradient iterations to converge. Figure 14 shows the squared norm of the residual vector at every conjugate gradient iteration steps for Wilson fermions, Brillouin fermions, the improved Brillouin fermion and the domain-wall fermion. The number of iterations is clearly smaller for the Brillouin-type fermions by more than a factor of two. This explains why the Brillouin-type fermions are not as slow as we naively expect. It comes from the fact that the largest eigenvalue of |D(0)| is 2 for the Brillouin operator rather than 8 of Wilson fermions. The condition number of the matrix D(m) is thus four-times smaller for the Brillouin-type fermion.

Limitation on the quark mass for the improved actions
In general, higher derivative terms may give rise to some unphysical poles [12], which are sometimes called "ghost" or "lattice ghost." If the mass of the ghosts are sufficiently large, no physical effect can be observed, but once they come close to the physical pole, ghosts may distort the physical solution. Practically, it appears as an oscillatory behavior of the Euclidean correlator. For instance, Figure 15 shows the pseudo-scalar meson correlators calculated with improved Brillouin fermions at various quark masses up to am = 3.0. For am 1.5, one finds that the correlator is no longer a simple exponential function but is oscillating. Once this happens, the Wick rotation back to the Minkowski space can not be performed. This problem typically shows up only when the improvement including nextto-nearest neighbor interactions is introduced and when the bare quark mass am is large. Therefore there is an upper limit on am to avoid such sickness. One has to be careful, because the problem may be hidden even when the resulting correlation function does not show the oscillatory behavior.
At tree-level, we calculate the energy of the physical pole as well as that corresponding to the lightest doubler. Figure 16 shows those for zero spatial momentum as a function of am. The energy from the physical pole follows the expectation E(|p| = 0) ≃ m, up to am = 0.6-0.7. On the other hand, the doubler mass slightly decreases for larger am and comes close to the physical pole near am ≃ 0.84. Beyond this value, the two poles merge and transform to a complex-conjugate pair, which indicates the "ghost" as discussed above. (The position of the "merging" point depends on the details of the action, and can in fact be slightly pushed to am ≃ 0.97 for c imp = 1/16 instead of c imp = 1/8.) In order to avoid unwanted effects due to the doubles and ghosts, we need to keep their energy sufficiently higher than the physical mode. By requiring a "gap" of O(1/a), the upper limit of the heavy quark mass would be 0.5-0.6, according to the tree-level analysis shown in Figure 16. The effect of the doublers/ghosts on non-perturbative physical observables may appear as a larger scaling violation. Such a symptom will be discussed in the end of the next section.

Scaling studies on quenched configurations
In this section, we describe a non-perturbative scaling study of the improved Brillouin fermion as well as the standard Wilson and domain-wall fermions towards the continuum limit. We monitor the energy-momentum dispersion relation and hyperfine splitting of charmonium-like heavy-heavy mesons on a set of quenched lattices of inverse lattice spacing between 2.0 and 5.6 GeV.

Lattice parameters
We generate a set of quenched lattices with the tree-level Symanzik gauge action at β = 4.41, 4.66, 4.89 and 5.20 as summarized in Table 1. All lattices have a roughly constant spatial volume L 3 with L ≃ 1.6 fm. These lattices have inverse lattice spacing between 2.0 GeV and 5.6 GeV. The lattice spacing is fixed through the gradient flow using an input w 0 = 0.176(2) fm [24]. The lattice spacing determined from the Sommer scale r 0 = 0.49 fm is also listed for three coarser lattices. All ensembles are generated with the heatbath algorithm and the measurement is carried out on gauge configurations separated by N sep heatbath sweeps, so that the auto-correlation can be safely neglected. on the gauge configurations before using for the measurement of the heavy-heavy meson correlators (except for that of the "unsmeared" domain-wall fermion, as described below).
To be explicit, we employ stout smearing [25] with a parameter α = 0.1 and repeat it three times.
We study the continuum scaling of the improved Brillouin fermion defined by (2.17) with c imp = 1/8. For comparison, we also employ the standard Wilson fermion and the Möbius domain-wall fermion. For Möbius domain-wall fermions, we chose two options: smear or unsmear the gauge links. Möbius domain-wall fermions are essentially the same as the conventional domain-wall fermions, but they are designed to achieve much smaller violation of chiral symmetry at a fixed length L s in the fifth dimension [14]. We chose the scale factor b 5 + c 5 to be 2.0 and L s = 8 with the domain-wall height M 0 = 1.0 for the smeared domain-wall fermion. The residual breaking of chiral symmetry in this setup is found to be O(1 MeV) [26].
In the following we first describe the measurements with the smeared gauge link. Another set of measurements with unsmeared domain-wall fermion is separately discussed below. We tune the quark masses so that pseudo-scalar meson masses become close to our target values m P S = 1.0, 1.5, 2.0, 2.5, 3.0 and 3.5 GeV for all three fermion formulations. The numerical results are interpolated to these target values before comparing the final results. Since the length of this interpolation is tiny, we only use a linear function between nearest two data points.
We calculate heavy-heavy meson correlators from four different source points in the time direction. We smear the source with a function e −µsmr r with a parameter µ smr tuned for each mass to obtain better saturation of the ground state. The gauge configurations are fixed to Coulomb gauge. The mass and smearing parameter are listed in Table 2. The effective masses for the ground state pseudoscalar and vector mesons are shown in Figure 17. The data corresponding to m P S ≃ 3.0 GeV calculated on the β = 4.89 lattice are taken as a typical example. We fit the lattice data with a single exponential function (plus the term representing the contribution from the other temporal direction) in a range [t min , t max ] = [20,32]. To estimate the systematic error due to the fits, we repeat the fit with larger t min 's until t min = 29 and take the variation of their central values as the size of systematic error. This similar procedure is applied for other β values and for all masses. The fit results and associated error are also shown in the plots by horizontal lines.
In the case of the unsmeared Möbius domain-wall fermion a slightly different strategy  Table 2. Mass parameters given as inputs for each calculation. m P S is a target heavy-heavy (pseudo-scalar) meson mass. The bare mass parameter am is listed for each fermion formulation: Wilson fermions, improved Brillouin fermions (Imp. Bri.), and smeared domain-wall fermions (DW). For unsmeared domain-wall fermions, see Appendix B. The gauge links are smeared in these measurements as described in the text. µ smr stands for a parameter appearing in the exponential function e −µsmr r to define the source. Since the critical mass is not subtracted, the bare mass for Wilson fermions (and for Imp. Bri.) can be negative. and set of parameters are chosen. Since the unsmeared gauge links are relatively rough, we take a larger value of L s (L s = 12) with M 0 = 1.6. We work with two different source types (point and complex Z 2 -wall source [27]). For each source type, we also calculate the quark propagator with a gauge-covariant Gaussian smearing applied on either source or sink (or both). We take many quark masses as described in Appendix B. The results of the correlator fits are interpolated to the same reference pseudo-scalar masses in Table 2. This interpolation is carried out with two different ansatzes: a linear interpolation between the nearest two and a quadratic interpolation between the nearest three simulated data points. The spread of the central values between the two different approaches gives rise to a systematic error that has been taken into account in the analysis of the continuum extrapolation. In the data for the hyperfine splitting we see a variation of the central value by up to one sigma when varying the fit-range for the two point functions over a wide range. In order to remain on the conservative side we attach a systematic error of the size of the statistical error.

Speed-of-light for pseudo-scalar meson
The effective speed-of-light c eff can be defined as which is unity in the continuum theory and therefore gives a useful measure of the violation of Lorentz symmetry. We calculate the energy with lattice momenta ap of (1,0,0), (1,1,0) and (1,1,1) in units of 2π/L, where L is almost the same for our ensembles. Effective masses for these finite momentum correlators are shown in Figure 18.
As mentioned previously, the systematic error from the mass interpolation needs to be taken into account for the case of the unsmeared domain-wall fermion data. We interpolate the lattice data with various ansätze and take the spread of the central values as the systematic error. We find that this systematic error is subleading to the statistical error in all cases, but is particularly large for the case of the coarsest ensemble. The reason for this lies in the fact that we had to slightly extrapolate to reach m PS = 3.0 GeV, causing the systematic error to be larger than on the other ensembles. This systematic error is added in quadrature to the statistical error before performing the continuum limit.
The effective speed-of-light thus calculated is plotted as a function of |p| 2 /(2π/L) 2 in Figure 19 for the heavy-heavy pseudo-scalar mesons of mass m P S = 1.5 GeV (left) and 3.0 GeV (right). The results on the coarsest lattice (at β = 4.41) show substantial deviation from the continuum relation c 2 eff (p) = 1 for Wilson and domain-wall fermions. These two formulations are close to each other as far as the energy-momentum dispersion relation is concerned as the tree-level analysis suggests. For the lighter meson of m P S = 1.5 GeV, the deviation is already significant 5-10%; for 3.0 GeV, which is close to charmonium, it is greater than 20%. The data for the improved Brillouin fermion are consistent with unity for both 1.5 and 3.0 GeV mesons. We calculated at four other heavy meson masses as listed in Table 2, and found that the results with the improved Brillouin fermion are always consistent with unity within the error.
We show the scaling of the speed-of-light in Figure 20 against the lattice spacing a. The data for momentum |p|L/2π = 1 are shown; higher momenta are similar but have larger error bars. As one can see, for the heavy-heavy meson of mass m P S = 3.0 GeV, no deviation from the continuum relation c 2 eff (p) = 1 is found with the improved Brillouin fermion in the range of lattice spacing a 0.1 fm. The results with the Wilson and domain-wall fermions are similar. A substantial deviation of 20-30% is found on the coarsest lattice, which decreases to the level of 5% at a ≃ 0.05 fm. This would be a typical size of discretization error for these lattice formulations, unless other theoretical constrains such as that of nonrelativistic effective theory [28] are introduced.
In order to quantify the size of scaling violations, we attempt to model the discretization effect using the data at |p| = 2π/L. Assuming that the continuum relation c 2 eff (p) = 1 is recovered in the continuum limit, we employ an ansätz f (a) = 1 + c 1 a + c 2 a 2 for Wilson fermions. For smeared and unsmeared domain-wall fermions, an ansätz f (a) = 1 + c 2 a 2 + c 4 a 4 is used instead, because the O(a) and O(a 3 ) terms are forbidden by chiral symmetry.   Figure 20. These results suggest that the coefficients have a reasonable size of O(1 GeV) or less. The coefficient for the improved Brillouin fermion is essentially zero even at the order of a 3 .
Since improved Brillouin fermions are designed to achieve O(a) and O(a 2 ) improvement only in the free theory, there is a possibility that significant contributions of O(a) and O(a 2 ) appear due to radiative corrections. A naive order-counting suggests that their size is O(α s a) or O(α s a 2 ), respectively. Assuming α s ∼ 0.2-0.3, these contributions are not negligible. The small scaling violation of the actual data may suggest that these effects are small, which is consistent with an expectation that the radiative corrections are relatively small in general when using link smearing [29]. An explicit perturbative calculation to confirm this expectation is on-going.

Hyperfine splitting
The hyperfine splitting m V − m P S is also an interesting quantity to investigate scaling violations. In the non-relativistic effective theory, it arises from the Pauli term of the form ψ † σ · Bψ. As this term has the same form as the clover term of the O(a)-improved action, it is expected that the hyperfine splitting is very sensitive to the O(a) discretization effects and also possibly to higher order effects.
We show the scaling of m V − m P S in Figure 21 at m P S = 3.0 GeV. For this quantity, the value in the continuum limit is not known. Experimentally, the charmonium hyperfine splitting is 117 MeV, but in the quenched theory it could be significantly different from this value. We therefore do not assume that the continuum limit of the lattice data will reproduce the experimental value. The result in Figure 21 clearly shows substantial scaling violations for the Wilson fermion. The splitting is several times smaller than the results from other formulations. Particularly on the coarsest lattice this can be seen (a ≃ 0.1 fm). This is in accordance with the expectation that the hyperfine splitting is sensitive to O(a) effects. With domain-wall fermions (both smeared and unsmeared), we also find a significant discretization effect, though much less severe than in the case of Wilson fermions. In contrast, the result with improved Brillouin fermions shows very mild a dependence. From their results alone, one cannot tell any sign of the discretization effects.
Here again, we examine the scaling violation by fitting the lattice data as a function of a. Since the value in the continuum limit is unknown, we assume that four lattice formulation give the universal result in the continuum limit and fit all the data simultaneously. For Wilson fermions we employ f (a) = c 0 + c 1 a + c 2 a 2 , while for smeared and unsmeared the domain-wall fermions we take f (a) = c 0 + c 2 a 2 + c 4 a 4 as constrained by chiral symmetry. For improved Brillouin fermions, we first attempt a fit with a function f (a) = c 0 + c 3 a 3 . As shown in Figure 21, a combined fit of four formulations is unsuccessful (χ 2 /d.o.f. ≃ 3.8). The continuum limit of this fit yields c 0 = 0.068(1) GeV.
It may indicate that the improved Brillouin action receives significant radiative correction at O(a) (and O(a 2 )). Therefore we also try to fit with f (a) = c 0 + c 1 a + c 2 a 2 for the action as that for Wilson fermions. The quality of the fit is better (χ 2 /d.o.f.    domain-wall fermion at a = 0.5 GeV −1 shows significantly larger deviation from the continuum limit. Finally, we study the scaling of the hyperfine splitting for different sets of heavy quark masses, in order to check the discretization effects for the improved Brillouin fermion action. Figure 23 shows the hyperfine splitting for various heavy-heavy meson masses plotted against the lattice spacing a. A good scaling is observed in general, but focusing on the heaviest mass (m P S = 3.5 GeV) we observe a very strong scaling violation for the coarsest lattice point a ≃ 0.5 GeV −1 . At this lattice spacing, the natural heavy quark scaling that the hyperfine splitting decreases for heavier masses is lost between m P S = 3.0 GeV and 3.5 GeV. This may indicate the problem of the improved action for too large am as discussed in Section 3. The bare mass for these two masses is 0.55 and 0.67, respectively, which is in the mass range where the effects of doublers could become significant.

Conclusion
The energy-momentum dispersion relation is a key property of relativistic field theories. On the lattice, the Lorentz symmetry is explicitly broken by the lattice discretization and the continuum dispersion relation is expected to be recovered only in the continuum limit. For practical applications, how fast one can approach the continuum limit becomes a crucial question; while charm quarks can be simulated with moderately large values of the input quark mass in lattice units am on ensembles available today, the bottom quark mass cannot be made much less than 0.5-1.0. It is therefore important to design a lattice formulation that respects the symmetry relation of the continuum theory as much as possible. The Brillouin fermion is among such class of fermion formulations, and in this paper we consider its further improvement and carry out the corresponding numerical tests.
The improved Brillouin fermion defined by (2.17) has the energy-momentum dispersion relation which is close to the continuum one. This is confirmed at the tree-level for the massless case (am = 0) and the massive case (am = 0.5). The leading discretization effect is O(a 3 ), but as far as the dispersion relation is concerned, the actual error seems to be much smaller than a naive estimate O((am) 4 ) ∼ 13% for am = 0.5. Through the radiative correction, the terms of O(aα s ) and O(a 2 α s ) are induced, and their effects have to be carefully examined. In our non-perturbative test in the quenched theory, the discretization effect is insignificant on the lattices of 1/a = 2.0-5.6 GeV and charm quark mass m ≃ 1.3 GeV. The sign of cancelling O(a) and O(a 2 ) effects found in the hyperfine splitting needs to be studied more carefully, though. The most direct approach would be to calculate on finer lattices. We also to plan to inspect the size of corresponding one-loop correction. In successful, the action is a promising candidate for the simulation of charm quark on more realistic unquenched lattices.
For more extensive calculations, the numerical cost would become an important issue. With our current implementation, the inversion of charm quark propagators takes 2-3 times more time than for the inversion of the domain-wall fermion. It would be worth spending such numerical cost given the highly suppressed discretization effects, but further improvement of the numerical code is certainly desired.
Another important conclusion from our analysis is that the continuum extrapolation with the (smeared and unsmeared) domain-wall fermion is possible for charm quark, provided that the data are available in the region beyond 1/a 3.0 GeV. A combined fit including other formulation as done in this work would be useful to have better control of systematic effects.

(A.2)
Note that the Brillouin fermion has 80 nearest-neighbors within a 3 4 hypercube, and thus 80 off-axis link variables have to be prepared. Calculation of these off-axis links can be done before the conjugate gradient iterations start, as the link variable is unchanged during the solver steps. This method is called "overall smearing strategy (OSS)" [13]. We also consider an alternative method to calculate the covariant Brillouin Laplacian and the isotropic derivative operators with gauge fields. We use the following recursive definition.  Table 3.
Simulated bare quark mass in lattice units for the unsmeared Möbius domain-wall fermion. The masses starting from "start" with a step of "step" and ending at "end" are simulated. The "smearing parameter" refers to the choice of the smearing parameter for the Gaussian smearing of the source/sink of the propagators. The first block corresponds to the dispersion relation measurements with a point source, the second block to the hyperfine splitting measurements with Z 2 -wall source. All measurements we carried out with the unsmeared Möbius domain-wall fermion are with parameters L s = 12 and M 0 = 1.6.