1 Introduction

An interesting approach to studying quantum chromodynamics (QCD) is to treat the order of the gauge group \(N\) as a free parameter. As shown by ’t Hooft [1], the theory simplifies in many ways when one takes the large \(N\) limit of the perturbative weak coupling expansion, and theories at finite \(N\) can be treated as corrections in the “small” parameter \(1/N\). Moreover, the large \(N\) expansion predicts that quark loop effects are suppressed by a power of \(1/N\), so that the weak coupling expansion of large \(N\) QCD is dominated by planar diagrams with purely gluonic internal loops. These remarkable properties make large \(N\) QCD an interesting theory to study not only from the theoretical perspective, but also from a practical point of view, as results for real world QCD could be obtained by considering corrections to the \(N= \infty \) theory which are parametrized by powers of \(1/N\).

Although this \(1/N\) scaling is obtained perturbatively, lattice computations provide evidence that it also holds at the non-perturbative level, both in \(D=4\) space-time dimensions [2,3,4,5,6,7,8,9] and in \(D=3\) [9,10,11,12,13,14]. The evidence is usually based on complicated observables, where typically one needs to project onto ground states by large Euclidean times. It is then difficult to obtain high precision at various \(N\) in order to verify ’t Hooft scaling with good confidence. Let us stress the fact that the validity of the \(1/N\) scaling, beyond the weak coupling expansion, is not a trivial statement. Hence, it is desirable to test it by means of lattice simulations and with statistically and systematically very precise observables.

Perturbatively, if one carries on with the ’t Hooft \(1/N\) topological expansion, another simplification arises, which has to do with the property of factorization

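$$\begin{aligned} \langle {\mathcal {O}}_1 {\mathcal {O}}_2 \rangle = \langle {\mathcal {O}}_1 \rangle \langle {\mathcal {O}}_2 \rangle + \mathrm{O}(1/N^2), \end{aligned}$$
(1.1)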

where the \({\mathcal {O}}_i\) are local gauge invariant or Wilson loop operators, and the leading correction scales as \(1/N^2\) in the pure gauge theory, which we focus on for the rest of this work. Eq. (1.1) has several consequences, as it tells us that in the large \(N\) limit, the dominant part of a correlator is the disconnected one. In particular, when \({\mathcal {O}}_1 = {\mathcal {O}}_2\), this means that fluctuations are suppressed; and as discussed in Ref. [15], this fact can be put in analogy with the classical limit of a quantum theory, where \(1/N\) plays the role of \(\hbar \). Related to this is also the concept of the “master field”, i.e., the idea that the path integral is dominated by a single gauge configuration (or rather a gauge orbit) [16, 17]. Although these ideas raised hopes of finding the solution of large \(N\) QCD, such an analytical solution is still lacking today. The situation in the Yang–Mills theories is similar in this respect to two-dimensional \({\mathrm {SU}}(N) \times {\mathrm {SU}}(N)\) spin models [18], while for \(\mathrm{O}(N)\) and \(\mathrm{CP}(N)\) models the large \(N\) limit is solvable and one can therefore really carry out the expansion [19,20,21].

One more aspect where Eq. (1.1) plays a crucial role is volume independence, which, starting from the work in Ref. [22], has been used in the lattice formulation to study the large \(N\) limit of the Yang–Mills theory by performing simulations in small spacetime volumes [23, 24], and even on single site lattices, provided a clever choice of boundary conditions [25,26,27] is made.

The above indicates that factorization is relevant not only in the theoretical context, but also on the practical level, as it is a requirement for the single site lattice simulations to be valid. To be more precise, the equivalence is expected to hold between the single site and the infinite volume theory in the \(N\rightarrow \infty \) limit. The equivalence is argued for on the basis of the Makeenko–Migdal loop equations [28] on the lattice. As originally shown in Ref. [22], the loop equations in both theories are equivalent, provided that the expectation values of products of Wilson loops factorize as stated in Eq. (1.1). Let us mention that shortly after volume reduction was put forward, it became clear that the situation is more complicated, and that phase transitions spoil the reduction in its simplest form [23,24,25, 29,30,31]. Workarounds for this issue have been presented in the literature [23, 25,26,27] and show that either full or partial volume reduction is a possible way to make simulations of \({\mathrm {SU}}(N)\) Yang–Mills theory at large \(N\) more accessible, as the extra cost of increasing \(N\) is significantly compensated by the much smaller number of lattice sites.

Additionally, we would like to point out that important physics is contained in the corrections to factorization. The most obvious one is that glueball masses are obtained from the connected correlation functions of Wilson loops.

The previous discussion motivates the search for a proof of factorization beyond the realm of weak coupling perturbation theory. Several authors have investigated factorization beyond perturbation theory [32,33,34,35]; and as mentioned earlier, lattice simulations also suggest this result to be valid. In particular, the results presented in Ref. [5], Sect. 6, provide strong evidence for factorization. Here we go beyond this. We consider several high precision normalized observables by using the gradient flow to study the large \(N\) scaling, and address the important issue of the size of the correction to \(N= \infty \) in a non-perturbative lattice computation.

This paper is organized as follows. In Sect. 2 we present the observables that are used to check both the large \(N\) scaling and factorization. In Sect. 3 we discuss different ways of defining the large \(N\) limit and in particular the two choices we made for our investigation. In Sect. 4 we describe the ensembles and lattice parameters used for the simulations, and in Sect. 5 we present our results, both at finite lattice spacing and in the continuum limit. We finish with a short summary of the results.

2 Observables

The basic observables we consider are the Yang–Mills action density E(t) at positive flow time [36] (defined below) and rectangular Wilson loop operators

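$$\begin{aligned} W_C = \frac{1}{N}\, {\mathrm {tr}}\, P \exp \left( \oint _C \mathrm {d}x_{\mu }\, A_{\mu }(x) \right) , \end{aligned}$$
(2.1)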

where C is a closed rectangular path in space-time, and P denotes the path ordering operator. The normalization factor \(1/N\) is included in the definition of \(W_C\) in order to have a finite large \(N\) limit – already at tree-level. Wilson loops have singularities which have to be removed before the continuum limit can be taken. In particular, for our square Wilson loops, one must remove not just the “perimeter” divergences but also “corner” divergences [37,38,39]. One way to proceed is to consider Creutz ratios [40], which however, for loops of large size in lattice units, suffer from small signal to noise ratios. As this would compromise our desire for a precision test, we work instead with smooth Wilson loops. The smoothing is provided by the Yang–Mills gradient flow [41, 42]. It evolves the gauge fields \(A_{\mu }(x)\) according to the flow equation

$$\begin{aligned} \partial _t B_{\mu }(t,x)= & {} D_{\nu } G_{\nu \mu }(t,x)\, , \quad B_{\mu }(0,x) = A_{\mu }(x) \, \nonumber \\ G_{\mu \nu }(t,x)= & {} \partial _{\mu }B_{\nu }(t,x) - \partial _{\nu }B_{\mu }(t,x) - \left[ B_{\mu }(t,x), B_{\nu }(t,x) \right] , \nonumber \\ \end{aligned}$$
(2.2)

where the dimension two parameter t is known as the flow time. The loops at positive flow time are then simply

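$$\begin{aligned} W_C(t) = \frac{1}{N}\, {\mathrm {tr}}\, P \exp \left( \oint _C \mathrm {d}x_{\mu }\, B_{\mu }(t,x) \right) . \end{aligned}$$
(2.3)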

Choosing 8t to be of a typical QCD size, say of the order of the inverse string tension, they benefit from small statistical errors even for large loops [43]. In particular, their variance remains finite in the continuum limit. That property is a particular manifestation of the most important feature of observables which are built from the smoothed gauge fields \(B_{\mu }(t,x)\): they are renormalized operators at positive flow time t [36, 44]. In other words, there is no renormalization scheme or scale dependence beyond t and the continuum limit is unambiguous and well defined. Even the action density

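$$\begin{aligned} E(t,x) = -\frac{1}{2}\, {\mathrm {tr}} \left\{ G_{\mu \nu }(t,x)\, G_{\mu \nu }(t,x) \right\} \end{aligned}$$
(2.4)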

is finite. It will be one of our observables.

2.1 The gradient flow coupling at large \(N\)

The gradient flow can also be used to define a renormalized coupling [42]. Using the perturbative expansion of the Yang–Mills energy density at positive flow time, \(\langle E(t) \rangle \), one has that

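$$\begin{aligned} \bar{\lambda }_{{\mathrm {GF}}}(\mu ) \equiv \frac{128\pi ^2}{3}\, \frac{N}{N^2-1}\, t^2 \langle E(t) \rangle \Big |_{\mu = 1/\sqrt{8t}} = \bar{\lambda }_{\overline{{\mathrm {MS}}}}(\mu ) \left[ 1 + c_1 \bar{\lambda }_{\overline{{\mathrm {MS}}}}(\mu ) + \mathrm{O}\left( \bar{\lambda }_{\overline{{\mathrm {MS}}}}^2\right) \right] , \end{aligned}$$
(2.5)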

where \(\bar{\lambda }_{\overline{{\mathrm {MS}}}}(\mu ) = N{\bar{g}}^2_{\overline{{\mathrm {MS}}}}(\mu )\) is the ’t Hooft coupling at the scale \(\mu = 1/\sqrt{8t}\) and \(c_1 = \frac{1}{16 \pi ^2} \left( \frac{11}{3}\gamma _E + \frac{52}{9} - 3 \ln 3 \right) \) is \(N\) independent. With this definition, we can then define a scale by setting the renormalized coupling \(\bar{\lambda }_{{\mathrm {GF}}}\) to a given value. A convenient choice for \({\mathrm {SU}}(3)\) is the reference scale \(t_0\) [42], which corresponds to a value of the coupling such that \(t^2 \langle E(t) \rangle |_{t=t_0} = 0.3\). This particular choice can be generalized to \({\mathrm {SU}}(N)\) if the right hand side is modified so that it has the correct scaling with \(N\). Clearly, we also want the definition to remain what it is for \(N=3\). Thus, Eq. (2.5) suggests defining \(t_0\) implicitly by the equation [2]

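$$\begin{aligned} t^2 \langle E(t) \rangle \Big |_{t=t_0} = 0.1125\, \frac{N^2-1}{N} \end{aligned}$$
(2.6)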

for all \(N\).
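
As an illustration of how the implicit condition for \(t_0\) can be solved in practice, the following minimal sketch locates the crossing of \(t^2\langle E(t)\rangle \) with the target value \(0.1125\,(N^2-1)/N\) and converts \(t^2\langle E\rangle \) into a coupling. The function names, the synthetic data and the use of a plain linear interpolation are assumptions made for this example only; the normalisation of the coupling is the one that reproduces \(0.3\times 16\pi ^2\) at \(t=t_0\).

```python
import math


def t0_target(n_colors):
    """Right-hand side of the t_0 condition: 0.1125 (N^2 - 1)/N, i.e. 0.3 for N = 3."""
    return 0.1125 * (n_colors**2 - 1) / n_colors


def gf_coupling(t2_e, n_colors):
    """Assumed normalisation: lambda_GF = 128 pi^2 N t^2<E> / (3 (N^2 - 1))."""
    return 128.0 * math.pi**2 * n_colors * t2_e / (3.0 * (n_colors**2 - 1))


def find_t0(flow_times, t2_e, n_colors):
    """Locate t_0 by linear interpolation between the two bracketing flow times."""
    target = t0_target(n_colors)
    for i in range(len(flow_times) - 1):
        y_lo, y_hi = t2_e[i], t2_e[i + 1]
        if y_lo <= target <= y_hi and y_hi > y_lo:
            frac = (target - y_lo) / (y_hi - y_lo)
            return flow_times[i] + frac * (flow_times[i + 1] - flow_times[i])
    raise ValueError("t^2<E> does not cross the target in the measured range")


# Synthetic, purely illustrative data: t^2<E> rising linearly with t/a^2.
times = [3.0, 3.5, 4.0, 4.5, 5.0]
signal = [0.3 * t / 4.2 for t in times]
t0 = find_t0(times, signal, n_colors=3)
```

With real data the flow-time grid would be much finer and a higher-order interpolation may be preferable; by construction, `gf_coupling(t0_target(N), N)` takes the same value for every `N`.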

2.2 Smooth Wilson loops

The favourable properties of smooth Wilson loops have already been exploited in the literature, as for example to estimate the string tension at small values of t in Refs. [6, 44], or to study the large \(N\) phase transition in the eigenvalue spectrum of the Wilson loop matrices [45]. For our purpose, the limit of small t is not required, as the smooth loops are used to test factorization and the large \(N\) limit for well defined renormalized observables, regardless of their relation to the operators at \(t=0\).

In the end, we study the large \(N\) limit of square Wilson loops, i.e. for loops where the path C in Eq. (2.1) is given by a square of size \(R\times R\). In order to take the large \(N\) limit, the loops are matched at different \(N\) relating their size to the scale \(t_0\) introduced in the previous section. More precisely, the large \(N\) and continuum limits are taken for loops of size \(R_c = \sqrt{8 c t_0}\) (see Fig. 1), where the smoothing parameter \(t = c t_0\), and c is a constant parameter.

To be more precise, let us denote a square loop with one of its corners at the spacetime point \((x_0,\vec {x})\) and extending only in space as \(W(t,x_0,\vec {x},R)\). Its expectation value

$$\begin{aligned} W(c)= \left\langle W(t,x_0,\vec {x},R_c) \right\rangle \, \quad \text {with } \quad t=ct_0, R_c = \sqrt{8 c t_0},\nonumber \\ \end{aligned}$$
(2.7)

is independent of the position \(\vec {x}\) due to translation invariance and only depends on the parameter c. In our notation we separate time and space-coordinates, as we will later use lattices with different boundary conditions in time (open b.c.) and space (periodic b.c.). While the independence on \(\vec {x}\) is exact, \(x_0\) has to be sufficiently far away from the boundary for Eq. (2.7) to hold.

Similarly, we define

$$\begin{aligned} W^{{\mathrm {sq}}}(c) = \left\langle W^2(t,x_0,\vec {x},R_c) \right\rangle \, \,\, \text {with } \,\, t=ct_0, R_c = \sqrt{8 c t_0}, \nonumber \\ \end{aligned}$$
(2.8)

which corresponds to the expectation value of the product of a Wilson loop with itself.

Fig. 1 Schematic representation of a smooth Wilson loop operator. The size of the loop is chosen such that it has the same length as the smoothing radius, i.e. \(R=\sqrt{8t}\)

2.3 Observables to test factorization

In order to investigate the property of factorization from Eq. (1.1), we define several observables based on the Yang–Mills action density and the smooth Wilson loops at positive flow time. They are constructed such that factorization implies that they vanish as \(N\rightarrow \infty \). First we consider the simplest case of the observable \(G_W\) defined in terms of the smooth Wilson loops as

$$\begin{aligned} G_W(c) = \frac{W^{{\mathrm {sq}}}(c) - W^2(c)}{W^2(c)}. \end{aligned}$$
(2.9)
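
A minimal sketch of how \(G_W\) could be estimated from a list of measured loop values is given below. It assumes that `loops` holds statistically independent, real-valued measurements of the same smooth loop; the function names are hypothetical and the jackknife error is only meaningful under that independence assumption (correlated data would have to be binned first).

```python
import math


def g_w(loops):
    """G_W = (<W^2> - <W>^2) / <W>^2 from a list of loop measurements."""
    n = len(loops)
    mean_w = sum(loops) / n
    mean_w2 = sum(w * w for w in loops) / n
    return (mean_w2 - mean_w**2) / mean_w**2


def jackknife_error(loops, estimator):
    """Delete-one jackknife error of a (possibly non-linear) estimator."""
    n = len(loops)
    replicas = [estimator(loops[:i] + loops[i + 1:]) for i in range(n)]
    centre = sum(replicas) / n
    return math.sqrt((n - 1) / n * sum((r - centre) ** 2 for r in replicas))


# Toy numbers, not simulation data.
loops = [0.9, 1.1, 0.9, 1.1]
value = g_w(loops)
error = jackknife_error(loops, g_w)
```

Because \(G_W\) is a ratio of averages, a resampling method such as the jackknife is a natural way to propagate errors; other choices are equally possible.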

Table 1 Parameters of the simulations. For each of the gauge groups \({\mathrm {SU}}(N)\) we give the inverse lattice coupling \(\beta =2N^2/\lambda _0\), the dimensions of the lattice, the approximate lattice spacing using \(\sqrt{t_0}=0.166 \, {\mathrm {fm}}\), followed by the number \(N_{{\mathrm {meas}}}^\mathrm {W}\) of measurements used for the computation of the smooth Wilson loops, and \(N_{{\mathrm {meas}}}^\mathrm {E}\) for the action density, Eq. (2.4). In the second to last column we present the values of \(t_0/a^2\); values marked \(^{*}\) are taken from Ref. [53] and those marked \(^{**}\) from Ref. [2]

Then, we consider observables built from the space integral of the smooth Wilson loops and the Yang–Mills action density. We define\(^{1}\)

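$$\begin{aligned} H_{{\mathcal {O}}}(c) = \frac{1}{t_0^{3/2}} \int \mathrm {d}^3 x \, \frac{\left\langle {\mathcal {O}}(t,x_0,\vec {x})\, {\mathcal {O}}(t,x_0,\vec {0}) \right\rangle - \left\langle {\mathcal {O}}(t,x_0,\vec {0}) \right\rangle ^2}{\left\langle {\mathcal {O}}(t,x_0,\vec {0}) \right\rangle ^2} \, , \quad t = c t_0, \end{aligned}$$
(2.10)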

with the factor \(1/t_0^{3/2}\) rendering \(H_{{\mathcal {O}}}(c)\) dimensionless, and where \({\mathcal {O}}\) is either a smooth Wilson loop, or the Yang–Mills action density. Notice that H is a type of susceptibility, as we are integrating over the contributions from the correlation function of \({\mathcal {O}}\) at different distances. The integration does not extend over \(x_0\) due to our choice of boundary conditions and \(x_0\) is again supposed to be far away from the time-boundaries. In comparison to the simple observable \(G_W\), this probes longer distances, but introduces also more noise and affects the statistical errors in the measurements. Nonetheless, as will be shown in Sect. 5, the statistical precision that can be achieved for \(H_{{\mathcal {O}}}\) remains good. In particular, we will consider \(H_E(c)\), defined by inserting

$$\begin{aligned} {\mathcal {O}}(t,x_0,\vec {x}) = E(t,x_0,\vec {x}), \end{aligned}$$
(2.11)

into Eq. (2.10) and \(H_W\) by

$$\begin{aligned} {\mathcal {O}}(t,x_0,\vec {x}) = W(t,x_0,\vec {x},R_c). \end{aligned}$$
(2.12)

We remind the reader of our choice \(R_c = \sqrt{8 c t_0}\).

Equation (1.1) means

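$$\begin{aligned} \lim _{N\rightarrow \infty } G_W(c) = \lim _{N\rightarrow \infty } H_W(c) = \lim _{N\rightarrow \infty } H_E(c) = 0. \end{aligned}$$
(2.13)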

2.4 Finite volume

For a numerical test, we need to choose a finite volume. We chose our parameters such that \(L/\sqrt{8t_0}\approx 3.3\). Table 1 shows the actual values used in our simulations. Since L is thus approximately constant, it is omitted as an argument of the observables. We note that the large \(N\) limit and factorization can be tested in infinite or in finite volume. To be on the safe side, we chose the latter, even though we are not far from the infinite volume limit for most observables.

3 Defining the approach to the large N limit

The complete definition of a quantum field theory involves a regularization (here Wilson’s lattice theory) as well as a non-trivial renormalisation before the regulator can be removed. Although this is usually not discussed, quantitative statements about the approach to the large \(N\) limit, such as the ones we are seeking here, do depend on the renormalisation scheme if the renormalisation scheme defines which quantity is held fixed as we take \(N\rightarrow \infty \).

While the \(\mathrm{O}(1/N^2)\) corrections depend on these details, the true limit is expected to be unique in the following sense. It is independent of the scheme, as long as the ’t Hooft coupling \(\bar{\lambda }_{s}(\mu ) = N{\bar{g}}^2_{s}(\mu )\) in any scheme is kept fixed as one takes the limit. This statement becomes most transparent when we replace couplings by the associated \(\Lambda \)-parameters,

$$\begin{aligned} \Lambda _s= & {} \lim _{\mu \rightarrow \infty } \, \mu \, \left( \frac{48\pi ^2}{11 \lambda _s(\mu )} \right) ^{51/121} \exp \left( -\frac{1}{b_0\lambda _s(\mu )} \right) ,\nonumber \\ b_0= & {} \frac{11}{24\pi ^2}. \end{aligned}$$
(3.1)

Now any renormalization group invariant quantity \({\mathcal {O}}\) of mass dimension n, has a large \(N\) limit

$$\begin{aligned} \lim _{N\rightarrow \infty } \frac{{\mathcal {O}}}{\Lambda _s^n} = \lim _{N\rightarrow \infty } r^n(N) \frac{{\mathcal {O}}}{\Lambda _{s'}^n} = r_\infty ^n\, \lim _{N\rightarrow \infty } \frac{{\mathcal {O}}}{\Lambda _{s'}^n}, \end{aligned}$$
(3.2)

where

$$\begin{aligned} r(N)= & {} \exp (c_{ss'}(N)/b_0), \end{aligned}$$
(3.3)
$$\begin{aligned} \lambda _{s'}= & {} \lambda _{s} +c_{ss'}(N)\lambda _{s}^2 + \mathrm{O}(\lambda _{s}^3). \end{aligned}$$
(3.4)

and

$$\begin{aligned} r_\infty = \exp (\lim _{N\rightarrow \infty } c_{ss'}(N)/b_0). \end{aligned}$$
(3.5)

Examples with \(n=1\) are glueball masses, while \(t_0\), defined above, is an RGI scale with \(n=-2\). When the observable \({\mathcal {O}}\) depends on external momenta or coordinates, they have to be fixed in units of \(\Lambda \) in a specified scheme, e.g. \(\Lambda _{\overline{{\mathrm {MS}}}}\), when taking \(N\rightarrow \infty \).
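
For completeness, the factor \(r(N)\) can be obtained from Eqs. (3.1) and (3.4) by a short computation. The following is a sketch that keeps only the terms which survive the limit \(\mu \rightarrow \infty \), where \(\lambda _s(\mu )\rightarrow 0\) by asymptotic freedom:

$$\begin{aligned} \frac{1}{\lambda _{s'}}= & {} \frac{1}{\lambda _{s}} - c_{ss'}(N) + \mathrm{O}(\lambda _{s}), \nonumber \\ \frac{\Lambda _{s'}}{\Lambda _{s}}= & {} \lim _{\mu \rightarrow \infty } \left( \frac{\lambda _{s}(\mu )}{\lambda _{s'}(\mu )} \right) ^{51/121} \exp \left( -\frac{1}{b_0} \left[ \frac{1}{\lambda _{s'}(\mu )} - \frac{1}{\lambda _{s}(\mu )} \right] \right) = \exp \left( c_{ss'}(N)/b_0 \right) = r(N). \end{aligned}$$

Hence \({\mathcal {O}}/\Lambda _s^n = (\Lambda _{s'}/\Lambda _s)^n \, {\mathcal {O}}/\Lambda _{s'}^n = r^n(N)\, {\mathcal {O}}/\Lambda _{s'}^n\), which is the first equality in Eq. (3.2).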

Due to the existence of the limit Eq. (3.2), we may also scale distances with respect to any one particular reference scale (choice of \({\mathcal {O}}\)). In our numerical work we have chosen \(t_0\), Eq. (2.6), because of its high precision.

The preceding discussion is about the continuum theory. It thus tacitly assumes that first we take the continuum limit at finite \(N\) and then we perform \(N\rightarrow \infty \). However, we may also proceed in the opposite order: first take the large \(N\) limit at fixed lattice spacing and then send the lattice spacing to zero.\(^{2}\) Let us briefly argue that this order of limits is indeed equivalent to the one above, i.e. that the limits are interchangeable.

3.1 Large \(N\) limit at fixed lattice spacing

The existence of the large \(N\) limit at fixed finite lattice spacing is expected due to the following consideration. We start from the Lambda-parameter, \(\Lambda _{\mathrm {lat}}\) in the lattice minimal subtraction scheme, which satisfies Eq. (3.1) with \(\mu =1/a\) and \(\lambda _{\mathrm {lat}}(\mu )=\lambda _0\) in terms of the lattice spacing, a, and the bare coupling, \(\lambda _0=Ng_0^2\). In fact, having a specific scheme, the lat-scheme, we can give the more detailed formula,

$$\begin{aligned} a \Lambda _{\mathrm {lat}}= & {} \left( \frac{48\pi ^2}{11 \lambda _0} \right) ^{51/121} \exp \left( -\frac{24\pi ^2}{11 \lambda _0} \right) \nonumber \\&\times \left( 1 + c_1(N) \lambda _0 +\mathrm{O}(\lambda _0^2)\right) , \end{aligned}$$
(3.6)

where \(c_1 = 0.1048 + \mathrm{O}(1/N^2)\) [46]. Equation (3.6) shows that the large \(N\) limit can be taken at fixed bare coupling which is equivalent to fixed lattice spacing a. Apart from \(\mathrm{O}(1/N^2)\) terms in \(c_1\) and higher order terms, fixed lattice spacing is the same as fixed \(\Lambda _{\mathrm {lat}}\) and therefore also fixed \(\Lambda _s\) in other schemes. See also an early discussion of \(\Lambda _{\overline{{\mathrm {MS}}}}/\Lambda _{\mathrm {lat}}\) including its \(N\) dependence [47].
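
To make the statement "fixed bare coupling is equivalent to fixed lattice spacing" concrete, the truncated formula of Eq. (3.6) can be evaluated numerically. The sketch below uses \(\beta = 2N^2/\lambda _0\) and only the leading, \(N\)-independent part of \(c_1\); the function name is hypothetical and no claim is made about the quantitative accuracy of the truncated series at the couplings used in simulations.

```python
import math


def a_lambda_lat(beta, n_colors, c1=0.1048):
    """a * Lambda_lat from the truncated series, with lambda_0 = 2 N^2 / beta.

    c1 is the leading large-N value; its O(1/N^2) corrections are ignored here.
    """
    lam0 = 2.0 * n_colors**2 / beta
    prefactor = (48.0 * math.pi**2 / (11.0 * lam0)) ** (51.0 / 121.0)
    exponential = math.exp(-24.0 * math.pi**2 / (11.0 * lam0))
    return prefactor * exponential * (1.0 + c1 * lam0)


# Same 't Hooft coupling lambda_0 = 3 realised for N = 3 and N = 4.
su3 = a_lambda_lat(6.0, 3)
su4 = a_lambda_lat(32.0 / 3.0, 4)
```

In this truncation the two values coincide, since the expression depends on \(N\) only through \(\lambda _0\); differences between gauge groups enter through the neglected \(\mathrm{O}(1/N^2)\) terms.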

In general, taking the large \(N\) limit at fixed lattice spacing has to be followed by the \(a\rightarrow 0\) limit at \(N=\infty \). However, when we investigate factorization, the second step is not expected to be necessary. This is because the perturbative proof of factorization holds in the lattice regularization [48] at finite a. If factorization holds non-perturbatively we thus also expect Eq. (2.13) at any fixed a. In any case, verifying Eq. (2.13) at arbitrary finite lattice spacing implies that it holds in the continuum limit.

Note also that even the large \(N\) limit of divergent quantities, such as Wilson loops at \(t=0\), is expected to exist. A high precision numerical test has recently been performed [8].

4 Lattice details

In this section we give the details of our lattice simulations. We simulate the pure gauge theory with \(N=3,\, 4,\, 5,\, 6,\, 8\) at several lattice spacings. The lattice action is the Wilson gauge action and we use open boundary conditions in the time direction [49]. The simulations are performed using a combination of heatbath and overrelaxation local updates using the Cabibbo–Marinari strategy [50] to refresh the \({\mathrm {SU}}(N)\) matrices. The ratio of overrelaxation to heatbath updates is fixed to L / (2a).

For convenience, we present the values of the lattice spacing, as well as lattice sizes in physical units by assigning a value to \(t_0\) such that \(\sqrt{t_0}=0.166 \, {\mathrm {fm}}\). This choice is motivated by the result in \({\mathrm {SU}}(3)\) for \(\sqrt{8t_0}/r_0 = 0.941(7)\) [51] and the value of the reference scale \(r_0 \approx 0.5 \, {\mathrm {fm}}\) [52]. Notice that this choice is somewhat arbitrary, as apart from the missing quark loops, for \(N\ne 3\) the theory cannot be directly identified with Nature.

The parameters of the simulations are displayed in Table 1. The configurations used for the measurements are a subset of those reported in Ref. [2] for all ensembles except for those at \(N=3, 8\), and for the finest lattice spacings in the case of \(N=4, 5\). As announced above, all the lattices considered in Table 1 are of approximately the same spatial size \(L \approx 1.55 \, {\mathrm {fm}}\). In addition, we have used two further ensembles with \(L \approx 2.35 \, {\mathrm {fm}}\) at the coarsest lattice spacing (\(a \approx 0.1 \, {\mathrm {fm}}\)) for \(N= 4, 5\) in order to check for effects due to small variations in the volume. Notice that for the ensembles which have been reported in Ref. [2], we have a very large number of measurements for the Yang–Mills action density.

The flow equations are integrated using a third order Runge–Kutta integrator [42] and the observables are measured at flow-time intervals \(\Delta t\) with \(\Delta t/a^2\) between \(2 \times 10^{-2}\) and \(3 \times 10^{-2}\). Afterwards, they are interpolated using a second order polynomial in order to obtain their values at arbitrary t. The action density is defined exactly as in [42], using the clover discretization, and it is measured from \(t=0\) up to \(t \approx 1.2 \, t_0\). The loops, W(c), are measured only in the vicinity of \(t=ct_0\), with \(c \in \{1/2,1,9/4\}\), and then interpolated to the exact value of t. For the loops, one has to do an additional interpolation to \(R_c\), and since their statistical precision is very high, one has to be careful with small potential systematic effects. The details of this interpolation were already presented in Ref. [54].
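
The structure of a third order Runge–Kutta step of the type commonly used for the gradient flow can be illustrated on a toy problem. The sketch below applies the three-stage update with coefficients \(1/4\), \(8/9\), \(17/36\) and \(3/4\) to a single U(1) phase with toy action \(S = \mathrm{Re}(1-V)\). This is an assumption made for brevity: in the actual simulations the variables are \({\mathrm {SU}}(N)\) link matrices and the force derives from the Wilson action. All names are hypothetical.

```python
import cmath
import math


def force(v):
    """Z(V) for the toy action Re(1 - V): minus the anti-Hermitian part of V."""
    return -0.5 * (v - v.conjugate())


def rk3_step(v, eps):
    """One third order step: each stage multiplies by the exponential of a force."""
    z0 = eps * force(v)
    w1 = cmath.exp(0.25 * z0) * v
    z1 = eps * force(w1)
    w2 = cmath.exp(8.0 / 9.0 * z1 - 17.0 / 36.0 * z0) * w1
    z2 = eps * force(w2)
    return cmath.exp(0.75 * z2 - 8.0 / 9.0 * z1 + 17.0 / 36.0 * z0) * w2


def flow(v, t_max, eps):
    """Integrate dV/dt = Z(V) V from t = 0 to t_max with fixed step eps."""
    for _ in range(round(t_max / eps)):
        v = rk3_step(v, eps)
    return v


# Toy check: V = exp(i theta) obeys d(theta)/dt = -sin(theta).
theta0 = 1.0
v_final = flow(cmath.exp(1j * theta0), t_max=1.0, eps=0.01)
theta_exact = 2.0 * math.atan(math.tan(theta0 / 2.0) * math.exp(-1.0))
```

Because every stage multiplies by the exponential of an anti-Hermitian quantity, the variable stays on the group manifold (here \(|V|=1\)) up to rounding errors, which is the main reason for writing the integrator in this form.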

We end this section with the precise definition of the observables introduced in Eqs. (2.7)–(2.10) on the lattice. First, for the Wilson loops we use translation invariance in the form

(4.1)

where corresponds to the estimator of the true expectation value computed on the lattice.

In order to compute \(H_W\), we define

(4.2)

so that

$$\begin{aligned} H_W(c) = \left( \frac{L^3}{t_0^{3/2}} \right) \frac{W^{{\mathrm {sq}}}_{{\mathrm {int}}}(c) - W^2(c)}{W^2(c)} \, , \end{aligned}$$
(4.3)

and we proceed in a similar way to define \(H_E\) after replacing \(W(ct_0,x_0,\vec {x},R_c)\) by \(t^2 E(ct_0,x_0,\vec {x})|_{t=ct_0}\). The parameter d is introduced to deal with the systematic effects from the open boundary conditions. It is chosen in a similar way as described in Ref. [55], so that the effects coming from the boundaries are negligible with respect to the statistical error in the bulk.
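
A possible implementation of the estimator entering Eq. (4.3) is sketched below. It rests on assumptions that are not taken from the text: translation invariance is used to replace the spatially integrated two-point function by the square of the time-slice average, the bulk region is taken to be \(d \le x_0/a < T/a-d\), and the data layout `configs[config][x0][site]` as well as the function name are hypothetical.

```python
def h_w(configs, d, t0_over_a2):
    """H_W = (L^3 / t0^{3/2}) (W_int^sq - W^2) / W^2 from time-slice averages.

    configs[k][x0] is the list of loop values on all (L/a)^3 sites of time
    slice x0 in configuration k; slices closer than d to a boundary are dropped.
    """
    slice_means = []
    for cfg in configs:
        for x0 in range(d, len(cfg) - d):
            sites = cfg[x0]
            slice_means.append(sum(sites) / len(sites))
    n = len(slice_means)
    w = sum(slice_means) / n
    # Assumed form of W_int^sq: average of the squared slice mean.
    w_sq_int = sum(m * m for m in slice_means) / n
    sites_per_slice = len(configs[0][0])          # (L/a)^3
    volume = sites_per_slice / t0_over_a2**1.5    # L^3 / t0^{3/2}
    return volume * (w_sq_int - w * w) / (w * w)


# Toy data: boundary slices carry junk values that must not contribute.
boundary = [100.0] * 8
cfg = [boundary, [0.9] * 8, [1.1] * 8, boundary]
h_value = h_w([cfg, cfg], d=1, t0_over_a2=4.0)
```

The analogous quantity for \(H_E\) would be obtained by feeding in \(t^2 E\) instead of the loop values.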

5 Results

5.1 Large \(N\) scaling

In order to test and provide a precise verification of \(1/N^2\) scaling, we analyse our results for W(c) and for the gradient flow coupling \(\bar{\lambda }_{{\mathrm {GF}}}\). Let us first discuss our results for the latter.

5.1.1 The gradient flow coupling

In Fig. 2 we plot \(\bar{\lambda }_{{\mathrm {GF}}}\) as a function of t for several gauge groups and different lattice spacings. Within the scale of the plot, the results are hard to distinguish for all gauge groups, which already shows the small size of the \(N\) dependent corrections. While at \(t=t_0\) independence of \(N\) is enforced by Eq. (2.6) or equivalently

$$\begin{aligned} \bar{\lambda }_{{\mathrm {GF}}}\left( 1/\sqrt{8t_0}\right) = 0.3 \times 16\pi ^2, \end{aligned}$$
(5.1)

the different \(N\) curves remain remarkably close when t is a factor of 5 away from \(t_0\). At a closer look, corrections to \(N= \infty \) are present and the data agrees very well with a polynomial in \(1/N^2\) as expected. We verified this by interpolating the data to several values of t in a regular interval from \(t=0.1 \, t_0\) to \(t = 1.1 \, t_0\), and then taking the large \(N\) limit once at a fixed lattice spacing and once in the continuum.

Fig. 2 \(\bar{\lambda }_{{\mathrm {GF}}}\) as a function of t for several values of \(N\) and a (see Table 1). In the lower plot, we present a closer look at the small t region

Fig. 3 Left: Continuum extrapolation of \(\bar{\lambda }_{{\mathrm {GF}}}\) at \(t/t_0 = 0.8\) for all the different gauge groups. Right: Large \(N\) extrapolations of \(\bar{\lambda }_{{\mathrm {GF}}}\) in the continuum limit (solid line) and at finite lattice spacing, \(a^2/t_0=0.2091\), (dotted line). The points at finite lattice spacing have been slightly shifted for better legibility

Fig. 4 Left: Continuum extrapolation of \(\bar{\lambda }_{{\mathrm {GF}}}\) at \(t/t_0 = 0.4\) for all the different gauge groups. Right: Large \(N\) extrapolations of \(\bar{\lambda }_{{\mathrm {GF}}}\) in the continuum (solid line) and at finite lattice spacing (dotted line). The points at finite lattice spacing have been slightly shifted for better legibility

As can be observed in Fig. 2, cut-off effects are large at small t. At \(t= 0.1 \, t_0\) the relative difference between the results at the finest lattice spacing (\(a \approx 0.05 \, {\mathrm {fm}}\)) and at the coarsest one (\(a \approx 0.1 \, {\mathrm {fm}}\)) is around \(20\%\), while the errors of the measurements themselves are at the per-mille level. The situation is better at larger values of t, so let us first focus on values of \(t/t_0 \ge 0.3\), where the relative size of cut-off effects is reduced tenfold compared to the case at \(t/t_0=0.1\). In Fig. 3 we show a plot of the continuum extrapolation of \(\bar{\lambda }_{{\mathrm {GF}}}\) at \(t/t_0 = 0.8\) and the large \(N\) extrapolation both at finite lattice spacing and in the continuum. In order to be able to use the dataset at \(N=8\), in addition to the continuum limit extrapolations, we consider \(a^2/t_0=0.2091\), the value on ensemble A(8)\(_2\). We then interpolated the results for all the other gauge groups to that lattice resolution.

On the left panel of Fig. 3 we show the continuum limit extrapolations. The strategy chosen for the extrapolation is the following: all continuum extrapolations are performed by linear fits in \(a^2/t_0\) to those data which satisfy \(a^2/t_0 \le 1/4\) (default fit). Such a restriction has been well motivated in Ref. [56] for \(N=3\) and we find smaller discretization effects for larger \(N\). As an estimate of the systematic uncertainty associated with this choice, we perform a second fit linear in \(a^2\) with a data point at larger \(a^2\); if the latter fit does not have a good \(\chi ^2\) we add an \(a^4/t^2\) term to the fit-function (control fit). If necessary, the error of the default fit is enlarged until it covers the full \(1-\sigma \) band of the control fit. The uncertainties of the continuum limit points are usually dominated by the systematics which arises from different fits and which is not necessarily independent for different \(N\).
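
The extrapolation strategy described above can be sketched in a few lines. The example below is a simplified reading of it and rests on stated assumptions: the control fit is taken to be a linear fit to all supplied points, the optional \(a^4\) term is omitted, and "covering the \(1\sigma \) band" is implemented as enlarging the default error to at least the distance to the far edge of the control band. Function names and data are hypothetical.

```python
import math


def weighted_line_fit(x, y, sigma):
    """Weighted least-squares fit of y = c + s*x; returns (c, s, err_c, chi2)."""
    w = [1.0 / e**2 for e in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    c = (sxx * sy - sx * sxy) / det
    s = (s0 * sxy - sx * sy) / det
    chi2 = sum(wi * (yi - c - s * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return c, s, math.sqrt(sxx / det), chi2


def continuum_limit(a2_over_t0, y, sigma, cut=0.25):
    """Default fit on a^2/t0 <= cut, error enlarged to cover the control fit."""
    keep = [i for i, x in enumerate(a2_over_t0) if x <= cut]
    x_def = [a2_over_t0[i] for i in keep]
    y_def = [y[i] for i in keep]
    e_def = [sigma[i] for i in keep]
    c_def, _, err_def, _ = weighted_line_fit(x_def, y_def, e_def)
    # Control fit (assumption): same linear form, all points included.
    c_ctl, _, err_ctl, _ = weighted_line_fit(a2_over_t0, y, sigma)
    reach = abs(c_ctl - c_def) + err_ctl
    return c_def, max(err_def, reach)


# Toy data lying exactly on a line; the last point is beyond the default cut.
xs = [0.10, 0.15, 0.20, 0.30]
ys = [2.0 + 0.5 * x for x in xs]
errs = [0.01] * 4
value, error = continuum_limit(xs, ys, errs)
```

With such an enlargement the quoted error is dominated by the fit systematics whenever the two fits disagree, in line with the remark that the continuum uncertainties are usually systematics dominated.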

Table 2 Parameters of the large \(N\) extrapolations, Eq. (5.2), of \(\bar{\lambda }_{{\mathrm {GF}}}(1/\sqrt{8ct_0})\) and W(c) at finite lattice spacing (L) and in the continuum (C)

Fig. 5 Left: Continuum extrapolation of W(1) for all gauge groups. Right: Large \(N\) extrapolations of W(1) in the continuum (solid line) and at finite lattice spacing (dotted line). There is an excellent agreement between the data and the expected scaling in powers of \(1/N^2\). The points at finite lattice spacing have been slightly shifted for better legibility

All values of \(\chi ^2/{{\mathrm {dof}}}\) are excellent except for \({\mathrm {SU}}(4)\), where we obtain a value of 2.2 and 2.7 for the linear and quadratic fits respectively. After performing the fits in \(a^2/t_0\), on the right panel of Fig. 3 we plot the large \(N\) extrapolations both in the continuum and at finite lattice spacing. As discussed, \(N=8\) is available only at finite lattice spacing, where in addition, errors are much smaller due to the fact that we performed an interpolation instead of the continuum limit extrapolation. The large \(N\) extrapolation uses the form

$$\begin{aligned} Y(1/N^2) = a_0 \left( 1 + \frac{a_1}{N^2} + \frac{a_2}{N^4} \right) . \end{aligned}$$
(5.2)

As seen in Fig. 3, the fit to the function Y is excellent, with a \(\chi ^2/{{\mathrm {dof}}}= 1.02\) at finite lattice spacing; for the continuum points we do not consider \(\chi ^2\) since the errors are strongly correlated due to the dominating systematic uncertainty of the continuum extrapolations. Notice also that the results suggest that cut-off effects decrease with increasing \(N\).

As an example of results at smaller t, we show our analysis at \(t/t_0=0.4\) in Fig. 4. In this case, the magnitude of the cut-off effects is larger, but the same analysis as before can be carried out.

As mentioned earlier, dealing with \(\bar{\lambda }_{{\mathrm {GF}}}\) at values of \(t/t_0 \le 0.3\) presents a bigger challenge, so one cannot reach the same level of accuracy as the results presented in this section. However, we have proceeded to do a similar analysis for such small values of t, including corrections of higher order in \(a^2/t_0\). Details are found in Appendix B.

From the above analysis, we find that the large \(N\) dependence of \(\bar{\lambda }_{{\mathrm {GF}}}\) is in excellent agreement with the \(1/N^2\) scaling predicted by the ’t Hooft perturbative expansion. Moreover, defining

$$\begin{aligned} \eta (1/N^2)=\Big |\frac{Y(0) - Y(1/N^2)}{Y(0)} \Big |, \end{aligned}$$
(5.3)

we can determine the “distance” between \({\mathrm {SU}}(3)\) and \({\mathrm {SU}}(\infty )\). In the continuum, at \(t/t_0 = 0.8\) and \(t/t_0=0.4\) we find \(\eta (1/9) = 1.1\%\) and \(2.8\%\) respectively. Note also that the large \(N\) limit is taken at fixed \(t_0\) and therefore \(\eta \equiv 0\) at \(t=t_0\) by definition. To account for this effect, we also fit Y to \(\bar{\lambda }_{{\mathrm {GF}}}(1/\sqrt{8t}) - \bar{\lambda }_{{\mathrm {GF}}}(1/\sqrt{8t_0})\) instead of \(\bar{\lambda }_{{\mathrm {GF}}}(1/\sqrt{8t})\), and define \(\delta \) in a similar way to \(\eta \). The results, together with those obtained for \(\eta \) are displayed in Table 2. Let us remark that the individual errors in our measurements are below the per-mill level, so we can confidently quantify these percent level deviations between \({\mathrm {SU}}(3)\) and \({\mathrm {SU}}(\infty )\).
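
The large \(N\) fit of Eq. (5.2) and the measure \(\eta \) of Eq. (5.3) can be sketched as follows. Since \(Y\) is a quadratic polynomial in \(1/N^2\), the fit is linear in the parameters \(a_0\), \(a_0 a_1\) and \(a_0 a_2\) and reduces to a small system of normal equations. The function names and the input numbers are assumptions for illustration only, and correlations between data points are ignored.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_large_n(n_values, y, sigma):
    """Weighted fit of Y = a0 (1 + a1/N^2 + a2/N^4); returns (a0, a1, a2)."""
    xs = [1.0 / n**2 for n in n_values]
    normal = [[sum(x ** (i + j) / s**2 for x, s in zip(xs, sigma))
               for j in range(3)] for i in range(3)]
    rhs = [sum(yi * x**i / s**2 for x, yi, s in zip(xs, y, sigma))
           for i in range(3)]
    p0, p1, p2 = solve_linear(normal, rhs)
    return p0, p1 / p0, p2 / p0


def eta(a1, a2, n):
    """eta(1/N^2) = |Y(0) - Y(1/N^2)| / |Y(0)| = |a1/N^2 + a2/N^4|."""
    x = 1.0 / n**2
    return abs(a1 * x + a2 * x * x)


# Synthetic input generated from known coefficients, not simulation results.
groups = [3, 4, 5, 6, 8]
data = [10.0 * (1.0 + 0.1 / n**2 - 0.5 / n**4) for n in groups]
a0, a1, a2 = fit_large_n(groups, data, [0.01] * 5)
distance_su3 = eta(a1, a2, 3)
```

Note that \(\eta \) depends only on \(a_1\) and \(a_2\), so the overall normalisation \(a_0\) drops out of the "distance" between \({\mathrm {SU}}(3)\) and \({\mathrm {SU}}(\infty )\).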

The magnitude of the \(1/N^2\) corrections can be read off from the coefficients \(a_1\) and \(a_2\) collected in Table 2, together with those of the smooth Wilson loops which we discuss next.

5.1.2 Smooth Wilson loops

We have determined the smooth Wilson loops at three different values of c, i.e. \(c=1/2,1,9/4\). As in the case of \(\bar{\lambda }_{{\mathrm {GF}}}\), we are interested in the large \(N\) scaling at finite lattice spacing and in the continuum. The strategy for the continuum limit fits is the same as for \(\bar{\lambda }_{{\mathrm {GF}}}\). The fits for the loops at different c are qualitatively similar, so in Fig. 5 we show the results at \(c=1\) only.

Fig. 6 Left: Continuum extrapolation of \(H_E(1)\) for all gauge groups. Right: Large \(N\) extrapolations of \(H_E(1)\) in the continuum (solid line) and at finite lattice spacing (dotted line). The points at finite lattice spacing have been slightly shifted for better legibility

Once again, to quantify the magnitude of the finite \(N\) corrections, we collect in Table 2 the values of \(a_1\) and \(a_2\) from the fit to Y. We observe that their relative magnitude grows at larger values of c (or, equivalently, t). Similarly, the deviation between \({\mathrm {SU}}(3)\) and \({\mathrm {SU}}(\infty )\) also grows, up to a value of \(\eta (1/9)=0.1\) when \(c=9/4\). In all cases we find an excellent fit to Y (the values of \(\chi ^2/{{\mathrm {dof}}}\) are reported in Table 2).

5.2 Factorization

In order to verify the property of factorization from Eq. (1.1), we take the large \(N\) limit of the observables defined in Sect. 2.3. The large \(N\) limits are taken in a similar way as described earlier, but we modify the parametrization of the large \(N\) fitting function for convenience, so that

$$\begin{aligned} Y(1/N^2) = b_0 + \frac{b_1}{N^2} + \frac{b_2}{N^4}. \end{aligned}$$
(5.4)
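A fit to the form of Eq. (5.4) can be sketched as a weighted least-squares problem in the variable \(x = 1/N^2\). The following is a minimal illustration only: the helper names `solve` and `fit_poly`, the set of values of \(N\), and the data and errors are hypothetical placeholders, not the analysis code or results of this work.

```python
# Sketch: weighted least-squares fit of Y(x) = b0 + b1*x + b2*x**2 with x = 1/N^2.
# All numbers below are synthetic placeholders, NOT results from this work.

def solve(A, rhs):
    """Solve the small linear system A p = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_poly(x, y, err, powers=(0, 1, 2)):
    """Minimise chi^2 = sum(((y - sum_k p_k x^k) / err)^2).

    Returns (parameters, chi2, dof)."""
    w = [1.0 / e**2 for e in err]
    # Normal equations of the weighted linear least-squares problem.
    A = [[sum(wi * xi**(j + k) for wi, xi in zip(w, x)) for k in powers]
         for j in powers]
    rhs = [sum(wi * yi * xi**j for wi, xi, yi in zip(w, x, y)) for j in powers]
    p = solve(A, rhs)
    chi2 = sum(wi * (yi - sum(pk * xi**k for pk, k in zip(p, powers)))**2
               for wi, xi, yi in zip(w, x, y))
    return p, chi2, len(x) - len(powers)


N_values = [3, 4, 5, 6, 8]                        # hypothetical set of gauge groups
x = [1.0 / N**2 for N in N_values]
y = [0.001 + 0.5 * xi - 0.3 * xi**2 for xi in x]  # noise-free toy data
err = [1e-4] * len(x)

(b0, b1, b2), chi2, dof = fit_poly(x, y, err)
print(f"b0 = {b0:.6f}, b1 = {b1:.4f}, b2 = {b2:.4f}, chi2/dof = {chi2 / dof:.2e}")
```

In a real analysis the errors of the fit parameters would also be needed; they are omitted here to keep the sketch short.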

For the continuum limit extrapolations we use the same strategy as for W and for \(\bar{\lambda }_{{\mathrm {GF}}}\), and in all cases the data can be fitted very well with a linear or quadratic polynomial in \(a^2/t_0\). We also check for effects caused by variations of \(L/\sqrt{8t}\) in all observables. As discussed in Appendix A, we find that \(H_W\) at \(c=1/2\) and \(c=1\) is potentially affected by large effects of this kind. We tried to include them as a systematic error on the measurements, but this yields errors which are too large to be of interest as a test of factorization. Hence, we present only results for \(H_W\) at \(c=9/4\).
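The simplest of the continuum extrapolations mentioned above, a straight line in \(a^2/t_0\), can be illustrated with the closed-form weighted fit below. This is a sketch under stated assumptions: the function name `linear_extrapolation`, the lattice spacings, the values and the errors are hypothetical, and the quadratic variant is not shown.

```python
# Sketch: weighted straight-line continuum extrapolation O = c0 + c1 * (a^2/t0).
# The lattice spacings, values and errors are synthetic placeholders.

def linear_extrapolation(x, y, err):
    """Return (c0, c0_err, c1, chi2) from a weighted fit y = c0 + c1*x.

    c0 is the value extrapolated to x = a^2/t0 = 0 (the continuum limit)."""
    w = [1.0 / e**2 for e in err]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = S * Sxx - Sx**2
    c0 = (Sxx * Sy - Sx * Sxy) / det
    c1 = (S * Sxy - Sx * Sy) / det
    c0_err = (Sxx / det) ** 0.5          # standard error of the intercept
    chi2 = sum(wi * (yi - c0 - c1 * xi)**2 for wi, xi, yi in zip(w, x, y))
    return c0, c0_err, c1, chi2


a2_over_t0 = [0.10, 0.20, 0.30, 0.40]          # hypothetical values of a^2/t0
values = [0.5 + 0.2 * x for x in a2_over_t0]   # exactly linear toy data
errors = [0.001] * len(a2_over_t0)

c0, c0_err, c1, chi2 = linear_extrapolation(a2_over_t0, values, errors)
print(f"continuum value = {c0:.4f} +/- {c0_err:.4f}, slope = {c1:.3f}")
```

Note that the error of the extrapolated value is larger than the error of the individual points, which is why the finite lattice spacing results are the more precise ones.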

Let us first discuss our results for \(H_E(1)\). On the right panel of Fig. 6 we show the large \(N\) fits both in the continuum and at a finite lattice spacing. The fits are excellent, which provides yet another confirmation of the scaling in powers of \(1/N^2\). It is worth mentioning that at finite lattice spacing, where results are very precise, we find that a quadratic fit in \(1/N^2\), excluding the \({\mathrm {SU}}(3)\) point, extrapolates to \({\mathrm {SU}}(3)\) within one standard deviation. In this sense, \({\mathrm {SU}}(3)\) can be used as a validation of our fitting strategy. The values of the parameters of the fitting function Y are displayed in Table 3. At finite lattice spacing we also include the parameters from the fit excluding \({\mathrm {SU}}(3)\).

Table 3 Parameters of the large \(N\) extrapolations of \(H_E\), \(G_W\) and \(H_W\). We present the results for three different cases, L: at finite lattice spacing, \(\hbox {L}_4\): at finite lattice spacing excluding \({\mathrm {SU}}(3)\), and C: in the continuum. Additionally, we fit the data to a function with \(b_0 =0\), so that factorization is imposed at finite lattice spacing L* and in the continuum C*. In this case, the value of \(\chi ^2/{{\mathrm {dof}}}\) validates this hypothesis

Concerning the \(N\rightarrow \infty \) limit itself, the extrapolated value is within two standard deviations from zero in the worst case. Notice that at finite lattice spacing, the errors in the extrapolation are two orders of magnitude smaller than the value of \(H_E(1)\) at \(N=3\). To further validate factorization, an additional fit is performed for which \(b_0 =0\) is fixed, and only \(b_1\) and \(b_2\) are fitted to the data. This enforces factorization, so the value of \(\chi ^2/{{\mathrm {dof}}}\) from the fit can be used to assess the validity of the assumption (\(L^*,C^*\) in Table 3). To summarize, for \(H_E(1)\) we find excellent agreement with factorization in the continuum, and a deviation of at most two standard deviations at finite lattice spacing, which is still statistically consistent with factorization.
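The logic of the constrained fit can be illustrated as follows: with \(b_0\) fixed to zero, only data that actually extrapolate to zero yield an acceptable \(\chi ^2/{{\mathrm {dof}}}\). The sketch below is not the analysis code of this work; the function name `fit_b0_zero`, the values of \(N\) and both toy data sets are hypothetical.

```python
# Sketch: fit with factorization imposed, Y(x) = b1*x + b2*x**2 (b0 fixed to 0),
# where x = 1/N^2. All numbers are synthetic placeholders.

def fit_b0_zero(x, y, err):
    """Weighted least-squares fit of y = b1*x + b2*x^2; returns (b1, b2, chi2/dof)."""
    w = [1.0 / e**2 for e in err]
    S2 = sum(wi * xi**2 for wi, xi in zip(w, x))
    S3 = sum(wi * xi**3 for wi, xi in zip(w, x))
    S4 = sum(wi * xi**4 for wi, xi in zip(w, x))
    T1 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    T2 = sum(wi * xi**2 * yi for wi, xi, yi in zip(w, x, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = S2 * S4 - S3**2
    b1 = (T1 * S4 - T2 * S3) / det
    b2 = (S2 * T2 - S3 * T1) / det
    chi2 = sum(wi * (yi - b1 * xi - b2 * xi**2)**2 for wi, xi, yi in zip(w, x, y))
    return b1, b2, chi2 / (len(x) - 2)


N_values = [3, 4, 5, 6, 8]                        # hypothetical set of gauge groups
x = [1.0 / N**2 for N in N_values]
err = [1e-4] * len(x)

# Toy data that factorize exactly (vanish at 1/N^2 = 0) ...
y_factorizing = [0.5 * xi - 0.3 * xi**2 for xi in x]
# ... and toy data with a non-zero N -> infinity limit.
y_offset = [0.01 + 0.5 * xi - 0.3 * xi**2 for xi in x]

_, _, chi2_good = fit_b0_zero(x, y_factorizing, err)
_, _, chi2_bad = fit_b0_zero(x, y_offset, err)
print(f"chi2/dof, factorizing data: {chi2_good:.2e}")
print(f"chi2/dof, data with offset: {chi2_bad:.2e}")
```

In this toy setting the offset data cannot be described by the constrained form, so the quality of the fit acts as a test of the hypothesis \(b_0 = 0\).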

We now turn to the smooth Wilson loops. In Fig. 7 we display the results of the continuum and large \(N\) fits for \(G_W(1)\). The parameters of the extrapolations at the three values of c are displayed in Table 3. Also in this case, we find that \({\mathrm {SU}}(3)\) can be used as a validation point, and if it is excluded from the fit, it agrees with the extrapolating function within two standard deviations at \(c=1\), and within one standard deviation at the remaining values of c.

Fig. 7 Left: Continuum extrapolation of \(G_W(1)\) for all gauge groups. Right: Large \(N\) extrapolations of \(G_W(1)\) in the continuum (solid line) and at finite lattice spacing (dotted line). The points at finite lattice spacing have been slightly shifted for better legibility

For the fits with factorization enforced by fixing \(b_0 = 0\), the values of \(\chi ^2/{{\mathrm {dof}}}\) are also excellent. These values, together with those of \(b_0\) reported in Table 3, give us confidence in the validity of factorization. Notice that the errors at large \(N\) are at least one order of magnitude smaller than the value at \({\mathrm {SU}}(3)\) itself. Concerning the finite \(N\) corrections, comparing the loops at different values of c, we observe that those at large c are characterized by large coefficients in front of the \(1/N^2\) and \(1/N^4\) correction terms.

Yet another interesting question is whether loops at fixed t but different R have different finite \(N\) corrections. We explore this issue at the end of the section. Let us first look at \(H_W\) in Fig. 8. The \(N\rightarrow \infty \) limits are less than two standard deviations away from zero. Inspecting the continuum limit fits, we observe that had we taken the three-point extrapolation for \(N=6\) as our central result, the central value would have been close to the upper end of the error bar in Fig. 8 and the \(1/N^2\) extrapolation in full agreement with factorization. In other words, one should not look too much at the central value but at the full range of the error, as always.

Fig. 8 Left: Continuum extrapolation of \(H_W(9/4)\) for all gauge groups. Right: Large \(N\) extrapolations of \(H_W(9/4)\) in the continuum (solid line) and at finite lattice spacing (dotted line). The points at finite lattice spacing have been slightly shifted for better legibility

5.2.1 Loop size dependence

Finally, let us explore how the finite \(N\) corrections to factorization change when the size of the loop is increased at a fixed value of the smoothing parameter t. For a given value of t, we consider square loops of size \(R(\xi ) = \xi \sqrt{8t}\). Given the finite size of the lattices, we use the loops measured at \(c=1/2\) (\(t=t_0/2\)), so that we can consider larger values of \(\xi \). Thus, at a fixed value of \(t=t_0/2\), we define \({\hat{W}}\) and \({\hat{G}}_W\) in a similar way as W and \(G_W\), but in this case, as a function of \(\xi \) instead of c. In addition to the already presented results at \(\xi =1\), we also measured \({\hat{W}}\) and \({\hat{G}}_W\) at \(\xi =1.25,\, 1.5, \, 1.75\) and 2. The coefficients obtained for the large \(N\) fits at finite lattice spacing are displayed in Table 4 as a function of \(\xi \).

Table 4 Parameters of the large \(N\) extrapolation of \({\hat{W}}\) and \({\hat{G}}_W\) as a function of \(\xi \)

Fig. 9 Plot of the parameters \(a_1\) and \(\log (b_1)\) as a function of \(\xi \). The interpolating function is a quadratic function in \(\xi \) in both cases

We observe that in the case of the loops themselves, the coefficients of the \(1/N^2\) expansion do not change significantly with \(\xi \), while those of \({\hat{G}}_W\) grow rapidly for larger loops. In fact, they grow exponentially fast as shown in Fig. 9. At finite \(N\), larger loops are much further away from \(N\rightarrow \infty \) than smaller loops.

6 Conclusions

We have numerically taken the large \(N\) limit of a few observables of \({\mathrm {SU}}(N)\) pure gauge theories, defining all dimensionful quantities in units of \(t_0\). This means that we held \(t_0\), or equivalently the coupling \(\bar{\lambda }_{{\mathrm {GF}}}\), Eq. (2.5), at a low energy, fixed when defining the approach to the limit. As explained in Sect. 3, the precise magnitude of the \(1/N^2\) corrections does depend on this choice. For each quantity, the continuum limit was taken before the large \(N\) limit, but we have also investigated large \(N\) scaling at finite lattice spacing, defined by \(a^2/t_0=\) constant.

In both cases we find that finite \(N\) observables are very well and very precisely described by a leading order term and corrections \(\sim 1/N^2\) and \(\sim 1/N^4\). We recall for example Fig. 4, where the excellent precision, in particular at finite a, is visible. In the same way, factorization has been confirmed very precisely. Of course, a numerical computation cannot substitute for a mathematical proof, but our results make it very implausible that anything goes wrong with the large \(N\) limit in general, or with factorization in particular.

However, the magnitude of corrections to the large \(N\) limit is more complex. We found a strong dependence on the physical size of the observables. For example, we considered \(R\times R\) Wilson loops smoothed with a smoothing radius equal to the loop size, \(\sqrt{8t}=R\). Table 2 shows that the deviation, \(\eta \), of SU(3) from SU(\(\infty \)) for these smooth loops increases from 3% at a loop size of \(R=0.2 \, {\mathrm {fm}}\) to 10% at \(R=1 \, {\mathrm {fm}}\).

When we increase the loop size R at fixed smoothing radius \(\sqrt{8t}=0.23 \, {\mathrm {fm}}\) from \(R=0.23 \, {\mathrm {fm}}\) to \(R=0.5 \, {\mathrm {fm}}\), the corrections \(\eta (1/9) \approx a_1/9\) (with \(a_1\) from Table 4 or Fig. 9) grow from 4% to more than 30%. The growth with R of the finite \(N\) corrections to factorization is even more dramatic as seen on the right panel in Fig. 9. These large corrections may also contribute to the fact that one has to go to very large \(N\) to approach the large \(N\) limit in the 1-point model [8, 57]. Of course, the dominating effect is expected to be that the color degrees of freedom provide the effective size of the system in that model.
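As a rough illustration of the sizes involved, one may invert the approximation \(\eta (1/9) \approx a_1/9\) quoted above. This assumes that the \(1/N^4\) term can be neglected at \(N=3\), which is an assumption made here only for the estimate; the numbers below are implied by the quoted percentages and are not read off from Table 4:

$$\begin{aligned} \eta (1/9) \approx \frac{a_1}{9} \approx 0.04 \;\Rightarrow \; a_1 \approx 0.36, \qquad \eta (1/9) \approx \frac{a_1}{9} \gtrsim 0.30 \;\Rightarrow \; a_1 \gtrsim 2.7 . \end{aligned}$$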

One may also speculate that the growth of factorization violations with the loop size parameter \(\xi \) is so strong that it spoils the large \(N\) limit altogether. We do not think that this is the case; rather, it is important to take the limits in the right order: take the \(N\rightarrow \infty \) limit first and then the limit of large loop size. In order to investigate this issue further, one should probably first understand the large \(\xi \) limit at fixed \(N\), maybe just \(N= 3\). Here the relation to the effective string theory of Yang–Mills is likely to play a role [58,59,60,61]. In a second step, one may then consider \(N\) large. Such a demanding programme is beyond the scope of our present work.

In summary, for the quantities studied explicitly, large \(N\) scaling is confirmed with high precision, but corrections to large distance observables can be substantial. One thus has to be careful when deriving quantitative information from large \(N\) considerations in gauge theories.