Topological properties of CPN − 1 models in the large-N limit

We investigate, by numerical simulations on a lattice, the θ-dependence of 2d CPN − 1 models for a range of N going from 9 to 31, combining imaginary θ and simulated tempering techniques to improve the signal-to-noise ratio and alleviate the critical slowing down of the topological modes. We provide continuum extrapolations for the second and fourth order coefficients in the Taylor expansion in θ of the vacuum energy of the theory, parameterized in terms of the topological susceptibility χ and of the so-called b2 coefficient. Those are then compared with available analytic predictions obtained within the 1/N expansion, pointing out that higher order corrections might be relevant in the explored range of N, and that this fact might be related to the non-analytic behavior expected for N = 2. We also consider sixth-order corrections in the θ expansion, parameterized in terms of the so-called b4 coefficient: in this case our present statistical accuracy permits to have reliable non-zero continuum estimations only for N ≤ 11, while for larger values we can only set upper bounds. The sign and values obtained for b4 are compared to large-N predictions, as well as to results obtained for SU(Nc) Yang-Mills theories, for which a first numerical determination is provided in this study for the case Nc = 2.


Introduction
The 2d CP N −1 models are a class of quantum field theories which have gained much importance in the study of non-perturbative properties of quantum gauge theories. Indeed, they present many interesting non-perturbative features, analogous to the ones of Yang-Mills theories, such as the existence of a θ-term, the presence of topologically-stable instantonic configurations and confinement of fundamental matter fields [1][2][3][4]. These characteristics make the CP N −1 theories ideal test-beds for the study of the main non-perturbative properties of gauge theories.
The Euclidean action of the CP N −1 models, with the topological term, can be written, introducing a non-propagating abelian gauge field A µ , as where z is a complex N -component scalar field satisfyingz(x)z(x) = 1, D µ is the usual U(1) covariant derivative, g is the 't Hooft coupling, θ is the CP -violating angle and is the topological charge.

JHEP01(2019)003
Like SU(N c ) models, also CP N −1 models offer the possibility of performing a large-N 't Hooft limit, i.e. keeping g fixed, to study the non-perturbative regime. However, contrary to Yang-Mills theories, these models are exactly solvable in this limit and large-N predictions are also quantitative [5][6][7][8][9]. Therefore, a careful inspection of this limit is possible and deserved in order to better understand the non-perturbative behavior of quantum gauge theories.
One of the most intriguing non-perturbative properties of CP N −1 models is the θdependence of their vacuum energy (density), defined as: where V is the 2d space-time volume. The vacuum energy is an even function of θ and is assumed to be analytic around θ = 0, so that one can express it as a Taylor expansion around this value. The usual parametrization employed in the literature is: where χ is the topological susceptibility and the b 2n coefficients, which parameterize the non-quadratic dependence of f on θ, are related to ratios involving higher cumulants of the topological charge distribution P (Q). The explicit expression of the first few terms as a function of the cumulants k n of P (Q) reads: On general grounds, large-N arguments predict χ =χN −1 + O(N −2 ) and b 2n = b 2n N −2n + O(N −2n−1 ). More precisely, one can show that [5][6][7][8] thatb 2n < 0 and that e 2 ≃ −0.0606 [5]. The length scale ξ appearing in eq. (1.6) is the second moment correlation length, defined as: where (1.10) If on one hand many efforts were put in the analytic study of the large-N limit of the CP N −1 models, on the other hand present numerical results are quite limited for these theories and mostly regarding the topological susceptibility. The situation is in some sense opposite to the case of SU(N c ) pure gauge theories, where quantitative analytic large-N predictions are lacking, but numerical results for the large-N behavior are available both for χ [6,[10][11][12] and b 2 [6,10,[13][14][15][16][17]. This fact constitutes a strong motivation to make a similar numerical inspection for the CP N −1 theories too and to look for a numerical validation of large-N analytic predictions for these models. The leading term in the 1/N expansion of the topological susceptibility has been checked quite carefully on the lattice [18][19][20][21], however large uncertainties exist regarding the next-to-leading correction. Indeed, many numerical works on the CP N −1 theories show a deviation from the leading term which appears to be of opposite sign compared to that predicted by eq. (1.6) [19,22]. As for the O(θ 4 ) term only a preliminary investigation is available [23], while for the other high-order corrections no numerical results have been reported up to now.
The purpose of the present paper is to make progress in this direction. For this reason we consider a range of N going from 9 to 31, with the aim of determining the first coefficients in the 1/N expansion for at least χ and b 2 and, possibly, for b 4 too. Two main numerical difficulties have to be faced to complete this task: i) as the continuum limit and/or the large-N limit are approached, standard updating algorithms fail to correctly sample the distribution of Q and field configurations get trapped in path integral sectors with fixed topology. This problem is usually known as the freezing of topological modes [24][25][26][27]; ii) the determination of higher order cumulants usually requires the detection of slight deviations of P (Q) from a Gaussian distribution, resulting in a low statistical accuracy (for a more detailed discussion of this problem see, e.g., ref. [17]). The problem is more and more critical as the large-N limit is approached, given that b 2n is expected to vanish like 1/N 2n .
To alleviate the second problem, we consider analytic continuation from simulations performed at imaginary θ values. Analytic continuation, originally explored for the study of QCD at finite baryon chemical potential [28,29], has been introduced as a tool to explore the θ-dependence of the mass gap in CP 1 models [30][31][32] and then successfully extended to the investigation of SU(N c ) gauge theories at non-zero θ [6,17,[33][34][35][36][37][38], there leading to a substantial improvement in the determination of the higher-order coefficients of f (θ), compared to the measure at θ = 0 alone.
As for the first problem, we have decided to exploit simulated tempering [39,40]: although other methods have been proposed recently [19,41,42], this algorithm has proven to be sufficiently powerful to allow us to reasonably contain statistical errors in the explored range of N .

JHEP01(2019)003
The paper is organized as follows. In section 2 we present the adopted discretization for the action and for the topological charge, along with a description of the imaginary-θ method and of the simulated tempering algorithm. In section 3 we present our numerical results for χ, b 2 and b 4 . Results are then discussed and compared to large-N analytic predictions in section 4: there we discuss about the possible influence from higher order terms in the 1/N expansion, also in connection with the non-analytic behavior expected for N = 2; moreover, results obtained for b 4 are compared also with a determination of this coefficient in Yang-Mills theories that we provide for the first time in this study. Finally, in section 5, we draw our conclusions.
2 Numerical set-up

Lattice action
We discretized the theory on a square lattice of size L with periodic boundary conditions. For the non-topological part of the action, we adopted the tree-level O(a) Symanzikimproved lattice action [18] where c 1 = 4/3, c 2 = −1/12, β L ≡ 1/g L is the bare coupling and U µ (x) satisfiesŪ µ (x)U µ (x) = 1. Next-to-nearest-neighbor interactions are introduced to cancel O(a) corrections to the continuum limit.
Several discretizations of the topological charge exist: the most straightforward one makes use of the plaquette operator Π µν (x): where, as usual, This choice leads to an analytic function of the gauge field but does not result in an integer value for a generic configuration. There are, instead, other possible definitions, known as geometric, which always result in an integer value when assuming periodic boundary conditions. For the CP N −1 models one can introduce two different geometric charges: one [18] makes use of the link variables U µ (x)

JHEP01(2019)003
the other [43] uses the projector P defined in eq. (1.10) Since the three definitions agree already after a small amount of cooling (see subsection 2.4), one can use them equivalently. We adopted the non-geometric one to discretize the θ term in the action (see the next section) since it allows to make use of an over-heat-bath update algorithm, while we used the geometric Q U charge for measures, to avoid dealing with lattice renormalizations (see subsection 2.2 for further details). Lastly, for the lattice correlation length, we adopted the usual discretization [44]: whereG L (p) is the Fourier transform of G L , the lattice version of the two-point correlator of P defined in eq. (1.10), and k = 2π/L. Being the CP N −1 theories asymptotically free, the continuum limit a → 0 is approached when β L → ∞ and, in this limit, the lattice correlation length diverges as ξ L ∼ 1/a. This allows to express finite lattice spacing corrections to the continuum limit in terms of ξ L . In our case linear corrections are removed by the adoption of the improved action, thus the expectation value of a generic observable will scale towards the continuum limit like: (2.7)

Imaginary-θ method
When expressed in terms of expectation values computed at θ = 0, the coefficients of the higher order terms in the Taylor expansion, reported in eq. (1.4), are difficult to determine because they involve small corrections to an almost Gaussian distribution. For this reason, the direct insertion of a source term in the action, i.e. a non-zero θ, leads to a significant improvement of the signal-to-noise ratio, because it permits to determine higher order terms in eq. (1.4) from the θ-dependence of lower order cumulants, see refs. [17,34] for a more detailed discussion. However, a theory with a non-vanishing θ-term cannot be straightforwardly simulated on the lattice, due to the sign problem that arises when considering the probability distribution of trajectories at finite θ: P ∝ e iθQ .
The idea is then to exploit the analytic θ-dependence of the theory around θ = 0 and to continue the path-integral to imaginary θ angles:

JHEP01(2019)003
Now the topological term is real and can be simulated on the lattice. To discretize the θ-term we chose the non-geometric lattice definition of the topological charge in eq. (2.2): This choice allows the use of a generalization of the over-heat-bath update used also for the θ = 0 case (see subsection 2.3).
As anticipated above, after adding the imaginary source term to the action we are no more measuring just the fluctuations of Q but also its response to the source term: that provides a strategy to determine χ and the b 2n coefficients that exploits the θ I -dependence of the cumulants k n of the topological charge distribution. Indeed, the explicit expression of the θ I -dependence of the first few cumulants reads: (2.9) A global fit to eqs. (2.9) provides an improved way of measuring χ and the b 2n on the lattice, compared to the naive application of eqs. (1.5) (see ref. [17] for an explicit example). Note that, since we choose Q L to discretize the topological term in the action, θ I is related to the lattice angle θ L by a multiplicative renormalization, which is nothing but the renormalization constant relating Q L to its continuum counterpart [21]: θ I = Z θ θ L . This is just a technical difficulty that can be trivially approached by treating Z θ as an additional fit parameter. On the other hand, the choice of the cooled geometric charge Q U (see subsection 2.4) for the measure of k n introduces no additional renormalization.

Numerical algorithm
With the chosen discretization, the total lattice action is linear in both U and z. More precisely, the part of the action that depends only on a single site/link variable can be expressed as:S where φ stands for U or z and F φ is the force. Thus, a local update algorithm can be adopted. In particular, we chose a mixture of over-heat-bath and microcanonic overrelaxation. More precisely, following ref. [18], we performed 4 sweeps of microcanonic for every sweep of over-heat. In the following sections we will refer to a sweep of the whole lattice as a unit of measurement of the Monte Carlo updating time, where sweep will refer to either microcanonic or over-heat in the proportion reported above.

JHEP01(2019)003
A detailed description of these two update procedures is reported in ref. [18]; the only difference is that here we need to modify the U -force to include the topological term: µ (x) is the forward/backward staple relative to U µ (x). This local algorithm suffers from a severe critical slowing down of topological modes: freezing is experienced both in the continuum limit and in the large-N limit, as reported in ref. [25]. To overcome this problem we exploited the simulated tempering algorithm [39,40]. The main idea behind this algorithm is to enlarge the configuration space promoting the parameters of the probability distribution to dynamical variables. In our case, setting E L ≡ S L /β L , that means: (2.11) Since decreasing β L corresponds to increasing the lattice spacing, by changing β L we can move the theory away from the continuum limit, where the slowing down is absent. This allows a faster change in the topological charge of the configuration at the highest β L , which is the most affected by the freezing. As for the change of θ L , larger angles make highercharge configurations easier to realize. Thus, changing θ L constitutes a further stimulation to change the charge of a configuration at a given β L .
To obtain an actual benefit, it is of utmost importance that the probability of occupying a certain couple (β L , θ L ) is as uniform as possible. To achieve this, we add a term to the action which depends just on θ L and β L : where the optimal choice, making the distribution exactly uniform, coincides with the free energy of the theory, F L (β L , θ L ) = − log Z L (β L , θ L ). Indeed with this choice: becomes a constant. The change of a single parameter, keeping the other fixed, is obtained including a Metropolis step in the algorithm, whose acceptance probability is: The free energy as a function of β L and θ L is estimated by a numerical integration of E L and Q L [39], which can be easily measured on the lattice: (2.14) -7 -

JHEP01(2019)003
Note that the free energies are needed only up to a global irrelevant constants, so free energy derivatives contain all the relevant information. To that purpose we ran some preliminary simulations to measure the θ L and β L -dependence of U L and k 1 and estimate the value of F (β L , θ L ) in a chosen interval of the parameters. We fixed the interval of values of β L and θ L to be used in the simulated tempering by using the following criteria: • β min is chosen in a region where the slowing down is not significant, β max is chosen according to how close to the continuum limit one wants to arrive, the spacing δβ among intermediate β L values is chosen so that there is a reasonable overlap between probability distributions of E L at different couplings (a necessary condition to get a reasonable Metropolis acceptance). We a posteriori verified that the simplest choice of a constant δβ is sufficient to obtain a reasonably uniform acceptance probability; • the choice of the θ angles does not affect the slowing down of topological modes, so θ min , θ max and δθ were chosen to have a reasonably large interval and enough points for the imaginary-θ global fit described in subsection 2.2.
From a practical point of view, simulated tempering has been implemented by trying, after each sweep, a change in either β L or θ L (with equal probability) to one of their nearest-neighbor values (with equal probability). Notice that simulated tempering has not been used for all values of N ; in particular for N < 15, for which the improvement is marginal, only standard local updating sweeps have been used.

Cooling procedure
To dampen the effects of the ultraviolet (UV) fluctuations in the measure of the lattice topological charge, we adopted a standard smoothing algorithm. Various lattice studies have shown the equivalence of different procedures, like cooling [45], the gradient flow [46], or smearing, once they are appropriately matched to each other [47][48][49]. In this study we chose cooling, because of its relative simplicity and speed.
Each cooling step consists in a sweep of the lattice in which every site/link variable is aligned to its local force: This update corresponds to a local minimization of the action and it typically does not alter the topological content of the configuration. For the cooling procedure the minimized action does not need to be the same governing the field dynamics. Therefore, to speed up the cooling process, we chose the unimproved θ = 0 action obtained by setting c 1 = 1 and c 2 = 0 in eq. (2.1).
Since the various charge discretizations discussed in subsection 2.1 differ at finite lattice spacing, we checked that all of them agree after a certain amount of cooling steps. In figure 1 we show an example for N = 21, θ L = 0 and some values of β L . It turns out that after ∼ 10 cooling steps all the definitions agree regardless of β L . Moreover, as we move towards the continuum limit, the discrepancy between the various definitions is less and less visible, as expected.

Simulations set-up
We performed several sets of simulations for each value of N . To start with, we performed numerical simulations at fixed values of β L and θ L in order to determine the free energy F L (β L , θ L ) entering simulated tempering. From some of these runs we have extracted the autocorrelation time (in number of sweeps) τ of Q 2 for various values of ξ L (β L ) and N in the case of the standard local algorithm. As it is well known, critical slowing down implies an exponential dependence of τ (Q 2 ) both on ξ L and on N [19,25]. It is interesting to observe that such a dependence seems to take the form of a universal function of the scaling variable ξ L N α , as one can see from figure 2. The coefficient α, fixed empirically to obtain a reasonable collapse of data points, turns out to be α = 1.2(1).
After this preliminary set-up, we performed simulations exploiting simulated tempering both in β L and in θ L , as explained above. The total accumulated statistics for each value of N has been of the order of 10 9 -10 10 total updating sweeps, typically divided in O(10 3 ) independent runs. As an example, in figure 3 we report the Monte Carlo evolution of β L during a simulation performed for N = 21.
The simulated tempering provides a gain of a factor up to 4 (for the largest explored N ) in terms of the autocorrelation time (measured in number of sweeps) τ (Q 2 ) for the highest β L (which is the one mostly affected by the freezing). This result is obtained discarding all the intermediate values of β L and keeping into account the machine time needed to generate them. The use of all the intermediate couplings to extract the continuum limit provides a further gain with respect to the case of independent simulations, although correlations between measures at different couplings introduced by the algorithm may, in principle, reduce this second advantage.
A number of independent dedicated simulations at fixed θ L = 0, with statistics of the order of 10 9 total sweeps for each N , were performed in order to obtain precise measure-  Table 1. Summary of the simulation parameters adopted for all values of N . We also report the total accumulated statistics, in terms of total sweeps. Measures were performed every 50 sweeps.
ments of ξ L needed to accurately determine ξ 2 L χ. A summary of our simulation parameters and statistics is reported in table 1.

Analysis procedure and results
The simulated tempering provides samples containing measures of Q U at different values of β L and θ L . Our idea is to fully exploit the information contained in these samples proceeding as follows: i) we compute the cumulants of Q U and we measure χ, b 2 and b 4 for each value of β L using a global fit to eqs. (2.9) (an example is shown in figure 4); ii) we exploit the determinations at all β L values to obtain the continuum limit extrapolation of ξ 2 χ, b 2 and b 4 for a given value of N , through a best fit to eq. (2.7) (an example of this extrapolation is reported in figure 5 for N = 21, with the corresponding Z θ values shown in figure 6).   The simulated tempering introduces correlations among measures at different θ L and β L values that come from the same Markov chain. Thus, it is of primary importance to carefully take these correlations into account to get a correct estimate of the statistical errors. To do so, we applied a bootstrap analysis to our data set, treating as a single independent extraction a whole set of measures coming from one of the O(10 3 ) independent simulated tempering runs performed for each N . Of course for runs at N ≤ 13, for which simulated tempering was not adopted, this analysis was not required and we relied on a simple binned jackknife analysis at each fixed β L and θ L .
Several sources of systematic error were checked. For the imaginary-θ fit we verified that the fit parameters, as well as the reducedχ 2 , were stable varying the number of cumulants and the θ-interval used in the fit. Concerning the truncation of the fit functions, stopping to the b 4 terms in each cumulant was always sufficient to get stable results and reducedχ 2 of order 1, while the introduction of higher-order terms in θ in the fit did not   Figure 6. Z θ results obtained for N = 21. Data are plotted against 1/β L since deviations of Z θ from 1 can be expanded in powers of 1/β L [21]. Note that the values of Z θ obtained in this work are much closer to 1 than those obtained in SU(N c ) Yang-Mills theories, see, e.g., ref. [6]. change results within errors. As for the extrapolation to the continuum limit, fitting with a linear function of ξ −2 L was always sufficient to get a reducedχ 2 of order 1. Furthermore, we checked that both the continuum limit and the reducedχ 2 were stable when varying the fit range. A final remark about finite-size corrections: we checked that in all our simulations L/ξ L > 10 and (L/ξ L ) 2 ≫ N (see ref. [50] for more details on this condition), so they are under control.
A summary of our continuum extrapolated results is reported in table 2, where we also report some results from other recent studies [19,51].
It is important to stress the good agreement between continuum extrapolated values obtained by studies adopting different discretizations: for N = 21 our result for ξ 2 χ JHEP01(2019)003 N ξ 2 χ · 10 3 b 2 · 10 3 b 4 · 10 5 9 20.00 (15) (2) -- Table 2. In this table we report continuum extrapolated results for different values of N . We report also some results obtained in other recent studies. In particular, results marked by * and ** are taken from refs. [19] and [51], respectively; in these works only ξ 2 χ is reported.
(Symanzik improved discretization) agrees within one standard deviation with that reported in ref. [19] (standard discretization). This represents a consistency check that the one percent precision level achieved in both cases is solid, i.e. it correctly takes into account the possible sources of statistical and systematic uncertainty. As for the b 4 coefficient, we notice that our present numerical precision only allows to set upper bounds after the continuum extrapolation is taken, with the only exceptions of N = 9 and 11, where a reliable non-zero determination has been obtained. The main purpose of this section is to make use of our results and of results from other recent studies [19,51], all summarized in table 2, to investigate the large-N behavior of the free energy coefficients and compare it with analytic predictions.
Let us start from the topological susceptibility: its analytic large-N prediction has been summarized in eq. (1.6). If one tries to numerically check the leading order term and fits data with N ≥ 26 using ξ 2 χ = e 1 /N , one obtains e 1 = 0.1600(8) withχ 2 = 2.86/2, which is in good agreement with the analytic result 1/(2π) ≃ 0.1592. This result is stable both under the change of the fit range and under the inclusion of a further quadratic term in 1/N .
The good agreement of the leading 1/N term with analytic predictions was already noted in previous studies where, however, some tension emerged when comparing with the next-to-leading analytic computation, e 2 ≃ −0.0606. This seeming discrepancy is clearly visible from figure 7, where N ξ 2 χ is reported together with the asymptotic result 1/(2π) for this quantity: deviations from the asymptotic value are typically positive. On the other hand, if one tries to fit all data according to N ξ 2 χ = 1/(2π) + e 2 /N , one obtains JHEP01(2019)003  Table 3. In this table we report systematics for the determination of the large-N behavior of ξ 2 χ using the fit function N ξ 2 χ = e 1 + e 2 /N + e 3 /N 2 . Blank spaces mean that the corresponding coefficient was set to 0 in the fit procedure while numerical values with no error mean that the corresponding coefficient was fixed to that value. the positive value e 2 = 0.12(1), but withχ 2 = 130/10, meaning that results cannot be accounted for by just the next-to-leading contribution: higher order terms are needed. As one can see from table 3, this is true even if the smallest values of N are discarded from the fit: theχ 2 is still large and the fitted value of e 2 is not stable.
Therefore, we tried to include a e 3 /N 3 correction in ξ 2 χ: as one can see from table 3, in this case e 2 turns out to be negative and actually compatible with the analytic prediction (even if with large error bars), with marginally acceptable values ofχ 2 when the smallest (N ≤ 10) values of N are discarded. One can even fix e 2 to its analytic prediction, without changing the quality of the fit and maintaining a stable prediction for e 3 in the range 1.5-2.
Let us summarize what, in our opinion, one can conclude from this analysis: presently available results are actually consistent with the analytic next-to-leading large-N prediction for ξ 2 χ, once a further term in the 1/N expansion is taken into account. However, an independent and numerically stable determination of e 2 is presently not possible: this is due to the fact that e 2 and e 3 turn out be of opposite signs, with e 3 at least one order of magnitude larger than e 2 , so that for the largest available values of N their contributions . Large-N behavior of N ξ 2 χ. We report also the values obtained in refs. [19] and [51]. still cancel each other, while for the smallest values (N 10) one cannot exclude that further terms could come into play. In order to make a further improvement, in the future one should try to substantially increase the precision of present data and/or to push the investigation towards even larger values of N .
We now pass to the analysis of b 2 . In this case, apart from some preliminary study [23], no results can be found in the literature, therefore this work represents the first numerical test of the large-N scaling of this coefficient. The leading order contribution is expected to be of order 1/N 2 , see eq. (1.7); for this reason, we have considered the quantity N 2 b 2 , which is expected to have a finite large-N limit and is plotted in figure 8.
Since results clearly point to a finite value of N 2 b 2 as 1/N → 0 (see figure 8), we tried to fit data according to the following function: fitting in the whole range we obtain: This result is stable if we change the fit range and we obtain compatible results fixing k 2 = 0 and fitting in a narrower range, however it is not compatible with the analytic prediction in eq. (1.7)b 2 = −27/5 = −5.4. Also in this case, a possible slow convergence to the large-N asymptotic regime could explain the disagreement: indeed, if in eq. (4.1) we fixb 2 = −27/5, we are still able to obtain a good fit, withχ 2 = 6/4, by allowing for just another k 3 /N 3 term in the expansion. Both best fits are reported in figure 8 with numerical results reported in table 4. As one can appreciate from this figure, a definite assessment of this issue will be possible with more precise data at the largest explored values of N , or once results will be available for 1/N 0.02, i.e. N 50.
Results for the b 4 coefficient will be discussed separately in section 4.3.  Table 4. In this table we report systematics for the determination of the large-N behavior of b 2 obtained using the fit function N 2 b 2 =b 2 + k 1 /N + k 2 /N 2 + k 3 /N 3 . The convention is the same of table 3.

Trying to match the large-N and the small-N behaviors
As we have discussed above, numerical results for ξ 2 χ clearly indicate that, within a 1/N expansion, at least O(1/N 3 ) contributions are needed for N ∼ O(10), and even higher order terms are likely to give non-negligible contributions. Actually, one must take into account that any N -dependence of topological quantities should match the divergence of the topological susceptibility which is expected [52][53][54][55][56], and has been numerically verified [57,58], for N = 2. A naive way to match the usual large-N expansion and the divergence in N = 2 JHEP01(2019)003 would be to change the expansion parameter from 1/N to 1 1/(N − 2); however, a more educated guess can be tried, based on how χ is expected to diverge as N = 2 is approached.
That can be done by recalling how the prediction for a divergence comes out. In the weak coupling regime, the instanton size distribution for the CP N −1 models is given by thus, the average number of instantons and antiinstantons expected in a finite box of size ρ 0 is approximately given (for N > 2) by Assuming, naively, that the instanton-antiinstanton gas is weakly interacting, as in the Dilute Instanton Gas Approximation (DIGA), then N I and N A are both distributed according to two independent Poissonian distributions: using this fact it is simple to show that i.e. the susceptibility is expected to diverge as 1/(N − 2). Of course, our assumption is questionable: the properties and dynamics of the small topological objects dominating at small N can be more complex [61][62][63], and they are not diluted at all, since their density diverges as N → 2. However, it is interesting to see if such a prediction is supported by present numerical data available for the smallest values of N . If we fit data with N < 15 to a functional dependence ξ 2 χ = a/(N − b) c we obtain b = 2.14(18) withχ 2 = 5.1/3 if we fix c = 1, and c = 1.013 (22) with a similarχ 2 if we fix b = 2, while our available information is still not enough to reliably fit both b and c at the same time. Finally, fixing both b = 2 and c = 1, one still getsχ 2 = 5.6/4 with a = 0.1385(5), meaning (when compared to 1/(2π) ≃ 0.1592) that the simple 1/(N − 2) behavior could still describe up to 90% of the total signal which is observed even for asymptotically large N values.
Therefore, we conclude that our present data are perfectly consistent with the presence of a 1/(N − 2) divergence, and that its amplitude is large enough to influence data for N ∼ O(10). It is clear that the standard 1/N expansion has to face the presence of such a divergence, so that its radius of convergence cannot be larger than 1/2. It is interesting to consider if one can devise a different form of the expansion, in which the presence of the divergence is explicitly taken into account, in such a way that the remaining N -dependence is more regular.
The divergence at N = 2 can be enforced in the fit in many different ways. We will use a/(N −2)+g N , but this expression is exactly equivalent to eq. (4.6) with f N = a+(N −2)g N . Also note that eq. (4.6) can be thought of as a partial resummation of the 1/N series, since: Expanding f N in powers of 1/N , one gets and from the expansion (4.7) one finds (e ′ 1 ) theo = 1/(2π) and (e ′ 2 ) theo = −2(e 1 ) theo + (e 2 ) theo ≃ −0.379 . Using this parametrization and fixing e ′ 1 = 1/(2π), we fitted (N − 2)ξ 2 χ versus 1/N , obtaining the results reported in table 5 and shown in figure 9. We notice that, although difficulties in measuring e 2 still persist, compared to the standard expansion the resulting fit is more stable, the reducedχ 2 is marginally smaller and the scaling window is larger. This might suggests that indeed the N − 2 ansatz in eq. (4.6) constitutes an improved parametrization, compared to the usual one, of the N -dependence of ξ 2 χ, since it encodes more relevant physics.
Finally, let us stress that the divergence of ξ 2 χ does not imply that also further coefficients of the θ-expansion should diverge for N = 2. For instance, in the naive noninteracting instanton gas approximation, all coefficients would stay finite even in the presence of a divergent instanton density. Actually, future studies, focussing on the small-N side of the problem, could clarify, by determining b 2 and other higher order terms in the θ expansion, the exact nature and distribution of the UV-diverging topological objects populating the small-N world. We have seen in the previous sections that it is possible to provide precise estimates of χ and b 2 in the CP N −1 models, thus fixing the θ-dependence of f (θ) up to O(θ 4 ). The numerical determination of higher orders in θ is however extremely challenging, since higher orders are suppressed by large powers of 1/N and they are thus very small. In this section we will present our results for b 4 , i.e. the coefficient fixing the θ 6 -dependence of f (θ).
In the large N limit, analytical computations [6,7] give for b 4 the prediction reported in eq. (1.8), hence we expect b 4 to be negative (and very small) for large enough N . The Ndependence that is numerically observed presents however some peculiar trends (figure 10). For N ≥ 21 negative values of b 4 are observed at finite lattice spacing, but the continuum extrapolations are well consistent with zero. This is more or less the behavior one could have guessed a priori given the large N theoretical results. However, one should notice that the 1.5 σ hint of a negative continuum result for N = 21 (b 4 = −1.7(1.2) · 10 −5 ) is in any case lower by more than one order of magnitude with respect to the leading large N prediction (b 4 ≃ −7 · 10 −4 for N = 21). Furthermore, this hint for a negative (continuum) b 4 does not become a full evidence for smaller values of N and in fact something new happens for N close to 10: for N = 9 and 11 the continuum extrapolations are clearly positive, although at coarse lattice spacing negative b 4 values are observed (see figure 11).
This fact suggests that, when looking at b 4 , the large N limit is not even qualitatively correct for small N values. An intriguing possibility, already advocated above, is that the physics at small N is in fact dominated by the small instantons responsible for the singularities that are present in the N = 2 case. It was shown in the previous section that, assuming a weakly interacting ensemble of instantons, this possibility leads to a peculiar small-N behavior for χ, which is in agreement with numerical data. Using this approximation it is simple to show that f (θ) has to be of the functional form f (θ) = χ(1 − cos θ), just like in DIGA, from which one gets b 2 = −1/12 and b 4 = 1/360. While the numerical values obtained for N = 9 and 11 are still far from these values, the qualitative picture we get is correct: b 2 is always negative (at large N due to eq. (1.7), at small N because b 2 = −1/12 in the DIGA) while b 4 changes sign. Also the behavior as a function of the lattice spacing of b 4 is easily explained: for small N it becomes positive only when the lattice spacing is small enough for small instantons to become dominant.
It is interesting to compare CP N −1 models with SU(N c ) Yang-Mills theories, where the situation is different in several respects: on one hand, we do not have analytical control on the large-N c limit, on the other hand nothing singular happens in the small-N c case, and in particular for N c = 2. The absence of small-N c singularities could be the reason, together with the fact that the expansion is in powers of 1/N 2 c , why the large N c scaling is known to extend in this case down to very small N c -values, i.e. N c = 3 and even 2. Since previous studies performed in SU(N c ) with N c ≥ 3 could not identify a non-vanishing b 4 value, we here perform a dedicated study for SU (2).
In order to estimate b 4 in SU(2) Yang-Mills theory we performed simulations using the same techniques described in ref. [6], to which we refer for further details. Standard heath-bath [64,65] and over-relaxation [66] algorithms were adopted to investigate four different values of the bare coupling (β = 2.70, 2.743, 2.7979, 2.85); the corresponding lattice spacings (a √ σ ≃ 0.1014, 0.0894, 0.0751, 0.063) were extracted from ref. [67] or cubic interpolation of data thereof. For each value of β simulations were performed for seven values of θ L in the range 0 ≤ θ L ≤ 12 and the physical size of the lattice satisfied in all the cases the relation L √ σ 3. The numerical results obtained using this set-up are shown in figure 12, from which we see that lattice artefacts are smaller than our statistical errors and extrapolating to the continuum limit with a linear function in a 2 we obtain b 4 = 6(2) · 10 −4 .
Despite the fact that the large-N c expansion works well also for small N c values, it would be far too optimistic to extract a large N c behavior from just an SU(2) result. It is nevertheless interesting to note that, assuming that the 1/N 4 c scaling holds down to N c = 2, one would expect b 4 ∼ 10 −4 for N c = 3, which is well within the upper bound on b 4 which was possible to achieve in refs. [6] and [17]: that clarifies why a non-vanishing value of b 4 was never detected in previous studies performed for N c > 2.
The value that we have obtained is also quite reasonable when compared with other results in the literature. For instance, when considering predictions about θ-dependence in pure gauge theories obtained within holographic approaches [68,69], one obtains b 4 /b 2 2 = 5/6 in the large N c limit [68], i.e. b 4 ≃ 0.033/N 2 c using the results of ref. [6] for the large N c values of b 2 . In the two color case this gives b 4 ≃ 2 · 10 −3 which has the same sign and JHEP01(2019)003 is just a factor three larger than our result. The same order of magnitude, even if with an opposite sign, is obtained using the expression for the vacuum energy θ-dependence in QCD derived from chiral perturbation theory (see, e.g., ref. [70]): it is easy to see that in this case one gets b 4 (QCD) = −3.5 · 10 −4 .

Conclusions
The purpose of this study was to make progress towards a numerical validation of analytic large-N predictions for the θ-dependence of 2d CP N −1 models. To that aim, we have explored a range of N going from 9 to 31, where we have provided continuum extrapolated results both for ξ 2 χ and for the b 2 coefficient, which parameterize fourth order terms in the θ expansion of the free energy density. We have also been able to provide, for the first time in the literature, a determination of sixth order contributions, parameterized in terms of the b 4 coefficient, both for CP N −1 models, with N ≤ 11, and for SU(N c ) Yang-Mills theories with N c = 2.
The large-N behavior of ξ 2 χ had already been explored by previous studies, which found deviations from the leading 1/N term of opposite sign with respect to the analytic prediction for the 1/N 2 term [5]. Our results have provided evidence that this may be due to a large 1/N 3 contribution: indeed, once that is taken into account, numerical results are consistent with the prediction of ref. [5]. A similar scenario applies to b 2 , which has been investigated systematically in our study for the first time: while, on one hand, the fact that b 2 is proportional to 1/N 2 at the leading order is well supported by numerical data, on the other hand consistency also with the predicted prefactor appearing in eq. (1.7) is possible only when O(1/N 4 ) corrections are taken into account.
The fact that higher order corrections in the 1/N expansion seem to be significant for N ∼ O(10) is at odds with what is observed for SU(N c ) Yang-Mills theories. We have put that in connection with the expected finite radius of convergence of the series, which is bounded by 1/2 because of the divergence of the topological susceptibility taking place for N = 2, where the topological fluctuations are dominated by small instanton and antiistantons, whose density is UV divergent.
Having that in mind, we have tried to improve the convergence of the 1/N expansion, after guessing the leading divergent behavior around N = 2. Indeed, assuming such small topological objects to be weakly interacting, one obtains a prediction for a 1/(N − 2) divergence of the topological susceptibility, which we have verified to be well supported by our present numerical data for N < 15. We have therefore tried to expand the function (N − 2)ξ 2 χ as a regular series in 1/N , obtaining results which are marginally better and more stable than those obtained within the standard 1/N expansion for ξ 2 χ.
The values obtained for the b 4 coefficient at N = 9 and 11, which are of opposite sign with respect to the leading 1/N 4 prediction, are also consistent with the presence of significant corrections related to small-N physics.
Present results could be improved in the future in various directions. On one hand, an effort to extend the analysis to larger values of N , using for instance the algorithm proposed in ref. [19], could be useful especially for a better quantitative test of the coefficient in front JHEP01(2019)003 of the leading 1/N 2 term for b 2 ; we have estimated that N 50 should be explored to that purpose. On the other hand, a precise determination of θ-dependence for small values of N , approaching N = 2, could be useful for a more precise matching between small-N and large-N behaviors. It would be also interesting to consider the non-zero momentum components of the topological susceptibility [71], which could give more information on the actual topological charge distribution.
Finally, let us comment on the result obtained for the b 4 coefficient in the 4d SU(2) pure gauge theory, b 4 = 6(2) · 10 −4 . Assuming that the 1/N 4 c scaling holds down to N c = 2, this value is well within previous upper bounds set for b 4 in SU(N c ) gauge theories with N c ≥ 3, and also well explains why a clear non-zero signal has not yet been achieved for this quantity in those cases. At the same time, the value appears in good semi-quantitative agreement (i.e. within a factor 3) with large-N c predictions obtained within holographic models [68].