Metadynamics Surfing on Topology Barriers: the $CP^{N-1}$ Case

As one approaches the continuum limit, QCD systems investigated via numerical simulations remain trapped in sectors of field space with fixed topological charge. As a consequence, numerical studies of physical quantities may give biased results. The same is true for the two-dimensional $CP^{N-1}$ models. In this paper we show that metadynamics, when used to simulate $CP^{N-1}$, allows us to address this problem efficiently. By studying $CP^{20}$ we show that we are able to reconstruct the free energy of the topological charge, $F(Q)$, and to compute the topological susceptibility as a function of the coupling and of the volume. The latter is a very important physical quantity in studies of the dynamics of the $\theta$ vacuum and of the axion. The method can in principle be extended to QCD.


I. INTRODUCTION
In numerical simulations of asymptotically free theories, $CP^{N-1}$ in two dimensions and $SU(N)$ in four dimensions, one observes an increase of the autocorrelation time $\tau$, defined as the number of iterations needed to generate independent field configurations, as one proceeds towards the continuum limit, where the coupling constant vanishes and the typical correlation length diverges. This corresponds to the critical slowing down occurring in statistical systems close to a second-order phase transition. In addition, in these systems a particularly dramatic increase of the autocorrelation time is observed for the topological charge, independently of the precise discretized definition used in the lattice simulation [1][2][3][4][5][6][7][8][9][10].
In general, the asymptotic scaling behavior of the autocorrelation time with the lattice spacing is expected to be power-like for all quantities. On the contrary, in both $CP^{N-1}$ and QCD the autocorrelation time of the topological charge, $\tau_Q$, grows so rapidly with the length scale $\xi$ that an apparent exponential behavior $\tau_Q \sim \exp(c\,\xi^{\theta})$ is compatible with the available data in the explored range of values of $\xi$ [2,3]. On the other hand, this peculiar effect is not observed for "quasi-Gaussian modes" such as the plaquette or Polyakov-line correlations, suggesting a separation of the dynamics of the topological modes from that of the quasi-Gaussian ones.
The different dynamical behavior of quasi-Gaussian and topological modes is induced by sizable free-energy barriers separating different regions of configuration space. The evolution in this space therefore presents a long relaxation time due to the transitions between different topological-charge sectors, and the corresponding autocorrelation time is expected to behave as $\tau_Q \sim \exp(\Delta F(Q))$, where $\Delta F(Q)$ is the typical free-energy barrier between different topological sectors. If the height of the barrier increases as we proceed toward the continuum limit, the system can be trapped in a topological-charge sector for a number of simulation steps comparable to, or even longer than, the total available resources of the simulation. In QCD this problem may bias the lattice predictions for several important physical quantities, such as the mass of the $\eta$ meson [11][12][13][14] or the polarized baryon structure function [15][16][17][18]. In general this bias will be present for any observable correlated with the topological charge, whenever the algorithm does not explore the different topological sectors with the appropriate weight. This has been shown to occur in Ref. [9] in the context of $CP^{N-1}$ models. It is clear that, in order to design a cure for these problems, one should also measure and understand how the typical free-energy barriers scale with the correlation length (and with the physical volume) in a simulation.
In this paper we propose to address the problem of topological trapping by using metadynamics, a powerful method that was introduced to enhance the probability of observing rare conformational changes and to reconstruct the free energy in biophysics, chemistry and materials-science systems [26,27].
The metadynamics method [26,27] requires the preliminary identification of Collective Variables (CV) which are assumed to be able to describe the process of interest. A CV is a (physical) quantity which depends on a set of the coordinates or fields of the system.
In the simplest case the variable could be just the topological charge itself, although the full power of this approach lies in its ability to treat several CVs simultaneously. The dynamics in the space of the chosen CVs is enhanced by a history-dependent potential constructed as a sum of Gaussians centered along the trajectory followed by the CVs. The sum of Gaussians is exploited to reconstruct iteratively an estimator of the free energy.
This procedure resembles the Wang and Landau algorithm [28], in which a time-dependent bias is introduced to modify the density of states to produce flat histograms in models with discrete energy levels.
The history of microscopic systems in normal Monte Carlo or Hybrid Monte Carlo (HMC) [10] corresponds to a random walk with a bias in the direction of lower free energy.
In systems with many local minima the probability to explore the transition regions and tunnel in a different minimum is very small, particularly when the height of the barrier is high. In metadynamics, the system has access to a feedback which during the time evolution fills the local free energy minima. Thus, even if at the beginning the system visits more often the region at the bottom of a local minimum, after a few steps almost deterministically it starts exploring regions corresponding to higher and higher values of the free energy. Sooner or later, the system fills the minimum, climbs out of it, and visits another minimum that is eventually also filled, until all the relevant minima are explored.
The key idea of metadynamics is exploiting the time-dependent bias potential itself as a free energy estimator. In particular, the time average of the bias potential has been shown to converge to the negative of F with an error that scales to zero with the inverse square root of the simulation time [29].
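This filling-and-averaging mechanism can be illustrated with a self-contained toy model. Everything below (the double-well free energy, the Metropolis dynamics, the deposit height and width) is an illustrative assumption, not the lattice setup of this paper: a single variable $q$ evolves under the biased action, Gaussians are deposited along its history, and the accumulated bias reproduces the barrier height.

```python
import numpy as np

# Toy sketch of metadynamics (illustrative, not the paper's setup):
# one degree of freedom q in a double-well free energy, Metropolis
# dynamics, Gaussian bias deposits along the trajectory.
rng = np.random.default_rng(0)

def F(q):                       # target free energy: wells at q = +-1, barrier 4 at q = 0
    return 4.0 * (q * q - 1.0) ** 2

W, DQ = 0.1, 0.2                # height and width of each deposited Gaussian (assumed values)
centers = []                    # history of deposit positions

def V_bias(q):                  # history-dependent potential: sum of Gaussians
    if not centers:
        return 0.0
    c = np.asarray(centers)
    return float(W * np.exp(-0.5 * ((q - c) / DQ) ** 2).sum())

q = -1.0
for t in range(10000):          # Metropolis on the biased action F(q) + V_G(q, t)
    qn = q + rng.normal(0.0, 0.1)
    dE = (F(qn) + V_bias(qn)) - (F(q) + V_bias(q))
    if dE < 0 or rng.random() < np.exp(-dE):
        q = qn
    if t % 5 == 0:
        centers.append(q)       # deposit a repulsive Gaussian at the current point

# Once the wells are filled, -V_G(q) + const tracks F(q): compare the
# well-to-barrier free-energy gap (true value: 4) with the bias estimate.
est_barrier = 0.5 * (V_bias(-1.0) + V_bias(1.0)) - V_bias(0.0)
```

With these (illustrative) parameters the trajectory tunnels repeatedly between the two wells, and the estimated gap converges to the true value up to fluctuations of order $w$.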
We here propose to address the problem of topological trapping by performing metadynamics on the topological charge. We will show that this approach induces a large number of transitions between different sectors, and therefore converges very rapidly. At the same time, the approach allows computing the unbiased average value of any observable by standard reweighting techniques. In order to test our proposal we first study the two-dimensional $CP^{N-1}$ models, which share several features with QCD, such as asymptotic freedom and a non-trivial topological structure. Since these models require much smaller computing resources, they are an ideal theoretical laboratory for the early, exploratory stage of any new algorithm. We find the improvement induced by metadynamics considerable and worth implementing in a QCD study that we plan to perform in the near future.
The paper is organized as follows: in Sec. II we recall the basic ingredients of $CP^{N-1}$ and define the quantities that will be measured in our numerical simulation; in Sec. III the metadynamics algorithm is described; in Sec. IV we present the results of our numerical study and compare several quantities obtained with and without metadynamics. Our conclusions are given in Sec. V and some technical details are presented in appendix A.

II. $CP^{N-1}$ AND THE DIFFERENT DEFINITIONS OF THE TOPOLOGICAL CHARGE ON THE LATTICE
In the continuum the two-dimensional $CP^{N-1}$ model is defined by the action
$$S = \frac{N}{g} \int d^2x \, \overline{D_\mu z} \cdot D_\mu z \, ,$$
where $z$ is a complex $N$-dimensional field with unit norm, $\bar{z} \cdot z = \sum_{i=1}^{N} z_i^* z_i = 1$, and the covariant derivative is given by $D_\mu = \partial_\mu + i A_\mu$. The field $A_\mu$ has no kinetic term; for this reason, by using its equation of motion, $A_\mu$ can be expressed in terms of the field $z$:
$$A_\mu = \frac{i}{2} \left[ \bar{z} \cdot \partial_\mu z - (\partial_\mu \bar{z}) \cdot z \right] \, .$$
The action is invariant under the local $U(1)$ gauge transformation $z(x) \to e^{i\Lambda(x)} z(x)$, $A_\mu(x) \to A_\mu(x) - \partial_\mu \Lambda(x)$. The connected correlation function and the magnetic susceptibility are defined as
$$G(x) = \langle \mathrm{Tr}\, P(x) P(0) \rangle - \frac{1}{N} \, , \qquad \chi_m = \sum_x G(x) \, ,$$
where $P(x)$ is the projector with matrix elements $P_{ij}(x) = z_i(x) \bar{z}_j(x)$. In the continuum one can define the correlation length $\xi_W$ from the large-distance exponential decay of $G(x)$, or $\xi_{II}$ from the second moment of the correlation function. In our numerical analysis we have used the definition $\xi_G$ given below in eq. (12), which is proportional to $\xi_W$ and $\xi_{II}$ in the scaling region.
The topological-charge density $q(x)$, the total topological charge $Q$ and the topological susceptibility $\chi_t$ are defined as
$$q(x) = \frac{1}{2\pi}\, \epsilon_{\mu\nu}\, \partial_\mu A_\nu(x) \, , \qquad Q = \int d^2x \, q(x) \, , \qquad \chi_t = \frac{\langle Q^2 \rangle}{V} \, ,$$
where $V$ is the volume. The lattice action is given by [30]
$$S_L = N\beta \sum_{n,\mu} \overline{D_\mu z_n} \cdot D_\mu z_n \, ,$$
where we introduce the lattice covariant derivative $D_\mu z_n = \lambda_{n,\mu}\, z_{n+\hat\mu} - z_n$, expressed in terms of the $U(1)$ gauge link $\lambda_{n,\mu} \equiv \exp\left(i a A_\mu(n + \hat\mu a/2)\right)$, where $a$ is the lattice spacing and $\hat\mu$ the unit vector in the direction $\mu$. This is the simplest nearest-neighbor derivative; improved versions of it could also be used. Lattice gauge invariance, in its linear realization, reads $\lambda_{n,\mu} \to e^{i\Lambda_n}\, \lambda_{n,\mu}\, e^{-i\Lambda_{n+\hat\mu}}$, $z_n \to e^{i\Lambda_n}\, z_n$.
By integrating by parts the term in eq. (8) and removing the field-independent constant, we arrive at the action used in our numerical simulation,
$$S_L = -2N\beta \sum_{n,\mu} \mathrm{Re}\left( \lambda_{n,\mu}\, \bar{z}_n \cdot z_{n+\hat\mu} \right) \, .$$
We use the lattice definition of the correlation length $\xi_G$ [3],
$$\xi_G^2 = \frac{1}{4 \sin^2(\pi/L)} \left[ \frac{\tilde{G}(0)}{\tilde{G}(q_m)} - 1 \right] \, , \qquad (12)$$
where $\tilde{G}(q)$ is the Fourier transform of $G(x)$ and $q_m$ is the smallest non-zero dimensionless momentum on a lattice with lattice spacing $a$ and physical size $aL$, namely $q_m = (2\pi/L, 0)$. In the continuum limit $a\,\xi_G \to \xi_W$, where the correlation length $\xi_W$ was defined above.
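The second-moment definition of $\xi_G$ in eq. (12) can be sketched in a few lines. The synthetic one-dimensional periodic correlator below is an illustrative stand-in for the measured $G(x)$ (the lattice definition applies the same formula with the smallest momentum $q_m = 2\pi/L$):

```python
import numpy as np

# Sketch of the second-moment correlation length of eq. (12):
# xi_G^2 = [G~(0)/G~(q_m) - 1] / (4 sin^2(pi/L)), with q_m = 2*pi/L.
def second_moment_xi(G):
    L = len(G)
    Gt = np.fft.fft(G).real          # G is real and symmetric -> real FFT
    return np.sqrt((Gt[0] / Gt[1] - 1.0) / (4.0 * np.sin(np.pi / L) ** 2))

L, xi = 256, 8.0                     # illustrative lattice size and true xi
x = np.arange(L)
G = np.exp(-np.minimum(x, L - x) / xi)   # periodic exponential correlator
# second_moment_xi(G) reproduces xi up to small O(1/xi) lattice corrections
```

For an exponential correlator on a ring one finds $\xi_G = 1/\left[2\sinh(1/2\xi)\right] \approx \xi$, so the estimator reproduces the input correlation length in the scaling region.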
At finite lattice spacing no unique definition of the topological charge exists. One possible geometrical definition was introduced in Ref. [31]:
$$Q_g = \frac{1}{2\pi} \sum_n \mathrm{Im} \left\{ \ln \mathrm{Tr}\left[ P(n+\hat\mu+\hat\nu)\, P(n+\hat\mu)\, P(n) \right] + \ln \mathrm{Tr}\left[ P(n+\hat\nu)\, P(n+\hat\mu+\hat\nu)\, P(n) \right] \right\} \, , \qquad (13)$$
where $\mu \neq \nu$ and $P$ is the projector defined above. Thanks to the periodicity of the lattice, this definition is guaranteed to be an exact integer on every configuration (strictly speaking this holds for all configurations except a subset of measure zero). Different geometrical definitions of the topological charge, however, are not guaranteed to return the same integer, especially on coarse lattices. $Q_g$ can equivalently be written in terms of the phases $\theta_{n,\mu} = \arg(\bar{z}_n \cdot z_{n+\hat\mu})$. Being strictly integer-valued, the geometrical definition of eq. (13) is not differentiable with respect to the fields. Another definition of the topological charge, which better serves our scope, is given in terms of the imaginary part of the plaquette, as illustrated in Ref. [32]:
$$Q_\lambda = \frac{1}{2\pi} \sum_n \mathrm{Im}\, \Pi_n \, , \qquad \Pi_n = \lambda_{n,\hat 1}\, \lambda_{n+\hat 1,\hat 2}\, \bar{\lambda}_{n+\hat 2,\hat 1}\, \bar{\lambda}_{n,\hat 2} \, . \qquad (15)$$
$Q_\lambda$ differs from the geometrical definition of eq. (13): we have
$$Q_\lambda = Z_Q\, Q + \eta \, ,$$
where $Z_Q$ is a suitable renormalization constant that can be computed in perturbation theory and $\eta$ is a zero-average additive ultraviolet noise depending on the particular field configuration, whose variance is an increasing function of the lattice volume and depends only mildly on the lattice spacing.
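The contrast between the integer-valued geometric charge and the smooth plaquette charge can be illustrated on $U(1)$ link phases alone. This is a simplified analogue, an assumption for illustration: the paper's $Q_g$ of eq. (13) is built from the projectors $P(n)$, but its integer-valuedness arises in the same way, from the $2\pi$ ambiguity of the logarithm.

```python
import numpy as np

# Two lattice charges on random U(1) link phases (gauge-field analogue
# of eqs. (13) and (15)): the wrapped plaquette phase sums to an exact
# integer multiple of 2*pi, while Im(plaquette) is smooth but non-integer.
rng = np.random.default_rng(1)
L = 16
theta = rng.uniform(-np.pi, np.pi, size=(2, L, L))   # link phases theta_{n,mu}

# plaquette phase p(n) = th_0(n) + th_1(n+0^) - th_0(n+1^) - th_1(n)
p = (theta[0] + np.roll(theta[1], -1, axis=0)
     - np.roll(theta[0], -1, axis=1) - theta[1])

Q_g = np.angle(np.exp(1j * p)).sum() / (2.0 * np.pi)  # geometric: exact integer
Q_l = np.sin(p).sum() / (2.0 * np.pi)                 # Im(plaquette): differentiable
```

On any periodic configuration the unwrapped phases cancel telescopically, so the first sum is an exact integer, whereas the second definition is smooth in the fields, which is what the metadynamics bias requires.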
Such noise can be reduced by computing the topological charge of eq. (15) after "smoothing" the $\lambda$ fields (see e.g. Ref. [34] for a recent comparison between cooling and the Wilson flow in the context of QCD), or by smearing the gauge fields with procedures like APE [35] or HYP [36].
Introducing a bias potential related to a non-geometric definition of the topological charge may accelerate the dynamics of the ultraviolet noise rather than that of the degrees of freedom ultimately related to the tunneling between different topological sectors. As a matter of fact, it turns out to be useful to filter out the highest-frequency components of the noise, with the purpose of capturing those components expected to be more closely related to the tunnelling phenomenon. To this end we make use of a modification of the stout smearing [37], adapted to treat $U(1)$ variables, according to the procedure explicitly described in appendix A.
We stress that the smeared topological charge is exploited in our approach only to enhance the tunneling between different minima, thus improving the convergence of the system to thermalisation with the possibility of averaging over all the different topological sectors. After convergence is achieved, one can then compute the properties of the system described by the normal CP N −1 action by using standard reweighting techniques. For example, one can compute the probability of observing different values of the exact Q.
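The reweighting step mentioned above can be sketched with a minimal toy model. The three-state "sector free energies" below are made-up numbers, not the paper's data; the point is only the mechanics: configurations sampled with the biased weight $e^{-S - V_G}$ are corrected by the factor $e^{+V_G}$ to recover unbiased averages.

```python
import numpy as np

# Sketch of reweighting: <O> = <O exp(+V_G)>_biased / <exp(+V_G)>_biased.
# Three toy "sectors" with assumed free energies; the converged bias
# flattens the sampled distribution, the weights undo the bias.
rng = np.random.default_rng(2)

F_sec  = np.array([0.0, 3.0, 0.5])     # illustrative sector free energies
V_bias = -F_sec                        # a converged bias potential ~ -F
q_vals = np.array([-1.0, 0.0, 1.0])    # value of the observable per sector

p_biased = np.exp(-F_sec - V_bias)
p_biased /= p_biased.sum()             # uniform: the biased run visits all sectors

s = rng.choice(3, size=200000, p=p_biased)    # biased "Monte Carlo history"
w = np.exp(V_bias[s])                         # reweighting factors
q2_rw = (q_vals[s] ** 2 * w).sum() / w.sum()  # unbiased estimate of <Q^2>

p_true = np.exp(-F_sec) / np.exp(-F_sec).sum()
q2_exact = (q_vals ** 2 * p_true).sum()       # exact unbiased value
```

The reweighted estimate agrees with the exact unbiased average within the statistical error of the sampled history.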
To conclude this section we define the fictitious-time autocorrelation function
$$C_O(t) = \frac{\langle O(t_0)\, O(t_0 + t) \rangle - \langle O \rangle^2}{\langle O^2 \rangle - \langle O \rangle^2} \, ,$$
where $O$ is any observable (topological charge, magnetization, $\xi$, etc.), $t$ is the discrete simulation time expressed in sweeps, and the averages are taken after the thermalisation of the system. We expect that at large $t$ the autocorrelation function decays as $C_O(t) \sim \exp(-t/\tau_O)$, which defines the autocorrelation time $\tau_O$.
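A minimal estimator of this autocorrelation function can be checked on a synthetic AR(1) history with a known autocorrelation time (an illustrative stand-in for a Monte Carlo history of an observable):

```python
import numpy as np

# Estimator of C_O(t), tested on an AR(1) series whose exact
# autocorrelation is rho**t, i.e. tau_O = -1/log(rho) ~ 9.5 here.
def autocorr(O, tmax):
    O = np.asarray(O, dtype=float) - np.mean(O)
    var = np.mean(O * O)
    return np.array([np.mean(O[: len(O) - t] * O[t:]) / var
                     for t in range(tmax)])

rng = np.random.default_rng(3)
rho, n = 0.9, 200000
x = np.empty(n)
x[0] = 0.0
eps = rng.normal(size=n)
for t in range(1, n):                  # AR(1): x_t = rho * x_{t-1} + noise
    x[t] = rho * x[t - 1] + eps[t]

C = autocorr(x, 20)                    # expect C[t] ~ exp(-t / tau_O)
```

Fitting the exponential decay of `C` (or summing it, for the integrated autocorrelation time) recovers $\tau_O$ up to statistical fluctuations.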

III. METADYNAMICS
Let us consider a physical system described by a set of coordinates $x$ and an action $S(x)$, evolving under the effect of a given dynamics that samples an equilibrium probability proportional to $\exp(-S(x))$. We are interested in exploring the properties of the system as a function of a CV $Q(x)$. The probability distribution of this variable is given by
$$P(q) = \frac{1}{Z} \int dx \; e^{-S(x)}\, \delta\left(q - Q(x)\right) \, ,$$
and the corresponding free energy is $F(q) = -\log P(q)$. The probability distribution and the free energy can be estimated by computing the histogram of $q = Q(x)$ over a sufficiently long trajectory of the system. If the system displays metastability, namely if $P(q)$ is characterized by the presence of two or more local maxima separated by regions where the probability of observing the system is small, this estimator is affected by large statistical errors, since $q$ is normally trapped in a region of high probability and only rarely performs a transition to another region. This is the case for the dynamics of a $CP^{N-1}$ model, where at small lattice spacing the system is trapped in a specific topological sector.
In metadynamics the action $S(x)$ is modified by adding to it a history-dependent potential of the form
$$V_G(q, t) = \sum_{t' \le t} g\left(q - q(t')\right) \, , \qquad (18)$$
where $g(q)$ is a non-negative function of its argument that rapidly vanishes for large $|q|$.
In the original implementation $g(q) = w \exp\left(-\frac{q^2}{2\,\delta q^2}\right)$, where $w$ and $\delta q$ are two parameters that can be used to tune the growth speed of $V_G$. Thus the metadynamics potential is a sum of small repulsive potentials, acting on the CV and localized on all the configurations $q(t)$ that have already been explored during the trajectory up to time $t$. This potential disfavors the system from revisiting configurations that have already been visited. If the dynamics of $q$ is bound in a finite connected region, after a transient the probability distribution of $q$ in this region can only be approximately flat. Indeed, if this were not the case, the system would by definition spend more time in a subregion of $q$, and $V_G$ would grow preferentially in that region, disfavoring the system from remaining there. Thus deviations from the flat distribution can only survive for a transient time: $P(q) \exp(-V_G(q,t))$ must be approximately constant or, equivalently,
$$V_G(q, t) \simeq -F(q) + c(t) \, . \qquad (20)$$
This equation states that in metadynamics the free energy is estimated by the negative of the bias potential itself. More precisely, since eq. (20) is valid at any (sufficiently large) time, the best estimator of the free energy at time $t$ is given by the time average of $V_G$,
$$F(q) \simeq -\frac{1}{t - t_{eq}} \int_{t_{eq}}^{t} V_G(q, t')\, dt' + \mathrm{const} \, . \qquad (21)$$
The equilibration time $t_{eq}$ entering in eq. (21) is the time at which the history-dependent potential becomes approximately stationary (or, equivalently, the probability distribution as a function of $q$ becomes approximately flat). As in ordinary estimates of the average value of an observable, the exact choice of $t_{eq}$ influences only the convergence speed, not the final result. The difference between $-F$ and the time-averaged $V_G$ in eq. (21) decreases as the inverse square root of $t - t_{eq}$, with a prefactor that strongly depends on the specific CV $q$ [29].
Since the bias potential in eq. (19) must be a differentiable function of the fields, we use as CV the topological charge Q λ defined in eq. (15) rather than the "integer" charge of eq. (13).
The probability of observing different values of the exact Q is then computed by reweighting, as discussed below.
The most efficient choice of the smoothing parameters is the one for which the probability distribution of $Q_\lambda$ is as broad as possible while still allowing an unambiguous assignment of each configuration to an integer value of the topological charge. In other words, the probability distributions of $Q_\lambda$ restricted to the different topological sectors of $Q_g$ should not overlap. This determines the optimal width of the $Q_\lambda$ distribution.
For the sake of computational efficiency, we store the history-dependent potential on a regular grid of spacing $\delta q$, $(q_0, q_1, \cdots, q_n)$, with $q_i = q_0 + i\, \delta q$. At the beginning of the simulation we set $V_i = V_G(q_i) = 0$. Then, at every step, we
1. compute the value of the CV, $q(t) \equiv Q_\lambda(t)$;
2. find the grid interval $[q_i, q_{i+1}]$ where it falls;
3. update the history-dependent potential as follows:
$$V_i \to V_i + w\, \frac{q_{i+1} - q(t)}{\delta q} \, , \qquad V_{i+1} \to V_{i+1} + w\, \frac{q(t) - q_i}{\delta q} \, . \qquad (22)$$
Thus, in our implementation the function $g(q)$ entering in eq. (18) is a sum of two triangular-shaped functions, one centered on $q_i$, the other on $q_{i+1}$.
The force ruling the evolution of the fields $x$ is then changed by adding the component deriving from the history-dependent potential; the total force is
$$F_{tot} = -\frac{\partial S(x)}{\partial x} - \frac{\partial V_G(q)}{\partial q}\bigg|_{q = Q_\lambda(x)} \frac{\partial Q_\lambda(x)}{\partial x} \, . \qquad (23)$$
The two crucial parameters in this procedure are $w$ and $\delta q$. For too large values of $w$, an accurate integration of the trajectory would require an infinitesimal time step (for vanishing $w$ metadynamics reduces to standard HMC). The optimal grid spacing $\delta q$ must be such that i) the potential wells are filled rapidly, which requires a large $\delta q$; ii) the free energy $F(q)$ can be accurately reconstructed, which requires a small $\delta q$. Suitable values of $w$ and $\delta q$ can be initially estimated in an unbiased run by measuring, respectively, the transition probability between two topological sectors and the fluctuations of $Q_\lambda$ within a given sector [27]. The first gives the height of the barrier, and $w$ must be significantly smaller than this height, while the second gives an estimate of the width of the distribution.
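One concrete realization of the grid bookkeeping of eqs. (22) and (23) can be sketched as follows. This is an assumption consistent with the description above, with illustrative grid range and $w$: the deposit is split linearly between the two nearest grid points, and the bias force is minus the local slope of the linearly interpolated potential.

```python
import numpy as np

# Sketch of the grid-based bias (one possible realization, illustrative
# parameters): the deposit w is shared between the two nearest grid
# points, and the bias force is -dV_G/dq on the local interval.
def deposit(V, q, q0, dq, w):
    i = int((q - q0) // dq)            # interval [q_i, q_{i+1}] containing q
    frac = (q - (q0 + i * dq)) / dq    # relative position inside the interval
    V[i] += w * (1.0 - frac)           # closer to q_i -> larger share
    V[i + 1] += w * frac

def bias_force(V, q, q0, dq):
    i = int((q - q0) // dq)
    return -(V[i + 1] - V[i]) / dq     # -dV_G/dq, piecewise constant

q0, dq = -1.0, 0.2
V = np.zeros(11)                       # grid q_i = -1.0, -0.8, ..., +1.0
deposit(V, 0.05, q0, dq, w=1.0)        # q = 0.05 falls in [0.0, 0.2]
f = bias_force(V, 0.05, q0, dq)        # pushes q away from the heavier node
```

Each deposit conserves the total weight $w$, and in the field evolution the returned force would multiply $\partial Q_\lambda / \partial x$, as in eq. (23).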
In order to reach a stationary state in which the probability distribution of $q$ is flat, it is necessary to constrain the dynamics of $q$ in a finite region. This can be done without loss of generality by suppressing the configurations, corresponding to large values of $q$, that have an exponentially low probability weight. A well-established method [27] is the introduction of a threshold value of the topological charge, $Q_{thr}$: we disfavor values of $|Q_\lambda|$ larger than the threshold by a repulsive force which increases linearly starting from $|Q_\lambda| = Q_{thr}$, i.e. we add to the action the restraining potential
$$V_{thr}(q) = k \left( |q| - Q_{thr} \right)^2 \theta\left( |q| - Q_{thr} \right) \, , \qquad (24)$$
where $k$ is a new parameter which should be chosen of the same size as $w/(2\,\delta q^2)$. $Q_{thr}$ can be estimated from the value of the topological susceptibility, choosing $Q_{thr} \gg \sqrt{\chi_t\, L^2}$.
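The restraining force described above can be sketched in a few lines (illustrative parameters; the quadratic wall of eq. (24) gives a force growing linearly beyond the threshold):

```python
# Restraining wall beyond |q| = Q_thr: zero force inside the allowed
# region, a linearly growing restoring force outside (parameters are
# illustrative; k would be chosen of order w / (2 * dq**2)).
def wall_force(q, q_thr, k):
    if abs(q) <= q_thr:
        return 0.0
    # -dV_thr/dq = -2 k (|q| - q_thr) sign(q): always points back inside
    return -2.0 * k * (abs(q) - q_thr) * (1.0 if q > 0.0 else -1.0)
```

Inside the threshold the metadynamics dynamics is unmodified; outside, the wall pushes $Q_\lambda$ back toward the sampled window.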
In summary, five parameters have to be tuned, namely $w$, $\delta q$, $Q_{thr}$, $k$ and the smearing parameter $\rho$ discussed in appendix A. The grid on $q$ in eq. (22) is chosen in such a way that $q_1 = -Q_{thr}$ and $q_{n-1} = +Q_{thr}$. The forces from metadynamics in eq. (23) are added only when $Q_\lambda(x) \in [q_1, q_{n-1}]$; the history-dependent potential $V_i$ is updated according to eq. (22) also when $Q_\lambda(x) \in [q_0, q_1]$ or $Q_\lambda(x) \in [q_{n-1}, q_n]$. Note that, since the threshold $Q_{thr}$ is chosen in such a way that the configurations disfavored by the restraining potential in eq. (24) have an exponentially low probability, using a larger value of $Q_{thr}$ would not change the estimate of the average value of any observable, except for exponentially small corrections, much smaller than the statistical error.
More details on how the parameters of this approach are chosen and tuned to optimize the efficiency can be found in Ref. [27].

IV. NUMERICAL RESULTS
In this section we present a comparison of results obtained by using the standard HMC algorithm with those obtained by using metadynamics. In particular we focus on the autocorrelation time $\tau$ of several physical quantities, chiefly the autocorrelation time of the topological charge, $\tau_Q$, and its scaling properties in the continuum limit.
Since this work describes the first application of metadynamics to lattice field theory, in the next subsection we present several details about the numerical results, whereas in Sec. IV B we discuss the scaling properties of the efficiency as we proceed toward the continuum limit.

A. A comparison of standard HMC and metadynamics
We start by considering a metadynamics run performed at $\beta = 0.70$ and $L = 62$. For these values the standard HMC is still able to explore different topological sectors and achieve convergence. This allows checking whether the two approaches provide consistent results.
In Fig. 1 we show the metadynamics bias potential averaged over a progressively larger number of MC sweeps. As the simulation proceeds, the minima are iteratively filled, until all the five minima shown in the figure are explored and the bias potential starts growing evenly over the whole $Q$ range. At this point the free-energy estimator $V_G(q, t)$ can be considered converged. Since the instantaneous value of $V_G(q, t)$ is subject to fluctuations, in order to extract the free energy and its statistical error we average $V_G(q, t)$ after convergence, namely after 10k HMC sweeps for the example in the figure. In Fig. 2 we show the reconstructed average free energy of the topological charge, $F(Q_\lambda)$; its statistical uncertainty, shown as an (orange) band, is estimated by dividing the Monte Carlo history after equilibration into $n_p = 4$ intervals and computing the standard deviation of the $n_p$ measures.
In the bottom panel we compare the metadynamics result with −log(P (Q λ )) estimated in a standard HMC run. Remarkably, the two estimates are fully consistent within the small error bars, indicating that metadynamics allows computing reliably the probability distribution of the charge.
The most important effect of the metadynamics bias is to reduce the autocorrelation time of the observables by orders of magnitude. In Fig. 3 we show the (fictitious-) time autocorrelation function of the magnetization and of the topological charge for the standard HMC run and for the metadynamics run. For comparison, we have chosen for these figures a value of $\beta$ and of the lattice size $L$ equal to one of the runs presented in Ref. [32] (see also [1]). The improvement is significant: by fitting the exponential decrease of the autocorrelation functions we find $\tau_Q = 155000 \pm 10000$ in the HMC run, whereas the corresponding value for the metadynamics run is $\tau_Q = 5600 \pm 1000$. In both cases we find $\tau_m = 50 \pm 10$. A related quantity is the transition probability per unit time between two different topological sectors, $\nu$, defined as the number of jumps between two different values of $Q_g$ divided by the total number of sweeps. For the two runs corresponding to Fig. 4, we have $\nu = 4.98 \times 10^{-5}$ with the HMC and $\nu = 2.24 \times 10^{-3}$ with metadynamics.
One of the main results of this paper is the reconstructed free energy of Fig. 2, which we describe with the parametrization of eq. (26), where $A$ and $b$ are numerical constants which in general, at fixed $N$, depend on $L$ and $\beta$.
As an example, we fitted with this parametrization the effective potential reported in the top-left panel. In order to demonstrate that the approach presented in this work allows addressing efficiently the problem of topological trapping, we now discuss the scaling of the autocorrelation time as a function of the lattice spacing $a \sim \xi_G^{-1}$, namely of $\xi_G$, and of the physical volume $L/\xi_G$. In Fig. 5 and Tab. IV B we display the dependence of $\nu$ on $\xi_G$ with HMC and metadynamics at several values of $L/\xi_G$. As expected, because of entropy, in the standard HMC $\nu$ is an increasing function of the lattice size, and this corresponds to an increase in the dispersion of the topological charge, namely of $\langle Q^2 \rangle$. At larger values of $\beta$, as we proceed toward the continuum limit with the standard HMC, $\nu$ decreases exponentially as a function of the correlation length $\xi_G$ [2,3], so that for $\xi_G \gtrsim 9$ the changes of topological sector are so rare that only an upper limit for $\nu$ can be provided.
The fact that the system is locked in a given topological sector may severely bias the lattice predictions for several important physical quantities, in primis the topological susceptibility. The apparently small errors obtained with HMC for the topological susceptibility at large values of $\beta$, for example at $\beta = 0.80$ with $L = 80$ (see Tab. A), are illusory, because they refer to an average taken essentially at fixed topological charge, and the result is thus affected by a systematic error. This is dramatically clear when comparing the values of the renormalization-group-invariant product $\chi_t \xi_G^2$ in the last column of Tabs. A and B. $\beta = 0.8$ is the largest value at which we could make a comparison between standard HMC and metadynamics, since beyond that point it is impossible to obtain sensible results for the topological susceptibility using HMC. Indeed, with the standard HMC the number of sweeps necessary to get correct results increases exponentially with $\beta$, while with metadynamics it increases only linearly.
The behavior observed with the HMC is to be contrasted with the results obtained with metadynamics (corrected for the bias using eq. (25)): in this case $\nu$ is considerably flatter, and the simulation spans all the possible sectors of $Q$, allowing us to produce reliable results.
We first analyze the scaling of the performance with the volume. It is known that in HMC the autocorrelation time of the topological charge does not increase with the volume, as the increased tunneling rate (due to the larger entropy) compensates for the larger range of topological charges to explore. In metadynamics it is a priori not clear how the tunnelling mechanism is affected by the volume. We note empirically (see Fig. 5) that the tunnelling rate does not change with the volume, so one would expect the algorithm efficiency to degrade (mildly) with the volume. This is partially supported by the errors reported in the last two columns of Tab. B. Nonetheless we remark that, no matter which volume is chosen, we still expect metadynamics to scale better with the lattice spacing than HMC. This means that for any physical volume there is a critical lattice spacing below which using metadynamics is convenient.
The final issue that we have to address is the scaling of the error of the observables measured in the metadynamics simulations. As we proceed to the continuum limit, the weight of configurations with non-integer $Q_\lambda$ decreases, but the system keeps exploring all values $Q_\lambda \in [-Q_{thr}, +Q_{thr}]$ with equal probability once the bias potential has converged. The unbiased expectation value is recovered by the reweighting procedure of eq. (25). The impact of the reweighting can be estimated by modelling the scaling of the bias potential with the lattice spacing. Assuming that in the parametrization of eq. (26) the coefficient $b$, corresponding to the height of the barriers, grows approximately as $1/a$, one expects a mild increase of the error due to reweighting. By comparing Tabs. A and B, it is reassuring to observe that at small $\beta$ (where HMC simulations are reliable) the errors for $\xi_G$, $E$ and $\chi_m$ obtained with metadynamics are comparable with those obtained with HMC, and increase very slowly with $\beta$, not faster than with the standard HMC.
For the renormalization-group-invariant combination $\chi_t \xi_G^2$ we observe that with HMC the errors grow relatively less than with metadynamics, but the value drops dramatically at large $\beta$ due to the freezing of the topological charge. With metadynamics we find similar values across the whole range of explored lattice spacings, signaling that the algorithm samples the distribution of the topological charge correctly, as opposed to HMC.
In spite of the great effort undertaken in the past [2,3,9], even for the HMC algorithm it is not yet clear whether the topological modes are affected by exponential or power-like slowing down. In this first work we do not aim at determining accurately the scaling properties of our current implementation of metadynamics. We can however note that the observed growth of the error on the topological susceptibility as a function of the lattice spacing is compatible with a power-law divergence of the topological autocorrelation time, with an exponent that we estimate to be $2 \div 3$. This has to be compared with the pure HMC simulations, for which the exponent of the power-like ansatz has been consistently suggested to be close to 5. We consider this a substantial improvement, which already allows reliable results at $\beta$ much larger than with HMC. Based on previous experience [33], further improvement might be obtained by including a larger set of collective variables in the algorithm, possibly biasing other slow modes as well.

In the table we only give the values of $\nu$ that have been reliably determined, excluding those corresponding to $\beta$ so large that, in the case of HMC, only an upper bound can be given.

V. CONCLUSIONS
In this paper we have shown that the metadynamics approach [26,27] can be used to simulate $CP^{N-1}$, dramatically mitigating the slowing down observed in numerical simulations for quantities related to the topological charge. In particular we have studied the $N = 21$ case, showing that we are able to reconstruct the free energy of the topological charge, $F(Q)$, as a function of the coupling and of the volume.
The much reduced slowing down allows us to study a range of β much larger than that available with ordinary HMC. Further improvement might be obtained by biasing a larger set of collective variables.
It seems straightforward to extend the general method exposed in this paper to the case of QCD. It will be interesting to investigate whether metadynamics will also work in this case.
The possibility of measuring $F(Q)$ suggests that we can further improve the efficiency of the algorithm by using the information about its form obtained in a preliminary run, cf. eq. (26). This would allow simulating the system by using as importance sampling the knowledge of the free energy itself, added to the original action of the theory.

Appendix A

The smeared links are obtained recursively by exponentiating a combination of the neighboring links weighted by a parameter $\rho$, $\rho$ being a small real number. The recursive procedure is based on the 0-level stout-smeared links, which are nothing but the original (non-smeared) ones.
As an effect of the exponentiation, the level-1 link turns out to be the average of the original link and of the collection of all the infinite paths surrounding the two nearby $1 \times 1$ plaquettes.

Acknowledgments. All calculations were performed on the Ulysses HPC facility maintained by SISSA - ITCS (Information Technology and Computing Services). We acknowledge Antonio Lanza, Piero