Topological sampling through windings

We propose a modification of the Hybrid Monte Carlo (HMC) algorithm that overcomes the topological freezing of a two-dimensional $U(1)$ gauge theory with and without fermion content. This algorithm includes reversible jumps between topological sectors (winding steps) combined with standard HMC steps. The full algorithm is referred to as winding HMC (wHMC), and it shows an improved behaviour of the autocorrelation time towards the continuum limit. We find excellent agreement between the wHMC estimates of the plaquette and topological susceptibility and the analytical predictions in the $U(1)$ pure gauge theory, which are known even at finite $\beta$. We also study the expectation values in fixed topological sectors using both HMC and wHMC, with and without fermions. Even when topology is frozen in HMC, leading to significant deviations in topological as well as non-topological quantities, the two algorithms agree on the fixed-topology averages. Finally, we briefly compare the wHMC algorithm results to those obtained with master-field simulations of size $L\sim 8 \times 10^3$.


I. INTRODUCTION
Standard algorithms for lattice QCD are well known to suffer from topology freezing [1][2][3][4]: near the continuum limit, distinct topological sectors are poorly sampled due to the large energy barriers separating them, leading to autocorrelation times that grow exponentially as the continuum limit is approached at finite volume. This problem has received a lot of attention, and several algorithmic strategies have been proposed over the years [5][6][7][8][9][10], but there is no fully satisfactory solution.
In this paper, we study a modification of the Hybrid Monte Carlo (HMC) algorithm, named winding HMC (wHMC), that incorporates Metropolis-Hastings steps [11] with tailored reversible jumps between topological sectors. The idea is similar to an old proposal known as the instanton hit [12,13]. We test our algorithm in the $U(1)$ gauge theory in 2D with and without fermion content. This model has recently been used as a benchmark for machine-learned flow-based sampling algorithms [14,15], as well as for tensor network approaches [16,17] (see Ref. [18] for a review).
We first test the algorithm in a compact U (1) pure gauge theory, which suffers from topology freezing, but is solvable. Exact results on topological and nontopological observables exist in the literature for the lattice regularization [19][20][21], i.e., at finite lattice spacing. Therefore, we can accurately test the approach to the continuum limit of the topological susceptibility and the plaquette. We then include two degenerate flavours of Wilson fermions and study the pion mass dependence of the topological susceptibility. In both cases we compare the scaling of the autocorrelation time with that of the standard HMC.
It is a general belief that algorithms suffering from topology freezing nevertheless perform well for observables computed at fixed topology, failing only in the relative weights of the different sectors. We can test this hypothesis accurately in this model by comparing the plaquette (in the pure gauge theory) and the pion mass (in the fermionic case) at fixed topology with the exact results, or between the two algorithms, HMC and wHMC.
Finally, since topology freezing can be circumvented altogether by working with very large physical volumes and taking local averages [22][23][24], we also compare our topology-sampling algorithm with HMC on a very large lattice of size $V = 8192^2$.
We comment on the prospects to extend the wHMC algorithm to other gauge theories and higher dimensions in the outlook section.

II. ANALYTICAL RESULTS
The Schwinger model [25] is a $U(1)$ gauge theory with one or more massless fermions. It is a solvable quantum field theory that shares many properties with Yang-Mills theories in four dimensions [26]. In particular, Euclidean gauge configurations can be classified according to their topological charge, and there is a mass gap. The spectrum contains a free boson that can be interpreted as the singlet pseudoscalar meson, $\eta$, with mass
$$m_\eta^2 = \frac{N_f\, e^2}{\pi}\,,$$
where $N_f$ is the number of degenerate flavours.
Since this theory can be solved on the lattice and in a finite volume [19][20][21], it is a good starting test-bed for Monte Carlo (MC) algorithms.
The Wilson lattice formulation of the theory is
$$Z = \int \prod_l dU_l \; e^{-\beta \sum_p \left(1 - \mathrm{Re}\, U_p\right)}\,,$$
where $U_l$ and $U_p$ are the standard link and $1\times 1$ Wilson loop, respectively. We use periodic boundary conditions. Note that $\beta = 1/e^2$ is dimensionful, but all dimensionful quantities are assumed in lattice units in the following. Therefore, as we approach the continuum limit, $\beta \sim a^{-2}$.
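For concreteness, the gauge action is easy to evaluate from the link phases on a periodic lattice. The following NumPy sketch (array layout and names are our own illustration, not the authors' code) computes the plaquette phases and the Wilson action:

```python
import numpy as np

def plaquette_phases(theta):
    """Oriented sum of link phases around each 1x1 loop.

    theta has shape (2, L, L): theta[mu, x, y] is the phase of the U(1)
    link leaving site (x, y) in direction mu, with periodic boundaries.
    """
    return (theta[0] + np.roll(theta[1], -1, axis=0)      # theta_1(x + mu_0)
            - np.roll(theta[0], -1, axis=1) - theta[1])   # theta_0(x + mu_1)

def wilson_action(theta, beta):
    """S[U] = beta * sum_p (1 - Re U_p) for compact U(1) in 2D."""
    return beta * np.sum(1.0 - np.cos(plaquette_phases(theta)))
```

Gauge invariance gives a quick sanity check: shifting $\theta_\mu(x)$ by $\alpha(x) - \alpha(x+\hat\mu)$ for an arbitrary field $\alpha$ leaves the action unchanged.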
We will be considering the plaquette and the topological susceptibility,
$$P = \frac{1}{V}\sum_p \mathrm{Re}\,U_p\,, \qquad \chi_t = \frac{\langle Q^2\rangle}{V}\,,$$
where the lattice definition of the topological charge is the geometric one,
$$Q = \frac{1}{2\pi}\sum_p \mathrm{Im}\,\ln U_p\,, \qquad \mathrm{Im}\,\ln U_p \in (-\pi,\pi]\,.$$
The result for these quantities is known in terms of modified Bessel functions $I_n(\beta)$ for finite $\beta$ and $V$ [19][20][21]; for instance, the partition function and the plaquette read
$$Z = \sum_n I_n(\beta)^V\,, \qquad \langle P\rangle = \frac{\sum_n I_n'(\beta)\, I_n(\beta)^{V-1}}{\sum_n I_n(\beta)^V}\,,$$
where the sums in $n$ are over all integers. The infinite-volume limits are
$$\langle P\rangle = \frac{I_1(\beta)}{I_0(\beta)}\,, \qquad \chi_t = \frac{1}{4\pi^2}\,\frac{\int_{-\pi}^{\pi} d\phi\; \phi^2\, e^{\beta\cos\phi}}{\int_{-\pi}^{\pi} d\phi\; e^{\beta\cos\phi}}\,.$$
In the continuum limit, $\beta \to \infty$, we recover the well-known results
$$\langle P\rangle \to 1 - \frac{1}{2\beta}\,, \qquad \chi_t \to \frac{1}{4\pi^2\beta}\,.$$
The partition function in fixed topology can also be easily derived from the known partition function in the $\theta$ vacuum [19][20][21]. At sufficiently large volume the weights of the topological sectors become Gaussian,
$$\frac{Z_Q}{Z} \simeq \frac{e^{-Q^2/(2V\chi_t)}}{\sum_{Q'} e^{-Q'^2/(2V\chi_t)}}\,.$$
An interesting quantity is the average of the plaquette in fixed topology sectors, which shows a subtle dependence on $Q$ and is also known analytically at finite $\beta$. As we will see, this is a golden observable to test how algorithms perform in sampling sectors of fixed topology, given its high precision.
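The infinite-volume expressions above are straightforward to evaluate numerically; the sketch below (ours, not from the paper) uses midpoint-rule quadrature to avoid external dependencies:

```python
import numpy as np

def bessel_I(n, beta, N=100_000):
    """I_n(beta) = (1/pi) int_0^pi exp(beta cos t) cos(n t) dt, midpoint rule."""
    t = (np.arange(N) + 0.5) * np.pi / N
    return np.mean(np.exp(beta * np.cos(t)) * np.cos(n * t))

def plaquette_infinite_volume(beta):
    """<P> = I_1(beta) / I_0(beta)."""
    return bessel_I(1, beta) / bessel_I(0, beta)

def chi_t_infinite_volume(beta, N=100_000):
    """chi_t = <(phi / 2 pi)^2>, with phi distributed as exp(beta cos phi)."""
    phi = (np.arange(N) + 0.5) * 2.0 * np.pi / N - np.pi
    w = np.exp(beta * np.cos(phi))
    return np.sum((phi / (2.0 * np.pi)) ** 2 * w) / np.sum(w)
```

Both functions reduce to the continuum results $1 - 1/(2\beta)$ and $1/(4\pi^2\beta)$ as $\beta \to \infty$.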

B. $N_f$ Schwinger Model
In the theory with $N_f > 1$ fermions, the flavour symmetry group is $SU(N_f)_L \times SU(N_f)_R$. Even though spontaneous chiral symmetry breaking cannot occur in 2D and the condensate vanishes in the massless limit, the scaling of the condensate with the quark mass is nontrivial. The Gell-Mann-Oakes-Renner (GMOR) relation follows from the Ward identity (WI),
$$F_\pi^2 M_\pi^2 = 2 m \Sigma\,,$$
where $m$ is the quark mass and $\Sigma$ the condensate. The condensate is expected to scale with the quark mass [29] as
$$\Sigma \sim m^{(N_f-1)/(N_f+1)}\,,$$
and therefore the pion mass scales with the quark mass as
$$M_\pi \sim m^{N_f/(N_f+1)}\,,$$
i.e., $M_\pi \sim m^{2/3}$ for $N_f = 2$. The topological susceptibility vanishes in the limit of massless fermions. From the WI and the GMOR relation it follows that
$$\chi_t = \frac{m\,\Sigma}{N_f} = \frac{F_\pi^2 M_\pi^2}{2 N_f}\,,$$
and combining this with the Witten-Veneziano formula (and neglecting mass corrections to $F_\pi$), we expect
$$\chi_t^{-1} = 4\pi^2\beta + \frac{2 N_f}{F_\pi^2 M_\pi^2}\,,$$
which nicely interpolates between the pure gauge case, Eq. (3), for $M_\pi \to \infty$, and the flavoured result of Eq. (17), even though it is strictly derived close to the chiral limit.

III. WINDING HMC
Even in this simple model, standard MC algorithms such as the HMC algorithm fail to reproduce the continuum-limit expectations due to the bad sampling of topological sectors. In Fig. 1 we plot an HMC history of the topological charge $Q$, showing the well-known topology-freezing phenomenon. This results in the exponential growth of the autocorrelation time as $\beta \to \infty$, shown in Fig. 4 for $Q^2$.
The basic idea of our proposal is to combine HMC steps with a Metropolis-Hastings accept-reject step, where the trial configuration is obtained from the previous one by performing a winding. The winding transformation acts only on the link variables whose starting and ending points are both within a square region $S_w$ of size $L_w$, i.e., $x, x+\hat\mu \in S_w$:
$$U_\mu(x) \to U_\mu^\Omega(x) = \Omega(x)\, U_\mu(x)\, \Omega^{-1}(x+\hat\mu)\,.$$
All other links remain unchanged. The anatomy of the winding step is depicted in Fig. 2. We only have to define the gauge transformation $\Omega$. We set $\Omega = 1$ except at the boundary, where it is chosen to have a winding number. If $x_n$ are the points on the boundary of $S_w$, ordered from $n = 1, \ldots, 4L_w$, we pick
$$\Omega(x_n) = \exp\left(\pm \frac{2\pi i\, n}{4 L_w}\right)\,,$$
where the $+$ denotes a winding and the $-$ an antiwinding. The sign is chosen with 50% probability and is common to all $n$ points, ensuring that the transformation yields a change in the topological charge of $\Delta Q = \pm 1$ on smooth configurations. The invariance of the measure ensures that this transformation has a trivial Jacobian, $dU^\Omega = dU$. The transition probability for $U \to U'$ of this Metropolis-Hastings step is
$$T(U'|U) = q(U'|U)\, p_{\rm acc}(U'|U)\,, \qquad q(U'|U) = \frac{1}{2}\left[\delta\!\left(U' - U^{\Omega_+}\right) + \delta\!\left(U' - U^{\Omega_-}\right)\right]\,.$$
Since $q(U'|U) = q(U|U')$ due to the 50% probability of performing a winding or an antiwinding, $p_{\rm acc}$ is just the usual Metropolis [30] accept-reject probability,
$$p_{\rm acc}(U'|U) = \min\left(1, \frac{p(U')}{p(U)}\right)\,,$$
with $p(U) \propto e^{-S[U]}$ being the target probability distribution. In the pure gauge theory $S[U]$ is the plaquette action, whereas in the dynamical theory it includes the fermionic determinant. The latter is evaluated stochastically using one pseudofermion.
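To make the construction concrete, here is a minimal NumPy sketch of the winding transformation acting on link phases. The ordering of the boundary points, the array layout, and the geometric charge below are conventions we chose for illustration, not necessarily those of the authors' implementation:

```python
import numpy as np

def topological_charge(theta):
    """Geometric charge Q = (1/2pi) sum_p Im ln U_p, Im ln in (-pi, pi]."""
    theta_p = (theta[0] + np.roll(theta[1], -1, axis=0)
               - np.roll(theta[0], -1, axis=1) - theta[1])
    return np.sum(np.angle(np.exp(1j * theta_p))) / (2.0 * np.pi)

def winding(theta, corner, Lw, sign):
    """Apply Omega (winding number +-1 along the boundary of S_w) to the
    links whose endpoints both lie inside the Lw x Lw region S_w."""
    L = theta.shape[1]
    x0, y0 = corner  # region assumed not to wrap around the torus
    # boundary sites of S_w, ordered counterclockwise
    b = ([(x0 + i, y0) for i in range(Lw - 1)]
         + [(x0 + Lw - 1, y0 + i) for i in range(Lw - 1)]
         + [(x0 + Lw - 1 - i, y0 + Lw - 1) for i in range(Lw - 1)]
         + [(x0, y0 + Lw - 1 - i) for i in range(Lw - 1)])
    omega = np.zeros((L, L))
    for n, (x, y) in enumerate(b):
        omega[x, y] = sign * 2.0 * np.pi * n / len(b)  # winds once along b
    inside = np.zeros((L, L), dtype=bool)
    inside[x0:x0 + Lw, y0:y0 + Lw] = True
    new = theta.copy()
    for mu in (0, 1):
        both_in = inside & np.roll(inside, -1, axis=mu)
        new[mu] += np.where(both_in, omega - np.roll(omega, -1, axis=mu), 0.0)
    return new
```

Starting from a cold configuration, one winding changes $Q$ by one unit while modifying only links inside $S_w$; the accept-reject step then applies $p_{\rm acc} = \min(1, e^{-\Delta S})$ to the proposed configuration.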
It is easy to check that $p(U)$ is the equilibrium distribution of such a Markov chain; note that the proposal density is correctly normalized, $\int dU'\, q(U'|U) = 1$.
Substituting Eq. (21) into Eq. (24) yields the detailed-balance condition,
$$p(U)\, T(U'|U) = p(U')\, T(U|U')\,,$$
where the last step can easily be obtained by considering the two cases $p(U) < p(U^\Omega)$ and $p(U) > p(U^\Omega)$ for each $\Omega$. By itself this algorithm is obviously not ergodic, since only a predefined change is performed. Ergodicity should be ensured by combining one or several winding steps with a standard HMC step. We refer to this algorithm as wHMC.

A. Pure Gauge Case
We have carried out simulations with this algorithm for the pure gauge theory at various values of the lattice spacing, $\beta$, with fixed $V/\beta \sim 80$. Tab. I includes the parameters and results from these simulations for both HMC and wHMC. Two MC histories of the topological charge for HMC and wHMC are compared in Fig. 3, where the freezing of topology is absent for wHMC. This can be quantified more precisely by looking at the scaling of the autocorrelation times with $a^2 \sim \beta^{-1}$. As shown in Fig. 4, there is an enormous improvement with respect to standard HMC. The curves in Fig. 4 correspond to the two-parameter fits
$$\tau_{Q^2} = c\, e^{b\beta} \;\; \text{(HMC)}\,, \qquad \tau_{Q^2} = c\, \beta^{\,b} \;\; \text{(wHMC)}\,.$$
The best-fit parameters are $b = 1.47(14)$ for the exponential fit to HMC and $b = 0.565(53)$ for the power-law fit to wHMC. We therefore find an exponential scaling of the autocorrelation time with the lattice spacing for HMC, in agreement with previous findings [1][2][3][4], versus a scaling proportional to $\sim\sqrt{\beta}$ for wHMC. We have also studied the dependence on the size of the winding region, $L_w$. In Fig. 5 we show the autocorrelation times for $P$ and $Q$ and the acceptance rate of the winding step as a function of $L_w$: the acceptance of the winding grows with $L_w$, reaching 50% at the largest $L_w$ considered, and $\tau_Q$ improves in a correlated fashion, while $\tau_P$ is insensitive to $L_w$.
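The integrated autocorrelation times quoted here can be estimated with a simple windowed estimator. The sketch below (our own, with a fixed window; production analyses typically use automatic windowing, e.g. the Gamma method) illustrates the idea:

```python
import numpy as np

def tau_int(series, W):
    """tau_int = 1/2 + sum_{t=1}^{W} rho(t) for a Monte Carlo time series,
    with rho(t) the normalized autocorrelation function and a fixed
    summation window W."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    N = len(x)
    c0 = np.dot(x, x) / N  # variance estimate
    rho = [np.dot(x[:N - t], x[t:]) / ((N - t) * c0) for t in range(1, W + 1)]
    return 0.5 + float(np.sum(rho))
```

For an AR(1) process with coefficient $\varphi$ the exact value is $1/2 + \varphi/(1-\varphi)$, which provides a convenient cross-check of the estimator.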
One can understand the improvement of the acceptance rate of the winding step with $L_w$. The change in the action when a winding is performed is restricted to the plaquettes at the boundary of $S_w$, and is due to the change in the links at the boundary (see the violet region in Fig. 2). We refer to $\delta S_w$ as the outer boundary of the winding region. The change in the phase of each such plaquette, $\delta\theta_p$, due to the transformation $\Omega_\pm$ is $\mp \frac{\pi}{2L_w}$. For sufficiently large $L_w$, the change in the phase of the plaquette is small and we can approximate
$$\Delta S = \beta \sum_{p\,\in\,\delta S_w} \left[\cos\theta_p - \cos(\theta_p + \delta\theta_p)\right] \approx \beta \sum_{p\,\in\,\delta S_w} \left[\sin\theta_p\, \delta\theta_p + \cos\theta_p\, \frac{\delta\theta_p^2}{2}\right]\,.$$
The average of the first term of $\Delta S$ vanishes, while the last term averages to
$$\langle \Delta S \rangle \approx \frac{\pi^2 \beta}{2 L_w}\, \langle \cos\theta_p \rangle\,.$$
The acceptance increases as $L_w \to \infty$ at fixed $\beta$, since the change in the action averages to zero. We therefore conclude that the most efficient approach is to set the winding size to the largest possible value in this case. The results for the average plaquette and the topological susceptibility, normalized to the analytical results of Eqs. (7), are shown in Fig. 6. The agreement with the exact results for both observables is very good for the wHMC algorithm, while for HMC both observables differ significantly from the theoretical expectation close to the continuum limit. Although the deviation from the analytical result is more significant for the topological susceptibility, the plaquette also differs at the level of several standard deviations.
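The estimate $\langle \Delta S \rangle \approx \pi^2\beta\langle\cos\theta_p\rangle/(2L_w)$ can be verified with a small Monte Carlo experiment, treating the $4L_w$ boundary plaquettes as independent with the single-plaquette distribution $p(\phi) \propto e^{\beta\cos\phi}$ (exact in infinite volume). This is our own numerical check, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_plaquette_angles(beta, n):
    """Draw n angles from p(phi) ~ exp(beta cos phi) by rejection sampling."""
    out = np.empty(0)
    while out.size < n:
        phi = rng.uniform(-np.pi, np.pi, size=4 * n)
        keep = rng.random(4 * n) < np.exp(beta * (np.cos(phi) - 1.0))
        out = np.concatenate([out, phi[keep]])
    return out[:n]

def mean_winding_dS(beta, Lw, n_wind=20_000):
    """Average Delta S of a winding: each of the 4*Lw boundary plaquette
    angles is shifted by delta = -pi / (2 Lw)."""
    delta = -np.pi / (2.0 * Lw)
    phi = sample_plaquette_angles(beta, n_wind * 4 * Lw).reshape(n_wind, -1)
    dS = beta * (np.cos(phi) - np.cos(phi + delta)).sum(axis=1)
    return dS.mean()
```

The measured average should match the prediction, with $\langle\cos\theta_p\rangle = I_1(\beta)/I_0(\beta)$, up to small $O(\delta\theta_p^4)$ and statistical corrections.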
The inclusion of the fermion determinant is challenging for the wHMC algorithm, because the acceptance becomes very small. The reason is that the change in the action induced by the winding is no longer restricted to the boundary plaquettes, since the determinant is non-local. In Fig. 7 we show the acceptance of a winding Metropolis-Hastings step as a function of $L_w$. The acceptance is seen to be below 1%, and the highest acceptance is no longer reached at large $L_w$: the optimal $L_w$ is roughly 2-3, with a mild dependence on the quark mass. The value of the optimal acceptance is, however, very sensitive to $\beta$ and to the quark mass, decreasing as the chiral limit is approached. This is shown in Fig. 8, where we plot the acceptance as a function of the pion mass at fixed $L_w = 3$ for various $\beta$. There is thus room for optimizing $L_w$ as a function of $\beta$ and the quark mass.
On the other hand, one winding accept-reject step involves a single inversion of the Dirac operator, while an HMC step involves as many inversions as there are steps in the integrator, $n_{\rm HMC}$, which is of order 10-100. Therefore, instead of one winding, we can perform O(100) winding accept-reject steps between HMC steps at a similar cost. A step of wHMC is then defined as one HMC step plus $n_W$ winding accept-reject steps. This increases the computational cost of each wHMC step compared to an HMC one by a factor
$$r_c = 1 + \frac{n_W}{n_{\rm HMC}}\,,$$
while significantly improving the scaling with $\beta$ of the autocorrelation time of the topological charge. We have performed a series of simulations at several values of $\beta$, computing the pion mass and the topological susceptibility for various values of the bare quark mass, $m_0$. The summary of our results is in Tab. II.
In Fig. 9 we show the topological susceptibility as a function of the pion mass, together with a fit to the continuum expectation, Eq. (18), plus generic cutoff effects, which in the theory with unimproved Wilson fermions are expected to scale as $O(a) \sim \beta^{-1/2}$. We fit the results at the various $\beta$ and quark masses to the ansatz of Eq. (18) supplemented by such a cutoff term, with $c$ and $d$ as fitting parameters. The agreement of wHMC with the expectation is good, even at values of $\beta$ where the topology in HMC is completely frozen and does not allow a measurement of the topological susceptibility. Cutoff effects are significant and larger than in the pure gauge theory, as expected from the presence of Wilson fermions. Even though the autocorrelation is larger than in the pure gauge theory, we still see a major improvement in the scaling towards the continuum limit, as shown in Fig. 10. Note that the autocorrelation time is multiplied by the factor $r_c$ in Eq. (30), which accounts for the increase in computational cost. In our simulations, this factor goes from $r_c \approx 2.53$ to $r_c = 6$ in the range $\beta \in [5, 9]$.

IV. RESULTS AT FIXED TOPOLOGICAL SECTOR
A way to overcome the topology-freezing problem consists in extracting the physical quantities of interest from simulations at fixed topology and correcting for the finite-size dependence [31,32] (see also [33] for applications in the context of finite-size scaling). A key assumption in this approach is that algorithms that suffer from topology freezing can nevertheless sample sectors of fixed topology correctly; only the relative weights of the different topological sectors are difficult to compute for such an algorithm. This hypothesis can be studied very accurately in the context of our simple 2D model: on the one hand we can compare with the analytical results in the pure gauge case, and on the other hand we can compare the results with our wHMC algorithm for the $N_f = 2$ case.

A. Pure Gauge
Let us start with the pure gauge model. In Fig. 11 we show the weights of the different topological sectors obtained with the two algorithms at $\beta = 11.25$, compared to the expectations in Eq. (11). Clearly HMC fails to evaluate these weights, while wHMC succeeds.
In the pure gauge model, the plaquette at fixed topology has a small but measurable $Q$ dependence (see Eq. (13) and Fig. 12). We can therefore test whether the algorithm samples properly within each topological sector and reproduces the correct $Q$ dependence. We consider the observable $O$ projected to the topological sector $n$,
$$\langle O \rangle_n = \frac{\langle O\, \delta_n(Q) \rangle}{\langle \delta_n(Q) \rangle}\,, \qquad \delta_n(Q) = \begin{cases} 1 & |Q| = n\,, \\ 0 & \text{otherwise}\,. \end{cases}$$
Fig. 13 shows the difference between the measured plaquette and the analytical expectation, $(\Delta P)_{\rm th} = \langle P \rangle - \langle P \rangle_{\rm th}$, in units of the error of the measured plaquette. We see that HMC fails to reproduce the correct expectation value of the plaquette (label "All Q") by 5 standard deviations. This is expected, since HMC is completely frozen and only $Q = 0$ is present in the Monte Carlo history, while the plaquette shows a small (but noticeable) dependence on $Q$. On the other hand, the expectation value of the plaquette projected to the $Q = 0$ sector is perfectly reproduced by HMC. We also see that wHMC, which is able to sample all topological sectors, correctly reproduces the value of the plaquette projected to all values of the charge from 0 to 6, as well as the full expectation value of the plaquette.
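The projection is straightforward to implement given the Monte Carlo histories of $O$ and $Q$; a sketch with our own naming:

```python
import numpy as np

def project_to_sector(obs, Q, n):
    """<O>_n = <O delta_n(Q)> / <delta_n(Q)>: average of obs over the
    configurations of the chain with |Q| = n."""
    obs = np.asarray(obs, dtype=float)
    mask = np.rint(np.abs(np.asarray(Q))).astype(int) == n
    if not mask.any():
        raise ValueError(f"sector |Q| = {n} never visited")
    return obs[mask].mean()
```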

B. N f = 2 results
We now turn to the model with dynamical fermions. Simulations at fixed topology have been performed for this model in previous works using the HMC algorithm [34][35][36]. Again, Fig. 14 shows that HMC is not able to sample the different topological sectors correctly at $\beta = 9.0$. Focusing on the pion mass as the observable of interest, we see in Fig. 15 that it shows a dependence on the topological sector, explaining why HMC fails to reproduce its value correctly in Fig. 16 (label "All Q") by more than 8 standard deviations. Nevertheless, the values of $M_\pi$ projected to the topological sectors with $|Q| = 0, 1$ are correctly reproduced (labels $|Q| = 0, 1$).

V. MASTER-FIELD SIMULATIONS
We now turn to the computation of physical observables by means of simulations on large lattices, the so-called master fields [22]. In this approach, observables are estimated from translation averages over a single, very large gauge configuration, which requires large volumes for reasonable error estimates (see appendix A). At these large values of the volume we expect the effects of the global topology to be suppressed. Master-field simulations therefore bypass the effects of topology freezing as long as fixed topological sectors are sampled correctly. In section IV we have argued that HMC, even when suffering severely from topology freezing, can determine observables on sectors of fixed topology correctly. Therefore we expect master-field simulations to produce correct numbers, even if they are performed in a region of parameter space where topology is frozen. In this section we confirm this expectation.
We have performed simulations on lattice volumes of $8192 \times 8192$ using the standard HMC algorithm at the same values of $\beta$ as in the wHMC case (see Section III A). For each case we have generated a single configuration by a process of thermalization (using 2000 trajectories of length 0.5), followed by an unfolding in the two periodic directions: we start with a small $16 \times 16$ lattice and, after unfolding 9 times, reach our target size of $8192 \times 8192$. On this single configuration we measure the plaquette and the susceptibility. Given the value of the $1\times 1$ Wilson loop $U_p(x)$ at a point $x$, we use its real part and argument to estimate the plaquette and the topological charge density, respectively,
$$P(x) = \mathrm{Re}\, U_p(x)\,, \qquad q(x) = \frac{1}{2\pi}\, \mathrm{Im}\,\ln U_p(x)\,.$$
The susceptibility can be determined from $q(x)$ using the local observable
$$\chi_R(x) = \sum_{|y| \le R} q(x)\, q(x+y)\,.$$
If the value of $R$ is taken larger than the correlation length of the system, the expectation value of $\chi_R(x)$ will coincide with the topological susceptibility.
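A sketch of these local estimators (same phase-based layout as before; the square window used for the sum over $|y| \le R$ is our own convention for illustration):

```python
import numpy as np

def local_estimators(theta):
    """P(x) = Re U_p(x) and q(x) = arg(U_p(x)) / 2pi from link phases."""
    theta_p = (theta[0] + np.roll(theta[1], -1, axis=0)
               - np.roll(theta[0], -1, axis=1) - theta[1])
    return np.cos(theta_p), np.angle(np.exp(1j * theta_p)) / (2.0 * np.pi)

def chi_R_estimate(q, R):
    """Volume average of chi_R(x) = sum_{|y| <= R} q(x) q(x + y),
    summing over the (2R+1)^2 square of displacements around x."""
    acc = np.zeros_like(q)
    for dx in range(-R, R + 1):
        for dy in range(-R, R + 1):
            acc += q * np.roll(np.roll(q, dx, axis=0), dy, axis=1)
    return acc.mean()
```

For $R = 0$ the estimator reduces to the volume average of $q(x)^2$.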
In the infinite-volume limit the partition function, Eq. (11), factorizes. This implies that the values of $q(x)$ are uncorrelated among different $x$, and therefore $\chi_t = \langle \chi_R(x) \rangle$ for any value of $R$. Moreover, the variables $\chi_R(x)$ are also uncorrelated. It is easy to check that the variance of the observable $\chi_R(x)$ grows as $(2R+1)^2$, which implies that the best estimate of the topological susceptibility is obtained by using $R = 0$. Incidentally, this also suggests that in theories with a non-zero correlation length $R$ has to be taken as small as possible. Fig. 17 shows that the values of the plaquette and the susceptibility agree perfectly with the theoretical expectations. Further details on the evaluation of the error in master-field simulations can be found in appendix A.
Finally let us comment on the cost comparison. The key element for master field simulations is the cost of thermalization. For our case (due to the small numerical cost of our simulations) this thermalization has been performed by brute force. Whether a thermalization process performed with more care would result in a cost comparable to the one of wHMC or HMC is beyond the scope of this work. Any conclusion in this regard would be anyway difficult to extrapolate to other gauge theories in more dimensions, since this particular model shows no spatial correlations among observables.

VI. OUTLOOK
We have presented a new algorithm based on Metropolis-Hastings steps that are tailored to induce jumps in the topological charge. This algorithm satisfies detailed balance, and ergodicity is ensured when it is alternated with standard HMC steps. As we have shown, it successfully alleviates the problem of topology freezing and exponentially growing autocorrelation times in the 2D model considered, both with and without fermion content. The integrated autocorrelation time of wHMC in the pure gauge case is very similar to the one obtained with machine-learned flow-based sampling algorithms [14,15], but without the additional training cost.
In spite of the shortcomings of algorithms with topology freezing, we have been able to confirm that averages in fixed topology sectors are not affected, and agree in wHMC and HMC. This is seen both in the pure gauge theory, where the analytical results are known at finite β, as well as in the theory with fermions.
Finally, we have compared the wHMC algorithm with results obtained from local averages on very large lattices of size up to $L \sim 8000$. Our results indicate that master-field simulations are satisfactory in the controlled setup of this 2D model, since the analytical results are reproduced with very high accuracy.
The interesting question is whether wHMC can be equally successful in other gauge theories and in higher dimensions. The winding step is trivial to extend to, for instance, an SU(2) theory in 4D. We have indeed carried out the naive implementation of wHMC in that context and found very poor acceptances: the "curse" of dimensionality. We hope that less trivial implementations in 4D could resolve this matter; we are currently exploring modifications of the algorithm that incorporate the idea of normalizing flows [14].
APPENDIX A: ERROR ESTIMATION IN MASTER-FIELD SIMULATIONS
In a master-field simulation, primary observables $a^\alpha_i$ are estimated by volume averages $\bar a^\alpha_i$ of local fields. It is also convenient to define the fluctuations over the mean,
$$\delta a^\alpha_i(x) = a^\alpha_i(x) - \bar a^\alpha_i\,.$$
In general we are interested in computing the uncertainty on derived observables, $F(a_1, \ldots, a_n)$, which are functions of the primary observables.
Note that, in general, derived observables depend on measurements performed in various master-field simulations, labelled by $\alpha$, possibly with different physical parameters (lattice spacing, quark masses, volume, etc.). The observable $F$ and its error are estimated by Taylor expanding around the mean values $\bar a^\alpha_i$,
$$F \approx F(\bar a) + \sum_{\alpha, i} \partial_i F\, \delta a^\alpha_i(x)\,.$$
This last equation suggests to use as estimate for the observable
$$\bar F = F(\bar a)\,.$$
In order to compute its error, we use the autocorrelation functions $\Gamma^\alpha_F(x)$ of the projected fluctuations $\delta^\alpha_F(x) = \sum_i \partial_i F\, \delta a^\alpha_i(x)$, which can be estimated from the data using
$$\Gamma^\alpha_F(x) = \frac{1}{V_\alpha} \sum_y \delta^\alpha_F(y)\, \delta^\alpha_F(y + x)\,.$$
At distances large compared with the largest correlation length in the system, $\xi_\alpha$, they decay exponentially,
$$\Gamma^\alpha_F(x) \sim e^{-|x|/\xi_\alpha}\,. \tag{A8}$$
Only if $L_\alpha \gg \xi_\alpha$ is it possible to give a reasonable estimate of the uncertainty. In these cases we use
$$\sigma_F^2 = \sum_\alpha \frac{1}{V_\alpha} \sum_{|x| < R} \Gamma^\alpha_F(x)\,. \tag{A9}$$
In practice the summation in Eq. (A9) has to be restricted to $|x| < R$. As in the case of error estimation of Monte Carlo data, the optimal value of $R$ has to be chosen as a balance between a small value, which will underestimate the true error in Eq. (A9), and a large value, which will only add statistical noise to the error estimate. Recipes similar to the ones used in standard Monte Carlo simulations (see [39]) can be used to estimate appropriate values of $R$. Note, however, that in contrast with the case of Monte Carlo simulations, the exponential asymptotic decay of the autocorrelation function in Eq. (A8) can be estimated from the physical parameters of the simulation. This opens the door to more accurate error estimates along the lines of [4].
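A compact sketch of the resulting error estimate for the volume average of a single local field on one master field (our notation; a square summation window stands in for $|x| < R$):

```python
import numpy as np

def master_field_error(field, R):
    """Error of the volume average of a 2D local field:
    sigma^2 = (1/V) * sum over displacements |x| <= R of Gamma(x),
    with Gamma estimated from the spatial fluctuations of the field."""
    f = field - field.mean()
    V = f.size
    sigma2 = 0.0
    for dx in range(-R, R + 1):
        for dy in range(-R, R + 1):
            sigma2 += (f * np.roll(np.roll(f, dx, axis=0), dy, axis=1)).mean() / V
    return np.sqrt(sigma2)
```

For a field without spatial correlations, $R = 0$ suffices and the error reduces to the familiar $\sigma/\sqrt{V}$; larger $R$ only adds statistical noise, in line with the discussion of $\chi_R$ above.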
Finally, let us comment on two more points. First, if more than one configuration is produced in a master-field simulation, they can be used to reduce the uncertainty in the determination of the correlation function $\Gamma_F(x)$, along the lines of the analysis of different replicas [39]. Second, the analysis of derived observables that depend both on master-field simulations and on ordinary Monte Carlo ensembles can be performed along the lines suggested in [40,41].