Mitigating topological freezing using out-of-equilibrium simulations

Motivated by the recently established connection between Jarzynski's equality and the theoretical framework of Stochastic Normalizing Flows, we investigate a protocol relying on out-of-equilibrium lattice Monte Carlo simulations to mitigate the infamous computational problem of topological freezing. We test our proposal on 2d CP^{N−1} models and compare our results with those obtained adopting the Parallel Tempering on Boundary Conditions proposed by M. Hasenbusch, obtaining comparable performance. Our work thus sets the stage for future applications combining our Monte Carlo setup with machine learning techniques.


Introduction
It is well known that Markov Chain Monte Carlo simulations of lattice gauge theories based on local updating algorithms are affected by critical slowing down. The integrated autocorrelation time τ associated with a given lattice observable O diverges as the continuum limit is approached, in a way that is naturally described in terms of the correlation length ξ of the system.
Topological critical slowing down can be understood as follows. Sufficiently close to the continuum limit, a proper definition of topological charge is recovered for lattice gauge field configurations. This definition can be used to partition configuration space into sectors, separated by free energy barriers. The height of these barriers diverges with ξ and, as the continuum limit is approached, the disconnected topological sectors of the continuum theory emerge. The growth of the energy barriers also suppresses the tunneling rate of a Markov Chain generated by a local updating algorithm, which thus remains trapped in a fixed topological sector. This effect, known as topological freezing, leads to a loss of ergodicity and introduces large systematic effects, especially in the calculation of topological observables such as the topological susceptibility.
In this study, we propose a novel algorithm aimed at mitigating topological freezing. As our strategy builds on features of simulations with OBCs as well as on the PTBC algorithm proposed by M. Hasenbusch in Ref. [13], we briefly summarize their main properties below.
In theories defined on a continuum space-time, the topological charge is integer valued as a consequence of Periodic Boundary Conditions (PBCs) in time. Abandoning PBCs in favour of OBCs corresponds to eliminating the barriers between topological sectors, which are no longer disconnected. Performing Monte Carlo simulations with OBCs thus allows the Markov Chain to switch topological sector more easily, resulting in a dramatic reduction of the topological auto-correlation time. In particular, it was shown in Refs. [9, 19] that the divergence of τ is reduced to a polynomial one, τ ∼ ξ². Yet, this approach suffers from some drawbacks. As a matter of fact, OBCs introduce nonphysical effects, which have to be avoided by only computing correlation functions in the bulk of the lattice. Hence, larger volumes are required in order to keep finite size effects under control while still avoiding unwanted systematic errors. In this respect, the PTBC algorithm, proposed for 2d CP^{N−1} models in Ref. [13] and recently implemented also for 4d SU(N) gauge theories in Ref. [16], aims at having the best of both worlds. By combining PBC and OBC simulations in the framework of the parallel tempering idea, it exploits the improved scaling of the auto-correlation time of the topological charge in systems with OBCs while, at the same time, avoiding the drawbacks mentioned above, as physical quantities are computed with PBCs. The PTBC algorithm has been recently employed in a variety of cases, demonstrating a dramatic reduction of the auto-correlation time of the topological charge, and improving state-of-the-art results for several topological and non-topological quantities [16, 32-37].
The strategy proposed in this study shares its roots with the PTBC algorithm, as it still combines the use of Open and Periodic boundary conditions. However, it makes use of out-of-equilibrium evolutions based on the well-known Jarzynski's equality, see Ref. [38]. A fundamental result of non-equilibrium statistical mechanics, Jarzynski's equality has been extensively used in recent years in several contexts in lattice field theory, ranging from the computation of interface free energies, see Ref. [39], of the QCD equation of state, see Ref. [40], and of the renormalized coupling of SU(N) gauge theories, see Ref. [41], to the study of the entanglement entropy on the lattice, see Ref. [42].
The underlying idea is to sample the configuration space of the system with PBCs using a previous sampling of the configuration space of the system with OBCs, the latter being used as a starting point for non-equilibrium evolutions. Using Jarzynski's equality, the expectation values of the desired observables can then be obtained for the system with PBCs through a reweighting procedure. In this approach, the decorrelation of the topological charge still benefits from the presence of OBCs, while avoiding their pitfalls, with a cost overhead that will be quantified by the length of the out-of-equilibrium trajectory.
The main motivation behind the present study is a recent development in the field of machine learning based on the combination of Jarzynski's equality with Normalizing Flows (NFs). In this new framework, known as Stochastic Normalizing Flows (SNFs), see Refs. [43, 44], out-of-equilibrium Monte Carlo evolutions are combined with discrete coupling layers (the same building blocks composing NFs) to achieve a substantial improvement in sampling efficiency with respect to a purely stochastic approach. As our strategy is rooted in Jarzynski's equality too, SNFs would be a natural future direction for the application of our proposal.
The aim of the present study is thus to probe the use of out-of-equilibrium methods in the computation of topological observables, in order to set the stage for this future development. Its technical feasibility will be explored, and its performance compared with that of the PTBC algorithm. We will focus on the 2d CP^{N−1} models, see Refs. [45-48], which are a very popular test bed for new numerical approaches, see Refs. [6, 13, 15, 32, 49-55]. They are generally simpler to study on the lattice compared to QCD, while still exhibiting non-trivial topological features, which they share with 4d SU(N) gauge theories. A preliminary version of the results discussed in this manuscript was presented at the 2023 Lattice Conference, and can be found in Ref. [56].
This paper is organized as follows. In Section 2 we introduce the numerical setup used in this study, with particular focus on the features of out-of-equilibrium evolutions. The numerical results obtained with this method are presented and discussed in Section 3. Finally, in Section 4 we draw our conclusions and point to the future directions of this investigation.

Numerical setup
In this section, the numerical setup used in this study is introduced. The 2d CP^{N−1} models are defined, along with the local updating algorithm used to generate their configurations. Out-of-equilibrium methods are also introduced, together with a detailed explanation of how they are used to carry out the program sketched above.

Lattice discretization of 2d CP^{N−1} models
Consider the following continuum Euclidean action of 2d CP^{N−1} models [45, 46]:

  S[z, A] = (N/g) ∫ d²x (D_μ z)† (D_μ z),   (2.1)

where g is the 't Hooft coupling, z = (z_1, . . ., z_N) is an N-component complex scalar satisfying z̄z = 1, and D_μ ≡ ∂_μ + iA_μ is the U(1) covariant derivative, A_μ(x) being a non-propagating U(1) gauge field. An integer-valued topological charge can be conveniently expressed in terms of the gauge field according to Refs. [45, 46] as

  Q = (1/2π) ∫ d²x ε_{μν} ∂_μ A_ν,   (2.2)

while the topological susceptibility, which is our main observable of interest, is defined as usual as (here V is the space-time volume)

  χ = ⟨Q²⟩ / V.   (2.3)

We discretize the action, Eq. (2.1), on a square lattice of size L, using the O(a) tree-level Symanzik-improved lattice action defined in Ref. [50]. A crucial ingredient in this study is the choice of boundary conditions. They are imposed as periodic in every direction and for every point of the boundary of the lattice, except for a segment of length L_d along the spatial boundary. In the following, this segment will be referred to as the defect and will be denoted by D. Along the defect we impose OBCs. Our purpose is to gradually evolve these to PBCs using the out-of-equilibrium evolutions described in Sec. 2.2. This choice of boundary conditions can be encoded directly into the dynamics by considering the following family of lattice actions, each labeled by an integer n:

  S_L^{(n)} = −2Nβ Σ_{x,μ} { c_1 k_μ^{(n)}(x) Re[ Ū_μ(x) z̄(x+μ̂)·z(x) ] + c_2 k_μ^{(n)}(x) k_μ^{(n)}(x+μ̂) Re[ Ū_μ(x+μ̂) Ū_μ(x) z̄(x+2μ̂)·z(x) ] },   (2.4)

where β is the inverse bare 't Hooft coupling, c_1 = 4/3 and c_2 = −1/12 are Symanzik-improvement coefficients, and U_μ(x) = exp{iA_μ(x)} are the U(1) gauge link variables. The factors k_μ^{(n)}(x) appearing in Eq. (2.4) are used to implement different boundary conditions for the link variables crossing the defect D. They are defined as

  k_μ^{(n)}(x) = c(n) if the link U_μ(x) crosses D, and k_μ^{(n)}(x) = 1 otherwise,   (2.5)

with 0 ≤ c(n) ≤ 1 a function that interpolates between c = 0 and c = 1, corresponding, respectively, to OBCs and PBCs. The setup described above is sketched in Fig. 1.
The behaviour of the model defined above can be clarified as follows. When c = 1, the factors k_μ^{(n)}(x) are everywhere equal to 1. Then, the action in Eq. (2.4) reduces to the standard Symanzik-improved action of 2d CP^{N−1} models with PBCs. When 0 ≤ c < 1, the coupling β on each link crossing D gets suppressed by a factor of c. This corresponds to a suppression of the force entering the updating procedure for those specific links and for the corresponding site variables, and to its vanishing when c = 0. This condition realizes OBCs, as the vanishing of the force implies no update of the involved variable.

Figure 1 (see Ref. [32]): the dashed line represents the defect D, while the solid arrows represent links or products of links crossing the defect orthogonally, and thus getting suppressed by factor(s) of c(n) according to the definition of k_μ^{(n)}(x) in Eq. (2.5). In this case the defect is placed on the space boundary (i.e., along the μ = 1 direction), so that only temporal links U_0 will cross it.
For the updating procedure, we relied on the standard 4:1 combination of Over-Relaxation (OR) and over-Heat-Bath (HB) local update algorithms, see Ref. [50] for more details. The full lattice update sweeps were supplemented with hierarchical updates of subregions of the lattice centered on the defect. These make it possible to update more often the links and sites close to D, where the creation/annihilation of new topological excitations is most likely to occur, thereby improving the evolution of the Markov Chain between topological sectors. These hierarchical updates were designed along the same lines as in Ref. [13], where more details may be found. Hierarchical updates were alternated with translations of the position of the defect by one lattice spacing in randomly chosen directions on the periodic replica (which is translation-invariant), so that topological excitations are created/annihilated in different places around the lattice. Such translations are effectively achieved by simply translating the site/link variables of the periodic field configurations.
As discussed in the Introduction, in our out-of-equilibrium setup measurements are performed at the end of the lattice evolutions, on systems with PBCs. In the case of PBCs, no unphysical effects coming from the fixed boundaries are present, and we can safely rely on several different discretizations of the global topological charge defined in Eq. (2.2) for the 2d CP^{N−1} models on a torus. In this study, we will adopt the so-called geometric definition as in Ref. [50], which is known to be integer valued for every lattice configuration:

  Q = (1/2π) Σ_x Im log Π_{01}(x),   (2.6)

where Π_{μν}(x) = U_μ(x) U_ν(x+μ̂) Ū_μ(x+ν̂) Ū_ν(x) is the elementary plaquette.
The geometric lattice topological charge in Eq. (2.6) is computed on configurations that have undergone a small amount of smoothing of the gauge and matter fields. This is done in order to remove the effects of ultraviolet (UV) fluctuations at the scale of the lattice spacing. Several different smoothing methods have been proposed in the literature, such as cooling, see Refs. [57-63], stout smearing, see Refs. [64, 65], or the gradient flow, see Refs. [66, 67]. From the numerical standpoint, all these methods have been shown to give consistent results when properly matched to one another, see Refs. [63, 68, 69]. In this study, we adopt cooling for its simplicity and for its limited numerical cost. A single cooling step consists of aligning, site by site and link by link, each variable z(x) and U_μ(x) to its corresponding local force. As shown in Ref. [69], the specific action from which the local force is computed does not need to be the one used for the Monte Carlo evolution. Thus, we rely on the non-improved action, i.e., the action in Eq. (2.4) with c_1 = 1 and c_2 = 0. The topological charge was evaluated using the geometric definition after a fixed number of 20 cooling steps, as its value was systematically observed to stabilize after 10 steps. In the end, the topological susceptibility will then be defined on a finite lattice as

  a²χ = ⟨Q²⟩ / L²,   (2.7)

where Q ∈ ℤ is the integer-valued lattice geometric topological charge computed after cooling, and where the meaning of the mean over the ensemble in our out-of-equilibrium setup will be clarified in the next section.
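As an illustration, a cooling step of the kind described above can be sketched as follows. This is a minimal sketch, not the code used in this work: it assumes a small periodic L × L lattice and the non-improved nearest-neighbour action, and all names (lattice size, number of field components) are illustrative. Each z(x) and U_μ(x) is aligned in turn to its local force.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 8, 3  # illustrative lattice size and number of z components

# Random unit-norm z fields and U(1) link phases (a "hot" start)
z = rng.normal(size=(L, L, N)) + 1j * rng.normal(size=(L, L, N))
z /= np.linalg.norm(z, axis=-1, keepdims=True)
U = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(2, L, L)))

def action(z, U):
    # Non-improved nearest-neighbour action, up to the overall -2*N*beta factor
    s = 0.0
    for mu, ax in ((0, 0), (1, 1)):
        zp = np.roll(z, -1, axis=ax)  # z(x + mu_hat)
        s -= np.sum((np.conj(zp) * z).sum(axis=-1) * U[mu]).real
    return s

def cooling_step(z, U):
    """One cooling sweep: sequentially align each z(x) and U_mu(x)
    to its local force computed from the non-improved action."""
    z, U = z.copy(), U.copy()
    steps = ((0, (1, 0)), (1, (0, 1)))
    for x0 in range(L):
        for x1 in range(L):
            # Local force on z(x): gauge-transported nearest neighbours
            F = np.zeros(N, dtype=complex)
            for mu, (d0, d1) in steps:
                xp = ((x0 + d0) % L, (x1 + d1) % L)
                xm = ((x0 - d0) % L, (x1 - d1) % L)
                F += np.conj(U[mu, x0, x1]) * z[xp]
                F += U[mu, xm[0], xm[1]] * z[xm]
            z[x0, x1] = F / np.linalg.norm(F)
    for mu, (d0, d1) in steps:
        for x0 in range(L):
            for x1 in range(L):
                xp = ((x0 + d0) % L, (x1 + d1) % L)
                # Local force on the link phase: overlap of neighbouring z's
                f = np.vdot(z[x0, x1], z[xp])
                U[mu, x0, x1] = f / abs(f)
    return z, U

s_before = action(z, U)
z_c, U_c = cooling_step(z, U)
s_after = action(z_c, U_c)
```

Since each alignment maximizes the local contribution to the action at fixed neighbours, a sweep can never increase the action, which is the defining property of cooling.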

Out-of-equilibrium evolutions and Jarzynski's equality
In this subsection, we explain in detail how Jarzynski's equality enables us to compute vacuum expectation values in the system with PBCs (the target distribution), starting from a sampling of the system with OBCs (the prior distribution).
Consider the following family of partition functions,

  Z_c = ∫ [dz̄ dz dU] exp{−S_L^{(n)}},   (2.8)

each corresponding to a system described by the action in Eq. (2.4) with boundary conditions specified by the parameter c. The prior system with OBCs then corresponds to c(n = 0) = 0, with partition function Z_0, while the target system with PBCs corresponds to c(n = n_step) = 1, with partition function Z_1 ≡ Z.
A sampling of the target distribution is obtained from a sampling of the prior distribution through out-of-equilibrium evolutions. Along the evolution, the boundary condition parameter is gradually changed from c = 0 to c = 1. The change is effected in n_step steps. At each step n, the change from c(n) to c(n+1) is followed by a number of configuration updates. These combined operations allow us to define a transition probability distribution P_n(φ_{n−1} → φ_n), where φ_{n−1} and φ_n collectively denote the elementary degrees of freedom z and U at steps n − 1 and n, respectively. As just a few updates are performed after every change in c, the system with elementary degrees of freedom φ_n can be considered to be out of equilibrium.
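The structure of one out-of-equilibrium trajectory (advance the protocol parameter, accumulate the work increment at fixed configuration, then perform a few updates at the new parameter value) can be illustrated on a toy model. The sketch below, with purely illustrative names, replaces the lattice action by a one-dimensional Gaussian whose centre is shifted by c·μ, so that Z_c does not depend on c and Jarzynski's equality ⟨exp(−W)⟩_f = Z_1/Z_0 = 1 can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(1)

def S(x, c, mu=2.0):
    """Toy action interpolating between prior (c=0) and target (c=1):
    a unit Gaussian whose centre is shifted by c*mu."""
    return 0.5 * (x - c * mu) ** 2

def evolve(x0, n_step, n_relax=2, delta=1.0):
    """One out-of-equilibrium trajectory: at each step the protocol
    parameter c is advanced, the work increment S(x, c_new) - S(x, c_old)
    is accumulated at fixed configuration, and a few Metropolis updates
    are then performed at the new value of c."""
    x, W = x0, 0.0
    for n in range(n_step):
        c_old, c_new = n / n_step, (n + 1) / n_step
        W += S(x, c_new) - S(x, c_old)   # generalized-work increment
        for _ in range(n_relax):         # out-of-equilibrium updates
            xp = x + delta * rng.normal()
            if rng.random() < np.exp(S(x, c_new) - S(xp, c_new)):
                x = xp
    return x, W

# Prior ensemble: exact samples from the c = 0 distribution
n_ev = 2000
W = np.array([evolve(rng.normal(), n_step=50)[1] for _ in range(n_ev)])

# Jarzynski's equality: <exp(-W)>_f = Z_1/Z_0, which equals 1 here
# since the toy partition function does not depend on c
jarzynski = np.exp(-W).mean()
```

The same skeleton applies to the lattice case, with the Metropolis update replaced by the OR/HB sweeps and c entering the action through the factors k_μ^{(n)}(x).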
Jarzynski's equality allows one to compute the ratio between the partition functions of the target and prior distributions,

  Z / Z_0 = ⟨exp(−W)⟩_f,   (2.9)

where

  W[φ_0, . . ., φ] ≡ Σ_{n=0}^{n_step−1} { S_{c(n+1)}[φ_n] − S_{c(n)}[φ_n] }   (2.10)

is known as the generalized work. The out-of-equilibrium average ⟨A⟩_f is defined as follows,

  ⟨A⟩_f ≡ Σ_{φ_0, . . ., φ} q_0[φ_0] P_f[φ_0, . . ., φ] A[φ_0, . . ., φ],   (2.11)

where φ ≡ φ_{n_step} denotes the configuration reached at the end of the out-of-equilibrium evolution. In the above,

  P_f[φ_0, . . ., φ] ≡ Π_{n=1}^{n_step} P_n(φ_{n−1} → φ_n)   (2.13)

is the total transition probability of the out-of-equilibrium evolution, and q_0[φ_0] is the probability of drawing the field configuration φ_0 from the prior distribution.
Using Eq. (2.9) and Eq. (2.11), one can show that the expectation value of an observable O with respect to the target distribution can be expressed as

  ⟨O⟩ = ⟨O exp(−W)⟩_f / ⟨exp(−W)⟩_f ≡ ⟨O⟩_NE,   (2.12)

where the subscript NE denotes the fact that a Non-Equilibrium method has been used for the computation.
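In practice, Eq. (2.12) amounts to a weighted average over trajectories with weights exp(−W). A minimal sketch is given below (illustrative names, not the production code); it is checked in the trivial single-step limit, where the generalized work reduces to the action difference evaluated on prior samples and the formula becomes standard reweighting between two Gaussians.

```python
import numpy as np

rng = np.random.default_rng(2)

def reweighted_mean(O, W):
    """Estimate <O> in the target ensemble from out-of-equilibrium
    samples: <O exp(-W)>_f / <exp(-W)>_f, cf. Eq. (2.12)."""
    w = np.exp(-(W - W.min()))  # subtracting min(W) leaves the ratio unchanged
    return np.sum(O * w) / np.sum(w)

# Single-step (plain reweighting) check: prior N(0,1), target N(mu,1);
# here W is just the action difference evaluated on the prior samples.
mu = 0.5
x = rng.normal(size=200_000)
W = 0.5 * (x - mu) ** 2 - 0.5 * x ** 2
est = reweighted_mean(x, W)  # should approach mu
```

Shifting W by a constant before exponentiating changes numerator and denominator by the same factor, so the estimate is unaffected while overflow is avoided.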
In the following, our strategy will thus be to start from an ensemble of configurations {φ_0}, sampled from the prior distribution q_0, i.e., using the action of Eq. (2.4) with c(0) = 0, and to perform, for each configuration, the out-of-equilibrium evolution defined above. Of course, there is no unique way to interpolate from c = 0 to c = 1. Hence, we supplement the above procedure with a protocol, that is, with a choice for the value of c(n) at each step. While it is necessary to perform a measurement of the action at each step in order to obtain the generalized work from Eq. (2.10), the measurement of the observable of interest O is only performed at the end of the out-of-equilibrium evolution. In other words, the observable is only measured when the system has reached PBCs, thus avoiding the unphysical effects introduced by OBCs. A sketch of our overall algorithmic procedure is displayed in Fig. 2.
As is also evident from Eq. (2.12), the technique laid out above has conceptual similarities with common reweighting techniques. It is natural to expect the quality of the sampling of the target distribution to depend on the overlap between successive distributions in the family Z_c, which is expected to be smaller for evolutions that are farther from equilibrium. Since evolutions too far from equilibrium could lead to a small signal-to-noise ratio and/or to a possibly large bias in the computation of ⟨O⟩_NE through Eq. (2.12), it is important to quantify the distance of the chosen evolution from equilibrium.
A figure of merit designed to fit that purpose is the reverse Kullback-Leibler divergence D_KL, which is a measure of the similarity between two probability distributions and is by definition always larger than or equal to zero. In general, the simplest choice would be to compute the overlap between the target distribution and the one generated at the end of an out-of-equilibrium evolution. The corresponding Kullback-Leibler divergence is

  D_KL(q∥p) = Σ_φ q[φ] log( q[φ] / p[φ] ),   (2.14)

where p ≡ Z^{−1} exp{−S} is the probability of drawing a configuration from the target distribution, while q is a (generally intractable) distribution of the form

  q[φ] = Σ_{φ_0, . . ., φ_{n_step−1}} q_0[φ_0] P_f[φ_0, . . ., φ].   (2.15)

For out-of-equilibrium evolutions we have no direct access to q, and this prevents us from computing the quantity in Eq. (2.14). However, another Kullback-Leibler divergence can be defined by comparing the forward and reverse transition probabilities between φ_0 and φ. Labeling the former by f, see Eq. (2.13), and the latter by r, we have

  D_KL(q_0 P_f ∥ p P_r) = Σ_{φ_0, . . ., φ} q_0[φ_0] P_f[φ_0, . . ., φ] log( q_0[φ_0] P_f[φ_0, . . ., φ] / ( p[φ] P_r[φ, . . ., φ_0] ) ).   (2.16)

This quantity is directly related to thermodynamic quantities: if ΔF ≡ −log(Z/Z_0) is the variation of the free energy along the out-of-equilibrium evolution and W the corresponding generalized work from Eq. (2.10), then it is easy to derive (using detailed balance) that

  D_KL(q_0 P_f ∥ p P_r) = ⟨W⟩_f − ΔF ≥ 0.   (2.17)

For an equilibrium process, for which D_KL = 0, we have that ⟨W⟩_f = ΔF, i.e., all of the work spent in the out-of-equilibrium evolution is transferred into the free energy difference between the prior and the target distributions; this is exactly the expectation for a reversible evolution through equilibrium states, for which there is no difference between forward and reverse. For a generic out-of-equilibrium process we have instead D_KL > 0, corresponding to evolutions for which the probabilities of the forward and reverse processes differ. From a thermodynamic perspective this implies ⟨W⟩_f > ΔF, i.e., part of the work is dissipated. It is then clear that Eq. (2.17) is a restatement of the Second Principle of Thermodynamics. Finally, it is easy to prove that

  D_KL(q∥p) ≤ D_KL(q_0 P_f ∥ p P_r).   (2.18)

Thus, the value of the divergence of Eq. (2.16) also puts a constraint on how far the actually generated distribution q is from the target distribution p.
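The relation ⟨W⟩_f − ΔF of Eq. (2.17) makes this divergence measurable from the sampled work values alone, with ΔF estimated from the same data via Jarzynski's equality. The following sketch (an illustrative implementation, not the code of this work) checks the estimator on a synthetic Gaussian work distribution of standard deviation σ, for which the divergence equals σ²/2.

```python
import numpy as np

def kl_from_work(W):
    """Estimate the divergence of Eq. (2.17), <W>_f - DeltaF, with
    DeltaF = -log <exp(-W)>_f computed from the same work samples
    via Jarzynski's equality."""
    W = np.asarray(W, dtype=float)
    shift = W.min()  # rescale before exponentiating, for stability
    dF = shift - np.log(np.mean(np.exp(-(W - shift))))
    return W.mean() - dF

# Synthetic check: for Gaussian work W ~ N(m, s^2) one has
# <exp(-W)> = exp(-m + s^2/2), hence D_KL = m - (m - s^2/2) = s^2/2
rng = np.random.default_rng(3)
W = rng.normal(2.0, 0.5, size=500_000)
dkl = kl_from_work(W)  # expected near 0.5**2 / 2 = 0.125
```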
Another figure of merit that can be used to quantify the distance from equilibrium is the so-called Effective Sample Size (ESS). This quantity is customarily employed in the context of (Stochastic) Normalizing Flows, as it encapsulates the relationship between the variance of an observable sampled directly from the target distribution p and the variance of the same observable obtained using Eq. (2.12). Ignoring auto-correlations, it is easy to show that

  Var(⟨O⟩_NE) ≃ Var_p(O) / (ESS × n),   (2.19)

which also motivates the name Effective Sample Size: the variance of an observable O, obtained through Eq. (2.12) from a sample of size n, is equal to the variance obtained from the target distribution p with a smaller sample of size n_eff = ESS × n ≤ n.
In this study, we employ the following estimator:

  ÊSS ≡ ⟨exp(−W)⟩_f² / ⟨exp(−2W)⟩_f,   (2.20)

and we refer to Ref. [70] for a discussion of how ÊSS is related to the true Effective Sample Size of Eq. (2.19). The estimator ÊSS can be easily related to the variance of the weights e^{−W} appearing in Eq. (2.12). Indeed, since Var(e^{−W}) = ⟨exp(−2W)⟩_f − ⟨exp(−W)⟩_f², one obtains

  ÊSS = 1 / ( 1 + Var(e^{−W}) / ⟨exp(−W)⟩_f² ).   (2.21)

From the above we see that ÊSS = 1 implies Var(e^{−W}) = 0, which is only possible if the weights e^{−W} are all equal. In this case, it is apparent from Eq. (2.12) that no reweighting is being done at all, and we are at equilibrium. On the other hand, a value of ÊSS approaching zero signals sizeable fluctuations in the weights e^{−W}, corresponding to a noisy reweighting and to evolutions far from equilibrium.
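An estimator of this kind is straightforward to evaluate from the work samples. The sketch below (illustrative names) verifies its two limiting behaviours: equal weights give a value of 1, while fluctuating work drives it below 1 (for Gaussian work of unit variance the large-sample value is e^{−1} ≈ 0.37).

```python
import numpy as np

def ess_hat(W):
    """Normalized effective-sample-size estimator built from the
    weights e^{-W}: <e^{-W}>^2 / <e^{-2W}>, equal to 1 iff all
    weights coincide (equilibrium), cf. Eq. (2.20)."""
    W = np.asarray(W, dtype=float)
    w = np.exp(-(W - W.min()))  # constant shifts of W cancel in the ratio
    return w.mean() ** 2 / (w ** 2).mean()

# Equal weights -> estimator equals 1 (no reweighting at all)
ess_eq = ess_hat(np.full(100, 3.0))

# Fluctuating work -> estimator drops below 1
rng = np.random.default_rng(4)
ess_neq = ess_hat(rng.normal(0.0, 1.0, size=10_000))
```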

Numerical results
This section is devoted to the discussion of the numerical results concerning the efficiency of out-of-equilibrium evolutions, which will be measured in terms of the auto-correlation time of χ. In Ref. [13], the PTBC algorithm was shown to outperform standard local algorithms both in the presence of PBCs and of OBCs. Moreover, it was shown to enjoy smaller auto-correlation times with respect to other setups, such as metadynamics simulations, see Ref. [10]. Hence, in the following, a direct comparison between the performance of the out-of-equilibrium protocol and that of parallel tempering will be performed.
Since our goal is to test the robustness of the method, we have chosen to probe a wide array of different combinations of L_d, n_step and n_relax in independent simulations, rather than obtaining larger samples for a smaller number of setups.
Thus, simulations for two different values of the parameter N specifying the model were performed, and we focused, for each of these values, on a single value of β and L, see Tab. 1. These simulation setups were chosen to enable a direct comparison with previous results from Refs. [32, 33]. However, for a few choices of L_d and n_step we will also present results for the auto-correlation time obtained with our out-of-equilibrium setup varying the lattice spacing and/or the volume.

Effective Sample Size and Kullback-Leibler divergence
As a first step in the investigation of the performance of our new proposal, we study how both the Kullback-Leibler divergence D_KL and the Effective Sample Size ÊSS depend on n_step and on the defect length L_d. The aim of this section is, in particular, to quantify the distance of the Jarzynski evolutions from equilibrium as the parameters n_step and L_d are varied. This is a crucial step in order to assess the reliability of the exponential Jarzynski reweighting.
On general grounds, we expect the Kullback-Leibler divergence to approach zero, its expected value at equilibrium, when either n_step is increased at fixed L_d or L_d is decreased at fixed n_step. This expectation is clearly confirmed in Fig. 3, where D_KL is displayed as a function of 1/L_d and n_step for fixed n_relax.
The behaviour of ÊSS reflects the same picture. The evolution towards the target distribution is expected to approach equilibrium as n_step is increased, and to recede from it as L_d is increased at fixed n_step. Accordingly, ÊSS is expected to approach 1 in the former case, and to recede from it in the latter. This is indeed what can be observed in Fig. 4, where ÊSS is displayed as a function of 1/L_d and of n_step, in the left and right panels, respectively, for fixed n_relax. For sufficiently small defects, L_d < 20, the value of ÊSS is seen to become greater than 0.5 already for n_step ≃ 500. For larger defects, L_d > 20, a value of n_step of order 1000-2000 is required for ÊSS to be greater than 0.5. While ÊSS = 0.5 can be considered as a lower safety threshold, we will see below that the Monte Carlo ensembles that were employed were large enough to allow an acceptable computation of the topological susceptibility even with smaller values.
A careful inspection of the available data suggests that, in fact, both ÊSS and D_KL are, to a good approximation, functions of the ratio n_step/L_d only. That this is a sensible idea can be immediately appreciated from Fig. 5, where, at two different values of N, ÊSS and D_KL are shown to collapse onto two single curves depending only on n_step/L_d. A semi-quantitative justification can instead be obtained from the definition of work in Eq. (2.10). Indeed, the calculation of W involves the sum of the variations of the action induced by the change c(n) → c(n+1). Focusing for simplicity on the nearest-neighbour interaction term only, we can write

  ΔS^{(n)} = −2Nβ c_1 [c(n+1) − c(n)] Σ_{x∈D} Re[ Ū_0(x) z̄(x+0̂)·z(x) ],   (3.1)

where we have assumed that the defect lies on the x_0 = L − 1 boundary, from x_1 = 0 to x_1 = L_d − 1, so that the sum runs over the temporal links crossing it. Since a difference of actions is an extensive quantity and is, in the case at hand, localized on the defect, the average of the quantity in Eq. (3.1) is of order L_d. Then, as a consequence of our choice of interpolating function, c(n) = n/n_step, it is clear that Eq. (3.1) is, on average, only a function of L_d × Δc(n) = L_d/n_step. The same considerations apply to the next-to-nearest-neighbour interaction term. Hence, the work W will on average only depend on n_step/L_d, and it is natural to expect the same to hold true for ÊSS and D_KL as well.
Further insight on ÊSS and D_KL can be obtained from Fig. 6, where the frequency histograms of the out-of-equilibrium evolutions are displayed as functions of W and of the normalized weight w = exp{−W + ΔF} that appears in Eq. (2.12).
Three different cases are represented, corresponding to evolutions that are far from (n_step = 50), moderately far from (n_step = 500), and close to (n_step = 2000) equilibrium. The left-hand panel of the figure shows how the peak in the distribution of the evolutions in W approaches the value of ΔF, represented by the vertical black line, as n_step grows, i.e., as the evolution approaches equilibrium. This corresponds, in the right-hand panel, to the distribution of evolutions in w becoming progressively more peaked around 1, where W = ΔF, with the variance in w approaching zero.
The right-hand panel also enables us to better understand the reliability of the statistical reweighting procedure in Eq. (2.12). When the protocol is relatively far from equilibrium (n_step = 50), the distribution in W has a larger support and, correspondingly, the support of the distribution in w extends exponentially towards smaller values. In this case, large statistics would be necessary to provide an unbiased sampling of w. On the contrary, a more peaked distribution in W is obtained as n_step is increased, corresponding to a smaller support of the distribution in w, and to an easier sampling. This is completely analogous to how the reliability of the reweighting technique in classical statistical mechanics depends on the overlap between the supports of the source and target distributions.
In conclusion, both ÊSS and D_KL behave as expected on theoretical grounds and according to the general discussion in Sec. 2.2. This means, in practice, that we are able to control the magnitude of the systematic effects originating from the reweighting step in Eq. (2.12), and we can fully trust the expectation values obtained for the target distribution. In the next section, we will present a practical application of the strategy laid out above. Further details on the mutual relation between ÊSS and D_KL can be found in Appendix B.

Extracting the topological susceptibility
A necessary condition for the viability of non-equilibrium methods is of course the agreement of their results with those obtained with equilibrium methods. In Fig. 7, the values of the topological susceptibility χ obtained using Eq. (2.12) for several combinations of L_d, n_step and N are displayed, along with the values obtained in Ref. [32] using the PTBC algorithm.
The non-equilibrium results are found to be in excellent agreement with those from the PTBC algorithm over the whole range of values of ÊSS. Notably, no bias seems to arise even when ÊSS ∼ 0.1 or smaller, while for nearly vanishing ÊSS an increase in the magnitude of the error is observed for L_d = 114 at N = 21. This remarkable fact can be understood in terms of the computational effort spent to obtain each estimate of a²χ. The computational effort is of course proportional to n_stat = n_ev × (n_step + n_relax), and this is roughly the same for each point of the plot in Fig. 7. Hence, the data on the lower range of ÊSS, for which n_step/L_d is small, are characterized by a correspondingly larger factor n_ev. This is in agreement with the previous observation that a comparatively larger statistics is needed to avoid bias in expectation values computed from Eq. (2.12) when using evolutions that are farther from equilibrium.

Integrated auto-correlation time of χ
A quantitative study of the non-equilibrium method crucially revolves around the computation of the integrated auto-correlation time of the topological susceptibility. The auto-correlation function of χ, from which the integrated auto-correlation time will be calculated, has been computed as

  Γ(t) = (1/(n_ev − t)) Σ_{i=1}^{n_ev−t} ( Q²_i − ⟨Q²⟩_NE ) ( Q²_{i+t} − ⟨Q²⟩_NE ),

where Q²_i denotes the cooled squared geometric lattice topological charge at the end of the i-th out-of-equilibrium evolution, i.e., when the system is subject to PBCs, and a²χ_NE = ⟨Q²⟩_NE / L² is the topological susceptibility computed from out-of-equilibrium evolutions.
For convenience, it is useful to define the normalized auto-correlation function ρ(t) ≡ Γ(t)/Γ(0), which is displayed in Fig. 8 for various combinations of n_step and n_relax at fixed L_d. The integrated auto-correlation time is defined in terms of ρ(t) as

  τ_int = 1/2 + Σ_{t=1}^{W} ρ(t),

where W is the summation window computed using the Γ-method, see Refs. [71, 72].
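An estimator of τ_int along these lines can be sketched as follows. This is a simplified stand-in for the Γ-method of Refs. [71, 72]: the automatic windowing is replaced by a fixed window, and all names are illustrative. It is checked on a synthetic AR(1) chain, for which ρ(t) = a^t and hence τ_int = 1/2 + a/(1 − a) exactly.

```python
import numpy as np

def gamma(x, t):
    """Auto-correlation function Gamma(t) of the series x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    return np.dot(x[: n - t], x[t:]) / (n - t)

def tau_int(x, w_max):
    """Integrated auto-correlation time 1/2 + sum_t rho(t), truncated
    at a fixed window w_max (a simplified stand-in for the automatic
    window of the Gamma-method)."""
    g0 = gamma(x, 0)
    return 0.5 + sum(gamma(x, t) / g0 for t in range(1, w_max + 1))

# Synthetic check: AR(1) chain x_{i+1} = a x_i + noise
rng = np.random.default_rng(5)
a, n = 0.8, 400_000
noise = rng.normal(size=n)
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = a * x[i] + noise[i]

tau = tau_int(x, w_max=60)  # exact value: 0.5 + 0.8/0.2 = 4.5
```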
The values of τ_int computed for several combinations of n_relax and L_d are reported in Tab. 2, and displayed as a function of 1/L_d, for n_step = 1000, in Fig. 9. As one could expect, smaller values of τ_int are generally observed for larger values of n_relax, L_d and n_step, the latter corresponding to evolutions closer to equilibrium.
In the following, we will focus on the data obtained for combinations of the parameters L_d, n_relax and n_step for which τ_int is of order 1, as such small values of τ_int afford a reliable estimate of the error on a²χ. No investigation of the full dependence of τ_int on n_step will be attempted, as Tab. 2 shows that its value does not seem to change appreciably even when n_step is doubled. Instead, we will focus on the regime n_relax ≥ 50 and n_step ≥ 500, with L_d ≥ 6 for the N = 21 ensemble and L_d ≥ 10 for the N = 41 ensemble.
The values of τ_int are provided in Tab. 3 for N = 21 at different values of the coupling and of the lattice volume. Remarkably, no significant variation is observed in either τ_int or ÊSS as the coupling or the lattice volume is varied at fixed (L_d, n_step, n_relax). This is a further confirmation of the previous observation that ÊSS seems to depend only on L_d and n_step, and provides evidence of the robustness of the non-equilibrium method.

Efficiency of the method
In this section, we assess the efficiency of the non-equilibrium method, using as a figure of merit the product of the variance of χ and the cost of each evolution, Var(χ)_NE × (n_step + n_relax). Note that this is the quantity to minimize in order to reach the maximum efficiency in evaluating χ, and it is the quantity on which the comparison with the PTBC approach will be based.
The quantity Var(χ)_NE is computed directly from the sample of values of χ obtained after reweighting according to Eq. (2.12). Some insight into this figure of merit can be gained using Eq. (2.19) and taking into account the effects of auto-correlation. According to the results of Ref. [15], where the joint dependence of the auto-correlation time of χ on β and N was studied using the same lattice volumes, lattice discretization and over-relaxation/heat-bath updating algorithms employed in the present investigation, for N = 21 and β = 0.7 we expect τ_int^{(PBCs)} ∼ 10⁴-10⁵ standard updating steps using PBCs, while for N = 41 and β = 0.65 an even larger auto-correlation time is expected. For a fair comparison, the relevant quantity is

  Var(χ)_NE × (n_step + n_relax) ≃ [ 2 τ_int (n_step + n_relax) / (ÊSS × n_ev) ] × Var(χ)_p,

where Var(χ)_p would be the variance of the topological susceptibility when sampled directly from the target distribution p (i.e., with PBCs). Note that, in principle, this is a quantity which depends only on the latter distribution, and is independent of the parameters of the non-equilibrium algorithm. Indeed, the effects of the auto-correlations and of using Eq. (2.12) to estimate observables are accounted for, respectively, by the factors τ_int and ÊSS. This figure of merit is displayed in Fig. 10 as a function of n_step, for several combinations of L_d and n_relax. The values obtained for the non-equilibrium estimations do not seem to display any definite trend as n_step is increased; rather, they cluster around a roughly constant value.

Conclusions
In this work, we have presented a first exploration of a non-equilibrium Monte Carlo setup designed to mitigate the problem known as topological freezing. We have carried out our analysis using the 2d CP N −1 model as a test bed, owing to its combination of numerical simplicity and physical non-triviality. The new approach outlined in this manuscript shares some features with the PTBC algorithm originally proposed by M. Hasenbusch. It consists in starting from a thermalized ensemble generated with OBCs, and gradually switching to PBCs along a non-equilibrium Monte Carlo evolution. At the end of the latter, expectation values with PBCs can be computed through a reweighting-like formula which is tightly related to Jarzynski's equality.
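The structure of the algorithm can be made concrete on a toy model. The sketch below replaces the CP N −1 model with a free scalar chain (all names and parameter values are illustrative) but follows the same protocol: n relax equilibrium sweeps with open boundaries, a non-equilibrium evolution of n step steps that gradually switches on the link closing the chain while accumulating the work W, and finally the Jarzynski average exp(−∆F) = ⟨exp(−W)⟩:

```python
import numpy as np

rng = np.random.default_rng(1)

def action(phi, c):
    """Free scalar chain; c interpolates between open (c = 0) and
    periodic (c = 1) boundary conditions via the closing link."""
    return 0.5 * np.sum(np.diff(phi) ** 2) + 0.5 * c * (phi[0] - phi[-1]) ** 2

def sweep(phi, c, step=0.5):
    """One Metropolis sweep of the chain at fixed boundary parameter c."""
    for i in range(len(phi)):
        trial = phi.copy()
        trial[i] += step * rng.normal()
        dS = action(trial, c) - action(phi, c)
        if dS <= 0 or rng.random() < np.exp(-dS):
            phi = trial
    return phi

def evolution(phi, n_step, n_relax):
    """n_relax equilibrium sweeps with OBCs, then a non-equilibrium
    evolution of n_step steps switching on PBCs; returns the final
    configuration and the accumulated work W."""
    for _ in range(n_relax):
        phi = sweep(phi, 0.0)
    W = 0.0
    for n in range(1, n_step + 1):
        c_old, c_new = (n - 1) / n_step, n / n_step
        W += action(phi, c_new) - action(phi, c_old)  # work of the c-update
        phi = sweep(phi, c_new)
    return phi, W

phi, works = np.zeros(4), []
for k in range(220):
    phi, W = evolution(phi, n_step=60, n_relax=40)
    if k >= 20:                        # discard initial thermalization
        works.append(W)
# Jarzynski's equality: exp(-DeltaF) = <exp(-W)> over the prior ensemble
dF = -np.log(np.mean(np.exp(-np.asarray(works))))
```

For this Gaussian toy the free-energy difference between periodic and open boundaries is known exactly, ∆F = ½ log L for a chain of L sites (here L = 4, so ∆F = log 2), which the estimate above should reproduce within statistical errors.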
This method reproduces the expected value of the topological susceptibility of the CP N −1 models. Moreover, the reliability of the reweighting-like formula can be gauged using two different figures of merit, the Effective Sample Size and the Kullback-Leibler divergence, which provide consistent results. The efficiency of the method, quantified in terms of the product of the variance of the final result and the computational effort of one measurement, is also shown to be comparable to that of the PTBC approach.
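Both figures of merit can be extracted directly from the sampled work values: Jarzynski's equality gives ∆F = − log⟨exp(−W)⟩, the (reverse) Kullback-Leibler divergence between the sampled and target distributions reduces to ⟨W⟩ − ∆F, and the Effective Sample Size can be estimated as Ê SS = ⟨exp(−W)⟩ 2 /⟨exp(−2W)⟩. A minimal sketch (the function name is ours):

```python
import numpy as np

def quality_metrics(work):
    """Two figures of merit from the work distribution:
    ESS  = <exp(-W)>^2 / <exp(-2W)>          (normalized to [0, 1]),
    D_KL = <W> - DeltaF,  DeltaF = -log<exp(-W)>  (Jarzynski)."""
    w = np.asarray(work, float)
    shift = w.min()
    ew = np.exp(-(w - shift))            # shifted for numerical safety
    ess = ew.mean() ** 2 / (ew ** 2).mean()
    dF = shift - np.log(ew.mean())       # shift restores -log<exp(-W)>
    dkl = w.mean() - dF
    return ess, dkl
```

A perfectly reversible evolution (constant W) gives Ê SS = 1 and D KL = 0, while a broader work distribution lowers the Effective Sample Size and increases the dissipation ⟨W⟩ − ∆F.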
Put in perspective, the above study provides a broad framework for the application of non-equilibrium methods to address the issue of critical slowing down in lattice simulations. The same setup can be thought of as a different kind of Monte Carlo simulation, and could be implemented, for instance, by varying the value of β rather than a parameter controlling the type of boundary conditions. In principle, one could start from an equilibrium sampling of the configuration space of the system at values of the inverse coupling characterized by small auto-correlations. The coupling could then be gradually increased through non-equilibrium evolutions, up to values at which the system would be strongly auto-correlated at equilibrium. The meaning of n step and n relax would remain the same as above, while L d would be replaced by a new parameter ∆β describing the non-equilibrium changes in β. We highlight that this approach has already been implemented in the 4d SU(3) pure-gauge theory in Ref. [40], although not with the aim of mitigating the effects of critical slowing down.
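As an illustration of this β-annealing variant, the following sketch uses a single Gaussian degree of freedom with S(x) = βx 2 /2, for which ∆F = ½ log(β f /β i ) is known exactly; all parameter values are illustrative. At each change of the coupling, the work picks up (β n+1 − β n )E, with E the β-independent part of the action:

```python
import numpy as np

rng = np.random.default_rng(2)

def anneal(beta_schedule, n_relax=10):
    """One non-equilibrium evolution in beta for S(x) = beta * x^2 / 2:
    the work accumulates (beta_next - beta) * E at each coupling update,
    followed by n_relax Metropolis steps at the new coupling."""
    x = rng.normal(scale=beta_schedule[0] ** -0.5)  # equilibrium draw at beta_0
    W = 0.0
    for b0, b1 in zip(beta_schedule[:-1], beta_schedule[1:]):
        W += (b1 - b0) * 0.5 * x ** 2               # work of the beta-update
        for _ in range(n_relax):                    # relaxation at b1
            xp = x + rng.normal()
            if rng.random() < np.exp(-0.5 * b1 * (xp ** 2 - x ** 2)):
                x = xp
    return W

betas = np.linspace(0.5, 2.0, 21)                   # Delta-beta = 0.075 per step
works = np.array([anneal(betas) for _ in range(500)])
dF = -np.log(np.mean(np.exp(-works)))               # should approach 0.5*log(4)
```

Here ∆β plays exactly the role described above, and the Jarzynski average recovers ∆F = ½ log(2.0/0.5) = log 2.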
Another relevant future application of the present out-of-equilibrium method is represented by systems that include fermionic degrees of freedom. This does not pose any additional theoretical difficulty, nor does it introduce, in principle, any computational overhead. Indeed, the calculation of the action required at each change of the parameter controlling the boundary conditions for the gauge fields entering the fermion determinant is already performed at the beginning and at the end of each Hybrid Monte Carlo trajectory. Alternatively, a prior distribution with periodic boundary conditions for the fermion determinant can be used, making the fermionic contribution to the work exactly zero.
Finally, given the connection recently established in Ref. [44] between Jarzynski's equality and the theoretical framework of Stochastic Normalizing Flows (SNFs), our work sets the stage for an application of our proposal to SNFs, in which a stochastic part (given by the out-of-equilibrium evolutions discussed in this study) is combined with the discrete layers (parametrized by neural networks) that compose Normalizing Flows. The training of such layers requires a possibly lengthy procedure, which is however performed only once. This approach has the potential to greatly improve the efficiency of the non-equilibrium evolutions since, in principle, a considerably smaller number of Monte Carlo steps would be needed to achieve the same sampling efficiency for the target distribution. Another natural outlook of the present investigation is to implement the non-equilibrium setup in a more physical and realistic model, such as the 4d SU(3) pure-gauge theory. We plan to investigate both ideas in the near future.

Figure 1 :
Figure 1: Figure taken from Ref. [32]. The dashed line represents the defect D, while the solid arrows represent links or products of links crossing the defect orthogonally, and thus being suppressed by factor(s) of c(n) according to the definition of k

Figure 2 :
Figure 2: Sketch of the out-of-equilibrium setup. The horizontal axis represents the Monte Carlo time, where a new configuration is generated at equilibrium every n relax updating steps according to the prior distribution, i.e., with OBCs. The vertical arrows instead stand for the n ev out-of-equilibrium evolutions used to gradually switch on PBCs, each of them n step steps long. The bullets correspond to the states of the system. Observables are computed only at the end of the out-of-equilibrium evolution, while the work is computed all along.

Table 1 :
Setup of the numerical simulations. The model is specified by N, β and the size of the square lattice L. Non-equilibrium evolutions are characterized by the choice of the length L d of the line defect, the length of the evolution n step and the number of updating steps between consecutive out-of-equilibrium evolutions n relax .

Figure 3 :
Figure 3: Behavior of D KL as a function of n step (lower panels) and of the inverse of the defect size L d (upper panels), for N = 21 (left panels) and N = 41 (right panels).

Figure 4 :
Figure 4: Behavior of the Ê SS as a function of n step (lower panels) and of the inverse of the defect size L d (upper panels), for N = 21 (left panels) and N = 41 (right panels).

Figure 5 :
Figure 5: Behavior of the Ê SS (left panel) and of DKL (right panel) as a function of n step /L d at fixed value of n relax .

Figure 6 :
Figure 6: Distribution of the work W (left panel) and of exp{−(W − ∆F )} (right panel) for L d = 6 and various values of n step . In the left panel the vertical bar represents the value of ∆F for these particular evolutions.

Figure 7 :
Figure 7: Results for the topological susceptibility in lattice units a 2 χ obtained with non-equilibrium evolutions as a function of the Effective Sample Size, for N = 21 and β = 0.7 (left panel) and for N = 41 and β = 0.65 (right panel). The result obtained with the PTBC algorithm in Ref. [32] is reported as a black horizontal line.

Figure 8 :
Figure 8: Behavior of the normalized auto-correlation function ρ(t) for L d = 6, for several values of n step at fixed n relax = 100 (left panel) and for several values of n relax at fixed n step = 500 (right panel).

Figure 9 :

Table 2 :
Values of the integrated auto-correlation time τ int extracted with the Γ-method for several combinations of L d , n step and n relax , for the N = 21 ensembles (left table) and the N = 41 ensembles (right table), together with the corresponding values of τ int × (n step + n relax ).

Table 3 :
Comparison of τ int and Ê SS for different values of the coupling β and the size of the lattice L.