Sign problem and Monte Carlo calculations beyond Lefschetz thimbles

We point out that Monte Carlo simulations of theories with severe sign problems can be profitably performed over manifolds in complex space different from the one with fixed imaginary part of the action (“Lefschetz thimble”). We describe a family of such manifolds that interpolate between the tangent space at one critical point (where the sign problem is milder compared to the real plane but in some cases still severe) and the union of relevant thimbles (where the sign problem is mild but a multimodal distribution function complicates the Monte Carlo sampling). We exemplify this approach using a simple 0+1 dimensional fermion model previously used on sign problem studies and show that it can solve the model for some parameter values where a solution using Lefschetz thimbles was elusive.


Introduction
The Monte Carlo method is frequently the only method applicable to strongly coupled field theoretical or many-body systems. As versatile as it is, a very important category of models has been up to now excluded from its reach, namely, problems where the action (or its euclidean version) is complex. Models in this category includes most field theory models (like QCD) with a chemical potential, the repulsive Hubbard away from half-filling as well as real time observables. Roughly speaking, this problem arises from subtle cancellations among contributions with similar weight but differing phases (or signs) and it is termed "the sign problem". Many ways to circumvent it have been proposed in the literature over the years with varied but limited amounts of success.
In [1] a new approach to the sign problem was suggested. It consists of complexifying all the variables of integration in the path integral and deforming the region of integration from the real values to a manifold where the action is real (or has a constant imaginary part). The integrand over this other manifold, a union of the so-called "Lefschetz thimbles", has a fixed phase in each connected part. In principle, the integral over this manifold does not have a sign problem. Specific algorithms to perform the integration over a thimble were suggested and applied to a variety of simple toy models with bosonic degrees of freedom [2][3][4][5][6][7][8][9].
Fermionic models, where most of the interest in solving the sign problem lies, bring a new feature [10][11][12]. Some thimbles are finite and end at a point where the fermion determinant vanishes and the effective action, which includes the tr log(D) term, diverges. Other thimbles are connected to the same zero of the fermion determinant and they are, in turn, connected to further thimbles through their boundaries. It is not surprising then that the original integral over real fields is equivalent to the integral over a large number of thimbles. This seems to be a nearly insurmountable difficulty for the application of the thimble idea to fermionic models. All the algorithms proposed for the computation of one thimble require the identification of the corresponding critical point but identifying a larger number of those in a realistic theory is a daunting process. In addition, one would JHEP05(2016)053 need to determine the combination of thimbles equivalent to the original real integral, a task of great complexity to say the least. 1 The need for a multi-thimble integration, at least in some models, was evidenced in [14] where we considered a simple soluble 0 + 1 dimensional fermionic model previously used as a toy model for the sign problem [15,16]. We showed that an algorithm designed to compute the contribution of one thimble gave the exact answer as long as the coupling constant was small and the temperature and lattice spacing were not too small. However, at strong coupling or in the zero temperature or continuum limits, there were small but statistically significant discrepancies from the exact result. Furthermore, a semiclassical estimate of the contribution from other thimbles matched, in order of magnitude, the size of the discrepancy. This result suggests that the algorithm proposed in [14] properly samples one thimble and that the discrepancies result from neglecting the other thimbles' contributions.
The objective of the present paper is to point out that the partition function can also be computed as an integration over other manifolds in complex space besides thimbles and that it is useful to do so. We consider, in particular, a family of manifolds parametrized by the flow time. In the generic case we generate these manifolds by flowing the original integration contour (the real plane) such that the value of integral remains the same, while numerically the integral can be easier to evaluate. Furthermore, as an optimization we can use as a starting point for the flow another manifold that has the same integral as the original contour. We exemplify these observations by showing that for a 0+1 model we considered previously we can obtain the exact results both in the continuum limit and up to very low temperatures -something that eluded us in [14] -by simply integrating over the tangent space of the main thimble. In the case of the lowest temperatures, where the integration over the tangent space fails, the calculation is still possible if a different manifold, obtained by flowing by a moderate amount, is used.

Thimbles and the contraction algorithm
Expectation values of an observable O can be computed in field theory as a path integral of the form where the integration is over real fields φ. We assume from the start that some discretization is used so the integration is over R N , where N is the number of degrees of freedom of the system (proportional to the number of spacetime points). The standard procedure for an stochastic calculation of O is to obtain a set of N field configurations {φ (a) } distributed according to the distribution P [φ] = e −S[φ] /Z and use them to estimate the expectation 1 As the action is an extensive quantity, arguments were made that only the thimble with the smallest action contributes in the thermodynamic limit. Also, universality has been used to claim that computing the contribution of one thimble is sufficient in the continuum limit. These observations are the basis of the philosophy advocated, for instance, in [13], stating that the integration over one thimble suffices in the continuum/thermodynamic limits. The present paper can be seen as an alternative to this point of view as it suggests a practical way of performing the path integral even if not dominated by one thimble.

JHEP05(2016)053
value of the observable as Unfortunately, e −S[φ] /Z cannot be interpreted as a probability distribution if S is not real and stochastic methods fail. A possible way around this problem is to use the real part of S in the distribution probability and include the imaginary part in the operator to be averaged: (2.3) This "reweighting" method works as long as the phase e −iS I does not fluctuate too wildly from configuration to configuration. A measure of how much the phase fluctuates is the value of the average phase e −iS I S R . As it approaches zero a very large number of configurations are required for the computation of O . Unfortunately, as S is an extensive quantity, the fluctuation of the phase is expected to be proportional to the spacetime volume (that is, it is proportional to the spatial volume and inversely proportional to the temperature) and one expects a wildly fluctuating phase in the low temperature and thermodynamic limits. This is the famous "sign problem" that has impeded Monte Carlo calculations in theories of as great interest as QCD at finite chemical potential.
An idea proposed recently [1] is to complexify the variables of integration and change the region of integration away from R N into a manifold of N (real) dimensions immersed in C N where the imaginary part of the action is constant and the sign problem disappears. One such manifold is given by "Lefshetz thimbles". The Lefshetz thimble is a multidimensional generalization of the "stationary phase path" or the "steepest descent direction" used in asymptotic expansions of one dimensional integrals. A Lefshetz thimble is defined for every critical point, that is, a field configuration φ c (not necessarily real) satisfying the classical equations of motion (an account of these methods can be found in [17,18]; see also [19]): The thimble of the critical point φ c is defined as the set of points that, under the (downward) flow defined by the equations and approach φ c asymptotically. We can gain some insight into eq. (2.5) by splitting z i = x i + iy i and S = S R + iS I into real and imaginary parts: The first equality shows that the flow in eq. (2.5) is the (negative of the) gradient flow of the real part of S and that the imaginary part of S is constant along the flow (and, consequently, constant over each thimble). It is in this sense that thimbles are generalizations of the path of steepest descent or stationary phase used in one-dimensional (N = 1) integrals. The constancy of S I over a thimble is useful in addressing the sign problem as the e −iS I factor can be factored out of the integral. In general, however, several thimbles need to be added together to reproduce the original integral over real fields. More precisely, the original integral over the real fields is given by where σ indexes the different thimbles and φ σ are the corresponding critical points. The integers n σ are determined by the crossing of the original integration region with the "upward flow sets" defined as the set of points approaching a given critical point along the upward flow The analysis of the flow equations eq. (2.5) near the critical point shows that thimbles are manifolds with N real dimensions. 2 Thimbles in fermionic models have a distinct feature not present in bosonic models [10][11][12]. If one uses the standard procedure of introducing an auxiliary bosonic field through the Hubbard-Stratanovich transformation and then integrate over the fermion fields one is left with a determinant det D[φ] which, when exponentiated, adds to the auxiliary field action a term − log detD [φ]. At the values of φ at which the determinant vanishes the action has a logarithmic singularity and the flow equations cannot be integrated past a certain value of t. Thus, thimbles have a boundary at finite distances from the critical point. 3 In addition, the thimble is very curved near these singularities and frequently veers towards imaginary values of φ i . As the "largest" values of φ i (farthest from the critical point) are very far from the real values it attains asymptotically in the original integral over R N it is doubtful that one thimble is enough to reproduce the desired integral. The analysis determining which combination of thimbles is equivalent to the original integration region R N or, in other words, the determination of the integers n σ in eq. (2.8) is very involved in all but the simplest, one dimensional cases (for an example where this is possible, see [10]).
We will now describe the algorithm suggested in [14], referred to later on as the "contraction algorithm", to sample one single thimble. It will be the basis of further generalization discussed later. We start by noticing that every point φ sufficiently close to the critical JHEP05(2016)053 point is indistinguishable from a point on the tangent plane to φ c . The tangent plane at the critical point can be characterized as the thimble corresponding to the quadratic part of the action. For this reason we will denote the neighborhood of φ c which is close enough to be approximated by the tangent plane the "gaussian region". Taking a point in the gaussian region as the initial condition and evolving it according to eq. (2.9) by a fixed time T establishes a mapping between the gaussian region and the thimble. By making T large enough we can guarantee that every point sampled on the thimble (or every point on the thimble if the thimble is finite) is mapped to a point close to the critical point. That establishes a mapping between the thimble and the gaussian region. Let us denote byφ the point in the gaussian region associated to a point φ on the thimble. The mapping betweenφ and φ has a jacobian J that, as explained in [14], is the determinant of the matrix J ij (t = T ) defined as the solution of j form an orthonormal basis of the tangent space at φ c . The determinant J contains information about not only i) the inclination of the tangent space to the thimble at the point φ i (T ) in relation to the tangent space at φ c (known in the literature as the "residual phase") but also about ii) the size of neighborhood of φ i (T ) parametrized by a neighborhood ofφ with unit volume.
The contraction algorithm proposed in [14] is a standard Metropolis algorithm in the variablesφ using the effective action S − log |J|. A little care has to be taken in the choice of update proposals, specially if T is very large as the appropriate values for the updates of φ are very different along different direction in field space. The algorithm becomes exact in the large T limit. In this limit, the sample pointsφ (a) can be identified with points in the tangent space.

0 + 1 dimensional Thirring model
We will use the model studied in [11,12,14,20] to illustrate our point. The model describes two flavors of fermions living on one single spatial site with a (continuum) action given by where χ is a two component, time dependent spinor and γ 0 is a Pauli matrix. The interaction term is simply the 0+1 dimensional analog of the current-current interaction (χγ µ χ)(χγ µ χ) of the 1 + 1 dimensional Thirring model. The quartic interaction can be eliminated by introducing an auxiliary field φ and, upon integrating overχ, χ the partition function is with This model has a sign problem for non-zero values of µ and it has been used as a toy model for testing ideas such as complex Langevin dynamics [15,16,20] and hybrid Monte Carlo on Lefschetz thimbles [11,12].
Discretizing the (euclidean) time coordinate and using a staggered fermion formulation we obtain the lattice model defined by where the effective action and the explicit form of the discretized Dirac matrix are Here N = β/a is an even number that denotes the number of lattice sites related to the inverse temperature of the system β, and all the dimensionful quantities, m, g 2 , µ are rendered dimensionless by multiplying with appropriate powers of the lattice spacing: m = ma,μ = µa,ĝ 2 = g 2 a,φ = aφ. The auxiliary field e iφ in this discretization plays the role of a U(1) link variable. We can gauge the severity of the sign problem by the average phase e −iS I in a standard Metropolis calculation with real fieldsφ. Figure 1 shows the average phase for two values of the coupling strengthĝ 2 andm = 1. For parameters where the average phase is much smaller than one, a Monte Carlo evaluation requires a large number of configurations and is impractical in any realistic model.
The partition function, the chiral condensate, and the charge density can be calculated analytically in this lattice model [20] and are useful in assessing the accuracy of the numerical calculation: , (3.8) where α ≡ 1/(2ĝ 2 ) and I n (α) denotes the modified Bessel function of the first kind.
The critical points of action in eq. (3.4) are determined by the equation The solution to eq. (3.9) with the smallest (real part of the) action has the formφ t = (iφ, · · · , iφ) withφ real. The tangent space to this point is spanned by purely real vectors, a feature common in this class of models. As we will see below this critical point and its associated thimble and tangent space will play a central role in our discussion and for this reason will be called "main thimble" and "main tangent space". In [14] the contraction algorithm discussed above was applied to the computation of n and χχ over the main thimble. The results showed small but statistically significant discrepancies from the exact result (eq. (3.6)) for some values of the parameters. For instance, in figure 2 we show the condensate as a function ofμ for the parametersĝ 2 = 1/6, N = 8,m = 1 where the discrepancy is evident. Similar results were also obtained in [12]. There is evidence that the discrepancy between Monte Carlo and exact results are due to the neglected contributions of other thimbles. In fact, a semiclassical estimate of the contribution of the other thimbles, which is suppressed by their higher action, suggests that they have the right order of magnitude to explain the discrepancies with the exact result. This is, however, only a plausibility argument as the semiclassical expansion is not reliable for the parameters considered 4 and the possibility that the discrepancy arises from a problem in the algorithm computing the contribution of the main thimble should not be excluded.

Beyond thimbles
We have up to now described the contraction algorithm as a way to compute the integral over the main thimble. For that to be true the limit of large flow time (T → ∞) must be taken. However, as we will argue now, the exact results including all contributing thimbles can be obtained by using the contraction algorithm at finite values of T , which is profitable to do in many circumstances. Let us first consider the case where field variables are unbounded and the action S is a polynomial which diverges only at infinity. Since the integrand in the partition function is holomorphic, the manifold of integration can be deformed without changing the partition function [21]. But there are restrictions on the allowed deformations. For the path integral to be well defined it is necessary for the real part of the action S R to diverge at large values of the field. But the integral over M (t) is well defined at all t. Indeed, in order for the integral over M (t) not to be defined there has to be a path to infinity L t on M (t) over which S R is bounded from above. Consider then the pre-image L 0 of L t under the flow by t. Since the flow only increases the value of S R , the values which S R takes along L 0 would also be bounded and the integral over M (0) would not be well defined, which is contrary to our assumption. Consequently, M (T ) is in the same homological class as M (0). In the fermionic model we are considering, the action is not a polynomial and two small modifications are needed to the argument we just made. The first is that the real parts of the field variables φ are bounded. The second one is that S R diverges at finite values of the field and we consider finite manifolds that end on these singularities. It is the behavior of S R as these points are approached that determine the homological class of the manifolds. The points where S R diverges play the role that points at infinity play in the bosonic (polynomial S) case.
In the present model (and similar ones) the main tangent space is purely real, and thus parallel to the original domain of integration R N . Since the real part of the fields are periodic variables, the translation from R N to the tangent space does not change the value of the integral. Now, consider the manifold M  same homological class as the main tangent space M (0). As T increases, the evolution of some special points of M (0) approaches critical points of the action. Points in M (0) near those special points will flow towards points close to the thimbles of each one of these critical points. As the flow is tangent to the thimbles it cannot cross a thimble but, instead, approaches it asymptotically. Other points of M (0) flow towards infinity or, in non-polynomial actions, to singularities of S. The result is that M (T ) will approach all the thimbles contributing to the original integral, and only those. If that was not the case, the integral over M (T ) for large T would have a different value from the one over M (0), in contradiction to the argument that they belong to the same homological class. This argument is one way of understanding the rule determining the coefficients n σ in eq. (2.8) stating that n σ are determined by the intersection of the downward set of the thimble σ with the original domain of integration. For example, if 2 points on M (0) flow to the critical point of thimble σ, then n σ is 2. If the flow transports a tangent basis attached to one of these special points to an oppositely oriented basis, then n σ is negative (or the integral over thimble σ is defined to have an opposite orientation).
The observations above suggest a practical way of performing the original integral even in the case where many thimbles contribute and even if no direct knowledge of the location of the critical points is available. The contraction algorithm with T = 0 computes the integral over the main tangent space that, as mentioned above, is equivalent to the integral over R N . However, a severe sign problem may still exist on the main tangent space and a way to ameliorate that is to use a manifold obtained by flowing by a finite amount T . All algorithms previously proposed for the integration over one thimble require the corresponding critical point to be known analytically. In a more realistic field theory, the integration over all relevant thimbles would require that all solutions of the (euclidean) equation of motion, even with complexified fields, be known. Furthermore, one would have to determine which one of those contribute to the original integral over real fields, a daunting task that has only been accomplished in 0 + 0 and 0 + 1 dimensional models. A main point of this paper is that the observations of this section imply that the contraction algorithm, used at finite values of T , is an alternative way of computing the original integral without assuming the dominance of a single thimble. In other words, as the points of the tangent space (φ) parametrize (through the flow) a manifold in the same homological class as R N , a Markov chain in the tangent space generated with the effective action S − log |J| will sample all relevant thimbles, bypassing the need to analytically find all critical points and their corresponding coefficients n σ .
Let us now illustrate the points above with explicit calculations. The integral over the main tangent space is extremely cheap computationally as the flow equations eq. of the algorithm). Of course, the integrand is not real over the main tangent space and a potential sign problem arises. Still, for some models/parameter sets the phase e −iS I can be reweighted even in cases when a similar reweighting on the original domain of integration is not feasible. An example of this is shown in figure 2 where the result of the calculation in [14] (corresponding to the contribution of the main thimble only) is plotted together with a similar calculation performed with flow T = 0 (corresponding to an integration over the tangent space M (0)). The possibility of reweighting the e −iS I phase on the main tangent space is not an artifact of small coupling. In the strong coupling N = 8,m = 1,ĝ 2 = 1/2 case, where the quenched phase e −iS I R essentially vanishes beyondμ ≈ 1.4, the e −iS I phase is not smaller than ≈ 0.3 in the main tangent plane and is easily reweighted. As shown in figure 5 the results for the condensate agree with the exact result. On the other hand, the T = 2 flow result does not agree with the exact result. In view of our discussion above we can see why it fails. The contraction algorithm computes the integral over the manifold M (T ) obtained by the upward flow of the main tangent space. As S R is increased by the flow, the region of the tangent space that is mapped into a manifold close to the main thimble becomes separated from the regions mapped into other thimbles by broader and taller action barriers. Our algorithm, based on a Metropolis chain in the tangent space using the effective action S R − log |J|, "gets stuck" in a small region mapped into the main thimble. Consequently, only the main thimble is sampled. In fact, the T = 2 results agrees with [12] obtained with a different algorithm sampling (by construction) only the main thimble. These results demonstrate that i) the disagreement between the results in [12,14] and the exact result come indeed from the neglected contribution from other thimbles and ii) the contribution from other thimbles are captured by the T = 0 no flow contraction algorithm which computes the integral over the main tangent space.
Given the success of the zero flow contraction algorithm in dealing with system with small N , a natural question is how well this approach does with larger N . increased along two different routes of physical interest: the continuum limit and the zero temperature limit.
Consider first the continuum limit. It corresponds to the limit N → ∞ while keeping andĝ 2 N,mN,μN fixed. The results from the zero flow calculations, that is, the result of the integration over the main tangent space, are shown in the left panel of figure 6 starting from the strongly coupled (more challenging) case N = 8,ĝ 2 = 1/2,m = 1. It is evident that, as expected, the continuum limit is approached smoothly and no problem arises due to the fluctuating e −iS I phase. On the other hand it is expected that the e −iS I phase will fluctuate more and more as the zero temperature is approached since the euclidean action is proportional to the spacetime volume and, consequently, proportional to the inverse temperature. This is indeed what we find. In the right panel of figure 6 we show the results of the zero flow computation starting from the N = 8,ĝ 2 = 1/2,m = 1 case and taking the limit N → ∞ while keepingĝ,m andμ constant. For large enough N (low enough temperature) the e −iS I phase fluctuates too wildly to be reweighted without resorting to enormous numbers of configurations (which could be done in our toy model but would be impractical in problems of greater physical interest). We stress, however, that by integrating over the main tangent space one can perform calculations at much lower temperatures than one could have by integrating over real fields.
The examples above illustrate the best case scenario: a simple change of variable (φ → φ +φ c ) is enough to make the sign problem tractable by reweighting. From these examples we know this situation is more likely to happen at weak coupling. On the other hand, the action is an extensive quantity and we do expect the phase e −iS I (calculated for real variables or on the main tangent space) to vanish in the thermodynamic limit. It is unclear how the phase behaves on the main tangent space in asymptotically free theories in the continuum (small coupling) and thermodynamic/zero temperature limits. Whether the integration over the main tangent space alleviates the sign problem enough to be useful to physically interesting problems is something that can be determined only by future calculations.  Table 1. Value of the condensate in the low temperature (N → ∞) limit and µ = 1.688 obatined with flow time T = 0.5. These results and its error bars should be compared to the ones in the right panel of figure 6 obtained with T = 0.
In order to tame the sign problems at low temperatures we apply the algorithm with a finite flow time T . There is a compromise in the choice of T . Too small a value does not alleviate the sign problem enough. For too large values of T the relevant regions in the tangent space (which parametrizes the important parts of the thimbles) are separated by regions of large Re(S − log |J|) which "trap" the Markov chain and leads to large decorrelation lengths. At infinite T these barriers correspond to the junction of thimbles at their borders where S R diverges. That is why our previous calculation in [14], performed with a fairly large value of T , was constrained to the region of the tangent space mapped onto the main thimble, failed to include the contributions corresponding to the remaining thimbles and produced some wrong results corresponding to the contribution of only one thimble (typically, at larger couplings). 5 Let us clarify the way the one and multi-thimble results are obtained with the contraction method. There are two limits to be considered: the T → ∞ and the infinite statistics limit N → ∞. If the T → ∞ limit is taken before the N → ∞ the one thimble result is obtained. On the other hand, if the N → ∞ is taken, the result becomes independent of T .
After some trial and error we determined the value of T = 0.5 for the flow time required in the low temperature calculations. Since at low temperatures the condensate χχ as a function of the chemical potential µ is essentially a step function we show results only for the value µ = 1.688 where the condensate abruptly changes from the µ = 0 value to zero (see table 1). On figure 7 we show a histogram of the sampled fields (in the tangent space) and the corresponding values of S I . These histograms clearly shows that regions corresponding to several thimbles are being sampled, despite the high action barriers between them. This was accomplished by taking large values of the proposal, capable to "jump over" these barriers in a single Metropolis step. As a consequence the acceptance rate is small (around 15%). Compounded with the fact that e −iS I still oscillates significantly, this small acceptance rate leads to a serious loss of efficiency. In the toy problem discussed in this paper this lack of efficiency can be overcome by brute force (the N = 32 calculation is the result of a 2 million Metropolis steps). In any case, these results demonstrate the soundness of parametrizing all thimbles by the tangent space of one of them. A promising research avenue is to substitute the simple Metropolis algorithm we used by another algorithm more adapted to multi-modal distributions (for a review of these methods see [22]).

Conclusion
We pointed out that the contraction algorithm lends itself to calculations over a whole family of sub-manifolds of the complexified field space, all of which are equivalent to the original real manifold where the partition function is defined. At zero flow (T = 0) this sub-manifold is simply the tangent space to one critical point. On the other hand, in the large flow limit (T → ∞) the sub-manifold is the union of all thimbles contributing to the partition function. This observation has a practical consequence: the contraction algorithm defined around one single critical point can be used to sample all relevant thimbles, bypassing the tremendous difficulty of identifying all critical points, figuring their contribution to the partition function and sampling them with the proper statistical weights.
As an example we use this approach in a simple fermionic model, previously studied by several authors. We showed that for a large parameter region that includes the continuum limit the simplest version of this idea, the no-flow (T = 0) algorithm, is sufficient to produce the correct results, something that eluded previous studies (including the present authors). For other parameters/models, with more severe sign problems, a more sophisticated algorithm, capable of dealing with multimodal distributions will be required. This observation establishes a connection between the sign problem and the problem of sampling multimodal distributions. Of course, this is also a hard problem and one might wonder whether this observation can have practical use. This question can be addressed only after a full implementation is investigated but we stress that the kind of multimodality expected here is much JHEP05(2016)053 more benign than, for instance, spin glasses. Indeed, at small values of the coupling constant there is a hierarchy of importance of the different thimbles and only a few of them are expected to contribute significantly while the others are suppressed by small values of e −S R .
In our model, the main tangent space is parallel to the real field space and consequently, the integral over it is the same as over the real space. We can then use the main tangent space as an initial condition for the flow to obtain other manifolds where the integral is still the same but where the sign problem is ameliorated. For other models, where the integral over the main tangent space is not the same as the original integral over real fields (or where the homological class of the main tangent space is unknown) we can slightly modify the algorithm by substituting the main tangent plane by R N itself. Then one is guaranteed to flow to the correct combination of thimbles. The price to pay is that a larger flow will be required, which is the most computationally intensive part of the calculation.