Combining the complex Langevin method and the generalized Lefschetz-thimble method

The complex Langevin method and the generalized Lefschetz-thimble method are two closely related approaches to the sign problem, which are both based on complexification of the original dynamical variables. The former can be viewed as a generalization of the stochastic quantization using the Langevin equation, whereas the latter is a deformation of the integration contour using the so-called holomorphic gradient flow. In order to clarify their relationship, we propose a formulation which combines the two methods by applying the former method to the real variables that parametrize the deformed integration contour in the latter method. Three versions, which differ in the treatment of the residual sign problem in the latter method, are considered. By applying them to a single-variable model, we find, in particular, that one of the versions interpolates the complex Langevin method and the original Lefschetz-thimble method.


Introduction
The sign problem is a notorious problem in computational physics, which occurs in performing a multi-variable integral with a complex weight based on importance sampling such as Monte Carlo simulation. Recently two closely related methods have been pursuit as promising solutions to this problem based on complexification of the original real variables. One is the complex Langevin method (CLM) [1,2], and the other is the Lefschetzthimble method [3]. An important generalization of the second approach was proposed [4] to overcome some problems in the original proposal, and we will therefore refer to this approach as the generalized Lefschetz-thimble method (GLTM) in this paper.
The CLM is an extension of the stochastic quantization [5] (See ref. [6] for a review.), which generates dynamical variables with a given probability by solving the Langevin equation that describes a fictitious time evolution of those variables under the influence of a Gaussian noise. In the case of a complex weight, one has to complexify the dynamical variables, and the drift term in the Langevin equation should be extended to a holomorphic function of the complexified variables by analytic continuation. When one measures the observables for the complexified variables generated by the Langevin process, one also has to extend the observables to holomorphic functions of the complexified variables. Then, under certain conditions, one can show [7,8] that the expectation values of the observables calculated this way at some Langevin time are equal to the expectation values of the observables for the original real variables with a complex weight, which evolves with the Langevin time following the Fokker-Planck equation. If this is the case, one can obtain the desired expectation values in the long Langevin-time limit [9]. Recently, a subtlety in the use of time-evolved observables in the original argument [7,8] was recognized [10]. This subtlety was fixed in a refined argument, which also led to the derivation of a practical criterion for correct convergence in the CLM [10].
On the other hand, the GLTM [4] is based on a continuous deformation of the integration contour using the so-called holomorphic gradient flow. Along the deformed contour, the phase of the complex weight fluctuates only mildly for a sufficiently long flow time, and one can apply ordinary Monte Carlo methods using the absolute value of the weight in generating configurations and reweighting the phase factor when one calculates the expectation values. In the long flow-time limit, the deformed contour becomes a set of Lefschetz thimbles and the GLTM reduces to the originally proposed method [3]. (See refs. [11][12][13][14][15][16][17][18][19][20][21][22] for related work.) The phase of the complex weight becomes constant on each thimble, which makes this limit attractable at first sight. However, when there exists more than one thimbles, the transition from one to another does not occur during the Monte Carlo simulation and hence the ergodicity is violated in this limit. Therefore, there is an optimal flow time in general [23][24][25]. (See ref. [21,22] for a new proposal concerning this issue.) Let us recall here that the two methods have their own pros and cons. From the viewpoint of solving the sign problem, the CLM is more powerful since it solves it completely as far as it works, whereas the GLTM has a residual sign problem, which may be problematic when the system size becomes large. From the viewpoint of numerical cost, the GLTM is more demanding since one has to calculate the Jacobian arising from deforming the integration contour. (See, however, ref. [26].) It should also be noted that one can include fermions in the CLM with the cost of O(V ) for the system size V , whereas in the GLTM, it requires O(V 3 ). The disadvantage of the CLM is that there is a condition [7, 8,10] to be satisfied to make it work, and hence there is a certain range of applicability. While the gauge cooling technique [27] (See ref. [10,28] for its justification) has enlarged this range of applicability to the extent that finite density QCD either with heavy quarks [29][30][31] or in the deconfined phase [32,33] can now be investigated, it is not yet clear whether one can investigate it even in the confined phase with light quarks [34][35][36][37][38][39][40][41].
In this situation, it is useful to compare the two methods closely [42,43] and to try to come up with a new method which is free from the aforementioned problems of each method [44,45]. Note here that despite the resemblance of the two methods in that they both use complexified variables, the connection between them is far from obvious. In particular, the CLM samples configurations in the complex plane based on the Langevin process, whereas the GLTM samples configurations on a particular contour in the complex plane obtained by deforming the original contour along the real axis. Therefore the dimension of the configuration space in the CLM is twice as large as that of the GLTM.
In order to clarify the relationship between the two methods, we propose a formulation which combines them by applying the CLM to the real variables that parametrize the deformed integration contour in the GLTM. We consider three versions, which differ in the treatment of the residual sign problem in the GLTM. The ordinary CLM and the GLTM are included in our combined formulation as special cases. We apply our formulation to a simple one-variable case with a single Lefschetz thimble, and investigate the distribution of flowed configurations in detail. We find, in particular, that one version of our formulation interpolates the ordinary CLM and the original Lefschetz-thimble method.
The rest of this paper is organized as follows. In section 2 we briefly review the CLM and the GLTM. In section 3 we present our formulation, which combines the two methods. In section 4 we apply our formulation to a single-variable case and clarify the relationship between the two methods. Section 5 is devoted to a summary and discussions.

Brief review of the CLM and the GLTM
In this section, we briefly review the CLM and the GLTM. Here we consider a general model given by a multi-variable integral where the action S(x) is a complex function of x = (x 1 , · · · , x n ) ∈ R n . The expectation value of an observable O(x) is defined by

Complex Langevin Method (CLM)
In the CLM, we complexify the dynamical variables x ∈ R n as x → z = x + iy ∈ C n and extend the action and the observable to holomorphic functions of z. Then we consider the complex Langevin equation where t is a fictitious time and η k (t) is a real Gaussian noise normalized as η k (t)η l (t ′ ) η = 2 δ kl δ(t − t ′ ). This equation ( where the right-hand side denotes the expectation value of O(z) with respect to the limiting distribution of z at t → ∞. As in the case of Monte Carlo methods, O(z) CLM can be replaced with a long-time average of the observable calculated for the generated configurations z(t) if ergodicity holds for the Langevin time-evolution. The derivation of the equality (2.4) uses integration by parts, which can be justified if the distribution of z falls off fast enough in the imaginary directions [7,8] as well as near the singularities of the drift term if they exist [9,40]. Recently [10], a subtlety in the use of time-evolved observables in the original argument [7,8] was pointed out, and the derivation of the equality (2.4) has been refined taking account of this subtlety. This also led to the proposal of a useful criterion for justification, which states that the distribution of the drift term should be suppressed exponentially or faster at large magnitude [10].

Generalized Lefschetz-Thimble Method (GLTM)
In the GLTM [4], we deform the integration contour from R n to an n-dimensional real manifold in C n by using the so-called holomorphic gradient flow, which makes the sign problem milder.
The holomorphic gradient flow is defined by the differential equation which is solved from σ = 0 to σ = τ with the initial condition φ(x; 0) = x ∈ R n . The flowed configurations define a n-dimensional real manifold in C n , which we denote as M τ = {φ(x; τ )|x ∈ R n }. One can actually argue that the integration contour can be deformed continuously from R n to M τ ⊂ C n without changing the partition function. Then, by noting that φ(x) ≡ φ(x; τ ) defines a one-to-one map from x ∈ R n to φ ∈ M τ , one can rewrite the partition function as where J kl (x) is the Jacobian corresponding to the map φ(x). We obtain the Jacobian as J kl (x) = J kl (x; τ ), where J kl (x; σ) ≡ ∂φ k (x; σ)/∂x l is calculated by solving the differential equation from σ = 0 to σ = τ with the initial condition J kl (x; 0) = δ kl . Thus, the deformation of the integration contour simply amounts to changing the action from S(x) to the "effective action" The virtue of using the holomorphic gradient flow (2.5) in deforming the integration contour lies in the fact that the real part of the action S grows monotonically along the flow keeping the imaginary part constant. Thus, for a sufficiently large flow time τ , the partition function (2.6) is dominated by a small region of x, and the sign problem becomes much milder. One can therefore apply a standard Monte Carlo method using only the real part of the effective action (2.8), dealing with the imaginary part by reweighting. Thus, the expectation value of O(x) can be calculated as where · · · MC represents the expectation value obtained by a Monte Carlo method with the weight exp(−ReS eff (x)). When the flow time τ becomes infinitely large, M τ contracts to a set of Lefschetz thimbles, and the GLTM reduces to the so-called Lefschetz-thimble method [3]. Since ImS(φ(x)) is constant on each thimble, the sign problem is maximally reduced in some sense. However, when there are more than one thimbles, the transition between thimbles does not occur during the Monte Carlo simulation, which leads to the violation of ergodicity. The GLTM [4] avoids this problem of the original method by choosing a large but finite flow time. Recently, it has been pointed out [21,22] that the flow time can be made as large as one wishes without the ergodicity problem if one uses the parallel tempering algorithm.

Combining the CLM and the GLTM
In this section, we present our formulation, which combines the CLM and the GLTM. In fact, we consider three versions of our formulation, which differ in the treatment of the residual sign problem in the GLTM. The simplest version removes the residual sign problem completely by applying the CLM to the partition function (2.6) that appears in the GLTM. The real variables x ∈ R n parametrizing the n-dimensional real manifold in C n have to be complexified, and they evolve in time following the complex Langevin equation obtained from the effective action (2.8). The observable should be evaluated with the flowed configuration φ(z), which is defined as holomorphic extension of φ(x). Therefore, it is important to see how the distribution of flowed configurations changes as the flow time increases.
In order to apply the CLM to the partition function (2.6), we consider the drift term where we have defined K mlk (x) ≡ ∂J ml (x; τ )/∂x k , which can be obtained by solving the differential equation for K mlk (x; σ) ≡ ∂J ml (x; σ)/∂x k from σ = 0 to σ = τ with the initial condition K mlk (x; 0) = 0. Then the complex Langevin equation corresponding to (2.6) reads where φ(z), J kl (z) and K mlk (z) are holomorphic extension of φ(x), J kl (x) and K mlk (x), respectively. The crucial point to note here is that the flow equations (2.5), (2.7) and (3.2) for the three functions φ(x), J kl (x) and K mlk (x) involve complex conjugation on the right-hand side. Therefore, in order to define φ(z; σ), J kl (z; σ) and K mlk (z; σ) as holomorphic functions of z, we have to extend the flow equations as where φ(z; σ), J kl (z; σ) and K mlk (z; σ) appear on the right-hand side. These functions with z in the argument obey the flow equations, which can be obtained by replacing z withz in the above equations. In practice, we solve the two sets of equations simultaneously with the initial conditions φ(z; 0) = z, J kl (z; 0) = δ kl and K klm (z; 0) = 0 for a particular value of z at each Langevin step. Using this formulation, we can calculate the expectation value (2.2) as where the flowed configuration φ(z; τ ) has to be used in evaluating the observable unlike in the ordinary CLM (2.4). For τ = 0, our formulation reduces to the ordinary CLM since φ(z; 0) = z, J kl (z; 0) = δ kl and K klm (z; 0) = 0. The second version of our formulation amounts to applying the CLM to a partially phase-quenched model and treating the phase of detJ(x) by reweighting. The complex Langevin equation for (3.8) where 1 2 (· · · ) is the holomorphic extension of ∂ log |detJ(x)|/∂x k = Re(J −1 lm (x)K mlk (x)). We can calculate the expectation value defined in (2.2) as where · · · CLM, pPQ represents the expectation value obtained by the CLM using (3.9) and the reweighting factor ω(z) is holomorphic extension of ω(x) ≡ detJ(x; τ )/|detJ(x; τ )|. In practice, we obtain ω(z) by solving the differential equation for ω(z; σ) from σ = 0 to σ = τ with the initial condition ω(z; 0) = 1. Note that |ω(z)| = 1 in general unless z is real. The third version amounts to applying the CLM to the (totally) phase-quenched model and treating the phase of detJ(x) e −S(φ(x)) by reweighting. The complex Langevin equation for (3.11) reduces to the real one, which is given by We can calculate the expectation value defined in (2.2) as where · · · RLM, PQ represents the expectation value obtained by the real Langevin method (RLM) using (3.12), and Γ(x) in the reweighting factor e iΓ(x) is the phase of detJ(x) e −S(φ(x)) . This version is nothing but the GLTM with the RLM used as a standard Monte Carlo method in evaluating (2.9).

Results for a single-variable model
In this section, we apply the three versions of our combined formulation to a single-variable model to clarify the relationship between the CLM and the GLTM. The model is defined by the partition function [9,10,40], where x is a real variable, while α and p are real parameters. When α = 0 and p = 0, the weight becomes complex and the sign problem occurs. First, using the holomorphic gradient flow for a fixed flow time τ , we rewrite (4.1) as where φ(x) = φ(x; τ ) and J(x) = J(x; τ ) are obtained by solving (2.5) and (2.7) for φ(x; σ) and J(x; σ) ≡ ∂φ(x; σ)/∂x. Then we apply the three versions of our formulation; namely (i) CLM applied to the model (4.2).
(ii) CLM applied to the partially phase-quenched model: (iii) RLM applied to the totally phase-quenched model: In the cases (ii) and (iii), appropriate reweighting is needed in calculating the expectation value as described in the previous section. The phase of J(z) is evaluated as the imaginary part of log J(z), which is obtained by solving the differential equation from σ = 0 to σ = τ with the initial condition log J(z; 0) = 0. We focus on the choice of parameters p = 4 and α = 4.2, for which there exists only one Lefschetz thimble and the ordinary CLM is known to be justified [10]. The flow equations (3.4), (3.5) and (3.6) are solved for τ = 0, 3, 6, 9 by using the fourth-order Runge-Kutta algorithm with the flow time discretized by ∆σ = 10 −3 , 10 −4 , 10 −5 for τ = 3, 6, 9, respectively. The Langevin simulations are performed by using the higher-order algorithm [46] with the step-size ǫ = 10 −5 for τ = 0, 3, 6 and ǫ = 10 −6 for τ = 9. We choose z = 0 as the initial configuration, and for thermalization, we discard the first 10 5 , 10 4 and 10 2 configurations for τ = 0, τ = 3, 6 and τ = 9, respectively. Measurements are made with 10 4 configurations obtained every 10 5 , 10 3 , 10 2 , 10 steps for τ = 0, 3, 6, 9, respectively. In all the three cases (i)-(iii), we have checked that the obtained expectation values of x 2 and x 4 agree well with the exact results. In Figs. 1, 2 and 3, we show the distribution of z and flowed configurations obtained in the case (i), (ii) and (iii), respectively. The zoom up of the distribution of z is presented in Appendix A for the cases (i) and (ii).
Let us first discuss the case (i) shown in Fig. 1. At τ = 0, the distribution of φ(z; τ = 0) = z is nothing but that of the ordinary CLM. At τ = 3, the distribution of z comes close to the real axis, which occurs because the sign problem is reduced to large extent by the holomorphic gradient flow. As a result of this, the distribution of φ(z; τ ) comes close to a curve, which looks similar to the one obtained in the GLTM for the same τ shown in Fig. 3. At larger τ , however, the distribution of φ(z; τ ) spreads out again, and its boundary approaches the Lefschetz thimble. The spreading of the distribution can be understood as the effects of the complex Langevin dynamics, which takes care of the residual sign problem arising from the Jacobian even in the large-τ limit. The distribution of z shrinks towards the origin, but the zoom up in Appendix A shows that it does not contract to a distribution on the real axis. It is interesting that the expectation values evaluated with flowed configurations φ(z; τ ) reproduce the exact results for any τ without reweighting although the distribution of φ(z; τ ) looks quite different for different τ . The case (ii) shown in Fig. 2 is of particular interest. At τ = 0, the distribution of φ(z; τ = 0) = z is nothing but that of the ordinary CLM, as in the case (i). As τ increases, the distribution of φ(z; τ ) approaches the Lefschetz thimble, while the distribution of z approaches the real axis (See also Appendix A.). This is due to the fact that the sign problem that the complex Langevin dynamics has to take care of vanishes in the large τ limit since the phase of the Jacobian is taken care of by reweighting and the remaining phase factor is taken care of by the holomorphic gradient flow. Thus, in this case our combined formulation interpolates the CLM and the original Lefschetz-thimble method via In the case (iii) shown in Fig. 3, our formulation reduces to the GLTM [4]. At τ = 0, the calculation simply amounts to applying the reweighting method to the original model (4.1). At τ > 0, the distribution of φ(x; τ ) is restricted to the deformed integration contour M τ , which approaches the Lefschetz thimble as τ is increased. Note also that the distribution of x shrinks towards the origin as τ increases.

Summary and discussions
We have proposed a formulation which combines the CLM and the GLTM by applying the CLM to the real variables that parametrize the deformed integration contour in the GLTM. The residual sign problem in the GLTM is treated in three different ways. We have applied the three versions to a single-variable model in the case with a single Lefschetz thimble, and investigated the distribution of flowed configurations, which are used in evaluating the observables.
In the first version, where we remove the residual sign problem completely by the complex Langevin dynamics, the distribution of flowed configurations changes with the flow time, and it does not contract to a curve in the long flow-time limit although the Lefschetz thimble appears at the edge of the distribution. This version is interesting from the viewpoint of generalizing the CLM in the line of Refs. [47][48][49][50] since the distribution of flowed configurations for different flow time can reproduce the expectation values of observables equally well without the residual sign problem. In the second version, where we quench the phase of the Jacobian and treat it by reweighting, the distribution of flowed configurations contracts to the Lefschetz thimble in the long flow-time limit. Thus, in this case our formulation interpolates the ordinary CLM and the original Lefschetz-thimble method. The third version is nothing but the GLTM with the RLM used as the Monte Carlo method for updating the real variables parametrizing the deformed contour. The distribution of flowed configurations is restricted to the deformed contour, which approaches the Lefschetz thimble in the long flow-time limit.
While our formulation is useful in clarifying the relationship between the CLM and the GLTM, it is not of much practical use since the computational cost required in solving (3.4), (3.5) and (3.6) increases as O(V ), O(V 3 ) and O(V 5 ), respectively, with the system size V . Note also that the computational cost increases linearly with the flow time. Furthermore, our formulation as it stands does not help in enlarging the range of applicability of the CLM [7, 8,10]. The reason for this is that the distribution of flowed configurations is qualitatively not much different from the distribution of configurations obtained in the CLM, which is nothing but the distribution obtained for τ = 0 in the first and second versions of our formulation. For instance, in the model (4.1) it is known that the CLM gives wrong results for α 3.7 with p = 4 due to the singular drift problem [10]. This cannot be cured by the first and second versions of our formulation. For large τ , the situation gets even worse since the flowed configurations tend to come closer to the singularity of the drift term, and the simulation itself becomes unstable. Thus, when the CLM fails, our formulation fails as well, except for the third version, which does not rely on the justification of the CLM.
Nevertheless, we consider that the relationship of the two methods as seen in our combined formulation is useful in developing these methods further. In particular, it is possible to generalize the CLM in such a way that the range of applicability is enlarged as we report in a separate paper [51].
A Zoom up of the distribution of z In Figs. 4 and 5, we present zoom up of the distribution of z in the case (i) and (ii), respectively, for τ = 3, 6, 9. We find that the distribution contracts to the real axis in the case (ii), but not in the case (i).