An efficient algorithm to estimate the potential barrier height from noise-induced escape time data

It is a common phenomenon in nature and technology that a system under perturbations exits a regime of its usual dynamics. Often it is possible to define a potential function whereby a potential well can be associated with a usual or persistent dynamics, and a potential barrier needs to be overcome to escape the regime associated with the usual dynamics. We develop an algorithm to determine the potential barrier height experimentally, provided that we have control over the noise strength. We are concerned with the situation when the experiment requires large resources of time or computational power, and wish to find a protocol that provides the best estimate in a given amount of time. The optimal noise strength to use is found to be very simply related to the potential barrier height, and we propose an iterative method for the estimation.


August 22, 2018
It is a common phenomenon in nature and technology that a system under perturbations exits a regime of its usual dynamics [1,2,3,4,5,6]. Often it is possible to define a potential function whereby a potential well can be associated with a usual or persistent dynamics, and a saddle of the potential adjacent to the potential well is a feature through which the exit takes place [7]. The potential difference between the bottom of the potential well and the saddle is often termed a potential barrier. The expected exit time then depends on the height of this potential barrier and the (small) noise strength. Therefore, knowing the potential barrier height is often of strong interest, because then one can predict -for a given or applied noise strength -the expected escape time.
We develop an algorithm to determine the potential barrier height experimentally, provided that we have control over the noise strength. We are concerned with the situation when the experiment requires large resources of time or computational power, and wish to find a protocol that provides the best estimate in a given amount of time. We encountered such a situation when wanted to determine expected transition times to a cold climate for a noisy version of the climate model presented in [8].
We consider the rather generic situation when the dynamics is governed by the following Langevin stochastic differential equation (SDE): x, F, ξ ∈ R n , and the diffusion matrix D ∈ R n×n is independent of x, i.e., the white noise ξ is additive. The vector field F(x) is such that it realises the coexistence of multiple attractors (including the possibility of an attractor at infinity) and at least one nonattracting invariant set, often called a saddle set. The saddle set is embedded in the boundary of some basins of attraction.
Based on a well-established theory due to Freidlin and Wentzell [9] the steady state probability distribution in the weak-noise limit, σ ≪ 1, can be written as in which Φ(x) is called the nonequilibrium-or quasipotential. In gradient systems where . See e.g. [10] for an example of multiplicative noise where lim σ→0 σ 2 ln W (x = x) does not exists for some parameter setting and W (x) has a fat tail. The probability that a perturbed trajectory does not escape the basin of attraction over a time span of t t decays exponentially: The approximation is in fact quite good already for times t t ≈ E[t t ] = τ or even smaller. The reciprocal of the expectation value τ can be written as an integral of the probability current through the basin boundary, whose leading component as σ → 0 comes from a point x e where Φ(x) is minimal on the boundary. The proportionality of the probability current to W (x) leads [11,12] to: where is what we call the potential barrier height. Both the saddle and the attractor can be chaotic, in which cases Φ(x e ) and Φ(A) have been shown [13,14] to be constant over the saddle [13] and attractor [14], respectively. Considering (4), the expected transition times increase "explosively" as the noise strength σ decreases. From the point of view of estimating ∆Φ, there seems to be a trade-off between an increasing accuracy of the estimation and an increasing demand of resources as σ decreases. However, if we fix the amount of resources that we are willing to commit, then an increasing of accuracy is not guaranteed any more, because we can register fewer transitions as σ decreases. On the other hand, increasing σ beyond a point might not improve accuracy either for the following reason. We assume that for some σ 0 we can estimate τ = τ 0 arbitrarily accurately because a large number of transitions can be achieved inexpensively. We also assume that in this "anchor point" (4) applies accurately: Then, we can identify the accuracy of estimation by where we introduced t t is our finite-N estimate of τ for a fixed σ. Clearly, as σ → σ 0 the inaccuracy explodes. That is, in the described setting of estimation (which is not the most generic one) there should exist an optimal value of σ.
The sum of the exponentially distributed random variables, Nt t , does in fact follow an Erlang distribution [15], and so: where Ψ (1) (N ) is the first derivative of the digamma function [16]. We can make the interesting observation that Var[lnt t ] does not depend on τ , only on N . Next, we make use of the approximation [16] Ψ (1) where we, first, assumed a certain fixed commitment of resources, which can be expressed simply by T = N τ , and, second, made use of (6). We look for a σ = σ * or y = y * that minimizes δ∆Φ, for which we need to solve d δ∆Φ/ d y = 0, yielding our main result: We can make the interesting observation that it is independent of τ 0 and T , which we comment on shortly. y * depends only on ∆Φ (in a very simple way), the unknown that we wanted to determine in the first place, and so the result can seem irrelevant to practice for the first sight. However, one can simply start out with an initial guess value,∆Φ 0 , and iteratively update the estimate as∆Φ i by performing a maximum likelihood estimation (MLE) [17] each time a new value of t t,i is acquired. This way, for the acquisition of t t,i+1 , one continues the experiment with an updated noise strength . . , N , according to (12). The MLE of ∆Φ is based on the probability distribution (3) jointly with (6). This is an analogous procedure to nonstationary extreme value statistics when one or more parameters of the extreme value distribution (EVD) is a function of a covariate that could depend on time. In our case τ and σ correspond to the EVD parameter and covariate, respectively. We note that as σ * does not depend on T , at any time into the experiment (for large enough N , though, such that (10) is a good approximation) our estimate of ∆Φ is done most efficiently, and so we can revise our commitment, either stopping the experiment early or extending it. Next we demonstrate the use of our algorithm on two examples; in a singleas well as a multi-dimensional system. Example 1: Overdamped particle in a symmetrical 1D double-well potential. It is governed by the following SDE: We specify our example as: V = x 4 /4 − x 2 . The two minima are at x ± = ± √ 2, and the local maximum in between is at x 0 = 0. These are fixed points of the deterministic case (σ = 0). A numerical solution of the SDE (13) is obtained by using an Euler-Maruyama integrator [18] with a time step size of h = 0.02. Examples are shown in Fig. 1, indicating the regime behaviour with irregular transitions between the two regimes. The time series clearly evidence bimodal marginal distributionscorresponding to the two regimes -whose maxima, and the local minimum in between (not shown), are exactly at x ± = ± √ 2 and x 0 = 0, respectively. With substituting these in to (5) we obtain that ∆V = ∆Φ = 1. This shows up as the slope of the curve in Fig. 2. The green coloring indicates that (4) is satisfied well even with so strong noise that the time spent in a regime is not so clear cut any more, as seen in Fig. 1 (a). The result of applying our algorithm is shown in Fig. 3, indicating that it serves its purpose, and that the convergence is rather fast. Finally, Fig. 4 verifies the corner stone of the algorithm (12), showing the sample standard deviation of a number of estimates. Results with the algorithm and different fixed sample values of σ are shown in one diagram, indicating that the accuracy of estimate by our algorithm is just about the best accuracy achievable by the same amount of computation using the optimal fixed σ. Note that we chose N = 30 for our algorithm, resulting in some computational time T , and then we realised N = ⌈T /τ (σ)⌉ transitions using the different fixed σ's.    The initial value for each was σ * 0 = 0.9 < σ 0 . A "safeguarding" of the procedure is facilitated by overriding (12) such that σ * i+1 = σ 0 if ∆Φ i < 0 and σ * i = σ * min = 0.6 when (12) dictates smaller.  Earth's climate is its bistability: beside the relatively warm climate that we live in, under the present astronomical conditions a very cold climate featuring a fully glaciated so-called snowball Earth is also possible, and this state might have been experienced a number of times by Earth in the past few hundred million years [19]. Different hypotheses of transitioning from the warm to the cold climate and the other way round involve external forcings, but in principle it is possible that the climate system is transitive, at least in the warm-to-cold direction. This transitivity can be modeled by noise-induced transition, where the noise models some unresolved internal, say, atmospheric and/or oceanic dynamics. Without a requirement for physical realism, we consider additive noise perturbations of the Ghil-Sellers model [20] written for the long time average surface air temperature T (φ, t) as a function of latitude φ ∈ [−π/2, π/2] or x = 2φ/π ∈ [−1, 1]. The deterministic GSEBM stands in the form of a diffusive heat equation: See [20,21] [23] regarding the xdependent diffusivity). The boundary conditions are eliminated by the method of reflection, setting T 0 = T J and T J+1 = T 1 . Such a grid deals effectively with the singularity of M (x) at the poles, but the resulting ODE can be somewhat stiff. Fig. 5 shows that our algorithm works also in a multi-dimensional setting.

Acknowledgments
I would like to thank Ying Tang for his extremely helpful support for using their code [3], and Tamás Tél for providing valuable feedback on a draft of the  Figure 5: Sames as Fig. 3 (b) but for the discretized GSEBM, J = 10, and σ 0 = 1.7, N 0 = 400, σ * 0 = 1.5, σ * min = 1.0. We considered only warm-to-cold transitions given the present day solar strength µ = 1. The red line marks the potential difference ∆Φ between the warm climate and the saddle. Note that here we measure time in 10 6 seconds [Ms], as a change to [22,21]. An expression for the potential functional Φ(T ) of the PDE was given in [20]. However, we calculate ∆Φ for the discretized system by an action-minimizing procedure [3], using the computer code as a supplementary material to that paper. Time-discretization of the instanton was realised by 100 points over a span of 2000 [Ms]. Note that for the feasibility of the minimization it is crucial to provide symbolically the gradient of the action with respect to the displacement of the discrete sample points of the instanton. To be able to achieve this using the code of [3] the method of lines has to result in an explicit expression forṪ j . This could not be achieved by the sophisticated method implemented in Matlab's pdepe, which is why we developed the discretization scheme (15). manuscript. I would like to acknowledge Valerio Lucarini for the many inspiring exchanges on the topic of critical transitions. This work received funding from the EU Blue-Action project (under grant No. 727852).