Geometric Characterization of the Eyring–Kramers Formula

We consider the mean transition time of an over-damped Brownian particle between local minima of a smooth potential. When the minima and saddles are non-degenerate, this time is, in the low-noise regime, exactly characterized by the so-called Eyring–Kramers law, which gives the mean transition time as a quantity depending on the curvature of the potential at the minima and at the saddle. In this paper we find an extension of the Eyring–Kramers law that gives an upper bound on the mean transition time when both the minima and the saddles are degenerate (flat), while at the same time covering multiple saddles at the same height. Our main contribution is a new sharp characterization of the capacity of two local minima as a ratio of two geometric quantities, namely the minimal cut and the geodesic distance.


Introduction
In this paper we investigate the so-called metastable exit times for the stochastic differential equation
\[ dX_t = -\nabla F(X_t)\,dt + \sqrt{2\varepsilon}\,dW_t, \tag{1.1} \]
where F is a smooth potential with many local minima and ε is a small number.
The main question of metastability is to determine how long the process (1.1) takes to go from one local minimum to another. We call these times the metastable exit times. This question has a rich history. In the double-well case with non-degenerate minima and a saddle point, the answer is given by a formula called the Eyring–Kramers law [11,17], which can be stated as follows: Assume that x and y are quadratic local minima of F, separated by a unique saddle z at which the Hessian has a single negative eigenvalue λ_1(z). Then the expected transition time from x to y satisfies
\[ \mathbb{E}^x[\tau_{B_\varepsilon(y)}] \simeq \frac{2\pi}{|\lambda_1(z)|} \sqrt{\frac{|\det \nabla^2 F(z)|}{\det \nabla^2 F(x)}}\; e^{(F(z)-F(x))/\varepsilon}, \tag{1.2} \]
where ≃ denotes that the comparison constant tends to 1 as ε → 0. The validity of the above formula has been studied quite extensively from a qualitative perspective, starting from the work of Freidlin and Wentzell. For more information, see the book [12]. Roughly 15 years ago, Bovier et al. produced a series of papers [6,7,8,9] (see also [5]) which provided the first proof of (1.2) in the general setting of Morse functions. Specifically, they showed that the comparison function is of the form 1 + O(ε^{1/2} |log ε|^{3/2}). In these papers, they utilized the connection to classical potential theory in order to reduce the problem of estimating metastable exit times to the problem of estimating certain capacities sharply. This approach was later used in [4] to generalize (1.2) to general polynomial-type degeneracies.
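As a rough numerical illustration of the classical law, the following sketch evaluates the standard Eyring–Kramers expression, 2π/|λ_1(z)| times the square root of |det ∇²F(z)| / det ∇²F(x) times exp((F(z) − F(x))/ε), for a two-dimensional double well. The potential F(x, y) = (x² − 1)²/4 + y²/2, the helper names, and the chosen value of ε are assumptions made only for this example; they do not come from the paper.

```python
import math

def F(x, y):
    # Illustrative double-well potential (an assumption, not from the paper).
    return (x**2 - 1.0) ** 2 / 4.0 + y**2 / 2.0

def hessian_eigenvalues(x, y):
    # The Hessian of this particular F is diagonal: diag(3x^2 - 1, 1).
    return sorted([3.0 * x**2 - 1.0, 1.0])

def eyring_kramers_time(minimum, saddle, eps):
    """Leading-order mean transition time in the classical non-degenerate case."""
    eig_min = hessian_eigenvalues(*minimum)
    eig_sad = hessian_eigenvalues(*saddle)
    lam1 = eig_sad[0]                      # the single negative eigenvalue at the saddle
    det_min = math.prod(eig_min)           # positive at a quadratic minimum
    det_sad = math.prod(eig_sad)           # negative at a non-degenerate saddle
    prefactor = 2.0 * math.pi / abs(lam1) * math.sqrt(abs(det_sad) / det_min)
    barrier = F(*saddle) - F(*minimum)     # height of the saddle above the minimum
    return prefactor * math.exp(barrier / eps)

# Transition from the well at (-1, 0) over the saddle at (0, 0).
t = eyring_kramers_time(minimum=(-1.0, 0.0), saddle=(0.0, 0.0), eps=0.05)
print(t)
```

The exponential factor dominates as ε decreases, while the prefactor only records the curvatures; this is the part of the formula that ceases to make sense when the Hessians are degenerate.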
In this paper we are interested in estimating the metastable exit times in the case of general degenerate critical points. This requires new techniques and effective notation from geometric function theory, which we describe below. Our motivation comes from the field of non-convex optimization, where we cannot expect the minima and saddles to be quadratic or even to have polynomial growth in any direction. In particular, such situations are well known in the context of neural networks, where the minima and saddles may be completely flat in some directions [15]. Furthermore, it seems that such flat points are preferable; see [19] for a discussion and [3] for an explicit example.
The main goal is to estimate how the metastable exit times depend on the geometry of the potential F. In the proof of (1.2) in [8] this is reduced to estimating the ratio of the L^1 norm of the hitting probability and the capacity. Thus, in order to estimate the metastable exit times, one needs to produce (1) estimates of the integral of the hitting probability, i.e., the integral of capacitary potentials with respect to the Gibbs measure, and (2) estimates of the capacity itself, i.e., estimates of the energy of the capacitary potentials. The interesting point is that the influence of F on (1) and (2) is in a sense dual. Specifically, the shape of the minima of F influences (1), while the shape of the saddles between the minima influences (2). As is well known, the main difficulty is to estimate (2), which is an interesting topic of its own.
Our main contribution is a sharp capacity estimate for a very general class of degenerate saddle points. In order to achieve this, we phrase the problem in the language of geometric function theory, where capacity estimates are a central topic [13,20]. We introduce two geometric quantities which allow us to estimate the capacity in a sharp and natural way. As a byproduct, we see that in the case of several saddle points at the same height, the topology dictates how the local capacities add up. Specifically, we consider two cases, which we call the parallel and the series case, and it turns out that the formulas for the total capacity have natural counterparts in electrical networks of capacitors, see Theorem 1. Even in the context of non-degenerate saddles, our formulas provide a generalization of the result of [8], where the authors consider only the parallel case. As we mentioned, we allow the saddle points to be degenerate, but we have to assume that the saddles are non-branching, see (1.6).
1.1. Assumptions and statement of the main results. In order to state our main results we first need to introduce our assumptions on the potential F. We also need to introduce notation from geometric function theory, which might seem rather heavy at first but turns out to be robust enough for us to treat potentials with possibly degenerate critical points.
Let us first introduce some general terminology. We say that a critical point z of a function ) in a neighborhood of z. If f is not locally constant at a critical point z, then z is a saddle point if it is not a local minimum / maximum. For technical reasons we also allow saddle points to include points z where f is locally constant. We say that a local minimum at z is proper if there exists aδ > 0 such that for every 0 < δ <δ there exists a ρ such that where B ρ (z) denotes open ball with radius ρ centered at z. When the center is at the origin we use the short notation B ρ .
Let us then proceed to our assumptions on the potential F . Throughout the paper we assume that F ∈ C 2 (R n ) and satisfies the following quadratic growth condition for a constant C 0 ≥ 1. We assume that every local minimum point z of F is proper, as described above, and that there is a convex function G z ∶ R n → R which has a proper minimum at 0 with G z (0) = 0 such that where ω ∶ [0, ∞) → [0, ∞) is a continuous and increasing function with We denote by δ 0 the largest number for which ω(δ) ≤ δ/8 for all δ ≤ 4δ 0 . We define a neighborhood of the local minimum point z and δ < δ 0 as

For the saddles, we assume that for every saddle point z of F there are convex functions g z ∶ R → R and G z ∶ R n−1 → R which have a proper minimum at 0 with g z (0) = G z (0) = 0, and that there exists an isometry T z ∶ R n → R n such that, denoting The assumption (1.6) allows the saddle points to be degenerate, but we do not allow them to have many branches, i.e., the sets {F < F (z)} ∩ B ρ (z) cannot have more than two components. Note that the convex functions g z , G z and the isometry T z depend on z, while the function ω is the same for all saddle points. We define a neighborhood of the saddle point z and δ < δ 0 as where T z is the isometry in (1.6). Note that, since the saddle may be flat, we should talk about sets rather than points. However, we adopt the convention that we always choose a representative point from each saddle (set), and thus we may label the saddles by points z 1 , z 2 , . . . . Moreover, we assume that there is a δ 1 ≤ δ 0 such that for δ < δ 1 the following holds: if z 1 and z 2 are two different saddle points, then their neighborhoods O z 1 ,3δ and O z 2 ,3δ defined in (1.7) are disjoint. We assume the same for local minima (or more precisely, for the representative points of sets of local minima).

Let us then introduce the notation related to geometric function theory, see [13,20]. Let us fix two disjoint sets A and B in a domain Ω (open and connected set).
We say that a smooth path γ ∶ [0, 1] → R n connects A and B in the domain Ω if γ(0) ∈ A, γ(1) ∈ B and γ([0, 1]) ⊂ Ω.
In this case we denote γ ∈ C(A, B; Ω). We follow the standard notation from geometric function theory and define a dual object to this by saying that a smooth hypersurface S ⊂ R n (possibly with boundary) separates A from B in Ω if every path γ ∈ C(A, B; Ω) intersects S. In this case we denote S ∈ S(A, B; Ω). We define the geodesic distance between A and B in Ω as and its dual by Here H k denotes the k-dimensional Hausdorff measure. Finally, we define the communication height between the sets A and B as Let us then assume that x a and x b are local minimum points and denote the height of the saddle between x a and x b as We note that the points x a and x b lie in different components of the set U −δ 3 while they are in the same component of the set U δ 3 . We will always denote the components of U −δ 3 containing the points x a and x b by U a and U b , respectively. It is important to notice that if z is a saddle point and F (z) < F (x a ; x b ) + δ 3, then the neighborhood O z,δ defined in (1.7) intersects the set U −δ 3 . We will sometimes call the components of the set U −δ 3 islands and the neighborhoods O z,δ bridges since we may connect islands with bridges, see Fig. 1. (The terminology is obviously taken from Seven Bridges of Königsberg). We say that the set of saddle points Z xa,x b = {z 1 , . . . , z N } charge capacity if it is the smallest set with the property that We will focus on two different topological situations, where the saddle points in Z xa,x b are either parallel or in series. We say that the points in passing only through z i . We say that the points in Z xa,x b are in series if every path γ ∈ C(B ε (x a ), B ε (x b ); U δ 3 ) passes through the bridge O z i ,δ , defined in (1.7), for all z i ∈ Z xa,x b . In other words, if the points in Z xa,x b = {z 1 , . . . , z N } are parallel, then the islands occupied by the points x a and x b respectively are connected with N bridges and we need to pass only one to get from x a to x b . 
If they are in series, then we have to pass all N bridges in order to get from x a to x b , see Fig. 2.
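The communication height can be illustrated on a discrete grid. The sketch below understands it as the infimum, over paths connecting two points, of the maximum of F along the path, and computes it with a Dijkstra-type search in which the cost of a path is its highest value rather than its length. The grid, the four-neighbor connectivity, the use of single cells in place of the sets A and B, and the function name `communication_height` are all assumptions made for this illustration.

```python
import heapq

def communication_height(F, start, goal):
    """Smallest possible maximum of F along a grid path from start to goal.

    F is a list of rows of potential values; start and goal are (row, col) pairs.
    """
    rows, cols = len(F), len(F[0])
    best = {start: F[start[0]][start[1]]}
    heap = [(best[start], start)]
    while heap:
        height, (i, j) = heapq.heappop(heap)
        if (i, j) == goal:
            return height
        if height > best[(i, j)]:
            continue  # stale heap entry, a lower path to this cell was found already
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                # The height of the extended path is the largest value seen so far.
                new_height = max(height, F[ni][nj])
                if new_height < best.get((ni, nj), float("inf")):
                    best[(ni, nj)] = new_height
                    heapq.heappush(heap, (new_height, (ni, nj)))
    return float("inf")

# Two flat wells (value 0) separated by a wall of height 5 with passes of height 2 and 3.
grid = [
    [0, 5, 0],
    [0, 2, 0],
    [0, 5, 0],
    [0, 3, 0],
]
height = communication_height(grid, (0, 0), (0, 2))
print(height)
```

In this toy landscape the two passes play the role of bridges between the two islands; only the lowest pass determines the communication height, while passes at (nearly) the same height would all contribute to the capacity.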
Recall that U a and U b denote the islands, i.e., the components of U −δ/3 , which contain the points x a and x b . If the points in Z xa,x b = {z 1 , . . . , z N } are parallel, then we may connect U a and U b with one bridge, i.e., for every If the points in Z xa,x b are in series, then it is useful to order them Z xa,x b = {z 1 , . . . , z N } as follows. Let us consider a path γ ∈ C(B ε (x a ), B ε (x b ); U δ/3 ) which passes through each point in Z xa,x b precisely once. This means that there are which gives a natural ordering for the points in Z xa,x b . By the assumption (1.6), we also deduce that there are s 1 , . . . , The idea is that then every point x i lies in a different island, i.e., component of U −δ/3 , see Fig. 2.

Figure 2. The left picture is the parallel case and the right is the series case.
We are now ready to state our main results. The first result is a quantitative lower bound on the capacity between the sets B ε (x a ) and B ε (x b ), where x a and x b are two local minimum points of F . For a given domain Ω ⊂ R n we define the capacity of two disjoint sets A, B ⊂ Ω with respect to the domain Ω as Above, the infimum is taken over functions u ∈ W 1,2 loc (Ω). In the case Ω = R n we denote cap(A, B) = cap(A, B; R n ) for short.
Finally, for functions f and g which depend continuously on ε > 0, we adopt the notation when there exists a constant C depending only on the data of the problem such that whereη is an increasing and continuous functionη ∶ [0, ∞) → [0, ∞) with lim s→0η (s) = 0. In all our estimates, the functionη is specified and depends only on the function ω from (1.4) and (1.6). In order to define it, we first let 0 < ε ≤ δ 0 2 be fixed and let ε 1 (ε) be the unique solution to From the assumption that ω(s) < s 2 for s < δ 0 we see that ε < ε 1 . Furthermore, since ω is increasing we get that ε 1 → 0 as ε → 0. Now, from the definition of ε 1 in (1.13) we see, using lim s→0 ω(s) s = 0 and ε 1 → 0 as ε → 0, that On the other hand, again using the same facts, we see that (1.14) In the following we will denote Finally, in our main theorems and our lemmas/propositions beyond Section 3 there is a ball B R which contains all the level sets of interest. The existence of such a ball is given by the quadratic growth condition (1.3). The constants in the estimates in our main theorems and in Section 3 are unless otherwise stated, depending on n, ∇F B R , δ, R, C 0 , specifically, this applies to the constants inη, and as such, gives precise meaning to a ≃ b.
Theorem 1. Assume that F satisfies the structural assumptions above. Let x a and x b be two local minimum points of F and let Z xa,x b = {z 1 , . . . , z N } be the set of saddle points which charge capacity as defined above, and let 0 < δ ≤ δ 1 be fixed. There exists an 0 < ε 0 ≤ δ such that if 0 < ε ≤ ε 0 the following holds: If the points in Z xa,x b = {z 1 , . . . , z N } are parallel, then, using the notation U z i ,δ from (1.10), it holds Moreover for all i = 1, . . . , N we have the estimate If the points in Z xa,x b are in series, then, using the ordering z 1 , . . . , z N from (1.11) for the points in Z xa,x b and the points x 0 , x 1 , . . . , x N defined in (1.12), it holds Let us make a few remarks on the statement of the above theorem. First, in the case of a single saddle Z xa,x b = {z} the above capacity estimate reduces to ) is the area of the 'smallest cross section'. This is in accordance with the classical result on parallel plate capacitors, where the capacity depends linearly on the area and is inversely proportional to their distance.
The statement (1.17), when the saddle points are parallel, means that each saddle point z 1 , . . . , z N charges capacity and the total capacity is their sum. Again the situation is the same as in the case of parallel plate capacitors with capacities C 1 , . . . , C N , where the total capacity is the sum
\[ C = C_1 + \dots + C_N. \]
On the other hand, if the plate capacitors are in series, their total capacity satisfies
\[ \frac{1}{C} = \frac{1}{C_1} + \dots + \frac{1}{C_N}. \]
Using the assumption (1.6) we calculate in Proposition 4.1 and in Proposition 4.2 more explicit, but less geometric, formulas for the single saddle case in a domain Ω. Namely, we have and thus we recover the result in [4]. In particular, if the saddle point is non-degenerate, i.e., g z and G z are second order polynomials and the negative eigenvalue of ∇ 2 F (z) is −λ 1 , we may estimate In particular, we recover the classical formula (1.2).

Our second main theorem is an estimate on the so-called metastable exit times. However, in order to state it we need some further definitions. Assume that the local minima of F are labelled x i and ordered such that We will group the minima at the same level using the sets In addition to the previous structural assumptions we assume further that for δ 2 ≤ δ 1 small enough, it holds for all k = 1, . . . , K. In our second theorem we give an upper bound on the exit time for the process defined by (1.1) to go from a local minimum point in G ε k+1 to a lower one in S k .

Theorem 2. Assume that F satisfies the structural assumptions above, and let Ω be a domain that contains S ε k+1 . There exists an 0 < ε 0 ≤ δ 2 such that if 0 < ε < ε 0 , the following holds: be a pair that maximizes the pairwise capacity.
Then, with the notation of Theorem 1, we get in the parallel case and in the series case we get If both the minima and saddles are non-degenerate points, and there is only one saddle connecting , where x a , x b are the only minima of F , the above estimate coincides with the Eyring–Kramers formula (up to a constant) Here λ 1 is the first eigenvalue of the Hessian of F at the saddle z, and the additive error in Theorem 2 can be removed for small ε, as the right-hand side of the above tends to ∞ as ε → 0.
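The capacitor analogy used in the remarks after Theorem 1 can be made concrete with the classical rules for ideal capacitors: capacities add in parallel, and reciprocals of capacities add in series. The sketch below only illustrates these textbook rules together with the "area over distance" scaling of a parallel-plate capacitor; the function names are hypothetical, the physical constant is set to 1, and the numbers are arbitrary, so this is not an implementation of the estimates in Theorem 1.

```python
def plate_capacity(area, distance):
    # Parallel-plate scaling: proportional to the area, inversely to the distance.
    # The proportionality constant is set to 1 for illustration.
    return area / distance

def parallel_capacity(caps):
    # Parallel capacitors: the total capacity is the sum of the capacities.
    return sum(caps)

def series_capacity(caps):
    # Capacitors in series: the reciprocals of the capacities add up.
    return 1.0 / sum(1.0 / c for c in caps)

# Two "bridges": a wide short one and a narrow long one.
caps = [plate_capacity(2.0, 1.0), plate_capacity(1.0, 4.0)]
print(parallel_capacity(caps), series_capacity(caps))
```

In the series case the total is never larger than the smallest individual capacity, which matches the intuition that the narrowest bridge limits the passage when every bridge has to be crossed.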

Preliminaries
The generator of the process (1.1) is the following elliptic operator In this section we study the potential and regularity theory associated with the operator (2.1). We provide the identities and pointwise estimates that we will need in the course of the proofs. Most of these are standard, but we provide them adapted to our situation for the reader's convenience. We note that in this section we only require that the potential F is of class C 2 and satisfies the quadratic growth condition (1.3).
Let Ω ⊂ R n be a regular domain and let G Ω (x, y) be the Green's function for Ω, i.e., for every f ∈ C(Ω) the function The natural associated measures are the Gibbs measure dµ ε = e −F ε dx and the Gibbs surface measure dσ ε = e −F ε dH n−1 .

Remark 2.2.
Note that the Green's function is symmetric w.r.t. the Gibbs measure, i.e.
We also have the fundamental Green's identities. Here we assume that Ω is a Lipschitz domain and denote the inner normal by n.

Lemma 2.3.
Let Ω be a smooth domain, let ψ, φ be in C 2 (Ω), and let G Ω be the Green's function for Ω. Then the following Green's identities hold: (Green's first identity)

2)
and (Green's second identity) Furthermore, the following (balayage) representation formula holds: for every g ∈ C(∂Ω) the function Proof. Integration by parts gives The second Green's identity follows from the first by applying it twice We may now obtain the representation formula for the Dirichlet problem. We choose φ(x) = G Ω (x, y) and obtain by Green's second identity that Now relabeling x → y, we get the representation formula (2.4).
Recall the definition of capacity (variational): for A, B ⊂ Ω two disjoint compact sets The extension of capacity to open sets follows in the classical way It is well known that for bounded sets with regular boundary the continuity of the capacity implies that cap(U, B; Ω) = cap(U , B; Ω). The extension w.r.t the second entry follows similarly. The variational definition of the capacity has many equivalent forms, one that we will need is the one below: ⊂ Ω be two disjoint compact sets. Then the variational formulation of capacity coincides with the balayage definition, i.e., The unique measure which maximizes the above, i.e., satisfying is called the equilibrium measure µ A,B . The corresponding equilibrium potential is defined as h A,B = ∫ Ω G Ω∖B (x, y)dµ A,B (y) and is the minimizer of (2.5).
If in addition A, B are smooth, then we have Proof. The claim follows from the symmetry of the Green's function, Remark 2.2, and the strong maximum principle that h A,B = 1 in A, see [1,10].
Using h A,B = 0 on ∂(Ω ∖ B) we see that the right hand side of the above is zero. Moreover, from L ǫ h A,B = µ A,B and from the definition of dµ ǫ we get Note that since h A,B = 1 in A and 0 on ∂(Ω ∖ B), and since L ǫ h A,B = 0 in Ω ∖ (A ∪ B), we have by the uniqueness of the solution to the Dirichlet problem that h A,B coincides with the variational minimizer of (2.5). This establishes the first two equalities of (2.6).

AVELIN, JULIN, AND VIITASAARI
To prove the last equality in (2.6) we insert h A,B = φ = ψ into (2.2) (Green's first identity) and get where n is the outward unit normal of A. The result now follows from (2.7) and (2.8).
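In one dimension the variational capacity and its minimizer can be written down explicitly, which gives a quick sanity check of the notions above. The sketch assumes the normalization cap = inf ε ∫ |u'|² e^{−F/ε} dx with u = 1 at the left endpoint and u = 0 at the right endpoint; this normalization, the double-well potential, and all function names are assumptions made for the illustration. Under this assumption the minimizer has derivative proportional to −e^{F/ε}, and the capacity equals ε divided by the integral of e^{F/ε} over the interval.

```python
import math

def capacity_1d(F, a, b, eps, n=2000):
    """Midpoint-rule approximation of the 1D capacity between x <= a and x >= b."""
    dx = (b - a) / n
    mids = [a + (i + 0.5) * dx for i in range(n)]
    w = [math.exp(F(m) / eps) for m in mids]   # e^{F/eps} on each cell
    Z = sum(w) * dx                            # approximates the integral of e^{F/eps}
    # Equilibrium potential: h(a) = 1, h(b) = 0, with h' = -e^{F/eps} / Z.
    h = [1.0]
    for wi in w:
        h.append(h[-1] - wi * dx / Z)
    return eps / Z, h, w, dx

def dirichlet_energy(u, w, dx, eps):
    # eps * integral of |u'|^2 e^{-F/eps}, using e^{-F/eps} = 1 / w on each cell.
    return eps * sum(((u[i + 1] - u[i]) / dx) ** 2 / w[i] * dx for i in range(len(w)))

F = lambda x: (x**2 - 1.0) ** 2 / 4.0          # illustrative double well
eps = 0.1
cap, h, w, dx = capacity_1d(F, -1.0, 1.0, eps)

# A competitor with the same boundary values: the linear interpolation.
linear = [1.0 - i / (len(h) - 1) for i in range(len(h))]
print(cap, dirichlet_energy(h, w, dx, eps), dirichlet_energy(linear, w, dx, eps))
```

The energy of the equilibrium potential reproduces ε divided by the weighted length ∫ e^{F/ε}, while the linear competitor has strictly larger energy. The quotient form is reminiscent of the "area over distance" structure of the capacity estimates, with the weighted length playing the role of the distance.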
Let Ω be a smooth domain and A ⊂ Ω. Then we define the potential of the equilibrium potential as The definition of the potential of the equilibrium potential might seem technical at first. However, w A,Ω has a clear probabilistic interpretation as the expected hitting time of hitting A of a process killed at ∂Ω. Indeed, the probabilistic interpretation of h A,Ω c is P(τ A < τ Ω c ) i.e. the probability of hitting A before Ω c . By Dynkin's formula we see that then We also have the following integration by parts formula for the potential of the equilibrium potential: Let Ω be a smooth domain, let A ⊂ Ω be a smooth set, and assume that B 2ρ (x) ⊂ Ω ∖ A. Then The above statement looks more familiar if we write it in the formal way as Proof. Using the definition of w A,Ω and Fubini's theorem Using the symmetry of the Green's function, Remark 2.2, Note that supp µ Bρ(x),A ⊂ ∂B ρ (x) and as such Combining the equalities above yields the result.

2.2.
Classical pointwise estimates. In this section we recall classical pointwise estimates for functions which satisfy , then for all Hölder continuous f the solutions of the above equation are C 2,α -regular, see [14]. However, these regularity estimates depend on ε and blow up as ε → 0. The point is that we may obtain regularity estimates for constants independent of ε if we restrict ourselves on small enough scales. To this aim, for a given domain Ω we choose a positive number ν such that We have the following two theorems from [14].
Let Ω be a domain and let u ∈ C 2 (Ω) be a non-negative function satisfying L ε u = 0. Then for any B 3R (x) ⊂ Ω it holds that where the symbol ⨏ denotes the average integral, and the constant C p in addition to above depends also on p.
In the non-homogeneous case L ε u = f we have the following generalization of Harnack's inequality.
Let Ω be a domain and let u ∈ C 2 (Ω) be a non-negative function satisfying L ε u = f . Then for any B 3R (x) ⊂ Ω it holds that The Harnack inequality in Lemma 2.7 holds also in the case of the punctured ball.
Proof. By translating the coordinates we may assume that We obtain the claim by summing over i = 1, . . . , N .
Proof. Again we may assume that x = 0. Using Lemma 2.8 and Harnack's inequality for h yields Now, using Harnack's inequality for h again, we obtain for a constant C as in the statement. This proves the claim.

Harnack's inequality in Lemma 2.7 implies Hölder continuity for solutions of L ε u = 0.
Lemma 2.11. Let u ∈ C 2 (B 3R (x)) be a function such that for any constant c, for which v = u + c is non-negative, the function v satisfies Harnack's inequality with constant C 0 , independent of c. Then there exists C = C(C 0 ) > 1 and α = α(C 0 ) ∈ (0, 1) such that, for all ρ ≤ R, it holds that In particular, if u, h ∈ C 2 (Ω) are non-negative functions such that L ε u = h and h satisfies Harnack's inequality with constant C 0 , then u + h satisfies the estimate above.
Proof. The proof follows verbatim from the classical proof of Moser, see [14,Theorem 8.22].

Technical lemmas
In this section we provide some preliminary results for the proofs of the main theorems. We recall that we assume that the potential F satisfies the structural assumptions from Section 1.1, and that from this moment on our constants are allowed to depend on the data, see paragraph after (1.16).
3.1. Rough estimates for potentials. In this subsection we provide estimates for the capacitary potential h A,B , when A and B are two disjoint closed sets. The first estimate is the so called renewal estimate of [8]. In order to trace dependencies of constants, we provide a proof. Proof. Again, without loss of generality, we may assume that x = 0. Since Now by Green's second identity (2.3) in Ω ∖ (A ∪ B ∪ B r ) and (3.1) we see that, for z ∈ Ω, where n is the inward unit normal of B r . First note that by (2.3) we can identify the equilibrium measure as µ B∪Br ,A = −ε∇h B∪Br ,A ⋅ ndσ ε = ε∇h A,B∪Br ⋅ ndσ ε .
Using that h A,B∪Br = 1 − h B∪Br ,A , together with the above and (3.2), we get for z ∈ B r (since h A,B∪Br (z) = 0) that First note that µ Br∪B,A ∂Br is an admissible measure for cap(B r , A; Ω), which follows from the fact that by the comparison principle, the potentials for ordered measures are ordered and the support of µ Br∪B,A ∂Br is in B r . To bound h A,B from above, note that by the balayage representation of capacity (see Lemma 2.4) and the above, we obtain  The result below is a version of the rough capacity bound of [8], but we give a simplified proof. We will later use a similar argument in the proof of Theorem 1.
Then there exists constants q 1 , q 2 ∈ R and C > 1 such that Proof. We assume without loss of generality that F (B ρ ; D) = 0, since the quantities can always be scaled back. Consider γ ∈ C(B ρ (x), D; B R ) (i.e. a curve connecting B ρ (x) and D inside Ω) such that sup t F (γ(t)) ≤ Cε and let u(z) = h D,Bρ(x) (z). We first note by Lemma 2.4 that Fix an n − 1 dimensional disk D ρ of radius ρ. Then by Cauchy-Schwarz

By the fundamental theorem of calculus and Cauchy-Schwarz, we have for a fixed point z ∈ D ρ that From the above we get Now since F is Lipschitz in B R and F (B ρ ; D) = 0, we know that there exists a constant C(γ) such that, for z ∈ D ρ and ρ < 2ε, In the above the constant C depends on the length of γ, which can be assumed to be bounded. To see this, take an ε-neighborhood E ε of γ and consider a set of balls with E ε ⊂ ⋃ i B ε (y i ) ⊂ E C 1 ε (for some large C 1 ); the maximal number of such balls needed is C_1 R^n / ε^n. If we construct a piecewise linear curve γ ε connecting the centers of the balls in the covering, this curve will be inside E C 1 ε and its length will be bounded by 2 C_1 R^n / ε^{n−1}. This newly constructed curve can be mollified to obtain a smooth curve without increasing the length by more than a constant factor. From the above and the Lipschitz continuity of F it is clear that sup t F (γ ε (t)) ≤ Cε, and as such we can replace γ with γ ε in the above and get from (3.6) that there is a constant C > 1 depending only on the data such that
\[ \int_0^1 |\gamma'(t)|\, e^{F(\gamma(t)+z)/\varepsilon}\,dt \le C \varepsilon^{1-n}. \]

This implies that for a new constant C we have
which completes the proof of the lower bound after rescaling our potential F . To prove the upper bound we have two possible cases: In the case when F (x; D) = F (x) we can take a cutoff function χ Bρ(x) ≤ φ ≤ χ B 2ρ (x) where ∇φ ≤ C ρ as a competitor in the variational formulation of capacity (2.5). Then In the case where F (x; D) > F (x), consider the setD = {z ∈ B R ∶ F (z) ≤ F (x; D)} and letD 1 be the component that intersects D. We setD = (D 1 ∪ D) ∖ B 4ρ (x). By the Lipschitz continuity, we know that infD F > −Cρ.
We take χD +Bρ ≤ φ ≤ χD +B 2ρ , where ∇φ ≤ C ρ, and get Again, the upper bound follows from rescaling the potential F as in the case of the lower bound. This completes the whole proof.
, there exist constants q and C such that Proof. Let L ∶= ‖∇F‖_{L^∞(B_R)}. Combining Lemmas 3.1 and 3.2 with R = ε and r = min{ε/L, ε} yields the result.

Lemma 3.5.
Let Ω be a smooth domain and let x a , x b ∈ Ω ⊂ B R be two local minimum points of F . Fix 0 < δ < δ 1 and assume that Then there exists an ε 0 ∈ (0, 1) and a constant C = C > 1 such that, for any 0 ≤ ε ≤ ε 0 for which B 3ε (x a ), B 3ε (x b ) ⊂ U −δ 3 , the following holds: If U i is a component of U −δ 3 , then Proof. Consider any component U i of U −δ 3 . We note that we can take ε small enough depending on the Lipschitz constant of F in B R and δ such that there exists a Lipschitz domain D i satisfying For simplicity, denote u ∶= h Bε(xa),Bε(x b ) . Since D i is Lipschitz we may use the Poincaré inequality to get Using the definition of U −δ 4 and Lemma 3.2, we get for some constant q 1 ∈ R. Now, for any x 0 ∈ U i we have by Lemma 2.7 that Since x 0 was an arbitrary point in U i we conclude that there exists ε 0 ∈ (0, 1) depending only on the data such that if ε < ε 0 , the claim holds.
We conclude this subsection with an estimate relating the value of the potential of the equilibrium potential to the ratio of the L 1 norm of the equilibrium potential and the capacity.
Proof. From Lemma 2.6 we get We can estimate the left-hand side as We want to estimate the oscillation of w A,Ω , which we do by considering Now, the oscillation of w A,Ω + h A,Ω c and h A,Ω c can be estimated by Lemma 2.11 for ρ ≤ √ε / C. That is, We apply Lemma 2.7 to replace the suprema on the right-hand side with the value at x, as both w A,Ω + h A,Ω c and h A,Ω c satisfy the Harnack inequality (see Lemma 2.10). That is, It is easily seen that the above can be extended to ρ ≤ √ε by applying Lemma 2.10 again and by enlarging the constant C. The proof is completed by using (2.6) and collecting the estimates above.

3.2.
Laplace asymptotics for log-concave functions. The assumptions (1.4) and (1.6) ensure that near critical points the potential F is well approximated by convex functions. Therefore we will need basic estimates for log-concave functions, which rather surprisingly we did not find in the literature.
Lemma 3.7. Assume G ∶ R n → R is a convex function which has a proper minimum at the origin and G(0) = 0. Then there exists a constant C = C(n) > 1 such that with η as in (1.15).
Proof. By approximation we may assume that G is smooth. The lower bound in (3.8) follows immediately from To prove the upper bound in (3.8) we first show that, for all t > 0, it holds
\[ |\{G < 2t\}| \le 2^n\, |\{G < t\}|. \tag{3.10} \]
In order to prove (3.10) it is enough to consider only the case t = 1 (the general case follows by considering Ĝ = G/t). Denote E 1 = {G < 1} and E 2 = {G < 2}. Hence our goal is to show |E 2 | ≤ 2^n |E 1 |. Fix x̂ ∈ ∂E 1 and define g(t) = G(t x̂) for t ≥ 0. By our assumptions, g(t) is a smooth convex function satisfying g(0) = 0 and g(1) = 1. As such, both g and g ′ are increasing functions, from which we can conclude that g ′ (1) ≥ 1. Now, by the fundamental theorem of calculus, It remains to prove (3.9). Fix Λ > 1. Then, for every x ∈ {G ≥ Λε}, it holds Therefore we have, by (3.8), (3.11) and (3.12), and the inequality (3.9) follows by using (1.15).
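The doubling estimate (3.10) can be checked numerically on a simple degenerate example. The sketch below uses the convex function G(x, y) = |x| + y⁴ in dimension n = 2, which is flat to high order in the y-direction, and approximates areas of sublevel sets by counting midpoints of a uniform grid. It also checks the elementary bound e^{−1} |{G < ε}| ≤ ∫ e^{−G/ε} dx, which holds because e^{−G/ε} ≥ e^{−1} on {G < ε}. The choice of G, the grid parameters, and the function names are assumptions made for this illustration.

```python
import math

def grid_midpoints(half_width, n):
    h = 2.0 * half_width / n
    return [-half_width + (i + 0.5) * h for i in range(n)], h

def sublevel_area(G, t, half_width=2.0, n=400):
    # Midpoint-rule estimate of the area of {G < t} inside the square.
    pts, h = grid_midpoints(half_width, n)
    count = sum(1 for x in pts for y in pts if G(x, y) < t)
    return count * h * h

def laplace_integral(G, eps, half_width=2.0, n=400):
    # Midpoint-rule estimate of the integral of exp(-G / eps) over the square.
    pts, h = grid_midpoints(half_width, n)
    return sum(math.exp(-G(x, y) / eps) for x in pts for y in pts) * h * h

G = lambda x, y: abs(x) + y**4    # convex, proper minimum at the origin, G(0) = 0

t = 0.5
area_t = sublevel_area(G, t)
area_2t = sublevel_area(G, 2.0 * t)
integral = laplace_integral(G, eps=t)
print(area_t, area_2t, integral)
```

For this G the ratio of the two areas stays well below 2² = 4, and the Laplace-type integral is comparable to the area of the sublevel set {G < ε}, which is the kind of replacement for the Gaussian integral used in the non-degenerate theory.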
Lemma 3.8. Assume G ∶ R n → R is a function which has a proper global minimum at the origin and G(0) = 0. Furthermore, assume there is a constant C 0 such that, for all a > 0 and ε > 0, it holds that G>a e −G ε dx < C 0 e −a ε . (3.14) If there is a level ε 0 > 0 such that G is convex on the component of {G(x) < ε 0 } that contains 0, then there is an ε 1 (n, {G < ε 0 2} ) < ε 0 and a constant C = C(C 0 , n) > 1 such that, for all ε < ε 1 , it holds that Proof. Since G is convex in the level set {G < ε 0 }, we know that the level set {G ≤ ε 0 2} is convex and as such we can extend the function G outside that level set to a globally convex function. This allows us to apply Lemma 3.7 and obtain 1 Now, split the integral as From (3.15) it follows that it suffices to bound the second integral on the right hand side. Using (3.14) for a = ε 0 2 we get
We conclude this section with the following technical lemma which is useful when we study the potential near critical points. Lemma 3.9. Assume G ∶ R n → R is a convex function which has a proper minimum at the origin and (1.4) and (1.6). Then for all δ ≤ δ 0 , we have Proof. Denote Λ ε = ε 1 ε with ε 1 as in (1.13). From (1.14) we know that Λ ε → ∞ as ε → 0. Now, by (3.9) in Lemma 3.7 and (1.14), we get The lower bound follows immediately from this. In order to prove the upper bound, note that ω(s) ≤ s 2 for all s ≤ δ 0 by assumption. Therefore we can repeat the argument in (3.13) to get which together with (3.16) yields the upper bound.

Proofs of Theorem 1 and Theorem 2
In this section we prove the capacity estimate in Theorem 1 and exit time estimate in Theorem 2. Before we begin, we would like to remind the reader that, as in Section 3, we will assume that F satisfies our structural assumptions and that all constants depend on the data, see the paragraph after (1.16).
We first study the geometric quantities d ε (A, B; Ω) and V ε (A, B; Ω) defined in (1.8) and (1.9) and give a more explicit, but less geometric, characterization. The characterization for the geodesic distance d ε (A, B; Ω) turns out to be much easier than for the separating surface V ε (A, B; Ω) and therefore we prove it first. Proposition 4.1. Assume that x a and x b are local minimum points of F , let U a and U b be the islands, i.e., the components of the set U −δ 3 , containing B ε (x a ) and B ε (x b ) respectively. Assume that z is a saddle point in Z xa,x b , such that the bridge O z,δ connects U a and U b , and denote Ω = U a ∪ U b ∪ O z,δ . Then it holds for g z given in (1.6) that Proof. Begin by denoting g = g z and let us first prove the lower bound, i.e., To this aim we choose a smooth curve γ ∈ C(B ε (x a ), B ε (x b ); Ω) which, by assumptions, intersects the bridge O z,δ . We may choose the coordinates in R n such that z = 0 and Moreover, by changing the potential from F to F −F (z) we may assume that Let us fix s ∈ R such that g(s) < δ 10 and denote Γ s = {s} × {G < δ} ⊂ O δ . By the assumption (1.6) we have In particular, since U a , U b ⊂ {F < −δ 3}, then the surface Γ s does not intersect U a or U b . Thus we conclude that every γ ∈ C(B ε (x a ), B ε (x b ); Ω) intersects Γ s , i.e., Γ s ∈ S(B ε (x a ), B ε (x b ); Ω). Let us denote the projection to the x 1 -axis by π 1 ∶ R n → R, i.e. π 1 (x) = x 1 . From the previous discussion we conclude that s ∈ π 1 γ([0, 1]) ∩ O δ . This holds for every s ∈ {g < δ 10}, and therefore Now the assumption (1.6) implies that, in the set O δ , it holds that Then for γ 1 = π 1 (γ) we have by (4.2) and Lemma 3.9 that proving (4.1).
To prove the upper bound, i.e.,

with $\eta$ as in (1.16), we denote by $c_- < 0 < c_+$ the numbers such that $g(c_-) = g(c_+) = \delta$. We first connect the points $x_1 = (c_-, 0)$ and $x_2 = (c_+, 0)$ by a segment $\gamma_0(t) = t x_1 + (1 - t) x_2$. Then it holds by the assumption (1.6) and by Lemma 3.9 that

We then connect $x_a$ to $x_1$ and $x_2$ to $x_b$ with smooth curves $\gamma_1, \gamma_2 \subset \{x \in \Omega : F(x) < -\delta/3\}$. Since $g(t) \le C|t|$, we have $|\{g < \varepsilon\}| \ge c\,\varepsilon$. Therefore it holds by Lemma 3.7 that

The constant in the last expression depends on the length of $\gamma_i$. We can use an argument similar to the one in the proof of Lemma 3.2 to bound the length of the curve. This time, however, we consider coverings with balls of size comparable to $\delta$: since we are in the sublevel set $\{F < -\delta/3\}$, we have some room to replace our curve with another curve whose length depends only on $\delta$ and $R$, while still retaining the same upper bound as above.
The upper bound now follows by joining the paths $\gamma_1$, $\gamma_0$ and $\gamma_2$, thus constructing a competitor for the geodesic length.
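To get a feeling for the one-dimensional quantities that enter such estimates, the following sketch evaluates integrals of the form $\int e^{-g(s)/\varepsilon}\,ds$ numerically. It is only an illustration under stated assumptions: the profiles $g(s) = s^2/2$ (non-degenerate) and $g(s) = s^4$ (degenerate), the interval $[-1, 1]$, and the helper name `bridge_integral` are hypothetical choices and are not taken from the paper. The closed forms $\sqrt{2\pi\varepsilon}$ and $2\Gamma(5/4)\varepsilon^{1/4}$ are the standard values of these integrals over the whole line.

```python
import math
import numpy as np

def bridge_integral(g, eps, half_width=1.0, n=200001):
    """Trapezoid approximation of the integral of exp(-g(s)/eps) over [-half_width, half_width].

    g, half_width and n are illustrative assumptions, not quantities from the paper.
    """
    s = np.linspace(-half_width, half_width, n)
    vals = np.exp(-g(s) / eps)
    ds = s[1] - s[0]
    return float(ds * (vals.sum() - 0.5 * (vals[0] + vals[-1])))

eps = 1e-2

# Non-degenerate (quadratic) profile: the integral scales like eps**(1/2).
quad = bridge_integral(lambda s: 0.5 * s**2, eps)
quad_exact = math.sqrt(2.0 * math.pi * eps)

# Degenerate (quartic) profile: the integral scales like eps**(1/4).
quart = bridge_integral(lambda s: s**4, eps)
quart_exact = 2.0 * math.gamma(1.25) * eps**0.25

print(quad, quad_exact)
print(quart, quart_exact)
```

The different powers of $\varepsilon$ in the two cases show why a flat (degenerate) direction changes the prefactor compared with the classical quadratic setting.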
We next need a result similar to Proposition 4.1, but for the separating surface. This turns out to be trickier than the previous result for paths.
Proof. Denote $G = G_z$ for short. As in the proof of Proposition 4.1 we may assume that $z = 0$, $F(0) = 0$ and that

Let us begin by proving the upper bound. In the proof of Proposition 4.1 we already observed that the surface $\Gamma_0 = \{0\} \times \{G < \delta\}$ belongs to the family of separating surfaces, i.e., $\Gamma_0 \in \mathcal{S}(B_\varepsilon(x_a), B_\varepsilon(x_b); \Omega)$. Therefore the assumption (1.6) and Lemma 3.9 together with the definition of $V_\varepsilon$ imply

The upper bound follows directly from this. Moreover, by Lemma 3.7 it holds that

(4.4)

In order to prove the lower bound we fix a small $t > 0$ and choose a smooth

Then $S$ divides the domain $\Omega$ into two different components, and we denote the component containing $x_a$ by $\hat U_a$. Note that then $\partial \hat U_a \cap \Omega \subset S$.

Figure 3. The bridge $O_\delta$ connects the sets $U_a$ and $U_b$. The smaller bridge $\hat O$ has its lateral boundaries inside $U_a \cup U_b$.

Denote $\rho = \varepsilon^2$. We use an idea from [16] and, instead of studying the set $\hat U_a$, we study the density

which can be written as a convolution, $v_\rho = \frac{1}{|B_\rho|}\,(\chi_{\hat U_a} * \chi_{B_\rho})$. To see why studying $v_\rho$ is relevant, we need some setup that we present next. We choose a subset $\hat O$ of the bridge $O_\delta$ as

see Fig. 3, and denote its lateral boundaries by $\Gamma_-$ and $\Gamma_+$, i.e.,

Moreover, by relabeling we may assume that $\Gamma_+ \subset U_a$ and $\Gamma_- \subset U_b$. Furthermore, by the Lipschitz continuity of $F$ we have $|F(x) - F(y)| \le c\varepsilon^2$ for all $y \in B_\rho(x)$. Note also that for all $x \in \hat O$ and $y \in B_\rho(x)$ it holds $x - y \in \Omega$.
We will now relate $v_\rho$ to the surface integral of $S$ as follows. Recall that the set $\hat U_a$ has smooth boundary in $\Omega$ and thus its characteristic function is a BV-function. In particular, the derivative $\nabla \chi_{\hat U_a}$ is a Radon measure in $\Omega$ and

Using the definition of $v_\rho$ and the Lipschitz continuity of $F$ inside $B_\rho(x)$, we may thus estimate

Putting together (4.5) and (4.6) we see that it is enough to establish a lower bound on the integral of $\nabla v_\rho$ in $\hat O$. In order to achieve this, we first claim that for all $x$ such that $B_\rho(x) \subset \Omega$ we have, when $\varepsilon$ is small,

(4.7)

We first complete the proof of the lower bound assuming (4.7), and then prove (4.7). Assume now (4.7). Then we can use the fundamental theorem of calculus to get that for all

Now, arguing as in (4.3) we conclude that

Multiplying and dividing by $e^{-F(x)/\varepsilon}$ inside the integral in (4.8) and using (4.9) we get

Integrating over $x' \in \{G < \delta/100\}$ we obtain

The lower bound on the integral on the right-hand side follows by Lemma 3.9, i.e., we have

Now, assuming (4.7), we may use (4.5), (4.6) and (4.10) to get the lower bound from

as $t$ is arbitrarily small. Thus we obtain the lower bound, and hence in order to complete the proof it remains to prove (4.7). For this, we fix $x \in U_a \cup U_b$ such that $B_\rho(x) \subset \Omega$. By the relative isoperimetric inequality (also called Dido's problem, see for instance [2, Theorem 3.40] or [18]) and by $\rho = \varepsilon^2$ it holds

On the other hand, since $x \in \{F < -\delta/3\}$ and thus $B_\rho(x) \subset \{F < -\delta/4\}$, we have by (4.4) that

By combining the two inequalities above we obtain (4.7), which completes the whole proof.
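The averaged density $v_\rho$ used in the proof above can be pictured in one dimension, where it is simply the fraction of the window $(x - \rho, x + \rho)$ that lies inside the set. The sketch below is a hedged illustration, not the construction of the paper: the set is taken to be the half-line $\{x < 0\}$, the grid, the value $\rho = 0.1$ and the helper name `mollified_indicator` are assumptions made for the example. It checks that the averaged indicator is the expected linear ramp and that its total variation equals 1, which is the "perimeter" of a half-line (a single boundary point).

```python
import numpy as np

def mollified_indicator(chi, dx, rho):
    """Average chi over the window (x - rho, x + rho) at every grid point.

    This is a discrete 1D analogue of (1/|B_rho|) * (chi * chi_{B_rho}).
    """
    k = int(round(rho / dx))                   # half-width of the window in grid cells
    kernel = np.ones(2 * k + 1) / (2 * k + 1)  # normalized indicator of the window
    padded = np.pad(chi, k, mode="edge")       # extend by constants beyond the grid ends
    return np.convolve(padded, kernel, mode="valid")

# Illustrative setup (assumed): indicator of the half-line {x < 0} on [-1, 1].
x = np.linspace(-1.0, 1.0, 2001)
dx = x[1] - x[0]
rho = 0.1
chi = (x < 0).astype(float)

v = mollified_indicator(chi, dx, rho)

# Continuous closed form: fraction of (x - rho, x + rho) lying in {x < 0}.
closed_form = np.clip((rho - x) / (2.0 * rho), 0.0, 1.0)
max_error = float(np.max(np.abs(v - closed_form)))

# Averaging does not increase the total variation; here it stays equal to 1.
total_variation = float(np.sum(np.abs(np.diff(v))))
print(max_error, total_variation)
```

The point of passing to the averaged function is that it is Lipschitz with gradient of size at most of order $1/\rho$, while its total variation is still controlled by that of the original characteristic function.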
Proof of Theorem 1. We consider the parallel case and the series case separately.
Parallel case: Assume that the saddle points in $Z_{x_a,x_b} = \{z_1, \ldots, z_N\}$ are parallel, see Fig. 2. Let us fix a saddle point $z_i \in Z_{x_a,x_b}$ and recall the definition of the bridge $O_{z_i,\delta}$ in (1.7). As before, by considering $F - F(z_i)$ instead of $F$, we may assume

We also recall the notation

Let us choose a subset $\hat O$ of the bridge $O_\delta$ as in the proof of Proposition 4.2 (see Fig. 3) and denote its lateral boundaries by $\Gamma_- \subset \{x_1 < 0\}$ and $\Gamma_+ \subset \{x_1 > 0\}$, i.e.,

Then by using (1.6) and arguing as in the proof of Proposition 4.2 we deduce that $\Gamma_-, \Gamma_+ \subset \{F < F(x_a; x_b) - \delta/3\}$, and we may assume $\Gamma_+ \subset U_a$ and $\Gamma_- \subset U_b$. Therefore, by (4.11), we have that $h_{A,B} \le C\varepsilon$ on $\Gamma_-$ and $h_{A,B} \ge 1 - C\varepsilon$ on $\Gamma_+$. Now, by the fundamental theorem of calculus and the Cauchy-Schwarz inequality, it holds that

(4.12)

Let us next estimate the last term above. By assumption (1.6) we have, for $x \in \hat O$,

Therefore, by Lemma 3.9 and Proposition 4.1, we can estimate

We combine the inequalities (4.12) and (4.13), leading to (for another constant)

for all $x' \in \{G < \delta/100\}$. By integrating over $x' \in \{G < \delta/100\}$ we have, by Fubini's theorem, Lemma 3.9 and Proposition 4.2, that

(4.14)

Therefore, by repeating the argument for every saddle $z_i \in Z_{x_a,x_b}$ and using the fact that the bridges $O_{z_i,\delta}$ are disjoint, we obtain after scaling back the potential

This yields the lower bound when the saddle points are parallel.
For the upper bound, we only give a sketch of the argument, as it is fairly straightforward. The idea is to construct a competitor $h$ in the variational characterization of the capacity, see (2.5). Let us first define $h$ in the set

Since the saddle points $Z_{x_a,x_b} = \{z_1, \ldots, z_N\}$ are parallel, it follows that the points $x_a$ and $x_b$ lie in different components of the set $\tilde U$

where $O_{z_i,\delta}$ is defined in (1.8). Denote the components of $\tilde U$ containing $x_a$ and $x_b$ by $\tilde U_a$ and $\tilde U_b$, respectively. We first define $h = 1$ in $\tilde U_a$ and $h = 0$ in $\tilde U_b$.

Let us next fix a saddle point $z_i \in Z_{x_a,x_b}$. As before, we may again assume that

Moreover, we may assume that

Let $c_- < 0 < c_+$ be numbers such that $g(c_-) = g(c_+) = \delta/100$. We define $h(x) = \varphi(x_1)$ in $O_\delta$ such that the function $\varphi : [c_-, c_+] \to \mathbb{R}$ is a solution of the ordinary differential equation

with boundary values $\varphi(c_-) = 0$ and $\varphi(c_+) = 1$. We extend $\varphi$ to $\mathbb{R}$ by setting $\varphi(s) = 0$ for $s \le c_-$ and $\varphi(s) = 1$ for $s \ge c_+$. It follows that for the function $h$ we have, by construction, Lemma 3.9, and an argument similar to the one leading to (4.12), that

By repeating the construction for every saddle point $z_i \in Z_{x_a,x_b}$, we obtain a function which is defined in $U_{\delta/3}$. We denote this function by $h : U_{\delta/3} \to \mathbb{R}$.
Note that for this $h$ the estimate (4.14) is optimal. Moreover, $h$ is Lipschitz continuous. We extend $h$ to $\mathbb{R}^n$ without increasing the Lipschitz constant $L$, e.g., by defining

This leads to the upper bound and completes the proof of the parallel case; we leave the final details of the upper bound to the reader.
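The one-dimensional profile $\varphi$ in the competitor can be illustrated numerically. The sketch below rests on an assumption that is not confirmed by the text reproduced here: it takes the ordinary differential equation to be the Euler-Lagrange equation $(w\varphi')' = 0$ of the weighted energy $\int \varphi'(s)^2 w(s)\,ds$ with weight $w = e^{g/\varepsilon}$. Under that assumption the minimizer satisfies $\varphi' \propto 1/w$ and the minimal energy is $1/\int (1/w)$. The profile $g(s) = s^2/2$, the endpoints $\pm 0.5$, the value of $\varepsilon$ and all function names are hypothetical.

```python
import numpy as np

def optimal_profile(g, eps, c_minus, c_plus, n=20001):
    """Solve (w * phi')' = 0 with phi(c_minus) = 0, phi(c_plus) = 1, where w = exp(g/eps).

    Returns the grid, the profile phi and the predicted minimal energy 1 / integral(1/w).
    """
    s = np.linspace(c_minus, c_plus, n)
    inv_w = np.exp(-g(s) / eps)
    # Cell-wise trapezoid contributions of the integral of 1/w.
    cells = 0.5 * (inv_w[1:] + inv_w[:-1]) * np.diff(s)
    cumulative = np.concatenate(([0.0], np.cumsum(cells)))
    total = cumulative[-1]
    return s, cumulative / total, 1.0 / total

def weighted_energy(phi, s, g, eps):
    """Discrete version of the integral of phi'(s)^2 * exp(g(s)/eps) ds."""
    dphi = np.diff(phi) / np.diff(s)
    mid = 0.5 * (s[1:] + s[:-1])
    return float(np.sum(dphi**2 * np.exp(g(mid) / eps) * np.diff(s)))

# Illustrative choices (assumptions, not values from the paper).
g = lambda s: 0.5 * s**2
eps, c_minus, c_plus = 0.05, -0.5, 0.5

s, phi_opt, predicted_min = optimal_profile(g, eps, c_minus, c_plus)
phi_lin = (s - c_minus) / (c_plus - c_minus)   # naive linear competitor

energy_opt = weighted_energy(phi_opt, s, g, eps)
energy_lin = weighted_energy(phi_lin, s, g, eps)
print(energy_opt, predicted_min, energy_lin)
```

In this model the optimal profile concentrates its slope where the weight is smallest, and the Cauchy-Schwarz inequality shows that no other profile with the same boundary values can have lower energy.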
Series case: Assume that the saddle points $Z_{x_a,x_b} = \{z_1, \ldots, z_N\}$ are in series, see Fig. 2. We use the ordering as in (1.11) and denote the points $x_i$ as in (1.12). We also fix the islands $U_{x_{i-1}}$ and $U_{x_i}$ (components of $\{F < F(x_a; x_b) - \delta/3\}$), which are connected by the bridge $O_{z_i,\delta}$. Again we may assume that $z_i = 0$, $F(0) = 0$ and that

By Lemma 3.5 we have $\operatorname{osc}_{U_{x_{i-1}}}(h_{A,B}) + \operatorname{osc}_{U_{x_i}}(h_{A,B}) \le C\varepsilon$. Therefore there are numbers $c_{i-1}, c_i$ such that

Then, using the fundamental theorem of calculus as in (4.12), we obtain

Moreover, arguing as in (4.13), we have

These together imply

By integrating over $x' \in \{G < \delta/100\}$ we have, by Fubini's theorem, Lemma 3.9, and Proposition 4.2, that

By repeating the argument for every saddle $z_i \in Z_{x_a,x_b}$ and using the fact that the sets $O_{z_i,\delta}$ are disjoint we obtain

Recall that the numbers $c_i$ are the approximate values of $h_{A,B}$ in the components $U_{x_i}$. Therefore we may choose them such that $c_0 = 1$ and $c_N = 0$. By denoting $y_i = c_{i-1} - c_i$ and $a_i = V_\varepsilon($

where we have the constraint $\sum_{i=1}^N y_i = 1$. By a standard optimization argument (using Lagrange multipliers) we get that under such a constraint it holds that

This yields the lower bound in the case when the saddle points are in series.
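The Lagrange multiplier step can be checked on a small example. The sketch below assumes that the quantity being minimized has the form $\sum_i a_i y_i^2$ with positive weights $a_i$ and the constraint $\sum_i y_i = 1$; this form is an assumption made for illustration, since the displayed expression is not available here. Setting the gradient of $\sum_i a_i y_i^2 - \lambda(\sum_i y_i - 1)$ to zero gives $y_i = \lambda/(2a_i)$, hence $y_i = a_i^{-1} / \sum_j a_j^{-1}$ and the minimal value $\big(\sum_j a_j^{-1}\big)^{-1}$, which is the familiar rule for conductances in series. The weights and the function name are hypothetical.

```python
import numpy as np

def series_minimum(a):
    """Minimize sum(a_i * y_i**2) subject to sum(y_i) = 1 for positive weights a_i.

    Returns the minimizer y_i = (1/a_i) / sum_j(1/a_j) and the minimal value.
    """
    a = np.asarray(a, dtype=float)
    inv = 1.0 / a
    y = inv / inv.sum()
    return y, float(np.sum(a * y**2))

# Illustrative weights (assumed), playing the role of the ratios a_i.
a = np.array([1.0, 2.0, 4.0])
y_star, value_star = series_minimum(a)
harmonic_value = 1.0 / float(np.sum(1.0 / a))

# Compare against random feasible perturbations that keep the sum equal to 1.
rng = np.random.default_rng(0)
perturbed_values = []
for _ in range(200):
    z = rng.normal(size=a.size)
    z -= z.mean()                      # perturbation with zero sum
    y = y_star + 0.1 * z
    perturbed_values.append(float(np.sum(a * y**2)))

print(y_star, value_star, harmonic_value, min(perturbed_values))
```

Because the objective is strictly convex, the stationary point found by the multiplier argument is the unique minimizer on the constraint set, which is what the random perturbations confirm.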
The upper bound, on the other hand, follows from an argument similar to that in the parallel case, and we leave the details to the reader. This completes the proof in the series case, and hence the whole proof.
Proof of Theorem 2. Let us first recall the notation related to Theorem 2. We assume that the local minima $x_i$ of $F$ are ordered such that $F(x_i) \le F(x_j)$ if $i \le j$, and that they are grouped into sets $G_i$ such that

The proof of Theorem 2 follows from the following lemma together with Lemma 3.6 and Theorem 1.

We prove Theorem 2 first, while the proof of Lemma 4.3 is given later on.
Proof of Theorem 2. Using Lemma 3.6 and choosing $\rho = \varepsilon$ we obtain, for $\varepsilon$ small enough and $x \in G^\varepsilon_{k+1}$, that

The ratio above can be estimated by using Lemma 4.3 and the monotonicity of the capacity. That is, the numerator can be bounded by Lemma 4.3, while for the capacity we have $\operatorname{cap}(G^\varepsilon_{k+1}, S^\varepsilon_k) \ge \operatorname{cap}(G^\varepsilon_{k+1}, G^\varepsilon_k) \ge \max_{x \in G_k,\, y \in G_{k+1}} \operatorname{cap}(B_\varepsilon(x), B_\varepsilon(y))$. The claim for the parallel and series cases now follows by assuming that the maximum is attained for a pair of minima $x_a \in G_k$, $x_b \in G_{k+1}$ and applying Theorem 1.
Remark 4.4. We note that in the general case the last inequality in (4.15) has the optimal dependence on $\varepsilon$, but the two sides may differ by a constant. Essentially, the inequality is sharp only in the case where only one saddle contributes to the total value of the capacity. Hence we have the sharp estimate when the saddle points are parallel or in series, but in general the situation might be more complicated than that. We have illustrated this in Fig. 4, where each gray dot is a saddle at the same height, and $A$, $B$ make up $G_k$. Then the precise value of $\operatorname{cap}(G^\varepsilon_{k+1}, A \cup B)$ is already non-trivial to calculate.
Proof of Lemma 4.3. First we will prove a localization estimate for exponential integrals. Consider a set $O$ with $0 \in O$ and a function $f$ such that $f(0) = l$ is a proper local minimum and $f$ is locally convex around $0$. Then there exists an $\varepsilon_0$ such that, for any $\varepsilon < \varepsilon_0$,

(4.16)

We will first prove (4.16) and then repeatedly apply it to prove Lemma 4.3.
In order to prove (4.16), we begin by rescaling such that $l = 0$. Then we extend $f$ outside $O$ as $+\infty$ and call this extended function $\tilde f$. We first prove $\int_{\{\tilde f > a\}} e^{-\tilde f(x)/\varepsilon}\,dx \le c\,e^{-a/\varepsilon}$ which, by the definition of $\tilde f$, is equivalent to

This now follows from Lemma 3.8 by using $|O| < \infty$ and observing that $\tilde f$ satisfies the assumptions of Lemma 3.8. Hence we obtain (4.16).
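The tail bound used in this step can be sanity-checked numerically in the simplest setting. The sketch below is only an illustration under assumptions: it takes $O = [-1, 1]$, $f(x) = x^2$ and the constant $c = |O| = 2$, none of which come from the paper, and the helper name `tail_integral` is hypothetical. It compares the integral of $e^{-f/\varepsilon}$ over $\{f > a\}$ with $|O|\,e^{-a/\varepsilon}$, which bounds it because the integrand is at most $e^{-a/\varepsilon}$ on that set.

```python
import numpy as np

def tail_integral(f, a, eps, lo=-1.0, hi=1.0, n=200001):
    """Riemann-sum approximation of the integral of exp(-f/eps) over {x in [lo, hi] : f(x) > a}."""
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    fx = f(x)
    mask = fx > a                       # restrict to the superlevel set {f > a}
    return float(np.sum(np.exp(-fx[mask] / eps)) * dx)

# Illustrative choices (assumptions): O = [-1, 1], f(x) = x**2, c = |O| = 2.
eps = 0.05
measure_O = 2.0
f = lambda x: x**2

results = []
for a in (0.1, 0.25, 0.5):
    tail = tail_integral(f, a, eps)
    bound = measure_O * float(np.exp(-a / eps))
    results.append((a, tail, bound))
    print(a, tail, bound)

# Fraction of the total mass lying outside {f <= 0.1}: small when eps is small.
full = tail_integral(f, -1.0, eps)      # f > -1 everywhere, so this is the full integral
tail_fraction = results[0][1] / full
print(tail_fraction)
```

The small tail fraction is the localization phenomenon: for small $\varepsilon$ almost all of the mass of $e^{-f/\varepsilon}$ sits near the minimum of $f$.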
Consider now the set

and let $U_i$ be the component of $U_{-\delta_2/3}$ containing $x_i$. We split

where the complement is understood with respect to the domain $\Omega$. By the assumptions (1.4) and (1.6) on $F$ it holds that $F(G_{k+1}; S_k) \ge F(G_{k+1}) + \tfrac{2}{3}\delta_2$, which shows that the first integral is negligible in the final estimate. For the second integral we further split

We will now consider all the different components $U_i$ depending on which minima they contain. We start with the components $U_i$ that do not intersect $S^\varepsilon_k \cup G^\varepsilon_{k+1}$. Then the values of $F$ at all local minima in $U_i$ are larger than $F(G_{k+1})$, and hence from (4.16) we get that there exists a constant $C$ such that

where the last inequality follows from (1.19). This shows that this term is also negligible. Consider next a component $U_i$ that intersects $G^\varepsilon_{k+1}$ but does not intersect $S^\varepsilon_k$. In this case, by (4.16) and Lemmas 3.5 and 3.7, we have

providing us with the leading term that contributes to the final estimate.

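Consider next a component $U_i$ such that $U_i \cap S^\varepsilon_k \ne \emptyset$. Since $U_i$ is a component of $U_{-\delta_2/3}$, it follows from $U_i \cap S_k \ne \emptyset$ that $F(y; S_k) \le F(y; G_{k+1}) - \delta_2/3 \le F(y; G_{k+1})$ in $U_i$. Therefore we have, by Lemma 3.3, in $U_i$ that $h_{G^\varepsilon_{k+1}, S^\varepsilon_k} \le C\varepsilon^q e^{-(F(y;G_{k+1}) - F(y;S_k))/\varepsilon}$. Hence, for $q \in \mathbb{R}$, we obtain

$$\int_{U_i} h_{G^\varepsilon_{k+1}, S^\varepsilon_k}(y)\, e^{-F(y)/\varepsilon}\,dy \le C \int_{U_i} \varepsilon^q e^{-(F(y;G_{k+1}) - F(y;S_k))/\varepsilon}\, e^{-F(y)/\varepsilon}\,dy.$$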
In order to compute the integral on the right-hand side we study the infimum of the function $f(y) = F(y; G_{k+1}) - F(y; S_k) + F(y)$. Clearly, the infimum is attained at an interior point of $U_i$, denoted by $x_i$. It follows that $x_i$ is necessarily a local minimum point of $F$. By the above considerations, we also have $F(y; S_k) < F(y; G_{k+1})$ for all $y \in U_i$, and thus we may deduce that $x_i \notin G_{k+1}$. If now $x_i \in S_k$, then $F(x_i) = F(x_i; S_k)$ and thus, by the definition of $f$ and by (4.17),

It remains to study the case where $x_i \in G_j$ for some $j \ge k + 2$. In this case we apply

where the last inequality follows from (1.19). Therefore we can conclude that, for $\delta_3 = \tfrac{2}{3}\delta_2$, it holds that

$$\int_{U_i} \varepsilon^q e^{-(F(y;G_{k+1}) - F(y;S_k))/\varepsilon}\, e^{-F(y)/\varepsilon}\,dy \le C \varepsilon^q e^{-\delta_3/\varepsilon}\, e^{-F(G_{k+1})/\varepsilon}.$$
Consequently, a component $U_i$ satisfying $U_i \cap S^\varepsilon_k \ne \emptyset$ does not contribute either. The proof is hence completed by (4.18) and by the fact that the integrals over the remaining components are negligible whenever $\varepsilon$ is small enough.