Metastability for the Ising Model on the Hypercube

We consider Glauber dynamics for the low-temperature, ferromagnetic Ising Model on the n-dimensional hypercube. We derive precise asymptotic results for the crossover time (the time it takes for the dynamics to go from the configuration with a “−1” at every vertex, to the configuration with a “+1” at each vertex) in the limit as the inverse temperature β → ∞.


Ising Model on the Hypercube
Put in simple terms, metastability is the phenomenon describing a stochastic process that is temporarily trapped in the neighbourhood of a metastable state, away from the stable state which corresponds to the thermodynamic equilibrium. Usually this trap comes in the form of a local minimum of an associated energy function, and over a short time scale the observed process appears to be in a quasi-equilibrium. Viewed over a longer time scale, the process manages (after many unsuccessful attempts) to overcome the energy-barrier that separates it from a global minimum, which is often unique and the only true equilibrium.
In the physical world, this phenomenon can be observed for example in magnetic hysteresis and in the condensation of an over-saturated vapour. In the context of statistical physics, the three main points of interest for metastability are the transition time from the metastable state to the stable state, the gate of critical configurations the process will visit in order to achieve the transition, and the tube of typical trajectories the system follows when making this transition. These have been studied intensively in the past decades, aided by the development of a number of powerful tools. The first of these tools is a method known as the pathwise approach, which is based on large deviation theory. It was initially introduced by Cassandro et al. [11] for the Curie-Weiss model and the contact process on Z, and later used by Neves and Schonmann [21] for the Ising Glauber model on Z^2. This approach was then generalized by Olivieri and Scoppola [22,23] and Catoni and Cerf [12] for reversible and non-reversible Markov chains. A particular strength of the pathwise approach is that it can be applied to study all three of the aforementioned points.
A second method, known as the potential-theoretic approach, was developed by Bovier et al. [6][7][8][9]. This method makes use of results from electrical network theory and their applications to reversible Markov chains, and was used by Bovier and Manzo [10], and Bovier et al. [5] to obtain sharper results for the Glauber and Kawasaki Ising model.
The pathwise and potential-theoretic approaches are the subjects of two monographs: the former is treated by Olivieri and Vares in [24], the latter by Bovier and den Hollander in [4].
A third method, known as the martingale approach, was developed more recently by Beltran and Landim in [1][2][3] and applied to the study of metastability for the Ising model in Z^2.
Since their development, these approaches have been used to study metastability for a variety of models. For instance, an analysis in Z^d for a nucleation-and-growth model was done by Dehghanpour and Schonmann [14], and for the d-dimensional Ising model by Cerf and Manzo [13]. Kotecky and Olivieri [18][19][20] also applied this method for the Glauber Ising model with isotropic, anisotropic and staggered interactions. More recently, a study of metastability for the Glauber Ising model on random graphs was done by Dommers [15] and Dommers et al. [16].
In this paper we use general theory from the potential-theoretic approach to derive results for the Ising model on an n-dimensional hypercube. This study is motivated by the distinct geometry of the hypercube. It is an expander graph, and should therefore give rise to very different dynamical behaviour (discussed in Sect. 1.4). At the same time, its strong symmetry makes it sufficiently tractable to fully exploit the potential-theoretic approach in obtaining sharp results.
The general theory behind the potential-theoretic approach shows that in the limit β → ∞ and under certain regularity conditions, the average crossover time (i.e. the time it takes for the process to go from the configuration ⊟, corresponding to a −1 spin at every vertex of the hypercube, to the configuration ⊞, corresponding to a +1 spin at every vertex of the hypercube) behaves like K* exp(βΓ*) for some K*, Γ* ∈ R_+. We show that for the hypercube the required conditions are met (subject to necessary constraints on the parameters), obtain exact solutions for K* and Γ*, and give a full description of the critical configurations seen by the process when making the transition from ⊟ to ⊞. In order to do this, we investigate specific geometric properties of the hypercube and of the induced graph of the Markov process (explained further in Sect. 1.1). We will show that in the case of the hypercube, Γ* is proportional to the volume of the cube (i.e. it is of order 2^n). This differs dramatically from the Ising model on a finite box in Z^d, where Γ* does not depend on the volume of the box (see [13] for instance).

The Ising Model on the Hypercube
We will denote the graph of the n-dimensional hypercube by Q_n = (V_n, E_n), where V_n = {0, 1}^n are its vertices and E_n = {(v, w) ∈ V_n × V_n : ‖v − w‖_1 = 1} its edges. Here for a vector v = (v_1, …, v_n), the norm ‖·‖_1 is defined by ‖v‖_1 = Σ_{i=1}^{n} |v_i|. If Q_r is an r-dimensional sub-cube of Q_n (a subgraph of size 2^r that is isomorphic to an r-dimensional hypercube, and hence all its vertices agree on n − r co-ordinates), we shall (by a minor abuse of notation) write "A ⊆ Q_r" to mean that A is a subset of the vertices in Q_r. By "Ising Model on the hypercube" we are thinking of the configuration space Ω = {+1, −1}^{V_n} together with an associated Gibbs measure on this space, defined in (1.2). This configuration space corresponds to the assignment to each vertex of exactly one of two spins (either +1 or −1). Hence an equivalent representation of Ω is the power set P(V_n) of V_n, where A ∈ P(V_n) is identified with the configuration that assigns (+1) to every vertex in A, and (−1) to every vertex in Ā (the complement of A). Therefore we will (by further abuse of notation) identify Ω with P(V_n) and refer to the elements of P(V_n) (and hence of Ω) as configurations, whenever there is no threat of ambiguity.
Two special configurations (subsets) deserve their own symbols: we will denote by ⊞ and ⊟ the configurations V_n and ∅ in Ω (equivalently, these are the two configurations with a (+1) / (−1) assigned to every vertex). The Hamiltonian function H : Ω → R associates an energy with each configuration A ∈ Ω according to
$$H(A) = -\frac{J}{2}\Big(|E(A,A)| + |E(\overline{A},\overline{A})| - |E(A,\overline{A})|\Big) - \frac{h}{2}\Big(|A| - |\overline{A}|\Big), \tag{1.1}$$
where for two subsets U, W ⊆ V_n, E(U, W) ⊆ E_n is the set of all unoriented edges with one endpoint in U and another in W. The parameters J > 0, h > 0 are fixed constants, known as the interaction and external field parameters, respectively. The Gibbs probability measure on Ω is given by
$$\mu_\beta(A) = \frac{1}{Z_n} e^{-\beta H(A)}, \qquad A \in \Omega, \tag{1.2}$$
with β ≥ 0 being the inverse temperature and Z_n the normalizing constant. Our interest is in the behaviour of the system when β → ∞, thus we may take J = 1, which simply corresponds to a rescaling of β and h. With J = 1, we will also assume throughout this paper that 0 < h < n, and some of our results will also require that h is not an integer. The implications of, and reasons for, these assumptions will be discussed in Sects. 1.4 and 5.
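The final ingredient will be to define the dynamics on Ω. To do this, let us first define
$$\mathcal{E}_n = \{(A, A') \in \Omega \times \Omega : |A \,\triangle\, A'| = 1\}, \tag{1.3}$$
where for two sets A, A′ ⊆ V_n, A △ A′ = (A \ A′) ∪ (A′ \ A) denotes their symmetric difference. We consider continuous-time Glauber dynamics, which is a reversible, continuous-time Markov process (ξ_t)_{t≥0} with (1.2) as its equilibrium measure, and is defined by the transition rates
$$c_\beta(\xi, \xi') = \begin{cases} e^{-\beta [H(\xi') - H(\xi)]_+} & \text{if } (\xi,\xi') \in \mathcal{E}_n, \\ 0 & \text{otherwise.} \end{cases} \tag{1.4}$$
From (1.3) it is clear that c_β defines single spin-flip dynamics, with transitions corresponding to a sign change at a single site.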
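To make the energy and the single spin-flip rates concrete, here is a minimal Python sketch. It encodes a configuration as the set of its +1 vertices, with each vertex an integer whose bits are its co-ordinates, and uses the energy relative to the all-minus configuration, |E(A, Ā)| − h|A| with J = 1. The Metropolis form of the rate in `flip_rate` is an assumption (chosen so that energy-lowering flips have rate 1), and all function names are ours.

```python
import math

def edge_boundary(A, n):
    """Number of hypercube edges with exactly one endpoint in A."""
    A = set(A)
    return sum(1 for v in A for i in range(n) if (v ^ (1 << i)) not in A)

def energy_gap(A, n, h):
    """H(A) - H(all minus) = |E(A, complement of A)| - h|A|, taking J = 1."""
    A = set(A)
    return edge_boundary(A, n) - h * len(A)

def flip_rate(A, B, n, h, beta):
    """Assumed Metropolis rate: exp(-beta * [H(B) - H(A)]_+) for single flips, 0 otherwise."""
    A, B = set(A), set(B)
    if len(A ^ B) != 1:  # configurations must differ at exactly one vertex
        return 0.0
    increase = energy_gap(B, n, h) - energy_gap(A, n, h)
    return math.exp(-beta * max(increase, 0.0))
```

For example, `energy_gap({0}, 3, 0.5)` evaluates the cost n − h of flipping a single spin on Q_3.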

Metastability
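In order to discuss and describe metastable behaviour, we will need to investigate certain geometric quantities. The first of these is the communication height between two configurations ξ, ξ′ ∈ Ω, defined by
$$\Phi(\xi, \xi') = \min_{\gamma : \xi \to \xi'} \max_{\sigma \in \gamma} H(\sigma),$$
where the minimum is taken over all paths γ : ξ → ξ′ on the graph (Ω, ℰ_n). The stability level of a configuration ξ ∈ Ω \ {⊞} is defined by
$$V_\xi = \min_{\zeta \in \Omega : H(\zeta) < H(\xi)} \Phi(\xi, \zeta) - H(\xi).$$
It is easy to see from the definition of H in (1.1) that the set of stable configurations,
$$\Omega_s = \Big\{\xi \in \Omega : H(\xi) = \min_{\zeta \in \Omega} H(\zeta)\Big\},$$
always reduces to Ω_s = {⊞}. The set of metastable configurations is defined by
$$\Omega_m = \Big\{\xi \in \Omega \setminus \Omega_s : V_\xi = \max_{\zeta \in \Omega \setminus \Omega_s} V_\zeta\Big\},$$
and identifying this set is generally not a trivial task. We will show that for the Ising model on Q_n, Ω_m = {⊟} whenever metastable behaviour occurs (see paragraph following Theorem 3). This justifies our next definition, namely the energy-barrier between the metastable and stable configurations,
$$\Gamma^{\star} = \Phi(\boxminus, \boxplus) - H(\boxminus). \tag{1.7}$$
Note from (1.1) that for any σ ∈ Ω (recall that we are taking J = 1), H(σ) − H(⊟) = |E(σ, σ̄)| − h|σ|, and hence
$$\Gamma^{\star} = \min_{\gamma : \boxminus \to \boxplus} \max_{\sigma \in \gamma} \Big(|E(\sigma, \overline{\sigma})| - h|\sigma|\Big). \tag{1.8}$$
We call paths γ : ⊟ → ⊞ that attain the minmax in (1.8) optimal paths. One further point of interest will be the critical set C* ⊆ Ω and the proto-critical set P* ⊆ Ω of configurations, defined as the unique, maximal subset C* × P* ⊆ Ω^2 that satisfies the conditions
(1) ∀ξ ∈ P* ∃ξ′ ∈ C* such that (ξ, ξ′) ∈ ℰ_n, and ∀ξ′ ∈ C* ∃ξ ∈ P* such that (ξ, ξ′) ∈ ℰ_n;
(2) ∀ξ ∈ P*, Φ(ξ, ⊟) < Φ(ξ, ⊞);
(3) ∀ξ′ ∈ C* ∃γ : ξ′ → ⊞ such that max_{ζ∈γ} H(ζ) − H(⊟) ≤ Γ* and γ ∩ {ζ ∈ Ω : Φ(ζ, ⊟) < Φ(ζ, ⊞)} = ∅. (1.9)
Uniqueness follows from maximality and the observation that if (C_1, P_1) and (C_2, P_2) both satisfy the above conditions, then so does (C_1 ∪ C_2, P_1 ∪ P_2). For any A ⊆ Ω, define
$$\tau_A = \inf\{t > 0 : \xi_t \in A, \ \exists\, 0 < s < t : \xi_s \neq \xi_0\}$$
to be the first hitting time of the set A ⊆ Ω once the starting configuration has been vacated.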
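For very small hypercubes the communication height can be computed by brute force, which gives a useful sanity check on the definitions. The sketch below is an illustration under stated assumptions, not the method of the paper: it encodes a configuration as a bitmask over the 2^n vertices, measures energies relative to the all-minus configuration (with J = 1), and runs a bottleneck variant of Dijkstra's algorithm over single spin-flip moves. The function names are ours.

```python
import heapq

def energy_gap(conf, n, h):
    """H(conf) - H(all minus) = |E(A, complement)| - h|A| for J = 1.
    conf is a bitmask over the 2^n vertices; bit v set means spin +1 at vertex v."""
    boundary, size = 0, 0
    for v in range(1 << n):
        if (conf >> v) & 1:
            size += 1
            for i in range(n):
                if not ((conf >> (v ^ (1 << i))) & 1):
                    boundary += 1
    return boundary - h * size

def communication_height(start, target, n, h):
    """Minimax energy over single spin-flip paths from start to target."""
    best = {start: energy_gap(start, n, h)}
    heap = [(best[start], start)]
    while heap:
        height, conf = heapq.heappop(heap)
        if conf == target:
            return height
        if height > best[conf]:
            continue  # stale heap entry
        for v in range(1 << n):
            nxt = conf ^ (1 << v)  # flip the spin at vertex v
            cand = max(height, energy_gap(nxt, n, h))
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt))
    raise ValueError("target not reachable")

n, h = 3, 0.5
all_minus, all_plus = 0, (1 << (1 << n)) - 1
barrier = communication_height(all_minus, all_plus, n, h) - energy_gap(all_minus, n, h)
print(barrier)
```

The state space has 2^(2^n) configurations, so this is only feasible for n ≤ 4 or so; the analytical results of the paper are what make larger n accessible.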
The following hypotheses are required in order to state the key theorems of the potential-theoretic approach.
Hypothesis (H1) is an essential hypothesis for all results given below. Indeed, the validity of (H1) will be verified in Theorem 3 where it is also shown that if (H1) is not satisfied, then the system does not display metastable behaviour. Hypothesis (H2) states that ∃k ∈ N such that every configuration ξ ∈ C* has exactly k neighbours in P*. This hypothesis is only necessary for the second result in Theorem 2. We will verify the validity of (H2) in Sect. 3, where we also derive a description of the set C* defined in (1.9); the set P* is described in Sect. 4. The potential-theoretic approach to metastability relates the crossover time (the first hitting time of the configuration ⊞ by a process starting at a metastable state) to the quantities defined above, via the following theorems.
In order for these results to give substantial quantitative insight, one has to verify that (H1) and (H2) hold, and establish what Γ*, K* and C* amount to. This is the basis of our results.

Results
Our first result assures the validity of hypotheses (H1) and (H2). Theorem 3 Suppose that 0 < h < n. Then the hypotheses (H1) and (H2) hold, and hence Theorem 1 and Theorem 2 apply.
It follows from the proof of Theorem 3 (given in Sect. 5) that the condition h < n is essential. Indeed, for h ≥ n, H has no local minima. That is, for every σ ∈ Ω, there exists a finite path σ = σ_0, σ_1, …, σ_m = ⊞ of single spin-flips along which H(σ_{i+1}) ≤ H(σ_i), and thus no metastable behaviour is observed in this system (from (1.4) it follows that c_β(σ_i, σ_{i+1}) = 1, and hence lim_{β→∞} E_σ[τ_⊞] < ∞).
Our second result gives a description of the set C*. Recall that an isomorphism on Q_n is a bijection ϕ : V_n → V_n such that (v, w) ∈ E_n if and only if (ϕ(v), ϕ(w)) ∈ E_n. For A ⊆ V_n, let
$$M(A) = \{\varphi(A) : \varphi \text{ is an isomorphism on } Q_n\}. \tag{1.11}$$
When h < n − 2, define the sub-sets {W_i}, 1 ≤ i ≤ δ = ⌈(n − h)/2⌉, of V_n in the following way: for i < δ, W_i is the set of vertices of an (n − ⌊h⌋ − 2i)-dim sub-cube of Q_n, W_δ consists of a single vertex, and for i > 1 the set W_i is adjacent to all of W_1, …, W_{i−1}. By this we mean that for every 1 ≤ j < i ≤ ⌈(n − h)/2⌉, W_j ∩ W_i = ∅ and ∀w ∈ W_i, ∃v ∈ W_j such that (w, v) ∈ E_n. These are the properties referred to as (1.12) and (1.13). See Figure 1.

Fig. 1 Schematic representation of the critical configuration. In this example, ⌈(n − h)/2⌉ = 4. Note that |W_3| = 4, which is the case if and only if n − ⌊h⌋ is even.

Theorem 4
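Suppose that h < n and h is not integer valued. If h < n − 2, then C* = M(W_1 ∪ … ∪ W_δ), where δ = ⌈(n − h)/2⌉ and the sets W_i are as in (1.12) and (1.13). If n − 2 ≤ h < n, then C* is the set of all singleton configurations, i.e. C* is the set of all configurations that have exactly one vertex with a +1 spin.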
Thus for h < n − 2, critical configurations take the shape of a series of shrinking, adjacent sub-cubes of Q_n (with the smallest being 1-dimensional if n − ⌊h⌋ is odd and 2-dimensional otherwise) together with a protuberance (in the form of a single vertex) which is adjacent to all the other sub-cubes.
Our third result determines the value of the energy-barrier in terms of the parameters of the system.

Theorem 5 Suppose that h < n. Then the energy-barrier Γ* is given by (1.14).
Our fourth result gives the prefactor K* under the assumption that h is not an integer.

Discussion
From the perspective of the potential-theoretic approach to metastability, the theorems in Sect. 1.3 give a complete description of the metastable behaviour of Ising spin-flip dynamics on the n-dimensional hypercube. The metastable regime is characterized by the inequality 0 < h < n. Within this regime, and for a fixed value of h, we see from (1.14) that Γ* grows proportionally to the volume of V_n. A look at equation (1.8) hints that this should indeed be the case. The hypercube Q_n is an expander graph (which can be easily concluded from Theorem 7), meaning that there is some ρ > 0 such that |E(A, Ā)| ≥ ρ|A| for every A ⊆ V_n with |A| ≤ |V_n|/2. Every path from ⊟ to ⊞ passes through a configuration of size |V_n|/2, so from (1.8) it is clear that for any such graph, Γ* ≥ (ρ − h)|V_n|/2, growing linearly (or faster) with the volume of the graph whenever h is sufficiently small.
Critical configurations take the shape of a series of adjacent cubes that are decreasing in size, with the final cube being a single vertex (a protuberance), similar to what is also observed for the Glauber Ising model on Z^2. A noteworthy distinction from the Z^d case is that on Q_n, adjacent cubes decrease in dimension arithmetically by 2, whereas in Z^d the critical configuration is a union of 'quasi-squares' that decrease in dimension arithmetically by 1 ([13,14] give a description of critical droplets in Z^d).
Due to the large size of C*, the prefactor K* decays in a super-polynomial way with respect to the volume of V_n. In contrast, for the Glauber Ising model on Z^2 (where the Ising model is set on an n × n torus, denoted by Λ_n), it is known that K* = c|Λ_n|^{−1} for some c > 0 (see Theorem 17.4 in [4]). Indeed, in Z^2 a critical configuration can be described in terms of a quasi-square (an ℓ_1 × ℓ_2 rectangle with |ℓ_1 − ℓ_2| ≤ 1). Hence in that case C* is obtained by taking all lateral translations of two quasi-squares (one is ℓ_1 × ℓ_2, and the other ℓ_2 × ℓ_1). Clearly the size of this set is proportional to the area |Λ_n|. The hypercube permits a much larger set of isomorphic translations, resulting in C* being a considerably bigger set.
From the proof of Theorem 5 and Lemma 5 it is evident that when h is an integer, optimal paths can contain a 'plateau' at the top. This has no implication on Γ*, but it does complicate the description of the set C*, and would require a separate (and somewhat more difficult) analysis to determine the prefactor K*.

Outline of the Paper
Our main focus will be on particular geometric properties of the hypercube that relate to the triplet (Γ*, C*, K*). In Sect. 2 we will first establish some known results related to isoperimetric inequalities on the hypercube, followed by a new result on this subject (Lemma 1). These results are applied in Sect. 3 to isolate local and global maxima of the energy function H given in (1.1) along an optimal path, and to prove Theorem 4, Theorem 5 and the validity of hypothesis (H2) in Theorem 3. In Sect. 4 we prove Theorem 6 by computing the prefactor K*. In Sect. 5 we give a proof of hypothesis (H1) in Theorem 3. Sect. 6 contains the proof of Lemma 1, which together with Theorem 7 gives a description of all sets that solve the isoperimetric problem in equation (2.2).

Isoperimetric Inequalities for the Hypercube
In this section we state an edge-isoperimetric problem and a result from [17] that gives a solution to this problem. We show in Lemma 1 that this solution is in fact the only one, up to M-equivalence (as defined in (1.11)). We use this to define an optimal path γ : ⊟ → ⊞ in Lemma 2.
The definitions (1.1) and (1.7) suggest that Γ* will be closely related to the following edge-isoperimetric problem: given a graph G = (V, E) and integer 1 ≤ k ≤ |V|, what is min{|E(A, Ā)| : A ⊆ V, |A| = k}? We will say a set A has minimal edge-boundary if it attains this minimum. Solutions to this problem are known for the graph Q_n (see [17]). Aside from determining the minimum above, we will also need to identify all subsets of V_n that have a minimal edge-boundary.
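Define the function q on the non-negative integers by letting q(i) be the number of digits equal to "1" in the binary expansion of i. The following gives a solution to the edge-isoperimetric problem for the hypercube.
Theorem 7 (see [17]) For 1 ≤ k ≤ 2^n, let
$$\gamma_k = \Big\{ v \in V_n : \sum_{i=1}^{n} v_i 2^{i-1} < k \Big\}, \tag{2.1}$$
and note that |γ_k| = k. Then
$$|E(\gamma_k, \overline{\gamma_k})| = \min\big\{|E(A, \overline{A})| : A \subseteq V_n, \ |A| = k\big\} = nk - 2\sum_{i=0}^{k-1} q(i). \tag{2.2}$$
For a set S of size k, we will say that |E(S, S̄)| is minimal if S has a minimal edge-boundary.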
r is a good set.
In other words, S is a good set if it 'decomposes' into a disjoint union of adjacent sub-cubes that decrease in size. It is shown in [17] that if S is a good set, then |E(S, S̄)| is minimal. Equivalently, every good set S makes |E(S, S)| maximal (i.e. for any U ⊆ V_n of size |S|, |E(U, U)| ≤ |E(S, S)|). It is easy to verify that (2.1) defines a good set for every k, and thus by symmetry, the set of all good sets of size k is given by M(γ_k). Hence every ξ ∈ M(γ_k) has minimal edge-boundary. See Figure 2. This information will be sufficient to compute Γ*. However, in order to determine the prefactor K* in Theorem 1, we will see that it is necessary to examine all sets of a particular volume that have minimal edge-boundary. The following lemma proves that the sets γ_k, 1 ≤ k ≤ 2^n, in Theorem 7 and their isomorphic translations are the only sets with minimal edge-boundary. To the best of the author's knowledge, this result has not been established in any previous work.

Fig. 2 From the definition of γ_k in (2.1) it follows that the binary decomposition of k completely describes the configuration γ_k. For example, k = 7 = 2^0 + 2^1 + 2^2 implies that γ_7 is a configuration consisting of a 2-dim, a 1-dim, and a 0-dim sub-cube, and there are edges between any two of these three sub-cubes.

Lemma 1 Let S ⊆ V_n be a subset of the hypercube. Then |E(S, S̄)| is minimal if and only if S ∈ M(γ_{|S|}). Equivalently, |E(S, S̄)| is minimal if and only if S is a good set.
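The isoperimetric statement can be checked exhaustively on small hypercubes. The following Python sketch is a numerical sanity check under stated assumptions, not part of the proof: it takes γ_k to be the k vertices with the smallest binary value, takes q(i) to be the number of ones in the binary expansion of i, and compares the edge-boundary of γ_k with both the classical closed form nk − 2 Σ_{i<k} q(i) and the brute-force minimum over all subsets of size k. The function names are ours.

```python
from itertools import combinations

def q(i):
    """Number of digits equal to 1 in the binary expansion of i."""
    return bin(i).count("1")

def edge_boundary(S, n):
    """Number of hypercube edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for v in S for i in range(n) if (v ^ (1 << i)) not in S)

def gamma(k):
    """Assumed form of gamma_k: the k vertices with smallest binary value."""
    return set(range(k))

def closed_form(k, n):
    """Closed form for the minimal edge-boundary of a k-subset of Q_n."""
    return n * k - 2 * sum(q(i) for i in range(k))

def brute_force_minimum(k, n):
    """Minimal edge-boundary over all subsets of size k (exponential cost)."""
    return min(edge_boundary(S, n) for S in combinations(range(1 << n), k))

def check(n):
    """True if gamma_k attains the brute-force minimum for every k."""
    return all(
        edge_boundary(gamma(k), n) == closed_form(k, n) == brute_force_minimum(k, n)
        for k in range(1, (1 << n) + 1)
    )
```

Calling `check(3)` enumerates all 255 non-empty subsets of Q_3; the cost grows as 2^(2^n), so n = 4 is about the practical limit.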
The proof of Lemma 1 is given in Sect. 6. Let γ_0 = ⊟, and note that the path γ : ⊟ → ⊞ given by
$$\gamma = (\gamma_0, \gamma_1, \ldots, \gamma_{2^n}), \tag{2.3}$$
with γ_k as defined in (2.1), is a Glauber path (i.e. a path of single spin-flips in Ω), since by definition the set γ_{k+1} = γ_k ∪ {w} where w = (w_1, …, w_n) ∈ V_n is the unique vertex that satisfies Σ_{i=1}^{n} w_i 2^{i−1} = k. By Theorem 7 we have the following immediate conclusion.

Critical Configurations and Computation of the Energy-Barrier
From Lemma 2 we know that the path γ in (2.3) is an optimal path. In this section we prove Theorem 5 by computing the maximum value H attains along this path. The proof yields the volume of the first configuration along γ that attains the value H(⊟) + Γ*, which we use to prove Theorem 4. We end the section with a brief argument justifying hypothesis (H2) in Theorem 3.
We begin with the following elementary result.

Lemma 3
For any 0 ≤ r ≤ n,
$$\sum_{i=0}^{2^r - 1} q(i) = r\, 2^{r-1}. \tag{3.1}$$
Proof The proof is by induction. Note that (3.1) is clearly true for r ∈ {0, 1}. Suppose that this also holds for all r ∈ {1, …, k}. Then
$$\sum_{i=0}^{2^{k+1} - 1} q(i) = \sum_{i=0}^{2^k - 1} q(i) + \sum_{i=0}^{2^k - 1} q(2^k + i) = 2\sum_{i=0}^{2^k - 1} q(i) + 2^k = (k+1)\, 2^k. \tag{3.2}$$
The second equality follows from the observation that for any 0 ≤ i < 2^k, the binary expansion of the number 2^k + i has exactly one more "1" than the binary expansion of the number i. Note that the right-most term in (3.2) is equivalent to (3.1) with r = k + 1. This completes the proof.
A different proof of Lemma 3 is also given in [17].
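The identity in Lemma 3 is easy to confirm numerically. The short Python sketch below assumes that q(i) counts the ones in the binary expansion of i and that (3.1) reads Σ_{i=0}^{2^r − 1} q(i) = r·2^{r−1}; the function names are ours.

```python
def q(i):
    """Number of digits equal to 1 in the binary expansion of i."""
    return bin(i).count("1")

def lemma3_lhs(r):
    """Sum of q(i) over all i with 0 <= i < 2^r."""
    return sum(q(i) for i in range(2 ** r))

def lemma3_rhs(r):
    """r * 2^(r-1), written with integer arithmetic so that r = 0 gives 0."""
    return r * 2 ** r // 2

def lemma3_holds(max_r):
    """True if both sides agree for every 0 <= r <= max_r."""
    return all(lemma3_lhs(r) == lemma3_rhs(r) for r in range(max_r + 1))
```

The count has a simple interpretation: each of the r bit positions equals "1" in exactly half of the 2^r integers below 2^r.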
Lemma 4 Let 1 ≤ j < n − 1 and 1 ≤ a < 2 n , and let the binary expansion of a be given by Suppose also that a j = 1 and a j+1 = 0, and let b = a +2 j−1 .
Proof Observe first that the binary expansion of b is obtained from the binary expansion of a by switching a j with a j+1 . Suppose first that a < 2 j , so that a j is the last "1" appearing in the binary expansion of a. Then a = 2 j−1 + c for some c < 2 j−1 and from Lemma 3 it follows that which agrees with (3.3). Now suppose that a ≥ 2 j , and hence a ≥ 2 j+1 since a j+1 = 0. Letã = j i=1 a i 2 i−1 andb =ã + 2 j−1 , and note that for everyã ≤s <b and s = s + n i= j+2 a i 2 i−1 we have

Hence it follows from (3.4) that
We can now proceed with proving Theorem 5.
Proof of Theorem 5 Recall from (2.3) the definition of the path γ : ⊟ → ⊞. For 0 ≤ k ≤ 2^n, define
$$g(k) = H(\gamma_k) - H(\boxminus) = |E(\gamma_k, \overline{\gamma_k})| - hk. \tag{3.5}$$
Then from (1.8) and Lemma 2 it follows that
$$\Gamma^{\star} = \max_{0 \le k \le 2^n} g(k). \tag{3.6}$$
We are interested in finding any k ∈ {0, …, 2^n} such that g(k) = Γ*, and then computing g(k). Uniqueness of k will follow whenever h is not integer valued. In general (i.e. when h ∈ N is permitted), g may attain its maximum at more than one place, and therefore in the constructive proof below we will refer to a particular global maximum as our global maximum.
We will first show that if k is any local (or global) maximum of g, it must have exactly δ = ⌈(n − h)/2⌉ digits equal to "1" in its binary decomposition. Starting with any such local maximum k = Σ_{i=1}^{n} k_i 2^{i−1}, we will show that if max{i : k_i = 1} ≠ n − ⌊h⌋ − 1, we can "shift" the "1"s in the decomposition of k (as was done in Lemma 4) to obtain a different local maximum k′ that satisfies max{i : k′_i = 1} = n − ⌊h⌋ − 1 and g(k) ≤ g(k′) (see Figure 3). This will determine where the final "1" in the decomposition of our global maximum should be. The same argument will also show where the other δ − 1 "1"s in the binary decomposition of our global maximum should be. We will thus obtain a k that attains the maximum in (3.6).
The function g is decreasing on {k, k + 1} if and only if g(k + 1) ≤ g(k), which is equivalent to
$$2q(k) \ge n - h. \tag{3.7}$$
Similarly, g is increasing on {k − 1, k} if and only if 2q(k − 1) ≤ (n − h). Observe that q(k) − q(k − 1) ≤ 1, hence local maxima of g occur at values k that have exactly δ = ⌈(n − h)/2⌉ digits equal to "1" in their binary expansion, while k − 1 has at most δ digits equal to "1" in its binary expansion. Observe also that if k ≥ 2 is even, then q(k) ≤ q(k − 1), while if k ≥ 3 is odd, q(k) = q(k − 1) + 1. Hence, in order to find a global maximum it suffices to only consider odd k, with k − 1 having exactly δ − 1 digits equal to "1" in its binary expansion. Now suppose that k^(1) is an integer that satisfies the above conditions, with its binary expansion given by k^(1) = Σ_{i=1}^{n} k^(1)_i 2^{i−1}. We can now use (3.8) to compare the local maxima of g in order to find its global maximum. Starting with any k = Σ_{i=1}^{n} k_i 2^{i−1} that satisfies the aforementioned conditions (k is odd, k has δ digits equal to "1" in its binary expansion, k − 1 has δ − 1 digits equal to "1" in its binary expansion), let ξ_1(k) = max{i : k_i = 1}. We will show that if k is a global maximum, then we may take ξ_1(k) = n − ⌊h⌋ − 1. If ξ_1(k) < n − ⌊h⌋ − 1, then by (3.8) we can switch the values of k_{ξ_1(k)} (= 1) and k_{ξ_1(k)+1} (= 0) to obtain a local maximum k′ such that g(k) < g(k′). We can repeat this 'switch' procedure until the final "1" is the (n − ⌊h⌋ − 1)th term. With every switch we observe a new local maximum of g that is greater than all previously observed ones (see Remark 1 below for the case n − ⌊h⌋ − 1 = 0).

Fig. 3 The bottom-left diagram is the outcome of a 'switch' that corrects this, resulting in k′ with g(k′) ≥ g(k). In the top-right diagram, ξ_1(k) > n − ⌊h⌋ − 1. To correct this, the first step is for the "0" at s_1(k) to be switched with the "1" at s_1(k) + 1, yielding k′ with s_1(k′) = s_1(k) + 1 and g(k′) ≥ g(k).
We want to show that if ξ 1 (k) ≥ n − h − 1 +1 , we can apply Lemma 4 again to shift the final "1" one space 'back' and obtain some k * with ξ 1 (k * ) = ξ 1 (k) − 1 and g (k * ) ≥ g (k). In order to do this, we need that k ξ 1 (k) = 0, which may not be immediately the case. Hence we take the nearest "0" preceding ξ 1 (k) and shift it 'forward' until we have a value k * with k * ξ 1 (k) = 0. See right-hand illustration in Figure 3.
Formally, this is done as follows. Let s 1 (k) = max {i < ξ 1 (k) : k i = 0} and let k be the result of switching the terms k s 1(k) (= 0) and k s 1 (k)+1 (= 1) in the binary expansion of k. Then again from (3.8) it follows that Thus by switching the values of k s 1 (k) and k s 1 (k)+1 , we obtain a local maximum k which satisfies g k > g (k). We continue in a recursive manner. Let υ 0 = k , and let υ i+1 be obtained from υ i by switching the "0" at s 1 (υ i ) with the "1" at s 1 (υ i ) + 1 in the binary decomposition of υ i . Let M = min {i : Hence we may assume that our global maximum k satisfies ξ 1 (k) = n − h − 1 .
We can repeat this process to determine the location of all other "1"s in the binary expansion of our global maximum. For 2 ≤ m ≤ δ we can define ξ m (k) = max {i < ξ m−1 (k) : k i = 1} and from (3.8) we conclude that if ξ m (k) < n − h + 1 − 2m and k ξ m (k)+1 = 0, we obtain a greater maximum by switching k ξ m (k)+1 and k ξ m (k) . Similarly, if ξ m (k) ≥ n − h + 1 − 2m + 1, then we can define s m (k) = max {i < ξ m (k) : k i = 0} and define k analogous to (3.9) to conclude that Thus, we can repeat (3.10) until we obtain a local maximum k # of g that satisfies ξ m k # = n − h + 1 − 2m . Note that for m = δ, n − h + 1 − 2m ∈ {0, 1} and hence we set ξ δ = 1 which agrees with our previous observation that we may take k to be odd. Therefore, for h < n − 1 (see Remark 1) the maximum of g is attained at In Remark 1 we consider the case h ≥ n − 1. Note also that if n − 2 ≤ h < n − 1, then δ = 1 and (3.11) gives k = 1. Hence in this case we get = n − h. Therefore, for the remainder of this proof we will assume that h < n − 2, and hence δ ≥ 2 and k ≥ 3. Let and by Lemma 3 we have that Hence we see that − 1 when n − h is even, and both cases agree with = n − h when n − 2 ≤ h < n. This completes the proof.
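The location of the global maximum of g can be cross-checked numerically. In the Python sketch below, `g` uses the closed form |E(γ_k, γ̄_k)| = nk − 2 Σ_{i<k} q(i), and `k_star` encodes the bit positions described in the proof: the m-th "1" at position n − ⌊h⌋ + 1 − 2m for m < δ, and the last "1" at position 1, which gives 1 + Σ_{m=1}^{δ−1} 2^{n − ⌊h⌋ − 2m}. Reading the brackets lost from the text as floors, and this closed form as the content of (3.11), are our assumptions; the function names are ours.

```python
import math

def q(i):
    """Number of digits equal to 1 in the binary expansion of i."""
    return bin(i).count("1")

def g(k, n, h):
    """g(k) = H(gamma_k) - H(all minus) = n*k - 2*sum_{i<k} q(i) - h*k (J = 1)."""
    return n * k - 2 * sum(q(i) for i in range(k)) - h * k

def brute_force_maximiser(n, h):
    """Smallest k in {0, ..., 2^n} at which g attains its maximum."""
    return max(range(2 ** n + 1), key=lambda k: g(k, n, h))

def k_star(n, h):
    """Assumed reading of (3.11): ones at bit positions n - floor(h) + 1 - 2m
    (counting positions from 1) for m < delta, plus a final one at position 1."""
    delta = math.ceil((n - h) / 2)
    return 1 + sum(2 ** (n - math.floor(h) - 2 * m) for m in range(1, delta))

def energy_barrier(n, h):
    """Gamma* evaluated as g at the assumed maximiser."""
    return g(k_star(n, h), n, h)
```

For non-integer h the maximiser is unique, so `brute_force_maximiser(n, h)` and `k_star(n, h)` can be compared directly; for n − 2 ≤ h < n both return 1 and the barrier is n − h.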

Remark 1
The above derivation was done under the assumption n − ⌊h⌋ − 1 ≥ 1. Note that if n − ⌊h⌋ − 1 = 0, then δ = 1 and it is immediate from (3.8) that the only "1" in the binary expansion of k belongs to k_1. Therefore, in this special case k = 1 and Γ* = n − h are the solutions to the above problem.

Proof of Theorem 4
In the proof of Theorem 5 we observed that if n − 2 ≤ h < n, the energy value H(⊟) + Γ* = H(⊟) + n − h is first attained along the path {γ_i} by the configuration γ_1. Furthermore, every optimal path attains this value in its initial step, and hence from the definition in (1.9) it is clear that P* = {⊟}, and C* = M(γ_1) is the set of all singleton configurations.
For h < n − 2, it follows from the binary decomposition of k* in (3.11) and the fact that γ_{k*} is a good set, that γ_{k*} is the configuration W_1 ∪ … ∪ W_δ given in the statement of Theorem 4. By Lemma 1, every optimal path γ : ⊟ → ⊞ must pass through M(γ_{k*}). Furthermore, from Lemma 5 it will follow that for any such path γ, if γ_i is the first configuration along the path γ that lies in M(γ_{k*}), then Φ(γ_{i−1}, ⊟) < Φ(γ_{i−1}, ⊞). Hence by the third condition in (1.9) it follows that no configuration σ with |σ| < k* can be in C*. Furthermore, no configuration σ with |σ| > k* satisfies "∃ξ ∈ Ω such that σ and ξ differ by a single spin-flip and Φ(ξ, ⊟) < Γ* + H(⊟)". Hence, no configuration σ with |σ| > k* can have a neighbour in P*, which by the first condition in (1.9) implies that σ ∉ C*. Lastly, if |σ| = k* and σ ∉ M(γ_{k*}), then by Lemma 1 we have that H(σ) > H(γ_{k*}), and hence again σ ∉ C*. This shows that σ ∈ C* iff σ ∈ M(γ_{k*}), and thus completes the proof.
The validity of hypothesis (H2) in Theorem 3 is now trivial: all members of C* are the image of γ_{k*} under different isomorphisms on Q_n, hence they all have the same number of neighbours in P*.

Computation of the Prefactor K
In this section we prove that the only way down from a critical configuration is through M(γ_{k*−1}) ∪ M(γ_{k*+1}) (Lemma 5). In Lemma 6 we calculate the cardinality of C*, which we then use to prove Theorem 6.
The following variational equation (derived in Lemma 16.17 in [4]) gives an expression for K . and a similar definition is given to S . Lastly, S ⊆ is the set of all σ ∈ such that (σ, ) ≤ ( , ) (and hence also (σ, ) ≤ ( , )). We remark that for Lemma 16.17 in [4] , S is defined to be the set of all ξ ∈ with H (ξ ) ≤ ( , ). Let us denote that definition of S by S a , and note that it yields a bigger set since it may include ξ ∈ that satisfy (ξ, ) > ( , ). But any ξ ∈ S a \S must lie in a component of the graph S a , E n that is disconnected from S ∪ C ∪ S . Thus in (4.1) we may take f ≡ 0 on S a \S , which reduces the sum in (4.1) to a sum over the connected component of S a , E n that contains the set S ∪ C ∪ S , which is precisely (S , E n ). This justifies our definition of S .
Recall from (1.11) that for any ξ ∈ Ω, M(ξ) denotes the equivalence class of all σ ∈ Ω that are the image of ξ under some graph isomorphism on Q_n. To simplify (4.1), we will make use of the following lemma, which tells us that the only way down in energy from a critical configuration is through M(γ_{k*−1}) or M(γ_{k*+1}), where k* is given in (3.11) and is equal to the volume of a critical configuration.
An immediate conclusion from Lemma 5 is the following.
Corollary 1 Let k* be as in (3.11). Then P* = M(γ_{k*−1}).
In other words, P* is the set of all images of the set W_1 ∪ … ∪ W_{δ−1} given in (1.12) and (1.13), under isomorphisms on Q_n (see also Figure 1). We will now compute the cardinality of the critical set C*. This quantity will be necessary for computing the prefactor K*.

Lemma 6
Suppose 0 < h < n and h is not integer valued. Then |C | = 2 n for h ≥ n − 2, and Proof From Theorem 4 it follows immediately that if h > n − 2, |C | = |V n | = 2 n . For v ∈ V n and s ∈ N, 1 ≤ s ≤ n, let θ s (v) ∈ V n be the vertex that agrees with v at every co-ordinate except at s. In other words, θ s (v) i = v i for i = s, and θ s (v) s = 1 − v s . If Q r is an r -dimensional sub-cube of Q n (r < n), and 1 ≤ s ≤ n is such that v s = w s for every v, w ∈ Q r (in other words, the co-ordinate s lies outside Q r ), define θ s (Q r ) by Note that θ s (Q r ) is also an r -dimensional sub-cube of Q n . We will say in this case that s is an external co-ordinate of the sub-cube Q r . By Definition 1 and Theorem 4, every configuration in C can be constructed as follows. Start with any n − h − 2 -dimensional sub-cube Q 1 . There are n n−h−2 ×2 n− n−h−2 different choices for such a sub-cube. Let s 1 be any external co-ordinate of Q 1 , and let Q 2 be a n − h − 4 -dimensional sub-cube of θ s 1 (Q 1 ). There are (3.11) implies that we should continue with this construction until we have chosen a n − h − 2δ + 2 -dimensional sub-cube Q δ−1 followed by a single vertex from the sub-cube θ s δ−1 Q s δ−1 , which will be identified with the 0-dimensional sub-cube Q δ . For i ≥ 2, there are always two choices for the external co-ordinate s i of Q i , since both Q i and θ s i (Q i ) lie inside θ s i−1 (Q i−1 ) (see Figure 1). Moreover there are ways to choose the co-ordinates of Q i+1 , and 2 2 ways to fix the two external co-ordinates of Q i+1 (for i + 1 < δ) that are in θ s i (Q i ) . Therefore, letting b 1 = (n − n − h − 2 ) and b i = 2 for 2 ≤ i ≤ δ − 2, we see that |C | is given by From this, the statement of the lemma follows.
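For small n, the size of an equivalence class M(S) can be counted directly by enumerating the isomorphisms of Q_n, which gives an independent check on counts such as |C*|. The Python sketch below is an illustration and not the counting argument of the proof. It relies on the standard fact that every isomorphism of Q_n is a permutation of the co-ordinates followed by a flip of some subset of co-ordinates (n!·2^n maps in total), and it assumes that γ_k is the set of the k vertices with smallest binary value. The function names are ours.

```python
from itertools import permutations

def automorphisms(n):
    """Yield all n! * 2^n isomorphisms of Q_n as maps on integer-encoded vertices."""
    for perm in permutations(range(n)):
        for flip in range(1 << n):
            def phi(v, perm=perm, flip=flip):
                w = 0
                for i in range(n):
                    if (v >> i) & 1:
                        w |= 1 << perm[i]  # move co-ordinate i to position perm[i]
                return w ^ flip            # then flip the chosen co-ordinates
            yield phi

def orbit(S, n):
    """M(S): all distinct images of the vertex set S under isomorphisms of Q_n."""
    return {frozenset(phi(v) for v in S) for phi in automorphisms(n)}

def gamma(k):
    """Assumed form of gamma_k: the k vertices with smallest binary value."""
    return frozenset(range(k))

def orbit_size(k, n):
    """|M(gamma_k)|, computed by exhaustive enumeration."""
    return len(orbit(gamma(k), n))
```

For instance, `orbit_size(1, n)` counts the singleton configurations, in agreement with |C*| = 2^n in the regime where critical configurations are singletons.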
We can now proceed with computing K .

Proof of Theorem 6
The proof works as follows. We first show that (4.1) can be simplified considerably. Following this simplification, it is necessary to count the neighbours (in S and S ) of any critical configuration. We do this by making use of Lemma 5.
Recall the definition of k in (3.11). It follows from equation (3.7) in the proof of Theorem 5 that if h is not an integer, g (k ) is a strict local maximum-i.e. g (k − 1) < g (k ) and g (k ) > g (k + 1). Furthermore, from equations (3.9) and (3.10) (in particular, the final inequality in both equations) it follows that k is the unique maximum of g. This in particular implies that γ k −1 ∈ S and γ k +1 ∈ S .
Observe also that by symmetry, for every ξ ∈ C , the inner sum in the right-hand side of (4.5) is the same. Thus, taking any ξ ∈ C , we have that (4.1) has been reduced to As pointed out in the proof of Lemma 5, if h ≥ n − 2 then the first sum in (4.6) contains only one term, namely ξ = . It is also easy to see that in this case the second sum contains n terms. Hence, for n − 2 ≤ h < n we get For h < n − 2, it was shown in Lemma 5 that if n − h is even, there is a unique ξ ∈ M (γ k −1 ) with ξ , ξ ∈ E n (in particular, if ξ = δ i=1 W i as in (1.12) and (1.13), then Lemma 5). Similarly, if ξ ∈ S and ξ , ξ ∈ E n and ξ = δ i=1 W i , then ξ = ξ ∪ {w} where w ∈ W δ−1 is a vertex adjacent to the unique vertex w δ ∈ W δ -i.e. (w, w δ ) ∈ E n . If n − h is odd then W δ−1 is a 1-dim sub-cube and there is a unique vertex w that satisfies this. If n − h is even, W δ−1 is a 2-dim sub-cube and there are two choices for w. See Figure 4.

Stability Levels and Reference Paths
Theorem 3 states that hypotheses (H1) and (H2) hold whenever $0 < h < n$. The latter was verified in Sect. 3, following the proof of Theorem 4. To verify (H1), we use a standard nucleation-path type of argument, similar to what is given in Chapter 17 in [4] for the Ising model on $\mathbb{Z}^2$. It exploits translation invariance in the underlying graph, and the possibility to initiate a uniformly optimal path (as defined in the statement of Lemma 2) starting from any vertex.
Proof of Theorem 3. Let $\sigma$ be a configuration with $\sigma \notin \{\boxminus, \boxplus\}$. We will show that $V_\sigma < V_{\boxminus}$, where $V_{\boxminus}$ equals the energy barrier between $\boxminus$ and $\boxplus$; by definition this implies that $\sigma$ is not a metastable state, and hence that $\boxminus$ is the only metastable state. Pick any $w \in \sigma$ such that $(w, y) \in E_n$ for some $y \in \overline{\sigma}$, and let $\gamma = (\gamma_0, \ldots, \gamma_{2^n})$ be an optimal path with initial steps $\gamma_1 = \{y\}$ and $\gamma_2 = \{w, y\}$. This is always possible by translation invariance. We will show that the path $\{\sigma \cup \gamma_i\}_{i=0}^{S}$ ($S$ will be defined below), going from $\sigma$ to $\sigma \cup \gamma_S$, satisfies $H(\sigma \cup \gamma_S) < H(\sigma)$ and $H(\sigma \cup \gamma_i) - H(\sigma) < V_{\boxminus}$ for all $0 \leq i \leq S$. By definition, this means that $V_\sigma < V_{\boxminus}$, and hence $\sigma$ is not metastable. Note first that

Let us also denote by

and note (by means of a simple computation) that $S \geq 2$ whenever $h < n$. Furthermore, for

It follows that for $2 \leq i \leq S$

where the last inequality follows from the fact that $|\gamma_i \cap \sigma| = m$ for some $m < i$. Hence, by uniform minimality of the configurations $\gamma_j$

Thus we have shown that the path

Finally, note that if $h \geq n$, then $H(\gamma_1) - H(\boxminus) \leq 0$, and from (5.1) it follows that (the derivation in (5.1) is also true for $i = 1$, except that now the final inequality is no longer strict) $H(\sigma \cup \gamma_1) - H(\sigma) \leq H(\gamma_1) - H(\boxminus) \leq 0$; thus $\sigma$ is not a local minimum of $H$.
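The final inequality of the proof, $H(\sigma \cup \gamma_1) - H(\sigma) \leq H(\gamma_1) - H(\boxminus)$, can be checked numerically on a small cube. The sketch below rests on an assumption that is not stated in this section: the Hamiltonian is taken to be normalised so that $H(\sigma) - H(\boxminus) = |E(\sigma, \overline{\sigma})| - h|\sigma|$, where $\sigma$ is identified with its set of $+1$ vertices. The names `energy` and `single_flip_bound_holds` are illustrative.

```python
from itertools import product


def neighbours(v):
    """All vertices of the hypercube that differ from v in exactly one co-ordinate."""
    return [v[:i] + (1 - v[i],) + v[i + 1:] for i in range(len(v))]


def energy(sigma, h):
    """Assumed normalisation: H(sigma) - H(all minus) = |boundary edges| - h * |sigma|."""
    boundary = sum(1 for v in sigma for w in neighbours(v) if w not in sigma)
    return boundary - h * len(sigma)


def single_flip_bound_holds(n, h):
    """Check H(sigma + {y}) - H(sigma) <= H({y}) - H(empty) for all sigma and all y not in sigma."""
    vertices = list(product((0, 1), repeat=n))
    empty = frozenset()
    for mask in range(2 ** len(vertices)):
        sigma = frozenset(v for k, v in enumerate(vertices) if mask >> k & 1)
        for y in vertices:
            if y in sigma:
                continue
            lhs = energy(sigma | {y}, h) - energy(sigma, h)
            rhs = energy(frozenset({y}), h) - energy(empty, h)
            if lhs > rhs + 1e-12:
                return False
    return True
```

Under this normalisation the cost of adding a vertex with $j$ neighbours already in $\sigma$ is $n - 2j - h$, which is at most the cost $n - h$ of the first vertex; the check simply confirms this over all $2^{2^n}$ configurations for small $n$.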

Proof of Lemma 1
In this section we will show that if $W$ is not a good set (as per Definition 1), then $|E(W, \overline{W})|$ is not minimal; that is, there exists $U \subseteq V_n$ with $|U| = |W|$ such that $|E(U, \overline{U})| < |E(W, \overline{W})|$. Note that this is equivalent to showing that $|E(W, W)|$ is not maximal. Unlike $|E(W, \overline{W})|$, the quantity $|E(W, W)|$ does not depend on the dimension of the cube in which $W$ is embedded, which makes it easier to work with.
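The equivalence between minimising the boundary and maximising the number of internal edges comes from a degree count. Write $|E(W, W)|$ for the number of edges with both endpoints in $W$ and $|E(W, \overline{W})|$ for the number of edges with exactly one endpoint in $W$. Every vertex of $Q_n$ has degree $n$, so $n|W| = 2|E(W, W)| + |E(W, \overline{W})|$ for every $W \subseteq V_n$. The sketch below checks this identity by counting the two kinds of edges independently; the function names are illustrative.

```python
from itertools import combinations, product


def adjacent(v, w):
    """Two vertices of the hypercube are adjacent iff they differ in exactly one co-ordinate."""
    return sum(a != b for a, b in zip(v, w)) == 1


def inner_edges(W):
    """Number of edges with both endpoints in W."""
    return sum(1 for v, w in combinations(sorted(W), 2) if adjacent(v, w))


def boundary_edges(W, vertices):
    """Number of edges with exactly one endpoint in W."""
    return sum(1 for v in W for w in vertices if w not in W and adjacent(v, w))


def identity_holds(n):
    """Check n|W| = 2|E(W, W)| + |E(W, complement of W)| for every subset W of Q_n."""
    vertices = list(product((0, 1), repeat=n))
    for mask in range(2 ** len(vertices)):
        W = {v for k, v in enumerate(vertices) if mask >> k & 1}
        if n * len(W) != 2 * inner_edges(W) + boundary_edges(W, vertices):
            return False
    return True
```

For a fixed size $|W|$ the left-hand side is constant, so a smaller boundary is the same as a larger number of internal edges.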
We start with a definition. We will say that $W \subseteq V_n$ with $2^r < |W| \leq 2^{r+1}$ is well-contained if there is an $(r+1)$-dimensional sub-cube of $Q_n$ that contains $W$. Note that every set $W$ of size $|W| > 2^{n-1}$ is well-contained, as is every good set (see Definition 1). The following lemma shows that if $|E(W, \overline{W})|$ is minimal, then $W$ must be well-contained.

Lemma 7
If W is not well-contained, |E (W, W )| is not maximal.
Proof. We begin with an observation. Let $C_0$ be any sub-cube (of any dimension in $\{0, 1, \ldots, n-1\}$), and let $C_1 = \theta_s(C_0)$ for some external co-ordinate $s$ of $C_0$ (recall from (4.3) that this means $C_0$ and $C_1$ are disjoint sub-cubes of the same size, and that there is some $1 \leq s \leq n$ such that every $u \in C_0$ can be mapped to a $v \in C_1$ by changing the value of $u_s$).
where the inequality follows from the observation that every $v \in W^1$ has at most one neighbour in $W^0$, and vice versa. Furthermore:

Claim. If $W$ is a good set, then the inequality in (6.1) is an equality.
Proof of claim. We will assume that $W^0 \neq \emptyset$ and $W^1 \neq \emptyset$, since otherwise the claim is trivially true. By the definition of a good set, there is some $l \in \mathbb{N}$ such that $W$ can be decomposed into $l$ disjoint good sets

Here $W_i$, $1 \leq i \leq l$, is the set of all vertices in some $a_i$-dimensional sub-cube, with $a_i < a_{i-1}$. Furthermore, again from the definition of a good set, for every $i \geq 2$ we have $\bigcup_{j=i}^{l} W_j \subseteq \theta_{b_{i-1}}(W_{i-1})$ for some external co-ordinate $b_{i-1}$ of $W_{i-1}$ (this is analogous to the corresponding nesting statement in (1.13)). Then

Recall that $C_1 = \theta_s(C_0)$, hence $s$ is not an external co-ordinate of the $(r+1)$-dimensional sub-cube $C_0 \cup C_1$. Note that for any $1 \leq i \leq l$, if $s$ is not an external co-ordinate of $W_i$, then $|W_i \cap C_0| = |W_i \cap C_1| = \tfrac{1}{2}|W_i|$, and this is also equal to $|E(W_i \cap C_0, W_i \cap C_1)|$. Thus, if $s$ is not an external co-ordinate of $W_i$ for all $1 \leq i \leq l$, then it must be that $|W^0| = |W^1| = \tfrac{1}{2}\sum_{j} |W_j|$ and

The second equality comes from the fact that if $j < k$, then any $v_j \in W_j \cap C_0$ and any $v_k \in W_k \cap C_1$ differ in at least two co-ordinates, namely $s$ and $b_j$ (since $W_k \subseteq \theta_{b_j}(W_j)$), and $s \neq b_j$. Note that if $s$ is an external co-ordinate of $W_i$, then $s$ is an external co-ordinate of $W_j$ for all $j \geq i$. Let $\ell = \min\{i : s \text{ is an external co-ordinate of } W_i\}$, and suppose w.l.o.g. that $W_\ell \subseteq W^0$. Then $|W^1| \leq |W^0|$, and for any $i \in R = \{i > \ell : W_i \subseteq W^1\}$ we have $W_i \subseteq \theta_{b_\ell}(W_\ell)$ (see Figure 5). Hence for every $i \in R$ and $v \in W_i$, there is exactly one $w \in W^0$ (more precisely, $w \in W_\ell$) such that $(v, w) \in E_n$. This shows that

and thereby proves the claim.

Let $r$ be such that $2^r < |W| \leq 2^{r+1}$. We may assume that $r + 1 \leq n - 1$, since if $2^{n-1} < |W|$ then $W$ is by definition well-contained in the cube $Q_n$. We proceed by induction on $n$. For $n = 2$, the only sets that are not well-contained are $W_1 = \{(0, 0), (1, 1)\}$ and $W_2 = \{(1, 0), (0, 1)\}$. Clearly $|E(W_1, W_1)| = |E(W_2, W_2)| = 0$ is not maximal.
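The base case $n = 2$ can be confirmed by enumeration. The sketch below uses the fact that the smallest sub-cube containing a set $W$ has dimension equal to the number of co-ordinates on which the vertices of $W$ do not all agree, so $W$ is well-contained exactly when that number is at most $r + 1$, where $2^r < |W| \leq 2^{r+1}$. Treating singletons as well-contained (with $r + 1 = 0$) is a convention adopted here, and the function names are illustrative.

```python
from itertools import combinations, product


def spanned_dimension(W):
    """Dimension of the smallest sub-cube containing W: co-ordinates that vary within W."""
    n = len(next(iter(W)))
    return sum(1 for i in range(n) if len({v[i] for v in W}) > 1)


def is_well_contained(W):
    """W fits in an (r+1)-dimensional sub-cube, where 2^r < |W| <= 2^(r+1)."""
    r_plus_1 = (len(W) - 1).bit_length()  # smallest d with |W| <= 2^d
    return spanned_dimension(W) <= r_plus_1


def not_well_contained_sets(n):
    """All non-empty subsets of V_n that are not well-contained, as sorted lists of vertices."""
    vertices = list(product((0, 1), repeat=n))
    return [
        sorted(W)
        for size in range(1, len(vertices) + 1)
        for W in combinations(vertices, size)
        if not is_well_contained(W)
    ]
```

For `n = 2` the function returns exactly the two diagonal pairs, neither of which spans an edge, whereas a pair of adjacent vertices of the same size spans one.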
Now suppose that the statement of the lemma is true whenever the setting is a hypercube of dimension less than or equal to $n - 1$, and let $W \subseteq V_n$ be a set that is not well-contained. Let $W^0 = \{w \in W : w_1 = 0\}$, with $W^1$ defined similarly, so that $W^0 \cup W^1 = W$, and suppose w.l.o.g. that $|W^0| \geq |W^1|$. Note that the sets $W^0$ and $W^1$ are contained in two disjoint sub-cubes, call them $Q^0_{n-1}$ and $Q^1_{n-1}$, of dimension $n - 1$. Let $r_0 \leq n - 2$ be such that $2^{r_0} < |W^0| \leq 2^{r_0+1}$, and define $r_1 \leq r_0$ in a similar manner. If $W^0$ is not well-contained, then by the inductive hypothesis $|E(W^0, W^0)|$ is not maximal. Hence we can find a good set $\widetilde{W}^0$ in $Q^0_{n-1}$ with $|\widetilde{W}^0| = |W^0|$ and $|E(W^0, W^0)| < |E(\widetilde{W}^0, \widetilde{W}^0)|$, and we can also replace $W^1$ by a good set $\widetilde{W}^1$ of the same size such that $|E(W^1, W^1)| \leq |E(\widetilde{W}^1, \widetilde{W}^1)|$. By (6.1), $|E(W^0, W^1)| \leq |W^1|$, and we may take $\widetilde{W}^1$ such that $|E(\widetilde{W}^0, \widetilde{W}^1)| = |\widetilde{W}^1|$ (by taking $\widetilde{W}^1$ to be a good set contained in $\theta_1(\widetilde{W}^0)$), hence it also follows that $|E(W^0, W^1)| \leq |E(\widetilde{W}^0, \widetilde{W}^1)|$. By (6.1) the set $\widetilde{W} = \widetilde{W}^0 \cup \widetilde{W}^1$ satisfies $|E(W, W)| < |E(\widetilde{W}, \widetilde{W})|$, and hence $|E(W, W)|$ is not maximal. The same argument applies if $W^1$ is not well-contained. We may therefore assume that $W^0$ and $W^1$ are well-contained.
Suppose first that $r_0 + 1 < n - 1$. Assuming $W^0$ and $W^1$ are well-contained, we can find two disjoint sub-cubes $Q^0_{r_0+1}$ and $Q^1_{r_1+1}$ containing $W^0$ and $W^1$ respectively (they are disjoint since every vertex in $W^0$ ($W^1$) has a 0 (1) in its first co-ordinate, hence the same must be true for every vertex in $Q^0_{r_0+1}$ ($Q^1_{r_1+1}$)). We may also assume that $W^1 \subseteq \theta_1(W^0)$, since otherwise $|E(W^0, W^1)| < \min(|W^0|, |W^1|)$ and we can make the same argument as before to conclude that $|E(W, W)|$ is not maximal. Hence $W$ is contained in an $(r_0 + 2)$-dimensional sub-cube containing $Q^0_{r_0+1}$ and $Q^1_{r_1+1}$. Since $r_0 + 2 \leq n - 1$, it follows from the inductive hypothesis that $|E(W, W)|$ is not maximal.
Finally, if $r_0 + 1 = n - 1$, we can decompose $W^0$ into $W^0 = W^{00} \cup W^{01}$, with $W^{00} = \{w \in W^0 : w_2 = 0\}$ and with a similar definition for $W^{01}$. We can assume w.l.o.g. that $W^0$ and $W^1$ are good sets, since otherwise we can replace them by good sets $\widetilde{W}^0$ and $\widetilde{W}^1$ as was done in the previous case, to get $|E(W, W)| \leq |E(\widetilde{W}, \widetilde{W})|$, where $\widetilde{W} = \widetilde{W}^0 \cup \widetilde{W}^1$. Then, assuming $W^0$ is a good set, one of $W^{00}$, $W^{01}$ is the set of all vertices of an $(n-2)$-dimensional sub-cube. W.l.o.g. take this to be the set $W^{00}$, and note that $W^{01}$ is well-contained (since $W^0$ is a good set). Note that at least one of the inequalities $|E(W^{00}, W^1)| \leq \min(|W^{00}|, |W^1|) = |W^1|$ (since $|W| \leq 2^{n-1} = 2|W^{00}|$) and $|E(W^{01}, W^1)| \leq \min(|W^{01}|, |W^1|)$ is strict, since each $w \in W^1$ has at most one neighbour in $W^0$, and that neighbour lies either in $W^{00}$ or in $W^{01}$. Furthermore, we can find a good set $W^\dagger$ of the same size as $\widehat{W} = W^1 \cup W^{01}$, contained in the $(n-2)$-dimensional sub-cube that contains $W^{01}$, such that $|E(W^\dagger, W^\dagger)| \geq |E(\widehat{W}, \widehat{W})|$ and $|E(W^\dagger, W^{00})| = |W^\dagger| \geq |E(\widehat{W}, W^{00})|$. But then at least one of the inequalities $|E(W^\dagger, W^\dagger)| \geq |E(\widehat{W}, \widehat{W})|$ and $|E(W^\dagger, W^{00})| \geq |E(\widehat{W}, W^{00})|$ is strict, and hence $|E(W^\dagger \cup W^{00}, W^\dagger \cup W^{00})| > |E(W, W)|$ (see Figure 6). This shows that $|E(W, W)|$ is not maximal, and completes the proof.
Proof of Lemma 1. As in the proof of Lemma 7, we will prove the statement of this lemma by induction on the dimension of the ambient hypercube. The case $n = 2$ is simple, since the only sets that are not good are the two sets $W_1$ and $W_2$ from (6.3). Suppose now that, whenever the setting is a hypercube of dimension less than or equal to $n - 1$, $W$ not being a good set implies that $|E(W, W)|$ is not maximal. Let $W$ be a subset of $Q_n$ that is not good, with $|W| = 2^r + k$ for $1 \leq k \leq 2^r$ and $0 \leq r \leq n - 1$. Then at least one of the following three statements is true:

(1) There is no $(r+1)$-dimensional sub-cube which contains the set $W$ (i.e. $W$ is not well-contained).

(2) $Q_{r+1}$ is an $(r+1)$-dimensional sub-cube of $Q_n$ that contains $W$, and for any decomposition $Q_{r+1} = Q^0_r \cup Q^1_r$ into two disjoint $r$-dimensional sub-cubes, neither half is entirely contained in $W$ (that is, $\overline{W} \cap Q^0_r \neq \emptyset$ and $\overline{W} \cap Q^1_r \neq \emptyset$).

(3) $Q_{r+1}$ is an $(r+1)$-dimensional sub-cube of $Q_n$ that contains $W$, and for any decomposition $Q_{r+1} = Q^0_r \cup Q^1_r$ into two disjoint $r$-dimensional sub-cubes, $\overline{W} \cap Q^0_r = \emptyset$ implies that $W \cap Q^1_r$ is not good.
If the first statement is true, $|E(W, W)|$ is not maximal by Lemma 7. If the third statement is true, then the argument follows almost immediately from the inductive hypothesis. Indeed, if $W^i = W \cap Q^i_r$ for $i \in \{0, 1\}$, then replacing $W^1$ by a good set $\widetilde{W}^1$ of the same size and contained in $Q^1_r$ implies that $|E(W^1, W^1)| < |E(\widetilde{W}^1, \widetilde{W}^1)|$ and $|E(W^1, W^0)| \leq |W^1| = |E(\widetilde{W}^1, W^0)|$. Suppose that the second statement is true. By the inductive hypothesis, if $r + 1 < n$ or if either one of $W^0$, $W^1$ is not good, $|E(W, W)|$ is not maximal. Hence we may assume that $r + 1 = n$. But now we can consider the set $U = \overline{W}$ instead, since $|E(U, \overline{U})| = |E(W, \overline{W})|$. Clearly $|U| < 2^{n-1}$ and $U$ is not a good set ($W$ satisfies the second statement above, so $U$ is not well-contained), hence again by the inductive hypothesis we have that $|E(U, U)|$ is not maximal (and hence $|E(U, \overline{U})|$ is not minimal). This proves that $|E(W, W)|$ is not maximal.
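The conclusion can be sanity-checked on small cubes by exhaustive search. The sketch below labels the vertices of $Q_n$ by the integers $0, \ldots, 2^n - 1$ (read in binary) and compares the largest value of $|E(W, W)|$ over all sets of a given size with the value attained by the initial segment $\{0, \ldots, m-1\}$. Such a segment is a full sub-cube followed by successively smaller sub-cubes nested in its translates, which matches the decomposition used in the proof of the claim above; treating initial segments as representatives of good sets is an assumption made here, since Definition 1 is not restated in this section. The function names are illustrative.

```python
from itertools import combinations


def inner_edges(W, n):
    """Number of hypercube edges with both endpoints in W (vertices are integers below 2^n)."""
    S = set(W)
    return sum(
        1
        for v in S
        for i in range(n)
        if v & (1 << i) == 0 and (v | (1 << i)) in S
    )


def max_inner_edges(n, size):
    """Largest number of internal edges over all vertex sets of the given size (brute force)."""
    return max(inner_edges(W, n) for W in combinations(range(2 ** n), size))


def initial_segments_are_optimal(n):
    """Check that {0, ..., m-1} attains the maximum of |E(W, W)| for every size m."""
    return all(
        inner_edges(range(size), n) == max_inner_edges(n, size)
        for size in range(1, 2 ** n + 1)
    )
```

The search visits every subset of $V_n$, so it is practical only for $n \leq 4$ or so; it confirms that the maximum is attained by a nested configuration, but it does not by itself show that every maximiser is good, which is the content of Lemma 1.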