Bootstrap, Markov Chain Monte Carlo, and LP/SDP hierarchy for the lattice Ising model

Bootstrap is the idea that imposing consistency conditions on a physical system may lead to rigorous and nontrivial statements about its physical observables. In this work, we discuss the bootstrap problem for the invariant measure of the stochastic Ising model, defined as a Markov chain on which probability bounds and invariance equations are imposed. It is described by a linear programming (LP) hierarchy whose asymptotic convergence we establish by explicitly constructing the invariant measure from the convergent sequence of moments. We also discuss the relation between the LP hierarchy for the invariant measure and a recently introduced semidefinite programming (SDP) hierarchy for the Gibbs measure of the statistical Ising model based on reflection positivity and spin-flip equations.


Introduction
The statistical Ising model is defined by a specific probability measure, called the Gibbs measure, over the space of spin configurations on a lattice. 1 Despite the simplicity of its definition, it exhibits surprisingly rich behavior which has driven the development of several important branches of mathematics and physics. In particular, the existence of a phase transition in two and three dimensions provides an outstanding example of dramatic physical phenomena that may take place in infinite-volume systems.
Even though there exist analytic solutions in some special cases [3,4], the statistical Ising model at general temperature and external magnetic field in two and higher dimensions still remains unsolved; for example, the value of the critical temperature in three dimensions is unknown. Traditionally, numerical estimates of various quantities have been obtained using Monte Carlo simulations, where probable spin configurations (on a finite lattice, though) are sampled according to the Gibbs measure. 2 Markov Chain Monte Carlo (MCMC) is one of the standard dynamical procedures defining such sampling; restricted to the Ising model, it is also called the stochastic Ising model.
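As a concrete illustration of the sampling procedure just described, the following is a minimal Python sketch of heat-bath (Glauber) dynamics on a finite periodic 2D lattice. All function names, lattice sizes, couplings, and sweep counts here are our own illustrative choices, not parameters taken from the text; at J = 0 the sites decouple and the magnetization approaches tanh(h).

```python
import numpy as np

def glauber_sweep(spins, J, h, rng):
    """One sweep of heat-bath (Glauber) updates on a periodic 2D lattice."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        # local field from the four nearest neighbors (periodic boundaries)
        nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        # heat-bath rule: set s_ij = +1 with its conditional Gibbs probability
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * (h + J * nn)))
        spins[i, j] = 1 if rng.random() < p_plus else -1

def mean_magnetization(L=8, J=0.0, h=2.0, sweeps=300, burn=100, seed=0):
    rng = np.random.default_rng(seed)
    spins = rng.choice(np.array([-1, 1]), size=(L, L))
    samples = []
    for t in range(sweeps):
        glauber_sweep(spins, J, h, rng)
        if t >= burn:
            samples.append(spins.mean())
    return float(np.mean(samples))
```

The heat-bath rule resamples a single spin from its conditional Gibbs distribution given the neighbors, which is exactly the conditional probability appearing in the definition of the Gibbs measure below.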
Alternatively, there is another approach called "bootstrap," where consistency conditions of the model are imposed and the corresponding consequences are studied. In particular, the conformal bootstrap program has been very successful in studying the continuum theory that arises at criticality in two and three dimensions. Unitarity, conformal symmetry, and the consistency conditions known as crossing equations provide exact solutions in two dimensions [6], and rigorous and highly tight bounds on physical data in three dimensions with the help of semidefinite programming (SDP) [7][8][9][10]. Recently in [11], a different bootstrap approach (labeled BS′2 in this work) was applied directly to the statistical Ising model on the infinite lattice, where reflection positivity and spin-flip equations satisfied by the Gibbs measure were represented as an SDP problem and provided rigorous (and sometimes highly tight) bounds on the spin correlators. It is worth mentioning that the very definition of the Gibbs measure on the infinite lattice by the DLR equations [12,13] allows for such a bootstrap formulation very naturally.
An obvious but essential fact about MCMC is that, by construction, the Gibbs measure is guaranteed to be an invariant measure of MCMC. Moreover, under the assumption of translation invariance, every invariant measure is also Gibbs (see Theorem 2). Therefore, it is natural to pose another bootstrap problem (labeled BS1 in this work) where probability bounds (stating that the measure is a probability measure) and the invariance condition (stating that the measure is invariant under the Markov chain dynamics) are imposed as bootstrap conditions. 3 These conditions must be met by the Gibbs measure and thus should be compatible with the bootstrap problem based on reflection positivity and spin-flip equations, in the sense that they should share common solutions. As we discuss in section 3, BS1 is described by a linear programming hierarchy while BS′2 is described by an SDP hierarchy, where the bootstrap conditions at the lower level are part of those at higher levels in both cases. For any choice of the transition rate for MCMC, the set of invariance equations in BS1 will manifestly be a proper subset of the set of spin-flip equations in BS′2 at each level of the two hierarchies.
The hierarchy of LP/SDP encountered in this work is a special case of the Lasserre hierarchy, which is much studied in the optimization literature. 4 Statistical mechanical systems provide a unique setup for the Lasserre hierarchy where the number of polynomial variables is infinite, as opposed to finite. One immediate question is the convergence of such a hierarchy, and it was conjectured in [11] that the lower and upper bounds on spin correlators obtained from BS′2 converge to each other as the hierarchy level increases, when there is a single phase. As the main result of this work, we will show the asymptotic convergence of the LP hierarchy of BS1 in the sense that the solutions to the LPs converge to moments of an invariant probability measure of MCMC. As an intermediate step, we will also discuss the relevant moment problem over the space of spin configurations on the infinite lattice. A similar convergence statement for BS′2 remains unclear to us at the moment. Instead, we will define the bootstrap problem BS2 by equipping BS′2 with the probability bounds of BS1, which in practice requires only little extra computational cost while the convergence still holds true.

This paper is organized as follows. We first review the definitions and relevant theorems of the stochastic and statistical Ising model in section 2. They will naturally lead to the bootstrap problems BS1 and BS2, which we introduce in section 3. In section 4, we discuss the moment problem for the spin configurations on the infinite lattice and show the convergence of BS1. We provide the bounds obtained by different bootstrap approaches in section 5 and end with further discussions in section 6.

Review of the statistical and stochastic Ising model
In this section, we will review the definitions of the statistical and stochastic Ising model and their relations, and rephrase their properties in terms of polynomial moments. We will mostly follow [2], where the details of the theorems and proofs may be found. Even though this section collects very elementary facts about the statistical and stochastic Ising model, showing that they are all satisfied by the solution of the bootstrap problem to be defined later will be the main result of this work, which provides several interesting implications.

Probability space for the Ising model
In this work, we are going to work on the infinite d-dimensional hypercubic lattice Λ = Z^d. At each lattice site i ∈ Λ, we have a spin degree of freedom s_i ∈ {−1, 1}. The state space S is the set of all possible spin configurations s over the lattice Λ: S = {−1, 1}^Λ. We will be interested in a specific set of probability measures on the sample space S. In order to define the event space, we first define the following.

Definition 1. Let A be a finite subset of Λ, and let u_i ∈ {−1, 1} for i ∈ A be a specific spin assignment over the lattice sites of A. The event E({u_i}_{i∈A}) is defined as the following set of spin configurations:

E({u_i}_{i∈A}) := {s ∈ S : s_i = u_i, ∀i ∈ A}. (2.1)

In other words, E({u_i}_{i∈A}) is the set of all spin configurations whose spins at the lattice sites of A agree with the u_i. Note that the above definition applies to the case A = ∅, for which E(∅) = S. The event space is going to be the union of the events for all finite subsets A ⊂ Λ and all possible spin assignments u_i over them, together with the empty set.
Definition 2. The event space V is the σ-algebra generated by the events E({u_i}_{i∈A}) for all finite subsets A ⊂ Λ and all possible spin assignments {u_i}_{i∈A} over them.
A probability measure over S and V is defined as follows.
Definition 3. For the sample space S and the event space V defined as above, a probability measure over them is a function ρ : V → [0, 1] such that ρ(S) = 1 and ρ is countably additive: for any countable collection {E_a} of pairwise disjoint events, ρ(∪_a E_a) = Σ_a ρ(E_a).

ρ(E_a) has the interpretation of the probability that the event E_a happens. Later, when we try to construct a probability measure for the statistical and stochastic Ising model from the candidate moments obtained by LPs, it will be important to check that all the requirements in the above definition are satisfied.
In order to define the expectation values associated with a probability measure ρ, we introduce the indicator functions.
Definition 4. The indicator function F({u_i}_{i∈A}, s) for the event E({u_i}_{i∈A}) is the polynomial

F({u_i}_{i∈A}, s) := ∏_{i∈A} (1 + u_i s_i)/2. (2.2)

As the name suggests, the indicator function for the event E({u_i}_{i∈A}) evaluated on a spin configuration s ∈ S is equal to 1 if the spin assignments of s agree with {u_i}_{i∈A} over A, and 0 otherwise. The construction of the expectation value then proceeds as usual.
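Concretely, for spins valued in {−1, 1} the indicator of the single-site event s_i = u_i is the polynomial (1 + u_i s_i)/2, and the indicator of E({u_i}_{i∈A}) is the product of these factors. A minimal Python sketch (the function name is ours):

```python
from itertools import product

def indicator(u, s):
    """F({u_i}, s) = prod_i (1 + u_i s_i) / 2, a polynomial in the s_i."""
    val = 1.0
    for ui, si in zip(u, s):
        val *= (1 + ui * si) / 2
    return val
```

It equals 1 exactly when s agrees with u on the chosen sites, and summing the indicators over all assignments on a finite region resolves the identity.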
Definition 5. Given a probability measure ρ over the sample space S and the event space V, and a function f : S → R, the expectation value of f given by ρ is

⟨f(s)⟩ = ∫_S f(s) dρ. (2.3)

The statistical and stochastic Ising model
We are now ready to define the statistical and stochastic Ising model. For any given site i ∈ Λ, its nearest neighbors are the collection of sites n(i) := {j ∈ Λ : |i − j| = 1}. The Ising model is local in the sense that its probability measure is defined using only the nearest neighboring spins.
Definition 6. The Gibbs measure g of the statistical Ising model on the lattice Λ = Z^d at couplings J ∈ R and h ∈ R is a probability measure over the sample space S and the event space V such that: given any lattice site i ∈ Λ, any finite subset T ⊂ Λ such that n(i) ⊂ T and i ∉ T, any spin assignment {u_k}_{k∈T} over T, and any spin assignment u_i at i,

g(E({u_k}_{k∈T∪{i}})) = (1 + exp(−2(h u_i + J Σ_{j∈n(i)} u_i u_j)))^{−1} g(E({u_k}_{k∈T})).

The set of all Gibbs measures at couplings J and h is denoted as Γ_{J,h}.
This definition is equivalent to the traditional one given by the DLR equations [12,13]. In case g(E({u_k}_{k∈T})) ≠ 0, it is equivalent to saying that the conditional probability that the spin s_i at i ∈ Λ takes the value u_i, given the spin assignments {u_k}_{k∈T} over T (which in particular includes the nearest neighbors of i), is (1 + exp(−2(h u_i + J Σ_{j∈n(i)} u_i u_j)))^{−1}. When J ≥ 0 and h ≥ 0, the statistical Ising model is called ferromagnetic, and we are going to focus only on the ferromagnetic case in this work.
The above definition using the conditional probability agrees with the conventional definition of the statistical Ising model on a finite lattice Λ_f (Proposition 1.8 in Chapter IV of [2]), which is described by the partition function

Z_f = Σ_{s∈S_f} exp(J Σ_{(i,j)} s_i s_j + h Σ_{i∈Λ_f} s_i),

and the probability measure

g_f({s}) = Z_f^{−1} exp(J Σ_{(i,j)} s_i s_j + h Σ_{i∈Λ_f} s_i), (2.6)

where (i, j) means that the sum is over all nearest-neighbor pairs (i, j).
It is very important that, depending on the values of J and h (and also the dimension d), there may be more than one Gibbs measure satisfying Definition 6! This is the hallmark of the phase transition, which may take place only on the infinite lattice. Now, we turn to the definition of the stochastic Ising model.

Definition 7. Given the couplings J ∈ R and h ∈ R, the stochastic Ising model is a Markov chain on the state space S such that:

• on every lattice site of Λ = Z^d, a Poisson clock is placed,

• if the current state is given by s ∈ S and the Poisson clock at the site i ∈ Λ rings, the state s makes a transition to another state s′ ∈ S with a strictly positive transition rate c(i, s), where s′_j = s_j, ∀j ∈ Λ \ {i}, and s′_i = −s_i,

• the function c(i, s) exp(h s_i + J Σ_{j∈n(i)} s_i s_j) does not depend on the value of s_i.
Note that we did not specify the transition rate (or the transition probability) c(i, s).
The key idea is that, as long as c(i, s) satisfies the last condition in Definition 7, the objects of interest (which we will introduce soon) will be independent of the specific choice of c(i, s). Independence of the value of s_i is equivalent to saying that the function is even in s_i. Popular choices for c(i, s) are

c(i, s) = exp(−h s_i − J Σ_{j∈n(i)} s_i s_j) and c(i, s) = (1 + exp(2h s_i + 2J Σ_{j∈n(i)} s_i s_j))^{−1}.

Later in section 5, we will work with the choice given in (2.7), which involves a constant C depending on d, J, and h whose details will not matter for us. One possible choice would be C = 1/(1 + exp(4dJ + 2h)).
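The defining property of the transition rate, namely that c(i, s) exp(h s_i + J Σ_{j∈n(i)} s_i s_j) is independent of s_i, can be checked numerically for the two popular choices quoted above. In the sketch below the function names are ours, and nn_sum stands for the sum of neighboring spins Σ_{j∈n(i)} s_j:

```python
import math

def rate_exponential(s_i, nn_sum, J, h):
    """c(i, s) = exp(-h s_i - J s_i * nn_sum)."""
    return math.exp(-h * s_i - J * s_i * nn_sum)

def rate_heat_bath(s_i, nn_sum, J, h):
    """c(i, s) = (1 + exp(2 h s_i + 2 J s_i * nn_sum))^(-1)."""
    return 1.0 / (1.0 + math.exp(2 * h * s_i + 2 * J * s_i * nn_sum))

def balance_factor(rate, s_i, nn_sum, J, h):
    """c(i, s) * exp(h s_i + J s_i * nn_sum): must not depend on s_i."""
    return rate(s_i, nn_sum, J, h) * math.exp(h * s_i + J * s_i * nn_sum)
```

For the exponential rate the factor is identically 1, and for the heat-bath rate it equals 1/(2 cosh(h + J·nn_sum)); in both cases the dependence on s_i cancels, which is the evenness condition of Definition 7.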
When we apply the above definition to the case where Λ is finite, we obtain the traditional Markov chain (sometimes called the Glauber dynamics) which is used to perform the Monte Carlo simulation of the Ising model, known as MCMC. The last condition in Definition 7 is nothing but the detailed balance equation for the probability measure g_f in (2.6). The ergodicity theorem states that g_f is indeed the unique invariant measure of the Markov chain. Of course, for our case where Λ is infinite, the set of invariant measures need not be a singleton.

Definition 8. A probability measure ρ over the sample space S and the event space V is an invariant measure of the stochastic Ising model if

Σ_{i∈Λ} ⟨c(i, s)(f(s^i) − f(s))⟩ = 0, ∀f ∈ D(S),

where s^i ∈ S is defined by s^i_j = s_j, ∀j ∈ Λ \ {i}, and s^i_i = −s_i. We denote by Π_{J,h} the set of all invariant measures of the stochastic Ising model at couplings J and h.
The definition of the space of functions D(S) (sometimes called the core of the Markov chain) can be found in Chapter I of [2]. For us, the only relevant facts about D(S) are that it is a dense subset of the set C(S) of continuous functions on S, and that the set P(S) of polynomials in {s_i}_{i∈Λ} is a subset of D(S). As the name suggests, the invariant measure remains invariant under the time evolution of the Markov chain.
The stochastic Ising model is defined such that the Gibbs measure of the statistical Ising model is a reversible measure.

Definition 9. A probability measure ρ over the sample space S and the event space V is a reversible measure of the stochastic Ising model if

⟨c(i, s)(f(s^i) − f(s))⟩ = 0, ∀i ∈ Λ, ∀f ∈ D(S).

We denote by Ω_{J,h} the set of all reversible measures of the stochastic Ising model at couplings J and h.

Theorem 1. (Theorem 2.14 in Chapter IV of [2]) Given J ∈ R and h ∈ R, Ω_{J,h} = Γ_{J,h}.
Note that a reversible measure is invariant. Furthermore, Theorem 1 says that a reversible measure is a Gibbs measure. This is essentially because the reversibility condition and the conditional probability defining the Gibbs measure are equivalent. Also note that Theorem 1 does not rely on the specific choice of the transition rate c(i, s). This implies that the set of reversible measures is independent of the choice of c(i, s), as long as the latter satisfies the definition of the stochastic Ising model. A natural question is whether there are invariant measures which are not reversible. It was shown in [21] that there are no such measures under the assumption of translation invariance.

Definition 10. Let t_p : Λ → Λ for p = 1, 2, ..., d be the translation of the lattice sites by one unit in the p-th direction: t_p(i) = i + e_p, where e_p is the unit vector along the p-th direction. A probability measure ρ over the sample space S and the event space V is translation invariant if ρ(t_p(E)) = ρ(E) for every event E ∈ V and every p.

Theorem 2. Let ρ be a translation invariant probability measure over the sample space S and the event space V. If ρ is an invariant measure of the stochastic Ising model, then ρ is a reversible measure: ρ ∈ Ω_{J,h} = Γ_{J,h}.

In fact, it can be shown that for d = 1 and d = 2, invariant measures are reversible even in the absence of the translation invariance assumption (see e.g. Chapter IV.5 of [2]). However, as far as we are aware, this is not established for d ≥ 3.

Moments, positivity, invariance, and reversibility
Later when we formulate the bootstrap problems for the Ising model, the information about a probability measure will be expressed in terms of moments.Therefore, we describe the properties of a probability measure discussed so far in terms of moments in this subsection.
Say that we are given a candidate set of polynomial moments ⟨p(s)⟩, ∀p(s) ∈ P(S). The question is, how do we make sure that they correspond to the expectation values of some probability measure ρ satisfying either invariance or reversibility? In the general case of real-valued polynomial moment problems, this type of question remains unsolved. However, as we will see in this work, this question for the Ising model has a definite answer.
We first address the positivity of the candidate measure. Given a candidate set of polynomial moments ⟨p(s)⟩, ∀p(s) ∈ P(S), we know in particular the moments of all the indicator functions, because the indicator functions for events E({u_i}_{i∈A}) are themselves polynomials. Then, the candidate probability measure ρ realizing the given set of moments should satisfy

ρ(E({u_i}_{i∈A})) = ⟨F({u_i}_{i∈A}, s)⟩ ≥ 0

for all events E({u_i}_{i∈A}). This is a natural requirement for the candidate measure ρ, since the value of the measure evaluated on an event has the interpretation of the probability that the event takes place, which in turn should be equal to the expectation value of the corresponding indicator function. Therefore,

Lemma 1. A candidate probability measure ρ over the sample space S and the event space V is positive only if its candidate moments satisfy ⟨F({u_i}_{i∈A}, s)⟩ ≥ 0 for all events E({u_i}_{i∈A}).
Note that the above lemma states only a necessary condition for a probability measure. Such a condition can be readily checked for the candidate moments ⟨p(s)⟩. In contrast to general polynomial moment problems, where the indicator functions are not polynomials and thus require extra conditions to even discuss their moments, the Ising model (and many other statistical models) is particularly simple since the indicator functions are polynomials. Just checking the positivity of the candidate probability measure evaluated on the generators of the event space V is not enough to guarantee that it is indeed a probability measure, since one also has to make sure that countable additivity can be made sense of. We will have further discussions on this in section 4.1.
Next, we turn to the invariance and reversibility conditions for a candidate measure and candidate moments. Given s′ ∈ S and s′′ ∈ S such that s′ ≠ s′′, there exists at least one site i ∈ Λ such that s′_i ≠ s′′_i. The polynomial function s_i then separates the two points s′ and s′′. Therefore, the set P(S) of polynomials in {s_i}_{i∈Λ} is a subalgebra of C(S) which separates points in S. By Tychonoff's theorem, the sample space S = {−1, 1}^{Z^d} is compact under the product topology. The Stone-Weierstrass theorem then implies that P(S) is dense in C(S), and also in D(S). 5 The implication of this fact is that invariance and reversibility of a measure, which by definition require considering the expectation values of arbitrary functions in D(S) and C(S), can be checked by considering only the polynomial moments.

Lemma 2. A probability measure ρ over the sample space S and the event space V is an invariant measure of the stochastic Ising model if and only if its polynomial moments satisfy

Σ_{i∈Λ} ⟨c(i, s)(f(s^i) − f(s))⟩ = 0, ∀f ∈ P(S). (2.11)

Lemma 3. A probability measure ρ over the sample space S and the event space V is a reversible measure of the stochastic Ising model if and only if its polynomial moments satisfy

⟨c(i, s)(f(s^i) − f(s))⟩ = 0, ∀i ∈ Λ, ∀f ∈ P(S). (2.12)

It may not be immediately obvious how c(i, s)(f(s^i) − f(s)) may be expressed as a polynomial. This is essentially because the spin variable s_i at each site i ∈ Λ can take values only in {−1, 1}, and c(i, s) is a local expression around the site i involving only the nearest neighbors, so that any reasonable choice of c(i, s) (such as the ones discussed around (2.7)) can be equivalently written as a polynomial in a finite number of spin variables. Note also that for polynomial f, only finitely many terms in the sum in (2.11) are nonzero, since flipping a spin outside the support of f leaves f unchanged. Therefore, (2.11) and (2.12) are indeed equations for polynomial moments.
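The moment form of reversibility can be verified on a small finite system, where the Gibbs measure is available by exact enumeration. The sketch below (our own construction: a periodic chain of 6 spins with arbitrary couplings and the exponential choice of c(i, s)) computes the residuals ⟨c(i, s)(f(s^i) − f(s))⟩ for the finite-volume Gibbs measure, which vanish by detailed balance:

```python
import math
from itertools import product

N, J, h = 6, 0.4, 0.2          # small periodic chain, exact enumeration

def weight(s):                  # unnormalized Gibbs weight exp(J sum s_i s_j + h sum s_i)
    return math.exp(sum(J * s[k] * s[(k + 1) % N] for k in range(N)) + h * sum(s))

configs = list(product([-1, 1], repeat=N))
Z = sum(weight(s) for s in configs)

def expect(f):
    return sum(f(s) * weight(s) for s in configs) / Z

def flip(s, i):
    t = list(s)
    t[i] = -t[i]
    return tuple(t)

def rate(s, i):                 # c(i, s) = exp(-h s_i - J s_i (s_{i-1} + s_{i+1}))
    return math.exp(-h * s[i] - J * s[i] * (s[(i - 1) % N] + s[(i + 1) % N]))

# reversibility in moment form: <c(i, s) (f(s^i) - f(s))> = 0 at every site
f = lambda s: s[0] * s[1]
residuals = [expect(lambda s: rate(s, i) * (f(flip(s, i)) - f(s))) for i in range(N)]
```

The same enumeration with the sum over sites reproduces the invariance condition, which is implied by reversibility via linearity.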

Bootstrap problems for the stochastic and statistical Ising model
In this section, we formulate the bootstrap problems for the stochastic and statistical Ising model. Such a formulation is very natural from the definitions of the stochastic and statistical Ising model for two reasons. The first is that the object of interest is a probability measure, whose positivity is a crucial defining property. The second is that any such measure satisfying a given set of equations (invariance or reversibility) is physical. The combination of positivity and equations provides a bootstrap-friendly setup, and it is thus expected that imposing them over the set of candidate measures would lead to rigorous and nontrivial results about the space of physical measures.
We begin by introducing some notation. Given the lattice Λ = Z^d, we denote by H_n := {−n, −n + 1, ..., 0, ..., n − 1, n}^d ⊂ Λ the hypercube of size 2n + 1 centered at the origin. We then denote by D_n the dual polytope of the hypercube H_n. For example, it is a diamond in d = 2 and an octahedron in d = 3. The hierarchy of LP/SDP for the bootstrap problems originates in part from the hierarchy of D_n.
Given two subsets A ⊂ Λ and B ⊂ Λ, we write A ∼ B if they can be transformed into each other by a symmetry transformation of the lattice Λ = Z d (which are generated by translations, rotations, and reflections).This defines an equivalence relation on the set of finite subsets of Λ.
Given any finite subset A ⊂ Λ, we define the monomials s_A := ∏_{i∈A} s_i, and we also define s_∅ := 1. For each n, we further define P_n := {Σ_{A⊂D_n} t_A s_A : t_A ∈ R}, the set of polynomials in spin variables restricted to D_n. In the hierarchy of LP/SDP, the level n LP/SDP will impose constraints on candidate moments for polynomials in P_n. Such candidate moments will be denoted as m_n : P_n → R.

Bootstrapping the invariant measure of the stochastic Ising model
We now introduce the hierarchy of LPs which provides a series of rigorous bounds on the objective moment of the invariant measure of the stochastic Ising model.
Definition 11. Given p ∈ P_m for some m ∈ N, we define the bootstrap problem BS1(p) as the following hierarchy of LPs: For each n ∈ N (called the level of the LP hierarchy) such that n ≥ m, we have the LP problem LP(p, n) of minimizing m_n(p) over the space of candidate moments m_n : P_n → R satisfying the following conditions:

• Probability bound. For all the spin assignments {u_i}_{i∈D_n} over D_n, 0 ≤ m_n(F({u_i}_{i∈D_n}, s)), where F({u_i}_{i∈D_n}, s) is the corresponding indicator function.

• Linearity. Given any polynomials q_1 ∈ P_n and q_2 ∈ P_n, with λ ∈ R, their moments satisfy linearity: m_n(q_1 + λq_2) = m_n(q_1) + λm_n(q_2).

• Invariance. For any polynomial f ∈ P_{n−1}, the moments satisfy invariance with respect to the transition rate c(i, s) of the stochastic Ising model in (2.7):

Σ_{i∈D_{n−1}} m_n(d_i^f) = 0, (3.1)

where d_i^f := c(i, s)(f(s^i) − f(s)).

The minimum of m_n(p) obtained by LP(p, n) will be denoted as ⟨p⟩*_n. The corresponding candidate moments m_n(q) for polynomials q ∈ P_n realizing such a minimum (which may not be unique) will be denoted as ⟨q⟩*_n.
A few comments are in order. Firstly, the invariance condition written above makes sense because f ∈ P_{n−1} and the transition rate c(i, s) depends only on the nearest neighbors of the site i ∈ D_{n−1}, so that d_i^f indeed is an element of P_n, and (3.1) therefore is a linear equation on the moments m_n : P_n → R. In fact, the very existence of the hierarchy of LPs for the stochastic Ising model is due to locality, where the invariance equations involve only nearest-neighbor expressions. It is also worth mentioning that we could replace the invariance condition by the reversibility condition:

m_n(c(i, s)(f(s^i) − f(s))) = 0, ∀i ∈ D_{n−1}, ∀f ∈ P_n. (3.2)

Theorem 2 implies that this condition is obeyed by any invariant measure of the stochastic Ising model with the transition rate c(i, s) respecting the symmetries of the lattice. We will see later that the invariance condition is already sufficient for the convergence of BS1(p), and the resulting measure will be not only invariant but also reversible (which is equivalent to Gibbs).
Secondly, the above LP problem LP(p, n) is always feasible, because the measure g_f in (2.6) for the statistical Ising model on a large enough but finite torus satisfies all the conditions. Of course, the Gibbs measure of the statistical Ising model on the infinite lattice (whose existence was established long ago) also satisfies all the conditions of LP(p, n) for any n.
Let us compare BS1(p) to the traditional K-moment problem [23], where there would be a variable x_i ∈ R at each lattice site and the moment m′ would map polynomials in the x_i (of any positive integer power) to R. The constraint x_i^2 = 1 would then be imposed by m′((x_i^2 − 1) f(x)) = 0 for all i ∈ Λ and all sum-of-squares functions f(x). This is indeed how SDP was formulated for the 0-1 problem in [18], for example. BS1(p) instead imposes x_i^2 = 1 directly within m′(•) and thus considers polynomials which are at most linear in each x_i.

⟨p⟩*_n for any n provides a rigorous lower bound on the expectation value ⟨p⟩ of any invariant measure respecting all the symmetries of the lattice, for the stochastic Ising model with the transition rate c(i, s). One may use any c(i, s) for the stochastic Ising model, as long as it allows for a polynomial expression, and still obtain rigorous lower bounds on the expectation value ⟨p⟩. Of course, one can obtain rigorous upper bounds simply by studying the analogous LP problem of maximizing m_n(p).
All the conditions of LP(p, n) are a subset of the conditions of LP(p, k) when k ≥ n. Therefore, the obtained lower bounds can only increase as we increase the level n of the LP hierarchy: ⟨p⟩*_n ≤ ⟨p⟩*_k, ∀k ≥ n. Later, we will discuss its convergence to the expectation value of an extremal Gibbs measure.
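To make the hierarchy concrete, the following is our own toy setup of the level n = 1 problem LP(p, 1) for the d = 1 stochastic Ising model with objective p = s_0, using scipy. It imposes normalization, lattice symmetry, the probability bounds on D_1 = {−1, 0, 1}, and the single nontrivial invariance equation for f = s_0 with the rate choice c(i, s) = exp(−h s_i − J Σ_j s_i s_j). This is not the setup used in section 5; the resulting bounds are valid but not tight, and must bracket the exact infinite-chain magnetization since the Gibbs measure is feasible.

```python
import numpy as np
from itertools import combinations, product
from scipy.optimize import linprog

J, h = 0.4, 0.2
SITES = (-1, 0, 1)                                    # D_1 for d = 1
MONOS = [frozenset(c) for r in range(4) for c in combinations(SITES, r)]
IDX = {m: k for k, m in enumerate(MONOS)}

def pmul(p, q):
    """Multiply polynomials (dict: monomial -> coeff), using s_i^2 = 1."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            m = a ^ b                                 # symmetric difference
            out[m] = out.get(m, 0.0) + ca * cb
    return out

def exp_poly(t, mono):
    """exp(t * s_A) = cosh(t) + sinh(t) s_A, since s_A takes values +-1."""
    return {frozenset(): np.cosh(t), frozenset(mono): np.sinh(t)}

def row(poly):
    r = np.zeros(len(MONOS))
    for m, coef in poly.items():
        r[IDX[m]] += coef
    return r

# rate c(0, s) = exp(-h s_0 - J s_0 s_{-1} - J s_0 s_1) as a polynomial
c0 = pmul(pmul(exp_poly(-h, {0}), exp_poly(-J, {-1, 0})), exp_poly(-J, {0, 1}))
# invariance for f = s_0:  f(s^0) - f(s) = -2 s_0
inv = pmul(c0, {frozenset({0}): -2.0})

A_eq = [row({frozenset(): 1.0}), row(inv)]            # m(1) = 1, invariance
b_eq = [1.0, 0.0]
for a, b in [({-1}, {0}), ({0}, {1}), ({-1, 0}, {0, 1})]:  # lattice symmetry
    A_eq.append(row({frozenset(a): 1.0, frozenset(b): -1.0}))
    b_eq.append(0.0)

A_ub = []                                             # -m(F(u, s)) <= 0
for u in product([-1, 1], repeat=3):
    F = {frozenset(): 1.0}
    for i, ui in zip(SITES, u):
        F = pmul(F, {frozenset(): 0.5, frozenset({i}): 0.5 * ui})
    A_ub.append(-row(F))
b_ub = np.zeros(len(A_ub))

obj = row({frozenset({0}): 1.0})                      # objective <s_0>
lo = linprog(obj, A_ub, b_ub, A_eq, b_eq, bounds=(-1, 1)).fun
hi = -linprog(-obj, A_ub, b_ub, A_eq, b_eq, bounds=(-1, 1)).fun
```

The exact d = 1 magnetization sinh(h)/√(sinh²(h) + e^{−4J}) then lies between lo and hi; sharper bounds require higher levels n.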

Bootstrapping the Gibbs measure of the statistical Ising model
In this subsection, we review the bootstrap problem BS′2 proposed in [11] for the Gibbs measure of the statistical Ising model and discuss the related bootstrap problem BS2, which will be shown to converge later. BS′2 is mainly based on two properties of the Gibbs measure: reflection positivity and spin-flip equations, both of which are explained in full detail in [11]. We provide a brief summary of the two below.
For the lattice Λ = Z^d, consider reflections across hyperplanes {x ∈ R^d : v · x = c} with v ∈ {e_µ, e_µ + e_ν} and 2c ∈ Z, where e_µ is the unit vector along the µ-th direction. Up to rotations and translations by integer units, there are three inequivalent reflections preserving the lattice (except for d = 1, where there are only two inequivalent reflections). They are denoted as R_{v,c}, where the pair (v, c) consists of a vector v on the lattice and a constant c, and R_{v,c} acts on a site i ∈ Λ by reflecting it across the hyperplane v · x = c. The three inequivalent reflections are given by R_{v,c} with (v, c) ∈ κ := {(e_1, 0), (e_1, 1/2), (e_1 + e_2, 0)}, where the last reflection is absent for d = 1. Reflection positivity states that the expectation value ⟨•⟩ of the Gibbs measure satisfies ⟨f(s) R_{v,c}(f)(s)⟩ ≥ 0 for any polynomial f supported on one side of the reflection hyperplane, where R_{v,c}(f) is obtained from f by reflecting the lattice sites of its spin variables.

Spin-flip equations can be most easily seen from the Gibbs measure on the finite lattice, g_f in (2.6). When evaluating the expectation value of a function using g_f, a sum over all possible spin configurations {u_i}_{i∈Λ} is performed. Since the spin value u_i at each site i ∈ Λ is summed over both −1 and 1, the expectation value should be the same if one makes the change of variable u_i → −u_i. This produces the spin-flip equations, which can be extended to the infinite lattice case:

⟨f(s)⟩ = ⟨f(s^i) exp(−2(h s_i + J Σ_{j∈n(i)} s_i s_j))⟩, ∀i ∈ Λ.

We now define the bootstrap problem BS2, which is a small extension of the bootstrap problem BS′2 in [11], for the Gibbs measure as follows:

Definition 12. Given p ∈ P_m for some m ∈ N, we define the bootstrap problem BS2(p) as the following hierarchy of SDPs: For each n ∈ N (called the level of the SDP hierarchy) such that n ≥ m, we have the SDP problem SDP(p, n) of minimizing m_n(p) over the space of candidate moments m_n : P_n → R satisfying the following conditions:

• Reflection positivity. For each (v, c) ∈ κ, define the matrix M^{v,c}_n by its matrix elements (M^{v,c}_n)_{A,B} := m_n(s_A s_{R_{v,c}(B)}), where A and B run over the subsets of D_n on the non-negative side of the reflection hyperplane. Then these matrices should satisfy reflection positivity, M^{v,c}_n ⪰ 0.

• Probability bound. For all the spin assignments {u_i}_{i∈D_n} over D_n, 0 ≤ m_n(F({u_i}_{i∈D_n}, s)), where F({u_i}_{i∈D_n}, s) is the corresponding indicator function.
• Linearity. Given any polynomials q_1 ∈ P_n and q_2 ∈ P_n, with λ ∈ R, their moments satisfy linearity: m_n(q_1 + λq_2) = m_n(q_1) + λm_n(q_2).

• Spin-flip equation. For all i ∈ D_{n−1} and f ∈ P_n, the moments satisfy the spin-flip equations:

m_n(f(s)) = m_n(f(s^i) exp(−2(h s_i + J Σ_{j∈n(i)} s_i s_j))),

where the RHS is a polynomial moment due to s_j^2 = 1, ∀j ∈ Λ.
The minimum of m_n(p) obtained by SDP(p, n) will be denoted as ⟨p⟩#_n. The corresponding candidate moments m_n(q) for polynomials q ∈ P_n realizing such a minimum (which may not be unique) will be denoted as ⟨q⟩#_n.
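On a finite periodic lattice, the spin-flip equations hold exactly and can be checked by direct enumeration. The sketch below (our own illustration: a chain of 5 spins with arbitrary couplings) verifies the change-of-variables identity ⟨f(s)⟩ = ⟨f(s^i) exp(−2h s_i − 2J Σ_{j∈n(i)} s_i s_j)⟩ at a single site:

```python
import math
from itertools import product

N, J, h = 5, 0.3, 0.1           # periodic chain, exact enumeration

def weight(s):
    return math.exp(sum(J * s[k] * s[(k + 1) % N] for k in range(N)) + h * sum(s))

configs = list(product([-1, 1], repeat=N))
Z = sum(weight(s) for s in configs)
expect = lambda f: sum(f(s) * weight(s) for s in configs) / Z

def flip(s, i):
    t = list(s)
    t[i] = -t[i]
    return tuple(t)

# spin-flip identity at site i, from the change of variables u_i -> -u_i:
# <f(s)> = <f(s^i) exp(-2 h s_i - 2 J s_i (s_{i-1} + s_{i+1}))>
i = 0
f = lambda s: s[0] * s[2]
lhs = expect(f)
rhs = expect(lambda s: f(flip(s, i))
             * math.exp(-2 * h * s[i] - 2 * J * s[i] * (s[(i - 1) % N] + s[(i + 1) % N])))
```

Since s_i and the neighbor sum take finitely many values, the exponential factor on the right-hand side can be rewritten as a polynomial in the spins, which is what makes the equation usable as an SDP constraint on moments.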
The bootstrap problem BS′2 in [11] is the same as BS2, except that the probability bound condition was not imposed. It can be checked that reflection positivity alone does not imply the probability bound within the domain D_n. In d = 1, the combination of reflection positivity and spin-flip equations still does not imply the probability bound. In contrast, in d = 2, it was empirically observed in [11] that the same combination implies square positivity, which we will later show to be equivalent to the probability bound. In any case, adding probability bounds to the SDP does not increase the computational cost significantly, since they are merely many 1 × 1 inequalities rather than a large irreducible matrix inequality.
Similar to the previous discussion of BS1, the existence of the SDP hierarchy for BS2 is due to the local nature of the spin-flip equations, which involve only nearest-neighbor expressions. Also, the feasibility of BS2 is guaranteed by the existence of the Gibbs measure on the infinite lattice. The sequence of minima ⟨p⟩#_n gives rigorous lower bounds which can only increase as n increases. In [11], it was observed that well away from criticality in d = 2, BS′2 produces lower and upper bounds for the nearest-spin correlator ⟨s_i s_{i+e_1}⟩ which are very close to each other already at n = 2, where the gap between the two was sometimes as small as 10^{−15}.
BS1 and BS2 differ in terms of the equations imposed on the candidate measure, and the latter further imposes reflection positivity. Nonetheless, they should be compatible, because the Gibbs measure on the infinite lattice provides a feasible solution to both of them. By Theorem 1, one may expect that BS2 is stronger than BS1, since every Gibbs/reversible measure is invariant. Indeed:

Lemma 4. Any candidate moments m_n satisfying the conditions of SDP(p, n) also satisfy the conditions of LP(p, n).

Proof) Applying the spin-flip equation to the polynomial c(i, s)f(s) for f(s) ∈ P_n and i ∈ D_{n−1}, and using that c(i, s) exp(h s_i + J Σ_{j∈n(i)} s_i s_j) is even in s_i by definition, it is straightforward to derive

m_n(c(i, s)(f(s^i) − f(s))) = 0,

which is the reversibility equation (3.2). The latter then implies the invariance equations by linearity. ■

Lemma 4 shows that any solution of SDP(p, n) is feasible for LP(p, n). In particular, ⟨p⟩*_n ≤ ⟨p⟩#_n. It should be noted, though, that there are considerably more spin-flip equations than invariance equations at each level of the hierarchy, which may lead to a bigger scale-separation issue for BS2 (this issue will be discussed further in section 5). In contrast, even if LP(p, n) is further equipped with spin-flip equations, it is safer from the scale-separation issue, since LP is in general less sensitive to it than SDP. We will consider different combinations of positivity and equations later in section 5.

Asymptotic convergence of BS 1
In this section, we show that as the level n of the LP hierarchy BS1 increases, one can find a convergent subsequence of moments {⟨q⟩*_n}_{n∈N} for q ∈ P(S), where the convergent limit corresponds to the moments of an invariant measure of the stochastic Ising model. Theorem 2 then implies that this measure is also a Gibbs measure. Also, BS2 converges in the same sense by Lemma 4.
There are two steps in the proof. The first step is to show that the candidate moments indeed come from a valid probability measure, a problem often called "the moment problem." The second step is to make sure that such a measure is indeed an invariant measure of the stochastic Ising model respecting the symmetries of the lattice. We will obtain the desired result by explicitly constructing a probability measure realizing the candidate moments produced by the LP. Since the indicator functions corresponding to the generators of the event space V are finite polynomials, the value of the measure evaluated on such events can be naturally associated with the candidate polynomial moments of the corresponding indicator functions obtained from the LP. This natural prescription will indeed be shown to define a consistent probability measure.

Moment problem on S
Establishing a moment problem over a general sample and event space is very difficult, and the answers are known only in some special cases, such as the Hamburger moment problem or the K-moment problem. In this subsection, we will see that statistical mechanical systems are particularly well-suited for formulating the moment problem. 6 Even though we present only the case of the Ising model, the ideas can be straightforwardly generalized to other statistical mechanical systems.
We begin by explaining the moment problem on a finite lattice. 7

Theorem 3. Consider a finite subset Λ_f ⊂ Λ = Z^d. Denote the space of spin configurations over Λ_f by S_f = {−1, 1}^{Λ_f} and the corresponding event space by V_f. Let P(S_f) be the space of polynomials of spin variables over S_f. A candidate moment m_{Λ_f} : P(S_f) → R is a moment of a probability measure over the sample space S_f and the event space V_f if and only if it satisfies:
• Probability bound. For all spin assignments {u_i}_{i∈Λ_f}, 0 ≤ m_{Λ_f}(F({u_i}_{i∈Λ_f}, s)), where F({u_i}_{i∈Λ_f}, s) is the corresponding indicator function.
• Linearity. Given any polynomials q_1 ∈ P(S_f) and q_2 ∈ P(S_f), with λ ∈ R, their moments satisfy linearity: m_{Λ_f}(q_1 + λ q_2) = m_{Λ_f}(q_1) + λ m_{Λ_f}(q_2).
• Unit normalization. m_{Λ_f}(1) = 1.

6 Discussions on the moment problem of statistical mechanical systems can be found for example in [24]. 7 An equivalent problem was discussed in [24], and similar problems where the sample space is given by a finite product of a finite set have appeared in various places, such as the 0-1 problem and the MAX-CUT problem; see e.g. [18].
Proof) The "only if" part is trivial, since moments of non-negative functions for a probability measure are non-negative. For the "if" part, we explicitly construct a probability measure ρ_{Λ_f} giving rise to the moment m_{Λ_f}. Since the moment is defined on all polynomials of spin variables over Λ_f, it is defined in particular on the indicator functions (2.2) F({u_i}_{i∈A}, s) for all A ⊂ Λ_f. The event space V_f is generated by the events E({u_i}_{i∈A}, s) defined in (2.1), where spin assignments u_i are specified over a subset A ⊂ Λ_f. We define ρ_{Λ_f} by its value on these generating events: ρ_{Λ_f}(E({u_i}_{i∈A}, s)) := m_{Λ_f}(F({u_i}_{i∈A}, s)). We extend the definition linearly: given pairwise disjoint generating events E({u_i^{(t)}}_{i∈A^{(t)}}, s), we set ρ_{Λ_f}(∪_t E({u_i^{(t)}}_{i∈A^{(t)}}, s)) := Σ_t m_{Λ_f}(F({u_i^{(t)}}_{i∈A^{(t)}}, s)). The values of ρ_{Λ_f} on the complement events are defined by ρ_{Λ_f}(E^c) := 1 − ρ_{Λ_f}(E). This definition is consistent in that, if two sets of pairwise disjoint events have coinciding unions, ρ_{Λ_f} evaluated on them agrees. This is due to the assumption m_{Λ_f}(1) = 1, linearity, and the properties of the indicator functions, together with the fact that the sample and event spaces under consideration are finite. This determines ρ_{Λ_f} completely, and finite additivity of ρ_{Λ_f} naturally follows.
It remains to show that ρ_{Λ_f} is non-negative and bounded from above by 1. By definition, summing all the indicator functions corresponding to the events where every spin over Λ_f is specified yields the constant function 1: Σ_{{u_i}_{i∈Λ_f}} F({u_i}_{i∈Λ_f}, s) = 1. Since every summand is non-negative by the probability bound assumption, linearity and unit normalization imply that any partial sum of the m_{Λ_f}(F({u_i}_{i∈Λ_f}, s)) is bounded from above by 1, leading to 0 ≤ Σ_t ρ_{Λ_f}(E({u_i^{(t)}}_{i∈A^{(t)}}, s)) ≤ 1 for all pairwise disjoint events E({u_i^{(t)}}_{i∈A^{(t)}}, s). Since m_{Λ_f}(1) = 1 by assumption, ρ_{Λ_f} evaluated on the complement events is also bounded from below by 0 and from above by 1. This completes the proof. ■

Now, we extend the probability measure ρ_{Λ_f} constructed above to a probability measure ρ over the sample space S and the event space V on the infinite lattice Λ = Z^d, using the Kolmogorov extension theorem for stochastic processes. The key idea of the extension theorem is that if probability measures defined on the finite subsets of an infinite set are compatible with each other in the sense explained below, then there is guaranteed to exist a probability measure on the infinite set which agrees with the probability measures on the finite subsets when restricted to those finite subsets.

Theorem 4. A candidate moment m : P(S) → R is a moment of a probability measure if it satisfies linearity and unit normalization as in Theorem 3, together with:
• Probability bound. 0 ≤ m(F({u_i}_{i∈A}, s)) for any spin assignments {u_i}_{i∈A} over any finite subset A ⊂ Λ.
Proof) Since Λ = Z^d is countable, we can consider a sequence {r_i}_{i∈N}, where r_i ∈ Λ and r_i ≠ r_j for i ≠ j, such that ∪_{i∈N} {r_i} = Λ. Given N ∈ N, consider the subsequence R_N := {r_1, r_2, ..., r_N}. Taking R_N as Λ_f in Theorem 3, we obtain a valid probability measure ρ^{r_1,...,r_N}_{R_N} over the sample space {−1, 1}^{R_N} and the corresponding event space V_{R_N}, as defined in the proof of Theorem 3: ρ^{r_1,...,r_N}_{R_N}(E({u_i}_{i∈A}, s)) := m(F({u_i}_{i∈A}, s)) for all A ⊂ R_N and spin configurations {u_i}_{i∈A} over it. This definition extends linearly and specifies the probability measure ρ^{r_1,...,r_N}_{R_N} completely, as outlined in the proof of Theorem 3. Given any permutation π on the set {1, 2, ..., N}, the same prescription defines a valid probability measure ρ^{r_{π(1)},...,r_{π(N)}}_{R_N}. These probability measures are then manifestly permutation invariant.

Furthermore, given any N′ > N, we have ρ^{r_1,...,r_{N′}}_{R_{N′}}(E({u_i}_{i∈A}, s)) = ρ^{r_1,...,r_N}_{R_N}(E({u_i}_{i∈A}, s)) for any A ⊂ R_N and spin configurations {u_i}_{i∈A} over it. This implies that, given the joint probability measure ρ^{r_1,...,r_{N′}}_{R_{N′}}, the marginal probability measure obtained by summing over the spin values on {r_{N+1}, ..., r_{N′}} is given by ρ^{r_1,...,r_N}_{R_N}.
The above two properties of ρ^{r_1,...,r_N}_{R_N}, permutation invariance and marginality, are the sufficient conditions for the Kolmogorov extension theorem, which states that there is a probability measure ρ over the sample space S and the event space V on the infinite lattice ∪_{i∈N} {r_i} = Λ = Z^d whose marginals are given by the ρ^{r_1,...,r_N}_{R_N}: ρ(E({u_i}_{i∈A}, s)) = ρ^{r_1,...,r_N}_{R_N}(E({u_i}_{i∈A}, s)) for all N ∈ N, A ⊂ R_N, and spin configurations {u_i}_{i∈A} over A. By construction, m : P(S) → R is the moment of the probability measure ρ. ■

Probability bounds are the minimal positivity requirements for the existence of a measure realizing the candidate moments. It turns out that they are equivalent to another familiar positivity condition, square positivity.

Lemma 5. Given a candidate moment m_{Λ_f} : P(S_f) → R satisfying the linearity and unit normalization of Theorem 3, the following two conditions are equivalent:
• Probability bound. For all spin assignments {u_i}_{i∈Λ_f}, 0 ≤ m_{Λ_f}(F({u_i}_{i∈Λ_f}, s)).
• Square positivity. For any polynomial q ∈ P(S_f), the moment of its square is non-negative: m_{Λ_f}(q^2) ≥ 0.

Proof) Since every indicator function squares to itself, square positivity trivially implies the probability bound. For the opposite direction, note that the F({u_i}_{i∈Λ_f}, s) for all {u_i}_{i∈Λ_f} provide a complete basis of P(S_f). Therefore, we can expand any q ∈ P(S_f) as q(s) = Σ_u t_u F({u_i}_{i∈Λ_f}, s) with t_u ∈ R. By definition, the product of indicator functions corresponding to pairwise disjoint events vanishes. Therefore, m_{Λ_f}(q^2) = Σ_u t_u^2 m_{Λ_f}(F({u_i}_{i∈Λ_f}, s)) ≥ 0, which is the desired result. ■

Similarly, Theorem 4 holds true if the probability bound is replaced by square positivity. Lemma 5 implies that the LP problem BS 1 can be equivalently formulated as an SDP problem where square positivity is imposed instead of the probability bound. This is because, defining the matrix M via its matrix elements M_{A,B} = m(s^{(A∪B)\(A∩B)}), where A ⊂ Λ_f and B ⊂ Λ_f, square positivity is equivalent to M ⪰ 0, which is an SDP constraint. However, there is no advantage in doing so, because LP is much faster and cheaper than the equivalent SDP in this case.
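Lemma 5 can be made concrete on a small example. The sketch below uses a two-site toy system, with an arbitrary probability vector standing in for the candidate moments; it builds the matrix M_{A,B} = m(s^{(A∪B)\(A∩B)}), checks that it is positive semidefinite, and checks that the moments of the indicator functions recover the point probabilities:

```python
import itertools
import numpy as np

# Any nonnegative, normalized vector on {-1,1}^2 defines a valid measure.
rng = np.random.default_rng(1)
p = rng.random(4)
p /= p.sum()
configs = list(itertools.product([-1, 1], repeat=2))

def moment(A):
    """m(s^A) = sum_s p(s) * prod_{i in A} s_i."""
    return sum(pi * np.prod([s[i] for i in A]) for pi, s in zip(p, configs))

subsets = [(), (0,), (1,), (0, 1)]
# Moment matrix M_{A,B} = m(s^{A symdiff B}); square positivity <=> M >= 0.
M = np.array([[moment(tuple(sorted(set(A) ^ set(B)))) for B in subsets]
              for A in subsets])
print(np.linalg.eigvalsh(M).min() >= -1e-12)  # True: M is PSD

# Probability bounds: the moment of each indicator F({u_i}, s),
# expanded as (1/4) sum_A (prod_{i in A} u_i) s^A, recovers p(u) >= 0.
for u, pu in zip(configs, p):
    ind = sum(moment(A) * np.prod([u[i] for i in A]) for A in subsets) / 4
    assert abs(ind - pu) < 1e-12
```

Since the moments here come from a true measure, M is a convex combination of rank-one matrices and hence PSD; conversely, an LP solver need only impose the (here four) indicator-moment bounds.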
Theorem 4 not only shows the existence of a probability measure ρ realizing the candidate moments, but is also constructive, in that ρ evaluated on any event can be expressed in terms of the moments of the indicator functions. For example, given an infinite sequence of disjoint events {E_k}_{k∈N}, ρ evaluated on the partial union ∪_{k=1}^{n} E_k is bounded from above by 1. Therefore, the limit of ρ(∪_{k=1}^{n} E_k) as n → ∞ exists and equals what the countable additivity of ρ predicts. Similarly, given an infinite sequence of strictly descending events E_1 ⊋ E_2 ⊋ E_3 ⊋ ..., the sequence ρ(E_k) is non-increasing and bounded from below by 0. Therefore, the limit of ρ(E_k) as k → ∞ exists, and this for example defines the value of ρ evaluated on an event where spin values on infinitely many lattice sites are specified. Even though we expect such a value to be essentially 0 for the Ising model, it may even be 1 for extreme cases like a Dirac measure on S. This illustrates the point that the moment problem discussed above concerns the space of all possible probability measures on the sample space S and the event space V, while the probability measure of our interest is specifically that of the Ising model. We now address how the symmetry and invariance conditions of BS 1 pin down the invariant/reversible/Gibbs measure of the Ising model within the space of all probability measures on S.

Asymptotic convergence of the Ising bootstrap
In this subsection, we show that the bootstrap problem BS 1 converges as the level n of the LP hierarchy increases. Two main ingredients for the proof have already been presented: the moment problem in Theorem 4 and the polynomial representation of the invariance equations in Lemma 2. The rest of the proof follows the usual steps. 8

Theorem 5. Consider the bootstrap problem BS 1 (p) with LP hierarchy LP(p, n) for p ∈ P_m. Recall that the minimum of m_n(p) obtained by LP(p, n) is denoted by ⟨p⟩*_n, and the corresponding candidate moments m_n(q) of the polynomials q ∈ P_n are denoted by ⟨q⟩*_n. For l ∈ N, define the sequence N_{m,l} = {max(m, l), max(m, l) + 1, max(m, l) + 2, ...}. Then, the limit ⟨p⟩_∞ := lim_{n→∞} ⟨p⟩*_n exists. Furthermore, there exists an invariant measure ρ of the stochastic Ising model with the transition rate c(i, s) which respects the lattice symmetries, whose corresponding expectation values ⟨•⟩ satisfy ⟨p⟩ = ⟨p⟩_∞. Finally, given any other invariant measure ρ′ of the stochastic Ising model with the transition rate c(i, s) respecting the lattice symmetries, ⟨p⟩ ≤ ⟨p⟩′, where ⟨p⟩′ is the expectation value of p given by ρ′.

Proof) Square positivity (which follows from probability bounds by Lemma 5) and unit normalization imply that −1 ≤ ⟨s_A⟩*_n ≤ 1 for any A ⊂ D_l and any n ∈ N_{m,l}. Therefore, {⟨s_A⟩*_n}_{n∈N_{m,l}} is a bounded sequence in R^{2^{|D_l|}} and thus has a convergent subsequence {⟨s_A⟩*_n}_{n∈Q} with limiting values ⟨s_A⟩_∞. By continuity, the ⟨s_A⟩_∞ as candidate moments satisfy all the conditions of Theorem 3 with Λ_f = D_l. Therefore, we can construct a probability measure ρ_l on the sample space {−1, 1}^{D_l} and corresponding event space by declaring that its moments are given by ⟨s_A⟩_∞, A ⊂ D_l. Furthermore, for all l_2 > l_1 ≥ l, we can similarly define ρ_{l_1} and ρ_{l_2} such that ρ_{l_1} is the marginal probability measure of ρ_{l_2}. Then, following the proof of Theorem 4, there is a probability measure ρ on the sample space S and the event space V such that its marginal
probability measures are {ρ_{l′}}_{l′≥l}. Since each lattice symmetry constraint involves only finitely many moments, ρ respects the lattice symmetries by continuity. Similarly, each invariance equation involves only finitely many moments, and thus the moments of ρ satisfy the invariance equations in Lemma 2 with the transition rate c(i, s) by continuity. Therefore, ρ is an invariant measure of the stochastic Ising model with the transition rate c(i, s), and the corresponding expectation value ⟨s_A⟩ of s_A for any A ⊂ D_l agrees with that given by the finite marginal probability measure: ⟨s_A⟩ = ⟨s_A⟩_∞. As discussed below Definition 11 of BS 1 (p), the sequence {⟨p⟩*_n} is a non-decreasing sequence in R. Square positivity, unit normalization, and linearity also imply that the sequence is bounded from above. Therefore, its limit ⟨p⟩_∞ exists and coincides with the corresponding moment of ρ: ⟨p⟩ = ⟨p⟩_∞. Let ν be an invariant measure of the stochastic Ising model with the transition rate c(i, s) respecting the lattice symmetries such that its moment ⟨p⟩_ν for p is minimal among all such invariant measures. Since ⟨•⟩_ν is feasible for BS 1 (p), we have ⟨p⟩ ≤ ⟨p⟩_ν. Because ρ itself is such an invariant measure, the definition of ν implies ⟨p⟩ ≥ ⟨p⟩_ν. Therefore, ⟨p⟩ = ⟨p⟩_ν. ■

A few corollaries follow from the previous discussions. Due to the symmetry conditions of BS 1 (p), Theorem 2 implies:

Corollary 1. The probability measure ρ in Theorem 5 is a Gibbs measure of the statistical Ising model.

Lemma 6. In the presence of the spin-flip equations, the probability bounds follow from the subset of bounds 0 ≤ m_n(F({u_i}_{i∈D_n}, s)) imposed for all spin assignments {u_i}_{i∈D_n} such that u_i = 1 for i ∈ D_{n−1}.
Proof) The spin-flip equations imply the reversibility conditions (3.2), m_n(c(i, s) f^{s_i}(s)) = m_n(c(i, s) f(s)), for all f(s) ∈ P_n and i ∈ D_{n−1}. Taking f(s) to be a specific indicator function F({u_j}_{j∈D_n}, s), the reversibility condition becomes c(i, u) m_n(F({u_j}_{j∈D_n}, s)) = c(i, u′) m_n(F({u′_j}_{j∈D_n}, s)), where u′_j = u_j for j ≠ i and u′_i = −u_i. The probability bound 0 ≤ m_n(F({u_j}_{j∈D_n}, s)) then implies 0 ≤ m_n(F({u′_j}_{j∈D_n}, s)), since c(i, u) and c(i, u′) are strictly positive. By repeatedly applying the same argument, we obtain 0 ≤ m_n(F({u″_j}_{j∈D_n}, s)) for all u″ such that u″_j = u_j for j ∈ ∂D_n. ■

This Lemma implies that the number of probability bounds which should be imposed is of order 2^{|∂D_n|} ∼ 2^n, rather than 2^{|D_n|} ∼ 2^{n^2}, in the presence of the reversibility conditions. Instead, the number of spin-flip equations is of order |D_{n−1}| 2^{|D_n|} ∼ n^2 2^{n^2}, while that of invariance equations is of order |D_{n−1}| ∼ n^2. Therefore, the size of the LP increases from 2^{n^2} to n^2 2^{n^2} as we replace invariance equations with spin-flip equations, but this replacement nonetheless produces stronger bounds. Lemma 6 also applies to the SDP problem of BS 2. There is even a further reduction in the number of probability bounds, since reflection positivity implies that the indicator function corresponding to reflection-symmetric spin assignments has a non-negative moment. Therefore, one only needs to impose probability bounds on the spin assignments over ∂D_n which are not symmetric under all of the reflections.

Comparisons of different bootstrap approaches
We have discussed two sets of positivities in this work for the LP/SDP hierarchy (for each domain D_n ⊂ Λ): Probability bound (LP) ⊂ Reflection positivity + Probability bound (SDP). These positivities are sufficient to solve the moment problem on S. We then combine one of these with the equations specifying the statistical/stochastic Ising model: Invariance equations ⊂ Spin-flip equations.
Any combination of positivity and equations above is guaranteed to converge. LP is much faster and cheaper than SDP, but the latter, involving reflection positivity, produces stronger bounds. Including too many equations leads to an SDP matrix whose ratio between the element of the biggest magnitude and the element of the smallest nonzero magnitude is large. In such cases, higher-precision SDP solvers are needed, which are necessarily much slower. Therefore, there is an advantage in using invariance equations instead of spin-flip equations, because this scale problem may be milder for the former. For the LP problem, in contrast, such a precision issue is less likely to occur, and imposing more equations does not require much extra computational cost. One great advantage of LP is that equations do not need to be solved, because they can be directly implemented as part of the linear constraints. In contrast, directly incorporating equations into SDP is hard in practice, and one should instead solve the equations and substitute the solutions into the SDP matrices by hand.
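The point that equations can be fed to an LP directly as linear constraints is easy to see in a toy setting. The sketch below is a hypothetical single-spin model with field h; the rate c(s) = exp(−h s) is one illustrative choice making c(s) exp(h s) even in s. The LP bounds ⟨s⟩ subject to probability bounds, unit normalization, and the single reversibility equation:

```python
import numpy as np
from scipy.optimize import linprog

h = 0.3  # illustrative external field

# Variables: x = (p(+1), p(-1)); bounds=(0, None) are the probability bounds.
# Equality constraints: unit normalization and the reversibility equation
#   c(+1) p(+1) = c(-1) p(-1)  with rate  c(s) = exp(-h*s).
A_eq = np.array([
    [1.0, 1.0],                # p(+1) + p(-1) = 1
    [np.exp(-h), -np.exp(h)],  # detailed balance
])
b_eq = np.array([1.0, 0.0])

# Objective: <s> = p(+1) - p(-1); bound it from below and from above.
c_obj = np.array([1.0, -1.0])
lo = linprog(c_obj, A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun
hi = -linprog(-c_obj, A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun

print(lo, hi)  # both bounds collapse to tanh(h): the solution is unique
```

Here the equations pin the measure down completely, so the two LP values coincide; on larger domains the feasible set is higher-dimensional and the LP returns a genuine interval.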
In [11], it was observed that BS′_2 produces the weakest bounds around the critical points. We thus take the d = 2 Ising model at criticality, J = log(1 + √2)/2, h = 0, as the testing ground for different combinations of positivities and equations, where the objective function is the nearest-neighbor correlator entering the free energy, ⟨p⟩ = ⟨s_i s_{i+e_1}⟩, whose exact value is 0.707107.... The following table provides a summary of the results obtained by MOSEK [25] on an Intel i9-10900F processor. The abbreviations are: P: positivity, E: equations, n: LP/SDP hierarchy level, Min: lower bound on ⟨p⟩ rounded down to six significant digits, Max: upper bound on ⟨p⟩ rounded up to six significant digits, ST: solver runtime, PB: probability bound, RP: reflection positivity, I: invariance equations based on the transition rate c*(i, s), S: spin-flip equations. For the third row, we imposed spin-flip equations for polynomials in P_3 where the spin flip may take place at the boundary of D_3; PB was then imposed only on the spin configurations generated by such spin-flip equations. For the last row, we truncated the reflection positivity matrices to some arbitrary 200 × 200 principal submatrices because the full problem was slow. As expected, LP (used for PB) is much faster than SDP (used for RP), but produces much weaker bounds. However, it seems straightforward to extend the LP to D_4, in which case the bounds may become comparable to those obtained by SDP while still requiring a shorter solver runtime.
For SDP, spin-flip equations on D_3 produced SDP matrices where the element of the biggest magnitude was ∼10^3, while it was ∼10^2 for invariance equations on D_3. Even though there are only 5 invariance equations on D_3 (fourth row), they still produce bounds of the same order as the full 549 spin-flip equations on D_3 (fifth row), where the upper bounds are identical and the solver runtime is much shorter. This is where Theorem 2 is realized in practice. Finally, invariance equations on D_4 were still mild enough in terms of the scaling to produce SDP matrices that can be run with a double-precision solver, and they produced the strongest bounds (last row).

Discussions
In this work, we discussed the convergence of the bootstrap approach to the statistical and stochastic Ising models. Several interesting conclusions follow.
• As already demonstrated many times in the literature (e.g. [14][15][16]) and again in this work, Markov processes and stochastic models are amenable to the bootstrap approach. This is essentially because the observable of interest in these systems is an invariant measure, and LP/SDP provide systematic methods to study such a measure problem. One great feature manifest in many such systems is that they have a better chance of being ergodic and free of special solutions. This is in contrast to classical dynamical systems, where chaotic systems are always accompanied by infinitely many unstable periodic orbits which prevent bootstrap from directly accessing the ergodic orbit. Furthermore, this work suggests that any system that used to be studied by traditional MCMC simulations may allow for an alternative bootstrap approach: one may choose to run the simulations, or to "bound" the simulations. The latter may be more expensive computationally, but its relative advantage is that bootstrap provides rigorous bounds on the observables of the infinite-volume system directly.
• We also demonstrated that statistical mechanical systems on the lattice are particularly well-suited for the Lasserre hierarchy formulation. As long as there is a notion of compactness for the local degrees of freedom and there is locality in the system, most of the steps in the moment problem and the convergence proof presented in this work may be extended straightforwardly. For example, lattice pure Yang-Mills theory may be an interesting case to study, where compactness is present since SU(N) is compact. 9 Above all, the very definition of the Gibbs measure on the infinite lattice using local conditional probabilities allows for a very natural bootstrap formulation.
• A general lesson for the positive measure bootstrap is that considering the associated MCMC may help identify the relevant pieces of bootstrap conditions. In the case of the Ising model considered in this work, there is a plethora of spin correlator inequalities (some of which are non-convex) which have played important roles in establishing highly nontrivial results such as the existence of the phase transition. Also, the number of spin-flip equations explodes as the domain under consideration increases. Considering the problem of finding the invariant measure of the stochastic Ising model showed that the minimal set of bootstrap conditions guaranteeing convergence consists of probability bounds and invariance equations. In other words, these are enough to completely determine the theory. Of course, for more general theories, the analogue of Theorem 2 may be hard to prove, and the set of invariant measures may be strictly bigger than the set of physical measures of interest. Still, the bootstrap approach may provide insights into such differences, which are interesting problems on their own. There are also very obvious next steps.
• It will be very important to obtain the rate of convergence as n increases. At least away from criticality, the empirical results of [11] suggest that the convergence is exponentially fast. Establishing the rate of convergence is meaningful from both conceptual and practical perspectives. The asymptotic convergence shows that bootstrap can serve as an alternative definition of the system, while the rate of convergence tells us how to determine the physical observables to any desired precision. It will also be interesting to understand how much reflection positivity speeds up the convergence.
• In many lattice examples, an important quantity not explored in this work is the long-range correlators, which are often used to extract critical exponents or the mass gap. From the convergence proof of BS 1, we learned that to pin down the invariant measure, we need in principle to impose probability bounds and invariance equations over the entire lattice. If we consider a subset of probability bounds and invariance equations involving the long-range correlators, the bounds will be tight only if there is some universality among all the measures satisfying that subset of conditions. Furthermore, we will need to face the computational cost, which increases exponentially as the number of spin configurations to be considered grows. Whether there will be an alternative approach to directly study critical exponents or the mass gap within the bootstrap framework is unclear at the moment.
• Given the fundamental importance of reflection positivity and the role it has played in showing various properties of the Ising model, it would be desirable to establish the precise relation between the positivity of the Gibbs measure and reflection positivity. Even though reflection positivity is a property of specific Hamiltonians, it is curious that it does not imply probability bounds even in the presence of spin-flip equations in the d = 1 statistical Ising model. At least in this case, the nice inner product structure defined by reflection positivity together with the equations of motion is not enough to deduce that the candidate moments originate from a valid probability measure. The question readily extends to reflection-symmetric Gibbs measures in other statistical mechanical systems.
• Needless to say, it is worth improving the LP/SDP formulation itself. Indicator functions played a central role in showing the convergence in this work. They also provide a complete basis of P_n and make the probability bound and the spin-flip equations very simple by definition (see for example (5.3)). The only drawback of this basis is that translation invariance is not straightforward to impose. From the perspective of Theorem 2, it may seem that translation invariance is essential, but it is also known that the Gibbs measures of the statistical Ising model are translation invariant. Therefore, one would expect to recover translation invariance by imposing spin-flip equations even if it is not imposed at the level of the bootstrap.
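To make the choice between "running" and "bounding" the simulations concrete, the traditional MCMC route is itself simple to set up. Below is a minimal heat-bath (Glauber) sampler for the d = 2 model on a finite periodic lattice; the lattice size, sweep counts, and the value of J are illustrative choices, and the finite-size estimate carries no rigorous error bar, in contrast to the bootstrap bounds:

```python
import numpy as np

def glauber_sweep(spins, J, h, rng):
    """One heat-bath (Glauber) sweep: L*L single-site conditional updates."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        field = h + J * (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                         + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        # Draw s_ij from its conditional Gibbs distribution given the neighbors.
        p_up = 1.0 / (1.0 + np.exp(-2.0 * field))
        spins[i, j] = 1 if rng.random() < p_up else -1

rng = np.random.default_rng(0)
L, J, h = 16, 0.25, 0.0             # illustrative lattice size and couplings
spins = rng.choice(np.array([-1, 1]), size=(L, L))
for _ in range(200):                # burn-in sweeps
    glauber_sweep(spins, J, h, rng)
samples = []
for _ in range(200):                # measurement sweeps
    glauber_sweep(spins, J, h, rng)
    samples.append(np.mean(spins * np.roll(spins, 1, axis=0)))
est = np.mean(samples)
print(est)                          # finite-lattice estimate of <s_i s_{i+e_1}>
```

The conditional update rule here is precisely the single-site transition rate of the stochastic Ising model, so the chain's invariant measure is the finite-volume Gibbs measure; the bootstrap instead constrains its infinite-volume moments directly.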

Lemma 4. For each n ∈ N, the spin-flip equations of SDP(p, n) include the reversibility equations (3.2), which in turn include the invariance equations of LP(p, n), under the linearity assumption.

Proof) Make the following choice of f in the spin-flip equations (3.5):

f(s) = c(i, s) f^{s_i}(s).   (3.6)

Table: bounds on ⟨p⟩ = ⟨s_i s_{i+e_1}⟩ for the d = 2 critical Ising model from different combinations of positivities and equations (abbreviations as in section 5).

P    E   n     Min        Max        ST           Notes
PB   I   3     0.167853   0.851084   ∼0.5 sec
PB   S   3     0.303045   0.820244   ∼0.5 sec
PB   S   3.5   0.444667   0.820244   a few mins   only a subset of PB used
RP   I   3     0.628600   0.753475   a few secs
RP   S   3     0.654752   0.753475   a few mins   data from [11]
RP   I   4     0.682418   0.740840   ∼20 mins     only a subset of RP used