In this section we prove (standard and non-standard) Central Limit Theorems for the vector \({\widehat{m}}^{(n)}\). In the first subsection we treat the high temperature regime. Here we derive a standard CLT using the Hubbard–Stratonovich transform. This is similar in spirit to the third section in [27] and technically related to [22]. The result can also be derived from [17], where similar techniques are used. However, the subsection also prepares nicely for Sect. 3.2, where we treat the critical case and prove a non-standard CLT. This generalizes results from [18] and [27]. Finally, in Sect. 3.3 we use Stein’s method, an alternative approach to proving the CLT for \({\widehat{m}}^{(n)}\). This is not only interesting in its own right, but also has the advantage of providing a speed of convergence, which is missing in a proof via the Hubbard–Stratonovich transform.
Central Limit Theorem: Hubbard–Stratonovich Approach
For the proof we shall use the transformed block magnetization vectors
$$\begin{aligned} w^{(n)}&:=V_n m^{(n)}, \\ {\widehat{w}}^{(n)}&:=V_n {\widehat{m}}^{(n)}, \\ {\widetilde{w}}^{(n)}&= V_n \varGamma _n {\widetilde{m}}^{(n)}, \end{aligned}$$
where \(\varGamma _n A \varGamma _n = V_n^T \varLambda _n V_n\) is the orthogonal decomposition. It is easy to see that
$$\begin{aligned} H_n = \frac{1}{2N} \left\langle {w^{(n)}, \varLambda _n w^{(n)}} \right\rangle = \frac{1}{2} \left\langle {{\widehat{w}}^{(n)}, \varLambda _n {\widehat{w}}^{(n)}} \right\rangle = \frac{N}{2} \left\langle {\varLambda _n {\widetilde{w}}^{(n)}, {\widetilde{w}}^{(n)}} \right\rangle . \end{aligned}$$
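By the definition of \({\widehat{w}}^{(n)}\) and the decomposition above, the middle expression can be rewritten in the original coordinates:
$$\begin{aligned} \frac{1}{2} \left\langle {{\widehat{w}}^{(n)}, \varLambda _n {\widehat{w}}^{(n)}} \right\rangle = \frac{1}{2} \left\langle {V_n {\widehat{m}}^{(n)}, \varLambda _n V_n {\widehat{m}}^{(n)}} \right\rangle = \frac{1}{2} \left\langle {{\widehat{m}}^{(n)}, V_n^T \varLambda _n V_n {\widehat{m}}^{(n)}} \right\rangle = \frac{1}{2} \left\langle {{\widehat{m}}^{(n)}, \varGamma _n A \varGamma _n {\widehat{m}}^{(n)}} \right\rangle . \end{aligned}$$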
Proof of Theorem 2
As in [27] or [17] (both papers are inspired by [16]), we use the Hubbard–Stratonovich transform (i.e. a convolution with an independent normal distribution). For each \(n \in {{\,\mathrm{\mathbb {N}}\,}}\),
$$\begin{aligned} \mu _{J_n}(\sigma ) = Z_n^{-1} \exp \left( \frac{1}{2} \left\langle {\varLambda _n {\widehat{w}}^n, {\widehat{w}}^n} \right\rangle \right) . \end{aligned}$$
Our first step is to prove that \({\widehat{w}}^n + Y_n\) converges weakly to a normal distribution, where \(Y_n \sim {\mathcal {N}}(0, \varLambda _n^{-1})\) is a sequence of independent Gaussian random vectors, independent of \(({\widehat{w}}^n)_{n \in {{\,\mathrm{\mathbb {N}}\,}}}\). We have for any \(B \in {\mathcal {B}}({{\,\mathrm{\mathbb {R}}\,}}^k)\)
$$\begin{aligned}&{{\,\mathrm{\mathbb {P}}\,}}({\widehat{w}}^n + Y_n \in B) = \frac{1}{C_n} \sum _{\sigma \in \{\pm 1\}^N} \mu _{J_n}(\sigma ) \int _{B} \exp \left( - \frac{1}{2} \left\langle {x-{\widehat{w}}^n(\sigma ), \varLambda _n(x-{\widehat{w}}^n(\sigma ))} \right\rangle \right) dx \\&\qquad \qquad = \frac{2^N}{C_n Z_n} \int _B \exp \left( - \frac{1}{2} \left\langle {x, \varLambda _n x} \right\rangle \right) {{\,\mathrm{\mathbb {E}}\,}}_{\mu _0} \exp \left( N \left\langle { \frac{1}{\sqrt{N}} \varGamma _n V_n^T \varLambda _n x, \frac{1}{N} \varGamma _n^{-2} m} \right\rangle \right) dx \\&\quad \qquad = \frac{2^N}{C_nZ_n} \int _B \exp \left( - \varPhi _n(x) \right) dx, \end{aligned}$$
where \(C_n :=(2\pi )^{k/2} \det (\varLambda _n)^{-1/2}\) is the normalizing constant of the \({\mathcal {N}}(0, \varLambda _n^{-1})\) density, \({{\,\mathrm{\mathbb {E}}\,}}_{\mu _0}\) denotes expectation with respect to the uniform distribution \(\mu _0\) on \(\{\pm 1\}^N\), and we have defined
$$\begin{aligned} \begin{aligned} \varPhi _n(x)&:=\frac{1}{2} \left\langle {x, \varLambda _n x} \right\rangle - \sum _{i=1}^k |{B_i^{(n)}} | \log \cosh \left( \frac{\sqrt{N}}{|{B_i^{(n)}} |} (\varGamma _n V_n^T \varLambda _n x)_i \right) \\&= \frac{1}{2} \left\langle {x, \varLambda _n x} \right\rangle - \sum _{i=1}^k |{B_i^{(n)}} | \log \cosh \left( |{B^{(n)}_i} |^{-1/2} (V_n^T \varLambda _n x)_i \right) . \end{aligned} \end{aligned}$$
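The identity derived above can be sanity-checked numerically on a toy example. The following sketch (not part of the proof) compares the density of \({\widehat{w}}^n + Y_n\), computed by brute-force summation over all configurations, with \(\exp (-\varPhi _n(x))\); the matrix \(A\), the block sizes and the test points are illustrative assumptions only, and the Gibbs weights are taken from the representation of \(\mu _{J_n}\) above.

```python
import itertools
import numpy as np

# Toy block spin Ising model: k = 2 blocks of size 3 (N = 6), high temperature.
sizes = [3, 3]
N, k = sum(sizes), len(sizes)
A = np.array([[0.4, 0.2], [0.2, 0.4]])
Gamma = np.diag(np.sqrt(np.array(sizes) / N))
GAG = Gamma @ A @ Gamma

# Orthogonal decomposition Gamma A Gamma = V^T Lambda V.
eigvals, Q = np.linalg.eigh(GAG)           # GAG = Q diag(eigvals) Q^T
Lam = np.diag(eigvals)
V = Q.T

blocks = [range(0, 3), range(3, 6)]

def m_hat(sigma):
    return np.array([sum(sigma[j] for j in B) / np.sqrt(len(B)) for B in blocks])

# Gibbs weights proportional to exp((1/2) <Lambda w_hat, w_hat>).
configs = list(itertools.product([-1, 1], repeat=N))
weights = np.array([np.exp(0.5 * m_hat(s) @ GAG @ m_hat(s)) for s in configs])
weights /= weights.sum()

def density_convolved(x):
    """Density of w_hat + Y at x, where Y ~ N(0, Lambda^{-1}) is independent."""
    c = np.sqrt(np.linalg.det(Lam) / (2 * np.pi) ** k)
    return sum(p * c * np.exp(-0.5 * (x - V @ m_hat(s)) @ Lam @ (x - V @ m_hat(s)))
               for s, p in zip(configs, weights))

def Phi(x):
    arg = V.T @ Lam @ x
    return 0.5 * x @ Lam @ x - sum(len(B) * np.log(np.cosh(arg[i] / np.sqrt(len(B))))
                                   for i, B in enumerate(blocks))

# The ratio below should be the same constant for every test point x.
for x in [np.array([0.3, -0.5]), np.array([1.0, 0.2]), np.array([-0.7, 0.9])]:
    print(density_convolved(x) / np.exp(-Phi(x)))
```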
Since \(\log \cosh (x) = \frac{1}{2} x^2 + O(x^4)\), we obtain
$$\begin{aligned} \varPhi _n(x)&= \frac{1}{2} \left\langle {x, \varLambda _n x} \right\rangle - \frac{1}{2} \left\langle {x, \varLambda _n^2 x} \right\rangle + \frac{1}{N} O\left( \sum _{i = 1}^k \frac{N}{|{B^{(n)}_i} |} (V_n^T \varLambda _n x)_i^4 \right) \nonumber \\&= \frac{1}{2} \left\langle {x, (\varLambda _n - \varLambda _n^2) x} \right\rangle + \frac{1}{N} O(\Vert {\varGamma _n^{-1/2} V_n^T \varLambda _n x} \Vert _4^4). \end{aligned}$$
(8)
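The quadratic term in (8) arises from the orthogonality of \(V_n\): the second-order contribution of the \(\log \cosh \) expansion is
$$\begin{aligned} \sum _{i=1}^k |{B_i^{(n)}} | \cdot \frac{1}{2} \Big ( |{B^{(n)}_i} |^{-1/2} (V_n^T \varLambda _n x)_i \Big )^2 = \frac{1}{2} \Vert {V_n^T \varLambda _n x} \Vert _2^2 = \frac{1}{2} \left\langle {x, \varLambda _n^2 x} \right\rangle . \end{aligned}$$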
For parameters \(r, R > 0\) let \(B_{0,r,R} :=\{ x \in {{\,\mathrm{\mathbb {R}}\,}}^k : r \le \Vert {x} \Vert _2 \le R \}\) denote the closed annulus with radii \(r\) and \(R\), and decompose
$$\begin{aligned} {{\,\mathrm{\mathbb {P}}\,}}({\widehat{w}}^n + Y_n \in B)&= \frac{2^N}{C_n Z_n} \left( \int _{B \cap B_R(0)} + \int _{B \cap B_{0,R,r\sqrt{N}}} + \int _{B \cap B_{r\sqrt{N}}(0)^c} \right) \exp \left( - \varPhi _n(x) \right) dx\\&=:\frac{2^N}{C_n Z_n} \left( I_1 + I_2 + I_3 \right) . \end{aligned}$$
Since \(\varLambda _n \rightarrow \varLambda _\infty \) (which is a consequence of the continuity of the eigenvalues) we have for any \(R > 0\)
$$\begin{aligned} \lim _{n \rightarrow \infty } I_1 = \int _{B \cap B_R(0)} \exp \left( - \frac{1}{2} \left\langle {x, (\varLambda _\infty - \varLambda _\infty ^2) x} \right\rangle \right) dx. \end{aligned}$$
Next, we will estimate (8) from below in order to obtain an upper bound for \(I_2\). Writing \(C(r)\) for the constant in the \(O(\cdot )\)-term of (8) on \(B_{0,R,r\sqrt{N}}\) and using \(\Vert {y} \Vert _4 \le C_{2,4} \Vert {y} \Vert _2\) with \(C_{2,4} :=\Vert {{{\,\mathrm{Id}\,}}} \Vert _{2 \rightarrow 4}\) as well as \(\Vert {x} \Vert _2^2 \le r^2 N\) on this set, it follows that
$$\begin{aligned} \begin{aligned} \varPhi _n(x)&\ge \frac{1}{2} \left\langle {x, (\varLambda _n - \varLambda _n^2)x} \right\rangle - C(r) r^2 C_{2,4}^4 \Vert {\varGamma _n^{-1/2}} \Vert _{4 \rightarrow 4}^4 \Vert {\varLambda _n} \Vert _{2 \rightarrow 2}^4 \left\langle {x,x} \right\rangle \\&\ge \frac{1}{2} \left\langle {x, \left( \varLambda _n - \varLambda _n^2 - C(r)r^2 C \right) x} \right\rangle \\&\ge c \frac{1}{2} \left\langle {x, x} \right\rangle . \end{aligned} \end{aligned}$$
Here, we have used the convergence of \(\varGamma _n\) to \(\varGamma _\infty \) to bound \(\Vert {\varGamma _n^{-1/2}} \Vert _{4 \rightarrow 4}\) and the fact that \(C(r)r^2 \rightarrow 0\) as \(r \rightarrow 0\), so that the matrix in the second line is positive definite for \(r\) small enough, uniformly in \(n\). Thus \(I_2 \le \int _{\{\Vert {x} \Vert _2 \ge R\}} \exp \left( - \frac{c}{2} \Vert {x} \Vert _2^2 \right) dx\), which vanishes in the limit \(R \rightarrow \infty \), uniformly in \(n\).
Lastly, we need to show that \(I_3\) vanishes as well. To this end, we show that for \(n\) large enough there is a constant \(c > 0\) such that \(\varPhi _n(x) \ge N c\), and hence \(\exp (-\varPhi _n(x)) \le \exp (-N c)\), uniformly for \(x \in B_{r\sqrt{N}}(0)^c\). Write \({\widetilde{\varPhi }}_n(y) :=N^{-1} \varPhi _n(\sqrt{N} y)\). Since \(\Vert {\varLambda _n - \varLambda _\infty } \Vert _{2 \rightarrow 2} \rightarrow 0\) and \(\Vert {\varLambda _\infty } \Vert _{2 \rightarrow 2} < 1\), we may choose \(n\) large enough so that \(\Vert {\varLambda _n} \Vert _{2 \rightarrow 2} < 1\). As before, it can be seen that for such \(n\) the point \(0\) is the only minimum of \({\widetilde{\varPhi }}_n\). Indeed, after some manipulations any critical point satisfies \(\varGamma _n A \varGamma _n \tanh (y) = y\), and since \(\Vert {\tanh (y)} \Vert _2 \le \Vert {y} \Vert _2\) and \(\Vert {\varGamma _n A \varGamma _n} \Vert _{2 \rightarrow 2} < 1\), this is only possible for \(y = 0\). As a consequence, for any \(r > 0\) there is a constant \(c > 0\) such that \({\widetilde{\varPhi }}_n(y) \ge c\) uniformly for \(\Vert {y} \Vert _2 > r\), i.e.
$$\begin{aligned} I_3 \le \int {\mathbb {1}}_{\{\Vert {x} \Vert _2 > r \sqrt{N}\}} \exp \left( - \varPhi _n(x) \right) dx \le \int _{B_{r \sqrt{N}}(0)^c} \exp \left( -N {\widetilde{\varPhi }}_n(N^{-1/2}x) \right) dx \rightarrow 0. \end{aligned}$$
Lastly, choose \(r > 0\) so small that \(\varLambda _n - \varLambda _n^2 - C(r)r^2 C\) is positive definite uniformly in \(n\). Combining the three contributions and identifying the normalizing constant by taking \(B = {{\,\mathrm{\mathbb {R}}\,}}^k\), we obtain
$$\begin{aligned} \lim _{n \rightarrow \infty } {{\,\mathrm{\mathbb {P}}\,}}({\widehat{w}}^n + Y_n \in B) = {\mathcal {N}}(0, (\varLambda _\infty - \varLambda _\infty ^2)^{-1})(B). \end{aligned}$$
From here, it remains to undo the convolution (e.g. by using the characteristic function), giving
$$\begin{aligned} \lim _{n \rightarrow \infty } \mu _{J_n}({\widehat{w}}^n \in B) = {\mathcal {N}}(0, ({{\,\mathrm{Id}\,}}- \varLambda _\infty )^{-1})(B). \end{aligned}$$
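Indeed, since \(Y_n \Rightarrow {\mathcal {N}}(0, \varLambda _\infty ^{-1})\) and all matrices involved are diagonal, the covariance of the limit is obtained by simply subtracting the covariance of the Gaussian noise:
$$\begin{aligned} (\varLambda _\infty - \varLambda _\infty ^2)^{-1} - \varLambda _\infty ^{-1} = \varLambda _\infty ^{-1} \left( ({{\,\mathrm{Id}\,}}- \varLambda _\infty )^{-1} - {{\,\mathrm{Id}\,}}\right) = \varLambda _\infty ^{-1} \varLambda _\infty ({{\,\mathrm{Id}\,}}- \varLambda _\infty )^{-1} = ({{\,\mathrm{Id}\,}}- \varLambda _\infty )^{-1}. \end{aligned}$$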
With the help of Slutsky’s theorem and the definition \({\widehat{m}}^n = V^T_n {\widehat{w}}^n\) this implies
$$\begin{aligned} \mu _{J_n} \circ ({\widehat{m}}^n)^{-1} \Rightarrow {\mathcal {N}}(0, V^T ({{\,\mathrm{Id}\,}}- \varLambda _\infty )^{-1} V) = {\mathcal {N}}(0, \left( {{\,\mathrm{Id}\,}}- \varGamma _\infty A \varGamma _\infty \right) ^{-1}) \end{aligned}$$
as claimed. \(\square \)
Example 3
Consider the case \(k = 2\) and
$$\begin{aligned} A_2 = \begin{pmatrix} \beta &{} \alpha \\ \alpha &{} \beta \end{pmatrix}. \end{aligned}$$
\(A_2\) is positive definite if \(\beta > 0\) and \((\beta - \alpha )(\beta + \alpha ) > 0\), i.e. if \(|{\alpha } | < \beta \). We have the diagonalization
$$\begin{aligned} A_2 = \frac{1}{2} \begin{pmatrix} 1 &{} 1 \\ 1 &{} -1 \end{pmatrix} \begin{pmatrix} \beta +\alpha &{} 0 \\ 0 &{} \beta -\alpha \end{pmatrix} \begin{pmatrix} 1 &{} 1 \\ 1 &{} -1 \end{pmatrix} =:V^T \varLambda V, \end{aligned}$$
and \(w = V^T m = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 &{} 1 \\ 1 &{} -1 \end{pmatrix}m\) corresponds to the transformation performed in [27, Theorem 1.2] (up to a factor of \(\sqrt{2}\)). In this case
$$\begin{aligned} \left( {{\,\mathrm{Id}\,}}- \frac{1}{2}A_2\right) ^{-1} = \frac{2}{(\beta -2)^2 - \alpha ^2}\begin{pmatrix} 2 - \beta &{} \alpha \\ \alpha &{} 2-\beta \end{pmatrix} \end{aligned}$$
which is exactly the covariance matrix in [27] (again up to a factor of 2). Note that similar results have been derived in [25].
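Both the diagonalization and the covariance formula can be verified numerically; the following minimal sketch (with an illustrative, admissible parameter pair) is only a consistency check and not part of the argument.

```python
import numpy as np

beta, alpha = 0.6, 0.3                       # illustrative values with |alpha| < beta
A2 = np.array([[beta, alpha], [alpha, beta]])

# Covariance formula of Example 3.
lhs = np.linalg.inv(np.eye(2) - 0.5 * A2)
rhs = 2.0 / ((beta - 2.0) ** 2 - alpha ** 2) * np.array([[2 - beta, alpha],
                                                         [alpha, 2 - beta]])
print(np.allclose(lhs, rhs))                 # True

# The orthogonal matrix U diagonalizes A2 into diag(beta+alpha, beta-alpha).
U = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
print(np.round(U @ A2 @ U, 10))
```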
Remark 1
If \(A \in M_k({{\,\mathrm{\mathbb {R}}\,}})\) is symmetric and positive semidefinite with \(A = V^T \varLambda V\), \(\varLambda = {{\,\mathrm{diag}\,}}(\lambda _1, \ldots , \lambda _l, 0, \ldots , 0)\) and \(l < k\), then a variant of the proof shows that \(((V{\widetilde{m}})_i)_{i \le l}\) converges to an l-dimensional normal distribution with covariance matrix \(\varSigma _l :=({{\,\mathrm{Id}\,}}- \varLambda _l)^{-1}\), where \(\varLambda _l = {{\,\mathrm{diag}\,}}(\lambda _1, \ldots , \lambda _l)\). This can be applied to the matrix \(A_2\) above with \(\alpha = \beta \), resulting in a CLT for the magnetization in a Curie–Weiss model, which of course can also be obtained by choosing \(k = 1\) and \(0< \beta < 1\).
Non-central Limit Theorem
Recall the situation of Theorem 3: the block interaction matrix \(A\) has eigenvalues \(0< \lambda _1 \le \ldots \le \lambda _{k-1} < \lambda _k = k\) and we consider the uniform case, i.e. \(\varGamma _\infty ^2 = k^{-1} {{\,\mathrm{Id}\,}}\). Moreover, we use the definitions
$$\begin{aligned} w'&= {{\,\mathrm{diag}\,}}(N^{-1/2}, \ldots , N^{-1/2}, N^{-3/4}) V m^{(n)}, \\ {\hat{C}}_N&= {{\,\mathrm{diag}\,}}(\lambda _1, \ldots , \lambda _{k-1}, kN^{1/2}), \end{aligned}$$
so that
$$\begin{aligned} H_n = \frac{1}{2} \left\langle {{\hat{C}}_N w', w'} \right\rangle . \end{aligned}$$
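Note that the factor \(N^{1/2}\) in \({\hat{C}}_N\) exactly compensates the finer normalization \(N^{-3/4}\) of the critical direction: with \(\varLambda = {{\,\mathrm{diag}\,}}(\lambda _1, \ldots , \lambda _k)\),
$$\begin{aligned} \left\langle {{\hat{C}}_N w', w'} \right\rangle = \sum _{i=1}^{k-1} \lambda _i \frac{(V m^{(n)})_i^2}{N} + k N^{1/2} \frac{(V m^{(n)})_k^2}{N^{3/2}} = \frac{1}{N} \left\langle {V m^{(n)}, \varLambda V m^{(n)}} \right\rangle . \end{aligned}$$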
Proof of Theorem 3
Let \(Y_n \sim {\mathcal {N}}(0, {\hat{C}}_N^{-1})\) and \(X_n \sim \mu _{J_n}\) be independent random variables, defined on a common probability space. We have for any Borel set \(B \in {\mathcal {B}}({{\,\mathrm{\mathbb {R}}\,}}^k)\)
$$\begin{aligned} {{\,\mathrm{\mathbb {P}}\,}}\left( w_n'(X_n) + Y_n \in B \right)&= \frac{2^{N}}{c_N Z_n} \int _B \exp \Big ( - \frac{1}{2} \left\langle {{\hat{C}}_N x,x} \right\rangle \Big ) {{\,\mathrm{\mathbb {E}}\,}}_{\mu _0} \exp \Big ( \left\langle {x,{\hat{C}}_N w'} \right\rangle \Big ) dx \\&= {\widetilde{Z}}_n^{-1} \int _B \exp \Big ( - \frac{1}{2} \left\langle {{\hat{C}}_N x,x} \right\rangle + \frac{N}{k} \sum _{i= 1}^k \log \cosh \Big ( \Big ( V^T \varLambda \Big ( \frac{x_1}{N^{1/2}}, \ldots , \frac{x_{k-1}}{N^{1/2}}, \frac{x_k}{N^{1/4}} \Big ) \Big )_i \Big ) \Big ) dx \\&= {\widetilde{Z}}_n^{-1} \int _B \exp \left( - \varPhi _N(x) \right) dx \\&= {\widetilde{Z}}_n^{-1} \int _B \exp \left( - N {\widetilde{\varPhi }}_N\left( \frac{x_1}{N^{1/2}}, \ldots , \frac{x_{k-1}}{N^{1/2}}, \frac{x_k}{N^{1/4}} \right) \right) dx \end{aligned}$$
where \(c_N\) denotes the normalizing constant of the \({\mathcal {N}}(0, {\hat{C}}_N^{-1})\) density, \({\widetilde{Z}}_n :=2^{-N} c_N Z_n\), and where we have set
$$\begin{aligned} \varPhi _N(x)&:=\frac{1}{2} \left\langle {x,{\hat{C}}_N x} \right\rangle - \frac{N}{k} \sum _{i = 1}^k \log \cosh \left( \left( V^T \varLambda \left( \frac{x_1}{N^{1/2}}, \ldots , \frac{x_{k-1}}{N^{1/2}}, \frac{x_k}{N^{1/4}}\right) \right) _i \right) , \\ {\widetilde{\varPhi }}_N(x)&:=\frac{1}{2} \left\langle {x, \varLambda x} \right\rangle - \frac{1}{k} \sum _{i = 1}^k \log \cosh \left( (V^T \varLambda x)_i \right) . \end{aligned}$$
Now the proof is along the same lines as the proof of the CLT in the high temperature phase, with the slight modification that we use the expansion of \(\log \cosh \) to fourth order
$$\begin{aligned} \log \cosh (x) = \frac{x^{2}}{2} - \frac{x^4}{12} + O(x^6). \end{aligned}$$
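This expansion can be checked symbolically; the following one-liner (not part of the argument) uses sympy.

```python
import sympy as sp

x = sp.symbols('x')
# Taylor expansion of log(cosh(x)) around 0 up to order 6.
print(sp.series(sp.log(sp.cosh(x)), x, 0, 6))   # x**2/2 - x**4/12 + O(x**6)
```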
We again split \({{\,\mathrm{\mathbb {R}}\,}}^k\) into three regions, namely the inner region \(I_1 = B_R(0)\) for an arbitrary \(R > 0\), the intermediate region \(I_2 = K_r \backslash B_R(0)\) for some arbitrary \(r > 0\), where
$$\begin{aligned} K_r :=\left\{ x \in {{\,\mathrm{\mathbb {R}}\,}}^k : {\left\| \left( N^{-1/2}x_1, \ldots , N^{-1/2}x_{k-1}, N^{-1/4}x_k\right) \right\| }_\infty \le r \right\} , \end{aligned}$$
and the outer region \(I_3 :=K_r^c\). Also define the rescaled vector
$$\begin{aligned} {\widetilde{x}} :=\left( \lambda _1 N^{-1/2} x_1, \ldots , \lambda _{k-1} N^{-1/2} x_{k-1}, k N^{-1/4} x_k \right) . \end{aligned}$$
Firstly, in the inner region we rewrite
$$\begin{aligned} \varPhi _N(x)&= \frac{1}{2} \left\langle {x,{\hat{C}}_N x} \right\rangle - \frac{N}{2k} \sum _{i = 1}^k (V^T {\widetilde{x}})_i^2 + \frac{N}{12k} \sum _{i = 1}^k (V^T {\widetilde{x}})_i^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6) \\&= \frac{1}{2} \sum _{i = 1}^{k-1} \left( \lambda _i - \frac{\lambda _i^2}{k}\right) x_i^2 + \frac{N}{12k} \Vert {V^T {\widetilde{x}}} \Vert _4^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6) \\&= \frac{1}{2} \sum _{i = 1}^{k-1} \left( \lambda _i - \frac{\lambda _i^2}{k}\right) x_i^2 + \frac{k^3}{12} x_k^4 \sum _{i=1}^k V_{ki}^4 + O(N^{-1/4}) + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6), \end{aligned}$$
and since the convergence of the error terms is uniform on any compact subset of \({{\,\mathrm{\mathbb {R}}\,}}^k\), for any fixed \(R > 0\) this yields
$$\begin{aligned} \lim _{N \rightarrow \infty } \int _{B \cap I_1} \exp \left( - \varPhi _N(x) \right) dx = \int _{B \cap I_1} \exp \left( - \frac{1}{2} \sum _{i = 1}^{k-1} \left( \lambda _i - \frac{\lambda _i^2}{k}\right) x_i^2 - \frac{k^3}{12} x_k^4 \sum _{i=1}^k V_{ki}^4 \right) dx. \end{aligned}$$
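The quartic term arises from the expansion
$$\begin{aligned} (V^T {\widetilde{x}})_i = k N^{-1/4} V_{ki} x_k + O(N^{-1/2}), \qquad \frac{N}{12k} \sum _{i=1}^k (V^T {\widetilde{x}})_i^4 = \frac{k^3}{12} x_k^4 \sum _{i=1}^k V_{ki}^4 + O(N^{-1/4}), \end{aligned}$$
valid uniformly for \(x\) in compact sets.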
Secondly, we show that the outer region does not contribute in the limit \(N \rightarrow \infty \). It can be seen by elementary tools that \({\widetilde{\varPhi }}_N\) (which in fact does not depend on \(N\)) has a unique minimum \(0\) at \(0\), and so for any \(r > 0\) its infimum over \(\{ y \in {{\,\mathrm{\mathbb {R}}\,}}^k : \Vert {y} \Vert _\infty > r \}\), the image of \(I_3\) under the rescaling, is strictly positive. Together with the monotone convergence theorem, this yields
$$\begin{aligned} \lim _{N \rightarrow \infty } \int _{B \cap I_3} \exp \left( - \varPhi _N(x) \right) dx = 0. \end{aligned}$$
Lastly, we will estimate the contribution of the intermediate region from above by a quantity which vanishes as \(R \rightarrow \infty \). To this end, we will bound the function \(\varPhi _N\) from below. Recall that
$$\begin{aligned} \varPhi _N(x)&= \frac{1}{2} \left\langle {x,{\hat{C}}_N x} \right\rangle - \frac{N}{2k} \sum _{i = 1}^k (V^T {\widetilde{x}})_i^2 + \frac{N}{12k} \sum _{i = 1}^k (V^T {\widetilde{x}})_i^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6) \\&= \frac{1}{2} \left\langle {x, {\hat{C}}_N x} \right\rangle - \frac{N}{2k} \left\langle {{\widetilde{x}}, {\widetilde{x}}} \right\rangle + \frac{N}{12k} \Vert {V^T {\widetilde{x}}} \Vert _4^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6) \end{aligned}$$
and since \(\Vert {V^T {\widetilde{x}}} \Vert _4^4 \ge C \Vert {{\widetilde{x}}} \Vert ^4_4\) for \(C = \Vert {V} \Vert _{4 \rightarrow 4}^{-4}\) this yields
$$\begin{aligned} \varPhi _N(x)&\ge \frac{1}{2} \left\langle {x, {\hat{C}}_N x} \right\rangle - \frac{N}{2k} \left\langle {{\widetilde{x}}, {\widetilde{x}}} \right\rangle + \frac{N}{12k} C \Vert {{\widetilde{x}}} \Vert _4^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6) \\&\ge \frac{1}{2} \left\langle {\left( \varLambda - k^{-1} \varLambda ^2 \right) x,x} \right\rangle + \frac{k^3}{12} C x_k^4 + \frac{N}{k} O(\Vert {V^T {\widetilde{x}}} \Vert _6^6). \end{aligned}$$
Now, as in the case of the central limit theorem, the error term can be absorbed in such a way that there are a positive constant c and a positive definite matrix C with
$$\begin{aligned} \varPhi _N(x) \ge \frac{1}{2} \left\langle {C(x_1, \ldots , x_{k-1},0),(x_1, \ldots , x_{k-1},0)} \right\rangle + c x_k^4, \end{aligned}$$
from which we obtain an upper bound, i.e.
$$\begin{aligned} \int _{B \cap I_2} \exp \left( - \varPhi _N(x) \right) dx \le \int _{B \cap I_2} \exp \left( - \frac{1}{2} \sum _{i,j=1}^{k-1} C_{ij} x_i x_j - c x_k^4 \right) dx, \end{aligned}$$
and the right hand side vanishes as \(R \rightarrow \infty \) by dominated convergence. As a result, the limit \(N \rightarrow \infty \) exists and is equal to
$$\begin{aligned} \lim _{N \rightarrow \infty } {{\,\mathrm{\mathbb {P}}\,}}\left( w'_n(X_n) + Y_n \in B \right) = Z^{-1} \int _B \exp \left( - \frac{1}{2} \sum _{i = 1}^{k-1} \left( \lambda _i - \frac{\lambda _i^2}{k} \right) x_i^2 - \frac{k^3}{12} x_k^4 \sum _{i = 1}^k V_{ki}^4 \right) dx. \end{aligned}$$
The convergence results for the vector \(w_n'(X_n)\) itself, i.e. without the Gaussian convolution, follow easily by considering characteristic functions. We have for any \(t \in {{\,\mathrm{\mathbb {R}}\,}}^k\)
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\exp \left( i \left\langle {t,w_n'(X_n) + Y_n} \right\rangle \right) \rightarrow \exp \left( - \frac{1}{2} \left\langle {(t_1,\ldots , t_{k-1}), {\widetilde{\varSigma }} (t_1, \ldots , t_{k-1})} \right\rangle \right) \phi (t_k), \end{aligned}$$
where \({\widetilde{\varSigma }} = {{\,\mathrm{diag}\,}}\left( \lambda _i^{-1} + (k-\lambda _i)^{-1} \right) _{i \le k-1}\) and \(\phi \) is the characteristic function of a random variable with Lebesgue density proportional to \(\exp \left( - \frac{k^3}{12} \sum _{i = 1}^k V_{ki}^4 \, x_k^4 \right) \). Using the independence of \(X_n\) and \(Y_n\), the results follow by simple calculations. \(\square \)
Central Limit Theorem: Stein’s Method
Lastly, we will prove Theorem 4 using Stein’s method of exchangeable pairs. For brevity’s sake, for the rest of this section we fix \(n \in {{\,\mathrm{\mathbb {N}}\,}}\) and drop all sub- and superscripts (e.g. we write \(B_i\) instead of \(B_i^{(n)}\), \({\hat{m}}\) instead of \({\hat{m}}^{(n)}\), J instead of \(J_n\) et cetera). It is more convenient to formulate this approach in terms of random variables. Let X be a random vector with distribution \(\mu _J\) and let I be an independent random variable, uniformly distributed on \(\{1,\ldots , N\}\). Denote by \((X, {\widetilde{X}})\) the exchangeable pair given by taking one step in the Glauber chain for \(\mu _J\), i.e. \({\widetilde{X}}\) is the vector obtained after replacing \(X_I\) by an independent \({\widetilde{X}}_I\) with distribution \({\widetilde{X}}_I \sim \mu _J( \cdot \mid {\overline{X}}_I)\) (the exchangeability follows from the reversibility of the Glauber dynamics). Consequently, \(({\hat{m}}, {\hat{m}}') = ({\hat{m}}(X), {\hat{m}}({\widetilde{X}}))\) is also exchangeable. Throughout, \(h: \{1,\ldots , N\} \rightarrow \{1,\ldots ,k\}\) denotes the function that assigns to each site its block, i.e. \(h(j) = i \Longleftrightarrow j \in B_i\). More precisely, with the standard basis vectors \((e_i)_{i = 1,\ldots ,k}\) of \({{\,\mathrm{\mathbb {R}}\,}}^k\) we have
$$\begin{aligned} {\hat{m}}' :={\hat{m}} - \frac{X_I - {\widetilde{X}}_I}{\sqrt{|{B_{h(I)}} |}}\, e_{h(I)}, \qquad \text {so that} \qquad {\hat{m}} - {\hat{m}}' = \frac{X_I - {\widetilde{X}}_I}{\sqrt{|{B_{h(I)}} |}}\, e_{h(I)}. \end{aligned}$$
(9)
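For concreteness, one step of this Glauber dynamics can be sketched as follows; this is a minimal illustration only (the interaction matrix and block sizes are arbitrary illustrative choices, and for the exchangeable pair one would of course start from \(X \sim \mu _J\) rather than from an arbitrary configuration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative block structure: k = 2 blocks of 4 sites each.
sizes = [4, 4]
N, k = sum(sizes), len(sizes)
A = np.array([[0.4, 0.2], [0.2, 0.4]])
h = np.repeat(np.arange(k), sizes)          # h(j) = block of site j

def m_hat(X):
    return np.array([X[h == i].sum() / np.sqrt(sizes[i]) for i in range(k)])

def glauber_step(X):
    """Resample one uniformly chosen spin from its conditional distribution."""
    I = rng.integers(N)
    m = np.array([X[h == j].sum() for j in range(k)])
    # conditional mean of the resampled spin, cf. Lemma 4 below:
    field = (A[h[I]] @ m - A[h[I], h[I]] * X[I]) / N
    X_new = X.copy()
    X_new[I] = 1 if rng.random() < 0.5 * (1.0 + np.tanh(field)) else -1
    return X_new

X = rng.choice([-1, 1], size=N)             # dummy start; in the proof X ~ mu_J
X_tilde = glauber_step(X)
print(m_hat(X) - m_hat(X_tilde))            # differs in at most one coordinate, cf. (9)
```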
We need the following lemma to identify the conditional expectation of \({\widetilde{X}}_i\).
Lemma 4
Let \({\mathcal {F}} = \sigma (X)\) and \((X, {\widetilde{X}})\) be defined as above. Then for each fixed \(i \in \{1, \ldots , N\}\)
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( {\widetilde{X}}_i \mid {\mathcal {F}} \right) = \tanh \left( \frac{1}{\sqrt{N}} (A\varGamma {\hat{m}})_{h(i)} - \frac{1}{N} A_{h(i)h(i)} X_i \right) . \end{aligned}$$
Proof
For any Ising model \(\mu = \mu _J\) the conditional distribution of \({\widetilde{X}}_i\) is given by \(\mu (\cdot \mid {\overline{X}}_i)\) and so
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( {\widetilde{X}}_i \mid {\mathcal {F}} \right) = 2\mu (1 \mid {\overline{X}}_i) - 1 = \tanh \left( (J^{(d)}X)_i \right) , \end{aligned}$$
where we recall the notation \(J^{(d)}\) for the matrix without its diagonal, i.e. \(J^{(d)} = J - {{\,\mathrm{diag}\,}}(J_{ii})\). In the case that \(J = J_n\) is the block model matrix, this yields
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( {\widetilde{X}}_i \mid {\mathcal {F}} \right)&= \tanh \left( N^{-1} \sum _{j = 1}^k A_{h(i)j} \sum _{l \in B_j} X_l - N^{-1} A_{h(i) h(i)} X_i \right) \\&= \tanh \left( N^{-1} (A m)_{h(i)} - N^{-1} A_{h(i) h(i)} X_i \right) \\&= \tanh \left( N^{-1/2} (A\varGamma {\hat{m}})_{h(i)} - N^{-1} A_{h(i)h(i)} X_i \right) . \end{aligned}$$
\(\square \)
Since the conditional expectation will be of importance, we define
$$\begin{aligned} g_i(X) :=N^{-1} (A m)_{h(i)} - N^{-1} A_{h(i) h(i)} X_i = N^{-1/2} (A\varGamma {\hat{m}})_{h(i)} - N^{-1} A_{h(i)h(i)} X_i, \end{aligned}$$
so that \({{\,\mathrm{\mathbb {E}}\,}}({\widetilde{X}}_i \mid {\mathcal {F}}) = \tanh (g_i(X))\). Note that \(g_i\) does not actually depend on \(X_i\); subtracting the term \(N^{-1} A_{h(i)h(i)} X_i\) merely removes the contribution of \(X_i\) from \((A m)_{h(i)}\). Thus we also have \(\tanh (g_i(X)) = {{\,\mathrm{\mathbb {E}}\,}}({\widetilde{X}}_i \mid {\overline{X}}_i)\).
Lemma 5
We have
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( {\hat{m}} - {\hat{m}}' \mid {\mathcal {F}}\right) = N^{-1} \left( {{\,\mathrm{Id}\,}}- \varGamma A \varGamma \right) {\hat{m}} + R(X), \end{aligned}$$
with
$$\begin{aligned} R(X) :=N^{-1} \sum _{i = 1}^k e_i \Big ( (\varGamma A \varGamma {\hat{m}})_i - |{B_i} |^{-1/2} \sum _{j \in B_i} \tanh \Big ( g_j(X) \Big ) \Big ). \end{aligned}$$
Proof
From Eq. (9) and Lemma 4 we obtain
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( {\hat{m}} - {\hat{m}}' \mid {\mathcal {F}} \right)&= N^{-1} \sum _{i = 1}^k e_i |{B_i} |^{-1/2}\sum _{j \in B_i} {{\,\mathrm{\mathbb {E}}\,}}( X_j - \widetilde{X_j} \mid {\mathcal {F}}) \\&= N^{-1} \sum _{i = 1}^k e_i {\hat{m}}_i - N^{-1} \sum _{i = 1}^k e_i |{B_i} |^{-1/2} \sum _{j \in B_i} \tanh (g_j(X)) \\&= N^{-1} {\hat{m}} - N^{-1} \sum _{i = 1}^k e_i |{B_i} |^{-1/2} \Big ( \sum _{j \in B_i} N^{-1/2} (A\varGamma {\hat{m}})_i \Big ) + R(X) \\&= N^{-1} \left( {{\,\mathrm{Id}\,}}- \varGamma A \varGamma \right) {\hat{m}} + R(X). \end{aligned}$$
\(\square \)
For n large enough we have \(\Vert {\varGamma A \varGamma } \Vert _{2 \rightarrow 2} < 1\), so that the matrix \(\varLambda :=N^{-1}({{\,\mathrm{Id}\,}}- \varGamma A \varGamma )\) is invertible, with inverse \(\varLambda ^{-1} = N \sum _{l = 0}^\infty (\varGamma A \varGamma )^l\). Moreover, we have \(\Vert {\varLambda ^{-1}} \Vert _{2 \rightarrow 2} \le N (1-\Vert {\varGamma A \varGamma } \Vert _{2 \rightarrow 2})^{-1}\).
We will need the following approximation theorem for random vectors.
Theorem 5
([30, Theorem 2.1]) Assume that \((W,W')\) is an exchangeable pair of \({{\,\mathrm{\mathbb {R}}\,}}^d\)-valued random vectors such that
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}W= 0, \quad \quad {{\,\mathrm{\mathbb {E}}\,}}W W^t = \varSigma , \end{aligned}$$
with \(\varSigma \in {{\,\mathrm{\mathbb {R}}\,}}^{d\times d}\) symmetric and positive definite. Suppose further that
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}[W' - W \mid W] = -\varLambda W + R \end{aligned}$$
is satisfied for an invertible matrix \(\varLambda \) and a \(\sigma (W)\)-measurable random vector R. Then, if Z has a d-dimensional standard normal distribution, we have for every three times differentiable function \(h: {{\,\mathrm{\mathbb {R}}\,}}^d \rightarrow {{\,\mathrm{\mathbb {R}}\,}}\)
$$\begin{aligned} |{{{\,\mathrm{\mathbb {E}}\,}}h(W) - {{\,\mathrm{\mathbb {E}}\,}}h(\varSigma ^{1/2} Z)} | \le \frac{|{h} |_2}{4} E_1 + \frac{|{h} |_3}{12} E_2 + \left( |{h} |_1 + \frac{1}{2} d \Vert {\varSigma } \Vert ^{1/2} |{h} |_2 \right) E_3, \end{aligned}$$
where, with \(\lambda {(i)} :=\sum _{m = 1}^d |{\left( \varLambda ^{-1} \right) _{m,i}} |\), we define the three error terms
$$\begin{aligned} E_1&= \sum _{i,j = 1}^d \lambda {(i)} \sqrt{{{\,\mathrm{Var}\,}}{{\,\mathrm{\mathbb {E}}\,}}\left[ (W_i'-W_i)(W_j'-W_j) \mid W \right] }, \\ E_2&= \sum _{i,j,k = 1}^d \lambda {(i)} {{\,\mathrm{\mathbb {E}}\,}}|{(W_i'-W_i)(W_j'-W_j)(W_k'-W_k)} |, \\ E_3&= \sum _{i = 1}^d \lambda {(i)} \sqrt{{{\,\mathrm{Var}\,}}R_i}. \end{aligned}$$
Here, \(|{h} |_j\) denotes the supremum of the absolute values of the partial derivatives of \(h\) up to order j.
Note that in the proof the choice of \(\sigma (W)\) for the conditional expectation is arbitrary; it suffices to take any \(\sigma \)-algebra \({\mathcal {F}}\) with respect to which W is measurable. Clearly, the value \(E_1\) has to be adjusted accordingly.
Corollary 2
Let \({\hat{m}}\) be the block magnetization vector and \({\hat{m}}'\) as above, define \(\varSigma :={{\,\mathrm{\mathbb {E}}\,}}{\hat{m}}{\hat{m}}^T\) and let \(Z \sim {\mathcal {N}}(0, \varSigma )\). For any function \(h \in {\mathcal {F}}_3\)
$$\begin{aligned} |{{{\,\mathrm{\mathbb {E}}\,}}h({\hat{m}}(X)) - {{\,\mathrm{\mathbb {E}}\,}}h(Z)} | \le CN \left( \frac{|{h} |_2}{4} E_1 + \frac{|{h} |_3}{12} E_2 + \left( |{h} |_1 + \frac{1}{2} k \Vert {\varSigma } \Vert ^{1/2} |{h} |_2 \right) E_3 \right) \end{aligned}$$
with the three error terms
$$\begin{aligned} E_1&= \sum _{i = 1}^k \sqrt{{{\,\mathrm{Var}\,}}\left( {{\,\mathrm{\mathbb {E}}\,}}(({\hat{m}}_i(X) - {\hat{m}}_i({\widetilde{X}}))^2 \mid {\mathcal {F}}) \right) } \\ E_2&= \sum _{i = 1}^k {{\,\mathrm{\mathbb {E}}\,}}|{{\hat{m}}_i(X) - {\hat{m}}_i({\widetilde{X}})} |^3 \\ E_3&= \sum _{i = 1}^k \sqrt{{{\,\mathrm{Var}\,}}(R_i)}. \end{aligned}$$
Finally, the following lemma shows that all error terms \(E_i\) can be bounded by a term of order \(N^{-3/2}\).
Lemma 6
In the situation of Corollary 2 we have
$$\begin{aligned} \max (E_1,E_2,E_3) = O(N^{-3/2}). \end{aligned}$$
Before we prove this lemma (and consequently Theorem 4), we state concentration of measure results for the block spin Ising model. These will be necessary to bound \(E_1, E_2, E_3\). The first step is the existence of a logarithmic Sobolev inequality for the Ising model \(\mu _{J_n}\) with a constant that is uniform in n.
Proposition 3
Under the general assumptions, if \(\Vert {\varGamma _\infty A \varGamma _\infty } \Vert _{2 \rightarrow 2} < 1\), then for n large enough the Ising model \(\mu _{J_n}\) satisfies a logarithmic Sobolev inequality with a constant \(\sigma ^2 = \sigma ^2(\Vert {\varGamma _\infty A \varGamma _\infty } \Vert _{2 \rightarrow 2})\), i.e. for any function \(f: \{-1,+1\}^N \rightarrow {{\,\mathrm{\mathbb {R}}\,}}\) we have
$$\begin{aligned} {{\,\mathrm{Ent}\,}}_{\mu _{J_n}}(f^2) \le 2\sigma ^2 \sum _{i = 1}^N {{\,\mathrm{\mathbb {E}}\,}}_{\mu _{J_n}} (f - f \circ T_i)^2, \end{aligned}$$
(10)
where \({{\,\mathrm{Ent}\,}}\) is the entropy functional and \(T_i(\sigma ) = (\sigma _1,\ldots , \sigma _{i-1}, -\sigma _i, \sigma _{i+1}, \ldots , \sigma _N)\) the sign flip operator.
This follows immediately from [23, Proposition 1.1], since \(\varGamma _n A \varGamma _n \rightarrow \varGamma _\infty A \varGamma _\infty \), which implies the convergence of the norms, i.e. for n large enough we have \(\Vert {\varGamma _n A \varGamma _n} \Vert _{2 \rightarrow 2} < 1\). Although the condition in [23] is \(\Vert {J} \Vert _{1 \rightarrow 1} < 1\), this was merely for applications’ sake and \(\Vert {J} \Vert _{2 \rightarrow 2} < 1\) is sufficient to establish the logarithmic Sobolev inequality.
For any function \(f: \{-1,+1\}^N \rightarrow {{\,\mathrm{\mathbb {R}}\,}}\) and any \(r \in \{1,\ldots ,N\}\) we write
$$\begin{aligned} {\mathfrak {h}}_r f(x) = |{f(x) - f(T_r x)} |, \end{aligned}$$
so that (10) becomes
$$\begin{aligned} {{\,\mathrm{Ent}\,}}_{\mu _{J_n}}(f^2) \le 2\sigma ^2 \sum _{r = 1}^N \int ({\mathfrak {h}}_r f(x))^2 d\mu _{J_n}(x). \end{aligned}$$
Moreover, it is known that (10) implies a Poincaré inequality
$$\begin{aligned} {{\,\mathrm{Var}\,}}(f) \le \sigma ^2 \sum _{r = 1}^N {{\,\mathrm{\mathbb {E}}\,}}{\mathfrak {h}}_r f(X)^2. \end{aligned}$$
(11)
Proof of Lemma 6
Error term \(\mathbf{E}_\mathbf{1}\): To treat the term \(E_1\), fix \(i \in \{1,\ldots , k\}\) and observe that
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}\left( ({\hat{m}}_i(X) - {\hat{m}}_i({\widetilde{X}}))^2 \mid {\mathcal {F}} \right)&= N^{-1} \sum _{j = 1}^N {{\,\mathrm{\mathbb {E}}\,}}\left( ({\hat{m}}_i(X) - {\hat{m}}_i({\overline{X}}_j, {\widetilde{X}}_j))^2 \mid {\mathcal {F}} \right) \\&= (N|{B_i} |)^{-1} \sum _{j \in B_i} {{\,\mathrm{\mathbb {E}}\,}}\left( (X_j - {\widetilde{X}}_j)^2 \mid {\mathcal {F}} \right) \\&= - 2(N|{B_i} |)^{-1} \sum _{j \in B_i} X_j \tanh (g_j(X)) + 2N^{-1}. \end{aligned}$$
Thus, if we define
$$\begin{aligned} f_i(X) :=|{B_i^{(n)}} |^{-1/2} \sum _{j \in B_i} X_j \tanh \left( N^{-1} \sum _{l = 1}^k A_{il} m_l(X) - N^{-1} A_{ii} X_j \right) , \end{aligned}$$
we see that
$$\begin{aligned} {{\,\mathrm{Var}\,}}^{1/2}\left( {{\,\mathrm{\mathbb {E}}\,}}\left( ({\hat{m}}_i^{(n)} - {\hat{m}}_i^{(n)}\, ')^2 \mid {\mathcal {F}} \right) \right) = 2N^{-1} |{B_i^{(n)}} |^{-1/2} {{\,\mathrm{Var}\,}}^{1/2}(f_i(X)), \end{aligned}$$
and we need to show that \({{\,\mathrm{Var}\,}}(f_i(X)) = O(1)\). By the Poincaré inequality (11) it suffices to prove that \({\mathfrak {h}}_r f_i(X)^2 \le C |{B_i^{(n)}} |^{-1}\) for all r, since then \({{\,\mathrm{Var}\,}}(f_i(X)) \le \sigma ^2 N C |{B_i^{(n)}} |^{-1} = O(1)\), the block sizes being proportional to N.
Let \(r \in \{1,\ldots , N\}\) be arbitrary and define \(h_i(X) :=N^{-1} \sum _{l = 1}^k A_{il} m_l(X) - N^{-1} A_{ii} X_i\). The first case is that \(r \in B_i^{(n)}\), for which
$$\begin{aligned} {\mathfrak {h}}_r f_i(X)&\le |{B_i} |^{-1/2} |{2X_r \tanh (h_i(X))} | + |{B_i} |^{-1/2} \sum _{\begin{array}{c} j \in B_i \\ j \ne r \end{array}} |{\tanh (h_i(X)) - \tanh (h_i(T_r X))} | \\&\le 4 |{B_i} |^{-1/2} + |{B_i} |^{-1/2} N^{-1} \sum _{\begin{array}{c} j \in B_i\\ j \ne r \end{array}} \Big | \sum _{l = 1}^k A_{il} (m_l(X) - m_l(T_r X)) \Big | \\&\le |{B_i} |^{-1/2} (4 + 2\Vert {A} \Vert _\infty ). \end{aligned}$$
The second case \(r \notin B_i^{(n)}\) follows by similar reasoning.
Error term \(\mathbf{E}_\mathbf{2}\): The second term \(E_2\) is much easier to estimate, as
$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}|{{\hat{m}}_i - {\hat{m}}_i'} |^3 = N^{-1} |{B_i} |^{-3/2} \sum _{j \in B_i} {{\,\mathrm{\mathbb {E}}\,}}|{X_j - {\widetilde{X}}_j} |^3 \le 8N^{-1} |{B_i} |^{-1/2} = O(N^{-3/2}). \end{aligned}$$
Error term \(\mathbf{E}_\mathbf{3}\): To estimate the variance of the remainder term R we first split it into two sums. For any \(i = 1, \ldots , k\) write
$$\begin{aligned} R_i(X)&= N^{-1} |{B_i} |^{-1/2} \sum _{j \in B_i} \Big ( g_j(X) - \tanh \big ( g_j(X) \big ) + N^{-1} A_{ii} X_j \Big ) \\&= N^{-1} |{B_i} |^{-1/2} \sum _{j \in B_i} \Big ( g_j(X) - \tanh \big ( g_j(X) \big ) \Big ) + N^{-2} A_{ii} {\hat{m}}_i(X) \\&=:R_i^{(1)}(X) + R_i^{(2)}(X). \end{aligned}$$
Clearly \(\Vert {R_i - {{\,\mathrm{\mathbb {E}}\,}}R_i} \Vert _2 \le \Vert {R_i^{(1)} - {{\,\mathrm{\mathbb {E}}\,}}R_i^{(1)}} \Vert _2 + \Vert {R_i^{(2)} - {{\,\mathrm{\mathbb {E}}\,}}R_i^{(2)}} \Vert _2\) and we estimate these terms separately. It is obvious that the \(L^2\) norm of the second term is of order \(O(N^{-2})\). To estimate \(R^{(1)}_i\), we use \(\tanh (x) - x = O(x^3)\) to obtain
$$\begin{aligned} \Vert {R^{(1)}_i - {{\,\mathrm{\mathbb {E}}\,}}R^{(1)}_i} \Vert _2&\le CN^{-1} |{B_i} |^{-1/2} \sum _{j \in B_i} \Vert { |{g_j(X)} |^3} \Vert _2 \\&\le C N^{-1} |{B_i} |^{-1/2} \sum _{j \in B_i} \Big ( \Vert {|{N^{-1/2} (A\varGamma {\hat{m}})_i} |^3} \Vert _2 + N^{-3} |{A_{ii}} |^3 \Big ) \\&= O(N^{-2}) + O(N^{-5/2}). \end{aligned}$$
In the last line we have used the fact that \(\Vert {(A \varGamma {\hat{m}})_i^3} \Vert _2 = \Vert {(A \varGamma {\hat{m}})_i} \Vert _6^3\) and that for all \(p \ge 2\)
$$\begin{aligned} \Vert {(A \varGamma {\hat{m}})_i} \Vert _p \le C \sum _{l = 1}^k \Vert {{\hat{m}}_l} \Vert _p \le C \sum _{l = 1}^k (\sigma ^2 p)^{1/2} \end{aligned}$$
which evaluated at \(p = 6\) gives \(\Vert {(A\varGamma {\hat{m}})_i} \Vert _6^3 = O(1)\). For the details see [23]. The constant depends on a norm of \(A \varGamma \), which by convergence to \(A \varGamma _\infty \) can again be chosen independently of n. \(\square \)
Proof of Theorem 4
The theorem follows immediately from Corollary 2 and Lemma 6. \(\square \)