Abstract
We study the natural linear operators associated to divide and color (DC) models. The degree of nonuniqueness of the random partition yielding a DC model is directly related to the dimension of the kernel of these linear operators. We determine exactly the dimension of these kernels as well as analyze a permutation-invariant version. We also obtain properties of the solution set for certain parameter values which will be important in (1) showing that large threshold discrete Gaussian free fields are DC models and in (2) analyzing when the Ising model with a positive external field is a DC model, both in future work. However, even here, we give an application to the Ising model on a triangle.
1 Introduction, Some Notation and Summary of Results
There is a very simple mechanism for constructing random variables with a (positive) dependency structure, which are called divide and color models. These were introduced in their general form in [4], but have already arisen in many different contexts.
Definition 1
A \(\{0,1\}\)-valued process \( X :=(X_i)_{i \in S} \) is a divide and color (DC) model if \( X \) can be generated as follows. First, choose a random partition of S according to some arbitrary distribution \(\pi \), and then independent of this and independently for different partition elements in the random partition, assign, with probability \( p \), all the variables in a partition element the value \( 1 \) and with probability \( 1-p \) assign all the variables the value \( 0 \). This final \(\{0,1\}\)-valued process is called the DC model associated to \((\pi ,p)\). We also say that \((\pi ,p)\) is a color representation of X.
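The two-step mechanism in Definition 1 can be sketched in a few lines. The following is only an illustration under our own conventions: sites are labelled 0, ..., n-1, a partition is a list of blocks, and the names `sample_dc` and `partition_law` are hypothetical helpers that do not appear in the paper.

```python
import random

def sample_dc(partition_law, p, rng):
    """Draw one sample of a divide and color model.

    partition_law: list of (partition, probability) pairs, where a partition
    is a list of blocks and each block is a list of site indices 0..n-1.
    Returns the sampled partition and the resulting 0/1 configuration.
    """
    partitions = [sigma for sigma, _ in partition_law]
    weights = [w for _, w in partition_law]
    # Step 1: choose the random partition according to pi.
    sigma = rng.choices(partitions, weights=weights, k=1)[0]
    n = sum(len(block) for block in sigma)
    x = [0] * n
    # Step 2: color each block independently, 1 with probability p.
    for block in sigma:
        colour = 1 if rng.random() < p else 0
        for i in block:
            x[i] = colour
    return sigma, x

rng = random.Random(0)
# Hypothetical law on partitions of three sites, chosen only for illustration.
law = [([[0, 1, 2]], 0.5), ([[0], [1, 2]], 0.5)]
print(sample_dc(law, 0.3, rng))
```

By construction, every sample is constant on the blocks of the partition that was drawn, which is the only structural property the definition imposes.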
As detailed in [4], many processes in probability theory are DC models; examples are the Ising model with zero external field, the fuzzy Potts model with zero external field, the stationary distributions for the voter model and random walk in random scenery.
While certainly the distribution of a divide and color model determines p, it in fact does not determine the distribution of \(\pi \). This was seen for small sets \( S \) in [4], and this lack of uniqueness will essentially be completely determined in this paper.
Given a set S, we let \(\mathcal {B}_S\) denote the collection of partitions of S. We denote \(\{1,2,3, \ldots , n\}\) by [n] and if \(S =[n] \), we write \(\mathcal {B}_n\) for \(\mathcal {B}_S\). \( |\mathcal {B}_n| \) is called the \(n \)th Bell number. We let \( P_n \) denote the set of integer partitions of \( n \). Since S will always be finite, we will, without loss of generality, assume it is equal to \( [n]\) for some \(n \in \mathbb {N}\).
The law of any random partition of [n] can be identified with a probability vector \(q=\{q_\sigma \}_{\sigma \in \mathcal {B}_n}\in \mathbb {R}^{\mathcal {B}_n}\). Similarly, the law of any random \(\{0,1\}\)-valued vector \((X_1,\ldots ,X_n)\) can be identified with a probability vector \(\nu =(\nu (\rho ))_{\rho \in \{ 0,1 \}^n}\in \mathbb {R}^{\{ 0,1 \}^n}\). The definition of a DC model yields immediately, for each n and \(p\in [0,1]\), a map \(\Phi _{n,p}\) from random partitions of [n], i.e., from probability vectors \(q=\{q_\sigma \}_{\sigma \in \mathcal {B}_n}\) to probability vectors \(\nu =(\nu _\rho )_{\rho \in \{ 0,1 \}^n}\). While a triviality, a crucial observation is that \(\Phi _{n,p}\) is an affine map of the relevant simplices. As a result, this map naturally extends to a linear mapping \(A_{n,p}\) from \(\mathbb {R}^{\mathcal {B}_n}\) to \(\mathbb {R}^{\{ 0,1 \}^n}\). This will allow us to more easily analyze questions concerning nonuniqueness of color representations, since, by placing the problem in a vector space context, one can consider formal solutions (to be defined below) which one might show afterward are in fact nonnegative solutions and therefore color representations.
To describe the linear operator \(A_{n,p}\), we identify \(\mathcal {B}_n\) with the natural basis for \(\mathbb {R}^{\mathcal {B}_n}\) in which case \(A_{n,p}\) is uniquely determined by giving the image of each \( \sigma \in \mathcal {B}_n\) which is done as follows. Given \( \sigma \in \mathcal {B}_n\) and a binary string \( \rho \in \{ 0,1 \}^n \), we write \( \sigma \lhd \rho \) if \(\rho \) is constant on the partition elements of \(\sigma \). We then have
$$\begin{aligned} (A_{n,p}\, \sigma )(\rho ) = \begin{cases} p^{c(\sigma ,\rho )}(1-p)^{\Vert \sigma \Vert -c(\sigma ,\rho )} & \text {if } \sigma \lhd \rho , \\ 0 & \text {otherwise,} \end{cases} \end{aligned}$$
where \( \Vert \sigma \Vert \) is equal to the number of partition elements in the partition \( \sigma \) and \( c = c(\sigma , \rho )\) is the number of partition elements on which \(\rho \) is 1. We would not be surprised if this operator has occurred in other contexts, but we have not been able to find it in the literature.
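The operator \(A_{n,p}\) is easy to tabulate for small n. The sketch below builds its matrix with exact rational arithmetic, with rows indexed by \(\rho \in \{0,1\}^n\) and columns by \(\sigma \in \mathcal {B}_n\); the entry is \(p^{c}(1-p)^{\Vert \sigma \Vert -c}\) when \(\sigma \lhd \rho \) and zero otherwise. The function names and the labelling of sites by 0, ..., n-1 are our own illustrative choices.

```python
from fractions import Fraction
from itertools import product

def set_partitions(n):
    """All partitions of {0, ..., n-1}, each given as a list of blocks."""
    if n == 0:
        return [[]]
    result = []
    for smaller in set_partitions(n - 1):
        # Put the new element n-1 into each existing block in turn ...
        for k in range(len(smaller)):
            result.append(smaller[:k] + [smaller[k] + [n - 1]] + smaller[k + 1:])
        # ... or into a new singleton block.
        result.append(smaller + [[n - 1]])
    return result

def dc_matrix(n, p):
    """Return (rhos, sigmas, rows), where rows[i][j] = (A_{n,p} sigma_j)(rho_i)."""
    sigmas = set_partitions(n)
    rhos = list(product((0, 1), repeat=n))
    rows = []
    for rho in rhos:
        row = []
        for sigma in sigmas:
            # sigma <| rho means rho is constant on every block of sigma.
            if all(len({rho[i] for i in block}) == 1 for block in sigma):
                c = sum(rho[block[0]] for block in sigma)  # blocks colored 1
                row.append(p ** c * (1 - p) ** (len(sigma) - c))
            else:
                row.append(Fraction(0))
        rows.append(row)
    return rhos, sigmas, rows

rhos, sigmas, A = dc_matrix(3, Fraction(1, 3))
print(len(rhos), len(sigmas))
```

Each column sums to one, reflecting the fact that \(A_{n,p}\) extends a map between probability simplices.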
As seen in [1] and [4], the cases \(p=1/2\) and \(p\ne 1/2\) behave quite differently when it comes to DC models. We will see this difference also below when studying the dimension of the kernels of the corresponding operators. In [4], the below was obtained for some small values of n. For \( \rho \in \{ 0,1\}^n \), we write \( - \rho \) to denote the binary string where the zeros and ones in \(\rho \) are switched, i.e., \( -\rho = 1 - \rho \).
Theorem 1
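\(A_{n,\frac{1}{2}}\) has rank \(2^{n-1}\) and hence nullity \(|\mathcal {B}_n|-2^{n-1}\). The range of \(A_{n,\frac{1}{2}}\) is
$$\begin{aligned} \big \{ \nu = (\nu (\rho ))_{\rho \in \{ 0,1\}^n} : \nu (\rho ) = \nu (-\rho ) \,\,\, \forall \rho \in \{ 0,1\}^n \big \}. \end{aligned}$$(1)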
Remark 1
In the proof of Theorem 1, we will obtain a concrete formula for a formal solution (to be defined below) for any \(\nu \) satisfying (1). If, in addition, \(\nu \) is a probability vector with the property that the probability of being constant is at least 1/2, then this formal solution will be a nonnegative solution, and hence, \(\nu \) will be a divide and color model. We mention that there is no such result when \(p\ne 1/2\).
Theorem 2
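If \(p\not \in \{0, 1/2,1\}\), then \(A_{n,p}\) has rank \(2^n-n\) and hence nullity \(|\mathcal {B}_n|-(2^n-n)\). The range of \(A_{n,p}\) is equal to
$$\begin{aligned} \big \{ \nu = (\nu (\rho ))_{\rho \in \{ 0,1\}^n} : p\sum _{\rho :\rho (i)=0} \nu (\rho ) = (1-p)\sum _{\rho :\rho (i)=1} \nu (\rho ) \,\,\, \forall i \in [n] \big \}. \end{aligned}$$(2)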
(The vector subspace defined by (2) is the vector space analog of the marginal distributions each being \(p\delta _1 + (1-p)\delta _0\).) In particular, if \(\nu \) is a probability vector on \( \{ 0,1\}^n \), all of whose marginals are \(p\delta _1 + (1-p)\delta _0\), then \(\nu \) is in the range of \(A_{n,p}\). (Of course, there might not be a probability vector \( q = (q_\sigma )_{\sigma \in \mathcal {B}_n}\) which maps to \(\nu \); i.e., \(\nu \) need not be a DC model.)
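For small n, the ranks in Theorems 1 and 2 can be confirmed by exact Gaussian elimination over the rationals. The sketch below is only a sanity check for n up to 4 and a single value \(p=1/3\ne 1/2\); it is not a substitute for the proofs, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def set_partitions(n):
    """All partitions of {0, ..., n-1}, each given as a list of blocks."""
    if n == 0:
        return [[]]
    result = []
    for smaller in set_partitions(n - 1):
        for k in range(len(smaller)):
            result.append(smaller[:k] + [smaller[k] + [n - 1]] + smaller[k + 1:])
        result.append(smaller + [[n - 1]])
    return result

def dc_matrix(n, p):
    """Matrix of A_{n,p}: rows indexed by rho, columns by sigma."""
    rows = []
    for rho in product((0, 1), repeat=n):
        row = []
        for sigma in set_partitions(n):
            if all(len({rho[i] for i in block}) == 1 for block in sigma):
                c = sum(rho[block[0]] for block in sigma)
                row.append(p ** c * (1 - p) ** (len(sigma) - c))
            else:
                row.append(Fraction(0))
        rows.append(row)
    return rows

def rank(rows):
    """Rank of a rational matrix by exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

for n in range(2, 5):
    half = rank(dc_matrix(n, Fraction(1, 2)))
    third = rank(dc_matrix(n, Fraction(1, 3)))
    print(n, half, 2 ** (n - 1), third, 2 ** n - n)
```

The drop in rank at \(p=1/2\) is visible as soon as \(n=3\), where \(2^{n-1}=4\) while \(2^n-n=5\).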
We now discuss the relationship between a nontrivial kernel and nonunique color representations. Given n, p and \(\nu \), a (not necessarily nonnegative) vector \(q\in \mathbb {R}^{\mathcal {B}_n}\) is called a formal solution if
$$\begin{aligned} A_{n,p}\, q = \nu , \end{aligned}$$(3)
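while a nonnegative such vector q is called a nonnegative solution. It is easy to see, using inclusion–exclusion, that (3) is equivalent to the system
$$\begin{aligned} \sum _{\sigma \in \mathcal {B}_n} p^{\Vert \sigma _S \Vert } q_\sigma = \nu (1^S), \quad S \subseteq [n], \end{aligned}$$(4)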
where \( \sigma _S \) denotes the restriction of a partition \( \sigma \in \mathcal {B}_n \) to a set \( S \subseteq [n] \), and \( 1^S \) is the event that \( \rho \in \{ 0,1 \}^n\) is equal to 1 on \( S \). If \(\nu \) is a probability vector, then the sum of the coordinates of any formal solution will always be one, but it will be a nonnegative solution if and only if it corresponds to a divide and color representation. Therefore, the relationship between nontriviality of the kernel of \( A_{n,p} \) and uniqueness of a color representation (i.e., uniqueness of a nonnegative solution) is as follows. First, of course \( A_{n,p} \) has a nontrivial kernel if and only if for any \(\nu \) in the range, there are an infinite number of formal solutions. Hence, if the kernel is trivial, there is always at most one divide and color representation for any DC model. The converse is not true since an i.i.d. process clearly has at most one color representation even when the kernel is nontrivial. However, as is also explained in [4], if the kernel is nontrivial, \(\nu \) is in the range and there exists a nonnegative solution all of whose coordinates are positive, then one has infinitely many nonnegative solutions since we can add a small constant times an element in the kernel. More generally, if \(\nu \) is in the range and there exists a nonnegative solution q for \(\nu \), then there is another nonnegative solution (and then infinitely many) if and only if there is an element \(q'\) in the kernel whose negative-valued coordinates are contained in the support of q.
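One direction of the equivalence between (3) and (4) can be checked mechanically: if \(\nu = A_{n,p}\, q\), then for every \(S \subseteq [n]\) the sum of \(p^{\Vert \sigma _S \Vert } q_\sigma \) over \(\sigma \) equals \(\nu (1^S)\), where \(\Vert \sigma _S \Vert \) counts the partition elements of \(\sigma \) that meet S. The sketch below verifies this for an arbitrary signed vector q, so it also covers formal solutions that are not nonnegative. The helper names and the test vector are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations, product

def set_partitions(n):
    """All partitions of {0, ..., n-1}, each given as a list of blocks."""
    if n == 0:
        return [[]]
    result = []
    for smaller in set_partitions(n - 1):
        for k in range(len(smaller)):
            result.append(smaller[:k] + [smaller[k] + [n - 1]] + smaller[k + 1:])
        result.append(smaller + [[n - 1]])
    return result

def push_forward(n, p, q):
    """Compute nu = A_{n,p} q, with q listed in the order of set_partitions(n)."""
    nu = {rho: Fraction(0) for rho in product((0, 1), repeat=n)}
    for sigma, weight in zip(set_partitions(n), q):
        for colours in product((0, 1), repeat=len(sigma)):
            rho = [0] * n
            for block, colour in zip(sigma, colours):
                for i in block:
                    rho[i] = colour
            c = sum(colours)
            nu[tuple(rho)] += weight * p ** c * (1 - p) ** (len(sigma) - c)
    return nu

def restricted_size(sigma, S):
    """Number of blocks of sigma that intersect S, i.e. ||sigma_S||."""
    return sum(1 for block in sigma if any(i in S for i in block))

def satisfies_system(n, p, q):
    """Check sum_sigma p^{||sigma_S||} q_sigma == nu(1^S) for all S."""
    nu = push_forward(n, p, q)
    sigmas = set_partitions(n)
    for r in range(n + 1):
        for S in combinations(range(n), r):
            lhs = sum(p ** restricted_size(s, S) * w for s, w in zip(sigmas, q))
            rhs = sum(v for rho, v in nu.items() if all(rho[i] == 1 for i in S))
            if lhs != rhs:
                return False
    return True

q = [Fraction(k - 2, 7) for k in range(5)]  # signed test vector for n = 3
print(satisfies_system(3, Fraction(2, 5), q))
```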
It is sometimes natural to consider situations where one has some further invariance property, a special case being full invariance meaning everything considered is invariant under the full symmetric group. In this case, the characterization of the ranges and of the dimensions of the kernel are given in Theorem 5 in Sect. 2.3.
Our next theorem will be used in [3] to show that threshold discrete Gaussian free fields are DC models for large threshold. This theorem is included here since it heavily relies on the algebraic picture used in the proofs of the above results. It gives sufficient conditions for a family of probability measures \( \nu _p \) to be a DC model for small \( p \) in terms of the asymptotic behavior of certain probabilities as \(p\rightarrow 0\). Below, for a given set S of coordinates, \( \nu _p(1^S) \) will denote the \(\nu _p\)-probability that we have all 1’s on S and \(\nu _p(1^S 0 ^{S^c})\) will denote the \(\nu _p\)-probability that we have all 1’s on S and 0’s on \(S^c\).
Theorem 3
Let \( (\nu _p)_{p \in (0,1)} \) be a family of probability measures on \( \{ 0,1\}^n \). Assume that \( \nu _p \) has marginals \( p\delta _1 + (1-p) \delta _0 \) and that for all \( S \subseteq [n] \) with \( |S| \ge 2 \) and all \( k \in S \), as \( p \rightarrow 0 \), we have that
and
Then, \( \nu _p \) is a DC model for all sufficiently small \( p > 0 \).
Next, since \(p=1/2\) plays a special role, it will turn out to be useful to understand the limiting behavior of the solution set as \(p\rightarrow 1/2\). This will also be needed in understanding which Ising models are DC models in the presence of an external field; the latter will be studied in [2]. The following result captures this limiting behavior.
Theorem 4
Let \( (\nu _p)_{p \in (0,1)} \) be a family of probability measures on \( \{ 0,1\}^n \). Assume that \( \nu _p \) has marginals \( p\delta _1 + (1-p) \delta _0 \), and that for each \( S \subseteq [n] \), \( \nu _p(1^S) \) is differentiable in \( p \) at \( p = 1/2 \). Assume further that for any sets \( T \subseteq [n] \) and \( S \subseteq T \), and any \( p \in (0,1) \), we have that
Finally, for each \( p \in (0,1) \) let \( (q_\sigma ^{(p)})_{\sigma \in \mathcal {B}_n} \) be a formal solution to the equation
Then, the set of subsequential limits \( (q_\sigma )_{\sigma \in \mathcal {B}_n} \) of sequences \( ((q_\sigma ^{(p)})_{\sigma \in \mathcal {B}_n})_{p\in (0,1)} \) as \( p \rightarrow 1/2 \) is exactly the set of solutions to the system of equations
Remark 2
It follows from the proof of this theorem that the system of linear equations given by the equations in (6) corresponding to even sets is equivalent to the linear equation system in (4) for \( p = 1/2 \).
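The following application of Theorem 4 will be proved in Sect. 4. We consider the Ising model on a triangle with parameters J and h; this is the probability measure on \({\{1,-1\}}^{[3]}\) which has relative weights
$$\begin{aligned} \exp \Big ( J\sum _{1\le i<j\le 3} \eta _i\eta _j + h \sum _{i=1}^{3} \eta _i \Big ) \end{aligned}$$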
to the configuration \(\eta \). Call this measure \(\nu _{J,h}\). For any \(J\ge 0\) and \(h>0\), by Theorem 2, there is a unique \(q^{J,h}\in \mathbb {R}^{\mathcal {B}_3}\) with \(A_{3,p} q^{J,h}=\nu _{J,h}\) where \(p=p(J,h)\) is chosen to be the probability that a single site is positive. The uniqueness of this solution also follows from Theorem 2.1(C) in [4]. If we now, for fixed J, let h tend to zero, then any subsequential limit \(q^J\) of \((q^{J,h})\) necessarily satisfies \(A_{3,1/2} q^J=\nu _{J,0}\). One natural random partition which yields \(\nu _{J,0}\) as its color process is the so-called random cluster model or Fortuin–Kasteleyn representation denoted by \(q^{\text {RCM}}\). Interestingly, it turns out that \(q^{\text {RCM}}\) does not correspond to the small h limit. This was first observed by the second author and Johan Tykesson with the help of Mathematica. Here, we obtain it as a direct corollary of Theorem 4.
Corollary 1
For all \(J> 0\), \(\lim _{h\rightarrow 0}q^{J,h}\) exists and does not equal \(q^{\text {RCM}}\).
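To make the objects in Corollary 1 concrete, the sketch below computes \(\nu _{J,h}\) and the single-site marginal \(p(J,h)\) by brute force. We assume here that the relative weight of a configuration \(\eta \) is \(\exp (J\sum _{i<j}\eta _i\eta _j + h\sum _i \eta _i)\); this normalization is our reading, chosen because it reproduces the value of \(p'(h)\) at \(h=0\) quoted in the proof of Corollary 1. The function names are illustrative.

```python
import math
from itertools import product

def ising_triangle(J, h):
    """Ising measure on the triangle, assuming weights exp(J*sum_pairs + h*sum_sites)."""
    weights = {}
    for eta in product((1, -1), repeat=3):
        pair_sum = eta[0] * eta[1] + eta[0] * eta[2] + eta[1] * eta[2]
        weights[eta] = math.exp(J * pair_sum + h * sum(eta))
    Z = sum(weights.values())
    return {eta: w / Z for eta, w in weights.items()}

def site_marginal(J, h):
    """p(J, h): the probability that a single site is positive."""
    return sum(v for eta, v in ising_triangle(J, h).items() if eta[0] == 1)

def central_difference(f, x, eps=1e-5):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def marginal_slope_closed_form(J):
    """The value of p'(h) at h = 0 quoted in the proof of Corollary 1."""
    return (3 * math.exp(3 * J) + math.exp(-J)) / (2 * math.exp(3 * J) + 6 * math.exp(-J))

J = 0.7
numeric = central_difference(lambda h: site_marginal(J, h), 0.0)
print(site_marginal(J, 0.0), numeric, marginal_slope_closed_form(J))
```

At \(h=0\) the marginal equals 1/2 by spin-flip symmetry, which is why the limit \(h\rightarrow 0\) corresponds to \(p\rightarrow 1/2\).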
The rest of the paper is organized as follows. The proofs of Theorems 1, 2 and 5 will be given in Sect. 2. Then, Theorem 3 will be proved in Sect. 3 and Theorem 4 as well as Corollary 1 will be proved in Sect. 4.
2 Dimension of the Kernels of the Induced Linear Operators
2.1 Formal Solutions for the \(p=1/2 \) case
In this subsection, we prove Theorem 1 and demonstrate the statement made in the remark after the statement of this theorem.
Proof (Proof of Theorem 1)
Given a \( \{ 0,1\} \)-symmetric probability vector \(\nu = (\nu (\rho ))\) on \(\{ 0,1\}^n\), it is easy to verify (and left to the reader) that a formal solution (i.e., a solution to \(A_{n,\frac{1}{2}}q= \nu \)) is given by
In addition, this yields a color representation (i.e., a nonnegative solution to \(A_{n,\frac{1}{2}} q=\nu \)) if and only if
by observing that
Clearly, every element of the range must satisfy the symmetry condition (1) since \(p=1/2\) while the first part of the proof shows that any vector satisfying (1) is in the range. This proves the description of the range and from this, it follows immediately that the rank is \(2^{n-1}\), and hence, the nullity is \(|\mathcal {B}_n|-2^{n-1}\).
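The following sketch exhibits one explicit formal solution for \(p=1/2\) that is supported on partitions with at most two partition elements. It is our own reconstruction from the definition of \(A_{n,\frac{1}{2}}\) and need not coincide term by term with the displayed formula of the proof: we set \(q_{\{S,S^c\}} = 4\nu (1^S0^{S^c})\) for each nonempty proper S, and choose \(q_{\{[n]\}}\) so that the value at the all-ones string is matched. Function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def formal_solution_half(nu, n):
    """A formal solution of A_{n,1/2} q = nu for a symmetric nu.

    nu maps each rho in {0,1}^n to a number with nu[rho] == nu[1 - rho].
    The result maps partitions (frozensets of frozensets) to weights and is
    supported on partitions with at most two blocks.
    """
    ones = (1,) * n
    q = {}
    pair_total = Fraction(0)
    for rho in product((0, 1), repeat=n):
        # Use the representative of {S, S^c} in which site 0 belongs to S.
        if rho[0] == 1 and rho != ones:
            S = frozenset(i for i in range(n) if rho[i] == 1)
            q[frozenset({S, frozenset(range(n)) - S})] = 4 * nu[rho]
            pair_total += nu[rho]
    q[frozenset({frozenset(range(n))})] = 2 * nu[ones] - 2 * pair_total
    return q

def colour_law_half(q, n):
    """Apply A_{n,1/2} to q."""
    nu = {rho: Fraction(0) for rho in product((0, 1), repeat=n)}
    for sigma, weight in q.items():
        blocks = list(sigma)
        for colours in product((0, 1), repeat=len(blocks)):
            rho = [0] * n
            for block, colour in zip(blocks, colours):
                for i in block:
                    rho[i] = colour
            nu[tuple(rho)] += weight * Fraction(1, 2 ** len(blocks))
    return nu

# A symmetric probability vector on {0,1}^3 that is constant with probability 3/5.
nu = {(0, 0, 0): Fraction(3, 10), (1, 1, 1): Fraction(3, 10),
      (1, 0, 0): Fraction(1, 10), (0, 1, 1): Fraction(1, 10),
      (0, 1, 0): Fraction(1, 20), (1, 0, 1): Fraction(1, 20),
      (0, 0, 1): Fraction(1, 20), (1, 1, 0): Fraction(1, 20)}
q = formal_solution_half(nu, 3)
print(colour_law_half(q, 3) == nu, all(w >= 0 for w in q.values()))
```

For a probability vector, this choice gives \(q_{\{[n]\}} = 4\nu (1^{[n]})-1\), which is nonnegative exactly when the probability of being constant is at least 1/2, in line with Remark 1.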
2.2 Formal Solutions for the \(p\ne 1/2\) Case
In this subsection, we prove Theorem 2.
Proof (Proof of Theorem 2)
Step 1
The rank of \(A_{n,p}\) is at least \(2^n-n\).
Proof (Proof of Step 1)
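Let \( A'_{n,p} \) be the \(2^n\times |\mathcal {B}_n|\) matrix corresponding to the left-hand side of (4), i.e., let
$$\begin{aligned} A'_{n,p}(S,\sigma ) := p^{\Vert \sigma _S \Vert }, \quad S \subseteq [n],\ \sigma \in \mathcal {B}_n. \end{aligned}$$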
It suffices to show that the rank of \( A'_{n,p} \) is at least \(2^n-n\).
Let \(\sigma ^\emptyset \) be the partition into singletons and for each \( T \subseteq [n] \) with \( |T| > 1 \), let \( \sigma ^T \in \mathcal {B}_n \) be the unique partition with exactly one non-singleton partition element given by \( T\). If, e.g., \( n = 5 \) we would have that \( \sigma ^{\{ 1,2,3 \}} = (123,4,5) \). One easily verifies that \( \Vert (\sigma ^T)_S \Vert = |S\backslash T| + (1\wedge |S\cap T|) \) for \(T=\emptyset \) or \( |T| > 1 \).
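The identity \( \Vert (\sigma ^T)_S \Vert = |S\backslash T| + (1\wedge |S\cap T|) \) can be checked exhaustively for small n. In the sketch below sites are labelled 0, ..., n-1 and the helper names are illustrative.

```python
from itertools import combinations

def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

def sigma_T(n, T):
    """The partition whose only possible non-singleton block is T."""
    blocks = [frozenset(T)] if T else []
    blocks += [frozenset({i}) for i in range(n) if i not in T]
    return blocks

def restricted_size(sigma, S):
    """Number of blocks of sigma that meet S, i.e. ||sigma_S||."""
    return sum(1 for block in sigma if block & S)

def formula_holds(n):
    """Check the identity for T empty or |T| > 1 and every S."""
    for T in subsets(n):
        if len(T) == 1:
            continue
        for S in subsets(n):
            expected = len(S - T) + min(1, len(S & T))
            if restricted_size(sigma_T(n, T), S) != expected:
                return False
    return True

print(formula_holds(5))
```

There are \(2^n-n\) admissible sets T, which is where the lower bound on the rank comes from.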
Consider the equation system
and let \( A''=A_{n,p}'' \) be the corresponding \(2^n\times (2^n-n)\) matrix. Define \( B = ( B(S,S'))_{S,S' \subseteq [n]} \) by
If we order the rows (from top to bottom) and columns (from left to right) of \( B \) such that the sizes of the corresponding sets are increasing, then \( B \) is a lower triangular matrix with \( B(S,S) = 1 \) for all \( S \subseteq [n] \). In particular, this implies that \( B \) is invertible for all \( p \in (0,1) \), and hence, \( A'' \) and \( BA'' \) (also a \(2^n\times (2^n-n)\) matrix) have the same rank. Moreover, for any \( S,T \subseteq [n] \) with \( |T| \not = 1 \) we get
In the case \( S \subseteq T \), we can simplify further to obtain
Note that since \(p\ne 1/2\), if \( S \subseteq T \), then \( (BA'')(S,T) = 0 \) if and only if \( |S| = 1\). If we order the rows (from top to bottom) and columns (from left to right) of \( BA'' \) so that the corresponding sets are increasing in size, it is obvious that the \((2^n -n) \times (2^n -n) \) submatrix of \( BA'' \) obtained by removing the rows corresponding to \(|S|=1\) has full rank. This implies that \( BA'' \) has rank at least \( 2^n -n \) which implies the same for \( A'' \) since B is invertible. Finally, since \( A'' \) is a submatrix of \(A'_{n,p} \), we obtain the desired lower bound on the rank of the latter. \(\square \)
Step 2
The rank of \(A_{n,p}\) is at most \(2^n-n\).
Proof (Proof of Step 2)
We first claim that if \( \nu = ( \nu ({\rho }))_{\rho \in \{ 0,1 \}^n}\) is in the range, then it is in the set defined in (2). To see this, let \(\nu =A_{n,p} \, q\) for some \( q = (q_\sigma )_{\sigma \in \mathcal {B}_n} \) and fix an \(i \in [n]\). The expression in the left-hand side of (2) becomes
With \(\sigma \in \mathcal {B}_n\) fixed, let
be the bijection which flips \(\rho \) on the partition element of \(\sigma \) which contains i. It is clear that for all \(\rho \) with \(\rho (i)=0\), we have
and hence, the previous expression is
\(A_{n,p}\) is mapping into a \(2^n\)-dimensional vector space, and each of the n equations in (2) gives one linear constraint. It is easy to see that these n constraints are linearly independent (e.g., one can see this by just looking at the number of times each of the vectors \(0^k1^{n-k}\) appears on the two sides). It follows that the rank of \(A_{n,p}\) is at most \(2^n-n\). \(\square \)
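The linear independence of the n marginal constraints can also be checked directly. In the sketch below we write the ith constraint as \(p\sum _{\rho :\rho (i)=0}\nu (\rho ) - (1-p)\sum _{\rho :\rho (i)=1}\nu (\rho )=0\), which is our reading of (2), and compute the rank of the resulting \(n\times 2^n\) coefficient matrix exactly. As a consistency check, the i.i.d. law with parameter p satisfies every constraint. Function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a rational matrix by exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def marginal_constraints(n, p):
    """Row i has entry p where rho(i) = 0 and -(1 - p) where rho(i) = 1."""
    rhos = list(product((0, 1), repeat=n))
    return [[p if rho[i] == 0 else -(1 - p) for rho in rhos] for i in range(n)]

def iid_law(n, p):
    """Product measure with marginals p*delta_1 + (1-p)*delta_0."""
    return [p ** sum(rho) * (1 - p) ** (n - sum(rho))
            for rho in product((0, 1), repeat=n)]

p = Fraction(1, 3)
for n in range(1, 6):
    C = marginal_constraints(n, p)
    residuals = [sum(c * v for c, v in zip(row, iid_law(n, p))) for row in C]
    print(n, rank(C), residuals)
```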
With Steps 1 and 2 completed, together with the claim at the start of Step 2, we conclude that the rank is as claimed and the range is characterized as claimed. Finally, the claim concerning probability vectors follows immediately.
Remark 3
(i) The argument for the \(p\ne 1/2\) case can equally well be carried out with minor modifications for the \(p=1/2\) case, but we preferred the simpler argument which even gives more.
(ii) This last proof shows that, when dealing with formal solutions, we only need to use partitions which have at most one non-singleton partition element. This is in sharp contrast to the earlier proof of the \(p=1/2\) case where we only needed to use partitions which have at most two partition elements.
(iii) The rank of an operator as a function of its matrix elements is not continuous, but it is easily seen to be lower semicontinuous. We see this lack of continuity at \(p=1/2\) as well as of course at \(p=0 \) and \( p = 1 \).
2.3 The Fully Invariant Case
It is sometimes natural to consider situations where one has some further invariance property. One natural case is the following. The symmetric group \(S_n\) acts naturally on \(\mathcal {B}_n\), \(\{ 0,1\}^n\), \(\mathcal {P}(\mathcal {B}_n)\), \(\mathcal {P}(\{ 0,1\}^n)\), \(\mathbb {R}^{\mathcal {B}_n}\) and \(\mathbb {R}^{\{ 0,1 \}^n}\) where \(\mathcal {P}(X)\) denotes the set of probability measures on X. (Of course \(\mathcal {P}(\mathcal {B}_n)\subseteq \mathbb {R}^{\mathcal {B}_n}\) and the action on the former is just the restriction of the action on the latter; similarly for \(\mathcal {P}(\{ 0,1\}^n)\subseteq \mathbb {R}^{\{ 0,1 \}^n}\).) To understand uniqueness of a color representation when we restrict to \(S_n\)-invariant probability measures, it is natural to again extend to the vector space setting, which is done as follows. Let \(Q^\mathrm{{Inv}}_n:=\{q\in \mathbb {R}^{\mathcal {B}_n}: g(q) = q \,\,\,\forall g\in S_n\}\) and \(V^\mathrm{{Inv}}_n:=\{\nu \in \mathbb {R}^{\{ 0,1 \}^n}: g(\nu )= \nu \,\,\,\forall g\in S_n\}\). We next let \(A^\mathrm{{Inv}}_{n,p}\) be the restriction of \(A_{n,p}\) to \(Q^\mathrm{{Inv}}_n\). It is elementary to check that \(A^\mathrm{{Inv}}_{n,p}\) maps into \(V^\mathrm{{Inv}}_n\) and furthermore, it is easy to check, by averaging, that
$$\begin{aligned} A^\mathrm{{Inv}}_{n,p}(Q^\mathrm{{Inv}}_n) = A_{n,p}(\mathbb {R}^{\mathcal {B}_n}) \cap V^\mathrm{{Inv}}_n. \end{aligned}$$(8)
Recalling that \(P_n\) is the set of partitions of the integer n, we have an obvious mapping from \(\mathcal {B}_n\) to \(P_n\), denoted by \(\sigma \mapsto \pi (\sigma )\), which is constant on \( S_n \) orbits. \(\mathbb {R}^{P_n}\) can then be canonically identified with \(Q^\mathrm{{Inv}}_n\) by identifying \((q_{\pi })_{\pi \in P_n}\) with \((q_{\sigma })_{\sigma \in \mathcal {B}_n}\), where \(q_{\sigma }={q_{\pi (\sigma )}}/{a_{\pi (\sigma )}}\) and \(a_{\pi }\) is the number of \(\sigma \)’s for which \(\pi (\sigma )=\pi \). In an analogous way, \(V^\mathrm{{Inv}}_n\) can be canonically identified with \(\mathbb {R}^{n+1}\); namely, \((\nu _i)_{0\le i \le n}\) is identified with \((\nu ({\rho }))_{\rho \in \{ 0,1\}^n}\) where \(\nu ({\rho })={\nu _{\Vert \rho \Vert }}/{\left( {\begin{array}{c}n\\ \Vert \rho \Vert \end{array}}\right) }\) and \( \Vert \rho \Vert \) is the number of ones in the binary string \( \rho \).
Using this notation, we have the following theorem. Again, in [4], this was done for some small values of n.
Theorem 5
(i) For \(p\not \in \{0, {1}/{2},1\}\), \(A^\mathrm{{Inv}}_{n,p}\) has rank n and hence nullity \(|P_n|-n\). The range of \(A^\mathrm{{Inv}}_{n,p}\) (after identifying \(V^\mathrm{{Inv}}_n\) with \(\mathbb {R}^{n+1}\)) is
$$\begin{aligned} \big \{ (\nu _0,\ldots ,\nu _n): \nu _n=\frac{p}{1-p}\sum _{k=0}^{n-1} \frac{n-k}{n} \nu _k -\sum _{k=0}^{n-2} \frac{k+1}{n} \nu _{k+1} \big \}. \end{aligned}$$(9)
(ii) \(A^\mathrm{{Inv}}_{n,\frac{1}{2}}\) has rank \(\lfloor n/2 \rfloor +1\) and hence nullity \(|P_n|-\lfloor n/2 \rfloor -1\). The range of \(A^\mathrm{{Inv}}_{n,\frac{1}{2}}\) (after identifying \(V^\mathrm{{Inv}}_n\) with \(\mathbb {R}^{n+1}\)) is
$$\begin{aligned} \big \{ (\nu _0,\ldots ,\nu _n): \nu _i= \nu _{n-i} \,\,\, \forall i=1,\ldots ,n \big \}. \end{aligned}$$(10)
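The ranks in Theorem 5 can be checked for small n by grouping set partitions according to their block-size profile, which is the integer partition \(\pi (\sigma )\). The sketch below forms, for each profile, the image under \(A_{n,p}\) of the indicator vector of all \(\sigma \) with that profile, and then computes the rank of this family exactly. It uses the single value \(p=1/3\) for the case \(p\ne 1/2\), and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def set_partitions(n):
    """All partitions of {0, ..., n-1}, each given as a list of blocks."""
    if n == 0:
        return [[]]
    result = []
    for smaller in set_partitions(n - 1):
        for k in range(len(smaller)):
            result.append(smaller[:k] + [smaller[k] + [n - 1]] + smaller[k + 1:])
        result.append(smaller + [[n - 1]])
    return result

def rank(rows):
    """Rank of a rational matrix by exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def invariant_images(n, p):
    """Map each block-size profile to A_{n,p} applied to its orbit indicator."""
    rhos = list(product((0, 1), repeat=n))
    images = {}
    for sigma in set_partitions(n):
        profile = tuple(sorted(len(block) for block in sigma))
        vec = images.setdefault(profile, [Fraction(0)] * len(rhos))
        for k, rho in enumerate(rhos):
            if all(len({rho[i] for i in block}) == 1 for block in sigma):
                c = sum(rho[block[0]] for block in sigma)
                vec[k] += p ** c * (1 - p) ** (len(sigma) - c)
    return images

for n in range(2, 6):
    third = rank(list(invariant_images(n, Fraction(1, 3)).values()))
    half = rank(list(invariant_images(n, Fraction(1, 2)).values()))
    print(n, third, half, n // 2 + 1)
```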
Proof
(i) Denoting by \(U_n\) the subset of \(\mathbb {R}^{n+1}\) satisfying (9), we claim that (after identifying \(V^\mathrm{{Inv}}_n\) with \(\mathbb {R}^{n+1}\))
$$\begin{aligned} U_n= A^\mathrm{{Inv}}_{n,p}(Q^\mathrm{{Inv}}_n). \end{aligned}$$(11)
Since \(U_n\) is clearly an n-dimensional subspace of \(\mathbb {R}^{n+1}\), the proof of (i) will then be done. To see this, first take \(\nu \in U_n\) and let \(\nu ^\mathrm{{Inv}}\) be the corresponding element in \(V^\mathrm{{Inv}}_n\). We first need to show that (2) is satisfied for \(\nu ^\mathrm{{Inv}}\). Fixing any \(i \in [n]\), we have
$$\begin{aligned} p\sum _{\rho :\rho (i)=0} \nu ^\mathrm{{Inv}}_{\rho }=p\sum _{k=0}^{n-1} \left( {\begin{array}{c}n-1\\ k\end{array}}\right) \frac{\nu _k}{\left( {\begin{array}{c}n\\ k\end{array}}\right) }= p\sum _{k=0}^{n-1} \frac{n-k}{n} \nu _k \end{aligned}$$
and
$$\begin{aligned}&(1-p)\sum _{\rho :\rho (i)=1} \nu ^\mathrm{{Inv}}_{\rho } = (1-p)\sum _{k=0}^{n-2} \left( {\begin{array}{c}n-1\\ k\end{array}}\right) \frac{\nu _{k+1}}{\left( {\begin{array}{c}n\\ k+1\end{array}}\right) }+(1-p)\nu _n \\&\quad = (1-p)\sum _{k=0}^{n-2} \frac{k+1}{n} \nu _{k+1}+(1-p)\nu _n. \end{aligned}$$
Hence, since \(\nu \in U_n\), (2) holds. In view of (8), this shows \(\subseteq \) in (11) holds.
Now, fix \( \nu ^\mathrm{{Inv}}\in A^\mathrm{{Inv}}_{n,p}(Q^\mathrm{{Inv}}_n)\). Clearly \( \nu ^\mathrm{{Inv}}\in V^\mathrm{{Inv}}_n\) and by Theorem 2, (2) holds. The above computation shows that the corresponding \(\nu \in \mathbb {R}^{n+1}\) satisfies (9) and hence is in \(U_n\). This shows that \(\supseteq \) in (11) holds as well.
(ii) Denoting now by \(U_n\) the subset of \(\mathbb {R}^{n+1}\) satisfying (10), we claim that
$$\begin{aligned} U_n= A^\mathrm{{Inv}}_{n,\frac{1}{2}}(Q^\mathrm{{Inv}}_n). \end{aligned}$$(12)
Since \(U_n\) is clearly an \((\lfloor n/2 \rfloor +1)\)-dimensional subspace of \(\mathbb {R}^{n+1}\), the proof of (ii) will then be done. However, in view of (1) in Theorem 1 and (8), this is immediate.
\(\square \)
3 Limiting Solutions as p Approaches 0
In this section, we provide a proof of Theorem 3.
Proof (Proof of Theorem 3)
We will show that given the assumptions of the theorem, for \( p>0\) sufficiently small there is a color representation \( (q_\sigma ) = (q_\sigma (p)) \) of \( X_p \sim \nu _p \) which is such that \( q_\sigma = 0 \) for all \( \sigma \in \mathcal {B}_n\) with more than one non-singleton partition element. To this end, fix \( p \in (0,1/2) \). We now refer to the proof of Theorem 2. By Step 1 in that proof, we have that a color representation \( (q_\sigma (p)) \) with the desired properties exists if and only if the (unique) solution \( (q_{\sigma ^S}(p))_{|S| \not = 1} \) to
is nonnegative. As in the proof of Theorem 2, let \( A'' \) be the \( 2^n \times (2^n-n)\) matrix corresponding to (13) and define \( B = ( B(S,S'))_{S,S' \subseteq [n]} \) by
In the proof of Step 1 of Theorem 2, we saw that for \( S,T \subseteq [n] \) with \( |T| \not = 1 \),
Let \( D = (D(S,S'))_{S,S' \subseteq [n]}\) be the diagonal matrix with
Then for \( S,T \subseteq [n] \) with \( |T| \not = 1 \),
Furthermore, one can verify that if we define the matrix \( C = (C(S,S'))_{S,S' \subseteq [n]} \) by
then (since \(p \not \in \{ 0,1/2,1\}\))
Since, by Step 1 in the proof of Theorem 2, the rank of \(A'' \) is exactly \( 2^n-n \), it follows that if we think of \( \nu _p \) as a column vector, then (13) is equivalent to
(with \( t \) here meaning transpose and \( e_S \) denoting the vector \( (I(S'=S))_{{S'} \subseteq [n]} \)). Now, note that \( DB\nu _p(1^\emptyset ) = \nu _p(1^\emptyset ) \) and that if \( S \subseteq [n] \) has size \( |S| \ge 2 \), we have that
Since \( |S|\ge 2 \), the denominator is \(p (1 + O(p))\), and by the left-hand side of (5) the numerator is given by \( \nu _p(1^S) + o(\nu _p(1^S)) \). It follows that
for any \( S \subseteq [n] \) with \( |S| \not = 1 \). If we apply \( C \) to the vector \( (DB\nu _p(1^S))_{S \subseteq [n]}\), a computation shows that we get
By (14) and the assumption that \( \nu _p(1^S) \asymp \nu _p(1^S 0^{S^c})\), it follows that \( q_{\sigma ^S} \sim p^{-1} \nu _p(1^S0^{[n] \backslash S})\) for any \( S \subseteq [n] \) with \( |S| \ge 2 \). Since \( q_{\sigma ^\emptyset } = 1 - \sum _{S \subseteq [n] :|S| \ge 2} q_{\sigma ^S}\), again using the assumptions, this concludes the proof.
4 Limiting Solutions as p Approaches 1/2
Before we proceed to the proof of Theorem 4, we state and prove a few lemmas that will be useful in this proof.
Lemma 1
Let \( f :2^{[n]} \rightarrow \mathbb {R} \). Define \( \varphi f :2^{[n]} \rightarrow \mathbb {R} \) by
and \( \varphi ^{-1} f :2^{[n]} \rightarrow \mathbb {R} \) by
Then,
This lemma is a type of Möbius inversion formula. For completeness, we present a short proof.
Proof
Let \( T \subseteq [n] \). Then, we have that
\(\square \)
Lemma 2
Define \( A :\mathcal {B}_n \rightarrow \mathbb {R}^{2^{[n]}} \) by
Then, \( A \) has rank \( 2^n - n \).
Proof (Proof of Lemma 2)
Recall the definition of \( \sigma ^T \) from the proof of Theorem 2. One can check that for any \( S \subseteq [n] \),
This implies in particular that
Since \( (I(S \subseteq T))_{S,T \subseteq [n]} \) has full rank, it follows that \( A \), when restricted to sets \( S \subseteq [n] \) with \( |S| \not = 1 \), has full rank, i.e., rank \(2^n-n \). Since \( A(\{ i \},\sigma ^T) = A(\emptyset , \sigma ^T) = 1 \) for all \( T \subseteq [n] \) with \( |T| \not = 1 \) and all \( i \in [n] \), \( A \) can have rank at most \( 2^n-n \); hence, the desired conclusion follows.
Lemma 3
If \( S \subseteq [n] \), \( |S| \) is odd and \( \nu :\{ 0,1\}^n \rightarrow \mathbb {R} \) is \( \{ 0,1\} \)-symmetric,
Proof (Proof of Lemma 3)
Fix a set \( S \subseteq [n] \) with \( |S| \) odd. Since \( |S|\) is odd and \( \nu \) is symmetric,
Next, by inclusion exclusion, for any set \( T \subseteq S \),
Combining the two earlier equations and then changing the order of summation, we obtain
which is the desired conclusion. \(\square \)
Lemma 4
Suppose that \( S \subseteq [n] \), \( |S| \) is even and that \( \nu _p :\{ 0,1\}^n \rightarrow \mathbb {R} \) is differentiable in \( p \) at \( p =1/2 \). Suppose further that for all \( T \subseteq S \) and all \( p \in (1/2,1) \), \( \nu _p \) satisfies
Then,
Proof (Proof of Lemma 4)
Fix a set \( S \subseteq [n] \) with \( |S| \) even. Note that, using the assumption on \( (\nu _p) \), for any \( T \subseteq S \), we have that
Next, by the proof of Lemma 3, we have that
By (16), this equals
Since \( |S| \) is even, \( |T| \) and \( |S\backslash T| \) have the same parity, and hence, the desired conclusion follows. \(\square \)
We now proceed to the proof of Theorem 4.
Proof (Proof of Theorem 4)
Assume that \( (q_\sigma ^{(p)})_{\sigma \in \mathcal {B}_n} \) is such that
holds. Note that for \( p \) close to \( 1/2 \), we have that
Further, as \( \nu _p \) is differentiable in \( p \) at \( 1/2 \), we have that
Using these expansions, we will now apply \( \varphi \), as defined in Lemma 1, to both sides of (17). To this end, we first introduce the following notation. Given \( \sigma \in \mathcal {B}_n\) and \( S \subseteq [n]\), write \( \sigma _S = \{ T_1, T_2, \ldots , T_m \} \), where \( m = \Vert \sigma _S \Vert \), to denote that the partition elements of \( \sigma \) when restricted to \( S \) are given by \( T_1, T_2, \ldots , T_m \subseteq S \). Using this notation, for any fixed set \( S \subseteq [n] \) and \( \sigma \in \mathcal {B}_n \), we have that
Similarly, we have that
Noting that \( \varphi \), as defined in Lemma 1, is linear, applying it to (17) and using the above derivations, we hence obtain
Using Lemmas 3 and 4, it follows that this is equivalent to
and
Now, let \( (q_\sigma )_{\sigma \in \mathcal {B}_n} \) be any subsequential limit, as \( p \rightarrow 1/2 \), of formal solutions \( (q_\sigma ^{(p)})_{\sigma \in \mathcal {B}_n}\) to (17). Then, combining the previous two equations and letting \( p \rightarrow 1/2 \), we obtain
By applying \( \varphi ^{-1} \) as defined in Lemma 1, we obtain (6). For the other direction, note that by Lemma 2, the matrix corresponding to the left-hand side in the previous equation has rank \( 2^n-n\). By Theorem 2, this is also the rank of \( A_{n,p} \) when \( p \not \in \{ 0,1/2,1 \} \), and hence of the equivalent matrix given by the left-hand side of (18). By a standard argument, it follows that (6) exactly describes the limiting solutions. This concludes the proof.
We now provide the proof of Corollary 1.
Proof (Proof of Corollary 1)
We first need to place ourselves into the context of Theorem 4, which we do as follows. With J fixed, define a function h from (0, 1) to \(\mathbb {R}\) where h(p) is such that the probability that a single site is positive under \(\nu _{J,h(p)}\) is p. It is easy to see that \(h(1/2)=0\) and that h is antisymmetric about 1/2, i.e., \(h(1-p)=-h(p)\). It also follows from well-known inequalities that h is increasing, bijective and differentiable. We now let \(\nu _p :=\nu _{J,h(p)}\). Understanding what happens as \(h\rightarrow 0\) is the same as understanding what happens for \(\nu _p\) as \(p\rightarrow 1/2\). We need to look at the solutions of (6). Only symmetric solutions can arise, so for \(i=1,2,3\) we let \(q_i\) denote the total weight of the partitions with exactly i partition elements; \(q_1\) and \(q_3\) each correspond to a single partition, while \(q_2\) corresponds to three. In (6), by symmetry, there are just four equations corresponding to S having sizes zero, one, two and three. S having size zero and one both yield the equation
The interesting equations are for |S| being two and three. It is easy to check that the \(|S|=2\) equation yields
For the \(|S|=3\) equation, we first need the right-hand side. By the chain rule, this equals the derivative of the probability of having all 1’s with respect to h at \(h=0\) times \(h'(p)\) at \(p=1/2\). For the latter, using the inverse instead, it is straightforward to compute \(p'(h)\) at \(h=0\) to be \(\frac{3e^{3J}+e^{-J}}{2e^{3J}+6e^{-J}}\), and hence, \(h'(1/2)=\frac{2e^{3J}+6e^{-J}}{3e^{3J}+e^{-J}}\). For the derivative of the probability of having all 1’s with respect to h for \(h=0\), a computation yields this to be \(\frac{3e^{3J}}{2e^{3J}+6e^{-J}}\), and hence, the right-hand side is \(\frac{3e^{3J}}{3e^{3J}+e^{-J}}\). This easily yields the final equation to be
One checks that the \(3\times 3\) system has a unique solution, and hence, Theorem 4 implies that \(\lim _{h\rightarrow 0}q^{J,h}\) exists. One can also check that this unique solution is strictly positive implying that for fixed J and small h, \(\nu _{J,h}\) is a color process. One finds \(q_2\) to be \(\frac{12(e^{4J}-1)}{(3+e^{4J})(1+3e^{4J})}\), while one easily checks that \(q_2^{\text {RCM}}=\frac{6e^{-2J}(e^{2J}-1)}{3+e^{4J}}\). Since one can check that for all \(J>0\), \(\frac{12(e^{4J}-1)}{(3+e^{4J})(1+3e^{4J})} < \frac{6e^{-2J}(e^{2J}-1)}{3+e^{4J}}\), we obtain the claim.
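The final comparison can be checked numerically. The sketch below evaluates the two closed-form expressions for \(q_2\) stated above and, as an independent check, recomputes \(q_2^{\text {RCM}}\) by enumerating the eight edge configurations of the triangle. For the enumeration we assume the standard Fortuin–Kasteleyn weights \(r^{|\omega |}(1-r)^{3-|\omega |}2^{k(\omega )}\) with edge parameter \(r=1-e^{-2J}\), where \(k(\omega )\) is the number of clusters; this parametrization is our assumption and is not spelled out in the text. Function names are illustrative.

```python
import math
from itertools import product

def q2_limit(J):
    """q_2 for the small-h limit, as stated in the proof of Corollary 1."""
    x = math.exp(4 * J)
    return 12 * (x - 1) / ((3 + x) * (1 + 3 * x))

def q2_rcm_closed(J):
    """q_2 for the random cluster representation, as stated in the proof."""
    return 6 * math.exp(-2 * J) * (math.exp(2 * J) - 1) / (3 + math.exp(4 * J))

def q2_rcm_enumerated(J):
    """q_2 by enumeration, assuming FK weights with r = 1 - exp(-2J) and q = 2."""
    r = 1 - math.exp(-2 * J)
    edges = [(0, 1), (0, 2), (1, 2)]
    total = 0.0
    two_clusters = 0.0
    for omega in product((0, 1), repeat=3):
        parent = list(range(3))

        def find(i):
            # Follow parent pointers to the representative of i's cluster.
            while parent[i] != i:
                i = parent[i]
            return i

        for (a, b), is_open in zip(edges, omega):
            if is_open:
                parent[find(a)] = find(b)
        k = len({find(i) for i in range(3)})
        weight = r ** sum(omega) * (1 - r) ** (3 - sum(omega)) * 2 ** k
        total += weight
        if k == 2:
            two_clusters += weight
    return two_clusters / total

for J in (0.1, 0.5, 1.0, 2.0):
    print(J, q2_limit(J), q2_rcm_closed(J), q2_rcm_enumerated(J))
```

Writing \(x=e^{2J}\), the ratio of the two expressions is \((2x^2+2x)/(3x^2+1)\), which is less than one exactly when \((x-1)^2>0\); this explains why the inequality is strict for every \(J>0\).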
References
1. Björnberg, J.E., Mailler, C., Mörters, P., Ueltschi, D.: Characterising random partitions by random colouring. Electron. Commun. Probab. 25, 4 (2020)
2. Forsström, M.P.: Divide and color representations for Ising models with an external field. Preprint
3. Forsström, M.P., Steif, J.E.: Divide and color representations for threshold Gaussian and stable vectors. Preprint
4. Steif, J.E., Tykesson, J.: Generalized divide and color models. ALEA. Latin Am. J. Probab. Math. Stat. 16, 899–955 (2019)
Acknowledgements
Open access funding provided by Royal Institute of Technology.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Malin Palö Forsström acknowledges support from the European Research Council, Grant No. 682537. Jeffrey E. Steif acknowledges the support of the Swedish Research Council, Grant No. 2016-03835 and the Knut and Alice Wallenberg Foundation, Grant No. 2012.0067.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Forsström, M.P., Steif, J.E. An Analysis of the Induced Linear Operators Associated to Divide and Color Models. J Theor Probab 34, 1043–1060 (2021). https://doi.org/10.1007/s10959-020-01001-4