Abstract
Reversible cellular automata are seen as microscopic physical models, and their states of macroscopic equilibrium are described using invariant probability measures. We establish a connection between the invariance of Gibbs measures and the conservation of additive quantities in surjective cellular automata. Namely, we show that the simplex of shift-invariant Gibbs measures associated to a Hamiltonian is invariant under a surjective cellular automaton if and only if the cellular automaton conserves the Hamiltonian. A special case is the (well-known) invariance of the uniform Bernoulli measure under surjective cellular automata, which corresponds to the conservation of the trivial Hamiltonian. As an application, we obtain results indicating the lack of (nontrivial) Gibbs or Markov invariant measures for “sufficiently chaotic” cellular automata. We discuss the relevance of the randomization property of algebraic cellular automata to the problem of approach to macroscopic equilibrium, and pose several open questions. As an aside, a shift-invariant preimage of a Gibbs measure under a pre-injective factor map between shifts of finite type turns out to be always a Gibbs measure. We provide a sufficient condition under which the image of a Gibbs measure under a pre-injective factor map is not a Gibbs measure. We point out a potential application of pre-injective factor maps as a tool in the study of phase transitions in statistical mechanical models.
Introduction
Reversible cellular automata are deterministic, spatially extended, microscopically reversible dynamical systems. They provide a suitable framework—an alternative to Hamiltonian dynamics—to examine the dynamical foundations of statistical mechanics with simple caricature models. The intuitive structure of cellular automata makes them attractive to mathematicians, and their combinatorial nature makes them amenable to perfect simulations and computational study.
Some reversible cellular automata have long been observed, in simulations, to exhibit “thermodynamic behavior”: starting from a random configuration, they undergo a transient dynamics until they reach a state of macroscopic (statistical) equilibrium. Which of the equilibrium states the system is going to settle in could often be guessed on the basis of a few statistics of the initial configuration.
One such example is the Q2R cellular automaton [91], which is a deterministic dynamics on top of the Ising model. Like the standard Ising model, a configuration of the Q2R model consists of an infinite array of symbols \(\mathtt {+}\) (representing an upward magnetic spin) and \(\mathtt {-}\) (a downward spin) arranged on the two-dimensional square lattice. The symbols are updated in iterated succession of two updating stages: at the first stage, the symbols on the even sites (the black cells of the chess board) are updated, and at the second stage, the symbols on the odd sites. The updating of a symbol is performed according to a simple rule: a spin is flipped if and only if among the four neighboring spins, there are equal numbers of upward and downward spins. The dynamics is clearly reversible (changing the order of the two stages, we could traverse backward in time). It also conserves the Ising energy (i.e., the number of pairs of adjacent spins that are anti-aligned).
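The two-stage update described above can be tried out on a finite lattice. The following is a minimal sketch, not the simulation code behind the figures: it assumes a periodic (toroidal) grid of even side length in place of the infinite lattice, encodes spins as `+1` and `-1`, and uses function names (`q2r_half_step`, `q2r_step`, `q2r_step_back`, `ising_energy`) that are purely illustrative.

```python
import random


def q2r_half_step(grid, parity):
    """Update in place the sites (i, j) with (i + j) % 2 == parity.

    A spin is flipped iff its four neighbors sum to zero, i.e. two of them
    point up and two point down. Assumes a torus of even side length, so
    that all neighbors of an updated site lie in the other sublattice.
    """
    n = len(grid)
    for i in range(n):
        for j in range(n):
            if (i + j) % 2 != parity:
                continue
            s = (grid[(i - 1) % n][j] + grid[(i + 1) % n][j]
                 + grid[i][(j - 1) % n] + grid[i][(j + 1) % n])
            if s == 0:
                grid[i][j] = -grid[i][j]


def q2r_step(grid):
    """One time step: even sites first, then odd sites."""
    q2r_half_step(grid, 0)
    q2r_half_step(grid, 1)


def q2r_step_back(grid):
    """Inverse time step: the same two stages in the opposite order."""
    q2r_half_step(grid, 1)
    q2r_half_step(grid, 0)


def ising_energy(grid):
    """Number of anti-aligned pairs of adjacent spins on the torus."""
    n = len(grid)
    count = 0
    for i in range(n):
        for j in range(n):
            count += int(grid[i][j] != grid[(i + 1) % n][j])
            count += int(grid[i][j] != grid[i][(j + 1) % n])
    return count


if __name__ == "__main__":
    random.seed(1)
    n = 16  # hypothetical lattice size; must be even
    grid = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
    before = ising_energy(grid)
    for _ in range(100):
        q2r_step(grid)
    print(before, ising_energy(grid))
```

Each stage is its own inverse, because the flip condition at a site depends only on neighbors that the stage leaves untouched. This is why running the stages in the opposite order undoes a time step.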
A few snapshots from a simulation are shown in Fig. 1. Starting with a random configuration in which the direction of each spin is determined by a biased coin flip, the Q2R cellular automaton evolves towards a state of apparent equilibrium that resembles a sample from the Ising model at the corresponding temperature.^{Footnote 1} More sophisticated variants of the Q2R model show numerical agreement with the phase diagram of the Ising model, at least away from the critical point [15]. See [87], Chapter 17, for further simulations and an interesting discussion.
Wolfram was the first to study cellular automata from the point of view of statistical mechanics [93, 94] (see also [95]). He made a detailed heuristic analysis of the so-called elementary cellular automata (those with two states per site and local rule depending on three neighboring sites in one dimension) using computer simulations. One of Wolfram’s observations (the randomizing property of the XOR cellular automaton) was mathematically confirmed by Lind [50], although the same result had also been obtained independently by Miyamoto [58]. Motivated by the problem of foundations of statistical mechanics, Takesue made a similar study of elementary reversible cellular automata and investigated their ergodic properties and thermodynamic behavior [83–85]. Recognizing the role of conservation laws in the presence or absence of thermodynamic behavior, he also started a systematic study of additive conserved quantities in cellular automata [30, 86].
This article concerns the “states of macroscopic equilibrium” and their connection with conservation laws in a class of cellular automata including the reversible ones.
As in statistical mechanics, we identify the “macroscopic states” of lattice configurations with probability measures on the space of all such configurations. The justification and proper interpretation of this formulation is beyond the scope of this article. We content ourselves with recalling two distinct points of view: the subjective interpretation (probability measures are meant to describe the partial states of knowledge of an observer; see [36]) and the frequentist interpretation (a probability measure represents a class of configurations sharing the same statistics). See [80] for comparison and discussion. If we call tail-measurable observables “macroscopic”, a probability measure that is trivial on the tail events would give a full description of a macroscopic state (see Paragraph (7.8) of [27]). On the other hand, restricting “macroscopic” observables to statistical averages (i.e., averages of local observables over the lattice), one could identify the macroscopic states with probability measures that are shift-invariant and ergodic. The configurations in the ergodic set of a shift-ergodic probability measure (i.e., the generic points in its support; see [66]) may then be considered as “typical” microscopic states for the identified macroscopic state.
The interpretation of “equilibrium” is another unsettling issue that we leave open. Equilibrium statistical mechanics postulates that the equilibrium states (of a lattice model described by interaction energies) are suitably described by Gibbs measures (associated with the interaction energies) [27, 35, 75]. One justification (within the subjective interpretation) is the variational principle that characterizes the shift-invariant Gibbs measures as measures that maximize entropy under a fixed expected energy density constraint. Within a dynamical framework, on the other hand, the system is considered to be in macroscopic equilibrium if its internal fluctuations are not detected by macroscopic observables. One is therefore tempted to identify the equilibrium states of a cellular automaton with (tail-trivial or shift-ergodic) probability measures that are time-invariant. Unfortunately, there are usually an infinity of invariant measures that do not seem to be of physical relevance. For instance, in any cellular automaton, the uniform distribution on the shift and time orbit of a jointly periodic configuration is time-invariant and shift-ergodic, but may hardly be considered a macroscopic equilibrium state. Other conditions such as “smoothness” or “attractiveness” therefore might be needed.
Rather than reversible cellular automata (i.e., those whose trajectories can be traced backward by another cellular automaton), we work with the broader class of surjective cellular automata (i.e., those that act surjectively on the configuration space). Every reversible cellular automaton is surjective, but there are many surjective cellular automata that are not reversible. Surjective cellular automata are nevertheless “almost injective” in that the average amount of information per site they erase in each time step is vanishing. They are precisely those cellular automata that preserve the uniform Bernoulli measure (cf. Liouville’s theorem for Hamiltonian systems). Even if not necessarily physically relevant, they provide a richer source of interesting examples, which could be used in case studies. For instance, most of the known examples of the randomization phenomenon (which, we shall argue, could provide an explanation of approach to equilibrium) are in nonreversible surjective cellular automata.
The invariance of Gibbs measures under surjective cellular automata turns out to be associated with their conservation laws. More precisely, if an additive energy-like quantity, formalized by a Hamiltonian, is conserved by a surjective cellular automaton, the cellular automaton maps the simplex of shift-invariant Gibbs measures corresponding to that Hamiltonian onto itself (Theorem 6). The converse is true in a stronger sense: if a surjective cellular automaton maps a (not necessarily shift-invariant) Gibbs measure for a Hamiltonian to a Gibbs measure for the same Hamiltonian, the Hamiltonian must be conserved by the cellular automaton (Corollary 10). The proof of this correspondence is an immediate consequence of the variational characterization of shift-invariant Gibbs measures and the fact that surjective cellular automata preserve the average entropy per site of shift-invariant probability measures (Theorem 4). An elementary proof of a special case was presented earlier [42].
Note that if a conserved Hamiltonian has a unique Gibbs measure, then that unique Gibbs measure will be invariant under the cellular automaton. This is the case, for example, in one dimension, or when the Hamiltonian does not involve the interaction of more than one site (the Bernoulli case). An important special case is the trivial Hamiltonian (all configurations on the same “energy” level) which is obviously conserved by every surjective cellular automaton. The uniform Bernoulli measure is the unique Gibbs measure for the trivial Hamiltonian, and we recover the well-known fact that every surjective cellular automaton preserves the uniform Bernoulli measure on its configuration space (i.e., Corollary 4). If, on the other hand, the simplex of shift-invariant Gibbs measures for a conserved Hamiltonian has more than one element, the cellular automaton does not need to preserve individual Gibbs measures in this simplex (Example 9).
We do not know whether, in general, a surjective cellular automaton maps the non-shift-invariant Gibbs measures for a conserved Hamiltonian to Gibbs measures for the same Hamiltonian, but this is known to be the case for a proper subclass of surjective cellular automata including the reversible ones (Theorem 5), following a result of Ruelle.
The essence of the above-mentioned connection between conservation laws and invariant Gibbs measures comes about in a more abstract setting, concerning the pre-injective factor maps between strongly irreducible shifts of finite type. We show that a shift-invariant preimage of a (shift-invariant) Gibbs measure under such a factor map is again a Gibbs measure (Corollary 7). We find a simple sufficient condition under which a pre-injective factor map transforms a shift-invariant Gibbs measure into a measure that is not Gibbs (Proposition 9). An example of a surjective cellular automaton is given that eventually transforms every starting Gibbs measure into a non-Gibbs measure (Example 7). As an application in the study of phase transitions in equilibrium statistical mechanics, we demonstrate how the result of Aizenman and Higuchi regarding the structure of the simplex of Gibbs measures for the two-dimensional Ising model could be more transparently formulated using a pre-injective factor map (Example 5).
The correspondence between invariant Gibbs measures and conservation laws allows us to reduce the problem of invariance of Gibbs measures to the problem of conservation of additive quantities. Conservation laws in cellular automata have been studied by many from various points of view (see e.g. [3, 6, 18, 24–26, 30, 62, 68, 74]). For example, simple algorithms have been proposed to find all additive quantities of up to a given interaction range that are conserved by a cellular automaton. Such an algorithm can be readily applied to find all the full-support Markov measures that are invariant under a surjective cellular automaton (at least in one dimension). We postpone the study of this and similar algorithmic problems to a separate occasion.
A highlight of this article is the use of this correspondence to obtain severe restrictions on the existence of invariant Gibbs measures in two interesting classes of cellular automata with strong chaotic behavior. First, we show that a strongly transitive cellular automaton cannot have any invariant Gibbs measure other than the uniform Bernoulli measure (Corollary 11). The other result concerns the class of one-dimensional reversible cellular automata that are obtained by swapping the role of time and space in positively expansive cellular automata. For such reversible cellular automata, we show that the uniform Bernoulli measure is the unique invariant Markov measure with full support (Corollary 13).
Back to the interpretation of shift-ergodic probability measures as macroscopic states, one might interpret the latter results as an indication of “absence of phase transitions” in the cellular automata in question. Much sharper results have been obtained by others for narrower classes of cellular automata having algebraic structures (see the references in Example 10).
A mathematical description of approach to equilibrium (as observed in the Q2R example) seems to be very difficult in general. The randomization property of algebraic cellular automata (the result of Miyamoto and Lind and its extensions; see Example 13) however provides a partial explanation of approach to equilibrium in such cellular automata. Finding “physically interesting” cellular automata with a similar randomization property is an outstanding open problem.
The structure of the paper is as follows. Section 2 is dedicated to the development of the setting and background material. Given the interdisciplinary nature of the subject, we try to be as self-contained as possible. Basic results regarding the pre-injective factor maps between shifts of finite type as well as two digressing applications appear in Sect. 3. In Sect. 4, we apply the results of the previous section to cellular automata. Conservation laws in cellular automata are discussed in Sect. 4.1. Proving the absence of nontrivial conservation laws in two classes of chaotic cellular automata in Sect. 4.3, we obtain results regarding the rigidity of invariant measures for these two classes. Section 4.4 contains a discussion of the problem of approach to equilibrium.
Background
Observables, Probabilities, and Dynamical Systems
Let \({\fancyscript{X}}\) be a compact metric space. By an observable we mean a Borel measurable function \(f{:}\,{\fancyscript{X}}\rightarrow \mathbb {R}\). The set of continuous observables on \({\fancyscript{X}}\) will be denoted by \(C({\fancyscript{X}})\). This is a Banach space with the uniform norm. The default topology on \(C({\fancyscript{X}})\) is the topology of the uniform norm. The set of Borel probability measures on \({\fancyscript{X}}\) will be denoted by \({\fancyscript{P}}({\fancyscript{X}})\). The expectation operator of a Borel probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) is a positive linear (and hence continuous) functional on \(C({\fancyscript{X}})\). Conversely, the Riesz representation theorem states that every normalized positive linear functional on \(C({\fancyscript{X}})\) is the expectation operator of a unique probability measure on \({\fancyscript{X}}\). Therefore, the Borel probability measures can equivalently be identified as normalized positive linear functionals on \(C({\fancyscript{X}})\). We assume that \({\fancyscript{P}}({\fancyscript{X}})\) is topologized with the weak topology. This is the weakest topology with respect to which, for every observable \(f\in C({\fancyscript{X}})\), the mapping \(\pi \mapsto \pi (f)\) is continuous. The space \({\fancyscript{P}}({\fancyscript{X}})\) under the weak topology is compact and metrizable. If \(\delta _x\) denotes the Dirac measure concentrated at \(x\in {\fancyscript{X}}\), the map \(x\mapsto \delta _x\) is an embedding of \({\fancyscript{X}}\) into \({\fancyscript{P}}({\fancyscript{X}})\). The Dirac measures are precisely the extreme elements of the convex set \({\fancyscript{P}}({\fancyscript{X}})\), and by the Krein–Milman theorem, \({\fancyscript{P}}({\fancyscript{X}})\) is the closed convex hull of the Dirac measures.
Let \({\fancyscript{X}}\) and \({\fancyscript{Y}}\) be compact metric spaces and \(\Phi {:}\, {\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) a continuous mapping. We denote the induced mapping \({\fancyscript{P}}({\fancyscript{X}})\rightarrow {\fancyscript{P}}({\fancyscript{Y}})\) by the same symbol \(\Phi \); hence \((\Phi \pi )(E){\triangleq }\pi (\Phi ^{-1}E)\). The dual map \(C({\fancyscript{Y}})\rightarrow C({\fancyscript{X}})\) is denoted by \(\Phi ^*\); that is, \((\Phi ^*f)(x){\triangleq }f(\Phi x)\). The following lemma is well-known.
Lemma 1
Let \({\fancyscript{X}}\) and \({\fancyscript{Y}}\) be compact metric spaces and \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) a continuous map.

(a)
\(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is one-to-one if and only if \(\Phi ^*{:}\,C({\fancyscript{Y}})\rightarrow C({\fancyscript{X}})\) is onto, which in turn holds if and only if \(\Phi {:}\,{\fancyscript{P}}({\fancyscript{X}})\rightarrow {\fancyscript{P}}({\fancyscript{Y}})\) is one-to-one.

(b)
\(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is onto if and only if \(\Phi ^*{:}\,C({\fancyscript{Y}})\rightarrow C({\fancyscript{X}})\) is one-to-one, which in turn holds if and only if \(\Phi {:}\,{\fancyscript{P}}({\fancyscript{X}})\rightarrow {\fancyscript{P}}({\fancyscript{Y}})\) is onto.
Proof

(a)
Suppose that \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is one-to-one. Let \(g\in C({\fancyscript{X}})\) be arbitrary. Define \({\fancyscript{Y}}_0{\triangleq }\,\Phi {\fancyscript{X}}\). Then \(g\circ \Phi ^{-1}{:}\,{\fancyscript{Y}}_0\rightarrow \mathbb {R}\) is a continuous real-valued function on a closed subset of a compact metric space. Hence, by the Tietze extension theorem, it has a continuous extension \(f{:}\,{\fancyscript{Y}}\rightarrow \mathbb {R}\). We have \(f\circ \Phi =g\circ \Phi ^{-1}\circ \Phi =g\). Therefore, \(\Phi ^*{:}\,C({\fancyscript{Y}})\rightarrow C({\fancyscript{X}})\) is onto. The other implications are trivial.

(b)
Suppose that \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is not onto. Let \({\fancyscript{Y}}_0{\triangleq }\,\Phi {\fancyscript{X}}\) and pick an arbitrary \(y\in {\fancyscript{Y}}\setminus {\fancyscript{Y}}_0\). Using the Tietze extension theorem, we can find \(f,f'\in C({\fancyscript{Y}})\) such that \(f|_{{\fancyscript{Y}}_0}=f'|_{{\fancyscript{Y}}_0}\) but \(f(y)\ne f'(y)\). Then \(f\circ \Phi =f'\circ \Phi \). Hence, \(\Phi ^*{:}\,C({\fancyscript{Y}})\rightarrow C({\fancyscript{X}})\) is not one-to-one. Next, suppose that \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is onto. By the Krein–Milman theorem, the set \({\fancyscript{P}}({\fancyscript{Y}})\) is the closed convex hull of Dirac measures on \({\fancyscript{Y}}\). Let \(\pi =\sum _i \lambda _i\delta _{y_i}\in {\fancyscript{P}}({\fancyscript{Y}})\) be a convex combination of Dirac measures. Pick \(x_i\in {\fancyscript{X}}\) such that \(\Phi x_i=y_i\), and define \(\nu {\triangleq }\sum _i\lambda _i\delta _{x_i}\in {\fancyscript{P}}({\fancyscript{X}})\). Then \(\Phi \nu =\pi \). Therefore, \(\Phi {\fancyscript{P}}({\fancyscript{X}})\) is dense in \({\fancyscript{P}}({\fancyscript{Y}})\). Since \(\Phi {\fancyscript{P}}({\fancyscript{X}})\) is also closed, we obtain that \(\Phi {\fancyscript{P}}({\fancyscript{X}})={\fancyscript{P}}({\fancyscript{Y}})\). That is, \(\Phi {:}\,{\fancyscript{P}}({\fancyscript{X}})\rightarrow {\fancyscript{P}}({\fancyscript{Y}})\) is onto. The remaining implication is trivial.
\(\square \)
By a dynamical system we shall mean a compact metric space \({\fancyscript{X}}\) together with a continuous action \((i,x)\mapsto \varphi ^i x\) of a discrete commutative finitely generated group or semigroup \(\mathbb {L}\) on \({\fancyscript{X}}\). In case of a cellular automaton, \(\mathbb {L}\) is the set of non-negative integers \(\mathbb {N}\) (or the set of integers \({\mathbb {Z}}\) if the cellular automaton is reversible). For a d-dimensional shift, \(\mathbb {L}\) is the d-dimensional hypercubic lattice \({\mathbb {Z}}^d\). Every dynamical system \(({\fancyscript{X}},\varphi )\) has at least one invariant measure, that is, a probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) such that \(\varphi ^i\pi =\pi \) for every \(i\in \mathbb {L}\). In fact, every non-empty, closed and convex subset of \({\fancyscript{P}}({\fancyscript{X}})\) that is closed under the application of \(\varphi \) contains an invariant measure. We will denote the set of invariant measures of \(({\fancyscript{X}},\varphi )\) by \({\fancyscript{P}}({\fancyscript{X}},\varphi )\).
We also define \(C({\fancyscript{X}},\varphi )\) as the closed linear subspace of \(C({\fancyscript{X}})\) generated by the observables of the form \(g\circ \varphi ^i - g\) for \(g\in C({\fancyscript{X}})\) and \(i\in \mathbb {L}\); that is,
\[ C({\fancyscript{X}},\varphi ) {\triangleq } \overline{\left\langle g\circ \varphi ^i - g {:}\, g\in C({\fancyscript{X}}) \text { and } i\in \mathbb {L}\right\rangle } \]
(see [75], Sects. 4.7–4.8). Then \({\fancyscript{P}}({\fancyscript{X}},\varphi )\) and \(C({\fancyscript{X}},\varphi )\) are annihilators of each other:
Lemma 2
(see e.g. [43], Proposition 2.13) Let \(({\fancyscript{X}},\varphi )\) be a dynamical system. Then,

(a)
\({\fancyscript{P}}({\fancyscript{X}},\varphi ) = \left\{ \pi \in {\fancyscript{P}}({\fancyscript{X}}){:}\, \pi (f)=0 \quad \text {for every} \, f\in C({\fancyscript{X}},\varphi ) \right\} \).

(b)
\(C({\fancyscript{X}},\varphi ) = \left\{ f\in C({\fancyscript{X}}){:}\, \pi (f)=0 \quad \text {for every} \, \pi \in {\fancyscript{P}}({\fancyscript{X}},\varphi ) \right\} \).
Proof

(a)
A probability measure \(\pi \) on \({\fancyscript{X}}\) is in \({\fancyscript{P}}({\fancyscript{X}},\varphi )\) if and only if \(\pi (g\circ \varphi ^i - g)=0\) for every \(g\in C({\fancyscript{X}})\) and \(i\in \mathbb {L}\). Furthermore, for each \(\pi \), the set \(\{f\in C({\fancyscript{X}}){:}\, \pi (f)=0\}\) is a closed linear subspace of \(C({\fancyscript{X}})\). Therefore, the equality in (a) holds.

(b)
Let us denote the right-hand side of the claimed equality in (b) by D. The set D is closed and linear, and contains all the elements of the form \(g\circ \varphi ^i-g\) for all \(g\in C({\fancyscript{X}})\) and \(i\in \mathbb {L}\). Therefore, \(C({\fancyscript{X}},\varphi )\subseteq D\). Conversely, let \(f\in C({\fancyscript{X}})\setminus C({\fancyscript{X}},\varphi )\). Then every element \(h\in \langle f, C({\fancyscript{X}},\varphi )\rangle \) has a unique representation \(h=a_h f + u_h\) where \(a_h\in \mathbb {R}\) and \(u_h\in C({\fancyscript{X}},\varphi )\). Define the linear functional \(J{:}\,\langle f, C({\fancyscript{X}},\varphi )\rangle \rightarrow \mathbb {R}\) by \(J(h)=J(a_h f + u_h){\triangleq }a_h\). Then J is bounded (\(\left\| h\right\| =\left\| a_h f + u_h\right\| \ge \left| a_h\right| \delta =\left| J(h)\right| \delta \), where \(\delta >0\) is the distance between f and \(C({\fancyscript{X}},\varphi )\)), and hence, by the Hahn–Banach theorem, has a bounded linear extension \(\widehat{J}\) on \(C({\fancyscript{X}})\). According to the Riesz representation theorem, there is a unique signed measure \(\pi \) on \({\fancyscript{X}}\) such that \(\pi (h)=\widehat{J}(h)\) for every \(h\in C({\fancyscript{X}})\). Let \(\pi =\pi ^+-\pi ^-\) be the Hahn decomposition of \(\pi \). Since \(\pi (f)=1\), either \(\pi ^+(f)>0\) or \(\pi ^-(f)>0\). If \(\pi ^+(f)>0\), define \(\pi ^*{\triangleq }\frac{1}{\pi ^+({\fancyscript{X}})}\pi ^+\); otherwise \(\pi ^*{\triangleq }\frac{1}{\pi ^-({\fancyscript{X}})}\pi ^-\). Then \(\pi ^*\) is a probability measure with \(\pi ^*(u)=0\) for every \(u\in C({\fancyscript{X}},\varphi )\), which according to part (a), ensures that \(\pi ^*\in {\fancyscript{P}}({\fancyscript{X}},\varphi )\). On the other hand \(\pi ^*(f)>0\), and hence \(f\notin D\). We conclude that \(C({\fancyscript{X}},\varphi )=D\).
\(\square \)
If \(K({\fancyscript{X}})\) is a dense subspace of \(C({\fancyscript{X}})\), the subspace \(C({\fancyscript{X}},\varphi )\) can also be expressed in terms of \(K({\fancyscript{X}})\). Namely, if we define
\[ K({\fancyscript{X}},\varphi ) {\triangleq } \left\langle g\circ \varphi ^i - g {:}\, g\in K({\fancyscript{X}}) \text { and } i\in \mathbb {L}\right\rangle , \]
then \(C({\fancyscript{X}},\varphi )=\overline{K({\fancyscript{X}},\varphi )}\). If \(D_0\) is a finite generating set for the group/semigroup \(\mathbb {L}\), the subspace \(K({\fancyscript{X}},\varphi )\) may also be expressed as
\[ K({\fancyscript{X}},\varphi ) = \left\{ \sum _{i\in D_0} (h_i\circ \varphi ^i - h_i) {:}\, h_i\in K({\fancyscript{X}}) \right\} . \]
In particular, if \(\mathbb {L}={\mathbb {Z}}\) or \(\mathbb {L}=\mathbb {N}\), then every element of \(K({\fancyscript{X}},\varphi )\) is of the form \(h\circ \varphi -h\) for some \(h\in K({\fancyscript{X}})\).
A morphism between two dynamical systems \(({\fancyscript{X}},\varphi )\) and \(({\fancyscript{Y}},\psi )\) is a continuous map \(\Theta {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) such that \(\Theta \varphi =\psi \Theta \). An epimorphism (i.e., an onto morphism) is also called a factor map. If \(\Theta {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is a factor map, then \(({\fancyscript{Y}},\psi )\) is a factor of \(({\fancyscript{X}},\varphi )\), and \(({\fancyscript{X}},\varphi )\) is an extension of \(({\fancyscript{Y}},\psi )\). A monomorphism (i.e., a one-to-one morphism) is also known as an embedding. If \(({\fancyscript{Y}},\psi )\) is embedded in \(({\fancyscript{X}},\varphi )\) by the inclusion map, \(({\fancyscript{Y}},\psi )\) is called a subsystem of \(({\fancyscript{X}},\varphi )\). A conjugacy between dynamical systems is the same as an isomorphism; two systems are said to be conjugate if they are isomorphic.
Shifts and Cellular Automata
A cellular automaton is a dynamical system on symbolic configurations on a lattice. The configuration space itself has translational symmetry and can be considered as a dynamical system with the shift action. We allow constraints on the local arrangement of symbols to include models with so-called hard interactions, such as the hard-core model (Example 3) or the contour model (Example 2). Such a restricted configuration space is modeled by a (strongly irreducible) shift of finite type.
The sites of the d-dimensional (hypercubic) lattice are indexed by the elements of the group \(\mathbb {L}{\triangleq }{\mathbb {Z}}^d\). A neighborhood is a non-empty finite set \(N\subseteq \mathbb {L}\) and specifies a notion of closeness between the lattice sites. The N-neighborhood of a site \(a\in \mathbb {L}\) is the set \(N(a){\triangleq }a+N = \{a+i{:}\, i\in N\}\). Likewise, the N-neighborhood of a set \(A\subseteq \mathbb {L}\) is \(N(A){\triangleq }A+N {\triangleq }\{a+i{:}\, a\in A \text { and } i\in N\}\). The (symmetric) N-boundary of a set \(A\subseteq \mathbb {L}\) is \(\partial N(A){\triangleq }N(A)\cap N(\mathbb {L}\setminus A)\). For a set \(A\subseteq \mathbb {L}\), we denote by \(N^{-1}(A){\triangleq }A-N\) the set of all sites \(b\in \mathbb {L}\) that have an N-neighbor in A.
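For finite sets, the neighborhood operations just defined are easy to compute. The sketch below is illustrative only: sites are represented as integer tuples, the function names are not from the text, and the boundary is computed through the equivalent condition that a site \(b\in N(A)\) lies in \(N(\mathbb {L}\setminus A)\) exactly when \(b-i\notin A\) for some \(i\in N\).

```python
def n_neighborhood(A, N):
    """N(A) = A + N for finite sets of sites given as integer tuples."""
    return {tuple(a_k + i_k for a_k, i_k in zip(a, i)) for a in A for i in N}


def n_inverse_neighborhood(A, N):
    """N^{-1}(A) = A - N: the sites that have an N-neighbor in A."""
    return {tuple(a_k - i_k for a_k, i_k in zip(a, i)) for a in A for i in N}


def n_boundary(A, N):
    """Symmetric N-boundary: N(A) intersected with N(complement of A).

    A site b belongs to N(complement of A) iff b - i is outside A
    for at least one i in N.
    """
    def seen_from_outside(b):
        return any(
            tuple(b_k - i_k for b_k, i_k in zip(b, i)) not in A for i in N
        )
    return {b for b in n_neighborhood(A, N) if seen_from_outside(b)}


if __name__ == "__main__":
    # One-dimensional example: an interval with the nearest-neighbor
    # neighborhood (both chosen here only for illustration).
    A = {(0,), (1,), (2,), (3,)}
    N = {(-1,), (0,), (1,)}
    print(sorted(n_neighborhood(A, N)))
    print(sorted(n_boundary(A, N)))
```

In this example the interior sites 1 and 2 drop out of the boundary because all of their neighbors lie inside A.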
A configuration is an assignment \(x{:}\,\mathbb {L}\rightarrow S\) of symbols from a finite set S to the lattice sites. The symbol x(i) assigned to a site \(i\in \mathbb {L}\) is also called the state of site i in x. For two configurations \(x,y{:}\,\mathbb {L}\rightarrow S\), we denote by \({\mathrm {diff}}(x,y){\triangleq }\{i\in \mathbb {L}{:}\, x(i)\ne y(i)\}\) the set of sites on which x and y disagree. Two configurations x and y are said to be asymptotic (or tail-equivalent) if \({\mathrm {diff}}(x,y)\) is finite. If \(D\subseteq \mathbb {L}\) is finite, an assignment \(p{:}\,D\rightarrow S\) is called a pattern on D. If \(p{:}\,D\rightarrow S\) and \(q{:}\,E\rightarrow S\) are two patterns (or partial configurations) that agree on \(D\cap E\), we denote by \(p\vee q\) the pattern (or partial configuration) that agrees with p on D and with q on E.
Let S be a finite set of symbols with at least two elements. The set \(S^\mathbb {L}\) of all configurations of symbols from S on \(\mathbb {L}\) is given the product topology, which is compact and metrizable. The convergence in this topology is equivalent to site-wise eventual agreement. If \(D\subseteq \mathbb {L}\) is a finite set and x a configuration (or a partial configuration whose domain includes D), the set
\[ [x]_D {\triangleq } \left\{ y\in S^\mathbb {L}{:}\, y|_D = x|_D \right\} \]
is called a cylinder with base D. If \(p:D\rightarrow S\) is a pattern, we may write more concisely [p] rather than \([p]_D\). In one dimension (i.e., if \(\mathbb {L}={\mathbb {Z}}\)), we may also use words to specify cylinder sets: if \(u=u_0u_1\cdots u_{n-1}\in S^*\) is a word over the alphabet S and \(k\in {\mathbb {Z}}\), we write \([u]_k\) for the set of configurations \(x\in S^{\mathbb {Z}}\) such that \(x_{k+i}=u_i\) for each \(0\le i<n\). The cylinders are clopen (i.e., both open and closed) and form a basis for the product topology. The Borel \(\sigma \)-algebra on \(S^\mathbb {L}\) is denoted by \({\mathfrak {F}}\). For \(A\subseteq \mathbb {L}\), the sub-\(\sigma \)-algebra of events occurring in A (i.e., the \(\sigma \)-algebra generated by the cylinders whose base is a subset of A) will be denoted by \({\mathfrak {F}}_A\).
Given a configuration \(x:\mathbb {L}\rightarrow S\) and an element \(k\in \mathbb {L}\), we denote by \(\sigma ^k x\) the configuration obtained by shifting (or translating) x by vector k; that is, \((\sigma ^k x)(i){\triangleq }x(k+i)\) for every \(i\in \mathbb {L}\). The dynamical system defined by the action of the shift \(\sigma \) on \(S^\mathbb {L}\) is called the full shift. A closed shift-invariant set \({\fancyscript{X}}\subseteq S^\mathbb {L}\) is called a shift space and the subsystem of \((S^\mathbb {L},\sigma )\) obtained by restricting \(\sigma \) to \({\fancyscript{X}}\) is called a shift system. We shall use the same symbol \(\sigma \) for the shift action of all shift systems. This will not lead to confusion, as the domain will always be clear from the context.
A shift space \({\fancyscript{X}}\subseteq S^\mathbb {L}\) is uniquely determined by its forbidden patterns, that is, the patterns \(p:D\rightarrow S\) such that \([p]_D\cap {\fancyscript{X}}=\varnothing \). Conversely, every set F of patterns defines a shift space by forbidding the occurrence of the elements of F; that is,
\[ {\fancyscript{X}}_F {\triangleq } \left\{ x\in S^\mathbb {L}{:}\, \sigma ^k x\notin [p] \text { for every } p\in F \text { and } k\in \mathbb {L}\right\} . \]
The set of patterns \(p{:}\,D\rightarrow S\) that are allowed in \({\fancyscript{X}}\) (i.e., \([p]\cap {\fancyscript{X}}\ne \varnothing \)) is denoted by \(L({\fancyscript{X}})\). If \(D\subseteq \mathbb {L}\) is finite, we denote by \(L_D({\fancyscript{X}}){\triangleq }L({\fancyscript{X}})\cap S^D\) the set of patterns on D that are allowed in \({\fancyscript{X}}\). For every pattern \(p\in L_D({\fancyscript{X}})\), there is a configuration \(x\in {\fancyscript{X}}\) such that \(p\vee x|_{\mathbb {L}\setminus D}\in {\fancyscript{X}}\). Given a finite set \(D\subseteq \mathbb {L}\) and a configuration \(x\in {\fancyscript{X}}\), we write \(L_D({\fancyscript{X}}\,|\,x)\) for the set of patterns \(p\in L_D({\fancyscript{X}})\) such that \(p\vee x|_{\mathbb {L}\setminus D}\in {\fancyscript{X}}\).
A shift \(({\fancyscript{X}},\sigma )\) (or a shift space \({\fancyscript{X}}\)) is of finite type, if \({\fancyscript{X}}\) can be identified by forbidding a finite set of patterns, that is, \({\fancyscript{X}}={\fancyscript{X}}_F\) for a finite set F. The shifts of finite type have the following gluing property: for every shift of finite type \(({\fancyscript{X}},\sigma )\), there is a neighborhood \(0\in M\subseteq \mathbb {L}\) such that for every two sets \(A,B\subseteq \mathbb {L}\) with \(M(A)\cap M(B)=\varnothing \) and every two configurations \(x,y\in {\fancyscript{X}}\) that agree outside \(A\cup B\), there is another configuration \(z\in {\fancyscript{X}}\) that agrees with x outside B and with y outside A. A similar gluing property is the strong irreducibility: a shift \(({\fancyscript{X}},\sigma )\) is strongly irreducible if there is a neighborhood \(0\in M\subseteq \mathbb {L}\) such that for every two sets \(A,B\subseteq \mathbb {L}\) with \(M(A)\cap M(B)=\varnothing \) and every two configurations \(x,y\in {\fancyscript{X}}\), there is another configuration \(z\in {\fancyscript{X}}\) that agrees with x in A and with y in B. Note that strong irreducibility is a stronger version of topological mixing. A dynamical system \(({\fancyscript{X}},\varphi )\) is (topologically) mixing if for every two non-empty open sets \(U,V\subseteq {\fancyscript{X}}\), \(U\cap \varphi ^{-t}V\ne \varnothing \) for all but finitely many t. A one-dimensional shift of finite type is strongly irreducible if and only if it is mixing. Our primary interest in this article will be the shifts of finite type that are strongly irreducible, for these are sufficiently broad to encompass the configuration space of most physically interesting lattice models.
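As a concrete illustration of a shift of finite type and of the gluing idea, consider the one-dimensional shift over \(\{0,1\}\) obtained by forbidding the single word 11 (a one-dimensional hard-core constraint, chosen here only as an example). The sketch below enumerates the words of a given length that avoid the forbidden word. For this particular shift every such word extends to a full configuration by padding with 0s, so the enumeration coincides with the allowed patterns on an interval; this coincidence is an assumption that fails for general forbidden sets, notably in higher dimensions. The function names are illustrative.

```python
from itertools import product


def avoids(word, forbidden):
    """True iff no forbidden word occurs as a contiguous block of `word`."""
    return not any(
        word[k:k + len(f)] == f
        for f in forbidden
        for k in range(len(word) - len(f) + 1)
    )


def allowed_words(n, forbidden, alphabet="01"):
    """All words of length n over `alphabet` avoiding the forbidden words."""
    words = ("".join(w) for w in product(alphabet, repeat=n))
    return [w for w in words if avoids(w, forbidden)]


def can_glue(u, v, gap, forbidden):
    """True iff u and v can be joined by some allowed filler of length `gap`."""
    return any(avoids(u + filler + v, forbidden)
               for filler in allowed_words(gap, forbidden))


if __name__ == "__main__":
    forbidden = ["11"]
    print([len(allowed_words(n, forbidden)) for n in range(1, 8)])
    words = allowed_words(4, forbidden)
    print(all(can_glue(u, v, 1, forbidden) for u in words for v in words))
```

Here a gap of one site always suffices, since a single 0 can be placed between any two allowed words, which reflects the strong irreducibility of this example.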
The morphisms between shift systems are the same as the sliding block maps. A map \(\Theta {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) between two shift spaces \({\fancyscript{X}}\subseteq S^\mathbb {L}\) and \({\fancyscript{Y}}\subseteq T^\mathbb {L}\) is a sliding block map if there is a neighborhood \(0\in N\subseteq \mathbb {L}\) (a neighborhood for \(\Theta \)) and a function \(\theta {:}\,L_N({\fancyscript{X}})\rightarrow T\) (a local rule for \(\Theta \)) such that
$$\begin{aligned} (\Theta x)(i){\triangleq }\theta \big ((\sigma ^i x)|_N\big ) \end{aligned}$$
for every configuration \(x\in {\fancyscript{X}}\) and every site \(i\in \mathbb {L}\). Any sliding block map is continuous and commutes with the shift, and hence is a morphism. Conversely, every morphism between shift systems is a sliding block map. The finite type property and strong irreducibility are both conjugacy invariants. A morphism \(\Theta {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) between two shifts \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\) is said to be pre-injective if for every two distinct asymptotic configurations \(x,y\in {\fancyscript{X}}\), the configurations \(\Theta x\) and \(\Theta y\) are distinct.
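The definition of a sliding block map can be illustrated with a minimal sketch. The code below is not taken from the article: it assumes a one-dimensional lattice, replaces infinite configurations by spatially periodic ones stored as finite lists, and uses a hypothetical XOR local rule with neighborhood \(N=\{0,1\}\). The names `sliding_block_map`, `xor_rule` and `shift` are illustrative only.

```python
def sliding_block_map(x, neighborhood, local_rule):
    """Apply local_rule to the window i + neighborhood at every site i.

    x is one period of a spatially periodic 1D configuration (assumption:
    periodic boundary stands in for the infinite lattice).
    """
    n = len(x)
    return [local_rule(tuple(x[(i + a) % n] for a in neighborhood))
            for i in range(n)]


def xor_rule(window):
    """Hypothetical local rule: sum modulo 2 of the two sites in the window."""
    left, right = window
    return left ^ right


def shift(x, a):
    """The shift sigma^a on a periodic configuration: (sigma^a x)(i) = x(i + a)."""
    n = len(x)
    return [x[(i + a) % n] for i in range(n)]


x = [0, 1, 1, 0, 1, 0, 0, 1]
y = sliding_block_map(x, (0, 1), xor_rule)

# A sliding block map commutes with the shift.
commutes = all(
    sliding_block_map(shift(x, a), (0, 1), xor_rule) == shift(y, a)
    for a in range(len(x))
)
```

Because the same local rule is applied at every site, commutation with the shift holds by construction; the check `commutes` only confirms this on one example.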
A cellular automaton on a shift space \({\fancyscript{X}}\) is a dynamical system identified by an endomorphism \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) of \(({\fancyscript{X}},\sigma )\). The evolution of a cellular automaton starting from a configuration \(x\in {\fancyscript{X}}\) is seen as synchronous updating of the state of different sites in x using the local rule of \(\Phi \). A cellular automaton \(({\fancyscript{X}},\Phi )\) is said to be surjective (resp., injective, pre-injective, bijective) if \(\Phi \) is surjective (resp., injective, pre-injective, bijective). If \(\Phi \) is bijective, the cellular automaton is further said to be reversible, for \(({\fancyscript{X}},\Phi ^{-1})\) is also a cellular automaton. In this article, we only work with cellular automata that are defined over strongly irreducible shifts of finite type. It is well known that for cellular automata over strongly irreducible shifts of finite type, surjectivity and pre-injectivity are equivalent (the Garden-of-Eden theorem; see below). In particular, every injective cellular automaton is also surjective, and hence reversible.
Let \({\fancyscript{X}}\subseteq S^\mathbb {L}\) be a shift space. A linear combination of characteristic functions of cylinder sets is called a local observable. An observable \(f{:}\,{\fancyscript{X}}\rightarrow \mathbb {R}\) is local if and only if it is \({\mathfrak {F}}_D\)-measurable for a finite set \(D\subseteq \mathbb {L}\). A finite set D with this property is a base for f; the value of f at a configuration x can be evaluated by looking at x “through the window D”. The set of all local observables on \({\fancyscript{X}}\), denoted by \(K({\fancyscript{X}})\), is dense in \(C({\fancyscript{X}})\). The set of all local observables on \({\fancyscript{X}}\) with base D is denoted by \(K_D({\fancyscript{X}})\).
Let \(D\subseteq \mathbb {L}\) be a nonempty finite set. The D-block presentation of a configuration \(x{:}\,\mathbb {L}\rightarrow S\) is a configuration \(x^{[D]}{:}\,\mathbb {L}\rightarrow S^D\), where \(x^{[D]}(i){\triangleq }x|_{i+D}\). If \({\fancyscript{X}}\) is a shift space, the set of D-block presentations of the elements of \({\fancyscript{X}}\) is called the D-block presentation of \({\fancyscript{X}}\), and is denoted by \({\fancyscript{X}}^{[D]}\). The shifts \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{X}}^{[D]},\sigma )\) are conjugate via the map \(x\mapsto x^{[D]}\).
More background on shifts and cellular automata (from the viewpoint of dynamical systems) can be found in the books [45, 48, 49].
Hamiltonians and Gibbs Measures
We will use the term Hamiltonian in more or less the same sense as in the Ising model or other lattice models from statistical mechanics, except that we do not require it to be interpreted as “energy”. A Hamiltonian formalizes the concept of a local and additive quantity, be it energy, momentum or a quantity with no familiar physical interpretation.
Let \({\fancyscript{X}}\) be an arbitrary set. A potential difference on \({\fancyscript{X}}\) is a partial mapping \(\Delta :{\fancyscript{X}}\times {\fancyscript{X}}\rightarrow \mathbb {R}\) such that

(a)
\(\Delta (x,x)=0\) for every \(x\in {\fancyscript{X}}\),

(b)
\(\Delta (y,x)=-\Delta (x,y)\) whenever \(\Delta (x,y)\) exists, and

(c)
\(\Delta (x,z)=\Delta (x,y)+\Delta (y,z)\) whenever \(\Delta (x,y)\) and \(\Delta (y,z)\) both exist.
Let \({\fancyscript{X}}\subseteq S^\mathbb {L}\) be a shift space. A potential difference \(\Delta \) on \({\fancyscript{X}}\) is a (relative) Hamiltonian if

(d)
\(\Delta (x,y)\) exists precisely when x and y are asymptotic,

(e)
\(\Delta (\sigma ^a x,\sigma ^a y)=\Delta (x,y)\) whenever \(\Delta (x,y)\) exists and \(a\in \mathbb {L}\), and

(f)
For every finite \(D\subseteq \mathbb {L}\), \(\Delta \) is continuous when restricted to pairs (x, y) with \({\mathrm {diff}}(x,y)\subseteq D\).
Note that due to the compactness of \({\fancyscript{X}}\), the latter continuity is uniform among all pairs (x, y) with \({\mathrm {diff}}(x,y)\subseteq D\). If condition (f) is strengthened to the following condition, we say that \(\Delta \) is a finite-range Hamiltonian.
 (f\(^\prime \)):

There exists a neighborhood \(0\in M\subseteq \mathbb {L}\) (the interaction neighborhood of \(\Delta \)) such that \(\Delta (x,y)\) depends only on the restriction of x and y to \(M({\mathrm {diff}}(x,y))\).
Hamiltonians in statistical mechanics are usually constructed by assigning interaction energies to different local arrangements of site states. Equivalently, they can be constructed using observables. A local observable \(f\in K({\fancyscript{X}})\) defines a finite-range Hamiltonian \(\Delta _f\) on \({\fancyscript{X}}\) via
$$\begin{aligned} \Delta _f(x,y){\triangleq }\sum _{i\in \mathbb {L}}\left[ f(\sigma ^i y)-f(\sigma ^i x)\right] \end{aligned}$$(7)
for every two asymptotic configurations \(x,y\in {\fancyscript{X}}\). The value of \(f\circ \sigma ^i\) is then interpreted as the contribution of site i to the energy-like quantity formalized by \(\Delta _f\). The same construction works for nonlocal observables that are “sufficiently short-ranged” (i.e., whose dependence on faraway sites decays rapidly). The variation of an observable \(f:{\fancyscript{X}}\rightarrow \mathbb {R}\) relative to a finite set \(A\subseteq \mathbb {L}\) is defined as
$$\begin{aligned} {\mathrm {var}}_A(f){\triangleq }\sup _{x,y}\left| f(y)-f(x)\right| \;, \end{aligned}$$
where the supremum is taken over all pairs of configurations x and y in \({\fancyscript{X}}\) that agree on A. A continuous observable f is said to have summable variations if
$$\begin{aligned} \sum _{n=0}^\infty \left| \partial I_n\right| \,{\mathrm {var}}_{I_n}(f)<\infty \;, \end{aligned}$$
where \(I_n{\triangleq }[-n,n]^d\) and \(\partial I_n{\triangleq }I_{n+1}\setminus I_n\). Every observable f that has summable variations defines a Hamiltonian via (7), in which the sum is absolutely convergent. We denote the set of observables with summable variations by \(SV({\fancyscript{X}})\). Note that \(K({\fancyscript{X}})\subseteq SV({\fancyscript{X}})\subseteq C({\fancyscript{X}})\).
Question 1
Is every Hamiltonian on a strongly irreducible shift of finite type generated by an observable with summable variations via (7)? Is every finite-range Hamiltonian on a strongly irreducible shift of finite type generated by a local observable?
Proposition 1
Every finite-range Hamiltonian on a full shift is generated by a local observable.
Proof
The idea is to write the Hamiltonian as a telescopic sum (see e.g. [30], or [41], Sect. 5).
Let \(\Delta \) be a finite-range Hamiltonian with interaction neighborhood M. Let \(\diamondsuit \) be an arbitrary uniform configuration. Let \(\preceq \) be the lexicographic order on \(\mathbb {L}={\mathbb {Z}}^d\), and denote by \({\mathrm {succ}}(k)\) the successor of site \(k\in \mathbb {L}\) in this ordering. For every configuration z that is asymptotic to \(\diamondsuit \), we can write
$$\begin{aligned} \Delta (\diamondsuit ,z)=\sum _{k\in \mathbb {L}}\Delta (z_k,z_{{\mathrm {succ}}(k)})\;, \end{aligned}$$
where \(z_k\) is the configuration that agrees with z on every site \(i\prec k\) and with \(\diamondsuit \) on every site \(i\succeq k\). Note that all but a finite number of terms in the above sum are 0.
For every configuration z, we define \(f(z){\triangleq }\Delta (z_0,z_{{\mathrm {succ}}(0)})\) with the same definition for \(z_k\) as above. This is clearly a local observable with base M. If z is asymptotic to \(\diamondsuit \), the above telescopic expansion shows that \(\Delta (\diamondsuit ,z)=\Delta _f(\diamondsuit ,z)\). If x and y are arbitrary asymptotic configurations, we have \(\Delta (x,y)=\Delta (\hat{x},\hat{y})=\Delta (\diamondsuit ,\hat{y})-\Delta (\diamondsuit ,\hat{x})\), where \(\hat{x}\) and \(\hat{y}\) are the configurations that agree, respectively, with x and y on \(M^{-1}(M({\mathrm {diff}}(x,y)))\) and with \(\diamondsuit \) everywhere else. Therefore, we can write \(\Delta (x,y)=\Delta (\hat{x},\hat{y})=\Delta _f(\hat{x},\hat{y})=\Delta _f(x,y)\). \(\square \)
Whether the above proposition extends to finite-range Hamiltonians on strongly irreducible shifts of finite type is not known, but in [12], examples of shifts of finite type are given on which not every finite-range Hamiltonian is generated by a local observable. On the other hand, the main result of [11] implies that on a one-dimensional mixing shift of finite type, every finite-range Hamiltonian can be generated by a local observable.
The trivial Hamiltonian on \({\fancyscript{X}}\) (i.e., the Hamiltonian \(\Delta \) for which \(\Delta (x,y)=0\) for all asymptotic \(x,y\in {\fancyscript{X}}\)) plays a special role as it identifies an important notion of equivalence between observables (see Sect. 2.5).
Another important concept regarding Hamiltonians is that of ground configurations. Let \({\fancyscript{X}}\subseteq S^\mathbb {L}\) be a shift space and \(\Delta \) a Hamiltonian on \({\fancyscript{X}}\). A ground configuration for \(\Delta \) is a configuration \(z\in {\fancyscript{X}}\) such that \(\Delta (z,x)\ge 0\) for every configuration \(x\in {\fancyscript{X}}\) that is asymptotic to z. The existence of ground configurations is well known. We shall use it later in the proof of Theorem 9.
Proposition 2
Every Hamiltonian on a shift space of finite type has at least one ground configuration.
Proof
Let \({\fancyscript{X}}\) be a shift space of finite type and \(\Delta \) a Hamiltonian on \({\fancyscript{X}}\). Let \(I_1\subseteq I_2\subseteq \cdots \) be a chain of finite subsets of \(\mathbb {L}\) that is exhaustive (i.e., \(\bigcup _n I_n=\mathbb {L}\)). For example, we could take \(I_n=[-n,n]^d\) in \(\mathbb {L}={\mathbb {Z}}^d\). Let \(z_0\in {\fancyscript{X}}\) be an arbitrary configuration, and construct a sequence of configurations \(z_1,z_2,\ldots \in {\fancyscript{X}}\) as follows.
For each n, choose \(z_n\in {\fancyscript{X}}\) to be a configuration with \({\mathrm {diff}}(z_0,z_n)\subseteq I_n\) such that \(\Delta (z_0,z_n)\) is minimal (i.e., \(\Delta (z_0,z_n)\le \Delta (z_0,x)\) for all \(x\in {\fancyscript{X}}\) with \({\mathrm {diff}}(z_0,x)\subseteq I_n\)). The minimum exists because \(L_{I_n}({\fancyscript{X}}\,|\,z_0)\) is finite. By compactness, there is a subsequence \(n_1<n_2<\cdots \) such that \(z_{n_i}\) converges. The limit \(z{\triangleq }\lim _{i\rightarrow \infty } z_{n_i}\) is a ground configuration.
To see this, let \(x\in {\fancyscript{X}}\) be asymptotic to z, and choose k such that \(I_k\supseteq {\mathrm {diff}}(z,x)\). Since \({\fancyscript{X}}\) is of finite type, there is an \(l\ge k\) such that for every two configurations \(u,v\in {\fancyscript{X}}\) that agree on \(I_l\setminus I_k\), there is a configuration \(w\in {\fancyscript{X}}\) that agrees with u on \(I_l\) and with v outside \(I_k\). In particular, for every sufficiently large i, x (and z) agree with \(z_{n_i}\) on \(I_l\setminus I_k\), and hence there is a configuration \(x_{n_i}\) that agrees with x on \(I_l\) and with \(z_{n_i}\) outside \(I_k\). Then, \({\mathrm {diff}}(z_{n_i},x_{n_i})={\mathrm {diff}}(z,x)\). Since \(z_{n_i}\rightarrow z\), we also get \(x_{n_i}\rightarrow x\). The continuity property of \(\Delta \) now implies that \(\Delta (z_{n_i},x_{n_i})\rightarrow \Delta (z,x)\). On the other hand, \(\Delta (z_{n_i},x_{n_i})=\Delta (z_0,x_{n_i})-\Delta (z_0,z_{n_i})\ge 0\) by the minimality of \(z_{n_i}\), because \({\mathrm {diff}}(z_0,x_{n_i})\subseteq I_{n_i}\) once \(n_i\ge k\). Therefore, \(\Delta (z,x)\ge 0\). \(\square \)
Example 1
(Ising model) The Ising model is a simple model on the lattice designed to give a statistical explanation of the phenomenon of spontaneous magnetization in ferromagnetic material (see e.g. [27, 90]). The configuration space of the d-dimensional Ising model is the full shift \({\fancyscript{X}}{\triangleq }\{\mathtt {-},\mathtt {+}\}^{\mathbb {L}}\), where \(\mathbb {L}={\mathbb {Z}}^d\), and where having \(\mathtt {+}\) and \(\mathtt {-}\) at a site i is interpreted as an upward or downward magnetization of the tiny segment of the material approximated by site i. The state of site i is called the spin at site i.
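The interaction between spins is modeled by associating an interaction energy \(-1\) to every two adjacent spins that are aligned (i.e., both are upward or both downward) and energy \(+1\) to every two adjacent spins that are not aligned. Alternatively, we can specify the energy using the energy observable \(f\in K({\fancyscript{X}})\) defined by
$$\begin{aligned} f(x){\triangleq }{\left\{ \begin{array}{ll} \frac{1}{2}\left( n^{\mathtt {-}}(x)-n^{\mathtt {+}}(x)\right) &{} \text {if } x(0)=\mathtt {+},\\ \frac{1}{2}\left( n^{\mathtt {+}}(x)-n^{\mathtt {-}}(x)\right) &{} \text {if } x(0)=\mathtt {-}, \end{array}\right. } \end{aligned}$$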
where \(n^{\mathtt {+}}(x)\) and \(n^{\mathtt {-}}(x)\) are, respectively, the number of upward and downward spins adjacent to site 0. This defines a Hamiltonian \(\Delta _f\). The two uniform configurations (all sites \(\mathtt {+}\) and all sites \(\mathtt {-}\)) are ground configurations for \(\Delta _f\), although \(\Delta _f\) has many other ground configurations. ◯
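The Hamiltonian of the Ising example can be evaluated directly on pairs of configurations that differ at finitely many sites. The following is a minimal sketch under stated assumptions, not the article's construction: the lattice is \({\mathbb {Z}}^2\), spins are encoded as the integers +1 and -1, and the site energy is taken to be half the sum of the bond energies at a site (energy -1 for an aligned bond, +1 otherwise). The function names are hypothetical.

```python
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def site_energy(x, i):
    """Energy contribution of site i: half the sum of its four bond energies.

    Assumption: an aligned bond has energy -1, a non-aligned bond +1.
    """
    total = 0
    for a in NEIGHBORS:
        j = (i[0] + a[0], i[1] + a[1])
        total += -1 if x(j) == x(i) else 1
    return total / 2


def hamiltonian(x, y, diff):
    """Delta_f(x, y) for configurations x, y that differ only on the finite set diff.

    Only the sites in diff and their neighbors can contribute to the sum.
    """
    affected = set(diff)
    for i in diff:
        for a in NEIGHBORS:
            affected.add((i[0] + a[0], i[1] + a[1]))
    return sum(site_energy(y, i) - site_energy(x, i) for i in affected)


def all_plus(i):
    """The uniform configuration with every spin up."""
    return +1


def with_flips(x, flipped):
    """The configuration obtained from x by flipping the spins in `flipped`."""
    flipped = frozenset(flipped)
    return lambda i: -x(i) if i in flipped else x(i)


one_flip = with_flips(all_plus, {(0, 0)})
two_flips = with_flips(all_plus, {(0, 0), (1, 0)})

d1 = hamiltonian(all_plus, one_flip, {(0, 0)})           # 4 broken bonds
d2 = hamiltonian(all_plus, two_flips, {(0, 0), (1, 0)})  # 6 broken bonds
```

Each broken bond raises the energy by 2, so flipping spins in the all-up configuration can only increase the energy, which is in line with the uniform configurations being ground configurations.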
Example 2
(Contour model) The contour model was originally used to study phase transition in the Ising model. Each site of the two-dimensional lattice \(\mathbb {L}={\mathbb {Z}}^2\) may take a state from the set
Not all configurations are allowed. The allowed configurations are those in which the state of adjacent sites match in the obvious fashion. For example,
can be placed on the right side of
but not on top of it, and
can be placed on the left side of
but not on its right side. The allowed configurations depict decorations of the lattice formed by closed or bi-infinite paths (see Fig. 2b). These paths are referred to as contours.
The space of allowed configurations \({\fancyscript{Y}}\subseteq T^{\mathbb {Z}}\) is a shift space of finite type. It is also easy to verify that \(({\fancyscript{Y}},\sigma )\) is strongly irreducible. Define the local observable \(g\in K({\fancyscript{Y}})\), where
The Hamiltonian \(\Delta _g\) simply compares the length of the contours in two asymptotic configurations. The uniform configuration in which every site is in state
is a ground configuration for \(\Delta _g\). Any configuration with a single bi-infinite horizontal (or vertical) contour is also a ground configuration for \(\Delta _g\). ◯
Example 3
(Hard-core gas) Let \(0\in W\subseteq \mathbb {L}\) be a neighborhood, and define a shift space \({\fancyscript{X}}\subseteq \{\mathtt {0},\mathtt {1}\}^\mathbb {L}\) consisting of all configurations x for which \(W(i)\cap W(j)=\varnothing \) for every distinct \(i,j\in \mathbb {L}\) with \(x(i)=x(j)=\mathtt {1}\). This is the configuration space of the hard-core gas model. A site having state \(\mathtt {1}\) is interpreted as containing a particle, whereas a site in state \(\mathtt {0}\) is thought of as empty. It is assumed that each particle occupies a volume W and that the volumes of different particles cannot overlap. The one-dimensional version of the hard-core shift with volume \(W=\{0,1\}\) is also known as the golden mean shift.
The hard-core shift is clearly of finite type. It is also strongly irreducible. In fact, \({\fancyscript{X}}\) has a stronger irreducibility property: for every two asymptotic configurations \(x,y\in {\fancyscript{X}}\), there is a sequence \(x=x_0,x_1,\ldots ,x_n=y\) of configurations in \({\fancyscript{X}}\) such that \({\mathrm {diff}}(x_i,x_{i+1})\) is a singleton. In particular, Proposition 1 can be adapted to cover the Hamiltonians on \({\fancyscript{X}}\).
Let \(h(x){\triangleq }1\) if \(x(0)=\mathtt {1}\) and \(h(x){\triangleq }0\) otherwise. The Hamiltonian \(\Delta _h\) compares the number of particles on two asymptotic configurations. The empty configuration is the unique ground configuration for \(\Delta _h\). ◯
Gibbs measures are a class of probability measures identified by Hamiltonians. Let \({\fancyscript{X}}\) be a shift space. A Gibbs measure for a finite-range Hamiltonian \(\Delta \) is a probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) satisfying
$$\begin{aligned} \frac{\pi ([y]_E)}{\pi ([x]_E)}=\mathrm {e}^{-\Delta (x,y)} \end{aligned}$$(14)
for every two asymptotic configurations \(x,y\in {\fancyscript{X}}\) and all sufficiently large E. (If M is the interaction neighborhood of \(\Delta \), the above equality will hold for every \(E\supseteq M({\mathrm {diff}}(x,y))\).) More generally, if \(\Delta \) is an arbitrary Hamiltonian on \({\fancyscript{X}}\), a probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) is said to be a Gibbs measure for \(\Delta \) if
$$\begin{aligned} \lim _{E\nearrow \mathbb {L}}\frac{\pi ([y]_E)}{\pi ([x]_E)}=\mathrm {e}^{-\Delta (x,y)} \end{aligned}$$(15)
for every configuration \(x\in {\fancyscript{X}}\) that is in the support of \(\pi \) and every configuration \(y\in {\fancyscript{X}}\) that is asymptotic to x. The limit is taken along the directed family of finite subsets of \(\mathbb {L}\) with inclusion.^{Footnote 2} The above limit is in fact uniform among all pairs of configurations x, y in the support of \(\pi \) whose disagreements \({\mathrm {diff}}(x,y)\) are included in a finite set \(D\subseteq \mathbb {L}\) (see Appendix). Note also that if \({\fancyscript{X}}\) is strongly irreducible, every Gibbs measure on \({\fancyscript{X}}\) has full support, and therefore, the relation (15) must hold for every two asymptotic \(x,y\in {\fancyscript{X}}\). The set of Gibbs measures for a Hamiltonian \(\Delta \), denoted by \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\), is nonempty, closed and convex. According to the Krein–Milman theorem, the set \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\) coincides with the closed convex hull of its extremal elements. The extremal elements of \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\) are mutually singular. The subset \({\fancyscript{G}}_\Delta ({\fancyscript{X}},\sigma )\) of shift-invariant elements of \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\) is also nonempty (using convexity and compactness), closed and convex, and hence equal to the closed convex hull of its extremal elements. The extremal elements of \({\fancyscript{G}}_\Delta ({\fancyscript{X}},\sigma )\) are precisely its ergodic elements, and hence again mutually singular.
The Gibbs measures associated to finite-range Hamiltonians have the Markov property. A measure \(\pi \) on a shift space \({\fancyscript{X}}\subseteq S^\mathbb {L}\) is called a Markov measure if there is a neighborhood \(0\in M\subseteq \mathbb {L}\) such that for every two finite sets \(D,E\subseteq \mathbb {L}\) with \(M(D)\subseteq E\) and every pattern \(p:E\rightarrow S\) with \(\pi ([p]_{E\setminus D})>0\) it holds that
$$\begin{aligned} \pi \left( [p]_D\,\big |\,[p]_{E\setminus D}\right) =\pi \left( [p]_D\,\big |\,[p]_{M(D)\setminus D}\right) \;. \end{aligned}$$
The data contained in the conditional probabilities \(\pi \left( [p]_D\,\big |\,[p]_{M(D)\setminus D}\right) \) for all choices of D, E and p is called the specification of the Markov measure \(\pi \). The specification of a Gibbs measure associated to a finite-range Hamiltonian is positive (i.e., all the conditional distributions are positive) and shift-invariant. Conversely, every positive shift-invariant Markovian specification is the specification of a Gibbs measure. In fact, Eq. (14) identifies a one-to-one correspondence between finite-range Hamiltonians and the positive shift-invariant Markovian specifications.
The uniform Bernoulli measure on a full shift \({\fancyscript{X}}\subseteq S^\mathbb {L}\) is the unique Gibbs measure for the trivial Hamiltonian on \({\fancyscript{X}}\). More generally, the shift-invariant Gibbs measures on a strongly irreducible shift of finite type associated to the trivial Hamiltonian are precisely the measures that maximize the entropy (see below).
We shall call a Gibbs measure regular if its corresponding Hamiltonian is generated by an observable with summable variations.
Entropy, Pressure, and the Variational Principle
Statistical mechanics attempts to explain the macroscopic behaviour of a physical system by statistical analysis of its microscopic details. In the subjective interpretation (see [36]), the probabilities reflect the partial knowledge of an observer. A suitable choice for a probability distribution over the possible microscopic states of a system is therefore one which, in light of the available partial observations, is least presumptive.
The standard approach to picking the least presumptive probability distribution is to maximize entropy. The characterization of the uniform probability distribution over a finite set as the probability distribution that maximizes entropy is widely known. Maximizing entropy subject to partial observations leads to the Boltzmann distribution. The infinite systems based on lattice configurations have a similar (though more technical) picture. Below, we give the minimal review necessary for our discussion. Details and more information can be found in the original monographs and textbooks [27, 35, 44, 75, 78]. The equilibrium statistical mechanics, which can be built upon the maximum entropy postulate, has been enormously successful in predicting the absence or presence of phase transitions, and in describing the qualitative features of the phases; see [27].
Let \({\fancyscript{X}}\subseteq S^\mathbb {L}\) be a shift space and \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) a probability measure on \({\fancyscript{X}}\). The entropy of a finite set \(A\subseteq \mathbb {L}\) of sites under \(\pi \) is
$$\begin{aligned} H_\pi (A){\triangleq }-\sum _{p:A\rightarrow S}\pi ([p])\log \pi ([p])\;. \end{aligned}$$
(By convention, \(0\log 0{\triangleq }0\).) This is the same as the Shannon entropy of the random variable \({\mathbf {x}}_A\) in the probability space \(({\fancyscript{X}},\pi )\), where \({\mathbf {x}}_A\) is the projection \(x\mapsto x|_{A}\). Let us recall a few basic properties of the entropy. The entropy \(H({\mathbf {x}})\) of a random variable \({\mathbf {x}}\) is nonnegative. If \({\mathbf {x}}\) takes its values in a finite set of cardinality n, then \(H({\mathbf {x}})\le \log n\). The entropy is subadditive, meaning that \(H(({\mathbf {x}},{\mathbf {y}}))\le H({\mathbf {x}})+H({\mathbf {y}})\) for every two random variables \({\mathbf {x}}\) and \({\mathbf {y}}\). If \({\mathbf {y}}=f({\mathbf {x}})\) depends deterministically on \({\mathbf {x}}\), we have \(H(f({\mathbf {x}}))\le H({\mathbf {x}})\).
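The basic properties of the entropy listed above can be checked numerically on a small example. The sketch below is illustrative only: the joint distribution of two binary sites is a hypothetical choice, and the helper names `entropy` and `marginal` are not from the article.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy (natural logarithm), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def marginal(joint, coords):
    """Distribution of the projection of a pattern onto the given coordinates."""
    out = defaultdict(float)
    for pattern, p in joint.items():
        out[tuple(pattern[c] for c in coords)] += p
    return dict(out)


# Hypothetical distribution of the patterns on two sites (positively correlated).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h_joint = entropy(joint)
h_first = entropy(marginal(joint, (0,)))
h_second = entropy(marginal(joint, (1,)))

nonnegative = h_joint >= 0
bounded = h_joint <= math.log(len(joint))   # H <= log(number of values)
subadditive = h_joint <= h_first + h_second  # H((x, y)) <= H(x) + H(y)
monotone = h_first <= h_joint                # a projection is a deterministic function
```

The projection onto one site is a deterministic function of the two-site pattern, which is why `monotone` is an instance of the inequality \(H(f({\mathbf {x}}))\le H({\mathbf {x}})\).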
Let \(I_n{\triangleq }[-n,n]^d\subseteq \mathbb {L}\) be the centered \((2n+1)\times (2n+1)\times \cdots \times (2n+1)\) box in the lattice. If \(\pi \) is shift-invariant, the subadditivity of \(A\mapsto H_\pi (A)\) ensures that the limit
$$\begin{aligned} h_\pi ({\fancyscript{X}},\sigma ){\triangleq }\lim _{n\rightarrow \infty }\frac{H_\pi (I_n)}{\left| I_n\right| }=\inf _{n\ge 0}\frac{H_\pi (I_n)}{\left| I_n\right| } \end{aligned}$$
exists (Fekete’s lemma). The limit value \(h_\pi ({\fancyscript{X}},\sigma )\) is the average entropy per site of \(\pi \) over \({\fancyscript{X}}\). It is also referred to as the (Kolmogorov-Sinai) entropy of the dynamical system \(({\fancyscript{X}},\sigma )\) under \(\pi \) (see [92], Theorem 4.17).
The entropy functional \(\pi \mapsto h_\pi ({\fancyscript{X}},\sigma )\) is nonnegative and affine. Although it is not continuous, it is upper semicontinuous.
Proposition 3
(Upper Semicontinuity) If \(\lim _{i\rightarrow \infty }\pi _i=\pi \), then \(\limsup _{i\rightarrow \infty }h_{\pi _i}({\fancyscript{X}},\sigma ) \le h_\pi ({\fancyscript{X}},\sigma )\).
Proof
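For each n, the map \(\pi \mapsto H_\pi (I_n)/\left| I_n\right| \) is continuous, and \(h_\pi ({\fancyscript{X}},\sigma )\) is the infimum of these maps over n. The pointwise infimum of a family of continuous functions is upper semicontinuous. \(\square \)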
The entropy functional is also bounded. Due to the compactness of \({\fancyscript{P}}({\fancyscript{X}},\sigma )\) and the upper semicontinuity of \(\pi \mapsto h_\pi ({\fancyscript{X}},\sigma )\), the entropy \(h_\pi ({\fancyscript{X}},\sigma )\) takes its maximum value at some measures \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\). This maximum value coincides with the topological entropy of the shift \(({\fancyscript{X}},\sigma )\), defined by
$$\begin{aligned} h({\fancyscript{X}},\sigma ){\triangleq }\lim _{n\rightarrow \infty }\frac{\log \left| L_{I_n}({\fancyscript{X}})\right| }{\left| I_n\right| }\;, \end{aligned}$$
which is the average combinatorial entropy per site of \({\fancyscript{X}}\).
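As a concrete instance of the average combinatorial entropy per site, one may count the allowed patterns of the golden mean shift (the one-dimensional hard-core shift with \(W=\{0,1\}\)). The sketch below is an illustration under stated assumptions: it uses windows of n consecutive sites rather than the centered boxes \(I_n\), and it relies on the fact that a binary word with no two adjacent 1s can always be extended by 0s to an allowed configuration. The function name `allowed_words` is hypothetical.

```python
import math
from itertools import product


def allowed_words(n):
    """Binary words of length n with no two adjacent 1s.

    Every such word is allowed in the golden mean shift, since it can be
    extended by 0s on both sides.
    """
    return [w for w in product((0, 1), repeat=n)
            if all(not (a and b) for a, b in zip(w, w[1:]))]


counts = [len(allowed_words(n)) for n in range(1, 13)]

# Combinatorial entropy per site for windows of length 1, ..., 12.
estimates = [math.log(c) / n for n, c in enumerate(counts, start=1)]

# The counts are Fibonacci numbers, so the estimates approach log of the golden ratio.
golden_ratio = (1 + math.sqrt(5)) / 2
limit = math.log(golden_ratio)
```

The estimates decrease slowly towards `limit`; the convergence is only of order 1/n, so a short window gives a rough upper bound rather than an accurate value.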
The following propositions are easy to prove, and are indeed valid for arbitrary dynamical systems.
Proposition 4
(Factoring) Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a factor map between two shifts \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\) and \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) a probability measure on \({\fancyscript{X}}\). Then, \(h_{\Phi \pi }({\fancyscript{Y}},\sigma )\le h_\pi ({\fancyscript{X}},\sigma )\).
Proposition 5
(Embedding) Let \(\Phi :{\fancyscript{Y}}\rightarrow {\fancyscript{X}}\) be an embedding of a shift \(({\fancyscript{Y}},\sigma )\) in a shift \(({\fancyscript{X}},\sigma )\) and \(\pi \in {\fancyscript{P}}({\fancyscript{Y}},\sigma )\) a probability measure on \({\fancyscript{Y}}\). Then, \(h_\pi ({\fancyscript{Y}},\sigma )= h_{\Phi \pi }({\fancyscript{X}},\sigma )\).
Given a continuous observable \(f\in C({\fancyscript{X}})\), the mapping \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\mapsto \pi (f)\) is continuous and affine. Its range is closed, bounded, and convex, that is, a finite closed interval \([e_{\min },e_{\max }]\subseteq \mathbb {R}\). For each \(e\in [e_{\min },e_{\max }]\), let us define
$$\begin{aligned} s_f(e){\triangleq }\sup \left\{ h_\nu ({\fancyscript{X}},\sigma ): \nu \in {\fancyscript{P}}({\fancyscript{X}},\sigma ) \text { and } \nu (f)=e\right\} \;. \end{aligned}$$
Let \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) denote the set of measures \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) with \(\pi (f)=e\) and \(h_\pi ({\fancyscript{X}},\sigma )=s_f(e)\), that is, the measures \(\pi \) that maximize entropy under the constraint \(\pi (f)=e\). By the compactness of \({\fancyscript{P}}({\fancyscript{X}},\sigma )\) and the upper semicontinuity of \(\pi \mapsto h_\pi ({\fancyscript{X}},\sigma )\), the set \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) is nonempty (as long as \(e\in [e_{\min },e_{\max }]\)). The mapping \(s_f(\cdot )\) is concave and continuous. The measures in \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) (and more generally, the solutions of similar entropy maximization problems with multiple constraints \(\pi (f_1)=e_1\), \(\pi (f_2)=e_2\), ..., \(\pi (f_n)=e_n\)) could be implicitly identified after a Legendre transform.
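The pressure associated to \(f\in C({\fancyscript{X}})\) could be defined as
$$\begin{aligned} P_f({\fancyscript{X}},\sigma ){\triangleq }\sup _{\nu \in {\fancyscript{P}}({\fancyscript{X}},\sigma )}\left[ h_\nu ({\fancyscript{X}},\sigma )-\nu (f)\right] \;. \end{aligned}$$(21)
The functional \(f\mapsto P_f({\fancyscript{X}},\sigma )\) is convex and Lipschitz continuous. It is the convex conjugate of the entropy functional \(\nu \mapsto h_\nu ({\fancyscript{X}},\sigma )\) (up to a negative sign), and we also have
$$\begin{aligned} h_\nu ({\fancyscript{X}},\sigma )=\inf _{f\in C({\fancyscript{X}})}\left[ P_f({\fancyscript{X}},\sigma )+\nu (f)\right] \end{aligned}$$
(see [75], Theorem 3.12). Note that the pressure \(P_0({\fancyscript{X}},\sigma )\) associated to 0 is the same as the topological entropy of \(({\fancyscript{X}},\sigma )\). Again, the compactness of \({\fancyscript{P}}({\fancyscript{X}},\sigma )\) and the upper semicontinuity of \(\nu \mapsto h_\nu ({\fancyscript{X}},\sigma )\) ensure that the supremum in (21) can be achieved. The set of shift-invariant probability measures \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) for which the equality in
$$\begin{aligned} h_\pi ({\fancyscript{X}},\sigma )-\pi (f)\le P_f({\fancyscript{X}},\sigma ) \end{aligned}$$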
is satisfied will be denoted by \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\). Following the common terminology of statistical mechanics and ergodic theory, we call the elements of \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) the equilibrium measures for f. Let us emphasize that this terminology lacks a dynamical justification that we are striving for. The Bayesian justification is further clarified below.
A celebrated theorem of Dobrushin, Lanford and Ruelle characterizes the equilibrium measures (for “short-ranged” observables over strongly irreducible shift spaces of finite type) as the associated shift-invariant Gibbs measures.
Theorem 1
(Characterization of Equilibrium Measures; see [75], Theorem 4.2, and [44], Sects. 5.2 and 5.3, and [57]) Let \({\fancyscript{X}}\subseteq S^\mathbb {L}\) be a strongly irreducible shift space of finite type. Let \(f\in SV({\fancyscript{X}})\) be an observable with summable variations and \(\Delta _f\) the Hamiltonian it generates. The set of equilibrium measures for f coincides with the set of shift-invariant Gibbs measures for \(\Delta _f\).
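Consider now an observable \(f\in C({\fancyscript{X}})\), and as before, let \([e_{\min },e_{\max }]\) be the set of possible values \(\nu (f)\) for \(\nu \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\). For every \(\beta \in \mathbb {R}\), we have
$$\begin{aligned} P_{\beta f}({\fancyscript{X}},\sigma )=\sup _{\nu \in {\fancyscript{P}}({\fancyscript{X}},\sigma )}\left[ h_\nu ({\fancyscript{X}},\sigma )-\beta \nu (f)\right] =\sup _{e\in [e_{\min },e_{\max }]}\left[ s_f(e)-\beta e\right] \;. \end{aligned}$$
That is, \(\beta \in \mathbb {R}\mapsto P_{\beta f}\) is the Legendre transform of \(e\mapsto s_f(e)\). If f has summable variations and the Hamiltonian \(\Delta _f\) is not trivial, it can be shown that \(\beta \mapsto P_{\beta f}\) is strictly convex (see [75], Sect. 4.6, or [35], Sect. III.4). It follows that \(e\mapsto s_f(e)\) is continuously differentiable everywhere except at \(e_{\min }\) and \(e_{\max }\), and
$$\begin{aligned} s_f(e)=\inf _{\beta \in \mathbb {R}}\left[ P_{\beta f}({\fancyscript{X}},\sigma )+\beta e\right] \end{aligned}$$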
for every \(e\in (e_{\min },e_{\max })\). For \(e\in (e_{\min },e_{\max })\), the above theorem identifies the elements of \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) as the shift-invariant Gibbs measures for \(\beta _e f\), where \(\beta _e\in \mathbb {R}\) is the unique value at which \(\beta \mapsto P_{\beta f}({\fancyscript{X}},\sigma )\) has a tangent with slope e. The mapping \(e\mapsto \beta _e\) is continuous and non-increasing. The set of slopes of tangents to \(\beta \mapsto P_{\beta f}({\fancyscript{X}},\sigma )\) at a point \(\beta \in \mathbb {R}\) is a closed interval \([e^-_\beta ,e^+_\beta ]\subseteq (e_{\min },e_{\max })\). We have
$$\begin{aligned} {\fancyscript{E}}_{\beta f}({\fancyscript{X}},\sigma )=\bigcup _{e\in [e^-_\beta ,e^+_\beta ]}{\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\;. \end{aligned}$$
When f is interpreted as the energy contribution of a single site, \(1/\beta \) is interpreted as the temperature and e as the mean energy per site. By a Bayesian reasoning, if \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) is a singleton, its unique element is an appropriate choice of the probability distribution of the system in thermal equilibrium when the mean energy per site is e. If \({\fancyscript{E}}_{\beta f}({\fancyscript{X}},\sigma )\) is a singleton, the unique element is interpreted as a description of the system in thermal equilibrium at temperature \(1/\beta \). The existence of more than one element in \({\fancyscript{E}}_{\beta f}({\fancyscript{X}},\sigma )\) (or in \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\)) is interpreted as the existence of more than one phase (e.g., liquid or gas) at temperature \(1/\beta \) (resp., with energy density e). The presence of distinct tangents to \(\beta \mapsto P_{\beta f}({\fancyscript{X}},\sigma )\) at a given inverse temperature \(\beta \) implies the existence of distinct phases at temperature \(1/\beta \) having different mean energy per site.
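The Legendre-transform relation between the pressure and the constrained maximum entropy can be checked numerically in the simplest possible setting. The sketch below rests on several assumptions that are not taken from the article: the shift is the full binary shift on \({\mathbb {Z}}\), the observable is the single-site indicator of state 1, and the pressure is taken with the sign convention "entropy minus \(\beta \) times mean energy". In that setting the maximal entropy at density e is the binary entropy of e, and the pressure has the closed form \(\log (1+\mathrm {e}^{-\beta })\). The function names are hypothetical.

```python
import math


def max_entropy(e):
    """Maximal entropy per site at density e of state 1 (binary entropy).

    Assumption: full binary shift, where the maximum is attained by the
    Bernoulli measure with parameter e.
    """
    if e in (0.0, 1.0):
        return 0.0
    return -e * math.log(e) - (1 - e) * math.log(1 - e)


def pressure_by_legendre(beta, grid=20001):
    """sup over e in [0, 1] of [max_entropy(e) - beta * e], by grid search."""
    step = 1.0 / (grid - 1)
    return max(max_entropy(k * step) - beta * k * step for k in range(grid))


def pressure_closed_form(beta):
    """Pressure of the single-site observable under the assumed sign convention."""
    return math.log(1 + math.exp(-beta))


betas = (-2.0, 0.0, 0.5, 3.0)
gaps = [abs(pressure_by_legendre(b) - pressure_closed_form(b)) for b in betas]
```

At \(\beta =0\) both expressions reduce to \(\log 2\), the topological entropy of the full binary shift. The grid search is a crude maximization, so agreement is only up to the grid resolution.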
Note that since the elements of \({\fancyscript{E}}_{\beta f}({\fancyscript{X}},\sigma )={\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}},\sigma )\) are shift-invariant, they only offer a description of the equilibrium states that respect the translation symmetry of the model. By extrapolating the interpretation, one could consider the Gibbs measures \(\pi \in {\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}})\) that are not shift-invariant as states of equilibrium in which the translation symmetry is broken.
Physical Equivalence of Observables
Let \({\fancyscript{X}}\subseteq S^{\mathbb {L}}\) be a strongly irreducible shift space of finite type. Every local observable generates a finite-range Hamiltonian via Eq. (7). However, different local observables may generate the same Hamiltonian. Two local observables \(f,g\in K({\fancyscript{X}})\) are physically equivalent (see [75], Sects. 4.6–4.7, [35], Sects. I.4 and III.4, or [27], Sect. 2.4), \(f\sim g\) in symbols, if they identify the same Hamiltonian, that is, if \(\Delta _f=\Delta _g\). The following proposition gives an alternate characterization of physical equivalence, which will allow us to extend the notion of physical equivalence to \(C({\fancyscript{X}})\).
Proposition 6
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. Two observables \(f,g\in K({\fancyscript{X}})\) are physically equivalent, if and only if there is a constant \(c\in \mathbb {R}\) such that \(\pi (f)=\pi (g)+c\) for every probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\).
Proof
 \(\Rightarrow \)):

Let \(h{\triangleq }f-g\). Let us pick an arbitrary configuration \(\diamondsuit \in {\fancyscript{X}}\) with the property that the spatial average
$$\begin{aligned} c{\triangleq }\lim _{n\rightarrow \infty } \frac{\sum _{i\in I_n} h(\sigma ^i \diamondsuit )}{\left| I_n\right| } \end{aligned}\tag{27}$$
(where \(I_n{\triangleq }[-n,n]^d\subseteq \mathbb {L}\)) exists. That such a configuration exists follows, for example, from the ergodic theorem.^{Footnote 3} We claim that indeed
$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\sum _{i\in I_n} h(\sigma ^i x)}{\left| I_n\right| } = c \end{aligned}\tag{28}$$
for every configuration \(x\in {\fancyscript{X}}\). This follows from the fact that \(\Delta _h=\Delta _f-\Delta _g=0\). More specifically, let \(0\in M\subseteq \mathbb {L}\) be a neighborhood that witnesses the strong irreducibility of \({\fancyscript{X}}\), and let \(D\subseteq \mathbb {L}\) be a finite base for h (i.e., h is \({\mathfrak {F}}_D\)-measurable). For each configuration \(x\in {\fancyscript{X}}\) and each \(n\ge 0\), let \(x_n\) be a configuration that agrees with x on \(I_n+D\) and with \(\diamondsuit \) off \(I_n+D+M-M\). Then
$$\begin{aligned} \sum _{i\in I_n} h(\sigma ^i x)&= \sum _{i\in I_n} h(\sigma ^i x_n) \end{aligned}\tag{29}$$
$$\begin{aligned}&= \sum _{i\in I_n} h(\sigma ^i\diamondsuit ) + \Delta _h(\diamondsuit ,x_n) + o(\left| I_n\right| ) \end{aligned}\tag{30}$$
$$\begin{aligned}&= \sum _{i\in I_n} h(\sigma ^i\diamondsuit ) + o(\left| I_n\right| ) \;, \end{aligned}\tag{31}$$
and the claim follows. Now, by the dominated convergence theorem, \(\pi (f)-\pi (g)=\pi (h)=c\) for every \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\).
 \(\Leftarrow \)):

Following the definition, \(P_f({\fancyscript{X}},\sigma )=P_g({\fancyscript{X}},\sigma )-c\), and f and g have the same equilibrium measures. Theorem 1 then implies that the shift-invariant Gibbs measures of \(\Delta _f\) and \(\Delta _g\) coincide, which in turn implies that \(\Delta _f=\Delta _g\).
\(\square \)
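A small computation makes the notion of physical equivalence tangible. The sketch below assumes that Eq. (7) has the usual form \(\Delta _f(x,y)=\sum _{i}[f(\sigma ^i y)-f(\sigma ^i x)]\) for asymptotic configurations, with \((\sigma ^i x)(k)=x(i+k)\), and works on the binary full shift over \({\mathbb {Z}}\); the helper names and the three observables are hypothetical. An observable and a shifted copy of it plus a constant should give the same Hamiltonian, while an unrelated observable should not.

```python
def config(cells, background=0):
    """A configuration on Z: finitely many cells override a constant background."""
    return lambda k: cells.get(k, background)

def hamiltonian(f, window, x, y, support):
    """Delta_f(x, y) = sum_i [f(sigma^i y) - f(sigma^i x)] for asymptotic x, y.

    f takes the tuple of symbols at sites i + k for k in window;
    support is a finite collection of sites containing every site where x and y differ.
    """
    lo = min(support) - max(window)
    hi = max(support) - min(window)
    total = 0
    for i in range(lo, hi + 1):
        total += f(tuple(y(i + k) for k in window)) - f(tuple(x(i + k) for k in window))
    return total

x = config({})
y = config({0: 1, 1: 1, 3: 1})
support = range(0, 4)

f = lambda s: s[0] * s[1]          # f(x) = x(0) x(1), read on the window (0, 1)
g = lambda s: s[0] * s[1] + 5      # g(x) = x(-1) x(0) + 5, read on the window (-1, 0)
h = lambda s: s[0]                 # h(x) = x(0), not equivalent to f

delta_f = hamiltonian(f, (0, 1), x, y, support)
delta_g = hamiltonian(g, (-1, 0), x, y, support)
delta_h = hamiltonian(h, (0,), x, y, support)
print(delta_f, delta_g, delta_h)
```

Here g differs from f by a shift and an additive constant, so \(\pi (g)=\pi (f)+5\) for every shift-invariant \(\pi \), in line with Proposition 6.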
As a corollary, the physical equivalence relation is closed in \(K({\fancyscript{X}})\times K({\fancyscript{X}})\):
Corollary 1
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. Let \(h_1,h_2,\ldots \) be local observables on \({\fancyscript{X}}\) such that \(\Delta _{h_i}=0\) for each i. If \(h_i\) converge to a local observable h, then \(\Delta _h=0\).
The continuous extension of this relation (i.e., the closure of \(\sim \) in \(C({\fancyscript{X}})\times C({\fancyscript{X}})\)) gives a notion of physical equivalence of arbitrary continuous observables.
Proposition 7
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. Two observables \(f,g\in C({\fancyscript{X}})\) are physically equivalent if and only if there is a constant \(c\in \mathbb {R}\) such that \(\pi (f)=\pi (g)+c\) for every probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\).
Proof
First, suppose that f and g are physically equivalent. Then, there exist sequences \(f_1,f_2,\ldots \) and \(g_1,g_2,\ldots \) of local observables such that \(f_i\rightarrow f\), \(g_i\rightarrow g\) and \(f_i\sim g_i\). By Proposition 6, there are real numbers \(c_i\) such that for every \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\), \(\pi (f_i)-\pi (g_i)=c_i\). Taking the limits as \(i\rightarrow \infty \), we obtain \(\pi (f)-\pi (g)=c\), where \(c{\triangleq }\lim _i c_i\) is independent of \(\pi \).
Conversely, suppose there is a constant \(c\in \mathbb {R}\) such that \(\pi (f)=\pi (g)+c\) for every probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\). Let \(h{\triangleq }f-g-c\). Then, according to Lemma 2, \(h\in C({\fancyscript{X}},\sigma )\). Therefore, by the denseness of \(K({\fancyscript{X}})\) in \(C({\fancyscript{X}})\), there exists a sequence of local observables \(h_i\in C({\fancyscript{X}},\sigma )\) such that \(h_i\rightarrow h\). Choose another sequence of local observables \(g_i\) that converges to g, and set \(f_i{\triangleq }h_i+g_i+c\). By Lemma 2, \(\pi (f_i)=\pi (g_i)+c\) for every \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\), which along with Proposition 6, implies that \(\Delta _{f_i}=\Delta _{g_i}\). Taking the limit, we obtain that f and g are physically equivalent. \(\square \)
Using Lemma 2, we also get the following characterization.
Corollary 2
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. Two observables \(f,g\in C({\fancyscript{X}})\) are physically equivalent if and only if \(f-g-c\in C({\fancyscript{X}},\sigma )\) for some \(c\in \mathbb {R}\).
Physically equivalent observables define the same set of equilibrium measures. Moreover, the sets of equilibrium measures of two observables with summable variations that are not physically equivalent are disjoint. (However, continuous observables that are not physically equivalent might in general share equilibrium measures; see [75], Corollary 3.17.)
Proposition 8
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. If two observables \(f,g\in C({\fancyscript{X}})\) are physically equivalent, they have the same set of equilibrium measures. Conversely, if two observables \(f,g\in SV({\fancyscript{X}})\) with summable variations share an equilibrium measure, they are physically equivalent.
Proof
The first claim is an easy consequence of the characterization of physical equivalence given in Proposition 7. The converse follows from the characterization of equilibrium measures as Gibbs measures (Theorem 1). \(\square \)
Entropy-Preserving Maps
Entropy and Preinjective Maps
The Garden-of-Eden theorem states that a cellular automaton over a strongly irreducible shift of finite type is surjective if and only if it is preinjective [10, 23, 31, 60, 63]. This is one of the earliest results in the theory of cellular automata, and gives a characterization of when a cellular automaton has a so-called Garden-of-Eden, that is, a configuration with no preimage. The Garden-of-Eden theorem can be proved by a counting argument. Alternatively, the argument can be phrased in terms of entropy (see [49], Theorem 8.1.16 and [9], Chapter 5).
Theorem 2
(see [49], Theorem 8.1.16 and [56], Theorem 3.6) Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a factor map from a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Then, \(h({\fancyscript{Y}},\sigma )\le h({\fancyscript{X}},\sigma )\) with equality if and only if \(\Phi \) is preinjective.
Theorem 3
(see [14], Theorem 3.3, and [56], Lemma 4.1, and [23], Lemma 4.4) Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type and \({\fancyscript{Y}}\subseteq {\fancyscript{X}}\) a proper subsystem. Then, \(h({\fancyscript{Y}},\sigma )<h({\fancyscript{X}},\sigma )\).
Corollary 3
(Garden-of-Eden Theorem [10, 23, 60, 63]) Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a cellular automaton on a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). Then, \(\Phi \) is surjective if and only if it is preinjective.
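Proof
If \(\Phi \) is surjective, then Theorem 2 (with \({\fancyscript{Y}}={\fancyscript{X}}\)) shows that \(\Phi \) is preinjective, because the two entropies trivially coincide. Conversely, if \(\Phi \) is preinjective, then Theorem 2 applied to the factor map \(\Phi {:}\,{\fancyscript{X}}\rightarrow \Phi {\fancyscript{X}}\) gives \(h(\Phi {\fancyscript{X}},\sigma )=h({\fancyscript{X}},\sigma )\), so by Theorem 3 the subsystem \(\Phi {\fancyscript{X}}\subseteq {\fancyscript{X}}\) cannot be proper. \(\square \)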
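The equivalence in Corollary 3 can be observed by brute force on small one-dimensional examples. The following Python sketch is illustrative only and assumes a binary full shift with a local rule on the neighborhood \(\{0,1\}\); the function names are hypothetical. It searches for orphans (finite words with no preimage, which witness non-surjectivity) and for diamonds (two distinct words with the same end symbols and the same image, which witness the failure of preinjectivity). Both searches are bounded, so a negative answer is only evidence, not a proof.

```python
from itertools import product

def apply_rule(rule, word):
    """Image of a finite word under the local rule: y[i] = rule(x[i], x[i+1])."""
    return tuple(rule(a, b) for a, b in zip(word, word[1:]))

def orphans(rule, n, alphabet=(0, 1)):
    """Words of length n that are not the image of any word of length n + 1."""
    images = {apply_rule(rule, w) for w in product(alphabet, repeat=n + 1)}
    return [w for w in product(alphabet, repeat=n) if w not in images]

def has_diamond(rule, m, alphabet=(0, 1)):
    """True if two distinct words of length m share end symbols and image."""
    seen = set()
    for w in product(alphabet, repeat=m):
        key = (w[0], w[-1], apply_rule(rule, w))
        if key in seen:
            return True
        seen.add(key)
    return False

xor = lambda a, b: (a + b) % 2
and_ = lambda a, b: a & b

print(orphans(xor, 4), has_diamond(xor, 5))    # XOR: no orphan, no diamond found
print(orphans(and_, 3), has_diamond(and_, 3))  # AND: an orphan and a diamond
```

Extending the two words of a diamond by a common context on both sides yields two distinct asymptotic configurations with the same image, which is why a diamond rules out preinjectivity.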
Another corollary of Theorem 2 (along with Lemma 1 and Proposition 4) is the so-called balance property of preinjective cellular automata.
Corollary 4
(see [14], Theorem 2.1, and [56], Theorems 3.3 and 3.6) Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map from a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Every maximum entropy measure \(\nu \in {\fancyscript{P}}({\fancyscript{Y}},\sigma )\) has a maximum entropy preimage \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\).
In particular, a cellular automaton on a full shift is surjective if and only if it preserves the uniform Bernoulli measure [31, 55]. In Sect. 4.2, we shall find a generalization of this property.
The probabilistic version of Theorem 2 states that the preinjective factor maps preserve the entropy of shift-invariant probability measures, and seems to be part of the folklore (see e.g. [32]).
Theorem 4
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a factor map from a shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Let \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) be a probability measure. Then, \(h_{\Phi \pi }({\fancyscript{Y}},\sigma )\le h_\pi ({\fancyscript{X}},\sigma )\) with equality if \(\Phi \) is preinjective.
Proof
For any factor map \(\Phi \), the inequality \(h_{\Phi \pi }({\fancyscript{Y}},\sigma )\le h_\pi ({\fancyscript{X}},\sigma )\) holds by Proposition 4. Suppose that \(\Phi \) is preinjective. It is enough to show that \(h_{\Phi \pi }({\fancyscript{Y}},\sigma )\ge h_\pi ({\fancyscript{X}},\sigma )\).
Let \(0\in M\subseteq \mathbb {L}\) be a neighborhood for \(\Phi \) and a witness for the finite-type gluing property of \({\fancyscript{X}}\) (see Sect. 2.2). Let \(A\subseteq \mathbb {L}\) be a finite set. By the preinjectivity of \(\Phi \), for every \(x\in {\fancyscript{X}}\), the pattern \(x|_{A\setminus M(A^\mathsf c )}\) is uniquely determined by \(x|_{\partial M(A)}\) and \((\Phi x)|_{A}\). Indeed, suppose that \(x'\in {\fancyscript{X}}\) is another configuration with \(x'|_{\partial M(A)}=x|_{\partial M(A)}\) and \((\Phi x')|_{A}=(\Phi x)|_{A}\). Then, the configuration \(x''\) that agrees with x on \(M(A^\mathsf c )\) and with \(x'\) on M(A) is in \({\fancyscript{X}}\) and asymptotic to x. Since \((\Phi x'')|_{A}=(\Phi x')|_{A}=(\Phi x)|_{A}\), it follows that \(\Phi x''=\Phi x\). Therefore, \(x''=x\), and in particular, \(x'|_{A\setminus M(A^\mathsf c )}=x''|_{A\setminus M(A^\mathsf c )}=x|_{A\setminus M(A^\mathsf c )}\).
From the basic properties of the Shannon entropy, it follows that
Now, choose the neighborhood M to be \(I_r{\triangleq }[-r,r]^d\subseteq \mathbb {L}\) for a sufficiently large r. For \(A{\triangleq }I_n=[-n,n]^d\), we obtain
Dividing by \(\left| I_n\right| \) we get
which proves the theorem by letting \(n\rightarrow \infty \). \(\square \)
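The entropy estimates behind this proof can be sketched as follows. This is a reconstruction under assumptions rather than the authors' exact display: \(H_\pi (B)\) is assumed to denote the Shannon entropy of the marginal of \(\pi \) on a finite set \(B\subseteq \mathbb {L}\), S denotes the symbol set of \({\fancyscript{X}}\), the entropy is taken to be \(h_\pi ({\fancyscript{X}},\sigma )=\lim _n H_\pi (I_n)/\left| I_n\right| \), and with \(M=I_r\), \(A=I_n\) and \(n\ge r\) we use \(A\setminus M(A^\mathsf c )=I_{n-r}\) and assume that \(\partial M(I_n)\) lies in the shell \(I_{n+r}\setminus I_{n-r}\).

$$\begin{aligned} H_\pi (I_n)&\le H_\pi (I_{n-r}) + H_\pi (I_n\setminus I_{n-r}) \\&\le H_{\Phi \pi }(I_n) + H_\pi (\partial M(I_n)) + H_\pi (I_n\setminus I_{n-r}) \\&\le H_{\Phi \pi }(I_n) + 2\left| I_{n+r}\setminus I_{n-r}\right| \log \left| S\right| \;. \end{aligned}$$

The first line is subadditivity, the second uses that \(x|_{I_{n-r}}\) is a function of \(x|_{\partial M(I_n)}\) and \((\Phi x)|_{I_n}\), and the third bounds the entropy of a pattern on B by \(\left| B\right| \log \left| S\right| \). Since \(\left| I_{n+r}\setminus I_{n-r}\right| =o(\left| I_n\right| )\), dividing by \(\left| I_n\right| \) and letting \(n\rightarrow \infty \) gives \(h_\pi ({\fancyscript{X}},\sigma )\le h_{\Phi \pi }({\fancyscript{Y}},\sigma )\).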
From Theorem 4 and Lemma 1, it immediately follows that the functionals \(f\mapsto P_f\) and \(f\mapsto s_f(\cdot )\) are preserved under the dual of a preinjective factor map.
Corollary 5
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a factor map from a shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Let \(f\in C({\fancyscript{Y}})\) be an observable. Then, \(P_{f\circ \Phi }({\fancyscript{X}},\sigma )\ge P_f({\fancyscript{Y}},\sigma )\) with equality if \(\Phi \) is preinjective.
Corollary 6
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a factor map from a shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Let \(f\in C({\fancyscript{Y}})\) be an observable. Then, \(s_{f\circ \Phi }(\cdot )\ge s_f(\cdot )\) with equality if \(\Phi \) is preinjective.
Central to this article is the following correspondence between the equilibrium (Gibbs) measures of a model and its preinjective factors.
Corollary 7
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map from a shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Let \(f\in C({\fancyscript{Y}})\) be an observable and \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) a probability measure. Then \(\pi \in {\fancyscript{E}}_{f\circ \Phi }({\fancyscript{X}},\sigma )\) if and only if \(\Phi \pi \in {\fancyscript{E}}_f({\fancyscript{Y}},\sigma )\).
Corollary 8
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map from a shift of finite type \(({\fancyscript{X}},\sigma )\) onto a shift \(({\fancyscript{Y}},\sigma )\). Let \(f\in C({\fancyscript{Y}})\) be an observable, \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) a probability measure, and \(e\in \mathbb {R}\). Then, \(\pi \in {\fancyscript{E}}_{\langle f\circ \Phi \rangle =e}({\fancyscript{X}},\sigma )\) if and only if \(\Phi \pi \in {\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{Y}},\sigma )\).
Example 4
Let \({\fancyscript{X}}\subseteq \{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) be the shift obtained by forbidding \(\mathtt {1}\mathtt {1}\) and \(\mathtt {2}\mathtt {2}\), and \({\fancyscript{Y}}\subseteq \{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) the shift obtained by forbidding \(\mathtt {2}\mathtt {2}\) and \(\mathtt {2}\mathtt {1}\). Then, both \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\) are mixing shifts of finite type. For every configuration \(x\in {\fancyscript{X}}\), let \(\Phi x\in \{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) be the configuration in which
Then, \(\Phi \) is a preinjective factor map from \({\fancyscript{X}}\) onto \({\fancyscript{Y}}\).
Consider the observables \(g_0,g_1,g_2{:}\,{\fancyscript{Y}}\rightarrow \mathbb {R}\) defined by
for \(y\in {\fancyscript{Y}}\). The Hamiltonians \(\Delta _{g_0}\), \(\Delta _{g_1}\) and \(\Delta _{g_2}\) count the number of \(\mathtt {0}\)s, \(\mathtt {1}\)s and \(\mathtt {2}\)s, respectively. The unique Gibbs measures for \(\Delta _{g_0}\), \(\Delta _{g_1}\) and \(\Delta _{g_2}\) are, respectively, the distributions of the bi-infinite Markov chains with transition matrices
In general, every finite-range Gibbs measure on a one-dimensional mixing shift of finite type is the distribution of a bi-infinite Markov chain and vice versa (see [27], Theorem 3.5 and [11]). The observables induced by \(g_0\), \(g_1\) and \(g_2\) on \({\fancyscript{X}}\) via \(\Phi \) satisfy
for every \(x\in {\fancyscript{X}}\). The unique Gibbs measures for \(\Delta _{g_0\circ \Phi }\), \(\Delta _{g_1\circ \Phi }\) and \(\Delta _{g_2\circ \Phi }\) are, respectively, the distributions of the bi-infinite Markov chains with transition matrices
By Corollary 7, we have \(\Phi \pi _0=\nu _0\), \(\Phi \pi _1=\nu _1\), and \(\Phi \pi _2=\nu _2\). ◯
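The correspondence between finite-range Gibbs measures and Markov chains used in Example 4 can be made concrete with a standard construction. The sketch below is an assumption-laden illustration, not the computation of the example itself: it takes the shift \({\fancyscript{X}}\) that forbids \(\mathtt {1}\mathtt {1}\) and \(\mathtt {2}\mathtt {2}\), a hypothetical single-site energy that counts \(\mathtt {1}\)s, the convention that a Gibbs measure weights configurations by \(\mathrm {e}^{-\text {energy}}\), and the usual Perron eigenvector normalization \(P_{ij}=W_{ij}r_j/(\lambda r_i)\). The function names are hypothetical.

```python
import math

def perron(matrix, iterations=2000):
    """Power iteration: dominant eigenvalue and right eigenvector of a primitive matrix."""
    n = len(matrix)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [c / lam for c in w]
    return lam, v

def gibbs_markov_chain(adjacency, energy):
    """Transition matrix of the Markov chain for a single-site energy on a 1D shift of finite type."""
    n = len(adjacency)
    weighted = [[adjacency[i][j] * math.exp(-energy[j]) for j in range(n)] for i in range(n)]
    lam, r = perron(weighted)
    return [[weighted[i][j] * r[j] / (lam * r[i]) for j in range(n)] for i in range(n)]

adjacency = [[1, 1, 1],   # 0 may be followed by 0, 1, 2
             [1, 0, 1],   # 11 is forbidden
             [1, 1, 0]]   # 22 is forbidden
energy = [0.0, 1.0, 0.0]  # hypothetical energy: one unit per symbol 1
P = gibbs_markov_chain(adjacency, energy)
for row in P:
    print([round(q, 4) for q in row])
```

Replacing the middle symbol of \(\mathtt {0}\mathtt {0}\mathtt {0}\) by \(\mathtt {1}\) costs one unit of energy, and the chain reproduces this as the ratio \(P_{01}P_{10}/P_{00}^2=\mathrm {e}^{-1}\).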
Complete Preinjective Maps
In this section, we discuss the extension of Corollary 7 to the case of non-shift-invariant Gibbs measures (Conjecture 1, Theorem 5, and Corollary 9). We start with an example that deviates from the main line of this article (i.e., understanding macroscopic equilibrium in surjective cellular automata) but rather demonstrates an application of factor maps as a tool in the study of phase transitions in equilibrium statistical mechanics models. The (trivial) argument used in this example however serves as a model for the proof of Theorem 5.
Example 5
(Ising and contour models) There is a natural correspondence between the two-dimensional Ising model (Example 1) and the contour model (Example 2).
As before, let \({\fancyscript{X}}=\{\mathtt {+},\mathtt {-}\}^{{\mathbb {Z}}^2}\) and \({\fancyscript{Y}}{\triangleq }T^{{\mathbb {Z}}^2}\) denote the configuration spaces of the Ising model and the contour model. Define a sliding block map \(\Theta {:}\,{\fancyscript{X}}\rightarrow T^{{\mathbb {Z}}^2}\) with neighborhood \(N{\triangleq }\{(0,0),(0,1),(1,0),(1,1)\}\) and local rule \(\theta {:}\,\{\mathtt {+},\mathtt {-}\}^N\rightarrow T\), specified by
(see Fig. 2). Then, \(\Theta \) is a factor map onto \({\fancyscript{Y}}\) and is preinjective. In fact, \(\Theta \) is 2-to-1: every configuration \(y\in {\fancyscript{Y}}\) has exactly two preimages \(x,x'\in {\fancyscript{X}}\), where \(x'=-x\) (i.e., \(x'\) is obtained from x by flipping the direction of the spin at every site). Moreover, if f denotes the energy observable for the Ising model and g the contour length observable for the contour model, we have \(\Delta _f=2\Delta _{g\circ \Theta }\).
This relationship, which was first discovered by Peierls [67], is used to reduce the study of the Ising model to the study of the contour model (see e.g. [28, 90]). The Gibbs measures for \(\beta \Delta _f\) represent the states of thermal equilibrium for the Ising model at temperature \(1/\beta \). According to Corollary 7 (and Theorem 1), the shift-invariant Gibbs measures \(\pi \in {\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}},\sigma )\) are precisely the \(\Theta \)-preimages of the shift-invariant Gibbs measures \(\nu \in {\fancyscript{G}}_{2\beta \Delta _g}({\fancyscript{Y}},\sigma )\) for the contour model.
In fact, in this case it is also easy to show that the \(\Theta \)-image of every Gibbs measure for \(\beta \Delta _f\) (not necessarily shift-invariant) is a Gibbs measure for \(2\beta \Delta _g\). Indeed, suppose that \(\pi \in {\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}})\) is a Gibbs measure for \(\beta \Delta _f\) and \(\nu {\triangleq }\Theta \pi \) its image. Let \(y,y'\in {\fancyscript{Y}}\) be asymptotic configurations, and \(E\supseteq {\mathrm {diff}}(y,y')\) a sufficiently large finite set of sites. Let \(x_1,x_2\in {\fancyscript{X}}\) be the preimages of y, and \(x'_1,x'_2\in {\fancyscript{X}}\) the preimages of \(y'\). Without loss of generality, we can assume that \(x_1\) is asymptotic to \(x'_1\), and \(x_2\) is asymptotic to \(x'_2\). It is easy to see that \(\Theta ^{-1}[y]_E=[x_1]_{N(E)}\cup [x_2]_{N(E)}\) and \(\Theta ^{-1}[y']_E=[x'_1]_{N(E)}\cup [x'_2]_{N(E)}\). Note that the cylinders \([x_1]_{N(E)}\) and \([x_2]_{N(E)}\) are disjoint, and so are the cylinders \([x'_1]_{N(E)}\) and \([x'_2]_{N(E)}\). Therefore,
$$\begin{aligned} \nu ([y]_E)=\pi ([x_1]_{N(E)})+\pi ([x_2]_{N(E)}) \quad \text {and}\quad \nu ([y']_E)=\pi ([x'_1]_{N(E)})+\pi ([x'_2]_{N(E)}) \;. \end{aligned}$$
Since N(E) is large and \(\pi \) is a Gibbs measure for \(\beta \Delta _f\), we have \(\pi ([x'_1]_{N(E)})=\mathrm {e}^{-\beta \Delta _f(x_1,x'_1)}\pi ([x_1]_{N(E)})\) and \(\pi ([x'_2]_{N(E)})=\mathrm {e}^{-\beta \Delta _f(x_2,x'_2)}\pi ([x_2]_{N(E)})\). Since \(\Delta _f(x_1,x'_1) =\Delta _f(x_2,x'_2)=2\Delta _g(y,y')\), it follows that \(\nu ([y']_E)=\mathrm {e}^{-2\beta \Delta _g(y,y')}\nu ([y]_E)\).
It has been proved that for any \(0<\beta <\infty \), the contour model with Hamiltonian \(2\beta \Delta _g\) has a unique Gibbs measure [1, 33]; the main difficulty is to show that the infinite contours are “unstable”, in the sense that, under every Gibbs measure, the probability of appearance of an infinite contour is zero.^{Footnote 4} Let us denote the unique Gibbs measure for \(2\beta \Delta _g\) by \(\nu _\beta \). It follows that the simplex of Gibbs measures for the Ising model at temperature \(1/\beta \) is precisely \(\Theta ^{-1}\nu _\beta \). For, the set \(\Theta ^{-1}\nu _\beta \) includes \({\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}})\) (by the above observation) and is included in \({\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}},\sigma )\) (because \(\nu _\beta \) must be shift-invariant). Therefore, the Gibbs measures for the Ising model at any temperature \(1/\beta \) are shift-invariant and \({\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}})={\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}},\sigma )=\Theta ^{-1}\nu _\beta \).
It is not difficult to show that if \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is a continuous k-to-1 map between two compact metric spaces, then every probability measure on \({\fancyscript{Y}}\) has at most k mutually singular preimages under \(\Phi \). In particular, the simplex \({\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}})={\fancyscript{G}}_{\beta \Delta _f}({\fancyscript{X}},\sigma )=\Theta ^{-1}\nu _\beta \) of Gibbs measures for the Ising model at temperature \(1/\beta \) has at most 2 ergodic elements.
Whether the Ising model at temperature \(1/\beta \) has two ergodic Gibbs measures or one depends on a specific geometric feature of the typical contour configurations under the measure \(\nu _\beta \). Roughly speaking, the contours of a contour configuration divide the two-dimensional plane into disjoint clusters. A configuration with no infinite contour generates either one or no infinite cluster, depending on whether each site is surrounded by a finite or infinite number of contours. Note that since \(\nu _\beta \) is ergodic, the number of infinite clusters in a random configuration chosen according to \(\nu _\beta \) is almost surely constant. If \(\nu _\beta \)-almost every configuration has an infinite cluster, then it follows by symmetry that \(\Theta ^{-1}\nu _\beta \) contains two distinct ergodic measures, one in which the infinite cluster is colored with \(\mathtt {+}\) and one with \(\mathtt {-}\). The converse is also known to be true [76]: if \(\nu _\beta \)-almost every configuration has no infinite cluster, then \(\Theta ^{-1}\nu _\beta \) has only one element.
Contour representations are used to study a wide range of statistical mechanics models, and are particularly fruitful to prove the “stability” of ground configurations at low temperature (see e.g. [20, 79]). ◯
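The two facts about \(\Theta \) used in Example 5, namely that a spin configuration and its global flip have the same contours and that energy differences are twice the contour-length differences, can be checked on a finite patch. The Python sketch below is a simplified stand-in and not the map \(\Theta \) itself: it assumes the energy \(-s_a s_b\) per nearest-neighbour bond, free boundary conditions on an n-by-n patch, and it represents contours by the set of bonds whose endpoints carry opposite spins. All names are hypothetical.

```python
import random

def bonds(n):
    """Nearest-neighbour bonds of the n-by-n patch (free boundary)."""
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                yield (i, j), (i + 1, j)
            if j + 1 < n:
                yield (i, j), (i, j + 1)

def ising_energy(spins, n):
    """Sum of -s_a * s_b over all bonds of the patch."""
    return -sum(spins[a] * spins[b] for a, b in bonds(n))

def contour(spins, n):
    """Bonds separating opposite spins; each one is crossed by a contour segment."""
    return {(a, b) for a, b in bonds(n) if spins[a] != spins[b]}

random.seed(0)
n = 6
sites = [(i, j) for i in range(n) for j in range(n)]
x = {s: random.choice((-1, 1)) for s in sites}
y = {s: random.choice((-1, 1)) for s in sites}
flipped = {s: -v for s, v in x.items()}

print(contour(x, n) == contour(flipped, n))
print(ising_energy(y, n) - ising_energy(x, n),
      2 * (len(contour(y, n)) - len(contour(x, n))))
```

Each agreeing bond contributes \(-1\) and each disagreeing bond \(+1\) to the energy, so the energy equals twice the contour length minus the number of bonds, which gives the factor 2.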
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map between two strongly irreducible shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\). Let \(f\in SV({\fancyscript{Y}})\) be an observable having summable variations and \(\Delta _f\) the Hamiltonian defined by f. Then, according to Theorem 1, the equilibrium measures of f and \(f\circ \Phi \) are precisely the shift-invariant Gibbs measures for the Hamiltonians \(\Delta _f\) and \(\Delta _{f\circ \Phi }\). A natural question is whether Corollary 7 remains valid for arbitrary Gibbs measures (not necessarily shift-invariant). If \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is a morphism between two shifts and \(\Delta \) is a Hamiltonian on \({\fancyscript{Y}}\), let us denote by \(\Phi ^*\Delta \) the Hamiltonian on \({\fancyscript{X}}\) defined by \((\Phi ^*\Delta )(x,y){\triangleq }\Delta (\Phi x,\Phi y)\).
Conjecture 1
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map between two strongly irreducible shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\). Let \(\Delta \) be a Hamiltonian on \({\fancyscript{Y}}\), and \(\pi \) a probability measure on \({\fancyscript{X}}\). Then, \(\pi \) is a Gibbs measure for \(\Phi ^*\Delta \) if and only if \(\Phi \pi \) is a Gibbs measure for \(\Delta \).
One direction of the latter conjecture is known to be true for a subclass of preinjective factor maps. Let us say that a preinjective factor map \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) between two shifts is complete if for every configuration \(x\in {\fancyscript{X}}\) and every configuration \(y'\in {\fancyscript{Y}}\) that is asymptotic to \(y{\triangleq }\Phi x\), there is a (unique) configuration \(x'\in {\fancyscript{X}}\) asymptotic to x such that \(\Phi x'= y'\).
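Completeness is a genuine restriction. As an illustration that is not taken from the text, consider the XOR map \((\Phi x)(i)=x(i)+x(i+1) \pmod {2}\) on the binary full shift, which is preinjective. By linearity, asking for \(x'\) asymptotic to x with \(\Phi x'=y'\) amounts to solving \(e(i)+e(i+1)=d(i) \pmod {2}\) for a finitely supported e, where d is the finitely supported difference between \(y'\) and \(\Phi x\). The hypothetical helper below solves this from the right and reports the constant value that e is forced to take to the left of the window; a solution with finite support exists only when that value is 0, that is, when d has an even number of \(\mathtt {1}\)s. Under these assumptions the XOR map is not complete.

```python
def xor_difference_preimage(d):
    """Solve e[i] + e[i+1] = d[i] (mod 2) with e = 0 to the right of the window.

    d lists the difference on sites 0..m-1 (zero elsewhere). Returns the values
    e[0..m] and the constant value e takes on all sites to the left of 0.
    """
    m = len(d)
    e = [0] * (m + 1)                 # e[m] = 0: no change far to the right
    for i in range(m - 1, -1, -1):
        e[i] = (d[i] + e[i + 1]) % 2
    left_tail = e[0]                  # d vanishes left of 0, so e stays constant there
    return e, left_tail

for d in ([1, 0, 1], [1], [1, 1, 1]):
    e, tail = xor_difference_preimage(d)
    print(d, e, "asymptotic preimage exists" if tail == 0 else "no asymptotic preimage")
```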
Lemma 3
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a complete preinjective factor map between two shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\). For every finite set \(D\subseteq \mathbb {L}\), there is a finite set \(E\subseteq \mathbb {L}\) such that every two asymptotic configurations \(x,x'\in {\fancyscript{X}}\) with \({\mathrm {diff}}(\Phi x,\Phi x')\subseteq D\) satisfy \({\mathrm {diff}}(x,x')\subseteq E\).
Proof
See Fig. 3 for an illustration.
For a configuration \(x\in {\fancyscript{X}}\), let \({\fancyscript{A}}_x\) be the set of all configurations \(x'\) asymptotic to x such that \({\mathrm {diff}}(\Phi x,\Phi x')\subseteq D\). The set \({\fancyscript{A}}_x\) is finite. Therefore, there is a finite set \(E_x\) such that all the elements of \({\fancyscript{A}}_x\) agree outside \(E_x\). We claim that if \(C_x\supseteq E_x\) is a large enough finite set of sites, then for every configuration \(x_1\in [x]_{C_x}\), all the elements of \({\fancyscript{A}}_{x_1}\) agree outside \(E_x\).
To see this, suppose that \(C_x\) is large, and consider a configuration \(x_1\in [x]_{C_x}\). Let \(x'_1\) be a configuration asymptotic to \(x_1\) such that \({\mathrm {diff}}(\Phi x_1,\Phi x'_1)\subseteq D\). By the gluing property of \({\fancyscript{Y}}\), there is a configuration \(y'\in {\fancyscript{Y}}\) that agrees with \(\Phi x'_1\) in a large neighborhood of D and with \(\Phi x\) outside D. Since \(\Phi \) is a complete preinjective factor map, there is a unique configuration \(x'\) asymptotic to x such that \(\Phi x'=y'\). Now, by the gluing property of \({\fancyscript{X}}\), there is a configuration \(x''_1\) that agrees with \(x'\) in \(C_x\) and with \(x_1\) outside \(E_x\). Since \(C_x\) was chosen large, it follows that \(\Phi x''_1=\Phi x'_1\). Since \(x'_1\) and \(x''_1\) are asymptotic, the preinjectivity of \(\Phi \) ensures that \(x''_1=x'_1\). Therefore, \(x_1\) and \(x'_1\) agree outside \(E_x\).
The cylinders \([x]_{C_x}\) form an open cover of \({\fancyscript{X}}\). Therefore, by the compactness of \({\fancyscript{X}}\), there is a finite set \({\fancyscript{I}}\subseteq {\fancyscript{X}}\) such that \(\bigcup _{x\in {\fancyscript{I}}} [x]_{C_x}\supseteq {\fancyscript{X}}\). The set \(E{\triangleq }\bigcup _{x\in {\fancyscript{I}}} E_x\) has the desired property. \(\square \)
Theorem 5
(see [75], Proposition 2.5) Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a complete preinjective factor map between two strongly irreducible shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\). Let \(\Delta \) be a Hamiltonian on \({\fancyscript{Y}}\), and \(\pi \) a probability measure on \({\fancyscript{X}}\). If \(\pi \) is a Gibbs measure for \(\Phi ^*\Delta \), then \(\Phi \pi \) is a Gibbs measure for \(\Delta \).
Proof
Let \(0\in N\subseteq \mathbb {L}\) be a neighborhood for \(\Phi \). Let \(0\in M\subseteq \mathbb {L}\) be a neighborhood that witnesses the finite-type gluing property of both \({\fancyscript{X}}\) and \({\fancyscript{Y}}\). We write \(\tilde{N}{\triangleq }N^{-1}(N)\) and \(\tilde{M}{\triangleq }M^{-1}(M)\).
Let y and \(y'\) be two asymptotic configurations in \({\fancyscript{Y}}\), and set \(D{\triangleq }{\mathrm {diff}}(y,y')\). For every configuration \(x\in \Phi ^{-1}[y]_{\tilde{M}(D)}\), there is a unique configuration \(x'\in \Phi ^{-1}[y']_{\tilde{M}(D)}\) that is asymptotic to x and such that \({\mathrm {diff}}(\Phi x,\Phi x')\subseteq D\). (Namely, by the gluing property of \({\fancyscript{Y}}\), the configuration \(y'_x\) that agrees with \(y'\) in \(\tilde{M}(D)\) and with \(\Phi x\) outside D is in \({\fancyscript{Y}}\). Since \(\Phi \) is a complete preinjective factor map, there is a unique configuration \(x'\) that is asymptotic to x and satisfies \(\Phi x'=y'_x\).) The relation \(x\mapsto x'\) is a one-to-one correspondence. By Lemma 3, there is a large enough finite set \(E\subseteq \mathbb {L}\) such that for every \(x\in \Phi ^{-1}[y]_{\tilde{M}(D)}\), it holds \({\mathrm {diff}}(x,x')\subseteq E\).
Consider a large finite set \(\hat{D}\subseteq \mathbb {L}\) and another finite set \(\hat{E}\subseteq \mathbb {L}\) that is much larger than \(\hat{D}\). (More precisely, we need \(\hat{D}\supseteq \tilde{M}(D)\) and \(\hat{E}\supseteq N(\hat{D})\cup \tilde{N}(E)\).) Let \(P_y\) denote the set of patterns \(p\in L_{\hat{E}}({\fancyscript{X}})\) such that \(\Phi [p]_{\hat{E}}\subseteq [y]_{\hat{D}}\). Then, \(\Phi ^{-1}[y]_{\hat{D}}=\bigcup _{p\in P_y} [p]_{\hat{E}}\) (provided \(\hat{E}\supseteq N(\hat{D})\)). Let \({\fancyscript{I}}\subseteq {\fancyscript{X}}\) be a finite set consisting of one representative from each cylinder \([p]_{\hat{E}}\), for \(p\in P_y\). Then, \(\Phi ^{-1}[y]_{\hat{D}}=\bigcup _{x\in {\fancyscript{I}}}[x]_{\hat{E}}\). Moreover, \(\Phi ^{-1}[y']_{\hat{D}}=\bigcup _{x\in {\fancyscript{I}}}[x']_{\hat{E}}\) (provided \(\hat{E}\supseteq \tilde{N}(E)\)). Since the terms in each of the two latter unions are disjoint, we have
For each \(x\in \Phi ^{-1}[y]_{\hat{D}}\), we have
Let us denote the first term on the right-hand side by \(\delta _{\hat{E}}(x)\) and the second term by \(\gamma _{\hat{D}}(x)\). Note that, since \(\pi \) is a Gibbs measure for \(\Phi ^*\Delta \) and \({\mathrm {diff}}(x,x')\subseteq E\), \(\delta _{\hat{E}}(x)\rightarrow 0\) uniformly over \(\Phi ^{-1}[y]_{\hat{D}}\) as \(\hat{E}\nearrow \mathbb {L}\). Note also that, by the continuity property of \(\Delta \), \(\gamma _{\hat{D}}(x)\rightarrow 0\) uniformly over \(\Phi ^{-1}[y]_{\hat{D}}\) as \(\hat{D}\nearrow \mathbb {L}\). We can now write
Consider a small number \(\varepsilon >0\). If \(\hat{D}\) is sufficiently large, we have \(\left| \gamma _{\hat{D}}\right| <\varepsilon /2\). Moreover, after choosing \(\hat{D}\), we can choose \(\hat{E}\) large enough so that \(\left| \delta _{\hat{E}}\right| <\varepsilon /2\). Therefore, for \(\hat{D}\) sufficiently large we get
It follows that
as \(\hat{D}\nearrow \mathbb {L}\). Since this is valid for every two asymptotic configurations \(y,y'\in {\fancyscript{Y}}\), we conclude that \(\Phi \pi \) is a Gibbs measure for \(\Delta \). \(\square \)
Corollary 9
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a conjugacy between two strongly irreducible shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\). Let \(\Delta \) be a Hamiltonian on \({\fancyscript{Y}}\), and \(\pi \) a probability measure on \({\fancyscript{X}}\). Then, \(\pi \) is a Gibbs measure for \(\Phi ^*\Delta \) if and only if \(\Phi \pi \) is a Gibbs measure for \(\Delta \).
The Image of a Gibbs Measure
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a preinjective factor map between two strongly irreducible shifts of finite type. According to Corollary 7, a preimage of a shift-invariant Gibbs measure under the induced map \({\fancyscript{P}}({\fancyscript{X}},\sigma )\rightarrow {\fancyscript{P}}({\fancyscript{Y}},\sigma )\) is again a Gibbs measure. The image of a Gibbs measure, however, does not need to be a Gibbs measure, as the following example demonstrates.
Example 6
(XOR map) Let \({\fancyscript{X}}={\fancyscript{Y}}{\triangleq }\{\mathtt {0},\mathtt {1}\}^{{\mathbb {Z}}}\) be the binary full shift and \(\Phi \) the so-called XOR map, defined by \((\Phi x)(i){\triangleq }x(i)+x(i+1) \pmod {2}\). Let \(\pi \) be the shift-invariant Bernoulli measure on \({\fancyscript{X}}\) with marginals \(\mathtt {1}\mapsto p\) and \(\mathtt {0}\mapsto 1-p\), where \(0< p< 1\). This is a Gibbs measure for the Hamiltonian \(\Delta _f\), where \(f:{\fancyscript{X}}\rightarrow \mathbb {R}\) is the single-site observable defined by \(f(x){\triangleq }-\log p\) if \(x(0)=\mathtt {1}\) and \(f(x){\triangleq }-\log (1-p)\) if \(x(0)=\mathtt {0}\). We claim that unless \(p=\frac{1}{2}\), \(\Phi \pi \) is not a regular Gibbs measure (i.e., a Gibbs measure for a Hamiltonian generated by an observable with summable variations).
Suppose, on the contrary, that \(p\ne \frac{1}{2}\) and \(\Phi \pi \) is a Gibbs measure for \(\Delta _g\) for some \(g\in SV({\fancyscript{X}})\). Then \(\pi \) is also an equilibrium measure for \(g\circ \Phi \) (Corollary 7), implying that f and \(g\circ \Phi \) are physically equivalent (Proposition 8). Consider the two uniform configurations \(\underline{\mathtt {0}}\) and \(\underline{\mathtt {1}}\), where \(\underline{\mathtt {0}}(i){\triangleq }\mathtt {0}\) and \(\underline{\mathtt {1}}(i){\triangleq }\mathtt {1}\) for every \(i\in {\mathbb {Z}}\). We have \(f(\underline{\mathtt {0}})=-\log (1-p)\ne -\log p=f(\underline{\mathtt {1}})\), whereas \(g\circ \Phi (\underline{\mathtt {0}})=g\circ \Phi (\underline{\mathtt {1}})\). If \(\delta _{\underline{\mathtt {0}}}\) and \(\delta _{\underline{\mathtt {1}}}\) are, respectively, the probability measures concentrated on \(\underline{\mathtt {0}}\) and \(\underline{\mathtt {1}}\), we get that \(\delta _{\underline{\mathtt {0}}}(f)-\delta _{\underline{\mathtt {0}}}(g\circ \Phi ) \ne \delta _{\underline{\mathtt {1}}}(f)-\delta _{\underline{\mathtt {1}}}(g\circ \Phi )\). This contradicts the physical equivalence of f and \(g\circ \Phi \), because \(\delta _{\underline{\mathtt {0}}},\delta _{\underline{\mathtt {1}}}\in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) (Proposition 7).
In fact, the same argument shows that none of the n-fold iterations \(\Phi ^n\pi \) are regular Gibbs measures, because \(\Phi ^n(\underline{\mathtt {0}})=\Phi ^n(\underline{\mathtt {1}})\) for every \(n\ge 1\). On the other hand, it has been shown [50, 58] that \(\Phi ^n\pi \) converges in density to the uniform Bernoulli measure, which is a Gibbs measure and is invariant under \(\Phi \). The question of approach to equilibrium will be discussed in Sect. 4.4. ◯
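The image measure \(\Phi \pi \) in Example 6 can be explored numerically on cylinder sets. The following is a minimal sketch, not part of the original argument: the function names are illustrative, and cylinder probabilities are obtained by brute-force summation over the preimage words of the XOR rule.

```python
from itertools import product

def xor_image(word):
    """Apply the XOR rule (Phi x)(i) = x(i) + x(i+1) mod 2 to a finite word."""
    return tuple((a + b) % 2 for a, b in zip(word, word[1:]))

def bernoulli_prob(word, p):
    """Probability of a cylinder under the Bernoulli measure with P(1) = p."""
    ones = sum(word)
    return p ** ones * (1 - p) ** (len(word) - ones)

def pushforward_prob(word, p):
    """(Phi pi)[word]: total pi-mass of the words of length len(word) + 1
    that the XOR rule maps onto `word`."""
    n = len(word) + 1
    return sum(bernoulli_prob(u, p)
               for u in product((0, 1), repeat=n)
               if xor_image(u) == word)
```

For \(p=\frac{1}{2}\) every cylinder of length n receives mass \(2^{-n}\), in line with the invariance of the uniform Bernoulli measure. For \(p\ne \frac{1}{2}\) the two uniform words have the same image, which is the collision exploited in the argument above.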
The latter example was first suggested by van den Berg (see [51], Sect. 3.2) as an example of a measure that is strongly non-Gibbsian, in the sense that attempting to define a Hamiltonian for it via (15) would lead to a function \(\Delta \) for which the continuity property fails everywhere. The question of when a measure is Gibbsian, together with the study of the symptoms of being non-Gibbsian, is an active area of research, as non-Gibbsianness sets boundaries on the applicability of the so-called renormalization group technique in statistical mechanics (see e.g. [19, 21]).
The observation in Example 6 can be generalized as follows.
Proposition 9
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type and \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) a Gibbs measure for a Hamiltonian \(\Delta _f\), where \(f\in SV({\fancyscript{X}})\). Suppose that \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) is a pre-injective factor map from \(({\fancyscript{X}},\sigma )\) onto another shift of finite type \(({\fancyscript{Y}},\sigma )\). A necessary condition for \(\Phi \pi \) to be a regular Gibbs measure is that for every two measures \(\mu _1,\mu _2\in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) with \(\mu _1(f)\ne \mu _2(f)\), we have \(\Phi \mu _1\ne \Phi \mu _2\).
Example 7
(XOR map; Example 6 continued) The argument of Example 6 can be stretched to show that the iterations of the XOR map eventually turn every Gibbs measure other than the uniform Bernoulli measure into a non-Gibbs measure. More specifically, for every observable \(f\in SV({\fancyscript{X}})\) that is not physically equivalent to 0 and every shift-invariant Gibbs measure \(\pi \) for \(\Delta _f\), there is an integer \(n_0\ge 1\) such that for any \(n\ge n_0\), the measure \(\Phi ^n\pi \) is not a regular Gibbs measure.
This is a consequence of the self-similar behaviour of the XOR map. Namely, the map \(\Phi \) satisfies \((\Phi ^{2^k}x)(i) = x(i) + x(i+2^k) \pmod {2}\) for every \(i\in {\mathbb {Z}}\) and every \(k\ge 1\). If f is not physically equivalent to 0, two periodic configurations \(x,y\in {\fancyscript{X}}\) with common period \(2^k\) can be found such that \(2^{-k}\sum _{i=0}^{2^k-1}f(\sigma ^i x)\ne 2^{-k}\sum _{i=0}^{2^k-1}f(\sigma ^i y)\). If \(\mu _x\) and \(\mu _y\) denote, respectively, the shift-invariant measures concentrated at the shift orbits of x and y, we obtain that \(\mu _x(f)\ne \mu _y(f)\). Nevertheless, \(\Phi ^n x=\Phi ^n y=\underline{\mathtt {0}}\) for all \(n\ge 2^k\), implying that \(\Phi ^n \mu _x = \Phi ^n \mu _y = \delta _{\underline{\mathtt {0}}}\). Therefore, according to Proposition 9, the measure \(\Phi ^n\pi \) cannot be a regular Gibbs measure. ◯
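The self-similarity identity and the collapse of periodic configurations used in Example 7 can be checked on spatially periodic configurations. The sketch below is illustrative only: a periodic configuration is stored as one period in a Python list (an assumption of this sketch), and `density` evaluates the single-site observable "symbol 1 at the origin" averaged over a period.

```python
def xor_step(x):
    """One step of the XOR map on a spatially periodic configuration,
    represented by a single period."""
    n = len(x)
    return [(x[i] + x[(i + 1) % n]) % 2 for i in range(n)]

def iterate(x, steps):
    """Apply the XOR map `steps` times."""
    for _ in range(steps):
        x = xor_step(x)
    return x

def density(x):
    """Average over one period of the indicator of symbol 1."""
    return sum(x) / len(x)

# Two configurations with common period 2**3 and different densities of 1s.
u = [1, 1, 0, 1, 0, 0, 0, 0]
v = [1, 0, 0, 0, 0, 0, 0, 0]
collapsed = iterate(u, 8) == iterate(v, 8) == [0] * 8
```

Here `u` and `v` are distinguished by the density of 1s, yet both are mapped to the all-zero configuration after \(2^3\) steps, which is the mechanism behind the non-Gibbsianness of \(\Phi ^n\pi \).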
With the interpretation of the shift-ergodic measures as the macroscopic states (see the Introduction), the above proposition reads as follows: a sufficient condition for the non-Gibbsianness of \(\Phi \pi \) is that there are two macroscopic states that are distinguishable by the density of f and are mapped to the same state by \(\Phi \).
If the induced map \(\Phi :{\fancyscript{P}}({\fancyscript{X}},\sigma )\rightarrow {\fancyscript{P}}({\fancyscript{Y}},\sigma )\) is not one-to-one, then there are Gibbs measures (even Markov measures) whose images are not Gibbs. Indeed, suppose \(\mu _1,\mu _2\in {\fancyscript{P}}({\fancyscript{X}},\sigma )\) are distinct measures with \(\Phi \mu _1=\Phi \mu _2\). Then, there is a local observable \(f\in K({\fancyscript{X}})\) such that \(\mu _1(f)\ne \mu _2(f)\). Every shift-invariant Gibbs measure for \(\Delta _f\) is mapped by \(\Phi \) to a measure that is not regular Gibbs.
Question 2
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{Y}}\) be a pre-injective factor map between two strongly irreducible shifts of finite type \(({\fancyscript{X}},\sigma )\) and \(({\fancyscript{Y}},\sigma )\), and suppose that the induced map \(\Phi :{\fancyscript{P}}({\fancyscript{X}},\sigma )\rightarrow {\fancyscript{P}}({\fancyscript{Y}},\sigma )\) is injective. Does \(\Phi \) map every (regular, shift-invariant) Gibbs measure to a Gibbs measure?
Cellular Automata
Conservation Laws
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a cellular automaton on a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). We say that \(\Phi \) conserves (the energy-like quantity formalized by) a Hamiltonian \(\Delta \) if \(\Delta (\Phi x,\Phi y)=\Delta (x,y)\) for every two asymptotic configurations \(x,y\in {\fancyscript{X}}\). If \(\Delta =\Delta _f\) is the Hamiltonian generated by a local observable \(f\in K({\fancyscript{X}})\), then we may also say that \(\Phi \) conserves f (in the aggregate). More generally, we say that a continuous observable \(f\in C({\fancyscript{X}})\) is conserved by \(\Phi \) if f and \(f\circ \Phi \) are physically equivalent. According to Proposition 7, this is equivalent to the existence of a constant \(c\in \mathbb {R}\) such that \((\Phi \pi )(f)=\pi (f)+c\) for every shift-invariant probability measure \(\pi \). However, in this case c is always 0.
Proposition 10
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). A continuous observable \(f\in C({\fancyscript{X}})\) is conserved by \(\Phi \) if and only if \((\Phi \pi )(f)=\pi (f)\) for every probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\).
Proof
Let \(c\in \mathbb {R}\) be such that \((\Phi \pi )(f)=\pi (f)+c\) for every \(\pi \in {\fancyscript{P}}({\fancyscript{X}},\sigma )\). Then, for every \(n>0\), \((\Phi ^n\pi )(f)=\pi (f)+nc\). However, every continuous function on a compact space is bounded. Therefore, \(c=0\). \(\square \)
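The boundedness step in the proof of Proposition 10 can be made explicit. Writing \(\Vert f\Vert \) for the supremum norm of f (a notation assumed here for the remark), and using that each \(\Phi ^n\pi \) is a probability measure, one gets for every \(n>0\)

\[ n\,|c| = \big |(\Phi ^n\pi )(f)-\pi (f)\big | \le \big |(\Phi ^n\pi )(f)\big | + \big |\pi (f)\big | \le 2\Vert f\Vert , \]

so that \(|c|\le 2\Vert f\Vert /n\) for all \(n>0\), which forces \(c=0\).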
If an observable f is conserved by a cellular automaton \(\Phi \), we say that f is bound by a conservation law under \(\Phi \). There is also a concept of local conservation law. Let \(D_0\) be a finite generating set for the group \(\mathbb {L}={\mathbb {Z}}^d\). Suppose that \(f\in C({\fancyscript{X}})\) is an observable that is conserved by a cellular automaton \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\). By Proposition 10 and Lemma 2, this means that \(f\circ \Phi -f\in C({\fancyscript{X}},\sigma )\), that is
\[ f\circ \Phi -f=\lim _{n\rightarrow \infty }\sum _{i\in D_0}\big (h_i^{(n)}\circ \sigma ^i-h_i^{(n)}\big ) \]
for some \(h_i^{(n)}\in K({\fancyscript{X}})\) (for \(i\in D_0\) and \(n=0,1,2,\ldots \)). In other words, for every configuration \(x\in {\fancyscript{X}}\) it holds
\[ f(\Phi x)=f(x)+\lim _{n\rightarrow \infty }\sum _{i\in D_0}\big [h_i^{(n)}(\sigma ^i x)-h_i^{(n)}(x)\big ]. \]
If furthermore \(f\circ \Phi -f\in K({\fancyscript{X}},\sigma )\subseteq C({\fancyscript{X}},\sigma )\) (where \(K({\fancyscript{X}},\sigma )\) is defined as in (2)), then we have the more intuitive equation
\[ f(\Phi x)=f(x)+\sum _{i\in D_0}\big [h_i(\sigma ^i x)-h_i(x)\big ] \]
for some \(h_i\in K({\fancyscript{X}})\). In this case, we say that f is locally conserved by \(\Phi \) (or satisfies a local conservation law under \(\Phi \)). The value \(h_i(\sigma ^k x)\) is then interpreted as the flow (of the energy-like quantity captured by f) from site k to site \(k-i\). The latter equation is a continuity equation, stating that at each site k, the changes in the observed quantity after one step should balance with the incoming and the outgoing flows. If \({\fancyscript{X}}\) is a full shift, it is known that every conserved local observable is locally conserved. The proof is similar to that of Proposition 1.
Local conservation laws enjoy a somewhat symmetric relationship with time and space. Namely, an observable \(f\in K({\fancyscript{X}})\) is locally conserved by \(\Phi \) if and only if the observable \(\alpha {\triangleq }f\circ \Phi -f\) is in \(K({\fancyscript{X}},\Phi )\cap K({\fancyscript{X}},\sigma )\). Moreover, to every observable \(\alpha \in K({\fancyscript{X}},\Phi )\cap K({\fancyscript{X}},\sigma )\), there corresponds at least one observable \(f\in K({\fancyscript{X}})\) such that \(\alpha =f\circ \Phi -f\) and f is locally conserved by \(\Phi \). In general, there might be several observables f with the latter property. If \(\alpha =f\circ \Phi -f=f'\circ \Phi -f'\) for two observables \(f,f'\in K({\fancyscript{X}})\), then \((f-f')=(f-f')\circ \Phi \); that is, \(f-f'\) is invariant under \(\Phi \). Every constant observable is invariant under any cellular automaton. The following is an example of a cellular automaton with non-constant invariant local observables.
Example 8
(Invariant observables) Let \(\Phi :\{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\rightarrow \{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) be the cellular automaton with
(see Fig. 4). The observable \(f:\{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\rightarrow \mathbb {R}\) defined by \(f(x){\triangleq }1\) if \(x(0)=\mathtt {2}\) and \(f(x){\triangleq }0\) otherwise is obviously invariant. The Hamiltonian \(\Delta _f\) counts the number of occurrences of symbol \(\mathtt {2}\) and is conserved by \(\Phi \). In fact, there are infinitely many linearly independent, physically non-equivalent observables that are invariant under \(\Phi \). Namely, the relative positions of the occurrences of \(\mathtt {2}\) remain unchanged, and hence, for any finite set \(D\subseteq {\mathbb {Z}}\), the logical conjunction of \(f\circ \sigma ^i\) for \(i\in D\) is invariant. It follows that \(\Phi \) has infinitely many distinct (and linearly independent) conservation laws.
Such abundance of conservation laws is common among all cellular automata having non-constant invariant local observables (see Lemma 2 of [25]), and has been suggested as the reason behind the “non-physical” behavior in these cellular automata (see e.g. [83]). Every surjective equicontinuous cellular automaton is periodic [5, 7] and hence has non-constant invariant local observables. It follows that every surjective cellular automaton that has a nontrivial equicontinuous cellular automaton as factor has non-constant invariant local observables and an infinity of linearly independent conservation laws. ◯
Question 3
Does every surjective cellular automaton with equicontinuous points have non-constant invariant local observables?
Every cellular automaton \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) conserves the trivial Hamiltonian \(\Delta \equiv 0\) on \({\fancyscript{X}}\). Furthermore, every observable \(f\in C({\fancyscript{X}})\) that is physically equivalent to 0 (i.e., \(f-c\in C({\fancyscript{X}},\sigma )\) for some \(c\in \mathbb {R}\)) is trivially conserved by \(\Phi \). Likewise, a local observable \(f\in K({\fancyscript{X}})\) is trivially locally conserved by \(\Phi \) if \(f-c\in K({\fancyscript{X}},\sigma )\) for some \(c\in \mathbb {R}\). We shall say that two local observables \(f,g\in K({\fancyscript{X}})\) are locally physically equivalent if \(f-g-c\in K({\fancyscript{X}},\sigma )\) for some \(c\in \mathbb {R}\). The following proposition is the analogue of Proposition 10.
Proposition 11
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). A local observable \(f\in K({\fancyscript{X}})\) is locally conserved by \(\Phi \) if and only if f and \(f\circ \Phi \) are locally physically equivalent.
Invariance of Gibbs Measures
As a corollary of the results of Sect. 3, we obtain a correspondence between the conservation laws of a surjective cellular automaton and its invariant Gibbs measures. It is well known that every surjective cellular automaton over a full shift preserves the uniform Bernoulli measure (see [31], Theorem 5.4 and [55]). The invariance of the uniform Bernoulli measure is sometimes called the balance property of (the local update rule of) surjective cellular automata. In the case of surjective cellular automata over strongly irreducible shifts of finite type, a similar property is known to hold: every measure of maximum entropy is mapped to a measure of maximum entropy (see [14], Corollary 2.3 and [56], Theorems 3.3 and 3.6). The following two theorems can be seen as further generalizations of the balance property. Indeed, choosing \(f\equiv 0\) in either of the two theorems implies that a surjective cellular automaton maps each measure of maximum entropy to a measure of maximum entropy. An elementary proof of Theorem 6 in the special case of surjective cellular automata on one-dimensional full shifts and single-site observables was earlier presented in [42].
Theorem 6
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a surjective cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\), and let \(f\in SV({\fancyscript{X}})\) be an observable with summable variations. The following conditions are equivalent:

(a) \(\Phi \) conserves f.
(b) \(\Phi \) maps the set \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) of equilibrium measures for f onto itself.
(c) There exists a measure in \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) whose \(\Phi \)-image is also in \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\).
If \(f\in C({\fancyscript{X}})\) does not have summable variations, condition (a) still implies the other two conditions.
Proof
(a \(\Rightarrow \) b): Suppose that \(\Phi \) conserves f. By Proposition 8 and Corollary 7 we have \(\pi \in {\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) if and only if \(\Phi \pi \in {\fancyscript{E}}_f({\fancyscript{X}},\sigma )\). Using Lemma 1, we obtain \(\Phi {\fancyscript{E}}_f({\fancyscript{X}},\sigma )={\fancyscript{E}}_f({\fancyscript{X}},\sigma )\).
(b \(\Rightarrow \) c): Trivial.
(c \(\Rightarrow \) a): Let f have summable variations. Then, so does \(f\circ \Phi \). Suppose that there exists a measure \(\pi \in {\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) such that \(\Phi \pi \in {\fancyscript{E}}_f({\fancyscript{X}},\sigma )\). By Corollary 7, we also have \(\pi \in {\fancyscript{E}}_{f\circ \Phi }({\fancyscript{X}},\sigma )\). Therefore, \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\cap {\fancyscript{E}}_{f\circ \Phi }({\fancyscript{X}},\sigma )\ne \varnothing \) and by Proposition 8, f and \(f\circ \Phi \) are physically equivalent. That is, \(\Phi \) conserves f.
\(\square \)
Theorem 7
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a surjective cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). Let \(f\in C({\fancyscript{X}})\) be an observable and \(e\in \mathbb {R}\). If \(\Phi \) conserves f, then \(\Phi \) maps \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) onto itself.
From Theorems 6 and 7 it follows that each of the (convex and compact) sets \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) and \({\fancyscript{E}}_{\langle f\rangle =e}({\fancyscript{X}},\sigma )\) contains an invariant measure for \(\Phi \), provided that \(\Phi \) conserves f. However, following the common reasoning of statistical mechanics (see the Introduction), such an invariant measure should not be considered as a macroscopic equilibrium state unless it is shift-ergodic (see Example 9 below).
In the implication (c \(\Rightarrow \) a) of Theorem 6, the set \({\fancyscript{E}}_f({\fancyscript{X}},\sigma )\) of equilibrium measures for f can be replaced by the potentially larger set \({\fancyscript{G}}_{\Delta _f}({\fancyscript{X}})\) of Gibbs measures for \(\Delta _f\).
Corollary 10
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a surjective cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\), and let \(f\in SV({\fancyscript{X}})\) be an observable with summable variations. Suppose that there is a Gibbs measure for \(\Delta _f\) whose \(\Phi \)-image is also a Gibbs measure for \(\Delta _f\). Then, \(\Phi \) conserves \(\Delta _f\).
Proof
Let \(\pi \) be a probability measure on \({\fancyscript{X}}\) such that \(\pi ,\Phi \pi \in {\fancyscript{G}}_{\Delta _f}({\fancyscript{X}})\). Let \({\fancyscript{H}}\) denote the closed convex hull of the measures \(\sigma ^k\pi \) for \(k\in \mathbb {L}\). Then, \({\fancyscript{H}}\) is a closed, convex, shift-invariant set, and therefore, contains a shift-invariant element \(\nu \). Moreover, both \({\fancyscript{H}}\) and \(\Phi {\fancyscript{H}}\) are subsets of \({\fancyscript{G}}_{\Delta _f}({\fancyscript{X}})\). In particular, \(\nu ,\Phi \nu \in {\fancyscript{G}}_{\Delta _f}({\fancyscript{X}})\). Hence, \(\nu ,\Phi \nu \in {\fancyscript{E}}_f({\fancyscript{X}},\sigma )\), and the claim follows from Theorem 6. \(\square \)
For reversible cellular automata, Corollary 9 leads to a variant of Theorem 6 concerning all (not necessarily shift-invariant) Gibbs measures.
Theorem 8
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a reversible cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\), and let \(\Delta \) be a Hamiltonian on \({\fancyscript{X}}\). The following conditions are equivalent:

(a) \(\Phi \) conserves \(\Delta \).
(b) A probability measure is in \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\) if and only if its \(\Phi \)-image is in \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\).
(c) There exists a measure in \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\) whose \(\Phi \)-image is also in \({\fancyscript{G}}_\Delta ({\fancyscript{X}})\).
Proof
If \(\Phi \) conserves \(\Delta \), we have, by definition, \(\Phi ^*\Delta =\Delta \), and Corollary 9 (and Lemma 1) imply that \(\Phi ^{-1}{\fancyscript{G}}_\Delta ({\fancyscript{X}})={\fancyscript{G}}_\Delta ({\fancyscript{X}})\). Conversely, suppose that \(\pi \) is a probability measure such that \(\pi ,\Phi \pi \in {\fancyscript{G}}_\Delta ({\fancyscript{X}})\). Then, by Corollary 9, \(\pi \in {\fancyscript{G}}_\Delta ({\fancyscript{X}})\cap {\fancyscript{G}}_{\Phi ^*\Delta }({\fancyscript{X}})\), and it follows from the definition of a Gibbs measure that \(\Phi ^*\Delta =\Delta \). That is, \(\Phi \) conserves \(\Delta \). \(\square \)
Example 9
(Q2R cellular automaton) The Q2R model discussed in the Introduction is not, strictly speaking, a cellular automaton (with the standard definition), as it involves alternate application of two maps that do not commute with the shift. Simple tricks can however be used to turn it into a standard cellular automaton (see e.g. [88], Sect. 5.2).
Let \({\fancyscript{X}}{\triangleq }\{\mathtt {+},\mathtt {-}\}^{{\mathbb {Z}}^2}\) be the space of spin configurations, and denote by \(\Phi _{\mathsf {e}}\) the mapping \({\fancyscript{X}}\rightarrow {\fancyscript{X}}\) that updates the even sites. That is,
\[ (\Phi _{\mathsf {e}}x)(i){\triangleq }\begin{cases} \overline{x(i)} &amp; \text {if } i \text { is even and } n_i^{\mathtt {+}}(x)=n_i^{\mathtt {-}}(x),\\ x(i) &amp; \text {otherwise,} \end{cases} \]
where the spin-flipping operation is denoted by overline, and \(n_i^{\mathtt {+}}(x)\) (resp., \(n_i^{\mathtt {-}}(x)\)) represents the number of sites j among the four immediate neighbors of i such that \(x(j)=\mathtt {+}\) (resp., \(x(j)=\mathtt {-}\)). Similarly, let \(\Phi _{\mathsf {o}}\) denote the mapping that updates the odd sites. The composition \(\Phi {\triangleq }\Phi _{\mathsf {o}}\Phi _{\mathsf {e}}\) commutes with the shifts \(\sigma ^k\), for k in the sublattice \((2{\mathbb {Z}})^2\), and (after a recoding) could be considered as a cellular automaton.
Let f denote the energy observable defined in Example 1. For every \(\beta >0\), the Hamiltonian \(\Delta _{\beta f}\) is conserved by \(\Phi \). Therefore, according to Theorem 8, the set \({\fancyscript{G}}_{\Delta _{\beta f}}({\fancyscript{X}})\) of Gibbs measures for \(\Delta _{\beta f}\) is invariant under \(\Phi \). In fact, in this example, it is easy to show that \(\Phi \) preserves every individual Gibbs measure in \({\fancyscript{G}}_{\Delta _{\beta f}}({\fancyscript{X}})\).
It is natural to ask whether the preservation of individual elements of \({\fancyscript{G}}_{\Delta _{\beta f}}({\fancyscript{X}})\) holds in general. This is however not the case. When \(\beta \) is large enough, it is known that \({\fancyscript{G}}_{\Delta _{\beta f}}({\fancyscript{X}})\) contains two distinct shift-ergodic measures, obtained from each other by a spin flip transformation (see Example 5). The cellular automaton \(\Phi ' x{\triangleq }\overline{\Phi x}\), which flips every spin after applying \(\Phi \), conserves \(\Delta _{\beta f}\) but does not preserve either of the two distinct shift-ergodic Gibbs measures for \(\Delta _{\beta f}\). ◯
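The conservation of energy by the Q2R dynamics in Example 9 can be observed on a finite torus. The following is a rough sketch under stated assumptions: the lattice \({\mathbb {Z}}^2\) is replaced by an n-by-n torus with n even, spins are stored as +1 and -1, and the energy is taken to be the number of anti-aligned nearest-neighbour bonds (an Ising-type stand-in for the observable of Example 1, which is not reproduced here). All function names are illustrative.

```python
import random

def neighbors(i, j, n):
    """The four nearest neighbours of site (i, j) on an n-by-n torus."""
    return [((i + 1) % n, j), ((i - 1) % n, j), (i, (j + 1) % n), (i, (j - 1) % n)]

def q2r_half_step(x, parity):
    """Flip each spin on the sublattice (i + j) % 2 == parity whose neighbours
    contain as many + spins as - spins (neighbour sum equal to 0)."""
    n = len(x)
    y = [row[:] for row in x]
    for i in range(n):
        for j in range(n):
            if (i + j) % 2 == parity and sum(x[a][b] for a, b in neighbors(i, j, n)) == 0:
                y[i][j] = -x[i][j]
    return y

def q2r_step(x):
    """Phi = Phi_o Phi_e: update the even sites first, then the odd sites."""
    return q2r_half_step(q2r_half_step(x, 0), 1)

def spin_flip(x):
    """Flip every spin of the configuration."""
    return [[-s for s in row] for row in x]

def energy(x):
    """Number of anti-aligned nearest-neighbour bonds on the torus
    (assumed Ising-type energy)."""
    n = len(x)
    return sum((x[i][j] != x[(i + 1) % n][j]) + (x[i][j] != x[i][(j + 1) % n])
               for i in range(n) for j in range(n))

rng = random.Random(0)  # fixed seed so that the run is reproducible
config = [[rng.choice((-1, 1)) for _ in range(8)] for _ in range(8)]
conserved = energy(q2r_step(config)) == energy(config)
```

A flipped spin has two aligned and two anti-aligned neighbours both before and after the flip, so the bond count is unchanged. The same holds for the map that flips every spin after applying `q2r_step`, which mirrors the automaton \(\Phi '\) above.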
Absence of Conservation Laws
In light of the above connection, every statement about conservation laws in surjective cellular automata has an interpretation in terms of invariance of Gibbs measures, and vice versa. In this section, we see an example of such reinterpretation that leads to otherwise nontrivial results. Namely, proving the absence of conservation laws in two relatively rich families of surjective and reversible cellular automata, we obtain strong constraints on the invariant measures of the cellular automata within each family. Roughly speaking, strong chaotic behavior is incompatible with the presence of conservation laws. In contrast, any surjective cellular automaton with a nontrivial equicontinuous factor has an infinity of linearly independent conservation laws (see Example 8).
We say that a dynamical system \(({\fancyscript{X}},\Phi )\) is strongly transitive if for every point \(z\in {\fancyscript{X}}\), the set \(\bigcup _{i=0}^\infty \Phi ^{-i}z\) is dense in \({\fancyscript{X}}\). Strong transitivity is stronger than transitivity (!) and weaker than minimality. A dynamical system \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) is minimal if it has no nontrivial closed subsystems, and is transitive if for every pair of nonempty open sets \(A,B\subseteq {\fancyscript{X}}\), there is an integer \(n\ge 0\) such that \(A\cap \Phi ^{-n}B\ne \varnothing \). In our setting (i.e., \({\fancyscript{X}}\) being compact), minimality is equivalent to the property that the only closed sets \(E\subseteq {\fancyscript{X}}\) with \(E\subseteq \Phi E\) are \(\varnothing \) and \({\fancyscript{X}}\), which is easily seen to imply strong transitivity. However, note that cellular automata over nontrivial strongly irreducible shifts of finite type cannot be minimal. This is because every strongly irreducible shift of finite type has configurations that are periodic in at least one direction. (More specifically, for each \(k\in \mathbb {L}\setminus \{0\}\), there is a configuration x such that \(\sigma ^{pk}x=x\) for some \(p>0\).) Transitivity is often considered as one of the main indicators of chaos (see e.g. [4, 46]). Every transitive cellular automaton is known to be sensitive to initial conditions (i.e., uniformly unstable) [13, 47].^{Footnote 5}
Example 10
(XOR cellular automata) The d-dimensional XOR cellular automaton with neighborhood \(N\subseteq {\mathbb {Z}}^d\) is defined by the map \(\Phi :\{\mathtt {0},\mathtt {1}\}^{{\mathbb {Z}}^d}\rightarrow \{\mathtt {0},\mathtt {1}\}^{{\mathbb {Z}}^d}\), where \((\Phi x)(i){\triangleq }\sum _{k\in N} x(i+k) \pmod {2}\). To avoid the trivial case, we assume that the neighborhood has at least two elements. Examples 6 and 7 were about the one-dimensional XOR cellular automaton with neighborhood \(\{0,1\}\). Figure 5a depicts a sample run of the one-dimensional model with neighborhood \(\{-1,1\}\).
The XOR cellular automaton is strongly transitive. An argument similar to that in Example 7 shows that the uniform Bernoulli measure is the only regular Gibbs measure that is invariant under an XOR cellular automaton. Note, however, that there are many other (non-Gibbs) invariant measures. For example, the Dirac measure concentrated at the uniform configuration with \(\mathtt {0}\) everywhere is invariant. So is the (atomic) measure uniformly distributed over any jointly periodic orbit (i.e., a finite orbit of \((\sigma ,\Phi )\)).
In fact, much more is known about the invariant measures of the XOR cellular automata, with a strong indication that the uniform Bernoulli measure is the only “state of macroscopic equilibrium”. For instance, the uniform Bernoulli measure on \(\{0,1\}^{\mathbb {Z}}\) is known to be the only shift-ergodic probability measure that is invariant and of positive entropy for the XOR cellular automaton with neighborhood \(\{0,1\}\) [34]. Another such result states that the only measures that are strongly mixing for the shift and invariant under the XOR cellular automaton with neighborhood \(\{-1,1\}\) are the uniform Bernoulli measure and the Dirac measure concentrated at the uniform configuration with \(\mathtt {0}\)s everywhere [58]. (Note that the one-dimensional Gibbs measures are all strongly mixing.) Similar results have been obtained for broad classes of cellular automata with algebraic structure (e.g. [69, 77, 81]). See [70] for a survey. ◯
The following theorem is a slight generalization of Theorem 5 in [25].
Theorem 9
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a strongly transitive cellular automaton over a shift of finite type \(({\fancyscript{X}},\sigma )\). Then, \(\Phi \) does not conserve any nontrivial Hamiltonian.
Proof
Let \(0\in M\subseteq \mathbb {L}\) be a finite window that witnesses the finite type gluing property of \({\fancyscript{X}}\).
Let \(\Delta \) be a nontrivial Hamiltonian on \({\fancyscript{X}}\), and suppose there exist two asymptotic configurations u and v such that \(\varepsilon {\triangleq }\Delta (u,v)>0\). By the continuity property of \(\Delta \), there is a finite set \(D\supseteq M(M^{-1}({\mathrm {diff}}(u,v)))\) such that for every two asymptotic configurations \(u'\in [u]_D\) and \(v'\in [v]_D\) with \({\mathrm {diff}}(u',v')={\mathrm {diff}}(u,v)\), \(\Delta (u',v')\ge \varepsilon /2>0\).
Let z be a ground configuration for \(\Delta \) (see Proposition 2). Since \(\Phi \) is strongly transitive, there is a configuration \(x\in [v]_D\) and a time \(t\ge 0\) such that \(\Phi ^t x=z\). Construct a configuration \(y\in {\fancyscript{X}}\) that agrees with u on D and with x outside \({\mathrm {diff}}(u,v)\). In particular, \(y\in [u]_D\). Then, \(\Delta (y,x)\ge \varepsilon /2\), whereas \(\Delta (\Phi ^t y,\Phi ^t x)\le 0\). Therefore, \(\Delta \) is not conserved by \(\Phi \). \(\square \)
Corollary 11
Let \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a strongly transitive cellular automaton over a strongly irreducible shift of finite type \(({\fancyscript{X}},\sigma )\). Then, \(\Phi \) does not preserve any regular Gibbs measure other than the Gibbs measures for the trivial Hamiltonian.
A special case of the above corollary (for the permutive cellular automata and Bernoulli measures) is also proved in [2] (Corollary 3.6). Let us recall that the shift-invariant Gibbs measures for the trivial Hamiltonian on \({\fancyscript{X}}\) coincide with the measures of maximum entropy for \(({\fancyscript{X}},\sigma )\) (Theorem 1). Therefore, according to Corollary 11, if \(\Phi \) is strongly transitive, the measures of maximum entropy for \(({\fancyscript{X}},\sigma )\) are the only candidates for Gibbs measures that are invariant under both \(\sigma \) and \(\Phi \). Since the set of measures with maximum entropy for \(({\fancyscript{X}},\sigma )\) is closed and convex, and is preserved under \(\Phi \), it follows that at least one measure with maximum entropy is invariant under \(\Phi \). However, this measure does not need to be ergodic for the shift.
Next, we are going to introduce a class of one-dimensional reversible cellular automata with no local conservation law. The proof will be via reduction to Theorem 9. Note that reversible cellular automata over nontrivial strongly irreducible shifts of finite type cannot be strongly transitive: the inverse of a strongly transitive system is minimal, and as mentioned above, cellular automata over nontrivial strongly irreducible shifts of finite type cannot be minimal.
Example 11
(Transpose of XOR) Figure 5b depicts a sample space-time diagram of the reversible cellular automaton \(\Phi \) on \((\{\mathtt {0},\mathtt {1}\}\times \{\mathtt {0},\mathtt {1}\})^{\mathbb {Z}}\) with neighbourhood \(\{0,1\}\) and local rule \(((a,b),(c,d))\mapsto (b,a+d)\), where the addition is modulo 2. Observe that rotating a space-time diagram of \(\Phi \) by 90 degrees, we obtain what is essentially a space-time diagram of the XOR cellular automaton with neighbourhood \(\{-1,1\}\) (see Fig. 5a and Example 10).
As in Example 3 of [25], it is possible to show that \(\Phi \) has no nontrivial finite-range conservation law. Below, we shall present an alternative proof (using its connection with the XOR cellular automaton) that covers a large class of similar reversible cellular automata. ◯
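The relation between the automaton of Example 11 and the XOR cellular automaton with neighbourhood \(\{-1,1\}\) can be checked on a finite ring. The sketch below is illustrative and rests on assumptions: configurations are spatially periodic and stored as lists of pairs, the inverse rule \(((a,b),(c,d))\mapsto (b+c,a)\) is derived here by solving the local rule and is not quoted from the source, and the helper names are hypothetical.

```python
def step(x):
    """Local rule ((a, b), (c, d)) -> (b, a + d mod 2) with neighbourhood {0, 1},
    applied to one period of a spatially periodic configuration."""
    n = len(x)
    return [(x[i][1], (x[i][0] + x[(i + 1) % n][1]) % 2) for i in range(n)]

def inverse_step(y):
    """Candidate inverse rule ((a, b), (c, d)) -> (b + c mod 2, a)."""
    n = len(y)
    return [((y[i][1] + y[(i + 1) % n][0]) % 2, y[i][0]) for i in range(n)]

def second_track(x, steps):
    """Rows t = 0, ..., steps of the second components along the orbit of x."""
    rows = []
    for _ in range(steps + 1):
        rows.append([cell[1] for cell in x])
        x = step(x)
    return rows

def is_transposed_xor(rows):
    """Check rows[t][i+1] == rows[t-1][i] + rows[t+1][i] (mod 2): reading the
    diagram column by column gives the XOR rule with neighbourhood {-1, 1}."""
    n = len(rows[0])
    return all(rows[t][(i + 1) % n] == (rows[t - 1][i] + rows[t + 1][i]) % 2
               for t in range(1, len(rows) - 1) for i in range(n))
```

In this reading, column i+1 of the space-time diagram is obtained from column i by adding the entries one time step earlier and one time step later, so space and time have exchanged roles.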
We shall say that two surjective one-dimensional cellular automata are transpose of each other if the bi-infinite space-time diagrams of each are obtained (up to a conjugacy) from the bi-infinite space-time diagrams of the other by swapping the role of space and time. To be more specific, let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a surjective cellular automaton on a one-dimensional mixing shift space of finite type \({\fancyscript{X}}\subseteq S^{\mathbb {Z}}\). Define the continuous map \(\Theta :S^{{\mathbb {Z}}\times {\mathbb {Z}}}\rightarrow S^{\mathbb {Z}}\) where \((\Theta z)(i){\triangleq }z(i,0)\), and let \(\tilde{{\fancyscript{X}}}\) be the two-dimensional shift space formed by all configurations \(z\in S^{{\mathbb {Z}}\times {\mathbb {Z}}}\) such that
\[ \ldots ,\ \Theta \sigma ^{(0,-1)} z,\ \Theta z,\ \Theta \sigma ^{(0,1)} z,\ \ldots \]
is a bi-infinite orbit of \(\Phi \), that is \(\Theta \sigma ^{(0,k+1)} z = \Phi \Theta \sigma ^{(0,k)} z\) for each \(k\in {\mathbb {Z}}\). Set \(\mathsf {V}{\triangleq }(0,1)\) and \(\mathsf {H}{\triangleq }(1,0)\). The dynamical system \((\tilde{{\fancyscript{X}}},\sigma ^\mathsf {V},\sigma ^\mathsf {H})\) (together with the map \(\Theta \)) is the natural extension of \(({\fancyscript{X}},\Phi ,\sigma )\). Now, let \(\Psi :{\fancyscript{Y}}\rightarrow {\fancyscript{Y}}\) be another surjective cellular automaton on a one-dimensional mixing shift space of finite type \({\fancyscript{Y}}\subseteq T^{\mathbb {Z}}\). We say \(\Psi \) is a transpose of \(\Phi \) if its natural extension is conjugate to \((\tilde{{\fancyscript{X}}},\sigma ^\mathsf {H},\sigma ^\mathsf {V})\). The transpose of \(\Phi \) (if it exists) is unique only up to conjugacy. When there is no danger of confusion, we denote any representative of the transpose conjugacy class by \(\Phi ^\intercal \).
Proposition 12
A surjective cellular automaton on a one-dimensional mixing shift of finite type is mixing provided it has a transpose (acting on a mixing shift of finite type).
Proof
A dynamical system is mixing if and only if its natural extension is mixing. \(\square \)
Obviously, not every cellular automaton has a transpose. A class of cellular automata that do have transposes is the class of those that are positively expansive. A dynamical system \(({\fancyscript{X}},\Phi )\) is positively expansive if there exists a real number \(\varepsilon >0\) such that for every two distinct points \(x,y\in {\fancyscript{X}}\), there is a time \(t\ge 0\) such that \(\Phi ^t x\) and \(\Phi ^t y\) have distance at least \(\varepsilon \). If \(({\fancyscript{X}},\sigma )\) is a mixing shift of finite type and \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) is a positively expansive cellular automaton, then \(\Phi \) is surjective, and it is known that a transpose of \(\Phi \) exists and is a reversible cellular automaton on a mixing shift of finite type (see [48], Sect. 5.5^{Footnote 6}). If, furthermore, \(({\fancyscript{X}},\sigma )\) is a full shift, then the transpose of \(\Phi \) also acts on a full shift (see [65], Theorem 3.12).
Proposition 13
Every positively expansive cellular automaton on a one-dimensional mixing shift of finite type is strongly transitive.
Proof
Any continuous map \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) on a compact metric space that is transitive, open, and positively expansive is strongly transitive [38]. Every positively expansive cellular automaton on a mixing shift of finite type is itself mixing (see the above paragraph) and open (see [48], Theorem 5.45).
Alternatively, every positively expansive cellular automaton on a mixing shift of finite type is conjugate to a mixing one-sided shift of finite type (see [48], Theorem 5.49), and hence strongly transitive. \(\square \)
The local conservation laws of a cellular automaton and its transpose are in one-to-one correspondence.
Theorem 10
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) and \(\Phi ^\intercal :{\fancyscript{X}}^\intercal \rightarrow {\fancyscript{X}}^\intercal \) be surjective cellular automata over one-dimensional mixing shifts of finite type \({\fancyscript{X}}\) and \({\fancyscript{X}}^\intercal \), and suppose that \(\Phi \) and \(\Phi ^\intercal \) are transposes of each other. There is a one-to-one correspondence (up to local physical equivalence) between the observables \(f\in K({\fancyscript{X}})\) that are locally conserved by \(\Phi \) and the observables \(f^\intercal \in K({\fancyscript{X}}^\intercal )\) that are locally conserved by \(\Phi ^\intercal \). Moreover, f is locally physically equivalent to 0 if and only if \(f^\intercal \) is so.
Proof
Recall that an observable \(f\in K({\fancyscript{X}})\) is locally conserved by \(\Phi \) if and only if it satisfies the continuity equation
\[ f\circ \Phi = f + g\circ \sigma - g \]
for some observable \(g\in K({\fancyscript{X}})\), where the terms \(g\circ \sigma \) and g are interpreted, respectively, as the flow pouring into a site from its right neighbour and the flow leaving that site towards its left neighbour. This equation may alternatively be written as
\[ g\circ \sigma = g + f\circ \Phi - f , \]
which can be interpreted as the local conservation of the observable g when the roles of \(\Phi \) and \(\sigma \) are exchanged.
To specify the correspondence between local conservation laws of \(\Phi \) and \(\Phi ^\intercal \) more precisely, let \(\tilde{{\fancyscript{X}}}\) be the shift space of the spacetime diagrams of \(\Phi \), so that \((\tilde{{\fancyscript{X}}},\sigma ^\mathsf {V},\sigma ^\mathsf {H})\) is the natural extension of \(({\fancyscript{X}},\Phi ,\sigma )\), and \((\tilde{{\fancyscript{X}}},\sigma ^\mathsf {H},\sigma ^\mathsf {V})\) is the natural extension of \(({\fancyscript{X}}^\intercal ,\Phi ^\intercal ,\sigma )\), and let \(\Theta :\tilde{{\fancyscript{X}}}\rightarrow {\fancyscript{X}}\) and \(\Theta ^\intercal :\tilde{{\fancyscript{X}}}\rightarrow {\fancyscript{X}}^\intercal \) be, respectively, the corresponding factor maps, extracting (up to a conjugacy) the 0th row and the 0th column of \(\tilde{{\fancyscript{X}}}\).
Let us use the following notation. Suppose that local observables \(f,g\in K({\fancyscript{X}})\) and \(f^\intercal ,g^\intercal \in K({\fancyscript{X}}^\intercal )\) are such that \(f\circ \Theta = g^\intercal \circ \Theta ^\intercal \) and \(f^\intercal \circ \Theta ^\intercal =g\circ \Theta \), and that, setting \(\tilde{f}_\mathsf {V}{\triangleq }f\circ \Theta = g^\intercal \circ \Theta ^\intercal \) and \(\tilde{f}_\mathsf {H}{\triangleq }f^\intercal \circ \Theta ^\intercal =g\circ \Theta \), it holds
\[ \tilde{f}_\mathsf {V}\circ \sigma ^\mathsf {V} = \tilde{f}_\mathsf {V} + \tilde{f}_\mathsf {H}\circ \sigma ^\mathsf {H} - \tilde{f}_\mathsf {H} . \]
Then, we write \(f_1\perp f^\intercal _1\) for any two local observables \(f_1\in K({\fancyscript{X}})\) and \(f^\intercal _1\in K({\fancyscript{X}}^\intercal )\) that are locally physically equivalent to f and \(f^\intercal \), respectively.
We verify that

(i)
a local observable \(f\in K({\fancyscript{X}})\) is locally conserved by \(\Phi \) if and only if \(f\perp f^\intercal \) for some local observable \(f^\intercal \in K({\fancyscript{X}}^\intercal )\),

(ii)
the relation \(\perp \) is linear, and

(iii)
\(f\perp 0\) if and only if f is locally physically equivalent to 0.
Note that these three statements (along with the similar statements obtained by swapping f and \(f^\intercal \)) would imply that \(\perp \) is a one-to-one correspondence with the desired properties.
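To prove the first statement, suppose that \(f\perp f^\intercal \). Then,
\[ \tilde{f}_\mathsf {V}\circ \sigma ^\mathsf {V} = \tilde{f}_\mathsf {V} + \tilde{f}_\mathsf {H}\circ \sigma ^\mathsf {H} - \tilde{f}_\mathsf {H} , \]
where \(\tilde{f}_\mathsf {V}= f\circ \Theta \) and \(\tilde{f}_\mathsf {H}=g\circ \Theta \) for some \(g\in K({\fancyscript{X}})\). Rewriting this equation as
\[ (f\circ \Phi )\circ \Theta = (f + g\circ \sigma - g)\circ \Theta , \]
we obtain, using Lemma 1 and the surjectivity of \(\Theta \), that f is locally conserved by \(\Phi \). Conversely, suppose that \(\Phi \) locally conserves f, and let \(g\in K({\fancyscript{X}})\) be such that \(f\circ \Phi - f=g\circ \sigma - g\). Therefore,
\[ f\circ \Theta \circ \sigma ^\mathsf {V} - f\circ \Theta = g\circ \Theta \circ \sigma ^\mathsf {H} - g\circ \Theta . \]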
Since \(f\circ \Theta \) and \(g\circ \Theta \) are local observables, there exists a finite region \(D\subseteq {\mathbb {Z}}\times {\mathbb {Z}}\) such that \(f\circ \Theta ,g\circ \Theta \in K_D(\tilde{{\fancyscript{X}}})\). By the definition of natural extension, there is an integer \(k>0\) such that \(z|_{D}\) is uniquely and continuously determined by \(\Theta ^\intercal \sigma ^{-k\mathsf {H}}z\) (i.e., column \(-k\) of the spacetime shift of \(\Phi \)). Hence, there exist observables \(f^\intercal ,g^\intercal \in K({\fancyscript{X}}^\intercal )\) such that \(f\circ \Theta \circ \sigma ^{k\mathsf {H}}=g^\intercal \circ \Theta ^\intercal \) and \(g\circ \Theta \circ \sigma ^{k\mathsf {H}}=f^\intercal \circ \Theta ^\intercal \). Now, setting \(\tilde{f}_\mathsf {V}{\triangleq }f\circ \sigma ^k\circ \Theta =g^\intercal \circ \Theta ^\intercal \) and \(\tilde{f}_\mathsf {H}{\triangleq }g\circ \sigma ^k\circ \Theta =f^\intercal \circ \Theta ^\intercal \), we can write
\[ \tilde{f}_\mathsf {V}\circ \sigma ^\mathsf {V} = \tilde{f}_\mathsf {V} + \tilde{f}_\mathsf {H}\circ \sigma ^\mathsf {H} - \tilde{f}_\mathsf {H} , \]
which means \(f\circ \sigma ^k \perp f^\intercal \). Finally, note that f and \(f\circ \sigma ^k\) are locally physically equivalent.
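The linearity of \(\perp \) and the fact that \(0\perp 0\) are clear. It remains to show that if \(f_1\in K({\fancyscript{X}})\) is a local observable such that \(f_1\perp 0\), then \(f_1\) is locally physically equivalent to 0. Suppose that \(f_1\perp 0\). Then, there is an observable \(f\in K({\fancyscript{X}})\) locally physically equivalent to \(f_1\), and an observable \(f^\intercal \in K({\fancyscript{X}}^\intercal )\) locally physically equivalent to 0 such that
\[ f\circ \Theta \circ \sigma ^\mathsf {V} = f\circ \Theta + f^\intercal \circ \Theta ^\intercal \circ \sigma ^\mathsf {H} - f^\intercal \circ \Theta ^\intercal . \]
Since \(f^\intercal \) is locally physically equivalent to 0, it has the form \(f^\intercal =h^\intercal \circ \sigma - h^\intercal + c\) for some observable \(h^\intercal \in K({\fancyscript{X}}^\intercal )\) and some constant \(c\in \mathbb {R}\). Therefore,
\[ f\circ \Theta \circ \sigma ^\mathsf {V} - f\circ \Theta = (h^\intercal \circ \Theta ^\intercal \circ \sigma ^\mathsf {H} - h^\intercal \circ \Theta ^\intercal )\circ \sigma ^\mathsf {V} - (h^\intercal \circ \Theta ^\intercal \circ \sigma ^\mathsf {H} - h^\intercal \circ \Theta ^\intercal ) . \tag{63} \]
Since \(h^\intercal \) is a local observable, we can find, as before, an integer \(l>0\) and a local observable \(h\in K({\fancyscript{X}})\) such that \(h^\intercal \circ \sigma ^l\circ \Theta ^\intercal = h\circ \Theta \). Therefore, composing both sides of (63) with \(\sigma ^{l\mathsf {V}}\) leads to
\[ f\circ \Phi ^{l+1}\circ \Theta - f\circ \Phi ^l\circ \Theta = (h\circ \sigma - h)\circ \Phi \circ \Theta - (h\circ \sigma - h)\circ \Theta , \]
which, together with Lemma 1, gives
\[ f\circ \Phi ^{l+1} - f\circ \Phi ^l = (h\circ \sigma - h)\circ \Phi - (h\circ \sigma - h) . \]
The latter equation can be rewritten as
\[ (f\circ \Phi ^l - h\circ \sigma + h)\circ \Phi = f\circ \Phi ^l - h\circ \sigma + h , \]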
which says that \(f\circ \Phi ^l - h\circ \sigma + h\) is invariant under \(\Phi \). On the other hand, since \(({\fancyscript{X}}^\intercal ,\sigma )\) is a mixing shift, it follows from Proposition 12 that \(({\fancyscript{X}},\Phi )\) is also mixing. As a consequence, every continuous observable that is invariant under \(\Phi \) is constant. In particular, \(f\circ \Phi ^l - h\circ \sigma + h=c'\) for some constant \(c'\in \mathbb {R}\), which means \(f\circ \Phi ^l\) is locally physically equivalent to 0. Since f is locally conserved by \(\Phi \), the observable \(f\circ \Phi ^l\) is also locally physically equivalent to f, and this completes the proof. \(\square \)
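The continuity equation \(f\circ \Phi - f = g\circ \sigma - g\) used in the above proof can be checked mechanically on small examples. The following Python sketch does so for the elementary "traffic" rule 184 with \(f(x)=x_0\). The automaton, the flow `g`, and all function names are our own illustrative choices and do not appear in the text; rule 184 is used only to illustrate the equation, with no claim that it satisfies the hypotheses of Theorem 10. Configurations are represented by one period of a spatially periodic configuration.

```python
from itertools import product

def rule184(x):
    """One step of elementary rule 184 on a cyclic configuration (tuple of 0/1)."""
    n = len(x)
    return tuple((184 >> (4 * x[i - 1] + 2 * x[i] + x[(i + 1) % n])) & 1
                 for i in range(n))

def shift(x):
    """The shift sigma, with (sigma x)_i = x_{i+1}."""
    return x[1:] + x[:1]

def f(x):
    """Observable f: is there a particle at site 0?"""
    return x[0]

def g(x):
    """Assumed flow: minus the indicator that a particle hops from site -1 to site 0."""
    return -(x[-1] & (1 - x[0]))

def continuity_holds(n):
    """Check f(Phi x) - f(x) == g(sigma x) - g(x) on all cyclic configurations of length n."""
    return all(f(rule184(x)) - f(x) == g(shift(x)) - g(x)
               for x in product((0, 1), repeat=n))
```

Since both sides commute with the shift, checking the identity at site 0 on every periodic configuration covers all sites of those configurations.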
Corollary 12
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a reversible cellular automaton on a one-dimensional mixing shift of finite type \(({\fancyscript{X}},\sigma )\), and suppose that \(\Phi \) has a positively expansive transpose. Then, \(\Phi \) has no nontrivial local conservation law.
As mentioned in Sect. 4.1, for cellular automata on full shifts, every conserved local observable is locally conserved.
Corollary 13
Let \(\Phi {:}\,{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) be a reversible cellular automaton on a one-dimensional full shift \(({\fancyscript{X}},\sigma )\), and suppose that \(\Phi \) has a positively expansive transpose. The uniform Bernoulli measure is the only finite-range Gibbs measure (\(\equiv \) full-support Markov measure) that is invariant under \(\Phi \).
Example 12
(Non-additive positively expansive) Let \(\Phi :\{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\rightarrow \{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) be the cellular automaton defined by neighborhood \(N{\triangleq }\{-1,0,1\}\) and local update rule \(\varphi :\{\mathtt {0},\mathtt {1},\mathtt {2}\}^N\rightarrow \{\mathtt {0},\mathtt {1},\mathtt {2}\}\) defined by
See Fig. 6a for a sample run. Note that the local rule is both left- and right-permutive (i.e., \(a\mapsto \varphi (a,b,c)\) and \(c\mapsto \varphi (a,b,c)\) are permutations). It follows that \(\Phi \) is positively expansive, and hence also strongly transitive. Therefore, according to Theorem 9 and Corollary 11, \(\Phi \) has no nontrivial conservation law and the uniform Bernoulli measure is the only regular Gibbs measure on \(\{\mathtt {0},\mathtt {1},\mathtt {2}\}^{\mathbb {Z}}\) that is invariant under \(\Phi \).
The cellular automaton \(\Phi \) has a transpose \(\Phi ^\intercal :(\{\mathtt {0},\mathtt {1},\mathtt {2}\}\times \{\mathtt {0},\mathtt {1},\mathtt {2}\})^{\mathbb {Z}}\rightarrow (\{\mathtt {0},\mathtt {1},\mathtt {2}\}\times \{\mathtt {0},\mathtt {1},\mathtt {2}\})^{\mathbb {Z}}\) defined with neighborhood \(\{0,1\}\) and local rule
where the subtractions are modulo 3 (see Fig. 6b). This is a reversible cellular automaton. It follows from Corollaries 12 and 13 that \(\Phi ^\intercal \) has no nontrivial local conservation law and no invariant full-support Markov measure other than the uniform Bernoulli measure. ◯
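Left- and right-permutivity of a local rule on the neighborhood \(\{-1,0,1\}\) is a finite condition and can be verified by enumeration. The sketch below assumes a hypothetical rule `phi_demo` on the alphabet \(\{0,1,2\}\); it is not the rule of Example 12, and is chosen only because it is non-additive and permutive in both outer arguments.

```python
from itertools import product

def is_left_permutive(phi, q):
    """True if a -> phi(a, b, c) is a permutation of range(q) for every (b, c)."""
    return all(sorted(phi(a, b, c) for a in range(q)) == list(range(q))
               for b, c in product(range(q), repeat=2))

def is_right_permutive(phi, q):
    """True if c -> phi(a, b, c) is a permutation of range(q) for every (a, b)."""
    return all(sorted(phi(a, b, c) for c in range(q)) == list(range(q))
               for a, b in product(range(q), repeat=2))

def phi_demo(a, b, c):
    """Hypothetical local rule (an assumption, not taken from the text)."""
    return (a + c + b * b) % 3

def phi_product(a, b, c):
    """A rule that is not permutive: a = 0 forces the output 0."""
    return (a * c) % 3
```

The rule `phi_demo` is not additive, since the dependence on the middle argument is quadratic, yet it passes both checks; `phi_product` fails them.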
According to Corollary 11, if \(\Phi \) is a strongly transitive cellular automaton on a full shift \({\fancyscript{X}}\), the uniform Bernoulli measure on \({\fancyscript{X}}\) is the only regular Gibbs measure that is preserved by \(\Phi \). Likewise, Corollary 13 states that for a class of one-dimensional reversible cellular automata, the uniform Bernoulli measure is the only invariant full-support Markov measure. Note that even with these constraints, a cellular automaton in either of these two classes still has a large collection of other invariant measures. For example, for every d linearly independent vectors \(k_1,k_2,\ldots ,k_d\in {\mathbb {Z}}^d\), the set of d-dimensional spatially periodic configurations having \(k_i\) as periods (i.e., \(\{x: \sigma ^{k_i} x=x \text { for} \, i=1,2,\ldots ,d\}\)) is finite and invariant under any cellular automaton, and therefore any cellular automaton has an (atomic) invariant measure supported on such a set. Nevertheless, if we restrict our attention to sufficiently “smooth” measures, the uniform Bernoulli measure becomes the “unique” invariant measure for a cellular automaton in either of the above classes.^{Footnote 7} In this sense, Corollaries 11 and 13 may be interpreted as weak indications of “absence of phase transition” for cellular automata in the two classes in question.
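The atomic invariant measures mentioned above can be exhibited concretely in one dimension: iterating a cellular automaton on a configuration of spatial period \(n\) must eventually enter a cycle, and the uniform measure on that cycle is invariant under the automaton. A minimal sketch, assuming the XOR automaton with neighborhood \(\{0,1\}\) as the running example (the function names are ours):

```python
def xor_step(x):
    """One step of the XOR automaton on one period of a periodic configuration."""
    n = len(x)
    return tuple(x[i] ^ x[(i + 1) % n] for i in range(n))

def eventual_cycle(step, x):
    """Iterate `step` from x until a configuration repeats; return the cycle entered."""
    seen = {}
    orbit = []
    while x not in seen:
        seen[x] = len(orbit)
        orbit.append(x)
        x = step(x)
    return orbit[seen[x]:]

def is_invariant_support(step, cycle):
    """The uniform measure on `cycle` is invariant iff `step` permutes the cycle."""
    return sorted(step(x) for x in cycle) == sorted(cycle)
```

Such a measure is invariant under the automaton but need not be shift-invariant; averaging it over the finitely many translates gives a measure invariant under both.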
Question 4
Let \(({\fancyscript{X}},\sigma )\) be a strongly irreducible shift of finite type. Which shift-ergodic measures can be invariant under a strongly transitive cellular automaton? Can a shift-ergodic measure with positive but submaximum entropy on \(({\fancyscript{X}},\sigma )\) be invariant under a strongly transitive cellular automaton?
Randomization and Approach to Equilibrium
This section contains a few remarks and open questions regarding the problem of approach to equilibrium in surjective cellular automata.
Example 13
(Randomization in XOR cellular automata) The XOR cellular automata (Examples 10, 6 and 7) exhibit the same kind of “approach to equilibrium” as observed in the Q2R model (see the Introduction). Starting from a biased Bernoulli random configuration, the system quickly reaches a uniformly random state, where it remains (see Fig. 7). A mathematical explanation of this behavior was first found independently by Miyamoto [58] and Lind [50] (following Wolfram [93]) and has since been extended and strengthened by others.
Let \({\fancyscript{X}}{\triangleq }\{\mathtt {0},\mathtt {1}\}^{\mathbb {Z}}\), and consider the XOR cellular automaton \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) with neighborhood \(\{0,1\}\). If \(\pi \) is a shift-invariant probability measure on \({\fancyscript{X}}\), the convergence of \(\Phi ^t\pi \) as \(t\rightarrow \infty \) fails as long as \(\pi \) is strongly mixing and different from the uniform Bernoulli measure and the Dirac measures concentrated at one of the two uniform configurations [58, 59]. However, if \(\pi \) is a non-degenerate Bernoulli measure, the convergence holds if a negligible set of time steps is ignored. More precisely, there is a set \(J\subseteq \mathbb {N}\) of density 1 such that for every non-degenerate Bernoulli measure \(\pi \), the sequence \(\{\Phi ^t\pi \}_{t\in J}\) converges, as \(t\rightarrow \infty \), to the uniform Bernoulli measure \(\mu \). In particular, we have the convergence of the Cesàro averages
\[ \frac{1}{n}\sum _{t=0}^{n-1}\Phi ^t\pi \rightarrow \mu \]
as \(n\rightarrow \infty \) [50, 58].
The same type of convergence holds as long as \(\pi \) is harmonically mixing [71]. Similar results have been obtained for a wide range of algebraic cellular automata (see e.g. [8, 22, 34, 52, 71–73, 81]). In particular, the reversible cellular automaton of Example 11 has been shown to have the same randomizing effect [53]. See [70] for a survey.
It is also worth mentioning a similar result due to Johnson and Rudolph [37] regarding maps of the unit circle \(\mathbb {T}{\triangleq }\mathbb {R}/{\mathbb {Z}}\). Namely, let \(\pi \) be a Borel measure on \(\mathbb {T}\). They showed that if \(\pi \) is invariant, ergodic and of positive entropy for the map \(3\times : x\mapsto 3x \pmod {1}\), then it is randomized by the map \(2\times : x\mapsto 2x \pmod {1}\), in the sense that \((2\times )^t\pi \) converges to the Lebesgue measure along a subsequence \(J\subseteq \mathbb {N}\) of density 1. ◯
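For the XOR cellular automaton with neighborhood \(\{0,1\}\), the one-site marginal of \(\Phi ^t\pi \) can be computed exactly when \(\pi \) is a Bernoulli measure with parameter \(p\), which makes both the convergence along a set of density 1 and the failure of plain convergence visible. The sketch below relies on two standard facts that are not spelled out in the text: \((\Phi ^t x)_0\) is the sum modulo 2 of those \(x_i\) for which the binomial coefficient \(\binom{t}{i}\) is odd, and the XOR of \(m\) independent bits with parameter \(p\) equals 1 with probability \((1-(1-2p)^m)/2\). It only tracks a single site, not the full measure, and the function names are ours.

```python
from math import comb

def odd_binomials(t):
    """Number of i in 0..t with C(t, i) odd, i.e. the number of cells XORed by Phi^t."""
    return sum(comb(t, i) & 1 for i in range(t + 1))

def prob_one(p, t):
    """P[(Phi^t x)_0 = 1] when x is distributed according to Bernoulli(p)."""
    m = odd_binomials(t)
    return (1 - (1 - 2 * p) ** m) / 2

def xor_step(x):
    """One step of the XOR automaton on a cyclic configuration (used as a sanity check)."""
    n = len(x)
    return tuple(x[i] ^ x[(i + 1) % n] for i in range(n))
```

At every time \(t=2^k\) only two cells contribute, so the marginal equals \(2p(1-p)\) and stays away from \(1/2\) when \(p\ne 1/2\); for times whose binary expansion has many ones, the marginal is close to \(1/2\).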
Randomization behavior similar to that in the XOR cellular automaton has been observed in simulations of other (nonadditive) cellular automata, but the mathematical results are so far limited to algebraic cellular automata. The uniform Bernoulli measure is the unique measure with maximum entropy on the full shift \(({\fancyscript{X}},\sigma )\) (i.e., the “state of maximum randomness”). The convergence (in density) of \(\Phi ^t\pi \) to the uniform Bernoulli measure may thus be interpreted as a manifestation of the second law of thermodynamics [70].
We say that a cellular automaton \(\Phi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) (asymptotically) randomizes a probability measure \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\), if there is a set \(J\subseteq \mathbb {N}\) of density 1 such that the weak limit
\[ \Phi ^\infty \pi \triangleq \lim _{J\ni t\rightarrow \infty }\Phi ^t\pi \]
exists and is a shift-invariant measure with maximum entropy, that is, \(h_{\Phi ^\infty \pi }({\fancyscript{X}},\sigma )=h({\fancyscript{X}},\sigma )\). The density of a set \(J\subseteq \mathbb {N}\) is defined as
\[ d(J) \triangleq \lim _{n\rightarrow \infty }\frac{|J\cap \{0,1,\ldots ,n-1\}|}{n} . \]
Note that the limit measure \(\Phi ^\infty \pi \) must be invariant under \(\Phi \), even if \(({\fancyscript{X}},\sigma )\) has multiple measures with maximum entropy. If \(\Phi \) randomizes a measure \(\pi \), the Cesàro averages \((\sum _{t=0}^{n-1} \Phi ^t\pi )/n\) will also converge to \(\Phi ^\infty \pi \). The converse is also true as long as \(\pi \) is shift-invariant and the limit measure is shift-ergodic:
Lemma 4
(see [37], Corollary 1.4) Let \({\fancyscript{X}}\) be a compact metric space and \({\fancyscript{Q}}\subseteq {\fancyscript{P}}({\fancyscript{X}})\) a closed and convex set of probability measures on \({\fancyscript{X}}\). Let \(\pi _1,\pi _2,\ldots \) be a sequence of elements in \({\fancyscript{Q}}\) whose Cesàro averages \((\sum _{i=0}^{n-1} \pi _i)/n\) converge to a measure \(\mu \) as \(n\rightarrow \infty \). If \(\mu \) is extremal in \({\fancyscript{Q}}\), then there is a set \(J\subseteq \mathbb {N}\) of density 1 such that \(\pi _i\rightarrow \mu \) as \(J\ni i\rightarrow \infty \).
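A toy instance of Lemma 4 may help fix ideas. Take \({\fancyscript{X}}\) to be a two-point space \(\{0,1\}\), so that a probability measure is determined by the mass \(a\in [0,1]\) it gives to the point 1, and the extremal measures correspond to \(a=0\) and \(a=1\). The sequences below are our own illustrative choices: `a` puts all mass on the point 1 at times that are powers of two and on the point 0 otherwise, while `b` alternates between the two point masses.

```python
def is_power_of_two(t):
    """True for t = 1, 2, 4, 8, ..."""
    return t > 0 and t & (t - 1) == 0

def a(t):
    """Mass on the point 1 at time t: 1 at powers of two, 0 otherwise."""
    return 1.0 if is_power_of_two(t) else 0.0

def b(t):
    """Mass on the point 1 at time t for the alternating sequence."""
    return float(t % 2)

def cesaro(seq, n):
    """Cesàro average of seq(0), ..., seq(n-1)."""
    return sum(seq(t) for t in range(n)) / n

def density(pred, n):
    """Fraction of t in {0, ..., n-1} satisfying pred (finite-n proxy for the density)."""
    return sum(1 for t in range(n) if pred(t)) / n
```

The Cesàro averages of `a` tend to 0, an extremal point, and `a` indeed converges to 0 along the density-1 set of non-powers of two, although it does not converge as a whole. The Cesàro averages of `b` tend to 1/2, which is not extremal, and `b` converges along no set of density 1, so the extremality assumption cannot be dropped.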
As mentioned in Example 13, the stronger notion of randomization fails for the XOR cellular automaton. We say that a cellular automaton \(\Phi \) strongly randomizes a measure \(\pi \) if \(\Phi ^t\pi \) converges to a measure with maximum entropy.
Question 5
Are there examples of surjective or reversible cellular automata that strongly randomize all (say) Bernoulli measures? Is there a generic obstacle against strong randomization in surjective or reversible cellular automata?
If the cellular automaton \(\Phi \) has nontrivial conservation laws, the orbit of a measure \(\pi \) will be entirely on the same “energy level”. Nevertheless, we could expect \(\pi \) to be randomized within its energy level. To evade an abundance of invariant measures, let us assume that \(\Phi \) has only finitely many linearly independent conservation laws. More precisely, let \(F=\{f_1,f_2,\ldots ,f_n\}\subseteq C({\fancyscript{X}})\) be a collection of observables conserved by \(\Phi \) such that every observable \(g\in C({\fancyscript{X}})\) conserved by \(\Phi \) is physically equivalent to an element of the linear span of F. The measures \(\Phi ^t\pi \) as well as their accumulation points are confined in the closed convex set
\[ \{\nu \in {\fancyscript{P}}({\fancyscript{X}}): \nu (f_i)=\pi (f_i) \text { for } i=1,2,\ldots ,n\} . \]
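Let us say that \(\Phi \) randomizes \(\pi \) modulo F if there is a set \(J\subseteq \mathbb {N}\) of density 1 such that
\[ \Phi ^\infty \pi \triangleq \lim _{J\ni t\rightarrow \infty }\Phi ^t\pi \]
exists, is shift-invariant, and has entropy \(s_{f_1,f_2,\ldots ,f_n}(\pi (f_1),\pi (f_2),\ldots ,\pi (f_n))\), where
\[ s_{f_1,f_2,\ldots ,f_n}(e_1,e_2,\ldots ,e_n) \triangleq \sup \{h_\nu ({\fancyscript{X}},\sigma ): \nu \in {\fancyscript{P}}({\fancyscript{X}},\sigma ) \text { and } \nu (f_i)=e_i \text { for } i=1,2,\ldots ,n\} . \]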
Question 6
What are some examples of nonalgebraic cellular automata (with or without nontrivial conservation laws) having a randomization property?
Suitable candidates to inspect for the occurrence of a randomization behavior are those that do not have any nontrivial conservation laws.
Question 7
Do strongly transitive cellular automata randomize every Gibbs measure?
Question 8
Does a one-dimensional reversible cellular automaton that has a positively expansive transpose randomize every Gibbs measure?
Conclusions
There is a wealth of open issues in connection with the statistical mechanics of reversible and surjective cellular automata. We have asked a few questions in this article. From the modeling point of view, there are at least three central problems that need to be addressed:

What is a good description of macroscopic equilibrium states?

What is a satisfactory description of approach to equilibrium?

How do physical phenomena such as phase transition appear in the dynamical setting of cellular automata?
By virtue of their symbolic nature, various questions regarding cellular automata can be conveniently approached using computational and algorithmic methods. Nevertheless, many fundamental global properties of cellular automata have turned out to be algorithmically undecidable, at least in two and higher dimensions. For example, the question of whether a given two-dimensional cellular automaton is reversible (or surjective) is undecidable [39]. Similarly, all nontrivial properties of the limit sets of cellular automata are undecidable, even when restricted to the one-dimensional case [29, 40] (see also [17]). Whether a given cellular automaton on a full shift conserves a given local observable can be verified using a simple algorithm [30], but whether a (one-dimensional) cellular automaton has any nontrivial local conservation law is undecidable [25]. It is an interesting open problem whether the latter undecidability statement remains true when restricted to the class of reversible (or surjective) cellular automata. We hope to address this and other algorithmic questions related to the statistical mechanics of cellular automata in a separate study.
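As a rough computational illustration of what it means to test a conservation law, the following sketch checks, by brute force, whether the total of a local observable over one period is preserved on all spatially periodic configurations up to a given period. This is not the algorithm of [30]: it is only a necessary-condition check on small periods, restricted here to elementary (binary, radius-one) cellular automata identified by their Wolfram numbers, and all names are ours.

```python
from itertools import product

def eca_step(rule, x):
    """One step of the elementary cellular automaton `rule` on a cyclic configuration."""
    n = len(x)
    return tuple((rule >> (4 * x[i - 1] + 2 * x[i] + x[(i + 1) % n])) & 1
                 for i in range(n))

def total(f, x, width):
    """Sum of the local observable f over all translates of the cyclic configuration x."""
    n = len(x)
    return sum(f(tuple(x[(i + j) % n] for j in range(width))) for i in range(n))

def conserved_up_to(rule, f, width, max_period):
    """True if the total of f is unchanged by one step on every period <= max_period."""
    return all(total(f, eca_step(rule, x), width) == total(f, x, width)
               for n in range(1, max_period + 1)
               for x in product((0, 1), repeat=n))

def count_ones(window):
    """Observable counting a particle at the first site of the window."""
    return window[0]
```

With `count_ones`, the check passes for rule 170 (a shift) and rule 184, and fails for rule 90, which already changes the number of ones on a configuration of period 3.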
Problems similar to those studied here have been addressed in different but related settings and with various motivations. Simple necessary and sufficient conditions have been obtained that characterize when a one-dimensional probabilistic cellular automaton has a Bernoulli or Markov invariant measure [54, 89]. The equivalence of parts (b) and (c) in Theorem 6 is also true for positive-rate probabilistic cellular automata [16]. For positive-rate probabilistic cellular automata, however, the existence of an invariant Gibbs measure implies that all shift-invariant invariant Gibbs measures are Gibbs for the same Hamiltonian! The ergodicity problem of the probabilistic cellular automata (see e.g. [89]) has close similarity with the problem of randomization in surjective cellular automata.
Notes
The Q2R model has no temperature parameter. A correspondence can however be made with the temperature at which the Ising model has the same expected energy density.
The original definition of a Gibbs measure given by Dobrushin, Lanford and Ruelle is via conditional probabilities (see e.g. [27]). The definition given here can be shown to be equivalent to the original definition using the martingale convergence theorem. See Appendix.
We could simply choose \(\diamondsuit \) to be a periodic configuration if we knew such a configuration existed. Unfortunately, it is not known whether every strongly irreducible shift of finite type (in more than two dimensions) has a periodic configuration.
In fact, the theorem of Aizenman and Higuchi states that the simplex of Gibbs measures for the twodimensional Ising model at any temperature has at most two extremal elements. However, the uniqueness of the Gibbs measure for the contour model is implicit in their result, and constitutes the main ingredient of the proof.
The proof in [48] is presented for the case that \(({\fancyscript{X}},\sigma )\) is a full shift, but the same proof, with slight adaptation, works for any arbitrary mixing shift of finite type. To prove the openness of \(\Phi \), see [64], Theorems 6.3 and 6.4, and note that \(\Phi \) is both left- and right-closing.
The term “smoothness” here refers to the continuity of the conditional probabilities \(\pi ([p]_D\,|\,{\mathfrak {F}}_D)(z)\) for Gibbs measures (which is a defining property). Unfortunately, Corollary 11 restricts only the invariance of regular Gibbs measures (see Sect. 2.3). We do not know if every Hamiltonian is generated by an observable with summable variations. However, see [82] in this direction.
References
Aizenman, M.: Translation invariance and instability of phase coexistence in the two dimensional Ising system. Commun. Math. Phys. 73, 83–94 (1980)
Ban, J.C., Chang, C.H., Chen, T.J.: The complexity of permutive cellular automata. J. Cell. Autom. 6(4–5), 385–397 (2011)
Bernardi, V.: Lois de conservation sur automates cellulaires. Ph.D. thesis, Université de Provence (2007)
Blanchard, F.: Topological chaos: what may this mean? J. Differ. Equ. Appl. 15(1), 23–46 (2009)
Blanchard, F., Tisseur, P.: Some properties of cellular automata with equicontinuity points. Ann. Inst. H. Poincaré 36(5), 569–582 (2000)
Boccara, N., Fukś, H.: Cellular automaton rules conserving the number of active sites. J. Phys. A 31(28), 6007–6018 (1998)
Burkhead, E.G.: Equicontinuity properties of \(D\)-dimensional cellular automata. Topol. Proc. 30(1), 197–222 (2006)
Cai, H., Luo, X.: Laws of large numbers for a cellular automaton. Ann. Probab. 21(3), 1413–1426 (1993)
Ceccherini-Silberstein, T., Coornaert, M.: Cellular Automata and Groups. Springer, Berlin (2010)
Ceccherini-Silberstein, T.G., Machì, A., Scarabotti, F.: Amenable groups and cellular automata. Ann. Inst. Fourier 49(2), 673–685 (1999)
Chandgotia, N., Han, G., Marcus, B., Meyerovitch, T., Pavlov, R.: One dimensional Markov random fields, Markov chains and topological Markov fields. Proc. Am. Math. Soc. 142, 227–242 (2014)
Chandgotia, N., Meyerovitch, T.: Markov random fields, Markov cocycles and the 3-colored chessboard. Preprint (2013). arXiv:1305.0808
Codenotti, B., Margara, L.: Transitive cellular automata are sensitive. Am. Math. Mon. 103(1), 58–62 (1996)
Coven, E.M., Paul, M.E.: Endomorphisms of irreducible subshifts of finite type. Math. Syst. Theory 8(2), 167–175 (1974)
Creutz, M.: Deterministic Ising dynamics. Ann. Phys. 167, 62–72 (1986)
Dai Pra, P., Louis, P.-Y., Rœlly, S.: Stationary measures and phase transition for a class of probabilistic cellular automata. ESAIM 6, 89–104 (2002)
Delacourt, M.: Rice’s theorem for \(\mu \)-limit sets of cellular automata. In: Proceedings of the 38th International Colloquium on Automata, Languages and Programming (ICALP 2011), Part II, LNCS, vol. 6756, pp. 89–100 (2011)
Durand, B., Formenti, E., Róka, Z.: Number conserving cellular automata I: decidability. Theor. Comput. Sci. 299, 523–535 (2003)
van Enter, A.C.D., Fernández, R., Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: scope and limitations of Gibbsian theory. J. Stat. Phys. 72(5–6), 879–1167 (1993)
Fernández, R.: Contour ensembles and the description of Gibbsian probability distributions at low temperature. Notes for a minicourse given at the 21 Colóquio Brasileiro de Matemática, IMPA, Rio de Janeiro, July 21–25, 1997 (1998)
Fernández, R.: Gibbsianness and non-Gibbsianness in lattice random fields. In: Bovier, A., Dunlop, F., den Hollander, F., van Enter, A., Dalibard, J. (eds.) Mathematical Statistical Physics, Les Houches, Session LXXXIII, 2005, pp. 731–799. Elsevier, Amsterdam (2006)
Ferrari, P.A., Maass, A., Martínez, S., Ney, P.: Cesàro mean distribution of group automata starting from measures with summable decay. Ergod. Theory Dyn. Syst. 20(6), 1657–1670 (2000)
Fiorenzi, F.: Cellular automata and strongly irreducible shifts of finite type. Theor. Comput. Sci. 299(1–3), 477–493 (2003)
Formenti, E., Grange, A.: Number conserving cellular automata II: dynamics. Theor. Comput. Sci. 304, 269–290 (2003)
Formenti, E., Kari, J., Taati, S.: On the hierarchy of conservation laws in a cellular automaton. Natural Comput. 10(4), 1275–1294 (2011)
García-Ramos, F.: Product decomposition for surjective \(2\)-block NCCA. In: Proceedings of the 17th International Workshop on Cellular Automata and Discrete Complex Systems (AUTOMATA 2011), DMTCS, pp. 147–158 (2012)
Georgii, H.-O.: Gibbs Measures and Phase Transitions. Walter de Gruyter, Berlin, New York (1988)
Griffiths, R.B.: Peierls proof of spontaneous magnetization in a two-dimensional Ising ferromagnet. Phys. Rev. 136(2A), A437–A439 (1964)
Guillon, P., Richard, G.: Revisiting the Rice theorem of cellular automata. In: Proceedings of the 27th International Symposium on Theoretical Aspects of Computer Science (STACS 2010), pp. 441–452 (2010)
Hattori, T., Takesue, S.: Additive conserved quantities in discrete-time lattice dynamical systems. Phys. D 49, 295–322 (1991)
Hedlund, G.A.: Endomorphisms and automorphisms of the shift dynamical system. Math. Syst. Theory 3, 320–375 (1969)
Helvik, T., Lindgren, K., Nordahl, M.G.: Continuity of information transport in surjective cellular automata. Commun. Math. Phys. 272(1), 53–74 (2007)
Higuchi, Y.: On the absence of nontranslationally invariant Gibbs states for the two-dimensional Ising model. In: Random Fields (Esztergom, 1979), Colloquia Mathematica Societatis János Bolyai, vol. 29, pp. 517–534. North-Holland, Amsterdam (1981)
Host, B., Maass, A., Martínez, S.: Uniform Bernoulli measure in dynamics of permutative cellular automata with algebraic local rules. Discret. Contin. Dyn. Syst. 9(6), 1423–1446 (2003)
Israel, R.B.: Convexity in the Theory of Lattice Gases. Princeton University Press, Princeton (1979)
Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106(4), 620–630 (1957)
Johnson, A., Rudolph, D.J.: Convergence under \(\times _q\) of \(\times _p\) invariant measures on the circle. Adv. Math. 115(1), 117–140 (1995)
Kameyama, A.: Topological transitivity and strong transitivity. Acta Math. Univ. Comen. 71(2), 139–145 (2002)
Kari, J.: Reversibility and surjectivity problems of cellular automata. J. Comput. Syst. Sci. 48(1), 149–182 (1994)
Kari, J.: Rice’s theorem for the limit sets of cellular automata. Theor. Comput. Sci. 127, 229–254 (1994)
Kari, J.: Theory of cellular automata: a survey. Theor. Comput. Sci. 334, 3–33 (2005)
Kari, J., Taati, S.: Conservation laws and invariant measures in surjective cellular automata. In: Proceedings of the 17th International Workshop on Cellular Automata and Discrete Complex Systems (AUTOMATA 2011), DMTCS, pp. 113–122 (2012)
Katok, A., Robinson, Jr., E.A.: Cocycles, cohomology and combinatorial constructions in ergodic theory. In: Proceedings of Symposia in Pure Mathematics, Smooth Ergodic Theory and Its Applications, vol. 69, pp. 107–173. American Mathematical Society, Seattle (2001)
Keller, G.: Equilibrium States in Ergodic Theory. Cambridge University Press, Cambridge (1998)
Kitchens, B.P.: Symbolic Dynamics: One-sided, Two-sided and Countable State Markov Shifts. Springer, New York (1998)
Kolyada, S.F.: Li–Yorke sensitivity and other concepts of chaos. Ukr. Math. J. 56(8), 1242–1257 (2004)
Kůrka, P.: Languages, equicontinuity and attractors in cellular automata. Ergodic Theory Dyn. Syst. 17, 417–433 (1997)
Kůrka, P.: Topological and Symbolic Dynamics, Cours Spécialisés, vol. 11. Société Mathématique de France, Paris (2003)
Lind, D., Marcus, B.: An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge (1995)
Lind, D.A.: Applications of ergodic theory and sofic systems to cellular automata. Phys. D 10(1–2), 36–44 (1984)
Lőrinczi, J., Maes, C., Velde, K.V.: Transformations of Gibbs measures. Probab. Theory Relat. Fields 112, 121–147 (1998)
Maass, A., Martínez, S.: On Cesàro limit distribution of a class of permutative cellular automata. J. Stat. Phys. 90(1/2), 435–452 (1998)
Maass, A., Martínez, S.: Time averages for some classes of expansive one-dimensional cellular automata. In: Goles, E., Martínez, S. (eds.) Cellular Automata and Complex Systems, Nonlinear Phenomena and Complex Systems, vol. 3, pp. 37–54. Kluwer Academic Publishers, Berlin (1999)
Mairesse, J., Marcovici, I.: Probabilistic cellular automata and random fields with i.i.d. directions. Ann. Inst. H. Poincaré 50(2), 455–475 (2014)
Maruoka, A., Kimura, M.: Condition for injectivity of global maps for tesselation automata. Inf. Control 32, 158–162 (1976)
Meester, R., Steif, J.E.: Higher-dimensional subshifts of finite type, factor maps and measures of maximal entropy. Pac. J. Math. 200(2), 497–510 (2001)
Meyerovitch, T.: Gibbs and equilibrium measures for some families of subshifts. Ergodic Theory Dyn. Syst. 33(3), 934–953 (2013)
Miyamoto, M.: An equilibrium state for a onedimensional life game. J. Math. Kyoto Univ. 19, 525–540 (1979)
Miyamoto, M.: Stationary measures for automaton rules 90 and 150. J. Math. Kyoto Univ. 34, 531–538 (1994)
Moore, E.F.: Machine models of self-reproduction. In: Proceedings of Symposia in Applied Mathematics, pp. 17–33. American Mathematical Society, Providence (1962)
Moothathu, T.K.S.: Homogeneity of surjective cellular automata. Discret. Contin. Dyn. Syst. 13(1), 195–202 (2005)
Moreira, A., Boccara, N., Goles, E.: On conservative and monotone one-dimensional cellular automata and their particle representation. Theor. Comput. Sci. 325(2), 285–316 (2004)
Myhill, J.: The converse of Moore’s Garden-of-Eden theorem. Proc. Am. Math. Soc. 14, 685–686 (1963)
Nasu, M.: Constant-to-one and onto global maps of homomorphisms between strongly connected graphs. Ergodic Theory Dyn. Syst. 3(3), 387–413 (1983)
Nasu, M.: Textile Systems for Endomorphisms and Automorphisms of the Shift. Memoirs of the American Mathematical Society, vol. 114, 545th edn. American Mathematical Society, Providence (1995)
Oxtoby, J.C.: Ergodic sets. Bull. Am. Math. Soc. 58(2), 116–136 (1952)
Peierls, R.: On Ising’s model of ferromagnetism. Math. Proc. Camb. Philos. Soc. 32(3), 477–481 (1936)
Pivato, M.: Conservation laws in cellular automata. Nonlinearity 15, 1781–1793 (2002)
Pivato, M.: Invariant measures for bipermutative cellular automata. Discret. Contin. Dyn. Syst. 12(4), 723–736 (2006)
Pivato, M.: The ergodic theory of cellular automata. In: Meyers, R.A. (ed.) Encyclopedia of Complexity and System Science, pp. 965–999. Springer, New York (2009)
Pivato, M., Yassawi, R.: Limit measures for affine cellular automata. Ergodic Theory Dyn. Syst. 22, 1269–1287 (2002)
Pivato, M., Yassawi, R.: Limit measures for affine cellular automata II. Ergodic Theory Dyn. Syst. 24, 1961–1980 (2004)
Pivato, M., Yassawi, R.: Asymptotic randomization of sofic shifts by linear cellular automata. Ergodic Theory Dyn. Syst. 26(4), 1177–1201 (2006)
Pomeau, Y.: Invariant in cellular automata. J. Phys. A 17(8), L415–L418 (1984)
Ruelle, D.: Thermodynamic Formalism, 2nd edn. Cambridge University Press, Cambridge (2004)
Russo, L.: The infinite cluster method in the two-dimensional Ising model. Commun. Math. Phys. 67(3), 251–266 (1979)
Sablik, M.: Measure rigidity for algebraic bipermutative cellular automata. Ergodic Theory Dyn. Syst. 27(6), 1965–1990 (2007)
Simon, B.: The Statistical Mechanics of Lattice Gases, vol. I. Princeton University Press, Princeton (1993)
Sinai, Y.G.: Theory of Phase Transitions: Rigorous Results. Akadémiai Kiadó, Budapest (1982)
Sklar, L.: Physics and Chance: Philosophical Issues in the Foundations of Statistical Mechanics. Cambridge University Press, Cambridge (1993)
Sobottka, M.: Right-permutative cellular automata on topological Markov chains. Discret. Contin. Dyn. Syst. 20(4), 1095–1109 (2008)
Sullivan, W.G.: Potentials for almost Markovian random fields. Commun. Math. Phys. 33, 61–74 (1973)
Takesue, S.: Reversible cellular automata and statistical mechanics. Phys. Rev. Lett. 59(22), 2499–2502 (1987)
Takesue, S.: Ergodic properties and thermodynamic behavior of elementary reversible cellular automata. I. Basic properties. J. Stat. Phys. 56(3/4), 371–402 (1989)
Takesue, S.: Relaxation properties of elementary reversible cellular automata. Phys. D 45(1–3), 278–284 (1990)
Takesue, S.: Staggered invariants in cellular automata. Complex Syst. 9, 149–168 (1995)
Toffoli, T., Margolus, N.: Cellular Automata Machines: A New Environment for Modeling. MIT Press, Cambridge (1987)
Toffoli, T., Margolus, N.: Invertible cellular automata: a review. Phys. D 45, 229–253 (1990)
Toom, A.L., Vasilyev, N.B., Stavskaya, O.N., Mityushin, L.G., Kuryumov, G.L., Pirogov, S.A.: Discrete local Markov systems. In: Dobrushin, R.L., Kryukov, V.I., Toom, A.L. (eds.) Stochastic Cellular Systems: Ergodicity, Memory, Morphogenesis. Manchester University Press, Manchester (1990)
Velenik, Y.: Le modèle d’Ising. Notes for a course given at the University of Geneva (2009). http://cel.archives-ouvertes.fr/cel-00392289
Vichniac, G.Y.: Simulating physics with cellular automata. Phys. D 10, 96–116 (1984)
Walters, P.: An Introduction to Ergodic Theory. Springer, New York (1982)
Wolfram, S.: Statistical mechanics of cellular automata. Rev. Mod. Phys. 55, 601–644 (1983)
Wolfram, S.: Universality and complexity in cellular automata. Phys. D 10, 1–35 (1984)
Wolfram, S.: A New Kind of Science. Wolfram Media, Champaign (2002)
Acknowledgments
We would like to thank Aernout van Enter, Nishant Chandgotia, Felipe García-Ramos, Tom Kempton and Marcus Pivato for helpful comments and discussions.
Appendix: Equivalence of the Definitions of a Gibbs Measure
Proposition 14
Let \(\Delta \) be a Hamiltonian on a shift space \({\fancyscript{X}}\) and \(\pi \in {\fancyscript{P}}({\fancyscript{X}})\) a probability measure. The following conditions are equivalent:

(a)
For every finite set \(D\subseteq \mathbb {L}\),
$$\begin{aligned} \pi ([q]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(z)&= \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) }\, \pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(z) \end{aligned}$$(75)
for \(\pi \)-almost every \(z\in {\fancyscript{X}}\) and every two patterns \(p,q\in L_D({\fancyscript{X}}\,|\,z)\).

(b)
For every finite set \(D\subseteq \mathbb {L}\),
$$\begin{aligned} \frac{\pi ([q\vee z|_{D^\mathsf c }]_E)}{\pi ([p\vee z|_{D^\mathsf c }]_E)}&\rightarrow \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) } \end{aligned}$$(76)
uniformly in \(z\in {\mathrm {supp}}(\pi )\) and \(p,q\in L_D({\fancyscript{X}}\,|\,z)\) as \(E\nearrow \mathbb {L}\) along the directed family of finite subsets of \(\mathbb {L}\).

(c)
For every configuration \(x\in {\fancyscript{X}}\) that is in the support of \(\pi \) and every configuration \(y\in {\fancyscript{X}}\) that is asymptotic to \(x\),
$$\begin{aligned} \frac{\pi ([y]_E)}{\pi ([x]_E)}&\rightarrow \mathrm {e}^{\Delta (x,y)} \;, \end{aligned}$$(77)
as \(E\nearrow \mathbb {L}\) along the directed family of finite subsets of \(\mathbb {L}\).
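As an illustration of condition (c), here is a minimal Python sketch that is not taken from the paper. It assumes the full shift over \(\{0,1\}\) on \(\mathbb {Z}\), a stationary two-state Markov measure with a hypothetical positive transition matrix `P`, and the nearest-neighbour Hamiltonian \(\Delta (x,y)=\sum _i [\log P(y_i,y_{i+1})-\log P(x_i,x_{i+1})]\), which is chosen so that the ratio in (77) tends to \(\mathrm {e}^{\Delta (x,y)}\) under the sign convention displayed above. The names `stationary`, `cylinder_prob`, `delta`, `x`, `y` and `diff_sites` are illustrative only.

```python
import math

# Hypothetical positive transition matrix of a two-state Markov chain (assumption).
P = [[0.7, 0.3],
     [0.4, 0.6]]

def stationary(P):
    """Stationary distribution of a 2x2 stochastic matrix."""
    a, b = P[0][1], P[1][0]
    return [b / (a + b), a / (a + b)]

def cylinder_prob(x, n):
    """Markov measure of the cylinder fixing x on the window E = {-n, ..., n}."""
    prob = stationary(P)[x(-n)]
    for i in range(-n, n):
        prob *= P[x(i)][x(i + 1)]
    return prob

def delta(x, y, diff_sites):
    """Delta(x, y): sum of log-weight differences over the bonds touching diff_sites."""
    lo, hi = min(diff_sites) - 1, max(diff_sites)
    return sum(math.log(P[y(i)][y(i + 1)]) - math.log(P[x(i)][x(i + 1)])
               for i in range(lo, hi + 1))

# x is a periodic configuration; y is asymptotic to x (differs only on diff_sites).
diff_sites = {0, 1, 4}

def x(i):
    return 1 if i % 3 == 0 else 0

def y(i):
    return 1 - x(i) if i in diff_sites else x(i)

target = math.exp(delta(x, y, diff_sites))
# Ratio pi([y]_E) / pi([x]_E) for growing windows E = {-n, ..., n}.
ratios = {n: cylinder_prob(y, n) / cylinder_prob(x, n) for n in (5, 10, 20)}
for n, r in ratios.items():
    print(n, r, target)
```

For a Markov measure the ratio does not merely converge: it is constant (up to rounding) as soon as the window contains every site where \(x\) and \(y\) differ together with their neighbours, which is why moderate window sizes suffice in this sketch.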
Proof
(a)\(\Rightarrow \)(b) Assume that condition (a) is satisfied. Let \(D\subseteq \mathbb {L}\) be a finite set. Integrating (75), for any finite \(E\supseteq D\) we get
Setting
we can write
Now, let \(p,q\in L_D({\fancyscript{X}})\) be fixed patterns with \(\pi ([p]_D),\pi ([q]_D)>0\), and let \(\varepsilon >0\). By the uniform continuity of \(z\mapsto \Delta (p\vee z|_{D^\mathsf c },q\vee z|_{D^\mathsf c })\), there is a sufficiently large finite set \(E_\varepsilon \subseteq \mathbb {L}\) such that, for every \(E\supseteq E_\varepsilon \) and every \(z,\zeta \) with \(z|_{E\setminus D}=\zeta |_{E\setminus D}\), we have \(\left| \delta (z,\zeta )\right| <\varepsilon \). In particular, for every \(E\supseteq E_\varepsilon \) and every \(z\in {\mathrm {supp}}(\pi )\) satisfying \(p,q\in L_D({\fancyscript{X}}\,|\,z)\), we get
Substituting in (81), we obtain, for every \(z\in {\mathrm {supp}}(\pi )\) satisfying \(p,q\in L_D({\fancyscript{X}}\,|\,z)\), that
provided \(E\supseteq E_\varepsilon \). Dividing by \(\pi ([p\vee z|_{D^\mathsf c }]_E)\) and letting \(\varepsilon \rightarrow 0\) proves the claim.
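The displayed formulas of this step are not reproduced in the text above. The following is only a hedged sketch of how the computation could run, under two assumptions that are not confirmed by the source: \({\fancyscript{X}}\) is a full shift (so every pattern is admissible with every \(z\)), and \(\delta \) denotes the difference of the exponential weights. Write \(C=[z]_{E\setminus D}\) for the cylinder fixing \(z\) on \(E\setminus D\), which belongs to \({\mathfrak {F}}_{D^\mathsf c }\).

$$\begin{aligned} \pi ([q\vee z|_{D^\mathsf c }]_E)&= \int _C \pi ([q]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(\zeta )\,\pi (\mathrm {d}\zeta ) = \int _C \mathrm {e}^{\Delta \left( p\vee \zeta |_{D^\mathsf c }\;,\;q\vee \zeta |_{D^\mathsf c }\right) }\, \pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(\zeta )\,\pi (\mathrm {d}\zeta ) \;,\\ \delta (z,\zeta )&:= \mathrm {e}^{\Delta \left( p\vee \zeta |_{D^\mathsf c }\;,\;q\vee \zeta |_{D^\mathsf c }\right) } - \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) } \;,\\ \pi ([q\vee z|_{D^\mathsf c }]_E)&= \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) }\,\pi ([p\vee z|_{D^\mathsf c }]_E) + \int _C \delta (z,\zeta )\,\pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(\zeta )\,\pi (\mathrm {d}\zeta ) \;,\\ \left| \int _C \delta (z,\zeta )\,\pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(\zeta )\,\pi (\mathrm {d}\zeta )\right|&\le \varepsilon \,\pi ([p\vee z|_{D^\mathsf c }]_E) \quad \text {whenever } \left| \delta (z,\zeta )\right| <\varepsilon \text { on } C \;. \end{aligned}$$

The third line uses \(\int _C \pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })\,\mathrm {d}\pi =\pi ([p]_D\cap C)=\pi ([p\vee z|_{D^\mathsf c }]_E)\). Under these assumptions, dividing by \(\pi ([p\vee z|_{D^\mathsf c }]_E)\) bounds the distance between the ratio in (76) and its claimed limit by \(\varepsilon \) for all \(E\supseteq E_\varepsilon \), uniformly in \(z\).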
(b)\(\Rightarrow \)(c) Trivial.
(c)\(\Rightarrow \)(a) Suppose that \(\pi \) satisfies condition (c). Let \(I_1\subseteq I_2\subseteq \cdots \) be an arbitrary chain of finite subsets of \(\mathbb {L}\) with \(\bigcup _n I_n=\mathbb {L}\). Let \(z\in {\fancyscript{X}}\) be a configuration in the support of \(\pi \) and \(D\subseteq \mathbb {L}\) a finite set. For every two patterns \(p,q\in L_D({\fancyscript{X}}\,|\,z)\), we have
as \(n\rightarrow \infty \), implying that
as \(n\rightarrow \infty \).
Note that the \(\sigma \)-algebra \({\mathfrak {F}}_{D^\mathsf c }\) is generated by the filtration \({\mathfrak {F}}_{I_1\setminus D}\subseteq {\mathfrak {F}}_{I_2\setminus D}\subseteq \cdots \). Therefore, by the martingale convergence theorem, for \(\pi \)-almost every \(z\), and every \(p\in L_D({\fancyscript{X}}\,|\, z)\),
as \(n\rightarrow \infty \).
Combining (86) and (87), we obtain
for \(\pi \)-almost every \(z\in {\fancyscript{X}}\) and every \(p,q\in L_D({\fancyscript{X}}\,|\,z)\). \(\square \)
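The displays of the step (c)\(\Rightarrow \)(a) are likewise not reproduced above. As a hedged sketch only (the exact form and numbering in the original may differ), the chain of limits could read as follows, where \(\pi (A\,|\,B)=\pi (A\cap B)/\pi (B)\) denotes elementary conditional probability and we assume \(D\subseteq I_n\) and \(\pi ([p\vee z|_{D^\mathsf c }]_{I_n})>0\) for all \(n\):

$$\begin{aligned} \frac{\pi ([q\vee z|_{D^\mathsf c }]_{I_n})}{\pi ([p\vee z|_{D^\mathsf c }]_{I_n})}&\rightarrow \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) } \;,\\ \frac{\pi ([q]_D\,|\,[z]_{I_n\setminus D})}{\pi ([p]_D\,|\,[z]_{I_n\setminus D})}&\rightarrow \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) } \;,\\ \pi ([p]_D\,|\,{\mathfrak {F}}_{I_n\setminus D})(z) = \pi ([p]_D\,|\,[z]_{I_n\setminus D})&\rightarrow \pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(z) \;,\\ \pi ([q]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(z)&= \mathrm {e}^{\Delta \left( p\vee z|_{D^\mathsf c }\;,\;q\vee z|_{D^\mathsf c }\right) }\, \pi ([p]_D\,|\,{\mathfrak {F}}_{D^\mathsf c })(z) \;. \end{aligned}$$

The first line is condition (c) applied to the asymptotic pair \(x=p\vee z|_{D^\mathsf c }\), \(y=q\vee z|_{D^\mathsf c }\); the second is the same ratio after dividing numerator and denominator by \(\pi ([z]_{I_n\setminus D})\); the third is the martingale convergence; and the last follows by passing to the limit in the second line using the third.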
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Kari, J., Taati, S. Statistical Mechanics of Surjective Cellular Automata. J Stat Phys 160, 1198–1243 (2015). https://doi.org/10.1007/s10955-015-1281-2
DOI: https://doi.org/10.1007/s10955-015-1281-2
Keywords
 Cellular automata
 Gibbs measures
 Conservation laws
 Macroscopic equilibrium
Mathematics Subject Classification
 37B15
 37A60
 37D35
 37B10
 82Bxx
 82Cxx