1 Introduction

Self-similar groups are an active topic of modern group theory. They initially appeared as interesting examples of groups with unusual properties (see [1, 21, 24, 44]). The main techniques of the theory were developed for the study of these examples. Later, a connection to dynamical systems was discovered (see [33, 35]) via the notion of the iterated monodromy group. Many interesting problems were solved using self-similar groups (see [10, 11, 16, 22]).

One of the ways to define self-similar groups is to say that they are groups generated by all states of an automaton (of Mealy type, also called a transducer or sequential machine). An especially important case is when the group is generated by the states of a finite automaton. All examples mentioned above (including the iterated monodromy groups of expanding dynamical systems) are of this kind.

The main goal of this article is to indicate a new relation between self-similar groups and another classical notion of automaticity: automatic sequences and matrices. See the monographs [2, 50] for the theory of automatic sequences and its applications.

More precisely, we are going to study natural linear representations of self-similar groups over finite fields, and show that matrices associated with elements of a group generated by a finite automaton are automatic.

There are several ways to define automatic sequences and matrices. One can use Moore automata, substitutions (e.g., the Thue–Morse substitution leading to the famous Thue–Morse sequence), or Christol's characterization of automatic sequences in terms of algebraicity of the generating power series over a suitable finite field [2, Theorem 12.2.5]. The theory of automatic sequences is rich and is related to many topics in dynamical systems, ergodic theory, spectral theory of Schrödinger operators, number theory, etc.; see [2, 50].

It is well known that linear groups (that is, subgroups of the matrix groups \(GL_N(\Bbbk )\), where \(\Bbbk \) is a field) form quite a restrictive class of groups, as the Tits alternative [48] holds for them. Moreover, the group of (finite-dimensional) upper triangular matrices is solvable, and the group of upper unitriangular matrices is nilpotent. In contrast, if one uses infinite triangular matrices over a finite field, one gets a much larger class of groups. In particular, every countable residually finite p-group can be embedded into the group of upper unitriangular infinite matrices over the finite field \(\mathbb {F}_p\).

We will pay special attention to the case when the constructed representation is a representation by infinite unitriangular matrices. One of the results of our paper shows that the natural (and, in a certain sense, optimal) representation by unitriangular matrices constructed in [29, 31] leads to automatic matrices if the group is generated by a finite automaton. In particular, the diagonals of these unitriangular matrices are automatic sequences. We study them separately, in particular by computing their generating series (which are algebraic functions).

The roots of the subject of our paper go back to L. Kaloujnine's results on Sylow p-subgroups of the symmetric group [26–28], the theory of groups generated by finite automata [1, 17, 19, 25, 45], the theory of self-similar groups [7, 35], and group actions on rooted trees [8, 19, 23].

Note that the study of actions on rooted trees (every self-similar group is, by definition, a group of automorphisms of the rooted tree of finite words over a finite alphabet) is equivalent to the study of residually finite groups by geometric means, i.e., via their representations in automorphism groups of rooted trees. The theory of actions on rooted trees is quite different from the Bass–Serre theory [41] of actions on (unrooted) trees, and uses different methods and tools. An important case is when a group is a residually finite p-group (p prime), i.e., is approximated by finite p-groups. The class of residually finite p-groups contains groups with many remarkable properties. For instance, the Golod p-groups, constructed in [20] on the basis of the Golod–Shafarevich theorem to answer Burnside's question on the existence of a finitely generated infinite torsion group, are residually finite p-groups. Other important examples are the first self-similar groups mentioned at the beginning of this introduction.

At the end of the paper we study the notion of uniseriality, which plays an important role in the study of actions of groups on finite p-groups [15, 30]. Our analysis is based upon classical results of L. Kaloujnine on the heights of automorphisms of rooted trees [27, 28]. Applications of uniseriality to Lie algebras associated with self-similar groups were demonstrated, for instance, in [5]. Proposition 5.29 gives a simple criterion for uniseriality of the action of a group on a rooted tree and can serve as a substitute for Lemma 5.2 of [5]. A number of examples are presented which demonstrate the basic notions, ideas, and results.

2 Groups acting on rooted trees

2.1 Rooted trees and their automorphisms

Let X be a finite alphabet. Denote by \(X^*\) the set of finite words over X, which we will view as the free monoid generated by X. It is a disjoint union of the sets \(X^n\) of words of length n. We denote the empty word, which is the only element of \(X^0\), by \(\emptyset \). We will write elements of \(X^n\) either as words \(x_1x_2\cdots x_n\), or as n-tuples \((x_1, x_2, \ldots , x_n)\).

We consider \(X^*\) as the set of vertices of a rooted tree, defined as the right Cayley graph of the free monoid. Namely, two vertices of \(X^*\) are connected by an edge if and only if they are of the form v, vx for \(v\in X^*\) and \(x\in X\). The empty word is the root of the tree. For \(v\in X^*\), we consider \(vX^*=\{vw\;:\;w\in X^*\}\) as the set of vertices of a sub-tree with the root v.

Denote by \(\mathop {\mathrm {Aut}}(X^*)\) the group of all automorphisms of the rooted tree \(X^*\). Every element of \(\mathop {\mathrm {Aut}}(X^*)\) preserves the levels \(X^n\) of the tree, and for every \(g\in \mathop {\mathrm {Aut}}(X^*)\) the beginning of length \(n\le m\) of the word \(g(x_1x_2\cdots x_m)\) is equal to \(g(x_1x_2\cdots x_n)\). It follows that for every \(v\in X^*\) the transformation \(\alpha _{g, v}:X\longrightarrow X\) defined by \(g(vx)=g(v)\alpha _{g, v}(x)\) is a permutation, and that the action of g on \(X^*\) is determined by these permutations according to the rule

$$\begin{aligned} g(x_1x_2\cdots x_n)=\alpha _{g, \emptyset }(x_1)\alpha _{g, x_1}(x_2)\alpha _{g, x_1x_2}(x_3)\cdots \alpha _{g, x_1x_2\cdots x_{n-1}}(x_n). \end{aligned}$$
(1)

The map \(v\mapsto \alpha _{g, v}\) from \(X^*\) to the symmetric group \(\mathop {\mathrm {Symm}}(X)\) is called the portrait of the automorphism g.

Equivalently, we can represent g by the sequence

$$\begin{aligned} \tau =[\tau _0, \tau _1, \tau _2, \ldots ] \end{aligned}$$

of maps \(\tau _n:X^n\longrightarrow \mathop {\mathrm {Symm}}(X)\), where \(\tau _n(v)=\alpha _{g, v}\). Such sequences are called, following L. Kaloujnine [26], tableaux, and are denoted \(\tau (g)\).

If \([\tau _0, \tau _1, \ldots ]\) and \([\sigma _0, \sigma _1, \ldots ]\) are tableaux of elements \(g_1, g_2\in \mathop {\mathrm {Aut}}(X^*)\), respectively, then the tableau of their product \(g_1g_2\) is the sequence of functions

$$\begin{aligned} \tau (g_1g_2)=\left[ \tau _n(g_2(x_1x_2\cdots x_n))\cdot \sigma _n(x_1x_2\cdots x_n)\right] _{n=0}^\infty . \end{aligned}$$
(2)

The tableau of the inverse of the element \(g_1\) is

$$\begin{aligned} \tau (g_1^{-1})=\left[ \tau _n(g_1^{-1}(x_1x_2\cdots x_n))^{-1}\right] _{n=0}^\infty . \end{aligned}$$
(3)

Here, and in most of our paper (except when we talk about bisets, i.e., about sets with left and right actions), group elements and permutations act from the left.

Denote by \(X^{[n]}\) the finite sub-tree of \(X^*\) spanned by the set of vertices \(\bigcup _{k=0}^nX^k\). The group \(\mathop {\mathrm {Aut}}(X^*)\) acts on \(X^{[n]}\), and the kernel of the action coincides with the kernel of the action on \(X^n\). The quotient of \(\mathop {\mathrm {Aut}}(X^*)\) by the kernel of the action is a finite group, which is naturally identified with the full automorphism group of the tree \(X^{[n]}\). We will denote this finite group by \(\mathop {\mathrm {Aut}}(X^{[n]})\).

The group \(\mathop {\mathrm {Aut}}(X^*)\) is naturally isomorphic to the inverse limit of the groups \(\mathop {\mathrm {Aut}}(X^{[n]})\) (with respect to the restriction maps). This shows that \(\mathop {\mathrm {Aut}}(X^*)\) is a profinite group. A basis of neighborhoods of the identity in \(\mathop {\mathrm {Aut}}(X^*)\) is given by the kernels of its actions on the levels \(X^n\) of the tree \(X^*\).

2.2 Self-similarity of \(\mathop {\mathrm {Aut}}(X^*)\)

Let \(g\in \mathop {\mathrm {Aut}}(X^*)\), and \(v\in X^*\). Then there exists an automorphism of \(X^*\), denoted \(g|_v\), such that

$$\begin{aligned} g(vw)=g(v)g|_v(w) \end{aligned}$$

for all \(w\in X^*\).

We call \(g|_v\) the section of g at v. The sections obviously have the properties

$$\begin{aligned} g|_{v_1v_2}=g|_{v_1}|_{v_2},\qquad (g_1g_2)|_v=g_1|_{g_2(v)}g_2|_v, \end{aligned}$$
(4)

for all \(g, g_1, g_2\in \mathop {\mathrm {Aut}}(X^*)\) and \(v, v_1, v_2\in X^*\).

The portrait of the section \(g|_v\) is obtained by restricting the portrait of g to the subtree \(vX^*\), and then identifying \(vX^*\) with \(X^*\) by the map \(vw\mapsto w\).

Definition 2.1

The set \(g|_{X^*}=\{g|_v\;:\;v\in X^*\}\subset \mathop {\mathrm {Aut}}(X^*)\), for \(g\in \mathop {\mathrm {Aut}}(X^*)\), is called the set of states of g. An automorphism \(g\in \mathop {\mathrm {Aut}}(X^*)\) is said to be finite-state if \(g|_{X^*}\) is finite.

It follows from (4) that

$$\begin{aligned} g^{-1}|_{X^*}=(g|_{X^*})^{-1},\qquad (g_1g_2)|_{X^*}\subset g_1|_{X^*}g_2|_{X^*}, \end{aligned}$$

which implies that the set of finite-state elements of \(\mathop {\mathrm {Aut}}(X^*)\) is a group. We call it the group of finite automata, and denote it \(\mathop {\mathrm {FAut}}(X^*)\). This name comes from the interpretation of elements of \(\mathop {\mathrm {Aut}}(X^*)\) as automata (transducers), see Sect. 3.1 below. Namely, the set of states of the automaton corresponding to g is \(g|_{X^*}\), and the element \(g\in g|_{X^*}\) is the initial state. If the current state of the automaton is h and it reads a letter \(x\in X\) on input, then it outputs h(x) and changes its current state to \(h|_x\). It is easy to check that if we feed the consecutive letters of a word \(x_1x_2\cdots x_n\) to the automaton with initial state g, then we get on output the word \(g(x_1x_2\cdots x_n)=g(x_1)g|_{x_1}(x_2)g|_{x_1x_2}(x_3)\cdots \); compare with (1).

Every element \(g\in \mathop {\mathrm {Aut}}(X^*)\) is uniquely determined by the permutation \(\pi \) it defines on the first level \(X\subset X^*\) and the first level sections \(g|_x\), \(x\in X\). In fact, the map

$$\begin{aligned} g\mapsto \pi \cdot (g|_x)_{x\in X} \end{aligned}$$

is an isomorphism of \(\mathop {\mathrm {Aut}}(X^*)\) with the wreath product \(\mathop {\mathrm {Symm}}(X)\ltimes \mathop {\mathrm {Aut}}(X^*)^X=\mathop {\mathrm {Symm}}(X)\wr \mathop {\mathrm {Aut}}(X^*)\). We call the isomorphism

$$\begin{aligned} \phi :\mathop {\mathrm {Aut}}(X^*)\longrightarrow \mathop {\mathrm {Symm}}(X)\wr \mathop {\mathrm {Aut}}(X^*):g\mapsto \pi \cdot (g|_x)_{x\in X} \end{aligned}$$

the wreath recursion.

For a fixed ordering \(x_1, x_2, \ldots , x_d\) of the letters of X, the elements of \(\mathop {\mathrm {Symm}}(X)\wr \mathop {\mathrm {Aut}}(X^*)\) are written as \(\pi (g_1, g_2, \ldots , g_d)\), where \(\pi \in \mathop {\mathrm {Symm}}(d)\) and \(g_i=g|_{x_i}\).

Definition 2.2

A subgroup \(G\le \mathop {\mathrm {Aut}}(X^*)\) is said to be self-similar if \(g|_x\in G\) for all \(g\in G\) and \(x\in X\).

In other words, a group \(G\le \mathop {\mathrm {Aut}}(X^*)\) is self-similar if the restriction of the wreath recursion to G is a homomorphism \(\phi :G\longrightarrow \mathop {\mathrm {Symm}}(X)\wr G\). Note that this restriction is usually not an isomorphism (but it is an embedding, since we assume that G acts faithfully on \(X^*\)).

Example 2.3

Let \(X=\{0, 1\}\). Consider the automorphism of the tree \(X^*\) given by the rules

$$\begin{aligned} a(0w)=1w,\qquad a(1w)=0a(w). \end{aligned}$$

These rules can be written using wreath recursion as \(\phi (a)=\sigma (1, a)\), where \(\sigma =(01)\) is the transposition. We will usually omit \(\phi \), and write

$$\begin{aligned} a=\sigma (1, a), \end{aligned}$$

thus identifying \(\mathop {\mathrm {Aut}}(X^*)\) with \(\mathop {\mathrm {Symm}}(X)\wr \mathop {\mathrm {Aut}}(X^*)\).

The automorphism a is called the (binary) adding machine, since it describes the process of adding one to a dyadic integer: \(a(x_1x_2\cdots x_n)=y_1y_2\cdots y_n\) if and only if

$$\begin{aligned} (x_1+2x_2+2^2x_3+\cdots +2^{n-1}x_n)+1=y_1+2y_2+\cdots +2^{n-1}y_n \qquad (\hbox {mod}\,2^{n}). \end{aligned}$$

The group generated by a (which is infinite cyclic) is self-similar, and is a subgroup of the group of finite automata.
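The recursive rule defining a can be implemented and tested directly. Here is a minimal Python sketch (our code; the function name and encoding are ours), checking the adding machine against addition of 1 modulo \(2^n\), with the less significant bits first:

```python
def adding_machine(word):
    """Apply a = sigma(1, a) to a tuple of bits (least significant bit first)."""
    if not word:
        return word
    x, rest = word[0], word[1:]
    if x == 0:
        return (1,) + rest              # a(0w) = 1w: no carry
    return (0,) + adding_machine(rest)  # a(1w) = 0 a(w): flip and carry

# Check that a adds 1 modulo 2**n.
for n in range(1, 6):
    for m in range(2 ** n):
        bits = tuple((m >> i) & 1 for i in range(n))
        image = adding_machine(bits)
        assert sum(b << i for i, b in enumerate(image)) == (m + 1) % 2 ** n
```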

Example 2.4

Consider the group G generated by the elements a, b, c, d defined inductively by the recursions

$$\begin{aligned} a=\sigma ,\quad b=(a, c),\quad c=(a, d),\quad d=(1, b). \end{aligned}$$

Here \(\sigma \), as before, is the transposition (01). When writing elements of \(\mathop {\mathrm {Symm}}(X)\wr \mathop {\mathrm {Aut}}(X^*)\), if we omit either the element of \(\mathop {\mathrm {Symm}}(X)\) or the element of \(\mathop {\mathrm {Aut}}(X^*)^X\), we assume that it is equal to the identity element of the respective group.

The group G is then a self-similar subgroup of the group of finite automata. It is the Grigorchuk group, defined in [21].
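The recursions defining a, b, c, d can be applied to finite words by the rule \(g(xw)=g(x)\,g|_x(w)\). The following Python sketch (our encoding of the wreath recursion as triples, not part of the original text) also checks the well-known relation \(bc=d\) and the fact that the generators are involutions:

```python
# For each generator: (swap the first letter?, section at 0, section at 1).
WREATH = {
    'a': (True,  'e', 'e'),   # a = sigma
    'b': (False, 'a', 'c'),   # b = (a, c)
    'c': (False, 'a', 'd'),   # c = (a, d)
    'd': (False, 'e', 'b'),   # d = (1, b)
    'e': (False, 'e', 'e'),   # identity
}

def act(g, word):
    """Apply the generator g to a tuple of bits via g(xw) = g(x) g|_x(w)."""
    if not word:
        return word
    swap, s0, s1 = WREATH[g]
    x, rest = word[0], word[1:]
    return ((1 - x if swap else x),) + act((s0, s1)[x], rest)

w = (0, 1, 1, 0, 1)
for g in 'abcd':
    assert act(g, act(g, w)) == w            # each generator is an involution
assert act('b', act('c', w)) == act('d', w)  # the relation bc = d, on this word
```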

2.3 Self-similarity bimodule

We can identify the letters \(x\in X\) with transformations \(v\mapsto xv\) of the set \(X^*\). Then the identity \(g(xv)=yh(v)\) for \(x\in X\), \(y=g(x)\), and \(h=g|_x\) is written as equality of compositions of transformations:

$$\begin{aligned} g\cdot x=y\cdot h. \end{aligned}$$

Consider the set \(X\cdot G\) of compositions of the form \(x\cdot g\), i.e., transformations \(v\mapsto xg(v)\), \(v\in X^*\). It is closed with respect to pre- and post-compositions with the elements of G:

$$\begin{aligned} (x\cdot g)\cdot h=x\cdot (gh),\qquad h\cdot (x\cdot g)=h(x)\cdot (h|_xg). \end{aligned}$$

We get in this way a biset, i.e., a set with two commuting left and right actions of the group G.

Let \(\Bbbk \) be a field, and let \(\Bbbk [G]\) be the group ring over \(\Bbbk \). Denote by \(\Phi \) the vector space \(\Bbbk ^{X\cdot G}\) spanned by \(X\cdot G\). Then the left and the right actions of G on \(X\cdot G\) are extended by linearity to a structure of a \(\Bbbk [G]\)-bimodule on \(\Phi \). We will denote by \({}_G\Phi \) and \(\Phi _G\) the space \(\Phi \) seen as a left and a right \(\Bbbk [G]\)-module, respectively.

It follows directly from the definition of the right action of G on \(X\cdot G\) that X (identified with \(X\cdot 1\)) is a free basis of \(\Phi _G\). The left action is not free in general, since it is possible to have \(g(xv)=xv\) for all \(v\in X^*\) and for a non-trivial element \(g\in G\), which will imply, by definition of the left action, that \(g\cdot x=x\).

For every element \(a\in \Bbbk [G]\) the map \(v\mapsto a\cdot v\) for \(v\in \Phi \) is an endomorphism of \(\Phi _G\), denoted \(\Xi (a)\). The map \(\Xi :\Bbbk [G]\longrightarrow \mathop {\mathrm {End}}\Phi _G\) is obviously a homomorphism of \(\Bbbk \)-algebras.

After fixing a basis of the right module \(\Phi _G\) (for example X), we can identify the algebra of endomorphisms \(\mathop {\mathrm {End}}\Phi _G\) of the right \(\Bbbk [G]\)-module \(\Phi _G\) with the algebra of \(|X|\times |X|\) matrices over \(\Bbbk [G]\). In this case the homomorphism \(\Xi :\Bbbk [G]\longrightarrow \mathsf {M}_{|X|}(\Bbbk [G])\cong \mathop {\mathrm {End}}\Phi _G\) is called the matrix recursion associated with the self-similar group G (and the basis of the right module).

More explicitly, if \(B=\{e_1, \ldots , e_d\}\) is a basis of the right \(\Bbbk [G]\)-module \(\Phi _G\), then, for \(a\in \Bbbk [G]\) the matrix \(\Xi (a)=(a_{i, j})_{1\le i, j\le d}\in \mathsf {M}_d(\Bbbk [G])\) is given by the condition

$$\begin{aligned} a\cdot e_j=\sum _{i=1}^de_i\cdot a_{i, j}. \end{aligned}$$

If we use the basis \(\{x_1, x_2, \ldots , x_d\}=X\) of the right module \(\Phi _G\), then the matrix recursion \(\Xi \) is a direct rewriting of the wreath recursion \(\phi :G\longrightarrow \mathop {\mathrm {Symm}}(X)\wr G\) in matrix terms. Namely, \(\Xi (g)\) is the matrix with entries \(a_{ij}\), \(1\le i, j\le d\), given by the rule

$$\begin{aligned} a_{ij}=\left\{ \begin{array}{l@{\quad }l} g|_{x_j} &{} \text {if } g(x_j)=x_i, \\ 0 &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(5)

Example 2.5

The adding machine recursion \(a=\sigma (1, a)\) is written in terms of the bimodule as

$$\begin{aligned} a\cdot x_0=x_1,\qquad a\cdot x_1=x_0\cdot a, \end{aligned}$$

where \(x_0, x_1\) are identified with the symbols 0, 1, respectively, from Example 2.3.

It follows that the recursion is written in matrix form as

$$\begin{aligned} \Xi (a)=\left( \begin{array}{c@{\quad }c} 0 &{} a\\ 1 &{} 0\end{array}\right) . \end{aligned}$$

The recursive definition of the generators a, b, c, d of the Grigorchuk group is written as

$$\begin{aligned} \Xi (a)=\left( \begin{array}{c@{\quad }c} 0 &{} 1\\ 1 &{} 0\end{array}\right) ,\quad \Xi (b)=\left( \begin{array}{c@{\quad }c} a &{} 0\\ 0 &{} c\end{array}\right) ,\quad \Xi (c)=\left( \begin{array}{c@{\quad }c} a &{} 0\\ 0 &{} d\end{array}\right) ,\quad \Xi (d)=\left( \begin{array}{c@{\quad }c} 1 &{} 0\\ 0 &{} b\end{array}\right) . \end{aligned}$$

When we change the basis of the right module \(\Phi _G\), we just conjugate the map \(\Xi \) by the transition matrix. Namely, if \(\{x_1, \ldots , x_d\}\) and \(\{y_1, \ldots , y_d\}\) are bases of the right module \(\Phi _G\), then we can write \(y_j=\sum _{i=1}^d x_i\cdot b_{i, j}\) for \(b_{i, j}\in \Bbbk [G]\). Then the matrix \(T=(b_{i, j})_{1\le i, j\le d}\) is the transition matrix from the basis \(\{x_i\}_{1\le i\le d}\) to the basis \(\{y_i\}_{1\le i\le d}\).

Example 2.6

Consider again the adding machine example. Let us take, instead of the standard basis \(\{x_0, x_1\}=X\), the basis \(y_0=x_0+x_1\), \(y_1=x_1\). (Here we replace the letters 0, 1 of the binary alphabet by \(x_0\) and \(x_1\), respectively, in order not to confuse them with the elements \(0, 1\in \Bbbk [G]\).) Then the transition matrix to the new basis is \(T=\left( \begin{array}{c@{\quad }c} 1 &{} 0\\ 1 &{} 1\end{array}\right) \). Its inverse is \(T^{-1}=\left( \begin{array}{c@{\quad }c}1 &{} 0\\ -1 &{} 1\end{array}\right) \). Consequently, the matrix recursion in the new basis is

$$\begin{aligned} a\mapsto T^{-1}\left( \begin{array}{c@{\quad }c} 0 &{} a\\ 1 &{} 0\end{array}\right) T=\left( \begin{array}{c@{\quad }c} a &{} a\\ 1-a &{} -a\end{array}\right) . \end{aligned}$$

This can be checked directly:

$$\begin{aligned} a\cdot y_0=a\cdot (x_0+x_1)=x_1+x_0\cdot a=y_1+(y_0-y_1)\cdot a=y_0\cdot a+y_1\cdot (1-a) \end{aligned}$$

and

$$\begin{aligned} a\cdot y_1=a\cdot x_1=x_0\cdot a=y_0\cdot a-y_1\cdot a. \end{aligned}$$

If we take the basis \(\{y_0=x_0, y_1=x_1\cdot a\}\), then the matrix recursion becomes

$$\begin{aligned} a\mapsto \left( \begin{array}{c@{\quad }c} 1 &{} 0\\ 0 &{} a^{-1}\end{array}\right) \cdot \left( \begin{array}{c@{\quad }c} 0 &{} a \\ 1 &{} 0\end{array}\right) \cdot \left( \begin{array}{c@{\quad }c} 1 &{} 0 \\ 0 &{} a \end{array}\right) = \left( \begin{array}{c@{\quad }c} 0 &{} a^2 \\ a^{-1} &{} 0\end{array}\right) . \end{aligned}$$

If the basis is a subset of \(X\cdot G\), then the matrix recursion corresponds to a wreath recursion \(G\longrightarrow \mathop {\mathrm {Symm}}(X)\wr G\). For instance, in the last example the matrix recursion corresponds to the wreath recursion

$$\begin{aligned} a\mapsto \sigma (a^{-1}, a^2). \end{aligned}$$

This wreath recursion describes the process of adding 1 to dyadic integers in the binary numeration system with digits 0 and 3. For more on changes of bases in the biset \(X\cdot G\) and the corresponding transformations of the wreath recursion, see [35, 36].

If \(\Phi _1\) and \(\Phi _2\) are bimodules over a \(\Bbbk \)-algebra A, then their tensor product \(\Phi _1\otimes \Phi _2\) is the quotient of the \(\Bbbk \)-vector space \(\Phi _1\otimes _\Bbbk \Phi _2\) by the subspace generated by the elements of the form

$$\begin{aligned} v_1\otimes (a\cdot v_2)-(v_1\cdot a)\otimes v_2 \end{aligned}$$

for \(v_1\in \Phi _1\), \(v_2\in \Phi _2\), \(a\in A\). It is an A-bimodule with respect to the actions \(a\cdot (v_1\otimes v_2)=(a\cdot v_1)\otimes v_2\) and \((v_1\otimes v_2)\cdot a=v_1\otimes (v_2\cdot a)\).

If \(\Phi _2\) is a left A-module, and \(\Phi _1\) is an A-bimodule, then the left module \(\Phi _1\otimes \Phi _2\) is defined in the same way.

Let \(\Phi \), as above, be the bimodule associated with a self-similar group G. Then X is a basis of the right \(\Bbbk [G]\)-module \(\Phi _G\), and the set

$$\begin{aligned} X^n=\{x_1\otimes \cdots \otimes x_n\;:\;x_i\in X\} \end{aligned}$$

is a basis of the right module \(\Phi ^{\otimes n}_G\), which is hence a free module. Note that \(X^n\cdot G\) is the basis of \(\Phi ^{\otimes n}\) as a vector space over \(\Bbbk \).

We identify \(x_1\otimes \cdots \otimes x_n\) with the word \(x_1\cdots x_n\). The left module structure on \(\Phi ^{\otimes n}\) is given by a rule similar to the definition of \(\Phi \):

$$\begin{aligned} g\cdot v=g(v)\cdot g|_v \end{aligned}$$
(6)

for \(v\in X^n\) and \(g\in G\). In particular, up to an ordering of the basis \(X^n\), the associated matrix recursion \(\Xi ^n:\Bbbk [G]\longrightarrow \mathsf {M}_{|X^n|}(\Bbbk [G])\) is obtained from the recursion \(\Xi ^{n-1}:\Bbbk [G]\longrightarrow \mathsf {M}_{|X^{n-1}|}(\Bbbk [G])\) by replacing every entry \(a_{ij}\) of the matrix \(\Xi ^{n-1}(a)\) by the matrix \(\Xi (a_{ij})\).

Example 2.7

The matrix recursion \(G\longrightarrow \mathsf {M}_4(\Bbbk [G])\) for the adding machine (in the standard basis \(X^2\)) is

$$\begin{aligned} a\mapsto \left( \begin{array}{c@{\quad }c|c@{\quad }c}0 &{} 0 &{} 0 &{} a\\ 0 &{} 0 &{} 1 &{} 0\\ \hline 1 &{} 0 &{} 0 &{} 0\\ 0 &{} 1 &{} 0 &{} 0\end{array}\right) , \end{aligned}$$

which is obtained by iterating the matrix recursion

$$\begin{aligned} a\mapsto \left( \begin{array}{c@{\quad }c}0 &{} a\\ 1 &{} 0\end{array}\right) . \end{aligned}$$

In this case the basis \(X^2\) is ordered in the lexicographic order \(x_0x_0<x_0x_1<x_1x_0<x_1x_1\). But since a is the adding machine, and it describes adding 1 to a dyadic integer written in such a way that the less significant digits come before the more significant ones, it is more natural to order the basis in the inverse lexicographic order \(x_0x_0<x_1x_0<x_0x_1<x_1x_1\). In this case the matrix recursion becomes

$$\begin{aligned} a\mapsto \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}0 &{} 0 &{} 0 &{} a\\ 1 &{} 0 &{} 0 &{} 0\\ 0 &{} 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0\end{array}\right) . \end{aligned}$$
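The block-substitution step producing \(\Xi ^n\) from \(\Xi ^{n-1}\) is mechanical and can be sketched in a few lines of Python (our code; entries are kept as the symbols '0', '1', 'a'). Two iterations starting from a reproduce the \(4\times 4\) matrix of this example in the lexicographic order of \(X^2\):

```python
# One step of the matrix recursion of Example 2.5: replace each entry by a block.
BLOCKS = {
    'a': [['0', 'a'], ['1', '0']],   # Xi(a)
    '1': [['1', '0'], ['0', '1']],   # identity
    '0': [['0', '0'], ['0', '0']],   # zero
}

def iterate(matrix):
    size = len(matrix)
    result = [[None] * (2 * size) for _ in range(2 * size)]
    for i in range(size):
        for j in range(size):
            block = BLOCKS[matrix[i][j]]
            for di in range(2):
                for dj in range(2):
                    result[2 * i + di][2 * j + dj] = block[di][dj]
    return result

m = iterate(iterate([['a']]))   # Xi^2(a), a 4 x 4 matrix
for row in m:
    print(' '.join(row))        # prints the matrix of Example 2.7
```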

Proposition 2.8

Let T be the transition matrix from a basis X of \(\Phi _G\) to a basis Y. Suppose that all entries of T are elements of \(\Bbbk \). Then the transition matrix \(T_n\) from the basis \(X^{\otimes n}\) to \(Y^{\otimes n}\) is equal to

$$\begin{aligned} T_n=\underbrace{T\otimes T\otimes \cdots \otimes T}_{n}, \end{aligned}$$

where \(\otimes \) is the Kronecker product of matrices.

Proof

Let \(T_{n-1}=(a_{u, v})_{u\in X^{\otimes (n-1)}, v\in Y^{\otimes (n-1)}}\), i.e.,

$$\begin{aligned} v=\sum _{u\in X^{\otimes (n-1)}}u\cdot a_{u, v} \end{aligned}$$

for all \(v\in Y^{\otimes (n-1)}\). Similarly, denote \(T=(a_{x, y})_{x\in X, y\in Y}\). Then

$$\begin{aligned} y\otimes v&= y\otimes \sum _{u\in X^{\otimes (n-1)}}u\cdot a_{u, v}=\left( \sum _{x\in X}x\cdot a_{x, y}\right) \otimes \sum _{u\in X^{\otimes (n-1)}}u\cdot a_{u, v} \\&= \sum _{x\otimes u\in X^{\otimes n}}(x\cdot a_{x, y})\otimes (u\cdot a_{u, v})= \sum _{x\otimes u\in X^{\otimes n}}x\otimes u\cdot a_{x, y}a_{u, v}, \end{aligned}$$

which shows that

$$\begin{aligned} a_{xu, yv}=a_{x, y}a_{u, v}, \end{aligned}$$

which agrees with the definition of the Kronecker product. \(\square \)

In other words, we can write

$$\begin{aligned} T_n=T^{(n)}\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}T_{n-1} &{} 0 &{} \cdots &{} 0\\ 0 &{} T_{n-1} &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} T_{n-1}\end{array}\right) , \end{aligned}$$
(7)

where \(T_1=T\), and \(T^{(n)}\) is the matrix T in which each entry \(a_{ij}\) is replaced by \(a_{ij}\) times the unit matrix of dimension \(|X|^{n-1}\times |X|^{n-1}\). Here the rows and columns of \(T_n\) correspond to the elements of \(X^{\otimes n}\) and \(Y^{\otimes n}\), respectively, ordered lexicographically.

It is easy to see from the proof that in the general case (when not all entries of T are elements of \(\Bbbk \)) formula (7) remains true if we replace \(T^{(n)}\) by the image of T under the \((n-1)\)st iteration of the matrix recursion (in the basis X).
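For a scalar transition matrix, Proposition 2.8 and formula (7) are easy to verify numerically. A short sketch (our code, using the matrix T of Example 2.6 viewed over \(\mathbb {R}\)):

```python
import numpy as np

T = np.array([[1.0, 0.0], [1.0, 1.0]])     # transition matrix of Example 2.6

T_prev = T
for n in range(2, 6):
    size = T_prev.shape[0]
    T_n = np.kron(T, T_prev)               # T_n = T (x) T_{n-1}, Kronecker product
    # Formula (7): T^{(n)} is T with each entry multiplied by the identity block,
    # and the second factor is the block-diagonal matrix diag(T_{n-1}, T_{n-1}).
    via_7 = np.kron(T, np.eye(size)) @ np.kron(np.eye(2), T_prev)
    assert np.allclose(T_n, via_7)
    T_prev = T_n
```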

Example 2.9

Let \(\Bbbk =\mathbb {R}\), \(X=\{x_0, x_1\}\). Consider a new basis of \(\Phi _G\)

$$\begin{aligned} \left\{ y_0=\frac{x_0+x_1}{\sqrt{2}}, y_1=\frac{x_0-x_1}{\sqrt{2}}\right\} . \end{aligned}$$

The transition matrix to the new basis is

$$\begin{aligned} \left( \begin{array}{r@{\quad }r} 1/\sqrt{2} &{} 1/\sqrt{2}\\ 1/\sqrt{2} &{} -1/\sqrt{2}\end{array}\right) =\frac{1}{\sqrt{2}}\left( \begin{array}{r@{\quad }r} 1 &{} 1 \\ 1 &{} -1\end{array}\right) . \end{aligned}$$

Then the transition matrix from \(X^{\otimes n}\) to \(Y^{\otimes n}\) satisfies the recursion

$$\begin{aligned} H_n=\frac{1}{\sqrt{2}}\left( \begin{array}{r@{\quad }r} 1 &{} 1 \\ 1 &{} -1\end{array}\right) \left( \begin{array}{c@{\quad }c} H_{n-1} &{} 0 \\ 0 &{} H_{n-1}\end{array}\right) =\frac{1}{\sqrt{2}}\left( \begin{array}{r@{\quad }r} H_{n-1} &{} H_{n-1} \\ H_{n-1} &{} -H_{n-1}\end{array}\right) . \end{aligned}$$

2.4 Inductive limit of \(\Bbbk ^{X^n}\)

Let \(\Bbbk ^{X^n}\) be the vector space of functions \(X^n\longrightarrow \Bbbk \). It is naturally isomorphic to the nth tensor power of \(\Bbbk ^X\). The isomorphism maps an elementary tensor \(f_1\otimes f_2\otimes \cdots \otimes f_n\) to the function

$$\begin{aligned} f_1\otimes f_2\otimes \cdots \otimes f_n(x_1x_2\cdots x_n)= f_1(x_1)f_2(x_2)\cdots f_n(x_n). \end{aligned}$$

More generally, we have natural isomorphisms \(\Bbbk ^{X^n}\otimes \Bbbk ^{X^m}\cong \Bbbk ^{X^{n+m}}\) defined by the equality

$$\begin{aligned} f_1\otimes f_2(x_1x_2\cdots x_{n+m})=f_1(x_1x_2\cdots x_n)f_2(x_{n+1}x_{n+2}\cdots x_{n+m}). \end{aligned}$$

We denote by \(\delta _v\), for \(v\in X^n\), the delta-function of v, i.e., the characteristic function of \(\{v\}\). It is an element of \(\Bbbk ^{X^n}\). Note that

$$\begin{aligned} \delta _{x_1x_2\cdots x_n}=\delta _{x_1}\otimes \delta _{x_2}\otimes \cdots \otimes \delta _{x_n}, \end{aligned}$$

with respect to the above identification of \(\Bbbk ^{X^{n+m}}\) with \(\Bbbk ^{X^n}\otimes \Bbbk ^{X^m}\).

Let \(G\le \mathop {\mathrm {Aut}}(X^*)\). Denote by \(\pi _n\) the natural permutational representation of G on \(\Bbbk ^{X^n}\) coming from the action of G on \(X^n\). It is given by the rule \(\pi _n(g)(\delta _v)=\delta _{g(v)}\), i.e., by

$$\begin{aligned} \pi _n(g)(f)(v)=f(g^{-1}(v)),\qquad f\in \Bbbk ^{X^n},\quad v\in X^n. \end{aligned}$$

Denote by \(V_n\) the vector space \(\Bbbk ^{X^n}\) seen as a left \(\Bbbk [G]\)-module of the representation \(\pi _n\), and by \([\varepsilon ]\) the left \(\Bbbk [G]\)-module of the trivial representation of G. More explicitly, it is a one-dimensional vector space over \(\Bbbk \) spanned by an element \(\varepsilon \), together with the left action of \(\Bbbk [G]\) given by the rule

$$\begin{aligned} g\cdot \varepsilon =\varepsilon \end{aligned}$$

for all \(g\in G\). The following proposition is a direct corollary of (6).

Proposition 2.10

The left module \(V_n\) is isomorphic to \(\Phi ^{\otimes n}\otimes [\varepsilon ]\). The isomorphism is the \(\Bbbk \)-linear extension of the map \(\delta _{x_1x_2\cdots x_n}\mapsto x_1\otimes x_2\otimes \cdots \otimes x_n\otimes \varepsilon \) for \(x_i\in X\).

Denote by \(\mathbf {1}\) the function \(\sum _{x\in X}\delta _x\in V_1\) taking constant value \(1\in \Bbbk \). We have then, for every \(f\in V_n=\Bbbk ^{X^n}\),

$$\begin{aligned} f\otimes \mathbf {1}(x_1x_2\cdots x_{n+1})=f(x_1x_2\cdots x_n). \end{aligned}$$

The following proposition is straightforward.

Proposition 2.11

The map \(\iota _n:f\mapsto f\otimes \mathbf {1}:V_n\longrightarrow V_{n+1}\) is an embedding of left \(\Bbbk [G]\)-modules. In other words,

$$\begin{aligned} \pi _{n+1}(g)(f\otimes \mathbf {1})=\pi _n(g)(f)\otimes \mathbf {1}\end{aligned}$$

for all \(g\in G\) and \(f\in V_n\).

The space \(X^\omega =\{x_1x_2\cdots \;:\;x_i\in X\}\) has the natural topology of a direct (Tychonoff) power of the discrete space X. A basis of this topology consists of the cylindrical sets \(vX^\omega \), for \(v\in X^*\).

Denote by \(C(X^\omega , \Bbbk )\) the vector space of maps \(f:X^\omega \longrightarrow \Bbbk \) such that \(f^{-1}(a)\) is open and closed (clopen) for every \(a\in \Bbbk \). In other words, \(C(X^\omega , \Bbbk )\) is the space of all continuous maps \(f:X^\omega \longrightarrow \Bbbk \), where \(\Bbbk \) is taken with discrete topology. Note that the set of values of any element of \(C(X^\omega , \Bbbk )\) is finite, since \(X^\omega \) is compact.

For example, a map \(f:X^\omega \longrightarrow \mathbb {R}\) belongs to \(C(X^\omega , \mathbb {R})\) if and only if it is continuous and has a finite set of values.

The group G acts naturally on \(X^\omega \) by homeomorphisms, hence it also acts naturally on the space \(C(X^\omega , \Bbbk )\) by the rule

$$\begin{aligned} g(\xi )(w)=\xi (g^{-1}(w)) \end{aligned}$$

for \(g\in G\), \(\xi \in C(X^\omega , \Bbbk )\), and \(w\in X^\omega \).

For every \(f\in V_n=\Bbbk ^{X^n}\) consider the natural extension of \(f:X^n\longrightarrow \Bbbk \) to a function on \(X^\omega \):

$$\begin{aligned} f(x_1x_2\cdots )=f(x_1x_2\cdots x_n). \end{aligned}$$

For example, the delta-function \(\delta _v\) is extended to the characteristic function of the subset \(vX^\omega \), which we will also denote \(\delta _v\).

It is easy to see that this defines an embedding of \(\Bbbk [G]\)-modules \(V_n\longrightarrow C(X^\omega , \Bbbk )\). Moreover, these embeddings agree with the embeddings \(\iota _n:V_n\longrightarrow V_{n+1}\).

Denote by \(V_\infty \) the direct limit of the G-modules \(V_n\) with respect to the maps \(\iota _n\). We will denote by \(\pi _\infty \) the corresponding representation of G on \(V_\infty \).

Proposition 2.12

The module \(V_\infty \) is naturally isomorphic to the left \(\Bbbk [G]\)-module \(C(X^\omega , \Bbbk )\).

Proof

Let \(f\in C(X^\omega , \Bbbk )\). The set \(\{f^{-1}(t)\;:\;t\in \Bbbk , f^{-1}(t)\ne \emptyset \}\) is a finite covering of \(X^\omega \) by disjoint clopen sets. Every clopen subset of \(X^\omega \) is a finite union of cylindrical sets of the form \(vX^\omega \), for \(v\in X^*\). Consequently, there exists n such that f is constant on every cylindrical set of the form \(vX^\omega \) for \(v\in X^n\). Then \(f\in \Bbbk ^{X^n}\) under the identification of \(\Bbbk ^{X^n}\) with a subspace of \(C(X^\omega , \Bbbk )\) described above. It follows that the inductive limit of the spaces \(\Bbbk ^{X^n}\) coincides with \(C(X^\omega , \Bbbk )\). We have already seen that the representations \(\pi _n\) agree with the representation of G on \(C(X^\omega , \Bbbk )\) restricted to \(V_n=\Bbbk ^{X^n}\), which finishes the proof. \(\square \)

Let \(\mathsf {B}\) be a basis of the \(\Bbbk \)-vector space \(\Bbbk ^X\) such that the constant one function \(\mathbf {1}\) belongs to \(\mathsf {B}\). Then \(\mathsf {B}^{\otimes n}\) is a basis of the \(\Bbbk \)-vector space \(\Bbbk ^{X^n}=V_n\), and we have \(\iota _n(\mathsf {B}^{\otimes n})\subset \mathsf {B}^{\otimes {n+1}}\). Then the inductive limit \(\mathsf {B}_\infty \) of the bases \(\mathsf {B}^{\otimes n}\) with respect to the maps \(\iota _n\) is a basis of \(C(X^\omega , \Bbbk )=V_\infty \). The elements of this basis are equal to functions of the form

$$\begin{aligned} f(x_1x_2\cdots )=f_1(x_1)f_2(x_2)\cdots , \end{aligned}$$

where \(f_i\in \mathsf {B}\) and all but a finite number of the functions \(f_i\) are equal to the constant one.

Example 2.13

Suppose that the field \(\Bbbk \cong \mathbb {F}_q\) is finite, and let \(X=\Bbbk \). Then the functions \(e_k:X\longrightarrow \Bbbk :x\mapsto x^k\) for \(k=1, 2,\ldots , q-1\), together with the constant one function \(\mathbf {1}\) (formally denoted \(x^0\)), form a basis of \(V_1\).

The corresponding basis of \(C(X^\omega , \Bbbk )\) is equal to the set of all monomial functions

$$\begin{aligned} f(x_1, x_2, \ldots )=x_1^{k_1}x_2^{k_2}\cdots , \end{aligned}$$

where all but a finite number of powers \(k_i\) are equal to zero.

Writing the elements of \(C(X^\omega , \Bbbk )\) in this basis amounts to representing them as polynomials.

Example 2.14

Let \(X=\{x_0, x_1\}\), \(\mathop {\mathrm {char}}\Bbbk \ne 2\), and let \(\mathsf {W}\) be the basis of \(\Bbbk ^{X}\) consisting of the functions \(y_0=\delta _{x_0}+\delta _{x_1}=\mathbf {1}\) and \(y_1=\delta _{x_0}-\delta _{x_1}\). The corresponding basis \(\mathsf {W}_\infty \) of \(C(X^\omega , \Bbbk )\) is called the Walsh basis, see [51].

For \(\Bbbk =\mathbb {C}\), the Walsh basis is an orthonormal set of complex-valued functions on \(X^\omega \) with respect to the uniform Bernoulli measure on \(X^\omega \). This is a direct corollary of the fact that \(\{y_0, y_1\}\) is orthonormal. Since \(\mathsf {W}_\infty \) is a basis of the linear space of continuous functions \(X^\omega \longrightarrow \mathbb {C}\) with finite sets of values, and this space is dense in the Hilbert space \(L^2(X^\omega )\), the Walsh basis is an orthonormal basis of \(L^2(X^\omega )\).

We can use Proposition 2.8 to find the transition matrices from \(\{\delta _v\}_{v\in X^n}\) to the basis \(\mathsf {W}^{\otimes n}\) (just apply the proposition to the case of the trivial group G). In the case of the Walsh basis we get the matrices from Example 2.9, but without the factor \(1/\sqrt{2}\):

$$\begin{aligned} H_n=\left( \begin{array}{c@{\quad }c} H_{n-1} &{} H_{n-1} \\ H_{n-1} &{} -H_{n-1}\end{array}\right) , \end{aligned}$$

compare with Example 2.9. These matrices are examples of Hadamard matrices (i.e., matrices whose entries are \(+1\) and \(-1\) and whose rows are mutually orthogonal) and were constructed for the first time by Sylvester [46]. They are also called Walsh matrices.

Fig. 1 Walsh basis

See Fig. 1, where the graphs of the first eight elements of the Walsh basis are shown. Here we identify \(\{0, 1\}^\omega \) with the unit interval [0, 1] via the binary numeration system for real numbers.
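The connection between the Walsh basis and the matrices \(H_n\) can be made explicit: identifying the letter \(x_b\) with the bit b, the basis element with exponent bits \((k_1, \ldots , k_n)\) takes the value \((-1)^{k_1x_1+\cdots +k_nx_n}\) at the word \(x_1\cdots x_n\). The following sketch (our code and indexing conventions: most significant bit first) checks that these values are the entries of \(H_n\), and that the rows are orthogonal:

```python
import numpy as np

def hadamard(n):
    """Sylvester's recursion H_n = [[H, H], [H, -H]], starting from H_0 = (1)."""
    h = np.array([[1]])
    for _ in range(n):
        h = np.block([[h, h], [h, -h]])
    return h

def walsh_value(k, x):
    """Value of the Walsh basis element with exponent bits k at the word x."""
    return (-1) ** sum(ki * xi for ki, xi in zip(k, x))

n = 3
H = hadamard(n)
bits = lambda m: tuple((m >> s) & 1 for s in reversed(range(n)))
for i in range(2 ** n):
    for j in range(2 ** n):
        assert H[i, j] == walsh_value(bits(i), bits(j))
assert np.array_equal(H @ H.T, 2 ** n * np.eye(2 ** n, dtype=int))  # orthogonal rows
```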

Example 2.15

A related basis of \(C(X^\omega , \Bbbk )\) is the Haar basis, which is constructed in the following way. Again, we assume that the characteristic of \(\Bbbk \) is different from 2, and that \(X=\{x_0, x_1\}\). Let \(y_0=\mathbf {1}\) and \(y_1=\delta _{x_0}-\delta _{x_1}\), as in the previous example. Let us construct an increasing sequence of bases \(\mathsf {Y}_n\) of \(\Bbbk ^{X^n}<C(X^\omega , \Bbbk )\) in the following way. Let \(\mathsf {Y}_0=\{y_0\}\), and define inductively:

$$\begin{aligned} \mathsf {Y}_{n+1}=\mathsf {Y}_n\cup \{\delta _v\otimes y_1\;:\;v\in X^n\}. \end{aligned}$$

Note that, since \(\{\delta _v\;:\;v\in X^n\}\) is a basis of \(\Bbbk ^{X^n}\), the set \(\{\delta _v\otimes y_0\;:\;v\in X^n\}\cup \{\delta _v\otimes y_1\;:\;v\in X^n\}\) is a basis of \(\Bbbk ^{X^{n+1}}\). But \(\{\delta _v\otimes y_0\;:\;v\in X^n\}=\{\delta _v\;:\;v\in X^n\}\) is a basis of \(\Bbbk ^{X^n}\), since \(y_0=\mathbf {1}\). Consequently, \(\mathsf {Y}_{n+1}\) is a basis of \(\Bbbk ^{X^{n+1}}\). (Here and in what follows, the spaces \(\Bbbk ^{X^n}\) are identified with the corresponding subspaces of \(C(X^\omega , \Bbbk )\).)

In the case \(\Bbbk =\mathbb {C}\), identifying \(C(X^\omega , \mathbb {C})\) with a linear subspace of \(L^2(X^\omega , \mu )\), where \(\mu \) is the uniform Bernoulli measure on \(X^\omega \), it makes sense to normalize the elements of \(\mathsf {Y}_n\) in order to make them of norm one. Since the norm of \(\delta _v\), for \(v\in X^n\), is equal to \(2^{-n/2}\), the recurrent definition of the basis in this case is

$$\begin{aligned} \mathsf {Y}_{n+1}=\mathsf {Y}_n\cup \left\{ 2^{n/2}\,\delta _v\otimes y_1\;:\;v\in X^n\right\} . \end{aligned}$$

It is easy to see that the union \(\mathsf {Y}_\infty \) of the bases \(\mathsf {Y}_n\) is an orthonormal basis of \(L^2(X^\omega , \mu )\). It is called the Haar basis. See its use in the context of groups acting on rooted trees in [6].
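The recursive construction of \(\mathsf {Y}_n\) is easy to carry out explicitly. In the following sketch (our encoding: a function on \(X^n\) is a vector of length \(2^n\), indexed lexicographically) we build the normalized basis and verify its orthonormality with respect to the uniform Bernoulli measure:

```python
import numpy as np

def haar_basis(n):
    basis = [np.array([1.0])]            # Y_0 = {y_0 = 1}
    for level in range(n):
        # Pass from X^level to X^{level+1}: f -> f (x) 1 duplicates each value.
        basis = [np.repeat(f, 2) for f in basis]
        for v in range(2 ** level):      # add 2^{level/2} delta_v (x) y_1
            g = np.zeros(2 ** (level + 1))
            g[2 * v], g[2 * v + 1] = 1.0, -1.0
            basis.append(2 ** (level / 2) * g)
    return basis

n = 3
B = np.column_stack(haar_basis(n))
gram = B.T @ B / 2 ** n                  # inner product with weight 2^{-n}
assert np.allclose(gram, np.eye(2 ** n))
```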

3 Automata

3.1 Mealy and Moore automata

Definition 3.1

A Mealy automaton (or Mealy machine) is a tuple

$$\begin{aligned} \mathfrak {A}=(Q, X, Y, \pi , \tau , q_0), \end{aligned}$$

where

  • Q is the set of states of the automaton;

  • X and Y are the input and output alphabets of the automaton;

  • \(\pi :Q\times X\longrightarrow Q\) is the transition map;

  • \(\tau :Q\times X\longrightarrow Y\) is the output map;

  • \(q_0\in Q\) is the initial state.

We always assume that X and Y are finite and have more than one element each.

We frequently assume that \(X=Y\), and say that the automaton is defined over the alphabet X. The automaton is finite if the set Q is finite. In some cases, we do not assume that an initial state is chosen.

Let \(\mathfrak {A}=(Q, X, Y, \pi , \tau , q_0)\) be a Mealy automaton. Let us extend the definition of the maps \(\pi \) and \(\tau \) to maps \(\pi :Q\times X^*\longrightarrow Q\) and \(\tau :Q\times X^*\longrightarrow X\) by the inductive rules

$$\begin{aligned} \pi (q, xv)=\pi (\pi (q, x), v),\qquad \tau (q, xv)=\tau (\pi (q, x), v). \end{aligned}$$

We interpret the automaton \(\mathfrak {A}\) as a machine which, being in a state \(q\in Q\) and reading a letter \(x\in X\), goes to the state \(\pi (q, x)\) and outputs the letter \(\tau (q, x)\). If the machine starts at the state \(q\in Q\) and reads a word v, then its final state will be \(\pi (q, v)\), and the last letter it outputs will be \(\tau (q, v)\).

Definition 3.2

The transformation \(\mathfrak {A}_{q_0}:X^*\longrightarrow X^*\) or \(\mathfrak {A}_{q_0}:X^\omega \longrightarrow X^\omega \) defined by a Mealy automaton \(\mathfrak {A}=(Q, X, Y, \pi , \tau , q_0)\) is the map

$$\begin{aligned} \mathfrak {A}_{q_0}(x_0x_1x_2\cdots )=\tau (q_0, x_0)\tau (q_1, x_1)\tau (q_2, x_2)\ldots , \end{aligned}$$
(8)

where \(q_{i+1}=\pi (q_i, x_i)\).

In other words, \(\mathfrak {A}_{q_0}(v)\) is the word that the machine gives on output when it reads the word v on input, with \(q_0\) as its initial state.
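Running a Mealy automaton given by its transition and output tables is straightforward. A minimal Python sketch (our encoding; the state names are ours), applied to the adding machine of Example 2.3 with the less significant bits first:

```python
def run_mealy(pi, tau, q, word):
    """Compute the output word per (8): emit tau(state, letter), then move on."""
    out = []
    for x in word:
        out.append(tau[q, x])
        q = pi[q, x]
    return ''.join(out)

# The binary adding machine: state 'a' adds one (carrying), 'e' copies.
pi  = {('a', '0'): 'e', ('a', '1'): 'a', ('e', '0'): 'e', ('e', '1'): 'e'}
tau = {('a', '0'): '1', ('a', '1'): '0', ('e', '0'): '0', ('e', '1'): '1'}

# '1101' encodes 11 (least significant bit first); 11 + 1 = 12 is '0011'.
assert run_mealy(pi, tau, 'a', '1101') == '0011'
```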

Example 3.3

Let \(G\le \mathop {\mathrm {Aut}}(X^*)\) be a self-similar group. Consider the corresponding full automaton with the set of states \(Q=G\), and with transition and output functions defined by the rules:

$$\begin{aligned} \pi (g, x)=g|_x,\qquad \tau (g, x)=g(x). \end{aligned}$$

It follows from (4) that if we choose \(g\in G\) as the initial state, then the transformations of \(X^*\) and \(X^\omega \) defined by this automaton coincide with the original transformations defined by \(g\in \mathop {\mathrm {Aut}}(X^*)\).

This automaton is infinite, but if \(G\le \mathop {\mathrm {FAut}}(X^*)\), then for every \(g\in G\) the set \(\{g|_v\;:\;v\in X^*\}\) is finite, and we can take it as the set of states of a finite automaton defining the transformation g.

A special type of Mealy automata are the Moore automata. The definition of a Moore automaton is the same as Definition 3.1, except that the output function is a map \(\tau :Q\longrightarrow Y\), i.e., the output depends only on the state and does not depend on the input letter.

Moore automata also act on words, essentially in the same way as Mealy automata. We can extend the definition of the transition function \(\pi \) to \(Q\times X^*\) by the same formula as for the Mealy automata. Then the action of a Moore automaton with initial state \(q_0\) on words is given by the rule

$$\begin{aligned} \mathfrak {A}_{q_0}(x_1x_2\cdots )=\tau (q_1)\tau (q_2)\ldots , \end{aligned}$$
(9)

where \(q_{i+1}=\pi (q_i, x_{i+1})\).

Even though the definition of a Moore automaton seems to be more restrictive than the definition of a Mealy automaton, the two notions are essentially equivalent, as any Mealy automaton can be modeled by a Moore automaton. Hence, the set of maps defined by finite Mealy automata coincides with the set of maps defined by finite Moore automata.

Let \(\mathfrak {A}=(Q, X, Y, \pi , \tau , q_0)\) be a Mealy automaton. Consider the Moore automaton \(\mathfrak {A}'\) over the input and output alphabets X and Y, respectively, with the set of states \(Q\times X\cup \{p_0\}\), where \(p_0\) is an element not belonging to \(Q\times X\), and with the transition and output maps \(\pi '\) and \(\tau '\) given by the rules

$$\begin{aligned} \pi '(p_0, x)=(q_0, x),\quad \pi '((q, x_1), x_2)=(\pi (q, x_1), x_2), \end{aligned}$$

and

$$\begin{aligned} \tau '(q, x_1)=\tau (q, x_1), \end{aligned}$$

where \(x, x_1, x_2\in X\) (we define \(\tau '(p_0)\) to be an arbitrary letter, since it never appears in the output). Thus the state \((q, x)\) records the state of the Mealy automaton together with the input letter it is currently processing. It is easy to check that the new Moore automaton with the initial state \(p_0\) defines the same maps on \(X^*\) and \(X^\omega \) as the original Mealy automaton \(\mathfrak {A}\).
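The conversion can be sketched as follows (our code, following the rules above). On the adding machine, the Mealy automaton and its Moore version produce identical outputs:

```python
def run_mealy(pi, tau, q, word):
    out = []
    for x in word:
        out.append(tau[q, x]); q = pi[q, x]
    return ''.join(out)

def mealy_to_moore(pi, tau, q0, alphabet):
    """States are the pairs (q, x) plus a fresh initial state 'p0'."""
    pi2 = {('p0', x): (q0, x) for x in alphabet}
    tau2 = {}
    for (q, x1) in pi:                       # all states (q, x1) in Q x X
        tau2[q, x1] = tau[q, x1]
        for x2 in alphabet:
            pi2[(q, x1), x2] = (pi[q, x1], x2)
    return pi2, tau2, 'p0'

def run_moore(pi2, tau2, q, word):
    """Compute the output word per (9): move first, then emit tau(state)."""
    out = []
    for x in word:
        q = pi2[q, x]; out.append(tau2[q])
    return ''.join(out)

# The adding machine (cf. the sketch after Definition 3.2).
pi  = {('a', '0'): 'e', ('a', '1'): 'a', ('e', '0'): 'e', ('e', '1'): 'e'}
tau = {('a', '0'): '1', ('a', '1'): '0', ('e', '0'): '0', ('e', '1'): '1'}
pi2, tau2, p0 = mealy_to_moore(pi, tau, 'a', '01')
for w in ('1101', '0110', '1111', '0000'):
    assert run_moore(pi2, tau2, p0, w) == run_mealy(pi, tau, 'a', w)
```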

Therefore, we will not use Moore automata to define transformations of sets of words. They will be used to define automatic sequences and matrices in Sect. 4. Traditionally, Mealy automata are used in the theory of groups generated by automata (see [19]), while Moore automata are used for the generation of sequences (even though the term “Moore automaton” is not used in [2]).

3.2 Diagrams of automata

Automata are usually represented as labeled graphs (called Moore diagrams). The set of vertices coincides with the set of states Q. For every \(q\in Q\) and \(x\in X\) there is an arrow from q to \(\pi (q, x)\) labeled by \((x, \tau (q, x))\) in the case of a Mealy automaton, and just by x in the case of a Moore automaton. The initial state is marked, and in the case of a Moore automaton the states are marked by the values \(\tau (q)\) of the output function.

Sometimes the arrows of diagrams of Mealy automata are just labeled by the input letters x, and the vertices are labeled by the corresponding transformation \(x\mapsto \tau (q, x)\).

Consider a directed graph with one marked (initial) vertex, in which the edges are labeled by pairs \((x, y)\in X^2\). The necessary and sufficient condition for such a graph to represent a Mealy automaton is that for every vertex q and every letter \(x\in X\) there exists a unique arrow starting at q and labeled by (x, y) for some \(y\in X\). The image of a word \(x_1x_2\cdots \) under the action of the automaton is then computed by finding the unique directed path \(e_1, e_2, \ldots \) starting at the initial vertex whose arrows are labeled by \((x_1, y_1), (x_2, y_2), \ldots \), respectively; then \(y_1y_2\cdots \) is the image of \(x_1x_2\cdots \).

Fig. 2 The binary adding machine

The diagram of the adding machine transformation (see Example 2.3) is shown in Fig. 2. We mark the initial state by a double circle.

Example 3.4

The generators a, b, c, d of the Grigorchuk group are defined by a single automaton, shown in Fig. 3, for different choices of the initial state.

Fig. 3 The generators of the Grigorchuk group

3.3 Non-deterministic automata

Let us generalize the notion of a Mealy automaton by allowing more general Moore diagrams.

Definition 3.5

A (non-deterministic) synchronous automaton \(\mathfrak {A}\) over an alphabet X is an oriented graph whose arrows are labeled by pairs of letters \((x, y)\in X^2\). Such an automaton is called \(\omega \)-deterministic if for every infinite word \(x_1x_2\cdots \in X^\omega \) and for every vertex (i.e., state) q of \(\mathfrak {A}\) there exists at most one directed path starting at q which is labeled by \((x_1, y_1), (x_2, y_2), \ldots \) for some \(y_i\in X\).

Note that in the above definition, for a state q of \(\mathfrak {A}\) and a letter \(x\in X\) there may be several, or no, edges starting at q and labeled by (x, y) for \(y\in X\). This means that the automaton \(\mathfrak {A}\) may be non-deterministic and partial on finite words, i.e., a state q may transform a finite word \(v\in X^*\) into several different words, and may not accept some words on input.

If an automaton \(\mathfrak {A}\) is \(\omega \)-deterministic, then every state q of \(\mathfrak {A}\) defines a map between closed subsets of \(X^\omega \), mapping \(x_1x_2\cdots \) to \(y_1y_2\cdots \) if there exists a directed path starting at q and labeled by \((x_1, y_1), (x_2, y_2), \ldots \).

Example 3.6

Let \(X=\{0, 1\}\). The states \(T_0\) and \(T_1\) of the automaton shown in Fig. 4 define the transformations \(x_1x_2\cdots \mapsto 0x_1x_2\cdots \) and \(x_1x_2\cdots \mapsto 1x_1x_2\cdots \), respectively. The states \(T_0'\) and \(T_1'\) define the inverse transformations \(0x_1x_2\cdots \mapsto x_1x_2\cdots \) and \(1x_1x_2\cdots \mapsto x_1x_2\cdots \).

Note that the first automaton (defining the transformations \(T_0\) and \(T_1\)) is deterministic. For example, the state \(T_0\) acts on finite words by the transformation \(x_1x_2\cdots x_n\mapsto 0x_1x_2\cdots x_{n-1}\). The second automaton is partial and non-deterministic on finite words. For example, there are two arrows starting at \(T_0'\), labeled by (0, 1) and (0, 0), but no arrows labeled by (1, y).

Fig. 4 Appending and erasing letters

An asynchronous automaton is defined in the same way, but the labels are pairs of arbitrary words \((u, v)\in (X^*)^2\).

Definition 3.7

A homeomorphism \(\phi :X^\omega \longrightarrow X^\omega \) is synchronously (resp. asynchronously) automatic if it is defined by a finite \(\omega \)-deterministic synchronous (resp. asynchronous) automaton.

A criterion for a homeomorphism to be synchronously automatic is given in Proposition 4.23.

Asynchronously automatic homeomorphisms of \(X^\omega \) are studied in [18, 19]. It is shown there that the set of asynchronously automatic homeomorphisms of \(X^\omega \) is a group, and that it does not depend on X (if \(|X|>1\)). More precisely, it is proved that for any two finite alphabets XY (such that \(|X|, |Y|>1\)) there exists a homeomorphism \(X^\omega \longrightarrow Y^\omega \) conjugating the corresponding groups of asynchronously automatic homeomorphisms. Very little is known about this group, which is called in [19] the group of rational homeomorphisms of the Cantor set.

4 Automatic matrices

4.1 Automatic sequences

Here we review the basic definitions and facts about automatic sequences. More can be found in the monographs [2, 50].

Let A be a finite alphabet, and let \(A^\omega \) be the space of right-infinite sequences of elements of A with the direct product topology.

Fix an integer \(d\ge 2\), and consider the transformation \(\Xi _d:A^\omega \longrightarrow (A^\omega )^d\), which we call the stencil map, defined by the rule

$$\begin{aligned} \Xi _d(a_0a_1a_2\cdots )=(a_0a_da_{2d}\ldots , a_1a_{d+1}a_{2d+1}\ldots , \ldots , a_{d-1}a_{2d-1}a_{3d-1}\cdots ). \end{aligned}$$

It is easy to see that \(\Xi _d\) is a homeomorphism. We denote the coordinates of \(\Xi _d(w)\) by \(\Xi _d(w)_i\), so that

$$\begin{aligned} \Xi _d(w)=(\Xi _d(w)_0, \Xi _d(w)_1, \ldots , \Xi _d(w)_{d-1}), \end{aligned}$$

and call them d-decimations of the sequence w. The repeated d-decimations of w are all sequences that can be obtained from w by iterated application of the decimation procedure, i.e., all sequences of the form

$$\begin{aligned} \Xi _d(\Xi _d(\cdots \Xi _d(w)_{i_n}\cdots )_{i_2})_{i_1}. \end{aligned}$$

Definition 4.1

A sequence \(w\in A^\omega \) is d-automatic if the set of all repeated d-decimations of w (called the kernel of w in [2, Section 6.6]) is finite.

We say that a subset \(Q\subset A^\omega \) is d-decimation-closed if for every \(w\in Q\) all d-decimations of w belong to Q. The following is obvious.

Lemma 4.2

A sequence is d-automatic if and only if it belongs to a finite d-decimation-closed subset of \(A^\omega \).

Classically, a sequence \(w=a_0a_1\cdots \) is called d-automatic if there exists a Moore automaton \(\mathfrak {A}\) with input alphabet \(\{0, 1, \ldots , d-1\}\) and output alphabet A such that if \(n=i_0+i_1d+\cdots +i_md^m\) is a base-d expansion of n, then the output of \(\mathfrak {A}\) after reading the word \(i_0i_1\cdots i_m\) is \(a_n\). An equivalent variant of the definition requires that \(a_n\) be the output of the automaton after reading \(i_mi_{m-1}\cdots i_1i_0\). One may also allow or forbid \(i_m\) to be equal to zero, and start the numeration of the letters of the sequence w from 1. All these definitions of automaticity of sequences are equivalent to each other, see [2, Section 5.2]. They are also equivalent to Definition 4.1, see [2, Theorem 6.6.2].

Example 4.3

The Thue–Morse sequence is the sequence \(t_0t_1\cdots \in \{0, 1\}^\omega \), where \(t_n\) is the sum modulo 2 of the digits of n in the binary numeration system. The prefix of length \(2^n\) of this sequence can be obtained from 0 by applying the substitution

$$\begin{aligned} 0\mapsto 01,\qquad 1\mapsto 10 \end{aligned}$$

n times:

$$\begin{aligned} 0\mapsto 01\mapsto 0110\mapsto 01101001\mapsto 0110100110010110\mapsto \cdots \end{aligned}$$

It is easy to see that this sequence is generated by the automaton shown in Fig. 5. Here we label the vertices (the states) of the automaton by the corresponding values of the output function. The initial state is marked by a double circle. For more on the properties of the Thue–Morse sequence, see [2, Section 5.1].

Fig. 5 Automaton generating the Thue–Morse sequence
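The 2-automaticity of the Thue–Morse sequence in the sense of Definition 4.1 can be seen directly: its 2-decimations are the sequence itself and its letter-wise complement. A small sketch (our code, working with finite prefixes long enough to make the comparison meaningful):

```python
N = 1 << 12
t = ''.join(str(bin(n).count('1') % 2) for n in range(N))  # Thue-Morse prefix
assert t.startswith('0110100110010110')

def decimations(w, d=2):
    """The d-decimations of a (finite) word: Xi_d(w)_0, ..., Xi_d(w)_{d-1}."""
    return [w[i::d] for i in range(d)]

kernel, frontier = set(), {t}
while frontier:
    w = frontier.pop()
    if w[:64] in kernel or len(w) < 128:    # compare prefixes of fixed length
        continue
    kernel.add(w[:64])
    frontier.update(decimations(w))

assert len(kernel) == 2   # the sequence itself and its complement
```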

The last example can be naturally generalized to include all automatic sequences. Namely, a k-uniform morphism \(\phi :X^*\longrightarrow Y^*\) is a morphism of monoids such that \(|\phi (x)|=k\) for every \(x\in X\). By a theorem of Cobham (see [2, Theorem 6.3.2]), a sequence is k-automatic if and only if it is an image, under a coding (i.e., a 1-uniform morphism), of a fixed point of a k-uniform endomorphism \(\phi :X^*\longrightarrow X^*\).

Example 4.4

Consider the alphabet \(X=\{a, b, c, d\}\), and the morphism \(\phi :X^*\longrightarrow X^*\) given by

$$\begin{aligned} \phi (a)=aca,\quad \phi (b)=d,\quad \phi (c)=b,\quad \phi (d)=c. \end{aligned}$$

This substitution appears in the presentation [32] of the Grigorchuk group.

The fixed point of \(\phi \) is obtained as the limit of \(\phi ^n(a)\), and starts with \(acabacadacabaca\ldots \). The morphism \(\phi \) is not uniform, but it is easy to see that the fixed point belongs to \(\{ab, ac, ad\}^\infty \), and that for the words \(B=ab, C=ac, D=ad\) the morphism \(\phi \) acts on \(\{B, C, D\}^\infty \) as a 2-uniform endomorphism:

$$\begin{aligned} \phi (B)=acad=CD,\quad \phi (C)=acab=CB,\quad \phi (D)=acac=CC. \end{aligned}$$

It follows from Cobham's theorem that the fixed point of \(\phi \) is 2-automatic.
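The computation behind this example can be checked mechanically; the following sketch (our code) iterates \(\phi \) and verifies its action on the blocks:

```python
PHI = {'a': 'aca', 'b': 'd', 'c': 'b', 'd': 'c'}
phi = lambda word: ''.join(PHI[x] for x in word)

w = 'a'
for _ in range(6):
    w = phi(w)
assert w.startswith('acabacadacabaca')   # the fixed point of phi

BLOCK = {'ab': 'B', 'ac': 'C', 'ad': 'D'}
encode = lambda u: ''.join(BLOCK[u[i:i + 2]] for i in range(0, len(u), 2))

# On the blocks, phi acts as the 2-uniform substitution B -> CD, C -> CB, D -> CC.
UNIFORM = {'B': 'CD', 'C': 'CB', 'D': 'CC'}
u = w[:62]                               # an even-length prefix of the fixed point
assert encode(phi(u)) == ''.join(UNIFORM[s] for s in encode(u))
```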

Let us show how to construct an automaton producing a sequence satisfying the conditions of Definition 4.1.

Suppose that \(w_0\in A^\omega \) is automatic, and let Q be a finite d-decimation-closed subset of \(A^\omega \) that contains \(w_0\) (for example, we can take Q to be equal to the set of all repeated d-decimations of \(w_0\)).

Consider a Moore automaton with the set of states Q, initial state \(w_0\), input alphabet \(\{0, 1, \ldots , d-1\}\), output alphabet A, transition function

$$\begin{aligned} \pi (w, i)=\Xi _d(w)_i, \end{aligned}$$

output function

$$\begin{aligned} \tau (x_0x_1\cdots )=x_0. \end{aligned}$$

We call the constructed Moore automaton

$$\begin{aligned} \mathfrak {A}=(Q, \{0, 1, \ldots , d-1\}, A, \pi , \tau , w_0) \end{aligned}$$

the automaton of \(w_0\).

Proposition 4.5

Let \(w_0=a_0a_1\cdots \in A^\omega \) be an automatic sequence, and let \(\mathfrak {A}\) be its automaton. Let n be a non-negative integer, and let \(i_0, i_1, \ldots , i_m\) be a sequence of elements of the set \(\{0, 1, \ldots , d-1\}\) such that \(n=i_0+i_1d+i_2d^2+\cdots +i_md^m\). Then \(\tau (w_0, i_0i_1\cdots i_m)=a_n\), i.e., the output of \(\mathfrak {A}\) after reading \(i_0i_1\cdots i_m\) is \(a_n\).

Proof

It follows from the definition of the automaton \(\mathfrak {A}\) that

$$\begin{aligned} \pi (w, i_0\cdots i_m)=\Xi _d(\cdots \Xi _d(\Xi _d(x_0x_1\cdots )_{i_0})_{i_1}\cdots )_{i_m} \end{aligned}$$
(10)

for all \(w=x_0x_1\cdots \in Q\).

It also follows from the definition of the stencil map that the sequence (10) is equal to

$$\begin{aligned} x_nx_{n+d^{m+1}}x_{n+2d^{m+1}}x_{n+3d^{m+1}}\ldots , \end{aligned}$$

where \(n=i_0+i_1d+i_2d^2+\cdots +i_md^m\). It follows that \(\tau (w_0, i_0i_1\cdots i_m)=x_n\). \(\square \)

4.2 Automatic infinite matrices

The notion of automaticity of sequences can be generalized to matrices in a straightforward way (see [2, Chapter 14], where automatic matrices are called two-dimensional sequences).

Let A be a finite alphabet, and let \(A^{\omega \times \omega }\) be the space of two-dimensional matrices over A that are infinite to the right and downwards, i.e., arrays of the form

$$\begin{aligned} a=\left( \begin{array}{c@{\quad }c@{\quad }c}a_{11} &{} a_{12} &{} \cdots \\ a_{21} &{} a_{22} &{} \cdots \\ \vdots &{} \vdots &{} \ddots \end{array}\right) . \end{aligned}$$
(11)

Fix an integer \(d\ge 2\), and consider the map:

$$\begin{aligned} \Xi _d:A^{\omega \times \omega }\longrightarrow (A^{\omega \times \omega })^{d\times d} \end{aligned}$$

from the set of infinite matrices to the set of \(d\times d\) matrices whose entries are elements of \(A^{\omega \times \omega }\). It maps the matrix a to the matrix

$$\begin{aligned} \Xi _d(a)=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}\Xi _d(a)_{00} &{} \Xi _d(a)_{01} &{} \cdots &{} \Xi _d(a)_{0, d-1}\\ \Xi _d(a)_{10} &{} \Xi _d(a)_{11} &{} \cdots &{} \Xi _d(a)_{1, d-1}\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \Xi _d(a)_{d-1, 0} &{} \Xi _d(a)_{d-1, 1} &{} \cdots &{} \Xi _d(a)_{d-1, d-1}\end{array}\right) , \end{aligned}$$

where

$$\begin{aligned} \Xi _d(a)_{ij}=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}a_{i, j} &{} a_{i, j+d} &{} a_{i, j+2d} &{} \cdots \\ a_{i+d, j} &{} a_{i+d, j+d} &{} a_{i+d, j+2d} &{} \cdots \\ a_{i+2d, j} &{} a_{i+2d, j+d} &{} a_{i+2d, j+2d} &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) . \end{aligned}$$

The entries of \(\Xi _d(a)\) are called d-decimations of a. We call \(\Xi _d\) the stencil map, since the entries of the matrix \(\Xi _d(a)\) are obtained from the matrix a by selecting entries using a “stencil” consisting of a square grid of holes, see Fig. 6.

Fig. 6 The stencil

The definition of automaticity for matrices is then the same as for sequences.

Definition 4.6

A matrix \(a\in A^{\omega \times \omega }\) is d-automatic ([d, d]-automatic in the terminology of [2]) if the set of matrices that can be obtained from a by repeated d-decimations is finite.

One can also use stencils with a rectangular grid of holes, i.e., select the entries of a decimation with one step horizontally and a different step vertically. This leads to the notion of a \([d_1, d_2]\)-automatic matrix, as in [2], but we do not use this notion in our paper.

The interpretation of automaticity of matrices via automata theory is also very similar to the interpretation for sequences. The only difference is that the input alphabet of the automaton is the direct product \(\{0, 1, \ldots , d-1\}\times \{0, 1, \ldots , d-1\}\). If we want to find an entry \(a_{n_1, n_2}\) of an automatic matrix defined by a Moore automaton \(\mathfrak {A}\), then we represent the indices \(n_1\) and \(n_2\) in base d:

$$\begin{aligned} n_1=i_0+i_1d+\cdots +i_md^m,\qquad n_2=j_0+j_1d+\cdots +j_md^m \end{aligned}$$

for \(0\le i_s, j_t\le d-1\), and then feed the sequence \((i_0, j_0)(i_1, j_1)\cdots (i_m, j_m)\) to the automaton \(\mathfrak {A}\). Its final output will be \(a_{n_1, n_2}\).

We say that a matrix \(a=(a_{ij})_{i\ge 0, j\ge 0}\) over a field \(\Bbbk \) is column-finite if the number of non-zero entries in each column of a is finite. The set \(\mathsf {M}_{\infty }(\Bbbk )\) of all column-finite matrices is an algebra isomorphic to the algebra of endomorphisms of the infinite-dimensional vector space \(\Bbbk ^\infty =\bigoplus _{\mathbb {N}}\Bbbk \). We denote by \(\mathsf {M}_k(\Bbbk )\) the algebra of \(k\times k\)-matrices over \(\Bbbk \).

Lemma 4.7

The stencil map \(\Xi _d:\mathsf {M}_{\infty }(\Bbbk )\longrightarrow \mathsf {M}_d(\mathsf {M}_{\infty }(\Bbbk ))\) is an isomorphism of \(\Bbbk \)-algebras.

Proof

A direct corollary of the multiplication rule for matrices. \(\square \)

Denote by \(\mathcal {A}_d(\Bbbk )\subset \mathsf {M}_\infty (\Bbbk )\) the set of all column-finite d-automatic matrices over \(\Bbbk \).

Proposition 4.8

Let \(\Bbbk \) be a finite field. Then \(\mathcal {A}_d(\Bbbk )\) is an algebra. The stencil map

$$\begin{aligned} \Xi _d:\mathcal {A}_d(\Bbbk )\longrightarrow \mathsf {M}_d(\mathcal {A}_d(\Bbbk )) \end{aligned}$$

is an isomorphism of \(\Bbbk \)-algebras.

Proof

Let A and B be d-automatic column-finite matrices. Let \(\mathfrak {A}\) and \(\mathfrak {B}\) be finite decimation-closed sets containing A and B, respectively. Then the set \(\{a_1A_1+b_1B_1\;:\;A_1\in \mathfrak {A}, B_1\in \mathfrak {B}, a_1, b_1\in \Bbbk \}\) is decimation-closed, and it contains any linear combination of A and B. The set is finite, which shows that any linear combination of A and B is automatic.

Let \(\mathfrak {C}\) be the linear span of all products \(A_1B_1\) for \(A_1\in \mathfrak {A}\), \(B_1\in \mathfrak {B}\). Since the stencil map \(\Xi _d:\mathsf {M}_\infty (\Bbbk )\longrightarrow \mathsf {M}_d\left( \mathsf {M}_\infty (\Bbbk )\right) \) is an isomorphism of \(\Bbbk \)-algebras, the set \(\mathfrak {C}\) is decimation-closed. It contains AB, hence AB is automatic.

The map \(\Xi _d:\mathcal {A}_d(\Bbbk )\longrightarrow \mathsf {M}_d\left( \mathcal {A}_d(\Bbbk )\right) \) is obviously a bijection. It is a homomorphism of \(\Bbbk \)-algebras, because it is a homomorphism on \(\mathsf {M}_\infty (\Bbbk )\). \(\square \)

If we have a finite d-decimation-closed set of matrices \(\mathfrak {A}\), then its elements are uniquely determined by the corresponding matrix recursion \(\Xi _d:\mathfrak {A}\longrightarrow \mathsf {M}_d\left( \mathfrak {A}\right) \) and by the top left entries \(a_{00}\) of each matrix \(A\in \mathfrak {A}\). Namely, suppose that we want to find an entry \(a_{m, n}\) of a matrix \(A\in \mathfrak {A}\). Let \(0\le i<d\) and \(0\le j<d\) be the remainders of division of n and m by d. Then \(a_{m, n}\) is equal to the entry \(b_{\frac{m-i}{d}, \frac{n-j}{d}}\) of the matrix \(B=\Xi _d(A)_{i, j}\). Repeating this procedure several times, we eventually will find a matrix \(C=(c_{i, j})_{i, j=0}^\infty \in \mathfrak {A}\) such that \(a_{m, n}=c_{0, 0}\).

Example 4.9

A particular case of automatic matrices are triangular matrices of the form

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}a_{00} &{} a_{01} &{} a_{02} &{} \cdots \\ 0 &{} a_{11} &{} a_{12} &{} \cdots \\ 0 &{} 0 &{} a_{22} &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \end{aligned}$$

where the diagonals \(a_{0, i}, a_{1, i+1}, a_{2, i+2}, \ldots \) are eventually periodic, and only a finite number of them are non-zero. Note that the set of such uni-triangular matrices is a group.

The following subgroup of this group (of matrices over the field \(\mathbb {F}_2\)) was considered in [38]. Let \(B_1=\left( \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 1 &{} 1\\ 1 &{} 0 &{} 0\\ 1 &{} 1 &{} 1\end{array}\right) \), \(C_1=\left( \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 0\\ 1 &{} 0 &{} 0\end{array}\right) \), \(B_2=\left( \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0\\ 0 &{} 1 &{} 0\\ 1 &{} 1 &{} 0\end{array}\right) \), \(C_2=\left( \begin{array}{c@{\quad }c@{\quad }c} 1 &{} 0 &{} 0\\ 1 &{} 0 &{} 0\\ 1 &{} 0 &{} 0\end{array}\right) \). It is shown in [38] that the infinite matrices

$$\begin{aligned} F_1=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} I &{} B_1 &{} C_1 &{} O &{} O &{} \cdots \\ O &{} I &{} B_1 &{} C_1 &{} O &{} \cdots \\ O &{} O &{} I &{} B_1 &{} C_1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) ,\quad F_2=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} I &{} B_2 &{} C_2 &{} O &{} O &{} \cdots \\ O &{} I &{} B_2 &{} C_2 &{} O &{} \cdots \\ O &{} O &{} I &{} B_2 &{} C_2 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \end{aligned}$$

generate a free group of rank 2. Here I and O are the identity and zero matrices of size \(3\times 3\), respectively.

In Sect. 5 we will show that any residually finite p-group can be represented in triangular form. Groups generated by finite automata will be represented by automatic uni-triangular matrices. The next subsection is the first step in this direction.

4.3 Representation of automata groups by automatic matrices

Let \(\mathsf {B}\) be a basis of \(\Bbbk ^X\) such that \(\mathbf {1}\in \mathsf {B}\). Order the elements of \(\mathsf {B}\) into a sequence \(y_0<y_1<\cdots <y_{d-1}\), where \(y_0=\mathbf {1}\). Recall that the inductive limit \(\mathsf {B}_\infty \) of the bases \(\mathsf {B}^{\otimes n}\) of \(\Bbbk ^{X^n}\) with respect to the embeddings \(f\mapsto f\otimes \mathbf {1}\) is a basis of \(C(X^\omega , \Bbbk )\), whose elements are infinite tensor products \(y_{i_0}\otimes y_{i_1}\otimes \cdots \), where all but a finite number of factors \(y_{i_k}\) are equal to \(y_0=\mathbf {1}\). In other words, \(\mathsf {B}_\infty \) consists of functions of the form

$$\begin{aligned} f(x_0x_1\cdots )=y_{i_0}(x_0)y_{i_1}(x_1)\cdots , \end{aligned}$$

where all but a finite number of factors on the right-hand side are equal to the constant one function.

We can order such products using the inverse lexicographic order, namely \(y_{i_0}\otimes y_{i_1}\otimes \cdots <y_{j_0}\otimes y_{j_1}\otimes \cdots \) if and only if \(y_{i_k}<y_{j_k}\), where k is the largest index such that \(y_{i_k}\ne y_{j_k}\).

It is easy to see that the ordinal type of \(\mathsf {B}_\infty \) is \(\omega \). Let \(e_0<e_1<e_2<\cdots \) be all elements of \(\mathsf {B}_\infty \) taken in the defined order. It is checked directly that if \(e_n=y_{i_0}\otimes y_{i_1}\otimes \ldots \), then \(n=i_0+i_1\cdot d+i_2\cdot d^2+\cdots \), i.e., \(\overline{\ldots i_2i_1i_0}\) is the base-d expansion of n (only a finite number of coefficients \(i_j\) are different from zero).

Definition 4.10

An ordered basis \(\mathsf {B}\) of \(\Bbbk ^X\) is called marked if its minimal element is \(\mathbf {1}\).

The self-similar basis \(\mathsf {B}_\infty \) of \(C(X^\omega , \Bbbk )\) associated with \(\mathsf {B}\) is the inverse lexicographically ordered set of functions of the form \(e_1\otimes e_2\otimes \ldots \), where \(e_i\in \mathsf {B}\) and all but a finite set of elements \(e_i\) are equal to \(\mathbf {1}\), as it is described above.

Let \(\mathsf {B}=\{b_0, b_1, \ldots , b_{d-1}\}\) be an arbitrary (non necessarily marked) basis of the space \(\Bbbk ^X\). We define the associated matrix recursion for linear operators on \(C(X^\omega , \Bbbk )\) in the usual way: given an operator a, define its image \(\Xi _{\mathsf {B}}(a)=(A_{i, j})_{i, j=0}^{d-1}\) in \(\mathsf {M}_d(C(X^\omega , \Bbbk )\) by the rule

$$\begin{aligned} a(b_j\otimes f)=\sum _{i=0}^{d-1}b_i\otimes A_{i, j}(f),\qquad f\in C(X^\omega , \Bbbk ),\quad 0\le j\le d-1. \end{aligned}$$

If \(\mathsf {B}\) is the basis \(\{\delta _x\}_{x\in X}\), then the matrix recursion \(\Xi _{\mathsf {B}}\) restricted to a self-similar group \(G\le \mathop {\mathrm {Aut}}(X^*)\) coincides with the matrix recursion (5) coming directly from the wreath recursion.

Lemma 4.11

Let \(\mathsf {B}\) be a marked basis of \(\Bbbk ^X\). Then the matrix recursion \(\Xi _{\mathsf {B}}\) coincides with the stencil map for the matrices of linear operators in the associated basis \(\mathsf {B}_\infty \).

Proof

Let \(A_{ij}\), \(0\le i, j\le d-1\), be the entries of \(\Xi _{\mathsf {B}}(a)\). Let \(n=i_0+i_1\cdot d+i_2\cdot d^2+\cdots \) be a non-negative integer written in base d. Then

$$\begin{aligned} a(b_{j+dn})=a(b_j\otimes b_n)=\sum _{i=0}^{d-1}b_i\otimes A_{i, j}(b_n). \end{aligned}$$
(12)

Let \(a_{m, n},\, 0\le m, n<\infty \), be the entries of the matrix of a in the basis \(\mathsf {B}_\infty \). Then

$$\begin{aligned} a(b_{j+dn})= & {} \sum _{k=0}^\infty a_{k, j+dn}b_k=\sum _{i=0}^{d-1}\sum _{r=0}^\infty a_{i+dr, j+dn}b_i\otimes b_r\\= & {} \sum _{i=0}^{d-1}b_i\otimes \left( \sum _{r=0}^\infty a_{i+dr, j+dn}b_r\right) , \end{aligned}$$

which together with (12) implies that \(c_{r, n}=a_{i+dr, j+dn}\) are the entries of \(A_{i, j}\) in the basis \(\mathsf {B}_\infty \), i.e., that \(A_{i, j}=\Xi _d(a)_{i, j}\). \(\square \)

Definition 4.12

Let \(\Bbbk \) be a finite field. We say that a linear operator a on \(C(X^\omega , \Bbbk )\) is automatic if there exists a finite set \(\mathfrak {A}\) of operators such that \(a\in \mathfrak {A}\), and for every \(a'\in \mathfrak {A}\) all entries of the matrix \(\Xi _{\mathsf {B}}(a')\) belong to \(\mathfrak {A}\).

Proposition 4.13

Let \(\mathsf {B}_1, \mathsf {B}_2\) be two bases of \(\Bbbk ^X\). A linear operator on \(C(X^\omega , \Bbbk )\) is automatic with respect to \(\mathsf {B}_1\) if and only if it is automatic with respect to \(\mathsf {B}_2\).

Proof

Let \(a_1\) be an operator which is automatic with respect to \(\mathsf {B}_1\). Let \(\mathfrak {A}\) be the corresponding finite set of operators, closed with respect to taking entries of the matrix recursion. Let \(\mathfrak {A}'\) be the set of all linear combinations of elements of \(\mathfrak {A}\), which is finite, since we assume that \(\Bbbk \) is finite. If T is the transition matrix from \(\mathsf {B}_1\) to \(\mathsf {B}_2\), then

$$\begin{aligned} \Xi _{\mathsf {B}_2}(a)=T^{-1}\Xi _{\mathsf {B}_1}(a)T \end{aligned}$$

for every linear operator a. It follows that \(\mathfrak {A}'\) is closed with respect to taking entries of \(\Xi _{\mathsf {B}_2}\). The set \(\mathfrak {A}'\) is finite, \(a_1\in \mathfrak {A}'\), hence \(a_1\) is also automatic with respect to \(\mathsf {B}_2\). \(\square \)

As a direct corollary of Proposition 4.13 we get the following relation between finite-state automorphisms of the rooted tree \(X^*\) and automatic matrices.

Theorem 4.14

Suppose that \(\Bbbk \) is finite. Let \(\mathsf {B}\) be a marked basis of \(\Bbbk ^X\). Then the matrix of \(\pi _\infty (g)\) in the associated basis \(\mathsf {B}_\infty \), where \(g\in \mathop {\mathrm {Aut}}(X^*)\), is d-automatic if and only if g is finite-state.

We get, therefore, a subgroup of the group of units of \(\mathcal {A}_d(\Bbbk )\) isomorphic to the group \(\mathop {\mathrm {FAut}}(X^*)\) of finite-state automorphisms of the tree \(X^*\).

Matrix recursions (i.e., homomorphisms from an algebra A to the algebra of matrices over A) associated with groups acting on rooted trees, and in more general cases were studied in many papers, for instance in [3, 4, 6, 34, 37, 42, 43]. Note that the algebra generated by the natural representation on \(C(X^\omega , \Bbbk )\) of a group G acting on the rooted tree \(X^*\) is different from the group ring. This algebra (and its analogs) were studied in [4, 34, 42].

4.4 Creation and annihilation operators

For \(h\in \Bbbk ^X\), denote by \(T_h\) the operator on \(C(X^\omega , \Bbbk )\) acting by the rule

$$\begin{aligned} T_h(f)=h\otimes f. \end{aligned}$$

It is easy to see that \(T_h\) is linear, and that we have \(T_{a_1h_1+a_2h_2}=a_1T_{h_1}+a_2T_{h_2}\) for all \(h_1, h_2\in \Bbbk ^X\) and \(a_1, a_2\in \Bbbk \).

Consider the dual vector space \((\Bbbk ^X)'\) to the space \(\Bbbk ^X\) of functions. We will denote the value of a functional v on a function \(f\in \Bbbk ^X\) by \(\langle v| f\rangle \). Then for every \(v\in (\Bbbk ^X)'\) we have an operator \(T_v\) on \(C(X^\omega , \Bbbk )\) defined by

$$\begin{aligned} T_v(f)(x_1, x_2, \ldots )=\langle v| f(x, x_1, x_2, \ldots )\rangle , \end{aligned}$$

where \(f(x, x_1, x_2, \ldots )\) on the right-hand side of the equation is seen for every choice of the variables \(x_1, x_2, \ldots \) as a function of one variable x, i.e., an element of \(\Bbbk ^X\).

Let \(\mathsf {B}=\{e_i\}_{i=0}^{d-1}\) be a basis of \(\Bbbk ^X\). Let \(\mathsf {B}'=\{e_i'\}_{i=0}^{d-1}\) be the basis of the dual space defined by \(\langle e_i'|e_j\rangle =\delta _{i, j}\). (Here and in the sequel, \(\delta _{i, j}\) is the Kronecker’s symbol equal to 1 when \(i=j\), and to 0 otherwise.) We will denote \(T_{e_i'}=T_{e_i}'\).

Then \(T_{e_i}\) is an isomorphism between the space \(C(X^\omega , \Bbbk )\) and its subspace \(e_i\otimes C(X^\omega , \Bbbk )\). It is easy to see that \(T_{e_i}'\) restricted onto \(e_i\otimes C(X^\omega , \Bbbk )\) is the inverse of this isomorphism, and that \(T_{e_i}'\) restricted onto \(e_j\otimes C(X^\omega , \Bbbk )\) is equal to zero for \(j\ne i\).

The operators \(T_{e_i}\) and \(T_{e_j}'\) satisfy the relations:

$$\begin{aligned} T_{e_i}'T_{e_j}=\delta _{e_i, e_j},\qquad \sum _{e_i\in \mathsf {B}}T_{e_i}T_{e_i}'=1. \end{aligned}$$
(13)

The products \(T_{e_i}T_{e_i}'\) are projections onto the summands \(e_i\otimes C(X^\omega , \Bbbk )\) of the direct sum decomposition

$$\begin{aligned} C(X^\omega , \Bbbk )=\bigoplus _{e_i\in \mathsf {B}}e_i\otimes C(X^\omega , \Bbbk ). \end{aligned}$$

Let \(\mathsf {B}\) be a marked basis of \(\Bbbk ^X\), and let \(\mathbf {1}=e_0<e_1<\cdots <e_{d-1}\) be its elements. Let \(\mathsf {B}_\infty \) be the associated ordered basis of \(C(X^\omega , \Bbbk )\).

If A is (not necessarily square) finite matrix, then we denote by \(A^{\oplus \infty }\) the infinite matrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}A &{} O &{} O &{} \cdots \\ O &{} A &{} O &{} \cdots \\ O &{} O &{} A &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \end{aligned}$$

where O is the zero matrix of the same size as A.

The following is a direct corollary of the definitions.

Proposition 4.15

The matrices of \(T_{e_0}\), \(T_{e_1}, \ldots , T_{e_{d-1}}\) in the basis \(\mathsf {B}_\infty \) are equal to

$$\begin{aligned} E_0=\left( \begin{array}{c} 1 \\ 0 \\ \vdots \\ 0\end{array}\right) ^{\oplus \infty },\quad E_1=\left( \begin{array}{c} 0 \\ 1 \\ \vdots \\ 0\end{array}\right) ^{\oplus \infty }, \ldots ,\quad E_{d-1}=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 \\ 0 \\ \vdots \\ 1\end{array}\right) ^{\oplus \infty }, \end{aligned}$$

respectively. The matrix of \(T_{e_i}'\) is the transpose \(E_i^\top \) of the matrix \(E_i\).

The matrices \(E_i\) and \(E_i'\) have a natural relation to decimation of matrices. The proof of the next proposition is a straightforward computation of matrix products.

Proposition 4.16

Let \(A=(a_{i, j})_{i, j=0}^\infty \) be an infinite matrix, and let

$$\begin{aligned} \Xi _d(A)=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}A_{0, 0} &{} A_{0, 1} &{} \cdots &{} A_{0, d-1}\\ A_{1, 0} &{} A_{1, 1} &{} \cdots &{} A_{1, d-1}\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ A_{d-1, 0} &{} A_{d-1, 1} &{} \cdots &{} A_{d-1, d-1}\end{array}\right) \end{aligned}$$

be the matrix of its d-decimations. Then

$$\begin{aligned} A_{i, j}=E_i^\top AE_j \end{aligned}$$

and

$$\begin{aligned} A=\sum _{i, j=0}^{d-1} E_iA_{i, j}E_j^\top . \end{aligned}$$

Corollary 4.17

Let A be an operator on \(C(X^\omega , \Bbbk )\), and let \(\mathsf {B}=\{e_i\}\) be a basis of \(\Bbbk ^X\). Then the entries of the associated matrix recursion for A are equal to \(T_{e_i}'AT_{e_j}\).

The next proposition is a direct corollary of Proposition 4.15.

Proposition 4.18

If \(h=a_0e_0+a_1e_1+\cdots +a_{d-1}e_{d-1}\), then the matrix of \(T_h\) is equal to \(\left( \begin{array}{c}a_0\\ a_1\\ \vdots \\ a_{d-1}\end{array}\right) ^{\oplus \infty }.\)

Corollary 4.19

Let \(\mathsf {B}\) be a marked basis of \(\Bbbk ^X\). Order the letters of the alphabet X in a sequence \(x_0, x_1, \ldots , x_{d-1}\). Let \(T_i=T_{\delta _{x_i}}\) and \(T_i'=T_{\delta _{x_i}}'\) be the corresponding operators, defined using the basis \(\{\delta _{x_i}\}\).

Let \(S=(a_{ij})_{i, j=0}^{d-1}\) be the transition matrix from the basis \(\delta _{x_0}<\delta _{x_1}<\cdots <\delta _{x_{d-1}}\) to \(\mathsf {B}\). Let \(S^{-1}=(b_{ij})_{i, j=0}^{d-1}\) be the inverse matrix.

Then the matrix of \(T_i\) is

$$\begin{aligned} \left( \begin{array}{c}b_{0, i}\\ b_{1, i}\\ \cdots \\ b_{d-1, i}\end{array}\right) ^{\oplus \infty }, \end{aligned}$$

and the matrix of \(T_i'\) is

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} a_{i, 0}&a_{i, 1}&\cdots&a_{i, d-1}\end{array}\right) ^{\oplus \infty }. \end{aligned}$$

Let us consider now the case of the basis \(\mathsf {D}=\{\delta _x\}_{x\in X}\). For simplicity, let us denote \(T_x=T_{\delta _x}\) and \(T_x'=T_{\delta _x}'\). Then the operators \(T_x\) and \(T_x'\) act on \(C(X^\omega , \Bbbk )\) by the rule

$$\begin{aligned} T_x(f)(x_1, x_2, \ldots )=\left\{ \begin{array}{l@{\quad }l} f(x_2, x_3, \ldots ) &{} \text {if } x=x_1, \\ 0 &{} \text {otherwise.} \end{array}\right. \end{aligned}$$
(14)

and

$$\begin{aligned} T_x'(f)(x_1, x_2, \ldots )=f(x, x_1, x_2, \ldots ). \end{aligned}$$
(15)

In other words, the operator \(T_x\) is induced by the natural homeomorphism \(X^\omega \longrightarrow xX^\omega :w\mapsto xw\), and \(T_x'\) is induced by its inverse map \(xX^\omega \longrightarrow X^\omega :xw\mapsto w\).

Proposition 4.20

Let \(\mathsf {B}_\infty \) be a basis of \(C(X^\omega , \Bbbk )\) associated with a marked basis \(\mathsf {B}\) (see Definition 4.10). Then the matrices of the operators \(T_x\) and \(T_x'\), for \(x\in X\), in the ordered basis \(\mathsf {B}_\infty \) are |X|-automatic.

Note that we do not require in this proposition the field \(\Bbbk \) to be finite.

Proof

Let \(\delta _x=\sum _{i=0}^{d-1}\alpha _{x, i}e_i\) for \(x\in X\), \(\mathsf {B}=\{e_i\}_{i=0}^{d-1}\) and \(\alpha _{x, i}\in \Bbbk \). It follows from Proposition 4.16 that the entries of the matrix \(\Xi _d(T_x)\) with respect to the basis \(\mathsf {B}\) are equal to \(\sum _{k=0}^{d-1}\alpha _{x, k}T_{e_i}'T_{e_k}T_{e_j}\). Every product of the form \(T_{e_i}'T_{e_k}T_{e_j}\) is equal, by relations (13), either to zero, or to \(T_{e_j}\). It follows that decimations of \(T_x\) are either zeros or of the form \(\alpha _{x, i}T_{e_j}\). It follows that the set of repeated decimations of the matrix of \(T_x\) is contained in \(\{\alpha _{x, 0}, \alpha _{x, 1}, \ldots , \alpha _{x, d-1}\}\cdot \{T_{e_0}, T_{e_1}, \ldots , T_{e_{d-1}}\}\cup \{0\}\). \(\square \)

Example 4.21

Let us consider the case \(X=\{0, 1\}\) and \(\mathsf {B}=\{y_0, y_1\}\), where \(y_0=\delta _0+\delta _1\) and \(y_1=\delta _1\). Then the transition matrix from \(\{\delta _0, \delta _1\}\) to \(\{y_0, y_1\}\) is \(\left( \begin{array}{c@{\quad }c} 1 &{} 0\\ 1 &{} 1\end{array}\right) \), whose inverse is \(\left( \begin{array}{c@{\quad }c} 1 &{} 0 \\ -1 &{} 1\end{array}\right) \).

It follows then from Corollary 4.19 that the matrices of \(T_0\), \(T_1\), \(T_0'\), and \(T_1'\) in the basis \(\mathsf {B}_\infty \) are

$$\begin{aligned} T_0= \left( \begin{array}{r} 1 \\ -1 \end{array}\right) ^{\oplus \infty },\quad T_1= \left( \begin{array}{c} 0 \\ 1 \end{array}\right) ^{\oplus \infty }, \end{aligned}$$

and

$$\begin{aligned} T_0'=\left( \begin{array}{c@{\quad }c}1&0\end{array}\right) ^{\oplus \infty },\quad T_1'=\left( \begin{array}{c@{\quad }c}1&1\end{array}\right) ^{\oplus \infty }. \end{aligned}$$

Example 4.22

In the case \(\Bbbk =\mathbb {C}\), it is natural to consider the operators

$$\begin{aligned} S_x=\frac{1}{\sqrt{|X|}}T_x. \end{aligned}$$

Then \(S_x\) are isometries of the Hilbert space \(L^2(X^\omega )\), and their conjugates \(S_x^*\) are equal to \(\sqrt{|X|}T_x'\).

The \(C^*\)-algebra of operators on \(L^2(X^\omega )\) generated by the operators \(S_x\) is called the Cuntz algebra [14], and is usually denoted \(\mathcal {O}_{|X|}\). Any isometries satisfying the relations

$$\begin{aligned} S_x^*S_x=1,\qquad \sum _{x\in X}S_xS_x^*=1 \end{aligned}$$

generate a \(C^*\)-algebra isomorphic to \(\mathcal {O}_{|X|}\). In particular, the \(C^*\)-algebra generated by the matrices \(E_i\) is the Cuntz algebra. Representation of the Cuntz algebra by matrices \(E_i\) is an example of a permutational representation of \(\mathcal {O}_d\). More on such and similar representations, see [9].

Recall that, for \(X=\{0, 1\}\), the Walsh basis of \(L^2(X^\omega )\) is the basis \(\mathsf {W}_\infty \) constructed starting from the basis \(\mathsf {W}=\{y_0, y_1\}\), where \(y_0=\delta _0+\delta _1\) and \(y_1=\delta _0-\delta _1\). Then direct computation with of the transition matrices show that the matrices of \(S_0\) and \(S_1\) are

$$\begin{aligned} S_0=\left( \begin{array}{c}\frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{array}\right) ^{\oplus \infty },\qquad S_1=\left( \begin{array}{r}\frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}}\end{array}\right) ^{\oplus \infty }. \end{aligned}$$

4.5 Cuntz algebras and Higman–Thompson groups

If \(\psi :X^\omega \longrightarrow X^\omega \) is a homeomorphism, then it induces a linear operator \(L_\psi \) on \(C(X^\omega , \Bbbk )\) given by

$$\begin{aligned} L_\psi (f)(w)=f(\psi ^{-1}(w)) \end{aligned}$$

for \(f\in C(X^\omega , \Bbbk )\) and \(w\in X^\omega \).

Fixing any ordered basis of \(C(X^\omega , \Bbbk )\), we get thus a natural faithful representation of the homeomorphism group of \(X^\omega \) in the group of units of the algebra \(\mathsf {M}_\infty (\Bbbk )\) of column-finite matrices over \(\Bbbk \).

Proposition 4.23

Let \(\psi \) be a homeomorphism of \(X^\omega \). Let \(u, v\in X^n\), and denote by \(\psi _{u, v}\) the partially defined map given by the formula

$$\begin{aligned} \psi _{u, v}(w)=\left\{ \begin{array}{l@{\quad }l} w' &{} \text {if } \psi (vw)=uw',\\ \text {not defined} &{} \text {otherwise.}\end{array}\right. \end{aligned}$$

The following conditions are equivalent.

  1. 1.

    The homeomorphism \(\psi \) is (non-deterministically) synchronously automatic.

  2. 2.

    The set of partial maps \(\{\psi _{u, v}\;:\;u, v\in X^*, |u|=|v|\}\) is finite.

  3. 3.

    For every finite field \(\Bbbk \), the operator \(L_\psi :C(X^\omega , \Bbbk )\longrightarrow C(X^\omega , \Bbbk )\) is automatic.

  4. 4.

    For some finite field \(\Bbbk \), the operator \(L_\psi :C(X^\omega , \Bbbk )\longrightarrow C(X^\omega , \Bbbk )\) is automatic.

Synchronously automatic homeomorphisms are defined in Definition 3.7.

Proof

Equivalence of conditions (2), (3), and (5) follow directly from Proposition 4.16.

Suppose that \(\psi \) is synchronously automatic. Let \(\mathfrak {A}\) be an initial automaton defining \(\psi \). For every pair \(u=a_1a_2\cdots a_n, v=b_1b_2\cdots b_n\in X^*\) of words of equal length, let \(Q_{u, v}\) be the set of states q of \(\mathfrak {A}\) such that there exists a directed path starting in the initial state \(q_0\) of \(\mathfrak {A}\) and labeled by \((a_1, b_1), (a_2, b_2), \ldots , (a_n, b_n)\). Then the set \(Q_{u, v}\) defines the map \(\psi _{u, v}\) in the following sense. We have \(\psi _{u, v}(x_1x_2\cdots )=y_1y_2\cdots \) if and only if there exists a path starting in an element of \(Q_{u, v}\) and labeled by \((x_1, y_1), (x_2, y_2), \ldots \). It follows that the number of possible maps of the form \(\psi _{u, v}\) is not larger than the number of subsets of the set of states of \(\mathfrak {A}\). This shows that every synchronously automatic homeomorphism satisfies condition (2).

Suppose now that a homeomorphism \(\psi \) satisfies condition (2), and let us show that it is synchronously automatic. Construct an automaton \(\mathfrak {A}\) with the set of states Q equal to the set of non-empty maps of the form \(\psi _{u, v}\). For every \((x, y)\in X^2\) and \(\psi _{u, v}\in Q\) we have an arrow from \(\psi _{u, v}\) to \(\psi _{ux, vy}\), labeled by (xy), provided the map \(\psi _{ux, vy}\) is not empty. The initial state of the automaton is the map \(\psi =\psi _{\emptyset , \emptyset }\). Let us show that this automaton defines the homeomorphism \(\psi \). It is clear that if \(\psi (x_1x_2\cdots )=y_1y_2\cdots \), then there exists a path starting at the initial state of \(\mathfrak {A}\) and labeled by \((x_1, y_1), (x_2, y_2), \ldots \). On the other hand, if such a path exists for a pair of infinite words \(x_1x_2\cdots , y_1y_2\cdots \), then the maps \(\psi _{x_1x_2\cdots x_n, y_1y_2\cdots y_n}\) are non-empty for every n. In other words, for every n the set \(W_n\) of infinite sequences \(w\in x_1x_2\cdots x_nX^\omega \) such that \(\psi (w)\in y_1y_2\cdots y_n X^\omega \) is non-empty. It is clear that the sets \(W_n\) are closed and \(W_{n+1}\subset W_n\) for every n. By compactness of \(X^\omega \) it implies that \(\bigcap _{n\ge 1}W_n\) is non-empty. It follows that \(\psi (x_1x_2\cdots )=y_1y_2\cdots \). \(\square \)

The next corollary follows directly from condition (2) of Proposition 4.23.

Corollary 4.24

The set of all automatic homeomorphisms of \(X^\omega \) is a group.

We have already seen in Theorem 4.14 that a homeomorphisms g of \(X^\omega \) defined by an automorphisms of \(X^*\) is automatic if and only if it is finite state. Note that in this case \(g_{u, v}\) is either empty (if \(g(v)\ne u\)) or is equal to \(g|_v\).

Another example of a group of finitely automatic homeomorphisms of \(X^\omega \) is the Higman-Thompson group \(\mathcal {V}_{|X|}\). It is the set of all homeomorphisms that can be defined in the following way. We say that a subset \(A\subset X^*\) is a cross-section if the sets \(uX^\omega \) for \(u\in A\) are disjoint and their union is \(X^\omega \). Let \(A=\{v_1, v_2, \ldots , v_n\}\) and \(B=\{u_1, u_2, \ldots , u_n\}\) be cross-sections of equal cardinality together with a bijection \(v_i\mapsto u_i\). Define a homeomorphism \(\psi :X^\omega \longrightarrow X^\omega \) by the rule

$$\begin{aligned} \psi (v_iw)=u_iw. \end{aligned}$$
(16)

The set of all homeomorphisms that can be defined in this way is the Higman–Thompson group \(\mathcal {V}_{|X|}\), see [12, 47].

Let \(\psi \) be the homeomorphism defined by (16). It follows directly from (14) and (15) that the operator \(L_\psi \) induced by \(\psi \) is equal to

$$\begin{aligned} L_\psi =\sum _{i=1}^n T_{u_i}T_{v_i}', \end{aligned}$$

where we use notation

$$\begin{aligned} T_{x_1x_2\cdots x_m}=T_{x_1}T_{x_2}\cdots T_{x_n},\qquad T_{x_1x_2\cdots x_m}'=T_{x_m}'T_{x_{m-1}}'\cdots T_{x_1}'. \end{aligned}$$

The next proposition follows then from Proposition 4.20.

Proposition 4.25

The Higman–Thompson group \(\mathcal {V}_{|X|}\) is a subgroup of the group of synchronously automatic homeomorphisms of \(X^\omega \).

The group generated by \(\mathcal {V}_2\) and the Grigorchuk group was studied by Roever [40]. He proved that it is a finitely presented simple group isomorphic to the abstract commensurizer of the Grigorchuk group. Generalizations of this group (for arbitrary self-similar group) was studied in [34].

5 Representations by uni-triangular matrices

5.1 Sylow p-subgroup of \(\mathop {\mathrm {Aut}}(X^*)\)

Let \(|X|=p\) be prime. We assume that \(X=\{0, 1, \ldots , p-1\}\) is equal to the field \(\mathbb {F}_p\) of p elements. From now on, we will write the vertices of the tree \(X^*\) as tuples \((x_1, x_2, \ldots , x_n)\) in order not to confuse them with products of elements of \(\mathbb {F}_p\).

Denote by \(\mathcal {K}_p\) the subgroup of \(\mathop {\mathrm {Aut}}(X^*)\) consisting of automorphisms g whose labels \(\alpha _{g, v}\) of the vertices of the portrait consist only of powers of the cyclic permutation \(\sigma =(0, 1, \ldots , p-1)\). It follows from (2) and (3) that \(\mathcal {K}_p\) is a group. The study of the group \(\mathcal {K}_p\) (and its finite analogs) were initiated by Kaloujnine [2628].

Suppose that an element \(g\in \mathcal {K}_p\) is represented by a tableau

$$\begin{aligned}{}[a_0, a_1(x_1), a_2(x_1, x_2), \ldots ], \end{aligned}$$

as in Sect. 2.1. Then \(a_n(x_1, x_2, \ldots , x_n)\) are maps from \(X^n\) to the group generated by the cyclic permutation \(\sigma \). The elements of this group act on \(X=\mathbb {F}_p\) by maps \(\sigma ^a:x\mapsto x+a\). It follows that we can identify functions \(a_n\) with maps \(X^n\longrightarrow \mathbb {F}_p\), so that an element \(g\in \mathcal {K}_p\) represented by a tableau

$$\begin{aligned}{}[a_0, a_1(x_1), a_2(x_1, x_2), \ldots ] \end{aligned}$$

acts on sequences \(v=(x_0, x_1, \ldots )\in X^\omega \) by the rule

$$\begin{aligned} g(v)=(x_0+a_0, x_1+a_1(x_0), x_2+a_2(x_1, x_2), x_3+a_3(x_1, x_2, x_3), \ldots ). \end{aligned}$$

It follows that if \(g_1, g_2\in \mathcal {K}_p\) are represented by the tableaux \([a_n]_{n=0}^\infty \) and \([b_n]_{n=0}^\infty \), then their product \(g_1g_2\) is represented by the tableau

$$\begin{aligned}&[b_0+a_0, \quad b_1(x_0)+a_1(x_0+b_0), \nonumber \\&\quad b_2(x_0, x_1)+a_2(x_0+b_0, x_1+b_1(x_0)), \nonumber \\&\quad b_3(x_0, x_1, x_2)+a_3(x_0+b_0, x_1+b_1(x_0), x_2+b_2(x_0, x_1)), \quad \ldots ]. \end{aligned}$$
(17)

Denote by \(\mathcal {K}_{p, n}\) the quotient of \(\mathcal {K}_p\) by the pointwise stabilizer of the \(n\hbox {th}\) level of the tree \(X^*\). We can consider \(\mathcal {K}_{p, n}\) as a subgroup of the automorphism group of the finite subtree \(X^{[n]}=\bigcup _{k=0}^nX^k\subset X^*\).

Proposition 5.1

The group \(\mathcal {K}_{p, n}\) is a Sylow subgroup of the symmetric group \(\mathop {\mathrm {Symm}}(X^n)\) and of the automorphism group of the tree \(X^{[n]}\).

Proof

The order of \(\mathop {\mathrm {Symm}}(X^n)\) is \(p^n!\), and the maximal power of p dividing it is

$$\begin{aligned} \frac{p^n}{p}+\frac{p^n}{p^2}+\cdots +\frac{p^n}{p^n}=\frac{p^n-1}{p-1}. \end{aligned}$$

It follows that the order of the Sylow p-subgroup of \(\mathop {\mathrm {Symm}}(X^n)\) is \(p^{\frac{p^n-1}{p-1}}\). The order of \(\mathcal {K}_{p, n}\) is equal to the number of possible tableaux

$$\begin{aligned}{}[a_0, a_1(x_1), a_2(x_1, x_2), \ldots a_{n-1}(x_1, \ldots , x_{n-1})], \end{aligned}$$

where \(a_i\) is an arbitrary map from \(X^i\) to the cyclic group \(\langle \sigma \rangle \) of order p. The number of possibly maps \(a_i\) is hence \(p^{p^i}\). Consequently, the number of possible tableaux is \(p^{1+p+p^2+\cdots +p^{n-1}}=p^{\frac{p^n-1}{p-1}}\). Since the group of all automorphisms of the tree \(X^{[n]}\) is contained in \(\mathop {\mathrm {Symm}}(X^n)\) and contains \(\mathcal {K}_{p, n}\), the subgroup \(\mathcal {K}_{p, n}\) is its Sylow p-subgroup. \(\square \)

Proposition 5.2

Let \(g\in \mathcal {K}_p\) be represented by a tableau \([a_0, a_1(x_1), a_2(x_1, x_2), \ldots ]\). Consider the map \(\alpha :\mathcal {K}_p\longrightarrow \mathbb {F}_p^\omega \), where \(\mathbb {F}_p^\omega \) is the infinite Cartesian product of additive groups of \(\mathbb {F}_p\), given by

$$\begin{aligned} \alpha (g)=\left( a_0, \sum _{x_1\in \mathbb {F}_p}a_1(x_1), \sum _{(x_1, x_2)\in \mathbb {F}_p^2}a_2(x_1, x_2), \ldots \right) . \end{aligned}$$
(18)

In other words, we just sum up modulo p all the decorations of the portrait of g on each level. Then \(\alpha \) is the abelianization epimorphism \(\mathcal {K}_p\longrightarrow \mathcal {K}_p/[\mathcal {K}_p, \mathcal {K}_p]\cong \mathbb {F}_p^\omega \).

Proof

It is easy to check that \(\alpha \) is a homomorphism. It remains to show that its kernel is the derived subgroup of \(\mathcal {K}_p\). This is a folklore fact, and we show here how it follows from a more general result of Kaloujnine.

Let \(g\in \mathcal {K}_p\) be represented by a tableau

$$\begin{aligned}{}[a_0, a_1(x_1), a_2(x_1, x_2), \ldots ]. \end{aligned}$$

Each function \(a_n(x_1, x_2, \ldots , x_n)\) can be written as a polynomial

$$\begin{aligned} \sum _{0\le k_i\le p-1}c_{k_1, k_2, \ldots , k_n}x_1^{k_1}x_2^{k_2}\cdots x_n^{k_n} \end{aligned}$$

for some coefficients \(c_{k_1, k_2, \ldots , k_n}\in \mathbb {F}_p\).

It is proved in [27, Theorem 6] (see also Equation (5.4) in [28]) that the derived subgroup \([\mathcal {K}_p, \mathcal {K}_p]\) of \(\mathcal {K}_p\) is the set of elements defined by tableaux in which \(a_0\) and the coefficient \(c_{p-1, p-1, \ldots , p-1}\) at the eldest term \(x_1^{p-1}x_2^{p-1}\cdots x_n^{p-1}\) are equal to zero for every n.

Note that \(\sum _{x\in \mathbb {F}_p}x^k\) is equal to zero for \(k=0, 1, \ldots , p-2\) and is equal to \(-1\) for \(k=p-1\). Therefore,

$$\begin{aligned} \sum _{(x_1, x_2, \ldots , x_n)\in \mathbb {F}_p^n}x_1^{k_1}x_2^{k_2}\cdots x_n^{k_n}=\prod _{i=1}^n\sum _{x\in \mathbb {F}_p}x^{k_i} \end{aligned}$$

is equal to zero for all n-tuples \((k_1, k_2, \ldots , k_n)\in \{0, 1, \ldots , p-1\}^n\) except for \((p-1, p-1, \ldots , p-1)\), when it is equal to \((-1)^n\). It follows that the coefficient at the eldest term of \(a_n(x_1, x_2, \ldots , x_n)\) is equal to zero if and only if \(\sum _{(x_1, x_2, \ldots , x_n)\in \mathbb {F}_p^n}a_n(x_1, x_2, \ldots , x_n)=0\). \(\square \)

5.2 Polynomial bases of \(C(X^\omega , \mathbb {F}_p)\)

Proposition 5.3

Suppose that an ordered basis \(\mathsf {B}\) of \(\Bbbk ^X\) is such that the matrices of \(\pi _1(g)\) for \(g\in G\le \mathop {\mathrm {Aut}}(X^*)\) are all upper uni-triangular, and the minimal element of \(\mathsf {B}\) is the constant one function. Then the matrices of \(\pi _n(g)\) in the basis \(\mathsf {B}^{\otimes n}\) and of \(\pi _\infty (g)\) in the associated basis \(\mathsf {B}_\infty \) are upper uni-triangular for all \(g\in G\).

See Sect. 2.4 for the definition of the representations \(\pi _n\). We say that a matrix is upper uni-triangular if all its elements below the main diagonal are equal to zero, and all the elements on the diagonal are equal to one. From now on, unless the contrary is specifically mentioned, “uni-triangular” will mean “upper uni-triangular”. Note that if G is a subgroup of the group of finite automata, then the matrices \(\pi _\infty (g)\) are automatic, by Theorem 4.14.

Proof

Let \(b_0<b_1<\cdots <b_{d-1}\) be the ordered basis \(\mathsf {B}\). Let \(\mathsf {Y}=\{y_0, y_1, \ldots , y_{d-1}\}\) be the corresponding basis of the right module \(\Phi _G\). Namely, we take for every \(\sum _{x\in X}a_x\delta _x\in \mathsf {B}\) the corresponding element \(\sum _{x\in X}a_xx\in \Phi =\Bbbk [X\cdot G]\), see Sect. 2.3. Then \(\mathsf {B}=\mathsf {Y}\otimes \varepsilon \), where \([\varepsilon ]\) is the left G-module of the trivial representation of G, see 2.4.

If the matrices of \(\pi _1(g)\) are uni-triangular, then

$$\begin{aligned} \pi _1(g)(b_i)=b_i+a_{i-1, i}b_{i-1}+a_{i-2, i}b_{i-2}+\cdots +a_{0, i}b_0 \end{aligned}$$

for some \(a_{k, i}\in \Bbbk \) and all i. It follows that, in the bimodule \(\Phi \), we have relations

$$\begin{aligned} g\cdot y_i=y_i\cdot g_{i, i}+a_{i-1, i}y_{i-1}\cdot g_{i-1, i}+a_{i-2, i}y_{i-2}\cdot g_{i-2, i}+\cdots +a_{0, i}y_0\cdot g_{0, i} \end{aligned}$$
(19)

for some \(g_{j, i}\in \Bbbk [G]\) such that \(g_{j, i}\cdot \epsilon =\epsilon \) (Recall that the last equality just means that the sum of coefficients of \(g_{j, i}\in \Bbbk [G]\), i.e., the value of the augmentation map, is equal to one). Consequently, relation (19) together with the condition \(g_{j, i}\cdot \varepsilon =\varepsilon \) hold for all \(g\in \Bbbk [G]\) such that \(g\cdot \varepsilon =\varepsilon \).

It follows that every element \(g\cdot y_{i_1}\otimes y_{i_2}\otimes \cdots \otimes y_{i_n}\in \Phi ^{\otimes n}\) is equal to \(y_{i_1}\otimes y_{i_2}\otimes \cdots \otimes y_{i_n}\cdot h\) plus a sum of elements of the form \(y_{j_1}\otimes y_{j_2}\otimes \cdots y_{j_n}\cdot a_{j_1, j_2, \ldots , j_n}\), where \(h\in \Bbbk [G]\) is such that \(h\cdot \varepsilon =\varepsilon \), \(a_{j_1, j_2, \ldots , j_n}\in \Bbbk [G]\), and \(j_k\le i_k\) for all \(k=1, 2, \ldots , n\), and \((j_1, j_2, \ldots , j_n)\ne (i_1, i_2, \ldots , i_n)\). Taking tensor product with \(\varepsilon \) and applying Proposition 2.10, we conclude that for every function \(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_n}\in \Bbbk ^{X^n}\) the function \(\pi _\infty (g)(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_n})\) is equal to \(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_n}\) plus a linear combination of functions \(b_{j_1}\otimes b_{j_2}\otimes \cdots \otimes b_{j_n}\) such that \(j_k\le i_k\) for all \(k=1, 2, \ldots , n\), and \((j_1, j_2, \ldots , j_n)\ne (i_1, i_2, \ldots , i_n)\). But any such function \(b_{j_1}\otimes b_{j_2}\otimes \cdots \otimes b_{j_n}\) is an element of \(\mathsf {B}_\infty \), which is smaller than \(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_n}\) in the inverse lexicographic order. This proves that the matrix of \(\pi _\infty (g)\) in the basis \(\mathsf {B}_\infty \) is uni-triangular. \(\square \)

Throughout the rest of our paper we assume that \(|X|=p\) is prime, \(\Bbbk \) is the field \(\mathbb {F}_p\) of p elements, G is a subgroup of \(\mathcal {K}_p\), and we identify X with \(\mathbb {F}_p\). We will be able then to use Proposition 5.3 to construct bases of \(C(X^\omega , \mathbb {F}_p)\) in which the representation \(\pi _\infty \) of \(\mathcal {K}_p\) (and hence of G) are uni-triangular.

Every function \(f\in \mathbb {F}_p^X\) can be represented as a polynomial \(f(x)\in \mathbb {F}_p[x]\), using the formula

$$\begin{aligned} \delta _t(x)=\frac{x(x-1)(x-2)\cdots (x-p+1)}{(x-t)}, \end{aligned}$$

where \((x-t)\) in the numerator and the denominator cancel each other (Recall that \((p-1)!=-1\quad (\hbox {mod}\,p)\), by Wilson’s theorem).

Since \(x^p=x\) as a function on \(\mathbb {F}_p\) (by Fermat’s little theorem), polynomials that differ by an element of the ideal generated by \(x^p-x\) represent the same function. Note that the ring \(\mathbb {F}_p[x]/(x^p-x)\) has cardinality \(p^p\), hence we get a natural bijection between \(\mathbb {F}_p[x]/(x^p-x)\) and \(\mathbb {F}_p^X\), mapping a polynomial to the function it defines on \(\mathbb {F}_p\). From now on, we will thus identify the space of functions \(\mathbb {F}_p^X\) with the \(\mathbb {F}_p\)-algebra \(\mathbb {F}_p[x]/(x^p-x)\).

Following Kaloujnine, we will call the elements of \(\mathbb {F}_p[x]/(x^p-x)\) reduced polynomials. We write them as usual polynomials \(a_0+a_1x+\cdots +a_{p-1}x^{p-1}\) (but keeping in mind reduction, when performing multiplication).

Suppose that \(g\in G\) is such that \(g(x)=x+1\) for all \(x\in X\). Then \(\pi _1(g)\) acts on the functions \(f\in V_1=\mathbb {F}_p^X\) by the rule

$$\begin{aligned} \pi _1(g)(f)(x)=f(x-1). \end{aligned}$$

In particular, if we represent f as a polynomial, then \(\pi _1(g)\) does not change its degree and the coefficient of the leading term. It follows that the matrix of the operator \(\pi _1(g)\) in the basis \(e_0(x)=\mathbf {1}, e_1(x)=x, e_2(x)=x^2, \ldots , e_{p-1}(x)=x^{p-1}\) is uni-triangular. Let us denote this marked basis by \(\mathsf {E}\).

Definition 5.4

The basis \(\mathsf {E}_\infty \) of \(C(X^\omega , \mathbb {F}_p)\) corresponding to \(\mathsf {E}\) and consisting of all monomial functions \(x_1^{k_1}x_2^{k_2}\cdots \) on \(X^\omega \) ordered inverse lexicographically, so that

$$\begin{aligned} e_0=1,\quad e_1=x_1,\quad e_2=x_1^2, \ldots , e_{p-1}=x_1^{p-1},\quad e_p=x_2,\quad e_{p+1}=x_1x_2, \ldots \end{aligned}$$

is called the Kaloujnine basis of monomials.

It is easy to see that \(e_n\in \mathsf {E}_\infty \) is equal to the monomial function

$$\begin{aligned} e_n(x_1x_2\cdots )=x_1^{k_1}x_2^{k_2}\ldots , \end{aligned}$$
(20)

where \(k_1, k_2, \ldots \) are the digits of the base p expansion of n, i.e., such that \(k_i\in \{0, 1, \ldots , p-1\}\) and

$$\begin{aligned} n=k_1+k_2\cdot p+k_3\cdot p^2+\cdots . \end{aligned}$$

Coordinates of a function \(f\in C(X^\omega , \mathbb {F}_p)\) in the basis \(\mathsf {E}_\infty \) are the coefficients of the representation of the function f as a polynomial in the variables \(x_1, x_2, \ldots \). Since we are dealing with functions, we assume that these polynomials are reduced, i.e., are elements of the ring \(\mathbb {F}_p[x_1, x_2, \ldots ]/(x_1^p-x_1, x_2^p-x_2, \ldots )\).

As an immediate corollary of Proposition 5.3 we get the following.

Theorem 5.5

The representation \(\pi _\infty \) of \(\mathcal {K}_p\) in the Kaloujnine basis \(\mathsf {E}_\infty \) is uni-triangular. In particular, the representations \(\pi _\infty \) of all its subgroups \(G\le \mathcal {K}_p\) are uni-triangular in \(\mathsf {E}_\infty \).

We can change the ordered basis \(\mathsf {E}=\{e_0=\mathbf {1}, e_1=x, \ldots , e_{p-1}(x)=x^{p-1}\}\) to any ordered basis \(\mathsf {F}=(f_0, f_1, \ldots , f_{p-1})\) consisting of polynomials of degrees \(0, 1, 2, \ldots , p-1\), respectively, since then the transition matrix from \(\mathsf {E}\) to \(\mathsf {F}\) will be triangular, hence the representation of G in the basis \(\mathsf {F}\) will be also uni-triangular.

For example, a natural choice is the basis \(\mathsf {B}\) in which the matrix of the cyclic permutation \(x\mapsto x+1:X\longrightarrow X\) is the Jordan cell

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 1 &{} \cdots \\ 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) . \end{aligned}$$

To get such a basis, define the functions \(b_0, b_1, \ldots , b_{p-1}\in V_1\) by the formula

$$\begin{aligned} b_k=(\pi _1(g)-1)^{p-1-k}(\delta _0) \end{aligned}$$

for \(k=0, 1, \ldots , p-2\) and \(b_{p-1}=\delta _0\). Then \((\pi _1(g)-1)(b_k)=(\pi _1(g)-1)^{p-k}(\delta _0)=b_{k-1}\), for all \(k=1, 2, \ldots , p-1\), i.e.,

$$\begin{aligned} b_{k-1}(x)=b_k(x-1)-b_k(x). \end{aligned}$$

Note that

$$\begin{aligned} b_0=(\pi _1(g)-1)^{p-1}(\delta _0)=(\pi _1(g)^{p-1}+\pi _1(g)^{p-2}+\cdots +1)(\delta _0)=\sum _{k=0}^{p-1}\delta _k=\mathbf {1}, \end{aligned}$$

i.e., the basis \(b_0<b_1<\cdots <b_{p-1}\) is marked.

Proposition 5.6

For every \(k\in \{1, \ldots , p-1\}\) and \(x\in X=\mathbb {F}_p\) we have

$$\begin{aligned} b_k(x)=(-1)^k\left( {\begin{array}{c}x+k\\ k\end{array}}\right) =(-1)^k\frac{(x+1)(x+2)\cdots (x+k)}{k!}. \end{aligned}$$

Note that \(k!\ne 0\) in \(\mathbb {F}_p\) for every \(k=1, 2, \ldots , p-1\).

Proof

We have \((p-1)!=1\) and \((-1)^{p-1}=1\) in \(\mathbb {F}_p\). We also have \((x+1)(x+2)\cdots (x+p-1)=0\) for all \(x\in \mathbb {F}_p{\setminus }\{0\}\). It follows that \((-1)^{p-1}\left( {\begin{array}{c}x+p-1\\ p-1\end{array}}\right) =\delta _0=f_{p-1}\).

It is enough now to check that the functions \((-1)^k\left( {\begin{array}{c}x+k\\ k\end{array}}\right) \) satisfy the recurrent relation \(b_{k-1}(x)=b_k(x-1)-b_k(x)\). But we have

$$\begin{aligned} (-1)^k\left( {\begin{array}{c}x-1+k\\ k\end{array}}\right) -(-1)^k\left( {\begin{array}{c}x+k\\ k\end{array}}\right)= & {} (-1)^{k-1}\left( \left( {\begin{array}{c}x+k\\ k\end{array}}\right) -\left( {\begin{array}{c}x+k-1\\ k\end{array}}\right) \right) \\= & {} (-1)^{k-1}\left( {\begin{array}{c}x+k-1\\ k-1\end{array}}\right) \end{aligned}$$

by the well known identity

$$\begin{aligned} \left( {\begin{array}{c}a\\ b\end{array}}\right) =\left( {\begin{array}{c}a-1\\ b\end{array}}\right) +\left( {\begin{array}{c}a-1\\ b-1\end{array}}\right) . \end{aligned}$$

\(\square \)

Proposition 5.7

The transition matrix from the basis \((\delta _0, \delta _1, \ldots , \delta _{p-1})\) to the basis \(\mathsf {B}=(b_0, b_1, \ldots , b_{p-1})\) is

$$\begin{aligned} T=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \left( {\begin{array}{c}p-1\\ 0\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 1\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 2\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 3\end{array}}\right) &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-1\end{array}}\right) \\ \left( {\begin{array}{c}p-2\\ 0\end{array}}\right) &{} \left( {\begin{array}{c}p-2\\ 1\end{array}}\right) &{} \left( {\begin{array}{c}p-2\\ 2\end{array}}\right) &{} \left( {\begin{array}{c}p-2\\ 3\end{array}}\right) &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1 &{} 3 &{} 3 &{} 1 &{} \cdots &{} 0\\ 1 &{} 2 &{} 1 &{} 0 &{} \cdots &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} \cdots &{} 0\\ 1 &{} 0 &{} 0 &{} 0 &{} \cdots &{} 0\end{array}\right) \end{aligned}$$

Its inverse is obtained by transposing T with respect to the secondary diagonal:

$$\begin{aligned} T^{-1}=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 0 &{} 0 &{} 0 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-1\end{array}}\right) \\ 0 &{} 0 &{} 0 &{} 0 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-2\end{array}}\right) \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 0 &{} 1 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 3\end{array}}\right) \\ 0 &{} 0 &{} 1 &{} 3 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 2\end{array}}\right) \\ 0 &{} 1 &{} 2 &{} 3 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 1\end{array}}\right) \\ 1 &{} 1 &{} 1 &{} 1 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 0\end{array}}\right) \end{array}\right) . \end{aligned}$$

Proof

It follows from Proposition 5.6 that the entry \(t_{ij}\) of the transition matrix, where \(i=0, 1, \ldots , p-1\) and \(j=0, 1, \ldots , p-1\) is equal to

$$\begin{aligned} t_{ij}= & {} (-1)^j\left( {\begin{array}{c}i+j\\ j\end{array}}\right) =(-1)^j\frac{(i+j)(i+j-1)\cdots (i+j-j+1)}{j!} \\= & {} \frac{(-i-j)(-i-j+1)\cdots (-i-1)}{j!} \\= & {} \frac{(p-i-j)(p-i-j+1)\cdots (p-i-1)}{j!}=\left( {\begin{array}{c}p-1-i\\ j\end{array}}\right) , \end{aligned}$$

which proves the first claim of the proposition.

In order to prove the second claim, we have to show that the product

$$\begin{aligned} \left( \left( {\begin{array}{c}p-1-i\\ j\end{array}}\right) \right) _{i,j=0}^{p-1}\cdot \left( \left( {\begin{array}{c}j\\ p-1-i\end{array}}\right) \right) _{i,j=0}^{p-1} \end{aligned}$$

is equal to the identity matrix. The general entry of the product is equal to

$$\begin{aligned} a_{ij}=\sum _{k=0}^{p-1}\left( {\begin{array}{c}p-1-i\\ k\end{array}}\right) \cdot \left( {\begin{array}{c}j\\ p-1-k\end{array}}\right) = \left( {\begin{array}{c}p-1-i+j\\ p-1\end{array}}\right) . \end{aligned}$$

But \(\left( {\begin{array}{c}x+p-1\\ p-1\end{array}}\right) =\frac{(x+p-1)(x+p-2)\cdots (x+1)}{(p-1)!}\) is equal to one for \(x=0\) and is equal to zero for \(x\ne 0\), which shows that the product is equal to the identity matrix. \(\square \)

Example 5.8

In the case \(p=2\), the transition matrix is \(T=\left( \begin{array}{c@{\quad }c}1 &{} 1\\ 1 &{} 0\end{array}\right) \), and its inverse is \(T^{-1}=\left( \begin{array}{c@{\quad }c} 0 &{} 1\\ 1 &{} 1\end{array}\right) \). Let us use this to find the matrix recursions for some self-similar groups acting on the binary tree in the new basis \(\mathsf {B}=\{b_0, b_1\}\).

For the adding machine (see Examples 2.3, 2.5, 2.6), we have

$$\begin{aligned} \Xi _2(a)=\left( \begin{array}{c@{\quad }c} 0 &{} 1\\ 1 &{} 1\end{array}\right) \cdot \left( \begin{array}{c@{\quad }c}0 &{} a\\ 1 &{} 0\end{array}\right) \cdot \left( \begin{array}{c@{\quad }c} 1 &{} 1\\ 1 &{} 0\end{array}\right) = \left( \begin{array}{c@{\quad }c}1 &{} 1\\ 1+a &{} 1\end{array}\right) . \end{aligned}$$

Proposition 5.9

The matrix of the binary adding machine \(\pi _\infty (a)\) in the basis \(\mathsf {B}_\infty \) is the infinite Jordan cell

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} 1 &{} \cdots \\ 0 &{} 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) . \end{aligned}$$

Proof

Let us prove the statement by induction, using the matrix recursion from Example 5.8. The matrix of a on \(V_0\) is (1). The four 2-decimations \(\Xi _2(J)_{ij}\) of the Jordan cell are

$$\begin{aligned}&\Xi _2(J)_{00}:\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \textcircled 1 &{} 1 &{} \textcircled 0 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 1 &{} 0 &{} \cdots \\ \textcircled 0 &{} 0 &{} \textcircled 1 &{} 1 &{} \cdots \\ 0 &{} 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \mapsto \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \\&\quad \Xi _2(J)_{01}:\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} \textcircled 1 &{} 0 &{}\textcircled 0 &{} \cdots \\ 0 &{} 1 &{} 1 &{} 0 &{} \cdots \\ 0 &{} \textcircled 0 &{} 1 &{} \textcircled 1 &{} \cdots \\ 0 &{} 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \mapsto \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \\&\quad \Xi _2(J)_{10}: \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} 0 &{} \cdots \\ \textcircled 0 &{} 1 &{} \textcircled 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} 1 &{} \cdots \\ \textcircled 0 &{} 0 &{} \textcircled 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \mapsto \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} \cdots \\ 0 &{} 0 &{} 0 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \\&\quad \Xi _2(J)_{11}: \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} \textcircled 1 &{} 1 &{} \textcircled 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} 1 &{} \cdots \\ 0 &{} \textcircled 0 &{} 0 &{} \textcircled 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \mapsto \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) , \end{aligned}$$

which agrees with the recursion \(\Xi _2(a)=\left( \begin{array}{c@{\quad }c}1 &{} 1 \\ 1+a &{} 1\end{array}\right) \). \(\square \)

Lemma 5.10

If \(g\in \mathop {\mathrm {Aut}}(X^*)\) satisfies the wreath recursion \(g=(g_0, g_1)\), then its matrix recursion in the basis \(\mathsf {B}_\infty \) (over the field \(\mathbb {F}_p\) for \(p=2\)) is

$$\begin{aligned} \Xi _2(g)=\left( \begin{array}{c@{\quad }c} g_1 &{} 0 \\ g_0+g_1 &{} g_0\end{array}\right) . \end{aligned}$$

If it satisfies \(g=\sigma (g_0, g_1)\), then

$$\begin{aligned} \Xi _2(g)= \left( \begin{array}{c@{\quad }c} g_0 &{} g_0 \\ g_0+g_1 &{} g_0\end{array}\right) . \end{aligned}$$

Proof

We have, in the first case,

$$\begin{aligned} \Xi _2(g)=\left( \begin{array}{c@{\quad }c}0 &{} 1\\ 1 &{} 1\end{array}\right) \left( \begin{array}{c@{\quad }c} g_0 &{} 0 \\ 0 &{} g_1\end{array}\right) \left( \begin{array}{c@{\quad }c} 1 &{} 1 \\ 1 &{} 0\end{array}\right) =\left( \begin{array}{c@{\quad }c} g_1 &{} 0\\ g_0+g_1 &{} g_0\end{array}\right) . \end{aligned}$$

In the second case:

$$\begin{aligned} \Xi _2(g)=\left( \begin{array}{c@{\quad }c}0 &{} 1\\ 1 &{} 1\end{array}\right) \left( \begin{array}{c@{\quad }c} 0 &{} g_1 \\ g_0 &{} 0\end{array}\right) \left( \begin{array}{c@{\quad }c} 1 &{} 1 \\ 1 &{} 0\end{array}\right) =\left( \begin{array}{c@{\quad }c} g_0 &{} g_0\\ g_0+g_1 &{} g_0\end{array}\right) . \end{aligned}$$

\(\square \)

Fig. 7
figure 7

The matrices of the generators bcd of the Grigorchuk group

Example 5.11

It follows from Lemma 5.10 that the matrix recursion for the generators of the Grigorchuk group (see Example 2.4) in the basis \(\mathsf {B}\) is

$$\begin{aligned} \Xi _2(a)= & {} \left( \begin{array}{c@{\quad }c}1 &{} 1\\ 0 &{} 1\end{array}\right) ,\quad \Xi _2(b)=\left( \begin{array}{c@{\quad }c}c &{} 0\\ a+c &{} a\end{array}\right) , \\ \Xi _2(c)= & {} \left( \begin{array}{c@{\quad }c}d &{} 0\\ a+d &{} a\end{array}\right) ,\quad \Xi _2(d)=\left( \begin{array}{c@{\quad }c}b &{} 0\\ 1+b &{} 1\end{array}\right) . \end{aligned}$$

See a visualization of the matrices bcd on Fig. 7, where black pixels correspond to ones, and white pixels to zeros.

Denote \(U_n=\langle b_0, b_1, \ldots , b_n\rangle =\langle e_0, e_1, \ldots , e_n\rangle <C(X^\omega , \mathbb {F}_p)\).

Proposition 5.12

Each space \(U_n\) is \(\mathcal {K}_p\)-invariant, and the kernel in \(\mathcal {K}_p\) of the restriction of \(\pi _\infty \) onto \(U_{p^{n-1}}=\langle b_0, b_1, \ldots , b_{p^{n-1}}\rangle \) coincides with the kernel of \(\pi _n\). In other words, restriction of \(\pi _\infty \) onto \(U_{p^{n-1}}\) defines a faithful representation of \(\mathcal {K}_{p, n}\).

Proof

The subspace \(\langle b_0, b_1, \ldots , b_{p^{n-1}}\rangle < C(X^\omega , \mathbb {F}_p)\) is equal to the span of the product \(V_{n-1}\cdot e_{p^{n-1}}\), where \(e_{p^{n-1}}\) is the function on \(X^\omega \) given by

$$\begin{aligned} e_{p^{n-1}}(x_1x_2\cdots )=x_n, \end{aligned}$$

according to (20). In other words, it is the tensor product \(V_{n-1}\otimes \langle e_1\rangle \), where \(e_1(x)=x\).

Suppose that \(g\in G\) belongs to the kernel of the restriction of \(\pi _\infty \) onto \(V_{n-1}\otimes \langle e_1\rangle \). Then for every \(v\in X^{n-1}\) we have \(\pi _1(g)(\delta _v)=\delta _v\), since \(\delta _v\in V_{n-1}\). Then

$$\begin{aligned} \pi _\infty (g)(\delta _v\otimes e_1)=\delta _v\otimes (e_1\circ \pi _1(g|_v)^{-1}), \end{aligned}$$

hence \(\pi _1(g|_v)\) is identical for every \(v\in X^{n-1}\). It follows that g acts trivially on \(X^n\), i.e., that \(\pi _n(g)\) is trivial. \(\square \)

Thus, we get a faithful representation of \(\mathcal {K}_{p, n}=\wr _{k=1}^nC_p\) by uni-triangular matrices of dimension \(p^{n-1}+1\). Note that this is the smallest possible dimension for a faithful representation, since the nilpotency class of \(\mathcal {K}_{p, n}\) is equal to \(p^{n-1}\), while the nilpotency class of the group of uni-triangular matrices of dimension d is equal to \(d-1\).

5.3 The first diagonal

Let \(\alpha :\mathcal {K}_p\longrightarrow \mathbb {F}_p^\omega \) be the abelianization homomorphism given by (18). We write

$$\begin{aligned} \alpha (g)=(\alpha _0(g), \alpha _1(g), \ldots ). \end{aligned}$$

If \(A=(a_{ij})_{i, j=0}^\infty \) is an infinite matrix, then its first diagonal is the sequence \((a_{01}, a_{12}, a_{23}, \ldots )\), i.e., the first diagonal above the main diagonal of A.

Theorem 5.13

Let \(g\in G\), and let \(A_g=(a_{ij})_{i, j=0}^\infty \) be the matrix of \(\pi _\infty (g)\) in the basis \(\mathsf {B}_\infty \), constructed in the previous section. Let \((s_1, s_2, \ldots )=(a_{01}, a_{12}, \ldots )\) be the first diagonal of \(A_g\). Then

$$\begin{aligned} s_n=\alpha _k(g), \end{aligned}$$

where \(p^k\) is the maximal power of p dividing n.

For example, if \(p=2\), and \(\alpha (g)=(a_0, a_1, a_2, \ldots )\), then the first diagonal of \(A_g\) is

$$\begin{aligned} a_0, a_1, a_0, a_2, a_0, a_1, a_0, a_3, a_0, a_1, a_0, a_2, \ldots \end{aligned}$$

Proof

The first diagonal of a product of two upper uni-triangular matrices A and B is equal to the sum of the first diagonals of the matrices A and B. It follows that it is enough to prove the theorem for rooted automorphisms of \(\mathop {\mathrm {Aut}}(X^*)\) (i.e., automorphisms g such that \(g|_v\) is trivial for all non-empty words \(v\in X^*\)) and for automorphisms acting trivially on the first level X.

If the automorphism is rooted, then it is a power of the automorphism

$$\begin{aligned} a:x_1x_2\cdots x_n\mapsto (x_1+1)x_2\cdots x_n. \end{aligned}$$

It follows from the definition of the basis \(\mathsf {B}\) that the matrix of \(\pi _\infty (a)\) is the block-diagonal matrix consisting of the Jordan cells of size p. Consequently, its first diagonal is the periodic sequence of period \((1, 1, \ldots , 1, 0)\) of length p. Hence, the first diagonal of \(a^s\) is \((s, s, \ldots , s, 0)\) repeated periodically. This proves the statement of the theorem for the automorphisms of the form \(a^s\).

Suppose that g acts trivially on the first level of the tree. Then its matrix recursion in the basis \(\{\delta _x\}_{x\in X}\) is the diagonal matrix with the entries \(g|_x\) on the diagonal. It follows that the matrix recursion for g in the basis \(\{b_i\}_{i=0}^{p-1}\) is equal to the product of the matrices

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c}0 &{} 0 &{} 0 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-1\end{array}}\right) \\ 0 &{} 0 &{} 0 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-2\end{array}}\right) \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 1 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 2\end{array}}\right) \\ 0 &{} 1 &{} 2 &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 1\end{array}}\right) \\ 1 &{} 1 &{} 1 &{} \cdots &{} 1\end{array}\right) \cdot \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c}g|_0 &{} 0 &{} 0 &{} \cdots &{} 0\\ 0 &{} g|_1 &{} 0 &{} \cdots &{} 0\\ 0 &{} 0 &{} g|_2 &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \cdots &{} g|_{p-1}\end{array}\right) \cdot \\ \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c}1 &{} \left( {\begin{array}{c}p-1\\ 1\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 2\end{array}}\right) &{} \cdots &{} \left( {\begin{array}{c}p-1\\ p-1\end{array}}\right) \\ 1 &{} \left( {\begin{array}{c}p-2\\ 1\end{array}}\right) &{} \left( {\begin{array}{c}p-2\\ 2\end{array}}\right) &{} \cdots &{} 0\\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1 &{} 2 &{} 1 &{} \cdots &{} 0\\ 1 &{} 1 &{} 0 &{} \cdots &{} 0\\ 1 &{} 0 &{} 0 &{} \cdots &{} 0\end{array}\right) . \end{aligned}$$

It is easy to see that the entries on the first diagonal above the main diagonal of the product are equal to zero, and that the entry in the left bottom corner is equal to \(g|_0+g|_1+\cdots +g|_{p-1}\).

If we apply the stencil map

$$\begin{aligned} \Xi _p(s_1, s_2, \ldots )= & {} ((s_1, s_{1+p}, \ldots ), (s_2, s_{2+p}, \ldots ), \ldots , (s_p, s_{2p}, \ldots ))\\= & {} (w_1, w_2, \ldots , w_p) \end{aligned}$$

to the first diagonal of \(A_g\), then \(w_1, w_2, \ldots , w_{p-1}\) are the main diagonals of the p-decimations \(B_1, B_2, \ldots , B_{p-1}\) of \(A_g\), where \((B_1, B_2, \ldots , B_{p-1})\) is the first diagonal above the main diagonal of \(\Xi _p(A_g)\). The sequence \(w_p\) is the first diagonal of the entry in the lower left corner of \(\Xi _p(A_g)\). It follows that \((s_1, s_2, \ldots )\) is of the form \((0, 0, \ldots , s_1', 0, 0, \ldots , s_2', \ldots )\), where there are \(p-1\) zeros at the beginning and between the entries \(s_i'\), and \((s_1', s_2', \ldots )\) is the first diagonal of \(\pi _\infty (g|_0+g|_1+\cdots +g|_{p-1})\) in the basis \(\mathsf {B}_\infty \). This provides us with an inductive proof of the statement of the theorem. \(\square \)

Example 5.14

Consider the matrices \(a, b, c, d\) generating the Grigorchuk group, as described in Example 5.11. It follows directly from the description of the action of the elements \(a, b, c, d\) on the rooted tree (see Example 2.4) that

$$\begin{aligned} \alpha (a)= & {} (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ldots ),\\ \alpha (b)= & {} (0, 1, 1, 0, 1, 1, 0, 1, 1, 0, \ldots ),\\ \alpha (c)= & {} (0, 1, 0, 1, 1, 0, 1, 1, 0, 1, \ldots ),\\ \alpha (d)= & {} (0, 0, 1, 1, 0, 1, 1, 0, 1, 1, \ldots ). \end{aligned}$$

It follows from Theorem 5.13 that the first diagonal of a is \((1, 0, 1, 0, \ldots )\). The first diagonal of b is \((s_1, s_2, \ldots )\) where

$$\begin{aligned} s_{2^{3k}(2m+1)}=0, \quad s_{2^{3k+1}(2m+1)}=1, \quad s_{2^{3k+2}(2m+1)}=1, \end{aligned}$$

where \(k, m\ge 0\) are integers.
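
The rule of Theorem 5.13 is easy to experiment with computationally. The following sketch (in Python; the helper names are ours, not from any standard package) reproduces the \(p=2\) pattern displayed after the theorem, and, with \(\alpha (b)\) as above, the first diagonal of b:

    # Theorem 5.13: the n-th entry of the first diagonal of A_g is
    # alpha_k(g), where p^k is the maximal power of p dividing n.
    def nu(p, n):
        """p-adic valuation: the maximal k with p^k dividing n."""
        k = 0
        while n % p == 0:
            n //= p
            k += 1
        return k

    def first_diagonal(alpha, p, length):
        """(s_1, ..., s_length) built from alpha = (alpha_0, alpha_1, ...)."""
        return [alpha[nu(p, n)] for n in range(1, length + 1)]

    # p = 2, alpha(g) = (a_0, a_1, a_2, ...) kept symbolic:
    print(first_diagonal(['a0', 'a1', 'a2', 'a3'], 2, 12))
    # ['a0', 'a1', 'a0', 'a2', 'a0', 'a1', 'a0', 'a3', 'a0', 'a1', 'a0', 'a2']

    # alpha(b) = (0, 1, 1, 0, 1, 1, ...) gives the first diagonal of b:
    print(first_diagonal([0, 1, 1] * 4, 2, 20))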

The sequence \((s_n)\) from the previous example is a Toeplitz sequence, see [39, 49] and [2, Exercise 10.11.42]. If G is a finitely generated finite-state self-similar group, then the sequence \((\alpha _k(g))_{k\ge 0}\) in Theorem 5.13 is eventually periodic for every \(g\in G\), and then the sequence \((s_n)\) is Toeplitz.

5.4 Generating series

Let \(w=(a_0, a_1, \ldots )\in \Bbbk ^\omega \), and consider the corresponding formal power series \(a_0+a_1x+a_2x^2+\cdots =G_w(x)\in \Bbbk [[x]]\). It is easy to see that if

$$\begin{aligned} \Xi _d(w)=(w_0, w_1, \ldots , w_{d-1}), \end{aligned}$$

then

$$\begin{aligned} G_w(x)=G_{w_0}(x^d)+xG_{w_1}(x^d)+\cdots +x^{d-1}G_{w_{d-1}}(x^d). \end{aligned}$$

Note that if \(\Bbbk =\mathbb {F}_p\) and \(d=p\), then we get

$$\begin{aligned} G_w(x)=(G_{w_0}(x))^p+x(G_{w_1}(x))^p+\cdots +x^{p-1}(G_{w_{p-1}}(x))^p. \end{aligned}$$
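
This identity is easy to confirm numerically on truncated power series; the following sketch (ours; series stored as coefficient lists over \(\mathbb {F}_p\), here with \(p=2\)) checks it for a random sequence w:

    import random

    p, N = 2, 64                    # characteristic and truncation length

    def mul(f, g):                  # product of truncated series mod p
        h = [0] * N
        for i, a in enumerate(f):
            if a:
                for j, b in enumerate(g):
                    if b and i + j < N:
                        h[i + j] = (h[i + j] + a * b) % p
        return h

    def power(f, e):                # e-th power of a truncated series
        r = [1] + [0] * (N - 1)
        for _ in range(e):
            r = mul(r, f)
        return r

    w = [random.randrange(p) for _ in range(N)]
    parts = [w[i::p] for i in range(p)]        # Xi_p(w) = (w_0, ..., w_{p-1})

    rhs = [0] * N
    for i, wi in enumerate(parts):
        gi = power(list(wi) + [0] * (N - len(wi)), p)   # (G_{w_i}(x))^p
        rhs = [(a + b) % p for a, b in zip(rhs, [0] * i + gi[: N - i])]

    assert rhs == w                 # the identity holds coefficient-wise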

We have the following characterization of automatic sequences, due to Christol, see [2, Theorem 12.2.5].

Theorem 5.15

Let \(\Bbbk \) be a finite field of characteristic p. Then a sequence \(w\in \Bbbk ^\omega \) is p-automatic if and only if the generating series \(G_w(x)\) is algebraic over \(\Bbbk (x)\).

Similarly, if \(A=(a_{ij})_{i, j=0}^\infty \) is a matrix over a field \(\Bbbk \), then we can consider the formal series \(G_A(x, y)=\sum _{i, j=0}^\infty a_{ij}x^iy^j\in \Bbbk [[x, y]]\). If

$$\begin{aligned} \Xi _d(A)=\left( A_{ij}\right) _{i, j=0}^{d-1}, \end{aligned}$$

then

$$\begin{aligned} G_A=\sum _{i, j=0}^{d-1}x^iy^jG_{A_{ij}}(x^d, y^d). \end{aligned}$$

We also have a complete analog of Christol’s theorem for matrices, see [2, Theorem 14.4.2].

Theorem 5.16

Let \(\Bbbk \) be a finite field of characteristic p. Then a matrix \(A=(a_{ij})_{i, j=0}^\infty \) is p-automatic if and only if the series \(G_A\) is algebraic over \(\Bbbk (x, y)\).

In the case when A is triangular, it may be natural to use the generating function

$$\begin{aligned} T_A(t, s)=\sum a_{ij}t^{j-i}s^i, \end{aligned}$$

so that \(T_A(t, s)=H_0(s)+H_1(s)t+H_2(s)t^2+\cdots \), where \(H_i(s)\) is the generating function of the \(i\hbox {th}\) diagonal of A. Note that \(T_A\) and \(G_A\) are related by the formula:

$$\begin{aligned} T_A(t, s)=G_A(s/t, t) \end{aligned}$$

Example 5.17

Consider the generators \(a, b, c, d\) of the Grigorchuk group given by the matrices from Example 5.11. Let \(A(x, y)\), \(B(x, y)\), \(C(x, y)\), and \(D(x, y)\) be the corresponding generating series. Note that the generating series of the unit matrix is \(I(x, y)=\frac{1}{1+xy}=\frac{1}{1+s}\).

It follows from the recursions in Example 5.11 that

$$\begin{aligned} A= & {} I(x^2, y^2)+yI(x^2, y^2)+xyI(x^2, y^2)=\frac{1+y+xy}{1+x^2y^2}=\frac{1}{1+s}+\frac{t}{1+s^2}. \\ B= & {} C^2+x(A^2+C^2)+xyA^2= (1+x)C^2+(x+xy)A^2 \\ C= & {} (1+x)D^2+(x+xy)A^2 \\ D= & {} (1+x)B^2+(x+xy)I^2. \end{aligned}$$

Let us make a substitution \(A=y\tilde{A}+\frac{1}{1+xy}\), \(B=y\tilde{B}+\frac{1}{1+xy}\), \(C=y\tilde{C}+\frac{1}{1+xy}\), and \(D=y\tilde{D}+\frac{1}{1+xy}\). Note that \(I=\frac{1}{1+xy}\), so we set \(\tilde{I}=0\). The series \(\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D}\) are the generating series of the matrices obtained from the matrices \(a, b, c, d\) by removing the main diagonal and shifting all columns to the left by one position.

We have then

$$\begin{aligned} \tilde{A}=\frac{1}{1+x^2y^2}= \frac{1}{1+s^2} \end{aligned}$$

and

$$\begin{aligned} y\tilde{B}+\frac{1}{1+xy}=(1+x)\left( y^2{\tilde{C}}^2+\frac{1}{1+x^2y^2}\right) +(x+xy)\left( y^2{\tilde{A}}^2+\frac{1}{1+x^2y^2}\right) , \end{aligned}$$

hence

$$\begin{aligned} y\tilde{B}=(1+x)y^2{\tilde{C}}^2+(x+xy)y^2{\tilde{A}}^2, \end{aligned}$$

hence

$$\begin{aligned} \tilde{B}=(y+xy){\tilde{C}}^2+\frac{xy+xy^2}{1+x^4y^4}=(s+t){\tilde{C}}^2+\frac{s+ts}{1+s^4}. \end{aligned}$$

Similarly,

$$\begin{aligned} \tilde{C}=(y+xy){\tilde{D}}^2+\frac{xy+xy^2}{1+x^4y^4}=(s+t){\tilde{D}}^2+\frac{s+ts}{1+s^4}, \end{aligned}$$

and

$$\begin{aligned} \tilde{D}=(y+xy){\tilde{B}}^2=(s+t){\tilde{B}}^2. \end{aligned}$$

Let us denote \(F=\frac{s+ts}{1+s^4}\). Then \(\tilde{B}\), \(\tilde{C}\), and \(\tilde{D}\) are solutions of the equations

$$\begin{aligned}&(s+t)^7{\tilde{B}}^8+\tilde{B}+(s+t)F^2+F=0 \\&(s+t)^7{\tilde{C}}^8+\tilde{C}+(s+t)^3F^4+F=0, \\&(s+t)^7{\tilde{D}}^8+\tilde{D}+(s+t)^3F^4+(s+t)F^2=0. \end{aligned}$$

Substituting \(t=0\), we get equations for the generating functions \(B_1(s), C_1(s), D_1(s)\) of the first diagonals above the main diagonal in the matrices \(b, c, d\):

$$\begin{aligned}&s^7B_1^8+B_1+\frac{s^3}{1+s^8}+\frac{s}{1+s^4}=0, \end{aligned}$$
(21)
$$\begin{aligned}&s^7C_1^8+C_1+\frac{s^7}{1+s^{16}}+\frac{s}{1+s^4}=0, \end{aligned}$$
(22)
$$\begin{aligned}&s^7D_1^8+D_1+\frac{s^7}{1+s^{16}}+\frac{s^3}{1+s^8}=0. \end{aligned}$$
(23)
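
Equations (21)–(23) can be tested against Theorem 5.13, which gives the coefficients of \(B_1\) explicitly (the Toeplitz rule of Example 5.14: \(s_n=0\) exactly when the 2-adic valuation of n is divisible by 3). The following sketch (ours; exact arithmetic on truncated series over \(\mathbb {F}_2\)) verifies Eq. (21):

    N = 200

    def nu2(n):                      # 2-adic valuation
        k = 0
        while n % 2 == 0:
            n //= 2
            k += 1
        return k

    # B_1(s) = sum_{n >= 1} s_n s^{n-1}, with s_n = 0 iff nu2(n) = 0 (mod 3):
    B1 = [0 if nu2(n + 1) % 3 == 0 else 1 for n in range(N)]

    def geom(step, offset):          # s^offset / (1 + s^step) over F_2
        return [1 if m >= offset and (m - offset) % step == 0 else 0
                for m in range(N)]

    lhs = [0] * N
    for n, c in enumerate(B1):       # s^7 B_1(s)^8 = s^7 B_1(s^8) in char 2
        if c and 8 * n + 7 < N:
            lhs[8 * n + 7] ^= 1
    for f in (B1, geom(8, 3), geom(4, 1)):
        lhs = [a ^ b for a, b in zip(lhs, f)]

    assert not any(lhs)              # Eq. (21) holds up to the truncation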

Denote

$$\begin{aligned} J=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 1 &{} 0 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 1 &{} 0 &{} \cdots \\ 0 &{} 0 &{} 0 &{} 1 &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \ddots \end{array}\right) \end{aligned}$$

Then every upper uni-triangular matrix M can be written as

$$\begin{aligned} I+D_1(M)J+D_2(M)J^2+D_3(M)J^3+\cdots , \end{aligned}$$
(24)

where \(D_i(M)\) is the diagonal matrix whose main diagonal is equal to the \(i\hbox {th}\) diagonal of M.

The generating series \(T_M(t, s)\) is equal to

$$\begin{aligned} \frac{1}{1+s}+M_1(s)t+M_2(s)t^2+M_3(s)t^3+\cdots , \end{aligned}$$
(25)

where \(M_i(s)\) is the usual generating series of the main diagonal of \(D_i(M)\). Addition and multiplication of diagonal matrices \(\mathop {\mathrm {diag}}(a_0, a_1, \ldots )\) correspond to the usual addition and the Hadamard (coefficient-wise) multiplication of the power series \(a_0+a_1s+a_2s^2+\cdots \). Note that we have

$$\begin{aligned} J\mathop {\mathrm {diag}}(a_0, a_1, \ldots )=\mathop {\mathrm {diag}}(a_1, a_2, \ldots )J, \end{aligned}$$

which gives an algebraic rule for multiplication of the power series (25) corresponding to multiplication of matrices.

Namely, we can replace the matrix M by the formal power series (25), where the series in the variable s are added and multiplied coordinate-wise, while the series in the variable t are multiplied in the usual (though non-commutative) way subject to the relation

$$\begin{aligned} t(a_0s^0+a_1s+a_2s^2+\cdots )=(a_1s^0+a_2s+a_3s^2+\cdots )t. \end{aligned}$$

Let \(A=\left( a_{ij}\right) _{i, j=0}^\infty \) be a matrix, and denote by \(\Delta _i(A)=(a_{0, i}, a_{1, i+1}, a_{2, i+2}, \ldots )\) the sequence equal to the \(i\hbox {th}\) diagonal of A.

Let \(d\ge 2\), \(k\in \{0, 1, \ldots , d-1\}\), \(n\ge 1\) be integers, and let \(k+n=dq+r\) for \(q\in \mathbb {Z}\) and \(r\in \{0, 1, \ldots , d-1\}\). Then the sequence \(\Xi _d(\Delta _n(A))_k=(a_{k, dq+r}, a_{k+d, d(q+1)+r}, a_{k+2d, d(q+2)+r}, \ldots )\) is equal to the \(q\hbox {th}\) diagonal of the matrix \((\Xi _d A)_{k, r}\).
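
This index bookkeeping is easy to confirm on random examples; a minimal sketch (ours):

    import random

    M, d, n = 24, 3, 5                       # matrix size, arity, diagonal
    A = [[random.randint(0, 1) for _ in range(M)] for _ in range(M)]

    def diagonal(B, q):                      # q-th diagonal of a square matrix
        return [B[i][i + q] for i in range(len(B) - q)]

    delta_n = diagonal(A, n)
    for k in range(d):
        q, r = divmod(k + n, d)
        block = [[A[k + d * i][r + d * j] for j in range(M // d)]
                 for i in range(M // d)]     # the block (Xi_d A)_{k, r}
        lhs, rhs = delta_n[k::d], diagonal(block, q)
        L = min(len(lhs), len(rhs))
        assert lhs[:L] == rhs[:L]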

Example 5.18

Consider, as in the previous examples, the matrices \(a, b, c, d\) generating the Grigorchuk group. Denote \(A_n=\Delta _n(a), B_n=\Delta _n(b), C_n=\Delta _n(c), D_n=\Delta _n(d)\). Note that \(A_0=(1, 1, \ldots )\), \(A_1=(1, 0, 1, 0, \ldots )\), and \(A_n=0\) for all \(n\ge 2\). We have

$$\begin{aligned} \Xi _2(B_{2n})= & {} \left( C_n, A_n\right) \\ \Xi _2(B_{2n+1})= & {} \left( 0, C_{n+1}+A_{n+1}\right) . \end{aligned}$$

Similarly,

$$\begin{aligned} \Xi _2(C_{2n})= & {} \left( D_n, A_n\right) \\ \Xi _2(C_{2n+1})= & {} \left( 0, A_{n+1}+D_{n+1}\right) , \end{aligned}$$

and

$$\begin{aligned} \Xi _2(D_{2n})= & {} \left( B_n, I_n\right) , \\ \Xi _2(D_{2n+1})= & {} \left( 0, I_{n+1}+B_{n+1}\right) . \end{aligned}$$

These stencil recursions give us recursive formulas for the corresponding generating functions, which we will denote \(B_n(s)\), \(C_n(s)\), \(D_n(s)\). Recall that \(A_n(s)=0\) for \(n\ge 2\), \(I_n(s)=0\) for \(n\ge 1\), \(A_1(s)=\frac{1}{1+s^2}\), and \(A_0(s)=I_0(s)=\frac{1}{1+s}\).

$$\begin{aligned} B_{2n}(s)= & {} C_n^2+sA_n^2 \\ B_{2n+1}(s)= & {} s(C_{n+1}^2+A_{n+1}^2) \\ C_{2n}(s)= & {} D_n^2+sA_n^2 \\ C_{2n+1}(s)= & {} s(D_{n+1}^2+A_{n+1}^2) \\ D_{2n}(s)= & {} B_n^2+sI_n^2 \\ D_{2n+1}(s)= & {} s(I_{n+1}^2+B_{n+1}^2) \end{aligned}$$
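
For \(n=1\) these formulas form a self-referential system for \(B_1, C_1, D_1\) (with \(D_1=sB_1^2\)), which can be solved by fixed-point iteration on truncated series, since every pass extends the prefix of correct coefficients. The sketch below (ours) does this and cross-checks \(B_1\) against the Toeplitz rule of Example 5.14; the displayed recursions then give \(B_n, C_n, D_n\) for arbitrary n by descending to the indices 0 and 1.

    N = 128

    def sq(f):                    # squaring over F_2: f(s)^2 = f(s^2)
        g = [0] * N
        for i, c in enumerate(f):
            if c and 2 * i < N:
                g[2 * i] = 1
        return g

    def shift(f):                 # multiplication by s
        return [0] + f[: N - 1]

    def add(f, g):
        return [a ^ b for a, b in zip(f, g)]

    A1 = [1 - i % 2 for i in range(N)]       # A_1(s) = 1/(1+s^2)
    B1 = C1 = D1 = [0] * N
    for _ in range(12):           # 12 passes give a correct prefix >= N
        B1, C1, D1 = (shift(add(sq(C1), sq(A1))),
                      shift(add(sq(D1), sq(A1))),
                      shift(sq(B1)))

    def nu2(n):
        k = 0
        while n % 2 == 0:
            n //= 2
            k += 1
        return k

    assert B1 == [0 if nu2(n + 1) % 3 == 0 else 1 for n in range(N)]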

Note that iterations of the map \(2n\mapsto n, 2n+1\mapsto n+1\) on the set of non-negative integers are attracted to two fixed points \(0\mapsto 0\) and \(1\mapsto 1\). Consequently, we get the following

Proposition 5.19

For every \(n\ge 1\) the generating functions \(B_n(s), C_n(s), D_n(s)\) are of the form

$$\begin{aligned} \frac{p_0(s)}{1+s^{2^k}}+p_1(s)B_1(s)^{2^l}, \end{aligned}$$

where \(k, l\ge 0\) are integers, and \(p_0(s), p_1(s)\) are polynomials over \(\mathbb {F}_2\).

5.5 Principal columns

Let \(g\in \mathcal {K}_p\), and let \(A_g\) be the matrix of \(\pi _\infty (g)\) in the basis \(\mathsf {E}_\infty \) of monomials. Recall that we number the columns and rows of the matrix \(A_g\) starting from zero.

Proposition 5.20

Every entry \(a_{i, j}\) of the matrix \(A_g\) is a polynomial function (not depending on g) of the entries of the columns number \(p^0, p^1, \ldots , p^{\lfloor \log _p j\rfloor }\).

The same statement is true for the matrices of \(\pi _\infty (g)\) in the basis \(\mathsf {B}_\infty \).

Proof

Let

$$\begin{aligned}{}[a_0, a_1(x_1), a_2(x_1, x_2), \ldots ] \end{aligned}$$

be the tableau of g, as in Sect. 5.1.

Recall that \(e_{p^n}\) is the monomial \(x_{n+1}\). Consequently,

$$\begin{aligned} \pi _\infty (g)(e_{p^n})=x_{n+1}+u_{n+1}(x_1, x_2, \ldots , x_n). \end{aligned}$$

It follows that the entries of column number \(p^n\) of the matrix \(A_g\) above the main diagonal are the coefficients of the representation of \(u_{n+1}\) as a linear combination of the monomials \(e_i\) for \(i<p^n\), i.e., are the coefficients of the polynomial \(u_{n+1}\). (The entries below the diagonal are zeros, and the entry on the diagonal is equal to one, of course.)

If k is a natural number such that \(k<p^{n+1}\), then \(e_k=x_1^{r_1}x_2^{r_2}\cdots x_{n+1}^{r_{n+1}}\), where \(r_1, r_2, \ldots , r_{n+1}\) are the digits of the representation of k in the base p numeration system, with \(r_1\) the least significant digit, and

$$\begin{aligned} \pi _\infty (g)(e_k)=(x_1+u_1)^{r_1}(x_2+u_2(x_1))^{r_2}\cdots (x_{n+1}+u_{n+1}(x_1, x_2, \ldots , x_n))^{r_{n+1}}, \end{aligned}$$

which implies the statement of the proposition.

For the basis \(\mathsf {B}_\infty \) we have \(b_{p^n}=-\left( {\begin{array}{c}x_{n+1}+1\\ 1\end{array}}\right) =-x_{n+1}-1\), and a similar proof works. \(\square \)

Definition 5.21

Columns number \(p^n\), \(n=0, 1, 2, \ldots \), of the matrix \(A_g\) are called the principal columns of \(A_g\).

Example 5.22

Let, for \(p=2\), the first three principal columns of the matrix \(A_g\), \(g\in \mathcal {K}_2\), be \((a_{01}, 1, 0, \ldots )^\top \), \((a_{02}, a_{12}, 1, 0, \ldots )^\top \), and \((a_{04}, a_{14}, a_{24}, a_{34}, 1, 0, \ldots )^\top \). Then the columns number 3, 5, 6, and 7 (when numeration of the columns starts from zero) are

$$\begin{aligned}&\left( \begin{array}{c} a_{01}a_{02}\\ a_{01}a_{12}+a_{12}+a_{02}\\ a_{01}\\ 1 \end{array}\right) ,\quad \left( \begin{array}{c} a_{01}a_{04}\\ a_{01}a_{14}+a_{14}+a_{04}\\ a_{01}a_{24}\\ a_{24}+a_{34}a_{01}+a_{34}\\ a_{01}\\ 1 \end{array}\right) ,\\&\quad \left( \begin{array}{c} a_{02}a_{04}\\ a_{02}a_{14}+a_{12}a_{04}+a_{12}a_{14}\\ a_{04}+a_{24}a_{02}+a_{24} \\ a_{14}+a_{34}+a_{02}a_{34}+a_{12}a_{34}+a_{12}a_{24} \\ a_{02} \\ a_{12} \\ 1 \end{array}\right) , \end{aligned}$$

and

$$\begin{aligned} \left( \begin{array}{c} a_{01}a_{02}a_{04}\\ (a_{04}+a_{14}a_{01}+a_{14})(a_{02}+a_{12})+a_{01}a_{12}a_{04}\\ a_{01}a_{04}+a_{01}a_{24}a_{02}+a_{01}a_{24} \\ (a_{02}+a_{12}+1)(a_{24}+a_{34}+a_{01}a_{34})+ a_{01}(a_{14}+a_{12}a_{24})+a_{04}+a_{14} \\ a_{01}a_{02}\\ a_{02}+a_{12}+a_{01}a_{12} \\ a_{01}\\ 1 \end{array}\right) , \end{aligned}$$

respectively.
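
These columns can be recomputed mechanically from the proof of Proposition 5.20, using \(e_3=e_1e_2\), \(e_5=e_1e_4\), \(e_6=e_2e_4\), and \(e_7=e_1e_2e_4\). The sketch below (ours) implements the necessary polynomial algebra over \(\mathbb {F}_2\): a reduced polynomial is a set of monomials, a monomial is a frozenset of variables, and every variable (both the \(x_i\) and the entries \(a_{ij}\), all of which lie in \(\mathbb {F}_2\)) is idempotent.

    def mul(f, g):                     # product over F_2 with t^2 = t
        h = set()
        for m1 in f:
            for m2 in g:
                h ^= {m1 | m2}         # XOR: coefficients live in F_2
        return h

    def poly(*monomials):
        return {frozenset(m) for m in monomials}

    # Images of e_1 = x1, e_2 = x2, e_4 = x3 (the principal columns):
    g_e1 = poly(['x1'], ['a01'])
    g_e2 = poly(['x2'], ['a12', 'x1'], ['a02'])
    g_e4 = poly(['x3'], ['a34', 'x1', 'x2'], ['a24', 'x2'],
                ['a14', 'x1'], ['a04'])

    g_e3 = mul(g_e1, g_e2)             # pi(g)(e_3) = pi(g)(e_1) pi(g)(e_2)
    g_e7 = mul(g_e3, g_e4)

    def column(f):                     # coefficients over e_0, e_1, ..., e_7
        xs = ['x1', 'x2', 'x3']
        col = []
        for k in range(8):
            xpart = frozenset(x for i, x in enumerate(xs) if k >> i & 1)
            coeff = {m - xpart for m in f if m & frozenset(xs) == xpart}
            col.append(sorted(map(sorted, coeff)))
        return col

    print(column(g_e3))                # reproduces column number 3 above
    print(column(g_e7))                # reproduces column number 7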

5.6 Uniseriality

Recall that the basis \(\mathsf {E}_\infty \) of \(C(X^\omega , \mathbb {F}_p)\) consists of monomial functions \(e_0, e_1, \ldots \), where

$$\begin{aligned} e_{k_0+k_1p+k_2p^2+\cdots }(x_1, x_2, \ldots )=x_1^{k_0}x_2^{k_1}x_3^{k_2}\cdots , \end{aligned}$$

where \(k_i\in \{0, 1, \ldots , p-1\}\) are almost all equal to zero. Let us call, following [27] (though we use a slightly different definition), \(n=k_0+k_1p+k_2p^2+\cdots \) the height of the monomial \(e_n=x_1^{k_0}x_2^{k_1}x_3^{k_2}\cdots \).

The height of a reduced polynomial f is defined as the maximal height of its monomials, and is denoted \(\gamma (f)\). We define \(\gamma (0)=-1\) (note that our definition is different from the definition of Kaloujnine, which uses \(\gamma (f)+1\), so that the height of 0 is zero).

Let us describe, following [13], an algorithm for computing the height of a function \(f\in \mathbb {F}_p^{X^n}\). Let \(\mathsf {B}\) be the basis \(b_0<b_1<\cdots <b_{p-1}\) of \(\mathbb {F}_p^X\) constructed in Sect. 5.2.

Let \(\{\delta _0', \delta _1', \ldots , \delta _{p-1}'\}\) and \(\{b_0', b_1', \ldots , b_{p-1}'\}\) be the bases of the space of functionals \(\left( \mathbb {F}_p^X\right) '\) dual to the bases \(\{\delta _0, \delta _1, \ldots , \delta _{p-1}\}\) and \(\mathsf {B}\), respectively, i.e., \(\delta _i'\) and \(b_i'\) are defined by the condition

$$\begin{aligned} \langle \delta _i'|\delta _j\rangle =\langle b_i'|b_j\rangle =\delta _{i, j} \end{aligned}$$

for all \(0\le i, j\le p-1\).

Then we have

$$\begin{aligned} \langle \delta _x'|f\rangle =f(x), \end{aligned}$$

for all \(f\in \mathbb {F}_p^X\) and \(x\in X=\mathbb {F}_p\).

It follows from Proposition 5.7 and elementary linear algebra that the transition matrix from the basis \(\{\delta _0', \delta _1', \ldots , \delta _{p-1}'\}\) to \(\{b_0', b_1', \ldots , b_{p-1}'\}\) is the matrix transposed to the matrix \(T^{-1}\) of Proposition 5.7, i.e., the matrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0 &{} \cdots &{} 0 &{} 0 &{} 0 &{} 1\\ 0 &{} \cdots &{} 0 &{} 0 &{} 1 &{} 1\\ 0 &{} \cdots &{} 0 &{} 1 &{} 2 &{} 1\\ 0 &{} \cdots &{} 1 &{} 3 &{} 3 &{} 1\\ \vdots &{} \cdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ \left( {\begin{array}{c}p-1\\ p-1\end{array}}\right) &{} \cdots &{} \left( {\begin{array}{c}p-1\\ 3\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 2\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 1\end{array}}\right) &{} \left( {\begin{array}{c}p-1\\ 0\end{array}}\right) \end{array}\right) . \end{aligned}$$

In other words,

$$\begin{aligned} b_k'=\sum _{l=0}^{p-1}\left( {\begin{array}{c}l\\ p-1-k\end{array}}\right) \delta _l', \end{aligned}$$
(26)

so that

$$\begin{aligned} \langle b_k'|f\rangle =\sum _{l=0}^{p-1}\left( {\begin{array}{c}l\\ p-1-k\end{array}}\right) f(l). \end{aligned}$$

For instance, \(\langle b_{p-1}'|f\rangle =\sum _{x\in X}f(x)\).

Define linear maps \(R_k:\mathbb {F}_p^{X^{n+1}}\longrightarrow \mathbb {F}_p^{X^n}\), \(k=0, 1, \ldots , p-1\), as the linear extension of the map

$$\begin{aligned} b_{i_1}\otimes \cdots \otimes b_{i_n}\otimes b_{i_{n+1}}\mapsto \langle b_k'|b_{i_{n+1}}\rangle \cdot b_{i_1}\otimes \cdots \otimes b_{i_n}. \end{aligned}$$

In other terms, the map \(R_k\) is given by

$$\begin{aligned} R_k(f)(x_1, x_2, \ldots , x_n)=\langle b_k'|f(x_1, x_2, \ldots , x_n, x)\rangle , \end{aligned}$$

where \(f(x_1, x_2, \ldots , x_n, x)\in \mathbb {F}_p^{X^{n+1}}\) on the right-hand side of the equality is treated as a function of x for every choice of \((x_1, x_2, \ldots , x_n)\in X^n\).

Using (26), we see that \(R_k\) can be computed using the formula

$$\begin{aligned} R_k(f)(x_1, x_2, \ldots , x_n)=\sum _{x=0}^{p-1}\left( {\begin{array}{c}x\\ p-1-k\end{array}}\right) f(x_1, x_2, \ldots , x_n, x). \end{aligned}$$

Proposition 5.23

Let \(f\in \mathbb {F}_p^{X^n}\). Define \(j_n\) as the maximal value of \(j=0, 1, \ldots , p-1\) such that \(R_j(f)\ne 0\), and then define inductively \(j_k\) for \(1\le k<n\) as the maximal value of j such that \(R_j\circ R_{j_{k+1}}\circ \cdots \circ R_{j_n}(f)\ne 0\). Then

$$\begin{aligned} \gamma (f)=j_1+j_2p+j_3p^2+\cdots +j_np^{n-1}. \end{aligned}$$

Proof

For any monomial \(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_m}\) and any \(0\le j\le p-1\), we have

$$\begin{aligned} \langle b_j'|b_{i_m}\rangle \cdot b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_{m-1}} =\left\{ \begin{array}{l@{\quad }l}0 &{} \text {if } j\ne i_m, \\ b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_{m-1}} &{} \text {otherwise.} \end{array}\right. \end{aligned}$$

The proof of the proposition is now straightforward. \(\square \)

In [13] a different basis of the dual space was used (formal dot products with \(b_k\)), but the transition matrix from their basis to \(\{b_i'\}\) is triangular, so a statement similar to Proposition 5.23 holds.
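
A direct implementation of the algorithm of Proposition 5.23 (ours) stores a function \(f:X^n\longrightarrow \mathbb {F}_p\) as a flat list of its \(p^n\) values, indexed by \(x_1p^{n-1}+x_2p^{n-2}+\cdots +x_n\):

    from math import comb

    def R(k, f, p):                  # R_k reduces the last coordinate
        return [sum(comb(x, p - 1 - k) * f[p * i + x]
                    for x in range(p)) % p
                for i in range(len(f) // p)]

    def height(f, p):                # gamma(f), with gamma(0) = -1
        if not any(f):
            return -1
        digits = []                  # j_n, j_{n-1}, ..., j_1, in this order
        while len(f) > 1:
            j = max(j for j in range(p) if any(R(j, f, p)))
            digits.append(j)
            f = R(j, f, p)
        return sum(j * p ** (len(digits) - 1 - i)
                   for i, j in enumerate(digits))

    # f = x1*x2 + x2 on X^2 (p = 2) is e_3 + e_2, of height 3:
    assert height([0, 1, 0, 0], 2) == 3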

One can find the height of a function \(f\in \mathbb {F}_p^{X^n}\) “from the other end” by applying the maps \(T_{b_k'}\), defined in Sect. 4.4. Recall that these maps act by the rule

$$\begin{aligned} T_{b_k'}(f)(x_1, x_2, \ldots , x_{n-1})=\langle b_k'|f(x, x_1, x_2, \ldots , x_{n-1})\rangle , \end{aligned}$$

i.e., they are linear extensions of the maps

$$\begin{aligned} T_{b_k'}(b_{i_1}\otimes b_{i_2}\otimes \cdots \otimes b_{i_n})=\langle b_k'|b_{i_1}\rangle b_{i_2}\otimes \cdots \otimes b_{i_n}, \end{aligned}$$
(27)

and are computed by the rule

$$\begin{aligned} T_{b_k'}(f)(x_1, x_2, \ldots , x_{n-1})=\sum _{x=0}^{p-1}\left( {\begin{array}{c}x\\ p-1-k\end{array}}\right) f(x, x_1, x_2, \ldots , x_{n-1}). \end{aligned}$$

Equation (27) then implies the following algorithm for computing the height of a function.

Proposition 5.24

Let \(f\in \mathbb {F}_p^{X^n}\), and let \(h_k=T_{b_k'}(f)\) for \(0\le k\le p-1\). Let \(h_{k_1}, h_{k_2}, \ldots , h_{k_s}\) be the functions of the maximal height among the functions \(h_k\), and let \(j_1=\max (k_i)\). Then

$$\begin{aligned} \gamma (f)=j_1+p\cdot \gamma (h_{j_1}). \end{aligned}$$

Proposition 5.24 seems to be less efficient than Proposition 5.23 in general, but it is convenient in the case \(p=2\). Let \(f\in \mathbb {F}_2^{X^n}\). Denote \(f_0=T_{\delta _0'}(f)\), \(f_1=T_{\delta _1'}(f)\), i.e.,

$$\begin{aligned} f_0(x_1, x_2, \ldots , x_{n-1})= & {} f(0, x_1, x_2, \ldots , x_{n-1}),\\ f_1(x_1, x_2, \ldots , x_{n-1})= & {} f(1, x_1, x_2, \ldots , x_{n-1}). \end{aligned}$$

Proposition 5.25

For every \(f\in \mathbb {F}_2^{X^n}\) we have

$$\begin{aligned} \gamma (f)=\left\{ \begin{array}{l@{\quad }l}2\max (\gamma (f_0), \gamma (f_1))+1 &{} \text {if } \gamma (f_0)\ne \gamma (f_1), \\ 2\gamma (f_0) &{} \text {if } \gamma (f_0)=\gamma (f_1). \end{array}\right. \end{aligned}$$

Proof

We have \(h_0=f_1\), \(h_1=f_0+f_1\).

If \(\gamma (f_0)<\gamma (f_1)\), then \(\gamma (h_1)=\gamma (f_0+f_1)=\gamma (f_1)\), hence \(\gamma (f)=2\gamma (h_1)+1=2\gamma (f_1)+1\).

If \(\gamma (f_0)>\gamma (f_1)\), then \(\gamma (h_1)=\gamma (f_0+f_1)=\gamma (f_0)>\gamma (f_1)=\gamma (h_0)\), hence \(\gamma (f)=2\gamma (h_1)+1=2\gamma (f_0)+1\).

If \(\gamma (f_0)=\gamma (f_1)\), then \(\gamma (h_1)=\gamma (f_0+f_1)<\gamma (f_1)=\gamma (f_0)\), hence \(\gamma (f)=2\gamma (h_0)\). \(\square \)
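
For \(p=2\) the proposition gives a particularly short recursive procedure; a sketch (ours, with the same flat-list encoding as above, \(x_1\) being the most significant index bit):

    def gamma2(f):                   # height of f in F_2^{X^n}
        if not any(f):
            return -1                # gamma(0) = -1
        if len(f) == 1:
            return 0                 # a non-zero constant has height 0
        half = len(f) // 2
        g0, g1 = gamma2(f[:half]), gamma2(f[half:])   # f_0, f_1
        return 2 * max(g0, g1) + 1 if g0 != g1 else 2 * g0

    assert gamma2([0, 1, 0, 0]) == 3     # f = x1*x2 + x2, as above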

For more on height of functions on trees, and its generalizations, see [13].

Denote, as before, \(U_n=\langle e_0, e_1, \ldots , e_n\rangle \). Then \(U_n\) consists of the reduced polynomials of height at most n.

Since the representation of G is uni-triangular in the basis \(\mathsf {E}_\infty \), the spaces \(U_n\) are G-invariant, i.e., are sub-modules of the G-module \(C(X^\omega , \mathbb {F}_p)\). Note also that \(U_{n-1}\) has co-dimension 1 in \(U_n\).

Proposition 5.26

Let \(g\in \mathcal {K}_p\) be the adding machine. Then \(U_n=(g-1)U_{n+1}\) for every n.

Proof

We know that the matrix of g in the basis \(\mathsf {B}_\infty \) is the infinite Jordan cell. Consequently, \((g-1)(b_0)=0\), and \((g-1)(b_{n+1})=b_n\) for all \(n\ge 0\). It follows that

$$\begin{aligned} (g-1)(U_{n+1})=(g-1)(\langle b_0, b_1, \ldots , b_{n+1}\rangle )=\langle b_0, b_1, \ldots , b_n\rangle =U_n. \end{aligned}$$

\(\square \)

Theorem 5.27

If V is a sub-module of the \(\mathcal {K}_p\)-module \(C(X^\omega , \mathbb {F}_p)\), then either \(V=\{0\}\), or \(V=C(X^\omega , \mathbb {F}_p)\), or \(V=U_n\) for some n.

Proof

Let \(v\in V\) and \(n\ge 0\) be such that \(v\in U_n{\setminus } U_{n-1}\). Let \(g\in \mathcal {K}_p\) be the adding machine defined as the automorphism of the tree \(X^*\) acting by the rule

$$\begin{aligned} g(x_1, x_2, \ldots , x_n)=\left\{ \begin{array}{l@{\quad }l} (x_1+1, x_2, \ldots , x_n) &{} 0\le x_1\le p-2,\\ (0, g(x_2, \ldots , x_n)) &{} x_1=p-1.\end{array}\right. \end{aligned}$$

Then \((g-1)^k(v)\in U_{n-k}{\setminus } U_{n-k-1}\) for all \(1\le k\le n\). (We assume that \(U_{-1}=\{0\}\).) It follows that \(\langle v, (g-1)(v), (g-1)^2(v), \ldots , (g-1)^n(v)\rangle _{\mathbb {F}_p}=U_n\subset V\).

Let n be the maximal height of an element of V. If n is finite, then, by what was proved above, \(V=U_n\). If the heights of the elements of V are unbounded, then, by the same argument, V contains \(\bigcup _{n=0}^\infty U_n=C(X^\omega , \mathbb {F}_p)\). \(\square \)

We therefore adopt the following definition.

Definition 5.28

Let \(G\le \mathcal {K}_p\). We say that the action of G on \(C(X^\omega , \mathbb {F}_p)\) is uniserial if for every \(n\ge 0\) the set \(\bigcup _{g\in G}(g-1)U_{n+1}\) generates \(U_n\).

A module M is said to be uniserial if its lattice of sub-modules is a chain. It is easy to see that the same arguments as in the proof of Theorem 5.27 show that if the action of G on \(C(X^\omega , \mathbb {F}_p)\) is uniserial, then \(U_n\) are the only proper sub-modules of the G-module \(C(X^\omega , \mathbb {F}_p)\). Consequently, the G-module \(C(X^\omega , \mathbb {F}_p)\) is uniserial.

In group theory (see [15, 30]) an action of a group G on a finite p-group U is said to be uni-serial if \(|N:[N, G]|=p\) for every non-trivial G-invariant subgroup \(N\le U\). Here [N, G] is the subgroup of N generated by the elements \(h^gh^{-1}\) for \(h\in N\) and \(g\in G\), where \(h^g\) denotes the image of h under the action of \(g\in G\).

Let \(g\in \mathcal {K}_p\), and let

$$\begin{aligned}{}[f_0, f_1(x_1), f_2(x_1, x_2), \ldots ] \end{aligned}$$

be the tableau of g. We have seen in Sect. 5.5 (see the proof of Proposition 5.20) that the entries of the principal columns \((a_{0, p^n}, a_{1, p^n}, \ldots , a_{p^n-1, p^n})^\top \) of the matrix \((a_{i, j})_{i, j=0}^\infty \) of \(\pi _\infty (g)\) in the basis \(\mathsf {E}_\infty \) are precisely the coefficients of the polynomials \(f_n\):

$$\begin{aligned} f_n(x_1, x_2, \ldots , x_n)=\sum _{k=0}^{p^n-1}a_{k, p^n}e_k, \end{aligned}$$

where \(e_k\) is the monomial of height k.

It follows that the height of \(f_n\) is equal to the largest index of a non-zero non-diagonal entry of the column number \(p^n\) of the matrix of \(\pi _\infty (g)\) in the basis \(\mathsf {E}_\infty \). Note that the same is true for the matrix of \(\pi _\infty (g)\) in the basis \(\mathsf {B}_\infty \).

Proposition 5.29

Let \(G\le \mathcal {K}_p\), and let \(\alpha :\mathcal {K}_p\longrightarrow \mathbb {F}_p^\omega :g\mapsto (\alpha _0(g), \alpha _1(g), \ldots )\) be the abelianization homomorphism given by (18). The action of the group G on \(C(X^\omega , \mathbb {F}_p)\) is uniserial if and only if every homomorphism \(\alpha _k:G\longrightarrow \mathbb {F}_p\) is non-zero.

Proof

It follows from Theorem 5.13 that all homomorphisms \(\alpha _k\) are non-zero if and only if for every \(k=1, 2, \ldots \) there exists \(g_k\in G\) such that the entry number k on the first diagonal of \(\pi _\infty (g_k)\) is non-zero.

Then for every monomial \(e_k\) the height of \((1-g_k)(e_k)\) is equal to \(k-1\), which shows that \(\bigcup _{j=1}^k(1-g_j)(U_k)\) generates \(U_{k-1}\), hence the action of G is uniserial. \(\square \)

Corollary 5.30

Let S be a generating set of \(G\le \mathcal {K}_p\). Then the action of G on \(C(X^\omega , \mathbb {F}_p)\) is uniserial if and only if for every \(k=0, 1, \ldots \) there exists \(g_k\in S\) such that \(\alpha _k(g_k)\ne 0\).

Note that it also follows from Theorem 5.13, and from the fact that the entries in the principal columns are the coefficients of the polynomials in the tableau, that \(\alpha _n(g)\ne 0\) if and only if the height of the polynomial \(f_n\) of the tableau \([f_0, f_1(x_1), f_2(x_1, x_2), \ldots ]\) representing g is equal to \(p^n-1\), i.e., takes the maximal possible value.

Example 5.31

The cyclic group generated by an element \(g\in \mathcal {K}_p\) is transitive on the levels \(X^n\) if and only if \(\alpha _n(g)\ne 0\) for all n. It follows that if G contains a level-transitive element, then its action is uniserial. But there exist torsion groups with uniserial action on \(C(X^\omega , \mathbb {F}_p)\), as the following example shows.

Example 5.32

It is easy to check that for the generators \(a, b, c, d\) of the Grigorchuk group we have \(\alpha (a)=(1, 0, 0, 0, \ldots )\), and

$$\begin{aligned} \alpha (b)= & {} (0, 1, 1, 0, 1, 1, 0, 1, 1, 0, \ldots ) \\ \alpha (c)= & {} (0, 1, 0, 1, 1, 0, 1, 1, 0, 1, \ldots ) \\ \alpha (d)= & {} (0, 0, 1, 1, 0, 1, 1, 0, 1, 1, \ldots ). \end{aligned}$$

(In the last three equalities, each sequence has a pre-period of length 1 and a period of length 3.) It follows that the action of the Grigorchuk group is uniserial.

Example 5.33

The Gupta–Sidki group [24] is generated by two elements \(a, b\) acting on \(\{0, 1, 2\}^*\), where a acts as the cyclic permutation \(\sigma =(012)\) on the first level of the tree (i.e., it changes only the first letter of a word), and b is defined by the wreath recursion

$$\begin{aligned} b=(a, a^{-1}, b). \end{aligned}$$

Then \(\alpha (a)=(1, 0, 0, \ldots )\) and \(\alpha (b)=(0, 0, 0, \ldots )\), hence the group \(\langle a, b\rangle \) does not act uniserially on \(C(\{0, 1, 2\}^\omega , \mathbb {F}_3)\).