Self-similar groups, automatic sequences, and unitriangular representations

We study natural linear representations of self-similar groups over finite fields. In particular, we show that if the group is generated by a finite automaton, then the obtained matrices are automatic. This establishes a new relation between two separate notions of automaticity: groups generated by automata and automatic sequences. We also show that if the group acts on the tree by $p$-adic automorphisms, then the corresponding linear representation is a representation by infinite triangular matrices. We relate this observation to the notion of height of an automorphism of a rooted tree due to L. Kaloujnine.


Introduction
Self-similar groups are an active topic of modern group theory. They initially appeared as interesting examples of groups with unusual properties (see [Ale72,Sus79,Gri80,GS83]). The main techniques of the theory were developed for the study of these examples. Later a connection to dynamical systems was discovered (see [Nek03,Nek05]) via the notion of the iterated monodromy group. Many interesting problems were solved using self-similar groups (see [Gri98,GLSŻ00,BV05,BN06]).
One way to define self-similar groups is to say that they are groups generated by all states of an automaton (of Mealy type, also called a transducer or sequential machine). An especially important case is when the group is generated by the states of a finite automaton. All examples mentioned above (including the iterated monodromy groups of expanding dynamical systems) are of this kind.
The main goal of this article is to indicate a new relation between self-similar groups and another classical notion of automaticity: automatic sequences and matrices. See the monographs [AS03,vH03] for the theory of automatic sequences and its applications.
More precisely, we are going to study natural linear representations of self-similar groups over finite fields, and show that the matrices associated with elements of a group generated by a finite automaton are automatic.
There are several ways to define automatic sequences and matrices. One can use Moore automata, substitutions (e.g., Morse-Thue substitution leading to the famous Morse-Thue sequence), or Christol's characterization of automatic sequences in terms of algebraicity of the generating power series over a suitable finite field [AS03, Theorem 12.2.5]. The theory of automatic sequences is rich and is related to many topics in dynamical systems, ergodic theory, spectral theory of Schrödinger operators, number theory etc., see [AS03,vH03].
It is well known that linear groups (that is, subgroups of the matrix groups $GL_N(k)$, where $k$ is a field) form quite a restrictive class of groups, as the Tits alternative [Tit72] holds for them. Moreover, the group of (finite) upper triangular matrices is solvable, and the group of upper unitriangular matrices is nilpotent. In contrast, infinite triangular matrices over a finite field give many more groups. In particular, every countable residually finite $p$-group can be embedded into the group of infinite upper unitriangular matrices over the finite field $\mathbb{F}_p$.
We will pay special attention to the case when the constructed representation is a representation by infinite unitriangular matrices. One of the results of our paper shows that the natural (and, in a certain sense, optimal) representation by unitriangular matrices constructed in [LNS05,Leo12] leads to automatic matrices if the group is generated by a finite automaton. In particular, the diagonals of these unitriangular matrices are automatic sequences. We study them separately, in particular by computing their generating series (as algebraic functions).
Note that the study of actions on rooted trees (every self-similar group is, by definition, an automorphism group of the rooted tree of finite words over a finite alphabet) is equivalent to the study of residually finite groups by geometric means, i.e., via their representations in groups of automorphisms of rooted trees. The theory of actions on rooted trees is quite different from the Bass-Serre theory [Ser80] of actions on (unrooted) trees, and uses different methods and tools. An important case is when the group is a residually finite $p$-group ($p$ prime), i.e., is approximated by finite $p$-groups. The class of residually finite $p$-groups contains groups with many remarkable properties. For instance, Golod's $p$-groups, which were constructed in [Gol64] on the basis of the Golod-Shafarevich theorem to answer a question of Burnside on the existence of a finitely generated infinite torsion group, are residually finite $p$-groups. Other important examples are the first self-similar groups mentioned at the beginning of this introduction.
At the end of the paper we study the notion of uniseriality, which plays an important role in the study of actions of groups on finite $p$-groups [DdSMS99,LGM02]. Our analysis is based upon classical results of L. Kaloujnine on the height of automorphisms of rooted trees [Kal48,Kal51]. Applications of uniseriality to Lie algebras associated with self-similar groups were, for instance, demonstrated in [BG00a]. Proposition 5.29 gives a simple criterion of uniseriality of the action of a group on a rooted tree and can replace Lemma 5.2 of [BG00a]. A number of examples are presented which demonstrate the basic notions, ideas, and results.
Denote by $\mathrm{Aut}(X^*)$ the group of all automorphisms of the rooted tree $X^*$. Every element of $\mathrm{Aut}(X^*)$ preserves the levels $X^n$ of the tree, and for every $g \in \mathrm{Aut}(X^*)$ and $n \le m$ the beginning of length $n$ of the word $g(x_1 x_2 \ldots x_m)$ is equal to $g(x_1 x_2 \ldots x_n)$. It follows that for every $v \in X^*$ the transformation $\alpha_{g,v} : X \to X$ defined by $g(vx) = g(v)\alpha_{g,v}(x)$ is a permutation, and that the action of $g$ on $X^*$ is determined by these permutations according to the rule
$$g(x_1 x_2 \ldots x_n) = \alpha_{g,\emptyset}(x_1)\,\alpha_{g,x_1}(x_2)\,\alpha_{g,x_1x_2}(x_3) \cdots \alpha_{g,x_1x_2\ldots x_{n-1}}(x_n). \tag{1}$$
The map $v \mapsto \alpha_{g,v}$ from $X^*$ to the symmetric group $\mathrm{Symm}(X)$ is called the portrait of the automorphism $g$. Equivalently, we can represent $g$ by the sequence $\tau = [\tau_0, \tau_1, \tau_2, \ldots]$ of maps $\tau_n : X^n \to \mathrm{Symm}(X)$, where $\tau_n(v) = \alpha_{g,v}$. Such sequences are called, following L. Kaloujnine [Kal47], tableaux, and the tableau of $g$ is denoted $\tau(g)$. If $[\tau_0, \tau_1, \ldots]$ and $[\sigma_0, \sigma_1, \ldots]$ are the tableaux of elements $g_1, g_2 \in \mathrm{Aut}(X^*)$, respectively, then the tableau of their product $g_1 g_2$ is the sequence of functions
$$\tau(g_1 g_2) = \left[\tau_n(g_2(x_1 x_2 \ldots x_n)) \cdot \sigma_n(x_1 x_2 \ldots x_n)\right]_{n=0}^{\infty}. \tag{2}$$
The tableau of the inverse of the element $g_1$ is
$$\tau(g_1^{-1}) = \left[\tau_n(g_1^{-1}(x_1 x_2 \ldots x_n))^{-1}\right]_{n=0}^{\infty}. \tag{3}$$
Here, and in most of our paper (except when we talk about bisets, i.e., about sets with left and right actions), group elements and permutations act from the left.
Denote by $X^{[n]}$ the finite subtree of $X^*$ spanned by the set of vertices $\bigcup_{k=0}^{n} X^k$. The group $\mathrm{Aut}(X^*)$ acts on $X^{[n]}$, and the kernel of the action coincides with the kernel of the action on $X^n$. The quotient of $\mathrm{Aut}(X^*)$ by the kernel of the action is a finite group, which is naturally identified with the full automorphism group of the tree $X^{[n]}$. We will denote this finite group by $\mathrm{Aut}(X^{[n]})$.
The group Aut(X * ) is naturally isomorphic to the inverse limit of the groups Aut(X [n] ) (with respect to the restriction maps). This shows that Aut(X * ) is a profinite group. The basis of neighborhoods of identity of Aut(X * ) is the set of kernels of its action on the levels X n of the tree X * .

2.2 Self-similarity of Aut(X * )
Let $g \in \mathrm{Aut}(X^*)$ and $v \in X^*$. Then there exists an automorphism of $X^*$, denoted $g|_v$, such that $g(vw) = g(v)\,g|_v(w)$ for all $w \in X^*$.
We call $g|_v$ the section of $g$ at $v$. The sections obviously have the properties
$$g|_{v_1 v_2} = g|_{v_1}|_{v_2}, \qquad (g_1 g_2)|_v = g_1|_{g_2(v)}\, g_2|_v \tag{4}$$
for all $g, g_1, g_2 \in \mathrm{Aut}(X^*)$ and $v, v_1, v_2 \in X^*$. The portrait of the section $g|_v$ is obtained by restricting the portrait of $g$ onto the subtree $vX^*$, and then identifying $vX^*$ with $X^*$ by the map $vw \mapsto w$.
Definition 2.1. The set $g|_{X^*} = \{g|_v : v \in X^*\} \subset \mathrm{Aut}(X^*)$, for $g \in \mathrm{Aut}(X^*)$, is called the set of states of $g$. An automorphism $g \in \mathrm{Aut}(X^*)$ is said to be finite state if $g|_{X^*}$ is finite.
It follows from (4) that $(g^{-1})|_{X^*} = (g|_{X^*})^{-1}$ and $(g_1 g_2)|_{X^*} \subset g_1|_{X^*}\, g_2|_{X^*}$, which implies that the set of finite state elements of $\mathrm{Aut}(X^*)$ is a group. We call it the group of finite automata, and denote it $\mathrm{FAut}(X^*)$. This name comes from the interpretation of elements of $\mathrm{Aut}(X^*)$ as automata (transducers), see 3.1 below. Namely, the set of states of the automaton corresponding to $g$ is $g|_{X^*}$. The element $g \in g|_{X^*}$ is the initial state. If the current state of the automaton is $h$, and it reads a letter $x \in X$ on its input, then it outputs $h(x)$ and changes its current state to $h|_x$. It is easy to check that if we give the consecutive letters of a word $x_1 x_2 \ldots x_n$ on input of the automaton with the initial state $g$, then we will get on output the word $g(x_1 x_2 \ldots x_n) = g(x_1)\, g|_{x_1}(x_2)\, g|_{x_1 x_2}(x_3) \cdots$, compare with (1).
Every element $g \in \mathrm{Aut}(X^*)$ is uniquely determined by the permutation $\pi$ it defines on the first level $X \subset X^*$ and the first level sections $g|_x$, $x \in X$. In fact, the map $g \mapsto \pi \cdot (g|_x)_{x \in X}$ is an isomorphism of $\mathrm{Aut}(X^*)$ with the wreath product $\mathrm{Symm}(X) \ltimes \mathrm{Aut}(X^*)^X = \mathrm{Symm}(X) \wr \mathrm{Aut}(X^*)$. We call the isomorphism
$$\phi : \mathrm{Aut}(X^*) \to \mathrm{Symm}(X) \wr \mathrm{Aut}(X^*)$$
the wreath recursion. For a fixed ordering $x_1, x_2, \ldots, x_d$ of the letters of $X$, the elements of $\mathrm{Symm}(X) \wr \mathrm{Aut}(X^*)$ are written as $\pi(g_1, g_2, \ldots, g_d)$, where $\pi \in \mathrm{Symm}(d)$ and $g_i = g|_{x_i}$.
In other words, a group $G \le \mathrm{Aut}(X^*)$ is self-similar if the restriction of the wreath recursion onto $G$ is a homomorphism $\phi : G \to \mathrm{Symm}(X) \wr G$. Note that the wreath recursion restricted to $G$ is usually not an isomorphism (but it is an embedding, since we assume that $G$ acts faithfully on $X^*$).
The group generated by a (which is infinite cyclic) is self-similar, and is a subgroup of the group of finite automata.
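The automaton interpretation of the adding machine can be made concrete with a short sketch. It assumes the recursion $a = \sigma(1, a)$ over the binary alphabet, with words stored as Python lists whose first entry is the first letter (the least significant binary digit); the function names are illustrative only.

```python
def adding_machine(word):
    """Apply a = sigma(1, a) to a binary word given as a list of 0/1 letters."""
    out = []
    active = True  # True while the current state is a, False once it is the identity
    for x in word:
        if active:
            out.append(1 - x)    # a acts on the current letter by the transposition sigma
            active = (x == 1)    # a|_0 = 1 (identity state), a|_1 = a
        else:
            out.append(x)        # the identity state copies its input
    return out


def value(word):
    """Read a word as a binary integer, least significant digit first."""
    return sum(x << i for i, x in enumerate(word))
```

On words of length $n$ the transformation is addition of 1 modulo $2^n$, which is why the generated group is infinite cyclic.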
Example 2.4. Consider the group $G$ generated by the elements $a, b, c, d$ that are defined inductively by the recursions
$$a = \sigma, \qquad b = (a, c), \qquad c = (a, d), \qquad d = (1, b).$$
Here $\sigma$, as before, is the transposition $(01)$, and when we omit either the element of $\mathrm{Symm}(X)$ or the element of $\mathrm{Aut}(X^*)^X$ when writing elements of $\mathrm{Symm}(X) \wr \mathrm{Aut}(X^*)$, we assume that it is equal to the identity element of the respective group.
The group G is then a self-similar subgroup of the group of finite automata. It is the Grigorchuk group, defined in [Gri80].
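As an illustration, the generators can be run as a five-state automaton on finite binary words. The sketch below assumes the standard recursions $a = \sigma$, $b = (a, c)$, $c = (a, d)$, $d = (1, b)$; the state name `"e"` for the identity and the helper names are hypothetical.

```python
# First-level sections of each state: state -> (section at 0, section at 1).
SECTIONS = {
    "e": ("e", "e"),
    "a": ("e", "e"),
    "b": ("a", "c"),
    "c": ("a", "d"),
    "d": ("e", "b"),
}


def act(state, word):
    """Apply one generator (or the identity 'e') to a tuple of 0/1 letters."""
    out = []
    for x in word:
        out.append(1 - x if state == "a" else x)  # only a permutes the current letter
        state = SECTIONS[state][x]                # pass to the section at the input letter
    return tuple(out)


def act_product(generators, word):
    """Apply the product g_1 g_2 ... g_k; elements act from the left, so g_k goes first."""
    for g in reversed(generators):
        word = act(g, word)
    return word


def words(n):
    """All binary words of length n."""
    return [tuple((m >> i) & 1 for i in range(n)) for m in range(2 ** n)]
```

On words of length 6 one can check in this way that each generator is an involution, that $bcd$ acts trivially, and that $ad$ has order 4 on this level.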

Self-similarity bimodule
We can identify the letters $x \in X$ with the transformations $v \mapsto xv$ of the set $X^*$. Then the identity $g(xv) = y\,h(v)$ for $x \in X$, $y = g(x)$, and $h = g|_x$ is written as an equality of compositions of transformations:
$$g \cdot x = y \cdot h.$$
Consider the set $X \cdot G$ of compositions of the form $x \cdot g$, i.e., transformations $v \mapsto x\,g(v)$, $v \in X^*$. It is closed with respect to pre- and post-compositions with the elements of $G$:
$$(x \cdot g) \cdot h = x \cdot (gh), \qquad h \cdot (x \cdot g) = h(x) \cdot (h|_x\, g).$$
We get in this way a biset, i.e., a set with commuting left and right actions of the group $G$. Let $k$ be a field, and let $k[G]$ be the group ring over $k$. Denote by $\Phi$ the $k$-vector space spanned by $X \cdot G$. Then the left and the right actions of $G$ on $X \cdot G$ are extended by linearity to a structure of a $k[G]$-bimodule on $\Phi$. We will denote by ${}_G\Phi$ and $\Phi_G$ the space $\Phi$ seen as a left and a right $k[G]$-module, respectively.
It follows directly from the definition of the right action of $G$ on $X \cdot G$ that $X$ (identified with $X \cdot 1$) is a free basis of $\Phi_G$. The left action is not free in general, since it is possible to have $g(xv) = xv$ for all $v \in X^*$ and for a non-trivial element $g \in G$, which implies, by the definition of the left action, that $g \cdot x = x$.
After fixing a basis of the right module $\Phi_G$ (for example $X$), we can identify the algebra of endomorphisms $\mathrm{End}\,\Phi_G$ of the right $k[G]$-module $\Phi_G$ with the algebra $M_d(k[G])$ of $d \times d$ matrices over $k[G]$, where $d = |X|$. In this case the homomorphism
$$\Xi : k[G] \to M_d(k[G])$$
coming from the left action of $k[G]$ on $\Phi_G$ is called the matrix recursion associated with the self-similar group $G$ (and the basis of the right module).
If we use the basis $\{x_1, x_2, \ldots, x_d\} = X$ of the right module $\Phi_G$, then the matrix recursion $\Xi$ is a direct rewriting of the wreath recursion $\phi : G \to \mathrm{Symm}(X) \wr G$ in matrix terms. Namely, $\Xi(g)$ is the matrix with entries $a_{ij}$, $1 \le i, j \le d$, given by the rule
$$a_{ij} = \begin{cases} g|_{x_j} & \text{if } g(x_j) = x_i, \\ 0 & \text{otherwise.} \end{cases}$$
Example 2.5. The adding machine recursion $a = \sigma(1, a)$ is defined in terms of the bimodule as
$$a \cdot x_0 = x_1, \qquad a \cdot x_1 = x_0 \cdot a,$$
where $x_0, x_1$ are identified with the symbols $0, 1$, respectively, from Example 2.3. It follows that the recursion is written in matrix form as
$$a \mapsto \begin{pmatrix} 0 & a \\ 1 & 0 \end{pmatrix}.$$
The recursive definition of the generators $a, b, c, d$ of the Grigorchuk group is written as
$$a \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad b \mapsto \begin{pmatrix} a & 0 \\ 0 & c \end{pmatrix}, \quad c \mapsto \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}, \quad d \mapsto \begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix}.$$
When we change the basis of the right module $\Phi_G$, we just conjugate the map $\Xi$ by the transition matrix. Namely, if $\{x_1, \ldots, x_d\}$ and $\{y_1, \ldots, y_d\}$ are bases of the right module $\Phi_G$, then we can write
$$y_j = \sum_{i=1}^{d} x_i \cdot b_{i,j}, \qquad b_{i,j} \in k[G].$$
Then the matrix $T = (b_{i,j})_{1 \le i,j \le d}$ is the transition matrix from the basis $\{x_i\}_{1 \le i \le d}$ to the basis $\{y_i\}_{1 \le i \le d}$.
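Example 2.6. Consider again the adding machine example. Let us take, instead of the standard basis $\{x_0, x_1\} = X$, the basis $y_0 = x_0 + x_1$, $y_1 = x_1$. (Here we replace the letters $0, 1$ of the binary alphabet by $x_0$ and $x_1$, respectively, in order not to confuse them with the elements $0, 1 \in k[G]$.) Then the transition matrix to the new basis is $T = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$. Its inverse is $T^{-1} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$.
Consequently, the matrix recursion in the new basis is
$$a \mapsto T^{-1} \begin{pmatrix} 0 & a \\ 1 & 0 \end{pmatrix} T = \begin{pmatrix} a & a \\ 1 - a & -a \end{pmatrix}.$$
This can be checked directly: $a \cdot y_0 = a \cdot x_0 + a \cdot x_1 = x_1 + x_0 \cdot a = y_0 \cdot a + y_1 \cdot (1 - a)$, and $a \cdot y_1 = a \cdot x_1 = x_0 \cdot a = y_0 \cdot a - y_1 \cdot a$.
If we take the basis $\{y_0 = x_0,\ y_1 = x_1 \cdot a\}$, then the matrix recursion becomes
$$a \mapsto \begin{pmatrix} 0 & a^2 \\ a^{-1} & 0 \end{pmatrix},$$
since $a \cdot y_0 = x_1 = y_1 \cdot a^{-1}$ and $a \cdot y_1 = x_0 \cdot a^2 = y_0 \cdot a^2$. If the basis is a subset of $X \cdot G$, then the matrix recursion corresponds to a wreath recursion $G \to \mathrm{Symm}(X) \wr G$. For instance, in the last example the matrix recursion corresponds to the wreath recursion $a \mapsto \sigma(a^{-1}, a^2)$.
This wreath recursion describes the process of adding 1 to a dyadic integer in the binary numeration system with digits 0 and 3. For more on changes of bases in the biset $X \cdot G$ and the corresponding transformations of the wreath recursion see [Nek05, Nek08].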
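The digit interpretation of the recursion $a \mapsto \sigma(a^{-1}, a^2)$ can be checked numerically. In the sketch below a state $a^c$ is represented by the integer carry $c$, and a word over the digits $\{0, 3\}$ stands for $\sum_i d_i 2^i$; these conventions and the function names are assumptions made for illustration.

```python
def step(carry, digit):
    """One transition: state a^carry reads a digit in {0, 3}.

    Returns the output digit and the exponent of the next state.
    """
    total = digit + carry
    new_digit = 0 if total % 2 == 0 else 3   # the output digit has the parity of total
    return new_digit, (total - new_digit) // 2


def add_one(word):
    """Add 1 to a number written with digits 0 and 3, least significant digit first."""
    carry, out = 1, []
    for digit in word:
        new_digit, carry = step(carry, digit)
        out.append(new_digit)
    return out


def value(word, modulus):
    """The integer represented by a digit word, reduced modulo the given modulus."""
    return sum(d << i for i, d in enumerate(word)) % modulus
```

The first transition reproduces the sections in the recursion: reading the digit 0 leaves the state $a^{-1}$, and reading the digit 3 leaves the state $a^2$.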
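If $\Phi_1$ and $\Phi_2$ are bimodules over a $k$-algebra $A$, then their tensor product $\Phi_1 \otimes \Phi_2$ is the quotient of the $k$-vector space spanned by $\Phi_1 \times \Phi_2$ by the subspace generated by the elements of the form
$$(v_1 \cdot a) \otimes v_2 - v_1 \otimes (a \cdot v_2), \qquad v_1 \in \Phi_1,\ v_2 \in \Phi_2,\ a \in A,$$
together with the usual bilinearity relations. If $\Phi_2$ is a left $A$-module, and $\Phi_1$ is an $A$-bimodule, then the left module $\Phi_1 \otimes \Phi_2$ is defined in the same way.
Let $\Phi$, as above, be the bimodule associated with a self-similar group $G$. Then $X$ is a basis of the right $k[G]$-module $\Phi_G$, and the set
$$X^{\otimes n} = \{x_1 \otimes x_2 \otimes \cdots \otimes x_n : x_i \in X\}$$
is a basis of the right module $\Phi^{\otimes n}_G$, which is hence a free module. Note that $X^n \cdot G$ is a basis of $\Phi^{\otimes n}$ as a vector space over $k$.
We identify $x_1 \otimes \cdots \otimes x_n$ with the word $x_1 \cdots x_n$. The left module structure on $\Phi^{\otimes n}$ is given by rules similar to the definition of $\Phi$:
$$g \cdot v = g(v) \cdot g|_v$$
for $v \in X^n$ and $g \in G$. In particular, up to an ordering of the basis $X^n$, the associated matrix recursion $\Xi_n : k[G] \to M_{d^n}(k[G])$ is obtained from $\Xi_{n-1} : k[G] \to M_{d^{n-1}}(k[G])$ by replacing every entry $a_{ij}$ of the matrix $\Xi_{n-1}(a)$ by the matrix $\Xi(a_{ij})$.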
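Example 2.7. The matrix recursion $G \to M_4(k[G])$ for the adding machine (in the standard basis $X^2$) is
$$a \mapsto \begin{pmatrix} 0 & 0 & 0 & a \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix},$$
which is obtained by iterating the matrix recursion $a \mapsto \begin{pmatrix} 0 & a \\ 1 & 0 \end{pmatrix}$.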
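The iteration can be checked on the level of permutation matrices. The following sketch assumes the matrix recursion $a \mapsto \begin{pmatrix} 0 & a \\ 1 & 0 \end{pmatrix}$ and replaces the remaining group element by its matrix on the previous level, with the basis $X^n$ ordered lexicographically; the function names are hypothetical.

```python
def adding_machine(word):
    """Apply a = sigma(1, a) to a list of 0/1 letters, first letter first."""
    out, active = [], True
    for x in word:
        out.append(1 - x if active else x)
        active = active and x == 1
    return out


def recursive_matrix(n):
    """Matrix of a on k^(X^n), built by iterating a -> [[0, a], [1, 0]]."""
    if n == 0:
        return [[1]]
    prev = recursive_matrix(n - 1)
    m = len(prev)
    top = [[0] * m + row for row in prev]                               # blocks (0, a)
    bottom = [[int(i == j) for j in range(m)] + [0] * m for i in range(m)]  # blocks (1, 0)
    return top + bottom


def direct_matrix(n):
    """Permutation matrix of a on X^n: column index of v, row index of a(v)."""
    size = 2 ** n
    matrix = [[0] * size for _ in range(size)]
    for col in range(size):
        word = [(col >> (n - 1 - i)) & 1 for i in range(n)]  # lexicographic order
        image = adding_machine(word)
        row = sum(b << (n - 1 - i) for i, b in enumerate(image))
        matrix[row][col] = 1
    return matrix
```

For $n = 2$ the result is the matrix of this example with the entry $a$ replaced by 1, since $a$ acts trivially on the empty word.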
In this case the basis $X^2$ is ordered in the lexicographic order $x_0x_0 < x_0x_1 < x_1x_0 < x_1x_1$. But since $a$ is the adding machine, which describes adding 1 to a dyadic integer written in such a way that the less significant digits come before the more significant ones, it is more natural to order the basis in the inverse lexicographic order $x_0x_0 < x_1x_0 < x_0x_1 < x_1x_1$. In this case the matrix recursion becomes
$$a \mapsto \begin{pmatrix} 0 & 0 & 0 & a \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$
Proposition 2.8. Let $T$ be the transition matrix from the basis $X$ of $\Phi_G$ to a basis $Y$. Suppose that all entries of $T$ are elements of $k$. Then the transition matrix $T_n$ from the basis $X^{\otimes n}$ to $Y^{\otimes n}$ is equal to
$$T^{\otimes n} = \underbrace{T \otimes T \otimes \cdots \otimes T}_{n},$$
where $\otimes$ is the Kronecker product of matrices.
for all v ∈ Y ⊗(n−1) . Similarly, denote T = (a x,y ) x∈X,y∈Y . Then which shows that a xu,yv = a x,y a u,v , which agrees with the definition of the Kronecker product.
In other words, we can write where T 1 = T , and T (n) is the matrix T in which each entry a ij is replaced by a ij times the unit matrix of dimension |X| n−1 × |X| n−1 . Here the rows and columns of T n correspond to the elements of X ⊗n and Y ⊗n , respectively, ordered lexicographically. It is easy to see from the proof, that in the general case (when not all entries of T are elements of k), the formula (7) remains to be true, if we replace T (n) by the image of T under the (n − 1)st iteration of the matrix recursion (in the basis X).
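Example 2.9. Let $k = \mathbb{R}$, $X = \{x_0, x_1\}$. Consider a new basis of $\Phi_G$:
$$y_0 = \frac{x_0 + x_1}{\sqrt{2}}, \qquad y_1 = \frac{x_0 - x_1}{\sqrt{2}}.$$
The transition matrix to the new basis is
$$T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
Then the transition matrix $T_n$ from $X^{\otimes n}$ to $Y^{\otimes n}$ satisfies the recursion
$$T_n = \frac{1}{\sqrt{2}} \begin{pmatrix} T_{n-1} & T_{n-1} \\ T_{n-1} & -T_{n-1} \end{pmatrix}.$$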

Inductive limit of k X n
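Let $k^{X^n}$ be the vector space of functions $X^n \to k$. It is naturally isomorphic to the $n$th tensor power of $k^X$. The isomorphism maps an elementary tensor $f_1 \otimes f_2 \otimes \cdots \otimes f_n$ to the function
$$x_1 x_2 \ldots x_n \mapsto f_1(x_1) f_2(x_2) \cdots f_n(x_n).$$
More generally, we have natural isomorphisms $k^{X^n} \otimes k^{X^m} \cong k^{X^{n+m}}$ defined by the equality
$$(f_1 \otimes f_2)(x_1 x_2 \ldots x_{n+m}) = f_1(x_1 \ldots x_n)\, f_2(x_{n+1} \ldots x_{n+m}).$$
We denote by $\delta_v$, for $v \in X^n$, the delta-function of $v$, i.e., the characteristic function of $\{v\}$. It is an element of $k^{X^n}$. Note that $\delta_{x_1 x_2 \ldots x_n} = \delta_{x_1} \otimes \delta_{x_2} \otimes \cdots \otimes \delta_{x_n}$ with respect to the above identification of $k^{X^{n+m}}$ with $k^{X^n} \otimes k^{X^m}$.
Let $G \le \mathrm{Aut}(X^*)$. Denote by $\pi_n$ the natural permutational representation of $G$ on $k^{X^n}$ coming from the action of $G$ on $X^n$. It is given by the rule $\pi_n(g)(\delta_v) = \delta_{g(v)}$, i.e., by
$$\pi_n(g)(f)(v) = f(g^{-1}(v)).$$
Denote by $V_n$ the vector space $k^{X^n}$ seen as a left $k[G]$-module of the representation $\pi_n$, and by $[\varepsilon]$ the left $k[G]$-module of the trivial representation of $G$. More explicitly, it is a one-dimensional vector space over $k$ spanned by an element $\varepsilon$, together with the left action of $k[G]$ given by the rule
$$g \cdot \varepsilon = \varepsilon \quad \text{for all } g \in G.$$
The following proposition is a direct corollary of (6).
Denote by $\mathbf{1}$ the function $\sum_{x \in X} \delta_x \in V_1$ taking the constant value $1 \in k$. We have then, for every $f \in V_n = k^{X^n}$,
$$(f \otimes \mathbf{1})(x_1 x_2 \ldots x_{n+1}) = f(x_1 x_2 \ldots x_n).$$
The following proposition is straightforward.
Proposition 2.11. The map $\iota_n : v \mapsto v \otimes \mathbf{1} : V_n \to V_{n+1}$ is an embedding of left $k[G]$-modules. In other words, $\pi_{n+1}(g)(v \otimes \mathbf{1}) = \pi_n(g)(v) \otimes \mathbf{1}$ for all $g \in G$ and $v \in V_n$.
The space $X^\omega = \{x_1 x_2 \ldots : x_i \in X\}$ has a natural topology of a direct (Tikhonoff) power of the discrete space $X$. A basis of this topology consists of the cylindrical sets $vX^\omega$, for $v \in X^*$.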
Denote by C(X ω , k) the vector space of maps f : X ω −→ k such that f −1 (a) is open and closed (clopen) for every a ∈ k. In other words, C(X ω , k) is the space of all continuous maps f : X ω −→ k, where k is taken with discrete topology. Note that the set of values of any element of C(X ω , k) is finite, since X ω is compact.
For example, a map $f : X^\omega \to \mathbb{R}$ belongs to $C(X^\omega, \mathbb{R})$ if and only if it is continuous and has a finite set of values.
The group $G$ acts naturally on $X^\omega$ by homeomorphisms, hence it also acts naturally on the space $C(X^\omega, k)$ by the rule
$$g(\xi)(w) = \xi(g^{-1}(w))$$
for $g \in G$, $\xi \in C(X^\omega, k)$, and $w \in X^\omega$.
For every $f \in V_n = k^{X^n}$ consider the natural extension of $f : X^n \to k$ to a function on $X^\omega$:
$$f(x_1 x_2 \ldots) = f(x_1 x_2 \ldots x_n).$$
For example, the delta-function $\delta_v$ is extended to the characteristic function of the subset $vX^\omega$, which we will also denote $\delta_v$. It is easy to see that this defines an embedding of $k[G]$-modules $V_n \to C(X^\omega, k)$. Moreover, these embeddings agree with the embeddings $\iota_n : V_n \to V_{n+1}$. Denote by $V_\infty$ the direct limit of the $G$-modules $V_n$ with respect to the maps $\iota_n$. We will denote by $\pi_\infty$ the corresponding representation of $G$ on $V_\infty$.
Proof. The set $\{f^{-1}(t) : t \in k,\ f^{-1}(t) \ne \emptyset\}$ is a finite covering of $X^\omega$ by disjoint clopen sets. Every clopen subset of $X^\omega$ is a finite union of cylindrical sets of the form $vX^\omega$, for $v \in X^*$. Consequently, there exists $n$ such that $f$ is constant on every cylindrical set of the form $vX^\omega$ for $v \in X^n$. Then $f \in k^{X^n}$ in the identification of $k^{X^n}$ with a subspace of $C(X^\omega, k)$ described above. It follows that the inductive limit of $k^{X^n}$ coincides with $C(X^\omega, k)$. We have already seen that the representations $\pi_n$ agree with the representation of $G$ on $C(X^\omega, k)$ restricted to $V_n = k^{X^n}$, which finishes the proof.
Let $B$ be a basis of the $k$-vector space $k^X$ such that the constant one function $\mathbf{1}$ belongs to $B$. Then $B^{\otimes n}$ is a basis of the $k$-vector space $k^{X^n} = V_n$, and we have $\iota_n(B^{\otimes n}) \subset B^{\otimes (n+1)}$. Then the inductive limit $B_\infty$ of the bases $B^{\otimes n}$ with respect to the maps $\iota_n$ is a basis of $C(X^\omega, k) = V_\infty$. The elements of this basis are equal to functions of the form
$$f(x_1 x_2 \ldots) = f_1(x_1) f_2(x_2) \cdots,$$
where $f_i \in B$ and all but a finite number of the functions $f_i$ are equal to the constant one.
Example 2.13. Suppose that the field $k \cong \mathbb{F}_q$ is finite, and let $X = k$. Then the functions $e_k : X \to k : x \mapsto x^k$ for $k = 1, 2, \ldots, q - 1$, together with the constant one function $\mathbf{1}$, formally denoted $x^0$, form a basis of $V_1$.

Figure 1: Walsh basis
The corresponding basis of $C(X^\omega, k)$ is equal to the set of all finite monomial functions $f(x_1, x_2, \ldots) = x_1^{k_1} x_2^{k_2} \cdots$, where all but a finite number of the powers $k_i$ are equal to zero.
Writing the elements of C(X ω , k) in this basis amounts to representing them as polynomials.
For k = C, the Walsh basis is an orthonormal set of complex-valued functions on X ω with respect to the uniform Bernoulli measure on X ω . This is a direct corollary of the fact that {y 0 , y 1 } is orthonormal. Since W ∞ is a basis of the linear space of continuous functions X ω −→ C with finite sets of values, and this space is dense in the Hilbert space L 2 (X ω ), the Walsh basis is an orthonormal basis of L 2 (X ω ).
We can use Proposition 2.8 to find the transition matrices from $\{\delta_v\}_{v \in X^n}$ to the basis $W^{\otimes n}$ (just use the proposition for the case of the trivial group $G$). In the case of the Walsh basis we get the matrices from Example 2.9, but without the factor $1/\sqrt{2}$:
$$T_1 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad T_n = \begin{pmatrix} T_{n-1} & T_{n-1} \\ T_{n-1} & -T_{n-1} \end{pmatrix}.$$
These matrices are examples of Hadamard matrices (i.e., matrices whose entries are $+1$ and $-1$ and whose rows are orthogonal) and were constructed for the first time by J. J. Sylvester [Syl67]. They are also called Walsh matrices.
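The Kronecker power construction is easy to test numerically. The sketch below builds the $\pm 1$ matrices as Kronecker powers of $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ and checks the Hadamard property; the pure-Python `kron` helper and the other names are illustrative assumptions, not notation from the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def walsh_matrix(n):
    """The n-th Kronecker power of [[1, 1], [1, -1]], of size 2^n."""
    h = [[1]]
    for _ in range(n):
        h = kron([[1, 1], [1, -1]], h)
    return h


def gram(m):
    """The matrix of pairwise scalar products of the rows of m."""
    return [[sum(x * y for x, y in zip(r, s)) for s in m] for r in m]
```

For a matrix of size $2^n$ the Gram matrix of the rows is $2^n$ times the identity, which is the orthogonality of the rows stated above.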
See Figure 1, where graphs of the first eight elements of the Walsh basis are shown. Here we identify {0, 1} ω with the unit interval [0, 1] via real binary numeration system.
Example 2.15. A related basis of $C(X^\omega, k)$ is the Haar basis, which is constructed in the following way. Again, we assume that the characteristic of $k$ is different from 2, and $X = \{x_0, x_1\}$. Let $y_0 = \mathbf{1}$ and $y_1 = \delta_{x_0} - \delta_{x_1}$, as in the previous example. Let us construct an increasing sequence of bases $Y_n$ of $k^{X^n} < C(X^\omega, k)$ in the following way. Let $Y_0 = \{y_0\}$. Define then inductively:
$$Y_{n+1} = Y_n \cup \{\delta_v \otimes y_1 : v \in X^n\}.$$
In the case $k = \mathbb{C}$, and with the identification of $C(X^\omega, \mathbb{C})$ with a linear subspace of $L^2(X^\omega, \mu)$, where $\mu$ is the uniform Bernoulli measure on $X^\omega$, it makes sense to normalize the elements of $Y_n$ in order to make them of norm one. Since the norm of $\delta_v$ for $v \in X^n$ is equal to $2^{-n/2}$, the recurrent definition of the basis in this case is
$$Y_{n+1} = Y_n \cup \{2^{n/2}\, \delta_v \otimes y_1 : v \in X^n\}.$$
It is easy to see that the union $Y_\infty$ of the bases $Y_n$ is an orthonormal basis of $L^2(X^\omega, \mu)$. It is called the Haar basis. See its use in the context of groups acting on rooted trees in [BG00b].
• $X$ and $Y$ are the input and output alphabets of the automaton;

Mealy and Moore automata
We always assume that X and Y are finite and have more than one element each.
We frequently assume that X = Y , and say that the automaton is defined over the alphabet X. The automaton is finite if the set Q is finite. In some cases, we do not assume that an initial state is chosen.
Let $A = (Q, X, Y, \pi, \tau, q_0)$ be a Mealy automaton. Let us extend the definition of the maps $\pi$ and $\tau$ to maps $\pi : Q \times X^* \to Q$ and $\tau : Q \times X^* \to Y$ by the inductive rules
$$\pi(q, \emptyset) = q, \qquad \pi(q, vx) = \pi(\pi(q, v), x), \qquad \tau(q, vx) = \tau(\pi(q, v), x).$$
We interpret the automaton $A$ as a machine which, being in a state $q \in Q$ and reading a letter $x \in X$, goes to the state $\pi(q, x)$ and gives the letter $\tau(q, x)$ on the output. If the machine starts at the state $q \in Q$ and reads a word $v$, then its final state will be $\pi(q, v)$, and the final letter on output will be $\tau(q, v)$.
The automaton with the initial state $q_0$ defines a transformation $A_{q_0}$ of words by the rule $A_{q_0}(x_1 x_2 \ldots x_n) = y_1 y_2 \ldots y_n$, where $y_i = \tau(q_0, x_1 x_2 \ldots x_i)$. In other words, $A_{q_0}(v)$ is the word that the machine gives on output when it reads the word $v$ on input, if $q_0$ is its initial state.
Example 3.3. Let $G \le \mathrm{Aut}(X^*)$ be a self-similar group. Consider the corresponding full automaton with the set of states $Q = G$, and output and transition functions defined by the rules
$$\tau(g, x) = g(x), \qquad \pi(g, x) = g|_x.$$
It follows from (4) that if we choose $g \in G$ as the initial state, then the transformations of $X^*$ and $X^\omega$ defined by this automaton coincide with the original transformations defined by $g \in \mathrm{Aut}(X^*)$. This automaton is infinite, but if $G \le \mathrm{FAut}(X^*)$, then for every $g \in G$ the set $\{g|_v : v \in X^*\}$ is finite, and we can take it as the set of states of a finite automaton defining the transformation $g$.
A special type of Mealy automata are the Moore automata. The definition of a Moore automaton is the same as Definition 3.1, except that the output function is a map τ : Q −→ X, i.e., the output depends only on the state, and does not depend on the input letter.
Moore automata also act on words, essentially in the same way as Mealy automata. We can extend the definition of the transition function $\pi$ to $Q \times X^*$ by the same formula as for Mealy automata. Then the action of a Moore automaton with initial state $q_0$ on words is given by the rule
$$A_{q_0}(x_1 x_2 \ldots) = \tau(q_1)\,\tau(q_2) \ldots,$$
where $q_{i+1} = \pi(q_i, x_{i+1})$.
Even though the definition of a Moore automaton seems to be more restrictive than the definition of a Mealy automaton, the two notions are basically equivalent, as any Mealy automaton can be modeled by a Moore automaton. Hence, the set of maps defined by finite Mealy automata coincides with the set of maps defined by finite Moore automata.
Let $A = (Q, X, Y, \pi, \tau, q_0)$ be a Mealy automaton. Consider the Moore automaton $A'$ over the input and output alphabets $X$ and $Y$, respectively, with the set of states $(Q \times X) \cup \{p_0\}$, where $p_0$ is an element not belonging to $Q \times X$, and with the transition and output maps $\pi'$ and $\tau'$ given by the rules
$$\pi'(p_0, x) = (q_0, x), \qquad \pi'((q, x), y) = (\pi(q, x), y), \qquad \tau'((q, x)) = \tau(q, x).$$
(We define $\tau'(p_0)$ to be any letter, since it will never appear in the output.) It is easy to check that the new Moore automaton with the initial state $p_0$ defines the same maps on $X^*$ and $X^\omega$ as the original Mealy automaton $A$. Therefore, we will not use Moore automata to define transformations of the sets of words. They will be used to define automatic sequences and matrices in Section 4. Traditionally, Mealy automata are used in the theory of groups generated by automata (see [GNS00]), while Moore automata are used for generation of sequences (even though the term "Moore automata" is not used in [AS03]).
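The passage from a Mealy automaton to a Moore automaton can be sketched in a few lines. The code assumes the construction with states $(Q \times X) \cup \{p_0\}$ in which the state $(q, x)$ records that the letter $x$ was just read in the state $q$; automata are stored as dictionaries, and the adding machine data at the end are only an illustrative example.

```python
def run_mealy(pi, tau, q0, word):
    """Output word of a Mealy automaton; pi and tau are dicts keyed by (state, letter)."""
    q, out = q0, []
    for x in word:
        out.append(tau[(q, x)])
        q = pi[(q, x)]
    return out


def mealy_to_moore(pi, tau, q0, alphabet):
    """Build a Moore automaton with states Q x X and a new initial state p0."""
    p0 = "p0"                        # hypothetical name; differs from every pair (q, x)
    pi2, tau2 = {}, {p0: alphabet[0]}  # the output at p0 is arbitrary and never used
    for x in alphabet:
        pi2[(p0, x)] = (q0, x)
    for (q, x) in pi:
        tau2[(q, x)] = tau[(q, x)]
        for y in alphabet:
            pi2[((q, x), y)] = (pi[(q, x)], y)
    return pi2, tau2, p0


def run_moore(pi2, tau2, p0, word):
    """Output word of a Moore automaton: the output is read after each transition."""
    p, out = p0, []
    for x in word:
        p = pi2[(p, x)]
        out.append(tau2[p])
    return out


# Adding machine as a Mealy automaton with states "a" and "e" (identity).
ADD_PI = {("a", 0): "e", ("a", 1): "a", ("e", 0): "e", ("e", 1): "e"}
ADD_TAU = {("a", 0): 1, ("a", 1): 0, ("e", 0): 0, ("e", 1): 1}
```

Running both machines on all binary words of a fixed length gives identical outputs, in agreement with the equivalence claimed above.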

Diagrams of automata
The automata are usually represented as finite labeled graphs (called Moore diagrams). The set of vertices coincides with the set of states Q. For every q ∈ Q and x ∈ X there is an arrow from q to π(q, x) labeled by (x, τ (q, x)) in the case of Mealy automata, and just by x in the case of Moore automata. The initial state is marked, and the states are marked by the values of τ (q), if it is a Moore automaton.
Sometimes the arrows of diagrams of Mealy automata are just labeled by the input letters $x$, and the vertices are labeled by the corresponding transformations $x \mapsto \tau(q, x)$.
Consider a directed graph with one marked (initial) vertex, in which the edges are labeled by pairs $(x, y) \in X^2$. The necessary and sufficient condition for such a graph to represent a Mealy automaton is that for every vertex $q$ and every letter $x \in X$ there exists a unique arrow starting at $q$ and labeled by $(x, y)$ for some $y \in X$. Then the image of a word $x_1 x_2 \ldots$ under the action of the automaton is calculated by finding the unique directed path $e_1, e_2, \ldots$ of arrows starting at the initial vertex, whose arrows are labeled by $(x_1, y_1), (x_2, y_2), \ldots$, respectively. Then $y_1 y_2 \ldots$ is the image of $x_1 x_2 \ldots$.
The diagram of the adding machine transformation (see Example 2.3) is shown in Figure 2. We mark the initial state by a double circle.

Non-deterministic automata
Let us generalize the notion of a Mealy automaton by allowing more general Moore diagrams.
Definition 3.5. A (non-deterministic) synchronous automaton $A$ over an alphabet $X$ is an oriented graph whose arrows are labeled by pairs of letters $(x, y) \in X^2$. Such an automaton is called $\omega$-deterministic if for every infinite word $x_1 x_2 \ldots \in X^\omega$ and for every vertex (i.e., state) $q$ of $A$ there exists at most one directed path starting in $q$ which is labeled by $(x_1, y_1), (x_2, y_2), \ldots$ for some $y_i \in X$.

Figure 4: Appending and erasing letters
Note that in the above definition, for a state $q$ of $A$ and a letter $x \in X$, there may be several or no edges starting at $q$ and labeled by $(x, y)$ for $y \in X$. This means that the automaton $A$ may be non-deterministic on finite words and partial, i.e., that a state $q$ may transform a finite word $v \in X^*$ into several different words, and may not accept some of the words on input.
If an automaton $A$ is $\omega$-deterministic, then each of its states $q$ defines a map between closed subsets of $X^\omega$, mapping $x_1 x_2 \ldots$ to $y_1 y_2 \ldots$ if there exists a directed path starting in $q$ and labeled by $(x_1, y_1), (x_2, y_2), \ldots$. Note that the first automaton (defining the transformations $T_0$ and $T_1$) is deterministic. For example, the state $T_0$ acts on the finite words by the transformations $x_1 x_2 \ldots x_n \mapsto 0 x_1 x_2 \ldots x_{n-1}$. The second automaton is partial and non-deterministic on finite words. For example, there are two arrows starting at $T'_0$ labeled by $(0, 1)$ and $(0, 0)$, but no arrows labeled by $(1, y)$.
An asynchronous automaton is defined in the same way, but the labels are pairs of arbitrary words (u, v) ∈ (X * ) 2 .
A criterion for a homeomorphism to be synchronously automatic is given in Proposition 4.23.
Asynchronously automatic homeomorphisms of $X^\omega$ are studied in [GNS00,GN00]. It is shown there that the set of asynchronously automatic homeomorphisms of $X^\omega$ is a group, and that it does not depend on $X$ (if $|X| > 1$). More precisely, it is proved that for any two finite alphabets $X, Y$ (such that $|X|, |Y| > 1$) there exists a homeomorphism $X^\omega \to Y^\omega$ conjugating the corresponding groups of asynchronously automatic homeomorphisms. Very little is known about this group, which is called in [GNS00] the group of rational homeomorphisms of the Cantor set.

Automatic matrices

4.1 Automatic sequences
Here we review the basic definitions and facts about automatic sequences. More can be found in the monographs [AS03,vH03].
Let $A$ be a finite alphabet, and let $A^\omega$ be the space of right-infinite sequences of elements of $A$ with the direct product topology.
Fix an integer $d \ge 2$, and consider the transformation $\Xi_d : A^\omega \to (A^\omega)^d$ given by
$$\Xi_d(a_0 a_1 a_2 \ldots) = (a_0 a_d a_{2d} \ldots,\ a_1 a_{d+1} a_{2d+1} \ldots,\ \ldots,\ a_{d-1} a_{2d-1} a_{3d-1} \ldots).$$
It is easy to see that $\Xi_d$ is a homeomorphism. We call the coordinates of $\Xi_d(w)$ the $d$-decimations of the sequence $w$. Repeated $d$-decimations of $w$ are all sequences that can be obtained from $w$ by iterative application of the decimation procedure, i.e., all sequences of the form
$$a_k\, a_{k + d^n}\, a_{k + 2d^n} \ldots, \qquad n \ge 0,\ 0 \le k < d^n.$$
We say that a subset $Q \subset A^\omega$ is $d$-decimation-closed if for every $w \in Q$ all $d$-decimations of $w$ belong to $Q$. The following is obvious.
Classically, a sequence $w = a_0 a_1 \ldots$ is called $d$-automatic if there exists a Moore automaton $A$ with input alphabet $\{0, 1, \ldots, d-1\}$ and output alphabet $A$ such that if $n = i_0 + i_1 d + \cdots + i_m d^m$ is a base $d$ expansion of $n$, then the output of $A$ after reading the word $i_0 i_1 \ldots i_m$ is $a_n$. An equivalent variant of the definition requires that $a_n$ is the output of the automaton after reading $i_m i_{m-1} \ldots i_1 i_0$. One may also allow or forbid $i_m$ to be equal to zero, and the numeration of the letters of the sequence $w$ may start from 1. All these different definitions of automaticity of sequences are equivalent to each other, see [AS03, Section 5.2]. They are also equivalent to Definition 4.1, see [AS03, Theorem 6.6.2].
where $t_n$ is the sum modulo 2 of the digits of $n$ in the binary numeration system. The beginning of length $2^n$ of this sequence can be obtained from 0 by applying the substitution $0 \mapsto 01$, $1 \mapsto 10$ $n$ times:
$$0 \mapsto 01 \mapsto 0110 \mapsto 01101001 \mapsto \cdots.$$
It is easy to see that this sequence is generated by the automaton shown in the accompanying figure.
The last example can be naturally generalized to include all automatic sequences. Namely, a $k$-uniform morphism $\varphi : X^* \to Y^*$ is a morphism of monoids such that $|\varphi(x)| = k$ for every $x \in X$. By a theorem of Cobham (see [AS03, Theorem 6.3.2]) a sequence is $k$-automatic if and only if it is an image, under a coding (i.e., a 1-uniform morphism), of a fixed point of a $k$-uniform endomorphism $\varphi : X^* \to X^*$.
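The three descriptions of this sequence (binary digit sum, substitution, Moore automaton) can be compared on prefixes. The sketch below is only an illustration; the two-state automaton used here, whose state is the parity of the digits read so far and whose output equals its state, is our assumption about the natural automaton for this sequence.

```python
def digit_sum_parity(n):
    """Sum modulo 2 of the binary digits of n."""
    return bin(n).count("1") % 2


def substitution_prefix(n):
    """Apply the substitution 0 -> 01, 1 -> 10 to the word 0, n times."""
    rule = {0: [0, 1], 1: [1, 0]}
    word = [0]
    for _ in range(n):
        word = [y for x in word for y in rule[x]]
    return word


def moore_output(n):
    """Two-state Moore automaton reading the binary digits of n, least significant first."""
    state = 0
    while n > 0:
        state ^= n & 1   # reading the digit 1 switches the state, the digit 0 keeps it
        n >>= 1
    return state         # the output letter equals the state
```

Since the state only records a parity, reading the digits in the opposite order or with extra leading zeros gives the same output, which illustrates the equivalence of the variants of the definition.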
Example 4.4. Consider the alphabet $X = \{a, b, c, d\}$ and the morphism $\varphi : X^* \to X^*$ given by $\varphi(a) = aca$, $\varphi(b) = d$, $\varphi(c) = b$, $\varphi(d) = c$. This substitution appears in the presentation [Lys85] of the Grigorchuk group.
The fixed point of $\varphi$ is obtained as the limit of $\varphi^n(a)$, and starts with $acabacadacabaca\ldots$. The morphism $\varphi$ is not uniform, but it is easy to see that the fixed point belongs to $\{ab, ac, ad\}^\infty$, and on the words $B = ab$, $C = ac$, $D = ad$ it acts on $\{B, C, D\}^\infty$ as a 2-uniform endomorphism: $B \mapsto CD$, $C \mapsto CB$, $D \mapsto CC$. It follows from Cobham's theorem that the fixed point of $\varphi$ is 2-automatic.
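The passage from the non-uniform morphism to a 2-uniform one on blocks can be checked mechanically. The sketch below assumes the morphism $a \mapsto aca$, $b \mapsto d$, $c \mapsto b$, $d \mapsto c$ (the substitution from Lysenok's presentation) and the induced block rule $B \mapsto CD$, $C \mapsto CB$, $D \mapsto CC$; the names `PHI`, `PSI`, `DECODE` and `iterate` are hypothetical.

```python
PHI = {"a": "aca", "b": "d", "c": "b", "d": "c"}     # non-uniform morphism
PSI = {"B": "CD", "C": "CB", "D": "CC"}              # 2-uniform morphism on blocks
DECODE = {"B": "ab", "C": "ac", "D": "ad"}           # block -> word over {a,b,c,d}


def apply(morphism: dict, word: str) -> str:
    """Apply a morphism of free monoids letter by letter."""
    return "".join(morphism[letter] for letter in word)


def iterate(morphism: dict, word: str, times: int) -> str:
    for _ in range(times):
        word = apply(morphism, word)
    return word


# PSI is the action of PHI on the blocks ab, ac, ad.
compatible = all(
    apply(DECODE, apply(PSI, block)) == apply(PHI, apply(DECODE, block))
    for block in "BCD"
)

fixed_point_prefix = iterate(PHI, "a", 5)             # a prefix of the fixed point
block_prefix = apply(DECODE, iterate(PSI, "C", 4))    # same prefix, built from blocks
```

Since the block-level word decodes to a prefix of the fixed point, the fixed point is a coding of a fixed point of a 2-uniform morphism, which is the hypothesis of Cobham's theorem.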
Let us show how to construct an automaton producing a sequence satisfying the conditions of Definition 4.1.
Suppose that $w_0 \in A^\omega$ is automatic, and let $Q$ be a finite $d$-decimation-closed subset of $A^\omega$ that contains $w_0$ (for example, we can take $Q$ to be equal to the set of all repeated $d$-decimations of $w_0$).
Consider a Moore automaton with the set of states $Q$, initial state $w_0$, input alphabet $\{0, 1, \ldots, d-1\}$, output alphabet $A$, the transition function sending a state $w \in Q$ and an input letter $i$ to the $i$-th $d$-decimation of $w$, and the output function sending a state $w$ to its first letter. We call the constructed Moore automaton the automaton of $w_0$.
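Proposition 4.5. Let $w_0 = a_0 a_1 \ldots \in A^\omega$ be an automatic sequence, and let $\mathcal{A}$ be its automaton. Let $n$ be a non-negative integer, and let $i_0, i_1, \ldots, i_m$ be a sequence of elements of the set $\{0, 1, \ldots, d-1\}$ such that $n = i_0 + i_1 d + \cdots + i_m d^m$. Then the output of $\mathcal{A}$ after reading the word $i_0 i_1 \ldots i_m$ is $a_n$.

Proof. It follows from the definition of the automaton $\mathcal{A}$ that the state reached after reading $i_0 i_1 \ldots i_m$ is the sequence obtained from $w_0$ by taking successively the $i_0$-th, $i_1$-th, ..., $i_m$-th $d$-decimations. It also follows from the definition of the stencil map that this sequence is equal to $a_n a_{n+d^{m+1}} a_{n+2d^{m+1}} \ldots$. Its first letter, which is the output of the automaton, is $a_n$.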
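The construction of the automaton of a sequence can be imitated on a computer. The sketch below is only a heuristic illustration: states are subsequences $n \mapsto w(a + bn)$, and two states are identified when their first `probe` letters agree, which is an assumption made here for finiteness and is not part of the paper. The names `kernel_automaton` and `output_after_reading` are hypothetical.

```python
def kernel_automaton(seq, d, probe=64):
    """Build the Moore automaton whose states are repeated d-decimations of seq.

    seq is a function n -> letter. A state is stored as the tuple of its first
    `probe` letters (ASSUMPTION: equal prefixes mean equal sequences). The loop
    terminates only if finitely many such prefixes occur, e.g. for automatic seq.
    """
    def signature(a, b):
        return tuple(seq(a + b * n) for n in range(probe))

    start = signature(0, 1)
    representative = {start: (0, 1)}
    transitions = {}
    queue = [start]
    while queue:
        state = queue.pop()
        a, b = representative[state]
        for i in range(d):
            # i-th decimation of n -> seq(a + b*n) is n -> seq(a + b*i + b*d*n)
            child = signature(a + b * i, b * d)
            if child not in representative:
                representative[child] = (a + b * i, b * d)
                queue.append(child)
            transitions[state, i] = child
    return start, transitions


def output_after_reading(start, transitions, n, d):
    """Feed the base-d digits of n, lowest digit first; output the first letter."""
    state = start
    while n > 0:
        state = transitions[state, n % d]
        n //= d
    return state[0]


thue_morse = lambda n: bin(n).count("1") % 2
start, transitions = kernel_automaton(thue_morse, 2)
number_of_states = len({state for state, _ in transitions})
```

For the Thue-Morse sequence the automaton obtained this way has two states, and reading the binary digits of $n$ from the lowest one returns $t_n$, in agreement with Proposition 4.5.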

Automatic infinite matrices
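The notion of automaticity of sequences can be generalized to matrices in a straightforward way (see [AS03, Chapter 14], where they are called two-dimensional sequences). Let $A$ be a finite alphabet, and let $A^{\omega\times\omega}$ be the space of all two-dimensional matrices of elements of $A$ that are infinite to the right and downwards, i.e., arrays of the form
$$a = \begin{pmatrix} a_{00} & a_{01} & a_{02} & \cdots \\ a_{10} & a_{11} & a_{12} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
Fix an integer $d \ge 2$, and consider the map sending $a$ to the $d \times d$ array of matrices
$$\Xi_d(a) = \left(a^{(i,j)}\right)_{i,j=0}^{d-1}, \qquad a^{(i,j)} = \left(a_{i+dr,\, j+ds}\right)_{r,s \ge 0}.$$
The entries of $\Xi_d(a)$ are called $d$-decimations of $a$. We call $\Xi_d$ the stencil map, since the entries of the matrix $\Xi_d(a)$ are obtained from the matrix $a$ by selecting entries using a "stencil" consisting of a square grid of holes, see Figure 6. The definition of automaticity for matrices is then the same as for sequences.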
Definition 4.6. A matrix a ∈ A ω×ω is d-automatic ([d, d]-automatic in terminology of [AS03]) if the set of matrices that can be obtained from a by repeated d-decimations is finite.
One can also use stencils with a rectangular grid of holes, i.e., selecting the entries of a decimation with one step horizontally, and with a different step vertically. This will lead us to the notion of a [d 1 , d 2 ]-automatic matrix, as in [AS03], but we do not use this notion in our paper.
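On a finite upper-left block of a matrix, the stencil map amounts to taking every $d$-th row and every $d$-th column. The following Python sketch illustrates this; the test matrix (Pascal's triangle modulo 2) is an example chosen here for illustration and is not taken from the paper. By Lucas' theorem its 2-decimations are the matrix itself or zero, so it has finitely many repeated decimations.

```python
from math import comb


def stencil(matrix, d):
    """Return the d x d array of d-decimations of a finite square block.

    Entry (i, j) consists of the entries a[i + d*r][j + d*s] of the input.
    """
    return [[[row[j::d] for row in matrix[i::d]] for j in range(d)]
            for i in range(d)]


N = 16
# Illustrative matrix: binomial coefficients C(r, c) modulo 2.
pascal = [[comb(r, c) % 2 for c in range(N)] for r in range(N)]

blocks = stencil(pascal, 2)
half = [row[: N // 2] for row in pascal[: N // 2]]
zero = [[0] * (N // 2) for _ in range(N // 2)]
```

Here `blocks[0][0]`, `blocks[1][0]` and `blocks[1][1]` reproduce the upper-left half-size block of the same matrix, and `blocks[0][1]` is zero.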
An interpretation of automaticity of matrices via automata theory is also very similar to the interpretation for sequences. The only difference is that the input alphabet of the automaton is the direct product $\{0, 1, \ldots, d-1\} \times \{0, 1, \ldots, d-1\}$. If we want to find an entry $a_{n_1,n_2}$ of an automatic matrix defined by a Moore automaton $\mathcal{A}$, then we represent the indices $n_1$ and $n_2$ in base $d$:
$$n_1 = i_0 + i_1 d + \cdots + i_m d^m, \qquad n_2 = j_0 + j_1 d + \cdots + j_m d^m$$
for $0 \le i_s, j_t \le d-1$, and then feed the sequence $(i_0, j_0)(i_1, j_1) \ldots (i_m, j_m)$ to the automaton $\mathcal{A}$. Its final output will be $a_{n_1,n_2}$.

We say that a matrix $a = (a_{ij})_{i \ge 0, j \ge 0}$ over a field $k$ is column-finite if the number of non-zero entries in each column of $a$ is finite. The set $M_\infty(k)$ of all column-finite matrices is an algebra isomorphic to the algebra of endomorphisms of the infinite-dimensional vector space $k^\infty = \bigoplus_{\mathbb{N}} k$. We denote by $M_n(k)$ the algebra of $n \times n$ matrices over $k$, and by $A_d(k)$ the set of all column-finite $d$-automatic matrices over $k$.

Proposition 4.8. Let $k$ be a finite field. Then $A_d(k)$ is an algebra. The stencil map $\Xi_d : A_d(k) \to M_d(A_d(k))$ is an isomorphism of $k$-algebras.
Proof. Let A and B be d-automatic column-finite matrices. Let A and B be finite decimation-closed sets containing A and B, respectively. Then the set generate a free group of rank 2. Here I and O are the identity and zero matrices of size 3 × 3, respectively. In Section 5 we will show that any residually finite p-group can be represented in triangular form. Groups generated by finite automata will be represented by automatic uni-triangular matrices. The next subsection is the first step in this direction.

Representation of automata groups by automatic matrices
Let $B$ be a basis of $k^X$ containing the constant one function $\mathbf{1}$. Order the elements of $B$ into a sequence $y_0 < y_1 < \ldots < y_{d-1}$, where $y_0 = \mathbf{1}$. Recall that the inductive limit $B_\infty$ of the bases $B^{\otimes n}$ of $k^{X^n}$ with respect to the embeddings $f \mapsto f \otimes \mathbf{1}$ is a basis of $C(X^\omega, k)$, whose elements are infinite tensor products $y_{i_0} \otimes y_{i_1} \otimes \cdots$, where all but a finite number of factors $y_{i_k}$ are equal to $y_0 = \mathbf{1}$. In other words, $B_\infty$ consists of functions of the form
$$f(x_1, x_2, \ldots) = y_{i_0}(x_1)\, y_{i_1}(x_2)\, y_{i_2}(x_3) \cdots,$$
where all but a finite number of factors on the right-hand side are equal to the constant one function.
We can order such products using the inverse lexicographic order, namely $y_{i_0} \otimes y_{i_1} \otimes \cdots < y_{j_0} \otimes y_{j_1} \otimes \cdots$ if and only if $y_{i_k} < y_{j_k}$, where $k$ is the largest index such that $y_{i_k} \ne y_{j_k}$.
It is easy to see that the ordinal type of B ∞ is ω. Let e 0 < e 1 < e 2 < . . . be all elements of B ∞ taken in the defined order. It is checked directly that if e n = y i0 ⊗ y i1 ⊗ · · · , then n = i 0 + i 1 · d + i 2 · d 2 + · · · , i.e., . . . i 2 i 1 i 0 is the base-d expansion of n (only a finite number of coefficients i j are different from zero).
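The correspondence between the inverse lexicographic order and base-$d$ expansions can be confirmed on tuples of bounded length. The sketch below encodes a basis element $y_{i_0} \otimes y_{i_1} \otimes \cdots$ by the tuple of indices $(i_0, i_1, \ldots)$; the helper name `inverse_lex_key` is hypothetical.

```python
from itertools import product


def inverse_lex_key(indices):
    """Sort key: compare index tuples at the largest position where they differ."""
    return tuple(reversed(indices))


d, length = 3, 4
ordered = sorted(product(range(d), repeat=length), key=inverse_lex_key)

# Position of (i_0, i_1, ...) in the order versus the number i_0 + i_1*d + ...
positions_match = all(
    n == sum(i * d**k for k, i in enumerate(indices))
    for n, indices in enumerate(ordered)
)
```

Padding each tuple with zeros (that is, with factors equal to the constant one function) does not change its position, which is why the order on the whole basis has ordinal type $\omega$.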
Definition 4.10. An ordered basis B of k X is called marked if its minimal element is 1.
The self-similar basis B ∞ of C(X ω , k) associated with B is the inverse lexicographically ordered set of functions of the form e 1 ⊗ e 2 ⊗ . . ., where e i ∈ B and all but a finite set of elements e i are equal to 1, as it is described above.
Let $B = \{b_0, b_1, \ldots, b_{d-1}\}$ be an arbitrary (not necessarily marked) basis of the space $k^X$. We define the associated matrix recursion for linear operators on $C(X^\omega, k)$ in the usual way: given an operator $a$, define its image $\Xi_B(a) = (A_{ij})_{i,j=0}^{d-1}$ by the condition
$$a(b_j \otimes f) = \sum_{i=0}^{d-1} b_i \otimes A_{ij}(f)$$
for all $f \in C(X^\omega, k)$ and $0 \le j \le d-1$. If $B$ is the basis $\{\delta_x\}_{x \in X}$, then the matrix recursion $\Xi_B$ restricted to a self-similar group $G \le \mathrm{Aut}(X^*)$ coincides with the matrix recursion (5) coming directly from the wreath recursion.
Lemma 4.11. Let B be a marked basis of k X . Then the matrix recursion Ξ B coincides with the stencil map for the matrices of linear operators in the associated basis B ∞ .
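Proof. Let $A_{ij}$, $0 \le i, j \le d-1$, be the entries of $\Xi_B(a)$. Let $n = i_0 + i_1 d + i_2 d^2 + \cdots$ be a non-negative integer written in base $d$. Then $e_n = b_{i_0} \otimes b_{i_1} \otimes b_{i_2} \otimes \cdots$, and in particular $e_{j+dn} = b_j \otimes e_n$ for $0 \le j \le d-1$. Let $a_{m,n}$, $0 \le m, n < \infty$, be the entries of the matrix of $a$ in the basis $B_\infty$. Then
$$a(b_j \otimes e_n) = a(e_{j+dn}) = \sum_{m \ge 0} a_{m,\,j+dn}\, e_m = \sum_{i=0}^{d-1} b_i \otimes \sum_{r \ge 0} a_{i+dr,\,j+dn}\, e_r,$$
which together with (12) implies that $c_{r,n} = a_{i+dr,\,j+dn}$ are the entries of $A_{i,j}$ in the basis $B_\infty$, i.e., that $\Xi_B(a)$ is the image of the matrix of $a$ under the stencil map.

Definition 4.12. Let $k$ be a finite field. We say that a linear operator $a$ on $C(X^\omega, k)$ is automatic if there exists a finite set $\mathcal{A}$ of operators such that $a \in \mathcal{A}$, and for every $a' \in \mathcal{A}$ all entries of the matrix $\Xi_B(a')$ belong to $\mathcal{A}$.
Proposition 4.13. Let $B_1$, $B_2$ be two bases of $k^X$. A linear operator on $C(X^\omega, k)$ is automatic with respect to $B_1$ if and only if it is automatic with respect to $B_2$.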
Proof. Let a 1 be an operator which is automatic with respect to B 1 . Let A be the corresponding finite set of operators, closed with respect to taking entries of the matrix recursion. Let A ′ be the set of all linear combinations of elements of A, which is finite, since we assume that k is finite. If T is the transition matrix from B 1 to B 2 , then Ξ B2 (a) = T −1 Ξ B1 (a)T for every linear operator a. It follows that A ′ is closed with respect to taking entries of Ξ B2 . The set A ′ is finite, a 1 ∈ A ′ , hence a 1 is also automatic with respect to B 2 .
As a direct corollary of Proposition 4.13 we get the following relation between finite-state automorphisms of the rooted tree X * and automatic matrices.
Theorem 4.14. Suppose that k is finite. Let B be a marked basis of k X . Then the matrix of π ∞ (g) in the associated basis B ∞ , where g ∈ Aut(X * ), is d-automatic if and only if g is finite-state.
We get, therefore, a subgroup of the group of units of A d (k) isomorphic to the group FAut(X * ) of finite-state automorphisms of the tree X * .
Matrix recursions (i.e., homomorphisms from an algebra A to the algebra of matrices over A) associated with groups acting on rooted trees, and with more general structures, were studied in many papers, for instance in [Sid97, BG00b, Bar06, Sid09, Bac08, Nek04, Nek09]. Note that the algebra generated by the natural representation on C(X ω , k) of a group G acting on the rooted tree X * is different from the group ring. This algebra (and its analogs) was studied in [Sid97, Bar06, Nek04].

Creation and annihilation operators
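For $h \in k^X$, denote by $T_h$ the operator on $C(X^\omega, k)$ acting by the rule
$$T_h(f)(x_1, x_2, \ldots) = h(x_1)\, f(x_2, x_3, \ldots),$$
i.e., $T_h(f) = h \otimes f$. It is easy to see that $T_h$ is linear, and that we have $T_{a_1 h_1 + a_2 h_2} = a_1 T_{h_1} + a_2 T_{h_2}$ for all $h_1, h_2 \in k^X$ and $a_1, a_2 \in k$.
Consider the dual vector space $(k^X)'$ to the space $k^X$ of functions. We will denote the value of a functional $v$ on a function $f \in k^X$ by $\langle v | f \rangle$. Then for every $v \in (k^X)'$ we have an operator $T_v$ on $C(X^\omega, k)$ defined by
$$T_v(f)(x_1, x_2, \ldots) = \langle v \,|\, f(x, x_1, x_2, \ldots) \rangle,$$
where $f(x, x_1, x_2, \ldots)$ on the right-hand side of the equation is seen for every choice of the variables $x_1, x_2, \ldots$ as a function of one variable $x$, i.e., an element of $k^X$.

Let $\{e_i\}$ be a basis of $k^X$, and let $\{e_i'\}$ be the basis of the dual space defined by $\langle e_i' | e_j \rangle = \delta_{i,j}$. (Here and in the sequel, $\delta_{i,j}$ is the Kronecker symbol, equal to 1 when $i = j$, and to 0 otherwise.) We will denote $T_{e_i'} = T'_{e_i}$. Then $T_{e_i}$ is an isomorphism between the space $C(X^\omega, k)$ and its subspace $e_i \otimes C(X^\omega, k)$. It is easy to see that $T'_{e_i}$ restricted onto $e_i \otimes C(X^\omega, k)$ is the inverse of this isomorphism, and that $T'_{e_i}$ restricted onto $e_j \otimes C(X^\omega, k)$ is equal to zero for $j \ne i$.
The operators $T_{e_i}$ and $T'_{e_j}$ satisfy the relations
$$T'_{e_i} T_{e_j} = \delta_{i,j}, \qquad \sum_{i=0}^{d-1} T_{e_i} T'_{e_i} = 1. \qquad (13)$$
The products $T_{e_i} T'_{e_i}$ are projections onto the summands $e_i \otimes C(X^\omega, k)$ of the direct sum decomposition
$$C(X^\omega, k) = \bigoplus_{i=0}^{d-1} e_i \otimes C(X^\omega, k).$$
Let $B$ be a marked basis of $k^X$, and let $\mathbf{1} = e_0 < e_1 < \ldots < e_{d-1}$ be its elements. Let $B_\infty$ be the associated ordered basis of $C(X^\omega, k)$.
If $A$ is a (not necessarily square) finite matrix, then we denote by $A^{\oplus\infty}$ the infinite matrix
$$A^{\oplus\infty} = \begin{pmatrix} A & O & O & \cdots \\ O & A & O & \cdots \\ O & O & A & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$
where $O$ is the zero matrix of the same size as $A$.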
The following is a direct corollary of the definitions.
respectively. The matrix of T ′ ei is the transpose E ⊤ i of the matrix E i . The matrices E i and E ′ i have a natural relation to decimation of matrices. The proof of the next proposition is a straightforward computation of matrix products.
Proposition 4.16. Let $A = (a_{i,j})_{i,j=0}^\infty$ be an infinite matrix, and let $\Xi_d(A) = (A_{ij})_{i,j=0}^{d-1}$ be the matrix of its $d$-decimations. Then $A_{ij} = E_i^\top A E_j$ for all $0 \le i, j \le d-1$.

Corollary 4.17. Let $A$ be an operator on $C(X^\omega, k)$, and let $B = \{e_i\}$ be a basis of $k^X$. Then the entries of the associated matrix recursion for $A$ are equal to $T'_{e_i} A T_{e_j}$.
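The relation between the matrices of the creation operators and decimation can be tested on finite truncations. The sketch below assumes that the matrix $E_i$ of $T_{e_i}$ in the basis $B_\infty$ sends $e_m$ to $e_{i+dm}$ (which follows from $T_{e_i}(f) = e_i \otimes f$ and the numbering of the basis by base-$d$ expansions), and that the $(i,j)$-th decimation is $E_i^\top A E_j$. All function names are illustrative.

```python
def creation_matrix(i, d, n):
    """Truncation (d*n rows, n columns) of E_i: column m has a single 1 in row i + d*m."""
    return [[1 if r == i + d * c else 0 for c in range(n)] for r in range(d * n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def decimation(matrix, i, j, d):
    """(i, j)-th d-decimation of a (d*n) x (d*n) block, computed as E_i^T A E_j."""
    n = len(matrix) // d
    left = transpose(creation_matrix(i, d, n))
    return matmul(matmul(left, matrix), creation_matrix(j, d, n))


d, n = 3, 4
A = [[r * d * n + c for c in range(d * n)] for r in range(d * n)]
identity = [[int(r == c) for c in range(n)] for r in range(n)]
```

On such truncations the products $E_i^\top E_j$ are the identity for $i = j$ and zero otherwise, and `decimation(A, i, j, d)` agrees with selecting rows $i, i+d, \ldots$ and columns $j, j+d, \ldots$ of `A`.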
The next proposition is a direct corollary of Proposition 4.15.
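Proposition 4.18. If $h = a_0 e_0 + a_1 e_1 + \cdots + a_{d-1} e_{d-1}$, then the matrix of $T_h$ is equal to $a_0 E_0 + a_1 E_1 + \cdots + a_{d-1} E_{d-1}$.

Corollary 4.19. Let $B$ be a marked basis of $k^X$. Order the letters of the alphabet $X$ in a sequence $x_0, x_1, \ldots, x_{d-1}$, and let $T_{x_i}$ and $T'_{x_i}$ be the corresponding operators, defined using the basis $\{\delta_{x_i}\}$.
Let S = (a ij ) d−1 i,j=0 be the transition matrix from the basis δ x0 < δ x1 < · · · < δ

Let us consider now the case of the basis $D = \{\delta_x\}_{x \in X}$. For simplicity, let us denote $T_x = T_{\delta_x}$ and $T'_x = T'_{\delta_x}$. Then the operators $T_x$ and $T'_x$ act on $C(X^\omega, k)$ by the rule
$$T_x(f)(x_1, x_2, \ldots) = \begin{cases} f(x_2, x_3, \ldots) & \text{if } x_1 = x, \\ 0 & \text{otherwise,} \end{cases} \qquad (14)$$
and
$$T'_x(f)(x_1, x_2, \ldots) = f(x, x_1, x_2, \ldots). \qquad (15)$$
In other words, the operator $T_x$ is induced by the natural homeomorphism $X^\omega \to xX^\omega : w \mapsto xw$, and $T'_x$ is induced by its inverse map $xX^\omega \to X^\omega : xw \mapsto w$.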
Proposition 4.20. Let B ∞ be a self-similar basis of C(X ω , k) associated with a marked basis B (see Definition 4.10). Then the matrices of the operators T x and T ′ x , for x ∈ X, in the ordered basis B ∞ are |X|-automatic.
Note that we do not require in this proposition the field k to be finite.
and $\alpha_{x,i} \in k$. It follows from Proposition 4.16 that the entries of the matrix $\Xi_B(T_x)$ with respect to the basis $B$ are equal to $\sum_{k=0}^{d-1} \alpha_{x,k} T'_{e_i} T_{e_k} T_{e_j}$. Every product of the form $T'_{e_i} T_{e_k} T_{e_j}$ is equal, by relations (13), either to zero, or to $T_{e_j}$. It follows that decimations of $T_x$ are either zeros or of the form $\alpha_{x,i} T_{e_j}$. It follows that the set of repeated decimations of the matrix of $T_x$ is contained in the finite set $\{0, T_x\} \cup \{\alpha_{x,i} T_{e_j} : 0 \le i, j \le d-1\}$.

Example 4.22. In the case $k = \mathbb{C}$, it is natural to consider the operators $S_x = \sqrt{|X|}\, T_x$. Then $S_x$ are isometries of the Hilbert space $L^2(X^\omega)$, and their conjugates $S_x^*$ are equal to $|X|^{-1/2} T'_x$. The $C^*$-algebra of operators on $L^2(X^\omega)$ generated by the operators $S_x$ is called the Cuntz algebra [Cun77], and is usually denoted $O_{|X|}$. Any isometries satisfying the relations
$$S_x^* S_x = 1, \qquad \sum_{x \in X} S_x S_x^* = 1$$
generate a $C^*$-algebra isomorphic to $O_{|X|}$. In particular, the $C^*$-algebra generated by the matrices $E_i$ is the Cuntz algebra. The representation of the Cuntz algebra by the matrices $E_i$ is an example of a permutational representation of $O_d$. For more on such and similar representations, see [BJ99].

Recall that, for $X = \{0, 1\}$, the Walsh basis of $L^2(X^\omega)$ is the basis $W_\infty$ constructed starting from the basis $W = \{y_0, y_1\}$, where $y_0 = \delta_0 + \delta_1$ and $y_1 = \delta_0 - \delta_1$. Then direct computation with the transition matrices shows that the matrices of $S_0$ and $S_1$ are

Cuntz algebras and Higman-Thompson groups
If $\psi : X^\omega \to X^\omega$ is a homeomorphism, then it induces a linear operator $L_\psi$ on $C(X^\omega, k)$ given by
$$L_\psi(f)(w) = f(\psi^{-1}(w))$$
for $f \in C(X^\omega, k)$ and $w \in X^\omega$. Fixing any ordered basis of $C(X^\omega, k)$, we thus get a natural faithful representation of the homeomorphism group of $X^\omega$ in the group of units of the algebra $M_\infty(k)$ of column-finite matrices over $k$.
Proposition 4.23. Let ψ be a homeomorphism of X ω . Let u, v ∈ X n , and denote by ψ u,v the partially defined map given by the formula The following conditions are equivalent.
3. For every finite field k, the operator L ψ : C(X ω , k) −→ C(X ω , k) is automatic.
4. For some finite field k, the operator L ψ : C(X ω , k) −→ C(X ω , k) is automatic.
Synchronously automatic homeomorphisms are defined in Definition 3.7.
Proof. Equivalence of conditions (2), (3), and (5) follows directly from Proposition 4.16. Suppose that ψ is synchronously automatic. Let A be an initial automaton defining ψ. For every pair u = a 1 a 2 . . . a n , v = b 1 b 2 . . . b n ∈ X * of words of equal length, let Q u,v be the set of states q of A such that there exists a directed path starting in the initial state q 0 of A, ending in q, and labeled by (a 1 , b 1 ), (a 2 , b 2 ), . . . , (a n , b n ). Then the set Q u,v defines the map ψ u,v in the following sense. We have ψ u,v (x 1 x 2 . . .) = y 1 y 2 . . . if and only if there exists a path starting in an element of Q u,v and labeled by (x 1 , y 1 ), (x 2 , y 2 ), . . .. It follows that the number of possible maps of the form ψ u,v is not larger than the number of subsets of the set of states of A. This shows that every synchronously automatic homeomorphism satisfies condition (2).
Suppose now that a homeomorphism ψ satisfies condition (2), and let us show that it is synchronously automatic. Construct an automaton A with the set of states Q equal to the set of non-empty maps of the form ψ u,v . For every (x, y) ∈ X 2 and ψ u,v ∈ Q we have an arrow from ψ u,v to ψ ux,vy , labeled by (x, y), provided the map ψ ux,vy is not empty. The initial state of the automaton is the map ψ = ψ ∅,∅ . Let us show that this automaton defines the homeomorphism ψ. It is clear that if ψ(x 1 x 2 . . .) = y 1 y 2 . . ., then there exists a path starting at the initial state of A and labeled by (x 1 , y 1 ), (x 2 , y 2 ), . . .. On the other hand, if such a path exists for a pair of infinite words x 1 x 2 . . . , y 1 y 2 . . ., then the maps ψ x1x2...xn,y1y2...yn are non-empty for every n. In other words, for every n the set W n of infinite sequences w ∈ x 1 x 2 . . . x n X ω such that ψ(w) ∈ y 1 y 2 . . . y n X ω is non-empty. It is clear that the sets W n are closed and W n+1 ⊂ W n for every n. By compactness of X ω , this implies that the intersection ⋂ n≥1 W n is non-empty. Since this intersection is contained in {x 1 x 2 . . .}, it follows that ψ(x 1 x 2 . . .) = y 1 y 2 . . ..

The next corollary follows directly from condition (2) of Proposition 4.23.
Corollary 4.24. The set of all automatic homeomorphisms of X ω is a group.
We have already seen in Theorem 4.14 that a homeomorphism $g$ of $X^\omega$ defined by an automorphism of $X^*$ is automatic if and only if it is finite-state. Note that in this case $g_{u,v}$ is either empty (if $g(v) \ne u$) or is equal to $g|_v$.
Another example of a group of finitely automatic homeomorphisms of $X^\omega$ is the Higman-Thompson group $V_{|X|}$. It is defined in the following way. We say that a subset $A \subset X^*$ is a cross-section if the sets $uX^\omega$ for $u \in A$ are disjoint and their union is $X^\omega$. Let $A = \{v_1, v_2, \ldots, v_n\}$ and $B = \{u_1, u_2, \ldots, u_n\}$ be cross-sections of equal cardinality together with a bijection $v_i \mapsto u_i$. Define a homeomorphism $\psi : X^\omega \to X^\omega$ by the rule
$$\psi(v_i w) = u_i w \qquad (16)$$
for all $1 \le i \le n$ and $w \in X^\omega$. The set of all homeomorphisms that can be defined in this way is the Higman-Thompson group $V_{|X|}$, see [Tho80, CFP96].

Let $\psi$ be the homeomorphism defined by (16). It follows directly from (14) and (15) that the operator $L_\psi$ induced by $\psi$ is equal to
$$L_\psi = \sum_{i=1}^{n} T_{u_i} T'_{v_i},$$
where we use notation $T_{x_1 x_2 \ldots x_m} = T_{x_1} T_{x_2} \cdots T_{x_m}$ and $T'_{x_1 x_2 \ldots x_m} = T'_{x_m} \cdots T'_{x_2} T'_{x_1}$. The next proposition follows then from Proposition 4.20.
Proposition 4.25. The Higman-Thompson group V |X| is a subgroup of the group of synchronously automatic homeomorphisms of X ω .
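Elements of the Higman-Thompson group act by replacing a prefix from one cross-section by the corresponding prefix from another. The following Python sketch illustrates this on finite words that are long enough to have a prefix in the first cross-section. The helper names and the particular pair of cross-sections are hypothetical choices made for the example; the cross-section test uses the standard fact that a finite prefix-free set covers all infinite words exactly when its Kraft sum equals 1.

```python
from fractions import Fraction


def is_cross_section(words, alphabet_size):
    """Finite prefix-free set whose cylinders cover X^omega (Kraft sum equal to 1)."""
    prefix_free = not any(u != v and v.startswith(u) for u in words for v in words)
    kraft_sum = sum(Fraction(1, alphabet_size ** len(u)) for u in words)
    return prefix_free and kraft_sum == 1


def prefix_replacement(pairs):
    """Return the map v_i w -> u_i w for a list of pairs (v_i, u_i)."""
    def psi(word):
        for v, u in pairs:
            if word.startswith(v):
                return u + word[len(v):]
        raise ValueError("word is too short to determine its image")
    return psi


# Illustrative element over X = {0, 1}: 0 -> 00, 10 -> 01, 11 -> 1.
pairs = [("0", "00"), ("10", "01"), ("11", "1")]
psi = prefix_replacement(pairs)
psi_inverse = prefix_replacement([(u, v) for v, u in pairs])

image = psi("0110")
```

Swapping the roles of the two cross-sections gives the inverse map, which reflects the fact that the set of such homeomorphisms is closed under inversion.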
The group generated by $V_2$ and the Grigorchuk group was studied by K. Roever in [Röv99]. He proved that it is a finitely presented simple group isomorphic to the abstract commensurator of the Grigorchuk group. Generalizations of this group (for arbitrary self-similar groups) were studied in [Nek04].

Representations by uni-triangular matrices

Sylow p-subgroup of Aut(X * )
Let $|X| = p$ be prime. We assume that $X = \{0, 1, \ldots, p-1\}$ is equal to the field $\mathbb{F}_p$ of $p$ elements. From now on, we will write vertices of the tree $X^*$ as tuples $(x_1, x_2, \ldots, x_n)$ in order not to confuse them with products of elements of $\mathbb{F}_p$.
Denote by $K_p$ the subgroup of $\mathrm{Aut}(X^*)$ consisting of automorphisms $g$ whose labels $\alpha_{g,v}$ of the vertices of the portrait consist only of powers of the cyclic permutation $\sigma = (0, 1, \ldots, p-1)$. It follows from (2) and (3) that $K_p$ is a group. The study of the group $K_p$ (and its finite analogs) was initiated by L. Kaloujnine [Kal48, Kal47, Kal51].
Suppose that an element $g \in K_p$ is represented by a tableau as in Subsection 2.1. Then $a_n(x_1, x_2, \ldots, x_n)$ are maps from $X^n$ to the group generated by the cyclic permutation $\sigma$. The elements of this group act on $X = \mathbb{F}_p$ by maps $\sigma^a : x \mapsto x + a$. It follows that we can identify functions $a_n$ with maps $X^n \to \mathbb{F}_p$, so that an element $g \in K_p$ represented by a tableau $[a_0, a_1(x_1), a_2(x_1, x_2), \ldots]$ acts on sequences $v = (x_1, x_2, \ldots) \in X^\omega$ by the rule
$$g(x_1, x_2, x_3, \ldots) = (x_1 + a_0,\; x_2 + a_1(x_1),\; x_3 + a_2(x_1, x_2),\; \ldots).$$
It follows that if g 1 , g 2 ∈ K p are represented by the tableaux [a n ] ∞ n=0 and [b n ] ∞ n=0 , then their product g 1 g 2 is represented by the tableau

Denote by $K_{p,n}$ the quotient of $K_p$ by the pointwise stabilizer of the $n$th level of the tree $X^*$. We can consider $K_{p,n}$ as a subgroup of the automorphism group of the finite subtree $X^{[n]} = \bigcup_{k=0}^{n} X^k \subset X^*$.

Proposition 5.1. The group $K_{p,n}$ is a Sylow subgroup of the symmetric group $\mathrm{Symm}(X^n)$ and of the automorphism group of the tree $X^{[n]}$.
Proof. The order of $\mathrm{Symm}(X^n)$ is $(p^n)!$, and the exponent of the maximal power of $p$ dividing it is
$$\left\lfloor \frac{p^n}{p} \right\rfloor + \left\lfloor \frac{p^n}{p^2} \right\rfloor + \cdots + \left\lfloor \frac{p^n}{p^n} \right\rfloor = \frac{p^n - 1}{p - 1}.$$
It follows that the order of the Sylow $p$-subgroup of $\mathrm{Symm}(X^n)$ is $p^{\frac{p^n - 1}{p - 1}}$. The order of $K_{p,n}$ is equal to the number of possible tableaux $[a_0, a_1(x_1), a_2(x_1, x_2), \ldots, a_{n-1}(x_1, \ldots, x_{n-1})]$, where $a_i$ is an arbitrary map from $X^i$ to the cyclic group $\langle \sigma \rangle$ of order $p$. The number of possible maps $a_i$ is hence $p^{p^i}$. Consequently, the number of possible tableaux is $p^{1 + p + p^2 + \cdots + p^{n-1}} = p^{\frac{p^n - 1}{p - 1}}$. Since the group of all automorphisms of the tree $X^{[n]}$ is contained in $\mathrm{Symm}(X^n)$ and contains $K_{p,n}$, the subgroup $K_{p,n}$ is its Sylow $p$-subgroup too.
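The counting argument in the proof can be checked for small parameters. The sketch below compares the exponent of $p$ in $(p^n)!$, computed directly and by the sum of integer quotients, with the exponent $1 + p + \cdots + p^{n-1}$ obtained by counting tableaux. The helper name `p_adic_valuation` is hypothetical.

```python
from math import factorial


def p_adic_valuation(m, p):
    """Largest k such that p**k divides the positive integer m."""
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k


checks = []
for p, n in [(2, 3), (2, 4), (3, 2), (3, 3), (5, 2)]:
    quotient_sum = sum(p**n // p**k for k in range(1, n + 1))
    tableaux_exponent = sum(p**i for i in range(n))   # one factor p^(p^i) per level
    direct = p_adic_valuation(factorial(p**n), p)
    checks.append(quotient_sum == tableaux_exponent == direct == (p**n - 1) // (p - 1))
```

All three ways of computing the exponent agree in these cases, as the proof asserts in general.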
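Proposition 5.2. Let $g \in K_p$ be represented by a tableau $[a_0, a_1(x_1), a_2(x_1, x_2), \ldots]$. Consider the map $\alpha : K_p \to \mathbb{F}_p^\omega$, where $\mathbb{F}_p^\omega$ is the infinite Cartesian product of additive groups of $\mathbb{F}_p$, given by
$$\alpha(g) = \Big(a_0,\; \sum_{x_1 \in \mathbb{F}_p} a_1(x_1),\; \sum_{(x_1, x_2) \in \mathbb{F}_p^2} a_2(x_1, x_2),\; \ldots\Big).$$
In other words, we just sum up modulo $p$ all the decorations of the portrait of $g$ on each level. Then $\alpha$ is the abelianization epimorphism $K_p \to K_p/[K_p, K_p]$.

Proof. It is easy to check that $\alpha$ is a homomorphism. It remains to show that its kernel is the derived subgroup of $K_p$. This is a folklore fact, and we show here how it follows from a more general result of Kaloujnine. Let $g \in K_p$ be represented by a tableau $[a_0, a_1(x_1), a_2(x_1, x_2), \ldots]$. Each function $a_n(x_1, x_2, \ldots, x_n)$ can be written as a polynomial
$$\sum_{0 \le k_i \le p-1} c_{k_1, k_2, \ldots, k_n}\, x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$$
for some coefficients $c_{k_1, k_2, \ldots, k_n} \in \mathbb{F}_p$. It is proved in [Kal48, Theorem 6] (see also Equation (5,4) in [Kal51]) that the derived subgroup $[K_p, K_p]$ of $K_p$ is the set of elements defined by tableaux in which $a_0$ and the coefficient $c_{p-1, p-1, \ldots, p-1}$ at the eldest term $x_1^{p-1} x_2^{p-1} \cdots x_n^{p-1}$ are equal to zero for every $n$. Since $\sum_{x \in \mathbb{F}_p} x^k$ is equal to $-1$ for $k = p-1$ and to $0$ for $0 \le k < p-1$, the sum of the values of $a_n$ over $\mathbb{F}_p^n$ is equal to $(-1)^n c_{p-1, \ldots, p-1}$, so this set coincides with the kernel of $\alpha$.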
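The link between the level sums and the coefficient at the eldest term rests on the values of the power sums $\sum_{x \in \mathbb{F}_p} x^k$. A minimal Python sketch of this check follows; a reduced polynomial is stored as a dictionary from exponent tuples to coefficients, and the function name `sum_over_level` is hypothetical.

```python
from itertools import product


def sum_over_level(coeffs, p, n):
    """Sum of the values of a reduced polynomial over all points of F_p^n, modulo p.

    coeffs maps exponent tuples (k_1, ..., k_n), 0 <= k_i <= p-1, to elements of F_p.
    The convention x^0 = 1 (also for x = 0) is used, as for polynomial functions.
    """
    total = 0
    for point in product(range(p), repeat=n):
        for exponents, c in coeffs.items():
            term = c
            for x, k in zip(point, exponents):
                term *= pow(x, k, p)
            total += term
    return total % p


p, n = 3, 2
coeffs = {(2, 2): 2, (1, 2): 1, (0, 0): 1, (2, 0): 2}
level_sum = sum_over_level(coeffs, p, n)
top_coefficient = coeffs[(p - 1,) * n]
```

In this example only the term with all exponents equal to $p - 1$ contributes, and the level sum equals $(-1)^n$ times its coefficient.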

Polynomial bases of C(X ω , F p )
Proposition 5.3. Suppose that an ordered basis B of k X is such that the matrices of π 1 (g) for g ∈ G ≤ Aut(X * ) are all upper uni-triangular, and the minimal element of B is the constant one function. Then the matrices of π n (g) in the basis B ⊗n and of π ∞ (g) in the associated basis B ∞ are upper uni-triangular for all g ∈ G.
See Subsection 2.4 for definition of the representations π n . We say that a matrix is upper uni-triangular if all its elements below the main diagonal are equal to zero, and all elements on the diagonal are equal to one. From now on, unless the contrary is specifically mentioned, "uni-triangular" will mean "upper uni-triangular".
Proof. Let $b_0 < b_1 < \ldots < b_{d-1}$ be the ordered basis $B$. Let $Y = \{y_0, y_1, \ldots, y_{d-1}\}$ be the corresponding basis of the right module $\Phi_G$. Namely, we take for every $\sum_{x \in X} a_x \delta_x \in B$ the corresponding element $\sum_{x \in X} a_x x \in \Phi = k[X \cdot G]$, see Subsection 2.3. Then $B = Y \otimes \varepsilon$, where $[\varepsilon]$ is the left $G$-module of the trivial representation of $G$, see 2.4.
Throughout the rest of our paper we assume that $|X| = p$ is prime, $k$ is the field $\mathbb{F}_p$ of $p$ elements, $G$ is a subgroup of $K_p$, and we identify $X$ with $\mathbb{F}_p$. We will then be able to use Proposition 5.3 to construct bases of $C(X^\omega, \mathbb{F}_p)$ in which the representation $\pi_\infty$ of $K_p$ (and hence of $G$) is uni-triangular.
Every function $f \in \mathbb{F}_p^X$ can be represented as a polynomial $f(x) \in \mathbb{F}_p[x]$, using the formula
$$f(x) = -\sum_{t \in \mathbb{F}_p} f(t)\, \frac{x(x-1)(x-2) \cdots (x-p+1)}{x - t},$$
where $(x - t)$ in the numerator and the denominator cancel each other. (Recall that $(p-1)! = -1 \pmod p$, by Wilson's theorem.) Since $x^p = x$ as a function on $\mathbb{F}_p$ (by Fermat's little theorem), representations that differ by an element of the ideal generated by $x^p - x$ represent the same function. Note that the ring $\mathbb{F}_p[x]/(x^p - x)$ has cardinality $p^p$, hence we get a natural bijection between $\mathbb{F}_p[x]/(x^p - x)$ and $\mathbb{F}_p^X$, mapping a polynomial to the function it defines on $\mathbb{F}_p$. From now on, we will thus identify the space of functions $\mathbb{F}_p^X$ with the $\mathbb{F}_p$-algebra $\mathbb{F}_p[x]/(x^p - x)$. Following Kaloujnine, we will call the elements of $\mathbb{F}_p[x]/(x^p - x)$ reduced polynomials. We write them as usual polynomials $a_0 + a_1 x + \cdots + a_{p-1} x^{p-1}$ (but keeping in mind reduction, when performing multiplication).
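The passage from a function on $\mathbb{F}_p$ to its reduced polynomial can be carried out by Lagrange interpolation. The sketch below uses the fact that the denominator $\prod_{s \ne t}(t - s)$ equals $(p-1)! = -1$ in $\mathbb{F}_p$; the function names are hypothetical, and polynomials are stored as coefficient lists starting from the constant term.

```python
def times_linear(poly, s, p):
    """Multiply a polynomial (coefficients, lowest degree first) by (x - s) over F_p."""
    shifted = [0] + poly                 # x * poly
    scaled = [s * c for c in poly] + [0] # s * poly
    return [(a - b) % p for a, b in zip(shifted, scaled)]


def reduced_polynomial(values, p):
    """Given values[t] = f(t) for t in F_p, return [a_0, ..., a_{p-1}] with
    f(x) = a_0 + a_1 x + ... + a_{p-1} x^{p-1} as functions on F_p."""
    coeffs = [0] * p
    for t in range(p):
        numerator = [1]
        for s in range(p):
            if s != t:
                numerator = times_linear(numerator, s, p)
        # The denominator prod_{s != t} (t - s) is (p-1)! = -1 by Wilson's theorem.
        for k in range(p):
            coeffs[k] = (coeffs[k] - values[t] * numerator[k]) % p
    return coeffs


def evaluate(coeffs, x, p):
    return sum(a * pow(x, k, p) for k, a in enumerate(coeffs)) % p


p = 5
values = [3, 1, 4, 1, 0]
coeffs = reduced_polynomial(values, p)
```

Evaluating `coeffs` at each point of $\mathbb{F}_p$ recovers `values`, so the list is the reduced polynomial of the function.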
Suppose that $g \in G$ is such that $g(x) = x + 1$ for all $x \in X$. Then $\pi_1(g)$ acts on the functions $f \in V_1 = \mathbb{F}_p^X$ by the rule
$$\pi_1(g)(f)(x) = f(x - 1).$$
In particular, if we represent $f$ as a polynomial, then $\pi_1(g)$ does not change its degree and the coefficient of the leading term. It follows that the matrix of the operator $\pi_1(g)$ in the basis $e_0(x) = 1$, $e_1(x) = x$, $e_2(x) = x^2$, ..., $e_{p-1}(x) = x^{p-1}$ is uni-triangular. Let us denote this marked basis by $E$.
Definition 5.4. The basis $E_\infty$ of $C(X^\omega, \mathbb{F}_p)$ associated with $E$ is called the Kaloujnine basis of monomials.
It is easy to see that $e_n \in E_\infty$ is equal to the monomial function
$$e_n(x_1, x_2, \ldots) = x_1^{k_1} x_2^{k_2} \cdots,$$
where $k_1, k_2, \ldots$ are the digits of the base $p$ expansion of $n$, i.e., such that $k_i \in \{0, 1, \ldots, p-1\}$ and
$$n = k_1 + k_2 p + k_3 p^2 + \cdots.$$
Coordinates of a function $f \in C(X^\omega, \mathbb{F}_p)$ in the basis $E_\infty$ are the coefficients of the representation of the function $f$ as a polynomial in the variables $x_1, x_2, \ldots$. Since we are dealing with functions, we assume that these polynomials are reduced, i.e., are elements of the ring $\mathbb{F}_p[x_1, x_2, \ldots]/(x_1^p - x_1, x_2^p - x_2, \ldots)$.

As an immediate corollary of Proposition 5.3 we get the following.
Theorem 5.5. The representation $\pi_\infty$ of $K_p$ in the Kaloujnine basis $E_\infty$ is uni-triangular. In particular, the representations $\pi_\infty$ of all its subgroups $G \le K_p$ are uni-triangular in $E_\infty$.
We can change the ordered basis $E = \{e_0 = 1, e_1 = x, \ldots, e_{p-1} = x^{p-1}\}$ to any ordered basis $F = (f_0, f_1, \ldots, f_{p-1})$ consisting of polynomials of degrees $0, 1, 2, \ldots, p-1$, respectively, since then the transition matrix from $E$ to $F$ will be triangular, hence the representation of $G$ in the basis $F$ will also be uni-triangular.
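The uni-triangularity of the shift in the basis of monomials can be seen from the binomial expansion. The sketch below assumes that the operator acts by $f(x) \mapsto f(x - 1)$ (with the opposite sign convention the same conclusion holds); the function name `shift_matrix` is hypothetical.

```python
from math import comb


def shift_matrix(p):
    """Matrix over F_p of f(x) -> f(x - 1) in the basis 1, x, ..., x^{p-1}.

    Column j holds the coefficients of (x - 1)^j = sum_i C(j, i) (-1)^(j-i) x^i.
    No reduction modulo x^p - x is needed, since all degrees stay below p.
    """
    return [[(comb(j, i) * (-1) ** (j - i)) % p if i <= j else 0
             for j in range(p)]
            for i in range(p)]


p = 5
M = shift_matrix(p)
upper_unitriangular = all(
    M[i][j] == (1 if i == j else 0) for i in range(p) for j in range(i + 1)
)
```

Entries below the diagonal vanish and the diagonal consists of ones, because $(x-1)^j$ has degree $j$ and leading coefficient 1.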
Proof. We have $(p-1)! = -1$ and $(-1)^{p-1} = 1$ in $\mathbb{F}_p$. We also have $(x+1)(x+2) \cdots (x+p-1) = 0$ for all $x \in \mathbb{F}_p \setminus \{0\}$. It follows that $(-1)^{p-1}\binom{x+p-1}{p-1} = \delta_0(x)$. It is enough now to check that the functions $(-1)^k \binom{x+k}{k}$ satisfy the recurrent

Proposition 5.7. The transition matrix from the basis $(\delta_0, \delta_1, \ldots, \delta_{p-1})$ to the basis Its inverse is obtained by transposing $T$ with respect to the secondary diagonal:

Proof. It follows from Proposition 5.6 that the entry $t_{ij}$ of the transition matrix, where $i = 0, 1, \ldots, p-1$ and $j = 0, 1, \ldots, p-1$ is equal to which proves the first claim of the proposition. In order to prove the second claim, we have to show that the product is equal to the identity matrix. The general entry of the product is equal to is equal to one for $x = 0$ and is equal to zero for $x \ne 0$, which shows that the product is equal to the identity matrix.

, which agrees with the recursion Ξ 2 (a) = 1 1 1 + a 1 .
Example 5.11. It follows from Lemma 5.10 that the matrix recursion for the generators of the Grigorchuk group (see Example 2.4) in the basis B is See a visualization of the matrices b, c, d on Figure 7, where black pixels correspond to ones, and white pixels to zeros.
Proposition 5.12. Each space $U_n$ is $K_p$-invariant, and the kernel in $K_p$ of the restriction of $\pi_\infty$ onto $U_{p^{n-1}} = \langle b_0, b_1, \ldots, b_{p^{n-1}} \rangle$ coincides with the kernel of $\pi_n$. In other words, restriction of $\pi_\infty$ onto $U_{p^{n-1}}$ defines a faithful representation of $K_{p,n}$.

Proof. The subspace $\langle b_0, b_1, \ldots, b_{p^{n-1}} \rangle < C(X^\omega, \mathbb{F}_p)$ is equal to the span of the product $V_{n-1} \cdot e_{p^{n-1}}$, where $e_{p^{n-1}}$ is the function on $X^\omega$ given by $e_{p^{n-1}}(x_1, x_2, \ldots) = x_n$, according to (20). In other words, it is the tensor product $V_{n-1} \otimes e_1$, where $e_1(x) = x$. Suppose that $g \in G$ belongs to the kernel of the restriction of $\pi_\infty$ onto $V_{n-1} \otimes e_1$. Then for every $v \in X^{n-1}$ we have hence $\pi_1(g|_v)$ is identical for every $v \in X^{n-1}$. It follows that $g$ acts trivially on $X^n$, i.e., that $\pi_n(g)$ is trivial.
Thus, we get a faithful representation of $K_{p,n} = \wr_{k=1}^{n} C_p$ by uni-triangular matrices of dimension $p^{n-1} + 1$. Note that this is the smallest possible dimension for a faithful representation, since the nilpotency class of $K_{p,n}$ is equal to $p^{n-1}$, while the nilpotency class of the group of uni-triangular matrices of dimension $d$ is equal to $d - 1$.
If $A = (a_{ij})_{i,j=0}^\infty$ is an infinite matrix, then its first diagonal is the sequence $(a_{01}, a_{12}, a_{23}, \ldots)$, i.e., the first diagonal above the main diagonal of $A$.
Theorem 5.13. Let $g \in G$, and let $A_g = (a_{ij})_{i,j=0}^\infty$ be the matrix of $\pi_\infty(g)$ in the basis $B_\infty$, constructed in the previous section. Let $(s_1, s_2, \ldots) = (a_{01}, a_{12}, \ldots)$ be the first diagonal of $A_g$. Then $s_n = \alpha_k(g)$, where $\alpha(g) = (\alpha_0(g), \alpha_1(g), \ldots)$ is the image of $g$ under the abelianization map of Proposition 5.2 and $p^k$ is the maximal power of $p$ dividing $n$.
Proof. The first diagonal of a product of two upper uni-triangular matrices A and B is equal to the sum of the first diagonals of the matrices A and B. It follows that it is enough to prove the theorem for rooted automorphisms of Aut(X * ) (i.e., automorphisms g such that g| v is trivial for all non-empty words v ∈ X * ) and for automorphisms acting trivially on the first level X.
If the automorphism is rooted, then it is a power of the automorphism $a : (x_1, x_2, \ldots) \mapsto (x_1 + 1, x_2, \ldots)$. It follows from the definition of the basis $B$ that the matrix of $\pi_\infty(a)$ is the block-diagonal matrix consisting of the Jordan cells of size $p$. Consequently, its first diagonal is the periodic sequence of period $(1, 1, \ldots, 1, 0)$ of length $p$. Hence, the first diagonal of $a^s$ is $(s, s, \ldots, s, 0)$ repeated periodically. This proves the statement of the theorem for the automorphisms of the form $a^s$.
Suppose that g acts trivially on the first level of the tree. Then its matrix recursion in the basis {δ x } x∈X is the diagonal matrix with the entries g| x on the diagonal. It follows that the matrix recursion for g in the basis {b i } p−1 i=0 is equal to the product of the matrices It is easy to see that the entries on the first diagonal above the main diagonal of the product are equal to zero, and that the entry in the left bottom corner is equal to g| 0 + g| 1 + · · · + g| p−1 . If we apply the stencil map

It follows from Theorem 5.13 that the first diagonal of a is (1, 0, 1, 0, . . .). The first diagonal of b is (s 1 , s 2 , . . .) where where k, m ≥ 0 are integers.

The sequence s n from the previous example is a Toeplitz sequence, see [Toe28, PU79] and [AS03, Exercise 10.11.42]. If G is a finitely generated finite-state self-similar group, then the sequence α k (g) in Theorem 5.13 is eventually periodic for every g ∈ G, and then the sequence s n is Toeplitz.
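The shape of the first diagonal can be generated directly from a sequence $\alpha_0, \alpha_1, \ldots$. The sketch below assumes the reading $s_n = \alpha_k(g)$ of Theorem 5.13, where $p^k$ is the maximal power of $p$ dividing $n$; the function names are hypothetical, and the two sample inputs correspond to the rooted automorphisms discussed in the proof.

```python
def p_adic_order(n, p):
    """Largest k such that p**k divides the positive integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def first_diagonal(alpha, p, length):
    """Return (s_1, ..., s_length) with s_n = alpha(k), p^k the exact power of p in n.

    alpha is a function k -> element of F_p (ASSUMED to list the level sums of g).
    """
    return [alpha(p_adic_order(n, p)) % p for n in range(1, length + 1)]


# Rooted automorphism a for p = 2: alpha = (1, 0, 0, ...).
diagonal_a = first_diagonal(lambda k: 1 if k == 0 else 0, 2, 8)

# Rooted automorphism a^2 for p = 3: alpha = (2, 0, 0, ...).
diagonal_a_squared = first_diagonal(lambda k: 2 if k == 0 else 0, 3, 6)
```

A sequence obtained this way from an eventually periodic $\alpha$ is filled in by successive periodic patterns, which is the Toeplitz property mentioned above.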
Theorem 5.15. Let k be a finite field of characteristic p. Then a sequence w ∈ k ω is p-automatic if and only if the generating series G w (x) is algebraic over k(x).
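Christol's theorem can be illustrated on the Thue-Morse sequence over $\mathbb{F}_2$. Assuming that the generating series is $G_w(x) = \sum_n a_n x^n$, the relations $t_{2n} = t_n$ and $t_{2n+1} = t_n + 1$ give the equation $(1+x)^3 G^2 + (1+x)^2 G + x = 0$. The sketch below verifies it modulo $x^N$; all names are illustrative.

```python
N = 64  # work with power series over F_2 truncated modulo x^N

t = [bin(n).count("1") % 2 for n in range(N)]   # Thue-Morse coefficients


def add(a, b):
    return [x ^ y for x, y in zip(a, b)]


def mul(a, b):
    """Product of two truncated power series over F_2."""
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj and i + j < N:
                    out[i + j] ^= 1
    return out


def series(*coeffs):
    return list(coeffs) + [0] * (N - len(coeffs))


one_plus_x = series(1, 1)
square = mul(one_plus_x, one_plus_x)     # (1 + x)^2
cube = mul(square, one_plus_x)           # (1 + x)^3

# (1 + x)^3 * G^2 + (1 + x)^2 * G + x, computed modulo x^N
residual = add(add(mul(cube, mul(t, t)), mul(square, t)), series(0, 1))
```

All coefficients of `residual` vanish, so the truncated series satisfies the quadratic equation, in agreement with the algebraicity predicted by the theorem.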
If $A = (a_{ij})_{i,j=0}^\infty$ is a matrix over a field $k$, then we can consider the formal series
$$G_A(x, y) = \sum_{i,j=0}^{\infty} a_{ij}\, x^i y^j.$$
We also have a complete analog of Christol's theorem for matrices, see [AS03, Theorem 14.4.2].
Theorem 5.16. Let k be a finite field of characteristic p. Then a matrix A = (a ij ) ∞ i,j=0 is p-automatic if and only if the series G A is algebraic over k(x, y).
In the case when $A$ is triangular, it may be natural to use the generating function
$$T_A(t, s) = \sum_{i=0}^{\infty} t^i H_i(s),$$
where $H_i(s) = \sum_{n \ge 0} a_{n,n+i}\, s^n$ are generating functions of the diagonals of $A$. Note that $T_A$ and $G_A$ are related by the formula $G_A(x, y) = T_A(y, xy)$, i.e., $t = y$ and $s = xy$.

Example 5.17. Consider the generators of the Grigorchuk group $a, b, c, d$ given by the matrices from Example 5.11. Let $A(x, y)$, $B(x, y)$, $C(x, y)$, and $D(x, y)$ be the corresponding generating series. Note that the generating series of the unit matrix is $I(x, y) = \frac{1}{1+xy} = \frac{1}{1+s}$. It follows from the recursions in Example 5.11 that
$$A = I(x^2, y^2) + y I(x^2, y^2) + xy I(x^2, y^2) = \frac{1 + y + xy}{1 + x^2 y^2} = \frac{1}{1+s} + \frac{t}{1+s^2}.$$
Let us make a substitution $A = y\tilde{A} + \frac{1}{1+xy}$, $B = y\tilde{B} + \frac{1}{1+xy}$, $C = y\tilde{C} + \frac{1}{1+xy}$, and $D = y\tilde{D} + \frac{1}{1+xy}$. Note that $I = \frac{1}{1+xy}$, so we set $\tilde{I} = 0$. The series $\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D}$ are the generating series of the matrices obtained from the matrices $a, b, c, d$ by removing the main diagonal, and shifting all columns to the left by one position.
Let us denote $F = \frac{s + ts}{1 + s^4}$. Then $\tilde{B}$, $\tilde{C}$, and $\tilde{D}$ are solutions of the equations Substituting $t = 0$, we get equations for the generating functions $B_1(s)$, $C_1(s)$, $D_1(s)$ of the first diagonals above the main in the matrices $b, c, d$:

Then every upper uni-triangular matrix $M$ can be written as where $D_i(M)$ are diagonal matrices whose main diagonals are equal to the $i$th diagonals of $M$.
The generating series T M (t, s) is equal to where M i (s) is the usual generating series of the main diagonal of D i (M ). Addition and multiplication of diagonal matrices diag(a 0 , a 1 , . . .) corresponds to the usual addition and the Hadamard (coefficient-wise) multiplication of the power series a 0 + a 1 s + a 2 s 2 + · · · . Note that we have J diag(a 0 , a 1 , . . .) = diag(a 1 , a 2 , . . .)J, which gives an algebraic rule for multiplication of the power series (25) corresponding to multiplication of matrices. Namely, we can replace the matrix M by the formal power series (25), where the series in the variable s are added and multiplied coordinate-wise, while the series in the variable t are multiplied in the usual (though non-commutative) way subject to the relation t(a 0 s 0 + a 1 s + a 2 s 2 + · · · ) = (a 1 s 0 + a 2 s + a 3 s 2 + · · · )t.
Note that iterations of the map $2n \mapsto n$, $2n + 1 \mapsto n + 1$ on the set of non-negative integers are attracted to the two fixed points $0 \mapsto 0$ and $1 \mapsto 1$. Consequently, we get the following which implies the statement of the proposition.
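The map in question sends $n$ to $\lceil n/2 \rceil$, so the attraction to the fixed points 0 and 1 is easy to observe numerically. A minimal Python sketch (the function names are illustrative):

```python
def halve_up(n):
    """The map 2n -> n, 2n + 1 -> n + 1, i.e. n -> ceil(n / 2)."""
    return (n + 1) // 2


def orbit(n):
    """Iterate the map starting from n until a fixed point is reached."""
    points = [n]
    while halve_up(points[-1]) != points[-1]:
        points.append(halve_up(points[-1]))
    return points
```

For example, `orbit(11)` returns `[11, 6, 3, 2, 1]`, while `orbit(0)` returns `[0]`: every positive integer ends at the fixed point 1, and 0 stays at 0.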
For the basis B ∞ we have b p n = − xn+1 1 = −x n − 1, and a similar proof works.
Definition 5.21. The columns numbered $p^n$, $n = 0, 1, 2, \ldots$, of the matrix $A_g$ are called the principal columns of $A_g$.
where f (x 1 , x 2 , . . . , x n , x) ∈ F X n+1 p on the right-hand side of the equality is treated as a function of x for every choice of (x 1 , x 2 , . . . , x n ) ∈ F X n p . Using (26), we see that R k can be computed using the formula R k (f )(x 1 , x 2 , . . . , x n ) = p−1 x=0 x p − 1 − k f (x 1 , x 2 , . . . , x n , x).
Proof. For any monomial $b_{i_1} \otimes b_{i_2} \otimes \cdots \otimes b_{i_m}$ and any $0 \le j \le p - 1$, we have

The proof of the proposition is now straightforward.
In [CSLST05] a different basis of the dual space was used (formal dot products with $b_k$), but the transition matrix from their basis to $\{b'_i\}$ is triangular, so a statement similar to Proposition 5.23 holds.
One can find the height of a function $f \in \mathbb{F}_p^{X^n}$ "from the other end" by applying the maps $T_{b'_k}$ defined in 4.4. Recall that these maps act by the rule $T_{b'_k}(f)(x_1, x_2, \ldots, x_{n-1}) = \langle b'_k \mid f(x, x_1, x_2, \ldots, x_{n-1}) \rangle$, i.e., they are linear extensions of the maps and are computed by the rule x 1 , x 2 , . . . , x n−1 ).
Equation (27) then implies the following algorithm for computing the height of a function.
Proposition 5.24. Let $f \in \mathbb{F}_p^{X^n}$, and let $h_k = T_{b'_k}(f)$ for $0 \le k \le p - 1$. Let $h_{k_1}, h_{k_2}, \ldots, h_{k_s}$ be the functions of maximal height among the functions $h_k$, and let $j_1 = \max(k_i)$. Then $\gamma(f) = j_1 + d \cdot \gamma(h_{j_1})$.
Proposition 5.25. For every $f \in \mathbb{F}_2^{X^n}$ we have

Proof. We have $h_0 = f_1$, $h_1 = f_0 + f_1$.
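For $p = 2$ the recursion of Proposition 5.24 can be sketched in Python as follows. The sketch rests on several assumptions that are not spelled out in this passage: the coefficient $d$ is taken to be $2$, the functions $f_0$ and $f_1$ are read as $f_i(x_2, \ldots, x_n) = f(i, x_2, \ldots, x_n)$, a non-zero constant has height $0$, and the zero function is given the placeholder height `None`. The function names and the representation of $f$ as a dictionary are illustrative.

```python
from itertools import product


def table(func, n):
    """Value table of func on {0,1}^n, reduced modulo 2 (illustrative helper)."""
    return {xs: func(*xs) % 2 for xs in product((0, 1), repeat=n)}


def height_p2(f, n):
    """Height of f: {0,1}^n -> F_2 via h_0 = f_1, h_1 = f_0 + f_1.

    Returns None for the zero function (a convention of this sketch).
    """
    if n == 0:
        return 0 if f[()] % 2 else None
    h0, h1 = {}, {}
    for tail in product((0, 1), repeat=n - 1):
        f0 = f[(0,) + tail]  # assumed meaning of f_0
        f1 = f[(1,) + tail]  # assumed meaning of f_1
        h0[tail] = f1 % 2
        h1[tail] = (f0 + f1) % 2
    heights = [(height_p2(h0, n - 1), 0), (height_p2(h1, n - 1), 1)]
    candidates = [(g, k) for g, k in heights if g is not None]
    if not candidates:
        return None
    # Maximal height first; among ties, the largest index k plays the role of j_1.
    best, j1 = max(candidates)
    return j1 + 2 * best  # assumes d = 2
```

With these conventions, `height_p2(table(lambda x1, x2: x1 * x2, 2), 2)` returns `3` and `height_p2(table(lambda x1, x2: x1 + x2, 2), 2)` returns `2`. This agrees with numbering the monomials $x_1^{k_1} x_2^{k_2}$ by $k_1 + 2k_2$.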
For more on the height of functions on trees and its generalizations, see [CSLST05].

Denote, as before, $U_n = \langle e_0, e_1, \ldots, e_n \rangle$. Then $U_n$ consists of reduced polynomials of height not bigger than $n$.
Since the representation of $G$ is uni-triangular in the basis $E_\infty$, the spaces $U_n$ are $G$-invariant, i.e., are sub-modules of the $G$-module $C(X^\omega, \mathbb{F}_p)$. Note also that $U_{n-1}$ has co-dimension 1 in $U_n$.
Proposition 5.26. Let $g \in K_p$ be the adding machine. Then $U_n = (g - 1)U_{n+1}$ for every $n$.
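For $p = 2$ this statement can be tested on a finite level of the tree. The Python sketch below makes the following assumptions: a word $x_1 x_2 \ldots x_m$ is encoded by the integer $w = x_1 + 2x_2 + \cdots + 2^{m-1}x_m$, the adding machine acts by $w \mapsto w + 1 \pmod{2^m}$, the monomial $e_k$ is the product of those $x_i$ for which the $i$th binary digit of $k$ is 1, and $g$ acts on functions by $(gf)(w) = f(w+1)$. The names are illustrative. The check confirms that $(g - 1)e_k$ has height exactly $k - 1$, which is what the proposition requires for a uni-triangular action.

```python
def monomial(k, m):
    """Value table of e_k on {0,1}^m; bit i of the index w holds x_{i+1}."""
    return [1 if (w & k) == k else 0 for w in range(1 << m)]


def height(values):
    """Largest k such that e_k occurs in the polynomial of the table, or None."""
    c = list(values)
    step = 1
    while step < len(c):
        # Moebius transform over F_2: value table -> polynomial coefficients.
        for w in range(len(c)):
            if w & step:
                c[w] ^= c[w ^ step]
        step <<= 1
    nonzero = [k for k, ck in enumerate(c) if ck]
    return max(nonzero) if nonzero else None


def g_minus_one(values):
    """Table of f(w + 1) - f(w) over F_2, with w + 1 taken modulo 2^m."""
    n = len(values)
    return [values[(w + 1) % n] ^ values[w] for w in range(n)]


def check_adding_machine(m):
    """Check that (g - 1) e_k has height k - 1 for all 0 < k < 2^m."""
    return all(height(g_minus_one(monomial(k, m))) == k - 1
               for k in range(1, 1 << m))
```

Under these conventions `check_adding_machine(4)` returns `True`.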
Theorem 5.27. If $V$ is a sub-module of the $K_p$-module $C(X^\omega, \mathbb{F}_p)$, then either $V = \{0\}$, or $V = C(X^\omega, \mathbb{F}_p)$, or $V = U_n$ for some $n$.
Let $n$ be the maximal height of an element of $V$. If $n$ is finite, then, by what was proved above, $V = U_n$. If $n$ is infinite, then, by what was proved above, $V$ contains $\bigcup_{n=0}^{\infty} U_n = C(X^\omega, \mathbb{F}_p)$.
We therefore adopt the following definition.
Definition 5.28. Let $G \le K_p$. We say that the action of $G$ on $C(X^\omega, \mathbb{F}_p)$ is uniserial if for every $n \ge 0$ the set $\bigcup_{g \in G} (g - 1)U_{n+1}$ generates $U_n$.
A module $M$ is said to be uniserial if its lattice of sub-modules is a chain. It is easy to see that the same arguments as in the proof of Theorem 5.27 show that if the action of $G$ on $C(X^\omega, \mathbb{F}_p)$ is uniserial, then the $U_n$ are the only proper sub-modules of the $G$-module $C(X^\omega, \mathbb{F}_p)$. Consequently, the $G$-module $C(X^\omega, \mathbb{F}_p)$ is uniserial.
In group theory (see [DdSMS99,LGM02]) an action of a group $G$ on a finite $p$-group $U$ is said to be uniserial if $|N : [N, G]| = p$ for every non-trivial $G$-invariant subgroup $N \le U$. Here $[N, G]$ is the subgroup of $N$ generated by the elements $h^g h^{-1}$ for $h \in N$ and $g \in G$, where $h^g$ denotes the action of $g \in G$ on $h \in N$.
Let $g \in K_p$, and let $[f_0, f_1(x_1), f_2(x_1, x_2), \ldots]$ be the tableau of $g$. We have seen in Subsection 5.5 (see the proof of Proposition 5.20) that the entries of the principal columns $(a_{0,p^n}, a_{1,p^n}, \ldots, a_{p^n-1,p^n})^\top$ of the matrix $(a_{i,j})_{i,j=0}^{\infty}$ of $\pi_\infty(g)$ in the basis $E_\infty$ are precisely the coefficients of the polynomials $f_n$:
$$f_n(x_1, x_2, \ldots, x_n) = \sum_{k=0}^{p^n-1} a_{k,p^n} e_k,$$
where $e_k$ is the monomial of height $k$.
It follows that the height of $f_n$ is equal to the largest index of a non-zero non-diagonal entry of column number $p^n$ of the matrix of $\pi_\infty(g)$ in the basis $E_\infty$. Note that the same is true for the matrix of $\pi_\infty(g)$ in the basis $B_\infty$.
Proposition 5.29. Let $G \le K_p$, and let $\alpha : K_p \to \mathbb{F}_p^\omega : g \mapsto (\alpha_0(g), \alpha_1(g), \ldots)$ be the abelianization homomorphism given by (18). The action of the group $G$ on $C(X^\omega, \mathbb{F}_p)$ is uniserial if and only if every homomorphism $\alpha_k : G \to \mathbb{F}_p$ is non-zero.
Proof. It follows from Theorem 5.13 that all homomorphisms $\alpha_k$ are non-zero if and only if for every $k = 1, 2, \ldots$ there exists $g_k \in G$ such that entry number $k$ on the first diagonal of $\pi_\infty(g_k)$ is non-zero.
Then for every monomial $e_k$ the height of $(1 - g_k)(e_k)$ is equal to $k - 1$, which shows that $\bigcup_{i=1}^{k} (1 - g_i)(U_k)$ generates $U_{k-1}$; hence the action of $G$ is uniserial.
Corollary 5.30. Let $S$ be a generating set of $G \le K_p$. Then the action of $G$ on $C(X^\omega, \mathbb{F}_p)$ is uniserial if and only if for every $k = 0, 1, \ldots$ there exists $g_k \in S$ such that $\alpha_k(g_k) \ne 0$.
Note that it also follows from Theorem 5.13, and from the fact that the entries in the principal columns are the coefficients of the polynomials in the tableau, that $\alpha_n(g) \ne 0$ if and only if the height of the polynomial $f_n$ of the tableau $[f_0, f_1(x_1), f_2(x_1, x_2), \ldots]$ representing $g$ is equal to $p^n - 1$, i.e., has the maximal possible value.
Example 5.31. The cyclic group generated by an element $g \in K_p$ is transitive on the levels $X^n$ if and only if $\alpha_n(g) \ne 0$ for all $n$. It follows that if $G$ contains a level-transitive element, then its action is uniserial. But there exist torsion groups with uniserial action on $C(X^\omega, \mathbb{F}_p)$, as the following example shows.
(In the last three equalities, each sequence has a pre-period of length 1 and a period of length 3.) It follows that the action of the Grigorchuk group is uniserial.
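The criterion of Corollary 5.30 can be tested numerically on finitely many levels. The Python sketch below assumes the standard wreath recursion of the Grigorchuk group, $a = \sigma$, $b = (a, c)$, $c = (a, d)$, $d = (1, b)$, and assumes that $\alpha_k(g)$ is the sum modulo 2 of the vertex permutations of $g$ on level $k$, with $\sigma$ counted as 1. Neither convention is restated in this passage, and the names are illustrative. A finite check of this kind is only a sanity test, not a proof.

```python
from functools import lru_cache

# Sections of the generators on the first level; "e" is the identity (assumed recursion).
SECTIONS = {"e": ("e", "e"), "a": ("e", "e"),
            "b": ("a", "c"), "c": ("a", "d"), "d": ("e", "b")}
# Root permutation of each element, written additively in F_2.
ROOT = {"e": 0, "a": 1, "b": 0, "c": 0, "d": 0}


@lru_cache(maxsize=None)
def alpha(g, k):
    """Sum modulo 2 of the vertex permutations of g on level k."""
    if k == 0:
        return ROOT[g]
    left, right = SECTIONS[g]
    return (alpha(left, k - 1) + alpha(right, k - 1)) % 2


def alpha_sequence(g, length):
    """First `length` terms of the sequence (alpha_0(g), alpha_1(g), ...)."""
    return [alpha(g, k) for k in range(length)]


def levels_covered(generators, depth):
    """Check that each level k < depth has a generator with alpha_k != 0."""
    return all(any(alpha(g, k) for g in generators) for k in range(depth))
```

Here `alpha_sequence("b", 7)` returns `[0, 1, 1, 0, 1, 1, 0]`, which has a pre-period of length 1 and a period of length 3, and `levels_covered("abcd", 60)` returns `True`.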
Example 5.33. The Gupta-Sidki group [GS83] is generated by two elements $a, b$ acting on $\{0, 1, 2\}^*$, where $a$ is the cyclic permutation $\sigma = (012)$ on the first level of the tree (i.e., changing only the first letter of a word), and $b$ is defined by the wreath recursion $b = (a, a^{-1}, b)$.