1 Introduction

The theory of automatic and biautomatic groups was developed in the 1980s, and is explored in a book by D. B. A. Epstein et al. [4]. Roughly speaking, a group G is biautomatic if it can be equipped with a regular set of normal forms, in such a way that paths starting and ending at neighbouring vertices in a Cayley graph of G fellow-travel; see Sect. 2.2 for a precise definition. Biautomaticity implies various geometric and algorithmic properties: for instance, a biautomatic group is finitely presented, satisfies a quadratic isoperimetric inequality and has solvable conjugacy problem, and its finitely generated abelian subgroups are undistorted.

There has been substantial interest in biautomaticity of various classes of non-positively curved groups. In particular, word-hyperbolic groups [1] and cubulated groups [11]—that is, groups acting geometrically and cellularly on CAT(0) cube complexes—are biautomatic; more generally, Helly groups (introduced recently in [2]) are biautomatic. Artin groups of finite type [3] and central extensions of word-hyperbolic groups [10] are also biautomatic. For several decades, it has been an open question whether or not all CAT(0) groups—groups acting geometrically on CAT(0) spaces—are biautomatic. However, I. J. Leary and A. Minasyan have recently [8] constructed examples of CAT(0) groups that are not biautomatic.

More precisely, the paper [8] studies commensurating HNN-extensions of \(\mathbb {Z}^n\) (called Leary–Minasyan groups in this paper), defined for a matrix \(A \in GL_n(\mathbb {Q})\) and a finite-index subgroup \(L \le \mathbb {Z}^n \cap A^{-1}(\mathbb {Z}^n)\) by the presentation

$$\begin{aligned} G(A,L) = \langle x_1,\ldots ,x_n,t \mid x_ix_j = x_jx_i \text { for } 1 \le i < j \le n, t\textbf{x}^{\textbf{v}}t^{-1} = \textbf{x}^{A\textbf{v}} \text { for } \textbf{v} \in L \rangle , \end{aligned}$$

where we write \(\textbf{x}^{\textbf{w}} := x_1^{w_1} \cdots x_n^{w_n}\) for any \(\textbf{w} = (w_1,\ldots ,w_n) \in \mathbb {Z}^n\). It was shown in [8, Theorem 1.1] that such a group \(G(A,L)\) is CAT(0) if and only if A is conjugate in \(GL_n(\mathbb {R})\) to an orthogonal matrix, and biautomatic if and only if A has finite order. Thus, such groups provide the first examples of CAT(0) groups that are not biautomatic.

A special case of Leary–Minasyan groups for \(n = 1\) are the Baumslag–Solitar groups, defined for \(p,q \in \mathbb {Z}\setminus \{0\}\) by the presentation \(BS(p,q) = \langle x,t \mid tx^pt^{-1} = x^q \rangle \). It is well-known that \(BS(p,q)\) is biautomatic if \(|p| = |q|\) (because it is cubulated, for instance), and that \(BS(p,q)\) cannot be embedded in a biautomatic group if \(|p| \ne |q|\) [5, Corollary 6.8]. This motivated a question [8, Question 12.2], suggested by K.-U. Bux, which asks if a similar dichotomy is true for arbitrary Leary–Minasyan groups. We settle this question in the affirmative: that is, \(G(A,L)\) is either biautomatic or not embeddable into a biautomatic group.

Theorem 1.1

Let \(A \in GL_n(\mathbb {Q})\), and let L be a finite-index subgroup of \(\mathbb {Z}^n \cap A^{-1}(\mathbb {Z}^n)\). Then \(G(A,L)\) is a subgroup of a biautomatic group if and only if A has finite order. In particular, there exist CAT(0) groups that are not embeddable into biautomatic groups.
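The two conditions on A appearing in [8, Theorem 1.1] and in Theorem 1.1 (orthogonality and finite order) can be tested exactly for a given rational matrix. The following is a minimal sketch under our own assumptions: the function names and the two test matrices are hypothetical and are not taken from [8], and `order_up_to` only searches up to a user-supplied bound, so a result of `None` means merely that no order was found within that bound. For \(n = 2\) a bound of 6 is already conclusive, since a finite-order element of \(GL_2(\mathbb {Q})\) has order 1, 2, 3, 4 or 6.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """The n x n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def is_orthogonal(A):
    """Check A^T A = I using exact rational arithmetic."""
    n = len(A)
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    return mat_mul(At, A) == identity(n)


def order_up_to(A, bound):
    """Return the least k in 1..bound with A^k = I, or None if there is none."""
    n = len(A)
    power = identity(n)
    for k in range(1, bound + 1):
        power = mat_mul(power, A)
        if power == identity(n):
            return k
    return None


# Hypothetical test matrices (illustrative choices, not from the text).
rotation_3_4_5 = [[Fraction(3, 5), Fraction(-4, 5)],
                  [Fraction(4, 5), Fraction(3, 5)]]
quarter_turn = [[Fraction(0), Fraction(-1)],
                [Fraction(1), Fraction(0)]]

print(is_orthogonal(rotation_3_4_5), order_up_to(rotation_3_4_5, 12))
print(is_orthogonal(quarter_turn), order_up_to(quarter_turn, 12))
```

In this sketch the first matrix is orthogonal but has no finite order, so by the results quoted above the corresponding groups \(G(A,L)\) are CAT(0) but not embeddable into a biautomatic group, whereas the second matrix has order 4.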

Theorem 1.1 is a consequence of Theorem 1.2 below, which is a statement about commensurators of abelian subgroups in biautomatic groups.

Given a group G and a subgroup \(H \le G\), we define the commensurator \({{\,\textrm{Comm}\,}}_G(H)\) of H in G as the set of elements \(g \in G\) for which both \(H \cap gHg^{-1}\) and \(H \cap g^{-1}Hg\) have finite index in H; it is easily seen to be a subgroup of G. A related concept is that of the abstract commensurator \({{\,\textrm{Comm}\,}}(H)\) of a group H, whose elements are equivalence classes of isomorphisms between finite-index subgroups of H, forming a group under composition (see Sect. 2.1 for a precise definition). For any \(H \le G\), there is a canonical map \({{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H)\) which sends \(g \in {{\,\textrm{Comm}\,}}_G(H)\) to the equivalence class of the isomorphism \(\varphi :H \cap g^{-1}Hg \rightarrow H \cap gHg^{-1}\) defined as \(\varphi (h) = ghg^{-1}\).

Theorem 1.2

Let G be a biautomatic group and let \(H \le G\) be a finitely generated abelian subgroup. Then the image of \({{\,\textrm{Comm}\,}}_G(H)\) in \({{\,\textrm{Comm}\,}}(H)\) is finite. In particular, there exists a finite-index subgroup \({{\,\textrm{Comm}\,}}_G^0(H) \unlhd {{\,\textrm{Comm}\,}}_G(H)\) such that every element of \({{\,\textrm{Comm}\,}}_G^0(H)\) centralises some finite-index subgroup of H.

Theorem 1.2 can be seen as a generalisation of [8, Theorem 1.2]—the only difference between these two results is that H is assumed to be finitely generated in the former and \(\mathcal {M}\)-quasiconvex (for some biautomatic structure \((Y,\mathcal {M})\) for G) in the latter; see Sect. 2.3 for a definition of \(\mathcal {M}\)-quasiconvex subsets. Indeed, \(\mathcal {M}\)-quasiconvexity implies finite generation for any subgroup of G, and so we may deduce [8, Theorem 1.2] from Theorem 1.2.

Note that Theorem 1.1 can be seen as an immediate corollary of Theorem 1.2. Indeed, if G is biautomatic and \(\widehat{G} \le G\) is a subgroup isomorphic to a Leary–Minasyan group \(G(A,L)\) (with an isomorphism sending \(\widehat{t} \in \widehat{G}\) to \(t \in G(A,L)\) and a subgroup \(H < \widehat{G}\) to the subgroup \(\langle x_1,\ldots ,x_n \rangle < G(A,L)\)), then we have \(\widehat{t} \in {{\,\textrm{Comm}\,}}_{\widehat{G}}(H) \le {{\,\textrm{Comm}\,}}_G(H)\). It then follows from Theorem 1.2 that \(\widehat{t}^k \in {{\,\textrm{Comm}\,}}_G^0(H)\) for some \(k \in \mathbb {Z}_{\ge 1}\), implying that \(A^k = I_n\), and so that A has order \(\le k\). Conversely, if A has finite order then \(G(A,L)\) is itself biautomatic by [8, Theorem 1.1], and so it is a subgroup of a biautomatic group.

In fact, the class of CAT(0) groups not embeddable into biautomatic groups is more general than merely the groups \(G(A,L)\). In particular, if X is a proper CAT(0) space with no Euclidean factors, and if \(K \le {\text {Isom}}(X)\) is a closed subgroup acting minimally and cocompactly on X, then one can show that a lattice G in \({\text {Isom}}(\mathbb {E}^n) \times K\) is not a subgroup of a biautomatic group unless G has discrete image under the projection \({\text {Isom}}(\mathbb {E}^n) \times K \rightarrow {\text {Isom}}(\mathbb {E}^n)\). Indeed, it has been pointed out by S. Hughes [6, Theorem 7.7 and its proof] that in this case there exists an element \(t \in G\) that commensurates a subgroup \(H \cong \mathbb {Z}^n\) of G, but \(t^k\) does not centralise any finite-index subgroup of H (for any \(k \ge 1\)). The class of lattices in \({\text {Isom}}(\mathbb {E}^n) \times K\), with K as above, contains all CAT(0) Leary–Minasyan groups, and is studied in more detail by S. Hughes in [6].

Our proof of Theorem 1.2 relies on a triangulation of the sphere \(\mathbb {S}^{N-1}\) associated to a biautomatic structure \((X,\mathcal {L})\) for \(\mathbb {Z}^N\), described by W. D. Neumann and M. Shapiro in [12]; see Remark 4.4 for more details. We equip such a triangulation with an additional structure: namely, we define a polyhedral function \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) (see Sect. 2.4 for a definition) whose restriction to each polyhedral cone \(\{ \beta \textbf{v} \mid \textbf{v} \in \Delta , \beta \in [0,\infty ) \}\), where \(\Delta \subseteq \mathbb {S}^{N-1}\) is a simplex in this triangulation, is linear and homogeneous. This function is chosen in such a way that it roughly approximates lengths of words in \(\mathcal {L}\) representing an element of \(\mathbb {Z}^N\). The key point of this construction is that, for a group G with a biautomatic structure \((Y,\mathcal {M})\), it allows us to deal not only with an \(\mathcal {M}\)-quasiconvex abelian subgroup \(\widehat{H} \le G\), but also with any subgroup of such an \(\widehat{H}\).

The structure of the paper is as follows. In Sect. 2, we introduce definitions and main results on commensurators, biautomatic groups and polyhedral functions. In Sect. 3, we prove several results on polyhedral functions, and in Sect. 4 we associate a polyhedral function to a biautomatic structure for \(\mathbb {Z}^N\) and we compare our construction to that of Neumann–Shapiro. In Sect. 5, we use these results to prove Theorem 1.2.

2 Preliminaries

Throughout this paper, we denote by \(\langle {-}, {-} \rangle \) the standard inner product on \(\mathbb {R}^n\), and by \(\Vert {-} \Vert \) the standard \(\ell _2\)-norm on \(\mathbb {R}^n\), so that \(\Vert \textbf{v} \Vert ^2 = \langle \textbf{v},\textbf{v} \rangle \) for all \(\textbf{v} \in \mathbb {R}^n\). We also write \(I_n \in GL_n(\mathbb {R})\) for the \(n \times n\) identity matrix.

We denote by [G : H] the index of a subgroup H in a group G. We write Z(G) for the centre of a group G, and \(C_G(S)\) for the centraliser of a subset \(S \subseteq G\).

2.1 Commensurators

Given a group G and a subgroup \(H \le G\), the commensurator of H in G is

$$\begin{aligned} {{\,\textrm{Comm}\,}}_G(H) := \{ g \in G \mid [H : H \cap gHg^{-1}]< \infty \text { and } [H : H \cap g^{-1}Hg] < \infty \}. \end{aligned}$$

It is easy to check that \({{\,\textrm{Comm}\,}}_G(H)\) is a subgroup of G containing H.

A related notion is that of the abstract commensurator of a group H. In order to define it, let \(\mathcal {C}_H\) be the set of all isomorphisms \(\varphi :A \rightarrow B\), where A and B are finite-index subgroups of H. We say \(\varphi ,\psi \in \mathcal {C}_H\) are equivalent, denoted \(\varphi \sim _H \psi \), if there exists a finite-index subgroup A of H contained in the domains of both \(\varphi \) and \(\psi \) such that \(\varphi (h) = \psi (h)\) for any \(h \in A\); we denote by \([\varphi ]\) or \([\varphi ]_H\) the equivalence class of \(\varphi \) in \(\mathcal {C}_H\). We then define the abstract commensurator of H as

$$\begin{aligned} {{\,\textrm{Comm}\,}}(H) := \mathcal {C}_H / {\sim _H}. \end{aligned}$$

Given two isomorphisms \(\varphi :A \rightarrow B\) and \(\varphi ':A' \rightarrow B'\) between finite-index subgroups \(A,A',B,B' \le H\), we may define the product \([\varphi ][\varphi ']\) of the classes \([\varphi ],[\varphi '] \in {{\,\textrm{Comm}\,}}(H)\) to be the equivalence class of the map \(\psi :(\varphi ')^{-1}(A \cap B') \rightarrow \varphi (A \cap B')\) defined by \(\psi (h) = \varphi (\varphi '(h))\). It is easy to verify that this makes \({{\,\textrm{Comm}\,}}(H)\) into a group.

Now given an element \(g \in {{\,\textrm{Comm}\,}}_G(H)\) for groups \(H \le G\), we have \(\varphi _g \in \mathcal {C}_H\), where \(\varphi _g:H \cap g^{-1}Hg \rightarrow H \cap gHg^{-1}\) is the map defined by \(\varphi _g(h) = ghg^{-1}\). Thus we have a canonical map \(\Phi :{{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H)\) sending \(g \mapsto [\varphi _g]\), and this map can be easily checked to be a group homomorphism. It follows from the definitions that \(g \in \ker (\Phi )\) precisely when \(\varphi _g\) coincides with the identity on some finite-index subgroup of \(H \cap g^{-1}Hg\), which happens if and only if g centralises a finite-index subgroup of H.

We will be interested in commensurators of finitely generated free abelian groups. In that case, it is easy to see that \({{\,\textrm{Comm}\,}}(\mathbb {Z}^n)\) is isomorphic to \(GL_n(\mathbb {Q})\). Indeed, given a matrix \(A \in GL_n(\mathbb {Q})\), the intersection \(L := \mathbb {Z}^n \cap A^{-1}(\mathbb {Z}^n)\) will have finite index in \(\mathbb {Z}^n\), and we may define a map \(\psi _A:L \rightarrow AL\) sending \(\textbf{v} \mapsto A\textbf{v}\). This gives a map \(GL_n(\mathbb {Q}) \rightarrow {{\,\textrm{Comm}\,}}(\mathbb {Z}^n)\) sending \(A \mapsto [\psi _A]\), which can be checked to be a group isomorphism.
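To make the subgroup \(L = \mathbb {Z}^n \cap A^{-1}(\mathbb {Z}^n)\) concrete, the sketch below computes its index in \(\mathbb {Z}^n\) by brute force. It rests on the following observation (ours, stated here as an assumption of the sketch): if d is a common denominator of the entries of A, then L is the kernel of the homomorphism \(\mathbb {Z}^n \rightarrow (\mathbb {Z}/d)^n\), \(\textbf{v} \mapsto (dA)\textbf{v} \bmod d\), so the index of L equals the size of the image. The function names and the sample matrices are hypothetical, and the enumeration over \(d^n\) vectors is only practical for small n and d.

```python
from fractions import Fraction
from itertools import product
from math import gcd


def common_denominator(A):
    """Least common multiple of the denominators of the entries of A."""
    d = 1
    for row in A:
        for entry in row:
            den = Fraction(entry).denominator
            d = d * den // gcd(d, den)
    return d


def in_L(A, v):
    """Does v lie in L, i.e. is A v an integer vector?"""
    n = len(A)
    image = [sum(Fraction(A[i][j]) * v[j] for j in range(n)) for i in range(n)]
    return all(c.denominator == 1 for c in image)


def index_of_L(A):
    """Index of L in Z^n, computed as the size of the image of v -> (dA)v mod d."""
    n = len(A)
    d = common_denominator(A)
    M = [[int(Fraction(entry) * d) for entry in row] for row in A]
    images = set()
    for v in product(range(d), repeat=n):
        images.add(tuple(sum(M[i][j] * v[j] for j in range(n)) % d
                         for i in range(n)))
    return len(images)


# Hypothetical examples: a rational rotation, and the 1 x 1 matrix (3/2).
rotation = [[Fraction(3, 5), Fraction(-4, 5)],
            [Fraction(4, 5), Fraction(3, 5)]]
print(index_of_L(rotation))            # index of L in Z^2
print(index_of_L([[Fraction(3, 2)]]))  # here L = 2Z inside Z
```

For the \(1 \times 1\) matrix \(A = (3/2)\) one gets \(L = 2\mathbb {Z}\), and \(\psi _A\) sends \(2 \mapsto 3\), which is the commensuration underlying the relation \(tx^2t^{-1} = x^3\) of \(BS(2,3)\).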

We will also need to relate (abstract) commensurators of two groups, one of which is a finite-index subgroup of the other, and so we will use the following result. It is well-known, but we give a proof here for completeness.

Lemma 2.1

Let G be a group, and let \(H,H' \le G\) be subgroups such that \(H' \le H\) and \([H:H'] < \infty \). Then \({{\,\textrm{Comm}\,}}_G(H) = {{\,\textrm{Comm}\,}}_G(H')\), and there exists an isomorphism \(\Psi :{{\,\textrm{Comm}\,}}(H') \rightarrow {{\,\textrm{Comm}\,}}(H)\) such that \(\Phi = \Psi \circ \Phi '\), where \(\Phi :{{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H)\) and \(\Phi ':{{\,\textrm{Comm}\,}}_G(H') \rightarrow {{\,\textrm{Comm}\,}}(H')\) are the canonical maps.

Proof

Let \(g \in G\), and denote by \(K_\pm \) and \(K_\pm '\) the groups \(g^{\pm 1} H g^{\mp 1}\) and \(g^{\pm 1} H' g^{\mp 1}\), respectively; note that \([K_\pm : K_\pm '] = [H : H'] < \infty \). If we have \([H' : H' \cap K_\pm '] < \infty \), then

$$\begin{aligned} {[}H : H \cap K_\pm ] \le [H : H' \cap K_\pm '] = [H:H'] [H' : H' \cap K_\pm '] < \infty . \end{aligned}$$

Thus \({{\,\textrm{Comm}\,}}_G(H') \subseteq {{\,\textrm{Comm}\,}}_G(H)\). On the other hand, if \([H : H \cap K_\pm ] < \infty \) then

$$\begin{aligned} {[}H' : H' \cap K_\pm ']&\le [H : H' \cap K_\pm '] \\&= [H : H \cap K_\pm ] [H \cap K_\pm : H' \cap K_\pm ] [H' \cap K_\pm : H' \cap K_\pm '] \\&\le [H : H \cap K_\pm ] [H : H'] [K_\pm : K_\pm '] < \infty . \end{aligned}$$

Thus \({{\,\textrm{Comm}\,}}_G(H) \subseteq {{\,\textrm{Comm}\,}}_G(H')\), and so \({{\,\textrm{Comm}\,}}_G(H) = {{\,\textrm{Comm}\,}}_G(H')\).

Now any isomorphism \(\varphi ':A' \rightarrow B'\) between finite-index subgroups of \(H'\) is also an isomorphism between finite-index subgroups of H, and so represents a class \([\varphi ']_H \in {{\,\textrm{Comm}\,}}(H)\). Moreover, two such isomorphisms agree on a finite-index subgroup of \(H'\) if and only if they agree on a finite-index subgroup of H. It follows that the map \(\Psi :{{\,\textrm{Comm}\,}}(H') \rightarrow {{\,\textrm{Comm}\,}}(H)\), defined by \(\Psi \left( [\varphi ']_{H'} \right) = [\varphi ']_H\), is well-defined and injective; furthermore, this map is easily seen to be a group homomorphism. Finally, given any isomorphism \(\varphi :A \rightarrow B\) between finite-index subgroups of H we may consider the map

$$\begin{aligned} \varphi ' = \varphi |_{A \cap H' \cap \varphi ^{-1}(B \cap H')}:A \cap H' \cap \varphi ^{-1}(B \cap H') \rightarrow B \cap H' \cap \varphi (A \cap H'), \end{aligned}$$

so that \([\varphi ']_{H'} \in {{\,\textrm{Comm}\,}}(H')\) and \([\varphi ']_H = [\varphi ]_H \in {{\,\textrm{Comm}\,}}(H)\); thus \(\Psi \) is surjective.

By construction, both \(\Phi \) and \(\Psi \circ \Phi '\) send an element \(g \in {{\,\textrm{Comm}\,}}_G(H) = {{\,\textrm{Comm}\,}}_G(H')\) to the map represented by \(\varphi _g:A \rightarrow gAg^{-1}, h \mapsto ghg^{-1}\) for some finite-index subgroup A of H, implying that \(\Phi = \Psi \circ \Phi '\), as required. \(\square \)

2.2 Biautomatic groups

Here we briefly introduce biautomatic groups and those of their main properties that we will use in this paper. We refer the interested reader to [4] for a more comprehensive account.

We fix a group G with a finite generating set Y of G; we view Y as an abstract alphabet together with a fixed injective map \(Y \hookrightarrow G\). We will always assume that Y is symmetric (i.e. the image S of Y in G satisfies \(S = S^{-1}\)) and contains the identity (i.e. contains an element \(1 \in Y\) mapping to the identity \(1_G \in G\)). We denote by \(Y^*\) the free monoid on Y, i.e. the set of all words with letters in Y, forming a monoid under concatenation. Given a word \(U \in Y^*\), we denote by \(|U| \in \mathbb {Z}_{\ge 0}\) its length; furthermore, for any \(t \in \mathbb {Z}_{\ge 0}\) we set U(t) to be the prefix of U of length t if \(t \le |U|\), and we set \(U(t) = U\) for \(t > |U|\).

We have a monoid homomorphism \(Y^* \rightarrow G\) induced by the inclusion \(Y \hookrightarrow G\); we denote by \(\overline{U} \in G\) the image of \(U \in Y^*\) under this map, and we say that U represents \(\overline{U}\). Given an element \(g \in G\) we also denote by \(|g|_Y\) the distance between vertices 1 and g in the Cayley graph \({{\,\textrm{Cay}\,}}(G,Y)\); that is, \(|g|_Y\) is equal to the length of the shortest word in \(Y^*\) representing g, and so for any \(U \in Y^*\) we have \(\left| \overline{U} \right| _Y \le |U|\).

A notion that appears in the theory of biautomatic groups is that of a finite state automaton (FSA). Roughly, a (deterministic) FSA \(\mathfrak {A}\) over Y is a finite directed multigraph \(\Gamma \) with an assignment of a starting state \(v_0 \in V(\Gamma )\) and accept states \(\mathcal {A} \subseteq V(\Gamma )\), and with edges labelled by elements of Y in such a way that for each vertex \(v \in V(\Gamma )\) and each \(y \in Y\), there exists at most one edge starting at v and labelled by y; see [4, Definition 1.2.1] for a detailed definition. Given such an \(\mathfrak {A}\), we say a subset \(\mathcal {M} \subseteq Y^*\) is recognised by \(\mathfrak {A}\) if \(\mathcal {M}\) is the set of words that label directed paths in \(\Gamma \) starting at \(v_0\) and ending in \(\mathcal {A}\). A subset \(\mathcal {M} \subseteq Y^*\) is said to be a regular language if it is recognised by some FSA over Y.
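The description of a deterministic FSA above can be made concrete with a small sketch. The automaton below is a hypothetical example of our own: it recognises the words \(x^a y^b\) with \(a,b \in \mathbb {Z}\) over the alphabet \(\{x, X, y, Y\}\), where X and Y stand for the inverses of x and y. For brevity the identity letter is omitted from the alphabet, and the state names are arbitrary labels.

```python
def accepts(transitions, start, accept_states, word):
    """Follow the unique path labelled by `word` from `start`, if it exists,
    and report whether it ends in an accept state."""
    state = start
    for letter in word:
        if (state, letter) not in transitions:
            return False  # no edge labelled `letter` leaves `state`
        state = transitions[(state, letter)]
    return state in accept_states


# At most one edge per (state, letter) pair, as required of a deterministic FSA.
TRANSITIONS = {
    ("s", "x"): "x+", ("s", "X"): "x-", ("s", "y"): "y+", ("s", "Y"): "y-",
    ("x+", "x"): "x+", ("x+", "y"): "y+", ("x+", "Y"): "y-",
    ("x-", "X"): "x-", ("x-", "y"): "y+", ("x-", "Y"): "y-",
    ("y+", "y"): "y+",
    ("y-", "Y"): "y-",
}
START = "s"
ACCEPT = {"s", "x+", "x-", "y+", "y-"}

for word in ["", "xxY", "XXyyy", "xX", "yx"]:
    print(repr(word), accepts(TRANSITIONS, START, ACCEPT, word))
```

The words "xX" and "yx" are rejected because the corresponding paths do not exist in the graph, so the recognised language contains exactly one word for each element of \(\mathbb {Z}^2\).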

This allows us to define biautomatic groups, as follows.

Definition 2.2

Let G be a group, let Y be a finite symmetric generating set of G containing the identity, and let \(\mathcal {M} \subseteq Y^*\). We say that \((Y,\mathcal {M})\) is a biautomatic structure for G if

  (i) \(\mathcal {M}\) is a regular language; and

  (ii) \(\mathcal {M}\) satisfies the (two-sided) fellow traveller property: that is, there exists a constant \(\lambda \ge 0\) such that if \(U,V \in \mathcal {M}\) and \(x,y \in Y\) are such that \(\overline{U} = \overline{xVy}\), then \(\left| \overline{U(t)}^{-1} \overline{x} \overline{V(t)} \right| _Y \le \lambda \) for all \(t \in \mathbb {Z}_{\ge 0}\).

We say a biautomatic structure \((Y,\mathcal {M})\) for G is finite-to-one if for each \(g \in G\) there exist only finitely many \(U \in \mathcal {M}\) such that \(\overline{U} = g\). We say G is biautomatic if it has a biautomatic structure.

Condition (ii) in Definition 2.2 has a more geometric description. In particular, if \(U,V \in \mathcal {M}\) are such that \(\overline{U} = \overline{xVy}\) for some \(x,y \in Y\), then there exist paths \(P_U\) and \(P_V\) in \({{\,\textrm{Cay}\,}}(G,Y)\), labelled by U and V, respectively, whose starting points are at distance \(\le 1\) from each other, and likewise for their endpoints. For any \(t \in \mathbb {Z}_{\ge 0}\), we set \(P_U(t)\) to be the initial subpath of \(P_U\) of length t if \(t \le |U|\), and we set \(P_U(t) = P_U\) if \(t > |U|\); we define \(P_V(t)\) similarly. Condition (ii) then says that the endpoints of \(P_U(t)\) and \(P_V(t)\) are distance \(\le \lambda \) apart for any t.
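The quantity bounded in the fellow traveller property can be computed directly in a simple case. The sketch below is an illustration under our own assumptions: it takes \(G = \mathbb {Z}^2\) with the hypothetical alphabet \(\{x, X, y, Y, 1\}\) (X and Y standing for inverses, 1 for the identity), for which the word length \(|{-}|_Y\) is the \(\ell _1\)-norm, and it evaluates \(\max _t \left| \overline{U(t)}^{-1} \overline{x} \overline{V(t)} \right| _Y\) for given words U, V and a letter x.

```python
GENERATORS = {"x": (1, 0), "X": (-1, 0), "y": (0, 1), "Y": (0, -1), "1": (0, 0)}


def evaluate(word):
    """Element of Z^2 represented by a word over the alphabet above."""
    return (sum(GENERATORS[c][0] for c in word),
            sum(GENERATORS[c][1] for c in word))


def prefix(word, t):
    """U(t): the prefix of length t, or the whole word if t exceeds its length."""
    return word[:t]


def word_norm(v):
    """Word length in Z^2 with respect to the generators above (the l1-norm)."""
    return abs(v[0]) + abs(v[1])


def fellow_travel_distance(U, V, x):
    """Maximum over t of |U(t)^{-1} x V(t)|, written additively since Z^2 is abelian."""
    gx = evaluate(x)
    worst = 0
    # Beyond max(|U|, |V|) both prefixes are constant, so this range suffices.
    for t in range(max(len(U), len(V)) + 1):
        u = evaluate(prefix(U, t))
        v = evaluate(prefix(V, t))
        worst = max(worst, word_norm((gx[0] + v[0] - u[0], gx[1] + v[1] - u[1])))
    return worst


# U = x^3 y^2 and V = x^2 y^2 satisfy U = x V 1 in Z^2.
print(fellow_travel_distance("xxxyy", "xxyy", "x"))
# U = x^2 y^3 and V = x^2 y^2 satisfy U = 1 V y in Z^2.
print(fellow_travel_distance("xxyyy", "xxyy", "1"))
```

In both sample pairs the two paths stay within distance 1 of each other, as one would expect for normal forms of the shape \(x^a y^b\).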

It is known that every biautomatic group admits a finite-to-one biautomatic structure: see [4, Theorem 2.5.1]. For such a biautomatic structure, we have the following observation that will be crucial in our arguments.

Lemma 2.3

(see [4, Lemma 2.3.9]) Let \((Y,\mathcal {M})\) be a finite-to-one biautomatic structure on a group G. Then there exists a constant \(\kappa \ge 0\) such that if \(U,V \in \mathcal {M}\) and \(x,y \in Y\) are such that \(\overline{U} = \overline{xVy}\), then \(|U|-\kappa \le |V| \le |U|+\kappa \). \(\square \)

2.3 Quasiconvex subsets and subgroups

An important notion in biautomatic groups is that of quasiconvexity, defined as follows. We refer the interested reader to the paper [5] by S. M. Gersten and H. B. Short for more details.

Definition 2.4

Let \((Y,\mathcal {M})\) be a biautomatic structure for a group G. We say a subset \(S \subseteq G\) is \(\mathcal {M}\)-quasiconvex if there exists a constant \(\nu \ge 0\) such that any path in \({{\,\textrm{Cay}\,}}(G,Y)\) that starts and ends at vertices in S and is labelled by a word in \(\mathcal {M}\) belongs to the \(\nu \)-neighbourhood of S.

An important consequence of quasiconvexity is that if an \(\mathcal {M}\)-quasiconvex subset is a subgroup, then it is itself biautomatic. More precisely, we have the following result.

Theorem 2.5

(see [5, Theorem 3.1 and its proof]) Let \((Y,\mathcal {M})\) be a biautomatic structure on a group G, and let \(H \le G\) be an \(\mathcal {M}\)-quasiconvex subgroup. Then there exists a biautomatic structure \((X,\mathcal {L})\) for H and a constant \(\mu \ge 0\) such that for any \(V \in \mathcal {M}\) with \(\overline{V} \in H\), there exists \(U \in \mathcal {L}\) with \(\overline{U} = \overline{V}\), \(|U| = |V|\) and \(\left| \overline{U(t)}^{-1} \overline{V(t)} \right| _Y \le \mu \) for all \(t \in \mathbb {Z}_{\ge 0}\). Moreover, if \((Y,\mathcal {M})\) is finite-to-one then so is \((X,\mathcal {L})\). \(\square \)

We refer to the biautomatic structure \((X,\mathcal {L})\) given by Theorem 2.5 as the biautomatic structure associated to \((Y,\mathcal {M})\). It follows from Theorem 2.5 that the quasiconvexity relation between groups equipped with biautomatic structures is transitive in the sense of Lemma 2.6 below. This result is straightforward, but we give a proof for completeness.

Lemma 2.6

Let G be a group with a biautomatic structure \((Y,\mathcal {M})\), let \(H \le G\) be an \(\mathcal {M}\)-quasiconvex subgroup with the associated biautomatic structure \((X,\mathcal {L})\), and let \(K \le H\) be an \(\mathcal {L}\)-quasiconvex subgroup. Then K is \(\mathcal {M}\)-quasiconvex in G.

Proof

It follows by Theorem 2.5 that there exists \(\mu \ge 0\) such that for any \(V \in \mathcal {M}\) with \(\overline{V} \in H\), there exists \(U \in \mathcal {L}\) with \(\overline{U} = \overline{V}\) and \(\left| \overline{V(t)}^{-1} \overline{U(t)} \right| _Y \le \mu \) for all \(t \in \mathbb {Z}_{\ge 0}\). Now if in addition we have \(\overline{V} \in K\), then \(\overline{U} \in K\) and so, by Definition 2.4, for each \(t \in \mathbb {Z}_{\ge 0}\) there exists \(k_t \in K\) such that \(\left| \overline{U(t)}^{-1} k_t \right| _X \le \nu \), where \(\nu \ge 0\) is some universal constant. If we set \(\delta := \max \{ |\overline{x}|_Y \mid x \in X \}\), we then have

$$\begin{aligned} \left| \overline{V(t)}^{-1} k_t \right| _Y \le \left| \overline{V(t)}^{-1} \overline{U(t)} \right| _Y + \left| \overline{U(t)}^{-1} k_t \right| _Y \le \mu + \delta \nu , \end{aligned}$$

for all t, and so any path in \({{\,\textrm{Cay}\,}}(G,Y)\) represented by V whose endpoints belong to K is in the \((\mu +\delta \nu )\)-neighbourhood of K. Thus K is \(\mathcal {M}\)-quasiconvex in G, as required. \(\square \)

One of the main sources of \(\mathcal {M}\)-quasiconvex subgroups in a biautomatic group G are centralisers of finite subsets, as per the following result.

Proposition 2.7

([5, Proposition 4.3]) Let \((Y,\mathcal {M})\) be a biautomatic structure on a group G, and let \(S \subseteq G\) be a finite subset. Then \(C_G(S)\) is \(\mathcal {M}\)-quasiconvex. \(\square \)

2.4 Polyhedral functions

Finally, we introduce polyhedral functions, which we will use to approximate lengths of words in \(\mathcal {L}\), where \((X,\mathcal {L})\) is a biautomatic structure for \(\mathbb {Z}^n\).

Definition 2.8

Given a finite subset \(Z = \{ \textbf{z}_1,\ldots ,\textbf{z}_\alpha \} \subseteq \mathbb {R}^n\), a polyhedral cone over Z is the set \(C(Z) = \left\{ \sum _{j=1}^\alpha \mu _j \textbf{z}_j \,\big |\, \mu _1,\ldots ,\mu _\alpha \ge 0 \right\} \subseteq \mathbb {R}^n\). Given a polyhedral cone \(C = C(Z)\) and \(\textbf{y} \in \mathbb {R}^n\) such that \(\langle \textbf{z},\textbf{y} \rangle > 0\) for all \(\textbf{z} \in Z\) (equivalently, \(\langle \textbf{z},\textbf{y} \rangle > 0\) for all \(\textbf{z} \in C \setminus \{\textbf{0}\}\)), we define a function \(f_{C,\textbf{y}}:\mathbb {R}^n \rightarrow \mathbb {R}\) by

$$\begin{aligned} f_{C,\textbf{y}}(\textbf{v}) = {\left\{ \begin{array}{ll} \langle \textbf{v},\textbf{y} \rangle &{} \text {if } \textbf{v} \in C, \\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

We say \(f:\mathbb {R}^n \rightarrow \mathbb {R}\) is a polyhedral function if there exists a finite collection of functions \(f_{C_1,\textbf{y}_1},\ldots ,f_{C_m,\textbf{y}_m}:\mathbb {R}^n \rightarrow \mathbb {R}\) as above such that

  (i) \(\mathbb {R}^n = \bigcup _{j=1}^m C_j\);

  (ii) if \(\textbf{v} \in C_j \cap C_k\) then \(\langle \textbf{v},\textbf{y}_j \rangle = \langle \textbf{v},\textbf{y}_k \rangle \); and

  (iii) \(f(\textbf{v}) = \max \{ f_{C_j,\textbf{y}_j}(\textbf{v}) \mid 1 \le j \le m \}\) for all \(\textbf{v} \in \mathbb {R}^n\).

Note that if f is a polyhedral function then f is continuous and positively homogeneous: that is, \(f(\mu \textbf{v}) = \mu f(\textbf{v})\) for all \(\textbf{v} \in \mathbb {R}^n\) and \(\mu \ge 0\). Such functions have been studied before, for instance, in [9]: in their notation, a polyhedral function is precisely a function that belongs to \(\mathscr {P}_+(E_n)\) and is positive (that is, \(f(\textbf{v}) \ge 0\) for all \(\textbf{v}\), with equality if and only if \(\textbf{v} = \textbf{0}\)).

In this paper we will use the following well-known alternative characterisation of polyhedral cones. Here, a linear halfspace is a subset of \(\mathbb {R}^n\) of the form \(\{ \textbf{v} \in \mathbb {R}^n \mid \langle \textbf{v},\textbf{w} \rangle \ge 0 \}\) for some \(\textbf{w} \in \mathbb {R}^n \setminus \{\textbf{0}\}\).

Theorem 2.9

(J. Farkas, H. Minkowski and H. Weyl; see [7, Theorems 6  & 7]) Let \(C \subseteq \mathbb {R}^n\) be a subset. Then the following are equivalent:

  (i) C is a polyhedral cone over some finite subset \(Z \subseteq \mathbb {R}^n\);

  (ii) there exist linear halfspaces \(K_1,\ldots ,K_\beta \subseteq \mathbb {R}^n\) such that \(C = \bigcap _{j=1}^\beta K_j\). \(\square \)

Example 2.10

An example of a polyhedral function is depicted in Fig. 1, where the dotted line denotes \(f^{-1}(c)\) for some constant \(c > 0\). Here we set \(C_j = C\left( \{ \textbf{z}_{j,1},\textbf{z}_{j,2} \} \right) \) for \(1 \le j \le 6\), where

$$\begin{aligned} \textbf{z}_{1,1}&= \textstyle \left( \frac{1}{4},\frac{1}{2} \right) ,&\textbf{z}_{1,2}&= \textstyle \left( 0,\frac{1}{2} \right) ,&\textbf{y}_1&= (0,2), \\ \textbf{z}_{2,1} = \textbf{z}_{3,1}&= \textstyle \left( \frac{1}{4},\frac{1}{2} \right) ,&\textbf{z}_{2,2} = \textbf{z}_{3,2}&= \textstyle \left( \frac{1}{4},0 \right) ,&\textbf{y}_2 = \textbf{y}_3&= (4,0), \\ \textbf{z}_{4,1}&= \textstyle \left( \frac{1}{4},0 \right) ,&\textbf{z}_{4,2}&= \textstyle \left( 0,-\frac{1}{2} \right) ,&\textbf{y}_4&= (4,-2), \\ \textbf{z}_{5,1}&= \textstyle \left( -\frac{1}{2},0 \right) ,&\textbf{z}_{5,2}&= \textstyle \left( 0,-\frac{1}{2} \right) ,&\textbf{y}_5&= (-2,-2), \\ \textbf{z}_{6,1}&= \textstyle \left( 0,\frac{1}{2} \right) ,&\textbf{z}_{6,2}&= \textstyle \left( -\frac{1}{2},0 \right) ,&\textbf{y}_6&= (-2,2). \end{aligned}$$

It is easy to check that the conditions (i) and (ii) in Definition 2.8 are satisfied.
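The data of Example 2.10 can also be evaluated mechanically. The following sketch (with hypothetical helper names of our own) encodes the six cones and vectors \(\textbf{y}_j\) exactly as listed above, tests membership of a vector in a two-generator cone by Cramer's rule, and evaluates f via condition (iii) of Definition 2.8. It assumes the two generators of each cone are linearly independent, which holds for the data above; the final loop is only a spot check of condition (ii) on the cone generators, not a proof.

```python
from fractions import Fraction as F

# Data of Example 2.10: (z_{j,1}, z_{j,2}, y_j) for j = 1, ..., 6.
CONES = [
    ((F(1, 4), F(1, 2)), (F(0), F(1, 2)), (0, 2)),
    ((F(1, 4), F(1, 2)), (F(1, 4), F(0)), (4, 0)),
    ((F(1, 4), F(1, 2)), (F(1, 4), F(0)), (4, 0)),
    ((F(1, 4), F(0)), (F(0), F(-1, 2)), (4, -2)),
    ((F(-1, 2), F(0)), (F(0), F(-1, 2)), (-2, -2)),
    ((F(0), F(1, 2)), (F(-1, 2), F(0)), (-2, 2)),
]


def dot(v, w):
    """Standard inner product on R^2."""
    return v[0] * w[0] + v[1] * w[1]


def in_cone(v, z1, z2):
    """Is v = mu1*z1 + mu2*z2 with mu1, mu2 >= 0?  Assumes z1, z2 independent."""
    det = z1[0] * z2[1] - z1[1] * z2[0]
    mu1 = (v[0] * z2[1] - v[1] * z2[0]) / det
    mu2 = (z1[0] * v[1] - z1[1] * v[0]) / det
    return mu1 >= 0 and mu2 >= 0


def f(v):
    """The polyhedral function: maximum of the functions f_{C_j, y_j}."""
    v = (F(v[0]), F(v[1]))
    return max(dot(v, y) if in_cone(v, z1, z2) else 0 for z1, z2, y in CONES)


# Spot check of condition (ii): a generator lying in two cones gets one value.
for z1, z2, y in CONES:
    for z in (z1, z2):
        for w1, w2, yk in CONES:
            if in_cone(z, w1, w2):
                assert dot(z, y) == dot(z, yk)

print(f((1, 0)), f((0, 1)), f((1, -1)), f((-1, -1)))
```

Every generator \(\textbf{z}_{j,k}\) listed in the example satisfies \(f(\textbf{z}_{j,k}) = 1\), so these points all lie on the level set \(f^{-1}(1)\).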

Fig. 1. A representation of a polyhedral function \(f:\mathbb {R}^2 \rightarrow \mathbb {R}\). See Examples 2.10, 3.4 and 4.5 for details.

3 Geometry of polyhedral functions

Our first result on polyhedral functions says that the restriction of a polyhedral function to a linear subspace is also polyhedral.

Lemma 3.1

Let \(\theta :\mathbb {R}^n \rightarrow \mathbb {R}^N\) be a linear isometric embedding, and let \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) be a polyhedral function. Then \(f \circ \theta :\mathbb {R}^n \rightarrow \mathbb {R}\) is a polyhedral function.

Proof

Since any linear isometric embedding can be expressed as a composite of linear isometric embeddings \(\theta ':\mathbb {R}^{n'} \rightarrow \mathbb {R}^{n'+1}\), it is enough to consider the case \(N = n+1\). In particular, under this assumption there exists \(\textbf{u} \in \mathbb {R}^{n+1}\) with \(\langle \textbf{u},\textbf{u} \rangle = 1\) such that \(\theta (\mathbb {R}^n) = \{ \textbf{v} \in \mathbb {R}^{n+1} \mid \langle \textbf{v},\textbf{u} \rangle = 0 \}\).

We first show that if \(C \subseteq \mathbb {R}^{n+1}\) is a polyhedral cone then so is \(\theta ^{-1}(C) \subseteq \mathbb {R}^n\). Indeed, by Theorem 2.9, in that case we have \(C = \bigcap _{j=1}^\beta K_j\) for some linear halfspaces \(K_1,\ldots ,K_\beta \subseteq \mathbb {R}^{n+1}\); for \(1 \le j \le \beta \), let \(\textbf{w}_j \in \mathbb {R}^{n+1} \setminus \{\textbf{0}\}\) be such that \(K_j = \{ \textbf{v} \in \mathbb {R}^{n+1} \mid \langle \textbf{v},\textbf{w}_j \rangle \ge 0 \}\). We then have \(\textbf{w}_j - \langle \textbf{u},\textbf{w}_j \rangle \textbf{u} \in \theta (\mathbb {R}^n)\) and so we may set \(\textbf{w}_j' := \theta ^{-1}(\textbf{w}_j - \langle \textbf{u},\textbf{w}_j \rangle \textbf{u})\); moreover, we set \(K_j' := \{ \textbf{v}' \in \mathbb {R}^n \mid \langle \textbf{v}',\textbf{w}_j' \rangle \ge 0 \}\), so that either \(K_j' = \mathbb {R}^n\) or \(K_j' \subseteq \mathbb {R}^n\) is a linear halfspace. Given any \(\textbf{v}' \in \mathbb {R}^n\) we have \(\langle \theta (\textbf{v}'),\textbf{u} \rangle = 0\), and so

$$\begin{aligned} \textbf{v}' \in K_j'&\iff \langle \textbf{v}',\textbf{w}_j' \rangle \ge 0 \iff \big \langle \theta (\textbf{v}'), \textbf{w}_j-\langle \textbf{u},\textbf{w}_j\rangle \textbf{u} \big \rangle \ge 0 \iff \langle \theta (\textbf{v}'), \textbf{w}_j \rangle \ge 0 \\&\iff \theta (\textbf{v}') \in K_j, \end{aligned}$$

implying that \(K_j' = \theta ^{-1}(K_j)\). It follows that \(\theta ^{-1}(C) = \bigcap _{j=1}^\beta K_j'\) and so (by Theorem 2.9) \(\theta ^{-1}(C) \subseteq \mathbb {R}^n\) is a polyhedral cone, as claimed.

Now since \(f:\mathbb {R}^{n+1} \rightarrow \mathbb {R}\) is polyhedral, there exist functions \(f_{C_1,\textbf{y}_1},\ldots ,f_{C_m,\textbf{y}_m}:\mathbb {R}^{n+1} \rightarrow \mathbb {R}\) as in Definition 2.8. For \(1 \le j \le m\), we set \(\textbf{y}_j' := \theta ^{-1}(\textbf{y}_j - \langle \textbf{u},\textbf{y}_j \rangle \textbf{u})\) and \(C_j' := \theta ^{-1}(C_j)\). Given any \(\textbf{z}' \in C_j' \setminus \{\textbf{0}\}\) we have \(\theta (\textbf{z}') \in C_j \setminus \{\textbf{0}\}\) (since \(\theta \) is injective) and \(\langle \theta (\textbf{z}'),\textbf{u} \rangle = 0\), and so

$$\begin{aligned} \langle \textbf{z}',\textbf{y}_j' \rangle = \big \langle \theta (\textbf{z}'),\textbf{y}_j-\langle \textbf{u},\textbf{y}_j \rangle \textbf{u} \big \rangle = \langle \theta (\textbf{z}'),\textbf{y}_j \rangle > 0, \end{aligned}$$

implying that we may define a function \(f_{C_j',\textbf{y}_j'}:\mathbb {R}^n \rightarrow \mathbb {R}\) as in Definition 2.8.

We claim that \(f \circ \theta \) may be constructed from the functions \(f_{C_j',\textbf{y}_j'}\) as in Definition 2.8. Note first that we have

$$\begin{aligned} \mathbb {R}^n = \theta ^{-1}(\mathbb {R}^{n+1}) = \theta ^{-1}\Bigg ( \bigcup _{j=1}^m C_j \Bigg ) = \bigcup _{j=1}^m \theta ^{-1}(C_j) = \bigcup _{j=1}^m C_j', \end{aligned}$$

showing condition (i). Moreover, if \(\textbf{v}' \in C_j' \cap C_k'\) then we have \(\theta (\textbf{v}') \in C_j \cap C_k\) and so

$$\begin{aligned} \langle \textbf{v}',\textbf{y}_j' \rangle&= \big \langle \theta (\textbf{v}'),\textbf{y}_j-\langle \textbf{u},\textbf{y}_j \rangle \textbf{u} \big \rangle = \langle \theta (\textbf{v}'),\textbf{y}_j \rangle \\&= \langle \theta (\textbf{v}'),\textbf{y}_k \rangle = \big \langle \theta (\textbf{v}'),\textbf{y}_k-\langle \textbf{u},\textbf{y}_k \rangle \textbf{u} \big \rangle = \langle \textbf{v}',\textbf{y}_k' \rangle , \end{aligned}$$

showing condition (ii). Finally, for any \(\textbf{v}' \in \mathbb {R}^n\) we have \(f_{C_j,\textbf{y}_j} \circ \theta (\textbf{v}') = 0 = f_{C_j',\textbf{y}_j'}(\textbf{v}')\) if \(\textbf{v}' \notin C_j'\), and

$$\begin{aligned} f_{C_j,\textbf{y}_j} \circ \theta (\textbf{v}') = \langle \theta (\textbf{v}'),\textbf{y}_j \rangle = \big \langle \theta (\textbf{v}'), \textbf{y}_j-\langle \textbf{u},\textbf{y}_j \rangle \textbf{u} \big \rangle = \langle \textbf{v}',\textbf{y}_j' \rangle = f_{C_j',\textbf{y}_j'}(\textbf{v}') \end{aligned}$$

if \(\textbf{v}' \in C_j'\), implying that \(f_{C_j,\textbf{y}_j} \circ \theta = f_{C_j',\textbf{y}_j'}\). Thus \(f \circ \theta (\textbf{v}') = \max \{ f_{C_j',\textbf{y}_j'}(\textbf{v}') \mid 1 \le j \le m \}\) for all \(\textbf{v}' \in \mathbb {R}^n\), showing condition (iii). It follows that \(f \circ \theta \) is polyhedral, as required. \(\square \)

We now turn our attention to group actions that preserve the values of a polyhedral function. In particular, in Proposition 3.3 we show that a polyhedral function cannot be G-invariant for any infinite subgroup \(G \le GL_n(\mathbb {R})\). In order to prove this, we will use the following result.

Lemma 3.2

Let \(\{ \textbf{y}_j \mid j \in \mathcal {I} \}\) be a spanning set of \(\mathbb {R}^n\), and let \(A \in GL_n(\mathbb {R})\) be such that \(A(K_j) = K_j\), where \(K_j := \{ \textbf{v} \in \mathbb {R}^n \mid \langle \textbf{v},\textbf{y}_j \rangle = 1 \}\), for each \(j \in \mathcal {I}\). Then \(A = I_n\).

Proof

Since the \(\textbf{y}_j\) span \(\mathbb {R}^n\), some subset \(\{ \textbf{y}_{j_1},\ldots ,\textbf{y}_{j_n} \}\) of the \(\textbf{y}_j\) forms a basis for \(\mathbb {R}^n\). Now for each \(k \in \{ 1,\ldots ,n \}\), there exists a unique \(\textbf{z}_k \in \mathbb {R}^n\) such that \(\langle \textbf{z}_k,\textbf{y}_{j_k} \rangle = 1\) and \(\langle \textbf{z}_k,\textbf{y}_{j_\ell } \rangle = 0\) for all \(\ell \ne k\); moreover, it is easy to see that \(\{ \textbf{z}_1,\ldots ,\textbf{z}_n \}\) is a basis for \(\mathbb {R}^n\), namely the basis dual to \(\{ \textbf{y}_{j_1},\ldots ,\textbf{y}_{j_n} \}\).

Now let \(\textbf{z} = \sum _{k=1}^n \textbf{z}_k\). It is then easy to verify that \(\bigcap _{\ell =1}^n K_{j_\ell } = \{\textbf{z}\}\), and that \(\bigcap _{1 \le \ell \le n, \ell \ne k} K_{j_\ell } = \{ \textbf{z}+\lambda \textbf{z}_k \mid \lambda \in \mathbb {R}\}\) for all k. As \(A(K_{j_\ell }) = K_{j_\ell }\) for all \(\ell \), it thus follows that \(A\textbf{z} = \textbf{z}\) and \(\{ A\textbf{z}+\lambda A\textbf{z}_k \mid \lambda \in \mathbb {R}\} = \{ \textbf{z}+\lambda \textbf{z}_k \mid \lambda \in \mathbb {R}\}\) for all k. This implies that for each k, there exists \(\lambda _k \in \mathbb {R}\setminus \{0\}\) such that \(A\textbf{z}_k = \lambda _k \textbf{z}_k\). But then we have

$$\begin{aligned} \sum _{k=1}^n \textbf{z}_k = \textbf{z} = A\textbf{z} = \sum _{k=1}^n A\textbf{z}_k = \sum _{k=1}^n \lambda _k \textbf{z}_k. \end{aligned}$$

As the \(\textbf{z}_k\) are linearly independent, it follows that \(\lambda _k = 1\), and so \(A\textbf{z}_k = \textbf{z}_k\), for all k. As the \(\textbf{z}_k\) span \(\mathbb {R}^n\), it thus follows that \(A = I_n\), as required. \(\square \)
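The dual basis used in this proof can be computed explicitly in small cases. The following is a minimal sketch for \(n = 2\), using exact rational arithmetic; the vectors `y1`, `y2`, the matrix `A` and all function names are hypothetical choices made for illustration and do not appear in the paper. It checks that \(\textbf{z} = \textbf{z}_1 + \textbf{z}_2\) lies on both hyperplanes, and that a reflection fixing \(\textbf{z}\) but not rescaling \(\textbf{z}_1\) fails to preserve each \(K_j\) individually.

```python
from fractions import Fraction as F

def dot(u, v):
    """Standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def dual_basis_2d(y1, y2):
    """Return (z1, z2) with <z_k, y_l> equal to 1 if k == l and 0 otherwise."""
    det = y1[0] * y2[1] - y1[1] * y2[0]
    if det == 0:
        raise ValueError("y1 and y2 must be linearly independent")
    z1 = (F(y2[1], det), F(-y2[0], det))
    z2 = (F(-y1[1], det), F(y1[0], det))
    return z1, z2

def apply(A, v):
    """Apply a 2x2 matrix (given as a tuple of rows) to a vector."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

y1, y2 = (1, 1), (1, -1)             # hypothetical basis {y_1, y_2}
z1, z2 = dual_basis_2d(y1, y2)
z = (z1[0] + z2[0], z1[1] + z2[1])   # the unique point of K_1 and K_2

# Hypothetical matrix: a reflection that swaps K_1 and K_2.
# It fixes z but sends z1 to z2, so A(K_j) != K_j and Lemma 3.2 does not apply.
A = ((1, 0), (0, -1))
print(z, apply(A, z), apply(A, z1))
```

In this example \(A\) permutes the two hyperplanes without fixing either, which is consistent with the lemma: only a matrix preserving every \(K_j\) is forced to be the identity.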

Proposition 3.3

Let \(f:\mathbb {R}^n \rightarrow \mathbb {R}\) be a polyhedral function, and let \(G \le GL_n(\mathbb {R})\) be a subgroup. If \(f(\textbf{v}) = f(A\textbf{v})\) for all \(\textbf{v} \in \mathbb {R}^n\) and all \(A \in G\), then G is finite.

Proof

We use the notation of Definition 2.8. In what follows, an affine hyperplane is a subset of \(\mathbb {R}^n\) of the form \(\{ \textbf{v} \in \mathbb {R}^n \mid \langle \textbf{v},\textbf{y} \rangle = c \}\) for some \(\textbf{y} \in \mathbb {R}^n \setminus \{\textbf{0}\}\) and \(c \in \mathbb {R}\).

Let \(\mathcal {I} \subseteq \{1,\ldots ,m\}\) be the set of all j such that \(C_j \subseteq \mathbb {R}^n\) has non-empty interior (if \(C_j = C(Z_j)\), this is equivalent to saying that \(Z_j\) spans \(\mathbb {R}^n\)). Then, by condition (i) in Definition 2.8, we have \(\mathbb {R}^n \setminus \bigcup _{j \in \mathcal {I}} C_j \subseteq \bigcup _{j \notin \mathcal {I}} C_j\), where the left-hand side is open and the right-hand side is a finite union of convex subsets with empty interior; this implies that the left-hand side is actually empty, and so \(\mathbb {R}^n = \bigcup _{j \in \mathcal {I}} C_j\).

Therefore, conditions (ii) and (iii) imply that \(f^{-1}(1) \subseteq \bigcup _{j \in \mathcal {I}} K_j\), where we set \(K_j := \{ \textbf{v} \in \mathbb {R}^n \mid \langle \textbf{v},\textbf{y}_j \rangle = 1 \}\). Moreover, for each \(j \in \mathcal {I}\), the set \(K_j \cap f^{-1}(1)\) has non-empty interior in \(K_j\). The converse is also true: if K is an affine hyperplane not equal to any \(K_j\) for \(j \in \mathcal {I}\), then we have \(K \cap f^{-1}(1) \subseteq K \cap \bigcup _{j \in \mathcal {I}} K_j = \bigcup _{j \in \mathcal {I}} (K \cap K_j)\), and the latter is a finite union of \((n-2)\)-dimensional affine subspaces which therefore must have empty interior in K. Thus, the set \(\mathcal {K} := \{ K_j \mid j \in \mathcal {I} \}\) of affine hyperplanes consists of precisely those K for which \(K \cap f^{-1}(1)\) has non-empty interior in K.

Now the group G has a canonical action on the set of all affine hyperplanes in \(\mathbb {R}^n\). Moreover, if \(A \in G\) then by the assumptions \(A\left( f^{-1}(1) \right) = f^{-1}(1)\), implying that if \(K \cap f^{-1}(1)\) has non-empty interior in an affine hyperplane K, then \(A(K) \cap f^{-1}(1)\) has non-empty interior in A(K). Thus the set \(\mathcal {K}\) is G-invariant, and so we have a homomorphism \(\Phi :G \rightarrow {{\,\textrm{Sym}\,}}(\mathcal {K})\). As \(\mathcal {K}\) is finite, it is then enough to show that \(\Phi \) is injective.

We first claim that the set \(\{ \textbf{y}_j \mid j \in \mathcal {I} \}\) spans \(\mathbb {R}^n\). Suppose for contradiction that the set \(\{ \textbf{y}_j \mid j \in \mathcal {I} \}\) spans a proper subspace of \(\mathbb {R}^n\). Then there exists \(\textbf{u} \in \mathbb {R}^n \setminus \{\textbf{0}\}\) such that \(\langle \textbf{u},\textbf{y}_j \rangle = 0\) for all \(j \in \mathcal {I}\). But, as shown above, we have \(\mathbb {R}^n = \bigcup _{j \in \mathcal {I}} C_j\), and so in that case \(\textbf{u} \in C_k\) for some \(k \in \mathcal {I}\). This is impossible, as we have \(\langle \textbf{z},\textbf{y}_k \rangle > 0\) for all \(\textbf{z} \in C_k \setminus \{\textbf{0}\}\). Thus indeed \(\{ \textbf{y}_j \mid j \in \mathcal {I} \}\) spans \(\mathbb {R}^n\), as claimed.

Now let \(A \in \ker (\Phi )\), so that \(A(K_j) = K_j\) for all \(j \in \mathcal {I}\). It then follows from Lemma 3.2 that \(A = I_n\). Therefore, \(\Phi :G \rightarrow {{\,\textrm{Sym}\,}}(\mathcal {K})\) is injective and so G is finite, as required. \(\square \)
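The structure of this proof can be illustrated on a simple function. The sketch below takes f to be the \(\ell ^1\)-norm on \(\mathbb {R}^2\), which we assume fits Definition 2.8 with the four closed quadrants as cones and the vectors \((\pm 1,\pm 1)\) as the \(\textbf{y}_j\); this function, the list `Y` and all names in the code are illustrative assumptions rather than objects from the paper. The code checks that the eight signed permutation matrices preserve f on a sample grid, and that the induced action on the four hyperplanes \(K_j\) is faithful on these matrices, as the injectivity of \(\Phi \) predicts.

```python
from itertools import permutations, product

def f(v):
    """The l^1 norm on R^2, an assumed example of a polyhedral function."""
    return abs(v[0]) + abs(v[1])

# One vector y_j per quadrant, so that f(v) = <v, y_j> for v in that quadrant.
Y = [(1, 1), (-1, 1), (-1, -1), (1, -1)]

def apply(A, v):
    """Apply a 2x2 matrix (tuple of rows) to a vector."""
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

def signed_permutation_matrices():
    """Yield the 8 matrices with exactly one entry +1 or -1 in each row and column."""
    for perm in permutations(range(2)):
        for signs in product((1, -1), repeat=2):
            yield tuple(
                tuple(signs[i] if k == perm[i] else 0 for k in range(2))
                for i in range(2)
            )

grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
G = [A for A in signed_permutation_matrices()
     if all(f(apply(A, v)) == f(v) for v in grid)]

# For an orthogonal matrix A, the hyperplane {<v, y_j> = 1} is mapped to the
# hyperplane defined by A y_j, so A induces a permutation of the indices of Y.
induced = {A: tuple(Y.index(apply(A, y)) for y in Y) for A in G}

print(len(G), len(set(induced.values())))
```

The grid check only confirms invariance for these eight matrices; it does not by itself rule out other invariant matrices, which is what Proposition 3.3 handles in general.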

Example 3.4

Let f be the polyhedral function depicted in Fig. 1, and \(\theta :\mathbb {R}\rightarrow \mathbb {R}^2\) be an isometric embedding whose image is the diagonal \(\{ (v,v) \mid v \in \mathbb {R}\} \subset \mathbb {R}^2\). We may then check that \(f\circ \theta (v) = 2\sqrt{2} \left| v \right| \) for all \(v \in \mathbb {R}\), and so \(f \circ \theta \) is polyhedral, as per Lemma 3.1: in the notation of Definition 2.8 we could take

$$\begin{aligned} C_1&:= C(\{1\}) = [0,\infty ),&\textbf{y}_1&:= 2\sqrt{2}, \\ C_2&:= C(\{-1\}) = (-\infty ,0],&\textbf{y}_2&:= -2\sqrt{2} \end{aligned}$$

to define \(f \circ \theta \).

A straightforward calculation shows that for any non-identity matrix \(A \in GL_2(\mathbb {R})\) there exists \(\textbf{v} \in \mathbb {R}^2\) such that \(f(A\textbf{v}) \ne f(\textbf{v})\), and so if \(G \le GL_2(\mathbb {R})\) is such that \(f(\textbf{v}) = f(A\textbf{v})\) for all \(\textbf{v} \in \mathbb {R}^2\) and \(A \in G\), then G must be trivial. On the other hand, \(f \circ \theta (-v) = f \circ \theta (v)\) for all \(v \in \mathbb {R}\). Nevertheless, if \(G \le GL_1(\mathbb {R})\) is a subgroup such that \(f \circ \theta (v) = f \circ \theta (Av)\) for all \(v \in \mathbb {R}\) and \(A \in G\), then we must have \(G \le \left\{ \begin{pmatrix}1\end{pmatrix},\begin{pmatrix}-1\end{pmatrix} \right\} \), and so G is still finite, as per Proposition 3.3.
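The one-dimensional data above are easy to check numerically. The following sketch evaluates \(f \circ \theta \) from the two cones and the values \(\textbf{y}_1 = 2\sqrt{2}\), \(\textbf{y}_2 = -2\sqrt{2}\) given in this example; the function names are hypothetical, and floating-point comparison is used because of the factor \(\sqrt{2}\).

```python
import math

Y1 = 2 * math.sqrt(2)    # y_1, attached to the cone C_1 = [0, infinity)
Y2 = -2 * math.sqrt(2)   # y_2, attached to the cone C_2 = (-infinity, 0]

def f_cone(v, in_cone, y):
    """The function f_{C,y}: equal to v * y on the cone C and to 0 elsewhere."""
    return v * y if in_cone(v) else 0.0

def f_theta(v):
    """The composite f o theta, built as the maximum of the two cone functions."""
    return max(
        f_cone(v, lambda t: t >= 0, Y1),
        f_cone(v, lambda t: t <= 0, Y2),
    )

def is_invariant(a, samples=(-2.5, -1.0, 0.0, 0.5, 3.0)):
    """Check f o theta (a v) == f o theta (v) on a few sample points."""
    return all(math.isclose(f_theta(a * v), f_theta(v)) for v in samples)

print(f_theta(3.0), is_invariant(1), is_invariant(-1), is_invariant(2))
```

On these samples the scalars \(1\) and \(-1\) preserve \(f \circ \theta \) while \(2\) does not, in line with the conclusion \(G \le \{ (1),(-1) \}\).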

4 A polyhedral function associated to a biautomatic structure on \(\mathbb {Z}^N\)

In this section, we associate to any biautomatic structure \((X,\mathcal {L})\) on \(\mathbb {Z}^N\) a polyhedral function. Our aim is to do this in such a way that given an element \(\textbf{v} \in \mathbb {Z}^N\) represented by a word \(U \in \mathcal {L}\), the length |U| of U can be roughly approximated by \(f(\textbf{v})\): see Proposition 4.2.

We first need the following auxiliary result.

Lemma 4.1

Let \((X,\mathcal {L})\) be a finite-to-one biautomatic structure on \(\mathbb {Z}^N\), and suppose that there exist \(U_0,V_1,U_1,\ldots ,U_\alpha ,V_\alpha \in X^*\) such that \(U_0 \cdot V_1^* \cdot U_1 \cdot {} \cdots {} \cdot V_\alpha ^* \cdot U_\alpha \subseteq \mathcal {L}\). Then the set \(\{ \textbf{z}_1,\ldots ,\textbf{z}_\alpha \} \subseteq \mathbb {Q}^N\), where \(\textbf{z}_j = \overline{V_j} / |V_j|\), is linearly independent. [For the avoidance of doubt, we do allow having \(\textbf{z}_j = \textbf{z}_{j'}\) for \(j \ne j'\)—that is, we claim that the \(\textbf{z}_j\) become linearly independent after deleting repetitions.]

Proof

Suppose for contradiction that the set \(\{ \textbf{z}_1,\ldots ,\textbf{z}_\alpha \}\) is not linearly independent. Then there exist \(\mu _1,\ldots ,\mu _\alpha \in \mathbb {R}\), not all zero, such that if \(\textbf{z}_j = \textbf{z}_{j'}\) for some \(1 \le j < j' \le \alpha \) then \(\mu _{j'} = 0\), and such that

$$\begin{aligned} \sum _{j=1}^\alpha \mu _j \textbf{z}_j = 0. \end{aligned}$$

Since \(\textbf{z}_j \in \mathbb {Q}^N\) we can also choose \(\mu _j \in \mathbb {Q}\) for all j. Without loss of generality, assume also that \(\mu _j > 0\) for some j, and (by rescaling the \(\mu _j\) if necessary) that \(\frac{\mu _j}{|V_j|} \in \mathbb {Z}\) for all j. We now consider two cases—depending on whether or not \(\mu _j < 0\) for some j—obtaining a contradiction in each.

Suppose first that \(\mu _j \ge 0\) for all j. It then follows that for each \(\beta \in \mathbb {Z}_{\ge 0}\), the word

$$\begin{aligned} U_0 V_1^{\beta \mu _1/|V_1|} U_1 \cdots V_\alpha ^{\beta \mu _\alpha /|V_\alpha |} U_\alpha \in \mathcal {L} \end{aligned}$$

represents the element \(\sum _{j=0}^\alpha \overline{U_j} \in \mathbb {Z}^N\). As \(\mu _j > 0\) for some j, these words are pairwise distinct, which gives infinitely many words in \(\mathcal {L}\) representing a single element of \(\mathbb {Z}^N\), contradicting the fact that \((X,\mathcal {L})\) is finite-to-one.

Suppose now that, on the contrary, \(\mu _j < 0\) for some j. Let \(\mathcal {I}_+ = \{ j \mid \mu _j > 0 \}\) and \(\mathcal {I}_- = \{ j \mid \mu _j < 0 \}\); by the assumptions, both \(\mathcal {I}_+\) and \(\mathcal {I}_-\) are non-empty. Then for each \(\beta \in \mathbb {Z}_{\ge 0}\), the words

$$\begin{aligned} W_\beta ^+&:= U_0 V_1^{\beta \mu _1^+/|V_1|} U_1 \cdots V_\alpha ^{\beta \mu _\alpha ^+/|V_\alpha |} U_\alpha \in \mathcal {L} \end{aligned}$$

and

$$\begin{aligned} W_\beta ^-&:= U_0 V_1^{\beta \mu _1^-/|V_1|} U_1 \cdots V_\alpha ^{\beta \mu _\alpha ^-/|V_\alpha |} U_\alpha \in \mathcal {L}, \end{aligned}$$

where, for \(\varepsilon \in \{\pm \}\), \(\mu _j^\varepsilon = |\mu _j|\) if \(j \in \mathcal {I}_\varepsilon \) and \(\mu _j^\varepsilon = 0\) otherwise, represent the same element of \(\mathbb {Z}^N\). This means that \(W_\beta ^+\) and \(W_\beta ^-\) satisfy the fellow traveller property (see Definition 2.2) for some constant \(\lambda \ge 0\) independent of \(\beta \).

Now let \(j_+ = \min \mathcal {I}_+\) and \(j_- = \min \mathcal {I}_-\). Then the prefixes of \(W_\beta ^+\) and \(W_\beta ^-\) of length \(t = t(\beta ) = \beta \mu + \sum _{j=0}^\alpha |U_j|\), where \(\mu = \min \{\mu _{j_+},-\mu _{j_-}\} \), are

$$\begin{aligned} W_\beta ^+(t)&:= U_0 \cdots U_{j_+-1} V_{j_+}^{\left\lfloor \beta \mu /|V_{j_+}| \right\rfloor } Y_\beta ^+ \end{aligned}$$

and

$$\begin{aligned} W_\beta ^-(t)&:= U_0 \cdots U_{j_--1} V_{j_-}^{\left\lfloor \beta \mu /|V_{j_-}| \right\rfloor } Y_\beta ^- \end{aligned}$$

respectively, where \(Y_\beta ^+\) and \(Y_\beta ^-\) are some words of length \(\le |V_{j_+}| + |V_{j_-}| + \sum _{j=0}^\alpha |U_j|\). But this means that \(\overline{W_\beta ^+(t)} - \overline{W_\beta ^-(t)}\) is a bounded distance away from the point \(\frac{\beta \mu }{|V_{j_+}|}\overline{V_{j_+}} - \frac{\beta \mu }{|V_{j_-}|}\overline{V_{j_-}} = \beta \mu (\textbf{z}_{j_+}-\textbf{z}_{j_-}) \in \mathbb {Q}^N\) with respect to any fixed norm on \(\mathbb {Q}^N\). As \(\mu > 0\) and \(\textbf{z}_{j_+} \ne \textbf{z}_{j_-}\) by assumption, it follows that \(\overline{W_\beta ^+(t)} - \overline{W_\beta ^-(t)}\) is arbitrarily far from the origin for large \(\beta \), contradicting the fellow traveller property. \(\square \)
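The first case of this proof can be seen in the simplest possible setting. The sketch below is a toy illustration only: it considers the hypothetical starred set \(x^* (x^{-1})^*\) over a generating set of \(\mathbb {Z}\), for which \(\textbf{z}_1 = 1\) and \(\textbf{z}_2 = -1\) satisfy \(\textbf{z}_1 + \textbf{z}_2 = 0\) with non-negative coefficients. The encoding of \(x^{-1}\) as the letter `X` and the function name are assumptions made for the example.

```python
def element(word):
    """Image in Z of a word over {'x', 'X'}, where 'X' stands for x^{-1}."""
    return word.count("x") - word.count("X")

# Words x^beta (x^{-1})^beta, taking mu_1 = mu_2 = 1 and |V_1| = |V_2| = 1.
words = ["x" * beta + "X" * beta for beta in range(6)]

# All of these words are distinct, yet they represent the same element of Z,
# so no language containing x^* (x^{-1})^* can be finite-to-one.
print(sorted({element(w) for w in words}), len(set(words)))
```

This is exactly the obstruction used above: a non-trivial relation among the \(\textbf{z}_j\) with non-negative coefficients produces infinitely many words for one group element.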

Proposition 4.2

Let \((X,\mathcal {L})\) be a finite-to-one biautomatic structure on \(\mathbb {Z}^N\). Then there exist a polyhedral function \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) and a constant \(\xi \ge 0\) such that

$$\begin{aligned} f(\overline{U}) - \xi \le \left| U \right| \le f(\overline{U}) + \xi \end{aligned}$$

for all \(U \in \mathcal {L}\).

Proof

As \((X,\mathcal {L})\) is finite-to-one and as \(\mathbb {Z}^N\) has polynomial growth, it follows that \(\mathcal {L}\) has polynomial growth as well—that is, there exists a polynomial g(x) such that for each \(n \ge 0\) there are at most g(n) words \(U \in \mathcal {L}\) with \(|U| \le n\). It follows from [4, Proposition 1.3.8] that \(\mathcal {L}\) is simply starred: that is, there exist integers \(\alpha _1,\ldots ,\alpha _m \in \mathbb {Z}_{\ge 0}\) and words \(U_{j,0},V_{j,1},U_{j,1},\ldots ,V_{j,\alpha _j},U_{j,\alpha _j} \in X^*\) (for each \(j \in \{1,\ldots ,m\}\)) such that

$$\begin{aligned} \mathcal {L} = \bigcup _{j=1}^m U_{j,0} \cdot (V_{j,1})^* \cdot U_{j,1} \cdot {} \cdots {} \cdot (V_{j,\alpha _j})^* \cdot U_{j,\alpha _j}. \end{aligned} \tag{1}$$

By Lemma 4.1, for each j the set \(\{ \textbf{z}_{j,1},\ldots ,\textbf{z}_{j,\alpha _j} \} \subseteq \mathbb {R}^N\), where \(\textbf{z}_{j,k} = \frac{\overline{V_{j,k}}}{|V_{j,k}|}\), is linearly independent. Thus, if \(H_{j,k} \subseteq \mathbb {R}^N\) is the affine hyperplane \(\{ \textbf{v} \in \mathbb {R}^N \mid \langle \textbf{z}_{j,k},\textbf{v} \rangle = 1 \}\), then for \(1 \le j \le m\) the intersection \(\bigcap _{k=1}^{\alpha _j} H_{j,k}\) is non-empty, and so contains a point \(\textbf{y}_j \in \bigcap _{k=1}^{\alpha _j} H_{j,k}\). For each \(j \in \{ 1,\ldots ,m \}\), let \(C_j \subseteq \mathbb {R}^N\) be the polyhedral cone over \(\{ \textbf{z}_{j,1},\ldots ,\textbf{z}_{j,\alpha _j} \}\); notice that we have \(\langle \textbf{z}_{j,k}, \textbf{y}_j \rangle = 1 > 0\) for all k by construction, and so we may define a function \(f_{C_j,\textbf{y}_j}:\mathbb {R}^N \rightarrow \mathbb {R}\) as in Definition 2.8. We then define \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) by setting

$$\begin{aligned} f(\textbf{v}) := \max \{ f_{C_j,\textbf{y}_j}(\textbf{v}) \mid 1 \le j \le m \}. \end{aligned}$$

We claim the following.

Lemma 4.3

f is a polyhedral function; in particular, in the above notation, conditions (i), (ii) and (iii) in Definition 2.8 are satisfied.

We postpone the proof of Lemma 4.3 until later, and finish the proof of Proposition 4.2 first.

We now define a few constants, as follows. We set \(\delta := \max \left\{ \sum _{k=0}^{\alpha _j} |U_{j,k}| \,\big |\, 1 \le j \le m \right\} \), \(\zeta := \max \{ \Vert \textbf{y}_j\Vert \mid 1 \le j \le m \}\), and \(\eta := \max \{ \Vert \overline{x} \Vert \mid x \in X \}\). We then set

$$\begin{aligned} \xi := \zeta \eta \delta + \delta . \end{aligned}$$

Notice that if \(\textbf{v},\textbf{v}' \in C_j\) for some j then condition (ii) in Definition 2.8 implies that \(f(\textbf{v}) = \langle \textbf{v},\textbf{y}_j \rangle \) and \(f(\textbf{v}') = \langle \textbf{v}',\textbf{y}_j \rangle \), and therefore

$$\begin{aligned} |f(\textbf{v})-f(\textbf{v}')| = |\langle \textbf{v}-\textbf{v}',\textbf{y}_j \rangle | \le \left\| \textbf{v}-\textbf{v}' \right\| \left\| \textbf{y}_j \right\| \le \zeta \left\| \textbf{v}-\textbf{v}' \right\| ; \end{aligned}$$

this implies, by considering values of f at some intermediate points on the geodesic connecting \(\textbf{v}\) and \(\textbf{v}'\), that in fact \(|f(\textbf{v})-f(\textbf{v}')| \le \zeta \left\| \textbf{v}-\textbf{v}' \right\| \) for any \(\textbf{v},\textbf{v}' \in \mathbb {R}^N\). In particular, it follows that if \(\textbf{v},\textbf{v}' \in \mathbb {Z}^N\) then \(|f(\textbf{v})-f(\textbf{v}')| \le \zeta \eta |\textbf{v}-\textbf{v}'|_X\).

Now let \(U \in \mathcal {L}\), so that by (1) we have \(U = U_{j,0} V_{j,1}^{\beta _1} U_{j,1} \cdots V_{j,\alpha _j}^{\beta _{\alpha _j}} U_{j,\alpha _j}\) for some j and some \(\beta _1,\ldots ,\beta _{\alpha _j} \in \mathbb {Z}_{\ge 0}\). We set \(\textbf{v} := \sum _{k=1}^{\alpha _j} \beta _k \overline{V_{j,k}}\), so that we have \(\overline{U}-\textbf{v} = \sum _{k=0}^{\alpha _j} \overline{U_{j,k}}\), implying that

$$\begin{aligned} \left| f(\overline{U}) - f(\textbf{v}) \right| \le \zeta \eta \left| \sum _{k=0}^{\alpha _j} \overline{U_{j,k}} \right| _X \le \zeta \eta \sum _{k=0}^{\alpha _j} |U_{j,k}| \le \zeta \eta \delta . \end{aligned}$$

On the other hand, we have \(\textbf{v} = \sum _{k=1}^{\alpha _j} \beta _k|V_{j,k}| \textbf{z}_{j,k}\), and we can compute that

$$\begin{aligned} f_{C_j,\textbf{y}_j}(\textbf{v}) = \langle \textbf{v},\textbf{y}_j \rangle = \sum _{k=1}^{\alpha _j} \beta _k|V_{j,k}| \langle \textbf{z}_{j,k},\textbf{y}_j \rangle = \sum _{k=1}^{\alpha _j} \beta _k|V_{j,k}|. \end{aligned}$$

Moreover, it follows from condition (ii) in Definition 2.8 that we have \(f(\textbf{v}) = f_{C_j,\textbf{y}_j}(\textbf{v})\), and therefore

$$\begin{aligned} |U| = \sum _{k=0}^{\alpha _j} |U_{j,k}| + \sum _{k=1}^{\alpha _j} \beta _k |V_{j,k}| = \sum _{k=0}^{\alpha _j} |U_{j,k}| + f(\textbf{v}). \end{aligned}$$

We thus have

$$\begin{aligned} \left| |U| - f(\overline{U}) \right| \le \big | |U| - f(\textbf{v}) \big | + \left| f(\textbf{v}) - f(\overline{U}) \right| \le \left| \sum _{k=0}^{\alpha _j} |U_{j,k}| \right| + \zeta \eta \delta \le \delta + \zeta \eta \delta = \xi , \end{aligned}$$

as required. \(\square \)

We now prove Lemma 4.3, which was stated in the proof of Proposition 4.2.

Proof of Lemma 4.3

The condition (iii) follows from the construction—so we only need to check (i) and (ii). In what follows, we set \(\delta := \max \left\{ \sum _{k=0}^{\alpha _j} |U_{j,k}| \,\big |\, 1 \le j \le m \right\} \). As \((X,\mathcal {L})\) is a finite-to-one biautomatic structure, by Lemma 2.3 there exists \(\kappa \ge 0\) such that if \(U,V \in \mathcal {L}\) are such that \(\left| \overline{U}-\overline{V} \right| _X \le 1\) then \(|U|-\kappa \le |V| \le |U|+\kappa \).

In order to show (i), suppose for contradiction that \(D := \mathbb {R}^N \setminus \bigcup _{j=1}^m C_j\) is non-empty. Thus, as \(D \subseteq \mathbb {R}^N\) is open and \(\mathbb {Q}^N \subseteq \mathbb {R}^N\) is dense, we may pick \(\textbf{v} \in D \cap \mathbb {Q}^N\). Since D is invariant under multiplication by \(\mu \) for any \(\mu > 0\), we may furthermore assume that \(\textbf{v} \in \mathbb {Z}^N\), and that if \(\textbf{w} \in \mathbb {Z}^N\) but \(\textbf{w} \notin D\) then \(|\textbf{v}-\textbf{w}|_X > \delta \).

Now let \(U \in \mathcal {L}\) be a word with \(\overline{U} = \textbf{v}\). By (1), we have \(U = U_{j,0} V_{j,1}^{\beta _1} U_{j,1} \cdots V_{j,\alpha _j}^{\beta _{\alpha _j}} U_{j,\alpha _j}\) for some j and some \(\beta _1,\ldots ,\beta _{\alpha _j} \in \mathbb {Z}_{\ge 0}\). However, we then have \(\textbf{w} := \sum _{k=1}^{\alpha _j} \beta _k \overline{V_{j,k}} = \sum _{k=1}^{\alpha _j} \beta _k |V_{j,k}| \textbf{z}_{j,k} \in C_j\), but \(|\textbf{v}-\textbf{w}|_X \le \sum _{k=0}^{\alpha _j} |U_{j,k}| \le \delta \), contradicting the choice of \(\textbf{v}\). Thus \(\{ C_j \mid 1 \le j \le m \}\) must cover \(\mathbb {R}^N\), which shows (i).

In order to show (ii), let \(j,k \in \{ 1,\ldots ,m \}\). Since \(\textbf{z}_{j,\ell } \in \mathbb {Q}^N\) and \(\textbf{z}_{k,\ell } \in \mathbb {Q}^N\) for all \(\ell \), we may express \(C_j \cap C_k\) as the set of solutions of a system of linear inequalities with rational coefficients (see Theorem 2.9). In particular, it follows that any non-empty open subset of \(C_j \cap C_k\) contains a point in \(\mathbb {Q}^N\), and so \(C_j \cap C_k\) is the closure of \(C_j \cap C_k \cap \mathbb {Q}^N\) in \(\mathbb {R}^N\). As the functions \(\langle {-}, \textbf{y}_j \rangle \) and \(\langle {-}, \textbf{y}_k \rangle \) are continuous, it is thus enough to verify (ii) when \(\textbf{v} \in \mathbb {Q}^N\).

Thus, let \(\textbf{v} \in C_j \cap C_k \cap \mathbb {Q}^N\). Since \(C_j\) and \(C_k\) are invariant under multiplication by any \(\mu > 0\), and since the functions \(\langle {-}, \textbf{y}_j \rangle \) and \(\langle {-}, \textbf{y}_k \rangle \) are linear, we may furthermore assume (after multiplying \(\textbf{v}\) by a positive integer if necessary) that \(\textbf{v} = \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } \textbf{z}_{j,\ell } = \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell } \textbf{z}_{k,\ell }\) with \(\mu _{j,\ell }/|V_{j,\ell }| \in \mathbb {Z}_{\ge 0}\) and \(\mu _{k,\ell }/|V_{k,\ell }| \in \mathbb {Z}_{\ge 0}\) for all \(\ell \). For any \(\beta \in \mathbb {Z}_{\ge 0}\), we define the words

$$\begin{aligned} W_{\beta ,j}&= U_{j,0} V_{j,1}^{\beta \mu _{j,1}/|V_{j,1}|} U_{j,1} \cdots V_{j,\alpha _j}^{\beta \mu _{j,\alpha _j}/|V_{j,\alpha _j}|} U_{j,\alpha _j} \in \mathcal {L} \end{aligned}$$

and

$$\begin{aligned} W_{\beta ,k}&= U_{k,0} V_{k,1}^{\beta \mu _{k,1}/|V_{k,1}|} U_{k,1} \cdots V_{k,\alpha _k}^{\beta \mu _{k,\alpha _k}/|V_{k,\alpha _k}|} U_{k,\alpha _k} \in \mathcal {L}. \end{aligned}$$

We then have

$$\begin{aligned} \overline{W_{\beta ,j}} = \sum _{\ell =0}^{\alpha _j} \overline{U_{j,\ell }} + \sum _{\ell =1}^{\alpha _j} \frac{\beta \mu _{j,\ell }}{|V_{j,\ell }|} \overline{V_{j,\ell }} = \sum _{\ell =0}^{\alpha _j} \overline{U_{j,\ell }} + \beta \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } \textbf{z}_{j,\ell } = \sum _{\ell =0}^{\alpha _j} \overline{U_{j,\ell }} + \beta \textbf{v}, \end{aligned}$$

and similarly \(\overline{W_{\beta ,k}} = \sum _{\ell =0}^{\alpha _k} \overline{U_{k,\ell }} + \beta \textbf{v}\). It follows that

$$\begin{aligned} \left| \overline{W_{\beta ,j}} - \overline{W_{\beta ,k}} \right| _X \le \sum _{\ell =0}^{\alpha _j} |U_{j,\ell }| + \sum _{\ell =0}^{\alpha _k} |U_{k,\ell }| \le 2\delta , \end{aligned}$$

and so \(\Big | |W_{\beta ,j}| - |W_{\beta ,k}| \Big | \le 2\delta \kappa \).

On the other hand, we may compute that \(|W_{\beta ,j}| = \sum _{\ell =0}^{\alpha _j} |U_{j,\ell }| + \beta \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell }\) and \(|W_{\beta ,k}| = \sum _{\ell =0}^{\alpha _k} |U_{k,\ell }| + \beta \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell }\), implying that

$$\begin{aligned} \beta \left| \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell } - \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } \right| \le \sum _{\ell =0}^{\alpha _j} |U_{j,\ell }| + \sum _{\ell =0}^{\alpha _k} |U_{k,\ell }| + \Big | |W_{\beta ,j}| - |W_{\beta ,k}| \Big | \le 2\delta +2\delta \kappa ; \end{aligned}$$

as \(\beta \) can be chosen to be arbitrarily large, it follows that \(\sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } = \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell }\). Finally, we get

$$\begin{aligned} \langle \textbf{v},\textbf{y}_j \rangle = \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } \langle \textbf{z}_{j,\ell },\textbf{y}_j \rangle = \sum _{\ell =1}^{\alpha _j} \mu _{j,\ell } = \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell } = \sum _{\ell =1}^{\alpha _k} \mu _{k,\ell } \langle \textbf{z}_{k,\ell },\textbf{y}_k \rangle = \langle \textbf{v},\textbf{y}_k \rangle . \end{aligned}$$

This proves (ii). \(\square \)

Remark 4.4

Our construction is related to the Neumann–Shapiro triangulation of \(\mathbb {S}^{N-1}\) associated to a biautomatic structure on \(\mathbb {Z}^N\) [12]. Namely, let \(C_1,\ldots ,C_m \subseteq \mathbb {R}^N\) be the polyhedral cones constructed in the proof of Proposition 4.2. Some subset of these cones—which we get after discarding cones contained in either a proper subspace of \(\mathbb {R}^N\) or another cone—is precisely the set of \((N-1)\)-simplices in the relevant Neumann–Shapiro triangulation of \(\mathbb {S}^{N-1}\). We may furthermore order vertices in each of these polyhedral cones, with the ordering induced by the order \(\textbf{z}_{j,1} \prec \cdots \prec \textbf{z}_{j,\alpha _j}\) on \(C_j\), thus recovering the complete structure of the triangulations exhibited in [12].

However, we amend this triangulation by constructing a polyhedral function, associating to \((X,\mathcal {L})\) a geometric rather than combinatorial structure. This allows easier treatment of arbitrary subgroups of \(\mathbb {Z}^N\). In particular, even though a triangulation of \(\mathbb {S}^{N-1}\) does not necessarily induce a triangulation on an arbitrary equatorial subsphere \(\mathbb {S}^{n-1} \subset \mathbb {S}^{N-1}\) for \(n < N\), composing a polyhedral function with an arbitrary isometric linear inclusion \(\mathbb {R}^n \hookrightarrow \mathbb {R}^N\) still yields a polyhedral function by Lemma 3.1.

Example 4.5

Let \(X = \{ \varepsilon ,x,y,x^{-1},y^{-1} \}\) be the generating set of \(\mathbb {Z}^2\) such that x, y and \(\varepsilon \) map to (1, 0), (0, 1) and (0, 0), respectively. Define the language \(\mathcal {L}\) as follows:

$$\begin{aligned} \mathcal {L} :=&\left( \varepsilon xy^2\right) ^* \left( \varepsilon y\right) ^* \cup \left( \varepsilon xy^2\right) ^* \left( \varepsilon ^3 x\right) ^* \cup \left( \varepsilon xy^2\right) ^* y^{-1} \left( \varepsilon ^3 x\right) ^* \\&\cup \left( \varepsilon ^3 x\right) ^* \left( \varepsilon y^{-1}\right) ^* \cup \left( \varepsilon x^{-1}\right) ^* \left( \varepsilon y^{-1}\right) ^* \cup \left( \varepsilon y\right) ^* \left( \varepsilon x^{-1}\right) ^*. \end{aligned}$$

One may then check that \((X,\mathcal {L})\) is indeed a finite-to-one biautomatic structure for \(\mathbb {Z}^2\). Moreover, in this case the polyhedral function constructed in the proof of Proposition 4.2 is precisely the function f depicted in Fig. 1, and one may check that the values of \(\textbf{y}_j\) and \(\textbf{z}_{j,k}\) indicated in Example 2.10 are consistent with the notation used in the proof of Proposition 4.2 for the language \(\mathcal {L}\). The thin black lines in Fig. 1 represent the paths starting at (0, 0) and labelled by words in \(\mathcal {L}\).

Proposition 4.2 then implies that given any \(n \in \mathbb {Z}_{\ge 0}\), if \(S_n \subset \mathbb {Z}^2\) is the set of elements represented by words in \(\mathcal {L}\) of length n, then \(S_n\) is bounded distance away from an appropriate rescaling of the dotted line in Fig. 1, for some bound that is independent of n.
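The construction in the proof of Proposition 4.2 can be carried out by hand for this language, and the sketch below does so with exact rational arithmetic. It assumes only the data of this example: each starred word V is recorded by its image in \(\mathbb {Z}^2\) and its length, and each piece \(V_1^* U V_2^*\) by its two starred words together with the image and length of the middle word U. The encoding and all names (`PIECES`, `y_of`, `in_cone`, `deviations`) are hypothetical. The code solves \(\langle \textbf{z}_{j,1},\textbf{y}_j \rangle = \langle \textbf{z}_{j,2},\textbf{y}_j \rangle = 1\) for each piece, builds f as the maximum of the cone functions, and compares \(|U|\) with \(f(\overline{U})\) on a range of words.

```python
from fractions import Fraction as F

# (image in Z^2, word length) of each starred word V in Example 4.5
A = ((1, 2), 4)     # epsilon x y^2
B = ((0, 1), 2)     # epsilon y
C = ((1, 0), 4)     # epsilon^3 x
D = ((0, -1), 2)    # epsilon y^{-1}
E = ((-1, 0), 2)    # epsilon x^{-1}

# Each piece V_1^* U V_2^* is stored as (V_1, V_2, image of U, |U|).
PIECES = [
    (A, B, (0, 0), 0),
    (A, C, (0, 0), 0),
    (A, C, (0, -1), 1),   # the piece with middle word y^{-1}
    (C, D, (0, 0), 0),
    (E, D, (0, 0), 0),
    (B, E, (0, 0), 0),
]

def z_of(V):
    """The vector z = (image of V) / |V|."""
    (a, b), length = V
    return (F(a, length), F(b, length))

def det(p, q):
    return p[0] * q[1] - p[1] * q[0]

def y_of(z1, z2):
    """Solve <z1, y> = <z2, y> = 1 by Cramer's rule."""
    d = det(z1, z2)
    return ((z2[1] - z1[1]) / d, (z1[0] - z2[0]) / d)

def in_cone(v, z1, z2):
    """Is v a non-negative linear combination of z1 and z2?"""
    d = det(z1, z2)
    return det(v, z2) / d >= 0 and det(z1, v) / d >= 0

CONES = [(z_of(V1), z_of(V2)) for V1, V2, _, _ in PIECES]
YS = [y_of(z1, z2) for z1, z2 in CONES]

def f(v):
    """Maximum over j of the cone functions (<v, y_j> on C_j, and 0 elsewhere)."""
    return max(
        (v[0] * y[0] + v[1] * y[1]) if in_cone(v, z1, z2) else 0
        for (z1, z2), y in zip(CONES, YS)
    )

def deviations(n):
    """Values |U| - f(image of U) over all words V_1^a U V_2^b with 0 <= a, b < n."""
    out = []
    for V1, V2, u, ulen in PIECES:
        for a in range(n):
            for b in range(n):
                image = (a * V1[0][0] + u[0] + b * V2[0][0],
                         a * V1[0][1] + u[1] + b * V2[0][1])
                out.append(a * V1[1] + ulen + b * V2[1] - f(image))
    return out

print(YS)
print(max(abs(d) for d in deviations(6)))
```

On the words tested, \(|U|\) and \(f(\overline{U})\) differ by at most 1, and only for the piece containing \(y^{-1}\). The value \(f(1,1) = 4\) also agrees with Example 3.4, since \(f \circ \theta (v) = f(v/\sqrt{2},v/\sqrt{2}) = 4v/\sqrt{2} = 2\sqrt{2}\,v\) for \(v \ge 0\).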

5 The proof of Theorem 1.2

Finally, we use Lemma 3.1 and Propositions 3.3 and 4.2 to prove Theorem 1.2. The idea of our proof is to embed H (or a finite-index torsion-free subgroup of H) into an \(\mathcal {M}\)-quasiconvex free abelian subgroup \(\widehat{H} \le G\), where \((Y,\mathcal {M})\) is a biautomatic structure of G. We then associate a polyhedral function to \(\widehat{H}\), as per Proposition 4.2, and Lemma 3.1 allows us to restrict this function to a polyhedral function associated to H. The following result then implies that the latter polyhedral function is (in a sense) K-invariant, where K is the image of \({{\,\textrm{Comm}\,}}_G(H)\) in \({{\,\textrm{Comm}\,}}(H)\), and consequently K is finite by Proposition 3.3.

Proposition 5.1

Let G be a group with a finite-to-one biautomatic structure \((Y,\mathcal {M})\), let \(\widehat{H} \le G\) be a free abelian \(\mathcal {M}\)-quasiconvex subgroup of rank \(N \ge 0\), and let \(\varphi :\widehat{H} \rightarrow \mathbb {R}^N\) be an embedding with \(\varphi (\widehat{H}) = \mathbb {Z}^N\). Then there exists a polyhedral function \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) such that \(f \circ \varphi (h) = f \circ \varphi (ghg^{-1})\) for all \(g \in G\) and all \(h \in \widehat{H} \cap g^{-1}\widehat{H}g\).

Proof

As \((Y,\mathcal {M})\) is a finite-to-one biautomatic structure on G, by Lemma 2.3 there exists a constant \(\kappa \ge 0\) such that if \(U,V \in \mathcal {M}\) are such that \(\overline{U} = \overline{y^{-1}Vy}\) for some \(y \in Y\), then \(\big | |U|-|V| \big | \le \kappa \). Since \(\widehat{H}\) is \(\mathcal {M}\)-quasiconvex, by Theorem 2.5 there exists a finite-to-one biautomatic structure \((X,\mathcal {L})\) on \(\widehat{H}\) such that for any \(V \in \mathcal {M}\) with \(\overline{V} \in \widehat{H}\), there exists \(U \in \mathcal {L}\) with \(\overline{U} = \overline{V}\) and \(|U| = |V|\). By identifying the subgroup \(\widehat{H}\) with \(\mathbb {Z}^N\) via \(\varphi \), let \(f:\mathbb {R}^N \rightarrow \mathbb {R}\) and \(\xi \ge 0\) be the polyhedral function and the constant given by Proposition 4.2.

Now let \(g \in G\) and let \(h \in \widehat{H} \cap g^{-1}\widehat{H}g\), so that \(h^\beta ,gh^\beta g^{-1} \in \widehat{H}\) for all \(\beta \in \mathbb {Z}_{\ge 0}\); without loss of generality, assume that \(g \ne 1\). For each \(\beta \in \mathbb {Z}_{\ge 0}\), let \(U_\beta ,V_\beta \in \mathcal {L}\) and \(U_\beta ',V_\beta ' \in \mathcal {M}\) be such that \(\overline{U_\beta } = \overline{U_\beta '} = h^\beta \), \(\overline{V_\beta } = \overline{V_\beta '} = gh^\beta g^{-1}\), \(|U_\beta | = |U_\beta '|\) and \(|V_\beta | = |V_\beta '|\). It then follows from Proposition 4.2 that \(\big | f \circ \varphi (h^\beta ) - |U_\beta | \big | \le \xi \) and \(\big | f \circ \varphi (gh^\beta g^{-1}) - |V_\beta | \big | \le \xi \). Moreover, by the choice of the constant \(\kappa \ge 0\) above, we have \(\big | |U_\beta |-|V_\beta | \big | = \big | |U_\beta '|-|V_\beta '| \big | \le \kappa |g|_Y\). Finally, it follows from Definition 2.8 that \(f(\beta \textbf{x}) = \beta f(\textbf{x})\) for all \(\beta \in \mathbb {Z}_{\ge 0}\) and \(\textbf{x} \in \mathbb {R}^N\), and therefore

$$\begin{aligned}&\beta \left| f \circ \varphi (h) - f \circ \varphi (ghg^{-1}) \right| = \left| f \circ \varphi (h^\beta ) - f \circ \varphi (gh^\beta g^{-1}) \right| \\&\qquad \qquad {} \le \left| f \circ \varphi (h^\beta ) - |U_\beta | \right| + \big | |U_\beta |-|V_\beta | \big | + \left| |V_\beta | - f \circ \varphi (gh^\beta g^{-1}) \right| \\&\qquad \qquad {} \le 2\xi + \kappa |g|_Y. \end{aligned}$$

As \(\beta \in \mathbb {Z}_{\ge 0}\) was arbitrary, it follows that \(f \circ \varphi (h) = f \circ \varphi (ghg^{-1})\), as required. \(\square \)

Proof of Theorem 1.2

Since H is finitely generated abelian, it has a torsion-free subgroup \(H'\) of finite index. By Lemma 2.1, it is enough to show that the image of \({{\,\textrm{Comm}\,}}_G(H')\) in \({{\,\textrm{Comm}\,}}(H')\) is finite. Therefore, we will assume (without loss of generality) that H is torsion-free.

Now let \((Y,\mathcal {M})\) be a biautomatic structure for G. Since H is finitely generated, its centraliser \(C_G(H)\) is \(\mathcal {M}\)-quasiconvex by Proposition 2.7; let \((Y',\mathcal {M}')\) be the associated biautomatic structure for \(C_G(H)\). In particular, \(C_G(H)\) is finitely generated and so its centre, \(Z(C_G(H))\), is \(\mathcal {M}'\)-quasiconvex in \(C_G(H)\) (again by Proposition 2.7), and so \(\mathcal {M}\)-quasiconvex in G (by Lemma 2.6). Let \((Y'',\mathcal {M}'')\) be the associated biautomatic structure for \(Z(C_G(H))\).

Now \(Z(C_G(H))\) is a finitely generated abelian group containing H, and so \(Z(C_G(H))/H\) is a finitely generated abelian group. Thus \(Z(C_G(H))/H\) has a finite-index torsion-free subgroup, say \(\widehat{H}/H\). Its preimage \(\widehat{H}\) is a finite-index abelian subgroup of \(Z(C_G(H))\) containing H, and as H is torsion-free so is \(\widehat{H}\). Moreover, as \(\widehat{H}\) has finite index in \(Z(C_G(H))\), it is \(\mathcal {M}''\)-quasiconvex, and so by Lemma 2.6 it is \(\mathcal {M}\)-quasiconvex in G. Thus, \(\widehat{H} \le G\) is an \(\mathcal {M}\)-quasiconvex free abelian subgroup of finite rank (N, say) containing H.

Now let \(\widehat{\varphi }:\widehat{H} \rightarrow \mathbb {R}^N\) be an embedding such that \(\widehat{\varphi }(\widehat{H}) = \mathbb {Z}^N\). By identifying the subspace of \(\mathbb {R}^N\) spanned by \(\widehat{\varphi }(H)\) with \(\mathbb {R}^n\) via a linear isometry, we see that \(\widehat{\varphi }|_H = \theta \circ \varphi \), where \(\theta :\mathbb {R}^n \rightarrow \mathbb {R}^N\) is a linear isometric embedding, and \(\varphi :H \rightarrow \mathbb {R}^n\) is an embedding as a lattice. Given an element \(g \in {{\,\textrm{Comm}\,}}_G(H)\), we may define a matrix \(A_g \in GL_n(\mathbb {R})\) such that \(\varphi (ghg^{-1}) = A_g\varphi (h)\) for all \(h \in H \cap g^{-1}Hg\); such a matrix is unique since \(H \cap g^{-1}Hg\) has finite index in H and so \(\varphi (H \cap g^{-1}Hg)\) is a lattice in \(\mathbb {R}^n\). This defines a map

$$\begin{aligned} \Theta :{{\,\textrm{Comm}\,}}_G(H)&\rightarrow GL_n(\mathbb {R}), \\ g&\mapsto A_g, \end{aligned}$$

which is easily seen to be a homomorphism—in fact, the map \(\Theta \) is just a composite \({{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H) \cong GL(V) \hookrightarrow GL_n(\mathbb {R})\), where \(V = \varphi (H) \otimes \mathbb {Q}< \mathbb {R}^n\) is an n-dimensional \(\mathbb {Q}\)-vector subspace.
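For concreteness, the matrix \(A_g\) can be written down explicitly. The following is only a sketch: the basis \(h_1,\ldots ,h_n\) and the matrices P and Q are notation introduced here for illustration. Choose a basis \(h_1,\ldots ,h_n\) of the free abelian group \(H \cap g^{-1}Hg\), and let P and Q be the \(n \times n\) real matrices whose i-th columns are \(\varphi (h_i)\) and \(\varphi (gh_ig^{-1})\), respectively. Both matrices are invertible, since their columns are bases of the lattices \(\varphi (H \cap g^{-1}Hg)\) and \(\varphi (gHg^{-1} \cap H)\), and we have

$$\begin{aligned} A_gP = Q, \qquad \text {so that} \qquad A_g = QP^{-1} \in GL_n(\mathbb {R}). \end{aligned}$$

The homomorphism property can be checked in the same spirit. For \(g,g' \in {{\,\textrm{Comm}\,}}_G(H)\) and any h in the finite-index subgroup \(H \cap (g')^{-1}Hg' \cap (gg')^{-1}H(gg')\) of H, we have \(g'h(g')^{-1} \in H \cap g^{-1}Hg\), and hence

$$\begin{aligned} A_{gg'}\varphi (h) = \varphi (gg'h(g')^{-1}g^{-1}) = A_g\varphi (g'h(g')^{-1}) = A_gA_{g'}\varphi (h). \end{aligned}$$

As the images \(\varphi (h)\) of such elements h form a lattice in \(\mathbb {R}^n\), this gives \(A_{gg'} = A_gA_{g'}\).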

Now by Proposition 5.1, there exists a polyhedral function \(\widehat{f}:\mathbb {R}^N \rightarrow \mathbb {R}\) such that \(\widehat{f} \circ \widehat{\varphi }(h) = \widehat{f} \circ \widehat{\varphi }(ghg^{-1})\) for all \(g \in G\) and all \(h \in \widehat{H} \cap g^{-1}\widehat{H}g\). By Lemma 3.1, the function \(f = \widehat{f} \circ \theta :\mathbb {R}^n \rightarrow \mathbb {R}\) is also polyhedral. Now fix \(g \in {{\,\textrm{Comm}\,}}_G(H)\). Then, for any \(h \in H \cap g^{-1}Hg\) we have

$$\begin{aligned} f(A_g \varphi (h))&= f \circ \varphi (ghg^{-1}) = \widehat{f} \circ \theta \circ \varphi (ghg^{-1}) = \widehat{f} \circ \widehat{\varphi }(ghg^{-1}) \\&= \widehat{f} \circ \widehat{\varphi }(h) = \widehat{f} \circ \theta \circ \varphi (h) = f(\varphi (h)), \end{aligned}$$

and so \(f(A_g\textbf{v}) = f(\textbf{v})\) for all \(\textbf{v} \in \varphi (H \cap g^{-1}Hg)\). As f is polyhedral, we have \(f(\beta \textbf{v}) = \beta f(\textbf{v})\) for all \(\beta \in [0,\infty )\) and all \(\textbf{v} \in \mathbb {R}^n\), implying that \(f(A_g\textbf{v}) = f(\textbf{v})\) for all \(\textbf{v} \in K\), where \(K := \{ \beta \textbf{w} \mid \beta \in [0,\infty ), \textbf{w} \in \varphi (H \cap g^{-1}Hg) \}\). As \(\varphi (H \cap g^{-1}Hg)\) is a lattice in \(\mathbb {R}^n\), the subset \(K \subseteq \mathbb {R}^n\) is dense; and as f is polyhedral, it is continuous, implying that \(f(A_g\textbf{v}) = f(\textbf{v})\) for all \(\textbf{v} \in \mathbb {R}^n\).
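The density of K can be verified as follows. This is a sketch in which the basis \(\textbf{v}_1,\ldots ,\textbf{v}_n\) and the integer m are notation introduced here. Let \(\textbf{v}_1,\ldots ,\textbf{v}_n\) be a basis of the lattice \(\varphi (H \cap g^{-1}Hg)\), so that it is also a basis of \(\mathbb {R}^n\). Given \(q_1,\ldots ,q_n \in \mathbb {Q}\), choose an integer \(m \ge 1\) such that \(mq_i \in \mathbb {Z}\) for all i. Then

$$\begin{aligned} \sum _{i=1}^n q_i\textbf{v}_i = \frac{1}{m} \sum _{i=1}^n (mq_i)\textbf{v}_i \in K, \end{aligned}$$

since \(\sum _{i=1}^n (mq_i)\textbf{v}_i \in \varphi (H \cap g^{-1}Hg)\) and \(\frac{1}{m} \in [0,\infty )\). Thus K contains the \(\mathbb {Q}\)-span of \(\textbf{v}_1,\ldots ,\textbf{v}_n\), which is dense in \(\mathbb {R}^n\).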

Thus, we have \(f(A\textbf{v}) = f(\textbf{v})\) for all \(A \in \Theta ({{\,\textrm{Comm}\,}}_G(H))\) and all \(\textbf{v} \in \mathbb {R}^n\). It follows from Proposition 3.3 that \(\Theta ({{\,\textrm{Comm}\,}}_G(H))\) is finite. Since \(\Theta \) factors as a composite \({{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H) \hookrightarrow GL_n(\mathbb {R})\), it follows that \({{\,\textrm{Comm}\,}}_G(H)\) has finite image in \({{\,\textrm{Comm}\,}}(H)\), as required.
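As an illustration, consider the group \(G = G(A,L)\) from the introduction together with the subgroup \(H = \langle x_1,\ldots ,x_n \rangle \cong \mathbb {Z}^n\). This is only a sketch: the identification \(\varphi (\textbf{x}^{\textbf{v}}) = \textbf{v}\) is a choice made here for the example. The defining relations give

$$\begin{aligned} \varphi (t\textbf{x}^{\textbf{v}}t^{-1}) = \varphi (\textbf{x}^{A\textbf{v}}) = A\textbf{v} = A\varphi (\textbf{x}^{\textbf{v}}) \qquad \text {for all } \textbf{v} \in L. \end{aligned}$$

The subgroup \(\{ \textbf{x}^{\textbf{v}} \mid \textbf{v} \in L \}\) has finite index in H and is contained in \(H \cap t^{-1}Ht\), while its conjugate \(\{ \textbf{x}^{A\textbf{v}} \mid \textbf{v} \in L \} \subseteq tHt^{-1} \cap H\) also has finite index in H. Hence \(t \in {{\,\textrm{Comm}\,}}_G(H)\) and \(\Theta (t) = A\), so \(\Theta ({{\,\textrm{Comm}\,}}_G(H))\) contains the cyclic group generated by A. If G(A, L) is biautomatic, then the argument above shows that this cyclic group is finite, that is, A has finite order, in agreement with [8, Theorem 1.1].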

The ‘in particular’ part of the Theorem follows directly from the definition of the map \({{\,\textrm{Comm}\,}}_G(H) \rightarrow {{\,\textrm{Comm}\,}}(H)\): indeed, \(g \in {{\,\textrm{Comm}\,}}_G(H)\) is in the kernel of this map if and only if it centralises a finite-index subgroup of H. \(\square \)