1 Introduction

The subgroup membership problem (also known as the generalized word problem) for a group G asks whether for given group elements \(g_0, g_1, \ldots , g_k \in G\), \(g_0\) belongs to the subgroup \(\langle g_1, \ldots , g_k \rangle \) generated by \(g_1, \ldots , g_k\). To make this a well-defined computational problem, one has to fix an input representation for elements of G. Here, a popular choice is to restrict to finitely generated (f.g. for short) groups. In this case, group elements can be encoded by finite words over a finite set of generators. The subgroup membership problem is one of the best studied problems in computational group theory. Let us survey some important results on subgroup membership problems.

For symmetric groups \(S_n\), Sims [38] has developed a polynomial time algorithm for the uniform variant of the subgroup membership problem, where n is part of the input; see also [3] for efficient parallel algorithms. Here, we only consider the non-uniform subgroup membership problem, where we fix an infinite f.g. group G. For a f.g. free group, the subgroup membership problem can be solved using Nielsen reduction (see e.g. [25]); a polynomial time algorithm was found by Avenhaus and Madlener [1]. In fact, in [1] it is shown that the subgroup membership problem for a f.g. free group is \(\textsf{P}\)-complete. Another polynomial time algorithm uses Stallings’s folding procedure [39]; an almost linear time implementation can be found in [40]. An extension of Stallings’s folding for fundamental groups of certain graphs of groups was developed in [17]. The folding procedure from [17] can be used to show that subgroup membership is decidable for right-angled Artin groups with a chordal independence graph. Moreover, Friedl and Wilton [12] used the results of [17] in combination with deep results from 3-dimensional topology in order to decide the subgroup membership problem for 3-manifold groups. Other extensions of Stallings’s folding and applications to subgroup membership problems can be found in [19, 27, 35]. Using completely different (more algebraic) techniques, the subgroup membership problem has been shown to be decidable for polycyclic groups [2, 26] and f.g. metabelian groups [33, 34]. For f.g. nilpotent groups the subgroup membership problem is complete for the circuit complexity class \(\textsf{TC}^0\) [29].

On the undecidability side, Mihaĭlova [28] has shown that the subgroup membership problem is undecidable for the direct product \(F_2 \times F_2\) (where \(F_2\) is the free group of rank two). This implies undecidability of the subgroup membership problem for many other groups, e.g., \(\textsf{SL}(4,\mathbb {Z})\) (the group of \(4 \times 4\) integer matrices with determinant one) or the 5-strand braid group \(B_5\). Rips [31] constructed hyperbolic groups with an undecidable subgroup membership problem.

Apart from the above mentioned results for free groups [1] (\(\textsf{P}\)-completeness) and nilpotent groups [29] (\(\textsf{TC}^0\)-completeness) the authors are not aware of other precise complexity results for subgroup membership problems in infinite groups. The completeness results from [1, 29] assume that group elements are represented by finite words over the generators of the free group. In recent years, group theoretic decision problems have also been studied with respect to more succinct representations of group elements. For instance, the so-called compressed word problem, where the input group element is represented by a straight-line program (a context-free grammar that produces exactly one string) has received a lot of attention; see [4, 22] for surveys. For the subgroup membership problem in free groups, Gurevich and Schupp studied in [14] a succinct variant, where input group elements are of the form \(a_1^{z_1} a_2^{z_2} \cdots a_k^{z_k}\). Here, the \(a_i\) are from a fixed free basis of the free group and the \(z_i\) are binary encoded integers. Based on an adaptation of Stallings’s folding, they show that this succinct membership problem can be solved in polynomial time. Then, Gurevich and Schupp proceed in [14] by showing that their succinct folding algorithm for free groups can be adapted so that it works for the free product \(\mathbb {Z}/2\mathbb {Z} * \mathbb {Z}/3\mathbb {Z}\). The particular interest in this group comes from the fact that it is isomorphic to the modular group \(\textsf{PSL}(2,\mathbb {Z})\), which is the quotient of \(\textsf{SL}(2,\mathbb {Z})\) by \(\langle -\textsf{Id}_2 \rangle \cong \mathbb {Z}/2\mathbb {Z}\) (\(\textsf{Id}_2\) is the \(2 \times 2\) identity matrix). 
As an application of the succinct folding algorithm for \(\mathbb {Z}/2\mathbb {Z} * \mathbb {Z}/3\mathbb {Z}\), Gurevich and Schupp show that the subgroup membership problem for \(\textsf{PSL}(2,\mathbb {Z})\) is decidable in polynomial time when all matrix entries are encoded in binary notation.

A related result was shown in [29]: the subgroup membership problem for a f.g. nilpotent group can be solved in polynomial time, when group elements are represented by binary encoded Mal’cev coordinates.

The polynomial time algorithm for the succinct membership problem for \(\mathbb {Z}/2\mathbb {Z} * \mathbb {Z}/3\mathbb {Z}\) from [14] is tailored towards this group, and it is not clear how to adapt the algorithm to related groups. The latter is the goal of this paper. For this it turns out to be useful to consider a more succinct representation of input elements for free groups. Recall that Gurevich and Schupp use words of the form \(a_1^{z_1} a_2^{z_2} \cdots a_k^{z_k}\), where the integers \(z_i\) are given in binary notation and the \(a_i\) are generators from a free basis. Here, we represent group elements by so-called power words which were studied in [23] in the context of group theory. A power word has the form \(p_1^{z_1} p_2^{z_2} \cdots p_k^{z_k}\), where as above the integers \(z_i\) are given in binary notation but the \(p_i\) are arbitrary words over the group generators. In [23] it was shown that the so-called power word problem (does a given power word represent the group identity?) for a f.g. free group F is \(\textsf{AC}^0\)-reducible to the ordinary word problem for F (and hence in logspace). In Section 3, we prove that the power-compressed subgroup membership problem (i.e., the subgroup membership problem with all group elements represented by power words) for a free group can be solved in polynomial time by using a folding procedure à la Stallings (Theorem 1). This generalizes the above mentioned result of Gurevich and Schupp. At first sight, the step from power words of the form \(a_1^{z_1} a_2^{z_2} \cdots a_k^{z_k}\) (with the \(a_i\) generators) to general power words as defined above does not look very spectacular. But apart from the quite technical details, the power-compressed subgroup membership problem has a major advantage over the restricted version of Gurevich and Schupp: we show that if G is a f.g. group and H is a finite index subgroup of G then the power-compressed subgroup membership problem for G is polynomial time reducible to the power-compressed subgroup membership problem for H (Lemma 11). Hence, the power-compressed subgroup membership problem for every f.g. virtually free group (a finite extension of a f.g. free group) can be solved in polynomial time (Corollary 2). This result opens up new applications to matrix group algorithms. It is well-known that the group \(\textsf{GL}(2,\mathbb {Z})\) (the group of all \(2 \times 2\) integer matrices with determinant \(\pm 1\)) is f.g. virtually free. Moreover, given a matrix \(A \in \textsf{GL}(2,\mathbb {Z})\) with binary encoded entries one can compute a power word (over a fixed finite generating set of \(\textsf{GL}(2,\mathbb {Z})\)) that represents A. Hence, the subgroup membership problem for \(\textsf{GL}(2,\mathbb {Z})\) can be decided in polynomial time when elements of \(\textsf{GL}(2,\mathbb {Z})\) are represented by matrices with binary encoded integers (Corollary 3).

In Section 5 we present another application of our folding procedure for power words: we show that the finite index problem for f.g. subgroups of \(\textsf{GL}(2,\mathbb {Z})\) can be decided in polynomial time, when elements of \(\textsf{GL}(2,\mathbb {Z})\) are represented as matrices with binary encoded integers (Corollary 5). In the finite index problem for a group G the goal is to compute the index (an element of \(\mathbb {N} \cup \{\infty \}\)) of a given f.g. subgroup of G. The finite index problem has been studied in [16] (for free groups), [27] (amalgamated products of finite groups), [37] (virtually free groups), [18] (quasiconvex subgroups of automatic groups), [9] (direct products of free-abelian and free groups) and [8] (solvable Baumslag-Solitar groups \(\textsf{BS}(1,q)\)).

Related Work

Related to the subgroup membership problem is the more general rational subset membership problem. A rational subset in a group G is given by a finite automaton, where transitions are labelled with elements of G. Such an automaton accepts a subset of G in the natural way. In the rational subset membership problem for G the input consists of a rational subset \(L \subseteq G\) and an element \(g \in G\) and the question is, whether \(g \in L\). This problem was shown to be decidable for free groups by Benois [6] via an automaton saturation procedure that moreover can be implemented in cubic time [7]. Stallings’s folding can be viewed as a special case of Benois’s construction.

Rational subset membership problems (and special cases) for matrix groups are a very active research field. Some recent results can be found in [5, 8, 11, 21, 30]. Closest to our work is [5], where it is shown that the identity problem for \(\textsf{SL}(2,\mathbb {Z})\) (does the identity matrix belong to a finitely generated subsemigroup of \(\textsf{SL}(2,\mathbb {Z})\)?) and the rational subset membership problem for \(\textsf{PSL}(2,\mathbb {Z})\) are \(\textsf{NP}\)-complete (when matrix entries are given in binary notation). For this, the authors of [5] use the ideas of Gurevich and Schupp [14]. In [8, 11], first steps towards \(\textsf{GL}(2,\mathbb {Q})\) are taken: in [11] the authors prove decidability of membership in so-called flat rational subsets of \(\textsf{GL}(2,\mathbb {Q})\), whereas [8] establishes the decidability of the full rational subset membership problem for the Baumslag-Solitar groups \(\textsf{BS}(1,q) < \textsf{GL}(2,\mathbb {Q})\) with \(q \ge 2\).

2 Preliminaries

General Notations

For an integer \(z \in \mathbb {Z}\) we define its signum as usual: \(\textsf{sign}(0) = 0\), and for \(z > 0\), \(\textsf{sign}(z)=1\) and \(\textsf{sign}(-z)=-1\). As usual, \(\varSigma ^*\) denotes the set of all finite words over an alphabet \(\varSigma \), \(\varepsilon \) denotes the empty word, and \(\varSigma ^+ = \varSigma ^* \setminus \{\varepsilon \}\) is the set of all non-empty words. The length of a word w is denoted by |w|. If \(w = uv \in \varSigma ^*\) then u is called a prefix of w and v is called a suffix of w. The word \(u \in \varSigma ^*\) is a factor of the word \(w \in \varSigma ^*\) if \(w = s u t\) for some \(s,t \in \varSigma ^*\). At one point, it will be convenient to work with \(\omega \)-words. An \(\omega \)-word over the alphabet \(\varGamma \) is an infinite sequence \(a_1 a_2 a_3 a_4 \cdots \) with \(a_i \in \varGamma \) for all \(i \ge 1\). With \(\varGamma ^\omega \) we denote the set of all \(\omega \)-words over the alphabet \(\varGamma \).

Groups

We assume some basic background in group theory; see [25] or [32] for more details. For a group G and a subset \(A \subseteq G\), we denote with \(\langle A \rangle \) the subgroup of G generated by A. It is the set of all products of elements from \(A \cup A^{-1}\). We say that A generates G (or A is a generating set for G) if \(G = \langle A \rangle \). If \(A = A^{-1}\) then A is a symmetric generating set for G. A group G is called finitely generated if it has a finite generating set.

Fix a finite set \(\varSigma \) of symbols and let \(\varSigma ^{-1} = \{ a^{-1} \mid a \in \varSigma \}\) be a set of formal inverses of the symbols in \(\varSigma \) with \(\varSigma \cap \varSigma ^{-1} = \emptyset \). Let \(\varGamma = \varSigma \cup \varSigma ^{-1}\). We define an involution on \(\varGamma ^*\) by setting \((a^{-1})^{-1} = a\) for \(a \in \varSigma \) and \((a_1 a_2 \cdots a_k)^{-1} = a_k^{-1} \cdots a_2^{-1} a_1^{-1}\) for \(a_1, \ldots , a_k \in \varGamma \). A word \(w \in \varGamma ^*\) is called freely reduced if it neither contains a factor \(a a^{-1}\) nor \(a^{-1} a\) for \(a \in \varSigma \). With \(\textsf{red}(\varGamma ^*)\) we denote the set of all freely reduced words. For every word \(w \in \varGamma ^*\) one obtains a unique freely reduced word that is obtained from w by deleting factors \(a a^{-1}\) and \(a^{-1} a\) (\(a \in \varSigma \)) as long as possible. We denote this word with \(\textsf{red}(w)\); it can be computed in linear time from w.
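As an illustration, the reduction \(\textsf{red}(w)\) can be sketched with a stack, which gives the linear running time. The encoding is an assumption made only for this sketch: a lowercase letter stands for a generator from \(\varSigma \) and the corresponding uppercase letter for its formal inverse.

```python
def inv(word: str) -> str:
    """Formal inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def red(word: str) -> str:
    """Freely reduce a word in one left-to-right pass using a stack."""
    stack = []
    for c in word:
        if stack and stack[-1] == c.swapcase():
            stack.pop()  # cancel a factor x x^{-1}
        else:
            stack.append(c)
    return "".join(stack)
```

For instance, `red("abBA")` yields the empty word, while `red("abAB")` returns its argument unchanged because a commutator is already freely reduced.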

The free group generated by \(\varSigma \), denoted by \(F(\varSigma )\), consists of the set \(\textsf{red}(\varGamma ^*)\) together with the multiplication \(\cdot \) defined by \(u \cdot v = \textsf{red}(uv)\) for \(u, v \in \textsf{red}(\varGamma ^*)\). The group identity of \(F(\varSigma )\) is the empty word \(\varepsilon \). A group G that has a free subgroup of finite index in G is called virtually free. Usually, we identify a (not necessarily freely reduced) word \(w \in \varGamma ^*\) with the group element \(\textsf{red}(w) \in F(\varSigma )\).

For every group G there exists a free group \(F(\varSigma )\) and a surjective homomorphism \(\pi : F(\varSigma ) \rightarrow G\). We then have \(G = \langle \pi (\varSigma ) \rangle \). If G is finitely generated then we can choose \(\varSigma \) to be finite. In this situation, we also identify \(\varSigma \) with the generating set \(\pi (\varSigma )\). We also say that the element \(w \in F(\varSigma )\) represents the group element \(\pi (w)\). For \(u,v \in F(\varSigma )\) we say that \(u=v\) in G if \(\pi (u) = \pi (v)\). Sometimes, we identify \(w \in F(\varSigma )\) (or \(w \in (\varSigma \cup \varSigma ^{-1})^*\)) with the corresponding group element \(\pi (w)\).

Fix a f.g. group G together with a surjective morphism \(\pi : F(\varSigma ) \rightarrow G\) with \(\varSigma \) finite. The subgroup membership problem for G is the following decision problem:

input: words \(w_0, w_1, \ldots , w_n \in F(\varSigma )\).

question: Does \(\pi (w_0)\) belong to the subgroup \(\langle \pi (w_1), \ldots , \pi (w_n) \rangle \le G\)?

Note that we formulated the subgroup membership problem for G with respect to a fixed surjective morphism \(\pi : F(\varSigma ) \rightarrow G\). In other words, for every such surjective morphism \(\pi : F(\varSigma ) \rightarrow G\), we have another variant of the subgroup membership problem for G. On the other hand, it is easy to see that the computational complexity of the subgroup membership problem for G does not depend on the concrete choice of \(\pi : F(\varSigma ) \rightarrow G\), at least if we only care about complexity classes containing polynomial time (actually, a smaller complexity class such as deterministic logspace would also be fine, but this is not needed for our considerations). To see this, take another surjective morphism \(\pi ' : F(\varTheta ) \rightarrow G\) (with \(\varTheta \) finite as well). Then for every generator \(a \in \varSigma \) there is an element \(h(a) \in F(\varTheta )\) such that \(\pi (a) = \pi '(h(a))\) in the group G. The mapping h uniquely extends to a morphism \(h : F(\varSigma ) \rightarrow F(\varTheta )\) (this is the crucial property of free groups). We then have \(\pi (w_0) \in \langle \pi (w_1), \ldots , \pi (w_n) \rangle \) if and only if \(\pi '(h(w_0)) \in \langle \pi '(h(w_1)), \ldots , \pi '(h(w_n)) \rangle \). Since the morphism h can be easily computed in polynomial time (simply replace every symbol a by h(a)), this shows that the subgroup membership problem for G with respect to \(\pi : F(\varSigma ) \rightarrow G\) is polynomial time reducible to the subgroup membership problem for G with respect to \(\pi ' : F(\varTheta ) \rightarrow G\). This justifies not mentioning the surjective morphism \(\pi : F(\varSigma ) \rightarrow G\) in the subgroup membership problem for G.

In this paper we are interested in a variant of the subgroup membership problem for G where the words \(w_0, w_1, \ldots , w_n\) are given in a more succinct way. In the next section, we define this variant.

3 Stallings’s Folding for Power-Compressed Words

In this section we present our succinct version of Stallings’s folding that was mentioned in the introduction. We start with the definition of power words and power-compressed graphs. These graphs are basically finite automata where the transitions are labelled with power words. We prefer to use the term “graph” instead of “automaton”, since the former is more common in the literature on Stallings’s folding.

A power word over an alphabet \(\varSigma \) is a sequence \((p_1, n_1) (p_2, n_2) \cdots (p_k, n_k)\) of pairs where \(p_1, \ldots , p_k \in \varSigma ^+\) and \(n_1, \ldots , n_k \in \mathbb {N} \setminus \{0\}\). Such a power word represents the ordinary word \(p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\) and we usually identify a power word with the word it represents. The difference between the sequence of pairs \((p_1, n_1) (p_2, n_2) \cdots (p_k, n_k)\) and the word \(p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\) comes from the succinctness of descriptions. When a power word is part of the input for a computational problem, we always assume that the exponents \(n_i\) are given in binary notation, whereas the words \(p_i\) (also called the periods of the power word) are written down explicitly by listing all symbols in the words. Therefore, we define the input length \(\Vert w\Vert \) of the power word \(w = (p_1, n_1) (p_2, n_2) \cdots (p_k, n_k)\) as

$$\begin{aligned} \sum \limits _{i=1}^k (|p_i| + \log n_i). \end{aligned}$$

On the other hand, the length of the word \(p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\) is \(\sum _{i=1}^k n_i |p_i|\). Therefore, a power word should be seen as a succinct representation of the word it represents.

In the case of a power word over an alphabet \(\varGamma = \varSigma \cup \varSigma ^{-1}\) we may also allow negative exponents. Of course, \(p^{-n}\) stands for \((p^{-1})^n\).
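The gap between the two lengths can be made concrete with a small sketch. It assumes a hypothetical encoding (a power word as a list of pairs, lowercase letters for generators, uppercase letters for their inverses), takes the logarithm to base 2, and uses \(|n_i|\) for negative exponents; none of these choices is fixed by the text above.

```python
import math


def inv(word: str) -> str:
    """Formal inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def input_length(power_word) -> float:
    """Sum of |p_i| + log2 |n_i| over all pairs (p_i, n_i)."""
    return sum(len(p) + math.log2(abs(n)) for p, n in power_word)


def expanded_length(power_word) -> int:
    """Length of the ordinary word represented by the power word."""
    return sum(abs(n) * len(p) for p, n in power_word)


def expand(power_word) -> str:
    """Write out the represented word; p^{-n} stands for (p^{-1})^n."""
    return "".join(p * n if n > 0 else inv(p) * (-n) for p, n in power_word)


w = [("ab", 1024), ("aB", -8)]
print(input_length(w), expanded_length(w))
```

Here the input length is 17 while the represented word has 2064 symbols, so `expand` should only be called on power words with small exponents.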

Consider a f.g. group G together with a surjective morphism \(\pi : F(\varSigma ) \rightarrow G\) for \(\varSigma \) finite. The power-compressed subgroup membership problem for G is the following problem:

input: Power words \(w_0, w_1, \ldots , w_n\) over the alphabet \(\varGamma = \varSigma \cup \varSigma ^{-1}\).

question: Does \(\pi (w_0)\) belong to the subgroup \(\langle \pi (w_1), \ldots , \pi (w_n) \rangle \le G\)?

As for the (ordinary) subgroup membership problem, the concrete choice of the surjective morphism \(\pi : F(\varSigma ) \rightarrow G\) does not influence the complexity of the power-compressed subgroup membership problem. The reason is the same as for the subgroup membership problem. The morphism \(h : F(\varSigma ) \rightarrow F(\varTheta )\) from the previous section can also be applied to power words: the power word \(w = (p_1, n_1) (p_2, n_2) \cdots (p_k, n_k)\) is mapped to

$$\begin{aligned} h(w) = (h(p_1), n_1) (h(p_2), n_2) \cdots (h(p_k), n_k). \end{aligned}$$

This yields a polynomial time reduction from the power-compressed subgroup membership problem for G with respect to \(\pi : F(\varSigma ) \rightarrow G\) to the power-compressed subgroup membership problem for G with respect to \(\pi ' : F(\varTheta ) \rightarrow G\).
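A minimal sketch of this change of generators, under an assumed encoding: power words are lists of pairs, lowercase letters are generators, uppercase letters are their inverses, and the dictionary `h` (a hypothetical name) lists the image of every generator of \(\varSigma \) as a word over \(\varTheta \cup \varTheta ^{-1}\).

```python
def inv(word: str) -> str:
    """Formal inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def apply_morphism(h: dict, word: str) -> str:
    """Extend a map on generators to words over generators and inverses."""
    return "".join(h[c] if c.islower() else inv(h[c.lower()]) for c in word)


def apply_to_power_word(h: dict, power_word):
    """Map every period through h; the binary exponents are left untouched."""
    return [(apply_morphism(h, p), n) for p, n in power_word]


h = {"a": "xy", "b": "yy"}
print(apply_to_power_word(h, [("aB", 5), ("b", -3)]))
```

Only the periods grow, each by a constant factor depending on `h`, which is why the reduction stays polynomial in the input length.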

The goal of this section is to show that the power-compressed subgroup membership problem can be decided in polynomial time for a f.g. free group. In Section 4 we will extend this result to f.g. virtually free groups.

Our main tool for solving the power-compressed subgroup membership problem for f.g. free groups is an extension of Stallings’s folding procedure for power-compressed words. First we need some combinatorial results for words. Fix a finite alphabet \(\varSigma \) with the inverse alphabet \(\varSigma ^{-1}\) for the rest of Section 3 and let \(\varGamma = \varSigma \cup \varSigma ^{-1}\).

3.1 Combinatorics on Words

We fix an arbitrary linear order < on \(\varGamma \). In order to simplify notation later, it is convenient to require that \(a < a^{-1}\) for every \(a \in \varSigma \). With \(\preceq \) we denote the lexicographic order with respect to <. Let \(\varOmega \subseteq \textsf{red}(\varGamma ^*)\) denote the set of all freely reduced words w such that

  • w is non-empty,

  • w is cyclically reduced (i.e., w cannot be written as \(a u a^{-1}\) for \(a \in \varGamma \)),

  • w is primitive (i.e., w cannot be written as \(u^n\) for some \(n \ge 2\)),

  • w is lexicographically minimal among all cyclic permutations of w and \(w^{-1}\) (i.e., \(w \preceq uv\) for all \(u,v \in \varGamma ^*\) with \(vu =w\) or \(vu = w^{-1}\)).

Note that \(\varSigma \subseteq \varOmega \) and \(\varSigma ^{-1} \cap \varOmega = \emptyset \) (since \(a < a^{-1}\) for \(a \in \varSigma \)). For every \(w \in \varOmega \) and \(n \in \mathbb {Z}\) we have \(w^n \in \textsf{red}(\varGamma ^*)\) (since w is freely reduced and cyclically reduced).

The set \(\varOmega \) was introduced in [23] in order to solve the power word problem (that was mentioned in the introduction) for a free group in logspace. The crucial fact about words in \(\varOmega \) is that if two powers \(p^x\) and \(q^y\) (\(p,q \in \varOmega \), \(x,y\in \mathbb {Z}\)) have a long enough common factor then \(p=q\); see Lemma 1 below.

Example 1

Assume that \(a<b<a^{-1}<b^{-1}\). Then the word \(w = a b a b^{-1}\) belongs to \(\varOmega \). It is clearly freely reduced, cyclically reduced, and primitive. Moreover, the cyclic permutations of w and \(w^{-1} = b a^{-1} b^{-1} a^{-1}\) are:

$$\begin{aligned}\begin{array}{ccccc} w \ \ = \ \ {} &{} a &{} b &{} a &{} b^{-1} \ \\ &{} b &{} a &{} b^{-1} &{} a\\ &{} a &{} b^{-1} &{} a &{} b\\ &{} b^{-1} &{} a &{} b &{} a\\ w^{-1} \ \ = \ \ {} &{} b &{} a^{-1}&{} b^{-1}&{} a^{-1}\\ &{} a^{-1} &{} b^{-1} &{} a^{-1} &{} b\\ &{} b^{-1} &{} a^{-1} &{} b &{} a^{-1}\\ &{} a^{-1} &{} b &{} a^{-1} &{} b^{-1} \end{array} \end{aligned}$$

Among those words, w is indeed the lexicographically minimal one.
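The four conditions defining \(\varOmega \) can be checked mechanically. The following sketch assumes a hypothetical encoding in which lowercase letters are generators and uppercase letters their inverses, ordered as \(a< b< \cdots < a^{-1}< b^{-1} < \cdots \), which agrees with the order of Example 1. It is a brute-force test, not the logspace procedure of [23].

```python
def inv(word: str) -> str:
    """Formal inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def key(word: str):
    """Sort key realizing the order a < b < ... < A < B < ... letter by letter."""
    return [(c.isupper(), c.lower()) for c in word]


def in_omega(w: str) -> bool:
    if not w:
        return False  # must be non-empty
    if any(x == y.swapcase() for x, y in zip(w, w[1:])):
        return False  # not freely reduced
    if w[0] == w[-1].swapcase():
        return False  # not cyclically reduced
    if w in (w + w)[1:-1]:
        return False  # not primitive: w is a proper power
    rotations = [x[i:] + x[:i] for x in (w, inv(w)) for i in range(len(w))]
    return all(key(w) <= key(r) for r in rotations)


print(in_omega("abaB"))
```

The call on `"abaB"` corresponds to the word \(a b a b^{-1}\) of Example 1. The primitivity test uses the standard fact that a word is a proper power exactly when it occurs inside its own square at a position other than the two ends.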

The following lemma can be found in [23, Lemma 11].

Lemma 1

Let \(p,q \in \varOmega \) and \(x,y\in \mathbb {Z}\). If \(p^x\) and \(q^y\) have a common factor of length at least \(|p| + |q| - 1\) then \(p=q\).

We also need the following statement:

Lemma 2

If \(p \in \varOmega \), \(u,v \in \varGamma ^*\), and \(u p v = pp\) then \(u=\varepsilon \) or \(v = \varepsilon \).

Proof

Assume that \(upv = pp\) such that \(u \ne \varepsilon \) and \(v \ne \varepsilon \). We obtain a factorization \(p = qr\) such that \(q \ne \varepsilon \), \(r \ne \varepsilon \) and \(p = rq=qr\). Hence, \(q,r \in s^*\) for some string \(s \in \varGamma ^+\) (see e.g. [24, Proposition 1.3.2]), which implies that p is not primitive, a contradiction.\(\square \)

3.2 Power-Compressed Graphs

A power-compressed graph is a tuple \(\mathcal {G}=(V,E,\iota ,\tau , \lambda , v_0)\), where V is the set of vertices, E is the set of directed edges with \(V \cap E = \emptyset \), \(\iota :E \rightarrow V\) maps an edge to its source vertex, \(\tau :E \rightarrow V\) maps an edge to its target vertex, \(\lambda :E \rightarrow \varGamma ^+ \times (\mathbb {Z} \setminus \{0\})\) assigns to every edge its label, and \(v_0 \in V\) is the so-called base point. Moreover, for every edge e such that \(\iota (e) = u\), \(\tau (e)=v\), and \(\lambda (e) = (p,z)\) there is an inverse edge \(e^{-1} \ne e\) such that \(\iota (e^{-1}) = v\), \(\tau (e^{-1})=u\), \(\lambda (e^{-1}) = (p,-z)\), and \((e^{-1})^{-1} = e\). In this paper, V and E will be always finite. Note that we may have edges \(e, e^{\prime } \in E\) with \(e \ne e'\), \(\iota (e) = \iota (e^{\prime })\), \(\tau (e) = \tau (e^{\prime })\), and \(\lambda (e) = \lambda (e^{\prime })\).

When we describe a power-compressed graph we often specify for a pair of edges \(e, e^{-1}\) only one of them and implicitly assume the existence of its inverse edge. An edge e is called short if \(\lambda (e) \in \varGamma \times \{-1,1\}\), otherwise it is called long. If \(\mathcal {G}\) only contains short edges, then \(\mathcal {G}\) is called an uncompressed graph, or just graph. We define the input length of \(\mathcal {G}\) as \(|\mathcal {G}| = \sum _{e \in E} \Vert \lambda (e) \Vert \) (here, we view \(\lambda (e) = (p,z)\) as a power word consisting of a single power).

A path in \(\mathcal {G}\) is a sequence

$$\begin{aligned} \rho = [v_1, e_1, v_2, e_2, \ldots , v_k, e_k, v_{k+1}], \end{aligned}$$

where \(k \ge 0\), \(e_1, \ldots , e_k \in E\), \(\iota (e_i) = v_i\) and \(\tau (e_i) = v_{i+1}\) for \(1 \le i \le k\). If \(v_i \ne v_j\) for all i, j with \(1 \le i < j \le k+1\) then \(\rho \) is called a simple path. If \(v_1 = v_{k+1}\) and \(k \ge 1\) then \(\rho \) is a cycle. If \(v_i \ne v_j\) for all i, j with \(1 \le i < j \le k\) and \(v_1 = v_{k+1}\) then \(\rho \) is a simple cycle. Let \(\iota (\rho ) = v_1\) and \(\tau (\rho ) = v_{k+1}\). If \(\lambda (e_i) = (p_i, z_i)\) then we define \(\lambda (\rho )\) as the power word \((p_1, z_1) (p_2, z_2) \cdots (p_k,z_k)\). The path \(\rho \) is oriented if \(\textsf{sign}(z_i) = \textsf{sign}(z_j)\) for all i, j. The path \(\rho \) is without backtracking if \(e_{i+1} \ne e_i^{-1}\) for all \(1 \le i \le k-1\). The power-compressed graph \(\mathcal {G}\) is connected if for all \(u, v \in V\) there is a path \(\rho \) with \(\iota (\rho ) = u\) and \(\tau (\rho ) = v\). The power-compressed graph \(\mathcal {G}\) is a tree if it is connected and it does not contain a cycle without backtracking.

In the following, we identify a pair \((p,z) \in \varGamma ^+ \times (\mathbb {Z} \setminus \{0\})\) with the power \(p^z\). In particular, in an uncompressed graph every edge is labelled with a symbol from \(\varGamma \). With a power-compressed graph \(\mathcal {G}\) we can associate an uncompressed graph \(\textsf{decompress}(\mathcal {G})\) that is obtained by replacing in \(\mathcal {G}\) every \(p^z\)-labelled edge e by a path \(\rho \) of short edges from \(\iota (e)\) to \(\tau (e)\) and such that \(\lambda (\rho ) = p^z\). Moreover, if \(\iota (e) \ne \tau (e)\) then \(\rho \) is a simple path and if \(\iota (e) = \tau (e)\) then \(\rho \) is a simple cycle.

A power-compressed graph \(\mathcal {G}=(V,E,\iota ,\tau , \lambda , v_0)\) can be viewed as a finite automaton over the alphabet \(\varGamma \), where transition labels are succinct words of the form \(p^z\) with z given in binary notation: V is the set of states, an edge e corresponds to a transition from \(\iota (e)\) to \(\tau (e)\) with label \(\lambda (e)\) and \(v_0\) is the unique initial and final state. We denote with \(L(\mathcal {G})\) the set of all words \(w \in \varGamma ^*\) accepted by the automaton \(\mathcal {G}\). With \(F(\mathcal {G})\) we denote the image of \(L(\mathcal {G})\) in the free group \(F(\varSigma )\). Since every edge of \(\mathcal {G}\) has an inverse edge, it is easy to see that \(F(\mathcal {G})\) is a subgroup of \(F(\varSigma )\).

3.3 Folding Uncompressed Graphs

Before we continue with power-compressed graphs let us first explain Stallings’s folding procedure [39] for uncompressed graphs, which is one of the most powerful techniques for analysing subgroups of free groups; see e.g. [16]. Let \(\mathcal {G}\) and \(\mathcal {H}\) be two uncompressed graphs as defined in Section 3.2. We say that \(\mathcal {G}\) can be folded into \(\mathcal {H}\) if there exist two edges \(e \ne e^{\prime }\) in \(\mathcal {G}\) such that \(\iota (e) = \iota (e^{\prime })\) and \(\lambda (e) = \lambda (e^{\prime })\) and \(\mathcal {H}\) is obtained from \(\mathcal {G}\) by merging the two vertices \(\tau (e)\) and \(\tau (e^{\prime })\) (note that we may have already \(\tau (e)=\tau (e^{\prime })\) in \(\mathcal {G}\)) into a single vertex and removing the edges e and \(e^{-1}\) (this is an arbitrary choice; we could also keep e and \(e^{-1}\) and remove \(e^{\prime }\) and \({e^{\prime }}^{-1}\)) from the graph. One can easily show that \(F(\mathcal {G}) = F(\mathcal {H})\) holds in this situation. Every vertex of \(\mathcal {G}\) is mapped to a vertex of \(\mathcal {H}\) in the natural way (\(\tau (e)\) and \(\tau (e^{\prime })\) are mapped to the same vertex of \(\mathcal {H}\)). If a graph \(\mathcal {G}\) cannot be folded further then we say that \(\mathcal {G}\) is folded. In this case, \(\mathcal {G}\) is a deterministic automaton and \(w \in L(\mathcal {G})\) implies \(\textsf{red}(w) \in L(\mathcal {G})\).

Consider now a finite set of words \(A = \{w_1, \ldots , w_n\} \subseteq \varGamma ^+\) and let \(g_i = \textsf{red}(w_i) \in F(\varSigma )\) be the free group element represented by \(w_i\). We construct a so-called bouquet graph \(\mathcal {B}(A)\) such that

$$\begin{aligned} F(\mathcal {B}(A)) = \langle g_1,\ldots ,g_n \rangle \le F(\varSigma ) \end{aligned}$$

as follows:

  • First we define for a non-empty word \(w = a_1 a_2 \cdots a_k\) (\(a_i \in \varGamma \)) the cycle graph

    $$\begin{aligned} \mathcal {C}(w) = (\{ v_0, \ldots , v_{k-1}\}, \{e_i^{\pm 1} :1 \le i \le k \}, \iota , \tau , \lambda , v_0), \end{aligned}$$

    where \(\iota (e_i) = v_{i-1}\), \(\lambda (e_i) = a_i\), and \(\tau (e_i) = v_{i \bmod k}\) for \(1 \le i \le k\).

  • We then define the bouquet graph \(\mathcal {B}(A)\) by taking the disjoint union of the cycle graphs \(\mathcal {C}(w_1), \ldots , \mathcal {C}(w_n)\) and then merging the base points of the \(\mathcal {C}(w_i)\).

Let \(\mathcal {S}(A)\) be the graph obtained by folding \(\mathcal {B}(A)\) as long as possible. The final graph of this procedure is in fact unique up to graph isomorphism. The graph \(\mathcal {S}(A)\) is sometimes called the Stallings graph for A. Note that as an automaton, \(\mathcal {S}(A)\) is deterministic. The above discussion leads to the following crucial fact (see also [16] for a more detailed discussion):

Lemma 3

Let A and \(g_1, \ldots , g_n\) be as above and let \(g \in \textsf{red}(\varGamma ^*)\) be a freely reduced word and hence an element of \(F(\varSigma )\). Then g is accepted by \(\mathcal {S}(A)\) if and only if \(g \in \langle g_1,\ldots ,g_n \rangle \le F(\varSigma )\).
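The construction of \(\mathcal {B}(A)\), the folding, and the membership test of Lemma 3 can be sketched as follows. This is a naive illustration under assumed conventions (lowercase letters are generators, uppercase letters their inverses, vertex 0 is the base point, vertices are merged with a union-find structure); it rescans all edges after every merge and is not the almost linear time implementation of [40].

```python
def red(word: str) -> str:
    """Freely reduce a word with a stack."""
    stack = []
    for c in word:
        if stack and stack[-1] == c.swapcase():
            stack.pop()
        else:
            stack.append(c)
    return "".join(stack)


def stallings_graph(generators):
    """Fold the bouquet of cycles; returns a map (vertex, letter) -> vertex."""
    parent = [0]  # union-find forest over vertices, 0 is the base point

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = []
    for w in generators:
        prev = 0
        for i, c in enumerate(w):
            if i == len(w) - 1:
                nxt = 0  # the cycle closes at the base point
            else:
                nxt = len(parent)
                parent.append(nxt)
            edges.append((prev, c, nxt))
            edges.append((nxt, c.swapcase(), prev))  # inverse edge
            prev = nxt
    while True:
        delta, merged = {}, False
        for u, c, v in edges:
            u, v = find(u), find(v)
            t = delta.setdefault((u, c), v)
            if t != v:  # two equally labelled edges leave u: fold them
                parent[max(t, v)] = min(t, v)
                merged = True
                break
        if not merged:
            return delta


def is_member(g: str, generators) -> bool:
    """Read red(g) from the base point and check that the run returns to it."""
    delta, v = stallings_graph(generators), 0
    for c in red(g):
        if (v, c) not in delta:
            return False
        v = delta[(v, c)]
    return v == 0
```

For example, the words `"aab"` and `"ab"` generate the whole free group on a and b, so `is_member("b", ["aab", "ab"])` holds, whereas `is_member("aB", ["ab", "ba"])` fails.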

3.4 Folding Power-Compressed Graphs

Fix a power-compressed graph \(\mathcal {G} = (V,E,\iota ,\tau , \lambda , v_0)\) for the rest of this section and let P be the set of all words p such that \(\lambda (e) = p^z\) for some \(e \in E\) and \(z \in \mathbb {Z} \setminus \{0\}\). We will refer to the following numbers throughout this section:

  • \(\alpha := \max \{ |p| :p \in P \} \ge 1\),

  • \(\beta := 2 \alpha - 1 \ge 1\),

  • \(\gamma := 2 (\alpha + \beta ) \ge 4\).

We say that \(\mathcal {G}\) is normalized if

  • \(P \subseteq \varOmega \) (where \(\varOmega \) is defined in Section 3.1), and

  • for every \(e \in E\), if e is long and \(\lambda (e) = p^z\) then \(|z| \ge \gamma \).

Let \(E_\ell \) be the set of long edges of \(\mathcal {G}\).

Lemma 4

From a given power-compressed graph \(\mathcal {G}\) we can compute in polynomial time a normalized power-compressed graph \(\mathcal {G}^{\prime }\) such that \(F(\mathcal {G}) = F(\mathcal {G}^{\prime })\).

Proof

We first modify \(\mathcal {G}\) such that for every edge label \(\lambda (e) = p^z\) we have \(p \in \varOmega \). This can be done in polynomial time by [23, Lemma 12] which states that a given power word w over the alphabet \(\varGamma \) can be transformed in polynomial time (in fact, even in logspace) into a power word \(w'\) over the alphabet \(\varGamma \) such that (i) all periods of \(w'\) belong to \(\varOmega \) and (ii) \(w=w'\) in \(F(\varSigma )\). We finally replace every long edge e with \(\lambda (e) = p^z\) and \(|z| < \gamma \) by a simple path (or simple cycle) \(\rho \) of short edges such that \(\lambda (\rho ) = p^z\).\(\square \)
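The second step of this proof can be sketched as follows. The sketch assumes that the first step has already been carried out, so that all periods lie in \(\varOmega \) (the transformation of [23, Lemma 12] is not implemented here). The edge format, one tuple (source, period, exponent, target) per pair of inverse edges, and the iterator `fresh` supplying unused vertex names are hypothetical choices made for illustration.

```python
from itertools import count


def inv(word: str) -> str:
    """Formal inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def constants(edges):
    """Return (alpha, beta, gamma) for the periods occurring on the edges."""
    alpha = max(len(p) for _, p, _, _ in edges)
    beta = 2 * alpha - 1
    gamma = 2 * (alpha + beta)
    return alpha, beta, gamma


def expand_small_powers(edges, fresh):
    """Replace every long edge p^z with |z| < gamma by a path of short edges."""
    _, _, gamma = constants(edges)
    out = []
    for u, p, z, v in edges:
        is_short = len(p) == 1 and abs(z) == 1
        if is_short or abs(z) >= gamma:
            out.append((u, p, z, v))
            continue
        word = p * z if z > 0 else inv(p) * (-z)
        prev = u
        for i, c in enumerate(word):
            nxt = v if i == len(word) - 1 else next(fresh)
            # write an inverse letter as (generator, -1) so periods stay in Omega
            out.append((prev, c.lower(), 1 if c.islower() else -1, nxt))
            prev = nxt
    return out


edges = [(0, "ab", 2, 1), (0, "ab", 50, 2), (1, "a", 1, 2)]
print(expand_small_powers(edges, count(10)))
```

In this example \(\alpha = 2\), \(\beta = 3\) and \(\gamma = 10\), so the edge labelled \((ab)^2\) is replaced by four short edges while the edge labelled \((ab)^{50}\) stays long. Each replaced edge yields fewer than \(\gamma \alpha \) short edges, which keeps the construction polynomial.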

We say that \(\mathcal {G}\) is weakly folded if neither of the following two conditions A and B holds:

Condition A: There exist two (long or short) edges \(e_1 \ne e_2\) such that \(\iota (e_1) = \iota (e_2)\), \(\lambda (e_1) = p^{z_1}\) and \(\lambda (e_2) = p^{z_2}\) for some \(p \in P \cup P^{-1}\) and \(z_1, z_2 \in \mathbb {N} \setminus \{0\}\).

Condition B: There exist a long edge e with \(\lambda (e) = p^{z}\) and a path \(\rho \) consisting of short edges such that \(\iota (e) = \iota (\rho )\), \(\lambda (\rho ) = p\), \(p \in P \cup P^{-1}\), and \(z \in \mathbb {N} \setminus \{0\}\).

We say that \(\mathcal {G}\) is strongly folded if the graph \(\textsf{decompress}(\mathcal {G})\) is folded in the sense of Section 3.3. Clearly, if \(\mathcal {G}\) is strongly folded then \(\mathcal {G}\) is also weakly folded.

Lemma 5

A given normalized power-compressed graph \(\mathcal {G} = (V,E,\iota ,\tau , \lambda ,v_0)\) can be folded in polynomial time into a normalized and weakly folded power-compressed graph \(\mathcal {G}^{\prime }\). We have \(F(\mathcal {G}) = F(\mathcal {G}^{\prime })\).

Proof

In order to estimate the complexity of our algorithm, we use two termination parameters: the number \(|E_\ell |\) of long edges and the total number of edges |E|. The algorithm performs a sequence of folding steps that are explained below. In each step, the value \(|E_\ell |\) will not increase. If \(|E_\ell |\) does not change then |E| will not increase, but if \(|E_\ell |\) decreases then |E| may increase by at most \(\alpha \cdot (\gamma -1)\). The situation becomes difficult because it may happen that in a folding step neither \(|E_\ell |\) nor |E| changes. We distinguish the following three types of folding steps, where \(\mathcal {G} = (V,E,\iota ,\tau , \lambda , v_0)\) is the power-compressed graph before the folding step and \(\mathcal {G}^{\prime } = (V^{\prime },E^{\prime },\iota ^{\prime },\tau ^{\prime }, \lambda ^{\prime }, v_0^{\prime })\) is the power-compressed graph after the folding step.

decreasing (p-edge) fold: If condition A holds with \(z_1 = z_2\) then we can merge \(\tau (e_1)\) and \(\tau (e_2)\) into a single vertex (let us call it v) and replace the two edges \(e_1\) and \(e_2\) by a single edge from \(\iota (e_1) = \iota (e_2)\) to v with label \(p^{z_1}\).

More formally: If we define \(\equiv _V\) to be the smallest (with respect to inclusion) equivalence relation on V with \(\tau (e_1) \equiv _V \tau (e_2)\) and \(\equiv _E\) to be the smallest equivalence relation on E with \(e_1 \equiv _E e_2\) then we can identify \(V'\) (respectively, \(E^{\prime }\)) with the set of equivalence classes \(\{ [v]_{\equiv _V} :v \in V\}\) (respectively, \(\{ [e]_{\equiv _E} :e \in E\}\)). Moreover \(\iota ^{\prime }([e]_{\equiv _E}) = [\iota (e)]_{\equiv _V}\), \(\tau ^{\prime }([e]_{\equiv _E}) = [\tau (e)]_{\equiv _V}\), \(\lambda ^{\prime }([e]_{\equiv _E}) = \lambda (e)\) (all these mappings are well-defined). The surjective mapping \(\mu \) with \(\mu (v) = [v]_{\equiv _V}\) is called the merging function associated with the merging step. Note that some of (or all) the vertices \(\iota (e_1)\), \(\tau (e_1)\), \(\tau (e_2)\) can be equal.

nondecreasing (p-edge) fold: If condition A holds with (w.l.o.g.) \(z_1 < z_2\) then we can fold the two edges \(e_1\) and \(e_2\) by first setting \(V^{\prime } = V\), \(E^{\prime } = E\), \(\tau ^{\prime } = \tau \), \(\iota ^{\prime }(e_2) = \tau (e_1)\) and \(\lambda ^{\prime }(e_2) = p^{z_2-z_1}\). On all other arguments, \(\iota ^{\prime }\) (respectively, \(\lambda ^{\prime }\)) coincides with \(\iota \) (respectively, \(\lambda \)). The resulting graph \(\mathcal {G}^{\prime }\) may not be normalized, namely if \(e_2\) is long (in \(\mathcal {G}^{\prime }\)) and \(z_2 - z_1 < \gamma \). In this case we replace \(e_2\) by a simple path (or cycle, in case \(\iota ^{\prime }(e_2) = \tau ^{\prime }(e_2)\)) of fresh short edges from \(\iota ^{\prime }(e_2)\) to \(\tau ^{\prime }(e_2)\) spelling the word \(p^{z_2-z_1}\). Note that we have \(V \subseteq V^{\prime }\). We define the merging function \(\mu : V \rightarrow V^{\prime }\) as the canonical inclusion mapping.

nondecreasing (p-path) fold: If the situation in condition B occurs, then we first set \(V^{\prime } = V\), \(E^{\prime } = E\), \(\tau ^{\prime } = \tau \), \(\iota ^{\prime }(e) = \tau (\rho )\) and \(\lambda ^{\prime }(e) = p^{z - 1}\). On all other arguments, \(\iota ^{\prime }\) (respectively, \(\lambda ^{\prime }\)) coincides with \(\iota \) (respectively, \(\lambda \)). If \(z - 1 < \gamma \) then we replace in \(\mathcal {G}^{\prime }\) the edge e by a simple path (or cycle) of short fresh edges spelling the word \(p^{z-1}\). Again we define the merging function \(\mu : V \rightarrow V^{\prime }\) as the canonical inclusion mapping.
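The effect of the two nondecreasing folds on a single edge can be sketched in a few lines of Python. Edges are represented here as tuples (source, target, p, z), which is an assumption of this example, as are the function names; the subsequent replacement of an edge with exponent below \(\gamma \) by short edges is not modelled.

```python
def edge_fold(e1, e2):
    """Nondecreasing p-edge fold for edges with common source and 0 < z1 < z2."""
    (u1, v1, p1, z1), (u2, v2, p2, z2) = e1, e2
    if not (u1 == u2 and p1 == p2 and 0 < z1 < z2):
        raise ValueError("condition A with z1 < z2 does not hold")
    # e1 is kept; e2 now starts at the target of e1 and loses z1 copies of p.
    return e1, (v1, v2, p2, z2 - z1)


def path_fold(e, path_end):
    """Nondecreasing p-path fold: the edge e is shortened by one copy of p.

    path_end is the target of the path of short edges that spells p.
    """
    u, v, p, z = e
    if z < 2:
        raise ValueError("expected an edge with exponent at least 2")
    return (path_end, v, p, z - 1)


print(edge_fold(("u", "v1", "ab", 5), ("u", "v2", "ab", 12)))
print(path_fold(("u", "v", "ab", 12), "w"))
```

In both cases the label read from the old source of the modified edge to its target is unchanged: \(p^{z_1} p^{z_2-z_1} = p^{z_2}\) and \(p \, p^{z-1} = p^z\).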

Note that each of the above folding steps simulates several folding steps in the corresponding uncompressed graph. Figure 1 shows some folding steps:

  • (a) to (b): nondecreasing p-path fold (where \(\rho \) is the path that is inverse to the red path labelled with ab)

  • (b) to (c): decreasing p-edge fold

  • (c) to (d): nondecreasing q-edge fold (the \(q^6\)-labelled edge coils once around the \(q^5\)-labelled loop and the remaining q-labelled edge is replaced by the two short edges labelled with a and c).

  • (d) to (e): nondecreasing q-path fold

  • (e) to (f): decreasing a-edge fold

Fig. 1 Some folding steps, where \(p = ab \in \varOmega \) and \(q = ac \in \varOmega \). We assume that \(\gamma =4\) and that all inverse edges are implicitly present. The edges involved in the folding steps are red; dotted arrows only indicate the direction of foldings and are not part of the graph. The final graph is weakly folded; in fact it is also strongly folded.

Assume we make a sequence of k folding steps, where \(\mathcal {G}\) is the initial graph, \(\mathcal {G}^{\prime }\) is the final graph and \(\mu _i\) (\(1 \le i \le k\)) is the merging function for the i-th folding step. Then we can define the composition \(\mu = \mu _1 \circ \mu _2 \circ \cdots \circ \mu _k\) (where \(\mu _1\) is applied first); it maps every vertex v of \(\mathcal {G}\) to a vertex \(\mu (v)\) of \(\mathcal {G}'\). We then say that vertex v is mapped to vertex \(\mu (v)\) during the folding. For two vertices u and v of \(\mathcal {G}\) with \(\mu (u)=\mu (v)\) we say that u and v are merged during the folding.

Note that every folding step preserves the property of being normalized and that \(|E_\ell |\) never increases. Clearly, a decreasing fold decreases |E| (and possibly \(|E_\ell |\) in case \(e_1\) and \(e_2\) are long edges). Therefore, we can always perform decreasing folds if possible. A nondecreasing fold can reduce the number of long edges in which case the number of short edges increases by at most \(\alpha \cdot (\gamma -1)\). If a nondecreasing fold does not reduce the number of long edges then both |E| and \(|E_\ell |\) stay the same. Hence, the total number of decreasing folds is bounded by \(|E| + \alpha \cdot (\gamma -1) \cdot |E_\ell |\). Bounding the number of nondecreasing folds is not so easy. If we just iteratively fold then we may obtain an exponential running time. In order to ensure termination in polynomial time, we arrange the folding steps as follows: Assume that \(P = \{p_1, p_2, \ldots , p_n\}\). We say that the current graph is folded with respect to \(p_j\) if neither condition A nor condition B holds with \(p = p_j\). For the following algorithm it is useful to consider the graph \(\mathcal {G}_p\) where the edge set of \(\mathcal {G}_p\) contains all long edges from E that are labelled with a power of p. In addition, \(\mathcal {G}_p\) contains a p-labelled edge from u to v if \(\mathcal {G}\) contains a path \(\rho \) of short edges from u to v and such that \(\lambda (\rho ) = p\) (note that \(\mathcal {G}_p\) is in general not normalized). Such an edge should be only viewed as an abbreviation of the corresponding path \(\rho \) (which is unique if no decreasing folds are possible in \(\mathcal {G}\)).

The main structure of the folding algorithm is shown in Algorithm 1. In the following, we always perform decreasing folds when possible without mentioning this explicitly.

[Algorithm 1: main structure of the folding algorithm]
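The listing of Algorithm 1 is not reproduced in this text. One control flow that is consistent with the surrounding description (a while-loop over an index i, folding with respect to \(p_i\), and an index that may decrease again) is sketched below. This is a hypothetical reconstruction and not the authors' listing; the callbacks `fold_wrt` and `is_folded_wrt` are placeholders for the procedures described in the text.

```python
def fold_all(graph, periods, fold_wrt, is_folded_wrt):
    """Fold with respect to every period, revisiting earlier periods if needed.

    fold_wrt(graph, p) returns a graph that is folded with respect to p;
    is_folded_wrt(graph, p) tests conditions A and B for the period p.
    """
    i = 0
    while i < len(periods):
        graph = fold_wrt(graph, periods[i])
        # Folding for periods[i] may destroy foldedness for an earlier period.
        broken = [j for j in range(i) if not is_folded_wrt(graph, periods[j])]
        i = broken[0] if broken else i + 1
    return graph
```

With this control flow the loop terminates as soon as the graph is folded with respect to all periods simultaneously, and every decrease of i must be paid for by progress in |E| or \(|E_\ell |\), as argued in the remainder of the proof.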

We now explain how to fold the current graph \(\mathcal {G}\) with respect to some \(p = p_i\) (line 3 of Algorithm 1). We consider each connected component of the graph \(\mathcal {G}_p\) separately. For the following consideration, we can assume that \(\mathcal {G}_p\) is connected. We claim that \(\mathcal {G}_p\) can be folded either into a simple oriented path or a simple oriented cycle. Moreover, if \(\mathcal {G}_p\) is a tree then it is folded into a simple oriented path. The case that \(\mathcal {G}_p\) consists of a single edge is clear. If \(\mathcal {G}_p\) has more than one edge then we consider the following cases.

Case 1. \(\mathcal {G}_p\) is a tree: Choose an edge e with \(\iota (e) = u\) and \(\tau (e) = v\) where v is a leaf. Let \(\mathcal {G}'\) be the connected graph obtained from \(\mathcal {G}_p\) by removing \(e, e^{-1}\) and v. By induction, \(\mathcal {G}'\) can be folded into a simple oriented path \(\rho = [v_1, e_1, v_2, e_2, \ldots , v_{k}, e_k, v_{k+1}]\), where w.l.o.g. \(\lambda (e_i) = p^{a_i}\) with \(a_i > 0\) for all i. Let \(v_i\) be the vertex to which \(u = \iota (e)\) is mapped during the folding. Assume that \(\lambda (e) = p^b\) with \(b > 0\) (the case \(b < 0\) is analogous). If there exists \(j \ge i\) such that \(b = a_{i} + \cdots + a_{j}\) then nothing has to be done (the vertex v is mapped to \(v_{j+1}\) during the folding and the edges e and \(e^{-1}\) are removed). If there is no such j then we have to add a vertex to the path: if there is \(j \ge i\) such that \(a_{i} + \cdots + a_{j-1}< b < a_{i} + \cdots + a_{j}\) then we replace the edge \(e_j\) by an edge from \(v_{j}\) to a fresh vertex \(v'\) and an edge from \(v'\) to \(v_{j+1}\). The label of the first edge is \(p^{b - (a_{i} + \cdots + a_{j-1})}\) and the label of the second edge is \(p^{a_{i} + \cdots + a_{j}-b}\). If \(a_{i} + \cdots + a_{k} < b\) then we add an edge from \(v_{k+1}\) to the new vertex \(v'\) with label \(p^{b - (a_{i} + \cdots + a_{k})}\). In both cases the vertex \(v = \tau (e)\) is mapped to the new vertex \(v'\) during the folding. The resulting graph is an oriented path.
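The arithmetic of case 1 can be sketched as follows. The path is represented only by its list of exponents, vertices are numbered from 0 instead of 1, and only the case \(b > 0\) is treated; the function name `attach_to_path` and this representation are assumptions of the example.

```python
def attach_to_path(a, i, b):
    """Attach an edge with label p^b (b > 0) at vertex i of an oriented path.

    a[j] > 0 is the exponent of the path edge from vertex j to vertex j + 1.
    Returns the new exponent list and the index of the vertex to which the
    target of the attached edge is mapped.
    """
    total = 0
    for j in range(i, len(a)):
        if total + a[j] == b:            # target coincides with vertex j + 1
            return list(a), j + 1
        if total + a[j] > b:             # split edge j at a fresh vertex
            first = b - total
            return a[:j] + [first, a[j] - first] + a[j + 1:], j + 1
        total += a[j]
    # b exceeds the remaining path: append an edge to a fresh last vertex
    return a + [b - total], len(a) + 1


print(attach_to_path([3, 4, 5], 0, 7))    # ([3, 4, 5], 2)
print(attach_to_path([3, 4, 5], 0, 5))    # ([3, 2, 2, 5], 2)
print(attach_to_path([3, 4, 5], 1, 20))   # ([3, 4, 5, 11], 4)
```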

Case 2. \(\mathcal {G}_p\) is not a tree. Then we choose an edge e such that \(\mathcal {G}^{\prime } := \mathcal {G}_p \setminus e\) (the graph obtained from \( \mathcal {G}_p\) by removing the edges e and \(e^{-1}\)) is still connected. By induction, we obtain the following two cases.

Case 2.1. \(\mathcal {G}^{\prime }\) is folded into a simple oriented path

$$\rho = [v_1, e_1, v_2, e_2, \ldots , v_{k}, e_k, v_{k+1}],$$

where w.l.o.g. \(\lambda (e_i) = p^{a_i}\) with \(a_i> 0\) for all i. Let \(v_i\) (respectively, \(v_l\)) be the vertex to which \(\iota (e)\) (respectively, \(\tau (e)\)) is mapped during the folding and let \(\lambda (e) = p^b\) with \(b > 0\). We proceed as in case 1. In case there exists \(j \ge i\) with \(b = a_{i} + \cdots + a_j\) then we additionally merge \(v_{j+1}\) and \(v_l\). We may have already \(v_{j+1} = v_l\) in which case we end up with a simple oriented path. Otherwise we obtain a simple oriented path with a simple oriented cycle attached to it. If there is no \(j \ge i\) with \(b = a_{i} + \cdots + a_j\) then we add a new vertex \(v'\) to the path as in case 1 and merge \(v'\) with \(v_l\). This yields again a simple oriented path with a simple oriented cycle attached to it. We then fold the two ends of the simple path onto the cycle (by coiling them around the cycle) and obtain a simple oriented cycle.

Case 2.2. \(\mathcal {G}^{\prime }\) is folded into a simple oriented cycle \(\mathcal {C}\). We proceed analogously to case 2.1. We either obtain a single simple oriented cycle or two simple oriented cycles \(\rho _1\) and \(\rho _2\) that are glued together in a single vertex v (to see this, one can first remove an arbitrary edge from the cycle \(\mathcal {C}\), which yields a simple oriented path, then carry out the construction from case 2.1 and finally add the removed edge again). Such a pair of cycles can be replaced by a single cycle as follows: Let \(\lambda (\rho _1) = p^{z_1}\) and \(\lambda (\rho _2) = p^{z_2}\) with \(z_1, z_2 > 0\). Then one can replace the two cycles by a single cycle \(\rho \) with \(\lambda (\rho ) = p^z\), where \(z = \gcd (z_1,z_2)\). Folding the cycles into a single cycle actually corresponds to Euclid’s algorithm (see Footnote 3). Of course, we also have to map the vertices of \(\rho _1\) and \(\rho _2\) into the cycle \(\rho \). For this we start with a \(p^z\)-labelled loop at vertex v. If \(v^{\prime } \ne v\) is a vertex belonging to, say, \(\rho _1\) and the simple path from v to \(v^{\prime }\) on the cycle \(\rho _1\) is labelled with \(p^y\), \(y > 0\), then we compute \(r := y \bmod z\) and subdivide the loop into an edge from v to \(v^{\prime }\) with label \(p^r\) and an edge from \(v'\) back to v with label \(p^{z-r}\). We continue in this way with the other vertices on \(\rho _1\) and \(\rho _2\).
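The bookkeeping for two cycles glued at v can be sketched as follows. The input format (a dictionary that maps every vertex other than v to the exponent y of the path from v to it) and the function name `merge_cycles` are assumptions of this example. Vertices with equal residue, in particular vertices with residue 0 and v itself, are treated as merged; this is our reading of the construction.

```python
from math import gcd


def merge_cycles(z1, z2, offsets):
    """Replace cycles labelled p^z1 and p^z2 at a common vertex by one cycle.

    offsets maps each vertex v' != v to the exponent y > 0 of the path from v
    to v' on its cycle. Returns the exponent z of the new cycle, the position
    of every vertex on it, and the exponents of the consecutive cycle edges.
    """
    z = gcd(z1, z2)
    position = {vertex: y % z for vertex, y in offsets.items()}
    points = sorted(set(position.values()) | {0})      # 0 is the vertex v
    exponents = [q - r for r, q in zip(points, points[1:] + [z])]
    return z, position, exponents


print(merge_cycles(6, 9, {"x": 2, "y": 7}))
# (3, {'x': 2, 'y': 1}, [1, 1, 1])
```

The exponents of the resulting cycle edges always sum to z.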

Let the power-compressed graph \(\mathcal {H}_p\) be the outcome of the above procedure. It is a disjoint union of simple oriented paths and simple oriented cycles and hence folded with respect to p. The running time of the computations in cases 1 and 2 is polynomial in \(\Vert \mathcal {G}_p\Vert \) and due to the recursion this running time has to be charged for every edge of \(\mathcal {G}_p\). Recall that edges labelled with p in \(\mathcal {H}_p\) actually correspond to paths of short edges in the original graph \(\mathcal {G}\). This concludes the description of line 3 in Algorithm 1.

It remains to argue that we make only polynomially many iterations of the while-loop in Algorithm 1. For this assume that the current graph (call it \(\mathcal {G}'\)) is folded with respect to \(p_i\) and that we fold the graph with respect to some \(p_j\) with \(j > i\). Let us denote the sequence of folding steps with respect to \(p_j\) with \(\mathcal {F}_j\) and let \(\mathcal {G}^{\prime \prime }\) be the graph after the execution of \(\mathcal {F}_j\). Moreover, assume that \(\mathcal {G}^{\prime \prime }\) is no longer folded with respect to \(p_i\). We argue that this implies that during the execution of \(\mathcal {F}_j\) we made progress in the sense that |E| or \(|E_\ell |\) decreases. Since \(\mathcal {G}^{\prime }\) is folded with respect to \(p_i\) but \(\mathcal {G}^{\prime \prime }\) is not, we must have \(\mathcal {G}^{\prime }_{p_i} \ne \mathcal {G}^{\prime \prime }_{p_i}\). But this implies that |E| or \(|E_\ell |\) must decrease during \(\mathcal {F}_j\). Otherwise we only make nondecreasing \(p_j\)-edge and \(p_j\)-path folds that do not eliminate long edges. Such folds only change the source and target vertices of \(p_j^z\)-labelled long edges, which does not modify the graph \(\mathcal {G}^{\prime }_{p_i}\).

Since we have already bounded the number of decreasing folds by \(|E| + \alpha \cdot (\gamma -1) \cdot |E_\ell |\) and the number of long edges never increases, the index i in Algorithm 1 can only decrease a polynomial number of times (more precisely: \(|E| + \alpha \cdot \gamma \cdot |E_\ell |\) times). This shows that Algorithm 1 works in polynomial time and concludes the proof of Lemma 5.

It remains to convert a weakly folded power-compressed graph in polynomial time into a strongly folded power-compressed graph. The general idea is the following. Let \(\mathcal {G}\) be a normalized and weakly folded power-compressed graph. Recall that \(\textsf{decompress}(\mathcal {G})\) is obtained from \(\mathcal {G}\) by replacing every long edge e with label \(p^z\) by a simple path (or simple cycle) \(\rho \) with \(\iota (e)=\iota (\rho )\), \(\tau (e)=\tau (\rho )\) and \(\lambda (\rho ) = p^z\). We show that any sequence of folding steps in \(\textsf{decompress}(\mathcal {G})\) can only affect a short initial and final part of this path \(\rho \). Hence, it suffices to partially decompress \(\mathcal {G}\) and then fold short edges as long as possible.

Let us be a bit more precise: We will show that in the above situation, vertices in \(\textsf{decompress}(\mathcal {G})\) that neither belong to the prefix of \(\rho \) labelled with \(p^{\gamma /2}\) nor to the suffix of \(\rho \) labelled with \(p^{\gamma /2}\) – later such vertices will be called protected – cannot be merged with other vertices during a sequence of folding steps starting in \(\textsf{decompress}(\mathcal {G})\) (recall the definition of \(\beta \) and \(\gamma = 2(\alpha +\beta )\) from the beginning of Section 3.4). For this we need the following simple lemma:

Lemma 6

Let \(\mathcal {H}\) be an uncompressed graph and assume that \(\mathcal {H}\) is folded into \(\mathcal {H}^{\prime }\) by a sequence of folding steps. If thereby two vertices u and v of \(\mathcal {H}\) are merged to a single vertex of \(\mathcal {H}^{\prime }\), then there must exist a path \(\rho \) without backtracking in \(\mathcal {H}\) from u to v such that \(\lambda (\rho ) = \varepsilon \) in \(F(\varSigma )\) (see Footnote 4).

Proof

It suffices to find a path \(\rho \) from u to v such that \(\lambda (\rho ) = \varepsilon \) in \(F(\varSigma )\). By removing subpaths \([u^{\prime }, e, v^{\prime }, e^{-1}, u^{\prime }]\) from \(\rho \) we obtain a path \(\rho '\) without backtracking and such that \(\lambda (\rho ^{\prime }) = \varepsilon \) still holds in \(F(\varSigma )\). The existence of such a path can be shown by a straightforward induction over the number of folding steps from \(\mathcal {H}\) to \(\mathcal {H}^{\prime }\). Note that if two different vertices \(v_1\) and \(v_2\) of an uncompressed graph are merged in a single folding step, then there exist two different edges \(e_1 \ne e_2\) such that \(\iota (e_1)=\iota (e_2)\), \(\tau (e_1)=v_1\), \(\tau (e_2)=v_2\), and \(\lambda (e_1) = \lambda (e_2) = a\) for some \(a \in \varGamma \). Hence, the path \(\rho ' = [v_1, e_1^{-1}, \iota (e_1), e_2, v_2]\) satisfies \(\lambda (\rho ') = a^{-1} a = \varepsilon \) in \(F(\varSigma )\). \(\square \)

Due to Lemma 6 it will suffice to show that a non-empty path without backtracking in \(\textsf{decompress}(\mathcal {G})\) that starts in a protected vertex is labelled with a word w such that \(w \ne \varepsilon \) in \(F(\varSigma )\). Since \(\mathcal {G}\) is normalized and weakly folded, it will turn out that this word w must be a prefix of an \(\omega \)-word from the following set \(\mathcal {L} \subseteq \varGamma ^\omega \): The set \(\mathcal {L}\) consists of all \(\omega \)-words of the form

$$\begin{aligned} s p_1^{z_1} w_1 p_2^{z_2} w_2 p_{3}^{z_{3}} w_3 p_{4}^{z_{4}} w_4\cdots \end{aligned}$$
(1)

such that the following properties hold for all \(i \ge 1\):

  • \(p_i \in \varOmega \cup \varOmega ^{-1}\),

  • s is a suffix of \(p_1\),

  • \(w_i \in \textsf{red}(\varGamma ^*) \setminus (p_i^{-1}\varGamma ^* \cup \varGamma ^* p_{i+1}^{-1})\),

  • \(z_1 \ge \alpha +\beta = \gamma /2\) and \(z_i \ge \gamma \) if \(i \ge 2\),

  • if \(w_i = \varepsilon \), then \(p_i \ne p_{i+1}^{-1}\).

By our previous discussion, the following lemma is crucial:

Lemma 7

Every non-empty prefix w of an \(\omega \)-word from \(\mathcal {L}\) satisfies \(w \ne \varepsilon \) in \(F(\varSigma )\), i.e., \(\textsf{red}(w) \ne \varepsilon \).

In order to prove Lemma 7, the following technical lemma turns out to be useful. It ensures that in a factor \(p_i^{\alpha +\beta } w_i p_{i+1}^{\alpha +\beta }\) in (1) not too much cancellation happens. More precisely, it allows us to show that \(\textsf{red}(p_i^{\alpha +\beta } w_i p_{i+1}^{\alpha +\beta })\) starts with \(p_i\) and ends with \(p_{i+1}\).

Lemma 8

Let \(p \in \varOmega \cup \varOmega ^{-1}\) and assume that \(v \in \varGamma ^*\) satisfies one of the following two conditions:

  (i) \(v \in \textsf{red}(\varGamma ^*) \setminus p^{-1} \varGamma ^*\)

  (ii) \(v = w q'\), where \(q' \ne \varepsilon \) is a prefix of \(q^{\alpha +\beta }\) for some \(q \in \varOmega \cup \varOmega ^{-1}\), \(w \in \textsf{red}(\varGamma ^*) \setminus (p^{-1} \varGamma ^* \cup \varGamma ^* q^{-1})\), and if \(w = \varepsilon \) then \(p \ne q^{-1}\).

Then p is a prefix of \(\textsf{red}(p^{\alpha +\beta }v)\). In addition, if case (ii) holds and \(q' = q^{\alpha +\beta }\) then q is a suffix of \(\textsf{red}(p^{\alpha +\beta }v)\).
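A quick sanity check of the statement on two concrete inputs can be done with a stack-based free reduction. The sketch assumes the periods \(p = ab\) and \(q = ac\) from Fig. 1, takes \(\alpha = 2\) (so \(\alpha + \beta = 5\)), and writes inverse letters as uppercase characters; these conventions and the function name `red` as a Python helper are choices made for this example only.

```python
def red(word):
    """Freely reduce a word; uppercase letters are inverses of lowercase ones."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


alpha = 2
beta = 2 * alpha - 1
p, q = "ab", "ac"

# Case (i): v = "Ba" is freely reduced and does not start with p^-1 = "BA".
case_i = red(p * (alpha + beta) + "Ba")
print(case_i, case_i.startswith(p))

# Case (ii): w = "B" neither starts with p^-1 nor ends with q^-1 = "CA".
case_ii = red(p * (alpha + beta) + "B" + q * (alpha + beta))
print(case_ii.startswith(p), case_ii.endswith(q))
```

In both cases only a single letter of \(p^{\alpha +\beta }\) is cancelled, so the first copy of p survives.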

Proof

If \(v \in \textsf{red}(\varGamma ^*) \setminus p^{-1} \varGamma ^*\) then p is a prefix of \(\textsf{red}(p^{\alpha +\beta }v)\) (note that \(\alpha +\beta \ge 2\)).

Now assume that (ii) holds, i.e., \(v = w q'\) where \(q' \ne \varepsilon \) is a prefix of \(q^{\alpha +\beta }\) for some \(q \in \varOmega \cup \varOmega ^{-1}\), \(w \in \textsf{red}(\varGamma ^*) \setminus (p^{-1} \varGamma ^* \cup \varGamma ^* q^{-1})\), and if \(w = \varepsilon \), then \(p \ne q^{-1}\). It suffices to show that p is a prefix of \(\textsf{red}(p^{\alpha +\beta }wq')\). Then by symmetry, q is a suffix of \(\textsf{red}(p^{\alpha +\beta }wq^{\alpha +\beta })\).

Since \(p^{\alpha +\beta }\), w and \(q'\) are freely reduced, cancellations can only occur at the two borders between \(p^{\alpha +\beta }\), w and \(q'\). Let us start to reduce the word \(p^{\alpha +\beta } w q'\). Since \(p^{-1}\) is not a prefix of w and \(q^{-1}\) is not a suffix of w, the reductions at the two borders can only consume \(|p|-1 \le \alpha -1\) symbols from the prefix of w and \(|q|-1\le \alpha -1\) symbols from the suffix of w. If w is not completely cancelled during the reduction, we obtain a freely reduced word of the form \(p^{\alpha +\beta -1} r s q''\), where r is a non-empty prefix of p, s is a non-empty factor of w, and \(q''\) is a possibly empty factor of \(q^{\alpha +\beta }\). Thus, p is indeed a prefix of \(\textsf{red}(p^{\alpha +\beta }wq') = p^{\alpha +\beta -1} r s q''\).

Let us now assume that w is completely cancelled during the reduction. Since w is freely reduced, we obtain factorizations \(w = u^{-1} t^{-1}\), \(p = r u\), and \(q = t s\). Moreover, \(q' = t q''\) and \(p^{\alpha +\beta } w q'\) is reduced to \(p^{\alpha +\beta -1} r q''\). Now the word \(p^{\alpha +\beta -1} r q''\) can be further reduced at the border between the freely reduced words \(p^{\alpha +\beta -1} r\) and \(q''\). If \(|q''| < \alpha \) then the reduction can continue for at most \(\alpha -1\) steps. Then, the free reduction of \(p^{\alpha +\beta } w q'\) consumes from \(p^{\alpha +\beta }\) only a suffix of length at most \(2(\alpha -1) < \alpha + \beta \). Hence, the first copy of p survives.

We can therefore assume that \(|q''| \ge \alpha \). This allows us to write \(q'' = s q^k s'\) (recall that \(q=ts\) and that the prefix t of \(q'\) was cancelled), where \(k \ge 0\) and \(s'\) is a prefix of q. We distinguish several cases:

  • \(p \ne q^{-1}\): then by Lemma 1 the reduction of \(p^{\alpha +\beta -1} r s q^k s'\) can proceed for at most \(|p|+|q|-2 < \beta \) steps.

  • \(p = q^{-1}\) and \(|r| \ne |s|\): then by Lemma 2 the reduction of \(p^{\alpha +\beta -1} r s q^k s'\) can proceed for at most \(|p|-1 < \alpha \le \beta \) steps.

  • \(p = q^{-1}\) and \(|r| = |s|\): we obtain \(p = ru\) and \(p^{-1} = ts\), i.e., \(ru = s^{-1} t^{-1}\). Since \(|r| = |s| = |s^{-1}|\) we have \(r = s^{-1}\) and \(u = t^{-1}\). Therefore \(w = u^{-1} t^{-1} = u^{-1} u\). Since \(w \in \textsf{red}(\varGamma ^*)\), we must have \(w= \varepsilon \). Together with \(p = q^{-1}\) this yields a contradiction to the assumptions of the lemma.

In total, during the free reduction of \(p^{\alpha +\beta } w q'\) only a suffix of \(p^{\alpha +\beta }\) of length \(< \alpha + \beta \) is cancelled. Hence, the first copy of p is not cancelled. This concludes the proof of the lemma. \(\square \)

We can now prove Lemma 7.

Proof of Lemma 7

Let w be a non-empty prefix of an \(\omega \)-word from \(\mathcal {L}\). We can write w as

$$ w = s \, p_1^{n_1} \prod _{i=1}^k (p_i^{\alpha +\beta } w_i p_{i+1}^{\alpha +\beta } \, p_{i+1}^{n_{i+1}}) \, t $$

such that \(k \ge 0\) and for all i in the proper range we have

  • \(n_i \ge 0\),

  • \(p_i \in \varOmega \cup \varOmega ^{-1}\),

  • s is a suffix of \(p_1\),

  • \(w_i \in \textsf{red}(\varGamma ^*) \setminus (p_i^{-1}\varGamma ^* \cup \varGamma ^* p_{i+1}^{-1})\),

  • if \(w_i = \varepsilon \), then \(p_i \ne p^{-1}_{i+1}\).

Moreover, for the word t one of the following cases must hold:

  • t is a prefix of \(p_{k+1}\),

  • \(t = p_{k+1}^{\alpha +\beta } w_{k+1}\) with \(w_{k+1} \in \textsf{red}(\varGamma ^*) \setminus p_{k+1}^{-1}\varGamma ^*\),

  • \(t = p_{k+1}^{\alpha +\beta } w_{k+1} v\) with v a non-empty proper prefix of a word \(q^{\alpha +\beta }\) for some \(q\in \varOmega \cup \varOmega ^{-1}\), \(w_{k+1} \in \textsf{red}(\varGamma ^*) \setminus (p_{k+1}^{-1}\varGamma ^* \cup \varGamma ^* q^{-1})\), and if \(w_{k+1} = \varepsilon \) then \(p_{k+1} \ne q^{-1}\).

By Lemma 8 every word \(\textsf{red}(p_i^{\alpha +\beta } w_i p_{i+1}^{\alpha +\beta })\) starts with \(p_i\) and ends with \(p_{i+1}\). Moreover, \(\textsf{red}(t)\) is a prefix of \(p_{k+1}\) or, by Lemma 8, starts with \(p_{k+1}\). This implies that

$$ \textsf{red}(w) = s \, p_1^{n_1} \prod _{i=1}^k (\textsf{red}(p_i^{\alpha +\beta } w_i p_{i+1}^{\alpha +\beta }) \, p_{i+1}^{n_{i+1}})\, \textsf{red}(t) \ne \varepsilon . $$

This concludes the proof of the lemma. \(\square \)

Consider now a normalized and weakly folded power-compressed graph \(\mathcal {G}\). Recall that \(\textsf{decompress}(\mathcal {G})\) is obtained from \(\mathcal {G}\) by replacing every long edge e with label \(p^z\) by a simple path (or simple cycle) \(\rho \) with \(\iota (e)=\iota (\rho )\), \(\tau (e)=\tau (\rho )\) and \(\lambda (e)=\lambda (\rho )\). The vertices of \(\textsf{decompress}(\mathcal {G})\) that are not already in \(\mathcal {G}\) (i.e., the inner vertices of the paths that replace the long edges) are also called the fresh vertices of \(\textsf{decompress}(\mathcal {G})\). We say that a fresh vertex v of \(\textsf{decompress}(\mathcal {G})\) is protected if the following hold: let e be the long edge of \(\mathcal {G}\) such that v is an inner vertex of the path \(\rho \) that replaces e. Let \(\lambda (e) = \lambda (\rho )= p^z\), where \(p \in \varOmega \cup \varOmega ^{-1}\) and \(z \ge \gamma \). Then the path \(\rho \) can be split into two subpaths \(\rho _1\) and \(\rho _2\) such that \(\rho _1\) is a simple path from \(\iota (e)\) to v and \(\rho _2\) is a simple path from v to \(\tau (e)\). Then v is protected if \(p^{\alpha +\beta } = p^{\gamma /2}\) is a prefix of \(\lambda (\rho _1)\) and a suffix of \(\lambda (\rho _2)\). Intuitively, v is not too close to the two end points \(\iota (e)\) and \(\tau (e)\).
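Since \(\lambda (\rho _1)\) is a prefix and \(\lambda (\rho _2)\) is a suffix of \(p^z\), the definition of a protected vertex reduces to a comparison of lengths. The following sketch makes this explicit; the parameter names and the convention that a fresh vertex is identified by the number t of letters on \(\rho _1\) are assumptions of this example.

```python
def is_protected(p_len, z, t, alpha):
    """Decide whether the fresh vertex at distance t from iota(e) is protected.

    p_len is |p|, the long edge e carries p^z with z >= gamma, and t with
    0 < t < p_len * z is the number of letters on the sub-path rho_1.
    """
    beta = 2 * alpha - 1
    margin = p_len * (alpha + beta)          # length of p^(alpha + beta)
    return margin <= t <= p_len * z - margin


# |p| = 2, alpha = 2, hence gamma = 10; for z = gamma only the middle vertex
# of the decompressed path is protected.
print([t for t in range(1, 20) if is_protected(2, 10, t, 2)])   # [10]
```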

Lemma 9

Let \(\mathcal {G}\) be a normalized and weakly folded power-compressed graph and let v be a fresh and protected vertex of \(\textsf{decompress}(\mathcal {G})\). Let \(\rho \) be a non-empty path without backtracking in \(\textsf{decompress}(\mathcal {G})\) that starts in v, i.e., \(\iota (\rho ) = v\). Then \(\lambda (\rho ) \ne \varepsilon \) in \(F(\varSigma )\).

Proof

Let e be the edge in \(\mathcal {G}\) such that decompressing e produces v and let \(\rho '\) be the simple path/cycle that replaces e. Let \(\lambda (e) = \lambda (\rho ') = p^z\) with \(p \in \varOmega \cup \varOmega ^{-1}\) and \(z \ge \gamma \). If \(\rho \) is a simple subpath of \(\rho '\) then \(\lambda (\rho )\) is a non-empty factor of \(p^z\) and therefore freely reduced.

Now assume that \(\rho \) is not a simple subpath of \(\rho '\). By Lemma 7 it suffices to show that \(\lambda (\rho )\) is a non-empty prefix of an \(\omega \)-word from \(\mathcal {L}\). The path \(\rho \) has to leave the path \(\rho '\) via \(\iota (e)\) or \(\tau (e)\). In both cases we can factorize \(\lambda (\rho )\) as \(\lambda (\rho ) = s p_1^{z_1} w\), where \(p_1 \in \{p,p^{-1}\}\), \(z_1 \in \mathbb {N}\), and \(s p_1^{z_1}\) is a suffix of \(p^z\) or a suffix of \(p^{-z}\). Moreover, since the vertex v is protected we must have \(z_1 \ge \gamma /2\).

The remaining word w can be factorized as

$$w = w_1 p_2^{z_2} w_2 p_{3}^{z_{3}} \cdots w_{k-1} p_{k}^{z_{k}} w_k t$$

where every \(p_i^{z_i}\) is the label of a long edge of \(\mathcal {G}\) (hence, \(p_i \in \varOmega \cup \varOmega ^{-1}\) and \(z_i \ge \gamma \)) and every \(w_i\) is the label of a path consisting of short edges in \(\mathcal {G}\). For the word t, there are two cases:

  • \(t = \varepsilon \) or

  • \(t \ne \varepsilon \) arises from a long edge \(e'\) of \(\mathcal {G}\), in which case t is a non-empty prefix of \(\lambda (e')\). Hence, t is a non-empty prefix of a word \(p_{k+1}^{z_{k+1}}\) for some \(p_{k+1} \in \varOmega \cup \varOmega ^{-1}\).

Since \(\mathcal {G}\) is weakly folded, the following conditions hold:

  • \(w_i \in \textsf{red}(\varGamma ^*)\) (since \(\rho \) is without backtracking and the situation from condition A does not occur in \(\mathcal {G}\)),

  • \(w_i \notin p_i^{-1}\varGamma ^*\) and \(w_i \notin \varGamma ^* p_{i+1}^{-1}\) if \(p_{i+1}\) exists (since the situation from condition B does not occur in \(\mathcal {G}\)),

  • if \(w_i = \varepsilon \) and \(p_{i+1}\) exists, then \(p_i \ne p^{-1}_{i+1}\) (since \(\rho \) is without backtracking and the situation from condition A does not occur in \(\mathcal {G}\)).

This shows that \(\lambda (\rho )\) is a prefix of an \(\omega \)-word from \(\mathcal {L}\). \(\square \)

Lemma 10

A given normalized and weakly folded power-compressed graph \(\mathcal {G}\) can be folded in polynomial time into a strongly folded power-compressed graph \(\mathcal {G}'\). We have \(F(\mathcal {G}) = F(\mathcal {G}')\).

Proof

We first construct a power-compressed graph \(\mathcal {H}\) by partially decompressing \(\mathcal {G}\). Consider a long edge e in \(\mathcal {G}\). Let \(\iota (e) = u\), \(\tau (e)=v\) and \(\lambda (e) = p^z\) with \(p \in \varOmega \cup \varOmega ^{-1}\) and \(z \ge \gamma \). We then replace e by

  • a simple path \(\rho _1\) of new short edges going from u to a new vertex \(u'\) and such that \(\lambda (\rho _1) = p^{\gamma /2}= p^{\alpha +\beta }\),

  • a new edge from \(u'\) to another new vertex \(v'\) with label \(p^{z-\gamma }\) (if \(z=\gamma \) then \(u'=v'\) and the new edge is not needed), and

  • a simple path \(\rho _2\) of new short edges going from \(v'\) to v and such that \(\lambda (\rho _2) = p^{\gamma /2} = p^{\alpha +\beta }\).

The power-compressed graph \(\mathcal {H}\) is not necessarily normalized (this is not needed).
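The partial decompression of a single long edge can be sketched as follows. The function name `partially_decompress` and the return format (two explicit words and a compressed middle part) are assumptions of this example; vertex names are omitted.

```python
def partially_decompress(p, z, gamma):
    """Split p^z (z >= gamma, gamma even) into p^(gamma/2), p^(z-gamma), p^(gamma/2).

    The outer parts are returned as explicit words (paths of short edges),
    the middle part as a pair (p, z - gamma), or None if z == gamma.
    """
    if z < gamma or gamma % 2 != 0:
        raise ValueError("expected z >= gamma and an even gamma")
    half = gamma // 2
    middle = (p, z - gamma) if z > gamma else None
    return p * half, middle, p * half


print(partially_decompress("ab", 12, 10))
# ('ababababab', ('ab', 2), 'ababababab')
```

Only the two outer parts are written out explicitly, so the size of the result grows by at most \(\alpha \cdot \gamma \) letters per long edge, independently of z.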

We next fold short edges in \(\mathcal {H}\) as long as possible. Thereby, the number of edges decreases in each step (folding two short edges is a special case of a decreasing fold). Hence, the process stops after polynomially many folding steps. Let \(\mathcal {H}'\) be the resulting power-compressed graph. We show that \(\mathcal {H}'\) is strongly folded, which proves the lemma.

Assume the contrary. Then there exist two edges \(e_1 \ne e_2\) in \(\textsf{decompress}(\mathcal {H}')\) such that \(\iota (e_1) = \iota (e_2)\) and \(\lambda (e_1) = \lambda (e_2)\). If \(e_1\) and \(e_2\) are already edges of \(\mathcal {H}'\), then \(e_1\) and \(e_2\) are two short edges of \(\mathcal {H}'\) that can be folded, which is a contradiction. Therefore, w.l.o.g. \(e_1\) and \(\tau (e_1)\) must arise from decompressing a long edge of \(\mathcal {H}'\), i.e., from replacing a long edge in \(\mathcal {H}'\) by a simple path of new short edges. Hence \(\tau (e_1)\) is a fresh vertex of \(\textsf{decompress}(\mathcal {H}')\) and \(\tau (e_1) \ne \tau (e_2)\). We clearly can also fold \(\textsf{decompress}(\mathcal {H})\) (which is the same as \(\textsf{decompress}(\mathcal {G})\)) into \(\textsf{decompress}(\mathcal {H}')\). There are vertices \(u_1 \ne u_2\) in \(\textsf{decompress}(\mathcal {H})\) such that \(u_i\) is mapped to \(\tau (e_i)\) while folding \(\textsf{decompress}(\mathcal {H})\) into \(\textsf{decompress}(\mathcal {H}')\). Moreover, also \(u_1\) must be a fresh vertex of \(\textsf{decompress}(\mathcal {H})\). Due to the partial decompression of \(\mathcal {G}\) into \(\mathcal {H}\), \(u_1\) is a fresh and protected vertex of \(\textsf{decompress}(\mathcal {G})\). By Lemma 6 there must exist a non-empty path \(\rho \) in \(\textsf{decompress}(\mathcal {G})\) from \(u_1\) to \(u_2\) without backtracking such that \(\lambda (\rho ) = \varepsilon \) in \(F(\varSigma )\). But this contradicts Lemma 9. \(\square \)

Lemmas 4, 5 and 10 finally yield the main technical result of Section 3.4:

Corollary 1

A given power-compressed graph \(\mathcal {G}\) can be folded in polynomial time into a strongly folded power-compressed graph \(\mathcal {G}'\). We have \(F(\mathcal {G}) = F(\mathcal {G}')\).

3.5 Power-Compressed Subgroup Membership Problem for Free Groups

We can now show the main result of Section 3:

Theorem 1

The power-compressed subgroup membership problem for a f.g. free group can be solved in polynomial time.

Proof

Let \(w_0, w_1, \ldots , w_n\) be the input power words and let \(A = \{ w_1, \ldots , w_n \}\). We construct from A a power-compressed bouquet graph in the same way as in Section 3.3 for uncompressed graphs: to a non-empty power word \(w = p_1^{z_1} p_2^{z_2} \cdots p_k^{z_k}\) we associate the power-compressed cycle graph

$$ \mathcal {C}(w) = (\{ v_0, \ldots , v_{k-1}\}, \{e_i^{\pm 1} :1 \le i \le k\}, \iota , \tau , \lambda , v_0),$$

where \(\iota (e_i) = v_{i-1}\), \(\lambda (e_i) = p_i^{z_i}\), and \(\tau (e_i) = v_{i \bmod k}\). We then construct the power-compressed bouquet graph \(\mathcal {B}(A)\) by taking the disjoint union of \(\mathcal {C}(w_1), \ldots , \mathcal {C}(w_n)\) and then merging their base points. Using Corollary 1 we can fold \(\mathcal {B}(A)\) in polynomial time into a strongly folded power-compressed graph \(\mathcal {S}(A)\). Let \(v_0\) be its base point. As explained at the end of Section 3.2 we can view \(\mathcal {S}(A)\) as a finite automaton, where transitions are labelled with succinct words of the form \(p^z\) with z given in binary notation. By Lemma 3, \(\mathcal {S}(A)\) accepts a freely reduced word \(g \in \textsf{red}(\varGamma ^*) = F(\varSigma )\) if and only if g belongs to the subgroup \(\langle \textsf{red}(w_1), \ldots , \textsf{red}(w_n) \rangle \le F(\varSigma )\). Since \(\mathcal {S}(A)\) is strongly folded, it is a deterministic automaton in the sense that the labels of two outgoing transitions of a state do not have a non-empty common prefix.
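For intuition, the uncompressed analogue of this construction (bouquet graph, folding, automaton-style membership test) can be sketched as follows. This is a naive illustration for ordinary words, not the power-compressed procedure of Corollary 1. The encoding of an inverse letter \(a^{-1}\) as the upper-case character `A` and all function names are assumptions made for the example.

```python
def reduce_word(w):
    """Free reduction: cancel adjacent pairs a A or A a."""
    stack = []
    for a in w:
        if stack and stack[-1] == a.swapcase():
            stack.pop()
        else:
            stack.append(a)
    return "".join(stack)

def bouquet(words):
    """One cycle per non-empty word, all cycles glued at base point 0."""
    edges, fresh = set(), 1
    for w in words:
        u = 0
        for k, a in enumerate(w):
            if k == len(w) - 1:
                v = 0
            else:
                v, fresh = fresh, fresh + 1
            edges.add((u, a, v))
            edges.add((v, a.swapcase(), u))   # inverse edge
            u = v
    return edges

def fold(edges):
    """Merge targets of equally labelled edges leaving the same vertex."""
    while True:
        out, clash = {}, None
        for (u, a, v) in edges:
            if (u, a) in out and out[(u, a)] != v:
                clash = (out[(u, a)], v)
                break
            out[(u, a)] = v
        if clash is None:
            return out                        # deterministic transition map
        keep, drop = min(clash), max(clash)   # keeps the base point 0 alive
        edges = {(keep if x == drop else x, a, keep if y == drop else y)
                 for (x, a, y) in edges}

def member(g, words):
    """Is g in the subgroup generated by words (inside a free group)?"""
    delta = fold(bouquet(words))
    state = 0
    for a in reduce_word(g):
        state = delta.get((state, a))
        if state is None:
            return False
    return state == 0
```

Each merge removes one vertex, so the loop in `fold` terminates; the quadratic running time is only acceptable for a toy example.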

For the rest of the proof it is convenient to switch from power words to straight-line programs. A straight-line program is a context-free grammar \(\mathcal {P}\) that produces exactly one word, which is denoted by \(\textrm{val}(\mathcal {P})\). By repeated squaring, our given power word \(w_0\) can easily be transformed in polynomial time into an equivalent straight-line program. Moreover, from a given straight-line program \(\mathcal {P}\) over the alphabet \(\varGamma = \varSigma \cup \varSigma ^{-1}\) one can compute in polynomial time a new straight-line program \(\mathcal {Q}\) such that \(\textrm{val}(\mathcal {Q}) = \textsf{red}(\textrm{val}(\mathcal {P}))\); see [22, Theorem 4.11]. Hence, we can compute in polynomial time a straight-line program \(\mathcal {Q}\) for \(\textsf{red}(w_0)\). The transition labels of the automaton \(\mathcal {S}(A)\) can also be transformed into equivalent straight-line programs; such automata with straight-line compressed transition labels were investigated in [15]. It remains to check whether the deterministic automaton \(\mathcal {S}(A)\) accepts \(\textrm{val}(\mathcal {Q})\). This is possible in polynomial time by [15, Theorem 1].\(\square \)
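The repeated-squaring step mentioned in the proof can be sketched as follows. The representation is an assumption made for illustration: a power word is a list of pairs `(p, z)`, inverse letters are upper-case characters, and a straight-line program is a dictionary mapping integer nonterminals to right-hand sides.

```python
def inverse(word):
    """Formal inverse of a word; upper case encodes inverse letters."""
    return word[::-1].swapcase()

def power_word_to_slp(power_word):
    """Return (start, rules) with val(start) = p_1^{z_1} ... p_k^{z_k}."""
    rules = {}

    def fresh(rhs):
        nt = len(rules)            # nonterminals are numbered 0, 1, 2, ...
        rules[nt] = tuple(rhs)
        return nt

    top = []
    for p, z in power_word:
        if z < 0:
            p, z = inverse(p), -z
        if z == 0:
            continue
        base = fresh(p)            # base derives p^(2^j), starting with j = 0
        result = None
        while z:
            if z & 1:              # current bit set: multiply base into result
                result = base if result is None else fresh((result, base))
            z >>= 1
            if z:
                base = fresh((base, base))   # squaring step
        top.append(result)
    return fresh(top), rules

def expand(sym, rules):
    """Decompress; only sensible for small exponents."""
    if isinstance(sym, str):
        return sym
    return "".join(expand(s, rules) for s in rules[sym])

def lengths(rules):
    """Length of the word derived by each nonterminal, without decompressing."""
    out = {}
    for nt in sorted(rules):       # right-hand sides only use smaller numbers
        out[nt] = sum(1 if isinstance(s, str) else out[s] for s in rules[nt])
    return out
```

The number of rules grows with the bit length of each exponent rather than with its value, which is why the conversion stays polynomial for binary encoded exponents.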

4 Power-Compressed Subgroup Membership for Virtually Free Groups

A main advantage of the power-compressed subgroup membership problem is that its complexity is preserved under finite index group extensions. The proof of the following lemma follows [13], where it is shown that the complexity of the (ordinary) subgroup membership problem is preserved under finite index group extensions. In order to extend this result to the power-compressed setting, we make use of the conjugate collection process for power words from [23, Theorem 6].

Lemma 11

Let G be a fixed f.g. group and H a fixed subgroup of finite index in G. The power-compressed subgroup membership problem for G is polynomial time reducible to the power-compressed subgroup membership problem for H.

Proof

Using the following standard trick we can assume that H is a normal subgroup of finite index in G: Let N be the intersection of all conjugate subgroups \(g^{-1} H g\). Then N is a normal subgroup of G that still has finite index in G (the latter is a well-known fact). Since \(N \le H\), the power-compressed subgroup membership problem for N is polynomial time reducible to the power-compressed subgroup membership problem for H. Hence, it suffices to show that the power-compressed subgroup membership problem for G is polynomial time reducible to the power-compressed subgroup membership problem for N.

By the above consideration, we can assume that H is a normal subgroup of finite index in G. Let us fix a finite symmetric generating set \(\varTheta \) for H and let \(R \subseteq G\) be a (finite) set of coset representatives for H with \(1 \in R\). Then \(\varSigma := \varTheta \cup (R \setminus \{1\})\) generates G. On R we can define the structure of the quotient group G/H by defining \(r \cdot r' \in R\) and \(\overline{r} \in R\) for \(r,r' \in R\) such that \(r r' \in H (r \cdot r')\) and \(r^{-1} \in H \overline{r}\). Recall that G and H are fixed groups, hence \(r \cdot r'\) and \(\overline{r}\) can be computed in constant time. In [23, Theorem 6] it is shown that the power word problem for G can be reduced in polynomial time (in fact, in \(\textsf{NC}^1\)) to the power word problem for H. The proof shows the following fact:

Fact 1. Given a power word w over the alphabet \(\varSigma \) we can compute in polynomial time a power word \(w'\) over the alphabet \(\varTheta \) and \(r \in R\) such that \(w = w' r\) in G.

Let us now take a finite list of power words \(w_0, w_1, \ldots , w_n\) over the alphabet \(\varSigma \) and let \(g_i \in G\) be the group element represented by \(w_i\). We want to check whether \(g_0 \in A:= \langle g_1, \ldots , g_n \rangle \).

First we use Fact 1 and rewrite in polynomial time each power word \(w_i\) as \(w'_i r_i\) with \(w'_i \in \varTheta ^*\) a power word and \(r_i \in R\). Let \(w'_i\) represent \(g'_i \in H\). By computing the closure of \(\{ r_1, \overline{r}_1, \ldots , r_n, \overline{r}_n\}\) with respect to the multiplication \(\cdot \) on R we obtain in constant time the set of all coset representatives \(r \in R\) such that \(H r \cap A \ne \emptyset \). Let us denote this closure with \(V \subseteq R\). Clearly, \(1 \in V\). If \(r_0 \notin V\) then we have \(g_0 = g'_0 r_0 \notin A \) and we are done.

Claim 1. In polynomial time we can compute a finite list of generators for \(H \cap A\) written as power words over \(\varTheta \).

For the proof of Claim 1 we follow [13]: we compute a power-compressed graph \(\mathcal {G}\) (in the sense of Section 3.2) as follows. All coset representatives from V are vertices of \(\mathcal {G}\). Moreover, we add a simple path from \(r \in V\) to \(r' \in V\) labelled with the power word \(w_i\) iff \(r \cdot r_i = r'\) (\(1 \le i \le n\)). The corresponding inverse path (that consists of the inverse edges) is of course labelled with \(w_i^{-1}\) and we have \(r' \cdot \overline{r}_i = r\). The label of a path from \(1 \in V\) back to \(1 \in V\) in the graph \(\mathcal {G}\) belongs to \(\{ w_1, w^{-1}_1, \ldots , w_n, w^{-1}_n\}^*\) and hence can be viewed as a power word over the alphabet \(\varSigma \). As such, it represents an element of the group \(H \cap A\).

Fix a spanning tree of \(\mathcal {G}\), let E be the set of edges of \(\mathcal {G}\) and let \(T \subseteq E\) be those edges that belong to the fixed spanning tree. We then obtain a set of generators for \(H \cap A\) by taking for every edge \(e \in E \setminus T\) the circuit in \(\mathcal {G}\) obtained by following the unique simple path in T from 1 to \(\iota (e)\), followed by the edge e, followed by the unique simple path in T from \(\tau (e)\) back to 1. Let \(x_e \in \{ w_1, w^{-1}_1, \ldots , w_n, w^{-1}_n\}^*\) be the label of this circuit. Every \(x_e\) represents an element of \(H \cap A\) and the set of all these elements (for \(e \in E \setminus T\)) is a generating set of \(H \cap A\); see [13] for details. Moreover, every \(x_e\) can be written as a power word over the alphabet \(\varSigma \) of polynomial length. Using Fact 1 we can rewrite this power word in polynomial time into \(x'_e r_e\) where \(x'_e\) is a power word over the alphabet \(\varTheta \) and \(r_e \in R\). But since \(x_e\) represents an element of H, we must have \(r_e = 1\). Hence the power words \(x'_e\) represent a generating set of \(H \cap A\).
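The spanning-tree construction of the generators \(x_e\) can be illustrated on a toy instance. The following sketch assumes \(G = \mathbb {Z}\) (written additively), \(H = m\mathbb {Z}\) with coset representatives \(0, \ldots , m-1\), and integer generators of A; these choices and all names are assumptions for the example and are not taken from [13]. A word is a list of pairs `(i, s)` standing for \(w_i^{s}\) with \(s = \pm 1\).

```python
from collections import deque
from functools import reduce
from math import gcd

def schreier_generators(m, gens):
    """Return (V, words): the reachable cosets and one circuit label x_e
    per non-tree edge of the coset graph."""
    path = {0: []}        # label of the tree path from coset 0 to each coset
    tree = set()          # tree edges, stored in positive orientation (r, i)
    queue = deque([0])
    while queue:          # BFS computes the closure V and a spanning tree
        r = queue.popleft()
        for i, g in enumerate(gens):
            for sign in (1, -1):
                r2 = (r + sign * g) % m
                if r2 not in path:
                    path[r2] = path[r] + [(i, sign)]
                    tree.add((r, i) if sign == 1 else (r2, i))
                    queue.append(r2)

    def inv(word):
        return [(i, -s) for i, s in reversed(word)]

    words = []
    for r in sorted(path):
        for i, g in enumerate(gens):
            if (r, i) not in tree:
                r2 = (r + g) % m
                # tree path to iota(e), then e, then tree path back to 0
                words.append(path[r] + [(i, 1)] + inv(path[r2]))
    return set(path), words

def evaluate(word, gens):
    """Group element of Z represented by a word over the generators."""
    return sum(s * gens[i] for i, s in word)
```

For `m = 4` and `gens = [6, 10]` the reachable cosets are `{0, 2}` and the three circuit labels evaluate to 4, 12 and 16, which generate \(4\mathbb {Z} = H \cap A\) as expected.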

Now we can finish the proof of the lemma. We use the graph \(\mathcal {G}\) defined above. Since \(r_0 \in V\), there is a path from 1 to \(r_0\). Let \(x \in \{ w_1, w^{-1}_1, \ldots , w_n, w^{-1}_n\}^*\) be the label of this path. It is a power word over \(\varSigma \) and by Fact 1, x can be rewritten into the form yr for a power word y over \(\varTheta \) and \(r \in R\). Clearly, we must have \(r = r_0\). In the group G we have \(g_0 x^{-1} = g'_0 r_0 r_0^{-1} y^{-1} = g'_0 y^{-1}\) (here, the words x and y are identified with the corresponding elements of G). Note that \(g'_0 y^{-1}\) is represented by the power word \(w'_0 y^{-1}\) over the alphabet \(\varTheta \). Since the word x represents an element of A we have \(g_0 \in A\) if and only if \(g_0 x^{-1} \in A\) if and only if \(g'_0 y^{-1} \in A\) if and only if \(g'_0 y^{-1} \in H \cap A\). The latter is an instance of the power-compressed subgroup membership problem for H since we have power-compressed generators for \(H \cap A\). This concludes the proof.\(\square \)

From Theorem 1 and Lemma 11 we immediately obtain the following corollary:

Corollary 2

The power-compressed subgroup membership problem for a fixed f.g. virtually free group can be solved in polynomial time.

The group \(\textsf{GL}(2,\mathbb {Z})\) consists of all \((2 \times 2)\)-matrices over the integers with determinant \(-1\) or 1. It is a well-known example of a f.g. virtually free group [36]. We are interested in the situation where group elements of \(\textsf{GL}(2,\mathbb {Z})\) are represented by 4-tuples of binary encoded integers. Testing whether such a 4-tuple belongs to \(\textsf{GL}(2,\mathbb {Z})\) is of course possible in polynomial time.

Lemma 12

From a given matrix \(A \in \textsf{GL}(2,\mathbb {Z})\) with binary encoded entries one can compute in polynomial time a power word over a fixed finite generating set of \(\textsf{GL}(2,\mathbb {Z})\), which evaluates to the matrix A.

Proof

For the group \(\textsf{SL}(2,\mathbb {Z})\) of all \((2 \times 2)\)-matrices over the integers with determinant 1 the result is shown in [14], see also [10, Proposition 15.4]. Now, \(\textsf{SL}(2,\mathbb {Z})\) is a normal subgroup of index two in \(\textsf{GL}(2,\mathbb {Z})\). Fix an arbitrary matrix \(B \in \textsf{GL}(2,\mathbb {Z})\) with determinant \(-1\). Given a matrix \(A \in \textsf{GL}(2,\mathbb {Z})\) with binary encoded entries and determinant \(-1\) we first compute the matrix \(A B^{-1} \in \textsf{SL}(2,\mathbb {Z})\). Using [14] we can compute in polynomial time a power word w for \(A B^{-1}\). Hence, wB (where B is taken as an additional generator) is a power word for A.\(\square \)
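The idea behind Lemma 12 can be illustrated with a Euclidean-style reduction. This is a sketch of the textbook approach and not necessarily the construction from [14]. The generators `S` and `T`, the choice \(B = \textrm{diag}(1,-1)\), and all function names are assumptions made for the example; a power word is a list of pairs `(generator name, exponent)`.

```python
# Assumed generators: S and T generate SL(2,Z); B has determinant -1.
I = ((1, 0), (0, 1))
S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))
B = ((1, 0), (0, -1))

def mat_mul(X, Y):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def gen_power(name, z):
    """Evaluate one factor name^z in closed form (no loop over z)."""
    if name == "T":
        return ((1, z), (0, 1))
    if name == "S":                       # S has order 4
        S2 = mat_mul(S, S)
        return (I, S, S2, mat_mul(S2, S))[z % 4]
    return B if z % 2 else I              # B has order 2

def evaluate(word):
    M = I
    for name, z in word:
        M = mat_mul(M, gen_power(name, z))
    return M

def sl2_power_word(A):
    (a, b), (c, d) = A
    assert a*d - b*c == 1
    word = []
    while c != 0:                         # Euclidean step on the first column
        q = a // c
        word += [("T", q), ("S", -1)]     # A = T^q S^{-1} (S T^{-q} A)
        a, b, c, d = -c, -d, a - q*c, b - q*d
    if a == 1:                            # A = T^b
        word.append(("T", b))
    else:                                 # a = d = -1, so A = S^2 T^{-b}
        word += [("S", 2), ("T", -b)]
    return word

def gl2_power_word(A):
    (a, b), (c, d) = A
    det = a*d - b*c
    if det == 1:
        return sl2_power_word(A)
    if det == -1:                         # B is its own inverse
        return sl2_power_word(mat_mul(A, B)) + [("B", 1)]
    raise ValueError("matrix is not in GL(2,Z)")
```

The loop performs one Euclidean division per iteration, so the number of factors is logarithmic in the size of the entries, while the exponents themselves may be large.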

Corollary 3

The subgroup membership problem for \(\textsf{GL}(2,\mathbb {Z})\) can be solved in polynomial time when matrix entries are given in binary encoding.

Proof

Since \(\textsf{GL}(2,\mathbb {Z})\) is f.g. virtually free, the power-compressed subgroup membership problem for \(\textsf{GL}(2,\mathbb {Z})\) can be solved in polynomial time by Corollary 2. By Lemma 12 this shows Corollary 3.\(\square \)

5 The Finite Index Problem

For a f.g. group G with the finite generating set \(\varSigma \) we define the finite index problem as follows, where as usual \(\varGamma = \varSigma \cup \varSigma ^{-1}\).

input: words \(w_1, \ldots , w_n\) over the alphabet \(\varGamma \).

output: the index (an element of \(\mathbb {N} \cup \{\infty \}\)) of the subgroup \(\langle g_1, \ldots , g_n \rangle \le G\), where \(g_i\) is the group element represented by \(w_i\).

If the words \(w_1, \ldots , w_n\) are represented as power words then we speak of the power-compressed finite index problem.

Theorem 2

The power-compressed finite index problem for a f.g. free group can be solved in polynomial time.

Proof

We use the following criterion from [16]. Consider a f.g. subgroup \(\langle A \rangle \le F(\varSigma )\) with \(A \subseteq \varGamma ^+\) finite. We compute the folded (uncompressed) graph \(\mathcal {S}(A)\) as described before Lemma 3. Let \(v_0\) be the base point of \(\mathcal {S}(A)\). We define the core of \(\mathcal {S}(A)\), denoted with \(\textsf{core}(\mathcal {S}(A))\), by removing from \(\mathcal {S}(A)\) all vertices \(v \ne v_0\) and edges e that do not belong to a cycle without backtracking that contains the base point \(v_0\); see also [16, Definitions 3.5 and 5.3]. This means that we delete as long as possible vertices \(v \ne v_0\) for which there is a unique edge e with \(\tau (e)=v\) together with the edges e and \(e^{-1}\). For instance, the core of the graph in Figure 1(f), where the origin of the \(p^4\)-labelled edge is the base point, is obtained by removing the b-labelled edge and its target vertex. On the other hand, if the base point is the target of the \(p^4\)-labelled edge, then the core is obtained by removing the b-labelled edge and its target vertex as well as the \(p^4\)-labelled edge and its source vertex.

Assume that \(\textsf{core}(\mathcal {S}(A)) =(V,E,\iota ,\tau , \lambda , v_0)\). It is shown in [16, Proposition 8.3] that \(\langle A \rangle \) has finite index in \(F(\varSigma )\) if and only if \(\textsf{core}(\mathcal {S}(A))\) is a \(\varGamma \)-regular graph in the sense that for every \(v \in V\) and every \(a \in \varGamma \) there is a (necessarily unique) edge \(e \in E\) with \(\iota (e)=v\) and \(\lambda (e)=a\).

Let us now consider the case where the words in A are power words. We then compute in polynomial time the power-compressed and strongly folded graph \(\mathcal {S}(A)\) as described in the proof of Theorem 1. We compute the core of this graph in the same way as above by removing vertices \(v \ne v_0\) of degree one together with the adjacent edges; let us denote this core with \(\mathcal {C}(A)\). It remains to check whether \(\textsf{decompress}(\mathcal {C}(A))\) is \(\varGamma \)-regular. The case \(|\varSigma |=1\) is trivial. So, let us assume that \(|\varSigma |\ge 2\) (and hence \(|\varGamma | \ge 4\)). But then \(\textsf{decompress}(\mathcal {C}(A))\) has vertices of degree two if \(\mathcal {C}(A)\) contains long edges. To see this note that since \(\mathcal {C}(A)\) is strongly folded, every edge label \(p^z\) of a long edge must be a freely reduced word. Hence, if \(\mathcal {C}(A)\) contains long edges then \(\langle A \rangle \) has infinite index in \(F(\varSigma )\). On the other hand, if \(\mathcal {C}(A)\) only contains short edges, then we can directly apply the above criterion from [16] in order to compute the index \([F(\varSigma ):\langle A \rangle ]\).\(\square \)
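For the case where only short edges remain, the core computation and the regularity test can be sketched as follows. The sketch uses the standard fact that for a \(\varGamma \)-regular core the index equals the number of vertices. The input format (a dictionary mapping `(vertex, letter)` to the target vertex, closed under inverse edges, with base point 0 and upper-case characters for inverse letters) is an assumption made for the example.

```python
import math

def finite_index(delta, sigma):
    """Index of the subgroup described by a folded graph, or math.inf."""
    gamma = set(sigma) | {a.swapcase() for a in sigma}
    delta = dict(delta)
    while True:
        degree = {}
        for (u, _a) in delta:
            degree[u] = degree.get(u, 0) + 1
        # vertices other than the base point with a single incident edge
        leaves = [u for u, d in degree.items() if d == 1 and u != 0]
        if not leaves:
            break
        leaf = leaves[0]
        # drop the edge into the leaf together with its inverse edge
        delta = {(u, a): v for (u, a), v in delta.items()
                 if u != leaf and v != leaf}
    vertices = {0} | {u for (u, _a) in delta}
    regular = all((v, a) in delta for v in vertices for a in gamma)
    return len(vertices) if regular else math.inf
```

As a usage example, the graph with vertices 0 and 1, a-edges swapping them and a b-loop at each vertex describes \(\langle a^2, b, a b a^{-1} \rangle \) and yields index 2, while removing the b-loop at vertex 1 (the subgroup \(\langle a^2, b \rangle \)) yields infinite index.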

Lemma 13

Let G be a fixed f.g. group and H a fixed subgroup of finite index in G (thus, H must be f.g. as well). The power-compressed finite index problem for G is polynomial time reducible to the power-compressed finite index problem for H.

Proof

As in the proof of Lemma 11 we can restrict to the case where H is a normal subgroup of G. Otherwise we define N as the intersection of all conjugate subgroups \(g^{-1} H g\). Then N is a normal subgroup of finite index in G. Let \(d = [H:N]\) be the index of N in H, which is a fixed constant. Assume that we can reduce in polynomial time the power-compressed finite index problem for G to the power-compressed finite index problem for N. The power-compressed finite index problem for N is polynomial time reducible to the power-compressed finite index problem for H (for a f.g. subgroup \(A \le N\) we have \([N:A] = [H:A]/d\)). Hence, the power-compressed finite index problem for G is polynomial time reducible to the power-compressed finite index problem for H.

Let us now assume that H is a normal subgroup of G and assume that we can solve the power-compressed finite index problem for H in polynomial time. We take over all notations from the proof of Lemma 11. Hence, \(A \le G\) is the f.g. subgroup whose index [G : A] we want to compute and the generators of A are given as power words.

Recall that \(V \subseteq R\) is the set of coset representatives of H such that \(r \in V\) if and only if \(Hr \cap A \ne \emptyset \). We claim that \(|V| = [A : H \cap A]\). To see this choose for every \(r \in V\) an arbitrary element \(g_r \in Hr \cap A\). We then have \(Hr \cap A = (H \cap A) g_r \ne \emptyset \). Moreover, the sets \(Hr \cap A\) (\(r \in V\)) are pairwise disjoint; hence also the sets \((H \cap A) g_r\) (\(r \in V\)) are pairwise disjoint. Since \(A = \bigcup _{r \in V} (Hr \cap A) = \bigcup _{r \in V} (H \cap A) g_r\), it follows that the sets \((H \cap A) g_r\) (\(r \in V\)) are the right cosets of \(H \cap A\) in A. This shows that \(|V| = [A : H \cap A] < \infty \). In particular, we can compute \([A : H \cap A]\) in constant time.

By Claim 1 from the proof of Lemma 11 we can compute in polynomial time a finite list of generators for \(H \cap A\) written as power words. Hence, by our assumption on H, we can compute the index \([H : H \cap A]\) in polynomial time. We now have

$$\begin{aligned}{}[G : H \cap A] = [G : H] \cdot [H : H \cap A] = [G : A] \cdot [A : H \cap A] \end{aligned}$$

and thus

$$\begin{aligned}{}[G : A] = \frac{[G : H] \cdot [H : H \cap A]}{[A : H \cap A]} . \end{aligned}$$

Here, [G : H] is a fixed constant, and \([H : H \cap A]\) and \([A : H \cap A]\) can be computed in polynomial time. Hence, [G : A] can be computed in polynomial time.\(\square \)
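As a small sanity check of this formula (a toy example that is not part of the original argument), take \(G = \mathbb {Z}\), \(H = 2\mathbb {Z}\) and \(A = 3\mathbb {Z}\). Then \(H \cap A = 6\mathbb {Z}\) and both cosets of H meet A, so \(|V| = [A : H \cap A] = 2\). With \([G : H] = 2\) and \([H : H \cap A] = 3\) the formula gives

$$\begin{aligned}{}[G : A] = \frac{2 \cdot 3}{2} = 3 , \end{aligned}$$

which agrees with \([\mathbb {Z} : 3\mathbb {Z}] = 3\).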

Theorem 2 and Lemma 13 yield:

Corollary 4

The power-compressed finite index problem for a fixed f.g. virtually free group can be solved in polynomial time.

With Lemma 12 and Corollary 4 we finally obtain:

Corollary 5

The finite index problem for \(\textsf{GL}(2,\mathbb {Z})\) can be solved in polynomial time when matrix entries are given in binary encoding.

6 Future Work

There is not much hope of generalizing Corollary 3 to higher dimensions. For \(\textsf{SL}(4,\mathbb {Z})\) the subgroup membership problem is undecidable, and decidability of the subgroup membership problem for \(\textsf{SL}(3,\mathbb {Z})\) is a long-standing open problem [20].

A more feasible problem concerns the rational subset membership problem for free groups when transitions are labelled with power words. It is easy to see that this problem is \(\textsf{NP}\)-hard (reduction from subset sum) and we conjecture that it belongs to \(\textsf{NP}\). As a consequence this would show that the rational subset membership problem for \(\textsf{GL}(2,\mathbb {Z})\) is \(\textsf{NP}\)-complete when the transitions of the automaton are labelled with binary encoded matrices. The corresponding statement for \(\textsf{PSL}(2,\mathbb {Z})\) was shown in [5].

Another interesting problem is whether the subgroup membership problem for a free group can be solved in polynomial time, when all group elements are represented by straight-line programs (which can be more succinct than power words). One might try to show this using an adaptation of Stallings’s folding, but controlling the size of the graph during the folding seems to be more difficult when the transition labels are represented by straight-line programs instead of power words.