1 Introduction

The Mahler measure of an algebraic number \(\alpha \) with minimal polynomial \(f(x) = a_n x^n + \cdots +a_0\in \mathbb {Z}[x]\), is defined as:

$$\begin{aligned} M(\alpha ) = |a_n| \prod _{i=1}^n \max \{1, |\alpha _i|\} = \pm a_n \prod _{\begin{array}{c} i=1\\ |\alpha _i|>1 \end{array}}^n \alpha _i, \end{aligned}$$

where \(\alpha _1,\ldots ,\alpha _n\) are the distinct roots of f(x); i.e. \(f(x) = a_n \prod _{i=1}^n (x-\alpha _i)\in \mathbb {C}[x]\). It is clear that \(M(\alpha )\ge 1\) is a real algebraic integer, and it follows from Kronecker’s theorem that \(M(\alpha )=1\) if and only if \(\alpha \) is a root of unity. Moreover, we will freely use the facts that \(M(\alpha )=M(\beta )\) whenever \(\alpha \) and \(\beta \) have the same minimal polynomial, and that \(M(\alpha )=M(\alpha ^{-1})\). Lehmer [12] famously asked in 1933 if the Mahler measure for an algebraic number which is not a root of unity can be arbitrarily close to 1. This question became known as Lehmer’s problem, and (somewhat inaccurately) the statement that an absolute constant \(c>1\) exists such that \(M(\alpha )>1\) implies \(M(\alpha )\ge c\) became known as Lehmer’s conjecture, despite the fact that Lehmer himself did not conjecture this and merely asked if one could find smaller values of the Mahler measure than he found. It is often suggested that the minimal value of c is a Salem number, namely \(\tau = 1.17\ldots \), which is the largest real root of the polynomial \(f(x) = x^{10}+x^9-x^7-x^6-x^5-x^4-x^3+x+1\), discovered by Lehmer in his 1933 paper.
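As a small numerical illustration of the value \(\tau \) mentioned above, the following sketch locates the largest real root of Lehmer's polynomial by bisection. The helper names `lehmer_poly` and `bisect_root` and the bracket \([1, 1.3]\) are choices made for this illustration only; the sketch assumes that the polynomial changes sign exactly once on that bracket.

```python
def lehmer_poly(x):
    """Evaluate x^10 + x^9 - x^7 - x^6 - x^5 - x^4 - x^3 + x + 1."""
    return x**10 + x**9 - x**7 - x**6 - x**5 - x**4 - x**3 + x + 1


def bisect_root(f, lo, hi, iterations=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the given bracket")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # a root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # a root lies in [mid, hi]
    return (lo + hi) / 2


# f(1) = -1 < 0 and f(1.3) > 0, so the bracket contains a real root.
tau = bisect_root(lehmer_poly, 1.0, 1.3)
print(f"tau is approximately {tau:.6f}")
```

Bisection is used here only because it needs nothing beyond a sign change; it does not compute the Mahler measure of a general polynomial, which would require all complex roots.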

Although there has been much computational work performed in order to find irreducible polynomials of small Mahler measure (we refer the reader to M. Mossinghoff’s website [15] for the latest tables of known polynomials, as well as the papers by Mossinghoff [16] and Mossinghoff et al. [17]), remarkably, no polynomial of smaller nontrivial Mahler measure has been found since Lehmer’s original 1933 work. Since that time, the best asymptotic bound towards Lehmer’s problem was discovered by Dobrowolski [8]. It is clear that in considering the problem, one can reduce to considering the Mahler measure of algebraic units. Smyth [19] proved that in fact, non-reciprocal units have a minimal Mahler measure \(\theta _0=M(\theta _0)\), where \(\theta _0\) is the smallest Pisot-Vijayaraghavan number and is given by the positive root of \(x^3-x-1\). In another direction, Borwein et al. [3] proved the Lehmer conjecture for polynomials with only odd coefficients.

The study of iteration of the Mahler measure began with questions about which algebraic numbers are themselves Mahler measures. Adler and Marcus [1] proved that every Mahler measure is a Perron number and asked if the Perron numbers given by the positive roots of \(x^n - x - 1\) are also values of the Mahler measure for any \(n > 3\). Recall that \(\alpha \) is a Perron number if and only if \(\alpha >1\) is a real algebraic integer such that all other conjugates of \(\alpha \) over \({\mathbb {Q}}\) have absolute value \(<\alpha \). This notion of ‘Perron number’ was introduced by Lind [13] who also proved several properties of the class of Perron numbers in [14], including that they are closed under addition and multiplication and are dense in the real interval \([1,\infty )\). Boyd [4] proved that the positive roots of \(x^n - x - 1\) for \(n>3\) are not values of the Mahler measure, but Dubickas [10] showed that for every Perron number \(\beta \), there exists a natural number n such that \(n\beta \) is a value of the Mahler measure. Dixon and Dubickas [7] and Dubickas [11] established further results on which numbers are in the value set of M. However, the question whether a given number is a Mahler measure of an algebraic number is very hard to answer in general. For instance, it is an open question of Schinzel [18] whether or not \(\sqrt{17}+1\) is the Mahler measure of an algebraic number.

Dubickas [9] appears to have been the first to pose questions on the Mahler measure as a dynamical system, introducing the concept of the stopping time of an algebraic number under M, defined as the number of iterations required to reach a fixed point. We note that the stopping time is one less than the cardinality of the forward orbit of the number under iteration of M, which we will call the orbit size. Specifically, we set \(M^{(0)}(\alpha )=\alpha \) and let \(M^{(n)}(\alpha ) = M\circ \cdots \circ M(\alpha )\) denote the nth iteration of M. We define the orbit of \(\alpha \) under M to be the set:

$$\begin{aligned} {\mathcal {O}}_M(\alpha ) = \{ M^{(n)}(\alpha ) : n \ge 0\}. \end{aligned}$$
(1)

Then the orbit size of \(\alpha \) is \(\# {\mathcal {O}}_M(\alpha )\), while the stopping time is \(\#{\mathcal {O}}_M(\alpha )-1\). It is easy to see that for any algebraic number \(\alpha \), \(M(\alpha )\le M^{(2)}(\alpha )\), so M is nondecreasing after at least one iteration, and thus, the Mahler measure either grows, or is fixed.

In fact, by Northcott’s theorem, it is easy to see that if \(\alpha \) is a wandering point of M, then \(M^{(n)}(\alpha )\rightarrow \infty \), as the degree of \(M^{(n)}(\alpha )\) can never be larger than the degree of the Galois closure of the field \(\mathbb {Q}(\alpha )\). In particular, there are no cycles of length greater than 1; each number \(\alpha \) either wanders (that is, the orbit under M is infinite), or it is preperiodic and ends in a fixed point of M. Dubickas claimed in [9] that ‘generically’ \(M^{(n)}(\alpha )\rightarrow \infty \); however, he did not give an example or a proof of this. The first explicit results in this direction appear to have been by Zhang [21], who proved that if \([\mathbb {Q}(\alpha ):\mathbb {Q}]\le 3\), then \(\# {\mathcal {O}}_M(\alpha ) <\infty \), and also found an algebraic number \(\alpha \) of degree 4 with minimal polynomial \(x^4 + 5x^2 +x -1\) such that \(M^{(2n)}(\alpha ) = M^{(2)}(\alpha )^{2^{n-1}}\), proving that \(M^{(n)}(\alpha )\rightarrow \infty \) for this example.

Further, it is trivial to see that the fixed points of M correspond to natural numbers, Pisot-Vijayaraghavan numbers, and Salem numbers. This raises several natural questions: for example, can one show that the Lehmer problem could be reduced to the study of fixed points of M? The answer to such a question might help establish the long held folklore conjecture that Salem numbers are indeed minimal for Lehmer’s problem. The fixed points for the dynamical system induced by the multiplicative Weil-height have recently been classified by Dill [6].

Dubickas posed several questions in [9], including whether one could classify all numbers of stopping time 1 (that is, numbers which are not fixed by M, but for which \(M(\alpha )\) is fixed), and whether algebraic numbers of arbitrary stopping time existed. In a later paper [10], he established, among other things, that for every \(k\in \mathbb {N}\), there exists a cubic algebraic integer of norm 2 with stopping time k.

In this paper, we will prove several other results regarding the stopping time of algebraic numbers. Our first result is a direct generalization of Dubickas’s result:

Theorem 1

For any \(d \ge 3\), \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\) and \(k\in {\mathbb {N}}\) there is an algebraic integer \(\alpha \) of degree d, \(N(\alpha )=l\) and \(\# {\mathcal {O}}_M(\alpha )=k\).

The proof of Theorem 1 will be given in Sect. 2 below. Studying the possible behaviour of algebraic units under iteration of M is more delicate. It is clear that \(\# {\mathcal {O}}_M(\alpha )\le 2\) for all algebraic units of degree at most 3, and this result is (non-trivially) also true if the degree is 4:

Theorem 2

Let \(\alpha \) be an algebraic unit of degree 4. Then either \(\# {\mathcal {O}}_M(\alpha )\le 2\) or \(\# {\mathcal {O}}_M(\alpha )=\infty \). Moreover, if \(\# {\mathcal {O}}_M(\alpha )=\infty \), then \(M^{(3)}(\alpha )=M(\alpha )^2\).

The first algebraic unit \(\alpha \) with \(\# {\mathcal {O}}_M(\alpha )\ge 3\) we found has degree 6 and orbit size 5. It is given by any root of \(x^6 - x^5 - 4x^4 - 2x^2 - 4x - 1\). Despite an extensive search, we did not find any unit of degree 5 of orbit size \(\ge 3\), nor a unit of degree 6 of finite orbit size \(\ge 6\).

It will follow from the proof of Theorem 2 that we have the following corollary:

Corollary 1

If \(\alpha \) is an algebraic unit of degree 4, then the sequence \((\log M^{(n)}(\alpha ))_{n\in {\mathbb {N}}}\) satisfies a linear homogeneous recursion.

The proofs of Theorem 2 and Corollary 1 are given in Sect. 3. We note that, in the example of a degree 4 wandering point given by Zhang [21], the sequence \((\log M^{(n)}(\alpha ))_{n\in {\mathbb {N}}}\) satisfied the recursion relation \(x_n = 2x_{n-2}\) for \(n \ge 3\). Based on the above corollary and further experimental data, we make the following conjecture:

Conjecture 1

For every algebraic unit \(\alpha \), there exists a constant k such that the sequence \((\log (M^{(n)}(\alpha )))_{n\ge k}\) satisfies a linear homogeneous recursion.

We note that, in the case of a large Galois group, the behavior of units is particularly simple. We prove that, if the Galois group contains the alternating group, then the orbit of a unit must either stop after at most one iteration, or the unit wanders. Specifically, we prove in Sect. 4 the following theorem:

Theorem 3

If \(\alpha \) is an algebraic unit of degree d such that the Galois group of the Galois closure of \({\mathbb {Q}}(\alpha )\) over \({\mathbb {Q}}\) contains the alternating group \(A_d\), then \(\# {\mathcal {O}}_M(\alpha )\in \{1,2,\infty \}\).

More precisely, if \(\alpha \) is as above, of degree \(\ge 5\), and such that none of \(\pm \alpha ^{\pm 1}\) is conjugate to a Pisot number, then \(\#{\mathcal {O}}_M(\alpha )=\infty \).

One might be led by Theorems 2 and 3 to suspect that, in fact, algebraic units cannot have arbitrarily large but finite orbits under M. However, we prove that this is not the case.

Theorem 4

Let \(S\in {\mathbb {N}}\) be arbitrary, and let \(d\ge 12\) be divisible by 4. Then there exist algebraic units of degree d whose orbit size is finite but greater than S.

The proof is given in Sect. 5. It would be interesting to know whether there are large finite orbits of algebraic units in any degree less than 12.

2 Arbitrary orbit size for non-units and proof of Theorem 1

In [10], Dubickas proved the case \(d=3\) and \(l=2\) (and k arbitrary). In order to prove Theorem 1, we will start with a few examples.

Example 1

Since there are Pisot-Vijayaraghavan numbers of any degree and norm, we know that for any \(d\in {\mathbb {N}}\) and any \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\) there are algebraic numbers \(\alpha \) of degree d, norm l and orbit size 1. By Perron’s criterion, we may take the largest root of \(x^d + l^2 x^{d-1} +l\).

Similarly, the polynomial \(x^d + l^{d}x +l\) has precisely one root \(\beta \) inside the unit circle and all other roots are of absolute value \(>\vert l\vert \). Hence, the polynomial is irreducible. Let \(\alpha \) be the largest root of this polynomial. Then \(M(\alpha )=\vert l/\beta \vert \), which is a Pisot number. Thus, \(\alpha \) has norm l, degree d and orbit size 2.

Example 2

For any \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\) we consider \(f(x)=x^3 -l^2 x +l\). Let \(\alpha _1,\alpha _2,\alpha _3\) be the roots of f ordered so that \(\vert \alpha _1 \vert \ge \vert \alpha _2 \vert \ge \vert \alpha _3 \vert \).

If \(l \ge 2\) we have

$$\begin{aligned} \begin{array}{lp{1em}l} f (-l-1) = -2l^2 -2l -1< 0, &{} \qquad &{} f(-l)=l>0 \\ f(l-1)=-2l^2 +4l -1<0, &{} \qquad &{} f(l)=l>0 \\ f(1)=1-l^2 +l < 0, &{} \qquad &{} \displaystyle f\left( \frac{1}{l}\right) =\frac{1}{l^3} >0 \end{array} \end{aligned}$$

Hence, the three roots are real and none of them is an integer. If f were reducible, then one of its factors would be linear, and since f is monic this would force an integer root, which is a contradiction. Hence, f is irreducible and it follows that

$$\begin{aligned} \alpha _1 \in (-l-1,-l), \quad \alpha _2 \in (l-1,l), \quad \alpha _3 \in \left( \frac{1}{l},1\right) . \end{aligned}$$

Therefore we find \(M^{(0)}(\alpha _1)=\alpha _1\), \(M^{(1)}(\alpha _1)=-\alpha _1\alpha _2 =l/\alpha _3\),

$$\begin{aligned} M^{(2)}(\alpha _1)=M\bigg (\frac{l}{\alpha _3}\bigg )= \frac{l^2}{\alpha _2\alpha _3}=-\alpha _1 l, \end{aligned}$$

and

$$\begin{aligned} M^{(3)}(\alpha _1)=M(-\alpha _1 l)=(-\alpha _1 l)(-\alpha _2 l)(-\alpha _3 l) =l^4 \in {\mathbb {Z}}. \end{aligned}$$

These are all elements in the orbit of \(\alpha _1\) under iteration of M. Hence, \(\alpha _1\) is an algebraic integer of degree 3, \(N(\alpha _1)=l\) and \(\# {\mathcal {O}}_M(\alpha _1)=4\). Moreover \(-\alpha _1\) is an algebraic integer of degree 3, \(N(-\alpha _1)=-l\) and \(\# {\mathcal {O}}_M(-\alpha _1)=4\).
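The computation in this example can be checked numerically for a single value of l. The sketch below takes \(l=2\), finds the three real roots by bisection in the intervals given above, and evaluates the three iterates from the conjugate sets described in the text. The helper names are hypothetical, and the conjugate sets \(\{l/\alpha _i\}\) and \(\{-l\alpha _i\}\) are taken from the derivation above rather than computed independently.

```python
def bisect_root(f, lo, hi, iterations=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the given bracket")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


def mahler_from_conjugates(conjugates):
    """Product of max(1, |c|) over a full set of conjugates of an algebraic integer."""
    result = 1.0
    for c in conjugates:
        result *= max(1.0, abs(c))
    return result


l = 2


def f(x):
    return x**3 - l**2 * x + l


# Brackets taken from the intervals stated in the example (case l >= 2).
a1 = bisect_root(f, -l - 1, -l)
a2 = bisect_root(f, l - 1, l)
a3 = bisect_root(f, 1 / l, 1)
roots = [a1, a2, a3]

m1 = mahler_from_conjugates(roots)                    # M(alpha_1)
m2 = mahler_from_conjugates([l / a for a in roots])   # conjugates of l / alpha_3
m3 = mahler_from_conjugates([-l * a for a in roots])  # conjugates of -l * alpha_1

print(m1, l / a3)    # both approximate M^(1)(alpha_1)
print(m2, -l * a1)   # both approximate M^(2)(alpha_1)
print(m3, l**4)      # both approximate M^(3)(alpha_1)
```

This is a floating-point sanity check for one value of l, not a substitute for the argument in the example.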

In the same fashion one can prove that any root of the polynomial \(x^3 +lx^2 -l\) is of degree 3, norm \(-l\) and orbit size 3.

Example 3

Again let \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\) be arbitrary and consider \(f(x)=x^4 -l^2 x^2 + (l^2 -l)x +l\). The four roots of f are ordered as \(\vert \alpha _1\vert \ge \vert \alpha _2 \vert \ge \vert \alpha _3\vert \ge \vert \alpha _4\vert \). A direct computation shows that f is irreducible and \(\# {\mathcal {O}}_M(\alpha _1)=4\) if \(l\in \{-3,-2,2\}\). If \(l \not \in \{-3,-2,-1,0,1,2\}\), then we show as in the last example that

$$\begin{aligned} \alpha _1 \in (-l-1,-l), \quad \alpha _2\in (l-1,l), \quad \alpha _3 \in (1,2), \quad \alpha _4 \in \left( -1,-\frac{1}{l^2}\right) \end{aligned}$$

if \(l>0\), and

$$\begin{aligned} \alpha _1 \in (-l-1,-l), \quad \alpha _2\in (l-1,l), \quad \alpha _3 \in (1,2), \quad \alpha _4 \in \left( \frac{1}{l^2},1\right) \end{aligned}$$

if \(l < 0\). Obviously f has no linear factor. Moreover, \(\alpha _4\) and \(\alpha _1\) must be Galois conjugates, since the norm of \(\alpha _1\) has to be a divisor of l. Hence, if f is not irreducible it factors into \(g(x)=(x-\alpha _1)(x-\alpha _4)\) and \(h(x)=(x-\alpha _2)(x-\alpha _3)\). This can only occur if g and h are in \({\mathbb {Z}}[x]\). Comparing the size of the roots, the only possibilities are \(g(x)=x^2 + (l+1)x +1\) and \(h(x)=x^2 -(l+1)x +l\). However, multiplying these two polynomials does not give f. Hence, f is irreducible.

Now we calculate the orbit size of \(\alpha _1\). We have

  • \(M^{(1)}(\alpha _1)=-l/\alpha _4\)

  • \(M^{(2)}(\alpha _1)=\pm l^2 \alpha _1\)

  • \(M^{(3)}(\alpha _1)=\pm l^9\)

and hence \(\# {\mathcal {O}}_M(\alpha _1)=4\). We have shown that any root \(\alpha \) of f is an algebraic integer of degree 4, norm l and orbit size 4.

Example 4

One can show, with methods similar to those above, that any root of \(x^d - l^{d-2} x +l\) has orbit size 3, for all \(d\ge 4\) and \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\). To this end, we note

$$\begin{aligned} \vert -l^{d-2} z \vert = \vert l \vert ^{d-2} > \vert l\vert +1 \ge \vert z^d + l\vert \quad \forall ~ z\in {\mathbb {C}}, ~\vert z \vert =1, \end{aligned}$$
(2)

and

$$\begin{aligned} \vert z^d \vert = \vert l \vert ^{d} > \vert l\vert ^{d-1} + \vert l\vert \ge \vert -l^{d-2} z + l\vert \quad \forall ~ z\in {\mathbb {C}}, ~\vert z \vert =\vert l\vert . \end{aligned}$$
(3)

Now we apply Rouché’s theorem. Then (2) tells us that \(x^d - l^{d-2} x +l\) has precisely one root \(\alpha _d\) inside the unit circle, and (3) tells us that all roots \(\alpha _1,\ldots ,\alpha _d\) of \(x^d - l^{d-2} x +l\) have absolute value \(<\vert l \vert \).
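The two strict numerical inequalities behind (2) and (3) are easy to test over a range of parameters. The following sketch does so with exact integer arithmetic; the function name and the parameter ranges are chosen for illustration only. It also shows that the first inequality fails for \(d=3\), \(l=2\), which is consistent with the restriction \(d\ge 4\) in this example.

```python
def rouche_inequalities_hold(d, l):
    """Check the strict inequalities used in (2) and (3) for the given d and l."""
    L = abs(l)
    ineq2 = L ** (d - 2) > L + 1          # |l|^(d-2) > |l| + 1, used on |z| = 1
    ineq3 = L ** d > L ** (d - 1) + L     # |l|^d > |l|^(d-1) + |l|, used on |z| = |l|
    return ineq2 and ineq3


# Illustrative range: 4 <= d <= 12 and 2 <= |l| <= 10.
all_hold = all(
    rouche_inequalities_hold(d, l)
    for d in range(4, 13)
    for l in range(-10, 11)
    if abs(l) >= 2
)
print(all_hold)
```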

Before we proceed with calculating the orbit size of one of these roots, we need to show that \(x^d - l^{d-2} x +l\) is irreducible. This is obviously the case if \(\vert l\vert \) is a prime number. So in particular, we can assume that \(\vert l \vert \ge 4\). Using this assumption and \(d\ge 4\), the same calculation as in (2) proves that there is precisely one root of \(x^d - l^{d-2} x +l\) of absolute value \(\le \sqrt{\vert l \vert }\) (necessarily \(\alpha _d\)).

It follows that no product of two or more of the elements \(\alpha _1,\ldots ,\alpha _{d-1}\) can be a divisor of l. Hence, the only way for \(x^d - l^{d-2} x +l\) to be reducible is to have a root \(a\in {\mathbb {Z}}\). This a must be a divisor of \(\vert l\vert \) and it must satisfy \(a^d = l^{d-2}a-l\). Hence, \(a^{d-1} \mid l\) which implies \(\vert a \vert ^{d-1} \le \vert l\vert \). This is not possible, as we have just seen that \(\vert a\vert \ge \sqrt{\vert l\vert }\). It follows that \(x^d - l^{d-2} x +l\) is indeed irreducible, and \(\alpha _1\) is an algebraic integer of degree d and norm l.

We then have:

  • \(M^{(1)}(\alpha _1) = \alpha _1\cdots \alpha _{d-1} =l/\vert \alpha _d \vert \notin {\mathbb {Z}}\), and

  • \(M^{(2)}(\alpha _1)=M(\pm \frac{l}{\alpha _d})=\pm \prod _{i=1}^d \frac{l}{\alpha _i} \in {\mathbb {Z}}\).

Hence \(\alpha _1\) has orbit size 3.

Proposition 1

Let \(d \ge 3\) be an integer and let \(\alpha _1,\ldots ,\alpha _d\) be a full set of Galois conjugates of an algebraic integer \(\alpha \). Assume the following conditions:

  1. (i)

    \(\vert \alpha _1 \vert> \vert \alpha _2 \vert \ge \cdots \ge \vert \alpha _{d-1} \vert> 1 > \vert \alpha _d \vert \), and

  2. (ii)

    \(\vert \alpha _i \vert \le \vert N(\alpha ) \vert \) for all \(i\in \{2,\ldots ,d\}\).

Then \(\alpha \) is a pre-periodic point of M. More precisely, if we let

$$\begin{aligned} c(\alpha )&= \min \{\min \{k\in {\mathbb {N}}: 2\mid k \text { and } \vert \alpha _d \cdot N(\alpha )^{b_{k}} \vert > 1\},\\&\quad \min \{k\in {\mathbb {N}}: 2\not \mid k \text { and } \vert \alpha _1 \vert < \vert N(\alpha )^{b_k} \vert \}\}, \end{aligned}$$

where we define \(b_1=1\), and \(b_n = b_{n-1}\cdot (d-1)+(-1)^{n-1}\) for all \(n\ge 2\), then \(\# {\mathcal {O}}_M(\alpha )=c(\alpha )+2\).

Proof

First we note that \(\alpha \) cannot be an algebraic unit. Hence, \(\vert N(\alpha )\vert \ge 2\) and \(b_k\ge 1\) for all k. We claim that \(b_k \rightarrow \infty \). To see this, notice that \(b_1=1, b_2=d-2\ge 1\), and we want to show that for \(n\ge 3\), \(b_n\ge (d-2)(d-1)^{n-2}+1\). Now, this is true for \(n=3\), since \(b_3=(d-2)(d-1)+1\). By induction, suppose \(b_{n-1}\ge (d-2)(d-1)^{n-3}+1\), then

$$\begin{aligned} b_n&\ge ((d-2)(d-1)^{n-3}+1)(d-1)+(-1)^{n-1} \\&=(d-2)(d-1)^{n-2}+(d-1)+(-1)^{n-1} \\&\ge (d-2)(d-1)^{n-2}+1, \end{aligned}$$

as desired. Therefore, \(b_n\ge 1\) for all n, and \(b_n \rightarrow \infty \).
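The lower bound just proved can be checked against the recursion for small parameters. The sketch below uses the hypothetical helper names `b_sequence` and `lower_bound_holds` and exact integer arithmetic; the ranges of d and n are arbitrary.

```python
def b_sequence(d, length):
    """Return [b_1, ..., b_length] with b_1 = 1 and b_n = b_{n-1}*(d-1) + (-1)^(n-1)."""
    b = [1]
    for n in range(2, length + 1):
        b.append(b[-1] * (d - 1) + (-1) ** (n - 1))
    return b


def lower_bound_holds(d, length):
    """Check b_n >= (d-2)*(d-1)^(n-2) + 1 for 3 <= n <= length."""
    b = b_sequence(d, length)
    # b[n - 1] stores b_n
    return all(
        b[n - 1] >= (d - 2) * (d - 1) ** (n - 2) + 1
        for n in range(3, length + 1)
    )


print(b_sequence(3, 5))
print(all(lower_bound_holds(d, 12) for d in range(3, 10)))
```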

So the integer \(c:=c(\alpha )\) does indeed exist. We claim that for all \(k \le c\) we have

$$\begin{aligned} M^{(k)}(\alpha )= {\left\{ \begin{array}{ll} \pm N(\alpha )^{b_k}/\alpha _d &{} \text { if } 2 \not \mid k \\ \pm N(\alpha )^{b_k} \cdot \alpha _1 &{} \text { if } 2\mid k \end{array}\right. } \end{aligned}$$
(4)

Note that \(\alpha _1,\alpha _d \in {\mathbb {R}}\), since there is no other conjugate of the same absolute value. Therefore, the sign in (4) has to be chosen such that the value is positive. We prove the claim by induction.

For \(k=1\), we calculate

$$\begin{aligned} M^{(1)}(\alpha )=M(\alpha ) = \pm \alpha _1\cdot \ldots \cdot \alpha _{d-1}= \pm \frac{N(\alpha )}{\alpha _d}=\pm \frac{N(\alpha )^{b_1}}{\alpha _d}, \end{aligned}$$

by assumption (i). Now, assume that (4) is correct for a fixed \(k < c\). If k is even, then by assumption (i) we have

$$\begin{aligned} M^{(k+1)}(\alpha )&= M(\pm N(\alpha )^{b_k} \cdot \alpha _1) = \pm N(\alpha )^{b_k\cdot (d-1)}\cdot \alpha _1 \cdot \ldots \cdot \alpha _{d-1}\\&= \pm \frac{N(\alpha )^{b_k \cdot (d-1) +1}}{\alpha _d}= \pm \frac{N(\alpha )^{b_{k+1}}}{\alpha _d}. \end{aligned}$$

Here we have used that \(k<c\) and hence \(\vert N(\alpha )^{b_k} \cdot \alpha _{d} \vert <1\).

If k is odd, then by assumption (ii) we have

$$\begin{aligned} M^{(k+1)}(\alpha )&= M\bigg (\pm \frac{N(\alpha )^{b_k}}{\alpha _d}\bigg ) = \pm \frac{N(\alpha )^{b_k}}{\alpha _d}\cdot \frac{N(\alpha )^{b_k}}{\alpha _{d-1}} \cdot \ldots \cdot \frac{N(\alpha )^{b_k}}{\alpha _2}\\&= \pm \frac{N(\alpha )^{b_k\cdot (d-1)}}{\alpha _2\cdot \ldots \cdot \alpha _{d-1}} = \pm N(\alpha )^{b_k\cdot (d-1)-1}\cdot \alpha _1 = \pm N(\alpha )^{b_{k+1}}\cdot \alpha _1. \end{aligned}$$

Here we have used that \(k<c\) and hence \(\vert \frac{N(\alpha )^{b_k}}{\alpha _1} \vert < 1\). This proves the claim. Moreover, the proof of the claim shows that \(M^{(k+1)}(\alpha ) > M^{(k)}(\alpha )\) for all \(k \in \{0,\ldots ,c-1\}\).

Now, we calculate \(M^{(c+1)}(\alpha )\). By definition of c, every conjugate of \(M^{(c)}(\alpha )\) is greater than 1 in absolute value. Therefore, \(M^{(c+1)}(\alpha )\in {\mathbb {N}}\). It follows that \(M^{(c+2)}(\alpha )=M^{(c+1)}(\alpha )\). Hence, \(\# {\mathcal {O}}_M(\alpha )=c+2\) as claimed. \(\square \)
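To make the definition of \(c(\alpha )\) concrete, the following sketch evaluates it directly from d, the norm, and the absolute values of \(\alpha _1\) and \(\alpha _d\). The function name `c_alpha`, the cut-off `max_k`, and the rounded numerical inputs are assumptions of this illustration. The inputs used are approximate values for the polynomial \(x^3-4x+2\), that is, Example 2 with \(l=2\), which satisfies (i) and (ii).

```python
def c_alpha(d, norm, abs_alpha_1, abs_alpha_d, max_k=1000):
    """Smallest k with: k even and |alpha_d * N^(b_k)| > 1, or k odd and |alpha_1| < |N^(b_k)|."""
    N = abs(norm)
    b = 1  # b_1
    for k in range(1, max_k + 1):
        if k > 1:
            b = b * (d - 1) + (-1) ** (k - 1)   # b_k from b_{k-1}
        if k % 2 == 0 and abs_alpha_d * N ** b > 1:
            return k
        if k % 2 == 1 and abs_alpha_1 < N ** b:
            return k
    raise ValueError("no admissible k found up to max_k")


# Approximate absolute values of the largest and smallest roots of x^3 - 4x + 2.
c = c_alpha(d=3, norm=2, abs_alpha_1=2.2143, abs_alpha_d=0.5392)
print(c, c + 2)   # c(alpha) and the orbit size predicted by Proposition 1
```

Because the two minima in the definition run over disjoint parities, the first k passing its own parity test is the overall minimum, which is what the loop returns.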

It remains to prove the existence of an algebraic number of degree d satisfying the assumptions of Proposition 1 for an arbitrary c.

The strategy is as follows: we first locate the roots of a class of polynomials, then show that these polynomials are irreducible and satisfy assumptions (i) and (ii) from Proposition 1, and finally conclude from Proposition 1 that any root of one of the polynomials in the class has the desired degree, norm and orbit size.

We fix for the rest of this section arbitrary integers \(d\ge 3\), \(c\ge 2\) and \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\). Moreover, we define

$$\begin{aligned} f_n(x)=x\cdot (x^{d-2}-2)\cdot (x-n) + l \end{aligned}$$

and denote the roots of \(f_n\) by \(\alpha _1^{(n)},\ldots ,\alpha _d^{(n)}\) ordered such that

$$\begin{aligned} \vert \alpha _1^{(n)} \vert \ge \vert \alpha _2^{(n)} \vert \ge \cdots \ge \vert \alpha _d^{(n)} \vert . \end{aligned}$$

Lemma 1

Let \(n \ge \vert l \vert +3\) be an integer. With the notation from above we have

$$\begin{aligned}&\alpha _1^{(n)} \in \left( n-\frac{1}{n},n+ \frac{1}{n}\right) , ~ \alpha _d^{(n)} \in \left( -\frac{\vert l \vert }{n},-\frac{1}{2n}\right) \cup \left( \frac{1}{2n},\frac{\vert l \vert }{n}\right) , \\&\vert \alpha _i^{(n)} \vert \in \bigg (1, \root d-2 \of {3-\frac{1}{d}}\bigg ) \end{aligned}$$

for all \(i\in \{2,\ldots ,d-1\}\). Moreover, \(\alpha _d^{(n)}\) is negative if and only if \(\alpha _1^{(n)} <n\).

Proof

We apply Rouché’s theorem and first prove the location of \(\alpha _1^{(n)}\). Let z be any complex number with \(\vert z \vert = n+1/n\). Then

$$\begin{aligned}&\vert z\cdot (z^{d-2} -2 ) \cdot (z-n)\vert \\&\quad \ge \left| n+\frac{1}{n}\right| \cdot \left| \left( n+\frac{1}{n}\right) ^{d-2}-2\right| \cdot \frac{1}{n} \\&\quad = \left| 1+\frac{1}{n^2}\right| \cdot \left| \left( n+\frac{1}{n}\right) ^{d-2}-2\right| \\&\quad > \vert l \vert . \end{aligned}$$

Hence by Rouché’s theorem, \(f_{n}\) has exactly as many roots of absolute value \(< n+1/n\) as \(x\cdot (x^{d-2}-2)\cdot (x-n)\), so \(f_{n}\) has d roots of absolute value \(< n+1/n\). Now, let z be any complex number with \(\vert z \vert = n - 1/n\), and write \(n=\vert l \vert + m\) with \(m\ge 3\). Then

$$\begin{aligned}&\left| z\cdot (z^{d-2} -2 ) \cdot (z-n)\right| \ge \left| n-\frac{1}{n}\right| \cdot \left| \left( n-\frac{1}{n}\right) ^{d-2}-2\right| \cdot \frac{1}{n} \\&\quad \ge \left| n-\frac{1}{n}\right| \cdot \left| \left( n-\frac{1}{n}\right) -2\right| \cdot \frac{1}{n} \\&\quad = \left( \vert l \vert +m- \frac{1}{\vert l \vert +m}\right) \left( \vert l \vert +m- \frac{1}{\vert l \vert +m}-2\right) \cdot \frac{1}{\vert l \vert +m} \\&\quad = \left( 1-\frac{1}{(\vert l \vert +m)^2}\right) \left( \vert l \vert - \frac{1}{\vert l \vert +m} +m-2\right) \\&\quad = \vert l \vert -\frac{\vert l \vert }{(\vert l \vert +m)^2}-\frac{1}{\vert l \vert +m}+(m-2)+\frac{1}{(\vert l \vert +m)^3} \\&\qquad -\frac{m}{(\vert l \vert +m)^2}+\frac{2}{(\vert l \vert +m)^2} \\&\quad > \vert l \vert , \end{aligned}$$

since \(m\ge 3\). Again by Rouché’s theorem, \(f_n\) has \(d-1\) roots of absolute value \(<n-1/n\). Since \(f_n\) has no roots on the circle \(\vert z \vert = n- 1/n\), \(f_n\) has a single root in \((-n-1/n, -n+1/n)\cup (n-1/n, n+1/n)\). Now,

$$\begin{aligned}&\left| \left( -n-\frac{1}{n}\right) \left( \left( -n-\frac{1}{n}\right) ^{d-2}-2\right) \left( -2n-\frac{1}{n}\right) \right| \\&\quad \ge (\vert l \vert +2)\left| \left( n+\frac{1}{n}\right) ^{d-2}-2\right| \left( 2n+\frac{1}{n}\right) \\&\quad \ge (\vert l \vert +2)\,\vert l \vert \, (2(\vert l \vert +2)) \\&\quad \ge (\vert l \vert +2)\,\vert l \vert \, (2\vert l \vert +4) \\&\quad \ge \vert l \vert ^{2}>\vert l \vert . \end{aligned}$$

Similarly,

$$\begin{aligned} \left| \left( -n+\frac{1}{n}\right) \left( \left( -n+\frac{1}{n}\right) ^{d-2}-2\right) \left( -2n+\frac{1}{n}\right) \right| \ge 2\vert l \vert ^2 > \vert l \vert . \end{aligned}$$

Since

$$\begin{aligned} \left( -n-\frac{1}{n}\right) \left( \left( -n-\frac{1}{n}\right) ^{d-2}-2\right) \left( -2n-\frac{1}{n}\right) \end{aligned}$$

has the same sign as

$$\begin{aligned} \left( -n+\frac{1}{n}\right) \left( \left( -n+\frac{1}{n}\right) ^{d-2}-2\right) \left( -2n+\frac{1}{n}\right) , \end{aligned}$$

\(f_{n}(-n+1/n)\) has the same sign as \(f_{n}(-n-1/n)\). Therefore, since there is only one root in the annulus \(\vert z \vert \in (n-1/n,n+ 1/n)\), which is necessarily real, \(f_{n}\) cannot have any root in the interval \((-n-1/n, -n+1/n)\), thus \(f_n\) has a single root in the interval \((n-1/n, n+1/n)\).

To prove the location of \(\alpha _d^{(n)}\), let z be any complex number with \(\vert z \vert =\vert l \vert /n\). Then,

$$\begin{aligned}&\vert z\cdot (z^{d-2} -2 ) \cdot (z-n)\vert \ge \frac{\vert l \vert }{n} \cdot \left( 2-\frac{\vert l \vert }{n}\right) \cdot \left( n-\frac{\vert l \vert }{n}\right) \\&\quad = 2\vert l \vert - 2 \frac{\vert l \vert ^2}{n^2} - \frac{\vert l \vert ^2}{n}+\frac{\vert l \vert ^3}{n^3}> 2\vert l \vert - 2 \frac{\vert l \vert ^2}{n^2} - \frac{\vert l \vert ^2}{n} \\&\quad \ge 2\vert l\vert - \vert l \vert \frac{\vert l \vert ^2+4\vert l\vert }{(\vert l \vert +2)^2} > \vert l \vert . \end{aligned}$$

By Rouché’s theorem, \(f_n\) has exactly as many roots of absolute value \(<\vert l \vert /n\) as the polynomial \(x\cdot (x^{d-2}-2)\cdot (x-n)\). That is, \(f_n\) has exactly one root of absolute value \(<\vert l \vert /n\). This root is necessarily real. A straightforward computation shows that \(f_n(\pm 1/(2n))\) have the same sign as \(f_n(0)\). Hence \(f_n\) cannot have any root in the interval \((-1/(2n),1/(2n))\).

To show the location of \(\alpha _i^{(n)}\) for all \(i\in \{2,\ldots ,d-1\}\), let z be any complex number with \(\vert z \vert =1\). Then,

$$\begin{aligned} \vert z\cdot (z^{d-2} -2 ) \cdot (z-n)\vert&= \vert z^{d-2} -2 \vert \cdot \vert z-n\vert \\&\ge n-1> \vert l \vert , \end{aligned}$$

so \(f_n\) has a single root of absolute value \(<1\). The argument above also shows that \(f_n\) has no roots on the circle \(\vert z \vert =1\). Now, let z be any complex number with \(\vert z \vert =\root d-2 \of {3-1/d}\). Then,

$$\begin{aligned}&\vert z\cdot (z^{d-2} -2 ) \cdot (z-n)\vert \\&\quad \ge \, \left( 3-\frac{1}{d}\right) ^{\frac{1}{d-2}} \cdot \left( 1-\frac{1}{d}\right) \cdot \left( n-\big (3-\frac{1}{d} \big )^{\frac{1}{d-2}}\right) . \end{aligned}$$

By elementary calculus,

$$\begin{aligned} \left( 3-\frac{1}{d}\right) \left( 1-\frac{1}{d}\right) ^{d-2}>1 \quad \forall ~ d\ge 3, \end{aligned}$$

which gives

$$\begin{aligned} \vert z\cdot (z^{d-2} -2 ) \cdot (z-n)\vert \ge n-\left( 3-\frac{1}{d}\right) ^{\frac{1}{d-2}} > \vert l \vert , \end{aligned}$$

where the last inequality follows from the assumption \(n\ge \vert l \vert +3\). Hence, by Rouché’s theorem, \(f_n\) has \(d-1\) roots of absolute value less than \(\root d-2 \of {3-1/d}\). Therefore, \(f_n\) has exactly \(d-2\) roots with absolute values in the interval \((1, \root d-2 \of {3-1/d})\).

The last part of the lemma is obvious, since \(x \cdot (x^{d-2}-2)\cdot (x-n)\) changes the sign at 0 and at n in the same way. \(\square \)
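The real-root statements of Lemma 1 can be illustrated for one parameter choice using exact rational arithmetic. The sketch below takes \(d=4\), \(l=3\), \(n=6\) (so that \(n\ge \vert l\vert +3\)) and inspects sign changes of \(f_n\) at the interval endpoints from the lemma. The variable names are hypothetical, and a sign change only shows that an odd number of roots lies in an interval; the count of exactly one comes from the Rouché argument above.

```python
from fractions import Fraction


def f_n(x, d, n, l):
    """Evaluate f_n(x) = x * (x^(d-2) - 2) * (x - n) + l."""
    return x * (x ** (d - 2) - 2) * (x - n) + l


def sign(v):
    return (v > 0) - (v < 0)


d, l, n = 4, 3, 6
L = abs(l)


def s(x):
    return sign(f_n(Fraction(x), d, n, l))


eps = Fraction(1, n)
# A sign change across (n - 1/n, n + 1/n) brackets the large real root.
big_root_bracketed = s(n - eps) != s(n + eps)
# Same sign at -1/(2n), 0 and 1/(2n): consistent with no root near zero.
no_root_near_zero = s(Fraction(-1, 2 * n)) == s(0) == s(Fraction(1, 2 * n))
# Which of the two small intervals carries the sign change?
small_root_negative = s(Fraction(-L, n)) != s(Fraction(-1, 2 * n))
small_root_positive = s(Fraction(1, 2 * n)) != s(Fraction(L, n))
# alpha_1 < n exactly when the sign change happens on (n - 1/n, n).
alpha_1_below_n = s(n - eps) != s(n)

print(big_root_bracketed, no_root_near_zero)
print(small_root_negative, small_root_positive, alpha_1_below_n)
```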

Lemma 2

Let \(n\ge \vert l \vert +3\). Then \(f_n\) is irreducible in \({\mathbb {Q}}[x]\) whenever l is odd.

Proof

From Lemma 1 we know \(\alpha _1^{(n)} > \vert l \vert \). Hence, \(\alpha _1^{(n)}\) must be a conjugate of the only root of \(f_n\) which is less than 1 in absolute value. If \(f_n\) were reducible, then some product of the elements \(\alpha _2^{(n)},\ldots ,\alpha _{d-1}^{(n)}\) would have to be a divisor of l. But every such product lies strictly between 1 and 3 in absolute value. Since l is odd by assumption, 2 does not divide l, and so \(f_n\) is necessarily irreducible. \(\square \)

Lemma 3

Let p be a prime and let \(f=x^d + a_{d-1}x^{d-1}+\cdots +a_2 x^2 + a_1 x + a_0 \in {\mathbb {Z}}[x]\) be such that \(p\mid a_i\) for all \(i\in \{0,\ldots ,d-1\}\) and \(p^2 \not \mid a_2\). Then either f has a divisor of degree \(\le 2\) or f is irreducible.

Proof

This follows exactly as the classical Eisenstein criterion. Assume that \(f=g\cdot h\), where

$$\begin{aligned} g(x)=x^r + g_{r-1}x^{r-1}+\cdots +g_0 \quad \text { and } \quad h(x)=x^s+h_{s-1}x^{s-1}+\cdots +h_0 \in {\mathbb {Z}}[x] \end{aligned}$$

with \(r,s\ge 3\). Since the reduction of \(g\cdot h\) modulo p is equal to \(x^d \in ({\mathbb {Z}}/p{\mathbb {Z}})[x]\) and \(({\mathbb {Z}}/p{\mathbb {Z}})[x]\) is an integral domain, we know that each non-leading coefficient of g and h is divisible by p. It follows that \(p^2 \mid g_0h_2 + g_1 h_1 + g_2 h_0 =a_2\), which is a contradiction. \(\square \)

Lemma 4

Let \(n \ge \vert l \vert +3\), and let n and l both be even. Then \(f_n\) is irreducible.

Proof

We first note that \(f_n\) does not have a factor of degree 1. Otherwise, some divisor a of l would be a root of \(f_n\). But \(\vert a (a-n)\vert \ge n -1 \ge \vert l \vert +1\). Hence, in particular, \(f_n(a)\ne 0\) for all \(a\mid l\). It follows that \(f_n\) is irreducible for \(d=3\). From now on we assume \(d\ge 4\).

If l and n are even, then

$$\begin{aligned} f_n(x)=x(x^{d-2}-2)(x-n)+l=x^d -nx^{d-1} -2 x^2 +2n x +l \end{aligned}$$

is—by Lemma 3—irreducible if it does not have a factor of degree 2.
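The expansion of \(f_n\) and the divisibility hypotheses of Lemma 3 with \(p=2\) can be verified mechanically for sample parameters. The sketch below multiplies out the factors with integer coefficient lists; the helper names and the sample values \(d=5\), \(n=8\), \(l=4\) are chosen for illustration.

```python
def poly_mul(p, q):
    """Multiply integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def f_n_coefficients(d, n, l):
    """Coefficients of x * (x^(d-2) - 2) * (x - n) + l, lowest degree first."""
    middle = [-2] + [0] * (d - 3) + [1]          # x^(d-2) - 2
    coeffs = poly_mul(poly_mul([0, 1], middle), [-n, 1])
    coeffs[0] += l
    return coeffs


def lemma3_hypotheses_hold(coeffs, p):
    """Monic, p divides every non-leading coefficient, and p^2 does not divide a_2."""
    return (
        coeffs[-1] == 1
        and all(c % p == 0 for c in coeffs[:-1])
        and coeffs[2] % (p * p) != 0
    )


coeffs = f_n_coefficients(d=5, n=8, l=4)
print(coeffs)                              # x^5 - 8x^4 - 2x^2 + 16x + 4, lowest degree first
print(lemma3_hypotheses_hold(coeffs, 2))
```

When n is odd the coefficient of \(x^{d-1}\) is odd, so the hypotheses fail; this is why the parity of n matters here.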

Since \(\alpha _1^{(n)}\) is larger than \(\vert l \vert \) (which is the absolute value of the product of all roots of \(f_n\)), it must be conjugate to \(\alpha _d^{(n)}\) which is the only root of absolute value \(\le 1\). If \(\alpha _d^{(n)}\) were the only other conjugate of \(\alpha _1^{(n)}\), then \(\alpha _1^{(n)}+\alpha _d^{(n)} \in {\mathbb {Z}}\). This is not possible by Lemma 1. This means that there is no factor of degree 2 having \(\alpha _1^{(n)}\) or \(\alpha _d^{(n)}\) as a root. This proves that \(f_n\) is irreducible for \(d=4\). For \(d\ge 5\) the only possibility of a divisor of degree 2 is \(x^2 - (\alpha _i^{(n)}+\alpha _j^{(n)})x + \alpha _i^{(n)}\alpha _j^{(n)}\), for \(i\ne j\in \{2,\ldots ,d-1\}\). By Lemma 1, we have

$$\begin{aligned} \vert \alpha _i^{(n)}\alpha _j^{(n)}\vert >1 \quad \text { and } \quad \vert \alpha _i^{(n)}\alpha _j^{(n)} \vert< \root d-2 \of {3-\frac{1}{d}}^2 < 2. \end{aligned}$$

Hence, such a polynomial is not in \({\mathbb {Z}}[x]\). We conclude that \(f_n\) does not have a factor of degree \(\le 2\) and therefore \(f_n\) is irreducible. \(\square \)

Theorem 5

Let \(d\ge 3\) and \(l\in {\mathbb {Z}}{\setminus }\{\pm 1,0\}\) such that \((d,l)\notin \{(3,2),(3,-2)\}\). Moreover, let \(b_1,b_2,\ldots \) be the sequence from Proposition 1 and \(c\ge 2\) be an integer with \(c\ne 2\) if \(d\in \{3,4\}\). Then any root \(\alpha \) of

$$\begin{aligned} f_{\vert l \vert ^{b_c-1}}(x)=x(x^{d-2}-2)(x-\vert l \vert ^{b_c-1})+l \end{aligned}$$

is an algebraic integer of degree d, norm l, and orbit size \(c+2\).

Proof

The cases we have to exclude are those which violate assumption (ii) in Proposition 1 or satisfy \(\vert l^{b_c-1} \vert <\vert l\vert +3\).

In Lemmas 2 and 4, we proved that \(\alpha \) has degree d. Moreover, by Lemma 1, \(\alpha \) satisfies assumptions (i) and (ii) from Proposition 1. As usual we denote by \(\alpha _1,\ldots ,\alpha _d\) the full set of conjugates of \(\alpha \). Then by Lemma 1, we obtain \(\vert \alpha _d l^{b_c} \vert > \vert l \vert /2 \ge 1\) and \(\vert \alpha _1 \vert < \vert l^{b_{c}-1} \vert +1 \le \vert l^{b_c}\vert \).

Furthermore, we know that

$$\begin{aligned} \vert \alpha _1\vert > \vert l \vert ^{b_c-1}-1 \ge \vert l \vert ^{b_{c-1}} \quad \text { and } \quad \vert \alpha _d l^{b_{c-1}}\vert< \frac{\vert l \vert ^{b_{c-1}+1}}{\vert l \vert ^{b_c-1}}<1. \end{aligned}$$

Again from Lemma 1 we also have \(\vert \alpha _d l^{b_{c-2}}\vert < 1\) and \(\vert \alpha _1\vert > \vert l \vert ^{b_{c-2}}\), if \(c \ge 3\).

We have shown that, in the notation of Proposition 1, \(c(\alpha )=c\), and hence \(\# {\mathcal {O}}_M(\alpha )=c+2\). \(\square \)

Remark 1

A closed formula for the recursion \(b_1,b_2,\ldots \) is

$$\begin{aligned} b_n=\frac{1}{d}((d-1)^n+(-1)^{n-1}). \end{aligned}$$

So Theorem 5 is fairly effective.
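To illustrate how explicit this is, the following minimal sketch evaluates the closed formula of Remark 1 and expands the polynomial of Theorem 5 into its integer coefficients. The function names `b_closed` and `theorem5_coefficients` are our own, and the sketch does not check the hypotheses of Theorem 5 on \((d,l,c)\); it only performs the arithmetic.

```python
def b_closed(n, d):
    """Closed formula from Remark 1: b_n = ((d-1)^n + (-1)^(n-1)) / d."""
    numerator = (d - 1) ** n + (-1) ** (n - 1)
    # (d-1)^n is congruent to (-1)^n modulo d, so the division is exact.
    assert numerator % d == 0
    return numerator // d


def theorem5_coefficients(d, l, c):
    """Coefficients (highest degree first) of x(x^(d-2) - 2)(x - N) + l,
    where N = |l|^(b_c - 1).  Expanding gives
    x^d - N x^(d-1) - 2 x^2 + 2 N x + l."""
    big_n = abs(l) ** (b_closed(c, d) - 1)
    coeffs = [0] * (d + 1)        # coeffs[k] is the coefficient of x^(d-k)
    coeffs[0] += 1                # x^d
    coeffs[1] += -big_n           # -N x^(d-1)
    coeffs[d - 2] += -2           # -2 x^2 (same slot as x^(d-1) when d = 3)
    coeffs[d - 1] += 2 * big_n    # 2 N x
    coeffs[d] += l                # constant term
    return coeffs
```

For instance, with \(d=5\), \(l=2\), \(c=2\) one gets \(b_2=3\), \(N=4\), and `theorem5_coefficients(5, 2, 2)` returns `[1, -4, 0, -2, 8, 2]`, that is, \(x^5-4x^4-2x^2+8x+2\).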

Corollary 2

For any triple \((d,l,k)\) of integers, with \(d\ge 3\), \(l\notin \{\pm 1,0\}\), and \(1\le k\), there are algebraic integers \(\alpha \) with \([{\mathbb {Q}}(\alpha ):{\mathbb {Q}}]=d\), \(N(\alpha )=l\) and \(\# {\mathcal {O}}_M(\alpha )=k\).

Proof

For (3, 2, k) and \((3,-2,k)\) this is due to Dubickas [10] (note that he states the case \(N(\alpha )=2\), but then \(-\alpha \) does the job in the case of negative norm). Together with Theorem 5 and the examples at the beginning of this note, we conclude the corollary. \(\square \)

3 Behavior of degree 4 units and proof of Theorem 2

In light of Theorem 1, one might ask if arbitrarily long but finite orbits occur for algebraic units. In this section we will prove Theorem 2, which states that the orbit size of an algebraic unit of degree 4 must be 1, 2, or \(\infty \).

Let \(\alpha \) be an algebraic unit of degree 4. If \(\alpha \) is a root of unity, a Pisot number, a Salem number, or the inverse of such a number, we surely have \(\# {\mathcal {O}}_M(\alpha )\le 2\). Hence, we may and will assume for the rest of this section that the conjugates of \(\alpha \) satisfy

$$\begin{aligned} \vert \alpha _1 \vert \ge \vert \alpha _2\vert> 1 > \vert \alpha _3 \vert \ge \vert \alpha _4 \vert . \end{aligned}$$

Denote the Galois group of \({\mathbb {Q}}(\alpha _1,\alpha _2,\alpha _3,\alpha _4)/{\mathbb {Q}}\) by \(G_{\alpha }\). The Galois orbit of any \(\beta \in {\mathbb {Q}}(\alpha _1,\alpha _2,\alpha _3,\alpha _4)\) is denoted by \(G_\alpha \cdot \beta \).

Then \(M(\alpha )=\pm \alpha _1\alpha _2\) and

$$\begin{aligned} G_{\alpha }\cdot (\alpha _1 \alpha _2) \subseteq \{\alpha _1\alpha _2, \alpha _1\alpha _3,\alpha _1\alpha _4,\alpha _2\alpha _3,\alpha _2\alpha _4,\alpha _3\alpha _4 \}. \end{aligned}$$

Lemma 5

If \(\vert \alpha _1\alpha _4\vert = 1\) or \(\vert \alpha _1\alpha _3\vert = 1\), then we have either \(\# {\mathcal {O}}_M(\alpha )=2\) or \(\# {\mathcal {O}}_M(\alpha )=\infty \).

Proof

If \(\vert \alpha _1\alpha _4\vert =1\), then also \(\vert \alpha _2\alpha _3\vert =1\), and if \(\vert \alpha _1\alpha _3\vert =1\), then also \(\vert \alpha _2\alpha _4\vert =1\). In both cases we see that

$$\begin{aligned} \vert \alpha _1\vert =\vert \alpha _2\vert ~ \Longleftrightarrow ~ \vert \alpha _3\vert =\vert \alpha _4\vert . \end{aligned}$$
(5)

We first assume that \(\alpha _1\notin {\mathbb {R}}\). Then \(\alpha _2=\overline{\alpha _1}\) and hence \(\vert \alpha _1\vert =\vert \alpha _2\vert \). Clearly \(M(\alpha _1)=\alpha _1\alpha _2\). By our assumptions and (5), all values \(\vert \alpha _1\alpha _3\vert \), \(\vert \alpha _1\alpha _4\vert \), \(\vert \alpha _2\alpha _3\vert \), \(\vert \alpha _2\alpha _4\vert \), \(\vert \alpha _3\alpha _4\vert \) are less than or equal to 1. Hence \(M^{(2)}(\alpha _1)=M(\alpha _1\alpha _2)=\alpha _1\alpha _2\). Therefore, \(\# {\mathcal {O}}_M(\alpha _1)=2\).

If \(\alpha _1\in {\mathbb {R}}\) and \(\vert \alpha _1\vert =\vert \alpha _2\vert \), then \(\alpha _2=-\alpha _1\) and \(\alpha _4=-\alpha _3\). Hence, the only non-trivial Galois conjugate of \(M(\alpha _1)=\alpha _1^2\) is \(\alpha _3^2\), which lies inside the unit circle. Therefore, \(M^{(2)}(\alpha _1) = \alpha _1^2\) and \(\# {\mathcal {O}}_M(\alpha _1)=2\).

From now on we assume that \(\vert \alpha _1 \vert \ne \vert \alpha _2\vert \). Then, by (5), we have

$$\begin{aligned} \vert \alpha _1 \vert> \vert \alpha _2\vert> 1> \vert \alpha _3 \vert > \vert \alpha _4\vert \end{aligned}$$

and \(\alpha _1\) must be totally real. Moreover, we see that

$$\begin{aligned} \alpha _1^n, \alpha _2^n, \alpha _3^n, \alpha _4^n \quad \text { are pairwise distinct for all } n\in {\mathbb {N}}, \end{aligned}$$
(6)

and

$$\begin{aligned} (\alpha _1\alpha _2)^n, (\alpha _3\alpha _4)^n, (\alpha _1\alpha _3)^n, (\alpha _2\alpha _4)^n \quad \text { are pairwise distinct for all } n\in {\mathbb {N}}. \end{aligned}$$
(7)

We notice that in this situation it is not possible that \(\vert \alpha _1 \alpha _3 \vert =1\), since this would imply \(\vert \alpha _2 \alpha _4\vert <\vert \alpha _1 \alpha _3 \vert =1\), which contradicts \(1=\vert \alpha _1\alpha _2\alpha _3\alpha _4\vert \). Therefore, \(\vert \alpha _1 \alpha _4 \vert =1\), and \(\alpha _4 = \pm \alpha _1^{-1}\). It follows that also \(\alpha _3=\pm \alpha _2^{-1}\). This gives natural constraints on the Galois group \(G_{\alpha }\), namely

$$\begin{aligned} G_{\alpha }\subseteq \{\mathrm{id},(12)(34),(13)(24),(14)(23),(14),(23),(1342),(1243)\}\subseteq S_4. \end{aligned}$$

In particular, since \(G_{\alpha }\) is a transitive subgroup of \(S_4\) with order divisible by 4,

$$\begin{aligned} G_{\alpha }=\{\mathrm{id},(12)(34),(13)(24),(14)(23)\} \text { or } \{\mathrm{id}, (1342),(14)(23),(1243)\} \subseteq G_{\alpha }. \end{aligned}$$

In the first case, \(G_{\alpha }\cdot (\alpha _1\alpha _2)=\{\alpha _1\alpha _2,\alpha _3\alpha _4\}\), which implies that \(\alpha _1\alpha _2\) is a quadratic unit. Hence \(\# {\mathcal {O}}_M(\alpha )=\# {\mathcal {O}}_M(\alpha _1\alpha _2)+1=2\).

In the second case, \(G_{\alpha }\cdot (\alpha _1\alpha _2)=\{\alpha _1\alpha _2,\alpha _3\alpha _4,\alpha _1\alpha _3,\alpha _2\alpha _4\}\). Note that \(\alpha _1\alpha _2\) is still of degree 4 by (7). Hence \(M^{(2)}(\alpha _1)=M(\alpha _1\alpha _2)=\pm \alpha _1^2\alpha _2\alpha _3= \alpha _1^2\). By (6) it follows that \(M^{(3)}(\alpha ) = M(\alpha _1^2) =(\alpha _1\alpha _2)^2 = M(\alpha )^2\). Now, by induction and (7) and (6), it follows that \(M^{(n)}(\alpha _1)=\alpha _1^{2^{n/2}}\) for all even \(n \in {\mathbb {N}}\). Hence \(\# {\mathcal {O}}_M(\alpha _1)=\infty \). \(\square \)
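For the reader's convenience, we spell out the induction in the second case as we read it from the argument above (the indexing by m is our own bookkeeping). Using (6) for the conjugates of powers of \(\alpha _1\) and (7) for the conjugates of powers of \(\alpha _1\alpha _2\), together with \(\vert \alpha _1\alpha _3\vert =\vert \alpha _1/\alpha _2\vert >1\), one obtains for all \(m\ge 1\)

$$\begin{aligned} M^{(2m)}(\alpha )&= \alpha _1^{2^m}, \\ M^{(2m+1)}(\alpha )&= M\left( \alpha _1^{2^m}\right) = (\alpha _1\alpha _2)^{2^m}, \\ M^{(2m+2)}(\alpha )&= M\left( (\alpha _1\alpha _2)^{2^m}\right) = (\alpha _1^2\alpha _2\alpha _3)^{2^m} = \alpha _1^{2^{m+1}}. \end{aligned}$$

Since \(\vert \alpha _1\vert>\vert \alpha _2\vert >1\), these values satisfy \(\alpha _1^{2^m}< (\alpha _1\alpha _2)^{2^m} < \alpha _1^{2^{m+1}}\), so the orbit is strictly increasing and in particular infinite.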

From now on, we assume:

$$\begin{aligned} \vert \alpha _1 \alpha _4 \vert \ne 1\ne \vert \alpha _1\alpha _3\vert . \end{aligned}$$
(8)

Lemma 6

Assuming (8), if \(\alpha _1^n = \alpha _2^n\) or \(\alpha _3^n=\alpha _4^n\) for some \(n\in {\mathbb {N}}\), then \(\# {\mathcal {O}}_M(\alpha _1)=2\).

Proof

If \(\alpha _1^n=\alpha _2^n\) for some \(n\in {\mathbb {N}}\), then \(\alpha _1/\alpha _2\) is a root of unity. Since none of the elements \(\alpha _1/\alpha _3\), \(\alpha _1/\alpha _4\), \(\alpha _2/\alpha _3\), \(\alpha _2/\alpha _4\), \(\alpha _3/\alpha _1\), \(\alpha _3/\alpha _2\), \(\alpha _4/\alpha _1\), \(\alpha _4/\alpha _2\) lies on the unit circle, we have

$$\begin{aligned} G_{\alpha }\cdot \left( \frac{\alpha _1}{\alpha _2}\right) \subseteq \left\{ \frac{\alpha _1}{\alpha _2},\frac{\alpha _2}{\alpha _1}, \frac{\alpha _3}{\alpha _4},\frac{\alpha _4}{\alpha _3}\right\} . \end{aligned}$$

Hence

$$\begin{aligned} G_{\alpha }\subseteq \{\mathrm{id}, (12),(12)(34),(13)(24),(14)(23),(1324),(1423)\}. \end{aligned}$$

This implies

$$\begin{aligned} M^{(2)}(\alpha _1)=M(\pm \alpha _1\alpha _2)= \pm \alpha _1\alpha _2 = M(\alpha _1), \end{aligned}$$

and hence \(\# {\mathcal {O}}_M(\alpha _1)=2\). The same proof applies if \(\alpha _3^n=\alpha _4^n\). \(\square \)

Lemma 7

Assuming (8) and \(\# {\mathcal {O}}_M(\alpha _1)>2\), we have

  1. (a)

    \(\vert \alpha _1 \alpha _2 \vert >1\), \(\vert \alpha _1\alpha _3\vert >1\).

  2. (b)

    \(\vert \alpha _3 \alpha _4 \vert <1\), \(\vert \alpha _2\alpha _4\vert <1\).

  3. (c)

    One of the values \(\vert \alpha _1\alpha _4\vert \) and \(\vert \alpha _2\alpha _3\vert \) is \(<1\) and the other is \(>1\).

  4. (d)

    \(\alpha _1^n\), \(\alpha _2^n\), \(\alpha _3^n\), \(\alpha _4^n\) are pairwise distinct for all \(n\in {\mathbb {N}}\).

  5. (e)

    \((\alpha _1\alpha _2)^n\), \((\alpha _3\alpha _4)^n\), \((\alpha _1\alpha _3)^n\), \((\alpha _2\alpha _4)^n\) are pairwise distinct for all \(n\in {\mathbb {N}}\).

Proof

Obviously \(\vert \alpha _1\alpha _2 \vert >1\) and \(\vert \alpha _3\alpha _4\vert <1\). Moreover, \(1\ne \vert \alpha _1\alpha _3 \vert \ge \vert \alpha _2\alpha _4\vert \) and \( \vert \alpha _1\alpha _3 \vert \cdot \vert \alpha _2\alpha _4\vert =1\). This means \(\vert \alpha _1\alpha _3\vert >1\) and \(\vert \alpha _2\alpha _4\vert <1\), proving parts (a) and (b).

Since \(\vert \alpha _1\alpha _4\vert \cdot \vert \alpha _2\alpha _3\vert =1\) and \(\vert \alpha _1\alpha _4\vert \ne 1\), part (c) follows.

The elements \(\alpha _1\) and \(\alpha _2\) lie outside the unit circle, and \(\alpha _3\) and \(\alpha _4\) lie inside or on the unit circle. Hence, the only possibilities for (d) to fail are \(\alpha _1^n=\alpha _2^n\) or \(\alpha _3^n=\alpha _4^n\) for some \(n\in {\mathbb {N}}\). By the previous lemma, either of these implies \(\# {\mathcal {O}}_M(\alpha _1)=2\), which is excluded by our assumptions.

Part (e) follows immediately from (a), (b) and (d). \(\square \)

Lemma 8

If \(M^{(3)}(\alpha _1)=M(\alpha _1)^2\) and \(\# {\mathcal {O}}_M(\alpha _1)>2\), then \(\# {\mathcal {O}}_M(\alpha _1)=\infty \).

Proof

This is true if assumption (8) is not satisfied, by Lemma 5. If we assume (8), then by Lemma 7 (d) and (e), we are in the same situation as at the end of the proof of Lemma 5. Hence, an easy induction proves the claim. \(\square \)

We now complete the proof of the statement that \(\# {\mathcal {O}}_M(\alpha _1)\in \{1,2,\infty \}\). It suffices to prove this under the assumption (8). From now on we assume \(\# {\mathcal {O}}_M(\alpha )>2\) and show that this implies \(\# {\mathcal {O}}_M(\alpha )=\infty \). By Lemma 7, we have

$$\begin{aligned} M^{(2)}(\alpha )&\in \{\pm \alpha _1^2\alpha _2\alpha _3, \pm \alpha _1^3 \alpha _2\alpha _3\alpha _4,\pm \alpha _1^2\alpha _2^2\alpha _3^2,\pm \alpha _1^2\alpha _2\alpha _4,\pm \alpha _1\alpha _2^2\alpha _3\} \nonumber \\&=\left\{ \pm \frac{\alpha _1}{\alpha _4}, \pm \alpha _1^2, \pm \frac{1}{\alpha _4^2}, \pm \frac{\alpha _1}{\alpha _3}, \pm \frac{\alpha _2}{\alpha _4} \right\} \end{aligned}$$
(9)
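The list (9) can be double-checked mechanically. The sketch below is only a bookkeeping aid under our own encoding: a monomial \(\alpha _1^{e_1}\cdots \alpha _4^{e_4}\) is stored, up to sign, as the exponent vector \((e_1,\ldots ,e_4)\), and two vectors are identified when they differ by a multiple of \((1,1,1,1)\), reflecting \(\alpha _1\alpha _2\alpha _3\alpha _4=\pm 1\). By Lemma 7, the conjugates of \(\alpha _1\alpha _2\) that can lie outside the unit circle are \(\alpha _1\alpha _3\) and exactly one of \(\alpha _1\alpha _4\), \(\alpha _2\alpha _3\); at least one of them must actually occur, since otherwise \(M^{(2)}(\alpha )=M(\alpha )\).

```python
from itertools import combinations

# Exponent vectors of the products alpha_i * alpha_j (names are our own labels).
PAIR = {"12": (1, 1, 0, 0), "13": (1, 0, 1, 0),
        "14": (1, 0, 0, 1), "23": (0, 1, 1, 0)}


def add(u, v):
    """Multiply two monomials, i.e. add their exponent vectors."""
    return tuple(a + b for a, b in zip(u, v))


def same_up_to_unit_relation(u, v):
    """True if u - v is a multiple of (1, 1, 1, 1)."""
    return len({a - b for a, b in zip(u, v)}) == 1


def candidates():
    """Exponent vectors of alpha_1 alpha_2 times a non-empty set of further
    conjugates of modulus > 1 that is allowed by Lemma 7."""
    found = []
    for big in ("14", "23"):          # Lemma 7 (c): exactly one has modulus > 1
        for size in (1, 2):
            for extra in combinations(["13", big], size):
                vec = PAIR["12"]
                for name in extra:
                    vec = add(vec, PAIR[name])
                if vec not in found:
                    found.append(vec)
    return found


# Second line of (9): a1/a4, a1^2, 1/a4^2, a1/a3, a2/a4.
REDUCED = [(1, 0, 0, -1), (2, 0, 0, 0), (0, 0, 0, -2),
           (1, 0, -1, 0), (0, 1, 0, -1)]
```

Here `candidates()` returns five exponent vectors, and each of them matches exactly one entry of `REDUCED` under `same_up_to_unit_relation`, in agreement with (9).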

In two of these cases the orbit of \(\alpha \) can be determined immediately:

  • If \(M^{(2)}(\alpha )=\pm \alpha _1^2\), then, since \(\# {\mathcal {O}}_M(\alpha )>2\), Lemma 6 gives \(\alpha _1^n \ne \alpha _2^n\) for all \(n\in {\mathbb {N}}\). Hence \(M^{(3)}(\alpha )=M(\alpha )^2\), which implies \(\# {\mathcal {O}}_M(\alpha )=\infty \).

  • Similarly, if \(M^{(2)}(\alpha )=\pm 1/\alpha _4^2\), then, since \(\# {\mathcal {O}}_M(\alpha )>2\), Lemma 6 gives \(\alpha _3^n \ne \alpha _4^n\) for all \(n\in {\mathbb {N}}\). Hence \(M^{(3)}(\alpha )=M(\alpha _4^2)=M(\alpha )^2\) and again \(\# {\mathcal {O}}_M(\alpha )=\infty \).

We now study the other three cases.

3.1 The case \(M^{(2)}(\alpha )=\pm \alpha _1/\alpha _4\)

This case occurs if \(\alpha _1\alpha _3\in G_{\alpha }\cdot (\alpha _1\alpha _2)\), and

  • \(\vert \alpha _1\alpha _4\vert >1\) but \(\alpha _1\alpha _4 \not \in G_{\alpha }\cdot (\alpha _1\alpha _2)\), or

  • \(\vert \alpha _2\alpha _3\vert >1\) but \(\alpha _2\alpha _3 \not \in G_{\alpha }\cdot (\alpha _1\alpha _2)\).

In both cases the only possibilities for \(G_{\alpha }\) are the following copies of the cyclic group \(C_4\) and the dihedral group \(D_8\):

  1. (I)

    \(C_4=\{\mathrm{id}, (1342), (14)(23), (1243) \}\), or

  2. (II)

    \(D_8=\{\mathrm{id}, (1243),(14)(23),(1342),(12)(34),(13)(24),(14),(23) \}\).

In both cases \(\{\alpha _1/\alpha _4, \alpha _3/\alpha _2,\alpha _4/\alpha _1, \alpha _2/\alpha _3 \}\) is a full set of conjugates of \(\alpha _1/\alpha _4\). It follows that

$$\begin{aligned} M^{(3)}(\alpha )=M \left( \frac{\alpha _1}{\alpha _4}\right) =\pm \frac{\alpha _1}{\alpha _4}\cdot \frac{\alpha _2}{\alpha _3}= (\alpha _1\alpha _2)^2 = M(\alpha )^2 \end{aligned}$$

Hence, by Lemma 8 we have \(\# {\mathcal {O}}_M(\alpha )=\infty \).

3.2 The case \(M^{(2)}(\alpha )=\pm \frac{\alpha _1}{\alpha _3}\)

This case occurs if \(\alpha _1\alpha _3 \not \in G_{\alpha }\cdot (\alpha _1\alpha _2)\), and \(\vert \alpha _1\alpha _4\vert >1\), and \(\alpha _1\alpha _4 \in G_{\alpha }\cdot (\alpha _1\alpha _2)\).

Hence, the only possibilities for \(G_{\alpha }\) are the following copies of the cyclic group \(C_4\) and the dihedral group \(D_8\):

  1. (I)

    \(C_4=\{\mathrm{id}, (1234), (13)(24), (1432) \}\), or

  2. (II)

    \(D_8=\{\mathrm{id}, (1234),(13)(24),(1432),(12)(34),(14)(23),(13),(24) \}\).

In both cases \(\{\alpha _1/\alpha _3, \alpha _2/\alpha _4,\alpha _3/\alpha _1, \alpha _4/\alpha _2 \}\) is a full set of Galois conjugates of \(\alpha _1/\alpha _3\). It follows that

$$\begin{aligned} M^{(3)}(\alpha )=M\left( \frac{\alpha _1}{\alpha _3}\right) =\pm \frac{\alpha _1}{\alpha _3}\cdot \frac{\alpha _2}{\alpha _4}= (\alpha _1\alpha _2)^2 = M(\alpha )^2 \end{aligned}$$

Hence, again we have \(\# {\mathcal {O}}_M(\alpha )=\infty \) by Lemma 8.

3.3 The case \(M^{(2)}(\alpha )=\pm \frac{\alpha _2}{\alpha _4}\)

This case occurs if \(\alpha _1\alpha _3 \not \in G_{\alpha }\cdot (\alpha _1\alpha _2)\), and \(\vert \alpha _2\alpha _3\vert >1\), and \(\alpha _2\alpha _3 \in G_{\alpha }\cdot (\alpha _1\alpha _2)\).

Hence, the only possibilities for \(G_{\alpha }\) are the following copies of the cyclic group \(C_4\) and the dihedral group \(D_8\):

  1. (I)

    \(C_4=\{\mathrm{id}, (1234), (13)(24), (1432) \}\), or

  2. (II)

    \(D_8=\{\mathrm{id}, (1234),(13)(24),(1432),(12)(34),(14)(23),(13),(24) \}\).

In both cases \(\{\alpha _2/\alpha _4, \alpha _3/\alpha _1,\alpha _4/\alpha _2, \alpha _1/\alpha _3 \}\) is a full set of conjugates of \(\alpha _2/\alpha _4\). It follows that

$$\begin{aligned} M^{(3)}(\alpha )=M\left( \frac{\alpha _2}{\alpha _4}\right) =\pm \frac{\alpha _2}{\alpha _4}\cdot \frac{\alpha _1}{\alpha _3}= \pm (\alpha _1\alpha _2)^2 = M(\alpha )^2 \end{aligned}$$

Hence, also in this case we have \(\# {\mathcal {O}}_M(\alpha )=\infty \) by Lemma 8.

This concludes the proof of Theorem 2. We now prove Corollary 1:

Proof of Corollary 1

Let \(\alpha \) be an algebraic unit of degree 4. We set

$$\begin{aligned} a_n=\log (M^{(n)}(\alpha )) \end{aligned}$$

for all \(n\in {\mathbb {N}}\). If \(\# {\mathcal {O}}_M(\alpha )\le 2\), then \(a_{n+1}=a_n\) for all \(n\in {\mathbb {N}}\). If \(\# {\mathcal {O}}_M(\alpha )=\infty \), then Theorem 2 tells us \(a_3=2a_1\). Moreover, \(M^{(4)}(\alpha )=M(M^{(3)}(\alpha ))=M(M(\alpha )^2)=M(M(\alpha ))^2=M^{(2)}(\alpha )^2\). Hence, \(a_4=2a_2\), and by induction we find \(a_{n+1}=2a_{n-1}\), proving the claim. \(\square \)
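As a small illustration of the recursion \(a_{n+1}=2a_{n-1}\) in the case of an infinite orbit, the following sketch generates the sequence from given values of \(a_1\) and \(a_2\) and compares it with the closed form \(a_{2m+1}=2^m a_1\), \(a_{2m+2}=2^m a_2\) that the recursion implies. The function names are ours, and the input values used in the example are placeholders, not the logarithms of actual Mahler measures.

```python
def log_orbit(a1, a2, n_terms):
    """First n_terms of the sequence with a_{n+1} = 2 * a_{n-1}."""
    seq = [a1, a2]
    while len(seq) < n_terms:
        seq.append(2 * seq[-2])
    return seq[:n_terms]


def closed_form(a1, a2, n):
    """a_n for n >= 1: odd n uses a_1, even n uses a_2, doubled (n-1)//2 times."""
    start = a1 if n % 2 == 1 else a2
    return start * 2 ** ((n - 1) // 2)
```

For example, `log_orbit(3, 5, 6)` gives `[3, 5, 6, 10, 12, 20]`, and `closed_form(3, 5, n)` reproduces the same values for \(n=1,\ldots ,6\).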

4 Symmetric and alternating Galois groups

In this section we will prove Theorem 3. We know that \(\# {\mathcal {O}}_M(\alpha ) \in \{1,2,\infty \}\) whenever \(\alpha \) is an algebraic unit of degree \(\le 4\). (We note in passing that the orbit size for units of degree less than 4 is trivially 1 or 2.) So we assume from now on that \(\alpha \) is an algebraic unit with \([{\mathbb {Q}}(\alpha ):{\mathbb {Q}}]=d\ge 5\). Denote by \(G_{\alpha }\) the Galois group of the Galois closure of \({\mathbb {Q}}(\alpha )\). We assume that \(G_{\alpha }\) contains a subgroup isomorphic to \(A_d\), so \(G_{\alpha }\) is either the full symmetric group or the alternating group. Every self-reciprocal polynomial admits natural restrictions on which permutations of the zeros are given by field automorphisms. Hence, \(\alpha \) cannot be conjugate to ± a Salem number (see [5] for more precise statements on the structure of the Galois group \(G_\alpha \), when \(\alpha \) is a Salem number). If one of \(\pm \alpha ^{\pm 1}\) is conjugate to a Pisot number, then surely \(\# {\mathcal {O}}_M(\alpha )\in \{1,2\}\). Hence, we assume from now on that none of \(\pm \alpha ^{\pm 1}\) is conjugate to a Pisot number.

Hence, if we denote by \(\alpha _1,\ldots ,\alpha _d\) the Galois conjugates of \(\alpha \), we assume

$$\begin{aligned}&\vert \alpha _1\vert \ge \vert \alpha _2\vert \ge \cdots \ge \vert \alpha _r\vert> 1 \ge \vert \alpha _{r+1} \vert \ge \cdots \ge \vert \alpha _d \vert , \nonumber \\&\quad \text {where } r\in \{2,\ldots ,d-2\} \text { and } 1> \vert \alpha _{d-1} \vert . \end{aligned}$$
(10)

We identify \(G_{\alpha }\) with a subgroup of \(S_d\), by the action on the indices of \(\alpha _1,\ldots ,\alpha _d\). In particular, for any \(\sigma \in A_d\) and any \(f_1,\ldots ,f_d \in {\mathbb {Z}}\) the element

$$\begin{aligned} \sigma \cdot (\alpha _1^{f_1}\cdot \ldots \cdot \alpha _d^{f_d}):=\alpha _{\sigma (1)}^{f_1}\cdot \ldots \cdot \alpha _{\sigma (d)}^{f_d} \end{aligned}$$

is a Galois conjugate of \(\alpha _1^{f_1}\cdot \ldots \cdot \alpha _d^{f_d}\).
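The action above only moves exponents between positions, which can be made concrete as follows. This is a minimal sketch with our own helper names `cycle` and `act`; it works on formal exponent vectors only and says nothing about multiplicative relations among the \(\alpha _i\).

```python
def cycle(*points):
    """The cycle (p_1, p_2, ..., p_k) as a dict on 1-based indices."""
    return {p: points[(k + 1) % len(points)] for k, p in enumerate(points)}


def act(sigma, f):
    """Exponent vector of sigma . (alpha_1^f_1 ... alpha_d^f_d).

    The factor alpha_m^{f_m} becomes alpha_{sigma(m)}^{f_m}, so the exponent
    f_m moves to position sigma(m).  Indices missing from sigma are fixed."""
    d = len(f)
    g = [0] * d
    for m in range(1, d + 1):
        g[sigma.get(m, m) - 1] = f[m - 1]
    return tuple(g)
```

For example, `act(cycle(1, 2, 3), (5, 7, 9, 0, 0))` returns `(9, 5, 7, 0, 0)`, corresponding to \(\alpha _2^{5}\alpha _3^{7}\alpha _1^{9}\). A product of two disjoint transpositions can be built as `{**cycle(1, 2), **cycle(3, 4)}`.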

Lemma 9

Let \(i,j,k,l\in \{1,\ldots ,d\}\) be pairwise distinct, and let \(f_1,\ldots ,f_d \in {\mathbb {Z}}\). Then

  1. (a)

    \((i,j,k)\cdot (\alpha _1^{f_1}\cdots \alpha _d^{f_d}) = \alpha _1^{f_1}\cdots \alpha _d^{f_d} ~ \Longleftrightarrow ~ f_i=f_j=f_k\).

  2. (b)

    \((i,j)(k,l)\cdot (\alpha _1^{f_1}\cdots \alpha _d^{f_d}) = \alpha _1^{f_1}\cdots \alpha _d^{f_d} ~ \Longleftrightarrow ~ f_i=f_j \text { and } f_k=f_l\).

Proof

In both statements, the implication \(\Longleftarrow \) is trivial. Let’s start with the other implication in (a). We have

$$\begin{aligned} (i,j,k)\cdot (\alpha _1^{f_1}\cdots \alpha _d^{f_d}) = \alpha _1^{f_1}\cdots \alpha _d^{f_d} ~ \Longrightarrow ~ \alpha _j^{f_i - f_j}\cdot \alpha _k^{f_j-f_k}\cdot \alpha _i^{f_k-f_i}=1. \end{aligned}$$

Since \(d\ge 5\), we may choose two conjugates of \(\alpha \) not among \(\alpha _i,\alpha _j,\alpha _k\)—say \(\alpha _p\) and \(\alpha _q\). Since \(G_{\alpha }\) contains \(A_d\), the elements (ij)(pq), (ik)(pq), (jk)(pq), (ijk), and (ikj) are all contained in \(G_{\alpha }\). Applying these automorphisms to \(\alpha _j^{f_i - f_j}\cdot \alpha _k^{f_j-f_k}\cdot \alpha _i^{f_k-f_i}=1\), yields

$$\begin{aligned} \alpha _j^{f_i - f_j}\cdot \alpha _k^{f_j-f_k}\cdot \alpha _i^{f_k-f_i}&=1 =\alpha _j^{f_i - f_j}\cdot \alpha _i^{f_j-f_k}\cdot \alpha _k^{f_k-f_i}\\ \alpha _i^{f_i - f_j}\cdot \alpha _j^{f_j-f_k}\cdot \alpha _k^{f_k-f_i}&=1 =\alpha _k^{f_i - f_j}\cdot \alpha _j^{f_j-f_k}\cdot \alpha _i^{f_k-f_i}\\ \alpha _k^{f_i - f_j}\cdot \alpha _i^{f_j-f_k}\cdot \alpha _j^{f_k-f_i}&=1 =\alpha _i^{f_i - f_j}\cdot \alpha _k^{f_j-f_k}\cdot \alpha _j^{f_k-f_i}. \end{aligned}$$

Hence

$$\begin{aligned} \left( \frac{\alpha _i}{\alpha _k} \right) ^{2f_k-f_i-f_j} =1, \qquad \left( \frac{\alpha _i}{\alpha _k} \right) ^{2f_i-f_j-f_k} =1, \quad \text {and} \quad \left( \frac{\alpha _i}{\alpha _k} \right) ^{2f_j-f_k-f_i} =1. \end{aligned}$$

But \(\alpha _i/\alpha _k\) is not a root of unity, since it is a Galois conjugate of \(\alpha _1/\alpha _d\), which lies outside the unit circle. It follows that

$$\begin{aligned} 2f_k-f_i-f_j=2f_i-f_j-f_k=2f_j-f_k-f_i=0, \end{aligned}$$

and hence \(f_i=f_j=f_k\). This proves part (a).

Part (b) follows similarly: \((i,j)(k,l)\cdot (\alpha _1^{f_1}\cdots \alpha _d^{f_d})=\alpha _1^{f_1}\cdots \alpha _d^{f_d}\) implies

$$\begin{aligned} \alpha _i^{f_j}\cdot \alpha _j^{f_i}\cdot \alpha _k^{f_l}\cdot \alpha _l^{f_k}= \alpha _i^{f_i}\cdot \alpha _j^{f_j}\cdot \alpha _k^{f_k}\cdot \alpha _l^{f_l}. \end{aligned}$$

Without loss of generality, we assume \(f_j\ge f_i\) and \(f_k\ge f_l\). Using that (il)(jk) is an element of \(G_{\alpha }\), we get

$$\begin{aligned} \left( \frac{\alpha _j}{\alpha _i} \right) ^{f_j-f_i} = \left( \frac{\alpha _k}{\alpha _l} \right) ^{f_k-f_l} \quad \text { and } \quad \left( \frac{\alpha _k}{\alpha _l} \right) ^{f_j-f_i} = \left( \frac{\alpha _j}{\alpha _i} \right) ^{f_k-f_l}. \end{aligned}$$

Multiplying both equations yields

$$\begin{aligned} \left( \frac{\alpha _j}{\alpha _i} \right) ^{(f_j - f_i)+(f_k-f_l)}=\left( \frac{\alpha _k}{\alpha _l} \right) ^{(f_j - f_i)+(f_k-f_l)}, \end{aligned}$$

and hence

$$\begin{aligned} \left( \frac{\alpha _j\cdot \alpha _l}{\alpha _i\cdot \alpha _k} \right) ^{(f_j - f_i)+(f_k-f_l)}=1. \end{aligned}$$

Again, \((\alpha _j\cdot \alpha _l)/(\alpha _i\cdot \alpha _k)\) is a Galois conjugate of \((\alpha _1\cdot \alpha _2)/(\alpha _{d-1}\cdot \alpha _d)\), which lies outside the unit circle, and hence is not a root of unity. Therefore \((f_j - f_i)+(f_k-f_l)=0\). Since \(f_j\ge f_i\) and \(f_k\ge f_l\), it follows that \(f_j=f_i\) and \(f_k = f_l\), proving the lemma. \(\square \)

Lemma 10

Let \(f_1,\ldots ,f_d\) be pairwise distinct integers. Then \([{\mathbb {Q}}(\alpha _1^{f_1}\cdots \alpha _d^{f_d}):{\mathbb {Q}}]=\# G_{\alpha }\).

Proof

The proof is essentially the same as the proof of part (1) in Theorem 1.1 from [2]. Assume there is a \(\sigma ^{-1}\in G_{\alpha }\subseteq S_d\) such that \(\alpha _1^{f_1}\cdots \alpha _d^{f_d} = \sigma ^{-1}\cdot (\alpha _1^{f_1}\cdots \alpha _d^{f_d})\). Then

$$\begin{aligned} 1 = \alpha _1^{f_1-f_{\sigma (1)}}\cdots \alpha _d^{f_d-f_{\sigma (d)}}. \end{aligned}$$
(11)

If \(\sigma \) is an odd permutation, then \(G_\alpha =S_d\), and it was already proven by Smyth (see Lemma 1 of [20]) that \(f_i=f_{\sigma (i)}\) for all i, hence that \(\sigma =\mathrm{id}\). If \(\sigma \) is an even permutation, then by repeated application of Lemma 9 to equation (11) above, this is only possible if \(f_i-f_{\sigma (i)}\) is the same integer for all \(i\in \{1,\ldots ,d\}\), say \(f_i-f_{\sigma (i)}=k\).

Since \(\sigma ^{d!}=\mathrm{id}\), it follows that

$$\begin{aligned} f_1 = k+ f_{\sigma (1)} = 2k + f_{\sigma ^2(1)} = \ldots = d!\cdot k + f_{\sigma ^{d!} (1)}=d!\cdot k + f_1, \end{aligned}$$

and hence \(k=0\). Therefore we have \(f_i=f_{\sigma (i)}\) for all \(i\in \{1,\ldots ,d\}\). But by assumption the integers \(f_1,\ldots ,f_d\) are pairwise distinct, hence, we must again have \(\sigma =\mathrm{id}\). Since in either case, \(\sigma =\mathrm{id}\), this means that the images of \(\alpha _1^{f_1}\cdots \alpha _d^{f_d}\) are distinct under each non-identity element of \(G_\alpha \), so \([{\mathbb {Q}}(\alpha _1^{f_1}\cdots \alpha _d^{f_d}):{\mathbb {Q}}]=\# G_{\alpha } \). \(\square \)

Proposition 2

Let \(M^{(n)}(\alpha )=\alpha _1^{e_1}\cdot \ldots \cdot \alpha _d^{e_d}\) such that the exponents \(e_1,\ldots ,e_d\) are pairwise distinct. Then \(M^{(n+1)}(\alpha )> M^{(n)}(\alpha )\).

Proof

We denote by \(Z_3\) the set of 3-cycles in \(G_{\alpha }\subseteq S_d\). For any \(k\in \{1,\ldots ,d\}\), the number of 3-cycles which fix k is equal to \((d-1)(d-2)(d-3)/3\). For any pair \(k\ne k'\in \{1,\ldots ,d\}\), the number of 3-cycles sending k to \(k'\) is \((d-2)\). Therefore,

$$\begin{aligned}&\left| \prod _{\tau \in Z_3} \tau \cdot M^{(n)}(\alpha ) \right| \nonumber \\&\quad = \left| \alpha _1^{\frac{(d-1)(d-2)(d-3)}{3}e_1 + (d-2)\sum _{k\ne 1} e_k} \cdot \ldots \cdot \alpha _d^{\frac{(d-1)(d-2)(d-3)}{3}e_d + (d-2)\sum _{k\ne d} e_k} \right| . \end{aligned}$$
(12)

Since \(\alpha \) is an algebraic unit, we have

$$\begin{aligned} \prod _{j=1}^d \alpha _j^{\sum _{k=1}^d e_k}=\pm 1. \end{aligned}$$

Hence, the value in (12) is equal to

$$\begin{aligned} \left| \prod _{j=1}^d \alpha _j^{\left( \frac{(d-1)(d-2)(d-3)}{3}-(d-2)\right) e_j} \right| = M^{(n)}(\alpha )^{\frac{(d-1)(d-2)(d-3)}{3} -(d-2)}>M^{(n)}(\alpha ). \end{aligned}$$

The last inequality follows from our general hypothesis that \(d\ge 5\). Since \(e_1,\ldots ,e_d\) are assumed to be pairwise distinct, it follows from Lemma 10 that the factors \(\tau \cdot M^{(n)}(\alpha )\) in (12) are also pairwise distinct conjugates of \(M^{(n)}(\alpha )\). In particular

$$\begin{aligned} M^{(n+1)}(\alpha ) = M(M^{(n)}(\alpha )) \ge \left| \prod _{\tau \in Z_3} \tau \cdot M^{(n)}(\alpha ) \right| > M^{(n)}(\alpha ) \end{aligned}$$

which is what we needed to prove. \(\square \)
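The two counting statements used in this proof are easy to confirm by brute force for small d. The following sketch (helper names are ours) enumerates all 3-cycles of \(S_d\) and counts those fixing a point or sending one point to another; it also evaluates the exponent \(\frac{(d-1)(d-2)(d-3)}{3}-(d-2)\).

```python
from itertools import permutations


def three_cycles(d):
    """All 3-cycles in S_d, each as a dict on {1, ..., d}."""
    cycles = []
    for a, b, c in permutations(range(1, d + 1), 3):
        if a < b and a < c:       # list each cycle (a b c) once, from its smallest point
            sigma = {m: m for m in range(1, d + 1)}
            sigma[a], sigma[b], sigma[c] = b, c, a
            cycles.append(sigma)
    return cycles


def count_fixing(d, k):
    """Number of 3-cycles in S_d that fix k."""
    return sum(1 for s in three_cycles(d) if s[k] == k)


def count_sending(d, k, k2):
    """Number of 3-cycles in S_d that send k to k2 (k != k2)."""
    return sum(1 for s in three_cycles(d) if s[k] == k2)


def exponent(d):
    """The exponent (d-1)(d-2)(d-3)/3 - (d-2) from the proof."""
    return (d - 1) * (d - 2) * (d - 3) // 3 - (d - 2)
```

For \(d=5\) there are 20 three-cycles, 8 of which fix a given point and 3 of which send a given point to another given point, so the exponent is \(8-3=5>1\).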

Lemma 11

Let \(n\in {\mathbb {N}}\) and let \(M^{(n)}(\alpha )=\alpha _1^{e_1}\cdots \alpha _d^{e_d}\). Then we have:

  1. (a)

    \(e_i \ge e_{i+1}\) for all but at most one \(i\in \{1,\ldots ,d-1\}\).

  2. (b)

    If \(e_i< e_{i+1}\) for some \(i\in \{2,\ldots ,d-1\}\), then \(e_{i-1} > e_{i+1}\).

  3. (c)

    If \(e_i<e_{i+1}\) for some \(i\in \{1,\ldots ,d-2\}\), then \(e_{i} > e_{i+2}\).

  4. (d)

    If \(e_i < e_{i+1}\) for some \(i\in \{1,\ldots ,d-1\}\), then

$$\begin{aligned} e_1> e_2> \cdots> e_{i-1}> e_{i+1}> e_i> e_{i+2}> e_{i+3}> \cdots > e_d. \end{aligned}$$

Proof

It is known that \(M^{(n)}(\alpha )\) is a Perron number, which means that \(M^{(n)}(\alpha )\) does not have a Galois conjugate of the same or larger modulus (cf. [10] for this and other properties of values of the Mahler measure). This fact will be used several times in the following proof.

To prove (a), we have two cases: there are three distinct elements \(1\le i<j<k\le d\) such that \(e_i< e_j < e_k\), or else there exist \(1\le i<j<k<l\le d\) such that \(e_i < e_j\) and \(e_k < e_l\). Assume first that there are three distinct elements \(1\le i<j<k\le d\) such that \(e_i< e_j < e_k\). Recall that by definition we have \(\vert \alpha _i\vert \ge \vert \alpha _j\vert \ge \vert \alpha _k\vert \). Therefore \(\vert \alpha _k \vert ^{e_k -e_i} \le \vert \alpha _j \vert ^{e_k-e_i}\), which implies

$$\begin{aligned}&\underbrace{\vert \alpha _i \vert ^{e_i-e_j}}_{\le \vert \alpha _j \vert ^{e_i-e_j}} \cdot \vert \alpha _j\vert ^{e_j-e_k} \cdot \vert \alpha _k\vert ^{e_k-e_i} \le \vert \alpha _j \vert ^{e_i-e_k} \cdot \vert \alpha _k\vert ^{e_k - e_i} \le 1 \\&\quad \Longrightarrow ~\vert \alpha _i^{e_i} \cdot \alpha _j^{e_j} \cdot \alpha _k^{e_k} \vert \le \vert \alpha _i^{e_j} \cdot \alpha _j^{e_k} \cdot \alpha _k^{e_i} \vert \\&\quad \Longrightarrow ~ \vert M^{(n)}(\alpha ) \vert \le \vert (i,k,j)\cdot M^{(n)}(\alpha ) \vert . \end{aligned}$$

By Lemma 9, \((i,k,j)\cdot M^{(n)}(\alpha ) \ne M^{(n)}(\alpha )\) is a Galois conjugate of \(M^{(n)}(\alpha )\). This contradicts the fact that \(M^{(n)}(\alpha )\) is a Perron number. In particular, it is not possible that \(e_i< e_{i+1} < e_{i+2}\) for any \(i\in \{1,\ldots ,d-2\}\).

Now assume that we have \(1\le i<j<k<l \le d\) such that \(e_i < e_j\) and \(e_k < e_l\). Then \(\vert \alpha _i \vert \ge \vert \alpha _j \vert \) and \(\vert \alpha _k \vert \ge \vert \alpha _l \vert \) imply

$$\begin{aligned} \vert \alpha _i \vert ^{e_j - e_i} \cdot \vert \alpha _k \vert ^{e_l -e_k} \ge \vert \alpha _j \vert ^{e_j - e_i} \cdot \vert \alpha _l \vert ^{e_l -e_k}, \end{aligned}$$

and hence

$$\begin{aligned} \vert \alpha _i^{e_i} \cdot \alpha _j^{e_j} \cdot \alpha _k^{e_k} \cdot \alpha _l^{e_l} \vert \le \vert \alpha _i^{e_j} \cdot \alpha _j^{e_i} \cdot \alpha _k^{e_l} \cdot \alpha _l^{e_k} \vert . \end{aligned}$$

This, however, is equivalent to \(\vert M^{(n)}(\alpha )\vert \le \vert (i,j)(k,l)\cdot M^{(n)}(\alpha ) \vert \), which is not possible by Lemma 9, since \(M^{(n)}(\alpha )\) is a Perron number. This proves part (a) of the lemma.

In order to prove part (b), we assume for the sake of contradiction that \(e_i<e_{i+1}\) but \(e_{i-1}\le e_{i+1}\) for some \(i\in \{2,\ldots ,d-1\}\). By part (a), since we already have \(e_i<e_{i+1}\), we know that \(e_{i-1}\ge e_i\). We have

$$\begin{aligned}&(i-1,i,i+1)\cdot \vert \alpha _{i-1}\vert ^{e_{i-1}}\vert \alpha _{i}\vert ^{e_{i}}\vert \alpha _{i+1}\vert ^{e_{i+1}}\\&\quad =\vert \alpha _{i}\vert ^{e_{i-1}}\vert \alpha _{i+1} \vert ^{e_{i}}\vert \alpha _{i-1}\vert ^{e_{i+1}}. \end{aligned}$$

Now,

$$\begin{aligned}&\vert \alpha _{i-1}\vert ^{e_{i-1}-e_{i+1}}\vert \alpha _{i} \vert ^{e_{i}-e_{i-1}}\vert \alpha _{i+1}\vert ^{e_{i+1}-e_{i}}\\&\quad = \vert \alpha _{i-1}\vert ^{e_{i-1}-e_{i+1}}\vert \alpha _{i}\vert ^{e_{i}-e_{i-1}}\vert \alpha _{i+1} \vert ^{e_{i+1}-e_{i-1}}\vert \alpha _{i+1}\vert ^{e_{i-1}-e_{i}}\\&\quad \le \vert \alpha _{i-1}\vert ^{e_{i-1}-e_{i+1}}\vert \alpha _{i}\vert ^{e_{i}-e_{i-1}}\vert \alpha _{i-1} \vert ^{e_{i+1}-e_{i-1}}\vert \alpha _{i}\vert ^{e_{i-1}-e_{i}}\\&\quad = 1. \end{aligned}$$

Therefore,

$$\begin{aligned} \vert M^{(n)}(\alpha ) \vert \le \vert (i-1,i,i+1)\cdot M^{(n)}(\alpha ) \vert , \end{aligned}$$

giving a contradiction to \(M^{(n)}(\alpha )\) being a Perron number.

Similarly, if \(e_i< e_{i+1}\) and \(e_i\le e_{i+2}\), then we know by (a) that \(e_{i+1}\ge e_{i+2}\). This implies that \(\vert M^{(n)}(\alpha ) \vert \le \vert (i,i+2,i+1)\cdot M^{(n)}(\alpha )\vert \). This proves part (c).

So far we have proven that if \(e_i<e_{i+1}\) for some \(i\in \{1,\ldots ,d-1\}\), then we have

$$\begin{aligned} e_1 \ge e_2 \ge \cdots \ge e_{i-1}> e_{i+1}> e_i > e_{i+2} \ge e_{i+3} \ge \cdots \ge e_d. \end{aligned}$$

We need to show that all of the above inequalities are strict. Assume that this is not the case, and that \(e_k=e_{k+1}\). Then \(k,k+1,i,i+1\) must be pairwise distinct. It follows that

$$\begin{aligned} \vert (i,i+1)(k,k+1)\cdot M^{(n)}(\alpha ) \vert = \vert (i,i+1)\cdot M^{(n)}(\alpha ) \vert > \vert M^{(n)}(\alpha ) \vert , \end{aligned}$$

which is a contradiction. The last inequality just follows from the fact that \(\vert \alpha _i \vert ^{e_{i+1}} \cdot \vert \alpha _{i+1}\vert ^{e_i} > \vert \alpha _i \vert ^{e_{i}} \cdot \vert \alpha _{i+1}\vert ^{e_{i+1}}\). \(\square \)
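The mechanism of this proof can be explored numerically. In the sketch below, `logs` is a hypothetical list of integers standing in for \(\log \vert \alpha _1\vert \ge \cdots \ge \log \vert \alpha _d\vert \) (summing to 0, as for a unit); these are not the conjugates of an actual algebraic number. For an exponent vector `e`, the function lists the 3-cycles and double transpositions, the two types covered by Lemma 9, that formally move the monomial and yield a conjugate of at least the same modulus. A non-empty answer means that `e` cannot be the exponent vector of a Perron number for these moduli. All names are our own.

```python
from itertools import permutations


def lemma9_type(p):
    """True if the 0-based permutation p is a 3-cycle or a product of two
    disjoint transpositions."""
    moved = [m for m in range(len(p)) if p[m] != m]
    if len(moved) == 3:
        return True
    return len(moved) == 4 and all(p[p[m]] == m for m in moved)


def dominating_conjugates(e, logs):
    """Permutations p (with p[m] the image of m) of the types in Lemma 9 that
    formally change the monomial and do not decrease its log-modulus."""
    d = len(e)
    base = sum(e[m] * logs[m] for m in range(d))
    found = []
    for p in permutations(range(d)):
        if not lemma9_type(p):
            continue
        if all(e[p[m]] == e[m] for m in range(d)):
            continue              # p fixes the exponent vector formally
        if sum(e[m] * logs[p[m]] for m in range(d)) >= base:
            found.append(p)
    return found
```

With `logs = (15, 7, 1, -8, -15)`, the vector `(1, 2, 3, 0, 0)` (three increasing exponents) is ruled out by the 3-cycle `(2, 0, 1, 3, 4)`, the vector `(0, 1, 0, 0, 1)` (two separate ascents) is ruled out by the double transposition `(1, 0, 2, 4, 3)`, while the strictly decreasing vector `(4, 3, 2, 1, 0)` admits no dominating conjugate of these types. The test is only a necessary condition in the spirit of Lemma 11, not a characterization.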

Lemma 12

Let \(f_1\ge f_2\ge \cdots \ge f_k\ge 0\) be integers, with \(f_1 \ge 1\), and let \(a_1\ge a_2\ge \ldots \ge a_k >0\) be real numbers such that \(\prod _{i=1}^k a_i>1\). Then \(\prod _{i=1}^k a_i^{f_i} >1\).

Proof

We prove the statement by induction on k, where the base case \(k=1\) is trivial. Now assume that the statement is true for k and that there are real numbers \(a_1\ge \ldots \ge a_{k+1} >0\), with \(\prod _{i=1}^{k+1}a_i > 1\), and integers \(f_1\ge \cdots \ge f_{k+1} \ge 0\), with \(f_1\ge 1\). If \(f_1=f_{k+1}\), then the claim follows immediately. Hence, we assume \(f_1 > f_{k+1}\). Set \(f_i'=f_i-f_{k+1}\) for all \(i\in \{1,\ldots ,k+1\}\). Then

$$\begin{aligned} f_1'\ge f_2'\ge \cdots \ge f_k'\ge f_{k+1}'=0 \text { and } f_1'\ge 1. \end{aligned}$$

Moreover, \(\prod _{i=1}^k a_i\) is either greater than or equal to \(\prod _{i=1}^{k+1} a_i >1\) (if \(a_{k+1}\le 1\)), or it is a product of real numbers \(>1\). Hence, our induction hypothesis states \(\prod _{i=1}^k a_i^{f_i'} >1\). This implies

$$\begin{aligned} \prod _{i=1}^{k+1} a_i^{f_i} = \underbrace{\left( \prod _{i=1}^{k+1} a_i \right) ^{f_{k+1}}}_{\ge 1} \cdot \left( \prod _{i=1}^k a_i^{f_i'} \right) > 1, \end{aligned}$$

proving the lemma. \(\square \)
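A quick sanity check of Lemma 12 with exact rational arithmetic may be helpful. The sketch below (function names are ours) tests all admissible exponent vectors up to a chosen bound for one fixed choice of the \(a_i\); it is an illustration, not a substitute for the proof.

```python
from fractions import Fraction
from itertools import product as cartesian
from math import prod


def weighted_product(a, f):
    """The product of a_i^{f_i}."""
    return prod(x ** k for x, k in zip(a, f))


def check_lemma12(a, max_exponent):
    """Test the conclusion of Lemma 12 for every non-increasing integer
    vector f with max_exponent >= f_1 >= 1 and f_k >= 0."""
    # Hypotheses on a: non-increasing, positive, product greater than 1.
    assert all(x >= y for x, y in zip(a, a[1:])) and a[-1] > 0 and prod(a) > 1
    for f in cartesian(range(max_exponent + 1), repeat=len(a)):
        non_increasing = all(x >= y for x, y in zip(f, f[1:]))
        if non_increasing and f[0] >= 1 and not weighted_product(a, f) > 1:
            return False
    return True
```

For \(a=(2,\tfrac{3}{4},\tfrac{3}{4})\), whose product is \(\tfrac{9}{8}\), the check succeeds for all exponents up to 3; for instance \(f=(3,2,1)\) gives \(\tfrac{27}{8}\). The monotonicity of the \(f_i\) is needed: the vector \(f=(1,0,3)\) gives \(\tfrac{27}{32}<1\).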

Proposition 3

Let \(M^{(n)}(\alpha )=\alpha _1^{e_1}\cdots \alpha _d^{e_d}\). If \(e_{i+1}\le e_i\) for all \(i\in \{1,\ldots ,d-1\}\), then \(M^{(n+1)}(\alpha )> M^{(n)}(\alpha )\).

Proof

We show that \(M^{(n)}(\alpha )\) has a non-trivial Galois conjugate outside the unit circle. This immediately implies the claim.

Since \(\alpha \) is an algebraic unit, we may assume that \(e_d=0\). Note, however, that this uses our assumption \(e_{i+1}\le e_{i}\) for all i. We set

$$\begin{aligned} s:=\max \{i\in \{1,\ldots ,d\}\mid e_i\ne 0\}. \end{aligned}$$

By Proposition 2 we may assume that we have \(e_i=e_{i+1}\) for some \(i\in \{1,\ldots ,d-1\}\). This i is not equal to s, since \(e_s\ne 0 =e_{s+1}\) by definition. If \(i\notin \{s-1,s+1\}\), then

$$\begin{aligned} (i,i+1)(s,s+1)\cdot M^{(n)}(\alpha )=\alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _s^{e_{s+1}}\alpha _{s+1}^{e_s}. \end{aligned}$$

If \(i=s-1\), then

$$\begin{aligned} (s-1,s+1,s)\cdot M^{(n)}(\alpha )&= \alpha _1^{e_1}\cdots \alpha _{s-2}^{e_{s-2}}\alpha _{s-1}^{e_{s}}\alpha _s^{e_{s+1}} \alpha _{s+1}^{e_{s-1}} \\&= \alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _s^{e_{s+1}}\alpha _{s+1}^{e_s}. \end{aligned}$$

If finally \(i=s+1\), then

$$\begin{aligned} (s,s+1,s+2)\cdot M^{(n)}(\alpha )&=\alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _s^{e_{s+2}}\alpha _{s+1}^{e_{s}} \alpha _{s+2}^{e_{s+1}}\\&=\alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _s^{e_{s+1}}\alpha _{s+1}^{e_s}. \end{aligned}$$

Since \(e_{s+1}=0\), we see that in any case

$$\begin{aligned} \alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _{s+1}^{e_s} \text { is a non-trivial Galois conjugate of } M^{(n)}(\alpha ). \end{aligned}$$
(13)
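The three permutation computations above can be checked mechanically on exponent vectors. The Python sketch below is only an illustration: the function names `act`, `swap_s` and `is_even` and the sample exponent vectors are hypothetical. It uses the convention that a cycle \((c_1,\ldots ,c_m)\) sends \(\alpha _{c_1}\mapsto \alpha _{c_2}\mapsto \cdots \mapsto \alpha _{c_1}\), which is the convention under which the displayed formulas hold.

```python
def act(cycles, e):
    """Apply disjoint cycles to alpha_1^{e_1} ... alpha_d^{e_d}.

    A cycle (c1, ..., cm) sends alpha_{c1} -> alpha_{c2} -> ... -> alpha_{c1},
    so the exponent carried by alpha_{c_j} moves to alpha_{c_{j+1}}.
    Indices are 1-based as in the text; e is the list [e_1, ..., e_d].
    """
    new = list(e)
    for cyc in cycles:
        for pos, c in enumerate(cyc):
            target = cyc[(pos + 1) % len(cyc)]
            new[target - 1] = e[c - 1]
    return new


def swap_s(e, s):
    """Exponent vector with e_s and e_{s+1} exchanged."""
    new = list(e)
    new[s - 1], new[s] = new[s], new[s - 1]
    return new


def is_even(cycles):
    """A cycle of length m is a product of m - 1 transpositions."""
    return sum(len(c) - 1 for c in cycles) % 2 == 0


# Hypothetical sample with d = 7 and s = 4 (so e_4 != 0 = e_5).
e = [5, 4, 4, 3, 0, 0, 0]
generic_case = act([(2, 3), (4, 5)], e)   # i = 2, not in {s-1, s+1}
case_i_s_plus_1 = act([(4, 5, 6)], e)     # i = s+1 = 5, cycle (s, s+1, s+2)

e2 = [5, 4, 3, 3, 0, 0, 0]                # here e_3 = e_4, so i = s-1 = 3
case_i_s_minus_1 = act([(3, 5, 4)], e2)   # cycle (s-1, s+1, s)
```

In each of the three cases the result equals `swap_s` applied to the original vector with \(s=4\), in line with (13), and each of the permutations used is even.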

We will prove that this Galois conjugate lies outside the unit circle. Again we distinguish several cases.

If \(s\le r-1\), then all of the elements \(\alpha _1,\ldots ,\alpha _{s+1}\) lie outside the unit circle. Hence \(\vert \alpha _1\cdots \alpha _{s-1}\alpha _{s+1} \vert >1\).

If \(s\ge r+1\), then \(\vert \alpha _1\cdots \alpha _{s-1}\alpha _{s+1} \vert = \vert \alpha _s \alpha _{s+2}\cdots \alpha _{d}\vert ^{-1} >1\), since all of \(\alpha _s,\ldots ,\alpha _d\) lie inside the closed unit disc and \(\vert \alpha _d\vert < 1\).

Lastly, we consider the case \(2\le s=r \le d-2\). Then surely \(\vert \alpha _1 \cdots \alpha _{r-1} \vert \ge \vert \alpha _r \vert \) and \(\vert \alpha _{r+1}\vert \ge \vert \alpha _{r+2}\cdots \alpha _{d} \vert \), where the first inequality is strict whenever \(r\ne 2\), and the second inequality is strict whenever \(r\ne d-2\). By our general assumption we have \(d\ge 5\), so \(r=2\) and \(r=d-2\) cannot both hold, and at least one of the two inequalities is strict. Multiplying them gives \(\vert \alpha _1\cdots \alpha _{r-1}\alpha _{r+1}\vert > \vert \alpha _r \alpha _{r+2} \cdots \alpha _d\vert \). Since the product of all \(\alpha _i\) is \(\pm 1\), it follows that \(\vert \alpha _1\cdots \alpha _{s-1}\alpha _{s+1} \vert >1\).

Hence, in any case we have \(\vert \alpha _1\vert \cdots \vert \alpha _{s-1}\vert \cdot \vert \alpha _{s+1} \vert >1\). From our assumption \(e_1\ge \cdots \ge e_d\) it follows by Lemma 12 that \(\vert \alpha _1^{e_1}\cdots \alpha _{s-1}^{e_{s-1}}\alpha _{s+1}^{e_s}\vert >1\). Therefore, \(M^{(n)}(\alpha )\) has a non-trivial Galois conjugate outside the unit circle [see (13)]. Hence \(M^{(n+1)}(\alpha )=M(M^{(n)}(\alpha ))> M^{(n)}(\alpha )\). \(\square \)

We are now ready to prove Theorem 3.

Proof of Theorem 3

As stated at the beginning of this section, we may assume that \(d\ge 5\), and that the elements \(\pm \alpha ^{\pm 1}\) are conjugates of neither a Pisot number nor a Salem number. Hence, we may assume that the hypothesis (10) is met. Let \(n\in {\mathbb {N}}\) be arbitrary. Then for some \(e_1,\ldots ,e_d \in {\mathbb {N}}_0\), we have \(M^{(n)}(\alpha ) =\alpha _1^{e_1}\cdots \alpha _d^{e_d}\). We have seen in Lemma 11 that one of the following statements applies:

  (i) \(e_1\ge e_2 \ge \cdots \ge e_d\), or

  (ii) the integers \(e_1,\ldots ,e_d\) are pairwise distinct.

In case (i), we have \(M^{(n+1)}(\alpha ) > M^{(n)}(\alpha )\) by Proposition 3. In case (ii), we have \(M^{(n+1)}(\alpha ) > M^{(n)}(\alpha )\) by Proposition 2. Hence \(\# {\mathcal {O}}_M(\alpha )=\infty \). \(\square \)

5 Arbitrarily large finite orbit size for certain units

Let \(d=4k\), with an integer \(k\ge 3\). Now, we will show that there exist algebraic units of degree d with arbitrarily large orbit size, proving Theorem 4.

Proof of Theorem 4

Let \(\alpha _1, \beta _1\) be positive real algebraic units satisfying:

  1. \([{\mathbb {Q}}(\beta _1):{\mathbb {Q}}]=2\) and \(\beta _1>1\).

  2. \(\alpha _1\) is a Salem number of degree 2k.

  3. The fields \({\mathbb {Q}}(\alpha _1)\) and \({\mathbb {Q}}(\beta _1)\) are linearly disjoint.

For any \(k\ge 3\) we can indeed find such \(\alpha _1\) and \(\beta _1\). Since there are Salem numbers of any even degree \(\ge 4\), we find an appropriate \(\alpha _1\). Now, we take any prime p which is unramified in \({\mathbb {Q}}(\alpha _1)\), and let \(\beta _1>1\) be an algebraic unit in \({\mathbb {Q}}(\sqrt{p})\). Note that if the above conditions are met by \(\alpha _1\) and \(\beta _1\), then they are met by \(\alpha _1^{\ell }\) and \(\beta _1^{\ell '}\), for any \(\ell , \ell ' \in {\mathbb {N}}\).

We denote the conjugates of \(\alpha _1\) by \(\alpha _2, \ldots , \alpha _{2k}\), with \(\alpha _{2k}=\alpha _1^{-1}\), and the conjugate of \(\beta _1\) by \(\beta _2=\beta _1^{-1}\). Note that \(\alpha _2,\ldots ,\alpha _{2k-1}\) all lie on the unit circle. By assumption (3) the element \(\alpha _1\beta _1\) has degree 4k and a full set of Galois conjugates of \(\alpha _1\beta _1\) is given by

$$\begin{aligned} \{\alpha _i\beta _j : (i,j)\in \{1,\ldots ,2k\}\times \{1,2\}\}. \end{aligned}$$

There are two cases. First, if \(\beta _1>\alpha _1\), then \(\beta _2=\beta _1^{-1}<\alpha _1^{-1}=\alpha _{2k}\), so \(\vert \alpha _i\beta _1\vert >1\) for all \(i\in \{1,\ldots , 2k\}\) and \(\vert \alpha _i\beta _2\vert <\vert \alpha _i\alpha _{2k}\vert \le 1\) for all \(i\in \{1,\ldots , 2k\}\). Hence,

$$\begin{aligned} M(\alpha _1\beta _1)=\left| \prod _{i=1}^{2k}\alpha _i\beta _1\right| =\beta _1^{2k}. \end{aligned}$$
(14)

For the second case, if \(\beta _1<\alpha _1\), then

$$\begin{aligned} \vert \alpha _i\beta _1\vert>1 \iff i\in \{1, \cdots , 2k-1\},\, \text {and} \, \,\vert \alpha _i\beta _2\vert >1 \iff i=1. \end{aligned}$$

Therefore

$$\begin{aligned} M(\alpha _1\beta _1)=\vert \alpha _1\beta _1\vert \cdot \left| \prod _{i=2}^{2k-1}\alpha _i\beta _1\right| \cdot \vert \alpha _1\beta _2\vert =\alpha _1^2\beta _1^{2k-2}. \end{aligned}$$
(15)
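Formulas (14) and (15) depend only on the absolute values of the conjugates, so they can be checked numerically without computing any roots. The following Python sketch does this under stated assumptions: the function name `mahler_of_product` and the sample values are hypothetical, the \(4k\) products \(\alpha _i\beta _j\) are simply taken to be the full set of conjugates (which relies on condition 3 and is not verified by the code), and the moduli of the conjugates of a Salem number are taken to be \(\alpha _1, 1,\ldots ,1,\alpha _1^{-1}\).

```python
import math


def mahler_of_product(alpha, beta, k):
    """Product of max(1, |alpha_i * beta_j|) over all 4k pairs (i, j).

    alpha > 1 plays the role of alpha_1 (degree 2k), beta > 1 that of beta_1.
    Only absolute values enter: |alpha_i| is alpha, 1 (2k - 2 times), 1/alpha,
    and |beta_j| is beta, 1/beta.
    """
    abs_alpha = [alpha] + [1.0] * (2 * k - 2) + [1.0 / alpha]
    abs_beta = [beta, 1.0 / beta]
    measure = 1.0
    for x in abs_alpha:
        for y in abs_beta:
            measure *= max(1.0, x * y)
    return measure


# Illustrative sample values only; they are not claimed to be actual units.
case_14 = mahler_of_product(alpha=1.5, beta=2.0, k=3)  # beta > alpha
case_15 = mahler_of_product(alpha=3.0, beta=2.0, k=3)  # beta < alpha
```

With these sample values, `case_14` agrees with \(\beta _1^{2k}=2^6\) and `case_15` agrees with \(\alpha _1^2\beta _1^{2k-2}=3^2\cdot 2^4\), up to floating-point rounding.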

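We now construct, for any given \(S\in {\mathbb {N}}\), an algebraic unit of degree 4k of finite orbit size \(>S\). Let \(\ell \in {\mathbb {N}}\) be such that \((\alpha _1^\ell )^{2^S}>\beta _1^{(2k-2)^S}\). Since \(2k-2>2\), the same inequality then holds with S replaced by any \(n\in \{0,\ldots ,S\}\). Then by (15), we have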

  • \(M(\alpha _1^\ell \beta _1)=(\alpha _1^\ell )^2\beta _1^{2k-2}\),

  • \(M^{(2)}(\alpha _1^\ell \beta _1)=M((\alpha _1^\ell )^2\beta _1^{2k-2}) =(\alpha _1^\ell )^{2^2}\beta _1^{(2k-2)^2}\), and

  • \(M^{(n)}(\alpha _1^\ell \beta _1)=(\alpha _1^\ell )^{2^n}\beta _1^{(2k-2)^n}\), for all \(n\in \{1,\ldots ,S\}\).

Hence, the orbit size of \(\alpha _1^\ell \beta _1\) is greater than S. However, there exists \(S'>S\) such that \((\alpha _1^\ell )^{2^{S'}}<\beta _1^{(2k-2)^{S'}}\). Assume that \(S'\) is minimal with this property. Then we have

$$\begin{aligned} M^{(S'+1)}(\alpha _1^\ell \beta _1)=M\left( (\alpha _1^\ell )^{2^{S'}}\beta _1^{(2k-2)^{S'}}\right) {\mathop {=}\limits ^{(14)}}\left( \beta _1^{(2k-2)^{S'}}\right) ^{2k}, \end{aligned}$$

which is of degree 2 and hence fixed by M. Consequently, the orbit size of \( \alpha _1^\ell \beta _1 \) is \( S'+2>S \). \(\square \)
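The construction can be traced on exponent pairs: an orbit element \(\alpha _1^a\beta _1^b\) is sent to \(\alpha _1^{2a}\beta _1^{(2k-2)b}\) by (15) as long as \(\alpha _1^a>\beta _1^b\), and to \(\beta _1^{2kb}\) by (14) otherwise, after which the orbit is stationary. The Python sketch below follows this rule. The names `mahler_step` and `orbit_size` are hypothetical, the comparison is done in floating point (a tie is treated as case (14), an assumption of this sketch), and the sample inputs are approximations of Lehmer's Salem number of degree 10 and of the unit \(3+2\sqrt{2}\). Whether this particular pair satisfies conditions 1 to 3, in particular linear disjointness, is not verified here.

```python
import math


def mahler_step(a, b, k, log_alpha, log_beta):
    """Apply M to alpha_1^a * beta_1^b and return the new exponent pair."""
    if a * log_alpha > b * log_beta:
        # Case (15): alpha_1^a > beta_1^b.
        return 2 * a, (2 * k - 2) * b
    # Case (14): the result beta_1^(2k*b) lies in Q(beta_1).
    return 0, 2 * k * b


def orbit_size(ell, k, log_alpha, log_beta):
    """Number of distinct elements in the orbit of alpha_1^ell * beta_1."""
    a, b = ell, 1
    size = 1  # the starting element itself
    while a > 0:
        a, b = mahler_step(a, b, k, log_alpha, log_beta)
        size += 1
    return size


# Sample inputs (assumptions, see the lead-in): k = 5, so alpha_1 has degree 10.
K = 5
LOG_ALPHA = math.log(1.17628)
LOG_BETA = math.log(3 + 2 * math.sqrt(2))

sizes = [orbit_size(ell, K, LOG_ALPHA, LOG_BETA) for ell in (1, 10, 100, 1000)]
```

For fixed \(\alpha _1\) and \(\beta _1\), the orbit size computed this way is finite for every \(\ell \) and non-decreasing in \(\ell \), and it exceeds any prescribed S once \(\ell \) is large enough, which is the content of the argument above.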