On Schneider’s Continued Fraction Map on a Complete Non-Archimedean Field

Let $\mathcal{M}$ denote the maximal ideal of the ring of integers of a non-Archimedean field $K$ with residue class field $k$, whose group of invertible elements we denote by $k^{\times}$, and with a uniformizer we denote by $\pi$. In this paper, we consider the map $T_v : \mathcal{M} \to \mathcal{M}$ defined by $T_v(x) = \frac{\pi^{v(x)}}{x} - b(x)$, where $b(x)$ denotes the equivalence class to which $\frac{\pi^{v(x)}}{x}$ belongs in $k^{\times}$. We show that $T_v$ preserves Haar measure $\mu$ on the compact abelian topological group $\mathcal{M}$. Let $\mathcal{B}$ denote the Haar $\sigma$-algebra on $\mathcal{M}$. We show the natural extension of the dynamical system $(\mathcal{M}, \mathcal{B}, \mu, T_v)$ is Bernoulli and has entropy $\frac{\#(k)}{\#(k^{\times})} \log(\#(k))$. The first of these two properties is used to study the average behaviour of the convergents arising from $T_v$. Here, for a finite set $A$, its cardinality is denoted by $\#(A)$. In the case $K = \mathbb{Q}_p$, i.e. the field of $p$-adic numbers, the map $T_v$ reduces to the well-studied continued fraction map due to Schneider.


R. Nair. Email: a.haddley@liv.ac.uk; nair@liv.ac.uk. Address: Mathematical Sciences, The University of Liverpool, Peach Street, Liverpool L69 7ZL, UK

Introduction
The purpose of this paper is to calculate the entropy of T. Schneider's continued fraction map, and to show the map has a natural extension which is Bernoulli. This is then used to study the behaviour of averages of convergents arising from Schneider's map. Schneider's map is usually defined on the $p$-adic field for a rational prime $p$, see [26]. In fact we work in a more general setting, which we now describe.

Let $K$ denote a topological field. By this, we mean that the field $K$ is a locally compact group under addition, with respect to a topology. This ensures that there is a translation invariant Haar measure $\mu$ on $K$, which is unique up to scalar multiplication. In the non-Archimedean examples that concern us in this paper, this topology will always be totally disconnected. For an element $a \in K$, we are now able to define its absolute value as $|a| = \frac{\mu(aF)}{\mu(F)}$ for every $\mu$-measurable $F \subseteq K$ of finite positive $\mu$ measure. Let $\mathbb{R}_{\geq 0}$ denote the set of all non-negative real numbers. An absolute value is a function $|\cdot| : K \to \mathbb{R}_{\geq 0}$ such that (i) $|a| = 0$ if and only if $a = 0$; (ii) $|ab| = |a||b|$ for all $a, b \in K$ and (iii) $|a + b| \leq |a| + |b|$ for all pairs $a, b \in K$. The absolute value just defined gives rise to a metric defined by $d(a, b) = |a - b|$ with $a, b \in K$, whose topology coincides with the original topology on the field $K$.

Topological fields come in two types. Those for which (iii) can be replaced by the stronger condition (iii)* $|a + b| \leq \max(|a|, |b|)$ for all $a, b \in K$ are called non-Archimedean fields, and those for which (iii)* fails are called Archimedean fields. In this paper, we shall concern ourselves solely with non-Archimedean fields. Another approach to defining a non-Archimedean field is via discrete valuations. Denote the real numbers by $\mathbb{R}$. Let $K^* = K \setminus \{0\}$.
A map $v : K^* \to \mathbb{R}$ is a valuation if (a) $v(K^*) \neq \{0\}$; (b) $v(xy) = v(x) + v(y)$ for $x, y \in K^*$ and (c) $v(x + y) \geq \min\{v(x), v(y)\}$. Two valuations $v$ and $cv$, for $c > 0$ a real constant, are called equivalent. We extend $v$ to $K$ formally by letting $v(0) = \infty$. The image $v(K^*)$ is an additive subgroup of $\mathbb{R}$ called the value group of $v$. If the value group is isomorphic to $\mathbb{Z}$, we say $v$ is a discrete valuation. Here $\mathbb{Z}$ denotes the set of integers. If $v(K^*) = \mathbb{Z}$, we call $v$ a normalised discrete valuation. To our initial absolute value we associate the valuation described as follows. Pick $0 < \alpha < 1$ and write $|a| = \alpha^{v(a)}$, i.e., let $v(a) = \log_{\alpha} |a|$. Then $v$ is a valuation, an additive version of $|\cdot|$.
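Let $v : K^* \to \mathbb{R}$ be a valuation corresponding to the absolute value $|\cdot| : K \to \mathbb{R}_{\geq 0}$. Then $\mathcal{O} = \{x \in K : v(x) \geq 0\} = \{x \in K : |x| \leq 1\}$ is a ring, called the valuation ring of $v$, and $K$ is its field of fractions. The set of units of $\mathcal{O}$ is $\mathcal{O}^{\times} = \{x \in K : v(x) = 0\}$, and $\mathcal{M} = \{x \in K : v(x) > 0\}$ is the unique maximal ideal of $\mathcal{O}$. Because $\mathcal{M}$ is a maximal ideal, we know $k = \mathcal{O}/\mathcal{M}$ is a field, called the residue field of $v$ or of $K$. Throughout this paper, we assume that $k$ is a finite field. Suppose the valuation $v : K^* \to \mathbb{Z}$ is normalised and discrete. Take $\pi \in \mathcal{M}$ such that $v(\pi) = 1$. We call $\pi$ a uniformizer. Then every $x \in K^*$ can be written uniquely as $x = u\pi^n$ with $u \in \mathcal{O}^{\times}$ and $n \in \mathbb{Z}$. In particular every nonzero $x \in \mathcal{M}$ can be written uniquely as $x = u\pi^n$ for a unit $u \in \mathcal{O}^{\times}$ and $n \geq 1$.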
We now consider two examples. a) $p$-adic numbers: Let $\mathbb{Q}$ denote the rational numbers. For $r = p^{v_p} \frac{u}{v}$ in $\mathbb{Q}$ with $u$ and $v$ coprime to $p$ and to each other, let $|r|_p = p^{-v_p}$. Then $d_p(r, r') = |r - r'|_p$ for $r, r' \in \mathbb{Q}$ defines a metric on $\mathbb{Q}$. The completion of $\mathbb{Q}$ with respect to the metric $d_p$ is a field denoted $\mathbb{Q}_p$, referred to as the $p$-adic numbers. We also use $\mathbb{Z}_p$ to denote $\{x \in \mathbb{Q}_p : |x|_p \leq 1\}$, the ring of $p$-adic integers. It is worth keeping in mind that the metric $d_p$ has the ultrametric property, namely that $d_p(r, r'') \leq \max(d_p(r, r'), d_p(r', r''))$ for all $r, r'$ and $r'' \in \mathbb{Q}_p$. The main characteristics of the field $\mathbb{Q}_p$ that distinguish it from the field $\mathbb{R}$ stem from the ultrametric property. It turns out that $\mathbb{Q}_p$ is a locally compact field and hence comes endowed with a translation invariant Haar measure. In this instance, $K = \mathbb{Q}_p$, $\mathcal{O} = \mathbb{Z}_p$, $\mathcal{M} = p\mathbb{Z}_p$, $\pi = p$ and $k = \mathbb{Z}/p\mathbb{Z}$. See [16] for a clear and succinct introduction to $p$-adic numbers.
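b) The field of formal Laurent series in finite characteristic: Let $q$ be a power of a prime $p$ and let $\mathbb{F}_q$ be the finite field with $q$ elements. Denote by $\mathbb{F}_q[X]$ and $\mathbb{F}_q(X)$ the ring of polynomials with coefficients in $\mathbb{F}_q$ and the quotient field of $\mathbb{F}_q[X]$, respectively. For each $P, Q \in \mathbb{F}_q[X]$ with $P, Q \neq 0$ set $|P/Q| := q^{\deg(P) - \deg(Q)}$, where for an element $g \in \mathbb{F}_q[X]$ we have denoted its degree by $\deg(g)$. Let $\mathbb{F}_q((X^{-1}))$ denote the field of formal Laurent series, i.e. $\mathbb{F}_q((X^{-1})) = \left\{ \sum_{n \geq n_0} a_n X^{-n} : n_0 \in \mathbb{Z},\ a_n \in \mathbb{F}_q \right\}$.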
There are only two types of non-Archimedean local fields: finite extensions of the field of $p$-adic numbers for some rational prime $p$, and fields of formal Laurent series over a finite field. For more details and background to this discussion of non-Archimedean fields see Chapter 2 of [10], and Chapter 4 of [23].
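As a small illustration of example a), the following Python sketch computes the $p$-adic valuation and absolute value of a rational number and checks the ultrametric inequality on a given pair. It works only with exact rationals, not with general elements of $\mathbb{Q}_p$, and the function names `vp`, `abs_p` and `ultrametric_holds` are ours, chosen for illustration.

```python
from fractions import Fraction

def vp(r, p):
    """p-adic valuation v_p(r) of a nonzero rational r."""
    r = Fraction(r)
    if r == 0:
        raise ValueError("v_p(0) is +infinity by convention")
    num, den, v = r.numerator, r.denominator, 0
    while num % p == 0:   # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:   # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(r, p):
    """p-adic absolute value |r|_p = p**(-v_p(r)), with |0|_p = 0."""
    r = Fraction(r)
    return Fraction(0) if r == 0 else Fraction(1, p) ** vp(r, p)

def ultrametric_holds(r, s, p):
    """Check |r + s|_p <= max(|r|_p, |s|_p) for one pair of rationals."""
    return abs_p(r + s, p) <= max(abs_p(r, p), abs_p(s, p))
```

For instance, `abs_p(Fraction(50, 3), 5)` evaluates to `Fraction(1, 25)` because $50/3 = 5^2 \cdot 2/3$, while `abs_p(Fraction(3, 8), 2)` evaluates to `Fraction(8, 1)`.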
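Our primary object of study in this paper is the map $T_v : \mathcal{M} \to \mathcal{M}$ defined by $T_v(x) = \frac{\pi^{v(x)}}{x} - b(x)$, where $b(x)$ denotes the residue class to which $\frac{\pi^{v(x)}}{x}$ belongs in $k$. This gives rise to the continued fraction expansion of $x \in \mathcal{M}$ in the form
$$x = \cfrac{\pi^{a_1}}{b_1 + \cfrac{\pi^{a_2}}{b_2 + \cfrac{\pi^{a_3}}{b_3 + \ddots}}}$$
where $b_n \in k^{\times}$, $a_n \in \mathbb{N}$ for $n = 1, 2, \dots$. Here $\mathbb{N}$ denotes the set of natural numbers. The rational approximants to $x \in \mathcal{M}$ arise in a manner similar to that in the case of the real numbers, as follows. We suppose $A_0 = 0$, $B_0 = 1$, $A_1 = \pi^{a_1}$ and $B_1 = b_1$. Then set $A_n = \pi^{a_n} A_{n-2} + b_n A_{n-1}$ and $B_n = \pi^{a_n} B_{n-2} + b_n B_{n-1}$ (2) for $n \geq 2$. A simple inductive argument gives, for $n = 1, 2, \dots$, the identity $A_{n-1} B_n - A_n B_{n-1} = (-1)^n \pi^{a_1 + \dots + a_n}$.

The map $T_v : \mathcal{M} \to \mathcal{M}$ preserves Haar measure on $\mathcal{M}$. By this we mean that, for each Haar measurable set $A$ contained in $\mathcal{M}$, we have $\mu(T_v^{-1}(A)) = \mu(A)$. To prove that $T_v$ preserves Haar measure on $\mathcal{M}$ we only need to check it for special sets of the form $\pi a + \pi^n \mathcal{O}$, where $a \in \mathcal{O}$. This is because sets of this form generate the Haar $\sigma$-algebra on $\mathcal{M}$. Suppose $c_0 \in k \setminus \{0\}$ and let $m, n$ be positive integers. Then
$$\{x \in \mathcal{M} : v(x) = m,\ b(x) = c_0,\ T_v(x) \in \pi a + \pi^n \mathcal{O}\} = \frac{\pi^m}{c_0 + \pi a + \pi^n \mathcal{O}}.$$
It follows that this set equals $\frac{\pi^m}{c_0 + \pi a} + \pi^{m+n} \mathcal{O}$, which has measure $\#(k)^{1-m-n}$. Recall here that $\#(A)$ denotes the cardinality of the finite set $A$. It follows that
$$\mu(T_v^{-1}(\pi a + \pi^n \mathcal{O})) = \sum_{m \geq 1} \sum_{c_0 \in k^{\times}} \#(k)^{1-m-n} = \#(k)^{1-n} = \mu(\pi a + \pi^n \mathcal{O}),$$
as required. So, $T_v$ preserves Haar measure on $\mathcal{M}$.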
In the case where $K = \mathbb{Q}_p$ the map $T_v$ reduces to Schneider's original continued fraction map $T_p$, which motivates this whole investigation and is defined as follows. For $x \in p\mathbb{Z}_p$ define the map $T_p : p\mathbb{Z}_p \to p\mathbb{Z}_p$ by $T_p(x) = \frac{p^{v(x)}}{x} - b(x) = \frac{p^{a(x)}}{x} - b(x)$, where $v(x)$ is the $p$-adic valuation of $x$, $a(x) = v(x) \in \mathbb{N}$ and $b(x) \in \{1, 2, \dots, p - 1\}$. Then using the continued fraction algorithm for $x$ we get the expansion
$$x = \cfrac{p^{a_1}}{b_1 + \cfrac{p^{a_2}}{b_2 + \cfrac{p^{a_3}}{b_3 + \ddots}}}$$
where $b_n \in \{1, 2, \dots, p - 1\}$, $a_n \in \mathbb{N}$ for $n = 1, 2, \dots$.

We now define the measure-theoretic entropy. Let $(X, \mathcal{A}, m)$ be a probability space, where $X$ is a set, $\mathcal{A}$ is a $\sigma$-algebra of its subsets and $m$ is a probability measure.
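A partition of $(X, \mathcal{A}, m)$ is a collection $\alpha = \{A_i : i \in I\}$ of pairwise disjoint elements of $\mathcal{A}$ whose union is $X$. Here $I$ is a denumerable index set. The entropy of the partition $\alpha$ is $H(\alpha) = -\sum_{i \in I} m(A_i) \log m(A_i)$. For a measure-preserving transformation $T$, we write $T^{-1}\alpha = \{T^{-1}A_i : i \in I\}$. Let $\mathcal{A}' \subset \mathcal{A}$ be a sub-$\sigma$-algebra. Then we define the conditional entropy of $\alpha$ given $\mathcal{A}'$ to be
$$H(\alpha \mid \mathcal{A}') = -\int_X \sum_{i \in I} m(A_i \mid \mathcal{A}') \log m(A_i \mid \mathcal{A}')\, dm.$$
Here $m(A \mid \mathcal{A}')$ denotes the $m$-conditional probability of $A$ with respect to the $\sigma$-algebra $\mathcal{A}'$. See [22] for more details about conditional probability. The entropy of a measure-preserving transformation $T$ relative to a partition $\alpha$ is defined to be
$$h_m(T, \alpha) = \lim_{n \to \infty} \frac{1}{n} H\left(\bigvee_{i=0}^{n-1} T^{-i}\alpha\right),$$
where the limit always exists. The alternative formula for $h_m(T, \alpha)$, which is used for calculating entropy, is
$$h_m(T, \alpha) = H\left(\alpha \,\Big|\, \bigvee_{i=1}^{\infty} T^{-i}\alpha\right).$$
We define the measure-theoretic entropy of $T$ with respect to the measure $m$ to be $h_m(T) = \sup_{\alpha} h_m(T, \alpha)$. Here the supremum is taken over all finite or countable partitions $\alpha$ from $\mathcal{A}$ with $H(\alpha) < \infty$.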
Two measure-preserving transformations $(X_1, \beta_1, m_1, T_1)$ and $(X_2, \beta_2, m_2, T_2)$ are said to be isomorphic if there exist sets $M_1 \subseteq X_1$ and $M_2 \subseteq X_2$ with $m_1(M_1) = 1$ and $m_2(M_2) = 1$ such that $T_1(M_1) \subseteq M_1$ and $T_2(M_2) \subseteq M_2$, and such that there exists an invertible measure-preserving map $\phi : M_1 \to M_2$ satisfying $\phi(T_1(x)) = T_2(\phi(x))$ for all $x \in M_1$. The importance of measure-theoretic entropy is that two dynamical systems with different entropies cannot be isomorphic. For more on measure-theoretic entropy and isomorphism, see [31]. The following is our first result.
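Theorem 1.1 Let $\mathcal{B}$ denote the Haar $\sigma$-algebra on $\mathcal{M}$ and let $\mu$ denote Haar measure on $\mathcal{M}$. Then the measure-preserving transformation $(\mathcal{M}, \mathcal{B}, \mu, T_v)$ has entropy $\frac{\#(k)}{\#(k^{\times})} \log(\#(k))$.

The measure-preserving transformation $(p\mathbb{Z}_p, \mathcal{B}, \mu, T_p)$ is known to be ergodic [14]. Moreover, in [12] it was proved that $(p\mathbb{Z}_p, \mathcal{B}, \mu, T_p)$ is exact. We forgo the definition of exactness here, as we do not use the concept in this paper. The exactness of $(p\mathbb{Z}_p, \mathcal{B}, \mu, T_p)$ implies other weaker properties: exactness implies mixing, which implies weak-mixing, which implies ergodicity, all implications being strict.

Suppose $(Y, \alpha, l)$ is a probability space, and let $(X, \beta, m) = \prod_{-\infty}^{\infty} (Y, \alpha, l)$ be the bi-infinite product probability space. For the shift map $\tau(\{x_n\}) = \{x_{n+1}\}$, the measure-preserving transformation $(X, \beta, m, \tau)$ is called the Bernoulli process with state space $(Y, \alpha, l)$. Here $\{x_n\}$ is a bi-infinite sequence of elements of the set $Y$. Any measure-preserving transformation isomorphic to a Bernoulli process will be referred to as Bernoulli. The fundamental fact about Bernoulli processes, famously proved by D. Ornstein, is that Bernoulli processes with the same entropy are isomorphic [27].

To any measure-preserving transformation $(X, \beta, m, T_0)$ we can associate another, called its natural extension. Originally introduced by V. A. Rokhlin [24], the natural extension is defined as follows. Set
$$X_{T_0} = \{x = (x_0, x_1, x_2, \dots) : x_n = T_0(x_{n+1}),\ x_n \in X,\ n = 0, 1, 2, \dots\}$$
and let $T : X_{T_0} \to X_{T_0}$ be defined by $T((x_0, x_1, x_2, \dots)) = (T_0(x_0), x_0, x_1, \dots)$. The map $T$ is $1$-$1$ on $X_{T_0}$. If $T_0$ preserves a measure $m$, then we can define a measure $\overline{m}$ on $X_{T_0}$ by defining $\overline{m}$ on the cylinder sets $C(A_0, A_1, \dots, A_k) = \{x : x_0 \in A_0, x_1 \in A_1, \dots, x_k \in A_k\}$ by
$$\overline{m}(C(A_0, A_1, \dots, A_k)) = m(T_0^{-k}(A_0) \cap T_0^{-k+1}(A_1) \cap \dots \cap A_k)$$
for $k \geq 1$. One can check that the transformation $(X_{T_0}, \overline{\beta}, \overline{m}, T)$, where $\overline{\beta}$ is the $\sigma$-algebra generated by the cylinder sets, is measure-preserving as a consequence of the measure preservation of the transformation $(X, \beta, m, T_0)$. Our second theorem is the following.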

Theorem 1.2 Suppose $(\mathcal{M}, \mathcal{B}, \mu, T_v)$ is as in our first theorem. Then the dynamical system $(\mathcal{M}, \mathcal{B}, \mu, T_v)$ has a natural extension that is Bernoulli.
In the last two sections of this paper, this theorem is combined with subsequence pointwise ergodic theorems and the moving average ergodic theorem [4], respectively, to study the average behaviour of the convergents arising from the map $T_v$. These results in the special case $K = \mathbb{Q}_p$ already appear in [12]. Our two theorems above together tell us that, as a dynamical system, the isomorphism class to which $T_v$ belongs is determined solely by its residue class field. This is irrespective of the characteristic of the underlying field. For instance, for each rational prime $p$ the corresponding Schneider map has entropy $\frac{p}{p-1} \log(p)$, so we know these maps are mutually non-isomorphic. Each of them is, however, isomorphic to the analogue of the Schneider map on the field of formal power series with coefficient field the finite field of $p$ elements.
Henceforth, for a real number $y$ let $\{y\}$ denote the fractional part of $y$. The study of the properties of $(\mathcal{M}, \mathcal{B}, \mu, T_v)$ parallels that of the Gauss map $G$ defined on $[0, 1]$ by $G(x) = \{1/x\}$ for $x \neq 0$ and $G(0) = 0$. This map preserves the measure $\gamma$ defined for Lebesgue measurable $A \subseteq [0, 1]$ by $\gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{1 + x}$. Analogously to the Gauss map [13], the map which governs the regular continued fraction on the real numbers, the measure-preserving transformation $(p\mathbb{Z}_p, \mathcal{B}, \mu, T_p)$ via (4) gives rise to an integer recurrence relationship. This is as follows. We suppose $A_0 = 0$, $B_0 = 1$, $A_1 = p^{a_1}$ and $B_1 = b_1$.

Then set
$A_n = p^{a_n} A_{n-2} + b_n A_{n-1}$ and $B_n = p^{a_n} B_{n-2} + b_n B_{n-1}$ for $n \geq 2$. A simple inductive argument gives $A_{n-1} B_n - A_n B_{n-1} = (-1)^n p^{a_1 + \dots + a_n}$ for $n = 1, 2, \dots$.
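The recurrence above is easy to experiment with for rational inputs. The following Python sketch computes Schneider digits $(a_n, b_n)$ of a rational $x$ with $v_p(x) \geq 1$ and the corresponding integers $A_n, B_n$. It is an illustration under stated assumptions rather than the authors' implementation: the function names are ours, the starting values $A_0 = 0$, $B_0 = 1$ are encoded through the auxiliary pair $A_{-1} = 1$, $B_{-1} = 0$ (which yields $A_1 = p^{a_1}$, $B_1 = b_1$), and the iteration is capped by `max_steps` because the expansion of a rational number need not terminate.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational x."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def schneider_digits(x, p, max_steps=20):
    """Digits (a_n, b_n) of a rational x in pZ_p under T_p(x) = p^a/x - b."""
    x = Fraction(x)
    if x != 0 and vp(x, p) < 1:
        raise ValueError("x must lie in pZ_p")
    digits = []
    for _ in range(max_steps):
        if x == 0:                       # the expansion has terminated
            break
        a = vp(x, p)
        u = Fraction(p) ** a / x         # a p-adic unit
        # b is the representative in {1, ..., p-1} of u modulo p
        b = u.numerator * pow(u.denominator, -1, p) % p
        digits.append((a, b))
        x = u - b                        # T_p(x), again in pZ_p
    return digits

def convergents(digits, p):
    """Pairs (A_n, B_n), n >= 1, from the recurrence in the text."""
    A2, B2 = 1, 0    # auxiliary A_{-1}, B_{-1} (our convention)
    A1, B1 = 0, 1    # A_0, B_0
    out = []
    for a, b in digits:
        A, B = p**a * A2 + b * A1, p**a * B2 + b * B1
        out.append((A, B))
        A2, B2, A1, B1 = A1, B1, A, B
    return out
```

For example, with $p = 3$ and $x = 3/5$ the digits are $(1, 2), (1, 1)$ and the last convergent is $(A_2, B_2) = (3, 5)$, while $x = -3$ produces the periodic digit string $(1, 2), (1, 2), \dots$.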
Because $p$ does not divide $B_n$ we deduce that the integers $A_n$ and $B_n$ are coprime. The rationals $\left(\frac{A_n}{B_n}\right)_{n=1}^{\infty}$ are the convergents to $x$ in $p\mathbb{Z}_p$ arising from (5). Naturally, one of the first things one might try to do is explore the extent to which theorems true for continued fractions on the real numbers extend to the $p$-adic numbers. For the most part, one can extend the regular continued fraction expansion and its properties to the field of formal Laurent series over a finite field in a relatively trouble-free manner. This is primarily because the field of formal Laurent series over a finite field is a Euclidean domain. In the context of the $p$-adic numbers the direct analogue of the regular continued fraction is the Ruban continued fraction [25]. Here, however, there are problems. The $p$-adic numbers are not a Euclidean domain. It is possible to define a sequence of rationals analogous to the convergents of the regular continued fractions. Their convergence to the number they are supposed to represent is not assured, however. This problem can be circumvented using a system of weights. This is what leads to Schneider's continued fraction expansion, but it comes at a cost. Some partial success at recovering analogues of standard properties of the regular continued fraction for the real numbers is possible on the $p$-adic numbers. See, for instance, [1-3, 8, 11], where the issues of when a $p$-adic continued fraction is either finite or periodic are explored. One cannot, however, hope to have a theory as satisfactory or as useful as that offered by the regular continued fraction expansion. The situation is just more complex. For instance, unlike the sequence of convergents of the regular continued fraction expansion, the sequence $\left(\frac{A_n}{B_n}\right)_{n=1}^{\infty}$ does not always provide a sequence of best approximants to the $p$-adic number it approximates. Other solutions to this particular problem are available, though they do not use Schneider's map [20, 21].
All this said, as observed in [7], while not as versatile as the regular continued fraction, the Schneider continued fraction can be a powerful tool in a number of situations. It is sometimes very useful in delicate constructions on the $p$-adic numbers. In [7], for instance, it is used to construct numbers that distinguish between the Mahler and Koksma schemes of approximation to a specified degree. Specifically, for a $p$-adic number $\eta$ let $w(n)$ denote its Mahler function on $\mathbb{N}$, defined to be the supremum of all real numbers $w$ such that the inequality $0 < |P(\eta)|_p \leq H(P)^{-w-1}$ is satisfied by infinitely many polynomials $P$ over $\mathbb{Z}$ of degree at most $n$. Here $H(P)$ denotes the height of the polynomial $P$, defined to be the maximum of the absolute values of the coefficients of $P$. Analogously, to $\eta$ we can also associate the Koksma function $w^*(n)$, which is defined to be the supremum over all real numbers $w$ such that the inequality $0 < |\eta - \xi|_p \leq H(\xi)^{-w-1}$ is satisfied by infinitely many algebraic numbers $\xi$ of degree at most $n$. In this instance $H(\xi)$ denotes the height of the minimal polynomial defining $\xi$. The relationship between the numbers $w(n)$ and $w^*(n)$ is a complex and unresolved issue. Restricting to the case $n = 2$ some progress has been made, though even here this is not an easy matter. It is known that $w(2) \in [w^*(2), w^*(2) + 1]$. We also know that $w(2) = w^*(2)$ for almost all $\eta$ in $\mathbb{Q}_p$. Methods of diophantine approximation have been used to show there are $p$-adic numbers $\eta$ such that $w(2) = w^*(2) + \delta$ for each $\delta \in [0, 1)$. Constructing $\eta$ such that $w(2) = w^*(2) + 1$ has so far only proved possible using the Schneider continued fraction. The method has the additional advantage over diophantine approximation methods of being constructive. The details of this are to be found in [7]. See also [5, 6] for related applications.
Another interesting application of Schneider's continued fraction is to deciding the algebraic independence of a set of p-adic numbers. See [9,19] for details.
For background on the theory of regular continued fractions and its ergodic theory see for instance [13,15]. As is well known, if you restrict the Gauss map to the rational numbers you get the Euclidean algorithm. If you set p = 2 and restrict the Schneider map to the rational numbers what you get is the Binary Euclidean algorithm. This is another way of calculating the highest common factor of two integers, particularly well adapted to efficient implementation on binary machines. The algorithm was first published by Josef Stein [29] but is also attributed to Roland Silver and John Terzian in unpublished form [17]. The algorithm may, however, be much older. Knuth [17] cites a verbal description of the algorithm in the first-century A.D. Chinese text "Chiu Chang Suan Shu".
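For readers who have not met it, the binary Euclidean algorithm mentioned above can be sketched as follows. This is the standard textbook formulation of Stein's algorithm, using only shifts, parity tests and subtraction. It is not derived here from the Schneider map with $p = 2$, and the function names are ours. The ordinary Euclidean algorithm is included for comparison.

```python
def euclid_gcd(u, v):
    """Ordinary Euclidean algorithm, for comparison."""
    u, v = abs(u), abs(v)
    while v:
        u, v = v, u % v
    return u

def binary_gcd(u, v):
    """Binary (Stein) gcd: only shifts, parity tests and subtraction."""
    u, v = abs(u), abs(v)
    if u == 0:
        return v
    if v == 0:
        return u
    shift = 0
    while ((u | v) & 1) == 0:    # remove the common power of 2
        u >>= 1
        v >>= 1
        shift += 1
    while (u & 1) == 0:          # make u odd
        u >>= 1
    while v != 0:
        while (v & 1) == 0:      # factors of 2 in v are not common factors
            v >>= 1
        if u > v:                # keep u <= v so that v - u >= 0
            u, v = v, u
        v -= u                   # the difference of two odd numbers is even
    return u << shift            # restore the common power of 2
```

For example, `binary_gcd(48, 18)` returns `6`, in agreement with `euclid_gcd(48, 18)`.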

The Entropy of Schneider's Continued Fraction Map
In this section, we will prove the first result of the paper.
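Computing entropy directly from its definition can be complicated, so our main tool is the following theorem due to Ya. G. Sinai. The proof of the theorem and more information about entropy can be found in Chapter 4 of [31].

Theorem 2.1 Let $T$ be a measure-preserving transformation of the probability space $(X, \mathcal{A}, m)$ and let $\alpha$ be a countable partition with $H(\alpha) < \infty$ which is a strong generator for $T$, that is, $\bigvee_{i=0}^{\infty} T^{-i}\alpha = \mathcal{A}$ up to sets of measure zero. Then $h_m(T) = h_m(T, \alpha)$.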

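Let $j = (j_1, j_2, \dots)$ be a sequence of digits $j_i = (a_i, b_i) \in \mathbb{N} \times k^{\times}$. Now let $\Delta^{(0)} = \mathcal{M}$ and let
$$\Delta^{(1)}_j = \{x \in \mathcal{M} : (a_1(x), b_1(x)) = j_1\},$$
where $j_1$ is the first element of the sequence $j$. Next define $\Delta^{(2)}_j = \{x \in \Delta^{(1)}_j : (a_2(x), b_2(x)) = j_2\}$. Proceeding inductively, we get $\Delta^{(n)}_j = \{x \in \Delta^{(n-1)}_j : (a_n(x), b_n(x)) = j_n\}$. So, $\Delta^{(n)}_j$ is the set of all $x \in \mathcal{M}$ with continued fraction expansion starting with $j_1, j_2, \dots, j_n$. This means that the sets $\Delta^{(n)}_j$ are the atoms of the partition $\bigvee_{i=0}^{n-1} T_v^{-i}\alpha$, where $\alpha = \{\Delta^{(1)}_j\}$ is the partition of $\mathcal{M}$ according to the first digit.

To compute entropy, we first need to find the conditional information function
$$I\left(\alpha \,\Big|\, \bigvee_{i=1}^{n-1} T_v^{-i}\alpha\right)(x) = -\sum_{A \in \alpha} \chi_A(x) \log \mu\left(A \,\Big|\, \bigvee_{i=1}^{n-1} T_v^{-i}\alpha\right)(x).$$
Here, for a partition $\phi$, the symbol $\mu(A \mid \phi)$ denotes the $\mu$-conditional probability of $A$ with respect to the $\sigma$-algebra generated by the partition $\phi$. If $x \in \Delta^{(n)}_j$, then $\chi_{\Delta^{(1)}_j}(x) = 1$ and $\chi_A(x) = 0$ for every other $A \in \alpha$. So we get $I\left(\alpha \mid \bigvee_{i=1}^{n-1} T_v^{-i}\alpha\right)(x) = -\log \mu\left(\Delta^{(1)}_j \mid \bigvee_{i=1}^{n-1} T_v^{-i}\alpha\right)(x)$. The conditional probability is $\mu(A \mid \phi)(x) = \sum_{C \in \phi} \chi_C(x) \frac{\mu(A \cap C)}{\mu(C)}$. Let $C_1, C_2, \dots$ denote the atoms of $\phi = \bigvee_{i=1}^{n-1} T_v^{-i}\alpha$, with $C_1$ the atom containing $x$. Then we can see that $\chi_{C_1}(x) = 1$ and for the other atoms, where $i \geq 2$, we have $\chi_{C_i}(x) = 0$. Thus, we obtain $\mu(\Delta^{(1)}_j \mid \phi)(x) = \frac{\mu(\Delta^{(1)}_j \cap C_1)}{\mu(C_1)} = \frac{\mu(\Delta^{(n)}_j)}{\mu(C_1)}$.

A simple computation shows that $\mu(\Delta^{(n)}_j) = \frac{1}{\#(k)^N}$, where $N = a_1 + a_2 + \dots + a_n$. Since $T_v$ preserves $\mu$, the atom $C_1$ has measure $\#(k)^{-(a_2 + \dots + a_n)}$. Thus, we have $\mu(\Delta^{(1)}_j \mid \phi)(x) = \#(k)^{-a_1}$, and the conditional information function is $I(\alpha \mid \phi)(x) = a_1(x) \log(\#(k))$, independently of $n$. By (6), we see that the entropy of $T_v$ relative to the partition $\alpha$ is
$$h_{\mu}(T_v, \alpha) = \int_{\mathcal{M}} a_1(x) \log(\#(k))\, d\mu.$$
Notice that $a_1(x) = v(x)$ and we have
$$\int_{\mathcal{M}} v(x)\, d\mu = \sum_{n=1}^{\infty} n\, \frac{\#(k^{\times})}{\#(k)^n} = \frac{\#(k)}{\#(k^{\times})}.$$
We claim that $\alpha$ is a strong generator for $T_v$. This is because for almost every $x, y \in \mathcal{M}$, if $x \neq y$, the points $x$ and $y$ have distinct Schneider continued fraction expansions. This implies the partition $\alpha$ separates almost every pair of points. Hence, by Sinai's Theorem 2.1, the entropy of $T_v$ with respect to $\mu$ is
$$h_{\mu}(T_v) = \frac{\#(k)}{\#(k^{\times})} \log(\#(k)).$$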
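The closed form $\frac{\#(k)}{\#(k^{\times})} \log(\#(k))$ for the entropy can be sanity-checked numerically. The sketch below assumes, in line with the cylinder measure $\mu(\Delta^{(n)}_j) = \#(k)^{-N}$ used in this section, that each first-digit cylinder with $a_1 = a$ has measure $q^{-a}$, where $q = \#(k)$, and that there are $q - 1$ choices of $b_1$ for each $a$. It sums $-\mu(\Delta) \log \mu(\Delta)$ over these cylinders and compares the truncated sum with the closed form. The function names and the truncation parameter `terms` are ours.

```python
import math

def digit_entropy(q, terms=200):
    """Truncated sum of -mu(D) * log(mu(D)) over first-digit cylinders D = (a, b)."""
    total = 0.0
    for a in range(1, terms + 1):
        cyl = math.exp(-a * math.log(q))             # assumed cylinder measure q**(-a)
        total += (q - 1) * cyl * a * math.log(q)     # q - 1 choices of b for each a
    return total

def closed_form(q):
    """The value (q / (q - 1)) * log(q) stated for the entropy."""
    return q / (q - 1) * math.log(q)
```

Calling `digit_entropy(q)` and `closed_form(q)` for small prime powers such as `q = 2, 3, 4, 5` gives values that agree to within floating-point error.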

Proof of the Bernoulli Property
Let $P = (p_1, p_2, \dots)$ and $Q = (q_1, q_2, \dots)$ denote two $\mu$-measurable denumerable partitions of the same set $X$. Then $P$ and $Q$ are said to be $\varepsilon$-independent, and we write $P \perp^{\varepsilon} Q$, if
$$\sum_{i, j} |\mu(p_i \cap q_j) - \mu(p_i)\mu(q_j)| < \varepsilon.$$
A denumerable partition $P$ is called weak Bernoulli with respect to an invertible, measure-preserving transformation $T$ if for each $\varepsilon > 0$ there exists a positive constant $K = K(\varepsilon)$ such that for every $n \geq 0$ we have
$$\bigvee_{i=-n}^{0} T^{i} P \perp^{\varepsilon} \bigvee_{i=K}^{K+n} T^{i} P.$$
Note this is not the only way to formulate this property. As observed in [28], for a non-invertible transformation its natural extension is weak Bernoulli if there is a denumerable partition $P$ such that for each $\varepsilon > 0$ there exists $K = K(\varepsilon)$ such that for every $n \geq 0$ we have
$$\bigvee_{i=0}^{n} T^{-i} P \perp^{\varepsilon} \bigvee_{i=K+n}^{K+2n} T^{-i} P.$$
The isomorphism to a Bernoulli shift is then ensured by the following theorem, which was proved by Friedman and Ornstein, see [27].

Theorem 3.1 A weak Bernoulli (invertible) transformation is isomorphic to a Bernoulli shift with the same entropy.
We now complete the proof of our second theorem.

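Proof of Theorem 1.2 Let $\alpha$ be the partition of $\mathcal{M}$ by the first digit and, for a digit sequence $j = (j_1, j_2, \dots)$ with $j_i = (a_i, b_i)$, let $\Delta^{(n)}_j$ denote the atom of $\bigvee_{i=0}^{n-1} T_v^{-i}\alpha$ consisting of the points whose expansion starts with $j_1, \dots, j_n$. We use the following lemma.

Lemma For each Haar measurable $A \subseteq \mathcal{M}$ we have $\mu(\Delta^{(n)}_j \cap T_v^{-n}(A)) = \#(k)^{-N} \mu(A) = \mu(\Delta^{(n)}_j)\mu(A)$, where $N = a_1 + \dots + a_n$ (on $\Delta^{(n)}_j$).

Proof For $x \in \Delta^{(n)}_j$, suppose its $n$th convergent is $\frac{A_n}{B_n}$, defined by the recurrence relation (4). Using (4) and (5) one checks that
$$x = \frac{A_n + A_{n-1} T_v^n(x)}{B_n + B_{n-1} T_v^n(x)}, \quad \text{and hence} \quad x - \frac{A_n}{B_n} = \frac{(-1)^n \pi^N}{B_n} \cdot \frac{T_v^n(x)}{T_v^n(x) B_{n-1} + B_n}.$$
As $B_n$ is in $\mathcal{O}^{\times}$ and multiplication by $\pi^N$ scales Haar measure by $|\pi|^N = \#(k)^{-N}$, this lemma is proved if we show that the map $t : \mathcal{M} \to \mathcal{M}$ defined by $t(x) = \frac{x}{x B_{n-1} + B_n}$ preserves Haar measure. Fix $L \in \mathbb{N}$ and $y \in k[\pi]$ (the ring of polynomials in $\pi$ over the residue class ring). One checks readily that $t$ maps the coset $\pi y + \pi^L \mathcal{O}$ bijectively to the coset $t(\pi y) + \pi^L \mathcal{O}$. Cosets of this type form a basis for the open sets of $\mathcal{M}$ and have the same measure, so their measure is preserved. Hence, our lemma is proved.

We therefore have $\mu(C \cap D) = \mu(C)\mu(D)$ for all $C \in \bigvee_{i=0}^{n} T_v^{-i}\alpha$ and $D \in \bigvee_{i=K+n}^{K+2n} T_v^{-i}\alpha$, for every $K \geq 1$ and $n \geq 0$. Thus, the generator $\alpha$ for $T_v$ is weak Bernoulli, which by the above theorem means that the natural extension of $T_v$ is isomorphic to a Bernoulli shift with the entropy $\frac{\#(k)}{\#(k^{\times})} \log(\#(k))$.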
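In the case $K = \mathbb{Q}_p$ the independence of successive digits can be checked exactly at finite precision. The sketch below is an illustration under stated assumptions, not part of the proof: it represents each class of $p\mathbb{Z}_p$ modulo $p^L$ by an integer, weights all classes equally (which corresponds to Haar measure normalised so that $\mu(p\mathbb{Z}_p) = 1$), and counts how many classes have a prescribed pair of first two digits. Digits that cannot be determined at precision $L$ are discarded. The function names are ours.

```python
from collections import Counter

def first_two_digits(x, p, L):
    """First two Schneider digits of the class of x modulo p**L,
    or None when the precision L does not determine them."""
    digits = []
    prec = L                         # x is known modulo p**prec
    for _ in range(2):
        x %= p ** prec
        if x == 0:                   # valuation not determined at this precision
            return None
        a = 0
        while x % p == 0:
            x //= p
            a += 1
        prec -= a                    # the unit part is known modulo p**prec
        inv = pow(x, -1, p ** prec)  # p**a / x modulo p**prec
        b = inv % p                  # digit b in {1, ..., p-1}
        digits.append((a, b))
        x = inv - b                  # T_p(x), known modulo p**prec
    return tuple(digits)

def cylinder_counts(p, L):
    """Number of classes of pZ_p modulo p**L having each pair of first two digits."""
    counts = Counter()
    for t in range(p ** (L - 1)):
        d = first_two_digits(p * t, p, L)
        if d is not None:
            counts[d] += 1
    return counts
```

With $p = 3$ and $L = 5$, every determined pair $((a_1, b_1), (a_2, b_2))$ occurs exactly $3^{4 - a_1 - a_2}$ times among the $3^4$ classes, which matches the product $3^{-a_1} \cdot 3^{-a_2}$ of the two first-digit cylinder measures.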

Application of the Pointwise Subsequence Ergodic Theorems
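Recall the elementary identities $\sum_{n=1}^{\infty} n x^n = \frac{x}{(1-x)^2}$ and $\sum_{n=1}^{\infty} n^2 x^n = \frac{x(1+x)}{(1-x)^3}$ for $|x| < 1$. Also, as is easily verified, $\mu(\{x \in \mathcal{M} : a_1(x) = n\}) = \frac{\#(k^{\times})}{\#(k)^n}$ for $n = 1, 2, \dots$. From this, we get $\int_{\mathcal{M}} a_1(x)\, d\mu = \frac{\#(k)}{\#(k^{\times})}$ and $\int_{\mathcal{M}} a_1(x)^2\, d\mu = \frac{\#(k)(\#(k) + 1)}{\#(k^{\times})^2}$. We now describe the elements of subsequence ergodic theory, which we use to study convergents.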
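A sequence of integers $(a_n)_{n=1}^{\infty}$ is called $L^p$-good universal if for each dynamical system $(X, \mathcal{B}, \mu, T)$ and $f \in L^p(X, \mathcal{B}, \mu)$, the limit $\overline{f}(x) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(T^{a_n} x)$ exists $\mu$ almost everywhere.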
Recall that we say a sequence of real numbers $(x_n)_{n=1}^{\infty}$ is uniformly distributed modulo one if for each interval $I \subseteq [0, 1)$, denoting its length by $|I|$, we have $\lim_{N \to \infty} \frac{1}{N} \#\{1 \leq n \leq N : \{x_n\} \in I\} = |I|$. See [18] for further background. The reference [12] contains an extensive list of sequences of natural numbers that are $L^p$-good universal for all $p > 1$. Some are $L^1$-good universal as well. All the examples mentioned in the reference have the additional property that $(\{k_n \psi\})_{n \geq 1}$ is uniformly distributed modulo one for each irrational number $\psi$. We will call a sequence of natural numbers $(k_n)_{n \geq 1}$ that is both $L^p$-good universal and such that $(\{k_n \psi\})_{n \geq 1}$ is uniformly distributed modulo one for each irrational $\psi$ a $p$-good sequence. In [12], the following theorem is proved.

Theorem 4.1 If $(k_n)_{n \geq 1}$ is $p$-good for any $p > 1$ and the dynamical system $(X, \beta, \mu, T)$ is weak-mixing, then for each $f \in L^p(X, \beta, \mu)$ we have $\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(T^{k_n} x) = \int_X f\, d\mu$ $\mu$ almost everywhere.
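The uniform distribution condition in the definition of a $p$-good sequence can be explored numerically. The following sketch estimates the proportion of $n \leq N$ for which $\{k_n \psi\}$ falls in a given interval $[lo, hi)$. It is only a floating-point experiment and proves nothing. The function name `frequency` is ours, and the choice $k_n = n$, $\psi = \sqrt{2}$ in the usage note is an illustrative assumption, not an example taken from [12].

```python
import math

def frequency(k, psi, lo, hi, N):
    """Proportion of n in 1..N with the fractional part of k(n) * psi in [lo, hi)."""
    hits = 0
    for n in range(1, N + 1):
        y = k(n) * psi
        frac = y - math.floor(y)     # fractional part {k(n) * psi}
        if lo <= frac < hi:
            hits += 1
    return hits / N
```

For instance, `frequency(lambda n: n, math.sqrt(2), 0.2, 0.5, 10000)` should be close to the interval length $0.3$, as Weyl's equidistribution theorem for $(n\psi)_{n \geq 1}$ predicts.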
Note that transformations that have natural extensions that are Bernoulli are also weak-mixing [31], so Theorem 4.1 applies to $(\mathcal{M}, \mathcal{B}, \mu, T_v)$. Theorem 4.1 has a number of applications, beginning with the following result.

Theorem 4.2 Suppose $(k_n)_{n \geq 1}$ is $p$-good and suppose $F : \mathbb{R}_{\geq 0} \to \mathbb{R}$ is a continuous increasing function with $\int_{\mathcal{M}} |F(a_1(x))|^p\, d\mu < \infty$.
For each $n \in \mathbb{N}$ and arbitrary non-negative real numbers $d_1, \dots, d_n$, we define $M_{F,n}(d_1, \dots, d_n) = F^{-1}\left(\frac{F(d_1) + \dots + F(d_n)}{n}\right)$.
Then we have

In the case $k_n = n$ $(n = 1, 2, \dots)$ and $K = \mathbb{Q}_p$, the first part of this result is from [14]. Unlike the natural numbers, however, most examples of $p$-good sequences are of zero density. We also have the following additional consequences.

Theorem 4.5 For any p-good
almost everywhere with respect to Haar measure on M. Proof

Application of the Moving Average Pointwise Ergodic Theorem
We begin by introducing some notation. Let $Z$ be a collection of points in $\mathbb{Z} \times \mathbb{N}$ and let $Z^h = \{(n, k) : (n, k) \in Z \text{ and } k \geq h\}$, $Z^h_{\alpha} = \{(z, s) \in \mathbb{Z}^2 : |z - y| \leq \alpha(s - r) \text{ for some } (y, r) \in Z^h\}$ and $Z^h_{\alpha}(\lambda) = \{k : (k, \lambda) \in Z^h_{\alpha}\}$ for $\lambda = 1, 2, \dots$. Geometrically we can think of $Z^1_{\alpha}$ as the lattice points contained in the union of all solid cones with aperture $\alpha$ and vertex contained in $Z^1 = Z$. We say a sequence of pairs of natural numbers $(n_l, k_l)_{l=1}^{\infty}$ is Stolz if there exists a collection of points $Z$ in $\mathbb{Z} \times \mathbb{N}$, and a function $h = h(t)$ tending to infinity with $t$, such that $(n_l, k_l)_{l=t}^{\infty} \in Z^{h(t)}$, and there exist $h_0$, $\alpha_0$ and $A > 0$ such that for all integers $\lambda > 0$ we have $|Z^{h_0}_{\alpha_0}(\lambda)| \leq A\lambda$. This technical condition is interesting because of the following theorem from [4].

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.