A Proof of the Weierstraß Gap Theorem not Using the Riemann–Roch Formula

Usually, the Weierstraß gap theorem is derived as a straightforward corollary of the Riemann–Roch theorem. Our main objective in this article is to prove the Weierstraß gap theorem by following an alternative approach based on “first principles”, which does not use the Riemann–Roch formula. Having mostly applications in connection with modular functions in mind, we describe our approach for the case when the given compact Riemann surface is associated with the modular curve X_0(N).


Main Objective
Various topical areas in the theory of partitions, such as congruences for partition numbers, are connected to modular functions for congruence subgroups of SL_2(Z) such as Γ_0(N); see Sect. 15 for definitions. Such functions live on compact Riemann surfaces, for instance, on X_0(N) for Γ_0(N). Number theoretic aspects then relate to properties of certain subalgebras formed by these functions. In cases where the genus of such a surface is zero, for instance, for X_0(5) and X_0(7), these algebras have a relatively simple structure. For positive genus g, for example, in the case of X_0(11), this changes. One explanation is this: when considering sets of meromorphic functions with poles only at one point p, the Weierstraß gap theorem says that one can obtain functions with all possible pole orders at p with exactly g exceptions.

Theorem 1.1 (Weierstraß gap theorem; e.g., Sect. III.5.3 in [6]). Let X be a compact Riemann surface having genus g ≥ 1. Then, for each p ∈ X, there are precisely g integers n_j = n_j(p) with
1 = n_1 < ⋯ < n_g ≤ 2g − 1 (1.1)
such that there is no meromorphic function on X that is holomorphic on X∖{p} and has a pole of order n_j at p.

Introduction
To exemplify the use of Weierstraß's gap theorem, we choose an example related to the classical Ramanujan congruences, which are discussed in further detail in [14]. Following the definition given in Sect. 15 …
To keep this article as self-contained as possible, we list basic definitions and properties of modular functions in a separate appendix, Sect. 15.
In Sect. 12, we prove Theorem 12.2, a version of the gap Theorem 1.1 for the case X = X_0(N) and with the only pole put at ∞, utilizing only first principles and avoiding the use of the Riemann–Roch formula. In particular, we avoid the use of any differentials. In addition, our approach provides new algebraic insight by combining module presentations of modular function algebras, integral bases, Puiseux series, and discriminants. For example, with our approach, proving the bound n_g ≤ 2g − 1 stated in the Weierstraß gap theorem reduces to an elementary combinatorial argument; see Sect. 12. Another by-product of our proof of the Weierstraß gap Theorem 12.2 is a natural explanation of the genus g = 0 case as a consequence of the reduction to an integral basis.
In view of various constructive aspects involved, we are planning to exploit the algorithmic content of our approach for computer algebra applications, for instance, for the effective computation of suitable module bases for modular functions. As already mentioned, some ideas we used trace back to the celebrated work [3] by Dedekind and Weber; see [2] for an English translation together with an excellent introduction by John Stillwell.
Finally, we remark that the history of Weierstraß's gap theorem and related topics such as Weierstraß points presents something of a challenge. The historical account [4] by Andrea Del Centina describes the scientific evolution of the gap theorem up to the 1970s. Concerning its beginnings, Del Centina says, "The history of Weierstraß points is not marked by a precise starting date because it is not clear when Weierstraß stated and proved his Lückensatz (or "gap" theorem), but one can argue that probably it was in the early 1860s."
The rest of our article is structured as follows. In Sect. 3, we introduce order-complete bases of modules over a polynomial ring C[t] to describe modular function algebras. In Sect. 4, we describe how such bases can be stepwise modified to obtain an integral basis; i.e., an order-complete basis for the full algebra M^∞(N). Under particular circumstances, one can keep track of the total number of such steps, which then gives a proof of the Weierstraß gap Theorem 12.2. To do this bookkeeping, one can use "order-reduction" polynomials discussed in Sect. 5. In Sect. 6, we explain how to obtain order-reduction polynomials computationally; Sect. 7 deals with important special cases. In Sects. 8 and 9, we derive important ingredients of our proof of Theorem 12.2; for example, a factorization property of the discriminant polynomial in Proposition 9.3. In Sects. 10 and 11, we relate discriminant polynomials to order-reduction polynomials associated with integral bases. In Sect. 12, we use these results to prove the Weierstraß gap theorem in the version of Theorem 12.2. To prove the bound 2g − 1 for the size of the maximal gap, our approach allows a purely combinatorial argument (a gap property of monoids) which we describe in Sect. 13. At various places, we require functions to have the separation property, as defined in Sect. 9. In Sect. 14, we prove the existence of such functions by giving an explicit construction.
The first appendix, Sect. 15, gives a short account of the basic facts on modular functions that we need; the second appendix, Sect. 16, recalls some fundamental facts about meromorphic functions on Riemann surfaces.

Modular Function Algebras as C[t]-Modules
We already used (implicitly) the convention that if a meromorphic function f has a pole, then the pole order is defined as the negative order at this point, that is If f ∈ M ∞ (N ), we simplify notation using the convention for the pole order at infinity: Slightly more generally, any tuple (1, β 1 , . . . , β n−1 ) which is a reordering of an order-complete tuple (1, b 1 , . . . , b n−1 ), that is, is also called order-complete.
Notice that pord f = 4. In [13], it is shown that the subalgebra C[1/z_11, f] of M^∞(11), which is generated by all bivariate polynomials in 1/z_11 and f, has a representation as a C[1/z_11]-module as follows:
In view of these examples, we note that in contrast to (2.10), C[1/z_11, f] ≠ M^∞(11). For instance, it is obvious that this subalgebra does not contain any g ∈ M^∞(11) with pord g = 3. Nevertheless, both function tuples form a basis of the corresponding C[t]-module they generate, where t := 1/z_11. Namely, since the generators have pairwise different pole orders modulo pord t = 5, each element contained in these modules can be represented as a unique linear combination of the module generators with coefficients being polynomials in t. This motivates the following definition.
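The pole-order argument above can be checked mechanically. The following Python sketch assumes only what is stated in the text, namely pord t = 5 and pord f = 4, and lists which pole orders occur in the C[t]-module generated by (1, f, …, f^4). The function name `module_pole_orders` and the cut-off `bound` are illustrative choices, not notation from the article.

```python
def module_pole_orders(pord_t, basis_orders, bound):
    """Pole orders <= bound of non-zero elements of a C[t]-module whose basis
    elements have pairwise distinct pole orders modulo pord_t.

    Under that assumption the leading terms cannot cancel, so the pole order of
    sum_i p_i(t) * b_i is the maximum of pord(b_i) + pord_t * deg(p_i).
    """
    residues = {b % pord_t for b in basis_orders}
    if len(residues) != len(basis_orders):
        raise ValueError("basis pole orders must be pairwise distinct modulo pord_t")
    orders = set()
    for b in basis_orders:
        k = b
        while k <= bound:
            orders.add(k)
            k += pord_t
    return sorted(orders)


# Assumed data from the text: t = 1/z_11 with pord t = 5, and pord f = 4.
basis_orders = [4 * i for i in range(5)]      # pord of 1, f, f^2, f^3, f^4
orders = module_pole_orders(5, basis_orders, 20)
missing = [k for k in range(21) if k not in orders]
print(missing)
```

Pole order 3 is among the values that are not attained, in line with the remark that the subalgebra contains no g with pord g = 3.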

Integral Bases
In Example 3.3, we saw that (1, f, …, f^4) is an order-complete basis of C[1/z_11, f], which is a proper subalgebra of M^∞(11). In this section, we shall see how such an order-complete basis can be modified stepwise to obtain an order-complete basis for the full algebra M^∞(11).
The motivation for this terminology comes from

Then, f satisfies an algebraic relation
Moreover, if f ∈ M ∞ (N ), then there exists an algebraic relation with n = pord t.
A crucial observation for the process to obtain an integral basis for M ∞ (N ) from an order-complete basis is stated in the following.
be an order-complete basis of the C[t]-module: Then, for any f ∈ M^∞(N), there exist polynomials q(x) and
Proof. For j ∈ Z_{≥0}, consider the sets For each j ≥ 0, choose a non-zero g_j ∈ G_j, such that pord g_j is minimal amongst all the elements in G_j. By construction, using the convention nZ_{≥0} := {nk : k ∈ Z_{≥0}}, we have for all j ≥ 0: Obviously, S is an additive submonoid of (Z_{≥0}, +). Moreover, Z_{≥0}\S has only finitely many elements; let k be the maximal element in this set. Then there exist c_j ∈ C, not all zero, such that This is owing to the fact that equating the coefficients of non-positive powers in the q-expansions of both sides (which are functions in M^∞(N)) gives k + 1 equations in k + 2 variables c_j. Hence, the dimension of the C-vector space G, which is generated by all the g_j, j ≥ 0, is bounded by k + 1. Using g_j := t^j f − h_j with h_j ∈ M, (4.2) rewrites into the form: The linear combination of the h_j is in M; hence, this gives the desired relation for
If M ≠ M^∞(N), then there exist c_j ∈ C, not all zero, and α in C, such that By division with remainder, there are polynomials q_j(x) ∈ C[x] and c_j ∈ C, such that p_j(x) = (x − α)q_j(x) + c_j, j = 0, …, n − 1. Rewriting the representation of g and noting that c_i ≠ 0 proves the first part of the statement on h_α. To prove (4.4), consider Let k be the index for which pord b_k becomes maximal with c_k ≠ 0. Recalling pord b_j ≡ j (mod n), j = 1, …, n, proves pord b_k ≥ k + n. Otherwise, pord b_k = k, which owing to the choice of k would imply pord b_j = j for all j = 1, …, n, and the given order-complete basis would be integral. This proves (4.4).
of b k by h α is called a pole-order-reduction step associated with α ∈ C.
We summarize in the form of (ii) If (1, β′_1, …, β′_{n−1}) is any other integral basis, that is,
Proof. The proof of part (i) is an immediate consequence of Corollary 4.4. Namely, owing to (4.4), each step reduces the pole order of one of the basis elements by n. This guarantees termination in finitely many steps. To prove (ii), without loss of generality, we can assume that pord β′_j ≡ pord β_j ≡ j (mod n) for all j. Suppose pord β′_j ≠ pord β_j for some j ∈ {1, …, n − 1}, i.e., pord β′_j = pord β_j + kn with k ≥ 1. However, this implies that β_j ∉ ⟨1, β′_1, …, β′_{n−1}⟩_{C[t]}, because then no element in this module can have the same pole order as β_j, a contradiction.
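Since each step lowers one pole order by exactly n, the number of steps is determined once the final pole orders are known. The Python sketch below does this bookkeeping for the running example n = 5 with starting basis (1, f, …, f^4) and pord f = 4. It assumes that the only gap of M^∞(11) is 1, which is what Theorem 1.1 gives for genus 1, and that an integral basis realizes the smallest non-gap in each residue class modulo n. The function names are hypothetical.

```python
def minimal_non_gaps(n, gaps):
    """Smallest non-negative integer in each residue class mod n that is not a gap."""
    gap_set = set(gaps)
    minimal = {}
    for j in range(n):
        k = j
        while k in gap_set:
            k += n
        minimal[j] = k
    return minimal


def reduction_steps(n, start_orders, gaps):
    """Number of pole-order-reduction steps per basis element, assuming every
    step lowers exactly one pole order by n until the minimal non-gap is reached."""
    targets = minimal_non_gaps(n, gaps)
    steps = {}
    for order in start_orders:
        target = targets[order % n]
        if order < target:
            raise ValueError("start order lies below the smallest non-gap in its class")
        steps[order] = (order - target) // n
    return steps


# Assumed example: pole orders of f, f^2, f^3, f^4 with pord f = 4 and n = 5,
# and gap set {1} (genus 1).
steps = reduction_steps(5, [4, 8, 12, 16], gaps=[1])
total = sum(steps.values())
print(steps, total)
```

Under these assumptions, f needs no reduction, f^2 needs one step, and f^3 and f^4 need two steps each, so five steps in total lead to pole orders 4, 3, 2, 6.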

Order-Reduction Polynomials
It was shown in the previous section that by applying a procedure using finitely many steps, any order-complete basis of a subalgebra of M^∞(N) can be extended to an integral basis of M^∞(N). Moreover, by (4.5), the pole orders of the integral basis functions are uniquely determined. It turns out that under particular circumstances, one can keep track of the number of order-reduction steps, which then gives a proof of the Weierstraß gap Theorem 12.2. To do this bookkeeping, one can use "order-reduction" polynomials. To our knowledge, such polynomials were first used by Dedekind and Weber [3]; see [2] for Stillwell's translation into English.
where U ⊆ X 0 (N ) and V ⊆ C are open sets, such that Hence, for v ∈ V , the evaluations have to be interpreted in the sense of (5.1), i.e., interpreting Depending on the context, we will freely move between considering t as a function on H, resp.Ĥ, and its induced version t * : X 0 (N ) →Ĉ.
Using the terminology explained in the Appendix Sect. 16, we assume that v_0 ∈ C is not a branch point of t^*; in short, v_0 ∉ BranchPts(t^*). In this case, there are n pairwise distinct points x_j = [τ_j]_N ∈ X_0(N) with τ_j ∈ H, such that In addition, there exists a neighborhood V of v_0 and neighborhoods U_j of x_j, such that as a disjoint union of open sets, and such that for j = 1, …, n, the restricted functions are bi-holomorphic. Let Taking the square of the determinant guarantees that the expression on the right side is symmetric with respect to any permutation of T_1, …, T_n. Consequently, D_t(1, b_1, …, b_{n−1}) is a holomorphic function on V. Carrying out the same construction on neighborhoods V for all v_0 ∈ C\BranchPts(t^*), and gluing the resulting functions gives a global holomorphic function: Using the same arguments as in the proof of Theorem 8.2 in [7], this function can be extended to a meromorphic function: with ∞ as its only pole. Classical complex analysis tells us that M(Ĉ) = C(z), i.e., the meromorphic functions on Ĉ are exactly the rational functions with coefficients in C. Hence, we have the following.
Example 5.4. Taking (11) and where F j ∈ M ∞ (11) are as in Example 2.4, one obtains Example 5.5. Taking t and the b j as in Example 5.4, one obtains Remark 5.6. How such polynomials are computed is explained in Sect. 6.
In Corollary 4.4, we proved that if M ≠ M^∞(N), then there exist c_j ∈ C, not all zero, and v_0 in C, such that Recall that we denoted the n pairwise distinct preimages of v_0 as follows: As a necessary condition for the existence of c_j ∈ C not all zero, the determinant of the corresponding linear system has to be zero. In view of . Above, we used the fact that the definition for This means, the case when v_0 ∈ C is a branch point of t^* is also covered by the same determinant condition: However, if v_0 ∈ C is a branch point, this condition is automatically satisfied, because then at least two rows, with indices i ≠ j, are equal. Summarizing, this gives

an order-complete basis of the C[t]-module
for c j ∈ C, not all zero. Then If v 0 is a branch point of t * , the condition (5.7) is automatically satisfied.

How to Compute Order-Reduction Polynomials
Next, we explain how to compute the order-reduction polynomials in (5.4) and (5.5).
To this end, it will be convenient to introduce the following notation: that is, Returning to the setting (5.2), we again assume that v 0 ∈ C is not a branch point of t * . This means that there exists pairwise distinct as a disjoint union of open sets, and such that the restricted functions For each j = 1, . . . , n and v ∈ V our goal, achieved in Lemma 6.2(ii), is to determine expressions for q j (v) := q 2πiτ (j) , where τ (j) is close to τ j , such that For q = e 2πiτ with τ ∈ H, we havẽ . Here, we assume that the first coefficient in this q-expansion of t is 1. Now, if , To fix a branch of the nth root, we choose the preimage τ n and recall that By choosing a neighborhood of τ n , we fix a branch of the nth root of v ∈ C close to v 0 : where τ (n) is close to τ n and determined as in (6.3).
In addition, let W be such that U (W (q)) = W (U (q)) = q, and define ζ n := e 2πi n . After this preparation, in view of (6.4) we can put things together as follows. Lemma 6.2. In the given setting, for j = 1, . . . , n and v ∈ C close to v 0 , let where n √ v is defined as in (6.4). Then, for j = 1, . . . , n and v ∈ C close to v 0 : (6.3), and where the values q j (v) are pairwise distinct for j = 1, . . . , n.
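An inverse series W with U(W(q)) = W(U(q)) = q can be computed coefficient by coefficient when U(q) = q + u_2 q^2 + ⋯. The following Python sketch shows one such reversion with exact rational arithmetic. The sample series U = q + q^2 and the helper names are made up for illustration; how U is obtained from the q-expansion of t is not reproduced here.

```python
from fractions import Fraction


def multiply(a, b, order):
    """Product of two power series (coefficient lists), truncated after q^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out


def compose(u, w, order):
    """Coefficients of u(w(q)) up to q^order; requires w[0] == 0."""
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order      # w^0
    for coeff in u[: order + 1]:
        for k in range(order + 1):
            result[k] += coeff * power[k]
        power = multiply(power, w, order)
    return result


def revert(u, order):
    """Series w with u(w(q)) = q modulo q^(order+1), for u = q + u_2 q^2 + ..."""
    if u[0] != 0 or u[1] != 1:
        raise ValueError("expected u = q + higher-order terms")
    u = [Fraction(c) for c in u]
    w = [Fraction(0), Fraction(1)] + [Fraction(0)] * (order - 1)
    for k in range(2, order + 1):
        # w[k] is still 0 here, so the q^k coefficient of u(w) only involves
        # w[1..k-1]; choosing w[k] as its negative cancels that coefficient.
        w[k] = -compose(u, w, k)[k]
    return w


U = [0, 1, 1]                 # hypothetical example: U(q) = q + q^2
W = revert(U, 5)
print(W)
```

For this sample U, the computed W agrees with the expansion of (−1 + √(1 + 4q))/2, and composing in either order returns q up to the truncation order.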
, because by part (i) with i = 1, . . . , n − 1 and j = 1, . . . , n: can be represented as a Laurent series in powers of 1/v 1/n : · · ,(6.5) with coefficients α i,j , β i,j , etc., in C, and under the assumption that the first Laurent series coefficient β (5.3) must be a polynomial in v. Consequently, we can compute it by taking suitable truncated versions of the expansions (6.5). Remark 6.3. This is how we computed the order-reduction polynomials in (5.4) and (5.5).

Discriminant Polynomials
Important special cases of order-reduction polynomials are produced by order-complete module bases of C[t, f] of the form as in Proposition 3.5.
is called the discriminant polynomial for the order-complete basis (1, f, . . . , f n−1 ) of the C[t]-module: The discriminant polynomial factors as the square of a Vandermonde determinant. Now, invoking (6.5) with b i = f , and thus, pord b i = pord f , gives Summarizing, we have the following.
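The Vandermonde factorization mentioned above can be illustrated numerically. In the sketch below, the n values stand in for the values of f at the n preimages of a point v; here they are arbitrary rational numbers chosen for the example, and the function names are hypothetical. The code compares the squared determinant of the matrix with rows (f^i evaluated at the n points) against the product of squared differences.

```python
from fractions import Fraction
from itertools import combinations


def determinant(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def squared_vandermonde(values):
    """Square of det(values[j] ** i), rows i = 0..n-1, columns j = 0..n-1."""
    n = len(values)
    rows = [[v ** i for v in values] for i in range(n)]
    return determinant(rows) ** 2


def product_of_squared_differences(values):
    result = Fraction(1)
    for x, y in combinations(values, 2):
        result *= (Fraction(x) - Fraction(y)) ** 2
    return result


sample = [2, 5, 7, 11]        # made-up stand-ins for f at the n preimages
print(squared_vandermonde(sample) == product_of_squared_differences(sample))
```

Because the determinant is squared, the result does not depend on the order in which the values are listed, and it vanishes exactly when two of the values coincide.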

Reduction Steps and Order-Reduction Polynomials
In Sect. 4, we described how order-complete bases can be transformed into integral bases of M ∞ (N ) by a finite sequence of pole-order-reduction steps.
In this section, we establish a link between pole-order-reduction steps and order-reduction polynomials.
To this end, we consider again our standard situation: By Corollary 4.4, when M ≠ M^∞(N), there exist c_j ∈ C, not all zero, and α in C, such that In particular, there exists a k ∈ {1, …, n − 1}, such that Proposition 8.1. With regard to order-reduction polynomials, this setting is reflected by Proof. After filling the right side of (8 . . , n, the proof is a straightforward consequence of determinant calculus.
In other words, a pole-order-reduction step associated with α ∈ C: from one order-complete basis to another corresponds to factoring the order-reduction polynomial as In the situation of Example 5.5

Local Puiseux Expansions
By considering local expansions at finitely many points [τ j ] N ∈ X 0 (N ) for τ j ∈Ĥ = H ∪ Q ∪ {∞}, in this section, we derive important ingredients for our proof of Theorem 12.2. To this end, we consider charts ϕ τ0 : if τ 0 ∈ H is an elliptic point (cf. (9.5)), or according to (15.3) by furthermore, the periods h(τ 0 ) equal either 2 or 3. We note explicitly that all these charts are centered at 0, that is, Throughout this section, again t ∈ M ∞ (N ) with n := pord t ≥ 1. Now, we reconsider the setting in Sect. 5 by dropping the assumption that v 0 ∈ C is not a branch point of t * . This means, we allow ≤ n pairwise distinct points There exists a neighborhood V 0 of v 0 and neighborhoods U j of the x j , such that as a disjoint union, and for k = 1, . . . , k j , the restricted functions Again, using (6.1), one has where B j (z) := z a j,0 + a j,1 z Now, by inverting the relation (9.8) and using the Puiseux series, the situation of (9.6) is reflected as follows: for each v ∈ V , there is for fixed (j, k), j = 1, . . . , k j and k ∈ {1, . . . , k j }, a uniquely determined τ = τ (j, k) ∈ U j,k , such that For such pairs τ = τ (j, k) and v, one has As in Sect. 5, one works with a fixed branch of the k j th root; moreover, we note that as a consequence of the definition of A j (z), A j,1 = 0 for all j = 1, . . . , .
To connect to discriminant polynomials, let f ∈ M ∞ (N ) be such that gcd(n, pord f ) = 1. Moreover, without loss of generality, for j = 1, . . . , , we can assume that the neighborhoods U j are chosen, such that the following expansions exist for all [τ ] N ∈ U j : (9.10) Invoking (9.9), one obtains k , the statement follows from applying (9.9) to (9.10): To adapt to the refined setting (9.6), we extend our T j -notation to the additional restricted functions: Finally, we use the information we obtained in terms of the local holomorphic Puiseux series expansion to represent the discriminant polynomial at v 0 ∈ C.
Namely, for all v ∈ V : The last equality is by (9.11); it gives rise to the following.
with multiplicities k 1 , . . . , k , respectively, i.e., k 1 + · · · + k = n. , such that for all v ∈ C : Proof. The statement follows from the last equality of the derivation preceding this proposition. Namely, under the condition (9.12), the second product on the right side of this equality is non-zero for v = v 0 . Condition (9.13) means that b j,1 = 0 in (9.10), thus c j,1 = 0 for j = 1, . . . , in (9.11). Consequently, from the first product in the expression under consideration, one can pull out v − v 0 as follows: Recalling that k 1 + · · · + k l = n completes the proof for all v ∈ V , where V is an open subset of a neighborhood V 0 of v 0 , such that V does not contain v 0 . However, invoking the identity theorem from complex analysis, the statement extends to all v ∈ C.
Properties (9.12) and (9.13) are sufficiently important to deserve a Definition 9.4 (separation property). Let t ∈ M ∞ (N ) with n := pord t ≥ 1, let f ∈ M ∞ (N ) be such that gcd(n, pord f ) = 1. For v 0 ∈ C, suppose that We say that f has the separation property for (t, v 0 ) if f satisfies (9.12) and (9.13).
Remark 9.5. In Sect. 14, we describe how to construct such an f having the separation property.
An immediate consequence of Proposition 9.3 is Corollary 9.6. Let f have the separation property for (t, β) with β ∈ C. Then Another consequence of our analysis above is with multiplicities k 1 , . . . , k , respectively, i.e.,k 1 + · · · + k = n.
Using the same argument, one derives If one of the polynomial factors in the role of B(x) above would be the zero polynomial, we are done. Otherwise, invoking condition (9.12) implies that Recalling that not all the a j are zero and k 1 + · · · + k = n, we obtain a contradiction to the assumption that F * is analytic on X 0 (N )\{[∞] N }.
For complex numbers a 0 , . . . , a n−1 , define a meromorphic function on H by If f has the separation property for (t, v 0 ), then Proof. If one of the a j would be non-zero, Proposition 9.7 would imply a pole of F * at some [τ j ] N = [∞] N , τ j ∈ H ∪ Q.

Order Reduction and Discriminant Polynomials
In this section, we relate discriminant polynomials to order-reduction polynomials associated with integral bases. Throughout this section, let t ∈ M ∞ (N ) with n := pord t ≥ 1, let (1, b 1 , . . . , b n−1 ), b j ∈ M ∞ (N ), be an order-complete tuple forming an integral basis for M ∞ (N ) over C[t], that is, Moreover, let f ∈ M ∞ (N ) again be chosen, such that gcd(n, pord f ) = 1. By Proposition 3.5, such an f gives rise to an order-complete basis (1, f, . . . , f n−1 ) of the C[t]-module: . By exemplifying the case for n = 3, we shall see how the discriminant polynomial is related to the order-reduction polynomial: By the identity theorem from complex analysis, it is sufficient to consider the situation for v from a neighborhood V of v 0 ∈ V . With the setting as in (5.3), one has ⎛ ⎜ ⎝ because owing to (10.1), there exist polynomials r Taking determinants of both sides of the matrix equation squared, this gives for the general case: as polynomials in v : D t (1, b 1 Next, we consider the other direction. By Proposition 4.3, there exist polynomials q j (x) and where q j (t) is either a constant or such that gcd(q j (x), p As before, this can be expressed as a matrix equation. We display the case for n = 3: ⎛ ⎜ ⎝ Again, taking determinants of both sides of the matrix equation squared, for the general case, this gives another polynomial relation in v: where s(x), q 1 (x), . . . , q n−1 (x) are polynomials in C[x]. It will be convenient to cancel out possible common factors and to write, as polynomials in C[x]: such that S(x) and Q 1 (x) · · · Q n−1 (x) are relatively prime polynomials, (10.6) and where q j (x) are determined as in (10.3).
Another application of the argument we used to derive (10.2) is the existence of some polynomial R(x) ∈ C[x], such that This, using (10.5), implies Finally, as a consequence of (10.6), we obtain We summarize the following.

Lemma 11.2. There is
As in the proof of Corollary 4.4, by division with remainder, there are polynomials p l (x) ∈ C[x] and a l ∈ C, such that p Owing to the fact that f has the separation property for (t, β), one has by Corollary 9.8: Iterating this argument cancels out all powers of t − β and one arrives at a representation of b j of the form: with polynomials P l (x) and Q(x), such that x − β Q(x). (11.5) Comparing this to the representation (10.3), which rewrites as n−1 (t)f n−1 , produces a contradiction to the uniqueness of the basis representation since in contrast to (11.5), x − β divides the denominator polynomial q j (x).

Proposition 11.4. For any β ∈ C:
Proof. For the proof we choose f having the separation property for (t, β). 7 For the "⇒" direction of the statement, suppose D t (1, b 1 ≥ 1 and (1, b 1 Then, there exists a c ∈ C and positive integers m β , such that D t (1, b 1 , . . . , b n−1 ) Moreover, for any β ∈ BranchPts C (t * ), suppose that t * (x) = β has  7 How to construct such f is described in Sect. 14.
with the maximal power, which proves (11.8).
For the next consideration, we again have to use the charts as in (9.1), (9.2), and (9.3). In the setting of Proposition 11.5, one has where the last line is by the fact that if t(τ 0 ) ∈ BranchPts(t * ), then Here, we use the notion of multiplicity mult x (f ), also explained in Sect. 16, which stands for the multiplicity at the point x ∈ X of a meromorphic function f on a (compact) Riemann surface X. For x 0 = [τ 0 ] N ∈ X 0 (N ), one has (e.g., [11,Lemma 4.7] and [5,Sect. 2.4]) with respect to our charts φ τ0 (τ ) centered at 0: 8 Hence, we obtain Proposition 11.5. Corollary 11.6. Let t ∈ M ∞ (N ) with n := pord t ≥ 1 and (1, b 1 Then Next, recall from Sect. 16 the definition of Deg(f ), the degree of a meromorphic function f on a compact Riemann surface X: Choosing v := ∞, we have Deg(t * ) = n. Let g(X) := genus of a compact Riemann surface X.

Proof of the Weierstraß Gap Theorem
In this section, we prove the gap theorem for modular functions in M^∞(N) in the following version.
(ii) by Proposition 8.1 (8.3) reduces the degree of the order-reduction polynomial by exactly two.
Hence, we proved that M ∞ (N ) has exactly g gaps. If g = 0 there are no gaps; i.e., in this case, after relabelling indices, To prove the remaining part of the gap theorem, namely, the bound (12.1) for the gaps {n 1 = 1, n 2 , . . . , n g } where g ≥ 1, we will use a general combinatorial argument. Notice that n 1 = 1, because otherwise there would be no gap, which, as we proved, is only possible if g = 0.
To prepare for the combinatorial argument, recall that after choosing t and f from M ∞ (N ) as above, by applying pole-order-reduction steps, we arrived, after relabelling indices, at an integral basis (1, b 1 , . . . , b  Let m + 1 be the smallest non-gap of M ∞ (N ); m ≥ 1 owing to n 1 = 1.
In Sect. 13, we denote the number of gaps in a monoid S by γ(S). Hence, in the given context, g = γ(S). Recall that m + 1 is chosen to be the smallest non-gap of M^∞(N). Therefore, we choose a monoid representation with respect to m + 1. Concretely, in this case, there are k_j ∈ Z_{>0}, such that Since j + (m + 1)(k_j − 1) are the largest gaps in each residue class modulo m + 1, this proves the bound given in (12.1), and the proof of the Weierstraß gap Theorem 12.2 is completed.
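The monoid representation with respect to m + 1 can be made concrete: for each residue class j = 1, …, m one records the smallest monoid element s_j = j + (m + 1)k_j. The Python sketch below computes these k_j for a monoid given by generators. Describing S by a list of generators and the name `residue_representation` are assumptions of this example, not part of the proof.

```python
from functools import reduce
from math import gcd


def residue_representation(generators):
    """Return (m + 1, {j: k_j}) where m + 1 is the smallest non-zero element of
    the monoid S generated by `generators` and s_j = j + (m + 1) * k_j is the
    smallest element of S congruent to j modulo m + 1, for j = 1, ..., m."""
    gens = sorted(set(generators))
    if reduce(gcd, gens) != 1:
        raise ValueError("generators need gcd 1, otherwise S has infinitely many gaps")
    modulus = gens[0]                 # plays the role of m + 1
    in_monoid = [True]                # in_monoid[k] tells whether k lies in S
    minimal = {0: 0}
    k = 0
    while len(minimal) < modulus:
        k += 1
        member = any(k >= g and in_monoid[k - g] for g in gens)
        in_monoid.append(member)
        if member and k % modulus not in minimal:
            minimal[k % modulus] = k
    ks = {j: (minimal[j] - j) // modulus for j in range(1, modulus)}
    return modulus, ks


modulus, ks = residue_representation([4, 5])     # made-up example monoid
number_of_gaps = sum(ks.values())                # class j contributes k_j gaps
largest_gap = max(j + modulus * (kj - 1) for j, kj in ks.items())
print(modulus, ks, number_of_gaps, largest_gap)
```

In residue class j the gaps are j, j + (m + 1), …, j + (m + 1)(k_j − 1), so the k_j add up to the number of gaps. For the sample monoid this gives 6 gaps with largest gap 11 = 2·6 − 1.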

An easy count gives
(13.1)

Lemma 13.1 (Monoid gap lemma).
Under these assumptions, one has for all j = 1, …, m: In other words, the largest possible gap is bounded by 2γ(S) − 1. Before proving this statement, we prove two elementary observations.
Lemma 13.2. If i and ℓ in Z_{>0} are such that i + ℓ = j for j ∈ {1, …, m}, then Proof.
The inequality is by s_i + s_ℓ ∈ S with s_i + s_ℓ ≡ j (mod m + 1), and s_j ∈ S is minimal with this property.
Lemma 13.3. If i and ℓ in Z_{>0} are such that i + ℓ = j + m + 1 for j ∈ {1, …, m}, Proof.
The inequality is by s_i + s_ℓ ∈ S with s_i + s_ℓ ≡ j (mod m + 1), and s_j = j + (m + 1)k_j ∈ S is minimal with this property.
By Lemma 13.3, Summing the left and right sides, respectively, of these m − j inequalities gives Combining the two inequalities, we obtain that which is (13.3).
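The bound of the monoid gap lemma can also be tested experimentally. The Python sketch below enumerates additive submonoids of Z_{≥0} generated by two or three integers with gcd 1 and reports the first one whose largest gap exceeds 2γ(S) − 1; the value None means that no counterexample was found in the searched range. This is a sanity check on small cases, not a proof, and the restriction to at most three generators below a cut-off is an arbitrary choice.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def gaps_of(generators):
    """Sorted list of positive integers outside the monoid generated by `generators`."""
    gens = sorted(set(generators))
    if reduce(gcd, gens) != 1:
        raise ValueError("generators need gcd 1")
    smallest = gens[0]
    in_monoid = [True]
    gaps = []
    run = 1          # length of the current run of consecutive monoid elements
    k = 0
    # Once `smallest` consecutive integers lie in the monoid, all larger ones do too.
    while run < smallest:
        k += 1
        member = any(k >= g and in_monoid[k - g] for g in gens)
        in_monoid.append(member)
        if member:
            run += 1
        else:
            gaps.append(k)
            run = 0
    return gaps


def find_counterexample(max_generator):
    """First generator tuple violating largest gap <= 2 * (number of gaps) - 1, if any."""
    for size in (2, 3):
        for gens in combinations(range(2, max_generator + 1), size):
            if reduce(gcd, gens) != 1:
                continue
            gaps = gaps_of(gens)
            if gaps and gaps[-1] > 2 * len(gaps) - 1:
                return gens
    return None


print(find_counterexample(12))
```

Monoids such as the one generated by 2 and 3, with the single gap 1, show that the bound 2γ(S) − 1 can be attained.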

Functions with Separation Property
The setting which we use throughout this section is: t ∈ M ∞ (N ) with n := pord t ≥ 2 and (1, b 1 Because of pord t = n, for any fixed α ∈ C, we have that 12 with multiplicities k 1 , . . . , k , respectively. (I.e., k 1 + · · · + k = n.) (14.1) We note that, as above, owing to t ∈ M ∞ (N ), In Definition 9.4, we defined the separation property of f for (t, v 0 ) with v 0 = α as in (14.1). At various places, we required f to have this property, for instance, in Proposition 11.5. In this section, we prove the existence of such f . In addition, here, we have to use the charts as in (9.1), (9.2), and (9.3).

Lemma 14.1. Given the setting of this section with
Proof. We are free to relabel the indices of the preimages of α. Hence, it is sufficient to prove the statement for τ 1 := τ and τ 2 := τ . Suppose ( ) As in (9.7), for j = 1, . . . , and suitable neighborhoods U j , one has local expansions for τ ∈ U j : Moreover, as in (9.10), for j = 1, . . . , , we can assume that the neighborhoods U j are chosen, such that the following expansions exist for all τ ∈ U j : Taking a j ∈ C the quotient defines a modular function g ∈ M (N ). Now, g ∈ M ∞ (N ) if and only if all the zeros τ j of t(τ j ) − α = 0 cancel out. Indeed, assuming ( ), one can determine a j ∈ C, not all zero, such that this cancellation happens. Namely, using the local expansions the cancellation condition translates into a system of linear equations, where j runs form 1 to : 13 a 0 + a 1 b 1 (τ j ) + · · · + a n−1 b n−1 (τ j ) = 0, This gives in total k 1 + · · · + k = n equations. However, owing to ( ), two of these equations are the same. This means, we are left with n − 1 equations in n unknowns a 0 , . . . , a n−1 . This implies that there exists a solution to the system with the a j not all 0 . This produces a contradiction: since g ∈ M ∞ (N ), there are polynomials p j (x) ∈ C[x], such that Combining this with (14.2), the uniqueness of the basis representation gives Consequently, all p j (x) and all a j must be zero. Hence, ( ) leads to a contradiction and the lemma is proved.
Suppose F r (τ m+1 ) = F r (τ s ) for some s ∈ {1, . . . , m}\{r}. If there is no such s, we are done with f := F r . Otherwise, by the previous lemma, there is some l ∈ {1, . . . , n − 1}, such that b l (τ m+1 ) = b l (τ s ), and we can choose a non-zero d ∈ C, such that Now, we set F r,s := F r + db l , and see that, Proof. Let us assume that As in (9.7), for j = 1, . . . , and suitable neighborhoods U j , one has local expansions for τ ∈ U j : Now, we proceed with the proof exactly as above. Namely, as in (9.10), for j = 1, . . . , , we can assume that the neighborhoods U j are chosen such that the following expansions exist for all τ ∈ U j and k ∈ {1, . . . , n − 1}: Now, we apply the same strategy as in the proof of Lemma 14.1 and determine a j ∈ C, not all zero, such that This leads us to consider the same system of k 1 + · · · + k = n linear equations. This time, owing to ( ), we have d is always satisfied and can be removed. This means, we are left with n − 1 equations in n unknowns a 0 , . . . , a n−1 . 14 This implies that there exists a solution to the system with the a j not all 0, which produces a contradiction as in the proof of Lemma 14.2. Lemma 14.5. Again, we assume the setting of this section with ≥ 1. Then, there exist α i ∈ C, such that for f := α 0 t + α 1 b 1 + · · · + α n−1 b n−1 :
For the induction step k → k + 1, we assume that we have F of required form, such that (14.8) and (14.9) hold. If, in addition, we are done. Otherwise, define where the quotient is taken for all i, j ∈ {1, . . . , } for which the denominator is non-zero. Now, we additionally require that for a 1,j and b 1,j coming from the expansions: The action of SL 2 (Z) on H extends to an action on meromorphic functions f : H →Ĉ := C ∪ {∞}. A meromorphic function f : H →Ĉ is called a meromorphic modular function for Γ 0 (N ), if (i) for all a b c d ∈ Γ 0 (N ): and (ii) for any γ = a b c d ∈ SL 2 (Z), there exists an M = M (γ) ∈ Z together with a Fourier expansion: In view of γ∞ = lim Im(τ )→∞ γτ = a/c, we say that (15.3) is a q-expansion of f at a/c. Understanding that a/0 = ∞, this also covers the definition of q-expansions at ∞. Concerning the uniqueness of such expansions, let γ ∈ SL 2 (Z) be such that γ ∞ = γ∞ = a/c, then the q-expansion of f (γ τ ) differs from that of f (γτ ) only by a root-of-unity factor in the coefficients, namely, we have then γ = γ ±1 m 0 ±1 for some m ∈ Z, which implies As a consequence, one can extend f from H toĤ := H ∪ {∞} ∪ Q by defining f (a/c) := lim Im(τ )→∞ f (γτ ), where γ ∈ SL 2 (Z) is chosen, such that γ∞ = a/c. Another consequence is that the q-expansions of f at ∞ are uniquely determined owing to and w (0) = 1. (15.4) Next, notice that the action of SL 2 (Z), and thus of Γ 0 (N ), extends in an obvious way to an action onĤ. The orbits of the Γ 0 (N ) action are denoted by In cases where N is clear from the context, one also writes [τ ] instead of [τ ] N . The set of all such orbits is denoted by The Γ 0 (N ) action maps Q ∪ {∞} to itself, and owing to (15.1), each Γ 0 (N ) produces only finitely many orbits [τ ] N with τ ∈ Q ∪ {∞}; such orbits are called cusps of X 0 (N ). One has, for example, Proof. 
This fact can be found in many sources; a detailed description of how to construct a set of representatives for the cusps of $\Gamma_0(N)$, for instance, is given in [16, Lemma 5.3].
Suppose that the domain of $f \in M(N)$ is extended from $\mathbb{H}$ to $\hat{\mathbb{H}}$ as described above, i.e., $f : \mathbb{H} \to \hat{\mathbb{C}}$ is extended to $f : \hat{\mathbb{H}} \to \hat{\mathbb{C}}$, where we keep the same name for the extended function. This extension gives rise to a function $f^* : X_0(N) \to \hat{\mathbb{C}}$, which is defined as follows:
$$f^*([\tau]_N) := f(\tau), \quad \tau \in \hat{\mathbb{H}}.$$
The fact that $f^*$ is well-defined follows from our previous discussion. We say that $f^*$ is induced by $f$.
As described in detail in [5], $X_0(N)$ can be equipped with the structure of a compact Riemann surface. This analytic structure turns the induced functions $f^*$ into meromorphic functions on $X_0(N)$. The following classical lemma [11, Theorem 1.37], a Riemann surface version of Liouville's theorem, is crucial for zero recognition of modular functions.

Hence, knowing all the cusps $[a_j/c_j]$ reduces the task of finding all possible poles to the inspection of $q$-expansions of $f$ at $a_j/c_j$; i.e., of $q$-expansions of $f(\gamma_j\tau)$ as in (15.3) with $\gamma_j \in \mathrm{SL}_2(\mathbb{Z})$ such that $\gamma_j\infty = a_j/c_j$. We also call these expansions local $q$-expansions of $f^*$ at the cusps $[a_j/c_j]_N$; $w_N(c_j)$ is called the width of the cusp $[a_j/c_j]_N$. It is straightforward to show that the width is independent of the choice of the representative $a_j/c_j$ of the cusp $[a_j/c_j]_N$, and that $w_N(c_j) = N/\gcd(c_j^2, N)$ for relatively prime $a_j$ and $c_j$. Note that
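The width formula $w_N(c) = N/\gcd(c^2, N)$ is easy to experiment with. The following is a minimal Python sketch, not part of the original article. It relies on two standard facts that are assumptions here, since they are not stated in the text above: for each divisor $c$ of $N$ there are $\varphi(\gcd(c, N/c))$ cusps with denominator $c$, and the widths of all cusps sum to the index $[\mathrm{SL}_2(\mathbb{Z}) : \Gamma_0(N)] = N \prod_{p \mid N} (1 + 1/p)$. All function names are hypothetical.

```python
from math import gcd


def cusp_width(N, c):
    """Width w_N(c) = N / gcd(c^2, N) of a cusp [a/c]_N with gcd(a, c) = 1."""
    return N // gcd(c * c, N)


def euler_phi(n):
    """Euler's totient function, computed naively (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def gamma0_index(N):
    """Index of Gamma_0(N) in SL_2(Z): N * prod over primes p | N of (1 + 1/p)."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # one prime factor larger than sqrt(N) may remain
        result = result // n * (n + 1)
    return result


def total_cusp_width(N):
    """Sum of the widths of all cusps, grouped by denominator c | N.

    Assumption (standard fact, not taken from the text): there are
    phi(gcd(c, N // c)) inequivalent cusps with denominator c.
    """
    return sum(
        euler_phi(gcd(c, N // c)) * cusp_width(N, c)
        for c in range(1, N + 1)
        if N % c == 0
    )


for N in (5, 7, 11, 12):
    print(N, total_cusp_width(N), gamma0_index(N))
```

With the convention $a/0 = \infty$, `cusp_width(N, 0)` returns 1, matching $w_N(0) = 1$. For prime $N$ the sketch yields exactly two cusps, one of width $N$ (denominator 1) and one of width 1 (denominator $N$).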

Appendix: Meromorphic Functions on Riemann Surfaces-Basic Notions
To make this article as self-contained as possible, in this second appendix section we recall most of the facts that we need about meromorphic functions on Riemann surfaces. For the terminology, we basically follow [7]; other classic texts are [6] and [11]. Lemma 15.2 states the fundamental fact that any analytic function on a compact Riemann surface is constant. In Example 2.1, we have seen that $z_5^*$ has its only zero, of order 1, at $[\infty]_5$ and its only pole, of multiplicity 1, at $[0]_5$; i.e., $z_5^*$ has order $-1$ at $[0]_5$. This is also in accordance with Lemma 16.1, a corollary of another fundamental fact which says that meromorphic functions on compact Riemann surfaces have exactly as many zeros as poles (counting multiplicities); see, for instance, [11, Proposition 4.12]:

Lemma 16.1. Let $g$ be a non-constant meromorphic function on a compact Riemann surface $X$. Then $\sum_{x \in X} \operatorname{ord}_x g = 0$.
Here, $\operatorname{ord}_{x_0} g$ is defined as follows. Suppose $g(x) = \sum_{n=m}^{\infty} c_n (\varphi(x) - \varphi(x_0))^n$, $c_m \neq 0$, is the local Laurent expansion of $g$ at $x_0$ using the local coordinate chart $\varphi : U_0 \to \mathbb{C}$ which homeomorphically maps a neighborhood $U_0$ of $x_0 \in X$ to an open set $V_0 \subseteq \mathbb{C}$. Then, $\operatorname{ord}_{x_0} g := m$.
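As a quick sanity check of Lemma 16.1, one can take the function $z_5^*$ on $X_0(5)$ mentioned above. The only input is the data recalled from Example 2.1 (a single simple zero at $[\infty]_5$, a single simple pole at $[0]_5$, and order 0 everywhere else), so the sum has just two non-zero terms:

```latex
\[
  \sum_{x \in X_0(5)} \operatorname{ord}_x z_5^*
  \;=\; \operatorname{ord}_{[\infty]_5} z_5^* + \operatorname{ord}_{[0]_5} z_5^*
  \;=\; 1 + (-1) \;=\; 0 .
\]
```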
Let $M(S)$ denote the field of meromorphic functions $f : S \to \hat{\mathbb{C}}$ on a Riemann surface $S$. Let $f \in M(S)$ be non-constant: then for every neighborhood $U$ of $x \in S$, there exist neighborhoods $U_x \subseteq U$ of $x$ and $V$ of $f(x)$, such that the set $f^{-1}(v) \cap U_x$ contains exactly $k$ elements for every $v \in V \setminus \{f(x)\}$. This number $k$ is called the multiplicity of $f$ at $x$; notation: $k = \operatorname{mult}_x(f)$. If $S$ is compact, then $f$ is surjective and each $v \in \hat{\mathbb{C}}$ has the same number of preimages, say $n$, counting multiplicities; i.e., $n = \sum_{x \in f^{-1}(v)} \operatorname{mult}_x(f)$, see, e.g., [7, Theorem 4.24]. This number $n$ is called the degree of $f$; notation: $n = \operatorname{Deg}(f)$. One of the consequences is that non-constant functions on compact Riemann surfaces have as many (finitely many) zeros as poles counting multiplicities; this is Lemma 16.1.
$\operatorname{RamiPts}(f) := \{x \in S : \operatorname{mult}_x(f) \geq 2\}$ denotes the set of ramification points of $f$; $\operatorname{BranchPts}(f) := f(\operatorname{RamiPts}(f)) \subseteq \hat{\mathbb{C}}$ denotes the set of branch points of $f$. Ramification points, and also branch points, of a function $f$ form sets having no accumulation point. Hence, for functions on compact Riemann surfaces, these sets have finitely many elements.
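To make multiplicity, degree, ramification points, and branch points concrete, here is a small Python sketch on a toy example that is not taken from the article: the polynomial $f(z) = z^3 - 3z$, viewed as a map of the Riemann sphere to itself (an illustrative assumption). At a finite point $x$, the multiplicity of a polynomial equals the order of vanishing of $f - f(x)$ at $x$, i.e., the smallest $k \geq 1$ with $f^{(k)}(x) \neq 0$. All names in the code are hypothetical.

```python
from fractions import Fraction


def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Horner evaluation in exact rational arithmetic."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def mult_at(coeffs, x):
    """mult_x(f) for a non-constant polynomial f at a finite point x:
    the smallest k >= 1 such that the k-th derivative of f is non-zero at x."""
    k, d = 1, derivative(coeffs)
    while evaluate(d, x) == 0:
        k, d = k + 1, derivative(d)
    return k


f = [0, -3, 0, 1]  # f(z) = z^3 - 3z, so Deg(f) = 3

# Two complete fibers f^{-1}(v), listed by hand:
#   v = 2       : z^3 - 3z - 2 = (z + 1)^2 (z - 2)   (v is a branch point)
#   v = 286/343 : three distinct simple preimages    (a generic value)
fibers = {
    Fraction(2): [Fraction(-1), Fraction(2)],
    Fraction(286, 343): [Fraction(13, 7), Fraction(-2, 7), Fraction(-11, 7)],
}

for v, points in fibers.items():
    assert all(evaluate(f, x) == v for x in points)
    mults = [mult_at(f, x) for x in points]
    print(v, mults, sum(mults))
```

In both fibers the multiplicities add up to the degree 3, although the number of distinct preimages differs. The finite ramification points are the zeros $z = \pm 1$ of $f'(z) = 3z^2 - 3$, with branch points $f(\pm 1) = \mp 2$. The point $\infty$, where this map has multiplicity 3, is not handled by the sketch.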

Conclusion
In this article, we present the first proof of the Weierstraß gap theorem (for modular functions) without using the Riemann–Roch theorem. The main ingredient in our proof is the concept of order-reduction polynomials, which corresponds to the discriminant of a field extension of $\mathbb{Q}$ in the setting of algebraic number theory; see, for instance, [10, III, §3]. In the field case, the structure of this discriminant is related to the ramification index [10, III, §2, Proposition 8, and III, §3, Proposition 14]. Analogously, in Proposition 11.5, we give a factorization of the order-reduction polynomial which relates directly to the branch points of the modular function $t$. This relation allows us to connect the degree of this polynomial to the genus of $X_0(N)$. This observation is crucial for our proof of the Weierstraß gap theorem.
In addition, our approach gives new algebraic and algorithmic insight based on module presentations of modular function algebras, in particular, the usage of integral bases. For example, our proof also gives a method to compute the order-reduction polynomial by using the Puiseux series expansions at infinity. Another new feature concerns the gap bound: the main task of our proof is to show that there are exactly $g$ gaps for any modular function algebra. The proof that the corresponding pole orders are bounded by $2g - 1$ turns out, with the help of an elementary combinatorial argument, to be an immediate consequence of our approach. Another by-product of our framework is a natural explanation of the genus $g = 0$ case as a consequence of the reduction to an integral basis.
Summarizing, our setting generalizes ideas from algebraic number theory, but still stays close to "first principles." Hence, we feel that our approach has potential for further extensions and applications. For example, we are planning to exploit the algorithmic content of our approach for computer algebra applications, for instance, for the effective computation of suitable module bases for modular function algebras.