1 Introduction

Let \(f\in M_k(\Gamma _0(N),\xi )\) be a classical holomorphic modular form of weight k, level N and nebentypus character \(\xi \), and define

$$\begin{aligned} g(z)=(\sqrt{N}z)^{-k}f\!\left( -\frac{1}{Nz}\right) . \end{aligned}$$
(1.1)

Let \(f_n\) and \(g_n\) denote the Fourier coefficients of f and g, respectively, and define

$$\begin{aligned} \Lambda _f(s)=\Gamma _\mathbb {C}\left( s+\tfrac{k-1}{2}\right) \sum _{n=1}^\infty f_nn^{-s-\frac{k-1}{2}} \quad \text {and}\quad \Lambda _g(s)=\Gamma _\mathbb {C}\left( s+\tfrac{k-1}{2}\right) \sum _{n=1}^\infty g_nn^{-s-\frac{k-1}{2}} \end{aligned}$$
(1.2)

for \(\mathfrak {R}(s)>\frac{k+1}{2}\), where \(\Gamma _\mathbb {C}(s):=2(2\pi )^{-s}\Gamma (s)\). Then \(\Lambda _f(s)\) and \(\Lambda _g(s)\) continue to entire functions of finite order, apart from at most simple poles at \(s=\frac{1\pm k}{2}\), and satisfy the functional equation

$$\begin{aligned} \Lambda _f(s)=i^kN^{\frac{1}{2}-s}\Lambda _g(1-s). \end{aligned}$$
(1.3)

Conversely, when \(N\le 4\), Hecke [11, 12] (see also [1]) showed that the modular forms of level N are characterized by these properties. Precisely, given sequences \(\{f_n\}_{n=1}^\infty \), \(\{g_n\}_{n=1}^\infty \) of at most polynomial growth, if the functions \(\Lambda _f(s)\) and \(\Lambda _g(s)\) defined by (1.2) continue to entire functions of finite order and satisfy (1.3) then \(f_n\) and \(g_n\) are the Fourier coefficients of modular forms of level N and weight k, related by (1.1).

When \(N\ge 5\), Hecke’s proof no longer goes through, and in fact the vector space of sequences \(\{f_n\}_{n=1}^\infty \), \(\{g_n\}_{n=1}^\infty \) satisfying the above conditions is infinite dimensional. Weil [22] showed that one can recover the converse statement by assuming additional functional equations for twisted L-functions

$$\begin{aligned} \Lambda _f(s,\chi )=\Gamma _\mathbb {C}(s+\tfrac{k-1}{2})\sum _{n=1}^\infty f_n\chi (n)n^{-s-\frac{k-1}{2}} \end{aligned}$$
(1.4)

for primitive characters \(\chi \) of conductor coprime to N. On the other hand, it has been conjectured (see [5, Conjecture 1.2]) that if \(\Lambda _f(s)\) and \(\Lambda _g(s)\) have Euler product expansions of the shape satisfied by primitive Hecke eigenforms then the single functional equation (1.3) should suffice to imply modularity, without the need for character twists. Some partial progress on this problem was made by Conrey and Farmer [3] (see also [4]), who proved the conjecture for some values of N exceeding 4.

One drawback of assuming an Euler product is that it imposes a non-linear constraint on the Fourier coefficients \(f_n,g_n\), so the solutions to (1.3) no longer form a vector space. In turn, it is unclear how to make use of this constraint to extend Hecke’s proof to higher level. In this paper we propose a replacement for the Euler product that, we conjecture, characterizes the modular forms of any level N, yet retains the linearity of (1.3):

Conjecture 1.1

Let \(\xi \) be a Dirichlet character modulo N, k a positive integer satisfying \(\xi (-1)=(-1)^k\), and \(\{f_n\}_{n=1}^\infty ,\{g_n\}_{n=1}^\infty \) sequences of complex numbers satisfying \(f_n,g_n=O(n^\sigma )\) for some \(\sigma >0\). For \(q\in \mathbb {{N}}\), let

$$\begin{aligned} c_q(n)=\sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}}e\!\left( \frac{an}{q}\right) \end{aligned}$$

be the associated Ramanujan sum, where \(e(x):=e^{2\pi ix}\), and define

$$\begin{aligned} \Lambda _f(s,c_q)= & {} \Gamma _\mathbb {C}\bigl (s+\tfrac{k-1}{2}\bigr )\sum _{n=1}^\infty \frac{f_nc_q(n)}{n^{s+\frac{k-1}{2}}} \quad \text {and}\\ \Lambda _g(s,c_q)= & {} \Gamma _\mathbb {C}\bigl (s+\tfrac{k-1}{2}\bigr )\sum _{n=1}^\infty \frac{g_nc_q(n)}{n^{s+\frac{k-1}{2}}} \end{aligned}$$

for \(\mathfrak {R}(s)>\sigma +1-\frac{k-1}{2}\). For every q coprime to N, suppose that \(\Lambda _f(s,c_q)\) and \(\Lambda _g(s,c_q)\) continue to entire functions of finite order and satisfy the functional equation

$$\begin{aligned} \Lambda _f(s,c_q)=i^k \xi (q)(Nq^2)^{\frac{1}{2}-s}\Lambda _g(1-s,c_q). \end{aligned}$$
(1.5)

Then \(f(z):=\sum _{n=1}^\infty f_ne(nz)\) is an element of \(M_k(\Gamma _0(N),\xi )\).

To understand the motivation behind this conjecture, we first consider a more general family of twists. Let \(\chi \pmod {q}\) be a Dirichlet character, not necessarily primitive, and define

$$\begin{aligned} c_\chi (n)= & {} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \chi (a)e\!\left( \frac{an}{q}\right) , \end{aligned}$$
(1.6)
$$\begin{aligned} \Lambda _f(s,c_\chi )= & {} \Gamma _\mathbb {C}\left( s+\tfrac{k-1}{2}\right) \sum _{n=1}^\infty \frac{f_nc_\chi (n)}{n^{s+\frac{k-1}{2}}} \quad \text {and}\nonumber \\ \Lambda _g(s,c_{\overline{\chi }})= & {} \Gamma _\mathbb {C}\left( s+\tfrac{k-1}{2}\right) \sum _{n=1}^\infty \frac{g_nc_{\overline{\chi }}(n)}{n^{s+\frac{k-1}{2}}}. \end{aligned}$$
(1.7)

Note that when \(\chi \) is the trivial character mod q, \(c_\chi \) reduces to the Ramanujan sum \(c_q\). In Lemma 4.10, we show that if we start from a pair of modular forms f, g satisfying (1.1), then \(\Lambda _f(s,c_\chi )\) and \(\Lambda _g(s,c_{\overline{\chi }})\) satisfy the functional equation

$$\begin{aligned} \Lambda _f(s,c_\chi ) = i^k\xi (q)\overline{\chi (-N)}(Nq^2)^{\frac{1}{2}-s} \Lambda _g(1-s,c_{\overline{\chi }}). \end{aligned}$$
(1.8)

When \(\chi \) is primitive, we have \(c_\chi (n)=\tau (\chi )\overline{\chi (n)}\), where \(\tau (\chi )=\sum _{a=1}^q\chi (a)e(a/q)\) denotes the Gauss sum, and (1.8) reduces to the familiar functional equation for the multiplicative twist \(\Lambda _f(s,\overline{\chi })\). More generally, when \(\Lambda _f(s)\) possesses an Euler product, we show in Lemma 4.12 that (1.8) is implied by the functional equation for \(\Lambda _f(s,\overline{\chi }_*)\), where \(\chi _*\) is the primitive character inducing \(\chi \). In particular, in the presence of an Euler product, (1.3) implies (1.5).
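The identity \(c_\chi (n)=\tau (\chi )\overline{\chi (n)}\) for primitive \(\chi \) is easy to test numerically. The following Python sketch does so for a single character modulo 5 and also evaluates the trivial-character case, where \(c_\chi \) is the Ramanujan sum. The function names and the choice of character (the one determined by \(\chi (2)=i\)) are illustrative assumptions, not taken from the paper.

```python
import cmath
from math import gcd

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def c_chi(chi, q, n):
    """Twisted sum (1.6): sum of chi(a) e(an/q) over a mod q with (a, q) = 1."""
    return sum(chi(a) * e(a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)

# Hypothetical example: the primitive character mod 5 with chi(2) = i.
CHI5_VALUES = {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}

def chi5(a):
    return complex(CHI5_VALUES[a % 5])

def trivial_mod_5(a):
    return 0 if a % 5 == 0 else 1

tau = c_chi(chi5, 5, 1)  # Gauss sum tau(chi)

# Largest deviation from c_chi(n) = tau(chi) * conj(chi(n)) over n = 1..20.
max_error = max(
    abs(c_chi(chi5, 5, n) - tau * chi5(n).conjugate()) for n in range(1, 21)
)

# For the trivial character mod a prime q, c_q(n) is q - 1 if q | n and -1 otherwise.
ramanujan_values = [round(c_chi(trivial_mod_5, 5, n).real) for n in range(1, 6)]
```

Here `max_error` should be at the level of floating-point rounding, and `ramanujan_values` lists \(c_5(n)\) for \(n=1,\ldots ,5\).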

Given any \(Q\in \mathbb {{N}}\) and \(q\mid Q\), we can view \(c_\chi \) for \(\chi \pmod {q}\) as a function on \(\mathbb {Z}/Q\mathbb {Z}\). One can show that as \(\chi \) ranges over all characters of modulus dividing Q, the functions \(c_\chi \) form an orthogonal basis for the space of functions on \(\mathbb {Z}/Q\mathbb {Z}\). Thus, any twist of f with periodic coefficients and period coprime to N is a linear combination of the twists by \(c_\chi \). In this sense, (1.8) is the most general functional equation (from twists with period coprime to the level) that one can expect.
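As a sanity check of the orthogonality claim, one can compute the Gram matrix of the functions \(c_\chi \) in a small case. The sketch below treats only a prime modulus \(Q=p\), so the moduli dividing Q are 1 and p; the helper names, the restriction to prime Q, and the choice \(p=7\) with primitive root 3 are assumptions made for illustration.

```python
import cmath
from math import gcd

def e(x):
    return cmath.exp(2j * cmath.pi * x)

def characters_mod_prime(p, g):
    """All p - 1 Dirichlet characters mod a prime p, given a primitive root g."""
    log = {pow(g, t, p): t for t in range(p - 1)}  # discrete logarithm table

    def make(j):
        return lambda a: 0 if a % p == 0 else e(j * log[a % p] / (p - 1))

    return [make(j) for j in range(p - 1)]

def c_chi(chi, q, n):
    """Twisted sum (1.6) for a character chi of modulus q."""
    return sum(chi(a) * e(a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)

def gram_matrix(p, g):
    """Inner products <c_chi, c_chi'> on Z/pZ for all chi of modulus 1 or p."""
    twists = [(lambda a: 1, 1)] + [(chi, p) for chi in characters_mod_prime(p, g)]
    vectors = [[c_chi(chi, q, n) for n in range(p)] for chi, q in twists]
    return [
        [sum(u[n] * v[n].conjugate() for n in range(p)) for v in vectors]
        for u in vectors
    ]

G = gram_matrix(7, 3)  # 3 is a primitive root mod 7
off_diagonal = max(abs(G[i][j]) for i in range(7) for j in range(7) if i != j)
diagonal = min(abs(G[i][i]) for i in range(7))
```

A vanishing `off_diagonal` (up to rounding) together with a non-zero `diagonal` indicates that the seven functions are pairwise orthogonal and hence a basis of the functions on \(\mathbb {Z}/7\mathbb {Z}\).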

Conjecture 1.1 arises from the speculation that any constraints on the solutions to (1.3) imposed by the assumption of an Euler product are already implied by the extra functional equations (1.8) that one obtains from taking \(\chi \) equal to the trivial character mod q. In Sect. 2, we prove five theorems that lend some support to the conjecture:

  (1) Theorem 2.1 establishes Conjecture 1.1 for some values of N exceeding 4, following the methods of Conrey and Farmer [3].

  (2) Theorem 2.2 proves Conjecture 1.1 under the additional assumption that f is modular for some subgroup of finite index in \({{\mathrm{SL}}}_2(\mathbb {Z})\) (not necessarily a congruence subgroup).

  (3) Theorem 2.3 proves Conjecture 1.1 under the additional assumption that |f| is modular for some congruence subgroup.

  (4) Theorem 2.4 proves Conjecture 1.1 under the additional assumptions that N is prime and f is modular for the commutator subgroup of \(\Gamma _0(N)\). This establishes a version of Theorem 2.2 for some cases of infinite index.

  (5) Theorem 2.5 shows that for almost all primes q, the hypotheses of Conjecture 1.1, together with the expected analytic properties and functional equations of the multiplicative character twists (1.4) for the primitive characters \(\chi \pmod {q}\), suffice to imply modularity. Particular examples of suitable q are given for some levels outside the scope of Theorem 2.1.

To set these results in context, we note that one reason why Hecke’s argument fails for \(N\ge 5\) is that there are counterexamples arising from more general kinds of modular forms. If one believes that a twistless converse theorem is possible assuming an Euler product, then it is reasonable to ask how these counterexamples are eliminated by the Euler product. Points (2) and (3) above address two such generalizations of modular forms, namely forms for non-congruence groups and forms for more general weight-k multiplier systems (not necessarily of finite order).

Concerning point (5), Diaconu et al. [6] showed that if \(\Lambda _f(s)\) is given by an Euler product, then there exists a prime q (depending on N) such that the analytic properties and functional equations of the character twists (1.4) for all primitive \(\chi \) of conductor dividing q suffice to imply modularity. On the other hand, again under the assumption of an Euler product, it follows from a theorem of Piatetski-Shapiro [17] that it suffices to assume the expected properties of (1.4) for all primitive \(\chi \pmod {p^j}\) for any fixed prime p and all \(j\ge 0\). Point (5) can be seen as a complement to both of these results. We conjecture that the proof of Theorem 2.5 can be extended to all sufficiently large primes q, and we study this problem in detail in Sect. 3.

2 Main results

Let \(\mathbb {{H}}=\{z\in \mathbb {C}:\mathfrak {I}(z)>0\}\) denote the upper half-plane. For any function \(h:\mathbb {{H}}\rightarrow \mathbb {C}\) and any matrix \(\gamma =\left( {\begin{matrix}a&{}b\\ c&{}d\end{matrix}}\right) \in {{\mathrm{GL}}}_2^+(\mathbb {R})=\{M\in {{\mathrm{GL}}}_2(\mathbb {R}):\det {M}>0\}\), define

$$\begin{aligned} h|\gamma =(\det \gamma )^{k/2}(cz+d)^{-k}h\!\left( \frac{az+b}{cz+d}\right) , \end{aligned}$$

where \(k\in \mathbb {{N}}\) is the integer appearing in Conjecture 1.1. (We assume that k is fixed from now on and suppress it from the notation.) Note that this defines a right action, i.e. \(h|(\gamma _1\gamma _2)=(h|\gamma _1)|\gamma _2\) for any \(\gamma _1,\gamma _2\in {{\mathrm{GL}}}_2^+(\mathbb {R})\). We extend the action linearly to the group algebra \(\mathbb {C}[{{\mathrm{GL}}}_2^+(\mathbb {R})]\), i.e. for \(\gamma =\sum _i c_i\gamma _i\in \mathbb {C}[{{\mathrm{GL}}}_2^+(\mathbb {R})]\) we define \(h|\gamma =\sum _i c_ih|\gamma _i\).
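The right-action property can be checked numerically at a sample point. The sketch below is only an illustration: the test function `h`, the two matrices, the weight and the evaluation point are arbitrary choices, and matrices are stored as tuples `(a, b, c, d)`.

```python
def slash(h, gamma, k):
    """Return the function h|gamma for gamma = (a, b, c, d) with det > 0."""
    a, b, c, d = gamma
    det = a * d - b * c
    return lambda z: det ** (k / 2) * (c * z + d) ** (-k) * h((a * z + b) / (c * z + d))

def matmul(g1, g2):
    """Product of two 2x2 matrices stored as (a, b, c, d)."""
    a, b, c, d = g1
    p, q, r, s = g2
    return (a * p + b * r, a * q + b * s, c * p + d * r, c * q + d * s)

k = 4
h = lambda z: 1 / (z + 2j) ** 3        # arbitrary function, holomorphic on H
g1, g2 = (2, 1, 1, 3), (1, -1, 2, 5)   # determinants 5 and 7
z0 = 0.3 + 0.7j

lhs = slash(h, matmul(g1, g2), k)(z0)        # h|(g1 g2)
rhs = slash(slash(h, g1, k), g2, k)(z0)      # (h|g1)|g2
```

The two values agree up to rounding because the automorphy factor satisfies \(j(\gamma _1\gamma _2,z)=j(\gamma _1,\gamma _2z)\,j(\gamma _2,z)\) with \(j(\gamma ,z)=cz+d\).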

Let f be as in Conjecture 1.1, and define \(g(z)=\sum _{n=1}^\infty g_ne(nz)\). Then, by Hecke’s argument [15, Theorem 4.3.5], the fact that \(\Lambda _f(s,c_1)\) and \(\Lambda _g(s,c_1)\) continue to entire functions of finite order and satisfy (1.5) for \(q=1\) is equivalent to the identity \(f|\left( {\begin{matrix}&{}-1\\ N&{}\end{matrix}}\right) =g\). Writing \(T=\left( {\begin{matrix}1&{}1\\ {} &{}1\end{matrix}}\right) \) and \(W=\left( {\begin{matrix}&{}-1\\ N&{}\end{matrix}}\right) T^{-1}\left( {\begin{matrix}&{}-1\\ N&{}\end{matrix}}\right) ^{-1}=\left( {\begin{matrix}1&{}\\ N&{}1\end{matrix}}\right) \), since f and g are given by Fourier series, we have \(f|T=f|W=f\).

Given a matrix \(\gamma =\left( {\begin{matrix}a&{}b\\ c&{}d\end{matrix}}\right) \in \Gamma _0(N)\), we define \(\xi (\gamma )=\xi (d)\). Since \(\xi (-1)=(-1)^k\), we have \(f|(-I)=\xi (-I)f\), and thus \(f|\gamma =\xi (\gamma )f\) for every \(\gamma \in \langle -I,T,W\rangle \). To prove that \(f\in M_k(\Gamma _0(N),\xi )\), it suffices to verify this equality for every \(\gamma \in \Gamma _0(N)\), since the holomorphy of f at cusps follows from modularity and the growth estimate \(f_n=O(n^\sigma )\).

Note that if \(\gamma ,\gamma '\in \Gamma _0(N)\) have the same top row then \(\gamma '\gamma ^{-1}\) is a power of W, so that \(f|\gamma '=f|\gamma \). Thus, \(f|\gamma \) depends only on the top row of \(\gamma \). With this in mind, we will write \(\gamma _{q,a}\) to denote any element of \(\Gamma _0(N)\) with top row \(\left( {\begin{matrix}q&-a\end{matrix}}\right) \).
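To make the notation \(\gamma _{q,a}\) concrete, here is a minimal Python sketch that builds an element of \(\Gamma _0(N)\) with top row \(\left( {\begin{matrix}q&-a\end{matrix}}\right) \) and confirms, in one example, that two such elements differ by a power of W on the left. The helper `gamma_qa` and its `shift` parameter are hypothetical names introduced here; the sketch assumes \(a\ne 0\) and \(\gcd (q,aN)=1\).

```python
from math import gcd

def gamma_qa(q, a, N, shift=0):
    """Return ((q, -a), (C, D)) in Gamma_0(N); `shift` picks among valid bottom rows.

    Hypothetical helper: assumes a != 0 and gcd(q, a*N) = 1.
    """
    m = abs(a * N)
    if a == 0 or gcd(q, m) != 1:
        raise ValueError("need a != 0 and gcd(q, a*N) = 1")
    D = pow(q, -1, m) + shift * m   # q*D = 1 (mod a*N); requires Python >= 3.8
    C = (1 - q * D) // a            # exact division, and N divides C
    return ((q, -a), (C, D))

def mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(g):
    """Inverse of a determinant-one 2x2 matrix."""
    (a, b), (c, d) = g
    return ((d, -b), (-c, a))

N = 15
g = gamma_qa(11, 4, N)                  # ((11, -4), (-30, 11))
g_other = gamma_qa(11, 4, N, shift=1)   # same top row, different bottom row
quotient = mul(g_other, inverse(g))     # ((1, 0), (-15, 1)), i.e. W^{-1}
```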

Theorem 2.1

Conjecture 1.1 is true for \(N\le 9\) and \(N\in \{11,15,17,23\}\).

Proof

The following table shows, for each N in the statement of the theorem, minimal generating sets for \(\Gamma _0(N)\), verified with Sage [20]:

$$\begin{array}{c|l|c|l} N & \text {Generators} & N & \text {Generators}\\ \hline 1 & \{T,W\} & 8 & \{-I,T,W,\gamma _{3,1}\}\\ 2 & \{T,W\} & 9 & \{-I,T,W,\gamma _{2,1}\}\\ 3 & \{T,-W\} & 11 & \{-I,W,\gamma _{2,1},\gamma _{3,1}\}\\ 4 & \{-I,T,W\} & 15 & \{-I,T,W,\gamma _{2,1},\gamma _{4,1},\gamma _{11,4}\}\\ 5 & \{T,W,\gamma _{2,1}\} & 17 & \{T,W,\gamma _{2,1},\gamma _{3,1},\gamma _{6,1}\}\\ 6 & \{-I,T,W,\gamma _{5,2}\} & 23 & \{-I,T,W,\gamma _{2,1},\gamma _{4,1},\gamma _{6,1},\gamma _{10,-3}\}\\ 7 & \{T,W,-\gamma _{2,1}\} & & \end{array}$$

In particular, for \(N\le 4\), \(\Gamma _0(N)\) is generated by \(-I\), T and W, so there is nothing to prove. For all other levels we apply the methods of Conrey and Farmer [3], in the form of Lemmas 4.1, 4.3 and 4.4.

For odd values of N, Lemma 4.1 with \(q=2\) implies that \(f|\gamma _{2,1}=\overline{\xi (2)}f\). In view of the table, this establishes the claim for \(N\in \{5,7,9\}\).

For \(N\in \{8,11,15,17,23\}\) we obtain values of \(q\in \{3,4,6\}\) for which \(f|\gamma _{q,1}=\overline{\xi (q)}f\) from Lemma 4.3. For \(N\in \{8,11,17\}\) these are sufficient to establish the claim.

It remains only to prove the claim for \(N=6,15,23\), for which we need to show modularity with respect to the generators \(\gamma _{5,2}\), \(\gamma _{11,4}\), \(\gamma _{10,-3}\), respectively. For \(N=6\) we have the equalities

$$\begin{aligned} \begin{pmatrix}5&{}-1\\ 6&{}-1\end{pmatrix}=-TW^{-1} \quad \text {and}\quad \begin{pmatrix}5&{}1\\ -6&{}-1\end{pmatrix}=-T^{-1}W, \end{aligned}$$

so Lemma 4.1 with \(q=5\) takes the form

$$\begin{aligned} f\biggl |\bigl [\gamma -\xi (-1)\bigr ]\begin{pmatrix}1&{}2/5\\ {} &{}1\end{pmatrix}+f\biggl |\bigl [\gamma ^{-1}-\xi (-1)\bigr ]\begin{pmatrix}1&{}-2/5\\ {} &{}1\end{pmatrix}=0, \end{aligned}$$

where \(\gamma =\left( {\begin{matrix}5&{}-2\\ -12&{}5\end{matrix}}\right) \). Applying Lemma 4.4 with \(\alpha =4/5\) and \(\zeta =-1\), we obtain \(f|\gamma =\xi (-1)f\).

For \(N=15\) we have the equalities

$$\begin{aligned} \begin{pmatrix}8&{}-1\\ -15&{}2\end{pmatrix}= & {} T^{-1}\begin{pmatrix}2&{}-1\\ 15&{}-7\end{pmatrix}^{-1}, \quad \begin{pmatrix}8&{}1\\ 15&{}2\end{pmatrix}=\begin{pmatrix}2&{}-1\\ -15&{}8\end{pmatrix}^{-1},\\ \begin{pmatrix}8&{}-3\\ 75&{}-28\end{pmatrix}= & {} -\begin{pmatrix}2&{}-1\\ 15&{}-7\end{pmatrix}T\begin{pmatrix}11&{}-4\\ -30&{}11\end{pmatrix}, \\ \begin{pmatrix}8&{}3\\ 45&{}17\end{pmatrix}= & {} -\begin{pmatrix}2&{}-1\\ 15&{}-7\end{pmatrix}\begin{pmatrix}11&{}-4\\ -30&{}11\end{pmatrix}^{-1}, \end{aligned}$$

so Lemma 4.1 with \(q=8\) takes the form

$$\begin{aligned} \xi (7)f\biggl |\bigl [\gamma -\xi (11)\bigr ]\begin{pmatrix}1&{}3/8\\ {} &{}1\end{pmatrix}+\xi (7)f\biggl |\bigl [\gamma ^{-1}-\xi (11)\bigr ]\begin{pmatrix}1&{}-3/8\\ {} &{}1\end{pmatrix}=0, \end{aligned}$$

where \(\gamma =\left( {\begin{matrix}11&{}-4\\ -30&{}11\end{matrix}}\right) \). Applying Lemma 4.4 with \(\alpha =3/4\) and \(\zeta =-1\), we obtain \(f|\gamma =\xi (11)f\).

For \(N=23\) we have the equalities

$$\begin{aligned} \begin{pmatrix}3&{}-1\\ -23&{}8\end{pmatrix}=-\begin{pmatrix}4&{}-1\\ -23&{}6\end{pmatrix}\begin{pmatrix}6&{}-1\\ -23&{}4\end{pmatrix}^{-1} \begin{pmatrix}10&{}3\\ 23&{}7\end{pmatrix}^{-1} \end{aligned}$$

and

$$\begin{aligned} \begin{pmatrix}3&{}1\\ 23&{}8\end{pmatrix}=-\begin{pmatrix}2&{}-1\\ 23&{}-11\end{pmatrix}\begin{pmatrix}10&{}3\\ 23&{}7\end{pmatrix}, \end{aligned}$$

so Lemma 4.1 with \(q=3\) takes the form

$$\begin{aligned} \xi (11)f\biggl |\bigl [\gamma -\xi (7)\bigr ]\begin{pmatrix}1&{}-1/3\\ {} &{}1\end{pmatrix}+\xi (10)f\biggl |\bigl [\gamma ^{-1}-\xi (10)\bigr ]\begin{pmatrix}1&{}1/3\\ {} &{}1\end{pmatrix}=0, \end{aligned}$$

where \(\gamma =\left( {\begin{matrix}10&{}3\\ 23&{}7\end{matrix}}\right) \). Applying Lemma 4.4 with \(\alpha =-2/3\) and \(\zeta =-\xi (8)\), we obtain \(f|\gamma =\xi (7)f\). \(\square \)
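The matrix identities used in the proof are finite integer computations, so they can be re-checked mechanically. The following Python sketch verifies the displayed equalities for \(N=6\) and \(N=23\); the helper names are illustrative and the script is independent of the Sage computation cited above.

```python
from functools import reduce

def mul2(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mul(*matrices):
    """Product of any number of 2x2 integer matrices, left to right."""
    return reduce(mul2, matrices)

def inv(g):
    """Inverse of a determinant-one 2x2 matrix."""
    (a, b), (c, d) = g
    return ((d, -b), (-c, a))

def neg(g):
    return tuple(tuple(-x for x in row) for row in g)

T = ((1, 1), (0, 1))

def W(N):
    return ((1, 0), (N, 1))

g23 = ((10, 3), (23, 7))  # the matrix called gamma in the case N = 23

checks = {
    "N=6, top row (5,-1)": neg(mul(T, inv(W(6)))) == ((5, -1), (6, -1)),
    "N=6, top row (5,1)": neg(mul(inv(T), W(6))) == ((5, 1), (-6, -1)),
    "N=23, top row (3,-1)": neg(
        mul(((4, -1), (-23, 6)), inv(((6, -1), (-23, 4))), inv(g23))
    ) == ((3, -1), (-23, 8)),
    "N=23, top row (3,1)": neg(mul(((2, -1), (23, -11)), g23)) == ((3, 1), (23, 8)),
}
```

Each entry of `checks` records whether the corresponding displayed identity holds exactly over the integers.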

Theorem 2.2

Assume the hypotheses of Conjecture 1.1. Suppose that there is a subgroup \(H<\Gamma _1(N)\) of finite index such that \(f|\gamma =f\) for all \(\gamma \in H\). Then \(f\in M_k(\Gamma _0(N),\xi )\).

Proof

We may assume without loss of generality that H contains T and W. By Lemma 4.1, for any prime \(q\not \mid N\),

$$\begin{aligned} \sum _{a=1}^{q-1} f\bigg | \Bigl [\gamma _{q,a}-\overline{\xi (q)}\Bigr ] \begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}=0. \end{aligned}$$
(2.1)

Put \(h=[\Gamma _0(N):H]\), and let \(g_1,\ldots ,g_h\in \Gamma _0(N)\) be coset representatives for \(H\backslash \Gamma _0(N)\). Replacing \(g_i\) by \(Wg_i\) if necessary, we may assume without loss of generality that \(g_i\) is not upper triangular. For each \(\gamma _{q,a}\in \Gamma _0(N)\), there exists \(i\in \{1,\ldots ,h\}\) such that \(\gamma _{q,a}\in H g_i\), so that \(f|\gamma _{q,a}=f|g_i\). Rearranging (2.1), we get

$$\begin{aligned} \sum _{i=1}^h f\bigg | \bigl [g_i - \xi (g_i)\bigr ] \sum _{\ell =1}^{\kappa _i} \begin{pmatrix}1 &{} \frac{a_{i\ell }}{q}\\ 0 &{} 1\end{pmatrix}=0, \end{aligned}$$

where \(\bigcup _{i=1}^h\{a_{i\ell }:\ell =1,\ldots ,\kappa _i\}\) is a disjoint partition of \(\{1,\ldots ,q-1\}\).

For each \(i\in \{1,\ldots ,h\}\), since \([\Gamma _0(N): g_i^{-1} H g_i]=[\Gamma _0(N):H]<\infty \), there exists \(m_i\in \mathbb {{N}}\) such that

$$\begin{aligned} g_i^{-1} H g_iT^{m_i} = g_i^{-1} H g_i. \end{aligned}$$

Setting \(m={{\mathrm{lcm}}}(m_1,\ldots ,m_h)\), we have \(g_i T^m \in H g_i\) for all i. Then \(f|g_i T^{m} = f|g_i\), and thus \(f|[g_i-\xi (g_i)]\) has a Fourier expansion:

$$\begin{aligned} f\bigl |\bigl [g_i -\xi (g_i)\bigr ] = \sum _{n\in \mathbb {Z}}\lambda _i(n)e\!\left( n\tfrac{z}{m}\right) . \end{aligned}$$
(2.2)

Therefore,

$$\begin{aligned} \sum _{i=1}^h f\bigg | \bigl [g_i-\xi (g_i)\bigr ] \sum _{\ell =1}^{\kappa _i} \begin{pmatrix}1 &{} \tfrac{a_{i\ell }}{q}\\ 0 &{} 1\end{pmatrix}= \sum _{n\in \mathbb {Z}} \sum _{i=1}^h \lambda _i(n) \left( \sum _{\ell =1}^{\kappa _i} e\!\left( n \tfrac{a_{i\ell }}{qm}\right) \right) e\!\left( n\tfrac{z}{m}\right) =0, \end{aligned}$$

i.e. for \(n\in \mathbb {Z}\),

$$\begin{aligned} \sum _{i=1}^h \lambda _i(n) \left( \sum _{\ell =1}^{\kappa _i} e\!\left( n \tfrac{a_{i\ell }}{qm}\right) \right) = 0. \end{aligned}$$
(2.3)

Fix \(n\in \mathbb {Z}\setminus \{0\}\). By Dirichlet’s theorem, we can choose distinct primes \(q_1,\ldots ,q_h\not \mid mnN\) and integers \(a_1,\ldots ,a_h\) such that \(\gamma _{q_i,a_i}\in \langle T\rangle g_i\subseteq Hg_i\) for each i. Thus, from (2.3) for \(q\in \{q_1,\ldots ,q_h\}\), we obtain a system of linear equations of the shape

$$\begin{aligned} \sum _{i=1}^h \left( \sum _{\ell =1}^{\kappa _{i,j}} e\!\left( n\tfrac{a^{(j)}_{i\ell }}{q_j m}\right) \right) \lambda _i(n)=0 \quad \text {for }j\in \{1,\ldots ,h\}, \end{aligned}$$
(2.4)

with \(\kappa _{i,i}>0\) for every \(i\in \{1,\ldots ,h\}\). By Lemma 4.5,

$$\begin{aligned} \det \left( \left[ \sum _{\ell =1}^{\kappa _{i, j}} e\!\left( n\tfrac{a^{(j)}_{i\ell }}{q_j m}\right) \right] _{1\le i, j\le h}\right) \ne 0, \end{aligned}$$

so (2.4) has only the trivial solution \(\lambda _1(n)=\cdots =\lambda _h(n)= 0\).

Since \(n\in \mathbb {Z}\setminus \{0\}\) was arbitrary, it follows from (2.2) that \(f|[g_i-\xi (g_i)]\) is a constant, say \(C_i\). Since \(g_i^{-1}Hg_i\cap H\) has finite index in \({{\mathrm{SL}}}_2(\mathbb {Z})\), there exists \(\gamma =\left( {\begin{matrix}a&{}b\\ c&{}d\end{matrix}}\right) \in g_i^{-1}Hg_i\cap H\) with \(c\ne 0\). Then \(C_i=C_i|\gamma =(cz+d)^{-k}C_i\). Since \(k>0\), we must have \(C_i=0\), i.e. \(f|g_i=\xi (g_i)f\). This concludes the proof. \(\square \)

Theorem 2.3

Assume the hypotheses of Conjecture 1.1, and suppose that there is a congruence subgroup \(H<\Gamma _0(N)\) such that \(\bigl |(f|\gamma )(z)\bigr |=|f(z)|\) for all \(\gamma \in H\). Then \(f\in M_k(\Gamma _0(N),\xi )\).

Proof

If \(f=0\) then the conclusion is trivially true, so from now on assume \(f\ne 0\). Let M denote the level of H, so that \(H\supseteq \Gamma (M)\). Since \(f|T=f|W=f\) and \(\Gamma _1(N)\) is generated by \(\{T,W\}\cup \Gamma (M)\), we may assume without loss of generality that \(H\supseteq \Gamma _1(N)\). By Theorem 3.2, there exists a prime \(q\equiv 1\pmod {N}\) such that \(\Gamma _1(N)\) is generated by \(\{T,W,\gamma _{q,a}:1\le a<q\}\). By Lemma 4.6, there exists \(m\in \mathbb {{N}}\) such that \(q\mid m\) and \(\{f_m,g_m\}\ne \{0\}\). Since \(\left( {\begin{matrix}&{}-1\\ N&{}\end{matrix}}\right) \) normalizes \(\Gamma _1(N)\), we may swap the roles of f and g if necessary, so as to assume that \(f_m\ne 0\).

For any \(\gamma \in \Gamma _1(N)\), the function \((f|\gamma )(z)/f(z)\) is meromorphic on \(\mathbb {{H}}\) and has modulus 1; by the maximum modulus principle, it must be a constant, say \(\epsilon (\gamma )\). By Lemma 4.1, we have

$$\begin{aligned} 0= \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} f\left| \bigl [\gamma _{q,a}-1\bigr ]\begin{pmatrix}1&{}a/q\\ {} &{}1\end{pmatrix}\right. =\sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \bigl [\epsilon (\gamma _{q,a})-1\bigr ]f\left| \begin{pmatrix}1&{}a/q\\ {} &{}1\end{pmatrix}\right. . \end{aligned}$$

Considering the Fourier expansion, this implies that

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \bigl [\epsilon (\gamma _{q,a})-1\bigr ] f_ne\!\left( \frac{an}{q}\right) =0 \quad \text {for all }n. \end{aligned}$$

In particular, taking \(n=m\), we have

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \bigl [\epsilon (\gamma _{q,a})-1\bigr ]=0, \end{aligned}$$

and since \(|\epsilon (\gamma _{q,a})|=1\) for every a, it follows that \(\epsilon (\gamma _{q,a})=1\). Therefore, \(f|\gamma =f\) for all \(\gamma \in \Gamma _1(N)\). Applying Theorem 2.2 with \(H=\Gamma _1(N)\), we conclude that \(f\in M_k(\Gamma _0(N),\xi )\). \(\square \)
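The step from the vanishing sum to \(\epsilon (\gamma _{q,a})=1\) in the proof above can be spelled out as follows, using only \(|\epsilon (\gamma _{q,a})|=1\). Taking real parts of the sum gives

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \bigl [1-\mathfrak {R}\,\epsilon (\gamma _{q,a})\bigr ]=0, \qquad \text {where}\quad 1-\mathfrak {R}\,\epsilon (\gamma _{q,a})\ge 0 \quad \text {for every }a. \end{aligned}$$

A sum of non-negative terms vanishes only if each term does, so \(\mathfrak {R}\,\epsilon (\gamma _{q,a})=1\), and together with \(|\epsilon (\gamma _{q,a})|=1\) this forces \(\epsilon (\gamma _{q,a})=1\).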

Theorem 2.4

Assume the hypotheses of Conjecture 1.1. Suppose that N is prime and that \(f|\gamma _1\gamma _2=f|\gamma _2\gamma _1\) for every pair \(\gamma _1,\gamma _2\in \Gamma _0(N)\). Then \(f\in M_k(\Gamma _0(N),\xi )\).

Proof

Let H be the smallest subgroup of \(\Gamma _0(N)\) containing T, W and all commutators \(\gamma _1\gamma _2\gamma _1^{-1}\gamma _2^{-1}\) for \(\gamma _1,\gamma _2\in \Gamma _0(N)\). Then H is a normal subgroup with abelian quotient \(H\backslash \Gamma _0(N)\), and \(f|\gamma =f\) for all \(\gamma \in H\). If \(N\in \{2,3\}\) then \(\langle H,-I\rangle =\Gamma _0(N)\) and there is nothing to prove, so we assume henceforth that \(N\ge 5\).

Let \(R=\{r\in \mathbb {Z}:2\le |r|<\frac{1}{2}N\}\), and for each \(r\in R\), fix a matrix \(\gamma _{r,1}\) with top row \(\left( {\begin{matrix}r&-1\end{matrix}}\right) \). Then, by Lemma 4.7, for any prime \(q\not \mid N\) and a coprime to q, we have

$$\begin{aligned} \gamma _{q,a}=\pm \prod _{i=1}^l\tau _i, \end{aligned}$$

where each \(\tau _i\) is an element of \(\{T,T^{-1},W,W^{-1},\gamma _{r,1}^{-1}:r\in R\}\). Since \(H\backslash \Gamma _0(N)\) is abelian, we are free to permute the \(\tau _i\) without changing the coset \(H\prod \tau _i\). Hence, since H contains \(\langle T,W\rangle \), we may write

$$\begin{aligned} H\gamma _{q,a}=H(-I)^\epsilon \prod _{r\in R}\gamma _{r,1}^{-e_r}, \end{aligned}$$

for some \(\epsilon \in \{0,1\}\) and non-negative integers \(e_r\) (depending on q and a), satisfying \(\sum _{r\in R}e_r\le \log _2q\).

Now, fix \(s\in R\), \(n\in \mathbb {Z}\setminus \{0\}\) and \(X\in \mathbb {{N}}\), and let \(Q=Q(s,n,X)\) denote the set of primes q satisfying \(qs\equiv 1\pmod {N}\), \(q\not \mid n\) and \(q\le X\). As in the proof of Theorem 2.2, we consider (2.1) for all primes \(q\in Q\). Let \(g_1,\ldots ,g_h\) be a minimal set of representatives for the cosets \(H\gamma _{q,a}\) of all matrices occurring there. By the above, we may take each \(g_i\) of the form \((-I)^\epsilon \prod _{r\in R}\gamma _{r,1}^{-e_r}\) with \(\epsilon \in \{0,1\}\), \(e_r\ge 0\) and \(\sum _{r\in R}e_r\le \log _2X\). In particular, \(H\gamma _{s,1}^{-1}=H\gamma _{q,-1}\) for every \(q\in Q\), so we may take \(g_1=\gamma _{s,1}^{-1}\). By Dirichlet’s theorem, we have \(\#Q\gg X/\log {X}\), and thus \(h\le 2(1+\log _2X)^{N-3}\le \#Q\) for all sufficiently large X.

For each \(i\in \{1,\ldots ,h\}\), we have \(f|g_iT=f|Tg_i=f|g_i\), so \(f|[g_i-\xi (g_i)]\) has a Fourier expansion as in (2.2), with \(m=1\). In turn, this leads to the system of linear equations (2.4), where we take \(\{q_j\}\) to be any subset of Q of cardinality h. Applying Lemma 4.8, by appropriate permutation of the rows and columns we can select a square subsystem for which the diagonal entries are non-zero. Since the coset \(Hg_1\) occurs in every row, the column \(i=1\) is necessarily one of the variables in the subsystem.

Hence, by Lemma 4.5, we have \(\lambda _1(n)=0\). Since \(n\in \mathbb {Z}\setminus \{0\}\) was arbitrary, we thus have that \(f|[\gamma _{s,1}^{-1}-\xi (s)]\) is a constant, say C. Clearly \(C|\gamma =C\) for all \(\gamma \in \gamma _{s,1}H\gamma _{s,1}^{-1}\cap H=H\). Taking \(\gamma =W\), it follows that \(C=0\), whence \(f|\gamma _{s,1}^{-1}=\xi (s)f\). Finally, Lemma 4.7 implies that \(\Gamma _0(N)\) is generated by \(-I\), T, W and \(\gamma _{s,1}\) for \(s\in R\), so \(f|\gamma =\xi (\gamma )f\) for all \(\gamma \in \Gamma _0(N)\). \(\square \)

Theorem 2.5

Assume the hypotheses of Conjecture 1.1. There is a set Q of prime numbers such that

  (i) Q has density 1 in the set of all primes, and

  (ii) if there exists \(q\in Q\) such that the multiplicative twists \(\Lambda _f(s,\chi )\) and \(\Lambda _g(s,\overline{\chi })\), for all primitive characters \(\chi \pmod {q}\), continue to entire functions of finite order and satisfy the functional equation

    $$\begin{aligned} \Lambda _f(s,\chi )=i^k\xi (q)\chi (N)q^{-1}\tau (\chi )^2(Nq^2)^{\frac{1}{2}-s} \Lambda _g(1-s,\overline{\chi }), \end{aligned}$$
    (2.5)

    then \(f\in M_k(\Gamma _0(N),\xi )\).

In particular, for each N in the following table, the set Q contains every prime \(q\not \mid N\) in the indicated interval.

$$\begin{array}{c|c|c|c} N & q & N & q\\ \hline 10 & (11,10^9) & 18 & (53,10^9)\\ 12 & (23,10^9) & 19 & (37,10^9)\\ 13 & (5,10^9) & 20 & (79,10^9)\\ 14 & (43,10^9) & 21 & (83,10^9)\\ 16 & (47,10^9) & 22 & (43,10^9) \end{array}$$

Proof

Let Q be the set of primes \(q\not \mid N\) such that \(H_q\supseteq \Gamma _1(N)\), in the notation of Sect. 3. By Theorem 3.2, Q has density 1 in the set of all primes, so (i) holds, and the fact that Q contains the numbers indicated in the table is the content of Theorem 3.3.

Let \(q\in Q\). Then by [15, Lemmas 4.3.9, 4.3.13], the assumed analytic properties of \(\Lambda _f(s,\chi )\) and \(\Lambda _g(s,\overline{\chi })\) described in (ii), together with the functional equation (2.5) for all primitive \(\chi \pmod {q}\), imply the equality

$$\begin{aligned} f\bigg |\Bigl [\gamma _{q,a}-\overline{\xi (q)}\Bigr ] \begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}=f\bigg |\Bigl [\gamma _{q,b}-\overline{\xi (q)}\Bigr ] \begin{pmatrix}1 &{} \frac{b}{q} \\ 0 &{} 1\end{pmatrix}\end{aligned}$$

for any integers a, b coprime to q. By Lemma 4.1, it follows that \(f|\gamma _{q,a}=\overline{\xi (q)}f\) for every a coprime to q. By the definition of Q, we thus have \(f|\gamma =\xi (\gamma )f\) for every \(\gamma \in H_q\supseteq \Gamma _1(N)\). Applying Theorem 2.2 with \(H=\Gamma _1(N)\), we conclude that \(f\in M_k(\Gamma _0(N),\xi )\). \(\square \)

3 Generating \(\Gamma _1(N)\)

In this section, we consider the question of when the elements of \(\Gamma _0(N)\) with a fixed upper-left entry generate a subgroup containing \(\Gamma _1(N)\). By the proof of Theorem 2.5, any such upper-left entry gives sufficient conditions to imply modularity using twists of a single modulus.

For any \(q\in \mathbb {{N}}\) coprime to N, let \(H_q\) denote the subgroup of \(\Gamma _0(N)\) generated by the matrices

$$\begin{aligned} \left\{ \begin{pmatrix}A&{}B\\ C&{}D\end{pmatrix}\in \Gamma _0(N):A=q\right\} . \end{aligned}$$

Conjecture 3.1

There exists \(q_0=q_0(N)\) such that \(H_q\supseteq \Gamma _1(N)\) for every \(q\ge q_0\) coprime to N.

Theorem 3.2

\(H_q\supseteq \Gamma _1(N)\) holds for almost all \(q\in \mathbb {{N}}\) coprime to N and for almost all primes \(q\not \mid N\), i.e.

$$\begin{aligned} \#\{q\in \mathbb {{N}}:(q,N)=1,\;H_q\supseteq \Gamma _1(N),\;q\le x\} =\bigl (\tfrac{\varphi (N)}{N}+o(1)\bigr )x \end{aligned}$$
(3.1)

and

$$\begin{aligned} \#\{q\text { prime}:q\not \mid N,\;H_q\supseteq \Gamma _1(N),\;q\le x\} =(1+o(1))\pi (x) \end{aligned}$$
(3.2)

as \(x\rightarrow \infty \).

Proof

For \(q\in \mathbb {{N}}\) coprime to N, set

$$\begin{aligned} \Gamma _q= \left\{ \begin{pmatrix}A&{}B\\ C&{}D\end{pmatrix}\in \Gamma _0(N): A\equiv q^n\!\!\!\!\!\!\pmod {N} \text { for some }n\in \mathbb {{N}}\right\} . \end{aligned}$$

Then \(\Gamma _q\) is a group satisfying \(\Gamma _1(N)\cup H_q\subseteq \Gamma _q\subseteq \Gamma _0(N)\), and we have

$$\begin{aligned} H_q\supseteq \Gamma _1(N)\Longleftrightarrow H_q=\Gamma _q. \end{aligned}$$

Consider a fixed \(q_0\in \mathbb {{N}}\) coprime to N, and let \(\bar{q}_0\) be a multiplicative inverse of \(q_0\pmod {N}\). Then, for any \(q\equiv q_0\pmod {N}\),

$$\begin{aligned} W=\begin{pmatrix}q&{}1\\ q(N+\bar{q}_0)-1&{}\bar{q}_0+N\end{pmatrix} \begin{pmatrix}q&{}1\\ q\bar{q}_0-1&{}\bar{q}_0\end{pmatrix}^{-1} \end{aligned}$$

and

$$\begin{aligned} T=\begin{pmatrix}q&{}1\\ q\bar{q}_0-1&{}\bar{q}_0\end{pmatrix}^{-1} \begin{pmatrix}q&{}q+1\\ q\bar{q}_0-1&{}(q+1)\bar{q}_0-1\end{pmatrix}, \end{aligned}$$

so that \(H_q\) and \(\Gamma _q=\Gamma _{q_0}\) contain \(\langle T,W\rangle \).

Let

$$\begin{aligned} \{T,W\}\cup \left\{ \gamma _i= \begin{pmatrix}A_i&{}B_i\\ NC_i&{}D_i\end{pmatrix}:1\le i\le h \right\} \end{aligned}$$

be a fixed generating set for \(\Gamma _{q_0}\), with \(\gamma _1=\left( {\begin{matrix}q_0&{}1\\ q_0\bar{q}_0-1&{}\bar{q}_0\end{matrix}}\right) \). For \(i\ge 2\), replacing \(\gamma _i\) by \(\gamma _1^{n_i}\gamma _i\) for a suitable \(n_i\), we may assume that \(A_i\equiv q_0\pmod {N}\). Also, we may assume that \(A_i\ne 0\), since otherwise \(N=1\) and \(\gamma _i\) is contained in \(\langle T,W\rangle \).

Next, we modify \(\gamma _1,\ldots ,\gamma _h\) by multiplying by powers of T and W. First, multiplying by \(W^{m_i}\) on the left leaves \(A_i\) unchanged and replaces \(C_i\) by \(C_i+m_iA_i\). Hence, by Dirichlet’s theorem, we may take \(C_1,\ldots ,C_h\) to be distinct primes not dividing N. Second, by the Chinese remainder theorem, we can choose \(q_1\in \mathbb {{N}}\) satisfying \(q_1\equiv q_0\pmod {N}\) and \(q_1\equiv A_i\pmod {C_i}\) for every i. Multiplying on the left by \(T^{(q_1-A_i)/(NC_i)}\) replaces each \(A_i\) by \(q_1\).

Now, let \(q\in \mathbb {{N}}\) with \(q\equiv q_0\pmod {N}\). Suppose that the divisors of \(q-q_1\) represent all invertible residue classes modulo \(Nq_1\), i.e.

$$\begin{aligned} \{d+Nq_1\mathbb {Z}:d\in \mathbb {{N}},\;d\mid (q-q_1)\}\supseteq (\mathbb {Z}/Nq_1\mathbb {Z})^\times . \end{aligned}$$
(3.3)

For \(i=1,\ldots ,h\), let \(d_i\) be a divisor of \(q-q_1\) satisfying \(d_i\equiv C_i\pmod {Nq_1}\). Then \((d_i,N)=1\), so \(Nd_i\mid (q-q_1)\). Hence,

$$\begin{aligned} T^{\frac{q-q_1}{Nd_i}}W^{\frac{d_i-C_i}{q_1}} \begin{pmatrix}q_1\\ NC_i\end{pmatrix} =\begin{pmatrix}q\\ Nd_i\end{pmatrix}, \end{aligned}$$

so that \(\gamma _i\) is contained in \(H_q\). Therefore \(H_q=\Gamma _{q_0}=\Gamma _q\).
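The column identity above can be traced in a small numerical case. The Python sketch below applies the stated powers of W and then T to the first column \((q_1,NC_i)\); the function names and the sample values \(N=5\), \(q_1=2\), \(C_i=3\), \(d_i=13\), \(q=67\) are hypothetical choices satisfying the divisibility conditions, not values from the paper.

```python
def apply_T_power(j, column):
    """Left-multiply a column (x, y) by T^j, where T = [[1, 1], [0, 1]]."""
    x, y = column
    return (x + j * y, y)

def apply_W_power(j, column, N):
    """Left-multiply a column (x, y) by W^j, where W = [[1, 0], [N, 1]]."""
    x, y = column
    return (x, y + j * N * x)

def move_first_column(q1, C, q, d, N):
    """Send (q1, N*C) to (q, N*d) using the exponents from the displayed identity."""
    if (q - q1) % (N * d) != 0 or (d - C) % q1 != 0:
        raise ValueError("need N*d | (q - q1) and q1 | (d - C)")
    column = apply_W_power((d - C) // q1, (q1, N * C), N)
    return apply_T_power((q - q1) // (N * d), column)

# Here d = 13 divides q - q1 = 65 and d = C (mod N*q1), since 13 = 3 (mod 10).
result = move_first_column(q1=2, C=3, q=67, d=13, N=5)   # (67, 65) = (q, N*d)
```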

Erdős [7] showed that almost all \(q\in \mathbb {{N}}\) satisfy (3.3). Therefore, the set of \(q\in \mathbb {{N}}\) such that \(q\equiv q_0\pmod {N}\) and \(H_q=\Gamma _q\) has density 1/N. Letting \(q_0\) run through a set of representatives for the invertible residue classes mod N yields (3.1). For the prime case, we similarly apply Lemma 4.9 with \((p_0,q)=(q_1,Nq_1)\) to see that almost all primes \(q\not \mid N\) satisfy (3.3), and this leads to (3.2). \(\square \)
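Condition (3.3) is straightforward to test for individual values. The following Python sketch checks whether the divisors of \(q-q_1\) cover every invertible residue class modulo \(Nq_1\); the function names are illustrative, and the sample inputs are arbitrary rather than the specific \(q_1\) constructed in the proof.

```python
from math import gcd

def divisors(n):
    """Set of positive divisors of a positive integer n (trial division)."""
    found = set()
    d = 1
    while d * d <= n:
        if n % d == 0:
            found.update((d, n // d))
        d += 1
    return found

def satisfies_3_3(q, q1, N):
    """True if the divisors of q - q1 hit every unit class mod N*q1."""
    modulus = N * q1
    difference = abs(q - q1)
    if difference == 0:
        return True  # every positive integer divides 0
    units = {r for r in range(modulus) if gcd(r, modulus) == 1}
    classes = {d % modulus for d in divisors(difference)}
    return units <= classes

example_true = satisfies_3_3(7, 1, 3)     # divisors of 6 give classes 1 and 2 mod 3
example_false = satisfies_3_3(4, 1, 3)    # divisors of 3 miss the class 2 mod 3
example_larger = satisfies_3_3(947, 2, 5) # 945 has the divisors 1, 3, 7, 9
```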

Theorem 3.3

For each N in the following table, \(H_q\supseteq \Gamma _1(N)\) holds for \(q\in \mathbb {{N}}\) with \((q,N)=1\) and for primes \(q\not \mid N\) in the indicated intervals.

$$\begin{aligned} \begin{array}{c|c|c||c|c|c}
N & (q,N)=1 & \text {Prime } q\not \mid N & N & (q,N)=1 & \text {Prime } q\not \mid N\\ \hline
5 & (44,10^9) & (0,10^9) & 14 & (55,10^9) & (43,10^9)\\
6 & (1,10^9) & (0,10^9) & 15 & (91,10^9) & (31,10^9)\\
7 & (20,10^9) & (0,10^9) & 16 & (63,10^9) & (47,10^9)\\
8 & (15,10^9) & (7,10^9) & 17 & (390,10^5) & (101,10^9)\\
9 & (136,10^9) & (2,10^9) & 18 & (55,10^9) & (53,10^9)\\
10 & (39,10^9) & (11,10^9) & 19 & (360,10^5) & (37,10^9)\\
11 & (84,10^9) & (2,10^9) & 20 & (119,10^5) & (79,10^9)\\
12 & (35,10^9) & (23,10^9) & 21 & (230,10^5) & (83,10^9)\\
13 & (168,10^9) & (5,10^9) & 22 & (175,10^5) & (43,10^9)
\end{array} \end{aligned}$$

Proof

We applied two strategies to verify the statement computationally. First, we used Lemma 4.14 and Corollary 4.15 to compute a list L of all elements of \(\langle T,W\rangle \) of height up to some bound chosen by trial and error (e.g. for \(N=13\) we chose the bound 5500, which yielded 290841 words in \(T\) and \(W\)). We then used Sage [20] to compute a generating set \(\{g_1,\ldots ,g_h\}\) for \(\Gamma _1(N)\), and for each generator we computed every word of the form \(w_1g_i^{\pm 1}w_2\), for \(w_1,w_2\in L\). Combining this with Lemma 4.13 and a simple sieve, we obtained sufficient conditions to establish the claim for the vast majority of q.

For the relatively small number of values of q remaining, we computed the expansions of every element \(\gamma _{q,a}\) for \(1\le a\le q\) in terms of the generators \(S=\left( {\begin{matrix}&{}-1\\ 1&{}\end{matrix}}\right) \) and \(T=\left( {\begin{matrix}1&{}1\\ {} &{}1\end{matrix}}\right) \) of \({{\mathrm{SL}}}_2(\mathbb {Z})\), and presented \({{\mathrm{SL}}}_2(\mathbb {Z})\cong \langle S,T:S^4=S^2(ST)^3=1\rangle \) as an abstract group to GAP [19]. We then used GAP’s implementation of the Todd–Coxeter algorithm [21] to attempt to compute the index \([{{\mathrm{SL}}}_2(\mathbb {Z}):H_q]\). When this terminated with a number equal to the expected index \([{{\mathrm{SL}}}_2(\mathbb {Z}):\Gamma _q]\), we obtained the claim for q.

The first strategy tends to work better at finding prime values of q, which explains the discrepancy in the sizes of the intervals for larger values of N, where there are eventually too many exceptions to test by the second method in a reasonable amount of time.

For some q (those for which the Todd–Coxeter algorithm appeared not to terminate), our results were inconclusive, though we expect that \(H_q\not \supseteq \Gamma _1(N)\) in those cases. In a very small number of cases, \(H_q\) has finite index in \({{\mathrm{SL}}}_2(\mathbb {Z})\) but is not the full group \(\Gamma _q\). \(\square \)
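The second strategy in the proof above requires each \(\gamma _{q,a}\) to be written as a word in S and T before it is handed to a coset enumerator. One standard way to obtain such a word is a Euclidean-style reduction on the left column; the sketch below illustrates that idea and is not claimed to be the procedure the authors used. The helper names (`word_in_S_T`, `evaluate`) are hypothetical.

```python
def mat_mul(X, Y):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def mat_pow(X, n):
    """Integer power of a 2x2 integer matrix of determinant 1."""
    if n < 0:
        (a, b), (c, d) = X
        X, n = ((d, -b), (-c, a)), -n
    R = ((1, 0), (0, 1))
    for _ in range(n):
        R = mat_mul(R, X)
    return R

S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))

def word_in_S_T(M):
    """Return a list [(letter, exponent), ...] whose product equals M."""
    (a, b), (c, d) = M
    assert a * d - b * c == 1
    word = []
    while c != 0:
        k = a // c
        a, b = a - k * c, b - k * d        # M <- T^{-k} M
        if k:
            word.append(("T", k))
        a, b, c, d = c, d, -a, -b          # M <- S^{-1} M
        word.append(("S", 1))
    if a == -1:                            # M = -T^{-b} = S^2 T^{-b}
        word.append(("S", 2))
        b = -b
    if b:
        word.append(("T", b))
    return word

def evaluate(word):
    """Multiply out a word to recover the matrix it represents."""
    gens = {"S": S, "T": T}
    R = ((1, 0), (0, 1))
    for letter, exponent in word:
        R = mat_mul(R, mat_pow(gens[letter], exponent))
    return R
```

The loop maintains the invariant that the original matrix equals the product of the recorded word with the current matrix, and the lower-left entry strictly decreases in absolute value, so the procedure terminates.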

4 Lemmas

Lemma 4.1

Let \(q\in \mathbb {{N}}\) with \((q,N)=1\). The assumptions of Conjecture 1.1 imply the relation

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} f\bigg | \Bigl [\gamma _{q,a} - \overline{\xi (q)}\Bigr ] \begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}=0, \end{aligned}$$
(4.1)

where \(\gamma _{q,a}\) is any element of \(\Gamma _0(N)\) with top row \(\left( {\begin{matrix}q&-a\end{matrix}}\right) \).

Proof

By Hecke's argument [15, Theorem 4.3.5], the functional equation in Conjecture 1.1 is equivalent to the equation

$$\begin{aligned} \sum _{n=1}^{\infty }f_n c_q(n) e^{2\pi i nz} = (-1)^k \xi (q) (Nq^2)^{-\frac{k}{2}} z^{-k} \sum _{n=1}^{\infty } g_n c_q(n) e^{2 \pi i \frac{-n}{Nq^2z}}. \end{aligned}$$
(4.2)

In particular, for \(q=1\) we find that \(f|\left( {\begin{matrix}0 &{} -N^{-\frac{1}{2}} \\ N^{\frac{1}{2}} &{} 0 \end{matrix}}\right) =g\), where \(g(z)=\sum _{n=1}^{\infty } g_n e^{2 \pi i n z}\). Note that (4.2) may be rewritten as

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} f\bigg | \begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}=\xi (q)\sum _{\begin{array}{c} c\pmod {q}\\ (c,q)=1 \end{array}} g\bigg | \begin{pmatrix}-N^{\frac{1}{2}}c &{} N^{-\frac{1}{2}}q^{-1} \\ -N^{\frac{1}{2}}q &{} 0\end{pmatrix}. \end{aligned}$$

Combining this with the matrix identity

$$\begin{aligned} \begin{pmatrix}0 &{} -N^{-\frac{1}{2}} \\ N^{\frac{1}{2}} &{} 0 \end{pmatrix}\begin{pmatrix}-N^{\frac{1}{2}}c &{} N^{-\frac{1}{2}}q^{-1} \\ -N^{\frac{1}{2}}q &{} 0\end{pmatrix}= \begin{pmatrix}q &{} 0 \\ -Nc &{} q^{-1}\end{pmatrix}= \begin{pmatrix}q &{} -a \\ -Nc &{} s\end{pmatrix}\begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}, \end{aligned}$$

where \(a=a(c)\) is chosen so that \(Nca\equiv -1\pmod {q}\) and \(s=(Nac+1)/q\), we derive

$$\begin{aligned} \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} f\bigg | \begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}=\xi (q) \sum _{\begin{array}{c} c\pmod {q}\\ (c,q)=1 \end{array}} f\bigg | \begin{pmatrix}q &{} -a \\ -Nc &{} s\end{pmatrix}\begin{pmatrix}1 &{} \frac{a}{q} \\ 0 &{} 1\end{pmatrix}. \end{aligned}$$

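Since \((N,q)=1\), the map \(c\mapsto a(c)\) is a bijection on the invertible residue classes modulo q. Hence the summation over c may be replaced by the summation over \(a\pmod {q}\), \((a,q)=1\), by choosing appropriate representatives, thereby proving the lemma. \(\square \)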
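The second equality in the matrix identity used above involves only rational entries, so it can be checked exactly. The sketch below does this with Python's `fractions` module for given N, q and c; the function name `check_identity` is introduced here for illustration, and the first equality (which involves \(N^{\pm 1/2}\)) is not tested.

```python
from fractions import Fraction
from math import gcd

def mat_mul(X, Y):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def check_identity(N, q, c):
    """Verify [[q,-a],[-Nc,s]] * [[1,a/q],[0,1]] == [[q,0],[-Nc,1/q]]
    and that the left factor has determinant 1 (requires q >= 2)."""
    assert gcd(q, N) == 1 and gcd(c, q) == 1
    a = (-pow(N * c, -1, q)) % q           # a = a(c) solves N*c*a = -1 (mod q)
    s, rem = divmod(N * a * c + 1, q)
    assert rem == 0                        # s = (Nac + 1)/q is an integer
    gamma = ((q, -a), (-N * c, s))
    shift = ((1, Fraction(a, q)), (0, 1))
    det_is_one = q * s - a * N * c == 1
    target = ((q, 0), (-N * c, Fraction(1, q)))
    return det_is_one and mat_mul(gamma, shift) == target
```

The three-argument `pow` with exponent \(-1\) computes a modular inverse and requires Python 3.8 or later.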

Lemma 4.2

Suppose that \(h:\mathbb {{H}}\rightarrow \mathbb {C}\) is a holomorphic function, \(M\in {{\mathrm{SL}}}_2(\mathbb {R})\) is elliptic of infinite order, and \(\zeta \in \mathbb {C}^\times \) is a root of unity such that \(h|M=\zeta h\). Then \(h=0\).

Proof

This is an extension of Weil’s Lemma [2, Lemma 1.5.1], which is the special case \(\zeta =1\). It can be proven by the same method or, alternatively, derived as a consequence, as follows. Suppose that \(\zeta \) has order n, and let \(M=\left( {\begin{matrix}a&{}b\\ c&{}d\end{matrix}}\right) \). Then we have

$$\begin{aligned} (cz+d)^{-kn}h\!\left( \frac{az+b}{cz+d}\right) ^n =\bigl ((h|M)(z)\bigr )^n=h(z)^n. \end{aligned}$$

Applying Weil’s Lemma to \(h^n\) (and the weight-kn slash operator), we conclude that \(h^n=0\), whence \(h=0\). \(\square \)

Lemma 4.3

Assume the hypotheses of Conjecture 1.1, and suppose that \(N=qs-1\), where \(q,s\in \{3,4,6\}\). Then \(f|\gamma _{q,1}=\overline{\xi (q)}f\).

Proof

Note that \(\varphi (q)=\varphi (s)=2\), and we have \(\gamma _{q,\pm 1}=\gamma _{s,\mp 1}^{-1}=\left( {\begin{matrix}q&{}\mp 1\\ \mp N&{}s\end{matrix}}\right) \). Hence, applying Lemma 4.1 to both q and s, we obtain

$$\begin{aligned} f\bigl |\bigl [\gamma _{q,1}-\overline{\xi (q)}\bigr ] =&-f\biggl |\bigl [\gamma _{q,-1}-\overline{\xi (q)}\bigr ]\begin{pmatrix}1&{}-2/q\\ {} &{}1\end{pmatrix}\\ =&\quad \xi (s)f\biggl |\bigl [\gamma _{s,1}-\overline{\xi (s)}\bigr ] \gamma _{s,1}^{-1}\begin{pmatrix}1&{}-2/q\\ {} &{}1\end{pmatrix}\\ =&-\xi (s)f\biggl |\bigl [\gamma _{s,-1}-\overline{\xi (s)}\bigr ] \begin{pmatrix}1&{}-2/s\\ {} &{}1\end{pmatrix}\gamma _{s,1}^{-1}\begin{pmatrix}1&{}-2/q\\ {} &{}1\end{pmatrix}\\ =&f\biggl |\bigl [\gamma _{q,1}-\overline{\xi (q)}\bigr ] \gamma _{q,1}^{-1}\begin{pmatrix}1&{}-2/s\\ {} &{}1\end{pmatrix}\gamma _{s,1}^{-1}\begin{pmatrix}1&{}-2/q\\ {} &{}1\end{pmatrix}. \end{aligned}$$

Writing \(M=\gamma _{q,1}^{-1}\left( {\begin{matrix}1&{}-2/s\\ {} &{}1\end{matrix}}\right) \gamma _{s,1}^{-1}\left( {\begin{matrix}1&{}-2/q\\ {} &{}1\end{matrix}}\right) =\left( {\begin{matrix}1&{}-2/q\\ 2q-2/s&{}-3+4/(qs)\end{matrix}}\right) \), we thus have

$$\begin{aligned} f\bigl |\bigl [\gamma _{q,1}-\overline{\xi (q)}\bigr ][I-M]=0. \end{aligned}$$

Note that \(|{{\mathrm{tr}}}{M}|<2\) and \({{\mathrm{tr}}}{M}\notin \mathbb {Z}\), so M is elliptic of infinite order. Applying Lemma 4.2 to \(h=f|[\gamma _{q,1}-\overline{\xi (q)}]\), we obtain \(f|\gamma _{q,1}=\overline{\xi (q)}f\). \(\square \)
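The closed form for M and the two trace conditions can be confirmed by exact arithmetic for all nine pairs \((q,s)\). The following sketch uses the explicit inverses \(\gamma _{q,1}^{-1}=\left( {\begin{matrix}s&{}1\\ N&{}q\end{matrix}}\right) \) and \(\gamma _{s,1}^{-1}=\left( {\begin{matrix}q&{}1\\ N&{}s\end{matrix}}\right) \); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mat_mul(X, Y):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def lemma_4_3_matrix(q, s):
    """Compute M = gamma_{q,1}^{-1} U_s gamma_{s,1}^{-1} U_q for N = q*s - 1,
    where U_t denotes the translation [[1, -2/t], [0, 1]]."""
    N = q * s - 1
    gq_inv = ((s, 1), (N, q))   # inverse of gamma_{q,1} = [[q, -1], [-N, s]]
    gs_inv = ((q, 1), (N, s))   # inverse of gamma_{s,1} = [[s, -1], [-N, q]]
    shift_s = ((1, Fraction(-2, s)), (0, 1))
    shift_q = ((1, Fraction(-2, q)), (0, 1))
    M = gq_inv
    for X in (shift_s, gs_inv, shift_q):
        M = mat_mul(M, X)
    return M

def verify_lemma_4_3():
    """Check the closed form of M and that its trace is non-integral with |tr| < 2."""
    for q, s in product((3, 4, 6), repeat=2):
        M = lemma_4_3_matrix(q, s)
        expected = ((1, Fraction(-2, q)),
                    (2 * q - Fraction(2, s), -3 + Fraction(4, q * s)))
        trace = M[0][0] + M[1][1]
        if M != expected or abs(trace) >= 2 or trace.denominator == 1:
            return False
    return True
```

In each case the trace equals \(-2+4/(qs)\), which lies strictly between \(-2\) and \(-1\).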

Lemma 4.4

Assume the hypotheses of Conjecture 1.1, and suppose there exist \(\gamma =\left( {\begin{matrix}A&{}B\\ C&{}D\end{matrix}}\right) \in \Gamma _0(N)\), \(\alpha \in \mathbb {Q}\) and a root of unity \(\zeta \in \mathbb {C}^\times \) such that \(C\alpha \notin \mathbb {Z}\), \(|A+D+C\alpha |<2\), and

$$\begin{aligned} f\bigl |\bigl [\gamma ^{-1}-\xi (A)\bigr ] =\zeta f\biggl |[\gamma -\xi (D)]\begin{pmatrix}1&{}\alpha \\ {} &{}1\end{pmatrix}. \end{aligned}$$

Then \(f|\gamma =\xi (D)f\).

Proof

We have

$$\begin{aligned} -\xi (D)\zeta f|[\gamma -\xi (D)]= & {} -\xi (D)f\biggl |\bigl [\gamma ^{-1}-\xi (A)\bigr ] \begin{pmatrix}1&{}-\alpha \\ {} &{}1\end{pmatrix}\\= & {} f\biggl |[\gamma -\xi (D)] \gamma ^{-1}\begin{pmatrix}1&{}-\alpha \\ {} &{}1\end{pmatrix}. \end{aligned}$$

Note that \({{\mathrm{tr}}}\bigl (\gamma ^{-1}\left( {\begin{matrix}1&{}-\alpha \\ {} &{}1\end{matrix}}\right) \bigr )=A+D+C\alpha \). By hypothesis this is non-integral and has modulus less than 2, so \(\gamma ^{-1}\left( {\begin{matrix}1&{}-\alpha \\ {} &{}1\end{matrix}}\right) \) is elliptic of infinite order. Applying Lemma 4.2, we obtain \(f|\gamma =\xi (D)f\). \(\square \)

Lemma 4.5

Let \(h,n,m\in \mathbb {{N}}\), and let \(q_1,\ldots ,q_h\) be distinct primes with \(q_j\not \mid mn\) for all j. For all \(i,j\in \{1,\ldots ,h\}\), let \(s_{i,j} \subseteq \{1,\ldots ,q_j-1\}\), with \(s_{i_1,j}\cap s_{i_2,j}=\emptyset \) for all \(i_1\ne i_2\) (we do not assume that \(s_{i,j}\ne \emptyset \)). Let \(S_{i,j} = \sum _{a\in s_{i,j}} e\big (\frac{na}{mq_j}\big )\). Suppose that \(s_{i,i}\ne \emptyset \) for every i. Then \(\det \bigl ([S_{i,j}]_{1\le i,j\le h}\bigr )\ne 0\).

Proof

Replacing \((m,n)\) by \((m/\gcd (m,n),n/\gcd (m,n))\) if necessary, we may assume without loss of generality that \((m,n)=1\). We prove the claim by induction on h.

Suppose first that \(h=1\). Each \(e\big (\frac{na}{mq_1}\big )\) is the ath power of \(e\big (\frac{n}{mq_1}\big ) =: \zeta _{mq_1}\), which is a primitive \(mq_1\)th root of unity. By hypothesis \(s_{1,1}\) is not empty, so \(S_{1,1}\) is the value at \(\zeta _{mq_1}\) of a non-constant polynomial \(P\in \mathbb {Q}[x]\). Note that \(P(x)=xQ(x)\) for some non-zero \(Q\in \mathbb {Q}[x]\) (since \(s_{1,1}\subseteq \{1,\ldots ,q_1-1\}\)), and that the degree of Q is at most \(q_1-2\). The degree of the extension \(\mathbb {Q}(\zeta _{mq_1})/\mathbb {Q}\) is \(\varphi (mq_1) = \varphi (m)\varphi (q_1) \ge \varphi (q_1) = q_1-1\). Hence \(S_{1,1}=P(\zeta _{mq_1})=\zeta _{mq_1}Q(\zeta _{mq_1})\ne 0\). This concludes the proof for \(h=1\).

Suppose \(h\ge 2\) and expand \(\det [S_{i,j}]\) along the first column, whose entries \(S_{i,1}\) are the only ones involving \(q_1\). We get an expression of the form \(P(\zeta _{mq_1})\) for some polynomial \(P\in \mathbb {Q}(\zeta _{mq_2},\ldots ,\zeta _{mq_h})[x]\). We claim that P is not constant. To see this, let \(a\in s_{1,1}\) (such an \(a\) exists because \(s_{1,1}\ne \emptyset \)). Then \(a\notin s_{i,1}\) for any \(i\ne 1\), since \(s_{i_1,1}\cap s_{i_2,1}=\emptyset \) for \(i_1\ne i_2\). Thus, the coefficient of \(x^a\) in P(x) is the determinant of the cofactor matrix for \(S_{1,1}\). This determinant satisfies all hypotheses of the lemma for \(h-1\) and primes \(q_2,\ldots ,q_h\); hence it is non-zero by the inductive hypothesis.

Note that \(P(x)=xQ(x)\) for some non-zero \(Q\in \mathbb {Q}(\zeta _{mq_2},\ldots ,\zeta _{mq_h})[x]\) (since each \(s_{i,1}\subseteq \{1,\ldots ,q_1-1\}\)), and that the degree of Q is \(\le q_1-2\). By coprimality assumptions, the degree of the extension \(\mathbb {Q}(\zeta _{mq_1},\ldots ,\zeta _{mq_h})/\mathbb {Q}(\zeta _{mq_2},\ldots ,\zeta _{mq_h})\) is \(\varphi (mq_1q_2\cdots q_h)/ \varphi (mq_2\cdots q_h)= \varphi (q_1) = q_1-1\). Hence \(Q(\zeta _{mq_1})\ne 0\). Thus \(\det [S_{i,j}] = P(\zeta _{mq_1})= \zeta _{mq_1}Q(\zeta _{mq_1}) \ne 0\). \(\square \)
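A small numerical experiment can illustrate Lemma 4.5, although floating-point arithmetic of course proves nothing. The sketch below builds the matrix \([S_{i,j}]\) for chosen sets \(s_{i,j}\) and evaluates its determinant by the Leibniz formula; the function names and the sample sets are hypothetical choices made for illustration.

```python
import cmath
from itertools import permutations

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def det(matrix):
    """Determinant via the Leibniz formula (adequate for small h)."""
    h = len(matrix)
    total = 0
    for perm in permutations(range(h)):
        sign = 1
        for i in range(h):
            for j in range(i + 1, h):
                if perm[i] > perm[j]:
                    sign = -sign
        term = sign
        for i in range(h):
            term *= matrix[i][perm[i]]
        total += term
    return total

def S_matrix(sets, primes, n=1, m=1):
    """sets[i][j] plays the role of s_{i,j} (0-based indices),
    a subset of {1, ..., primes[j] - 1}."""
    h = len(primes)
    return [[sum(e(n * a / (m * primes[j])) for a in sets[i][j])
             for j in range(h)] for i in range(h)]

# Sample data: q_1 = 3, q_2 = 5; the sets in each column are disjoint
# and the diagonal sets are non-empty, as the lemma requires.
example_sets = [[{1}, {1, 2}],
                [{2}, {3}]]
example_det = det(S_matrix(example_sets, (3, 5)))
```

For these sample sets the computed determinant is visibly bounded away from zero, consistent with the lemma.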

Lemma 4.6

Assume the hypotheses of Conjecture 1.1, and suppose that f is not identically 0. Then for any prime \(q\not \mid N\), there exists \(n\in \mathbb {{N}}\) such that \(q\mid n\) and \(\{f_n,g_n\}\ne \{0\}\).

Proof

Suppose that the conclusion is false for some prime \(q\not \mid N\), so that \(f_n=g_n=0\) for every n divisible by q. Then we have \(f_nc_q(n)=-f_n\) and \(g_nc_q(n)=-g_n\) for every n, so that

$$\begin{aligned} -1=\frac{\Lambda _f(s,c_q)}{\Lambda _f(s,c_1)} =\frac{\Lambda _g(1-s,c_q)}{\Lambda _g(1-s,c_1)}. \end{aligned}$$

On the other hand, (1.5) applied to \(c_1\) and \(c_q\) shows that

$$\begin{aligned} \frac{\Lambda _f(s,c_q)}{\Lambda _f(s,c_1)} =\xi (q)q^{1-2s}\frac{\Lambda _g(1-s,c_q)}{\Lambda _g(1-s,c_1)}, \end{aligned}$$

so \(\xi (q)q^{1-2s}=1\). Since \(q>1\), this is a contradiction. \(\square \)
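The proof uses the values of \(c_q(n)\) for prime q. Assuming that \(c_q\) denotes the Ramanujan sum \(c_q(n)=\sum _{(a,q)=1}e(an/q)\) (the case of the trivial character in (4.5)), one has \(c_q(n)=q-1\) when \(q\mid n\) and \(c_q(n)=-1\) otherwise, which gives \(f_nc_q(n)=-f_n\) whenever \(f_n\) vanishes on multiples of q. A minimal numerical sketch of this fact, with a hypothetical function name:

```python
import cmath
from math import gcd

def ramanujan_sum(q, n):
    """Evaluate c_q(n) = sum of e(a*n/q) over a mod q with gcd(a, q) = 1.
    The sum is an integer, so the real part is rounded."""
    total = sum(cmath.exp(2j * cmath.pi * a * n / q)
                for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(total.real)
```

For \(q=1\) the sum reduces to the single term 1, matching the role of \(c_1\) in the ratios above.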

Lemma 4.7

Let N be a prime, and for each \(r\in \mathbb {Z}\) with \(2\le |r|<\frac{1}{2}N\), let \(\gamma _{r,1}\in \Gamma _0(N)\) be a matrix with top row \(\left( {\begin{matrix}r&-1\end{matrix}}\right) \). Then any matrix \(\left( {\begin{matrix}A &{} B \\ CN &{} D \end{matrix}}\right) \in \Gamma _0(N)\) may be written in the form \(\pm \tau _1\tau _2\cdots \tau _l\) with \(\tau _i\in \{T,T^{-1},W,W^{-1},\gamma _{r,1}^{-1}:2\le |r|<\frac{1}{2}N\}\) for each \(i=1,\ldots ,l\), in such a way that

$$\begin{aligned} \#\{i:\tau _i\in \{\gamma _{r,1}^{-1}\}\}\le \log _2(|A|). \end{aligned}$$

Proof

If \(C=0\) then \(\left( {\begin{matrix}A &{} B \\ CN &{} D \end{matrix}}\right) = \pm T^{\alpha }\) for some choice of sign and \(\alpha \in \mathbb {Z}\). In the general case we may multiply on the left by a power of T to replace A by any integer \(A'\) such that \(A'\equiv A\pmod {CN}\). Choosing \(A'\) such that \(|A'|\le \frac{1}{2}|CN|\), we also have \(|A'|\le |A|\). Similarly we may multiply on the left by a power of W and replace C by any integer \(C'\equiv C\pmod {A'}\) with \(|C'|\le \frac{1}{2}|A'|\).

Repeating this process will either lead to \(C=0\) or will eventually stagnate. Thus we may assume now that \(|A|\le \frac{1}{2}|CN|\) and \(0<|C|\le \frac{1}{2}|A|\). In particular, this implies that \(N\ge 4\), so N is an odd prime. Let r be the nearest integer to the fraction \(CN/A\) (note that \(A\ne 0\) since \((A,N)=1\)), rounded toward 0 in the case of a tie. We have \(2\le |CN/A|\le \frac{1}{2}N\), and thus \(2\le |r|<\frac{1}{2}N\). Multiplying on the left by \(\gamma _{r,1}\), the new top-left corner is \(rA-CN=A(r-\frac{CN}{A})\), which does not exceed \(\frac{1}{2}|A|\) in absolute value. Thus, by repeating this process we eventually end up in the case \(C=0\), having used at most \(\log _2(|A|)\) matrices \(\gamma _{r,1}\). \(\square \)
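The reduction in this proof is effectively an algorithm, and it can be sketched as follows. This is an illustrative implementation under stated assumptions: N is prime, \(W=\left( {\begin{matrix}1&{}0\\ N&{}1\end{matrix}}\right) \), and `gamma_r1` makes one particular choice of \(\gamma _{r,1}\) (the lemma allows any). All function names are hypothetical, and the modular inverse via `pow` needs Python 3.8 or later.

```python
from fractions import Fraction
from math import floor, log2

def mat_mul(X, Y):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

def centered(a, m):
    """Representative of a modulo |m| lying in (-|m|/2, |m|/2]."""
    m = abs(m)
    r = a % m
    return r - m if 2 * r > m else r

def nearest_toward_zero(x):
    """Nearest integer to the Fraction x, with ties rounded toward 0."""
    lo = floor(x)
    twice_frac = 2 * (x - lo)
    if twice_frac != 1:
        return lo + 1 if twice_frac > 1 else lo
    return lo if x > 0 else lo + 1

def gamma_r1(r, N):
    """One matrix in Gamma_0(N) with top row (r, -1); needs gcd(r, N) = 1."""
    s = pow(r % N, -1, N)               # r*s = 1 (mod N)
    return ((r, -1), (1 - r * s, s))    # determinant r*s + (1 - r*s) = 1

def reduce_matrix(M, N):
    """Left-multiply M in Gamma_0(N) by powers of T, W and matrices gamma_{r,1}
    until the lower-left entry is 0. Returns (final matrix, number of gammas)."""
    (A, B), (C, D) = M                  # C is the full lower-left entry
    assert A * D - B * C == 1 and C % N == 0
    used = 0
    while C != 0:
        if 2 * abs(A) > abs(C):         # power of T: shrink A modulo C
            t = (centered(A, C) - A) // C
            A, B = A + t * C, B + t * D
        elif 2 * abs(C) > N * abs(A):   # power of W: shrink C/N modulo A
            w = (centered(C // N, A) - C // N) // A
            C, D = C + w * N * A, D + w * N * B
        else:                           # stagnation: apply gamma_{r,1}
            r = nearest_toward_zero(Fraction(C, A))
            assert 2 <= abs(r) and 2 * abs(r) < N
            (A, B), (C, D) = mat_mul(gamma_r1(r, N), ((A, B), (C, D)))
            used += 1
    return ((A, B), (C, D)), used
```

For example, with \(N=11\) the matrix \(\left( {\begin{matrix}37&{}5\\ 22&{}3\end{matrix}}\right) \) is first reduced by \(T^{-2}\) to top-left entry \(-7\), after which a single multiplication by \(\gamma _{-3,1}\) produces \(-I\).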

Lemma 4.8

Let A be an \(n\times n\) matrix over a ring, with non-zero rows. Then there exist \(m\in \{1,\ldots ,n\}\) and \(n\times n\) permutation matrices P and Q such that PAQ takes the block form \(\left( \begin{array}{c|c} \hat{A}&{}0\\ \hline C&{}D\end{array}\right) \), where \(\hat{A}\) is of size \(m\times m\) and has non-zero diagonal entries.

Proof

Denote the entries of A by \(a_{ij}\). For any \(S\subseteq \{1,\ldots ,n\}\), define

$$\begin{aligned} m_S=\#\{j:a_{ij}\ne 0\text { for some }i\in S\}. \end{aligned}$$

Note that for \(S=\{1,\ldots ,n\}\) we have \(m_S\le \#S\). Hence, there is a minimal non-empty set \(R\subseteq \{1,\ldots ,n\}\) satisfying \(m_R\le \#R\). Since A has non-zero rows, we have \(m_S>0\) whenever \(S\ne \emptyset \). From this and the minimality of R it follows that \(m_R=\#R\). Moreover, for any \(S\subseteq R\) we have \(m_S\ge \#S\).

By Hall’s marriage theorem [9], it follows that there is a subset \(C\subseteq \{1,\ldots ,n\}\) and a bijection \(i:C\rightarrow R\) such that \(a_{i(j)j}\ne 0\) for every \(j\in C\). Writing \(m=\#C=\#R\) and replacing A by PAQ for appropriate permutation matrices P and Q, we may assume that \(C=R=\{1,\ldots ,m\}\) and \(i(j)=j\). The block form of A then follows from the definition of \(m_S\). \(\square \)

Lemma 4.9

Given \(p_0,a,q\in \mathbb {Z}\) with \(p_0\ne 0\) and \((a,q)=1\), define

$$\begin{aligned} P(p_0;a,q)=\{p\text { prime}: \exists d\in \mathbb {{N}}\text { such that }d \equiv a\pmod {q} \text { and }p\equiv p_0\!\!\!\pmod {d}\} \end{aligned}$$

and

$$\begin{aligned} P(p_0;q)=\bigcap _{\begin{array}{c} 1\le a\le q\\ (a,q)=1 \end{array}}P(p_0;a,q). \end{aligned}$$

Then

$$\begin{aligned} \#\{p\in P(p_0;q):p\le x\}=(1+o(1))\pi (x) \quad \text {as }x\rightarrow \infty . \end{aligned}$$

Proof

This is proven for \(p_0=1\) in [10], uniformly for \(q\le 2^{(1-\varepsilon )\log \log {x}}\). One can generalize the proof to all \(p_0\ne 0\), and if one is not concerned with the uniformity in q a simpler proof suffices. For completeness we give the argument here.

For a character \(\chi \) modulo q and \(a\in \mathbb {Z}\) with \((a,q)=1\) let

$$\begin{aligned} d_\chi (n):=\sum _{d\mid n}\chi (d),\qquad d(n;a):=\sum _{\begin{array}{c} d\mid n\\ d\equiv a\pmod {q} \end{array}}1, \end{aligned}$$

so that we have

$$\begin{aligned} d(n;a)=\frac{1}{\varphi (q)}\sum _{\chi \pmod {q}}\overline{\chi }(a)d_\chi (n). \end{aligned}$$
(4.3)

Then, it suffices to prove that for almost all primes p, \(d(p-p_0;a)>0\) for all \(a\pmod {q}\) with \((a,q)=1\).

As in [10] we start by observing that if \(p',n\) are coprime with \(p'\) prime, then by multiplicativity and the Cauchy–Schwarz inequality one has

$$\begin{aligned} \Big (d(np';a)-\frac{d_{\chi _0}(np')}{\varphi (q)}\Big )^2 \le 16\sum _{\begin{array}{c} b\pmod {q}\\ (b,q)=1 \end{array}} \Big (d(n;b)-\frac{d_{\chi _0}(n)}{\varphi (q)}\Big )^2, \end{aligned}$$

where \(\chi _0\) is the trivial character modulo q. Denoting by \(\omega (n)\) the number of distinct prime factors of n, Halberstam [8] proved that \(\omega (p-p_0)\) has normal order \(\log \log p\). Thus, \(\omega (p-p_0)\le 2\log \log p\) for almost all \(p\le x\) and so, in particular, \(p-p_0\) almost always has a prime factor \(p'\) greater than \(r(x):=x^{\frac{1}{4\log \log x}}\) as \(x\rightarrow \infty \). Also for almost all such p we have \((p',(p-p_0)/p')=1\) since only \(o(\pi (x))\) integers \(\le x\) have such a large repeated prime factor. Denoting by \(\sum '\) the restriction of the sum to primes with such properties, we then have

where all the implicit constants here and below are allowed to depend on \(q,p_0\). By [18, Ch. II Satz 4.2] (cf. Satz 4.6 for the case \(p_0=1\)), with \((a_1,b_1,a_2,b_2)=(1,0,n,p_0)\), the inner sum is \(O(\frac{x}{\varphi (n)\log ^2( x/n)}) =O(\frac{x(\log \log x)^2}{\varphi (n)\log ^2 x})\) since \(n\le x/r(x)\). Thus, using also (4.3) the above is

$$\begin{aligned} \ll \frac{x(\log \log x)^2}{\log ^2x}\max _{\begin{array}{c} b\pmod {q}\\ (b,q)=1 \end{array}} \sum _{\chi _0\ne \chi _1,\chi _2\pmod {q}}\frac{\chi _1(b)\overline{\chi }_2(b)}{\varphi (q)^2}\sum _{n\le \frac{x}{r(x)}}\frac{d_{\chi _1}(n)d_{\chi _2}(n)}{\varphi (n)}. \end{aligned}$$
(4.4)

An easy exercise shows that for \(\mathfrak {R}(s)>1\),

$$\begin{aligned} \sum _{n\ge 1}\frac{d_{\chi _1}(n)d_{\chi _2}(n)}{\varphi (n)n^s}=L(1+s,\chi _0)L(1+s,\chi _1)L(1+s,\chi _2)L(1+s,\chi _1\chi _2)R(s) \end{aligned}$$

where R(s) is an Euler product which is convergent and uniformly bounded on \(\mathfrak {R}(s)\ge -\frac{1}{4}\). It follows that the inner sum in (4.4) is \(O(\log ^2 x)\). Thus we find

and so we deduce that for \(\varepsilon >0\) we must have

$$\begin{aligned} d(p-p_0;a)-\frac{d_{\chi _0}(p-p_0)}{\varphi (q)}\ll _{\varepsilon } (\log x)^{\frac{1}{2}+\varepsilon } \end{aligned}$$

for almost all \(p\le x\). Finally, for almost all primes \(p\le x\) we have \(\omega (p-p_0)\ge (1-\varepsilon )\log \log x \) and so

$$\begin{aligned} d_{\chi _0}(p-p_0)\ge 2^{\omega (p-p_0)-\omega (q)}\gg _{\varepsilon } (\log x)^{\log 2-\varepsilon }. \end{aligned}$$

Since \(\log 2>1/2\) we deduce that for almost all primes \(p\le x\) we have

$$\begin{aligned} d(p-p_0;a)\gg _{\varepsilon } (\log x)^{\log 2-\varepsilon }, \end{aligned}$$

as desired. \(\square \)
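Two elementary facts used in this proof lend themselves to a direct check: the counts \(d(n;a)\) over invertible classes a sum to \(d_{\chi _0}(n)\), and \(d_{\chi _0}(n)\ge 2^{\omega (n)-\omega (q)}\), since every squarefree product of primes dividing n but not q is a divisor coprime to q. The following brute-force sketch (function names hypothetical) illustrates both.

```python
from math import gcd

def prime_factors(n):
    """Set of distinct prime factors of a positive integer n."""
    out, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            out.add(p)
            n //= p
        p += 1
    if n > 1:
        out.add(n)
    return out

def omega(n):
    """Number of distinct prime factors of n."""
    return len(prime_factors(n))

def d_residue(n, a, q):
    """d(n; a): number of divisors d of n with d = a (mod q)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0 and (d - a) % q == 0)

def d_chi0(n, q):
    """d_{chi_0}(n): number of divisors of n coprime to q."""
    return sum(1 for d in range(1, n + 1) if n % d == 0 and gcd(d, q) == 1)
```

The bound is typically far from sharp for individual n; what matters in the proof is that it grows like a power of \(\log x\) exceeding \(\frac{1}{2}\) for almost all primes p.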

Lemma 4.10

Let \(f\in M_k(\Gamma _0(N),\xi )\), and define g by (1.1). Let \(f_n\) and \(g_n\) denote the Fourier coefficients of f and g, respectively, and for any character \(\chi \) of modulus q coprime to N, define \(\Lambda _f(s,c_\chi )\) and \(\Lambda _g(s,c_{\overline{\chi }})\) as in (1.7). Then \(\Lambda _f(s,c_\chi )\) and \(\Lambda _g(s,c_{\overline{\chi }})\) continue to entire functions, apart from at most simple poles at \(s=\frac{1\pm k}{2}\), and satisfy the functional equation (1.8).

Proof

Define

$$\begin{aligned} f_{\chi }(z) := \sum _{\begin{array}{c} a\pmod {q}\\ (a,q)=1 \end{array}} \chi (a) f\bigg | \begin{pmatrix}1 &{} \frac{a}{q}\\ &{} 1\end{pmatrix}= \sum _{n=0}^\infty f_n c_\chi (n) e(nz), \end{aligned}$$
(4.5)

and similarly for \(g_{\overline{\chi }}\). Then

$$\begin{aligned} f_\chi \left| \begin{pmatrix}&{}-1\\ Nq^2\end{pmatrix}\right. = \sum _{\begin{array}{c} u\pmod {q}\\ (u,q)=1 \end{array}}\chi (u) f\left| \begin{pmatrix}1 &{} \frac{u}{q} \\ &{} 1\end{pmatrix}\begin{pmatrix}&{}-1\\ Nq^2\end{pmatrix}\right. . \end{aligned}$$
(4.6)

Since

$$\begin{aligned} q^{-1} \begin{pmatrix}&{}-1\\ N\end{pmatrix}^{-1} \begin{pmatrix}1 &{} \frac{u}{q}\\ &{} 1\end{pmatrix}\begin{pmatrix}&{}-1\\ Nq^2\end{pmatrix}\begin{pmatrix}1 &{} -\frac{v}{q}\\ &{} 1\end{pmatrix}= \begin{pmatrix}q &{} -v\\ -uN &{} \frac{1+uvN}{q}\end{pmatrix}\in \Gamma _0(N), \end{aligned}$$
(4.7)

provided that \(uvN\equiv -1\pmod {q}\), we have

$$\begin{aligned} f_\chi \left| \begin{pmatrix}&{}-1\\ Nq^2\end{pmatrix}\right.= & {} \xi (q) \sum _{\begin{array}{c} u\pmod {q}\\ uvN \equiv -1\pmod {q} \end{array}} \chi (u) g\bigg |\begin{pmatrix}1 &{} \frac{v}{q}\\ &{} 1\end{pmatrix}\nonumber \\= & {} \xi (q)\overline{\chi (-N)} \sum _{\begin{array}{c} u\pmod {q}\\ uvN \equiv -1\pmod {q} \end{array}} \overline{\chi (v)} g\bigg |\begin{pmatrix}1 &{} \frac{v}{q}\\ &{} 1\end{pmatrix}= \xi (q)\overline{\chi (-N)} g_{\overline{\chi }}.\nonumber \\ \end{aligned}$$
(4.8)

The conclusion now follows by Hecke’s argument [15, Theorem 4.3.5]. \(\square \)

Lemma 4.11

Let \(\chi \pmod {q}\) be a Dirichlet character induced by the primitive character \(\chi _*\pmod {q_*}\). Define \(q_0 = \prod _{p\mid q, p\not \mid q_*}p\) and \(q_2 = \frac{q}{q_* q_0}\). Then \(c_\chi (n)=0\) if \(q_2\not \mid n\), and

$$\begin{aligned} c_\chi (nq_2)= & {} q_2\chi _*(q_0)c_{\chi _*}(n)c_{q_0}(n)\nonumber \\= & {} q_2\chi _*(q_0) \tau (\chi _*) \mu (q_0) \overline{\chi _*(n)} \mu (\gcd (q_0, n)) \varphi (\gcd (q_0, n)). \end{aligned}$$
(4.9)

Proof

By [16, §9.2, Theorem 12], if \(q_*\mid \frac{q}{\gcd (q, n)}\) then

$$\begin{aligned} c_\chi (n) = \overline{\chi _*\!\left( \frac{n}{\gcd (q, n)}\right) } \chi _*\!\left( \frac{q}{\gcd (q, n) q_*}\right) \mu \!\left( \frac{q}{\gcd (q, n) q_*}\right) \frac{\varphi (q)}{\varphi \!\left( \frac{q}{\gcd (q, n)}\right) } \tau (\chi _*), \end{aligned}$$

and \(c_\chi (n)=0\) otherwise. Since \(\chi _*\!\left( \frac{q}{\gcd (q, n) q_*}\right) = \chi _*\!\left( \frac{q_0q_2}{\gcd (q, n)}\right) =0\) unless \(q_2\mid n\), we get \(c_\chi (n)=0\) if \(q_*\not \mid \frac{q}{\gcd (q, n)}\) or \(q_2\not \mid n\).

For an integer n, we get

$$\begin{aligned} c_\chi (nq_2)= & {} \overline{\chi _*\!\left( \frac{n}{\gcd (q_0, n)}\right) } \chi _*\!\left( \frac{q_0}{\gcd (q_0, n)}\right) \mu \!\left( \frac{q_0}{\gcd (q_0, n)} \right) \frac{\varphi (q)}{\varphi \!\left( q_* \frac{q_0}{\gcd (q_0, n)}\right) } \tau (\chi _*) \\= & {} \overline{\chi _*(n)} \chi _*(q_0) \tau (\chi _*) \mu (q_0) \frac{\varphi (q)}{\varphi (q_* q_0)} \mu (\gcd (q_0, n)) \varphi (\gcd (q_0, n)), \end{aligned}$$

since \(q_0\) is squarefree and \(\gcd (q_0, q_*)=1\). Finally, since q has the same prime factors as \(q_*q_0\), we have \(\frac{\varphi (q)}{\varphi (q_* q_0)}=\frac{q}{q_* q_0}=q_2\). \(\square \)
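The first line of (4.9) can be tested numerically in small cases. The sketch below assumes \(c_\chi (n)=\sum _{(a,q)=1}\chi (a)e(an/q)\), as read off from (4.5), and takes \(\chi _*\) to be the primitive character modulo 4; the function names are hypothetical. For \(q=8\) one has \(q_0=1\), \(q_2=2\), and for \(q=12\) one has \(q_0=3\), \(q_2=1\).

```python
import cmath
from math import gcd

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def chi4(a):
    """The primitive Dirichlet character modulo 4."""
    return {1: 1, 3: -1}.get(a % 4, 0)

def trivial(a):
    """Trivial character; c_chi(trivial, q, n) is then the Ramanujan sum c_q(n)."""
    return 1

def c_chi(chi, q, n):
    """c_chi(n) for the character modulo q obtained by restricting chi
    to the residues coprime to q."""
    return sum(chi(a) * e(a * n / q) for a in range(q) if gcd(a, q) == 1)

def check_q8(n):
    """q = 8: c_chi vanishes at odd arguments, and c_chi(2n) = 2 * c_{chi_*}(n)."""
    vanishes = abs(c_chi(chi4, 8, 2 * n - 1)) < 1e-9
    matches = abs(c_chi(chi4, 8, 2 * n) - 2 * c_chi(chi4, 4, n)) < 1e-9
    return vanishes and matches

def check_q12(n):
    """q = 12: c_chi(n) = chi_*(3) * c_{chi_*}(n) * c_3(n)."""
    rhs = chi4(3) * c_chi(chi4, 4, n) * c_chi(trivial, 3, n)
    return abs(c_chi(chi4, 12, n) - rhs) < 1e-9
```

As a spot check by hand, for \(q=12\) and \(n=1\) both sides equal \(2i\).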

Lemma 4.12

Let \(\xi \pmod {N}\) and \(\chi \pmod {q}\) be Dirichlet characters, with \((q,N)=1\). Let \(\{f_n\}_{n=1}^\infty \) be a sequence of complex numbers of at most polynomial growth, and define \(\Lambda _f(s)\) and \(\Lambda _f(s,c_\chi )\) as in (1.2) and (1.7). Suppose that \(f_1=1\) and the \(f_n\) satisfy the Hecke relations at primes not dividing N, so that

$$\begin{aligned} \Lambda _f(s)=\Gamma _\mathbb {C}(s+\tfrac{k-1}{2}) \sum _{n\mid N^\infty }\lambda _nn^{-s} \prod _{p\not \mid N} \bigl (1-\lambda _pp^{-s}+\xi (p)p^{-2s}\bigr )^{-1}, \end{aligned}$$
(4.10)

where \(\lambda _n:=f_nn^{-\frac{k-1}{2}}\). Let \(\chi _*\pmod {q_*}\) be the primitive character inducing \(\chi \), and define \(D_{f,\chi }(s)=\Lambda _f(s,c_\chi )/\Lambda _f(s, c_{\chi _*})\). Then \(D_{f,\chi }(s)\) is a Dirichlet polynomial given by the following formula:

$$\begin{aligned} D_{f,\chi }(s)= & {} \prod _{p\mid q_*}\lambda _{p^{{{\mathrm{ord}}}_p(q/q_*)}}p^{{{\mathrm{ord}}}_p(q/q_*)(1-s)}\nonumber \\&\times \prod _{p\mid q, p\not \mid q_*} p^{({{\mathrm{ord}}}_p(q)-1)(1-s)} \bigg [ \lambda _{p^{{{\mathrm{ord}}}_p(q)}} p^{1-s} +\lambda _{p^{{{\mathrm{ord}}}_p(q)-2}}\xi (p) p^{-s} \nonumber \\&\quad -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\bigl (\chi _*(p) +\xi (p)\overline{\chi _*(p)}p^{1-2s}\bigr ) \bigg ] , \end{aligned}$$
(4.11)

where we define \(\lambda _{p^\ell }=0\) for any negative integer \(\ell \).

Suppose further that \(\{g_n\}_{n=1}^\infty \) is a sequence of at most polynomial growth such that \(g_1\ne 0\), \(g_n=g_1\overline{\xi (n)}f_n\) for all n coprime to N, and

$$\begin{aligned} \Lambda _g(s)=g_1\Gamma _\mathbb {C}(s+\tfrac{k-1}{2}) \sum _{n\mid N^\infty }\tilde{\lambda }_nn^{-s} \prod _{p\not \mid N} \bigl (1-\tilde{\lambda }_pp^{-s} +\overline{\xi (p)}p^{-2s}\bigr )^{-1}, \end{aligned}$$

where \(\tilde{\lambda }_n=g_1^{-1}g_nn^{-\frac{k-1}{2}}\). Then \(D_{f,\chi }(s)\) and \(D_{g,\overline{\chi }}(s):= \Lambda _g(s,c_{\overline{\chi }})/\Lambda _g(s,c_{\overline{\chi }_*})\) satisfy the functional equation

$$\begin{aligned} D_{f, \chi }(s) = (q/q_*)^{1-2s}\xi (q/q_*) D_{g,\overline{\chi }}(1-s). \end{aligned}$$
(4.12)

In particular, if \(\Lambda _f(s,\overline{\chi }_*)\) and \(\Lambda _g(s,\chi _*)\) satisfy (2.5) with \((\overline{\chi }_*,q_*)\) in place of \((\chi ,q)\), then \(\Lambda _f(s,c_\chi )\) and \(\Lambda _g(s,c_{\overline{\chi }})\) satisfy (1.8).

Proof

Let \(q_0 = \prod _{p\mid q, p\not \mid q_*} p\) and \(q_2 = \frac{q}{q_0 q_*}\). By (4.9), we have

$$\begin{aligned} \frac{\Lambda _f(s, c_\chi )}{\Gamma _\mathbb {C}\!\left( s+\frac{k-1}{2}\right) }= & {} \sum _{n=1}^\infty \frac{\lambda _{nq_2} c_\chi (nq_2)}{(nq_2)^s}\\= & {} q_2\chi _*(q_0) \tau (\chi _*) \mu (q_0) \sum _{n=1}^\infty \frac{\lambda _{nq_2} \overline{\chi _*(n)} \mu (\gcd (q_0, n)) \varphi (\gcd (q_0, n))}{(nq_2)^s}\\= & {} q_2\chi _*(q_0) \tau (\chi _*) \mu (q_0) \sum _{n\mid N^\infty }\frac{\lambda _n \overline{\chi _*(n)}}{n^s} \prod _{p\not \mid q N} \sum _{j=0}^\infty \frac{\lambda _{p^j} \overline{\chi _*(p^j)}}{p^{js}} \\&\times \prod _{p\mid \gcd (q_2, q_*)} \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q_2)}}}{p^{{{\mathrm{ord}}}_p(q_2) s}} \times \prod _{p\mid q_0} \chi _*(p^{{{\mathrm{ord}}}_p(q)-1})\\&\bigg [\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} - \varphi (p) \times \sum _{j={{\mathrm{ord}}}_p(q)}^\infty \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}} \bigg ]. \end{aligned}$$

Thus,

$$\begin{aligned} D_{f, \chi }(s)= & {} \frac{\Lambda _f(s, c_\chi )}{\Lambda _f(s, c_{\chi _*})}\\= & {} q_2 \prod _{p\mid q_*} \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q/q_*)}}}{p^{{{\mathrm{ord}}}_p(q/q_*)s}} \prod _{p\mid q_0} \chi _*(p^{{{\mathrm{ord}}}_p(q)})\\&\times \frac{-\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} + \varphi (p) \sum _{j={{\mathrm{ord}}}_p(q)}^\infty \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}}}{(1-\lambda _p \overline{\chi _*(p)} p^{-s} + \xi \cdot \overline{\chi _*}^2 (p) p^{-2s})^{-1}}. \end{aligned}$$

For each prime \(p\mid q_0\), we have

$$\begin{aligned}&-\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} + \varphi (p) \sum _{j={{\mathrm{ord}}}_p(q)}^\infty \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}} \\&\quad = -\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} - \varphi (p) \sum _{j=0}^{{{\mathrm{ord}}}_p(q)-1} \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}} \\&\qquad + \varphi (p) (1-\lambda _p \overline{\chi _*(p)} p^{-s} + \xi \cdot \overline{\chi _*}^2 (p) p^{-2s})^{-1}. \end{aligned}$$

Since \(\lambda _{p^j} \lambda _p = \lambda _{p^{j+1}} + \xi (p) \lambda _{p^{j-1}}\), we have

$$\begin{aligned} \sum _{j=0}^{{{\mathrm{ord}}}_p(q)-2} \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}}= & {} \bigg [\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-2}} \overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-2})} \xi \cdot \overline{\chi _*}^2(p) p^{-s}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} \\&- \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} + 1 \bigg ] \\&\quad \times (1-\lambda _p\overline{\chi _*(p)}p^{-s} + \xi \cdot \overline{\chi _*}^2(p)p^{-2s})^{-1}, \end{aligned}$$

so that

$$\begin{aligned}&-\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} + \varphi (p) \sum _{j={{\mathrm{ord}}}_p(q)}^\infty \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}} \\&\quad = -p \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} \\&\qquad - \varphi (p) \bigg [ \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-2}} \overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-2})} \xi \cdot \overline{\chi _*}^2(p) p^{-s}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} - \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} \bigg ] \\&\qquad \times (1-\lambda _p\overline{\chi _*(p)}p^{-s} + \xi \cdot \overline{\chi _*}^2(p)p^{-2s})^{-1}. \end{aligned}$$

Therefore, for each prime \(p\mid q_0\), we have

$$\begin{aligned}&\frac{-\frac{\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)-1})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} + \varphi (p) \sum _{j={{\mathrm{ord}}}_p(q)}^\infty \frac{\lambda _{p^{j}} \overline{\chi _*(p^{j})}}{p^{js}}}{(1-\lambda _p \overline{\chi _*(p)} p^{-s} + \xi \cdot \overline{\chi _*}^2 (p) p^{-2s})^{-1}} \\&\quad = \frac{\overline{\chi _*(p^{{{\mathrm{ord}}}_p(q)})}}{p^{({{\mathrm{ord}}}_p(q)-1)s}} \bigg [ \lambda _{p^{{{\mathrm{ord}}}_p(q)}}p^{1-s} -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \chi _*(p) + \lambda _{p^{{{\mathrm{ord}}}_p(q)-2}}\xi (p) p^{-s} \\&\qquad -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \xi \cdot \overline{\chi _*}(p)p^{1-2s} \bigg ]. \end{aligned}$$

Writing \(q_2=\prod _{p\mid q_*}p^{{{\mathrm{ord}}}_p(q/q_*)} \prod _{p\mid q_0}p^{{{\mathrm{ord}}}_p(q)-1}\), this yields

$$\begin{aligned} D_{f, \chi }(s)= & {} \prod _{p\mid q_*}\lambda _{p^{{{\mathrm{ord}}}_p(q/q_*)}}p^{{{\mathrm{ord}}}_p(q/q_*)(1-s)} \\&\quad \times \prod _{p\mid q, p\not \mid q_*} p^{({{\mathrm{ord}}}_p(q)-1)(1-s)} \bigg [ \lambda _{p^{{{\mathrm{ord}}}_p(q)}}p^{1-s}\\&-\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \chi _*(p) + \lambda _{p^{{{\mathrm{ord}}}_p(q)-2}}\xi (p) p^{-s} -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \xi \cdot \overline{\chi _*}(p)p^{1-2s} \bigg ]. \end{aligned}$$

Since \(\tilde{\lambda }_p = \overline{\xi (p)} \lambda _p\) for \(p\mid q_0\), we also have

$$\begin{aligned}&(q/q_*)^{1-2s}\overline{\xi (q/q_*)} D_{f, \chi }(1-s) \\&\quad = q_2 \prod _{p\mid q_*} \overline{\xi (p^{{{\mathrm{ord}}}_p(q/q_*)})} p^{{{\mathrm{ord}}}_p(q/q_*)(1-2s)} \frac{\lambda _{p^{{{\mathrm{ord}}}_p(q/q_*)}}}{p^{{{\mathrm{ord}}}_p(q/q_*)(1-s)}} \\&\qquad \times \prod _{p\mid q, p\not \mid q_*} p^{-({{\mathrm{ord}}}_p(q)-1)s} \overline{\xi (p^{{{\mathrm{ord}}}_p(q)})} \chi _*(p) \bigg [ \lambda _{p^{{{\mathrm{ord}}}_p(q)}}\overline{\chi _*(p)} p^{1-s} -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}}p^{1-2s} \\&\qquad + \lambda _{p^{{{\mathrm{ord}}}_p(q)-2}}\overline{\chi _*(p)}\xi (p) p^{-s} -\lambda _{p^{{{\mathrm{ord}}}_p(q)-1}} \xi \cdot \overline{\chi _*}^2(p) \bigg ] \\&\quad = q_2 \prod _{p\mid q_*} \frac{\tilde{\lambda }_{p^{{{\mathrm{ord}}}_p(q/q_*)}}}{p^{{{\mathrm{ord}}}_p(q/q_*)s}} \prod _{p\mid q, p\not \mid q_*} p^{-({{\mathrm{ord}}}_p(q)-1)s}\overline{\chi _*(p)} \bigg [ \tilde{\lambda }_{p^{{{\mathrm{ord}}}_p(q)}}\chi _*(p) p^{1-s} \\&\qquad -\tilde{\lambda }_{p^{{{\mathrm{ord}}}_p(q)-1}} \overline{\xi (p)} \chi _*(p)^2 p^{1-2s} + \tilde{\lambda }_{p^{{{\mathrm{ord}}}_p(q)-2}}\chi _*(p)\overline{\xi (p)} p^{-s} - \tilde{\lambda }_{p^{{{\mathrm{ord}}}_p(q)-1}} \bigg ] \\&\quad = D_{g, \bar{\chi }}(s). \end{aligned}$$

Finally, (1.8) follows from (4.12) and (2.5) (with \(\chi \) replaced by \(\overline{\chi }_*\)) on noting the equalities \(c_{\chi _*}=\tau (\chi _*)\overline{\chi }_*\), \(c_{\overline{\chi }_*}=\tau (\overline{\chi }_*)\chi _*\) and \(\tau (\overline{\chi }_*)/\tau (\chi _*) =q_*^{-1}\tau (\overline{\chi }_*)^2\chi _*(-1)\). \(\square \)

Lemma 4.13

Let \(\{g_1,\ldots ,g_h\}\) be a generating set for \(\Gamma _1(N)\). For \(i=1,\ldots ,h\), let \(\gamma _i\in \langle T,W\rangle g_i\langle T,W\rangle \) be a matrix with top row \(\left( {\begin{matrix}r_i&b_i\end{matrix}}\right) \), and choose \(m_i\in \mathbb {Z}\) with \(m_i\mid \frac{r_i-1}{N}\). Then, for any \(q\in \mathbb {{N}}\) satisfying \((q,Nm_i)=1\) and \(q\equiv Nm_ib_i\pmod {r_i}\) for every i, we have \(H_q\supseteq \Gamma _1(N)\).

Proof

Fix a choice of q satisfying the given conditions, and set \(d_i=(1-r_i)/(Nm_i)\). Then

$$\begin{aligned} qd_i\equiv Nm_ib_id_i=(1-r_i)b_i\equiv b_i\pmod {r_i}. \end{aligned}$$

By hypothesis we have \((q,Nm_i)=1\), so we can choose a matrix \(h_i\in \Gamma _0(N)\) with left column \(\left( {\begin{matrix}q\\ Nm_i\end{matrix}}\right) \). The upper-left entry of \(\gamma _iT^{\frac{qd_i-b_i}{r_i}}h_i\) is \(q(r_i+Nm_id_i)=q\), and thus \(\gamma _iT^{\frac{qd_i-b_i}{r_i}}\in H_q\). As shown in the proof of Theorem 3.2, \(H_q\) also contains T and W, and thus \(g_i\in H_q\). \(\square \)
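The computation in this proof can be checked numerically on a small example. The following is a minimal sketch, not part of the original argument: the sample data \(N=5\), \(\gamma =\left( {\begin{matrix}11&{}2\\ 5&{}1\end{matrix}}\right) \), \(m=1\), \(q=21\) and all function names are illustrative assumptions, and \(\gamma \) is simply taken to be some matrix in \(\Gamma _1(5)\) with top row \(\left( {\begin{matrix}r&b\end{matrix}}\right) \).

```python
from math import gcd

def matmul(A, B):
    """Multiply 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def egcd(a, b):
    """Return (g, s, t) with s*a + t*b == g, for non-negative a and b."""
    if b == 0:
        return (a, 1, 0)
    g, s, t = egcd(b, a % b)
    return (g, t, s - (a // b) * t)

def completion(q, Nm):
    """Return a determinant-1 integer matrix with left column (q, Nm)."""
    g, s, t = egcd(q, Nm)
    assert g == 1
    # s*q + t*Nm == 1, so ((q, -t), (Nm, s)) has determinant 1.
    return ((q, -t), (Nm, s))

def upper_left_entry(N, gamma, m, q):
    """Upper-left entry of gamma * T^k * h, following the proof of Lemma 4.13."""
    (r, b), _ = gamma
    assert m > 0 and (r - 1) % (N * m) == 0   # m divides (r - 1)/N (sketch assumes m > 0)
    assert gcd(q, N * m) == 1                 # (q, N*m) = 1
    assert (q - N * m * b) % r == 0           # q = N*m*b (mod r)
    d = (1 - r) // (N * m)
    assert (q * d - b) % r == 0               # the exponent of T is an integer
    k = (q * d - b) // r
    T_k = ((1, k), (0, 1))
    h = completion(q, N * m)                  # lies in Gamma_0(N): lower-left is N*m
    return matmul(matmul(gamma, T_k), h)[0][0]

# Hypothetical sample data, chosen only for illustration.
print(upper_left_entry(5, ((11, 2), (5, 1)), 1, 21))  # prints 21
```

Here \(d=-2\) and the exponent of T is \(-4\), so the product has upper-left entry \(11\cdot 21-42\cdot 5=21=q\), as the proof predicts.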

Lemma 4.14

For \(\gamma =\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) \in \Gamma _0(N)\), define \({{\mathrm{ht}}}(\gamma )=\max \{|a|,|b|,|c|,|d|\}\). Let \(\tau _1,\ldots ,\tau _\ell \in \bigl \{T,T^{-1},W,W^{-1}\bigr \}\), with \(\tau _{i+1}\ne \tau _i^{-1}\) for every \(i=1,\ldots ,\ell -1\). Then, provided that \(N\ge 4\),

$$\begin{aligned} {{\mathrm{ht}}}(\tau _1\ldots \tau _\ell )\ge \max \{{{\mathrm{ht}}}(\tau _1\ldots \tau _{\ell -1}), {{\mathrm{ht}}}(\tau _2\ldots \tau _\ell )\}. \end{aligned}$$

Proof

Since \({{\mathrm{ht}}}(\gamma )={{\mathrm{ht}}}(\gamma ^{-1})\) for every \(\gamma \), it suffices to prove that \({{\mathrm{ht}}}(\tau _1\ldots \tau _\ell )\ge {{\mathrm{ht}}}(\tau _1\ldots \tau _{\ell -1})\). Suppose that this is false, and let \(\tau _1,\ldots ,\tau _\ell \) be a counterexample of minimal length. Since \({{\mathrm{ht}}}(T^{\pm 1})={{\mathrm{ht}}}(W^{\pm 1})={{\mathrm{ht}}}(I)\), we must have \(\ell >1\).

Note that \(\langle T,W\rangle \) has some outer automorphisms that preserve the height function. Specifically, conjugating an element \(\gamma =\tau _1\ldots \tau _\ell \) by \(\left( {\begin{matrix}1&{}\\ {} &{}-1\end{matrix}}\right) \) leaves \({{\mathrm{ht}}}(\gamma )\) unchanged and swaps every occurrence of T with \(T^{-1}\) and W with \(W^{-1}\). Similarly, conjugating by \(\left( {\begin{matrix}&{}-1\\ N&{}\end{matrix}}\right) \) swaps T with \(W^{-1}\) and W with \(T^{-1}\). Thus, applying an appropriate outer automorphism, we may assume without loss of generality that \(\tau _\ell =T\).
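The two conjugation identities are easy to confirm by direct matrix multiplication. The sketch below is an illustrative check under the assumption that \(T=\left( {\begin{matrix}1&{}1\\ 0&{}1\end{matrix}}\right) \) and \(W=\left( {\begin{matrix}1&{}0\\ N&{}1\end{matrix}}\right) \); the helper names are hypothetical.

```python
def matmul(A, B):
    """Multiply 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def conjugate(M, X):
    """Return M * X * M^{-1}, assuming the result has integer entries."""
    (a, b), (c, d) = M
    det = a * d - b * c
    adj = ((d, -b), (-c, a))          # M^{-1} = adj / det
    P = matmul(matmul(M, X), adj)
    assert all(entry % det == 0 for row in P for entry in row)
    return tuple(tuple(entry // det for entry in row) for row in P)

def check_automorphisms(N):
    """Check the four swaps stated in the text for a given level N."""
    T, T_inv = ((1, 1), (0, 1)), ((1, -1), (0, 1))
    W, W_inv = ((1, 0), (N, 1)), ((1, 0), (-N, 1))
    D = ((1, 0), (0, -1))             # diag(1, -1)
    M = ((0, -1), (N, 0))
    return (conjugate(D, T) == T_inv and conjugate(D, W) == W_inv
            and conjugate(M, T) == W_inv and conjugate(M, W) == T_inv)

print(all(check_automorphisms(N) for N in range(1, 10)))  # prints True
```

For a general \(\gamma =\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) \), the second conjugation produces \(\left( {\begin{matrix}d&{}-c\\ -Nb&{}a\end{matrix}}\right) \), which makes the invariance of the height visible.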

Write \(\tau _1\ldots \tau _{\ell -1}=\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) \). Then by assumption we have \(h:={{\mathrm{ht}}}(\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) )>{{\mathrm{ht}}}(\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) T)\), so that \(h=\max \{|a|,|b|,|c|,|d|\}>\max \{|a|,|a+b|,|c|,|Nc+d|\}\). Hence, \(h=\max \{|b|,|d|\}\). If \(h=|b|\) then \(|a|<|b|\) and \(|a+b|<|b|\), so \(ab<0\). If \(h=|d|\) then \(|Nc+d|<|d|\), so \(cd<0\) and \(|Nc|<2|d|\).

Next we consider \(\tau _{\ell -1}\), which must be one of \(T,W,W^{-1}\), since \(\tau _\ell \ne \tau _{\ell -1}^{-1}\). By minimality, we have \({{\mathrm{ht}}}(\left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) \tau _{\ell -1}^{-1})= {{\mathrm{ht}}}(\tau _1\ldots \tau _{\ell -2})\le h\). If \(\tau _{\ell -1}=T\) then we have \(\max \{|b-a|,|d-Nc|\}\le h\). This contradicts the fact that \(ab<0\) when \(h=|b|\) (which gives \(|b-a|=|a|+|b|>h\)) and \(cd<0\) when \(h=|d|\) (which gives \(|d-Nc|=|d|+N|c|>h\)). If \(\tau _{\ell -1}=W\) then we have \(\max \{|a-Nb|,|c-d|\}\le h\), which is again a contradiction, since \(|a-Nb|=|a|+N|b|>h\) when \(ab<0\) and \(|c-d|=|c|+|d|>h\) when \(cd<0\).

Hence we may assume that \(\tau _{\ell -1}=W^{-1}\), and we have \(\max \{|a+Nb|,|b|,|c+d|,|d|\}\le h\). If \(h=|b|\) then \(|b|\ge |a+Nb|>(N-1)|b|\), which is a contradiction, since \(N>1\). Hence we must have \(h=|d|\).

Next, let \(j\in \{1,\ldots ,\ell -1\}\) be the largest number such that \(\tau _{\ell -i}=W^{-1}\) for \(i=1,\ldots ,j\). Since \(|Nc|<2|d|\) and \(N>1\), we must have \(j<\ell -1\). Consider \(\tau _{\ell -j-1}\), which must be one of \(T,T^{-1}\). We have

$$\begin{aligned} {{\mathrm{ht}}}\left( \left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) W^j\tau _{\ell -j-1}^{-1}\right) = {{\mathrm{ht}}}(\tau _1\ldots \tau _{\ell -j-2})\le h. \end{aligned}$$

Since \(\tau _{\ell -j-1}=T^{\pm 1}\) and \(jN\ge 4\), this implies that

$$\begin{aligned} |d|\ge {{\mathrm{ht}}}\left( \left( {\begin{matrix}a&{}b\\ Nc&{}d\end{matrix}}\right) W^jT^{\mp 1}\right) \ge |(jN\mp 1)d+Nc|>(jN\mp 1-2)|d|\ge |d|, \end{aligned}$$

which is a contradiction. \(\square \)
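As a sanity check on Lemma 4.14, one can enumerate all reduced words of small length and compare heights directly. The following brute-force sketch is an illustration only and proves nothing beyond the lengths tested; the letter encoding (`T`, `t`, `W`, `w` for \(T,T^{-1},W,W^{-1}\)) and the function names are assumptions made here.

```python
from itertools import product

INVERSE = {"T": "t", "t": "T", "W": "w", "w": "W"}

def matmul(A, B):
    """Multiply 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def word_matrix(N, word):
    """Product of the letters of `word`, with t = T^{-1} and w = W^{-1}."""
    gens = {"T": ((1, 1), (0, 1)), "t": ((1, -1), (0, 1)),
            "W": ((1, 0), (N, 1)), "w": ((1, 0), (-N, 1))}
    X = ((1, 0), (0, 1))
    for letter in word:
        X = matmul(X, gens[letter])
    return X

def height(N, X):
    """ht of (a b; Nc d): the maximum of |a|, |b|, |c|, |d|."""
    (a, b), (Nc, d) = X
    return max(abs(a), abs(b), abs(Nc) // N, abs(d))

def violations(N, max_len):
    """Reduced words of length <= max_len violating the inequality of Lemma 4.14."""
    bad = []
    for length in range(1, max_len + 1):
        for word in product("TtWw", repeat=length):
            # Skip non-reduced words, i.e. a letter followed by its inverse.
            if any(INVERSE[x] == y for x, y in zip(word, word[1:])):
                continue
            full = height(N, word_matrix(N, word))
            left = height(N, word_matrix(N, word[:-1]))
            right = height(N, word_matrix(N, word[1:]))
            if full < max(left, right):
                bad.append("".join(word))
    return bad

print(violations(4, 6))             # no violations expected for N >= 4
print("wTwTw" in violations(3, 5))  # the hypothesis N >= 4 is needed
```

For \(N=3\) the word \(W^{-1}TW^{-1}TW^{-1}\) equals \(T^{-1}\), of height 1, while its prefix \((W^{-1}T)^2\) has height 2, so the inequality fails there.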

For \(N\ge 4\), \(\Gamma _1(N)\) is torsion-free [13, Lemma 12.3], and hence free by the Kurosh subgroup theorem [14]. Lemma 4.14 permits a simple, direct proof of the following consequence:

Corollary 4.15

T and W generate a free group if and only if \(N\ge 4\).

Proof

For \(N\le 3\), we verify directly that \((W^{-1}T)^{12}=I\). For \(N\ge 4\), suppose that \(\tau _1\ldots \tau _\ell =I\) is a non-trivial relation of minimal length satisfied by T and W. Clearly \(\ell >1\), and by applying an appropriate outer automorphism, we may assume that \(\tau _1=T\). Considering each possible \(\tau _2\in \{T,W,W^{-1}\}\), we see that \({{\mathrm{ht}}}(\tau _1\tau _2)>1={{\mathrm{ht}}}(I)\). On the other hand, repeated application of Lemma 4.14 gives \({{\mathrm{ht}}}(\tau _1\ldots \tau _\ell )\ge {{\mathrm{ht}}}(\tau _1\tau _2)\), which is a contradiction. \(\square \)
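Both explicit computations used in this proof can be reproduced in a few lines. The sketch below is illustrative, with hypothetical helper names; it checks that \((W^{-1}T)^{12}=I\) for \(N\le 3\) and lists the heights of the two-letter words starting with T.

```python
IDENTITY = ((1, 0), (0, 1))

def matmul(A, B):
    """Multiply 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def matpow(A, n):
    """Return A^n for a non-negative integer n by repeated multiplication."""
    X = IDENTITY
    for _ in range(n):
        X = matmul(X, A)
    return X

def w_inv_t(N):
    """The matrix W^{-1} T for level N."""
    return matmul(((1, 0), (-N, 1)), ((1, 1), (0, 1)))

def two_letter_heights(N):
    """Heights of T*T, T*W and T*W^{-1}, with ht as defined in Lemma 4.14."""
    T = ((1, 1), (0, 1))
    seconds = {"TT": T, "TW": ((1, 0), (N, 1)), "Tw": ((1, 0), (-N, 1))}
    heights = {}
    for name, X in seconds.items():
        (a, b), (Nc, d) = matmul(T, X)
        heights[name] = max(abs(a), abs(b), abs(Nc) // N, abs(d))
    return heights

print([matpow(w_inv_t(N), 12) == IDENTITY for N in (1, 2, 3, 4)])
# prints [True, True, True, False]
print(two_letter_heights(4))
# prints {'TT': 2, 'TW': 5, 'Tw': 3}
```

In general the three heights are 2, \(N+1\) and \(N-1\), all of which exceed 1 once \(N\ge 4\).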