Algebraic twists of modular forms and Hecke orbits

We consider the question of the correlation of Fourier coefficients of modular forms with functions of algebraic origin. We establish the absence of correlation in considerable generality (with a power saving of Burgess type) and a corresponding equidistribution property for twisted Hecke orbits. This is done by exploiting the amplification method and the Riemann Hypothesis over finite fields, relying in particular on the ℓ-adic Fourier transform introduced by Deligne and studied by Katz and Laumon.

On the other hand, it is also well-known that the Fourier coefficients oscillate quite substantially, as the estimate $\sum_{n \le x} \varrho_f(n) e(\alpha n) \ll x^{1/2} \log(2x)$ (1.3), valid for $x \ge 1$ and $\alpha \in \mathbf{R}$, with an implied constant depending on $f$ only, shows (see, e.g., [Iwa97, Th. 5.3] and [Iwa95, Th. 8.1]).

GAFA ALGEBRAIC TWISTS OF MODULAR FORMS 583
where the implied constant depends on $f$ and $V$, and our aim will be to improve this bound; we will prove estimates of the shape $S(f, K; p) \ll p^{1-\delta}$ (1.4) for some absolute $\delta > 0$, where the implied constant depends only on $f$, $V$ and easily controlled invariants of $K$, such as the norms $\|K\|_2$ or $\|K\|_\infty$.
A first (slightly degenerate) example is a (normalized) Dirac function located at some $u \in \mathbf{F}_p$, i.e., $K(n) = p^{1/2} \delta_{n \equiv u \ (\mathrm{mod}\ p)}$. Here $\|K\|_\infty = p^{1/2}$ is large, but $\|K\|_2 = 1$ and $S(f, K; p) = p^{1/2} \sum_{n \equiv u \ (\mathrm{mod}\ p)} \varrho_f(n) V(n/p) \ll p^{1-\delta}$ (1.5) for any δ < 1 − 7/64 by (1.2).
Another non-trivial choice (somewhat simpler than the previous one) is an additive character modulo $p$ given by $K(n) = e(an/p)$ for some fixed $a \in \mathbf{Z}$. In that case, $|K(n)| \le 1$ and the bound (1.3) gives (1.4) for any $\delta < 1/2$, with an implied constant depending only on $f$ and $V$.
A third interesting example is given by $K(n) = \chi(n)$, where $\chi$ is a non-trivial Dirichlet character modulo $p$ (extended by 0 at $p$). In that case, the bound (1.4), with an implied constant depending only on $f$ and $V$, is essentially equivalent to a subconvex bound for the twisted L-function $L(f \otimes \chi, s)$ in the level aspect, i.e., to a bound $L(f \otimes \chi, s) \ll_{s,f} p^{1/2-\delta}$, for some $\delta > 0$ and any fixed $s$ on the critical line. Such an estimate was obtained for the first time by Duke-Friedlander-Iwaniec in [DFI93] for any $\delta < 1/22$. This bound was subsequently improved to any $\delta < 1/8$ (a Burgess type exponent) by Bykovski and Blomer-Harcos [Byk98, BH08], and to $\delta < 1/6$ (a Weyl type exponent) when $\chi$ is quadratic by Conrey-Iwaniec [CI00].
There are many other functions which occur naturally. We highlight two types here. First, given rational functions $\phi_1$, $\phi_2$, say
We will also show a bound of the type (1.4) for these rather wild functions.
The precise common feature of these examples is that they arise as linear combinations of Frobenius trace functions of certain ℓ-adic sheaves over the affine line $\mathbf{A}^1_{\mathbf{F}_p}$ (for some prime ℓ ≠ p). We therefore call these functions trace functions, and we give the precise definition below. To state our main result, it is enough for the moment to know that we can measure the complexity of a trace function modulo $p$ with a numerical invariant called its conductor cond(K). Our result is, roughly, that when cond(K) remains bounded, $K(n)$ does not correlate with Fourier coefficients of modular forms.
As a last step before stating our main result, we quantify the properties of the test function $V$ that we handle. Given $P > 0$ and $Q \ge 1$ real numbers, we define:
Remark 1.7. It is important to remark that this depends on (1.3), and thus this corollary does not hold for Eisenstein series. For the latter, one can define analogues of the trace norms which consider decompositions (1.9) with no additive characters.

Good functions and correlating matrices.
To deal with the level of generality we consider, it is beneficial at first to completely forget all the specific properties that $K$ might have, and to proceed abstractly. Therefore we consider the problem of bounding the sum $S_V(f, K; p)$ for $K : \mathbf{Z}/p\mathbf{Z} \to \mathbf{C}$ a general function, assuming only that we know that $|K(n)| \le M$ for some $M$ that we think of as fixed.
For the case of Dirichlet characters, Duke, Friedlander and Iwaniec [DFI93] amplified $K(n) = \chi(n)$ among characters with a fixed modulus. Given the absence of structure on $K$ in our situation, this strategy seems difficult to implement. Instead, we use an idea found in [CI00]: we consider $K$ "fixed", and consider the family of sums $S_V(g, K; p)$ for $g$ varying over a basis of modular cusp forms of level $Np$, viewing $f$ (suitably normalized) as an old form at $p$. Estimating the amplified second moment of $S_V(g, K; p)$ over that family by the Petersson-Kuznetsov formula and the Poisson formula, we ultimately have to confront some sums which we call correlation sums, which we now define.
We denote by $\hat{K}$ the (unitarily normalized) Fourier transform modulo $p$ of $K$, given by $\hat{K}(z) = \frac{1}{p^{1/2}} \sum_{x \ (\mathrm{mod}\ p)} K(x) e(zx/p)$.
For any field $L$, we let $\mathrm{GL}_2(L)$ and $\mathrm{PGL}_2(L)$ act on $\mathbf{P}^1(L) = L \cup \{\infty\}$ by fractional linear transformations as usual. Now for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2(\mathbf{F}_p)$ or in $\mathrm{PGL}_2(\mathbf{F}_p)$, we define the correlation sum $C(K; \gamma)$ by $C(K; \gamma) = \sum_{z \in \mathbf{F}_p,\ z \ne -d/c} \hat{K}(\gamma \cdot z) \overline{\hat{K}(z)}$ (1.10).
The matrices $\gamma$ which arise in our amplification are the reduction modulo $p$ of integral matrices parameterized by various coefficients from the amplifier, and we need the sums $C(K; \gamma)$ to be as small as possible.
If $\|K\|_\infty \le M$ (or even $\|K\|_2 \le M$), then the Cauchy-Schwarz inequality and the Parseval formula show that $|C(K; \gamma)| \le M^2 p$ (1.11). This bound is, unsurprisingly, insufficient. Our method is based on the idea that $C(K; \gamma)$ should be significantly smaller for most of the $\gamma$ which occur (even by a factor $p^{-1/2}$, according to the square-root cancellation philosophy) and that we can control the $\gamma$ where this cancellation does not occur. By this, we mean that the set of these matrices (which we call the set of correlation matrices) is nicely structured and rather small, unless $\hat{K}$ is constant, a situation which means that $K(n)$ is proportional to $e(an/p)$ for some $a \in \mathbf{Z}$, in which case we can use (1.3) anyway. In this paper, the structure we obtain is algebraic. To discuss it, we introduce the following notation concerning the algebraic subgroups of $\mathrm{PGL}_2$:
- we denote by $B \subset \mathrm{PGL}_2$ the subgroup of upper-triangular matrices, the stabilizer of $\infty \in \mathbf{P}^1$;
- we denote by $w = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ the Weyl element, so that $Bw$ (resp. $wB$) is the set of matrices mapping 0 to ∞ (resp. ∞ to 0);
- we denote by $\mathrm{PGL}_{2,\mathrm{par}}$ the subset of matrices in $\mathrm{PGL}_2$ which are parabolic, i.e., which have a single fixed point in $\mathbf{P}^1$;
- given $x \ne y$ in $\mathbf{P}^1$, the pointwise stabilizer of $x$ and $y$ is denoted $T^{x,y}$ (this is a maximal torus), and its normalizer in $\mathrm{PGL}_2$ (or the stabilizer of the set $\{x, y\}$) is denoted $N^{x,y}$.
$\bigcup_i (N^{x_i,y_i} - T^{x_i,y_i})(\mathbf{F}_p)$.
In other words: given $M \ge 1$ and $p$ a prime, a $p$-periodic function $K$ is $(p, M)$-good if the only matrices for which the estimate $|C(K; \gamma)| \le M p^{1/2}$ fails are either (1) upper-triangular or sending 0 to ∞ or ∞ to 0; or (2) parabolic; or (3) elements which permute two points defined by at most $M$ integral quadratic (or linear) equations. We note that if we fix such data, a "generic" matrix is not of this type.
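The correlation sums and the trivial bound (1.11) can be explored numerically for small primes. The following is a minimal sketch, not the paper's method: it assumes the normalization $\hat{K}(z) = p^{-1/2} \sum_x K(x) e(zx/p)$ and takes the correlation sum to be $\sum_{z \ne -d/c} \hat{K}(\gamma \cdot z) \overline{\hat{K}(z)}$. The function names and the choices $p = 101$, $K(n) = e(n^2/p)$ and $\gamma = (1, 2; 3, 4)$ are illustrative assumptions.

```python
import cmath
import math


def e(t):
    """Additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def fourier_transform(K, p):
    """Unitarily normalized Fourier transform modulo p (assumed normalization)."""
    return [
        sum(K[x] * e((z * x % p) / p) for x in range(p)) / math.sqrt(p)
        for z in range(p)
    ]


def correlation_sum(K_hat, p, gamma):
    """Sum of K_hat(gamma.z) * conj(K_hat(z)) over z in F_p with cz + d != 0."""
    a, b, c, d = gamma
    total = 0j
    for z in range(p):
        den = (c * z + d) % p
        if den == 0:
            continue  # gamma.z is the point at infinity, excluded from the sum
        gz = (a * z + b) * pow(den, -1, p) % p  # modular inverse, Python 3.8+
        total += K_hat[gz] * K_hat[z].conjugate()
    return total


p = 101  # illustrative small prime
K = [e((n * n % p) / p) for n in range(p)]  # |K(n)| = 1, so M = 1
K_hat = fourier_transform(K, p)

C_id = correlation_sum(K_hat, p, (1, 0, 0, 1))   # identity matrix
C_gen = correlation_sum(K_hat, p, (1, 2, 3, 4))  # an arbitrary invertible matrix

print("identity:", abs(C_id), " generic:", abs(C_gen), " sqrt(p):", math.sqrt(p))
```

For the identity the sum equals $\sum_z |\hat{K}(z)|^2 = p$ by Parseval, so the trivial bound is attained there; comparing the second value with $p^{1/2}$ gives a feel for the square-root cancellation discussed above.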
This notion has little content if $M$ is larger than $p^{1/2}$, but we will already present below some elementary examples of $(p, M)$-good functions, together with their sets of correlation matrices for $M$ fixed and $p$ arbitrarily large (not surprisingly, all these examples come from trace functions).
Given a $(p, M)$-good function $K$, we next show using counting arguments that the set of matrices $\gamma$ constructed from the amplifier does not intersect the set of correlating matrices in too large a set and we eventually obtain our main technical result:
Theorem 1.9 (Bounds for good twists). Let $f$ be a Hecke eigenform, $p$ be a prime number and $V$ a function satisfying $(V(C, P, Q))$. Let $M \ge 1$ be given, and let $K$ be a $(p, M)$-good function modulo $p$ with $\|K\|_\infty \le M$.
There exists $s \ge 1$ absolute such that for any $\delta < 1/8$, where the implied constant depends only on $(C, f, \delta)$.
Remark 1.10. Although it is an elementary step [compare (5.14) and (5.15) in the proof], the beautiful modular interpretation of correlation sums is a key observation for this paper. It gives a group theoretic interpretation and introduces symmetry into sums, the estimation of which might otherwise seem to be hopeless.

Trace functions of ℓ-adic sheaves.
The class of functions to which we apply these general considerations are the trace functions modulo $p$, which we now define formally. Let $p$ be a prime number and ℓ ≠ p an auxiliary prime. The functions $K(x)$ modulo $p$ that we consider are the trace functions of suitable constructible sheaves on $\mathbf{A}^1_{\mathbf{F}_p}$ evaluated at $x \in \mathbf{F}_p$. To be precise, we will consider ℓ-adic constructible sheaves on $\mathbf{A}^1_{\mathbf{F}_p}$. The trace function of such a sheaf $\mathcal{F}$ takes values in an ℓ-adic field so we also fix an isomorphism $\iota : \bar{\mathbf{Q}}_\ell \to \mathbf{C}$, and we consider the functions of the shape
(1), the restriction of $\mathcal{F}$ to $U$ is geometrically isotypic when seen as a representation of the geometric fundamental group of $U$: it is the direct sum of several copies of some (necessarily non-trivial) irreducible representation of the geometric fundamental group of $U$ (see [Kat88, §8.4]).
If F is geometrically irreducible (instead of being geometrically isotypic), the sheaf will be called an irreducible trace sheaf.
We use similar terminology for the trace functions:
Definition 1.12 (Trace function). Let $p$ be a prime number. A $p$-periodic function $K(n)$ defined for $n \ge 1$, seen also as a function on $\mathbf{F}_p$, is a trace function (resp. Fourier trace function, isotypic trace function) if there is some trace sheaf (resp. Fourier trace sheaf, resp. isotypic trace sheaf) $\mathcal{F}$ on $\mathbf{A}^1_{\mathbf{F}_p}$ such that $K$ is given by (1.14).
We need an invariant to measure the geometric complexity of a trace function, which may be defined in greater generality.
With these definitions, our third main result, which together with Theorem 1.9 immediately implies Theorem 1.2, is very simple to state:
Theorem 1.14 (Trace functions are good). Let $p$ be a prime number, $N \ge 1$ and $\mathcal{F}$ an isotypic trace sheaf on $\mathbf{A}^1_{\mathbf{F}_p}$, with conductor $\le N$. Let $K$ be the corresponding isotypic trace function. Then $K$ is $(p, aN^s)$-good for some absolute constants $a \ge 1$ and $s \ge 1$.
(1) This sweeping result encompasses the functions (1.6) and (1.7) and a wide range of algebraic exponential sums, as well as point-counting functions for families of algebraic varieties over finite fields. From our point of view, the uniform treatment of trace functions is one of the main achievements in this paper. In fact our results can be read as much as being primarily about trace functions, and not Fourier coefficients of modular forms. Reviewing the literature, we have, for instance, found several fine works in analytic number theory that exploit bounds on exponential sums which turn out to be special cases of the correlation sums (1.10) (see [FI85,Hea86,Iwa90,Pit95,Mun13]). Recent works of the authors confirm the usefulness of this notion (see [FKM14,FKM]).
(2) Being isotypic is of course not stable under direct sum, but using Jordan-Hölder components, any Fourier trace function can be written as a sum (with nonnegative integral multiplicities) of isotypic trace functions, which allows us to extend many results to general trace functions (see Corollary 1.6).

The ℓ-adic Fourier transform and the Fourier-Möbius group.
We now recall the counterpart of the Fourier transform at the level of sheaves, which was discovered by Deligne and developed especially by Laumon [Lau87]. This plays a crucial role in our work.
Fix a non-trivial additive character $\psi$ of $\mathbf{F}_p$ with values in $\bar{\mathbf{Q}}_\ell$. For any Fourier sheaf $\mathcal{F}$ on $\mathbf{A}^1$, we denote by $\mathcal{G}_\psi = \mathrm{FT}_\psi(\mathcal{F})(1/2)$ its (normalized) Fourier transform sheaf, where the Tate twist is always defined using the choice of square root of $p$ in $\bar{\mathbf{Q}}_\ell$ which maps to $\sqrt{p} > 0$ under the fixed isomorphism $\iota$ (which we denote $\sqrt{p}$ or $p^{1/2}$). We will sometimes simply write $\mathcal{G}$, although one must remember that this depends on the choice of the character $\psi$. Then $\mathcal{G}$ is another Fourier sheaf, such that
for any $y \in \mathbf{F}_p$ (see [Kat90, Th. 7.3.8, (4)]).
In particular, if $K$ is given by (1.14) and $\psi$ is such that
for $x \in \mathbf{F}_p$ (we will call such a $\psi$ the "standard character" relative to $\iota$), then we have $\iota((\mathrm{tr}\, \mathcal{G})(\mathbf{F}_p, y)) = -\hat{K}(y)$ (1.16) for $y$ in $\mathbf{Z}$.
A key ingredient in the proof of Theorem 1.14 is the following geometric analogue of the set of correlation matrices:
Definition 1.16 (Fourier-Möbius group). Let $p$ be a prime number, and let $\mathcal{F}$ be an isotypic trace sheaf on $\mathbf{A}^1_{\mathbf{F}_p}$, with Fourier transform $\mathcal{G}$ with respect to $\psi$. The Fourier-Möbius group $\mathbf{G}_{\mathcal{F}}$ is the subgroup of $\mathrm{PGL}_2(\mathbf{F}_p)$ defined by
The crucial feature of this definition is that $\mathbf{G}_{\mathcal{F}}$ is visibly a group (it is in fact even an algebraic subgroup of $\mathrm{PGL}_{2,\mathbf{F}_p}$, as follows from constructibility of higher direct image sheaves with compact support, but we do not need this in this paper; it is however required in the sequel [FKM14]). The fundamental step in the proof of Theorem 1.14 is the fact that, for $\mathcal{F}$ of conductor $\le M$, the set $\mathbf{G}_{K,M}$ of correlation matrices is, for $p$ large enough in terms of $M$, a subset of $\mathbf{G}_{\mathcal{F}}$. This will be derived from the Riemann Hypothesis over finite fields in its most general form (see Corollary 9.2).

Basic examples.
We present here four examples where G K,M can be determined "by hand", though sometimes this may require Weil's results on exponential sums in one variable or even optimal bounds on exponential sums in three variables. This already gives interesting examples of good functions.
Dually, we may consider the function $K(n) = p^{1/2} \delta_{n \equiv u \ (\mathrm{mod}\ p)}$ for some fixed $u \in \mathbf{F}_p$, for which the Fourier transform is $\hat{K}(v) = e(uv/p)$. Then we get $C(K; \gamma) = \sum_{z \ne -d/c} e\big(u(\gamma \cdot z - z)/p\big)$. If $u = 0$, this sum is $\ge p - 1$ for every $\gamma$ and for $1 \le M < p^{1/2} - 1$, the function $K$ is not $(p, M)$-good. For $u \ne 0$, we get $|C(K; \gamma)| = p$ if $a - d = c = 0$, $C(K; \gamma) = 0$ if $a - d \ne 0$ and $c = 0$ and otherwise, the sum is a Kloosterman sum so that $|C(K; \gamma)| \le 2p^{1/2}$, by Weil's bound. In particular, for $M \ge 3$ and $p$ such that $p > 3\sqrt{p}$,
Thus $K$ is $(p, 3)$-good for all $p \ge 17$.
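(2) Recall that the classical Kloosterman sums are defined by $S(e, f; q) = \sum_{x \in (\mathbf{Z}/q\mathbf{Z})^\times} e\big((ex + f\bar{x})/q\big)$ for $q \ge 1$ an integer and $e, f \in \mathbf{Z}$.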
We consider $K(n) = S(1, n; p)/\sqrt{p}$ for $1 \le n \le p$. By Weil's bound for Kloosterman sums, we have $|K(n)| \le 2$ for all $n$. We get $\hat{K}(v) = 0$ for $v = 0$ and
where $\sum^{*}$ restricts the sum to those $z \notin \{0, -d/c, -b/a\}$ in $\mathbf{F}_p$. According to the results of Weil, we have $|C(K; \gamma)| \le 2p^{1/2}$ unless the rational function
is of the form $\phi(X)^p - \phi(X) + t$ for some constant $t \in \mathbf{F}_p$ and $\phi \in \mathbf{F}_p(X)$ (and of course, in that case the sum has modulus $\ge p - 3$). Looking at poles we infer that in that latter case $\phi$ is necessarily constant. Therefore, for $M \ge 3$ and $p$ such that $p - 3 > 3\sqrt{p}$, the set $\mathbf{G}_{K,M}$ is the set of $\gamma$ for which (1.17) is a constant. A moment's thought then shows that
Thus $K$ is $(p, 3)$-good for all $p \ge 17$.
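The Weil bound $|K(n)| \le 2$ for the normalized Kloosterman sums used in example (2) can be checked directly for a small prime. The sketch below assumes the classical definition $S(e, f; q) = \sum_{x \in (\mathbf{Z}/q\mathbf{Z})^\times} e((ex + f\bar{x})/q)$; the function name `kloosterman` and the prime $p = 103$ are illustrative choices, not taken from the paper.

```python
import cmath
import math


def e(t):
    """Additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def kloosterman(e_, f_, q):
    """Classical Kloosterman sum S(e_, f_; q) for a modulus q >= 2."""
    total = 0j
    for x in range(1, q):
        if math.gcd(x, q) != 1:
            continue  # only invertible residues contribute
        x_inv = pow(x, -1, q)  # modular inverse, Python 3.8+
        total += e(((e_ * x + f_ * x_inv) % q) / q)
    return total


p = 103  # illustrative small prime
# K(n) = S(1, n; p) / sqrt(p) for 1 <= n <= p
K = {n: kloosterman(1, n, p) / math.sqrt(p) for n in range(1, p + 1)}

max_modulus = max(abs(v) for v in K.values())
max_imaginary = max(abs(v.imag) for v in K.values())
print("max |K(n)| =", max_modulus, " max |Im K(n)| =", max_imaginary)
```

The sums are real because $x \mapsto -x$ permutes the invertible residues and conjugates each term, so the second printed value should be at the level of rounding error.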

(3) Let $K(n) = e(n^2/p)$. For $p$ odd, we get $\hat{K}(v) = \frac{\tau_p}{p^{1/2}} e(-\bar{4} v^2/p)$ by completing the square, where $\tau_p$ is the quadratic Gauss sum. Since $|\tau_p|^2 = p$, we find for $\gamma \in \mathrm{PGL}_2(\mathbf{F}_p)$ as above the formula
For $p \ge 3$, Weil's theory shows that $|C(K; \gamma)| \le 2p^{1/2}$ for all $\gamma$ such that the rational function
is not constant and otherwise $|C(K; \gamma)| \ge p - 1$. Thus for $M \ge 2$ and $p \ge 7$ (when $p - 1 > 2p^{1/2}$), the set $\mathbf{G}_{K,M}$ is the set of $\gamma$ for which this function is constant: this requires $c = 0$ (the second term cannot have a pole), and then we get the conditions $b = 0$ and $(a/d)^2 = 1$, so that
Thus that function $K$ is $(p, 2)$-good for all primes $p \ge 7$.
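The Fourier transform in example (3) follows from completing the square: $x^2 + vx = (x + \bar{2}v)^2 - \bar{4}v^2$ in $\mathbf{F}_p$, which suggests $\hat{K}(v) = (\tau_p / p^{1/2})\, e(-\bar{4}v^2/p)$ with $\tau_p = \sum_x e(x^2/p)$. The following sketch checks this identity and $|\tau_p|^2 = p$ numerically, assuming the normalization $\hat{K}(v) = p^{-1/2} \sum_x K(x) e(vx/p)$; the prime $p = 13$ and all variable names are illustrative.

```python
import cmath
import math


def e(t):
    """Additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


p = 13  # illustrative odd prime

# Quadratic Gauss sum tau_p = sum over x mod p of e(x^2 / p)
tau = sum(e((x * x % p) / p) for x in range(p))

# Inverse of 4 modulo p (Python 3.8+)
inv4 = pow(4, -1, p)

max_error = 0.0
for v in range(p):
    # Direct computation of the normalized Fourier transform of K(n) = e(n^2/p)
    direct = sum(e(((x * x + v * x) % p) / p) for x in range(p)) / math.sqrt(p)
    # Closed form obtained by completing the square
    closed = tau / math.sqrt(p) * e(((-inv4 * v * v) % p) / p)
    max_error = max(max_error, abs(direct - closed))

print("|tau_p|^2 =", abs(tau) ** 2, " max deviation =", max_error)
```

In particular $|\hat{K}(v)| = 1$ for every $v$, which is why the correlation sum in this example reduces to an exponential sum of a rational function.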
(4) Let $K(n) = \chi(n)$ where $\chi$ is a non-trivial Dirichlet character modulo $p$. Then we have $\hat{K}(v) = \bar{\chi}(v)\, \tau(\chi)/p^{1/2}$, where $\tau(\chi)$ is the Gauss sum associated to $\chi$. Then for $\gamma$ as above, we have
Again from Weil's theory, we know that $|C(K; \gamma)| \le 2p^{1/2}$ unless the rational function
is of the form $tP(X)^h$ for some $t \in \mathbf{F}_p$ and $P \in \mathbf{F}_p(X)$, where $h \ge 2$ is the order of $\chi$ (and in that case, the sum has modulus $\ge p - 3$). This means that for $M \ge 2$, and $p \ge 11$, the set $\mathbf{G}_{K,M}$ is the set of those $\gamma$ where this condition is true. Looking at the order of the zero or pole at 0, we see that this can only occur if either $b = c = 0$ (in which case the function is the constant $da^{-1}$) or, in the special case $h = 2$, when $a = d = 0$ (and the function is $cb^{-1}X^2$). In other words, for $p \ge 11$ and $M \ge 2$, we have
if $\chi$ is real-valued. In both cases, these matrices are all in $B(\mathbf{F}_p) \cup B(\mathbf{F}_p)w$, so that the function $\chi(n)$ is $(p, 2)$-good, for all $p \ge 11$.
1.6 Notation. As usual, $|X|$ denotes the cardinality of a set, and we write $e(z) = e^{2i\pi z}$ for any $z \in \mathbf{C}$. If $a \in \mathbf{Z}$ and $n \ge 1$ are integers and $(a, n) = 1$, we sometimes write $\bar{a}$ for the inverse of $a$ in $(\mathbf{Z}/n\mathbf{Z})^\times$; the modulus $n$ will always be clear from context. We write $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$.
where $X$ is an arbitrary set on which $f$ is defined, we mean synonymously that there exists a constant $C \ge 0$ such that $|f(x)| \le Cg(x)$ for all $x \in X$. The "implied constant" refers to any value of $C$ for which this holds. It may depend on the set $X$, which is usually specified explicitly, or clearly determined by the context. We write $f(x) \asymp g(x)$ to mean $f \ll g$ and $g \ll f$. The notation $n \sim N$ means that the integer $n$ satisfies the inequalities $N < n \le 2N$. We denote the divisor function by $d(n)$.
Concerning sheaves, for $a \ne 0$, we will write $[\times a]^{*}\mathcal{F}$ for the pullback of a sheaf $\mathcal{F}$ on $\mathbf{P}^1$ under the map $x \mapsto ax$.
For a sheaf F on P 1 /k, where k is an algebraic closure of a finite field, and x ∈ P 1 , we write F(x) for the representation of the inertia group at x on the geometric generic fiber of F, and F x for the stalk of F at x.
For $\mathcal{F}$ a sheaf on $\mathbf{P}^1/k$, where now $k$ is a finite field of characteristic $p$, and for $\nu$ an integer or ±1/2, we also write $\mathcal{F}(\nu)$ for the Tate twist of $\mathcal{F}$, with the normalization of the half-twist as discussed in Section 1.4 using the underlying isomorphism $\iota : \bar{\mathbf{Q}}_\ell \to \mathbf{C}$. From context, there should be no confusion between the two possible meanings of the notation $\mathcal{F}(x)$.

Some Applications
2.1 Proof of Corollary 1.4. We explain here how to derive bounds for sums over intervals with sharp cut-offs from our main results.
Taking differences, it is sufficient to prove the following slightly more precise bound: for any $\delta < 1/8$ and any $1 \le X \le p$, we have $\sum_{1 \le n \le X} \varrho_f(n) K(n) \ll_{\mathrm{cond}(K), f, \delta} X^{3/4} p^{1/4 - \delta/2}$, since the right-hand side is always $\le p^{1 - \delta/2}$.
Remark 2.1. Observe that, by taking $\delta$ close enough to 1/8, we obtain here a stronger bound than the "trivial" estimate of size $\ll_{\mathrm{cond}(K), f} X$ coming from (1.1), as long as $X \ge p^{3/4 + \eta}$ for some $\eta > 0$.
By a dyadic decomposition it is sufficient to prove that for 1 X p/2, we have for any δ < 1/8. We may assume that for any j 0. Then, provided ΔX p 3/5 , we deduce from (1.1) that where the implied constant depends only on f . By Theorem 1.2 applied to V (x) = W (px/X) with Q = Δ −1 > 2 and P = X/p 1, we have for any δ < 1/8 where the implied constant depends on f , cond(K) and δ. Hence we derive so the above inequality applies to give as we wanted.

Characters and Kloosterman sums.
We first spell out the examples of the introduction involving the functions (1.6) and (1.7). We give the proof now to illustrate how concise it is given our results, referring to later sections for some details.
Proof. The first case follows directly from Theorem 1.2 if $\phi_1$ and $\phi_2$ satisfy the assumption of Theorem 10.1. Otherwise we have $K(n) = e\big(\frac{an+b}{p}\big)$ and the bound follows from (1.3).
In the second case, we claim that $\|K\|_{\mathrm{tr},s} \ll 1$, where the implied constant depends only on $(m, \phi, \Phi)$, so that Corollary 1.6 applies. Indeed, the triangle inequality shows that we may assume that $\Phi(U, V) = U^u V^v$ is a non-constant monomial. Let $\mathcal{K}_{m,\phi}$ be the hyper-Kloosterman sheaf discussed in §10.3, and $\widetilde{\mathcal{K}}_{m,\phi}$ its dual. We consider the sheaf of rank $m^{u+v}$ given by
We have $\mathrm{cond}(\mathcal{F}) \le 5^{\alpha_{u+v}} (2m + 1 + \deg(RS))^{\beta_{u+v}}$ by combining Proposition 10.3 and 8.2 (3) for some constants $\alpha_n$ and $\beta_n$ (determined by $\alpha_0 = 0$, $\alpha_{n+1} = 2\alpha_n + 1$, $\beta_0 = 1$, $\beta_{n+1} = 2\beta_n + 2$; note that this rough bound could be improved easily). We replace $\mathcal{F}$ by its semisimplification (without changing notation), and we write
where $\mathcal{F}_2$ is the direct sum of the irreducible components of $\mathcal{F}$ which are geometrically isomorphic to Artin-Schreier sheaves $\mathcal{L}_\psi$, and $\mathcal{F}_1$ is the direct sum of the other components. The trace function $K_2$ of $\mathcal{F}_2$ is a sum of at most $m^{u+v}$ additive characters (times complex numbers of modulus 1) so
On the other hand, each geometrically isotypic component of $\mathcal{F}_1$ has conductor bounded by that of $\mathcal{F}$, and therefore $\|K_1\|_{\mathrm{tr},s} \ll (5m)^{u+v} (2m + 1 + \deg(RS))^{2s(u+v)}$ (compare with Proposition 8.3).
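The constants $\alpha_n$ and $\beta_n$ in the conductor bound are determined by the recursions $\alpha_0 = 0$, $\alpha_{n+1} = 2\alpha_n + 1$ and $\beta_0 = 1$, $\beta_{n+1} = 2\beta_n + 2$, which solve to $\alpha_n = 2^n - 1$ and $\beta_n = 3 \cdot 2^n - 2$. The short sketch below iterates the recursions and evaluates the resulting bound $5^{\alpha_n} (2m + 1 + \deg(RS))^{\beta_n}$; the function names and the sample values of $m$, $\deg(RS)$ and $n = u + v$ are illustrative assumptions.

```python
def conductor_exponents(n):
    """Iterate alpha_{k+1} = 2*alpha_k + 1, beta_{k+1} = 2*beta_k + 2 from (0, 1)."""
    alpha, beta = 0, 1
    for _ in range(n):
        alpha, beta = 2 * alpha + 1, 2 * beta + 2
    return alpha, beta


def conductor_bound(m, deg_RS, n):
    """Evaluate 5^alpha_n * (2m + 1 + deg(RS))^beta_n for n = u + v."""
    alpha, beta = conductor_exponents(n)
    return 5 ** alpha * (2 * m + 1 + deg_RS) ** beta


# The closed forms agree with the recursion for small n.
for n in range(8):
    assert conductor_exponents(n) == (2 ** n - 1, 3 * 2 ** n - 2)

# Illustrative values: m = 2, deg(RS) = 1, monomial U^1 V^1 so n = u + v = 2.
sample_bound = conductor_bound(2, 1, 2)
print(conductor_exponents(2), sample_bound)
```

The doubly exponential growth in $u + v$ makes concrete why the text calls this bound rough.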

Distribution of twisted Hecke orbits and horocycles.
We present here a geometric consequence of our main result. Let $Y_0(N)$ denote the modular curve $\Gamma_0(N) \backslash \mathbf{H}$. For a prime $p$ coprime to $N$, we denote by $\tilde{T}_p$ the geometric Hecke operator that acts on complex-valued functions $f$ defined on $Y_0(N)$ by the formula
[note that this differs from the usual Hecke operator $T_p = (p + 1)p^{-1/2}\tilde{T}_p$ acting on Maass forms, defined in (3.2)].
As we will also recall more precisely in Section 3, the $L^2$-space has a basis consisting of $\tilde{T}_p$-eigenforms $f$, which are either constant functions, Maass cusp forms or combinations of Eisenstein series, with eigenvalues $\nu_f(p)$ such that
for some absolute constant $\theta < 1/2$ (e.g., one can take $\theta = 7/64$ by the work of Kim and Sarnak [KS03]). This bound implies the well-known equidistribution of the Hecke orbits $\{\gamma_t \cdot \tau\}$ for a fixed $\tau \in Y_0(N)$, as $p$ tends to infinity. Precisely, let
Note that all but one point of the Hecke orbit lie on the horocycle at height $\Im(\tau)/p$ in $Y_0(N)$ which is the image of the segment $x + i\Im(\tau)/p$ where $0 \le x \le 1$, so this can also be considered as a statement on equidistribution of discrete points on such horocycles.
We can then consider a variant of this question, which is suggested by the natural parameterization of the Hecke orbit by the F p -rational points of the projective line. Namely, given a complex-valued function which is now a (finite) signed measure on Y 0 (N ). We call these "algebraic twists of Hecke orbits", and we ask how they behave when p is large. For instance, K could be a characteristic function of some subset A p ⊂ F p , and we would be attempting to detect whether the subset A p is somehow biased in such a way that the corresponding fragment of the Hecke orbit always lives in a certain corner of the curve Y 0 (N ). We will prove that, when 1 Ap can be expressed or approximated by a linear combination of the constant function 1 and trace functions with bounded conductors, this type of behavior is forbidden. For instance if A p = (p) is the set of quadratic residues modulo p one has where it is pointed out that it is intimately related to the Burgess bound for short character sums and to subconvexity bounds for Dirichlet L-functions of real characters and twists of modular forms by such characters.
Our result is the following:
Then, for any given $\tau \in \mathbf{H}$, and $I_p$ such that $|I_p| \ge p^{1-\delta}$ for some fixed $\delta < 1/8$, the measures $\mu_{K_p, I_p, \tau}$ converge to 0 as $p \to +\infty$.
Here is a simple application where we twist the Hecke orbit by putting a multiplicity on the γ t corresponding to the value of a polynomial function on F p .
Corollary 2.4 (Polynomially-twisted Hecke orbits). Let $\phi \in \mathbf{Z}[X]$ be an arbitrary non-constant polynomial. For any $\tau \in Y_0(N)$ and any interval of length $|I_p| \ge p^{1-\delta}$ for some fixed $\delta < 1/8$, the sequence of measures
values of $\phi$ has positive density in $\mathbf{F}_p$ for $p$ large, but the limsup of the density $|A_p|/p$ is usually strictly less than 1. The statement means, for instance, that the points of the Hecke orbit of $\tau$ parameterized by $A_p$ cannot be made to almost all lie in some fixed "half" of $Y_0(N)$, when $\phi$ is fixed.
These results could also be interpreted in terms of equidistribution of weighted $p$-adic horocycles; similar questions have been studied in different contexts for rather different weights in [Str04, Ven10, SU] (e.g., for short segments of horocycles). Also, as pointed out by P. Sarnak, the result admits an elementary interpretation in terms
It is well-known that the non-trivial bound (2.2) implies the equidistribution of
with respect to the Haar measure on $\mathrm{SL}_2(\mathbf{R})$ (see [Sar91] for much more general statements). Now, any matrix $\gamma \in M_2^{(p)}(\mathbf{Z})$ defines a non-zero singular matrix modulo $p$ and determines a point $z(\gamma)$ in $\mathbf{P}^1(\mathbf{F}_p)$, which is defined as the kernel of this matrix (e.g. $z(\gamma_t) = -t$). By duality, our results imply the following refinement: for any non-constant polynomial $\phi \in \mathbf{Z}[X]$, the subsets

Trace functions over the primes.
In the paper [FKM14], we build on our results and on further ingredients to prove the following statement:
Theorem 2.5. Let $K$ be an isotypic trace function modulo $p$, associated to a sheaf $\mathcal{F}$ with conductor $\le M$, and such that $\mathcal{F}$ is not geometrically isomorphic to a direct sum of copies of a tensor product $\mathcal{L}_{\chi(X)} \otimes \mathcal{L}_{\psi(X)}$ for some multiplicative character $\chi$ and additive character $\psi$. Then for any $X \ge 1$, we have
for any $\eta < 1/48$. The implicit constants depend only on $\eta$ and $M$. Moreover, the dependency on $M$ is at most polynomial.
These bounds are non-trivial as long as X p 3/4+ε for some ε > 0, and for X p, we save a factor ε p 1/48−ε over the trivial bound. In other terms, trace functions of bounded conductor do not correlate with the primes or the Möbius function when X p 3/4+ε .
This theorem itself has many applications when specialized to various functions. We refer to [FKM14] for these.

Review of Kuznetsov formula.
We review here the formula of Kuznetsov which expresses averages of products of Fourier coefficients of modular forms in terms of sums of Kloosterman sums. The version we will use here is taken mostly from [BHM07], though we use a slightly different normalization of the Fourier coefficients.
3.1.1 Hecke eigenbases. Let $q \ge 1$ be an integer, $k \ge 2$ an even integer. We denote by $S_k(q)$, $L^2(q)$ and $L^2_0(q) \subset L^2(q)$, respectively, the Hilbert spaces of holomorphic cusp forms of weight $k$, of Maass forms and of Maass cusp forms of weight $k = 0$, level $q$ and trivial Nebentypus (which we denote $\chi_0$), with respect to the Petersson norm defined by
where $k_g$ is the weight for $g$ holomorphic and $k_g = 0$ if $g$ is a Maass form. These spaces are endowed with the action of the (commutative) algebra $\mathbf{T}$ generated by the Hecke operators $\{T_n \mid n \ge 1\}$, where
where $k_g = 0$ if $g \in L^2(q)$ and $k_g = k$ if $g \in S_k(q)$ (compare with the geometric operator $\tilde{T}_p$ of Section 2.3). Moreover, the operators $\{T_n \mid (n, q) = 1\}$ are self-adjoint, and generate a subalgebra denoted $\mathbf{T}^{(q)}$. Therefore, the spaces $S_k(q)$ and $L^2_0(q)$ have an orthonormal basis made of eigenforms of $\mathbf{T}^{(q)}$ and such a basis can be chosen to contain all $L^2$-normalized Hecke newforms (in the sense of Atkin-Lehner theory). We denote such bases by $B_k(q)$ and $B(q)$, respectively, and in the remainder of this paper, we tacitly assume that any basis we select satisfies these properties.
The orthogonal complement to $L^2_0(q)$ in $L^2(q)$ is spanned by the Eisenstein spectrum $\mathcal{E}(q)$ and the one-dimensional space of constant functions. The space $\mathcal{E}(q)$ is continuously spanned by a "basis" of Eisenstein series indexed by some finite set which is usually taken to be the set $\{\mathfrak{a}\}$ of cusps of $\Gamma_0(q)$. It will be useful for us to employ another basis of Eisenstein series formed of Hecke eigenforms: the adelic reformulation of the theory of modular forms provides a natural spectral expansion of the Eisenstein spectrum in which the Eisenstein series are indexed by a set of parameters of the form
where $\chi$ ranges over the characters of modulus $q$ and $B(\chi)$ is some finite (possibly empty) set depending on $\chi$ (specifically, $B(\chi)$ corresponds to an orthonormal basis in the space of the principal series representation induced from the pair $(\chi, \chi)$, but we need not be more precise). With this choice, the spectral expansion for $\psi \in \mathcal{E}(q)$ can be written
where the Eisenstein series $E_{\chi,g}(t)$ is itself a function from $\mathbf{H}$ to $\mathbf{C}$. When needed, we denote its value at $z \in \mathbf{H}$ by $E_{\chi,g}(z, t)$.
The main advantage of these Eisenstein series is that they are Hecke eigenforms for T (q) : for (n, q) = 1, one has

Multiplicative and boundedness properties of Hecke eigenvalues.
Let f be any Hecke eigenform of T (q) , and let λ f (n) denote the corresponding eigenvalue for T n , which is real. Then for (mn, q) = 1, we have This formula (3.4) is valid for all m, n if f is an eigenform for all of T, with an additional multiplicative factor χ 0 (d) in the sum.
We recall some bounds satisfied by the Hecke eigenvalues. First, if f belongs to B k (q) (i.e., is holomorphic) or is an Eisenstein series E χ,f (t), then we have the Ramanujan-Petersson bound for any ε > 0. For f ∈ B(q), this is not known, but we will be able to work with suitable averaged versions, precisely with the second and fourth-power averages of Fourier coefficients. First, we have for any x 1 (see, e.g., [KRW07,(3.3), (3.4)]).

Hecke eigenvalues and Fourier coefficients
For z = x + iy ∈ H, we write the Fourier expansion of a modular form f as follows: where 1/4 + t 2 f is the Laplace eigenvalue, and is a Whittaker function (precisely, it is denoted W 0,itf in [DFI02,§4]; see also [GR94,9.222.2,9.235.2].) When f is a Hecke eigenform, there is a close relationship between the Fourier coefficients of f and its Hecke eigenvalues λ f (n): for (m, q) = 1 and any n 1, we have and moreover, these relations hold for all m, n if f is a newform, with an additional factor χ 0 (d).
In particular, for (m, q) = 1, we have (3.13) (3.14) be Bessel transforms. Then for positive integers m, n we have the following trace formula due to Kuznetsov:

Choice of the test function.
For the proof of Theorem 1.9, we will need a function φ in Kuznetsov formula such that the transformsφ(k) andφ(t) are nonnegative for k ∈ 2N >0 and t ∈ R ∪ (−i/4, i/4). Such φ is obtained as a linear combination of the following explicit functions. For 2 b < a two odd integers, we take (3.18) Notice that if we have the freedom to choose a and b very large, we can ensure that the Bessel transforms of φ a,b decay faster than the inverse of any fixed polynomial at infinity.

The Amplification Method
4.1 Strategy of the amplification. We prove Theorem 1.9 using the amplification method ; precisely we will embed f in the space of forms of level pN (a technique used very successfully by Iwaniec in various contexts [Iwa87,CI00]), as well as by others [Byk98], [BH08]. The specific implementation of amplification (involving the full spectrum, even for a holomorphic form f ) is based on [BHM07].
We consider an automorphic form $f$ of level $N$, which is either a Maass form with Laplace eigenvalue $1/4 + t_f^2$, or a holomorphic modular form of even weight $k_f \ge 2$, and which is an eigenform of all Hecke operators $T_n$ with $(n, pN) = 1$.
By viewing $f$ as being of level 2 or 3 if $N = 1$, we can assume that $N \ge 2$, which will turn out to be convenient at some point of the later analysis. We will also assume that $f$ is $L^2$-normalized with respect to the Petersson inner product (3.1).
Finally, we can also assume that p > N, hence p is coprime with N . We will also assume that p is sufficiently large with respect to f and ε.
The form f is evidently a cusp form with respect to the smaller congruence subgroup Γ 0 (pN ) and the function (4.1) may therefore be embedded in a suitable orthonormal basis of modular cusp forms of level q = pN , either B(q) or B kf (q). Let a > b 2 be odd integers, to be chosen later (both will be taken to be large), let φ = φ a,b be the function (3.17) defined in section 3.2. We define "amplified" second moments of the sums S(g, K; p), where g runs over suitable bases of B(q) and B kf (q). Precisely, given L 1 and any coefficients (b ) defined for 2L and supported on ∼ L, and any modular form h, we define an amplifier B(h) by We will also use the notation for χ a Dirichlet character modulo N and g ∈ B(χ).
for any even integer k 2. We will show: Proposition 4.1 (Bounds for the amplified moment). Assume that M 1 is such that K is (p, M )-good. Let V be a smooth compactly supported function satisfying Condition (V (C, P, Q)). Let (b ) be arbitrary complex numbers supported on primes ∼ L, such that |b | 2 for all . For any ε > 0 there exist k(ε) 2, such that for any k k(ε) and any integers a > b > 2 satisfying provided that The implied constants depend on (C, ε, a, b, k, f ).
We will prove Proposition 4.1 in Sections 5 and 6, but first we show how to exploit it to prove the main result.
From now on, we omit the fixed test-function V and use the simplified notation $S_V(f, K; p) = S(f, K; p)$. Also (and because we will need the letter C for another variable), we fix the sequence $C = (C_\nu)_\nu$ and we will not mention the dependency on C in our estimates. (note the use of Hecke eigenvalues, and not Fourier coefficients, here). With this choice, the pointwise bound $|b_\ell| \leq 1$ is obvious, and on average we get $\sum_{\ell \sim L} |b_\ell| \leq \pi(2L) \leq 2L$.
Moreover, for L large enough in terms of f and L < p, we have where the implied constant depends on f. Indeed, we have which we bound from below by writing Thus by (3.7), we have (4.9) Now we apply Proposition 4.1 for this choice. We recall from (3.19) that we have in the second and third terms of the sum defining M(L), while for $k \geq 2$, even, we have where the implied constant depends on (f, ε). Now all the terms of the left-hand side of the equality (4.10) are non-negative. Applying positivity and recalling (4.1), we obtain and hence
$|S(f, K; p)|^2 \ll \{ p^{2+\varepsilon} P(P+Q)/L + p^{3/2+\varepsilon} L P Q^2 (P+Q) M^3 \} (\log L)^6$ (4.11)
by (4.8), where the implied constant depends on (f, ε). We let for arbitrarily small ε > 0 so that (4.6) is satisfied. Therefore, if L is sufficiently large depending on f, we obtain On the other hand, if $L \ll_f 1$, we have $Q \gg_f \frac{1}{2} p^{1/4-\varepsilon}$, and the estimate (4.13) is trivial. Thus we obtain Theorem 1.9.
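The choice of L can be illustrated by a small exponent computation. The following Python sketch is not taken from the paper: it assumes that, once P, Q, M, the logarithms and ε are ignored, the two terms of (4.11) have sizes $p^2/L$ and $p^{3/2}L$, writes $L = p^\theta$, and searches a grid for the θ that minimizes the resulting exponent of p in the bound for |S(f, K; p)|. The function name `saving_exponent` and the grid are illustrative choices.

```python
from fractions import Fraction

def saving_exponent(theta):
    """Exponent e such that |S| << p^e when L = p^theta.

    Assumption (simplified model): |S|^2 << p^2 / L + p^(3/2) * L,
    ignoring P, Q, M, logarithmic factors and epsilon.
    """
    first = 2 - theta                  # exponent of the term p^2 / L
    second = Fraction(3, 2) + theta    # exponent of the term p^(3/2) * L
    return max(first, second) / 2      # take the square root of |S|^2

# Search theta in {0, 1/100, ..., 1/2} with exact rational arithmetic.
grid = [Fraction(k, 100) for k in range(0, 51)]
best_theta = min(grid, key=saving_exponent)
best_exponent = saving_exponent(best_theta)

print(best_theta, best_exponent, 1 - best_exponent)
```

In this simplified model the two terms balance at θ = 1/4, which matches the shape $L \approx p^{1/4}$ used above and the Burgess-type saving δ < 1/8.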
Remark 4.2. In [FKM14, p. 1707], we quote a slightly different choice of L. This was due to a minor slip in the proof of (4.11) in the first draft of this paper, which is corrected above. Using the value (4.12) in [FKM14] does not affect any of the main results of that paper.
where χ is a Dirichlet character of modulus N, g ∈ B(χ) and ϕ is some smooth compactly supported function. We have the following: There exists an absolute constant $s \geq 1$ such that for any δ < 1/8, where the implied constant depends only on (N, δ, ϕ).
Proof. Let $T \geq 0$ be such that the support of ϕ is contained in [−T, T]. Then we have and we will bound the right-hand side. gives the Hecke eigenvalues of $E_{\chi,g}(t_0)$. Let $\alpha_p = \exp(-\sqrt{\log p})$. For t such that $|t - t_0| \leq \alpha_p$, and for ℓ prime with $\ell \sim L$, we have and then $B(g, t) = B(g, t_0) + O(L\alpha_p^{1/2})$. (4.14) Our next task is to give an analogue of (4.8), namely we prove the lower bound for $L \geq L_0(N, T)$, uniformly for $|t_0| \leq T$. The argument is similar to [FKM14, Lemma 2.4]. We start from the equality Restricting the summation to the primes $\ell \equiv 1 \bmod N$, we obtain the lower bound (4.16) In [FKM14, p. 1705], the corresponding sum without the condition $\ell \equiv 1 \bmod N$ is shown to be $\gg L/(\log L)^6$. Since N is fixed, it is easy to include this condition in the proof of loc. cit., using the Prime Number Theorem in arithmetic progressions. We leave the details to the reader.
Combining (4.14) and (4.15), we deduce where the implied constant depends only on N and T. We therefore get
Remark 4.4. The bounds (4.9) and (4.17) exhibit an exponential dependency in the parameters of f or $E_{\chi,g,\varphi}$. This is due to the direct use of the prime number theorem for various L-functions. However, with more sophisticated Hoheisel-type estimates (see [Mot] for instance), this dependency can be made polynomial. This is important for instance to obtain polynomial decay rates in p in Theorem 2.3.
and the identity $|\lambda_f(\ell)|^2 - \lambda_f(\ell^2) = 1$ for ℓ prime it is possible to obtain a non-trivial bound for the sum $S_V(f, K; p)$ when f is of level Np (rather than N); however, due to the lacunarity of the amplifier, the resulting bounds are weaker: the exponent 1/8 in Theorem 1.2 and its corollaries has to be replaced by 1/16. The proof is a little more involved, as one has to consider more than 3 cases in §5.5, and we will not give it here.
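The Hecke identity $|\lambda_f(\ell)|^2 - \lambda_f(\ell^2) = 1$ quoted above can be checked numerically on a concrete example. The Python sketch below is only an illustration and is not part of the argument: it takes for f the level 1, weight 12 form Δ (an assumption made here for concreteness), whose Fourier coefficients are the Ramanujan numbers τ(n), and uses the normalization $\lambda(n) = \tau(n)/n^{11/2}$, under which the identity reads $\tau(\ell)^2 - \tau(\ell^2) = \ell^{11}$. The helper name `tau_values` is ours.

```python
def tau_values(n_max):
    """Return {n: tau(n)} for 1 <= n <= n_max, from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[i] is the coefficient of q^i in prod (1 - q^n)^24, truncated at degree n_max - 1.
    coeffs = [1] + [0] * (n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            # In-place multiplication by (1 - q^n); going downwards uses old values.
            for i in range(n_max - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    # The extra factor q shifts indices by one.
    return {n: coeffs[n - 1] for n in range(1, n_max + 1)}

tau = tau_values(25)

# Integer form of |lambda(l)|^2 - lambda(l^2) = 1 for lambda(n) = tau(n) / n^(11/2).
for l in (2, 3, 5):
    assert tau[l] ** 2 - tau[l * l] == l ** 11

print(tau[2], tau[3], tau[4], tau[9])
```

Working with the integers τ(n) rather than the normalized eigenvalues avoids any floating-point rounding in the check.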

Estimation of the Amplified Second Moment
We begin here the proof of Proposition 4.1. Obviously, we can assume that $P \leq p$, $Q \leq p$.
We start by expanding the squares in B(g) and $|S(g, K; p)|^2$, getting and similarly where we used the fact that the Hecke eigenvalues $\lambda_g(\ell_2)$ and $\lambda_\chi(\ell_2, t)$ which are involved are real for $\ell_2$ coprime to pN, because of the absence of Nebentypus. depending on whether $\ell_1 = \ell_2$ or $\ell_1 \neq \ell_2$. We begin with the "diagonal" terms $M^d(L)$, $M^d(L; k)$ where $\ell_1 = \ell_2$, which are the only cases where $\ell_1$ and $\ell_2$ are not coprime. where, for instance, we have By (3.6) and the bound $|b_\ell| \leq 2$, we get where the implied constant is independent of f. We can then apply the rapid decay (3.18) of $\tilde\phi(t)$ at infinity and the large sieve inequality of Deshouillers-Iwaniec [DI82, Theorem 2, (1.29)] to obtain where the implied constant depends only on ε.
The bounds for the holomorphic and Eisenstein portion are similar and in fact slightly simpler, as we can use Deligne's bound on Hecke eigenvalues of holomorphic cusp forms (or unitary Eisenstein series) instead of (3.6) (still using [DI82, Th.  k) are Hecke-eigenforms for the Hecke operators T(n) for (n, q) = (n, pN) = 1, hence we can combine the eigenvalues at the primes $\ell_1 \neq \ell_2$ using the Hecke relation (3.10) and By the Petersson formula (3.12), we write where $M_1(L; k)$ corresponds to the diagonal terms $\delta(\ell_1\ell_2 n_1 d^{-2}, n_2)$ while where $\Delta_{q,k}$ is given in (3.13).
On the other hand, by (3.15), there is no diagonal contribution for $M^{nd}(L)$, and we write where $\Delta_{q,\phi}(m, n)$ is defined in (3.16).

Diagonal terms.
We begin with $M_1(L; k)$: we have Since V has compact support in for an arbitrary function φ. We then have We first transform these sums by writing S(en 1 , n 2 ; cpN )K(dn 1 )K(n 2 )φ 4π √ en 1 n 2 Having fixed d, e as above, let $C = C(d, e) \geq 1/2$ be a parameter.
where the constant implied is absolute. Recalling the definition (3.17), we obtain (5.6) with κ = a − b for $\phi = \phi_{a,b}$ and with κ = k − 1 for $\phi = 2\pi i^{-k} J_{k-1}$, and we note that in the latter case, the constant B is independent of k. Then, summing over c > C(d, e), we obtain:
Proposition 5.4. With notation as above, assuming that $|K| \leq M$, we have where the implied constant is absolute.
so it is negligible.

Estimating the off-diagonal terms. It remains to handle the complementary sum [see (5.4)] which is
where C is defined by (5.8). In particular, we can assume $C \geq 1$, otherwise the above sum is zero. Recall that we factored the product of distinct primes $\ell_1\ell_2$ (with $\ell_i \sim L$) as $\ell_1\ell_2 = de$. Hence we have three types of factorizations of completely different nature, which we denote as follows:
• Type $(L^2, 1)$: this is when $d = \ell_1\ell_2$ and e = 1, so that $L^2 < d \leq 4L^2$;
• Type $(1, L^2)$: this is when d = 1 and $e = \ell_1\ell_2$, so that $L^2 < e \leq 4L^2$;
• Type (L, L): this is when d and e are both ≠ 1 (so $d = \ell_1$ and $e = \ell_2$ or conversely), so that $L < d \neq e \leq 2L$.
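The three types of factorization can be listed explicitly for a given pair of primes. The Python sketch below is a minimal illustration; the function name `factorization_types` and the sample primes are ours, not the paper's.

```python
def factorization_types(l1, l2):
    """List all (d, e) with d * e = l1 * l2 for distinct primes l1, l2, with their type."""
    assert l1 != l2, "the primes are assumed to be distinct"
    n = l1 * l2
    out = []
    for d in (1, l1, l2, n):      # the only positive divisors of l1 * l2
        e = n // d
        if e == 1:
            kind = "(L^2, 1)"     # d = l1 * l2, e = 1
        elif d == 1:
            kind = "(1, L^2)"     # d = 1, e = l1 * l2
        else:
            kind = "(L, L)"       # {d, e} = {l1, l2}
        out.append((d, e, kind))
    return out

for d, e, kind in factorization_types(101, 103):
    print(d, e, kind)
```

For each pair of distinct primes there is one factorization of Type $(L^2, 1)$, one of Type $(1, L^2)$ and two of Type (L, L).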
We will also work under the following (harmless) restriction: $p^\delta P < L$.
The outcome of the above computations is, for any $c \geq 1$, the identity We make the following definition:
Definition 5.6 (Resonating matrix). For $n_1 n_2 \equiv e \pmod{cN}$, the integral matrix $\gamma(c, d, e, n_1, n_2)$ defined by (5.17) is called a resonating matrix.
Observe that $\det(\gamma(c, d, e, n_1, n_2)) = de$ and, since de is coprime to p, the reduction of $\gamma(c, d, e, n_1, n_2)$ modulo p provides a well-defined element in $\mathrm{PGL}_2(\mathbf{F}_p)$.

Estimating the Fourier transform.
Our next purpose is to truncate the sum over $n_1, n_2$ in (5.16). To do this, we introduce a new parameter: Note that, since $1 \leq c \leq C = p^\delta P (e/d)^{1/2}$, we have We will use Z to estimate the Fourier transform $H_\phi(\frac{n_1}{cpN}, \frac{n_2}{cpN})$. The first bound is given by the following lemma:
Lemma 5.7. Let (d, e) be of Type (L, L) or of Type $(1, L^2)$. Let $H_\phi$ and Z be defined by (5.3) and (5.18). Assume that V satisfies (V(C, P, Q)) and that $n_1 n_2 \neq 0$.
(1) Recalling (5.3) and (3.17), we have $\times\, J_a\Bigl(\frac{4\pi (e/d)^{1/2}}{cN}\sqrt{xy}\Bigr)\, e\Bigl(-\frac{(n_1/d)x + n_2 y}{cN}\Bigr)\,dx\,dy$. (5.20) We use the uniform estimates for the Bessel function, valid for z > 0 and $\nu \geq 0$, where the implied constant depends on a and ν (see [EMOT55, Chap. VII]). We also remark that Z is the order of magnitude of the variable inside $J_a(\cdots)$ in the above formula; then, integrating by parts μ times with respect to x and ν times with respect to y, we get the result indicated.
(2) This is very similar: since we want uniformity with respect to k, we use the integral representation  $\gamma(c, d, e, n_1, n_2))| \leq M^2 p$, we see that, for any fixed ε > 0, the contributions to $\tilde E_\phi(c, d, e)$ of the integers $n_1, n_2$ with are negligible (see (5.16)). Thus we get: where $E_\phi$ is the subsum of $\tilde E_\phi$ given by The implied constant depends on (δ, ε, N, a, b), but is independent of k for $\phi = 2\pi i^{-k} J_{k-1}$.

A more precise evaluation.
In the range $|n_i| \leq N_i$, i = 1, 2, we will need a more precise evaluation. We will take some time to prove the following result:
Lemma 5.9. Let (d, e) be of Type (L, L) or of Type $(1, L^2)$. Let $H_\phi$ and Z be defined by (5.3) and (5.18). Assume that V satisfies (V(C, P, Q)) and that $n_1 n_2 \neq 0$.
(1) For $\phi = \phi_{a,b}$, we have where the implied constant depends on (C, a, b, N).
(2) For $\phi = \phi_k$, we have where the implied constant depends on C and N.

Proof. We consider the case $\phi = \phi_k$, the other one being similar. We shall exploit the asymptotic oscillation and decay of the Bessel function $J_{k-1}(z)$ for large z. More precisely, we use the formula which is valid uniformly for z > 0 and $k \geq 1$ with an absolute implied constant (to see this, use the formula from, e.g., [Iwa95, p. 227, (B 35)], which holds with an absolute implied constant for $z \geq 1 + (k-1)^2$, and combine it with the bound $|J_{k-1}(x)| \leq 1$). The contribution of the second term in this expansion to The contribution arising from the first term can be written as a linear combination (with bounded coefficients) of two expressions of the shape 1 dZ 1/2 R 2 ±2 (e/d)xy − (n 1 /d)x − n 2 y cN dxdy ×e − 2P (n 1 /d)x 2 ∓ 2 e/dxy + n 2 y 2 cN dxdy.
We write these in the form where we note that the function is smooth and compactly supported in $[0, 1]^2$, and, crucially, the phase We now prove two lemmas in order to deal with the oscillatory integrals (5.23) above, from which we will gain an extra factor $Z^{1/2}$. We use the notation for a function ϕ on $\mathbf{R}^2$.
Lemma 5.10. Let F(x, y) be a quadratic form and G(x, y) a smooth function, compactly supported on $[0, 1]^2$, satisfying the inequality where $G_0$ is some positive constant. Let $\lambda_2$ denote the Lebesgue measure on $\mathbf{R}^2$. Then, for every B > 0, we have G(x, y)e (F (x, y)) dy .
To simplify the exposition, we suppose that A(x) is a segment of the form ]a(x), 1] with $0 \leq a(x) \leq 1$ (when it consists of two segments, the proof is similar). Integrating by parts, we get (F (x, y)) dy The first term on the right-hand side of (5.26) is since, on the interval of integration, $F^{(0,1)}$ has a constant sign and $F^{(0,2)}$ is constant. Inserting these estimates in (5.25) and using the equality we complete the proof.
The following lemma gives an upper bound for the constant λ 2 (G(B)) that appears in the previous one.
Lemma 5.11. Let $F(x, y) = c_0 x^2 + 2c_1 xy + c_2 y^2$ be a quadratic form with real coefficients $c_i$. Let B > 0 and let G(B) be the corresponding subset of $[0, 1]^2$ as defined in Lemma 5.10. We then have the inequality
Proof. By integrating with respect to x first, we can write $\lambda_2(G(B)) =$ We return to the study of the integral appearing in (5.23). Here we see easily that Lemma 5.11 applies with Hence, by Lemma 5.10, we deduce for any B > 0. Choosing $B = \sqrt{Z}$, we see that the above integral is $\ll QZ^{-1/2}$. It only remains to gather (5.22), (5.23) and (5.24) with the bound $Z^{-3/2} \ll p^{\delta/2} Q/Z$ to complete the proof of Lemma 5.9.

Contribution of the non-correlating matrices.
From now on, we simply choose δ = ε > 0 in order to finalize the estimates.
We start by separating the terms according to whether $|C(K; \gamma(c, d, e, n_1, n_2))| > Mp^{1/2}$ or not, i.e., according to whether the reduction modulo p of the resonating matrix $\gamma(c, d, e, n_1, n_2)$ is in the set $G_{K,M}$ of M-correlation matrices or not [see (1.12)]. Thus we write where * restricts to those $(n_1, n_2)$ such that $\gamma(c, d, e, n_1, n_2) \pmod p \in G_{K,M}$, and $E^n_\phi$ is the contribution of the remaining terms. Similarly, we write say.

We will treat $M_3^n[\phi; d, e]$ slightly differently, depending on whether (d, e) is of Type (L, L) or of Type $(1, L^2)$. For T = (L, L) or $(1, L^2)$, we write Notice that in both cases we have by (5.11), (5.18) and (5.21); here the implied constant depends on N. This shows that the total number of terms in the sum $E_\phi(c, d, e)$ (or its subsums $E^n_\phi(c, d, e)$) is $\ll N_1 N_2 c^{-1}$.
- When (d, e) is of Type (L, L), we appeal simply to Lemma 5.7 with μ = ν = 0, and obtain Summing the above over $c \leq C \ll p^{\varepsilon} P$ and then over $(\ell_1, \ell_2)$, and over the pairs (d, e) of Type (L, L), we conclude that
$M_3^{n,(L,L)}[\phi] \ll M p^{1/2+3\varepsilon} L^2 (Q^2 P + PQ + P^2) \ll M p^{1/2+3\varepsilon} L^2 PQ(P + Q)$.
We now apply Lemma 5.9. Considering the case of $\phi = \phi_k$, we get If $\phi = \phi_{a,b}$, we obtain the same bound without the factor $k^3$, but the implied constant then depends also on (a, b).
to simplify notation.
The main tool we use is the fact that, when the coefficients of $\gamma(c, d, e, n_1, n_2)$ are small enough compared with p, various properties which hold modulo p can be lifted to Z.  (c, d, e), which is also easily handled. Indeed, a parabolic $\gamma \in \mathrm{PGL}_2(\mathbf{F}_p)$ has a unique fixed point in $\mathbf{P}^1$, and hence any representative $\tilde\gamma$ of γ in $\mathrm{GL}_2(\mathbf{F}_p)$ satisfies $\mathrm{tr}(\tilde\gamma)^2 - 4\det(\tilde\gamma) = 0$. Now if there existed some matrix $\gamma(c, n_1, n_2)$ which is parabolic modulo p, we would get $(n_1 + dn_2)^2 = 4de = 4\ell_1\ell_2 \pmod p$.

Triangular and related matrices. Note that
Under the assumption $p^{3\varepsilon} LQ < p^{1/2}$ (6.5) [which is stronger than (6.3)], this becomes an equality in Z, and we obtain a contradiction since the right-hand side $4\ell_1\ell_2$ is not a square. Therefore, assuming (6.5), we also have $M_3^p[\phi; d, e] = 0$. (6.6)
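Both elementary facts used in the parabolic case can be confirmed by brute force on small examples. The Python sketch below is an illustration only (the function names are ours): it checks that every element of $\mathrm{GL}_2(\mathbf{F}_p)$ with exactly one fixed point on $\mathbf{P}^1(\mathbf{F}_p)$ has $\mathrm{tr}^2 - 4\det \equiv 0 \pmod p$ for a small odd prime p, and that $4\ell_1\ell_2$ is never a perfect square for distinct primes $L < \ell_1, \ell_2 \leq 2L$.

```python
from itertools import product
from math import isqrt

def fixed_points(a, b, c, d, p):
    """Number of points [u:v] of P^1(F_p) fixed by the matrix (a b; c d)."""
    points = [(x, 1) for x in range(p)] + [(1, 0)]
    # [u:v] is fixed iff (a*u + b*v, c*u + d*v) is proportional to (u, v).
    return sum(((a * u + b * v) * v - (c * u + d * v) * u) % p == 0
               for u, v in points)

def parabolic_discriminants_vanish(p):
    """Check tr^2 - 4 det = 0 mod p for all invertible matrices with one fixed point."""
    for a, b, c, d in product(range(p), repeat=4):
        det = (a * d - b * c) % p
        if det == 0:
            continue  # not in GL_2(F_p)
        if fixed_points(a, b, c, d, p) == 1 and ((a + d) ** 2 - 4 * det) % p != 0:
            return False
    return True

def primes_between(lo, hi):
    """Primes l with lo < l <= hi, by trial division."""
    return [n for n in range(max(lo + 1, 2), hi + 1)
            if all(n % q for q in range(2, isqrt(n) + 1))]

def some_product_is_square(L):
    """True if 4 * l1 * l2 is a perfect square for some distinct primes l1, l2 ~ L."""
    ls = primes_between(L, 2 * L)
    return any(isqrt(4 * l1 * l2) ** 2 == 4 * l1 * l2
               for l1 in ls for l2 in ls if l1 != l2)

print(parabolic_discriminants_vanish(5), some_product_is_square(50))
```

The second check is of course immediate from unique factorization; it is included only to make the contradiction concrete.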

Toric matrices.
We now examine the more delicate case of $E^t_\phi(c, d, e)$. Recall that this is the contribution of matrices whose image in $\mathrm{PGL}_2(\mathbf{F}_p)$ belongs to a set of M tori $T_{x_i,y_i}$. We will deal with each torus individually, so we may concentrate on those $\gamma(c, n_1, n_2)$ which (modulo p) fix $x \neq y$ in $\mathbf{P}^1(\mathbf{F}_p)$. In fact, we can assume that x and y are finite, since otherwise γ would be treated by Section 6.1.
We make the stronger assumption $p^{3\varepsilon} LQ < p^{1/3}$ (6.7) to deal with this case. We therefore assume that there exists a resonating matrix $\gamma(c, n_1, n_2)$ whose image in $\mathrm{PGL}_2(\mathbf{F}_p)$ is contained in $T_{x,y}(\mathbf{F}_p)$. From (6.3), we saw already that γ (mod p) is not a scalar matrix. Now consider the integral matrix (which has trace 0). The crucial (elementary!) fact is that, since γ is not scalar, an element $\gamma_1$ in $\mathrm{GL}_2(\mathbf{F}_p)$ has image in $T_{x,y}$ if and only if $2\gamma_1 - \mathrm{tr}(\gamma_1)\mathrm{Id}$ is proportional to $2\gamma - \mathrm{tr}(\gamma)\mathrm{Id}$ (indeed, this is easily checked if x = 0, y = ∞, and the general case follows by conjugation). Hence, if a resonating matrix $\gamma_1 = \gamma(c_1, m_1, m_2)$ has reduction modulo p in $T_{x,y}$, the matrix Because of (6.7), one sees that these equalities modulo p hold in fact over Z. We then get where the first term is also given by for some $x \in \mathbf{F}_p$. By Lemma 8.1, Propositions 8.2 and 8.4, each such L is an isotypic trace function whose conductor is bounded solely in terms of cond(K). Therefore we would like to apply Theorem 1.9.
Remark 7.1. For the rest of this section we will not necessarily display the dependency on M, f or τ of the various constants implicit in the Vinogradov symbols $\ll$.
The functions V and W above do not a priori satisfy a condition of type (V(C, P, Q)), but it is standard to reduce to this situation. First, we truncate the large values of n, observing that since where the implied constant depends on t [see (3.9)], the contribution of the terms with $n \geq p^{1+\varepsilon}$ to any of the sums appearing in (7.2) is for any ε > 0. Then, by means of a smooth dyadic partition of the remaining interval, the various sums $S_V(f, L; p)$ and $S_W(f, L; p)$ occurring in (7.2) are decomposed into a sum of O(log p) sums of the shape where L has conductor bounded in terms of M only, for functions $\tilde V$, depending on τ and $t_f$, which satisfy Condition (V(C, P, Q)) for some sequence $C = (C_\nu)$, and $P \in [\frac{1}{2} p^{-1}, p^{\varepsilon}]$, $Q \ll_{t_f,\varepsilon} 1$ (the normalizing factor $P^{-1/2}$ comes from the factorization $(x/p)^{-1/2} = P^{-1/2} (x/pP)^{-1/2}$, and is introduced to ensure that $\tilde V(x) \ll_{t_f,\varepsilon} 1$). we see that the contribution to $\mu_{K,I,\tau}(f)$ of the sums with $P \leq p^{-1/2}$ is For the remaining sums, we use Theorems 1.9 and 1.14: we have for any δ < 1/8, where the implicit constants depend on (M, C, f, τ, δ, ε). We obtain that As long as $|I| \geq p^{7/8+\kappa}$ for some fixed κ > 0, we can take ε > 0 small enough and δ > 0 small enough so that the above shows that $\mu_{K,I,\tau}(f) \to 0$ as $p \to +\infty$, as desired.
The case where ϕ is a packet of Eisenstein series $E_{\chi,g}(\varphi)$ is similar, using Proposition 4.3. Indeed, the contribution of the non-zero Fourier coefficients is handled in this manner, and the only notable difference is that we must handle the constant term of this packet. This is given by $E_{\chi,g}(\varphi, 0)(z) = \int_{\mathbf{R}} \varphi(t)\{c_{1,g}(t) y^{1/2+it} + c_{2,g}(t) y^{1/2-it}\}\,dt$, (7.5) and contributes to $\mu_{K,I,\tau}(E_{\chi,g}(\varphi))$ by for $t \in \mathbf{F}_p$. By §10.2, K is a Fourier trace function (not necessarily isotypic), whose Fourier transform is therefore also a Fourier trace function, given by nφ(x) p , (n, p) = 1 By Proposition 8.3, we can express $\hat K$ as a sum of at most deg(φ) functions $\hat K_i$ which are irreducible trace functions with conductors bounded by M. The contribution from the terms $\hat K_i$ is then treated by the previous proof.

Trace Functions
We now come to the setting of Section 1.3. For an isotypic trace function K(n), we will see that the cohomological theory of algebraic exponential sums and the Riemann Hypothesis over finite fields provide interpretations of the sums C(K; γ), from which it can be shown that trace functions are good.
In this section, we present some preliminary results. In the next one, we give many different examples of trace functions (isotypic or not), and compute upper bounds for the conductor of the associated sheaves. We then use the cohomological theory to prove Theorem 1.14.
First we recall the following notation for trace functions: for a finite field k, an algebraic variety X/k, a constructible ℓ-adic sheaf F on X, a finite extension k'/k, and a point x ∈ X(k'), we define $(\mathrm{tr}\,F)(k', x) = \mathrm{tr}(\mathrm{Fr}_{k'} \mid F_{\bar x})$, the trace of the geometric Frobenius automorphism of k' acting on the stalk of F at a geometric point $\bar x$ over x (seen as a finite-dimensional representation of the Galois group of k'; see [Kat90,7.3 We also consider the (Tate-twisted) Fourier transform $G = \mathrm{FT}_\psi(F)(1/2)$ with respect to an additive ℓ-adic character ψ of $\mathbf{F}_p$. It satisfies (2) Suppose that F is pointwise ι-pure of weight 0, i.e., that it is a trace sheaf. Then i.e., for any finite field k with v ∈ k, the eigenvalues of the Frobenius of k acting on the stalk of G at a geometric point $\bar v$ over v are |k|-Weil numbers of weight at most 0.
We defined the conductor of a sheaf in Definition 1.13. An important fact is that this invariant also controls the conductor of the Fourier transform, and that it controls the dimension of cohomology groups which enter into the Grothendieck-Lefschetz trace formula. We state suitable versions of these results:
Proposition 8.2. Let p be a prime number and ℓ ≠ p an auxiliary prime.
(1) Let F be an ℓ-adic Fourier sheaf on $\mathbf{A}^1_{\mathbf{F}_p}$, and let $G = \mathrm{FT}_\psi(F)(1/2)$ be its Fourier transform. Then, for any $\gamma \in \mathrm{GL}_2(\mathbf{F}_p)$, the analytic conductor of $\gamma^* G$ satisfies $\mathrm{cond}(\gamma^* G) \leq 10\,\mathrm{cond}(F)^2$. (8.2)
(2) For $F_1$ and $F_2$ lisse ℓ-adic sheaves on an open subset $U \subset \mathbf{A}^1$, we have
(3) Let $F_1$ and $F_2$ be middle-extension ℓ-adic sheaves on $\mathbf{A}^1_{\mathbf{F}_p}$. Then Note that (8.2) and (8.3) can certainly be improved, but these bounds will be enough for us.
We first bound the number of singularities where λ runs over the breaks of F(∞), and x over the singularities of F in $\mathbf{A}^1$. The first term is $\mathrm{Swan}_\infty(F)$, so that the rank of G is bounded by $\mathrm{rank}(G) \leq \mathrm{Swan}(F) + \mathrm{rank}(F)\,n(F) \leq \mathrm{cond}(F)^2$. (8.5) Thus it only remains to estimate the Swan conductors $\mathrm{Swan}_x(G)$ at each singularity. We do this using the local description of the Fourier transform, due to Laumon [Lau87], separately for 0, ∞ and points in $\mathbf{G}_m$. Third case. Let $x \in \mathbf{G}_m$. By translation, we have so that the previous case gives By (8.4) and (8.5), this leads to $\sum_x \mathrm{Swan}_x(G) \leq 2\,\mathrm{cond}(F)^2 + 3\,\mathrm{cond}(F)^2 = 5\,\mathrm{cond}(F)^2$, and $\mathrm{cond}(G) \leq 10\,\mathrm{cond}(F)^2$.
To check the claim, note that the formula for the character of an induced representation shows that $\mathrm{tr}\,\varrho_i(g) = 0$ for any $g \notin H$ (see, e.g., [Kow14, Prop. 2.7.43]). Hence the trace function vanishes obviously on $U(\mathbf{F}_p)$, since the Frobenius elements associated to $x \in U(\mathbf{F}_p)$ relative to $\mathbf{F}_p$ are not in H.
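The vanishing of an induced character outside a normal subgroup can be seen on a toy example. The Python sketch below is an illustration only and does not involve the Galois groups of the text: as an assumption made for concreteness, it takes G = S_3, the normal subgroup H = A_3 of index 2, and a non-trivial character χ of H, and evaluates the induced character through the formula $\mathrm{Ind}\,\chi(g) = |H|^{-1} \sum_{x \in G} \chi^{\circ}(x^{-1} g x)$, where $\chi^{\circ}$ extends χ by zero outside H. All names in the code are ours.

```python
import cmath
from itertools import permutations

def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))        # the symmetric group S_3
c = (1, 2, 0)                           # a 3-cycle generating H = A_3
omega = cmath.exp(2j * cmath.pi / 3)    # primitive cube root of unity

# A non-trivial character of H, stored as a dictionary on the elements of H.
chi = {(0, 1, 2): 1, c: omega, compose(c, c): omega ** 2}

def induced_character(g):
    """Value at g of the character induced from chi (extended by zero outside H)."""
    total = sum(chi.get(compose(inverse(x), compose(g, x)), 0) for x in G)
    return total / len(chi)

for g in G:
    print(g, g in chi, induced_character(g))
```

Since H is normal, every conjugate of an element outside H stays outside H, so each term of the sum vanishes; this is exactly the mechanism invoked above.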
This property extends to $x \in (\mathbf{A}^1 - U)(\mathbf{F}_p)$ by a similar argument (we thank N. Katz for explaining this last point; note that we could also treat separately the points in $\mathbf{A}^1 - U$, which would lead at most to slightly worse bounds for the trace norm of K). Let $\tilde G = \pi_1(\mathbf{A}^1, \bar\eta)$ be the fundamental group of the affine line. There is a surjective homomorphism $\tilde G \to G$.
Composing these with $\tau_i$ and $\varrho_i$ gives representations $\tilde\tau_i$ and $\tilde\varrho_i$ of $\tilde H$ and $\tilde G$, respectively, with $\tilde\varrho_i = \mathrm{Ind}_{\tilde H}^{\tilde G}\,\tilde\tau_i$.
The stalk of $F_i$ at a geometric point above $x \in (\mathbf{A}^1 - U)(\mathbf{F}_p)$ is isomorphic, as a vector space with the action of the Galois group of $\mathbf{F}_p$, to the invariant space $\tilde\varrho_i^{I_x}$ under the inertia subgroup at x, which is a subgroup $I_x$ of $\tilde G$.
The space of $\tilde\varrho_i$ can be written as a direct sum where the spaces $W_\sigma$ are $\tilde H$-stable and permuted by $\tilde G$. Moreover, any $g \in \tilde G - \tilde H$ permutes the $W_\sigma$ without fixed points, because $\tilde H$ is normal in $\tilde G$. The point is that since $I_x \subset \tilde G^g \subset \tilde H$ (the inertia group is a subgroup of the geometric Galois group) and each $W_\sigma$ is $\tilde H$-stable, we have (in other words, this shows that $\tilde\varrho_i^{I_x} \simeq \mathrm{Ind}_{\tilde H}^{\tilde G}\,\tilde\tau_i^{I_x}$). The matrix representing the action on $\tilde\varrho_i^{I_x}$ of any element g in the decomposition group $D_x$ mapping to the Frobenius conjugacy class at x in $D_x/I_x$ is block-diagonal with respect to this decomposition. Since $g \notin \tilde H$, this block-diagonal matrix has zero diagonal blocks, hence its trace, which is the value of the trace function of $F_i$ at x, also vanishes. Proof. It is clear that $F^{(x)}$ has the right trace function and that it is a Fourier trace sheaf, with the same conductor as F.
Finally, we state a well-known criterion for geometric isomorphism of sheaves, which says that two irreducible middle-extension sheaves are geometrically isomorphic if their trace functions are equal on $\mathbf{A}^1(\mathbf{F}_p)$ "up to a constant depending on the definition field". Precisely:
Proposition 8.5 (Geometric isomorphism criterion). Let k be a finite field, and let $F_1$ and $F_2$ be geometrically irreducible ℓ-adic sheaves, lisse on a non-empty open set U/k and pointwise pure of weight 0. Then $F_1$ is geometrically isomorphic to $F_2$ if and only if there exists $\alpha \in \bar{\mathbf{Q}}_\ell^\times$ such that for all finite extensions $k_1/k$, we have $(\mathrm{tr}\,F_1)(k_1, x) = \alpha^{[k_1:k]} (\mathrm{tr}\,F_2)(k_1, x)$ (8.11) for all $x \in U(k_1)$.
In particular, if $F_1$ and $F_2$ are irreducible Fourier sheaves, they are geometrically isomorphic if and only if there exists $\alpha \in \bar{\mathbf{Q}}_\ell^\times$ such that for all finite extensions $k_1/k$, we have $(\mathrm{tr}\,F_1)(k_1, x) = \alpha^{[k_1:k]} (\mathrm{tr}\,F_2)(k_1, x)$ (8.12) for all $x \in k_1$.
Sketch of proof. This is a well-known fact; it is basically an instance of what is called "Clifford theory" in representation theory. We sketch a proof for completeness. In the "if" direction, note that (8.11) shows that $F_1$ and $\alpha^{\deg(\cdot)} \otimes F_2$ are lisse sheaves on U with the same traces of Frobenius at all points of U; the Chebotarev Density Theorem shows that the Frobenius conjugacy classes are dense in $\pi_1(U, \bar\eta)$, so we conclude that $F_1 \simeq \alpha^{\deg(\cdot)} \otimes F_2$ as lisse sheaves on U. But then restriction to the geometric fundamental group (the kernel of the degree) gives $F_1 \simeq F_2$ geometrically on U.
Conversely, if $F_1$ is geometrically isomorphic to $F_2$, and $\varrho_i$ is the representation of $\pi_1(U, \bar\eta)$ associated to $F_i$, then representation theory (see, e.g., [Kow14,2.8

Examples of Trace Functions
In this section, we will discuss four classes of functions K(n) that arise as trace functions. In a first reading, only the definitions of these functions may be of interest, rather than the technical verification that they satisfy the necessary conditions.
The classical constructions of Artin-Schreier and Kummer sheaves show that, for any ℓ ≠ p, one can construct ℓ-adic sheaves $L_{\psi(\phi)}$ and $L_{\eta(\phi)}$ on $\mathbf{A}^1_{\mathbf{F}_p}$ such that we have $(\mathrm{tr}\,L_{\psi(\phi)})(\mathbf{F}_p, x) = \psi(\phi(x))$ if φ(x) is defined, and 0 if x is a pole of φ, and $(\mathrm{tr}\,L_{\eta(\phi)})(\mathbf{F}_p, x) = \eta(\phi(x))$ if φ(x) is defined and non-zero, and 0 if x is a zero or pole of φ (these are the extensions by zero to $\mathbf{A}^1$ of the pullback by φ of the lisse Artin-Schreier and Kummer sheaves defined on the corresponding open subsets of $\mathbf{A}^1$). Fix an isomorphism $\iota : \bar{\mathbf{Q}}_\ell \to \mathbf{C}$. We assume that ψ is the standard character, so that $\iota(\psi(x)) = e(x/p)$ for $x \in \mathbf{F}_p$. Similarly, if χ is a Dirichlet character modulo p, there is a multiplicative character η such that ι(η(x)) = χ(x) for $x \in \mathbf{F}_p$. Let then $\phi_1, \phi_2 \in \mathbf{Q}(X)$ be rational functions as in (1.6), with $\phi_2 = 1$ if χ is trivial. The ℓ-adic sheaf
(2) Let $d_1$ be the number of poles of $\phi_1$, with multiplicity, and $d_2$ the number of zeros and poles of $\phi_2$ (where both are viewed as functions from $\mathbf{P}^1$ to $\mathbf{P}^1$). The analytic conductor of the sheaf F satisfies $\mathrm{cond}(F) \leq 1 + 2d_1 + d_2$.
Proof. (1) The sheaf F is pointwise pure of weight 0 on the open set U where $\phi_1$ and $\phi_2$ are both defined and $\phi_2$ is non-zero, which is the maximal open set on which F is lisse. Moreover, it is of rank 1 on this open set, and therefore geometrically irreducible. By [Kat88, Proof of Lemma 8.3.1], F is a Fourier sheaf provided it is not geometrically isomorphic to the Artin-Schreier sheaf $L_{\psi(sX)}$ for some $s \in \mathbf{A}^1$, which is the case under our assumption.
(2) The rank of F is one. The singular points are the poles of $\phi_1$ and the zeros and poles of $\phi_2$, so their number is bounded by $d_1 + d_2$. Furthermore, the Swan conductor at any singularity x is the same as that of $L_{\psi(\phi_1)}$, since all Kummer sheaves are everywhere tame. Thus only poles of $\phi_1$ contribute to the Swan conductor, and for such a pole x, the Swan conductor is at most the order of the pole at x, whose sum is $d_1$ (it is equal to the order of the pole when $\phi_1$ is Artin-Schreier-reduced at x, which happens if p is larger than the order of the pole, see, e.g., [Del77, Sommes Trig., (3.5.4)]).
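A concrete instance may help to fix ideas. The Python sketch below is an illustration under stated assumptions, not a construction from the paper: it assumes that the trace function has the shape $K(x) = \chi(\phi_2(x))\,e(\phi_1(x)/p)$ (extended by 0 at the singular points), and takes $\phi_1 = 1/X$, $\phi_2 = X$ and χ the Legendre symbol modulo p. Then $d_1 = 1$ (a simple pole at 0) and $d_2 = 2$ (a zero at 0 and a pole at ∞), so the conductor bound of part (2) is 1 + 2·1 + 2 = 5. The function names are ours; `pow(x, -1, p)` requires Python 3.8 or later.

```python
import cmath
from math import sqrt

def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, extended by 0 at multiples of p."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

def trace_function(p):
    """Values K(0), ..., K(p-1) of K(x) = (x/p) * e(x^{-1} / p), with K(0) = 0."""
    values = []
    for x in range(p):
        if x == 0:
            values.append(0j)          # pole of phi_1 and zero of phi_2
        else:
            inv = pow(x, -1, p)        # inverse of x modulo p
            values.append(legendre(x, p) * cmath.exp(2j * cmath.pi * inv / p))
    return values

def conductor_bound(d1, d2):
    """Upper bound 1 + 2*d1 + d2 for the conductor, as in part (2) above."""
    return 1 + 2 * d1 + d2

p = 11
K = trace_function(p)
print(max(abs(v) for v in K), abs(sum(K)), sqrt(p), conductor_bound(1, 2))
```

In this example the complete sum of K is a quadratic Gauss sum after the change of variable $x \mapsto x^{-1}$, so its modulus is exactly $\sqrt{p}$, in line with square-root cancellation.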