Modular $S_4$ Models of Lepton Masses and Mixing

We investigate models of charged lepton and neutrino masses and lepton mixing based on broken modular symmetry. The matter fields in these models are assumed to transform in irreducible representations of the finite modular group $\Gamma_4 \simeq S_4$. We analyse the minimal scenario in which the only source of symmetry breaking is the vacuum expectation value of the modulus field. In this scenario there is no need to introduce flavon fields. Using the basis for the lowest weight modular forms found earlier, we build minimal phenomenologically viable models in which the neutrino masses are generated via the type I seesaw mechanism. While successfully accommodating charged lepton masses, neutrino mixing angles and mass-squared differences, these models predict the values of the lightest neutrino mass (i.e., the absolute neutrino mass scale), of the Dirac and Majorana CP violation (CPV) phases, as well as specific correlations between the values of the atmospheric neutrino mixing parameter $\sin^2\theta_{23}$ and i) the Dirac CPV phase $\delta$, ii) the sum of the neutrino masses, and iii) the effective Majorana mass in neutrinoless double beta decay. We consider also the case of residual symmetries $\mathbb{Z}^{ST}_3$ and $\mathbb{Z}^S_2$ respectively in the charged lepton and neutrino sectors, corresponding to specific vacuum expectation values of the modulus.


Introduction
Understanding the origin of the flavour structure of quarks and leptons continues to be a highly challenging problem. Adding to this problem is the pattern of two large and one small mixing angles in the lepton sector, revealed by the data obtained in neutrino oscillation experiments (see, e.g., [1]). The results of the recent global analyses of these data show also that a neutrino mass spectrum with normal ordering (NO) is favoured over the spectrum with inverted ordering (IO), as well as a preference for a value of the Dirac CP violation (CPV) phase δ close to 3π/2 (see, e.g., [2]).
The observed 3-neutrino mixing pattern can naturally be explained by extending the Standard Model (SM) with a flavour symmetry corresponding to a non-Abelian discrete (finite) group G f (see, e.g., [3][4][5][6]). This symmetry is supposed to exist at some high-energy scale and to be broken at lower energies to residual symmetries of the charged lepton and neutrino sectors. Extensive studies of the non-Abelian discrete flavour symmetry approach to the (lepton) flavour problem have revealed that, typically, the breaking of the flavour symmetry requires the introduction of a large number of scalar fields (flavons). These fields have to develop a set of particularly aligned vacuum expectation values (VEVs). Arranging for such an alignment requires in turn the construction of rather elaborate scalar potentials.
A new and very interesting approach to the lepton flavour problem, based on invariance under the modular group, has been proposed in Ref. [7] where also models based on the finite modular group Γ 3 A 4 have been constructed. Although the models found in Ref. [7] were not realistic and made use of a minimal set of flavon fields, this work inspired further studies of the modular invariance approach to the lepton flavour problem. In Ref. [8] a realistic model with modular Γ 2 S 3 symmetry was built with the help of a minimal set of flavon fields. In the most economical versions of the models with modular symmetry, the VEV of the modulus τ can be, in principle, the only source of symmetry breaking without the need of flavon fields. A realistic model of the charged lepton and neutrino masses and of neutrino mixing without flavons, in which the modular Γ 4 S 4 symmetry was used, was constructed in [9]. Subsequently, lepton flavour models with and without flavons based on the modular symmetry Γ 3 A 4 have been proposed in Refs. [10,11].
In the present article, building on the results obtained in Ref. [9], we construct in a systematic way flavour models based on the finite modular group Γ 4 S 4 and study in detail their phenomenology. We focus on the case when the light neutrino masses are generated via the type I seesaw mechanism and where no flavons are introduced.
The article is organised as follows. In Section 2, we briefly describe the modular symmetry approach to lepton masses and mixing proposed in Ref. [7]. In Section 3, we construct minimal modular-invariant seesaw models. In Section 4, we perform a thorough numerical analysis, identify viable models and study their phenomenology. In Section 5, we discuss the implications of preserving residual symmetries of the modular group. Finally, in Section 6 we summarise our conclusions. 1 2 The Framework

Modular group and modular forms
The modular group Γ is the group of linear fractional transformations γ acting on the complex variable τ belonging to the upper-half complex plane as follows: where a, b, c, d ∈ Z and ad − bc = 1 , Imτ > 0 . (2.1) Since changing the sign of a, b, c, d simultaneously does not change eq. (2.1), the group Γ is isomorphic to the projective special linear group P SL(2, Z) = SL(2, Z)/Z 2 , where SL(2, Z) is the group of 2×2 matrices with integer elements and unit determinant, and Z 2 = {I, −I} is its centre (I being the identity element). The modular group is generated by two transformations S and T satisfying S 2 = (ST ) 3 = I . (2.2) Representing these transformations as For N = 2 we define Γ(2) ≡ Γ(2)/{I, −I}, while for N > 2, since the element −I does not belong to Γ(N ), we have Γ(N ) ≡ Γ(N ). The elements of Γ(N ) are in a one-to-one correspondence with the associated linear fractional transformations. The groups Γ(N ) are referred to as principal congruence subgroups of the modular group. Taking the quotient Γ N ≡ Γ/Γ(N ), one obtains a finite modular group. Remarkably, for N ≤ 5 the finite modular groups are isomorphic to permutation groups widely used in lepton flavour model building (see, e.g., [12]). Namely, Γ 2 S 3 , Γ 3 A 4 , Γ 4 S 4 and Γ 5 A 5 . Modular forms of weight k and level N are holomorphic functions f (τ ) transforming under the action of Γ(N ) in the following way: (2.6) Here k is even and non-negative, and N is natural (note that Γ(1) SL(2, Z) and Γ(1) ≡ Γ). Modular forms of weight k and level N form a linear space of finite dimension. It is possible to choose a basis in this space such that a transformation of a set of modular forms f i (τ ) is described by a unitary representation ρ of the finite modular group Γ N : This result is the foundation stone of the approach to lepton masses and mixing proposed in Ref. [7].
In the case of N = 2, the modular forms of lowest non-trivial weight 2 form a twodimensional linear space. One can find a basis in which the two generating modular forms are transformed according to the 2-dimensional irreducible representation (irrep) of S 3 [8]. In the case of N = 3, the corresponding space has dimension 3, and the generating modular forms have been shown to form the triplet of A 4 [7]. For N = 4, there are 5 linearly independent modular forms of weight 2. They are organised in a doublet and a triplet (3 ) of S 4 [9]. Modular forms of higher weights (k > 2) can be constructed from homogeneous polynomials in the generating modular forms of weight 2.

Modular-invariant supersymmetric action
In the case of N = 1 rigid supersymmetry, the matter action S reads where K is the Kähler potential, W is the superpotential and χ denotes a set of chiral supermultiplets contained in the theory in addition to the modulus τ . The integration goes over both space-time coordinates x and Graßmann variables θ and θ. The supermultiplets χ are divided into several sectors χ I . Each sector in general contains several supermultiplets. The modular group Γ acts on τ and χ I in a specific way [13,14]. Assuming that the supermultiplets χ I transform also in a certain representation ρ I of a finite modular group The transformation law for the supermultiplets χ I is similar to that in eq. (2.7). However, χ I are not modular forms, and thus, the weight (−k I ) is not restricted to be an even nonnegative number. The invariance of S under the transformations given in eq. (2.9) requires the invariance of the superpotential W , while the Kähler potential K is allowed to change by a Kähler transformation, i.e., (2.10) An example of Kähler potential which satisfies this requirement is given by where Λ 0 is a parameter with mass dimension one. 1 Expanding the superpotential in powers of χ I , we have where 1 stands for an invariant singlet of Γ N . From eq. (2.9) it is clear that the invariance of W requires the Y I 1 ... In (τ ) to transform in the following way: where ρ Y is a representation of Γ N , and k Y and ρ Y are such that Thus, Y I 1 ... In (τ ) are modular forms of weight k Y and level N furnishing the representation ρ Y of the finite modular group Γ N (cf. eq. (2.7)).

Modular forms of level 4
The dimension of the linear space formed by the modular forms of weight 2 and level 4 is equal to 5 (see, e.g., [7]), i.e., there are five linearly independent modular forms of the lowest non-trivial weight. In Ref. [9] these forms have been explicitly constructed in terms of the Dedekind eta function with a 1 + · · · + a 6 = 0, the basis of the modular forms of weight 2 reads with ω ≡ e 2πi/3 . Furthermore, as shown in [9], the Y 1 (τ ) and Y 2 (τ ) form a doublet transforming in the 2 of S 4 , while the three remaining modular forms make up a triplet transforming in 3 of S 4 . In what follows, we denote the doublet and the triplet as The modular forms of higher weights k = 4, 6, . . . , can be built from the Y i (τ ), i = 1, . . . , 5. Thus, the Y i (τ ) generate the ring of all modular forms of level 4 M(Γ(4)) = ∞ k=0 M k (Γ(4)) . (2.24) The dimension of the linear space M k (Γ(4)) of modular forms of weight k is 2k +1. The modular forms of higher weight transform according to certain irreps of S 4 . For example, at weight 4 we have 9 independent modular forms, which arrange themselves in an invariant singlet, a doublet and two triplets transforming in the 1, 2, 3 and 3 irreps of S 4 , respectively [9]: (2.25) Some higher weight multiplets are given in Appendix B. In the next section we use the modular forms of level 4 to build a modular-invariant superpotential, as in eq. (2.12).
With these assumptions, we can rewrite the superpotential as where the sum over all independent singlet contributions is understood as specified earlier.
Thus, by specifying the weights of the modular forms one obtains the weights of the matter superfields.
After modular symmetry breaking, the matrices of charged lepton and neutrino Yukawa couplings, λ and Y, as well as the Majorana mass matrix M for heavy neutrinos, are generated: where a sum over i, j = 1, 2, 3 is assumed. Eventually, after integrating out N c and after electroweak symmetry breaking, the charged lepton mass matrix M e and the light neutrino Majorana mass matrix M ν are generated: 2 In what follows we will systematically consider low weights k α i , k g , k Λ and identify the corresponding seesaw models.

The Majorana mass term for heavy neutrinos
We start with the analysis of the Majorana mass term for heavy neutrinos. If k Λ = 0, i.e., no non-trivial modular forms are present in the last term of eq. (3.2), k N = 0, and for both choices which leads to the following mass matrix for heavy neutrinos: Thus, in this case, the spectrum of heavy neutrino masses is degenerate, and the only free parameter is the overall scale Λ, which can be rendered real. The Majorana mass term with the mass matrix in eq. (3.8) conserves a "non-standard" lepton charge and two of the three heavy Majorana neutrinos with definite mass form a Dirac pair [15]. Allowing for modular forms of weight k Λ = 2 in the Majorana mass term, we have instead the following structure in the superpotential: The second term vanishes because the 3 from the decomposition of 3 ⊗ 3 (3 ⊗ 3 ) needed to form an invariant singlet is antisymmetric (see Appendix A.2). Applying the decomposition rules to the first term, we obtain where Y 1,2 depend on the complex VEV of τ . Therefore, there are 3 free real parameters in the matrix M .
Increasing k Λ to 4 leads to a bigger number of free parameters, since more than one invariant singlet can be formed. There are nine independent modular forms of weight 4 and level 4. As shown in [9], they are organised in an invariant singlet, a doublet and two triplets, one transforming in the 3 and the other in the 3 of Γ 4 , cf. eq. (2.25). Hence, the relevant part of W reads The last term vanishes, as before, due to antisymmetry. The remaining three terms lead to Thus, apart from τ , there are one real (Λ) and two complex (Λ /Λ, Λ /Λ) free parameters in M , that is, 5 real parameters apart from τ . Weight 6 and higher weight modular forms (see Appendix B) will lead to more free parameters and thus to a decrease in predictivity.

The neutrino Yukawa couplings
Next we analyse the neutrino Yukawa interaction term in the superpotential of eq. (3.2). If k g = 0, the irreps in which N c and L transform should be the same to construct an invariant singlet, i.e., ρ N = ρ L ∼ 3 or 3 . The structure of the singlet is the same of eq. (3.7), and the neutrino Yukawa matrix reads The lowest non-trivial weight, k g = 2, leads to (3.14) There are 4 possible assignments of ρ N and ρ L we consider. Two of them, namely ρ N = ρ L ∼ 3 and ρ N = ρ L ∼ 3 give the following form of Y: The two remaining combinations, (ρ N , ρ L ) ∼ (3, 3 ) and (3 , 3), lead to: In both cases, up to an overall factor, the matrix Y depends on one complex parameter g /g and the VEV τ .
Considering further the case of k g = 4, we have Again we have two equivalent possibilities with ρ N = ρ L and two others with ρ N = ρ L . In the former case, the matrix Y reads It depends on 7 real parameters and the complex τ . In the case of different representations ρ N = ρ L , 3 ⊗ 3 does not contain the invariant singlet, such that the first term in eq. (3.17) is not possible. The sum of three remaining terms yields where plus sign in ± corresponds to (ρ N , ρ L ) ∼ (3, 3 ) and minus sign to (ρ N , ρ L ) ∼ (3 , 3). This minus sign can be absorbed in g . Thus, apart from τ , the matrix Y depends on 5 real parameters. Given the rising multiplicity of free parameters, we do not consider weights k g higher than 4 in the present analysis.

The charged lepton Yukawa couplings
Further we investigate the charged lepton Yukawa interaction terms in the superpotential. Since we consider ρ i ∼ 1 or 1 and ρ L ∼ 3 or 3 , we have four possible combinations ρ i ⊗ ρ L . None of them contain the invariant singlet. Thus, the weights k α i cannot be zero, i.e., they are strictly positive, Therefore, if k α i = 2 for all i = 1, 2, 3, three rows of the charged lepton Yukawa matrix λ will be proportional to each other, and rank(λ) = 1 implying that two of the three charged lepton masses are zero, since rank(λ) = rank(λ † λ). If k α i = k α j = 2, where i = j, and k αp > 2, one has rank(λ) = 2, i.e., one of the masses is zero. Thus, in order to have maximal rank, rank(λ) = 3, and no zero masses, only one k α i can be equal to 2. The minimal (in terms of weights) possibility is defined by k α i = 2 and k α j = k αp = 4, for j = p. Indeed, there are two triplets of weight 4, namely Y (4) 3 and Y 3 . To avoid having a reduced rank(λ), the representations ρ j and ρ p should be different. This ensures that both Y are present in the superpotential, and the corresponding rows in the matrix λ are linearly independent. Then the relevant part of W , which we denote as W e , takes one of the following 6 forms: The possible assignments of irreps to the L and E c i superfields for each of these forms of W e are given in Table 1. Equation (3.22)  while the other 5 forms of W e yield a λ which differs from that in eq. (3.28) by permutations of the rows (and renaming of the free parameters). However, those permutations do not affect the matrix U e diagonalising M e M † e = v 2 d λ † λ, and thus do not lead to new results for the PMNS matrix. In what follows, without loss of generality, we adhere to the minimal choice in eq. (3.22), taking k α 1 = 2 and k α 2 = k α 3 = 4. As we can see, in this "minimal" example the matrix λ depends on 3 free parameters, α, β and γ, which can be rendered real by re-phasing of the charged lepton fields, and the complex τ .
The next natural choice of the weights would be k α i = 4 for any i = 1, 2, 3. However, such a combination of weights leads to rank(λ) < 3, since there are only two independent triplets Y

Summary of models
Let us bring together the different pieces we have obtained so far and summarise the number of free and independent real parameters in the models containing modular forms of weights ≤ 4. Apart from the dependence of M ν and M e on τ (2 real parameters), 3 we have 3 real parameters α, β and γ from the charged lepton sector. Making use of eq. (3.6), we can count the number of free parameters in the light neutrino Majorana mass matrix M ν for different combinations of k Λ and k g . We present the results in Table 2.
In the case of k Λ = k g = 0, the light neutrino mass matrix has the following form (see eqs. (3.6), (3.8) and (3.13)): which leads to |m 1 | = |m 2 | = |m 3 | in contradiction with the neutrino oscillation data. In the cases of k Λ = k g = 4, the total number of free independent real parameters is bigger than 12, i.e., than the number of observables we want to describe or predict. The observables are 3 charged lepton masses, 3 neutrino masses, and 3 mixing angles,  Table 3: Best fit values and 1σ ranges for neutrino oscillation parameters, obtained from the global analysis of Ref. [2], and for charged-lepton mass ratios, given at the scale 2 × 10 16 GeV with the tan β averaging described in [7], obtained from Ref. [16]. The parameters entering the definition of r are δm 2 ≡ m 2 2 − m 2 1 and ∆m 2 ≡ m 2 3 − (m 2 1 + m 2 2 )/2. The best fit value and 1σ range of δ did not drive the numerical searches here reported.
CPV phases in the PMNS matrix. In the next section we will investigate in detail potentially viable models with both k Λ and k g ≤ 2.

Numerical Analysis
Each of the investigated models depends on a set of dimensionless parameters p i = (τ, β/α, γ/α, g /g, . . . , Λ /Λ, . . .) , (4.1) which determine dimensionless observables (mass ratios, mixing angles and phases), and two overall mass scales: v d α for M e and v 2 u g 2 /Λ for M ν . Phenomenologically viable models are those that lead to values of observables which are in close agreement with the experimental results summarised in Table 3. 4 As a measure of goodness of fit, we use the sum of one-dimensional ∆χ 2 j functions for six accurately known dimensionless 5 observable quantities The atmospheric mass-squared difference ∆m 2 31 = ∆m 2 + δm 2 /2 for the NO spectrum of light neutrino masses and ∆m 2 32 = ∆m 2 − δm 2 /2 for the IO spectrum.
In eq. (4.2) we have assumed approximate independence of the fitted quantities (observables).
In what follows, we define N σ ≡ ∆χ 2 . For sin 2 θ ij , we make use of the one-dimensional projections from Ref. [2], whereas for the remaining quantities we employ Gaussian approximations: We restrict the parameter space in the following way: log 10 (β/α), log 10 (γ/α), log 10 g /g , and τ is taken from the fundamental domain D of Γ, depicted in Fig. 1, with an additional constraint of Im τ ≤ 2. The probability distribution for the numerical scan is chosen to be uniform with respect to the parameters in eq. (4.5) and to Re τ and Im τ . Let us comment on why it is sufficient to scan τ in the fundamental domain (4.6). Since the underlying theory enjoys the modular symmetry Γ, all the vacua related by modular transformations are physically equivalent. Therefore, given a non-zero VEV of the modulus τ , we can send it to τ ∈ D with a modular transformation. This is similar to the choice of the Higgs doublet VEV in the Standard Model, which we can bring to its second component and make real by acting with a global gauge transformation. Note however that couplings (α, β, etc.) also transform non-trivially: the kinetic terms of the chiral supermultiplets arising from the Kähler potential in eq. (2.11) should be rescaled to their canonical forms, and we implicitly absorb these rescalings into the couplings. Since the kinetic term scalings change under modular transformations, one has to rescale the couplings accordingly, i.e.
To see this, let us first notice that under τ → −τ * modular multiplets of weight 2 transform as , which is equivalent to: 1. a modular transformation T −1 , 3. complex conjugation of the result.
The first operation does not affect the physics as discussed earlier. The effect of the second transformation can be absorbed into the unphysical phases for the mass matrices under consideration. Therefore τ → −τ * acts as complex conjugation on the modular forms. Together with complex conjugation of couplings, it is nothing but complex conjugation of the mass matrices, which flips the signs of the CPV phases. Inside the fundamental domain, each viable τ will thus be paired to − τ * , its reflection across the imaginary axis.
4.1 Models with (k Λ , k g ) = (2, 0) In this case, the matrices are given by eqs. (3.10), (3.13) and (3.28). According to our numerical search, this model is unable to reproduce known data. The best point we have found is excluded at around 9.7 sigma confidence level, as it does not provide acceptable values for sin 2 θ 12 and sin 2 θ 23 (see Table 4).  Fig. 1). For cases A ( * ) and B ( * ) one has ρ N = ρ L , while for the remaining cases ρ N = ρ L . Note that starred cases correspond to predictions for δ not in line with its experimentally allowed 3σ range. We present the best fit values along with 2σ and 3σ confidence intervals in Tables 5a -5e. Interestingly, from Fig. 1 we observe that 6 out of 10 values of τ corresponding to local minima lie almost on the boundary of the fundamental domain D. The four points which are relatively far from the boundary (C, C * , D, and D * ) correspond to inverted ordering.

Models with (k
The structure of a scalar potential V for the modulus field τ has been previously studied in the context of string compactifications and supergravity (see, e.g., [17][18][19]). In Ref. [19], considering the most general non-perturbative effective N = 1 supergravity action in four dimensions, invariant under modular symmetry, it has been conjectured that all extrema of V lie on the boundary of D and on the imaginary axis (Re τ = 0). This conjecture has been checked there in several examples. If -as suggested by global analyses -it turns out that the normal ordering of light neutrino masses is realised in Nature, this could be considered as an additional indication in favour of the modular symmetry approach to flavour.  Table 4: Best fit values of the parameters and observables in the models with (k Λ , k g ) = (2, 0). Here and in the following tables the weights (k α1 , k α2 , k α3 ) = (2, 4, 4).
The models with (k Λ , k g ) = (0, 2) analysed by us are characterised by six real parameters, v d α, β/α, γ/α, v 2 u g 2 /Λ, |g /g|, Im τ , and two phases arg(g /g) and Re τ . 6 The three real parameters v d α, β/α and γ/α are fixed by fitting the three values of the charged lepton masses. The remaining three real parameters and two phases (v 2 u g 2 /Λ, |g /g|, Im τ , arg(g /g), Re τ ) are used to describe the three neutrino masses, three neutrino mixing angles and the one Dirac and two Majorana [20] CPV phases present in the PMNS matrix. Obviously, the values of some of these altogether nine observables are expected to be correlated.
In the analysis of the five different pairs of models, A and A * , B and B * , . . . , E and E * , as indicated earlier, we used as input the e, µ and τ masses, the one-dimensional χ 2 projections for sin 2 θ ij from Ref. [2] and the Gaussian approximations for δm 2 and ∆m 2 . As a result of the analysis we obtain: i) the best fit values and the 2σ and 3σ ranges of Reτ , Imτ , β/α, γ/α, v d α, Re(g /g), Im(g /g), v 2 u g 2 /Λ, for which we have a sufficiently good quality of the fit to the data, ii) the best fit values and the 2σ and 3σ allowed ranges of sin 2 θ ij , δm 2 and ∆m 2 , to be compared with those found in Ref. [2] and quoted in Table 3, iii) the predicted best fit values and the 2σ and 3σ ranges of the absolute neutrino mass scale min(m j ), j = 1, 2, 3, and of the CPV phases δ, α 21 and α 31 . Together with the results on δm 2 , ∆m 2 , sin 2 θ 12 and sin 2 θ 13 , this allows us to obtain predictions for the sum of neutrino masses i m i and for the effective Majorana mass in neutrinoless double beta decay | m | (see, e.g., [1,21]).
These results are reported in Tables 5a -5e. A successful description of the data in the lepton sector, as our analyses show, implies a correlation between the values of Re τ and Im τ (see Fig. 1), as well as between the values of Im(g /g) and Im τ and of Re(g /g) and Im τ (see Fig. 4). In what concerns the neutrino masses and mixing observables, we find that the value of sin 2 θ 23 is correlated with the values i) of the Dirac phase δ, ii) of i m i and iii) of | m |. These correlations are illustrated in Fig. 2. 7 We note that the correlation between the values of sin 2 θ 23 and | m | is a consequence, in particular, of the correlations between the values of sin 2 θ 23 and of the Majorana phases α 21 and α 31 . 8 Finally, we comment in Appendix E on the correspondence of models with (k Λ , k g ) = (0, 2) to the model with k L = 2 considered in [9], where the light neutrino masses are generated via the Weinberg operator.

Models with (k Λ , k g ) = (2, 2)
In this case the matrices are given by eqs. (3.10), (3.15) or (3.16), and (3.28). According to our numerical search, this model cannot accommodate the experimental data. The best points we have found through numerical search are presented in Table 6. 6 Notice that the dependence on τ arises through powers of exp(2πiτ /4). 7 In Fig. 2 we do not show correlations in the cases of the models E and E * since these models are noticeably less favoured by the data than the other four pairs of models (see Tables 5a -5e). 8 As a consequence of their correlations with sin 2 θ23, the values of δ and of α21 and of α31 are also correlated.  Table 5a: Best fit values along with 2σ and 3σ ranges of the parameters and observables in cases A and A * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see Fig. 1 Table 5b: Best fit values along with 2σ and 3σ ranges of the parameters and observables in cases B and B * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see Fig. 1 Table 5c: Best fit values along with 2σ and 3σ ranges of the parameters and observables in cases C and C * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see Fig. 1 Table 5d: Best fit values along with 2σ and 3σ ranges of the parameters and observables in cases D and D * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see Fig. 1 Table 5e: Best fit values along with 2σ and 3σ ranges of the parameters and observables in cases E and E * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see  Subcase  Table 6: Best fit values of the parameters and observables in the models with (k Λ , k g ) = (2, 2). 23

Residual Symmetries
Residual symmetries arise whenever the VEV of the modulus τ breaks the modular group Γ only partially, i.e., the little group (stabiliser) of τ is non-trivial. There are only 2 inequivalent finite 9 points with non-trivial little groups, namely τ = − 1/2 + i √ 3/2 ≡ τ L ("the left cusp") with residual symmetry Z ST 3 = {I, ST, (ST ) 2 }, and τ = i ≡ τ C with residual symmetry Z S 2 = {I, S} (see, e.g., [22]). Indeed, the actions of ST on τ L and of S on τ C leave respectively τ L and τ C unchanged. Any other point with non-trivial little group is related to τ L or τ C by a modular transformation, and is therefore physically equivalent to it. For example, τ R = + 1/2+i √ 3/2 ("the right cusp") has residual symmetry Z T S 3 = {I, T S, (T S) 2 }, and it is related to τ L by a T transformation: τ R = T τ L . With one modulus field τ we can have either the Z 3 or the Z 2 residual symmetry, and it will be a common symmetry of the charged lepton and neutrino sectors of the theory.
In the basis we have employed (see also Appendix A.1), the triplet irreps of the generators S and T have the form: where the plus (minus) sign corresponds to the representation 3 (representation 3 ) of S 4 . It follows from eq. (5.1) that in the basis we are using the product of the triplet representations of S and T generators is a diagonal matrix given by: 10 In the left cusp point τ = τ L , corresponding to the residual symmetry Z ST 3 , the five independent modular forms take the following values: In the point τ = τ C , invariant under the action of the S generator and in which we have the residual symmetry Z S 2 , the modular forms Y 2 , Y 3 , Y 4 and Y 5 can be expressed in terms of the form Y 1 : We could not find models with one modulus field τ and residual symmetry Z ST 3 or Z S 2 , which are phenomenologically viable. Since the residual symmetry is the same for both the 9 Note that τ = i∞ breaks Γ4 to Z T 4 = {I, T, T 2 , T 3 }. 10 The form we get in the triplet representation of ST coincides with the form of the triplet representation of the S4 generator T in a different presentation for the S4 generators (see, e.g., [6] and Appendix F). 24 charged lepton and neutrino mass matrices, 11 the resulting neutrino mixing matrix always contains zeros, which is ruled out by the data.
We will consider next the case of having two moduli fields in the theory -one, τ , responsible via its VEV for the breaking of the modular S 4 symmetry in the charged lepton sector, and a second one, τ ν , breaking the modular symmetry in the neutrino sector. This will be done on purely phenomenological grounds: we will not attempt to construct a model in which the discussed possibility is realised; we are not even sure such models exist.
We will assume further that we have a residual Z ST 3 symmetry in the charged lepton sector and a residual Z S 2 symmetry in the neutrino sector. Under the indicated conditions, one of the charged lepton masses vanishes: the first column of eq. (3.28) is exactly zero at τ L , which follows immediately from eq. (5.3). However, it is possible to render all masses non-vanishing from the outset if we replace the last Yukawa interaction term in eq. (3.22) with a singlet containing modular forms of weight 6: is the only modular form triplet of weight 6 transforming in the 3 of S 4 . In this case we get diagonal M e M † e at τ L : The mixing is therefore determined by the neutrino mass matrix having a Z S 2 symmetry. It is possible to obtain phenomenologically viable solutions in this scenario. For example, in the case (k Λ , k g ) = (4, 0), the neutrino mass matrix is given by eqs. In this case the three masses, three mixing angles and three CPV phases in the neutrino sector are described by three real parameters, v 2 u g 2 /Λ, |Λ /Λ| and |Λ /Λ|, and two phases, arg(Λ /Λ) and arg(Λ /Λ). As a consequence, the values of certain neutrino mass and mixing observables should be correlated. Indeed, through a numerical scan in the vicinity of the point given by eq. (5.8) (keeping τ ν = i fixed) we find strong correlations between sin 2 θ 12 and sin 2 θ 13 , and between sin 2 θ 23 and δ, as shown in Fig. 3. We note that instead of considering two different moduli fields, one could also realise this scenario with only one modulus τ = τ ν and an extra flavon field ϕ T as it was done in Ref. [7] in the case of Γ 3 A 4 symmetry. In such models, the flavon ϕ T develops a particularly aligned VEV which is responsible for the diagonal form of the charged lepton mass matrix. However, the joint description of the lepton and quark flavours, most likely, will require the introduction of two different moduli which develop different vacuum expectation values.
Finally, we comment on the famed tri-bimaximal (TBM) mixing [23,24] (see also [25]) in the context of considered residual symmetries in Appendix F.

Summary and Conclusions
In the present article, we have continued to develop a new and very interesting approach to flavour proposed in Ref. [7]. This approach is based on invariance of the physical supersymmetric action under the modular group. Assuming, in addition, that the matter superfields transform in irreps of the finite modular group Γ 4 S 4 , we have investigated the minimal scenario in which the only source of modular symmetry breaking is the VEV of the modulus field τ and no flavons are introduced. Yukawa couplings in such minimal class of models are modular forms of level 4, transforming in certain irreps of Γ 4 .
Using the basis for the lowest non-trivial weight (k = 2) modular forms found in Ref. [9], we have constructed in a systematic way minimal models in which the light neutrino masses are generated via the type I seesaw mechanism. After stating several simplifying assumptions formulated in the beginning of Section 3, we have classified the minimal models according to the weights of the modular forms entering i) the Majorana mass-like term of the gauge singlet neutrinos (weight k Λ ), ii) the neutrino Yukawa interaction term (weight k g ), and iii) the charged lepton Yukawa interaction terms (weights k α i , i = 1, 2, 3), see eq. (3.2). We have shown that the most economic (in terms of weights) assignment, which yields the correct charged lepton mass spectrum, is (k α 1 , k α 2 , k α 3 ) = (2,4,4). Adhering to the corresponding matrix of charged lepton Yukawa couplings given in eq. (3.28), we have demonstrated that in order to have a relatively small number of free parameters (≤ 8), both weights k Λ and k g have to be ≤ 2 (Table 2). Further, we have performed a thorough numerical analysis of the models with (k Λ , k g ) = (2, 0), (0, 2) and (2,2). 12 We have found that the models characterised by (k Λ , k g ) = (2, 0) and (2, 2) do not provide a satisfactory description of the neutrino mixing angles (Tables 4 and 6). The models with (k Λ , k g ) = (0, 2) instead not only successfully accommodate the data on the charged lepton masses, the neutrino mass-squared differences and the mixing angles, but also lead to predictions for the absolute neutrino mass scale and the Dirac and Majorana CPV phases. Our numerical search has revealed 10 local minima of the ∆χ 2 function. Each of them is characterised by certain values of τ (Fig. 1) and other free parameters. By investigating regions around these minima we have calculated 2σ and 3σ ranges of the observables, which are summarised in Tables 5a -5e. Moreover, our numerical procedure has shown that the atmospheric mixing parameter sin 2 θ 23 is correlated with i) the Dirac CPV phase δ, ii) the sum of neutrino masses, and iii) the effective Majorana mass in neutrinoless double beta decay. We present these correlations in Fig. 2.
The obtained values of τ in the minima of the ∆χ 2 function lead to a very intriguing observation. Namely, 6 of them, which occur very close to the boundary of the fundamental domain D of the modular group, correspond to the NO neutrino mass spectrum, while 4 others, which lie relatively far from the boundary, correspond to the IO spectrum (Fig. 1). The structure of a scalar potential for the modulus field τ has been previously studied in the context of string compactifications and supergravity, and it has been conjectured in Ref. [19] that all extrema of this potential occur on the boundary of D and on the imaginary axis (Re τ = 0). If -as suggested by global analyses of the neutrino oscillation data -it turns out that the NO spectrum is realised in Nature, this could be considered as an additional indication in favour of the considered modular symmetry approach to flavour. Finally, we have performed a residual symmetry analysis, based on the fact that the points τ = i, τ = exp(2πi/3) and τ = exp(πi/3) preserve respectively the Z S 2 , Z ST 3 and Z T S 3 subgroups of the modular group. While a single preserved residual symmetry cannot lead to a viable neutrino mixing matrix, one can assume that residual symmetries of the charged lepton and neutrino sectors are different. In this case, two moduli fields -one, responsible for the breaking of the modular symmetry in the charged lepton sector, and a second one breaking the modular symmetry in the neutrino sector -may be needed. We have considered this scenario on purely phenomenological grounds with an assumption of having a residual Z ST 3 symmetry in the charged lepton sector and a residual Z S 2 symmetry in the neutrino sector. We have provided a phenomenologically viable example for which the charged lepton mass term (more specifically, the matrix M e M † e ) is diagonal, and lepton mixing is fully determined by the neutrino mass matrix. 13 In conclusion, the modular symmetry approach to flavour points to a very intriguing connection between modular-invariant supersymmetric theories (possibly originating from string theory) and the flavour structures observed at low energies. Its predictions will be tested with future more precise neutrino oscillation data, with prospective results from direct neutrino mass and neutrinoless double beta decay experiments, as well as with improved cosmological measurements.

A.1 Presentation and basis
S 4 is the symmetric group of permutations of four objects. It contains 4! = 24 elements and admits the five irreps 1, 1 , 2, 3 and 3 (see, e.g., [26]). While a presentation of S 4 in terms of three generators (see Appendix F) is commonly used, it proves convenient to consider in this context a presentation given in terms of two generators S and T , namely (A.1) Following the identifications described in Ref. [9], from the results in Ref. [27] one can find the explicit basis for the irreps of S 4 which we employ in our discussion: where as usual ω = e 2πi/3 .

A.2 Clebsch-Gordan coefficients
For the basis given in the previous subsection, one can directly make use of the Clebsch-Gordan coefficients listed in Ref. [27], which we reproduce here for completeness. Entries of each multiplet entering the tensor product are denoted by α i and β i .

B Higher Weight Modular Forms
In this appendix we present the modular multiplets arising at weights 6, 8 and 10. The linear space of modular forms of weight k (and level N = 4, corresponding to Γ 4 S 4 ) has dimension 2k + 1. At weight k = 6, one has the irreps: corresponding to a total dimension of 13. At weight k = 8 one has Y (8) corresponding to a total dimension of 17. Finally, at weight k = 10 one has corresponding to a total dimension of 21. The correct dimensionality of each linear space is guaranteed via an appropriate number of constraints relating products of modular forms.

C Numerical Procedure
Our goal is to explore phenomenologically viable regions in the parameter space, i.e., where l(p i ) is the "loss" objective function, which we define as l(p i ) ≡ N σ(p i ) ≡ ∆χ 2 (p i ), and l max is the threshold, which we set to 3, so that it corresponds to compatibility with the observed data at 3σ confidence level. We decompose this problem into two parts: first, we find local minima p (1) i , p i , . . . of l(p i ), and then we explore connected regions around the minima p (n) i that satisfy the constraint l (p i ) ≤ l max .
To find local minima of l(p i ), we use the following algorithm: 1. Pick parameters p i at random until we find a "good enough" point such that l(p i ) < l 0.01 . The threshold l 0.01 is a 0.01 quantile of the l(p i ) distribution, i.e., it is chosen in such a way that we accept roughly 1% points. We use this preliminary step to filter out unpromising points which are very far from the regions of interest. Note that typically l 0.01 > l max , i.e. the regions of interest cover only a tiny fraction of the parameter space, so this step is needed to speed up the computation. Cases A, A* Figure 4: Correlations between the parameters Im τ , Re g /g and Im g /g in cases A and A * , which refer to (k Λ , k g ) = (0, 2) and to a certain region in the τ plane (see Fig. 1). The plus (minus) signs refer to the case without (with) an asterisk.
parameters p i individually until the objective function l(p i ) increases to l max . It corresponds to approximation of the viable region with a parallelepiped. A more sophisticated approach is to approximate the viable region with an ellipsoid by expanding l(p i ) around the minimum up to the second order. However, neither of these approaches work well in our setting due to peculiar shapes of viable regions, see, e.g., Fig. 4. Typically, only a small part of a viable region can be approximated with a parallelepiped or an ellipsoid, therefore such approximations lead to a significant underestimation of the full viable parameter space. Instead, we explore a viable region with a random walk process known as the Metropolis algorithm. The algorithm mimics the Brownian motion of a probe particle in a potential. The procedure is as follows: 1. Define a "potential" eq. (F.3) represent the widely known Altarelli-Feruglio basis for A 4 [30]. TheS andŨ elements generate the ZS 2 × ZŨ 2 subgroup of S 4 . When preserved in the neutrino sector, this subgroup leads to TBM mixing, since ρ(S) and ρ(Ũ ) are simultaneously diagonalised by U TBM . Considering the presentation of S 4 in terms of two generators S and T given in eq. (A.1), which we repeat here for convenience, (F.5) Thus, the preserved S generator corresponds toSTSŨ , and a 3-dimensional ρ(STSŨ ) is not diagonalised by U TBM . In order to preserveS one would need to preserve T 2 . The value of τ = i∞ has residual symmetry Z T 4 = {I, T, T 2 , T 3 }, which contains T 2 , but this does not work because ρ(T ) itself is not diagonalised by U TBM . The question is whether there exists a value of τ which preserves the Z T 2 2 = {I, T 2 } subgroup of S 4 . There are only two inequivalent finite points in the τ plane with non-trivial little groups (τ L and τ C ), both of which we have considered in Section 5, and they do not preserve Z T 2 2 . Thus, it seems rather difficult to realise TBM mixing in the considered set-up.