1 Introduction

Understanding the origin of the flavor structure of quarks and leptons is one of the big challenges in particle physics. Recently, modular flavor symmetry has attracted attention as an interesting possibility to explain the flavor structure [1]. In these models, the three generations of quarks and leptons transform non-trivially under the modular symmetry, that is, the modular symmetry is in a sense a flavor symmetry. On top of that, Yukawa couplings are assumed to be modular forms, which are holomorphic functions of the modulus \(\tau \) and transform non-trivially under the action of the modular group. As discussed in Ref. [2], it is remarkable that the (in)homogeneous finite modular group \(\Gamma _N^{(\prime )}\) with the level \(N \le 5\) is isomorphic to a well-known permutation group (or its double cover), such as \(S_3\), \(A_4^{(\prime )}\), \(S_4^{(\prime )}\) and \(A_5^{(\prime )}\), which have been intensively studied to explain the lepton flavor structure in the literature [1, 3,4,5,6,7,8,9,10]. These non-Abelian finite groups have been studied in flavor models for quarks and leptons [11,12,13,14,15,16,17,18,19,20,21]. The phenomenological aspects of the modular flavor symmetries have been actively discussed in the literature [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62].

The modular symmetry is well motivated by higher-dimensional theories such as superstring theory. For example, if we consider the torus or its orbifold compactification, the modulus parameter \(\tau \) is the complex structure modulus, which is a dynamical degree of freedom of the effective field theory determining the shape of the torus. The modular symmetry appears as the geometrical symmetry associated with this compact space. The Yukawa couplings are obtained from the overlap integrals of the profile functions of the matter zero-modes and are expressed as functions of the modulus that transform non-trivially under the modular transformation. The behavior of the zero-mode functions under the modular transformation was studied in magnetized D-brane models [63,64,65,66,67,68,69] and heterotic orbifold models [70,71,72,73,74,75]. The modular flavor symmetric three-generation models based on the magnetized extra dimension were discussed in Refs. [69, 76]. The modulus stabilization is also discussed in Refs. [77, 78].

A certain residual symmetry remains unbroken when the modular symmetry is broken by the vacuum expectation value (VEV) of the modulus at a certain fixed point. The residual symmetry \(\mathbb {Z}^{S}_{4} \subset \Gamma _N^\prime \) or \(\mathbb {Z}^S_{2} \subset \Gamma _N\), associated with the S generator, remains unbroken at \(\tau =i\). In addition, \(\mathbb {Z}^{ST}_3\) and \(\mathbb {Z}_N^{T}\), associated with ST and T, remain unbroken at \(\tau = \omega := e^{2\pi i/3}\) and \(\tau = i\infty \), respectively. These residual symmetries have been utilized in model-building in the literature [5, 25, 34, 60]. It is particularly interesting that the hierarchical structure of Yukawa matrices for the Standard Model (SM) fermions can be realized in the vicinity of the fixed points. Indeed, the lepton sector was discussed in Refs. [79, 80].

In this paper, we discuss the modular flavor symmetry which can realize the hierarchical structures of the quark masses and the Cabibbo–Kobayashi–Maskawa (CKM) mixing. The realizations of the quark mass hierarchy were discussed using \(\Gamma _3 \simeq A_4\) at \(\tau \sim \omega \) and \(\Gamma _6\) at \(\textrm{Im}\,\tau \gg 1\) in Refs. [81] and [82], respectively. We focus on the modular flavor symmetry at \(N=4\), which is isomorphic to \(S_4^{(\prime )}\). This is the minimal possibility to realize the hierarchical structure with up to cubic order of a small parameter, which may be necessary to explain the quark hierarchies. For \(\textrm{Im} \tau \gg 1\), this model has approximately the residual discrete symmetry \(\mathbb {Z}^T_4\), which realizes the quark mass structures by the Froggatt–Nielsen (FN) mechanism [83, 84]. It turns out that a few patterns of representations can realize the quark hierarchical structure. We then explicitly construct two models by assigning modular weights, so that the experimental values of the quark masses and the CKM angles are explained with small and \(\mathcal {O}\left( {1}\right) \) ratios of bottom to top quark Yukawa couplings.

The rest of the paper is organized as follows. In Sect. 2, we briefly review the modular symmetry at \(N=4\). We discuss the textures of the modular forms at \(\textrm{Im}\,\tau \gg 1\), where the residual \(\mathbb {Z}^T_4\) symmetry is realized. In Sect. 3, we discuss possible quark representations to realize the quark mass hierarchy. We study two models with different modular weights based on the modular flavor symmetry \(S_4^{(\prime )}\) in Sect. 4. Sect. 5 is devoted to a summary. The details of the \(S_4^\prime \) modular flavor symmetry as well as \(S_3\) are shown in Appendix A.

2 Modular symmetry at \(N=4\)

We briefly review the modular symmetry. The homogeneous modular group is defined as

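$$\begin{aligned} \Gamma := SL(2,\mathbb {Z}) = \left\{ \begin{pmatrix} a &{}\quad b \\ c &{}\quad d \end{pmatrix} \ \Bigg | \ a,b,c,d \in \mathbb {Z}, \ ad-bc=1 \right\} . \end{aligned}$$
(1)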

This group is generated by the generators

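$$\begin{aligned} S = \begin{pmatrix} 0 &{}\quad 1 \\ -1 &{}\quad 0 \end{pmatrix}, \quad T = \begin{pmatrix} 1 &{}\quad 1 \\ 0 &{}\quad 1 \end{pmatrix}, \quad R = \begin{pmatrix} -1 &{}\quad 0 \\ 0 &{}\quad -1 \end{pmatrix}, \end{aligned}$$
(2)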

and they satisfy the following algebraic relations,

$$\begin{aligned} S^2 = R, \qquad (ST)^3 = R^2 = \mathbb {I}, \qquad TR = RT. \end{aligned}$$
(3)

The inhomogeneous modular group is defined by \(\overline{\Gamma }:= \Gamma /\mathbb {Z}^R_2\), where \(\mathbb {Z}^R_2\) is generated by R. That is, the generator R is represented by \(R=\mathbb {I}\) in \(\overline{\Gamma }\).

In addition, the principal congruence subgroup \(\Gamma (N)\) is defined by

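$$\begin{aligned} \Gamma (N) := \left\{ \begin{pmatrix} a &{}\quad b \\ c &{}\quad d \end{pmatrix} \in \Gamma \ \Bigg | \ \begin{pmatrix} a &{}\quad b \\ c &{}\quad d \end{pmatrix} \equiv \begin{pmatrix} 1 &{}\quad 0 \\ 0 &{}\quad 1 \end{pmatrix} \ (\textrm{mod}\ N) \right\} . \end{aligned}$$
(4)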

The quotients \(\Gamma _N:= \overline{\Gamma }/\Gamma (N)\) for \(N=2,3,4\) and 5 are respectively isomorphic to \(S_3, A_4, S_4\) and \(A_5\). Moreover, the quotients \(\Gamma ^\prime _N:=\Gamma /\Gamma (N)\) for \(N=3,4\) and 5 are isomorphic to \(A_4^\prime , S_4^\prime \) and \(A_5^\prime \), which are double covering groups of \(A_4, S_4\) and \(A_5\), respectively. In these quotients, the generator T satisfies

$$\begin{aligned} T^N=\mathbb {I}, \end{aligned}$$
(5)

and thus it generates \(\mathbb {Z}_N^T\) symmetry.
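The relations in Eqs. (3) and (5) can be checked directly with integer matrices. The following is a minimal sketch, assuming the standard \(SL(2,\mathbb {Z})\) representatives \(S = \bigl ({\begin{matrix} 0 &{} 1 \\ -1 &{} 0 \end{matrix}}\bigr )\), \(T = \bigl ({\begin{matrix} 1 &{} 1 \\ 0 &{} 1 \end{matrix}}\bigr )\) and \(R = -\mathbb {I}\); the helper names `matmul` and `matpow` are illustrative and not taken from any library.

```python
# Check S^2 = R, (ST)^3 = R^2 = I, TR = RT, and T^N = I (mod N) for N = 4.
# The matrices below are the assumed standard representatives of S, T, R.

def matmul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matpow(a, n):
    """n-th power of a 2x2 matrix for a non-negative integer n."""
    out = ((1, 0), (0, 1))
    for _ in range(n):
        out = matmul(out, a)
    return out

I2 = ((1, 0), (0, 1))
S = ((0, 1), (-1, 0))
T = ((1, 1), (0, 1))
R = ((-1, 0), (0, -1))

relations = {
    "S^2 = R": matpow(S, 2) == R,
    "(ST)^3 = I": matpow(matmul(S, T), 3) == I2,
    "R^2 = I": matpow(R, 2) == I2,
    "TR = RT": matmul(T, R) == matmul(R, T),
}

# T^4 reduced mod 4: it lies in Gamma(4), so T^4 acts as the identity in the quotient.
N = 4
T_to_N_mod_N = tuple(tuple(x % N for x in row) for row in matpow(T, N))
relations["T^N = I (mod N)"] = T_to_N_mod_N == I2

for name, ok in relations.items():
    print(f"{name}: {ok}")
```

With the opposite sign convention for S, the product \((ST)^3\) would equal R rather than \(\mathbb {I}\), so the check is sensitive to the chosen representatives.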

The group \(\Gamma _4' \simeq S_4'\) has 10 irreducible representations,

$$\begin{aligned} 1,1^\prime , 2, 3, 3^\prime \quad \text {and}\quad \hat{1}, \hat{1}^\prime , \hat{2}, \hat{3}, \hat{3}^\prime . \end{aligned}$$
(6)

The non-hatted representations r are those in the \(S_4\) symmetry, transformed by R trivially, i.e. \(\rho _R(r) = \mathbb {I}\), while the hatted representations \(\hat{r}\) are transformed non-trivially by R, i.e. \(\rho _R(\hat{r}) = -\mathbb {I}\). Throughout this work, we use the representation matrices in which matrices are diagonal for T and real for S, shown in Appendix A.

The modular group \(\Gamma \) acts on the modulus \(\tau \) (\(\textrm{Im}\,\tau >0\)) as

$$\begin{aligned} \tau \rightarrow \frac{a\tau + b}{c\tau +d}. \end{aligned}$$
(7)

A modular form \(Y_r^{(k)}\) of representation r under \(\Gamma _4^\prime \) with a weight k transforms as

$$\begin{aligned} Y_{r}^{(k)}(\tau ) \rightarrow (c\tau +d)^{k} \rho (r) Y_r^{(k)}(\tau ), \end{aligned}$$
(8)

where \(\rho (r)\) is the representation matrix. The number of representations at a weight k is listed in Table 3 of Appendix A.1. At \(k=1\), there is a \(\hat{3}\) representation,

$$\begin{aligned} Y^{(1)}_{\hat{3}}(\tau ) = \begin{pmatrix} \sqrt{2}\epsilon (\tau ) \theta (\tau ) \\ \epsilon ^2(\tau ) \\ -\theta ^2(\tau ) \end{pmatrix}, \end{aligned}$$
(9)

where the functions \(\theta \) and \(\epsilon \) are written in terms of the Jacobi theta functions [8]. Their series forms are given by

$$\begin{aligned} \theta (\tau ) = 1+2\sum _{n=1}^\infty q^{n^2}, \quad \epsilon (\tau ) = 2q^{1/4} \sum _{n=0}^\infty q^{n(n+1)}, \end{aligned}$$
(10)

where \(q:= e^{2\pi i\tau }\). The modular forms at higher weights can be constructed from products of \(Y^{(1)}_{\hat{3}}\), and the ones used in our models are shown in Appendix A.1. Note that there is an ambiguity in the normalization of the modular forms, which can be absorbed by the coefficients of the couplings. We assume that the ambiguity does not change the hierarchical structure given by the powers of \(\epsilon \), and we choose a normalization such that the absolute value of the numerical factor of the element with the lowest power of \(\epsilon \) is unity, as in Eq. (9).

At \(\textrm{Im}\,\tau \gg 1\), \(\theta \sim 1\) and \(\left| {\epsilon }\right| \sim 2 e^{-(\pi /2) \textrm{Im}\,\tau } \ll 1\), and hence \(\epsilon \) will be the origin for the quark hierarchies. In this limit, the symmetry generated by T is unbroken, and thus the \(\mathbb {Z}^T_4\) symmetry can realize the FN-like mechanism [83, 84], where the flavon is replaced by \(\epsilon (\tau ) \sim 2 q^{1/4}\) whose \(\mathbb {Z}^T_4\) charge is one. The irreducible representations have the following hierarchical structures in this limit,

$$\begin{aligned}&Y_1 \sim 1, \quad Y_{1^\prime } \sim \epsilon ^2, \quad Y_{\hat{1}} \sim \epsilon ^3, \quad Y_{\hat{1}^\prime } \sim \epsilon , \quad Y_{2} \sim \begin{pmatrix} 1 \\ \epsilon ^2 \end{pmatrix}, \quad Y_{\hat{2}} \sim \begin{pmatrix} \epsilon ^3 \\ \epsilon \end{pmatrix}, \nonumber \\&Y_{3} \sim \begin{pmatrix} \epsilon ^2 \\ \epsilon ^3 \\ \epsilon \end{pmatrix}, \quad Y_{3^\prime } \sim \begin{pmatrix} 1 \\ \epsilon \\ \epsilon ^3 \end{pmatrix}, \quad Y_{\hat{3}} \sim \begin{pmatrix} \epsilon \\ \epsilon ^2 \\ 1 \end{pmatrix}, \quad Y_{\hat{3}^\prime } \sim \begin{pmatrix} \epsilon ^3 \\ 1 \\ \epsilon ^2 \end{pmatrix}, \end{aligned}$$
(11)

where \(Y_r\) is the modular form of the representation r. The weights k are omitted here since the hierarchical structures are determined only by the \(\mathbb {Z}_4^T\) charge and are independent of the weight for a given representation (see the representation matrix \(\rho (r)\) shown in Appendix A.1). We see that the maximum power of \(\epsilon \) is \(N-1 = 3\) which may be the minimal number to explain the quark hierarchies [84].
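The size of \(\epsilon \) and the \(\hat{3}\) entry of Eq. (11) can be illustrated numerically from the series in Eq. (10). The sketch below truncates both sums at a small `nmax`, which is an assumption that is harmless for \(\textrm{Im}\,\tau \gtrsim 1\); the sample modulus is the value \(\tau = 0.4956+2.2306i\) quoted later for the first benchmark, and the function names are illustrative.

```python
import cmath
import math

def theta(tau, nmax=5):
    """Truncated series theta = 1 + 2 * sum_{n>=1} q^(n^2), with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    return 1 + 2 * sum(q ** (n * n) for n in range(1, nmax + 1))

def epsilon(tau, nmax=5):
    """Truncated series epsilon = 2 q^(1/4) * sum_{n>=0} q^(n(n+1))."""
    q = cmath.exp(2j * cmath.pi * tau)
    q_quarter = cmath.exp(0.5j * cmath.pi * tau)  # q^(1/4) taken as exp(i pi tau / 2)
    return 2 * q_quarter * sum(q ** (n * (n + 1)) for n in range(nmax + 1))

tau = complex(0.4956, 2.2306)
th = theta(tau)
eps = epsilon(tau)

# Weight-1 triplet of Eq. (9): (sqrt(2) eps theta, eps^2, -theta^2).
Y1_hat3 = (math.sqrt(2) * eps * th, eps ** 2, -th ** 2)

print(f"|theta|   = {abs(th):.6f}")
print(f"|epsilon| = {abs(eps):.6f}")
print(f"2 exp(-pi Im(tau)/2) = {2 * math.exp(-0.5 * math.pi * tau.imag):.6f}")
print("|Y^(1)_3hat| components:", [f"{abs(y):.2e}" for y in Y1_hat3])
```

The three printed magnitudes follow the pattern \((\epsilon , \epsilon ^2, 1)\) listed for \(Y_{\hat{3}}\) in Eq. (11), up to the \(\sqrt{2}\) in the first component.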

3 Hierarchical structures

The goal of this work is to explain the hierarchical structure in the quark sector with \(\mathcal{O}(1)\) free parameters. The quark hierarchies may be expressed by a small parameter \(\epsilon \ll 1\),

$$\begin{aligned}&(m_u, m_c, m_t) \sim (\epsilon ^3, \epsilon , 1),\nonumber \\&(m_d, m_s, m_b) \sim \epsilon ^p\times (\epsilon ^2, \epsilon ^2, 1),\nonumber \\&V_\textrm{CKM}\sim \begin{pmatrix} 1 &{}\quad 1 &{}\quad \epsilon ^{2} \\ 1 &{}\quad 1 &{}\quad \epsilon ^{2} \\ \epsilon ^{2} &{}\quad \epsilon ^2 &{}\quad 1 \end{pmatrix}, \end{aligned}$$
(12)

where \(p=0,1\). The top to bottom mass ratio \(m_t/m_b\) will be explained by \(\epsilon \) for \(p=1\), while, for \(p=0\), it is explained by \(\tan \beta := v_u/v_d\), where \(v_u\) (\(v_d\)) is the VEV of the neutral component of the up-type (down-type) Higgs doublet in two Higgs doublet models, such as supersymmetric models. We note that \(N=4\) is the minimal possibility to realize the texture in Eq. (12), since the maximum power of \(\epsilon \) is \(N-1\): \(\epsilon ^{N-1} = \epsilon ^{3} \sim m_u/m_t\). One may think that the texture may not fully fit the data, especially for the strange to down quark mass ratio \(m_s/m_d\) and the CKM angles involving the third generation. The former is predicted to be \(\mathcal {O}\left( {1}\right) \) and the latter may be too small. It is shown in Ref. [84] that CKM angles of \(\mathcal {O}\left( {\epsilon }\right) \) fit the data. We will see later that these potential issues are resolved by the canonical normalizations and the numerical coefficients in the modular forms. Recently, the quark hierarchical structures realized by the level \(N=3\) at \(\tau \simeq \omega \) and \(N=6\) at \(\tau \simeq i\infty \) were studied in Refs. [81] and [82], respectively.

The hierarchical structure of the masses from the modular flavor symmetry is listed in Ref. [80]. The mass hierarchy of the up quarks, \((1,\epsilon , \epsilon ^3)\), is realized if

$$\begin{aligned} u^c \otimes Q = 3 \otimes {\left\{ \begin{array}{ll} 2 \oplus 1 \\ 1^\prime \oplus 1 \oplus 1 \\ 1^\prime \oplus 1^\prime \oplus 1^\prime , \end{array}\right. } \end{aligned}$$
(13)

where the exchange of Q and \(u^c\) gives the same masses. Here and hereafter, we omit the combinations which give the same representation of Yukawa couplings, e.g. \(3^\prime \otimes (2\oplus 1^\prime )\). Those of the down quarks, \(\epsilon ^p (1,\epsilon ^2, \epsilon ^2)\), are obtained if

$$\begin{aligned} Q \otimes d^c = {\left\{ \begin{array}{ll} 2 \oplus 1 \\ 1^\prime \oplus 1 \oplus 1 \end{array}\right. } \otimes 1^\prime \oplus 1^\prime \oplus 1^\prime \quad (\hat{1}^\prime \oplus \hat{1}^\prime \oplus \hat{1}^\prime ) \end{aligned}$$
(14)

for \(p=0\) \((p=1)\). One can also consider the down quark mass hierarchy \(\epsilon ^p(1,\epsilon , \epsilon ^2)\), which is realized by

$$\begin{aligned} Q \otimes d^c = 3 \otimes \hat{1}^\prime \oplus \hat{1}^\prime \oplus \hat{1}^\prime \quad \left( 1\oplus 1\oplus 1\right) , \end{aligned}$$
(15)

for \(p=0\) (\(p=1)\).

The CKM hierarchy can be realized only if \(Q = 2\oplus 1\) or \(1^\prime \oplus 1 \oplus 1\). If Q is composed of the same singlet representations, all of the mixing angles in the CKM matrix are predicted to be \(\mathcal {O}\left( {1}\right) \), and hence the hierarchy cannot be explained. If Q is a triplet 3, the top Yukawa coupling of \(\mathcal {O}\left( {1}\right) \) originates from \(Y_{3^\prime }\), see Eqs. (11) and (13). The down quark masses are realized as in Eq. (15), so the representations of the Yukawa couplings are \(Y_{\hat{3}}\) or \(Y_3\). According to Eq. (11), the top Yukawa coupling is predominantly from the first row of the Yukawa matrix, while the bottom Yukawa coupling is from the second or third row. This means that the CKM matrix is not the identity at \(\epsilon =0\), and \(\left| {V_{tb}}\right| \ll 1\) is predicted. Thus the CKM hierarchy is explained only if \(Q = 2\oplus 1\) or \(1^\prime \oplus 1\oplus 1\). Altogether, the texture in Eq. (12) is realized only in the following four cases:

$$\begin{aligned} u^c = 3, \quad d^c = {\left\{ \begin{array}{ll} 1^\prime \oplus 1^\prime \oplus 1^\prime \\ \hat{1}^\prime \oplus \hat{1}^\prime \oplus \hat{1}^\prime \end{array}\right. }, \quad Q = {\left\{ \begin{array}{ll} 2\oplus 1 \\ 1^\prime \oplus 1\oplus 1 \end{array}\right. }. \end{aligned}$$
(16)

Note that the cases such as \(u^c = 3^\prime \), \(d^c = 1\oplus 1\oplus 1\) and \(Q = 2^\prime \oplus 1^\prime \) give the same Yukawa structure, so the phenomenology will not be changed from the above four cases. The first (second) case for \(d^c\) corresponds to \(p=0\) (\(p=1\)). The texture is the same for the two cases of Q. We shall study the first case since it is more predictive, because of the smaller number of parameters. The second case is obtained by splitting 2 into \(1^\prime \oplus 1\) in the first case.

4 Models

Table 1 Assignments of the quarks and Higgs doublets under \(G_{\textrm{EW}}:= SU(2)_L\times U(1)_Y\), \(S_4^\prime \) and the weight k

We construct supersymmetric models with the representations shown in Eq. (16), which realize the texture in Eq. (12). The assignments of the chiral superfields under the electroweak (EW) gauge symmetry, \(S_4^\prime \) and the modular weights k are shown in Table 1. For general weight assignments, the Yukawa couplings are given by

$$\begin{aligned} W&=\ H_u \left\{ \alpha _1 q_1 \left( Y_3^{(k_u + k_{q_1})} u^c\right) _1 + \alpha _2 \left( q_2 Y^{(k_u + k_{q_2})}_3 u^c \right) _1 \right. \nonumber \\ {}&\quad \left. + \alpha _3 \left( q_2 Y^{(k_u+k_{q_2})}_{3^\prime } u^c \right) _1\right\} \nonumber \\&\quad + H_d \sum _{i=1}^3 \left\{ \beta _{1i} q_1 \left( Y^{(k_{d_i}+k_{q_1})}_{\textbf{1}} d^c_i \right) _1 +\beta _{2i} \left( q_2 Y^{(k_{d_i}+k_{q_2})}_{\textbf{2}}d^c_i\right) _{1} \right\} \nonumber \\&=: H_u Q Y_u u^c + H_d Q Y_d d^c, \end{aligned}$$
(17)

where \((\cdots )_{1}\) is the trivial singlet combination of the product inside the parenthesis. In addition, there are more coefficients if more than one modular form exists for a given representation r and weight k. Conversely, a term is understood to be absent if there is no modular form for the given r and k. For explicit examples, see the models in the following sections. For the down quark couplings, \((\textbf{1}, \textbf{2}) = (1^\prime , 2)\) and \((\hat{1}, \hat{2})\) for \(p=0\) and \(p=1\), respectively. In the last line, we defined \(Q:= (Q_1, Q_2, Q_3)\), \(u^c:= (u^c_1, u^c_2, u^c_3)\) and \(d^c:= (d^c_1, d^c_2, d^c_3)\). We take \(Q_1\) to be the singlet and the others to form the doublet under \(S_4^\prime \), i.e. \(q_1:= Q_1\), \(q_2:= (Q_2, Q_3)\). The up-type quark \(u^c\) is the triplet and each element of \(d^c\) is a singlet. The Kähler potential of the quark chiral superfield q with weight \(k_q\), which includes the kinetic term, is given by

$$\begin{aligned} K\supset \frac{q^\dagger q}{(-i\tau +i\overline{\tau })^{k_q}}, \end{aligned}$$
(18)

then, after the canonical normalization, the Yukawa matrices are normalized as

$$\begin{aligned}&\left[ Y_u\right] _{ij} \rightarrow \left( \sqrt{2\,\textrm{Im}\,\tau }\right) ^{k_{q_i} + k_u} \left[ Y_u\right] _{ij},\nonumber \\&\left[ Y_d\right] _{ij} \rightarrow \left( \sqrt{2\,\textrm{Im}\,\tau }\right) ^{k_{q_i} + k_{d_j}} \left[ Y_d\right] _{ij}, \end{aligned}$$
(19)

where \(i,j = 1,2,3\) and \(k_{q_3} = k_{q_2}\). At \(\textrm{Im}\,\tau \gg 1\) where \(\epsilon \sim \mathcal {O}\left( {0.01}\right) \), this normalization factor \(2\,\textrm{Im}\,\tau \sim 5\) can be important for the hierarchical structure.
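To see why the factor in Eq. (19) matters, it helps to compare it with \(1/\left| {\epsilon }\right| \). The short sketch below evaluates \((\sqrt{2\,\textrm{Im}\,\tau })^{k_1+k_2}\) for a few weight sums at the illustrative value \(\textrm{Im}\,\tau = 2.5\) (chosen so that \(2\,\textrm{Im}\,\tau = 5\)) and uses the leading-order estimate \(\left| {\epsilon }\right| \simeq 2e^{-(\pi /2)\textrm{Im}\,\tau }\); the function name and the sample weights are assumptions made for illustration only.

```python
import math

def canonical_factor(im_tau, k_left, k_right):
    """Rescaling (sqrt(2 Im tau))^(k_left + k_right) of a Yukawa entry, as in Eq. (19)."""
    return math.sqrt(2.0 * im_tau) ** (k_left + k_right)

im_tau = 2.5                                          # illustrative value
eps_abs = 2.0 * math.exp(-0.5 * math.pi * im_tau)     # leading-order |epsilon|

print(f"|epsilon| ~ {eps_abs:.4f}, 1/|epsilon| ~ {1.0 / eps_abs:.1f}")
for k_left, k_right in [(2, 0), (2, 2), (4, 2), (4, 4)]:
    factor = canonical_factor(im_tau, k_left, k_right)
    print(f"weights ({k_left}, {k_right}): factor = {factor:.1f}")
```

At this value of the modulus a weight sum of 4 already gives a factor comparable to \(1/\left| {\epsilon }\right| \), so differences in the weights between rows or columns can shift an entry by roughly one power of \(\epsilon \).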

The hierarchical structures of the Yukawa matrices before the canonical normalization are given by

$$\begin{aligned}&Y_u P_{13} \sim \begin{pmatrix} \epsilon ^3 &{}\quad \epsilon &{}\quad \epsilon ^2 \\ \epsilon ^3 &{}\quad \epsilon &{}\quad \epsilon ^2 \\ \epsilon &{}\quad \epsilon ^3 &{}\quad 1 \end{pmatrix}, \nonumber \\&Y_d P_{13} \sim \epsilon ^p \begin{pmatrix} \epsilon ^2 &{} \quad \epsilon ^2 &{}\quad \epsilon ^2 \\ \epsilon ^2 &{}\quad \epsilon ^2 &{}\quad \epsilon ^2 \\ 1 &{}\quad 1 &{}\quad 1 \\ \end{pmatrix}, \nonumber \\ \mathrm {where\,}&P_{13}:= \begin{pmatrix} 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 1 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 \\ \end{pmatrix}. \end{aligned}$$
(20)

Here, \(P_{13}\) is multiplied so that the (3, 3) elements give predominantly the top and bottom Yukawa couplings. These structures realize the mass and CKM hierarchies in Eq. (12). Note that the hierarchical structure in the masses and mixing can be realized even if some elements are vanishing because of the absence of modular forms for a given representation and weight. We shall consider two models which can explain the quark hierarchies for \(p=0\) and \(p=1\).
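A simple consistency check of the textures in Eq. (20) is that the determinant of each matrix, whose modulus equals the product of the three singular values, should scale as \(\epsilon ^4\): this matches \(\epsilon ^3\cdot \epsilon \cdot 1\) for the up sector and \(\epsilon ^2\cdot \epsilon ^2\cdot 1\) for the down sector with \(p=0\). The sketch below fills the textures with fixed, hypothetical \(\mathcal {O}(1)\) coefficients (they are not the fitted values of the models) and extracts the scaling exponent from two values of \(\epsilon \).

```python
import math

# Powers of epsilon read off from Eq. (20) (the permutation P_13 does not change |det|).
UP_POWERS = [[3, 1, 2], [3, 1, 2], [1, 3, 0]]
DOWN_POWERS = [[2, 2, 2], [2, 2, 2], [0, 0, 0]]   # p = 0

# Hypothetical O(1) coefficients, chosen only for illustration.
COEFFS = [[1.0, 0.7, 1.3], [0.8, 1.2, 0.9], [1.1, 0.6, 1.0]]

def texture(powers, eps):
    """Matrix with entries COEFFS[i][j] * eps^powers[i][j]."""
    return [[COEFFS[i][j] * eps ** powers[i][j] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def det_exponent(powers, eps1=0.06, eps2=0.03):
    """Estimate n in |det| ~ eps^n from two sample values of eps."""
    d1 = abs(det3(texture(powers, eps1)))
    d2 = abs(det3(texture(powers, eps2)))
    return math.log(d1 / d2) / math.log(eps1 / eps2)

print(f"up-sector exponent:   {det_exponent(UP_POWERS):.4f}")
print(f"down-sector exponent: {det_exponent(DOWN_POWERS):.4f}")
```

This only tests the product of the eigenvalues; the individual mass ratios and mixing angles additionally depend on the coefficients and on the canonical normalization.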

4.1 Large \(\tan \beta \) scenario: \(p=0\)

Table 2 The values of the Yukawa couplings at benchmark points in the case of \(p=0\) (left) and \(p=1\) (right). The second column is predictions of our models, and the third (fourth) column shows the experimental values (its \(1\sigma \) error). The central values are at the GUT scale after the renormalization group evolution from the experimental values when \(M_{\textrm{SUSY}} = 10~\textrm{TeV}\) and vanishing threshold corrections in the MSSM [86]. The errors at the scale 10 TeV are shown for reference. At these points, \(\tan \beta = 36.0982\) (1.6358) on the left (right) panel

We assign the modular weights as

$$\begin{aligned}&k_{q_1} = 2, \quad k_{q_2} = 4, \quad k_u = 2,\nonumber \\&k_{d_1} = 4, \quad k_{d_2} = 2, \quad k_{d_3} = 0. \end{aligned}$$
(21)

Since there is no odd weight, there are only non-hatted representations. This means that the inhomogeneous group \(\Gamma _4^\prime /R \simeq S_4\) is enough for this model. We can also study other patterns of the weights where the weights of the Yukawa couplings are less than 10, but this setup has the smallest hierarchy among the parameters while explaining the experimental values.

In this case, the superpotential is

$$\begin{aligned} W&= H_u \Bigg \{ \alpha _1 q_1 \left( Y_3^{(4)} u^c\right) _1 + \alpha _2 \left( q_2 Y^{(6)}_3 u^c \right) _1 \nonumber \\ {}&\quad + \sum _{i_Y=1}^2 \alpha _3^{i_Y} \left( q_2 Y^{i_Y(6)}_{3^\prime } u^c \right) _1 \Bigg \} \nonumber \\&\quad + H_d \left\{ \beta _{11} q_1 \left( Y^{(6)}_{{1}^\prime } d^c_1 \right) _1 + \sum _{i_Y=1}^2 \beta _{21}^{i_Y} \left( q_2 Y^{i_Y(8)}_{{2}}d^c_1\right) _{1} \right. \nonumber \\ {}&\quad \left. + \sum _{j=2}^3 \beta _{2j} \left( q_2 Y^{(10-2j)}_{2}d^c_j\right) _{1} \right\} . \end{aligned}$$
(22)

The Yukawa matrices are given by

$$\begin{aligned}&Y_u =\nonumber \\&\begin{pmatrix} \alpha _1 [Y^{(4)}_3]_1 &{}\quad \alpha _1 [Y^{(4)}_3]_3 &{}\quad \alpha _1 [Y^{(4)}_3]_2 \\ -2 \alpha _2 [Y^{(6)}_3]_1 &{}\quad \alpha _2 [Y^{(6)}_3]_3 +\sqrt{3}\alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_2 &{}\quad \alpha _2 [Y^{(6)}_3]_2 +\sqrt{3}\alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_3 \\ -2 \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_1 &{}\quad \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_3 -\sqrt{3}\alpha _2 [Y^{(6)}_{3}]_2 &{}\quad \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_2 -\sqrt{3}\alpha _2 [Y^{(6)}_{3}]_3 \\ \end{pmatrix}, \nonumber \\ Y_d&= \begin{pmatrix} \beta _{11} Y^{(6)}_{1^\prime } &{}\quad 0 &{}\quad 0 \\ -\beta _{21}^{i_Y} [Y^{i_Y(8)}_{2}]_2 &{}\quad -\beta _{22} [Y^{(6)}_{2}]_2&{}\quad -\beta _{23}[Y^{(4)}_{2}]_2 \\ \beta _{21}^{i_Y} [Y^{i_Y (8)}_{2}]_1 &{} \quad \beta _{22} [Y^{(6)}_{2}]_1&{}\quad \beta _{23}[Y^{(4)}_{2}]_1 \end{pmatrix}, \end{aligned}$$
(23)

where \([Y^{(k)}_r]_{i}\) is the i-th element of \(Y^{(k)}_r\). Here, the summation over \(i_Y = 1,2\) is implicit. Note that there is no \(1^\prime \) at \(k=2,4\), so the (1, 2) and (1, 3) elements in \(Y_d\) are zero. Altogether there are 9 coefficients, namely \(\alpha _1,\alpha _2,\alpha _3^1, \alpha _3^2,\beta _{11},\beta _{21}^1, \beta _{21}^2, \beta _{22}\) and \(\beta _{23}\). In our analysis, we assume for simplicity that the coefficients are real except \(\alpha _3^1\). We introduce a phase to \(\alpha _3^1\), since the CKM phase is approximately vanishing if all of the coefficients are real.

We fit \(\tan \beta \), \(\tau \) and the 9 coefficients to explain the observed quark masses and the CKM mixing. The values of Yukawa coupling constants and CKM angles at the Grand Unified Theory (GUT) scale in the MSSM are calculated in Ref. [86]. Throughout our analysis, we refer to the values when \(M_{\textrm{SUSY}} = 10\) TeV, and the SUSY threshold corrections are zero. We find the benchmark point where the observables are within \(1\sigma \) range, as shown in the left panel of Table 2. The input parameters at this point are \(\tan \beta = 36.0982\), \(\tau = 0.4956+2.2306i\), \(|\alpha _3^1| = 4.1369\times 10^{-3}\) and

$$\begin{aligned} \frac{1}{|\alpha _3^1|} \begin{pmatrix} \alpha _{1} \\ \alpha _{2} \\ \alpha _{3}^1 \\ \alpha _3^2 \end{pmatrix} = \begin{pmatrix} -0.1357 \\ -1.6734 \\ e^{0.0074i} \\ -0.6894 \end{pmatrix} , \quad \frac{1}{|\alpha _3^1|} \begin{pmatrix} \beta _{11} \\ \beta _{21}^1 \\ \beta _{21}^2 \\ \beta _{22} \\ \beta _{23} \end{pmatrix} = \begin{pmatrix} -3.1165 \\ 0.1350 \\ 1.6214 \\ -0.1357 \\ 0.2806 \end{pmatrix}. \end{aligned}$$
(24)

Here, we normalize the coefficients by the absolute value of \(\alpha _3^1\). The overall factor of the coefficients needs to be small to compensate for the relatively large factor from the canonical normalization in Eq. (19). Although this would not be a problem because we do not know the overall normalization of the Yukawa forms, the smallness of the overall factor of the coefficients could also be explained by other moduli, which appear only in the overall factor [87]. The ratio of the largest and smallest absolute values of the coefficients is about 23.
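The quoted ratio, together with the values of \(\left| {\epsilon }\right| \) and \(2\,\textrm{Im}\,\tau \) at this benchmark, can be reproduced from the numbers in Eq. (24) and the value of \(\tau \) given above. The snippet below is a small bookkeeping sketch; the dictionary keys are just labels, and \(\left| {\epsilon }\right| \) is estimated with the leading-order expression \(2e^{-(\pi /2)\textrm{Im}\,\tau }\).

```python
import cmath
import math

tau = complex(0.4956, 2.2306)   # benchmark modulus of the p = 0 model

# Coefficients of Eq. (24), in units of |alpha_3^1|.
coefficients = {
    "alpha_1": -0.1357,
    "alpha_2": -1.6734,
    "alpha_3^1": cmath.exp(0.0074j),
    "alpha_3^2": -0.6894,
    "beta_11": -3.1165,
    "beta_21^1": 0.1350,
    "beta_21^2": 1.6214,
    "beta_22": -0.1357,
    "beta_23": 0.2806,
}

magnitudes = {name: abs(value) for name, value in coefficients.items()}
largest = max(magnitudes, key=magnitudes.get)
smallest = min(magnitudes, key=magnitudes.get)
ratio = magnitudes[largest] / magnitudes[smallest]

eps_abs = 2.0 * math.exp(-0.5 * math.pi * tau.imag)   # leading-order |epsilon|
t = 2.0 * tau.imag

print(f"largest: {largest}, smallest: {smallest}, ratio = {ratio:.1f}")
print(f"|epsilon| ~ {eps_abs:.3f}, t = 2 Im(tau) = {t:.2f}")
```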

With the assignment of the weights, the predictions for the quark masses and CKM angles at \(t:= 2\,\textrm{Im}\,\tau \gg 1\) are modified as

$$\begin{aligned}&(y_u, y_c, y_d, y_s, y_b)/y_t \sim \ (\epsilon ^3/t, \epsilon , \epsilon ^2/t^2, \epsilon ^2, t) \sim (5\times 10^{-5},\nonumber \\&0.06, 2\times 10^{-4}, 0.004, 5), \nonumber \\&\quad (s_{12}, s_{23}, s_{13}) \sim \ (1/t, \epsilon ^2, \epsilon ^2/t) \sim (0.2, 0.0036,0.0008), \end{aligned}$$
(25)

where \(\epsilon \sim 0.06\) and \(t \sim 4.5\). Here, we define \(s_{ij}:= \sin \theta _{ij}\), where \(\theta _{ij}\) is the mixing angle between the i-th and j-th generations in the standard parametrization of the CKM matrix [88]. For quark masses, \(y_d\) and \(y_s\) are well explained in this setup, because of the \(t^2\) difference between them. The other ones, \(y_u\), \(y_c\) and \(y_b\), are predicted to be larger than the experimental values by an order of magnitude. These are resolved by \(\mathcal {O}\left( {0.1}\right) \) values of \(\alpha _1\) and \(\beta _{21}\) for \(y_u\) and \(y_b\), respectively. The charm Yukawa \(y_c\) is realized by a tuning between \(\alpha _2\) and \(\alpha _3\). The phase of \(\alpha _3\) should be small (but non-zero) to keep the charm mass light, because it cannot be canceled by real \(\alpha _2\). In other words, the phases of \(\alpha _2\) and \(\alpha _3\) should be approximately aligned to explain the charm mass. Regarding the CKM angles, \(s_{12}\) and \(s_{13}\) are well explained by the pattern in Eq. (25), while \(s_{23} \sim 0.004\) is an order of magnitude smaller than the experimental value. This gap is explained by the numerical factor of \(\mathcal {O}\left( {\epsilon ^2}\right) \) in the modular form \(Y^{(8)}_2 \sim (1, 10/\sqrt{3}\,\epsilon ^2)\).
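The order-of-magnitude numbers in Eq. (25) follow from plain arithmetic with \(\epsilon \sim 0.06\) and \(t \sim 4.5\). The sketch below simply evaluates the quoted power counting; it contains no \(\mathcal {O}(1)\) coefficients and is not a fit.

```python
eps, t = 0.06, 4.5   # rough benchmark values quoted for the p = 0 model

# Power counting of Eq. (25) for Yukawa couplings relative to y_t.
yukawa_over_yt = {
    "y_u": eps ** 3 / t,
    "y_c": eps,
    "y_d": eps ** 2 / t ** 2,
    "y_s": eps ** 2,
    "y_b": t,
}

# Power counting of Eq. (25) for the CKM mixing angles.
mixing = {
    "s12": 1.0 / t,
    "s23": eps ** 2,
    "s13": eps ** 2 / t,
}

for name, value in {**yukawa_over_yt, **mixing}.items():
    print(f"{name:4s} ~ {value:.1e}")

# The strange-to-down ratio is set by t^2 in this counting.
ys_over_yd = yukawa_over_yt["y_s"] / yukawa_over_yt["y_d"]
print(f"y_s / y_d ~ {ys_over_yd:.1f}")
```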

Altogether, the CKM angles and \(y_d\), \(y_s\) are well explained in this model, while there should be \(\mathcal {O}\left( {0.1}\right) \) hierarchies in \(\alpha _1/\alpha ^2_3\) and \(\beta _{21}/\alpha _3^2\) for \(y_u/y_t\) and \(y_b/y_t\) respectively, and the \(\mathcal {O}\left( {0.1}\right) \) tuning between \(\alpha _2\) and \(\alpha _3\) for \(y_c/y_t\). Although the small hierarchies of \(\mathcal {O}\left( {0.1}\right) \) may be simply accidental, we will discuss the possible origin for the small hierarchy in Sect. 4.3.

4.2 Small \(\tan \beta \): \(p=1\)

Now we consider the case with \(p=1\), in which the bottom quark mass is suppressed by \(\epsilon \). We assign the modular weights as

$$\begin{aligned}&k_{q_1} = 4, \quad k_{q_2} = 4, \quad k_u = 2,\nonumber \\&k_{d_1} = 5, \quad k_{d_2} = 3, \quad k_{d_3} = 1, \end{aligned}$$
(26)

so that the superpotential is given by

$$\begin{aligned} W&=H_u \Bigg \{ \alpha _1 q_1 \left( Y_3^{(6)} u^c\right) _1 + \alpha _2 \left( q_2 Y^{(6)}_3 u^c \right) _1 \nonumber \\ {}&+ \sum _{i_Y=1}^2 \alpha _3^{i_Y} \left( q_2 Y^{i_Y(6)}_{3^\prime } u^c \right) _1 \Bigg \} \nonumber \\ {}&\quad + H_d \left\{ \beta _{11} q_1 \left( Y^{(9)}_{\hat{1}} d^c_1 \right) _1 + \sum _{i=1}^3 \beta _{2i} \left( q_2 Y^{(11-2i)}_{\hat{2}}d^c_i\right) _{1} \right\} . \end{aligned}$$
(27)

The Yukawa matrices are given by

$$\begin{aligned} Y_u&= \begin{pmatrix} \alpha _1 [Y^{(6)}_3]_1 &{}\quad \alpha _1 [Y^{(6)}_3]_3 &{}\quad \alpha _1 [Y^{(6)}_3]_2 \\ -2 \alpha _2 [Y^{(6)}_3]_1 &{}\quad \alpha _2 [Y^{(6)}_3]_3 +\sqrt{3}\alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_2 &{}\quad \alpha _2 [Y^{(6)}_3]_2 +\sqrt{3}\alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_3 \\ -2 \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_1 &{}\quad \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_3 -\sqrt{3}\alpha _2 [Y^{(6)}_{3}]_2 &{}\quad \alpha _3^{i_Y} [Y^{i_Y(6)}_{3^\prime }]_2 -\sqrt{3}\alpha _2 [Y^{(6)}_{3}]_3 \\ \end{pmatrix}, \nonumber \\ Y_d&= \begin{pmatrix} \beta _{11} Y^{(9)}_{\hat{1}} &{}\quad 0 &{}\quad 0 \\ \beta _{21} [Y^{(9)}_{\hat{2}}]_1 &{}\quad \beta _{22} [Y^{(7)}_{\hat{2}}]_1 &{}\quad \beta _{23}[Y^{(5)}_{\hat{2}}]_1 \\ \beta _{21} [Y^{(9)}_{\hat{2}}]_2 &{}\quad \beta _{22} [Y^{(7)}_{\hat{2}}]_2 &{}\quad \beta _{23}[Y^{(5)}_{\hat{2}}]_2 \end{pmatrix}. \end{aligned}$$
(28)

In \(Y_d\), (1, 2) and (1, 3) elements vanish because there is no \(\hat{1}\) representation for \(k<9\). There are 8 coefficients in this setup. We assign the odd weights for the down-quarks, and hence there are the hatted-representations in \(Y_d\). Thus we should consider the homogeneous group \(\Gamma _4^\prime \) unlike the first model. In our numerical analysis, we assume that the coefficients are real except \(\alpha _3^1\) as in the first model for simplicity.

At the benchmark point, the parameters are given by \(\tan \beta = 1.6358\), \(\tau = 0.4944+2.6779i\), \(|\alpha _3^1| = 1.2683\times 10^{-3}\) and

$$\begin{aligned} \frac{1}{|\alpha _3^1|} \begin{pmatrix} \alpha _{1} \\ \alpha _{2} \\ \alpha _{3}^1 \\ \alpha _3^2 \end{pmatrix}&= \begin{pmatrix} -0.2674 \\ 1.7408 \\ e^{-3.1281i} \\ -1.4009 \end{pmatrix} , \quad \frac{1}{|\alpha _3^1|} \begin{pmatrix} \beta _{11} \\ \beta _{21} \\ \beta _{22} \\ \beta _{23} \end{pmatrix} = \begin{pmatrix} -6.9026 \\ -0.1294 \\ 0.2800 \\ 0.4095 \end{pmatrix}. \end{aligned}$$
(29)

The ratio of the largest and smallest absolute values of the coefficients is about 53. The slightly larger ratio is necessary because the parameter \(\epsilon \sim 0.03\) is smaller than in the first case.

The hierarchical structure after the canonical normalization gives the masses and CKM angles as

$$\begin{aligned} (y_u, y_c, y_d, y_s, y_b)/y_t \sim&\ (\epsilon ^3, \epsilon , \epsilon ^3/t^{1/2}, \epsilon ^3 t^{1/2}, \epsilon \, t^{3/2}) \nonumber \\ \sim&(3\times 10^{-5}, 0.03, 1\times 10^{-5}, 6\times 10^{-5}, 0.4), \nonumber \\ (s_{12}, s_{23}, s_{13})&\sim \ (1, \epsilon ^2, \epsilon ^2) \sim (1, 9\times 10^{-4}, 9\times 10^{-4}), \end{aligned}$$
(30)

where \(\epsilon \sim 0.03\) and \(t \sim 5.4\). The hierarchy well explains most of the hierarchical patterns except \(y_c\), \(y_b\) and \(s_{23}\). Similarly to the first model, \(y_c\) is suppressed by the cancellation between \(\alpha _2\) and \(\alpha _3\), and \(y_b\) is suppressed by \(\beta _{21}\). The CKM angle \(s_{23}\) is enhanced by the ratio of the coefficients \(\left| {\beta _{11}/\beta _{21}}\right| \sim 50\). Note that the first (second) row in the Yukawa matrices \(Y_u\) and \(Y_d\) is predominantly the second (first) generation, as opposed to the first model.
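As for the first model, the estimates in Eq. (30) are simple arithmetic. The sketch below stores each observable as a pair of exponents \((a, b)\) in \(\epsilon ^a t^b\), which makes the role of the half-integer powers of t from the odd weights explicit; the values \(\epsilon \sim 0.03\) and \(t \sim 5.4\) are the rough ones quoted above, and the coefficient ratio is taken from Eq. (29).

```python
eps, t = 0.03, 5.4   # rough benchmark values quoted for the p = 1 model

# Exponents (a, b) of eps^a * t^b, read off from Eq. (30).
exponents = {
    "y_u/y_t": (3, 0.0),
    "y_c/y_t": (1, 0.0),
    "y_d/y_t": (3, -0.5),
    "y_s/y_t": (3, 0.5),
    "y_b/y_t": (1, 1.5),
    "s12": (0, 0.0),
    "s23": (2, 0.0),
    "s13": (2, 0.0),
}

estimates = {name: eps ** a * t ** b for name, (a, b) in exponents.items()}
for name, value in estimates.items():
    print(f"{name:8s} ~ {value:.1e}")

# Ratio of coefficients from Eq. (29) that enhances s23 in this model.
beta_ratio = abs(-6.9026 / -0.1294)
print(f"|beta_11 / beta_21| ~ {beta_ratio:.0f}")
```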

4.3 \(S_3\) origins of the small hierarchies and spontaneous CP violation

The small hierarchical structure of the coefficients in the previous two models can be explained by another modular symmetry \(S_3\) at \(N=2\). The \(S_3\) modular symmetry would be realized in models with extra dimensions, e.g. \(T_2\times T_2\), where the first \(T_2\) leads to \(S_4^\prime \) and the second one leads to \(S_3\). In this case, the coefficients \(\alpha _{i}\) and \(\beta _{ai}\) are modular forms of \(\tau _2\), where \(\tau _2\) is the modulus of the \(S_3\) modular symmetry. In both models of the previous sections, we found the small hierarchies in the coefficients,

$$\begin{aligned} \left| {\alpha _1}\right| \ll \left| {\alpha _2}\right| , \left| {\alpha _3}\right| , \quad \left| {\beta _{11}}\right| \gg \left| {\beta _{21}}\right| , \left| {\beta _{22}}\right| , \left| {\beta _{23}}\right| . \end{aligned}$$
(31)

Here we omit the upper index \(i_Y\) for the multiple modular forms. Similarly to \(\Gamma _4^\prime \simeq S_4^\prime \), we can consider \(\Gamma _2 \simeq S_3\) to explain this hierarchy, by another small parameter \(\epsilon _2 \sim 0.1\) controlled by another modulus \(\tau _2\). The irreducible representations have the following hierarchies at \(\textrm{Im}\,\tau _2 \gg 1\),

$$\begin{aligned} Y_1 \sim 1, \quad Y_{1^\prime } \sim \epsilon _2, \quad Y_2 \sim \begin{pmatrix} 1 \\ \epsilon _2 \end{pmatrix}, \end{aligned}$$
(32)

in the basis with real S and diagonal T. The explicit forms of the modular forms are shown in Appendix A.2.

Now, we assign \(d_i^c\), \(q_1\) to the trivial singlet 1 and \(u^c\), \(q_2\) to the non-trivial singlet \(1^\prime \) under the \(S_3\) symmetry. Then the hierarchical structures of the Yukawa matrices are given by

$$\begin{aligned} Y_u \propto \begin{pmatrix} \epsilon _2 &{}\quad \epsilon _2 &{} \quad \epsilon _2 \\ 1 &{}\quad 1 &{} \quad 1 \\ 1 &{}\quad 1 &{} \quad 1 \end{pmatrix}, \quad Y_d \propto \begin{pmatrix} 1 &{} \quad 1 &{} \quad 1 \\ \epsilon _2 &{}\quad \epsilon _2 &{}\quad \epsilon _2 \\ \epsilon _2 &{}\quad \epsilon _2 &{}\quad \epsilon _2 \\ \end{pmatrix}. \end{aligned}$$
(33)

This can explain the hierarchical pattern in Eq. (31). For example, we can construct \(Y_1\) and \(Y_{1^\prime }\) by the modular forms of weight 6 as explicitly shown in Appendix A.2. The hierarchical structure of the Yukawa couplings is essentially realized by the \(\mathbb {Z}_4^T\times \mathbb {Z}_2^{T^\prime }\) symmetry, where the second factor comes from the \(S_3\) modular symmetry. The residual symmetry \(\mathbb {Z}_6^T \subset \Gamma _6\) plays a similar role in the model of Ref. [82].
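As a minimal sketch of how the pattern in Eq. (33) follows from the assignments, one can count powers of \(\epsilon _2\) using the \(S_3\) singlet products \(1\otimes 1^\prime = 1^\prime \) and \(1^\prime \otimes 1^\prime = 1\) together with Eq. (32). The row layout (\(q_1\) in the first row, \(q_2\) in the second and third rows) and all names in the code are assumptions made for illustration, chosen so that the result can be compared with Eq. (33).

```python
# Z_2^{T'} charges of the S_3 singlets in the basis with diagonal T:
# the trivial singlet 1 carries charge 0, the non-trivial singlet 1' charge 1.
CHARGE = {"1": 0, "1p": 1}

# S_3 assignments quoted in the text.
# ASSUMPTION: q_1 fills the first row and q_2 the second and third rows
# of the Yukawa matrices (layout chosen for comparison with Eq. (33)).
q_rows = ["1", "1p", "1p"]
uc_cols = ["1p", "1p", "1p"]
dc_cols = ["1", "1", "1"]


def eps2_powers(rows, cols):
    """Power of epsilon_2 per entry: 0 needs Y_1 ~ 1, 1 needs Y_1' ~ epsilon_2."""
    return [[(CHARGE[r] + CHARGE[c]) % 2 for c in cols] for r in rows]


Yu_powers = eps2_powers(q_rows, uc_cols)
Yd_powers = eps2_powers(q_rows, dc_cols)

for name, powers in (("Y_u", Yu_powers), ("Y_d", Yd_powers)):
    print(name, powers)
```

An entry with power 0 is unsuppressed and an entry with power 1 is of order \(\epsilon _2\), so the first row of \(Y_u\) and the second and third rows of \(Y_d\) come out suppressed in this counting.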

Spontaneous CP violation may be induced by the modulus VEV of the \(S_3\) symmetry at \(\mathcal {O}\left( {\epsilon _2^2}\right) \). If all of the \(\mathcal {O}\left( {1}\right) \) coefficients are real, the CKM phase vanishes up to \(\mathcal {O}\left( {\epsilon ^4}\right) \) and \(\mathcal {O}\left( {\epsilon _2^2}\right) \). After rotating the phases of the quarks as

$$\begin{aligned} Q^T \rightarrow \begin{pmatrix} e^{-2i\phi } Q_1\\ e^{-i(2\phi +\phi _2)} Q_2 \\ e^{-i\phi _2} Q_3 \end{pmatrix}, \quad u^c \rightarrow \begin{pmatrix} e^{i\phi _2} u^c_1\\ e^{i(\phi +\phi _2)} u^c_2 \\ e^{-i(\phi -\phi _2)} u^c_3 \end{pmatrix}, \quad d^c \rightarrow d^c, \end{aligned}$$
(34)

where \(\phi := \textrm{Arg}(\epsilon )\) and \(\phi _2:= \textrm{Arg}(\epsilon _2)\), the phases of the Yukawa matrices are given by

$$\begin{aligned} \textrm{Arg}\left( Y_u\right)&= \begin{pmatrix} 2\phi _2 &{}\quad 2\phi _2 &{}\quad 2\phi _2 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 4\phi &{}\quad 0 \end{pmatrix}, \quad \textrm{Arg}\left( Y_d\right) = 0. \end{aligned}$$
(35)

The phase \(4\phi \) does not contribute to the CKM phase, since it is relevant only to the diagonalization of \(u^c\) up to \(\mathcal {O}\left( {\epsilon ^4}\right) \). By contrast, the phase \(2\phi _2\) contributes to the CKM phase, and thus spontaneous CP violation will occur due to the \(S_3\) modular symmetry (see footnote 7). We also note that our assumption on the phases, i.e. that only \(\alpha ^1_3\) is complex, no longer holds, and all of the coefficients will be complex. Nonetheless, we expect that there will be a good fit to the data; a concrete numerical analysis is beyond the scope of this paper.
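The general reason why phase rotations such as Eq. (34) cannot create or remove CKM CP violation can be illustrated numerically. The sketch below uses the standard rephasing invariant \(\textrm{Im}\,\det [Y_uY_u^\dagger , Y_dY_d^\dagger ]\), which is proportional to the Jarlskog invariant. The Yukawa matrices here are random placeholders, not the textures of this paper, and the convention that rows are labeled by \(Q\) and columns by \(u^c\) or \(d^c\) is an assumption of the example.

```python
import numpy as np


def cp_invariant(Yu, Yd):
    """Im det [Yu Yu^dagger, Yd Yd^dagger], proportional to the Jarlskog invariant.

    ASSUMPTION: rows of Yu, Yd are labeled by the doublets Q,
    columns by the singlets u^c and d^c.
    """
    Hu = Yu @ Yu.conj().T
    Hd = Yd @ Yd.conj().T
    comm = Hu @ Hd - Hd @ Hu
    return np.linalg.det(comm).imag


def phase_matrix(angles):
    """Diagonal matrix of phases exp(i * angle)."""
    return np.diag(np.exp(1j * np.asarray(angles)))


# Illustrative random complex Yukawa matrices (not the model's textures).
rng = np.random.default_rng(0)
Yu = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Yd = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

PQ = phase_matrix([0.3, -1.1, 2.0])   # rephasing of Q, common to Yu and Yd
Pu = phase_matrix([0.7, 0.2, -0.5])   # rephasing of u^c
Pd = phase_matrix([-0.4, 1.3, 0.9])   # rephasing of d^c

J0 = cp_invariant(Yu, Yd)
J1 = cp_invariant(PQ @ Yu @ Pu, PQ @ Yd @ Pd)

# With purely real Yukawa matrices the invariant vanishes identically.
J_real = cp_invariant(Yu.real, Yd.real)

print(np.isclose(J0, J1), J_real)
```

Phases acting only on \(u^c\) or \(d^c\) drop out of \(Y Y^\dagger \), and a common rephasing of \(Q\) conjugates the commutator without changing its determinant. Only phases that survive every such field redefinition, like \(2\phi _2\) in Eq. (35), can feed into the CKM phase.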

5 Summary

In this paper, we demonstrated that the hierarchical structure of the quark masses and the CKM matrix is realized in the modular flavor symmetry at the level \(N=4\). The FN-like mechanism is realized due to the \(\mathbb {Z}_4^T\) symmetry with a small parameter \(\epsilon \) where the modulus is assumed to be stabilized at \(\textrm{Im}\,\tau \gg 1\). We found that there are only four cases shown in Eq. (16) in which the observed hierarchical structure in Eq. (12) is realized.

We then studied the two examples with different assignments of the modular weights for the quarks. In both models, the quark hierarchical structures are realized with \(\mathcal {O}\left( {1}\right) \) coefficients, although the small hierarchy of \(\mathcal {O}\left( {0.1}\right) \) shown in Eq. (31) is needed, as well as the cancellation between the parameters \(\alpha _2\) and \(\alpha _3\) to explain the charm mass. We proposed a way to understand the small hierarchical structure in Eq. (31) and the origin of the CKM phase by the existence of another modular flavor symmetry \(S_3\). In this case, the hierarchical structure of the Yukawa couplings is essentially realized by \(\mathbb {Z}_4^T\times \mathbb {Z}_2^{T^\prime }\), where the second symmetry comes from the second modular symmetry \(S_3\). It also turned out that the factors from canonical normalization \(2\,\textrm{Im}\,\tau \sim 5\) play an important role, because of the assignments of the modular weights. For instance, the Cabibbo angle is explained by this factor in the first model.