1 Introduction

The puzzling flavor structure of the Standard Model (SM) has inspired a lot of model building over the past decades. The large ratios of masses and mixing parameters displayed in Table 1, commonly referred to as flavor hierarchies, cannot be the result of a generic ultraviolet (UV) theory with order-one couplings.

Another feature of the SM matter sector is the peculiar structure of its five gauge representations, which strongly hints at some unified gauge group with only two SU(5) or one SO(10) representation. These so-called grand-unified theories (GUTs) in turn strongly motivate the extension of the SM to its minimally supersymmetric version, the MSSM, as the latter provides an accurate unification of the gauge couplings near a scale of the order of \(10^{16}\) GeV, commonly referred to as the GUT scale. However, the unification of couplings is much less impressive in the Yukawa sector. In a supersymmetric theory, the Yukawa couplings are encoded in the superpotential

$$\begin{aligned} W^\mathrm{MSSM}_Y=H_d Q^T \mathcal Y_d D^c + H_u Q^T \mathcal Y_u\,U^c+ H_d L^{T} \mathcal Y_e\, E^c , \end{aligned}$$
(1)

where Q and L are the superfields containing the doublet SM matter fields, \(E^c\), \(U^c\), and \(D^c\) contain the singlet matter fields, \(H_d\) and \(H_u\) are the two Higgs doublets of the MSSM, and \(\mathcal Y_{u,d,e}\) are the Yukawa coupling matrices.Footnote 1 Both Higgs fields need to acquire vacuum expectation values, usually parametrized by a scale v and an angle \(\beta \) as \(\left\langle H_u\right\rangle =v\sin \beta \), \(\left\langle H_d\right\rangle =v\cos \beta \). Furthermore, in SU(5) unification, one assembles \(D^c\) and L into a \(\bar{\mathbf{5}}\) representation \(F^c\), and the Q, \(U^c\) and \(E^c\) fields into an antisymmetric tensor \(\mathbf{10}\) representation A, whereas the Higgs doublet \(H_u\) (\(H_d\)) unifies with an additional color triplet into a \(\mathbf{5}\) (\(\bar{\mathbf{5}}\)) representation, called H (\(H^c\)). The Yukawa sector now becomes more constrained, as the SU(5) invariant superpotential now has only two independent operators

$$\begin{aligned} W^{SU(5)}_Y = H^c A^T\, \mathcal Y F^c + \frac{1}{2} H A^T \mathcal Y' A , \end{aligned}$$
(2)

with \(\mathcal Y\) an arbitrary complex and \(\mathcal Y'\) a symmetric complex matrix. One then has the GUT-scale relation \(\mathcal Y_d=\mathcal Y_e^T\) for the Yukawa coupling matrices of the charged-lepton and down-quark sectors. A quick glance at Table 1 (which shows the couplings at the GUT scale) reveals that this is certainly not the case. Even though supersymmetric threshold corrections are more important than in the gauge sector (and more dependent on the superpartner spectrum), it is very hard to attribute the large differences in the couplings to this alone. Typically it is possible to adjust the spectrum such that only some but not all of the Yukawa couplings become unified. Certain GUT breaking effects at the high scale thus seem to be unavoidable. Some efforts have been made to include such breaking effects, for instance via exotic Higgs representations [1], nonrenormalizable operators [2, 3], vectorlike representations [4], or combinations of these [5]. In minimal SO(10) unification, where A and \(F^c\) unify (together with a right handed neutrino) into a \(\mathbf{16}\) spinor representation that we call S, and \(H,H^c\) assemble into a fundamental representation \(\mathbf{10}\) (called \(H_{10}\)), one has the unique Yukawa interaction

$$\begin{aligned} W_Y^{SO(10)}=\frac{1}{2}H_{10}S^T\mathcal YS , \end{aligned}$$
(3)

with \(\mathcal Y^T=\mathcal Y\), and hence all Yukawa couplings are predicted to be exactly equal, \(\mathcal Y_u=\mathcal Y_d=\mathcal Y_e\). The necessary GUT breaking effects now become even larger, as a comparison between the different columns of Table 1 shows.

Table 1 Quark and lepton data at the GUT scale in the MSSM [6]. Here, \(y'_i=y_i\sin \beta \) and \(y''_i=y_i\cos \beta \). The values are representative, as they depend on the supersymmetric threshold corrections, and indirectly on \(\tan \beta \) via the renormalization group running

The clockwork (CW) mechanism, originally formulated in [7, 8] in the context of the Relaxion, was soon recognized as a general framework for constructing natural hierarchies [9]. In the flavor sector, it has been applied to explain the smallness of neutrino masses [10,11,12,13,14] as well as the charged flavor hierarchies mentioned above [15,16,17,18,19,20,21]. The CW Lagrangian is basically a one-dimensional lattice (“theory space”) of nearest-neighbor interactions enforced by some symmetry. For instance, consider the following fermionic CW Lagrangian

$$\begin{aligned} \mathcal L_\mathrm{CW}=\bar{\psi }_{L,0} {\mathcal O}+\sum _{i=1}^N m_i \bar{\psi }_{L,i} \psi _{R,i}+k_i\phi \bar{\psi }_{L,i-1}\psi _{R,i}+\mathrm{h.c.} \end{aligned}$$
(4)

The fields \(\psi _i\) have charges i under a U(1) symmetry which is broken by the vacuum expectation value of a scalar field \(\phi \) with charge \(-\,1\). Other renormalizable interactions are forbidden by the symmetry. We are also displaying a coupling of the field \(\psi _{L,0}\) to some operator \(\mathcal O\) composed of other fields of the model, for instance \(\mathcal O=H\chi _R\) with H the Higgs field and \(\chi _R\) another fermion. Since there are \(N+1\) left handed but only N right handed fields there must exist a chiral zero mode. The generation of hierarchies can be attributed to a controlled localization of this zero mode towards the boundaries of theory space. For instance, consider the parameters \(k_i\) and \(m_i\) to be uniform \(m_i=m\), \(k_i=k\), and define the dimensionless ratio

$$\begin{aligned} q\equiv \frac{k\left\langle \phi \right\rangle }{m} . \end{aligned}$$
(5)

The zero mode \(\psi _L\) has a profile

$$\begin{aligned} \psi _L= z^{-1}\sum _{i=0}^N q^{i}\psi _{L,i} ,\quad z^2=\sum _{i=0}^N q^{2i} , \end{aligned}$$
(6)

and hence it is sharply localized towards either site \(i=0\) (for \(q< 1\)) or \(i=N\) (for \(q>1\)). In particular, in the latter case, after integrating out the heavy modes, the effective Lagrangian becomes

$$\begin{aligned} \mathcal L_\mathrm{CW}=q^{-N}\bar{\psi }_{L}\mathcal O+\mathrm{h.c.} \end{aligned}$$
(7)

and the coupling to the rest of the theory is highly suppressed.Footnote 2 The preceding analysis is valid for universal couplings. However, it has also been pointed out [20, 22, 23] that when the parameters in the CW Lagrangian are chosen at random, sharp localization of the zero modes occurs in the bulk of theory space, still leading to hierarchical suppression of couplings. Moreover, when the couplings of the CW model become \(3\times 3\) matrices in flavor space, the three zero modes spontaneously localize at different points in the lattice, and generate inter-generational hierarchies (the vertical hierarchies in Table 1) [20]. On a technical level, this is closely related to a peculiar property of products of N random matrices, which feature a very hierarchical spectrum despite their matrix elements being all of order one [16]. Since this aspect of the clockwork mechanism is less well known, we will review it in some detail in Sect. 2.
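The uniform case of Eqs. (4)-(7) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it builds the \((N+1)\times N\) mass matrix in units of m and extracts the massless left handed combination as its left null vector. The function name `clockwork_zero_mode` and the values \(N=5\), \(q=3\) are illustrative assumptions. The numerical profile comes out as \((-q)^i/z\); the alternating signs can be absorbed into field redefinitions, so only magnitudes are compared with Eq. (6).

```python
import numpy as np

def clockwork_zero_mode(N, q):
    """Zero-mode profile over sites i = 0..N for uniform m_i = m, k_i = k."""
    # Mass matrix in units of m: rows label psi_{L,i} (i = 0..N),
    # columns label psi_{R,j} (j = 1..N).
    M = np.zeros((N + 1, N))
    for j in range(1, N + 1):
        M[j, j - 1] = 1.0       # m * psibar_{L,j} psi_{R,j}
        M[j - 1, j - 1] = q     # k<phi> * psibar_{L,j-1} psi_{R,j}
    # The massless combination is the left null vector of M (c @ M = 0).
    # With full_matrices=True the last column of u spans that null space.
    u, _, _ = np.linalg.svd(M, full_matrices=True)
    c = u[:, -1]
    return c * np.sign(c[0])    # fix the overall sign so that c[0] > 0

# Illustrative (assumed) values, chosen only to show localization at i = N.
N, q = 5, 3.0
profile = clockwork_zero_mode(N, q)
z = np.sqrt(np.sum(q ** (2 * np.arange(N + 1))))

print("|profile| :", np.abs(profile))
print("q^i / z   :", q ** np.arange(N + 1) / z)
print("overlap with site 0:", profile[0], " vs q^-N =", q ** -N)
```

For \(q>1\) the overlap with site 0 is \(1/z\), which approaches \(q^{-N}\) up to an order-one factor, in line with Eq. (7).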

In this paper, we are going to build models of natural flavor hierarchies along the lines of Refs. [16, 20] in supersymmetric SU(5) and SO(10) unification. The necessary GUT breaking effects are naturally present in these models, as the symmetries of the CW Lagrangian allow for renormalizable Yukawa couplings of the vectorlike fields (CW gears) with the GUT-breaking Higgs field(s). The flavor hierarchies and Yukawa unification thus have a common origin in the framework of a completely renormalizable model without any exotic Higgs or matter representations.

This paper is organized as follows. In Sect. 2 we describe an SU(5) model and briefly review the mechanism of flavor hierarchies. In Sect. 3 we implement our mechanism within an SO(10) model. In Sect. 4 we perform comprehensive scans over the parameter space of these models, and quantify how well the SM flavor structure is reproduced. Some phenomenological considerations are given in Sect. 5, and in Sect. 6 we present our conclusions.

2 An SU(5) model of fermion mass hierarchies

2.1 The charged fermion sector of the model

We denote the usual SU(5) GUT field content, consisting of 3 generations of \(\bar{\mathbf{5}}\) and \(\mathbf{10}\) each, by \(F^c\) and A respectively, where here and in the following we suppress all flavor indices.

Firstly, we add to this N vectorlike copies \((F,F^c)_{i}\), which are charged under a vectorlike U(1) symmetry under which \(F_i\) has charge i and \(F_i^c\) has charge \(-\,i\). The index \(i=1\dots N\) will be referred to as a site (in theory space). In addition we introduce the U(1) breaking spurion \(\phi \) with charge \(-\,1\). We will denote the chiral MSSM GUT fields with an index \(i=0\), a notation consistent with their vanishing U(1) charge. The unique renormalizable superpotential allowed by symmetries is thus of the clockwork type

$$\begin{aligned} W_5=\sum _{i=1}^N \phi (F^c_{i-1})^T\mathcal C_i F_{i}+(F^c_i)^T(m\mathcal A_i+\Sigma \mathcal B_i)F_i , \end{aligned}$$
(8)

where m is a mass scale, \(\Sigma \) is the SU(5) breaking adjoint field, and \(\mathcal A_i\), \(\mathcal B_i\) and \(\mathcal C_i\) are dimensionless couplings that are \(3\times 3\) arbitrary complex matrices.Footnote 3 The U(1) symmetry is anomaly free (and hence can be gauged) once a conjugate field \(\phi ^c\) is included; we will assume \(\langle \phi \rangle \gg \langle \phi ^c\rangle \) such that we can simply ignore it.Footnote 4

Secondly, in a completely analogous manner we introduce copies of the fields transforming in the antisymmetric \(\mathbf{10}\) representation, charged under a \(U(1)'\) symmetry, with superpotentialFootnote 5

$$\begin{aligned} W_{10}=\sum _{i=1}^{N'}\phi ' (A^c_{i})^T\, \mathcal C'_i A_{i-1}+(A^c_i)^T(m'\mathcal A'_i+\Sigma \,\mathcal B'_i)A_i . \end{aligned}$$
(9)

Furthermore, we denote the two Higgs doublets as \(H^c\) and H. The Yukawa superpotential reads

$$\begin{aligned} W_Y=H^c(A_0)^T\, \mathcal Y F_0^c + \frac{1}{2} H(A_0)^T \mathcal Y'A_0 , \end{aligned}$$
(10)

with \(\mathcal Y'^T=\mathcal Y'\). In terms of MSSM fields, this gives

$$\begin{aligned} W_Y&= H_d (Q_0)^T \mathcal Y D^c_0 + H_d (E^c_0)^T \mathcal Y\, L_0 \\&\quad + H_u (Q_0)^T \mathcal Y'\,U_0^c+\cdots , \end{aligned}$$
(11)

where the ellipsis denotes terms with the Higgs triplet.

The field content of our model is summarized in Table 2.Footnote 6

Table 2 Field content and symmetries of the model. All matter fields and their vectorlike partners, \(F_i\), \(F_i^c\), \(A_i\), \(A^c_i\), and \(N^c\) carry an additional generation index (not shown)

The complete superpotential of the charged fermions is \(W=W_5+W_{10}+W_Y\). Integrating out the clockwork fields with \(i\ne 0\) exactly, the superpotential becomes \(W=W_Y\), while the Kähler potential turns into

$$\begin{aligned} K=(F_0^c)^\dagger \mathcal ZF^c_0+ (A_0)^\dagger \mathcal Z' A_0 , \end{aligned}$$
(12)

with

$$\begin{aligned}&\mathcal {Z}\equiv \sum _{i=0}^N (\mathcal Q_i\cdots \mathcal Q_1)^\dagger (\mathcal Q_i\cdots \mathcal Q_1) ,\quad \nonumber \\&\mathcal Q_i\equiv (m\mathcal A_i+v_{24}Y\mathcal B_i)^{-1}\phi \,\mathcal C_i , \end{aligned}$$
(13)

and analogously for \(\mathcal Z'\). Here Y is the hypercharge (with SM normalization, \(Y_Q=1/6\)). Notice that the flavor matrices \(\mathcal Z\) become hypercharge dependent and thus provide a source of GUT breaking. It is this effect that we want to exploit in order to separate the down quark and charged lepton Yukawa couplings. After canonical normalization, the physical Yukawa couplings read

$$\begin{aligned} \mathcal Y_u^*&=(\mathcal E_Q)^T \mathcal Y' \mathcal E_{U^c} ,\quad \mathcal Y_d^*=(\mathcal E_Q)^T \mathcal Y\mathcal E_{D^c} ,\quad \nonumber \\ \mathcal Y_e^*&=(\mathcal E_{L})^T\mathcal Y^T \mathcal E_{E^c} , \end{aligned}$$
(14)

where the Hermitian matrices \(\mathcal E_X\) are defined as

$$\begin{aligned} \mathcal E_{D^c}\equiv (\mathcal Z_{{1}/{3}})^{-{1}/{2}} ,\quad \mathcal E_{L}\equiv (\mathcal Z_{-{1}/{2}})^{-{1}/{2}} ,\end{aligned}$$
(15)

and

$$\begin{aligned} \mathcal E_{Q}\equiv (\mathcal Z'_{{1}/{6}})^{-1/2} ,\quad \mathcal E_{U^c}\equiv (\mathcal Z'_{-{2}/{3}})^{-1/2} ,\quad \mathcal E_{E^c}\equiv (\mathcal Z'_{1})^{-{1}/{2}} . \end{aligned}$$
(16)

Here the subscripts on the \(\mathcal Z\), \(\mathcal Z'\) refer to hypercharge. The eigenvalues of \(\mathcal Z\) (\(\mathcal E\)) are always larger (smaller) than one.
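To make Eqs. (13)-(16) concrete, the sketch below evaluates \(\mathcal Z\) and \(\mathcal E=\mathcal Z^{-1/2}\) for one random draw of the coupling matrices. It is an illustration under stated assumptions, not the code used for the results of this paper: the helper names, the seed, and the values \(N=3\), \(a=b=0.5\) are hypothetical, with \(a=m/\langle \phi \rangle \) and \(b=v_{24}/\langle \phi \rangle \) so that \(\mathcal Q_i=(a\mathcal A_i+bY\mathcal B_i)^{-1}\mathcal C_i\). The hypercharges used are those of \(D^c\) (\(Y=1/3\)) and L (\(Y=-1/2\)) in the normalization \(Y_Q=1/6\).

```python
import numpy as np

def random_complex(rng, n=3):
    """n x n matrix with real and imaginary parts uniform in [-1, 1]."""
    return rng.uniform(-1, 1, (n, n)) + 1j * rng.uniform(-1, 1, (n, n))

def wavefunction_matrix(A, B, C, a, b, Y):
    """Z = sum_{i=0}^N (Q_i...Q_1)^dagger (Q_i...Q_1), Q_i = (a A_i + b Y B_i)^{-1} C_i."""
    n = A[0].shape[0]
    Z = np.eye(n, dtype=complex)   # the i = 0 term is the identity
    P = np.eye(n, dtype=complex)   # running product Q_i ... Q_1
    for Ai, Bi, Ci in zip(A, B, C):
        Q = np.linalg.solve(a * Ai + b * Y * Bi, Ci)
        P = Q @ P
        Z += P.conj().T @ P
    return Z

def inv_sqrt(Z):
    """Hermitian inverse square root Z^{-1/2} via an eigendecomposition."""
    w, V = np.linalg.eigh(Z)
    return (V / np.sqrt(w)) @ V.conj().T

# Hypothetical parameter choices, for illustration only.
rng = np.random.default_rng(0)
N, a, b = 3, 0.5, 0.5
A = [random_complex(rng) for _ in range(N)]
B = [random_complex(rng) for _ in range(N)]
C = [random_complex(rng) for _ in range(N)]

Z_Dc = wavefunction_matrix(A, B, C, a, b, Y=1 / 3)
Z_L = wavefunction_matrix(A, B, C, a, b, Y=-1 / 2)
E_Dc, E_L = inv_sqrt(Z_Dc), inv_sqrt(Z_L)

print("eigenvalues of E_Dc:", np.linalg.eigvalsh(E_Dc))
print("eigenvalues of E_L :", np.linalg.eigvalsh(E_L))
```

Since \(\mathcal Z\) is the identity plus a positive semidefinite matrix, its eigenvalues are at least one and those of \(\mathcal E\) are at most one. The two printed spectra differ only because of the hypercharge entering \(\mathcal Q_i\).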

Assuming no further relation between couplings it is reasonable to expect that the matrices \(\mathcal A^{(\prime )}\), \(\mathcal B^{(\prime )}\), \(\mathcal C^{(\prime )}\), and \(\mathcal Y^{(\prime )}\) have \(\mathcal O(1)\) complex entries. As has been shown [16, 20], the matrices \(\mathcal E_X\), even though not containing any a priori large or small parameters, spontaneously develop strongly hierarchical spectra, i.e., their three eigenvalues satisfy

$$\begin{aligned} \epsilon _{X^1}\ll \epsilon _{X^2}\ll \epsilon _{X^3}\le 1 . \end{aligned}$$
(17)

We will refer to hierarchies of this kind as inter-generational hierarchies, to distinguish them from inter-species hierarchies between the different matter representations (e.g. between top and tau). In order to better understand this spontaneous generation of inter-generational hierarchies, let us define positive parameters

$$\begin{aligned} a\equiv \frac{m}{\left\langle \phi \right\rangle } ,\quad b\equiv \frac{v_{24}}{\left\langle \phi \right\rangle } , \end{aligned}$$
(18)

and

$$\begin{aligned} a'\equiv \frac{m'}{\left\langle \phi '\right\rangle } ,\quad b'\equiv \frac{v_{24}}{\left\langle \phi '\right\rangle } . \end{aligned}$$
(19)

In the extreme limit \(a+b\rightarrow \infty \), we have

$$\begin{aligned} \mathcal Z\approx 1 , \end{aligned}$$
(20)

and there are no inter-generational hierarchies, while in the opposite limit of \(a+b\rightarrow 0\), we obtain a product structure

$$\begin{aligned} \mathcal Z\approx (\mathcal Q_N\cdots \mathcal Q_1)^\dagger (\mathcal Q_N\cdots \mathcal Q_1) , \end{aligned}$$
(21)

which spontaneously generates large inter-generational hierarchies between the three eigenvalues. This can be understood in terms of a general property of products of random O(1) matrices [16]. The parameters a and b thus interpolate between a hierarchical and non-hierarchical situation. It is noteworthy that the hierarchies in Eq. (21) become independent of a and b (in the sense that only the eigenvalues’ overall size but not their ratios depend on them). Interestingly, even in the case \(a+b\sim 1\),Footnote 7 a strong hierarchy is still present, especially between the third and second generations (see also the discussion in [20]).
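The property of random matrix products invoked here is easy to see numerically. The sketch below is a simplified stand-in for Eq. (21): it multiplies N plain random \(3\times 3\) matrices with order-one entries, rather than the combinations \(\mathcal Q_i\), and records the spread between the largest and smallest singular value. The function name, sample size and seed are assumptions made for illustration.

```python
import numpy as np

def random_complex(rng, n=3):
    """n x n matrix with real and imaginary parts uniform in [-1, 1]."""
    return rng.uniform(-1, 1, (n, n)) + 1j * rng.uniform(-1, 1, (n, n))

def median_log_spread(N, samples=2000, seed=1):
    """Median of log10(s_max / s_min) for a product of N random 3x3 matrices."""
    rng = np.random.default_rng(seed)
    spreads = []
    for _sample in range(samples):
        P = np.eye(3, dtype=complex)
        for _site in range(N):
            P = random_complex(rng) @ P
        s = np.linalg.svd(P, compute_uv=False)   # sorted in descending order
        spreads.append(np.log10(s[0] / s[-1]))
    return float(np.median(spreads))

for N in (1, 3, 5):
    print(f"N = {N}: median log10(s_max/s_min) = {median_log_spread(N):.2f}")
```

The spread grows with the number of factors even though every matrix element is of order one, which is the behavior exploited in the limit \(a+b\rightarrow 0\).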

A complementary explanation of this mechanism can be provided by the localization of the zero modes. As has also been shown in Ref. [20] the zero modes of the matter fields spontaneously localize sharply around some random site in theory space, a property first pointed out in [22] in similar models. In this interpretation, the zero modes of the three generations are localized at different sites in theory space, which explains their hierarchical overlap with the site-zero fields. The parameters a and b set a bias for this localization: the larger a and b, the more the zero modes are localized towards site zero.

Moving to a basis in which the \(\mathcal E_X\) are diagonal, the Yukawa couplings assume the structure \((\mathcal Y_u)_{ij}\sim \epsilon _{Q_i}\epsilon _{U^c_j}\) etc., familiar from Froggatt–Nielsen [24], extra-dimensional [25,26,27], or strongly coupled [28] models, and the hierarchies of masses and mixings follow in a way similar to those (see for instance Ref. [29]). The CKM angles scale as \(\theta _{ij}\sim \epsilon _{Q^i}/\epsilon _{Q^j}\) (\(i<j\)) and similarly for the PMNS angles with \(Q\rightarrow L\). The charged fermion masses on the other hand scale as \(\epsilon _{Q^i}\epsilon _{U^i}\), \(\epsilon _{Q^i}\epsilon _{D^i}\), and \(\epsilon _{L^i}\epsilon _{E^i}\), respectively. Then, the CKM hierarchies roughly determine the \(\epsilon _{Q^i}\) hierarchies. As a general rule, this saturates the hierarchies in the down quark masses, i.e., the required hierarchies in the \(\epsilon _{D^i}\) are rather mild, while the hierarchies in the up quark masses require further suppression from the \(\epsilon _{U^i}\). In the lepton sector, clearly one must avoid a large hierarchy in the \(\epsilon _{L^i}\) in order to keep the PMNS angles large. The charged lepton hierarchies then come mostly from the \(\epsilon _{E^i}\) parameters. From these rough considerations alone, it is already quite apparent that the structure is rather SU(5) compatible (large hierarchies in the \(\mathbf{10}\) sector and mild hierarchies in the \(\bar{\mathbf{5}}\) sector). This fact is analogous to FN models, where one can easily achieve SU(5) compatible charge assignments [30]. One can then try to lift the down-quark lepton degeneracy by the above-mentioned hypercharge-dependent effects. To get an idea how the degeneracy is removed, we plot in Fig. 1 the distribution for the second-generation parameters \(\epsilon _{E^c_2}\) and \(\epsilon _{Q_2}\) for various values of \(b'\). In Sect. 4 we will indeed show that the SU(5) model works remarkably well. However, it turns out that our ideas are even partially successful in SO(10) unified models. The reason for that is that the CW mechanism allows for a more radical splitting of the inter-species hierarchies (for instance between the \(\epsilon _{Q_i}\) and \(\epsilon _{D_i}\)). It is worthwhile then to develop an SO(10) version of the model, which will be done in Sect. 3.
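The scaling rules quoted above can be illustrated with a toy computation. In the sketch below, Yukawa matrices of the form \(\mathrm{diag}(\epsilon _Q)\,\mathcal Y\,\mathrm{diag}(\epsilon _{U,D})\) are built from random order-one matrices, and the resulting quark mixing is compared with \(\epsilon _{Q^1}/\epsilon _{Q^2}\). All numerical values of the \(\epsilon \) parameters are hypothetical and chosen only to display the pattern; for simplicity the symmetry of \(\mathcal Y'\) is not imposed, and the function name `ckm_from_textures` is our own.

```python
import numpy as np

def random_complex(rng, n=3):
    """n x n matrix with real and imaginary parts uniform in [-1, 1]."""
    return rng.uniform(-1, 1, (n, n)) + 1j * rng.uniform(-1, 1, (n, n))

def ckm_from_textures(eps_Q, eps_U, eps_D, rng):
    """Return |V_CKM| and the up/down Yukawa eigenvalues (lightest first)."""
    Yu = np.diag(eps_Q) @ random_complex(rng) @ np.diag(eps_U)
    Yd = np.diag(eps_Q) @ random_complex(rng) @ np.diag(eps_D)
    Uu, su, _ = np.linalg.svd(Yu)
    Ud, sd, _ = np.linalg.svd(Yd)
    # numpy sorts singular values in descending order; reverse the columns
    # so that index 0 corresponds to the lightest generation.
    Uu, Ud = Uu[:, ::-1], Ud[:, ::-1]
    V = Uu.conj().T @ Ud           # mismatch of the left-handed rotations
    return np.abs(V), su[::-1], sd[::-1]

# Hypothetical suppression factors (assumptions, not fitted values).
eps_Q = [3e-3, 4e-2, 1.0]
eps_U = [3e-3, 4e-2, 1.0]
eps_D = [0.3, 0.5, 1.0]

rng = np.random.default_rng(2)
V, y_up, y_down = ckm_from_textures(eps_Q, eps_U, eps_D, rng)
print("|V_12| =", V[0, 1], " vs eps_Q1/eps_Q2 =", eps_Q[0] / eps_Q[1])
print("up-type Yukawas  :", y_up)
print("down-type Yukawas:", y_down)
```

In any single draw the agreement holds only up to order-one factors; the statement \(\theta _{ij}\sim \epsilon _{Q^i}/\epsilon _{Q^j}\) is one about typical magnitudes over many draws.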

Fig. 1

Second-generation eigenvalues of \(\mathcal E_{E^c}\) and \(\mathcal E_{Q}\) for \(N'=5\), \(a'=0.6\) and \(b'=0.05\) (green), \(b'=0.2\) (orange) and \(b'=0.4\) (blue). We use uniform priors for the complex entries of \(\mathcal A'_i\) and \(\mathcal B'_i\). The mean values and one-sigma contours are also shown in each case. One can see that the degeneracy appearing at \(b'=0\) is progressively removed for larger \(b'\) (the correlation coefficients are 0.97, 0.78, and 0.53 respectively). Notice that the mean values for \(\mathcal E_{E^c}\) shift much more because of its larger hypercharge

2.2 The neutrino sector

To generate neutrino masses we will employ the seesaw mechanism [31,32,33,34,35,36]. We will assume no clockwork fields for the neutrinos, such that we have the simple superpotential

$$\begin{aligned} W_\nu =H_u (N^c)^T \mathcal Y'' L_0 +\frac{1}{2}(N^c)^T \mathcal MN^c+\cdots , \end{aligned}$$
(22)

where we have already discarded the Higgs triplets. Integrating out the right handed (RH) neutrinos as well as the clockwork leptons gives the Weinberg operator

$$\begin{aligned} W_{\nu }=-\frac{1}{2}(H_u L)^T (\mathcal E_L^T\mathcal Y''^T\mathcal M^{-1}\mathcal Y''\mathcal E_L)\,(H_uL) . \end{aligned}$$
(23)

We parametrize \(\mathcal M=m_R \, \mathcal D\), where \(m_R\) is a scale and \(\mathcal D\) is a further dimensionless order one complex (symmetric) matrix.
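As a rough illustration of Eq. (23), the light neutrino mass matrix \(\left\langle H_u\right\rangle ^2\,\mathcal E_L^T\mathcal Y''^T\mathcal M^{-1}\mathcal Y''\mathcal E_L\) can be evaluated directly. The sketch below is only a worked example under assumptions: the function name, the diagonal stand-in for \(\mathcal E_L\), the value \(v=174\) GeV, and the choices \(\tan \beta =40\), \(m_R=10^{14}\) GeV are hypothetical inputs, and the overall sign of the operator is dropped since it does not affect the masses.

```python
import numpy as np

def light_neutrino_masses(E_L, Y_nu, D, m_R, v_u):
    """Seesaw mass matrix v_u^2 E_L^T Y''^T M^{-1} Y'' E_L with M = m_R * D.

    Returns the (complex symmetric) matrix and its singular values,
    i.e. the physical masses, lightest first.
    """
    M_inv = np.linalg.inv(m_R * D)
    m_nu = v_u ** 2 * E_L.T @ Y_nu.T @ M_inv @ Y_nu @ E_L
    masses = np.linalg.svd(m_nu, compute_uv=False)[::-1]
    return m_nu, masses

rng = np.random.default_rng(3)
draw = lambda: rng.uniform(-1, 1, (3, 3)) + 1j * rng.uniform(-1, 1, (3, 3))

Y_nu = draw()                              # order-one neutrino Yukawa Y''
D = draw()
D = np.triu(D) + np.triu(D, 1).T           # symmetric, as required for M
E_L = np.diag([0.5, 0.7, 1.0])             # assumed mild lepton-doublet factors

v_u = 174.0 * np.sin(np.arctan(40.0))      # GeV, assuming v = 174 GeV, tan(beta) = 40
m_R = 1e14                                 # GeV, assumed seesaw scale

m_nu, masses = light_neutrino_masses(E_L, Y_nu, D, m_R, v_u)
print("light neutrino masses [eV]:", masses * 1e9)
```

The masses scale as \(1/m_R\), so in practice the seesaw scale can be adjusted after the dimensionless matrices have been drawn.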

3 An SO(10) extension of the model

It is possible to extend the previous model to SO(10). The SM fields (including the right handed neutrino) unify into a spinorial \(\mathbf{16}\) representation, denoted by S. To break SO(10) down to the SM one needs more than one irreducible representation. We will choose a \(\mathbf{45}\) and a vectorlike (\(\mathbf{126}+\overline{\mathbf{126}}\)). These are the lowest-dimensional representations that satisfy the following three criteria: (i) breaking of \(SO(10)\rightarrow \) SM, (ii) possibility to write Yukawa couplings in the clockwork Lagrangian with \(\mathbf{16}\), \(\overline{\mathbf{16}}\) and a GUT breaking Higgs field, (iii) possibility to write Majorana neutrino masses [34, 35].Footnote 8

3.1 The charged fermion sector

Table 3 Field content and symmetries of the SO(10) model. All matter fields (\(S_i\), \(S^c_i\)) carry an additional generation index (not shown)

We are then led to consider a single U(1) symmetry [the combination \(Q-Q'\) of the symmetries in the SU(5) model which commutes with SO(10)]. The field content is given in Table 3. The model is defined by

$$\begin{aligned} W_{16}=\sum _{i=1}^{N}\phi \, (S^c_{i})^T \mathcal C_i\, S_{i-1}+(S^c_i)^T(m\mathcal A_i+\Sigma \,\mathcal B_i)\, S_i , \end{aligned}$$
(24)

along with the unique Yukawa coupling

$$\begin{aligned} W_Y=\frac{1}{2}H_{10}(S_0)^T\mathcal YS_0 . \end{aligned}$$
(25)

We assume here that the entire \(\mathbf{126}+\overline{\mathbf{126}}\) Higgs sector is very heavy and its vacuum expectation value (VEV) aligned in the SM singlet direction. In particular, the doublets contained in the \(\overline{\mathbf{126}}\) do not contribute to Yukawa couplings at low energy.Footnote 9

Table 4 Breaking patterns of \(\mathbf{45}+\mathbf{126}+\overline{\mathbf{126}}\). Note that the \(\mathbf{126}\) VEV always leaves SU(5) unbroken, i.e. \(\mathbb H_{126}=SU(5)\). The Yukawa relations marked with \(*\) are ruled out

The VEV of the \(\mathbf{45}\) is a generic linear combination of hypercharge and \(B-L\) generators, which breaks SO(10) down to \(\mathbb H_{45}=\) SM\(\times (B-L)\). At the same time the VEV of the \(\mathbf{126}\) representation breaks SO(10) to \(\mathbb H_{126}=SU(5)\), leaving as the unbroken subgroup \(\mathbb H_{45}\cap \mathbb H_{126}=\mathrm{SM}\). In certain particular directions of \(\langle \Sigma \rangle \), the group \(\mathbb H_{45}\) can be enhanced. These directions are summarized in Table 4. Since only the \(\mathbf{45}\) VEV participates in the GUT breaking of the Yukawa couplings, an enhanced \(\mathbb H_{45}\) may lead to some relations between Yukawa couplings even when \(\mathbb H_{45}\cap \mathbb H_{126}=\mathrm{SM}\). This is the case for all but the last row in Table 4.

Let us write the VEV of \(\Sigma \) as

$$\begin{aligned} \langle \Sigma \rangle = v_{45} [\sin \alpha \, Y+\cos \alpha \, (B-L)] . \end{aligned}$$
(26)

The model then has one discrete and four continuous non-stochastic parameters: N, \(a\equiv m/\left\langle \phi \right\rangle \), \(b\equiv v_{45}/\left\langle \phi \right\rangle \), \(\tan \alpha \), and \(\tan \beta \).Footnote 10 According to Table 4, the values \(\tan \alpha =0,\ -\frac{4}{5},\ -2\) are ruled out by either \(\mathcal Y_u=\mathcal Y_d\) or \(\mathcal Y_d=\mathcal Y_e\).
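The excluded directions can be cross-checked by computing the charge of each SM field under \(\tan \alpha \,Y+(B-L)\), which is \(\langle \Sigma \rangle \) of Eq. (26) divided by \(v_{45}\cos \alpha \). The short script below is a sketch using exact fractions; the dictionary of charges uses the normalization \(Y_Q=1/6\), and the helper names are our own. Our reading is that fields with equal charge receive identical clockwork factors, so a degeneracy among them signals an unbroken Yukawa relation.

```python
from fractions import Fraction as F

# (hypercharge Y, B-L) of the SM fields inside the 16, with Y_Q = 1/6.
FIELDS = {
    "Q":  (F(1, 6),  F(1, 3)),
    "Uc": (F(-2, 3), F(-1, 3)),
    "Dc": (F(1, 3),  F(-1, 3)),
    "L":  (F(-1, 2), F(-1)),
    "Ec": (F(1),     F(1)),
    "Nc": (F(0),     F(1)),
}

def sigma_charges(tan_alpha):
    """Charges under tan(alpha) * Y + (B - L), i.e. <Sigma> / (v45 cos(alpha))."""
    return {name: tan_alpha * y + bl for name, (y, bl) in FIELDS.items()}

def degenerate(tan_alpha, x, y):
    """True if fields x and y carry the same charge for this direction."""
    q = sigma_charges(tan_alpha)
    return q[x] == q[y]

for t in (0, F(-4, 5), -2, 1):
    charges = sigma_charges(t)
    print(f"tan(alpha) = {t}: " + ", ".join(f"{k}={v}" for k, v in charges.items()))
```

For \(\tan \alpha =0\) the fields \(U^c\) and \(D^c\) are degenerate; for \(\tan \alpha =-4/5\) the charges coincide within the SU(5) multiplets (Q, \(U^c\), \(E^c\) and \(D^c\), L); for \(\tan \alpha =-2\) one finds \(Q\leftrightarrow L\) and \(D^c\leftrightarrow E^c\) degenerate. A generic value such as \(\tan \alpha =1\) gives six distinct charges.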

3.2 The neutrino sector

With the symmetry assignments as in Table 3, there is a unique Majorana mass term for the RH neutrinos

$$\begin{aligned} W_{R}=\frac{1}{2}v_{126}N_0^{cT} \mathcal DN_0^{c} , \end{aligned}$$
(27)

where \(\mathcal D\) is another complex, symmetric, dimensionless matrix of order one. All other Majorana mass terms are forbidden by the nonzero U(1) charges. Considering the non-hierarchical nature of neutrino masses, this is actually very welcome, since this implies that the hierarchical factors \(\mathcal E_\nu \) drop out of the Weinberg operator. We could, for instance, first integrate out the clockwork gauge-singlets, and the RH neutrino fields \(N^c_0\) would obtain the usual hierarchical wave function renormalization factor. But the normalization of the RH neutrinos’ kinetic terms is irrelevant, as they are integrated out in the see-saw mechanism.

4 Determining the parameters of the models

In this section we would like to find sets of parameters such that the mechanism leads to a successful generation of the SM flavor structure.

We distinguish two kinds of model parameters. The first kind quantifies some underlying physics assumption, such as the scales of symmetry breaking, or the number of vectorlike fields. The following parameters are of this type:

$$\begin{aligned}&N,\ N',\ a,\ b,\ a',\ b',\ \tan \beta ,\ m_R\quad SU(5) \mathrm{\ model,} \end{aligned}$$
(28)
$$\begin{aligned}&N,\ a,\ b,\ \tan \alpha ,\ \tan \beta ,\ m_R\quad SO(10) \mathrm{\ model.} \end{aligned}$$
(29)

We will refer to these parameters as non-stochastic or deterministic. The remaining parameters are the coupling matrices \(\mathcal A^{(\prime )}_i\), \(\mathcal B^{(\prime )}_i\), \(\mathcal C^{(\prime )}_i\), \(\mathcal Y^{(\prime )}\) and \(\mathcal D\). In the absence of additional structure, such as symmetries that would constrain the form of these matrices, it is natural to assume that they have order-one complex entries. Choosing them randomly from some suitable prior distribution defines an ensemble of models (with the same physics assumptions). We can then compute the distributions of physical observables (masses and mixings), for each choice of the deterministic parameters. In order to model the property “order one” for the matrix elements, we will choose uniform priors with \(|{\text {Re}}(\mathcal A_i)_{kl}| \le 1\) and \(|{\text {Im}}(\mathcal A_i)_{kl}| \le 1\) etc. Of course, the “posterior distributions” depend to some extent on the choice of priors.Footnote 11
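One possible way to draw a member of the SU(5) ensemble with these priors is sketched below. The function names and the dictionary layout are illustrative assumptions. For the symmetric matrices \(\mathcal Y'\) and \(\mathcal D\) the upper triangle is drawn and mirrored, so that every independent entry keeps the same uniform prior; this particular symmetrization is our choice and is not specified in the text.

```python
import numpy as np

def draw_matrix(rng, n=3, symmetric=False):
    """Complex n x n matrix with |Re|, |Im| <= 1 drawn from a uniform prior."""
    M = rng.uniform(-1, 1, (n, n)) + 1j * rng.uniform(-1, 1, (n, n))
    if symmetric:
        # Mirror the upper triangle (assumed convention, see lead-in).
        M = np.triu(M) + np.triu(M, 1).T
    return M

def draw_su5_couplings(rng, N, N_prime):
    """One random set of stochastic parameters for the SU(5) model."""
    return {
        "A": [draw_matrix(rng) for _ in range(N)],
        "B": [draw_matrix(rng) for _ in range(N)],
        "C": [draw_matrix(rng) for _ in range(N)],
        "A_prime": [draw_matrix(rng) for _ in range(N_prime)],
        "B_prime": [draw_matrix(rng) for _ in range(N_prime)],
        "C_prime": [draw_matrix(rng) for _ in range(N_prime)],
        "Y": draw_matrix(rng),                         # arbitrary complex
        "Y_prime": draw_matrix(rng, symmetric=True),   # symmetric, Eq. (10)
        "Y_nu": draw_matrix(rng),                      # Y'' of Eq. (22)
        "D": draw_matrix(rng, symmetric=True),         # symmetric Majorana matrix
    }

rng = np.random.default_rng(4)
couplings = draw_su5_couplings(rng, N=1, N_prime=5)
print({key: (len(val) if isinstance(val, list) else val.shape)
       for key, val in couplings.items()})
```

Repeating the draw many times at fixed deterministic parameters yields the ensemble whose observables are studied in the following.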

In order to quantify the success of the mechanism, we proceed as follows. Let us define the variables

$$\begin{aligned} x_i\equiv \log _{10} O_i , \end{aligned}$$
(30)

where the observables \(O_i\) run over the nine physical Yukawa couplings, the three CKM mixing angles \(\theta _{ij}\), as well as the three quantities \(\sin ^2\theta _{ij}\) of the PMNS matrix.Footnote 12 It turns out that the logarithms of the observables roughly follow a multi-dimensional Gaussian distribution. It is therefore useful to approximate this distribution by a Gaussian with mean and covariance taken from the exact (simulated) distribution. This defines a \(\chi ^2\) function

$$\begin{aligned} \chi ^2(x_i)=(x_i-\bar{x}_i)(x_j-\bar{x}_j)C^{-1}_{ij} , \end{aligned}$$
(31)

where the means \(\bar{x}_i\) and covariances \(C_{ij}\) depend on the deterministic parameters. This \(\chi ^2\) function can be used to quantify how the experimental point \(x_i^\mathrm{exp}\) compares to the typical models in the ensemble. We use as experimental input the values given in Table 1.Footnote 13 Instead of \(\chi ^2(x_i^\mathrm{exp})\), an equivalent but more meaningful quantity to look at is the associated “\(p\) value”, the proportion of models that have a larger \(\chi ^2\) than the experimental point, that is, which are less likely. In our context, a \(p\) value \(\sim 1\) indicates that the experimental value roughly coincides with the mean of the theoretical distribution, implying that the ensemble typically features an SM-like flavor structure. One can then optimize the deterministic parameters in order to yield larger \(p\) values, so that the SM point belongs to the most likely models of the ensemble (it sits near the mean of the distribution).
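The procedure of Eqs. (30) and (31) can be summarized in a few lines. The sketch below assumes that the simulated values \(x_i\) of an ensemble are already available as an array with one row per model; here they are replaced by synthetic Gaussian numbers purely to exercise the function. The name `chi2_and_p_value` and the test data are hypothetical.

```python
import numpy as np

def chi2_and_p_value(samples, x_exp):
    """Gaussian chi^2 of x_exp and the fraction of models with larger chi^2.

    samples: array of shape (n_models, n_observables) holding x_i = log10(O_i)
    x_exp:   array of shape (n_observables,) with the experimental values
    """
    mean = samples.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(samples, rowvar=False))

    def chi2(x):
        d = np.atleast_2d(x) - mean
        return np.einsum("ni,ij,nj->n", d, cov_inv, d)

    chi2_exp = chi2(x_exp)[0]
    p_value = float(np.mean(chi2(samples) > chi2_exp))
    return chi2_exp, p_value

# Synthetic stand-in for a simulated ensemble (3 observables, 5000 models).
rng = np.random.default_rng(5)
ensemble = rng.normal(loc=[-2.0, -1.0, 0.0], scale=[0.5, 0.3, 0.2], size=(5000, 3))

print(chi2_and_p_value(ensemble, ensemble.mean(axis=0)))        # typical point
print(chi2_and_p_value(ensemble, np.array([3.0, 2.0, 2.0])))    # far outlier
```

A point at the ensemble mean gives \(\chi ^2\approx 0\) and a \(p\) value close to one, while a point far in the tail gives a \(p\) value close to zero.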

We should make a disclaimer though to avoid misconceptions. We are not performing a usual \(\chi ^2\) fit of a model to experimental data. Indeed, to find the parameters that accurately reproduce the experimental data, one should also treat the stochastic parameters as freeFootnote 14 and optimize them (along with the non-stochastic ones) to minimize the functionFootnote 15 \(\chi ^2_\mathrm{exp}(x_{i}^\mathrm{model})\), where the \(x_i^\mathrm{model}\) depend on all the free parameters. Of course, the large number of free parameters (many more than there are observables) renders such an exercise both infeasible and meaningless, as the minimal \(\chi ^2\) will most certainly be zero. Rather, in our method, we are optimizing the deterministic parameters such that the theoretical distributions of models have the experimental data as a typical outcome. To quantify this statement we use \(\chi _\mathrm{models}^2(x_{i}^\mathrm{exp})\) (and the associated \(p\) values) as a measure. Let us also stress that we are not claiming any prediction of the SM parameters in Table 1 (except in order of magnitude). However, notice that having a nonzero (and in fact, unsuppressed) probability density \(\sim \exp \left[ -\frac{1}{2}\chi ^2_\mathrm{models}(x^\mathrm{exp})\right] \) means that there must exist parameters of order one that yield \(x^\mathrm{exp}\) to arbitrary precision.Footnote 16 Finally, the function \(\chi ^2_\mathrm{exp}\) is useful in other ways, for instance to select the points in the scan (of fixed non-stochastic parameters) that most closely reproduce the data. These points can be viewed as a sample of a conditional probability that may be used to make predictions for other observables, such as flavor changing neutral currents or neutrino masses.

We will now take a look at the SU(5) and SO(10) cases separately.

Table 5 Simulation for the SU(5) model for the scenario with \(\tan \beta =40\). We display the \(\chi ^2\) of the physical couplings, corresponding to 15 degrees of freedom. The parentheses give the \(p\) value, the proportion of models with lower probability density (larger \(\chi ^2\)) than the SM

Table 6 Simulation for the SU(5) model for the scenario with \(\tan \beta =10\). We display the \(\chi ^2\) of the physical couplings, corresponding to 15 degrees of freedom. The parentheses give the \(p\) value, the proportion of models with lower probability density (larger \(\chi ^2\)) than the SM

Fig. 2

The distribution of masses and mixings (parameters \(x_i\)) in the SU(5) model, for the case \(N=1\), \(N'=5\) and \(\tan \beta =40\). In the lower triangle, we display scatter plots of the two-dimensional marginal distributions, with the solid (dashed) contour representing one (two) sigma, and the red dots the experimental value. In the diagonal, we show the one dimensional marginal distributions over a 3 sigma range with the red lines indicating the experimental value. The numbers on the bottom of each entry are the mean and standard deviation of the theoretical distributions. In the upper triangle, we show the correlation coefficients

4.1 Parameters in the SU(5) model

We will consider two benchmark values, \(\tan \beta =40\) and \(\tan \beta =10\). For each pair of \(N,N'\), we can then optimize the continuous parameters a, b, \(a'\) and \(b'\). However, roughly speaking, the optimal values of a, b (\(a',b'\)) only depend on N (\(N'\)) and not \(N'\) (N). Therefore, we can group the parameters as shown in Tables 5 and 6. A more refined tuning of the continuous parameters can lead to slightly smaller \(\chi ^2\) for some values of \((N,N')\), but we do not believe that this adds anything to the general conclusions. We also give in Fig. 2 the distributions for the case \(N=1\), \(N'=5\). Several features are worth pointing out.

  • The \(p\) values in Tables 5 and 6 can be quite close to one, especially in the \(\tan \beta =40\) case, indicating that our mechanism results in very SM-like masses and mixings.

  • As evident from Fig. 2, only weak correlations between the lepton and down sectors (for instance between mu and strange masses) persist, due to the GUT breaking effects built into the clockwork Lagrangian. One easily realizes large differences such as \(m_\mu /m_s\sim 5\).

  • The breaking of the degeneracy between \(\mathcal Y_d\) and \(\mathcal Y_e\) comes mostly from the \(\bar{\mathbf{5}}\) and not from the \(\mathbf{10}\) sector (the fit prefers \(a<b\) and \(a'>b'\)). The reason is that the hierarchies must mainly come from the \(\mathbf{10}\) sector, as explained at the end of Sect. 2.1, but the hypercharges predict larger hierarchies in \(\mathcal E_{E^c}\) than in \(\mathcal E_Q\), which goes in the wrong direction. Therefore, the GUT breaking terms proportional to \(b'\) cannot be too large, and \(N\ne 0\) (even if not strictly needed to generate the hierarchies) is crucial to get \(\mathcal Y_d\ne \mathcal Y_e^T\).Footnote 17

  • At larger N (\(N'\)) an overly strong inter-generational hierarchy can be mitigated by larger values of a, b (\(a',b'\)), see Eq. (20).

  • On the other hand for \(N'<2\), the inter-generational hierarchies are too small, and this cannot be compensated for by going to very small \(a',b'\), which in this regime have a universal effect on all three generations (see Eq. (21) and the discussion there).

  • There exist some deviations from the Gaussian approximation for the leptonic observables \(\sin ^2\theta _{12}\) and \(\sin ^2\theta _{23}\), due to the fact that their distributions peak near the upper limit \(\sin ^2\theta _{ij}= 1\). However, the true probability density at the experimental point is larger than the one given by the Gaussian approximation, hence our estimate for the global \(\chi ^2\) is conservative.

As already remarked in footnote 6, an even more constrained model can be constructed with only a single spurion \(\phi '\). From Eqs. (18) and (19) one sees that this model is implemented simply by setting \(b=b'\). A quick glance at Tables 5 and 6 shows that this would point somewhat to the large \(\tan \beta \) scenario and low values of N. For other choices of \(\tan \beta \) and N, the constraint \(b=b'\) will somewhat deteriorate the \(p\) values.

Before turning to the SO(10) case, let us comment on the remaining observables, namely the mass-squared differences of the neutrinos and the CP phases in the CKM and PMNS matrices. Inverted neutrino mass ordering would imply almost complete degeneracy between the heavier two neutrinos, which is difficult to realize in our models. Hence normal mass ordering is strongly preferred. In this case, neutrino masses can always be fitted rather well. We have checked that enlarging the set of observables by \(\Delta m_{21}^2\) and \(\Delta m_{32}^2\) adds very little to the global \(\chi ^2\) (after adjusting the seesaw scale \(m_R\)), approximately 0.25 for \(N=1\) and 0.5 for \(N=3\). However, since there are now two more degrees of freedom, the \(p\) values are actually slightly higher than those reported in Tables 5 and 6. In the case of \(\tan \beta =40\) (\(\tan \beta =10\)), the seesaw scale that corresponds to the minimal \(\chi ^2\) is given by \(m_R\approx 1.5\times 10^{14}\) GeV (\(10^{13}\) GeV).Footnote 18 The dependence on \(\tan \beta \) is indirect: for large \(\tan \beta \), one does not need very small values of a and b to suppress the bottom and tau masses, which in turn leaves the neutrino Yukawa couplings unsuppressed. On the other hand, for smaller \(\tan \beta \), the suppression of these masses needs to come from smaller a and b parameters, which also suppresses the neutrino Yukawa couplings, and the corresponding seesaw scale is lower. Another observation is that, even with \(N=1\), the neutrino spectrum is predicted to be fairly hierarchical. We plot in Fig. 3 the distribution of the lightest neutrino mass (selecting only points that satisfy the neutrino data within experimental errors).
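The statement that a small increase in \(\chi ^2\) can still raise the \(p\) value once two degrees of freedom are added can be illustrated numerically. The following is a minimal sketch, assuming that the \(p\) value is the survival function of a \(\chi ^2\) distribution (Gaussian approximation) and that the baseline fit has 15 degrees of freedom, as quoted for Table 7. The baseline value \(\chi ^2=8\) is hypothetical and is not taken from Tables 5 and 6; the helper `chi2_sf` is written here for illustration only.

```python
import math

def chi2_sf(x, k):
    """Survival function P(chi^2_k > x), using the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = 0.5 * k, 0.5 * x
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# Hypothetical baseline: chi^2 = 8 for 15 degrees of freedom (illustration only).
chi2_base, dof_base = 8.0, 15
# Adding the two mass-squared differences: +0.25 in chi^2, +2 degrees of freedom.
delta_chi2, extra_dof = 0.25, 2

p_base = chi2_sf(chi2_base, dof_base)
p_ext = chi2_sf(chi2_base + delta_chi2, dof_base + extra_dof)
print(f"p (charged fermions only) = {p_base:.3f}")
print(f"p (including neutrinos)   = {p_ext:.3f}")
```

In this sketch the second \(p\) value comes out larger than the first, because the mean of the \(\chi ^2\) distribution shifts by two units while the \(\chi ^2\) itself grows by much less.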

As for the phases \(\delta _\mathrm{CKM}\) and \(\delta _\mathrm{PMNS}\), their distributions are essentially flat over the entire range \([0,2\pi )\). Since the Gaussian approximation certainly fails for these variables, it is not very meaningful to include them in the analysis above. On the other hand, the flatness of the distributions tells us that the experimental values are neither preferred nor disfavored in our class of models.

4.2 Parameters in the SO(10) model

The SO(10) model depends on \(\tan \alpha \), defined in Eq. (26), which parametrizes the relative direction of SO(10) breaking by the \(\mathbf{45}\) representation. In order to assess the favoured values of \(\tan \alpha \), let us define the quantity

$$\begin{aligned} b_{X}\equiv b\left| Y\sin \alpha + (B-L)\,\cos \alpha \right| , \end{aligned}$$
(32)

where \(X=Q,L,U^c,D^c,E^c\).

Fig. 3 Lightest neutrino masses in the SU(5) model for \(N'=5\). The curves correspond to \(N=1\) (blue) and \(N=3\) (green), as well as \(\tan \beta =40\) (solid) and \(\tan \beta =10\) (dashed)

Fig. 4 One-dimensional distribution of eigenvalues of the \(\mathcal E_X\) matrices for SO(10) with \(N=5\), \(a=0.5\), \(\tan \alpha =-\,0.6\), and \(b=1.4\) (blue)

The quantity \(a+b_X\) controls the average localization of the zero modes of the field X: the larger \(a+b_X\), the more it is localized towards site zero, while smaller values repel the zero modes from site zero and reduce the couplings to the Higgs. Ideally, we would like \(b_L\) to be large, as it reduces the hierarchy in the PMNS angles. One finds that \(b_L\) is the largest amongst the \(b_X\) in the regime \(-\,0.8< \tan \alpha < 0\). At the same time, we need to avoid being too close to the end points of this interval, as they correspond to points where some of the Yukawa coupling matrices exactly unify, see Table 4. A value that works well in this regime is \(\tan \alpha \sim -\,0.6\), which we will use as our main benchmark. At this point, one has

$$\begin{aligned} b_X=b\times \{0.6,\ 0.46,\ 0.34,\ 0.2,\ 0.05 \} \end{aligned}$$
(33)

for \(X=L,D^c,E^c,Q,U^c\) respectively. We show in Fig. 4 the one-dimensional distributions for the epsilon parameters and in Fig. 5 the disappearance of the correlations as a result of the GUT breaking terms in the clockwork Lagrangian. Since the a and b parameters affect all fields simultaneously, it is no longer possible (contrary to the SU(5) case) to suppress the down-type and charged lepton masses by reducing a, b without, for instance, also reducing the top mass. We must instead resort to large values of \(\tan \beta \). We find that a reasonable value is \(\tan \beta \sim 50\).
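The numbers in Eq. (33) can be cross-checked directly from the definition in Eq. (32). The sketch below assumes the hypercharge normalization \(Y(Q)=1/6\) and the standard \(B-L\) assignments of the MSSM matter superfields; the paper's own convention in Eq. (26) is not shown in this section, so this choice is an assumption, although it does reproduce the quoted values. The names `CHARGES` and `b_ratios` are introduced here for illustration only.

```python
import math
from fractions import Fraction as F

# (Y, B-L) for each matter superfield X.
# Assumption: hypercharge normalized such that Y(Q) = 1/6.
CHARGES = {
    "L":  (F(-1, 2), F(-1)),
    "Dc": (F(1, 3),  F(-1, 3)),
    "Ec": (F(1),     F(1)),
    "Q":  (F(1, 6),  F(1, 3)),
    "Uc": (F(-2, 3), F(-1, 3)),
}

def b_ratios(tan_alpha):
    """Return b_X / b = |Y sin(alpha) + (B-L) cos(alpha)| for each field X."""
    alpha = math.atan(tan_alpha)
    s, c = math.sin(alpha), math.cos(alpha)
    return {X: abs(float(Y) * s + float(BL) * c)
            for X, (Y, BL) in CHARGES.items()}

ratios = b_ratios(-0.6)
for X, r in ratios.items():
    print(f"b_{X}/b = {r:.3f}")
```

With these assumed charges, the ordering \(b_L>b_{D^c}>b_{E^c}>b_Q>b_{U^c}\) of Eq. (33) is recovered at \(\tan \alpha =-\,0.6\); the \(U^c\) entry evaluates to roughly 0.057 and is quoted as 0.05 in the text. Calling `b_ratios(-0.8)` gives \(b_L=b_{D^c}\), and `b_ratios(0.0)` gives \(b_L=b_{E^c}\), so \(b_L\) stops being the strictly largest entry at both end points of the interval.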

Fig. 5 Second-generation eigenvalues of \(\mathcal E_{L}\) and \(\mathcal E_{U^c}\) in the SO(10) model for \(N=5\), \(a=0.5\), \(\tan \alpha =-\,0.6\), and \(b=0.1\) (green), \(b=0.8\) (orange) and \(b=1.4\) (blue). The mean values and one-sigma contours are also shown in each case. The correlation coefficients are 0.93, 0.25, and 0.15 respectively. Increasing the b parameter affects the \(U^c\) sector very little (\(b_{U^c}=0.05\, b\)), retaining a large hierarchy, while the L sector hierarchy is largely removed by the much larger \(b_L=0.6\,b\)

We show in Table 7 the results of optimizing the remaining parameters a and b, and in Fig. 6 we show the distributions in a representative case. We can make the following observations.

  • While certainly less impressive than the SU(5) model, the \(p\) values are nevertheless surprisingly good, considering the necessity of rather large SO(10) breaking effects. For \(N\ge 5\), the experimental point \(x_i^\mathrm{exp}\) is less than one sigma away from the mean of the distribution.

  • As in the case of SU(5), at large values of N, potentially too large inter-generational hierarchies are partially erased by increasing a or b, which explains why the \(\chi ^2\) values do not increase again at larger N. However, they also do not improve any further for \(N>7\).

  • Neutrinos with normal mass ordering can again be fitted easily in our model. The typical increase in \(\chi ^2\) is of the order of 0.3 (with an optimized \(m_R\approx 10^{14}\) GeV), which for two additional degrees of freedom slightly increases the \(p\) values.

Table 7 Simulation for the SO(10) model, with \(\tan \alpha =-\,0.6\) and \(\tan \beta =50\). The \(\chi ^2\) corresponds to 15 degrees of freedom. The row marked \(p\) value gives the proportion of models with lower probability density (larger \(\chi ^2\)) than the Standard Model
Fig. 6 The distribution of masses and mixings in the SO(10) model (parameters \(x_i\)), for the case \(N=6\). In the lower triangle, we display scatter plots of the two-dimensional marginal distributions, with the solid (dashed) contour representing one (two) sigma, and the red dots the experimental value. On the diagonal, we show the one-dimensional marginal distributions over a 3 sigma range with the red lines indicating the experimental value. The numbers at the bottom of each entry are the mean and standard deviation. In the upper triangle, we show the correlation coefficients

5 Phenomenology

There are two principal impacts on low energy observables that could be used to constrain this class of models.

Firstly, as in any supersymmetric GUT model, exchange of the triplet Higgs can mediate proton decay via dimension-five operators. In minimal SU(5) GUTs, precision gauge coupling unification requires a triplet Higgs mass that allows for proton decay at a rate incompatible with data [38]. In our model, there are additional vectorlike matter fields with associated threshold corrections. Notice that only mass ratios enter in the generation of the Yukawa couplings, and the overall mass scale of these new particles is a free parameter. For each model in our scan, one can in principle adjust this mass scale and the triplet mass to achieve precise gauge coupling unification, and calculate the associated proton lifetime. Such an analysis is however not independent of the solution to the doublet-triplet splitting problem, and the two issues should be dealt with together. For instance, the missing partner mechanism [39,40,41] typically requires extended Higgs sectors. Even though we have presented our mechanism for a minimal Higgs sector, we expect it to work similarly well for extended Higgs sectors, as long as we can write some Yukawa interactions of the CW matter fields with the GUT-breaking Higgs fields. On the other hand, it would also be interesting to try to take advantage of the CW mechanism itself to split the doublet and triplet masses. A fully realistic model in this regard, including doublet-triplet splitting and an analysis of proton decay, is left to future work.

Secondly, we should comment on low energy flavor violating signatures of these models. The wave function renormalization factors, see e.g. Eq. (12), will also strongly reduce flavor violation in the soft masses [42, 43]. The effect is virtually identical to strongly coupled [28] or extra-dimensional [44, 45] models of supersymmetric flavor, though quite different from models with horizontal symmetries, which are more constrained than the former [43]. One of the most constraining observables is the decay \(\mu \rightarrow e\gamma \). If the charged lepton hierarchy comes mostly from \(\mathcal E_e\) (as in our SU(5) scenario), one has [43]

$$\begin{aligned} \left( \frac{A_0}{100\ \mathrm{GeV}}\right) \left( \frac{400\ \mathrm{GeV}}{\tilde{m}_\ell }\right) ^4<0.4 , \end{aligned}$$
(34)

where \(A_0\) is the trilinear soft term and \(\tilde{m}_\ell \) the slepton mass (similar bounds have been derived in Ref. [28]). When \(\mathcal E_\ell \) is somewhat hierarchical (as in our SO(10) model), the bounds become weaker. In the quark sector, the strongest bounds come from the neutron electric dipole moment, which requires squark masses \(\tilde{m}_q\gtrsim 1\) TeV [43]. Since the sfermion masses are dominated by the gaugino loops, squark and slepton masses are related as \(\tilde{m}_q\sim 5 \tilde{m}_\ell \), and the leptonic bounds are more constraining.
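To make the bound of Eq. (34) concrete, it can be inverted for the smallest allowed slepton mass at a given \(A_0\), and the relation \(\tilde{m}_q\sim 5\tilde{m}_\ell \) then gives the implied squark mass. The sketch below is only a direct evaluation of Eq. (34) for positive \(A_0\); the sample values of \(A_0\) are hypothetical, and the function names are introduced here for illustration.

```python
def min_slepton_mass(a0_gev, bound=0.4):
    """Smallest slepton mass (GeV) allowed by Eq. (34) for a trilinear term A_0 > 0 (GeV)."""
    return 400.0 * (a0_gev / 100.0 / bound) ** 0.25

def satisfies_bound(a0_gev, m_slepton_gev, bound=0.4):
    """Check the inequality of Eq. (34) for given A_0 and slepton mass (both in GeV)."""
    return (a0_gev / 100.0) * (400.0 / m_slepton_gev) ** 4 < bound

for a0 in (50.0, 100.0, 500.0):  # hypothetical A_0 values in GeV
    m_l = min_slepton_mass(a0)
    m_q = 5.0 * m_l  # rough relation m_q ~ 5 m_l quoted in the text
    print(f"A_0 = {a0:6.1f} GeV: m_slepton > {m_l:6.1f} GeV, m_squark ~ {m_q:7.1f} GeV")
```

Because the slepton mass enters with the fourth power, the bound depends only weakly on \(A_0\). For \(A_0=100\) GeV the minimal slepton mass is roughly 500 GeV, which under the assumed factor of five corresponds to squarks well above the 1 TeV scale required by the neutron electric dipole moment.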

Finally, the Majorana nature of the neutrinos implies the occurrence of neutrinoless double \(\beta \) decay. The neutrino spectrum is fairly hierarchical, with a rather light \(m_1\) (see Fig. 3). In turn, this means that \(m_{\beta \beta }=|\sum _i U_{ei}^2m_i|\lesssim 0.005\) eV, which is an order of magnitude below the most stringent current limits [46].
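The size of \(m_{\beta \beta }\) for a hierarchical spectrum can be estimated with a short calculation. The sketch below assumes normal ordering and uses rounded oscillation parameters (\(\Delta m_{21}^2\approx 7.4\times 10^{-5}\) eV\(^2\), \(\Delta m_{31}^2\approx 2.5\times 10^{-3}\) eV\(^2\), \(\sin ^2\theta _{12}\approx 0.31\), \(\sin ^2\theta _{13}\approx 0.022\)); these are approximate reference values assumed here and not necessarily the inputs of Table 1. Since the Majorana phases are unknown, the function returns the upper bound obtained when all three terms add coherently. The sample values of \(m_1\) are hypothetical.

```python
import math

# Rounded oscillation parameters (assumed for illustration).
DM21_SQ = 7.4e-5   # eV^2
DM31_SQ = 2.5e-3   # eV^2, normal ordering
SIN2_12 = 0.31
SIN2_13 = 0.022

def m_bb_max(m1):
    """Upper bound on m_bb (eV) for lightest mass m1 (eV), normal ordering,
    with all Majorana phases aligned."""
    m2 = math.sqrt(m1**2 + DM21_SQ)
    m3 = math.sqrt(m1**2 + DM31_SQ)
    cos2_13 = 1.0 - SIN2_13
    ue1_sq = (1.0 - SIN2_12) * cos2_13
    ue2_sq = SIN2_12 * cos2_13
    ue3_sq = SIN2_13
    return ue1_sq * m1 + ue2_sq * m2 + ue3_sq * m3

for m1 in (0.0, 1e-4, 1e-3):  # hypothetical lightest neutrino masses in eV
    print(f"m1 = {m1:.0e} eV: m_bb <= {m_bb_max(m1):.4f} eV")
```

For \(m_1\) up to about \(10^{-3}\) eV, the bound obtained this way stays below 0.005 eV, in line with the estimate quoted above; cancellations among the Majorana phases can only lower the actual value.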

6 Conclusions

We have presented a renormalizable GUT model of flavor which accounts very well for the observed hierarchies of masses and mixings in the charged fermion sector. The model features two and one spontaneously broken U(1) symmetries in the case of SU(5) and SO(10) respectively. Contrary to Froggatt–Nielsen type models, the MSSM chiral matter fields are uncharged under this symmetry. We have taken advantage of the GUT breaking terms present in the most general renormalizable CW Lagrangian in order to lift the degeneracy of the down and lepton Yukawa couplings. Inter-generational hierarchies result spontaneously from products of \(\mathcal O(1)\) matrices, while inter-species hierarchies can either arise from a CW-like suppression or from large \(\tan \beta \).

In SU(5), for a GUT breaking scale slightly smaller than the U(1) breaking scales, we obtain distributions of models that feature the SM point amongst the \(\sim 5\%\) most likely models, i.e., very close to the mean value of the distribution. This requires roughly one vectorlike copy of the \(\bar{\mathbf{5}}\) matter fields, and \(\gtrsim 5\) copies of the \(\mathbf{10}\). For the best SO(10) case, the SM fits slightly worse in the distributions, belonging only to about the \(60\%\) most likely models, approximately one sigma away from the mean of the distribution. A good fit requires \(N\gtrsim 5\) vectorlike copies of the entire MSSM matter content.

The fact that the probability density is unsuppressed at the SM point can also be interpreted in terms of fine-tuning. Accidental cancellations occur only very rarely at random. We could exclude such points from our distributions, but this would at most affect the far tails of the distributions, implying that there are many non-fine tuned models which reproduce the experimental data of Table 1 precisely.