A random clockwork of flavor

We propose a simple clockwork model of flavor which successfully generates the Standard Model flavor hierarchies from random order-one couplings. With very few parameters we achieve distributions of models in excellent agreement with observation. We explain in some detail the interpretation of our mechanism as random localization of zero modes in theory space. The scale of the vectorlike fermions is mostly constrained by lepton flavor violation with secondary constraints arising from rare meson decays.


Introduction
One of the striking features present in the Standard Model (SM) of particle physics is the strongly hierarchical pattern of masses and mixings in the matter sector. On the one hand, different representations of the unbroken SU(3) c × U(1) EM have noticeably different masses within a generation, in particular the masses of neutrinos are separated from those of the JHEP02(2020)186 charged fermions by orders of magnitude. On the other hand, within each representation, there are strong inter-generational hierarchies, which are weakest for the neutrinos, and strongest for the up type quarks. A related hierarchy appears in the CKM matrix for the quarks, which parametrizes the mixing between the up and down quarks in the charged current interaction.
In this paper, building on previous attempts [11], we will provide an extremely simple model for the mass and CKM hierarchies which arise from Lagrangian parameters of order one. Taking the latter randomly from flat prior distributions, the resulting distributions for the eigenvalues and mixings depend on five discrete parameters, the number of clockwork gears for each SM representation (two left handed doublets and three right handed singlets), as well as one continuous parameter of O(10) whose purpose is to suppress the bottom and tau masses with respect to the top mass. The inter-generational hierarchies in contrast are created from order-one random numbers. As we will see, the lightness of the neutrino masses cannot be convincingly explained in this simple model, so we will adopt the seesaw mechanism for this purpose, introducing one more non-stochastic parameter, the seesaw scale.
Our mechanism relies on chosing the Lagrangian parameters from some random distributions, and we should maybe clarify what is the rationale behind such a procedure. What we are trying to model ist the following hypothesis: whatever the UV completion of our CW Lagrangian L, it does not induce any additional special flavor structure (such as relations between couplings, parametric suppressions/enhancements etc.) into the free parameters of L. We thus stipulate that these free parameters should be independent O(1) numbers for any such UV completion. To quantify this vague statement, we are using certain prior distributions, which hopefully give us a realistic measure for the free parameters of L in the space of all possible UV completions. In fact, this hypothesis is often used implicitly in other flavor models such as Froggatt-Nielsen (FN) or extra-dimensional (ED) ones, where the free parameters are scanned over similar prior distributions to produce (posterior) distributions for the masses and mixings. The same logics can even be applied to the SM itself, as we do in section 4. In other words, randomness is just a way to scan over the space (ensemble) of models with the same given flavor structure (none/FN/ED/CW), but with different UV completions that do not have any additional impact on flavor. This allows one to answer the question whether the given flavor structure produces a realistic pattern of masses and mixings for the majority of its UV completions.
Our model can also be interpreted as spontaneously localizing SM fermion zero modes in the bulk of theory space, similar to Anderson-localization in condensed matter systems [15]. We investigate in some detail the mechanism how this localization takes place.
The scale that sets the masses of the CW gears is a priori a completely arbitrary parameter, which we will refer to as the CW scale. We present an analysis of the main JHEP02(2020)186 constraints on the CW scale from flavor changing neutral current (FCNC) observables, based on the dimension-six effective theory. Since there are only fermionic New Physics states, the sole tree-level FCNC operators are ∆F = 1 operators of the current-current type, J µ Higgs J µ fermion . In particular, ∆F = 2 processes have to proceed via the exchange of a weak boson and hence are suppressed by the fourth power of the CW scale. At the loop level, one can generate additional four-fermion and dipole type operators via loops of the Higgs and the new fermions.
This paper is organized as follows. In section 2 we introduce the model and explain its mechanism for the hierarchies. In section 3 we explain the localization features of the zero mode in theory space and compare our random CW with the standard uniform model. In section 4 we present results of simulations and quantify the performance of the model for various choices of parameters. Section 5 and section 6 are devoted to a discussion of the main FCNC effects in the effective field theory framework. In section 7 we present our conclusions. Two appendices deal with some technical details regarding the localization probabilities in theory space and the relation of the field basis used in the main text, and the mass eigenbasis.

The model
The model is defined by the following Lagrangian where the bilinear part of the Lagrangian is given in terms of (2.2) and the same for L ψ R with L → R everywhere. All fermions carry implicit generation indices. A cartoon of one of the five sectors is shown in figure 1. We will refer to the points in theory space labeled by i as sites, and additionally we will refer to the first and last sites as the "boundaries" of theory space. The matrices M ψ i and K ψ i are complex 3 × 3 matrices with dimension of mass. The last term in eq. (2.1) contains the coupling of the CW gears to the SM Higgs, where the "proto Yukawa couplings" C d,u,e are complex, dimensionless matrices. Notice that the q and fields have one more left handed than right handed field per generation, while for the u, d and e fields it is the other way around. Therefore, we are guaranteed to have three left handed (right handed) zero modes for q and (u, d and e), which are identified with the corresponding SM fields. These zero modes will be linear combinations of all CW gears. However, as we will see, it is very convenient to integrate out the CW gears without going to the mass eigenbasis and use the ψ 0 , i.e., the boundary fields, as "interpolating fields" for the zero modes. 1 JHEP02(2020)186 Let us briefly discuss the symmetries of this Lagrangian. Ignoring the coupling to the Higgs, the sectors decouple. Then, say, in the q sector, one has an G = U(3) under which the M q i and K q i transform as (3 R,i ,3 L,i ) and (3 R,i ,3 L,i−1 ) spurions respectively. Similar symmetries appear for the d, u, e, and sectors. In order to guarantee that only nearest neighbours couple, it is sufficient to invoke an Abelian subgroup of this big symmetry. The smallest such subgroup appears to be the discrete (Z 2,R ) Nq × (Z 2,L ) Nq+1 subgroup generated by minus the identity of each U(3) factor. Notice however that there is no apparent symmetry that would enforce the universality (site-independence) of the matrices K i and M i . 2 The symmetry argument that enforces the nearest-neighbour structure is of course standard in clockwork models, however, the fact that we take seriously the apparent non-universality of the site couplings will be crucial to our mechanism.
Solving the equations of motion for the vectorlike clockwork gears leads to the following Lagrangians for the interpolating fields ψ 0 : where the form factors Π ψ (p 2 ) are determined from the following recursion relation We stress that no approximations have been made (except for the fact that we work at tree level for now) and hence the Lagrangians are exact even though they contain an infinite number of derivatives. For momenta small compared to the lightest gear, we expand this to quadratic order in derivatives As we will see, the Z ψ 0 are hierarchical matrices with eigenvalues > 1 which will lead to hierarchical Yukawa couplings after canonical normalization. The matrices A on the other hand will describe the flavour violating dimension-six operators, and are evaluated in section 5. One easily obtains the recursion relations JHEP02(2020)186 which we can solve explicitly as As shown below, even for order-one random matrices K and M , the matrix Z is very hierarchical, with eigenvalues ranging from z 1 to z 1. After canonical normalization we obtain the physical Yukawa couplings where The model we are considering is essentially the one studied previously in ref. [11]. Compared to ref. [11], we are considering only one chirality when coupling the neighbouring sites. In ref. [12] a related model was proposed in which uniform (site independent) Hermitian random matrices Q i = Q were assumed. In the model of ref. [13], clockwork chains of different lengths within each fermion species were considered, with site-independent couplings. Recently, the authors of ref. [14] considered classes of models which in some cases are very similar to our model, in particular, the couplings between the clockwork fields were taken site-dependent and random. As pointed out, the assumption of uniformity (that is, site-independent couplings) is not easy to justify by four-dimensional symmetry principles. However, it is also unnecessary, as a successful model of flavor arises just from our Lagrangian, without any further assumptions.
In the rest of this section we will review how our mechanism generates hierarchical Yukawa couplings and CKM angles [11]. In the following we will call a complex matrix B hierarchical if its singular values (the square roots of the eigenvalues of BB † ) are hierarchical, b 1 b 2 b 3 . We will work under the assumption that the matrices K ψ i and M ψ i have order-one complex entries in units of some "clockwork scale" that we leave unspecified for now. Notice that the matrix Q ψ i = (M ψ i ) −1 K ψ i and hence the Yukawa couplings are independent of this scale, which will thus show up only in higher dimensional operators. We will see that for any given order-one random matrices K and M it is very likely that the matrix Z is very hierarchical with eigenvalues ranging from unity to very large values, and this hierarchy is inherited by the Yukawa couplings in eq. (2.10) after canonical normalization.
The mechanism is based on the following two observations [11] 1. Hierarchy from products. The product of randomly chosen order-one matrices quickly becomes hierarchical with increasing number of factors.   The first observation guarantees that the Z matrices (and hence the E matrices) are very hierarchical. Specifically, the second term in the expression for Z ψ −1 , eq. (2.7) is more hierarchical than Z ψ because of the additional factors of Q ψ . Adding the identity to it simply raises each of the three eigenvalues of the second term by one. 3 This slightly mitigates the hierarchy, in particular it "resets" possible eigenvalues 1 to one, while it essentially does not affect very large eigenvalues. Typical distributions for the eigenvalues of E are shown in figure 2a, where we use uniform priors with |Re(M i ) kl | < 1, |Im(M i ) kl | < 1 and analogously for K i . These distributions depend on only one discrete free parameter, the number of clockwork gears N . A characteristic feature is that the largest eigenvalue is very sharply peaked just below = 1, while the distributions for the smaller eigenvalues become more and more spread out.

2.
The second observation above guarantees that the CKM matrix is near-diagonal. As is evident from eq. (2.10), the Yukawa couplings for up and down sector have the common hierarchical factor B = E q , which guarantees the alignment of the left handed rotations V u and V d that appear in the singular value decomposition for the quark Yukawa couplings. Notice that the rotation matrices themselves are not necessarily close to the identity, only the combination A more explicit way of seeing these features is to go to the basis in which the matrices E ψ are diagonal. This is achieved by order-one, gauge-invariant, unitary rotations, such that up to order one-numbers, the Yukawa couplings become (Y u ) ij ∼ q i u j etc. This in turn leads to eigenvalues and mixing matrices with i < j in the last relation. A rigorous and systematic treatment underlying the behaviour shown in eq. (2.12) has recently been given in [16].

JHEP02(2020)186
So far, the distributions for the Yukawa couplings depend solely on five discrete parameters N ψ . As it turns out, these distributions are too rigid for the following reason. As can be seen from figure 2a, the largest eigenvalue of the matrix E ψ is always of order one. This introduces a problem for the down quark and charged lepton sectors, which require third generation Yukawa couplings of the order of 10 −2 . There are two simple solutions to this problem. The first one is to include an overall suppression factor in front of the proto-Yukawa couplings for the down and lepton sectors, which could for instance be due to a large tan β in a supersymmetric or a two Higgs doublet (2HDM) version of our model. The second possibility is to suppose that the K ψ matrices are generated at a slightly different scale than the M ψ matrices. Denoting the ratio of such scales by ξ ψ , one can conveniently incorporate this effect by writing the corresponding Q matrices as where the hatted quantities are dimensionless complex matrices taken to be of order one. For ξ 1, one can approximate 14) The eigenvalues of this matrix are expected to shift to smaller values. This can be seen for instance for the cases ξ ψ = 2, 3 explicitly shown in figure 2b. It can also be observed that this increases the hierarchy between middle and largest eigenvalues, while the hierarchy between smallest and middle eigenvalue is unaffected. 4 For the opposite case, ξ ψ 1, all three eigenvalues of E ψ will be very close to one, and the hierarchies disappear altogether.
Finally, we must accommodate neutrino masses. Within our paradigm it is difficult to use the clockwork mechanism to explain the smallness of neutrino masses (for instance by introducing N ν sterile clockwork gears with ξ ν > 1), as a large number of gears will necessarily mean a large hierarchy between the neutrinos themselves, contrary to observation. We will thus simply implement the see-saw mechanism, or, equivalently write the low-energy Weinberg operator We will separate the right handed mass matrix into a scale m ν R and a dimensionless symmetric complex matrixM ν R that we will take as random order-one numbers The seesaw scale m ν R will be another free (non-stochastic) parameter of the theory.
4 Also notice that in this limit one could work with simpler non-Hermitian matrices E ψ = (Q ψ N · · · Q ψ ) −1 when normalizing canonically. The two choices differ by a unitary matrix. Whatever the choice, the structure of the E ψ matrices is now a simple product, showing once more that our mechanism is a consequence of the "hierarchy from products" property described above.

JHEP02(2020)186 3 Clockwork Lagrangians: uniform versus random
Up to now we have not really made any connection to the usual CW paradigm of "coupling suppression from zero mode localization". Moreover, our CW Lagrangian does not even seem to contain any small order parameter q controlling these suppressions. In this section we will clarify these issues and also comment on an interesting observation made recently regarding a link to Anderson localization in condensed matter systems [15]. This section is not essential for the understanding of the rest of the paper, section 4, 5 and 6.
We start out by considering a single generation of fermions, in preparation for the realistic three-generation case to be presented after. The Lagrangian reads: where we have written a fermionic source O to keep track of couplings to the other sectors of the theory. Reading off the mass matrix for the fermions gives which has a left-handed zero mode and ≡ (1 + |q 1 | 2 + |q 1 q 2 | 2 + · · · + |q 1 q 2 · · · q n | 2 ) − 1 2 . (3.4) The eigenvector f i will be called the zero mode wave function in what follows. Note that the zero mode wave function at site zero is simply f 0 = .
In the standard CW scenario one choses the q i uniformly, q i = q. We will refer to this scenario as the "ordered" or "uniform" CW model. On the other hand, taking the q i different (and random) at each point, will result in the "disordered" or "random" CW model. In the standard uniform case, the wave function is monotonically decreasing (for |q| < 1) or monotonically increasing (for |q| > 1). In the former case, one has ≈ 1 and the coupling of the zero mode to the operator O is unsuppressed. In the second case, one has 1 and the coupling of the zero mode is suppressed by . Hence, unless q = 1, the zero-mode wave function is strongly localized at either boundary of the lattice. Moreover, the wave functions for the gears can be computed analytically and are essentially delocalized over the lattice [13].
On the other hand, for the random CW model the zero mode can peak at any site. However, as we will show, the localization is still rather narrow around that site, such that the wave function f 0 is typically still suppressed. In appendix B it is shown that for JHEP02(2020)186  Figure 3. (a) The averaged zero mode wave functions for a random CW with N = 10 with uniform priors for k and m. We separately average over the wave functions that peak at a given site. The dots mark the probabilities for the peaks to occur at that site, given in eq. (3.6). (b) Some of the same wave functions for uniform (solid) and Gaussian (dashed) priors. randomly chosen parameters m i and k i , the probability p N,j for the maximum of the zero mode wave function to occur at site j is given by To derive this result it is assumed that the functional form of the prior distributions for k i and m i are the same and site-independent. Except for this assumption, the localization probabilities in eq. (3.6) are independent of the chosen prior. 5 The next step is to see how localized the wave function becomes, given that it is peaked at site j. This depends on the prior. For identical priors for m and k, it is intuitively clear (and easily shown), that the distribution for x = log |q| 2 is then symmetric under x → −x. For instance, for uniform priors it is simply given by p(x) = 1 2 e −|x| , while for Gaussian priors it is p(x) ∼ cosh(x) −1/2 . We show in figure 3 the averaged wave functions for the case of N = 10. 6 The values of correspond to the left end points of the curves. Naively one might have expected that the wave function eq. (3.3) with randomly chosen coefficients q i is uniformly spread over the lattice. Instead, one finds that they are rather strongly localized around their maxima. Another question is how different priors might influence the localization. There is a mild O(1) dependence on the priors, as shown in the right plot of figure 3, which is explained by the slower falloff of p(x) for large x in the Gaussian case.
A semi-quantitative understanding of the localization mechanism can be obtained as follows. Let us consider the cases with maximum at fixed site j (with probabilities given JHEP02(2020)186 above). We then consider the (conditional) probability that the wave function decreases when we move away from the maximum, say from site j Clearly p − (1) = 1 (otherwise |f j | is not maximal). In appendix B it is shown that p − (k) is monotonically decreasing with k, but it always stays above 1 2 , with the universal (priorindependent) limiting value Hence there is a strong bias for the wave function to decrease to the right of the peak, which weakens but never goes away once we move to larger k. A completely analogous argument applies to the left of the peak. This fact explains the shapes in figure 3. In summary, the probability for the wave function to rapidly decay away from its maximum is large. A further comment concerns the massive modes or "clockwork gears". In contrast to the uniform case they are also localized very sharply at individual sites. It is thus typically the case that only very few heavy modes couple to the source O.
In ref. [15] a similar class of models was studied, and it was pointed out that the localization features observed here are closely related to an effect that is known as Anderson localization in condensed matter systems [17]. To see this parallel here, consider the Hermitian This matrix takes the form of a Hamiltonian over some discrete (one-dimensional) lattice with nearest neighbour interactions, known as the Anderson tight binding model. This model is known to localize its energy eigenstates similarly to our model, implying that any localized wave functions only have overlap with very few energy eigenstates and hence do not diffuse over time [17]. The original models only considered the E i to be stochastic and the interactions V i to be fixed (and uniform), but subsequent studies have shown that localization also occurs for off-diagonal disorder [18]. One difference to the Lagrangians considered in ref. [15] is that there exists an exactly massless mode, and this is the only one that is phenomenologically relevant. The fact that its wave function is known analytically (in contrast to the massive modes), allows for additional insight regarding its localization properties.
Let us now move to the case of three generations. There are now three zero modes and their orthonormal wave functions read F is a 3(N +1)×3 matrix whose three columns are the three zero mode wave functions and E is the same matrix encountered in eq. (2.11). Notice that each site has three fields living on it, and accordingly each wave function has three components per site. Diagonalizing with V unitary. One can absorb the factor of V † on the right by performing a further rotation of the zero mode fields with V . In this basis the zero mode wave functions read F = F V and their site-zero components are given by the eigenvalues i of E. As in the case of one generation, the i parametrize the mixing of the zeros modes with the boundary fields. It is then tempting to speculate that the smaller the value of i the further the i th zero mode wave function is localized away from site zero. This can be easily verified in a numerical simulation. In figure 4 we show the average wave function of each of the three zero modes for a clockwork chain with N = 10. Notice that the three left endpoints of the curves correspond precisely to the medians of the distributions for the i shown in figure 2a. 8 The correlation between localization and suppression is clearly visible. In summary, in this section we have shown that our model can be interpreted as creating a spontaneous separation of the three zero modes of each fermion species in theory space. One zero mode is likely to live near site zero (where the Higgs is located), one is likely to live near the opposite boundary, and yet another one is likely to live in the bulk. These modes are then identified, respectively, as the third, first and second generation of each species.

JHEP02(2020)186
Let us briefly comment also on the relation of our model to the quark flavor model of ref. [15]. There, the authors added one vectorlike partner ψ and one neutral scalar partner π ψ for every SM quark ψ, with Lagrangian where L Yuk contains O(1) Yukawa couplings of the ψ to the Higgs. It is the scalars π ψ that are then assembled into a N = 9 (disordered) clockwork chain, leading to a scalar mass matrix of the same form as eq. (3.9). If one of the localized modes will obtain a vacuum expectation value (vev), it will produce hierarchically different mixings between the quarks and their partners from the first term in eq. (3.12). Upon integrating the ψ , this leads to hierarchical Yukawa couplings. Thus, even though mathematically both models are based on Anderson localization, the two models are very different in that in our case it is the fermion zero modes whose localization is exploited, while in the model of ref. [15] it is the scalar mode that obtains a vev.

Quarks
In order to assess how well the proposed model can naturally reproduce the observed strong hierarchical patterns in the fermion sector, one can perform simulations to obtain the distributions of Yukawa eigenvalues, as given by eq. (2.10), and the related mixing angles. To this end, the "proto-Yukawas" C u,d , as well as the set of dimensionless matriceŝ M ψ k andK ψ k , were taken as random order one, 3×3 complex matrices. 9 Besides the discrete parameters N q , N u and N d , there are three continuous parameters, ξ q , ξ u and ξ d that we introduced in eq. (2.13). Our task is now to optimize these six parameters and achieve the most favorable distribution for the observed quark masses and mixings.
In order to measure how well the distribution performs we proceed as follows. First, from the data of the simulation for a given choice of parameters, we compute the mean values µ i as well as the variances σ i and the correlation matrix C ij of the eigenvalues and CKM angles. This defines a Gaussian approximation for the theoretical distribution of models. More precisely, we opt to work with the logarithms of the observables, as the resulting distributions are much more symmetric than the variables themselves, and hence the Gaussian approximation in terms of the first two moments is better than in the linear case. We then define a χ 2 function evaluate it at the experimental central values χ 2 (x exp i ), and optimize the parameters to obtain a minimal χ 2 . For convenience, we summarize in table 7 the experimental data.

JHEP02(2020)186
Notice that this procedure is in a way the exact opposite to the standard fit of a theoretical model to experimental data. There, the parameters of the model are adjusted in order to minimize χ 2 exp (x th i ). Here, the parameters of the theoretical distributions are adjusted in order to minimize χ 2 th (x exp i ). The distributions for the CKM angles to a good approximation only depend on N q (and weakly on ξ q ), while the up and down type mass eigenvalues only depend on the There are a variety of effects that determine the behaviour of the χ 2 function.
• The CKM sector favors ξ q > 1. For ξ q ≈ 1 the hierarchy between q 2 and q 3 is not very large (see figure 2a) causing θ 23 to turn out somewhat large, see eq. (2.12). This is improved by increasing ξ q (see figure 2b). The CKM matrix can be well reproduced for values 2 ≤ N q ≤ 4.
• In the up sector the top quark mass prefers ξ q and ξ u of order one. The simultaneous fit of the ratios m t /m c and m c /m u hierarchies prefers slightly larger values of ξ q and/or ξ u . The up-sector masses are in general well reproduced for 8 N q +N u 12, depending on the values of ξ q,u .
• The down sector prefers values of ξ q and/or ξ d greater than one in order to suppress the bottom mass. Given the restriction on ξ q from the top mass, ξ d typically is the larger of the two. A larger N d would help to keep ξ d smaller, but this is limited by the comparatively mild hierarchy in the down sector which prefers 5 N q + N d 7.
The most noticeable tension in the quark distributions is coming from the value of ξ q due to the opposite interests of the CKM and up sectors. 10 Taking all the previously mentioned factors into consideration one can arrive at the best fit values for the parameters by finding the minimum value of χ 2 . We consider two scenarios: scenario A, which assumes that N u = N q and ξ u = ξ q , naively compatible with SU(5) unification, and scenario B, in which we set ξ u = 1. Both cases have just two continuous free parameters, ξ q and ξ d which we adjust to minimize χ 2 . For several choices of the N q,u,d we present the best fit points in table 1. The typical values for the χ 2 are of the order of χ 2 ∼ 10, which, for 10 degrees of freedom, corresponds to less than one σ deviation from the mean value of the distribution. 11 Furthermore, in table 1 we also report on a third scenario in which we set all the parameters ξ q,u,d = 1 and instead upgrade to a type-II two Higgs doublet model (2HDM) with tan β = 40 in order to accomodate the bottom mass. A noteworthy feature is that with ξ q,u,d = 1 one in general needs more clockwork gears in order to achieve a large enough hierarchy between second and third generations. The resulting values for χ 2 are again of the order of one sigma, however they do not have any free continuous parameter besides tan β. To put these values into perspective, we have computed the corresponding distribution for N q,u,d = 0, that is, the distribution for masses 10 One can in principle also remove this tension by further tweaking the prior distributions for the proto-Yukawas Cij, also allowing for values slightly larger than one. 11 We stress that this σ is the one pertaining to the theoretical distribution. By no means are we claiming that the model predicts the physical values within one experimental σ.  and mixings in the SM if the Yukawa couplings were taken as order one random complex numbers. This yields χ 2 ≈ 4000, putting the physical values at about 63 sigma away from the mean value of this distribution. Including tan β, these values "improve" to χ 2 ≈ 1600 or 39 sigma. The marginalized distributions for a representative case can be found in figure 5 and figure 6.

Leptons
Moving to the lepton sector, we have the two discrete (N , N e ), and three continuous parameters (ξ , ξ e , m ν R ). For the neutrino sector, we consider normal mass ordering only. To good approximation, the charged lepton masses only depend on ξ e , ξ , N e , and N , while the mixings mainly depend on N (and weakly on ξ ) and the neutrino masses only depend on ξ , N and m ν R .  The following features determine the main behaviour of the χ 2 function.
• The PMNS matrix requires a low value of N in order to avoid a too small θ 13 : N = 1, 2 work best, with N > 3 noticeably deteriorating the fit. The dependence on ξ is weak, the main effect of larger ξ is a mild suppression of θ 23 .
• The charged leptons require a somewhat large value of ξ e and/or ξ in order to suppress the τ mass.
• For the neutrino sector, one needs to avoid large hierarchies between the second and third generations (m ν 3 /m ν 2 6 for normal ordering), which points towards ξ 1. However, for small N (preferred by PMNS, see above), even a large ξ is in principle possible. Since this decreases the neutrino masses, it requires a smaller seesaw scale in order to achieve a large enough m ν 3 .
The lepton sector is thus very easy to accommodate. Our results for the leptons are summarized in table 2, and some distributions can be found in figures 7 and 8. The SU (5) compatible scenario A has all parameters fixed from the quark sector, N e = N q , N = N d , ξ e = ξ q , and ξ = ξ d . This leaves as the only free parameter the seesaw scale, which we use to the fit the neutrino masses. As the neutrino masses are suppressed by the relatively large values of ξ , the seesaw scale is comparatively small, ∼ 5 · 10 11 GeV. For the scenario B, we fix ξ = 1 and vary only m ν R and ξ e . Notice that the first three cases of scenario B are still compatible with SU(5) as far as the number of gears go, and only show SU(5) violation in the ξ e, . For the 2HDM scenario, we take the same SU(5) compatible choices from the quark sector, leaving again only one free parameter, the seesaw scale.
Again, table 2 contains the "random SM" and its 2HDM cousin for comparison. It is worth noticing that in both cases almost the entire contribution to the χ 2 comes from the charged lepton masses, with the neutrino sector adding very little. This simply shows that neutrino anarchy [19,20] is working well.

Dimension-six operators
In this section we compute the dimension-six operators relevant for the leading flavor changing neutral currents (FCNC) effects of the model. Writing the effective theory as our task is to compute the Wilson coefficients C I , where we will be adopting the Warsaw basis [21].

Tree level
All tree level flavor changing effects come from the dimension-six effective Lagrangian (in canonical normalization) Here, A ψ is the order p 2 term in the expansion of the form factors eq. (2.6), that is given recursively in terms of One can solve this explicitly to get where Z ψ was given explicitly in eq. (2.8).
We prefer to translate the operators in L 6 to equivalent operators with fewer derivatives [21]. Using integration by parts and the equations of motion we find only two type of operators. The first ones are corrections to Yukawa couplings given in table 3, where we have defined

JHEP02(2020)186
Operator Table 4. Current-current operators and their Wilson coefficients. The others are current-current interactions given in table 4, where we defined the weak Higgs currents In order to discuss FCNCs, it is essential to understand the flavor structure of these couplings. Even though the expressions for the Wilson coefficients in the above operators are very explicit, they are not very illuminating for this purpose. In appendix C we recompute the Wilson coefficients in the mass basis, according to the diagram in figure 9 and show that replacing the exact mass eigenvalues by the CW scale(s) m ψ amounts to the substitution which are anarchic matrices of order-one numbers times m −2 ψ . The quantity m ψ is the typical mass scale of the fermionic resonances, an explicit definition is given in eq. (C.10). In reality the mass eigenvalues lie in a band around m ψ which will lead to somewhat broad JHEP02(2020)186 distributions of allowed gear masses. Moreover, the Yukawa couplings satisfy where we introduced the quantities c u,d which parametrize the typical size of the elements of C u,d . Depending on the scenario, we may set c u,d = 1 or c u = sin β, c d = cos β, or incorporate any other additional parametric dependence the model may predict for the C u,d . A similar parameter c e can be introduced in the lepton sector. Making use of this fact, the flavor structure is now evident. We have, for instance The fact that the flavor violating Wilson coefficient has the same common hierarchical factor E d as the Yukawa coupling eq. (2.10) implies that the rotations that diagonalize the Yukawa couplings also approximately diagonalize the Wilson coefficients [11]. By a suitable unitary (and gauge-invariant) transformation, one can obtain a basis in which the E ψ are diagonal. Let us denote the three eigenvalues by ( ψ i ), with the convention that ψ 1 ≤ ψ 2 ≤ ψ 3 . Notice that the eigenvalues are less than one, ψ i ≤ 1, and are typically hierarchical.
In this basis, we can give rough estimates by omitting order-one coefficients. For instance, the Yukawa couplings themselves behave as etc., while the Wilson coefficient above reads Diagonalizing the Yukawa couplings leaves the form of eq. (5.13) unchanged. 12 Therefore, this form of displaying the Wilson coefficients makes the flavor alignment very explicit, as off-diagonal entries always have at least one suppressed factor. These parametric estimates for all the nonzero Wilson coefficients are also given in table 3 and table 4.

One-loop
We now list the relevant fermionic operators that are not already generated at the tree level. Ignoring purely bosonic operators not relevant for flavor observables, we focus on operators that contain either two or four fermion fields. The only four-fermion operators that violate flavor are generated by the diagrams in figure 10. These operators are considered in section 5.2.1. If we disregard operators that are already generated at the tree level, we can furthermore restrict the two-fermion operators to ones with at most one Higgs field. There are two classes of operators, which schematically are of the form ψ 2 D 3 (two JHEP02(2020)186 fermions, three covariant derivatives) and Hψ 2 D 2 (two fermions, two covariant derivatives, and one Higgs), where we count a field strength as D 2 . The relevant topologies are shown in figure 11. After reduction to the Warsaw basis we are left only with dipole type operators, operators already present at the tree level, as well as ∆F = 1 four-fermion operators. Notice that in both cases only Higgs loops but not gauge loops contribute.

Four-fermion operators
Dimension-6 operators with four fermions will appear at the one loop level only, and involve two virtual Higgs exchanges, see diagrams in figure 10. For convenience, we define the function where a, b = q, , u, d, e, which parametrizes the diagram with exchange of a and b type clockwork fermions. These decouple with the heavier of the two scales m a and m b . In terms of the Warsaw basis, one generates all vector-current squared operators except O (8) qu and O (8) qd and none of the scalar-current squared operators. Here, for simplicity, we define the alternative operator O ud which is used instead of the operator O

Two-fermion operators
The complete set of dimension-six dipole operators is given in the first column of table 6. All three type of diagrams in figure 11 in principle contribute to the dipole operators after

JHEP02(2020)186
Operator Wilson coefficient (approximate) reduction to the Warsaw basis. One could evaluate these contributions exactly and subsequently perform an approximation of the kind eq. (C.4). However, as we already plan to ignore order-one coefficients anyway, it is sufficient to just know the parametric behaviour with the clockwork scales in the decoupling limit, which is easy to find without a detailed computation of the diagrams. When evaluating this decoupling behaviour, it is important to keep in mind that the clockwork gears only possess Yukawa couplings of the same chirality as in the SM, i.e., couplings of the kindq R Hd L are absent. Therefore, between each pair of Yukawa vertices only the chirality preserving part of the Dirac propagator contributes.
Let us first consider the quark operators. The first diagrams produceqqD 3 operators decoupling as c 2 d m −2 d + c 2 u m −2 u , andūuD 3 (ddD 3 ) operators going as c 2 u m −2 q (c 2 d m −2 q ). By use of the equations of motion they pick up factors of c u,d and contribute to the dipole operators. The second diagrams givequHD 2 operators going as c u (c 2 d +c 2 u ) m −2 q and c 3 u m −2 u , and similarly for theqdHD 2 operators. Finally, the third diagrams give rise toquHD 2 operators proportional to c u c 2 d decoupling with the heavier of m q , m d (and similarly for

JHEP02(2020)186
Operator Wilson coefficient (approximate) Table 6. Dipole operators and their Wilson coefficients. Here the scale m a,b is the smaller of the two scales m a , m b .
u ↔ d). This contribution is at most of the same order as the first two. The leading contributions to the Wilson coefficients therefore are where m a,b should be taken as the smaller of the two scales, 13 and analogously for the down quark operators with d ↔ u. The lepton sector works in a similar way. The resulting estimates for the Wilson coefficients are summarized in table 6.

FCNC constraints
In this section we provide the leading constraints from FCNC observables. The analysis is somewhat similar to extra-dimensional models of flavor (see ref. [22] for an overview of the constraints), with the important difference that here only fermionic resonances contribute to the effective operators, which strongly suppresses certain observables such as those from neutral meson mixing.

Decay µ → eγ
This proceeds via the dipole operators (O eA ) ij = F µνē i L σ µν e j R with coefficient In terms of this, the partial width is Γ(µ → eγ) =

µ → e conversion
The conversion of muons to electrons in nuclei such as gold is another strongly constrained observable. In terms of the Wilson coefficients 14 one has for the transition rate normalized to the capture rate [24] B Au (µ → e) = 4m 5 where the last factor comes from the coupling of the Z boson to the nucleus. Using the overlap integrals and capture rate from ref. [25] V (p) one obtains The experimental bound is B Au (µ → e) < 7 · 10 −13 [26]. Let us consider some representatives of the scenarios A, B as well as the 2HDM separately. In the following, we eliminate the e i in favor of the physical masses via the relation y e i ∼ c e e i i , and use as reference values for the i the median values of the distribution in each case.

JHEP02(2020)186
• Scenario A (c e = 1, ξ = 13, N = 2). Keeping in mind the experimental limit, scenario B is the most constrained case, pointing to m e in the 100 TeV region, while scenario A requires m 10 TeV. The 2HDM scenario is the least constrained one, and survives even for TeV scale gear masses.

Decay µ → 3e
The branching ratio for the decay µ → 3e is governed by the exact same combination of Wilson coefficients as the µ → e conversion rate. However, the bounds turn out to be about a factor of 3 weaker than the latter, hence we will not detail them here. To find the limits, we perform a simplified analysis using the bounds on Wilson coefficients given in refs. [27,28]. In terms of the standard terminology, C sd 15 The imaginary parts of these Wilson coefficients are constrained as C sd 1 , C sd 1 (1.7 · 10 4 TeV) −2 and C sd

JHEP02(2020)186
• Scenario A gives • In the 2HDM scenario one has Clearly, none of these cases result in any significant bounds onm (say, above 1 TeV). The bounds from the DD and BB systems are even weaker and will not be investigated here.

Rare meson decays
Various measurements on rare decay modes of mesons, in particular s and b flavoured ones, constrain the tree-level generated flavor-violating Z couplings. We investigate here two of the experimentally best measured ones that allow us to obtain the strongest bounds [29] B(K L → µ + µ − ) = (6.84 ± 0.11) · 10 −9 , (6.12) which correspond to s → d and b → s transitions respectively. It is straightforward to relate the expressions in ref. [30] to the relevant Wilson Coefficients.

JHEP02(2020)186
In the following, we neglect the NP 2 terms, which, consistent with the EFT approach, are subleading. Our class of models then yield TeV m 2 · 10 −9 , (6.16) 17,17] TeV m 2 · 10 −9 , (6.17) where the ± schematically parametrizes the possible types of interference with the SM. For the second and third models the bounds are more severe than the ∆F = 2 constraints, going as high as 5 TeV for the third model and 9 TeV for the first one. The bounds are noticeably weaker than in scenario A, going maximally to 4 TeV in the case of the third model. No significant bounds arise in this model, which can be consistent with data even for sub TeV masses.

Conclusions
In this paper we have proposed a model of flavor, both for the quark and lepton sectors, in which hierarchical masses and mixings arise from a chain of vectorlike fermions. In contrast to similar constructions with site-independent couplings, here the couplings at different sites are uncorrelated. We showed that if they are taken to be random order-one complex numbers (in units of some scale), it is very likely that the physical Yukawa eigenvalues of each species turn out hierarchical, thus explaining in a natural way the observed mass patterns in Nature. We have also shown that this effect can be interpreted as a spontaneous and narrow localization of the zero mode at a random site of the chain ("localization in theory space"), which can occur close to the boundary where the Higgs lives (we identify this mode with a third generation fermion), towards the opposite site (first generation), or randomly in the bulk (second generation). The deterministic parameters of the model are essentially the number of clockwork fermions of each gauge representation (in the case of SU(5) compatible choices these are just two integer numbers), whereas effectively one more

JHEP02(2020)186
continuous parameter is needed in order to suppress the bottom and tau mass compared to the top mass. This latter parameter is of the order ∼ 10, and can appear either as a ratio of scales or a tan β type parameter in a 2HDM version of the model. We have performed a quantitative analysis of the model by simulating the "posterior" distributions for the masses and mixings from reasonable prior distributions for the Lagrangian parameters, and show that for appropriate number of clockwork fermions the experimental values appear near the mean values of these distributions, about one standard deviation away from it. The results are summarized in tables 1 and 2. Furthermore, we have computed the dimension-six flavor changing operators (exactly at the tree level, and parametrically up to order-one numbers at the loop level). With these result in hand, we have investigated the main flavor changing effects and found that the primary constraints come from µ → e conversion in nuclei, followed by µ → eγ. The constraints in the quark sector are less severe, and predominantly come from rare meson decays. The actual lower bounds on the vector-like scale differ quite a lot over the different variants of the model and can generally vary in the 1 to 100 TeV.
Besides the clockwork scale, the main variables that govern these bounds are the ψ parameters. Should a pattern of flavor violation be observed in the future, one can try to reconstruct preferred ranges for these parameters which in turn could give some important clues as to the number of clockwork gears compatible with experiment. This will be particularly constraining for scenario A, since parameters in the lepton and quark sectors are related. For instance, rare meson decays should show up at the same clockwork scales as µ → e conversion, so a simultaneous discovery of these two effects could point towards this class of models.
A key feature of our model, that allows us to distinguish it from certain other flavor models, is the absence of tree-level ∆F = 2 four-fermion interactions. For instance, a nonobservation of large effects in KK mixing and simultaneous observation of say, rare meson decays, could help to distinguish these models from models with extra dimensiona [22]. 16 The latter predict a similar pattern of FCNC dimension-six operators generated form integrating out fermionic resonances, but in addition generate a plethora of tree-level flavorviolating four-fermion operators from integrating out spin-1 resonances.
Should actual fermionic resonances be observed, the natural question that arises is how to distinguish them from other models predicting such states, such as extra dimensions or clockwork constructions without randomness. Two main features could be exploited in that regard. Firstly, the spectrum of resonances is quite different from extra dimensions, in that the masses are more concentrated around one scale m rather than spaced evenly. They are also much less regularly distributed than those of both extra dimensions or the ordered clockwork model [13]. Secondly, a unique feature of the disordered clockwork is that the heavy gears are localized in the bulk of theory space, making their couplings to the Higgs (that lives on site zero) very hierarchical. As a consequence, the branching fractions of a gear into a Higgs (or longitudinal W/Z) and a SM fermion (or lighter gear) are expected to be very different for the various gears of the same type.

JHEP02(2020)186
We close with a few comments on possible further directions. It would be interesting to embed this paradigm in low-energy supersymmetry. Even if the flavor scale is very high, interesting bounds could be obtained due to the inheritance of the flavor structure by the soft terms. It has been pointed out previously [31] that these effects can be markedly different for different flavor mechanisms (such as extra dimensions vs. horizontal symmetries). Furthermore, even though we have obtained good fits with SU(5) compatible choices for the number of vectorlike fermions, the other Lagrangian parameters (the matrices M ψ i and K ψ i ) need to present a significant amount of SU(5) violation in order to generate the differences in down and lepton Yukawa couplings. However, interestingly, these couplings are dimension-3 operators, and hence they could descend from renormalizable Yukawa-type interaction with the GUT breaking 24 representation, which naturally can result in order one SU(5) breaking effects. It would be very interesting to find a working model along these lines.  We are interested in the statistical distribution of the f i when k i and m i are randomly chosen from some given prior distributions. We will chose these priors (though not the actual values of m i and k i ) to be site-independent. If, in addition, the priors for m and k are identical, then the distribution for x ≡ log |k/m| 2 is symmetric.

Acknowledgments
For instance, for uniform priors with ranges where x runs over the entire real line. For the moment we will keep the distribution p(x) arbitrary. Let us suppose that the wave function is maximized at site j, f max ≡ max i |f i | = |f j |. The set of values of the x i that corresponds to this situation defines a region in R N that we will denote by R N,j . Using it is not hard to see that this region is defined by the relations Integrating over R N,j then gives Vol(R N,j ) ≡ p N,j , the probability for the zero mode wave function to peak at site j. However, the relations in (B.5) only involve x 1 . . . x j and the ones in (B.6) only the x j+1 . . . x N . In fact separately they define the two lower-dimensional regions R j,j and 17 R N −j,0 into which R N,j factorizes, in particular Eq. (B.7) then leads to the recursive relation If, in addition p(x) = p(−x), one has the relation p N,j = p N,N −j , (B.9) 17 Strictly speaking this notation implies a renaming xj+i ≡ yi for i ≥ 1.

JHEP02(2020)186
in particular p N,j = p j,0 p N −j,0 . (B.10) Hence, all probabilities are determined once we know the quantities p j,0 . The latter can simply be obtained from normalization N j=0 p N,j = 1 and the relations eq. (B.10). One finds This result is independent of p(x), as long as p(x) = p(−x). Next, we would like to investigate the conditional probabilities (given the location of the peak) that the wave function decreases say to the right of the peak. Formally, it is calculated as In the second line we used the factorization fo the two regions which still holds true in the presence of the additional constraint x j+k < 0, which causes a cancellation of Vol(R j,j ) factors between numerator and denominator. Notice that the problem reduces from an N -dimensional to an N − j ≡ N dimensional one. Renaming the sites x i → x i−j we can simply calculate for 1 ≤ k ≤ N . Clearly p − (1) = 1 (otherwise f 0 would not be maximal). We will now show that this probability decreases monotonically until it reaches the final value p − (N ) = N /(2N − 1) > 1 2 . In other words, p − is always greater than 1 2 and hence it is always more likely that the wave function decreases rather than increases at any step to the right of its peak. Monotonicity is fairly straightforward. Let R = R N ,0 and R i ⊃ R be defined like R but with the condition x 1 + · · · + x i < 0 removed. Then (B.14) In the third line we have used that the conditions in R k−1 as well as the integration measure are symmetric under exchange of x k−1 ↔ x k , and in the fourth line we used the condition x k > 0 ∧ x 1 + · · · + x k < 0 ⇒ x 1 + · · · + x k−1 < 0. In order to compute p − (N ) we need Vol(R N ,0 ∩ x N < 0) = Vol(x N < 0) Vol(R N −1,0 ) , (B.15) JHEP02(2020)186 which follows from x N < 0 ∧ x 1 + · · · + x N −1 < 0 ⇒ x 1 + · · · + x N < 0, such that the latter condition can be dropped without consequences. By the use of eq. (B.11) one finds thus establishing the result.

C Mass eigenbasis
It is useful to compute the Wilson Coefficients in the mass eigenbasis before electroweak breaking, which is less explicit than the basis employed in the main text but nevertheless insightful. The two type of operators are then generated by tree-level diagram as shown in figure 9. To avoid an overboarding notation, it is useful to first consider the case of a single generation. For instance, the diagram involving exchange of up type gears generates the unique dimension-six operator After reduction to the Warsaw basis, we end up again with the previous two type of operators of tables 3 and 4, but this time expressed in terms of the mass eigenvalues and mixing angles. The notation is as follows: U q L (U q R ) are defined as the N q + 1 (N q ) dimensional unitary matrices diagonalizing the mass matrix for the doublet gears, and U u R and U u L (of dimension N u + 1 and N u respectively) are the analogous matrices for the singlet gears. Notice that the first column of U q L contains the zero mode wave function, Furthermore, a labels the N u mass eigenstates. The term in parenthesis in eq. (C.1) is essentially a weighted average over the inverse gear masses squared. To simplify this expression, we make the approximation of approximately degenerate gear masses, that, is where m u is some characteristic clockwork scale for the u sector. A possible definition for this scale is given in eq. (C.10). Unitarity implies a =0 |U u R 0,a | 2 = 1− 2 u and the approximation eq. (C.4) then simplifies the dimension-6 Lagrangian in eq. (C.1) as L 6,u =

JHEP02(2020)186
When going to three generations, U q L 0,0 becomes a 3 by 3 matrix, previously denoted by E q , see eq. (2.11). Moreover, the single row matrix U u R 0,a becomes a 3 × 3N u matrix that we will denote by G u R . Then for instance, one has C (1) where in our convention the E matrices are Hermitian, and we defined the diagonal 3N u × 3N u matrix M u that contains the eigenvalues m u a . Notice that the three rows of (E u , G u R ) comprise the first three rows of the unitary matrix U u R . Comparison with  table 4 gives Note that all tree-level Wilson Coefficients in table 3 and 4 contain the combination E ψ A ψ E ψ , and in analogy to eq. (C.7) one has the relations Even though we do not have analytical expressions for the masses and mixings on the right hand side of eq. (C.8), this is still a useful rewriting. For instance, approximating the eigenvalues by the CW scale m u and using unitary E 2 ψ + G ψ (G ψ ) † = 1 3×3 one gets One possible definition for the CW scale m ψ is in terms of the prior distributions for the matrices M ψ and K ψ . If we denote the medians of the |(M ψ ) ij | bym ψ and the medians of |(K ψ ) ij | byk ψ , we define m 2 ψ =m 2 ψ +k 2 ψ . (C.10) In practice, this defines the scale that sets the mass eigenvalues of the clockwork gears.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.