Yukawa ratio predictions in non-renormalizable SO(10) GUT models

Since SO(10) GUTs unify all fermions of the Standard Model plus a right-chiral neutrino in a representation 16 per family, they have the potential to be maximally predictive regarding the ratios between the masses (or Yukawa couplings) of different fermion types, i.e. the up-type quarks, down-type quarks, charged leptons and neutrinos. We analyze the predictivity of classes of SO(10) (SUSY) GUT models for the fermion mass ratios, where the Yukawa couplings for each family are dominated by a single effective GUT operator of the schematic form $\mathbf{16}^2 \cdot \mathbf{45}^n \cdot \mathbf{210}^m \cdot H$, for $H \in \{\mathbf{10}, \mathbf{120}, \overline{\mathbf{126}}\}$. This extends previous works to general vacuum expectation value directions for GUT-scale VEVs and to larger Higgs representations. In addition, we show that the location of the MSSM Higgses in the space of all doublets is a crucial aspect to consider. We discuss highly predictive cases and illustrate the predictive power in toy models consisting of masses for the 3rd and 2nd fermion family.


Introduction
Grand Unified Theories (GUTs) present an attractive framework for physics Beyond the Standard Model (BSM). Besides gauge coupling unification, they also unify fermions in joint GUT representations, presenting an interesting possibility to address the flavor puzzle, i.e. the origin of the values of masses, mixings and CP violating phases. The most common GUT models are based on the unifying groups SU(5) and SO(10); this paper focuses on the latter choice, where an entire SM family of fermions and an additional right-handed neutrino can be embedded into a single representation 16.
From the point of view of the Yukawa sector and the flavor puzzle, one can distinguish two approaches to build unified models:
1. Minimal renormalizable models, where a minimal set of irreducible particle representations in the fermionic and Higgs sectors is postulated. All renormalizable terms admitted by gauge symmetry are written down, and the GUT symmetry breaking

JHEP02(2020)086
is achieved with a minimal set of scalar representations. The bigger symmetry can manifest itself in a smaller number of free parameters for the masses and mixings compared to the Standard Model (SM). Usually, the predictions in this type of model appear as correlations between observables and are often rather complicated. They are typically made apparent only through a numeric fit and subsequent analysis. The predictions for the ratios of fermion masses are generically hidden, because each entry of the Yukawa matrix is generated by a linear combination of GUT operators.
2. Flavor models (as effective theories or renormalizable realizations), where the emphasis is put on explaining the observed mass ratios and mixings and on the maximal predictivity for the yet unmeasured observables. Despite models of this type having a larger particle content, they achieve predictivity by postulating a certain (continuous or discrete) "family symmetry", broken spontaneously by so-called "flavon fields", to ensure control over the textures of the Yukawa entries. This opens up the possibility that each entry of the Yukawa matrices is dominantly generated from one effective GUT operator, a scenario to which we will refer as "single operator dominance". When this condition is satisfied, the group-theoretic Clebsch-Gordan coefficients between the different fermion sectors can give rise to fixed ratios between the Yukawa entries. In the main part of this paper, we will focus on this scenario and the predictivity for the fermion mass ratios from the Clebsch factors of single effective GUT operators.
In the context of SO(10), an example of a model of the first kind is the minimal renormalizable supersymmetric model [1][2][3][4] with a Higgs sector consisting of the irreducible representations 10, $\overline{\mathbf{126}}$, 126 and 210, where the first two are involved in the Yukawa sector of the model, while the last two ensure a suitable potential for GUT symmetry breaking. A renormalizable Yukawa term with a 120 is also possible. Some fits of renormalizable SO(10) Yukawa sectors can be found in [5][6][7]. Alternative setups using the 54 [8], or an additional vector-like fermion family $\mathbf{16} \oplus \overline{\mathbf{16}}$ [9] also fall under the approach of "minimal renormalizable models".
The second approach of flavor models is more prevalent in the context of SU(5) GUTs. The simplest examples are models which make use of Clebsch factors between the down and charged lepton sector from the renormalizable Yukawa operators $\mathbf{10}_F \cdot \overline{\mathbf{5}}_F \cdot H$ using the $\overline{\mathbf{5}}$ or $\overline{\mathbf{45}}$ for the Higgs representation H. If each Yukawa entry comes from a single GUT operator (i.e. in the case of single operator dominance), the ratios of entries in the different sectors at the GUT scale are predicted from the type of operator used. The simplest aforementioned examples of operators lead to the well known cases of b-τ unification and the Georgi-Jarlskog factor [10]. Clebsch factors arising from more general non-renormalizable operators have been studied in [11,12]; there, the non-renormalizable operators are assumed to arise as effective theory operators from a renormalizable theory after integrating out heavy mediator fields. For models built on this approach, see for example [13][14][15][16][17][18][19][20][21][22][23][24].

The purpose of this paper is to systematically explore the predictions of classes of (non-renormalizable) SO(10) GUT operators for the fermion mass ratios, extending the previous results (cf. [25]), towards the construction of new SO(10) GUT flavor models with single operator dominance. In contrast to SU(5), apart from the ubiquitous 3rd family Yukawa unification from an operator $\mathbf{16}_F \cdot \mathbf{16}_F \cdot \mathbf{10}_H$, existing flavor models in SO(10) typically use linear combinations of operators at least for the masses of the second and first family (e.g. [26][27][28]). There are a number of new issues and circumstances arising in SO(10) compared to SU(5). For instance, all fermions are now in a single representation 16. As a consequence, the SO(10) symmetry relates not just entries in $Y_d$ (down sector) and $Y_e$ (charged lepton sector), but $Y_u$ (up sector) and $Y_\nu$ (neutrino sector) as well. This means each operator in principle provides 3 ratios between sectors instead of 1, making SO(10) symmetry much more predictive. Also, effective operators are constructed with the use of representations whose SM singlet components acquire GUT-scale vacuum expectation values (VEVs). In contrast to SU(5), where the representations 24 and 75 have only one SM singlet, the representations 45 and 210 of SO(10) have 2 and 3 SM singlets, respectively. Therefore the direction of the VEV in a multidimensional space of singlets becomes important. Furthermore, the Yukawa ratio predictions in SO(10) are affected by where the Minimal Supersymmetric SM (MSSM) Higgs doublets are located with respect to the doublet flavor eigenstates. For assessing the predictivity of fermion mass ratios coming from SO(10) GUT operators, all these aspects have to be taken into account.
The paper is organized as follows: in section 2 we consider the class of non-renormalizable superpotential operators where the 45 or 210 in SO(10) acquire GUT-scale VEVs, and compute their predictions. In section 3 we show that the location of MSSM Higgses in the doublet states crucially impacts the Yukawa results and provide some predictive scenarios for model building. In section 4, we then combine the previous results into a discussion on model building, and provide 3 example toy models with the 2nd and 3rd family effective operators. We then conclude. Additionally, a number of more technical considerations have been relegated to the appendices: appendix A contains the description of SO(10) conventions used in this paper, while appendix B analyzes the different constructions of non-renormalizable operators via mediator fields.
2 A class of effective Yukawa operators in SO(10)
First, we fix the notation used in this paper. The irreducible representations of groups will be typed in boldface, and we use the labels $G_{51}$, $G_{422}$ and $G_{321}$ for the groups SU(5) × U(1)$_X$, the Pati-Salam group SU(4)$_C$ × SU(2)$_L$ × SU(2)$_R$ and the Standard Model group SU(3)$_C$ × SU(2)$_L$ × U(1)$_Y$, respectively. The groups $G_{51}$ and $G_{422}$ are both maximal subgroups of SO(10), which contain the SM group $G_{321}$.
A very convenient property of the group SO(10) is that SM fermions of one family, alongside a right-handed neutrino, all fit into a single spinorial representation 16 of SO(10). We take this SM embedding and assume that the fermionic sector of the theory contains 3 copies of this spinorial representation, which we label by $\mathbf{16}_I \equiv \mathbf{16}_{F,I}$, where F denotes that the representation is "fermionic" and I is a family index with I = 1, 2, 3.
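For orientation, the embedding referred to here can be written out explicitly. The block below is the standard decomposition of the 16 under $G_{321}$, not a reproduction of the paper's own equations; the hypercharge normalization $Q_{\rm em} = T_3 + Y$ is an assumption chosen to match the doublets (1, 2, ±1/2) quoted in the text, and the field labels follow the set $\{Q, u^c, d^c, L, e^c, \nu^c\}$ used later for the charges q(p).

```latex
\begin{equation}
\mathbf{16} \;\to\;
\underbrace{(\mathbf{3},\mathbf{2},+\tfrac{1}{6})}_{Q} \oplus
\underbrace{(\overline{\mathbf{3}},\mathbf{1},-\tfrac{2}{3})}_{u^c} \oplus
\underbrace{(\overline{\mathbf{3}},\mathbf{1},+\tfrac{1}{3})}_{d^c} \oplus
\underbrace{(\mathbf{1},\mathbf{2},-\tfrac{1}{2})}_{L} \oplus
\underbrace{(\mathbf{1},\mathbf{1},+1)}_{e^c} \oplus
\underbrace{(\mathbf{1},\mathbf{1},0)}_{\nu^c},
\qquad 6+3+3+2+1+1 = 16 .
\end{equation}
```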
At the renormalizable level, two fermionic representations couple to a single Higgs representation H, where the MSSM Higgs doublet/antidoublet (1, 2, ±1/2) with an EW scale VEV reside (at least partly; see footnote 1). Suitable representations H are determined by considering the decomposition of the tensor product of two fermionic representations:
$$\mathbf{16} \otimes \mathbf{16} = \mathbf{10}_s \oplus \mathbf{120}_a \oplus \mathbf{126}_s,$$
where s and a denote whether the representation lives in the symmetric or antisymmetric term of the product, respectively. Each of the representations on the right-hand side contains weak doublets/antidoublets, so they can be used to obtain (MS)SM Yukawa terms. At the renormalizable level we thus have exactly 3 possible Yukawa terms:
$$\mathbf{16}_I \cdot \mathbf{16}_J \cdot \mathbf{10}, \qquad \mathbf{16}_I \cdot \mathbf{16}_J \cdot \overline{\mathbf{126}}, \qquad \mathbf{16}_I \cdot \mathbf{16}_J \cdot \mathbf{120}, \tag{2.5}$$
associated with 3 × 3 Yukawa matrices $Y^{10}_{IJ}$, $Y^{\overline{126}}_{IJ}$ and $Y^{120}_{IJ}$. The matrices $Y^{10}$ and $Y^{\overline{126}}$ are symmetric in the indices I and J, while $Y^{120}$ is antisymmetric. These statements are all well known, and numeric fits of these operators have been performed, see e.g. [5][6][7][29].
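The standard decomposition $\mathbf{16} \otimes \mathbf{16} = \mathbf{10}_s \oplus \mathbf{120}_a \oplus \mathbf{126}_s$ can be cross-checked by simple dimension counting. The following is a textbook consistency check rather than a step taken from the paper, and it only confirms that the dimensions of the symmetric and antisymmetric parts match:

```latex
\begin{align}
\dim(\mathbf{16}\otimes\mathbf{16}) &= 16^2 = 256 = 10 + 120 + 126, \\
\dim \mathrm{Sym}^2(\mathbf{16}) &= \tfrac{16 \cdot 17}{2} = 136 = 10 + 126, \\
\dim \Lambda^2(\mathbf{16}) &= \tfrac{16 \cdot 15}{2} = 120 .
\end{align}
```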
We now consider possible extensions of such a Yukawa sector, making use of non-renormalizable operators. Assuming no new content in the fermionic sector, i.e. the fermionic sector still consists of 3 copies $\mathbf{16}_I$, the Yukawa operators consist of two fermionic factors 16, a factor containing the SM Higgs, and possible further factors containing SM singlets acquiring GUT-scale VEVs. In order to obtain new prediction possibilities, the extra GUT-scale factors should not form an invariant by themselves, and should thus be contracted in the invariant with the fermionic and Higgs factors in a non-trivial way. Renormalizable operators from eq. (2.5) are thus most conveniently extended by factors of (self-conjugate) representations 45 and 210. It is exactly such a class of operators that we systematically consider in this paper. We limit ourselves to invariants where the contractions of the extra factors are performed in a "spinorial way", a detail that we discuss later.
Footnote 1: The MSSM Higgs doublet pair can be merely the lightest such pair in the theory. Since the flavor and mass eigenstates do not coincide, H may contain only part of the MSSM Higgs. This is discussed in detail in section 3.
The operators under consideration are schematically written in the following way:
$$\left(\prod_{i=1}^{n} \mathbf{45}_{\alpha_i} \cdot \prod_{j=1}^{m} \mathbf{210}_{\beta_j} \cdot \mathbf{16}_I\right) \cdot H \cdot \left(\prod_{k=1}^{n'} \mathbf{45}_{\alpha'_k} \cdot \prod_{l=1}^{m'} \mathbf{210}_{\beta'_l} \cdot \mathbf{16}_J\right), \tag{2.6}$$
where H stands for a 10, $\overline{\mathbf{126}}$, or 120, and the integers $n, n', m, m' \geq 0$. These integers denote the number of factors in the product. Note that each of the $n + n'$ factors of the representation 45 is equipped with an index $\alpha_i$ or $\alpha'_k$, since we are considering the possibility that we have multiple copies of this representation as part of the Higgs sector, i.e. they may contain different field degrees of freedom. Analogously, we label the possibly different $m + m'$ factors of the representation 210 with indices $\beta_j$ and $\beta'_l$. We use the more complicated labels, e.g. $\alpha_i$ instead of just i, for later convenience. For $n = n' = m = m' = 0$ the operator reduces to one of the 3 usual renormalizable operators in eq. (2.5), depending on H.
The class of invariants under consideration from eq. (2.6) can be most conveniently constructed by writing the representations 45, 210 and H in spinorial form as 32 × 32 matrices. We describe the detailed group theory conventions and procedures for this in appendix A. In short, the spinorial form of e.g. the representation 45 has the index structure $\mathbf{45}^A{}_B$, with both the upper index A and lower index B running from 1 to 32. The 210 and H have the same index structure of one upper and one lower index.
The representations 45 and 210 acquire GUT-scale VEVs; for obtaining Yukawa operators the only relevant states in them are the SM singlets. Crucially, these are found (e.g. by explicit computation) to be on the diagonal of their 32 × 32 matrix form, and thus the VEVs of 45 and 210 commute and their order in the invariant is not important. In contrast, their commutation with H is non-trivial, and thus it is important to distinguish on which side of H the 45 or 210 representations lie. If we imagine the invariants from eq. (2.6) to be generated from a renormalizable theory by integrating out heavy mediators of the type 16 and $\overline{\mathbf{16}}$, the situation can be summarized by stating that external legs of the diagram containing 45s and 210s commute with each other, but not with the external leg of the field H. This is the reason for distinguishing the GUT VEV representations acting on $\mathbf{16}_I$ and $\mathbf{16}_J$ in eq. (2.6) (non-primed and primed indices, respectively), while their internal order is not relevant. Considerations regarding mediators are investigated in detail in appendix B.
Given the discussion above, the invariants in (2.6) can be written explicitly with index contractions as in eq. (2.7), where all capital indices A, B, D, E, F run from 1 to 32 and $C_{AB}$ are the components of the "charge conjugation" operator, see appendix A. The products over i, j, k, l in parentheses are understood as ordinary matrix multiplication.
As already discussed, the GUT VEVs of such objects simply form a diagonal 32 × 32 matrix. The 10, 120 and $\overline{\mathbf{126}}$ (the H), on the other hand, acquire an EW scale VEV in their doublet/antidoublet components. Concerning the GUT-scale VEVs, it is important to note that each 45 contains 2 SM singlet states, while the 210 has 3 SM singlets. Their VEVs can thus have an arbitrary direction in the spaces of singlet fields, i.e. in the spaces $\mathbb{F}^2$ and $\mathbb{F}^3$, respectively. In a non-SUSY GUT $\mathbb{F} = \mathbb{R}$, while in SUSY GUTs $\mathbb{F} = \mathbb{C}$, since chiral supermultiplets contain complex fields and the real representations thus need to be complexified.
For specifying the singlet states, we make use of a basis adapted to the maximal subgroup $G_{51}$, in which each basis state lies in a single irreducible representation of this subgroup. These decompositions can be found in table 10 of appendix A. We label the VEVs of the basis singlet fields by $X_1$, $X_2$ for the 45 and $Z_1$, $Z_2$, $Z_3$ for the 210 (eq. (2.9)). The underlying singlet fields of the X and Z VEVs have well-defined transformation properties under $G_{51}$: they belong to the $G_{51}$ irreducible representation in the angled brackets, while their SO(10) origin is denoted in the index. Note that in a SUSY scenario the VEVs X and Z have in general complex values. Alternatively, we can use a basis of singlet states adapted to the irreducible representations of the Pati-Salam group $G_{422}$, which is the other regular maximal subgroup of SO(10) containing $G_{321}$. All associated decompositions of representations are given in table 11 of appendix A. We denote the VEVs in the Pati-Salam adapted basis by $\tilde{X}_1$, $\tilde{X}_2$ and $\tilde{Z}_1$, $\tilde{Z}_2$, $\tilde{Z}_3$ (eq. (2.10)). The relation between the $G_{51}$ and $G_{422}$ adapted bases is explicitly computed in eq. (2.11).

These VEV relations implicitly contain some relative phase conventions for the VEVs. We normalize the SU(5) and Pati-Salam aligned states so that their VEVs are orthonormal (eq. (2.12)), where the contracted indices are the complex (anti)fundamental indices of the representation 10, see appendix A for details.
In the H-representations 10, $\overline{\mathbf{126}}$ and 120, we have doublets (1, 2, 1/2) and (1, 2, −1/2) of the SM group $G_{321}$; we denote their neutral components by $H^u_x$ and $H^d_x$, respectively, where the index x specifies the exact state. There is one doublet-antidoublet pair in each of the representations 10 and $\overline{\mathbf{126}}$, but two pairs in 120. Furthermore, the representation $\overline{\mathbf{126}}$ also contains a SM singlet. We label these states under the embedding chain $G_{321} \subseteq G_{51} \subseteq$ SO(10): the doublets and antidoublets are listed in eqs. (2.13) and (2.14), and the singlet VEV in the $\overline{\mathbf{126}}$ is denoted by $\Delta$. All the above states $H^u_x$, $H^d_x$ and the VEV $\Delta$ are (canonically) normalized so as to be orthonormal, analogously to eq. (2.12).
With all definitions at hand, we proceed to the explicit results for the Yukawa terms generated by a non-renormalizable operator of the type specified in eq. (2.7). There are, broadly speaking, two lines of inquiry one can follow:
1. The acquired GUT-scale VEVs in the 45 and 210 are in discrete directions corresponding to singlets contained in a single irreducible representation of the maximal subgroups $G_{51}$ and $G_{422}$. These special discrete directions can be obtained for example due to the choice of invariants used in the (super)potential for the Higgs sector.
2. The acquired GUT-scale VEVs have an arbitrary (continuous) direction in the space of SM singlets for the representations 45 and 210.
We state the results for each possibility in a dedicated subsection. A mixed case with some factors having VEVs in discrete directions and some factors in arbitrary directions would in principle also be possible; we do not consider such a case here, but the explicit results can be inferred by combining the results of the discrete and arbitrary direction cases.

Discrete directions
We assume that each representation $\mathbf{45}_{\alpha_i}$ and $\mathbf{45}_{\alpha'_k}$ in eq. (2.6) has a discrete alignment of its VEV along one of the directions $X_1$, $X_2$, $\tilde{X}_1$ or $\tilde{X}_2$ defined in eq. (2.9) and (2.10). These directions have well-defined transformation properties in a maximal subgroup, i.e. their corresponding particle states lie in a single irreducible representation of a maximal subgroup of SO(10). Each $\mathbf{45}_{\alpha_i}$ and $\mathbf{45}_{\alpha'_k}$ factor therefore has one of 4 possible alignments, which we can specify in our notation e.g. by $\alpha_1 = X_1$ or $\alpha'_1 = \tilde{X}_2$.
We analogously assume that each $\mathbf{210}_{\beta_j}$ and $\mathbf{210}_{\beta'_l}$ has one of 6 possible discrete alignments ($Z$s and $\tilde{Z}$s) with well-defined transformation properties under a maximal subgroup.
This setup in the context of SO(10) is an extension of the analysis performed in [25], where only the factors 45 and H = 10 were considered. We shall not consider in this paper how to obtain such alignments with GUT Higgs potentials after GUT breaking. Constructing the Yukawa sector of a model with operators using discrete VEV alignments predicts (apart from the location of MSSM Higgs doublets, to be discussed later in this section and in section 3) ratios of Yukawa matrix entries in different sectors, and is thus the natural implementation of the predictive Clebsch approach from SU(5) flavor models, see e.g. [11,12,24].
As already stated, the 32 × 32 matrices $\mathbf{45}^A{}_B$ and $\mathbf{210}^A{}_B$ acquire the VEVs $X$, $Z$ and $\tilde{X}$, $\tilde{Z}$ on the diagonal. We label the coefficient alongside the VEV in the diagonal entries (the "charge") corresponding to the particle p by q(p), and normalize such that q(Q) = 1 for the quark doublet Q, or in the case when q(Q) = 0 we normalize by $q(u^c) = 1$. Charges for different particles p in the same SM irreducible representation need to be the same due to $G_{321}$ symmetry. Furthermore, the underlying space of states of the spinorial representation is reducible: $\mathbf{32} = \mathbf{16} \oplus \overline{\mathbf{16}}$. The particles in the $\overline{\mathbf{16}}$ have opposite charges to those in the 16, including the q charge discussed above, i.e. $q(\bar{p}) = -q(p)$. The above discussion implies that knowing the charges q of the various SM representations is sufficient to reconstruct the VEV matrices $\mathbf{45}^A{}_B$ and $\mathbf{210}^A{}_B$. We provide the charges and their normalizing factors N in tables 1 and 2. They were computed explicitly with methods from appendix A.
With discrete alignments of all $\alpha_i$, $\beta_j$, $\alpha'_k$, $\beta'_l$, we obtain the Yukawa terms of eq. (2.17) (written as superpotential operators) from an operator in eq. (2.7) for fixed family indices I and J. There, each $N_{\alpha_i}$ labels a normalization factor from table 1 given one of four possible alignments of the 45.

The predictions for the ratios of Yukawa coefficients between the different sectors thus read as given in eq. (2.18).
- The coefficients $\chi_H$ depend on the doublet-antidoublet mass matrix, which in turn involves Higgs sector parameters. Specifying any Higgs sector of the model, one can compute $\chi_H$ using the tools provided in section 3. In general, the $\chi_H$ will be functions of the Higgs sector parameters; in the previous point, we considered special setups yielding numeric values independent of any parameters.
- Alternatively, if one wishes to remain agnostic regarding the specifics of the Higgs sector, one can also take the factors $\chi_H$ to be free complex parameters. The only assumption necessary in this case is that the Higgs sector is sufficiently rich to provide the chosen numerical values for these factors. In such a scenario, the Yukawa ratios from a single operator are not predicted, but since the 3 family Yukawa sector will involve many such operators, a predictive model may still be obtained. Note that these factors would be fixed to the same value in all operators using the same H.
• Compared to SU(5) GUTs, where only the charged lepton and down sector are related, SO(10) relates Yukawa coefficients of all 4 sectors. Furthermore, a naive expectation in SO(10) is that with discrete SU(5)-compatible directions of the VEVs, at least the Clebsch ratios between the charged lepton and down sector from the SU(5) symmetric operators would be reproduced, e.g. from the classification in [11]. Curiously, this turns out not to be the case, with the ratios in principle different due to the presence of two Yukawa terms in each sector in eq. (2.17) or (2.18). This implies that each SO(10) operator actually combines two operators at the SU(5) level in a given Yukawa sector, thus modifying the single operator SU(5) predictions.
• At the renormalizable level, the product over i, j, k, l in eq. (2.17) has no factors, leading to a value of 1. The relative Clebsch factors between the sectors are then determined by the ratio $C^H_{e\nu}/C^H_{ud}$, which is 1 and −3 if H is 10 or $\overline{\mathbf{126}}$, corresponding in the SU(5) context to the b-τ unification and the Georgi-Jarlskog factor [10], respectively. As is well known, Yukawa matrices arising from renormalizable operators for H equal to 10 or $\overline{\mathbf{126}}$ are symmetric under the exchange of family indices I and J, and anti-symmetric for the 120. This result is reproduced in eq. (2.17) due to the coefficient $s_H$. In the general non-renormalizable case, where the product over $\alpha_i$, $\beta_j$ from the charges acting on $\mathbf{16}_I$ is not the same as the product over $\alpha'_k$, $\beta'_l$ of charges acting on $\mathbf{16}_J$, the overall symmetry or anti-symmetry in family indices is lost. In particular, non-renormalizable operators with H = 120 can thus also be used for generating diagonal Yukawa entries when using asymmetric products.
• The product of normalizations N and the product of VEV sizes X or Z can be absorbed into the overall operator coefficient $C/\Lambda^{n+n'+m+m'}$, thus having no impact on the SO(10) relations between different fermion sectors. The normalizing factors N become important, though, when arbitrary VEV directions are considered in the next subsection.
A subclass of operators with $m = m' = 0$ and H = 10 has been previously considered in the literature [25]. To facilitate the comparison of our notation for the 45 with that in [25], we provide a dictionary for translating our notation to theirs in eq. (2.20). The idea behind the alternative notation is that the 45 is the adjoint representation of SO(10), so one can make use of the gauge boson labeling also for the scalar (or chiral supermultiplet in SUSY) fields analyzed here. In this sense the scalar states correspond to the following gauge boson generators: X for the U(1)$_X$ charge in $G_{51}$, Y for the hypercharge (a SM generator), B−L for the difference of baryon and lepton number, which is gauged and is represented by the generator $T_{B-L}$ in SO(10), and $T_{3R}$ for the diagonal generator of the SU(2)$_R$ factor in the Pati-Salam group $G_{422}$. Since in this paper we also consider the VEVs in the representation 210, to which the logic of the notation from [25] cannot be extended, we prefer our notation due to greater visual clarity. The X and Z letters refer to 45 and 210, respectively, and the non-tilde and tilde refer to the $G_{51}$ and $G_{422}$ alignment, respectively.
We conclude this section on discrete VEV alignments by applying the general formulae we derived to a concrete example, which is of particular importance for flavor GUT model building. We are interested in the numerical predictions for the ratios of diagonal Yukawa entries between different Yukawa sectors from eq. (2.18), the analog of the SU(5) predictions from [11,12].
As discussed earlier, the predictions in SO(10) now relate all Yukawa sectors. As an illustration, we choose the simplest predictive case #1 from table 4, i.e. H = 10 (with $H^u_1 = v_u$ and $H^d_1 = v_d$) and all $\chi_H = 1$. We computed for this scenario the ratios of eq. (2.18) for all operators containing up to 5 fields, i.e. $n + n' + m + m' \leq 2$, and all possible combinations of discrete-VEV alignments. The list of Clebsch factor predictions is compiled in table 5, with the ratio values then shown graphically in figure 1 with the same case numbering convention. The table is meant to be used to identify the particular operator giving the prediction, while the figure is very convenient for searching through the possible numerical values for the ratios. The labeling order of the various cases is that of increasing values of the three ratios of diagonal II entries (in that order of importance). As can be seen from the table, different cases of operators can predict the same triple of Yukawa ratios, so they are listed under the same number Nr.
Finally, we remind the reader that the ratios $(Y_e)_{II}/(Y_d)_{II}$ in table 5 are not merely a reproduction of the SU(5) Clebsch factors, since one SO(10) operator actually gives a sum of two SU(5) operators in each Yukawa sector, as was discussed. As an example, the most common SU(5) Clebsch ratio −3/2 (obtained when the 24 of SU(5) acquires a VEV) is not among the SO(10) ratios in the table. The case that is naively expected to yield the −3/2 ratio involves one 45 factor aligned in the $X_2$ direction; it instead yields the $(Y_e)_{II}/(Y_d)_{II}$ ratio 1, corresponding to the subcase ($X_2$|.) (cf. table caption for label definition) of Nr 19 in table 5.
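The difference between the SU(5) factor −3/2 and the SO(10) ratio 1 can be illustrated with a few lines of arithmetic. The sketch below is not the paper's formula: it assumes that the $X_2$ direction is proportional to hypercharge (with the convention $Q_{\rm em} = T_3 + Y$), that charges are rescaled so that q(Q) = 1, and that for H = 10 with a single 45 acting on $\mathbf{16}_I$ the diagonal entry in a sector is proportional to the sum of the charges of the two fermions in the bilinear (the "two terms per sector" mentioned above). All variable names are illustrative.

```python
from fractions import Fraction as F

# Hypercharges of the left-chiral fields in one 16 (assumed convention Q_em = T3 + Y).
# Assumption: the X_2 direction of the 45 is proportional to hypercharge.
HYPERCHARGE = {
    "Q": F(1, 6), "u^c": F(-2, 3), "d^c": F(1, 3),
    "L": F(-1, 2), "e^c": F(1), "nu^c": F(0),
}

def normalized_charges(raw):
    """Rescale so that q(Q) = 1, or q(u^c) = 1 when q(Q) vanishes."""
    ref = raw["Q"] if raw["Q"] != 0 else raw["u^c"]
    return {p: c / ref for p, c in raw.items()}

q = normalized_charges(HYPERCHARGE)

# Assumed SO(10) structure for a diagonal entry (I = J): the two terms in a
# sector add the charges of the two fermions appearing in the bilinear.
diag_d = q["Q"] + q["d^c"]   # down sector: Q d^c
diag_e = q["L"] + q["e^c"]   # charged lepton sector: L e^c
ratio_e_over_d = diag_e / diag_d

# Single-term (SU(5)-like) ratios, where the VEV acts on only one field type.
su5_L_over_dc = q["L"] / q["d^c"]   # VEV acting on the 5bar fields
su5_ec_over_Q = q["e^c"] / q["Q"]   # VEV acting on the 10 fields

print("q =", {p: str(c) for p, c in q.items()})
print("sum-of-two-terms ratio e/d:", ratio_e_over_d)
print("single-term ratios:", su5_L_over_dc, su5_ec_over_Q)
```

Under these assumptions the single-term ratios are −3/2 and 6, the familiar SU(5) values, while the sum of the two terms gives 3/3 = 1, in line with the statement above for the subcase ($X_2$|.).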

Arbitrary directions
We now analyze the Yukawa couplings generated by operators in eq. (2.7) without assuming that the VEVs of 45 and 210 have discrete directions; instead, they can take an arbitrary direction. Since the two representations have 2 and 3 SM singlets, respectively, the VEVs point in an arbitrary direction in $\mathbb{C}^2$ and $\mathbb{C}^3$ (in the SUSY context).
In this type of model building approach, we assume for simplicity to only have a single copy of a 45 and 210: the VEV direction in the 45 is the same for all α i and α k , and analogously the VEV directions in the 210s are the same for all β j and β l . This setup of (at most) one copy of each representation provides maximal predictivity of such models, since it introduces a minimal set of new continuous parameters related to the arbitrary directions.
Alongside the continuous parameters describing the VEV direction and size, the model is specified by the powers $n, n'$ for the 45 and $m, m'$ for the 210 for each operator added to the Yukawa sector. The arbitrary direction VEV approach does not have a good analog in SU(5) flavor GUTs, since no low-dimensional irreducible representations of SU(5) contain more than one SM singlet. This approach is therefore relevant only for bigger GUT groups such as SO(10) and $E_6$.
Since the overall size of the VEVs can be absorbed into the coefficient in front of an operator, it is convenient to define the following ratios of GUT-scale VEVs:
$$\kappa := \frac{X_2}{X_1}, \qquad \kappa_1 := \frac{Z_2}{Z_1}, \qquad \kappa_2 := \frac{Z_3}{Z_1}. \tag{2.21}$$
We do not lose any generality by considering the ratios, since the result for $X_1 = 0$ can be recovered by taking the limit $\kappa \to \infty$ given the relation $X_1 = X_2/\kappa$. An analogous treatment also applies to the case $Z_1 = 0$ in $\kappa_1$ and $\kappa_2$. Note that the κ-ratios are complex numbers in the SUSY context, and real numbers in the non-SUSY context. The discrete alignments from section 2.1 can be recovered by taking special values for the κ ratios. One can easily reproduce them by using the definitions in eqs. (2.9), (2.10), (2.21) and the connection of SU(5) and Pati-Salam alignments from eq. (2.11). For the sake of completeness and to identify special values of κ at a glance, we provide them in table 6.
Using the freedom of the overall phase of 45 and 210, the values $X_1$ and $Z_1$ can be chosen real without loss of generality; the ratios $\kappa$, $\kappa_1$ and $\kappa_2$, however, always remain complex in the SUSY context.
Table 6. The special values of ratios $\kappa$, $\kappa_1$ and $\kappa_2$ corresponding to discrete VEV alignments in the representations 45 and 210.
Using the ratios $\kappa$, $\kappa_1$ and $\kappa_2$, we define for a fermion of type p the order 1 polynomials $P_p(\kappa)$ and $R_p(\kappa_1, \kappa_2)$ in eqs. (2.22) and (2.23), which represent the combined charge for the particle p given the arbitrary direction of the VEV. In the definition of $P_p$, the expression $N_{X_i}$ represents the normalization factor N under the $X_i$ alignment, while $q_{X_i}(p)$ is the charge under the $X_i$ alignment of the particle $p \in \{Q, u^c, d^c, L, e^c, \nu^c\}$ from tables 1 and 2. The index i goes over all possible VEVs in the representation, i.e. over all SU(5) compatible alignments. Analogous definitions hold for the quantities in the definition of $R_p$. The operator from eq. (2.7) with arbitrary alignments in a single copy of 45 and 210 generates the Yukawa terms of eq. (2.24) in the superpotential. The H-dependent quantities in that expression can again be found in table 3, while the polynomials $P(\kappa)$ and $R(\kappa_1, \kappa_2)$ are those defined in eq. (2.22) and (2.23). The coefficient $C/\Lambda^{n+n'+m+m'}$ is again the overall coupling in front of the operator, and two terms contribute to each Yukawa sector.
A final remark on the result: note that the $X_2$, $Z_2$ and $Z_3$ charges for right-handed neutrinos are zero, as can be seen in tables 1 and 2, making the last term in eq. (2.24) especially simple and independent of $\kappa$, $\kappa_1$ and $\kappa_2$ (eq. (2.26)).
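The role of κ as a direction parameter can be made concrete with a small numerical sketch for the 45. It rests on several assumptions that are not taken from the paper: the combined charge is modeled as the linear form $n_1\, q_{X_1}(p) + \kappa\, n_2\, q_{X_2}(p)$, the normalizations $n_1$, $n_2$ are hypothetical placeholders set to 1 (the paper's factors N from table 1 are not reproduced in this text), and the charges are the standard U(1)$_X$ and hypercharge assignments rescaled to q(Q) = 1. The special κ values found below therefore hold only in this placeholder normalization and need not coincide with those of table 6.

```python
from fractions import Fraction as F

FIELDS = ["Q", "u^c", "d^c", "L", "e^c", "nu^c"]

# Assumed charges, rescaled so that q(Q) = 1:
# X_1 direction ~ U(1)_X charge, X_2 direction ~ hypercharge.
Q_X1 = dict(zip(FIELDS, [F(1), F(1), F(-3), F(-3), F(1), F(5)]))
Q_X2 = dict(zip(FIELDS, [F(1), F(-4), F(2), F(-3), F(6), F(0)]))

def combined_charge(p, kappa, n1=F(1), n2=F(1)):
    """Linear polynomial in kappa; n1 and n2 are placeholder normalizations."""
    return n1 * Q_X1[p] + kappa * n2 * Q_X2[p]

def direction(kappa):
    """Charges of all fields, rescaled so that q(Q) = 1 (or q(u^c) = 1 if q(Q) = 0)."""
    raw = {p: combined_charge(p, kappa) for p in FIELDS}
    ref = raw["Q"] if raw["Q"] != 0 else raw["u^c"]
    return {p: c / ref for p, c in raw.items()}

# In this placeholder normalization, two special values of kappa reproduce
# the Pati-Salam-type directions B-L and T_3R.
b_minus_l = direction(F(2, 3))
t3r = direction(F(-1))

# The right-handed neutrino has vanishing X_2 charge, so its combined charge
# does not depend on kappa.
nu_values = {combined_charge("nu^c", k) for k in (F(0), F(2, 3), F(-1), F(7))}

print("B-L like:", {p: str(c) for p, c in b_minus_l.items()})
print("T3R like:", {p: str(c) for p, c in t3r.items()})
print("nu^c combined charge for several kappa:", nu_values)
```

With these inputs, κ = 2/3 gives charges proportional to B−L and κ = −1 gives charges proportional to $T_{3R}$, while the $\nu^c$ entry stays at the same value for every κ, which is the κ-independence noted above.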

General considerations
We now turn to the issue of the location of the MSSM Higgs doublets, which crucially influences the Yukawa sector predictions, as we saw in section 2. The MSSM Higgs doublets $H_u$ and $H_d$ are the only light mass eigenstates of the full doublet mass matrix of the theory above the GUT scale. All the other doublet mass eigenstates should be heavy, i.e. at or near the GUT scale, since they should be integrated out at the GUT scale to obtain the MSSM as the effective description.
As is well known, the doublet mass matrix in GUT models is linked to the mass matrix of the triplets (3, 1, +1/3) and antitriplets (3, 1, −1/3), since the (weak) doublets come together with (color) triplets in GUT representations. Crucially, these (anti)triplets mediate D = 5 proton decay in SUSY GUTs and must therefore be heavy not to violate proton decay bounds. We thus have a situation of two mass matrices linked by GUT symmetry: the (anti)doublet mass matrix with one pair of light states and all others heavy, and the triplet mass matrix with all states heavy. The issue of having only one light pair of doublet states with all other doublets and triplets heavy is known as doublet-triplet (DT) splitting, see e.g. [30,31] for a brief overview. The location of the MSSM Higgses is thus related to the question of DT splitting.
At the SU(5) level, the representations containing both a doublet and a triplet are 5 and 45, while the representation 50 contains a triplet only, but no doublet. Analogous statements hold for the conjugates of these representations. At the SO(10) level, the location and number of doublets and triplets can be looked up in table 10.
Suppose we have a concrete model of the Higgs sector, whose superpotential $W$ is specified. We fix our notation by defining the doublet-antidoublet mass matrix $M_D$ and the triplet-antitriplet mass matrix $M_T$ via

$$W \supset \sum_{k,l} D_k\, (M_D)_{kl}\, \bar{D}_l, \tag{3.1}$$

$$W \supset \sum_{k,l} T_k\, (M_T)_{kl}\, \bar{T}_l, \tag{3.2}$$

where $D_k$, $\bar{D}_l$, $T_k$ and $\bar{T}_l$ denote the doublet, antidoublet, triplet and antitriplet states, respectively, and the indices $k$ and $l$ go over the available states of each type in the Higgs sector. The scalar mass matrices for doublets and antidoublets are then $M_D^* M_D^T$ and $M_D^\dagger M_D$, respectively. A necessary condition for DT splitting is that $\det M_D \approx 0$ (the MSSM pair of doublets is almost massless compared to the GUT scale) and $\det M_T \neq 0$. Since the mass eigenmodes of $D$ and $\bar{D}$ are the left and right singular modes of $M_D$, respectively, the massless pair of left and right singular modes corresponds to $H_u$ and $H_d$ of the MSSM, respectively. They can be written as linear combinations of (anti)doublets:

$$H_u = \sum_k a_k\, D_k, \tag{3.3}$$

$$H_d = \sum_l b_l\, \bar{D}_l, \tag{3.4}$$

where $a_k$ and $b_l$ are complex coefficients with $\sum_k |a_k|^2 = \sum_l |b_l|^2 = 1$ due to the unitarity of the left and right transition matrices in the singular value decomposition of $M_D$. There is a remaining overall phase ambiguity for the $a_k$, and similarly for the $b_l$, associated with the phases of the states $H_u$ and $H_d$. The phase ambiguity is not relevant, since the phases are fixed in the MSSM so that the EW-breaking VEVs of $H_u$ and $H_d$ are real.
In the mass basis, only $H_u$ and $H_d$ can obtain an EW-breaking VEV, since they are the only light states. In the flavor basis $D_k$ and $\bar{D}_l$, adapted to GUT representations, the unitarity of the transition matrices from the flavor to the mass eigenbasis then enforces the expansion of eq. (3.5), in which the VEV of each flavor-basis (anti)doublet is fixed by its coefficient $a_k$ ($b_l$) and the VEV of $H_u$ ($H_d$). The usual MSSM definitions apply: $v_u = v \sin\beta$ and $v_d = v \cos\beta$, with $v = 174$ GeV and $\tan\beta = v_u / v_d$.

The crucial point for model building is that in the presence of multiple doublets and antidoublets, the flavor eigenstates $D_k$ and $\bar{D}_l$, which include the $H^{u,d}_i$ coupled to SM fermions, depend on the coefficients $a_k$ and $b_l$ through eq. (3.5). These coefficients in turn depend on the superpotential parameters entering $M_D$. Consequently, there is in principle extra freedom beyond the SO(10) constraints in the Yukawa entries due to the $a_i$ and $b_i$ parameters, depending on the particularities of the Higgs sector.
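The MSSM VEV definitions quoted above can be made concrete with a short numerical sketch. It assumes only the standard relations $v_u = v \sin\beta$, $v_d = v \cos\beta$ with $v = 174$ GeV, and the fact that the modulus of each flavor-basis doublet VEV is $|a_k|\, v_u$, which holds independently of phase conventions. The function names and the sample coefficients are hypothetical.

```python
import math

V_EW = 174.0  # GeV, electroweak VEV as normalized in the text


def higgs_vevs(tan_beta, v=V_EW):
    """Return (v_u, v_d) for the MSSM doublets H_u and H_d."""
    beta = math.atan(tan_beta)
    return v * math.sin(beta), v * math.cos(beta)


def doublet_vev_moduli(coefficients, v_light):
    """Moduli of the flavor-basis VEVs, |a_k| * v_u (or |b_l| * v_d)."""
    return [abs(c) * v_light for c in coefficients]


v_u, v_d = higgs_vevs(tan_beta=50.0)

# Placeholder unit-norm coefficients a_k for two doublets (illustrative only).
a_sample = [0.6, 0.8j]
vev_moduli = doublet_vev_moduli(a_sample, v_u)
```

Because the coefficients are normalized to unity, the squared moduli of the flavor-basis VEVs add up to $v_u^2$, so no VEV is lost or double counted when passing between the two bases.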

Tools for computing DT splitting
We have seen that the details of DT splitting are crucial for the Yukawa predictions due to the coefficients a k and b l , but are also model specific.
One way to remain agnostic about the Higgs sector, as was discussed in section 2.1, is to simply take a k and b l as free complex parameters. The only assumption then required is that the freedom in the Higgs sector parameters is sufficient for the freedom in the a k and b l parameters, i.e. that the unknown Higgs sector is sufficiently rich.
On the other hand, it is useful to the model builder to have the necessary tools for computing the $a_k$ and $b_l$ coefficients in a particular model of the Higgs sector. We provide in this subsection all the necessary information for the reader to reconstruct the matrices $M_D$ and $M_T$ for the model of their choosing, provided that the Higgs sector of the superpotential consists of renormalizable operators with SO(10) representations up to dimension 210.

We write below all renormalizable superpotential terms of the SO(10) Higgs sector, where only one copy of each type of representation up to dimension 210 (excluding 144, $\overline{144}$ and $210'$) is considered (the case of multiple copies is a straightforward extension, as discussed later). We include only the terms relevant for the doublet/triplet mass matrices, while the terms relevant for the breaking of GUT symmetry, which determine the SM singlet VEVs, are omitted; the resulting expression is eq. (3.8). The conventions for the representations and their indices in tensor notation are specified in appendix A. The fundamental indices have been lowered by $P_{ij}$ from eq. (A.6), while spinor indices have been lowered by $C_{AB}$ from eq. (A.17). Also, the spinor forms of the representations, such as 10 A B , have been defined in eq. (A.24)-(A.29). One can confirm that in each term all indices are indeed contracted.
The normalizations of the states and VEVs in these representations are canonical, in the sense that the quadratic invariant formed with the complex conjugate gives orthonormal normalization. For example, if the states in the representation 126 are labeled by $X_K$, then the quadratic invariant takes the form $\sum_K X_K^* X_K$, implying that the canonically normalized kinetic terms are in the standard form (with a prefactor 1).

The labeling of the coefficients $\lambda$ in front of the operators follows this scheme: each $\lambda$ has 3 numbers of increasing size in the index, which correspond to the three representations forming the invariant. Each number corresponds to the representation with that many fundamental indices, including a possible bar on top of the label if the representation has a bar; the exceptions are the 54, which also adds a prime to the symbol $\lambda$, and the representations 16 and $\overline{16}$, corresponding to labels 6 and $\bar{6}$, respectively. Altogether this provides an efficient labeling scheme for the coefficients. The numeric factors in front are part of our convention for later convenience; in particular, they simplify the writing of the doublet/triplet mass matrices. Eq. (3.8) should thus also be viewed as defining our notation for the $m$ and $\lambda$ parameters in front of the operators.
We now specify a basis for all the doublets/antidoublets present in the representations of eq. (3.8), given in eq. (3.11), in an obvious notation based on which SO(10) representation and SU(5) subrepresentation they are located in. The explicit relation of such a basis to the more specific $H^{u,d}_i$ notation from eq. (2.13) and (2.14) relevant for table 3 is given in eq. (3.12). Similarly, we use the triplet/antitriplet basis of eq. (3.13), where an extra triplet-antitriplet pair from the 126 and $\overline{126}$, contained in their 50-dimensional SU(5) subrepresentations (50 and $\overline{50}$), is now present. Given these bases, $D_k$ and $\bar{D}_l$ consist of 7 states each, while $T_k$ and $\bar{T}_l$ consist of 8 states, so that $M_D$ and $M_T$ from eq. (3.1) and (3.2), respectively, are $7 \times 7$ and $8 \times 8$ matrices. They can be compactly written as a single $8 \times 8$ matrix $M_{D|T}$, given in eq. (3.14). The way to interpret the compactly written doublet/triplet mass matrix $M_{D|T}$ in eq. (3.14) is the following:
• Doublets: obtain $M_D$ by crossing out the last row and column of $M_{D|T}$ and taking $\eta_i = 1$.
• Triplets: obtain $M_T$ from $M_{D|T}$ by taking the values of the Clebsch coefficients $\eta_i$ given in eq. (3.15).

We have labeled the SM singlet VEVs in eq. (3.14) by their SU(5) origin with letters $V$, $W$ and $Z$ for the representations 1, 24 and 75, respectively. Their SO(10) origin is indicated in the index. The Clebsch coefficients $\eta$ are present only in contributions from SU(5) non-singlets, i.e. the $W$ and $Z$ VEVs. Note that this notation for the VEVs in the 45 and 210 differs from the one used in section 2; such representations in the Higgs sector relevant for DT splitting can be unrelated to those in the Yukawa sector operators.
In addition to the operators in eq. (3.8), there are 7 more renormalizable operators that are (completely) anti-symmetric in the representations of the same dimension, such that multiple independent copies of a representation are required for the invariants to be non-zero; they are listed in eq. (3.16). The indices $\alpha$, $\beta$ and $\gamma$ label the copy of a representation. By explicit computation, these anti-symmetric operators yield the doublet and triplet terms of eq. (3.17)-(3.23).

In the expressions above, the ellipsis symbol signifies the addition of terms with permuted indices αβ or αβγ, such that the expressions become completely anti-symmetric in these indices. Furthermore, the factors with Clebsch coefficients η i have different numerical values depending on whether they are multiplying a DD or T T term: η i = 1 for doublets, while the values for triplets are given in eq. (3.15). The notation for some doublets, triplets and VEVs now also carries a multiplicity index in a straightforward extension of the notation from eq. (3.11) and (3.13).
This completes all the data needed to compute $M_D$ and $M_T$ with any renormalizable potential. If multiple copies of a representation are used, one can distinguish between two cases: the invariant is either completely symmetric or completely anti-symmetric in the copies. The data for the anti-symmetric cases are given in eq. (3.17)-(3.23). The symmetric case, on the other hand, can be reconstructed from the data in eq. (3.14), noting that the invariants in eq. (3.8) turn out to be the invariants completely symmetric in the representation factors of the same type.

Discussion and the most predictive cases
Given the model building tools for the Higgs sector and DT splitting developed in section 3.2, we now reiterate how to make use of these tools step by step. The procedure to build a concrete Higgs sector and determine its impact on the Yukawa sector in the MSSM effective theory below the GUT scale consists of the following steps:
4. Make sure that DT splitting is achieved by confirming that $\det M_D = 0$ and $\det M_T \neq 0$. We will return to the issue of how to achieve this later.
5. Compute the normalized left and right null modes for M D ; when the basis for rows and columns consists of D k and D l , respectively, the obtained left and right null vectors correspond to the coefficients a k and b l , respectively. This determines the Yukawa sector operators in the MSSM effective theory via eq. (3.5).
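Steps 4 and 5 can be sketched numerically as follows. This is a minimal pure-Python illustration, assuming a singular $M_D$ with exactly one null mode; the helper names `null_vector` and `higgs_coefficients` and the toy matrix are hypothetical and not part of any model in the text. Whether $a_k$ is identified with the null vector or its complex conjugate is a matter of convention, and the overall phase is arbitrary.

```python
import math


def null_vector(matrix, tol=1e-9):
    """Unit-norm w with matrix @ w = 0, assuming exactly one null mode.

    Uses Gauss-Jordan elimination with an absolute tolerance; rescale the
    matrix to O(1) entries first if its entries are of GUT-scale size.
    """
    rows = [[complex(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        best = max(range(r, n_rows), key=lambda i: abs(rows[i][c]))
        if abs(rows[best][c]) < tol:
            continue  # no pivot here: column c is a free direction
        rows[r], rows[best] = rows[best], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(n_rows):
            if i != r:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if len(free_cols) != 1:
        raise ValueError("expected exactly one null mode")
    free = free_cols[0]
    w = [0j] * n_cols
    w[free] = 1 + 0j
    for row_index, c in enumerate(pivot_cols):
        w[c] = -rows[row_index][free]
    norm = math.sqrt(sum(abs(x) ** 2 for x in w))
    return [x / norm for x in w]


def higgs_coefficients(M_D):
    """Left and right null modes of M_D (rows: D_k, columns: Dbar_l)."""
    transposed = [list(col) for col in zip(*M_D)]
    a = null_vector(transposed)  # left null mode: a^T M_D = 0
    b = null_vector(M_D)         # right null mode: M_D b = 0
    return a, b


# Toy symmetric 2x2 matrix with vanishing determinant (illustrative numbers).
M_toy = [[1.0, 2.0], [2.0, 4.0]]
a, b = higgs_coefficients(M_toy)
```

For this symmetric toy matrix the left and right null modes coincide, which illustrates the relation $a_k = b_k$ expected for a symmetric $M_D$.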
We turn now to the issue of achieving DT splitting in step 4. One possibility is to make use of one of the mechanisms for DT splitting in the literature, e.g. the missing partner mechanism [31-33] or the Dimopoulos-Wilczek mechanism in SO(10) [34,35]. Such mechanisms require a specific set-up in the Higgs sector, i.e. very special choices in steps 1-3. We shall not consider these possibilities further in this paper. Another possibility is DT splitting by fine-tuning. While this option can typically be employed in any model, it is considered a less elegant solution to the DT splitting problem. One simply computes $\det M_D$ and imposes that the resulting expression vanishes, implying a strict relation between the independent parameters of the Higgs sector, i.e. it requires fine-tuning one of them. If $\det M_T \neq 0$ after imposing the fine-tuning relation, DT splitting has been successfully achieved.
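The fine-tuning procedure can be illustrated with a deliberately small toy example. The sketch below assumes a hypothetical $2 \times 2$ doublet matrix with entries $m_1$, $m_2$ and an off-diagonal term $c$, and a triplet matrix that differs only by a Clebsch factor `eta` on the off-diagonal term. The parameter names and the value of `eta` are illustrative assumptions and are not taken from eq. (3.14) or (3.15).

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def fine_tuned_matrices(m1, c, eta):
    """Toy doublet/triplet matrices after tuning m2 such that det M_D = 0.

    M_D = [[m1, c], [c, m2]] and M_T = [[m1, eta*c], [eta*c, m2]], where
    eta is a hypothetical Clebsch factor that distinguishes triplets.
    """
    m2 = c ** 2 / m1  # solves m1 * m2 - c**2 = 0 for m2
    M_D = [[m1, c], [c, m2]]
    M_T = [[m1, eta * c], [eta * c, m2]]
    return M_D, M_T


M_D, M_T = fine_tuned_matrices(m1=2.0, c=3.0, eta=-1.5)
det_D = det2(M_D)  # vanishes by construction
det_T = det2(M_T)  # equals c**2 * (1 - eta**2), non-zero unless eta**2 == 1
```

In this toy setting the triplets stay heavy precisely because the Clebsch factor differs from unity in modulus, which mirrors the requirement that $M_D$ and $M_T$ must not coincide.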
We now discuss how DT splitting in general impacts the predictivity of the Yukawa operators in eq. (2.7) for the different choices of H. Using the tools of section 3.2, the discussion culminates in a list of some simple predictive scenarios in table 7. We make the following observations:
• The coefficients $a_k$ and $b_l$ from step 5 are functions of the parameters present in $M_D$. An important goal is to find predictive scenarios, where the parameter values in the Higgs sector do not modify the Yukawa predictions. We approach this goal by searching for scenarios where the mass matrix $M_D$ is small and contains as few parameters as possible. This implies choosing the smallest possible number of operators and representations for the Higgs sector of the model. One parameter is eliminated by fine-tuning.
• To achieve DT splitting, we need at least one SO(10) representation with a SM singlet VEV which is not an SU(5) singlet VEV, so that $M_D \neq M_T$ due to the $\eta_i$ coefficients. This conclusion holds true even when using 126 and $\overline{126}$ (when the model contains more triplets than doublets), since the SU(5) singlet VEVs do not contribute to any mixing of the extra triplet with other triplets, i.e. there are no off-diagonal terms in the last row and column involving the $V$ VEVs in eq. (3.14).
• We find that when M D is symmetric, the left null mode will be equal to the right null mode, thus implying a k = b k for all k. These relations already reduce the freedom in the coefficients by half, e.g. only a k remain to be determined. It is easy to obtain a symmetric M D by using only real representations, i.e. in the scenarios H = 10 or H = 120.
• A straightforward possibility to obtain a fully predictive Yukawa sector is for a k and b l to all be fixed numbers independent of the parameter values in the Higgs sector. Such scenarios are possible, e.g. see cases #1-3 in table 7. From the point of view of yielding a predictive MSSM Yukawa sector, however, it is sufficient that the ratios of a k and b l within each irreducible SO(10) representation H are fixed numbers. In such a case, the parameter-dependent common factor in all a k and b l in a given H can simply be absorbed into the Yukawa coupling coefficient in front of the operator. Examples include cases #4-6 in table 7, further commented on below.
• The Higgs representation H = 10 has only one $D\bar{D}$ pair. A symmetric $M_D$ is sufficient to determine their VEV ratio and leads to a predictive Yukawa sector. This is implemented in case #1 of table 7.
• The Higgs representation H = 120 has two $D\bar{D}$ pairs. A symmetric $M_D$ and a fixed ratio of the two relevant $a_i$ are thus sufficient to determine all their VEV ratios and give predictive Yukawa operators involving H = 120. A model containing only the doublets from 120 is implemented in cases #2 and #3 of table 7.
• The implementation of a predictive case when H = $\overline{126}$ is a bit more tricky, since it is a complex representation. It contains one $D\bar{D}$ pair, but $D$ and $\bar{D}$ are part of different SU (

General considerations
In section 2 we obtained explicit results for a single operator of the type given in eq. (2.7), and in section 3 we analyzed the location of the MSSM Higgs fields. We now provide a summary of how to construct an entire model with these operators, based on single operator dominance in each Yukawa entry. We include some additional model building considerations, for which we provide justification and technical details in appendix B.
We provide below a step-by-step guide for the construction of a complete model. We divide the discussion into two parts: the choices that determine the model, and the subsequent consistency check of their validity:
1. Choices for model building in the Yukawa sector:
• Build the entire Yukawa sector (see footnote 3) by specifying which operators from eq.

Footnote 3: The Majorana neutrino masses need to be constructed separately.

-If all 45 refer to the same copy, and the same holds for the 210, but both representations have an arbitrary GUT VEV direction, then the result in eq. (2.24) applies. The definitions of the polynomials P and R can be found in eq. (2.22) and (2.23).
-If one mixes multiple discrete and continuous directions, eq. (2.24) can be extended by replacing the powers of polynomials P and R with a product, where each polynomial has its own direction, i.e. its own value of the κ variable(s). Discrete directions in any given polynomial can be obtained by applying the values of the κ variables from table 6.
In a 3 family model, the Yukawa couplings are $3 \times 3$ matrices. Operators connecting $16_I$ and $16_J$ for $I \neq J$ generate both the I-J and J-I off-diagonal Yukawa entries. Single operator dominance in every Yukawa entry then puts an upper limit of at most 6 such operators (3 diagonal and 3 off-diagonal). In SO(10), that specifies the Yukawa couplings in all fermion sectors, including the Dirac type Yukawa couplings for neutrinos. More operators can of course be added if one relaxes the single operator dominance assumption.
• Another consideration is how to naturally achieve in each Yukawa entry only the presence of the desired operator(s), and forbid all others. These considerations can be made at two levels:
- External legs: It is possible to forbid Yukawa operators constructed from undesired combinations of fields by imposing global symmetries (which can be e.g. discrete). This involves suitable charge assignments for the representations $16_I$, $45_{\alpha_i}$, $210_{\beta_i}$ and H (specify the charges $q_I$, $x_i$, $y_j$ and $h$ from appendix B), which allow the construction only of those Yukawa operators whose field charges (i.e. external legs) add up to a net zero charge.

-Internal contractions: Global symmetries impose constraints only on the external legs. Given the same representations in an invariant, there may still be multiple possible ways to internally contract the indices. One possible approach for allowing only a subset of operators with the same external legs is by use of mediators. In such an approach, one assumes the non-renormalizable SO(10) operators arise from integrating out heavy mediator fields (from a possibly renormalizable theory). Imposing the global symmetry onto the UV theory requires charge assignments for the mediators, thus allowing or forbidding certain types of contractions.
The invariants formed from eq. We emphasize that the mediator approach is used to further restrict the Yukawa sector, i.e. forbid certain contractions, so that the result at the MSSM level is single operator dominance. Whether the mediators postulated for the purpose of one Yukawa entry do not interfere with operators in other entries (and allow contractions, which we would like to forbid), is something that needs to be checked in a given model. The problem is thus reduced to finding charge assignments for external particles, and specific orders of GUT VEVs in the operators, such that only a unique invariant contraction is possible in any given Yukawa entry; the interested reader should again consult appendix B. This is a complication that needs to be considered only when multiple contractions with the given external legs are possible.

We now briefly turn to a discussion of which model building choices yield the most predictive Yukawa sector, i.e. one with as few free parameters as possible. It is clearly preferable to choose a predictive DT splitting scenario, e.g. one of the choices from table 7. Furthermore, the number of continuous parameters is minimized by using either only discrete alignments of GUT VEVs or as few continuous alignments as possible, essentially restricting ourselves to the two cases considered in sections 2.1 and 2.2.

Toy models
For illustrative purposes, we consider 3 example models which exhibit the various aspects of model building that were just discussed. For all examples we consider a simplified setup with only two operators contributing to the Yukawa sector: $O_{IJ}$ for $I = J = 2$ and $I = J = 3$, where $I$ and $J$ are family indices. We thus consider only the 2nd and 3rd family, and neglect the mixing between them, a good starting point given the hierarchical structure of the quark and charged lepton masses and mixings at low energies. The Yukawa part of the superpotential of our toy models is thus

$$W_{\text{Yuk}} = \lambda_2\, O_{22} + \lambda_3\, O_{33}, \tag{4.1}$$

which sets a diagonal form for the Yukawa matrices of the 2nd and 3rd family at the GUT scale, eq. (4.2). For the 3rd family, the top Yukawa coupling $y_t$ should be $O(1)$, which is simplest to achieve by taking $O_{33}$ to be a renormalizable operator not suppressed by powers of $X/\Lambda$, where $\Lambda$ is the cutoff scale for the effective SO(10) theory we consider and $X$ a GUT-scale VEV. As a common feature for all toy models, we thus choose H = 10 and $m_1 = m_2 = n_1 = n_2 = 0$ in eq. (2.6), i.e.

$$O_{33} = 16_3 \cdot 16_3 \cdot 10. \tag{4.3}$$
The models will thus differ only in the choice of the $O_{22}$ operator (the representations and VEV directions), as well as the choice of the Higgs location. Since the coefficients $\lambda_2$ and $\lambda_3$ in eq. (4.1) can be adjusted to set the overall scale of the 2nd and 3rd family Yukawa couplings, and any phases in the Yukawa couplings can be absorbed into the appropriate fermion fields, the physically relevant predictions of our models are only the absolute values of Yukawa ratios:

$$\left|\frac{y_t}{y_b}\right|,\quad \left|\frac{y_\tau}{y_b}\right|,\quad \left|\frac{y_{\nu_\tau}}{y_b}\right|,\quad \left|\frac{y_c}{y_s}\right|,\quad \left|\frac{y_\mu}{y_s}\right|,\quad \left|\frac{y_{\nu_\mu}}{y_s}\right|. \tag{4.4}$$

Since the neutrino sector also involves the unspecified Majorana mass matrix, the physical observables we consider in any fit exclude the two ratios involving neutrinos. Since these ratios are still predicted by the model, we will nevertheless specify them as predictions.

We connect the model predictions of the Yukawa entries at the GUT scale to their experimental values at low scales by making use of data tables provided in [36]. These tables provide GUT-scale values obtained by running the RG equations for the Yukawa couplings and mixings from their measured values at the scale $M_Z$ all the way to the GUT scale fixed at $M_{\text{GUT}} = 2 \cdot 10^{16}$ GeV. In between, a transition from the SM running to the MSSM running (and from the $\overline{\text{MS}}$ renormalization scheme to the $\overline{\text{DR}}$ scheme) takes place at $M_{\text{SUSY}} = 3$ TeV, where the effects of the unknown SUSY spectrum are parametrized using the threshold effect parameters $\eta_q$ and $\eta_b$. They are defined by performing the SM to MSSM matching with the Yukawa matrices in a canonical form. In our models the Yukawa matrices already take this canonical form due to the 2 operator setup of eq. (4.2). Furthermore, we make a simplification by neglecting the threshold effects in the charged lepton sector. The free parameters determining the Yukawa ratios of eq. (4.4) of a model thus consist of the following:

Model free parameters: $\tan\beta$, $\eta_b$, $\eta_q$, others (model specific), (4.7)

where the model specific parameters may include parameters determining the direction of GUT-scale VEVs, or coefficients determining the presence of the MSSM Higgs doublets in the doublet flavor-eigenstates. We emphasize that simple parameter and observable counting based on eq. (4.4) and (4.7) is not really meaningful, since a determination of the SUSY threshold correction parameters $\eta_b$ and $\eta_q$ puts constraints on the spectrum of the SUSY sparticles in a non-trivial way. A model thus predicts more observables than one might naively assume, but we shall not pursue the determination of the SUSY spectrum further in this paper, see e.g. [23,37,38] for examples. We now present our example models.

Table 8. Operators with up to 4 GUT-scale VEVs in discrete directions, which can be used for the $O_{22}$ operator to provide a good fit $\chi^2 < 1$ in figure 2. The ratios were computed from eq. (2.18).

Example 1: discrete VEV directions and predictive DT
The first example consists of a predictive DT scenario and discrete VEV directions. In particular, since $O_{33}$ involves H = 10, the simplest choice for a predictive DT scenario is case #1 from table 7, implying $H^{u,d}_1 = H_{u,d}$. This yields a t-b-τ unification prediction for the Yukawa couplings based on eq. (2.18) and table 4:

$$\left|\frac{y_t}{y_b}\right| = \left|\frac{y_\tau}{y_b}\right| = 1. \tag{4.8}$$

It is known that such a scenario requires a large $\tan\beta \approx 50$. For the $O_{22}$ operator, we consider any non-renormalizable operator in eq. (2.6) with H = 10 of dimension at most 7 in the superpotential and with discrete GUT VEV directions, i.e. we demand $n + n' + m + m' \leq 4$, and the VEV direction for each of the 45 or 210 factors is independently chosen to be one of those from eq. (2.9) or (2.10). In other words, $\alpha_i$, $\beta_j$, $\alpha'_k$ and $\beta'_l$ are chosen independently from these discrete sets. The results for the predicted 2nd family Yukawa ratios can be computed using eq. (2.18). The large set of operators under consideration thus also includes all the cases of table 5 (or equivalently figure 2). Since the location of the Higgses is predictive and the VEV directions are discrete, there are no new model specific parameters in eq. (4.7), while the direct observables are the Yukawa ratios from eq. (4.4). We allow the free parameters to vary in the range $20 \leq \tan\beta \leq 70$, and compute for each parameter point a $\chi^2$ value based on the 4 directly observable Yukawa ratios. The values for the ratios as well as the standard deviations (see footnote 5) are computed by interpolating the data tables provided by [36]. The parameters $\eta_b$ and $\tan\beta$ have to be chosen such that the data give $|y_t/y_b|$ and $|y_\tau/y_b|$ close to 1, according to the model prediction from eq. (4.8). The prediction of the 2nd family ratios $|y_c/y_s|$ and $|y_\mu/y_s|$ then still has the freedom of $\eta_q$. The results of the fit of the data projected onto the two 2nd family ratios, as well as the possible predictions from the constructed operators, are shown in figure 2.
Figure 2. The darker and lighter blue regions correspond to the values of ratios consistent with a fit giving $\chi^2 < 1$ and $\chi^2 < 4$, respectively. The green + symbols correspond to model predictions from $O_{22}$ operators containing only the representations 45 ($m = m' = 0$), while the more numerous red crosses correspond to cases where at least one 210 in some VEV direction is also involved, i.e. $m + m' > 0$. The predicted Yukawa ratios from different operators may be equal, so a single symbol may represent identical predictions of more than one operator. Furthermore, preference was given to operators with no representations 210: if both a red cross and a green plus would need to be drawn in the same location in the figure, only the latter is shown.

Footnote 5: The tables in [36] provide only the values of the Yukawa couplings and their errors. The relative error for the ratio $x/y$ is computed by taking $\sqrt{\delta_x^2 + \delta_y^2}$, where $\delta_x$ and $\delta_y$ are the relative errors of the quantities $x$ and $y$, respectively.
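The two ingredients of the fit described above, the relative error of a ratio and the $\chi^2$ value, can be sketched in a few lines. This is only an illustration under the assumption of uncorrelated Gaussian errors; the function names and the sample numbers are hypothetical and are not taken from the data tables of [36].

```python
import math


def ratio_relative_error(rel_err_x, rel_err_y):
    """Relative error of x/y from the relative errors of x and y in quadrature."""
    return math.hypot(rel_err_x, rel_err_y)


def chi_squared(predicted, observed, sigma):
    """Sum of squared pulls over the fitted Yukawa ratios."""
    return sum(((p - o) / s) ** 2 for p, o, s in zip(predicted, observed, sigma))


# Placeholder inputs: two ratios, the first reproduced exactly, the second
# off by two standard deviations.
rel_err = ratio_relative_error(0.03, 0.04)
chi2 = chi_squared(predicted=[1.0, 3.0], observed=[1.0, 4.0], sigma=[0.1, 0.5])
```

A parameter point would then fall into the darker (lighter) region of a plot such as figure 2 if its $\chi^2$ value is below 1 (below 4).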

We can see that ratio-pairs of only a few operators fall into the low χ 2 regions in figure 2. The operators falling into the best-fit region with χ 2 < 1 are listed in table 8. These operators are the most promising candidates for the 2nd family operator in further model building. They all contain either 3 or 4 GUT-scale VEVs, and include none of the candidates with 2 GUT-scale VEVs or less from table 5. We emphasize that all this holds only with the assumption of 3rd family Yukawa unification and case #1 for predictive DT splitting.
As an alternative model building approach, we can relax in the next examples either the discrete VEV assumption or the assumption of having a (most) predictive case of DT.

Example 2: arbitrary VEV direction and predictive DT
We now modify Example 1 by adopting the arbitrary VEV direction approach of section 2.2.
In particular, we again choose case #1 of table 7 for the predictive Higgs location scenario. The renormalizable operator $O_{33}$ from eq. We wish to construct a model with single operator dominance, and since we are dealing with $I = J = 2$ and only one type of GUT VEV, we are restricted to taking operators with $|n - n'| \leq 1$, as was discussed in section 4.1 (and is derived in detail in appendix B). We shall consider only operators of dimension at most 8 in the superpotential, i.e. $n + n' \leq 5$. The considered choices for the 2nd family operator are thus those of eq. (4.11). This leads to the possibilities for $n$ and $n'$ in table 9 (referred to as "models"), where the two exponents can be ordered without loss of generality. The 2nd family Yukawa ratios $|y_\mu/y_s|$, $|y_c/y_s|$ and $|y_{\nu_\mu}/y_s|$ are computed with the tools established in section 2.2, in particular by the use of eq. (2.24). Since only the representation 45 is involved, only the P-type polynomials are relevant in that equation. Case #1 for the DT scenario implies $H^{u,d}_1 = H_{u,d}$. The Yukawa ratios are functions of the VEV ratio $\kappa = X_2/X_1$ in the 45.
The obtained results in table 9 indicate that the modulus of the Yukawa ratios for models 1a and 1b are the same, so we refer collectively to both models as "model 1". Similarly, the predictions from models 2a and 2b are the same, so we refer to them collectively as model 2. Model 0 leads to c-s-µ unification at the GUT scale and is not a viable starting point for model building: considering just the s to µ ratio at the GUT scale, it needs to be between 3 and 6 based on typical SUSY threshold corrections, see e.g. [36]. We are thus left with models 1 and 2 as viable candidates in this simplest setup.
A final model building consideration is whether one can allow only the two operators $O_{33}$ and $O_{22}$, and no others. We use the approach from appendix B, and assign U(1) charges to the representations. We label the charges of $16_2$, $16_3$, 45 and 10 by $q_2$, $q_3$, $x$ and $h$, respectively. The net zero charge of $O_{33}$ and $O_{22}$ demands the relations of eq. (4.12). Forbidding an $O_{32}$ operator for any power $k \in \mathbb{N}_0$ of the 45 then implies the condition of eq. (4.13), where we inserted the expression for $h$ obtained by solving eq. (4.12). The non-vanishing of the total charge in eq. (4.13) then holds for any $k \in \mathbb{N}_0$ provided $q_3 \neq q_2$ and $n + n'$ is odd. From this perspective, models 1b and 2b in table 9 have an odd $n + n'$, so we can forbid off-diagonal Yukawa couplings in them by imposing for example charges $q_3 = h = 0$, $q_2 = -1$, $x = 1$, leading to the addition of mediators 16 with charges $\alpha_1 = \beta_1 = 0$ (and its conjugate). Strictly speaking, it is thus models 1b and 2b that we consider, at least when using techniques from appendix B for imposing single operator dominance. The directly observable Yukawa ratios are those in eq. (4.4), while the model parameters consist of the ones in eq. (4.7) with the additional complex parameter $\kappa$. It turns out that in both models 1 and 2 it is possible to fit the observables. We now show this in a sequence of considerations:

Table 9. Models with different choices of $n$ and $n'$ in $O_{22}$ and their predictions for Yukawa ratios (columns: model, $n_1$, $n_2$, $|y_\mu/y_s|$, $|y_c/y_s|$, $|y_{\nu_\mu}/y_s|$).

1. Our models predict $y_t/y_b = 1$ and $y_\tau/y_b = 1$. We search for these ratio values in the $\tan\beta$-$\eta_b$ plane. Figure 3 shows the regions where the data tables from [36] give $y_t/y_b$ (red) and $y_\tau/y_b$ (blue) to be around 1, with the error range for these quantities also provided by the data tables. We see that the two regions overlap; inside is a point where the ratios are exactly one, which determines our default values of $\tan\beta$ and $\eta_b$, eq. (4.14). The large $\tan\beta$ is expected due to t-b-τ unification.
2. Taking the values in eq. (4.14), the only remaining threshold parameter to be determined for the ratio $y_s/y_c$ is $\eta_q$. We plot this dependence in figure 4. We read off the possible range for the ratio $y_s/y_c$ to be, for example, between 6.65 and 3.58, assuming the interval $\eta_q \in (-0.3, 0.3)$.
3. We plot regions in the complex $\kappa$-plane where the ratios $y_s/y_c$ and $y_\mu/y_c$ give suitable values; the result is shown in figure 5. The models predict the Yukawa ratios to be functions of $\kappa$, see table 9, which are then compared to the experiment-derived data tables of their GUT values in [36]. Note that we are considering $y_\mu/y_c = (y_\mu/y_s)/(y_c/y_s)$ as one of the two ratios, since it does not depend on the threshold parameter $\eta_q$ (only $y_s$ depends on it). The allowable values for the $y_\mu/y_c$ ratio are computed from the data tables taking a combined relative error $\sqrt{(\delta_{y_\mu})^2 + (\delta_{y_c})^2}$ at the fixed $\tan\beta$ and $\eta_b$ from eq. (4.14). The allowed values of $y_s/y_c$, on the other hand, are in the range specified by step 2. We can see that for both models 1 and 2 there exist overlap regions where $\kappa$ values can fit both 2nd family Yukawa ratios well. Note that the pictures are symmetric with respect to the sign of the complex phase of $\kappa$, since replacing $\kappa$ with $\kappa^*$ does not change the absolute value of the Yukawa ratios in table 9.
In this way we are able to fit all observables: points in the overlap region (in the tan β-η b plane) in figure 3 fit the 3rd family ratios, while points in the overlap region (in the κ plane) in figure 5 fit the 2nd family ratios.
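The U(1) charge argument used in this example to forbid an $O_{32}$ operator can be checked with exact rational arithmetic. The sketch below rests on assumptions that are not spelled out in the text above: neutrality of $O_{33}$ is taken to mean $2 q_3 + h = 0$, neutrality of $O_{22}$ to mean $2 q_2 + (n + n')\, x + h = 0$, and the charge of an $O_{32}$ operator with $k$ factors of 45 to be $q_3 + q_2 + k x + h$. The conventions of appendix B may differ, and the function names are hypothetical.

```python
from fractions import Fraction


def o32_charge(q2, q3, n_total, k):
    """Net charge of a 16_3 16_2 45^k 10 operator (assumed charge conditions).

    h and x are fixed by requiring O_33 and O_22 to be neutral, where
    n_total = n + n' >= 1 is the number of 45 factors in O_22.
    """
    h = -2 * q3                            # from 2*q3 + h = 0
    x = Fraction(-(2 * q2 + h), n_total)   # from 2*q2 + n_total*x + h = 0
    return q3 + q2 + k * x + h


def o32_forbidden(q2, q3, n_total, k_max=20):
    """True if the O_32 charge is non-zero for every k = 0, ..., k_max."""
    return all(o32_charge(q2, q3, n_total, k) != 0 for k in range(k_max + 1))


odd_case = o32_forbidden(q2=-1, q3=0, n_total=3)     # n + n' odd
even_case = o32_forbidden(q2=-1, q3=0, n_total=2)    # n + n' even
equal_charges = o32_forbidden(q2=1, q3=1, n_total=3)  # q3 equal to q2
```

Under these assumptions the charge factorizes as $(q_3 - q_2)\,(2k/(n + n') - 1)$, so the scan over $k$ only confirms what the closed form already shows: the operator is forbidden for all $k$ exactly when $q_3 \neq q_2$ and $n + n'$ is odd.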

Example 3: discrete VEV directions and less-predictive DT
As a final example, we consider another possible modification of Example 1, in which the GUT VEV directions remain discrete, but we introduce additional freedom through the ambiguity in the location of the MSSM Higgses.
We could in principle remain agnostic about the doublet-triplet splitting and introduce free parameters for the coefficients a i and b i as discussed in section 3.3. It might be instructive, however, to explicitly construct an example of such a less-predictive DT splitting.
We consider the Higgs sector containing the doublets to consist, for example, of $10 \oplus 126 \oplus \overline{126} \oplus 210$, where H = 10 is used for $O_{33}$, and H = $\overline{126}$ is used in $O_{22}$. We keep the couplings $m_{10}$, $m_{126}$, $m_{210}$, $\lambda_{145}$, $\lambda_{14\bar{5}}$, $\lambda_{45\bar{5}}$ and $\lambda_{444}$ in eq. (3.8), such that we obtain the doublet and triplet mass matrices (see eq. (4.16)), written in the (sub)bases of eq. (4.17). Based on these explicit matrices, we can perform fine-tuning in the parameter $\lambda_{444}$, such that $\det M_D = 0$. Note that the $M_T$ matrix has an additional row and column compared to $M_D$, but additional triplet states get mixed with the others only in the second term.
The left (right) null eigenmode of $M_D|_{\lambda_{444}}$, where the vertical bar denotes the insertion of the fine-tuned expression for $\lambda_{444}$, can then be solved for analytically. It corresponds to the MSSM Higgs $H_u$ ($H_d$), and has the components $a_i$ ($b_i$) in the basis $D_i$ ($\overline{D}_i$), where $i$ goes from 1 to 4. The coefficients $a_i$ and $b_i$ are functions of the parameters and VEVs, and are properly normalized by $\sum_i |a_i|^2 = \sum_i |b_i|^2 = 1$. We omit here the resulting very complicated analytic expressions for $a_i$ and $b_i$, but we checked explicitly that their expressions are independent, so that they can indeed be treated as free parameters.
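The procedure described here (fine-tune one parameter until the doublet mass matrix becomes singular, then extract and normalize its left and right null eigenmodes) can be sketched numerically. The example below is a toy illustration only: the 3 × 3 matrix is hypothetical and is not the $M_D$ of eq. (4.16), the tuned parameter is assumed to enter a single matrix entry, and the left null vector is defined by the assumed convention $\sum_i a_i M_{ik} = 0$ without complex conjugation.

```python
import math


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def fine_tune_entry(m, i, j):
    """Copy of m with entry (i, j) chosen such that the determinant vanishes.

    The determinant is linear in any single entry, so the tuned value is
    -det(m with that entry set to zero) / cofactor(i, j).
    """
    tuned = [row[:] for row in m]
    tuned[i][j] = 0
    cof = (-1) ** (i + j) * det(minor(tuned, i, j))
    if cof == 0:
        raise ValueError("the determinant does not depend on this entry")
    tuned[i][j] = -det(tuned) / cof
    return tuned


def normalize(v):
    """Rescale v such that sum_i |v_i|^2 = 1."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def null_modes(m):
    """Normalized left and right null vectors (a, b) of a singular rank n-1 matrix.

    For det(m) = 0 the columns of the cofactor matrix are left null vectors
    (sum_i a_i m[i][k] = 0) and its rows are right null vectors
    (sum_k m[i][k] b_k = 0).
    """
    n = len(m)
    cof = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)] for i in range(n)]
    size = lambda v: sum(abs(x) ** 2 for x in v)
    columns = [[cof[i][j] for i in range(n)] for j in range(n)]
    return normalize(max(columns, key=size)), normalize(max(cof, key=size))


# Hypothetical toy mass matrix; entry (2, 2) plays the role of the tuned parameter.
M = [[1, 2, 0], [0, 1 + 0.5j, 3], [2, 0, 1]]
M_tuned = fine_tune_entry(M, 2, 2)
a, b = null_modes(M_tuned)
print("a =", a)
print("b =", b)
```

In this toy setting the components of `a` and `b` play the role of the coefficients $a_i$ and $b_i$; varying the untuned entries of `M` changes them, which is the sense in which they act as free parameters.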
The (anti)doublets involved in the Yukawa sector are located in the $\mathbf{10}$ and the $\overline{\mathbf{126}}$; their components follow from table 3. With these relations, the DT splitting and MSSM Higgs locations are completely determined, up to the values of $a_i$ and $b_i$, which we take as free parameters. We now choose the 2nd family Yukawa operator $O_{22}$; for simplicity we take $H = \overline{\mathbf{126}}$, $m = m' = 0$, $n = 1$, $n' = 0$, with the discrete direction $\alpha_1 = X_1$. Using eq. (2.17), this setup determines the superpotential, in which we ignored the ∆ term of right-handed neutrinos. The last line of the superpotential expression, written only with MSSM fields, directly gives the Yukawa ratios.

The directly observable predictions are the 4 Yukawa ratios not involving the neutrinos. The ratios of the up and down sectors $y_t/y_b$ and $y_c/y_s$ involve the ratios $|a_1/b_1|$ and $|a_3/b_2|$, respectively, which can have arbitrary values (depending on the values of the parameters $m_{10}$, $m_{126}$, $m_{210}$, the two $\lambda_{145}$-type couplings and $\lambda_{455}$ of the doublet sector, while $\lambda_{444}$ is fine-tuned). The free parameters in this example are thus those in eq. (4.7) and the additional two, $|a_1/b_1|$ and $|a_3/b_2|$. The concrete predictions of this example model are thus only the charged lepton to down sector ratios $y_\tau/y_b$ and $y_\mu/y_s$. We already know from figure 3 that the $y_\tau/y_b$ ratio can be fit to 1, from which we can read off $\eta_b \approx -0.2$. Using this value, we can construct a contour plot for the values of the $y_\mu/y_s$ ratio in the $\tan\beta$-$\eta_q$ plane, see figure 6. It is clear from the figure that $y_\mu/y_s = 3$ can be reached for $\eta_q \approx -0.31$, independent of $\tan\beta$. This shows that the 2nd and 3rd family Yukawa couplings can be successfully fit in Example 3.

Conclusions
We investigated in this paper a class of non-renormalizable SO(10) GUT models. The model building approach taken was that of single operator dominance, so that in each entry of the Yukawa matrices a contribution from only one operator dominates. In contrast to SU(5), where only the down and charged-lepton sectors are connected via operators, the SO(10) case connects all fermion sectors amongst themselves, i.e. each operator predicts 3 ratios of Yukawa entries, making such models potentially more predictive than SU(5) models.
A complication arises due to SO(10) representations acquiring GUT-scale VEVs containing more than one SM singlet state, i.e. the 45 contains 2 and the 210 contains 3. We have considered 2 cases in section 2: either these fields acquire a VEV in a discrete direction aligned with a state with well-defined transformation properties under one of the maximal subgroups ($G_{51}$ or $G_{422}$), or an arbitrary direction in singlet space is allowed, thus introducing new free parameters. Even in the latter case the model can still be predictive, since each operator predicts 3 ratios of Yukawa entries and the same GUT VEV fields can be used in multiple operators, thus minimizing the number of introduced parameters. The main general results concern the operators of eq. (2.7).
Furthermore, we found that computing the resulting terms of the SO(10) operators is not sufficient, since there is an additional ambiguity in the location of the MSSM Higgses. The location of $H_u$ and $H_d$ is specified only when they are solved for as the left and right light eigenstates of the doublet-antidoublet mass matrix $M_D$, which is intimately connected to the issue of doublet-triplet splitting. The Higgs location then crucially influences the Yukawa predictions, cf. section 3. This is another feature of the models not found in the SU(5) case. It arises in SO(10) for two reasons: the predictions involve all Yukawa sectors, in particular sectors involving both $H_u$ and $H_d$ in the MSSM, and some Higgs representations (the 120 in particular) contain more than one doublet-antidoublet pair, which then couple differently to different Yukawa sectors. We provide the necessary tools for the reader to perform DT splitting and determine the Higgs location for the model of their choosing in section 3.2, and suggest a list of predictive scenarios (not involving any additional free parameters) in table 7 of section 3.3.
We then considered both model building elements, the operator choice and the Higgs location, in section 4 and discussed how to approach SO(10) flavor model building. Achieving single operator dominance, for example, requires forbidding all operators except the desired one in any Yukawa matrix entry. These restrictions may be imposed by assigning charges under an extra (e.g. global) symmetry: assigning charges to the external legs of the operators introduces constraints on the representations used in each Yukawa entry, while introducing $\mathbf{16} \oplus \overline{\mathbf{16}}$ mediators with charges restricts the types of internal contractions allowed in the operators. The mediator restrictions have been explored in detail in appendix B. The contractions leading to different Yukawa predictions correspond to choosing, for each 45 and 210 factor, with which of the two fermionic $16_F$ representations it contracts, i.e. on which side of H it is located. This allows for single operator dominance to be retained in almost all cases, except for some particular cases with only one type of 45 or 210 with an arbitrary direction VEV.
We also presented 3 examples of toy models involving only the 2nd and 3rd family Yukawa entries in section 4.2. Example 1 involves the most predictive possibility, when only discrete VEV directions and a predictive DT scenario are considered, and we identify the most promising candidate operators for the 2nd family. We then modify this example and further investigate an arbitrary VEV direction approach in Example 2, and a less-predictive DT scenario in Example 3. Both Examples 2 and 3 can also be successfully fit to the data. Overall, the 3 examples show all the model building aspects and approaches discussed in this paper.
Beyond this work, a complete SO(10) model would require a few more ingredients. The details of the entire Higgs sector were not studied, i.e. it was not considered how SO(10) spontaneously breaks to the SM gauge group. Also, it was not considered how the VEV directions of the 45 and 210 in the Yukawa operators are achieved. These issues are from a model building perspective orthogonal to the predictions in the Yukawa sector, which this paper is concerned with, so we do not address them here.

JHEP02(2020)086
A missing ingredient in the Yukawa sector, however, is the set of operators providing Majorana masses for right-handed neutrinos. An example of such an operator could be $(16_F\, \overline{\mathbf{16}})^2$, with the $\overline{\mathbf{16}}$ acquiring a GUT-scale VEV. Since the choice of such operators is again model dependent, we have not considered them in the analysis.
A potential limitation of this type of SO(10) model building is that Landau poles of the unified gauge coupling can occur well within one order of magnitude above the GUT scale, especially if one introduces many different representations 210 to the model. From this point of view there is a preference for representations of lower dimensionality, and a preference for a smaller number of them, i.e. a preference for simpler models of this type. This is a consideration in building any complete model that should not be neglected.
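The proximity of the Landau pole can be estimated with the standard one-loop running of the unified gauge coupling, $\alpha^{-1}(\mu) = \alpha_{\rm GUT}^{-1} - \frac{b}{2\pi} \ln(\mu/M_{\rm GUT})$ with $b = \sum_R T(R) - 3\, C_2(G)$ for chiral superfields. The sketch below is an illustration under stated assumptions and not a calculation from this paper: all fields are taken to be active directly at the GUT scale, the value $\alpha_{\rm GUT}^{-1} = 25$ is a hypothetical input, the field content is a hypothetical example, and the Dynkin indices are the standard SO(10) values in the normalization $T(\mathbf{10}) = 1$.

```python
import math

# Dynkin indices of SO(10) representations, normalized to T(10) = 1.
# A representation and its conjugate have the same index.
DYNKIN_INDEX = {"10": 1, "16": 2, "45": 8, "54": 12,
                "120": 28, "126": 35, "144": 34, "210": 56}
ADJOINT_CASIMIR = 8  # C_2(G) for SO(10) in the same normalization


def one_loop_b(content):
    """One-loop SUSY beta coefficient b = sum_R n_R T(R) - 3 C_2(G)."""
    return sum(DYNKIN_INDEX[rep] * n for rep, n in content.items()) - 3 * ADJOINT_CASIMIR


def landau_pole_ratio(b, alpha_inv_gut):
    """Ratio Lambda / M_GUT at which the one-loop alpha^{-1} reaches zero."""
    if b <= 0:
        return math.inf  # asymptotically free or conformal at one loop
    return math.exp(2.0 * math.pi * alpha_inv_gut / b)


# Hypothetical field content: three 16 families, and a Higgs sector with
# one 10, a 126 plus its conjugate, one 210 and one 45.
content = {"16": 3, "10": 1, "126": 2, "210": 1, "45": 1}
b = one_loop_b(content)
ratio = landau_pole_ratio(b, alpha_inv_gut=25.0)  # assumed alpha_GUT^{-1}
print(f"b = {b}, Landau pole at about {ratio:.1f} x M_GUT")
```

Adding further 210 representations or mediator pairs to `content` increases `b` and moves the estimated pole closer to the GUT scale, in line with the preference for fewer and smaller representations stated above.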
In summary, this paper provides all the necessary model building tools and results for constructing flavor SO(10) GUTs, at least those based on the wide class of non-renormalizable Yukawa operators we considered. Applying these tools opens up new routes towards SO(10) flavor GUT models, in a similar spirit to single operator dominance models in SU(5). While there are more aspects to consider in SO(10), the model builder can be rewarded with an even more predictive Yukawa sector.

basis can be conveniently written by considering the $\mathbf{10} = \mathbf{5} \oplus \overline{\mathbf{5}}$ decomposition under the SU(5) subgroup, and it is written with an upper or a lower index $i$. The transformation matrices between the bases are given in eq. (A.4), where we used the index notation for components on the left-hand side and matrix notation on the right-hand side. The indices $p, q, r, \ldots$ are referred to as real indices, and $i, j, k, \ldots$ as complex fundamental or antifundamental indices, depending on whether they are upper or lower, respectively. Observe that in the component notation all matrices are denoted by $P$, but have different index types and placement. For example, $P_{pi}$ transforms the complex basis to the real basis, i.e. $10_p = P_{pi}\, 10^i$, while $P^i{}_p$ is its inverse and transforms the real basis into the complex basis, i.e. $10^i = P^i{}_p\, 10_p$. The real indices are always lower, so the Einstein summation convention applies to them if two lower ones are repeated.
Eq. (A.4) implies unitarity relations for these matrices. Furthermore, the complex index $i$ can be raised or lowered by the matrices $P^{ij}$ and $P_{ij}$ of eq. (A.6). The representation 10 is real, so it contains 10 real degrees of freedom $x_p$; in the complex basis, the $\mathbf{5}$ contains 5 complex degrees of freedom, while the $\overline{\mathbf{5}}$ has those same 5 complex degrees of freedom.
The charge conjugation matrix $C$ rotates the spinor representation into its conjugate representation. In the basis $32_A$ of eq. (A.7), the matrix $C \equiv C_{AB}$ and its inverse $C^{-1} \equiv (C_{AB})^{-1} = C^{AB}$ are used to lower and raise spinor indices, respectively, which is consistent with the definitions in eq. (A.7). When forming invariants, it is more practical to use the gamma matrices $\Gamma_i$ with a complex antifundamental index instead of a real one. Since only the basis of the fundamental index is changed, the block structures shown in eqs. (A.11) and (A.12) are not affected.
The irreducible representations of SO(10) are located in tensor products of the $\mathbf{10}$, $\mathbf{16}$ and $\overline{\mathbf{16}}$. It is convenient to embed the $\mathbf{16}$ and the $\overline{\mathbf{16}}$ into the reducible representation 32 by setting the components of one of the parts to zero in eq. (A.7). Thus, the spinor index $A$ is used to label the components of both $\mathbf{16}$ and $\overline{\mathbf{16}}$. In the index notation for the representations, parentheses indicate symmetrization and square brackets indicate anti-symmetrization of the indices. The irreducible representations carrying only fundamental indices are real representations, whereas the ones with a spinor index are complex. We use a notation for the representations where all indices are by default upper. In order to specify the components of the 54, 210, 144 and $\overline{\mathbf{144}}$, lower dimensional representations have to be projected out of the tensor products. The 126 and $\overline{\mathbf{126}}$ are determined by a restriction to entries satisfying the (anti)self-duality identities, with $\epsilon$ denoting the rank 10 completely anti-symmetric Levi-Civita symbol; the real basis must be taken.
Invariant terms are formed by contracting indices of the same type. While in the case of complex fundamental and spinor indices an upper index has to be combined with a lower one, there is no such restriction in the real fundamental basis, where all indices are of the same height. In order to raise and lower indices, the matrices $P^{ij}$, $P_{ij}$, defined in eq. (A.6), and the charge conjugation matrices $C_{AB}$, $C^{AB}$, defined in eq. (A.16), are used. Fundamental indices are transformed into spinor indices by the contraction with gamma matrices.
The representations with anti-symmetric fundamental indices have a simple form in spinor space, since they can be written as 32 × 32 matrices with an upper-lower spinor index pair. They adhere to the block structure in eq. (A.12). In spinor space, contractions in products of such representations are thus simply done by matrix multiplication.
As a final aid for group theoretic considerations, we provide decompositions of the lowest dimensional irreducible representations of SO(10) under the maximal subgroups $G_{51}$ and $G_{422}$, listed in tables 10 and 11. In addition, table 12 shows the number of SM singlets and weak doublets in each of these representations, which is relevant information for the DT splitting considered in section 3. The information provided in the tables is also available in e.g. [39].

B Construction of operators via mediators
In this appendix we investigate the conditions under which single operator dominance in model building in the Yukawa sector can be justified without simply setting the coefficients in front of unwanted operators to zero.
We focus on the class of non-renormalizable superpotential operators in eq. (2.7) that are of relevance to this paper. These operators generate Yukawa terms once the SM singlets acquire GUT-scale VEVs. Already for $H = \mathbf{10}$ and a single 45 insertion, there are two possible independent ways of contracting the indices, given in eqs. (B.2) and (B.3), with all the necessary definitions found in appendix A. The difference between the contractions is whether the 10 contracts to spinor form via $\Gamma_i$ in eq. (B.2), or contracts through the 45 via a fundamental index in eq. (B.3). The two operators in eqs. (B.2) and (B.3) turn out to form independent invariants while containing the same representations. Imposing extra symmetries (e.g. global) on these fields cannot discriminate between such contraction ambiguities, thus implying the presence of all such operators in the superpotential, which is at odds with single operator dominance.
A way to circumvent this limitation is to impose symmetries in an extended theory containing additional mediator fields, which effectively control the allowed contractions (analogous to what can be done in SU(5), see [11,12]). We could imagine that the extended theory is renormalizable, while the non-renormalizable operators under study arise as effective operators once the heavy mediator fields are integrated out above the GUT scale. For example, in the extended theory the operator in eq. (B.2) is formed via mediators in the representations $\mathbf{16}$ and $\overline{\mathbf{16}}$, and a mass insertion from $\mathbf{16} \cdot \overline{\mathbf{16}}$. The operator in eq. (B.3), on the other hand, requires a mediator 10, and a mass insertion from $\mathbf{10} \cdot \mathbf{10}$. A mediator 120 is also possible, but it does not yield an expression linearly independent from the other two. The construction of these operators is presented graphically in figure 7.
The mediator labels in the diagrams are written schematically; the mass insertion of $X_i$ is actually a vertex $X_i \cdot \overline{X}_i$, so both $X_i$ and its conjugate representation need to be used, and analogously for $Y_i$. All mediator representations of the tree substructure except for the 126 are real though, in the sense that they form quadratic invariants.
Since in spinor notation the matrices $45^A{}_B$ and $210^A{}_B$ are diagonal, these VEVs commute among each other, but they do not commute with the EW scale VEV of the Higgs representation $H^A{}_B$. The relative order of the GUT-scale VEVs on each side of H is thus irrelevant, in the sense that they yield the same low energy Yukawa operators. The only relevant aspect is whether a VEV is located on the left- or on the right-hand side of the Higgs representation H.
If in addition non-spinor representations are used for mediators alongside $\mathbf{16} \oplus \overline{\mathbf{16}}$, the external legs with a 45 or 10 can form complicated tree graph structures, as shown in figure 9. Considering only the mediators $\mathbf{16} \oplus \overline{\mathbf{16}}$ thus limits us to diagrams with simple external legs as in figure 8 and prevents complications from tree substructures of external legs connecting to the "fermion line" of 16s as in figure 9. This simple case of a fermion line of $\mathbf{16}$ and $\overline{\mathbf{16}}$ representations, to which all other external legs attach, exactly corresponds to the explicit contractions through spinor indices in eq. (2.7) of the operators we consider in this paper.
We are interested in constructing operators which lead to unique predictions for the Yukawa couplings via single operator dominance. Even using just the $\mathbf{16} \oplus \overline{\mathbf{16}}$ mediators, we still need to control the order in which the 45s and 210s attach to the fermion line in figure 8. One can introduce a global U(1) symmetry, or a suitable discrete subgroup, to distinguish between these cases. The assignment of the charges to the fields is shown in figure 10: we assume there are $M$ VEV legs to the left of H, and $N$ VEV legs to the right, so that $M = m + n$ and $N = m' + n'$. If the mediators are integrated out, the diagram in figure 10 corresponds to a non-renormalizable operator in which each $X_i$ and $Y_j$ is either a 45 or a 210. Note that a different position of the $C$ matrix in the product would give the same invariant up to a minus sign due to the commutation relation with gamma matrices in eq. (A.14). The global charges of the mediators on the left- and right-hand side of the Higgs field are labelled by $\alpha_i$ and $\beta_j$, respectively (these charges are unrelated to the alignments, which are also labeled by $\alpha$ and $\beta$ in eq. (2.6)). Note that for each $\alpha_i$ and $\beta_j$ we have in principle a different pair of mediators $\mathbf{16} \oplus \overline{\mathbf{16}}$. The sum of the charges in each vertex needs to be zero, yielding a large system of equations. The mediator charges $\alpha_i$ and $\beta_j$ then follow from these vertex conditions in terms of the charges $q_I$ and $q_J$ of the external 16s and the charges of the 45s or 210s.

Thus, specifying the charges of the $16_{I(J)}$ and the 45s or 210s is sufficient for all the other charges (the charges of the Higgs representation and of the mediators) to be fixed as well.
In general, the charge of the Higgs needs to be consistent in the wider context of multiple operators (since we populate multiple entries of the Yukawa matrix), which can be checked already at the level of external legs only. If we choose a discrete global Z k symmetry instead of U(1), all the charge equations hold modulo k. In any given model, if the mediator mechanism is employed to impose single operator dominance, it needs to be checked with all mediators introduced into the model that they do not allow also undesired diagrams. In particular, mediators introduced for one operator may allow undesired contractions in another operator. This appendix has provided the reader with all the necessary considerations and tools to check the consistency explicitly in any given model. We also provide some general conclusions on forbidding undesired diagrams below, but only when considering an operator in a single Yukawa entry.
A diagram as in figure 8 is said to be protected by a set of global charges of the fields if there exists no other diagram with the same external legs using the same set (or a subset) of mediators. This implies that permutations among the VEV legs or with the Higgs leg are forbidden for a protected diagram. As discussed earlier, only the position of the VEVs relative to the Higgs H impacts the MSSM predictions, so having a protected diagram is a sufficient (but not necessary) condition for single operator dominance. Whether a diagram can be protected or not is discussed for the following two cases:
• $I \neq J$:
- Such diagrams can always be protected. For example, choose $x_s > 0$, $y_t > 0$ and $q_J = q_I + \sum_{s=1}^{m} x_s + \sum_{t=1}^{n} y_t$, where different 45s or 210s are assumed to have different charges $x_s$ and $y_t$, so that no one charge is a sum of some of the others. This then yields a unique allowed order of external legs, assuming no special relations between $x_s$ and $y_t$, i.e. for a random choice of rational charges.
• $I = J$:
- If all 45s or 210s are the same field ($X_i = Y_j$ for all $i$ and $j$): only the cases $|M - N| \leq 1$ can be protected, i.e. only the diagrams which are as symmetric as possible in the number of external legs left and right of H. In all other cases the diagrams always come along with more symmetric diagrams, which potentially generate different Yukawa couplings, as illustrated in figure 11.
- If not all 45s or 210s are the same field: for a generic order of external legs, there may be no protection possible. However, the values of the Yukawa couplings do not depend on the specific order of the 45s or 210s when permuting them on the same side of H. It was checked numerically that among the diagrams which are equivalent from the point of view of the Yukawa couplings, there always exists one which is protected. An example is given in figure 12. Incidentally, this provides further motivation for our choice of operators under consideration in eq. (2.7), since all such operators (with multiple fields) can be protected for some internal reordering of $\{\alpha_i, \beta_j\}$ and some reordering of $\{\alpha_k, \beta_l\}$ while retaining the same MSSM Yukawa prediction.

Figure 11. An example of a diagram (upper) in the case of $I = J$ and only one different 45 or 210 field with global U(1) charge $x$, which cannot be protected. If the upper diagram is present, then the lower, more symmetric one can be constructed, using a subset of the mediators.

Figure 12. Example of two diagrams in the case of $I = J$ providing the same Yukawa couplings, but where only one of them can be protected. They contain two different 45 or 210 fields with charges $x_1$ and $x_2$. For any choice of global charges in the first diagram, the symmetric diagram with a pair of legs $x_1$-$x_2$ on both sides of H can also be constructed. In contrast, the second diagram, where the fields on the left-hand side of the Higgs field are permuted, can be protected by a suitable choice of charges, for example $q_I = 0$, $x_1 = 3$ and $x_2 = 1$.
In summary, for any non-renormalizable operator as in eq. (B.5), there exists a diagram as in figure 8 which can be protected, and which leads to a unique Yukawa operator when integrating out the mediators. Exceptions occur only in the case of $I = J$ and only one different 45 or 210 field, where diagrams which are not as symmetric as possible ($|M - N| > 1$) cannot be protected. Note that the protection we considered applies only when considering operators for a single Yukawa entry; adding more Yukawa entries (more operators) may introduce new mediators with charges, which interfere with the protection of the previous operators. The relevance of the concept of protected operators in concrete models is thus the following: operators that cannot be protected also cannot be used for single operator dominance; protected operators can be used freely in the preliminary model building stages, but subsequent consistency of allowing only a certain set of operators needs to be checked with all external fields and introduced mediators.
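The protection criterion of this appendix lends itself to a brute-force check. The following is a minimal sketch of one possible formalization, under assumptions that are ours and not taken from the paper: only fermion-line diagrams with $\mathbf{16} \oplus \overline{\mathbf{16}}$ mediators are considered, each mediator pair is labeled by the cumulative charge obtained by starting from the outer 16 and adding the charges of the legs attached so far, external legs are identified by their charge alone, and for $I = J$ a diagram and its mirror image are counted as the same diagram. All function names are hypothetical.

```python
from itertools import permutations


def mediator_charges(q_end, legs):
    """Cumulative charges q_end + x_1 + ... + x_i along one side of the fermion line.

    `legs` lists the leg charges on that side, outermost leg first.
    """
    charges, total = [], q_end
    for x in legs:
        total += x
        charges.append(total)
    return charges


def is_protected(left, right, q_i, q_j=None):
    """True if no other arrangement of the same legs uses only the available mediators.

    left, right: leg charges on the two sides of H, outermost leg first.
    q_j=None means I = J (both outer 16s are the same field with charge q_i).
    """
    same_family = q_j is None
    if same_family:
        q_j = q_i
    left, right = tuple(left), tuple(right)
    available = set(mediator_charges(q_i, left)) | set(mediator_charges(q_j, right))
    legs = left + right
    for perm in set(permutations(legs)):
        for cut in range(len(legs) + 1):
            alt = (perm[:cut], perm[cut:])
            if alt == (left, right):
                continue  # the diagram itself
            if same_family and alt == (right, left):
                continue  # mirror image, identified with the diagram for I = J
            needed = set(mediator_charges(q_i, alt[0])) | set(mediator_charges(q_j, alt[1]))
            if needed <= available:
                return False  # a competing diagram exists
    return True


# I = J with a single field of charge 1: only |M - N| <= 1 is protected.
print(is_protected([1, 1], [1], q_i=0))      # M = 2, N = 1
print(is_protected([1, 1, 1], [], q_i=0))    # M = 3, N = 0
# I = J with two fields of charges 3 and 1: protection depends on the order.
print(is_protected([3, 3, 1, 1], [], q_i=0))
print(is_protected([3, 1, 3, 1], [], q_i=0))
# I != J with q_J = q_I + sum of all leg charges.
print(is_protected([1, 2], [4], q_i=0, q_j=7))
```

Within these assumptions, the single-field cases with $I = J$ reproduce the pattern stated above, where only arrangements with $|M - N| \leq 1$ survive. The charge values 3 and 1 are those quoted in the caption of figure 12, but the leg orderings used here are chosen for illustration and are not claimed to match the figure. In a full model, the set `available` would have to include the mediators introduced for all other Yukawa entries as well.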
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.