1 Introduction

In supersymmetric extensions of the Standard Model (SM), any particles with sizeable couplings to the Higgs sector are expected to have masses not too far above the electroweak scale. This concerns in particular the squarks of the third generation, which should be lighter than about a TeV in order not to create a severe naturalness problem. By contrast, the squarks of the first two generations could well be much heavier. This possibility is particularly attractive because the bounds from supersymmetry (SUSY) searches at the LHC are strongest by far for the first two generations of squarks, and because flavour constraints are also easier to satisfy when they are very heavy. The scenario of an inverted mass hierarchy in the squark sector, typically combined with a small higgsino mass parameter and a not too heavy gluino (see e.g. [1–4] and references therein), is commonly dubbed “natural” or “effective” SUSY, and is increasingly becoming the new paradigm of SUSY phenomenology.

In the Minimal Supersymmetric Standard Model (MSSM) with boundary conditions at the Grand Unification (GUT) scale, light stops and sbottoms with otherwise very heavy squarks are especially interesting because they can lead to radiatively induced large stop mixing [5–7]. The latter is needed in the MSSM to obtain a 126 GeV Higgs mass while keeping the stops reasonably light. More precisely, if the squarks of the first two generations have masses of the order of \(10\) TeV, and if supersymmetry breaking is mediated at a very high scale such as \(M_{\mathrm {GUT}}\approx 10^{16}\) GeV, then the stop masses at the low scale receive significant negative contributions from two-loop running (or possibly even from one-loop running if there is a non-vanishing hypercharge \(D\)-term). This allows one to realise a sizeable ratio \(| A_{t}/\overline{m}_{\tilde{t}}|\), where \(A_{t}\) is the stop trilinear parameter and \(\overline{m}_{\tilde{t}}\) is the average stop mass, leading to large one-loop corrections to the lightest Higgs mass. However, in precisely this situation where radiative corrections to the spectrum from the first two generations are important, they may also induce a significant misalignment between the squark and quark mass matrices. The resulting flavour-changing neutral currents (FCNCs) are tightly constrained by experiment. The effects of such a split squark spectrum on flavour observables have already been investigated in [8–11] (see also [12–20] for some recent discussions on FCNCs in selected models with light third-generation squarks). Here, we propose to shed light on this issue using a different strategy.

Firstly, having assumed a very high mediation scale, hierarchical squark soft terms at the low scale have to be obtained from some non-universal boundary conditions through the renormalisation group evolution. But even just prescribing such boundary conditions in a model-independent way is nontrivial, since they depend on the chosen flavour basis. Our first result is to propose a formalism to define general soft-term boundary conditions in a basis-independent manner. Secondly, we apply this formalism to the cases where either a subset or all of the third-generation squarks are light, while the squarks of the first two generations are heavy and near-degenerate. It turns out that not only is our formalism particularly well suited to study such squark mass patterns, but in addition the resulting TeV-scale soft terms are in many cases manifestly compatible with the minimal flavour violation principle (MFV), as proposed in Reference [21]. In addition, whenever a departure from MFV is observed, it can be quantified precisely. Clearly, realising split squark scenarios in this way is of great advantage because it helps ensure that there will be no conflict with bounds on \(D\)–\(\bar{D}\) and \(K\)–\(\bar{K}\) mixing observables, which one might otherwise expect for generic hierarchical soft terms.

In Sect. 2, we briefly recall the essentials of the SUSY flavour problem and the concept of MFV, and present our procedure to define fully generic and non-universal boundary conditions for soft-breaking terms. In Sect. 3 we use this scheme to parametrise the boundary conditions leading to third-generation squarks much lighter than the first two generations, and characterise their flavour properties. Section 4 contains our conclusions. In the “Appendix”, we address some technical subtleties regarding the definition and running of the CKM matrix and show that our scheme allows one to deal easily with, and correct for, CKM-induced uncertainties in the renormalisation group (RG) running.

2 The SUSY flavour sector

We follow the conventions of the SLHA2 [22], which we now briefly recall. The matter fields of the supersymmetric Standard Model transform under a global non-abelian flavour symmetry \(G_\mathrm{F}=\mathrm {SU}(3)_{Q}\times \mathrm {SU}(3)_{U}\times \mathrm {SU}(3)_{D}\times \mathrm {SU}(3)_{L} \times \mathrm {SU}(3)_{E}\). This symmetry is explicitly broken by the Yukawa superpotential

$$\begin{aligned}&\!\!\!W_{\mathrm {Yukawa}}=-(\mathbf {Y}_{u})_{ij}\,H_{u}Q_{i}U_{j}+(\mathbf {Y} _{d})_{ij}\,H_{d}Q_{i}D_{j}\nonumber \\&\qquad \quad \quad \quad +(\mathbf {Y}_{e})_{ij}\,H_{d}L_{i}E_{j}, \end{aligned}$$
(1)

as well as by the soft mass matrices for the squarks and sleptons, and by the soft trilinear terms.

In the lepton–slepton sector, \(\mathbf {Y}_{e}\) can always be diagonalised via a suitable \(\mathrm {SU}(3)_{L}\times \mathrm {SU}(3)_{E}\) transformation. We will focus on the quark–squark sector, where at most one of the matrices \(\mathbf {Y}_{u}\) and \(\mathbf {Y}_{d}\) can be chosen diagonal in a gauge eigenstate basis. After electroweak symmetry breaking, the Yukawa matrices are diagonalised by

$$\begin{aligned} \widehat{\mathbf {Y}}_{d,u}=\mathbf {V}_{d,u}^{R\dag }\mathbf {Y}_{d,u} ^{\mathrm {T}}\mathbf {V}_{d,u}^{L},&\widehat{\mathbf {Y}} _{d}=\text {diag}(y_{d},\,y_{s},\,y_{b}),\nonumber \\&\widehat{\mathbf {Y}} _{u}=\text {diag}(y_{u},\,y_{c},\,y_{t}). \end{aligned}$$
(2)

The misalignment of left-handed quarks is encoded in the CKM matrix, \(\mathbf {V}_{\mathrm {CKM}}=\mathbf {V}_{u}^{L\dag }\mathbf {V}_{d}^{L}\). Rotating quarks and squarks by the same unitary transformations defines the super-CKM basis, in which the squark mass matrices take the form

$$\begin{aligned} \mathbf {M}_{\tilde{u}}^{2}&= \left( \begin{array}{ll} {\mathbf {V}_{\mathrm {CKM}}\,\widehat{\mathbf {m}}}_{Q}^{2}\mathbf {V} _{\mathrm {CKM}}^{\dag }\!+\!\frac{v_{u}^{2}}{2}\widehat{\mathbf {Y}}_{u}^{2} &{} \frac{v_{u}}{\sqrt{2}}\left( \widehat{\mathbf {T}}_{u}^{\dag }\!-\!\widehat{\mathbf {Y}}_{u}\,\mu \cot \beta \right) \\ \frac{v_{u}}{\sqrt{2}}\left( \widehat{\mathbf {T}}_{u}\!-\!\widehat{\mathbf {Y} }_{u}\,\mu ^{*}\cot \beta \right) &{} \widehat{\mathbf {m}}_{U}^{2} +\frac{v_{u}^{2}}{2}\widehat{\mathbf {Y}}_{u}^{2} \end{array} \right) \nonumber \\&+D\text {-terms}\,,\nonumber \\ \mathbf {M}_{\tilde{d}}^{2}&= \left( \begin{array}{ll} {\widehat{\mathbf {m}}}_{Q}^{2}+\frac{v_{d}^{2}}{2}\widehat{\mathbf {Y}}_{d}^{2} &{}\frac{v_{d}}{\sqrt{2}}\left( \widehat{\mathbf {T}}_{d}^{\dag }-\widehat{\mathbf {Y}}_{d}\,\mu \tan \beta \right) \\ \frac{v_{d}}{\sqrt{2}}\left( \widehat{\mathbf {T}}_{d}-\widehat{\mathbf {Y} }_{d}\,\mu ^{*}\tan \beta \right) &{} \widehat{\mathbf {m}}_{D}^{2} +\frac{v_{d}^{2}}{2}\widehat{\mathbf {Y}}_{d}^{2} \end{array} \right) \nonumber \\&+D\text {-terms}. \end{aligned}$$
(3)

In terms of the interaction-basis soft masses \(\mathbf {m}_{Q,U,D}^{2}\) and trilinear terms \(\mathbf {T}_{u,d}\),

$$\begin{aligned} \widehat{\mathbf {m}}_{Q}^{2}&=\mathbf {V}_{d}^{L\dag }\mathbf {m}_{Q} ^{2}\mathbf {V}_{d}^{L},\quad \widehat{\mathbf {m}}_{U}^{2}=\mathbf {V} _{u}^{R\dag }(\mathbf {m}_{U}^{2})^{\mathrm {T}}\mathbf {V}_{u}^{R},\nonumber \\&\qquad \qquad \qquad \qquad \quad \widehat{\mathbf {m}}_{D}^{2}=\mathbf {V}_{d}^{R\dag }(\mathbf {m}_{D} ^{2})^{\mathrm {T}}\mathbf {V}_{d}^{R},\nonumber \\ \widehat{\mathbf {T}}_{u}&=\mathbf {V}_{u}^{R\dag }\mathbf {T}_{u} ^{\mathrm {T}}\mathbf {V}_{u}^{L},\quad \widehat{\mathbf {T}}_{d}=\mathbf {V} _{d}^{R\dag }\mathbf {T}_{d}^{\mathrm {T}}\mathbf {V}_{d}^{L}. \end{aligned}$$
(4)

Our aim is now to establish a formalism for encoding the squark sector soft-term data without fixing a flavour basis. Such a basis-independent formalism has both conceptual and practical advantages which will be discussed in detail below.

In order to find a basis-independent parameterisation of the soft terms, we expand them in powers of the Yukawa matrices, covariantly with respect to the spurious \(G_\mathrm{F}\) flavour symmetry. To this end we define the matrices

$$\begin{aligned} \mathbf {A}=\mathbf {Y}_{d}\mathbf {Y}_{d}^{\dag },\qquad \mathbf {B} =\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dag }. \end{aligned}$$
(5)

They transform as bifundamentals under an \(\mathrm {SU}(3)_{Q}\) rotation which sends \(Q\rightarrow \mathbf {V}_{Q}Q\):

$$\begin{aligned} \mathbf {A}\rightarrow \mathbf {V}_{Q}^{*}\mathbf {A}\,\mathbf {V} _{Q}^{\mathrm {T}},\qquad \mathbf {B}\rightarrow \mathbf {\mathbf {V}}_{Q} ^{*}\mathbf {B}\,\mathbf {V}_{Q}^{\mathrm {T}}. \end{aligned}$$
(6)

Given that \(\mathbf {m}_{Q}^{2}\) also transforms as a bifundamental, \((\mathbf {m} _{Q}^{2})^{\mathrm {T}}\rightarrow \mathbf {V}_{Q}^{*}(\mathbf {m}_{Q} ^{2})^{\mathrm {T}}\,\mathbf {V}_{Q}^{\mathrm {T}}\), we can expand

$$\begin{aligned} (\mathbf{m}_Q^2)^\mathrm {T}&= m_0^2(a_1^q\,\mathbf{1}+a_2^q\,\mathbf{A}+a_3^q\,\mathbf{B}+a_4^q\,\mathbf{A}^2+a_5^q\,\mathbf{B}^2\nonumber \\&+ a_6^q\,\{\mathbf{A},\mathbf{B}\}+i\,b_1^q\, [\mathbf{A},\mathbf{B}] +i\,b_2^q\, [\mathbf{A},\mathbf{B}^2]\nonumber \\&+i\,b_3^q\, [\mathbf{B},\mathbf{A}^2]), \end{aligned}$$
(7)

where the expansion coefficients \(a_i^q\) and \(b_i^q\) are invariant under \(G_\mathrm{F}\). Likewise, given their respective transformation properties under \(G_\mathrm{F}\), the right-handed squark mass matrices and the trilinear terms are covariantly expanded as

$$\begin{aligned} \mathbf{m}_{U}^2&= m_0^2(a_1^{u}\,\mathbf{1}+\mathbf{Y}_{u}^\dag (a_2^{u}\,\mathbf{1}+a_3^{u}\,\mathbf{A}+a_4^{u}\,\mathbf{B} +a_5^{u}\mathbf{A}^2\nonumber \\&+ a_6^{u}\,\{\mathbf{A},\mathbf{B}\} +i\,b_1^{u}\, [\mathbf{A},\mathbf{B}] +i\,b_2^{u}\, [\mathbf{A},\mathbf{B}^2]\nonumber \\&+i\,b_3^{u}\, [\mathbf{B},\mathbf{A}^2])\mathbf{Y}_{u}),\nonumber \\ \mathbf{m}_{D}^2&= m_0^2(a_1^{d}\,\mathbf{1}+\mathbf{Y}_{d}^\dag (a_2^{d}\,\mathbf{1}+a_3^{d}\,\mathbf{A}+a_4^{d}\,\mathbf{B}+a_5^{d}\,\mathbf{B}^2\nonumber \\&+a_6^{d}\,\{\mathbf{A},\mathbf{B}\} +i\,b_1^{d}\, [\mathbf{A},\mathbf{B}] +i\,b_2^{d}\, [\mathbf{A},\mathbf{B}^2]\,\nonumber \\&\quad +i\,b_3^{d}[\mathbf{B},\mathbf{A}^2])\mathbf{Y}_{d}),\end{aligned}$$
(8)
$$\begin{aligned} \mathbf{T}_{u,d}&= A_0 (c_1^{u,d}\,\mathbf{1}+c_2^{u,d}\,\mathbf{A}+c_3^{u,d}\,\mathbf{B}+c_4^{u,d}\,\mathbf{A}^2+c_5^{u,d}\,\mathbf{B}^2\nonumber \\&+c_6^{u,d}\,\{\mathbf{A},\mathbf{B}\}+i\,c_7^{u,d}\, [\mathbf{A},\mathbf{B}]\nonumber \\&+i\,c_8^{u,d}\, [\mathbf{A},\mathbf{B}^2]+i\,c_9^{u,d}\, [\mathbf{B},\mathbf{A}^2])\mathbf{Y}_{u,d}. \end{aligned}$$
(9)

The coefficients \(a_{i}^{q,u,d}\), \(b_{i}^{q,u,d}\) are real because the mass matrices are hermitian, but the \(c_{i}^{u,d}\) are generally complex. The parameters \(m_0\) and \(A_0\) are placeholder constants of mass dimension one which could as well be absorbed into the \(a\), \(b\), and \(c\) coefficients at one’s convenience. Eqs. (7)–(9) define our basis-independent general parameterisation of the squark sector soft terms.

Since the matrices appearing on the RHS of Eq. (7) are linearly independent (for generic \(\mathbf{A}\) and \(\mathbf{B}\)) [23], there is no loss of generality in this expansion. The same is true for each of Eqs. (8) and (9). Indeed it is a simple exercise in counting to show that the real \(a_i^{q,u,d}\) and \(b_i^{q,u,d}\) together with the complex \(c_i^{u,d}\) coefficients contain exactly the degrees of freedom needed for describing three hermitian \(3\times 3\) mass matrices and two general complex \(3\times 3\) trilinear matrices. The bases of flavour-covariant \(3\times 3\) matrices we are projecting on are not unique, but they are in a sense the simplest choices, being symmetric in \(\mathbf {Y}_u\) and \(\mathbf {Y}_d\) and using the lowest powers of Yukawa matrices possible.
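The counting and the linear-independence statement can be checked numerically. The following is a minimal sketch, assuming generic random hermitian matrices in place of \(\mathbf{A}\) and \(\mathbf{B}\) (not realistic Yukawa structures); all function names are hypothetical helpers introduced for illustration only. It builds the nine basis matrices of Eq. (7), checks that they span the nine-dimensional real space of hermitian \(3\times 3\) matrices, and recovers the real coefficients of an arbitrary hermitian matrix.

```python
import numpy as np

def hermitian_basis(A, B):
    """Nine hermitian matrices of Eq. (7), in the order
    1, A, B, A^2, B^2, {A,B}, i[A,B], i[A,B^2], i[B,A^2]."""
    comm = lambda X, Y: X @ Y - Y @ X
    A2, B2 = A @ A, B @ B
    return [np.eye(3, dtype=complex), A, B, A2, B2, A @ B + B @ A,
            1j * comm(A, B), 1j * comm(A, B2), 1j * comm(B, A2)]

def to_real_vector(H):
    """Flatten a hermitian 3x3 matrix into its 9 independent real components."""
    iu = np.triu_indices(3, k=1)
    return np.concatenate([np.diag(H).real, H[iu].real, H[iu].imag])

def expansion_coefficients(M, A, B):
    """Solve M = sum_k x_k * basis_k for the nine real coefficients x_k."""
    G = np.column_stack([to_real_vector(X) for X in hermitian_basis(A, B)])
    return np.linalg.solve(G, to_real_vector(M)), G

def random_hermitian(rng):
    """Generic hermitian matrix (assumption: stands in for A, B or a soft mass)."""
    Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    return (Z + Z.conj().T) / 2

rng = np.random.default_rng(0)
A, B, M = (random_hermitian(rng) for _ in range(3))
coeffs, G = expansion_coefficients(M, A, B)
rank = np.linalg.matrix_rank(G)
M_rebuilt = sum(x * X for x, X in zip(coeffs, hermitian_basis(A, B)))

# Real degrees of freedom: three hermitian matrices (9 each) plus
# two general complex matrices (18 each).
n_params = 3 * 9 + 2 * 18
print(rank, np.allclose(M_rebuilt, M), n_params)
```

For generic inputs the basis has full rank, so the coefficients are unique. With realistic, strongly hierarchical Yukawa matrices the same linear system becomes badly conditioned, and a plain double-precision solve of this kind may no longer be adequate.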

These matrix bases turn out to be numerically somewhat peculiar when realistic values for \(\mathbf{Y}_u\) and \(\mathbf{Y}_d\) are inserted. Because of the large hierarchy in the Yukawa couplings, one has \(\mathbf {B}^2\approx \mathrm{tr }(\mathbf {B})\mathbf {B}\) and \(\mathbf {A}^2\approx \mathrm{tr }(\mathbf {A})\mathbf {A}\); that is, some of the basis matrices are nearly parallel in flavour space. In addition, the only non-diagonal structure provided by \(\mathbf {A}\) and \(\mathbf {B}\) is the very hierarchical CKM matrix. Therefore, numerically expanding a generic \(3\times 3\) matrix requires coefficients spanning several orders of magnitude, typically up to the order of \(m_{t}^{2}/m_{u}^{2}\sim 10^{10}\).
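The near-parallelism of \(\mathbf{B}\) and \(\mathbf{B}^2\) is easy to see numerically. The sketch below assumes illustrative up-type Yukawa eigenvalues (round numbers chosen only to reproduce the hierarchy, not fitted values) and works in the basis where \(\mathbf{Y}_u\) is diagonal; since both quantities computed are invariant under unitary rotations, the choice of basis does not matter.

```python
import numpy as np

# Illustrative (assumed) up-type Yukawa eigenvalues at a high scale.
y_u, y_c, y_t = 3e-6, 1.5e-3, 0.5
Yu = np.diag([y_u, y_c, y_t])
B = Yu @ Yu.conj().T

def cosine(X, Y):
    """Cosine of the angle between two matrices under the trace inner product."""
    return np.trace(X.conj().T @ Y).real / (np.linalg.norm(X) * np.linalg.norm(Y))

# Relative size of B^2 - tr(B) B, measured in the Frobenius norm.
rel_diff = np.linalg.norm(B @ B - np.trace(B) * B) / np.linalg.norm(B @ B)
print(rel_diff, cosine(B, B @ B))
```

The relative difference is of order \(y_c^2/y_t^2\), which is why separating the \(\mathbf{B}\) and \(\mathbf{B}^2\) components of a generic matrix requires very large coefficients.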

The above expansion enables us to adopt a very simple and clear definition of Minimal Flavour Violation (MFV). The basic assumption of MFV is often stated as \(G_\mathrm{F}\) being broken only through powers of Yukawa matrices [21] (see also e.g. [24–26]). The usual rationale is that \(G_\mathrm{F}\) could be an exact but spontaneously broken symmetry of some more fundamental theory whose dynamics is responsible for the generation of both the Yukawa couplings and the soft terms. In our framework, we define MFV as follows: all \(a_i^x\), \(b_i^x\) and \(c_i^x\) coefficients in Eqs. (7)–(9) should be at most \(\mathcal{O}(1)\) when \(m_0\) and \(A_0\) represent the typical soft mass scale. (In fact the statement “the only sources of \(G_\mathrm{F}\) breaking are powers of Yukawa matrices” is somewhat meaningless when taken on its own, since the above expansion shows that one can parameterise any general soft mass and trilinear matrices in this way. However, if the expansion coefficients are allowed to be arbitrarily large, they could not possibly originate from \(G_\mathrm{F}\) spurions in a weakly coupled theory.) For more details, see also [27, 28].

At this point we should emphasise that our approach does not rely on the \(G_\mathrm{F}\) symmetry being in any way fundamental. When we allude to MFV in the following, it is mostly because the MFV condition (in the strict above sense) has certain other appealing properties: Firstly, it is stable and generally even IR-attractive [29, 30] under the renormalisation group; secondly, it allows a model to automatically satisfy many stringent bounds from flavour physics.

Unification may impose additional relations between the soft terms and hence between the expansion coefficients. GUT relations are typically spoiled at the subleading level by higher-dimensional operators involving GUT-breaking VEVs (for instance, the \(\mathrm {SU}(5)\) relation \(\mathbf{Y}_d=\mathbf{Y}_e\) should be violated to obtain a valid fermion spectrum). Neglecting such GUT-breaking effects, one may look for simple conditions on the coefficients to ensure that the soft terms are compatible with grand unification, depending on the actual GUT model. For example, standard \(\mathrm {SU}(5)\) unification requires \(\mathbf{m}_Q^2=\mathbf{m}_U^2\). Choosing a basis in which \(\mathbf{Y}_u\) is diagonal, it is clear that for this to hold it is sufficient to choose \(a_1^q=a_1^u\), \(a_3^q=a_2^u\), \(a_5^q=a_4^u\) with all other \(a_i^{q,u}=0\). More general patterns are of course possible, since our parametrisation is fully general, but they will typically not be MFV-like.
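As a short check of this sufficiency condition (a sketch that uses only Eqs. (7) and (8)): in a basis where \(\mathbf{Y}_u\) is diagonal one has \(\mathbf{Y}_u^\dag \mathbf{Y}_u=\mathbf{Y}_u\mathbf{Y}_u^\dag =\mathbf{B}\) and \(\mathbf{Y}_u^\dag \mathbf{B}\,\mathbf{Y}_u=\mathbf{B}^2\). Keeping only the coefficients listed above, the two expansions reduce to

$$\begin{aligned} (\mathbf{m}_Q^2)^\mathrm {T}&= m_0^2\,(a_1^q\,\mathbf{1}+a_3^q\,\mathbf{B}+a_5^q\,\mathbf{B}^2),\nonumber \\ \mathbf{m}_U^2&= m_0^2\,(a_1^u\,\mathbf{1}+a_2^u\,\mathbf{B}+a_4^u\,\mathbf{B}^2). \end{aligned}$$

Both matrices are diagonal in this basis, so the transpose is immaterial, and they coincide term by term once \(a_1^q=a_1^u\), \(a_3^q=a_2^u\) and \(a_5^q=a_4^u\).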

In this work we are interested in models where the soft terms are neither universal nor necessarily MFV-like at some very high mediation scale \(M_\mathrm{GUT}\approx 10^{16}\) GeV. We will define the soft-term boundary conditions through the expansion Eqs. (7)–(9). Such a procedure has many desirable features (below, we use the short-hand \(x^{q,u,d}=a^{q,u,d},b^{q,u,d},c^{u,d}\); \(x=x^{q},x^{u},x^{d}\)):

  1. The soft masses and trilinear terms at any scale \(Q\) admit expansions of the form (7)–(9), where both soft terms and Yukawa couplings are understood as those at the scale \(Q\). Thus, the running of the soft masses and trilinear terms can be represented by that of the flavour coefficients. Their renormalisation group equations (RGEs) were studied in References [29, 30]. Typically, not only are the evolutions of the coefficients \(a_{i\ne 1}[Q], b_i[Q], c_{i\ne 1}[Q]\) from \(Q=M_{\mathrm {GUT}}\) down to the TeV scale smooth and bounded, but they even exhibit infrared “quasi”-fixed points, whose values mostly depend on the non-flavoured MSSM parameters.

  2. The \(\beta \)-functions of the soft masses and trilinear terms are naturally compatible with the expansions (7)–(9), and the running of the various coefficients sums up different physical effects. For example, the leading coefficients \(a_{1}^{q,u,d}[Q],c_{1}^{u,d}[Q]\) entirely encode the dominant flavour blind evolution, while subleading terms evolve separately.

  3. The phenomenological impact of the flavour mixing induced by the off-diagonal soft-term entries can immediately be assessed. Indeed, the MFV limit is recovered when all the coefficients are \(\mathcal {O}(1)\). This means that one can directly spot potentially dangerous sources of new FCNCs simply by looking at the relative sizes of the coefficients. For example, if \(a_{1}^{q}[1\,\)TeV\(]=1\) but \(a_{3}^{q}[1\) TeV\(]=1000\), then one should expect difficulties with FCNC constraints from \(K\) and \(B\) physics. Indeed, assuming SUSY masses of the order of 1 TeV, such values grossly violate current bounds on mass insertions; see e.g. Reference [31], with for example \([\mathbf{m}_Q^2]_{12}/[\mathbf{m}_Q^2]_{11}\approx 1000\times V_{td}^{*}V_{ts}\sim \mathcal {O}(0.1)\).

  4. Starting with universal mSUGRA-like soft-breaking terms, \(x_{i} [M_{\mathrm {GUT}}]=\delta _{i1}\), the coefficients at the low scale are all MFV-like, \(x_{i}[Q]\sim \mathcal {O}(1)\) or smaller. More generally, the logarithmic running with small coupling constants cannot upset initial MFV-like boundary conditions at the GUT scale. The converse is not true though, because of the presence of the aforementioned quasi-fixed points [30].

  5. Intrinsically new CP-violating phases, entering exclusively through \(b_{i}^{q,u,d}[M_{\mathrm {GUT}}]\ne 0\) and \(\mathrm{Im }c_{i} ^{u,d}[M_{\mathrm {GUT}}]\ne 0\), can be simply factored out from CP-violating effects induced by the CKM phase, introduced through \(\mathbf {Y}_{u}\) and \(\mathbf {Y}_{d}\). Note that if \(b_{i}^{q,u,d}[M_{\mathrm {GUT}}]=0\) and \(\mathrm{Im }c_{i}^{u,d}[M_{\mathrm {GUT}}]=0\), their values at the electroweak scale are entirely induced by the CP-violating phase of \(\mathbf {V}_{\mathrm {CKM}}\), and end up tiny. In this respect, the CP-violating phases of \(c_{1}^{u}[Q]\) and \(c_{1}^{d}[Q]\) are a bit special. Being flavour blind, they should be considered along with those of the other flavour blind complex parameters of the MSSM such as \(\mu \) and the gaugino masses [27].

  6. For a given boundary condition, one can easily and completely probe its CKM neighbourhood by allowing \(\mathcal {O}(1)\) variations of the coefficients. Indeed, these variations simulate the presence of arbitrary CKM-like mixing matrices in both the left- (\(\mathbf {V}_{u,d}^{L}\)) and the right-handed sector (\(\mathbf {V}_{u,d}^{R}\)). In practice, this is far less demanding than it seems. For \(\mathcal {O}(1)\) perturbations, not all the 63 coefficients are equally relevant, so varying only the first few in each expansion is sufficient.

  7. As analysed in the “Appendix”, provided none of the leading coefficients are particularly large, the soft-term expansions are largely independent of the precise parametrisation of the CKM matrix. In particular, the coefficients are similar using the full CKM matrix or its CP-conserving limit, no matter how this limit is taken. By contrast, off-diagonal entries of the soft terms can deviate by tens of percent depending on the chosen CKM matrix. This observation is useful in practice since it allows one to compute the coefficients under some simplifying assumptions (CP-limit, no threshold corrections, and/or no experimental errors for the CKM parameters), and then to reconstruct the physical soft terms with excellent accuracy and thereby reliably compute all the flavour observables.

  8. Last but not least, it is easy and straightforward to parametrise boundary conditions where the third-generation squarks are split from the first two generations, since \(\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\) and \(\mathbf {Y}_{d}\mathbf {Y}_{d}^{\dagger }\) do have precisely such a hierarchy. This possibility will be explored in detail in the next section.
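The estimate quoted in the third item of the list can be reproduced with a few lines of arithmetic. This is only a rough sketch: the CKM magnitudes below are approximate illustrative inputs, the default \(y_t=1\) is an assumption, and the function name is hypothetical. It uses \(\mathbf{B}_{12}\sim y_t^2\,V_{td}^{*}V_{ts}\) in the basis where \(\mathbf{Y}_d\) is diagonal.

```python
# Approximate CKM magnitudes (assumed illustrative values, not fitted inputs).
V_td, V_ts = 8.6e-3, 4.0e-2

def mass_insertion_12(a1_q, a3_q, y_t=1.0):
    """Rough size of [m_Q^2]_12 / [m_Q^2]_11 induced by the a_3^q * B term,
    using B_12 ~ y_t^2 * V_td^* V_ts in the basis where Y_d is diagonal."""
    return abs(a3_q / a1_q) * y_t**2 * V_td * V_ts

delta_mfv = mass_insertion_12(1.0, 1.0)       # MFV-like coefficients
delta_large = mass_insertion_12(1.0, 1000.0)  # a_3^q = 1000 as in the text
print(delta_mfv, delta_large)
```

With \(\mathcal{O}(1)\) coefficients the ratio stays at the level of the CKM suppression itself, whereas \(a_3^q=1000\) lifts it to a few tenths, consistent with the \(\mathcal{O}(0.1)\) figure quoted in the list.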

To be complete, we should point out that there is one practical issue that needs to be kept in mind. Since the basis matrices span several orders of magnitude and are approximately linearly dependent, it is necessary to maintain a high level of accuracy in the numerical evaluations, otherwise instabilities can easily arise. This is especially true when computing the coefficients of highly suppressed terms such as \(a_{6}^{u,d}\) or \(b_{3}^{u,d}\). For the same reason, a perfectly unitary representation of the CKM matrix must be used, otherwise spuriously large coefficients can arise.

3 Split squarks and MFV

The peculiar structure of the MSSM Yukawa couplings should have its origin in some unknown flavoured dynamics at some high scale \(M_{\mathrm {F}}\). If supersymmetry breaking is mediated at a scale greater than \(M_{\mathrm {F}}\), then one can reasonably expect that this flavour dynamics will also generate some non-trivial flavour structures for the soft mass terms and the trilinear couplings. In that sense, expressing the soft terms directly in terms of the Yukawa couplings through the expansions (7)–(9) can be regarded as an attempt at capturing the relationships between them. If this picture is correct, the expansion coefficients at the scale \(M_{\mathrm {F}}\) would not be random but would derive from the flavour dynamics at that scale. It is thus quite possible that the various coefficients would actually follow a very definite pattern.

With the above idea in mind, our goal is to design flavour structures leading to spectra with light third-generation squarks at the low scale. There are many ways to achieve this. A first possibility is to impose

$$\begin{aligned} a_3^q\simeq - a_1^q/\langle \mathbf {B}\rangle ,\qquad a_{i\ne 1,3}^q=b_i^q=0, \end{aligned}$$
(10)

where \(\langle \mathbf {\cdot }\rangle \) denotes the trace in flavour space. More explicitly, let us set

$$\begin{aligned}&\mathbf {m}_{Q}^{2}[M_{\mathrm {GUT}}] =m_{0}^{2}( \mathbf {1}-\alpha _{q}\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y} _{u}^{\dagger }\rangle ^{-1}) ^{\mathrm {T}}, \nonumber \\&\mathbf {m}_{U,D}^{2}[M_{\mathrm {GUT}}] =m_{0}^{2}\mathbf {1}, \nonumber \\&\mathbf {T}_{u,d}[M_{\mathrm {GUT}}] =A_{0}\mathbf {Y}_{u,d}. \end{aligned}$$
(11)

When the free parameter \(\alpha _{q}\) is close to one, in the basis where \(\mathbf {Y} _{u}\) is diagonal, \(\mathbf {m}_{Q}^{2}\) has its first two entries nearly degenerate and much larger than the third, which is precisely what we aim for. Note, however, that in this particular case the value of \((\mathbf {m}_{Q}^2)_{33}\) receives large negative loop corrections from \((\mathbf {m}_{U}^2)_{33}\). In order to generate a realistic spectrum, the GUT-scale \((\mathbf {m}_{Q}^2)_{33}\) cannot be chosen too small, and/or sizeable positive corrections from the gaugino masses are needed to overcome this effect. At the low scale \(\tilde{t}_L\) and \(\tilde{b}_L\) then end up much lighter than all the other squarks.
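The structure of Eq. (11) can be illustrated with a small numerical sketch. The Yukawa eigenvalues are assumed round numbers, \(m_0=10\) TeV and \(\alpha _q=0.97\) are borrowed from the caption of Fig. 1 purely for illustration, and the helper names are hypothetical. The code evaluates \(\mathbf{m}_Q^2\) at the input scale both in the basis where \(\mathbf{Y}_u\) is diagonal and in a randomly rotated flavour basis, and compares the eigenvalues.

```python
import numpy as np

def mq2_boundary(Yu, m0, alpha_q):
    """m_Q^2 at the input scale, following Eq. (11)."""
    B = Yu @ Yu.conj().T
    return m0**2 * (np.eye(3) - alpha_q * B / np.trace(B).real).T

def random_unitary(rng):
    """Random 3x3 unitary matrix from the QR decomposition of a complex Gaussian."""
    Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    Q, _ = np.linalg.qr(Z)
    return Q

rng = np.random.default_rng(1)
Yu_diag = np.diag([3e-6, 1.5e-3, 0.5]).astype(complex)  # assumed eigenvalues
m0, alpha_q = 10.0, 0.97                                 # m0 in TeV

# Same Yukawa matrix in an arbitrary flavour basis: Y_u -> V1 Y_u V2.
Yu_rot = random_unitary(rng) @ Yu_diag @ random_unitary(rng)

# Eigenvalues in TeV^2, sorted in ascending order.
ev_diag = np.linalg.eigvalsh(mq2_boundary(Yu_diag, m0, alpha_q))
ev_rot = np.linalg.eigvalsh(mq2_boundary(Yu_rot, m0, alpha_q))
print(ev_diag, ev_rot)
```

The spectrum is the same in both bases: two eigenvalues very close to \(m_0^2\) and one close to \((1-\alpha _q)\,m_0^2\). These are input-scale values only; the loop corrections discussed above are not included.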

It should be remarked that compared to naively setting

$$\begin{aligned} \mathbf {m}_{Q}^{2}[M_{\mathrm {GUT}}]=\left( \begin{array}{lll} m_{1}^{2} &{} 0 &{} 0\\ 0 &{} m_{1}^{2} &{} 0\\ 0 &{} 0 &{} m_{2}^{2} \end{array} \right) , \end{aligned}$$
(12)

our procedure requires the same number of free parameters. But, at the same time, setting the initial conditions in our way is entirely independent of the flavour basis, while Eq. (12) in principle requires one to specify also the four mixing matrices \(\mathbf {V}_{u,d}^{L,R}\). In addition, the parameter \(\alpha _{q}\) could bear some physical meaning. First, because \(\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1}\) is factored out, its RG evolution is very flat over the whole range down to the electroweak scale. Typically \(\alpha _q\) changes by \(\lesssim 20\,\%\) during the evolution. (We explicitly show the evolution of \(\alpha _q\) for a different scenario in the following discussion; see Fig. 2.) Second, it is tempting to imagine that some unknown flavour dynamics sets \(\alpha _{q}\) to exactly one at the scale \(M_{\mathrm {F}}\). However, since \(M_{\mathrm {F}}\ne M_{\mathrm {GUT}}\), one would then have \(\alpha _{q}[M_{\mathrm {GUT}}]\) close but not exactly equal to one. Thus, the only phenomenological constraint on this parameter is for it to evolve down to a value smaller than one at the low scale, so as to avoid inducing negative eigenvalues for the stop or sbottom squarks and the ensuing colour symmetry breaking. We are, however, not aware of any specific flavour model which predicts \(\alpha _q=1\), so for the moment we will treat \(\alpha _q\approx 1\) merely as a parameter choice, and study its implications independently of a possible dynamical generation.

A very interesting feature of the boundary condition Eq. (11) is that even if left-squark masses are highly hierarchical, it nevertheless respects the MFV principle since \(\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle \approx y_{t}^{2}\) is of \(\mathcal {O}(1)\) at all scales. So, once evolved to the low scale, we can immediately predict that these initial conditions should be compatible with flavour constraints.

Other scenarios can be constructed along the same lines. For instance, to also split the \(\tilde{t}_R\) from the first- and second-generation squarks, one can further impose

$$\begin{aligned} \mathbf {m}_{U}^{2}=m_{0}^{2}( \mathbf {1}-\alpha _{u}\mathbf {Y} _{u}^{\dagger }\mathbf {Y}_{u}\langle \mathbf {Y}_{u}^{\dagger }\mathbf {Y} _{u}\rangle ^{-1}), \end{aligned}$$
(13)

which is also compatible with the MFV principle when \(\alpha _{u}\approx 1\). As opposed to the above scenario, the condition that both \(\mathbf {m}_U^2\) and \(\mathbf {m}_Q^2\) be hierarchical is radiatively stable (provided that the other states which couple strongly to the stop sector, such as the up-type Higgs and the gauginos, are not too heavy). Together with a small \(\mu \) parameter, this constitutes a way to realise “natural supersymmetry” within MFV. An example for the typical evolution of the leading expansion coefficients for such a natural SUSY-MFV scenario is given in Fig. 1. The RG evolution and the computation of the mass spectrum are done with SPheno [32, 33] with boundary conditions adapted according to Eqs. (7)–(9). The \(a_1\) coefficients are not shown because they remain very close to unity, with deviations at the level of less than a percent. The evolution of the \(a_3^q\) and \(a_2^u\) coefficients is much steeper than that of the other \(a_i\). The reason for this is that \(a_3^q\) and \(a_2^u\) are dominated by the running of \(y_t\); when the \(y_t\) dependence is factored out, the evolution is very flat, see Fig. 2.

Fig. 1

Evolution of the leading expansion coefficients for the strictly MFV “natural SUSY” scenario with light \(\tilde{t}_{L,R}\) and \(\tilde{b}_L\) but a heavy \(\tilde{b}_R\). Concretely, we take \(m_0=10\) TeV, \(m_{1/2}=1\) TeV, \(A_0=-1\) TeV, \(\tan \beta =10\), \(m_{H_u}^2=m_{H_d}^2=7.5\) (TeV)\(^2\), and \(\alpha _q=\alpha _u=0.97\). The resulting spectrum has \(m_{\tilde{t}_1}=555\) GeV, \(m_{\tilde{b}_1}=570\) GeV, \(m_{\tilde{t}_2}\simeq 1.8\) TeV and all other squark masses \(\approx 10\) TeV; moreover, \(\mu \simeq 800\) GeV and \(m_{\tilde{g}}\simeq 2.5\) TeV. The point has a light Higgs mass of \(m_h=124\) GeV and passes flavour constraints (computed with SUSY_FLAVOR 2.02 [34]). Finally, \(m_A\simeq 3\) TeV, so we are deep in the Higgs decoupling regime

Fig. 2

Evolution of \(\alpha _q\) and \(\alpha _u\) within the scenario of Fig. 1. The plot serves to confirm the flatness of the evolution of \(\alpha \). Moreover, it illustrates that the evolution of the flavour coefficients \(a_3^q\) and \(a_2^u\) is dominated by the RG evolution of \(y_t\), which is factored out here

On the other hand, there is no way to split the right sbottom from the first two generations without moving away from MFV. Indeed, all the non-trivial terms in the expansion of \(\mathbf {m}_{D}^{2}\) are sandwiched between \(\mathbf {Y}_{d}^{\dagger }\) and \(\mathbf {Y}_{d}\), which are small when \(\tan \beta \) is not very large. Specifically, the simplest way to lighten all third-generation squarks is to impose

$$\begin{aligned}&\mathbf {m}_{Q}^{2} =m_{0}^{2}( \mathbf {1}-\alpha _{q}\mathbf {Y} _{u}\mathbf {Y}_{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1}) ^{\mathrm {T}},\nonumber \\&\mathbf {m}_{U}^{2} =m_{0}^{2}( \mathbf {1}-\alpha _{u}\mathbf {Y} _{u}^{\dagger }\mathbf {Y}_{u}\langle \mathbf {Y}_{u}^{\dagger }\mathbf {Y} _{u}\rangle ^{-1}),\nonumber \\&\mathbf {m}_{D}^{2} =m_{0}^{2}( \mathbf {1}-\alpha _{d}\mathbf {Y} _{d}^{\dagger }\mathbf {Y}_{d}\langle \mathbf {Y}_{d}^{\dagger }\mathbf {Y} _{d}\rangle ^{-1}),\nonumber \\&\mathbf {T}_{u,d} =A_{0}\mathbf {Y}_{u,d}, \end{aligned}$$
(14)

with \(\alpha _{q,u,d}\approx 1\). Clearly, unless \(\tan \beta \) is very large, \(\mathbf {m}_{D}^{2}\) significantly deviates from the MFV assumption. One might worry that this setting conflicts with current flavour constraints, which would thus disfavour light \(\tilde{b}_R\) squarks. However, this is not the case. First, note that a large \(a_{2}^{d}\sim \langle \mathbf {Y}_{d}^{\dagger }\mathbf {Y}_{d}\rangle ^{-1}\approx y_{b}^{-2}\) at the low scale is harmless, since it does not contribute to the \(\delta _{RR}^{d}\) mass insertions (this is evident in a basis where \(\mathbf {Y}_{d}\) is diagonal). The impact of a large \(a_{2}^{d}\) at the high scale is less obvious, since it can drive other coefficients towards large non-MFV values through the RGE evolution. However, as illustrated in Fig. 3, this effect turns out to be quite limited numerically. Though some coefficients are indeed initially driven towards large values, the quasi-fixed point behaviour of the RGE evolution then kicks in and brings them back to MFV-like values at the low scale (see e.g. the coefficient \(a_4^d\) in Fig. 3). So, even if the low-scale coefficients are not strictly compatible with the MFV principle, they are sufficiently close to MFV to pass all flavour constraints (we also checked this explicitly by direct computation of the flavour observables, using the SUSY_FLAVOR 2.02 code [34]).
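To make the parenthetical remark explicit, here is a short sketch in a basis where \(\mathbf{Y}_d=\text{diag}(y_d,\,y_s,\,y_b)\), which can always be reached by a \(G_\mathrm{F}\) rotation. Comparing Eq. (14) with Eq. (8) gives \(a_1^d=1\) and \(a_2^d=-\alpha _d\langle \mathbf{Y}_d^{\dagger }\mathbf{Y}_d\rangle ^{-1}\), and in this basis

$$\begin{aligned} \mathbf{m}_D^2=m_0^2\,\text {diag}\left( 1-\frac{\alpha _d\,y_d^2}{y_d^2+y_s^2+y_b^2},\; 1-\frac{\alpha _d\,y_s^2}{y_d^2+y_s^2+y_b^2},\; 1-\frac{\alpha _d\,y_b^2}{y_d^2+y_s^2+y_b^2}\right) . \end{aligned}$$

The matrix has no off-diagonal entries at the scale where it is imposed, however large \(a_2^d\) is, so this term alone does not feed the off-diagonal \(\delta _{RR}^{d}\) mass insertions. Off-diagonal entries can only arise through the other coefficients generated by the RG evolution.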

Fig. 3

Evolution of the leading expansion coefficients in scenario 2, which is not quite MFV because the \(\tilde{b}_R\) is also light. Here, we take \(\alpha _d=315\). The other parameters are as in Fig. 1, apart from adjusting \(m_{H_u}^2=m_{H_d}^2=5\) (TeV)\(^2\) to obtain an \(m_h\) near 125 GeV. The resulting spectrum is \(m_{\tilde{t}_1}=796\) GeV, \(m_{\tilde{b}_1}\simeq m_{\tilde{t}_2}\simeq 1.4\) TeV, and \(m_{\tilde{b}_2}\simeq 2.4\) TeV. The first/second-generation squark masses are again \(\approx 10\) TeV. The higgsino mass turns out quite low, \(\mu \simeq 240\) GeV, while the gluino and the additional Higgs states remain heavy, \(m_{\tilde{g}}\simeq 2.4\) TeV and \(m_A\simeq 2\) TeV

There is another scenario worth considering. Imagine that, for some reason, the shift from universality induced by the yet unknown flavour dynamics occurs only in the \(\mathrm {SU}(3)_{Q}\) space, through the combination \(\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }-\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle \). Plugging this structure into the soft-breaking expansion, the soft terms can be parametrised at the scale \(M_{\mathrm {GUT}}\) as

$$\begin{aligned}&\mathbf {m}_{Q}^{2} =m_{0}^{2}a_{1}^{q}( \mathbf {1}-\alpha _{0}\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y} _{u}^{\dagger }\rangle ^{-1}) ^{\mathrm {T}},\nonumber \\&\mathbf {m}_{U}^{2} =m_{0}^{2}( a_{1}^{u}\mathbf {1}+a_{2}^{u} \mathbf {Y}_{u}^{\dag }(\mathbf {1}-\alpha _{0}\mathbf {Y}_{u}\mathbf {Y} _{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1})\mathbf {Y}_{u}) \approx m_{0}^{2}a_{1}^{u}\,\mathbf {1},\nonumber \\&\mathbf {m}_{D}^{2} =m_{0}^{2}( a_{1}^{d}\mathbf {1}+a_{2}^{d} \mathbf {Y}_{d}^{\dag }(\mathbf {1}-\alpha _{0}\mathbf {Y}_{u}\mathbf {Y} _{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1})\mathbf {Y}_{d}) \approx m_{0}^{2}a_{1}^{d}\,\mathbf {1}, \nonumber \\&\mathbf {T}_{u,d} =c_{1}^{u,d}A_{0}\mathbf {Y}_{u,d}( \mathbf {1} -\alpha _{0}\mathbf {Y}_{u}^{\dagger }\mathbf {Y}_{u}\langle \mathbf {Y} _{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1}), \end{aligned}$$
(15)

for some \(\mathcal {O}(1)\) coefficients \(a_{i}^{q,u,d}\) and \(c_{i}^{u,d}\), which we set to one for simplicity. Note how the \(a_{2}^{u,d}\) terms end up negligible because \(\mathbf {1}-\alpha _{0}\mathbf {Y}_{u}\mathbf {Y}_{u} ^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1}\), whose \((3,3)\) entry is suppressed, is sandwiched between \(\mathbf {Y}_{u,d}^{\dag }\) and \(\mathbf {Y}_{u,d}\). Again, this input respects the MFV requirement. The only difference from the first scenario is that inverted hierarchies are also imposed on the trilinear terms at the unification scale. Such a pattern does not survive the RG evolution, however. Looking at the expansion of the trilinear terms, the leading \(c_1^{u,d}\) and subleading \(c_{i\ne 1}^{u,d}\) coefficients do not evolve at the same speed, especially when the former are driven by the gluino mass. The cancellation present at the unification scale therefore does not persist at the low scale, and the trilinear terms end up quite similar to those obtained in the first scenario. In this respect, the difficulty in obtaining a viable spectrum mentioned there applies here as well; a dedicated numerical analysis would be needed to determine the valid parameter space of these scenarios.
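As a rough numerical illustration of the GUT-scale input in Eq. (15), the following sketch builds \(\mathbf {m}_{Q}^{2}\) and the sandwiched structure \(\mathbf {Y}_{u}^{\dag }(\mathbf {1}-\alpha _{0}\mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\langle \mathbf {Y}_{u}\mathbf {Y}_{u}^{\dagger }\rangle ^{-1})\mathbf {Y}_{u}\) in the basis where \(\mathbf {Y}_{u}\) is diagonal. Several ingredients are assumptions made only for this illustration: \(\langle \cdot \rangle \) is read as the flavour trace, the Yukawa values and \(\alpha _{0}=0.99\) are placeholders rather than the inputs used in the figures, and the function name is hypothetical. No RG running is included.

```python
import numpy as np

def split_soft_masses(yukawas, alpha0, m0=1.0):
    """Evaluate the m_Q^2 structure of Eq. (15) for a diagonal Y_u.

    yukawas : (y_u, y_c, y_t) at the GUT scale (illustrative values only)
    alpha0  : splitting parameter; alpha0 close to 1 suppresses the (3,3) entry
    m0      : common soft mass scale, with a_1^q set to one
    """
    Yu = np.diag(np.asarray(yukawas, dtype=float))
    YY = Yu @ Yu.conj().T
    # Assumption: <...> denotes the trace over flavour indices.
    P = np.eye(3) - alpha0 * YY / np.trace(YY)
    mQ2 = m0**2 * P.T
    # Structure multiplying a_2^u in m_U^2: Y_u^dagger (1 - alpha0 ...) Y_u
    sandwich = Yu.conj().T @ P @ Yu
    return mQ2, sandwich

# Placeholder GUT-scale up-type Yukawas and a 10 TeV common scale.
yu_gut = (3e-6, 1.5e-3, 0.5)
mQ2, sandwich = split_soft_masses(yu_gut, alpha0=0.99, m0=10.0)

print("LH squark masses [TeV]:", np.sqrt(np.diag(mQ2)))
print("diagonal of sandwiched term:", np.diag(sandwich))
```

With these placeholder inputs the third-generation entry of \(\mathbf {m}_{Q}^{2}\) is suppressed by roughly \(1-\alpha _{0}\) relative to the first two, while every entry of the sandwiched term stays far below one, which is why the \(a_{2}^{u}\) contribution can be dropped in Eq. (15).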

Beyond these specific examples, it is now straightforward to state, using our formalism, a more general sufficient condition for obtaining a GUT-scale split spectrum which is guaranteed to be flavour-safe. This condition is that the GUT-scale flavour coefficients should at most be \(\mathcal{O}(1)\) and should approximately satisfy the relations (generalising the expressions for \(\mathbf {m}_Q^2\) in Eq. (11) and \(\mathbf {m}_U^2\) in Eq. (13))

$$\begin{aligned} a_1^q+a_3^q\,y_t^{-2}+a_5^q\,y_t^{-4}&=0,\nonumber \\ a_1^u+a_2^u\,y_t^{-2}+a_4^u\,y_t^{-4}&=0. \end{aligned}$$
(16)

The MFV condition ensures that there are no flavour problems, while the sum rules of Eq. (16) ensure that the top squarks are actually split from the up-type squarks of the first two generations (note that only \(a_1^q\), \(a_3^q\) and \(a_5^q\) can significantly contribute to the LH stop soft mass if all \(a_i^q\) are \(\lesssim \mathcal{O}(1)\), and similarly for \(a_1^u\), \(a_2^u\) and \(a_4^u\) and the RH stop mass).

While this prescription covers a large class of viable spectra, it is of course also possible to obtain flavour-safe natural SUSY mass patterns in a different manner. For instance, as we have seen above, one may deviate from the MFV prescription by also splitting the right-handed sbottom mass, and rely on the RG evolution to produce an almost MFV spectrum at the low scale. For such scenarios, however, compatibility with FCNC constraints is not automatic and must be checked case by case.

We also note that the above sum rules are tied to small or moderately large \(\tan \beta \). At very large \(\tan \beta \), where \(y_b\) is of order one, they should be modified to take into account also the remaining terms in Eqs. (7) and (8), which may now contribute to the third-generation squark masses even if their coefficients are \(\mathcal{O}(1)\).

4 Conclusions

Third-generation squarks below the TeV scale are an essential requirement for supersymmetry to be natural, while the squarks of the first two generations are likely much heavier. Therefore it is important to study the physics of non-universal squark masses, and of inverted squark mass hierarchies in particular. In phenomenological approaches which prescribe the soft terms at the TeV scale, such as the pMSSM, this is possible to a limited extent only, since effects arising from the renormalisation group running from the mediation scale are not accounted for. In particular, these effects could lead to radiatively induced flavour-violating squark mass mixings. Given the tight experimental constraints from flavour observables, to fully grasp the implications of non-universal squark masses, one should be careful to account for such effects.

In this paper we have studied non-universal squark masses in the case that SUSY breaking is mediated at the GUT scale. We have shown how split squark mass matrices (and trilinears) can be conveniently and generally prescribed in a basis-independent way, and investigated their renormalisation group evolution.

When requiring only the top squarks to be light, and the first two generations to be nearly mass degenerate, the most natural prescription automatically respects the principle of minimal flavour violation at the GUT scale. Since MFV is preserved during the RG evolution of the soft terms down to the TeV scale, bounds on FCNCs can easily be evaded.

For more general hierarchical soft terms at the GUT scale, the compatibility with flavour observables is not automatic, even though generic soft terms tend to be attracted towards MFV-like structures in the infrared [29, 30]. We have confirmed this tendency for the particularly relevant case where all third-generation squarks, including the right-handed sbottom, are light compared to the squarks of the first two generations. While this scenario strongly violates the MFV hypothesis at the GUT scale, the soft terms become increasingly MFV-like during the running, and end up compatible with flavour constraints at the low scale.

Our analysis puts the increasingly popular framework of “natural SUSY” on a more solid footing, showing that it is actually possible to obtain a natural SUSY spectrum at the TeV scale from well-motivated GUT-scale boundary conditions without having to worry about RG-induced flavour violation. Furthermore, our formalism for defining non-universal soft terms in a basis-independent way should be very useful for further studies of the supersymmetric flavour problem beyond minimal flavour violation. A full exploration, within our scheme, of the parameter space leading to natural SUSY is left for a subsequent work.