The Standard Model (SM) successfully explains the mechanism responsible for the fermion masses but does not explain their values. The arbitrariness of the Yukawa couplings makes the SM adaptable to any spectrum of fermion masses and mixings found experimentally. As it happens, the observed quark masses span five orders of magnitude, with the third generation of quarks being much heavier than the first two. Furthermore, the quark mixings obey the following hierarchical pattern [1]:

$$\begin{aligned} V \approx \begin{pmatrix} 1-\lambda ^2/2 &{} -\lambda &{} {{\mathcal {O}}} \left( \lambda ^3 \right) \\ \lambda &{} 1-\lambda ^2/2 &{} {{\mathcal {O}}} \left( \lambda ^2 \right) \\ {{\mathcal {O}}} \left( \lambda ^3 \right) &{} {{\mathcal {O}}} \left( \lambda ^2 \right) &{} 1 \end{pmatrix}\,, \end{aligned}$$
(1)

where \(\lambda \approx 0.22\) is the Cabibbo mixing parameter and the matrix, V, is known as the Cabibbo–Kobayashi–Maskawa (CKM) matrix [2,3,4]. Within the SM, such hierarchies can only originate from conspiracies among the Yukawa couplings themselves [5]. This is part of the problem usually dubbed the ‘flavor puzzle’ in the literature [6,7,8]. For decades, this aspect of the SM has fueled speculation that a deeper theoretical framework might offer a more natural insight into the flavor structure. This article presents an extension beyond the SM (BSM) with a \(D_4\) symmetry, which can make the quark flavor structure appear more natural (for other works on flavor models using \(D_4\) symmetry, see [9,10,11,12,13,14,15,16,17]). An essential ingredient of our model is that the primary sources of masses for the third generation of quarks have been disentangled from those for the first two generations of quarks. The hierarchies in the quark masses and mixings are then chiefly attributed to the hierarchies in the vacuum expectation values (VEVs) of the different scalar fields. This allows us to relax the Yukawa hierarchies in the quark sector considerably, and it has some new and interesting implications for the CKM matrix. We will describe our model in detail in the upcoming sections.

We start by laying out some of the basics of \(D_4\) symmetry [18]. The discrete group \(D_4\) has five irreducible representations which we label as \({\mathbf {1}}_{++}\), \({\mathbf {1}}_{--}\), \({\mathbf {1}}_{-+}\), \({\mathbf {1}}_{+-}\), and \({\mathbf {2}}\). For the two-dimensional representation of \(D_4\), we opt to work in a basis in which the generators of \(D_4\) are given by

$$\begin{aligned} a= \begin{bmatrix} 0 &{} -1\\ 1 &{} 0 \end{bmatrix}, b= \begin{bmatrix} 1 &{} 0 \\ 0 &{} -1 \end{bmatrix} \,, \end{aligned}$$
(2)

where a is of order 4 and b is of order 2. In this basis, the relevant tensor products in the explicit component form are given by [19]

$$\begin{aligned} {\begin{bmatrix}x_1 \\ x_2 \end{bmatrix}}_{\mathbf {2}} \otimes {\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}}_{\mathbf {2}}= & {} {\begin{bmatrix} x_1 y_1 + x_2 y_2 \end{bmatrix}}_{{\mathbf {1}}_{++}} \oplus {\begin{bmatrix} x_1 y_2 - x_2 y_1 \end{bmatrix}}_{{\mathbf {1}}_{--}} \nonumber \\&\oplus {\begin{bmatrix}x_1 y_2 + x_2 y_1 \end{bmatrix}}_{{\mathbf {1}}_{-+}}\nonumber \\&\oplus {\begin{bmatrix}x_1 y_1 - x_2 y_2 \end{bmatrix}}_{{\mathbf {1}}_{+-}} \,, \end{aligned}$$
(3a)
$$\begin{aligned} {\mathbf {1}}_{r,s} \otimes {\mathbf {1}}_{r',s'}= & {} {\mathbf {1}}_{r\cdot r',s\cdot s'} \,. \end{aligned}$$
(3b)
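The group-theory statements above can be checked mechanically. The following is a minimal sketch in plain Python, assuming only the generators of Eq. (2) and the bilinears of Eq. (3a); the integer test doublets `x` and `y` and all helper names are illustrative choices, not part of the model. It confirms the orders of a and b, the dihedral relation \(bab=a^{-1}\), and that each bilinear in Eq. (3a) is mapped to plus or minus itself by a and b, as a one-dimensional representation must be.

```python
# Check the D4 generators of Eq. (2) and the singlet bilinears of Eq. (3a).

def matmul(p, q):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, x):
    """Act with a 2x2 matrix on a 2-component vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

I2 = [[1, 0], [0, 1]]
a = [[0, -1], [1, 0]]
b = [[1, 0], [0, -1]]

a2 = matmul(a, a)
a4 = matmul(a2, a2)
b2 = matmul(b, b)
a_inv = matmul(a2, a)          # a^3 equals a^{-1} because a^4 = 1
bab = matmul(matmul(b, a), b)  # dihedral relation: b a b = a^{-1}

def bilinears(x, y):
    """The four combinations of Eq. (3a), in the order they are listed there."""
    return [x[0] * y[0] + x[1] * y[1],
            x[0] * y[1] - x[1] * y[0],
            x[0] * y[1] + x[1] * y[0],
            x[0] * y[0] - x[1] * y[1]]

def signs(g, x, y):
    """Sign picked up by each bilinear when both doublets are transformed by g."""
    before = bilinears(x, y)
    after = bilinears(apply(g, x), apply(g, y))
    return [after[i] // before[i] for i in range(4)]

x, y = [2, 3], [5, 7]  # arbitrary integer test doublets (illustrative only)
signs_a = signs(a, x, y)
signs_b = signs(b, x, y)
print("signs under a:", signs_a)
print("signs under b:", signs_b)
```

Only the first combination is invariant under both generators, which is consistent with its identification as \({\mathbf {1}}_{++}\).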

Now we will specify the \(D_4\) transformations of the different fields in our model. The i-th generation of left-handed quark doublet is denoted by \(Q_{iL}\equiv (p_{iL},n_{iL})^T\). The right-handed charged quark singlets are denoted by \(p_{iR}\) and \(n_{iR}\) in the up and down sectors, respectively. We have four scalar doublets in our model, which we symbolize as \(\phi _1\), \(\phi _2\), \(\phi _u\) and \(\phi _d\). These fields are assumed to transform under the \(D_4\) symmetry as follows:

$$\begin{aligned}&\mathbf{2} : \begin{bmatrix} Q_{1L} \\ Q_{2L}\end{bmatrix} \,,\begin{bmatrix} \phi _{1} \\ \phi _{2} \end{bmatrix} \,, \end{aligned}$$
(4a)
$$\begin{aligned}&\mathbf{1} _{++}: n_{1R} \,,\quad \mathbf{1} _{--}: n_{2R}\,, n_{3R}\,, \phi _u \,,\quad \mathbf{1} _{-+}: p_{2R}\,, p_{3R} \,, \phi _d , \nonumber \\&\quad \, \mathbf{1} _{+-}: Q_{3L} \,, p_{1R} \,. \end{aligned}$$
(4b)

As we will see shortly, because of the above transformations, \(\phi _u\) and \(\phi _d\) will couple exclusively to the up and down type quarks respectively, which justifies their labeling. The gauge and \(D_4\) invariant Yukawa Lagrangians in the up and down quark sectors are then given by

$$\begin{aligned} -{\mathscr {L}}_u= & {} A_u({\overline{Q}}_{1L} {\widetilde{\phi }}_1 - {\overline{Q}}_{2L} {\widetilde{\phi }}_2 )p_{1R} + B_u({\overline{Q}}_{1L} {\widetilde{\phi }}_2 \nonumber \\&+ {\overline{Q}}_{2L} {\widetilde{\phi }}_1 ) p_{2R} + C_u({\overline{Q}}_{1L} {\widetilde{\phi }}_2 + {\overline{Q}}_{2L} {\widetilde{\phi }}_1 ) p_{3R} \nonumber \\&+ X_u {\overline{Q}}_{3L} {\widetilde{\phi }}_u p_{2R}\nonumber \\&+ Y_u {\overline{Q}}_{3L} {\widetilde{\phi }}_u p_{3R} \,, \end{aligned}$$
(5a)
$$\begin{aligned} -{\mathscr {L}}_d= & {} A_d({\overline{Q}}_{1L} \phi _1 + {\overline{Q}}_{2L} \phi _2 )n_{1R}\nonumber \\&+ B_d({\overline{Q}}_{1L} \phi _2 - {\overline{Q}}_{2L}\phi _1)n_{2R}\nonumber \\&+ C_d({\overline{Q}}_{1L} \phi _2 - {\overline{Q}}_{2L}\phi _1)n_{3R} \nonumber \\&+ X_d {\overline{Q}}_{3L} \phi _d n_{2R} + Y_d {\overline{Q}}_{3L} \phi _d n_{3R} \,, \end{aligned}$$
(5b)

where \({\widetilde{\phi }}_k = i\sigma _2\phi _k^\star \) with \(\sigma _2\) being the second Pauli matrix. For an intuitive understanding of the upcoming results, we will assume the Yukawa parameters to be real. As such, we will not dwell on the complex phase of the CKM matrix. We will treat the phase as an independent parameter which, as we have checked, can be easily accommodated by allowing the Yukawa couplings to be complex. The mass matrices in the up and down sectors that follow from Eq. (5) are

$$\begin{aligned} M_u= & {} \begin{pmatrix} A_u v_1 &{} B_u v_2 &{} C_u v_2 \\ -A_u v_2 &{} B_u v_1 &{} C_u v_1 \\ 0 &{} X_u v_u &{} Y_u v_u \end{pmatrix} \,,\nonumber \\ M_d= & {} \begin{pmatrix} A_d v_1 &{} B_d v_2 &{} C_d v_2 \\ A_d v_2 &{} -B_d v_1 &{} -C_d v_1 \\ 0 &{} X_d v_d &{} Y_d v_d \end{pmatrix}, \end{aligned}$$
(6)

where \(v_1\), \(v_2\), \(v_u\) and \(v_d\) represent the VEVs of \(\phi _1\), \(\phi _2\), \(\phi _u\) and \(\phi _d\), respectively, with the total electroweak VEV, v, being defined through the relation

$$\begin{aligned} v^2 = v_1^2 +v_2^2 +v_u^2 +v_d^2 \approx (174 ~\mathrm{GeV})^2 \,. \end{aligned}$$
(7)

The diagonal mass matrices can then be obtained via the following biunitary transformations:

$$\begin{aligned} D_u= & {} U_u M_u V_{u}^\dagger = \mathrm{diag}(m_u,~ m_c,~ m_t ) \,, \end{aligned}$$
(8a)
$$\begin{aligned} D_d= & {} U_d M_d V_{d}^\dagger = \mathrm{diag}(m_d,~ m_s,~ m_b ) \,. \end{aligned}$$
(8b)

Following this convention for the biunitary transformations, the CKM matrix will be given by

$$\begin{aligned} V_{\mathrm{CKM}}= U_u U_d^\dagger \,. \end{aligned}$$
(9)

The matrices \(U_u\) and \(U_d\) are obtained by diagonalizing \(M_u M_u^\dagger \) and \(M_d M_d^\dagger \) respectively, which can be calculated from Eq. (6) as follows:

$$\begin{aligned} M_u M_u^\dagger= & {} \begin{pmatrix} A_u^2 v_1^2 + (B_u^2 + C_u^2) v_2^2 &{} (-A_u^2 + B_u^2 + C_u^2) v_1 v_2 &{} (C_u Y_u + B_u X_u) v_2 v_u \\ (-A_u^2 + B_u^2 + C_u^2) v_1 v_2 &{} (B_u^2 + C_u^2) v_1^2 + A_u^2 v_2^2 &{} (C_u Y_u + B_u X_u) v_1 v_u \\ (C_u Y_u + B_u X_u) v_2 v_u &{} (C_u Y_u + B_u X_u) v_1 v_u &{} (Y_u^2 + {X_u}^2) v_u^2 \end{pmatrix} \,, \end{aligned}$$
(10a)
$$\begin{aligned} M_d M_d^\dagger= & {} \begin{pmatrix} A_d^2 v_1^2 + (B_d^2 + C_d^2) v_2^2 &{} (A_d^2 - B_d^2 - C_d^2) v_1 v_2 &{} (C_d Y_d + B_d X_d) v_2 v_d \\ (A_d^2 - B_d^2 - C_d^2) v_1 v_2 &{} (B_d^2 + C_d^2) v_1^2 + A_d^2 v_2^2 &{} -(C_d Y_d + B_d X_d) v_1 v_d \\ (C_d Y_d + B_d X_d) v_2 v_d &{} -(C_d Y_d + B_d X_d) v_1 v_d &{} (Y_d^2 + {X_d}^2) v_d^2 \end{pmatrix} \,. \end{aligned}$$
(10b)
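The entries of Eq. (10) can be cross-checked numerically. The sketch below builds \(M_u\) and \(M_d\) from Eq. (6) for real couplings and compares \(MM^T\) with the closed forms of Eqs. (10a) and (10b). The numerical inputs are arbitrary test values chosen for illustration (they are not a benchmark of the model), and a single set of couplings is reused for both sectors purely for brevity.

```python
# Compare M M^T, computed numerically from Eq. (6), with the closed forms of Eq. (10).
# All numbers below are arbitrary test values, not taken from the paper.

def m_mt(m):
    """Return M M^T for a real 3x3 matrix given as nested lists."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_abs_diff(p, q):
    """Largest entry-wise absolute difference between two 3x3 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(3) for j in range(3))

A, B, C, X, Y = 0.3, 1.1, 0.7, 0.9, 0.5  # real Yukawa couplings (hypothetical)
v1, v2, v3 = 1.5, 0.4, 40.0              # v1, v2 and the third VEV (v_u or v_d)

# Mass matrices of Eq. (6)
M_u = [[A * v1, B * v2, C * v2], [-A * v2, B * v1, C * v1], [0.0, X * v3, Y * v3]]
M_d = [[A * v1, B * v2, C * v2], [A * v2, -B * v1, -C * v1], [0.0, X * v3, Y * v3]]

bc = B**2 + C**2
mix = C * Y + B * X

# Closed form of Eq. (10a)
H_u = [[A**2 * v1**2 + bc * v2**2, (bc - A**2) * v1 * v2, mix * v2 * v3],
       [(bc - A**2) * v1 * v2, bc * v1**2 + A**2 * v2**2, mix * v1 * v3],
       [mix * v2 * v3, mix * v1 * v3, (X**2 + Y**2) * v3**2]]

# Closed form of Eq. (10b)
H_d = [[A**2 * v1**2 + bc * v2**2, (A**2 - bc) * v1 * v2, mix * v2 * v3],
       [(A**2 - bc) * v1 * v2, bc * v1**2 + A**2 * v2**2, -mix * v1 * v3],
       [mix * v2 * v3, -mix * v1 * v3, (X**2 + Y**2) * v3**2]]

diff_u = max_abs_diff(m_mt(M_u), H_u)
diff_d = max_abs_diff(m_mt(M_d), H_d)
print("max deviation (up, down):", diff_u, diff_d)
```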

As a matter of fact, both \(M_uM_u^\dagger \) and \(M_dM_d^\dagger \) can be fully diagonalized analytically by applying the following matrices in sequence:

$$\begin{aligned}&O_\beta = \begin{pmatrix} \cos \beta &{} -\sin \beta &{} 0\\ \sin \beta &{} \cos \beta &{} 0\\ 0 &{}0 &{} 1 \end{pmatrix}\,,\nonumber \\&O_\theta ^{u,d}= \begin{pmatrix} 1 &{} 0 &{} 0\\ 0 &{} \cos \theta _{u,d} &{} -\sin \theta _{u,d} \\ 0 &{} \sin \theta _{u,d} &{} \cos \theta _{u,d} \end{pmatrix}\,, \end{aligned}$$
(11)

where \(\tan \beta =v_2/v_1\) and \(\theta _{u,d}\) will be defined shortly. As a first step, we notice that \(M_uM_u^\dagger \) and \(M_dM_d^\dagger \) can be block diagonalized using \(O_\beta \) as

$$\begin{aligned}&(M_u^2)_{\mathrm{Block}} \equiv O_\beta M_u M_u^\dagger O_\beta ^\dagger \nonumber \\&\quad = \begin{pmatrix} A_u^2 v_{12}^2 &{} 0 &{} 0\\ 0 &{} (B_u^2 + C_u^2) v_{12}^2 &{} (C_u Y_u + B_u X_u) v_{12} v_u \\ 0 &{} (C_u Y_u + B_u X_u) v_{12} v_u &{} (Y_u^2 + {X_u}^2) v_u^2 \end{pmatrix}, \end{aligned}$$
(12a)
$$\begin{aligned}&(M_d^2)_{\mathrm{Block}} \equiv O_\beta ^\dagger M_dM_d^\dagger O_\beta \nonumber \\&\quad = \begin{pmatrix} A_d^2 v_{12}^2 &{} 0 &{} 0\\ 0 &{} (B_d^2 + C_d^2) v_{12}^2 &{} -(C_d Y_d + B_d X_d) v_{12} v_d \\ 0 &{} -(C_d Y_d + B_d X_d) v_{12} v_d &{} (Y_d^2 + {X_d}^2) v_d^2 \end{pmatrix}, \end{aligned}$$
(12b)

where, as we will see shortly, \(v_{12}^2 = v_1^2 + v_2^2\) is the total VEV that is primarily responsible for the light quark masses. Quite clearly, the remaining \(2\times 2\) block in the up and down sectors can be diagonalized using \(O_\theta ^u\) and \(O_\theta ^d\), respectively. This second stage of diagonalization allows us to express \(\theta _u\) and \(\theta _d\) in terms of the Yukawa couplings and the VEVs as follows:

$$\begin{aligned} \tan 2\theta _u= & {} \frac{2(C_u Y_u + B_u X_u) v_{12} v_u}{(Y_u^2 + {X_u}^2) v_u^2- (B_u^2 + C_u^2) v_{12}^2} \,, \end{aligned}$$
(13a)
$$\begin{aligned} \tan 2\theta _d= & {} -\frac{2(C_d Y_d + B_d X_d) v_{12} v_d}{(Y_d^2 + {X_d}^2) v_d^2- (B_d^2 + C_d^2) v_{12}^2} \,. \end{aligned}$$
(13b)

Thus, the full diagonalization in the up and down sectors can be expressed as

$$\begin{aligned} D_{u}^2 = O_\theta ^u O_\beta (M_uM_u^\dagger ) O_\beta ^\dagger {O_\theta ^u}^\dagger\equiv & {} \mathrm{diag} (m_{u}^2, m_{c}^2, m_{t}^2) \,, \end{aligned}$$
(14a)
$$\begin{aligned} D_{d}^2= O_\theta ^d O_\beta ^\dagger (M_dM_d^\dagger ) O_\beta {O_\theta ^d}^\dagger\equiv & {} \mathrm{diag} (m_{d}^2, m_{s}^2, m_{b}^2)\,. \end{aligned}$$
(14b)
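The two-step diagonalization of Eq. (14) can be verified numerically. The following sketch assumes real couplings, takes \(\beta \) from \(\tan \beta = v_2/v_1\) and \(\theta _{u,d}\) from Eq. (13), and checks that the off-diagonal entries of the rotated matrices vanish to machine precision. The input values are hypothetical test numbers and the helper names are illustrative.

```python
import math

def matmul(p, q):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def o_beta(beta):
    """Rotation in the 1-2 plane, first matrix of Eq. (11)."""
    c, s = math.cos(beta), math.sin(beta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def o_theta(theta):
    """Rotation in the 2-3 plane, second matrix of Eq. (11)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rotate(U, M):
    """Return U (M M^T) U^T for real matrices."""
    H = matmul(M, transpose(M))
    return matmul(matmul(U, H), transpose(U))

def max_offdiag(D):
    return max(abs(D[i][j]) for i in range(3) for j in range(3) if i != j)

# Hypothetical test inputs (not a benchmark of the model)
v1, v2, vu, vd = 1.5, 0.4, 40.0, 25.0
Au, Bu, Cu, Xu, Yu = 0.3, 1.1, 0.7, 0.9, 0.5
Ad, Bd, Cd, Xd, Yd = 0.2, 0.6, 0.8, 0.4, 0.3

v12 = math.hypot(v1, v2)
beta = math.atan2(v2, v1)

# Eq. (13): atan2 picks one solution of tan(2 theta) = numerator / denominator
theta_u = 0.5 * math.atan2(2 * (Cu * Yu + Bu * Xu) * v12 * vu,
                           (Yu**2 + Xu**2) * vu**2 - (Bu**2 + Cu**2) * v12**2)
theta_d = 0.5 * math.atan2(-2 * (Cd * Yd + Bd * Xd) * v12 * vd,
                           (Yd**2 + Xd**2) * vd**2 - (Bd**2 + Cd**2) * v12**2)

# Mass matrices of Eq. (6)
M_u = [[Au * v1, Bu * v2, Cu * v2], [-Au * v2, Bu * v1, Cu * v1], [0.0, Xu * vu, Yu * vu]]
M_d = [[Ad * v1, Bd * v2, Cd * v2], [Ad * v2, -Bd * v1, -Cd * v1], [0.0, Xd * vd, Yd * vd]]

# Eq. (15)
U_u = matmul(o_theta(theta_u), o_beta(beta))
U_d = matmul(o_theta(theta_d), transpose(o_beta(beta)))

D_u = rotate(U_u, M_u)
D_d = rotate(U_d, M_d)
print("largest off-diagonal entries:", max_offdiag(D_u), max_offdiag(D_d))
```

The first diagonal entries come out as \(A_u^2 v_{12}^2\) and \(A_d^2 v_{12}^2\), in line with the block structure of Eq. (12).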

Following our convention in Eq. (8), the matrices \(U_u\) and \(U_d\) can be extracted as follows:

$$\begin{aligned} U_u = O_\theta ^u O_\beta \,, \qquad U_d = O_\theta ^d O_\beta ^\dagger \,. \end{aligned}$$
(15)

Thus from Eq. (9), the CKM matrix is obtained as

$$\begin{aligned} V_{\mathrm{CKM}}= \begin{pmatrix} \cos 2 \beta &{} - \cos \theta _d \sin 2\beta &{} - \sin 2\beta \sin \theta _d\\ \cos \theta _u \sin 2 \beta &{} \cos 2 \beta \cos \theta _d \cos \theta _u + \sin \theta _d \sin \theta _u &{} \cos 2 \beta \cos \theta _u \sin \theta _d - \cos \theta _d \sin \theta _u\\ \sin 2 \beta \sin \theta _u &{} -\cos \theta _u \sin \theta _d + \cos 2 \beta \cos \theta _d \sin \theta _u &{} \cos \theta _d \cos \theta _u + \cos 2 \beta \sin \theta _d \sin \theta _u \end{pmatrix} . \end{aligned}$$
(16)
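Since Eq. (16) is a product of three plane rotations, \(V_{\mathrm{CKM}} = O_\theta ^u O_\beta O_\beta {O_\theta ^d}^\dagger \), it is easy to confirm entry by entry. The sketch below multiplies the matrices of Eq. (15) for an arbitrary set of angles (illustrative values only) and compares the result with the closed form.

```python
import math

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def o_beta(beta):
    c, s = math.cos(beta), math.sin(beta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def o_theta(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def ckm_product(beta, theta_u, theta_d):
    """V = U_u U_d^T with U_u and U_d taken from Eq. (15) (real case)."""
    U_u = matmul(o_theta(theta_u), o_beta(beta))
    U_d = matmul(o_theta(theta_d), transpose(o_beta(beta)))
    return matmul(U_u, transpose(U_d))

def ckm_closed(beta, theta_u, theta_d):
    """Closed form of Eq. (16)."""
    c2, s2 = math.cos(2 * beta), math.sin(2 * beta)
    cu, su = math.cos(theta_u), math.sin(theta_u)
    cd, sd = math.cos(theta_d), math.sin(theta_d)
    return [[c2, -cd * s2, -s2 * sd],
            [cu * s2, c2 * cd * cu + sd * su, c2 * cu * sd - cd * su],
            [s2 * su, -cu * sd + c2 * cd * su, cd * cu + c2 * sd * su]]

beta, theta_u, theta_d = 0.37, 0.21, -0.48  # arbitrary test angles
V1 = ckm_product(beta, theta_u, theta_d)
V2 = ckm_closed(beta, theta_u, theta_d)
diff = max(abs(V1[i][j] - V2[i][j]) for i in range(3) for j in range(3))
print("max deviation from Eq. (16):", diff)
```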

To make the connection between Eqs. (16) and (1) apparent, we assume that \(v_{12}\) is responsible for the masses of the first two generations of quarks whereas \(v_u\) and \(v_d\) primarily contribute to the third generation masses in the up and down sector, respectively. Therefore, it is quite natural to expect \(v_{12}\ll v_{u,d}\). From Eq. (12) we identify the first generation quark masses as

$$\begin{aligned} m_u^2 = A_u^2 v_{12}^2 \,, \qquad m_d^2 = A_d^2 v_{12}^2 \,. \end{aligned}$$
(17)

Furthermore, using the VEV hierarchy \(v_{u,d}\gg v_{12}\) we can approximate Eq. (13) as

$$\begin{aligned} \theta _u\approx & {} \frac{(C_u Y_u + B_u X_u)}{(Y_u^2 + {X_u}^2)}\frac{v_{12}}{v_u} \approx {{\mathcal {O}}} \left( \frac{v_{12}}{v_u} \right) \,, \end{aligned}$$
(18a)
$$\begin{aligned} \theta _d\approx & {} -\frac{(C_d Y_d + B_d X_d)}{(Y_d^2 + {X_d}^2)}\frac{v_{12}}{v_d} \approx {{\mathcal {O}}} \left( \frac{v_{12}}{v_d} \right) \,, \end{aligned}$$
(18b)

where we are implicitly assuming that the involved Yukawa couplings have similar orders of magnitude. It is also quite reasonable to take \(v_{12}\sim {{\mathcal {O}}} \left( 1~\mathrm{GeV} \right) \) and \(v_{u,d}\sim {{\mathcal {O}}} \left( 100~\mathrm{GeV} \right) \) so that the ratio \(v_{12}/v_{u,d}\) comes out to be \({{\mathcal {O}}} \left( \lambda ^2 \right) \). Therefore, from Eq. (18) we conclude

$$\begin{aligned} \sin \theta _{u,d} \approx {{\mathcal {O}}} \left( \lambda ^2 \right) \,, \qquad \cos \theta _{u,d} \approx {{\mathcal {O}}} \left( 1 \right) \,. \end{aligned}$$
(19)

Moreover, if we identify \(\sin 2\beta \) as the Cabibbo mixing, namely,

$$\begin{aligned} \sin 2\beta = \lambda \,, \end{aligned}$$
(20)

then Eq. (16) reproduces the structure of Eq. (1). All these intuitive results will be validated later by providing explicit numerical benchmarks.
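To make the comparison explicit, one can expand Eq. (16) with \(\sin 2\beta =\lambda \) and \(\theta _{u,d}\sim {{\mathcal {O}}} \left( \lambda ^2 \right) \). The following is a sketch of that leading-order expansion, obtained here by using \(\cos 2\beta \approx 1-\lambda ^2/2\), \(\sin \theta _{u,d}\approx \theta _{u,d}\) and \(\cos \theta _{u,d}\approx 1\), and dropping terms of \({{\mathcal {O}}} \left( \lambda ^4 \right) \) and higher:

$$\begin{aligned} V_{\mathrm{CKM}} \approx \begin{pmatrix} 1-\lambda ^2/2 &{} -\lambda &{} -\lambda \theta _d \\ \lambda &{} 1-\lambda ^2/2 &{} \theta _d-\theta _u \\ \lambda \theta _u &{} \theta _u-\theta _d &{} 1 \end{pmatrix} \,. \end{aligned}$$

In this approximation the (2,3) and (3,2) entries are of \({{\mathcal {O}}} \left( \lambda ^2 \right) \), while the (1,3) and (3,1) entries are of \({{\mathcal {O}}} \left( \lambda ^3 \right) \), matching the pattern of Eq. (1).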

Given the structure of the CKM matrix predicted by the model as a function of \(\beta \) and \(\theta _{u,d}\), shown in Eq. (16), it is possible to extract the quark mixing angles by comparing the CKM matrix with the standard parametrization [20]. This, in turn, allows us to find the following best-fit values of \(\beta \) and \(\theta _{u,d}\) such that the quark mixing angles are compatible with the observed values [21]:

$$\begin{aligned} \sin 2\beta \approx 0.2265\,, \qquad \theta _u \approx \pm 0.025, \qquad \theta _d \approx \mp 0.016. \end{aligned}$$
(21)

As expected, the above values for \(\sin 2\beta \) and \(\theta _{u,d}\) conform well to our intuitive expectations of Eqs. (19) and (20). Fixing \(\sin 2\beta \) at its best-fit value, in Fig. 1 we display the region in the \(\sin \theta _u\)-\(\sin \theta _d\) plane allowed by the experimental uncertainties.
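As a rough illustration of how the mixing angles follow from Eq. (21), the sketch below evaluates the relevant entries of Eq. (16) at the quoted best-fit values (upper signs) and reads off the angles of the standard parametrization through \(|V_{ub}|=s_{13}\), \(|V_{us}|=s_{12}c_{13}\) and \(|V_{cb}|=s_{23}c_{13}\). It assumes real couplings (no CP phase) and takes \(2\beta \) in the first quadrant; the function names are illustrative.

```python
import math

def ckm_entries(sin2b, theta_u, theta_d):
    """Return (V_us, V_ub, V_cb) from the closed form of Eq. (16)."""
    cos2b = math.sqrt(1.0 - sin2b**2)  # assumes 2*beta lies in the first quadrant
    cu, su = math.cos(theta_u), math.sin(theta_u)
    cd, sd = math.cos(theta_d), math.sin(theta_d)
    v_us = -cd * sin2b
    v_ub = -sin2b * sd
    v_cb = cos2b * cu * sd - cd * su
    return v_us, v_ub, v_cb

def mixing_angles(v_us, v_ub, v_cb):
    """Sines of the standard-parametrization angles, ignoring the CP phase."""
    s13 = abs(v_ub)
    c13 = math.sqrt(1.0 - s13**2)
    return abs(v_us) / c13, s13, abs(v_cb) / c13

# Best-fit values quoted in Eq. (21), upper signs
s12, s13, s23 = mixing_angles(*ckm_entries(0.2265, 0.025, -0.016))
print(f"sin(theta12) = {s12:.4f}, sin(theta13) = {s13:.4f}, sin(theta23) = {s23:.4f}")
```

Because \(\theta _u\) and \(\theta _d\) enter \(V_{cb}\) approximately through the difference \(\theta _d-\theta _u\), their opposite signs in Eq. (21) add up in magnitude.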

For the sake of completeness, we also calculate the mass eigenvalues for the second and third generation of quarks by diagonalizing the \(2\times 2\) submatrices in Eq. (12). In the up quark sector, we can compare the traces to write

$$\begin{aligned} m_c^2 +m_t^2 = (B_u^2 + C_u^2) v_{12}^2 + (Y_u^2 + {X_u}^2) v_u^2 \,. \end{aligned}$$
(22)

Fig. 1 A representative allowed region in the \(\sin \theta _u\)-\(\sin \theta _d\) plane from the uncertainties in \(\theta _{23}\) and \(\theta _{13}\). For this plot, we have fixed \(\sin 2\beta \) at its best-fit value given in Eq. (21). The best-fit point in the \(\sin \theta _u\)-\(\sin \theta _d\) plane is marked with a star \((\star )\)

Keeping in mind the hierarchies, \(v_u\gg v_{12}\) and \(m_t \gg m_c\), the above relation can be approximated to express the top quark mass as

$$\begin{aligned} m_t^2 \approx (Y_u^2 + {X_u}^2) v_u^2 \,. \end{aligned}$$
(23)

Again, from the determinant of the \(2\times 2\) block in Eq. (12a), we may write

$$\begin{aligned} m_c^2 m_t^2 = (B_uY_u-C_uX_u)^2 v_{12}^2 v_u^2 \,. \end{aligned}$$
(24)

Using the expression for \(m_t\) from Eq. (23), we can extract the charm quark mass as

$$\begin{aligned} m_c^2 \approx \frac{(B_uY_u-C_uX_u)^2}{(Y_u^2 + {X_u}^2)} v_{12}^2 \,. \end{aligned}$$
(25)

Following the same steps in the down sector, we can obtain

$$\begin{aligned} m_s^2\approx & {} \frac{(B_dY_d -C_dX_d)^2}{(Y_d^2 + {X_d}^2)} v_{12}^2 \,, \end{aligned}$$
(26)
$$\begin{aligned} m_b^2\approx & {} (Y_d^2 + {X_d}^2) v_d^2 \,. \end{aligned}$$
(27)
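To see how good the approximations in Eqs. (23) and (25) are, one can compare them with the exact eigenvalues of the \(2\times 2\) block of Eq. (12a). The sketch below does this for the up-sector values quoted in the benchmark of Eq. (29); the exact eigenvalues are obtained from the trace and determinant, and the lighter one is computed as determinant over heavier eigenvalue to avoid numerical cancellation.

```python
import math

# Up-sector inputs taken from the benchmark of Eq. (29); VEVs in GeV
B_u, C_u, X_u, Y_u = 1.69, 1.50, 1.04, 0.49
v12, v_u = 2.0, 150.0

# Entries of the 2x2 block in Eq. (12a)
a = (B_u**2 + C_u**2) * v12**2
d = (Y_u**2 + X_u**2) * v_u**2
b = (C_u * Y_u + B_u * X_u) * v12 * v_u

trace = a + d                     # m_c^2 + m_t^2, Eq. (22)
det = a * d - b * b               # m_c^2 m_t^2, Eq. (24)

mt_exact = math.sqrt(0.5 * (trace + math.sqrt(trace**2 - 4.0 * det)))
mc_exact = math.sqrt(det) / mt_exact

mt_approx = math.sqrt(Y_u**2 + X_u**2) * v_u                            # Eq. (23)
mc_approx = abs(B_u * Y_u - C_u * X_u) / math.sqrt(Y_u**2 + X_u**2) * v12  # Eq. (25)

print(f"m_t: exact {mt_exact:.2f} GeV, approx {mt_approx:.2f} GeV")
print(f"m_c: exact {mc_exact:.4f} GeV, approx {mc_approx:.4f} GeV")
```

For these inputs the approximate and exact masses agree at the sub-percent level, as expected when \(v_{12}/v_u\) is small.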

At this point, we wish to emphasize that, assuming the Yukawa couplings within a given sector to be of similar size, an obvious outcome of our model is

$$\begin{aligned} \frac{m_c}{m_t} \approx \frac{v_{12}}{v_{u}} \sim {{\mathcal {O}}} \left( \lambda ^2 \right) \,, \qquad \frac{m_s}{m_b} \approx \frac{v_{12}}{v_{d}} \sim {{\mathcal {O}}} \left( \lambda ^2 \right) \,, \end{aligned}$$
(28)

which agrees with the observations.

From Eqs. (23) and (25), we see that the third and second generation masses are controlled by \(v_u\) and \(v_{12}\), respectively. One may wonder how perturbativity constrains the model at hand, since \(m_t \approx {{\mathcal {O}}} \left( v_u \right) \) and \(m_c \approx {{\mathcal {O}}} \left( v_{12} \right) \) already [22]. Figure 2 illustrates how arbitrarily low values of \(v_{12}\) may jeopardize the perturbativity of the theory. Choosing \(v_u = 150\) GeV and two example values, \(X_u = 0.7\) and 0.9, we see from Fig. 2 that keeping \(B_u\) and \(C_u\) in the perturbative regime requires \(v_{12}\ge {\mathcal {O}}(1~\text {GeV})\).

Fig. 2 Left: Values of \(|B_u|\) (blue) and \(|C_u |\) (orange) compatible with the experimental values for the quark masses and mixing angles, for the benchmark points of \( X_u = 0.9\) (solid line) and \(X_u = 0.7\) (dashed), as a function of \(v_{12}\). The perturbative bound is marked in red. Right: \(|B_u |\) and \(|C_u|\) are perturbative in the shaded region in the \(X_u\)-\(v_{12}\) plane. We have assumed \(v_u = 150\) GeV and \(\sin \theta _u = 0.025\) for both cases
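The scaling behind Fig. 2 can be sketched with the leading-order relations alone. Given \(X_u\), \(v_u\) and \(m_t\), Eq. (23) fixes \(Y_u\); Eqs. (18a) and (25) then give two linear equations for \(B_u\) and \(C_u\), both of which scale as \(1/v_{12}\). The code below is an illustrative reconstruction under these assumptions, not the procedure used to produce the figure. The perturbativity bound \(\sqrt{4\pi }\), the positive root for \(Y_u\), the sign of \(B_uY_u-C_uX_u\), and the function name are all assumptions made here for illustration.

```python
import math

PERTURBATIVE_BOUND = math.sqrt(4.0 * math.pi)  # assumed bound, not stated in the text

def charm_sector_yukawas(v12, X_u, v_u=150.0, sin_theta_u=0.025,
                         m_c=1.27, m_t=173.0, sign=-1.0):
    """Leading-order estimate of (B_u, C_u) from Eqs. (18a), (23) and (25).

    `sign` is the assumed sign of (B_u Y_u - C_u X_u), which Eq. (25) leaves open.
    Masses and VEVs are in GeV.
    """
    r2 = (m_t / v_u) ** 2                    # X_u^2 + Y_u^2, from Eq. (23)
    if X_u**2 > r2:
        raise ValueError("X_u too large to reproduce m_t at this v_u")
    Y_u = math.sqrt(r2 - X_u**2)             # positive root chosen (assumption)
    theta_u = math.asin(sin_theta_u)
    s = theta_u * r2 * v_u / v12             # C_u Y_u + B_u X_u, from Eq. (18a)
    t = sign * m_c * math.sqrt(r2) / v12     # B_u Y_u - C_u X_u, from Eq. (25)
    B_u = (s * X_u + t * Y_u) / r2
    C_u = (s * Y_u - t * X_u) / r2
    return B_u, C_u

for v12 in (0.5, 1.0, 2.0, 4.0):
    B_u, C_u = charm_sector_yukawas(v12, X_u=0.9)
    ok = max(abs(B_u), abs(C_u)) < PERTURBATIVE_BOUND
    print(f"v12 = {v12:3.1f} GeV: |B_u| = {abs(B_u):.2f}, |C_u| = {abs(C_u):.2f}, perturbative: {ok}")
```

With these assumptions the couplings leave the perturbative range once \(v_{12}\) drops well below about 1 GeV, which is the qualitative behavior described above.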

Finally, to provide explicit justification to these intuitive expectations, we consider the following benchmark

$$\begin{aligned} \begin{array}{lll} v_{12} = {2~\mathrm{GeV}}, &{} v_u ={150~\mathrm{GeV}}, &{} v_d \approx {88~\mathrm{GeV}}, \\ A_u \approx 1.08 \times 10^{-3}, &{} B_u \approx 1.69, &{} C_u \approx 1.50, \\ X_u \approx 1.04, &{} Y_u \approx 0.49, &{} \\ A_d \approx 2.34 \times 10^{-3}, &{} B_d \approx 3.65 \times 10^{-2}, &{} C_d \approx 4.41 \times 10^{-2}, \\ X_d \approx 4.73 \times 10^{-2}, &{} Y_d \approx -3.20 \times 10^{-3}, &{} \end{array} \end{aligned}$$
(29)

which results in the following values of the quark masses and mixing angles

$$\begin{aligned}&m_u = {2.2~\mathrm{MeV}} \,, \qquad m_c ={1.27~\mathrm{GeV}} \,, \qquad m_t ={173~\mathrm{GeV}} \,, \end{aligned}$$
(30a)
$$\begin{aligned}&m_d = {4.7~\mathrm{MeV}} \,, \qquad m_s ={0.093~\mathrm{GeV}} \,, \qquad m_b ={4.18~\mathrm{GeV}} \,, \end{aligned}$$
(30b)
$$\begin{aligned}&\sin \theta _{12} = 0.2265 \,, \qquad \sin \theta _{13} = 0.0036 \,, \qquad \sin \theta _{23} = 0.041 \,, \end{aligned}$$
(30c)

which are in agreement with the corresponding observations [21].
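The benchmark can be reproduced from the analytic formulas alone. The sketch below takes the inputs of Eq. (29), computes the masses from Eq. (17) and from the exact eigenvalues of the \(2\times 2\) blocks in Eq. (12), obtains \(\theta _{u,d}\) from Eq. (13), and then evaluates the mixing angles from Eq. (16). Since Eq. (29) does not list \(\beta \), the value \(\sin 2\beta = 0.2265\) from Eq. (21) is assumed; real couplings are assumed throughout, and the function name is illustrative.

```python
import math

def sector(A, B, C, X, Y, v12, v3, sign):
    """Masses (GeV) and 2-3 rotation angle for one sector.

    sign = +1 for the up sector, -1 for the down sector: it is the sign of the
    off-diagonal entry of the 2x2 block in Eq. (12).
    """
    a = (B**2 + C**2) * v12**2
    d = (X**2 + Y**2) * v3**2
    b = sign * (C * Y + B * X) * v12 * v3
    trace, det = a + d, a * d - b * b
    heavy_sq = 0.5 * (trace + math.sqrt(trace**2 - 4.0 * det))
    light_sq = det / heavy_sq                # avoids cancellation
    theta = 0.5 * math.atan2(2.0 * b, d - a)  # Eq. (13)
    return abs(A) * v12, math.sqrt(light_sq), math.sqrt(heavy_sq), theta

# Inputs of Eq. (29); VEVs in GeV
v12, v_u, v_d = 2.0, 150.0, 88.0
m_u, m_c, m_t, theta_u = sector(1.08e-3, 1.69, 1.50, 1.04, 0.49, v12, v_u, +1)
m_d, m_s, m_b, theta_d = sector(2.34e-3, 3.65e-2, 4.41e-2, 4.73e-2, -3.20e-3, v12, v_d, -1)

v_total = math.sqrt(v12**2 + v_u**2 + v_d**2)  # Eq. (7)

# Mixing angles from Eq. (16); sin(2 beta) is taken from Eq. (21) (assumption)
sin2b = 0.2265
cos2b = math.sqrt(1.0 - sin2b**2)
cu, su = math.cos(theta_u), math.sin(theta_u)
cd, sd = math.cos(theta_d), math.sin(theta_d)
s13 = abs(-sin2b * sd)
c13 = math.sqrt(1.0 - s13**2)
s12 = abs(-cd * sin2b) / c13
s23 = abs(cos2b * cu * sd - cd * su) / c13

print(f"v = {v_total:.1f} GeV")
print(f"m_u = {m_u * 1e3:.2f} MeV, m_c = {m_c:.3f} GeV, m_t = {m_t:.1f} GeV")
print(f"m_d = {m_d * 1e3:.2f} MeV, m_s = {m_s:.4f} GeV, m_b = {m_b:.2f} GeV")
print(f"theta_u = {theta_u:.4f}, theta_d = {theta_d:.4f}")
print(f"s12 = {s12:.4f}, s13 = {s13:.4f}, s23 = {s23:.4f}")
```

The small differences from the rounded values in Eq. (30) are consistent with the benchmark couplings being quoted to three significant figures.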

In passing, let us highlight the most notable outcomes of our model:

  • The hierarchy of the Yukawa couplings is diluted by at least two orders of magnitude. Recall that, in the SM, \(m_t=174\) GeV and \(m_{u,d}\sim {{\mathcal {O}}} \left( 10^{-3}~\mathrm{GeV} \right) \) imply that the quark Yukawa couplings span five orders of magnitude. We alleviate this problem by assuming that the first two generations of quarks receive their masses from \(v_{12}\), which is of \({{\mathcal {O}}} \left( 1~\mathrm{GeV} \right) \). This means that the first generation Yukawas are, at worst, of \({{\mathcal {O}}} \left( 10^{-3} \right) \) whereas the second generation Yukawas can be of \({{\mathcal {O}}} \left( 1 \right) \). This feature is evident from the benchmark values given in Eq. (29).

  • We have introduced \(\phi _{u,d}\), which are dedicated to the masses of the third generation of quarks. Quite naturally, we expect \(v_{u,d}\sim {{\mathcal {O}}} \left( 100~\mathrm{GeV} \right) \) so that the top Yukawa is of \({{\mathcal {O}}} \left( 1 \right) \). Thus, we should have the ratio \(v_{12}/v_{u,d}\sim {{\mathcal {O}}} \left( \lambda ^2 \right) \). It is interesting to note that this automatically conforms to \(m_2/m_3\approx v_{12}/v_{u,d}\sim {{\mathcal {O}}} \left( \lambda ^2 \right) \), where \(m_k\) is the mass of the k-th generation quark. This is a natural outcome of our model.

  • We have connected the quark mixings with the dynamics of the scalar sector. We have shown that the Cabibbo part of the quark mixing stems purely from the ratio \(v_2/v_1\) (see Eq. (20)). The smallness of the off-Cabibbo elements of the CKM matrix is further connected to the VEV hierarchy \(v_{12}\ll v_{u,d}\). In other words, we are suggesting that the fact that the third generation of quarks is much heavier than the first two generations is intimately connected to the smallness of the off-Cabibbo elements.

Finally, our model is not only about aesthetics; it has other observable consequences too. The fact that the Yukawa Lagrangian of our model contains fewer parameters than that of the SM comes at the cost of a substantially more involved scalar potential containing four scalar doublets. This means that the Higgs boson observed at the LHC is not the only fundamental scalar in nature; it is just the first in a series of others to follow. The physical Higgs bosons will emerge from mixings among the four scalar doublets. Expanding the scalar doublets as

$$\begin{aligned} \phi _k = \begin{pmatrix} \varphi _k^+ \\ v_k + (h_k+iz_k)/\sqrt{2} \end{pmatrix} \,, \qquad k = 1,2, u,d, \end{aligned}$$
(31)

after the spontaneous symmetry breaking, the SM-like Higgs boson, h, can be extracted as follows [23]:

$$\begin{aligned} h = \frac{1}{v}(v_1h_1 +v_2h_2 +v_uh_u +v_dh_d) \,. \end{aligned}$$
(32)

This particular linear combination of the component fields will mimic the SM Higgs in its tree-level couplings and will not induce flavor changing neutral currents (FCNCs) at tree level. However, the other physical neutral scalars will, in general, possess tree-level FCNCs, which means they have to be quite heavy to evade the experimental constraints. To gain some intuition about the FCNC couplings, we analyze the matrices, \(N_d^{1,2,d}\), which control them in the down sector. Below, we show approximate expressions for these matrices:

$$\begin{aligned} N_d^1\approx & {} \frac{1}{\sqrt{2} v_{12}} \begin{pmatrix} m_d \cos \beta &{} - m_s \sin \beta &{} m_b \theta _d \sin \beta \\ -m_d \sin \beta &{} -m_s \cos \beta &{} m_b \theta _d \cos \beta \\ -m_d \theta _d \sin \beta &{} -m_s \theta _d \cos \beta &{} m_b \theta _d^2 \cos \beta \end{pmatrix}, \end{aligned}$$
(33a)
$$\begin{aligned} N_d^2\approx & {} \frac{1}{\sqrt{2} v_{12}} \begin{pmatrix} m_d \sin \beta &{} m_s \cos \beta &{} -m_b \theta _d\cos \beta \\ m_d \cos \beta &{} -m_s \sin \beta &{} m_b \theta _d \sin \beta \\ m_d \theta _d \cos \beta &{} -m_s \theta _d \sin \beta &{} m_b \theta _d^2 \sin \beta \end{pmatrix}, \end{aligned}$$
(33b)
$$\begin{aligned} N_d^d\approx & {} \frac{1}{\sqrt{2} v_{d}} \begin{pmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -m_b \theta _d \\ 0 &{} 0 &{} m_b \end{pmatrix}. \end{aligned}$$
(33c)

From the above expressions, we note that the magnitude of the largest off-diagonal element, for our chosen benchmarks of Eqs. (21) and (29), is 0.033, which is quite small. On top of this, the flavor constraints may be further relaxed if we remember the following points:

  • The actual FCNC matrices that control the couplings of the physical neutral scalars are orthogonal linear combinations (dictated by the scalar potential) of \(N_d^1\), \(N_d^2\), \(N_d^d\) and \(N_d^u\) where \(N_d^u=0\) simply because \(h_u\) does not couple to the down-type quarks.

  • A cancellation may be arranged between the scalar and pseudoscalar diagrams appearing in the FCNC process [24].
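The size of the largest off-diagonal FCNC coupling quoted above can be checked directly from Eq. (33). The sketch below fills in the three matrices with the down-sector masses of Eq. (30b), \(\theta _d=-0.016\) and \(\sin 2\beta =0.2265\) from Eq. (21), and the VEVs of Eq. (29). The (3,1) entry of \(N_d^2\) is taken here as \(m_d\theta _d\cos \beta \), by analogy with the first column of \(N_d^1\); this choice is an assumption of the sketch and does not affect the maximum.

```python
import math

# Inputs from Eqs. (21), (29) and (30b); masses and VEVs in GeV
m_d, m_s, m_b = 4.7e-3, 0.093, 4.18
theta_d = -0.016
beta = 0.5 * math.asin(0.2265)
v12, v_d = 2.0, 88.0

cb, sb = math.cos(beta), math.sin(beta)

def scaled(matrix, vev):
    """Apply the overall prefactor 1 / (sqrt(2) * vev) of Eq. (33)."""
    return [[entry / (math.sqrt(2.0) * vev) for entry in row] for row in matrix]

N1 = scaled([[m_d * cb, -m_s * sb, m_b * theta_d * sb],
             [-m_d * sb, -m_s * cb, m_b * theta_d * cb],
             [-m_d * theta_d * sb, -m_s * theta_d * cb, m_b * theta_d**2 * cb]], v12)

N2 = scaled([[m_d * sb, m_s * cb, -m_b * theta_d * cb],
             [m_d * cb, -m_s * sb, m_b * theta_d * sb],
             [m_d * theta_d * cb, -m_s * theta_d * sb, m_b * theta_d**2 * sb]], v12)

Nd = scaled([[0.0, 0.0, 0.0],
             [0.0, 0.0, -m_b * theta_d],
             [0.0, 0.0, m_b]], v_d)

def max_offdiag(matrix):
    return max(abs(matrix[i][j]) for i in range(3) for j in range(3) if i != j)

largest = max(max_offdiag(N) for N in (N1, N2, Nd))
print(f"largest off-diagonal element: {largest:.4f}")
```

For these inputs the maximum comes from the (1,2) entry of \(N_d^2\), that is \(m_s\cos \beta /(\sqrt{2}\,v_{12})\).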

Furthermore, it should also be noted that low values of the VEVs, especially \(v_{12}\sim {{\mathcal {O}}} \left( 1~\mathrm{GeV} \right) \), will not necessarily imply the existence of light nonstandard scalars if we include terms that softly break the \(D_4\) symmetry in the scalar potential [25,26,27] (see Appendix A for details). A detailed study of the scalar potential along with the analysis of the flavor constraints is reserved for future work. Nevertheless, our current paper can be considered a proof of concept for a novel idea: it might be possible to ascribe the quark flavor hierarchies primarily to the hierarchies in the VEVs, all of which add together to constitute the total electroweak VEV. In other words, we have demonstrated that, to gain insight into the flavor puzzle, it might not be necessary to appeal to flavor symmetry breaking scales much higher than the EW scale. Moreover, the fact that such a scheme can be easily accommodated in a relatively simple and intuitive theoretical set-up makes our model an interesting addition to the existing literature on flavor model building.