1 Introduction

The goal of string phenomenology is to construct the \(N=1\) supersymmetric Standard Models (SM) or the SM from string theories. In Type I, Type IIA and Type IIB string theories, D-branes as boundaries of open strings play an important role in phenomenologically interesting model building [1]. For the open string sectors, conformal field theory provides consistent constructions of four-dimensional \(N=1\) supersymmetric chiral models with non-Abelian gauge symmetry on Type II orientifolds. Within such a framework, we obtain the chiral fermions on the worldvolume of the D-branes which are located at orbifold singularities [2,3,4,5,6,7,8,9,10], and/or at the intersections of D-branes in the internal space [11], which have a T-dual description in terms of magnetized D-branes [12, 13].

Within the intersecting D6-brane models on Type IIA orientifolds [14,15,16], many non-supersymmetric three-family Standard-like models and grand unified models have been constructed [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35]. However, they typically suffer from large Planck-scale corrections at the loop level; in other words, they have the gauge hierarchy problem. On the other hand, a large number of supersymmetric three-family Standard-like models and grand unified models have been constructed as well [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55], which can solve the above problem. For a review, see Ref. [56].

In Ref. [47], Cvetič, Liu and one of us (TL) systematically constructed the three-family \(N=1\) supersymmetric Pati–Salam models from Type IIA orientifolds on \(\mathbf{T}^6/(\mathbf{Z}_2\times \mathbf{Z}_2)\) with intersecting D6-branes where all the gauge symmetries come from U(n) branes. The Pati–Salam gauge symmetries \(SU(4)_C\times SU(2)_L \times SU(2)_R\) can be broken down to \(SU(3)_C\times SU(2)_L \times U(1)_{B-L} \times U(1)_{I_{3R}} \) via D6-brane splittings, and further down to the SM via the four-dimensional \(N=1\) supersymmetry preserving Higgs mechanism. Thus, it provides a road to the SM without any additional anomaly-free U(1)’s around the electroweak scale. Also, the hidden sector contains USp(n) branes, which are parallel with the orientifold planes or their \(\mathbf{Z}_\mathbf{2}\) images. These models have at least two confining gauge groups in the hidden sector, whose gaugino condensation can in turn trigger supersymmetry breaking and (some) moduli stabilization. In particular, Chen, Mayes, Nanopoulos and one of us (TL) found one of these models with a realistic phenomenology [52, 54], and studied its variations as well [53]. Thus, we shall revisit this kind of three-family \(N=1\) supersymmetric Pati–Salam model building in this work.

Moreover, it has been pointed out that there are a few other potentially interesting constructions which might lead to the SM [47]. For example, the possible massless vector-like Higgs fields, which do not arise from an \(N=2\) subsector, can break the Pati–Salam gauge symmetry down to the SM or break the \(U(1)_{B-L}\times U(1)_{I_{3R}}\) down to \(U(1)_Y\). However, because large wrapping numbers are required by the increased absolute values of the intersection numbers between the \(U(4)_C\) stack of D-branes and the \(U(2)_R\) stack or its orientifold image, it might be very difficult to find such models. Another interesting scenario is to construct the \(SU(2)_L\) and/or \(SU(2)_R\) gauge symmetries from filler branes, i.e., \(SU(2)_{L,R}=USp(2)_{L,R}\), so that the number of the SM Higgs doublet pairs might be decreased. However, we do not want to construct the \(SU(2)_{L,R}\) gauge symmetries from the splittings of higher rank USp(N) (\(N\ge 4\)) branes, which would lead to an even number of families in general. In such a case, a U(4) brane with one wrapping number whose absolute value is larger than 2 cannot be avoided, which might make the model building very difficult due to the tadpole cancellation conditions. With a better scanning method, one can try to construct these models in the future.

Employing novel random and supervised scanning methods, we will further systematically study the three-family \(N=1\) supersymmetric Pati–Salam model building in Type IIA orientifolds on \(\mathbf{T}^6/(\mathbf{Z}_2\times \mathbf{Z}_2)\) with intersecting D6-branes in which the \(SU(4)_C\times SU(2)_L \times SU(2)_R\) gauge symmetries arise from U(n) branes. In particular, we construct the new models with large winding numbers as well, and find that the approximate gauge coupling unification can be achieved at the string scale.

The paper is organized as follows. In Sect. 2 we briefly review the basic rules for supersymmetric intersecting D6-brane model building on Type IIA \(T^6/(\mathbf{Z}_2\times \mathbf{Z}_2)\) orientifolds, the tadpole cancellation conditions, and the conditions for D6-brane configurations which preserve four-dimensional \(N=1\) supersymmetry. Also, we will briefly review the T-duality symmetry and its variations in the supersymmetric model building with intersecting D6-branes.

In Sect. 3, we study the supersymmetric D6-brane model building with large winding numbers and generic T-duality taken into consideration. We point out that the Type II T-duality in Ref. [47] is not an equivalence relation in Pati–Salam model building if the model is not invariant under \(SU(2)_L\) and \(SU(2)_R\) exchange, and that it provides a way to obtain new models. With this construction, we obtain the supersymmetric D6-brane models with only one USp group in the hidden sector, which have three families of the SM fermions and satisfy the tadpole cancellation conditions and the \(N=1\) supersymmetry preserving conditions. Furthermore, for the first time we extend our investigation to the models with large wrapping numbers, namely 5, 6, 7, 8, 9 and 10, and obtain the approximate gauge coupling unification in these models.

In Sect. 4, we discuss the phenomenological consequences of the new models in different classes. For each class, we show the full phenomenology table for one representative. As explicit examples, we present the chiral spectra in the open string sector for each class of models. The differences in the spectrum of the T-dual models with a parallel third two-torus can also be found in this section.

In Sect. 5, we apply machine learning methods to show in Figs. 1 and 2 how the Minimal Supersymmetric SM (MSSM)-like models are distributed in our scan over the reduced latent dimensions (which are reduced from 18 wrapping numbers). We find that the MSSM-like models tend to gather in islands, which indicates a better chance of finding more MSSM-like models in their nearby regions.

In Sect. 6, we briefly discuss the other potentially interesting setups and conclude. Also, we present the D6-brane configurations and intersection numbers for supersymmetric Pati–Salam models in the Appendix.

2 \(T^6 /(Z_2 \times Z_2)\) orientifolds with intersecting D6-branes

First, let us briefly review the basic rules to construct the supersymmetric models on Type IIA \(T^6 /(Z_2 \times Z_2)\) orientifolds with D6-branes intersecting at generic angles, as well as to obtain the massless open string state spectra, following Refs. [37, 39]. In Type IIA string theory which is compactified on a \(T^6 /(Z_2 \times Z_2)\) orientifold, we consider \(T^{6}\) as a six-torus factorized as three two-tori \(T^{6} = T^{2} \times T^{2} \times T^{2}\). The corresponding complex coordinates for the i-th two-torus are \(z_i\), \(i=1,\; 2,\; 3\), respectively.

The \(\theta \) and \(\omega \) generators for the orbifold group \(Z_{2} \times Z_{2}\), which are respectively associated with the twist vectors \((1/2,-\,1/2,0)\) and \((0,1/2,-\,1/2)\), act on the complex coordinates \(z_i\) as below

$$\begin{aligned}&\theta : (z_1,z_2,z_3) \rightarrow (-z_1,-z_2,z_3),~ \nonumber \\&\omega : (z_1,z_2,z_3) \rightarrow (z_1,-z_2,-z_3).~\, \end{aligned}$$
(1)

We implement the orientifold projection by gauging the \(\Omega R\) symmetry, where \(\Omega \) is world-sheet parity, and R acts on the complex coordinates as follows

$$\begin{aligned} R: (z_1,z_2,z_3) \rightarrow ({\overline{z}}_1,{\overline{z}}_2,{\overline{z}}_3).~\, \end{aligned}$$
(2)

Therefore, we have four kinds of orientifold 6-planes (O6-planes) respectively for the actions of \(\Omega R\), \(\Omega R\theta \), \(\Omega R\omega \), and \(\Omega R\theta \omega \). In order to cancel the RR charges of O6-planes, we introduce stacks of \(N_a\) D6-branes wrapping on the factorized three-cycles. Also, there are two kinds of complex structures for a two-torus: rectangular and tilted [15, 37, 39, 54], which are consistent with the orientifold projection. The homology classes of the three-cycles wrapped by the D6-brane stacks can be expressed in terms of \(n_a^i[a_i]+m_a^i[b_i]\) and \(n_a^i[a'_i]+m_a^i[b_i]\) for the rectangular and tilted tori respectively, where \([a_i']=[a_i]+\frac{1}{2}[b_i]\). Thus, a generic one-cycle can be labelled as \((n_a^i,l_a^i)\) in both cases, where the wrapping numbers are \(l_{a}^{i}\equiv m_{a}^{i}\) for a rectangular two-torus and \(l_{a}^{i}\equiv 2{\tilde{m}}_{a}^{i}=2m_{a}^{i}+n_{a}^{i}\) for a tilted two-torus. It follows that \(l_a^i-n_a^i\) must be even for a tilted two-torus.
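The relation between \(m_a^i\) and \(l_a^i\) can be sketched in a few lines of Python. This is only an illustration of the convention stated above; the function names `wrapping_l` and `is_allowed` are hypothetical and do not come from the paper.

```python
def wrapping_l(n: int, m: int, beta: int) -> int:
    """Return l for the one-cycle with wrapping numbers (n, m).

    beta = 0: rectangular two-torus, l = m.
    beta = 1: tilted two-torus, l = 2*m + n.
    """
    if beta not in (0, 1):
        raise ValueError("beta must be 0 (rectangular) or 1 (tilted)")
    return m if beta == 0 else 2 * m + n


def is_allowed(n: int, l: int, beta: int) -> bool:
    """On a tilted two-torus l - n must be even; a rectangular one has no such constraint."""
    return beta == 0 or (l - n) % 2 == 0
```

Any pair \((n, l)\) produced by `wrapping_l` on a tilted two-torus passes `is_allowed` automatically, since \(l-n=2m\).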

Moreover, for a stack a of \(N_a\) D6-branes along the cycle \((n_a^i,l_a^i)\), we need to introduce their \(\Omega R\) images, the \({a'}\)-stack of \(N_a\) D6-branes with wrapping numbers \((n_a^i,-l_a^i)\). The homology three-cycles for a stack a of \(N_a\) D6-branes and its orientifold image \(a'\) respectively are

$$\begin{aligned}&[\Pi _a]=\prod _{i=1}^{3}\left( n_{a}^{i}[a_i]+2^{-\beta _i}l_{a}^{i}[b_i]\right) ,\;\;\;\nonumber \\&\left[ \Pi _{a'}\right] =\prod _{i=1}^{3} \left( n_{a}^{i}[a_i]-2^{-\beta _i}l_{a}^{i}[b_i]\right) ,~\, \end{aligned}$$
(3)

where \(\beta _i=0\) or \(\beta _i=1\) for the rectangular or tilted i-th two-torus, respectively. The homology three-cycles, which are wrapped by the four O6-planes, are given by

$$\begin{aligned}&\Omega R: [\Pi _{\Omega R}]= 2^3 [a_1]\times [a_2]\times [a_3],~\, \end{aligned}$$
(4)
$$\begin{aligned}&\Omega R\omega : [\Pi _{\Omega R\omega }]=-2^{3-\beta _2-\beta _3}[a_1]\times [b_2]\times [b_3],~\, \end{aligned}$$
(5)
$$\begin{aligned}&\Omega R\theta \omega : [\Pi _{\Omega R\theta \omega }]=-2^{3-\beta _1-\beta _3}[b_1]\times [a_2]\times [b_3],~\, \end{aligned}$$
(6)
$$\begin{aligned}&\Omega R\theta : [\Pi _{\Omega R\theta }]=-2^{3-\beta _1-\beta _2}[b_1]\times [b_2]\times [a_3].~\, \end{aligned}$$
(7)

Thus, the intersection numbers can be expressed in terms of wrapping numbers as follows

$$\begin{aligned} I_{ab}= & {} [\Pi _a][\Pi _b]=2^{-k}\prod _{i=1}^3(n_a^il_b^i-n_b^il_a^i),~\, \end{aligned}$$
(8)
$$\begin{aligned} I_{ab'}= & {} [\Pi _a]\left[ \Pi _{b'}\right] =-2^{-k}\prod _{i=1}^3(n_{a}^il_b^i+n_b^il_a^i),~\, \end{aligned}$$
(9)
$$\begin{aligned} I_{aa'}= & {} [\Pi _a]\left[ \Pi _{a'}\right] =-2^{3-k}\prod _{i=1}^3(n_a^il_a^i),~\, \end{aligned}$$
(10)
$$\begin{aligned} I_{aO6}= & {} [\Pi _a][\Pi _{O6}]=2^{3-k}(-l_a^1l_a^2l_a^3 +l_a^1n_a^2n_a^3\nonumber \\&+n_a^1l_a^2n_a^3+n_a^1n_a^2l_a^3),~\, \end{aligned}$$
(11)

where \(k=\beta _1+\beta _2+\beta _3\) is the total number of tilted two-tori, and \([\Pi _{O6}]=[\Pi _{\Omega R}]+[\Pi _{\Omega R\omega }]+[\Pi _{\Omega R\theta \omega }]+[\Pi _{\Omega R\theta }]\) is the sum of four O6-plane homology three-cycles.
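Equations (8)–(11) translate directly into code. The following is a minimal sketch, assuming each brane is given as three pairs \((n^i, l^i)\); the function names and the sample wrapping numbers (third two-torus tilted, so \(k=1\)) are illustrative choices for this sketch and are not quoted from the tables of this paper.

```python
from fractions import Fraction


def i_ab(a, b, k):
    """Eq. (8): I_ab = 2^{-k} prod_i (n_a^i l_b^i - n_b^i l_a^i)."""
    prod = 1
    for (na, la), (nb, lb) in zip(a, b):
        prod *= na * lb - nb * la
    return Fraction(prod, 2 ** k)


def i_ab_prime(a, b, k):
    """Eq. (9): I_ab' = -2^{-k} prod_i (n_a^i l_b^i + n_b^i l_a^i)."""
    prod = 1
    for (na, la), (nb, lb) in zip(a, b):
        prod *= na * lb + nb * la
    return Fraction(-prod, 2 ** k)


def i_aa_prime(a, k):
    """Eq. (10): I_aa' = -2^{3-k} prod_i (n_a^i l_a^i)."""
    prod = 1
    for n, l in a:
        prod *= n * l
    return Fraction(-8 * prod, 2 ** k)


def i_a_o6(a, k):
    """Eq. (11): intersection of stack a with the sum of the four O6-planes."""
    (n1, l1), (n2, l2), (n3, l3) = a
    total = -l1 * l2 * l3 + l1 * n2 * n3 + n1 * l2 * n3 + n1 * n2 * l3
    return Fraction(8 * total, 2 ** k)


# Illustrative wrapping numbers (hypothetical input, third two-torus tilted).
a = ((0, -1), (1, 1), (1, 1))
b = ((3, 1), (1, 0), (1, -1))
print(i_ab(a, b, 1), i_ab_prime(a, b, 1))
```

Exact rationals are used so that the factor \(2^{-k}\) never introduces rounding; Eq. (10) is recovered from Eq. (9) by setting \(b=a\).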

The generic massless particle spectrum for intersecting D6-branes at general angles, which is valid for both rectangular and tilted two-tori, can be expressed via the intersection numbers as listed in Table 1. In addition, the two main constraints on the four-dimensional \(N=1\) supersymmetric model building from Type IIA orientifolds with intersecting D6-branes are the RR tadpole cancellation conditions and the \(N=1\) supersymmetry preservation in four dimensions, which are given in Sects. 2.1 and 2.2, respectively.

Table 1 General massless particle spectrum for intersecting D6-branes at generic angles. In this table, the representations refer to \(U(N_a/2)\), the resulting gauge symmetry because of \(Z_2\times Z_2\) orbifold projection [37]. The chiral supermultiplets contain both scalars and fermions in such supersymmetric constructions. And in our convention, the positive intersection numbers give us the left-handed chiral supermultiplets

2.1 The RR tadpole cancellation conditions

The tadpole cancellation conditions directly lead to the \(SU(N_a)^3\) cubic non-Abelian anomaly cancellation [17, 18, 37], while the cancellation of the U(1) mixed gauge and gravitational anomaly or the \([SU(N_a)]^2 U(1)\) gauge anomaly can be achieved by a generalized Green-Schwarz mechanism mediated by untwisted RR fields [17, 18, 37]. The D6-branes and orientifold O6-planes, which are the sources of RR fields, are restricted by the Gauss law in a compact space, namely, the sum of the RR charges of D6-branes and O6-planes must be zero due to the conservation of the RR field flux lines. The conditions for RR tadpole cancellations are given by

$$\begin{aligned} \sum _a N_a [\Pi _a]+\sum _a N_a \left[ \Pi _{a'}\right] -4[\Pi _{O6}]=0,~\, \end{aligned}$$
(12)

where the last term arises from the O6-planes, which have \(-\,4\) RR charges in D6-brane charge units.

For simplicity, we define the following products of wrapping numbers

$$\begin{aligned}&A_a \equiv -n_a^1n_a^2n_a^3, \quad B_a \equiv n_a^1l_a^2l_a^3, \quad C_a \equiv l_a^1n_a^2l_a^3, \quad D_a \equiv l_a^1l_a^2n_a^3, \nonumber \\&{\tilde{A}}_a \equiv -l_a^1l_a^2l_a^3, \quad {\tilde{B}}_a \equiv l_a^1n_a^2n_a^3, \quad {\tilde{C}}_a \equiv n_a^1l_a^2n_a^3, \quad {\tilde{D}}_a \equiv n_a^1n_a^2l_a^3.\,\nonumber \\ \end{aligned}$$
(13)

To cancel the RR tadpoles, we introduce an arbitrary number of D6-branes wrapping cycles along the orientifold planes, dubbed “filler branes”, which contribute to the RR tadpole cancellation conditions while trivially satisfying the four-dimensional \(N=1\) supersymmetry conditions. The tadpole conditions then take the form

$$\begin{aligned}&-2^k N^{(1)}+\sum _a N_a A_a=-2^k N^{(2)}+\sum _a N_a B_a= \nonumber \\&-2^k N^{(3)}+\sum _a N_a C_a=-2^k N^{(4)}+\sum _a N_a D_a=-16,\,\nonumber \\ \end{aligned}$$
(14)

where \(2 N^{(i)}\) is the number of filler branes wrapping along the i-th O6-plane that is given in Table 2. The filler branes, which give us the USp group, carry the same wrapping numbers as one of the O6-planes as shown in Table 2. When the filler branes have non-zero A, B, C or D, we refer to the USp group as the A-, B-, C- or D-type USp group, respectively.
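As a rough illustration of how Eqs. (13) and (14) are used in practice, the sketch below solves Eq. (14) for the four \(N^{(i)}\) given a list of U-brane stacks. The helper names are hypothetical, and the sample stacks (with \(N_a=8,4,4\) and the third two-torus tilted, so \(k=1\)) are illustrative inputs chosen for this sketch rather than entries from the paper's tables.

```python
def products(brane):
    """Eq. (13): return (A, B, C, D) for wrapping numbers ((n1,l1),(n2,l2),(n3,l3))."""
    (n1, l1), (n2, l2), (n3, l3) = brane
    return (-n1 * n2 * n3, n1 * l2 * l3, l1 * n2 * l3, l1 * l2 * n3)


def filler_numbers(stacks, k):
    """Solve Eq. (14), -2^k N^(i) + sum_a N_a X_a = -16, for (N^(1),...,N^(4)).

    stacks: list of (N_a, brane). Returns None when some N^(i) would be
    negative or non-integer, i.e. the tadpoles cannot be cancelled by fillers.
    """
    solution = []
    for idx in range(4):
        total = sum(n_a * products(brane)[idx] for n_a, brane in stacks)
        numerator = total + 16
        if numerator < 0 or numerator % 2 ** k != 0:
            return None
        solution.append(numerator // 2 ** k)
    return tuple(solution)


# Illustrative three-stack configuration (hypothetical input).
a = ((0, -1), (1, 1), (1, 1))
b = ((3, 1), (1, 0), (1, -1))
c = ((3, -1), (0, 1), (1, -1))
print(filler_numbers([(8, a), (4, b), (4, c)], k=1))
```

A scan over wrapping numbers can discard a candidate as soon as `filler_numbers` returns `None`, before any supersymmetry condition is checked.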

Table 2 The wrapping numbers for four O6-planes

2.2 Conditions for four-dimensional \(N = 1\) supersymmetric D6-brane

In four-dimensional \(N=1\) supersymmetric models, 1/4 of the supercharges from the ten-dimensional Type I T-dual are required to be preserved, namely, these supercharges must survive the orientifold projection of the intersecting D6-branes and the \(Z_2\times Z_2\) orbifold projection on the background manifold. It was shown that the four-dimensional \(N=1\) supersymmetry can be preserved after the orientifold projection iff the rotation angle of any D6-brane with respect to the orientifold plane is an element of SU(3) [11], or in other words, \(\theta _1+\theta _2+\theta _3=0 \) mod \(2\pi \), where \(\theta _i\) is the angle between the D6-brane and the orientifold plane in the i-th two-torus. Because such D6-brane configurations automatically survive the \(Z_2\times Z_2\) orbifold projection, the four-dimensional \(N=1\) supersymmetry conditions can be written as below [39]

$$\begin{aligned}&x_A{\tilde{A}}_a+x_B{\tilde{B}}_a+x_C{\tilde{C}}_a+x_D{\tilde{D}}_a=0, \nonumber \\&A_a/x_A+B_a/x_B+C_a/x_C+D_a/x_D<0, \end{aligned}$$
(15)

where \(x_A=\lambda ,\; x_B=\lambda 2^{\beta _2+\beta _3}/\chi _2\chi _3,\; x_C=\lambda 2^{\beta _1+\beta _3}/\chi _1\chi _3,\; x_D=\lambda 2^{\beta _1+\beta _2}/\chi _1\chi _2\), and \(\chi _i=R^2_i/R^1_i\) are the complex structure moduli for the i-th two-torus. We introduce the positive parameter \(\lambda \) to put all the variables \(A,\,B,\,C,\,D\) on an equal footing. Based on these conditions, we can classify all the possible D6-brane configurations, which preserve four-dimensional \(N=1\) supersymmetry, into three types:

  1. The filler brane, which has the same wrapping numbers as one of the O6-planes in Table 2. The gauge symmetry is a USp group. Because one and only one of the wrapping number products A, B, C and D is non-zero and negative, we refer to the corresponding USp group as the A-, B-, C- or D-type USp group, as mentioned in Sect. 2.1.

  2. The Z-type D6-brane with one zero wrapping number. There are two negative and two zero values in A, B, C and D.

  3. The NZ-type D6-brane without zero wrapping number. Among A, B, C and D, three of them are negative while one of them is positive. Based on which one is positive, we can classify the NZ-type branes into the A-, B-, C- and D-type NZ branes. Each type has two forms of wrapping numbers defined as follows

    $$\begin{aligned}&A1: (-,-)\times (+,+)\times (+,+),\nonumber \\&A2:(-,+)\times (-,+)\times (-,+); \end{aligned}$$
    (16)
    $$\begin{aligned}&B1: (+,-)\times (+,+)\times (+,+),\nonumber \\&B2:(+,+)\times (-,+)\times (-,+); \end{aligned}$$
    (17)
    $$\begin{aligned}&C1: (+,+)\times (+,-)\times (+,+),\nonumber \\&C2:(-,+)\times (+,+)\times (-,+); \end{aligned}$$
    (18)
    $$\begin{aligned}&D1: (+,+)\times (+,+)\times (+,-),\nonumber \\&D2:(-,+)\times (-,+)\times (+,+). \end{aligned}$$
    (19)

    For convenience, we shall refer to the Z-type and NZ-type D6-branes as U-branes in the following, since they carry U(n) gauge symmetry.
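The conditions in Eq. (15) and the three-way classification above can be sketched as follows. The function names, the sign-counting shortcut used for the classification, and the sample inputs (including the values of \(x_A,\ldots ,x_D\)) are assumptions made for this illustration and are not taken from the paper.

```python
from fractions import Fraction


def products(brane):
    """(A, B, C, D) of Eq. (13)."""
    (n1, l1), (n2, l2), (n3, l3) = brane
    return (-n1 * n2 * n3, n1 * l2 * l3, l1 * n2 * l3, l1 * l2 * n3)


def tilde_products(brane):
    """(A~, B~, C~, D~) of Eq. (13)."""
    (n1, l1), (n2, l2), (n3, l3) = brane
    return (-l1 * l2 * l3, l1 * n2 * n3, n1 * l2 * n3, n1 * n2 * l3)


def is_supersymmetric(brane, x):
    """Check both lines of Eq. (15) for positive x = (x_A, x_B, x_C, x_D)."""
    equality = sum(Fraction(xi) * t for xi, t in zip(x, tilde_products(brane)))
    inequality = sum(Fraction(p) / Fraction(xi) for xi, p in zip(x, products(brane)))
    return equality == 0 and inequality < 0


def classify(brane):
    """Return 'filler', 'Z', 'NZ-A'...'NZ-D', or None, from the signs of (A, B, C, D)."""
    vals = products(brane)
    neg = sum(v < 0 for v in vals)
    zero = sum(v == 0 for v in vals)
    pos = sum(v > 0 for v in vals)
    if (neg, zero, pos) == (1, 3, 0):
        return "filler"
    if (neg, zero, pos) == (2, 2, 0):
        return "Z"
    if (neg, zero, pos) == (3, 0, 1):
        return "NZ-" + "ABCD"[[v > 0 for v in vals].index(True)]
    return None


# Illustrative inputs (hypothetical): three U-branes and x_A = x_B = 3 x_C = 3 x_D.
a = ((0, -1), (1, 1), (1, 1))
b = ((3, 1), (1, 0), (1, -1))
c = ((3, -1), (0, 1), (1, -1))
x = (1, 1, Fraction(1, 3), Fraction(1, 3))
print([classify(s) for s in (a, b, c)], all(is_supersymmetric(s, x) for s in (a, b, c)))
```

The classification only inspects the signs of A, B, C and D, so it does not by itself guarantee that Eq. (15) has a solution; `is_supersymmetric` must still be checked for a common choice of \(x_A,\ldots ,x_D\).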

3 Supersymmetric Pati–Salam model building

3.1 Construction of supersymmetric Pati–Salam models

To construct the SM or SM-like models from the intersecting D6-brane scenarios, besides the \(U(3)_C\) and \(U(2)_L\) gauge symmetries from stacks of branes, we must have at least two extra U(1) gauge groups in both supersymmetric and non-supersymmetric models to obtain the correct quantum numbers for right-handed charged leptons [18, 37,38,39]. One is the lepton number symmetry \(U(1)_L\), while the other is similar to the third component of right-handed weak isospin \(U(1)_{I_{3R}}\). The hypercharge is then given by

$$\begin{aligned} Q_Y=Q_{I_{3R}}+{{Q_B-Q_{L}}\over {2}},~\, \end{aligned}$$
(20)

where \(U(1)_B\) is the overall U(1) of \(U(3)_C\). In general, the U(1) gauge symmetry, which comes from a non-Abelian SU(N) gauge symmetry, is anomaly free and then its gauge field is massless. In our model building, \(U(1)_{B-L}\) and \(U(1)_{I_{3R}}\) arise from \(SU(4)_C\) and \(SU(2)_R\) gauge symmetries, respectively. Thus, they are anomaly free and their gauge fields are massless.
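As a quick sanity check of Eq. (20), the following sketch evaluates the hypercharge for one SM family. It assumes, for illustration only, that \(Q_B\) is normalized as baryon number (1/3 per quark), \(Q_L\) as lepton number (1 per lepton), and that the right-handed fields carry \(Q_{I_{3R}}=\pm 1/2\); these normalizations and the function name are assumptions of this sketch, not statements from the paper.

```python
from fractions import Fraction as F


def hypercharge(i3r, baryon, lepton):
    """Eq. (20): Q_Y = Q_{I3R} + (Q_B - Q_L) / 2, evaluated with exact rationals."""
    return F(i3r) + (F(baryon) - F(lepton)) / 2


# (Q_I3R, Q_B, Q_L) for one family, under the normalization assumed above.
fields = {
    "Q_L": (0, F(1, 3), 0),
    "u_R": (F(1, 2), F(1, 3), 0),
    "d_R": (F(-1, 2), F(1, 3), 0),
    "L_L": (0, 0, 1),
    "e_R": (F(-1, 2), 0, 1),
    "nu_R": (F(1, 2), 0, 1),
}
for name, charges in fields.items():
    print(name, hypercharge(*charges))
```

With these inputs the right-handed neutrino is neutral under \(U(1)_Y\), which is why fields with its quantum numbers can later be used to break \(U(1)_{B-L}\times U(1)_{I_{3R}}\) down to \(U(1)_Y\).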

If \(U(1)_{I_{3R}}\) arises from the stack of D6-branes on top of orientifold [37, 38], i.e., from the USp group, there exist at least 8 pairs of SM Higgs doublets, and two extra anomaly free U(1) gauge symmetries in general. These U(1) gauge symmetries could in principle be spontaneously broken via the Higgs mechanism by the scalar components of the chiral superfields whose quantum numbers are the same as the right-handed neutrinos. However, the D-flatness conditions cannot be preserved, and then supersymmetry is broken. Thus, the scale of symmetry breaking should be around the electroweak scale. Moreover, we typically do not have any other candidates, which can preserve the D-flatness and F-flatness conditions, and break these gauge symmetries at an intermediate scale.

Therefore, similar to Ref. [47], we concentrate on the Pati–Salam models in which \(U(1)_{I_{3R}}\) arises from the \(U(2)_R\) symmetry. Because it is very difficult to find interesting models with \(SU(2)_L\) from the D6-branes on top of an O6-plane [47], we study the supersymmetric \(SU(4)_C\times SU(2)_L\times SU(2)_R\) model building from three stacks of D6-branes, which are not on top of the orientifold planes. In our model, we can break the Pati–Salam gauge symmetry down to \(SU(3)_C\times SU(2)_L\times U(1)_{B-L} \times U(1)_{I_{3R}}\) via D6-brane splittings, and further down to the SM gauge symmetry via the Higgs mechanism with Higgs particles from an \(N=2\) subsector [47]. Because we do not have any extra anomaly free U(1) gauge symmetry around the electroweak scale, we solve a generic problem in previous constructions [37, 38].

In short, we introduce three stacks of D6-branes, a, b, c with D6-brane numbers 8, 4, and 4, which respectively give us the gauge symmetries \(U(4)_C\), \(U(2)_L\) and \(U(2)_R\). The gauge anomalies from the three U(1)s are cancelled by the generalized Green-Schwarz mechanism mediated by untwisted RR fields as shown in [17, 18, 37], and the gauge fields of these U(1)s obtain masses via the linear \(B\wedge F\) couplings. To be precise, \(U(4)_C\), \(U(2)_L\), and \(U(2)_R\) can be decomposed into \(SU(4)_C \times U(1)_C\), \(SU(2)_L \times U(1)_L\), and \(SU(2)_R \times U(1)_R\). The \(U(1)_C\), \(U(1)_L\), and \(U(1)_R\) gauge symmetries are anomalous U(1) gauge symmetries. In general, the cubic non-Abelian anomalies in the intersecting D-brane models are cancelled automatically when the RR tadpole cancellation conditions are satisfied, while the additional mixed U(1) anomalies such as those of \(U(1)_C\), \(U(1)_L\), and \(U(1)_R\) in our models may still be present. These anomalies are cancelled by the generalized Green-Schwarz mechanism which involves the untwisted RR forms. The couplings of the four untwisted RR forms \(B^i_2\) to the U(1) field strength \(F_a\) of each stack a are [57]

$$\begin{aligned}&N_a m^1_a n^2_a n^3_a \int _{M4}B^1_2\wedge \text {tr}F_a,~ \;\; N_a n^1_a m^2_a n^3_a \int _{M4}B^2_2\wedge \text {tr}F_a,~ \nonumber \\&N_a n^1_a n^2_a m^3_a \int _{M4}B^3_2\wedge \text {tr}F_a,~ \;\;\nonumber \\&\quad -N_a m^1_a m^2_a m^3_a \int _{M4}B^4_2\wedge \text {tr}F_a,~\, \end{aligned}$$
(21)

which determine the linear combinations of U(1) gauge bosons that acquire string scale masses via the generalized Green-Schwarz mechanism. In short, in the intersecting D-brane model building, we can have up to four anomalous U(1) gauge symmetries whose anomalies are cancelled by the generalized Green-Schwarz mechanism. Therefore, we obtain the Pati–Salam gauge symmetries \(SU(4)_C\times SU(2)_L\times SU(2)_R\) with their gauge fields massless and anomaly free. And \(U(1)_{B-L}\) and \(U(1)_{I_{3R}}\) arise from \(SU(4)_C\) and \(SU(2)_R\) gauge symmetries, respectively. Moreover, to have three families of the SM fermions, we require the intersection numbers to satisfy

$$\begin{aligned}&I_{ab} + I_{ab'} = 3,~\, \end{aligned}$$
(22)
$$\begin{aligned}&I_{ac} = -3,~ I_{ac'} = 0,~\, \end{aligned}$$
(23)

where the conditions \(I_{ab} + I_{ab'}=3\) and \(I_{ac} =-\,3\) give us three generations of the SM fermions, whose quantum numbers under \(SU(4)_C\times SU(2)_L\times SU(2)_R\) gauge symmetries are \((\mathbf{4, 2, 1})\) and \(({{{\bar{\mathbf{{4}}}}}, 1, 2})\). To satisfy the \(I_{ac'} =0 \) condition, the a-stack of D6-branes must be parallel to the orientifold (\(\Omega R\)) image \(c'\) of the c-stack of D6-branes along at least one two-torus, which in our model building we choose to be the third two-torus. We then have open strings that stretch between the a and \(c'\) stacks of D6-branes. When the minimal distance squared \(Z^2_{(ac')}\) (in \(1/M_s\) units) between these two stacks on the third two-torus is small, namely when the minimal length squared of the stretched string is small, we obtain the light scalars with squared-masses \(Z^2_{(ac')}/(4\pi ^2 \alpha ')\) from the NS sector, and the light fermions with the same masses from the R sector [17, 18, 46], which form four-dimensional \(N=2\) hypermultiplets. Thus, we have \(I_{ac'}^{(2)}\) (the intersection numbers for a and \(c'\) stacks on the first two two-tori) vector-like pairs of the chiral superfields with quantum numbers \(({{{\bar{\mathbf{{4}}}}}, 1, 2})\) and \((\mathbf{4, 1, 2})\). These vector-like particles are the Higgs fields, which can break the Pati–Salam gauge symmetry down to the SM gauge symmetry while keeping the four-dimensional \(N=1\) supersymmetry. In particular, they are massless when \(Z^2_{(ac')}=0\). Due to the symmetry transformation \(c\leftrightarrow c'\), the model with intersection numbers \(I_{ac}=0\) and \(I_{ac'}=-3\) is equivalent to that with \(I_{ac}=-3\) and \(I_{ac'}=0\), so we shall not discuss it here.
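The three-family requirement of Eqs. (22) and (23) is a simple filter on wrapping numbers. The sketch below, with hypothetical function names and illustrative sample stacks (third two-torus tilted, \(k=1\)), checks the conditions and also reports on which two-tori the a-stack is parallel to \(c'\).

```python
from fractions import Fraction


def i_ab(a, b, k):
    """Eq. (8)."""
    prod = 1
    for (na, la), (nb, lb) in zip(a, b):
        prod *= na * lb - nb * la
    return Fraction(prod, 2 ** k)


def i_ab_prime(a, b, k):
    """Eq. (9)."""
    prod = 1
    for (na, la), (nb, lb) in zip(a, b):
        prod *= na * lb + nb * la
    return Fraction(-prod, 2 ** k)


def is_three_family(a, b, c, k):
    """Eqs. (22)-(23): I_ab + I_ab' = 3, I_ac = -3, I_ac' = 0."""
    return (i_ab(a, b, k) + i_ab_prime(a, b, k) == 3
            and i_ab(a, c, k) == -3
            and i_ab_prime(a, c, k) == 0)


def parallel_tori(a, c):
    """1-based indices of the two-tori on which a is parallel to c' = (n_c, -l_c)."""
    return [i + 1 for i, ((na, la), (nc, lc)) in enumerate(zip(a, c))
            if na * lc + nc * la == 0]


# Illustrative stacks (hypothetical input, not taken from the paper's tables).
a = ((0, -1), (1, 1), (1, 1))
b = ((3, 1), (1, 0), (1, -1))
c = ((3, -1), (0, 1), (1, -1))
print(is_three_family(a, b, c, 1), parallel_tori(a, c))
```

Since \(I_{ac'}\) in Eq. (9) is a product over the three two-tori, it vanishes exactly when `parallel_tori` returns a non-empty list.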

To break the Pati–Salam gauge symmetry to the SM, we split the a-stack of D6-branes into \(a_1\) and \(a_2\) stacks respectively with 6 and 2 D6-branes. And then the \(U(4)_C\) gauge symmetry is broken down to \(U(3)_C \times U(1)\).

The gauge fields and three chiral multiplets in adjoint representation of \(SU(4)_C\) are broken down to the gauge fields and three chiral multiplets in adjoint representations of \(SU(3)_C\), as well as the gauge field and three singlets of \(U(1)_{B-L}\) accordingly. Also, we assume that the numbers of symmetric and anti-symmetric representations for \(SU(4)_C\) are and , respectively, similar convention for \(SU(3)_C\), \(SU(2)_L\), and \(SU(2)_R\). These chiral multiplets for \(SU(4)_C\) are broken down to the and chiral multiplets in symmetric and anti-symmetric representations for \(SU(3)_C\), and chiral multiplets with \(U(1)_{B-L}\) charge \(\pm 2\). Moreover, there exist \(I_{a_1 a'_2}\) new fields with quantum number \((\mathbf{3, -1})\) under \(SU(3)_C\times U(1)_{B-L}\) arising from the open strings at the intersections of \(a_1\) and \(a_2'\) stacks of D6-branes, while the rest of the particle spectrum remains the same. Also, the anomaly free gauge symmetries from \(a_1\) and \(a_2\) stacks of D6-branes are \(SU(3)_C\times U(1)_{B-L}\), the \(SU(4)_C\) subgroup.

To break the \(U(2)_R\) gauge symmetry, we split the c-stack of D6-branes into \(c_1\) and \(c_2\) stacks, each with two D6-branes. The gauge fields and three chiral multiplets in adjoint representation of \(SU(2)_R\) are then broken down to the gauge field and three singlets of \(U(1)_{I_{3R}}\), respectively. The chiral multiplets in symmetric representation of \(SU(2)_R\) are broken down to the chiral multiplets with \(U(1)_{I_{3R}}\) charge, while the chiral multiplets in anti-symmetric representation of \(SU(2)_R\) will be gone. Also, there are \(I_{c_1 c'_2}\) new fields that are neutral under \(U(1)_{I_{3R}}\) arising from the open strings at the intersections of \(c_1\) and \(c_2'\) stacks of D6-branes, while the rest of the particle spectrum remains the same. The anomaly free gauge symmetry from \(c_1\) and \(c_2\) stacks of D6-branes becomes \(U(1)_{I_{3R}}\), the \(SU(2)_R\) Cartan subgroup.

With the above D6-brane splittings, we obtain the \(SU(3)_C\times SU(2)_L\times U(1)_{B-L} \times U(1)_{I_{3R}}\) gauge symmetry. In order to break it further down to the SM gauge symmetry, we assume the minimal distance squared \(Z^2_{(a_2 c_1')}\) to be small, and thus obtain \(I_{a_2 c_1'}^{(2)}\) pairs of chiral multiplets with quantum numbers \((\mathbf{{ 1}, 1, -1, 1/2})\) and \((\mathbf{{ 1}, 1, 1, -1/2})\) under \(SU(3)_C\times SU(2)_L\times U(1)_{B-L} \times U(1)_{I_{3R}}\). These vector-like particles can break the \(SU(3)_C\times SU(2)_L\times U(1)_{B-L} \times U(1)_{I_{3R}}\) gauge symmetry down to the SM while keeping the D- and F-flatness since their quantum numbers are the same as those of the right-handed neutrino and its complex conjugate. In particular, they are massless when \(Z^2_{(a_2c_1')}=0\). Therefore, the complete chains for symmetry breaking are

$$\begin{aligned}&SU(4)_C\times SU(2)_L \times SU(2)_R \nonumber \\&\quad \overrightarrow{\;a\rightarrow a_1+a_2\;}\; SU(3)_C\!\times \! SU(2)_L \!\times \! SU(2)_R \!\times \! U(1)_{B-L} \nonumber \\&\quad \overrightarrow{\; c\rightarrow c_1+c_2 \;} \; SU(3)_C\!\times \!SU(2)_L\!\times \! U(1)_{I_{3R}}\!\times \! U(1)_{B-L} \nonumber \\&\quad \overrightarrow{\;\mathrm{Higgs \; Mechanism}\;} \; SU(3)_C\!\times \! SU(2)_L\!\times \! U(1)_Y.~\, \end{aligned}$$
(24)

For Type IIA orientifolds with intersecting D6-branes, the dynamical supersymmetry breaking has been studied in Ref. [43]. There exist some filler branes carrying USp gauge symmetries that are confining, and then could allow for gaugino condensation, supersymmetry breaking, as well as moduli stabilization.

The gauge kinetic function for a generic stack x of D6-branes is given by [43]

$$\begin{aligned} f_x = \frac{1}{4} \left[ n^1_x n^2_x n^3_x S - \left( \sum _{i=1}^3 2^{-\beta _j-\beta _k}n^i_x l^j_x l^k_x U^i\right) \right] ,\, \end{aligned}$$
(25)

where the real parts of dilaton S and moduli \(U^i\) respectively are

$$\begin{aligned}&\mathrm{Re}(S) = \frac{M_s^3 R_1^{1} R_1^{2} R_1^{3} }{2\pi g_{s}},~\, \end{aligned}$$
(26)
$$\begin{aligned}&\mathrm{Re}(U^{i}) = \mathrm{Re}(S)~ \chi _j \chi _k,~\, \end{aligned}$$
(27)

where \(i\ne j\ne k\), and \(g_s\) is the string coupling. So the gauge coupling constant associated with x is

$$\begin{aligned} g_{D6_x}^{-2}= & {} |\mathrm {Re}\,(f_x)|. \end{aligned}$$
(28)

In our models, the holomorphic gauge kinetic functions for \(SU(4)_C\), \(SU(2)_L\) and \(SU(2)_R\) are identified with stacks a, b, and c, respectively. The holomorphic gauge kinetic function for \(U(1)_Y\) is then a linear combination of those for \(SU(4)_C\) and \(SU(2)_R\). As shown in [15, 54], we have

$$\begin{aligned} f_Y = \frac{3}{5} \,\left( \frac{2}{3}\, f_{a} + f_{c} \right) . \end{aligned}$$
(29)
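To make Eqs. (25)–(29) concrete, the following sketch evaluates \(\mathrm{Re}(f_x)\) in units of \(\mathrm{Re}(S)\), using \(\mathrm{Re}(U^i)=\mathrm{Re}(S)\,\chi _j\chi _k\) from Eq. (27). The function names, the sample wrapping numbers and the values \(\chi =(3,1,2)\) with \(\beta =(0,0,1)\) are illustrative assumptions for this sketch and are not quoted from the paper's tables.

```python
from fractions import Fraction


def re_f(brane, chi, beta):
    """Re(f_x) / Re(S) from Eq. (25), with Re(U^i) = Re(S) chi_j chi_k (Eq. (27))."""
    n = [w[0] for w in brane]
    l = [w[1] for w in brane]
    total = Fraction(n[0] * n[1] * n[2])
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        u_i = Fraction(chi[j]) * Fraction(chi[k])  # Re(U^i) in units of Re(S)
        total -= Fraction(n[i] * l[j] * l[k], 2 ** (beta[j] + beta[k])) * u_i
    return total / 4


def re_f_y(f_a, f_c):
    """Eq. (29): f_Y = (3/5) (2/3 f_a + f_c)."""
    return Fraction(3, 5) * (Fraction(2, 3) * f_a + f_c)


# Illustrative inputs (hypothetical): third two-torus tilted, chi = (3, 1, 2).
beta = (0, 0, 1)
chi = (3, 1, 2)
a = ((0, -1), (1, 1), (1, 1))
b = ((3, 1), (1, 0), (1, -1))
c = ((3, -1), (0, 1), (1, -1))
f_a, f_b, f_c = (re_f(s, chi, beta) for s in (a, b, c))
print(f_a, f_b, f_c, re_f_y(f_a, f_c))
```

For these inputs the three values of \(\mathrm{Re}(f)\) coincide, so by Eq. (28) the \(SU(4)_C\), \(SU(2)_L\), \(SU(2)_R\) and normalized hypercharge couplings are equal at tree level; other wrapping numbers or moduli will in general give different ratios.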

Also, we can express the tree-level MSSM gauge couplings in the form of

$$\begin{aligned} g^2_{a} = \alpha \, g^2_{b} = \beta \, \frac{5}{3}g^2_Y = \gamma \, [\pi e^{\phi _4} ] \end{aligned}$$
(30)

where \(g_a^2, g^2_{b}\), and \(\frac{5}{3}g^2_Y\) are the strong, weak and hypercharge gauge couplings, respectively, and \(\alpha , \beta , \gamma \) are the ratios between them. Moreover, the Kähler potential is given by

$$\begin{aligned} K=-\mathrm{ln}(S+ {{\bar{S}}}) - \sum _{I=1}^3 \mathrm{ln}(U^I +{{{\bar{U}}}}^I).~\, \end{aligned}$$
(31)

Three stacks of D6-branes, which carry \(U(4)_C\times U(2)_L \times U(2)_R\) gauge symmetry, generically determine the complex structure moduli \(\chi _1\), \(\chi _2\) and \(\chi _3\) because of the four-dimensional \(N=1\) supersymmetry conditions. Thus, we only have one independent modulus field. In order to stabilize the moduli, one usually has at least two USp groups with negative \(\beta \) functions which can be confined and then allow for gaugino condensations [58,59,60]. In general, the one-loop beta function for the \(2N^{(i)}\) filler branes, which are on top of i-th O6-plane and carry \(USp(N^{(i)})\) group, is given by [47]

$$\begin{aligned} \beta _i^g= & {} -3\left( {N^{(i)}\over 2}\!+\!1\right) \!+\!2 |I_{ai}|\!+\! |I_{bi}| \!+\! |I_{ci}| \!+\!3\left( {N^{(i)}\over 2}\!-\!1\right) \nonumber \\= & {} -6+2 |I_{ai}|+ |I_{bi}|+ |I_{ci}|.~\, \end{aligned}$$
(32)
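The cancellation of the \(N^{(i)}\) dependence in Eq. (32) is easy to verify numerically. In this minimal sketch the function names are hypothetical and the intersection numbers passed in are arbitrary test values.

```python
from fractions import Fraction


def usp_beta(i_a, i_b, i_c):
    """Second line of Eq. (32): beta = -6 + 2|I_ai| + |I_bi| + |I_ci|."""
    return -6 + 2 * abs(i_a) + abs(i_b) + abs(i_c)


def usp_beta_unsimplified(n_i, i_a, i_b, i_c):
    """First line of Eq. (32), keeping the explicit N^(i) dependence."""
    half = Fraction(n_i, 2)
    return -3 * (half + 1) + 2 * abs(i_a) + abs(i_b) + abs(i_c) + 3 * (half - 1)


def is_confining(i_a, i_b, i_c):
    """A negative one-loop beta function signals a confining USp group."""
    return usp_beta(i_a, i_b, i_c) < 0
```

Because the two \(N^{(i)}\)-dependent terms cancel, whether a USp group is confining depends only on its intersection numbers with the a, b and c stacks.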

If supersymmetry is broken by gaugino condensations, we may need to consider gauge mediation since the contribution from gravity mediation is much smaller. Thus, the supersymmetry CP problem may be solved as well. Unlike Ref. [47], to include alternative supersymmetry breaking mechanisms, we will not require at least two USp gauge group factors with negative \(\beta \) functions in our Pati–Salam model building.

3.2 T-duality symmetry and its variations

In string theory, two theories are equivalent when T-duality can be performed to map one to the other. This also applies to D-brane model building when two models are related by T-duality. For D6-brane configurations, two models are equivalent if their three two-tori as well as their corresponding wrapping numbers for all the D6-branes are correlated by an element of the permutation group \(S_3\) acting on three two-tori. In addition, two D6-brane configurations are equivalent if their wrapping numbers on two arbitrary two-tori have the same absolute values but opposite sign, while their wrapping numbers on the third two-torus are the same. This is known as the D6-brane Sign Equivalent Principle.

However, from the phenomenological point of view, we want to emphasize that the Type II T-duality in Ref. [47] is not an equivalence relation in Pati–Salam model building if the model is not invariant under \(SU(2)_L\) and \(SU(2)_R\) exchange, and one can utilize this to fine-tune the gauge coupling unification behavior. We now show how this works by recalling the Type II T-duality. Under the Type II T-duality, the transformations of the wrapping numbers for any stack of D6-branes in the model are

$$\begin{aligned} n_x^i \rightarrow -n_x^i,~ l_x^i \rightarrow l_x^i,&n_x^j \leftrightarrow l_x^j,\quad n_x^k \leftrightarrow l_x^k,\, \end{aligned}$$
(33)

where \(i\not = j \not = k\), and x runs over all D6-branes in the model. In particular, it is easy to show that all the intersection numbers change sign.
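The sign change of the intersection numbers under Eq. (33) can be checked numerically. The sketch below assumes the standard product form of the intersection number, \(I_{ab}\propto \prod _i (n_a^i l_b^i - n_b^i l_a^i)\), with the overall normalization for tilted tori dropped since it does not affect the sign; this formula is not reproduced in this section. The function names and the wrapping numbers are hypothetical.

```python
from math import prod

def intersection(a, b):
    """Product of (n_a l_b - n_b l_a) over the three two-tori (normalization dropped)."""
    return prod(na * lb - nb * la for (na, la), (nb, lb) in zip(a, b))

def type2_t_dual(brane, i, j, k):
    """Eq. (33): n^i -> -n^i with l^i fixed, and n <-> l on tori j and k."""
    out = [None] * 3
    n, l = brane[i]
    out[i] = (-n, l)
    for t in (j, k):
        n, l = brane[t]
        out[t] = (l, n)
    return tuple(out)

# Hypothetical wrapping numbers ((n1, l1), (n2, l2), (n3, l3)).
a = ((1, 0), (1, -1), (1, 1))
b = ((0, 1), (1, 2), (-1, 1))

before = intersection(a, b)
after = intersection(type2_t_dual(a, 0, 1, 2), type2_t_dual(b, 0, 1, 2))
print(before, after)
```

Each of the three two-tori contributes one factor of \(-1\), so the product changes sign for any pair of stacks.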

For a three-family supersymmetric Pati–Salam model, we obtain a corresponding new three-family supersymmetric Pati–Salam model by exchanging the b- and c-stacks of D6-branes

$$\begin{aligned} b \leftrightarrow c.~\, \end{aligned}$$
(34)

In particular, the quantum numbers for \(SU(2)_L\) and \(SU(2)_R\) in the particle spectrum, as well as the \(SU(2)_L\) and \(SU(2)_R\) gauge couplings at the string scale, are interchanged by the exchange of the b- and c-stacks, \(b \leftrightarrow c\). Therefore, from the phenomenological point of view, this is not an equivalence relation if the particle content is not invariant under \(SU(2)_L\) and \(SU(2)_R\) exchange or if the \(SU(2)_L\times SU(2)_R\) gauge couplings are not unified at the string scale. More concisely, this is not an equivalence relation if the Pati–Salam model is not invariant under \(SU(2)_L\) and \(SU(2)_R\) exchange. Moreover, recall that there is a variation of the Type II T-duality [47], under which the transformations of the wrapping numbers for any stack of D6-branes in the model are

$$\begin{aligned} l_x^1 \rightarrow -l_x^1, \quad l_x^2 \rightarrow -l_x^2, \quad l_x^3 \rightarrow -l_x^3, \nonumber \\ b\leftrightarrow c, \end{aligned}$$
(35)

where x runs over all D6-branes in the model.
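To illustrate Eq. (35), the sketch below flips the sign of every \(l_x^i\) and relabels the b- and c-stacks, and then compares the new a-b intersection number with the old a-c one. It again assumes the product form \(I_{xy}\propto \prod _i (n_x^i l_y^i - n_y^i l_x^i)\) without its overall normalization; the function names and wrapping numbers are hypothetical.

```python
from math import prod

def intersection(x, y):
    """Product of (n_x l_y - n_y l_x) over the three two-tori (normalization dropped)."""
    return prod(nx * ly - ny * lx for (nx, lx), (ny, ly) in zip(x, y))

def variation(model):
    """Eq. (35): flip the sign of every l, then exchange the b- and c-stacks."""
    flipped = {name: tuple((n, -l) for n, l in brane) for name, brane in model.items()}
    flipped["b"], flipped["c"] = flipped["c"], flipped["b"]
    return flipped

# Hypothetical wrapping numbers for the three stacks.
model = {
    "a": ((1, 0), (1, -1), (1, 1)),
    "b": ((0, 1), (1, 2), (-1, 1)),
    "c": ((1, 1), (2, -1), (1, 3)),
}
new = variation(model)
print(intersection(new["a"], new["b"]), -intersection(model["a"], model["c"]))
print(intersection(new["a"], new["c"]), -intersection(model["a"], model["b"]))
```

The two numbers on each printed line agree: the chiral matter that was charged under \(SU(2)_R\) is now charged under \(SU(2)_L\) and vice versa, with all intersection numbers changing sign.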

These transformations lead to an interesting phenomenon for gauge coupling unification. Looking closer at the gauge kinetic function relation Eq. (29) and the MSSM gauge coupling relation Eq. (30), we see that when \(f_a/f_b =1\), namely when the \(SU(4)_C\) and \(SU(2)_L\) gauge couplings are unified, swapping the b- and c-stacks of branes shifts this to the \(U(1)_Y \) and \(SU(4)_C\) gauge coupling unification, i.e., \(f_Y/f_a =1\). Similarly, when the \(U(1)_Y \) and \(SU(4)_C\) gauge couplings are unified before the swapping of the b- and c-stacks of branes, this is shifted to the \(SU(4)_C\) and \(SU(2)_L\) gauge coupling unification at the string scale. By performing these Type II T-dualities, the gauge coupling unification of the constructed model can be fine-tuned. In Sect. 3.3, we will discuss how the gauge couplings are fine-tuned, with explicit models presented.
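The shift of the unification relation can be made explicit with a small numerical sketch. It assumes that Eq. (29), which is not reproduced in this section, takes the standard Pati–Salam form \(f_Y = \frac{3}{5}\left( \frac{2}{3} f_a + f_c\right) \); the function names and the values of the gauge kinetic functions are hypothetical.

```python
from fractions import Fraction

def f_hypercharge(f_a, f_c):
    """Assumed form of Eq. (29): f_Y = (3/5) * ((2/3) f_a + f_c)."""
    return Fraction(3, 5) * (Fraction(2, 3) * f_a + f_c)

def swap_b_c(f):
    """Exchange the gauge kinetic functions of the b- and c-stacks."""
    return {"a": f["a"], "b": f["c"], "c": f["b"]}

# Hypothetical gauge kinetic functions with f_a = f_b but a different f_c.
before = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(2)}
after = swap_b_c(before)

for label, f in (("before", before), ("after", after)):
    ratio_ab = f["a"] / f["b"]
    ratio_ya = f_hypercharge(f["a"], f["c"]) / f["a"]
    print(label, "f_a/f_b =", ratio_ab, " f_Y/f_a =", ratio_ya)
```

Under this assumption, \(f_a = f_c\) implies \(f_Y = \frac{3}{5}\cdot \frac{5}{3} f_a = f_a\), so a model with \(f_a/f_b=1\) is mapped by \(b \leftrightarrow c\) to one with \(f_Y/f_a=1\).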

3.3 Scanning of supersymmetric Pati–Salam models

Similar to Ref. [47], we introduce three stacks of D6-branes, a, b, and c, with 8, 4, and 4 D6-branes, respectively. The corresponding gauge symmetries are \(U(4)_C\), \(U(2)_L\) and \(U(2)_R\). Unlike the strategy in Ref. [47], we do not require that at least two USp groups in the hidden sector have negative \(\beta \) functions, and instead we perform a broader scan without any constraint on the hidden sector.

In general, if none of the three two-tori is tilted, we cannot obtain particle spectra with an odd number of generations of the SM fermions. Thus, we have three kinds of scenarios: one tilted two-torus, two tilted two-tori, and three tilted two-tori. As pointed out in Ref. [47], the models with two and three tilted two-tori either do not have three families or violate the RR tadpole cancellation conditions, and our scanning confirms this observation. Therefore, here we concentrate on the new scanning with only one tilted two-torus. We choose the third two-torus to be tilted and study the new inequivalent Pati–Salam models in the following. In our broader scanning, we obtain several classes of new models.

The first class of models, including Models 8–11, has only one USp group; Models 8 and 9, as well as Models 10 and 11, are T-dual to each other. In particular, Models 10 and 11 do not have colored chiral exotic particles. The Higgs particles in Models 8 and 10 arise from \(N=2\) subsectors at the intersections of b- and c-stacks of D6-branes, while the Higgs particles in Models 9 and 11 arise from \(N=2\) subsectors at the intersections of b- and \(c'\)-stacks of D6-branes. There exist four exotic Higgs-like particles in Models 8 and 9, and eight in Models 10 and 11.

The second class of models has two USp groups, and the representative models are Models 12 and 13, which are T-dual to each other. The Higgs particles in Model 12 arise from the \(N=2\) subsector at the intersections of b- and c-stacks of D6-branes, while the Higgs particles in Model 13 arise from the \(N=2\) subsector at the intersections of b- and \(c'\)-stacks of D6-branes. Because all these USp groups have negative \(\beta \) functions, we may stabilize the modulus and break supersymmetry via gaugino condensations.

The third class of models has more than two USp groups, and the representative model is Model 14, which has four confining USp(N) gauge groups and can be considered as the T-dual of Model I-Z-10 in [47]. It is the only model in our current scan which has exact gauge coupling unification at the string scale, and one may stabilize the modulus and break supersymmetry via gaugino condensations as well.

The fourth class of models has the absolute value of at least one wrapping number equal to 5, and the representative models are Models 15–18. This kind of model was not found in the previous search [47]. Interestingly, we observe that the MSSM gauge coupling values are more refined, and there exists approximate gauge coupling unification. Among them, Models 15 and 16 have only one USp group with negative \(\beta \) function, so one needs to stabilize the modulus with a different mechanism. In the next section we will focus on the phenomenological studies of the models with small wrapping numbers.

4 Preliminary phenomenological studies

In this section, we shall discuss the phenomenological features of our models. We start with Models 8 and 9, which are constructed with one USp group. The gauge symmetry is \(U(4)\times U(2)_L\times U(2)_R\times USp(4)\), and the \(\beta \) function of the USp(4) group is zero. So we cannot break supersymmetry via gaugino condensation, and another mechanism for supersymmetry breaking is needed. Also, how to decouple the exotic particles, which are charged under USp(4), is an interesting question since USp(4) does not confine. In Models 10 and 11, the gauge symmetry is \(U(4)\times U(2)_L\times U(2)_R\times USp(4)\) as well. The \(\beta \) function of the USp(4) group is negative, so we can break supersymmetry via gaugino condensation and decouple the exotic particles. In all these models, we need to address the modulus stabilization issue as well, which is generic for the models with one USp group (Tables 3, 4).

Table 3 The chiral spectrum in the open string sector for Model 8
Table 4 The chiral spectrum in the open string sector for Model 9

Models 12 and 13 have two confining gauge groups in the hidden sector. They are T-dual to each other in the same manner as for the one-orientifold-plane class of models, and they can be obtained from Model I-Z-2 in Ref. [47] via the generic Type II T-duality. The full spectrum of Model 13 is shown in Table 5. Because we do not have \(SU(2)_L \times SU(2)_R\) gauge coupling unification at the string scale, Models 12 and 13 are not equivalent to Model I-Z-2 in Ref. [47] from the phenomenological point of view. At the string scale, we have \(SU(4)_C\times SU(2)_R\) gauge coupling unification in Models 12 and 13, but \(SU(4)_C\times SU(2)_L\) gauge coupling unification in Model I-Z-2 of [47]. Recalling the definition of the hypercharge gauge coupling, we find that the \(SU(3)_C \times SU(2)_L\) gauge coupling unification is shifted to the \(SU(3)_C \times U(1)_Y\) gauge coupling unification, and vice versa. Thus, the generic Type II T-duality provides a new way to construct new models. However, we should note that this construction is not simply a swap of the b- and c-stacks of D6-branes; usually non-trivial Type II T-dualities are performed. As examples of swapping the b- and c-stacks of D6-branes under Eq. (35), we obtain Models 17 and 18 from Models 15 and 16, respectively, via the Type II T-duality; the SM gauge couplings are shifted as a result of this D6-brane swapping.

Table 5 The chiral spectrum in the open string sector for Model 13

As we mentioned, the USp(N) gauge groups with negative beta functions in the hidden sector have the potential to confine, and thus a non-perturbative effective superpotential can be generated via gaugino condensations. The ground state, which is determined by the minimization of the resulting supergravity potential, can stabilize the dilaton and complex structure toroidal moduli, and breaks supersymmetry in some cases. For the models with two confining USp(N) gauge groups, a general analysis of the non-perturbative superpotential with tree-level gauge couplings can be performed, and it was shown that there can exist extrema with stabilized dilaton and complex structure moduli [43]. However, these extrema might be saddle points and thus do not break supersymmetry. Interestingly, if the models have three or four confining USp(N) gauge groups, the non-perturbative superpotential allows for moduli stabilization and supersymmetry breaking at the stable extremum in general [43].

Among our representative models, two models (Models 8 and 9) carry one USp(N) gauge group with zero beta function, three models (Models 10, 11, and 16) have one confining USp(N) gauge group with negative beta function, two models (Models 12 and 13) carry two confining USp(N) gauge groups with negative beta functions, and one model (Model 14) has four confining USp(N) gauge groups and is considered as the T-dual of Model I-Z-10 in [47]. Therefore, for the latter models, there may exist stable extrema with moduli stabilization and supersymmetry breaking due to gaugino condensations, which are very interesting from the phenomenological point of view. However, as pointed out in Ref. [43], the cosmological constants at these extrema are likely to be negative and close to the string scale, and thus the gaugino condensations in these models might not address the cosmological constant problem.

All the models contain exotic particles that are charged under the hidden gauge groups. The strong coupling dynamics in the hidden sector at a certain intermediate scale might provide a mechanism for all these particles to form bound states or composite particles, which are compatible with the anomaly cancellation conditions. Then, similar to quark condensation in QCD, these composite particles would be charged only under the SM gauge symmetry [40]. The USp groups have two kinds of neutral bound states in general. The first one is the pseudo inner product of two fundamental representations, which is generated by decomposing the rank two anti-symmetric representation and is general for USp groups. In some sense, this is reminiscent of a meson, which is the inner product of one pair of fundamental and anti-fundamental representations of \(SU(3)_C\) in QCD. The second one is the rank 2N anti-symmetric representation of the USp(2N) group for \(N\ge 2\), which is a USp(2N) singlet and somewhat similar to a baryon, a rank three anti-symmetric representation of \(SU(3)_C\) in QCD. For \(N=1\), these two kinds are the same. Our models which contain the second kind of neutral bound states are Models 10 and 11 with the confining USp(4) group, as well as Models 12 and 13 with the confining groups \(USp(4)\times USp(4)\) in the hidden sector. We note that Models 12 and 13 are T-dual to each other, and are constructed from Model I-Z-2 in [47] by swapping its b- and c-stacks of D6-branes with proper T-duality transformations.

Now we take Models 11 and 13 as examples to show explicitly the new composite states. For Model 11, we present the confined particle spectrum in Table 6. Because it has one confining gauge group USp(4) with two charged intersections, besides self-confinement, mixed-confinement between different intersections is also possible, which yields the chiral supermultiplets \((1,2, {\overline{2}},1)\). For Model 13, the confined particle spectrum is given in Table 7. It has two confining gauge groups \(USp(4)_3\) and \(USp(4)_4\), both with two charged intersections. Besides self-confinement, the mixed-confinement between different intersections yields the chiral supermultiplets \(({\overline{4}},2,1,1,1), (4,{\overline{2}},1,1,1), (1,{\overline{2}},2,1,1)\), and (4, 1, 2, 1, 1). Note that when there is only one charged intersection, we do not have mixed-confinement, and only the tensor representations are yielded from self-confinement. Moreover, it is easy to check from the spectrum that no new anomaly is introduced to the remaining gauge symmetry, so this model is still anomaly free.

Table 6 The composite particle spectrum for Model 11 formed due to the strong forces in hidden sector
Table 7 The composite particle spectrum for Model 13 formed due to the strong forces in hidden sector

This kind of self-confinement and mixed-confinement between different intersections also applies to the other models, except for Models 8 and 9. In these two models, we do not have asymptotically free gauge symmetries in the hidden sector, so the states charged under these symmetries cannot be confined. Because the anomaly cancellations for the confined particle spectra are not automatically guaranteed, an extra field associated with the composite states may be needed to satisfy the 't Hooft anomaly matching condition. To avoid unnecessary complications, we only consider relatively simple examples here.

5 Machine learning and future model building

In this section, we review our scanning methods and use dimension reduction methods to show visually some hints for future model building.

Firstly, we employed standard random scanning over wrapping numbers up to 5. This is efficient for wrapping numbers less than 5, while the efficiency becomes very low once wrapping number 5 is included. We improved our scanning method by setting at least one wrapping number to 5 and scanning the rest randomly. In this supervised way, one two-torus of one stack of branes (the a-, b- or c-stack) is assigned the larger wrapping number, while the other two-tori are assigned random wrapping numbers (normally small, although wrapping number 5 may also appear). This improves the efficiency of obtaining MSSM-like models, as we show in Sect. 3.
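The supervised scanning described above can be sketched as follows. This is only an illustration of the sampling step under simplifying assumptions, not the scanning code used for this paper: the function names, the uniform sampling of each wrapping number, and the sample size are hypothetical, and none of the physical filters (RR tadpole cancellation, supersymmetry conditions, three families) is implemented.

```python
import random

STACKS = ("a", "b", "c")

def random_pair(rng, max_wrap):
    """Draw one (n, l) pair with |n|, |l| <= max_wrap, excluding (0, 0)."""
    while True:
        n = rng.randint(-max_wrap, max_wrap)
        l = rng.randint(-max_wrap, max_wrap)
        if (n, l) != (0, 0):
            return n, l

def supervised_sample(rng, max_wrap=5):
    """Random configuration with one chosen wrapping number forced to +-max_wrap."""
    config = {s: [random_pair(rng, max_wrap) for _ in range(3)] for s in STACKS}
    # Pick one stack, one two-torus and one of (n, l), and pin it to the large value.
    stack, torus, slot = rng.choice(STACKS), rng.randrange(3), rng.randrange(2)
    pair = list(config[stack][torus])
    pair[slot] = rng.choice((-max_wrap, max_wrap))
    config[stack][torus] = tuple(pair)
    return config

rng = random.Random(0)
candidates = [supervised_sample(rng) for _ in range(1000)]
# Each candidate would next be passed to the physical filters, which are omitted here.
print(candidates[0])
```

Pinning one entry guarantees that every sampled configuration lies in the large-wrapping-number region, instead of relying on a purely random scan to reach it.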

To discuss possible improvements for future model building, we employ the so-called dimension reduction methods from machine learning to show visually how our constructed models are distributed. Dimensionality reduction is an unsupervised learning technique that seeks a lower-dimensional representation of numerical input data while preserving the salient relationships in the data. Recall that each model in our scanning data is specified by 18 wrapping numbers (two for each of the three stacks on each of the three two-tori), so the data are 18-dimensional. Different algorithms can reduce these 18-dimensional data to 2-dimensional data, which allows us to observe visually how the three-generation models are distributed among all the models. Depending on the underlying algorithm, there are methods such as AutoEncoder, LatentSemanticAnalysis, etc. Among them, AutoEncoder is a popular choice. It implements a type of artificial neural network that aims to copy its inputs to its outputs: it compresses the input into a latent-space representation, and then reconstructs the output from this representation. It is usually composed of two parts: the Encoder (which compresses the input into a latent-space representation) and the Decoder (which reconstructs the input from the latent-space representation). A compressed low-dimensional representation of the input connects the Encoder and the Decoder in the middle step, and ideally the Decoder inverts the Encoder. In our discussion, we present the distribution of the three-generation MSSM-like models using the Encoder step of the dimension reduction methods, as follows.
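Schematically, the Encoder and Decoder described above can be written as two maps trained together. The notation below is introduced here only for illustration: \(x_m\in \mathbb {R}^{18}\) denotes the wrapping numbers of the m-th model, \(E_\theta \) and \(D_\phi \) are the Encoder and Decoder with trainable parameters \(\theta \) and \(\phi \), and the squared reconstruction error is assumed as the training objective, since the loss actually used is not specified in the text.

$$\begin{aligned} E_\theta :\mathbb {R}^{18}\rightarrow \mathbb {R}^{2},\quad D_\phi :\mathbb {R}^{2}\rightarrow \mathbb {R}^{18},\quad \min _{\theta ,\phi }\sum _{m}\left\| x_m-D_\phi \left( E_\theta (x_m)\right) \right\| ^2 . \end{aligned}$$

In this notation, the two-dimensional points shown in the plots are the latent coordinates \(z_m=E_\theta (x_m)\), i.e., only the Encoder step is used.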

Fig. 1 AutoEncoder dimension reduction

Fig. 2 LatentSemanticAnalysis dimension reduction

Firstly, with the dimension reduction method “AutoEncoder”, we show in Fig. 1 how the D6-brane models are distributed according to their wrapping numbers. Each point corresponds to a D6-brane model; the MSSM-like models are highlighted with green points when their wrapping numbers are less than 5, and with red points when a wrapping number is equal to 5.

It is clear that the MSSM-like models populate several separated islands, which indicates that there are better chances to construct new MSSM-like models there. We note that the density of points partially reflects the T-dual MSSM-like models constructed in the nearby region.

In an alternative way, we show visually how the constructed models with three families of the SM fermions are distributed using the dimension reduction method “LatentSemanticAnalysis”. We reduce the 18 wrapping numbers, i.e., 18 dimensions, to 2 dimensions and observe that the selected MSSM-like models are distributed in a pattern along the first and second latent dimensions, as shown in Fig. 2. Here each point corresponds to a D6-brane model; the MSSM-like models with wrapping numbers less than 5 are highlighted in green, while those with a wrapping number equal to 5 are highlighted in red. It is clear that the MSSM-like models tend to gather on some islands. Since the reduced dimensions are derived from the wrapping numbers, we expect that, once some MSSM-like models are obtained, scanning around them will increase the efficiency of future model building.
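Latent semantic analysis is commonly implemented as a truncated singular value decomposition of the data matrix, and the reduction from 18 to 2 dimensions can be sketched on that basis. The code below is a minimal standard-library illustration under this assumption, using power iteration with deflation; the tool used for Fig. 2 may apply different preprocessing, the function names are hypothetical, and the data are random placeholders rather than the scanned models.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def top_right_singular_vectors(rows, k=2, iters=300, seed=0):
    """Leading k right-singular vectors of the data matrix via power iteration."""
    dim = len(rows[0])
    # Gram matrix G = X^T X; its leading eigenvectors are the right-singular vectors of X.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(dim)] for i in range(dim)]
    rng = random.Random(seed)
    found = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            v = [dot(g_row, v) for g_row in gram]
            for u in found:  # deflation: remove components along earlier vectors
                c = dot(v, u)
                v = [x - c * y for x, y in zip(v, u)]
            norm = dot(v, v) ** 0.5
            v = [x / norm for x in v]
        found.append(v)
    return found

def reduce_to_2d(rows):
    """Project each 18-component vector onto the two leading latent directions."""
    axes = top_right_singular_vectors(rows)
    return [tuple(dot(r, u) for u in axes) for r in rows]

# Placeholder data: 50 random 18-component integer vectors standing in for wrapping numbers.
rng = random.Random(1)
data = [[rng.randint(-5, 5) for _ in range(18)] for _ in range(50)]
points = reduce_to_2d(data)
print(len(points), len(points[0]))
```

With real scan data, the resulting two coordinates per model could be plotted and colored by whether the model is MSSM-like, in the spirit of Fig. 2.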

In addition, we found that this behavior, namely that the MSSM-like models tend to gather, can be confirmed with a linear algorithm as well.

Based on the above observations from the machine learning Encoder procedure, we expect that it would be interesting to apply machine learning methods to the Decoder procedure by feeding the collected data of the above models (e.g. the wrapping numbers and the intersection numbers) from the supervised and random scannings to a neural network, etc. In this way, one can train the neural network to select the MSSM-like models from the randomly constructed models, and to generate more three-generation MSSM-like models from given scanning wrapping numbers.

Furthermore, we would like to mention that the heterotic orbifold landscape drawn with deep autoencoder neural network methods shows similar cluster behavior [61]. It seems to us that such clustering may be common to different model building approaches. Thus, perturbing around the current scanning might further enrich the MSSM-like models obtained in different model building approaches.

6 Discussions and conclusions

We revisited the systematic construction of the three-family \(N=1\) supersymmetric Pati–Salam models from Type IIA orientifolds on \(\mathbf{T}^6/(\mathbf{Z}_2\times \mathbf{Z}_2)\) with intersecting D6-branes, where the \(SU(4)_C\times SU(2)_L \times SU(2)_R\) gauge symmetries arise from the stacks of D6-branes with U(n) gauge symmetries. We found that the Type II T-duality in Ref. [47] is not an equivalence relation in Pati–Salam model building if the model is not invariant under \(SU(2)_L\) and \(SU(2)_R\) exchange, and that it provides a way to obtain new models. Unlike the previous studies, we did not require at least two confining USp groups. Also, we scanned the wrapping numbers up to 5, and obtained more interesting models with approximate gauge coupling unification. The Pati–Salam gauge symmetry can be broken down to \(SU(3)_C\times SU(2)_L\times U(1)_{B-L} \times U(1)_{I_{3R}}\) via D6-brane splittings, and further down to the SM gauge symmetry via the D- and F-flatness preserving Higgs mechanism, in which the Higgs fields are the massless open string states from a specific \(N=2\) subsector. Moreover, Models 12 and 13 are T-dual to each other and can be obtained from Model I-Z-2 in [47] via the generic Type II T-duality transformation. Thus, the \(SU(3)_C \times SU(2)_L\) gauge coupling unification in Model I-Z-2 is shifted to the \(SU(3)_C \times U(1)_Y\) gauge coupling unification in Models 12 and 13, and vice versa.

Table 8 D6-brane configurations and intersection numbers in Model 8, and its MSSM gauge coupling relation is \(g^2_a=\frac{1}{2}\, g^2_b=\frac{65}{44} \,(\frac{5}{3}\,g^2_Y)= \frac{16 \sqrt{6}}{15} \, \pi \,e^{\phi ^4}\)
Table 9 D6-brane configurations and intersection numbers in Model 9, and its MSSM gauge coupling relation is \(g^2_a=\frac{1}{2}\, g^2_b=\frac{65}{44} \,(\frac{5}{3}\,g^2_Y)= \frac{16 \sqrt{6}}{15} \, \pi \,e^{\phi ^4}\)

Furthermore, we obtained models with one USp(4) group, and the new confined particle spectrum of Model 11 is shown in Table 6. The composite particle spectrum of Model 13, which is formed due to two confining USp(4) groups in the hidden sector, is given in Table 7, where supersymmetry breaking via a “race-track” scenario is still possible.

Last but not least, using the supervised scanning method we found interesting models with a wrapping number equal to 5, which were not found before. For these models, especially Model 16 and its T-dual Model 18, the gauge couplings take a much more refined form because of the complexity of the intersections due to a large wrapping number, and approximate gauge coupling unification is achieved. For larger wrapping numbers, we have not yet found a model with better gauge coupling unification for wrapping numbers up to 10. However, since the scanning so far is limited, it is worthwhile to search for Pati–Salam models with broader scanning and larger wrapping numbers. In the machine learning and model building part, we observed visually that the MSSM-like models tend to gather in islands, which indicates better chances to find more MSSM-like models in the nearby scanning region.