1 Introduction

About four years ago, hints of a 125 GeV Higgs boson were reported in the diphoton channel by both the ATLAS and the CMS collaborations, each based on about \(5~\mathrm{fb}^{-1}\) of data at the 7-TeV LHC [1, 2], and this led to the discovery of the Higgs boson in July 2012 [3, 4]. Recently another excess in the diphoton channel was reported in the first \(3.2~\mathrm{fb}^{-1}\) of data at the 13-TeV LHC [5, 6]. This time the invariant mass of the signal is located around \(750\, \mathrm{GeV}\), and its local and global significances are about \(3.6\sigma \) and \(2.3 \sigma \), respectively, for the ATLAS analysis, and \(2.6 \sigma \) and \(2 \sigma \) for the CMS analysis. Interestingly, although there exists an apparent inconsistency in the width of the resonance (see footnote 1), both analyses favored a diphoton production rate of about \(4\, \mathrm{fb}\) in the narrow width approximation. Such a rate is about \(10^4\) times larger than the prediction of the standard model (SM) with a \(750\, \mathrm{GeV}\) Higgs boson [8]. Obviously, if this excess is confirmed in the near future, it will point undoubtedly to the existence of new physics.

So far more than 100 theoretical papers have appeared to interpret the excess in new physics models [9–164], and most of them employed the process \(g g \rightarrow S \rightarrow \gamma \gamma \), with S denoting a scalar particle with mass around \(750 \ \mathrm{GeV}\), to fit the data. From these studies, one can infer two essential ingredients of the explanations. One is that there must exist other charged and colored particles to generate, through loop effects, sufficiently large \(S \gamma \gamma \) and S g g interactions. The other is that, given that no excess was observed in channels such as ZZ, \(WW^*\), and \(t\bar{t}\) at the LHC Run I, the particle S is preferred to be gauge-singlet dominated so that the branching ratios of \(S \rightarrow Z Z\), \(W W^*\), \(t \bar{t}\) are not much larger than that of \(S \rightarrow \gamma \gamma \). These requirements guide us in seeking explanations of the excess.

In this work, we consider interpreting the diphoton excess in the minimal dilaton model (MDM), which extends the SM by one gauge singlet field, called the dilaton [165–167]. Just as in the traditional dilaton theories [168–174], the dilaton in this model arises from a strong interaction theory with approximate scale invariance at a certain high energy scale. The breakdown of the invariance then triggers the electroweak symmetry breaking, and during this process, the dilaton as the pseudo Nambu–Goldstone particle of the broken invariance can be naturally light in comparison with the high energy scale. Furthermore, this model assumes that all SM particles except for the Higgs field do not interact with the dynamical sector, and consequently the dilaton does not couple directly to the fermions and W, Z bosons in the SM. In this sense, the dilaton is equivalent to an electroweak gauge singlet field. The model also contains massive vector-like fermions acting as the lightest particles in the dynamical sector, to which the dilaton naturally couples in order to recover the scale invariance: \(M \rightarrow M e^{-\phi /f}\). As a result, the interactions between the dilaton and the photons/gluons are induced through loop diagrams of these fermions. These features make the MDM a promising theory to explain the diphoton excess through dilaton production. Discussing the capability of the MDM to explain the excess is the aim of this work.

This paper is organized as follows. We first introduce briefly the MDM in Sect. 2, and we present in Sect. 3 some analytical formulas which are used to calculate the diphoton rate. In Sect. 4, we discuss the constraints on the model, its capability in explaining the excess, and also the related phenomenology at the LHC Run II. For completeness, in Sect. 5 we turn to a discussion of the vacuum stability at high energy scale. Finally, we draw our conclusions in Sect. 6.

2 The minimal dilaton model

As introduced in the last section, the MDM extends the SM by adding one gauge singlet field S, which represents a linearized dilaton field, and also vector-like fermions \(X_i\). The low energy effective Lagrangian is then written as [165, 166]

(1)

where \(\mathcal L_\text {SM}\) is the SM Lagrangian without Higgs potential, f is the decay constant of the dilaton S, \(M_i\) is the mass of the fermion \(X_i\), and \(N_X\) is the number of the vector-like fermions. The scalar potential \(V(S,\tilde{H})\) contains terms with explicit breaking of the scale invariance, and its general form is given by

(2)

where \(m_S\), \(\lambda _S\), \(m_H\), \(\lambda _H\), and \(\lambda _{HS}\) are all free real parameters.

About the Lagrangian in Eq. (1), one should note the following points:

  • The MDM is actually a low energy theoretical framework describing the breakdown of a UV strong dynamics with approximate scale invariance, and the dilaton in this theory is distinguished from the usual one. Explicitly speaking, in the traditional dilaton models the whole SM sector is usually assumed to be a part of the strong dynamics, and all the fermions and gauge bosons of the SM are composite particles at the weak scale [168–174]. Under these theoretical assumptions, the couplings of the linearized dilaton S to the SM fields take the following form [168–174]:

    $$\begin{aligned} \mathcal{{L}} = \frac{S}{f} T^\mu _\mu , \end{aligned}$$
    (3)

    where \(T^\mu _\mu \) represents the trace of the energy-momentum tensor of the SM. Through the interactions in Eq. (3), the dilaton couples directly to the fermions and W, Z bosons in the SM with the strengths proportional to the mass of the involved particle. In this way, the dilaton mimics the properties of the SM Higgs boson. By contrast, in the MDM all SM particles except for the Higgs field are assumed to be the spectators of the strong dynamics, and they are all elementary particles. As a result, the dilaton does not couple directly to these particles.

  • In the original version of the MDM, the authors set \(N_X = 1\) and chose the quantum numbers of the fermion \(X_i\) to be the same as those of the right-handed top quark. This setting was motivated by top-color theory [175], which intended to present a reasonable explanation of the relatively large top quark mass within a minimal framework. However, as we will show below, such a setting is tightly limited by the vacuum stability of the theory at the \(m_{X_i}\) scale in interpreting the diphoton excess. Considering that a strong dynamical theory usually involves rich fermion fields and that the assignment of their quantum numbers is somewhat arbitrary, we consider a more general but still simple case, which assumes that all the vector-like fermions are identical, and each of them transforms in the \((3, 1, Y = 2 Q_X)\) representation of the SM gauge group \(SU(3)_c \otimes SU(2)_L \otimes U(1)_Y \). In the following, we vary the number of the fermions \(N_X\), their common mass \(m_X\), and also their electric charge \(Q_X\) to discuss the diphoton excess.

If one writes the Higgs field in unitary gauge via \(\tilde{H} = \frac{1}{\sqrt{2}} U (0, H)^T\), the scalar potential in Eq. (2) can be rewritten as

$$\begin{aligned} \tilde{V}(S,H)= & {} \frac{m_S^2}{2} S^2+ {\lambda _S\over 4}S^4 + \frac{m_H^2}{2} H^2\nonumber \\&+ \frac{\lambda _H}{4} H^4 + \frac{\lambda _{HS}}{4} S^2 H^2. \end{aligned}$$
(4)

In the following, we consider the most general situation in which both H and S take vacuum expectation values (VEV), \( \langle H \rangle = v \) and \(\langle S \rangle = f\), and they mix to form mass eigenstates h and s:

$$\begin{aligned} h= & {} \cos \theta _S H + \sin \theta _S S, \nonumber \\ s= & {} - \sin \theta _S H + \cos \theta _S S. \end{aligned}$$
(5)

In our scheme for the diphoton excess, h corresponds to the 125 GeV Higgs boson discovered at the LHC, and s is responsible for the \(750\, \mathrm{GeV}\) diphoton excess through the process \(g g \rightarrow s \rightarrow \gamma \gamma \). So, in the following, we set \(m_h=125\, \mathrm{GeV}\), \(m_s=750\, \mathrm{GeV}\), and \(v=246\, \mathrm{GeV}\), and for the convenience of our discussion, we choose \(\eta \equiv \frac{v}{f} N_X\), \(\sin \theta _S\), \(Q_X\), \(N_X\), and \(m_X\) as the input parameters of the MDM. In this case, we have the following relations:

$$\begin{aligned} \begin{aligned} \lambda _{HS}&= \frac{2 \eta (m_h^2 - m_s^2 ) \sin \theta _S \cos \theta _S}{v^2 N_X}, \\ \lambda _H&= \frac{m_h^2 \cos ^2 \theta _S + m_s^2 \sin ^2 \theta _S }{2 v^2}, \\ \lambda _S&= \frac{\eta ^2 (m_h^2 \sin ^2 \theta _S + m_s^2 \cos ^2 \theta _S)}{2 v^2 N_X^2}. \end{aligned} \end{aligned}$$
(6)

With the assumption that the dilaton is fully responsible for the fermion masses, the Yukawa coupling of \(X_i\) is given by \(y_X \equiv \frac{m_X}{f} = \frac{\eta m_X}{v N_X}\). Obviously \(y_X\) is inversely proportional to \(N_X\) for fixed \(\eta \) and \(m_X\). As we will show below, the diphoton rate is only sensitive to the parameters \(\eta \), \(\sin \theta _S\), and \(Q_X\), and it does not depend on \(y_X\) directly.
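
As a numerical illustration of Eq. (6) and of the scaling \(y_X \propto 1/N_X\), the following minimal Python sketch transcribes the relations exactly as written above. The helper name `mdm_couplings` and the sample inputs are hypothetical choices made for illustration only, and \(\cos \theta _S\) is assumed to be positive.

```python
import math

M_H = 125.0   # GeV, mass of h as set in the text
M_S = 750.0   # GeV, mass of s as set in the text
V_EW = 246.0  # GeV, Higgs VEV v as set in the text


def mdm_couplings(eta, sin_theta, n_x, m_x):
    """Evaluate Eq. (6) and y_X = eta*m_X/(v*N_X) as written in the text.

    Hypothetical helper for illustration; m_x is in GeV and
    cos(theta_S) is assumed positive.
    """
    cos_theta = math.sqrt(1.0 - sin_theta**2)
    lam_hs = (2.0 * eta * (M_H**2 - M_S**2) * sin_theta * cos_theta
              / (V_EW**2 * n_x))
    lam_h = (M_H**2 * cos_theta**2 + M_S**2 * sin_theta**2) / (2.0 * V_EW**2)
    lam_s = (eta**2 * (M_H**2 * sin_theta**2 + M_S**2 * cos_theta**2)
             / (2.0 * V_EW**2 * n_x**2))
    y_x = eta * m_x / (V_EW * n_x)
    return {"lambda_HS": lam_hs, "lambda_H": lam_h,
            "lambda_S": lam_s, "y_X": y_x}


if __name__ == "__main__":
    # Illustrative inputs only (not a fitted benchmark point).
    for n_x in (1, 2, 4):
        c = mdm_couplings(eta=1.0, sin_theta=0.01, n_x=n_x, m_x=1000.0)
        print(n_x, {k: round(val, 4) for k, val in c.items()})
```

For fixed \(\eta \) and \(m_X\), doubling \(N_X\) halves \(y_X\) and reduces \(\lambda _S\) by a factor of four, while \(\lambda _H\) is unchanged.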

3 Useful formulas in getting the diphoton excess

In the MDM, the particle s may decay into gg, \(\gamma \gamma \), \(Z \gamma \), ZZ, \(WW^*\), \(f \bar{f}\), and hh. In this section, we list the formulas for the widths of these decays, which are needed to get the diphoton rate. As we will show below, these formulas are helpful to understand our results.

  • The widths of \(\phi \rightarrow \gamma \gamma , g g, Z \gamma \) with \(\phi =h,s\):

    $$\begin{aligned} \Gamma _{\phi \rightarrow \gamma \gamma }= & {} \frac{G_\mu \alpha ^2 m_\phi ^3}{128\sqrt{2}\pi ^3} \left| I_{\gamma }^\phi \right| ^2, \end{aligned}$$
    (7)
    $$\begin{aligned} \Gamma _{\phi \rightarrow gg}= & {} \frac{G_\mu \alpha _s^2 m_\phi ^3}{16\sqrt{2}\pi ^3} \left| I_{g}^\phi \right| ^2, \end{aligned}$$
    (8)
    $$\begin{aligned} \Gamma _{\phi \rightarrow Z\gamma }= & {} \frac{G_\mu ^2 m_W^2 \alpha m_\phi ^3}{64\pi ^4} \left( 1-\frac{m_Z^2}{m_\phi ^2} \right) ^3 \left| I_{Z \gamma }^\phi \right| ^2, \end{aligned}$$
    (9)

    where the \(I_g^\phi \), \(I_\gamma ^\phi \), and \(I^\phi _{Z\gamma }\) are given by

    $$\begin{aligned} I_{\gamma }^h&= \cos \theta _S \times \left( A_1(\tau _W) + \frac{4}{3}A_{\frac{1}{2}}(\tau _t)\right) \nonumber \\&\quad +\sin \theta _S N_c \eta Q_X^2 A_{\frac{1}{2}}(\tau _X), \end{aligned}$$
    (10)
    $$\begin{aligned} I_{\gamma }^s&= - \sin \theta _S \times \left( A_1(\tau _W) + \frac{4}{3}A_{\frac{1}{2}}(\tau _t) \right) \nonumber \\&\quad + \cos \theta _S N_c \eta Q_X^2 A_{\frac{1}{2}}(\tau _X), \end{aligned}$$
    (11)
    $$\begin{aligned} I_{g}^h&= \frac{\cos \theta _S}{2} \times A_{\frac{1}{2}}(\tau _t) + \frac{\eta \sin \theta _S}{2} A_{\frac{1}{2}}(\tau _X), \end{aligned}$$
    (12)
    $$\begin{aligned} I_{g}^s&= - \frac{\sin \theta _S}{2} \times A_{\frac{1}{2}}(\tau _t) + \frac{\eta \cos \theta _S}{2} A_{\frac{1}{2}}(\tau _X), \end{aligned}$$
    (13)
    $$\begin{aligned} I_{Z\gamma }^h= & {} \cos \theta _S \times \left( \cos \theta _W C_1(\tau _W^{-1},\eta _W^{-1})\right. \nonumber \\&+\left. \frac{2(1-\frac{8}{3}\sin ^2\theta _W)}{\cos \theta _W} C_{\frac{1}{2}}(\tau _t^{-1},\eta _t^{-1})\right) \nonumber \\&+\, 4 \sin \theta _S N_c \eta Q_X^2 \frac{\sin ^2 \theta _W}{\cos \theta _W} C_{\frac{1}{2}}(\tau _X^{-1},\eta _X^{-1}), \end{aligned}$$
    (14)
    $$\begin{aligned} I_{Z\gamma }^s= & {} - \sin \theta _S \times \left( \cos \theta _W C_1(\tau _W^{-1},\eta _W^{-1})\right. \nonumber \\&+\left. \frac{2(1-\frac{8}{3}\sin ^2\theta _W)}{\cos \theta _W} C_{\frac{1}{2}}(\tau _t^{-1},\eta _t^{-1})\right) \nonumber \\&+\, 4 \cos \theta _S N_c \eta Q_X^2 \frac{\sin ^2 \theta _W}{\cos \theta _W} C_{\frac{1}{2}}(\tau _X^{-1},\eta _X^{-1}). \end{aligned}$$
    (15)

    In the above expressions, \(A_{\frac{1}{2}}\), \(A_1\), \(C_{\frac{1}{2}}\), \(C_1\) are the loop functions defined in [176, 177] with \(\tau _\beta =m_\phi ^2/(4 m_{\beta }^{2})\) and \(\eta _\beta = m_Z^2/(4 m_{\beta }^{2})\) for \(\beta = W, t, X_i\). About these formulas, one should note that the terms proportional to \(\cos \theta _S\) in the expressions of \(I_{i}^s\) are contributed by the dilaton component of s, while those proportional to \(\sin \theta _S\) come from the H-component of s. One should also note that in the case of \(\sin \theta _S \sim 0\), which is required by the null excess in the channels such as ZZ and hh at the \(750\, \mathrm{GeV}\) invariant mass (see below) and also by the \(125\, \mathrm{GeV}\) Higgs data, \(I_{\gamma }^s\), \(I_{g}^s\), and \(I_{Z\gamma }^s\) are all dominated by the contribution from the vector-like fermions, and consequently they are correlated. Explicitly, we have \(I_{\gamma }^s:I_{g}^s:I_{Z\gamma }^s = N_c Q_X^2 : \frac{1}{2}: \frac{N_c Q_X^2}{2} \frac{\sin ^2 \theta _W}{\cos \theta _W}\) in the limit \(m_s, m_X \gg m_Z\). This correlation may serve as a test of the model at future LHC experiments.

  • The widths of the decays \(s \rightarrow V V^*\) with \(V = W, Z\). If one parameterizes the effective \(s V V^*\) interaction as

    $$\begin{aligned} \mathcal{{A}}_{s V V^*} = g_V m_V ( A_{V}^s g^{\mu \nu } + B_{V}^s p_2^{\mu } p_1^{\nu } ) \epsilon _\mu (p_1) \epsilon _{\nu } (p_2), \end{aligned}$$

    then the decay width of \(s \rightarrow V V^*\) is given by [158]

    $$\begin{aligned}&\Gamma _{s \rightarrow V V^*} = \delta _V \frac{G_F m_s^3}{16 \pi \sqrt{2}} \frac{4 m_V^4}{m_s^4} \sqrt{\lambda (m_V^2,m_V^2;m_s^2)} \nonumber \\&\quad \times \left[ A_{V}^s A^{s*}_{V} \times \left( 2 + \frac{(p_1 \cdot p_2)^2}{m_V^4} \right) \right. \nonumber \\&\quad \left. + ( A_{V}^s B^{s*}_{V} + A^{s*}_{V} B_{V}^s) \times \left( \frac{(p_1 \cdot p_2)^3}{m_V^4} - p_1 \cdot p_2 \right) \right. \nonumber \\&\quad \left. \ \ +\ B_{V}^s B^{s*}_{V} \times \left( m_V^4 + \frac{(p_1 \cdot p_2)^4}{m_V^4} - 2 (p_1 \cdot p_2)^2 \right) \right] ,\nonumber \\ \end{aligned}$$
    (16)

    where \(\delta _V=2(1)\) for \(V=W(Z)\), respectively, and \(\lambda (x,y;z)= ((z-x-y)^2 - 4 xy)/z^2\). In the MDM, we have

    $$\begin{aligned} A_W^s\simeq & {} - \sin \theta _S, \quad B_W^s \simeq 0, \\ A_Z^s\simeq & {} -\sin \theta _S + \frac{\alpha }{ 4 \pi m_Z^2} \\&\cos \theta _S N_c \eta Q_X^2 \tan ^2\theta _W p_1\cdot p_2 A_{\frac{1}{2}}(\tau _X), \\ B_Z^s\simeq & {} - \frac{\alpha }{ 4 \pi m_Z^2} \cos \theta _S N_c \eta Q_X^2 \tan ^2\theta _W A_{\frac{1}{2}}(\tau _X). \end{aligned}$$

    Note that in the expressions of \(A_Z^s\) and \(B_Z^s\), we have included the one-loop corrections. This is because in the case of \(\sin \theta _S \sim 0\), the corrections are not always smaller than the tree-level contributions. Also note that in getting \(A_Z^s\) and \(B_Z^s\), we have neglected, to a good approximation, the Z boson mass appearing in the loop functions, which is why we can express the corrections in terms of the simple function \( A_{\frac{1}{2}}(\tau _X)\).

  • The width of the tree-level decay \(s\rightarrow f\bar{f}\) with f denoting any of the fermions in the SM:

    $$\begin{aligned} \Gamma _{s\rightarrow f\bar{f}}= & {} \sin ^2\theta _S \frac{3G_\mu m^2_f m_s }{4\sqrt{2}\pi } \Big (1-\frac{4m_f^2}{m_s^2}\Big )^{\frac{3}{2}}. \end{aligned}$$
    (17)

    Note that for this kind of decays, the widths are proportional to \(\sin ^2 \theta _S\).

  • The width of the tree-level decay \(s \rightarrow h h\):

    $$\begin{aligned} \Gamma _{s \rightarrow h h}= & {} \frac{\left| C_{s hh}\right| ^2}{16\pi m_{s}^2}\left( \frac{m_s^2}{4}-m_h^2\right) ^\frac{1}{2}, \end{aligned}$$
    (18)

    where

    $$\begin{aligned} C_{s h h}= & {} - 6 \lambda _H v \sin \theta _S\cos ^2\theta _S + 6 \lambda _S f \sin ^2\theta _S\cos \theta _S \\&+ \lambda _{HS} (-v\sin ^3\theta _S+f\cos ^3\theta _S-2f\sin ^2\theta _S \cos \theta _S\nonumber \\&+2v \sin \theta _S \cos ^2\theta _S) \\\simeq & {} - \frac{2 m_s^2}{v} \sin \theta _S. \end{aligned}$$

    In getting the final expression of \(C_{shh}\), we have used the relation \(m_s^2 \gg m_h^2\) and \(\sin \theta _S \sim 0\) to neglect some unimportant terms. Just like the decays \(s \rightarrow WW^*\) and \(s \rightarrow t \bar{t}\), \(\Gamma _{s \rightarrow h h}\) is proportional to \(\sin ^2 \theta _S\).
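
The fermion loop function entering Eqs. (10)–(15) can be evaluated numerically as in the sketch below. The explicit form of \(f(\tau )\) is an assumption: the definitions of [176, 177] are not reproduced in this text, so the code uses the commonly adopted convention in which \(A_{\frac{1}{2}}(\tau ) \rightarrow 4/3\) for \(\tau \rightarrow 0\). The function names are hypothetical.

```python
import math


def f_tau(tau):
    """Auxiliary function f(tau) for tau > 0 (assumed standard convention)."""
    if tau <= 1.0:
        return complex(math.asin(math.sqrt(tau)) ** 2)
    root = math.sqrt(1.0 - 1.0 / tau)
    log_term = math.log((1.0 + root) / (1.0 - root))
    return -0.25 * (log_term - 1j * math.pi) ** 2


def a_half(tau):
    """Spin-1/2 loop function A_{1/2}(tau), tau = m_phi^2 / (4 m_loop^2).

    Assumed normalization: A_{1/2} -> 4/3 as tau -> 0. Requires tau > 0.
    """
    return 2.0 * (tau + (tau - 1.0) * f_tau(tau)) / tau**2


def amplitudes_s_pure_dilaton(eta, q_x, m_x, m_s=750.0, n_c=3):
    """I_gamma^s and I_g^s from Eqs. (11) and (13) at sin(theta_S) = 0."""
    a_x = a_half(m_s**2 / (4.0 * m_x**2))
    return n_c * eta * q_x**2 * a_x, 0.5 * eta * a_x


if __name__ == "__main__":
    for m_x in (1000.0, 1500.0):
        tau_x = 750.0**2 / (4.0 * m_x**2)
        print(m_x, round(abs(a_half(tau_x)), 4))
```

With this convention the loop function changes by only about two percent between \(m_X = 1\, \mathrm{TeV}\) and \(m_X = 1.5\, \mathrm{TeV}\).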

With these formulas, the total width of the scalar s and the s-induced diphoton rate can be written as

$$\begin{aligned}&\Gamma _\mathrm{{tot}} = \Gamma _{s \rightarrow g g} + \Gamma _{s \rightarrow \gamma \gamma } + \Gamma _{s \rightarrow Z \gamma } + \Gamma _{s \rightarrow Z Z} + \Gamma _{s \rightarrow W W^*}\nonumber \\&\qquad \qquad \quad + \Gamma _{s \rightarrow f \bar{f}} + \Gamma _{s \rightarrow h h} + \Gamma _\mathrm{{new}}, \end{aligned}$$
(19)
$$\begin{aligned}&\sigma _{\gamma \gamma }^{13 \mathrm{TeV}} = \frac{\Gamma _{\phi \rightarrow gg}}{\Gamma ^\mathrm{{SM}}_{H \rightarrow g g}} |_{m_H \simeq 750 \,\mathrm{GeV}}\nonumber \\&\qquad \qquad \quad \times \sigma ^\mathrm{{SM}}_{\sqrt{s}=13 \mathrm{TeV}} (H) \times \frac{\Gamma _{s\rightarrow \gamma \gamma }}{\Gamma _{tot}}, \end{aligned}$$
(20)

where the \(\Gamma _\mathrm{{new}}\) in Eq. (19) represents the contribution from the exotic decays of s, which may exist if the MDM is embedded in a more complex theoretical framework, \(\Gamma ^\mathrm{{SM}}_{H \rightarrow g g}\) denotes the decay width of the SM Higgs H into g g with \(m_{H} = 750\, \mathrm{GeV}\), and \(\sigma ^\mathrm{{SM}}_{\sqrt{s}=13\, \mathrm{TeV}} (H)=735\, \mathrm{fb}\) is the NNLO production rate of the H at the 13-TeV LHC [181]. Obviously, if \(\Gamma _\mathrm{{tot}}\) is determined mainly by \(\Gamma _{gg}\), the rate can be approximated by

$$\begin{aligned} \sigma _{\gamma \gamma }^{13 \mathrm{TeV}} \simeq \frac{\Gamma _{\phi \rightarrow \gamma \gamma }}{\Gamma ^\mathrm{{SM}}_{H \rightarrow g g}} |_{m_H \simeq 750\, \mathrm{GeV}} \times \sigma ^\mathrm{{SM}}_{\sqrt{s}=13 \mathrm{TeV}} (H) \propto \eta ^2 Q_X^4,\nonumber \\ \end{aligned}$$
(21)

while, if \(\Gamma _\mathrm{{tot}}\) takes a fixed value, we have

$$\begin{aligned} \sigma _{\gamma \gamma }^{13 \mathrm{TeV}} = \left( \frac{45 \mathrm{GeV}}{\Gamma _\mathrm{{tot}}} \right) \times \sigma _\mathrm{{norm}} \times (\eta Q_X)^4, \end{aligned}$$
(22)

where the normalized cross section \(\sigma _\mathrm{{norm}}\) is equal to \(0.019 \mathrm{\ fb}\) (\(0.018 \mathrm{\ fb}\)) for \(m_X = 1 \mathrm{TeV}\) (\(1.5 \mathrm{TeV}\)).
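
Eq. (22) is simple enough to evaluate directly. The sketch below is a plain transcription with \(\sigma _\mathrm{{norm}} = 0.019\, \mathrm{fb}\) (the \(m_X = 1\, \mathrm{TeV}\) value quoted above) as the default; the function names and the sample inputs are hypothetical and are not fit results.

```python
def diphoton_rate_fixed_width(eta, q_x, gamma_tot, sigma_norm=0.019):
    """Eq. (22): 13-TeV diphoton rate in fb for a fixed total width (GeV)."""
    if gamma_tot <= 0.0:
        raise ValueError("gamma_tot must be positive")
    return (45.0 / gamma_tot) * sigma_norm * (eta * q_x) ** 4


def eta_for_rate(sigma_fb, q_x, gamma_tot, sigma_norm=0.019):
    """Invert Eq. (22) for eta at a given rate (fb) and total width (GeV)."""
    if sigma_fb <= 0.0 or gamma_tot <= 0.0:
        raise ValueError("sigma_fb and gamma_tot must be positive")
    return (sigma_fb * gamma_tot / (45.0 * sigma_norm)) ** 0.25 / abs(q_x)


if __name__ == "__main__":
    # Illustrative inputs: a 4 fb rate with a 1 GeV total width.
    for q_x in (2.0 / 3.0, 5.0 / 3.0):
        print(q_x, round(eta_for_rate(4.0, q_x, 1.0), 3))
```

Because the rate scales as \((\eta Q_X)^4\), the value of \(\eta \) needed for a given rate and width scales as \(1/Q_X\).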

Table 1 Upper limits on various \(750\, \mathrm{GeV}\) resonant signals at 8-TeV LHC set by either ATLAS or CMS collaboration [158]

From the discussion in this section, one can draw the following important conclusions:

  • The widths of \(s \rightarrow gg, \gamma \gamma , Z \gamma \) or the production rates of the gg, \(\gamma \gamma \), and \(Z\gamma \) signals at the LHC are correlated by

    $$\begin{aligned}&\Gamma _{s \rightarrow g g} : \Gamma _{s \rightarrow \gamma \gamma } : \Gamma _{s \rightarrow Z \gamma } \simeq 1: \frac{9}{2} \frac{\alpha ^2}{\alpha _s^2} Q_X^4:\nonumber \\&\quad \frac{9}{4} \frac{\alpha ^2}{\alpha _s^2} \tan ^2 \theta _W Q_X^4 \simeq 1: 0.03 Q_X^4 : 0.004 Q_X^4. \end{aligned}$$
    (23)
  • The widths listed from Eqs. (7) to (18) depend on the number of the vector-like fermions \(N_X\) only through the parameter \(\eta \equiv \frac{v N_X}{f}\). As a result, explaining the diphoton excess puts non-trivial requirements on the combination \(\frac{v N_X}{f}\), instead of on the individual parameter \(N_X\) or \(y_X = \frac{\eta m_X}{v N_X}\).

  • Since the recent LHC searches for right-handed heavy quarks have required \(m_X \gtrsim 900 \ \mathrm{GeV}\) [178–180] and thus \(\tau _{X} \equiv m_s^2/(4 m_X^2) < 0.2\), the loop functions appearing in the widths change only slightly with a further increase of \(m_X\). This implies that the widths and also the cross section have a very weak dependence on the value of \(m_X\). As a result, the results obtained in this work are only sensitive to the parameters \(\eta \), \(\sin \theta _S\), and \(Q_X\). At this stage, one can infer that the parameter \(N_X\) may also be understood as the total number of the vector-like fermions with the electric charge \(Q_X\) in the strong dynamics because the contributions of the fermions to the diphoton rate are roughly identical. Since the particle content of the strong dynamics is usually rich, \(N_X\) is naturally larger than 1.

We recall that the second and third conclusions depend on the assumption that the dilaton is fully responsible for the masses of the vector-like fermions and, as far as we know, have not received attention in the previous literature.
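
The numerical ratios in Eq. (23) can be checked with a few lines of Python. The values of \(\alpha \), \(\alpha _s\), and \(\sin ^2 \theta _W\) used below are assumptions (they are not stated in the text) chosen as typical values; the function name is hypothetical.

```python
ALPHA_EM = 1.0 / 137.036  # assumed fine-structure constant
ALPHA_S = 0.09            # assumed strong coupling near 750 GeV
SIN2_THETA_W = 0.231      # assumed weak mixing angle


def width_ratios(q_x, alpha=ALPHA_EM, alpha_s=ALPHA_S, sin2_w=SIN2_THETA_W):
    """Return Gamma_gg : Gamma_gamgam : Gamma_Zgam as in Eq. (23)."""
    tan2_w = sin2_w / (1.0 - sin2_w)
    coupling_ratio = (alpha / alpha_s) ** 2
    r_gamgam = 4.5 * coupling_ratio * q_x**4
    r_zgam = 2.25 * coupling_ratio * tan2_w * q_x**4
    return 1.0, r_gamgam, r_zgam


if __name__ == "__main__":
    for q_x in (1.0, 2.0 / 3.0, 5.0 / 3.0):
        print(q_x, [round(r, 4) for r in width_ratios(q_x)])
```

With these assumed inputs and \(Q_X = 1\), the coefficients come out close to the 0.03 and 0.004 quoted in Eq. (23).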

Fig. 1 The fit results of the MDM to the \(750\, \mathrm{GeV}\) diphoton data together with the LHC Run I constraints listed in Table 1, projected on the \(\sigma _{\gamma \gamma }^{13\, \mathrm{TeV}}\)–\(\Gamma _\mathrm{{tot}}\) planes for \(Q_X=2/3\) (left panel) and \(Q_X=5/3\) (right panel), respectively. The regions filled by the colors from gray to deep blue represent the parameter spaces that can fit the diphoton data within the 3\(\sigma \), 2\(\sigma \), and 1\(\sigma \) levels, respectively, and by contrast the regions covered by straw color are excluded by the constraints. The boundaries for the hh, ZZ, and \(WW^*\) channels are also plotted, which correspond to blue lines, red lines, and brown lines, respectively, and the other constraints listed in Table 1 are too weak to be drawn on the panels. In each panel, the green line represents the best-fit samples. In getting this figure, we have set \(\Gamma _\mathrm{{new}} = 0\) and \(m_X = 1\, \mathrm{TeV}\), and we checked that \(m_X = 1.5\, \mathrm{TeV}\) predicts roughly the same results, which reflects that our results are insensitive to \(m_X\).

4 Numerical results and discussions

In this section, we discuss the diphoton excess in the MDM. In order to get the favored parameter space for the excess, we fix \(Q_X=\frac{2}{3}\) or \(\frac{5}{3}\) and \(m_X = 1\, \mathrm{TeV}\) or \(1.5\, \mathrm{TeV}\) in turn, and scan the following parameter space:

$$\begin{aligned} 0< \eta \le 2, \quad \quad |\tan \theta _S| \le 0.1. \end{aligned}$$
(24)

During the scan, we consider the following theoretical and experimental constraints:

  • The vacuum stability at the scale of \(m_s = 750\, \mathrm{GeV}\) for the scalar potential, which corresponds to the requirement \(4 \lambda _H \lambda _S - \lambda _{HS}^2 >0\) [165].

  • Constraints from the perturbativity at the scale of \(m_s = 750\, \mathrm{GeV}\), which requires \(\lambda _S, \lambda _H, \lambda _{HS} \lesssim 4 \pi \), and \(y_X \lesssim 4 \pi /\sqrt{N_c}\) [163].

  • Constraints from the electroweak precision data. We calculate the Peskin–Takeuchi S and T parameters [182] with the formulas presented in [165], and construct \(\chi ^2_\mathrm{{ST}}\) from the following experimental fit results with \(m_{h,\mathrm{ref}}=125\, \mathrm{GeV}\) and \(m_{t,\mathrm{ref}}=173\, \mathrm{GeV}\) [183]:

    $$\begin{aligned} S=0.06\pm 0.09,\quad T=0.10\pm 0.07,\quad \rho _{ST}=0.91.\nonumber \\ \end{aligned}$$
    (25)

    In our calculation, we require that the samples satisfy \(\chi ^2_{ST} \le 6.18\).

  • Experimental constraints from the 125 GeV Higgs data, which include the updated exclusive signal rates for the \(\gamma \gamma \), \(ZZ^*\), \(W W^*\), \(b\bar{b}\), and \(\tau \bar{\tau }\) channels [184, 185]. We perform the fits as in our previous papers [167, 186], and require the samples to agree with the combined data at the \(2\sigma \) level.

  • Experimental constraints from the null results in the search for the 750 GeV resonance through other channels such as \(s \rightarrow Z Z, hh\) at Run I, just like what we did in [158]. The upper bounds on these channels at \(95~\%\) C.L. are listed in Table 1.
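
The electroweak precision cut above can be sketched as follows, assuming that \(\chi ^2_\mathrm{{ST}}\) is the standard \(\chi ^2\) of a correlated two-dimensional Gaussian built from Eq. (25); the text does not spell out its explicit form, so this is an assumption. The function names are hypothetical, and the predicted S and T values must be supplied from the formulas of [165], which are not reproduced here.

```python
S_EXP, S_ERR = 0.06, 0.09   # central value and error of S, Eq. (25)
T_EXP, T_ERR = 0.10, 0.07   # central value and error of T, Eq. (25)
RHO_ST = 0.91               # correlation coefficient, Eq. (25)
CHI2_ST_MAX = 6.18          # cut quoted in the text


def chi2_st(s_pred, t_pred):
    """Correlated two-parameter chi^2 (assumed bivariate Gaussian form)."""
    ds = (s_pred - S_EXP) / S_ERR
    dt = (t_pred - T_EXP) / T_ERR
    return (ds**2 + dt**2 - 2.0 * RHO_ST * ds * dt) / (1.0 - RHO_ST**2)


def passes_st_cut(s_pred, t_pred):
    """True if the sample satisfies chi^2_ST <= 6.18."""
    return chi2_st(s_pred, t_pred) <= CHI2_ST_MAX


if __name__ == "__main__":
    # Vanishing new-physics contribution as a reference point.
    print(round(chi2_st(0.0, 0.0), 3), passes_st_cut(0.0, 0.0))
```

The strong positive correlation means that shifts of S and T in the same direction are penalized much less than shifts in opposite directions.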

For each sample surviving the constraints, we perform a fit to the \(750 \,\mathrm{GeV}\) diphoton data collected at the 8-\( \mathrm{TeV}\) and the 13-\(\mathrm{TeV}\) LHC. In doing this, we use the method introduced in [9], where the data were given by

$$\begin{aligned} \mu ^{exp}_i= & {} \sigma (pp\rightarrow \gamma \gamma )\nonumber \\= & {} \left\{ \begin{array}{lllll} 0.63\pm 0.25 ~\mathrm{fb} &{}\quad \mathrm{CMS} &{}\mathrm{at} ~\sqrt{s} = 8 &{}~\mathrm{TeV }, \\ 0.46\pm 0.85 ~\mathrm{fb} &{}\quad \mathrm{ATLAS} &{}\mathrm{at} ~\sqrt{s} = 8 &{}~\mathrm{TeV}, \\ 5.6\pm 2.4 ~\mathrm{fb} &{}\quad \mathrm{CMS} &{}\mathrm{at} ~\sqrt{s} = 13 &{}~\mathrm{TeV }, \\ 6.2^{+2.4}_{-2.0} ~\mathrm{fb} &{}\quad \mathrm{ATLAS} &{}\mathrm{at} ~\sqrt{s} = 13 &{}~\mathrm{TeV}, \\ \end{array}\right. \nonumber \\ \end{aligned}$$
(26)

and the \(\chi ^2_{\gamma \gamma }\) function was given by [9, 158]

$$\begin{aligned} \chi ^2= & {} \sum _{i=1}^4 \chi ^2_i, \nonumber \\ \chi ^2_i= & {} \left\{ \begin{array}{ll} 2[\mu _i^{exp} {-} \mu _i {+} \mu _i \mathrm{ln} \frac{\mu _i}{\mu _i^{exp}} ] &{}\quad \mathrm{for~ the~ 13~TeV~ATLAS~ data}, \\ \frac{(\mu _i^{exp}-\mu _i)^2}{\sigma _{\mu _i^{exp}}^2} &{}\quad \mathrm{for~the~other~three~sets~of~data}, \\ \end{array}\right. \nonumber \\ \end{aligned}$$
(27)

with \(\mu _i\) denoting the theoretical prediction of the diphoton rate.
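
The \(\chi ^2\) of Eqs. (26) and (27) can be transcribed as in the sketch below. The function takes the predicted 8-TeV and 13-TeV rates as separate inputs; the factor 4.7 used in the usage example to relate the two is an assumed gluon-fusion rate ratio, not a number given in this text, and the names are hypothetical.

```python
import math

# (label, sqrt(s) in TeV, central value in fb, symmetric error in fb), Eq. (26)
GAUSSIAN_DATA = [
    ("CMS", 8, 0.63, 0.25),
    ("ATLAS", 8, 0.46, 0.85),
    ("CMS", 13, 5.6, 2.4),
]
ATLAS13_CENTRAL = 6.2  # fb, treated with the first line of Eq. (27)


def chi2_diphoton(mu8, mu13):
    """Eq. (27) for predicted rates mu8 and mu13 (both in fb, positive)."""
    if mu8 <= 0.0 or mu13 <= 0.0:
        raise ValueError("predicted rates must be positive")
    chi2 = 2.0 * (ATLAS13_CENTRAL - mu13
                  + mu13 * math.log(mu13 / ATLAS13_CENTRAL))
    for _label, sqrt_s, central, err in GAUSSIAN_DATA:
        mu = mu8 if sqrt_s == 8 else mu13
        chi2 += (central - mu) ** 2 / err**2
    return chi2


if __name__ == "__main__":
    R_13_OVER_8 = 4.7  # assumed ratio of 13-TeV to 8-TeV production rates
    mu13 = 3.9
    print(round(chi2_diphoton(mu13 / R_13_OVER_8, mu13), 2))
```

Under the assumed rate ratio, a 13-TeV rate of \(3.9\, \mathrm{fb}\) gives \(\chi ^2 \approx 2.3\).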

Fig. 2 Same samples as those in Fig. 1, but projected on the \(\eta \)–\(\tan \theta _S\) planes. Although we take \(m_X = 1\, \mathrm{TeV}\) in getting this figure, we checked that setting \(m_X = 1.5\, \mathrm{TeV}\) produces no distinguishable difference in the figure, as explained in the comments below Eq. (23).

In the following, we only consider the samples surviving the first four constraints. In Fig. 1, we project these samples on the \(\sigma _{\gamma \gamma }^{13\, \mathrm{TeV}}\)–\(\Gamma _{\mathrm{tot}}\) planes for \(Q_X=2/3\) (left panel) and \(Q_X=5/3\) (right panel), respectively. The details of this figure are explained in its caption. From this figure, one can observe the following:

  • The central value of the diphoton rate is \(3.9 \mathrm{fb}\) at the 13-\(\mathrm{TeV}\) LHC from the fit, and the \(1\sigma \), \(2 \sigma \) and \(3 \sigma \) ranges of the rate are \((2.5 \sim 5.3) \mathrm{\ fb}\), \((1.5 \sim 6.3) \mathrm{\ fb}\), \( (0.2 \sim 7.9) \mathrm{\ fb}\) respectively. Note that this conclusion is independent of the value of \(Q_X\).

  • For both \(Q_X= \frac{2}{3}\) and \(Q_X= \frac{5}{3}\) cases, the diphoton excess can be well explained. The difference of the two options comes from the fact that for \(Q_X= \frac{2}{3}\) case, \(\Gamma _\mathrm{{tot}} \lesssim 0.15\, \mathrm{GeV} \) if one wants to explain the excess at \(2\sigma \) level, while for \(Q_X= \frac{5}{3}\) case, \(\Gamma _\mathrm{{tot}} \lesssim 1.6\, \mathrm{GeV} \). The reason for such a difference is that in the \(Q_X= \frac{5}{3}\) case, \(\sin \theta _S\) can take a larger value (see discussion below).

  • Among the channels listed in Table 1, the hh channel puts the tightest constraints on the parameter space regardless of the value of \(Q_X\).

Table 2 Detailed information for one best-fit point in each of the left and right panels of Fig. 2 (labeled \(P_1\) and \(P_2\) hereafter). We checked that both points predict \(\chi ^2_{\gamma \gamma }=2.32\), which corresponds to a p-value of 0.68

Fig. 3 Correlations of the diphoton rate at the 13-\(\mathrm{TeV}\) LHC with those of the ZZ, \(WW^*\), hh, and \(t\bar{t}\) signals, respectively, for the \(Q_X = \frac{2}{3}\) case, shown on the \(\eta \)–\(\tan \theta _S\) planes. Colors in this figure have the same meanings as those in Fig. 2, and from left to right and upper to lower panels, the constant contours (red lines) of the production rates for the ZZ, \(WW^*\), hh, and \(t\bar{t}\) signals are shown, respectively. The numbers on the red lines represent the corresponding production rates at the 13-\(\mathrm{TeV}\) LHC. Note that the correlations of the diphoton rate with those of the gg and \(Z\gamma \) signals are presented in Eq. (23).

Fig. 4 Similar to Fig. 3, but for the \(Q_X = \frac{5}{3}\) case

Next we illustrate the favored parameter regions for the excess. For this purpose, we project the samples used in Fig. 1 on the \(\eta \)–\(\tan \theta _S\) planes, which are shown in Fig. 2. This figure indicates the following facts:

  • In order to explain the diphoton excess at the \(2 \sigma \) level, \( 0.65 \le \eta \le 1.55\) and \(|\tan \theta _S| \le 0.012\) are preferred for the \(Q_X = \frac{2}{3}\) case, and by contrast \( 0.15 \le \eta \le 0.8\) and \(|\tan \theta _S| \le 0.06\) are preferred for the \(Q_X = \frac{5}{3}\) case. Note that in the \(Q_X = \frac{5}{3}\) case, a smaller \(\eta \) as well as a wider range of \(\tan \theta _S\) are favored to explain the excess in comparison with the \(Q_X = \frac{2}{3}\) case. The reason is that a larger \(Q_X\) greatly increases the width and also the branching ratio of \(s \rightarrow \gamma \gamma \), which in turn requires a smaller s production rate to explain the excess.

  • The channels listed in Table 1 exclude the parameter space characterized by a large \(\eta \) and/or a large \(|\tan \theta _S|\). For these cases, the production rates of the channels are usually enhanced, which can be inferred from the expressions of the widths.

  • In case of \(\tan \theta _S \simeq 0\), the \(Z \gamma \) channel may impose upper bounds on \(\eta \), which is shown in the right panel of Fig. 2.

  • The favored parameter space is not symmetric under a sign reversal of \(\tan \theta _S\), and this asymmetry becomes more obvious for larger \(Q_X\) and \(|\tan \theta _S|\). The asymmetry originates from the expressions of \(\Gamma _{s \rightarrow g g}\), \(\Gamma _{s \rightarrow \gamma \gamma }\), \(\Gamma _{s \rightarrow Z \gamma }\), and \(\Gamma _{s \rightarrow Z Z}\), which are presented in Eqs. (7) to (16).

In Table 2, we show the detailed information for one best-fit point in each of the left and right panels of Fig. 2. In the following, we label the two points by \(P_1\) and \(P_2\), respectively. From this table, one can learn that to explain the diphoton excess in the MDM, the branching ratio of \(s \rightarrow \gamma \gamma \) is usually at the \(1~\%\) level, which is significantly larger than that of the Higgs boson in the SM. One can also learn that for the best points, \(s \rightarrow g g\) may be either the dominant or a subdominant decay channel of the s.

Finally, we study the correlations of the diphoton rate at the 13-\(\mathrm{TeV}\) LHC with the rates of the ZZ, \(WW^*\), hh, and \(t\bar{t}\) signals, respectively. The results are presented in Fig. 3 for the \(Q_X = \frac{2}{3}\) case, with the details of the figure explained in its caption. This figure reveals the following information:

  • Current LHC data have put upper limits on the rates of the different signals at the 13-TeV LHC, which are \(\sigma _{ZZ} \lesssim 48\, \mathrm{fb}\), \(\sigma _{WW} \lesssim 96\, \mathrm{fb}\), \(\sigma _{hh} \lesssim 190\, \mathrm{fb}\), and \(\sigma _{t\bar{t}} \lesssim 19\, \mathrm{fb}\).

  • Since for a moderately small \(\sin \theta _S\), the sZZ, sWW, shh, and \(st\bar{t}\) couplings are roughly proportional to \(\sin \theta _S\simeq \tan \theta _S\), the constant contours of the signal rates exhibit similar behaviors on the \(\eta \)–\(\tan \theta _S\) plane. Obviously, if the diphoton excess persists at future LHC experiments and meanwhile none of the other signals is observed, a small \(\tan \theta _S\) is preferred.

  • More importantly, if more than one type of signal is measured at the future LHC experiments, one can determine the parameters of the MDM. For example, given that \(\sigma _{\gamma \gamma }\) and \(\sigma _{jj}\) are precisely known, one can get the value of \(Q_X\), and if \(\sigma _{\gamma \gamma }\) and \(\sigma _{ZZ}\) are also measured, one can pin down the favored regions of \(\eta \) and \(\sin \theta _S\).

In Fig. 4, we show the correlations of the different signals for the \(Q_X = \frac{5}{3}\) case. The features of this figure are quite similar to those of Fig. 3 except that: (i) the diphoton rate is now more sensitive to \(\eta \) and \(\sin \theta _S\), so a more precise measurement of the diphoton signal is needed to extract the values of the two parameters in this case; (ii) the asymmetry of the rates at the 13-TeV LHC between \(+\tan \theta _S\) and \(-\tan \theta _S\) becomes more obvious.

Table 3 The scale where the vacuum becomes unstable for different choices of the vector-like fermion number \(N_X\). Here the scale \(\mu \) is in units of GeV, and the points \(P_1\) and \(P_2\) correspond to the two benchmark points in Table 2. We checked that for the point \(P_2\) with \(N_X =5, 6\), the vacuum remains stable until \(\lambda _H\) reaches its Landau pole, which is roughly at \(5.6 \times 10^{11}\, \mathrm{GeV}\) and \(3.8 \times 10^{10}\, \mathrm{GeV}\), respectively. We also checked that for the point \(P_2\) with \(N_X = 4 \), the Landau pole of \(\lambda _H\) is roughly at \(2.2 \times 10^{13}\, \mathrm{GeV}\).

5 Vacuum stability at high energy scale

About one week before we finished this work, several papers appeared that discuss the vacuum stability in a theoretical framework quite similar to the MDM [161–163]. The main argument of these papers was that, in order to explain the diphoton excess, the Yukawa coupling \(y_X\) must be so large that the vacuum becomes unstable at a certain high energy scale.Footnote 2 In our opinion, the MDM may be free of this problem for the following two reasons. One is that the MDM is actually a low energy effective theory describing the breakdown of a strong dynamics with approximate scale invariance. This means that the physics beyond the MDM must appear at a certain high energy scale. The other is that, as we emphasized in Sect. 3, the diphoton excess actually imposes non-trivial requirements on the parameter \(\eta \equiv \frac{v N_X}{f}\), instead of on the Yukawa coupling \(y_X \equiv \frac{\eta m_X}{v N_X}\) directly. For a given value of \(\eta \), one may increase \(N_X\) to suppress the Yukawa coupling \(y_X\), and thus alleviate the problem. In order to verify our speculation, we assume that there are no particles in the strong interaction sector other than the vector-like fermions, and consider the two benchmark points presented in Table 2. We repeat the analysis in [163], i.e. we use the same RGEs as those in [163] to run all parameters in the MDM, and also consider the threshold correction to \(\lambda _S\) at the scale \(m_X\). In Table 3, we present the scale where the vacuum becomes unstable for different choices of \(N_X\). This table indicates that moderately large \(N_X\) and \(Q_X\) are helpful to stabilize the vacuum state.
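The suppression of \(y_X\) by a larger \(N_X\) at fixed \(\eta \) follows directly from the definition \(y_X \equiv \frac{\eta m_X}{v N_X}\). A minimal numerical sketch is given below. The inputs \(\eta = 1\) and \(m_X = 1\, \mathrm{TeV}\) are hypothetical values chosen for illustration (they are not the benchmark points of Table 2), \(v = 246\, \mathrm{GeV}\) is the standard electroweak vacuum expectation value, and the function name `yukawa` is our own.

```python
V_EW = 246.0  # electroweak vacuum expectation value in GeV (standard value)


def yukawa(eta, m_x, n_x, v=V_EW):
    """Yukawa coupling y_X = eta * m_X / (v * N_X), as defined in the text."""
    return eta * m_x / (v * n_x)


# Hypothetical inputs for illustration only (not the points P_1, P_2).
eta, m_x = 1.0, 1000.0  # m_x in GeV
for n_x in (1, 2, 4, 6):
    print(n_x, round(yukawa(eta, m_x, n_x), 3))
```

At fixed \(\eta \) and \(m_X\), the coupling falls as \(1/N_X\), so going from \(N_X = 1\) to \(N_X = 4\) reduces \(y_X\) by a factor of four.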

Finally, we recall that, although large \(Q_X\) and/or \(N_X\) are welcome for explaining the excess, they cannot be arbitrarily large in the extension of the SM by one gauge singlet scalar and the vector-like fermions. The reason is that the \(\beta \) function of the gauge coupling \(g_1\) is given by \(\beta _{g_1} = ( \frac{41}{10} + N_X Q_X^2 \frac{12}{5} ) g_1^3\) [163], and consequently \(g_1\) increases rapidly with the RGE energy scale for large \(N_X\) and \(Q_X\). In this case, the \(\beta \) function of \(\lambda _H\) is dominated by the term proportional to \(g_1^4\), and consequently, \(\lambda _H\) may reach its Landau pole at an energy scale not far above the weak scale.
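The rapid growth of \(g_1\) can be illustrated with the one-loop solution of its RGE. The sketch below rests on several assumptions that are not stated in the text: the quoted \(\beta \) function is taken to carry the conventional factor \(1/(16\pi ^2)\), i.e. \(\mathrm{d} g_1/\mathrm{d}\ln \mu = b_1 g_1^3/(16\pi ^2)\); the vector-like fermions are taken to contribute from the starting scale onwards (no threshold treatment); and the input \(g_1 \approx 0.46\) at \(\mu _0 = 1\, \mathrm{TeV}\) is only an approximate illustrative value. It estimates the pole of \(g_1\) itself, not that of \(\lambda _H\), and is not the full analysis of [163].

```python
import math


def b1(n_x, q_x):
    """Coefficient of g_1^3 quoted in the text: 41/10 + (12/5) N_X Q_X^2."""
    return 41.0 / 10.0 + 12.0 / 5.0 * n_x * q_x ** 2


def inv_g1_sq(mu, n_x, q_x, g1_0=0.46, mu0=1000.0):
    """One-loop 1/g_1^2 at scale mu (GeV).

    Assumes dg_1/dln(mu) = b1 * g_1^3 / (16 pi^2), which integrates to
    1/g_1^2(mu) = 1/g_1^2(mu0) - b1 / (8 pi^2) * ln(mu / mu0).
    """
    return 1.0 / g1_0 ** 2 - b1(n_x, q_x) / (8.0 * math.pi ** 2) * math.log(mu / mu0)


def log10_landau_pole(n_x, q_x, g1_0=0.46, mu0=1000.0):
    """log10 of the scale (GeV) at which the one-loop 1/g_1^2 vanishes."""
    ln_ratio = 8.0 * math.pi ** 2 / (b1(n_x, q_x) * g1_0 ** 2)
    return math.log10(mu0) + ln_ratio / math.log(10.0)


# Illustrative choices of (N_X, Q_X); not tied to the benchmark points.
for n_x, q_x in [(1, 2.0 / 3.0), (6, 2.0 / 3.0), (6, 5.0 / 3.0)]:
    print(n_x, round(q_x, 3), round(log10_landau_pole(n_x, q_x), 1))
```

Under these assumptions the pole scale decreases exponentially with \(N_X Q_X^2\), which is the qualitative reason why \(N_X\) and \(Q_X\) cannot be taken arbitrarily large.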

6 Conclusion

The MDM extends the SM by adding vector-like fermions and one gauge singlet scalar, which represents a linearized dilaton field. In this theory, the couplings of the dilaton to gg and \(\gamma \gamma \) are induced by the loops of the vector-like fermions, and they may be sizable in comparison with the H g g and \(H \gamma \gamma \) couplings in the SM. On the other hand, due to the singlet nature of the dilaton, its decays into the other SM particles are suppressed. These features make the diphoton signal of the dilaton potentially detectable at the LHC.

In this work, we tried to interpret the diphoton excess recently reported by the ATLAS and CMS collaborations at the 13-\(\mathrm{\ TeV}\) LHC in the framework of the MDM. For this purpose, we first showed by analytic formulas that the production rates of the \(\gamma \gamma \), gg, \(Z\gamma \), ZZ, \(WW^*\), \(t\bar{t}\), and hh signals at the \(750\, \mathrm{GeV}\) resonance are only sensitive to the dilaton–Higgs mixing angle \(\theta _S\) and the parameter \(\eta \equiv v N_X/f\), where \(N_X\) denotes the number of the vector-like fermions and f is the dilaton decay constant. Then we scanned the two parameters to find the solutions to the excess. During the scan, we considered various theoretical and experimental constraints, which included the vacuum stability and the perturbativity of the theory at the scale of \(m_s\), the electroweak precision data, the 125-\(\mathrm{GeV}\) Higgs data, the LHC searches for exotic quarks, and the upper bounds on the rates of the ZZ, \(WW^*\), \(Z\gamma \), \(t\bar{t}\), and hh signals at LHC Run I. We concluded that the model can predict the central value of the diphoton rate without conflicting with any of the constraints. Moreover, after determining the parameter space for the excess, we discussed the signatures of the theory at the LHC Run II. We showed that the rates of the \(WW^*\) and hh signals may still reach about \(100 \mathrm{\ fb}\) and \(200 \mathrm{\ fb}\), respectively, at the 13-\(\mathrm{\ TeV}\) LHC, and thus they provide good prospects for detection in the future.

As an indispensable part of this work, we also discussed the vacuum stability of the theory at high energy scales. We showed that, by choosing moderately large \(N_X\) and \(Q_X\), the vacuum in our explanation can remain stable up to \(10^{11}\, \mathrm{GeV}\).

Note added: When we finished this work at the beginning of this January, we noted that two papers had appeared trying to explain the diphoton excess with the dilaton field [159, 160]. However, after reading these papers, we learned that the paper [159] considered the traditional dilaton model, and the paper [160] focused on 5D warped models. So their studies are quite different from ours. We also noted that by then there existed several papers studying the diphoton excess in the model which extends the SM by one gauge singlet scalar field and vector-like fermions [10, 161, 163]. Compared with these works, our study has the following features (which we consider improvements):

  • We considered a generic model which predicts \(N_X\) vector-like fermions (by contrast, most of the previous studies considered the most economical \(N_X = 1\) case). This enables us to explain the diphoton excess without invoking a large Yukawa coupling \(y_X\). Such a treatment, as we have discussed in Sect. 5, is helpful to retain vacuum stability of the theory at high energy scales.

  • More importantly, by assuming that the dilaton field is fully responsible for the masses of the vector-like fermions, we showed by analytic formulas that the rates for all the signals discussed in this work, such as \(\gamma \gamma \), gg, \(Z \gamma \), \(VV^*\), \(f\bar{f}\), and hh, are only sensitive to the parameter \(\eta = \frac{v N_X}{f}\), the dilaton–Higgs mixing angle \(\theta _S\) and the electric charge of the fermions \(Q_X\). This observation can greatly simplify the analysis of the diphoton excess, and to the best of our knowledge, it did not receive due attention in previous studies.

  • We considered various constraints on the model, especially those from different observations at the LHC Run I (which were listed in Table 1), and we concluded that the hh signal usually puts the tightest constraint on our explanation. This conclusion is rather new. Moreover, we also studied the signatures of our explanation at the LHC Run II, which are helpful to determine the parameters of the model. Such a study was absent in the previous literature.

Before we end this work, we would like to clarify its relation with our previous work [158], where we utilized the singlet extension of the Manohar–Wise model to explain the diphoton excess. In both works, the scalar sector of the considered model contains a doublet and a singlet scalar field, which mix to form a 125 GeV SM-like Higgs h and a 750 GeV new scalar s, and the \(s\gamma \gamma \) and s g g interactions are induced by colored particles through loop effects. In organizing these works, we first introduced the theoretical framework and listed the formulae for the partial widths of the scalar s, then we analyzed various constraints on the model and discussed the diphoton signal from the process \(gg \rightarrow s \rightarrow \gamma \gamma \). We concluded that both models can predict the central value of the excess in their vast parameter space. Since the two works adopted the same \(\chi ^2\) function for the excess, which only depends on the diphoton rate, the \(\chi ^2\) values for the best points are the same in the two explanations. In spite of these similarities, we still think that the two works are independent since they are based on different physics. The differences are reflected in the following aspects:

  • The origin of the singlet dominated scalar s. In the work [158], the singlet field is imposed by hand and only for interpreting the excess, while in this work it corresponds to a linearized dilaton field, which is well motivated by the breaking of a strong dynamics with approximate scale invariance.

  • The mechanism to generate sizable \(s\gamma \gamma \) and s g g interactions. In the singlet extension of the Manohar–Wise model, these interactions are induced by color-octet and isospin-doublet scalars \(S_R^A\), \(S_I^A\), and \(S_\pm ^A\) with \(A=1, \ldots , 8\) denoting the color index (note that there are 32 bosonic degrees of freedom in total), so their coupling strengths are proportional to \((C_{s S_i^{A *} S_i^A} v)/m_{S_i}^2 A_0(\tau _{S_i})\) with \(C_{s S_i^{A *} S_i^A}\) denoting the coupling coefficient for the \(s S_i^{A *} S_i^A\) interaction. As a comparison, the couplings in this work are induced by the vector-like fermions, and their strengths are determined by the factor \(\eta A_{\frac{1}{2}} (\tau _X)\). Since the loop function \(A_0\) is usually several times smaller than the function \(A_{\frac{1}{2}}\) [176, 177], besides the large number of bosonic degrees of freedom, a large \(C_{s S_i^{A *} S_i^A}\) and meanwhile moderately light \(S_i^A\) are also necessary to get the same sizes of the strengths as those in this work. By contrast, we only need to tune the value of \(\eta \) to get the right couplings for the excess in this work. So the explanation presented here is rather simple and straightforward.

  • The intrinsic features of the explanations. Due to the particle assignments of the models, the two explanations exhibit different features. For example, for the explanation in [158] the upper limit of the dijet channel in Table 1 has constrained the diphoton rate to be less than about \(7.5\, \mathrm{fb}\) [158, 199], while in the present work the constraint from the dijet channel on the rate is rather loose. Another example is that for the explanation in [158], the vacuum stability can never constrain the model parameters, while in this work it acts as a main motivation to consider moderately large \(N_X\) and \(Q_X\) in order to maintain vacuum stability.
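The statement above that \(A_0\) is several times smaller than \(A_{\frac{1}{2}}\) can be checked numerically. The sketch below assumes one commonly used convention for the loop functions, \(A_{\frac{1}{2}}(\tau ) = 2[\tau + (\tau - 1) f(\tau )]/\tau ^2\) and \(A_0(\tau ) = -[\tau - f(\tau )]/\tau ^2\) with \(\tau = m_s^2/(4 m^2)\) and \(f(\tau ) = \arcsin ^2\sqrt{\tau }\) for \(\tau \le 1\); the sign and normalization conventions of [176, 177] may differ. The loop-particle mass of \(1\, \mathrm{TeV}\) is a hypothetical input and the function names are our own.

```python
import math


def f(tau):
    """f(tau) = arcsin^2(sqrt(tau)); this sketch covers only 0 < tau <= 1."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("this sketch only covers 0 < tau <= 1")
    return math.asin(math.sqrt(tau)) ** 2


def a_half(tau):
    """Spin-1/2 loop function: 2 [tau + (tau - 1) f(tau)] / tau^2."""
    return 2.0 * (tau + (tau - 1.0) * f(tau)) / tau ** 2


def a_zero(tau):
    """Spin-0 loop function: -[tau - f(tau)] / tau^2."""
    return -(tau - f(tau)) / tau ** 2


def tau_of(m_s, m_loop):
    """tau = m_s^2 / (4 m_loop^2), with both masses in the same units."""
    return m_s ** 2 / (4.0 * m_loop ** 2)


# 750 GeV scalar with a hypothetical 1 TeV particle running in the loop.
tau = tau_of(750.0, 1000.0)
print(a_half(tau), a_zero(tau), a_half(tau) / a_zero(tau))
```

In this convention the two functions approach \(4/3\) and \(1/3\) for a very heavy loop particle, so the ratio is close to four over the mass range relevant here.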