1 Introduction

The detection of gravitational waves (GWs) has become one of the most important tasks in modern astrophysics and cosmology, not only because it can further test Einstein’s General Relativity, but also because it provides an independent probe of our Universe, known as “standard sirens” [1]. It is therefore no surprise that the first direct detection of GWs from two merging black holes by the Laser Interferometer Gravitational-Wave Observatory (LIGO) [2] attracted enormous attention and was awarded the Nobel Prize in Physics in 2017. Since then, LIGO and Virgo have announced several black hole GW events [3,4,5,6] as well as one neutron star GW event [7], which indicate the coming of a new “gravitational wave era”.

However, like electromagnetic waves, GWs are distributed over a very wide frequency range, from \(10^{-16}\) Hz to \(10^{12}\) Hz. Therefore, to extract the full information carried by GWs, we need different probes for different frequency ranges. Besides LIGO and Virgo (\(\sim 10^2\) Hz), the existing and planned GW detection programs include FAST (\(10^{-8}\) Hz\(\sim \) nHz with annual modulation, same as PTA) [8], KAGRA (kHz, almost the same target range as LIGO and Virgo) [9], LISA/TianQin/Taiji (mHz) [10,11,12], EPTA (nHz) [13], AliCPT [14,15,16] (\(\sim 10^{-16}\) Hz) and so on, all of which are devoted to building “multi-band gravitational wave astronomy”.

Among the various frequency bands, the most difficult to detect might be the one with ultra-low frequency (\(\sim 10^{-16}\) Hz) and ultra-long wavelengths (of the order of the size of the observable Universe), as well as very low amplitudes. Gravitational waves of this kind are believed to be generated at the very early stages of the Universe, probably during the inflationary era [17, 18], and are thus dubbed Primordial Gravitational Waves (PGWs). However, these PGWs, also known as primordial tensor perturbations, can affect the CMB photons before last scattering, and thus leave imprints on the CMB sky map in the form of B mode polarization [19,20,21,22,23]. Since different evolution histories of the early Universe give different behavior of the primordial tensor perturbations, and hence different polarization features in the CMB map, observations of the polarization (and therefore of the primordial gravitational waves) can be used as a probe to test models of the early Universe, especially inflation.

Practically, as done in all-sky surveys from WMAP [24] to PLANCK [25], the measurement of the polarization of CMB photons can be translated into constraints on parameters of inflation models, such as the spectral amplitude \(A_s\), the spectral index \(n_s\), and the tensor/scalar ratio r. The first two correspond to the primordial scalar perturbations, while the last one involves both scalar and tensor ones. From the newly released PLANCK 2018 data [26], the constraints on these parameters are \(\ln (10^{10}A_s)=3.044\pm 0.014\) (\(68\%\) C.L.), \(n_s=0.9649\pm 0.0042\) (\(68\%\) C.L.) (TT, TE, EE + lowE + lensing), and \(r_{0.002}<0.064\) (\(95\%\) C.L., TT, TE, EE + lowE + lensing + BK14). As can be seen from these data, we still have only an upper bound on r, which is being continuously lowered, whereas \(A_s\) and \(n_s\) are constrained from both above and below. This means that primordial gravitational waves are very weak and very difficult to detect, and that the current constraints on them, although much improved, still need considerable further development.

In 2014, we proposed a ground-based CMB experiment called AliCPT in Ali of Tibet, China, which aims to search for PGWs by detecting such B mode polarization [14,15,16]. As an experiment in the northern hemisphere, it can cover up to \(65\%\) of the sky map, and thus becomes a very important counterpart to other ground-based experiments, such as those in Chile (Atacama Cosmology Telescope [27], POLARBEAR [28]) and at the South Pole (South Pole Telescope [29], BICEP [30]).

The very ambitious scientific goal of AliCPT is to further improve the sensitivity to r and to tighten the limit on r by one order of magnitude [14,15,16]. The significance of searching for PGWs is at least twofold: if we succeed in detecting PGWs, we will have evidence that they do exist, with the tensor/scalar ratio lying within the AliCPT region, namely \(r\in (0.01,0.064)\). On the other hand, if PGWs are still not detected by then, the upper bound on the tensor/scalar ratio will be lowered again, indicating that inflation models with an even smaller tensor/scalar ratio (\(r<0.01\)) will be favored, examples of which include the Starobinsky model [31, 32] and the ultra-slow-roll [33]/constant-roll [34, 35] inflation models, among many other models in the literature.

In this paper, we investigate how the tensor/scalar ratio can be made small in various inflation models, in particular small enough to meet the forthcoming observational data. To do this, we start with a general form of the action, offered by the effective field theory (EFT) approach [36,37,38,39,40,41,42,43,44]. It has been shown in [38, 40, 43, 44] that the EFT action used there is very general at least up to the quadratic level, and is able to cover a large class of field actions with second-order equations of motion, such as Horndeski [45] and GLPV theories [46]. Therefore, we can obtain a general form of the tensor/scalar ratio r that can be reduced to various concrete models, and this allows us to discuss constraints on the parameters of those models. Note that similar work has been done in [47], with more focus on the tensor perturbation itself, such as its power spectrum and spectral index.

The rest of the paper is organized as follows: in Sect. 2, we introduce the general action in light of the work in [38, 40, 43, 44], and by calculating both scalar and tensor perturbations, we find a very general expression for the tensor/scalar ratio. In Sect. 3, we apply our results by reducing the general form of r to concrete examples. We show its relation to the various slow-varying parameters of each model and obtain the range of each pair of parameters, requiring r to lie within the regions of detection/non-detection of PGWs. We also show how r in these models can deviate from the usual consistency relation of inflation models. Section 4.3 includes our final remarks and discussions.

2 From the general inflation action to the tensor/scalar ratio

Based on the metric in the ADM form:

$$\begin{aligned} ds^2=-N^2dt^2+h_{ij}(dx^i+N^idt)(dx^j+N^jdt)~ \end{aligned}$$
(1)

with N and \(N^i\) the lapse function and shift vector while \(h_{ij}\) the 3-dimensional spatial metric, the very general action up to quadratic perturbation level (in EFT form) is given by [36, 38, 40, 43, 44]:

$$\begin{aligned} S= & {} \int d^4x\sqrt{-g}\Big [{M_p^2\over 2} f(t)R-\Lambda (t)-c(t)g^{00}~\nonumber \\&+{m_2^4(t)\over 2}(\delta g^{00})^2-{m_3^3(t)\over 2}\delta K\delta g^{00}\nonumber \\&-m_4^2(t)\left( \delta K^2-\delta K_{\mu \nu }\delta K^{\mu \nu } \right) +{{\tilde{m}}_4^2(t)\over 2}R^{(3)}\delta g^{00}~\nonumber \\&-{\bar{m}}_4^2(t)\delta K^2+{{\bar{m}}_5(t)\over 2}R^{(3)}\delta K +{{\bar{\lambda }}(t)\over 2}(R^{(3)})^2+ \cdots ~\nonumber \\&-{{\tilde{\lambda }}(t)\over M_p^2}\nabla _iR^{(3)}\nabla ^iR^{(3)} + \cdots \Big ]~, \end{aligned}$$
(2)

The first line describes the background, and the rest describe the perturbations up to second order. Note that, in the spirit of the EFT approach, the action is written order by order in the perturbations, and the ellipses denote all terms of order higher than 2. In the action, we define \(\delta K_{\mu \nu }=K_{\mu \nu }-H \Theta _{\mu \nu },~\delta K=K-3H\), where the induced metric is \(\Theta _{\mu \nu }\equiv g_{\mu \nu }+n_\mu n_\nu \) and the normal vector is defined as \(n_\mu \equiv (-N,0,0,0)\). Moreover, since the terms with coefficients \({\bar{m}}_4\), \({\bar{m}}_5\), \({\bar{\lambda }}\) and \({\tilde{\lambda }}\) involve higher spatial (but not time) derivatives, in the following analysis we turn them off by setting \({\bar{m}}_4={\bar{m}}_5={\bar{\lambda }}={\tilde{\lambda }}=0\).

2.1 The background equations of motion

The background of the metric (1) is of the well-known FLRW form, which is in the diagonal form of \(\{-1,a^2(t), a^2(t), a^2(t)\}\). It is straightforward to get the background equations from action (2), by varying the first line with respect to the lapse function N and the scale factor a:

$$\begin{aligned}&3M_p^2[f(t)H^2+{\dot{f}}(t)H]\nonumber \\&\quad =c(t)+\Lambda (t)~, \end{aligned}$$
(3)
$$\begin{aligned}&\qquad -M_p^2[2f(t){\dot{H}}+3f(t)H^2+2{\dot{f}}(t)H+{\ddot{f}}(t)]\nonumber \\&\quad =c(t)-\Lambda (t)~. \end{aligned}$$
(4)

These are nothing but the Friedmann equations. For minimally coupled theories, where \(f(t)=1\), one has \(c(t)=-M_p^2{{\dot{H}}}\) and \(\Lambda (t)=M_p^2({{\dot{H}}}+3H^2)\), which are the same as the results obtained in [36]. In the nontrivial case where f(t) is an arbitrary function, the theory is extended to include cases where the gravity sector is modified, or where the field is nonminimally coupled to gravity. From Eqs. (3) and (4) one can get:

$$\begin{aligned} H(t)= & {} -\frac{{\dot{f}}}{2f}\pm \frac{\sqrt{3}}{6}\sqrt{3\left( \frac{{\dot{f}}}{f}\right) ^2+4\frac{(c+\Lambda )}{M_p^2f}}~, \end{aligned}$$
(5)
$$\begin{aligned} {\dot{H}}(t)= & {} -\frac{c}{M_p^2f}-\frac{{\ddot{f}}}{2f}\nonumber \\&-\left( \frac{{\dot{f}}}{2f}\right) ^2\pm \frac{\sqrt{3}}{12}\sqrt{3\left( \frac{{\dot{f}}}{f}\right) ^4+4\left( \frac{{\dot{f}}}{f}\right) ^2\frac{(c+\Lambda )}{M_p^2f}}~\nonumber \\= & {} -\frac{c}{M_p^2f}-\frac{{\ddot{f}}}{2f}+\frac{H{\dot{f}}}{2f}~. \end{aligned}$$
(6)

Defining \(\delta _{f}^{(1)}\equiv {\dot{f}}/Hf\), \(\delta _{f}^{(2)}\equiv {\ddot{f}}/H{\dot{f}}\), one can also get a neat form of the Hubble parameter squared from Eq. (5) as:

$$\begin{aligned} H^2= & {} \frac{c+\Lambda }{3M_p^2f} +\frac{1}{2}H^2\left( \delta _{f}^{(1)}\right) ^2\nonumber \\&\mp \frac{\sqrt{3}}{6}H^2\sqrt{3\left( \delta _{f}^{(1)}\right) ^4+4(\delta _{f}^{(1)})^2\frac{(c+\Lambda )}{H^2M_p^2f}}~, \end{aligned}$$
(7)

which gives the solution

$$\begin{aligned} \frac{c+\Lambda }{M_p^2H^2f}=3\left( 1+\delta _{f}^{(1)}\right) ~. \end{aligned}$$
(8)

Moreover, the slow-roll parameter can be written as:

$$\begin{aligned} \epsilon \equiv -\frac{{\dot{H}}}{H^2}=\frac{3c}{c+\Lambda }\left( 1+\delta _f^{(1)}\right) +\frac{1}{2}\delta _{f}^{(1)}\left( \delta _{f}^{(2)}-1\right) ~, \end{aligned}$$
(9)

which will be frequently used in the following analysis.
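The background relations above can be cross-checked numerically. The following is a minimal sketch, assuming arbitrary illustrative values for \(f\), \({\dot{f}}\), \({\ddot{f}}\), c and \(\Lambda \) in units with \(M_p=1\) (none of these numbers come from the paper, and the function names are hypothetical). It evaluates H from Eq. (5) and \({\dot{H}}\) from Eq. (6), and then checks that Eqs. (3), (4), (8) and (9) are satisfied up to rounding.

```python
import math

def hubble(f, f_dot, c, lam, m_p=1.0, branch=+1):
    """Hubble rate from Eq. (5); branch=+1 picks the upper sign."""
    ratio = f_dot / f
    root = math.sqrt(3.0 * ratio**2 + 4.0 * (c + lam) / (m_p**2 * f))
    return -0.5 * ratio + branch * math.sqrt(3.0) / 6.0 * root

def hubble_dot(f, f_dot, f_ddot, c, h, m_p=1.0):
    """Time derivative of H from the last line of Eq. (6)."""
    return -c / (m_p**2 * f) - f_ddot / (2.0 * f) + h * f_dot / (2.0 * f)

def epsilon_eq9(f, f_dot, f_ddot, c, lam, h):
    """Slow-roll parameter as written in Eq. (9); requires f_dot != 0."""
    d1 = f_dot / (h * f)        # delta_f^(1)
    d2 = f_ddot / (h * f_dot)   # delta_f^(2)
    return 3.0 * c / (c + lam) * (1.0 + d1) + 0.5 * d1 * (d2 - 1.0)

# Illustrative numbers only (assumption, units with M_p = 1).
f, f_dot, f_ddot, c, lam, m_p = 1.2, 0.03, 0.002, 0.05, 2.9, 1.0

h = hubble(f, f_dot, c, lam, m_p)
h_dot = hubble_dot(f, f_dot, f_ddot, c, h, m_p)

# Residuals of Eqs. (3), (4) and (8); each should vanish up to rounding.
res3 = 3.0 * m_p**2 * (f * h**2 + f_dot * h) - (c + lam)
res4 = (-m_p**2 * (2.0 * f * h_dot + 3.0 * f * h**2 + 2.0 * f_dot * h + f_ddot)
        - (c - lam))
res8 = (c + lam) / (m_p**2 * h**2 * f) - 3.0 * (1.0 + f_dot / (h * f))

# Eq. (9) against the direct definition epsilon = -H_dot / H^2.
eps_direct = -h_dot / h**2
eps_formula = epsilon_eq9(f, f_dot, f_ddot, c, lam, h)
print(res3, res4, res8, eps_direct - eps_formula)
```

All four printed numbers should be at the level of floating-point rounding, which is a quick way to catch sign or factor slips when these expressions are re-derived.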

2.2 Scalar perturbation

Using the action (2) and taking the unitary gauge, we find the quadratic action of the scalar perturbation [43]:

$$\begin{aligned} S^{(2)}_\zeta =\int d^4x a^3\left[ c_1{\dot{\zeta }}^2-\left( \frac{{\dot{c}}_3}{a}-c_2\right) \frac{(\partial \zeta )^2}{a^2}\right] ~, \end{aligned}$$
(10)

where \(\zeta \) is the curvature perturbation coming from the scalar perturbation in metric (1), and

$$\begin{aligned} c_1= & {} \frac{1}{D}\left( 2m_4^2+fM_p^2\right) \Big \{3m_3^6+4f^2H^2\epsilon M_p^4+16m_2^4m_4^2~\nonumber \\&+M_p^2\left[ -4{\ddot{f}}m_4^2+{\dot{f}}\left( -6m_3^3+4Hm_4^2+3{\dot{f}}M_p^2\right) \right] ~\nonumber \\&+2fM_p^2\left[ 4m_2^4-{\ddot{f}}M_p^2+H\left( 4H\epsilon m_4^2\nonumber \right. \right. \\&\left. \left. +{\dot{f}}M_p^2\right) \right] \Big \}~, \end{aligned}$$
(11)
$$\begin{aligned} c_2= & {} fM_p^2~, \end{aligned}$$
(12)
$$\begin{aligned} c_3= & {} \frac{2a}{D}\left( 2m_4^2+fM_p^2\right) \Big \{2f^2HM_p^4+fM_p^2\left[ -m_3^3+{\dot{f}}M_p^2\nonumber \right. \\&\left. +4Hm_4^2\right] \Big \}~, \end{aligned}$$
(13)
$$\begin{aligned} D= & {} \left[ m_ 3^3-4Hm_4^2-\left( 2fH+{\dot{f}}\right) M_p^2\right] ^2~. \end{aligned}$$
(14)

According to action (10), one can get the equation of motion:

$$\begin{aligned} u^{\prime \prime }+c_{s}^{2}k^{2}u-\frac{z^{\prime \prime }}{z}u=0~, \end{aligned}$$
(15)

where \(u\equiv z\zeta \), \(z\equiv a\sqrt{c_1}\), and prime denotes derivative with respect to the conformal time: \(\eta \equiv \int a^{-1}dt\). The sound speed squared is also defined as:

$$\begin{aligned} c_s^2\equiv \left( \frac{{\dot{c}}_3}{a}-c_2\right) \Big /c_1~. \end{aligned}$$
(16)

For the initial condition, we consider the subhorizon regime \(c_s^2k^2\gg z^{\prime \prime }/z\), and we assume that the adiabatic condition \(|\omega ^\prime /\omega ^2|\ll 1\) is satisfied, where \(\omega ^2\equiv c_s^2k^2-z^{\prime \prime }/z\), which is true for a wide range of parameter choices. Therefore, one can apply the WKB approximation to get:

$$\begin{aligned} u_{ini}=\frac{1}{\sqrt{2c_sk}}e^{i\int c_skd\eta }~. \end{aligned}$$
(17)

On the other hand, for the full solution we assume \(z\propto \eta ^{\frac{1}{2}-\nu }\), where the parameter \(\nu \) is taken to be a constant. Moreover, for simplicity but without loss of generality, we assume that the sound speed scales as \(c_s\propto \eta ^s\); then Eq. (15) becomes

$$\begin{aligned} u^{\prime \prime }+c_{s}^{2}k^{2}u-\frac{4\nu ^{2}-1}{4\eta ^{2}}u=0~, \end{aligned}$$
(18)

and the solution can be written in terms of Hankel functions:

$$\begin{aligned} u= & {} C\sqrt{\eta }\left[ H_{\nu /(s+1)}^{(1)}\left( \Big |\int c_{s}kd\eta \Big |\right) \nonumber \right. \\&\left. +H_{-\nu /(s+1)}^{(1)}\left( \Big |\int c_{s}kd\eta \Big |\right) \right] ~, \end{aligned}$$
(19)

with \(C=\sqrt{\pi /(s+1)}/2\) fixed by matching to the initial condition (17). In the superhorizon regime \(c_s^2k^2\ll z^{\prime \prime }/z\), one has \(H_{\nu }(\int c_skd\eta )=\sqrt{2/\pi }(\int c_skd\eta )^{-\nu }\), therefore we have

$$\begin{aligned} u= & {} \sqrt{\frac{\eta }{2(s+1)}}\left[ \left( \int c_{s}kd\eta \right) ^{-\frac{\nu }{s+1}}+\left( \int c_{s}kd\eta \right) ^{\frac{\nu }{s+1}}\right] ~,\nonumber \\ \zeta= & {} \frac{u}{z}\propto \frac{\eta ^{\nu }}{\sqrt{2(s+1)}}\left( \int c_{s}kd\eta \right) ^{-\frac{\nu }{s+1}}\nonumber \\&\times \left[ 1+\left( \int c_{s}kd\eta \right) ^{\frac{2\nu }{s+1}}\right] ~, \end{aligned}$$
(20)

and the power spectrum is

$$\begin{aligned} P_{\zeta }\equiv & {} \frac{k^{3}}{2\pi ^{2}}\Big |\frac{u}{z}\Big |^{2}~\nonumber \\= & {} \frac{k^{3}}{2\pi ^{2}}\frac{\eta }{2(s+1)a^{2}c_1}\left[ \left( \int c_{s}kd\eta \right) ^{-\frac{\nu }{s+1}}\nonumber \right. \\&\left. +\left( \int c_{s}kd\eta \right) ^{\frac{\nu }{s+1}}\right] ^{2}~. \end{aligned}$$
(21)

We assume a slowly varying \(\epsilon \equiv -{\dot{H}}/H^2\), so that \(a\sim \eta ^{1/(\epsilon -1)}\) and \((aH)^{-1}=(\epsilon -1)\eta \). Moreover, we set \(c_s=c_{s*}(\eta /\eta _*)^s\), where \(*\) denotes some normalization scale; therefore

$$\begin{aligned} P_{\zeta }= & {} \frac{(s+1)^2(\epsilon -1)^{2}H_{*}^{2}}{4\pi ^{2}c_{1*}c_{s*}^{3}}\left( \frac{\eta }{\eta _{*}}\right) ^{-3+2\nu -3s}\left( \frac{c_{s}k\eta }{s+1}\right) ^{3-\frac{2\nu }{s+1}}\nonumber \\&\times \left[ 1 +\left( \frac{c_{s}k\eta }{s+1}\right) ^{\frac{2\nu }{s+1}}\right] ^{2} \end{aligned}$$
(22)

Current observations indicate that the power spectrum of the scalar perturbation (22) should be (nearly) scale-invariant. To achieve this, one can either have \(\nu /(s+1)\simeq 3/2\), with \(\left( \frac{c_{s}k\eta }{s+1}\right) ^{\frac{2\nu }{s+1}}\) decreasing:

$$\begin{aligned} P_{\zeta }=\frac{(s+1)^2(\epsilon -1)^{2}H_{*}^{2}}{4\pi ^{2}c_{1*}c_{s*}^{3}}~, \end{aligned}$$
(23)

which is still time-invariant, or have \(\nu /(s+1)\simeq -3/2\), with \(\left( \frac{c_{s}k\eta }{s+1}\right) ^{\frac{2\nu }{s+1}}\) increasing:

$$\begin{aligned} P_{\zeta }=\frac{(s+1)^2(\epsilon -1)^{2}H_{*}^{2}}{4\pi ^{2}c_{1*}c_{s*}^{3}}\left( \frac{\eta }{\eta _{*}}\right) ^{-6(s+1)}~, \end{aligned}$$
(24)

which will be proportional to \(\eta ^{-6}\) for constant \(c_s\).
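The two scale-invariant branches can be checked by simple exponent counting in Eq. (22). The sketch below is only an illustration under the following reading of that equation: writing \(x\equiv c_sk\eta /(s+1)\propto k\,\eta ^{s+1}\) and \(p\equiv 2\nu /(s+1)\), on superhorizon scales (\(|x|\ll 1\)) the bracket \([1+x^{p}]^2\) tends to 1 for \(p>0\) and to \(x^{2p}\) for \(p<0\), so the leading power of x (and hence of k) is \(3-|p|\). The helper names are hypothetical, and exact rational arithmetic is used so that the cancellations are visible.

```python
from fractions import Fraction

def time_exponent(nu, s):
    """Power of (eta/eta_*) in the prefactor of Eq. (22): -3 + 2 nu - 3 s."""
    return -3 + 2 * Fraction(nu) - 3 * Fraction(s)

def k_exponent(nu, s):
    """Leading power of k in Eq. (22) for |c_s k eta| << 1, i.e. 3 - |2 nu/(s+1)|."""
    p = 2 * Fraction(nu) / (Fraction(s) + 1)
    return 3 - abs(p)

def total_eta_exponent(nu, s):
    """Net power of eta: prefactor plus the eta carried by x ~ k * eta^(s+1)."""
    return time_exponent(nu, s) + (Fraction(s) + 1) * k_exponent(nu, s)

s = Fraction(1, 10)                # illustrative running of c_s (assumption)
nu_plus = Fraction(3, 2) * (s + 1)   # branch nu/(s+1) = +3/2, cf. Eq. (23)
nu_minus = -nu_plus                  # branch nu/(s+1) = -3/2, cf. Eq. (24)

print(k_exponent(nu_plus, s), total_eta_exponent(nu_plus, s))
print(k_exponent(nu_minus, s), total_eta_exponent(nu_minus, s))
```

In both branches the k exponent vanishes, while the net power of \(\eta \) is zero in the first branch and \(-6(s+1)\) in the second, in line with Eqs. (23) and (24).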

2.3 Tensor perturbation

We can perform the same procedure to get the power spectrum for tensor perturbations. According to action (2), one can also obtain the quadratic action of the tensor perturbation [43]:

$$\begin{aligned} S^{(2)}_T=\frac{M_p^2}{8}\int d^4x a^3\mathcal{{D}}_T\left[ {\dot{\gamma }}_{ij}^2-c_T^2\frac{(\partial _k\gamma _{ij})^2}{a^2}\right] ~, \end{aligned}$$
(25)

where \(\gamma _{ij}\) is the tensor perturbation in metric (1), and

$$\begin{aligned} \mathcal{{D}}_T=f+2\frac{m_4^2}{M_p^2}~,~~~c_T^2=\frac{f}{\mathcal{{D}}_T}~. \end{aligned}$$
(26)

One can also get the equation of motion:

$$\begin{aligned} v^{\prime \prime }+c_T^2k^2v-\frac{z_T^{\prime \prime }}{z_T}v=0~, \end{aligned}$$
(27)

where \(v\equiv z_T\gamma _{+,\times }\), with \(\gamma _{+,\times }\) the two polarization modes of \(\gamma _{ij}\), and \(z_T^2\equiv a^2\mathcal{{D}}_T\). Following the same procedure as for the scalar perturbation, and assuming \(z_T\propto \eta ^{\frac{1}{2}-\nu _T}\), one gets the solution:

$$\begin{aligned} v= & {} \sqrt{\frac{\eta }{2(s_T+1)}}\left[ \left( \int c_{T}kd\eta \right) ^{-\frac{\nu _T}{s_T+1}}+\left( \int c_{T}kd\eta \right) ^{\frac{\nu _T}{s_T+1}}\right] ~,\nonumber \\ \gamma= & {} \frac{v}{z_T}\propto \frac{\eta ^{\nu _T}}{\sqrt{2(s_T+1)}}\left( \int c_{T}kd\eta \right) ^{-\frac{\nu _T}{s_T+1}}\nonumber \\&\times \left[ 1+\left( \int c_{T}kd\eta \right) ^{\frac{2\nu _T}{s_T+1}}\right] ~, \end{aligned}$$
(28)

where we assume \(c_T=c_{T*}(\eta /\eta _*)^{s_T}\), and the power spectrum is:

$$\begin{aligned} P_{T}\equiv & {} 2\frac{k^{3}}{2\pi ^{2}}\Big |\frac{v}{z_{T}}\Big |^{2}\nonumber \\= & {} \frac{(s_T+1)^2(\epsilon -1)^{2}H^{2}}{2\pi ^{2}\mathcal{{D}}_{T}c_{T}^{3}}\left( \frac{\eta }{\eta _{*}}\right) ^{-3+2\nu _T-3s_T}\nonumber \\&\times \left( \frac{c_{T}k\eta }{s_T+1}\right) ^{3-\frac{2\nu _{T}}{s_T+1}} \left[ 1+\left( \frac{c_{T}k\eta }{s_T+1}\right) ^{\frac{2\nu _{T}}{s_T+1}}\right] ^{2}~. \end{aligned}$$
(29)

Current observations have not yet provided constraints on the scale dependence of the primordial tensor power spectrum. However, in this work, we restrict ourselves to the case where the tensor spectrum is also scale-invariant, as for the scalar one. To achieve this, one can either have \(\nu _T/(s_T+1)\simeq 3/2\), with \(\left( \frac{c_{T}k\eta }{s_T+1}\right) ^{\frac{2\nu _{T}}{s_T+1}}\) decreasing:

$$\begin{aligned} P_{T}=\frac{(s_{T}+1)^{2}(\epsilon -1)^{2}H_{*}^{2}}{2\pi ^{2}{\mathcal {D}}_{T*}c_{T*}^{3}} \end{aligned}$$
(30)

which is still time-invariant, or have \(\nu _T/(s_T+1)\simeq -3/2\), with \(\left( \frac{c_{T}k\eta }{s_T+1}\right) ^{\frac{2\nu _{T}}{s_T+1}}\) increasing:

$$\begin{aligned} P_{T}=\frac{(s_{T}+1)^{2}(\epsilon -1)^{2}H_{*}^{2}}{2\pi ^{2}{\mathcal {D}}_{T*}c_{T*}^{3}}\left( \frac{\eta }{\eta _{*}}\right) ^{-6(s_{T}+1)} \end{aligned}$$
(31)

which will be proportional to \(\eta ^{-6}\) for constant \(c_T\). Moreover, it is easy to see that, for \(\nu _T/(s_T+1)<-3/2\) or \(\nu _T/(s_T+1)>3/2\), the spectrum would have a red tilt, while for \(-3/2<\nu _T/(s_T+1)<3/2\), the spectrum would have a blue tilt.

2.4 Tensor/scalar ratio

The tensor/scalar ratio is defined as:

$$\begin{aligned} r\equiv \frac{P_T}{P_\zeta }~, \end{aligned}$$
(32)

where, in general, \(P_\zeta \) and \(P_T\) are given in Eqs. (22) and (29), respectively. Hereafter, for simplicity, we restrict ourselves to the cases where the tensor spectrum is also scale-invariant. According to the above analysis, one can immediately get the tensor/scalar ratio:

$$\begin{aligned} r=2\frac{{\mathcal {D}}_{s}c_{s}^{3}(s_{T}+1)^{2}}{{\mathcal {D}}_{T}c_{T}^{3}(s+1)^{2}}~. \end{aligned}$$
(33)

Moreover, since

$$\begin{aligned} {\mathcal {D}}_{s}= & {} c_1~,~ c_s=\frac{\sqrt{{\dot{c}}_3/a-c_2}}{\sqrt{{\mathcal {D}}_{s}}}~,~ {\mathcal {D}}_{T}=\frac{M_p^2}{8}\Bigg [f+2\frac{m_4^2}{M_p^2}\Bigg ]~,~\nonumber \\ c_{T}= & {} \frac{M_p\sqrt{f}}{\sqrt{8{\mathcal {D}}_T}}~, \end{aligned}$$
(34)

one can express r in a very general form, namely

$$\begin{aligned} r=16\frac{({\dot{c}}_3/a-c_2)^{3/2}\sqrt{f+2(m_4/M_p)^2}(s_T+1)^2}{\sqrt{c_1}f^{3/2}M_p^2(s+1)^2}~. \end{aligned}$$
(35)

Note that the above is our master formula for r. At this stage, it contains EFT functions only; by reducing them to the field functions of concrete models, as we will do below, it can be directly related to the slow-varying parameters of each model. Constraints on those models, and especially on those parameters, can thus be obtained from future constraints on r.

Before heading to the next section, let us make a further comment on Eq. (35): although the dependence of r on most parameters is not transparent, it is clear for two of them, \(s_T\) and s, which by the previous definitions represent the time dependence of the sound speeds of the tensor and scalar perturbations, respectively. With the other parameters held fixed, it is evident that a large running of \(c_T\) makes r large, while that of \(c_s\) does the opposite. In the following, we will ignore these two parameters for simplicity, by assuming that both \(c_T\) and \(c_s\) are slowly varying. This is a commonly used assumption in the analysis of inflation models.
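As a sanity check on the algebra leading from Eq. (33) to Eq. (35), the two expressions can be compared numerically. This is a minimal sketch with hypothetical function names: the argument `grad_coeff` stands for the combination \({\dot{c}}_3/a-c_2\), and all input values are arbitrary illustrative numbers rather than values from any model in the paper.

```python
import math

def r_from_eq33(grad_coeff, c1, f, m4_sq, m_p, s=0.0, s_t=0.0):
    """Eq. (33) evaluated with the definitions of Eq. (34)."""
    d_s = c1
    c_s = math.sqrt(grad_coeff) / math.sqrt(d_s)
    d_t = m_p**2 / 8.0 * (f + 2.0 * m4_sq / m_p**2)
    c_t = m_p * math.sqrt(f) / math.sqrt(8.0 * d_t)
    return 2.0 * d_s * c_s**3 * (s_t + 1.0)**2 / (d_t * c_t**3 * (s + 1.0)**2)

def r_from_eq35(grad_coeff, c1, f, m4_sq, m_p, s=0.0, s_t=0.0):
    """The master formula, Eq. (35)."""
    num = 16.0 * grad_coeff**1.5 * math.sqrt(f + 2.0 * m4_sq / m_p**2) * (s_t + 1.0)**2
    den = math.sqrt(c1) * f**1.5 * m_p**2 * (s + 1.0)**2
    return num / den

# Arbitrary illustrative inputs (assumption); they must keep all square roots real.
args = dict(grad_coeff=0.02, c1=0.05, f=1.1, m4_sq=0.3, m_p=1.0)

r33 = r_from_eq33(**args)
r35 = r_from_eq35(**args)
r_running_ct = r_from_eq35(**args, s_t=0.1)  # running of c_T only
r_running_cs = r_from_eq35(**args, s=0.1)    # running of c_s only
print(r33, r35, r_running_ct, r_running_cs)
```

The first two numbers should agree, and the last two illustrate the statement above: a running of \(c_T\) raises r by the factor \((s_T+1)^2\), while a running of \(c_s\) lowers it by \((s+1)^{-2}\).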

3 Confronting r with future constraints: concrete examples

3.1 Towards the concreteness: a dictionary

In this section, we will apply Eq. (35) to concrete models, to discuss how r is related to the usually defined slow-roll parameters, and what values the parameters must take for r to lie within the regions \(0.01<r<0.064\) and \(r<0.01\). Written in the form of Eq. (35), the upper limit on the tensor/scalar ratio r provides us with constraints on the functions f(t), \(\Lambda (t)\), c(t), \(m_{i}(t)\), and \({\tilde{m}}_{4}(t)\). Although we can obtain the allowed region in this multidimensional parameter space, there are so many degrees of freedom that it is rather difficult to handle analytically. Moreover, for different models, the constrained parameter spaces will differ because of different marginalization methods, so it becomes meaningless to constrain those general functions directly. In order to be specific and reduce the degrees of freedom, it is useful to consider the constraints after reducing to concrete models.

On the other hand, we do have some tools for such a reduction. For instance, there is a large group of field models which have the advantage of being ghost-free, because their equations of motion are at most second order. Most of these models can be summarized in the Generalized Scalar-Tensor (GST, a.k.a. Horndeski) theory, namely [48, 49]

$$\begin{aligned} \mathcal{{L}}_2= & {} G_2(\phi , X)~, \end{aligned}$$
(36)
$$\begin{aligned} \mathcal{{L}}_3= & {} G_{3}(\phi ,X)\Box \phi ~, \end{aligned}$$
(37)
$$\begin{aligned} \mathcal{{L}}_4= & {} G_{4}(\phi ,X)R-2G_{4X}\left[ (\Box \phi )^{2}-\phi ^{;\mu \nu }\phi _{;\mu \nu }\right] ~, \end{aligned}$$
(38)
$$\begin{aligned} \mathcal{{L}}_5= & {} G_{5}(\phi ,X)G_{\mu \nu }\phi ^{;\mu \nu }+\frac{1}{3}G_{5X}(\phi ,X)\left[ (\Box \phi )^{3}\nonumber \right. \\&\left. -3\Box \phi \phi _{;\mu \nu }\phi ^{;\mu \nu }+2\phi _{;\mu \nu }\phi ^{;\mu \sigma }\phi _{\ ;\sigma }^{;\nu }\right] ~ \end{aligned}$$
(39)

with \(X=-1/2 (\partial _{\mu } \phi )^{2}\), and “;” denoting the covariant derivative. Although this form is quite general, it is rather tedious to calculate observables with it, let alone with the ever more general forms that are still being developed. However, as has been shown in [38, 40], one can create a dictionary between our generalized form and GST theories, with which concrete models can be directly “read off”. Therefore, one can immediately get the expressions for various models, without having to calculate them one by one.

For the GST theory whose Lagrangian is given by Eqs. (36)–(39), the dictionary turns out to be:

$$\begin{aligned} \Lambda= & {} M_p^2(6fH^2+5{\dot{f}}H+2f{\dot{H}}+{\ddot{f}})/2~, \end{aligned}$$
(40)
$$\begin{aligned} c= & {} M_p^2({\dot{f}}H-2f{\dot{H}}-{\ddot{f}})/2~, \end{aligned}$$
(41)
$$\begin{aligned} m_2^4= & {} (2X)^{3/2}\left[ \sqrt{2X}(E_2+3HE_3+12H^2XE_{4,X}\nonumber \right. \\&\left. -6H^2E_4-2H^3XE_{5,X})_{,X}\right] _{,X}/4-c/2~, \end{aligned}$$
(42)
$$\begin{aligned} m_3^3= & {} {\dot{f}}+2XE_{3,X}+8HX(2XE_{4,X}-E_4)_{,X}\nonumber \\&-4H^2X(XE_{5,X})_{,X}~, \end{aligned}$$
(43)
$$\begin{aligned} m_4^2= & {} {\tilde{m}}_4^2=-2XE_{4,X}+HXE_{5,X}-{\dot{E}}_5/2~, \end{aligned}$$
(44)
$$\begin{aligned} M_p^2f= & {} 2E_4+{\dot{E}}_5~, \end{aligned}$$
(45)

where

$$\begin{aligned} E_2= & {} G_2+\sqrt{X}\int \frac{G_{3,\phi }}{\sqrt{X}}dX~,~\nonumber \\ E_3= & {} -\int \sqrt{2X}G_{3,X}dX-2\sqrt{2X}G_{4,\phi }~,\nonumber \\ E_4= & {} G_4-\sqrt{X}\int \frac{G_{5,\phi }}{2\sqrt{X}}dX~,~\nonumber \\ E_5= & {} -\int \sqrt{2X}G_{5,X}dX~. \end{aligned}$$
(46)

This dictionary coincides with that of [38, 40], at least at the quadratic perturbation level. From it we can express Eq. (35) for various concrete models, as will be shown in the following. Moreover, the dictionary can also be applied to models beyond (36)–(39). For example, for the GLPV model, where the Lagrangian is extended as [46]

$$\begin{aligned}&\mathcal{{L}}_4\rightarrow \mathcal{{L}}_4+F_4(\phi ,X)\epsilon ^{\mu \nu \rho }_{~~~\sigma }\epsilon ^{\mu ^\prime \nu ^\prime \rho ^\prime \sigma }\phi _{;\mu }\phi _{;\mu ^\prime }\phi _{;\nu \nu ^\prime }\phi _{;\rho \rho ^\prime }~,~\nonumber \\&\mathcal{{L}}_5\rightarrow \mathcal{{L}}_5+F_5(\phi ,X)\epsilon ^{\mu \nu \rho \sigma }\epsilon ^{\mu ^\prime \nu ^\prime \rho ^\prime \sigma ^\prime }\phi _{;\mu }\phi _{;\mu ^\prime }\phi _{;\nu \nu ^\prime }\phi _{;\rho \rho ^\prime }\phi _{;\sigma \sigma ^\prime }~, \end{aligned}$$
(47)

Eqs. (42), (43) and (44) in the dictionary are extended to

$$\begin{aligned} m_2^4= & {} (2X)^{3/2}[\sqrt{2X}(E_2+3HE_3+12H^2XE_{4,X}\nonumber \\&-6H^2E_4-24H^2X^2F_4\nonumber \\&-2H^3XE_{5,X}+6H^3(2X)^{5/2}F_5)_{,X}]_{,X}/4-c/2, \end{aligned}$$
(48)
$$\begin{aligned} m_3^3= & {} {\dot{f}}+2XE_{3,X}+8HX(2XE_{4,X}-E_4-4X^2F_4)_{,X}\nonumber \\&-4H^2X(XE_{5,X}+3(2X)^{5/2}F_5)_{,X}~, \end{aligned}$$
(49)
$$\begin{aligned} m_4^2= & {} -2XE_{4,X}+HXE_{5,X}+4X^2F_4\nonumber \\&-3H(2X)^{5/2}F_5-{\dot{E}}_5/2~, \end{aligned}$$
(50)
$$\begin{aligned} {\tilde{m}}_4^2= & {} -2XE_{4,X}+HXE_{5,X}-{\dot{E}}_5/2~, \end{aligned}$$
(51)

As a remark, one may notice that for Horndeski theories we always have \(m_4^2={\tilde{m}}_4^2\), whereas for theories beyond Horndeski this need not hold. Nonetheless, when the additional functions satisfy certain conditions, such as \(F_4=3H\sqrt{2X}F_5\), we still have \(m_4^2={\tilde{m}}_4^2\). That is to say, action (2) with \(m_4^2={\tilde{m}}_4^2\) covers not only Horndeski theory, but also part (though not all) of GLPV theory. This is interesting because imposing the condition makes the number of functions in action (2) (five, because the function \(\Lambda \) is purely a background quantity and does not enter at the perturbation level) smaller than that in the field theory (GLPV) (six). On the other hand, if one relaxes this condition, action (2) can fully cover GLPV theory, since the number of functions on both sides is the same (six each).

For the beyond-Horndeski models, which lead to a difference between \(m_4^2\) and \({\tilde{m}}_4^2\), for simplicity and without loss of generality we use a unified form to describe the beyond-Horndeski part, namely [43, 44]
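The condition \(F_4=3H\sqrt{2X}F_5\) can be verified directly from Eqs. (50) and (51), whose difference is \(m_4^2-{\tilde{m}}_4^2=4X^2F_4-3H(2X)^{5/2}F_5\). The following is a minimal numerical sketch with a hypothetical helper name and arbitrary illustrative values of X, H and \(F_5\) (not taken from the paper).

```python
import math

def m4_sq_minus_m4tilde_sq(x, h, f4, f5):
    """Difference of Eqs. (50) and (51): 4 X^2 F_4 - 3 H (2X)^(5/2) F_5."""
    return 4.0 * x**2 * f4 - 3.0 * h * (2.0 * x)**2.5 * f5

# Arbitrary illustrative values (assumption).
x, h, f5 = 0.3, 0.7, 1.1

f4_tuned = 3.0 * h * math.sqrt(2.0 * x) * f5   # F_4 = 3 H sqrt(2X) F_5
f4_generic = 0.5                               # any other choice of F_4

diff_tuned = m4_sq_minus_m4tilde_sq(x, h, f4_tuned, f5)
diff_generic = m4_sq_minus_m4tilde_sq(x, h, f4_generic, f5)
print(diff_tuned, diff_generic)
```

With the tuned \(F_4\) the difference vanishes up to rounding, because \((2X)^{5/2}=4X^2\sqrt{2X}\); a generic \(F_4\) leaves \(m_4^2\ne {\tilde{m}}_4^2\).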

$$\begin{aligned} \mathcal{{L}}_6=\Delta {\tilde{m}}_{4}^{2}R^{(3)}\delta g^{00}~, \end{aligned}$$
(52)

where \(\Delta {\tilde{m}}_{4}^{2}={\tilde{m}}_4^2-m_4^2\). In the following, we will choose models which contain two or three of the above Lagrangians as simple examples. We will systematically study how the concrete model Lagrangians (36), (37), (38), (39), (52) are written in terms of the EFT functions, discuss the constraints on model parameters, and plot the allowed regions.

3.2 K-essence: \(M_{p}^{2}R/2+L_{2}\)

We first start with the simplest case of K-essence single field [50, 51]. This case can be written in terms of action (2) with the correspondence:

$$\begin{aligned} f= & {} 1~,~\Lambda =\Lambda _{(2)}=\frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)-G_{2}(\phi ,X)~,~c=c_{(2)}\nonumber \\= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)~,~m_{2}^{4}\nonumber \\= & {} \frac{{\dot{\phi }}^{4}}{4}G_{2XX}(\phi ,X)~,~m_{3}^{3}=m_{4}^{2}={\tilde{m}}_{4}^{2}=0~. \end{aligned}$$
(53)

According to Eq. (9), the slow-roll parameter reduces to

$$\begin{aligned} \epsilon _{(2)}= & {} \frac{3c_{(2)}}{c_{(2)}+\Lambda _{(2)}}\nonumber \\= & {} \frac{3{{\dot{\phi }}}^2G_{2X}}{2({{\dot{\phi }}}^2G_{2X}-G_2)}~. \end{aligned}$$
(54)

For a canonical scalar field with \(G_2(X,\phi )=X-V(\phi )\), the above definition of \(\epsilon _{(2)}\) can be related to the ratio between the kinetic and potential terms, \(\gamma \equiv K/V\): \(\epsilon _{(2)}=3\gamma /(1+\gamma )\). Note that, by Eq. (3) with \(f=1\), it also coincides with another commonly used definition, \({{\dot{\phi }}}^2G_{2X}/(2M_p^2H^2)\).

From Eqs. (11)–(14), we have

$$\begin{aligned} D= & {} 4H^{2}M_{p}^{4}~,~c_{1}=M_{p}^{2}\left( \epsilon _{(2)} +2\frac{m_{2}^{4}}{H^{2}M_{p}^{2}}\right) ,\nonumber \\ ~c_{2}= & {} M_{p}^{2}~,~c_{3}=\frac{a}{H}M_{p}^{2}~,~c_{s}^{2}=\frac{H^{2}\epsilon _{(2)} M_{p}^{2}}{H^{2}\epsilon _{(2)} M_{p}^{2}+2m_{2}^{4}}~,\nonumber \\ \mathcal{{D}}_{T}= & {} \frac{M_p^2}{8}~,~c_{T}^{2}=1~. \end{aligned}$$
(55)

so using Eq. (35), one has

$$\begin{aligned} r= & {} \frac{16\sqrt{2}HM_{p}\epsilon _{(2)}^{3/2}}{\sqrt{2H^{2}\epsilon _{(2)} M_{p}^{2}+{\dot{\phi }}^{4}G_{2XX}}}\nonumber \\= & {} \frac{16\epsilon _{(2)}^{3/2}}{\sqrt{\epsilon _{(2)}+2\delta _{KXX}}}~, \end{aligned}$$
(56)

where we define the parameter \(\delta _{KXX}\equiv {\dot{\phi }}^{4}G_{2XX}/(4H^2fM_p^2)\), with f being unity in the current case.
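The two lines of Eq. (56) and the relation \(r=16\epsilon c_s\) can be compared numerically using the K-essence quantities in Eq. (55). This is a minimal sketch with hypothetical function names and arbitrary illustrative values of H, \(M_p\), \({\dot{\phi }}\), \(G_{2XX}\) and \(\epsilon _{(2)}\); none of these numbers are taken from the paper.

```python
import math

def r_first_line(h, m_p, eps2, phi_dot, g2xx):
    """First line of Eq. (56)."""
    return (16.0 * math.sqrt(2.0) * h * m_p * eps2**1.5
            / math.sqrt(2.0 * h**2 * eps2 * m_p**2 + phi_dot**4 * g2xx))

def r_second_line(eps2, delta_kxx):
    """Second line of Eq. (56)."""
    return 16.0 * eps2**1.5 / math.sqrt(eps2 + 2.0 * delta_kxx)

def sound_speed(eps2, delta_kxx):
    """c_s from Eq. (55), using m_2^4 / (H^2 M_p^2) = delta_KXX."""
    return math.sqrt(eps2 / (eps2 + 2.0 * delta_kxx))

# Arbitrary illustrative values (assumption).
h, m_p, eps2, phi_dot, g2xx = 1.0e-5, 1.0, 0.01, 2.0e-3, 5.0e-3

delta_kxx = phi_dot**4 * g2xx / (4.0 * h**2 * m_p**2)  # definition below Eq. (56)
r1 = r_first_line(h, m_p, eps2, phi_dot, g2xx)
r2 = r_second_line(eps2, delta_kxx)
r_consistency = 16.0 * eps2 * sound_speed(eps2, delta_kxx)
print(delta_kxx, r1, r2, r_consistency)
```

The three values of r should coincide, and setting `g2xx = 0` gives \(c_s=1\) and \(r=16\epsilon _{(2)}\).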

One can see that for all choices of \(\epsilon _{(2)}\) and \(\delta _{KXX}\), the result recovers the standard consistency relation: \(r=16\epsilon c_s\). Moreover, for the canonical scalar field with \(G_{2XX}=0\), one has \(\delta _{KXX}=0\) and \(c_s=1\), which is so simple that for a given potential, \(\epsilon \) and r can be directly related to parameters such as the power-law index of the potential and the number of e-foldings [52]. Therefore, in the canonical case, the only way of getting a small r is to have a small \(\epsilon \) by requiring the potential to be very flat. Two specific examples are well known in the literature: the small-field inflation with \(\epsilon \sim constant\) (hilltop inflation [53] for example) and the ultra-slow-roll inflation with \(\epsilon \sim \eta ^{-6}\), which was proposed in [33] (with further studies in e.g., [54, 55]) and extended to constant-roll inflation [34, 35].

For a noncanonical scalar field, the sound speed \(c_s\) also plays a role. For \(G_{2XX}\sim M_2^4>0\), which gives \(c_s<1\), r can be suppressed by \(c_s\). Examples include ghost condensate inflation [56] as well as DBI inflation [57]. However, for single-field inflation, the non-Gaussianities are also related to \(c_s\), roughly as \(f_{nl}\sim c_s^{-2}\) [58], where \(f_{nl}\) describes the amplitude of the non-Gaussianities. These models therefore run the risk of making \(f_{nl}\) too large to be consistent with the current constraints [59].

Fig. 1 The plot of r in parameter space \([\epsilon _{(2)}, \delta _{KXX}]\). Light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\)

The plot of r in parameter space \([\epsilon _{(2)}, \delta _{KXX}]\) is shown in Fig. 1, where light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\). Note that when \(\delta _{KXX}=0\), according to the consistency relation \(r=16\epsilon \), \(r<0.064\) and \(r<0.001\) take \(\epsilon \) into the very narrow region of \(\epsilon <0.004\) and \(\epsilon <0.000625\), respectively. However, when taking \(\delta _{KXX}\) into consideration, the allowed region of \(\epsilon _{(2)}\) will get much enlarged and larger \(\epsilon _{(2)}\) will also be allowed. Moreover, considering the constraints of stability, namely \(c_1>0\), \(c_s^2>0\), we must restrict \(\epsilon _{(2)}\) and \(\delta _{KXX}\) to be within the region of \(\epsilon _{(2)}>0\) and \(\epsilon _{(2)}+2\delta _{KXX}>0\).
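The boundaries of the regions quoted above can be reproduced by solving \(r(\epsilon _{(2)},\delta _{KXX})=r_{\max }\) for \(\epsilon _{(2)}\) at fixed \(\delta _{KXX}\ge 0\). The sketch below uses plain bisection, which is an illustrative choice and not necessarily how Fig. 1 was produced; the function names and the search interval are assumptions.

```python
import math

def r_kessence(eps2, delta_kxx):
    """Second line of Eq. (56)."""
    return 16.0 * eps2**1.5 / math.sqrt(eps2 + 2.0 * delta_kxx)

def max_epsilon(r_max, delta_kxx, upper=1.0, iterations=200):
    """Largest eps2 in (0, upper) with r < r_max, for delta_kxx >= 0.

    Relies on r being monotonically increasing in eps2 for delta_kxx >= 0
    and on r(upper) > r_max.
    """
    lower = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lower + upper)
        if r_kessence(mid, delta_kxx) < r_max:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)

eps_064_canonical = max_epsilon(0.064, 0.0)
eps_010_canonical = max_epsilon(0.01, 0.0)
eps_064_noncanonical = max_epsilon(0.064, 0.01)  # illustrative delta_KXX
print(eps_064_canonical, eps_010_canonical, eps_064_noncanonical)
```

For \(\delta _{KXX}=0\) the bounds reduce to \(r_{\max }/16\), while a positive \(\delta _{KXX}\) relaxes the bound on \(\epsilon _{(2)}\), which is the enlargement of the allowed region described in the text.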

For more complicated cases, however, not only will the consistency relation \(r=16\epsilon c_s\) be violated, but \(\epsilon \) can also deviate from \(\epsilon _{(2)}\), due to changes in both the background energy density and pressure; either effect can play a role in suppressing r. To illustrate this, in the following we write

$$\begin{aligned} \Lambda =\Lambda _{(2)}+\Delta \Lambda ~,~c=c_{(2)}+\Delta c~, \end{aligned}$$
(57)

where \(\Delta \Lambda \) and \(\Delta c\) denote the deviations of \(\Lambda \) and c from their values in the K-essence case. Thus we have

$$\begin{aligned} \epsilon= & {} \epsilon _{(2)}\left[ 1+\delta _{f}^{(1)}-\frac{1}{3}\frac{\Delta c}{M_{p}^{2}H^{2}f}-\frac{1}{3}\frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}\right] \nonumber \\&+\frac{\Delta c}{M_{p}^{2}H^{2}f}+\frac{1}{2}\delta _{f}^{(1)}(\delta _{f}^{(2)}-1) \end{aligned}$$
(58)

so the effects on \(\epsilon \) can be attributed to several parameters, namely \(\delta _{f}^{(1)}\), \(\delta _{f}^{(2)}\), \(\frac{\Delta c}{M_{p}^{2}H^{2}f}\) and \(\frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}\).
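Equation (58) is used repeatedly in the following subsections, so it may help to have it in executable form. The sketch below is a direct transcription; the function and argument names are hypothetical, with `dc` and `dlam` standing for the dimensionless combinations \(\Delta c/(M_{p}^{2}H^{2}f)\) and \(\Delta \Lambda /(M_{p}^{2}H^{2}f)\).

```python
def epsilon_eft(eps2, d_f1=0.0, d_f2=0.0, dc=0.0, dlam=0.0):
    """Transcription of Eq. (58).

    eps2       : epsilon_(2), the K-essence value of epsilon
    d_f1, d_f2 : delta_f^(1), delta_f^(2)
    dc, dlam   : Delta c/(M_p^2 H^2 f), Delta Lambda/(M_p^2 H^2 f)
    """
    return (eps2 * (1.0 + d_f1 - dc / 3.0 - dlam / 3.0)
            + dc + 0.5 * d_f1 * (d_f2 - 1.0))

# K-essence limit: all corrections vanish, so epsilon = epsilon_(2).
print(epsilon_eft(0.004))

# Galileon-type corrections (illustrative numbers only), using the
# expressions for dc and dlam given in terms of delta_gX, delta_gphi
# and delta_phi in Sect. 3.3.
d_gx, d_gphi, d_phi = 0.01, 0.03, 0.2
dc = 0.5 * d_gx * (d_phi - 3.0) + 0.5 * d_gphi
dlam = -0.5 * d_gx * (d_phi + 3.0)
print(epsilon_eft(0.004, dc=dc, dlam=dlam))
```

The second call should agree with Eq. (65) evaluated at the same parameter values, which gives a quick consistency check on the algebra.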

3.3 Galileon: \(M_{p}^{2}R/2+L_{2}+L_{3}\)

The next case to consider is the Galileon, whose proposal was inspired by the ghost problems that appear in DGP models [60,61,62]. Although the original proposal introduced the Galileon symmetry, it was later extended to the general case that also includes dependence on the field itself, with the addition of the term \(G_3(\phi , X)\Box \phi \) [63, 64]. When \(G_3\) depends on \(\phi \) only, this term coincides with the kinetic term modulo a total derivative. Here for simplicity we take \(G_3\) to be of the form \(G_3=g(\phi )X\), so that this case can be written in terms of action (2) with the correspondence:

$$\begin{aligned} f= & {} 1~,~\Lambda =\frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)-G_{2}(\phi ,X)\nonumber \\&-\frac{{\dot{\phi }}^{2}}{2}({\ddot{\phi }}+3H{\dot{\phi }})g(\phi )~,~\nonumber \\ c= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)+\frac{{\dot{\phi }}^{2}}{2}({\ddot{\phi }}-3H{\dot{\phi }})g(\phi )+\frac{{\dot{\phi }}^{4}}{2}g_{\phi }(\phi )~,\nonumber \\ m_{2}^{4}= & {} \frac{{\dot{\phi }}^{4}}{4}G_{2XX}(\phi ,X)-\frac{{\dot{\phi }}^{2}}{4}({\ddot{\phi }}+3H{\dot{\phi }})g(\phi )\nonumber \\&+\frac{{\dot{\phi }}^{4}}{4}g_{\phi }(\phi )~,~m_{3}^{3}\nonumber \\= & {} -{\dot{\phi }}^{3}g(\phi )~,~m_{4}^{2}={\tilde{m}}_{4}^{2}=0~, \end{aligned}$$
(59)

then from Eqs. (11)–(14), we have

$$\begin{aligned} D= & {} (2HM_{p}^{2}-m_{3}^{3})^{2}~,~\nonumber \\ c_{1}= & {} \frac{M_{p}^{2}}{(2HM_{p}^{2}-m_{3}^{3})^{2}}(3m_{3}^{6}+4H^{2}\epsilon M_{p}^{4}+8m_{2}^{4}M_{p}^{2})~,~\nonumber \\ c_{2}= & {} M_{p}^{2}~,\nonumber \\ c_{3}= & {} \frac{2aM_{p}^{4}}{2HM_{p}^{2}-m_{3}^{3}}~,~\nonumber \\ c_{s}^{2}= & {} \frac{4H^{2}\epsilon M_{p}^{4}+2HM_{p}^{2}m_{3}^{3}-m_{3}^{6}+2(m_{3}^{3})^{\cdot }M_{p}^{2}}{4H^{2}\epsilon M_{p}^{4}+8m_{2}^{4}M_{p}^{2}+3m_{3}^{6}}~,\nonumber \\ \mathcal{{D}}_{T}= & {} \frac{M_p^2}{8}~,~c_{T}^{2}=1~. \end{aligned}$$
(60)

so using Eq. (35), one has

$$\begin{aligned} r= & {} \frac{16}{(2HM_{p}^{2}-m_{3}^{3})^{2}}\nonumber \\&\times \frac{\left[ 4H^{2}\epsilon M_{p}^{4}+2HM_{p}^{2}m_{3}^{3}-m_{3}^{6}+2(m_{3}^{3})^{\cdot }M_{p}^{2}\right] ^{3/2}}{\left( 4H^{2}\epsilon M_{p}^{4}+8m_{2}^{4}M_{p}^{2}+3m_{3}^{6}\right) ^{1/2}}. \end{aligned}$$
(61)

Moreover, in order to take into account the variation of \(\epsilon \), we note that

$$\begin{aligned} \frac{\Delta c}{M_{p}^{2}H^{2}f}= & {} \frac{g(\phi ){\dot{\phi }}^{3}}{2M_{p}^{2}H}\left( \frac{{\ddot{\phi }}}{H{\dot{\phi }}}-3\right) +\frac{g_{\phi }(\phi ){\dot{\phi }}^{4}}{2M_{p}^{2}H^{2}}~, \end{aligned}$$
(62)
$$\begin{aligned} \frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}= & {} -\frac{g(\phi ){\dot{\phi }}^{3}}{2M_{p}^{2}H}\left( \frac{{\ddot{\phi }}}{H{\dot{\phi }}}+3\right) ~, \end{aligned}$$
(63)
$$\begin{aligned} \delta _{f}^{(1)}= & {} \delta _{f}^{(2)}=0~. \end{aligned}$$
(64)

It is useful to define the “slow-varying” parameters \(\delta _{gX}\equiv {\dot{\phi }}^{3}g(\phi )/(HM_{p}^{2})\), \(\delta _{g\phi }\equiv g_{\phi }(\phi ){\dot{\phi }}^{4}/(H^{2}M_{p}^{2})\), and \(\delta _{\phi }\equiv {\ddot{\phi }}/(H{\dot{\phi }})\). The above formulae then become \(\frac{\Delta c}{M_{p}^{2}H^{2}f}=\frac{1}{2}\delta _{gX}(\delta _{\phi }-3)+\frac{1}{2}\delta _{g\phi }\), \(\frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}=-\frac{1}{2}\delta _{gX}(\delta _{\phi }+3)\), thus

$$\begin{aligned} \epsilon =\epsilon _{(2)}(1+\delta _{gX}-\frac{1}{6}\delta _{g\phi })+\frac{1}{2}\delta _{gX}(\delta _{\phi }-3)+\frac{1}{2}\delta _{g\phi }~. \end{aligned}$$
(65)

Moreover, we have \(m_{3}^{3}=-{\dot{\phi }}^{3}g(\phi )=-HM_{p}^{2}\delta _{gX}\), \((m_{3}^{3})^{\cdot }=-g_{\phi }(\phi ){\dot{\phi }}^{4}-3g(\phi ){\dot{\phi }}^{2}{\ddot{\phi }}=-H^{2}M_{p}^{2}\delta _{g\phi }-3H^{2}M_{p}^{2}\delta _{gX}\delta _{\phi }\), and \(m_{2}^{4}=H^{2}M_{p}^{2}\delta _{KXX}-\frac{1}{4}H^{2}M_{p}^{2}\delta _{gX}(\delta _{\phi }+3)+\frac{1}{4}H^{2}M_{p}^{2}\delta _{g\phi }\) where \(\delta _{KXX}\) has already been defined previously. In this regard, we get

$$\begin{aligned} r= & {} \frac{16}{(2+\delta _{gX})^{2}}\nonumber \\&\times \frac{\left[ 4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{gX}-2\epsilon _{(2)}\delta _{g\phi }/3-8\delta _{gX}-4\delta _{gX}\delta _{\phi }-\delta _{gX}^{2}\right] ^{3/2}}{\left( 4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{gX}-2\epsilon _{(2)}\delta _{g\phi }/3-12\delta _{gX}+4\delta _{g\phi }+3\delta _{gX}^{2}+8\delta _{KXX}\right) ^{1/2}}.\nonumber \\ \end{aligned}$$
(66)

Since the dependence of r on \(\delta _{KXX}\) is evident and is the same as in the K-essence case, we will ignore it in the following discussion. We now consider a simple case in which \(g_{\phi }(\phi )\simeq 0\) (\(g\simeq const.\)), namely \(\delta _{g\phi }\simeq 0\), so that \(\epsilon \) and r become

$$\begin{aligned} \epsilon= & {} \epsilon _{(2)}(1+\delta _{gX})+\frac{1}{2}\delta _{gX}(\delta _{\phi }-3)~, \end{aligned}$$
(67)
$$\begin{aligned} r= & {} \frac{16}{(2+\delta _{gX})^{2}}\nonumber \\&\times \frac{\left[ 4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{gX}-8\delta _{gX}-4\delta _{gX}\delta _{\phi }-\delta _{gX}^{2}\right] ^{3/2}}{\left( 4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{gX}-12\delta _{gX}+3\delta _{gX}^{2}\right) ^{1/2}}~. \end{aligned}$$
(68)

If we further assume that all the parameters are much smaller than 1, then products and squares of parameters are of higher order and can be neglected. The expression for r then simplifies considerably:

$$\begin{aligned} r\simeq 16 \frac{(\epsilon _{(2)}-2\delta _{gX})^{3/2}}{(\epsilon _{(2)}-3\delta _{gX})^{1/2}}~. \end{aligned}$$
(69)

For the usual case of \(\epsilon _{(2)}>0\), whether \(\delta _{gX}\) is positive or negative, the inclusion of \(\delta _{gX}\) tends to raise r, so one tends to get a large r as in the G-inflation model [65]. However, for \(\epsilon _{(2)}<0\), which makes phantom-like inflation [66] possible, there is some room for small r when \(\delta _{gX}\) is also negative. Note also that, due to the stability requirements (\(c_1>0\), \(c_s^2>0\)), positive \(\delta _{gX}\) is forbidden for \(\epsilon _{(2)}<0\).
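To explore these statements numerically, Eqs. (66) and (69) can be coded directly. The following is a minimal sketch with hypothetical function names; it returns `nan` whenever either bracket is non-positive, which is how the stability requirements \(c_1>0\), \(c_s^2>0\) are implemented here.

```python
import math

def r_galileon(eps2, d_gx=0.0, d_gphi=0.0, d_phi=0.0, d_kxx=0.0):
    """Transcription of Eq. (66); nan outside the stability region."""
    common = 4 * eps2 + 4 * eps2 * d_gx - 2 * eps2 * d_gphi / 3
    num = common - 8 * d_gx - 4 * d_gx * d_phi - d_gx**2
    den = common - 12 * d_gx + 4 * d_gphi + 3 * d_gx**2 + 8 * d_kxx
    if num <= 0 or den <= 0:
        return math.nan
    return 16.0 / (2.0 + d_gx)**2 * num**1.5 / math.sqrt(den)

def r_galileon_leading(eps2, d_gx):
    """Transcription of the leading-order result, Eq. (69)."""
    num = eps2 - 2 * d_gx
    den = eps2 - 3 * d_gx
    if num <= 0 or den <= 0:
        return math.nan
    return 16.0 * num**1.5 / math.sqrt(den)

# Canonical limit: r = 16 * eps_(2).
print(r_galileon(0.004))
# Full vs. leading-order expression for small parameters.
print(r_galileon(0.004, d_gx=-0.001), r_galileon_leading(0.004, -0.001))
# Phantom-like case eps_(2) < 0 with negative / positive delta_gX.
print(r_galileon_leading(-0.001, -0.002), r_galileon_leading(-0.001, 0.002))
```

In this sketch the phantom-like point with negative \(\delta _{gX}\) returns a finite r, while the one with positive \(\delta _{gX}\) returns `nan`, in line with the stability argument above.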

Fig. 2

The plot of r in parameter space \([\epsilon _{(2)}, \delta _{gX}]\) (setting \(\delta _{g\phi }=0\), \(\delta _{\phi }=0\)), \([\epsilon _{(2)}, \delta _{g\phi }]\) (setting \(\delta _{gX}=0\), \(\delta _{\phi }=0\)), \([\epsilon _{(2)}, \delta _{\phi }]\) (setting \(\delta _{gX}=0\), \(\delta _{g\phi }=0\)), \([\delta _{gX}, \delta _{g\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{\phi }=0\)), \([\delta _{gX}, \delta _{\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{g\phi }=0\)), \([\delta _{g\phi }, \delta _{\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{gX}=0\)). Light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\)

Fig. 3

The plot of \(r-16\epsilon c_s\) in parameter space \([\epsilon _{(2)}, \delta _{gX}]\) (setting \(\delta _{g\phi }=0\), \(\delta _{\phi }=0\)), \([\epsilon _{(2)}, \delta _{g\phi }]\) (setting \(\delta _{gX}=0\), \(\delta _{\phi }=0\)), \([\epsilon _{(2)}, \delta _{\phi }]\) (setting \(\delta _{gX}=0\), \(\delta _{g\phi }=0\)), \([\delta _{gX}, \delta _{g\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{\phi }=0\)), \([\delta _{gX}, \delta _{\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{g\phi }=0\)), \([\delta _{g\phi }, \delta _{\phi }]\) (setting \(\epsilon _{(2)}=0.004\), \(\delta _{gX}=0\)). Blue region denotes \(r-16\epsilon c_s<0\), and yellow region denotes \(r-16\epsilon c_s>0\)

The plot of r in the parameter space \([\epsilon _{(2)}, \delta _{gX}, \delta _{g\phi }, \delta _{\phi }]\) is shown in Fig. 2, where the light blue region denotes \(0.01<r<0.064\) and the blue region denotes \(r<0.01\). Several remarks on the plot are in order: (1) in the first plot, the allowed region extends to \(\epsilon _{(2)}<0\), making phantom inflation possible; (2) in the other cases, there is also a large space for r to lie within (0.01, 0.064); (3) for \(\delta _{gX}=0\), r can be small for \(\delta _{g\phi }\simeq 6\), because the numerator of r in Eq. (66) vanishes there. However, it seems difficult to have both \(\delta _{gX}=0\) and a nonvanishing \(\delta _{g\phi }\) consistently; (4) from the rightmost column one can see that for \(\delta _{gX}=0\), the dependence on \(\delta _{\phi }\) also decouples. In the allowed region for \(\epsilon _{(2)}\) and \(\delta _{g\phi }\), the room for small r is also quite large.

The ratio between r and \(16\epsilon c_{s}\) can also be straightforwardly calculated as:

$$\begin{aligned}&\frac{r}{16\epsilon c_{s}}\nonumber \\&\quad =\frac{\left[ 4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{gX}-2\epsilon _{(2)}\delta _{g\phi }/3-8\delta _{gX}-4\delta _{gX}\delta _{\phi }-\delta _{gX}^{2}\right] }{(2+\delta _{gX})^{2}\left[ \epsilon _{(2)}(1+\delta _{gX}-\frac{1}{6}\delta _{g\phi })+\frac{1}{2}\delta _{gX}(\delta _{\phi }-3)+\frac{1}{2}\delta _{g\phi }\right] },\nonumber \\ \end{aligned}$$
(70)

which depends on all the parameters. We also plot this comparison in Fig. 3 (to avoid the possible pole in the denominator, we instead plot \(r-16\epsilon c_{s}\) and compare it with 0), with blue for \(r<16\epsilon c_s\) and yellow for \(r>16\epsilon c_s\). Therefore, for the Galileon field, the consistency relation is generically broken.
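For completeness, the step leading to Eq. (70) can be spelled out. Let \({\mathcal {A}}\) and \({\mathcal {B}}\) denote the bracketed numerator and denominator of Eq. (66) (a shorthand introduced only for this remark). Dividing the corresponding brackets in Eq. (60) by \(H^{2}M_{p}^{4}\) gives \(c_s^2={\mathcal {A}}/{\mathcal {B}}\), so that

$$\begin{aligned} \frac{r}{16c_{s}}=\frac{1}{(2+\delta _{gX})^{2}}\frac{{\mathcal {A}}^{3/2}}{{\mathcal {B}}^{1/2}}\left( \frac{{\mathcal {B}}}{{\mathcal {A}}}\right) ^{1/2}=\frac{{\mathcal {A}}}{(2+\delta _{gX})^{2}}~. \end{aligned}$$

Dividing by \(\epsilon \) from Eq. (65) then yields Eq. (70). In particular \(\delta _{KXX}\), which enters only through \({\mathcal {B}}\), drops out of the ratio.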

3.4 Modified gravity: \(L_{2}+L_{4}\)

Here we consider the case where the \(G_4\) term is involved, which includes the nonminimal coupling case as well as the f(R) modified gravity case. For simplicity, we take \(G_4\) to be a pure function of \(\phi \), namely \(G_{4}=M_{p}^{2}A(\phi )/2\) where \(A(\phi )\) is an arbitrary function. This case can therefore be written in terms of action (2) with the correspondence:

$$\begin{aligned} f= & {} A(\phi )~,~\Lambda =\frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)-G_{2}(\phi ,X),\nonumber \\ c= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X),\nonumber \\ m_{2}^{4}= & {} \frac{{\dot{\phi }}^{4}}{4}G_{2XX}(\phi ,X)~,~m_{3}^{3}=m_{4}^{2}={\tilde{m}}_{4}^{2}=0~. \end{aligned}$$
(71)

From Eqs. (11)–(14), we have

$$\begin{aligned} D= & {} (2AH+{\dot{A}})^{2}M_{p}^{4}~,\nonumber \\ c_{1}= & {} \frac{A}{(2AH+{\dot{A}})^{2}}\left( 4A^{2}H^{2}\epsilon M_{p}^{2}+3{\dot{A}}^{2}M_{p}^{2}+8Am_{2}^{4}\nonumber \right. \\&\left. -2A{\ddot{A}}M_{p}^{2}+2A{\dot{A}}HM_{p}^{2}\right) ~,\nonumber \\ c_{2}= & {} AM_{p}^{2}~,~c_{3}=\frac{2aA^{2}M_{p}^{2}}{2AH+{\dot{A}}}~,\nonumber \\ c_{s}^{2}= & {} \frac{2A{\dot{A}}H+3{\dot{A}}^{2}+4A^{2}H^{2}\epsilon -2A{\ddot{A}}}{2A{\dot{A}}H+3{\dot{A}}^{2}+4A^{2}H^{2}\epsilon -2A{\ddot{A}}+8A\frac{m_{2}^{4}}{M_{p}^{2}}}~,\nonumber \\ \mathcal{{D}}_T= & {} \frac{M_p^2A(\phi )}{8}~,~c_{T}=1~. \end{aligned}$$
(72)

Defining \(\delta _{A}^{(1)}\equiv {\dot{A}}/(HA)\), \(\delta _{A}^{(2)}\equiv {\ddot{A}}/(H{\dot{A}})\), and using Eq. (35), we get

$$\begin{aligned} r=\frac{16}{\left( 2+\delta _{A}^{(1)}\right) ^{2}}\frac{\left( 2\delta _{A}^{(1)}+3\delta _{A}^{(1)2}+4\epsilon -2\delta _{A}^{(1)}\delta _{A}^{(2)}\right) ^{3/2}}{\left( 2\delta _{A}^{(1)}+3\delta _{A}^{(1)2}+4\epsilon -2\delta _{A}^{(1)}\delta _{A}^{(2)}+8\delta _{KXX}\right) ^{1/2}}.\nonumber \\ \end{aligned}$$
(73)

Moreover, in this case we have \(\frac{\Delta c}{M_{p}^{2}H^{2}f}=0\), \(\frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}=0\), \(\delta _{f}^{(1)}=\delta _{A}^{(1)}\), \(\delta _{f}^{(2)}=\delta _{A}^{(2)}\), therefore

$$\begin{aligned} \epsilon =\epsilon _{(2)}\left( 1+\delta _{A}^{(1)}\right) +\frac{1}{2}\delta _{A}^{(1)}(\delta _{A}^{(2)}-1) \end{aligned}$$
(74)

so

$$\begin{aligned} r= & {} \frac{16}{\left( 2+\delta _{A}^{(1)}\right) ^{2}}\frac{\left[ 3\delta _{A}^{(1)2}+4\epsilon _{(2)}(1+\delta _{A}^{(1)})\right] ^{3/2}}{\left[ 3\delta _{A}^{(1)2}+4\epsilon _{(2)}(1+\delta _{A}^{(1)})+8\delta _{KXX}\right] ^{1/2}},\nonumber \\ c_{s}^{2}= & {} \frac{2\delta _{A}^{(1)}+3\delta _{A}^{(1)2}+4\epsilon -2\delta _{A}^{(1)}\delta _{A}^{(2)}}{2\delta _{A}^{(1)}+3\delta _{A}^{(1)2}+4\epsilon -2\delta _{A}^{(1)}\delta _{A}^{(2)}+8\delta _{KXX}}~. \end{aligned}$$
(75)

Fig. 4

The plot of r in parameter space \([\epsilon _{(2)}, \delta _{A}^{(1)}]\) for nonminimal coupling models (left panel), and of r vs. \(\delta _{A}^{(1)}\) for f(R) modified gravity models (right panel). Light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\)

Fig. 5

The plot of \(r-16\epsilon c_s\) in parameter space \([\epsilon _{(2)}, \delta _{A}^{(1)}]\) (setting \(\delta _{A}^{(2)}=0\)), \([\epsilon _{(2)}, \delta _{A}^{(2)}]\) (setting \(\delta _{A}^{(1)}=0.01\)), \([\delta _{A}^{(1)}, \delta _{A}^{(2)}]\) (setting \(\epsilon _{(2)}=0.004\)). Blue region denotes \(r-16\epsilon c_s<0\), and yellow region denotes \(r-16\epsilon c_s>0\)

For the same reason as in the previous case, we ignore the effects of \(\delta _{KXX}\). In this case, \(c_s^2\) is exactly unity, and one obtains a very neat form of r:

$$\begin{aligned} r\simeq \frac{16(3\delta _{A}^{(1)2}+4\epsilon _{(2)}+4\epsilon _{(2)}\delta _{A}^{(1)})}{(2+\delta _{A}^{(1)})^{2}} \end{aligned}$$
(76)

which involves only two parameters. Since now \(c_s=1\), one can see that r clearly deviates from the consistency relation \(r=16\epsilon c_s\). Moreover, we also find that \(c_T=1\): even though gravity is modified in the current form, the propagation speed of tensor perturbations is unaltered. This indicates that \(c_T\) can deviate from unity only in a more complicated form of modified gravity, e.g., when there is a kinetic coupling to gravity, as will be demonstrated in the next case. The interest in the deviation of \(c_T\) from unity is spurred by the constraints imposed by the GW event from a binary neutron star merger, GW170817, and its electromagnetic counterpart GRB170817A; see [67].

Let us now turn to another interesting case, namely where \(G_2(\phi , X)=G_2(\phi )\) is a pure function of \(\phi \). Via a conformal transformation, this is nothing but f(R) gravity [68] (see also [69]). In this case the \(\epsilon _{(2)}\) term in Eq. (74) drops out and we have \(\epsilon =\frac{1}{2}\delta _{A}^{(1)}\delta _{A}^{(2)}-\frac{1}{2}\delta _{A}^{(1)}\), therefore

$$\begin{aligned} r= & {} \frac{16\left( 2\delta _{A}^{(1)}+3\delta _{A}^{(1)2}+4\epsilon -2\delta _{A}^{(1)}\delta _{A}^{(2)}\right) }{\left( 2+\delta _{A}^{(1)}\right) ^{2}}~\nonumber \\= & {} \frac{48\delta _{A}^{(1)2}}{\left( 2+\delta _{A}^{(1)}\right) ^{2}}~. \end{aligned}$$
(77)

namely the number of parameters involved is further reduced, to only one. Moreover, since r is proportional to \(\delta _{A}^{(1)2}\), it is easier to get a small r, as long as \(\delta _{A}^{(1)}\) is not too large. For example, for \(r<0.064\) (0.01), one needs \(-0.070\) (\(-0.028\)) \(<\delta _A^{(1)}<0.076\) (0.029). A concrete example is the famous inflation model proposed by Starobinsky [31, 32].
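The quoted intervals follow from inverting Eq. (77). As a rough check, the sketch below solves \(48\delta ^2/(2+\delta )^2=r_{max}\) on the branch \(2+\delta >0\); the helper names are hypothetical.

```python
import math

def r_fR(d_a1):
    """Transcription of Eq. (77): r = 48 d^2 / (2 + d)^2."""
    return 48.0 * d_a1**2 / (2.0 + d_a1)**2

def delta_bounds(r_max):
    """Interval of delta_A^(1) with r_fR < r_max, assuming 2 + delta > 0.

    Taking the square root of Eq. (77) gives |d| / (2 + d) = k with
    k = sqrt(r_max / 48), which is solved separately for d < 0 and d > 0.
    """
    k = math.sqrt(r_max / 48.0)
    return -2.0 * k / (1.0 + k), 2.0 * k / (1.0 - k)

for r_max in (0.064, 0.01):
    lo, hi = delta_bounds(r_max)
    print(f"r < {r_max}: {lo:.3f} < delta_A1 < {hi:.3f}")
```

The interval is slightly asymmetric around zero because of the \((2+\delta _{A}^{(1)})^{2}\) factor in the denominator.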

The plot of r in the parameter space \([\epsilon _{(2)}, \delta _{A}^{(1)}]\) is shown in Fig. 4, where the light blue region denotes \(0.01<r<0.064\) and the blue region denotes \(r<0.01\). Note that the allowed region extends to \(\epsilon _{(2)}<0\), which means that in the presence of nonminimal coupling, r can be within the detectable region even for phantom inflation; in this case, however, a large \(\delta _{A}^{(1)}\) is needed in order to avoid instabilities. For f(R) gravity, there is modest space for \(\delta _{A}^{(1)}\) to bring r within the detectable region. The comparison of r with the consistency relation \(r=16\epsilon c_s\) is shown in Fig. 5. In the trivial case \(\delta _{A}^{(1)}=0\), one finds \(r=16\epsilon c_s\) and the consistency relation is recovered. However, for positive/negative \(\delta _{A}^{(1)}\), r is larger/smaller than \(16\epsilon c_s\).
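The sign of \(r-16\epsilon c_s\) can be checked at a few sample points using Eqs. (74) and (76). The sketch below assumes \(\delta _{KXX}=0\) (so \(c_s=1\)) and uses illustrative parameter values; the function names are hypothetical.

```python
def epsilon_nmc(eps2, d_a1, d_a2=0.0):
    """Transcription of Eq. (74)."""
    return eps2 * (1.0 + d_a1) + 0.5 * d_a1 * (d_a2 - 1.0)

def r_nmc(eps2, d_a1):
    """Transcription of Eq. (76), valid for delta_KXX = 0 (c_s = 1)."""
    return 16.0 * (3.0 * d_a1**2 + 4.0 * eps2 * (1.0 + d_a1)) / (2.0 + d_a1)**2

eps2 = 0.004
for d_a1 in (-0.01, 0.0, 0.01):
    r = r_nmc(eps2, d_a1)
    diff = r - 16.0 * epsilon_nmc(eps2, d_a1)  # r - 16 * eps * c_s with c_s = 1
    print(f"delta_A1 = {d_a1:+.2f}: r = {r:.4f}, r - 16 eps = {diff:+.4f}")
```

At these sample points the difference vanishes for \(\delta _{A}^{(1)}=0\) and changes sign with \(\delta _{A}^{(1)}\), as described above.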

3.5 Nonminimal derivative coupling: \(M_{p}^{2}R/2+L_{2}+L_{5}\)

In the last case we discussed the nonminimal coupling between the field and gravity, but left out the possibility of derivative coupling, namely gravity coupling to derivatives of the inflaton field. This is a very different case from the above, and may even involve \(G_5\). Here we assume that \(G_5=h(\phi )\) is a pure function of \(\phi \), which is equivalent to introducing a term like \(G_{\mu \nu }\partial ^\mu \phi \partial ^\nu \phi \), up to a total derivative [70,71,72,73]. This case can be written in terms of action (2) with the correspondence:

$$\begin{aligned} f= & {} 1-h_{\phi }(\phi ){\dot{\phi }}^{2}~, \end{aligned}$$
(78)
$$\begin{aligned} \Lambda= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)-G_{2}(\phi ,X)-\frac{1}{2}M_{p}^{2}H{\dot{f}}\nonumber \\&-2M_{p}^{2}(f-1){\dot{H}} -\frac{1}{2}M_{p}^{2}{\ddot{f}}+3M_{p}^{2}H^{2}(f-1)~, \end{aligned}$$
(79)
$$\begin{aligned} c= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)+\frac{7}{2}M_{p}^{2}H{\dot{f}}+2M_{p}^{2}(f-1){\dot{H}}\nonumber \\&+\frac{1}{2}M_{p}^{2}{\ddot{f}}+9M_{p}^{2}H^{2}(f-1)~, \end{aligned}$$
(80)
$$\begin{aligned} m_{2}^{4}= & {} \frac{{\dot{\phi }}^{4}}{4}G_{2XX}(\phi ,X)+\frac{1}{4}M_{p}^{2}H{\dot{f}}\nonumber \\&+M_{p}^{2}(f-1){\dot{H}}+\frac{1}{4}M_{p}^{2}{\ddot{f}}~, \end{aligned}$$
(81)
$$\begin{aligned} m_{3}^{3}= & {} M_{p}^{2}{\dot{f}}+4M_{p}^{2}H(f-1)~,~m_{4}^{2}\nonumber \\= & {} -M_{p}^{2}(f-1)~,~{\tilde{m}}_{4}^{2}=-M_{p}^{2}(f-1)~. \end{aligned}$$
(82)

From Eqs. (11)–(14), we have

$$\begin{aligned} D= & {} 4H^{2}M_{p}^{4}(3f-4)^{2}=4H^{2}M_{p}^{4}(1+3h_{\phi }{\dot{\phi }}^{2})^{2}~, \end{aligned}$$
(83)
$$\begin{aligned} c_{1}= & {} \frac{(2-f)M_{p}^{2}}{4H^{2}(3f-4)^{2}}\left\{ 48H^{2}(f-1)^{2}\nonumber \right. \\&\left. +(2-f)\left[ 2{\dot{\phi }}^{4}G_{2XX}(\phi ,X)/M_{p}^{2}\nonumber \right. \right. \\&\left. \left. +4H{\dot{f}}+4H^{2}\epsilon (2-f)\right] \right\} ~, \end{aligned}$$
(84)
$$\begin{aligned} c_{2}= & {} fM_{p}^{2}=(1-h_{\phi }{\dot{\phi }}^{2})M_{p}^{2}~, \end{aligned}$$
(85)
$$\begin{aligned} c_{3}= & {} \frac{-aM_{p}^{2}(2-f)^{2}}{H(3f-4)}~, \end{aligned}$$
(86)
$$\begin{aligned} c_{s}^{2}= & {} \frac{4H^{2}}{(2-f)}\nonumber \\&\times \frac{\left\{ {\dot{f}}(3f-2)(2-f)/H-(2-f)^{2}(3f-4)(1+\epsilon )-f(3f-4)^{2}\right\} }{\left\{ 48H^{2}(f-1)^{2}+(2-f)[2{\dot{\phi }}^{4}G_{2XX}(\phi ,X)/M_{p}^{2}+4H{\dot{f}}+4H^{2}\epsilon (2-f)]\right\} }, \end{aligned}$$
(87)
$$\begin{aligned} \mathcal{{D}}_T= & {} \frac{M_p^2}{8}(2-f)=\frac{M_p^2}{8}(1+h_{\phi }{\dot{\phi }}^{2})~, \end{aligned}$$
(88)
$$\begin{aligned} c_{T}= & {} \frac{f}{Q_{T}}=\frac{f}{2-f}=\frac{1-h_{\phi }{\dot{\phi }}^{2}}{1+h_{\phi }{\dot{\phi }}^{2}},\nonumber \\ \end{aligned}$$
(89)

so using Eq. (35), one has

$$\begin{aligned} r= & {} \frac{32}{(3f-4)^{2}}\nonumber \\&\times \frac{\left[ {\dot{f}}(3f-2)(2-f)/H-(2-f)^{2}(3f-4)(1+\epsilon )-f(3f-4)^{2}\right] ^{3/2}}{\left\{ 48(f-1)^{2}(2-f)+(2-f)^{2}\left[ 2{\dot{\phi }}^{4}G_{2XX}(\phi ,X)/(H^2M_{p}^{2})+4{\dot{f}}/H+4\epsilon (2-f)\right] \right\} ^{1/2}f^{3/2}}~. \end{aligned}$$
(90)

For ease of analysis, we define \(\delta _{h\phi }\equiv f_{(5)}/f\), where \(f_{(5)}=f-1\). Therefore we have \(\dot{f_{(5)}}={\dot{f}}\), \(\ddot{f_{(5)}}={\ddot{f}}\), and \(\dot{\delta }_{h\phi }=(f_{(5)}/f)^{\cdot }={\dot{f}}_{(5)}/f-f_{(5)}{\dot{f}}/f^{2}=({\dot{f}}/f)(1-f_{(5)}/f)=H\delta _{f}^{(1)}(1-\delta _{h\phi })\). Moreover, to avoid ghost and gradient instabilities in this model (\(\mathcal{{D}}_T>0\), \(c_T>0\)), we require \(-1<h_\phi {{\dot{\phi }}}^2<1\), so from the definition one finds that \(\delta _{h\phi }<1/2\) is required, even for non-inflationary models.
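The bound on \(\delta _{h\phi }\) can be made explicit. Writing \(x\equiv h_{\phi }{\dot{\phi }}^{2}\) (a shorthand used only here), the definitions \(f=1-x\) and \(\delta _{h\phi }=(f-1)/f\) give

$$\begin{aligned} \delta _{h\phi }=-\frac{x}{1-x}~,\qquad \frac{d\delta _{h\phi }}{dx}=-\frac{1}{(1-x)^{2}}<0~, \end{aligned}$$

so \(\delta _{h\phi }\) decreases monotonically on \(-1<x<1\), from \(1/2\) as \(x\rightarrow -1\) towards \(-\infty \) as \(x\rightarrow 1\). The stability window therefore maps onto \(\delta _{h\phi }<1/2\).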

Moreover, in this case we have

$$\begin{aligned} \frac{\Delta c}{M_{p}^{2}H^{2}f}= & {} -\frac{M_{p}^{2}H{\dot{f}}_{(5)}}{2M_{p}^{2}H^{2}f}-\frac{2M_{p}^{2}f_{(5)}{\dot{H}}}{M_{p}^{2}H^{2}f}\nonumber \\&-\frac{M_{p}^{2}{\ddot{f}}_{(5)}}{2M_{p}^{2}H^{2}f}+\frac{3M_{p}^{2}H^{2}f_{(5)}}{M_{p}^{2}H^{2}f} \end{aligned}$$
(91)
$$\begin{aligned}= & {} -\frac{1}{2}\delta _{f}^{(1)}+2\epsilon \delta _{h\phi }-\frac{1}{2}\delta _{f}^{(1)}\delta _{f}^{(2)}+3\delta _{h\phi }~, \end{aligned}$$
(92)
$$\begin{aligned} \frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}= & {} +\frac{7M_{p}^{2}H{\dot{f}}_{(5)}}{2M_{p}^{2}H^{2}f}+\frac{2M_{p}^{2}f_{(5)}{\dot{H}}}{M_{p}^{2}H^{2}f}\nonumber \\&+\frac{M_{p}^{2}{\ddot{f}}_{(5)}}{2M_{p}^{2}H^{2}f}+\frac{9M_{p}^{2}H^{2}f_{(5)}}{M_{p}^{2}H^{2}f} \end{aligned}$$
(93)
$$\begin{aligned}= & {} +\frac{7}{2}\delta _{f}^{(1)}-2\epsilon \delta _{h\phi }+\frac{1}{2}\delta _{f}^{(1)}\delta _{f}^{(2)}+9\delta _{h\phi }~, \end{aligned}$$
(94)

therefore

$$\begin{aligned} \epsilon =\frac{\epsilon _{(2)}(1-4\delta _{h\phi })-\delta _{f}^{(1)}+3\delta _{h\phi }}{1-2\delta _{h\phi }}~, \end{aligned}$$
(95)

and

$$\begin{aligned} r=\frac{16}{(4\delta _{h\phi }-1)^{2}}\frac{\left[ 6\delta _{f}^{(1)}\delta _{h\phi }(1-2\delta _{h\phi })+\delta _{h\phi }(3-2\delta _{h\phi })(1-4\delta _{h\phi })+(1-4\delta _{h\phi })^{2}(1-2\delta _{h\phi })\epsilon _{(2)}\right] ^{3/2}}{\left[ 3\delta _{h\phi }(1+2\delta _{h\phi })+(1-2\delta _{h\phi })(1-4\delta _{h\phi })\epsilon _{(2)}+2\delta _{KXX}(1-2\delta _{h\phi })\right] ^{1/2}}~. \end{aligned}$$
(96)

We now consider a simple case, namely \(\delta _{f}^{(1)}\simeq 0\), which means that \(\delta _{h\phi }=const.\) Moreover, we also ignore the effect from \(\delta _{KXX}\). Therefore \(\epsilon \) and r will be reduced to:

$$\begin{aligned} \epsilon= & {} \frac{\epsilon _{(2)}(1-4\delta _{h\phi })+3\delta _{h\phi }}{1-2\delta _{h\phi }}, \end{aligned}$$
(97)

Fig. 6

The plot of r in parameter space \([\epsilon _{(2)}, \delta _f^{(1)}]\) (setting \(\delta _{h\phi }=0\)), \([\epsilon _{(2)}, \delta _{h\phi }]\) (setting \(\delta _f^{(1)}=0\)), \([\delta _f^{(1)}, \delta _{h\phi }]\) (setting \(\epsilon _{(2)}=0.004\)). Light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\)

Fig. 7

The plot of \(r-16\epsilon c_s\) in parameter space \([\epsilon _{(2)}, \delta _f^{(1)}]\) (setting \(\delta _{h\phi }=0\)), \([\epsilon _{(2)}, \delta _{h\phi }]\) (setting \(\delta _f^{(1)}=0\)), \([\delta _f^{(1)}, \delta _{h\phi }]\) (setting \(\epsilon _{(2)}=0.004\)). Blue region denotes \(r-16\epsilon c_s<0\), and yellow region denotes \(r-16\epsilon c_s>0\)

$$\begin{aligned}&r=\frac{16\left[ (\epsilon _{(2)}(1-4\delta _{h\phi })+3\delta _{h\phi })(1-2\delta _{h\phi })+4\delta _{h\phi }^{2}\right] ^{3/2}}{\left[ 12\delta _{h\phi }^{2}(1-4\delta _{h\phi })+(\epsilon _{(2)}(1-4\delta _{h\phi })+3\delta _{h\phi })(1-2\delta _{h\phi })(1-4\delta _{h\phi })\right] ^{1/2}}\nonumber \\&\simeq 16\left[ \epsilon _{(2)}+3\delta _{h\phi }+{\mathcal {O}}(\delta _{h\phi }^{2})\right] ~, \end{aligned}$$
(98)

where in the last step we have expanded the expression for r in terms of the slow-varying parameters. One can see that \(\epsilon _{(2)}\) and \(\delta _{h\phi }\) trade off against each other in determining r: at fixed r, a larger \(\epsilon _{(2)}\) requires a smaller \(\delta _{h\phi }\), and vice versa. Moreover, for the case \(\delta _{h\phi }\simeq 0\), \(\delta _{f}^{(1)}\simeq 0\) (\(f\simeq 1\)), we have

$$\begin{aligned} \epsilon \simeq \epsilon _{(2)}~,~r=16\epsilon _{(2)}~ \end{aligned}$$
(99)

which recovers the canonical field case.
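The limits discussed above can be checked numerically from Eq. (96). The following is a sketch with hypothetical function names; it returns `nan` when either bracket in Eq. (96) is non-positive, and compares the full expression with the leading-order form \(16(\epsilon _{(2)}+3\delta _{h\phi })\) of Eq. (98).

```python
import math

def r_derivative_coupling(eps2, d_h, d_f1=0.0, d_kxx=0.0):
    """Transcription of Eq. (96); nan if a bracket is non-positive."""
    u = 1.0 - 2.0 * d_h
    v = 1.0 - 4.0 * d_h
    num = (6.0 * d_f1 * d_h * u
           + d_h * (3.0 - 2.0 * d_h) * v
           + v**2 * u * eps2)
    den = 3.0 * d_h * (1.0 + 2.0 * d_h) + u * v * eps2 + 2.0 * d_kxx * u
    if num <= 0 or den <= 0:
        return math.nan
    return 16.0 / v**2 * num**1.5 / math.sqrt(den)

def r_leading(eps2, d_h):
    """Leading-order form from Eq. (98)."""
    return 16.0 * (eps2 + 3.0 * d_h)

# Canonical limit, Eq. (99): r = 16 * eps_(2).
print(r_derivative_coupling(0.004, 0.0))
# Small positive delta_hphi: full result vs. leading order.
print(r_derivative_coupling(0.002, 0.0005), r_leading(0.002, 0.0005))
# Negative delta_hphi lowers r at fixed eps_(2).
print(r_derivative_coupling(0.004, -0.001))
```

The last line illustrates the trade-off between \(\epsilon _{(2)}\) and \(\delta _{h\phi }\): a negative \(\delta _{h\phi }\) brings r below the canonical value at the same \(\epsilon _{(2)}\).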

The plot of r in the parameter space \([\epsilon _{(2)}, \delta _f^{(1)}, \delta _{h\phi }]\) is shown in Fig. 6, where the light blue region denotes \(0.01<r<0.064\) and the blue region denotes \(r<0.01\). One can see from the middle panel that, for \(\delta _f^{(1)}=0\), the allowed region lies where \(\epsilon _{(2)}\) and \(\delta _{h\phi }\) are roughly anti-correlated, and large \(\epsilon _{(2)}\) is allowed for large negative \(\delta _{h\phi }\). As in the nonminimal coupling case, negative \(\epsilon _{(2)}\) (phantom inflation) is also allowed for large positive \(\delta _{h\phi }\). Moreover, when \(\delta _{h\phi }=0\), the dependence of r on \(\delta _f^{(1)}\) also decouples (left panel). The comparison of r with the consistency relation \(r=16\epsilon c_s\) is shown in Fig. 7. From the left panel we can see that, for \(\delta _{h\phi }=0\), the deviation from the consistency relation (in this case \(c_s=1\)) depends entirely on \(\delta _f^{(1)}\), namely \(r>16\epsilon \) for \(\delta _f^{(1)}>0\) and \(r<16\epsilon \) for \(\delta _f^{(1)}<0\). This is because in this case r reduces to the trivial form \(r=16\epsilon _{(2)}=16(\epsilon +\delta _f^{(1)})\).

Fig. 8

The plot of r in parameter space \([\epsilon _{(2)}, q]\) (setting \(\delta _{q}=0\)), \([\epsilon _{(2)}, \delta _{q}]\) (setting \(q=0\)), \([q, \delta _{q}]\) (setting \(\epsilon _{(2)}=0.004\)). Light blue region denotes \(0.01<r<0.064\), and blue region denotes \(r<0.01\)

Fig. 9

The plot of \(r-16\epsilon c_s\) in parameter space \([\epsilon _{(2)}, q]\) (setting \(\delta _{q}=0\)), \([\epsilon _{(2)}, \delta _{q}]\) (setting \(q=0\)), \([q, \delta _{q}]\) (setting \(\epsilon _{(2)}=0.004\)). Blue region denotes \(r-16\epsilon c_s<0\), and yellow region denotes \(r-16\epsilon c_s>0\)

3.6 Beyond Horndeski theory: \(M_{p}^{2}R/2+L_{2}+L_{6}\)

The last case we consider is the beyond Horndeski model. In [43, 44], we considered a beyond Horndeski model that can realize a non-singular universe free of both ghost and gradient instabilities, by introducing a nonzero \(\Delta {\tilde{m}}_{4}^{2}\) in the action (2). Here we assume that \({\tilde{m}}_{4}^{2}=\Delta {\tilde{m}}_{4}^{2}=q(\phi )M_p^2\) is a pure function of \(\phi \), with \(q(\phi )\) a dimensionless function; this case can then be written in terms of action (2) with the correspondence:

$$\begin{aligned} f= & {} 1~,~\Lambda =\frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)-G_{2}(\phi ,X)~,\nonumber \\ c= & {} \frac{{\dot{\phi }}^{2}}{2}G_{2X}(\phi ,X)~,\nonumber \\ m_{2}^{4}= & {} \frac{{\dot{\phi }}^{4}}{4}G_{2XX}(\phi ,X)~,~m_{3}^{3}=m_{4}^{2}\nonumber \\= & {} 0~,~{\tilde{m}}_{4}^{2}=q(\phi )M_p^2~. \end{aligned}$$
(100)

From Eqs. (11)–(14), we have

$$\begin{aligned} D= & {} 4H^2M_{p}^{4}~,~c_{1}=\frac{1}{H^{2}}(H^{2}\epsilon M_{p}^{2}+2m_{2}^{4})~,~c_{2}=M_{p}^{2}~, \nonumber \\ \end{aligned}$$
(101)
$$\begin{aligned} c_{3}= & {} \frac{aM_{p}^{2}}{H}\left( 1+2\frac{{\tilde{m}}_{4}^{2}}{M_{p}^{2}}\right) ~,~c_{s}^{2}\nonumber \\= & {} \frac{H^{2}\epsilon M_{p}^{2}+2H^{2}{\tilde{m}}_{4}^{2}(1+\epsilon )-2H\dot{{\tilde{m}}}_{4}^{2}}{H^{2}\epsilon M_{p}^{2}+2m_{2}^{4}}~, \end{aligned}$$
(102)
$$\begin{aligned} \mathcal{{D}}_T= & {} \frac{M_p^2}{8}~,~c_{T}=1~. \end{aligned}$$
(103)

In this case, we have \(\frac{\Delta c}{M_{p}^{2}H^{2}f}=\frac{\Delta \Lambda }{M_{p}^{2}H^{2}f}=\delta _f^{(1)}=\delta _f^{(2)}=0\), therefore \(\epsilon =\epsilon _{(2)}\), namely \(L_6\) causes no deviation of \(\epsilon \). Defining \(\delta _q=\dot{q}/(Hq)\), we have

$$\begin{aligned} r=\frac{16[\epsilon _{(2)} +2q(1+\epsilon _{(2)}-\delta _{q})]^{3/2}}{(\epsilon _{(2)}+2\delta _{KXX})^{1/2}}~. \end{aligned}$$
(104)

From this expression we can see that, for \(\delta _q\simeq 0\) (\(q(\phi )\) nearly constant), obtaining an r smaller than that given by the consistency relation requires \(q<0\) (with \(\delta _{KXX}\) again ignored). However, the positivity of \(c_1\) and \(c_s^2\) requires \(\epsilon _{(2)}+2\delta _{KXX}>0\) as well as \(\epsilon _{(2)}+2q(1+\epsilon _{(2)}-\delta _q)>0\).

The plot of r in the parameter space \([\epsilon _{(2)}, q, \delta _{q}]\) is shown in Fig. 8, where the light blue region denotes \(0.01<r<0.064\) and the blue region denotes \(r<0.01\). For \(q<0\), small r can still be obtained (left panel). For \(q=0\), r reduces to the trivial case \(r=16\epsilon _{(2)}\) and the dependence of r on \(\delta _{q}\) also decouples (middle panel). For q around 0 and \(\delta _{q}\) around 1, there is still room for small r even when \(\epsilon _{(2)}=0.004\), which is at the edge of \(r=0.064\) in the absence of \(L_6\) (right panel). The comparison of r with the consistency relation \(r=16\epsilon c_s\) is shown in Fig. 9.
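A few sample points make the role of q concrete. The sketch below evaluates Eq. (104) with \(\delta _{KXX}\) ignored, as in the discussion above; the function name and the parameter values are illustrative only.

```python
import math

def r_beyond_horndeski(eps2, q=0.0, d_q=0.0):
    """Eq. (104) with delta_KXX set to zero.

    Returns nan unless eps2 > 0 and eps2 + 2*q*(1 + eps2 - d_q) > 0,
    i.e. outside the stability region c_1 > 0, c_s^2 > 0.
    """
    num = eps2 + 2.0 * q * (1.0 + eps2 - d_q)
    if eps2 <= 0 or num <= 0:
        return math.nan
    return 16.0 * num**1.5 / math.sqrt(eps2)

eps2 = 0.004
for q in (-0.001, 0.0, 0.001):
    print(f"q = {q:+.3f}: r = {r_beyond_horndeski(eps2, q):.4f}")
```

With \(\epsilon _{(2)}=0.004\) and \(\delta _q=0\), the point \(q=0\) sits at \(r=0.064\), a negative q lowers r and a positive q raises it.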

4 Conclusion

The next decade will be a gravitational-wave decade, with many more GW experiments coming online and many more GW signals expected to be discovered. In particular, the ambitious ground-based experiment AliCPT in Tibet, China, aims to search for signals of PGWs with improved accuracy in the coming few years [14,15,16]. The current and future constraints divide the range of r into three parts, namely \(r>0.064\) (disfavored by current data), \(r\in (0.01,0.064)\) (within the observable window of upcoming experiments such as AliCPT) and \(r<0.01\) (awaiting further detections). In this paper, we have formulated the tensor/scalar ratio in a generic setup using the EFT approach. As an application, we studied theoretically which kinds of inflation models let r fall into the last two regions. In each model, we have analyzed the relation between r and the other slow-varying parameters and obtained the corresponding regions in parameter space. Furthermore, we have also discussed the deviation of r from the consistency relation in each model. We summarize our concluding remarks as follows:

  1. Making use of the EFT approach, the tensor/scalar ratio r for the given action (2) can be expressed as in Eq. (35). Note that this expression is applicable whether the power spectra of the scalar and tensor perturbations are constant or growing.

  2. From the expression, one can see that the running of the sound speeds affects r in an obvious way. When \(|s+1|>1\) or \(|s_T+1|<1\), r is suppressed, and vice versa, where we have simply assumed \(c_s\sim \eta ^s\), \(c_T\sim \eta ^{s_T}\). Note that this conclusion might not apply to more complicated cases.

  3. For the K-essence model, where only \(M_p^2R/2\) and \(L_2\) are involved in the Lagrangian, r is in accord with the consistency relation \(r=16\epsilon c_s\). In this case, r can be suppressed in various ways, e.g., by a small \(\epsilon \), or by a small sound speed arising from the non-canonicity of the scalar field. However, a small sound speed runs the risk of large non-Gaussianities.

  4. For Lagrangians beyond \(L_2\), r deviates from the consistency relation, which provides another mechanism for small r. Moreover, this allows phantom inflation to give rise to small r.

  5. For the nonminimal coupling theory, namely \(L_4\) with \(G_4=G_4(\phi )\), the sound speed of tensor perturbations is still unity, \(c_T=1\). However, it changes when a kinetic coupling to gravity is also involved. Phantom inflation is allowed to have a small r in these cases.

  6. Small r can also be obtained by taking into account the beyond-Horndeski part, namely \(L_6\), even in the absence of \(L_3\), \(L_4\), and \(L_5\).

Rather than obtaining results for each concrete model through the usual model-by-model analysis, we have confirmed them at a more general level with the help of the EFT approach. By showing more detailed and more precise relationships between r and these parameters, our analysis and numerical plots will be useful for concrete model building. For instance, the present method can be used to analyze other subclasses of Horndeski theory, or beyond Horndeski theories. One can also enlarge the current scope, for example by further turning on the parameters \({\bar{m}}_4\), \({\bar{m}}_5\), \({\bar{\lambda }}\) and \({\tilde{\lambda }}\), to include models with higher-order spatial derivatives.

Before ending, let us note that in the current discussion we have focused only on r and have not taken into account constraints on early universe models from other observables, such as the spectral index (and even its running) and non-Gaussianities. Although we assumed that the power spectra of these models are scale-invariant, it is worth considering how such scale invariance would constrain the model parameters; and as the observational data become more and more precise, the running of the index will also become another important constraining tool. Moreover, non-Gaussianity is also an interesting probe of the early Universe. In Refs. [74, 75], the authors showed that in the matter bounce scenario driven by Horndeski theory, one cannot obtain a small r while keeping \(f_{nl}\) small enough to satisfy the current constraints (a no-go theorem); it would also be interesting to consider such constraints for other early universe scenarios/models (examples have been given in [76]). We will address these issues in future work.