1 Introduction

Despite the immense predictive power of general relativity (GR), extensions to it have been motivated from several directions. On the high-energy side, where the goal is the ultraviolet completeness of GR, the motivations include quantum gravity and inflation models. On the low-energy side, they include, among others, the phenomenology of the dark sector of the universe and spherically symmetric solutions in the weak-field regime.

According to Lovelock’s theorem, GR rests on a set of hypotheses: it is a 4-dimensional Riemannian metric theory of gravity, containing the metric \(g_{\mu \nu }\) as the only fundamental field, invariant under diffeomorphisms and with second-order field equations. In this sense, extensions to GR are achieved by violating any of these hypotheses [1]. By violating the first hypothesis, we can allow a higher-dimensional spacetime or even consider a gravitational action constructed with curvature and torsion invariants in a Riemann–Cartan spacetime [2]. If we violate the hypothesis that the theory of gravity has the metric as the only fundamental field, we can obtain, for example, the Horndeski theories. These, in turn, are the most general 4-dimensional theories of gravity whose action, constructed with the metric and a scalar field, leads to second-order field equations [3]. On the other hand, by allowing field equations of order higher than two while preserving all other assumptions, we find the higher-order gravities.

Higher-order theories of gravity are characterized by the inclusion of correction terms in the Einstein–Hilbert (EH) action that lead to higher-order field equations. Such corrections can be conveniently classified according to their mass (energy) scale. In this scenario, EH plus the cosmological constant represents the usual zero-order term. First-order corrections to EH are terms of mass dimension four, constructed from the four possible invariants (footnote 1)

$$\begin{aligned} R^{2}{, }R_{\mu \nu }R^{\mu \nu }{, }R_{\mu \nu \alpha \beta }R^{\mu \nu \alpha \beta }\text { and }\square R. \end{aligned}$$

In turn, the second-order corrections to EH are terms of mass dimension six, built with the invariants (footnote 2)

$$\begin{aligned} \begin{aligned}&R\square R{, }R_{\mu \nu }\square R^{\mu \nu }{,} \\&R^{3}{, }RR_{\mu \nu }R^{\mu \nu }{, }R_{\mu \nu }R_{{ \ } \alpha }^{\nu }R^{\alpha \mu }{,} \\&RR_{\mu \nu \alpha \beta }R^{\mu \nu \alpha \beta }{, }R_{\mu \alpha }R_{\nu \beta }R^{\mu \nu \alpha \beta }\text { and }R_{\mu \nu \alpha \beta }R_{{ \ \ }\kappa \rho }^{\alpha \beta }R^{\kappa \rho \mu \nu }. \end{aligned} \end{aligned}$$

Proceeding in the same way, further higher-order correction terms arise as the energy scale increases.

Models involving higher-order gravities have been explored in various contexts. Several papers in the literature show the equivalence between different classes of gravity theories, in particular, between \(f\left( R\right) \) or \(f\left( R,\square ^{k}R\right) \) and scalar-tensor theories [4,5,6,7,8,9,10]. In some contexts, it is more convenient to pass from the original frame to the Jordan or Einstein frames, through a conformal transformation, in order to handle equations for scalar fields rather than higher-order equations for the metric. Another topic of great interest is the investigation of static, spherically symmetric solutions in higher-order gravities, with Stelle’s paper [11] being one of those responsible for shedding light on this line of research. In particular, the possibility of non-Schwarzschild black hole solutions is studied through different approaches in Refs. [12,13,14,15,16,17,18,19], whereas studies of weak-field regime solutions are covered in Refs. [19,20,21,22]. There are also models that study the generation and properties of gravitational waves [23,24,25,26,27,28,29,30,31,32,33]. The latter is a topic of great current appeal due to the direct detections [34,35,36] that have enabled the rise of gravitational wave astrophysics.

Regarding inflation, it is well known in the literature that the Starobinsky model [37, 38] fits well the recent observational data from Planck, BICEP3/Keck and BAO [39, 40]. Furthermore, its well-grounded theoretical motivation makes it one of the strongest inflationary candidates among the plethora of inflation models [41]. These reasons motivate the investigation of models based on extensions to the Starobinsky model via higher-order gravity theories. There is a large body of research in this context. Some works are based on \( f\left( R\right) \) theories, as in Refs. [42,43,44,45,46,47], while others consider the introduction of Weyl’s term [48,49,50,51,52]. There are those that consider local gravitational actions involving a finite number of curvature derivative terms [53,54,55,56,57,58,59], while others are nonlocal, involving infinite derivatives [60,61,62,63,64].

In this paper, we propose to investigate the extension to the Starobinsky model due to the inclusion of all correction terms up to the second-order involving only the scalar curvature R. In this sense, we have the following gravitational action

$$\begin{aligned} S=\frac{M_{Pl}^{2}}{2}\int d^{4}x\sqrt{-g}\left( R+\frac{1}{2\kappa _{0}} R^{2}+\frac{\alpha _{0}}{3\kappa _{0}^{2}}R^{3}-\frac{\beta _{0}}{2\kappa _{0}^{2}}R\square R\right) , \end{aligned}$$
(1)

where \(\kappa _{0}\) has units of mass squared and the parameters \(\alpha _{0}\) and \( \beta _{0}\) are dimensionless. Furthermore, \(M_{Pl}\) is the reduced Planck mass, such that \(M_{Pl}^{2}\equiv \left( 8\pi G\right) ^{-1}\), and \(\square \equiv \nabla _{\sigma }\nabla ^{\sigma }\) represents the covariant d’Alembertian operator. In this scenario, where we only address the scalar sector of corrections, \(R^{2}\) represents the first-order correction, while the last two terms correspond to the second-order corrections to EH. Note that the parameter \(\kappa _{0}\) sets the energy scale of inflation, while the parameters \(\alpha _{0} \) and \(\beta _{0}\) measure the deviation from the Starobinsky model. Since \(R^{3}\) and \(R\square R\) are both second-order correction terms in the energy-scale classification, they must contribute similarly to inflation, so there is a joint effect that must be considered. In that regard, our paper goes a step beyond the recent research developed in [44] and [58, 59], which address the models Starobinsky\(+R^{3}\) and Starobinsky\(+R\square R\), respectively. In this paper, the multi-field treatment associated with the \(R\square R\) term is different from that used in Ref. [58]. While in that paper inflation is described by a scalar and a vector field, here inflation is driven by the dynamics of two scalar fields. Furthermore, by properly constructing the curvature perturbation, we can obtain observational constraints different from those obtained in Ref. [58] for the tensor-to-scalar ratio. In turn, by assuming the \(R\square R\) sixth-derivative term as a small perturbation to Starobinsky inflation, Ref. [59] uses a somewhat different approach, being able to map the model into a one-scalar theory.

It is important to note that the model (1) is not regarded as a fundamental theory of gravity. Rather, it is regarded as a classical model of gravity in the context of an effective theory. One could legitimately worry about the ghost-type instabilities introduced by the \(R\square R\) sixth-derivative term (footnote 3). Nevertheless, as previously pointed out in Refs. [65, 66], the explosive growth of ghost-type perturbations does not take place as long as the initial seeds of such perturbations do not have sufficiently high frequencies. Usually, as long as the energy scales involved are close to the Planck order of magnitude, cosmological solutions are stable.

The paper is structured as follows. In Sect. 2, we start from the original frame for the action (1) and rewrite it in the scalar-tensor representation in the Einstein frame, where the theory is described by a metric and two auxiliary scalar fields, only one of which is associated with a canonical kinetic term. Then we write the field equations for each of the fields. Section 3 gives the full description of inflation in the cosmological background. In Sect. 3.1, we study the critical points and the 4-dimensional phase space of the model. Next, we explore inflation in the slow-roll leading order regime by defining the slow-roll factor, and thus obtain the slow-roll parameters and the number of e-folds. In Sect. 4, we give a complete description of the evolution of scalar perturbations. In addition to writing the perturbed field equations in the slow-roll leading order regime, we define the adiabatic and isocurvature perturbations by decomposing the perturbations along the directions tangent (adiabatic perturbation) and orthogonal (isocurvature perturbation) to the background phase space trajectories. This allows us to properly establish the curvature perturbation, which is essential to connect our model with the observations. In Sect. 5, we confront the proposed model with the recent observations of Ref. [40], where, by using the constraint on the number of inflation e-folds found in [44], we build the usual \(n_{s}\times r_{0.002}\) plane and the plot for the parameter space \(\alpha _{0}\times \beta _{0}\). In Sect. 6, we make some final comments.

2 Field equations

The first step is to rewrite the action (1) in the Einstein frame. Performing this calculation, we get

$$\begin{aligned} \bar{S}&=\frac{M_{Pl}^{2}}{2}\int d^{4}x\sqrt{-\bar{g}}\left[ \bar{R} -3\left( \frac{1}{2}\bar{\nabla }_{\rho }\chi \bar{\nabla }^{\rho }\chi \right. \right. \nonumber \\&\quad \left. \left. -\frac{\beta _{0}}{6}e^{-\chi }\bar{\nabla }_{\rho }\lambda \bar{\nabla }^{\rho }\lambda +V\left( \chi ,\lambda \right) \right) \right] , \end{aligned}$$
(2)

with

$$\begin{aligned} V\left( \chi ,\lambda \right) =\frac{\kappa _{0}}{3}e^{-2\chi }\lambda \left( e^{\chi }-1-\frac{1}{2}\lambda -\frac{\alpha _{0}}{3}\lambda ^{2}\right) , \end{aligned}$$
(3)

the potential associated with the model. The quantities with bar are defined from the metric as \(\bar{g}_{\mu \nu }=e^{\chi }g_{\mu \nu }\) and the dimensionless fields \(\chi \) and \(\lambda \) are defined as

$$\begin{aligned} \lambda =\frac{R}{\kappa _{0}}\text { and }\mu =e^{\chi }=1+\lambda +\alpha _{0}\lambda ^{2}-\frac{\beta _{0}}{\kappa _{0}}\square \lambda {,} \end{aligned}$$
(4)

where in the Einstein frame, \(\square \lambda =e^{\chi }\left( \bar{\square }\lambda -\bar{\partial }^{\mu }\lambda \bar{\partial }_{\mu }\chi \right) \).
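As a quick consistency check of Eqs. (3) and (4), the following minimal Python sketch evaluates the potential and verifies that, for \(\alpha _{0}=\beta _{0}=0\), where Eq. (4) reduces to \(\lambda =e^{\chi }-1\), it collapses to the single-field Starobinsky form \(\frac{\kappa _{0}}{6}\left( 1-e^{-\chi }\right) ^{2}\). The function names are illustrative only, and setting \(\kappa _{0}=1\) is an assumption made just to keep the numbers dimensionless.

```python
import math


def potential(chi, lam, alpha0, kappa0=1.0):
    """Potential V(chi, lambda) of Eq. (3); kappa0 = 1 is an illustrative choice."""
    bracket = math.exp(chi) - 1.0 - 0.5 * lam - (alpha0 / 3.0) * lam**2
    return (kappa0 / 3.0) * math.exp(-2.0 * chi) * lam * bracket


def starobinsky_potential(chi, kappa0=1.0):
    """Single-field form expected when alpha0 = beta0 = 0."""
    return (kappa0 / 6.0) * (1.0 - math.exp(-chi)) ** 2


for chi in (0.5, 2.0, 5.0):
    lam = math.exp(chi) - 1.0  # Eq. (4) with alpha0 = beta0 = 0
    print(chi, potential(chi, lam, alpha0=0.0), starobinsky_potential(chi))
```

The two printed columns should agree up to floating-point rounding, since substituting \(\lambda =e^{\chi }-1\) into Eq. (3) gives \(\frac{\kappa _{0}}{3}e^{-2\chi }\lambda \cdot \frac{\lambda }{2}\).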

Addendum: To recover the usual notation and the dimensions of the scalar fields and of the potential, we take

$$\begin{aligned} \chi= & {} \sqrt{\frac{2}{3}}\frac{\phi }{M_{Pl}},{ \ }\lambda =\sqrt{2} \frac{\psi }{M_{Pl}},\text { and} \nonumber \\ \tilde{V}\left( \phi ,\psi \right)= & {} \frac{3M_{Pl}^{2}}{2}V\left( \chi ,\lambda \right) . \end{aligned}$$
(5)

This way, we can rewrite the action (2) as

$$\begin{aligned} \bar{S}&=\int d^{4}x\sqrt{-\bar{g}}\left( \frac{M_{Pl}^{2}}{2}\bar{R}-\frac{ 1}{2}\bar{\nabla }_{\rho }\phi \bar{\nabla }^{\rho }\phi \right. \nonumber \\&\quad \left. +\frac{\beta _{0}e^{-\sqrt{\frac{2}{3}}\frac{\phi }{M_{Pl}}}}{2} \bar{\nabla }_{\rho }\psi \bar{\nabla }^{\rho }\psi -\tilde{V}\left( \phi ,\psi \right) \right) . \end{aligned}$$
(6)

Starting from the action (2), we obtain three field equations: one for \(\bar{g}_{\mu \nu }\) and one for each of the scalar fields \(\chi \) and \(\lambda \). Taking the variation with respect to the metric \(\bar{g}_{\mu \nu }\), we find

$$\begin{aligned} \bar{R}_{\mu \nu }-\frac{1}{2}\bar{g}_{\mu \nu }\bar{R}=\frac{1}{M_{Pl}^{2}} \bar{T}_{\mu \nu }^{\left( \text {eff}\right) }, \end{aligned}$$
(7)

where we define an effective energy-momentum tensor as

$$\begin{aligned}{} & {} \frac{1}{M_{Pl}^{2}}\bar{T}_{\mu \nu }^{\left( \text {eff}\right) }=\frac{3}{2 }\left( \bar{\nabla }_{\mu }\chi \bar{\nabla }_{\nu }\chi -\frac{1}{2}\bar{g} _{\mu \nu }\bar{\nabla }^{\rho }\chi \bar{\nabla }_{\rho }\chi \right) + \nonumber \\{} & {} \qquad -\frac{\beta _{0}e^{-\chi }}{2}\left( \bar{\nabla }_{\mu }\lambda \bar{\nabla } _{\nu }\lambda -\frac{1}{2}\bar{g}_{\mu \nu }\bar{\nabla }^{\rho }\lambda \bar{\nabla }_{\rho }\lambda \right) -\frac{3}{2}\bar{g}_{\mu \nu }V\left( \chi ,\lambda \right) . \end{aligned}$$
(8)

The variation with respect to the \(\chi \) and \(\lambda \) fields results in

$$\begin{aligned}{} & {} \bar{\square }\chi -\frac{\beta _{0}}{6}e^{-\chi }\bar{\nabla }_{\rho }\lambda \bar{\nabla }^{\rho }\lambda -V_{\chi }=0, \end{aligned}$$
(9)
$$\begin{aligned}{} & {} \beta _{0}e^{-\chi }\left( \bar{\nabla }^{\rho }\chi \bar{\nabla }_{\rho }\lambda -\bar{\square }\lambda \right) -3V_{\lambda }=0. \end{aligned}$$
(10)

where \(V_{\chi }=\partial _{\chi }V\) and \(V_{\lambda }=\partial _{\lambda }V\) represent derivatives with respect to the fields \(\chi \) and \(\lambda \), respectively.

3 Inflation in Friedmann cosmological background

On large scales (\(\gtrsim 100\) Mpc), we can consider the universe to be homogeneous and isotropic. Furthermore, for a spatially flat universe, the line element that describes the evolution of a comoving frame of reference is given by

$$\begin{aligned} ds^{2}=-dt^{2}+a^{2}\left( t\right) \left( dx^{2}+dy^{2}+dz^{2}\right) , \end{aligned}$$
(11)

where \(a\left( t\right) \) is the scale factor.

Obtaining the field equations in the Friedmann background amounts to writing the field equations (7), (9) and (10) for the metric (11). From the field equation for the metric, we get two independent equations, namely the Friedmann equations

$$\begin{aligned}{} & {} H^{2}=\frac{1}{2}\left( \frac{1}{2}\dot{\chi }^{2}-\frac{\beta _{0}}{6} e^{-\chi }\dot{\lambda }^{2}+V\left( \chi ,\lambda \right) \right) , \end{aligned}$$
(12)
$$\begin{aligned}{} & {} \dot{H}=-\frac{3}{4}\dot{\chi }^{2}+\frac{1}{4}\beta _{0}e^{-\chi }\dot{ \lambda }^{2}, \end{aligned}$$
(13)

where \(H=\dot{a}/a\). In addition to these equations, we also have the equations for the \(\chi \) and \(\lambda \) fields. Since, for a scalar field \( \Phi \),

$$\begin{aligned} \bar{\square }\Phi =\bar{\nabla }_{\sigma }\bar{\nabla }^{\sigma }\Phi =-3H\dot{ \Phi }-\ddot{\Phi }, \end{aligned}$$

for the equation of \(\chi \) given in (9), we have

$$\begin{aligned} \ddot{\chi }+3H\dot{\chi }-\frac{\beta _{0}}{6}e^{-\chi }\dot{\lambda } ^{2}+V_{\chi }=0. \end{aligned}$$
(14)

In turn, for the equation of \(\lambda \) given in (10), we have

$$\begin{aligned} \beta _{0}e^{-\chi }\left[ \ddot{\lambda }-\left( \dot{\chi }-3H\right) \dot{ \lambda }\right] -3V_{\lambda }=0. \end{aligned}$$
(15)

3.1 Phase space

In this section, we will analyze the phase space of the model. Therefore, it becomes convenient to rewrite the field equations in a dimensionless way: we define the dimensionless time derivative

$$\begin{aligned} A_{t}\equiv \frac{1}{\sqrt{\kappa _{0}}}\dot{A}, \end{aligned}$$

the dimensionless Hubble parameter h

$$\begin{aligned} h\equiv \frac{1}{\sqrt{\kappa _{0}}}H, \end{aligned}$$

and the dimensionless potential \(\bar{V}\) as

$$\begin{aligned} \bar{V}\left( \chi ,\lambda \right) =\frac{1}{\kappa _{0}}V\left( \chi ,\lambda \right) . \end{aligned}$$

With that, it is possible to rewrite the equations of cosmological dynamics (12), (13), (14) and (15) as follows:

$$\begin{aligned}{} & {} h^{2}=\frac{1}{2}\left( \frac{1}{2}\chi _{t}^{2}-\frac{\beta _{0}}{6} e^{-\chi }\lambda _{t}^{2}+\bar{V}\left( \chi ,\lambda \right) \right) , \end{aligned}$$
(16)
$$\begin{aligned}{} & {} h_{t}=-\frac{3}{4}\chi _{t}^{2}+\frac{1}{4}\beta _{0}e^{-\chi }\lambda _{t}^{2}, \end{aligned}$$
(17)

and

$$\begin{aligned}{} & {} \chi _{tt}+3h\chi _{t}-\frac{\beta _{0}}{6}e^{-\chi }\lambda _{t}{}^{2}+\bar{ V}_{\chi }=0, \end{aligned}$$
(18)
$$\begin{aligned}{} & {} \beta _{0}e^{-\chi }\left[ \lambda _{tt}-\left( \chi _{t}-3h\right) \lambda _{t}\right] -3\bar{V}_{\lambda }=0. \end{aligned}$$
(19)

We already know the inflationary dynamics of the Starobinsky\(+R^{3}\) model, which in the scalar-tensor approach in the Einstein frame is characterized by its specific potential \(V\left( \chi \right) \) [44], as well as the inflationary dynamics of the Starobinsky\(+R\square R\) model, explored in Ref. [58] via a scalar-vector approach. A first step toward understanding the dynamics of our current case is the study of its phase space, taking as reference the known particular cases mentioned above. In this first part, we will investigate the existence of an attracting inflationary regime in some region of the phase space.

Since the dimensionless equations governing the dynamics of the \(\chi \) and \( \lambda \) fields are written as in (18) and (19), that is, two autonomous second-order differential equations concerning time, we can rewrite them as a system of four first-order differential equations. Taking \(\chi _{t}=\psi \) and \(\lambda _{t}=\phi \), we have

$$\begin{aligned} \chi _{t}&=\psi , \end{aligned}$$
(20)
$$\begin{aligned} \psi _{t}&=-3h\psi +\frac{\beta _{0}}{6}e^{-\chi }\phi ^{2}-\bar{V}_{\chi }, \end{aligned}$$
(21)
$$\begin{aligned} \lambda _{t}&=\phi , \end{aligned}$$
(22)
$$\begin{aligned} \beta _{0}\phi _{t}&=\beta _{0}\left( \psi -3h\right) \phi +3e^{\chi }\bar{V }_{\lambda }, \end{aligned}$$
(23)

where

$$\begin{aligned} h=\sqrt{\frac{1}{2}\left( \frac{1}{2}\psi ^{2}-\frac{\beta _{0}}{6}e^{-\chi }\phi ^{2}+\bar{V}\right) }, \end{aligned}$$

which is associated with a physically consistent system when the argument of the square root is positive (footnote 4).

From this point on, we study the approximate behavior of the solutions of the system near its critical points. Critical points are equilibrium points of the system, and we are interested in their stability, which is directly related to the necessary conditions for the occurrence of a physical inflationary regime (footnote 5). The analysis of the previous system allows us to conclude that there are two critical points:

$$\begin{aligned} P_{0}&=\left( \chi _{0},\lambda _{0},\psi _{0},\phi _{0}\right) =\left( 0,0,0,0\right) \end{aligned}$$
(24)
$$\begin{aligned} P_{c}&=\left( \chi _{c},\lambda _{c},\psi _{c},\phi _{c}\right) =\left( \ln \left( 4+\sqrt{\frac{3}{\alpha _{0}}}\right) ,\sqrt{\frac{3}{\alpha _{0}}} ,0,0\right) . \end{aligned}$$
(25)
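The critical points (24) and (25) can be checked numerically. The sketch below is a minimal, illustrative implementation of the right-hand side of Eqs. (20)–(23), with the derivatives of \(\bar{V}\) computed by hand from Eq. (3); it evaluates the flow at \(P_{0}\) and \(P_{c}\), where all four components should vanish up to rounding. The function names and the sample value \(\beta _{0}=10^{-3}\) are assumptions made for illustration, and the code requires \(\beta _{0}\ne 0\).

```python
import math


def v_bar(chi, lam, alpha0):
    """Dimensionless potential V/kappa0 from Eq. (3)."""
    return math.exp(-2 * chi) * lam * (
        math.exp(chi) - 1 - lam / 2 - alpha0 * lam**2 / 3) / 3


def v_bar_chi(chi, lam, alpha0):
    """Partial derivative of v_bar with respect to chi."""
    return math.exp(-2 * chi) * lam * (
        2 + lam + 2 * alpha0 * lam**2 / 3 - math.exp(chi)) / 3


def v_bar_lam(chi, lam, alpha0):
    """Partial derivative of v_bar with respect to lambda."""
    return math.exp(-2 * chi) * (
        math.exp(chi) - 1 - lam - alpha0 * lam**2) / 3


def rhs(state, alpha0, beta0):
    """Right-hand side of Eqs. (20)-(23); psi = chi_t and phi = lambda_t."""
    chi, lam, psi, phi = state
    h2 = 0.5 * (0.5 * psi**2 - beta0 * math.exp(-chi) * phi**2 / 6
                + v_bar(chi, lam, alpha0))
    if h2 < 0:
        raise ValueError("negative h^2: state is outside the physical region")
    h = math.sqrt(h2)
    psi_t = (-3 * h * psi + beta0 * math.exp(-chi) * phi**2 / 6
             - v_bar_chi(chi, lam, alpha0))
    phi_t = (psi - 3 * h) * phi + 3 * math.exp(chi) * v_bar_lam(chi, lam, alpha0) / beta0
    return (psi, psi_t, phi, phi_t)


def critical_point(alpha0):
    """Critical point P_c of Eq. (25)."""
    lam_c = math.sqrt(3 / alpha0)
    return (math.log(4 + lam_c), lam_c, 0.0, 0.0)


print(rhs((0.0, 0.0, 0.0, 0.0), alpha0=1e-4, beta0=1e-3))  # flow at P_0
for alpha0 in (1e-4, 3.4e-4):
    p_c = critical_point(alpha0)
    print(alpha0, p_c, rhs(p_c, alpha0, beta0=1e-3))
```

The two sample values of \(\alpha _{0}\) are those used in the Fig. 1 caption, so the printed \(\left( \chi _{c},\lambda _{c}\right) \) can be compared with the coordinates of \(P_{c}\) quoted there.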

The stability of these critical points is studied through the linearization of the 4-dimensional autonomous system \(\left( \chi ,\lambda ,\psi ,\phi \right) \). Linearizing the system given by Eqs. (20), (21), (22) and (23) around \(P_{0}\), we verify that the Lyapunov exponents \(r_{0}\), associated with the stability of the critical point, satisfy the fourth-order characteristic equation

$$\begin{aligned} \beta _{0}r_{0}^{4}+r_{0}^{2}+\frac{1}{3}=0, \end{aligned}$$
(26)

whose solution is

$$\begin{aligned} r_{0}=\pm \sqrt{\frac{-1\pm \sqrt{1-\frac{4\beta _{0}}{3}}}{2\beta _{0}}}. \end{aligned}$$

A center or spiral point occurs when the roots are purely imaginary. Looking at the previous expression, we see that this occurs whenever the condition

$$\begin{aligned} 0\le \beta _{0}\le \frac{3}{4}, \end{aligned}$$
(27)

is satisfied. For any value of \(\beta _{0}\) outside this range, at least one Lyapunov exponent has a positive real part. That is, outside the range (27) the point \(P_{0}\) is unstable (footnote 6). A numerical analysis of the system (20)–(23) shows that within the interval (27) the point \(P_{0}\) is an attracting spiral point and therefore stable (see Fig. 1). This behavior is essential for the existence of a graceful exit. In fact, the spiral dynamics around \(P_{0}\) constitute the period of coherent oscillations consistent with the initial phases of reheating. It is also worth noting that Eq. (26) is independent of \(\alpha _{0}\), and therefore the term \(R^{3}\) plays no role at the end of the inflationary period.

Fig. 1 The \(\chi _{t}\times \chi \) graphs considering phase space cuts \( \left( \chi ,\lambda ,\chi _{t},\lambda _{t}\right) \) fixing \(\lambda _{t}=\lambda _{tt}=0\) and \(\beta _{0}=0.001\) with \(\left( \lambda ,\alpha _{0}\right) =\left( 173,0.0001\right) \) (top graph) and \(\left( \lambda ,\alpha _{0}\right) =\left( 94,0.00034\right) \) (bottom graph). The red and black points correspond to the critical points \(P_{0}\) and \(P_{c}\), respectively. For \(\alpha _{0}=0.0001\), we have \(P_{c}=\left( 5.18,173,0,0\right) \) and for \(\alpha _{0}=0.00034\), we have \(P_{c}=\left( 4.58,94,0,0\right) \). The red (cyan) trajectories represent trajectories that, when reaching the attractor line close to \(\dot{\chi }=0\), approach (depart from) the origin. Details on the interpretation of the graphics are presented in the body of the text.
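The statement about the range (27) can be illustrated with a short Python sketch that evaluates the closed-form roots of Eq. (26) and tests whether they are all purely imaginary. This is only a numerical illustration for \(\beta _{0}\ne 0\); the function names and the sample values of \(\beta _{0}\) are assumptions, not part of the original analysis.

```python
import cmath


def lyapunov_exponents_p0(beta0):
    """Four roots of beta0*r**4 + r**2 + 1/3 = 0 (Eq. (26)); needs beta0 != 0."""
    disc = cmath.sqrt(1.0 - 4.0 * beta0 / 3.0)
    roots = []
    for sign in (+1, -1):
        r_squared = (-1.0 + sign * disc) / (2.0 * beta0)
        r = cmath.sqrt(r_squared)
        roots.extend([r, -r])
    return roots


def all_purely_imaginary(beta0, tol=1e-12):
    """True when every exponent has a vanishing real part (center or spiral)."""
    return all(abs(r.real) < tol for r in lyapunov_exponents_p0(beta0))


for beta0 in (-0.1, 0.001, 0.75, 1.0):
    print(beta0, all_purely_imaginary(beta0), lyapunov_exponents_p0(beta0))
```

Sample values inside the interval (27) give only imaginary roots, while the values chosen outside it produce at least one root with a positive real part. The linear analysis alone cannot distinguish a center from an attracting spiral, which is why the text relies on the numerical study of the full system.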

In turn, linearizing the system (20)–(23) around \(P_{c}\), we verify that the Lyapunov exponents \(r_{c}\) satisfy the fourth-order characteristic equation

$$\begin{aligned}{} & {} \beta _{0}\left[ r_{c}\left( r_{c}-G\right) -\frac{4}{9}G^{2}\right] r_{c}\left( r_{c}-G\right) +\frac{4}{9}G^{2}\left[ \left( \sqrt{3\alpha _{0}}+6\alpha _{0}\right) \right. \nonumber \\{} & {} \quad \left. \times r_{c}\left( r_{c}{-}G\right) {+}\frac{1}{3}\sqrt{3\alpha _{0}}{-}\frac{4}{9}\left( \sqrt{3\alpha _{0}}{+}6\alpha _{0}\right) G^{2}\right] =0, \end{aligned}$$
(28)

where

$$\begin{aligned} G=\frac{-3}{2\sqrt{4\sqrt{3\alpha _{0}}+3}}. \end{aligned}$$

A numerical study of this characteristic equation, considering \(\alpha _{0}>0 \) and \(\beta _{0}>0\), shows that at least two of the four roots of Eq. (28) are real and have opposite signs. This shows that \(P_{c}\) is a saddle point and therefore unstable. This conclusion also remains valid for \(\beta _{0}=0\) and \(\alpha _{0}>0\), in which case we have only two roots (footnote 7). See Ref. [44] for details.
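One way to reproduce this kind of numerical study is to note that Eq. (28) is a quadratic in \(u=r_{c}\left( r_{c}-G\right) \), so the four exponents follow from two nested quadratics. The Python sketch below does this for \(\beta _{0}>0\) and checks for a pair of real roots with opposite signs. The helper names and sample parameter values are illustrative assumptions, and this is not necessarily the procedure used by the authors.

```python
import cmath
import math


def _constants(alpha0):
    s = math.sqrt(3 * alpha0)
    a = s + 6 * alpha0
    G = -3 / (2 * math.sqrt(4 * s + 3))
    k = 4 * G**2 / 9
    return s, a, G, k


def eq28_lhs(r, alpha0, beta0):
    """Left-hand side of Eq. (28), written in terms of u = r*(r - G)."""
    s, a, G, k = _constants(alpha0)
    u = r * (r - G)
    return beta0 * (u - k) * u + k * (a * u + s / 3 - k * a)


def exponents_pc(alpha0, beta0):
    """Four roots r_c of Eq. (28); requires beta0 != 0."""
    s, a, G, k = _constants(alpha0)
    # Quadratic in u: beta0*u^2 + k*(a - beta0)*u + k*(s/3 - k*a) = 0
    b_lin = k * (a - beta0)
    c0 = k * (s / 3 - k * a)
    disc = cmath.sqrt(b_lin**2 - 4 * beta0 * c0)
    roots = []
    for sign in (+1, -1):
        u = (-b_lin + sign * disc) / (2 * beta0)
        d = cmath.sqrt(G**2 + 4 * u)  # then solve r^2 - G*r - u = 0
        roots += [(G + d) / 2, (G - d) / 2]
    return roots


def has_saddle_pair(alpha0, beta0, tol=1e-9):
    """True if two real roots of opposite sign exist."""
    real = [r.real for r in exponents_pc(alpha0, beta0) if abs(r.imag) < tol]
    return any(x > tol for x in real) and any(x < -tol for x in real)


for alpha0, beta0 in ((1e-4, 1e-3), (3.4e-4, 1e-3), (1e-4, 2e-2)):
    print(alpha0, beta0, has_saddle_pair(alpha0, beta0), exponents_pc(alpha0, beta0))
```

Using \(\frac{4}{9}G^{2}=\left( 4\sqrt{3\alpha _{0}}+3\right) ^{-1}\), the constant term of the quadratic in \(u\) is proportional to \(-\alpha _{0}\), so for \(\alpha _{0},\beta _{0}>0\) one value of \(u\) is positive and the corresponding pair of \(r_{c}\) is real with opposite signs, in line with the saddle-point conclusion.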

To better understand the dynamics of the \(\chi \) and \(\lambda \) fields, we will numerically study the 4-dimensional phase space. In this study, we will analyze two 2-dimensional slices of this space given by \(\chi _{t}\times \chi \) and \(\lambda _{t}\times \lambda \). For that, we manipulate Eqs. (18) and (19) writing them as

$$\begin{aligned} \frac{d\chi _{t}}{d\chi }= & {} \frac{-3h\chi _{t}+\frac{\beta _{0}}{6}e^{-\chi }\lambda _{t}^{2}{}-\bar{V}_{\chi }}{\chi _{t}}, \\ \frac{d\lambda _{t}}{d\lambda }= & {} \left( \chi _{t}-3h\right) +\frac{3e^{\chi } }{\beta _{0}\lambda _{t}}\bar{V}_{\lambda }, \end{aligned}$$

where h is given by (16).

Numerical analysis of the equation for \(d\chi _{t}/d\chi \) is more easily performed if we write \(\lambda =\lambda \left( \chi ,\chi _{t},\lambda _{t},\lambda _{tt},\alpha _{0},\beta _{0}\right) \). For that, it is necessary to work with Eqs. (16) and (19). Solving the quadratic equation for \(\lambda \) in Eq. (19), we get

$$\begin{aligned} \lambda =\frac{-1+\sqrt{1-4\alpha _{0}\left\{ 1-e^{\chi }+\beta _{0}e^{\chi } \left[ \lambda _{tt}-\left( \chi _{t}-3h\right) \lambda _{t}\right] \right\} }}{2\alpha _{0}}, \end{aligned}$$

where we choose the positive sign to guarantee the Starobinsky limit. In principle, we can substitute (16) in the previous expression, obtain a third-degree algebraic equation for \(\lambda \) and solve it to obtain \(\lambda =\lambda \left( \chi ,\chi _{t},\lambda _{t},\lambda _{tt},\alpha _{0},\beta _{0}\right) \). However, we will see in Sect. 5 that the values of interest for \(\alpha _{0}\) and \(\beta _{0}\) are such that \(\alpha _{0}<10^{-3}\) and \(\beta _{0}<3\times 10^{-2}\) (footnote 8). In this case, it is legitimate to keep only corrections linear in \(\alpha _{0}\) and to disregard terms of the type \(\alpha _{0}\beta _{0}\). Performing these approximations, we obtain the functional forms

$$\begin{aligned}{} & {} F_{\chi }\equiv \frac{d\chi _{t}}{d\chi }\simeq \frac{-3\bar{h}\chi _{t}+ \frac{\beta _{0}}{6}e^{-\chi }\lambda _{t}^{2}{}-\frac{1}{9}e^{-2\chi }\bar{ \lambda }\left\{ 4-e^{\chi }+\bar{\lambda }-2\beta _{0}e^{\chi }\left[ \lambda _{tt}-\left( \chi _{t}-3\bar{h}\right) \lambda _{t}\right] \right\} }{\chi _{t}}, \end{aligned}$$
(29)
$$\begin{aligned}{} & {} F_{\lambda }\equiv \frac{d\lambda _{t}}{d\lambda }=\left( \chi _{t}-3h\right) +\frac{1}{\beta _{0}\lambda _{t}}\left[ 1-e^{-\chi }\left( 1+\lambda +\alpha _{0}\lambda ^{2}\right) \right] , \end{aligned}$$
(30)

where

$$\begin{aligned}{} & {} h\simeq \bar{h}\equiv \frac{-3\beta _{0}^{2}\lambda _{t}+\sqrt{\left( 3\beta _{0}^{2}\lambda _{t}\right) ^{2}+\left( 12+9\beta _{0}^{2}\lambda _{t}^{2}\right) \left[ 3\chi _{t}^{2}-\beta _{0}e^{-\chi }\lambda _{t}^{2}-\left( \lambda _{tt}-\chi _{t}\lambda _{t}\right) ^{2}\beta _{0}^{2}+A\right] }}{12+9\beta _{0}^{2}\lambda _{t}^{2}}, \end{aligned}$$
(31)
$$\begin{aligned}{} & {} \lambda \simeq \bar{\lambda }\equiv \left( e^{\chi }-1\right) \left[ 1-\alpha _{0}\left( e^{\chi }-1\right) \right] -\beta _{0}e^{\chi }\left[ \lambda _{tt}-\left( \chi _{t}-3h\right) \lambda _{t}\right] , \end{aligned}$$
(32)

with

$$\begin{aligned} A=\left( 1-e^{-\chi }\right) ^{2}\left[ 1-\frac{2}{3}\alpha _{0}\left( e^{\chi }-1\right) \right] . \end{aligned}$$
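The quality of the linearized expression (32) can be gauged by comparing it with the positive-sign root of the quadratic obtained from Eq. (19). In the illustrative Python sketch below, the combination \(\lambda _{tt}-\left( \chi _{t}-3h\right) \lambda _{t}\) is treated as a given number called `source`; this name, the function names and the sample values are all assumptions introduced only for the comparison.

```python
import math


def lam_exact(chi, source, alpha0, beta0):
    """Positive-sign root of the quadratic in lambda from Eq. (19).

    `source` stands for lambda_tt - (chi_t - 3h)*lambda_t, taken as an input.
    """
    c = 1 - math.exp(chi) + beta0 * math.exp(chi) * source
    return (-1 + math.sqrt(1 - 4 * alpha0 * c)) / (2 * alpha0)


def lam_linearized(chi, source, alpha0, beta0):
    """Eq. (32): linear in alpha0, with alpha0*beta0 terms dropped."""
    x = math.exp(chi) - 1
    return x * (1 - alpha0 * x) - beta0 * math.exp(chi) * source


alpha0, beta0, source = 1e-4, 1e-3, 0.1  # sample values, chosen for illustration
for chi in (1.0, 3.0, 5.0):
    exact = lam_exact(chi, source, alpha0, beta0)
    approx = lam_linearized(chi, source, alpha0, beta0)
    print(chi, exact, approx, abs(exact - approx) / abs(exact))
```

The relative difference grows with \(\alpha _{0}e^{\chi }\), as expected from the neglected terms of order \(\alpha _{0}^{2}\), so the comparison gives a rough indication of where the linearization is adequate for a given \(\alpha _{0}\).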

In Figs. 1 and 2, we show direction fields associated with equations (29) and (30).

Fig. 2 The \(\lambda _{t}\times \lambda \) graphs considering phase space cuts \( \left( \chi ,\chi _{t},\lambda {,}\lambda _{t}\right) \) setting \(\chi _{t}=0\) and \(\beta _{0}=0.001\) with \(\left( \chi ,\alpha _{0}\right) =\left( 5.18,0.0001\right) \) (top graph) and \(\left( \chi ,\alpha _{0}\right) =\left( 4.58,0.00034\right) \) (bottom graph). The black points correspond to the critical points \(P_{c}=\left( 5.18,173,0,0\right) \) (top graph) and \( P_{c}=\left( 4.58,94,0,0\right) \) (bottom graph). Details on the interpretation of the graphics are presented in the body of the text.

The first (and most relevant) point that can be seen in Fig. 1 is that there is an attractor line close to \(\chi _{t}\simeq 0\). The existence of this region is consistent with any value of \(\alpha _{0}<10^{-3}\) and \(\beta _{0}<3\times 10^{-2}\) and for any interval of \(\lambda _{t}\) and \(\lambda _{tt}\) that yields real results in the region of interest \(\chi \in [0,8]\) (footnote 9). As the \( \chi \) field tends to the attractor line (\(\chi _{t}\simeq 0\)), Fig. 2 indicates that \(\lambda \) tends to a finite value and \(\lambda _{t}\rightarrow 0\). This finite value of \(\lambda \) essentially depends on the value of \(\chi \), with variations on a smaller scale due to changes in the parameter \(\alpha _{0}\). The other fixed parameters \(\chi _{t}\) and \( \beta _{0}\) in Fig. 2 change how \(\lambda \) approaches the accumulation point but do not change its value. We will see in Sect. 3.2 that this attractor region in the 4-dimensional phase space, where \(\left( \chi ,\lambda ,\chi _{t}{,}\lambda _{t}\right) \simeq \left( \chi ,\lambda \left( \chi \right) ,0,0\right) \), corresponds to a slow-roll inflationary regime.

Once the attractor region is reached, we must ask whether inflation lasts long enough, i.e., whether it generates a sufficient number of e-folds, and whether it ends in a reheating phase. The answer essentially depends on the position where the \(\chi \) field hits the attractor line in Fig. 1. If the \(\chi \) field is to the left of the critical point \(P_{c}\) (black dots in the graphs of Fig. 1), inflation proceeds normally and ends in a phase of coherent oscillations associated with the beginning of reheating. On the other hand, if \(\chi \) is to the right of \(P_{c}\), the value of \(\chi \) increases indefinitely, and inflation never ends (see Ref. [44] for details). Thus, a physical inflationary regime, i.e., one consistent with a graceful exit, only occurs if \(\chi<\chi _{c}\Rightarrow \alpha _{0}<3\left( e^{\chi }-4\right) ^{-2}\), which for sufficiently large \(\chi \) corresponds to \(\alpha _{0}<3e^{-2\chi }\).
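The graceful-exit condition can be turned into a small numerical helper. The following Python sketch, with illustrative function names that are not part of the original work, evaluates \(\chi _{c}\) from Eq. (25), the exact upper bound on \(\alpha _{0}\) for a given \(\chi \), and its large-\(\chi \) approximation.

```python
import math


def chi_c(alpha0):
    """chi coordinate of the critical point P_c, Eq. (25)."""
    return math.log(4 + math.sqrt(3 / alpha0))


def alpha0_max(chi):
    """Upper bound alpha0 < 3*(e^chi - 4)^(-2) implied by chi < chi_c.

    For e^chi <= 4 the condition chi < chi_c holds for every alpha0 > 0,
    so no finite bound exists and infinity is returned.
    """
    x = math.exp(chi) - 4
    return 3 / x**2 if x > 0 else math.inf


def alpha0_max_large_chi(chi):
    """Approximate bound 3*e^(-2*chi), valid for sufficiently large chi."""
    return 3 * math.exp(-2 * chi)


def has_graceful_exit(chi, alpha0):
    """True when chi lies to the left of P_c on the attractor line."""
    return chi < chi_c(alpha0)


for chi in (4.0, 5.0, 6.0):
    print(chi, alpha0_max(chi), alpha0_max_large_chi(chi))
print(has_graceful_exit(5.0, 1e-4), has_graceful_exit(5.5, 1e-4))
```

Comparing the two bounds for a few values of \(\chi \) shows how quickly the approximation \(3e^{-2\chi }\) approaches the exact expression as \(\chi \) grows.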

In the next section, we will see how to describe the dynamics of the \(\chi \) and \(\lambda \) fields during the slow-roll inflationary phase.

3.2 Inflation in the slow-roll leading order regime

This section aims to describe the dynamics of \(\chi \), \(\lambda \), and their derivatives during the inflationary regime in the slow-roll approximation. In the region associated with physical inflation, the field \(\chi \) is a monotonically decreasing function of time, so we can parameterize the various quantities in terms of \(\chi \). For the Starobinsky model, we know that in the slow-roll leading order regime \(\chi _{t}\sim \delta \) and \(\chi _{tt}\sim \delta ^{2}\), where \(\delta \) is the slow-roll factor defined as \(\delta \equiv e^{-\chi }\). For the Starobinsky plus \(R^{3}\) model (i.e., \(\beta _{0}=0\) and \(\alpha _{0}\ne 0\)), we have [44]

$$\begin{aligned} \chi _{t}\sim \left( \delta -\frac{\alpha _{0}}{3}\delta ^{-1}\right) . \end{aligned}$$
(33)

Note that since \(\alpha _{0}<3\delta ^{2}\), the second term of the previous expression is of the same order as \(\delta \) or smaller.

The previous discussion allows us to treat the factor \(\delta \) as the parameter that controls the order of the slow-roll approximation, i.e., a quantity \( f\sim \delta ^{n}\) will be an nth-order slow-roll quantity. In this case, \( \chi _{t}\) present in (33) is first-order in slow-roll, since both \(\delta \) and \(\alpha _{0}\delta ^{-1}\) are first-order. To apply this reasoning to our model, it is also necessary to establish the maximum slow-roll order of the parameter \(\beta _{0}\). A more detailed analysis of the field equations in the attractor region shows that, for slow-roll inflation, \(\beta _{0}\lesssim \delta \), i.e., \(\beta _{0}\) is at most a first-order slow-roll parameter (for details see Ref. [58]).

Once the slow-roll (maximum) orders of the parameters \(\alpha _{0}\) and \( \beta _{0}\) are known, we can propose the following ansatz for \(\chi _{t}\):

$$\begin{aligned} \chi _{t}\simeq c_{1}\delta +\beta _{0}\sum \limits _{n=0}^{\infty }b_{n}\left( \beta _{0}\delta ^{-1}\right) ^{n}+\alpha _{0}\delta ^{-1}\sum \limits _{n=0}^{\infty }d_{n}\left( \beta _{0}\delta ^{-1}\right) ^{n}.\nonumber \\ \end{aligned}$$
(34)

This ansatz has the following properties:

  • All terms are first-order in slow-roll, this being the leading order of \(\chi _{t}\);

  • In the limit of \(\beta _{0}\rightarrow 0\), we recover the result (33);

  • Differentiating Eq. (34) with respect to t increases the slow-roll order, i.e., \(\chi _{tt}\) is second order, \(\chi _{ttt}\) is third order, etc.

By following similar reasoning, we propose the following ansatz for \( \lambda \):

$$\begin{aligned} \lambda \simeq \delta ^{-1}+\sum \limits _{n=0}^{\infty }g_{n}\left( \beta _{0}\delta ^{-1}\right) ^{n}+\alpha _{0}\delta ^{-2}\sum _{n=0}^{\infty }j_{n}\left( \beta _{0}\delta ^{-1}\right) ^{n}. \nonumber \\ \end{aligned}$$
(35)

In this case, the first term is of order \(\mathcal {O}\left( -1\right) \) in slow-roll, and the others are zero-order terms. The \(\mathcal {O}\left( -1\right) \) term is necessary since we know that in the Starobinsky case \(\lambda =\delta ^{-1}-1 \) (see Eq. (19) with \(\alpha _{0}=\beta _{0}=0\)). Analogously to \(\chi _{t}\), each derivative of \(\lambda \) with respect to t increases the slow-roll order by one.

The next step is substituting these two ansatzes and their derivatives into Eqs. (16), (18), and (19) taking into account only the slow-roll leading order. In this situation, we get

$$\begin{aligned}&3h\chi _{t}-\frac{1}{3}\delta \lambda \left( 1-\delta \lambda -2\delta - \frac{2}{3}\alpha _{0}\delta \lambda ^{2}\right) \simeq 0, \end{aligned}$$
(36)
$$\begin{aligned}&3\beta _{0}h\lambda _{t}-\left[ 1-\delta \left( 1+\lambda +\alpha _{0}\lambda ^{2}\right) \right] \simeq 0, \end{aligned}$$
(37)

where

$$\begin{aligned} h^{2}\simeq \frac{1}{6}\delta \lambda \left( 1-\frac{1}{2}\delta \lambda \right) . \end{aligned}$$
(38)

By explicitly substituting Eqs. (34) and (35) in these last three expressions, we get, after a long calculation,

$$\begin{aligned} \chi _{t}\simeq & {} -\frac{2\sqrt{3}}{3\left( 3-\beta _{0}\delta ^{-1}\right) } \delta \left( 1-\frac{\alpha _{0}}{3}\delta ^{-2}\right) , \end{aligned}$$
(39)
$$\begin{aligned} \lambda\simeq & {} \delta ^{-1}-\frac{3-2\beta _{0}\delta ^{-1}}{3-\beta _{0}\delta ^{-1}}-\alpha _{0}\delta ^{-2}\left( 1+\frac{\frac{1}{3}\beta _{0}\delta ^{-1}}{3-\beta _{0}\delta ^{-1}}\right) , \end{aligned}$$
(40)

with

$$\begin{aligned} h^{2}\simeq \frac{1}{12}\left( 1-2\delta -\frac{2}{3}\alpha _{0}\delta ^{-1}\right) . \end{aligned}$$
(41)

For details see Appendix A. It is worth noting that the previous expressions are well defined only for \(\beta _{0}\delta ^{-1}<3\). Note in Eq. (39) the existence of two terms that, in the slow-roll leading order, are first-order terms. In Eq. (40), we have the \( \mathcal {O}\left( -1\right) \) order term in addition to the zero-order corrections. Finally, \(h^{2}\), related to the Hubble parameter, is given by the zero-order slow-roll leading term plus first-order corrections (independent of \(\beta _{0}\)).
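
The consistency of Eqs. (39)–(41) with Eq. (36) can be illustrated numerically. The following is a minimal sketch, not the calculation of Appendix A: the function names and the sample values (\(\beta _{0}=\delta \), \(\alpha _{0}=\delta ^{2}\)) are assumptions chosen only to respect the slow-roll orders \(\beta _{0}\sim \delta \) and \(\alpha _{0}\sim \delta ^{2}\). If the solutions hold at leading order, the residual of Eq. (36) should be of order \(\delta ^{2}\), one order beyond the individual terms.

```python
import math


def background_leading_order(delta, alpha0, beta0):
    """Evaluate chi_t, lambda and h from Eqs. (39)-(41) at delta = exp(-chi)."""
    b = beta0 / delta                # beta_0 * delta^-1 (must stay below 3)
    u = alpha0 / (3.0 * delta**2)    # (alpha_0 / 3) * delta^-2
    chi_t = -2.0 * math.sqrt(3.0) / (3.0 * (3.0 - b)) * delta * (1.0 - u)
    lam = (1.0 / delta
           - (3.0 - 2.0 * b) / (3.0 - b)
           - alpha0 / delta**2 * (1.0 + (b / 3.0) / (3.0 - b)))
    h = math.sqrt((1.0 - 2.0 * delta - 2.0 * alpha0 / (3.0 * delta)) / 12.0)
    return chi_t, lam, h


def residual_eq36(delta, alpha0, beta0):
    """Left-hand side of Eq. (36) evaluated on the solutions (39)-(41)."""
    chi_t, lam, h = background_leading_order(delta, alpha0, beta0)
    y = delta * lam
    bracket = 1.0 - y - 2.0 * delta - 2.0 * alpha0 * delta * lam**2 / 3.0
    return 3.0 * h * chi_t - y * bracket / 3.0


# Illustrative values only (assumed): beta_0 = delta and alpha_0 = delta^2.
for delta in (1e-2, 1e-3, 1e-4):
    print(delta, residual_eq36(delta, alpha0=delta**2, beta0=delta))
```

Each term of Eq. (36) is of order \(\delta \), so a residual that shrinks by roughly a factor of 100 when \(\delta \) is reduced by 10 is what a leading-order solution should produce.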

3.2.1 Calculation of slow-roll parameters and number of e-folds

The characterization of the inflationary regime is done through the slow-roll parameters

$$\begin{aligned} \epsilon\equiv & {} -\frac{\dot{H}}{H^{2}}=-\frac{h_{t}}{h^{2}}, \end{aligned}$$
(42)
$$\begin{aligned} \eta\equiv & {} -\frac{1}{H}\frac{\dot{\epsilon }}{\epsilon }=-\frac{1}{h}\frac{ \epsilon _{t}}{\epsilon }. \end{aligned}$$
(43)

By substituting (39) and (40) in (17), we get

$$\begin{aligned} h_{t}\simeq -\frac{\delta ^{2}}{3\left( 3-\beta _{0}\delta ^{-1}\right) } \left( 1-\frac{\alpha _{0}}{3}\delta ^{-2}\right) ^{2}. \end{aligned}$$

Thus, in the slow-roll leading order, we have

$$\begin{aligned} \epsilon \simeq \frac{4\delta ^{2}}{\left( 3-\beta _{0}\delta ^{-1}\right) } \left( 1-\frac{\alpha _{0}}{3}\delta ^{-2}\right) ^{2}. \end{aligned}$$
(44)

The next step is calculating \(\eta \). Differentiating \(\epsilon \) and using this result together with \(\epsilon \) itself in Eq. (43), we get

$$\begin{aligned} \eta \simeq -\frac{4\delta }{\left( 3-\beta _{0}\delta ^{-1}\right) ^{2}} \left[ 3\left( 2-\beta _{0}\delta ^{-1}\right) +\alpha _{0}\delta ^{-2}\left( 2-\frac{1}{3}\beta _{0}\delta ^{-1}\right) \right] . \nonumber \\ \end{aligned}$$
(45)

Note that by construction \(\alpha _{0}\delta ^{-2}<3\) and \(\beta _{0}\delta ^{-1}<3\).
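
Equation (45) can be cross-checked against its definition (43). The sketch below is only an illustration under stated assumptions: it takes \(h\) at zero order (\(h^{2}\simeq 1/12\)), uses \(\delta =e^{-\chi }\) so that \(\delta _{t}=-\delta \chi _{t}\) with \(\chi _{t}\) from Eq. (39), and differentiates \(\ln \epsilon \) from Eq. (44) by central finite differences. The function names and sample parameter values are hypothetical.

```python
import math

H0 = 1.0 / math.sqrt(12.0)  # zero-order value of h, from Eq. (41)


def chi_t(delta, alpha0, beta0):
    """Eq. (39)."""
    b = beta0 / delta
    u = alpha0 / (3.0 * delta**2)
    return -2.0 * math.sqrt(3.0) / (3.0 * (3.0 - b)) * delta * (1.0 - u)


def epsilon(delta, alpha0, beta0):
    """Eq. (44)."""
    b = beta0 / delta
    u = alpha0 / (3.0 * delta**2)
    return 4.0 * delta**2 / (3.0 - b) * (1.0 - u) ** 2


def eta_closed_form(delta, alpha0, beta0):
    """Eq. (45)."""
    b = beta0 / delta
    a = alpha0 / delta**2
    return -4.0 * delta / (3.0 - b) ** 2 * (3.0 * (2.0 - b) + a * (2.0 - b / 3.0))


def eta_numeric(delta, alpha0, beta0, rel_step=1e-6):
    """Eq. (43) with d(ln epsilon)/dt = d(ln epsilon)/d(delta) * delta_t."""
    step = rel_step * delta
    dlog = (math.log(epsilon(delta + step, alpha0, beta0))
            - math.log(epsilon(delta - step, alpha0, beta0))) / (2.0 * step)
    delta_t = -delta * chi_t(delta, alpha0, beta0)
    return -dlog * delta_t / H0


print(eta_closed_form(1e-2, 1e-5, 1e-2), eta_numeric(1e-2, 1e-5, 1e-2))
```

In the limit \(\alpha _{0}=\beta _{0}=0\), both routes reduce to \(\eta \simeq -8\delta /3\).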

In order to have robust inflation, i.e., with a sufficient number of e-folds, we must have \(\epsilon \ll 1\) and \(\eta \ll 1\). Thus, from Eqs. (44) and (45), we see that this occurs for \(\delta \ll 1\) (typically \(\chi \gtrsim 4\)). However, unlike the Starobinsky case, we also have lower bounds on \(\delta \). In fact, the slow-roll inflationary regime only occurs if

$$\begin{aligned} \delta>\frac{\beta _{0}}{3}\quad \text { and }\quad \delta >\sqrt{\frac{\alpha _{0}}{ 3}}. \end{aligned}$$
(46)

The first condition does not represent a real difficulty for the existence of slow-roll inflation, because even if at some point we have \(\delta <\beta _{0}\),Footnote 10 the dynamics of the phase space guarantees that \(\chi \) decreases monotonically so that at some point \(\delta \) becomes greater than \(\beta _{0}\) (see Fig. 1). The second condition represents a real constraint for carrying out a physical inflation (see discussion at the end of Sect. 3.1). For a discussion of the implications of this second constraint and the initial conditions of inflation see Ref. [44].

Next we will calculate the number of e-folds N in the slow-roll leading order. By the definition of N, we have

$$\begin{aligned} N\!=\!\int _{t}^{t_{e}}\! Hdt\!\simeq \! \frac{1}{4}\int _{\delta }^{\delta _{e}}\!\frac{ \left( 1-2\delta -\frac{2}{3}\alpha _{0}\delta ^{-1}\right) \left( 3-\beta _{0}\delta ^{-1}\right) }{\delta ^{2}\left( 1-\frac{\alpha _{0}}{3}\delta ^{-2}\right) }d\delta , \end{aligned}$$

where the index e corresponds to the end of inflation. To integrate this expression, it is convenient to perform the following change of variable:

$$\begin{aligned} x=\frac{\delta _{m}}{\delta }\quad \text { where }\quad \delta _{m}=\sqrt{\frac{\alpha _{0}}{3}}. \end{aligned}$$
(47)

In this case, we get

$$\begin{aligned} N\simeq -\frac{1}{4\delta _{m}}\int _{x}^{x_{e}}\frac{\left( x-2\delta _{m}-2x^{2}\delta _{m}\right) \left( 3-\beta _{0}\delta _{m}^{-1}x\right) }{ 1-x^{2}}\frac{dx}{x}, \end{aligned}$$

whose solution is

$$\begin{aligned} N\simeq & {} -\frac{1}{4\delta _{m}}\left\{ -2x\beta _{0}-6\delta _{m}\ln x+\frac{\beta _{0}+12\delta _{m}^{2}}{2\delta _{m}}\right. \\{} & {} \left. \times \ln \left[ \left( 1-x\right) \left( 1+x\right) \right] +\frac{3+4\beta _{0}}{2}\ln \left( \frac{1+x}{1-x}\right) \right\} _{x}^{x_{e}}. \end{aligned}$$

By considering only leading terms and taking into account that \(x_{e}\ll x\), we finally get

$$\begin{aligned} N\simeq \frac{3}{8}\sqrt{\frac{3}{\alpha _{0}}}\ln \left[ \left( 1-x\right) ^{\gamma -1}\left( 1+x\right) ^{\gamma +1}\right] \text { where }\gamma ^{2}=\frac{\beta _{0}^{2}}{3\alpha _{0}}{.\ }\nonumber \\ \end{aligned}$$
(48)
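
As a sanity check on the integration leading to Eq. (48), the integral over x can be evaluated by quadrature and compared with the closed-form solution quoted above. This is a sketch under assumptions: the composite Simpson rule, the function names and the sample values of \(\delta _{m}\), \(\beta _{0}\), x and \(x_{e}\) are illustrative choices and are not taken from the text.

```python
import math


def integrand(x, delta_m, beta0):
    """Integrand of the e-fold integral after the change of variable (47)."""
    return ((x - 2.0 * delta_m - 2.0 * x**2 * delta_m)
            * (3.0 - beta0 * x / delta_m) / ((1.0 - x**2) * x))


def antiderivative(x, delta_m, beta0):
    """Expression inside the braces of the closed-form solution for N."""
    return (-2.0 * x * beta0
            - 6.0 * delta_m * math.log(x)
            + (beta0 + 12.0 * delta_m**2) / (2.0 * delta_m)
            * math.log((1.0 - x) * (1.0 + x))
            + (3.0 + 4.0 * beta0) / 2.0 * math.log((1.0 + x) / (1.0 - x)))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; works for b < a (signed result)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def efolds_closed(x, x_e, delta_m, beta0):
    return -(antiderivative(x_e, delta_m, beta0)
             - antiderivative(x, delta_m, beta0)) / (4.0 * delta_m)


def efolds_quadrature(x, x_e, delta_m, beta0):
    return -simpson(lambda s: integrand(s, delta_m, beta0), x, x_e) / (4.0 * delta_m)


# Assumed sample point: delta_m = 0.01, beta_0 = 0.015, x = 0.5, x_e = 0.02.
print(efolds_closed(0.5, 0.02, 0.01, 0.015), efolds_quadrature(0.5, 0.02, 0.01, 0.015))
```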

By construction, physical inflation occurs in the interval \(0\le x<1\). In fact, when \(x\rightarrow 1\) we have \(\delta \rightarrow \delta _{m}\), which corresponds approximately to \(\chi \rightarrow \chi _{c}\) (see Eq. (25)). However, the expression (48) has an extra restriction due to the presence of the \(\beta _{0}\) term. For \(\beta _{0}^{2}>3\alpha _{0}\), i.e., \(\gamma >1\), the value of N diverges to \(-\infty \) when \(x\rightarrow 1\), which is clearly not physical. What happens is that, for \(\gamma >1\), the function N has a maximum within the interval \(0\le x<1\). Differentiating N with respect to time, we have

$$\begin{aligned} N_{t}\simeq \frac{3x}{8}\sqrt{\frac{3}{\alpha _{0}}}\left[ \frac{-\left( \gamma -1\right) \left( 1+x\right) +\left( \gamma +1\right) \left( 1-x\right) }{\left( 1-x\right) \left( 1+x\right) }\right] \chi _{t}. \end{aligned}$$

So, for \(\gamma >1\), we have

$$\begin{aligned} N_{t}= & {} 0\Rightarrow \!-\!\left( \gamma -1\right) \left( 1+x_{\max }\right) \!+\!\left( \gamma +1\right) \left( 1-x_{\max }\right) \! =\!0 \nonumber \\\Rightarrow & {} x_{\max }=\frac{1}{\gamma }<1{.} \end{aligned}$$
(49)

On the other hand, for values of x such that \(x_{\max }\le x<1\), we get

$$\begin{aligned} \beta _{0}\delta ^{-1}=3\gamma x\ge 3\Rightarrow \delta \le \frac{\beta _{0}}{3} \end{aligned}$$

which violates the first condition of Eq. (46).

Therefore, based on the previous analysis, we conclude that Eqs. (44), (45) and (48) referring to the quantities \(\epsilon \), \(\eta \) and N are valid in the following intervals:

$$\begin{aligned} \left\{ \begin{array}{c} \gamma \le 1\Rightarrow x<1\Rightarrow \chi<\ln \left( \sqrt{\frac{3}{ \alpha _{0}}}\right) \\ \gamma >1\Rightarrow x<x_{\max }\Rightarrow \chi <\ln \left( \frac{3}{\beta _{0}}\right) \end{array} \right. . \end{aligned}$$
(50)
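
The validity window (50) and the position of the maximum (49) can be illustrated with a short script. This is a sketch only: the helper names, the grid scan and the sample values \(\alpha _{0}=3\times 10^{-4}\), \(\beta _{0}=0.06\) (which give \(\gamma =2\)) are assumptions made for illustration.

```python
import math


def gamma_param(alpha0, beta0):
    """gamma defined below Eq. (48): gamma^2 = beta_0^2 / (3 alpha_0)."""
    return beta0 / math.sqrt(3.0 * alpha0)


def efolds_leading(x, alpha0, beta0):
    """Leading-order number of e-folds, Eq. (48), for 0 < x < 1."""
    g = gamma_param(alpha0, beta0)
    return (3.0 / 8.0) * math.sqrt(3.0 / alpha0) * (
        (g - 1.0) * math.log(1.0 - x) + (g + 1.0) * math.log(1.0 + x))


def locate_maximum(alpha0, beta0, n=10000):
    """Grid scan of Eq. (48) on 0 < x < 1; meaningful for gamma > 1."""
    xs = [i / n for i in range(1, n)]
    return max(xs, key=lambda x: efolds_leading(x, alpha0, beta0))


def chi_upper_bound(alpha0, beta0):
    """Upper limit on chi from Eq. (50)."""
    if gamma_param(alpha0, beta0) <= 1.0:
        return math.log(math.sqrt(3.0 / alpha0))
    return math.log(3.0 / beta0)


alpha0, beta0 = 3e-4, 0.06  # assumed sample values, gamma = 2
print(gamma_param(alpha0, beta0), locate_maximum(alpha0, beta0), chi_upper_bound(alpha0, beta0))
```

For \(\gamma >1\), the scanned maximum should sit at \(x_{\max }=1/\gamma \), and the bound \(\chi <\ln \left( 3/\beta _{0}\right) \) is the same statement written in terms of \(\chi \), since \(x=\delta _{m}e^{\chi }\).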

In the next section, we will study the inflationary regime from the perturbative point of view.

4 Inflation via cosmological perturbation theory

In this section, we investigate inflation in the model (2) via cosmological perturbations. Recall that its background dynamical equations are the Friedmann equations, given by Eqs. (12) and (13), and the equations of motion for the scalar fields \(\chi \) and \(\lambda \), given by Eqs. (14) and (15).

Before proceeding with our developments, it is worth commenting on scalar perturbations. In addition to the perturbations of the two scalar fields, which we will denote by \(\delta \chi \) and \(\delta \lambda \), we have the scalar perturbations of the metric. The line element in the perturbed Friedmann–Lemaître–Robertson–Walker (FLRW) metric is given by

$$\begin{aligned} ds^{2}= & {} -\left( 1+2A\right) dt^{2}+2a\partial _{i}Bdx^{i}dt \nonumber \\{} & {} +a^{2}\left[ \left( 1-2\psi \right) \delta _{ij}+2\partial _{ij}E+h_{ij} \right] dx^{i}dx^{j}, \end{aligned}$$
(51)

with A, B, \(\psi \) and E being the scalar perturbations of the metric [67, 68]. In order to obtain the perturbative field equations through a perturbation directly in the action, we need to write it up to the second order in the perturbations. In this case, we must consider second-order terms for perturbations involving only scalar field perturbations (e.g., \(\delta \chi ^{2}\)), second-order terms involving only metric scalar perturbations (e.g., \(A^{2}\)) and cross terms, that is, a product of first-order terms (e.g., \(A\delta \chi \)). In the following subsection, by following a perturbative procedure directly in the action, along the lines of that found in Refs. [68,69,70], and assuming the spatially flat gauge,Footnote 11 we obtain and discuss the equations of motion for the perturbations.

4.1 Equations for scalar perturbations

The first step in perturbing the action is to define the perturbations of the scalar fields. For an inhomogeneous distribution of matter, we write

$$\begin{aligned} \chi \left( t,x\right)&=\chi \left( t\right) +\delta \chi \left( t,x\right) , \end{aligned}$$
(52)
$$\begin{aligned} \lambda \left( t,x\right)&=\lambda \left( t\right) +\delta \lambda \left( t,x\right) . \end{aligned}$$
(53)

In turn, the metric in Eq. (51) is written as

$$\begin{aligned} g^{\rho \sigma }\left( t,x\right) =g^{\rho \sigma }\left( t\right) +\delta g^{\rho \sigma }\left( t,x\right) . \end{aligned}$$
(54)

By writing Eq. (2) up to the second order in the perturbations and taking their variations with respect to each one of the perturbations, we are able to obtain the following equations of motion for the perturbations \(\delta \chi \) and \(\delta \lambda \)

$$\begin{aligned}{} & {} \delta \ddot{\chi }+3H\delta \dot{\chi }-\frac{1}{a^{2}}\nabla ^{2}\left( \delta \chi \right) \nonumber \\{} & {} \qquad +\frac{\beta _{0}}{6}e^{-\chi }\dot{\lambda }\left( \dot{\lambda }\delta \chi -2\delta \dot{\lambda }\right) +V_{\chi \chi }\delta \chi +V_{\chi \lambda }\delta \lambda \nonumber \\{} & {} \quad =\dot{\chi }\dot{A}+\frac{1}{a}\dot{\chi }\nabla ^{2}B-2V_{\chi }A, \end{aligned}$$
(55)

and

$$\begin{aligned}{} & {} \beta _{0}e^{-\chi }\left[ \delta \ddot{\lambda }+\left( 3H-\dot{\chi } \right) \delta \dot{\lambda }-\frac{1}{a^{2}}\nabla ^{2}\left( \delta \lambda \right) \right. \nonumber \\{} & {} \qquad \left. -\dot{\lambda }\delta \dot{\chi }-3 V_{\lambda } \delta \chi \right] -3\left( V_{\chi \lambda }\delta \chi +V_{\lambda \lambda }\delta \lambda \right) \nonumber \\{} & {} \quad =\beta _{0}e^{-\chi }\left( \dot{\lambda }\dot{A}+\dot{\lambda }\frac{1}{a} \nabla ^{2}B+\dot{\chi }\dot{\lambda }A\right) +6V_{\lambda }A, \end{aligned}$$
(56)

as well as the Einstein equations

$$\begin{aligned}{} & {} H\left( 3HA-\frac{k^{2}}{a}B\right) =-\frac{1}{4}\left[ 3\dot{\chi }\delta \dot{\chi }-\beta _{0}e^{-\chi }\dot{\lambda }\delta \dot{\lambda }\right. \nonumber \\{} & {} \qquad \left. -\left( 3\dot{\chi }^{2}-\beta _{0}e^{-\chi }\dot{\lambda }^{2}\right) A+\frac{1}{2}\beta _{0}e^{-\chi }\dot{\lambda }^{2}\delta \chi +V_{\chi }\delta \chi +V_{\lambda }\delta \lambda \right] , \nonumber \\ \end{aligned}$$
(57)

and

$$\begin{aligned} HA=\frac{1}{4}\left( 3\dot{\chi }\delta \chi -\beta _{0}e^{-\chi }\dot{\lambda }\delta \lambda \right) . \end{aligned}$$
(58)

The double subscript on the potential V denotes second-order differentiation with respect to the corresponding scalar fields.

4.2 Equations in the slow-roll leading order regime

Once we obtain Eqs. (55), (56), (57) and (58), which completely describe the evolution of scalar perturbations, the next step is to write them in the slow-roll leading order regime. This is not a trivial task and therefore, we will initially present the particular case of the Starobinsky model (\(\alpha _{0}=\beta _{0}=0\)). In this case, Eqs. (55), (56), (57) and (58) reduce to

$$\begin{aligned}{} & {} \delta \ddot{\chi }+3H\delta \dot{\chi }-\frac{1}{a^{2}}\nabla ^{2}\left( \delta \chi \right) +\hat{V}_{\chi \chi }\delta \chi +\hat{V}_{\chi \lambda }\delta \lambda \nonumber \\{} & {} \quad =\dot{\chi }\dot{A}+\frac{1}{a}\dot{\chi }\nabla ^{2}B-2\hat{V}_{\chi }A, \end{aligned}$$
(59)
$$\begin{aligned}{} & {} \hat{V}_{\chi \lambda }\delta \chi +\hat{V}_{\lambda \lambda }\delta \lambda =-2\hat{V}_{\lambda }A, \end{aligned}$$
(60)
$$\begin{aligned}{} & {} H\left( 3HA-\frac{k^{2}}{a}B\right) =-\frac{1}{4}\left( 3\dot{\chi }\delta \dot{\chi }-3\dot{\chi }^{2}A+\hat{V}_{\chi }\delta \chi +\hat{V}_{\lambda }\delta \lambda \right) \nonumber \\ \end{aligned}$$
(61)

and

$$\begin{aligned} HA=\frac{3}{4}\dot{\chi }\delta \chi , \end{aligned}$$
(62)

with

$$\begin{aligned} \hat{V}\left( \chi ,\lambda \right) =\frac{1}{3}\kappa _{0}e^{-2\chi }\lambda \left( e^{\chi }-1-\frac{1}{2}\lambda \right) . \end{aligned}$$
(63)

In Sect. 3.2, we saw, in the context of the background, the behavior of the scalar fields, their derivatives and the relationships they keep between them. Once the slow-roll factor \(\delta \) was established, we recall that in the slow-roll leading order regime, we obtain

$$\begin{aligned} \dot{\chi }\sim \delta {, \ }\lambda \sim \delta ^{-1}{, \ }H\sim \delta ^{0}{, \ }\beta _{0}\sim \delta \text { and }\alpha _{0}\sim \delta ^{2}\,{,} \end{aligned}$$

and that each differentiation of the scalar fields with respect to time raises the slow-roll order by one, that is, \(\ddot{\chi }\sim \delta ^{2}\) and \(\dot{\lambda }\sim \delta ^{0}\). To find the slow-roll orders of the perturbations, an assumption is necessary: we need to fix the slow-roll order of one of them. We therefore take the perturbation \(\delta \chi \) to be a zero-order slow-roll quantity. We also keep in mind that derivatives do not change the slow-roll order of the perturbations. We are now able to write the equations of motion in the slow-roll leading order.

Analyzing Eq. (62), note that since \(\dot{\chi }\delta \chi \sim \delta \), the perturbation A must be at most first-order in slow-roll. On the right-hand side of Eq. (61), the first and third terms are of first order, the second term, \(3 \dot{\chi }^{2}A\), is subdominant (third order in slow-roll), and the last one vanishes, since \(\hat{V}_{\lambda }=0\). Furthermore, since on the left-hand side we have \( 3H^{2}A\sim \delta \), we conclude that the perturbation B is at most first-order in slow-roll. In turn, as \(\hat{V}_{\chi \lambda }\sim \delta \) and \( \hat{V}_{\lambda \lambda }\sim \delta ^{2}\), we see that Eq. (60) establishes the leading order of \(\delta \lambda \), namely, \(\delta \lambda \sim \delta ^{-1}\). Using Eq. (60), we can also write \(\delta \lambda \) in terms of \( \delta \chi \) and substitute it in Eq. (59). Thus, on the left-hand side of Eq. (59), the first three terms are of zero order in slow-roll, while all other terms of the equation give subdominant contributions. So we can write

$$\begin{aligned} \delta \ddot{\chi }+3H\delta \dot{\chi }-\frac{1}{a^{2}}\nabla ^{2}\left( \delta \chi \right) \simeq 0. \end{aligned}$$
(64)
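
The cancellation behind Eq. (64) can be made explicit. The lines below are a sketch of the intermediate steps, obtained only from the potential (63), the background condition \(\hat{V}_{\lambda }=0\) and \(\delta =e^{-\chi }\); they are our own working and are not quoted from the original derivation.

$$\begin{aligned} \hat{V}_{\lambda }&=\frac{\kappa _{0}}{3}e^{-2\chi }\left( e^{\chi }-1-\lambda \right) =0\Rightarrow \lambda =\delta ^{-1}-1, \\ \hat{V}_{\lambda \lambda }&=-\frac{\kappa _{0}}{3}\delta ^{2},\quad \hat{V}_{\chi \lambda }=\frac{\kappa _{0}}{3}\delta ,\quad \hat{V}_{\chi \chi }=-\frac{\kappa _{0}}{3}\left( 1+\delta -2\delta ^{2}\right) , \\ \delta \lambda&=-\frac{\hat{V}_{\chi \lambda }}{\hat{V}_{\lambda \lambda }}\delta \chi =e^{\chi }\delta \chi , \\ \hat{V}_{\chi \chi }\delta \chi +\hat{V}_{\chi \lambda }\delta \lambda&=-\frac{\kappa _{0}}{3}\delta \left( 1-2\delta \right) \delta \chi . \end{aligned}$$

The zero-order pieces of \(\hat{V}_{\chi \chi }\delta \chi \) and \(\hat{V}_{\chi \lambda }\delta \lambda \) cancel, so the potential terms in Eq. (59) first contribute at order \(\delta \), which is why they drop out of Eq. (64).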

Repeating the previous analysis for the complete equations, we find that the perturbations evolve, in the slow-roll leading order regime, with the same orders obtained previously. In short, the scalar perturbations scale as \(A\sim B\sim \delta \) and \(\delta \lambda \sim \delta ^{-1}\). It is interesting to note that the perturbation \(\delta \lambda \) scales as \(\delta ^{-1}\) at leading order; any other scaling would seriously compromise the slow-roll dynamics.

By applying all the discussion raised above, we find, in the slow-roll leading order regime, the following equations of motion for the perturbations of the scalar fields

$$\begin{aligned} \delta \ddot{\chi }+3H\delta \dot{\chi }-\frac{1}{a^{2}}\nabla ^{2}\left( \delta \chi \right) \simeq \frac{1}{3}\kappa _{0}\left( \delta \chi -e^{-\chi }\delta \lambda \right) , \end{aligned}$$
(65)

and

$$\begin{aligned} \beta _{0}\left[ \delta \ddot{\lambda }+3H\delta \dot{\lambda }-\frac{1}{a^{2}} \nabla ^{2}\left( \delta \lambda \right) \right] \simeq \kappa _{0}\left( \delta \chi -e^{-\chi }\delta \lambda \right) . \end{aligned}$$
(66)

These results are in agreement with those obtained in Ref. [58], where the Starobinsky\(+R\square R\) model is explored.

4.3 Adiabatic and isocurvature perturbations

In this subsection, we define adiabatic and isocurvature perturbations, obtain expressions that describe their dynamics, and study their solutions.

The action (2) can be rewritten, along the lines of Ref. [71], compactly as

$$\begin{aligned} S\!=\!\frac{M_{Pl}^{2}}{2}\int d^{4}x\sqrt{-g}\left( -\frac{1}{2}g^{\mu \nu }G_{IJ}\left( \Phi \right) \partial _{\mu }\Phi ^{I}\partial _{\nu }\Phi ^{J}\!-\! 3V\right) ,\nonumber \\ \end{aligned}$$
(67)

where the scalars \(\Phi ^{I}\left( x\right) \) are seen as local coordinates of the scalar field space with metric \(G_{IJ}\left( \Phi \right) \)

$$\begin{aligned} \Phi ^{I}= \begin{pmatrix} \chi \\ \lambda \end{pmatrix}, \quad G_{IJ}\left( \Phi \right) = \begin{pmatrix} 3 &{} 0 \\ 0 &{} -\beta _{0}e^{-\chi } \end{pmatrix}, \end{aligned}$$
(68)

and V represents the potential of the model, Eq. (3). In a two-field scalar model, the field space is 2-dimensional and characterized by \(G_{IJ}\left( \Phi \right) \). To conveniently describe the evolution of perturbations, we can define a basis with one direction tangent to the background trajectories, which we will denote by \(\hat{\sigma }^{I}\), and another orthogonal to them, \(\hat{s}^{I}\). Tangent directions are associated with the adiabatic perturbation, while orthogonal directions are associated with the isocurvature perturbation. In this sense, we build the basis through the definitions, respectively, of the modulus of the velocity vector, the unit velocity vector in the tangent direction and the normalization rule

$$\begin{aligned} \dot{\sigma }=\sqrt{G_{IJ}\dot{\Phi }^{I}\dot{\Phi }^{J}},\quad \hat{\sigma }^{I}=\frac{\dot{\Phi }^{I}}{\dot{\sigma }}\quad \text { and }\quad G_{IJ}\hat{\sigma } ^{I}\hat{\sigma }^{J}=1,\nonumber \\ \end{aligned}$$
(69)

and for the orthogonal direction, the normalizationFootnote 12 and orthogonality rules

$$\begin{aligned} G_{IJ}\hat{s}^{I}\hat{s}^{J}=-1\quad \text { and }\quad G_{IJ}\hat{s}^{I}\hat{\sigma } ^{J}=0. \end{aligned}$$
(70)

For our case, the velocity modulus \(\dot{\sigma }\) is

$$\begin{aligned} \dot{\sigma }=\sqrt{3\dot{\chi }^{2}-\beta _{0}e^{-\chi }\dot{\lambda }^{2}}. \end{aligned}$$
(71)

Note that it is directly related to the Friedmann equation (13). For the unit velocity vectors, we write

$$\begin{aligned} \hat{\sigma }^{\chi }=\frac{\dot{\chi }}{\dot{\sigma }}\text { and }\hat{ \sigma }^{\lambda }=\frac{\dot{\lambda }}{\dot{\sigma }}. \end{aligned}$$
(72)

In turn, for the unit vectors in the orthogonal direction to the background trajectories, we have

$$\begin{aligned} \hat{s}^{\chi }=\sqrt{\frac{\beta _{0}e^{-\chi }}{3}}\frac{\dot{\lambda }}{ \dot{\sigma }}\text { and }\hat{s}^{\lambda }=\sqrt{\frac{3}{\beta _{0}e^{-\chi }}}\frac{\dot{\chi }}{\dot{\sigma }}. \end{aligned}$$
(73)

Continuing our study on the evolution of scalar perturbations, we point out that the quantity \(\delta \Phi _{g}^{I}\) given by

$$\begin{aligned} \delta \Phi _{g}^{I}=\delta \Phi ^{I}+\frac{\dot{\Phi }^{I}}{H}\psi , \end{aligned}$$
(74)

is gauge invariant. Since \(\psi =0\) in the spatially flat gauge, we have \(\delta \Phi _{g}^{I}=\delta \Phi ^{I}\). That said, by projecting \(\delta \Phi ^{I}\) onto the \(\hat{\sigma }\) and \(\hat{s}\) directions, we construct the adiabatic \(Q_{\sigma }\) and isocurvature \(Q_{s}\) perturbations, respectively. In that sense, we have

$$\begin{aligned} Q_{\sigma } =\hat{\sigma }^{J}G_{IJ}\delta \Phi ^{I} =\frac{3\dot{\chi }\delta \chi -\beta _{0}e^{-\chi }\dot{\lambda } \delta \lambda }{\dot{\sigma }}, \end{aligned}$$
(75)

and

$$\begin{aligned} Q_{s} =\hat{s}^{J}G_{IJ}\delta \Phi ^{I} =\frac{\sqrt{3\beta _{0}e^{-\chi }}\left( \dot{\lambda }\delta \chi - \dot{\chi }\delta \lambda \right) }{\dot{\sigma }}. \end{aligned}$$
(76)

It is worth noting that, from the point of view of the slow-roll approximation, both Eqs. (75) and (76) are zero-order. Having written the expressions for the adiabatic and isocurvature perturbations, the next step is to invert these relations in order to obtain \(\delta \Phi ^{I}=\delta \Phi ^{I}\left( Q\right) \). We do so by solving the linear system given by Eqs. (75) and (76). Thus, we find the expressions

$$\begin{aligned} \delta \chi =\frac{1}{\dot{\sigma }}\left( \dot{\chi }Q_{\sigma }-\sqrt{\frac{ \beta _{0}e^{-\chi }}{3}}\dot{\lambda }Q_{s}\right) , \end{aligned}$$
(77)

and

$$\begin{aligned} \delta \lambda =\frac{1}{\dot{\sigma }}\left( \dot{\lambda }Q_{\sigma }-\sqrt{ \frac{3}{\beta _{0}e^{-\chi }}}\dot{\chi }Q_{s}\right) . \end{aligned}$$
(78)
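
The basis (72)–(73), the projections (75)–(76) and their inverses (77)–(78) can be checked numerically. The sketch below is illustrative: the function names and the sample background values are assumptions, chosen only so that \(\dot{\sigma }^{2}=3\dot{\chi }^{2}-\beta _{0}e^{-\chi }\dot{\lambda }^{2}>0\).

```python
import math


def basis(chi, chi_dot, lam_dot, beta0):
    """Field-space metric (diagonal), sigma_dot and the unit vectors (72)-(73)."""
    w = beta0 * math.exp(-chi)
    G = (3.0, -w)                                   # diagonal of G_IJ, Eq. (68)
    sigma_dot = math.sqrt(3.0 * chi_dot**2 - w * lam_dot**2)
    sig = (chi_dot / sigma_dot, lam_dot / sigma_dot)
    s = (math.sqrt(w / 3.0) * lam_dot / sigma_dot,
         math.sqrt(3.0 / w) * chi_dot / sigma_dot)
    return G, sigma_dot, sig, s


def inner(G, u, v):
    """G_IJ u^I v^J for a diagonal metric."""
    return G[0] * u[0] * v[0] + G[1] * u[1] * v[1]


def project(d_chi, d_lam, chi, chi_dot, lam_dot, beta0):
    """Adiabatic and isocurvature perturbations, Eqs. (75)-(76)."""
    G, _, sig, s = basis(chi, chi_dot, lam_dot, beta0)
    return inner(G, sig, (d_chi, d_lam)), inner(G, s, (d_chi, d_lam))


def reconstruct(q_sigma, q_s, chi, chi_dot, lam_dot, beta0):
    """Field perturbations from (Q_sigma, Q_s), Eqs. (77)-(78)."""
    w = beta0 * math.exp(-chi)
    _, sigma_dot, _, _ = basis(chi, chi_dot, lam_dot, beta0)
    d_chi = (chi_dot * q_sigma - math.sqrt(w / 3.0) * lam_dot * q_s) / sigma_dot
    d_lam = (lam_dot * q_sigma - math.sqrt(3.0 / w) * chi_dot * q_s) / sigma_dot
    return d_chi, d_lam


# Assumed sample background: chi = 4, chi_dot = -0.02, lam_dot = 0.5, beta_0 = 1e-3.
q = project(0.3, -7.0, 4.0, -0.02, 0.5, 1e-3)
print(q, reconstruct(q[0], q[1], 4.0, -0.02, 0.5, 1e-3))
```

The minus signs in Eqs. (77) and (78) reflect the negative norm of \(\hat{s}^{I}\) in Eq. (70): the decomposition reads \(\delta \Phi ^{I}=\hat{\sigma }^{I}Q_{\sigma }-\hat{s}^{I}Q_{s}\).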

Once the field perturbations have been written in terms of the adiabatic and isocurvature perturbations, we can write the second-order perturbed action for \(Q_{\sigma }\) and \(Q_{s}\). Such an action consists of quadratic terms involving \(Q_{\sigma }\) and \(Q_{s}\) (e.g., \(Q_{\sigma }^{2}\)), cross terms involving a \(Q_{\sigma }\) or \(Q_{s}\) and a metric perturbation (e.g., \(AQ_{\sigma }\)), and quadratic terms for metric perturbations (e.g., \(A^{2}\)). However, we can express the cross terms solely in terms of the perturbations \(Q_{\sigma }\) and \(Q_{s}\) by making use of the constraints from the Einstein equations, Eqs. (57) and (58). By taking these considerations into account, we find the following structure for the part of the action that depends only on \( Q_{\sigma }\) and \(Q_{s}\)

$$\begin{aligned} S^{\left( 2\right) }&=\frac{M_{Pl}^{2}}{2}\int d^{4}x\sqrt{-g}\left( -\frac{1}{2}\partial _{\kappa }Q_{\sigma }\partial ^{\kappa }Q_{\sigma }+\frac{1}{2}\partial _{\kappa }Q_{s}\partial ^{\kappa }Q_{s}\right. \nonumber \\&\qquad +C_{Q_{\sigma }^{2}}Q_{\sigma }^{2}+C_{Q_{\sigma }Q_{s}}Q_{\sigma }Q_{s}+C_{Q_{s}^{2}}Q_{s}^{2} \nonumber \\&\qquad +C_{Q_{\sigma }\dot{Q}_{\sigma }}Q_{\sigma }\dot{Q}_{\sigma } +C_{\dot{ Q}_{\sigma }Q_{s}}\dot{Q}_{\sigma }Q_{s}+C_{Q_{\sigma }\dot{Q}_{s}}Q_{\sigma }\dot{Q}_{s} \nonumber \\&\qquad \left. +C_{Q_{s}\dot{Q}_{s}}Q_{s}\dot{Q}_{s}\right) , \end{aligned}$$
(79)

where the coefficient of the cross kinetic term \(\partial _{\kappa }Q_{\sigma }\partial ^{\kappa }Q_{s}\) is zero, and the others can be found in Appendix B. It is interesting to analyze the behavior of the kinetic terms and their possible contribution to the emergence of ghost-type instabilities. Note that the kinetic terms are canonical and equal in magnitude but opposite in sign. This indicates that ghost-type instabilities are intrinsic to the model and that it is essential to take this into account when quantizing the perturbations.Footnote 13 Furthermore, the absence of the cross kinetic term is expected and follows directly from our consistent decomposition of the perturbations into the directions tangent and orthogonal to the trajectories of the background phase space.

When writing Eq. (79) considering a slow-roll leading order regime, we obtain

$$\begin{aligned} S^{\left( 2\right) }\simeq & {} \frac{M_{Pl}^{2}}{2}\int d^{4}x\sqrt{-g} \left\{ -\frac{1}{2}\partial _{\kappa }Q_{\sigma }\partial ^{\kappa }Q_{\sigma }+\frac{1}{2}\partial _{\kappa }Q_{s}\partial ^{\kappa }Q_{s}\right. \nonumber \\{} & {} \left. -\frac{\kappa _{0}}{\dot{\sigma }^{2}}\left[ 1-\frac{1}{2}\left( \frac{3}{\beta _{0}e^{\chi }}+\frac{\beta _{0}e^{\chi }}{3}\right) \right] \dot{\chi }^{2}Q_{s}^{2}\right\} . \end{aligned}$$
(80)

We now turn our attention to the task of writing the equations for the evolution of adiabatic and isocurvature perturbations. This is done by substituting the expressions (77) and (78) in the dynamic equations (65) and (66). In this case, taking the first derivatives of Eqs. (77) and (78), remembering that the background quantities can be considered constant, we are able to write

$$\begin{aligned} \ddot{Q}_{\sigma }+3H\dot{Q}_{\sigma }-\frac{1}{a^{2}}\nabla ^{2}Q_{\sigma }\simeq 0, \end{aligned}$$
(81)

and

$$\begin{aligned} \ddot{Q}_{s}+3H\dot{Q}_{s}+m^{2}Q_{s}\simeq 0, \end{aligned}$$
(82)

where

$$\begin{aligned} m^{2}=-\left[ \frac{1}{a^{2}}\nabla ^{2}+\frac{\kappa _{0}}{3}\left( 1-\frac{ 3}{\beta _{0}e^{\chi }}\right) \right] . \end{aligned}$$
(83)

Note that relations (81) and (82) indicate that the adiabatic \(Q_{\sigma }\) and isocurvature \(Q_{s}\) perturbations are decoupled, that is, they evolve independently in our model. This is an interesting result, since such a decoupling usually does not occur: generally, the isocurvature perturbation acts as a source of the adiabatic one [71].

From this point on, it becomes convenient to treat the field equations for the perturbations in Mukhanov–Sasaki form. By redefining the perturbations and working in conformal time and Fourier space, we get

$$\begin{aligned}{} & {} \delta \varphi _{\sigma }^{\prime \prime }+\left( k^{2}-\frac{a^{\prime \prime }}{a}\right) \delta \varphi _{\sigma }\simeq 0, \end{aligned}$$
(84)
$$\begin{aligned}{} & {} \delta \varphi _{s}^{\prime \prime }+\left[ k^{2}-\frac{a^{\prime \prime }}{a }-\frac{a^{2}\kappa _{0}}{3}\left( 1-\frac{3}{\beta _{0}e^{\chi }}\right) \right] \delta \varphi _{s}\simeq 0, \end{aligned}$$
(85)

with the prime representing derivative with respect to conformal time and where

$$\begin{aligned} \delta \varphi _{\sigma }\equiv aQ_{\sigma }\quad \text { and }\quad \delta \varphi _{s}\equiv aQ_{s}{.} \end{aligned}$$
(86)

In a de Sitter background (slow-roll zero order), we have

$$\begin{aligned} \frac{a^{\prime \prime }}{a}\simeq \frac{2}{\eta ^{2}},\quad a\simeq - \frac{1}{H\eta }\quad \text { and }\quad H^{2}\simeq \frac{\kappa _{0}}{12}. \end{aligned}$$
(87)

Furthermore, in the slow-roll zero order regime, we have

$$\begin{aligned} \dot{\chi }\simeq -\frac{1}{3}\frac{H^{-1}}{3-\beta _{0}\delta ^{-1}}\delta \left( 1-\frac{\alpha _{0}}{3}\delta ^{-2}\right) \simeq 0\Rightarrow \chi =\text {const}{.} \end{aligned}$$
(88)

For consistency with several previous results, we have \(\beta _{0}e^{\chi }<3 \) so that we can define a quantity

$$\begin{aligned} M\equiv \frac{3}{\beta _{0}e^{\chi }}-1>0, \end{aligned}$$
(89)

and in this way we write the expressions

$$\begin{aligned}{} & {} \delta \varphi _{\sigma }^{\prime \prime }+k^{2}\left( 1-\frac{2}{k^{2}\eta ^{2}}\right) \delta \varphi _{\sigma }\simeq 0, \end{aligned}$$
(90)
$$\begin{aligned}{} & {} \delta \varphi _{s}^{\prime \prime }+k^{2}\left[ 1-\frac{2}{k^{2}\eta ^{2}} \left( 1-2M\right) \right] \delta \varphi _{s}\simeq 0. \end{aligned}$$
(91)
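
The step from Eq. (85) to Eq. (91) can be spelled out as follows. This is a short consistency sketch that uses only the de Sitter relations (87) and the definition (89); it is not part of the original derivation.

$$\begin{aligned} \frac{a^{2}\kappa _{0}}{3}&\simeq \frac{1}{H^{2}\eta ^{2}}\frac{12H^{2}}{3}=\frac{4}{\eta ^{2}},\qquad 1-\frac{3}{\beta _{0}e^{\chi }}=-M, \\ k^{2}-\frac{a^{\prime \prime }}{a}-\frac{a^{2}\kappa _{0}}{3}\left( 1-\frac{3}{\beta _{0}e^{\chi }}\right)&\simeq k^{2}-\frac{2}{\eta ^{2}}+\frac{4M}{\eta ^{2}}=k^{2}\left[ 1-\frac{2}{k^{2}\eta ^{2}}\left( 1-2M\right) \right] . \end{aligned}$$

Setting \(M=0\) in the last expression recovers the adiabatic equation (90).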

Next, we will explore the solutions of Eqs. (90) and (91).

4.4 Solutions to the perturbations

Once the equations for the dynamics of adiabatic and isocurvature perturbations have been established in the appropriate form, given by Eqs. (90) and (91), we can write and analyze their solutions.

In the subhorizon regime, \(k\eta \gg 1\), the equations for the perturbations are approximated by

$$\begin{aligned}{} & {} \delta \varphi _{\sigma }^{\prime \prime }+k^{2}\delta \varphi _{\sigma }\simeq 0,\ \ k\eta \gg 1, \\{} & {} \delta \varphi _{s}^{\prime \prime }+k^{2}\delta \varphi _{s}\simeq 0,{ \ \ }k\eta \gg 1. \end{aligned}$$

The quantization process in de Sitter space imposes the following initial conditions [72, 73]

$$\begin{aligned} \delta \varphi _{\sigma }\simeq \delta \varphi _{s}\simeq \frac{1}{\sqrt{2k }}e^{-ik\eta },\quad k\eta \gg 1, \end{aligned}$$
(92)

or

$$\begin{aligned} Q_{\sigma }\simeq Q_{s}\simeq \frac{1}{\sqrt{2k}a}e^{-ik\eta }\simeq - \frac{H\eta }{\sqrt{2k}}e^{-ik\eta }. \end{aligned}$$
(93)

Note that due to the ghost-type behavior of isocurvature perturbation, the quantization of the \(Q_{s}\) field was performed in the same way as in Refs. [48, 74]. In principle, this behavior can raise questions about the unitarity of the theory [75] (see also the discussions in Refs. [65, 66]). However, as we will see below, the \(Q_{s}\) field decays rapidly after crossing the horizon, suppressing any observable effects associated with isocurvature perturbation.Footnote 14

The exact solution of Eqs. (90) and (91) can be written as a combination of Hankel functions [73, 76]

$$\begin{aligned} \delta \varphi \left( \eta ,k\right)= & {} C_{1}\left( k\right) \sqrt{-\eta } H_{\nu }^{\left( 1\right) }\left( -k\eta \right) \nonumber \\{} & {} +C_{2}\left( k\right) \sqrt{ -\eta }H_{\nu }^{\left( 2\right) }\left( -k\eta \right) , \end{aligned}$$
(94)

where for the adiabatic perturbation \(\delta \varphi _{\sigma }\), we have \( \nu _{\sigma }=3/2\), and for the isocurvature perturbation \(\delta \varphi _{s}\), we have

$$\begin{aligned} \nu _{s}=\frac{3}{2}\sqrt{1-\frac{16M}{9}}. \end{aligned}$$
(95)
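
The index (95) follows from matching Eq. (91) to the standard form \(\delta \varphi ^{\prime \prime }+\left[ k^{2}-\left( \nu ^{2}-\frac{1}{4}\right) \eta ^{-2}\right] \delta \varphi =0\), which gives \(\nu _{s}^{2}=\frac{1}{4}+2\left( 1-2M\right) \). The sketch below evaluates M and \(\nu _{s}\) for given \(\beta _{0}\) and \(\chi \); the function names and sample values are hypothetical, and `cmath.sqrt` is used so that the imaginary branch is handled in the same call.

```python
import cmath
import math


def isocurvature_index(beta0, chi):
    """Return (M, nu_s) from Eqs. (89) and (95); nu_s is complex-valued."""
    M = 3.0 / (beta0 * math.exp(chi)) - 1.0
    nu_s = 1.5 * cmath.sqrt(1.0 - 16.0 * M / 9.0)
    return M, nu_s


def superhorizon_exponent(nu_s):
    """Power of a in the modulus of Q_s on superhorizon scales: Re(nu_s) - 3/2."""
    return nu_s.real - 1.5


# Assumed sample values: beta_0 * exp(chi) = 2.4 (real nu_s) and 1.5 (imaginary nu_s).
for chi in (math.log(240.0), math.log(150.0)):
    M, nu_s = isocurvature_index(0.01, chi)
    print(M, nu_s, superhorizon_exponent(nu_s))
```

For \(M>0\) the real part of \(\nu _{s}\) stays below 3/2, so the exponent returned by `superhorizon_exponent` is negative in both branches.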

To determine the constants, we compare the general solution with the initial conditions in Eq. (92). For the adiabatic case (\(\nu _{\sigma }=3/2\)) in the subhorizon regime, we find

$$\begin{aligned} C_{1\sigma }=-\frac{\sqrt{\pi }}{2}\quad \text { and }\quad C_{2\sigma }=0, \end{aligned}$$
(96)

so that

$$\begin{aligned} \delta \varphi _{\sigma }\left( \eta ,k\right) =-\frac{\sqrt{-\pi \eta }}{2} H_{3/2}^{\left( 1\right) }\left( -k\eta \right) . \end{aligned}$$
(97)

In turn, for the case of isocurvature perturbation, where \(\nu _{s}\) is given by Eq. (95), for a subhorizon regime, we getFootnote 15

$$\begin{aligned} C_{1s}=\frac{\sqrt{\pi }}{2}e^{i\frac{\pi }{2}\left( \nu _{s}+\frac{1}{2} \right) }\quad \text { and }\quad C_{2s}=0. \end{aligned}$$
(98)

Thus, the solution to the isocurvature perturbation is written as

$$\begin{aligned} \delta \varphi _{s}\left( \eta ,k\right) =\frac{\sqrt{-\pi \eta }}{2}e^{i \frac{\pi }{2}\left( \nu _{s}+\frac{1}{2}\right) }H_{\nu _{s}}^{\left( 1\right) }\left( -k\eta \right) . \end{aligned}$$
(99)

Since the solutions (97) and (99) for the perturbations have been established, we can analyze their behavior in the superhorizon regime (\(k\eta \ll 1\)). Taking into account Eq. (87), which gives us the relation between \(\eta \) and a in a de Sitter background, for adiabatic perturbation, we find

$$\begin{aligned} \delta \varphi _{\sigma }\left( \eta ,k\right) \simeq -\frac{i}{4\eta } \left( \frac{k}{2}\right) ^{-3/2}\text { or }Q_{\sigma }\simeq \frac{iH}{ 4}\left( \frac{k}{2}\right) ^{-3/2},\nonumber \\{ \ }k\eta \ll 1,\nonumber \\ \end{aligned}$$
(100)

while for isocurvature perturbation, we have

$$\begin{aligned} \delta \varphi _{s}\left( \eta ,k\right) \simeq -\frac{i\sqrt{\pi }e^{i \frac{\pi }{2}\left( \nu _{s}+\frac{1}{2}\right) }}{2\sin \left( \nu _{s}\pi \right) \Gamma \left( 1-\nu _{s}\right) }\left( \frac{k}{2}\right) ^{-\nu _{s}}\left( -\eta \right) ^{\frac{1}{2}-\nu _{s}},\nonumber \\{ \ }k\eta \ll 1,\nonumber \\ \end{aligned}$$
(101)

or

$$\begin{aligned} Q_{s}\simeq -\frac{i\sqrt{\pi }e^{i\frac{\pi }{2}\left( \nu _{s}+\frac{1}{2} \right) }H^{\nu _{s}-\frac{1}{2}}}{2\sin \left( \nu _{s}\pi \right) \Gamma \left( 1-\nu _{s}\right) }\left( \frac{k}{2}\right) ^{-\nu _{s}}a^{\nu _{s}- \frac{3}{2}}. \end{aligned}$$
(102)
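
For the adiabatic mode, both limits can be verified directly, because the Hankel function of order 3/2 has the closed form \(H_{3/2}^{\left( 1\right) }\left( z\right) =-\sqrt{2/\left( \pi z\right) }\,e^{iz}\left( 1+i/z\right) \). The sketch below is an illustration with hypothetical function names; it takes \(\eta <0\), as in Eq. (87), and reads the conditions \(k\eta \gg 1\) and \(k\eta \ll 1\) as conditions on \(\left| k\eta \right| \), which is our assumption about the intended meaning.

```python
import cmath
import math


def hankel1_three_halves(z):
    """Closed form of the Hankel function H^(1)_{3/2}(z) for real z > 0."""
    return -math.sqrt(2.0 / (math.pi * z)) * cmath.exp(1j * z) * (1.0 + 1j / z)


def adiabatic_mode(eta, k):
    """delta varphi_sigma(eta, k) from Eq. (97), with eta < 0."""
    return -math.sqrt(-math.pi * eta) / 2.0 * hankel1_three_halves(-k * eta)


def subhorizon_limit(eta, k):
    """Initial condition, Eq. (92)."""
    return cmath.exp(-1j * k * eta) / math.sqrt(2.0 * k)


def superhorizon_limit(eta, k):
    """Superhorizon behavior, Eq. (100)."""
    return -1j / (4.0 * eta) * (k / 2.0) ** -1.5


k = 1.0  # arbitrary illustrative wavenumber
print(adiabatic_mode(-1e4, k), subhorizon_limit(-1e4, k))
print(adiabatic_mode(-1e-4, k), superhorizon_limit(-1e-4, k))
```

Multiplying the superhorizon expression by \(1/a\simeq -H\eta \) removes the \(\eta \) dependence, which is the statement that \(Q_{\sigma }\) freezes outside the horizon.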

The result in Eq. (100) tells us that the adiabatic perturbation is constant in the superhorizon limit, whereas Eq. (102) reveals a decaying behavior for the isocurvature one. Remembering that \(M>0\), we have the following situations:

  • If the quantity \(\nu _{s}\) is real, we have

    $$\begin{aligned} 0<M\le \frac{9}{16}\Rightarrow 0\le \nu _{s}<\frac{3}{2}, \end{aligned}$$
    (103)

    so that \(Q_{s}\propto a^{\nu _{s}-\frac{3}{2}}\) in Eq. (102) decays as a grows.

  • If the quantity \(\nu _{s}\) is imaginary,

    $$\begin{aligned} M>\frac{9}{16}\Rightarrow \nu _{s}=i\frac{3}{2}\sqrt{\left| 1-\frac{16M}{ 9}\right| }, \end{aligned}$$
    (104)

    which also gives a decaying solution, since \(Q_{s}\) is then the product of an oscillatory term and a term decaying as \(a^{-\frac{3}{2}}\).
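
The two cases above can be illustrated with a short sketch. It assumes the closed form \(\nu _{s}=\frac{3}{2}\sqrt{1-\frac{16M}{9}}\), which is inferred here from Eqs. (103) and (104) because Eq. (95) is not reproduced in this section. The function names are hypothetical. The exponent returned is the power p in \(|Q_{s}|\propto a^{p}\) read off from Eq. (102).

```python
import cmath


def nu_s(M):
    """Index nu_s as a complex number.

    ASSUMPTION: nu_s = (3/2) * sqrt(1 - 16 M / 9), inferred from
    Eqs. (103)-(104); real for 0 < M <= 9/16, purely imaginary otherwise.
    """
    return 1.5 * cmath.sqrt(1.0 - 16.0 * M / 9.0)


def superhorizon_exponent(M):
    """Power p in |Q_s| ~ a^p from Eq. (102): p = Re(nu_s) - 3/2.

    For imaginary nu_s, the factor a^{nu_s} only oscillates, so p = -3/2.
    """
    return nu_s(M).real - 1.5


if __name__ == "__main__":
    # Sample masses on both sides of the threshold M = 9/16.
    for M in (0.1, 0.5, 9.0 / 16.0, 2.0):
        print(M, nu_s(M), superhorizon_exponent(M))
```

Under this assumption, \(M=1/2\) corresponds to \(\nu _{s}=1/2\), and the exponent is negative for every \(M>0\), in line with the decaying behavior described in both cases.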

We can provide a quantitative measure of the isocurvature perturbation by considering the scales of interest during inflation (those measured in CMB anisotropies). These lie within the range \(10^{-3}\,\text {Mpc}^{-1}<k<10^{4}\,\text {Mpc}^{-1}\), where the pivot scale is \(k_{*}=0.002\,\text {Mpc}^{-1}\) with \(50<N_{*}<60\). The point is that, during the inflationary regime, a given scale k crosses the horizon at a specific value of the number of e-folds N. The smaller (larger) the physical scale, i.e., the larger (smaller) the wavenumber k, the smaller (larger) the number of e-folds N it experiences after crossing the horizon. Taking the smallest scale (see footnote 16), \(k_{sm}=10^{4}\,\text {Mpc}^{-1}\), we find \(N_{sm}=N_{*}-15.4\). In this sense, for \(N_{*}=50\) and \(\nu _{s}=1/2\), we get

$$\begin{aligned} Q_{s}\simeq \frac{1}{2}\left( \frac{k_{sm}}{2}\right) ^{-\frac{1}{2} }e^{-N_{sm}}\sim 10^{-18}, \end{aligned}$$
(105)

which shows that the isocurvature perturbation is strongly suppressed by the end of inflation. Since, in addition, it does not act as a source of the adiabatic perturbation, we can neglect it. All the previous analysis was carried out in the slow-roll approximation.
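
The order of magnitude in Eq. (105) can be reproduced with a few lines of arithmetic. The sketch below assumes that the offset in \(N_{sm}=N_{*}-15.4\) comes from \(\ln (k_{sm}/k_{*})\), which reproduces the quoted number but is our reading rather than the derivation given in footnote 16. It then evaluates the right-hand side of Eq. (105) with k expressed in \(\text {Mpc}^{-1}\). The names used are illustrative.

```python
import math

K_PIVOT = 0.002  # pivot scale k_* in Mpc^-1
K_SMALL = 1.0e4  # smallest scale of interest k_sm in Mpc^-1


def efolds_after_crossing(n_star, k, k_pivot=K_PIVOT):
    """Number of e-folds left after the mode k crosses the horizon.

    ASSUMPTION: N(k) = N_* - ln(k / k_*), which gives N_* - 15.4 for k_sm.
    """
    return n_star - math.log(k / k_pivot)


def q_s_estimate(n_star, k=K_SMALL):
    """Right-hand side of Eq. (105) for nu_s = 1/2."""
    n_sm = efolds_after_crossing(n_star, k)
    return 0.5 * (k / 2.0) ** (-0.5) * math.exp(-n_sm)


if __name__ == "__main__":
    print("offset ln(k_sm/k_*):", math.log(K_SMALL / K_PIVOT))
    print("Q_s estimate for N_* = 50:", q_s_estimate(50))
```

With \(N_{*}=50\) the estimate falls between \(10^{-18}\) and \(10^{-17}\), and larger values of \(N_{*}\) suppress it further.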

In the next section, we will analyze the connection of our model with the observations.

5 Observational constraints

At the end of the previous section, we showed why the isocurvature perturbation is negligible after inflation. Furthermore, since the adiabatic perturbation behaves as in the case of a single scalar field, we can readily identify the power spectrum and its connection with the observational parameters. To make the connection with the observations, specifically to write the power spectrum, it is convenient to restore the mass units of the fields. In this sense, equations such as (12), (13), (69) and (75) need to be written in terms of the massive fields given in Eq. (5). Since the curvature perturbation is given by \(\mathcal {R}=\frac{H}{\dot{\sigma }}\left( \frac{\sqrt{2}}{M_{Pl}}Q_{\sigma }\right) \) (see footnote 17), the power spectrum of the adiabatic perturbation is written as

$$\begin{aligned} \mathcal {P}_{\mathcal {R}}^{2}=\left. \frac{k^{3}}{2\pi ^{2}}\left| \mathcal {R}\right| ^{2}\right| _{k=Ha}=\left. \frac{1}{8\pi ^{2}M_{Pl}^{2}}\frac{H^{2}}{\epsilon }\right| _{k=Ha}, \end{aligned}$$
(106)

where the expression is evaluated at \(k=Ha\), i.e., at the instant when the mode k crosses the horizon. The result in Eq. (106) is identical to the power spectrum of single-field inflationary models. Thus, at leading order in slow-roll, the scalar spectral index \(n_{s}\) and the tensor-to-scalar ratio r are, respectively,

$$\begin{aligned} n_{s}=1+\eta -2\epsilon \text { and }r=16\epsilon , \end{aligned}$$
(107)

where \(\epsilon \) and \(\eta \) are the slow-roll parameters of the model given by Eqs. (44) and (45). These equations depend on \(\alpha _{0}\), \(\beta _{0}\), and the number of e-folds N through Eq. (48), which carries the dependence between N and \(\delta \).
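
To make Eq. (107) concrete, the sketch below evaluates \(n_{s}\) and r for given slow-roll parameters. Since Eqs. (44), (45) and (48) are not reproduced in this section, the example uses stand-in values for the pure Starobinsky limit (\(\alpha _{0}=\beta _{0}=0\)), namely \(\epsilon \simeq 3/(4N^{2})\) and \(\eta \simeq -2/N\). These are an assumption, chosen so that Eq. (107) recovers the familiar leading-order results \(n_{s}\simeq 1-2/N\) and \(r\simeq 12/N^{2}\); they are not the full expressions of the model. All function names are hypothetical.

```python
def spectral_index(epsilon, eta):
    """Scalar spectral index from Eq. (107): n_s = 1 + eta - 2 epsilon."""
    return 1.0 + eta - 2.0 * epsilon


def tensor_to_scalar_ratio(epsilon):
    """Tensor-to-scalar ratio from Eq. (107): r = 16 epsilon."""
    return 16.0 * epsilon


def starobinsky_slow_roll(n_efolds):
    """Stand-in slow-roll parameters for alpha_0 = beta_0 = 0.

    ASSUMPTION: epsilon ~ 3 / (4 N^2) and eta ~ -2 / N, used here in place
    of Eqs. (44)-(45), which are not shown in this section.
    """
    epsilon = 3.0 / (4.0 * n_efolds ** 2)
    eta = -2.0 / n_efolds
    return epsilon, eta


if __name__ == "__main__":
    # End points of the e-fold range used in the text.
    for n_efolds in (52, 59):
        eps, eta = starobinsky_slow_roll(n_efolds)
        print(n_efolds, spectral_index(eps, eta), tensor_to_scalar_ratio(eps))
```

In this limit, increasing N moves the prediction towards larger \(n_{s}\) and smaller r, which is the direction separating the \(N=52\) and \(N=59\) starting points in the \(n_{s}\times r_{0.002}\) plane.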

In this paper, we compare our model with observational data [40] through two types of plots, both built from Eq. (107) with the three independent parameters \(\alpha _{0}\), \(\beta _{0}\), and N: the usual \(n_{s}\times r_{0.002}\) plane and the parameter space \(\alpha _{0}\times \beta _{0}\). The plots are constructed by fixing one of the parameters and varying the others. We use the range \(52\le N\le 59\) for the number of e-folds of inflation N, based on a reheating modeling. For details, see Appendix C.

Figure 3 shows the \(n_{s}\times r_{0.002}\) plane containing the observational constraints (in blue) obtained from Ref. [40] and the theoretical evolution of the model in two different situations.

In the top graph of Fig. 3, we fixed the parameter \(\beta _{0}\) and varied the others. There, the light red region represents the Starobinsky\(+R^{3}\) model, which starts at the light red points. In turn, the light yellow region represents the complete model with \(\beta _{0}=1.5\times 10^{-2}\), starting at the yellow points. As we increase the value of \(\alpha _{0}\), the region predicted by the model shifts to the left and slightly downwards, until it crosses the \(95\%\) C.L. region. This behavior can also be seen in Ref. [44], and it is consistent with the results obtained in Ref. [42], where \(\beta _{0}=0\). These constraints establish, in the most conservative way, a maximum value of \(\alpha _{0}\sim 10^{-4}\).

In the bottom graph of Fig. 3, on the other hand, we fixed the parameter \(\alpha _{0}\) and varied the others. The light red region represents the Starobinsky\(+R\square R\) model, which starts at the light red points. In turn, the light green region represents the complete model with \(\alpha _{0}=10^{-5}\), starting at the light green points. As the value of \(\beta _{0}\) increases, the region predicted by the model moves to the right and slightly upwards, until it crosses the \(95\%\) C.L. region. These constraints establish a maximum value of \(\beta _{0}\sim 10^{-2}\). Similar results were obtained in Refs. [58, 59] for the Starobinsky\(+R\square R\) case. However, our results differ considerably from those of Ref. [58] in the constraint on the tensor-to-scalar ratio \(r_{0.002}\). There, \(r_{0.002}\) can assume larger values, so that the growth of the region predicted by the model is more pronounced. This difference arises because, in Ref. [58], the curvature perturbation was not properly defined: the perturbations were not separated into the directions tangent (adiabatic perturbation) and orthogonal (isocurvature perturbation) to the background phase space trajectories. On the other hand, our results are closer to those of Ref. [59], indicating that the approach of treating the \(R\square R\) term as a small perturbation is relevant and consistent.

Fig. 3

The contours in blue represent the constraints on the \(n_{s}\times r_{0.002}\) plane at \(68\%\) and \(95\%\) C.L. from Planck plus BICEP3/Keck plus BAO observational data [40]. In the top graph, we fix the parameter \(\beta _{0}\) and vary the others. The light red circles represent the Starobinsky\(+R^{3}\) model for \(N=52\) (smaller one) and \(N=59\) (bigger one). The yellow circles represent the complete model with \(\beta _{0}=1.5\times 10^{-2}\) for \(N=52\) and \(N=59\). As the value of \(\alpha _{0}\) increases, the region predicted by the model shifts to the left and downwards, until it crosses the \(95\%\) C.L. region. At the crossing, the curves for Starobinsky\(+R^{3}\) and for the complete model with \(\beta _{0}=1.5\times 10^{-2}\) correspond, for \(N=52\), to \(\alpha _{0}=3.5\times 10^{-5}\) and \(r_{0.002}=4.1\times 10^{-3}\) and to \(\alpha _{0}=4\times 10^{-5}\) and \(r_{0.002}=4.1\times 10^{-3}\), respectively; for \(N=59\), they correspond to \(\alpha _{0}=8.2\times 10^{-5}\) and \(r_{0.002}=2.8\times 10^{-3}\) and to \(\alpha _{0}=5.4\times 10^{-5}\) and \(r_{0.002}=2.9\times 10^{-3}\), respectively. In the bottom graph, in turn, we fix the parameter \(\alpha _{0}\) and vary the others. The light red circles represent the Starobinsky\(+R\square R\) model for \(N=52\) (smaller one) and \(N=59\) (bigger one). The green circles represent the complete model with \(\alpha _{0}=10^{-5}\) for \(N=52\) and \(N=59\). As the value of \(\beta _{0}\) increases, the region predicted by the model shifts to the right and slightly upwards, until it crosses the \(95\%\) C.L. region. At the crossing, the curves for \(N=52\) correspond approximately to \(\beta _{0}=1.7\times 10^{-2}\) and \(r_{0.002}=5.2\times 10^{-3}\); for \(N=59\), they correspond approximately to \(\beta _{0}=1.5\times 10^{-2}\) and \(r_{0.002}=3.9\times 10^{-3}\).

The other plot, Fig. 4, shows the parameter space \(\alpha _{0}\times \beta _{0}\) allowed by the observations. The top graph of Fig. 4 corresponds to \(N=52\), while the bottom graph corresponds to \(N=59\). The blue regions represent the allowed regions for the \(\alpha _{0}\) and \(\beta _{0}\) parameters at \(68\%\) and \(95\%\) C.L. Note that the plot for \(N=52\) gives a smaller allowed region than the plot for \(N=59\). In addition, we can see two regions in each plot. One is an approximately rectangular region lying completely within the \(95\%\) C.L. (for \(N=52\), the sides correspond to \(\alpha _{0}=2.8\times 10^{-5}\) and \(\beta _{0}=1.7\times 10^{-2}\), and for \(N=59\), to \(\alpha _{0}=5.3\times 10^{-5}\) and \(\beta _{0}=1.5\times 10^{-2}\)), where the parameters \(\alpha _{0}\) and \(\beta _{0}\) are not tied to each other and can assume any values independently. In this region of independence between the model parameters, we reproduce the results obtained in Refs. [42, 59] for each model separately. The other region is the asymptotic one for large values of \(\alpha _{0}\) and \(\beta _{0}\), whose occurrence suggests a dependence \(\beta _{0} = \beta _{0} (\alpha _{0})\) between the parameters.

Fig. 4

The regions in blue represent the allowed regions for the parameters \(\alpha _{0}\) and \(\beta _{0}\) at \(68\%\) and \(95\%\) C.L., from Planck plus BICEP3/Keck plus BAO observational data [40]. The top graph corresponds to \(N=52\), while the bottom graph corresponds to \(N=59\). Note that the constraints for \(N=59\) allow a larger region for the parameters \(\alpha _{0}\) and \(\beta _{0}\), in line with what we saw in Fig. 3, where the predictions for \(N=59\) lie further inside the \(68\%\) C.L. region. Note also that for large values of \(\alpha _{0}\) and \(\beta _{0}\), around \(\alpha _{0}=1.5\times 10^{-4}\) and \(\beta _{0}=2.5\times 10^{-2}\) (\(N=52\)) and \(\alpha _{0}=2.2\times 10^{-4}\) and \(\beta _{0}=1.2\times 10^{-2}\) (\(N=59\)), the predicted regions for the parameters converge to an asymptotic region. In this region, the values of \(\alpha _{0}\) and \(\beta _{0}\) appear to be tied by a constraint.

6 Final comments

The Starobinsky model is one of the most competitive candidates for describing physical inflation. In addition to having a well-grounded theoretical motivation, it is in very good agreement with recent observations [39, 40]. Motivated by the success of this model, we investigated inflation based on the higher-order gravitational action that includes all terms up to the second-order correction involving only the scalar curvature, namely, the terms \(R^{2}\), \(R^{3}\), and \(R\square R\). In this sense, our proposed model has two additional dimensionless parameters, \(\alpha _{0}\) and \(\beta _{0}\), whose values represent deviations from Starobinsky.

Unlike Ref. [58], whose multi-field treatment of the term \(R\square R\) gives an inflation described by a scalar and a vector field, here, when passing from the original frame to the Einstein frame, the model is described through the dynamics of two scalar fields \(\chi \) and \(\lambda \), where only one of them is associated with a canonical kinetic term, and whose potential is \(V\left( \chi ,\lambda \right) \) given in Eq. (3). The study of inflation in a Friedmann background, through the analysis of the critical points and phase space of the model, is essential to verify the existence of an attractor region associated with the occurrence of an inflationary regime and to determine whether such a regime has a graceful exit. We took as a basis the study of particular cases developed in Refs. [44, 58], which deal with the Starobinsky\(+R\square R\) and Starobinsky\(+R^{3}\) extensions. We saw that there is an attractor line near \(\chi _{t} \simeq 0\), corresponding to slow-roll inflation, for any value of \(\alpha _{0}<10^{-3}\) and \(\beta _{0}<3\times 10^{-2}\). Furthermore, the occurrence of such a physical inflation regime essentially depends on the initial conditions for the \(\chi \) field. If they are such that the \(\chi \) field is to the right of the critical point \(P_{c}\), the value of \(\chi \) increases indefinitely, and inflation never ends. On the other hand, a consistent physical inflationary regime with a graceful exit requires initial conditions such that \(\chi <\chi _{c}\), i.e., the field must start to the left of the critical point \(P_{c}\). Finally, we concluded the background analysis with the study of inflation in the slow-roll approximation. By defining the slow-roll factor \(\delta \), which in our analysis controls the order of the slow-roll approximation, we obtained all relevant quantities, such as \(\epsilon \) and \(\eta \), at leading order in slow-roll.

There is considerable literature on multi-field inflation models, which we took into account to develop the analysis at the perturbative level [68, 70]. The equations of motion for the scalar perturbations were obtained in the spatially flat gauge. By writing the equations at leading order in slow-roll, we saw that the scalar perturbations of the metric are subdominant relative to the perturbations \(\delta \chi \) and \(\delta \lambda \). At this point, we decomposed the perturbations into the directions tangent (adiabatic perturbations) and orthogonal (isocurvature perturbations) to the phase space background trajectories. In this way, the adiabatic \(Q_{\sigma }\) and isocurvature \(Q_{s}\) perturbations are completely separated. Such a decomposition allows us to consistently establish the curvature perturbation, which led us to observational constraints different from those obtained in Ref. [58]. The action written in terms of \(Q_{\sigma }\) and \(Q_{s}\) makes it clear that the model unavoidably contains ghost-type instabilities, since the kinetic terms have opposite signs. Next, we wrote the equations of motion in a Mukhanov–Sasaki form in order to study their solutions. We obtained the exact solutions for the perturbations as linear combinations of Hankel functions. Their analysis leads us to conclude that the isocurvature perturbation associated with the ghost field is negligible after inflation and that the adiabatic one behaves as in single-field inflation. All previous results were obtained in the slow-roll approximation. Thus, a question that remains is whether the suppression of the isocurvature perturbation holds beyond the slow-roll regime. This issue will be addressed in future work.

Finally, we confronted our model with recent observations from the Planck satellite, BICEP3/Keck and BAO [39, 40], making use of a constraint on the number of e-folds N of inflation (\(52\le N\le 59\)) based on reheating modeling [44]. For that, we made two types of plots, namely, the usual \(n_{s}\times r_{0.002}\) plane and the parameter space \(\alpha _{0}\times \beta _{0}\). In this analysis, we have three parameters: \(\alpha _{0}\), \(\beta _{0}\), and N. Thus, to build the plots, we fix one of the parameters and vary the others. Fixing the parameter \(\beta _{0}\), we observe that the region predicted by the model in the \(n_{s}\times r_{0.002}\) plane shifts to the left and slightly downwards as \(\alpha _{0}\) increases. On the other hand, fixing the parameter \(\alpha _{0}\), we notice that the predicted region shifts to the right and slightly upwards as \(\beta _{0}\) increases. By setting \(\alpha _{0}=0\), we get the Starobinsky\(+R\square R\) model. In this context, we saw that an inconsistency in establishing the curvature perturbation led the authors of Ref. [58] to obtain values of the tensor-to-scalar ratio higher than ours. In turn, by fixing the number of e-folds N, we constructed the parameter space \(\alpha _{0}\times \beta _{0}\) constrained by the observations. In general, the model predictions agree better with the observations for \(N = 59\). Our analysis conservatively restricts the parameters to maximum values of \(\alpha _{0}\sim 10^{-4}\) and \(\beta _{0}\sim 10^{-2}\). It is also worth pointing out the behavior of the \(\alpha _{0}\times \beta _{0}\) parameter space. The \(R^{3}\) and \(R\square R\) terms are second-order correction terms in energy scale and, therefore, should contribute similarly to inflation. In this sense, the joint effect of such terms is reflected in the \(\alpha _{0}\times \beta _{0}\) plot. In fact, there is a considerable region in which the parameters do not depend on each other and which we can associate with the models separately discussed in Refs. [42, 44, 58, 59]. However, there is an asymptotic region for large values of \(\alpha _{0}\) and \(\beta _{0}\), where the parameters seem to obey a constraint. In this particular region, a change in one of the parameters necessarily implies a change in the other, so that the possibility of a dependence \(\beta _{0} = \beta _{0} (\alpha _{0})\) is something to be investigated. The authors will address this topic in future research.