1 Introduction

Theories with (classical) scale-invariance provide a dynamical origin of all mass scales [1,2,3,4,5,6] and present a number of interesting aspects. They lead to naturally flat inflationary potentials [3, 7,8,9,10,11,12,13] and dark matter candidates [10, 14,15,16,17] and represent an interesting framework to address the hierarchy problem [3, 4, 10, 14, 16, 18,19,20,21,22,23,24,25]. This no-scale principle also has the virtue of being predictive, as only dimension-four operators are allowed in the classical Lagrangian, which can be viewed as a strong constraint on the allowed free parameters.

The absence of mass parameters implies that the pure gravitational piece of the Lagrangian is quadratic in the curvature tensors. The most general gravitational Lagrangian can be shown to be the sum of the Ricci scalar squared \(R^2\) and the Weyl-squared terms (modulo total derivatives). In order to solve the hierarchy problem in this framework including gravity it is necessary that the coefficients of these two terms be large enough, which corresponds to small enough gravitational couplings [3, 10]. The lower bound on the coefficient of \(R^2\) depends on the non-minimal coupling between R and the scalar fields (another possible scale-invariant term in the Lagrangian) [10]. The lower bound on the Weyl-squared term is instead model-independent and, as we will see, leads to potentially observable effects in cosmology.

Therefore, if one supposes that nature does not have fundamental scales and all observed masses are in fact dynamically generated one is led to conjecture that the fundamental theory of gravity contains only terms quadratic in curvature in addition to all possible scale-invariant couplings to matter. Moreover, the requirement that this four-derivative gravity theory cures the Higgs naturalness problem leads to the possibility of testing this hypothesis with observations of the early universe.

This scenario is indeed known to provide a renormalisable theory of gravity and of all other interactions [3, 26]. The price to pay is the presence of a ghost, a field whose quanta should include negative norms. This field has spin-2 and a mass \(M_2\) fixed by the coefficient of the Weyl-squared term; the above-mentioned naturalness bound corresponds to \(M_2\lesssim 10^{11}\) GeV [3], realising in an explicit theory the softened gravity idea of [27]: for energies below \(M_2\) one recovers general relativity coupled to an ordinary matter sector [28], while above the threshold \(M_2\) gravity gets softened compared to the behaviour in Einstein’s gravity. It has been proposed that such a theory could make sense above \(M_2\) as a Lee–Wick theory [29,30,31,32]: the ghost is unstable and does not appear among the asymptotic states, leading to a unitary S-matrix. Also, recently there has been progress on the quantum mechanics of four-derivative theories [33, 34] and renewed interest in four-derivative gravity [35,36,37,38,39,40,41,42,43,44,45,46,47]. References [48, 49] claimed, however, that the Lee–Wick option might result in a violation of causality. Our approach to this problem in the present work is practical: we wish to understand whether such a particle can be compatible with the available observations of the early universe. We leave the analysis of the remaining theoretical issues for future work.

The present work is intended to be a complete study of inflationary perturbations in general scale-invariant theories (which will be defined in detail in Sect. 2). Indeed, the presence of a naturally flat potential is only a good starting point for successful inflation. The observational quantities that can be measured are the result of tiny quantum perturbations generated during inflation, which have been later amplified and whose effects are observable today (see [50, 51] for a textbook introduction). Partial results on this subject have already been published [10, 52,53,54,55,56,57,58,59], but the present work covers the general set of classically scale-invariant theories. Indeed, our analysis includes a large class of models such as the most general two-derivative theories with an arbitrary number of scalar fields and a detailed discussion of the role of the Weyl-squared term. The bound from a natural electroweak (EW) scale, \(M_2\lesssim 10^{11}\) GeV, suggests that such a term becomes relevant whenever the Hubble rate during inflation exceeds roughly the scale of \(10^{11}\) GeV. The results we find actually also hold for some non-scale-invariant theories: they apply to generic renormalisable models. Indeed, after scale invariance is broken dynamically all the scales, such as the Planck scale, the EW scale and the cosmological constant, appear. Given that we include them as effective mass parameters in the Lagrangian, we are able to show that the results we find hold for scale-invariant as well as general renormalisable models. We also revisit the studies that have already appeared in the literature; in some cases we confirm the previous results with improved derivations, while in other cases we correct some expressions for the perturbations.

After a general discussion of the perturbations in Sect. 3, the scalar, vector and tensor perturbations are all analysed in detail (in Sects. 4, 5, 6, respectively) and the relevant degrees of freedom are identified. We explain how the known perturbations in Einstein’s gravity coupled to matter are reproduced when \(M_2\) is much bigger than the Hubble rate during inflation. We use both a Lagrangian and a Hamiltonian approach (which are introduced in Appendix A for four-derivative theories). The Lagrangian one is used to derive the form of the perturbations, while the Hamiltonian formalism (developed in Appendix B) allows us to show that the full conserved Hamiltonian of the perturbations does not feature negative energies if appropriately quantised.

Among the most important results we find are expressions (given in Sect. 7) for the potentially observable quantities in a general no-scale theory: the curvature and tensor power spectra (with the derived formulae for the tensor-to-scalar ratio r and the curvature spectral index \(n_\mathrm{s}\)) as well as a detailed discussion of the power spectra of the other perturbations. In particular the Weyl-squared term can lead (depending on the size of its coefficient) to an isocurvature mode whose amplitude is typically small, as it turns out to be smaller than the tensor amplitude in Einstein’s theory.

In Sect. 8 we also apply these results to a specific model with all terms quadratic in curvature, the Higgs field and a scalar that generates the Planck scale (and therefore dubbed “the planckion”). This model can satisfy the most recent observational bounds of Planck [60] on \(n_\mathrm{s}\) and r as well as on the isocurvature power spectra. Interestingly, when the coefficient of the Weyl-squared term is large enough even an inflation due to a quadratic potential is allowed; for example, an inflation triggered by the planckion is permitted, eliminating a tension that was previously found neglecting the Weyl-squared term [10]. In this case the predictions of planckion inflation are close to Planck’s observational bounds on isocurvature perturbations, which suggests that this possibility can be tested with future observations.

Finally, in Sect. 9, we argue that the possible issues due to the ghost, which we have discussed above, do not lead to phenomenological problems in some no-scale models (including the one we have just mentioned), at least if one saturates the bound on \(M_2\) required by the solution of the hierarchy problem.

2 Scale-invariant theories

In this work we consider general (classically) scale-invariant theories. The action is

$$\begin{aligned} S= & {} \int \mathrm{d}^4x \sqrt{|\det g|} \mathscr {L}, \quad \mathscr {L}= \mathscr {L}_\mathrm{gravity} + \mathscr {L}_\mathrm{matter}\nonumber \\&+ \mathscr {L}_{\mathrm{non}{{\text {-}}\mathrm{minimal}}} . \end{aligned}$$
(2.1)

The term \( \mathscr {L}_\mathrm{gravity}\) is the pure gravitational piece. The no-scale principle dictates that it is given by the most general Lagrangian quadratic in curvature,

$$\begin{aligned} \mathscr {L}_\mathrm{gravity} = \alpha R^2 +\beta R_{\mu \nu }^2 +\gamma R_{\mu \nu \rho \sigma }^2. \end{aligned}$$
(2.2)

The dimensionless parameters \(\alpha \), \(\beta \) and \(\gamma \) are not all independent because of the Gauss–Bonnet relation in four dimensions, according to which [26]

$$\begin{aligned} \sqrt{|\det g|}(R^2 -4 R_{\mu \nu }^2 + R_{\mu \nu \rho \sigma }^2) = \text{ divergence }. \end{aligned}$$
(2.3)

Therefore, we can restrict our attention to \(R^2\) and \(R_{\mu \nu }^2\) and write their most general linear combination as

$$\begin{aligned} \mathscr {L}_\mathrm{gravity} = \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} +\frac{R^2}{6f_0^2}. \end{aligned}$$
(2.4)

We have grouped together \(\frac{1}{3} R^2-R_{\mu \nu }^2\) because this quantity is (up to total derivatives) proportional to the square of the Weyl tensor \(W_{\mu \nu \rho \sigma }\):

$$\begin{aligned}&\int \mathrm{d}^4x \sqrt{|\det g|} \,\left[ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} \right] \nonumber \\&\quad = -\frac{1}{2f_2^2} \int \mathrm{d}^4x \sqrt{|\det g|} W_{\mu \nu \rho \sigma }^2. \end{aligned}$$
(2.5)
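
The proportionality used in (2.5) can be checked in a few lines. The sketch below assumes the standard four-dimensional decomposition of the squared Weyl tensor (a textbook identity that is not written out in the text above) and combines it with the Gauss–Bonnet relation (2.3); "divergence" is understood after multiplication by \(\sqrt{|\det g|}\):

```latex
\begin{aligned}
W_{\mu\nu\rho\sigma}^2 &= R_{\mu\nu\rho\sigma}^2 - 2R_{\mu\nu}^2 + \tfrac{1}{3}R^2
  && \text{(four dimensions)} \\
R_{\mu\nu\rho\sigma}^2 &= 4R_{\mu\nu}^2 - R^2 + \text{divergence}
  && \text{(from (2.3))} \\
\Rightarrow\quad W_{\mu\nu\rho\sigma}^2 &= -2\left(\tfrac{1}{3}R^2 - R_{\mu\nu}^2\right) + \text{divergence}.
\end{aligned}
```

Dividing by \(f_2^2\) shows that the first term of (2.4) equals \(-W_{\mu \nu \rho \sigma }^2/(2f_2^2)\) up to total derivatives.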

The piece \(\mathscr {L}_\mathrm{non-minimal}\) represents the non-minimal couplings between the scalar fields \(\phi ^a\) and the Ricci scalar

$$\begin{aligned} \mathscr {L}_{\mathrm{non}{\text {-}}\mathrm{minimal}} = -\frac{1}{2} \xi _{ab} \phi ^a\phi ^b R. \end{aligned}$$
(2.6)

Finally, \(\mathscr {L}_\mathrm{matter}\) is the remaining part of the Lagrangian, which depends on the matter fields: the gauge fields \(A^B_\mu \) (with field strength \(F_{\mu \nu }^B\)), the fermions \(\psi _j\) and the scalars \(\phi ^a\). The absence of dimensionful parameters is so restrictive that we can write their Lagrangian in one line:

(2.7)

All the coefficients defined so far are dimensionless and so respect the no-scale principle. However, the scales we observe in nature, such as the Planck or the weak scale, must of course be generated. This can occur in two alternative ways: perturbatively or non-perturbatively. In the first case we assume that some scalar field(s) acquire a vacuum expectation value (VEV) which gives rise to the mass scales [3]; in Sect. 8 we will illustrate an example of this type. In the second case one supposes that some strongly coupled sector (such as an SU(n) gauge theory) confines and thus generates the observed scales through its coupling to the other sectors (e.g. its gravitational couplings) [1]. After this has happened the Planck mass, the weak scale, the cosmological constant, etc. appear in the Lagrangian as effective parameters. We will therefore introduce these quantities directly in the action in the following sections. This will also allow us to be more general and cover arbitrary renormalisable models.

2.1 The action in the Einstein frame

Since our task is to study inflationary perturbations, from now on we restrict our attention to the scalar–tensor sector of the theory. Of course, fermions and gauge fields should also be present (with a Lagrangian of the form (2.7)) both for phenomenological reasons and to realise the dynamical generation of scales that we have discussed [1, 3, 10].

The most general renormalisable scalar–tensor theory is (up to total derivatives)

$$\begin{aligned} S_\mathrm{st}= & {} \int \mathrm{d}^4x \sqrt{|\det g|} \,\left[ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} +\frac{R^2}{6f_0^2} -\frac{\bar{M}_\mathrm{Pl}^2+F(\phi )}{2}R\right. \nonumber \\&\left. + \frac{1}{2} (\partial _\mu \phi ^a)^2 - V(\phi ) \right] , \end{aligned}$$
(2.8)

where \(\bar{M}_\mathrm{P}\) is the reduced Planck mass. Renormalisability requires \(F(\phi ) = \xi _{ab} \phi ^a\phi ^b+...\) and \(V(\phi ) = \lambda _{abcd} \phi ^a\phi ^b\phi ^c\phi ^d/4!+...\) (where the dots are terms with lower powers of \(\phi ^a\)) in the bare Lagrangian; however, we will keep these functions general in the following to take into account possible field dependence of the couplings induced by the renormalisation group. As we have discussed in the introduction, the presence of the terms quadratic in curvature promotes gravity to a renormalisable theory, but also introduces a spin-2 ghost. The mass of this field is \(M_2 = f_2\bar{M}_\mathrm{P}/\sqrt{2}\) [26]. One of the motivations of this work is to understand whether the presence of this ghost is consistent with the observations of the early universe we have available.

To proceed further it is useful to rewrite the action in a more familiar form. We follow here the method presented in [10], appropriately extended to take into account the effective mass parameters. The non-standard \(R^2\) term can be removed by introducing an auxiliary field \(\chi \) with a Lagrangian that vanishes on-shell:

$$\begin{aligned} - \sqrt{|\det g|} \frac{(R+3f_0^2 \chi /2)^2}{6f_0^2}, \end{aligned}$$
(2.9)

which we are therefore free to add to the total Lagrangian. Once we have done so the action reads

$$\begin{aligned} S_\mathrm{st}= & {} \int \mathrm{d}^4x \sqrt{|\det g|} \,\left[ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} -\frac{f(\chi ,\phi )}{2} R \right. \nonumber \\&\left. +\, \frac{1}{2}(\partial _\mu \phi ^a)^2- V(\phi ) -\frac{3f_0^2\chi ^2}{8} \right] , \end{aligned}$$
(2.10)

where \(f(\chi ,\phi )\equiv \bar{M}_\mathrm{P}^2 + F(\phi )+\chi \). In order to get rid of the remaining non-standard fR term we perform a conformal transformation of the metric,

$$\begin{aligned} g_{\mu \nu } \rightarrow \frac{\bar{M}_\mathrm{P}^2}{f}g_{\mu \nu } , \end{aligned}$$
(2.11)

and the action becomes

$$\begin{aligned} S_\mathrm{st}= & {} \int \mathrm{d}^4x \sqrt{|\det g|} \,\Bigg \{ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} -\frac{\bar{M}_\mathrm{Pl}^2}{2} R \nonumber \\&+\bar{M}_\mathrm{P}^2 \left[ \frac{(\partial _\mu \phi ^a)^2}{2f} +\frac{3(\partial _\mu f)^2}{4f^2}\right] - U \Bigg \}, \end{aligned}$$
(2.12)

where

$$\begin{aligned} U = \frac{\bar{M}_\mathrm{P}^4}{f^2}\left( V +\frac{3f_0^2}{8} \chi ^2\right) . \end{aligned}$$
(2.13)
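
The algebra behind the auxiliary field can be verified directly. The following minimal sketch treats R, \(\chi \) and \(f_0\) as plain numbers at a single spacetime point (an illustrative simplification; the function names are hypothetical) and checks with exact rational arithmetic that adding (2.9) to the \(R^2/(6f_0^2)\) term of (2.8) leaves only the pieces \(-\chi R/2\) and \(-3f_0^2\chi ^2/8\) that appear in (2.10):

```python
from fractions import Fraction


def r2_plus_auxiliary(R, chi, f0):
    """R^2/(6 f0^2) plus the auxiliary Lagrangian (2.9), without sqrt|det g|."""
    return R**2 / (6 * f0**2) - (R + 3 * f0**2 * chi / 2) ** 2 / (6 * f0**2)


def linearised_form(R, chi, f0):
    """The chi-dependent terms of (2.10): -chi R/2 - 3 f0^2 chi^2/8."""
    return -chi * R / 2 - 3 * f0**2 * chi**2 / 8


def chi_on_shell(R, f0):
    """Solution of the chi field equation, R + 3 f0^2 chi/2 = 0."""
    return -2 * R / (3 * f0**2)


# Arbitrary rational sample values (hypothetical, for illustration only).
R, chi, f0 = Fraction(7, 3), Fraction(-5, 2), Fraction(3, 4)
identity_holds = r2_plus_auxiliary(R, chi, f0) == linearised_form(R, chi, f0)
```

Evaluating `r2_plus_auxiliary` at `chi_on_shell(R, f0)` returns the original \(R^2/(6f_0^2)\) term, which is the statement that (2.9) vanishes on-shell.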

The form of the action in (2.12) is known as the Einstein frame because all the non-minimal couplings have been removed. The field f can now be seen as an extra scalar degree of freedom. It is interesting to note that the remaining parts of the action, which depend on the fermions and the gauge fields, remain invariant under the conformal transformation if we also transform the matter fields as follows:

$$\begin{aligned} \phi ^a \rightarrow \left( \frac{f}{\bar{M}_\mathrm{P}^2}\right) ^{1/2} \phi ^a, \quad \psi _j \rightarrow \left( \frac{f}{\bar{M}_\mathrm{P}^2}\right) ^{3/4} \psi _j, \quad A_\mu ^B \rightarrow A_\mu ^B.\nonumber \\ \end{aligned}$$
(2.14)

However, the scalar kinetic terms are not invariant, so we have not performed this transformation in (2.12). To simplify further the action we define \(\zeta =\sqrt{6f}\) (notice that in order for the metric redefinition in (2.11) to be regular one has to have \(f>0\) and thus we can safely take the square root of f) and we obtain

$$\begin{aligned} S_\mathrm{st}= & {} \int \mathrm{d}^4x \sqrt{|\det g|} \,\left\{ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} -\frac{\bar{M}_\mathrm{Pl}^2}{2} R \right. \nonumber \\&+\left. \frac{6\bar{M}_\mathrm{P}^2}{2\zeta ^2} [(\partial _\mu \phi ^a)^2 + (\partial _\mu \zeta )^2]- U(\zeta ,\phi ) \right\} , \end{aligned}$$
(2.15)

where

$$\begin{aligned} U(\zeta ,\phi ) = \frac{36 \bar{M}_\mathrm{P}^4}{\zeta ^4}\left[ V(\phi ) +\frac{3f_0^2}{8} \left( \frac{\zeta ^2}{6} - \bar{M}_\mathrm{P}^2 - F(\phi )\right) ^2\right] .\nonumber \\ \end{aligned}$$
(2.16)
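
The passage from (2.12) and (2.13) to (2.15) and (2.16) only uses \(f=\zeta ^2/6\) and the chain rule. A minimal check of the resulting coefficients, with exact rational arithmetic and a hypothetical sample value of \(\zeta \) (the helper names are illustrative, and the overall factors of \(\bar{M}_\mathrm{P}\) are stripped off):

```python
from fractions import Fraction


def kinetic_coefficients(zeta):
    """Coefficients of (d phi)^2 and (d zeta)^2 in (2.12), divided by M_P^2."""
    f = zeta**2 / 6
    df_dzeta = zeta / 3                        # chain rule: d_mu f = (zeta/3) d_mu zeta
    coeff_phi = 1 / (2 * f)                    # from (d phi)^2 / (2 f)
    coeff_zeta = 3 * df_dzeta**2 / (4 * f**2)  # from 3 (d f)^2 / (4 f^2)
    return coeff_phi, coeff_zeta


def potential_prefactor(zeta):
    """The factor 1/f^2 multiplying the bracket in (2.13), for f = zeta^2/6."""
    return 1 / (zeta**2 / 6) ** 2


zeta = Fraction(11, 5)  # arbitrary positive sample value (hypothetical)
coeff_phi, coeff_zeta = kinetic_coefficients(zeta)
```

Both kinetic coefficients come out equal to \(3/\zeta ^2 = 6/(2\zeta ^2)\), and the potential prefactor equals \(36/\zeta ^4\), in agreement with (2.15) and (2.16).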

Therefore, we have been able to write the action as the sum of two pieces:

$$\begin{aligned} S= S_\mathrm{W} + S_\mathrm{ES}, \end{aligned}$$
(2.17)

where \(S_\mathrm{W}\) is the part due to the unusual Weyl-squared term,

$$\begin{aligned} S_\mathrm{W}=\int \mathrm{d}^4x \sqrt{|\det g|} \,\left[ \frac{\frac{1}{3} R^2-R_{\mu \nu }^2}{f_2^2} \right] , \end{aligned}$$
(2.18)

and \(S_\mathrm{ES}\) is the Einstein–Hilbert part plus the scalar field piece equipped with a non-trivial field metric,

$$\begin{aligned} S_\mathrm{ES}=\int \mathrm{d}^4x \sqrt{|\det g|} \,\left[ -\frac{\bar{M}_\mathrm{Pl}^2}{2}R + \frac{K_{ij}(\phi ) }{2}\partial _\mu \phi ^i\partial ^\mu \phi ^j- U(\phi ) \right] .\nonumber \\ \end{aligned}$$
(2.19)

Here, for notational simplicity, we have introduced a new set of fields \(\phi ^i\) where the index i runs over the possible values of the index a plus \(\zeta \), in other words \(\phi ^i=\{\phi ^a, \zeta \}\). Also the field metric is given by

$$\begin{aligned} K_{ij} = \frac{6\bar{M}_\mathrm{P}^2}{\zeta ^2} \delta _{ij}. \end{aligned}$$
(2.20)

2.2 FRW background and slow-roll inflation

In this section we consider a Friedmann–Robertson–Walker (FRW) metric

$$\begin{aligned} \mathrm{d}s^2 = \mathrm{d}t^2 -a(t)^2 [\mathrm{d}r^2+r^2(\mathrm{d}\theta ^2 +\sin ^2\theta \mathrm{d}\phi ^2)], \end{aligned}$$
(2.21)

where a is the scale factor and we have neglected the spatial curvature parameter as during inflation the energy density is dominated by the scalar fields. The hypotheses of homogeneity and isotropy will be relaxed in the next section where the perturbations around the FRW metric will be considered.

For the following analysis of the perturbations it is convenient to introduce the conformal time \(\eta \) defined as usual by

$$\begin{aligned} \eta = \int _{t^*}^t \frac{\mathrm{d}t'}{a(t')}, \end{aligned}$$
(2.22)

where \(t^*\) is some reference time; in the following we will choose \(t^* \rightarrow \infty \). The FRW metric in terms of \(\eta \) is

$$\begin{aligned} \mathrm{d}s^2 = a(\eta )^2(\mathrm{d}\eta ^2 -\delta _{ij}\mathrm{d}x^i\mathrm{d}x^j). \end{aligned}$$
(2.23)

The Einstein equations are

$$\begin{aligned}&\mathcal{H}^2=\frac{K_{ij} \phi '^i \phi '^j/2+a^2U}{3 \bar{M}_\mathrm{Pl}^2} , \end{aligned}$$
(2.24)
$$\begin{aligned}&\mathcal{H}^2-\mathcal{H}'= \frac{K_{ij} \phi '^i \phi '^j}{ 2\bar{M}_\mathrm{Pl}^2}, \end{aligned}$$
(2.25)

where \(\mathcal{H} \equiv a'/a\) (related to \(H\equiv \dot{a}/a\) by \(\mathcal{H} = aH\)) and a prime denotes a derivative with respect to \(\eta \), while a dot is a derivative with respect to t. The equations for the scalar fields are instead

$$\begin{aligned} \phi ''^i +\gamma ^i_{jk} \phi '^j \phi '^k +2\mathcal{H} \phi '^i+a^2U^{,i}=0. \end{aligned}$$
(2.26)

Here, for a generic function F of the scalar fields, we defined \(F_{,i}\equiv \partial F/\partial \phi ^i\); also, \(\gamma ^i_{jk}\) is the affine connection in the scalar field space

$$\begin{aligned} \gamma ^i_{jk}\equiv \frac{K^{il}}{2}(K_{lj,k}+K_{lk,j}-K_{jk,l}) \end{aligned}$$
(2.27)

and \(K^{ij}\) denotes the inverse of the field metric (which is used to raise and lower the scalar indices \(i,j,k,\ldots \)); for example \(F^{,i}\equiv K^{ij}F_{,j}\).
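
For the specific field metric (2.20) the connection (2.27) can be written in closed form. The following is a sketch obtained by inserting \(K_{ij}=6\bar{M}_\mathrm{P}^2\delta _{ij}/\zeta ^2\) into (2.27); the symbol \(\delta ^{\zeta }_{j}\) (equal to 1 when \(\phi ^j=\zeta \) and 0 otherwise) is notation introduced here for illustration only:

```latex
\begin{aligned}
K_{lj,k} &= -\frac{2}{\zeta}\, K_{lj}\, \delta^{\zeta}_{k}, \\
\gamma^i_{jk} &= -\frac{1}{\zeta}\left( \delta^i_j\, \delta^{\zeta}_{k}
  + \delta^i_k\, \delta^{\zeta}_{j} - \delta_{jk}\, \delta^{i\zeta} \right), \\
\text{e.g.}\quad \gamma^{\zeta}_{\zeta\zeta} &= -\frac{1}{\zeta}, \qquad
\gamma^{\zeta}_{ab} = \frac{\delta_{ab}}{\zeta}, \qquad
\gamma^{a}_{b\zeta} = -\frac{\delta^{a}_{b}}{\zeta}.
\end{aligned}
```

The Planck mass drops out because the connection is insensitive to a constant rescaling of the field metric.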

Notice that in the case of pure de Sitter spacetime (\(a(\eta )=-1/(H\eta )\), \(\phi '^i=0,\) \(U_{,i} =0\)) Eq. (2.24) tells us \( \mathcal{H}^2 = a^2 U/(3\bar{M}_\mathrm{P}^2)\) and Eq. (2.25) implies \(\mathcal{H}' = \mathcal{H}^2\).

In the slow-roll regime we require the scalar equation to reduce to

$$\begin{aligned} \phi '^i\approx -\frac{a^2}{3\mathcal{H}}U^{,i} \end{aligned}$$
(2.28)

so

$$\begin{aligned} \phi ''^i +\gamma ^i_{jk} \phi '^j \phi '^k\approx \mathcal{H} \phi '^i. \end{aligned}$$
(2.29)

Moreover,

$$\begin{aligned} \mathcal{H}^2\approx \frac{a^2U}{3 \bar{M}_\mathrm{Pl}^2}. \end{aligned}$$
(2.30)
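
The two approximate relations (2.29) and (2.30) follow from the exact equations once (2.28) is assumed. A short sketch of the steps, which uses only (2.24), (2.26) and (2.28):

```latex
\begin{aligned}
\phi''^i + \gamma^i_{jk}\phi'^j\phi'^k
  &= -2\mathcal{H}\phi'^i - a^2 U^{,i}
  && \text{(exact, from (2.26))} \\
  &\approx -2\mathcal{H}\phi'^i + 3\mathcal{H}\phi'^i = \mathcal{H}\phi'^i
  && \text{(using (2.28): } a^2U^{,i}\approx -3\mathcal{H}\phi'^i\text{)} \\
\mathcal{H}^2 &= \frac{K_{ij}\phi'^i\phi'^j/2 + a^2U}{3\bar{M}_\mathrm{Pl}^2}
  \approx \frac{a^2U}{3\bar{M}_\mathrm{Pl}^2}
  && \text{(kinetic term neglected in (2.24))}
\end{aligned}
```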

The slow-roll occurs when two conditions are satisfied [42] (see also [62] for previous studies):

$$\begin{aligned} \epsilon \equiv \frac{ \bar{M}_\mathrm{Pl}^2 U_{,i}U^{,i}}{2U^2} \ll 1. \end{aligned}$$
(2.31)
$$\begin{aligned} \left| \frac{\eta ^{i}_{\,\,\, j} U^{,j}}{U^{,i}}\right| \ll 1 \quad \text{(i } \text{ not } \text{ summed), } \quad \text{ where }\quad \eta ^{i}_{\,\,\, j}\equiv \frac{\bar{M}_\mathrm{Pl}^2 U^{;i}_{\,\,\, ;j}}{U}.\nonumber \\ \end{aligned}$$
(2.32)

It is easy to check that \(\epsilon \) and \(\eta ^i_{\,\,\, j}\) reduce to the well-known single-field slow-roll parameters in the presence of only one field.

We now introduce the number of e-folds for a generic multi-field theory. By writing the equations in (2.28) and (2.30) in terms of the cosmic time t we obtain the following dynamical system for \(\phi ^i\):

$$\begin{aligned} \dot{\phi }^i=-\frac{\bar{M}_\mathrm{Pl}U^{,i}(\phi )}{\sqrt{3U(\phi )}}, \end{aligned}$$
(2.33)

which can be solved with a condition at some initial time \(t_0\): namely \(\phi ^i(t_0)=\phi ^i_0\). Once the functions \(\phi ^i(t)\) are known we can obtain H(t) from Eq. (2.30). The number of e-folds N can be introduced by

$$\begin{aligned} N(\phi _0) \equiv \int _{t_\mathrm{e}}^{t_0(\phi _0)} \mathrm{d}t' H(t'), \end{aligned}$$
(2.34)

where \(t_\mathrm{e}\) is the time when inflation ends. Dropping the label on \(t_0\) and \(\phi _0\) as they are generic values we have

$$\begin{aligned} N(\phi ) \equiv \int _{t_\mathrm{e}}^{t(\phi )} \mathrm{d}t' H(t'). \end{aligned}$$
(2.35)

Notice that we write t as a function of \(\phi \): this is because once the initial position \(\phi \) in field space is fixed the time required to go from \(\phi \) to the field value when inflation ends is fixed too because the dynamical system in (2.33) is of the first order. Note, however, that H also generically depends on \(\phi \).
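
As an illustration of how the first-order system (2.33) fixes the number of e-folds, the sketch below integrates it numerically in a deliberately simple toy setting. The assumptions are not taken from the text: a single field with trivial field metric \(K=1\), a quadratic potential with a hypothetical mass, inflation ending at \(\epsilon =1\), and e-folds counted as positive from the initial field value to the end of inflation. All function names are illustrative.

```python
import math

M_PL = 1.0    # reduced Planck mass, in units where it equals 1
MASS = 1e-6   # hypothetical inflaton mass in the same units


def potential(phi):
    return 0.5 * MASS**2 * phi**2


def d_potential(phi):
    return MASS**2 * phi


def epsilon(phi):
    # Slow-roll parameter (2.31) for a single field with K = 1
    return 0.5 * M_PL**2 * (d_potential(phi) / potential(phi)) ** 2


def rhs(phi):
    # (d phi/dt, dN/dt): Eq. (2.33) and H from Eq. (2.30) in cosmic time
    u = potential(phi)
    return -M_PL * d_potential(phi) / math.sqrt(3 * u), math.sqrt(u / 3) / M_PL


def efolds(phi0, dt=1e-3 / MASS):
    """Integrate phi(t) and the accumulated e-folds with fixed-step RK4."""
    phi, n = phi0, 0.0
    while epsilon(phi) < 1.0:
        k1p, k1n = rhs(phi)
        k2p, k2n = rhs(phi + 0.5 * dt * k1p)
        k3p, k3n = rhs(phi + 0.5 * dt * k2p)
        k4p, k4n = rhs(phi + dt * k3p)
        phi += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        n += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6
    return n


def efolds_analytic(phi0):
    # Closed form for the quadratic toy potential, with phi_end^2 = 2 M_PL^2
    return (phi0**2 - 2 * M_PL**2) / (4 * M_PL**2)
```

Calling `efolds(15.0)` and `efolds_analytic(15.0)` gives two values that agree to within the size of one integration step, which shows that the initial field value alone determines N in the slow-roll regime.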

3 Perturbations (generalities)

By choosing the conformal Newtonian gauge, the metric describing the small fluctuations around the FRW spacetime can be written as

$$\begin{aligned} \mathrm{d}s^2= & {} a(\eta )^2\{ (1+2\Phi (\eta , \vec {x})) \mathrm{d}\eta ^2 -2 V_i(\eta , \vec {x}) \mathrm{d}\eta \mathrm{d}x^i \nonumber \\&- [(1-2\Psi (\eta , \vec {x})) \delta _{ij}+h_{ij}(\eta , \vec {x})]\mathrm{d}x^i\mathrm{d}x^j\}. \end{aligned}$$
(3.1)

Here, by definition, the vector perturbations \(V_i\) satisfy

$$\begin{aligned} \partial _iV_i=0 \end{aligned}$$
(3.2)

and the tensor perturbations obey

$$\begin{aligned} h_{ij}=h_{ji}, \quad h_{ii} =0, \quad \partial _ih_{ij}=0. \end{aligned}$$
(3.3)

Sometimes the Newtonian gauge is defined for the scalar perturbations \(\Phi \) and \(\Psi \) only (see e.g. [51]). Here we consider an extended definition, which also includes the non-scalar perturbations. A possible gauge-dependent divergence of \(h_{ij}\) has been set to zero by appropriately choosing the gauge.

Also we decompose the scalar fields \(\phi ^i(\eta , \vec {x})\) in the background \( \phi ^i(\eta )\), which are only time-dependent, plus the fluctuations \(\varphi ^i(\eta , \vec {x})\),

$$\begin{aligned} \phi ^i(\eta , \vec {x}) \rightarrow \phi ^i(\eta ) + \varphi ^i(\eta , \vec {x}) . \end{aligned}$$
(3.4)

As is well known there is no mixing between tensor, vector and scalar perturbations from the part \(S_\mathrm{ES}\) of the action. The same is also true for the Weyl term \(S_\mathrm{W}\). Indeed, that property only follows from (3.2) and (3.3) and rotation invariance. We therefore study the various sectors separately in the following.

Previous studies of the perturbations in less general setups can be found in [52,53,54,55,56,57,58,59]. We will also revisit these studies and find some differences with some of them, which will be discussed in the following.

4 Scalar perturbations

Let us start with the scalar perturbations, whose quadratic action we denote with \(S^{(S)}\). This action has a contribution from the Weyl-squared term and one from the remaining terms, \(S^{(S)}=S^{(S)}_\mathrm{W}+S^{(S)}_\mathrm{ES}\), where

$$\begin{aligned} S^{(S)}_\mathrm{W}= & {} -\frac{2}{3f_2^2}\int \mathrm{d}^4 x [\vec {\nabla }^2(\Phi +\Psi )]^2, \end{aligned}$$
(4.1)
$$\begin{aligned} S^{(S)}_\mathrm{ES}= & {} \int \mathrm{d}^4 x\frac{a^2}{2} \{\bar{M}_\mathrm{P}^2[-6 \Psi '^2 -12 \mathcal{H} \Phi \Psi ' +4 \Psi \vec {\nabla }^2 \Phi \nonumber \\&-2 \Psi \vec {\nabla }^2 \Psi - 2 ( \mathcal{H}'+2\mathcal{H}^2) \Phi ^2] \nonumber \\&+K_{ij} (\varphi '^i\varphi '^j +\varphi ^i\vec {\nabla }^2\varphi ^j) +2 K_{ij,l} \phi '^i\varphi '^j \varphi ^l \nonumber \\&-(\Phi +3\Psi ) (2K_{ij} \phi '^i \varphi '^j + K_{ij,l} \phi '^i\phi '^j\varphi ^l) \nonumber \\&-a^2 U_{,i,j}\varphi ^i\varphi ^j -2a^2(\Phi -3\Psi ) U_{,i} \varphi ^i\}. \end{aligned}$$
(4.2)

In order to derive \(S^{(S)}_\mathrm{ES}\) we have used the background equations (2.24) and (2.25) and we have dropped total derivatives. Notice that the field metric \(K_{ij}\) is totally general in this expression and does not have to satisfy Eq. (2.20), which is characteristic of renormalisable theories.

The fact that the time-derivative of the perturbation \(\Phi \) does not appear in the action above tells us that it should be considered as a non-dynamical field.

One might wonder why there are no more than two time-derivatives in the action for scalar perturbations even if we started from an action with four derivatives. The reason is that in Einstein gravity there are no independent degrees of freedom in the scalar sector, while with the addition of the Weyl-squared term we should find one degree of freedom (the one corresponding to the helicity-0 component of the massive spin-2 field). If we had found four time-derivatives in the scalar action we would have two scalar degrees of freedom instead of one, because a four-derivative system can always be interpreted as a two-derivative system with the number of degrees of freedom doubled.

4.1 Pure de Sitter

The expression in (4.2) simplifies considerably in the case of pure de Sitter spacetime, \(a(\eta )=-1/(H\eta )\), for which \(\mathcal{H}' = \mathcal{H}^2\), \(\phi '^i=0\) and \(U_{,i} =0\). Moreover, we know that the de Sitter spacetime is a reasonably good approximation during inflation because we assume that the slow-roll conditions in (2.31) and (2.32) are satisfied. We therefore consider this case in the present section. In the next one we will study the small departures from this spacetime due to the small, but non-zero time-dependence of the scalar fields.

4.1.1 Scalar field perturbations

We start from the scalar field perturbations \(\varphi ^i\). We now show that, thanks to the special expression of the field metric in (2.20) that is realised for any renormalisable model, the mixing between the different \(\varphi ^i\) can be eliminated with a field redefinition and one can analyse the various \(\varphi ^i\) separately.

The field equations of \(\varphi ^i\) are

$$\begin{aligned} K_{ij} \varphi ''^j + 2\mathcal{H} K_{ij} \varphi '^j - K_{ij}\vec {\nabla }^2\varphi ^j + a^2 U_{,i,j} \varphi ^j = 0. \end{aligned}$$
(4.3)

We now multiply this equation by \(K^{il}\) and sum over i, to obtain

$$\begin{aligned} \varphi ''^l+ 2\mathcal{H} \varphi '^l - \vec {\nabla }^2 \varphi ^l +a^2 U_{,i,j} K^{il} \varphi ^j = 0. \end{aligned}$$
(4.4)

Notice that the matrix \(m^2\) with elements \(m^{2\,l}_{\,\,\, \,\, j} \equiv U_{,i,j} K^{il}\) is symmetric as a consequence of the symmetry of U, i.e. \(U_{,i,j} = U_{,j,i}\) and the proportionality between \(K_{ij}\) and \(\delta _{ij}\) (Eq. (2.20)). Therefore we can always diagonalise \(m^2\) with an orthogonal transformation, \(\varphi \rightarrow \mathcal{O} \varphi \), and, after this transformation, the equation for \(\varphi \) is (suppressing the index i for simplicity)

$$\begin{aligned} \varphi ''+2\mathcal{H} \varphi '+(m^2 a^2-\vec {\nabla }^2) \varphi = 0 \end{aligned}$$
(4.5)

and the corresponding action is (rescaling the field to have a canonically normalised kinetic term)

$$\begin{aligned} S_{\varphi }= \int \mathrm{d}^4 x \mathscr {L}_{\varphi }, \quad \text{ where }\quad \mathscr {L}_{\varphi } = \frac{a^2}{2} \{\varphi '^2 + \varphi \vec {\nabla }^2\varphi -m^2 a^2\varphi ^2\}.\nonumber \\ \end{aligned}$$
(4.6)

We can now quantise the theory with the standard canonical procedure. Of course, it is well known how to do this for a scalar field on de Sitter space. Nevertheless, here we revisit this procedure in a way that will be useful to study \(\Psi \) as well as the vector and tensor perturbations, which, as we will see, require an unusual quantisation in the presence of the Weyl-squared term. We introduce the conjugate momentum

$$\begin{aligned} \pi _{\varphi } = \frac{\partial \mathscr {L}_{\varphi }}{\partial \varphi '} = a^2 \varphi ' \end{aligned}$$
(4.7)

and we impose the canonical quantisation conditions:

$$\begin{aligned}&[\varphi (\eta ,\vec {x}), \varphi '(\eta ,\vec {y})] = \frac{i}{a^2} \delta ^{(3)}(\vec {x} - \vec {y}),\nonumber \\&[\varphi (\eta ,\vec {x}), \varphi (\eta ,\vec {y})] = 0, \quad \nonumber \\&[\varphi '(\eta ,\vec {x}), \varphi '(\eta ,\vec {y})] = 0. \end{aligned}$$
(4.8)

We now expand the field by taking the Fourier transform with respect to \(\vec {x}\), but not with respect to \(\eta \), that is,

$$\begin{aligned} \varphi (\eta ,\vec {x}) = \int \frac{\mathrm{d}^3q}{(2\pi )^{3/2}} e^{i\vec {q}\cdot \vec {x}} \varphi _0(\eta ,\vec {q}). \end{aligned}$$
(4.9)

The hermiticity condition on \(\varphi (\eta ,\vec {x})\), i.e. \(\varphi (\eta ,\vec {x})^\dagger =\varphi (\eta ,\vec {x})\), implies

$$\begin{aligned} \varphi _0(\eta , \vec {q})^\dagger = \varphi _0(\eta , -\vec {q}) \end{aligned}$$
(4.10)

and the equation of motion in (4.5) dictates that \(\varphi _0\) satisfies

$$\begin{aligned} \varphi ''_{0}+2\mathcal{H} \varphi '_{0}+(m^2 a^2+q^2) \varphi _{0} = 0. \end{aligned}$$
(4.11)

The general solution of (4.11) is a linear combination of

$$\begin{aligned} \eta ^{3/2} J_{\frac{1}{2} \sqrt{9-\frac{4 m^2}{H^2}}}(q \eta ) \quad \text{ and } \quad \eta ^{3/2} Y_{\frac{1}{2} \sqrt{9-\frac{4 m^2}{H^2}}}(q \eta ), \end{aligned}$$
(4.12)

where \(J_n(z)\) and \(Y_n(z)\) are the Bessel functions of the first and second kind, respectively. Since the de Sitter scale factor \(a^2=1/(H \eta )^2\) diverges in the superhorizon limit \(\eta \rightarrow 0^-\), the massive fields, \(m\ne 0\), are suppressed in that limit and we can therefore focus on the effectively massless degrees of freedom. By doing so, we can take the two linearly independent solutions to be

$$\begin{aligned}&y_0(\eta , q) \equiv \frac{H |\eta |}{\sqrt{2q}}\left( 1-\frac{i}{q\eta }\right) e^{-iq\eta }\nonumber \\&\quad \text{ and } \text{ its } \text{ complex } \text{ conjugate }. \end{aligned}$$
(4.13)

So, by using the hermiticity condition in (4.10), we have

$$\begin{aligned} \varphi _0(\eta , \vec {q}) = a_0(\vec {q}) y_0 (\eta , q) +a_0(-\vec {q})^\dagger y_0 (\eta , q)^*, \end{aligned}$$
(4.14)

where \(a_0(\vec {q})\) is an operator which will be identified with the annihilation operator of the quanta of the scalar field under study.

An important feature that will be useful in analysing the other non-standard sectors is that the expansion in (4.9) and (4.14) can be inverted to obtain \(a_0\) in terms of the field (see footnote 1)

(4.15)

This is possible because the two solutions \(y_0\) and \(y_0^*\) are linearly independent. Then, by using the canonical commutators in (4.8), one finds necessarily

$$\begin{aligned}{}[a_0(\vec {k}), a_0(\vec {q})^\dagger ] = \delta ^{(3)}(\vec {k}-\vec {q}), \quad [a_0(\vec {k}), a_0(\vec {q})]= 0. \end{aligned}$$
(4.16)

If we had normalised the modes in (4.13) differently we would have found an extra factor multiplying \(\delta ^{(3)}(\vec {k}-\vec {q})\). Therefore, what fixes the normalisation constants of the modes is the requirement that the operators \(a_0\) and \(a_0^\dagger \) satisfy the standard commutation relations for annihilation and creation operators. This is, however, not quite enough to identify \(a_0\) and \(a_0^\dagger \) as annihilation and creation operators, respectively. The remaining step is done in Appendix B where we review how the positivity of the norm of the quanta created and annihilated by \(a_0^\dagger \) and \(a_0\), respectively, guarantees that the Hamiltonian has a spectrum that is bounded from below. Similar considerations will be useful to analyse the other non-standard sectors.
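
The inversion and the resulting commutator can be sketched as follows. This is a reconstruction from (4.8), (4.9), (4.13) and (4.14) under the normalisation of the modes stated above, and it is not necessarily the form in which the inversion is written in the original derivation; \(a(\eta )\) denotes the scale factor, to be distinguished from the operator \(a_0\):

```latex
\begin{aligned}
a(\eta)^2\left(y_0\, y_0'^{*} - y_0'\, y_0^{*}\right) &= i
  && \text{(Wronskian of the modes (4.13))} \\
a_0(\vec{q}) &= i\, a(\eta)^2 \left[ y_0^{*}\, \varphi_0'(\eta,\vec{q})
  - y_0'^{*}\, \varphi_0(\eta,\vec{q}) \right]
  && \text{(from (4.14) and its } \eta\text{-derivative)} \\
\varphi_0(\eta,\vec{q}) &= \int \frac{\mathrm{d}^3x}{(2\pi)^{3/2}}\,
  e^{-i\vec{q}\cdot\vec{x}}\, \varphi(\eta,\vec{x})
  && \text{(inverse of (4.9))} \\
[\varphi_0(\eta,\vec{k}), \varphi_0'(\eta,\vec{q})^\dagger]
  &= \frac{i}{a(\eta)^2}\, \delta^{(3)}(\vec{k}-\vec{q})
  && \text{(from (4.8))} \\
[a_0(\vec{k}), a_0(\vec{q})^\dagger]
  &= -i\, a(\eta)^2 \left(y_0\, y_0'^{*} - y_0'\, y_0^{*}\right)
  \delta^{(3)}(\vec{k}-\vec{q}) = \delta^{(3)}(\vec{k}-\vec{q}).
\end{aligned}
```

A different normalisation of \(y_0\) would rescale the Wronskian in the first line and hence the coefficient of the delta function in the last one.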

4.1.2 Metric perturbations

We now turn to the metric perturbations \(\Phi \) and \(\Psi \). As we have already stated, \(\Phi \) should be considered as a non-dynamical field. To find the equation that fixes its value we perform the variation of \(S^{(S)}\) with respect to \(\Phi \), to obtain

$$\begin{aligned} -\frac{4}{3 f_2^2 \bar{M}_\mathrm{P}^2 a^2} \vec {\nabla }^4 \left( \Phi +\Psi \right) -6 \mathcal{H} \Psi '+2\vec {\nabla }^2 \Psi -6\mathcal{H}^2 \Phi = 0,\nonumber \\ \end{aligned}$$
(4.17)

where \(\vec {\nabla }^4 = (\vec {\nabla }^2)^2\).

The main phenomenologically interesting regime is the superhorizon limit, \(\eta \rightarrow 0\), when \(a\rightarrow \infty \). We therefore focus on this case first. Then the first term in (4.17) is small and can be treated perturbatively in an expansion in \(1/a\). The solution of (4.17) at next-to-leading order in this expansion is

$$\begin{aligned} \Phi = \frac{1}{3\mathcal{H}^2} \vec {\nabla }^2\Psi -\frac{\Psi '}{\mathcal{H}} + \frac{2 }{9f_2^2 \bar{M}_\mathrm{P}^2 a^2 \mathcal{H}^2} \vec {\nabla }^4\left( \frac{\Psi '}{\mathcal{H}}-\Psi \right) .\nonumber \\ \end{aligned}$$
(4.18)
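As a sketch of how (4.18) follows from (4.17), one can first solve (4.17) algebraically for the \(\Phi \) that multiplies \(\mathcal{H}^2\),

$$\begin{aligned} \Phi = \frac{1}{3\mathcal{H}^2} \vec {\nabla }^2\Psi -\frac{\Psi '}{\mathcal{H}} - \frac{2}{9f_2^2 \bar{M}_\mathrm{P}^2 a^2 \mathcal{H}^2} \vec {\nabla }^4\left( \Phi +\Psi \right) , \end{aligned}$$

and then, since the last term already carries an explicit factor \(1/a^2\), replace \(\Phi \) inside it by its leading piece \(-\Psi '/\mathcal{H}\). This gives \(-\vec {\nabla }^4(\Phi +\Psi ) \rightarrow \vec {\nabla }^4(\Psi '/\mathcal{H}-\Psi )\), which is the correction appearing in (4.18). Dropping the \(\vec {\nabla }^2\Psi /(3\mathcal{H}^2)\) piece inside the correction is an assumption of this sketch; it is suppressed by a further factor \(1/\mathcal{H}^2\propto 1/a^2\).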

We now insert this constraint for \(\Phi \) in Eqs. (4.1) and (4.2) and drop all the terms that go to zero in the \(a\rightarrow \infty \) limit, to obtain

$$\begin{aligned}&\frac{\bar{M}_\mathrm{P}^{2}}{2} \int \mathrm{d}^{4}x a^{2}\left[ -\frac{2}{\mathcal {H}}(\Psi \vec {\nabla }^{2}\Psi )' -2\Psi \vec {\nabla }^{2} \Psi +\frac{2}{3{\mathcal {H}}^2}\Psi \vec {\nabla }^{4} \Psi \right. \nonumber \\&\quad \left. +\frac{4}{3 {f_{2}^{2}}\bar{M}_\mathrm{P}^{2} a^{2}}\left( \frac{2\Psi '\vec {\nabla }^{4} \Psi }{\mathcal {H}} - \frac{\Psi ' \vec {\nabla }^{4}\Psi '}{{\mathcal {H}}^{2}} - \Psi \vec {\nabla }^{4}\Psi \right) \right] .\nonumber \\ \end{aligned}$$
(4.19)

The first two terms in the expression above are apparently the leading ones, but notice that an integration by parts shows (see Footnote 2) that they exactly cancel each other! Therefore, both the Einstein contribution (the third term in this expression) and the Weyl contribution (the one that is divided by \(f_2^2\)) have the same behaviour as \(a\rightarrow \infty \). By taking the variation with respect to \(\Psi \) and using \(\mathcal {H} = -1/\eta \) one obtains the equation

$$\begin{aligned} \frac{\Psi ''}{{\mathcal {H}}^{2}} - 2\frac{\Psi '}{\mathcal {H}} + \frac{{f_{2}^{2}}\bar{M}_\mathrm{P}^{2} a^{2}}{2{\mathcal {H}}^{2}} \Psi = 0. \end{aligned}$$
(4.20)
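The step from (4.19) to (4.20) can be sketched as follows, using only the de Sitter relations \(a = -1/(H\eta )\) and \(\mathcal{H} = -1/\eta \); the shorthand \(X\) is introduced here for illustration and is not part of the original notation. The cancellation of the first two terms of (4.19) follows from

$$\begin{aligned} \left( \frac{a^2}{\mathcal{H}}\right) ' = \left( -\frac{1}{H^2\eta }\right) ' = \frac{1}{H^2\eta ^2} = a^2 \quad \Rightarrow \quad \int \mathrm{d}^4x \, a^2\left[ -\frac{2}{\mathcal{H}}(\Psi \vec {\nabla }^2\Psi )'\right] = \int \mathrm{d}^4x \, 2a^2\, \Psi \vec {\nabla }^2\Psi , \end{aligned}$$

up to a boundary term. For the remaining terms, define \(X \equiv \Psi '/\mathcal{H} - \Psi \), so that the Weyl bracket in (4.19) equals \(-X\vec {\nabla }^4 X\) up to spatial total derivatives and \(X' = \Psi ''/\mathcal{H} - 2\Psi '\). Varying with respect to \(\Psi \) then gives

$$\begin{aligned} \frac{8}{3f_2^2\bar{M}_\mathrm{P}^2}\vec {\nabla }^4\left( \frac{\Psi ''}{\mathcal{H}^2} - 2\frac{\Psi '}{\mathcal{H}}\right) + \frac{4a^2}{3\mathcal{H}^2}\vec {\nabla }^4\Psi = 0, \end{aligned}$$

which reduces to (4.20) after removing the overall \(\vec {\nabla }^4\) (for modes with non-zero momentum) and multiplying by \(3f_2^2\bar{M}_\mathrm{P}^2/8\).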

Notice that in the limit where \(f_2\rightarrow \infty \) one recovers Einstein’s theory result for the de Sitter space, \(\Psi = 0\). The last term in this equation was neither discussed nor shown in [58], which performed a previous analysis of the \(\Psi \)-sector. We observe, however, that such a term is important to recover Einstein’s result. The general solution of (4.20) is a linear combination of

$$\begin{aligned} \eta ^{(-1-\sqrt{1-4M_2^2/H^2})/2}, \quad \eta ^{(-1+\sqrt{1-4M_2^2/H^2})/2}. \end{aligned}$$
(4.21)
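The exponents in (4.21) can be checked with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the power laws are written as \(|\eta |^p\) to avoid branch ambiguities for \(\eta <0\), and the only input is the ratio \(M_2/H\). Inserting \(\Psi = |\eta |^p\) into (4.20) with \(\mathcal{H} = -1/\eta \) and \(a=-1/(H\eta )\) gives the indicial equation \(p^2+p+M_2^2/H^2=0\), which is what the residual function evaluates.

```python
import cmath

def superhorizon_exponents(m2_over_h):
    """Return the two exponents p of the power laws |eta|**p in (4.21).

    m2_over_h is the ratio M_2 / H (an illustrative parameter name).
    """
    root = cmath.sqrt(1 - 4 * m2_over_h**2)
    return (-1 - root) / 2, (-1 + root) / 2

def indicial_residual(p, m2_over_h):
    """Residual of p**2 + p + M_2**2/H**2 = 0, obtained by inserting
    Psi = |eta|**p into (4.20) with curly-H = -1/eta and a = -1/(H eta)."""
    return p * p + p + m2_over_h**2

# Illustrative values of M_2 / H: massless limit, real and complex exponents.
for ratio in (0.0, 0.3, 0.5, 2.0):
    print(ratio, superhorizon_exponents(ratio))
```

For \(M_2/H\rightarrow 0\) the exponents reduce to \(-1\) and \(0\), while for \(M_2>H/2\) they become complex with real part \(-1/2\), so the power laws oscillate in \(\ln |\eta |\).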

When \(M_2 = f_2 \bar{M}_\mathrm{P}/\sqrt{2} \gg H\) the very rapid oscillations produced by the last term in (4.20) effectively drive \(\Psi \) to zero, but for \(M_2<H\) we find a growth of \(\Psi \) at superhorizon scales. At the end of Sect. 4.2 we will show that this divergence is a gauge artefact by showing that no divergences are present in another gauge (the co-moving one). At the same time, however, (4.21) means that the Weyl term dominates even in the superhorizon limit because this divergence comes from the first two terms in (4.20), which come in turn from the Weyl term. This happens because the apparently leading terms coming from the Einstein–Hilbert action in Eq. (4.19) actually cancel each other. We conclude that the superhorizon limit is a particular case of Weyl domination (up to \(M_2^2/H^2\) corrections), not of Einstein domination. This differs somewhat from the classification proposed in [58]. Notice that if the Weyl term dominates at superhorizon scales then it dominates at any time, because a large \(a\) suppresses the Weyl contribution.

Therefore, we now consider the case of Weyl domination, from where we will be able to take the superhorizon limit as we have just explained. During Weyl domination (\(f_2\) small) the constraint of \(\Phi \), Eq. (4.17), implies

$$\begin{aligned} \Phi = - \Psi , \end{aligned}$$
(4.22)

which we insert in (4.1) and (4.2) to obtain the action of \(\Psi \):

$$\begin{aligned} S^{(S)}_{\Psi }= & {} \int \mathrm{d}^4x \mathscr {L}_{\Psi }, \quad \text{ where } \nonumber \\ \mathscr {L}_{\Psi }= & {} \frac{a^2}{2}(-\hat{\Psi }'^2 - \hat{\Psi }\vec {\nabla }^2 \hat{\Psi }-4 \mathcal{H}^2 \hat{\Psi }^2), \end{aligned}$$
(4.23)

where \(\hat{\Psi }\equiv \sqrt{6} \bar{M}_\mathrm{P}\Psi \). We see that \(\hat{\Psi }\) is a ghost; it represents the helicity-0 component of the spin-2 ghost. This type of field requires a special quantisation, which we will discuss. However, the usual canonical commutators remain unchanged [33]. The conjugate momentum of \(\hat{\Psi }\) is

$$\begin{aligned} \Pi _{\hat{\Psi }} = \frac{\partial \mathscr {L}_{\Psi }}{\partial \hat{\Psi }'}= -a^2 \hat{\Psi }' \end{aligned}$$
(4.24)

so the canonical commutators are

$$\begin{aligned}&[\hat{\Psi }(\eta ,\vec {x}), \hat{\Psi }'(\eta ,\vec {y})] = - \frac{i}{a^2} \delta ^{(3)}(\vec {x} - \vec {y}),\nonumber \\&[\hat{\Psi }(\eta ,\vec {x}), \hat{\Psi }(\eta ,\vec {y})] = 0, \quad [\hat{\Psi }'(\eta ,\vec {x}), \hat{\Psi }'(\eta ,\vec {y})] = 0. \nonumber \\ \end{aligned}$$
(4.25)

We can now perform an expansion in three-dimensional plane waves \(e^{i \vec {q}\cdot \vec {x}}\) of the field,

$$\begin{aligned} \hat{\Psi }(\eta ,\vec {x}) = \int \frac{\mathrm{d}^3q}{(2\pi )^{3/2}}e^{i \vec {q}\cdot \vec {x}} \Psi _0(\eta ,\vec {q}). \end{aligned}$$
(4.26)

Notice that the hermiticity condition on \(\Psi \) implies

$$\begin{aligned} \Psi _0(\eta ,\vec {q})^\dagger = \Psi _0(\eta , - \vec {q}). \end{aligned}$$
(4.27)

The mode functions \(\Psi _0\) are the solutions of the field equation in momentum space

$$\begin{aligned} \Psi ''_0+2\mathcal{H} \Psi '_0+(q^2-4\mathcal{H}^2) \Psi _0 = 0. \end{aligned}$$
(4.28)

The two independent solutions are

$$\begin{aligned}&g_0(\eta , q) = \frac{H|\eta |}{\sqrt{2q}} \left( 1-\frac{3i}{q\eta }-\frac{3}{q^2\eta ^2}\right) e^{-iq\eta }\nonumber \\&\quad \text{ and } \text{ its } \text{ complex } \text{ conjugate }. \end{aligned}$$
(4.29)

We see that these functions reproduce the two solutions in (4.21) up to \(M_2^2/H^2\) corrections; these corrections are the effect of the Einstein–Hilbert term, which here we have neglected, but which is of course important to recover Einstein’s gravity when \(M_2 \gg H\). So, by using the hermiticity condition in (4.27), we have

$$\begin{aligned} \Psi _0(\eta , \vec {q}) = b_0(\vec {q}) g_0 (\eta , q) +b_0(-\vec {q})^\dagger g_0 (\eta , q)^*, \end{aligned}$$
(4.30)

where \(b_0(\vec {q})\) is an operator, which will be identified with the annihilation operator of the quanta of the scalar field under study.

One can show now that the canonical commutators (4.25) imply (see Footnote 3)

$$\begin{aligned}{}[ b_0(\vec {k}), b_0(\vec {q})^\dagger ] = - \delta (\vec {k}-\vec {q}), \quad [ b_0(\vec {k}), b_0(\vec {q})] = 0. \end{aligned}$$
(4.31)
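The two ingredients behind (4.31) can be checked numerically: that \(g_0\) in (4.29) solves (4.28) with \(\mathcal{H} = -1/\eta \), and that \(a^2(g_0 g_0^{*\prime } - g_0^* g_0')=i\). Inserting the expansion (4.30) into (4.25), this value of the Wronskian is what accompanies the minus sign in (4.31). The following is a minimal sketch with hypothetical function names, using central finite differences, so it only confirms the statements up to discretisation error.

```python
import cmath
import math

def g0(eta, q, hubble=1.0):
    """Mode function g_0 of (4.29) for conformal time eta < 0."""
    x = q * eta
    prefactor = hubble * abs(eta) / math.sqrt(2 * q)
    return prefactor * (1 - 3j / x - 3 / x**2) * cmath.exp(-1j * x)

def derivatives(f, eta, step=1e-4):
    """Central finite-difference estimates of f'(eta) and f''(eta)."""
    fp, f0, fm = f(eta + step), f(eta), f(eta - step)
    return (fp - fm) / (2 * step), (fp - 2 * f0 + fm) / step**2

def ode_residual(eta, q, hubble=1.0):
    """Left-hand side of (4.28) with curly-H = -1/eta, evaluated on g_0."""
    f = lambda t: g0(t, q, hubble)
    d1, d2 = derivatives(f, eta)
    return d2 - 2 * d1 / eta + (q**2 - 4 / eta**2) * f(eta)

def wronskian_times_a2(eta, q, hubble=1.0):
    """a^2 (g_0 g_0*' - g_0* g_0') for the de Sitter a = -1/(H eta)."""
    f = lambda t: g0(t, q, hubble)
    d1, _ = derivatives(f, eta)
    g = f(eta)
    a2 = 1 / (hubble * eta) ** 2
    return a2 * (g * d1.conjugate() - g.conjugate() * d1)

# Illustrative sample point (eta, q); both numbers are arbitrary choices.
print(abs(ode_residual(-0.8, 1.5)), wronskian_times_a2(-0.8, 1.5))
```

The residual should be zero and the Wronskian combination should be \(i\), both up to finite-difference error, independently of the sample point and of \(H\).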

In Appendix B we show that, by introducing a negative norm to the states with an odd number of quanta created and annihilated by \(b_0^\dagger \) and \(b_0\), respectively, one obtains a Hamiltonian that is bounded from below. A possible way of addressing the issues generated by negative norms will be discussed in Sect. 9. The expression of the Hamiltonian in terms of \(b_0^\dagger \) and \(b_0\) found in Appendix B and the commutation relations in (4.31) allow us to interpret them as creation and annihilation operators, respectively.

4.2 Inclusion of slow-roll

We now turn to the effect of a non-zero slow-roll.

4.2.1 Newtonian gauge

We start from the curvature perturbation \(\mathcal{R}\), which, in the Newtonian gauge, reads

$$\begin{aligned} \mathcal{R} = -\Psi -\mathcal{H} \frac{K_{ij}\phi '^i\varphi ^j}{K_{lm}\phi '^l\phi '^m}. \end{aligned}$$
(4.32)

In our case \(\mathcal {R}\) can be simplified by using the expression of \(K_{ij}\) given in Eq. (2.20), which leads to

$$\begin{aligned} \mathcal{R} = -\Psi -\mathcal{H} \frac{\sum _i\phi '^i\varphi ^i}{\sum _j\phi '^j\phi '^j}. \end{aligned}$$
(4.33)

As is well known, from \(\mathcal{R}\) we can extract important observable quantities, such as its power spectrum \(P_\mathcal{R}(q)\) and the spectral index \(n_\mathrm{s}\). In particular we are interested in \(\mathcal {R}\) at superhorizon scales. Here we show that \(\mathcal {R}\) in this case is insensitive to the metric perturbation \(\Psi \), at least in the slow-roll approximation, and is therefore given by its known value in the absence of the Weyl-squared term.

We start by considering the field equation for the scalar field perturbations \(\varphi ^i\) from Eq. (4.2):

$$\begin{aligned} 0= & {} K_{ij} \varphi ''^j+2\mathcal{H} K_{ij} \varphi '^j +2\gamma ^a_{jl} K_{ai} \phi '^l\varphi '^j +2\mathcal{H} K_{ji,l} \phi '^j\varphi ^l \nonumber \\&+(K_{ji,l} \phi '^j)' \varphi ^l -K_{ij}\vec {\nabla }^2\varphi ^j+a^2 U_{,i,j} \varphi ^j \nonumber \\&-\mathcal{H}(6\Psi +2\Phi )K_{ji} \phi '^j-(3\Psi '+\Phi ')K_{ji}\phi '^j\nonumber \\&-(3\Psi +\Phi )(K_{ji}\phi '^j)' \nonumber \\&+ \frac{3\Psi +\Phi }{2}K_{lj,i}\phi '^l\phi '^j+a^2(\Phi -3\Psi ) U_{,i}. \nonumber \end{aligned}$$

In the leading non-trivial slow-roll approximation we can approximate \(\Phi =- \Psi \), which was derived in the pure de Sitter case, because \(\Phi \) and \(\Psi \) always appear multiplied by some derivative of background quantities and therefore we can use for them the zeroth-order slow-roll approximation. We can also use the slow-roll approximations \(\phi '^i\phi '^j \approx 0\), \(\phi ''^i\approx \mathcal{H} \phi '^i\) and the scalar field equation (2.28). By multiplying by \(K^{im}\) and summing over i we then obtain

$$\begin{aligned}&\varphi ''^m +2\mathcal{H} \varphi '^m+2 \gamma ^m_{jl} \phi '^l\varphi '^j +3\mathcal{H} K^{im} K_{ji,l}\phi '^j \varphi ^l\nonumber \\&-\vec {\nabla }^2 \varphi ^m +a^2 K^{im} U_{,i,j} \varphi ^j = 2\phi '^m \Psi ' + 2a^2 U^{,m} \Psi . \end{aligned}$$
(4.34)

We now restrict the analysis to the scalar field perturbations that are effectively massless. Indeed, as shown in the zeroth-order slow-roll approximation in Sect. 4.1, only those fields contribute in the superhorizon case. This means that we can neglect the term \(a^2 K^{im} U_{,i,j} \varphi ^j \) in the equation above, which can now be written

$$\begin{aligned} \varphi ''^m +2\mathcal{H} \varphi '^m -\vec {\nabla }^2 \varphi ^m= & {} 2\phi '^m \Psi ' + 2a^2 U^{,m} \Psi -2 \gamma ^m_{jl} \phi '^l\varphi '^j\nonumber \\&-3\mathcal{H} K^{im} K_{ji,l}\phi '^j \varphi ^l. \end{aligned}$$
(4.35)

Since we want to find the solution \(\varphi ^m\) at next-to-leading order in the slow-roll approximation we can substitute in the right-hand side of this equation the values for \(\Psi \) and \(\varphi ^i\) that we have found in Sect. 4.1 in the pure de Sitter case; we refer to these quantities as \(\Psi _\mathrm{dS}\) and \(\varphi ^i_\mathrm{dS}\). Equation (4.35) then becomes an inhomogeneous equation with the following source term:

$$\begin{aligned}&2\phi '^m \Psi _\mathrm{dS}' + 2a^2 U^{,m} \Psi _\mathrm{dS}-2 \gamma ^m_{jl} \phi '^l\varphi '^j_\mathrm{dS} -3\mathcal{H} K^{im} K_{ji,l}\phi '^j \varphi _\mathrm{dS}^l \nonumber \\&\quad \approx 2\phi '^m \Psi _\mathrm{dS}' + 2a^2 U^{,m} \Psi _\mathrm{dS}, \end{aligned}$$
(4.36)

where in the latter approximation we have used the fact that \(\gamma ^m_{jl} \phi '^l\varphi '^j_\mathrm{dS}\) goes as \(a\) and \(3\mathcal{H} K^{im} K_{ji,l}\phi '^j \varphi _\mathrm{dS}^l\) as \(a^2\) in the superhorizon limit (having used \(\phi '^i \approx -a^2 U^{,i}/(3\mathcal{H})\) and \(\varphi '^j, \varphi ^j \sim \text{constant}\) in that limit), while the other terms \(2\phi '^m \Psi _\mathrm{dS}' + 2a^2 U^{,m} \Psi _\mathrm{dS}\) go as \(a^3\) (having used the behaviour of \(\Psi \) given in Eq. (4.21) for a moderate ghost mass, \(M_2 \lesssim H\); see Footnote 4). Equation (4.35) then becomes

$$\begin{aligned} \varphi ''^m +2\mathcal{H} \varphi '^m -\vec {\nabla }^2 \varphi ^m = 2\phi '^m \Psi '_\mathrm{dS} + 2a^2 U^{,m} \Psi _\mathrm{dS}. \end{aligned}$$
(4.37)

Moreover, notice that \(\mathcal{H}\) appearing in the second term of the left-hand side of this equation can be replaced by the corresponding quantity in the pure de Sitter case: indeed, Eq. (2.25) shows that the difference between the pure de Sitter \(\mathcal {H}\) and that of the spacetime that takes into account the dynamics of the scalar fields is beyond the next-to-leading order slow-roll approximation that we are using here.

By using now the slow-roll equations \(\phi ''^i \approx \mathcal{H} \phi '^i\), \(\phi '''^i \approx 2\mathcal{H}^2 \phi '^i\) and \(\mathcal{H}' \approx \mathcal{H}^2\), as well as the equations of motion (4.5) and (4.28), one finds that the following configuration is a solution in the next-to-leading slow-roll approximation:

$$\begin{aligned} \varphi ^i =\varphi ^i_\mathrm{dS} -\frac{\phi '^i}{\mathcal{H}} \Psi _\mathrm{dS}. \end{aligned}$$
(4.38)

By using this solution in Eq. (4.33) we find

$$\begin{aligned} \mathcal{R} = -\Psi _\mathrm{dS} -\mathcal{H} \frac{\sum _i\phi '^i\varphi ^i}{\sum _j\phi '^j\phi '^j} = -\mathcal{H} \frac{\sum _i\phi '^i\varphi _\mathrm{dS}^i}{\sum _j\phi '^j\phi '^j}, \end{aligned}$$
(4.39)

where we have substituted \(\Psi \rightarrow \Psi _\mathrm{dS}\) in the first term of \(\mathcal{R}\) because in the second term we have two time-derivatives in the denominator and only one in the numerator and thus going to the next-to-leading order in the slow-roll approximation in \(\varphi ^i\) corresponds to the zeroth-order approximation in the first term of \(\mathcal{R}\). We see that the dependence on the ghost cancels in \(\mathcal{R}\), which therefore reproduces the expression that we have when gravity is described by Einstein’s theory coupled to an arbitrary number of scalar fields. The corresponding power spectrum and the spectral index \(n_\mathrm{s}\) will be recalled in Sect. 7.
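The cancellation of the ghost in (4.39) can be made explicit by inserting (4.38) into (4.33) term by term:

$$\begin{aligned} \mathcal{R}= & {} -\Psi _\mathrm{dS} -\mathcal{H} \frac{\sum _i\phi '^i\left( \varphi ^i_\mathrm{dS} -\frac{\phi '^i}{\mathcal{H}} \Psi _\mathrm{dS}\right) }{\sum _j\phi '^j\phi '^j} \nonumber \\= & {} -\Psi _\mathrm{dS} -\mathcal{H} \frac{\sum _i\phi '^i\varphi ^i_\mathrm{dS}}{\sum _j\phi '^j\phi '^j} +\Psi _\mathrm{dS}\, \frac{\sum _i\phi '^i\phi '^i}{\sum _j\phi '^j\phi '^j} = -\mathcal{H} \frac{\sum _i\phi '^i\varphi _\mathrm{dS}^i}{\sum _j\phi '^j\phi '^j}. \end{aligned}$$

The shift of \(\varphi ^i\) along the background velocity \(\phi '^i\) is thus exactly what is needed to remove \(\Psi _\mathrm{dS}\) from \(\mathcal{R}\).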

4.2.2 Co-moving gauge

We conclude this section by showing that the superhorizon divergence of \(\Psi \) is a gauge artefact. We extend the validity of a previous argument of [58] to theories with a generic number of scalar fields.

To do so we consider the co-moving gauge defined by \(\delta u = 0\) (see Footnote 5). Given that generically

$$\begin{aligned} \delta u = - a \frac{K_{ij} \phi '^i \varphi ^j}{K_{lm} \phi '^l \phi '^m}, \end{aligned}$$
(4.40)

setting \(\delta u = 0\) with a gauge transformation starting from the Newtonian gauge produces a non-diagonal metric for the scalar perturbations of the form

$$\begin{aligned} \mathrm{d}s^2_\mathrm{scalar}= & {} a^2 (1+2A) \mathrm{d}\eta ^2 -2 a \partial _i B \mathrm{d}\eta \mathrm{d}x^i\nonumber \\&-a^2 (1+2\mathcal {R}) \delta _{ij} \mathrm{d}x^i\mathrm{d}x^j, \end{aligned}$$
(4.41)

where \(\mathcal {R}\) is the curvature perturbation defined in (4.32),

$$\begin{aligned} B= \frac{K_{ij} \phi '^i \varphi ^j}{K_{lm} \phi '^l \phi '^m} \end{aligned}$$
(4.42)

and A is obtained from the Newtonian potential \(\Phi \) by adding a term proportional to the time-derivative of aB and thus does not represent another independent degree of freedom.

We have just shown that \(\mathcal {R}\) is not sensitive to the superhorizon divergence of \(\Psi \). As far as B is concerned, we observe that Eq. (4.33) tells us

$$\begin{aligned} B = -\frac{1}{\mathcal {H}} \mathcal {R} -\frac{1}{\mathcal {H}} \Psi . \end{aligned}$$
(4.43)

The first term in B goes to zero at superhorizon scales. Regarding the second one we have to distinguish between two situations.

  • \(M_2^2/H^2 \gtrsim 1/N\). In this case also \(\Psi /\mathcal {H}\) is suppressed at superhorizon scales, \(\eta \sim e^{-N}\ll 1\), where we have taken into account the contribution of the Einstein–Hilbert term to the superhorizon behaviour of \(\Psi \), Eq. (4.21).

  • \(M_2^2/H^2 \lesssim 1/N\). In this case the behaviours in (4.21) practically reduce to \(1/\eta \) and 1, meaning that B does not vanish at superhorizon scales. Also in this situation, however, B remains finite because \(\mathcal {H} \approx -1/\eta \).

Reference [58] did not consider the first case. Here we include it to keep the analysis general. Therefore, we conclude not only that the divergence of \(\Psi \) is a gauge artefact, but also that this extra degree of freedom due to the ghost is either suppressed in the superhorizon limit (for \(M_2^2/H^2 \gtrsim 1/N\)) or remains finite (for \(M_2^2/H^2 \lesssim 1/N\)).
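The two regimes above can be illustrated numerically. The sketch below uses hypothetical function names and assumes, as in the text, \(\mathcal{H} = -1/\eta \), the fastest-growing power law of (4.21) and \(|\eta |\sim e^{-N}\); the value \(N=60\) is only an illustrative choice. It computes the exponent \(s\) in \(|\Psi /\mathcal{H}|\sim |\eta |^s\) and the resulting factor \(e^{-Ns}\).

```python
import cmath
import math

def psi_over_h_exponent(m2_over_h):
    """Exponent s with |Psi / curly-H| ~ |eta|**s for the fastest-growing
    power law in (4.21), using curly-H = -1/eta. Only the real part of the
    exponent controls the size of the perturbation."""
    p_growing = (-1 - cmath.sqrt(1 - 4 * m2_over_h**2)) / 2
    return (1 + p_growing).real

def suppression_after_efolds(m2_over_h, n_efolds):
    """|eta|**s evaluated at |eta| = exp(-N)."""
    return math.exp(-n_efolds * psi_over_h_exponent(m2_over_h))

# Illustrative scan over M_2 / H at N = 60 e-folds (assumed value).
for ratio in (0.0, 0.05, 0.2, 0.5):
    print(ratio, psi_over_h_exponent(ratio), suppression_after_efolds(ratio, 60))
```

For small \(M_2/H\) the exponent is approximately \(M_2^2/H^2\), so the suppression becomes effective once \(N M_2^2/H^2\gtrsim 1\), in line with the threshold separating the two cases; for \(M_2\rightarrow 0\) the factor tends to one and B stays finite without vanishing.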

5 Vector perturbations

The contribution of \(S_\mathrm{ES}\) to the quadratic action for the vector perturbations is

$$\begin{aligned} S_\mathrm{ES}^{(V)} = \frac{\bar{M}_\mathrm{P}^2}{4} \int \mathrm{d}^4x \, a^2(\partial _i V_{j})^2, \end{aligned}$$
(5.1)

where the space indices are contracted with the flat metric \(\delta _{ij}\). The contribution of \(S_\mathrm{W}\) is instead

$$\begin{aligned} S^{(V)}_\mathrm{W} =- \frac{1}{2f_2^2} \int \mathrm{d}^4x (\partial _i V_j' \partial _i V_j' - V_i \vec {\nabla }^4 V_i). \end{aligned}$$
(5.2)

We explicitly checked that the presence of an arbitrary number of scalar fields does not change the form of \(S_\mathrm{ES}^{(V)}\) and \(S^{(V)}_\mathrm{W}\). In the case of \(S_\mathrm{ES}^{(V)}\) the check requires the use of the background equations (2.24)–(2.25).

Thus the full action for vector perturbations \(S_\mathrm{V} = S_\mathrm{ES}^{(V)} + S^{(V)}_\mathrm{W}\) is given by \(S_\mathrm{V} =\int \mathrm{d}^4x \mathscr {L}_{V}\) where

$$\begin{aligned} \mathscr {L}_{V} = \frac{\bar{M}_\mathrm{P}^2}{4}\left[ -a^2 V_{j}\vec {\nabla }^2 V_{j} + \frac{1}{M_2^2}( V_j' \vec {\nabla }^2 V_j' + V_i \vec {\nabla }^4 V_i)\right] . \nonumber \\ \end{aligned}$$
(5.3)
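As a short check of how (5.1) and (5.2) combine into (5.3), one can integrate by parts in space and use \(M_2 = f_2 \bar{M}_\mathrm{P}/\sqrt{2}\):

$$\begin{aligned} (\partial _i V_j)^2 \rightarrow -V_j \vec {\nabla }^2 V_j, \quad \partial _i V_j' \partial _i V_j' \rightarrow -V_j' \vec {\nabla }^2 V_j', \quad \frac{1}{2f_2^2} = \frac{\bar{M}_\mathrm{P}^2}{4M_2^2}, \end{aligned}$$

where the arrows denote equality up to spatial total derivatives. The first replacement turns (5.1) into the \(-a^2 V_{j}\vec {\nabla }^2 V_{j}\) term, while the other two turn (5.2) into the term proportional to \(1/M_2^2\).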

We observe that this action does not contain terms with four time-derivatives. This follows from the fact that the Einstein–Hilbert action (plus an arbitrary number of scalar fields) does not lead to any propagating field with helicity 1 or \(-1\), while the massive ghost should provide two fields, one with helicity 1 and the other one with helicity \(-1\). The \(V_i\) account for these two fields. If we had found terms with four time-derivatives for the \(V_i\) we would have a number of degrees of freedom doubled with respect to the correct one.

We now introduce the conjugate momenta

$$\begin{aligned} P_i = \frac{\partial \mathscr {L}}{\partial V_i'} = \frac{\bar{M}_\mathrm{P}^2}{2M_2^2} \vec {\nabla }^2 V_i'. \end{aligned}$$
(5.4)

The Laplace operator \(\vec {\nabla }^2\) in the expression of the conjugate momenta was missed in the previous studies of vector perturbations of [55]. As we will see, the presence of such an operator modifies the momentum dependence of the vector modes and the commutation rules of the creation and annihilation operators.

The quantisation is obtained as usual by imposing the canonical commutators. In order to do so, however, one should identify the independent degrees of freedom. The condition in (3.2) tells us that for a plane wave with given momentum \(\vec {q}\) there are only two independent components. So to identify the two independent degrees of freedom we go to momentum space and write for \(F_j \equiv \{V_j, P_j \}\)

$$\begin{aligned} F_{j}(\eta , \vec {x}) = \int \frac{\mathrm{d}^3q}{(2\pi )^{3/2}} e^{i\vec {q}\cdot \vec {x}} \sum _{\lambda = \pm 1} F_\lambda (\eta ,\vec {q}) e^\lambda _{j} (\hat{q}), \end{aligned}$$
(5.5)

where \(e^\lambda _{j} (\hat{q})\) are the usual polarisation vectors for helicities \(\lambda = \pm 1\). We recall that for \(\hat{q}\) along the third axis the polarisation vectors that satisfy (3.2) are given by

$$\begin{aligned} e^{+1}_{1} = 1/\sqrt{2}, \quad e^{+1}_{2} = i/\sqrt{2}, \quad e^{+1}_{3} = 0, \quad e^{-1}_{j} = (e^{+1}_{j})^* \nonumber \\ \end{aligned}$$
(5.6)

and for a generic momentum direction \(\hat{q}\) we can obtain \(e^\lambda _{j} (\hat{q})\) by applying to (5.6) a rotation that connects the third axis with \(\hat{q}\). The polarisation vectors defined in this way obey

$$\begin{aligned} e^\lambda _{j} (\hat{q}) (e^{\lambda '}_{j} (\hat{q}))^* = \delta ^{\lambda \lambda '}. \end{aligned}$$
(5.7)

We can now impose the canonical commutators:

$$\begin{aligned}&[V_\lambda (\eta , \vec {q}), (P_{\lambda '}(\eta , \vec {k}))^\dagger ] = i \delta _{\lambda \lambda '} \delta ^{(3)}(\vec {q}-\vec {k}),\nonumber \\&\quad \text{(and } \text{ all } \text{ the } \text{ other } \text{ commutators } \text{ vanishing), } \end{aligned}$$
(5.8)

which, according to the expansion in (5.5) and the condition in (5.7), lead to the following canonical commutators in coordinate space:

$$\begin{aligned}&[V_{j}(\eta , \vec {x}), P_j(\eta , \vec {y})] = 2i \delta ^{(3)}(\vec {x}-\vec {y}),\nonumber \\&\quad \text{(and } \text{ all } \text{ the } \text{ other } \text{ commutators } \text{ vanishing) }. \end{aligned}$$
(5.9)

By taking the variation of the action we obtain the equations of motion for the vector perturbations, which, working in momentum space, read

$$\begin{aligned} V_\lambda '' = -(q^2+M_2^2 a^2) V_\lambda . \end{aligned}$$
(5.10)
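The variation leading to (5.10) can be sketched in coordinate space. Varying (5.3) with respect to \(V_j\), and integrating by parts once in time for the \(V_j'\vec {\nabla }^2V_j'\) term, gives

$$\begin{aligned} \frac{\bar{M}_\mathrm{P}^2}{2}\, \vec {\nabla }^2\left[ -a^2 V_j -\frac{1}{M_2^2} V_j'' +\frac{1}{M_2^2}\vec {\nabla }^2 V_j\right] = 0 \quad \Rightarrow \quad V_j'' = \vec {\nabla }^2 V_j - M_2^2 a^2 V_j, \end{aligned}$$

for modes with non-zero momentum, and (5.10) follows with \(\vec {\nabla }^2\rightarrow -q^2\). Note that in momentum space the term \(V_j'\vec {\nabla }^2 V_j'\) becomes \(-q^2\) times the square of \(V_\lambda '\), so the kinetic term enters (5.3) with a negative sign, as one would expect for a ghost.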

We now solve this equation in the pure de Sitter case, for which \(a^2(\eta ) = 1/(H^2\eta ^2)\). Indeed (2.25) shows that the error that is produced in this way is beyond the next-to-leading order slow-roll approximation that we are using here. By defining \(z \equiv - q\eta \) and \(\rho \equiv H^2/M_2^2\) we find

$$\begin{aligned} \frac{\mathrm{d}^2V_\lambda }{\mathrm{d}z^2} +\left( 1+\frac{1}{\rho z^2}\right) V_\lambda =0, \end{aligned}$$
(5.11)

whose linearly independent solutions are (see Footnote 6)

$$\begin{aligned}&f_1(z)\equiv \sqrt{z} J_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z)+i \sqrt{z} Y_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z)\nonumber \\&\quad \text{ and } \text{ its } \text{ complex } \text{ conjugate. } \end{aligned}$$
(5.12)
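The solution (5.12) can be checked numerically without external libraries in the range \(M_2<H/2\), where the Bessel order \(\sqrt{1/4-M_2^2/H^2}\) is real. The sketch below is an illustration under that assumption: the function names are hypothetical, \(J_\nu \) is evaluated from its power series (adequate for moderate \(z\)), \(Y_\nu \) from the standard relation to \(J_{\pm \nu }\) for non-integer order, and the residual of (5.11) is estimated by finite differences with \(1/\rho = M_2^2/H^2\).

```python
import math

def bessel_j(order, z, terms=40):
    """Power series for J_order(z); real non-integer order > -1, real z > 0."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (z / 2) ** (2 * k + order) / (
            math.factorial(k) * math.gamma(k + order + 1))
    return total

def bessel_y(order, z):
    """Y_order(z) built from J_{+order} and J_{-order} (non-integer order)."""
    s, c = math.sin(order * math.pi), math.cos(order * math.pi)
    return (bessel_j(order, z) * c - bessel_j(-order, z)) / s

def f1(z, m2_over_h):
    """f_1(z) of (5.12) for M_2 < H/2, where the Bessel order is real."""
    order = math.sqrt(0.25 - m2_over_h**2)
    return math.sqrt(z) * (bessel_j(order, z) + 1j * bessel_y(order, z))

def residual(z, m2_over_h, step=1e-4):
    """f'' + (1 + (M_2/H)^2 / z^2) f, i.e. the left-hand side of (5.11)."""
    f = lambda t: f1(t, m2_over_h)
    second = (f(z + step) - 2 * f(z) + f(z - step)) / step**2
    return second + (1 + m2_over_h**2 / z**2) * f(z)

# Illustrative sample: z = 2.3 and M_2 / H = 0.2 are arbitrary choices.
print(abs(residual(2.3, 0.2)))
```

In the limit \(M_2\rightarrow 0\) the order becomes \(1/2\) and \(f_1(z)\) reduces to \(-i\sqrt{2/\pi }\, e^{iz}\), a pure plane wave, which gives a simple closed-form cross-check of the series.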

We can therefore expand

$$\begin{aligned} V_\lambda (\eta ,\vec {q}) = \gamma _\lambda (\vec {q}) f_1(-q \eta ) +\gamma ^\dagger _{-\lambda }(-\vec {q}) f^*_1(-q \eta ), \end{aligned}$$
(5.13)

where we have used the reality condition \(V_j(\eta ,\vec {x})^\dagger = V_j(\eta ,\vec {x})\) for the fields in coordinate space, which corresponds to \(V_\lambda (\eta ,\vec {q})^\dagger = V_{-\lambda }(\eta , -\vec {q})\) in momentum space. The quantities \( \gamma _\lambda (\vec {q})\) are to be interpreted as operators in the quantum theory, but their normalisation is not fixed. From the analysis of the scalar sector we have learned that the way to properly normalise them is to impose the canonical commutators and require that the \(\gamma _\lambda \) together with their Hermitian conjugates satisfy the commutation rules for creation and annihilation operators. To achieve this goal we observe that the \(\gamma _\lambda \) can be expressed as a functional of \(V_i\) with a relation analogous to (4.15). This shows that there is only one assignment for the commutation rules satisfied by \(\gamma _\lambda \). This assignment turns out to be

$$\begin{aligned}{}[\gamma _\lambda (\vec {q}), \gamma _{\lambda '}^\dagger (\vec {k})] = - \frac{c_\gamma (q)}{\mathcal {F}\mathcal {F}^*} \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}), \end{aligned}$$
(5.14)

and all the other commutators equal to zero, where

$$\begin{aligned} c_\gamma (q) \equiv \frac{M_2^2}{\bar{M}_\mathrm{P}^2 q^3} \quad \text{ and } \quad \mathcal {F} \equiv \frac{(1-i) e^{-\frac{1}{4} i \pi \sqrt{\frac{\rho -4}{\rho }}}}{\sqrt{\pi }}. \end{aligned}$$
(5.15)

Notice that without knowing \(\rho \) we cannot simplify \(\mathcal {F}\mathcal {F}^*\) as we leave open the possibility that \(\rho < 4\). These commutators can be brought into the more standard form

$$\begin{aligned}{}[c_\lambda (\vec {q}), c_{\lambda '}^\dagger (\vec {k})] = - \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}) \end{aligned}$$
(5.16)

by defining

$$\begin{aligned} c_\lambda (\vec {q}) \equiv \frac{\mathcal {F}}{\sqrt{c_\gamma (q)}} \gamma _\lambda (\vec {q}), \end{aligned}$$
(5.17)

which also leads to the properly normalised modes

$$\begin{aligned} g_1(\eta ,q)= & {} \frac{\sqrt{\pi } M_2e^{i\frac{\pi }{4}\left( 1+\sqrt{1-4\frac{M^2_2}{H^2}}\right) }}{\sqrt{2}\bar{M}_\mathrm{P}q^{3/2}}\nonumber \\&\times \sqrt{-q\eta }\left( J_{\sqrt{\frac{1}{4} -\frac{M_2^2}{H^2}}}(-q\eta ) + i Y_{\sqrt{\frac{1}{4} -\frac{M_2^2}{H^2}}}(-q\eta ) \right) \nonumber \\ \end{aligned}$$
(5.18)

such that the initial function can be expressed as follows:

$$\begin{aligned} V_\lambda (\eta , \vec {q}) = c_\lambda (\vec {q}) g_1(\eta , q) + c^\dagger _{-\lambda }(-\vec {q}) g^*_1(\eta , q). \end{aligned}$$
(5.19)
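The passage from (5.12)–(5.17) to (5.18) can be verified directly. Comparing (5.13) with (5.19) and using (5.17) gives \(g_1(\eta ,q) = \sqrt{c_\gamma (q)}\, f_1(-q\eta )/\mathcal {F}\), and with \(1/(1-i) = e^{i\pi /4}/\sqrt{2}\) and \((\rho -4)/\rho = 1-4M_2^2/H^2\) one finds

$$\begin{aligned} \frac{1}{\mathcal {F}} = \frac{\sqrt{\pi }}{1-i}\, e^{\frac{i \pi }{4} \sqrt{\frac{\rho -4}{\rho }}} = \sqrt{\frac{\pi }{2}}\, e^{i\frac{\pi }{4}\left( 1+\sqrt{1-4\frac{M^2_2}{H^2}}\right) }, \quad \sqrt{c_\gamma (q)} = \frac{M_2}{\bar{M}_\mathrm{P}\, q^{3/2}}, \quad \frac{\sqrt{\rho -4}}{2\sqrt{\rho }} = \sqrt{\frac{1}{4}-\frac{M_2^2}{H^2}}. \end{aligned}$$

Multiplying these factors reproduces the prefactor and the Bessel order in (5.18).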

In a previous calculation Ref. [55] found the opposite result for the commutator in (5.16). This difference is due to the fact that we took into account the operator \(\vec {\nabla }^2\) in the definition of the conjugate momenta, Eq. (5.4), which effectively changes the overall sign when going to momentum space: \(\vec {\nabla }^2\rightarrow -q^2\).

The expression of \(g_1\) we find differs from the previous determinations of Ref. [55]: we have a factor of \(q^{3/2}\) in the denominator, instead of \(q^{1/2}\); this difference is also due to the fact that we took into account the operator \(\vec {\nabla }^2\) in the definition of the conjugate momenta. We also observe that for \(M_2>H/2\) a complex exponential appearing in \(g_1\) becomes real,

$$\begin{aligned} e^{i\frac{\pi }{4}\sqrt{1-4\frac{M^2_2}{H^2}} } = e^{-\frac{\pi }{4}\sqrt{4\frac{M^2_2}{H^2}-1} }, \quad (M_2>H/2), \end{aligned}$$
(5.20)

and exponentially suppresses the ghost mode \(g_1\) for \(M_2 \gg H\). This is what we expect because for \(M_2 \gg H\) the effect of the ghost on the inflationary perturbations should disappear. Notice, moreover, that for any value of \(M_2\) the vector modes \(g_1\) are suppressed (see Footnote 7) at superhorizon scales.
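The size of the exponential in (5.20) is easy to evaluate. The following sketch (with a hypothetical function name) assumes the principal branch of the square root, which is the choice implied by (5.20), and returns the \(M_2\)-dependent exponential appearing in \(g_1\) as a function of \(M_2/H\).

```python
import cmath
import math

def ghost_phase_factor(m2_over_h):
    """exp(i*pi/4*sqrt(1 - 4 M_2^2/H^2)), the M_2-dependent exponential in
    (5.18). cmath.sqrt uses the principal branch, so a negative argument
    gives +i*sqrt(|argument|), matching the sign in (5.20)."""
    return cmath.exp(1j * math.pi / 4 * cmath.sqrt(1 - 4 * m2_over_h**2))

# Illustrative values of M_2 / H below and above the threshold 1/2.
for ratio in (0.3, 1.0, 3.0, 10.0):
    print(ratio, abs(ghost_phase_factor(ratio)))
```

Below \(M_2=H/2\) the factor is a pure phase of unit modulus, while above it the factor is real and decreases roughly as \(e^{-\pi M_2/(2H)}\) for \(M_2\gg H\).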

We have seen from the analysis of the scalar perturbations that the commutation rules in (5.16) are not enough to identify \(c_\lambda \) and \(c_\lambda ^\dagger \) as annihilation and creation operators, respectively, but one should see how these operators appear in the Hamiltonian. This is done in Appendix B, where this identification is justified and it is also shown that the Hamiltonian does not have negative eigenvalues, at least when it is conserved, if one introduces a negative norm for the states with an odd number of quanta created by \(c_\lambda ^\dagger \).

6 Tensor perturbations

The contribution of \(S_\mathrm{ES}\) to the quadratic action for the tensor perturbations is well known (see for example the textbook [50])

$$\begin{aligned} S_\mathrm{ES}^{(T)} = \frac{\bar{M}_\mathrm{P}^2}{8} \int \mathrm{d}^4x \, a^2(h'_{ij}h'_{ij}+h_{ij}\vec {\nabla }^2h_{ij}). \end{aligned}$$
(6.1)

The contribution of \(S_\mathrm{W}\) is instead

$$\begin{aligned} S^{(T)}_\mathrm{W} =- \frac{1}{4f_2^2} \int \mathrm{d}^4x (h''_{ij} h''_{ij}+2 h'_{ij}\vec {\nabla }^2 h'_{ij} + h_{ij}\vec {\nabla }^4 h_{ij}).\nonumber \\ \end{aligned}$$
(6.2)

One can explicitly check that the presence of an arbitrary number of scalar fields does not change the form of \(S_\mathrm{ES}^{(T)}\) and \(S^{(T)}_\mathrm{W}\); the check for \(S_\mathrm{ES}^{(T)}\) requires the use of the background equations (2.24)–(2.25).

The quadratic action for the tensor perturbations \(S_{T} =S^{(T)}_\mathrm{W} +S^{(T)}_\mathrm{ES}\) can be written in terms of the Lagrangian in the usual way, \( S_{T} = \int \mathrm{d}^4x \mathscr {L}_{T},\) where

$$\begin{aligned} \mathscr {L}_{T}= & {} \frac{\bar{M}_\mathrm{P}^2a^2}{8} (h'_{ij}h'_{ij}+h_{ij}\vec {\nabla }^2h_{ij})\nonumber \\&- \frac{\bar{M}_\mathrm{P}^2}{8M_2^2} ( h''_{ij} h''_{ij}+2 h'_{ij}\vec {\nabla }^2 h'_{ij} + h_{ij}\vec {\nabla }^4 h_{ij}). \end{aligned}$$
(6.3)

We now introduce the canonical formalism that is suitable to quantise the system. To do so we use the Ostrogradsky canonical method [78, 79]; this method was introduced for Lagrangians without explicit dependence on time, but it can be applied without significant modifications in the present case, where the Lagrangian does have such a dependence (due to the cosmological scale factor), as explained in Appendix A. This system with four derivatives can be transformed into one with two derivatives by doubling the degrees of freedom

$$\begin{aligned} h_{ij}^{(1)} = h_{ij}, \quad h_{ij}^{(2)} = h'_{ij}. \end{aligned}$$
(6.4)

We can define the conjugate variables as follows:

$$\begin{aligned} p^{(1)}_{ij} = \frac{\delta \mathscr {L}}{\delta h_{ij}'^{(1)}}, \quad p^{(2)}_{ij} = \frac{\delta \mathscr {L}}{\delta h_{ij}'^{(2)}}, \end{aligned}$$
(6.5)

where we have introduced the variational derivatives for a generic variable X

$$\begin{aligned} \frac{\delta \mathscr {L}}{\delta X} = \frac{\partial \mathscr {L}}{\partial X} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathscr {L}}{\partial \dot{X}}, \end{aligned}$$
(6.6)

which generalises the usual formula to four-derivative theories. We obtain

$$\begin{aligned} p^{(1)}_{ij}= & {} \frac{\bar{M}_\mathrm{P}^2}{4}\left( a^2 h'_{ij} -\frac{2}{M_2^2}\vec {\nabla }^2h'_{ij} +\frac{1}{M_2^2} h'''_{ij} \right) , \quad \nonumber \\ p^{(2)}_{ij}= & {} -\frac{\bar{M}_\mathrm{P}^2}{4M_2^2} h''_{ij}. \end{aligned}$$
(6.7)
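The momenta in (6.7) follow from (6.3), (6.5) and (6.6) in two steps; the sketch below treats \(\vec {\nabla }^2\) as self-adjoint under the spatial integral. The partial derivatives of the Lagrangian are

$$\begin{aligned} \frac{\partial \mathscr {L}_{T}}{\partial h'_{ij}} = \frac{\bar{M}_\mathrm{P}^2}{4}\left( a^2 h'_{ij} -\frac{2}{M_2^2}\vec {\nabla }^2h'_{ij}\right) , \quad \frac{\partial \mathscr {L}_{T}}{\partial h''_{ij}} = -\frac{\bar{M}_\mathrm{P}^2}{4M_2^2} h''_{ij}, \end{aligned}$$

and, since \(\mathscr {L}_{T}\) does not depend on \(h'''_{ij}\), the definitions (6.5)–(6.6) reduce to

$$\begin{aligned} p^{(1)}_{ij} = \frac{\partial \mathscr {L}_{T}}{\partial h'_{ij}} - \frac{\mathrm{d}}{\mathrm{d}\eta } \frac{\partial \mathscr {L}_{T}}{\partial h''_{ij}}, \quad p^{(2)}_{ij} = \frac{\partial \mathscr {L}_{T}}{\partial h''_{ij}}. \end{aligned}$$

The \(h'''_{ij}\) term of \(p^{(1)}_{ij}\) in (6.7) comes entirely from the time derivative in the first relation.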

The quantisation is obtained as usual by imposing the canonical commutators. In order to do so, however, one should identify the independent degrees of freedom. The conditions in (3.3) tell us that for a plane wave with given momentum \(\vec {q}\) there are only two independent components for each canonical variable. So to identify the two independent degrees of freedom we go to momentum space and write for \(F_{ij} \equiv \{ h^{(1)}_{ij}, p^{(1)}_{ij}, h^{(2)}_{ij}, p^{(2)}_{ij} \}\),

$$\begin{aligned} F_{ij}(\eta , \vec {x}) = \int \frac{\mathrm{d}^3q}{(2\pi )^{3/2}} e^{i\vec {q}\cdot \vec {x}} \sum _{\lambda = \pm 2} F_\lambda (\eta ,\vec {q}) e^\lambda _{ij} (\hat{q}), \end{aligned}$$
(6.8)

where \(e^\lambda _{ij} (\hat{q})\) are the usual polarisation tensors for helicities \(\lambda = \pm 2\). We recall that for \(\hat{q}\) along the third axis the polarisation tensors that satisfy (3.3) are given by

$$\begin{aligned} e^{+2}_{11}= & {} -e^{+2}_{22} = 1/2, \quad e^{+2}_{12} = e^{+2}_{21} = i/2, \quad e^{+2}_{3i} =e^{+2}_{i3} = 0, \quad \nonumber \\ e^{-2}_{ij}= & {} (e^{+2}_{ij})^* , \end{aligned}$$
(6.9)

and for a generic momentum direction \(\hat{q}\) we can obtain \(e^\lambda _{ij} (\hat{q})\) by applying to (6.9) a rotation that connects the third axis with \(\hat{q}\). The polarisation tensors defined in this way obey

$$\begin{aligned} e^\lambda _{ij} (\hat{q}) (e^{\lambda '}_{ij} (\hat{q}))^* = \delta ^{\lambda \lambda '}. \end{aligned}$$
(6.10)
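The normalisation (6.10) of the tensors in (6.9) can be confirmed with a short script. This is a minimal sketch with hypothetical function names, restricted to \(\hat{q}\) along the third axis; it also checks that the tensors are symmetric, traceless and transverse, which is what we assume the conditions (3.3) encode.

```python
def polarisation_tensor(helicity):
    """e^{+2}_{ij} or e^{-2}_{ij} of (6.9) for momentum along the third axis."""
    plus = [[0.5, 0.5j, 0.0],
            [0.5j, -0.5, 0.0],
            [0.0, 0.0, 0.0]]
    if helicity == 2:
        return plus
    if helicity == -2:
        # e^{-2} is the complex conjugate of e^{+2}.
        return [[complex(x).conjugate() for x in row] for row in plus]
    raise ValueError("helicity must be +2 or -2")

def contract(e1, e2):
    """Sum over i, j of e1_ij * conj(e2_ij), the left-hand side of (6.10)."""
    return sum(e1[i][j] * complex(e2[i][j]).conjugate()
               for i in range(3) for j in range(3))

e_plus, e_minus = polarisation_tensor(2), polarisation_tensor(-2)
print(contract(e_plus, e_plus), contract(e_plus, e_minus))
```

Summing the diagonal contractions over the two helicities gives 2, which is the origin of the factor 2 in the coordinate-space commutators obtained from the momentum-space ones.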

As expected, the tensor sector includes two fields with helicities \(\pm 2\): they correspond to the graviton and the helicity \(\pm 2\) components of the spin-2 ghost. These two fields together make up the field \(h_{ij}\) with a four-derivative Lagrangian. We can now impose the canonical commutators on the variables \(F_{\lambda }\):

$$\begin{aligned}&[h^{(l)}_\lambda (\eta , \vec {q}), p^{(l')}_{\lambda '}(\eta , \vec {k})^\dagger ] = i \delta _{\lambda \lambda '} \delta ^{l l'}\delta ^{(3)}(\vec {q}-\vec {k}), \nonumber \\&\quad \text{(and } \text{ all } \text{ the } \text{ other } \text{ commutators } \text{ vanishing), } \end{aligned}$$
(6.11)

which, according to the expansion in (6.8) and the condition in (6.10), lead to the following canonical commutators in coordinate space:

$$\begin{aligned}&[h^{(l)}_{ij}(\eta , \vec {x}), p^{(l')}_{ij}(\eta , \vec {y})] = 2i \delta ^{l l'}\delta ^{(3)}(\vec {x}-\vec {y}),\nonumber \\&\quad \text{(and } \text{ all } \text{ the } \text{ other } \text{ commutators } \text{ vanishing) }. \end{aligned}$$
(6.12)

Now that we know how to quantise we come back to the action in (6.3) and the associated equation

$$\begin{aligned} (a^2h'_{ij})' - a^2 \vec {\nabla }^2 h_{ij} +\frac{1}{M_2^2} (h''''_{ij} -2\vec {\nabla }^2 h''_{ij}+\vec {\nabla }^4 h_{ij}) = 0.\nonumber \\ \end{aligned}$$
(6.13)

By using the expansion in (6.8) for \(h_{ij}\) we obtain

$$\begin{aligned} (a^2h'_\lambda )' + a^2q^2 h_\lambda +\frac{1}{M_2^2} (h''''_\lambda +2q^2 h''_\lambda +q^4 h_\lambda ) = 0.\nonumber \\ \end{aligned}$$
(6.14)

Notice that, again, we can replace \(a^2\) and \(\mathcal {H}\) here with their pure de Sitter expressions, \(a^2(\eta )= 1/(H^2\eta ^2)\) and \(\mathcal {H} = -1/\eta \): indeed (2.25) shows that the error that is produced in this way is beyond the next-to-leading order slow-roll approximation, which we are using here. We then obtain

$$\begin{aligned} \frac{\mathrm{d}^4}{\mathrm{d}z^4} h_\lambda +2\frac{\mathrm{d}^2}{\mathrm{d}z^2} h_\lambda + h_\lambda + \frac{1}{\rho } \left[ \frac{\mathrm{d}}{\mathrm{d}z}\left( \frac{1}{z^2} \frac{\mathrm{d}}{\mathrm{d}z} h_\lambda \right) +\frac{1}{z^2} h_\lambda \right] =0, \nonumber \\ \end{aligned}$$
(6.15)

where we have introduced again \(z\equiv -q \eta \) and \(\rho \equiv H^2/M_2^2\). To solve this equation we follow the method of [52, 54] and improve it by showing that it provides all the linearly independent solutions. We write the differential operator appearing in this equation,

$$\begin{aligned} \mathcal {D}_z \equiv \frac{\mathrm{d}^4}{\mathrm{d}z^4} +2\frac{\mathrm{d}^2}{\mathrm{d}z^2} + 1 + \frac{1}{\rho } \left( \frac{1}{z^2}\frac{\mathrm{d}^2}{\mathrm{d}z^2} -\frac{2}{z^3}\frac{\mathrm{d}}{\mathrm{d}z} +\frac{1}{z^2}\right) ,\nonumber \\ \end{aligned}$$
(6.16)

in two equivalent ways

$$\begin{aligned} \mathcal {D}_z= & {} \left( \frac{\mathrm{d}^2}{\mathrm{d}z^2} +\frac{2}{z}\frac{\mathrm{d}}{\mathrm{d}z}+1+\frac{1}{\rho z^2}\right) \left( \frac{\mathrm{d}^2}{\mathrm{d}z^2}-\frac{2}{z}\frac{\mathrm{d}}{\mathrm{d}z} +1\right) , \end{aligned}$$
(6.17)
$$\begin{aligned} \mathcal {D}_z= & {} \left( \frac{1}{z^2}\frac{\mathrm{d}^2}{\mathrm{d}z^2} -\frac{2}{z^3}\frac{\mathrm{d}}{\mathrm{d}z}+\frac{1}{ z^2}\right) \nonumber \\&\times \left( z^2\frac{\mathrm{d}^2}{\mathrm{d}z^2}-2z\frac{\mathrm{d}}{\mathrm{d}z} +2+z^2+\frac{1}{\rho }\right) . \end{aligned}$$
(6.18)
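
The equivalence of (6.16) with the factorised forms (6.17) and (6.18) can be checked mechanically. The following is a minimal sketch, not part of the original derivation: it stores test functions as Laurent polynomials in z with exact rational coefficients and applies the three operators to the monomials \(z^n\). The helper names (`deriv`, `shift`, `combine`, `d_full`, `d_617`, `d_618`) and the sample value of \(\rho \) are illustrative assumptions.

```python
from fractions import Fraction

# A Laurent polynomial in z is stored as {power: coefficient}.

def deriv(p):
    """d/dz of a Laurent polynomial."""
    return {k - 1: c * k for k, c in p.items() if k != 0}

def shift(p, n):
    """Multiply a Laurent polynomial by z**n."""
    return {k + n: c for k, c in p.items()}

def combine(*terms):
    """Sum of coeff * polynomial over (coeff, polynomial) pairs, dropping zeros."""
    out = {}
    for coeff, p in terms:
        for k, c in p.items():
            out[k] = out.get(k, 0) + coeff * c
    return {k: c for k, c in out.items() if c != 0}

def d_full(h, rho):
    """Operator (6.16) applied to h."""
    d1 = deriv(h)
    d2 = deriv(d1)
    d4 = deriv(deriv(d2))
    return combine((1, d4), (2, d2), (1, h),
                   (1 / rho, shift(d2, -2)),
                   (-2 / rho, shift(d1, -3)),
                   (1 / rho, shift(h, -2)))

def d_617(h, rho):
    """Factorised form (6.17): right factor first, then left factor."""
    d1 = deriv(h)
    d2 = deriv(d1)
    u = combine((1, d2), (-2, shift(d1, -1)), (1, h))
    u1 = deriv(u)
    u2 = deriv(u1)
    return combine((1, u2), (2, shift(u1, -1)), (1, u), (1 / rho, shift(u, -2)))

def d_618(h, rho):
    """Factorised form (6.18): right factor first, then left factor."""
    d1 = deriv(h)
    d2 = deriv(d1)
    u = combine((1, shift(d2, 2)), (-2, shift(d1, 1)),
                (2 + 1 / rho, h), (1, shift(h, 2)))
    u1 = deriv(u)
    u2 = deriv(u1)
    return combine((1, shift(u2, -2)), (-2, shift(u1, -3)), (1, shift(u, -2)))

rho = Fraction(7, 3)  # arbitrary sample value of H^2 / M_2^2 (assumption)
agree = all(
    d_full({n: Fraction(1)}, rho)
    == d_617({n: Fraction(1)}, rho)
    == d_618({n: Fraction(1)}, rho)
    for n in range(-4, 7)
)
print(agree)
```

Because the three operators are linear, agreement on a range of monomials is a strong (though not fully general) check of the operator identities; changing `rho` or widening the range of `n` is a simple way to probe further.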

Therefore, the solutions of the two second order equations

$$\begin{aligned}&\left( \frac{\mathrm{d}^2}{\mathrm{d}z^2}-\frac{2}{z}\frac{\mathrm{d}}{\mathrm{d}z} +1\right) h_\lambda =0, \nonumber \\&\left( z^2\frac{\mathrm{d}^2}{\mathrm{d}z^2}-2z\frac{\mathrm{d}}{\mathrm{d}z} +2+z^2+\frac{1}{\rho }\right) h_\lambda =0 \end{aligned}$$
(6.19)

are all solutions of the four-derivative equation in (6.15); we also show now that they are all linearly independent. Substituting \(h_\lambda \rightarrow z \mu ^{(1)}_\lambda \) in the first equation and \(h_\lambda \rightarrow z \mu ^{(2)}_\lambda \) in the second one we obtain the equivalent equations

$$\begin{aligned}&\frac{\mathrm{d}^2\mu ^{(1)}_\lambda }{\mathrm{d}z^2} +\left( 1-\frac{2}{z^2}\right) \mu ^{(1)}_\lambda =0, \nonumber \\&\frac{\mathrm{d}^2\mu ^{(2)}_\lambda }{\mathrm{d}z^2} +\left( 1+\frac{1}{\rho z^2}\right) \mu ^{(2)}_\lambda =0. \end{aligned}$$
(6.20)
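
For completeness, the intermediate step can be spelled out as a short worked check. Writing \(h_\lambda = z\mu \), with a prime denoting here \(\mathrm{d}/\mathrm{d}z\), one has \(h_\lambda ' = \mu + z\mu '\) and \(h_\lambda '' = 2\mu ' + z\mu ''\), so the two operators in (6.19) give

$$\begin{aligned} \left( \frac{\mathrm{d}^2}{\mathrm{d}z^2}-\frac{2}{z}\frac{\mathrm{d}}{\mathrm{d}z} +1\right) (z\mu )&= z\left[ \mu '' +\left( 1-\frac{2}{z^2}\right) \mu \right] , \\ \left( z^2\frac{\mathrm{d}^2}{\mathrm{d}z^2}-2z\frac{\mathrm{d}}{\mathrm{d}z} +2+z^2+\frac{1}{\rho }\right) (z\mu )&= z^3\left[ \mu '' +\left( 1+\frac{1}{\rho z^2}\right) \mu \right] . \end{aligned}$$

The first-derivative terms cancel in both cases, and for \(z\ne 0\) each equation in (6.19) is therefore equivalent to the corresponding one in (6.20).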

The two linearly independent solutions of the first equation are

$$\begin{aligned} \frac{1- i z}{z} e^{iz} \quad \text{ and } \text{ its } \text{ complex } \text{ conjugate, } \end{aligned}$$
(6.21)

while the two linearly independent solutions of the second one are

$$\begin{aligned} \sqrt{z} J_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z)+i \sqrt{z} Y_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z) \quad \text{ and } \text{ its } \text{ complex } \text{ conjugate. }\nonumber \\ \end{aligned}$$
(6.22)
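
Both statements can be verified directly; the following is a brief check, with a prime denoting \(\mathrm{d}/\mathrm{d}z\). Multiplying (6.21) by z gives \(h = (1-iz)e^{iz}\), for which

$$\begin{aligned} h' = z\, e^{iz}, \quad h'' = (1+iz)\, e^{iz}, \quad h''-\frac{2}{z}h'+h = \left( 1+iz-2+1-iz\right) e^{iz} = 0, \end{aligned}$$

so the first equation in (6.19), and hence the first one in (6.20), is satisfied. For (6.22) we use the standard fact that, if \(Z_\nu \) solves Bessel’s equation \(z^2 Z_\nu '' + z Z_\nu ' + (z^2-\nu ^2) Z_\nu = 0\), then \(\mu = \sqrt{z}\, Z_\nu (z)\) obeys

$$\begin{aligned} \mu '' + \left( 1+\frac{\frac{1}{4}-\nu ^2}{z^2}\right) \mu = 0. \end{aligned}$$

Comparing with the second equation in (6.20) requires \(\frac{1}{4}-\nu ^2 = 1/\rho \), that is \(\nu = \sqrt{\frac{1}{4}-\frac{1}{\rho }} = \frac{\sqrt{\rho -4}}{2\sqrt{\rho }}\), which is the order appearing in (6.22).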

Now, given the hermiticity condition of the field in momentum space, \(h_\lambda (\eta ,\vec {q})^\dagger =h_{-\lambda }(\eta , -\vec {q})\), we can write (see Footnote 8)

$$\begin{aligned} h_\lambda (\eta , \vec {q})= & {} \alpha _\lambda (\vec {q}) w_2(-\eta q) +\beta _\lambda (\vec {q}) f_2(-\eta q) \nonumber \\&+ \alpha ^\dagger _{-\lambda }(-\vec {q}) w^*_2(-\eta q)+ \beta ^\dagger _{-\lambda }(-\vec {q}) f^*_2(-\eta q),\nonumber \\ \end{aligned}$$
(6.23)

where

$$\begin{aligned}&w_2(z)\equiv (1-i z) e^{i z}, \nonumber \\&f_2(z) \equiv z^{3/2}\left( J_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z) +iY_{\frac{\sqrt{\rho -4}}{2 \sqrt{\rho }}}(z) \right) \end{aligned}$$
(6.24)

and \(\alpha _\lambda (\vec {q})\) and \(\beta _\lambda (\vec {q})\) are suitable coefficients (to be interpreted as operators in the quantum theory). Their normalisation can be fixed with a method similar to the one used for scalar and vector perturbations. However, the situation here is more complicated because we have four functions instead of two. We find that these four functions \(w_2\), \(w_2^*\), \(f_2\), \(f_2^*\) are linearly independent: their Wronskian,

$$\begin{aligned} \text{ Wronskian } = \left| \begin{array}{llll} w_2 &{} w_2^* &{} f_2 &{} f_2^* \\ w_2' &{} w_2'^* &{} f_2' &{} f_2'^* \\ w_2'' &{} w_2''^* &{} f_2'' &{} f_2''^* \\ w_2''' &{} w_2'''^* &{} f_2''' &{} f_2'''^* \end{array} \right| , \end{aligned}$$
(6.25)

is never zero, as shown in Fig. 1. It follows that we can always express \(\alpha _\lambda (\vec {q}), \beta _\lambda (\vec {q}), \alpha ^\dagger _{-\lambda }(-\vec {q}), \beta ^\dagger _{-\lambda }(-\vec {q})\) as linear functionals of \(h_\lambda (\eta , \vec {q})\). Therefore, there exists only one assignment of the commutation rules of these four operators that satisfies the canonical commutators in (6.11) (this result closes a loophole of previous determinations, where the uniqueness of the commutation rules of \(\alpha _\lambda (\vec {q}), \beta _\lambda (\vec {q}), \alpha ^\dagger _{-\lambda }(-\vec {q}), \beta ^\dagger _{-\lambda }(-\vec {q})\) was not proved).

Fig. 1

The Wronskian of the four solutions of the tensor perturbations on de Sitter space, defined in Eq. (6.25), as a function of the ratio between the Hubble rate and the ghost mass, showing that the solutions are linearly independent. The inset shows the behaviour for \(H< 2 M_2\) with a logarithmic vertical scale. The Wronskian does not depend on z because the fourth order equation (6.15) has no third-derivative term (Liouville’s theorem)
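
As a reminder of the standard result invoked in the caption (often called Abel’s identity or Liouville’s formula), the Wronskian W of four solutions of a linear fourth-order equation obeys

$$\begin{aligned} \frac{\mathrm{d}W}{\mathrm{d}z} = -p_3(z)\, W \quad \text{for} \quad h'''' + p_3 h''' + p_2 h'' + p_1 h' + p_0 h = 0. \end{aligned}$$

Reading off the coefficients from (6.16), one finds

$$\begin{aligned} p_3 = 0, \quad p_2 = 2+\frac{1}{\rho z^2}, \quad p_1 = -\frac{2}{\rho z^3}, \quad p_0 = 1+\frac{1}{\rho z^2}, \end{aligned}$$

so that \(\mathrm{d}W/\mathrm{d}z = 0\) and W can depend only on \(\rho \).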

The commutation rules we find are

$$\begin{aligned}&[\alpha _\lambda (\vec {q}), \alpha _{\lambda '}^\dagger (\vec {k})] = c_\alpha (q) \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}),\quad \nonumber \\&[\beta _\lambda (\vec {q}), \beta _{\lambda '}^\dagger (\vec {k})] = - \frac{c_\alpha (q)}{\mathcal {F}^*\mathcal {F}} \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}) \end{aligned}$$
(6.26)

and all the other commutators are equal to zero, where

$$\begin{aligned} c_\alpha (q) \equiv \frac{2H^2}{\bar{M}_\mathrm{P}^2 q^3(1+2\rho )}, \quad \mathcal {F} \equiv \frac{(1-i) e^{-\frac{1}{4} i \pi \sqrt{\frac{\rho -4}{\rho }}}}{\sqrt{\pi }}. \end{aligned}$$
(6.27)

These commutators can be brought into the more standard form

$$\begin{aligned}&[a_\lambda (\vec {q}), a_{\lambda '}^\dagger (\vec {k})] = \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}),\nonumber \\&[b_\lambda (\vec {q}), b_{\lambda '}^\dagger (\vec {k})] = - \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}), \end{aligned}$$
(6.28)

by defining

$$\begin{aligned} a_\lambda (\vec {q}) \equiv \frac{1}{\sqrt{c_\alpha (q)}} \alpha _\lambda (\vec {q}), \quad b_\lambda (\vec {q}) \equiv \frac{\mathcal {F}}{\sqrt{c_\alpha (q)}} \beta _\lambda (\vec {q}), \end{aligned}$$
(6.29)

which also leads to properly normalised modes

$$\begin{aligned} y_2(\eta , q)= & {} \frac{\sqrt{2} H}{\bar{M}_\mathrm{P}q^{3/2}\sqrt{1+2\frac{H^2}{M_2^2}}}(1+i q\eta ) e^{-i q\eta }, \\ g_2(\eta , q)= & {} \frac{\sqrt{\pi } He^{i\frac{\pi }{4}\left( 1+\sqrt{1-4\frac{M^2_2}{H^2}}\right) }}{\bar{M}_\mathrm{P}q^{3/2}\sqrt{1+2\frac{H^2}{M_2^2}}} (-q\eta )^{3/2}\nonumber \\&\left( J_{\sqrt{\frac{1}{4} -\frac{M_2^2}{H^2}}}(-q\eta ) + i Y_{\sqrt{\frac{1}{4} -\frac{M_2^2}{H^2}}}(-q\eta ) \right) \nonumber \end{aligned}$$
(6.30)

such that the initial function can be expressed as follows:

$$\begin{aligned} h_\lambda (\eta , \vec {q})= & {} a_\lambda (\vec {q}) y_2(\eta , q) +b_\lambda (\vec {q}) g_2(\eta , q)\nonumber \\&+ a^\dagger _{-\lambda }(-\vec {q}) y^*_2(\eta , q)+ b^\dagger _{-\lambda }(-\vec {q}) g^*_2(\eta , q). \nonumber \\ \end{aligned}$$
(6.31)
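
As a consistency check, the passage from (6.26) and (6.27) to (6.28) and (6.30) can be made explicit. The lines below are a sketch that assumes, as the square roots in (6.29) suggest, that \(c_\alpha (q)\) is real and positive. The rescaling (6.29) gives

$$\begin{aligned} [a_\lambda (\vec {q}), a_{\lambda '}^\dagger (\vec {k})]&= \frac{1}{c_\alpha (q)} [\alpha _\lambda (\vec {q}), \alpha _{\lambda '}^\dagger (\vec {k})] = \delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}), \\ [b_\lambda (\vec {q}), b_{\lambda '}^\dagger (\vec {k})]&= \frac{\mathcal {F}\mathcal {F}^*}{c_\alpha (q)} [\beta _\lambda (\vec {q}), \beta _{\lambda '}^\dagger (\vec {k})] = -\delta _{\lambda \lambda '}\delta ^{(3)}(\vec {q}-\vec {k}), \end{aligned}$$

while substituting \(\alpha _\lambda = \sqrt{c_\alpha }\, a_\lambda \) and \(\beta _\lambda = \sqrt{c_\alpha }\, b_\lambda /\mathcal {F}\) in (6.23) identifies the modes as

$$\begin{aligned} y_2(\eta , q) = \sqrt{c_\alpha (q)}\, w_2(-q\eta ), \quad g_2(\eta , q) = \frac{\sqrt{c_\alpha (q)}}{\mathcal {F}}\, f_2(-q\eta ), \quad \frac{1}{\mathcal {F}} = \sqrt{\frac{\pi }{2}}\, e^{i\frac{\pi }{4}\left( 1+\sqrt{\frac{\rho -4}{\rho }}\right) }, \end{aligned}$$

where the last equality uses \(1-i = \sqrt{2}\, e^{-i\pi /4}\). With \(\rho = H^2/M_2^2\) these expressions reproduce the prefactors in (6.30).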

The first commutator in (6.28) is the standard one for the graviton while the second corresponds to the ghost. In Appendix B.3 we give a rationale for identifying \(a_\lambda \) and \(b_\lambda \) as annihilation operators and \(a_\lambda ^\dagger \) and \(b_\lambda ^\dagger \) as creation operators; we also show there that the Hamiltonian does not have any negative energy eigenvalue provided that the norms of the states created by \(a^\dagger _\lambda \) are positive and those with an odd number of b-quanta are negative. Notice that the expression for \(y_2\) we find reduces to the standard graviton de Sitter mode when \(M_2 \gg H\), but for \(M_2\ll H\) it is suppressed by the factor \((1+2H^2/M_2^2)^{-1/2}\). The expressions of \(y_2\) and \(g_2\) agree with the previous determinations of Refs. [52, 54]. However, we observe here that for \(M_2>H/2\) a complex exponential appearing in \(g_2\) becomes real,

$$\begin{aligned} e^{i\frac{\pi }{4}\sqrt{1-4\frac{M^2_2}{H^2}} } = e^{-\frac{\pi }{4}\sqrt{4\frac{M^2_2}{H^2}-1} }, \quad (M_2>H/2), \end{aligned}$$
(6.32)

and exponentially suppresses the ghost mode \(g_2\) for \(M_2 \gg H\) (just like what happened for the vector mode \(g_1\)). This is what we expect because for \(M_2 \gg H\) the effect of the ghost on the inflationary perturbations should disappear.
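
To get a feeling for the size of this suppression, the factor on the right-hand side of (6.32) can be tabulated. The snippet below is a minimal illustrative sketch: the function name `ghost_suppression` and the sample mass ratios are assumptions made here, and the formula is used only where the argument of the square root is non-negative, \(4M_2^2/H^2\ge 1\).

```python
import math

def ghost_suppression(m2_over_h):
    """Factor exp(-(pi/4) * sqrt(4 M_2^2/H^2 - 1)) from Eq. (6.32)."""
    arg = 4.0 * m2_over_h**2 - 1.0
    if arg < 0:
        raise ValueError("Eq. (6.32) requires 4 M_2^2 / H^2 >= 1")
    return math.exp(-0.25 * math.pi * math.sqrt(arg))

# Sample ratios M_2/H (illustrative values only).
ratios = [0.5, 1.0, 2.0, 5.0, 10.0]
factors = [ghost_suppression(x) for x in ratios]
for x, f in zip(ratios, factors):
    print(f"M_2/H = {x:5.1f}  ->  suppression factor = {f:.3e}")
```

For \(M_2\gg H\) the factor approaches \(e^{-\pi M_2/(2H)}\), so the suppression of \(g_2\) sets in quickly once the ghost mass exceeds the Hubble rate.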

Notice that the tensor modes associated with the ghost, \(g_2(\eta , q)\), vanish at superhorizon scales, \(\eta \rightarrow 0\), even faster than the vector modes. On the other hand, the modes \(y_2(\eta , q)\), associated with the ordinary graviton, do not vanish in this limit.

7 Observational quantities

We consider the power spectra of the perturbations that survive at superhorizon scales, namely the curvature perturbation and the perturbations associated with the ordinary graviton, with modes given in Eq. (6.30), and, if \(M_2^2/H^2 \lesssim 1/N\), the extra scalar perturbation B. Besides these, as usual, the presence of several scalar fields might lead to additional isocurvature modes, which are constrained by observations (see e.g. [60]). We do not enter the analysis of such effects because they are model-dependent and we wish to keep the analysis general here. Such perturbations can be suppressed by an inflationary attractor that effectively reduces the system to a single-field one. The presence of such an attractor has been established in [10] in some models of the sort we analyse here (see also the analysis of the specific model of Sect. 8).

In Sect. 4 we have seen that the curvature perturbation \(\mathcal {R}\) does not receive sizeable corrections from the ghost in the slow-roll approximation. Therefore, the associated power spectrum \(P_\mathcal {R}(q)\) is (see e.g. [61])

$$\begin{aligned} P_\mathcal {R}(q)=\left( \frac{H}{2\pi }\right) ^2 N_{,i}N^{,i}. \end{aligned}$$
(7.1)

Here the field dependent number of e-folds \(N(\phi )\) is defined in Eq. (2.35) and in this section we compute the power spectra at horizon exit \(q=a H\). The corresponding spectral index \(n_\mathrm{s}\) is given in terms of the slow-roll parameters \(\epsilon \) and \(\eta ^i_{\,\,\, j}\) in (2.31) and (2.32) by [61, 62]

$$\begin{aligned} n_\mathrm{s} =1-2\epsilon - \frac{ 2 }{ \bar{M}_\mathrm{P}^2 N_{,i}N^{,i}}+\frac{2\eta _{ij}N^{,i}N^{,j}}{N_{,k}N^{,k}}. \end{aligned}$$
(7.2)

In Sect. 6, we have seen that the perturbation associated with the ordinary graviton is suppressed with respect to the usual expression in Einstein gravity (see for example the textbook [63]) by a factor of \((1+2H^2/M_2^2)^{-1/2}\). Therefore, the power spectrum of tensor perturbations is given by

$$\begin{aligned} P_\mathrm{t} = \frac{1}{1+\frac{2 H^2}{M_2^2}} \frac{8}{\bar{M}_\mathrm{P}^2} \left( \frac{H}{2\pi }\right) ^2. \end{aligned}$$
(7.3)

By taking the ratio between this equation and (7.1) we obtain the tensor-to-scalar ratio

$$\begin{aligned} r\equiv \frac{P_\mathrm{t}}{P_\mathcal {R}}= \frac{1}{1+\frac{2 H^2}{M_2^2}}\frac{8}{ \bar{M}_\mathrm{P}^2 N_{,i}N^{,i}}. \end{aligned}$$
(7.4)

Also, we have seen in Sect. 4 that, in addition to \(\mathcal {R}\), there is another scalar perturbation, denoted with B, that survives at superhorizon scales for \(M_2^2/H^2 \lesssim 1/N\). The power spectrum \(P_\mathrm{B}\) of the spatial gradient of B has been computed in [58] for a single-field inflationary model. The results of Sect. 4 show that the same formula holds for a general matter sector. It turns out to be the same as the tensor power spectrum in Einstein’s gravity, except that it is smaller by a factor of \(16/3\approx 5\):

$$\begin{aligned} P_\mathrm{B} =\frac{3}{2\bar{M}_\mathrm{P}^2} \left( \frac{H}{2\pi }\right) ^2. \end{aligned}$$
(7.5)

We will conveniently parameterise the effect of B as is done for the tensor perturbations: we introduce the ratio

$$\begin{aligned} r' \equiv \frac{P_\mathrm{B} }{P_\mathcal {R}}=\frac{3}{2 \bar{M}_\mathrm{P}^2 N_{,i}N^{,i}}. \end{aligned}$$
(7.6)

Many models of inflation based on Einstein’s gravity predict a tensor power spectrum that is small compared to the curvature power spectrum; since \(P_\mathrm{B}\) is smaller still, by the factor of about 5 noted above, \(r'\) is correspondingly small in such models. Notice also that the correlation between this isocurvature mode and \(\mathcal {R}\) is suppressed at superhorizon scales: indeed, B contains only the creation and annihilation operators of \(\Psi \), while \(\mathcal {R}\) contains only those of the scalar field fluctuations \(\varphi ^i\) at these scales and in the slow-roll approximation. Therefore, the bounds on isocurvature power spectra of the last Planck data [60] can easily be fulfilled.

Fig. 2

The ratio \(r'\) between the isocurvature power spectrum \(P_\mathrm{B}\) of (the gradient of) B and the curvature power spectrum \(P_{\mathcal {R}}\), as defined in Eq. (7.6), as a function of the ratio between the mass parameters of \(\zeta \) and s, which sets the dominant inflaton (as explained in the text). The corresponding values of r are given in Eq. (8.4). The insets show the values of \(n_\mathrm{s}\). An interval of e-folds of \(55< N < 65\) is considered (N suppresses \(r'\) and enhances \(n_\mathrm{s}\)). The plot on the left shows the results for \(\xi _\mathrm{s} = 0.1\), while that on the right shows those for \(\xi _\mathrm{s} =1\)

We see that the main observational implication of the presence of the ghost in the inflationary perturbations is to suppress r and to introduce another scalar perturbation for small ghost masses, \(M_2\lesssim H/\sqrt{N}\). The spectral index \(n_\mathrm{s}\) and the power spectrum \(P_\mathcal {R}\) are in general insensitive to the ghost. We also stress that this conclusion is independent of the matter content of the theory.

8 An example: the Higgs and the planckion

We now apply the results we obtained to a simple, yet realistic setup: we assume that the only scalar fields that can be active during inflation are the Higgs field, a scalar s that generates the Planck scale (see Footnote 9) through its VEV \(\langle s\rangle \), that is \(\bar{M}_\mathrm{P}^2 = \xi _\mathrm{s} \langle s\rangle ^2\), and of course \(\zeta \), which corresponds to the \(R^2\) term in the Lagrangian (see Sect. 2.1). Because the Planck mass is due to s, this field can be thought of as a Higgs of gravity. In Refs. [10, 64, 65] it has been shown that ordinary Higgs inflation [66,67,68,69,70,71,72] (because of the sizeable running of its quartic self-coupling [73,74,75]) always plays a subdominant role during inflation. One can therefore restrict attention to s and \(\zeta \).

There is mixing between s and \(\zeta \), and the squared masses of the mass eigenstates are [3, 10]

$$\begin{aligned} M_{\pm }^2 = \frac{M_\mathrm{s}^2 + \xi _\mathrm{c} M_0^2}{2} \pm \frac{1}{2} \sqrt{(M_\mathrm{s}^2 + \xi _\mathrm{c} M_0^2)^2 - 4 M_\mathrm{s}^2 M_0^2},\nonumber \\ \end{aligned}$$
(8.1)

where \(M_0^2\equiv f_0^2 \bar{M}_\mathrm{P}^2/2\), \(M_\mathrm{s}^2\equiv \langle \partial ^2 V/ \partial s^2 \rangle \) and \(\xi _\mathrm{c} \equiv 1+6\xi _\mathrm{s}\).

In Refs. [3, 10] inflation has been considered in this setup, but assuming that the ghost does not play a significant role. This is always the case at the background level as the FRW metric is conformally flat and the effect of the Weyl-squared term vanishes. That is also the case at the linear level in the perturbations whenever the ghost mass satisfies \(M_2 \gg H\). Here we would like to extend this analysis to smaller values of \(M_2\). By taking into account the Weyl-squared term, in addition to \(\mathcal {R}\) and the tensor perturbation there is in general another relevant perturbation: the isocurvature scalar mode B. The corresponding power spectra have been given in the general case in Sect. 7. We assume here \(M_2^2/H^2 \lesssim 1/N\) because the other case \(M_2^2/H^2 \gtrsim 1/N\) can be simply obtained by neglecting B.

As shown in [10], there is an inflationary attractor that effectively reduces the system to a single-field inflationary model: this single field is in general a combination of s and \(\zeta \), which, however, reduces to s when \(M_0 \gg M_\mathrm{s}\) and to \(\zeta \) in the other limit. These two cases correspond to the following predictions.

  • For \(M_0 \gg M_\mathrm{s}\) (s-inflation) the inflationary predictions are

    $$\begin{aligned} n_\mathrm{s}\approx & {} 1-\frac{2}{N}\mathop {\approx }\limits ^{N\approx 60}0.967,\quad r'\approx \frac{3}{2N}\mathop {\approx }\limits ^{N\approx 60} 0.025 \quad \nonumber \\ r\approx & {} \frac{1}{1+\frac{2 H^2}{M_2^2}}\frac{8}{N}\mathop {\approx }\limits ^{N\approx 60} \frac{0.13}{1+\frac{2 H^2}{M_2^2}} . \end{aligned}$$
    (8.2)

    The scalar amplitude \(P_\mathcal {R}=M_\mathrm{s}^2 N^2/(6\pi ^2 \bar{M}_\mathrm{P}^2)\) is reproduced for \(M_\mathrm{s}\approx 1.4 \times 10^{13}\) GeV.

  • In the opposite limit (\(\zeta \)-inflation) one realises Starobinsky’s inflation and the inflationary predictions are

    $$\begin{aligned} n_\mathrm{s}\approx & {} 1 - \frac{2}{N}\mathop {\approx }\limits ^{N\approx 60} 0.967,\quad r' \approx \frac{9}{4N^2}\mathop {\approx }\limits ^{N\approx 60} 0.0006, \quad \nonumber \\ r\approx & {} \frac{1}{1+\frac{2 H^2}{M_2^2}}\frac{12}{N^2}\mathop {\approx }\limits ^{N\approx 60} \frac{0.003}{1+\frac{2 H^2}{M_2^2}}. \end{aligned}$$
    (8.3)

    The scalar amplitude \(P_\mathcal {R} = f_0^2 N^2/(48\pi ^2)\) is reproduced for \(f_0 \approx 1.8 \times 10^{-5}\).

In Fig. 2 we present \(r'\) and \(n_\mathrm{s}\) for intermediate values of \(M_0/M_\mathrm{s}\). The corresponding value of r is given by

$$\begin{aligned} r = \frac{16 r'}{3\left( 1+\frac{2H^2}{M_2^2}\right) }, \end{aligned}$$
(8.4)

where we combined (7.3), (7.4), (7.5) and (7.6). We see that r is strongly suppressed when \(M_2 \ll H\): this limit is relevant because, for example, the maximal value of \(M_2\) compatible with the solution of the hierarchy problem discussed in [3] is \(\sim 10^{11}\) GeV, while the typical value of the inflationary H in this setup is \(\sim 10^{13}\) GeV. We find that \(r'\) satisfies the bound on isocurvature power spectra of [60] by taking an appropriate value of \(f_0\) (this condition is required to match the observed \(P_{\mathcal {R}}\) and leads typically to \(f_0\sim 10^{-5}\)); in the limit of pure s-inflation we find, however, that the bounds are very close to the predictions, which suggests that this possibility can be tested with future observations [76]. The prediction of \(n_\mathrm{s}\) is quite stable as a function of \(M_0/M_\mathrm{s}\) and agrees well with the bounds of [60]. Finally, r always satisfies the bounds of [60] when \(M_0\ll M_\mathrm{s}\). In the opposite limit there is some tension when \(M_2 \gtrsim H\), as s-inflation is essentially due to a quadratic potential, which predicts a rather large value of r. This tension, however, disappears if we take instead \(M_2 \ll H\), as suggested by Ref. [3]. We see that a relatively light ghost can make even inflation driven by a quadratic potential viable.
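
The limiting predictions (8.2) and (8.3), together with relation (8.4), are easy to evaluate numerically. The following sketch is illustrative only: the function names and the sample values of N and \(H/M_2\) are assumptions made here, and the formulas are the leading-order large-N expressions quoted above rather than the full results shown in Fig. 2.

```python
def suppression(h_over_m2):
    """Factor 1 / (1 + 2 H^2 / M_2^2) multiplying r in Eqs. (7.4), (8.2)-(8.4)."""
    return 1.0 / (1.0 + 2.0 * h_over_m2**2)

def s_inflation(n_efolds, h_over_m2):
    """Leading-order predictions of Eq. (8.2): (n_s, r', r)."""
    n_s = 1.0 - 2.0 / n_efolds
    r_prime = 3.0 / (2.0 * n_efolds)
    r = suppression(h_over_m2) * 8.0 / n_efolds
    return n_s, r_prime, r

def zeta_inflation(n_efolds, h_over_m2):
    """Leading-order predictions of Eq. (8.3): (n_s, r', r)."""
    n_s = 1.0 - 2.0 / n_efolds
    r_prime = 9.0 / (4.0 * n_efolds**2)
    r = suppression(h_over_m2) * 12.0 / n_efolds**2
    return n_s, r_prime, r

def r_from_r_prime(r_prime, h_over_m2):
    """Eq. (8.4): r = 16 r' / (3 (1 + 2 H^2 / M_2^2))."""
    return 16.0 * r_prime / 3.0 * suppression(h_over_m2)

if __name__ == "__main__":
    N = 60  # sample number of e-folds
    # H/M_2 = 0 mimics the Einstein limit M_2 >> H; 100 corresponds to
    # H ~ 1e13 GeV and M_2 ~ 1e11 GeV, the orders of magnitude quoted in the text.
    for h_over_m2 in (0.0, 1.0, 100.0):
        for name, model in (("s", s_inflation), ("zeta", zeta_inflation)):
            n_s, r_prime, r = model(N, h_over_m2)
            print(f"{name}-inflation, H/M_2={h_over_m2:g}: "
                  f"n_s={n_s:.3f}, r'={r_prime:.4f}, r={r:.2e}")
```

In both limits `r_from_r_prime` reproduces the value of r returned by the model function, which is just the statement that (8.2) and (8.3) are consistent with (8.4).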

9 Issues due to the Weyl-squared term

Depending on how the probability is defined, the decay width of the ghost, \(\Gamma _2\), may be negative, leading to interpretational issues. Feynman has pointed out that negative probabilities are still acceptable if the corresponding events are in some way unobservable [77]. The processes affected by \(\Gamma _2<0\) in our case might be interpreted as nothing but intermediate processes of complete events with positive total probability, along the lines of [29, 77]. In the early universe these consist typically of decays of the inflaton into ghosts and other particles followed by ghost decays, which lead to total positive probabilities.

Reference [49] pointed out, however, that such intermediate processes lead to a microscopic violation of causality (see Footnote 10); to show this the authors considered two stable particles prepared in some initial state that come close enough to interact through the exchange of a ghost and are later detected. Reference [49] considered a scalar ghost on flat space; it is not known whether the same result holds for a spin-2 state on de Sitter spacetime and we leave a complete analysis for future work. However, even if it were the case some necessary conditions should be met to conclude that causality is violated: first, the energy should not be much smaller than the ghost mass in order to see these effects; second, one should be able to tell where the initial and final particles are and what their momenta are (otherwise it would not be possible to reconstruct where and when the ghost is annihilated and produced). The first condition forces H or T to be comparable to or larger than the ghost mass. The second condition implies that the initial and final particles should be non-relativistic and can only be met if H and the temperature T are much smaller than the mass of the colliding particles. We observe that stable particles with masses fulfilling the second condition do not necessarily exist in a given no-scale model; for example, in the case of Sect. 8 the inflatons typically have masses of order \(10^{13}\) GeV, but they are unstable [10]. Even if all the necessary conditions to have acausality were satisfied, we now argue that the possible acausal processes are diluted by the expansion of the universe.

First ignore finite temperature effects and consider the inflationary period. In this section we denote with \(H_\mathrm{inf}\) the corresponding Hubble rate. Given that (for moderate extensions of the Standard Model) \(|\Gamma _2| \lesssim M_2^3/\bar{M}_\mathrm{P}^2\) [42], for \(M_2\lesssim H_\mathrm{inf}\) (see Footnote 11) we see that \(\Gamma _2\) is small compared to the Hubble rate during inflation.

Indeed, the observational constraint on \(H_\mathrm{inf}\) from the Planck observatory, \(H_\mathrm{inf}<3.6\times 10^{-5} \bar{M}_\mathrm{P}\) (see Sect. 5.1 of [60]), implies \(|\Gamma _2| \lesssim H_\mathrm{inf}^3/\bar{M}_\mathrm{P}^2 \lesssim 10^{-9} H_\mathrm{inf}\). Therefore, the effect of the possible acausal processes would be diluted by the expansion of the universe. The dilution takes place even later, as long as \(|\Gamma _2| \lesssim H\), and is described by the Boltzmann equation for the ghost number density \(n_\mathrm{g}\), \(\dot{n}_\mathrm{g} +3H n_\mathrm{g} \sim - \Gamma _2 n_\mathrm{g},\) which leads to a decreasing \(n_\mathrm{g}\) for \(H \gtrsim |\Gamma _2|\).
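
The orders of magnitude quoted above can be reproduced with a few lines. This is a rough sketch under stated assumptions: it saturates both \(|\Gamma _2| \sim M_2^3/\bar{M}_\mathrm{P}^2\) and \(M_2 = H_\mathrm{inf}\), treats H as constant (de Sitter), and integrates the Boltzmann equation with the \(\sim \) replaced by an equality; the function names are illustrative.

```python
import math

H_INF_OVER_MP_MAX = 3.6e-5  # Planck bound on H_inf / M_P quoted in the text

def gamma_over_h_bound(h_over_mp, m2_over_h=1.0):
    """Upper estimate of |Gamma_2| / H_inf from |Gamma_2| <~ M_2^3 / M_P^2."""
    return m2_over_h**3 * h_over_mp**2

def ghost_density(t_hubble, gamma_over_h, n0=1.0):
    """Solution of dn/dt + 3 H n = -Gamma_2 n at constant H.

    t_hubble is the time in units of 1/H; gamma_over_h = Gamma_2 / H
    and may be negative.
    """
    return n0 * math.exp(-(3.0 + gamma_over_h) * t_hubble)

bound = gamma_over_h_bound(H_INF_OVER_MP_MAX)
print(f"|Gamma_2| / H_inf <~ {bound:.1e}")
```

Within this simplified treatment the density decreases even for a negative \(\Gamma _2\) as long as \(3H > |\Gamma _2|\), in line with the condition \(H \gtrsim |\Gamma _2|\) stated above.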

On the other hand, when H becomes smaller than \(|\Gamma _2|\) it is also much smaller than the ghost mass and the possible acausal processes cannot be observable, as argued before (we are using here \(M_2\gg |\Gamma _2|\), which is amply satisfied whenever \(M_2\ll \bar{M}_\mathrm{P}\); see Footnote 12).

Finally, let us consider the finite temperature effects. Effectively, the maximal temperature reached during the universe expansion is the reheating temperature. This quantity has been computed in [10], at least in some realisations of the no-scale scenario, and turns out to be not larger than \(10^9\) GeV. This is much smaller than \(M_2\) if one saturates the bound \(M_2 \lesssim 10^{11}\) GeV of [3], required to solve the hierarchy problem, and no sizeable effects due to the ghost are expected.

10 Conclusions

In this work we have analysed all inflationary perturbations in the most general (classically) scale-invariant theory: this includes all terms quadratic in curvature (the \(R^2\) and the Weyl-squared terms), the most general no-scale matter Lagrangian and the non-minimal couplings between the scalar fields and R. The scales we observe in nature are generated dynamically through dimensional transmutation, in which scale-invariance is broken by quantum effects.

The main results we have found are the following.

  • We have performed a detailed and careful analysis of all sectors: scalar, vector and tensor perturbations. The corresponding modes are found by means of a Lagrangian approach. We have also shown that the full conserved Hamiltonian for all the perturbations does not feature negative energies if appropriately quantised. An explanation is provided for how the behaviour of ordinary Einstein’s gravity coupled to a generic matter sector is recovered when the ghost mass \(M_2\) is much bigger than the Hubble rate during inflation \(H_\mathrm{inf}\).

  • The expressions of all the (potentially) observable quantities derived from the relevant power spectra are presented for the most general scale-invariant theory: the curvature power spectrum with the corresponding scalar spectral index \(n_\mathrm{s}\), the tensor power spectrum and the power spectrum of an isocurvature mode B associated with the helicity-0 component of the spin-2 ghost.

  • These general results have then been applied to a concrete model where the scalar sector features the planckion s (the scalar field whose VEV generates the Planck scale) and the Higgs field, in addition to, of course, the Starobinsky scalar \(\zeta \) due to the \(R^2\) term. When \(M_2 \gg H_\mathrm{inf}\) we recover the results of [10]. For smaller values of \(M_2\) the tensor-to-scalar ratio r becomes suppressed and allows s-inflation (which is instead generically in tension with observations for \(M_2 > H_\mathrm{inf}\)). More generally, \(M_2< H_\mathrm{inf}\) renders viable a large class of models that were in tension with the most recent observations by Planck [60], such as inflationary models with quadratic potentials. For these small values of \(M_2\) there is a scalar isocurvature mode, B, which, however, is consistent with the bounds on the isocurvature power spectra of [60]. Interestingly, its power spectrum is rather close to the observational bounds for s-inflation, leading to the possibility of testing this model with future observations.

  • We have also argued that the possible issues due to the spin-2 ghost associated with the Weyl-squared term do not create phenomenological problems in some no-scale models, at least when \(M_2\) is close to the upper bound that ensures the naturalness of the Higgs mass, \(M_2 \lesssim 10^{11}\) GeV.

The present work has several possible future applications. For example, it would be interesting to apply the general formulae derived here to no-scale models where scale-invariance is broken by non-perturbative effects [1] and in general to many no-scale models, other than the one considered in Sect. 8. Also, the identification of the quantum perturbations we have performed, with the associated Hilbert space and energy spectra, opens the way to a consistent analysis of non-linear quantum effects on cosmological backgrounds. Perhaps the results of [33] on the quantisation of interacting four-derivative theories can be useful in such analysis.