f(R) Theories

Abstract

Over the past decade, f(R) theories have been extensively studied as one of the simplest modifications to General Relativity. In this article we review various applications of f(R) theories to cosmology and gravity — such as inflation, dark energy, local gravity constraints, cosmological perturbations, and spherically symmetric solutions in weak and strong gravitational backgrounds. We present a number of ways to distinguish those theories from General Relativity observationally and experimentally. We also discuss the extension to other modified gravity theories such as Brans-Dicke theory and Gauss-Bonnet gravity, and address models that can satisfy both cosmological and local gravity constraints.

Introduction

General Relativity (GR) [225, 226] is widely accepted as a fundamental theory to describe the geometric properties of spacetime. In a homogeneous and isotropic spacetime the Einstein field equations give rise to the Friedmann equations that describe the evolution of the universe. In fact, the standard big-bang cosmology based on radiation and matter dominated epochs can be well described within the framework of General Relativity.

However, the rapid development of observational cosmology since the 1990s shows that the universe has undergone two phases of cosmic acceleration. The first one is called inflation [564, 339, 291, 524], which is believed to have occurred prior to the radiation domination (see [402, 391, 71] for reviews). This phase is required not only to solve the flatness and horizon problems that plague big-bang cosmology, but also to explain the nearly flat spectrum of temperature anisotropies observed in the Cosmic Microwave Background (CMB) [541]. The second accelerating phase started after the matter domination. The unknown component giving rise to this late-time cosmic acceleration is called dark energy [310] (see [517, 141, 480, 485, 171, 32] for reviews). The existence of dark energy has been confirmed by a number of observations, such as supernovae Ia (SN Ia) [490, 506, 507], large-scale structure (LSS) [577, 578], baryon acoustic oscillations (BAO) [227, 487], and CMB [560, 561, 367].

These two phases of cosmic acceleration cannot be explained by the presence of standard matter whose equation of state w = P/ρ satisfies the condition w ≥ 0 (here P and ρ are the pressure and the energy density of matter, respectively). In fact, we further require some component of negative pressure, with w < −1/3, to realize the acceleration of the universe. The cosmological constant Λ is the simplest candidate of dark energy, which corresponds to w = −1. However, if the cosmological constant originates from a vacuum energy of particle physics, its energy scale is too large to be compatible with the dark energy density [614]. Hence we need to find some mechanism to obtain a small value of Λ consistent with observations. Since the accelerated expansion in the very early universe needs to end to connect to the radiation-dominated universe, the pure cosmological constant is not responsible for inflation. A scalar field ϕ with a slowly varying potential can be a candidate for inflation as well as for dark energy.
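As a brief reminder of where the threshold w < −1/3 comes from, the following sketch assumes GR with a single perfect fluid of positive energy density in an FLRW background. The acceleration equation is not written out in this section, so its standard form is quoted here as an assumption, with \({\kappa ^2} = 8\pi G\):

$${{\ddot a} \over a} = - {{{\kappa ^2}} \over 6}(\rho + 3P) = - {{{\kappa ^2}} \over 6}\rho (1 + 3w).$$

Hence \(\ddot a > 0\) requires 1 + 3w < 0, i.e., w < −1/3. For the cosmological constant (w = −1) the right-hand side becomes \({\kappa ^2}\rho/3 > 0\).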

Although many scalar-field potentials for inflation have been constructed in the framework of string theory and supergravity, the CMB observations still do not show particular evidence to favor one of such models. The situation is similar in the context of dark energy: the potential of the scalar field (“quintessence” [111, 634, 267, 263, 615, 503, 257, 155]) is degenerate, owing to the observational degeneracy of the dark energy equation of state around w = −1. Moreover it is generally difficult to construct viable quintessence potentials motivated from particle physics because the field mass responsible for cosmic acceleration today is very small (\({m_\phi} \simeq {10^{- 33}}\,{\rm{eV}}\)) [140, 365].

While scalar-field models of inflation and dark energy correspond to a modification of the energy-momentum tensor in Einstein equations, there is another approach to explain the acceleration of the universe. This corresponds to modified gravity, in which the gravitational theory is modified compared to GR. The Lagrangian density for GR is given by f(R) = R − 2Λ, where R is the Ricci scalar and Λ is the cosmological constant (corresponding to the equation of state w = −1). The presence of Λ gives rise to an exponential expansion of the universe, but we cannot use it for inflation because the inflationary period needs to connect to the radiation era. It is possible to use the cosmological constant for dark energy since the acceleration today does not need to end. However, if the cosmological constant originates from a vacuum energy of particle physics, its energy density would be enormously larger than today’s dark energy density. While the Λ-Cold Dark Matter (ΛCDM) model (f(R) = R − 2Λ) fits a number of observational data well [367, 368], there is also a possibility for the time-varying equation of state of dark energy [10, 11, 450, 451, 630].

One of the simplest modifications to GR is the f(R) gravity in which the Lagrangian density f is an arbitrary function of R [77, 512, 102, 106]. There are two formalisms in deriving field equations from the action in f(R) gravity. The first is the standard metric formalism in which the field equations are derived by the variation of the action with respect to the metric tensor gμν. In this formalism the affine connection \(\Gamma _{\beta \gamma}^\alpha\) depends on gμν. Note that we will consider here and in the remaining sections only torsion-free theories. The second is the Palatini formalism [481] in which gμν and \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables when we vary the action. These two approaches give rise to different field equations for a non-linear Lagrangian density in R, while for the GR action they are identical with each other. In this article we mainly review the former approach unless otherwise stated. In Section 9 we discuss the Palatini formalism in detail.

The model with \(f(R) = R + \alpha {R^2}\) (α > 0) can lead to the accelerated expansion of the Universe because of the presence of the \(\alpha {R^2}\) term. In fact, this is the first model of inflation proposed by Starobinsky in 1980 [564]. As we will see in Section 7, this model is consistent with the temperature anisotropies observed in the CMB and thus it can be a viable alternative to the scalar-field models of inflation. Reheating after inflation proceeds by gravitational particle production during the oscillating phase of the Ricci scalar [565, 606, 426].

The discovery of dark energy in 1998 also stimulated the idea that cosmic acceleration today may originate from some modification of gravity relative to GR. Dark energy models based on f(R) theories have been extensively studied as the simplest modified gravity scenario to realize the late-time acceleration. The model with a Lagrangian density \(f(R) = R - \alpha/{R^n}\) (α > 0, n > 0) was proposed for dark energy in the metric formalism [113, 120, 114, 143, 456]. However it was shown that this model is plagued by a matter instability [215, 244] as well as by a difficulty to satisfy local gravity constraints [469, 470, 245, 233, 154, 448, 134]. Moreover it does not possess a standard matter-dominated epoch because of a large coupling between dark energy and dark matter [28, 29]. These results show how non-trivial it is to obtain a viable f(R) model. Amendola et al. [26] derived conditions for the cosmological viability of f(R) dark energy models. In local regions whose densities are much larger than the homogeneous cosmological density, the models need to be close to GR for consistency with local gravity constraints. A number of viable f(R) models that can satisfy both cosmological and local gravity constraints have been proposed in [26, 382, 31, 306, 568, 35, 587, 206, 164, 396]. Since the law of gravity gets modified on large distances in f(R) models, this leaves several interesting observational signatures such as the modification to the spectra of galaxy clustering [146, 74, 544, 526, 251, 597, 493], CMB [627, 544, 382, 545], and weak lensing [595, 528]. In this review we will discuss these topics in detail, paying particular attention to the construction of viable f(R) models and resulting observational consequences.

The f(R) gravity in the metric formalism corresponds to generalized Brans-Dicke (BD) theory [100] with a BD parameter ωBD = 0 [467, 579, 152]. Unlike original BD theory [100], there exists a potential for a scalar-field degree of freedom (called “scalaron” [564]) with a gravitational origin. If the mass of the scalaron always remains as light as the present Hubble parameter H0, it is not possible to satisfy local gravity constraints due to the appearance of a long-range fifth force with a coupling of the order of unity. One can design the field potential of f(R) gravity such that the mass of the field is heavy in the region of high density. The viable f(R) models mentioned above have been constructed to satisfy such a condition. Then the interaction range of the fifth force becomes short in the region of high density, which allows the possibility that the models are compatible with local gravity tests. More precisely the existence of a matter coupling, in the Einstein frame, gives rise to an extremum of the effective field potential around which the field can be stabilized. As long as a spherically symmetric body has a “thin-shell” around its surface, the field is nearly frozen in most regions inside the body. Then the effective coupling between the field and non-relativistic matter outside the body can be strongly suppressed through the chameleon mechanism [344, 343]. Experiments testing violations of the equivalence principle, as well as a number of solar system experiments, place tight constraints on dark energy models based on f(R) theories [306, 251, 587, 134, 101].

The spherically symmetric solutions mentioned above have been derived in weak gravity backgrounds, where the background metric is described by a Minkowski spacetime. In strong gravitational backgrounds such as neutron stars and white dwarfs, we need to take into account the backreaction of gravitational potentials to the field equation. The structure of relativistic stars in f(R) gravity has been studied by a number of authors [349, 350, 594, 43, 600, 466, 42, 167]. Originally the difficulty of obtaining relativistic stars was pointed out in [349] in connection to the singularity problem of f(R) dark energy models in the high-curvature regime [266]. For constant density stars, however, a thin-shell field profile has been analytically derived in [594] for chameleon models in the Einstein frame. The existence of relativistic stars in f(R) gravity has also been confirmed numerically for stars with constant [43, 600] and varying [42] densities. In this review we shall also discuss this issue.

It is possible to extend f(R) gravity to generalized BD theory with a field potential and an arbitrary BD parameter ωBD. If we make a conformal transformation to the Einstein frame [213, 609, 408, 611, 249, 268], we can show that BD theory with a field potential corresponds to the coupled quintessence scenario [23] with a coupling Q between the field and non-relativistic matter. This coupling is related to the BD parameter via the relation \(1/(2{Q^2}) = 3 + 2{\omega _{{\rm{BD}}}}\) [343, 596]. One can recover GR by taking the limit Q → 0, i.e., ωBD → ∞. The f(R) gravity in the metric formalism corresponds to \(Q = - 1/\sqrt 6\) [28], i.e., ωBD = 0. For large coupling models with \(\left\vert Q \right\vert = \mathcal{O}\left(1 \right)\) it is possible to design scalar-field potentials such that the chameleon mechanism works to reduce the effective matter coupling, while at the same time the field is sufficiently light to be responsible for the late-time cosmic acceleration. This generalized BD theory also leaves a number of interesting observational and experimental signatures [596].

In addition to the Ricci scalar R, one can construct other scalar quantities such as \({R_{\mu \nu}}{R^{\mu \nu}}\) and \({R_{\mu \nu \rho \sigma}}{R^{\mu \nu \rho \sigma}}\) from the Ricci tensor Rμν and Riemann tensor Rμνρσ [142]. For the Gauss-Bonnet (GB) curvature invariant defined by \(\mathcal{G} \equiv {R^2} - 4{R_{\alpha \beta}}{R^{\alpha \beta}} + {R_{\alpha \beta \gamma \delta}}{R^{\alpha \beta \gamma \delta}}\), it is known that one can avoid the appearance of spurious spin-2 ghosts [572, 67, 302] (see also [98, 465, 153, 447, 110, 181, 109]). In order to give rise to some contribution of the GB term to the Friedmann equation, we require that (i) the GB term couples to a scalar field ϕ, i.e., \(F\left(\phi \right)\mathcal{G}\) or (ii) the Lagrangian density f is a function of \(\mathcal{G}\), i.e., \(f\left({\mathcal G} \right)\). The GB coupling in the case (i) appears in low-energy string effective action [275] and cosmological solutions in such a theory have been studied extensively (see [34, 273, 105, 147, 588, 409, 468] for the construction of nonsingular cosmological solutions and [463, 360, 361, 593, 523, 452, 453, 381, 25] for the application to dark energy). In the case (ii) it is possible to construct viable models that are consistent with both the background cosmological evolution and local gravity constraints [458, 188, 189] (see also [165, 180, 178, 383, 633, 599]). However density perturbations in perfect fluids exhibit negative instabilities during both the radiation and the matter domination, irrespective of the form of \(f\left(\mathcal{G} \right)\) [383, 182]. This growth of perturbations gets stronger on smaller scales, which is difficult to reconcile with the observed galaxy spectrum unless the deviation from GR is very small. We shall review such theories as well as other modified gravity theories.

This review is organized as follows. In Section 2 we present the field equations of f(R) gravity in the metric formalism. In Section 3 we apply f(R) theories to the inflationary universe. Section 4 is devoted to the construction of cosmologically viable f(R) dark energy models. In Section 5 local gravity constraints on viable f(R) dark energy models will be discussed. In Section 6 we provide the equations of linear cosmological perturbations for general modified gravity theories including metric f(R) gravity as a special case. In Section 7 we study the spectra of scalar and tensor metric perturbations generated during inflation based on f(R) theories. In Section 8 we discuss the evolution of matter density perturbations in f(R) dark energy models and place constraints on model parameters from the observations of large-scale structure and CMB. Section 9 is devoted to the viability of the Palatini variational approach in f(R) gravity. In Section 10 we construct viable dark energy models based on BD theory with a potential as an extension of f(R) theories. In Section 11 the structure of relativistic stars in f(R) theories will be discussed in detail. In Section 12 we provide a brief review of Gauss-Bonnet gravity and resulting observational and experimental consequences. In Section 13 we discuss a number of other aspects of f(R) gravity and modified gravity. Section 14 is devoted to conclusions.

There are other review articles on f(R) gravity [556, 555, 618] and modified gravity [171, 459, 126, 397, 217]. Compared to those articles, we put more weight on observational and experimental aspects of f(R) theories. This is particularly useful to place constraints on inflation and dark energy models based on f(R) theories. Readers who are interested in the more detailed history of f(R) theories and fourth-order gravity may have a look at the review articles by Schmidt [531] and Sotiriou and Faraoni [556].

In this review we use units such that c = ħ = kB = 1, where c is the speed of light, ħ is the reduced Planck constant, and kB is Boltzmann’s constant. We define \({\kappa ^2} = 8\pi G = 8\pi/m_{{\rm{pl}}}^2 = 1/M_{{\rm{pl}}}^2\), where G is the gravitational constant, \({m_{{\rm{pl}}}} = 1.22 \times {10^{19}}\,{\rm{GeV}}\) is the Planck mass with a reduced value \({M_{{\rm{pl}}}} = {m_{{\rm{pl}}}}/\sqrt {8\pi} = 2.44 \times {10^{18}}\,{\rm{GeV}}\). Throughout this review, we use a dot for the derivative with respect to cosmic time t and “,X” for the partial derivative with respect to the variable X, e.g., \({f_{,R}} \equiv \partial f/\partial R\) and \({f_{,RR}} \equiv {\partial ^2}f/\partial {R^2}\). We use the metric signature (−, +, +, +). The Greek indices μ and ν run from 0 to 3, whereas the Latin indices i and j run from 1 to 3 (spatial components).

Field Equations in the Metric Formalism

We start with the 4-dimensional action in f(R) gravity:

$$S = {1 \over {2{\kappa ^2}}}\int {{{\rm{d}}^4}x\sqrt {- g} f(R) +} \;\int {{{\rm{d}}^4}x} {{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.1)

where \({\kappa ^2} = 8\pi G\), g is the determinant of the metric gμν, and \({{\mathcal L}_M}\) is a matter Lagrangian that depends on gμν and matter fields ΨM. The Ricci scalar R is defined by \(R = {g^{\mu \nu}}{R_{\mu \nu}}\), where the Ricci tensor Rμν is

$${R_{\mu \nu}} = {R^\alpha}_{\mu \alpha \nu} = {\partial _\lambda}\Gamma _{\mu \nu}^\lambda - {\partial _\mu}\Gamma _{\lambda \nu}^\lambda + \Gamma _{\mu \nu}^\lambda \Gamma _{\rho \lambda}^\rho - \Gamma _{\nu \rho}^\lambda \Gamma _{\mu \lambda}^\rho.$$
(2.2)

In the case of the torsion-less metric formalism, the connections \(\Gamma _{\beta \gamma}^\alpha\) are the usual metric connections defined in terms of the metric tensor gμν, as

$$\Gamma _{\beta \gamma}^\alpha = {1 \over 2}{g^{\alpha \lambda}}\left({{{\partial {g_{\gamma \lambda}}} \over {\partial {x^\beta}}} + {{\partial {g_{\lambda \beta}}} \over {\partial {x^\gamma}}} - {{\partial {g_{\beta \gamma}}} \over {\partial {x^\lambda}}}} \right).$$
(2.3)

This follows from the metricity relation, \({\nabla _\lambda}{g_{\mu \nu}} = \partial {g_{\mu \nu}}/\partial {x^\lambda} - {g_{\rho \nu}}\Gamma _{\mu \lambda}^\rho - {g_{\mu \rho}}\Gamma _{\nu \lambda}^\rho = 0\).

Equations of motion

The field equation can be derived by varying the action (2.1) with respect to gμν:

$${\Sigma _{\mu \nu}} \equiv F(R){R_{\mu \nu}}(g) - {1 \over 2}f(R){g_{\mu \nu}} - {\nabla _\mu}{\nabla _\nu}F(R) + {g_{\mu \nu}}\square F(R) = {\kappa ^2}T_{\mu \nu}^{(M)},$$
(2.4)

where F(R) = ∂f/∂R. \(T_{\mu \nu}^{\left(M \right)}\) is the energy-momentum tensor of the matter fields defined by the variational derivative of \({{\mathcal L}_M}\) in terms of gμν:

$$T_{\mu \nu}^{(M)} = - {2 \over {\sqrt {- g}}}{{\delta {{\mathcal L}_M}} \over {\delta {g^{\mu \nu}}}}.$$
(2.5)

This satisfies the continuity equation

$${\nabla ^\mu}T_{\mu \nu}^{(M)} = 0,$$
(2.6)

as well as Σμν, i.e., \({\nabla ^\mu}{\Sigma _{\mu \nu}} = 0\). The trace of Eq. (2.4) gives

$$3\square F(R) + F(R)R - 2f(R) = {\kappa ^2}T,$$
(2.7)

where \(T = {g^{\mu \nu}}T_{\mu \nu}^{\left(M \right)}\) and \(\Box F = \left({1/\sqrt {- g}} \right){\partial _\mu}\left({\sqrt {- g} {g^{\mu \nu}}{\partial _\nu}F} \right)\).

Einstein gravity, without the cosmological constant, corresponds to f(R) = R and F(R) = 1, so that the term □F(R) in Eq. (2.7) vanishes. In this case we have \(R = - {\kappa ^2}T\) and hence the Ricci scalar R is directly determined by the matter (the trace T). In modified gravity the term □F(R) does not vanish in Eq. (2.7), which means that there is a propagating scalar degree of freedom, φ ≡ F(R). The trace equation (2.7) determines the dynamics of the scalar field φ (dubbed “scalaron” [564]).
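The step from Eq. (2.4) to Eq. (2.7) can be made explicit. Contracting Eq. (2.4) with \({g^{\mu \nu}}\), and using \({g^{\mu \nu}}{g_{\mu \nu}} = 4\) and \({g^{\mu \nu}}{\nabla _\mu}{\nabla _\nu}F = \square F\), one finds

$${g^{\mu \nu}}{\Sigma _{\mu \nu}} = FR - 2f - \square F + 4\square F = 3\square F + FR - 2f = {\kappa ^2}T.$$

As an illustration of the GR limit, setting f = R and F = 1 gives \(R - 2R = {\kappa ^2}T\), i.e., \(R = - {\kappa ^2}T\). For pressureless matter (an example chosen here, with \(T = - {\rho _M}\)) this reads \(R = {\kappa ^2}{\rho _M}\), so that the curvature simply tracks the local density and carries no independent dynamics.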

The field equation (2.4) can be written in the following form [568]

$${G_{\mu \nu}} = {\kappa ^2}\left({T_{\mu \nu}^{(M)} + T_{\mu \nu}^{(D)}} \right),$$
(2.8)

where \({G_{\mu \nu}} \equiv {R_{\mu \nu}} - (1/2){g_{\mu \nu}}R\) and

$${\kappa ^2}T_{\mu \nu}^{(D)} \equiv {g_{\mu \nu}}(f - R)/2 + {\nabla _\mu}{\nabla _\nu}F - {g_{\mu \nu}}\square F + (1 - F){R_{\mu \nu}}.$$
(2.9)

Since \({\nabla ^\mu}{G_{\mu \nu}} = 0\) and \({\nabla ^\mu}T_{\mu \nu}^{\left(M \right)} = 0\), it follows that

$${\nabla ^\mu}T_{\mu \nu}^{(D)} = 0.$$
(2.10)

Hence the continuity equation holds, not only for Σμν, but also for the effective energy-momentum tensor \(T_{\mu \nu}^{\left(D \right)}\) defined in Eq. (2.9). This is sometimes convenient when we study the dark energy equation of state [306, 568] as well as the equilibrium description of thermodynamics for the horizon entropy [53].

There exists a de Sitter point that corresponds to a vacuum solution (T = 0) at which the Ricci scalar is constant. Since □F(R) = 0 at this point, we obtain

$$F(R)R - 2f(R) = 0.$$
(2.11)

The model \(f(R) = \alpha {R^2}\) satisfies this condition, so that it gives rise to the exact de Sitter solution [564]. In the model \(f(R) = R + \alpha {R^2}\), because of the linear term in R, the inflationary expansion ends when the term \(\alpha {R^2}\) becomes smaller than the linear term R (as we will see in Section 3). This is followed by a reheating stage in which the oscillation of R leads to gravitational particle production. It is also possible to use the de Sitter point given by Eq. (2.11) for dark energy.
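To illustrate how the condition (2.11) selects de Sitter points, it can be evaluated for three simple Lagrangians. The first two are the models discussed above; the ΛCDM case f = R − 2Λ is added here only for comparison.

$$f = \alpha {R^2}:\quad FR - 2f = 2\alpha {R^2} - 2\alpha {R^2} = 0,$$

$$f = R + \alpha {R^2}:\quad FR - 2f = (1 + 2\alpha R)R - 2R - 2\alpha {R^2} = - R,$$

$$f = R - 2\Lambda:\quad FR - 2f = R - 2(R - 2\Lambda) = - R + 4\Lambda.$$

In the first case the condition holds identically, so any constant R is a vacuum solution. In the second case only R = 0 satisfies it, so there is no exact de Sitter point with non-vanishing curvature, in line with the statement that inflation ends in this model. In the third case the de Sitter point sits at R = 4Λ.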

We consider the spatially flat Friedmann-Lemaître-Robertson-Walker (FLRW) spacetime with a time-dependent scale factor a(t) and a metric

$${\rm{d}}{s^2} = {g_{\mu \nu}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu} = - {\rm{d}}{t^2} + {a^2}(t){\rm{d}}{x^2},$$
(2.12)

where t is cosmic time. For this metric the Ricci scalar R is given by

$$R = 6(2{H^2} + \dot H),$$
(2.13)

where \(H \equiv \dot a/a\) is the Hubble parameter and a dot stands for a derivative with respect to t. The present value of H is given by

$${H_0} = 100\;h\;{\rm{km}}\;{\sec ^{- 1}}{\rm{Mp}}{{\rm{c}}^{- 1}} = 2.1332\;h \times {10^{- 42}}{\rm{GeV,}}$$
(2.14)

where h = 0.72 ± 0.08 describes the uncertainty of H0 [264].
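Two quick checks of Eqs. (2.13) and (2.14) may be useful. For a power-law expansion \(a \propto {t^p}\) (an illustrative choice) one has H = p/t and \(\dot H = - p/{t^2}\), so that

$$R = 6\left({{{2{p^2}} \over {{t^2}}} - {p \over {{t^2}}}} \right) = {{6p(2p - 1)} \over {{t^2}}}.$$

This gives R = 0 for p = 1/2 and \(R = 4/(3{t^2})\) for p = 2/3, while a constant H gives \(R = 12{H^2}\). For the unit conversion in Eq. (2.14), assume the standard values \(1\,{\rm{Mpc}} \simeq 3.0857 \times {10^{22}}\,{\rm{m}}\) and \(\hbar \simeq 6.5821 \times {10^{- 25}}\,{\rm{GeV}}\,{\rm{s}}\), which are not quoted in the text:

$${H_0} = {{100\,h \times {{10}^3}\,{\rm{m}}\,{{\rm{s}}^{- 1}}} \over {3.0857 \times {{10}^{22}}\,{\rm{m}}}} \simeq 3.2408\,h \times {10^{- 18}}\,{{\rm{s}}^{- 1}},\quad \quad \hbar {H_0} \simeq 2.133\,h \times {10^{- 42}}\,{\rm{GeV}}.$$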

The energy-momentum tensor of matter is given by \({T^\mu}_\nu ^{\left(M \right)} = {\rm{diag}}\left({- {\rho _M},\,{P_M},\,{P_M},\,{P_M}} \right)\), where ρM is the energy density and PM is the pressure. The field equations (2.4) in the flat FLRW background give

$$3F{H^2} = (FR - f)/2 - 3H\dot F + {\kappa ^2}{\rho _M},$$
(2.15)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}({\rho _M} + {P_M}),$$
(2.16)

where the perfect fluid satisfies the continuity equation

$${\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0{.}$$
(2.17)

We also introduce the equation of state of matter, \({w_M} \equiv {P_M}/{\rho _M}\). As long as wM is constant, the integration of Eq. (2.17) gives \({\rho _M} \propto {a^{- 3\left({1 + {w_M}} \right)}}\). In Section 4 we shall take into account both non-relativistic matter (wM = 0) and radiation (wr = 1/3) to discuss cosmological dynamics of f(R) dark energy models.
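The integration mentioned above is a one-line exercise. Writing \({P_M} = {w_M}{\rho _M}\) with constant wM in Eq. (2.17) and using \(H = \dot a/a\) gives

$${{{\rm{d}}{\rho _M}} \over {{\rho _M}}} = - 3(1 + {w_M}){{{\rm{d}}a} \over a}\quad \Rightarrow \quad \ln {\rho _M} = - 3(1 + {w_M})\ln a + {\rm{const}},$$

so that \({\rho _M} \propto {a^{- 3}}\) for wM = 0 and \({\rho _M} \propto {a^{- 4}}\) for wM = 1/3. As a consistency check of Eqs. (2.15) and (2.16), the GR limit f = R, F = 1, \(\dot F = 0\) reduces them to the familiar forms

$$3{H^2} = {\kappa ^2}{\rho _M},\quad \quad - 2\dot H = {\kappa ^2}({\rho _M} + {P_M}).$$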

Note that there are some works about the Einstein static universes in f(R) gravity [91, 532]. Although Einstein static solutions exist for a wide variety of f(R) models in the presence of a barotropic perfect fluid, these solutions have been shown to be unstable against either homogeneous or inhomogeneous perturbations [532].

Equivalence with Brans-Dicke theory

The f(R) theory in the metric formalism can be cast in the form of Brans-Dicke (BD) theory [100] with a potential for the effective scalar-field degree of freedom (scalaron). Let us consider the following action with a new field χ,

$$S = {1 \over {2{\kappa ^2}}}\int {{{\rm{d}}^4}} x\sqrt {- g} [f(\chi) + {f_{,\chi}}(\chi)(R - \chi)] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}){.}$$
(2.18)

Varying this action with respect to χ, we obtain

$${f_{,\chi \chi}}(\chi)(R - \chi) = 0{.}$$
(2.19)

Provided \({f_{,\chi \chi}}(\chi) \ne 0\) it follows that χ = R. Hence the action (2.18) recovers the action (2.1) in f(R) gravity. If we define

$$\varphi \equiv {f_{,\chi}}(\chi),$$
(2.20)

the action (2.18) can be expressed as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over {2{\kappa ^2}}}\varphi R - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.21)

where U(φ) is a field potential given by

$$U(\varphi) = {{\chi (\varphi)\varphi - f(\chi (\varphi))} \over {2{\kappa ^2}}}.$$
(2.22)
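Two steps of this construction can be spelled out. First, the variation of the bracket in Eq. (2.18) with respect to χ is

$${\partial \over {\partial \chi}}\left[ {f(\chi) + {f_{,\chi}}(\chi)(R - \chi)} \right] = {f_{,\chi}} + {f_{,\chi \chi}}(R - \chi) - {f_{,\chi}} = {f_{,\chi \chi}}(R - \chi),$$

which is Eq. (2.19). Second, as a worked illustration of Eq. (2.22), take the model \(f(\chi) = \chi + \alpha {\chi ^2}\) (chosen here as an example). Then \(\varphi = 1 + 2\alpha \chi\), so \(\chi = (\varphi - 1)/(2\alpha)\), and

$$U(\varphi) = {{\chi (1 + 2\alpha \chi) - \chi - \alpha {\chi ^2}} \over {2{\kappa ^2}}} = {{\alpha {\chi ^2}} \over {2{\kappa ^2}}} = {{{{(\varphi - 1)}^2}} \over {8\alpha {\kappa ^2}}}.$$

The potential is quadratic around φ = 1, the value at which the theory reduces to GR.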

Meanwhile the action in BD theory [100] with a potential U(φ) is given by

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}\varphi R - {{{\omega _{{\rm{BD}}}}} \over {2\varphi}}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.23)

where ωBD is the BD parameter and \({(\nabla \varphi)^2} \equiv {g^{\mu \nu}}{\partial _\mu}\varphi {\partial _\nu}\varphi\). Comparing Eq. (2.21) with Eq. (2.23), it follows that f(R) theory in the metric formalism is equivalent to BD theory with the parameter ωBD = 0 [467, 579, 152] (in units where \({\kappa ^2} = 1\)). In Palatini f(R) theory where the metric gμν and the connection \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables, the Ricci scalar is different from that in metric f(R) theory. As we will see in Sections 9.1 and 10.1, f(R) theory in the Palatini formalism is equivalent to BD theory with the parameter ωBD = −3/2.

Conformal transformation

The action (2.1) in f(R) gravity corresponds to a non-linear function f in terms of R. It is possible to derive an action in the Einstein frame under the conformal transformation [213, 609, 408, 611, 249, 268, 410]:

$${\tilde g_{\mu \nu}} = {\Omega ^2}{g_{\mu \nu}},$$
(2.24)

where \({\Omega ^2}\) is the conformal factor and a tilde represents quantities in the Einstein frame. The Ricci scalars R and \(\tilde R\) in the two frames have the following relation

$$R = {\Omega ^2}(\tilde R + 6\tilde \square\omega - 6{\tilde g^{\mu \nu}}{\partial _\mu}\omega {\partial _\nu}\omega),$$
(2.25)

where

$$\omega \equiv \ln \Omega, \quad \quad {\partial _\mu}\omega \equiv {{\partial \omega} \over {\partial {{\tilde x}^\mu}}},\quad \quad \tilde\square \omega \equiv {1 \over {\sqrt {- \tilde g}}}{\partial _\mu}(\sqrt {- \tilde g} {\tilde g^{\mu \nu}}{\partial _\nu}\omega){.}$$
(2.26)

We rewrite the action (2.1) in the form

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left({{1 \over {2{\kappa ^2}}}FR - U} \right) + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(2.27)

where

$$U = {{FR - f} \over {2{\kappa ^2}}}.$$
(2.28)

Using Eq. (2.25) and the relation \(\sqrt {- g} = {\Omega ^{- 4}}\sqrt {- \tilde g}\), the action (2.27) is transformed as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \left[ {{1 \over {2{\kappa ^2}}}F{\Omega ^{- 2}}(\tilde R + 6\tilde \square\omega - 6{{\tilde g}^{\mu \nu}}{\partial _\mu}\omega {\partial _\nu}\omega) - {\Omega ^{- 4}}U} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({\Omega ^{- 2}}{\tilde g_{\mu \nu}},{\Psi _M}){.}$$
(2.29)

We obtain the Einstein frame action (linear action in \(\tilde R\)) for the choice

$${\Omega ^2} = F.$$
(2.30)

This choice is consistent if F > 0. We introduce a new scalar field ϕ defined by

$$\kappa \phi \equiv \sqrt {3/2} \ln F.$$
(2.31)

From the definition of ω in Eq. (2.26) we have that \(\omega = \kappa \phi/\sqrt 6\). Using Eq. (2.26), the integral \(\int {{{\rm{d}}^4}x} \sqrt {- \tilde g} \tilde \Box \omega\) vanishes on account of Gauss’s theorem. Then the action in the Einstein frame is

$${S_E} = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \left[ {{1 \over {2{\kappa ^2}}}\tilde R - {1 \over 2}{{\tilde g}^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({F^{- 1}}(\phi){\tilde g_{\mu \nu}},{\Psi _M}),$$
(2.32)

where

$$V(\phi) = {U \over {{F^2}}} = {{FR - f} \over {2{\kappa ^2}{F^2}}}.$$
(2.33)
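The normalizations in Eqs. (2.30) to (2.33) can be checked directly. With \({\Omega ^2} = F\) the coefficient of \(\tilde R\) in Eq. (2.29) becomes \(F{\Omega ^{- 2}}/(2{\kappa ^2}) = 1/(2{\kappa ^2})\), and with \(\omega = \kappa \phi/\sqrt 6\) the gradient term becomes canonical:

$${1 \over {2{\kappa ^2}}}\left({- 6{{\tilde g}^{\mu \nu}}{\partial _\mu}\omega {\partial _\nu}\omega} \right) = - {3 \over {{\kappa ^2}}}{{{\kappa ^2}} \over 6}{\tilde g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi = - {1 \over 2}{\tilde g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi.$$

As a worked illustration of Eq. (2.33), consider the model \(f = R + \alpha {R^2}\) (chosen here as an example). Then F = 1 + 2αR, \(FR - f = \alpha {R^2}\) and R = (F − 1)/(2α), so that

$$V(\phi) = {{\alpha {R^2}} \over {2{\kappa ^2}{F^2}}} = {{{{(F - 1)}^2}} \over {8\alpha {\kappa ^2}{F^2}}} = {1 \over {8\alpha {\kappa ^2}}}{\left({1 - {e^{- \sqrt {2/3} \kappa \phi}}} \right)^2}.$$

The potential approaches the constant \(1/(8\alpha {\kappa ^2})\) for \(\kappa \phi \gg 1\) and is quadratic near ϕ = 0.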

Hence the Lagrangian density of the field ϕ is given by \({{\mathcal L}_\phi} = - {1 \over 2}{\tilde g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V\left(\phi \right)\) with the energy-momentum tensor

$$\tilde T_{\mu \nu}^{(\phi)} = - {2 \over {\sqrt {- \tilde g}}}{{\delta (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\delta {{\tilde g}^{\mu \nu}}}} = {\partial _\mu}\phi {\partial _\nu}\phi - {\tilde g_{\mu \nu}}\left[ {{1 \over 2}{{\tilde g}^{\alpha \beta}}{\partial _\alpha}\phi {\partial _\beta}\phi + V(\phi)} \right].$$
(2.34)

The conformal factor \({\Omega ^2} = F = \exp \left({\sqrt {2/3} \kappa \phi} \right)\) is field-dependent. From the matter action (2.32) the scalar field ϕ is directly coupled to matter in the Einstein frame. In order to see this more explicitly, we take the variation of the action (2.32) with respect to the field ϕ:

$$- {\partial _\mu}\left({{{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial ({\partial _\mu}\phi)}}} \right) + {{\partial (\sqrt {- \tilde g} {{\mathcal L}_\phi})} \over {\partial \phi}} + {{\partial {{\mathcal L}_M}} \over {\partial \phi}} = 0,$$
(2.35)

that is

$$\tilde \square\phi - {V_{,\phi}} + {1 \over {\sqrt {- \tilde g}}}{{\partial {{\mathcal L}_M}} \over {\partial \phi}} = 0,\quad {\rm{where}}\quad \tilde \square\phi \equiv {1 \over {\sqrt {- \tilde g}}}{\partial _\mu}(\sqrt {- \tilde g} {\tilde g^{\mu \nu}}{\partial _\nu}\phi).$$
(2.36)

Using Eq. (2.24) and the relations \(\sqrt {- \tilde g} = {F^2}\sqrt {- g}\) and \({\tilde g^{\mu \nu}} = {F^{- 1}}{g^{\mu \nu}}\), the energy-momentum tensor of matter is transformed as

$$\tilde T_{\mu \nu}^{(M)} = - {2 \over {\sqrt {- \tilde g}}}{{\delta {{\mathcal L}_M}} \over {\delta {{\tilde g}^{\mu \nu}}}} = {{T_{\mu \nu}^{(M)}} \over F}.$$
(2.37)

The energy-momentum tensor of perfect fluids in the Einstein frame is given by

$$\tilde T_{\;\;\;\nu}^{\mu (M)} = {\rm diag} (-{\tilde \rho _M},\tilde P_{M},\tilde P_{M},\tilde P_{M}) = {\rm diag}(- \rho _{M}/{F^2},{P_M}/{F^2},{P_M}/{F^2},{P_M}/{F^2}){.}$$
(2.38)

The derivative of the Lagrangian density \({{\mathcal L}_M} = {{\mathcal L}_M}\left({{g_{\mu \nu}}} \right) = {{\mathcal L}_M}\left({{F^{- 1}}\left(\phi \right){{\tilde g}_{\mu \nu}}} \right)\) with respect to ϕ is

$${{\partial {{\mathcal L}_M}} \over {\partial \phi}} = {{\delta {{\mathcal L}_M}} \over {\delta {g^{\mu \nu}}}}{{\partial {g^{\mu \nu}}} \over {\partial \phi}} = {1 \over {F(\phi)}}{{\delta {{\mathcal L}_M}} \over {\delta {{\tilde g}^{\mu \nu}}}}{{\partial (F(\phi){{\tilde g}^{\mu \nu}})} \over {\partial \phi}} = - \sqrt {- \tilde g} {{{F_{,\phi}}} \over {2F}}\tilde T_{\mu \nu}^{(M)}{\tilde g^{\mu \nu}}.$$
(2.39)

The strength of the coupling between the field and matter can be quantified by the following quantity

$$Q \equiv - {{{F_{,\phi}}} \over {2\kappa F}} = - {1 \over {\sqrt 6}},$$
(2.40)

which is constant in f(R) gravity [28]. It then follows that

$${{\partial {{\mathcal L}_M}} \over {\partial \phi}} = \sqrt {- \tilde g} \kappa Q\tilde T,$$
(2.41)

where \(\tilde T = {\tilde g_{\mu \nu}}{\tilde T^{\mu \nu \left(M \right)}} = - {\tilde \rho _M} + 3{\tilde P_M}\). Substituting Eq. (2.41) into Eq. (2.36), we obtain the field equation in the Einstein frame:

$$\tilde \square\phi - {V_{,\phi}} + \kappa Q\tilde T = 0.$$
(2.42)

This shows that the field ϕ is directly coupled to matter apart from radiation \(\left({\tilde T = 0} \right)\).
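The value of Q in Eq. (2.40) and the decoupling of radiation follow from short computations. Since \(F = \exp \left({\sqrt {2/3} \kappa \phi} \right)\), one has \({F_{,\phi}} = \sqrt {2/3} \,\kappa F\) and therefore

$$Q = - {{{F_{,\phi}}} \over {2\kappa F}} = - {1 \over 2}\sqrt {{2 \over 3}} = - {1 \over {\sqrt 6}} \simeq - 0.408.$$

Inserting this into the relation \(1/(2{Q^2}) = 3 + 2{\omega _{{\rm{BD}}}}\) quoted in the Introduction gives 3 = 3 + 2ωBD, i.e., ωBD = 0, as expected for metric f(R) gravity. For the trace, the two limiting fluids are

$${\tilde P_M} = {\tilde \rho _M}/3:\quad \tilde T = - {\tilde \rho _M} + 3{\tilde P_M} = 0,\quad \quad \quad {\tilde P_M} = 0:\quad \tilde T = - {\tilde \rho _M},$$

so that for pressureless matter Eq. (2.42) reads \(\tilde \square \phi = {V_{,\phi}} + \kappa Q{\tilde \rho _M}\).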

Let us consider the flat FLRW spacetime with the metric (2.12) in the Jordan frame. The metric in the Einstein frame is given by

$$\begin{array}{*{20}c} {{\rm{d}}{{\tilde s}^2} = {\Omega ^2}{\rm{d}}{s^2} = F(- {\rm{d}}{t^2} + {a^2}(t){\rm{d}}{x^2}),\quad \quad \quad \;\;\;}\\ {= - {\rm{d}}{{\tilde t}^2} + {{\tilde a}^2}(\tilde t){\rm{d}}{x^2},}\\ \end{array}$$
(2.43)

which leads to the following relations (for F > 0)

$${\rm{d}}\tilde t = \sqrt F {\rm{d}}t,\quad \tilde a = \sqrt F a,$$
(2.44)

where

$$F = {e^{- 2Q\kappa \phi}}.$$
(2.45)

Note that Eq. (2.45) comes from the integration of Eq. (2.40) for constant Q. The field equation (2.42) can be expressed as

$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde t}^2}}} + 3\tilde H{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}} + {V_{,\phi}} = - \kappa Q({\tilde \rho _M} - 3{\tilde P_M}),$$
(2.46)

where

$$\tilde H \equiv {1 \over {\tilde a}}{{{\rm{d}}\tilde a} \over {{\rm{d}}\tilde t}} = {1 \over {\sqrt F}}\left({H + {{\dot F} \over {2F}}} \right).$$
(2.47)
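Equation (2.47) follows from Eq. (2.44) by the chain rule. Using \({\rm{d}}/{\rm{d}}\tilde t = {F^{- 1/2}}{\rm{d}}/{\rm{d}}t\) and \(\tilde a = \sqrt F a\),

$$\tilde H = {1 \over {\sqrt F a}}{1 \over {\sqrt F}}{{{\rm{d}}(\sqrt F a)} \over {{\rm{d}}t}} = {1 \over {Fa}}\left({\sqrt F \dot a + {{\dot F} \over {2\sqrt F}}a} \right) = {1 \over {\sqrt F}}\left({H + {{\dot F} \over {2F}}} \right).$$

With Eq. (2.45) one has \(\dot F/(2F) = - \kappa Q\dot \phi\), so the same relation can be written as \(\tilde H = (H - \kappa Q\dot \phi)/\sqrt F\).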

Defining the energy density \({\tilde \rho _\phi} = {1 \over 2}{\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} + V\left(\phi \right)\) and the pressure \({\tilde P_\phi} = {1 \over 2}{\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} - V\left(\phi \right)\), Eq. (2.46) can be written as

$${{{\rm{d}}{{\tilde \rho}_\phi}} \over {{\rm{d}}\tilde t}} + 3\tilde H({\tilde \rho _\phi} + {\tilde P_\phi}) = - \kappa Q({\tilde \rho _M} - 3{\tilde P_M}){{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}.$$
(2.48)

Under the transformation (2.44) together with \({\rho _M} = {F^2}{\tilde \rho _M},\,{P_M} = {F^2}{\tilde P_M}\), and \(H = {F^{1/2}}[\tilde H - ({\rm{d}}F/{\rm{d}}\tilde t)/2F]\), the continuity equation (2.17) is transformed as

$${{{\rm{d}}{{\tilde \rho}_M}} \over {{\rm{d}}\tilde t}} + 3\tilde H({\tilde \rho _M} + {\tilde P_M}) = \kappa Q({\tilde \rho _M} - 3{\tilde P_M}){{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}.$$
(2.49)

Equations (2.48) and (2.49) show that the field and matter interact with each other, while the total energy density \({\tilde \rho _T} = {\tilde \rho _\phi} + {\tilde \rho _M}\) and the pressure \({\tilde P_T} = {\tilde P_\phi} + {\tilde P_M}\) satisfy the continuity equation \({\rm{d}}{\tilde \rho _T}/{\rm{d}}\tilde t + 3\tilde H\left({{{\tilde \rho}_T} + {{\tilde P}_T}} \right) = 0\). More generally, Eqs. (2.48) and (2.49) can be expressed in terms of the energy-momentum tensors defined in Eqs. (2.34) and (2.37):

$${\tilde \nabla _\mu}\tilde T_\nu ^{\mu (\phi)} = - Q\tilde T{\tilde \nabla _\nu}\phi, \quad {\tilde \nabla _\mu}\tilde T_\nu ^{\mu (M)} = Q\tilde T{\tilde \nabla _\nu}\phi,$$
(2.50)

which correspond to the same equations in coupled quintessence studied in [23] (see also [22]).

In the absence of a field potential V(ϕ) (i.e., massless field) the field mediates a long-range fifth force with a large coupling (\(|Q| \simeq 0.4\)), which contradicts experimental tests in the solar system. In f(R) gravity a field potential of gravitational origin is present, which allows for compatibility with local gravity tests through the chameleon mechanism [344, 343].

In f(R) gravity the field ϕ is coupled to non-relativistic matter (dark matter, baryons) with a universal coupling \(Q = - 1/\sqrt 6\). We consider the frame in which the baryons obey the standard continuity equation \(\rho_m \propto a^{-3}\), i.e., the Jordan frame, as the “physical” frame in which physical quantities are compared with observations and experiments. It is sometimes convenient to refer to the Einstein frame, in which a canonical scalar field is coupled to non-relativistic matter. Both frames describe the same physics, but the use of different time and length scales gives rise to an apparent difference between the observables in the two frames. Our attitude throughout the review is to discuss observables in the Jordan frame. When we transform to the Einstein frame for convenience, we go back to the Jordan frame to discuss physical quantities.

Inflation in f(R) Theories

Most models of inflation in the early universe are based on scalar fields appearing in superstring and supergravity theories. Meanwhile, the first inflation model proposed by Starobinsky [564] is related to the conformal anomaly in quantum gravity (see footnote 3). Unlike models such as “old inflation” [339, 291, 524], this scenario is not plagued by the graceful exit problem: the period of cosmic acceleration is followed by the radiation-dominated epoch with a transient matter-dominated phase [565, 606, 426]. Moreover it predicts nearly scale-invariant spectra of gravitational waves and temperature anisotropies consistent with CMB observations [563, 436, 566, 355, 315]. In this section we review the dynamics of inflation and reheating. In Section 7 we will discuss the power spectra of scalar and tensor perturbations generated in f(R) inflation models.

Inflationary dynamics

We consider the models of the form

$$f(R) = R + \alpha {R^n},\quad (\alpha > 0,n > 0),$$
(3.1)

which include Starobinsky's model [564] as a specific case (n = 2). In the absence of the matter fluid (\(\rho_M = 0\)), Eq. (2.15) gives

$$3(1 + n\alpha {R^{n - 1}}){H^2} = {1 \over 2}(n - 1)\alpha {R^n} - 3n(n - 1)\alpha H{R^{n - 2}}\dot R.$$
(3.2)

The cosmic acceleration can be realized in the regime \(F = 1 + n\alpha R^{n-1} \gg 1\). Under the approximation \(F \simeq n\alpha R^{n-1}\), we divide Eq. (3.2) by \(3n\alpha R^{n-1}\) to give

$${H^2} \simeq {{n - 1} \over {6n}}\left({R - 6nH{{\dot R} \over R}} \right).$$
(3.3)

During inflation the Hubble parameter H evolves slowly so that one can use the approximations \(|\dot H/H^2| \ll 1\) and \(|\ddot H/(H\dot H)| \ll 1\). Then Eq. (3.3) reduces to

$${{\dot H} \over {{H^2}}} \simeq - {\epsilon _1},\quad \;\;{\epsilon _1} = {{2 - n} \over {(n - 1)(2n - 1)}}.$$
(3.4)

Integrating this equation for ϵ1 > 0, we obtain the solution

$$H \simeq {1 \over {{\epsilon _1}t}},\quad \;\;a \propto {t^{1/{\epsilon _1}}}.$$
(3.5)

The cosmic acceleration occurs for \(\epsilon_1 < 1\), i.e., \(n > \left({1 + \sqrt 3} \right)/2\). When n = 2 one has \(\epsilon_1 = 0\), so that H is constant in the regime F ≫ 1. The models with n > 2 lead to super inflation characterized by \(\dot H > 0\) and \(a \propto {\left\vert {{t_0} - t} \right\vert^{- 1/\left\vert {{\epsilon_1}} \right\vert}}\) (\(t_0\) is a constant). Hence the standard inflation with decreasing H occurs for \(\left({1 + \sqrt 3} \right)/2 < n < 2\).

In the following let us focus on Starobinsky's model given by
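The classification above can be checked numerically. The following is a minimal sketch that evaluates the slow-roll parameter of Eq. (3.4) and labels the resulting regime; the function names are illustrative and not part of the original text, and the restriction to n > 1 is an assumption made so that the formula is only used where the regime F ≫ 1 is the one discussed above.

```python
import math

# Lower bound on n for accelerated expansion (epsilon_1 < 1), from the text.
N_MIN = (1.0 + math.sqrt(3.0)) / 2.0


def epsilon_1(n: float) -> float:
    """Slow-roll parameter of Eq. (3.4) for f(R) = R + alpha*R^n with F >> 1."""
    return (2.0 - n) / ((n - 1.0) * (2.0 * n - 1.0))


def regime(n: float) -> str:
    """Label the background evolution for a given power n (assumes n > 1)."""
    if n <= 1.0:
        raise ValueError("this sketch only covers n > 1")
    eps = epsilon_1(n)
    if eps >= 1.0:
        return "no acceleration"
    if eps > 0.0:
        return "standard inflation (H decreasing)"
    if eps == 0.0:
        return "H constant"
    return "super inflation (H increasing)"


if __name__ == "__main__":
    for n in (1.2, 1.8, 2.0, 2.5):
        print(n, round(epsilon_1(n), 4), regime(n))
```

At n = N_MIN the parameter equals 1, which reproduces the threshold quoted in the text.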

$$f(R) = R + {R^2}/(6{M^2}),$$
(3.6)

where the constant M has a dimension of mass. The presence of the linear term in R eventually causes inflation to end. Without neglecting this linear term, the combination of Eqs. (2.15) and (2.16) gives

$$\ddot H - {{{{\dot H}^2}} \over {2H}} + {1 \over 2}{M^2}H = - 3H\dot H,$$
(3.7)
$$\ddot R + 3H\dot R + {M^2}R = 0.$$
(3.8)

During inflation the first two terms in Eq. (3.7) can be neglected relative to others, which gives \(\dot H \simeq -M^2/6\). We then obtain the solution

$$H \simeq {H_i} - ({M^2}/6)(t - {t_i}),$$
(3.9)
$$a \simeq {a_i}\exp [{H_i}(t - {t_i}) - ({M^2}/12){(t - {t_i})^2}],$$
(3.10)
$$R \simeq 12{H^2} - {M^2},$$
(3.11)

where \(H_i\) and \(a_i\) are the Hubble parameter and the scale factor at the onset of inflation (\(t = t_i\)), respectively. This inflationary solution is a transient attractor of the dynamical system [407]. The accelerated expansion continues as long as the slow-roll parameter

$${\epsilon _1} = - {{\dot H} \over {{H^2}}} \simeq {{{M^2}} \over {6{H^2}}},$$
(3.12)

is smaller than the order of unity, i.e., \(H^2 \gtrsim M^2\). One can also check that the approximate relation \(3H\dot R + M^2 R \simeq 0\) holds in Eq. (3.8) by using \(R \simeq 12H^2\). The end of inflation (at time \(t = t_f\)) is characterized by the condition \(\epsilon_f \simeq 1\), i.e., \({H_f} \simeq M/\sqrt 6\). From Eq. (3.11) this corresponds to the epoch at which the Ricci scalar decreases to \(R \simeq M^2\). As we will see later, the WMAP normalization of the CMB temperature anisotropies constrains the mass scale to be \(M \simeq 10^{13}\) GeV. Note that the phase space analysis for the model (3.6) was carried out in [407, 24, 131].

We define the number of e-foldings from \(t = t_i\) to \(t = t_f\):

$$N \equiv \int\nolimits_{{t_i}}^{{t_f}} H \;{\rm{d}}t \simeq {H_i}({t_f} - {t_i}) - {{{M^2}} \over {12}}{({t_f} - {t_i})^2}.$$
(3.13)

Since inflation ends at \(t_f \simeq t_i + 6H_i/M^2\), it follows that

$$N \simeq {{3H_i^2} \over {{M^2}}} \simeq {1 \over {2{\epsilon _1}({t_i})}},$$
(3.14)

where we used Eq. (3.12) in the last approximate equality. In order to solve the horizon and flatness problems of big-bang cosmology we require that N ≳ 70 [391], i.e., \(\epsilon_1(t_i) \lesssim 7 \times 10^{-3}\). The CMB temperature anisotropies correspond to the perturbations whose wavelengths crossed the Hubble radius around N = 55–60 before the end of inflation.
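As a quick numerical check of Eqs. (3.13) and (3.14), the sketch below evaluates the e-folding integral at the end time stated in the text and converts a required number of e-foldings into an initial Hubble rate and slow-roll parameter. The function names are hypothetical helpers, and H and M are assumed to be given in the same (arbitrary) units.

```python
import math


def efolds_from_integral(h_i: float, m: float) -> float:
    """N from Eq. (3.13), evaluated at t_f - t_i = 6*H_i/M^2."""
    dt = 6.0 * h_i / m**2
    return h_i * dt - (m**2 / 12.0) * dt**2


def initial_hubble_over_m(n_efolds: float) -> float:
    """H_i/M needed for a given N, inverting N = 3*H_i^2/M^2 of Eq. (3.14)."""
    return math.sqrt(n_efolds / 3.0)


def epsilon_1_initial(n_efolds: float) -> float:
    """epsilon_1(t_i) = 1/(2N), the last equality of Eq. (3.14)."""
    return 1.0 / (2.0 * n_efolds)


if __name__ == "__main__":
    h_over_m = initial_hubble_over_m(70.0)
    print("H_i/M      =", h_over_m)
    print("N (check)  =", efolds_from_integral(h_over_m, 1.0))
    print("eps_1(t_i) =", epsilon_1_initial(70.0))
```

For N = 70 this gives \(\epsilon_1(t_i) = 1/140\), in line with the bound quoted above.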

Dynamics in the Einstein frame

Let us consider inflationary dynamics in the Einstein frame for the model (3.6) in the absence of matter fluids \(\left({{\mathcal{L}_M} = 0} \right)\). The action in the Einstein frame corresponds to (2.32) with a field ϕ defined by

$$\phi = \sqrt {{3 \over 2}} {1 \over \kappa}\ln F = \sqrt {{3 \over 2}} {1 \over \kappa}\ln \left({1 + {R \over {3{M^2}}}} \right).$$
(3.15)

Using this relation, the field potential (2.33) reads [408, 61, 63]

$$V(\phi) = {{3{M^2}} \over {4{\kappa ^2}}}{\left({1 - {e^{- \sqrt {2/3} \kappa \phi}}} \right)^2}.$$
(3.16)

In Figure 1 we illustrate the potential (3.16) as a function of ϕ. In the regime κϕ ≫ 1 the potential is nearly constant (\(V(\phi) \simeq 3M^2/(4\kappa^2)\)), which leads to slow-roll inflation. The potential in the regime κϕ ≪ 1 is given by \(V(\phi) \simeq (1/2)M^2\phi^2\), so that the field oscillates around ϕ = 0 with a Hubble damping. The second derivative of V with respect to ϕ is

$${V_{,\phi \phi}} = - {M^2}{e^{- \sqrt {2/3} \kappa \phi}}\left({1 - 2{e^{- \sqrt {2/3} \kappa \phi}}} \right),$$
(3.17)

which changes from negative to positive at \(\phi = {\phi _1} \equiv \sqrt {3/2} \left({\ln \,2} \right)/\kappa \simeq 0.169{m_{{\rm{pl}}}}\).

Figure 1: The field potential (3.16) in the Einstein frame corresponding to the model (3.6). Inflation is realized in the regime κϕ ≫ 1.
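The shape of the potential and the sign change of its second derivative can be reproduced with a few lines of code. This is a minimal sketch, assuming units in which \(m_{\rm pl} = 1\) and the convention \(\kappa^2 = 8\pi/m_{\rm pl}^2\) (an assumption, chosen because it reproduces the value \(\phi_1 \simeq 0.169\,m_{\rm pl}\) quoted above); the function names are illustrative.

```python
import math

# Assumption: kappa^2 = 8*pi/m_pl^2, in units where m_pl = 1.
KAPPA = math.sqrt(8.0 * math.pi)
C = math.sqrt(2.0 / 3.0) * KAPPA  # exponent coefficient sqrt(2/3)*kappa


def potential(phi: float, M: float = 1.0) -> float:
    """Einstein-frame potential of Eq. (3.16)."""
    return 3.0 * M**2 / (4.0 * KAPPA**2) * (1.0 - math.exp(-C * phi)) ** 2


def potential_second_derivative(phi: float, M: float = 1.0) -> float:
    """V_{,phi phi} of Eq. (3.17)."""
    e = math.exp(-C * phi)
    return -(M**2) * e * (1.0 - 2.0 * e)


# Field value at which V_{,phi phi} changes sign.
PHI_1 = math.sqrt(1.5) * math.log(2.0) / KAPPA

if __name__ == "__main__":
    print("phi_1 / m_pl =", PHI_1)
    print("plateau      =", potential(5.0), "vs", 3.0 / (4.0 * KAPPA**2))
    print("small field  =", potential(1e-4), "vs", 0.5 * 1e-8)
```

The plateau and the quadratic minimum correspond to the two regimes κϕ ≫ 1 and κϕ ≪ 1 described in the text.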

Since \(F \simeq 4H^2/M^2\) during inflation, the transformation (2.44) gives a relation between the cosmic time \(\tilde t\) in the Einstein frame and that in the Jordan frame:

$$\tilde t = \int\nolimits_{{t_i}}^t {\sqrt F} \;{\rm{d}}t \simeq {2 \over M}\left[ {{H_i}(t - {t_i}) - {{{M^2}} \over {12}}{{(t - {t_i})}^2}} \right],$$
(3.18)

where \(t = t_i\) corresponds to \(\tilde t = 0\). The end of inflation (\(t_f \simeq t_i + 6H_i/M^2\)) corresponds to \({\tilde t_f} = \left({2/M} \right)N\) in the Einstein frame, where N is given in Eq. (3.13). On using Eqs. (3.10) and (3.18), the scale factor \(\tilde a = \sqrt F a\) in the Einstein frame evolves as

$$\tilde a(\tilde t) \simeq \left({1 - {{{M^2}} \over {12H_i^2}}M\tilde t} \right){\tilde a_i}{e^{M\tilde t/2}},$$
(3.19)

where \({\tilde a_i} = 2{H_i}{a_i}/M\). Similarly the evolution of the Hubble parameter \(\tilde H = \left({H/\sqrt F} \right)\left[ {1 + \dot F/\left({2HF} \right)} \right]\) is given by

$$\tilde H(\tilde t) \simeq {M \over 2}\left[ {1 - {{{M^2}} \over {6H_i^2}}{{\left({1 - {{{M^2}} \over {12H_i^2}}M\tilde t} \right)}^{- 2}}} \right],$$
(3.20)

which decreases with time. Equations (3.19) and (3.20) show that the universe expands quasi-exponentially in the Einstein frame as well.

The field equations for the action (2.32) are given by

$$3{\tilde H^2} = {\kappa ^2}\left[ {{1 \over 2}{{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}} \right)}^2} + V(\phi)} \right],$$
(3.21)
$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde t}^2}}} + 3\tilde H{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}} + {V_{,\phi}} = 0.$$
(3.22)

Using the slow-roll approximations \({\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right)^2} \ll V\left(\phi \right)\) and \(\left\vert {{{\rm{d}}^2}\phi/{\rm{d}}{{\tilde t}^2}} \right\vert \ll \left\vert {\tilde H{\rm{d}}\phi/{\rm{d}}\tilde t} \right\vert\) during inflation, one has \(3{\tilde H^2} \simeq {\kappa ^2}V\left(\phi \right)\) and \(3\tilde H\left({{\rm{d}}\phi/{\rm{d}}\tilde t} \right) + {V_{,\phi}} \simeq 0\). We define the slow-roll parameters

$${\tilde \epsilon _1} \equiv - {{{\rm{d}}\tilde H/{\rm{d}}\tilde t} \over {{{\tilde H}^2}}} \simeq {1 \over {2{\kappa ^2}}}{\left({{{{V_{,\phi}}} \over V}} \right)^2},\quad {\tilde \epsilon _2} \equiv {{{{\rm{d}}^2}\phi/{\rm{d}}{{\tilde t}^2}} \over {\tilde H({\rm{d}}\phi/{\rm{d}}\tilde t)}} \simeq {\tilde \epsilon _1} - {{{V_{,\phi \phi}}} \over {3{{\tilde H}^2}}}.$$
(3.23)

For the potential (3.16) it follows that

$${\tilde \epsilon_1} \simeq {4 \over 3}{({e^{\sqrt {2/3} \kappa \phi}} - 1)^{- 2}},\quad \;{\tilde \epsilon_2} \simeq {\tilde \epsilon_1} + {{{M^2}} \over {3{{\tilde H}^2}}}{e^{- \sqrt {2/3} \kappa \phi}}(1 - 2{e^{- \sqrt {2/3} \kappa \phi}}),$$
(3.24)

which are much smaller than 1 during inflation (κϕ ≫ 1). The end of inflation is characterized by the condition \(\left\{{{{\tilde \epsilon}_1},\,\left\vert {{{\tilde \epsilon}_2}} \right\vert} \right\} = \mathcal{O}\left(1 \right)\). Solving \({\tilde \epsilon_1} = 1\), we obtain the field value \(\phi_f \simeq 0.19\,m_{\rm pl}\).

We define the number of e-foldings in the Einstein frame,

$$\tilde N = \int\nolimits_{{{\tilde t}_i}}^{{{\tilde t}_f}} {\tilde H} {\rm{d}}\tilde t \simeq {\kappa ^2}\int\nolimits_{{\phi _f}}^{{\phi _i}} {{V \over {{V_{,\phi}}}}{\rm{d}}\phi,}$$
(3.25)

where \(\phi_i\) is the field value at the onset of inflation. Since \(\tilde H{\rm{d}}\tilde t = H{\rm{d}}t\left[ {1 + \dot F/\left({2HF} \right)} \right]\), it follows that \(\tilde N\) is identical to N in the slow-roll limit: \(|\dot F/(2HF)| \simeq |\dot H/H^2| \ll 1\). Under the condition \(\kappa\phi_i \gg 1\) we have

$$\tilde N \simeq {3 \over 4}{e^{\sqrt {2/3} \kappa {\phi _i}}}.$$
(3.26)

This shows that \(\phi_i \simeq 1.11\,m_{\rm pl}\) for \(\tilde N = 70\). From Eqs. (3.24) and (3.26) together with the approximate relation \(\tilde H \simeq M/2\), we obtain

$${\tilde \epsilon _1} \simeq {3 \over {4{{\tilde N}^2}}},\quad {\tilde \epsilon _2} \simeq {1 \over {\tilde N}},$$
(3.27)

where, in the expression of \({\tilde \epsilon _2}\), we have dropped the terms of the order of \(1/\tilde N^2\). The results (3.27) will be used to estimate the spectra of density perturbations in Section 7.
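The field values and slow-roll parameters quoted in this subsection follow from closed-form inversions of Eqs. (3.24), (3.26) and (3.27). The sketch below evaluates them, again assuming \(\kappa^2 = 8\pi/m_{\rm pl}^2\) with \(m_{\rm pl} = 1\) (an assumption consistent with the quoted numbers); the helper names are hypothetical.

```python
import math

# Assumption: kappa^2 = 8*pi/m_pl^2, in units where m_pl = 1.
KAPPA = math.sqrt(8.0 * math.pi)
PREFACTOR = math.sqrt(1.5) / KAPPA  # converts sqrt(2/3)*kappa*phi back to phi


def phi_end() -> float:
    """Field value solving epsilon_1 = 1 in Eq. (3.24): exp(x) = 1 + 2/sqrt(3)."""
    return PREFACTOR * math.log(1.0 + 2.0 / math.sqrt(3.0))


def phi_initial(n_efolds: float) -> float:
    """Field value at the onset of inflation from Eq. (3.26): exp(x) = 4N/3."""
    return PREFACTOR * math.log(4.0 * n_efolds / 3.0)


def slow_roll_parameters(n_efolds: float) -> tuple:
    """(epsilon_1, epsilon_2) of Eq. (3.27), valid for large N."""
    return 3.0 / (4.0 * n_efolds**2), 1.0 / n_efolds


if __name__ == "__main__":
    print("phi_f / m_pl =", phi_end())
    print("phi_i / m_pl =", phi_initial(70.0))
    print("slow roll    =", slow_roll_parameters(60.0))
```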

Reheating after inflation

We discuss the dynamics of reheating and the resulting particle production in the Jordan frame for the model (3.6). The inflationary period is followed by a reheating phase in which the second derivative \(\ddot R\) can no longer be neglected in Eq. (3.8). Introducing \(\hat R = {a^{3/2}}R\), we have

$$\ddot{\hat R} + \left({{M^2} - {9 \over 4}{H^2} - {3 \over 2}\dot H} \right)\hat R = 0.$$
(3.28)

Since \(M^2 \gg \{H^2, |\dot H|\}\) during reheating, the solution to Eq. (3.28) is given by that of the harmonic oscillator with a frequency M. Hence the Ricci scalar exhibits a damped oscillation around R = 0:

$$R \propto {a^{- 3/2}}\sin (Mt).$$
(3.29)

Let us estimate the evolution of the Hubble parameter and the scale factor during reheating in more detail. If we neglect the r.h.s. of Eq. (3.7), we get the solution \(H(t) = {\rm const} \times \cos^2(Mt/2)\). Setting \(H(t) = f(t)\cos^2(Mt/2)\) to derive the solution of Eq. (3.7), we obtain [426]

$$f(t) = {1 \over {C + (3/4)(t - {t_{{\rm{os}}}}) + 3/(4M)\sin [M(t - {t_{{\rm{os}}}})]}},$$
(3.30)

where \(t_{\rm os}\) is the time at the onset of reheating. The constant C is determined by matching Eq. (3.30) with the slow-roll inflationary solution \(\dot H = -M^2/6\) at \(t = t_{\rm os}\). Then we get C = 3/M and

$$H(t) = {\left[ {{3 \over M} + {3 \over 4}(t - {t_{{\rm{os}}}}) + {3 \over {4M}}\sin \;M(t - {t_{{\rm{os}}}})} \right]^{- 1}}{\cos ^2}\left[ {{M \over 2}(t - {t_{{\rm{os}}}})} \right].$$
(3.31)

Taking the time average of oscillations in the regime \(M(t - t_{\rm os}) \gg 1\), it follows that \(\langle H \rangle \simeq (2/3)(t - t_{\rm os})^{-1}\). This corresponds to the cosmic evolution during the matter-dominated epoch, i.e., \(\langle a \rangle \propto (t - t_{\rm os})^{2/3}\). The gravitational effect of coherent oscillations of scalarons with mass M is similar to that of a pressureless perfect fluid. During reheating the Ricci scalar is approximately given by \(R \simeq 6\dot H\), i.e.

$$R \simeq - 3{\left[ {{3 \over M} + {3 \over 4}(t - {t_{{\rm{os}}}}) + {3 \over {4M}}\sin \;M(t - {t_{{\rm{os}}}})} \right]^{- 1}}M\sin [M(t - {t_{{\rm{os}}}})].$$
(3.32)

In the regime \(M(t - t_{\rm os}) \gg 1\) this behaves as

$$R \simeq - {{4M} \over {t - {t_{{\rm{os}}}}}}\sin [M(t - {t_{{\rm{os}}}})].$$
(3.33)
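The statement that the oscillation-averaged Hubble rate approaches the matter-dominated value can be checked directly from Eq. (3.31). The following sketch averages H(t) over one oscillation period; the function names and the choice of units (M = 1 by default) are illustrative assumptions.

```python
import math


def hubble(t: float, M: float = 1.0, t_os: float = 0.0) -> float:
    """H(t) during reheating, Eq. (3.31)."""
    tau = t - t_os
    f = 1.0 / (3.0 / M + 0.75 * tau + 0.75 / M * math.sin(M * tau))
    return f * math.cos(0.5 * M * tau) ** 2


def averaged_hubble(t_center: float, M: float = 1.0, samples: int = 2000) -> float:
    """Midpoint-rule average of H over one period 2*pi/M centred on t_center."""
    period = 2.0 * math.pi / M
    total = 0.0
    for i in range(samples):
        t = t_center - 0.5 * period + period * (i + 0.5) / samples
        total += hubble(t, M)
    return total / samples


if __name__ == "__main__":
    for t in (100.0, 1000.0, 10000.0):
        ratio = averaged_hubble(t) / (2.0 / (3.0 * t))
        print(t, ratio)  # approaches 1 as M*(t - t_os) grows
```

The ratio tends to unity only for \(M(t - t_{\rm os}) \gg 1\), because the constant 3/M in the denominator of Eq. (3.31) is then negligible.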

In order to study particle production during reheating, we consider a scalar field χ with mass \(m_\chi\). We also introduce a nonminimal coupling \((1/2)\xi R\chi^2\) between the field χ and the Ricci scalar R [88]. Then the action is given by

$$S = \int {{{\rm{d}}^4}x\sqrt {- g} \left[ {{{f(R)} \over {2{\kappa ^2}}} - {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\chi {\partial _\nu}\chi - {1 \over 2}m_\chi ^2{\chi ^2} - {1 \over 2}\xi R{\chi ^2}} \right],}$$
(3.34)

where \(f(R) = R + R^2/(6M^2)\). Taking the variation of this action with respect to χ gives

$$\square\chi - m_\chi ^2\chi - \xi R\chi = 0.$$
(3.35)

We decompose the quantum field χ in terms of the Heisenberg representation:

$$\chi (t,x) = {1 \over {{{(2\pi)}^{3/2}}}}\int {{{\rm{d}}^{\rm{3}}}} k\left({{{\hat a}_k}{\chi _k}(t){e^{- ik \cdot x}} + \hat a_k^\dagger \chi _k^{\ast}(t){e^{ik \cdot x}}} \right),$$
(3.36)

where \({{\hat a}_k}\) and \(\hat a_k^\dagger\) are annihilation and creation operators, respectively. The field χ can be quantized in curved spacetime by generalizing the basic formalism of quantum field theory in flat spacetime. See the book [88] for details of quantum field theory in curved spacetime. Then each Fourier mode \(\chi_k(t)\) obeys the following equation of motion

$${\ddot \chi _k} + 3H{\dot \chi _k} + \left({{{{k^2}} \over {{a^2}}} + m_\chi ^2 + \xi R} \right){\chi _k} = 0,$$
(3.37)

where \(k = |{\bf k}|\) is a comoving wavenumber. Introducing a new field \(u_k = a\chi_k\) and conformal time \(\eta = \int a^{-1}{\rm d}t\), we obtain

$${{{{\rm{d}}^2}{u_k}} \over {{\rm{d}}{\eta ^2}}} + \left[ {{k^2} + m_\chi ^2{a^2} + \left({\xi - {1 \over 6}} \right){a^2}R} \right]{u_k} = 0,$$
(3.38)

where the conformal coupling corresponds to ξ = 1/6. This result shows that, even for ξ = 0 (that is, when the field is minimally coupled to gravity), R still contributes to the effective mass of \(u_k\). In the following we first review the reheating scenario based on a minimally coupled massless field (ξ = 0 and \(m_\chi = 0\)). This corresponds to the gravitational particle production in the perturbative regime [565, 606, 426]. We then study the case in which the nonminimal coupling \(|\xi|\) is larger than the order of 1. In this case the non-adiabatic particle production known as preheating [584, 353, 538, 354] can occur via parametric resonance.
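The step from Eq. (3.37) to Eq. (3.38) can be made explicit as follows. This is a sketch assuming the flat FLRW background used throughout, for which \(R = 6(2H^2 + \dot H) = 6a''/a^3\), where a prime denotes a derivative with respect to η (this notation is introduced here for illustration). Using \({\rm d}/{\rm d}t = a^{-1}{\rm d}/{\rm d}\eta\) and \(\chi_k = u_k/a\), one has

$$\ddot \chi_k + 3H\dot \chi_k = {1 \over {a^2}}\left({\chi_k^{\prime\prime} + 2{{a^{\prime}} \over a}\chi_k^{\prime}} \right) = {1 \over {a^3}}\left({u_k^{\prime\prime} - {{a^{\prime\prime}} \over a}u_k} \right),\quad {{a^{\prime\prime}} \over a} = {{a^2 R} \over 6}.$$

Multiplying Eq. (3.37) by \(a^3\) then gives \(u_k^{\prime\prime} + [k^2 + m_\chi^2 a^2 + \xi a^2 R - a^2 R/6]u_k = 0\), which is Eq. (3.38). The term \(-a^2R/6\) is the contribution of R to the effective mass that survives for ξ = 0.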

Case: ξ = 0 and mχ = 0

In this case there is no explicit coupling between the field χ and R. Hence the χ particles are produced only gravitationally. In fact, Eq. (3.38) reduces to

$${{{{\rm{d}}^2}{u_k}} \over {{\rm{d}}{\eta ^2}}} + {k^2}{u_k} = U{u_k},$$
(3.39)

where \(U = a^2R/6\). Since U is of the order of \((aH)^2\), one has \(k^2 \gg U\) for the mode deep inside the Hubble radius. Initially we choose the field in the vacuum state with the positive-frequency solution [88]: \(u_k^{(i)} = {e^{- ik\eta}}/\sqrt {2k}\). The presence of the time-dependent term U(η) leads to the creation of the particle χ. We can write the solution of Eq. (3.39) iteratively, as [626]

$${u_k}(\eta) = u_k^{(i)} + {1 \over k}\int\nolimits_0^\eta U (\eta ^{\prime})\sin [k(\eta - \eta ^{\prime})]{u_k}(\eta ^{\prime}){\rm{d}}\eta ^{\prime}.$$
(3.40)

After the universe enters the radiation-dominated epoch, the term U becomes small so that the flat-space solution is recovered. The choice of decomposition of χ into \({\hat a_k}\) and \(\hat a_k^\dagger\) is not unique. In curved spacetime it is possible to choose another decomposition in terms of new ladder operators \({{\hat {\mathcal A}}_k}\) and \(\hat {\mathcal A}_k^\dagger\), which can be written in terms of \({\hat a_k}\) and \(\hat a_k^\dagger\), such as \({\hat {\mathcal{A}}_k} = {\alpha _k}{{\hat a}_k} + \beta _k^ \ast \hat a_{- k}^\dagger\). Provided that \(\beta _k^ {\ast} \neq 0\), even though \({\hat a_k}\left\vert 0 \right\rangle = 0\), we have \({{\hat {\mathcal A}}_k}\left\vert 0 \right\rangle \neq 0\). Hence the vacuum in one basis is not the vacuum in the new basis, and according to the new basis, particles are created. The Bogoliubov coefficient describing the particle production is

$${\beta _k} = - {i \over {2k}}\int\nolimits_0^\infty U (\eta ^{\prime}){e^{- 2ik\eta ^{\prime}}}{\rm{d}}\eta ^{\prime}.$$
(3.41)

The typical wavenumber in the η-coordinate is given by k, whereas in the t-coordinate it is k/a. Then the energy density per unit comoving volume in the η-coordinate is [426]

$$\begin{array}{*{20}c} {{\rho _\eta} = {1 \over {{{(2\pi)}^3}}}\int\nolimits_0^\infty {4\pi {k^2}{\rm{d}}k \cdot k\vert {\beta _k}{\vert ^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad}}\\ {= {1 \over {8{\pi ^2}}}\int\nolimits_0^\infty {{\rm{d}}\eta} \;U(\eta)\int\nolimits_0^\infty {{\rm{d}}\eta} ^{\prime}U(\eta ^{\prime})\int\nolimits_0^\infty {{\rm{d}}k \cdot k{e^{2ik(\eta^{\prime} - \eta)}}}}\\ {= {1 \over {32{\pi ^2}}}\int\nolimits_0^\infty {{\rm{d}}\eta {{{\rm{d}}U} \over {{\rm{d}}\eta}}} \int\nolimits_0^\infty {{\rm{d}}\eta^{\prime}{{U(\eta^{\prime})} \over {\eta^{\prime}- \eta}},\quad \quad \quad \quad \quad \quad}}\\ \end{array}$$
(3.42)

where in the last equality we have used the fact that the term U approaches 0 in the early and late times.

During the oscillating phase of the Ricci scalar the time-dependence of U is given by \(U = I(\eta)\sin (\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta})\), where \(I(\eta) = c\,a(\eta)^{1/2}\) and \(\omega = Ma\) (c is a constant). When we evaluate the term dU/dη in Eq. (3.42), the time-dependence of I(η) can be neglected. Differentiating Eq. (3.42) with respect to η and taking the limit \(\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta} \gg 1\), it follows that

$${{{\rm{d}}{\rho _\eta}} \over {{\rm{d}}\eta}} \simeq {\omega \over {32\pi}}{I^2}(\eta){\cos ^2}\left({\int\nolimits_0^\eta {\omega {\rm{d}}\bar \eta}} \right),$$
(3.43)

where we used the relation \(\lim_{k \to \infty} \sin(kx)/x = \pi\delta(x)\). Shifting the phase of the oscillating factor by π/2, we obtain

$${{{\rm{d}}{\rho _\eta}} \over {{\rm{d}}\eta}} \simeq {{M{U^2}} \over {32\pi}} = {{M{a^4}{R^2}} \over {1152\pi}}.$$
(3.44)

The proper energy density of the field χ is given by \(\rho_\chi = (\rho_\eta/a)/a^3 = \rho_\eta/a^4\). Taking into account \(g_*\) relativistic degrees of freedom, the total radiation density is

$${\rho _M} = {{{g_\ast}} \over {{a^4}}}{\rho _\eta} = {{{g_\ast}} \over {{a^4}}}\int\nolimits_{{t_{{\rm{os}}}}}^t {{{M{a^4}{R^2}} \over {1152\pi}}} {\rm{d}}t,$$
(3.45)

which obeys the following equation

$${\dot \rho _M} + 4H{\rho _M} = {{{g_\ast}M{R^2}} \over {1152\pi}}.$$
(3.46)

Comparing this with the continuity equation (2.17) we obtain the pressure of the created particles, as

$${P_M} = {1 \over 3}{\rho _M} - {{{g_\ast}M{R^2}} \over {3456\pi H}}.$$
(3.47)

Now the dynamical equations are given by Eqs. (2.15) and (2.16) with the energy density (3.45) and the pressure (3.47).

In the regime \(M(t - t_{\rm os}) \gg 1\) the evolution of the scale factor is given by \(a \simeq a_0(t - t_{\rm os})^{2/3}\), and hence

$${H^2} \simeq {4 \over {9{{(t - {t_{{\rm{os}}}})}^2}}},$$
(3.48)

where we have neglected the backreaction of created particles. Meanwhile the integration of Eq. (3.45) gives

$${\rho _M} \simeq {{{g_\ast}{M^3}} \over {240\pi}}{1 \over {t - {t_{{\rm{os}}}}}},$$
(3.49)

where we have used the averaged relation \(\langle R^2 \rangle \simeq 8M^2/(t - t_{\rm os})^2\) [which comes from Eq. (3.33)]. The energy density \(\rho_M\) evolves slowly compared to \(H^2\) and finally it becomes a dominant contribution to the total energy density \((3{H^2} \simeq 8\pi {\rho _M}/m_{{\rm{pl}}}^2)\) at the time \({t_f} \simeq {t_{{\rm{os}}}} + 40m_{{\rm{pl}}}^2/({g_{\ast}}{M^3})\). In [426] it was found that the transition from the oscillating phase to the radiation-dominated epoch occurs more slowly than the estimation given above. Since the transient matter-dominated era is about one order of magnitude longer than the analytic estimation [426], we take the value \({t_f} \simeq {t_{{\rm{os}}}} + 400m_{{\rm{pl}}}^2/({g_{\ast}}{M^3})\) to estimate the reheating temperature \(T_r\). Since the particle energy density \(\rho_M(t_f)\) is converted to the radiation energy density \({\rho _r} = {g_\ast}{\pi ^2}T_r^4/30\), the reheating temperature can be estimated as (see footnote 4)

$${T_r} \underset{\sim}{<} 3 \times {10^{17}}g_\ast ^{1/4}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^{3/2}}{\rm{GeV}}.$$
(3.50)

As we will see in Section 7, the WMAP normalization of the CMB temperature anisotropies determines the mass scale to be \(M \simeq 3 \times 10^{-6}\,m_{\rm pl}\). Taking the value \(g_* = 100\), we have \(T_r \lesssim 5 \times 10^{9}\) GeV. For \(t > t_f\) the universe enters the radiation-dominated epoch characterized by \(a \propto t^{1/2}\), R = 0, and \(\rho_r \propto t^{-2}\).
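The numbers above can be reproduced with a short calculation. The sketch below evaluates the bound of Eq. (3.50) and the phase \(M(t_f - t_{\rm os})\) accumulated by the end of the transient matter era, using the enhanced estimate \(t_f - t_{\rm os} \simeq 400\,m_{\rm pl}^2/(g_* M^3)\) adopted in the text. The function and parameter names are illustrative, not part of the original analysis.

```python
def reheating_temperature_bound(m_over_mpl: float, g_star: float) -> float:
    """Upper bound on T_r in GeV from Eq. (3.50)."""
    return 3.0e17 * g_star**0.25 * m_over_mpl**1.5


def oscillation_phase_at_tf(m_over_mpl: float, g_star: float,
                            enhancement: float = 10.0) -> float:
    """M*(t_f - t_os), with t_f - t_os = enhancement*40*m_pl^2/(g_* M^3).

    enhancement = 1 gives the analytic estimate; 10 is the value adopted
    in the text following the numerical result of [426].
    """
    return enhancement * 40.0 / (g_star * m_over_mpl**2)


if __name__ == "__main__":
    M_OVER_MPL = 3.0e-6  # WMAP-normalized mass scale quoted in the text
    G_STAR = 100.0
    print("T_r bound [GeV] =", reheating_temperature_bound(M_OVER_MPL, G_STAR))
    print("M (t_f - t_os)  =", oscillation_phase_at_tf(M_OVER_MPL, G_STAR))
```

The very large value of \(M(t_f - t_{\rm os})\) confirms that the averaged relations used above apply throughout most of the reheating stage.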

Case: ∣ξ∣ ≳ 1

If ∣ξ∣ is larger than the order of unity, one can expect the explosive particle production called preheating prior to the perturbative regime discussed above. Originally the dynamics of such gravitational preheating was studied in [70, 592] for a massive chaotic inflation model in Einstein gravity. Later this was extended to the f(R) model (3.6) [591].

Introducing a new field \(X_k = a^{3/2}\chi_k\), Eq. (3.37) reads

$${\ddot X_k} + \left({{{{k^2}} \over {{a^2}}} + m_\chi ^2 + \xi R - {9 \over 4}{H^2} - {3 \over 2}\dot H} \right){X_k} = 0.$$
(3.51)

As long as \(|\xi|\) is larger than the order of unity, the last two terms in the bracket of Eq. (3.51) can be neglected relative to ξR. Since the Ricci scalar is given by Eq. (3.33) in the regime \(M(t - t_{\rm os}) \gg 1\), it follows that

$${\ddot X_k} + \left[ {{{{k^2}} \over {{a^2}}} + m_\chi ^2 - {{4M\xi} \over {t - {t_{{\rm{os}}}}}}\sin \{M(t - {t_{{\rm{os}}}})\}} \right]{X_k} \simeq 0.$$
(3.52)

The oscillating term gives rise to parametric amplification of the particle \(\chi_k\). In order to see this we introduce the variable z defined by \(M(t - t_{\rm os}) = 2z \pm \pi/2\), where the plus and minus signs correspond to the cases ξ > 0 and ξ < 0 respectively. Then Eq. (3.52) reduces to the Mathieu equation

$${{{{\rm{d}}^2}} \over {{\rm{d}}{z^2}}}{X_k} + [{A_k} - 2q\cos (2z)]{X_k} \simeq 0,$$
(3.53)

where

$${A_k} = {{4{k^2}} \over {{a^2}{M^2}}} + {{4m_\chi ^2} \over {{M^2}}},\quad \;q = {{8\vert \xi \vert} \over {M(t - {t_{{\rm{os}}}})}}.$$
(3.54)

The strength of parametric resonance depends on the parameters \(A_k\) and q. This can be described by a stability-instability chart of the Mathieu equation [419, 353, 591]. In Minkowski spacetime the parameters \(A_k\) and q are constant. If \(A_k\) and q are in an instability band, then the perturbation \(X_k\) grows exponentially with a growth index \(\mu_k\), i.e., \({X_k} \propto {e^{{\mu _k}z}}\). In the regime q ≪ 1 the resonance occurs only in narrow bands around \(A_k = \ell^2\), where ℓ = 1, 2, …, with the maximum growth index \(\mu_k = q/2\) [353]. Meanwhile, for large q (≫ 1), a broad resonance can occur for a wide range of parameter space and momentum modes [354].
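The narrow-resonance behaviour can be illustrated by integrating the Mathieu equation (3.53) numerically for constant \(A_k\) and q, i.e., in the Minkowski limit. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step; the function names, the initial conditions \(X_k = 1\), \({\rm d}X_k/{\rm d}z = 0\), and the sample parameter values are assumptions made for illustration only.

```python
import math


def mathieu_parameters(k_phys: float, m_chi: float, xi: float,
                       M: float, tau: float) -> tuple:
    """(A_k, q) of Eq. (3.54); k_phys = k/a and tau = t - t_os."""
    A = 4.0 * (k_phys**2 + m_chi**2) / M**2
    q = 8.0 * abs(xi) / (M * tau)
    return A, q


def max_amplitude(A: float, q: float, z_end: float = 100.0,
                  h: float = 0.01) -> float:
    """Integrate X'' + [A - 2 q cos(2z)] X = 0 from X=1, X'=0; return max |X|."""
    def acc(z: float, x: float) -> float:
        return -(A - 2.0 * q * math.cos(2.0 * z)) * x

    x, v, z = 1.0, 0.0, 0.0
    peak = abs(x)
    for _ in range(int(round(z_end / h))):
        # Classical RK4 step for the first-order system (x, v).
        k1x, k1v = v, acc(z, x)
        k2x, k2v = v + 0.5 * h * k1v, acc(z + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(z + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(z + h, x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        z += h
        peak = max(peak, abs(x))
    return peak


if __name__ == "__main__":
    print("inside first band  (A=1, q=0.1):", max_amplitude(1.0, 0.1))
    print("between the bands  (A=2, q=0.1):", max_amplitude(2.0, 0.1))
```

For A = 1 the mode sits at the centre of the first band (ℓ = 1) and grows roughly as \(e^{qz/2}\), whereas for A = 2 it stays of order unity. In the expanding background both parameters drift in time, so this constant-parameter integration is only a local picture of the resonance.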

In the expanding cosmological background both Ak and q vary in time. Initially the field Xk is in the broad resonance regime (q ≫ 1) for ∣ξ∣ ≫ 1, but it gradually enters the narrow resonance regime (q ≲ 1). Since the field passes many instability and stability bands, the growth index μk stochastically changes with the cosmic expansion. The non-adiabaticity of the change of the frequency \(\omega _k^2 = {k^2}/{a^2} + m_\chi ^2 - 4M\xi \sin \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}})\) can be estimated by the quantity

$${r_{{\rm{na}}}} \equiv \left\vert {{{{{\dot \omega}_k}} \over {\omega _k^2}}} \right\vert = M{{\vert {k^2}/{a^2} + 2M\xi \cos \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}})\vert} \over {\vert {k^2}/{a^2} + m_\chi ^2 - 4M\xi \sin \{M(t - {t_{{\rm{os}}}})\}/(t - {t_{{\rm{os}}}}){\vert ^{3/2}}}},$$
(3.55)

where the non-adiabatic regime corresponds to \(r_{\rm na} \gtrsim 1\). For small k and \(m_\chi\) we have \(r_{\rm na} \gg 1\) around \(M(t - t_{\rm os}) = n\pi\), where n are positive integers. This corresponds to the time at which the Ricci scalar vanishes. Hence, each time R crosses 0 during its oscillation, the non-adiabatic particle production occurs most efficiently. The presence of the mass term \(m_\chi\) tends to suppress the non-adiabaticity parameter \(r_{\rm na}\), but still it is possible to satisfy the condition \(r_{\rm na} \gtrsim 1\) around R = 0.

For the model (3.6) it was shown in [591] that massless χ particles are resonantly amplified for \(|\xi| \gtrsim 3\). Massive particles with \(m_\chi\) of the order of M can be created for \(|\xi| \gtrsim 10\). Note that in the preheating scenario based on the model \(V(\phi, \chi) = (1/2)m_\phi ^2{\phi ^2} + (1/2){g^2}{\phi ^2}{\chi ^2}\) the parameter q decreases more rapidly (\(q \propto 1/t^2\)) than that in the model (3.6) [354]. Hence, in our geometric preheating scenario, we do not require very large initial values of q [such as \(q > {\mathcal O}({10^3})\)] to lead to the efficient parametric resonance.

While the above discussion is based on the linear analysis, non-linear effects (such as the mode-mode coupling of perturbations) can be important at the late stage of preheating (see, e.g., [354, 342]). Also the energy density of created particles affects the background cosmological dynamics, which works as a backreaction to the Ricci scalar. The process of the subsequent perturbative reheating stage can be affected by the explosive particle production during preheating. It will be of interest to take into account all these effects and study how the thermalization is reached at the end of reheating. This certainly requires the detailed numerical investigation of lattice simulations, as developed in [255, 254].

At the end of this section we should mention a number of interesting works about gravitational baryogenesis based on the interaction \((1/M_{\ast}^2)\int {{{\rm{d}}^4}x\sqrt {- g} {J^\mu}{\partial _\mu}R}\) between the baryon number current \(J^\mu\) and the Ricci scalar R (\(M_*\) is the cut-off scale characterizing the effective theory) [179, 376, 514]. This interaction can give rise to an equilibrium baryon asymmetry which is observationally acceptable, even for the gravitational Lagrangian \(f(R) = R^n\) with n close to 1. It will be of interest to extend the analysis to more general f(R) gravity models.

Dark Energy in f(R) Theories

In this section we apply f(R) theories to dark energy. Our interest is to construct viable f(R) models that can realize the sequence of radiation, matter, and accelerated epochs. In this section we do not attempt to find unified models of inflation and dark energy based on f(R) theories.

Originally the model \(f(R) = R - \alpha/R^n\) (α > 0, n > 0) was proposed to explain the late-time cosmic acceleration [113, 120, 114, 143] (see also [456, 559, 17, 223, 212, 16, 137, 62] for related works). However, this model suffers from a number of problems such as matter instability [215, 244], the instability of cosmological perturbations [146, 74, 544, 526, 251], the absence of the matter era [28, 29, 239], and the inability to satisfy local gravity constraints [469, 470, 245, 233, 154, 448, 134]. The main reason why this model does not work is that the quantity \(f_{,RR} \equiv \partial^2 f/\partial R^2\) is negative. As we will see later, the violation of the condition \(f_{,RR} > 0\) gives rise to a negative mass squared \(M^2\) for the scalaron field. Hence we require that \(f_{,RR} > 0\) to avoid a tachyonic instability. The condition \(f_{,R} \equiv \partial f/\partial R > 0\) is also required to avoid the appearance of ghosts (see Section 7.4). Thus viable f(R) dark energy models need to satisfy [568]

$${f_{,R}} > 0,\quad \;{f_{,RR}} > 0,\quad \;{\rm{for}}\quad R \geq {R_0}(> 0),$$
(4.56)

where \(R_0\) is the Ricci scalar today.

In the following we shall derive other conditions for the cosmological viability of f(R) models. This is based on the analysis of [26]. For the matter Lagrangian \({{\mathcal L}_M}\) in Eq. (2.1) we take into account non-relativistic matter and radiation, whose energy densities \(\rho_m\) and \(\rho_r\) satisfy
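The failure of the model \(f(R) = R - \alpha/R^n\) to satisfy the conditions (4.56) can be seen from its derivatives, \(f_{,R} = 1 + n\alpha/R^{n+1}\) and \(f_{,RR} = -n(n+1)\alpha/R^{n+2}\). The sketch below evaluates them; the function names and the sample parameter values are illustrative assumptions, with R and α in arbitrary units.

```python
def f_R(R: float, alpha: float, n: float) -> float:
    """First derivative of f(R) = R - alpha/R^n."""
    return 1.0 + n * alpha / R ** (n + 1.0)


def f_RR(R: float, alpha: float, n: float) -> float:
    """Second derivative of f(R) = R - alpha/R^n."""
    return -n * (n + 1.0) * alpha / R ** (n + 2.0)


def satisfies_viability_conditions(R: float, alpha: float, n: float) -> bool:
    """Check f_R > 0 and f_RR > 0 of Eq. (4.56) at a single curvature value."""
    return f_R(R, alpha, n) > 0.0 and f_RR(R, alpha, n) > 0.0


if __name__ == "__main__":
    for R in (0.5, 1.0, 10.0):
        print(R, f_R(R, 1.0, 1.0), f_RR(R, 1.0, 1.0),
              satisfies_viability_conditions(R, 1.0, 1.0))
```

For α > 0, n > 0 and R > 0 the second derivative is negative at every curvature, so the tachyonic-instability condition is violated regardless of the parameter values.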

$${\dot \rho _m} + 3H{\rho _m} = 0,$$
(4.57)
$${\dot \rho _r} + 4H{\rho _r} = 0,$$
(4.58)

respectively. From Eqs. (2.15) and (2.16) it follows that

$$3F{H^2} = (FR - f)/2 - 3H\dot F + {\kappa ^2}({\rho _m} + {\rho _r}),$$
(4.59)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}[{\rho _m} + (4/3){\rho _r}].$$
(4.60)

Dynamical equations

We introduce the following variables

$${x_1} \equiv - {{\dot F} \over {HF}},\quad {x_2} \equiv - {f \over {6F{H^2}}},\quad {x_3} \equiv {R \over {6{H^2}}},\quad {x_4} \equiv {{{\kappa ^2}{\rho _r}} \over {3F{H^2}}},$$
(4.61)

together with the density parameters

$${\Omega _m} \equiv {{{\kappa ^2}{\rho _m}} \over {3F{H^2}}} = 1 - {x_1} - {x_2} - {x_3} - {x_4},\quad {\Omega _r} \equiv {x_4},\quad {\Omega _{{\rm{DE}}}} \equiv {x_1} + {x_2} + {x_3}.$$
(4.62)

It is straightforward to derive the following equations

$${{{\rm{d}}{x_1}} \over {{\rm{d}}N}} = - 1 - {x_3} - 3{x_2} + x_1^2 - {x_1}{x_3} + {x_4},$$
(4.63)
$${{{\rm{d}}{x_2}} \over {{\rm{d}}N}} = {{{x_1}{x_3}} \over m} - {x_2}(2{x_3} - 4 - {x_1}),$$
(4.64)
$${{{\rm{d}}{x_3}} \over {{\rm{d}}N}} = - {{{x_1}{x_3}} \over m} - 2{x_3}({x_3} - 2),$$
(4.65)
$${{{\rm{d}}{x_4}} \over {{\rm{d}}N}} = - 2{x_3}{x_4} + {x_1}{x_4},$$
(4.66)

where N = ln a is the number of e-foldings, and

$$m \equiv {{{\rm{d}}\ln F} \over {{\rm{d}}\ln R}} = {{R{f_{,RR}}} \over {{f_{,R}}}},$$
(4.67)
$$r \equiv - {{{\rm{d}}\ln f} \over {{\rm{d}}\ln R}} = - {{R{f_{,R}}} \over f} = {{{x_3}} \over {{x_2}}}.$$
(4.68)

From Eq. (4.68) the Ricci scalar can be expressed in terms of \(x_3/x_2\). Since m depends on R, this means that m is a function of r, that is, m = m(r). The ΛCDM model, f(R) = R − 2Λ, corresponds to m = 0. Hence the quantity m characterizes the deviation of the background dynamics from the ΛCDM model. A number of authors studied cosmological dynamics for specific f(R) models [160, 382, 488, 252, 31, 198, 280, 72, 41, 159, 235, 1, 279, 483, 321, 432].

The effective equation of state of the system is defined by

$${w_{{\rm{eff}}}} \equiv - 1 - 2\dot H/(3{H^2}),$$
(4.69)

which is equivalent to weff = − (2x3 − 1)/3. In the absence of radiation (x4 = 0) the fixed points for the above dynamical system are

$${P_1}:({x_1},{x_2},{x_3}) = (0, - 1,2),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = - 1,$$
(4.70)
$${P_2}:({x_1},{x_2},{x_3}) = (- 1,0,0),\quad \quad {\Omega _m} = 2,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.71)
$${P_3}:({x_1},{x_2},{x_3}) = (1,0,0),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.72)
$${P_4}:({x_1},{x_2},{x_3}) = (- 4,5,0),\quad \quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = 1/3,$$
(4.73)
$${{P_5}:({x_1},{x_2},{x_3}) = \left({{{3m} \over {1 + m}}, - {{1 + 4m} \over {2{{(1 + m)}^2}}},{{1 + 4m} \over {2(1 + m)}}} \right),}$$
(4.74)
$${{\Omega _m} = 1 - {{m(7 + 10m)} \over {2{{(1 + m)}^2}}},\quad \quad {w_{{\rm{eff}}}} = - {m \over {1 + m}},}$$
(4.75)
$$\begin{array}{*{20}c} {{P_6}:({x_1},{x_2},{x_3}) = \left({{{2(1 - m)} \over {1 + 2m}},{{1 - 4m} \over {m(1 + 2m)}}, - {{(1 - 4m)(1 + m)} \over {m(1 + 2m)}}} \right),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {\quad {\Omega _m} = 0,\quad \quad \quad \quad {w_{{\rm{eff}}}} = {{2 - 5m - 6{m^2}} \over {3m(1 + 2m)}}.}\\ \end{array}$$
(4.76)

The points P5 and P6 are on the line m(r) = − r − 1 in the (r, m) plane.
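As a numerical cross-check of the fixed points listed above, the following sketch codes the right-hand sides of Eqs. (4.63)–(4.66) and evaluates them at P1 to P6. It assumes that m can be treated as a constant at each point (for P5 and P6 this is the value of m(r) on the line m = −r − 1); the function names are illustrative and are not taken from [26].

```python
def rhs(x, m):
    """Right-hand sides of Eqs. (4.63)-(4.66) for x = (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    dx1 = -1.0 - x3 - 3.0 * x2 + x1**2 - x1 * x3 + x4
    dx2 = x1 * x3 / m - x2 * (2.0 * x3 - 4.0 - x1)
    dx3 = -x1 * x3 / m - 2.0 * x3 * (x3 - 2.0)
    dx4 = -2.0 * x3 * x4 + x1 * x4
    return (dx1, dx2, dx3, dx4)


def fixed_points(m):
    """Fixed points of Eqs. (4.70)-(4.76) without radiation (x4 = 0)."""
    return {
        "P1": (0.0, -1.0, 2.0, 0.0),
        "P2": (-1.0, 0.0, 0.0, 0.0),
        "P3": (1.0, 0.0, 0.0, 0.0),
        "P4": (-4.0, 5.0, 0.0, 0.0),
        "P5": (3.0 * m / (1.0 + m),
               -(1.0 + 4.0 * m) / (2.0 * (1.0 + m) ** 2),
               (1.0 + 4.0 * m) / (2.0 * (1.0 + m)),
               0.0),
        "P6": (2.0 * (1.0 - m) / (1.0 + 2.0 * m),
               (1.0 - 4.0 * m) / (m * (1.0 + 2.0 * m)),
               -(1.0 - 4.0 * m) * (1.0 + m) / (m * (1.0 + 2.0 * m)),
               0.0),
    }


def omega_m(x):
    """Matter density parameter of Eq. (4.62)."""
    return 1.0 - sum(x)


def w_eff(x):
    """Effective equation of state, w_eff = -(2*x3 - 1)/3."""
    return -(2.0 * x[2] - 1.0) / 3.0


if __name__ == "__main__":
    m = 0.5  # illustrative value; any m other than 0, -1, -1/2 works
    for name, point in fixed_points(m).items():
        residual = max(abs(v) for v in rhs(point, m))
        print(name, "residual:", residual, "Omega_m:", omega_m(point), "w_eff:", w_eff(point))
```

The residuals should vanish to rounding error for every point, and the printed Ωm and weff can be compared with the values quoted in Eqs. (4.70)–(4.76).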

The matter-dominated epoch (Ωm ≃ 1 and weff ≃ 0) can be realized only by the point P5 for m close to 0. In the (r, m) plane this point exists around (r, m) = (−1, 0). Either the point P1 or P6 can be responsible for the late-time cosmic acceleration. The former is a de Sitter point (weff = −1) with r = −2, in which case the condition (2.11) is satisfied. The point P6 can give rise to the accelerated expansion (weff < −1/3) provided that \(m > (\sqrt 3 - 1)/2\), or −1/2 < m < 0, or \(m < - (1 + \sqrt 3)/2\).
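The three acceleration windows for P6 follow from setting weff = −1/3 in Eq. (4.76), which gives m² + m − 1/2 = 0, i.e., m = (−1 ± √3)/2, together with the sign change of the denominator at m = −1/2 and m = 0. A minimal sketch that checks this numerically (the helper names are illustrative):

```python
import math


def w_eff_p5(m):
    """Effective equation of state at P5, Eq. (4.75)."""
    return -m / (1.0 + m)


def w_eff_p6(m):
    """Effective equation of state at P6, Eq. (4.76)."""
    return (2.0 - 5.0 * m - 6.0 * m**2) / (3.0 * m * (1.0 + 2.0 * m))


def p6_accelerates(m):
    """True if P6 gives accelerated expansion, w_eff < -1/3."""
    return w_eff_p6(m) < -1.0 / 3.0


M_PLUS = (math.sqrt(3.0) - 1.0) / 2.0    # about 0.366
M_MINUS = -(1.0 + math.sqrt(3.0)) / 2.0  # about -1.366

if __name__ == "__main__":
    for m in (M_MINUS - 0.01, M_MINUS + 0.01, -0.6, -0.25, 0.1,
              M_PLUS - 0.01, M_PLUS + 0.01):
        print(f"m = {m:+.3f}  w_eff(P6) = {w_eff_p6(m):+.3f}  accelerated: {p6_accelerates(m)}")
```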

In order to analyze the stability of the above fixed points it is sufficient to consider only time-dependent linear perturbations δxi(t) (i = 1, 2, 3) around them (see [170, 171] for the detail of such analysis). For the point P5 the eigenvalues for the 3 × 3 Jacobian matrix of perturbations are

$$3(1 + m_5^\prime),\;\;\;{{- 3{m_5} \pm \sqrt {{m_5}(256m_5^3 + 160m_5^2 - 31{m_5} - 16)}} \over {4{m_5}({m_5} + 1)}},$$
(4.77)

where m5 ≡ m(r5) and \(m_5^\prime \equiv {{{\rm{d}}m} \over {{\rm{d}}r}}({r_5})\) with r5 ≈ −1. In the limit that ∣m5∣ ≪ 1 the latter two eigenvalues reduce to \(- 3/4 \pm \sqrt {- 1/{m_5}}\). For the models with m5 < 0, the solutions cannot remain for a long time around the point P5 because of the divergent behavior of the eigenvalues as m5 → −0. The model \(f(R) = R - \alpha/{R^n}\) (α > 0, n > 0) falls into this category. On the other hand, if 0 < m5 < 0.327, the latter two eigenvalues in Eq. (4.77) are complex with negative real parts. Then, provided that \(m_5^\prime > - 1\), the point P5 corresponds to a saddle point with a damped oscillation. Hence the solutions can stay around this point for some time and finally leave for the late-time acceleration. Then the condition for the existence of the saddle matter era is

$$m(r) \simeq + 0,\quad {{{\rm{d}}m} \over {{\rm{d}}r}} > - 1,\quad {\rm{at}}\quad r = - 1.$$
(4.78)

The first condition implies that viable f(R) models need to be close to the ΛCDM model during the matter domination. This is also required for consistency with local gravity constraints, as we will see in Section 5.
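The statements about the eigenvalues (4.77) can be checked with a few lines of code. The sketch below evaluates the three eigenvalues for given m5 and m5′ and locates, by bisection, the value of m5 at which the cubic under the square root changes sign (quoted as 0.327 in the text). The bracketing interval and the function names are assumptions made for illustration.

```python
import cmath


def p5_eigenvalues(m5, m5_prime):
    """Eigenvalues of Eq. (4.77) at the matter point P5."""
    disc = m5 * (256.0 * m5**3 + 160.0 * m5**2 - 31.0 * m5 - 16.0)
    root = cmath.sqrt(disc)
    denom = 4.0 * m5 * (m5 + 1.0)
    return (3.0 * (1.0 + m5_prime),
            (-3.0 * m5 + root) / denom,
            (-3.0 * m5 - root) / denom)


def critical_m5(lo=0.1, hi=0.5, tol=1e-12):
    """Positive root of 256 m^3 + 160 m^2 - 31 m - 16, found by bisection."""
    def cubic(m):
        return 256.0 * m**3 + 160.0 * m**2 - 31.0 * m - 16.0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("critical m5:", critical_m5())
    print("m5 = +0.01:", p5_eigenvalues(0.01, 0.0))   # damped oscillation
    print("m5 = -0.01:", p5_eigenvalues(-0.01, 0.0))  # one large positive eigenvalue
```

For small positive m5 the last two eigenvalues are complex with real part close to −3/4, whereas for small negative m5 one of them is real, positive, and of order \(\sqrt{1/\vert m_5\vert}\).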

The eigenvalues for the Jacobian matrix of perturbations about the point P1 are

$$- 3,\quad - {3 \over 2} \pm {{\sqrt {25 - 16/{m_1}}} \over 2},$$
(4.79)

where m1 = m(r = −2). This shows that the condition for the stability of the de Sitter point P1 is [440, 243, 250, 26]

$$0 < m(r = - 2) \leq 1.$$
(4.80)
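The stability window (4.80) can be read off from Eq. (4.79): the real parts of the last two eigenvalues are non-positive only when 25 − 16/m1 ≤ 9. A short sketch, assuming that the marginal case m1 = 1 (one zero eigenvalue) is counted as stable, as in Eq. (4.80):

```python
import cmath


def p1_eigenvalues(m1):
    """Eigenvalues of Eq. (4.79) at the de Sitter point P1."""
    root = cmath.sqrt(25.0 - 16.0 / m1)
    return (complex(-3.0), -1.5 + 0.5 * root, -1.5 - 0.5 * root)


def p1_is_stable(m1, tol=1e-12):
    """True if no eigenvalue has a positive real part (within tol)."""
    return all(ev.real <= tol for ev in p1_eigenvalues(m1))


if __name__ == "__main__":
    for m1 in (-0.5, 0.3, 0.64, 1.0, 1.5):
        print(m1, p1_eigenvalues(m1), p1_is_stable(m1))
```

For 0 < m1 < 16/25 the eigenvalues are complex (a stable spiral), and for 16/25 ≤ m1 ≤ 1 they are real and non-positive.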

The trajectories that start from the saddle matter point P5 satisfying the condition (4.78) and then approach the stable de Sitter point P1 satisfying the condition (4.80) are, in general, cosmologically viable.

One can also show that P6 is stable and accelerated for (a) \(m_6^\prime < - 1,\,(\sqrt 3 - 1)/2 < {m_6} < 1\), (b) \(m_6^\prime > - 1,\,{m_6} < - (1 + \sqrt 3)/2\), (c) \(m_6^\prime > - 1,\, - 1/2 < {m_6} < 0\), (d) \(m_6^\prime > - 1,\,{m_6} \geq 1\). Since both P5 and P6 are on the line m = −r − 1, only the trajectories from \(m_5^\prime > - 1\) to \(m_6^\prime < - 1\) are allowed (see Figure 2). This means that only the case (a) is viable as a stable and accelerated fixed point P6. In this case the effective equation of state satisfies the condition weff > −1.

Figure 2

Four trajectories in the (r, m) plane. Each trajectory corresponds to the models: (i) ΛCDM, (ii) \(f(R) = {({R^b} - \Lambda)^c}\), (iii) \(f(R) = R - \alpha {R^n}\) with α > 0, 0 < n < 1, and (iv) \(m(r) = - C(r + 1)({r^2} + ar + b)\). From [31].

From the above discussion the following two classes of models are cosmologically viable.

  • Class A: Models that connect P5 (r ≃ −1, m ≃ +0) to P1 (r = −2, 0 < m ≤ 1)

  • Class B: Models that connect P5 (r ≃ −1, m ≃ +0) to \({P_6}\left({m = - r - 1,\,\left({\sqrt 3 - 1} \right)/2 < m < 1} \right)\)

From Eq. (4.56) the viable f(R) dark energy models need to satisfy the condition m > 0, which is consistent with the above argument.

Viable f(R) dark energy models

We present a number of viable f(R) models in the (r, m) plane. First we note that the ΛCDM model corresponds to m = 0, in which case the trajectory is the straight line (i) in Figure 2. The trajectory (ii) in Figure 2 represents the model \(f(R) = {({R^b} - \Lambda)^c}\) [31], which corresponds to the straight line m(r) = [(1 − c)/c]r + b − 1 in the (r, m) plane. The existence of a saddle matter epoch demands the condition c ≥ 1 and bc ≃ 1. The trajectory (iii) represents the model [26, 382]

$$f(R) = R - \alpha {R^n}\quad (\alpha > 0,0 < n < 1),$$
(4.81)

which corresponds to the curve m = n(1 + r)/r. The trajectory (iv) represents the model \(m(r) = - C(r + 1)({r^2} + ar + b)\), in which case the late-time accelerated attractor is the point P6 with \({\left({\sqrt 3 - 1} \right)/2 < m < 1}\).
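The relation m = n(1 + r)/r for the model (4.81) can be verified directly from the definitions (4.67) and (4.68). The following sketch evaluates m and r from the analytic derivatives of f(R) = R − αRⁿ; the parameter values used in the demonstration are arbitrary illustrative choices.

```python
def m_and_r_power_law(R, alpha, n):
    """Return (m, r) of Eqs. (4.67)-(4.68) for f(R) = R - alpha * R**n."""
    f = R - alpha * R**n
    F = 1.0 - alpha * n * R ** (n - 1.0)                 # f_{,R}
    f_RR = -alpha * n * (n - 1.0) * R ** (n - 2.0)       # f_{,RR}
    return R * f_RR / F, -R * F / f


if __name__ == "__main__":
    alpha, n = 1.0, 0.5  # illustrative values with alpha > 0 and 0 < n < 1
    for R in (5.0, 10.0, 100.0, 1.0e4):
        m, r = m_and_r_power_law(R, alpha, n)
        print(f"R = {R:8.1f}  r = {r:+.5f}  m = {m:.5e}  n(1+r)/r = {n * (1.0 + r) / r:.5e}")
```

As R grows, r approaches −1 and m approaches +0, so the trajectory (iii) starts near the matter point P5.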

In [26] it was shown that m needs to be close to 0 during the radiation domination as well as the matter domination. Hence the viable f(R) models are close to the ΛCDM model in the region R ≫ R0. The Ricci scalar remains positive from the radiation era up to the present epoch, as long as it does not oscillate around R = 0. The model \(f(R) = R - \alpha/{R^n}\) (α > 0, n > 0) is not viable because the condition f,RR > 0 is violated.

As we will see in Section 5, the local gravity constraints provide tight bounds on the deviation parameter m in the region of high density (R ≫ R0), e.g., \(m(R) \lesssim {10^{- 15}}\) for \(R = {10^5}{R_0}\) [134, 596]. In order to realize a large deviation from the ΛCDM model such as \(m(R) > {\mathcal O}(0.1)\) today (R = R0) we require that the variable m changes rapidly from the past to the present. The f(R) model given in Eq. (4.81), for example, does not allow such a rapid variation, because m evolves as m ≃ n(−r − 1) in the region R ≫ R0. Instead, if the deviation parameter has the dependence

$$m = C{(- r - 1)^p},\quad p > 1,\;\;C > 0,$$
(4.82)

m can decrease rapidly as we go back to the past. The models that behave as Eq. (4.82) in the regime R ≫ R0 are

$$({\rm{A}})\;f(R) = R - \mu {R_c}{{{{(R/{R_c})}^{2n}}} \over {{{(R/{R_c})}^{2n}} + 1}}\quad {\rm{with}}\;\;n,\mu, {R_c} > 0,$$
(4.83)
$$({\rm{B}})\;f(R) = R - \mu {R_c}\left[ {1 - {{(1 + {R^2}/R_c^2)}^{- n}}} \right]\quad {\rm{with}}\;\;n,\mu, {R_c} > 0.$$
(4.84)

The models (A) and (B) have been proposed by Hu and Sawicki [306] and Starobinsky [568], respectively. Note that Rc roughly corresponds to the order of R0 for \(\mu = O(1)\). This means that p = 2n + 1 for R ≫ R0. In the next section we will show that both the models (A) and (B) are consistent with local gravity constraints for n ≳ 1.
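The exponent p = 2n + 1 can be checked numerically. The sketch below computes m and r for the model (4.83) from its analytic derivatives, in units where Rc = 1, and estimates the logarithmic slope d ln m / d ln(−r − 1) between two large values of R/Rc. The sampling points y1 and y2 are assumptions chosen so that R ≫ Rc.

```python
import math


def m_and_r_model_A(y, n, mu):
    """Return (m, r) for model (4.83) at y = R/Rc, in units with Rc = 1."""
    u = y ** (2 * n)
    g = u / (u + 1.0)                                   # f = y - mu * g
    g1 = 2 * n * y ** (2 * n - 1) / (u + 1.0) ** 2      # dg/dy
    g2 = (2 * n * y ** (2 * n - 2)
          * ((2 * n - 1) - (2 * n + 1) * u) / (u + 1.0) ** 3)  # d^2 g/dy^2
    f = y - mu * g
    F = 1.0 - mu * g1
    f_RR = -mu * g2
    return y * f_RR / F, -y * F / f


def log_slope(n, mu, y1=1.0e3, y2=1.0e4):
    """Finite-difference estimate of p = d ln m / d ln(-r - 1)."""
    m1, r1 = m_and_r_model_A(y1, n, mu)
    m2, r2 = m_and_r_model_A(y2, n, mu)
    return math.log(m1 / m2) / math.log((-r1 - 1.0) / (-r2 - 1.0))


if __name__ == "__main__":
    for n in (1, 2):
        print("n =", n, " estimated p =", log_slope(n, mu=2.0), " expected 2n + 1 =", 2 * n + 1)
```

In the same limit the expansion of (4.83) gives \(m \simeq 2n(2n + 1){\mu ^{- 2n}}{(- r - 1)^{2n + 1}}\), so the constant C in Eq. (4.82) is fixed by n and μ.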

In the model (A) the following relation holds at the de Sitter point:

$$\mu = {{{{(1 + x_d^{2n})}^2}} \over {x_d^{2n - 1}(2 + 2x_d^{2n} - 2n)}},$$
(4.85)

where xd ≡ R1/Rc and R1 is the Ricci scalar at the de Sitter point. The stability condition (4.80) gives [587]

$$2x_d^{4n} - (2n - 1)(2n + 4)x_d^{2n} + (2n - 1)(2n - 2) \geq 0.$$
(4.86)

The parameter μ has a lower bound determined by the condition (4.86). When n = 1, for example, one has \({x_d} \geq \sqrt 3\) and \(\mu \geq 8\sqrt 3/9\). Under Eq. (4.86) one can show that the conditions (4.56) are also satisfied.

Similarly the model (B) satisfies [568]

$${(1 + x_d^2)^{n + 2}} \geq 1 + (n + 2)x_d^2 + (n + 1)(2n + 1)x_d^4,$$
(4.87)

with

$$\mu = {{{x_d}{{(1 + x_d^2)}^{n + 1}}} \over {2[{{(1 + x_d^2)}^{n + 1}} - 1 - (n + 1)x_d^2]}}.$$
(4.88)

When n = 1 we have \({x_d} \geq \sqrt 3\) and \(\mu \geq 8\sqrt 3/9\), which is the same as in the model (A). For general n, however, the bounds on μ in the model (B) are not identical to those in the model (A).
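The n = 1 bounds quoted for the models (A) and (B) can be reproduced by evaluating Eqs. (4.85)–(4.88) directly. In the sketch below the stability conditions are written as functions that must be non-negative; the function names are illustrative.

```python
import math


def mu_model_A(x_d, n):
    """Eq. (4.85): mu as a function of x_d = R_1/R_c for model (A)."""
    u = x_d ** (2 * n)
    return (1.0 + u) ** 2 / (x_d ** (2 * n - 1) * (2.0 + 2.0 * u - 2.0 * n))


def stability_A(x_d, n):
    """Left-hand side of Eq. (4.86); the de Sitter point is stable if >= 0."""
    u = x_d ** (2 * n)
    return 2.0 * u**2 - (2 * n - 1) * (2 * n + 4) * u + (2 * n - 1) * (2 * n - 2)


def mu_model_B(x_d, n):
    """Eq. (4.88): mu as a function of x_d for model (B)."""
    s = (1.0 + x_d**2) ** (n + 1)
    return x_d * s / (2.0 * (s - 1.0 - (n + 1) * x_d**2))


def stability_B(x_d, n):
    """Left-hand side minus right-hand side of Eq. (4.87); stable if >= 0."""
    lhs = (1.0 + x_d**2) ** (n + 2)
    rhs = 1.0 + (n + 2) * x_d**2 + (n + 1) * (2 * n + 1) * x_d**4
    return lhs - rhs


if __name__ == "__main__":
    x = math.sqrt(3.0)
    print("model A, n = 1:", mu_model_A(x, 1), stability_A(x, 1))
    print("model B, n = 1:", mu_model_B(x, 1), stability_B(x, 1))
    print("8*sqrt(3)/9    =", 8.0 * math.sqrt(3.0) / 9.0)
```

For n = 1 both stability functions vanish at \({x_d} = \sqrt 3\) and both expressions for μ reduce to \({(1 + x_d^2)^2}/(2x_d^3)\), which explains why the bounds coincide in this case.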

Another model that leads to an even faster evolution of m is given by [587]

$$({\rm{C}})\;f(R) = R - \mu {R_c}\tanh \;(R/{R_c})\quad {\rm{with}}\;\;\mu, {R_c} > 0.$$
(4.89)

A similar model was proposed by Appleby and Battye [35]. In the region R ≫ Rc the model (4.89) behaves as \(f(R) \simeq R - \mu {R_c}[1 - \exp (- 2R/{R_c})]\), which may be regarded as a special case of (4.82) in the limit that p ≫ 1 (see Footnote 5). The Ricci scalar at the de Sitter point is determined by μ, as

$$\mu = {{{x_d}{{\cosh}^2}({x_d})} \over {2\sinh ({x_d})\cosh ({x_d}) - {x_d}}}.$$
(4.90)

From the stability condition (4.80) we obtain

$$\mu > 0.905,\quad \;{x_d} > 0.920{.}$$
(4.91)
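The numbers in Eq. (4.91) can be recovered by combining Eq. (4.90) with the marginal stability condition m = 1 from Eq. (4.80). The sketch below works in units where Rc = 1, uses F = 1 − μ sech²x and f,RR = 2μ sech²x tanh x for the model (4.89), and finds the critical xd by bisection; the bracketing interval [0.5, 2] is an assumption chosen by inspection.

```python
import math


def mu_model_C(x_d):
    """Eq. (4.90): mu as a function of x_d = R_1/R_c for model (C)."""
    return (x_d * math.cosh(x_d) ** 2
            / (2.0 * math.sinh(x_d) * math.cosh(x_d) - x_d))


def m_at_de_sitter(x_d):
    """m = R f_RR / F at the de Sitter point of f = R - mu*tanh(R), Rc = 1."""
    mu = mu_model_C(x_d)
    sech2 = 1.0 / math.cosh(x_d) ** 2
    F = 1.0 - mu * sech2
    f_RR = 2.0 * mu * sech2 * math.tanh(x_d)
    return x_d * f_RR / F


def critical_x_d(lo=0.5, hi=2.0, tol=1e-12):
    """Bisection for m(x_d) = 1, the boundary of the stability condition (4.80)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if m_at_de_sitter(mid) > 1.0:
            lo = mid   # still unstable: move toward larger x_d
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x_c = critical_x_d()
    print("critical x_d:", x_c, " corresponding mu:", mu_model_C(x_c))
```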

The models (A), (B) and (C) are close to the ΛCDM model for R ≫ Rc, but the deviation from it appears when R decreases to the order of Rc. This leaves a number of observational signatures such as the phantom-like equation of state of dark energy and the modified evolution of matter density perturbations. In the following we discuss the dark energy equation of state in f(R) models. In Section 8 we study the evolution of density perturbations and resulting observational consequences in detail.

Equation of state of dark energy

In order to confront viable f(R) models with SN Ia observations, we rewrite Eqs. (4.59) and (4.60) as follows:

$$3A{H^2} = {\kappa ^2}({\rho _m} + {\rho _r} + {\rho _{{\rm{DE}}}}),$$
(4.92)
$$- 2A\dot H = {\kappa ^2}[{\rho _m} + (4/3){\rho _r} + {\rho _{{\rm{DE}}}} + {P_{{\rm{DE}}}}],$$
(4.93)

where A is some constant and

$${\kappa ^2}{\rho _{{\rm{DE}}}} \equiv (1/2)(FR - f) - 3H\dot F + 3{H^2}(A - F),$$
(4.94)
$${\kappa ^2}{P_{{\rm{DE}}}} \equiv \ddot F + 2H\dot F - (1/2)(FR - f) - (3{H^2} + 2\dot H)(A - F).$$
(4.95)

Defining ρDE and PDE in the above way, we find that these satisfy the usual continuity equation

$${\dot \rho _{{\rm{DE}}}} + 3H({\rho _{{\rm{DE}}}} + {P_{{\rm{DE}}}}) = 0{.}$$
(4.96)

Note that this holds as a consequence of the Bianchi identities, as we have already mentioned in the discussion from Eq. (2.8) to Eq. (2.10).

The dark energy equation of state, wDE ≡ PDE/ρDE, is directly related to the one used in SN Ia observations. From Eqs. (4.92) and (4.93) it is given by

$${w_{{\rm{DE}}}} = - {{2A\dot H + 3A{H^2} + {\kappa ^2}{\rho _r}/3} \over {3A{H^2} - {\kappa ^2}({\rho _m} + {\rho _r})}} \simeq {{{w_{{\rm{eff}}}}} \over {1 - (F/A){\Omega _m}}},$$
(4.97)

where the last approximate equality is valid in the regime where the radiation density ρr is negligible relative to the matter density ρm. The viable f(R) models approach the ΛCDM model in the past, i.e., F → 1 as R → ∞. In order to reproduce the standard matter era (\(3{H^2} \simeq {\kappa ^2}{\rho _m}\)) for z ≫ 1, we can choose A = 1 in Eqs. (4.92) and (4.93). Another possible choice is A = F0, where F0 is the present value of F. This choice may be suitable if the deviation of F0 from 1 is small (as in scalar-tensor theory with a nearly massless scalar field [583, 93]). In both cases the equation of state wDE can be smaller than −1 before reaching the de Sitter attractor [306, 31, 587, 435], while the effective equation of state weff is larger than −1. This comes from the fact that the denominator in Eq. (4.97) becomes smaller than 1 in the presence of the matter fluid. Thus f(R) gravity models give rise to the phantom equation of state of dark energy without violating any stability conditions of the system. See [210, 417, 136, 13] for observational constraints on the models (4.83) and (4.84) by using the background expansion history of the universe. Note that as long as the late-time attractor is the de Sitter point the cosmological constant boundary crossing of weff reported in [52, 50] does not typically occur, apart from small oscillations of weff around the de Sitter point.
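The way in which wDE < −1 arises while weff > −1 can be illustrated with the approximate form of Eq. (4.97). In the sketch below Ωm is the density parameter of Eq. (4.62), the radiation is neglected, and the numerical inputs are hypothetical values chosen only for illustration; they are not fits to data.

```python
def w_de(w_eff, omega_m, F_over_A=1.0):
    """Approximate Eq. (4.97): w_DE = w_eff / (1 - (F/A) * Omega_m)."""
    return w_eff / (1.0 - F_over_A * omega_m)


if __name__ == "__main__":
    # LambdaCDM-like reference: F = A = 1 and w_eff = -(1 - Omega_m) give w_DE = -1.
    print(w_de(-0.7, 0.3))
    # Hypothetical f(R)-like values: w_eff > -1 but w_DE < -1.
    print(w_de(-0.75, 0.28))
```

Since the denominator is smaller than 1 whenever Ωm > 0, any weff below −(1 − (F/A)Ωm) yields wDE < −1 even though weff itself stays above −1.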

There are some works that try to reconstruct the forms of f(R) by using some desired form for the evolution of the scale factor a(t) or the observational data of SN Ia [117, 130, 442, 191, 621, 252]. We need to caution that the procedure of reconstruction does not in general guarantee the stability of solutions. In scalar-tensor dark energy models, for example, it is known that a singular behavior sometimes arises at low redshifts in such a procedure [234, 271]. In addition to the fact that the reconstruction method does not uniquely determine the forms of f(R), the observational data of the background expansion history alone are not yet sufficient to reconstruct f(R) models with high precision.

Finally we mention a number of works [115, 118, 119, 265, 319, 515, 542, 90] about the use of metric f(R) gravity as dark matter instead of dark energy. In most of past works the power-law f(R) model \(f = {R^n}\) has been used to obtain spherically symmetric solutions for galaxy clustering. In [118] it was shown that the theoretical rotation curves of spiral galaxies show good agreement with observational data for n = 1.7, while for broader samples the best-fit value of the power was found to be n = 2.2 [265]. However, these values are not compatible with the bound \(\vert n - 1\vert < 7.2 \times {10^{- 19}}\) derived in [62, 160] from a number of other observational constraints. Hence, it is unlikely that f(R) gravity works as the main source for dark matter.

Local Gravity Constraints

In this section we discuss the compatibility of f(R) models with local gravity constraints (see [469, 470, 245, 233, 154, 448, 251] for early works, and [31, 306, 134] for experimental constraints on viable f(R) dark energy models, and [101, 210, 330, 332, 471, 628, 149, 625, 329, 45, 511, 277, 534, 133, 445, 309, 89] for other related works). In an environment of high density such as Earth or Sun, the Ricci scalar R is much larger than the background cosmological value R0. If the outside of a spherically symmetric body is a vacuum, the metric can be described by a Schwarzschild exterior solution with R = 0. In the presence of non-relativistic matter with an energy density ρm, this gives rise to a contribution to the Ricci scalar R of the order \({\kappa ^2}{\rho _m}\).

If we consider local perturbations δR on a background characterized by the curvature R0, the validity of the linear approximation demands the condition δR ≪ R0. We first derive the solutions of linear perturbations under the approximation that the background metric \(g_{\mu \nu}^{(0)}\) is described by the Minkowski metric ημν. In the case of Earth and Sun the perturbation δR is much larger than R0, which means that the linear theory is no longer valid. In such a non-linear regime the effect of the chameleon mechanism [344, 343] becomes important, so that f(R) models can be consistent with local gravity tests.

Linear expansions of perturbations in the spherically symmetric background

First we decompose the quantities R, F(R), and Tμν into the background part and the perturbed part: R = R0 + δR, F = F0(1 + δF), and \({T_{\mu \nu}} = {}^{(0)}{T_{\mu \nu}} + \delta {T_{\mu \nu}}\) about the approximate Minkowski background \((g_{\mu \nu}^{(0)} \approx {\eta _{\mu \nu}})\). In other words, although we consider R close to a mean-field value R0, the metric is still very close to the Minkowski case. The linear expansion of Eq. (2.7) in a time-independent background gives [470, 250, 154, 448]

$${\nabla ^2}{\delta _F} - {M^2}{\delta _F} = {{{\kappa ^2}} \over {3{F_0}}}\delta T,$$
(5.1)

where \(\delta T \equiv {\eta ^{\mu \nu}}\delta {T_{\mu \nu}}\) and

$${M^2} \equiv {1 \over 3}\left[ {{{{f_{,R}}({R_0})} \over {{f_{,RR}}({R_0})}} - {R_0}} \right] = {{{R_0}} \over 3}\left[ {{1 \over {m({R_0})}} - 1} \right].$$
(5.2)

The variable m is defined in Eq. (4.67). Since 0 < m(R0) < 1 for viable f(R) models, it follows that \({M^2} > 0\) (recall that R0 > 0).

We consider a spherically symmetric body with mass Mc, constant density ρ (= −δT), radius rc, and vanishing density outside the body. Since δF is a function of the distance r from the center of the body, Eq. (5.1) reduces to the following form inside the body (r < rc):

$${{{{\rm{d}}^2}} \over {{\rm{d}}{r^2}}}{\delta _F} + {2 \over r}{{\rm{d}} \over {{\rm{d}}r}}{\delta _F} - {M^2}{\delta _F} = - {{{\kappa ^2}} \over {3{F_0}}}\rho,$$
(5.3)

whereas the r.h.s. vanishes outside the body (r > rc). The solution of the perturbation δF for positive M2 is given by

$${({\delta _F})_{r < {r_c}}} = {c_1}{{{e^{- Mr}}} \over r} + {c_2}{{{e^{Mr}}} \over r} + {{8\pi G\rho} \over {3{F_0}{M^2}}},$$
(5.4)
$${({\delta _F})_{r > {r_c}}} = {c_3}{{{e^{- Mr}}} \over r} + {c_4}{{{e^{Mr}}} \over r},$$
(5.5)

where ci (i = 1, 2, 3, 4) are integration constants. The requirement that \({({\delta _F})_{r > {r_c}}} \rightarrow 0\) as r → ∞ gives c4 = 0. The regularity condition at r = 0 requires that c2 = −c1. We match two solutions (5.4) and (5.5) at r = rc by demanding the regular behavior of δF(r) and \(\delta _F^{\prime}(r)\). Since δF ∝ δR, this implies that R is also continuous. If the mass M satisfies the condition Mrc ≪ 1, we obtain the following solutions

$${({\delta _F})_{r < {r_c}}} \simeq {{4\pi G\rho} \over {3{F_0}}}\left({r_c^2 - {{{r^2}} \over 3}} \right),$$
(5.6)
$${({\delta _F})_{r > {r_c}}} \simeq {{2G{M_c}} \over {3{F_0}r}}{e^{- Mr}}.$$
(5.7)
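The light-mass limit can be checked against the exact matched solution. Imposing c4 = 0, c2 = −c1, and continuity of δF and δF′ at r = rc on Eqs. (5.4) and (5.5) fixes c1 and c3 in closed form; the sketch below evaluates that solution and compares it with Eqs. (5.6) and (5.7). It works in illustrative units G = F0 = ρ = rc = 1, and the closed-form coefficients in the code are derived here from the matching conditions rather than quoted from the text.

```python
import math


def delta_F_exact(r, M, rho=1.0, r_c=1.0, G=1.0, F0=1.0):
    """Matched solution of Eqs. (5.4)-(5.5) with c4 = 0 and c2 = -c1."""
    S = 8.0 * math.pi * G * rho / (3.0 * F0 * M**2)   # constant term in Eq. (5.4)
    x = M * r_c
    a = -(1.0 + x) * S * r_c * math.exp(-x) / x        # a = -2*c1
    c3 = S * r_c * math.exp(x) * (1.0 - (1.0 + x) * math.exp(-x) * math.sinh(x) / x)
    if r < r_c:
        return a * math.sinh(M * r) / r + S
    return c3 * math.exp(-M * r) / r


def delta_F_approx(r, M, rho=1.0, r_c=1.0, G=1.0, F0=1.0):
    """Light-mass limit, Eqs. (5.6) and (5.7)."""
    if r < r_c:
        return 4.0 * math.pi * G * rho / (3.0 * F0) * (r_c**2 - r**2 / 3.0)
    M_c = 4.0 * math.pi * rho * r_c**3 / 3.0
    return 2.0 * G * M_c / (3.0 * F0 * r) * math.exp(-M * r)


if __name__ == "__main__":
    M = 1.0e-3  # M * r_c << 1
    for r in (0.5, 0.999, 1.001, 2.0):
        print(r, delta_F_exact(r, M), delta_F_approx(r, M))
```

The two expressions agree up to relative corrections of order Mrc, and the exact solution is continuous across r = rc.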

As we have seen in Section 2.3, the action (2.1) in f(R) gravity can be transformed to the Einstein frame action by a transformation of the metric. The Einstein frame action is given by a linear action in \(\tilde R\), where \(\tilde R\) is a Ricci scalar in the new frame. The first-order solution for the perturbation hμν of the metric \({\tilde g_{\mu \nu}} = {F_0}({\eta _{\mu \nu}} + {h_{\mu \nu}})\) follows from the first-order linearized Einstein equations in the Einstein frame. This leads to the solutions h00 = 2 GMc/(F0r) and hij = 2GMc/(F0r) δij. Including the perturbation δF to the quantity F, the actual metric gμν is given by [448]

$${g_{\mu \nu}} = {{{{\tilde g}_{\mu \nu}}} \over F} \simeq {\eta _{\mu \nu}} + {h_{\mu \nu}} - {\delta _F}{\eta _{\mu \nu}}.$$
(5.8)

Using the solution (5.7) outside the body, the (00) and (ii) components of the metric gμν are

$${g_{00}} \simeq - 1 + {{2G_{{\rm{eff}}}^{(N)}{M_c}} \over r},\quad \;{g_{ii}} \simeq 1 + {{2G_{{\rm{eff}}}^{(N)}{M_c}} \over r}\gamma,$$
(5.9)

where \(G_{{\rm{eff}}}^{(N)}\) and γ are the effective gravitational coupling and the post-Newtonian parameter, respectively, defined by

$$G_{{\rm{eff}}}^{(N)} \equiv {G \over {{F_0}}}\left({1 + {1 \over 3}{e^{- Mr}}} \right),\quad \quad \gamma \equiv {{3 - {e^{- Mr}}} \over {3 + {e^{- Mr}}}}.$$
(5.10)

For the f(R) models whose deviation from the ΛCDM model is small (m ≪ 1), we have \({M^2} \simeq {R_0}/[3m({R_0})]\) and R ≃ 8πGρ. This gives the following estimate

$${(M{r_c})^2} \simeq 2{{{\Phi _c}} \over {m({R_0})}},$$
(5.11)

where \({\Phi _c} = G{M_c}/({F_0}{r_c}) = 4\pi G\rho r_c^2/(3{F_0})\) is the gravitational potential at the surface of the body. The approximation Mrc ≪ 1 used to derive Eqs. (5.6) and (5.7) corresponds to the condition

$$m({R_0}) \gg {\Phi _c}.$$
(5.12)
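To get a feeling for the size of Φc in Eq. (5.12), the sketch below evaluates the surface potential GMc/(rc c²) for the Sun and the Earth, taking F0 ≃ 1. The values of GM, the radii, and the speed of light are standard reference values inserted here as assumptions; they do not appear in the text. The factor of 10 used to represent "≫" is likewise an arbitrary illustrative choice.

```python
C_LIGHT = 2.99792458e8   # speed of light in m/s (assumed standard value)
GM_SUN = 1.32712e20      # G * M_sun in m^3/s^2 (assumed standard value)
R_SUN = 6.957e8          # solar radius in m (assumed standard value)
GM_EARTH = 3.986e14      # G * M_earth in m^3/s^2 (assumed standard value)
R_EARTH = 6.371e6        # mean Earth radius in m (assumed standard value)


def surface_potential(GM, radius):
    """Dimensionless potential Phi_c = G M_c / (r_c c^2), with F0 set to 1."""
    return GM / (radius * C_LIGHT**2)


def M_rc_squared(phi_c, m_R0):
    """Estimate of Eq. (5.11): (M r_c)^2 = 2 Phi_c / m(R0)."""
    return 2.0 * phi_c / m_R0


def linear_regime(phi_c, m_R0, margin=10.0):
    """Condition (5.12), m(R0) >> Phi_c, with an arbitrary margin for '>>'."""
    return m_R0 > margin * phi_c


if __name__ == "__main__":
    phi_sun = surface_potential(GM_SUN, R_SUN)
    phi_earth = surface_potential(GM_EARTH, R_EARTH)
    print("Phi_c (Sun)  :", phi_sun)
    print("Phi_c (Earth):", phi_earth)
    print("(M r_c)^2 for the Sun with m(R0) = 0.01:", M_rc_squared(phi_sun, 1.0e-2))
```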

Since \({F_0}{\delta _F} = {f_{,RR}}({R_0})\delta R\), it follows that

$$\delta R = {{{f_{,R}}({R_0})} \over {{f_{,RR}}({R_0})}}{\delta _F}.$$
(5.13)

The validity of the linear expansion requires that δR ≪ R0, which translates into δF ≪ m(R0). Since δF ≃ 2GMc/(3F0rc) = 2Φc/3 at r = rc, one has δF ≪ m(R0) ≪ 1 under the condition (5.12). Hence the linear analysis given above is valid for m(R0) ≫ Φc.

For a distance r close to rc the post-Newtonian parameter in Eq. (5.10) is given by γ ≃ 1/2 (because Mr ≪ 1). The tightest experimental bound on γ is given by [616, 83, 617]:

$$\vert \gamma - 1\vert \; < 2.3 \times {10^{- 5}},$$
(5.14)

which comes from the time-delay effect of the Cassini tracking for Sun. This means that f(R) gravity models with the light scalaron mass (Mrc ≪ 1) do not satisfy local gravity constraints [469, 470, 245, 233, 154, 448, 330, 332]. The mean density of Earth or Sun is of the order of \(\rho \simeq 1\)–\(10\,{\rm{g/cm^3}}\), which is much larger than the present cosmological density \(\rho _c^{(0)} \simeq {10^{- 29}}\,{\rm{g/cm^3}}\). In such an environment the condition δR ≪ R0 is violated and the field mass M becomes large such that Mrc ≫ 1. The effect of the chameleon mechanism [344, 343] becomes important in this nonlinear regime (δR ≫ R0) [251, 306, 134, 101]. In Section 5.2 we will show that the f(R) models can be consistent with local gravity constraints provided that the chameleon mechanism is at work.
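The conflict between the light scalaron and the bound (5.14) is easy to quantify from Eq. (5.10). The sketch below evaluates γ and the ratio of the effective gravitational coupling to G as functions of Mr; it only restates the linear result and makes no statement about the non-linear chameleon regime.

```python
import math

CASSINI_BOUND = 2.3e-5  # upper limit on |gamma - 1| from Eq. (5.14)


def gamma_ppn(Mr):
    """Post-Newtonian parameter of Eq. (5.10) as a function of M*r."""
    e = math.exp(-Mr)
    return (3.0 - e) / (3.0 + e)


def g_eff_over_g(Mr, F0=1.0):
    """Effective coupling of Eq. (5.10) divided by G."""
    return (1.0 + math.exp(-Mr) / 3.0) / F0


def satisfies_cassini(Mr):
    """True if |gamma - 1| is below the bound (5.14)."""
    return abs(gamma_ppn(Mr) - 1.0) < CASSINI_BOUND


if __name__ == "__main__":
    for Mr in (0.0, 1.0e-3, 1.0, 5.0, 20.0):
        print(Mr, gamma_ppn(Mr), g_eff_over_g(Mr), satisfies_cassini(Mr))
```

For Mr → 0 one recovers γ = 1/2 and an effective coupling of 4G/(3F0), far outside the experimental bound, while γ → 1 only when the Yukawa factor is exponentially suppressed.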

Chameleon mechanism in f(R) gravity

Let us discuss the chameleon mechanism [344, 343] in metric f(R) gravity. Unlike the linear expansion approach given in Section 5.1, this corresponds to a non-linear effect arising from a large departure of the Ricci scalar from its background value R0. The mass of an effective scalar field degree of freedom depends on the density of its environment. If the matter density is sufficiently high, the field acquires a heavy mass about the potential minimum. Meanwhile the field has a lighter mass in a low-density cosmological environment relevant to dark energy so that it can propagate freely. As long as the spherically symmetric body has a thin-shell around its surface, the effective coupling between the field and matter becomes much smaller than the bare coupling ∣Q∣. In the following we shall review the chameleon mechanism for general couplings Q and then proceed to constrain f(R) dark energy models from local gravity tests.

Field profile of the chameleon field

The action (2.1) in f(R) gravity can be transformed to the Einstein frame action (2.32) with the coupling \(Q = - 1/\sqrt 6\) between the scalaron field \(\phi = \sqrt {3/(2{\kappa ^2})}\) ln F and non-relativistic matter. Let us consider a spherically symmetric body with radius \({\tilde r_c}\) in the Einstein frame. We approximate that the background geometry is described by the Minkowski space-time. Varying the action (2.32) with respect to the field ϕ, we obtain

$${{{{\rm{d}}^2}\phi} \over {{\rm{d}}{{\tilde r}^2}}} + {2 \over {\tilde r}}{{{\rm{d}}\phi} \over {{\rm{d}}\tilde r}} - {{{\rm{d}}{V_{{\rm{eff}}}}} \over {{\rm{d}}\phi}} = 0,$$
(5.15)

where \(\tilde r\) is a distance from the center of symmetry that is related to the distance r in the Jordan frame via \(\tilde r = \sqrt F r = {e^{- Q\kappa \phi}}r\). The effective potential Veff is defined by

$${V_{{\rm{eff}}}}(\phi) = V(\phi) + {e^{Q\kappa \phi}}{\rho ^{\ast}},$$
(5.16)

where ρ* is a conserved quantity in the Einstein frame [343]. Recall that the field potential V(ϕ) is given in Eq. (2.33). The energy density \(\tilde \rho\) in the Einstein frame is related with the energy density ρ in the Jordan frame via the relation \(\tilde \rho = \rho/{F^2} = {e^{4Q\kappa \phi}}\rho\). Since the conformal transformation gives rise to a coupling Q between matter and the field, \(\tilde \rho\) is not a conserved quantity. Instead the quantity \({\rho ^{\ast}} = {e^{3Q\kappa \phi}}\rho = {e^{- Q\kappa \phi}}\tilde \rho\) corresponds to a conserved quantity, which satisfies \({\tilde r^3}{\rho ^{\ast}} = {r^3}\rho\). Note that Eq. (5.15) is consistent with Eq. (2.42).

In the following we assume that a spherically symmetric body has a constant density ρ* = ρA inside the body \((\tilde r < {\tilde r_c})\) and that the energy density outside the body \((\tilde r > {\tilde r_c})\) is ρ* = ρB (≪ρA). The mass Mc of the body and the gravitational potential Φc at the radius \({\tilde r_c}\) are given by \({M_c} = (4\pi/3)\tilde r_c^3{\rho _A}\) and \({\Phi _c} = G{M_c}/{\tilde r_c}\), respectively. The effective potential has minima at the field values ϕA and ϕB:

$${V_{,\phi}}({\phi _A}) + \kappa Q{e^{Q\kappa {\phi _A}}}{\rho _A} = 0,$$
(5.17)
$${V_{,\phi}}({\phi _B}) + \kappa Q{e^{Q\kappa {\phi _B}}}{\rho _B} = 0.$$
(5.18)

The former corresponds to the region of high density with a heavy mass squared \(m_A^2 \equiv {V_{{\rm{eff,}}\phi \phi}}({\phi _A})\), whereas the latter to a lower density region with a lighter mass squared \(m_B^2 \equiv {V_{{\rm{eff,}}\phi \phi}}({\phi _B})\). In the case of Sun, for example, the field value ϕB is determined by the homogeneous dark matter/baryon density in our galaxy, i.e., \({\rho _B} \simeq {10^{- 24}}\,{\rm{g/cm^3}}\).

When Q > 0 the effective potential has a minimum for the models with V,ϕ < 0, which occurs, e.g., for the inverse power-law potential \(V(\phi) = {M^{4 + n}}{\phi ^{- n}}\). The f(R) gravity corresponds to a negative coupling \((Q = - 1/\sqrt 6)\), in which case the effective potential has a minimum for V,ϕ > 0. As an example, let us consider the shape of the effective potential for the models (4.83) and (4.84). In the region R ≫ Rc both models behave as

$$f(R) \simeq R - \mu {R_c}\left[ {1 - {{({R_c}/R)}^{2n}}} \right].$$
(5.19)

For this functional form it follows that

$$F = {e^{{2 \over {\sqrt 6}}\kappa \phi}} = 1 - 2n\mu {(R/{R_c})^{- (2n + 1)}},$$
(5.20)
$$V(\phi) = {{\mu {R_c}} \over {2{\kappa ^2}}}{e^{- {4 \over {\sqrt 6}}\kappa \phi}}\left[ {1 - (2n + 1){{\left({{{- \kappa \phi} \over {\sqrt 6 n\mu}}} \right)}^{{{2n} \over {2n + 1}}}}} \right].$$
(5.21)

The r.h.s. of Eq. (5.20) is smaller than 1, so that ϕ < 0. The limit R → ∞ corresponds to ϕ → −0. In the limit ϕ → −0 one has \(V \rightarrow \mu {R_c}/(2{\kappa ^2})\) and V,ϕ → ∞. This property can be seen in the upper panel of Figure 3, which shows the potential V(ϕ) for the model (4.84) with parameters n = 1 and μ = 2. Because of the existence of the coupling term \({e^{- \kappa \phi/\sqrt 6}}{\rho ^{\ast}}\), the effective potential Veff(ϕ) has a minimum at

$$\kappa {\phi _M} = - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho ^{\ast}}}}} \right)^{2n + 1}}.$$
(5.22)

Since \(R \sim {\kappa ^2}{\rho ^{\ast}} \gg {R_c}\) in the region of high density, the condition ∣κϕM∣ ≪ 1 is in fact justified (for n and μ of the order of unity). The field mass mϕ about the minimum of the effective potential is given by

$$m_\phi ^2 = {1 \over {6n(n + 1)\mu}}{R_c}{\left({{{{\kappa ^2}{\rho ^{\ast}}} \over {{R_c}}}} \right)^{2(n + 1)}}.$$
(5.23)

This shows that, in the regime \(R \sim {\kappa ^2}{\rho ^{\ast}} \gg {R_c}\), mϕ is much larger than the present Hubble parameter \({H_0}(\sim \sqrt {{R_c}})\). Cosmologically the field evolves along the instantaneous minima characterized by Eq. (5.22) and then it approaches a de Sitter point which appears as a minimum of the potential in the upper panel of Figure 3.
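The orders of magnitude involved can be illustrated by evaluating Eqs. (5.22) and (5.23) for a given density contrast κ²ρ*/Rc. The sketch below also compares Eq. (5.22) with the value \(\kappa \phi = (\sqrt 6/2)\ln F\) obtained from Eq. (5.20) at R = κ²ρ*, which is the assumption behind Eq. (5.22). The mass formula is coded exactly as printed in Eq. (5.23), and the density contrast of 10⁵ is an illustrative choice.

```python
import math


def kappa_phi_min(density_ratio, n, mu):
    """Eq. (5.22): kappa*phi_M, with density_ratio = kappa^2 rho* / R_c."""
    return -math.sqrt(6.0) * n * mu * density_ratio ** (-(2 * n + 1))


def kappa_phi_from_F(density_ratio, n, mu):
    """(sqrt(6)/2) * ln F from Eq. (5.20), evaluated at R = kappa^2 rho*."""
    return 0.5 * math.sqrt(6.0) * math.log1p(
        -2.0 * n * mu * density_ratio ** (-(2 * n + 1)))


def mass_squared_over_Rc(density_ratio, n, mu):
    """Eq. (5.23) as printed: m_phi^2 / R_c."""
    return density_ratio ** (2 * (n + 1)) / (6.0 * n * (n + 1) * mu)


if __name__ == "__main__":
    ratio, n, mu = 1.0e5, 1, 2.0  # illustrative density contrast and model parameters
    print("kappa*phi_M (5.22)    :", kappa_phi_min(ratio, n, mu))
    print("kappa*phi from (5.20) :", kappa_phi_from_F(ratio, n, mu))
    print("m_phi / sqrt(R_c)     :", math.sqrt(mass_squared_over_Rc(ratio, n, mu)))
```

For these inputs ∣κϕM∣ is of order 10⁻¹⁵ while mϕ exceeds \(\sqrt {{R_c}}\) by about nine orders of magnitude, which is the hierarchy the chameleon mechanism relies on.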

Figure 3

(Top) The potential \(V(\phi) = (FR - f)/(2{\kappa ^2}{F^2})\) versus the field \(\phi = \sqrt {3/(16\pi)}\,{m_{{\rm{pl}}}}\ln F\) for Starobinsky's dark energy model (4.84) with n = 1 and μ = 2. (Bottom) The inverted effective potential −Veff for the same model parameters as the top with \({\rho ^{\ast}} = 10{R_c}m_{{\rm{pl}}}^2\). The field value at which the inverted effective potential has a maximum depends on the density ρ*, see Eq. (5.22). In the upper panel “de Sitter” corresponds to the minimum of the potential, whereas “singular” means that the curvature diverges at ϕ = 0.

In order to solve the “dynamics” of the field ϕ in Eq. (5.15), we need to consider the inverted effective potential (−Veff). See the lower panel of Figure 3 for illustration [which corresponds to the model (4.84)]. We impose the following boundary conditions:

$${{{\rm{d}}\phi} \over {{\rm{d}}\tilde r}}(\tilde r = 0) = 0,$$
(5.24)
$$\phi (\tilde r \rightarrow \infty) = {\phi _B}.$$
(5.25)

The boundary condition (5.25) can be also understood as \({\lim\nolimits_{\tilde r \rightarrow \infty}}{\rm{d}}\phi {\rm{/d}}\tilde r = 0\). The field ϕ is at rest at \(\tilde r = 0\) and starts to roll down the potential when the matter-coupling term \(\kappa Q{\rho _A}{e^{Q\kappa \phi}}\) in Eq. (5.15) becomes important at a radius \({\tilde r_1}\). If the field value at \(\tilde r = 0\) is close to ϕA, the field stays around ϕA in the region \(0 < \tilde r < {\tilde r_1}\). The body has a thin-shell if \({\tilde r_1}\) is close to the radius \({\tilde r_c}\) of the body.

In the region \(0 < \tilde r < {{\tilde r}_1}\) one can approximate the r.h.s. of Eq. (5.15) as \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq m_A^2(\phi - {\phi _A})\) around ϕ = ϕA, where \(m_A^2 = {R_c}{({\kappa ^2}{\rho _A}/{R_c})^{2(n + 1)}}/[6n(n + 1)]\). Hence the solution to Eq. (5.15) is given by \(\phi (\tilde r) = {\phi _A} + A{e^{- {m_A}\tilde r}}/\tilde r + B{e^{{m_A}\tilde r}}/\tilde r\), where A and B are constants. In order to avoid the divergence of ϕ at \(\tilde r = 0\) we demand the condition B = −A, in which case the solution is

$$\phi (\tilde r) = {\phi _A} + {{A({e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}})} \over {\tilde r}}\quad \quad (0 < \tilde r < {\tilde r_1}){.}$$
(5.26)

In fact, this satisfies the boundary condition (5.24).

In the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) the field \(\vert\phi (\tilde r)\vert\) evolves toward larger values with the increase of \({\tilde r}\). In the lower panel of Figure 3 the field stays around the potential maximum for \(0 < \tilde r < {{\tilde r}_1}\), but in the regime \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\) it moves toward the left (largely negative ϕ region). Since \(\vert {V_{,\phi}}\vert \ll \vert \kappa Q{\rho _A}{e^{Q\kappa \phi}}\vert\) in this regime we have that \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq \kappa Q{\rho _A}\) in Eq. (5.15), where we used the condition Qκϕ ≪ 1. Hence we obtain the following solution

$$\phi (\tilde r) = {1 \over 6}\kappa Q{\rho _A}{\tilde r^2} - {C \over {\tilde r}} + D\quad \quad ({\tilde r_1} < \tilde r < {\tilde r_c}),$$
(5.27)

where C and D are constants.

Since the field acquires a sufficient kinetic energy in the region \({{\tilde r}_1} < \tilde r < {{\tilde r}_c}\), the field climbs up the potential hill toward the largely negative ϕ region outside the body \((\tilde r > {{\tilde r}_c})\). The shape of the effective potential changes relative to that inside the body because the density drops from ρA to ρB. The kinetic energy of the field dominates over the potential energy, which means that the term dVeff/dϕ in Eq. (5.15) can be neglected. Recall that one has ∣ϕB∣ ≫ ∣ϕA∣ under the condition ρA ≫ ρB [see Eq. (5.22)]. Taking into account the mass term \(m_B^2 = {R_c}{({\kappa ^2}{\rho _B}/{R_c})^{2(n + 1)}}/[6n(n + 1)]\), we have \({\rm{d}}{V_{{\rm{eff}}}}/{\rm{d}}\phi \simeq m_B^2(\phi - {\phi _B})\) on the r.h.s. of Eq. (5.15). Hence we obtain the solution \(\phi (\tilde r) = {\phi _B} + E{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}/\tilde r + F{e^{{m_B}(\tilde r - {{\tilde r}_c})}}/\tilde r\) with constants E and F. Using the boundary condition (5.25), it follows that F = 0 and hence

$$\phi (\tilde r) = {\phi _B} + E{{{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}}\quad (\tilde r > {\tilde r_c}){.}$$
(5.28)

Three solutions (5.26), (5.27) and (5.28) should be matched at \(\tilde r = {{\tilde r}_1}\) and \(\tilde r = {{\tilde r}_c}\) by imposing continuous conditions for ϕ and \({\rm{d}}\phi {\rm{/d}}\tilde r\). The coefficients A, C, D and E are determined accordingly [575]:

$$C = {{{s_1}{s_2}[({\phi _B} - {\phi _A}) + (\tilde r_1^2 - \tilde r_c^2)\kappa Q{\rho _A}/6] + [{s_2}\tilde r_1^2({e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}) - {s_1}\tilde r_c^2]\kappa Q{\rho _A}/3} \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}}){s_2} - {m_B}{s_1}}},$$
(5.29)
$$A = - {1 \over {{s_1}}}(C + \kappa Q{\rho _A}\tilde r_1^3/3),$$
(5.30)
$$E = - {1 \over {{s_2}}}(C + \kappa Q{\rho _A}\tilde r_c^3/3),$$
(5.31)
$$D = {\phi _B} - {1 \over 6}\kappa Q{\rho _A}\tilde r_c^2 + {1 \over {{{\tilde r}_c}}}(C + E),$$
(5.32)

where

$${s_1} \equiv {m_A}{\tilde r_1}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}}) + {e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}},$$
(5.33)
$${s_2} \equiv 1 + {m_B}{\tilde r_c}.$$
(5.34)
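
As a numerical cross-check of the matching, the sketch below evaluates the coefficients (5.29)–(5.34) for arbitrary illustrative parameter values and confirms that ϕ and its derivative are continuous at both radii. The piecewise forms used for the interior and shell solutions are assumptions inferred from Eqs. (5.28), (5.35) and (5.36); the function name `matching_residuals`, the shorthand `kQrho` for κQρA, and the sample numbers are hypothetical.

```python
import math

def matching_residuals(phi_A, phi_B, m_A, m_B, r1, rc, kQrho):
    """Mismatch of phi and dphi/dr at r1 and rc; kQrho stands for kappa * Q * rho_A."""
    em, ep = math.exp(-m_A * r1), math.exp(m_A * r1)
    s1 = m_A * r1 * (em + ep) + em - ep                      # Eq. (5.33)
    s2 = 1.0 + m_B * rc                                      # Eq. (5.34)
    C = ((s1 * s2 * ((phi_B - phi_A) + (r1**2 - rc**2) * kQrho / 6.0)
          + (s2 * r1**2 * (em - ep) - s1 * rc**2) * kQrho / 3.0)
         / (m_A * (em + ep) * s2 - m_B * s1))                # Eq. (5.29)
    A = -(C + kQrho * r1**3 / 3.0) / s1                      # Eq. (5.30)
    E = -(C + kQrho * rc**3 / 3.0) / s2                      # Eq. (5.31)
    D = phi_B - kQrho * rc**2 / 6.0 + (C + E) / rc           # Eq. (5.32)

    # Assumed piecewise solutions, each returning (phi, dphi/dr).
    def inner(r):   # 0 < r < r1
        e_m, e_p = math.exp(-m_A * r), math.exp(m_A * r)
        phi = phi_A + A * (e_m - e_p) / r
        dphi = A * (-m_A * (e_m + e_p) / r - (e_m - e_p) / r**2)
        return phi, dphi

    def shell(r):   # r1 < r < rc
        return kQrho * r**2 / 6.0 - C / r + D, kQrho * r / 3.0 + C / r**2

    def outer(r):   # r > rc, Eq. (5.28)
        decay = math.exp(-m_B * (r - rc))
        return phi_B + E * decay / r, -E * decay * (m_B / r + 1.0 / r**2)

    (p1, d1), (p2, d2) = inner(r1), shell(r1)
    (p3, d3), (p4, d4) = shell(rc), outer(rc)
    return p1 - p2, d1 - d2, p3 - p4, d3 - d4

# Illustrative values only (units with rc = 1).
residuals = matching_residuals(phi_A=-0.1, phi_B=-1.0, m_A=3.0, m_B=0.2,
                               r1=0.7, rc=1.0, kQrho=0.5)
print(residuals)  # all four entries should vanish up to rounding error
```

The check does not depend on the particular numbers chosen, since the four coefficients are fixed by exactly these four continuity conditions.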

If the mass mB outside the body is small enough to satisfy the condition \({m_B}{{\tilde r}_c} \ll 1\) and mA ≫ mB, we can neglect the contribution of the mB-dependent terms in Eqs. (5.29)–(5.32). Then the field profile is given by [575]

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _A} - {1 \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}\left[ {{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right]{{{e^{- {m_A}\tilde r}} - {e^{{m_A}\tilde r}}} \over {\tilde r}},}\\ {(0 < \tilde r < {{\tilde r}_1}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.35)
$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} + {1 \over 6}\kappa Q{\rho _A}({{\tilde r}^2} - 3\tilde r_c^2) + {{\kappa Q{\rho _A}\tilde r_1^3} \over {3\tilde r}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {- \left[ {1 + {{{e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}} \over {{m_A}{{\tilde r}_1}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}} \right]\left[ {{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right]{{{{\tilde r}_1}} \over {\tilde r}},}\\ {({{\tilde r}_1} < \tilde r < {{\tilde r}_c}),\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.36)
$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} - \left[ {{{\tilde r}_1}({\phi _B} - {\phi _A}) + {1 \over 6}\kappa Q{\rho _A}\tilde r_c^3\left({2 + {{{{\tilde r}_1}} \over {{{\tilde r}_c}}}} \right){{\left({1 - {{{{\tilde r}_1}} \over {{{\tilde r}_c}}}} \right)}^2}} \right.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {+ \left. {{{{e^{- {m_A}{{\tilde r}_1}}} - {e^{{m_A}{{\tilde r}_1}}}} \over {{m_A}({e^{- {m_A}{{\tilde r}_1}}} + {e^{{m_A}{{\tilde r}_1}}})}}\left\{{{\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2)} \right\}} \right]{{{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}},\quad}\\ {(\tilde r > {{\tilde r}_c}){.}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.37)

Originally a similar field profile was derived in [344, 343] by assuming that the field is frozen at ϕ = ϕA in the region \(0 < \tilde r < {{\tilde r}_1}\).

The radius r1 is determined by the following condition

$$m_A^2[\phi ({\tilde r_1}) - {\phi _A}] = \kappa Q{\rho _A}.$$
(5.38)

This translates into

$${\phi _B} - {\phi _A} + {1 \over 2}\kappa Q{\rho _A}(\tilde r_1^2 - \tilde r_c^2) = {{6Q{\Phi _c}} \over {\kappa {{({m_A}{{\tilde r}_c})}^2}}}{{{m_A}{{\tilde r}_1}({e^{{m_A}{{\tilde r}_1}}} + {e^{- {m_A}{{\tilde r}_1}}})} \over {{e^{{m_A}{{\tilde r}_1}}} - {e^{- {m_A}{{\tilde r}_1}}}}},$$
(5.39)

where \({\Phi _c} = {\kappa ^2}{M_c}/(8\pi {{\tilde r}_c}) = {\kappa ^2}{\rho _A}\tilde r_c^2/6\) is the gravitational potential at the surface of the body. Using this relation, the field profile (5.37) outside the body reduces to

$$\begin{array}{*{20}c} {\phi (\tilde r) = {\phi _B} - {{2Q{\Phi _c}} \over \kappa}\left[ {1 - {{\tilde r_1^3} \over {\tilde r_c^3}} + 3{{{{\tilde r}_1}} \over {{{\tilde r}_c}}}{1 \over {{{({m_A}{{\tilde r}_c})}^2}}}\left\{{{{{m_A}{{\tilde r}_1}({e^{{m_A}{{\tilde r}_1}}} + {e^{- {m_A}{{\tilde r}_1}}})} \over {{e^{{m_A}{{\tilde r}_1}}} - {e^{- {m_A}{{\tilde r}_1}}}}} - 1} \right\}} \right]{{{{\tilde r}_c}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}} \over {\tilde r}},}\\ {(\tilde r > {{\tilde r}_c}){.}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ \end{array}$$
(5.40)

If the field value at \(\tilde r = 0\) is away from ϕA, the field rolls down the potential for \(\tilde r > 0\). This corresponds to taking the limit \({{\tilde r}_1} \to 0\) in Eq. (5.40), in which case the field profile outside the body is given by

$$\phi (\tilde r) = {\phi _B} - {{2Q} \over \kappa}{{G{M_c}} \over {\tilde r}}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}}.$$
(5.41)

This shows that the effective coupling is of the order of Q and hence for \(\vert Q\vert = {\mathcal O}(1)\) local gravity constraints are not satisfied.

Thin-shell solutions

Let us consider the case in which \({{\tilde r}_1}\) is close to \({{\tilde r}_c}\), i.e.

$$\Delta {\tilde r_c} \equiv {\tilde r_c} - {\tilde r_1} \ll {\tilde r_c}.$$
(5.42)

This corresponds to the thin-shell regime in which the field is stuck inside the star except around its surface. If the field is sufficiently massive inside the star to satisfy the condition \({m_A}{{\tilde r}_c} \gg 1\), Eq. (5.39) gives the following relation

$${\epsilon _{{\rm{th}}}} \equiv {{\kappa ({\phi _B} - {\phi _A})} \over {6Q{\Phi _c}}} \simeq {{\Delta {{\tilde r}_c}} \over {{{\tilde r}_c}}} + {1 \over {{m_A}{{\tilde r}_c}}},$$
(5.43)

where ϵth is called the thin-shell parameter [344, 343]. Neglecting second-order terms with respect to \(\Delta {{\tilde r}_c}/{{\tilde r}_c}\) and \(1/({m_A}{{\tilde r}_c})\) in Eq. (5.40), it follows that

$$\phi (\tilde r) \simeq {\phi _B} - {{2{Q_{{\rm{eff}}}}} \over \kappa}{{G{M_c}} \over {\tilde r}}{e^{- {m_B}(\tilde r - {{\tilde r}_c})}},$$
(5.44)

where Qeff is the effective coupling given by

$${Q_{{\rm{eff}}}} = 3Q{\epsilon _{{\rm{th}}}}.$$
(5.45)

Since ϵth ≪ 1 under the conditions \(\Delta {{\tilde r}_c}/{{\tilde r}_c} \ll 1\) and \(1/({m_A}{{\tilde r}_c}) \ll 1\), the amplitude of the effective coupling Qeff becomes much smaller than 1. In the original papers of Khoury and Weltman [344, 343] the thin-shell solution was derived by assuming that the field is frozen with the value ϕ = ϕA in the region \(0 < \tilde r < {{\tilde r}_1}\). In this case the thin-shell parameter is given by \({\epsilon _{{\rm{th}}}} \simeq \Delta {{\tilde r}_c}/{{\tilde r}_c}\), which is different from Eq. (5.43). However, this difference is not important because the condition \(\Delta {{\tilde r}_c}/{{\tilde r}_c} \gg 1/({m_A}{{\tilde r}_c})\) is satisfied for most viable models [575].
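
The approach to the thin-shell limit can be illustrated numerically. The sketch below evaluates the square bracket of Eq. (5.40), which plays the role of Qeff/Q, and compares it with 3ϵth, where ϵth is computed from Eq. (5.39) before any expansion. The function names and the sample values of \({{\tilde r}_1}/{{\tilde r}_c}\) and \({m_A}{{\tilde r}_c}\) are hypothetical choices made for illustration.

```python
import math

def outside_bracket(x1, mA_rc):
    """Square bracket of Eq. (5.40); x1 = r1/rc and mA_rc = m_A * rc."""
    y = mA_rc * x1                                   # m_A * r1
    coth = 1.0 / math.tanh(y)
    return 1.0 - x1**3 + 3.0 * x1 / mA_rc**2 * (y * coth - 1.0)

def thin_shell_parameter(x1, mA_rc):
    """kappa (phi_B - phi_A) / (6 Q Phi_c) obtained from Eq. (5.39), unexpanded."""
    y = mA_rc * x1
    coth = 1.0 / math.tanh(y)
    # Uses kappa * Q * rho_A * rc^2 = 6 Q Phi_c / kappa.
    return x1 * coth / mA_rc + 0.5 * (1.0 - x1**2)

# Thin shell: r1 close to rc and a heavy field inside the body.
x1, mA_rc = 0.999, 1.0e4
ratio = outside_bracket(x1, mA_rc) / (3.0 * thin_shell_parameter(x1, mA_rc))
print(ratio)                          # close to 1, as implied by Eq. (5.45)

# Thick shell: r1 -> 0 recovers an unsuppressed coupling, cf. Eq. (5.41).
print(outside_bracket(1.0e-6, 100.0))  # close to 1
```

The residual deviation of the first ratio from unity is of second order in \(\Delta {{\tilde r}_c}/{{\tilde r}_c}\) and \(1/({m_A}{{\tilde r}_c})\), consistent with the terms neglected in deriving Eq. (5.44).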

Post-Newtonian parameter

We derive the bound on the thin-shell parameter from experimental tests of the post-Newtonian parameter in the solar system. The spherically symmetric metric in the Einstein frame is described by [251]

$${\rm{d}}{\tilde s^2} = {\tilde g_{\mu \nu}}{\rm{d}}{\tilde x^\mu}{\rm{d}}{\tilde x^\nu} = - [1 - 2\tilde {\mathcal A}(\tilde r)]{\rm{d}}{t^2} + [1 + 2\tilde {\mathcal B}(\tilde r)]{\rm{d}}{\tilde r^2} + {\tilde r^2}{\rm{d}}{\Omega ^2},$$
(5.46)

where \(\tilde {\mathcal A}(\tilde r)\) and \(\tilde {\mathcal B}(\tilde r)\) are functions of \({\tilde r}\) and \({\rm{d}}{\Omega ^2} = {\rm{d}}{\theta ^2} + ({\sin ^2}\theta)\,{\rm{d}}{\phi ^2}\). In the weak gravitational background \((\tilde {\mathcal A}(\tilde r) \ll 1\) and \(\tilde {\mathcal B}(\tilde r) \ll 1)\) the metric outside the spherically symmetric body with mass Mc is given by \(\tilde {\mathcal A}(\tilde r) \simeq \tilde {\mathcal B}(\tilde r) \simeq G{M_c}/\tilde r\).

Let us transform the metric (5.46) back to that in the Jordan frame under the inverse of the conformal transformation, \({g_{\mu \nu}} = {e^{2Q\kappa \phi}}{{\tilde g}_{\mu \nu}}\). Then the metric in the Jordan frame, \({\rm{d}}{s^2} = {e^{2Q\kappa \phi}}{\rm{d}}{{\tilde s}^2} = {g_{\mu \nu}}{\rm{d}}{x^\mu}{\rm{d}}{x^\nu}\), is given by

$${\rm{d}}{s^2} = - [1 - 2{\mathcal A}(r)]{\rm{d}}{t^2} + [1 + 2{\mathcal B}(r)]{\rm{d}}{r^2} + {r^2}{\rm{d}}{\Omega ^2}.$$
(5.47)

Under the condition ∣Qκϕ∣ ≪ 1 we obtain the following relations

$$\tilde r = {e^{- Q\kappa \phi}}r,\quad {\mathcal A}(r) \simeq \tilde {\mathcal A}(\tilde r) - Q\kappa \phi (\tilde r),\quad {\mathcal B}(r) \simeq \tilde {\mathcal B}(\tilde r) - Q\kappa \tilde r{{{\rm{d}}\phi (\tilde r)} \over {{\rm{d}}\tilde r}}.$$
(5.48)

In the following we use the approximation \(r \simeq \tilde r\), which is valid for ∣Qκϕ∣ ≪ 1. Using the thin-shell solution (5.44), it follows that

$${\mathcal A}(r) = {{G{M_c}} \over r}[1 + 6{Q^2}{\epsilon _{{\rm{th}}}}(1 - r/{r_c})],\quad {\mathcal B}(r) = {{G{M_c}} \over r}(1 - 6{Q^2}{\epsilon _{{\rm{th}}}}),$$
(5.49)

where we have used the approximation ∣ϕB∣ ≫ ∣ϕA∣ and hence ϕB ≃ 6QΦcϵth/κ.

The term QκϕB in Eq. (5.48) is smaller than the Newtonian term \(G{M_c}/r\) under the condition \(r/{r_c} < {(6{Q^2}{\epsilon _{{\rm{th}}}})^{- 1}}\). Provided that the field ϕ reaches the value ϕB at a distance rB satisfying the condition \({r_B}/{r_c} < {(6{Q^2}{\epsilon _{{\rm{th}}}})^{- 1}}\), the metric function \({\mathcal A}(r)\) does not change its sign for r < rB. The post-Newtonian parameter γ is given by

$$\gamma \equiv {{{\mathcal B}(r)} \over {{\mathcal A}(r)}} \simeq {{1 - 6{Q^2}{\epsilon_{{\rm{th}}}}} \over {1 + 6{Q^2}{\epsilon_{{\rm{th}}}}(1 - r/{r_c})}}.$$
(5.50)

The experimental bound (5.14) can be satisfied as long as the thin-shell parameter ϵth is much smaller than 1. If we take the distance r = rc, the constraint (5.14) translates into

$${\epsilon_{{\rm{th}}, \odot}} < 3.8 \times {10^{- 6}}/{Q^2},$$
(5.51)

where ϵth,⊙ is the thin-shell parameter for the Sun. In f(R) gravity \((Q = - 1/\sqrt 6)\) this corresponds to \({\epsilon _{{\rm{th}}, \odot}} < 2.3 \times {10^{- 5}}\).
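
The arithmetic behind Eq. (5.51) can be reproduced in a few lines. The sketch below assumes that the bound (5.14) has the form ∣γ − 1∣ < 2.3 × 10⁻⁵, which is the value implied by the numbers quoted above; the constant `GAMMA_BOUND` and the function names are hypothetical.

```python
import math

GAMMA_BOUND = 2.3e-5   # assumed form of bound (5.14): |gamma - 1| < 2.3e-5

def gamma_ppn(Q, eps_th, r_over_rc=1.0):
    """Post-Newtonian parameter of Eq. (5.50)."""
    x = 6.0 * Q**2 * eps_th
    return (1.0 - x) / (1.0 + x * (1.0 - r_over_rc))

def eps_th_max(Q, bound=GAMMA_BOUND):
    """Largest eps_th allowed at r = r_c, where |gamma - 1| = 6 Q^2 eps_th."""
    return bound / (6.0 * Q**2)

Q_fR = -1.0 / math.sqrt(6.0)   # coupling of metric f(R) gravity
print(eps_th_max(1.0))         # coefficient of 1/Q^2 in Eq. (5.51)
print(eps_th_max(Q_fR))        # bound for f(R) gravity
```

Evaluating `gamma_ppn` at larger `r_over_rc` shows how the deviation from γ = 1 changes with distance for a fixed ϵth.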

Experimental bounds from the violation of equivalence principle

Let us next discuss constraints on the thin-shell parameter from the possible violation of the equivalence principle (EP). The tightest bound comes from the solar system tests of the weak EP using the free-fall accelerations of the Earth \(({a_ \oplus})\) and the Moon \(({a_{{\rm{Moon}}}})\) toward the Sun [343]. The experimental bound on the difference of the two accelerations is given by [616, 83, 617]

$${{\vert {a_ \oplus} - {a_{{\rm{Moon}}}}\vert} \over {\vert {a_ \oplus} + {a_{{\rm{Moon}}}}\vert/2}} < {10^{- 13}}.$$
(5.52)

Provided that the Earth, Sun, and Moon have thin shells, the field profiles outside the bodies are given by Eq. (5.44) with the replacement of corresponding quantities. The presence of the field ϕ(r) with an effective coupling Qeff gives rise to an extra acceleration, \({a^{{\rm{fifth}}}} = \vert {Q_{{\rm{eff}}}}\nabla \phi (r)\vert\). Then the accelerations \({a_ \oplus}\) and \({a_{{\rm{Moon}}}}\) toward the Sun (mass \({M_ \odot}\)) are [343]

$${a_\oplus} \simeq {{G{M_ \odot}} \over {{r^2}}}\left[ {1 + 18{Q^2}\epsilon _{{\rm{th}}, \oplus}^2{{{\Phi _ \oplus}} \over {{\Phi _ \odot}}}} \right],$$
(5.53)
$${a_{{\rm{Moon}}}} \simeq {{G{M_ \odot}} \over {{r^2}}}\left[ {1 + 18{Q^2}\epsilon _{{\rm{th}}, \oplus}^2{{\Phi _ \oplus ^2} \over {{\Phi _ \odot}{\Phi _{{\rm{Moon}}}}}}} \right],$$
(5.54)

where ϵth,⊕ is the thin-shell parameter of the Earth, and \({\Phi _ \odot} \simeq 2.1 \times {10^{- 6}}\), \({\Phi _ \oplus} \simeq 7.0 \times {10^{- 10}}\), \({\Phi _{{\rm{Moon}}}} \simeq 3.1 \times {10^{- 11}}\) are the gravitational potentials of the Sun, Earth and Moon, respectively. Hence the condition (5.52) translates into [134, 596]

$${\epsilon_{{\rm{th}}, \oplus}} < 8.8 \times {10^{- 7}}/\vert Q\vert,$$
(5.55)

which corresponds to \({\epsilon _{{\rm{th}}, \oplus}} < 2.2 \times {10^{- 6}}\) in f(R) gravity. This provides a tighter constraint on model parameters than (5.51).

Since the condition ∣ϕB∣ ≫ ∣ϕA∣ is satisfied for ρA ≫ ρB, one has \({\epsilon _{{\rm{th}}, \oplus}} \simeq \kappa {\phi _B}/(6Q{\Phi _ \oplus})\) from Eq. (5.43). Then the bound (5.55) translates into

$$\vert \kappa {\phi _{B, \oplus}}\vert < 3.7 \times {10^{- 15}}.$$
(5.56)
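
The numbers in Eqs. (5.55) and (5.56) follow from Eqs. (5.52)–(5.54) by simple arithmetic, sketched below. The code takes the experimental bound on the fractional acceleration difference to be 10⁻¹³, which is the value that reproduces the quoted coefficient 8.8 × 10⁻⁷; this value, the constant names, and the function names are assumptions of the sketch.

```python
import math

# Gravitational potentials quoted in the text.
PHI_SUN, PHI_EARTH, PHI_MOON = 2.1e-6, 7.0e-10, 3.1e-11
ETA_MAX = 1.0e-13   # assumed bound on |a_Earth - a_Moon| / mean acceleration

def eps_th_earth_max(Q, eta_max=ETA_MAX):
    """Upper bound on the Earth's thin-shell parameter, cf. Eq. (5.55)."""
    # Difference of the corrections in Eqs. (5.53) and (5.54), per unit Q^2 eps^2:
    #   18 (Phi_Earth / Phi_Sun) (Phi_Earth / Phi_Moon - 1)
    geom = 18.0 * (PHI_EARTH / PHI_SUN) * (PHI_EARTH / PHI_MOON - 1.0)
    return math.sqrt(eta_max / geom) / abs(Q)

def kappa_phi_B_max(Q, eta_max=ETA_MAX):
    """Upper bound on |kappa phi_B| from eps_th ~ kappa phi_B / (6 Q Phi_Earth)."""
    return 6.0 * abs(Q) * eps_th_earth_max(Q, eta_max) * PHI_EARTH

Q_fR = -1.0 / math.sqrt(6.0)
print(eps_th_earth_max(1.0))    # coefficient of 1/|Q| in Eq. (5.55)
print(eps_th_earth_max(Q_fR))   # f(R) gravity
print(kappa_phi_B_max(Q_fR))    # independent of Q, cf. Eq. (5.56)
```

Because the product ∣Q∣ϵth,⊕ is what is constrained, the resulting bound on ∣κϕB∣ does not depend on the value of Q.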

Constraints on model parameters in f(R) gravity

We place constraints on the f(R) models given in Eqs. (4.83) and (4.84) by using the experimental bounds discussed above. In the region of high density where R is much larger than Rc, one can use the asymptotic form (5.19) to discuss local gravity constraints. Inside and outside the spherically symmetric body the effective potential Veff for the model (5.19) has two minima at

$$\kappa {\phi _A} \simeq - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho _A}}}} \right)^{2n + 1}},\quad \;\;\kappa {\phi _B} \simeq - \sqrt 6 n\mu {\left({{{{R_c}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}}.$$
(5.57)

The bound (5.56) translates into

$${{n\mu} \over {x_d^{2n + 1}}}{\left({{{{R_1}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}} < 1.5 \times {10^{- 15}},$$
(5.58)

where \({x_d} \equiv {R_1}/{R_c}\) and R1 is the Ricci scalar at the late-time de Sitter point. In the following we consider the case in which the Lagrangian density is given by (5.19) for R ≥ R1. If we use the models (4.83) and (4.84), then there are some modifications for the estimation of R1. However, this change should be insignificant when we place constraints on model parameters.

At the de Sitter point the model (5.19) satisfies the condition \(\mu = x_d^{2n + 1}/[2(x_d^{2n} - n - 1)]\). Substituting this relation into Eq. (5.58), we find

$${n \over {2(x_d^{2n} - n - 1)}}{\left({{{{R_1}} \over {{\kappa ^2}{\rho _B}}}} \right)^{2n + 1}} < 1.5 \times {10^{- 15}}.$$
(5.59)

For the stability of the de Sitter point we require that m(R1) < 1, which translates into the condition \(x_d^{2n} > 2{n^2} + 3n + 1\). Hence the term \(n/[2(x_d^{2n} - n - 1)]\) in Eq. (5.59) is smaller than 0.25 for n > 0.

We use the approximation that R1 and ρB are of the orders of the present cosmological density \({10^{- 29}}\;{\rm{g/c}}{{\rm{m}}^3}\) and the baryonic/dark matter density \({10^{- 24}}\;{\rm{g/c}}{{\rm{m}}^3}\) in our galaxy, respectively. From Eq. (5.59) we obtain the bound [134]

$$n > 0{.}9{.}$$
(5.60)

Under this condition one can see an appreciable deviation from the ΛCDM model cosmologically as R decreases to the order of Rc.
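
The bound (5.60) can be checked with a short calculation. The sketch below inserts the density ratio 10⁻²⁹/10⁻²⁴ = 10⁻⁵ into Eq. (5.59) and uses the upper limit 0.25 of the prefactor quoted above; the function names are hypothetical.

```python
import math

def n_lower_bound(density_ratio=1e-5, bound=1.5e-15, prefactor=0.25):
    """Smallest n with prefactor * density_ratio**(2n + 1) < bound, cf. Eq. (5.59)."""
    # Both logarithms are negative, so their ratio is the required value of 2n + 1.
    exponent = math.log(bound / prefactor) / math.log(density_ratio)
    return 0.5 * (exponent - 1.0)

def prefactor_max(n):
    """Limit of n / [2 (x_d^{2n} - n - 1)] as x_d^{2n} approaches 2n^2 + 3n + 1."""
    return n / (2.0 * ((2.0 * n**2 + 3.0 * n + 1.0) - n - 1.0))   # = 1 / [4 (n + 1)]

print(n_lower_bound())      # roughly 0.9
print(prefactor_max(1.0))   # 0.125, below the quoted ceiling of 0.25
```

Since the density ratio enters with the power 2n + 1, the result is only weakly sensitive to the precise value adopted for the prefactor.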

If we consider the model (4.81), it was shown in [134] that the bound (5.56) gives the constraint \(n < 3 \times {10^{- 10}}\). This means that the deviation from the ΛCDM model is very small. Meanwhile, for the models (4.83) and (4.84), the deviation from the ΛCDM model can be large even for \(n = {\mathcal O}(1)\), while satisfying local gravity constraints. We note that the model (4.89) is also consistent with local gravity constraints.

Cosmological Perturbations

The f(R) theories have one extra scalar degree of freedom compared to the ΛCDM model. This feature results in more freedom for the background. As we have seen previously, a viable cosmological sequence of radiation, matter, and accelerated epochs is possible provided some conditions hold for f(R). In principle, however, one can specify any given H = H(a) and solve Eqs. (2.15) and (2.16) for those f(R(a)) compatible with the given H(a).

Therefore the background cosmological evolution is not in general enough to distinguish f(R) theories from other theories. Even worse, for the same H(a), there may be several different forms of f(R) which fulfill the Friedmann equations. Hence other observables are needed in order to distinguish between different theories. In order to achieve this goal, perturbation theory turns out to be of fundamental importance. More than this, perturbation theory in cosmology has become as important as in particle physics, since it gives deep insight into these theories by providing information regarding the number of independent degrees of freedom, their speed of propagation, and their time-evolution: all observables to be confronted with different data sets.

The main result of the perturbation analysis in f(R) gravity can be understood in the following way. Since it is possible to express this theory into a form of scalar-tensor theory, this should correspond to having a scalar-field degree of freedom which propagates with the speed of light. Therefore no extra vector or tensor modes come from the f(R) gravitational sector. Introducing matter fields will in general increase the number of degrees of freedom, e.g., a perfect fluid will only add another propagating scalar mode and a vector mode as well. In this section we shall provide perturbation equations for the general Lagrangian density f(R, ϕ) including metric f(R) gravity as a special case.

Perturbation equations

We start with a general perturbed metric about the flat FLRW background [57, 352, 231, 232, 437]

$${\rm{d}}{s^2} = - (1 + 2\alpha)\,{\rm{d}}{t^2} - 2a(t)\,({\partial _i}\beta - {S_i}){\rm{d}}t\,{\rm{d}}{x^i} + {a^2}(t)({\delta _{ij}} + 2\psi {\delta _{ij}} + 2{\partial _i}{\partial _j}\gamma + 2{\partial _j}{F_i} + {h_{ij}})\,{\rm{d}}{x^i}\,{\rm{d}}{x^j},$$
(6.1)

where α, β, ψ, and γ are scalar perturbations, \({S_i}\) and \({F_i}\) are vector perturbations, and \({h_{ij}}\) is the tensor perturbation. In this review we focus on scalar and tensor perturbations, because vector perturbations are generally unimportant in cosmology [71].

For generality we consider the following action

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \,\left[ {{1 \over {2{\kappa ^2}}}f(R,\phi) - {1 \over 2}\omega (\phi){g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M})\,,$$
(6.2)

where f(R, ϕ) is a function of the Ricci scalar R and the scalar field ϕ, ω(ϕ) and V(ϕ) are functions of ϕ, and SM is a matter action. We do not take into account an explicit coupling between the field ϕ and matter. The action (6.2) covers not only f(R) gravity but also other modified gravity theories such as Brans-Dicke theory, scalar-tensor theories, and dilaton gravity. We define the quantity F(R, ϕ) ≡ ∂f/∂R. Varying the action (6.2) with respect to \({g_{\mu \nu}}\) and ϕ, we obtain the following field equations

$$\begin{array}{*{20}c} {F{R_{\mu \nu}} - {1 \over 2}f{g_{\mu \nu}} - {\nabla _\mu}{\nabla _\nu}F + {g_{\mu \nu}}\square F\quad \quad \quad \quad \quad \quad \;}\\ {= {\kappa ^2}\left[ {\omega \left({{\nabla _\mu}\phi {\nabla _\nu}\phi - {1 \over 2}{g_{\mu \nu}}{\nabla ^\lambda}\phi {\nabla _\lambda}\phi} \right) - V{g_{\mu \nu}} + T_{\mu \nu}^{(M)}} \right],}\\ \end{array}$$
(6.3)
$$\square\phi + {1 \over {2\omega}}\left({{\omega _{,\phi}}{\nabla ^\lambda}\phi {\nabla _\lambda}\phi - 2{V_{,\phi}} + {{{f_{,\phi}}} \over {{\kappa ^2}}}} \right) = 0\,,$$
(6.4)

where \(T_{\mu \nu}^{(M)}\) is the energy-momentum tensor of matter.

We decompose ϕ and F into homogeneous and perturbed parts, \(\phi = \bar \phi + \delta \phi\) and \(F = \bar F + \delta F\), respectively. In the following we omit the bar for simplicity. The energy-momentum tensor of an ideal fluid with perturbations is

$$T_0^0 = - ({\rho _M} + \delta {\rho _M})\,,\quad T_i^0 = - ({\rho _M} + {P_M}){\partial _i}v\,,\quad T_j^i = ({P_M} + \delta {P_M})\delta _j^i\,,$$
(6.5)

where v characterizes the velocity potential of the fluid. The conservation of the energy-momentum tensor \(({\nabla ^\mu}{T_{\mu \nu}} = 0)\) holds for the theories with the action (6.2) [357].

For the action (6.2) the background equations (without metric perturbations) are given by

$$3F{H^2} = {1 \over 2}(RF - f) - 3H\dot F + {\kappa ^2}\left[ {{1 \over 2}\omega {{\dot \phi}^2} + V(\phi) + {\rho _M}} \right]\,,$$
(6.6)
$$- 2F\dot H = \ddot F - H\dot F + {\kappa ^2}\omega {\dot \phi ^2} + {\kappa ^2}({\rho _M} + {P_M})\,,$$
(6.7)
$$\ddot \phi + 3H\dot \phi + {1 \over {2\omega}}\left({{\omega _{,\phi}}{{\dot \phi}^2} + 2{V_{,\phi}} - {{{f_{,\phi}}} \over {{\kappa ^2}}}} \right) = 0\,,$$
(6.8)
$${\dot \rho _M} + 3H({\rho _M} + {P_M}) = 0\,,$$
(6.9)

where R is given in Eq. (2.13).

For later convenience, we define the following perturbed quantities

$$\chi \equiv a(\beta + a\dot \gamma)\,,\qquad A \equiv 3(H\alpha - \dot \psi) - {\Delta \over {{a^2}}}\chi \,.$$
(6.10)

Perturbing Einstein equations at linear order, we obtain the following equations [316, 317] (see also [436, 566, 355, 438, 312, 313, 314, 492, 138, 33, 441, 328])

$$\begin{array}{*{20}c} {{\Delta \over {{a^2}}}\psi + HA = - {1 \over {2F}}\left[ {\left({3{H^2} + 3\dot H + {\Delta \over {{a^2}}}} \right)\delta F - 3H\dot{\delta F} + {1 \over 2}\left({{\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} + 2{\kappa ^2}{V_{,\phi}} - {f_{,\phi}}} \right)\delta \phi} \right.\;\;\;\;\;\;\;\;}\\ {\left. {+ {\kappa ^2}\omega \dot \phi \dot \delta \phi + (3H\dot F - {\kappa ^2}\omega {{\dot \phi}^2})\alpha + \dot FA + {\kappa ^2}\delta {\rho _M}} \right]\,,}\\ \end{array}$$
(6.11)
$$H\alpha - \dot \psi = {1 \over {2F}}\left[ {{\kappa ^2}\omega \dot \phi \delta \phi + \dot{\delta F}- H\delta F - \dot F\alpha + {\kappa ^2}({\rho _M} + {P_M})v} \right]\,,$$
(6.12)
$$\dot \chi + H\chi - \alpha - \psi = {1 \over F}(\delta F - \dot F\chi)\,,$$
(6.13)
$$\begin{array}{*{20}c} {\dot A + 2HA + \left({3\dot H + {\Delta \over {{a^2}}}} \right)\alpha = {1 \over {2F}}\left[ {3\ddot{\delta F} + 3H\dot{\delta F} - \left({6{H^2} + {\Delta \over {{a^2}}}} \right)\delta F + 4{\kappa ^2}\omega \dot \phi \dot{\delta \phi}} \right.\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad}\\ {+ (2{\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} - 2{\kappa ^2}{V_{,\phi}} + {f_{,\phi}})\delta \phi - 3\dot F\dot \alpha - \dot FA\quad \quad}\\ {\left. {- (4{\kappa ^2}\omega {{\dot \phi}^2} + 3H\dot F + 6\ddot F)\alpha + {\kappa ^2}(\delta {\rho _M} + 3\delta {P_M})} \right],}\\ \end{array}$$
(6.14)
$$\begin{array}{*{20}c} {\ddot{\delta F}+ 3H\dot{\delta F} - \left({{\Delta \over {{a^2}}} + {R \over 3}} \right)\delta F + {2 \over 3}{\kappa ^2}\omega \dot \phi \dot{\delta \phi} + {1 \over 3}({\kappa ^2}{\omega _{,\phi}}{{\dot \phi}^2} - 4{\kappa ^2}{V_{,\phi}} + 2{f_{,\phi}})\delta \phi}\\ {= {1 \over 3}{\kappa ^2}(\delta {\rho _M} - 3\delta {P_M}) + \dot F(A + \dot \alpha) + \left({2\ddot F + 3H\dot F + {2 \over 3}{\kappa ^2}\omega {{\dot \phi}^2}} \right)\alpha - {1 \over 3}F\delta R\,,}\\ \end{array}$$
(6.15)
$$\begin{array}{*{20}c} {\delta \ddot \phi + \left({3H + {{{\omega _{,\phi}}} \over \omega}\dot \phi} \right)\delta \dot \phi + \left[ {- {\Delta \over {{a^2}}} + {{\left({{{{\omega _{,\phi}}} \over \omega}} \right)}_{,\phi}}{{{{\dot \phi}^2}} \over 2} + {{\left({{{2{V_{,\phi}} - {f_{,\phi}}} \over {2\omega}}} \right)}_{,\phi}}} \right]\delta \phi}\\ {= \dot \phi \dot \alpha + \left({2\ddot \phi + 3H\dot \phi + {{{\omega _{,\phi}}} \over \omega}{{\dot \phi}^2}} \right)\alpha + \dot \phi A + {1 \over {2\omega}}{F_{,\phi}}\delta R\,,\quad \quad \quad \quad \quad}\\ \end{array}$$
(6.16)
$$\delta {\dot \rho _M} + 3H(\delta {\rho _M} + \delta {P_M}) = ({\rho _M} + {P_M})\left({A - 3H\alpha + {\Delta \over {{a^2}}}v} \right)\,,$$
(6.17)
$${1 \over {{a^3}({\rho _M} + {P_M})}}\,{{\rm{d}} \over {{\rm{d}}t}}[{a^3}({\rho _M} + {P_M})v] = \alpha + {{\delta {P_M}} \over {{\rho _M} + {P_M}}}\,,$$
(6.18)

where δR is given by

$$\delta R = - 2\left[ {\dot A + 4HA + \left({{\Delta \over {{a^2}}} + 3\dot H} \right)\alpha + 2{\Delta \over {{a^2}}}\psi} \right]\,.$$
(6.19)

We shall solve the above equations in two different contexts: (i) inflation (Section 7), and (ii) the matter dominated epoch followed by the late-time cosmic acceleration (Section 8).

Gauge-invariant quantities

Before discussing the details of the evolution of cosmological perturbations, we construct a number of gauge-invariant quantities. This is required to avoid the appearance of unphysical modes. Let us consider the gauge transformation

$$\hat t = t + \delta t\,,\qquad {\hat x^i} = {x^i} + {\delta ^{ij}}{\partial _j}\delta x\,,$$
(6.20)

where δt and δx characterize the time slicing and the spatial threading, respectively. Then the scalar metric perturbations α, β, ψ and γ transform as [57, 71, 412]

$$\hat \alpha = \alpha - \dot{\delta t}\,,$$
(6.21)
$$\hat \beta = \beta - {a^{- 1}}\delta t + a\,\dot{\delta x}\,,$$
(6.22)
$$\hat \psi = \psi - H\delta t\,,$$
(6.23)
$$\hat \gamma = \gamma - \delta x\,.$$
(6.24)

Matter perturbations such as δϕ and δρ obey the following transformation rule

$$\hat{\delta \phi} = \delta \phi - \dot \phi \,\delta t\,,$$
(6.25)
$$\hat{\delta \rho} = \delta \rho - \dot \rho \,\delta t\,.$$
(6.26)

Note that the quantity δF is also subject to the same transformation: \(\overset \wedge {\delta F} = \delta F - \dot F\delta t\). We express the scalar part of the momentum component \(\delta T_i^0\) of the energy-momentum tensor as

$$\delta T_i^0 = {\partial _i}\delta q\,.$$
(6.27)

For the scalar field and the perfect fluid one has \(\delta q = - \dot \phi \delta \phi\) and δq = −(ρM + PM)v, respectively. This quantity transforms as

$$\hat{\delta q} = \delta q + (\rho + P)\delta t\,.$$
(6.28)

One can construct a number of gauge-invariant quantities unchanged under the transformation (6.20):

$$\Phi = \alpha - {{\rm{d}} \over {{\rm{d}}t}}[{a^2}(\dot \gamma + \beta/a)]\,,\qquad \Psi = - \psi + {a^2}H(\dot \gamma + \beta/a)\,,$$
(6.29)
$${\mathcal R} = \psi + {H \over {\rho + P}}\delta q\,,\qquad {{\mathcal R}_{\delta \phi}} = \psi - {H \over {\dot \phi}}\delta \phi \,,\qquad {{\mathcal R}_{\delta F}} = \psi - {H \over {\dot F}}\delta F\,,$$
(6.30)
$$\delta {\rho _q} = \delta \rho - 3H\delta q\,.$$
(6.31)

Since \(\delta q = - \dot \phi \delta \phi\) for single-field inflation with a potential V(ϕ), \({\mathcal R}\) is identical to \({{\mathcal R}_{\delta \phi}}\) [where we used \(\rho = {{\dot \phi}^2}/2 + V(\phi)\) and \(P = {{\dot \phi}^2}/2 - V(\phi)]\). In f(R) gravity one can introduce a scalar field ϕ as in Eq. (2.31), so that \({{\mathcal R}_{\delta F}} = {{\mathcal R}_{\delta \phi}}\). From the gauge-invariant quantity (6.31) it is also possible to construct another gauge-invariant quantity for the matter perturbation of perfect fluids:

$${\delta _M} = {{\delta {\rho _M}} \over {{\rho _M}}} + 3H(1 + {w_M})v\,,$$
(6.32)

where wM = PM/ρM.

We note that the tensor perturbation hij is invariant under the gauge transformation [412].
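
The gauge invariance of Φ and Ψ can be verified numerically for a single Fourier mode. The sketch below picks arbitrary smooth time-dependent amplitudes for the perturbations and for the gauge functions δt and δx, applies the transformation rules (6.21)–(6.24), and compares the potentials (6.29) before and after, with the time derivative inside Φ acting on \({a^2}(\dot \gamma + \beta/a)\). All test functions, the power-law scale factor, and the finite-difference helper `ddt` are illustrative assumptions.

```python
import math

def ddt(f, t, h=1e-4):
    """Central finite-difference approximation to df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

# Arbitrary smooth test functions of time (purely illustrative).
a = lambda t: t ** (2.0 / 3.0)                 # matter-era-like scale factor
alpha = lambda t: 0.3 * math.sin(t)
beta = lambda t: 0.2 * math.cos(2.0 * t)
psi = lambda t: 0.1 * math.exp(-t)
gamma = lambda t: 0.4 * t * math.sin(t)
delta_t = lambda t: 0.05 * math.cos(t)         # time-slicing shift
delta_x = lambda t: 0.07 * math.sin(3.0 * t)   # spatial-threading shift

def hubble(t):
    return ddt(a, t) / a(t)

def potentials(alpha_f, beta_f, psi_f, gamma_f, t):
    """Return (Phi, Psi) of Eq. (6.29) for the given perturbation amplitudes."""
    sigma = lambda s: a(s) ** 2 * (ddt(gamma_f, s) + beta_f(s) / a(s))
    Phi = alpha_f(t) - ddt(sigma, t)
    Psi = -psi_f(t) + hubble(t) * sigma(t)
    return Phi, Psi

# Transformed perturbations, Eqs. (6.21)-(6.24).
alpha_hat = lambda t: alpha(t) - ddt(delta_t, t)
beta_hat = lambda t: beta(t) - delta_t(t) / a(t) + a(t) * ddt(delta_x, t)
psi_hat = lambda t: psi(t) - hubble(t) * delta_t(t)
gamma_hat = lambda t: gamma(t) - delta_x(t)

t0 = 1.5
original = potentials(alpha, beta, psi, gamma, t0)
transformed = potentials(alpha_hat, beta_hat, psi_hat, gamma_hat, t0)
print(original, transformed)   # the two pairs agree up to rounding error
```

The same pattern can be used to test the other combinations in Eqs. (6.30) and (6.31) once a transformation rule for δq is added.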

We can choose specific gauge conditions to fix the gauge degree of freedom. After fixing a gauge, two scalar variables δt and δx are determined accordingly. The longitudinal gauge corresponds to the gauge choice \(\hat \beta = 0\) and \(\hat \gamma = 0\), under which \(\delta t = a(\beta + a\dot \gamma)\) and δx = γ. In this gauge one has \(\hat \Phi = \hat \alpha\) and \(\hat \Psi = - \hat \psi\), so that the line element (without vector and tensor perturbations) is given by

$${\rm{d}}{s^2} = - (1 + 2\Phi){\rm{d}}{t^2} + {a^2}(t)(1 - 2\Psi){\delta _{ij}}{\rm{d}}{x^i}{\rm{d}}{x^j}\,,$$
(6.33)

where we omitted the hat for perturbed quantities.

The uniform-field gauge corresponds to \(\overset \wedge {\delta \phi} = 0\) which fixes \(\delta t = \delta \phi/\dot \phi\). The spatial threading δx is fixed by choosing either \(\hat \beta = 0\) or \(\hat \gamma = 0\) (up to an integration constant in the former case). For this gauge choice one has \({{\hat {\mathcal R}}_{\delta \phi}} = \hat \psi\). Since the spatial curvature \(^{(3)}{\mathcal R}\) on the constant-time hypersurface is related to ψ via the relation \(^{(3)}{\mathcal R} = - 4{\nabla ^2}\psi/{a^2}\), the quantity \({\mathcal R}\) is often called the curvature perturbation on the uniform-field hypersurface. We can also choose the gauge condition \(\overset \wedge {\delta q} = 0\) or \(\overset \wedge {\delta F} = 0\).

Perturbations Generated During Inflation

Let us consider scalar and tensor perturbations generated during inflation for the theories (6.2) without taking into account the perfect fluid (SM = 0). In f(R) gravity the contribution of the field ϕ such as δϕ is absent in the perturbation equations (6.11)–(6.16). One can choose the gauge condition δF = 0, so that \({{\mathcal R}_{\delta F}} = \psi\). In scalar-tensor theory in which F is a function of ϕ alone (i.e., the coupling of the form F(ϕ)R without a non-linear term in R), the gauge choice δϕ = 0 leads to \({{\mathcal R}_{\delta \phi}} = \psi\). Since δF = F,ϕδϕ = 0 in this case, we have \({{\mathcal R}_{\delta F}} = {{\mathcal R}_{\delta \phi}} = \psi\).

We focus on the effective single-field theory such as f(R) gravity and scalar-tensor theory with the coupling F(ϕ)R, by choosing the gauge condition δϕ = 0 and δF = 0. We caution that this analysis does not cover the theory such as \({\mathcal L} = \xi (\phi)R + \alpha {R^2}\) [500], because the quantity F depends on both ϕ and R (in other words, δF = F,ϕδϕ + F,RδR). In the following we write the curvature perturbations \({{\mathcal R}_{\delta F}}\) and \({{\mathcal R}_{\delta \phi}}\) as \({\mathcal R}\).

Curvature perturbations

Since δϕ = 0 and δF = 0 in Eq. (6.12) we obtain

$$\alpha = {\dot {\mathcal R} \over {H + \dot F/(2F)}}\,.$$
(7.34)

Plugging Eq. (7.34) into Eq. (6.11), we have

$$A = - {1 \over {H + \dot F/(2F)}}\left[ {{\Delta \over {{a^2}}}{\mathcal R} + {{3H\dot F - {\kappa ^2}\omega {{\dot \phi}^2}} \over {2F\{H + \dot F/(2F)\}}}\dot {\mathcal R}} \right]\,.$$
(7.35)

Equation (6.14) gives

$$\dot A + \left({2H + {{\dot F} \over {2F}}} \right)A + {{3\dot F} \over {2F}}\dot \alpha + \left[ {{{3\ddot F + 6H\dot F + {\kappa ^2}\omega {{\dot \phi}^2}} \over {2F}} + {\Delta \over {{a^2}}}} \right]\alpha = 0\,,$$
(7.36)

where we have used the background equation (6.7). Plugging Eqs. (7.34) and (7.35) into Eq. (7.36), we find that the curvature perturbation satisfies the following simple equation in Fourier space

$$\ddot {\mathcal R}+ {{{{({a^3}{Q_s})}^ \cdot}} \over {{a^3}{Q_s}}}\dot {\mathcal R}+ {{{k^2}} \over {{a^2}}}{\mathcal R} = 0\,,$$
(7.37)

where k is a comoving wavenumber and

$${Q_s} \equiv {{\omega {{\dot \phi}^2} + 3{{\dot F}^2}/(2{\kappa ^2}F)} \over {{{[H + \dot F/(2F)]}^2}}}\,.$$
(7.38)

Introducing the variables \({z_s} = a\sqrt {{Q_s}}\) and \(u = {z_s}{\mathcal R}\), Eq. (7.37) reduces to

$$u^{\prime\prime} + \left({{k^2} - {{z_s^{\prime\prime}} \over {{z_s}}}} \right)u = 0\,,$$
(7.39)

where a prime represents a derivative with respect to the conformal time \(\eta = \int {{a^{- 1}}} {\rm{d}}t\).

In General Relativity with a canonical scalar field ϕ one has ω = 1 and F = 1, which corresponds to \({Q_s} = {{\dot \phi}^2}/{H^2}\). Then the perturbation u corresponds to \(u = a[ - \delta \phi + (\dot \phi/H)\psi ]\). In the spatially flat gauge (ψ = 0) this reduces to u = −aδϕ, which implies that the perturbation u corresponds to a canonical scalar field δχ = aδϕ. In modified gravity theories it is not clear at this stage that the perturbation \(u = a\sqrt {{Q_s}} {\mathcal R}\) corresponds to a canonical field that should be quantized, because Eq. (7.37) is unchanged when the quantity Qs defined in Eq. (7.38) is multiplied by a constant. As we will see in Section 7.4, this problem is overcome by considering a second-order perturbed action for the theory (6.2) from the beginning.

In order to derive the spectrum of curvature perturbations generated during inflation, we introduce the following variables [315]

$${\epsilon _1} \equiv - {{\dot H} \over {{H^2}}}\,,\quad {\epsilon _2} \equiv {{\ddot \phi} \over {H\dot \phi}}\,,\quad {\epsilon _3} \equiv {{\dot F} \over {2HF}}\,,\quad {\epsilon _4} \equiv {{\dot E} \over {2HE}}\,,$$
(7.40)

where \(E \equiv F[\omega + 3{{\dot F}^2}/(2{\kappa ^2}{{\dot \phi}^2}F)]\). Then the quantity Qs can be expressed as

$${Q_s} = {\dot \phi ^2}{E \over {F{H^2}{{(1 + {\epsilon _3})}^2}}}\,.$$
(7.41)
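
That Eq. (7.41) is an exact rewriting of Eq. (7.38) can be confirmed with arbitrary numbers. The sketch below evaluates both forms; the function names and the sample values are hypothetical, and κ² is set to 1 by default for convenience.

```python
def Q_s_direct(omega, phi_dot, F, F_dot, H, kappa2=1.0):
    """Q_s as defined in Eq. (7.38)."""
    return (omega * phi_dot**2 + 3.0 * F_dot**2 / (2.0 * kappa2 * F)) / (H + F_dot / (2.0 * F))**2

def Q_s_slow_roll(omega, phi_dot, F, F_dot, H, kappa2=1.0):
    """Q_s rewritten with E and epsilon_3, Eq. (7.41); requires phi_dot != 0."""
    E = F * (omega + 3.0 * F_dot**2 / (2.0 * kappa2 * phi_dot**2 * F))
    eps3 = F_dot / (2.0 * H * F)
    return phi_dot**2 * E / (F * H**2 * (1.0 + eps3)**2)

sample = dict(omega=1.3, phi_dot=0.4, F=2.0, F_dot=0.1, H=0.7)
print(Q_s_direct(**sample), Q_s_slow_roll(**sample))   # the two values coincide

# General Relativity with a canonical field: omega = 1, F = 1, F_dot = 0.
print(Q_s_direct(1.0, 0.4, 1.0, 0.0, 0.7), 0.4**2 / 0.7**2)
```

The second form is only defined for a non-vanishing field velocity, since E contains \({{\dot \phi}^2}\) in a denominator.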

If the parameter ϵ1 is constant, it follows that η = −1/[(1 − ϵ1)aH] [573]. If \({{\dot \epsilon}_i} = 0\) (i = 1, 2, 3, 4) one has

$${{z_s^{\prime\prime}} \over {{z_s}}} = {{\nu _{\mathcal R}^2 - 1/4} \over {{\eta ^2}}}\,,\qquad {\rm{with}}\qquad \nu _{\mathcal R}^2 = {1 \over 4} + {{(1 + {\epsilon _1} + {\epsilon _2} - {\epsilon _3} + {\epsilon _4})(2 + {\epsilon _2} - {\epsilon _3} + {\epsilon _4})} \over {{{(1 - {\epsilon _1})}^2}}}\,.$$
(7.42)

Then the solution to Eq. (7.39) can be expressed as a linear combination of Hankel functions,

$$u = {{\sqrt {\pi \vert \eta \vert}} \over 2}{e^{i(1 + 2{\nu _{\mathcal R}})\pi/4}}\left[ {{c_1}H_{{\nu_{\mathcal R}}}^{(1)}(k\vert \eta \vert) + {c_2}H_{{\nu _{\mathcal R}}}^{(2)}(k\vert \eta \vert)} \right]\,,$$
(7.43)

where c1 and c2 are integration constants.

During inflation one has ∣ϵi∣ ≪ 1, so that \(z_s^{\prime\prime}/{z_s} \approx {(aH)^2}\). For the modes deep inside the Hubble radius (k ≫ aH, i.e., ∣kη∣ ≫ 1) the perturbation u satisfies the standard equation of a canonical field in the Minkowski spacetime: \(u^{\prime\prime} + {k^2}u \simeq 0\). After the Hubble radius crossing (k = aH) during inflation, the effect of the gravitational term \(z_s^{\prime\prime}/{z_s}\) becomes important. In the super-Hubble limit (k ≪ aH, i.e., ∣kη∣ ≪ 1) the last term on the l.h.s. of Eq. (7.37) can be neglected, giving the following solution

$${\mathcal R} = {c_1} + {c_2}\int {{{{\rm{d}}t} \over {{a^3}{Q_s}}}} \,,$$
(7.44)

where c1 and c2 are integration constants. The second term can be identified as a decaying mode, which rapidly decays during inflation (unless the field potential has abrupt features). Hence the curvature perturbation approaches a constant value c1 after the Hubble radius crossing (k < aH).

In the asymptotic past (kη → −∞) the solution to Eq. (7.39) is determined by a vacuum state in quantum field theory [88], as \(u \rightarrow {e^{- ik\eta}}/\sqrt {2k}\). This fixes the coefficients to be c1 = 1 and c2 = 0, giving the following solution

$$u = {{\sqrt {\pi \vert \eta \vert}} \over 2}{e^{i(1 + 2{\nu _{\mathcal R}})\pi/4}}H_{{\nu _{\mathcal R}}}^{(1)}(k\vert \eta \vert)\,.$$
(7.45)

We define the power spectrum of curvature perturbations,

$${{\mathcal P}_{\mathcal R}} \equiv {{4\pi {k^3}} \over {{{(2\pi)}^3}}}{\left\vert {\mathcal R} \right\vert ^2}\,.$$
(7.46)

Using the solution (7.45), we obtain the power spectrum [317]

$${{\mathcal P}_{\mathcal R}} = {1 \over {{Q_s}}}{\left({(1 - {\epsilon _1}){{\Gamma ({\nu _{\mathcal R}})} \over {\Gamma (3/2)}}{H \over {2\pi}}} \right)^2}{\left({{{\vert k\eta \vert} \over 2}} \right)^{3 - 2{\nu _{\mathcal R}}}}\,,$$
(7.47)

where we have used the relations \(H_\nu ^{(1)}(k\vert \eta \vert) \rightarrow - (i/\pi)\Gamma (\nu){(k\vert \eta \vert/2)^{- \nu}}\) for kη → 0 and \(\Gamma (3/2) = \sqrt \pi/2\). Since the curvature perturbation is frozen after the Hubble radius crossing, the spectrum (7.47) should be evaluated at k = aH. The spectral index of \({\mathcal R}\), which is defined by \({n_{\mathcal R}} - 1 = {\rm{d}}\,{\rm{ln}}\,{{\mathcal P}_{\mathcal R}}/{\rm{d}}\,{\rm{ln}}\,k{\vert _{k = aH}}\), is

$${n_{\mathcal R}} - 1 = 3 - 2{\nu _{\mathcal R}}\,,$$
(7.48)

where \({\nu _{\mathcal R}}\) is given in Eq. (7.42). As long as ∣ϵi∣(i = 1, 2, 3, 4) are much smaller than 1 during inflation, the spectral index reduces to

$${n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} - 2{\epsilon _2} + 2{\epsilon _3} - 2{\epsilon _4}\,,$$
(7.49)

where we have ignored terms of higher order in ϵi. Provided that ∣ϵi∣ ≪ 1 the spectrum is close to scale-invariant \(({n_{\mathcal R}} \simeq 1)\). From Eq. (7.47) the power spectrum of curvature perturbations can be estimated as

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {{Q_s}}}{\left({{H \over {2\pi}}} \right)^2}\,.$$
(7.50)

A minimally coupled scalar field ϕ in Einstein gravity corresponds to ϵ3 = 0, ϵ4 = 0 and \({Q_s} = {{\dot \phi}^2}/{H^2}\), in which case we obtain the standard results \({n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} - 2{\epsilon _2}\) and \({{\mathcal P}_{\mathcal R}} \simeq {H^4}/(4{\pi ^2}{{\dot \phi}^2})\) in slow-roll inflation [573, 390].
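
The size of the higher-order terms dropped in Eq. (7.49) can be gauged numerically. The following Python sketch is only an illustration and not part of the original derivation: it evaluates the exact index (7.48), with \({\nu _{\mathcal R}}\) taken from Eq. (7.42), and compares it with the first-order result (7.49). The function names and the sample values of ϵi are hypothetical choices made for this example.

```python
import math

def nu_R(e1, e2, e3, e4):
    """Index nu_R from Eq. (7.42), valid when the epsilon_i are constant."""
    num = (1 + e1 + e2 - e3 + e4) * (2 + e2 - e3 + e4)
    return math.sqrt(0.25 + num / (1 - e1) ** 2)

def n_R_minus_1_exact(e1, e2, e3, e4):
    """Eq. (7.48): n_R - 1 = 3 - 2 nu_R."""
    return 3 - 2 * nu_R(e1, e2, e3, e4)

def n_R_minus_1_first_order(e1, e2, e3, e4):
    """Eq. (7.49): leading order in the slow-roll parameters."""
    return -4 * e1 - 2 * e2 + 2 * e3 - 2 * e4

# Hypothetical slow-roll values, chosen only for illustration.
eps = (0.010, -0.005, 0.002, 0.003)
exact = n_R_minus_1_exact(*eps)
approx = n_R_minus_1_first_order(*eps)
print(f"exact: {exact:.6f}, first order: {approx:.6f}")
```

For parameters of order 10⁻², the two expressions should differ only at second order in ϵi, which is the sense in which Eq. (7.49) is used in what follows.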

Tensor perturbations

Tensor perturbations hij have two polarization states, which are generally written as λ = +, × [391]. In terms of polarization tensors \(e_{ij}^ +\) and \(e_{ij}^ \times\), they are given by

$${h_{ij}} = {h_ +}e_{ij}^ + + {h_ \times}e_{ij}^ \times \,.$$
(7.51)

If the direction of a momentum k is along the z-axis, the non-zero components of polarization tensors are given by \(e_{xx}^ + = - e_{yy}^ + = 1\) and \(e_{xy}^ \times = e_{yx}^ \times = 1\).

For the action (6.2) the Fourier components hλ (λ = +, ×) obey the following equation [314]

$$\ddot {{h_\lambda}}+ {{{{({a^3}F)}^ \cdot}} \over {{a^3}F}}\dot{{h_\lambda}} + {{{k^2}} \over {{a^2}}}{h_\lambda} = 0\,.$$
(7.52)

This is similar to Eq. (7.37) of curvature perturbations, apart from the difference of the factor F instead of Qs. Defining new variables \({z_t} = a\sqrt F\) and \({u_\lambda} = {z_t}{h_\lambda}/\sqrt {16\pi G}\), it follows that

$$u_\lambda ^{\prime\prime} + \left({{k^2} - {{z_t^{\prime\prime}} \over {{z_t}}}} \right){u_\lambda} = 0\,.$$
(7.53)

We have introduced the factor 16πG to relate a dimensionless massless field hλ with a massless scalar field uλ having a unit of mass.

If \({{\dot \epsilon}_i} = 0\), we obtain

$${{z_t^{\prime\prime}} \over {{z_t}}} = {{\nu _t^2 - 1/4} \over {{\eta ^2}}}\,,\qquad {\rm{with}}\qquad \nu _t^2 = {1 \over 4} + {{(1 + {\epsilon _3})(2 - {\epsilon _1} + {\epsilon _3})} \over {{{(1 - {\epsilon _1})}^2}}}\,.$$
(7.54)

We follow a procedure similar to the one given in Section 7.1. Taking into account polarization states, the spectrum of tensor perturbations after the Hubble radius crossing is given by

$${{\mathcal P}_T} = 4 \times {{16\pi G} \over {{a^2}F}}{{4\pi {k^3}} \over {{{(2\pi)}^3}}}\vert {u_\lambda}{\vert ^2} \simeq {{16} \over \pi}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over F}{\left({(1 - {\epsilon _1}){{\Gamma ({\nu _t})} \over {\Gamma (3/2)}}} \right)^2}{\left({{{\vert k\eta \vert} \over 2}} \right)^{3 - 2{\nu _t}}}\,,$$
(7.55)

which should be evaluated at the Hubble radius crossing (k = aH). The spectral index of \({{\mathcal P}_T}\) is

$${n_T} = 3 - 2{\nu _t}\,,$$
(7.56)

where νt is given in Eq. (7.54). If ∣ϵi∣ ≪ 1, this reduces to

$${n_T} \simeq - 2{\epsilon _1} - 2{\epsilon _3}\,.$$
(7.57)

Then the amplitude of tensor perturbations is given by

$${{\mathcal P}_T} \simeq {{16} \over \pi}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over F}\,.$$
(7.58)

We define the tensor-to-scalar ratio

$$r \equiv {{{{\mathcal P}_T}} \over {{{\mathcal P}_{\mathcal R}}}} \simeq {{64\pi} \over {m_{{\rm{pl}}}^2}}{{{Q_s}} \over F}\,.$$
(7.59)

For a minimally coupled scalar field ϕ in Einstein gravity, it follows that nT ≃ −2ϵ1, \({{\mathcal P}_T} \simeq 16{H^2}/(\pi m_{{\rm{pl}}}^2)\), and r ≃ 16ϵ1.
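
As a hedged numerical illustration of Eqs. (7.54)–(7.59), the Python sketch below compares the exact tensor index with its first-order form and recovers r ≃ 16ϵ1 for a minimally coupled field. The function names and sample values are hypothetical. The relation \(\dot H = - 4\pi {{\dot \phi}^2}/m_{{\rm{pl}}}^2\) used for the Einstein-gravity case is the standard background equation and is an assumption here, since it is not derived in this section.

```python
import math

def nu_t(e1, e3):
    """Index nu_t from Eq. (7.54), valid for constant epsilon_i."""
    return math.sqrt(0.25 + (1 + e3) * (2 - e1 + e3) / (1 - e1) ** 2)

def n_T_exact(e1, e3):
    """Eq. (7.56): n_T = 3 - 2 nu_t."""
    return 3 - 2 * nu_t(e1, e3)

def n_T_first_order(e1, e3):
    """Eq. (7.57): leading order in the slow-roll parameters."""
    return -2 * e1 - 2 * e3

def tensor_to_scalar(Q_s, F, m_pl=1.0):
    """Eq. (7.59): r = 64 pi Q_s / (m_pl^2 F)."""
    return 64 * math.pi * Q_s / (m_pl ** 2 * F)

# Einstein gravity with a canonical field: F = 1 and Q_s = phidot^2 / H^2.
# Assumption (standard GR background relation): Hdot = -4 pi phidot^2 / m_pl^2,
# which gives Q_s = eps1 * m_pl^2 / (4 pi).
eps1 = 0.01  # hypothetical value
r_gr = tensor_to_scalar(Q_s=eps1 / (4 * math.pi), F=1.0)
print(f"r = {r_gr:.4f}, 16 eps1 = {16 * eps1:.4f}")
```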

The spectra of perturbations in inflation based on f(R) gravity

Let us study the spectra of scalar and tensor perturbations generated during inflation in metric f(R) gravity. Introducing the quantity \(E = 3{{\dot F}^2}/(2{\kappa ^2})\), we have \({\epsilon _4} = \ddot F/(H\dot F)\) and

$${Q_s} = {{6F\epsilon _3^2} \over {{\kappa ^2}{{(1 + {\epsilon _3})}^2}}} = {E \over {F{H^2}{{(1 + {\epsilon _3})}^2}}}\,.$$
(7.60)

Since the field kinetic term \({{\dot \phi}^2}\) is absent, one has ϵ2 = 0 in Eqs. (7.42) and (7.49). Under the conditions ∣ϵi∣ ≪ 1 (i = 1, 3, 4), the spectral index of curvature perturbations is given by \({n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} + 2{\epsilon _3} - 2{\epsilon _4}\).

In the absence of the matter fluid, Eq. (2.16) translates into

$${\epsilon _1} = - {\epsilon _3}(1 - {\epsilon _4})\,,$$
(7.61)

which gives ϵ1≃ −ϵ3 for ∣ϵ4∣ ≪ 1. Hence we obtain [315]

$${n_{\mathcal R}} - 1 \simeq - 6{\epsilon _1} - 2{\epsilon _4}\,.$$
(7.62)

From Eqs. (7.50) and (7.60), the amplitude of \({\mathcal R}\) is estimated as

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {3\pi F}}{\left({{H \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over {\epsilon _3^2}}\,.$$
(7.63)

Using the relation ϵ1 ≃ −ϵ3, the spectral index (7.57) of tensor perturbations is given by

$${n_T} \simeq 0\,,$$
(7.64)

which vanishes at first-order of slow-roll approximations. From Eqs. (7.58) and (7.63) we obtain the tensor-to-scalar ratio

$$r \simeq 48\epsilon _3^2 \simeq 48\epsilon _1^2\,.$$
(7.65)

The model f(R) = αR^n (n > 0)

Let us consider the inflation model: f(R) = αR^n (n > 0). From the discussion given in Section 3.1 the slow-roll parameters ϵi (i = 1, 3, 4) are constants:

$${\epsilon _1} = {{2 - n} \over {(n - 1)(2n - 1)}}\,,\qquad {\epsilon _3} = - (n - 1){\epsilon _1}\,,\qquad {\epsilon _4} = {{n - 2} \over {n - 1}}\,.$$
(7.66)

In this case one can use the exact results (7.48) and (7.56) with \({\nu _{\mathcal R}}\) and νt given in Eqs. (7.42) and (7.54) (with ϵ2 = 0). Then the spectral indices are

$${n_{\mathcal R}} - 1 = {n_T} = - {{2{{(n - 2)}^2}} \over {2{n^2} - 2n - 1}}\,.$$
(7.67)

If n = 2 we obtain the scale-invariant spectra with \({n_{\mathcal R}} = 1\) and nT = 0. Even the slight deviation from n = 2 leads to a rather large deviation from the scale-invariance. If n = 1.7, for example, one has \({n_{\mathcal R}} - 1 = {n_T} = - 0.13\), which does not match with the WMAP 5-year constraint: \({n_{\mathcal R}} = 0.960 \pm 0.013\) [367].
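
The closed form (7.67) can be cross-checked against the general expressions. The Python sketch below is an illustrative check with hypothetical function names: it inserts the constant slow-roll parameters (7.66) into Eqs. (7.42) and (7.54) with ϵ2 = 0 and compares the resulting indices with Eq. (7.67).

```python
import math

def slow_roll_power_law(n):
    """Constant slow-roll parameters of f(R) = alpha R^n, Eq. (7.66)."""
    e1 = (2 - n) / ((n - 1) * (2 * n - 1))
    e3 = -(n - 1) * e1
    e4 = (n - 2) / (n - 1)
    return e1, e3, e4

def spectral_indices(n):
    """Return (n_R - 1, n_T) from Eqs. (7.42), (7.48), (7.54), (7.56), eps2 = 0."""
    e1, e3, e4 = slow_roll_power_law(n)
    nu_R = math.sqrt(0.25 + (1 + e1 - e3 + e4) * (2 - e3 + e4) / (1 - e1) ** 2)
    nu_t = math.sqrt(0.25 + (1 + e3) * (2 - e1 + e3) / (1 - e1) ** 2)
    return 3 - 2 * nu_R, 3 - 2 * nu_t

def closed_form(n):
    """Eq. (7.67): common value of n_R - 1 and n_T."""
    return -2 * (n - 2) ** 2 / (2 * n ** 2 - 2 * n - 1)

for n in (1.7, 1.9, 2.0, 2.1):
    nR_minus_1, nT = spectral_indices(n)
    print(n, nR_minus_1, nT, closed_form(n))
```

The case n = 1.7 corresponds to the value quoted above, and n = 2 gives the scale-invariant spectra.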

The model f(R) = R + R²/(6M²)

For the model f(R) = R + R²/(6M²), the spectrum of the curvature perturbation \({\mathcal R}\) shows some deviation from the scale-invariance. Since inflation occurs in the regime R ≫ M² and \(\vert \dot H\vert \ll {H^2}\), one can approximate F ≃ R/(3M²) ≃ 4H²/M². Then the power spectra (7.63) and (7.58) yield

$${{\mathcal P}_{\mathcal R}} \simeq {1 \over {12\pi}}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}{1 \over {\epsilon _1^2}}\,,\qquad {{\mathcal P}_T} \simeq {4 \over \pi}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}\,,$$
(7.68)

where we have employed the relation ϵ3 ≃ − ϵ1.

Recall that the evolution of the Hubble parameter during inflation is given by Eq. (3.9). As long as the time tk at the Hubble radius crossing (k = aH) satisfies the condition (M²/6)(tk − ti) ≪ Hi, one can approximate H(tk) ≃ Hi. Using Eq. (3.9), the number of e-foldings from t = tk to the end of inflation can be estimated as

$${N_k} \simeq {1 \over {2{\epsilon _1}({t_k})}}\,.$$
(7.69)

Then the amplitude of the curvature perturbation is given by

$${{\mathcal P}_{\mathcal R}} \simeq {{N_k^2} \over {3\pi}}{\left({{M \over {{m_{{\rm{pl}}}}}}} \right)^2}\,.$$
(7.70)

The WMAP 5-year normalization corresponds to \({{\mathcal P}_{\mathcal R}} = (2.445 \pm 0.096) \times {10^{- 9}}\) at the scale k = 0.002 Mpc⁻¹ [367]. Taking the typical value Nk = 55, the mass M is constrained to be

$$M \simeq 3 \times {10^{- 6}}{m_{{\rm{pl}}}}\,.$$
(7.71)

Using the relation F ≃ 4H²/M², it follows that ϵ4 ≃ − ϵ1. Hence the spectral index (7.62) reduces to

$${n_{\mathcal R}} - 1 \simeq - 4{\epsilon _1} \simeq - {2 \over {{N_k}}} = - 3.6 \times {10^{- 2}}{\left({{{{N_k}} \over {55}}} \right)^{- 1}}\,.$$
(7.72)

For Nk = 55 we have \({n_{\mathcal R}} \simeq 0.964\), which is in the allowed region of the WMAP 5-year constraint (\({n_{\mathcal R}} = 0.960 \pm 0.013\) at the 68% confidence level [367]). The tensor-to-scalar ratio (7.65) can be estimated as

$$r \simeq {{12} \over {N_k^2}} \simeq 4.0 \times {10^{- 3}}{\left({{{{N_k}} \over {55}}} \right)^{- 2}}\,,$$
(7.73)

which satisfies the current observational bound r < 0.22 [367]. We note that a minimally coupled field with the potential V(ϕ) = m²ϕ²/2 in Einstein gravity (chaotic inflation model [393]) gives rise to a larger tensor-to-scalar ratio of the order of 0.1. Since future observations such as the Planck satellite are expected to reach the level of \(r = {\mathcal O}({10^{- 2}})\), they will be able to discriminate between the chaotic inflation model and Starobinsky’s f(R) model.
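
The numbers quoted in Eqs. (7.71)–(7.73) follow from elementary arithmetic, which the Python sketch below reproduces. It is an illustration only: the function name is hypothetical, the amplitude is the central WMAP 5-year value quoted above, and the slow-roll relations (7.65), (7.69), (7.70) and (7.72) are taken at face value.

```python
import math

P_R_WMAP5 = 2.445e-9  # central amplitude quoted in the text at k = 0.002 Mpc^-1

def starobinsky_predictions(N_k, P_R=P_R_WMAP5):
    """Slow-roll predictions of f(R) = R + R^2/(6 M^2) for N_k e-foldings."""
    eps1 = 1 / (2 * N_k)                             # Eq. (7.69)
    M_over_mpl = math.sqrt(3 * math.pi * P_R) / N_k  # Eq. (7.70) solved for M
    n_R = 1 - 4 * eps1                               # Eq. (7.72)
    r = 48 * eps1 ** 2                               # Eq. (7.65), i.e. 12 / N_k^2
    return M_over_mpl, n_R, r

M_over_mpl, n_R, r = starobinsky_predictions(55)
print(f"M/m_pl = {M_over_mpl:.2e}, n_R = {n_R:.4f}, r = {r:.2e}")
```

Varying N_k in the typical range of 50 to 60 shows how weakly the predictions depend on the assumed number of e-foldings.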

The power spectra in the Einstein frame

Let us consider the power spectra in the Einstein frame. Under the conformal transformation \({{\tilde g}_{\mu \nu}} = F{g_{\mu \nu}}\), the perturbed metric (6.1) is transformed as

$$\begin{array}{*{20}c} {{\rm{d}}{{\tilde s}^2} = F{\rm{d}}{s^2}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;\quad \quad \quad \quad \quad}\\ {= - (1 + 2\tilde \alpha)\,{\rm{d}}{{\tilde t}^2} - 2\tilde a(\tilde t)\,({\partial _i}\tilde \beta - {{\tilde S}_i}){\rm{d}}\tilde t\;{\rm{d}}{{\tilde x}^i}\quad \quad \quad \quad \quad}\\ {+ {{\tilde a}^2}(\tilde t)({\delta _{ij}} + 2\tilde \psi {\delta _{ij}} + 2{\partial _i}{\partial _j}\tilde \gamma + 2{\partial _j}{{\tilde F}_i} + {{\tilde h}_{ij}}){\rm{d}}{{\tilde x}^i}{\rm{d}}{{\tilde x}^j}\,.}\\ \end{array}$$
(7.74)

We decompose the conformal factor into the background and perturbed parts, as

$$F(t,x) = \bar F(t)\left({1 + {{\delta F(t,x)} \over {\bar F(t)}}} \right)\,.$$
(7.75)

In what follows we omit a bar from F. We recall that the background quantities are transformed as Eqs. (2.44) and (2.47). The transformation of scalar metric perturbations is given by

$$\tilde \alpha = \alpha + {{\delta F} \over {2F}}\,,\qquad \tilde \beta = \beta \,,\qquad \tilde \psi = \psi + {{\delta F} \over {2F}}\,,\qquad \tilde \gamma = \gamma \,.$$
(7.76)

Meanwhile vector and tensor perturbations are invariant under the conformal transformation \(({{\tilde S}_i} = {S_i},{{\tilde F}_i} = {F_i},{{\tilde h}_{ij}} = {h_{ij}})\).

Using the above transformation law, one can easily show that the curvature perturbation \({\mathcal R} = \psi - H\delta F/\dot F\) in f(R) gravity is invariant under the conformal transformation:

$$\tilde {\mathcal R} = {\mathcal R}\,.$$
(7.77)

Since the tensor perturbation is also invariant, the tensor-to-scalar ratio in the Einstein frame is identical to that in the Jordan frame. For example, let us consider the model f(R) = R + R2/(6M2). Since the action in the Einstein frame is given by Eq. (2.32), the slow-roll parameters \({{\tilde \epsilon}_3}\) and \({{\tilde \epsilon}_4}\) vanish in this frame. Using Eqs. (7.49) and (3.27), the spectral index of curvature perturbations is given by

$${\tilde n_{\mathcal R}} - 1 \simeq - 4{\tilde{\epsilon}_1} - 2{\tilde{\epsilon}_2} \simeq - {2 \over {{{\tilde N}_k}}}\,,$$
(7.78)

where we have ignored the term of the order of \(1/\tilde N_k^2\). Since \({{\tilde N}_k} \simeq {N_k}\) in the slow-roll limit (\(\vert \dot F/(2HF)\vert \ll 1\)), Eq. (7.78) agrees with the result (7.72) in the Jordan frame. Since \({Q_s} = {({\rm{d}}\phi {\rm{/d}}\tilde t)^2}/{{\tilde H}^2}\) in the Einstein frame, Eq. (7.59) gives the tensor-to-scalar ratio

$$\tilde r = {{64\pi} \over {m_{{\rm{pl}}}^2}}{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\tilde t}}} \right)^2}{1 \over {{{\tilde H}^2}}} \simeq 16{\tilde \epsilon _1} \simeq {{12} \over {\tilde N_k^2}}\,,$$
(7.79)

where the background equations (3.21) and (3.22) are used with slow-roll approximations. Equation (7.79) is consistent with the result (7.73) in the Jordan frame.

The equivalence of the curvature perturbation between the Jordan and Einstein frames also holds for scalar-tensor theory with the Lagrangian \({\mathcal L} = F(\phi)R/(2{\kappa ^2}) - (1/2)\omega (\phi){g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - V(\phi)\) [411, 240]. For the non-minimally coupled scalar field with F(ϕ) = 1 − ξκ²ϕ² [269, 241] the spectral indices of scalar and tensor perturbations have been derived by using such equivalence [366, 590].

The Lagrangian for cosmological perturbations

In Section 7.1 we used the fact that the field which should be quantized corresponds to \(u = a\sqrt {{Q_s}} {\mathcal R}\). This can be justified by writing down the action (6.2) expanded at second-order in the perturbations [437]. We recall again that we are considering an effective single-field theory such as f(R) gravity and scalar-tensor theory with the coupling F(ϕ)R. Carrying out the expansion of the action (6.2) in second order, we find that the action for the curvature perturbation \({\mathcal R}\) (either \({{\mathcal R}_{\delta F}}\) or \({{\mathcal R}_{\delta \phi}}\)) is given by [311]

$$\delta {S^{(2)}} = \int {\rm{d}} t\;{{\rm{d}}^3}x\,{a^3}\,{Q_s}\left[ {{1 \over 2}\,\dot{\mathcal R}^2 - {1 \over 2}\,{1 \over {{a^2}}}{{(\nabla {\mathcal R})}^2}} \right]\,,$$
(7.80)

where Qs is given in Eq. (7.38). In fact, the variation of this action in terms of the field \({\mathcal R}\) gives rise to Eq. (7.37) in Fourier space. We note that there is another approach called the Hamiltonian formalism which is also useful for the quantization of cosmological perturbations. See [237, 209, 208, 127] for this approach in the context of f(R) gravity and modified gravitational theories.

Introducing the quantities \(u = {z_s}{\mathcal R}\) and \({z_s} = a\sqrt {{Q_s}}\), the action (7.80) can be written as

$$\delta {S^{(2)}} = \int {\rm{d}} \eta \,{{\rm{d}}^3}x\left[ {{1 \over 2}\,{{u}^{\prime 2}} - {1 \over 2}{{(\nabla u)}^2} + {1 \over 2}{{z_s^{\prime \prime}} \over {{z_s}}}{u^2}} \right]\,,$$
(7.81)

where a prime represents a derivative with respect to the conformal time η = ∫ a⁻¹ dt. The action (7.81) leads to Eq. (7.39) in Fourier space. The transformation of the action (7.80) to (7.81) gives rise to the effective mass (see footnote 6)

$$M_s^2 \equiv - {1 \over {{a^2}}}{{z_s^{\prime\prime}} \over {{z_s}}} = {{\dot Q_s^2} \over {4Q_s^2}} - {{{{\ddot Q}_s}} \over {2{Q_s}}} - {{3H{{\dot Q}_s}} \over {2{Q_s}}}.$$
(7.82)

We have seen in Eq. (7.42) that during inflation the quantity \(z_s^{\prime\prime}/{z_s}\) can be estimated as \(z_s^{\prime\prime}/{z_s} \simeq 2{(aH)^2}\) in the slow-roll limit, so that \(M_s^2 \simeq - 2{H^2}\). For the modes deep inside the Hubble radius (k ≫ aH) the action (7.81) reduces to the one for a canonical scalar field u in the flat spacetime. Hence the quantization should be done for the field \(u = a\sqrt {{Q_s}} {\mathcal R}\), as we have done in Section 7.1.

From the action (7.81) we understand a number of physical properties in f(R) theories and scalar-tensor theories with the coupling F(ϕ)R listed below.

  1.

    Having a standard d’Alembertian operator, the mode has speed of propagation equal to the speed of light. This leads to a standard dispersion relation ω = k/a for the high-k modes in Fourier space.

  2.

    The sign of Qs corresponds to the sign of the kinetic energy of \({\mathcal R}\). The negative sign corresponds to a ghost (phantom) scalar field. In f(R) gravity (with \(\dot \phi = 0\)) the ghost appears for F < 0. In Brans-Dicke theory with F(ϕ) = κ²ϕ and ω(ϕ) = ωBD/ϕ [100] (where ϕ > 0) the condition for the appearance of the ghost \((\omega {{\dot \phi}^2} + 3{{\dot F}^2}/(2{\kappa ^2}F) < 0)\) translates into ωBD < −3/2. In these cases one would encounter serious problems related to vacuum instability [145, 161].

  3.

    The field u has the effective mass squared given in Eq. (7.82). In f(R) gravity it can be written as

    $$M_s^2 = - {{72{F^2}{H^4}} \over {{{(2FH + {f_{,RR}}\dot R)}^2}}} + {1 \over 3}F\left({{{288{H^3} - 12HR} \over {2FH + {f_{,RR}}\dot R}} + {1 \over {{f_{,RR}}}}} \right) + {{f_{,RR}^2{{\dot R}^2}} \over {4{F^2}}} - 24{H^2} + {7 \over 6}R\,,$$
    (7.83)

    where we used the background equation (2.16) to write \(\dot H\) in terms of R and H². In Fourier space the perturbation u obeys the equation of motion

    $$u^{\prime\prime} +({{k^2} + M_s^2{a^2}})\;u = 0\,.$$
    (7.84)

    For \({k^2}/{a^2} \gg M_s^2\), the field u propagates with speed of light. For small k satisfying \({k^2}/{a^2} \ll M_s^2\), we require a positive \(M_s^2\) to avoid the tachyonic instability of perturbations. Recall that the viable dark energy models based on f(R) theories need to satisfy Rf,RR ≪ F (i.e., m = Rf,RR/f,R ≪ 1) at early times, in order to have successful cosmological evolution from radiation domination till matter domination. At these epochs the mass squared is approximately given by

    $$M_s^2 \simeq {F \over {3{f_{,RR}}}}\,,$$
    (7.85)

    which is consistent with the result (5.2) derived by the linear analysis about the Minkowski background. Together with the condition F > 0 for the absence of ghosts, this leads to f,RR > 0. Recall that these correspond to the conditions presented in Eq. (4.56).

Observational Signatures of Dark Energy Models in f(R) Theories

In this section we discuss a number of observational signatures of dark energy models based on metric f(R) gravity. Our main interest is to distinguish these models from the ΛCDM model observationally. In particular we study the evolution of matter density perturbations as well as the gravitational potential to confront f(R) models with the observations of large-scale structure (LSS) and Cosmic Microwave Background (CMB). The effect on weak lensing will be discussed in Section 13.1 in more general modified gravity theories including f(R) gravity.

Matter density perturbations

Let us consider the perturbations of non-relativistic matter with the background energy density ρm and the negligible pressure (Pm = 0). In Fourier space Eqs. (6.17) and (6.18) give

$${\dot{\delta} {\rho _m}} + 3H\delta {\rho _m} = {\rho _m}\left({A - 3H\alpha - {{{k^2}} \over {{a^2}}}v} \right)\,,$$
(8.86)
$$\dot v = \alpha \,,$$
(8.87)

where in the second line we have used the continuity equation, \({{\dot \rho}_m} + 3H{\rho _m} = 0\). The density contrast defined in Eq. (6.32), i.e.

$${\delta _m} = {{\delta {\rho _m}} \over {{\rho _m}}} + 3Hv\,,$$
(8.88)

obeys the following equation from Eqs. (8.86) and (8.87):

$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}(\alpha - \dot \chi) = 3\ddot B + 6H\dot B\,,$$
(8.89)

where B ≡ Hv − ψ and we used the relation \(A = 3(H\alpha - \dot \psi) + ({k^2}/{a^2})\chi\).

In the following we consider the evolution of perturbations in f(R) gravity in the longitudinal gauge (6.33). Since χ = 0, α = Φ, ψ = −Ψ, and \(A = 3(H\Phi + \dot \Psi)\) in this case, Eqs. (6.11), (6.13), (6.15), and (8.89) give

$$\begin{array}{*{20}c} {{{{k^2}} \over {{a^2}}}\Psi + 3H(H\Phi + \dot \Psi) = - {1 \over {2F}}\left[ {\left({3{H^2} + 3\dot H - {{{k^2}} \over {{a^2}}}} \right)\delta F - 3H\dot{\delta F}} \right.}\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\\ {\left. {+ 3H\dot F\Phi + 3\dot F(H\Phi + \dot \Psi) + {\kappa ^2}\delta {\rho _m}} \right]\,,}\\ \end{array}$$
(8.90)
$$\Psi - \Phi = {{\delta F} \over F}\,,$$
(8.91)
$$\ddot{\delta F} + 3H\dot{\delta F} + \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta F = {{{\kappa ^2}} \over 3}\delta {\rho _m} + \dot F(3H\Phi + 3\dot \Psi + \dot \Phi) + (2\ddot F + 3H\dot F)\Phi \,,$$
(8.92)
$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}\Phi = 3\ddot B + 6H\dot B\,,$$
(8.93)

where B = Hv + Ψ. In order to derive Eq. (8.92), we have used the mass squared M² = (F/F,R − R)/3 introduced in Eq. (5.2) together with the relation δR = δF/F,R.

Let us consider the wavenumber k deep inside the Hubble radius (k ≫ aH). In order to derive the equation of matter perturbations approximately, we use the quasi-static approximation under which the dominant terms in Eqs. (8.90)–(8.93) correspond to those including k²/a², δρm (or δm) and M². In General Relativity this approximation was first used by Starobinsky in the presence of a minimally coupled scalar field [567], which was numerically confirmed in [403]. This was further extended to scalar-tensor theories [93, 171, 586] and f(R) gravity [586, 597]. Precisely speaking, in f(R) gravity, this approximation corresponds to

$$\left\{{{{{k^2}} \over {{a^2}}}\vert \Phi \vert, {{{k^2}} \over {{a^2}}}\vert \Psi \vert, {{{k^2}} \over {{a^2}}}\vert \delta F\vert, {M^2}\vert \delta F\vert} \right\} \gg \{{H^2}\vert \Phi \vert, {H^2}\vert \Psi \vert, {H^2}\vert B\vert, {H^2}\vert \delta F\vert \} \,,$$
(8.94)

and

$$\vert \dot X\vert \underset{\sim}{<} \vert HX\vert \,,\quad {\rm{where}}\quad X = \Phi, \Psi, F,\dot F,\delta F,\dot{\delta F}\,.$$
(8.95)

From Eqs. (8.90) and (8.91) it then follows that

$$\Psi \simeq {1 \over {2F}}\left({\delta F - {{{a^2}} \over {{k^2}}}{\kappa ^2}\delta {\rho _m}} \right)\,,\qquad \Phi \simeq - {1 \over {2F}}\left({\delta F + {{{a^2}} \over {{k^2}}}{\kappa ^2}\delta {\rho _m}} \right)\,.$$
(8.96)

Since (k²/a² + M²)δF ≃ κ²δρm/3 from Eq. (8.92), we obtain

$${{{k^2}} \over {{a^2}}}\Psi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{2 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,,\qquad {{{k^2}} \over {{a^2}}}\Phi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,.$$
(8.97)

We also define the effective gravitational potential

$${\Phi _{{\rm{eff}}}} \equiv (\Phi + \Psi)/2\,.$$
(8.98)

This quantity characterizes the deviation of light rays, which is linked with the Integrated Sachs-Wolfe (ISW) effect in CMB [544] and weak lensing observations [27]. From Eq. (8.97) we have

$${\Phi _{{\rm{eff}}}} \simeq - {{{\kappa ^2}} \over {2F}}{{{a^2}} \over {{k^2}}}\delta {\rho _m}\,.$$
(8.99)
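
The algebra leading from Eq. (8.96) to Eqs. (8.97) and (8.99) can be verified with exact rational arithmetic. The Python sketch below is an illustrative check, not part of the original derivation. The function name and the sample input values are hypothetical, and δF is taken from the quasi-static limit of Eq. (8.92), (k²/a² + M²)δF ≃ κ²δρm/3.

```python
from fractions import Fraction

def quasi_static_potentials(k2_a2, M2, F, kappa2_drho):
    """Return (Psi, Phi) from Eq. (8.96) with delta F from quasi-static Eq. (8.92)."""
    dF = kappa2_drho / (3 * (k2_a2 + M2))
    Psi = (dF - kappa2_drho / k2_a2) / (2 * F)
    Phi = -(dF + kappa2_drho / k2_a2) / (2 * F)
    return Psi, Phi

# Hypothetical sample values in arbitrary units (exact rationals).
k2_a2 = Fraction(5)
M2 = Fraction(2)
F = Fraction(9, 10)
kappa2_drho = Fraction(1, 100)

Psi, Phi = quasi_static_potentials(k2_a2, M2, F, kappa2_drho)
x = M2 / k2_a2  # x = M^2 a^2 / k^2

# Right-hand sides of Eq. (8.97), divided by k^2/a^2.
Psi_897 = -(kappa2_drho / (2 * F)) * (2 + 3 * x) / (3 * (1 + x)) / k2_a2
Phi_897 = -(kappa2_drho / (2 * F)) * (4 + 3 * x) / (3 * (1 + x)) / k2_a2

# Eq. (8.99): the mass dependence cancels in Phi_eff = (Phi + Psi) / 2.
Phi_eff = (Phi + Psi) / 2
print(Psi == Psi_897, Phi == Phi_897, Phi_eff == -kappa2_drho / (2 * F * k2_a2))
```

The cancellation of M² in Φeff is the reason why, in the quasi-static regime, light deflection is modified only through the factor 1/F.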

From Eq. (6.12) the term Hv is of the order of H²Φ/(κ²ρm) provided that the deviation from the ΛCDM model is not significant. Using Eq. (8.97) we find that the ratio \(3Hv/(\delta {\rho _m}/{\rho _m})\) is of the order of (aH/k)², which is much smaller than unity for sub-horizon modes. Then the gauge-invariant perturbation δm given in Eq. (8.88) can be approximated as δm ≃ δρm/ρm. Neglecting the r.h.s. of Eq. (8.93) relative to the l.h.s. and using Eq. (8.97) with δρm ≃ ρmδm, we get the equation for matter perturbations:

$${\ddot \delta _m} + 2H{\dot \delta _m} - 4\pi {G_{{\rm{eff}}}}{\rho _m}{\delta _m} \simeq 0\,,$$
(8.100)

where Geff is the effective (cosmological) gravitational coupling defined by [586, 597]

$${G_{{\rm{eff}}}} \equiv {G \over F}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}\,.$$
(8.101)

We recall that viable f(R) dark energy models are constructed to have a large mass M in the region of high density (R ≫ R0). During the radiation and deep matter eras the deviation parameter m = Rf,RR/f,R is much smaller than 1, so that the mass squared satisfies

$${M^2} = {R \over 3}\left({{1 \over m} - 1} \right) \gg R\,.$$
(8.102)

If m grows to the order of 0.1 by the present epoch, then the mass M today can be of the order of H0. In the regimes M² ≫ k²/a² and M² ≪ k²/a² the effective gravitational coupling has the asymptotic forms Geff ≃ G/F and Geff ≃ 4G/(3F), respectively. The former corresponds to the “General Relativistic (GR) regime” in which the evolution of δm mimics that in GR, whereas the latter corresponds to the “scalar-tensor regime” in which the evolution of δm is non-standard. For the f(R) models (4.83) and (4.84) the transition from the former regime to the latter regime, which is characterized by the condition M² = k²/a², can occur during the matter domination for the wavenumbers relevant to the matter power spectrum [306, 568, 587, 270, 589].
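
The two asymptotic regimes of Eq. (8.101) can be read off numerically. The short Python sketch below is only an illustration with a hypothetical function name: it evaluates Geff/G as a function of M² and k²/a² and checks the limits G/F and 4G/(3F).

```python
def G_eff_over_G(M2, k2_a2, F=1.0):
    """Eq. (8.101): G_eff / G as a function of M^2 and k^2/a^2."""
    x = M2 / k2_a2  # x = M^2 a^2 / k^2
    return (4 + 3 * x) / (3 * (1 + x)) / F

gr_regime = G_eff_over_G(M2=1e8, k2_a2=1.0)             # M^2 >> k^2/a^2
scalar_tensor_regime = G_eff_over_G(M2=1e-8, k2_a2=1.0)  # M^2 << k^2/a^2
transition = G_eff_over_G(M2=1.0, k2_a2=1.0)             # M^2 = k^2/a^2
print(gr_regime, scalar_tensor_regime, transition)
```

At the transition point M² = k²/a² the coupling takes the intermediate value 7G/(6F).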

In order to derive Eq. (8.100) we used the approximation that the time-derivative terms of δF on the l.h.s. of Eq. (8.92) are neglected. In the regime M² ≫ k²/a², however, the large mass M can induce rapid oscillations of δF. In the following we shall study the evolution of the oscillating mode [568]. For sub-horizon perturbations Eq. (8.92) is approximately given by

$$\ddot{\delta F}+ 3H\dot{\delta F}+ \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta F \simeq {{{\kappa ^2}} \over 3}\delta {\rho _m}\,.$$
(8.103)

The solution of this equation is the sum of the matter-induced mode δFind ≃ (κ²/3)δρm/(k²/a² + M²) and the oscillating mode δFosc satisfying

$${\ddot{\delta F}_{{\rm{osc}}}} + 3H{\dot{\delta F} _{{\rm{osc}}}} + \left({{{{k^2}} \over {{a^2}}} + {M^2}} \right)\delta {F_{{\rm{osc}}}} = 0\,.$$
(8.104)

As long as the frequency \(\omega = \sqrt {{k^2}/{a^2} + {M^2}}\) satisfies the adiabatic condition \(\vert \dot \omega \vert \, \ll {\omega ^2}\), we obtain the solution of Eq. (8.104) under the WKB approximation:

$$\delta {F_{{\rm{osc}}}} \simeq c{a^{- 3/2}}{1 \over {\sqrt {2\omega}}}\cos \left({\int \omega {\rm{d}}t} \right)\,,$$
(8.105)

where c is a constant. Hence the solution of the perturbation δR is expressed by [568, 587]

$$\delta R \simeq {1 \over {3{f_{,RR}}}}{{{\kappa ^2}\delta {\rho _m}} \over {{k^2}/{a^2} + {M^2}}} + c{a^{- 3/2}}{1 \over {{f_{,RR}}\sqrt {2\omega}}}\cos \left({\int \omega {\rm{d}}t} \right)\,.$$
(8.106)

For viable f(R) models, the scale factor a and the background Ricci scalar R(0) evolve as \(a \propto {t^{2/3}}\) and \({R^{(0)}} \simeq 4/(3{t^2})\) during the matter era. Then the amplitude of δRosc relative to R(0) has the time-dependence

$${{\vert \delta {R_{{\rm{osc}}}}\vert} \over {{R^{(0)}}}} \propto {{{M^2}t} \over {{{({k^2}/{a^2} + {M^2})}^{1/4}}}}\,.$$
(8.107)

The f(R) models (4.83) and (4.84) behave as m(r) = C(−r − 1)^p with p = 2n + 1 in the regime R ≫ Rc. During the matter-dominated epoch the mass M evolves as \(M \propto {t^{- (p + 1)}}\). In the regime M² ≫ k²/a² one has \(\vert \delta {R_{{\rm{osc}}}}\vert/{R^{(0)}} \propto {t^{- (3p + 1)/2}}\) and hence the amplitude of the oscillating mode decreases faster than R(0). However the contribution of the oscillating mode tends to be more important as we go back to the past. In fact, this behavior was confirmed in the numerical simulations of [587, 36]. This property persists in the radiation-dominated epoch as well. If the condition ∣δR∣ < R(0) is violated, then R can be negative such that the condition f,R > 0 or f,RR > 0 is violated for the models (4.83) and (4.84). Thus we require that ∣δR∣ is smaller than R(0) at the beginning of the radiation era. This can be achieved by choosing the constant in Eq. (8.106) to be sufficiently small, which amounts to a fine tuning for these models.

For the models (4.83) and (4.84) one has \(F = 1 - 2n\mu {(R/{R_c})^{- 2n - 1}}\) in the regime R ≫ Rc. Then the field ϕ defined in Eq. (2.31) rapidly approaches 0 as we go back to the past. Recall that in the Einstein frame the effective potential of the field has a potential minimum around ϕ = 0 because of the presence of the matter coupling. Unless the oscillating mode of the field perturbation δϕ is strongly suppressed relative to the background field ϕ(0), the system can access the curvature singularity at ϕ = 0 [266]. This is associated with the condition ∣δR∣ < R(0) discussed above. This curvature singularity appears in the past, which is not related to the future singularities studied in [461, 54]. The past singularity can be cured by taking into account the R² term [37], as we will see in Section 13.3. We note that the f(R) models proposed in [427] [e.g., f(R) = R − αRc ln(1 + R/Rc)] to cure the singularity problem satisfy neither the local gravity constraints [580] nor observational constraints of large-scale structure [194].

As long as the oscillating mode δRosc is negligible relative to the matter-induced mode δRind, we can estimate the evolution of matter perturbations δm as well as the effective gravitational potential Φeff. Note that in [192, 434] the perturbation equations have been derived without neglecting the oscillating mode. As long as the condition ∣δRosc∣ < ∣δRind∣ is satisfied initially, the approximate equation (8.100) is accurate enough to reproduce the numerical solutions [192, 589]. Equation (8.100) can be written as

$${{{{\rm{d}}^2}{\delta _m}} \over {{\rm{d}}{N^2}}} + \left({{1 \over 2} - {3 \over 2}{w_{{\rm{eff}}}}} \right){{{\rm{d}}{\delta _m}} \over {{\rm{d}}N}} - {3 \over 2}{\Omega _m}{{4 + 3{M^2}{a^2}/{k^2}} \over {3(1 + {M^2}{a^2}/{k^2})}}{\delta _m} = 0\,,$$
(8.108)

where N = ln a, \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\), and Ωm = 8πGρm/(3FH²). The matter-dominated epoch corresponds to weff = 0 and Ωm = 1. In the regime M² ≫ k²/a² the evolution of δm and Φeff during the matter dominance is given by

$${\delta _m} \propto {t^{2/3}},\qquad {\Phi _{{\rm{eff}}}} = {\rm{constant}}\,,$$
(8.109)

where we used Eq. (8.99). The matter-induced mode δRind relative to the background Ricci scalar R(0) evolves as \(\vert \delta {R_{{\rm{ind}}}}\vert/{R^{(0)}} \propto {t^{2/3}} \propto {\delta _m}\). At late times the perturbations can enter the regime M² ≪ k²/a², depending on the wavenumber k and the mass M. When M² ≪ k²/a², the evolution of δm and Φeff during the matter era is [568]

$${\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\,,\qquad {\Phi _{{\rm{eff}}}} \propto {t^{(\sqrt {33} - 5)/6}}\,.$$
(8.110)

For the model m(r) = C(−r − 1)^p, the evolution of the matter-induced mode in the region M² ≪ k²/a² is given by \(\vert \delta {R_{{\rm{ind}}}}\vert/{R^{(0)}} \propto {t^{- 2p + (\sqrt {33} - 5)/6}}\). This decreases more slowly relative to the ratio ∣δRosc∣/R(0) [587], so the oscillating mode tends to be unimportant with time.
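
The growth rates (8.109) and (8.110) follow from the characteristic equation of Eq. (8.108) in the matter era. The Python sketch below is an illustrative check with a hypothetical function name. It assumes weff = 0, Ωm = 1, a constant ratio g = Geff F/G, and a ∝ t^{2/3}, so that a solution δm ∝ e^{sN} obeys s² + s/2 − (3/2)g = 0.

```python
import math

def growth_exponent_in_t(g):
    """Growing-mode power p in delta_m ~ t^p during matter domination.

    g = G_eff F / G is held constant. With w_eff = 0 and Omega_m = 1,
    Eq. (8.108) gives s^2 + s/2 - (3/2) g = 0 for delta_m ~ exp(s N),
    and a ~ t^(2/3) converts the exponent in N = ln a into one in t.
    """
    s = (-0.5 + math.sqrt(0.25 + 6 * g)) / 2
    return 2 * s / 3

p_gr = growth_exponent_in_t(1.0)      # GR regime, g = 1
p_st = growth_exponent_in_t(4 / 3)    # scalar-tensor regime, g = 4/3
# Phi_eff ~ a^2 rho_m delta_m / F ~ delta_m / a for nearly constant F, Eq. (8.99).
phi_eff_exponent_st = p_st - 2 / 3
print(p_gr, p_st, phi_eff_exponent_st)
```

The growing mode is thus enhanced from t^{2/3} to roughly t^{0.79} once a mode enters the scalar-tensor regime, while Φeff ceases to be constant.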

The impact on large-scale structure

We have shown that the evolution of matter perturbations during the matter dominance is given by \({\delta _m} \propto {t^{2/3}}\) for M² ≫ k²/a² (GR regime) and \({\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\) for M² ≪ k²/a² (scalar-tensor regime), respectively. The existence of the latter phase gives rise to the modification to the matter power spectrum [146, 74, 544, 526, 251] (see also [597, 493, 494, 94, 446, 278, 435] for related works). The transition from the GR regime to the scalar-tensor regime occurs at M² = k²/a². If it occurs during the matter dominance (R ≃ 3H²), the condition M² = k²/a² translates into [589]

$$m \simeq {(aH/k)^2}\,,$$
(8.111)

where we have used the relation M² ≃ R/(3m) (valid for m ≪ 1).

We are interested in the wavenumbers k relevant to the linear regime of the galaxy power spectrum [577, 578]:

$$0.01\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\underset{\sim}{<} k \underset{\sim}{<} 0.2\,h\;{\rm{Mp}}{{\rm{c}}^{- 1}}\,,$$
(8.112)

where h = 0.72 ± 0.08 parametrizes the Hubble parameter today, \(H_0 = 100\,h\,{\rm km\,s^{-1}\,Mpc^{-1}}\), together with its uncertainty. Non-linear effects are important for \(k \gtrsim 0.2\,h\,{\rm Mpc}^{-1}\). The current observations on large scales around \(k \sim 0.01\,h\,{\rm Mpc}^{-1}\) are not so accurate but can be improved in the future. The upper bound \(k = 0.2\,h\,{\rm Mpc}^{-1}\) corresponds to \(k \simeq 600a_0H_0\), where the subscript “0” represents quantities today. If the transition from the GR regime to the scalar-tensor regime occurred by the present epoch (the redshift z = 0) for the mode \(k = 600a_0H_0\), then the parameter m today is constrained to be

$$m(z = 0)\underset{\sim}{>} 3 \times {10^{- 6}}\,.$$
(8.113)

When \(m(z = 0) \lesssim 3 \times 10^{-6}\) the linear perturbations have always been in the GR regime up to today, in which case the models cannot be distinguished from the ΛCDM model. The bound (8.113) is relaxed for non-linear perturbations with \(k \gtrsim 0.2\,h\,{\rm Mpc}^{-1}\), but the linear analysis is not valid in such cases.
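The numbers entering the bound (8.113) can be checked with a few lines of Python. The sketch below restores the speed of light to convert \(k = 0.2\,h\,{\rm Mpc}^{-1}\) into units of \(a_0H_0\), and then evaluates the transition condition (8.111) at z = 0, which, as in the text, is only an order-of-magnitude estimate because (8.111) was derived for the matter era. The function and constant names are ours.

```python
C_KM_S = 299792.458   # speed of light in km/s
H0_OVER_H = 100.0     # H0 / h in km/s/Mpc


def k_in_units_of_a0H0(k_h_per_mpc):
    """Convert a wavenumber in h/Mpc into the dimensionless ratio k/(a0 H0)."""
    return k_h_per_mpc * C_KM_S / H0_OVER_H


def m_at_transition(k_over_aH):
    """Eq. (8.111): the value of m at which M^2 = k^2/a^2."""
    return 1.0 / k_over_aH ** 2


ratio = k_in_units_of_a0H0(0.2)
print(ratio, m_at_transition(ratio))
```

The ratio comes out close to 600 and the corresponding m close to \(3 \times 10^{-6}\), as quoted above.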

If the transition characterized by the condition (8.111) occurs during the deep matter era (z ≫ 1), we can estimate the critical redshift \(z_k\) at the transition point. In the following let us consider the models (4.83) and (4.84). In addition to the approximations \({H^2} \simeq H_0^2\Omega _m^{(0)}{(1 + z)^3}\) and \(R \simeq 3H^2\) during the matter dominance, we use the asymptotic forms \(m \simeq C(-r-1)^{2n+1}\) and \(r \simeq -1 - \mu R_c/R\) with \(C = 2n(2n+1)/\mu^{2n}\). Since the dark energy density today can be approximated as \(\rho _{{\rm{DE}}}^{(0)} \approx \mu {R_c}/2\), it follows that \(\mu {R_c} \approx 6H_0^2\Omega _{{\rm{DE}}}^{(0)}\). Then the condition (8.111) translates into the critical redshift [589]

$${z_k} = {\left[ {{{\left({{k \over {{a_0}{H_0}}}} \right)}^2}{{2n(2n + 1)} \over {{\mu ^{2n}}}}{{{{(2\Omega _{{\rm{DE}}}^{(0)})}^{2n + 1}}} \over {{{(\Omega _m^{(0)})}^{2(n + 1)}}}}} \right]^{1/(6n + 4)}} - 1\,.$$
(8.114)

For n = 1, μ = 3, \(\Omega _m^{(0)} = 0.28\), and k = 300a0H0 the numerical value of the critical redshift is zk = 4.5, which is in good agreement with the analytic value estimated by (8.114).
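The analytic estimate (8.114) is straightforward to evaluate. The Python sketch below does so for the parameters just quoted; it assumes a flat universe with \(\Omega_{\rm DE}^{(0)} = 1 - \Omega_m^{(0)}\) (radiation neglected), which is our assumption rather than a statement of the source, and the function name is ours.

```python
def critical_redshift(k_over_a0H0, n, mu, omega_m0):
    """Critical redshift z_k of Eq. (8.114) for the models (4.83) and (4.84)."""
    omega_de0 = 1.0 - omega_m0   # assumption: flat universe, radiation neglected
    c_coeff = 2.0 * n * (2.0 * n + 1.0) / mu ** (2.0 * n)
    bracket = (k_over_a0H0 ** 2 * c_coeff
               * (2.0 * omega_de0) ** (2.0 * n + 1.0)
               / omega_m0 ** (2.0 * (n + 1.0)))
    return bracket ** (1.0 / (6.0 * n + 4.0)) - 1.0


print(critical_redshift(300.0, n=1.0, mu=3.0, omega_m0=0.28))
```

With these inputs the estimate is about 4.6, close to the numerical value \(z_k = 4.5\). Increasing k, or decreasing n and μ, moves the transition to higher redshift.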

The estimation (8.114) shows that, for larger k, the transition occurs earlier. The time \(t_k\) at the transition has a k-dependence: \(t_k \propto k^{-3/(6n+4)}\). For \(t > t_k\) the matter perturbation evolves as \({\delta _m} \propto {t^{(\sqrt {33} - 1)/6}}\) until the time \(t = t_\Lambda\) corresponding to the onset of cosmic acceleration (\(\ddot{a} = 0\)). The matter power spectrum \({P_{{\delta _m}}} = \vert {\delta _m}{\vert ^2}\) at the time \(t_\Lambda\) shows a difference compared to the case of the ΛCDM model [568]:

$${{{P_{{\delta _m}}}({t_\Lambda})} \over {{P_{{\delta _m}}}^{\Lambda {\rm{CDM}}}({t_\Lambda})}} = {\left({{{{t_\Lambda}} \over {{t_k}}}} \right)^{2\left({{{\sqrt {33} - 1} \over 6} - {2 \over 3}} \right)}} \propto {k^{{{\sqrt {33} - 5} \over {6n + 4}}}}\,.$$
(8.115)

We caution that, when zk is close to zΛ (the redshift at t = tΛ), the estimation (8.115) begins to lose its accuracy. The ratio of the two power spectra today, i.e., \({P_{{\delta _m}}}({t_0})/{P_{{\delta _m}}}^{\Lambda {\rm{CDM}}}({t_0})\) is in general different from Eq. (8.115). However, numerical simulations in [587] show that the difference is small for n of the order of unity.

The modified evolution (8.110) of the effective gravitational potential for z < zk leads to the integrated Sachs-Wolfe (ISW) effect in CMB anisotropies [544, 382, 545]. However this is limited to very large scales (low multipoles) in the CMB spectrum. Meanwhile the galaxy power spectrum is directly affected by the non-standard evolution of matter perturbations. From Eq. (8.115) there should be a difference between the spectral indices of the CMB spectrum and the galaxy power spectrum on the scale (8.112) [568]:

$$\Delta {n_s} = {{\sqrt {33} - 5} \over {6n + 4}}\,.$$
(8.116)

Observationally we do not find any strong signature for the difference of slopes of the two spectra. If we take the mild bound Δns < 0.05, we obtain the constraint n > 2. Note that in this case the local gravity constraint (5.60) is also satisfied.
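To see how the bound on \(\Delta n_s\) translates into a bound on n, one can simply invert Eq. (8.116). The short Python sketch below does this; the function names are ours.

```python
import math


def delta_ns(n):
    """Difference of spectral indices, Eq. (8.116)."""
    return (math.sqrt(33.0) - 5.0) / (6.0 * n + 4.0)


def min_n_for_bound(bound):
    """Smallest n for which delta_ns(n) stays below the given bound."""
    return ((math.sqrt(33.0) - 5.0) / bound - 4.0) / 6.0


print(delta_ns(1.0), delta_ns(2.0), min_n_for_bound(0.05))
```

The threshold lies slightly below n = 2, so the case n = 1 violates the mild bound \(\Delta n_s < 0.05\) while n = 2 satisfies it, consistent with the rounded constraint quoted above.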

In order to estimate the growth rate of matter perturbations, we introduce the growth index γ defined by [484]

$${f_\delta} \equiv {{{{\dot \delta}_m}} \over {H{\delta _m}}} = {({\tilde \Omega _m})^\gamma}\,,$$
(8.117)

where \({{\tilde \Omega}_m} = {\kappa ^2}{\rho _m}/(3{H^2}) = F{\Omega _m}\). This choice of \({{\tilde \Omega}_m}\) comes from writing Eq. (4.59) in the form \(3H^2 = \rho_{\rm DE} + \kappa^2\rho_m\), where \(\rho_{\rm DE} \equiv (FR - f)/2 - 3H\dot{F} + 3H^2(1 - F)\) and we have ignored the contribution of radiation. Since the viable f(R) models are close to the ΛCDM model in the region of high density, the quantity F approaches 1 in the asymptotic past. Defining \(\rho_{\rm DE}\) and \({{\tilde \Omega}_m}\) in the above way, the Friedmann equation can be cast in the usual GR form with non-relativistic matter and dark energy [568, 270, 589].

The growth index in the ΛCDM model corresponds to γ ≃ 0.55 [612, 395], which is nearly constant for 0 < z < 1. In f(R) gravity, if the perturbations are in the GR regime (\(M^2 \gg k^2/a^2\)) today, γ is close to the GR value. Meanwhile, if the transition to the scalar-tensor regime occurred at the redshift \(z_k\) larger than 1, the growth index becomes smaller than 0.55 [270]. Since \(0 < {{\tilde \Omega}_m} < 1\), the smaller γ implies a larger growth rate.
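The last statement can be made concrete with Eq. (8.117). The Python sketch below compares the growth rate for the ΛCDM value γ = 0.55 with that for a smaller index, γ = 0.41 (the present value found for the model discussed next); the input \({\tilde \Omega}_m = 0.28\) is merely an illustrative choice of ours.

```python
def growth_rate(omega_m_tilde, gamma):
    """Growth rate f_delta = (Omega_m tilde)^gamma, Eq. (8.117)."""
    return omega_m_tilde ** gamma


f_lcdm = growth_rate(0.28, 0.55)   # LambdaCDM-like growth index
f_fr = growth_rate(0.28, 0.41)     # smaller index, scalar-tensor regime
print(f_lcdm, f_fr)
```

For these inputs the growth rate rises from roughly 0.50 to roughly 0.59 when γ is lowered from 0.55 to 0.41.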

In Figure 4 we plot the evolution of the growth index γ in the model (4.83) with n = 1 and μ = 1.55 for a number of different wavenumbers. In this case the present value of γ is degenerate around γ0 ≃ 0.41 independent of the scales of our interest. For the wavenumbers k = 0.1 h Mpc−1 and k = 0.01 h Mpc−1 the transition redshifts correspond to zk = 5.2 and zk = 2.7, respectively. Hence these modes have already entered the scalar-tensor regime by today.

Figure 4

Evolution of γ versus the redshift z in the model (4.83) with n = 1 and μ = 1.55 for four different values of k. For these model parameters the dispersion of γ with respect to k is very small. All the perturbation modes shown in the figure have reached the scalar-tensor regime (\(M^2 \ll k^2/a^2\)) by today. From [589].

From Eq. (8.114) we find that \(z_k\) gets smaller for larger n and μ. If the mode \(k = 0.2\,h\,{\rm Mpc}^{-1}\) crossed the transition point at \({z_k} > {\mathcal O}(1)\) and the mode \(k = 0.01\,h\,{\rm Mpc}^{-1}\) has marginally entered (or has not entered) the scalar-tensor regime by today, then the growth indices should be strongly dispersed. For sufficiently large values of n and μ one can expect that the transition to the regime \(M^2 \ll k^2/a^2\) has not occurred by today. The following three cases appear depending on the values of n and μ [589]:

  (i) All modes have values of γ0 close to the ΛCDM value γ0 = 0.55, i.e., 0.53 ≲ γ0 ≲ 0.55.

  (ii) All modes have values of γ0 in the range 0.40 ≲ γ0 ≲ 0.43.

  (iii) The values of γ0 are dispersed in the range 0.40 ≲ γ0 ≲ 0.55.

The region (i) corresponds to the opposite of the inequality (8.113), i.e., \(m(z = 0) \lesssim 3 \times 10^{-6}\), in which case n and μ take large values. The border between (i) and (iii) is characterized by the condition \(m(z = 0) \approx 3 \times 10^{-6}\). The region (ii) corresponds to small values of n and μ (as in the numerical simulation of Figure 4), in which case the mode \(k = 0.01\,h\,{\rm Mpc}^{-1}\) entered the scalar-tensor regime for \({z_k} > {\mathcal O}(1)\).

The regions (i), (ii), (iii) can be found numerically by solving the perturbation equations. In Figure 5 we plot those regions for the model (4.84) together with the bounds coming from the local gravity constraints as well as the stability of the late-time de Sitter point. Note that the result in the model (4.83) is also similar to that in the model (4.84). The parameter space for n ≲ 3 and \(\mu = {\mathcal O}(1)\) is dominated by either the region (ii) or the region (iii). While the present observational constraint on γ is quite weak, the unusual converged or dispersed spectra found above can be useful to distinguish metric f(R) gravity from the ΛCDM model in future observations. We also note that for other viable f(R) models such as (4.89) the growth index today can be as small as γ0 ≃ 0.4 [589]. If future observations detect such unusually small values of γ0, this can be a smoking gun for f(R) models.

Figure 5

The regions (i), (ii) and (iii) for the model (4.84). We also show the bound n > 0.9 coming from the local gravity constraints as well as the condition (4.87) coming from the stability of the de Sitter point. From [589].

Non-linear matter perturbations

So far we have discussed the evolution of linear perturbations relevant to the matter spectrum for the scale k ≲ 0.01–0.2 h Mpc−1. For smaller scale perturbations the effect of non-linearity becomes important. In GR there are some mapping formulas from the linear power spectrum to the non-linear power spectrum such as the halo fitting by Smith et al. [540]. In the halo model the non-linear power spectrum P(k) is defined by the sum of two pieces [169]:

$$P(k) = {I_1}(k) + {I_2}{(k)^2}{P_L}(k)\,,$$
(8.118)

where PL(k) is a linear power spectrum and

$${I_1}(k) = \int {{{{\rm{d}}M} \over M}} {\left({{M \over {\rho _m^{(0)}}}} \right)^2}{{{\rm{d}}n} \over {{\rm{d}}\ln M}}\,{y^2}(M,k)\,,\qquad {I_2}(k) = \int {{{{\rm{d}}M} \over M}} \left({{M \over {\rho _m^{(0)}}}} \right){{{\rm{d}}n} \over {{\rm{d}}\ln M}}\,b(M)y(M,k)\,.$$
(8.119)

Here M is the mass of dark matter halos, \(\rho _m^{(0)}\) is the dark matter density today, dn/dln M is the mass function describing the comoving number density of halos, y(M, k) is the Fourier transform of the halo density profile, and b(M) is the halo bias.

In modified gravity theories, Hu and Sawicki (HS) [307] provided a fitting formula to describe a non-linear power spectrum based on the halo model. The mass function dn/d ln M and the halo profile ρ depend on the root-mean-square σ(M) of a linear density field. The Sheth-Tormen mass function [535] and the Navarro-Frenk-White halo profile [449] are usually employed in GR. Replacing σ with \(\sigma_{\rm GR}\), obtained in the GR dark energy model that follows the same expansion history as the modified gravity model, we obtain a non-linear power spectrum P(k) according to Eq. (8.118). In [307] this non-linear spectrum is called \(P_\infty(k)\). It is also possible to obtain a non-linear spectrum \(P_0(k)\) by applying a usual (halo) mapping formula in GR to modified gravity. This approach is based on the assumption that the growth rate in the linear regime determines the non-linear spectrum. Hu and Sawicki proposed a parametrized non-linear spectrum that interpolates between the two spectra \(P_\infty(k)\) and \(P_0(k)\) [307]:

$$P(k) = {{{P_0}(k) + {c_{{\rm{nl}}}}{\Sigma ^2}(k){P_\infty}(k)} \over {1 + {c_{{\rm{nl}}}}{\Sigma ^2}(k)}}\,,$$
(8.120)

where \(c_{\rm nl}\) is a parameter which controls whether P(k) is close to \(P_0(k)\) or \(P_\infty(k)\). In [307] they have taken the form \(\Sigma^2(k) = k^3P_L(k)/(2\pi^2)\).
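The interpolation (8.120) is a weighted average of the two spectra, with weight \(c_{\rm nl}\Sigma^2(k)\). A minimal Python sketch, with function names of our own choosing and purely illustrative input values, makes the two limits explicit:

```python
import math


def sigma2_hs(k, p_lin):
    """Sigma^2(k) = k^3 P_L(k) / (2 pi^2), the form adopted in [307]."""
    return k ** 3 * p_lin / (2.0 * math.pi ** 2)


def hs_power(p0, p_inf, sigma2, c_nl):
    """Eq. (8.120): interpolation between P_0 (no GR recovery) and P_infinity (GR)."""
    weight = c_nl * sigma2
    return (p0 + weight * p_inf) / (1.0 + weight)


# Illustrative numbers only: c_nl = 0 returns P_0, a very large c_nl returns P_infinity.
print(hs_power(2.0, 1.0, 0.5, 0.0), hs_power(2.0, 1.0, 0.5, 1e12))
```

For any non-negative weight the result lies between the two input spectra, so \(c_{\rm nl}\) effectively measures how efficiently GR is recovered on non-linear scales.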

The validity of the HS fitting formula (8.120) should be checked with N-body simulations in modified gravity models. In [478, 479, 529] N-body simulations were carried out for the f(R) model (4.83) with n = 1/2 (see also [562, 379] for N-body simulations in other modified gravity models). The chameleon mechanism should be at work on small scales (solar-system scales) for consistency with local gravity constraints. In [479] it was found that the chameleon mechanism tends to suppress the enhancement of the power spectrum in the non-linear regime, which corresponds to the recovery of GR. On the other hand, in the post-Newtonian intermediate regime, the power spectrum is enhanced compared to the GR case at a measurable level.

Koyama et al. [371] studied the validity of the HS fitting formula by comparing it with the results of N-body simulations. Note that in this paper the parametrization (8.120) was used as a fitting formula without employing the halo model explicitly. In their notation \(P_0\) corresponds to “Pnon-GR” derived without non-linear interactions responsible for the recovery of GR (i.e., gravity is modified down to small scales in the same manner as in the linear regime), whereas \(P_\infty\) corresponds to “PGR” obtained in the GR dark energy model following the same expansion history as that in the modified gravity model. Note that \(c_{\rm nl}\) characterizes how the theory approaches GR by the chameleon mechanism. Choosing Σ as

$${\Sigma ^2}(k,z) = {\left({{{{k^3}} \over {2{\pi ^2}}}{P_L}(k,z)} \right)^{1/3}}\,,$$
(8.121)

where PL is the linear power spectrum in the modified gravity model, they showed that, in the f(R) model (4.83) with n = 1/2, the formula (8.120) can fit the solutions in perturbation theory very well by allowing the time-dependence of the parameter cnl in terms of the redshift z. In the regime 0 < z < 1 the parameter cnl is approximately given by cnl(z = 0) = 0.085.

In the left panel of Figure 6 the relative difference of the non-linear power spectrum P(k) from the GR spectrum PGR(k) is plotted as a dashed curve (“no chameleon” case with cnl = 0) and as a solid curve (“chameleon” case with non-zero cnl derived in the perturbative regime). Note that in this simulation the fitting formula by Smith et al. [540] is used to obtain the non-linear power spectrum from the linear one. The agreement with N-body simulations is not very good in the non-linear regime (k > 0.1h Mpc−1). In [371] the power spectrum Pnon-GR in the no chameleon case (i.e., cnl = 0) was derived by interpolating the N-body results in [479]. This is plotted as the dashed line in the right panel of Figure 6. Using this spectrum Pnon-GR for cnl ≠ 0, the power spectrum in N-body simulations in the chameleon case can be well reproduced by the fitting formula (8.120) for the scale k < 0.5h Mpc−1 (see the solid line in Figure 6). Although there is some deviation in the regime k > 0.5h Mpc−1, we caution that N-body simulations have large errors in this regime. See [530] for clustered abundance constraints on the f(R) model (4.83) derived by the calibration of N-body simulations.

Figure 6

Comparison between N-body simulations and the two fitting formulas in the f(R) model (4.83) with n = 1/2. The circles and triangles show the results of N-body simulations with and without the chameleon mechanism, respectively. The arrow represents the maximum value of k (\(= 0.08\,h\,{\rm Mpc}^{-1}\)) up to which the perturbation theory is valid. (Left) The fitting formula by Smith et al. [540] is used to predict Pnon-GR and PGR. The solid and dashed lines correspond to the power spectra with and without the chameleon mechanism, respectively. For the chameleon case \(c_{\rm nl}(z)\) is determined by the perturbation theory with \(c_{\rm nl}(z = 0) = 0.085\). (Right) The N-body results in [479] are interpolated to derive Pnon-GR without the chameleon mechanism. The obtained Pnon-GR is used for the HS fitting formula to derive the power spectrum P in the chameleon case. From [371].

In the quasi non-linear regime a normalized skewness, \({S_3} = \langle \delta _m^3\rangle/{\langle \delta _m^2\rangle ^2}\), of matter perturbations can provide a good test for the picture of gravitational instability from Gaussian initial conditions [79]. If large-scale structure grows via gravitational instability from Gaussian initial perturbations, the skewness in a universe dominated by pressureless matter is known to be \(S_3 = 34/7\) in GR [484]. In the ΛCDM model the skewness depends weakly on the expansion history of the universe (less than a few percent) [335]. In f(R) dark energy models the difference of the skewness from the ΛCDM model is only less than a few percent [576], even if the growth rate of matter perturbations is significantly different. This is related to the fact that in the Einstein frame dark energy has a universal coupling \(Q = - 1/\sqrt 6\) with all non-relativistic matter, unlike the coupled quintessence scenario with different couplings between dark energy and matter species (dark matter, baryons) [30].

Cosmic Microwave Background

The effective gravitational potential (8.98) is directly related to the ISW effect in CMB anisotropies. This contributes to the temperature anisotropies today as an integral [308, 214]

$${\Theta _{{\rm{ISW}}}} \equiv \int\nolimits_0^{{\eta _0}} {\rm{d}} \eta {e^{- \tau}}{{{\rm{d}}{\Phi _{{\rm{eff}}}}} \over {{\rm{d}}\eta}}{j_\ell}[k({\eta _0} - \eta)]\,,$$
(8.122)

where τ is the optical depth, \(\eta = \int a^{-1}\,{\rm d}t\) is the conformal time with the present value \(\eta_0\), and \(j_\ell[k(\eta_0 - \eta)]\) is the spherical Bessel function for CMB multipoles ℓ and the wavenumber k. In the limit ℓ ≫ 1 (i.e., small-scale limit) the spherical Bessel function has a dependence \(j_\ell(x) \simeq (x/\ell)^{\ell - 1/2}\), which is suppressed for large ℓ. Hence the dominant contribution to the ISW effect comes from the low ℓ modes (\(\ell = \mathcal O(1)\)).

In the ΛCDM model the effective gravitational potential is constant during the matter dominance, but it begins to decay after the Universe enters the epoch of cosmic acceleration (see the left panel of Figure 7). This late-time variation of Φeff leads to the contribution to ΘISW, which works as the ISW effect.

Figure 7

(Left) Evolution of the effective gravitational potential \(\Phi_{\rm eff}\) (denoted as Φ in the figure) versus the scale factor a (with the present value a = 1) on the scale \(k^{-1} = 10^3\) Mpc for the ΛCDM model and f(R) models with \(B_0\) = 0.5, 1.5, 3.0, 5.0. As the parameter \(B_0\) increases, the decay of \(\Phi_{\rm eff}\) decreases and then turns into growth for \(B_0 \gtrsim 1.5\). (Right) The CMB power spectrum \(\ell(\ell + 1)C_\ell/(2\pi)\) for the ΛCDM model and f(R) models with \(B_0\) = 0.5, 1.5, 3.0, 5.0. As \(B_0\) increases, the ISW contributions to low multipoles decrease, reach the minimum around \(B_0 = 1.5\), and then increase. The black points correspond to the WMAP 3-year data [561]. From [545].

For viable f(R) dark energy models the effective gravitational potential \(\Phi_{\rm eff}\) is constant during the early stage of the matter era, as in the ΛCDM model. After the transition to the scalar-tensor regime, the effective gravitational potential evolves as \({\Phi _{{\rm{eff}}}} \propto {t^{(\sqrt {33} - 5)/6}}\) during the matter dominance [as we have shown in Eq. (8.110)]. The evolution of \(\Phi_{\rm eff}\) during the accelerated epoch is also subject to change compared to the ΛCDM model. In the left panel of Figure 7 we show the evolution of \(\Phi_{\rm eff}\) versus the scale factor a for the wavenumber \(k = 10^{-3}\,{\rm Mpc}^{-1}\) in several different cases. In this simulation the background cosmological evolution is fixed to be the same as that in the ΛCDM model. In order to quantify the difference from the ΛCDM model at the level of perturbations, [628, 544, 545] defined the following quantity

$$B \equiv m\,{{\dot R} \over R}\,{H \over {\dot H}}\,,$$
(8.123)

where \(m = Rf_{,RR}/f_{,R}\). If the effective equation of state \(w_{\rm eff}\) defined in Eq. (4.69) is constant, it then follows that \(R = 3H^2(1 - 3w_{\rm eff})\), so that \(\dot{R}/R = 2\dot{H}/H\) and hence B = 2m. The stability of cosmological perturbations requires the condition B > 0 [544, 526]. The left panel of Figure 7 shows that, as we increase the value of B today (\(= B_0\)), the evolution of \(\Phi_{\rm eff}\) at late times tends to be significantly different from that in the ΛCDM model. This comes from the fact that, for increasing B, the transition to the scalar-tensor regime occurs earlier.

From the right panel of Figure 7 we find that, as B0 increases, the CMB spectrum for low multipoles first decreases and then reaches the minimum around B0 = 1.5. This comes from the reduction in the decay rate of Φeff relative to the ΛCDM model, see the left panel of Figure 7. Around B0 = 1.5 the effective gravitational potential is nearly constant, so that the ISW effect is almost absent (i.e., ΘISW ≈ 0). For B0 ≳ 1.5 the evolution of Φeff turns into growth. This leads to the increase of the large-scale CMB spectrum, as B0 increases. The spectrum in the case B0 = 3.0 is similar to that in the ΛCDM model. The WMAP 3-year data rule out B0 > 4.3 at the 95% confidence level because of the excessive ISW effect [545].

There is another observational constraint coming from the angular correlation between the CMB temperature field and the galaxy number density field induced by the ISW effect [544]. The f(R) models predict that, for \(B_0 \gtrsim 1\), the galaxies are anticorrelated with the CMB because of the sign change of the ISW effect. Since the anticorrelation has not been observed in the observational data of CMB and LSS, this places an upper bound of \(B_0 \lesssim 1\) [545]. This is tighter than the bound \(B_0 < 4.3\) coming from the CMB angular spectrum discussed above.

Finally we briefly mention stochastic gravitational waves produced in the early universe [421, 172, 122, 123, 174, 173, 196, 20]. For the inflation model \(f(R) = R + R^2/(6M^2)\) the primordial gravitational waves are generated with the tensor-to-scalar ratio of the order of \(10^{-3}\), see Eq. (7.73). It is also possible to generate stochastic gravitational waves after inflation under the modification of gravity. Capozziello et al. [122, 123] studied the evolution of tensor perturbations for a toy model \(f = R^{1+\epsilon}\) in the FLRW universe with the power-law evolution of the scale factor. Since the parameter ϵ is constrained to be very small (\(|\epsilon| < 7.2 \times 10^{-19}\)) [62, 160], it is very difficult to detect the signature of f(R) gravity in the stochastic gravitational wave background. This property should hold for viable f(R) dark energy models in general, because the deviation from GR during the radiation and the deep matter era is very small.

Palatini Formalism

In this section we discuss f(R) theory in the Palatini formalism [481]. In this approach the action (2.1) is varied with respect to both the metric gμν and the connection \(\Gamma _{\beta \gamma}^\alpha\). Unlike the metric approach, gμν and \(\Gamma _{\beta \gamma}^\alpha\) are treated as independent variables. Variations using the Palatini approach [256, 607, 608, 261, 262, 260] lead to second-order field equations which are free from the instability associated with negative signs of f,RR [422, 423]. We note that even in the 1930s Lanczos [378] proposed a specific combination of curvature-squared terms that lead to a second-order and divergence-free modified Einstein equation.

The background cosmological dynamics of Palatini f(R) gravity has been investigated in [550, 553, 21, 253, 495], where it was shown that the sequence of radiation, matter, and accelerated epochs can be realized even for the model \(f(R) = R - \alpha/R^n\) with n > 0 (see also [424, 457, 495]). The equations for matter density perturbations were derived in [359]. Because of a large coupling Q between dark energy and non-relativistic matter, dark energy models based on Palatini f(R) gravity are not compatible with the observations of large-scale structure, unless the deviation from the ΛCDM model is very small [356, 386, 385, 597]. Such a large coupling also gives rise to non-perturbative corrections to the matter action, which leads to a conflict with the Standard Model of particle physics [261, 262, 260] (see also [318, 472, 473, 475, 55]).

There are also a number of works [470, 471, 216, 552] about the Newtonian limit in the Palatini formalism (see also [18, 19, 107, 331, 511, 510]). In particular it was shown in [55, 56] that the non-dynamical nature of the scalar-field degree of freedom can lead to a divergence of non-vacuum static spherically symmetric solutions at the surface of a compact object for commonly-used polytropic equations of state. Hence it is difficult to make Palatini f(R) theory compatible with a number of observations and experiments, as long as the models are constructed to explain the late-time cosmic acceleration. Moreover it is also known that in Palatini gravity the Cauchy problem [609] is not well-formulated due to the presence of higher derivatives of matter fields in field equations [377] (see also [520, 135] for related works). We also note that the matter Lagrangian (such as the Lagrangian of Dirac particles) cannot be simply assumed to be independent of connections. Despite the problems mentioned above, it is useful to review this theory, because it shows how modifications of gravity must be constructed in order to be consistent with observations and experiments.

Field equations

Let us derive field equations by treating gμν and \(\Gamma _{\beta \gamma}^\alpha\) as independent variables. Varying the action (2.1) with respect to gμν, we obtain

$$F(R){R_{\mu \nu}}(\Gamma) - {1 \over 2}f(R){g_{\mu \nu}} = {\kappa ^2}T_{\mu \nu}^{(M)},$$
(9.1)

where F(R) = ∂f/∂R, Rμν(Γ) is the Ricci tensor corresponding to the connections \(\Gamma _{\beta \gamma}^\alpha\), and \(T_{\mu \nu}^{(M)}\) is defined in Eq. (2.5). Note that Rμν(Γ) is in general different from the Ricci tensor calculated in terms of metric connections Rμν(g). The trace of Eq. (9.1) gives

$$F(R)R - 2f(R) = {\kappa ^2}T,$$
(9.2)

where \(T = {g^{\mu \nu}}T_{\mu \nu}^{(M)}\). Here the Ricci scalar R(T) is directly related to T and it is different from the Ricci scalar \(R(g) = g^{\mu\nu}R_{\mu\nu}(g)\) in the metric formalism. More explicitly we have the following relation [556]

$$R(T) = R(g) + {3 \over {2{{({f^\prime}(R(T)))}^2}}}({\nabla _\mu}{f^\prime}(R(T)))({\nabla ^\mu}{f^\prime}(R(T))) + {3 \over {{f^\prime}(R(T))}}\square {f^\prime}(R(T)),$$
(9.3)

where a prime represents a derivative in terms of R(T). The variation of the action (2.1) with respect to the connection leads to the following equation

$$\begin{array}{*{20}c} {{R_{\mu \nu}}(g) - {1 \over 2}{g_{\mu \nu}}R(g) = {{{\kappa ^2}{T_{\mu \nu}}} \over F} - {{FR(T) - f} \over {2F}}{g_{\mu \nu}} + {1 \over F}({\nabla _\mu}{\nabla _\nu}F - {g_{\mu \nu}}\square F)\quad \quad \quad \;} \\ {- {3 \over {2{F^2}}}\;\left[ {{\partial _\mu}F{\partial _\nu}F - {1 \over 2}{g_{\mu \nu}}{{(\nabla F)}^2}} \right].} \\ \end{array}$$
(9.4)

In Einstein gravity (f(R) = R − 2Λ and F(R) = 1) the field equations (9.2) and (9.4) are identical to the equations (2.7) and (2.4), respectively. However, the difference appears for the f(R) models which include non-linear terms in R. While the kinetic term □F is present in Eq. (2.7), such a term is absent in Palatini f(R) gravity. This has the important consequence that the oscillatory mode, which appears in the metric formalism, does not exist in the Palatini formalism. As we will see later on, Palatini f(R) theory corresponds to Brans-Dicke (BD) theory [100] with a parameter ωBD = −3/2 in the presence of a field potential. Such a theory should be treated separately, compared to BD theory with ωBD ≠ −3/2 in which the field kinetic term is present.

As we have derived the action (2.21) from (2.18), the action in Palatini f(R) gravity is equivalent to

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over {2{\kappa ^2}}}\varphi R(T) - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}),$$
(9.5)

where

$$\varphi = {f^\prime}(R(T)),\qquad U = {{R(T){f^\prime}(R(T)) - f(R(T))} \over {2{\kappa ^2}}}.$$
(9.6)

Since the derivative of U in terms of φ is U,φ = R/(2κ2), we obtain the following relation from Eq. (9.2):

$$4U - 2\varphi {U_{,\varphi}} = T.$$
(9.7)

Using the relation (9.3), the action (9.5) can be written as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over {2{\kappa ^2}}}\varphi R(g) + {3 \over {4{\kappa ^2}}}{1 \over \varphi}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + \int {{{\rm{d}}^4}} x{{\mathcal L}_M}({g_{\mu \nu}},{\Psi _M}).$$
(9.8)

Comparing this with Eq. (2.23) in the unit κ2 = 1, we find that Palatini f(R) gravity is equivalent to BD theory with the parameter ωBD = −3/2 [262, 470, 551]. As we will see in Section 10.1, this equivalence can be also seen by comparing Eqs. (9.1) and (9.4) with those obtained by varying the action (2.23) in BD theory. In the above discussion we have implicitly assumed that \({\mathcal L_M}\) does not explicitly depend on the Christoffel connections \(\Gamma _{\mu \nu}^\lambda\). This is true for a scalar field or a perfect fluid, but it is not necessarily so for other matter Lagrangians such as those describing vector fields.

There is another way for taking the variation of the action, known as the metric-affine formalism [299, 558, 557, 121]. In this formalism the matter action SM depends not only on the metric gμν but also on the connection \(\Gamma _{\mu \nu}^\lambda\). Since the connection is independent of the metric in this approach, one can define the quantity called hypermomentum [299], as \(\Delta _\lambda ^{\mu \nu} \equiv (- 2/\sqrt {- g})\delta {\mathcal L_M}/\delta \Gamma _{\mu \nu}^\lambda\). The usual assumption that the connection is symmetric is also dropped, so that the antisymmetric quantity called the Cartan torsion tensor, \(S_{\mu \nu}^\lambda \equiv \Gamma _{[\mu \nu ]}^\lambda\), is defined. The non-vanishing property of \(S_{\mu \nu}^\lambda\) allows the presence of torsion in this theory. If the condition \(\Delta _\lambda ^{[\mu \nu ]} = 0\) holds, it follows that the Cartan torsion tensor vanishes \((S_{\mu \nu}^\lambda = 0)\) [558]. Hence the torsion is induced by matter fields with the anti-symmetric hypermomentum. The f(R) Palatini gravity belongs to f(R) theories in the metric-affine formalism with \(\Delta _\lambda ^{\mu \nu} = 0\). In the following we do not discuss further f(R) theory in the metric-affine formalism. Readers who are interested in those theories may refer to the papers [557, 556].

Background cosmological dynamics

We discuss the background cosmological evolution of dark energy models based on Palatini f(R) gravity. We shall carry out general analysis without specifying the forms of f(R). We take into account non-relativistic matter and radiation whose energy densities are ρm and ρr, respectively. In the flat FLRW background (2.12) we obtain the following equations

$$FR - 2f = - {\kappa ^2}{\rho _m},$$
(9.9)
$$6F{\left({H + {{\dot F} \over {2F}}} \right)^2} - f = {\kappa ^2}({\rho _m} + 2{\rho _r}),$$
(9.10)

together with the continuity equations, \({\dot \rho _m} + 3H{\rho _m} = 0\) and \({\dot \rho _r} + 4H{\rho _r} = 0\). Combining Eqs. (9.9) and (9.10) together with the continuity equations, it follows that

$$\dot R = {{3{\kappa ^2}H{\rho _m}} \over {{F_{,R}}R - F}} = - 3H{{F\,R - 2f} \over {{F_{,R}}R - F}},$$
(9.11)
$${H^2} = {{2{\kappa ^2}({\rho _m} + {\rho _r}) + F\,R - f} \over {6F\xi}},$$
(9.12)

where

$$\xi \equiv {\left[ {1 - {3 \over 2}{{{F_{,R}}(F\,R - 2f)} \over {F({F_{,R}}R - F)}}} \right]^2}.$$
(9.13)

In order to discuss cosmological dynamics it is convenient to introduce the dimensionless variables:

$${y_1} \equiv {{F\,R - f} \over {6F\xi {H^2}}},\qquad {y_2} \equiv {{{\kappa ^2}{\rho _r}} \over {3F\xi {H^2}}},$$
(9.14)

by which Eq. (9.12) can be written as

$${{{\kappa ^2}{\rho _m}} \over {3F\xi {H^2}}} = 1 - {y_1} - {y_2}.$$
(9.15)

Differentiating y1 and y2 with respect to N = ln a, we obtain [253]

$${{{\rm{d}}{y_1}} \over {{\rm{d}}N}} = {y_1}\left[ {3 - 3{y_1} + {y_2} + C(R)(1 - {y_1})} \right],$$
(9.16)
$${{{\rm{d}}{y_2}} \over {{\rm{d}}N}} = {y_2}\left[ {- 1 - 3{y_1} + {y_2} - C(R){y_1}} \right],$$
(9.17)

where

$$C(R) \equiv {{R\dot F} \over {H(F\,R - f)}} = - 3{{(F\,R - 2f){F_{,R}}R} \over {(F\,R - f)({F_{,R}}R - F)}}.$$
(9.18)

The following constraint equation also holds

$${{1 - {y_1} - {y_2}} \over {2{y_1}}} = - {{F\,R - 2f} \over {F\,R - f}}.$$
(9.19)

Hence the Ricci scalar R can be expressed in terms of y1 and y2.

Differentiating Eq. (9.12) with respect to t, it follows that

$${{\dot H} \over {{H^2}}} = - {3 \over 2} + {3 \over 2}{y_1} - {1 \over 2}{y_2} - {{\dot F} \over {2H\,F}} - {{\dot \xi} \over {2H\xi}} + {{\dot F\,R} \over {12F\xi {H^3}}},$$
(9.20)

from which we get the effective equation of state:

$${w_{{\rm{eff}}}} = - 1 - {2 \over 3}{{\dot H} \over {{H^2}}} = - {y_1} + {1 \over 3}{y_2} + {{\dot F} \over {3H\,F}} + {{\dot \xi} \over {3H\xi}} - {{\dot F\,R} \over {18F\xi {H^3}}}.$$
(9.21)

The cosmological dynamics is determined by solving Eqs. (9.16) and (9.17) with Eq. (9.18). If C(R) is not constant, then one can use Eq. (9.19) to express R and C(R) in terms of y1 and y2.

The fixed points of Eqs. (9.16) and (9.17) can be found by setting dy1/dN = 0 and dy2/dN = 0. Even when C(R) is not constant, except for the cases C(R) = −3 and C(R) = −4, we obtain the following fixed points [253]:

  1. Pr: (y1, y2) = (0, 1),

  2. Pm: (y1, y2) = (0, 0),

  3. Pd: (y1, y2) = (1, 0).

The stability of the fixed points can be analyzed by considering linear perturbations about them. As long as dC/dy1 and dC/dy2 are bounded, the eigenvalues λ1 and λ2 of the Jacobian matrix of linear perturbations are given by

  1. Pr: (λ1, λ2) = (4 + C(R), 1),

  2. Pm: (λ1, λ2) = (3 + C(R), −1),

  3. Pd: (λ1, λ2) = (−3 − C(R), −4 − C(R)).
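The eigenvalues listed above can be cross-checked numerically. The following is a minimal sketch, assuming C(R) is held constant (so that dC/dy1 and dC/dy2 drop out); the function names `rhs`, `jacobian`, `eigenvalues_2x2` and `analytic_eigenvalues` are illustrative and not taken from [253].

```python
import math

def rhs(y1, y2, C):
    """Right-hand sides of Eqs. (9.16) and (9.17) for a fixed value of C(R)."""
    dy1 = y1 * (3.0 - 3.0 * y1 + y2 + C * (1.0 - y1))
    dy2 = y2 * (-1.0 - 3.0 * y1 + y2 - C * y1)
    return dy1, dy2

def jacobian(y1, y2, C, h=1e-6):
    """Central-difference Jacobian of rhs at (y1, y2), with C treated as constant."""
    p1, m1 = rhs(y1 + h, y2, C), rhs(y1 - h, y2, C)
    p2, m2 = rhs(y1, y2 + h, C), rhs(y1, y2 - h, C)
    return [[(p1[0] - m1[0]) / (2 * h), (p2[0] - m2[0]) / (2 * h)],
            [(p1[1] - m1[1]) / (2 * h), (p2[1] - m2[1]) / (2 * h)]]

def eigenvalues_2x2(J):
    """Real eigenvalues of a 2x2 matrix, largest first."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 + root, tr / 2.0 - root

FIXED_POINTS = {"Pr": (0.0, 1.0), "Pm": (0.0, 0.0), "Pd": (1.0, 0.0)}

def analytic_eigenvalues(C):
    """Eigenvalues quoted in the text for the three fixed points."""
    return {"Pr": (4.0 + C, 1.0),
            "Pm": (3.0 + C, -1.0),
            "Pd": (-3.0 - C, -4.0 - C)}

if __name__ == "__main__":
    C = 0.0  # LambdaCDM value
    for name, (y1, y2) in FIXED_POINTS.items():
        print(name, eigenvalues_2x2(jacobian(y1, y2, C)))
```

With C = 0 this prints values close to the ΛCDM eigenvalues discussed in the text; a non-constant C(R) would add derivative terms that this sketch deliberately omits.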

In the ΛCDM model (f(R) = R − 2Λ) one has weff = −y1 + y2/3 and C(R) = 0. Then the points Pr, Pm, and Pd correspond to weff = 1/3, (λ1, λ2) = (4, 1) (radiation domination, unstable), weff = 0, (λ1, λ2) = (3, −1) (matter domination, saddle), and weff = −1, (λ1, λ2) = (−3, −4) (de Sitter epoch, stable), respectively. Hence the sequence of radiation, matter, and de Sitter epochs is in fact realized.

Let us next consider the model \(f(R) = R - \beta/R^n\) with β > 0 and n > −1. In this case the quantity C(R) is

$$C(R) = 3n{{{R^{1 + n}} - (2 + n)\beta} \over {{R^{1 + n}} + n(2 + n)\beta}}.$$
(9.22)

The constraint equation (9.19) gives

$${\beta \over {{R^{1 + n}}}} = {{2{y_1}} \over {3{y_1} + n({y_1} - {y_2} + 1) - {y_2} + 1}}.$$
(9.23)
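As a consistency check, Eqs. (9.22) and (9.23) can be compared against the general definitions (9.18) and (9.19). The sketch below assumes the model \(f(R) = R - \beta/R^n\) with F = f,R computed by hand; the helper names are illustrative only.

```python
def f_model(R, beta, n):
    """f(R) = R - beta/R**n together with F = f_R and F_R = f_RR."""
    f = R - beta / R**n
    F = 1.0 + n * beta / R**(n + 1)
    F_R = -n * (n + 1) * beta / R**(n + 2)
    return f, F, F_R

def C_general(R, beta, n):
    """C(R) from the second expression in Eq. (9.18)."""
    f, F, F_R = f_model(R, beta, n)
    return -3.0 * (F * R - 2.0 * f) * F_R * R / ((F * R - f) * (F_R * R - F))

def C_closed(R, beta, n):
    """Closed form quoted in Eq. (9.22)."""
    Rn1 = R**(1 + n)
    return 3.0 * n * (Rn1 - (2 + n) * beta) / (Rn1 + n * (2 + n) * beta)

def beta_over_Rn1(y1, y2, n):
    """beta / R**(1+n) as a function of (y1, y2), Eq. (9.23)."""
    return 2.0 * y1 / (3.0 * y1 + n * (y1 - y2 + 1.0) - y2 + 1.0)

def constraint_residual(R, y1, y2, beta, n):
    """Left-hand side minus right-hand side of the constraint, Eq. (9.19)."""
    f, F, _ = f_model(R, beta, n)
    return (1.0 - y1 - y2) / (2.0 * y1) + (F * R - 2.0 * f) / (F * R - f)
```

For example, inverting `beta_over_Rn1` at a chosen (y1, y2) gives the value of R at which `constraint_residual` should vanish, which is the sense in which R is expressed through y1 and y2.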

The late-time de Sitter point corresponds to \(R^{1+n} = (2 + n)\beta\), which exists for n > −2. Since C(R) = 0 in this case, the de Sitter point Pd is stable with the eigenvalues (λ1, λ2) = (−3, −4). During the radiation and matter domination we have \(\beta/R^{1+n} \ll 1\) (i.e., f(R) ≃ R) and hence C(R) ≃ 3n. Pr corresponds to the radiation point (weff = 1/3) with the eigenvalues (λ1, λ2) = (4 + 3n, 1), whereas Pm corresponds to the matter point (weff = 0) with the eigenvalues (λ1, λ2) = (3 + 3n, −1). Provided that n > −1, Pr and Pm are unstable and saddle points respectively, in which case the sequence of radiation, matter, and de Sitter eras can be realized. For the models \(f(R) = R + \alpha R^m - \beta/R^n\), it was shown in [253] that unified models of inflation and dark energy with radiation and matter eras are difficult to realize.

In Figure 8 we plot the evolution of weff as well as y1 and y2 for the model \(f(R) = R - \beta/R^n\) with n = 0.02. This shows that the sequence of (Pr) radiation domination (weff = 1/3), (Pm) matter domination (weff = 0), and (Pd) de Sitter acceleration (weff = −1) is realized. Recall that in metric f(R) gravity the model \(f(R) = R - \beta/R^n\) (β > 0, n > 0) is not viable because f,RR is negative. In Palatini f(R) gravity the sign of f,RR does not matter because there is no propagating degree of freedom with a mass M associated with the second derivative f,RR [554].

Figure 8: The evolution of the variables y1 and y2 for the model \(f(R) = R - \beta/R^n\) with n = 0.02, together with the effective equation of state weff. Initial conditions are chosen to be \(y_1 = 10^{-40}\) and \(y_2 = 1.0 - 10^{-5}\). From [253].
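The evolution of y1 and y2 described here can be reproduced qualitatively with a short integration of Eqs. (9.16) and (9.17). This is a rough sketch and not the code used in [253]: it assumes the model \(f(R) = R - \beta/R^n\), evaluates C(R) through Eqs. (9.22) and (9.23), and uses a fixed-step fourth-order Runge-Kutta scheme whose step size `dN` and end point `N_end` are arbitrary choices.

```python
import math

def C_of_y(y1, y2, n):
    """C(R) in terms of (y1, y2): Eq. (9.23) inserted into Eq. (9.22)."""
    b = 2.0 * y1 / (3.0 * y1 + n * (y1 - y2 + 1.0) - y2 + 1.0)  # beta / R^(1+n)
    return 3.0 * n * (1.0 - (2.0 + n) * b) / (1.0 + n * (2.0 + n) * b)

def rhs(y, n):
    """Right-hand sides of Eqs. (9.16) and (9.17)."""
    y1, y2 = y
    C = C_of_y(y1, y2, n)
    return (y1 * (3.0 - 3.0 * y1 + y2 + C * (1.0 - y1)),
            y2 * (-1.0 - 3.0 * y1 + y2 - C * y1))

def rk4_step(y, n, dN):
    """One classical Runge-Kutta step in N = ln a."""
    k1 = rhs(y, n)
    k2 = rhs((y[0] + 0.5 * dN * k1[0], y[1] + 0.5 * dN * k1[1]), n)
    k3 = rhs((y[0] + 0.5 * dN * k2[0], y[1] + 0.5 * dN * k2[1]), n)
    k4 = rhs((y[0] + dN * k3[0], y[1] + dN * k3[1]), n)
    return (y[0] + dN * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y[1] + dN * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def integrate(n, y1_init=1e-40, y2_init=1.0 - 1e-5, N_end=40.0, dN=0.01):
    """Return a list of (N, y1, y2), with N measured from the initial time."""
    y = (y1_init, y2_init)
    out = [(0.0, y[0], y[1])]
    for i in range(int(round(N_end / dN))):
        y = rk4_step(y, n, dN)
        out.append(((i + 1) * dN, y[0], y[1]))
    return out

if __name__ == "__main__":
    for N, y1, y2 in integrate(0.02)[::500]:
        print(f"N = {N:5.1f}  y1 = {y1:.3e}  y2 = {y2:.3e}")
```

For n = 0 the system reduces to ΛCDM, for which y1 and y2 are simply the density parameters of the cosmological constant and of radiation; this gives a convenient analytic cross-check of the integrator.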

In [21, 253] the dark energy model \(f(R) = R - \beta/R^n\) was constrained by the combined analysis of independent observational data. From the joint analysis of the Supernova Legacy Survey [39], BAO [227] and the CMB shift parameter [561], the constraints on the two parameters n and β are n ∈ [−0.23, 0.42] and β ∈ [2.73, 10.6] at the 95% confidence level (in units where H0 = 1) [253]. Since the allowed values of n are close to 0, the above model is not particularly favored over the ΛCDM model. See also [116, 148, 522, 46, 47] for observational constraints on f(R) dark energy models based on the Palatini formalism.

Matter perturbations

We have shown that f(R) theory in the Palatini formalism can give rise to the late-time cosmic acceleration preceded by radiation and matter eras. In this section we study the evolution of matter density perturbations to confront Palatini f(R) gravity with the observations of large-scale structure [359, 356, 357, 598, 380, 597]. Let us consider the perturbation δρm of non-relativistic matter with a homogeneous energy density ρm. Koivisto and Kurki-Suonio [359] derived perturbation equations in Palatini f(R) gravity. Using the perturbed metric (6.1) with the same variables as those introduced in Section 6, the perturbation equations are given by

$$\begin{array}{*{20}c} {{\Delta \over {{a^2}}}\psi + \left({H + {{\dot F} \over {2F}}} \right)A + {1 \over {2F}}\left({{{3{{\dot F}^2}} \over {2F}} + 3H\,\dot F} \right)\alpha \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {= {1 \over {2F}}\;\left[ {\left({3{H^2} - {{3{{\dot F}^2}} \over {4{F^2}}} - {R \over 2} - {\Delta \over {{a^2}}}} \right)\;\delta F + \left({{{3\dot F} \over {2F}} + 3H} \right)\; \dot{\delta F} - {\kappa ^2}\delta {\rho _m}} \right],} \\ \end{array}$$
(9.24)
$$H\alpha - \dot \psi = {1 \over {2F}}\;\left[ {\dot {\delta F} - \left({H + {{3\dot F} \over {2F}}} \right)\delta F - \dot F\alpha + {\kappa ^2}{\rho _m}v} \right],$$
(9.25)
$$\dot \chi + H\chi - \alpha - \psi = {1 \over F}(\delta F - \dot F\chi),$$
(9.26)
$$\begin{array}{*{20}c} {\dot A + \left({2H + {{\dot F} \over {2F}}} \right)A + \left({3\dot H + {{3\ddot F} \over F} + {{3H\,\dot F} \over {2F}} - {{3{{\dot F}^2}} \over {{F^2}}} + {\Delta \over {{a^2}}}} \right)\alpha + {3 \over 2}{{\dot F} \over F}\dot \alpha \quad \quad \quad \quad \quad \quad \;\;} \\ {= {1 \over {2F}}\left[ {{\kappa ^2}\delta {\rho _m} + \left({6{H^2} + 6\dot H + {{3{{\dot F}^2}} \over {{F^2}}} - R - {\Delta \over {{a^2}}}} \right)\delta F + \left({3H - {{6\dot F} \over F}} \right)\dot {\delta F} + 3\ddot {\delta F}} \right],} \\ \end{array}$$
(9.27)
$$R\delta F - F\delta R = - {\kappa ^2}\delta {\rho _m},$$
(9.28)

where the Ricci scalar R can be understood as R(T).

From Eq. (9.28) the perturbation δF can be expressed by the matter perturbation δρm, as

$$\delta F = {{{F_{,R}}} \over F}{{{\kappa ^2}\delta {\rho _m}} \over {1 - m}},$$
(9.29)

where m = RF,R/F. This equation clearly shows that the perturbation δF is sourced by the matter perturbation only, unlike metric f(R) gravity in which the oscillating mode of δF is present. The matter perturbation δρm and the velocity potential υ obey the same equations as given in Eqs. (8.86) and (8.87), which results in Eq. (8.89) in Fourier space.

Let us consider the perturbation equations in Fourier space. We choose the longitudinal gauge (χ = 0) with α = Φ and ψ = −Ψ. In this case Eq. (9.26) gives

$$\Psi - \Phi = {{\delta F} \over F}.$$
(9.30)

Under the quasi-static approximation on sub-horizon scales used in Section 8.1, Eqs. (9.24) and (8.89) reduce to

$${{{k^2}} \over {{a^2}}}\Psi \simeq {1 \over {2F}}\left({{{{k^2}} \over {{a^2}}}\delta F - {\kappa ^2}\delta {\rho _m}} \right)\,,$$
(9.31)
$${\ddot \delta _m} + 2H{\dot \delta _m} + {{{k^2}} \over {{a^2}}}\Phi \simeq 0\,.$$
(9.32)

Combining Eq. (9.30) with Eq. (9.31), we obtain

$${{{k^2}} \over {{a^2}}}\Psi = - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}\left({1 - {\zeta \over {1 - m}}} \right)\,,\qquad {{{k^2}} \over {{a^2}}}\Phi = - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}\left({1 + {\zeta \over {1 - m}}} \right)\,,$$
(9.33)

where

$$\zeta \equiv {{{k^2}} \over {{a^2}}}{{{F_{,R}}} \over F} = {{{k^2}} \over {{a^2}R}}m\,.$$
(9.34)

Then the matter perturbation satisfies the following equation [597]

$${\ddot \delta _m} + 2H{\dot \delta _m} - {{{\kappa ^2}{\rho _m}} \over {2F}}\left({1 + {\zeta \over {1 - m}}} \right){\delta _m} \simeq 0\,.$$
(9.35)

The effective gravitational potential defined in Eq. (8.98) obeys

$${\Phi _{{\rm{eff}}}} \simeq - {{{\kappa ^2}{\rho _m}} \over {2F}}{{{a^2}} \over {{k^2}}}{\delta _m}\,.$$
(9.36)

In the above approximation we do not need to worry about the dominance of the oscillating mode of perturbations in the past. Note also that the same approximate equation of δm as Eq. (9.35) can be derived for different gauge choices [597].

The parameter ζ is a crucial quantity to characterize the evolution of perturbations. This quantity can be estimated as \(\zeta \approx (k/aH)^2 m\), which is much larger than m for sub-horizon modes (k ≫ aH). In the regime ζ ≪ 1 the matter perturbation evolves as \(\delta_m \propto t^{2/3}\). Meanwhile the evolution of δm in the regime ζ ≫ 1 is completely different from that in GR. If the transition characterized by ζ = 1 occurs before today, this gives rise to a modification of the matter spectrum compared to the GR case.

In the regime ζ ≫ 1, let us study the evolution of matter perturbations during the matter dominance. We shall consider the case in which the parameter m (with ∣m∣ ≪ 1) evolves as

$$m \propto {t^{2p}}\,,$$
(9.37)

where p is a constant. For the model \(f(R) = R - \mu R_c(R/R_c)^n\) (n < 1) one has \(m \propto R^{n-1} \propto t^{2(1-n)}\), so that the power p corresponds to p = 1 − n, whereas for the models (4.83) and (4.84) with n > 0 one has p = 1 + 2n. During the matter dominance (\(a \propto t^{2/3}\), \(R \propto t^{-2}\)) the parameter ζ in Eq. (9.34) evolves as \(\zeta = \pm (t/t_k)^{2p + 2/3}\), where the subscript “k” denotes the value at which the perturbation crosses ζ = ±1. Here + and − signs correspond to the cases m > 0 and m < 0, respectively. Then the matter perturbation equation (9.35) reduces to

$${{{{\rm{d}}^2}{\delta _m}} \over {{\rm{d}}{N^2}}} + {1 \over 2}{{{\rm{d}}{\delta _m}} \over {{\rm{d}}N}} - {3 \over 2}\left[ {1 \pm {e^{(3p + 1)(N - {N_k})}}} \right]{\delta _m} = 0\,.$$
(9.38)

When m > 0, the growing mode solution to Eq. (9.38) is given by

$${\delta _m} \propto \exp \left({{{\sqrt 6 {e^{(3p + 1)(N - {N_k})/2}}} \over {3p + 1}}} \right)\,,\qquad {f_\delta} \equiv {{{{\dot \delta}_m}} \over {H{\delta _m}}} = {{\sqrt 6} \over 2}{e^{(3p + 1)(N - {N_k})/2}}\,.$$
(9.39)

This shows that the perturbations exhibit violent growth for p > −1/3, which is not compatible with observations of large-scale structure. In metric f(R) gravity the growth of matter perturbations is much milder.
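The growth rate in Eq. (9.39) can be checked against a direct integration of Eq. (9.38) with the + sign. The sketch below rewrites Eq. (9.38) as a first-order (Riccati) equation for fδ = d ln δm/dN, namely dfδ/dN = −fδ² − fδ/2 + (3/2)[1 + e^{(3p+1)(N − Nk)}], which avoids overflow of δm itself. The starting point fδ = 1 (the standard matter-era growing mode), the integration range and the step size are assumptions of this example.

```python
import math

def growth_rate(p, N_end=2.0, N_start=-5.0, dN=1e-3):
    """Integrate the Riccati form of Eq. (9.38) (m > 0 case).

    N is measured from N_k. Returns f_delta at N_end.
    """
    q = 3.0 * p + 1.0

    def rhs(N, f):
        return -f * f - 0.5 * f + 1.5 * (1.0 + math.exp(q * N))

    f, N = 1.0, N_start  # delta_m proportional to a deep in the regime zeta << 1
    for _ in range(int(round((N_end - N_start) / dN))):
        k1 = rhs(N, f)
        k2 = rhs(N + 0.5 * dN, f + 0.5 * dN * k1)
        k3 = rhs(N + 0.5 * dN, f + 0.5 * dN * k2)
        k4 = rhs(N + dN, f + dN * k3)
        f += dN * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        N += dN
    return f

def growth_rate_wkb(p, N):
    """Asymptotic growth rate quoted in Eq. (9.39)."""
    return math.sqrt(6.0) / 2.0 * math.exp((3.0 * p + 1.0) * N / 2.0)

if __name__ == "__main__":
    p = 1.0
    for N in (-3.0, 0.0, 1.0, 2.0):
        print(N, growth_rate(p, N_end=N), growth_rate_wkb(p, N))
```

Well after the crossing ζ = 1 the two columns should agree up to a correction of relative order e^{−(3p+1)(N − Nk)/2}, while before the crossing the numerical value stays close to the GR result fδ ≈ 1.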

When m < 0, the perturbations show a damped oscillation:

$${\delta _m} \propto {e^{- (3p + 2)(N - {N_k})/4}}\,\cos (x + \theta)\,,\qquad {f_\delta} = - {1 \over 4}(3p + 2) - {{3p + 1} \over 2}x\tan (x + \theta)\,,$$
(9.40)

where \(x = \sqrt 6 {e^{(3p + 1)(N - {N_k})/2}}/(3p + 1)\), and θ is a constant. The averaged value of the growth rate fδ is given by \({\bar f_\delta} = - (3p + 2)/4\), but it shows a divergence every time x changes by π. These negative values of fδ are also difficult to reconcile with observations.
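The damping exponent in Eq. (9.40) can be tested in the same spirit. The following sketch integrates Eq. (9.38) with the − sign and monitors the envelope E = δm² + (δm′/ω)², where ω² = (3/2)[e^{(3p+1)(N − Nk)} − 1] and a prime denotes d/dN. If the amplitude decays as in Eq. (9.40), ln E should decrease by about (3p + 2)/2 per e-fold. The initial data, the interval and the step size are illustrative assumptions.

```python
import math

def envelope_log_ratio(p, N0=2.0, N1=3.0, dN=2e-4):
    """Return ln[E(N1)/E(N0)] for Eq. (9.38) with the minus sign (m < 0).

    N is measured from N_k, and E = delta^2 + (delta'/omega)^2.
    """
    q = 3.0 * p + 1.0

    def omega2(N):
        return 1.5 * (math.exp(q * N) - 1.0)

    def rhs(N, d, v):
        # delta'' = -delta'/2 - omega^2 delta
        return v, -0.5 * v - omega2(N) * d

    d, v, N = 1.0, 0.0, N0
    E0 = d * d + v * v / omega2(N0)
    for _ in range(int(round((N1 - N0) / dN))):
        k1 = rhs(N, d, v)
        k2 = rhs(N + 0.5 * dN, d + 0.5 * dN * k1[0], v + 0.5 * dN * k1[1])
        k3 = rhs(N + 0.5 * dN, d + 0.5 * dN * k2[0], v + 0.5 * dN * k2[1])
        k4 = rhs(N + dN, d + dN * k3[0], v + dN * k3[1])
        d += dN * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        v += dN * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        N += dN
    E1 = d * d + v * v / omega2(N1)
    return math.log(E1 / E0)

if __name__ == "__main__":
    p = 1.0
    print(envelope_log_ratio(p), -(3.0 * p + 2.0) / 2.0)
```

The two printed numbers should agree up to corrections of order 1/ω, which are small once the mode is deep in the regime ∣ζ∣ ≫ 1.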

The f(R) models can be consistent with observations of large-scale structure if the universe does not enter the regime ∣ζ∣ > 1 by today. This translates into the condition [597]

$$\left\vert {m(z = 0)} \right\vert \underset{\sim}{<} {({a_0}{H_0}/k)^2}\,.$$
(9.41)

Let us consider the wavenumbers \(0.01\,h\,{\rm{Mpc}}^{-1} \lesssim k \lesssim 0.2\,h\,{\rm{Mpc}}^{-1}\) that correspond to the linear regime of the matter power spectrum. Since the wavenumber \(k = 0.2\,h\,{\rm{Mpc}}^{-1}\) corresponds to k ≈ 600a0H0 (where “0” represents present quantities), the condition (9.41) gives the bound \(\vert m(z = 0)\vert \lesssim 3 \times 10^{-6}\).
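The arithmetic behind this bound is short enough to spell out. The sketch below assumes a0 = 1 and restores the speed of light, so that a0H0 = 100 h km s⁻¹ Mpc⁻¹ / c; the function names are illustrative.

```python
C_KM_S = 299792.458   # speed of light in km/s
H0_OVER_H = 100.0     # H0 / h in km/s/Mpc

def k_over_a0H0(k_h_per_mpc):
    """Convert a wavenumber in h/Mpc to the dimensionless ratio k/(a0 H0)."""
    return k_h_per_mpc * C_KM_S / H0_OVER_H

def m_bound(k_h_per_mpc):
    """Right-hand side of Eq. (9.41), (a0 H0 / k)^2."""
    return 1.0 / k_over_a0H0(k_h_per_mpc) ** 2

if __name__ == "__main__":
    print(k_over_a0H0(0.2))   # close to 600
    print(m_bound(0.2))       # close to 3e-6
```

The factors of h cancel, so the ratio k/(a0H0) does not depend on the value of the Hubble parameter.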

If we use the observational constraint of the growth rate, fδ ≲ 1.5 [418, 605, 211], then the deviation parameter m today is constrained to be \(\vert m(z = 0)\vert \lesssim 10^{-5}\)–\(10^{-4}\) for the model \(f(R) = R - \lambda R_c(R/R_c)^n\) (n < 1) as well as for the models (4.83) and (4.84) [597]. Recall that, in metric f(R) gravity, the deviation parameter can grow to the order of 0.1 by today. Meanwhile f(R) dark energy models based on the Palatini formalism are hardly distinguishable from the ΛCDM model [356, 386, 385, 597]. Note that the bound on m(z = 0) becomes even more severe when perturbations in the non-linear regime are considered. The above peculiar evolution of matter perturbations is associated with the fact that the coupling between non-relativistic matter and a scalar-field degree of freedom is very strong (as we will see in Section 10.1).

The above results are based on the assumption that dark matter is described by a cold and perfect fluid with no pressure. In [358] it was suggested that the tight bound on the parameter m can be relaxed by considering imperfect dark matter with a shear stress. Although the approach taken in [358] did not aim to explain the origin of a dark matter stress Π that cancels the k-dependent term in Eq. (9.35), it will be of interest to further study whether some theoretically motivated choice of Π really allows the possibility that Palatini f(R) dark energy models can be distinguished from the ΛCDM model.

Shortcomings of Palatini f(R) gravity

In addition to the fact that Palatini f(R) dark energy models are hardly distinguished from the ΛCDM model by observations of large-scale structure, there are a number of problems in Palatini f(R) gravity associated with the non-dynamical nature of the scalar-field degree of freedom.

The dark energy model \(f = R - \mu^4/R\) based on the Palatini formalism was shown to be in conflict with the Standard Model of particle physics [261, 262, 260, 318, 55] because of large non-perturbative corrections to the matter Lagrangian [here we use R for the meaning of R(T)]. Let us consider this issue for a more general model \(f = R - \mu^{2(n+1)}/R^n\). From the definition of φ in Eq. (9.6) the field potential U(φ) is given by

$$U(\varphi) = {{n + 1} \over {2{n^{n/(n + 1)}}}}{{{\mu ^2}} \over {{\kappa ^2}}}{(\varphi - 1)^{n/(n + 1)}}\,,$$
(9.42)

where \(\varphi = 1 + n\mu^{2(n+1)}R^{-n-1}\). Using Eq. (9.7) for the vacuum (T = 0), we obtain the solution

$$\varphi (T = 0) = {{2(n + 1)} \over {n + 2}}\,.$$
(9.43)

In the presence of matter we expand the field φ as φ = φ(T = 0) + δφ. Substituting this into Eq. (9.7), we obtain

$$\delta \varphi \simeq {n \over {{{(n + 2)}^{{{n + 2} \over {n + 1}}}}}}{{{\kappa ^2}T} \over {{\mu ^2}}}\,.$$
(9.44)
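Equations (9.43) and (9.44) can be verified numerically. The sketch below works in units μ = κ = 1 and assumes that Eq. (9.7) takes the form 4U − 2φU,φ = T, which is what Eq. (10.3) reduces to for ωBD = −3/2. The bisection bracket `half_width` and the function names are choices made for this example only.

```python
def U(phi, n):
    """Potential of Eq. (9.42) in units mu = kappa = 1."""
    s = n / (n + 1.0)
    return (n + 1.0) / (2.0 * n**s) * (phi - 1.0)**s

def dU(phi, n):
    """Derivative of U with respect to phi."""
    s = n / (n + 1.0)
    return (n + 1.0) / (2.0 * n**s) * s * (phi - 1.0)**(s - 1.0)

def trace_lhs(phi, n):
    """4U - 2 phi U_phi, assumed equal to T (see the lead-in)."""
    return 4.0 * U(phi, n) - 2.0 * phi * dU(phi, n)

def phi_vacuum(n):
    """Vacuum solution quoted in Eq. (9.43)."""
    return 2.0 * (n + 1.0) / (n + 2.0)

def solve_phi(T, n, half_width=1e-2, iters=200):
    """Bisection for trace_lhs(phi) = T in a small bracket around the vacuum value."""
    lo, hi = phi_vacuum(n) - half_width, phi_vacuum(n) + half_width
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if trace_lhs(mid, n) < T:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_phi_linear(T, n):
    """Linear response quoted in Eq. (9.44), with kappa = mu = 1."""
    return n / (n + 2.0) ** ((n + 2.0) / (n + 1.0)) * T

if __name__ == "__main__":
    n, T = 1.0, -1e-4
    print(solve_phi(T, n) - phi_vacuum(n), delta_phi_linear(T, n))
```

The two printed numbers should agree to leading order in T; the residual difference is the second-order term neglected in Eq. (9.44).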

For \(n = \mathcal O(1)\) we have \(\delta \varphi \approx {\kappa ^2}T/{\mu ^2} = T/({\mu ^2}M_{{\rm{pl}}}^2)\) with φ(T = 0) ≈ 1. Let us consider a matter action of a Higgs scalar field ϕ with mass mϕ:

$${S_M} = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\phi {\partial _\nu}\phi - {1 \over 2}m_\phi ^2{\phi ^2}} \right]\,.$$
(9.45)

Since \(T \approx m_\phi ^2\delta {\phi ^2}\) it follows that \(\delta \varphi \approx m_\phi ^2\delta {\phi ^2}/({\mu ^2}M_{{\rm{pl}}}^2)\). Perturbing the Jordan-frame action (9.8) [which is equivalent to the action in Palatini f(R) gravity] to second-order and using the solution \(\varphi \approx 1 + m_\phi ^2\delta {\phi ^2}/({\mu ^2}M_{{\rm{pl}}}^2)\), we find that the effective action of the Higgs field ϕ for an energy scale E much lower than mϕ (= 100–1000 GeV) is given by [55]

$$\delta {S_M} \simeq \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {- {1 \over 2}{g^{\mu \nu}}{\partial _\mu}\delta \phi {\partial _\nu}\delta \phi - {1 \over 2}m_\phi ^2\delta {\phi ^2}} \right]\left({1 + {{m_\phi ^2\delta {\phi ^2}} \over {{\mu ^2}M_{{\rm{pl}}}^2}} + \cdots} \right)\,.$$
(9.46)

Since δϕ ∼ mϕ for E ∼ mϕ, the correction term can be estimated as

$$\delta \varphi \approx {{m_\phi ^2\delta {\phi ^2}} \over {{\mu ^2}M_{{\rm{pl}}}^2}} \approx {\left({{{{m_\phi}} \over \mu}} \right)^2}{\left({{{{m_\phi}} \over {{M_{{\rm{pl}}}}}}} \right)^2}\,.$$
(9.47)

In order to give rise to the late-time acceleration we require that \(\mu \approx H_0 \approx 10^{-42}\) GeV. For the Higgs mass mϕ = 100 GeV it follows that \(\delta \varphi \approx 10^{56} \gg 1\). This correction is too large to be compatible with the Standard Model of particle physics.
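The size of this correction follows directly from Eq. (9.47). The sketch below is only an order-of-magnitude estimate: the value Mpl ≈ 10^18 GeV is an assumption adopted here for a round number (the reduced Planck mass is closer to 2.4 × 10^18 GeV, which lowers the result by less than one order of magnitude).

```python
import math

def delta_varphi(m_phi_gev, mu_gev, m_pl_gev):
    """Order-of-magnitude estimate of Eq. (9.47)."""
    return (m_phi_gev / mu_gev) ** 2 * (m_phi_gev / m_pl_gev) ** 2

if __name__ == "__main__":
    # m_phi = 100 GeV, mu ~ H0 ~ 1e-42 GeV, M_pl ~ 1e18 GeV (assumed round value)
    print(math.log10(delta_varphi(100.0, 1e-42, 1e18)))
    print(math.log10(delta_varphi(100.0, 1e-42, 2.4e18)))
```

Either choice of Mpl leaves the correction tens of orders of magnitude above unity, which is the point of the argument.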

The above result is based on the models \(f(R) = R - \mu^{2(n+1)}/R^n\) with \(n = \mathcal O(1)\). Having a look at Eq. (9.44), the only way to make the perturbation δφ small is to choose n very close to 0. This means that the deviation from the ΛCDM model is extremely small (see [388] for a related work). In fact, this property was already found by the analysis of matter density perturbations in Section 9.3. While the above analysis is based on the calculation in the Jordan frame in which test particles follow geodesics [55], the same result was also obtained by the analysis in the Einstein frame [261, 262, 260, 318].

Another unusual property of Palatini f(R) gravity is that a singularity with a divergent Ricci scalar can appear at the surface of a static spherically symmetric star with a polytropic equation of state \(P = c\rho _0^\Gamma\) with 3/2 < Γ < 2 (where P is the pressure and ρ0 is the rest-mass density) [56, 55] (see also [107, 331]). Again this problem is intimately related to the particular algebraic dependence (9.2) in Palatini f(R) gravity. In [56] it was claimed that the appearance of the singularity does not depend strongly on the functional form of f(R) and that the result is not specific to the choice of the polytropic equation of state.

Palatini gravity has a close relation with an effective action which reproduces the dynamics of loop quantum cosmology [477]. It was shown in [474] that the model f(R) = R + R2/(6M2), where M is of the order of the Planck mass, is not plagued by the singularity problem mentioned above, while the singularity typically arises for the f(R) models constructed to explain the late-time cosmic acceleration (see also [504] for a related work). Since Planck-scale corrected Palatini f(R) models may cure the singularity problem, it will be of interest to understand the connection with quantum gravity around the cosmological singularity (or the black hole singularity). In fact, it was shown in [60] that non-singular bouncing solutions can be obtained for power-law f(R) Lagrangians with a finite number of terms.

Finally we note that the extension of Palatini f(R) gravity to more general theories including Ricci and Riemann tensors was carried out in [384, 387, 95, 236, 388, 509, 476]. While such theories are more involved than Palatini f(R) gravity, it may be possible to construct viable modified gravity models of inflation or dark energy.

Extension to Brans-Dicke Theory

So far we have discussed f(R) gravity theories in the metric and Palatini formalisms. In this section we will see that these theories are equivalent to Brans-Dicke (BD) theory [100] in the presence of a scalar-field potential, by comparing field equations in f(R) theories with those in BD theory. It is possible to construct viable dark energy models based on BD theory with a constant parameter ωBD. We will discuss cosmological dynamics, local gravity constraints, and observational signatures of such generalized theory.

Brans-Dicke theory and the equivalence with f(R) theories

Let us start with the following 4-dimensional action in BD theory

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \left[ {{1 \over 2}\varphi R - {{{\omega _{{\rm{BD}}}}} \over {2\varphi}}{{(\nabla \varphi)}^2} - U(\varphi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),$$
(10.1)

where ωBD is the BD parameter, U(φ) is a potential of the scalar field φ, and SM is a matter action that depends on the metric gμν and matter fields ΨM. In this section we use the unit \({\kappa ^2} = 8\pi G = 1/M_{{\rm{pl}}}^2 = 1\), but we restore the gravitational constant G and the reduced Planck mass Mpl where doing so makes the discussion more transparent. The original BD theory [100] does not possess the field potential U(φ).

Taking the variation of the action (10.1) with respect to gμν and φ, we obtain the following field equations

$$\begin{array}{*{20}c} {{R_{\mu \nu}}(g) - {1 \over 2}{g_{\mu \nu}}R(g) = {1 \over \varphi}{T_{\mu \nu}} - {1 \over \varphi}{g_{\mu \nu}}U(\varphi) + {1 \over \varphi}({\nabla _\mu}{\nabla _\nu}\varphi - {g_{\mu \nu}}\square \varphi)\quad \quad \quad \quad \;} \\ {+ {{{\omega _{{\rm{BD}}}}} \over {{\varphi ^2}}}\left[ {{\partial _\mu}\varphi {\partial _\nu}\varphi - {1 \over 2}{g_{\mu \nu}}{{(\nabla \varphi)}^2}} \right],} \\ \end{array}$$
(10.2)
$$(3 + 2{\omega _{{\rm{BD}}}})\square \varphi + 4U(\varphi) - 2\varphi {U_{,\varphi}} = T,$$
(10.3)

where R(g) is the Ricci scalar in metric f(R) gravity, and Tμν is the energy-momentum tensor of matter. In order to find the relation with f(R) theories in the metric and Palatini formalisms, we consider the following correspondence

$$\varphi = F(R),\qquad U(\varphi) = {{RF - f} \over 2}.$$
(10.4)

Recall that this potential (which is of gravitational origin) already appeared in Eq. (2.28). We then find that Eqs. (2.4) and (2.7) in metric f(R) gravity are equivalent to Eqs. (10.2) and (10.3) with the BD parameter ωBD = 0. Hence f(R) theory in the metric formalism corresponds to BD theory with ωBD = 0 [467, 579, 152, 246, 112]. In fact we already showed this by rewriting the action (2.1) in the form (2.21). We also notice that Eqs. (9.4) and (9.2) in Palatini f(R) gravity are equivalent to Eqs. (10.2) and (10.3) with the BD parameter ωBD = −3/2. Then f(R) theory in the Palatini formalism corresponds to BD theory with ωBD = −3/2 [262, 470, 551]. Recall that we also showed this by rewriting the action (2.1) in the form (9.8).

One can consider more general theories called scalar-tensor theories [268] in which the Ricci scalar R is coupled to a scalar field φ. The general 4-dimensional action for scalar-tensor theories can be written as

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over 2}F(\varphi)R - {1 \over 2}\omega (\varphi){{(\nabla \varphi)}^2} - U(\varphi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),$$
(10.5)

where F(φ) and U(φ) are functions of φ. Under the conformal transformation \({\tilde g_{\mu \nu}} = F{g_{\mu \nu}}\), we obtain the action in the Einstein frame [408, 611]

$${S_E} = \int {{{\rm{d}}^4}} x\sqrt {- \tilde g} \;\left[ {{1 \over 2}\tilde R - {1 \over 2}{{(\tilde \nabla \phi)}^2} - V(\phi)} \right] + {S_M}({F^{- 1}}{\tilde g_{\mu \nu}},{\Psi _M}),$$
(10.6)

where V = U/F2. We have introduced a new scalar field ϕ to make the kinetic term canonical:

$$\phi \equiv \int {\rm{d}} \varphi \,\sqrt {{3 \over 2}{{\left({{{{F_{,\varphi}}} \over F}} \right)}^2} + {\omega \over F}}.$$
(10.7)

We define a quantity Q that characterizes the coupling between the field ϕ and non-relativistic matter in the Einstein frame:

$$Q \equiv - {{{F_{,\phi}}} \over {2F}} = - {{{F_{,\varphi}}} \over {2F}}\;{\left[ {{3 \over 2}\;{{\left({{{{F_{,\varphi}}} \over F}} \right)}^2} + {\omega \over F}} \right]^{- 1/2}}.$$
(10.8)

Recall that, in metric f(R) gravity, we introduced the same quantity Q in Eq. (2.40), which is constant \((Q = - 1/\sqrt 6)\). For theories with Q = constant, we obtain the following relations from Eqs. (10.7) and (10.8):

$$F = {e^{- 2Q\phi}},\qquad \omega = (1 - 6{Q^2})F\;{\left({{{{\rm{d}}\phi} \over {{\rm{d}}\varphi}}} \right)^2}.$$
(10.9)

In this case the action (10.5) in the Jordan frame reduces to [596]

$$S = \int {{{\rm{d}}^4}} x\sqrt {- g} \;\left[ {{1 \over 2}F(\phi)R - {1 \over 2}(1 - 6{Q^2})F(\phi){{(\nabla \phi)}^2} - U(\phi)} \right] + {S_M}({g_{\mu \nu}},{\Psi _M}),\quad {\rm{with}}\quad F(\phi) = {e^{- 2Q\phi}}.$$
(10.10)

In the limit that Q → 0 we have F(ϕ) → 1, so that Eq. (10.10) recovers the action of a minimally coupled scalar field in GR.

Let us compare the action (10.10) with the action (10.1) in BD theory. Setting \(\varphi = F = e^{-2Q\phi}\), the former is equivalent to the latter if the parameter ωBD is related to Q via the relation [343, 596]

$$3 + 2{\omega _{{\rm{BD}}}} = {1 \over {2{Q^2}}}.$$
(10.11)

This shows that the GR limit (ωBD → ∞) corresponds to the vanishing coupling (Q → 0). Since \(Q = - 1/\sqrt 6\) in metric f(R) gravity one has ωBD = 0, as expected. Palatini f(R) gravity corresponds to ωBD = −3/2, i.e., to the infinite coupling (\(Q^2 \to \infty\)). In fact, Palatini gravity can be regarded as an isolated “fixed point” of a transformation involving a special conformal rescaling of the metric [247]. In the Einstein frame of the Palatini formalism, the scalar field ϕ does not have a kinetic term and it can be integrated out. In general, this leads to a matter action which is non-linear, depending on the potential U(ϕ). This large coupling poses a number of problems such as the strong amplification of matter density perturbations and the conflict with the Standard Model of particle physics, as we have discussed in Section 9.
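The correspondence (10.11) and its two limits can be written as a pair of small conversion helpers. This is an illustrative sketch; the function names are not taken from [343, 596].

```python
import math

def omega_bd_from_Q(Q):
    """Solve Eq. (10.11), 3 + 2 omega_BD = 1/(2 Q^2), for omega_BD."""
    if Q == 0.0:
        return math.inf   # GR limit: vanishing coupling
    return 0.5 * (1.0 / (2.0 * Q * Q) - 3.0)

def Q_squared_from_omega_bd(omega_bd):
    """Solve Eq. (10.11) for Q^2."""
    denom = 2.0 * (3.0 + 2.0 * omega_bd)
    if denom == 0.0:
        return math.inf   # Palatini case: omega_BD = -3/2
    return 1.0 / denom

if __name__ == "__main__":
    print(omega_bd_from_Q(-1.0 / math.sqrt(6.0)))   # metric f(R): close to 0
    print(Q_squared_from_omega_bd(-1.5))            # Palatini f(R): inf
```

Only Q² enters Eq. (10.11), so the sign of Q has to be fixed separately, for instance from F = e^{−2Qϕ}.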

Note that BD theory is one of the examples in scalar-tensor theories and there are some theories that give rise to non-constant values of Q. For example, the action of a nonminimally coupled scalar field with a coupling ξ corresponds to \(F(\varphi) = 1 - \xi \varphi^2\) and ω(φ) = 1, which gives the field-dependent coupling \(Q(\varphi) = \xi \varphi/[1 - \xi \varphi^2(1 - 6\xi)]^{1/2}\). In fact the dynamics of dark energy in such a theory has been studied by a number of authors [22, 601, 151, 68, 491, 44, 505]. In the following we shall focus on the constant coupling models with the action (10.10). We stress that this is equivalent to the action (10.1) in BD theory.
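The field-dependent coupling quoted for the nonminimally coupled scalar can be recovered from the definition Q = −F,ϕ/(2F), using F,ϕ = F,φ (dϕ/dφ)⁻¹ with dϕ/dφ read off from Eq. (10.7). The sketch below assumes F > 0, and the function names are illustrative.

```python
import math

def Q_from_definition(F, F_varphi, omega):
    """Q = -F_phi/(2F), with d(phi)/d(varphi) taken from Eq. (10.7)."""
    dphi_dvarphi = math.sqrt(1.5 * (F_varphi / F) ** 2 + omega / F)
    return -F_varphi / (2.0 * F * dphi_dvarphi)

def Q_nonminimal(varphi, xi):
    """Coupling for F = 1 - xi varphi^2 and omega = 1, from the definition."""
    F = 1.0 - xi * varphi ** 2
    return Q_from_definition(F, -2.0 * xi * varphi, 1.0)

def Q_closed_form(varphi, xi):
    """Closed form quoted in the text."""
    return xi * varphi / math.sqrt(1.0 - xi * varphi ** 2 * (1.0 - 6.0 * xi))

if __name__ == "__main__":
    for varphi, xi in ((0.5, 0.1), (1.0, -0.2), (2.0, 1.0 / 6.0)):
        print(Q_nonminimal(varphi, xi), Q_closed_form(varphi, xi))
```

For the conformal value ξ = 1/6 the square root equals unity and the coupling reduces to Q = φ/6.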

Cosmological dynamics of dark energy models based on Brans-Dicke theory

The first attempt to apply BD theory to cosmic acceleration is the extended inflation scenario in which the BD field φ is identified as an inflaton field [374, 571]. The first version of the inflation model, which considered a first-order phase transition in BD theory, resulted in failure due to the graceful exit problem [375, 613, 65]. This triggered further study of the possibility of realizing inflation in the presence of another scalar field [394, 78]. In general the dynamics of such a multi-field system is more involved than that in the single-field case [71]. The resulting power spectrum of density perturbations generated during multi-field inflation in BD theory was studied by a number of authors [570, 272, 156, 569].

In the context of dark energy it is possible to construct viable single-field models based on BD theory. In what follows we discuss cosmological dynamics of dark energy models based on the action (10.10) in the flat FLRW background given by (2.12) (see, e.g., [596, 22, 85, 289, 5, 327, 139, 168] for dynamical analysis in scalar-tensor theories). Our interest is to find conditions under which a sequence of radiation, matter, and accelerated epochs can be realized. This depends upon the form of the field potential U(ϕ). We first carry out a general analysis without specifying the form of the potential. We take into account non-relativistic matter with energy density ρm and radiation with energy density ρr. The Jordan frame is regarded as a physical frame due to the usual conservation of non-relativistic matter (\(\rho_m \propto a^{-3}\)). Varying the action (10.10) with respect to gμν and ϕ, we obtain the following equations

$$3F\,{H^2} = (1 - 6{Q^2})F{\dot \phi ^2}/2 + U - 3H\,\dot F + {\rho _m} + {\rho _r},$$
(10.12)
$$2F\,\dot H = - (1 - 6{Q^2})F{\dot \phi ^2} - \ddot F + H\,\dot F - {\rho _m} - 4{\rho _r}/3,$$
(10.13)
$$(1 - 6{Q^2})\;F\;\left[ {\ddot \phi + 3H\,\dot \phi + \dot F/(2F)\dot \phi} \right] + {U_{,\phi}} + Q\,F\,R = 0,$$
(10.14)

where \(F = e^{-2Q\phi}\).

We introduce the following dimensionless variables

$${x_1} \equiv {{\dot \phi} \over {\sqrt 6 H}},\qquad {x_2} \equiv {1 \over H}\sqrt {{U \over {3F}}}, \qquad {x_3} \equiv {1 \over H}\sqrt {{{{\rho _r}} \over {3F}}},$$
(10.15)

and also the density parameters

$${\Omega _m} \equiv {{{\rho _m}} \over {3F\,{H^2}}},\qquad {\Omega _r} \equiv x_3^2,\qquad {\Omega _{{\rm{DE}}}} \equiv (1 - 6{Q^2})x_1^2 + x_2^2 + 2\sqrt 6 Q{x_1}.$$
(10.16)

These satisfy the relation Ωm + Ωr + ΩDE = 1 from Eq. (10.12). From Eq. (10.13) it follows that

$${{\dot H} \over {{H^2}}} = - {{1 - 6{Q^2}} \over 2}\;\left({3 + 3x_1^2 - 3x_2^2 + x_3^2 - 6{Q^2}x_1^2 + 2\sqrt 6 Q{x_1}} \right) + 3Q(\lambda x_2^2 - 4Q).$$
(10.17)

Taking the derivatives of x1, x2 and x3 with respect to N = ln a, we find

$$\begin{array}{*{20}c} {{{{\rm{d}}{x_1}} \over {{\rm{d}}N}} = {{\sqrt 6} \over 2}(\lambda x_2^2 - \sqrt 6 {x_1})\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;} \\ {+ {{\sqrt 6 Q} \over 2}\;\left[ {(5 - 6{Q^2})x_1^2 + 2\sqrt 6 Q{x_1} - 3x_2^2 + x_3^2 - 1} \right] - {x_1}{{\dot H} \over {{H^2}}},} \\ \end{array}$$
(10.18)
$${{{\rm{d}}{x_2}} \over {{\rm{d}}N}} = {{\sqrt 6} \over 2}(2Q - \lambda){x_1}{x_2} - {x_2}{{\dot H} \over {{H^2}}},$$
(10.19)
$${{{\rm{d}}{x_3}} \over {{\rm{d}}N}} = \sqrt 6 Q{x_1}{x_3} - 2{x_3} - {x_3}{{\dot H} \over {{H^2}}},$$
(10.20)

where λ ≡ − U,ϕ/U.
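The autonomous system (10.17)–(10.20) is easy to transcribe, which allows candidate fixed points to be tested by substitution. In the sketch below, the radiation point (0, 0, 1) is the one quoted in the text. The de Sitter coordinates (x1, x2, x3) = (0, 1, 0) with λ = 4Q and the ϕMDE coordinate x1 = √6Q/[3(2Q² − 1)] are assumptions of this example, since the table of fixed points is not reproduced here; they are checked by requiring the right-hand sides to vanish.

```python
import math

SQRT6 = math.sqrt(6.0)

def hdot_over_h2(x1, x2, x3, Q, lam):
    """Eq. (10.17)."""
    return (-(1.0 - 6.0 * Q * Q) / 2.0
            * (3.0 + 3.0 * x1 * x1 - 3.0 * x2 * x2 + x3 * x3
               - 6.0 * Q * Q * x1 * x1 + 2.0 * SQRT6 * Q * x1)
            + 3.0 * Q * (lam * x2 * x2 - 4.0 * Q))

def rhs(x1, x2, x3, Q, lam):
    """Right-hand sides of Eqs. (10.18)-(10.20)."""
    h = hdot_over_h2(x1, x2, x3, Q, lam)
    dx1 = (SQRT6 / 2.0 * (lam * x2 * x2 - SQRT6 * x1)
           + SQRT6 * Q / 2.0 * ((5.0 - 6.0 * Q * Q) * x1 * x1
                                + 2.0 * SQRT6 * Q * x1
                                - 3.0 * x2 * x2 + x3 * x3 - 1.0)
           - x1 * h)
    dx2 = SQRT6 / 2.0 * (2.0 * Q - lam) * x1 * x2 - x2 * h
    dx3 = SQRT6 * Q * x1 * x3 - 2.0 * x3 - x3 * h
    return dx1, dx2, dx3

def w_eff(x1, x2, x3, Q, lam):
    """Effective equation of state, w_eff = -1 - 2 Hdot / (3 H^2)."""
    return -1.0 - 2.0 / 3.0 * hdot_over_h2(x1, x2, x3, Q, lam)

def phi_mde_x1(Q):
    """Assumed x1 coordinate of the phi-matter-dominated epoch (x2 = x3 = 0)."""
    return SQRT6 * Q / (3.0 * (2.0 * Q * Q - 1.0))

if __name__ == "__main__":
    Q, lam = 0.1, 1.0
    print(rhs(0.0, 0.0, 1.0, Q, lam), w_eff(0.0, 0.0, 1.0, Q, lam))
    print(rhs(0.0, 1.0, 0.0, Q, 4.0 * Q), w_eff(0.0, 1.0, 0.0, Q, 4.0 * Q))
    x1 = phi_mde_x1(Q)
    print(rhs(x1, 0.0, 0.0, Q, lam), w_eff(x1, 0.0, 0.0, Q, lam))
```

For small Q the last line gives an effective equation of state close to zero, consistent with the statement that the ϕMDE can play the role of the matter era when Q² ≪ 1.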

If λ is a constant, i.e., for the exponential potential \(U = U_0 e^{-\lambda \phi}\), one can derive fixed points for Eqs. (10.18)–(10.20) by setting dxi/dN = 0 (i = 1, 2, 3). In Table 1 we list the fixed points of the system in the absence of radiation (x3 = 0). Note that the radiation point corresponds to (x1, x2, x3) = (0, 0, 1). The point (a) is the ϕ-matter-dominated epoch (ϕMDE) during which the density of non-relativistic matter is a non-zero constant. Provided that \(Q^2 \ll 1\) this can be used for the matter-dominated epoch. The kinetic points (b1) and (b2) are responsible neither for the matter era nor for the accelerated epoch (for ∣Q∣ ≲ 1). The point (c) is the scalar-field dominated solution, which can be used for the late-time acceleration for weff < −1/3. When \(Q^2 \ll 1\) this point yields the cosmic acceleration for \(- \sqrt 2 + 4Q < \lambda < \sqrt 2 + 4Q\). The scaling solution (d) can be responsible for the matter era for ∣Q∣ ≪ ∣λ∣, but in this case the condition weff < −1/3 for the point (c) leads to \(\lambda^2 \lesssim 2\). Then the energy fraction of the pressureless matter for the point (d) does not satisfy the condition Ωm ≃ 1. The point (e) gives rise to the de Sitter expansion, which exists for the special case with λ = 4Q [which can also be regarded as a special case of the point (c)]. From the above discussion the viable cosmological trajectory for constant λ is the sequence from the point (a) to the scalar-field dominated point (c) under the conditions \(Q^2 \ll 1\) and \(- \sqrt 2 + 4Q < \lambda < \sqrt 2 + 4Q\). The analysis based on the Einstein frame action (10.6) also gives rise to the ϕMDE followed by the scalar-field dominated solution [23, 22].

Table 1 The critical points of dark energy models based on the action (10.10) in BD theory with constant λ = −U,ϕ/U in the absence of radiation (x3 = 0). The effective equation of state \({w_{{\rm{eff}}}} = - 1 - 2\dot H/(3{H^2})\) is known from Eq. (10.17).

Let us consider the case of non-constant λ. The fixed points derived above may be regarded as “instantaneous” points [195, 454] (see footnote 7) varying with a time-scale smaller than \(H^{-1}\). As in metric f(R) gravity \((Q = - 1/\sqrt 6)\) we are interested in large coupling models with ∣Q∣ of the order of unity. In order for the potential U(ϕ) to satisfy local gravity constraints, the field needs to be heavy in the region \(R \gg {R_0} \sim H_0^2\) such that ∣λ∣ ≫ 1. Then it is possible to realize the matter era by the point (d) with ∣Q∣ ≪ ∣λ∣. Moreover the solutions can finally approach the de Sitter solution (e) with λ = 4Q or the field-dominated solution (c). The stability of the point (e) was analyzed in [596, 250, 242] by considering linear perturbations δx1, δx2 and δF. One can easily show that the point (e) is stable for

$$Q{{{\rm{d}}\lambda} \over {{\rm{d}}F}}({F_1}) > 0\quad \rightarrow \quad {{{\rm{d}}\lambda} \over {{\rm{d}}\phi}}({\phi _1}) < 0,$$
(10.21)

where \(F_1 = e^{-2Q\phi_1}\) with ϕ1 being the field value at the de Sitter point. In metric f(R) gravity \((Q = - 1/\sqrt 6)\) this condition is equivalent to m = Rf,RR/f,R < 1.

For the f(R) model (5.19) the field ϕ is related to the Ricci scalar R via the relation \({e^{2\phi/\sqrt 6}} = 1 - 2n\mu {(R/{R_c})^{- (2n + 1)}}\). Then the potential U = (FR − f)/2 in the Jordan frame can be expressed as

$$U(\phi) = {{\mu {R_c}} \over 2}\;\left[ {1 - {{2n + 1} \over {{{(2n\mu)}^{2n/(2n + 1)}}}}\;{{\left({1 - {e^{2\phi/\sqrt 6}}} \right)}^{2n/(2n + 1)}}} \right].$$
(10.22)

For theories with general couplings Q we consider the following potential [596]

$$U(\phi) = {U_0}\;\left[ {1 - C{{(1 - {e^{- 2Q\phi}})}^p}} \right]\qquad ({U_0} > 0,\;C > 0,\;0 < p < 1),$$
(10.23)

which includes the potential (10.22) in f(R) gravity as a specific case with the correspondence U0 = μRc/2, \(C = (2n + 1)/(2n\mu)^{2n/(2n+1)}\), \(Q = - 1/\sqrt 6\), and p = 2n/(2n + 1). The potential behaves as U(ϕ) → U0 for ϕ → 0 and U(ϕ) → U0(1 − C) in the limits ϕ → ∞ (for Q > 0) and ϕ → −∞ (for Q < 0). This potential has a curvature singularity at ϕ = 0 as in the models (4.83) and (4.84) of f(R) gravity, but the appearance of the singularity can be avoided by extending the potential to the regions ϕ > 0 (Q < 0) or ϕ < 0 (Q > 0) with a field mass bounded from above. The slope λ = −U,ϕ/U is given by

$$\lambda = {{2Cp\,Q{e^{- 2Q\phi}}{{(1 - {e^{- 2Q\phi}})}^{p - 1}}} \over {1 - C{{(1 - {e^{- 2Q\phi}})}^p}}}.$$
(10.24)
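As a quick consistency check, the closed-form slope (10.24) can be compared with a finite-difference derivative of the potential (10.23). The following is a minimal sketch; the function names and the parameter values (U0 = 1, C = 0.9, p = 0.6) are illustrative assumptions, not values taken from [596].

```python
import math

def potential(phi, Q, U0=1.0, C=0.9, p=0.6):
    """U(phi) = U0 [1 - C (1 - e^{-2 Q phi})^p], Eq. (10.23).
    Valid on the branch Q*phi > 0, where 1 - e^{-2 Q phi} > 0."""
    x = 1.0 - math.exp(-2.0 * Q * phi)
    return U0 * (1.0 - C * x ** p)

def slope(phi, Q, C=0.9, p=0.6):
    """Closed-form slope lambda = -U_phi / U, Eq. (10.24)."""
    F = math.exp(-2.0 * Q * phi)
    x = 1.0 - F
    return 2.0 * C * p * Q * F * x ** (p - 1.0) / (1.0 - C * x ** p)

def slope_numeric(phi, Q, C=0.9, p=0.6, h=1e-6):
    """Same slope from a central finite difference of U (illustrative check)."""
    dU = (potential(phi + h, Q, C=C, p=p) - potential(phi - h, Q, C=C, p=p)) / (2.0 * h)
    return -dU / potential(phi, Q, C=C, p=p)

if __name__ == "__main__":
    Q = -1.0 / math.sqrt(6.0)      # metric f(R) coupling
    for phi in (-0.3, -0.01, -1e-4):
        print(phi, slope(phi, Q))  # |lambda| grows as phi -> 0
```

For Q < 0 the physical branch is ϕ < 0, and the printed values illustrate that ∣λ∣ becomes large as ϕ → 0.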

During the radiation and deep matter eras one has \(R = 6(2H^2 + \dot H) \simeq \rho_m/F\) from Eqs. (10.12)–(10.13) by noting that U0 is negligibly small relative to the background fluid density. From Eq. (10.14) the field is nearly frozen at a value satisfying the condition \(U_{,\phi} + Q\rho_m \simeq 0\). Then the field ϕ evolves along the instantaneous minima given by

$${\phi _m} \simeq {1 \over {2Q}}\;{\left({{{2{U_0}pC} \over {{\rho _m}}}} \right)^{1/(1 - p)}}.$$
(10.25)

As long as ρm ≫ 2U0pC we have that ∣ϕm∣ ≪ 1. In this regime the slope λ in Eq. (10.24) is much larger than 1. The field value ∣ϕm∣ increases for decreasing ρm and hence the slope λ decreases with time.
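The instantaneous minimum (10.25) can be checked numerically against the condition U,ϕ + Qρm ≃ 0 evaluated with the full potential (10.23). This is a hedged sketch: the helper names and the parameter choices (U0 = 1, C = 0.9, p = 0.6, densities in units of U0) are assumptions made for illustration.

```python
import math

def phi_min(rho_m, Q, U0=1.0, C=0.9, p=0.6):
    """Approximate instantaneous minimum, Eq. (10.25)."""
    return (2.0 * U0 * p * C / rho_m) ** (1.0 / (1.0 - p)) / (2.0 * Q)

def dU_dphi(phi, Q, U0=1.0, C=0.9, p=0.6):
    """Exact derivative of the potential (10.23); requires Q*phi > 0."""
    x = -math.expm1(-2.0 * Q * phi)   # 1 - e^{-2 Q phi}, accurate for tiny |Q phi|
    F = 1.0 - x
    return -2.0 * U0 * C * p * Q * F * x ** (p - 1.0)

def relative_residual(rho_m, Q, **kw):
    """(U_phi + Q rho_m) / (Q rho_m) at phi = phi_min; small when rho_m >> 2 U0 p C."""
    phi = phi_min(rho_m, Q, **kw)
    return (dU_dphi(phi, Q, **kw) + Q * rho_m) / (Q * rho_m)

if __name__ == "__main__":
    Q = -1.0 / math.sqrt(6.0)
    for rho in (1e6, 1e3, 1e1):
        print(rho, phi_min(rho, Q), relative_residual(rho, Q))
```

The residual shrinks as ρm grows, while ∣ϕm∣ grows as ρm decreases, in line with the statements above.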

Since λ ≫ 1 around ϕ = 0, the instantaneous fixed point (d) can be responsible for the matter-dominated epoch provided that ∣Q∣ ≪ λ. The variable \(F = e^{-2Q\phi}\) decreases in time irrespective of the sign of the coupling Q and hence 0 < F < 1. The de Sitter point is characterized by λ = 4Q, i.e.,

$$C = {{2{{(1 - {F_1})}^{1 - p}}} \over {2 + (p - 2){F_1}}}.$$
(10.26)

The de Sitter solution is present as long as the solution of this equation exists in the region 0 < F1 < 1. From Eq. (10.24) the derivative of λ in terms of ϕ is given by

$${{{\rm{d}}\lambda} \over {{\rm{d}}\phi}} = - {{4Cp{Q^2}F{{(1 - F)}^{p - 2}}[1 - pF - C{{(1 - F)}^p}]} \over {{{[1 - C{{(1 - F)}^p}]}^2}}}.$$
(10.27)

When 0 < C < 1, we can show that the function \(g(F) \equiv 1 - pF - C(1 - F)^p\) is positive and hence the condition dλ/dϕ < 0 is satisfied. This means that the de Sitter point (e) is a stable attractor. When C > 1, the function g(F) can be negative. Plugging Eq. (10.26) into Eq. (10.27), we find that the de Sitter point is stable for

$${F_1} > {1 \over {2 - p}}.$$
(10.28)

If this condition is violated, the solutions choose another stable fixed point [such as the point (c)] as an attractor.
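The chain from Eq. (10.26) to Eq. (10.28) can be verified numerically: fix p and F1, compute C from (10.26), confirm that λ = 4Q holds at F1, and test the sign of g(F1). The sketch below uses hypothetical function names and sample values chosen only for illustration.

```python
def C_de_sitter(F1, p):
    """C fixed by the de Sitter condition lambda = 4Q, Eq. (10.26)."""
    return 2.0 * (1.0 - F1) ** (1.0 - p) / (2.0 + (p - 2.0) * F1)

def lambda_over_Q(F, C, p):
    """lambda / Q from Eq. (10.24), written in terms of F = e^{-2 Q phi}."""
    return 2.0 * C * p * F * (1.0 - F) ** (p - 1.0) / (1.0 - C * (1.0 - F) ** p)

def g(F, C, p):
    """g(F) = 1 - pF - C(1-F)^p; the de Sitter point is stable when g(F1) > 0."""
    return 1.0 - p * F - C * (1.0 - F) ** p

def de_sitter_is_stable(F1, p):
    return g(F1, C_de_sitter(F1, p), p) > 0.0

if __name__ == "__main__":
    p = 0.5                                   # threshold 1/(2 - p) = 2/3
    for F1 in (0.5, 0.6, 0.7, 0.8):
        C = C_de_sitter(F1, p)
        print(F1, C, lambda_over_Q(F1, C, p), de_sitter_is_stable(F1, p))
```

For p = 0.5 the sign of g(F1) flips at F1 = 2/3, which reproduces the condition (10.28).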

The above discussion shows that for the model (10.23) the matter point (d) can be followed by the stable de Sitter solution (e) for 0 < C < 1. Numerical simulations in [596] show that the sequence of radiation, matter and de Sitter epochs can in fact be realized. Introducing the energy density ρDE and the pressure PDE of dark energy as we have done for metric f(R) gravity, the dark energy equation of state wDE = PDE/ρDE is given by the same form as Eq. (4.97). Since for the model (10.23) F increases toward the past, the phantom equation of state (wDE < − 1) as well as the cosmological constant boundary crossing (wDE = − 1) occurs as in the case of metric f(R) gravity [596].

As we will see in Section 10.3, for a light scalar field, it is possible to satisfy local gravity constraints for \(\vert Q\vert \lesssim 10^{-3}\). In those cases the potential does not need to be steep such that λ ≫ 1 in the region \(R \gg R_0\). The cosmological dynamics for such nearly flat potentials have been discussed by a number of authors in several classes of scalar-tensor theories [489, 451, 416, 271]. It is also possible to realize the condition wDE < −1, while avoiding the appearance of a ghost [416, 271].

Local gravity constraints

We study local gravity constraints (LGC) for BD theory given by the action (10.10). In the absence of the potential U(ϕ) the BD parameter ωBD is constrained to be \(\omega_{\rm BD} > 4 \times 10^4\) from solar-system experiments [616, 83, 617]. This bound also applies to the case of a nearly massless field with the potential U(ϕ) in which the Yukawa correction \(e^{-Mr}\) is close to unity (where M is a scalar-field mass and r is an interaction length). Using the bound \(\omega_{\rm BD} > 4 \times 10^4\) in Eq. (10.11), we find that

$$\vert Q\vert \; < 2.5 \times {10^{- 3}}.$$
(10.29)

This is a strong constraint under which the cosmological evolution for such theories is difficult to distinguish from the ΛCDM model (Q = 0).
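The number in Eq. (10.29) follows from simple arithmetic once the relation between ωBD and Q is known. The sketch below assumes that Eq. (10.11), which is not reproduced in this section, has the form 3 + 2ωBD = 1/(2Q²); the function name is hypothetical.

```python
import math

def q_max_from_omega_bd(omega_bd):
    """Upper bound on |Q| implied by a lower bound on omega_BD,
    ASSUMING the relation 3 + 2*omega_BD = 1/(2 Q^2) for Eq. (10.11)."""
    return 1.0 / math.sqrt(2.0 * (3.0 + 2.0 * omega_bd))

if __name__ == "__main__":
    print(q_max_from_omega_bd(4e4))   # approximately 2.5e-3
```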

If the field potential is present, the models with large couplings \((\vert Q\vert = \mathcal O(1))\) can be consistent with local gravity constraints as long as the mass M of the field ϕ is sufficiently large in the region of high density. For example, the potential (10.23) is designed to have a large mass in the high-density region so that it can be compatible with experimental tests for the violation of equivalence principle through the chameleon mechanism [596]. In the following we study conditions under which local gravity constraints can be satisfied for the model (10.23).

As in the case of metric f(R) gravity, let us consider a configuration in which a spherically symmetric body has a constant density ρA inside the body with a constant density ρ = ρB (≪ ρA) outside the body. For the potential \(V = U/F^2\) in the Einstein frame one has \(V_{,\phi} \simeq - 2{U_0}QpC{(2Q\phi)^{p - 1}}\) under the condition \(\vert Q\phi \vert \ll 1\). Then the field values at the potential minima inside and outside the body are

$${\phi _i} \simeq {1 \over {2Q}}\;{\left({{{2{U_0}\,p\,C} \over {{\rho _i}}}} \right)^{1/(1 - p)}},\qquad i = A,B.$$
(10.30)

The field mass squared \(m_i^2 \equiv {V_{,\phi \phi}}\) at ϕ = ϕi (i = A, B) is approximately given by

$$m_i^2 \simeq {{1 - p} \over {{{({2^p}\,pC)}^{1/(1 - p)}}}}{Q^2}\;{\left({{{{\rho _i}} \over {{U_0}}}} \right)^{(2 - p)/(1 - p)}}{U_0}.$$
(10.31)

Recall that U0 is roughly the same order as the present cosmological density \(\rho_0 \simeq 10^{-29}\;{\rm g/cm^3}\). The baryonic/dark matter density in our galaxy corresponds to \(\rho_B \simeq 10^{-24}\;{\rm g/cm^3}\). The mean density of the Sun or Earth is about \({\rho _A} = \mathcal O(1)\;{\rm{g}}/{\rm{c}}{{\rm{m}}^3}\). Hence mA and mB are in general much larger than H0 for local gravity experiments in our environment. For \({m_A}{{\tilde r}_c} \gg 1\) the chameleon mechanism we discussed in Section 5.2 can be directly applied to BD theory whose Einstein frame action is given by Eq. (10.6) with \(F = e^{-2Q\phi}\).
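To get a feel for the hierarchy between the field mass and H0, Eq. (10.31) can be evaluated for the densities quoted above. The sketch below is only an order-of-magnitude illustration: it assumes U0 ≈ ρ0 and 3H0² ≈ ρ0 (in units κ² = 1), and the values Q = 1, C = 0.9, p = 0.7 as well as the function name are arbitrary illustrative choices.

```python
def mass_squared_over_U0(rho_over_U0, Q, C=0.9, p=0.7):
    """m_i^2 / U0 from Eq. (10.31), with the density given in units of U0."""
    prefactor = (1.0 - p) / (2.0 ** p * p * C) ** (1.0 / (1.0 - p))
    return prefactor * Q ** 2 * rho_over_U0 ** ((2.0 - p) / (1.0 - p))

if __name__ == "__main__":
    Q = 1.0
    rho_B = 1e5    # rho_B / rho_0 for 1e-24 vs 1e-29 g/cm^3
    rho_A = 1e29   # rho_A / rho_0 for O(1) g/cm^3 vs 1e-29 g/cm^3
    # ASSUMPTION: U0 ~ rho_0 ~ 3 H0^2, so m^2 / H0^2 ~ 3 m^2 / U0.
    print("m_B^2/H0^2 ~", 3.0 * mass_squared_over_U0(rho_B, Q))
    print("m_A^2/H0^2 ~", 3.0 * mass_squared_over_U0(rho_A, Q))
```

Because of the steep power (2 − p)/(1 − p), even the galactic density already gives a mass many orders of magnitude above H0 for these sample parameters.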

The bound (5.56) coming from the possible violation of equivalence principle in the solar system translates into

$${\left({2{U_0}pC/{\rho _B}} \right)^{1/(1 - p)}} < 7.4 \times {10^{- 15}}\,\vert Q\vert.$$
(10.32)

Let us consider the case in which the solutions finally approach the de Sitter point (e) in Table 1. At this de Sitter point we have \(3{F_1}H_1^2 = {U_0}[1 - C{(1 - {F_1})^p}]\) with C given in Eq. (10.26). Then the following relation holds

$${U_0} = 3H_1^2\left[ {2 + (p - 2){F_1}} \right]/p.$$
(10.33)

Substituting this into Eq. (10.32) we obtain

$${\left({{R_1}/{\rho _B}} \right)^{1/(1 - p)}}(1 - {F_1}) < 7.4 \times {10^{- 15}}\vert Q\vert,$$
(10.34)

where \({R_1} = 12H_1^2\) is the Ricci scalar at the de Sitter point. Since (1 − F1) is smaller than 1/2 from Eq. (10.28), it follows that \({({R_1}/{\rho _B})^{1/(1 - p)}} < 1.5 \times {10^{- 14}}\,\vert Q\vert\). Using the values \(R_1 = 10^{-29}\;{\rm g/cm^3}\) and \(\rho_B = 10^{-24}\;{\rm g/cm^3}\), we get the bound for p [596]:

$$p > 1 - {5 \over {13.8 - {{\log}_{10}}\,\vert Q\vert}}.$$
(10.35)

When \(\vert Q\vert = 10^{-1}\) and ∣Q∣ = 1 we have p > 0.66 and p > 0.64, respectively. Hence the model can be compatible with local gravity experiments even for \(\vert Q\vert = \mathcal O(1)\).
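The bound (10.35) is obtained by taking the base-10 logarithm of the inequality for (R1/ρB)^{1/(1−p)} and solving for p. A minimal sketch of that arithmetic follows; the function name and default arguments are illustrative, with the densities and the coefficient 1.5 × 10⁻¹⁴ taken from the text above.

```python
import math

def p_lower_bound(Q, R1=1e-29, rho_B=1e-24, coeff=1.5e-14):
    """Lower bound on p from (R1/rho_B)^{1/(1-p)} < coeff * |Q|.
    Taking log10: -log10(rho_B/R1)/(1-p) < log10(coeff) + log10|Q|."""
    decades = math.log10(rho_B / R1)                        # = 5 for the quoted densities
    return 1.0 - decades / (-math.log10(coeff) - math.log10(abs(Q)))

if __name__ == "__main__":
    for Q in (1e-1, 1.0):
        print(Q, round(p_lower_bound(Q), 2))
```

With the quoted densities this reproduces the two values stated above for ∣Q∣ = 10⁻¹ and ∣Q∣ = 1.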

Evolution of matter density perturbations

Let us next study the evolution of perturbations in non-relativistic matter for the action (10.10) with the potential U(ϕ) and the coupling \(F(\phi) = e^{-2Q\phi}\). As in metric f(R) gravity, the matter perturbation δm satisfies Eq. (8.93) in the Longitudinal gauge. We define the field mass squared as \(M^2 \equiv U_{,\phi \phi}\). For the potential consistent with local gravity constraints [such as (10.23)], the mass M is much larger than the present Hubble parameter H0 during the radiation and deep matter eras. Note that the condition \(M^2 \gg R\) is satisfied in most of the cosmological epoch as in the case of metric f(R) gravity.

The perturbation equations for the action (10.10) are given in Eqs. (6.11)–(6.18) with f = F(ϕ)R, \(\omega = (1 - 6Q^2)F\), and V = U. We use the unit \(\kappa^2 = 1\), but we restore \(\kappa^2\) when it is necessary. In the Longitudinal gauge one has χ = 0, α = Φ, ψ = −Ψ, and \(A = 3(H\Phi + \dot \Psi)\) in these equations. Since we are interested in sub-horizon modes, we use the approximation that the terms containing \(k^2/a^2\), δρm, δR, and \(M^2\) are the dominant contributions in Eqs. (6.11)–(6.19). We shall neglect the contribution of the time-derivative terms of δϕ in Eq. (6.16). As we have discussed for metric f(R) gravity in Section 8.1, this amounts to neglecting the oscillating mode of perturbations. The initial conditions of the field perturbation in the radiation era need to be chosen so that the oscillating mode δϕosc is smaller than the matter-induced mode δϕind. In Fourier space Eq. (6.16) gives

$$\left({{{{k^2}} \over {{a^2}}} + {{{M^2}} \over \omega}} \right)\;\delta {\phi _{{\rm{ind}}}} \simeq {1 \over {2\omega}}{F_{,\phi}}\delta R.$$
(10.36)

Using this relation together with Eqs. (6.13) and (6.18), it follows that

$$\delta {\phi _{{\rm{ind}}}} \simeq {{2QF} \over {({k^2}/{a^2})(1 - 2{Q^2})F + {M^2}}}{{{k^2}} \over {{a^2}}}\Psi.$$
(10.37)

Combining this equation with Eqs. (6.11) and (6.13), we obtain [596, 547] (see also [84, 632, 631])

$${{{k^2}} \over {{a^2}}}\Psi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{({k^2}/{a^2})(1 - 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}},$$
(10.38)
$${{{k^2}} \over {{a^2}}}\Phi \simeq - {{{\kappa ^2}\delta {\rho _m}} \over {2F}}{{({k^2}/{a^2})(1 + 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}},$$
(10.39)

where we have recovered \(\kappa^2\). Defining the effective gravitational potential Φeff = (Φ + Ψ)/2, we find that Φeff satisfies the same form of equation as (8.99).

Substituting Eq. (10.39) into Eq. (8.93), we obtain the equation of matter perturbations on sub-horizon scales [with the neglect of the r.h.s. of Eq. (8.93)]

$${\ddot \delta _m} + 2H{\dot \delta _m} - 4\pi {G_{{\rm{eff}}}}{\rho _m}{\delta _m} \simeq 0,$$
(10.40)

where the effective gravitational coupling is

$${G_{{\rm{eff}}}} = {G \over F}{{({k^2}/{a^2})(1 + 2{Q^2})F + {M^2}} \over {({k^2}/{a^2})F + {M^2}}}.$$
(10.41)
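The scale dependence of Eq. (10.41) is easiest to see from its two limits: for M² ≫ (k²/a²)F the coupling reduces to G/F, while for M² ≪ (k²/a²)F it approaches (1 + 2Q²)G/F. The following sketch evaluates the ratio Geff/G; the function name and the sample values of F, M and k/a are illustrative assumptions.

```python
import math

def g_eff_over_g(k_over_a, M, F, Q):
    """G_eff / G from Eq. (10.41) for physical wavenumber k/a and field mass M."""
    k2 = k_over_a ** 2
    return ((k2 * (1.0 + 2.0 * Q * Q) * F + M * M) / (k2 * F + M * M)) / F

if __name__ == "__main__":
    Q = -1.0 / math.sqrt(6.0)   # metric f(R) value, 1 + 2Q^2 = 4/3
    F = 0.9                     # illustrative value
    for M in (1e3, 1.0, 1e-3):  # heavy, intermediate and light field for k/a = 1
        print(M, g_eff_over_g(1.0, M, F, Q))
```

For the metric f(R) coupling the light-field limit gives 4G/(3F), so the growth of δm in Eq. (10.40) is enhanced on scales where the field is effectively massless.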