1 Introduction

For over a century, Einstein’s general relativity (GR) has continued to be an impressive theory of gravity that fits observations from our solar system to the entire cosmological model of the universe. Guided by some key principles, Einstein came to the important realization of a very close relationship between the curvature of spacetime and gravity. Taking into account further requirements, such as coordinate invariance, conservation laws, and limits that must be consistent with Newtonian gravity, he proposed his gravitational field equations (Einstein 1915). Astonishingly, the same simple but powerful equations remain to date the most accurate description of gravitational physics at all scales.

Shortly after that, GR gave birth to the current standard model of cosmology predicting exact solutions with expanding or contracting universes. It allowed the combination of ideas from Friedmann and Lemaître about expanding universes (Friedmann 1922; Lemaître 1931) along with the geometry of homogeneous and isotropic spacetimes of Robertson (1935) and Walker (1937) in order to produce the so-called Friedmann–Lemaître–Robertson–Walker models (FLRW). These models describing the background cosmological evolution were completed by the addition of cosmological perturbation theory to populate them with cosmic structures. Over the years and decades to follow, the FLRW models plus cosmological perturbations benefited from a number of theoretical developments and observational techniques that allowed us to map the whole history of cosmic evolution from very early times to the current stages of the universe as we observe it today.

However, this scientific triumph in cosmology came with two conundrums: dark matter and cosmic acceleration (or dark energy). Indeed, in order for the FLRW model to fit current observations, we first need \(\sim \) 25% of the mass-energy content in the universe to be in the form of a pressureless dark matter component that interacts only gravitationally with baryons and light (possibly weakly with baryons as well). The requirement for the presence of such a dark matter component does not come only from cosmology but also from rotation curves of galaxies, gravitational lensing observations, and the requirement of deep initial potential wells that would have allowed the formation of the largest structures that we observe today; see for example Trimble (1987), Bertone et al. (2005), D’Amico et al. (2009), Einasto (2014), Freese (2017) and references therein. The dark matter problem motivated the introduction of modified gravity (MG) theories that would explain such observations by a small modification to Kepler’s laws, such as Modified Newtonian Dynamics (MOND) (Milgrom 1983b), its relativistic generalization known as TeVeS (tensor–vector–scalar) theory (Bekenstein 2004), or other vector–tensor theories. While dark matter motivated proposals of some MG models, the main focus of this review is rather on models that attempt to address the question of cosmic acceleration that we describe next.

The second problem in standard cosmology is indeed that of cosmic acceleration and the dark energy associated with it. Two decades ago, two independent groups using supernova measurements found that the universe’s expansion is speeding up rather than slowing down (Riess et al. 1998; Perlmutter et al. 1999). A plethora of complementary cosmological observations have continued since to confirm this result and require that an FLRW model fitting observations must have a genuine or effective dark energy component that would account for \(\sim \) 70% of the total energy budget in the universe. In such a universe, the baryons constitute only \(\sim \) 5% of this budget. This picture has become the concordance model in cosmology referred to as the Lambda-Cold-Dark-Matter (\(\varLambda \)CDM) model. This best fit model is spatially flat. \(\varLambda \) is the cosmological constant, and its addition to Einstein’s equations can produce an accelerated expansion of the universe.

The cosmological constant can be cast into the model as an effective cosmic fluid with an equation of state of minus one. This coincides exactly with the equation of state of the vacuum energy associated with zero-level quantum fluctuations. Interestingly, this connects the problem of cosmic acceleration to the cosmological constant/vacuum energy problems (Weinberg 1989; Carroll et al. 1992; Sahni and Starobinsky 2000; Carroll 2001; Peebles and Ratra 2003; Padmanabhan 2003; Copeland et al. 2006; Ishak 2007). Namely, why is the value measured from cosmology so small compared to the one predicted from quantum field calculations? This is known as the old cosmological constant problem. A second question (the new problem) is why the energy density associated with the cosmological constant/vacuum energy is of the same order of magnitude as the matter density at the present cosmic time. (If it were any larger it would have prevented cosmic structure from forming.) Other types of dark energy have been proposed with an equation of state that is very close to minus one and would not be connected to the cosmological constant/vacuum energy. These are for example quintessence models based on a scalar field with kinetic energy and potential terms that can be cast as well into an effective dark energy model with a negative equation of state also close to minus one (Peebles and Ratra 1988; Ratra and Peebles 1988; Caldwell et al. 1998). It is worth noting that most of these dark energy models do not address the cosmological constant problem and may suffer from some form of fine-tuning as well.

Relevant to our review, the question of cosmic acceleration motivated a number of proposals for modified gravity models that could produce such an acceleration without the need for a cosmological constant. Such models are said to be self-accelerating. In most cases, these models do not address the cosmological constant problems and it is hoped that by some mechanism, for example degravitation or some given cancellation, vacuum energy does not contribute to gravitational and cosmological dynamics. However, in some cases, modified gravity models do provide some degravitation mechanism, although not successfully so far. We discuss these further in this review.

Finally, there are also modified gravity models at high energies that have been motivated by the search for quantum gravity and other unified theories of physics which may or may not have any consequences at cosmological scales.

While the rapid growth of the subject of deviations from GR and MG models has been motivated by cosmic acceleration/dark energy and to some extent by dark matter, the subject of MG models is an old one. Indeed, just a few years after GR was introduced, Weyl gravity was proposed by Weyl (1918), and so were the theories of Eddington (1924), Cartan (1922b) and Brans and Dicke (1961), and many others. Testing GR and gravity theories within the solar system and using other astrophysical objects has been the subject of intense work with a number of important results over the last five decades or so; for a review, see Will (2014). Impressively, GR fits all these local tests of gravity. In fact, it fits them so well that these tests are commonly referred to as GR local tests. This is very useful to the current cosmological developments, because it has established very stringent constraints at the level of the solar system that any gravity theory must pass. To satisfy these requirements, some MG models include gravitational screening mechanisms that allow them to deviate from GR at cosmological scales but become indistinguishable from it at small scales.

Further motivation for testing GR at cosmological scales is the increasing quantity and quality of available cosmological data. These are indeed good times for cosmology where a plethora of complementary observational data from ongoing and planned surveys will continue to flow for the many decades to come. These include the cosmic microwave background radiation, weak gravitational lensing, galaxy surveys, distances to supernovae, baryon acoustic oscillations, and gravitational waves. A good piece of news is that one can not only combine these data sets to increase their constraining power, but one can also apply consistency tests between such complementary data sets. This would allow one to identify any problems with systematic effects in the data or any problems with the underlying model. Furthermore, nature has also given us a break in cosmology as we have two types of data sets. Indeed, some data sets are sensitive to the geometry and expansion of the universe and some other sets are sensitive to the growth of large-scale structure (i.e., the rate at which structures cluster in the universe). These two sets of observations must be consistent with one another. For testing deviations from GR and constraining MG models, it was realized that MG models can mimic an expansion history of the universe that is identical to that of the concordance \(\varLambda \)CDM model while they can still have a structure formation history that is different and distinguishable from that of \(\varLambda \)CDM. It has become common practice that the background expansion is modeled with an effective dark energy with an equation of state close to the minus one value of \(\varLambda \)CDM. Meanwhile, any departure from GR is constrained by using the growth data from large-scale structure observables.

There are two general approaches that have been developed to test departures from GR at cosmological scales. The first one is where the deviation is parameterized in a phenomenological way with no necessary exact knowledge of the specific alternative theory. The growth equations are modified by the addition of MG parameters that represent the departure from GR. These MG parameters are expected to take values of unity for GR but depart from it for MG models. It is worth noting that such an effective description may not necessarily remain valid at all scales constrained by observations and so must be used with some caution when compared to various observations. The second approach is to choose a specific class of MG models [like the popular f(R) or DGP models (see Sects. 7.4.1 and 7.5.2)] and derive cosmological perturbations and observables for these models. These are then implemented in cosmological analysis software and used to compare to the data. We cover both approaches in this review. A related question is what one could call a modified gravity model versus a dark energy model. There are some guiding helpful prescriptions that we discuss in the review but the spectrum of models has a grey zone where such a distinction is not unambiguous. We characterize various types of deviations from general relativity and organize MG models accordingly with some illustrative examples.

In this review, we aim at providing an overall current picture of the field of testing gravity at cosmological scales including a selection of recent important results on the subject. The review is meant to provide an entry point for students and researchers interested in the field where they can find summaries and references to further readings. This review can also serve for experienced researchers or other readers to find quickly recent developments or results in the field. As required for the Living Review guidelines, this review is written with the depth and style of a plenary review talk on the subject. It is not meant to replace thorough comprehensive reviews on various parts of this topic and we refer the reader constantly to such specialized reviews as we discuss each sub-topic.

2 General relativity (GR)

2.1 Basic principles

Einstein considered some key guiding principles and well-known limits that a successful theory of gravity must obey. At the forefront is the principle of covariance—that is the laws of physics must be independent of any coordinate system. So the right language must be that of tensors or another coordinate independent formulation. Such a successful theory should locally be consistent with special relativity and must inherit its principles including the equivalence of local inertial frames of reference, the universal constancy of the speed of light in vacuum, and the Lorentz-invariance of the theory.

An important part of Einstein’s reflections when he proposed special relativity and then continued to work toward general relativity was about the principles of equivalence. He found guidance in Mach’s ideas about relativity and the nature of inertia (Mach et al. 1905, 1988), although he had to abandon some of them later on.

From the principle of equivalence between gravity and inertia that we provide below, Einstein developed the important insight that gravity seems to have a privileged status compared to other interactions. That is, gravity is equivalent to inertia. The principle of universality of free fall and gravitational interaction, as expressed below in the equivalence principles, combined with the insight that gravity is omnipresent in spacetime, led Einstein to formulate gravity as the curvature of spacetime. See various discussions and perspectives in reviews and books, e.g., Will (2014, 2018), d’Inverno (1992), Rindler (2006), Weinberg (1972), Misner et al. (1973) and Carroll (2003).

  • Weak equivalence principle (WEP) WEP is stated in a variety of formulations. It is usually stated as the equivalence between the inertial mass and the gravitational mass, which has been tested to a few parts in \(10^{13}\) (Adelberger 2001; Wagner et al. 2012) and a few parts in \(10^{14}\) (Touboul et al. 2017); see Will (2014) for a timeline of WEP tests. Einstein then advocated that inertia and gravity must be the same and that an observer inside a “cabin” (with no windows) at rest in the presence of a gravitational acceleration will not be able to distinguish that situation from one where the “cabin” is on a rocket moving up with the exact opposite acceleration. The WEP is expressed as the universality of the gravitational interaction and free fall for all particles. For our review, we focus on the notions of universality of free fall and the matter coupling in the context of GR \(+\) dark energy versus modified gravity (MG) models following for example Joyce et al. (2016). WEP is satisfied if there exists some spacetime metric (in the Jordan frame) to which all species of matter are universally coupled. Test particles then fall along geodesics of this metric.

  • Einstein equivalence principle (EEP) The EEP requires the validity of the WEP, and that in all local freely falling frames, the laws of physics reduce to those of special relativity (assuming tidal gravitational forces are absent). It is also customary to add here that the EEP contains the statement that the outcome of any local non-gravitational experiment is independent of where and when it is performed (Will 2014).

  • Strong equivalence principle (SEP) The SEP extends the universality of free fall of the WEP to massive gravitating objects so it is completely independent of the composition of the objects as well as their gravitational binding energy. Compact objects like Black Holes will also fall along geodesics like test particles (Will 1994, 2014). The SEP extends also the EEP to include all of the laws of physics, gravitational or otherwise.

One more remark is worth making about the relationship between the equivalence principles and the spacetime metric. Let us recall that metric theories of gravity satisfy the following properties, see for example Will (2014): (i) a symmetric metric exists, (ii) test particles follow geodesics of such a metric, and (iii) in local reference frames, the non-gravitational laws of physics are those of special relativity. From this definition, it follows that metric theories obey the EEP. It also suggests the converse, namely that theories that obey the EEP are metric theories, e.g., Will (2014).

We conclude this subsection by commenting on a few other notions that guided Einstein in formulating his equations of the gravitational field. The geometrical nature of GR and the principles it is built upon are certainly far from the Newtonian gravity based on forces and potentials, not to mention the notions of absolute space and other shortcomings that had to be abandoned. Still, it is interesting to remark that the notion of spacetime and its metric to explain gravity can be compared to the notion of the gravitational potential field in space created by massive objects. There is, however, a major difference: in GR there is no gravitational potential or gravity added on top of spacetime; gravity is the curvature of spacetime itself. This was a major insight that Einstein gained from the EEP. In fact, he knew well that GR must have Newtonian gravity as a limit in the weak-field regime, and that provided him with many hints on how to formulate the field equations that we provide in the next section.

2.2 Einstein field equations (EFEs) and their exact solutions

In addition to the principles above, Einstein used the fact that, in the weak field limit, the gravitational field equations must locally reduce to those of Newtonian gravity, where the metric tensor components would be related to the gravitational potential and the field equations must reduce to the Poisson equation. From the latter, he required that the curvature side of the equations contain derivatives of the metric only up to second order and be of the same tensor rank as the energy-momentum tensor. This naturally led Einstein to consider the Ricci tensor, obtained by contracting the Riemann curvature tensor, but there was a little more to it. Indeed, he knew that the equations must satisfy conservation laws and thus must be divergence-free. While the vanishing of the divergence of the matter-energy source side of the equations is assured by energy conservation laws and continuity equations, on the curvature side, the Ricci tensor is not divergence-free so more work was required. For that, Einstein built precisely the tensor that bears his name which, by the Bianchi identity, is divergence-free and hence complies with conservation laws, as it should. Entire books and articles, both technical and historical, have been devoted to what led Einstein to derive his equations, and we refer the reader to the extended study by Janssen et al. (2007) and references therein.

With no further discussion, Einstein’s field equations (EFEs) read

$$\begin{aligned} G_{\mu \nu }+\varLambda g_{\mu \nu }=8\pi G T_{\mu \nu }, \end{aligned}$$

where \(G_{\mu \nu }\equiv R_{\mu \nu }-\frac{1}{2}g_{\mu \nu }R\) is the Einstein tensor representing the curvature of spacetime, \(R_{\mu \nu }\) is the Ricci tensor, R the Ricci scalar, \(g_{\mu \nu }\) is the metric tensor, and \(\varLambda \) is the cosmological constant. For brevity we use units such that \(c=1\) throughout. On the RHS, the source (content) of spacetime is represented by the energy momentum tensor

$$\begin{aligned} T_{\mu \nu }=(\rho +p)u_{\mu }u_{\nu }+pg_{\mu \nu }+q_{\mu }u_{\nu }+u_{\mu }q_{\nu }+\pi _{\mu \nu }, \end{aligned}$$

where \(u^{\mu }\) is the tangent velocity 4-vector (e.g., the tangent field to the cosmic fluid particle world-lines) normalized by \(u_{\mu }u^{\mu }=-1\), \(\rho \) is the relativistic mass-energy density, p is the isotropic pressure, \(q^{\mu }\) the energy flux, and \(\pi _{\mu \nu }\) is the trace-free anisotropic pressure or stress, all relative to \(u^{\mu }\). The quantities \(\rho \), p, \(q_{\mu }\), and \(\pi _{\mu \nu }\) are functions of time and space. We use the signature \((-,+,+,+)\) and a 3\(+\)1 decomposition of spacetime unless stated otherwise.

In standard cosmology, it is assumed that the cosmic fluid is well-described by a perfect fluid (i.e., \(q_{\mu }=0\) and \(\pi _{\mu \nu }=0\)) at the cosmic background level which accounts for baryons, dark matter, radiation and a cosmological constant or another dark energy component. The energy-momentum tensor then reduces to

$$\begin{aligned} T_{\mu \nu }=({\bar{\rho }}+\bar{p})u_{\mu }u_{\nu }+\bar{p}g_{\mu \nu }, \end{aligned}$$

where the last three terms of (2) are set to zero, and the overbar denotes a spatial average, so that the barred quantities are functions of time only. At the perturbation level, however, the velocity field contributes to the heat flux and neutrinos, for example, generate anisotropic shear at early times in the universe.
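As a small illustration of Eq. (3) and of the sign conventions stated above, the following sketch builds the perfect-fluid tensor in a local comoving frame. The locally Minkowski metric, the fluid at rest, and the function names are assumptions made for this example only; they are not part of the formalism above.

```python
# Perfect-fluid energy-momentum tensor, T_{mu nu} = (rho + p) u_mu u_nu + p g_{mu nu},
# evaluated in a local comoving frame. Assumptions for illustration: locally
# Minkowski metric with signature (-,+,+,+), fluid at rest, units with c = 1.

def minkowski_metric():
    """Return g_{mu nu} = diag(-1, +1, +1, +1) as a nested list."""
    return [[(-1.0 if m == 0 else 1.0) if m == n else 0.0 for n in range(4)]
            for m in range(4)]

def lower_index(g, v_up):
    """Lower a vector index: v_mu = g_{mu nu} v^nu."""
    return [sum(g[m][n] * v_up[n] for n in range(4)) for m in range(4)]

def perfect_fluid_tensor(rho, p):
    """Return T_{mu nu} for a fluid at rest with density rho and pressure p."""
    g = minkowski_metric()
    u_up = [1.0, 0.0, 0.0, 0.0]          # comoving 4-velocity u^mu
    u_down = lower_index(g, u_up)        # u_mu = (-1, 0, 0, 0)
    return [[(rho + p) * u_down[m] * u_down[n] + p * g[m][n] for n in range(4)]
            for m in range(4)]

def trace(T):
    """Trace g^{mu nu} T_{mu nu}; for the diagonal metric above, g^{mu nu} = g_{mu nu}."""
    g = minkowski_metric()
    return sum(g[m][m] * T[m][m] for m in range(4))

T = perfect_fluid_tensor(rho=1.0, p=0.3)
# Time-time component is rho, space-space components are p, trace is -rho + 3p.
print(T[0][0], T[1][1], trace(T))
```

In this frame the tensor is diagonal, with the density in the time-time slot and the pressure in the three space-space slots, which is the form used at the background level.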

It is not widely known that the EFEs have over 1300 exact solutions that have been derived over the last century, see for example the classical compilation book by Stephani et al. (2003) and also Online Interactive Geometric Databases equipped with a live tensor component calculator (Ishak and Lake 2002). These solutions are based on symmetries of the spacetime and defined forms of the energy momentum source.

While the large number of exact solutions exhibits the richness and mathematical beauty of the field, a number of solutions still lack any physical interpretation (Stephani et al. 2003; Delgaty and Lake 1998; Ishak et al. 2001). Some of these solutions have found direct applications to real astrophysical systems. These include the popular Schwarzschild static spherically symmetric vacuum solution around a central mass (Schwarzschild 1916). The solution is often used to model space around the Earth, the Sun, or other slowly rotating objects where it leads to more accurate predictions than Newtonian gravity, see e.g., Will (2014). The solution is also used to represent the exterior spacetime around a static spherically symmetric black hole. A second well-known exact solution is that of Kerr (1963) representing the vacuum space around an axially symmetric rotating compact object or black hole. Next, several other static spherically symmetric non-vacuum solutions such as the Tolman family of solutions (Tolman 1939) and the Buchdahl solutions (Buchdahl 1967) have been used to model the interior of compact astrophysical objects such as neutron stars (Lattimer and Prakash 2007). Finally, some solutions have found applications in cosmology. These include, for example, the isotropic and homogeneous Friedmann–Lemaître–Robertson–Walker (FLRW) solutions (Friedmann 1922; Lemaître 1931; Robertson 1935; Walker 1937), the inhomogeneous Lemaître–Tolman–Bondi solutions (Lemaître 1933; Tolman 1934; Bondi 1947), the inhomogeneous Szekeres models (Szekeres 1975), the anisotropic Bianchi models (Ellis and MacCallum 1969), and others (Ellis and van Elst 1999).

Einstein’s equations of general relativity naturally connected the isotropic and homogeneous geometry of space given by the Robertson–Walker metric to the cosmic fluid substratum described by a perfect fluid, giving birth to the standard model of cosmology that we describe in the next section.

It is important to note, and in particular in the context of this review, that while Einstein derived his equations from the principles and approach discussed above, the field equations also derive immediately from a variational principle where the action for the curvature sector is simply the Ricci scalar. This was derived simultaneously by Einstein and Hilbert and the curvature part of the action bears their names. The GR action with a cosmological constant term reads

$$\begin{aligned} S_{\textit{GR}}=\int d^{4}x \sqrt{-g} \left[ \frac{R- 2 \varLambda }{16 \pi G}+{\mathcal {L}}_{M}\right] , \end{aligned}$$

where g is the determinant of the metric tensor and \({\mathcal {L}}_{M}\) is the Lagrangian for the matter fields. Variation of Eq. (4) with respect to the metric, \(g_{\mu \nu }\), gives the field equations (1) above. Modified gravity models are often introduced at the level of the action.
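The variation can be sketched in a few standard textbook steps. Two assumptions are made here that are not stated in the text: boundary terms are dropped, and the matter sector defines the energy-momentum tensor through \(T_{\mu \nu }\equiv -\frac{2}{\sqrt{-g}}\frac{\delta (\sqrt{-g}{\mathcal {L}}_{M})}{\delta g^{\mu \nu }}\). Using

$$\begin{aligned} \delta \sqrt{-g}=-\frac{1}{2}\sqrt{-g}\,g_{\mu \nu }\,\delta g^{\mu \nu }, \qquad \delta R=R_{\mu \nu }\,\delta g^{\mu \nu }+g^{\mu \nu }\,\delta R_{\mu \nu }, \end{aligned}$$

where the term \(g^{\mu \nu }\delta R_{\mu \nu }\) is a total divergence and does not contribute, one finds

$$\begin{aligned} \delta S_{\textit{GR}}=\int d^{4}x \sqrt{-g} \left[ \frac{R_{\mu \nu }-\frac{1}{2}g_{\mu \nu }R+\varLambda g_{\mu \nu }}{16 \pi G}-\frac{1}{2}T_{\mu \nu }\right] \delta g^{\mu \nu }. \end{aligned}$$

Requiring \(\delta S_{\textit{GR}}=0\) for arbitrary \(\delta g^{\mu \nu }\) then gives \(G_{\mu \nu }+\varLambda g_{\mu \nu }=8\pi G T_{\mu \nu }\).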

Finally, with regard to this review, it is worth clarifying that modifications to GR also mean that the above exact solutions are no longer valid and need to be replaced by their counterparts in the modified theory. For cosmology, an FLRW metric is often used but then leads to modified dynamical equations often referred to as modified Friedmann equations.

3 The standard model of cosmology

3.1 The homogeneous cosmological background

3.1.1 FLRW metric and Friedmann’s equations

From the nearly isotropic large scale observations around us and the assumption that it should not look any different from another point in the universe (i.e., the cosmological principle), one can infer that the universe can be described by a spacetime that is globally isotropic and thus homogeneous. The geometry is then described by the metric of Friedmann–Lemaître–Robertson–Walker (FLRW) with line element

$$\begin{aligned} ds^2=-dt^2+a^2(t)\left( \frac{dr^2}{1-kr^2}+r^2(d\theta ^2+\sin ^2\theta d\phi ^2)\right) , \end{aligned}$$

where a(t) is the expansion scale factor representing the time-dependent evolution of the spatial part of the metric (surfaces of constant t), and \(k\in \{-1,0,+1\}\) determines the geometry of these spatial sections: negatively curved, flat, or positively curved, respectively.

The EFEs (1) solved for the FLRW metric (5) and a perfect fluid source energy momentum tensor (3) give the dynamical Friedmann equations. The first equation derives from time-time components of the EFEs as

$$\begin{aligned} \frac{\dot{a}^2}{a^2}=H(t)^2=\frac{8\pi G}{3}{\bar{\rho }} +\frac{\varLambda }{3}-\frac{k}{a^2}, \end{aligned}$$

where an overdot denotes the derivative with respect to the cosmic time t, and we isolated on the LHS the Hubble parameter defined as,

$$\begin{aligned} H(t)\equiv \frac{\dot{a}(t)}{a(t)}. \end{aligned}$$

This allows us to define a first cosmological parameter, the Hubble constant as \(H_0=H(t_0)\) where \(t_0\) is the present time. It is common to use instead the normalized parameter \(h \equiv H_0/\)(100 km \(\mathrm {s}^{-1}\) Mpc\(^{-1}\)). As usual, in the spatially flat case, the scale factor can be normalized such that its present value \(a_0=a(t_0)\equiv 1\). We recall that in spatially curved space, one cannot normalize simultaneously the spatial curvature and the scale factor. The cosmological redshift is related to the scale factor by \(1 + z = a_0/a\).

The second Friedmann equation derives from the combination of the space-space component and the time-time component of the EFEs, and can be written as an acceleration/deceleration equation as follows

$$\begin{aligned} \frac{\ddot{a}}{a}\, = -\frac{4\pi G}{3}\left( {\bar{\rho }}\, +\, 3\bar{p} \right) \, +\, \frac{\varLambda }{3}. \end{aligned}$$

It is sometimes more convenient to replace the radial coordinate, r, by the comoving coordinate \(\chi \) using \(d\chi \equiv {dr}/{\sqrt{1-kr^2}}\) so that the line element reads

$$\begin{aligned} ds^2=-dt^2+a^2(t)\left( d\chi ^2+f_{K}^2(\chi ) \left( d\theta ^2+\sin ^2\theta d\phi ^2\right) \right) , \end{aligned}$$


where

$$\begin{aligned} f_K(\chi )=\left\{ \begin{array}{ll} \sin (\chi ) &{}\quad k=+1 \\ \chi &{}\quad k=0 \\ \sinh (\chi ) &{}\quad k=-1 \end{array}.\right. \end{aligned}$$
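For numerical work, the piecewise function \(f_K(\chi )\) can be coded directly. The sketch below is only illustrative; the function name `f_K` and the error handling are choices made for this example.

```python
import math

def f_K(chi, k):
    """Curvature-dependent function f_K(chi) for k in {-1, 0, +1}.

    chi is the dimensionless comoving radial coordinate of Eq. (9).
    """
    if k == 1:
        return math.sin(chi)    # positively curved (closed) spatial sections
    if k == 0:
        return chi              # spatially flat sections
    if k == -1:
        return math.sinh(chi)   # negatively curved (open) spatial sections
    raise ValueError("k must be -1, 0, or +1")

# For small chi the three geometries agree to leading order in chi.
for k in (-1, 0, 1):
    print(k, f_K(0.01, k))
```

The three branches differ only at order \(\chi ^3\), so spatial curvature matters only over distances comparable to the curvature radius.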

Finally, it is also sometimes convenient to change the coordinate (cosmic) time to the conformal time defined as \(d\tau \equiv {dt}/{a(t)}\) so the line element now reads

$$\begin{aligned} ds^2=a^2 (\tau ) \left[ -d\tau ^2 + d\chi ^2+f_{K}^2(\chi ) \left( d\theta ^2+\sin ^2\theta d\phi ^2\right) \right] . \end{aligned}$$

The Friedmann equations and the FLRW metric provide a description of the homogeneous universe and its dynamics serving as a basis to study the propagation of light, the expansion history, distance measures, and the energy budget of the universe.

Again, with regard to modifications to GR, the Friedmann equations above, i.e., (6) and (8), are modified and so are all the observables and distance measurements described below that build on these equations. For example, in relation to cosmic acceleration, the cosmological constant term can be replaced by extra terms that come from the modification and could play a similar role. However, as we already mentioned in the introduction, some of these models are able to fit the expansion and background observations well, so any further distinction will have to come from the growth of structure constraints and observables.

3.1.2 Cosmic mass-energy budget, dark energy and cosmic acceleration

In general relativity, conservation laws are given by the vanishing of the covariant derivative of the energy momentum tensor, i.e., \(T^{\mu \nu }_{\;\;\;\;\; ;\nu }=0\). This provides the continuity equation

$$\begin{aligned} \dot{{\bar{\rho }}}+3 \frac{\dot{a}}{a}({\bar{\rho }}+\bar{p})=\dot{{\bar{\rho }}}+3 \frac{\dot{a}}{a}{\bar{\rho }}(1+w)=0, \end{aligned}$$

where in the last step we used the equation of state variable, w, defined as

$$\begin{aligned} \bar{p}=w{\bar{\rho }}. \end{aligned}$$

It follows from the continuity Eq. (12), that for a matter (baryon and dark matter) dominated epoch (i.e., \(w=0\)) \({\bar{\rho }}_m \propto a^{-3}\), for a radiation dominated epoch (i.e., \(w=1/3\)) \({\bar{\rho }}_r \propto a^{-4}\), and for a cosmological constant (i.e., \(w=-1\)) \(\rho _{\varLambda }\) is a constant, while for a dynamical dark energy with \(w_{de}\),

$$\begin{aligned} {\bar{\rho }}_{de} = {\bar{\rho }}_{de}^{0} a^{-3(1+w_{de})}. \end{aligned}$$

In models of dynamical dark energy, \(w_{de}\) is another cosmological parameter that is allowed to be different from \(-1\) in cosmological analyses. It can also be allowed to vary in redshift (or scale factor) in which case it can, for example, take the form \(w(a)=w_0+w_a(1-a)\) known as the CPL parameterization (Chevallier and Polarski 2001; Linder 2003). Other parameterizations for w have been introduced in order to fit other dark energy or modified gravity models. Alternatively, the equation of state can also be binned in redshift.
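The scalings above can be checked numerically. The sketch below evaluates Eq. (14) for constant \(w\) and, for the CPL form, uses the closed-form result \({\bar{\rho }}_{de}/{\bar{\rho }}_{de}^{0}=a^{-3(1+w_0+w_a)}e^{-3w_a(1-a)}\), which follows from integrating the continuity equation (12) with \(w(a)=w_0+w_a(1-a)\). This closed form is not written out in the text above, so the code also compares it with a direct numerical integration. All function names and the sample parameter values are illustrative assumptions.

```python
import math

def w_cpl(a, w0, wa):
    """CPL equation of state w(a) = w0 + wa (1 - a)."""
    return w0 + wa * (1.0 - a)

def density_ratio_constant_w(a, w):
    """rho(a) / rho(a=1) for constant w, Eq. (14): a^{-3(1+w)}."""
    return a ** (-3.0 * (1.0 + w))

def density_ratio_cpl(a, w0, wa):
    """rho_de(a) / rho_de(a=1) for CPL, from integrating the continuity equation."""
    return a ** (-3.0 * (1.0 + w0 + wa)) * math.exp(-3.0 * wa * (1.0 - a))

def density_ratio_cpl_numeric(a, w0, wa, steps=2000):
    """Same ratio from a midpoint-rule integral of d ln(rho)/d ln(a) = -3 (1 + w)."""
    step = math.log(a) / steps
    ln_ratio = 0.0
    for i in range(steps):
        ln_a_mid = (i + 0.5) * step
        ln_ratio += -3.0 * (1.0 + w_cpl(math.exp(ln_a_mid), w0, wa)) * step
    return math.exp(ln_ratio)

a = 0.5  # scale factor at redshift z = 1
print("matter    :", density_ratio_constant_w(a, 0.0))
print("radiation :", density_ratio_constant_w(a, 1.0 / 3.0))
print("Lambda    :", density_ratio_constant_w(a, -1.0))
print("CPL closed:", density_ratio_cpl(a, -0.9, 0.2))
print("CPL numer.:", density_ratio_cpl_numeric(a, -0.9, 0.2))
```

Setting \(w_a=0\) in the CPL expression recovers Eq. (14), and \(w_0=-1\), \(w_a=0\) gives a constant density as for a cosmological constant.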

It is trivial to observe from the second Friedmann equation (8) that a cosmic effective dark energy fluid with an equation of state \(w_{de}=p_{de}/\rho _{de}<-1/3\) gives an accelerated expansion. This is the case for a cosmological constant. The field equations of GR have no difficulty in mathematically producing an accelerated expansion, but the real challenge is to figure out what is the physical nature of such an effective dark energy fluid.
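To make this step explicit, consider a single effective fluid with \(\bar{p}=w{\bar{\rho }}\) and \({\bar{\rho }}>0\), with any cosmological constant absorbed into the fluid (a simplification adopted here for illustration). Equation (8) then gives

$$\begin{aligned} \frac{\ddot{a}}{a}=-\frac{4\pi G}{3}{\bar{\rho }}\,(1+3w)>0 \quad \Longleftrightarrow \quad w<-\frac{1}{3}. \end{aligned}$$

For a cosmological constant, \(\rho _{\varLambda }=\varLambda /(8\pi G)\) and \(w=-1\), so the right-hand side becomes \(+\varLambda /3\), which is the last term of Eq. (8).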

So far, most analyses are consistent with the value of \(w=-1\) of a cosmological constant with shrinking error bars around it; see for example the DES Year-1 cosmological parameter paper (Abbott et al. 2018b), where combining most available data sets gave \(w_{de}= -1.00{\,}^{+0.04}_{-0.05}\). The latest data from Planck, alone and combined with other data sets, were nonetheless found to slightly favor \(w_{de}\) values smaller than \(-1\) (Ade et al. 2016b). Current data do not yet significantly constrain the \(w_0\) and \(w_a\) parameters for a time-varying equation of state of dark energy.

In order to describe the energy budget in the universe as measured from observations, we first need to describe the critical density of the universe evaluated today, noted as \(\rho _{crit}^0\). This will serve as a reference density and is determined from the first Friedmann equation (6) in a spatially flat universe with no cosmological constant. That is:

$$\begin{aligned} \rho _{crit}^0= & {} \frac{3H^2_0}{8\pi G} \nonumber \\= & {} 1.9\times 10^{-29}h^2 {\mathrm{grams}} \,\,{\mathrm{cm}}^{-3} \nonumber \\= & {} 2.8\times 10^{11}h^2 {\mathrm{M}}_{\odot } {\mathrm{Mpc}}^{-3}. \end{aligned}$$

The last line is given in solar masses, \(\mathrm{M}_{\odot }\), per megaparsec cubed. We can now use this reference density to express the density parameters today for different species as the ratio

$$\begin{aligned} \varOmega _i^0=\frac{{\bar{\rho }}_i^0}{\rho _{crit}^0}. \end{aligned}$$

This defines further cosmological parameters, with their values today as estimated, for example, from Planck and other data sets (Ade et al. 2016a): \(\varOmega _b^0\approx 0.05\) for baryonic matter, \(\varOmega _c^0 \approx 0.26\) for cold dark matter, \(\varOmega _{\varLambda }^0 \approx 0.69\) for a cosmological constant, and a tiny curvature “density” parameter \(\varOmega _k^0\equiv -k/H_0^2\) with \(|\varOmega _k^0| < 0.01\). These numbers characterize the standard spatially flat Lambda-Cold-Dark-Matter (\(\varLambda \)CDM) concordance model.
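The numerical values in Eq. (15) can be reproduced with a few lines of code. The sketch below restores SI units, so that the result is a mass density. The rounded values of the physical constants and the function names are assumptions made for this example.

```python
import math

# Rounded physical constants (assumed values, adequate for a percent-level check).
G_SI = 6.674e-11         # Newton's constant in m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22     # one megaparsec in metres
MSUN_IN_KG = 1.989e30    # one solar mass in kilograms

def critical_density_si(h):
    """Critical density today, 3 H0^2 / (8 pi G), in kg m^-3 for H0 = 100 h km/s/Mpc."""
    H0 = 100.0 * h * 1.0e3 / MPC_IN_M            # Hubble constant in s^-1
    return 3.0 * H0 ** 2 / (8.0 * math.pi * G_SI)

def critical_density_cgs(h):
    """Critical density in g cm^-3 (1 kg m^-3 = 1e-3 g cm^-3)."""
    return critical_density_si(h) * 1.0e-3

def critical_density_msun_mpc3(h):
    """Critical density in solar masses per cubic megaparsec."""
    return critical_density_si(h) * MPC_IN_M ** 3 / MSUN_IN_KG

def density_parameter(rho_si, h):
    """Density parameter Omega_i = rho_i / rho_crit of Eq. (16), with rho_i in kg m^-3."""
    return rho_si / critical_density_si(h)

print("rho_crit / h^2 [g cm^-3]    :", critical_density_cgs(1.0))
print("rho_crit / h^2 [Msun Mpc^-3]:", critical_density_msun_mpc3(1.0))
```

Since \(\rho _{crit}^0\propto H_0^2\), the results for any other value of \(h\) follow by multiplying the \(h=1\) values by \(h^2\).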

The Friedmann equation (6) can be re-written in terms of these density parameters and the scale factor as

$$\begin{aligned} H^2(a)=H^2_0 \left[ \varOmega _m^0 a^{-3}+ \varOmega _r^0 a^{-4}+ \varOmega _k^0 a^{-2}+\varOmega _{de}^0 a^{-3(1+w)} \right] , \end{aligned}$$

where we use \(\varOmega _m^0 \equiv \varOmega _b^0+\varOmega _c^0\) and recall that \(\varOmega _r^0\approx 10^{-4}\) is negligible at the present time. When evaluated today for a spatially flat universe with a cosmological constant, \(\varLambda \), Eq. (17) thus reduces to simply \(\varOmega _m^0+\varOmega _{\varLambda }^0=1\).

3.1.3 Cosmological distances

Another useful piece of background to cover is that of distances in cosmology. We start with the physical distance or proper distance (e.g., Weinberg 1972), defined for example by integrating the line element (9) at a given instant along a radial direction so that \(dt=d\theta =d\phi =0\):

$$\begin{aligned} d_{phys}(t)={a(t)} \int ^{\chi }_0 d\chi '={a(t)}\chi . \end{aligned}$$

This is the distance that would be instantaneously measured if we used a gigantic ruler from us to a remote object. In Weinberg (1972), this is equivalently defined from (9) as

$$\begin{aligned} d_{prop}(t)=\int ^{r}_0 \sqrt{g_{rr}}dr'={a(t)}\int ^{r}_0 \frac{dr'}{\sqrt{1-kr'^2}}={a(t)}\chi . \end{aligned}$$

This distance is time dependent so a radial comoving distance is often used as

$$\begin{aligned} \chi =\frac{d_{phys}}{{a(t)}}. \end{aligned}$$

In the spatially flat case, with the normalization of \(a\equiv 1\) today, the comoving distance is normalized to be equal to the proper distance today. Also, the normalized comoving distance to a galaxy with redshift z (or \(a=1/(1+z)\)) is thus given from Eq. (9) as

$$\begin{aligned} \chi =\int ^{t_{today}}_{t}\frac{dt'}{a(t')}= \int ^{1}_{a}\frac{da'}{a'^2 H(a')}. \end{aligned}$$

Now, astronomers define other distances that can be measured by different methods. First, the angular diameter distance is defined for an object that has a typical diameter size, \(\mathcal {D}\), and an angular observed size, \(\delta \theta \) as (Ellis 1973; Ellis and van Elst 1999)

$$\begin{aligned} d_A\equiv & {} \frac{\mathcal {D}}{\delta \theta }=\frac{\sqrt{g_{\theta \theta }}d\theta }{\delta \theta }\nonumber \\= & {} {a(t)}f_K(\chi ), \end{aligned}$$

where we have used the metric (9) and \(f_K(\chi )\) is given by (10). Furthermore, the comoving angular diameter distance is defined as

$$\begin{aligned} d_{AC}\equiv \frac{d_A}{{a(t)}}=f_K(\chi ), \end{aligned}$$

so in a spatially flat cosmology, \(\chi \) is also referred to as the comoving angular diameter distance.

Finally, for an object with luminosity, L, and flux, F, measured here at the observer [for example on a Charge-Coupled Device (CCD)], the luminosity distance, \(d_L\), is defined from the relation

$$\begin{aligned} F \equiv \frac{L}{4\pi d_L^2}. \end{aligned}$$

From photon conservation, the flux measured at the observer can be written in terms of the metric functions of (9) and the source redshift as (Ellis and van Elst 1999)

$$\begin{aligned} F=\frac{L}{4\pi (1+z)^2 r_G^2}, \end{aligned}$$

where \(r_G\equiv a(t_0) f_K(\chi )\) is called the galaxy area distance. The factor \((1+z)^2\) arises from two effects. The first is that photons are redshifted by a factor \((1+z)\), and the second is that there is a time dilation due to cosmic expansion providing a second factor \((1+z)\).

Now, comparing Eqs. (24) and (25), and using \(r_G\), the luminosity distance is given by

$$\begin{aligned} d_L(z)=f_K(\chi )(1+z). \end{aligned}$$

\(d_L\) is thus related to the angular diameter distance, \(d_A\), by

$$\begin{aligned} d_L=d_A(1+z)^2. \end{aligned}$$

This is Etherington’s reciprocity theorem (or distance-duality relation), which is true when the number of photons traveling on null geodesics is conserved.
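The distance definitions above can be sketched numerically for the spatially flat case, where \(f_K(\chi )=\chi \). The following is a minimal illustration, assuming a flat universe with negligible radiation, distances in units of \(c/H_0\), and illustrative default parameters (\(\varOmega _m^0=0.31\), \(w=-1\)); the integral over \(a'\) is rewritten as an integral over redshift using \(da=-a^2dz\).

```python
import math


def hubble_ratio(z, omega_m=0.31, w=-1.0):
    """E(z) = H(z)/H0 for a spatially flat universe, radiation neglected."""
    omega_de = 1.0 - omega_m
    return math.sqrt(omega_m * (1.0 + z) ** 3
                     + omega_de * (1.0 + z) ** (3.0 * (1.0 + w)))


def comoving_distance(z, omega_m=0.31, w=-1.0, steps=1000):
    """chi(z) in units of c/H0: Simpson's rule on the integral of dz'/E(z')."""
    if z <= 0.0:
        return 0.0
    h = z / steps  # steps must be even for Simpson's rule
    total = 1.0 / hubble_ratio(0.0, omega_m, w) + 1.0 / hubble_ratio(z, omega_m, w)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_ratio(i * h, omega_m, w)
    return total * h / 3.0


def angular_diameter_distance(z, **cosmo):
    """d_A = a * chi, valid in the flat case where f_K(chi) = chi."""
    return comoving_distance(z, **cosmo) / (1.0 + z)


def luminosity_distance(z, **cosmo):
    """d_L = (1 + z) * chi in the flat case, with a = 1 today."""
    return comoving_distance(z, **cosmo) * (1.0 + z)


z = 1.0
d_a = angular_diameter_distance(z)
d_l = luminosity_distance(z)
print(f"chi = {comoving_distance(z):.4f}, d_A = {d_a:.4f}, d_L = {d_l:.4f} (units of c/H0)")
print("d_L / [d_A (1+z)^2] =", d_l / (d_a * (1.0 + z) ** 2))
```

The last printed ratio is unity by construction in this sketch, so it only illustrates the reciprocity relation; it is not an independent test of it.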

3.2 The inhomogeneous lumpy universe and the growth of large-scale structure

3.2.1 Large-scale structure and cosmological perturbations

The universe we observe at large scales is rather full of clusters and superclusters of galaxies. Such a picture is mathematically realized by applying linear perturbations to Einstein’s equations in an FLRW background. Sufficiently large scales are considered so linear perturbations are a valid description.

This is done by adding a small perturbation tensor to the metric tensor and then computing the Einstein tensor to first order. At the same time, the energy-momentum tensor is also linearly perturbed. The Einstein equations then give the usual background Friedmann equations (184) plus additional equations governing the evolution of the perturbations (see, e.g., Carroll 2003; Peter 2013 for pedagogical introductions and also some of the seminal references Bardeen 1980; Kodama and Sasaki 1984). An insightful approach to these linear perturbations is to decompose the components of the symmetric metric tensor perturbations according to how they transform under spatial rotations. The 00-component of the metric perturbation tensor is a scalar, the three 0i-components (or equally the three i0-components) constitute a vector, and the remaining nine ij components form a symmetric spatial tensor of rank two. This is known as the SVT decomposition of linear perturbations. The three parts transform only into components of the same type under spatial rotations. In GR, the scalar modes are, for example, associated with matter density fluctuations and used for large scale structure studies, tensor modes are associated with gravitational radiation used, for example, for primordial gravitational waves, while vector modes decay in an expanding universe and are usually ignored. Last, in addition to this decomposition, one needs to specify a gauge choice where the components of the perturbations can be different in the corresponding coordinate system, see e.g., Carroll (2003) and Peter (2013) for pedagogical discussions. Modification to gravity can be implemented at the level of scalar mode perturbations as we discuss further below or at the level of tensor modes as in, e.g., Saltas et al. (2014), Pettorino and Amendola (2015), Dubovsky et al. (2010), Raveri et al. (2015), Amendola et al. (2014) and Lin and Ishak (2016).

In this review, we will focus on scalar perturbations. In the conformal Newtonian gauge, for example, the perturbed spatially flat FLRW metric reads

$$\begin{aligned} ds^2=a(\tau )^2[-(1+2\varPsi )d\tau ^2+(1-2{\varPhi })dx^idx_i], \end{aligned}$$

where \(x_i\)’s are the comoving coordinates, and \(\tau \) the conformal time defined further above. \(\varPhi \) and \(\varPsi \) are the gravitational scalar potentials describing the scalar mode of the metric perturbations.

We consider subhorizon scales with \(k \gg aH\). In many analyses and papers on testing gravity at cosmological scales, the perturbed equations are often specialized to the quasi-static limit or approximation. This means that the time evolution of the gravitational potentials is assumed to be small compared to the Hubble time so one can assume the derivatives of the potentials to be zero for sub-Hubble-horizon scales. For scalar–tensor theories, this approximation also means that one neglects the time derivatives of the fluctuations in the scalar field at scales below the scalar perturbation sound horizon. More on this approximation or its limits can be found in, e.g., Noller et al. (2014), Sawicki and Bellini (2015) and Pogosian and Silvestri (2016).

The first-order perturbed Einstein equations in Fourier space give two equations that describe the evolution of the two scalar gravitational potentials (e.g., Ma and Bertschinger 1995). The combination of the time-time and time-space perturbed equations provides a Poisson equation for the potential \(\varPhi \). The second equation includes the two potentials and comes from the traceless space-space components. The two equations read (in the quasi-static approximation for the potentials)

$$\begin{aligned} k^2{\varPhi }= & {} -4\pi G a^2\sum _i {\bar{\rho }}_i \varDelta _i \end{aligned}$$
$$\begin{aligned} k^2(\varPsi -\varPhi )= & {} -12 \pi G a^2\sum _i {\bar{\rho }}_i(1+w_i)\sigma _i, \end{aligned}$$

where \({\bar{\rho }}_i\) and \(\sigma _i\) are the density and the shear stress, respectively, for matter species denoted by the index i. \(\varDelta _i\) is the gauge-invariant, rest-frame overdensity for matter species, i. Its evolution describes the growth of inhomogeneities. It is defined by

$$\begin{aligned} \varDelta _i = \delta _i +3\mathcal {H}\frac{q_i}{k}, \end{aligned}$$

where \(\mathcal {H} ={a}'/a\) is the Hubble factor in conformal time (where \('\) is for differentiation with respect to conformal time), and for species i,

$$\begin{aligned} \delta _i=\frac{\rho _i-{\bar{\rho }}_i}{{\bar{\rho }}_i} \end{aligned}$$

is the fractional overdensity; \({{\bar{\rho }}_i}\) is the background average density; \(q_i\) is the heat flux related to the divergence of the peculiar velocity, \(\theta _i\), by

$$\begin{aligned} \theta _i=\frac{k\ q_i}{1+w_i}. \end{aligned}$$

From conservation of the energy-momentum in the perturbed matter fluid, these quantities for uncoupled fluid species or the mass-averaged quantities for all the fluids evolve as (e.g., Ma and Bertschinger 1995):

$$\begin{aligned} {\delta }'= & {} -k q +3(1+w){\varPhi }'+3\mathcal {H}\left( w-\frac{\delta P}{\delta \rho }\right) \delta \end{aligned}$$
$$\begin{aligned} \frac{{q}'}{k}= & {} -\mathcal {H}(1-3w)\frac{q}{k}+\frac{\delta P}{\delta \rho }\delta +(1+w)\left( \varPsi -\sigma \right) . \end{aligned}$$

Combining these two equations, one obtains the evolution equation of the gauge-invariant overdensity, \(\varDelta \equiv \delta +3\mathcal {H}q/k\), as

$$\begin{aligned} {\varDelta }' = 3(1+w)\left( {\varPhi }'+\mathcal {H}\varPsi \right) +3\mathcal {H}w\varDelta -\left[ k^2+3\left( \mathcal {H}^2-{\mathcal {H}}'\right) \right] \frac{q}{k}-3\mathcal {H}(1+w)\sigma . \end{aligned}$$

Equations (29), (30), (34), and (35) above are coupled to one another; their combinations, along with the evolution equations for the scale factor \(a(\tau )\), can provide a full description of the growth history of structures in the universe.
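The combination of Eqs. (34) and (35) into the evolution equation for the rest-frame overdensity (the fractional overdensity plus \(3\mathcal {H}q/k\)) can be spot-checked numerically. The sketch below is only a consistency check under stated assumptions: all input values are arbitrary random numbers, \(w\) is taken as constant in time, and the variable names are ours.

```python
import random

random.seed(1)

# Arbitrary test values: wavenumber, conformal Hubble factor and its derivative.
k, hc, hc_prime = (random.uniform(0.5, 2.0) for _ in range(3))
# Equation of state (assumed constant) and the ratio delta P / delta rho.
w, cs2 = random.uniform(0.0, 0.3), random.uniform(0.0, 0.3)
# Perturbation variables: delta, q, sigma, Psi and the time derivative of Phi.
delta, q, sigma, psi, phi_prime = (random.uniform(-1.0, 1.0) for _ in range(5))

# Eq. (34), and Eq. (35) multiplied through by k.
delta_prime = -k * q + 3 * (1 + w) * phi_prime + 3 * hc * (w - cs2) * delta
q_prime = k * (-hc * (1 - 3 * w) * q / k + cs2 * delta + (1 + w) * (psi - sigma))

# Rest-frame overdensity and its derivative obtained with the product rule.
big_delta = delta + 3 * hc * q / k
lhs = delta_prime + 3 * hc_prime * q / k + 3 * hc * q_prime / k

# Right-hand side of the combined evolution equation.
rhs = (3 * (1 + w) * (phi_prime + hc * psi) + 3 * hc * w * big_delta
       - (k ** 2 + 3 * (hc ** 2 - hc_prime)) * q / k - 3 * hc * (1 + w) * sigma)

print("difference between the two sides:", lhs - rhs)
```

The two sides should agree to floating-point precision for any choice of inputs, and the \(\delta P/\delta \rho \) terms cancel in the combination.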

3.2.2 Growth factor and growth rate of large-scale structure

Now, specializing the above equations to the case of matter (baryons plus cold dark matter) at late time, we can set \(w=\delta P/\delta \rho =\sigma =0\). Also using the quasi-static approximation (i.e., \({\varPhi }'=0\)), Eq. (34) reduces to

$$\begin{aligned} \delta '_m=-kq=-\theta . \end{aligned}$$

Next, taking its derivative and using Eq. (35) as well as the two Poisson equations (29) and (30), we write

$$\begin{aligned} {\delta }_m''+\mathcal {H}{\delta }_m'-4 \pi G a^2 {\bar{\rho }} \delta _m=0. \end{aligned}$$

In cosmic time, this reads,

$$\begin{aligned} \ddot{\delta }_m+2{H}\dot{\delta }_m-4 \pi G {\bar{\rho }} \delta _m=0. \end{aligned}$$

This time evolution equation for \(\delta \) has a solution with decaying and growing modes. We are interested in the growing modes (denoted with a \(+\) subscript) that gave the structures that we observe today in the universe. One thus defines \(D_{+}(t)\) as the linear growth factor of perturbations relating the overdensity \(\delta (t)\) at some given time t to its value at some initial time \(t_i\). That is

$$\begin{aligned} \delta (t)=\frac{D_{+}(t)}{D_{+}(t_i)} \delta (t_i), \end{aligned}$$

where \({D_{+}(t_i)}\) and \(\delta (t_i)\) are constants set by initial conditions. The growth factor is often properly normalized as \(G(z)\equiv D(a)/a\).

A paramount quantity in probing the growth of large scale structure is the growth rate, defined as the derivative of the logarithm of the growth factor with respect to the logarithm of the scale factor, i.e.,

$$\begin{aligned} f(a)\equiv \frac{d \ln D}{d \ln a}. \end{aligned}$$

As we will discuss further, some observations, such as Redshift Space Distortions (RSD), are directly sensitive to this function (or its product with the amplitude of matter fluctuation, \(\sigma _8(a)\)). The growth differential equation (39) above can be rewritten in terms of the growth rate (41) where the effect of modification to gravity can be encapsulated in an effective gravitational constant \(G_{\mathrm{eff}}\) or a modified gravity parameter \(\mu (k,a)\) (see Sect. 5.2 further) and thus re-written as:

$$\begin{aligned} \frac{df}{d \ln a}+f^2+\left( \frac{\dot{H}}{H^2}+2\right) f=\frac{3}{2}\frac{G_{\mathrm{eff}}^{\psi }}{G}\varOmega _m \equiv \frac{3}{2}\,\mu \, \varOmega _m \end{aligned}$$

(for GR, \(G_{\mathrm{eff}}=G\) and \(\mu =1\) , recovering the standard expression).
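As an illustration, the growth-rate equation above can be integrated with a standard fourth-order Runge-Kutta step in \(\ln a\). This is a minimal sketch, assuming a flat \(\varLambda \)CDM background without radiation (so that \(\dot{H}/H^2=-\frac{3}{2}\varOmega _m(a)\)), a matter-era initial condition \(f=1\), and a user-supplied, scale-independent function `mu(a)`; the names and default values are illustrative only.

```python
import math


def omega_m_of_a(a, omega_m0):
    """Matter density parameter at scale factor a in flat LCDM (no radiation)."""
    return omega_m0 / (omega_m0 + (1.0 - omega_m0) * a ** 3)


def growth_rate(a_final=1.0, omega_m0=0.3156, mu=lambda a: 1.0,
                a_start=1.0e-3, steps=2000):
    """Integrate df/dln a = -f^2 - (Hdot/H^2 + 2) f + 1.5 mu Omega_m(a) with RK4."""
    def rhs(x, f):
        a = math.exp(x)
        om = omega_m_of_a(a, omega_m0)
        hdot_over_h2 = -1.5 * om  # flat LCDM background
        return -f * f - (hdot_over_h2 + 2.0) * f + 1.5 * mu(a) * om

    x = math.log(a_start)
    dx = (math.log(a_final) - x) / steps
    f = 1.0  # growing mode deep in matter domination (D proportional to a)
    for _ in range(steps):
        k1 = rhs(x, f)
        k2 = rhs(x + 0.5 * dx, f + 0.5 * dx * k1)
        k3 = rhs(x + 0.5 * dx, f + 0.5 * dx * k2)
        k4 = rhs(x + dx, f + dx * k3)
        f += dx * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += dx
    return f


f_gr = growth_rate()
f_mg = growth_rate(mu=lambda a: 1.2)  # hypothetical constant enhancement of gravity
print(f"f(z=0): GR = {f_gr:.3f}, mu = 1.2 -> {f_mg:.3f}")
```

With \(\mu =1\) the result today should lie close to the familiar approximation \(\varOmega _m^{0.55}\), while \(\mu >1\) yields a larger growth rate.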

Fig. 1 Growth rate of matter density fluctuations, f(z). Theory prediction curves are shown for: the \(\varLambda \)CDM model; the Dvali–Gabadadze–Porrati braneworld model (the self-accelerating branch, see Sect. 7.5.2) (Dvali et al. 2000); and the f(R) (see Sect. 7.4.1) modified gravity model (Hu and Sawicki 2007a) [model with \(c=3\) from Linder (2009)]. Note that the growth in f(R) models is scale-dependent so the authors show predictions at two wavenumbers, \(k=0.02\,\hbox {h}\,{{\mathrm{Mpc}}^{-1}}\) and \(k=0.1\,\hbox {h}\,{{\mathrm{Mpc}}^{-1}}\). Also shown are the error bars projected from a future galaxy spectroscopic redshift survey designed with DESI survey specifications (Aghamousa et al. 2016). Image reproduced with permission from Huterer et al. (2015), copyright by Elsevier.

For illustration, we reproduce Fig. 2 from Huterer et al. (2015) (Fig. 1 here) where it is shown how the function f(z) can be a discriminator for various gravity theories.

3.2.3 Correlation function and matter power spectrum

The galaxy correlation function is a measure of the degree of clustering in a spatial or angular distribution of galaxies. If \(\delta _g(\mathbf {r})\) represents the galaxy overdensity with respect to an expected mean density then the correlation function is given by the 2-point function

$$\begin{aligned} \xi (\mathbf {r_1},\mathbf {r_2}) \equiv \langle \delta _g(\mathbf {r_1})\delta _g(\mathbf {r_2}) \rangle , \end{aligned}$$

where \(\langle \dots \rangle \,\) denotes the ensemble average. The galaxy correlation function can be further understood as follows (Baugh 2000): Let’s consider two volume elements, \(dV_1\) and \(dV_2\) separated in space by \(r_{12}\). The 2-point correlation can be defined as the excess probability, in comparison with a random distribution, of finding a galaxy in \(dV_1\) and another in \(dV_2\). That is:

$$\begin{aligned} d P = \bar{n}^2\left[ 1+\xi (r_{12})\right] dV_1 dV_2, \end{aligned}$$

where \(\bar{n}\) is the mean galaxy number density. Due to the assumption of isotropy and homogeneity, the vector notation is dropped and only the distance \(r_{12}\) has been kept.

A closely related quantity is the galaxy power spectrum which is defined as the Fourier transform of the correlation function as

$$\begin{aligned} P_g(k)= & {} \int \xi (r)e^{i\mathbf{k\cdot r}}d^3r,\end{aligned}$$
$$\begin{aligned} \xi (r)= & {} \int P_g(k)e^{-i\mathbf{k\cdot r}} \frac{d^3k}{(2\pi )^3}. \end{aligned}$$

Note that we have again dropped the vector notation in the argument of \(P_g(k)\) and \(\xi (r)\) due to the statistical isotropy and homogeneity. In other words, they are only functions of the magnitudes of \(\mathbf{k}\) and \(\mathbf{r}\). In this case, it is assumed that one of the two galaxies is at the origin and the other one is at a distance r. It is worth noting that for a Gaussian random field, the power spectrum contains all the statistical information of the field which explains its wide use in cosmological studies.
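For an isotropic spectrum, the angular part of the Fourier integral can be done analytically, leaving \(\xi (r)=\frac{1}{2\pi ^2}\int k^2 P(k)\,\frac{\sin kr}{kr}\,dk\). The sketch below evaluates this one-dimensional integral with Simpson's rule. The Gaussian spectrum \(P(k)=e^{-k^2s^2}\) is a toy choice of ours, not a cosmological model; it is used only because its transform is known in closed form.

```python
import math


def xi_from_power(r, power, k_max, steps=4000):
    """xi(r) = (1 / 2 pi^2) * integral of k^2 P(k) sin(kr)/(kr) dk (Simpson's rule)."""
    def integrand(k):
        kr = k * r
        sinc = 1.0 if kr == 0.0 else math.sin(kr) / kr
        return k * k * power(k) * sinc

    h = k_max / steps  # steps must be even
    total = integrand(0.0) + integrand(k_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / (2.0 * math.pi ** 2)


s = 1.0  # width of the toy spectrum (hypothetical)


def toy_power(k):
    """Toy Gaussian spectrum exp(-k^2 s^2), chosen for its analytic transform."""
    return math.exp(-(k * s) ** 2)


r = 2.0
numeric = xi_from_power(r, toy_power, k_max=10.0 / s)
exact = math.exp(-r * r / (4.0 * s * s)) / (8.0 * math.pi ** 1.5 * s ** 3)
print(numeric, exact)
```

For a realistic spectrum the upper limit `k_max` and the number of steps would need to be chosen with care, since the integrand oscillates and decays much more slowly than in this toy case.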

The correlation function can be measured from a galaxy survey using estimators that take into account observational subtleties (Landy and Szalay 1993). Its theoretical counterpart is calculated using the model-predicted matter power spectrum that we discuss next. Here we use the term matter because we refer to the dark matter field and its fluctuation, \(\delta (\mathbf {k},z)\), which is traced by the galaxy fluctuation modulo some bias factor. The matter power spectrum, P(k, z), is defined by

$$\begin{aligned} \langle \delta (\mathbf{k},z)\delta (\mathbf{k'},z)\rangle = (2\pi )^3P(k,z)\ \delta _D^3(\mathbf{k}-\mathbf{k}'), \end{aligned}$$

where \(\delta _D^3\) is the Dirac delta function. P(k, z) is determined on theoretical grounds as we discuss next.

The standard picture of structure formation in the universe is that structures have grown by gravitational infall and clustering from primordial small fluctuations in the matter density field. These seed fluctuations would have originated from microscopic quantum fluctuations that have been blown up to macroscopic scales by cosmic inflation (Guth 1981; Bardeen et al. 1983; Albrecht and Steinhardt 1982). These primordial fluctuations would be scale invariant and described by the power spectrum (Harrison 1970; Peebles and Yu 1970; Zeldovich 1972)

$$\begin{aligned} P(k) \propto k^{n_s}, \end{aligned}$$

with \(n_s\approx 1\). This is consistent with current observations finding that \(n_s=0.9652 \pm 0.0062\), see e.g., Ade et al. (2016a) and Spergel et al. (2003).

The matter power spectrum today has evolved from this primordial spectrum while subject to a number of physical processes. During the radiation-dominated epoch, perturbations outside the horizon grow as the square of the expansion scale factor while those inside the horizon do not grow. This is due to the radiation pressure in the primordial plasma acting against gravity and preventing gravitational infall. Furthermore, as the universe expands, modes entering the horizon are also frozen. This happens until the time of matter-radiation dominance equality where modes inside the horizon can then grow. Accordingly, the scale of the horizon at this matter-radiation equality is marked in the distribution of density fluctuations and appears as a turn-over in the shape of the matter power spectrum, see e.g., Peacock (1999) and Dodelson (2003). This and other processes about mode behaviors are formulated in the so-called transfer function, T(k), (Bardeen et al. 1986; Sugiyama 1995; Eisenstein and Hu 1998). The primordial power spectrum is also enhanced by the growth factor of structure, G(z) as described in Sect. 3.2.2. In sum, the matter power spectrum today can be written as a product of the components discussed above plus a primordial amplitude determined by observations:

$$\begin{aligned} P(k,z)=A_s\,k^{n_s}\,T^2(k)\,G^2(z). \end{aligned}$$
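The turn-over described above can be made visible with a simple fitting formula for the transfer function. The sketch below assumes the BBKS fit of Bardeen et al. (1986) in its baryon-free form with shape parameter \(\varOmega _m h\), an arbitrary normalization \(A_s=1\), and illustrative parameter values; it is not a replacement for a Boltzmann code.

```python
import math


def bbks_transfer(k, omega_m=0.3156, h=0.673):
    """BBKS fitting formula for T(k), baryons neglected; k in h/Mpc."""
    q = k / (omega_m * h)
    if q == 0.0:
        return 1.0
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return math.log(1.0 + 2.34 * q) / (2.34 * q) * poly ** -0.25


def linear_power(k, growth=1.0, a_s=1.0, n_s=0.9652):
    """P(k, z) = A_s k^n_s T^2(k) G^2(z), in arbitrary units."""
    return a_s * k ** n_s * bbks_transfer(k) ** 2 * growth ** 2


# Logarithmic grid from 1e-4 to 1 h/Mpc; locate the maximum of the spectrum.
ks = [10.0 ** (-4.0 + 0.01 * i) for i in range(401)]
k_peak = max(ks, key=linear_power)
print(f"turn-over near k = {k_peak:.3f} h/Mpc")
```

On scales much larger than the turn-over the spectrum follows the primordial slope \(k^{n_s}\), while on smaller scales the transfer function suppresses it.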

In a last step, we need to connect the galaxy and matter power spectra. For that, we recall that galaxies trace the distribution of dark matter in the universe so the galaxy overdensity also traces the matter overdensity. However, this tracing is subject to some subtle galaxy bias that can be non-local and nonlinear encoding various processes and physics of structure formation, see for example discussion in Percival (2013) and references therein. On large scales, it is often assumed that one has a linear bias defined via \(\delta _g(z,k)= b (z,k) \,\delta _m(z,k)\). Additionally, as we discuss in some detail in Sect. 4.3, peculiar motion of galaxies adds distortions that can be accounted for via the factor \(f(z)\mu ^2\) where \(\mu \) is the cosine of the angle to the line of sight. Consequently, the galaxy power spectrum can be written as

$$\begin{aligned} P^s_{gg}(k,\mu ,z)=A_s\,k^{n_s}\,T^2(k)\,G^2(z)\left[ b(z,k)+f(z)\mu ^2\right] ^2. \end{aligned}$$
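To see how the growth rate enters the amplitude of the observed spectrum, one can average the angular factor \(\left[ b+f\mu ^2\right] ^2\) over \(\mu \). The following minimal sketch compares a numerical average with the closed form \(b^2+\frac{2}{3}bf+\frac{1}{5}f^2\); the bias and growth-rate values are hypothetical.

```python
def kaiser_factor(mu, bias, f):
    """Angular factor [b + f mu^2]^2 multiplying the matter power spectrum."""
    return (bias + f * mu * mu) ** 2


def angle_average_numeric(bias, f, steps=1000):
    """Average of the factor over mu in [0, 1] using Simpson's rule."""
    h = 1.0 / steps  # steps must be even
    total = kaiser_factor(0.0, bias, f) + kaiser_factor(1.0, bias, f)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * kaiser_factor(i * h, bias, f)
    return total * h / 3.0


def angle_average_analytic(bias, f):
    """Closed form of the same average: b^2 + 2 b f / 3 + f^2 / 5."""
    return bias ** 2 + 2.0 * bias * f / 3.0 + f ** 2 / 5.0


bias, f = 2.0, 0.76  # hypothetical values for illustration
print(angle_average_numeric(bias, f), angle_average_analytic(bias, f))
```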

Finally, the linear matter power spectrum above under-predicts power on small scales, and must be replaced by the nonlinear matter power spectrum \(P_{nl}\) to include nonlinear effects on small scales using simulations or fitting formulas for specific classes of models, e.g., Peacock and Dodds (1996) and Smith et al. (2003) for \(\varLambda \)CDM and Zhao (2014), Hojjati et al. (2011) and Zhao et al. (2009) for f(R) MG models (see Sect. 7.4.1). The presence of screening mechanisms also complicates the picture for nonlinear modes in MG. There have been some recent interesting developments on simulation codes for MG models. Winther et al. (2015) (and references therein) present a comparative analysis of MG N-body codes. See also Valogiannis and Bean (2017) and Winther et al. (2017), where a Comoving Lagrangian Acceleration (COLA) approach was used. This last method uses fewer time-steps and resources and trades some accuracy at small scales for more efficiency. A parameterization for modified gravity on nonlinear cosmological scales was also proposed in Lombriser (2016).

Relevant to our review, deviations from general relativity can affect the transfer function T(k), the growth factor G(z), and the growth rate f(z). These can be reflected in the shape and amplitude of the galaxy power spectrum as a function of redshift and scale with some degeneracies. We discuss in the next section various observational probes, surveys and techniques that constrain and connect to the galaxy power spectrum.

4 Cosmological probes of gravity theory

A well-appreciated “break” that nature has given us in cosmology is that we have two categories of measurements and probes that we can use. One category of probes constrains the expansion history and geometry of the universe via, for example, distance measurements and expansion rate. The second category constrains the growth and history of structure formation and clustering over space and time in the universe. Not only can we combine them, we can also contrast them for consistency. Indeed, combining probes from the two categories allows one to break further degeneracies between cosmological parameters and to tighten significantly the constraints, while contrasting their constraints can reveal systematics in some data sets or the need for some extensions to the underlying model. It is worth noting that some probes, such as CMB and weak lensing, are sensitive to both the expansion and the growth; however, for probing modifications to GR, it is rather the growth constraints that are the most useful.

Modifications to gravity change the Friedmann equations and the functions derived from them for distance and expansion observables. We give in Sect. 7 examples for some MG models. However, as we show there as well, the modified terms in the Friedmann equations can be cast into effective dark energy density and pressure leading to an effective equation of state. A number of MG models can then have an expansion history that is indistinguishable from that of \(\varLambda \)CDM (or a quintessence model close to it), thus fitting cosmological distance and expansion observations as well as \(\varLambda \)CDM does. However, such models can still exhibit a growth of structure that is different from that of \(\varLambda \)CDM so growth data can then be used as a discriminator between the theories. For this reason, studies testing GR at cosmological scales have focused on deviations from GR (or MG models) that can mimic well the expansion history of \(\varLambda \)CDM but can still be distinguished from it using the growth rate of structure. For that, most studies assume a \(\varLambda \)CDM (or a quintessence wCDM) background model and then use the growth probes to constrain any deviation from GR. It has been argued though that one should implement and use both expansion and growth explicitly modified functions for consistency. Also, the background can be used to test GR based on spatial curvature consistency, see e.g. Zolnierowski and Blanchard (2015).

We briefly overview various probes of gravity below and refer the reader to corresponding review articles in each sub-section. We start with probes of cosmic geometry and expansion and then follow with various probes of the growth of large-scale structure in the universe.

4.1 Probes of cosmic geometry and expansion

Bearing in mind the strategy described above, probes of expansion and geometry have been very useful in constraining tightly background cosmological parameters such as the density parameters, the Hubble constant, the true or effective equation of state of dark energy, and then setting the stage for growth probes to constrain any deviation from GR at cosmological scales.

4.1.1 Standard candles: type Ia supernova

One of the first compelling pieces of evidence for cosmic acceleration came from Supernovae type Ia (SN Ia) observations (Riess et al. 1998; Perlmutter et al. 1999). After some corrections, SN Ia can be considered as good standard candles with an average absolute bolometric magnitude of \(M_{B}\approx -19.3\); see for example Phillips (1993). The ratio of their apparent brightness to their intrinsic one can provide a measure of their luminosity distance while their redshift can be measured independently from spectroscopy. The theoretical model’s function \(d_L(z)\) (or m(z)) is then fit to the data points after further corrections on the data, see for example Hamuy et al. (1996), Riess et al. (1998), Perlmutter et al. (1999) and references therein. These and other similar plots are known as the popular Hubble plots. SN Ia Hubble plots provide relative measurements of distances that can be calibrated using low redshift distance measurements such as Cepheid variable stars in the host galaxies, building a distance ladder. A more practical function to use for distance estimation in cosmological analyses is the distance modulus

$$\begin{aligned} \mu (z)=m(z)-M=5 \log D_L + 25, \end{aligned}$$

where M is an effective absolute magnitude degenerate with the Hubble constant, \(H_0\) and \(D_L\) is the luminosity distance in units of Mpc given, for example, for a spatially flat \(\varLambda \)CDM universe by

$$\begin{aligned} D_L(z)=\frac{(1+z)}{H_0}\int ^{z}_0\frac{dz'}{\sqrt{\varOmega _m^0 (1+z')^{3}+ \varOmega _{\varLambda }^0 }}. \end{aligned}$$
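The two relations above can be turned into a short numerical sketch of the theoretical Hubble diagram. It assumes a spatially flat \(\varLambda \)CDM model, restores the factor of \(c\) so that \(D_L\) comes out in Mpc, and uses illustrative default values \(H_0=70\) km s\(^{-1}\) Mpc\(^{-1}\) and \(\varOmega _m^0=0.3\); the function names are ours.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """D_L(z) in Mpc for flat LCDM, using Simpson's rule for the redshift integral."""
    omega_lambda = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_lambda)

    h = z / steps  # steps must be even
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return (1.0 + z) * C_KM_S / h0 * total * h / 3.0


def distance_modulus(z, h0=70.0, omega_m=0.3):
    """mu(z) = 5 log10(D_L / Mpc) + 25."""
    return 5.0 * math.log10(luminosity_distance_mpc(z, h0, omega_m)) + 25.0


for z in (0.01, 0.1, 0.5, 1.0):
    print(f"z = {z:4.2f}  D_L = {luminosity_distance_mpc(z):8.1f} Mpc  mu = {distance_modulus(z):6.3f}")
```

Changing `h0` shifts every modulus by the same constant, which is the degeneracy between the absolute magnitude and the Hubble constant noted in the text.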

\(D_L(z)\) for spatially curved universes follows straightforwardly from Eqs. (26), (10), (21) and (17). Supernova data combined with other distance probe data sets can put tight constraints on background cosmological parameters. For example, supernova constraints on present time density parameters \(\varOmega _m^0\) and \(\varOmega _{\varLambda }^0\) have a degeneracy direction that is orthogonal to that from CMB constraints so when combined together they provide tight constraints on these parameters, see e.g., Spergel et al. (2003). We list here a number of projects and popular compilations of supernova data that we will refer to in this review including: Supernova Legacy Survey (SNLS) compilation (Conley et al. 2011); Union2.1 compilation (Suzuki et al. 2012); Joint Light Curve Analysis (JLA) constructed from SNLS, SDSS and several low-redshift SN samples, e.g., Betoule et al. (2014); Pan-STARRS sample, e.g., Rest et al. (2014); and most recently the Pantheon Sample compiled from a number of the above and other surveys which was provided in Scolnic et al. (2017).

4.1.2 Standard rulers: angular distance to CMB last scattering surface and baryon acoustic oscillations

The very early universe was made of a hot and dense plasma of electrons and baryons, mixed with a pressure-less dark matter component. Photons were trapped with this plasma via Thomson scattering. This is sometimes referred to as the baryon-photon fluid. As the universe expanded and cooled down, electrons and protons formed neutral hydrogen atoms. This is called recombination and happened at approximately 380,000 years after the Big Bang corresponding to a redshift of about 1090 (Ade et al. 2016a; Spergel et al. 2003). Shortly after that, photons decoupled from the matter and traveled freely in the universe constituting the relic background radiation that we observe today as the CMB.

Before decoupling, the baryon-photon fluid was subject to gravitational infall toward the center of overdense regions (dominated by dark matter) but then pushed back outward by the building pressure of the photons. This process created spherical sound oscillations in the plasma fluid traveling at a sound speed \(c_s\) that depends on the baryons and photon density parameters. The largest comoving distance that such sound waves could have traveled from the Big Bang time to decoupling time is denoted here as \(r_{s,\mathrm{com},\mathrm{dec}}\) and can be calculated as follows

$$\begin{aligned} r_{s,\mathrm{com},\mathrm{dec}}= & {} \int ^{t_\mathrm{dec}}_{0}\frac{c_s\,dt}{a} \\= & {} \frac{c}{\sqrt{3}}\int ^{t_\mathrm{dec}}_{0}\frac{dt}{a\sqrt{1+\frac{3\varOmega _b}{4\varOmega _{\gamma }}a}}\\= & {} \frac{c}{\sqrt{3}H_0}\int ^{a_\mathrm{dec}}_{0}\frac{da}{\sqrt{\varOmega _r+a\varOmega _m}\sqrt{1+\frac{3\varOmega _b}{4\varOmega _{\gamma }}a}}. \end{aligned}$$

For example, if we use the values from Ade et al. (2016a) as follows: \(\varOmega _{b} = 0.0492\), \(\varOmega _m=0.3156\), \(\varOmega _{\gamma } = 5.45\times 10^{-5}\), \(\varOmega _{r} = 9.16\times 10^{-5}\) for baryon, matter, photon, and radiation (photons \(+\) neutrinos) density parameters, respectively; \(H_0=67.3\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\)and \(z_\mathrm{dec}=1090\); then Eq. (53) above gives \(r_{s,\mathrm{com},\mathrm{dec}}=144.7\) Mpc.
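The worked example above can be checked by direct numerical integration of the last line of Eq. (53). The sketch below is a minimal illustration that uses the parameter values quoted in the text as defaults and Simpson's rule for the integral; the function and variable names are ours.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]


def comoving_sound_horizon(z_dec=1090.0, h0=67.3, omega_b=0.0492, omega_m=0.3156,
                           omega_gamma=5.45e-5, omega_r=9.16e-5, steps=2000):
    """Comoving sound horizon at decoupling in Mpc (matter plus radiation only)."""
    a_dec = 1.0 / (1.0 + z_dec)
    baryon_loading = 3.0 * omega_b / (4.0 * omega_gamma)  # multiplies a under the root

    def integrand(a):
        return 1.0 / (math.sqrt(omega_r + a * omega_m)
                      * math.sqrt(1.0 + baryon_loading * a))

    h = a_dec / steps  # steps must be even
    total = integrand(0.0) + integrand(a_dec)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return C_KM_S / (math.sqrt(3.0) * h0) * total * h / 3.0


r_s_com = comoving_sound_horizon()
r_s_phys = r_s_com / (1.0 + 1090.0)  # physical scale a_dec * r_s
print(f"comoving: {r_s_com:.1f} Mpc, physical: {r_s_phys:.3f} Mpc")
```

With these inputs the result should land close to the quoted value of 144.7 Mpc; small differences reflect the rounding of the input parameters.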

The corresponding physical scale is given by \(r_{s,\mathrm{dec}}=a_\mathrm{dec}\times r_{s,\mathrm{com},\mathrm{dec}}=0.133\) Mpc and is called the crossing sound horizon at time of recombination. It corresponds to the largest scale at which an acoustic oscillation can be present in the baryon-photon fluid. After decoupling, these standing acoustic waves remained imprinted in the CMB temperature maps as well as in the distribution of matter structure in the universe. It constitutes a “standard ruler” that can be measured in the universe while taking into account the expansion scale factor (or redshift).

For the CMB, this standard ruler and the angular diameter distance from the observer to the CMB last scattering surface can be combined to give the angular size of the sound horizon on such a surface as

$$\begin{aligned} \theta _s\approx \frac{r_s}{d_A^{sls}}. \end{aligned}$$

This angle is particularly sensitive to the density and spatial curvature parameters, thus providing good constraints on the geometry of the universe. It is related to the position of the CMB acoustic peaks (e.g., \(\ell \approx \pi /\theta _s\) for the first peak). Planck has put remarkably tight constraints on this angle, \(\theta _s=(1.04106 \pm 0.00031)\times 10^{-2}\) (Ade et al. 2016a). A concise description of how the distance to last scattering is obtained using the crossing sound horizon can be found in, for example, Wijenayake and Ishak (2015), and more detail in Bond et al. (1997).

On the baryon side, part of the pattern is the presence of shells of overdense regions with comoving radius equal to the sound crossing horizon. This pattern is called the Baryon Acoustic Oscillations (BAO) and was indeed detected in various galaxy surveys as we cite further below. In BAO geometry, one is dealing with a spherical shell of matter so one can use the standard ruler along the line of sight (longitudinal) as well as in the transverse direction.

For the line-of-sight part, one can write from the line element of spacetime

$$\begin{aligned} H(z) = \frac{\delta z}{\delta \chi _{\parallel }}. \end{aligned}$$

One can measure \(\delta z\) from spectroscopy in the survey while \(\delta \chi _{\parallel }\) is the standard ruler, so one can constrain the Hubble function H(z) at some effective redshift.

For the transverse part, one can use the small angle approximation for the angle subtended by the standard ruler \(\delta \chi _{\bot }\) as

$$\begin{aligned} d_A(z)=\frac{\delta \chi _{\bot }}{\delta \theta }, \end{aligned}$$

where \(\delta \theta \) is measured from the survey while \(\delta \chi _{\bot }\) is the known standard ruler so one can derive the angular diameter distance \(d_A(z)\) at the effective redshift used.

Some analyses like Gaztañaga et al. (2009) and Chuang and Wang (2012) have used this approach and made very low-signal-to-noise detection because extremely large volumes are necessary for a 2D correlation function (Beutler et al. 2011). But a number of other analyses, e.g., Cole et al. (2005), Beutler et al. (2011), Blake et al. (2011b) and Anderson et al. (2012) made much stronger detections using rather a 1D correlation function and an effective projected distance defined as

$$\begin{aligned} D_V(z) \equiv \left[ (1+z)^2 d_A^2(z)\frac{cz}{H(z)} \right] ^{1/3}. \end{aligned}$$

In such analyses, what is fit to the data is then the ratio

$$\begin{aligned} d_z=\frac{r_s(z_{\mathrm{drag}})}{D_V(z)}, \end{aligned}$$

where \(r_s(z_{\mathrm{drag}})\) is specifically the comoving sound crossing horizon at the epoch when baryons became dynamically decoupled from photons. This can be understood as follows: after photon last scattering, the baryons went through a baryon drag epoch that lasted until a redshift of about 1060 (Ade et al. 2016a). Other variations or definitions of useful effective distances like (56) have been defined and used in the literature (Bassett and Hlozek 2010; Aubourg et al. 2015).
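As a rough numerical illustration of the effective distance and the fitted ratio, the sketch below evaluates \(d_A(z)\), \(D_V(z)\) (taken as the cube root of \((1+z)^2 d_A^2\, cz/H\)) and \(d_z\) for a flat \(\varLambda \)CDM background with radiation neglected. The parameter values (\(H_0=67.7\), \(\varOmega _m=0.31\), \(r_s=147.5\) Mpc) and all function names are assumptions chosen for illustration only; they are not taken from the analyses cited above.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble(z, h0, omega_m):
    """H(z) in km/s/Mpc for flat LCDM (matter + cosmological constant only)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def angular_diameter_distance(z, h0, omega_m, steps=2000):
    """d_A in Mpc: comoving distance c * int_0^z dz'/H(z') divided by (1 + z)."""
    dz = z / steps
    # Midpoint rule for the line-of-sight integral.
    integral = sum(1.0 / hubble((i + 0.5) * dz, h0, omega_m) for i in range(steps)) * dz
    return C_KM_S * integral / (1.0 + z)


def volume_distance(z, h0, omega_m):
    """Effective projected distance D_V in Mpc."""
    d_a = angular_diameter_distance(z, h0, omega_m)
    cube = (1.0 + z) ** 2 * d_a ** 2 * C_KM_S * z / hubble(z, h0, omega_m)
    return cube ** (1.0 / 3.0)


def bao_ratio(z, h0, omega_m, r_s_drag):
    """d_z = r_s(z_drag) / D_V(z), the quantity fit to the data."""
    return r_s_drag / volume_distance(z, h0, omega_m)


# Assumed fiducial values, for illustration only.
H0, OMEGA_M, R_S = 67.7, 0.31, 147.5
d_z_example = bao_ratio(0.57, H0, OMEGA_M, R_S)
```

At very low redshift the function `volume_distance` reduces to \(cz/H_0\), which provides a quick sanity check of the implementation.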

A number of measurements of BAO have been made and have become very useful in constraining the background geometry providing important complementary data to that of CMB and SN measurements. These include measurements of the BAO effective projected distance (or other measures) by for example the SDSS at \(z_{\mathrm{eff}}=0.15\) (Eisenstein et al. 2005; Ross et al. 2015), the 2-degree-Field Galaxy Survey (2dFGRS) at \(z_{\mathrm{eff}}=0.32\) (Cole et al. 2005), BOSS LOWZ at \(z_{\mathrm{eff}}=0.32\) and CMASS at \(z_{\mathrm{eff}}=0.57\) (Anderson et al. 2014), the 6dFGS measured at \(z_{\mathrm{eff}}=0.106\) (Beutler et al. 2011), and WiggleZ survey at \(z_{\mathrm{eff}}=0.6\) (Blake et al. 2011b).

4.1.3 Local measurements of the Hubble constant or measurements of H(z)

The Hubble constant, \(H_0\), is one of the oldest cosmological parameters describing the rate of expansion of the Universe and entering all distance and geometry measurements of the universe.

A direct measurement of the local Hubble constant is possible using the cosmic distance ladder (e.g., Freedman and Madore 2010). Once this local measurement is accomplished, it can serve as a prior for further cosmological analyses. This is particularly useful if one wants to fix the background cosmology to that of a fiducial \(\varLambda \)CDM while allowing the growth parameter to vary, as in the case of models that can mimic a \(\varLambda \)CDM expansion but still have a distinct growth rate of structure, for example some f(R) models (see Sect. 7.4.1).

Furthermore, other cosmological probes such as the CMB infer the value of the Hubble constant by assuming and using a cosmological model. Therefore the comparison of the local measurement with that of the CMB provides an important consistency test for the underlying model. This highlights the importance of such a local measurement and we report here some of the values of the local measurements of \(H_0\).

First, using the Hubble Space Telescope (HST) Key Project and Cepheid calibration of distances to 31 galaxies and other calibrated secondary distance indicators (Type Ia and Type II Supernovae), Freedman et al. (2001) reported \(H_0 = 72 \pm 8\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\). A decade later, Riess et al. (2011) used observations with a new HST camera of over 600 Cepheids in the host galaxies of 8 Type Ia SN. This allowed the authors to calibrate the SN magnitude-redshift relation and to obtain a much more precise value of \(H_0 = 73.8 \pm 2.4\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\). Efstathiou (2014) used different outlier rejection criteria for the Cepheids and obtained \(H_0 = 70.6 \pm 3.3\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\). He also obtained \(H_0 = 72.5 \pm 2.5\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\) when the H-band period-luminosity relation is assumed to be independent of metallicity, using other combined distance anchors. Freedman et al. (2012) used HST with further calibrations from the Spitzer Space Telescope to measure \(H_0 = 74.3 \pm 1.5 \,{\mathrm{(statistical)}}\, \pm 2.1\, {\mathrm{(systematic)}}\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\). Most recently, Riess et al. (2016) used four geometric calibration methods of Cepheids to obtain \(H_0 = 73.24\pm 1.74\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\).

It is worth noting here that a tension seems to persist between the local measurement values and the lower value obtained from Planck, i.e., \(H_0=66.93 \pm 0.62\)  km \(\mathrm s^{-1}\) Mpc\(^{-1}\). This tension has been the subject of numerous discussions in recent literature offering different perspectives (Bernal et al. 2016; Lin and Ishak 2017a, b; Luković et al. 2018; Wang et al. 2017; Haridasu et al. 2017; Zhang et al. 2018; Gómez-Valent and Amendola 2018; Abbott et al. 2018a). As we discuss further below in some of the sub-sections (see e.g., Sects. 9.7 and 9.9), some authors find that some modified gravity models reduce or alleviate the tension in the Hubble parameter (see e.g., Barreira et al. 2014a; Belgacem et al. 2018b).
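A simple way to put a number on such a tension is to divide the difference of the two central values by their uncertainties added in quadrature. The short sketch below does this for the Riess et al. (2016) and Planck values quoted above. It assumes independent Gaussian errors, which is a simplification and not necessarily the estimator used in the cited discussions; the function name is hypothetical.

```python
import math


def gaussian_tension(mean_a, sigma_a, mean_b, sigma_b):
    """Difference of two measurements in units of their combined 1-sigma error.

    Assumes the two measurements are independent and Gaussian distributed.
    """
    return abs(mean_a - mean_b) / math.hypot(sigma_a, sigma_b)


# Values quoted in the text, in km/s/Mpc: local (Riess et al. 2016) vs. Planck.
tension = gaussian_tension(73.24, 1.74, 66.93, 0.62)
print(f"{tension:.1f} sigma")  # prints "3.4 sigma"
```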

Other approaches have also been used to determine the local value of \(H_0\). Some time ago, Gott et al. (2001) developed and used a median statistics method that provides an alternative to \(\chi ^2\) likelihood methods and requires fewer assumptions about the data. Using 331 measurements of \(H_0\) from Huchra’s compilation, they found at that time a median value of \(H_0=67\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\) with \(\pm 2\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\) statistical errors (95% CL) and \(\pm 5\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\) systematic errors (95% CL). Some time later, Chen and Ratra (2011) used the same method and the final compilation of Huchra with 553 measurements, finding a median of \(H_0 = 68 \pm 5.5\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\) (at 95% CL) including statistical and systematic uncertainties. Most recently, Chen et al. (2017) used instead the Hubble function H(z), with 28 measurements at intermediate redshifts \(0.07\le z \le 2.3\), in order to determine the local Hubble constant, \(H_0\). They find, for the spatially flat and non-flat \(\varLambda \mathrm {CDM}\) model, \(H_0=68.3^{+2.7}_{-2.6}\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\). The authors stress that this value is consistent with the low value obtained in the previous work using median statistics. They also note that this value is consistent with the low value measured by Planck, while it includes the high value from the local measurements discussed above within the 2\(\sigma \) bound.

Further work using H(z) was carried out (Moresco et al. 2016; Farooq et al. 2017; Yu et al. 2018), where the authors put constraints on a cosmological deceleration-acceleration transition with various levels of confidence. Capozziello et al. (2014) made some first developments to constrain f(R) models using the cosmological deceleration-acceleration transition redshift. They required that the model reduces to \(\varLambda \mathrm {CDM}\) at \(z=0\) but parametrized possible departures from it at higher redshifts in terms of a two-parameter logarithmic correction. They found that the transition in this model happens at a redshift consistent with type Ia supernova apparent magnitude data and Hubble parameter measurements. Finally, Gómez-Valent and Amendola (2018) followed the H(z) approach using cosmic chronometers, Type Ia supernovae, Gaussian processes and a novel Weighted Polynomial Regression method to find \(H_0=67.06\pm 1.68\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\), which is in agreement with the low values and in 2.71\(\sigma \) tension with the local measurement of Riess et al. They also determined a more conservative value of \(H_0=68.45\pm 2.00\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\), which is still in about 2\(\sigma \) tension with the value from Riess et al. quoted above. With future precise data from, for example, GAIA and other experiments, one will hopefully get to the bottom of these tensions.
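The median statistics idea mentioned above can be sketched in a few lines. If the measurements are independent and free of an overall systematic error, each one lies above the true median with probability 1/2, so the number of measurements below the true median is binomially distributed and a confidence range follows from the ordered values alone. The code below is a minimal illustration under those assumptions; the function name is hypothetical and the input is mock data, not Huchra’s compilation.

```python
import math
import random


def median_with_interval(values, level=0.95):
    """Median and a central confidence range from binomial order statistics."""
    x = sorted(values)
    n = len(x)
    mid = n // 2
    median = x[mid] if n % 2 else 0.5 * (x[mid - 1] + x[mid])

    def coverage(j):
        # Probability that the true median lies between x[j] and x[n-1-j]:
        # the count of values below it is binomial with p = 1/2.
        return sum(math.comb(n, i) for i in range(j + 1, n - j)) / 2.0 ** n

    if coverage(0) < level:
        raise ValueError("too few measurements for the requested level")
    j = 0
    while coverage(j + 1) >= level:  # shrink the range while it still covers `level`
        j += 1
    return median, x[j], x[n - 1 - j]


# Mock, purely illustrative "measurements" in km/s/Mpc.
rng = random.Random(1)
mock_h0 = [rng.gauss(68.0, 6.0) for _ in range(101)]
median, lower, upper = median_with_interval(mock_h0)
```

Note that no error bars on the individual measurements enter the calculation, which is the sense in which the method requires fewer assumptions than a \(\chi ^2\) analysis.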

4.2 Weak gravitational lensing

Trajectories of photons traveling to us from remote galaxies get deflected along the line of sight by matter overdensities in the intervening medium. This is called gravitational lensing. Depending on the positions of the sources and lenses relative to the observer, these deflections can result in strong, intermediate, or weak lensing. Strong and intermediate lensing provide spectacular multiple images such as Einstein rings and crosses (Cabanac et al. 2005; Belokurov et al. 2009), giant arcs, and arclets (Hennawi et al. 2008). Less impressive but far more abundant, weak lensing consists of tiny distortions to the shapes of millions and millions of galaxies that can be accounted for using statistical techniques and turned into a powerful cumulative signal, which probes the cosmology of the intervening deflector medium including any modification to gravity theory at cosmological scales.

Weak lensing at cosmological scales, also called cosmic shear, is quantified by the shear of images, which tends to transform circular shapes into elliptical ones and is represented by the complex quantity \(\gamma \), and by the convergence, \(\kappa \), which represents the magnification of these images. In this weak regime, the two effects are very small, of the order of a few percent at most, and have equal power spectra, so they are often used interchangeably. To linear order, the shear is a good approximation to the reduced shear that is determined from the measured shapes (ellipticities) of galaxy images on the scales typically used in weak lensing analyses to date; see, e.g., the reviews by Bartelmann and Schneider (2001) and Kilbinger (2015).

Cosmic shear surveys measure ellipticities and positions of galaxies in the sky and then build from them pairs and triplets called 2- and 3-point correlation functions that can be compared to theoretical models using the lensing power spectrum and bispectrum that are derived from the formalism as follows (we use a mixture of steps from Kilbinger 2015; Troxel and Ishak 2015).

The mean convergence can be written as a weighted projection of the overdensities along the line of sight

$$\begin{aligned} \kappa (\varvec{\theta }) = \frac{3 H_0^2 \varOmega _{\mathrm{m}}}{2 c^2} \int \limits _0^{\chi _{_{\mathrm{H}}}} {\mathrm{d} \chi } \frac{g(\chi )}{a(\chi )} f_K(\chi ) \, \delta (f_K(\chi ) \varvec{\theta }, \chi ), \end{aligned}$$

where \(\chi _{_{\mathrm{H}}}\) is the comoving coordinate at the horizon, \(f_K(\chi )\) is given by Eq. (10), and \(g(\chi )\) is defined as

$$\begin{aligned} g(\chi ) = \int \limits _\chi ^{\chi _{_{\mathrm{H}}}} \mathrm{d} \chi ^\prime \, n(\chi ^\prime ) \frac{f_K(\chi ^\prime - \chi )}{f_K(\chi ^\prime )}, \end{aligned}$$

and represents the lensing efficiency at a distance \(\chi \). The convergence 2-point correlation function is constructed as

$$\begin{aligned} \langle \kappa (\varvec{\theta _1}) \kappa (\varvec{\theta _2}) \rangle , \end{aligned}$$

where again \(\langle \,\,\rangle \) denotes the ensemble average. Now, the convergence scalar field can be decomposed into multipole moments of the spherical harmonics as

$$\begin{aligned} \kappa (\varvec{\theta })=\sum _{lm}\kappa _{lm}Y^m_l(\varvec{\theta }), \end{aligned}$$


$$\begin{aligned} \kappa _{lm}=\int d\hat{\theta } \kappa (\varvec{\theta }) Y^{m*}_l(\varvec{\theta }). \end{aligned}$$

The convergence power spectrum \(P_\kappa (\ell )\) is then defined by

$$\begin{aligned} \left\langle \kappa _{lm} \kappa _{l'm'} \right\rangle =\delta _{ll'} \delta _{mm'} P_\kappa (\ell ). \end{aligned}$$

In the Limber approximation (Limber 1953), it is given by Kaiser (1992, 1998) and Jain and Seljak (1997):

$$\begin{aligned} P_\kappa (\ell ) = \frac{9}{4} \, \varOmega _{\mathrm{m}}^2 \left( \frac{H_0}{c} \right) ^4 \int _0^{\chi _{_{\mathrm{H}}}} \mathrm{d} \chi \, \frac{g^2(\chi )}{a^2(\chi )} P_\delta \left( k = \frac{\ell }{f_K(\chi )}, \chi \right) , \end{aligned}$$

where \( P_\delta \left( k = \frac{\ell }{f_K(\chi )}, \chi \right) \) is the 3D nonlinear matter power spectrum (Sect. 3.2.3).
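To make the structure of the Limber integral concrete, the following sketch evaluates it numerically for a deliberately simplified setup. Everything specific here is an assumption for illustration: flat space so that \(f_K(\chi )=\chi \), all sources on a single plane at \(\chi _s\) so that \(g(\chi )=(\chi _s-\chi )/\chi _s\), an Einstein-de Sitter background in units where \(c/H_0=1\), and a toy power-law matter power spectrum rather than a realistic nonlinear one. The function names are hypothetical.

```python
def limber_p_kappa(ell, chi_s, p_delta, a_of_chi, omega_m, h0_over_c=1.0, steps=2000):
    """Limber integral for the convergence power spectrum.

    Assumes flat space (f_K(chi) = chi) and a single source plane at chi_s,
    so the lensing efficiency is g(chi) = (chi_s - chi) / chi_s.
    """
    dchi = chi_s / steps
    total = 0.0
    for i in range(steps):
        chi = (i + 0.5) * dchi  # midpoint rule, which also avoids chi = 0
        g = (chi_s - chi) / chi_s
        a = a_of_chi(chi)
        total += g * g / (a * a) * p_delta(ell / chi, chi)
    return 2.25 * omega_m ** 2 * h0_over_c ** 4 * total * dchi


def a_eds(chi):
    """Scale factor in an Einstein-de Sitter universe, with chi in units of c/H0."""
    return (1.0 - 0.5 * chi) ** 2


def toy_p_delta(k, chi, amplitude=1.0, slope=-1.0):
    """Toy spectrum: power law in k with linear growth D = a (not realistic)."""
    return amplitude * k ** slope * a_eds(chi) ** 2


p_100 = limber_p_kappa(100.0, 0.5, toy_p_delta, a_eds, omega_m=1.0)
p_200 = limber_p_kappa(200.0, 0.5, toy_p_delta, a_eds, omega_m=1.0)
```

For a pure power law \(P_\delta \propto k^n\) the projection preserves the slope, so `p_200 / p_100` equals \(2^n\); this is a convenient check before substituting a realistic spectrum and source distribution.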

As we discuss further below, modifications to gravity will alter the growth factor function and the matter power spectrum (49) as well as Weyl potential Eq. (88). A generalization of the above steps to the convergence 3-point correlation, \(\langle \kappa (\varvec{\theta _1}) \kappa (\varvec{\theta _2}) \kappa (\varvec{\theta _3}) \rangle \), provides the convergence bispectrum

$$\begin{aligned} B_{\kappa }(\ell _1,\ell _2,\ell _3)&=\int _0^{\chi _{_{\mathrm{H}}}}d\chi \frac{W^3(\chi )}{f_K^4(\chi )}\nonumber \\&\quad B_{\delta }\left( k_1=\frac{\ell _1}{f_K(\chi )},k_2=\frac{\ell _2}{f_K(\chi )},k_3=\frac{\ell _3}{f_K(\chi )};\chi \right) , \end{aligned}$$

where we encapsulated the other factors into the \(W(\chi )\) as follows,

$$\begin{aligned} W(\chi )&=\frac{3}{2}H_0^2\frac{\varOmega _m}{a(\chi )}\int _{\chi }^{\chi _{_{\mathrm{H}}}}d\chi ' n(\chi ')f_K(\chi )\frac{f_K(\chi '-\chi )}{f_K(\chi ')}, \end{aligned}$$

and \(B_{\delta }(k_1=\frac{\ell _1}{f_K(\chi )},k_2=\frac{\ell _2}{f_K(\chi )},k_3=\frac{\ell _3}{f_K(\chi )};\chi )\) is the 3D matter bispectrum.

Next, we describe a few more steps on how the comparison to observed ellipticities of galaxies is performed. We note that the ellipticity is also represented as a complex field, just like the shear. For a galaxy with intrinsic ellipticity \(\varepsilon ^{\mathrm{int}}\), cosmic shear modifies this ellipticity (via combination with the reduced shear; Kilbinger 2015) such that the observed ellipticity in the weak-lensing regime is given by

$$\begin{aligned} \varepsilon \approx \varepsilon ^{\mathrm{int}} + \gamma . \end{aligned}$$

If we average over a large number of galaxies, we expect the averaged first term to drop due to the assumed random intrinsic ellipticity of galaxies (any residual is usually put into a noise term) so the observed ellipticity components can be used as an estimator of the complex shear, i.e., \(\gamma = \left\langle \varepsilon \right\rangle \).
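The averaging argument can be illustrated with a small mock experiment: draw random intrinsic ellipticities, add a constant shear, and check that the mean observed ellipticity recovers the shear to within the expected noise. The shear value, the intrinsic scatter of 0.3 per component and the number of galaxies are assumptions chosen for illustration, and intrinsic alignments are ignored.

```python
import random


def mean_ellipticity(ellipticities):
    """Estimate the complex shear as the average observed ellipticity."""
    return sum(ellipticities) / len(ellipticities)


rng = random.Random(42)
true_shear = complex(0.02, -0.01)  # assumed constant shear over the patch
sigma_int = 0.3                    # assumed intrinsic scatter per component
n_gal = 20000

# Observed ellipticity = random intrinsic ellipticity + shear (weak-lensing limit).
observed = [
    complex(rng.gauss(0.0, sigma_int), rng.gauss(0.0, sigma_int)) + true_shear
    for _ in range(n_gal)
]
shear_hat = mean_ellipticity(observed)
shape_noise = sigma_int / n_gal ** 0.5  # expected error on each shear component
```

With these numbers the shape noise per component is about 0.002, which shows why a percent-level shear can only be measured by averaging over very large numbers of galaxies.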

Additionally, galaxies also have intrinsic alignments that provide signals contaminating the lensing signal. These intrinsic alignments are due to processes of galaxy formation in the gravitational field. They need to be isolated and mitigated for weak lensing to reach its full potential. See the following reviews for this topic (Troxel and Ishak 2015; Kirk 2015).

In practice, the two components of the shear can be identified as a tangential component with respect to the 1-axis, i.e., \(\gamma _t=-\gamma _1\), and a cross-component, i.e., \(\gamma _\times =-\gamma _2\), obtained by a rotation of an angle \(+\pi /4\) from the tangential component. These components are used to build 2-point correlators that can be combined to construct two practical and often-used 2-point correlations from observations as follows (Miralda-Escude 1991),

$$\begin{aligned} \xi _+(\theta )&= \langle \gamma _{\mathrm{t}} \gamma _{\mathrm{t}} \rangle (\theta ) + \langle \gamma _\times \gamma _\times \rangle (\theta ); \quad \nonumber \\ \xi _-(\theta )&= \langle \gamma _{\mathrm{t}} \gamma _{\mathrm{t}} \rangle (\theta ) - \langle \gamma _\times \gamma _\times \rangle (\theta ) . \end{aligned}$$

The explicit corresponding weighted estimators from ellipticities can be found in for example Kilbinger (2015).

Finally, in order to compare the correlation functions above to their theoretical counterparts, the shear 2-point correlations are related to the convergence power spectrum as follows

$$\begin{aligned} \xi _+(\theta )&= \frac{1}{2\pi } {\displaystyle \int } \mathrm{d} \ell \, \ell \,\mathrm{J}_0(\ell \theta ) P_\kappa (\ell ),\nonumber \\ \xi _-(\theta )&= \frac{1}{2\pi } {\displaystyle \int } \mathrm{d} \ell \, \ell \,\mathrm{J}_4(\ell \theta ) P_\kappa (\ell ), \end{aligned}$$

where \(J_n(x)\) is the n-th order Bessel function of the first kind.

It is also worth mentioning that cosmic shear analyses employ a powerful technique called tomography, where the data are split into redshift bins. This strongly probes the growth rate of large-scale structure. With tomography, the 2-point correlation functions between two bins i and j are specialized as

$$\begin{aligned} \xi _{\pm }^{ij}(\theta )&= \frac{1}{2\pi } {\displaystyle \int } \mathrm{d} \ell \, \ell \,\mathrm{J}_{0/4}(\ell \theta ) P_\kappa ^{ij}(\ell ), \end{aligned}$$

where the corresponding power spectrum is given by

$$\begin{aligned} P_\kappa ^{ij}(\ell ) = \frac{9}{4} \, \varOmega _{\mathrm{m}}^2 \left( \frac{H_0}{c} \right) ^4 \int _0^{\chi _{\mathrm{lim}}} \mathrm{d} \chi \, \frac{g^i(\chi )g^j(\chi )}{a^2(\chi )} P_\delta \left( k = \frac{\ell }{f_K(\chi )}, \chi \right) . \end{aligned}$$

Modifications to gravity are constrained by weak lensing via the growth factor function and any other changes in the matter power spectrum (49), as well as the modifications to the Weyl potential equation (88). The latter change is usually captured phenomenologically by the addition of the MG parameter factor, \(\varSigma (k,\chi )^2\), in the integrand of equation (71). This highlights the sensitivity and importance of WL surveys in testing deviations from GR. We reproduce here the top-right panel of Fig. 1 from Shirasaki et al. (2016) (Fig. 2 here), comparing the convergence power spectra of two f(R) models, two dynamical dark energy models and the standard \(\varLambda \)CDM model.

Fig. 2

Top panel: convergence power spectra for f(R) models (see Sect. 7.4.1), dynamical dark energy models and the \(\varLambda \)CDM standard model. Error bars are for the survey indicated on the figure, with a sky coverage of 20,000 square degrees and a galaxy number density of 10 per square arcminute. The dashed line corresponds to the shot noise term of the auto power spectrum. Bottom panel: ratio between the \(\varLambda \)CDM model and the f(R) or wCDM models. Image reproduced with permission from Shirasaki et al. (2016), copyright by the authors

Recent cosmic shear surveys have already provided us with several analyses constraining modifications to GR or some classes of MG models that we discuss further below. These include CFHTLenS (Heymans et al. 2013; Simpson et al. 2013), KiDS (Joudaki et al. 2017; Hildebrandt et al. 2017), and KiDS+2dFLenS (Amon et al. 2017; Joudaki et al. 2018). It is expected that LSST (https://www.lsst.org/; LSST Dark Energy Science Collaboration 2012), WFIRST (https://wfirst.gsfc.nasa.gov/; Spergel et al. 2015), and Euclid (http://sci.esa.int/euclid/; Amiaux et al. 2012) will be particularly effective in constraining models beyond \(\varLambda \)CDM, including deviations from GR and a number of classes of MG theories (Jennings et al. 2012; Xu 2015; Kwan et al. 2012; Tsujikawa 2015; Bellini et al. 2016; Okumura et al. 2016).

4.3 Galaxy surveys: clustering and redshift space distortions (RSD)

In recent years, a wealth of cosmological information has been provided by spectroscopic redshift surveys such as SDSS, BOSS, 2dF, 6dF and WiggleZ. From galaxy redshift surveys one can measure the isotropically averaged galaxy power spectrum or the galaxy correlation function and thus put constraints on cosmological parameters as well as MG parameters and models. This can be done via constraints on various factors in the galaxy power spectrum (50) discussed in Sect. 3.2.3. For example, we reproduce Fig. 2 from Barreira et al. (2014a) (see Fig. 4 here) showing in the bottom panel the data points from the SDSS-DR7 Luminous Red Galaxy host halo power spectrum of Reid et al. (2010) against Galileon MG models and \(\varLambda \)CDM with massive neutrinos.

Additionally, there are Lyman-\(\alpha \) surveys (sub-surveys) that can determine the frequency, density and temperature of matter clouds containing neutral hydrogen between the observer and remote quasars. Each spectrum gives information about multiple structures along the line of sight and that traces the distribution and growth of matter along the line of sight, see for example Weinberg et al. (2003), McDonald et al. (2006) and Font-Ribera et al. (2013).

With regard to testing deviations from GR using galaxy redshift surveys, it seems that “the good comes from the bad”. Indeed, observations along the line of sight are subject to distortions because we make measurements in redshift space and then convert them to real space. It turns out that these distortions are a rich source of cosmological information, at the forefront of which are the redshift space distortions (RSD), which are very sensitive to the growth rate of structure and the gravity theory governing such growth. We briefly describe below some aspects of the RSD formalism and refer the reader to specialized reviews on the topic (Samushia et al. 2014; Blake et al. 2011a; Hamilton 1998; Percival and White 2009; Percival 2013) and references therein.

Redshifts to remote cosmic objects such as galaxies are distorted by peculiar velocities of these objects with respect to the Hubble flow. These peculiar velocities follow the large-scale infall of matter toward overdense regions in the cosmic web and can thereby trace the growth rate of large-scale structure. The distortions can be observed in redshift space as two main effects. The first one is due to the random peculiar velocity distribution of galaxies in clusters, which produces a Doppler effect stretching out a cluster of galaxies in the radial direction on redshift maps. This radial stretching points to the observer and was dubbed the “fingers-of-god” (FoG) effect; see, e.g., the seminal papers by Kaiser (1987) and Hamilton (1998), and also earlier work by Jackson (1972). The FoG effect happens at relatively small, nonlinear scales. The second effect happens on larger scales, where the peculiar velocities are not random but directed coherently toward the center of overdense regions (center of mass of clusters). It is a subtle blend of effects that combine to produce a flattening of the distribution on larger scales on redshift survey maps, sometimes dubbed the “pancakes-of-god”; see, e.g., Hamilton (1998), Percival and White (2009) and Percival (2013). The related equations are as follows.

A point in the redshift space can be related to the real space by

$$\begin{aligned} \mathbf {s}({\mathbf{r}} ) = {\mathbf{r}} + v_r ({\mathbf{r}} ) {\hat{\mathbf{r}}}, \end{aligned}$$

where \(v_r\) is the peculiar velocity projected in the radial direction. Next, we recall the linearized continuity equation

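$$\begin{aligned} \beta \delta _g + \bar{\nabla } \cdot \bar{v} = 0, \end{aligned}$$

where \(\bar{v}\) is the matter velocity field, \(\delta _g = b\,\delta _m\) is the galaxy overdensity for a linear galaxy bias b(z), \(\beta (z) \equiv f(z)/b(z)\), and f(z) is the growth rate of structure.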

Using the Jacobian between the redshift and real spaces, conservation of galaxy number in the two spaces, the continuity equation and a few steps, it is straightforward to derive (Kaiser 1987; Hamilton 1998)

$$\begin{aligned} \delta ^s_g(k) = \left( 1+\beta \mu ^2 \right) \delta ^r_g(k), \end{aligned}$$

where \(\mu \) is the cosine of the angle between the wavevector and the line of sight.

Using (74) and a linear galaxy bias, the corresponding power spectra are related as follows

$$\begin{aligned} P^s_g(k,\mu ,z)= & {} b(z)^2\left[ 1 + \beta (z) \mu ^2 \right] ^2 P_m^r(k,z) \end{aligned}$$
$$\begin{aligned}= & {} \left[ b (z) + f(z) \mu ^2 \right] ^2 P_m^r(k,z) \, , \end{aligned}$$

where in the last line, we split b(z) and f(z) on purpose and note that from the matter power spectrum on the right comes its amplitude, e.g., \(\sigma _8\) that is then degenerate with f(z) in such a measurement. This illustrates why RSD surveys probe \(b(z)\sigma _8\) and \(f(z)\sigma _8\), unless the degeneracies are broken by other means.
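Because the angular dependence in the relation above is a polynomial in \(\mu ^2\), the redshift-space power spectrum is fully described by its Legendre multipoles. The sketch below computes the monopole and quadrupole numerically and compares them with the standard linear-theory results, \(b^2(1+\tfrac{2}{3}\beta +\tfrac{1}{5}\beta ^2)P_m\) and \(b^2(\tfrac{4}{3}\beta +\tfrac{4}{7}\beta ^2)P_m\). The function names and the input values (\(b=2\), \(f=0.8\), unit matter power) are assumptions for illustration only.

```python
def kaiser_power(p_m, mu, bias, growth_rate):
    """Linear redshift-space galaxy power: (b + f mu^2)^2 P_m."""
    return (bias + growth_rate * mu ** 2) ** 2 * p_m


def legendre(ell, mu):
    """Legendre polynomials of order 0, 2 and 4."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu ** 2 - 1.0)
    if ell == 4:
        return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0
    raise ValueError("only ell = 0, 2, 4 are implemented")


def multipole(ell, p_m, bias, growth_rate, steps=2000):
    """P_ell = (2 ell + 1) * int_0^1 P^s(mu) L_ell(mu) dmu (integrand even in mu)."""
    dmu = 1.0 / steps
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) * dmu  # midpoint rule
        total += kaiser_power(p_m, mu, bias, growth_rate) * legendre(ell, mu)
    return (2 * ell + 1) * total * dmu


b, f, p_m = 2.0, 0.8, 1.0  # illustrative values, not measurements
beta = f / b
monopole = multipole(0, p_m, b, f)
quadrupole = multipole(2, p_m, b, f)
monopole_exact = b ** 2 * (1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0) * p_m
quadrupole_exact = b ** 2 * (4.0 * beta / 3.0 + 4.0 * beta ** 2 / 7.0) * p_m
```

Since \(P_m \propto \sigma _8^2\), rescaling `p_m` while rescaling `b` and `f` in the opposite sense leaves both multipoles unchanged, which is the degeneracy described above.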

Equation (75) gives the linear RSD at large scales (see Footnote 1), while the nonlinear FoG effect can be modeled by a damping factor multiplying the power spectrum, often chosen to be of exponential (Lorentzian) or Gaussian form (Percival and White 2009)

$$\begin{aligned} F_{\mathrm{Lorentzian}}(k,\mu ^2)= & {} [1+(k\sigma _p\mu )^2]^{-1}, \end{aligned}$$
$$\begin{aligned} F_{\mathrm{Gaussian}}(k,\mu ^2)= & {} \exp [-(k\sigma _p\mu )^2]. \end{aligned}$$

It is then customary to multiply Eqs. (75) and (78) to combine the two effects, although with caution about some limitations and the need for accurate simulations, as discussed in, for example, Percival and White (2009). Indeed, other combined models including contributions from nonlinear effects and numerical simulations are used to fully explore RSD modeling and observations, and we refer the reader to the following RSD reviews in the literature (Hamilton 1998; Percival and White 2009; Percival 2013) and references therein.
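The combination of the linear term with a FoG damping factor can be written compactly as below. This is only a sketch of the simple multiplicative model described above, with hypothetical function names; \(\sigma _p\) is taken in the same length units as \(1/k\), and the caveats about its accuracy on nonlinear scales apply.

```python
import math


def fog_lorentzian(k, mu, sigma_p):
    """Lorentzian damping factor [1 + (k sigma_p mu)^2]^-1."""
    return 1.0 / (1.0 + (k * sigma_p * mu) ** 2)


def fog_gaussian(k, mu, sigma_p):
    """Gaussian damping factor exp[-(k sigma_p mu)^2]."""
    return math.exp(-(k * sigma_p * mu) ** 2)


def redshift_space_power(p_m, k, mu, bias, growth_rate, sigma_p, damping=fog_lorentzian):
    """Damping factor times the linear (b + f mu^2)^2 P_m term."""
    kaiser = (bias + growth_rate * mu ** 2) ** 2
    return damping(k, mu, sigma_p) * kaiser * p_m
```

Both damping factors equal 1 for modes transverse to the line of sight (\(\mu =0\)) and agree to first order in \((k\sigma _p\mu )^2\), so the choice between them matters mainly at large \(k\sigma _p\mu \), where the Gaussian form suppresses power more strongly.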

Fig. 3

Growth rate \(f(z)\sigma _8(z)\) measurements for the redshift range \(0<z<1.55\) and theoretical predictions from the standard GR-\(\varLambda \)CDM model and the MG models f(R) (see Sect. 7.4.1), covariant Galileons (see Sect. 7.3.1), extended Galileons, DGP (see Sect. 7.5.2), and models with varying gravitational constant. The constraint obtained from the Subaru FastSound sample at \(1.19<z<1.55\) (Okumura et al. 2016) is plotted as the big red point. The other results include the 6dFGS, 2dFGRS, SDSS main galaxies, SDSS LRG, BOSS LOWZ, WiggleZ, BOSS CMASS, VVDS, and VIPERS surveys at \(z<1\). The predicted \(f\sigma _8\) from GR-\(\varLambda \)CDM, with the amplitude determined by minimizing \(\chi ^2\), is shown as the red solid line. The data points used for the \(\chi ^2\) minimization are denoted as the filled-symbol points. The other curves are predictions from MG models as indicated on the right. Image reproduced with permission from Okumura et al. (2016), copyright by the authors

Finally, it is worth noting that measurements of RSD are degenerate with another effect called the Alcock–Paczynski effect (Alcock and Paczynski 1979), which is caused by the conversion of angles and redshifts measured in redshift space to physical distances and the Hubble function in real space. If the theoretical cosmological model used is significantly different from the true model, then further distortions are introduced in this process. These can be confused with the RSD effects and need to be accounted for. This results in a further multiplicative expression in Eq. (75) with one or two more parameters. See, for example, the treatments and discussions in Ballinger et al. (1996), Simpson and Peacock (2010), Samushia et al. (2012) and Montanari and Durrer (2012). This is well summarized in the following equation from Raccanelli et al. (2015):

$$\begin{aligned} P_g^{\mathrm{s}}(k',\mu ',\alpha _\bot ,\alpha _{||},\mathbf{p})= \frac{(b+\mu '^{2}f)^2}{\alpha _{\bot }^2\alpha _{||}} P_m^{\mathrm{r}}\left[ \frac{k'}{\alpha _\bot }\sqrt{1+\mu ^{'2}\left( \frac{1}{F^2}-1\right) }\right] , \end{aligned}$$

where \(\mathbf{p}\) are the cosmological parameters of the real-space power spectrum and the primed quantities are the observed quantities, introduced here to distinguish them from the real quantities as follows: \(k'\) and \(\mu '\) are the observed wavenumber and angle cosine; their relation to the real quantities is given by \(k'_{||}=\alpha _{||}k_{||}\), \(k'_\bot =\alpha _\bot k_\bot \), \(\mu '=\frac{k_{||}'}{\sqrt{k_{||}'^{2}+k_\bot '^{2}}}\); \(F=\alpha _{||}/\alpha _\bot \), with \(\alpha _{||}=\frac{H^{\mathrm{fid}}}{H^{\mathrm{real}}}\) and \(\alpha _\bot =\frac{D^{\mathrm{real}}}{D^{\mathrm{fid}}}\) the ratios of radial and angular distances, respectively, between the fiducial and real cosmological models, see Raccanelli et al. (2015).
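The argument of the real-space power spectrum in the equation above follows directly from rescaling the two wavevector components separately. The sketch below maps an observed \((k',\mu ')\) to the real \((k,\mu )\) component by component, using \(k'_{||}=\alpha _{||}k_{||}\) and \(k'_\bot =\alpha _\bot k_\bot \), and also evaluates the closed-form expression with \(F=\alpha _{||}/\alpha _\bot \) so that the two can be compared. The function names and the sample values of \(\alpha _{||}\) and \(\alpha _\bot \) are assumptions for illustration.

```python
import math


def observed_to_real(k_obs, mu_obs, alpha_perp, alpha_par):
    """Map observed (k', mu') to real (k, mu) by rescaling each component."""
    k_par = k_obs * mu_obs / alpha_par
    k_perp = k_obs * math.sqrt(1.0 - mu_obs ** 2) / alpha_perp
    k_real = math.hypot(k_par, k_perp)
    return k_real, k_par / k_real


def k_real_closed_form(k_obs, mu_obs, alpha_perp, alpha_par):
    """Closed form: (k'/alpha_perp) * sqrt(1 + mu'^2 (1/F^2 - 1))."""
    f_ap = alpha_par / alpha_perp
    return k_obs / alpha_perp * math.sqrt(1.0 + mu_obs ** 2 * (1.0 / f_ap ** 2 - 1.0))


# Illustrative distortion parameters (a few percent away from unity).
k_real, mu_real = observed_to_real(0.1, 0.6, alpha_perp=1.02, alpha_par=0.97)
k_check = k_real_closed_form(0.1, 0.6, alpha_perp=1.02, alpha_par=0.97)
```

When the fiducial and real models coincide, \(\alpha _{||}=\alpha _\bot =1\) and the mapping reduces to the identity, so any anisotropy left in the measured clustering is then attributable to RSD alone.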

An important aspect of RSD analyses is to use measurements of the correlation function from galaxy redshift surveys and then compare them to the theoretical galaxy power spectrum or its Legendre decomposition in order to estimate \(f\sigma _8\) and \(b\sigma _8\) at different effective redshifts.

For our review, we stress that modifications to gravity enter into the \(f(z)\sigma _8\) term in Eq. (80) and also into the \(G^2(z)\) contained in the matter power spectrum. RSD measurements are thus very important in constraining deviations from GR affecting the Poisson equation (29). While current error bars on measurements are still too large to exclude a number of contenders to GR, RSD is considered one of the most promising probes of gravity theories and has been used in a number of analyses, as we discuss further below. For example, it has been shown in Okada et al. (2013) that RSD can already exclude some covariant Galileon MG models (see Sect. 7.3.1) to a high level of confidence. We reproduce Fig. 17 from Okumura et al. (2016) (see Fig. 3) for a number of \(f\sigma _8\) measurements to date along with GR-\(\varLambda \)CDM and five MG models (see discussion in Sect. 6.2).

Current RSD data include, for example, measurements from 6dFGS (Beutler et al. 2012), 2dFGRS (Cole et al. 2005), SDSS LRG (Samushia et al. 2012), BOSS LOWZ (Tojeiro et al. 2012), BOSS CMASS (Anderson et al. 2014), VVDS (Guzzo et al. 2008), VIPERS (de la Torre et al. 2013), the WiggleZ Dark Energy Survey (Blake et al. 2012; Parkinson et al. 2012), and the Subaru FMOS galaxy redshift survey (FastSound) (Okumura et al. 2016). A compilation of 34 points with corrections for model dependencies can be found in Nesseris et al. (2017). It is worth noting that when using \(f \sigma _8\) data to constrain modified gravity models, one has to make sure that no assumptions of the \(\varLambda \mathrm {CDM}\) model are retained in the data points due to calibration using \(\varLambda \mathrm {CDM}\) mocks. See, for example, the following papers that performed validation analyses of \(f \sigma _8\) constraints in MG models and pointed out possible biases (Taruya et al. 2014; Barreira et al. 2016; Bose et al. 2017).

Beyond linear scales, RSD and velocity power spectra were also shown to be a promising probe of deviations from GR. Jennings et al. (2012) used large-volume N-body simulations to study dark matter clustering in redshift space in f(R) modified gravity models (see Sect. 7.4.1). The nonlinear matter and velocity fields were resolved to a high level of accuracy over a broad range of scales for f(R) models. The analysis found significant deviations from the clustering signal in GR, with an enhanced boost in power on large scales and stronger damping on small scales in the f(R) models at redshifts z below 1. In particular, they found that the velocity power spectrum is a strong discriminator between f(R) and GR, suggesting that the extraction of the velocity power spectrum from future galaxy surveys is a promising method to constrain deviations from GR. See also Hellwing et al. (2014) on the galaxy velocity field and a signature of MG.

It is worth mentioning here that almost a decade ago RSD already attracted a lot of attention after a study by Guzzo et al. (2008) using the VIMOS-VLT Deep Survey (VVDS) measured the anisotropy parameter \(\beta (z=0.77) = 0.70 \pm 0.26\), which corresponds to a growth rate of structure \(f(z=0.77) = 0.91 \pm 0.36\), consistent with GR and \(\varLambda \)CDM but with errors large enough to leave room for other possibilities. We present recent constraints from RSD on gravity in Sect. 6.2.

4.4 Cosmic microwave background radiation

This relic radiation that we call the CMB is among the most powerful cosmological probes. Not only does it constrain the background geometry (as discussed in Sect. 4.1.2) but it also constrains the growth of structure in the universe. The information in the CMB is encoded in temperature and polarization power spectra. These spectra have primary anisotropies that were imprinted at the surface of last scattering and also secondary anisotropies that arise later, while the CMB photons are traveling through the intervening medium.

CMB spectra provide, via their primary anisotropies, a powerful probe of the early universe to constrain cosmological parameters. They are complementary to other geometry probes such as supernovae and BAO that probe later times. The CMB by itself can already tightly constrain background parameters such as the Hubble constant, the matter density and the effective dark energy density parameters. In combination with other probes, it can also tightly constrain an effective dark energy equation of state.

Most relevant to dark energy and modifications to GR at cosmological scales are the secondary anisotropies that constrain scalar mode perturbations and the growth of large-scale structure. These are the Integrated Sachs–Wolfe (ISW) effect, which affects the spectrum at small multipoles (large angular scales) (Sachs and Wolfe 1967; Kofman and Starobinskij 1985), lensing of the CMB (Blanchard and Schneider 1987; Cole and Efstathiou 1989; Linder 1997; Seljak 1996), which affects the spectrum progressively at high multipoles (small angular scales), and the Sunyaev–Zel’dovich (SZ) effect at even higher multipoles (smaller angular scales). We review the first two effects in the next sub-sections.

Finally, it is worth mentioning a general practice in analyses where CMB geometry constraints are compared to growth constraints: the spectra are split into low and high multipoles. Low multipoles (\(\ell < 30\)) are used to constrain the growth, while the higher multipoles (\(30 \le \ell \le 2508\)) are more sensitive to the background geometry via the position of the acoustic peaks and are used to constrain it.

4.4.1 Integrated Sachs–Wolfe (ISW) effect

The Integrated Sachs–Wolfe (ISW) effect is a secondary anisotropy in the CMB temperature fluctuations that is caused by time variations in the gravitational potentials (Sachs and Wolfe 1967; Kofman and Starobinskij 1985; Rees and Sciama 1968). In this review, we focus on the late-time ISW that can be caused by a Dark Energy component or a modification to gravity that can affect the evolution of the potentials associated with large-scale structures and voids. Namely, CMB photons traveling to us encounter potential wells due to large structures. They gain energy while falling into a potential well and lose it again while climbing out, except for a small difference that remains because the well was stretched (made shallower) by repulsive Dark Energy or modified gravity during the photons’ journey through it. This results in a net gain in energy for the photons coming out of the potential well. The opposite happens to photons that travel across large voids (potential hills), causing a net loss in their energy. The effect is given by

$$\begin{aligned} \frac{\delta T}{T}(\hat{n})= -\int ^{\eta _{*}}_{\eta _{0}}d\eta \frac{\partial (\varPsi +\varPhi )}{\partial \eta } \end{aligned}$$

where T is the CMB temperature, \(\eta _{*}\) is the conformal time at the CMB last-scattering surface and \(\eta _{0}\) is the conformal time at the observer. We note that spatial curvature can also cause such a variation (Kamionkowski 1996) but we assume here spatial flatness in accordance with current observational constraints.
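The sign argument above can be checked on a toy configuration. The sketch below is only an illustration under stated assumptions, not a cosmological calculation: it assumes a single Gaussian potential well with \(\varPsi =\varPhi \), a linear decay of its depth in conformal time, and a photon crossing the centre of the well at \(\eta =0\) with \(c=1\). The function name `isw_shift` and all its parameters are hypothetical. For this toy model the line-of-sight integral has the closed form \(2A\epsilon \sqrt{2\pi }\), which the numerical integral should reproduce.

```python
import math


def isw_shift(amplitude, decay_rate, sigma=1.0, half_width=8.0, n=4000):
    """Toy ISW temperature shift dT/T along one line of sight.

    Assumed (hypothetical) potential:
        Psi = Phi = -amplitude * g(eta) * exp(-x**2 / (2 sigma**2)),
        g(eta) = 1 - decay_rate * eta / sigma,
    so amplitude > 0 is a well, amplitude < 0 a hill (void), and
    decay_rate > 0 makes the potential shallower with time.
    The photon follows x(eta) = eta and the integral runs forward in
    conformal time, i.e. from emission towards the observer.
    """

    def d_weyl_sum_d_eta(x):
        # Partial time derivative of (Psi + Phi) at fixed position x.
        g_prime = -decay_rate / sigma
        return -2.0 * amplitude * g_prime * math.exp(-x * x / (2.0 * sigma**2))

    start = -half_width * sigma
    step = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n + 1):
        eta = start + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoidal rule
        total += weight * d_weyl_sum_d_eta(eta) * step
    return total


well = isw_shift(amplitude=1e-5, decay_rate=0.1)    # decaying well
void = isw_shift(amplitude=-1e-5, decay_rate=0.1)   # decaying hill
static = isw_shift(amplitude=1e-5, decay_rate=0.0)  # no time evolution
```

In this toy setup a decaying well gives a positive shift (net energy gain), a decaying hill gives a negative shift, and a static potential gives no effect, in line with the description above.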

The ISW effect modifies the CMB temperature power spectrum at the largest angular scales with multipoles \(\ell \le 10\) affecting the height of the left tail of the spectrum. The first detections of the ISW effect were done by cross-correlating the WMAP CMB temperature data with galaxy density surveys, see for example Boughn and Crittenden (2004), Fosalba et al. (2003), Nolta et al. (2004), Corasaniti et al. (2005), Padmanabhan et al. (2005), Vielva et al. (2006) and Giannantonio et al. (2012) and later on by cross-correlating Planck with large scale structure data (Ade et al. 2014b, 2016d). Other methods using stacking of CMB fields at coordinates coinciding with known superstructures have also led to detection, see for example Granett et al. (2008), Pápai et al. (2011) and Ade et al. (2014b). The ISW was also detected through the ISW-lensing bispectrum using Planck data only (Ade et al. 2016d).

By changing the gravitational potentials [as in (88)] and their time evolution (growth), MG models affect the ISW and change the very-left end of the CMB power spectrum. We reproduce Fig. 2 from Barreira et al. (2014a) (see Fig. 4 here) where the top panel shows the ISW effect for various Galileon MG models (see Sect. 7.3.1) and the \(\varLambda \)CDM model augmented by massive neutrinos. As we discuss further in Sect. 9.7, such an effect played a major role in ruling out the cubic Galileon models and putting very stringent constraints on the quartic and quintic ones. It is worth noting though that since the ISW effect enters only on the largest angular scales, its constraining power is limited by cosmic variance. However, cross-correlating CMB with large-scale structure tracers such as galaxies enhances its measurement significance and usefulness as we listed above.

As we describe further below, the ISW effect has been used extensively to constrain deviations from GR in conjunction with other data sets and plays a central role in obtaining such constraints.

4.4.2 CMB lensing

Just as in cosmic shear, CMB photons traveling to us from the surface of last scattering are subject to deflections by large-scale structure and mass concentrations along the intervening medium. These deflections change the trajectories of photons and affect the observed CMB temperature and polarization maps in the form of very small distortions that can be statistically collected and analyzed from high-precision CMB experiments (Blanchard and Schneider 1987; Cole and Efstathiou 1989; Linder 1997; Seljak 1996). This lensing smears out the CMB temperature power spectrum and produces non-Gaussianities in the temperature and polarization maps, generating 3- and 4-point correlations (Bernardeau 1998; Zaldarriaga and Seljak 1999; Okamoto and Hu 2003), and converting E-mode polarization of the CMB photons into lensing B-mode (Zaldarriaga and Seljak 1998). CMB lensing and its effects have been measured by various experiments (Hanson et al. 2013; van Engelen et al. 2012; Keisler et al. 2015; Ade et al. 2014f, g, e; van Engelen et al. 2015; Das et al. 2011; Ade et al. 2014d, 2016c). For example, Planck-2015 measured the CMB lensing potential at an overwhelming 40-\(\sigma \) significance (Ade et al. 2016c).

Fig. 4

Image reproduced with permission from Barreira et al. (2014a), copyright by APS

These plots illustrate the differences between \(\varLambda \mathrm {CDM}\) and Galileon models (see Sect. 7.3.1), with and without massive neutrinos. The Galileon models have background Friedmann equations that contain a scalar-field energy density contribution that generates late time cosmic acceleration and has an evolution consistent with observations and thus similar to that of a \(\varLambda \mathrm {CDM}\) model. The Galileon scalar field here also affects linear perturbations and is not coupled to matter. The effect of the Galileon field considered here is focused on large-scale structure. Top: CMB temperature power spectra showing the ISW effect at low multipoles. Middle: CMB lensing potential spectra. Bottom: linear matter power spectra. The dashed lines indicate the best-fit models to the Ade et al. (2014c) temperature data, WMAP9 polarization data (Hinshaw et al. 2013), and Planck-2013 CMB lensing (Ade et al. 2014d). The authors denote these as PL models. The solid lines indicate their best fits to CMB data (i.e., PL) plus BAO measurements from 6dF, SDSS DR7 and BOSS DR9. They denote these as PLB models. The models correspond to the best-fitting base Galileon modified gravity model (in blue), \({\nu } {{\mathrm{Galileon}}}\) (in red) and \({\nu } \varLambda {{\mathrm{CDM}}}\) (in green). For the last two models, the authors added massive neutrinos. In the upper and middle panels, the data points show the power spectrum measured by the Planck satellite (Ade et al. 2014c). In the lower panel, the data points show the SDSS-DR7 Luminous Red Galaxy power spectrum of Reid et al. (2010), but scaled down to match the amplitude of the best-fitting \({\nu } {{\mathrm{Galileon}}}\) (PLB) model (Barreira et al. 2014a). We refer to this figure from various parts of the text

These deflections and the resulting observed lensed CMB are sensitive to the distribution and growth rate of large-scale structures and their associated gravitational potential. Modifications to the gravitational potential due to deviations from general relativity are thus reflected in the CMB lensing and can be used to constrain MG parameters and models.

CMB lensing can be understood as a remapping of CMB temperature (or polarization) as follows. The lensed CMB temperature, noted as \(\tilde{T}({\hat{\mathbf {n}}})\) in a direction \({\hat{\mathbf {n}}}\), is given by the unlensed temperature, \(T({\hat{\mathbf {n}}}') = T({\hat{\mathbf {n}}}+ {\varvec{\alpha }})\) in the deflected direction \({\hat{\mathbf {n}}}'={\hat{\mathbf {n}}}+ {\varvec{\alpha }}\). \({\varvec{\alpha }}\) is the deflection angle that is expressed at lowest order as \({\varvec{\alpha }}= \nabla \psi _{_{\mathrm{L}}}\) where \(\psi _{_{\mathrm{L}}}\) is the lensing potential, see e.g., Lewis and Challinor (2006). The latter is the result of an integration along the line of sight of the gravitational potential from the surface of last scattering all the way to us as an observer, that is

$$\begin{aligned} \psi _{_{\mathrm{L}}}({\hat{\mathbf {n}}}) \equiv -2 \int _0^{\chi _*} {\text {d}}\chi \, \frac{f_K(\chi _*-\chi )}{f_K(\chi _*)f_K(\chi )} \varPsi _{\mathrm{w}}(\chi {\hat{\mathbf {n}}}; \tau _0 -\chi ), \end{aligned}$$

where \(\chi _*\) is the conformal distance to the surface of last scattering; \(\tau _0 -\chi \) is the conformal time at which the photon was at position \(\chi {\hat{\mathbf {n}}}\); and \(\varPsi _{\mathrm{w}}(\chi {\hat{\mathbf {n}}}; \tau )\equiv (\varPsi +\varPhi )/2\) is the Weyl gravitational potential at conformal distance \(\chi \), in direction \({\hat{\mathbf {n}}}\), and at conformal time \(\tau \).

Following a similar procedure as in Sect. 4.2, the power spectrum of the CMB lensing potential, for a spatially flat cosmology and in the Limber approximation (Limber 1953) is given as (see, e.g., Lewis and Challinor 2006)

$$\begin{aligned} C_l^{\psi _{_{\mathrm{L}}}\psi _{_{\mathrm{L}}}} = \frac{8\pi ^2}{l^3} \int _0^{\chi _*} \chi {\text {d}}\chi \, \mathcal {P}_\varPsi (l/\chi ;\tau _0-\chi ) \left( \frac{\chi _*-\chi }{\chi _*\chi }\right) ^2. \end{aligned}$$
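As a rough illustration of how the Limber expression above is evaluated in practice, the following sketch integrates it with a midpoint rule for a user-supplied Weyl-potential power spectrum. Everything here is an assumption made for illustration: the function name `lensing_cl`, the callable `weyl_power(k, chi)` (where the comoving distance \(\chi \) stands in for the epoch \(\tau _0-\chi \)), and the toy power-law spectrum used in the check, which is chosen only because it makes the integral analytic. A realistic analysis would take \(\mathcal {P}_\varPsi \) from an Einstein–Boltzmann solver.

```python
import math


def lensing_cl(ell, chi_star, weyl_power, n=2000):
    """Limber lensing-potential spectrum for a spatially flat cosmology.

    weyl_power(k, chi) is a hypothetical callable returning the Weyl
    potential power spectrum at wavenumber k = ell / chi, evaluated at
    the epoch when the photon was at comoving distance chi.
    A midpoint rule is used so that chi = 0 is never evaluated.
    """
    step = chi_star / n
    total = 0.0
    for i in range(n):
        chi = (i + 0.5) * step
        kernel = ((chi_star - chi) / (chi_star * chi)) ** 2
        total += chi * weyl_power(ell / chi, chi) * kernel * step
    return 8.0 * math.pi**2 / ell**3 * total


# Toy spectrum P(k) = P0 * k0 / k (illustrative values, not a physical model).
P0, K0 = 1e-9, 0.05
CHI_STAR = 14000.0  # assumed distance to last scattering, arbitrary units


def toy_power(k, chi):
    return P0 * K0 / k


cl_numeric = lensing_cl(100, CHI_STAR, toy_power)
# For this toy spectrum the integral reduces to chi_star / 3:
cl_exact = 8.0 * math.pi**2 * P0 * K0 * CHI_STAR / (3.0 * 100**4)
```

The geometric kernel \(\left[ (\chi _*-\chi )/(\chi _*\chi )\right] ^2\) vanishes at the last-scattering surface and grows towards the observer, so the result depends on how fast the supplied spectrum falls off at the high wavenumbers \(k=l/\chi \) probed at small \(\chi \).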

The lensing potential power spectrum probes the matter power spectrum and its evolution and is thus sensitive to its amplitude and growth, and to how modifications to GR affect these quantities. In particular, it is very sensitive to modifications of the second perturbed equation, Eq. (30). As an illustration, we reproduce Fig. 2 from Barreira et al. (2014a) (Fig. 4 here), where the middle panel shows the CMB lensing power spectra for Galileon MG models (see Sect. 7.3.1) versus the \(\varLambda \)CDM model plus massive neutrinos.

It is worth pointing out the work of Hojjati and Linder (2016), where the authors showed that CMB lensing will be particularly useful in constraining modified gravity models, massive neutrino models, or other new physical models that are scale dependent. Such signatures will show up in the CMB lensing power spectrum and provide an additional means to constrain MG models and other models beyond wCDM. They show that the shapes of the deviations of the CMB lensing power spectra from that of a \(\varLambda \)CDM model are fairly distinct between the various scale-dependent physical origins. They highlight that arcminute resolution polarization experiments such as ACTpol, POLARBEAR/Simons Array, and SPT-3G, as well as the next generation CMB-S4, will be able to distinguish between these models.

A number of analyses of CMB lensing have already provided useful constraints on various cosmological parameters, see for example Hanson et al. (2013), van Engelen et al. (2012), Keisler et al. (2015), Ade et al. (2014f, 2014g, 2014e), van Engelen et al. (2015), Das et al. (2011) and Ade et al. (2014d, 2016c). We provide in Sect. 6 various constraints on deviations from GR and on MG models based on CMB lensing.

5 Formalisms and approaches to testing GR at cosmological scales

Modifications to GR at cosmological scales have been often proposed at the level of the action and its Lagrangian or at the level of the perturbed Einstein’s equations. Accordingly, formalisms for deviations from GR in this context have been developed at these two levels as we discuss in the following sub-sections.

5.1 Effective field theory (EFT) approach to dark energy and modified gravity

The EFT approach to dark energy and modified gravity is often referred to as a “Unified” approach to dark energy since it includes in its action a broad spectrum of single field scalar–tensor dark energy and modified gravity models. It was applied first to inflation models using a Lagrangian derived from an EFT expansion (Cheung et al. 2008) and then to dark energy by for example Gubitosi et al. (2013), Bloomfield et al. (2013), Gleyzes et al. (2013) and Creminelli et al. (2009).

The approach is based on constructing a Lagrangian that includes the scalar terms for a perturbed FLRW metric, assuming a single-field dark energy model, with operators up to a given dimension that are invariant under spatial diffeomorphisms. This EFT formulation is also done in the unitary gauge where the foliations of constant time coincide with the hypersurfaces of uniform scalar field. This gauge allows one to write the action only in terms of the metric and its derivatives with no scalar field perturbations appearing there; however, it brings the limitation of a background-dependent EFT approach compared with the covariant EFT approach of, e.g., Weinberg (2008) and Bloomfield and Flanagan (2012). The action that satisfies the above restrictions, is expanded up to quadratic order in the perturbations, and contains only operators that lead to at most second-order equations of motion, takes the following form in the Jordan frame (Gubitosi et al. 2013; Bloomfield et al. 2013; Gleyzes et al. 2013; Creminelli et al. 2009):

$$\begin{aligned} S= & {} \int d^4x \sqrt{-g} \bigg \{ \frac{m_0^2}{2} \varOmega (t) R+ \varLambda (t) - c(t) \delta g^{00} \nonumber \\&+\, \frac{M_2^4 (t)}{2} (\delta g^{00})^2 - \frac{\bar{M}_1^3 (t)}{2} \delta g^{00} \delta K^\mu _\mu - \frac{\bar{M}_2^2 (t)}{2} (\delta K^\mu _\mu )^2 \nonumber \\&-\, \frac{\bar{M}_3^2 (t)}{2} \delta K^i_j \delta K^j_i + \frac{\hat{M}^2 (t)}{2} \delta g^{00} \delta R^{(3)} \nonumber \\&+ \,m_2^2(t)\left( g^{\mu \nu }+n^{\mu } n^{\nu }\right) \partial _{\mu }(g^{00})\partial _{\nu }(g^{00}) \bigg \}\nonumber \\&+\, S_{m} [g_{\mu \nu }, \chi _i] \ , \end{aligned}$$

where \(m_0\) is the reduced Planck mass, with \(m_0^{-2} = 8\pi G\); \(\delta g^{00}\) is the perturbation of the time-time component of the inverse metric; \(\delta {K}{^\mu _\nu }\) and \(\delta K\) are the perturbations of the extrinsic curvature and its trace; \(\delta R^{(3)}\) is the perturbation of the three dimensional spatial Ricci scalar of constant-time hypersurfaces; \(n^{\mu }\) is the 4-vector normal to the constant-time hypersurfaces; and \(S_m\) is the action for all matter fields \(\chi _i\) minimally coupled to the metric \(g_{\mu \nu }\).

The coefficients \(M^i_j(t)\) are functions of time and have dimensions of mass. The functions c(t) and \(\varLambda (t)\) (not to be confused with the cosmological constant) can be re-expressed in terms of the function \(\varOmega (t)\) and background functions such as the Hubble and density parameters by using the FLRW background evolution equations. Thus, the theories covered by action (84) can be specified by the following 7 functions of time:

$$\begin{aligned} \left\{ \varOmega ,\bar{M}_1^3,\bar{M}_2^2,\bar{M}_3^2,M_2^4,\hat{M}^2, m_2^2\right\} \end{aligned}$$

plus one function describing the background evolution such as the Hubble function.

It is worth mentioning that the EFT approach covers both the background evolution and the linear perturbations of the metric so it provides equations and parameterization that can be compared to the background evolution as well as the growth of large-scale structure observations. However, in order to compare effectively the whole set to observations, one needs to do further useful parameterizations of the functions (85). For example, for Horndeski models (Horndeski 1974), these functions are mapped to the so-called \(\alpha _x\) parameterization (Bellini and Sawicki 2014) which is then connected to the physical aspect of the theory as we discuss in Sect. 7.3.1 further below. See also another informative reconstruction of Horndeski from EFT of dark energy in Kennedy et al. (2017).

Table 1 The EFT formalism covers a number of different theories of dark energy and modified gravity

The EFT action (84) is general enough to include broad classes of dark energy and modified gravity models such as the Horndeski (1974) or generalized Galileon models (Deffayet et al. 2009a), beyond Horndeski models (Zumalacarregui and García-Bellido 2014; Gleyzes et al. 2015a, b), Hořava–Lifshitz gravity in its low energy limit (Hořava 2009b; Kase and Tsujikawa 2014), ghost condensate models (Hamed et al. 2004), and DGP braneworld models (Dvali et al. 2000). We reproduce Table I from Linder et al. (2016) (Table 1 here) that shows the list of the function parameters (85), the corresponding terms in the Lagrangian operators of the action (84), and some gravity theories with the terms they involve from the EFT Lagrangian.

While the EFT approach can be praised for its clear theoretical motivation and systematic nature, it has the disadvantage of requiring the use of a large number of parameters and functions. This number overwhelms the limited constraining power of current cosmological data. Nevertheless, some of the coefficients can be set to zero or can be shown to be interrelated in the case of some known dark energy or modified gravity models so one can reduce the number of parameters to a practical one. This of course affects the primary motivation of the EFT approach in providing a systematic method, but the hope is that as more orthogonal and precise data sets become available in the future this method will reach its intended goals. Also, the effectiveness of the EFT approach was questioned in Linder et al. (2016), who stated that the EFT functions used do not have a simple time dependence that can be fit to observations for different cosmic eras; as they note, however, one can nevertheless gain some general characteristics of such dependencies for early and late time limits of cosmic evolution.

Most recently, Lagos et al. (2016, 2018) followed on a previous effort of the Parameterized-Post-Friedmann formalism of Baker et al. (2013) in order to extend the EFT formalism to cover beyond scalar–tensor theories. The general approach they proposed recovers the standard \(\alpha \)-parameterization of Bellini and Sawicki (2014) for Horndeski models (see Sect. 7.3.1) but also applies to beyond-Horndeski models, vector–tensor theories, and tensor–tensor theories. In each of the more complicated theories, the formalism considers a few additional \(\alpha _x\)-parameters for up to 12 parameters in the most general case. We refer the reader to their papers for more information.

Due to its broad application, the EFT approach has been implemented in several Einstein–Boltzmann solvers and Markov-Chain–Monte-Carlo codes to analyze CMB and other datasets, see for example Hu et al. (2014b), Bellini et al. (2018) and references therein, as well as our discussion in Sect. 11.

5.2 Modified growth parameters

We discussed in Sect. 3.2 how the growth of large-scale structure can be described by the two equations (29) and (30) derived from linear perturbations of Einstein’s field equations. The effect of deviations from GR on the growth of large-scale structure can be encapsulated in two parameters added to these equations, which are then often called the modified growth or modified gravity (MG) equations. Usually, one of the MG parameters modifies the coupling between the gravitational potential and the energy-density source while the other parameter quantifies the difference between the two gravitational potentials. There are various related parameterizations and notations, and we review some of the most commonly used ones in the literature.

One pair of such parameters is given by \(Q(k,a)\) and \(R(k,a)\) as follows, see e.g., Caldwell et al. (2007), Amendola et al. (2008) and Bean and Tangmatitham (2010):

$$\begin{aligned} k^2{\varPhi }= & {} -4\pi G a^2\sum _i {\bar{\rho }}_i \delta _i \, Q(k,a) \end{aligned}$$
$$\begin{aligned} k^2(\varPsi -R(k,a)\,\varPhi )= & {} -12 \pi G a^2\sum _i {\bar{\rho }}_i(1+w_i)\sigma _i \, Q(k,a), \end{aligned}$$

where each matter species is denoted by the index i, \({\bar{\rho }}_i\) is the corresponding mass-energy density, \(\delta _i\) is the rest-frame overdensity, and \(\sigma _i\) is the shear stress. \(Q(k,a)\) and \(R(k,a)\) are scale and time dependent and both take the value of unity in GR.

The parameter \(Q(k,a)\) represents a modification to the “Poisson equation” (29) (see comments in Dossett et al. 2011b), while the parameter \(R(k,a)\) quantifies the inequality between the two potentials, referred to as the gravitational slip (Caldwell et al. 2007) (at late times, when anisotropic stress is negligible, Eq. (87) gives \(R=\varPsi \!/\!\varPhi \)). Caldwell et al. (2007) wrote the slip as \(\varPsi = (1 + \varpi )\varPhi \) based on a cosmological extension to the PPN formalism, see e.g., Will (2014).

In order to avoid a strong degeneracy between the parameters \(Q(k,a)\) and \(R(k,a)\), Eqs. (86) and (87) can be combined to introduce another MG parameter as follows (see, e.g., Amendola et al. 2008):

$$\begin{aligned} k^2(\varPsi +\varPhi ) = -8\pi G a^2\sum _i {\bar{\rho }}_i \delta _i \,\varSigma (k,a)\, -12 \pi G a^2\sum _i {\bar{\rho }}_i (1+w_i)\sigma _i \, Q(k,a), \end{aligned}$$


$$\begin{aligned} \varSigma (k,a) \equiv \frac{Q(k,a)[1+R(k,a)]}{2}. \end{aligned}$$

The parameter \(\varSigma (k,a)\) enters the equation for the Weyl potential defined earlier (i.e., \(\varPsi _{\mathrm{w}}\equiv (\varPsi +\varPhi )/2\)) which affects the propagation of light. The parameter is thus directly constrained by some observations such as weak gravitational lensing. Just like the parameters Q and R, \(\varSigma \) takes unity in general relativity.

A second pair of MG parameters often used in the literature modifies Eq. (29) indirectly, by defining a modified field equation containing the parameter \(\mu (k,a)\), together with a gravitational slip parameter, \(\eta (k,a)\) (Zhao et al. 2009, 2010; Hojjati et al. 2011; Caldwell et al. 2007; Amendola et al. 2008). The modified growth equations then read:

$$\begin{aligned} k^2\varPsi= & {} -4\pi G a^2\sum _i {\bar{\rho }}_i \delta _i \, \mu (k,a). \end{aligned}$$
$$\begin{aligned} \frac{\varPhi }{\varPsi }= & {} \eta (k,a). \end{aligned}$$

The generalization of these two equations for non-zero shear can be found in, for example, equations (13) and (14) of Hojjati et al. (2011). Again, \(\varSigma (k,a)\) is defined from their combination as

$$\begin{aligned} \varSigma (k,a)\equiv \frac{\mu (k,a)[1+\eta (k,a)]}{2} \end{aligned}$$

Similarly, these parameters are scale and time dependent and take the value of unity in GR.

A third notation is one that associates MG parameters with effective gravitational constants in the growth equations (see, e.g., Tsujikawa 2007; Song and Koyama 2009; Linder 2017) so that the modified Poisson equations take the form

$$\begin{aligned} k^2\varPsi= & {} -4\pi G_{\mathrm{eff}}^{\varPsi } a^2\sum _i {\bar{\rho }}_i \delta _i \end{aligned}$$
$$\begin{aligned} k^2(\varPsi +\varPhi )= & {} -8\pi G_{\mathrm{eff}}^{\varPsi +\varPhi } a^2\sum _i {\bar{\rho }}_i \delta _i. \end{aligned}$$

Equation (93) governs the coupling between the gravitational potential for non-relativistic particles to the source density fluctuation while Eq. (94) governs the coupling of the gravitational potential for relativistic particles to the source density fluctuation and affects geodesics of relativistic particles such as light propagation and gravitational lensing. Often \(G_{\mathrm{eff}}^{\varPsi }\) is dubbed as \(G_\mathrm{matter}\) and \(G_{\mathrm{eff}}^{\varPsi +\varPhi }\) as \(G_\mathrm{light}\).

It is worth concluding this sub-section by providing the relationships between the different parametrizations above, valid during matter domination and assuming zero anisotropic stress:

$$\begin{aligned} \mu =Q R = \frac{G_{\mathrm{eff}}^{\varPsi }}{G}=\frac{G_\mathrm{matter}}{G},&\quad \eta = \frac{1}{R} \end{aligned}$$
$$\begin{aligned} \varSigma = \frac{Q(1+R)}{2} =\frac{G_{\mathrm{eff}}^{\varPsi +\varPhi }}{G}= \frac{G_\mathrm{light}}{G}&\quad \mu \eta = Q . \end{aligned}$$

A more extended discussion of the relationship between MG parameters can be found in Daniel et al. (2010).
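The relationships above can be collected in a small conversion helper. This is a minimal sketch that assumes, as stated in the text, negligible anisotropic stress; the class `MGParams` and the two constructor functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MGParams:
    """One point (fixed k and a) in the MG parameter space."""
    Q: float      # modification of the Poisson equation for Phi
    R: float      # gravitational slip, R = Psi / Phi
    mu: float     # G_matter / G
    eta: float    # Phi / Psi
    sigma: float  # G_light / G


def from_Q_R(Q, R):
    """Build all parameters from the (Q, R) pair."""
    if R == 0:
        raise ValueError("R must be non-zero to define eta = 1/R")
    return MGParams(Q=Q, R=R, mu=Q * R, eta=1.0 / R, sigma=0.5 * Q * (1.0 + R))


def from_mu_eta(mu, eta):
    """Build all parameters from the (mu, eta) pair."""
    if eta == 0:
        raise ValueError("eta must be non-zero to define R = 1/eta")
    return MGParams(Q=mu * eta, R=1.0 / eta, mu=mu, eta=eta,
                    sigma=0.5 * mu * (1.0 + eta))


gr = from_Q_R(1.0, 1.0)           # every parameter equals unity in GR
p = from_Q_R(1.2, 0.9)            # an arbitrary illustrative deviation
q = from_mu_eta(p.mu, p.eta)      # round trip through (mu, eta)
```

The round trip returns the same \(Q\), \(R\) and \(\varSigma \), which is a quick check that the two definitions of \(\varSigma \) agree once \(\mu =QR\) and \(\eta =1/R\).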

Finally, on super-horizon scales \(k\ll a H\) and for adiabatic perturbations, there are further useful constraints from coordinate invariance that apply to GR and also to MG theories (Bertschinger 2006). These provide a consistency relation between the two gravitational potentials, which reduces the two independent functions (MG parameters) above to only one. The consistency relation plus the MG parameter \(\eta (a)\) can be used to characterize deviations from GR at super-horizon scales. In other words, at these long wavelengths, \(\eta (a)\) is the only important degree of freedom for modified gravity (Bertschinger 2006; Bertschinger and Zukin 2008; Hu and Sawicki 2007b).

5.3 Evolution of MG parameters in time and scale

Departures from general relativity can evolve in time and/or scale and this has been included in parametrizations and studies. Mainly two approaches have been used in doing so. The first method employs generic functional forms, while the second uses binning in redshift and scale. A third method combines the two previous ones into a hybrid method.

  • Functional forms for time and scale evolution: For example, Bean and Tangmatitham (2010) used:

    $$\begin{aligned} X(k,a) = \left[ X_0 e^{-k/k_c}+X_\infty (1-e^{-k/k_c})-1\right] a^s +1, \end{aligned}$$

    where X denotes, for example, Q or R. \(Q_0\) and \(R_0\) are the present-day asymptotic superhorizon values while \(Q_\infty \) and \(R_\infty \) are the present-day asymptotic subhorizon values of \(Q(k,a)\) and \(R(k,a)\). \(k_c\) is a comoving transition scale. The time evolution is given by \(a^s\). It was noted though in, for example, Zhao et al. (2010), Song et al. (2011) and Dossett et al. (2011a), that such a functional form makes the MG parameters depend too strongly on the exponent s and can exacerbate tensions between GR and data (Dossett et al. 2011a). It was found in these papers that a binning method in redshift avoids this problem. The model parameters that can be used to detect deviations from GR are now: \(Q_0\), \(R_0\), \(Q_\infty \), \(R_\infty \), \(k_c\), and s. The parameters s and \(k_c\) take the values \(s=0\) and \(k_c=\infty \) in GR and the other parameters reduce to unity. The constraints on \(\varSigma (k,a)\) can then be derived using Eq. (89).

    In a similar way, the parameters \(\mu \) and \(\eta \) have also been allowed to evolve, for example, in redshift. In Dossett et al. (2011a), the two parameters take constant values, \(\mu _0\) and \(\eta _0\), below some transition redshift, \(z_s\), and return to the GR value of unity above it, following a hyperbolic tangent function with a transition width, \(\delta z\):

    $$\begin{aligned} \mu (z)= & {} \frac{1-\mu _{0}}{2}\Big (1 + \tanh {\frac{z-z_s}{\delta z}}\Big ) + \mu _{0}, \end{aligned}$$
    $$\begin{aligned} \eta (z)= & {} \frac{1-\eta _0}{2}\Big (1 + \tanh {\frac{z-z_s}{\delta z}}\Big ) + \eta _{0}. \end{aligned}$$

    The parameter \(\varSigma (z)\) then follows from Eq. (96) above.

    Functional forms for MG parameters have been argued to be less flexible than binning or hybrid methods in Dossett et al. (2011a) and Daniel et al. (2010).

  • Time and scale binning method of MG parameters: An example of binning MG parameters in time (redshift) and scale is provided in Dossett et al. (2015). Two scale bins are defined as \(k\le 0.01\,h\) Mpc\(^{-1}\) and \(k>0.01\,h\) Mpc\(^{-1}\). These are crossed with two other bins in redshift defined by \(0<z\le 1\) and \(1<z\le 2\). In order to ensure that the transition between the bins is continuous and that the numerical implementation is stable, the following transition functions have been used:

    $$\begin{aligned}&X(k,a) =\frac{1}{2}\big (1 + X_{z_1}(k)\big )+\frac{1}{2}\big (X_{z_2}(k) - X_{z_1}(k)\big )\nonumber \\&\tanh {\frac{z-1}{0.05}}+\frac{1}{2}\big (1 - X_{z_2}(k)\big )\tanh {\frac{z-2}{0.05}}, \end{aligned}$$


    $$\begin{aligned} X_{z_1}(k)= & {} \frac{1}{2}\big (X_2+X_1\big )+\frac{1}{2}\big (X_2-X_1\big )\tanh {\frac{k-0.01}{0.001}}, \nonumber \\ X_{z_2}(k)= & {} \frac{1}{2}\big (X_4+X_3\big )+\frac{1}{2}\big (X_4-X_3\big )\tanh {\frac{k-0.01}{0.001}}, \end{aligned}$$

    where X takes the values Q or \(\varSigma \) so in this parameterization a total of eight MG parameters are varied, \(\varSigma _i\) and \(Q_i\), \(i=1,2,3,4\). Again, all these parameters take a value of unity in GR.

  • Hybrid methods for MG parameters: Finally, the implementation of MG parameters can be optimized to take advantage of each of the two methods above. For that, hybrid methods have been employed in order to keep a functional form for the scale dependence while using bins of redshift for the time evolution as follows (Dossett et al. 2015). The redshift bins are similarly given by Eq. (100) above while the scale dependence is given the form:

    $$\begin{aligned} X_{z_1}(k)= & {} X_1 e^{-\frac{k}{0.01}}+X_2\left( 1-e^{-\frac{k}{0.01}}\right) , \nonumber \\ X_{z_2}(k)= & {} X_3 e^{-\frac{k}{0.01}}+X_4\left( 1-e^{-\frac{k}{0.01}}\right) . \end{aligned}$$

    This gives again eight MG parameters, \(\varSigma _i\) and \(Q_i\), \(i=1,2,3,4\) to be constrained by observations.

  • f(R) guided time and scale parametrization: Guided by f(R) formalism (see Sect. 7.4.1), Bertschinger and Zukin (2008) suggested a phenomenological time and scale parametrization as follows:

    $$\begin{aligned} \mu (a,k)= & {} \frac{1+\alpha _1k^2a^s}{1+\alpha _2k^2a^s} \end{aligned}$$
    $$\begin{aligned} \eta (a,k)= & {} \frac{1+\beta _1k^2a^s}{1+\beta _2k^2a^s}, \end{aligned}$$

    To construct such a parameterization, the authors required GR to hold at early times, so that \(s>0\). They also noted that this parametrization describes f(R) theories with \(|f_R|\ll 1\) for \(\alpha _1=\frac{4}{3}\alpha _2=2\beta _1=\beta _2=4f_{RR}/a^{2+s}\). \((\alpha _1,\alpha _2,\beta _1,\beta _2)\) are arbitrary constants with \(\alpha _2\) and \(\beta _2\) positive so that \(\mu \) and \(\eta \) remain finite for all k. \(\alpha _1\) must be positive as well to ensure that \(\mu \) is positive and gravity is attractive.

  • Using rational functions of \(k^2\) and five functions of time: Silvestri et al. (2013) showed that for local theories of gravity with one scalar degree of freedom, with up to second-order equations of motion and in the quasi-static approximation, the two MG parameters \(\mu (k,a)\) and \(\eta (k,a)\) can be written as rational functions of \(k^2\) with at most 5 functions of time in all generality as follows:

    $$\begin{aligned} \eta (a,k)= & {} \frac{p_1(a)+p_2(a)k^2}{1+p_3(a)k^2}, \end{aligned}$$
    $$\begin{aligned} \mu (a,k)= & {} \frac{1+p_3(a)k^2}{p_4(a)+p_5(a)k^2}. \end{aligned}$$

    They note that even though this parametrization has been derived in the quasi-static limit, it is expected to work well at near- and super-horizon scales since \(\eta (a,k \rightarrow 0)=p_1(a)\ne 1\). They also note that \(\mu (a,k\rightarrow 0)=1/p_4(a)\ne 1\) should be of no consequence for observables and that super-horizon perturbations will have an evolution consistent with the background expansion (Silvestri et al. 2013). See also discussions of this type of rational function in De Felice et al. (2011) and of higher orders in the wavenumber in Vardanyan and Amendola (2015).
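To make the parametrizations listed above concrete, the sketch below evaluates three of them: the functional form of Bean and Tangmatitham (2010), the binned form with hyperbolic-tangent transitions of Eq. (100), and the \(\mu (a,k)\) of Bertschinger and Zukin (2008). It is an illustration rather than the implementation used in the cited works. The function names are hypothetical, k is assumed to be in units of \(h\) Mpc\(^{-1}\), and the bin edges and transition widths are the values quoted in the text.

```python
import math


def functional_form(k, a, X0, Xinf, k_c, s):
    """X(k,a) = [X0 e^{-k/k_c} + Xinf (1 - e^{-k/k_c}) - 1] a^s + 1."""
    damp = math.exp(-k / k_c)
    return (X0 * damp + Xinf * (1.0 - damp) - 1.0) * a**s + 1.0


def scale_bin(k, X_low, X_high, k_t=0.01, width=0.001):
    """Smooth step in k from X_low (k < k_t) to X_high (k > k_t)."""
    return (0.5 * (X_high + X_low)
            + 0.5 * (X_high - X_low) * math.tanh((k - k_t) / width))


def binned(k, z, X1, X2, X3, X4, dz=0.05):
    """Binned X(k,z): X1, X2 for 0 < z <= 1; X3, X4 for 1 < z <= 2; 1 above."""
    Xz1 = scale_bin(k, X1, X2)  # low-redshift bin
    Xz2 = scale_bin(k, X3, X4)  # high-redshift bin
    return (0.5 * (1.0 + Xz1)
            + 0.5 * (Xz2 - Xz1) * math.tanh((z - 1.0) / dz)
            + 0.5 * (1.0 - Xz2) * math.tanh((z - 2.0) / dz))


def bz_mu(k, a, alpha1, alpha2, s):
    """mu(a,k) = (1 + alpha1 k^2 a^s) / (1 + alpha2 k^2 a^s)."""
    x = k * k * a**s
    return (1.0 + alpha1 * x) / (1.0 + alpha2 * x)


# Each bin recovers its own parameter away from the transitions,
# and GR (unity) is recovered above z = 2.
bins = (1.1, 1.2, 1.3, 1.4)
low_z_large_scale = binned(0.001, 0.5, *bins)   # close to X1
high_z_small_scale = binned(0.1, 1.5, *bins)    # close to X4
beyond_bins = binned(0.1, 3.0, *bins)           # close to 1
```

With the narrow widths used here the transitions behave almost like sharp steps, but they remain differentiable, which is the property the text cites as helpful for numerical stability.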

Table 2 The layout of the binned parametrizations

5.4 The growth index parameter \(\gamma \)

Another approach that uses the linear growth of structure to constrain deviations from general relativity is based on the growth index parameter, defined as follows. In some pioneering early work for a matter-dominated universe, the growth rate f was shown to be well-approximated by the following ansatz (Peebles 1980; Fry 1985; Lightman and Schechter 1990):

$$\begin{aligned} f \equiv \varOmega _m^\gamma \end{aligned}$$

where \(\gamma \) is the growth index parameter. Peebles (1980) introduced the approximation \(f(z=0)\approx \varOmega _0^{0.6}\) for matter dominated models. After that, Fry (1985) and Lightman and Schechter (1990) proposed more accurate approximations for such a model, i.e., \(f(z=0) \approx \varOmega _0^{4/7}\).

Later on, the work was extended to dark energy models (GR-wCDM) with a slowly varying equation of state by Wang and Steinhardt (1998), who derived the following expression:

$$\begin{aligned} \gamma (\varOmega _m,w)=\frac{3(1-w)}{5-6w}+\frac{3}{125} \frac{(1-w)(1-3w/2)}{(1-6w/5)^2(1-12w/5)}(1-\varOmega _m) \end{aligned}$$

with an asymptotic early-time value of \(\gamma _\infty ^{w\mathrm{CDM}}=3(1-w)/(5-6w)\), which reduces for \(w=-1\) to the well-known \(\varLambda \)CDM value of \(\gamma ^{\mathrm{LCDM}}=\frac{6}{11}\approx 0.545\).
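
To illustrate how well the growth index ansatz works, the sketch below integrates the GR linear growth-rate equation for a flat \(\varLambda \)CDM background, \(df/d\ln a = -f^2 - (2-\tfrac{3}{2}\varOmega _m(a))f + \tfrac{3}{2}\varOmega _m(a)\), and compares the result at \(z=0\) with \(\varOmega _m^{\gamma }\). The value \(\varOmega _m^0=0.3\), the starting epoch, and the simple RK4 integrator are illustrative assumptions, and the numerator factor of the correction term is taken here as \((1-3w/2)\), which is our reading of the expression above.

```python
import math


def omega_m(a, om0):
    """Matter density parameter Omega_m(a) for a flat LambdaCDM background."""
    return om0 / (om0 + (1.0 - om0) * a**3)


def growth_rate_numeric(om0, a_start=1e-3, n_steps=2000):
    """RK4 integration of df/dlna = -f^2 - (2 - 1.5*Om) f + 1.5*Om up to a = 1."""
    def rhs(x, f):
        om = omega_m(math.exp(x), om0)
        return -f * f - (2.0 - 1.5 * om) * f + 1.5 * om

    x = math.log(a_start)
    h = -x / n_steps
    f = 1.0  # matter-dominated initial condition (assumption)
    for _ in range(n_steps):
        k1 = rhs(x, f)
        k2 = rhs(x + 0.5 * h, f + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, f + 0.5 * h * k2)
        k4 = rhs(x + h, f + h * k3)
        f += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return f


def gamma_wcdm(om, w):
    """Growth index for GR-wCDM; the (1 - 3w/2) factor is an assumed reading."""
    lead = 3.0 * (1.0 - w) / (5.0 - 6.0 * w)
    corr = (3.0 / 125.0) * (1.0 - w) * (1.0 - 1.5 * w) / (
        (1.0 - 1.2 * w) ** 2 * (1.0 - 2.4 * w))
    return lead + corr * (1.0 - om)


if __name__ == "__main__":
    om0 = 0.3  # illustrative value
    f_num = growth_rate_numeric(om0)
    f_fit = om0 ** gamma_wcdm(om0, -1.0)
    print(f"numerical f(z=0) = {f_num:.4f}, Omega_m^gamma = {f_fit:.4f}")
```

The comparison is meant only as a consistency check of the ansatz for one GR background; it does not replace the fits performed in the cited works.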

Linder (2005) extended this growth index approach to modified gravity theories and pointed out that it can be used as a discriminator between quintessence dark energy models and modified gravity models. For example, the DGP model (see Sect. 7.5.2) has a growth index parameter of \(\gamma ^{\mathrm{DGP}}=\frac{11}{16}\approx 0.69\) (Lue et al. 2004; Linder 2005) and is thus clearly distinct from the value of the \(\varLambda \)CDM model. Indeed, despite some dispersion of \(\gamma ^{w\mathrm{CDM}}\) for various values of w and also some dispersion of \(\gamma ^{\mathrm{DGP}}\) for various values of \(\varOmega _{m}(a)\), the two ranges do not overlap and \(\gamma \) remains a good discriminator for gravity theories; see e.g., Linder and Cahn (2007), Gong (2008), Polarski and Gannouji (2008) and Ishak and Dossett (2009) for spatially flat models and Gong et al. (2009) and Mortonson et al. (2009) for curved models.

Moreover, the growth index can be allowed to vary with redshift, which provides more stringent constraints on gravity theories (Polarski and Gannouji 2008; Ishak and Dossett 2009). For example, Polarski and Gannouji (2008) proposed a redshift-dependent parameterization of the form

$$\begin{aligned} \gamma (z)=\gamma _0 + \gamma '\,z, \end{aligned}$$

where \(\gamma '\equiv \frac{d\gamma }{dz}(z=0)\). The study showed the usefulness of a variable growth index to distinguish between dark energy models and modified gravity models (Polarski and Gannouji 2008). Ishak and Dossett (2009) and Wu et al. (2009) proposed redshift-dependent parameterizations that cover a wide range of redshifts, highlighting that the sign of the slope of \(\gamma (z)\) can provide further discrimination between gravity theories.

5.5 The \(E_G\)-parameter test

Zhang et al. (2007) proposed a measure they called \(E_G\) to test deviations from GR’s gravitational potentials in a way that is insensitive to the galaxy bias. The idea is to use a ratio of the galaxy–galaxy lensing angular cross power spectrum over the velocity–galaxy cross power spectrum. We use here a mixture of notation from Zhang et al. (2007) and Leonard et al. (2015) to describe this quantity. The corresponding estimator was defined in the original paper (Zhang et al. 2007) as

$$\begin{aligned} \hat{E}_{G}(\ell , \delta \ell )=\frac{C_{\kappa g}(\ell , \delta \ell )}{3H_{0}^2a^{-1}\sum \limits _{\alpha } f_{\alpha }(\ell , \delta \ell )P^{\alpha }_{vg}}, \end{aligned}$$

where \(C_{\kappa g}(\ell , \delta \ell )\) is the galaxy–galaxy lensing cross-power spectrum in bins of \(\delta \ell \); \(P^{\alpha }_{vg}\) is the galaxy–velocity cross-power spectrum between \(k_{\alpha }\) and \(k_{\alpha +1}\); and \(f_{\alpha }(\ell , \delta \ell )\) is a weighting function defined accordingly. The corresponding expectation value is then given by:

$$\begin{aligned} E_{G}(\ell )=\left[ \frac{\nabla ^2(\varPsi +\varPhi )}{3H_0^2a^{-1}f\delta _M}\right] _{k=\ell /\bar{\chi },\bar{z}} \end{aligned}$$

where f is the linear growth rate of structure, \(\delta _M\) is the matter overdensity field, and \(\bar{\chi }\) is the comoving distance corresponding to redshift \(\bar{z}\). For GR \(\varLambda \)CDM, \(E_G\) is independent of length scale and is given by (Zhang et al. 2007)

$$\begin{aligned} E_{G}=\frac{\varOmega _M(z=0)}{f(z)}. \end{aligned}$$

The scale independence holds for wCDM models with a large sound speed and negligible anisotropic stress, such as quintessence. It also holds for some modified gravity models like DGP (see Sect. 7.5.2) but not for other MG models. The scale dependence of \(E_G\) can be used as a further discriminator between MG models (Zhang et al. 2007).
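
As a rough numerical illustration of the GR expectation \(E_G=\varOmega _M(z=0)/f(z)\), the sketch below evaluates it for a flat \(\varLambda \)CDM background. It approximates the growth rate by the growth index ansatz \(f(z)\approx \varOmega _m(z)^{6/11}\) of Sect. 5.4, which is an approximation adopted here for simplicity, and uses \(\varOmega _m^0=0.3\) purely as an illustrative value.

```python
def omega_m_z(z, om0):
    """Omega_m(z) for a flat LambdaCDM background."""
    x = om0 * (1.0 + z) ** 3
    return x / (x + 1.0 - om0)


def e_g_gr(z, om0, gamma=6.0 / 11.0):
    """GR expectation E_G = Omega_m(z=0) / f(z), with f ~ Omega_m(z)^gamma (assumed)."""
    return om0 / omega_m_z(z, om0) ** gamma


if __name__ == "__main__":
    for z in (0.0, 0.32, 0.57, 1.0):  # illustrative redshifts
        print(f"z = {z:4.2f}  E_G = {e_g_gr(z, 0.3):.3f}")
```

Since f(z) grows towards 1 at higher redshift, the predicted \(E_G\) decreases with redshift in this approximation, and the result is independent of scale by construction.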

It is also worth providing a second definition of \(E_G\) motivated by observations as given by Reyes et al. (2010)

$$\begin{aligned} E_G(R)=\frac{\varUpsilon _{gm}(R)}{\beta \varUpsilon _{gg}(R)}, \end{aligned}$$

where R is the transverse separation from the lens-galaxy; \(\varUpsilon _{gm}(R)\) and \(\varUpsilon _{gg}(R)\) are the galaxy-matter and galaxy-galaxy annular differential surface densities respectively, see e.g., Baldauf et al. (2010). By construction, these are correlation functions that do not include any contribution from length scales smaller than some cut-off \(R = R_0\). This second definition in Eq. (113) provides a ratio that is practically similar to the information content of Eq. (110) and also factors out the galaxy bias. Most recently, Leonard et al. (2015) provided further insights on how theoretical uncertainties such as scale dependence of the bias, projection effects, and cut-off scale can affect measurements of \(E_G\) using future high precision probes and the conclusions that can be drawn from them. We present further below in Sect. 6.3 some constraints on the \(E_G\) measure from recent data.

We conclude this sub-section with some recent findings about the \(E_G\) measure from Amon et al. (2017) using the deep imaging data of KiDS with overlapping spectroscopic regions from 2dFLenS, BOSS DR12 and GAMA. The authors find that changing the metric potentials by as much as 10% produces smaller differences in the \(E_G\) predictions than changing the value of \(\varOmega _m^0\) between the values preferred by Planck and KiDS. They conclude that for this statistic to achieve its aim, the current tensions in cosmological parameters between Planck and large scale structure must be resolved first.

5.6 Parameterized post-Friedmann formalism

It appears that the parametrized post-Friedmann (PPF) formalism at cosmological scales (Hu and Sawicki 2007b; Baker et al. 2013) has not yet reached the popularity that its counterpart, the parameterized post-Newtonian (PPN) formalism, has enjoyed in testing GR at the level of the solar system or binary systems (Will 2014). This could perhaps be attributed to the context and the level of maturity of other methods developed to deal with the specific problems for which each formalism was introduced. There are at least two major developments in PPF formalisms (Hu and Sawicki 2007b; Baker et al. 2013) but also a number of previous developments such as in Bertschinger (2006), Caldwell et al. (2007), Amin et al. (2008), Pogosian et al. (2010) and Baker et al. (2011). It is also worth noting that the PPF work of Baker et al. (2013) was followed up by some of the same authors and others in Lagos et al. (2016, 2018), where the approach was changed to an EFT one, as we comment at the end of this subsection.

While inspired by PPN, PPF needs to be formulated to account for cosmological Hubble scales, where the exact form of the linearized metric is unknown and the redshift dependence must be taken into account. Therefore, PPF instead uses functions of redshift and scale and is based on the parameterization of the perturbed field equations rather than of the spacetime metric (Baker et al. 2013; Amendola et al. 2013a). We provide a very brief overview below and refer the reader to the original papers (Hu and Sawicki 2007b; Baker et al. 2013).

The first one was proposed in Hu and Sawicki (2007b), where the authors discuss the super-horizon, quasi-static and nonlinear regimes of modified gravity with particular attention to the transitions between them. They construct a PPF formalism for linear perturbations in MG models that joins the super-horizon regime and the sub-horizon quasi-static regime. They propose PPF functions that bridge these two regimes at a scale parameterized by the Hubble length. They defined three functions and one parameter as follows:

  • The metric ratio

    $$\begin{aligned} g(\ln a,k_H) \equiv {\varPhi -\varPsi \over {\varPhi } +\varPsi }, \end{aligned}$$

    where \(k_H \equiv k/aH\) is the wavenumber in units of the Hubble parameter. Note that in terms of the post-Newtonian parameter \(\eta = {\varPhi }\!/\varPsi \), \(g= (\eta -1)/(\eta +1)\).

    The expansion history H and the metric ratio g completely define super-horizon scalar metric fluctuations for adiabatic perturbations.

  • The function \(f_\zeta (\ln a)\) expressing the super-horizon relationship between the metric and density, see Eqs. (16)–(19) in Hu and Sawicki (2007b). As noted there, the exact form of \(f_\zeta (\ln a)\) is rarely important for observable quantities. That is the case, for example, for the galaxy redshift surveys and gravitational lensing. Only observable quantities that depend on the comoving density scales beyond the quasi-static regime are affected by \(f_\zeta (\ln a)\).

  • The function \(f_G(\ln a)\) that parameterizes a possible time-dependent modification of the Newton constant in the quasi-static regime. It is defined from the Poisson equation

    $$\begin{aligned} k^2 \varPsi _{\mathrm{w}} = {4\pi G \over 1+f_G} a^2 {\bar{\rho }}_m \delta _{m}, \end{aligned}$$

    where \(\varPsi _{\mathrm{w}}\) is the Weyl potential defined earlier.

  • The parameter \(c_\varGamma \) that characterizes the relationship between the transition scale and the Hubble scale. As shown in Hu and Sawicki (2007b), the interpolation between the super-horizon regime and the quasi-static regime is given by

    $$\begin{aligned} \left( 1+ c_\varGamma ^2 k_H^{2}\right) \left[ \varGamma ' +\varGamma + c_\varGamma ^2 k_H^{2 }\left( \varGamma -f_G\varPsi _{\mathrm{w}}\right) \right] = S, \end{aligned}$$

    where \(\varGamma \) is added to the modified Poisson equation (115) in order to match the super-horizon scale behavior

    $$\begin{aligned} k^2 [\varPsi _{\mathrm{w}} + \varGamma ] = 4\pi G a^2 \rho _m\varDelta _m, \end{aligned}$$

    and where S is the source for the equation of motion of \(\varGamma \) (Hu and Sawicki 2007b).

For MG models affecting cosmic evolution after matter-radiation equality, these three functions governing the relations between the metric, the density and the velocity, plus the usual transfer functions, fully specify the linear observables of the model.

They provided two examples, one for an f(R) theory model (see Sect. 7.4.1) and another for a DGP theory model (see Sect. 7.5.2). We reproduce their example for the former here. The square of the Compton length (inverse mass) in units of the Hubble length for f(R) is proportional to

$$\begin{aligned} B\equiv & {} {f_{RR} \over 1+f_R} {R'}{H \over H'}, \end{aligned}$$

where \(^{\prime }=d/d\ln a\) and \(f_{RR}= d^2 f/dR^2\). The metric ratio parameter \(g \rightarrow -1/3\) below the Compton length scale. They determine that the PPF metric ratio as \(k_H\rightarrow 0\) is given by

$$\begin{aligned} g(\ln a, k_H=0)= g_{\mathrm{SH}}(\ln a) = {\varPhi -\varPsi \over {\varPhi } + \varPsi }, \end{aligned}$$


$$\begin{aligned} f_\zeta = c_\zeta g \end{aligned}$$

with \(c_\zeta \approx -1/3\). They take for the transition to the quasi-static regime the interpolating function

$$\begin{aligned} g(\ln a,k)&= { g_{\mathrm{SH}} + g_{\mathrm{QS}}(c_{g}k_H)^{n_{g}} \over 1+ (c_{g}k_H)^{n_{g}}}, \end{aligned}$$

where \(g_{\mathrm{QS}}=-1/3\). They find that \(c_g=0.71 B^{1/2}\) and \(n_g=2\) where they used \(\varOmega _m=0.24\) and \(w_{\mathrm{eff}}=-1\).

Last, they find that \(f_R\) is the function that rescales the effective Newton constant and the quasi-static transition happens near the horizon scale. The two statements correspond to

$$\begin{aligned} f_G = f_R, \qquad c_\varGamma =1. \end{aligned}$$
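
The interpolating function for the metric ratio in the f(R) example above can be sketched in a few lines of Python. The values of \(g_{\mathrm{SH}}\) and B passed to the function are hypothetical inputs chosen only to exercise the limits; the constants \(g_{\mathrm{QS}}=-1/3\), \(c_g=0.71B^{1/2}\) and \(n_g=2\) are those quoted in the text. The helper `slip_from_g` simply inverts \(g=(\eta -1)/(\eta +1)\).

```python
import math

G_QS = -1.0 / 3.0  # quasi-static value of g quoted in the text for f(R)


def metric_ratio(k_h, g_sh, b, n_g=2.0):
    """Interpolate g between super-horizon (g_sh) and quasi-static (G_QS) values.

    k_h is k/(aH); b is the Compton parameter B; c_g = 0.71 * sqrt(B).
    """
    c_g = 0.71 * math.sqrt(b)
    x = (c_g * k_h) ** n_g
    return (g_sh + G_QS * x) / (1.0 + x)


def slip_from_g(g):
    """Invert g = (eta - 1)/(eta + 1) to obtain eta = Phi/Psi."""
    return (1.0 + g) / (1.0 - g)


if __name__ == "__main__":
    g_sh, b = -0.05, 0.1  # hypothetical illustrative inputs
    for k_h in (0.0, 1.0, 10.0, 1e3):
        g = metric_ratio(k_h, g_sh, b)
        print(f"k_H = {k_h:8.1f}  g = {g:+.4f}  eta = {slip_from_g(g):.4f}")
```

With these conventions, \(g\rightarrow g_{\mathrm{SH}}\) as \(k_H\rightarrow 0\) and \(g\rightarrow -1/3\) (that is, \(\eta \rightarrow 1/2\)) deep inside the Compton scale.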

The second PPF formalism was proposed in Baker et al. (2013), taking into account the recent rapid growth in the number of dark energy and modified gravity models. A concise summary of the formalism was also given in Amendola et al. (2013a) and we follow that presentation here. Baker et al. (2013) start with scalar perturbations of the Einstein field equations of the form

$$\begin{aligned} \delta G_{\mu \nu } \;=\; 8\pi G\,\delta T_{\mu \nu }+\delta U_{\mu \nu }^{\mathrm {metric}}+\delta U_{\mu \nu }^{\mathrm {d.o.f.}}+\mathrm {\ gauge\ invariance\ fixing\ terms}, \end{aligned}$$

where \(\delta T_{\mu \nu }\) is the perturbed stress-energy tensor of cosmic fluids. \(\delta U_{\mu \nu }^{\mathrm {metric}}\) contains new terms from metric perturbations due to modified gravity that constitute terms beyond those coming from \(\delta G_{\mu \nu }\) in GR. \(\delta U_{\mu \nu }^{\mathrm {d.o.f.}}\) contains terms from scalar perturbations of new degrees of freedom due to modified gravity. For example, such terms can come from perturbations of the scalar field from scalar–tensor theories or scalar modes from vector or tensor fields in MG models.

Baker et al. (2013) then considered the expansion of \(\delta U_{\mu \nu }^{\mathrm {metric}}\) in terms of two gauge-invariant perturbation variables. The first is simply one of the standard gauge-invariant Bardeen potentials, \(\hat{\varPhi }\). The second is a combination of the two Bardeen potentials as follows: \({\hat{\varGamma }}=1/k (\dot{\hat{\varPhi }}+{\mathcal {H}}{\hat{\varPsi }})\). They then provided equations in which \(\delta U_{\mu \nu }^{\mathrm {metric}}\) is expressed as a linear combination of \(\hat{\varPhi }\), \({\hat{\varGamma }}\) and their derivatives, keeping the gauge invariance of the field equations. The coefficients of such terms are then part of the PPF function set. They also expressed \(\delta U_{\mu \nu }^{\mathrm {d.o.f.}}\) for the new degrees of freedom in terms of gauge-invariant potentials \(\{{\hat{\chi }}_i\}\), whose coefficients provide further PPF functions. They then write out the four expanded components of the perturbed field equations, Eq. (123), in which 22 PPF parameters are used as functions of time (redshift).

The set of PPF parameters covers super-horizon and sub-horizon scales, but it simplifies significantly in the quasi-static regime, reducing to what can be encapsulated in one of the pairs of parameters discussed in Sect. 5.2. It was argued, for example in Amendola et al. (2013a), that in this regime, which is relevant to weak lensing and galaxy surveys, such a minimal subset is more practical for comparison with observations. However, Baker et al. (2013) explain that the PPF formalism extends to horizon scales and can serve for comparisons with the large-scale CMB modes that contribute to the ISW effect and to lensing-ISW cross-correlations, well beyond the quasi-static approximation (Hu et al. 2013; Hu 2008).

Most recently, some of the authors of Baker et al. (2013) and others commented in Lagos et al. (2016, 2018) that the four expanded components of the perturbed field equations with PPF parameters in Baker et al. (2013) contain a large number of free functions because the parameterization is built directly at the level of the field equations. In other words, the PPF coefficients are not all independent. To remove some of the redundancies, Lagos et al. (2016, 2018) built a corresponding parametrization at the level of the action, which they call the EFT of cosmological perturbations. As a result, the maximum number of parameters needed drops to 12 in this EFT parameterization, compared to 22 in the PPF formalism above. This provides an extension to the scalar–tensor EFT approach that we discussed in Sect. 5.1.

Finally, we conclude this section with the recent work of Clifton and Sanghai (2018), where the authors proposed a set of 4 parameters to model minimal deviations from GR (metric theories) that can be used to cover solar system, galactic, and cosmological scales all the way to the super-horizon regime. Two of the parameters are the well-known effective gravitational constant (\(\mu \)) and the slip parameter (which they denote \(\zeta \)). They apply consistency relations in order to connect the behavior of these parameters between small and large scales. They show that, using these conditions, \(\mu \) and \(\zeta \) can be expressed on small and large scales using 4 parameters \(\{\alpha ,\gamma ,\alpha _c,\gamma _c\}\). The first two parameters are the same as the PPN parameters but are allowed to vary at cosmological scales, while the other two are specific to cosmological evolution and enter the two Friedmann equations. They refer to the set as PPNC. It will be interesting to see applications of this set to currently available data.

5.7 Remarks on transition to nonlinear scales

A legitimate question is whether the various parametrizations and approaches discussed above can deal (or be extended to deal) in some way with nonlinear scales. A related question is whether any parametrization that can deal with nonlinear scales can also accurately reflect the screening mechanisms (see Sect. 8) at work in the models.

First, the phenomenological MG parameterizations using \(\mu \), \(\eta \), \(\varSigma \) and other related parameters have been proposed based on the linearly perturbed Einstein equations, so they are restricted to linear scales by construction. Most recently, Clifton and Sanghai (2018) proposed a scheme (or parametrization) that is argued to link MG parametrizations at small scales and large scales. The idea is based on two parameters, which they put between quotation marks as the “slip” and the “effective Newton’s constant”, that can be written in terms of four functions of time. Two of these four functions are a direct generalization of the usual \(\alpha \) and \(\gamma \) parameters of the PPN formalism at small scales, see e.g., Will (2014). This development uses concepts of averaging from small scales to larger scales. This very recent proposal came in a short paper and is at a very early stage at the time of writing this review. It will be interesting to follow further developments of this work and any clarifications on how it could deal with screening mechanisms and other relevant questions.

Second, when considering the measure \(E_G\) at nonlinear scales, it was observed in Leonard et al. (2015) that there is a difference between \(E_G(\ell )\) as given by Eq. (111) and \(E_G(R)\) as given by Eq. (113). They state that while \(E_G(\ell )\) is defined in Fourier space and includes only linear scales, that is not necessarily the case for \(E_G(R)\), which is defined in real space, where scales are not easily separated. They found that the inclusion of nonlinearities in the correlation functions entering \(E_G(R)\) does not cause the measure to deviate from the expected GR value at small scales. They attribute this to the fact that nonlinearities enter into \(\varUpsilon _{gm}(R)\) and \(\varUpsilon _{gg}(R)\) (i.e., the galaxy-matter and galaxy-galaxy annular differential surface densities) via the same combination of correlation function terms, so they effectively cancel out from the ratio. It remains an open question whether such a behavior is also expected for modified gravity models.

Third, the PPF formalism of Hu and Sawicki (2007b) was proposed with a prescription on how to derive the nonlinear matter power spectrum in modified gravity theories that should in principle capture the screening mechanism as well. The prescription is based on the assumption that such a nonlinear power spectrum should reduce to that of GR on small scales. The fitting formula they proposed is as follows

$$\begin{aligned} P(k,z)=\frac{P_{\mathrm{non-GR}}(k,z)+c_{\mathrm{nl}}\varSigma ^2(k,z)P_{\mathrm{GR}}(k,z)}{1+c_{\mathrm{nl}}\varSigma ^2(k,z)}, \end{aligned}$$

where \(P_{\mathrm{GR}}\) is the nonlinear power spectrum in a GR-\(\varLambda \)CDM model that has the same expansion history as that of the modified gravity model under consideration, and \(P_{\mathrm{non-GR}}\) is the nonlinear power spectrum in this modified gravity model but without the screening mechanism necessary to recover GR on small scales. In other words, the fitting formula corrects the MG power spectrum to fit GR at small scales. The weighting function,

$$\begin{aligned} \varSigma ^2(k,z)\equiv \frac{k^3}{2\pi ^2}P_{\mathrm{lin}}(k,z), \end{aligned}$$

represents the degree of nonlinearity and governs the degree of screening efficiency. \(P_{\mathrm{lin}}\) is the linear power spectrum in the modified gravity model. The coefficient \(c_{\mathrm{nl}}\) (which can also be time-dependent) controls the scale of the effect. See, e.g., Hu and Sawicki (2007b).

Koyama et al. (2009) carried out further fitting using the PPF formalism with the prescription above and added an exponent n on the right-hand side of Eq. (125). They found that \(n=1\) for DGP and \(n=1/3\) for f(R) provide good fits to N-body simulations of the models up to \(k\sim 0.5\) h/Mpc. They also determined values for \(c_{\mathrm{nl}}\) in their fitting work. Zhao et al. (2011) used an exponent n that is a function of k and 3 parameters, extending the good fit to N-body simulations up to \(k=10\) h/Mpc for f(R) models. These two studies and others found that the chameleon mechanism at work was accurately reproduced by the implementation of this prescription.
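
The fitting formula above is simple enough to sketch directly. The Python snippet below is a minimal illustration for a single wavenumber: it interpolates between the unscreened MG spectrum and the GR spectrum using the weight \(c_{\mathrm{nl}}\varSigma ^2\). The exponent n is implemented here as a power applied to the whole right-hand side of Eq. (125), which is our reading of the Koyama et al. (2009) modification, and all numerical inputs are toy values rather than outputs of any simulation.

```python
import math


def sigma2(k, p_lin, n=1.0):
    """Weighting function [k^3 P_lin / (2 pi^2)]^n; n = 1 recovers Eq. (125)."""
    return (k**3 * p_lin / (2.0 * math.pi**2)) ** n


def ppf_nonlinear_power(k, p_lin, p_non_gr, p_gr, c_nl=1.0, n=1.0):
    """Interpolate between the unscreened MG spectrum and the GR spectrum."""
    w = c_nl * sigma2(k, p_lin, n)
    return (p_non_gr + w * p_gr) / (1.0 + w)


if __name__ == "__main__":
    # Toy inputs (assumptions): the unscreened MG power is 20% above the GR power.
    p_gr, p_non_gr = 100.0, 120.0
    for k, p_lin in ((0.01, 50.0), (1.0, 2.0 * math.pi**2), (1e3, 1e3)):
        p = ppf_nonlinear_power(k, p_lin, p_non_gr, p_gr)
        print(f"k = {k:8.2f}  P = {p:.3f}")
```

When \(\varSigma ^2\ll 1\) the result stays close to \(P_{\mathrm{non-GR}}\), and when \(\varSigma ^2\gg 1\) it approaches \(P_{\mathrm{GR}}\), which is the intended screening behavior of the prescription.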

Lombriser et al. (2014) and Lombriser (2014) combined the spherical collapse model, the halo model, linear perturbation theory, a quasi-nonlinear interpolation motivated by the \(c_{\mathrm{nl}} \varSigma ^2(k,z)\) above, and one-loop perturbations in order to derive a description of the nonlinear matter power spectrum of f(R) gravity with chameleon screening on scales of up to \(k\sim 10\) h/Mpc. This encouraged Lombriser (2016) to push further the method above of combining the perturbative approach with one-halo contributions obtained from a generalized modified spherical collapse model. The author proposed a parametrization based on the spherical collapse that enters into effect as one transitions into the deep nonlinear regime. The formalism he proposed allows one to encode different screening mechanisms at work in scalar–tensor theories. This sophisticated parametrization is then combined with generalized perturbative approaches to give a formalism that constitutes a nonlinear extension to the linear PPF formalism discussed above. For a detailed description, see Lombriser (2016).

Finally, there have been some recent proposals to extend the EFT formulation of dark energy to nonlinear scales, e.g., Cusin et al. (2018) for the Vainshtein mechanism, or to develop a post-Newtonian–Vainshtein formalism that can be connected to it, see e.g., McManus et al. (2017) and Bolis et al. (2018). It was highlighted in Lombriser et al. (2018) that the EFT formulation of dark energy they explore in their paper can be connected to the nonlinear parameterization developed in Lombriser (2016). Extending the EFT formulation of dark energy to the nonlinear regime is a subject of interest in the most recent literature and is to be followed very closely.

6 Constraints and results on MG parameters (i.e., deviations from GR) from current cosmological data sets

In this section we describe current results on testing MG phenomenological parameters from cosmology. These are only a subset of selected available papers and results in the literature. We aimed here to focus on some of the recent results, or in some cases, on less recent constraints but those that helped exclude substantial regions of MG parameter spaces. We organize this section by the parameterizations described above and then by probes and surveys.

6.1 Constraints on modified growth parameters

6.1.1 Constraints from Planck CMB, ISW, CMB lensing, and other data sets

We start with the XIVth paper of the Planck 2015 data release (Ade et al. 2016b) that was dedicated to dark energy and modified gravity models beyond \(\varLambda \)CDM (we hereafter refer to the paper as Planck2015MG). The authors used Planck CMB temperature, polarization and CMB lensing data sets combined with several other data sets as follows. They defined as Planck low-\(\ell \) data the temperature and polarization multipoles with \(\ell \le 29\) (denoted therein as “lowP”), and as high-\(\ell \) temperature-only data (denoted Planck TT) the multipoles with \(30 \le \ell \le 2500\). They also used their CMB lensing data, which is sensitive to dynamical dark energy and late-time modifications to gravity (Ade et al. 2016c). Planck2015MG considered BAO as the primary data set to be combined with CMB in order to break degeneracies among cosmological parameters constrained by the background evolution and used data from Ross et al. (2015), Anderson et al. (2014) and Beutler et al. (2011). They used supernova data from the Joint Light-curve Analysis (JLA) compilation (Betoule et al. 2013, 2014). They also used a local measurement of the Hubble constant, \(H_0=70.6\pm 3.3\) km \(\mathrm s^{-1}\) Mpc\(^{-1}\), from Efstathiou (2014), who reanalyzed the results of Riess et al. (2011). For constraints on the growth rate of large scale structure, Planck2015MG used constraints on \(f\sigma _8\) from the RSD data compilation of Samushia et al. (2014) (see references therein) as well as weak lensing data from the CFHTLenS survey, using the 2D data of Kilbinger et al. (2013) and the tomography data from blue galaxies only in order to avoid any intrinsic alignment contamination present in the red galaxies (Heymans et al. 2013).

For MG parameters, Planck2015MG constrained \(\mu (k,a)\), \(\eta (k,a)\), and \(\varSigma (k,a)\) as defined earlier in Eqs. (90), (91), and (92) but added to them specific time and scale dependencies. They defined a parametrization that is similar to that described in Eq. (104) (Bertschinger and Zukin 2008) for the quasi-static regime but which is more general and covers a wider range of scales (Ade et al. 2016b). For the time evolution they considered two cases: (1) one where the dependence is expressed via the effective dark energy density \(\varOmega _\mathrm{{DE}}(a)\), and (2) one where the scale factor appears directly in the parametrization. They also split the time evolution using constants \(E_{ij}\), with \(i,j=1,2\), to represent early and late time evolution. The \(E_{ij}\) parameters are constrained from the data and the parameters \(\mu \), \(\eta \) and \(\varSigma \) are reconstructed from them.

Fig. 5 Contour plots for marginalized posterior distributions for 68% and 95% C.L. for the two parameters \(\{\mu _0-1,\eta _0-1\}\) at the present time with no scale dependence. On the left panel, time dependence is considered via the effective dark energy density parameter. On the right panel, time evolution is considered by direct inclusion of the scale factor. Results discussed in text of Sect. 6.1.1. The label Planck stands for PlanckTT+TEB. Figure reproduced with permission from Ade et al. (2016b)

However, Planck2015MG found that the current data cannot meaningfully constrain the scale-dependent MG parameters and that the inclusion of scale dependence has very little effect on the \(\chi ^2\) value of the best fit. Therefore, their main MG parameter analysis was carried out without scale dependence except for a small illustrative example.

We reproduce here their Fig. 14 (Fig. 5 here), their Fig. 15 (Fig. 6 here) and their Table 6 (Table 3 here), showing constraints on \(\mu (k,a)\), \(\eta (k,a)\), and \(\varSigma (k,a)\) from various combinations of Planck and other data sets. Note that in their figures and tables Planck2015MG use the label Planck to refer to the combination Planck TT \(+\) lowP. We expanded that in the header of Table 3 for clarity.

Fig. 6 Contour plots for marginalized posterior distributions for 68% and 95% C.L. for the two parameters \(\{\mu _0-1,\eta _0-1\}\) at the present time with no scale dependence. The time dependence is considered via the effective dark energy density parameter. \(\varSigma \) is obtained from Eq. (92). Results discussed in text of Sect. 6.1.1. In the labels, Planck stands for PlanckTT+TEB. Figure reproduced with permission from Ade et al. (2016b)

Figs. 5 and 6 show that while \(\mu (k,a)\), \(\eta (k,a)\), and \(\varSigma (k,a)\) are close to their GR value of 1, some tension with GR is present, and Planck2015MG provide some explanations for the source of this tension. The GR values are indicated by the dashed horizontal and vertical lines in Fig. 5. In case (1) above, with time evolution based on the effective \(\varOmega _{\mathrm{DE}}(a)\), the tension is at the 2\(\sigma \) level for Planck TT \(+\) lowP data and rises above 2\(\sigma \) when the constraints are tightened by adding the BAO \(+\) RSD data. The tension reaches the 3\(\sigma \) level for the Planck TT \(+\) lowP \(+\) WL \(+\) BAO \(+\) RSD combination. For case (2), with time evolution depending directly on a, there is less tension: it goes from 1\(\sigma \) for Planck TT \(+\) lowP data to 2\(\sigma \). They commented that the increase from 2\(\sigma \) to 3\(\sigma \) in the tension is mainly driven by the additional external data sets, and so is the improvement in the goodness of fit of the models with the two additional MG parameters, which ranges from \(\delta \chi ^2=-6.3\) when using Planck TT \(+\) lowP to \(\delta \chi ^2=-10.8\) when combining Planck TT \(+\) lowP \(+\) WL \(+\) BAO \(+\) RSD, compared to \(\varLambda \)CDM.

Table 3 Marginalized mean values and 68% C.L. errors on cosmological parameters and the MG parameters \(\{\mu _0-1,\eta _0-1,\varSigma _0-1\}\) at the present time with no scale dependence

Planck2015MG comment that the tension above can be understood from their Fig. 1, which shows that the best-fit power spectrum for Planck TT \(+\) lowP prefers models with slightly less power in the CMB at large scales (i.e., the ISW effect) and models with a higher CMB lensing potential when compared to the \(\varLambda \)CDM model. They state that this point is consistent with the fact that MG parameters departing from GR values are found to be degenerate with the lensing amplitude parameter \(A_L\). The latter is simply a non-physical scaling parameter used to check how the CMB power spectrum is affected by lensing, and it should be equal to 1 for consistency. Calabrese et al. (2008) found that \(A_L\) is not equal to 1 when using the \(\varLambda \)CDM model, but Planck2015MG find that if MG parameters are allowed to vary then \(A_L\) becomes consistent with unity again, while the MG parameters move away from their \(\varLambda \mathrm {CDM}\) values. However, Planck2015MG point out that the CMB lensing analysis from the 4-point function of Ade et al. (2016c) is consistent with \(A_L=1\) and in agreement with \(\varLambda \mathrm {CDM}\), with no requirement of a higher lensing potential. Therefore, when Planck2015MG use this CMB lensing data, the MG parameter confidence contours are shifted to regions where the tensions above are removed (they fall to 1\(\sigma \) for CMB data only and below 2\(\sigma \) for all data combined). GR and \(\varLambda \)CDM then provide a good fit. It is worth noting though that recent work confirms some tension between Planck temperature and polarization data versus Planck CMB lensing data (Motloch and Hu 2018).

Their Fig. 16 and Table 7 provide a summary of the tensions with and without CMB Lensing where they present the tension using departure from the line of maximum degeneracy between the two MG parameters.

Their Table 6 (Table 3 here) shows the corresponding marginalized mean values and the 68% C.L. errors on the MG parameters for each combination of data sets. This shows the explicit constraints on MG parameters and the tensions reported above. As commented in Planck2015MG, the addition of the BAO \(+\) SN \(+\) \(H_0\) data does not significantly improve the MG constraints while the RSD data does provide a noticeable improvement, as expected. Finally, as shown in their Fig. 18, the currently available data is not able to provide useful constraints when the scale dependence of the MG parameters is included in the analysis.

6.1.2 Constraints on MG parameters from mainly weak lensing data

KiDS-450 + other data sets

Joudaki et al. (2017) conducted a detailed analysis to test extensions to the standard \(\varLambda \)CDM cosmological model, including constraints on deviations from GR, using weak lensing tomography from 450 deg\(^2\) of imaging data from the Kilo Degree Survey (KiDS) (Hildebrandt et al. 2017). The authors also used the Planck temperature and polarization measurements on large angular scales (\(\ell \le 29\)) through the low-\(\ell \) TEB likelihood, and temperature only (TT) at smaller scales through the PLIK TT likelihood (Ade et al. 2016a). They explored whether any of the extensions to the standard model could alleviate the tension reported in Hildebrandt et al. (2017) between KiDS and Planck constraints. The extent and sources of these tensions have, however, been called into question by Efstathiou and Lemos (2018).

They used the parameterization \(Q(k,z)\) and \(\varSigma (k,z)\) as in (86) and (88), binned in scale and redshift similarly to Table 2, with transitions at \(k = 0.05\, h\, {\mathrm{Mpc}}^{-1}\) and \(z = 1\). As lensing statistics, they used the correlation functions in Eq. (70). They included in their analysis all of the key lensing systematics, such as intrinsic alignments of galaxies and baryonic effects, by modeling them and adding the corresponding parameters to those constrained by the data. For the MG part of their analysis they used the ISiTGR software (Dossett et al. 2011b), which is a modified version of CosmoMC and CAMB (Lewis and Bridle 2002; Lewis et al. 2000) (see Sect. 11.1).
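The two-by-two binning in scale and redshift described above can be illustrated with a minimal sketch. The transition values are those quoted in the text; the function names and the numbering of bins 1, 3 and 4 are assumptions made for illustration (only bin 2, with \(z<1\) and \(k>0.05\,h\,\mathrm{Mpc}^{-1}\), is fixed by the caption of Fig. 7), and sharp transitions are assumed for simplicity.

```python
K_TRANSITION = 0.05  # h/Mpc, transition scale quoted in the text
Z_TRANSITION = 1.0   # transition redshift quoted in the text


def mg_bin_index(k, z):
    """Return a bin index in 1..4 for wavenumber k (h/Mpc) and redshift z.

    Index 2 (z < 1, k > 0.05) follows the Fig. 7 caption. The ordering of
    the other three bins is an assumption made for this illustration.
    """
    low_z = z < Z_TRANSITION
    low_k = k < K_TRANSITION
    if low_z:
        return 1 if low_k else 2
    return 3 if low_k else 4


def binned_mg_value(bin_values, k, z):
    """Piecewise-constant Q(k, z) or Sigma(k, z) from four bin values."""
    return bin_values[mg_bin_index(k, z) - 1]


# Hypothetical bin values: GR (unity) everywhere except bin 2.
sigma_bins = [1.0, 1.04, 1.0, 1.0]
sigma_here = binned_mg_value(sigma_bins, k=0.1, z=0.5)  # picks bin 2
```

In this form each of \(Q\) and \(\varSigma \) carries four free amplitudes, which is how the 8 MG parameters of the analysis arise.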

Fig. 7

Left: Marginalized posterior contours (inner 68% CL, outer 95% CL) in the \(Q_2 - \varSigma _2\) space for KiDS with fiducial angular scales shown in green (labeled by ‘FS’), KiDS keeping only the largest angular scales shown in pink (labeled by ‘LS’), and combined with Planck in grey and blue, respectively. The indices represent the combination of MG bins, such that \(z < 1\) and \(k > 0.05~h~{\mathrm{Mpc}}^{-1}\). The intersection of the dashed lines gives the GR prediction (i.e., \(Q = \varSigma = 1\)). Reproduced with permission from Fig. 13 in Joudaki et al. (2017). Right: In addition to the cases described on the left, the constraints include galaxy–galaxy lensing correlation with cosmic shear in WL and RSD data as described in the text. ‘Large-scale cuts’ means that small scales have been excluded because no adequate modeling of generic MG deviations in the nonlinear regime is available for use here. Again, the intersection of the horizontal and vertical lines is the GR prediction (i.e., \(Q = \varSigma = 1\))

We reproduce the right panel of their Fig. 13 (see the left panel of Fig. 7 here) showing the constraints on \(Q_2\) and \(\varSigma _2\). As shown in the figure, the KiDS constraints are consistent with GR and are mainly sensitive to \(\varSigma _2\), as expected for lensing constraints. The authors report that this is also the case for the other 6 \(Q_i\) and \(\varSigma _i\) parameters. Furthermore, using \(\chi ^2\) and Bayesian tests, they find that the data have no significant preference for the model with additional MG parameters compared to \(\varLambda \)CDM. The tension between Planck and KiDS goes away, but they attribute that to the weakening of the constraints due to the addition of the 8 MG parameters. They conclude that their data (combined with Planck) have no preference for a deviation from GR. They found instead that a model with dynamical dark energy and a time-evolving equation of state is moderately preferred by the data and alleviates the tension between their data and Planck.

In a subsequent study (Joudaki et al. 2018), the authors combined KiDS lensing tomography data with the overlapping areas of two spectroscopic redshift galaxy clustering surveys: 2dFLenS (Blake et al. 2016a) and BOSS (Dawson et al. 2013; Anderson et al. 2014). The same Planck data as above were used. They derived cosmological parameter constraints, including MG parameters, using three large-scale structure measurements: cosmic shear tomography, galaxy–galaxy lensing tomography, and redshift-space distortions (RSD) in the form of redshift-space multipole power spectra (Taylor and Hamilton 1996). This gave the analysis significantly more constraining power and tightened the constraints on all parameters. However, this tightening also brought the tension between the large-scale constraints and Planck to the 2.6\(\sigma \) level. They found that models with MG parameters could resolve the discordance in the linear/large-scale case, but are not favored by model selection. The same result holds for extended models with massive neutrinos, curvature or evolving dark energy. The main gain for constraints on MG parameters in their analysis comes from the complementarity between cosmic shear, which is sensitive to the sum of the two potentials via light deflection, i.e., \(\varPsi +\varPhi \), and redshift-space distortions, which are sensitive to the potential \(\varPsi \) via the growth of large-scale structure. They use the same bins in redshift and scale for the MG parameters as above and keep a \(\varLambda \)CDM background cosmology.

We reproduce the right panel of their Fig. 11 (in the right panel of our Fig. 7) showing the new constraints in the \(Q_2-\varSigma _2\) plane. These two parameters are in the low-redshift bin (i.e., \(z<1\)) and the second bin in length-scale (i.e., \(k>0.05\,h\,\mathrm{Mpc}^{-1}\)). One can see a significant improvement in the constraints in the right panel compared to the left, which highlights the importance of adding the RSD data and the galaxy–galaxy lensing correlation to cosmic shear data, as the authors stress in their conclusion.

For this WL \(+\) RSD combined analysis, they find \(Q_2 = 2.8^{+1.1}_{-2.0}\) and \(\varSigma _2 = 1.04^{+0.11}_{-0.14}\), while for KiDS only in Joudaki et al. (2017) \(\varSigma _2 = 1.23^{+0.34}_{-0.70}\) and \(Q_2\) is unconstrained within its prior range. These and all other constraints on the six other modified gravity parameters are consistent with the GR values of unity. The tightest constraints in this analysis come from combining cosmic shear, galaxy–galaxy lensing correlation, RSD and Planck: \(Q_2 = 1.28^{+0.41}_{-1.00}\) and \(\varSigma _2 = 0.90^{+0.14}_{-0.18}\). As they comment, these are conservative results since only large-scale cuts, which are found to be consistent with Planck, are used. This is a good improvement over the previous analysis above with large-scale ‘KiDS cosmic shear + Planck’ constraints, where \(Q_2 > 2.2\) (restricted by the upper bound of the prior) and \(\varSigma _2 = 2.13^{+0.58}_{-1.10}\). The authors conclude that, as the overlap between KiDS and 2dFLenS/BOSS grows, more stringent constraints will be obtainable with the data combination used here.
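As a rough reading aid for the asymmetric errors quoted above, one can express the offset of the GR value from each best fit in units of the one-sided 68% error pointing toward GR. The sketch below is only a heuristic: it treats each marginalized posterior as a split Gaussian, which is an assumption made here and not the procedure used by the authors, and the function name is hypothetical.

```python
def offset_in_sigma(best, err_plus, err_minus, reference=1.0):
    """Offset of `reference` from `best` in units of the one-sided error.

    Uses the upper error if the reference lies above the best fit and the
    lower error otherwise (split-Gaussian heuristic, assumed here).
    """
    delta = reference - best
    sigma = err_plus if delta > 0 else err_minus
    return abs(delta) / sigma


# Values quoted in the text; the GR value is unity for both parameters.
q2_all = offset_in_sigma(1.28, 0.41, 1.00)        # all probes + Planck
sigma2_all = offset_in_sigma(0.90, 0.14, 0.18)    # all probes + Planck
sigma2_wl_rsd = offset_in_sigma(1.04, 0.11, 0.14) # WL + RSD

for name, value in [("Q2", q2_all), ("Sigma2", sigma2_all),
                    ("Sigma2 (WL+RSD)", sigma2_wl_rsd)]:
    print(f"{name}: GR lies {value:.2f} one-sided sigma from the best fit")
```

With these inputs all three offsets come out below one, in line with the statement that the constraints are consistent with GR.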

CFHTLenS + other data sets

Some years earlier, Simpson et al. (2013) used combined structure growth data from the CFHTLenS tomographic cosmic shear survey (Heymans et al. 2013; Benjamin et al. 2013), the WiggleZ Dark Energy Survey (Blake et al. 2012), and redshift space distortions from the 6dFGS (Beutler et al. 2012) to constrain MG parameters and deviations from the Newtonian potentials. They also used background data for \(H_0\) from Riess et al. (2011), BAO data from Anderson et al. (2012), and Padmanabhan et al. (2012), as well as CMB temperature (TT) and polarization (TE) with data from WMAP7 (Komatsu et al. 2011).

They used a slightly modified parametrization in which our \(\mu (k, a)\) and \(\varSigma (k, a)\) in (91) and (92) are replaced by \([1 + \mu (k, a)]\) and \([1 + \varSigma (k, a)]\), respectively, so that the parameters now take the value 0 in the GR case instead of 1. They modeled the time evolution of the MG parameters to scale with the background effective dark energy density as:

$$\begin{aligned} \varSigma (a) = \varSigma _0 \frac{\varOmega _\varLambda (a)}{\varOmega _\varLambda } \, , \, \, \, \, \mu (a) = \mu _0 \frac{\varOmega _\varLambda (a)}{\varOmega _\varLambda } \, , \end{aligned}$$

where \(\varOmega _\varLambda \equiv \varOmega _\varLambda (a=1)\) is today’s value so that \(\mu _0\) and \(\varSigma _0\) represent today’s values of \(\mu (a)\) and \(\varSigma (a)\) as well, respectively.
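To make the scaling concrete, the following minimal sketch evaluates this time dependence for a flat \(\varLambda \)CDM background containing only matter and a cosmological constant. The background model, the neglect of radiation, the fiducial value \(\varOmega _\varLambda = 0.7\) and the function names are all assumptions made for illustration.

```python
def omega_lambda_of_a(a, omega_lambda0=0.7):
    """Omega_Lambda(a) for flat LambdaCDM with matter and Lambda only.

    Assumes Omega_m = 1 - Omega_Lambda today and neglects radiation.
    """
    omega_m0 = 1.0 - omega_lambda0
    return omega_lambda0 / (omega_lambda0 + omega_m0 * a ** -3)


def mg_param_of_a(a, value_today, omega_lambda0=0.7):
    """mu(a) or Sigma(a) scaled as value_today * Omega_Lambda(a) / Omega_Lambda."""
    return value_today * omega_lambda_of_a(a, omega_lambda0) / omega_lambda0


# The modification equals its present-day value at a = 1 and fades
# toward zero (the GR value in this convention) at early times.
for a in (1.0, 0.5, 0.1):
    print(a, mg_param_of_a(a, value_today=0.05))
```

The design choice behind this form is that any departure from GR switches on only as the effective dark energy becomes dynamically relevant, so early-universe physics is left essentially unchanged.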

They used measured constraints on \((f\sigma _8,F)\) from the WiggleZ and 6dFGS surveys, where \(F(z)\) represents the amplitude of the Alcock–Paczynski effect, which is degenerate with the RSDs as we discussed in Sect. 4.3. These measurements are from three effective redshift slices from WiggleZ at \(z = 0.44\), 0.60, and 0.73, with \(f\sigma _8(z) = (0.41 \pm 0.08, 0.39 \pm 0.06, 0.44 \pm 0.07)\) and \(F = (0.48 \pm 0.05, 0.65 \pm 0.05, 0.86 \pm 0.07)\), plus a fourth data point of \(f \sigma _8 = 0.423 \pm 0.055\) at a lower redshift \(z=0.067\) from the 6dFGS, which has negligible sensitivity to the Alcock–Paczynski distortion.

In their analysis they considered the \(\varLambda \)CDM, the flat and non-flat wCDM models all augmented with the MG parameters \(\mu _0\) and \(\varSigma _0\). In all cases, they found no indication of departure from general relativity on cosmological scales. They put the following limits on MG parameters: \(\mu _0 = 0.05 \pm 0.25\) and \(\varSigma _0 = 0.00 \pm 0.14\) for a flat \(\varLambda \)CDM background model. They note that these correspond to deviations in the present-day Newtonian potential and spatial curvature potential of \(\delta \varPsi /\varPsi _{\textit{GR}} = 0.05 \pm 0.25\) and \(\delta {\varPhi } / \varPhi _{\textit{GR}} = -0.05 \pm 0.3\) respectively, with significant correlations between the errors. When they allow for w to vary for the background, these constraints change to \(\mu _0 = -\,0.59 \pm 0.34\) and \(\varSigma _0 = -\,0.19 \pm 0.11\). They also constrained the growth index parameter to \(\gamma =0.52 \pm 0.09\) for a \(\varLambda \)CDM background model, thus in agreement with the GR value of \(6/11=0.545\).
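The quoted potential deviations can be recovered from the MG parameters with a short calculation, sketched below under stated assumptions: \(\varPsi \) is taken to be rescaled by \([1+\mu ]\), the lensing combination \(\varPsi +\varPhi \) by \([1+\varSigma ]\), and \(\varPhi =\varPsi \) in GR, so that \(\delta \varPsi /\varPsi _{\textit{GR}} = \mu _0\) and \(\delta \varPhi /\varPhi _{\textit{GR}} = 2\varSigma _0 - \mu _0\). The function names are hypothetical.

```python
import math


def potential_deviations(mu0, sigma0):
    """Fractional deviations (delta_Psi/Psi_GR, delta_Phi/Phi_GR).

    Assumes Psi scales as (1 + mu), Psi + Phi as (1 + Sigma), and
    Phi = Psi in GR (no anisotropic stress).
    """
    delta_psi = mu0
    delta_phi = 2.0 * sigma0 - mu0
    return delta_psi, delta_phi


def uncorrelated_phi_error(err_mu0, err_sigma0):
    """Error on 2*Sigma0 - mu0 if the two errors were uncorrelated."""
    return math.hypot(2.0 * err_sigma0, err_mu0)


def growth_index_offset(gamma, err_gamma, gamma_gr=6.0 / 11.0):
    """Distance of the measured growth index from the GR value, in sigma."""
    return abs(gamma - gamma_gr) / err_gamma


# Flat LambdaCDM values quoted in the text.
dpsi, dphi = potential_deviations(mu0=0.05, sigma0=0.00)
naive_err = uncorrelated_phi_error(0.25, 0.14)   # about 0.38
gamma_offset = growth_index_offset(0.52, 0.09)   # about 0.28
```

The central values reproduce the quoted \(0.05\) and \(-0.05\). The uncorrelated error estimate of about 0.38 is larger than the quoted \(\pm 0.3\), which is what one would expect if the errors on \(\mu _0\) and \(\varSigma _0\) are positively correlated, in line with the significant correlations noted by the authors.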

6.1.3 Constraints on MG parameters from various probes and analyses

Peirone et al. (2017b) perform an extensive analytical and numerical analysis of the MG parameters \(\varSigma \) and \(\mu \), or equivalently \(G_\mathrm{light}/G\) and \(G_\mathrm{matter}/G\). They consider Horndeski models that are broadly consistent with background and perturbation tests of gravity and with a cosmic expansion history with late-time acceleration. They also take into account the recent result from GW170817 and its counterpart GRB170817A, setting \(c_T=c\). They confirm a conjecture made in earlier work (Pogosian and Silvestri 2016) about MG parameters in Horndeski models, namely that \((\varSigma -1)(\mu -1)\ge 0\) (that is, the two factors must be of the same sign) must hold in viable Horndeski models in the quasi-static approximation. That previous work (Pogosian and Silvestri 2016) also discussed consistency relations between the two MG parameters that, if broken, would exclude some sub-classes of Horndeski models (e.g., \(\varSigma \ne 1\) would rule out all models with a canonical form of kinetic energy). They remark that while the results of Ade et al. (2016b) indicating \(\mu <1\) and \(\varSigma >1\) are not statistically significant, if such values hold in more precise future experiments, that would rule out all Horndeski models. In the latter paper, they show that requiring no ghosts and no gradient instabilities prevents values in the range \(\varSigma -1>0\) and \(\mu -1<0\). They also examined the conjectured condition against the Compton wavelengths considered, and found that observations of the background expansion put constraints on the gravitational coupling, which in turn reinforce the conjectured limits. Finally, they test the validity of the quasi-static approximation in Horndeski models, finding that it holds well at small and intermediate scales but fails at \(k\le 0.001\) h/Mpc.
They conclude that, despite the stringent result from GW observations, there remain Horndeski models with non-trivial modifications to gravity at the level of linear perturbations and large-scale structure. They stress the complementarity of the different approaches used to constrain modifications to GR and the practicality of using the phenomenological \(\varSigma \) and \(\mu \) parameterization and its consistency relations; see also Pogosian and Silvestri (2016).
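The conjectured condition \((\varSigma -1)(\mu -1)\ge 0\) is straightforward to apply to a pair of measured values. The following minimal sketch uses the convention in which GR corresponds to \(\mu =\varSigma =1\); the function name and the optional tolerance are assumptions added for illustration, and measurement uncertainties are ignored.

```python
def satisfies_horndeski_conjecture(mu, sigma, tol=0.0):
    """Check (Sigma - 1)(mu - 1) >= 0, with GR at mu = Sigma = 1.

    `tol` is a hypothetical numerical tolerance: the check passes if the
    product is no smaller than -tol.
    """
    return (sigma - 1.0) * (mu - 1.0) >= -tol


# Same-sign departures from GR are allowed by the conjecture ...
assert satisfies_horndeski_conjecture(mu=1.1, sigma=1.2)
assert satisfies_horndeski_conjecture(mu=0.9, sigma=0.8)
# ... while mu < 1 together with Sigma > 1 is not.
assert not satisfies_horndeski_conjecture(mu=0.9, sigma=1.1)
```

In a real analysis the check would be applied to the joint posterior of the two parameters instead of to point values.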

Another analysis of these self-consistency relations between MG parameters and the growth rate in Horndeski models was performed by Perenon et al. (2017). They considered accelerating Horndeski models with \(-1.1 \le w_{\mathrm{eff}} \le -\,0.9\) and classified them according to their early- or late-time effects as follows. Late-time dark energy, where both the dark energy energy-momentum tensor and the non-minimal gravitational couplings are negligible at early times. Early-time dark energy, where the dark energy energy-momentum tensor is at work even at early times but non-minimal coupling happens at late times only. Finally, early modified gravity, where both the dark energy energy-momentum tensor and the non-minimal gravitational couplings are present at early times, during matter domination. They proposed a convenient way to represent the viability of the models using two diagnostic planes: the \(\mu (z)-\varSigma (z)\) and the \(f(z)\sigma _8(z)-\varSigma (z)\) planes. From their detailed analysis in the first plane they derived the following conclusions. If model-independent measurements find either (i) \(\varSigma -1<0\) at redshift zero or (ii) \(\mu -1<0\) with \(\varSigma -1>0\) at high redshifts (\(z>1.5\)) or (iii) \(\mu -1>0\) with \(\varSigma -1<0\) at high redshifts, Horndeski theories are ruled out. In the second plane, they found that (i) if \(f\sigma _8\) is found to be larger than that of the \(\varLambda \)CDM model at \(z>1.5\), then early dark energy models are ruled out. In the opposite case (for \(f\sigma _8\)), (ii) measuring \(\varSigma <1\) will rule out late dark energy models, while (iii) for \(\varSigma >1\), it is the early modified gravity case described earlier in this paragraph that is allowed.
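The three exclusion criteria in the \(\mu (z)-\varSigma (z)\) plane can be written as a simple decision rule. The sketch below is only a restatement of the conditions listed above, with GR at \(\mu =\varSigma =1\); the function name is hypothetical, measurement uncertainties are ignored, and "redshift zero" is implemented as an exact comparison for simplicity.

```python
def horndeski_ruled_out(mu, sigma, z, z_high=1.5):
    """Apply the three exclusion criteria in the mu-Sigma plane.

    (i)   Sigma - 1 < 0 at z = 0
    (ii)  mu - 1 < 0 with Sigma - 1 > 0 at z > z_high
    (iii) mu - 1 > 0 with Sigma - 1 < 0 at z > z_high
    """
    if z == 0.0 and sigma < 1.0:
        return True
    if z > z_high:
        case_ii = mu < 1.0 and sigma > 1.0
        case_iii = mu > 1.0 and sigma < 1.0
        return case_ii or case_iii
    return False


# Hypothetical point measurements (mu, Sigma, z), for illustration only.
examples = [(1.0, 0.9, 0.0), (0.9, 1.1, 2.0), (1.1, 1.2, 2.0)]
verdicts = [horndeski_ruled_out(mu, sigma, z) for mu, sigma, z in examples]
```

Conditions (ii) and (iii) are the high-redshift counterpart of requiring \(\mu -1\) and \(\varSigma -1\) to share the same sign.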

Fig. 8

Figure reproduced with permission from Ferté et al. (2017)

Contour plots for 68% and 95% CL on the MG parameters \(\varSigma \) and \(\mu \) combining Planck CMB data (TT \(+\) lowP \(+\) CMB lensing), RSD data from BOSS DR12 and 6dFGS, and cosmic shear data from CFHTLenS in blue and DES-SV in red. The crossing point represents the GR values (0,0) according to the authors’ definitions and shows that GR is consistent with the data sets used. The combination for the contours in blue gives among the tightest current constraints on MG parameters: \(\varSigma = -0.01_{-0.04}^{+0.05}\) and \(\mu = -0.06 \pm 0.18\) (68% confidence level)

Ferté et al. (2017) performed an analysis to constrain the two MG parameters but using the definitions \([1 + \mu (a)]\) and \([1 + \varSigma (a)]\) to enter in the Poisson and lensing equations instead of \(\mu (a)\) and \(\varSigma (a)\) so taking 0 values in the GR case instead of