1 Introduction

The search for an explanation for the current phase of accelerated expansion of the universe is one of the most important problems in modern cosmology [3]. Given the available observational information, the concordance model of cosmology that best fits the data is the flat \(\varLambda \)CDM model [4], where the universe is filled with cold dark matter (CDM) and a dark energy (DE) component –in the form of a cosmological constant \(\varLambda \)– in addition to the standard baryonic and electromagnetic ingredients. The main concern with the DE is that its physical interpretation is still unknown, so other plausible causes of the observed accelerated expansion should be explored [5,6,7,8,9,10]. Because \(\varLambda \)CDM is based on Einstein’s general relativity (GR), one possible route is to explore cosmological models not based on GR [11,12,13,14,15,16,17,18,19,20]. However, there is a degeneracy problem between models supported by GR and models based on other metric theories, generically termed Modified Gravity (MG) theories. This degeneracy cannot be broken using data from the dynamical evolution of the Universe alone, as the same background dynamics can be explained by different MG theories as well as by evolving DE fluids, \(\omega =\omega (a)\), where a is the scale factor [21,22,23]. The best observable to discriminate between MG theories and \(\varLambda \)CDM is the growth rate of cosmic structures, f(a), defined as [24]

$$\begin{aligned} f(a) \equiv \dfrac{d\ln \delta }{d\ln a}, \end{aligned}$$
(1)

where \(\delta \) is the matter density contrast. The growth rate \(f = f(a)\) indeed has the potential to discriminate between models based on MG theories and \(\varLambda \)CDM DE (based on GR) through measurements of the growth index, \(\gamma \), when one parametrizes f as [11]

$$\begin{aligned} f(a) = \varOmega _{m}(a)^{\gamma }. \end{aligned}$$
(2)

In the \(\varLambda \)CDM concordance model, \(\gamma \simeq 0.55\), while for some MG models \(\gamma \) evolves with time, e.g., \(\gamma \simeq 0.41 - 0.21z\) [25]. In fact, changing the gravitational theory affects the way matter clumps at all scales, beyond the expansion of the homogeneous universe. As such, alternative cosmological models based on MG theories make very different predictions for matter clustering and its evolution.

Direct measurements of f(a), or equivalently of f(z), where the scale factor and the redshift are related by \(a=1/(1+z)\), are difficult to obtain. Instead, the available data come from measurements of galaxy redshift-space distortions in the form of a product of two quantities evolving with time: \([f\sigma _8](z) \equiv f(z) \sigma _8(z)\), where \(\sigma _8\) is the root-mean-square linear fluctuation of the matter distribution at the scale of 8 Mpc/h, with h the dimensionless Hubble parameter defined by \(H_0 = 100 \,h\) km/s/Mpc. Notice that the letter f refers to the growth rate of cosmic structures, that is, it represents the function f(a) or f(z), but it is also commonly used in the form f(R) to refer to MG theories. In what follows the symbol f(R), as a function of R, refers only to MG theories, while the letter f alone, or the function f(a) or f(z), refers to the cosmic growth rate. In this paper, we use this observational quantity, \([f\sigma _8]\), together with the background evolution of the universe, H, to explore the parameter space and test a model of f(R) gravity, namely, the Starobinsky f(R) model.

We use a Markov Chain Monte Carlo (MCMC) method to explore the parameter space of the theory with the selected datasets for H(z) and \([f\sigma _8](z)\), both individually and jointly. In the end, we further extend our analysis with a joint study of the observables proposed by Linder [2] and, in the context of Modified Gravity, more recently by Matsumoto [26].

This work is organized as follows. A brief description of the Starobinsky cosmological model [1] is presented in Sect. 2. In Sect. 3 we provide details of the two data sets used in the analyses, and in Sect. 4 we describe the methodology of the analyses performed, show our results, and provide their statistical interpretation. The conclusions obtained from these analyses and the final remarks of our work are addressed in Sect. 5.

2 Starobinsky f(R) model cosmology

2.1 Background space-time

A general f(R) theory in the metric formalism that is minimally coupled to matter has an action [27]

$$\begin{aligned} S = \int d^4 x \sqrt{-g}\left[ \frac{1}{2\kappa }f(R) + {\mathcal {L}}_m(g_{\mu \nu }) \right] , \end{aligned}$$
(3)

where we include in the action the matter Lagrangian density, \({\mathcal {L}}_m\), and define \(\kappa \equiv 8\pi G\) (we adopt units with \(c = 1\)). This allows us to use geometrized units more easily when needed. In this work we define the function f(R) as in the model introduced by Starobinsky [1]

$$\begin{aligned} f(R) \equiv R + \uplambda R_0 \left( \left( 1+\frac{R^2}{R_0^2}\right) ^{-n} -1\right) . \end{aligned}$$
(4)

Here \(\uplambda \) is a dimensionless parameter and \(R_0\) is a characteristic curvature scale of the order of the observed DE density [28]; both can be constrained by the measured cosmological parameters using Eq. (13). In this paper we analyse the cases \(n=1\) and \(n=2\), which are extensively studied in the literature [28, 29]. While the model with \(n=1\) is known to have difficulty in passing solar system tests and in reproducing the observed matter power spectrum [1, 30], it is still used in the literature as a prototypical example of the theory, as well as in fits to data and MCMC analyses [28, 31, 32]. The parameter range for models which pass the solar system tests is \(n\ge 2\), which also has the most pronounced effect on the evolution of matter perturbations [33, 34].

The variation with respect to the metric gives rise to the field equations of this model, given by the extended Einstein equations

$$\begin{aligned}&F(R) R_{\mu \nu } - \frac{1}{2}f(R)g_{\mu \nu } - [\nabla _\mu \nabla _\nu - g_{\mu \nu }\square ]F(R) = \kappa T_{\mu \nu }, \end{aligned}$$
(5)
$$\begin{aligned}&F(R)R - 2f(R) + 3\,\square F(R) = \kappa T, \end{aligned}$$
(6)

where we define \(F(R) \equiv df(R) / d R\) and \(T \equiv g^{\mu \nu }T_{\mu \nu }\); the second equation is the trace of the first one and is useful to derive the dynamics of F(R), which are more involved than in GR, where one has an algebraic relation between R and T [35].

To model our cosmological space-time we adopt the Friedmann-Lemaître-Robertson-Walker (FLRW) metric for the background space-time, filled with a perfect fluid, with metric and stress-energy tensor given, respectively, by

$$\begin{aligned}&ds^2 = -dt^2 + a^2(t)\left[ \frac{dr^2}{1 - Kr^2} + r^2 (d\theta ^2 + \sin ^2\theta \,d\phi ^2) \right] , \end{aligned}$$
(7)
$$\begin{aligned}&T^{\mu \nu } = (\rho + P)u^\mu u^\nu + Pg^{\mu \nu } , \end{aligned}$$
(8)

where, as usual, K is the curvature of the 3-space, and \(\rho \) and P are the density and pressure of the perfect fluid, respectively. \(u^{\mu }\) is the 4-velocity of an observer comoving with the fluid; \(a=a(t)\) is the dimensionless scale factor, with \(a(t_0)=1\), where \(t_0\) is today’s cosmic time, and we always use geometrized units, unless otherwise noted. Because we shall compare our analyses with those of the flat \(\varLambda \)CDM model, we consider the flat FLRW space-time, i.e., \(K=0\).

One can analyse the dynamical evolution of our cosmological model by using Eqs. (7) and (8) in the field equations (5), obtaining the modified Friedmann equations

$$\begin{aligned}&\!H^2 = \frac{1}{3F(R)}\left[ \kappa \rho + \frac{RF(R) - f(R)}{2} - 3H{\dot{R}} F'(R) \right] , \end{aligned}$$
(9)
$$\begin{aligned}&2{\dot{H}} + 3H^2 = -\frac{1}{F(R)} \left[ \kappa P + {\dot{R}}^2F''(R) + 2H{\dot{R}}F'(R) + {\ddot{R}}F'(R) + \frac{1}{2} \left( f(R) - RF(R)\right) \right] , \end{aligned}$$
(10)

where an overdot denotes a derivative with respect to the cosmic time t. These equations must obey certain constraints in order for our model to be stable and to correctly reproduce the late-time acceleration of the universe, without deviating from \(\varLambda \)CDM at early times (associated with \(R \gg R_0\)). An analysis of such constraints on general f(R) theories can be found in [35]. Following [1], it suffices that Eq. (4) satisfies, for the values \(n=1\) and \(n=2\),

$$\begin{aligned}&F(R)> 0, \quad F'(R)> 0, \quad R \ge R_1, \quad \uplambda > \frac{8\sqrt{3}}{3}, \quad n=1, \end{aligned}$$
(11)
$$\begin{aligned}&\uplambda > \frac{\sqrt{\sqrt{13}-2}}{2}, \quad n=2, \end{aligned}$$
(12)

where \(R_1\) is the Ricci curvature of a de Sitter point in the space of solutions. More stringent constraints on the model parameters n and \(R_1\), can be obtained by likelihood analyses on various data sets for the Starobinsky and other f(R) models, like the ones performed in Ref. [31], whereas more extensive and general analyses of the stability of the model can be found in Ref. [36].

It will be convenient to write Eq. (4) in terms of the measured cosmological parameters \(\varOmega _{m0} = \frac{8\pi G \rho _{m0}}{3 H_0^2}\) and \(\varOmega _{\varLambda 0} = \frac{8\pi G \rho _{\varLambda }}{3 H_0^2}\). One finds the correct identification of the parameters by requiring that the model reduce to \(\varLambda \)CDM in the high-z limit [1, 28]

$$\begin{aligned} f(R) = R + 6\uplambda H_0^2(1-\varOmega _{m}) \left( \left( 1+\frac{R^2}{[6H_0^2(1-\varOmega _{m})]^2}\right) ^{-n} -1\right) . \end{aligned}$$
(13)

This relation allows us to constrain \(R_0\) and \(\uplambda \) using data linked to the observables \(\varOmega _{m0}\) and \(H_0\).
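As an illustration, a minimal numerical sketch of Eq. (13) and of its derivative \(F(R) = df/dR\) might read as follows (Python; the names lam, n, Om0 and H0 stand for \(\uplambda \), n, \(\varOmega _{m0}\) and \(H_0\), and the snippet is only an illustrative aid, not part of the analysis pipeline described below):

```python
def f_R(R, lam, n, Om0, H0):
    """Starobinsky f(R) in the parametrization of Eq. (13)."""
    R0 = 6.0 * H0**2 * (1.0 - Om0)               # characteristic curvature scale
    return R + lam * R0 * ((1.0 + (R / R0)**2)**(-n) - 1.0)

def F_R(R, lam, n, Om0, H0):
    """F(R) = df/dR, the analytic derivative of Eq. (13)."""
    R0 = 6.0 * H0**2 * (1.0 - Om0)
    x = R / R0
    return 1.0 - 2.0 * n * lam * x * (1.0 + x**2)**(-n - 1)

# Illustration, working in units where H0 = 1: at a present-day curvature of order
# 10 H0^2 the deviation from the GR value F = 1 is at the few-percent level.
print(F_R(10.0, lam=1.0, n=2, Om0=0.3, H0=1.0))
```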

2.2 Metric perturbations

In order to go beyond the background cosmological observables and test our theory against structure growth we need to introduce metric perturbations. Following the general perturbative procedure for scalar-tensor theories found in [37], in the usual Newtonian (or longitudinal) gauge the perturbed metric is given by

$$\begin{aligned} ds^2 = -(1+2\varPsi )dt^2 + (1-2\varPhi )\,a^2(t)\,\delta _{ij} dx^i dx^j, \end{aligned}$$
(14)

where \(i,j=1,2,3\), and \(\varPsi , \varPhi \) are the Bardeen potentials, which satisfy \(\varPsi = \varPhi \) in the absence of anisotropic stress. The physical processes we are interested in, and the cosmological observables associated with them (i.e., accelerated cosmic expansion and growth of cosmic structures), are all well within the sub-horizon approximation \(k^2 \gg a^2H^2\). In this regime, after using the modified Poisson equation to eliminate the potential \(\varPhi \), the evolution equation for the matter density contrast, \(\delta \), can be written as

$$\begin{aligned} {\ddot{\delta }} + 2H{\dot{\delta }} = 4\pi G\, \mu (k,a)\,{\rho _{m}}\,\delta , \end{aligned}$$
(15)

where G is the Newtonian gravitational constant. The \(\mu (k,a)\) factor is written as

$$\begin{aligned} \mu (k,a) \equiv \frac{1}{F(R)}\left( \frac{1 + 4\frac{k^2}{a^2 R}\,m}{1 + 3\frac{k^2}{a^2 R}\,m}\right) , \end{aligned}$$
(16)

where m is a parameter that quantifies the deviation from the \(\varLambda \)CDM model

$$\begin{aligned} m \equiv \frac{R F^\prime (R)}{F(R)} \implies m \vert _{{\small \varLambda CDM}} = 0. \end{aligned}$$
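Under the same assumptions, the deviation parameter m and the effective coupling \(\mu (k,a)\) of Eq. (16) can be sketched as follows (Python; F_R repeats the derivative of Eq. (13) from the previous sketch, Fprime_R is its analytic derivative, R0 stands for \(6H_0^2(1-\varOmega _{m0})\), and all names are illustrative):

```python
def F_R(R, lam, n, R0):
    """F(R) = df/dR for the Starobinsky model, with R0 = 6 H0^2 (1 - Omega_m0)."""
    x = R / R0
    return 1.0 - 2.0 * n * lam * x * (1.0 + x**2)**(-n - 1)

def Fprime_R(R, lam, n, R0):
    """F'(R) = d^2 f / dR^2, the analytic derivative of F_R."""
    x = R / R0
    return (2.0 * n * lam / R0) * (1.0 + x**2)**(-n - 2) * ((2.0 * n + 1.0) * x**2 - 1.0)

def m_dev(R, lam, n, R0):
    """Deviation parameter m = R F'(R) / F(R); m = 0 recovers the LambdaCDM limit."""
    return R * Fprime_R(R, lam, n, R0) / F_R(R, lam, n, R0)

def mu(k, a, R, lam, n, R0):
    """Effective coupling mu(k, a) of Eq. (16)."""
    m = m_dev(R, lam, n, R0)
    x = k**2 / (a**2 * R)
    return (1.0 + 4.0 * x * m) / (1.0 + 3.0 * x * m) / F_R(R, lam, n, R0)
```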

Equation (15) allows us to analyse how our cosmological model behaves in the limits of very small and very large scales; since R is of the order of \(H^2(z)\), we obtain

$$\begin{aligned}&\lim _{k^2 \gg a^2 H^2} \mu (k) = \frac{4}{3}\,\frac{1}{F(R)}, \end{aligned}$$
(17)
$$\begin{aligned}&\lim _{k^2 \ll a^2 H^2} \mu (k) = \frac{1}{F(R)}, \end{aligned}$$
(18)

and we observe that, at small scales, the modification of GR gives an extra factor of \(\frac{4}{3}\) in the force term of the equation; gravity therefore becomes effectively stronger and cosmic structures grow faster. In the opposite limit, at large scales, Eq. (15) has the same form as in GR [37], apart from the factor 1/F(R), which arises naturally in the theory. In the language of scalar-tensor theories, this is the term that couples the scalar field to gravity.

The regime of cosmic structure formation lies between these scales, so we need the full equation (15) to account for the physics in this regime. However, at high redshift, or for perturbation scales of the order of the Hubble radius \(H_0^{-1}\) (low redshift), the growth of structure should approach the \(\varLambda \)CDM behavior.

It is useful to rewrite Eq. (15) as an ODE in the variable \(N \equiv \ln {a}\), also called the e-fold number, which is obtained by a simple application of the chain rule

$$\begin{aligned} \delta '' + \left( 2 + \frac{H'}{H}\right) \delta ' - \frac{3}{2}\varOmega _m(a)\, \mu (k,a)\,\delta = 0, \end{aligned}$$
(19)

where a prime denotes a derivative with respect to N, and \(\varOmega _{m}(a)\) is the density parameter of (dark + baryonic) matter as a function of the scale factor a. This is the equation to be solved in order to find the linear growth in our model, using the definition of the growth rate of cosmic structures f(a) given in Eq. (1), in the same way as in GR; a numerical sketch of this procedure is given below.
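A minimal sketch of how Eq. (19) can be integrated in practice is shown below (Python with scipy). To keep the example self-contained, the background is approximated by flat \(\varLambda \)CDM, which the Starobinsky model is constructed to mimic, and \(\mu \) is passed as a user-supplied callable, defaulting here to the GR value \(\mu = 1\); the normalization sigma8_0 is an illustrative input, not a fitted value.

```python
import numpy as np
from scipy.integrate import solve_ivp

def growth(Om0, mu_of_N=lambda N: 1.0, sigma8_0=0.8, N_ini=np.log(1e-3)):
    """Integrate Eq. (19) in N = ln(a); return z, f(z) and [f sigma_8](z)."""
    def E2(N):                        # H^2/H0^2 for a flat LambdaCDM-like background
        return Om0 * np.exp(-3.0 * N) + (1.0 - Om0)

    def Om_a(N):                      # Omega_m(a) = Omega_m0 a^-3 / E^2
        return Om0 * np.exp(-3.0 * N) / E2(N)

    def dlnH_dN(N, eps=1e-4):         # H'/H by central finite differences
        return 0.5 * (np.log(E2(N + eps)) - np.log(E2(N - eps))) / (2.0 * eps)

    def rhs(N, y):                    # y = (delta, delta'); right-hand side of Eq. (19)
        delta, ddelta = y
        return [ddelta,
                -(2.0 + dlnH_dN(N)) * ddelta
                + 1.5 * Om_a(N) * mu_of_N(N) * delta]

    N_grid = np.linspace(N_ini, 0.0, 300)
    # matter-dominated initial conditions: delta ~ a, hence delta' = delta
    sol = solve_ivp(rhs, (N_ini, 0.0), [np.exp(N_ini), np.exp(N_ini)],
                    t_eval=N_grid, rtol=1e-8)
    delta, ddelta = sol.y
    f = ddelta / delta                            # growth rate, Eq. (1)
    sigma8 = sigma8_0 * delta / delta[-1]         # sigma_8(z) scales with delta
    z = np.exp(-N_grid) - 1.0
    return z, f, f * sigma8

z, f, fs8 = growth(Om0=0.3)   # LambdaCDM/GR limit; pass a different mu_of_N for f(R)
```

For the f(R) case, the callable mu_of_N would evaluate \(\mu (k,a)\) of Eq. (16) along the chosen background at the fixed scale k.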

3 Observational data

There are several compilations of H(z) and \([f \sigma _8](z)\) observational data in the literature. In this section we present the datasets we consider to impose observational constraints on the Starobinsky model.

3.1 H(z) data

The expansion history of the universe is probed by the observational Hubble parameter H(z). It can be measured by several independent methodologies, most of them based on distance measurements of extragalactic objects such as supernovae and quasars. However, in some approaches the cosmological distance measurements depend on a fiducial model, which makes the use of these data problematic when the objective is to constrain the free parameters of cosmological models in alternative scenarios.

In this work, since we are considering a gravity model of the f(R) type, we use in our analyses the H(z) measurements obtained with the differential age technique, which is independent of a fiducial model. This technique, also known as cosmic chronometers (CC), was proposed by Jimenez and Loeb in [38].

The basic idea of the CC method is the spectroscopic determination of the age difference between two passively evolving early-type galaxies. The use of old, passively evolving galaxies for the age-difference measurements is important to ensure that the galaxies were formed at essentially the same time, although they are located at slightly different redshifts.

The Hubble parameter is directly related to the measured quantity dz/dt by

$$\begin{aligned} H(z) = -\frac{dz}{dt}\frac{1}{(1+z)}. \end{aligned}$$
(20)

In this work we use the compilation of 31 measurements of H(z), covering the range \(0.07< z < 1.965\) [39]. The data are shown in Table 1 and in Fig. 1.
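Schematically, Eq. (20) amounts to a finite difference over effective ages and redshifts of passively evolving galaxy samples; a toy illustration (Python, with entirely made-up numbers, not the data of Table 1) is:

```python
import numpy as np

def H_from_ages(z, t_gyr):
    """Differential-age estimate of H(z), Eq. (20): H = -(dz/dt)/(1+z)."""
    z = np.asarray(z, dtype=float)
    t = np.asarray(t_gyr, dtype=float)            # galaxy ages in Gyr
    z_mid = 0.5 * (z[1:] + z[:-1])
    dz_dt = np.diff(z) / np.diff(t)               # negative: galaxies at higher z are younger
    H = -dz_dt / (1.0 + z_mid)                    # Eq. (20), in 1/Gyr
    return z_mid, H * 977.8                       # convert 1/Gyr to km/s/Mpc

# toy pair of (redshift, age) values, illustrative only:
z_mid, H = H_from_ages([0.20, 0.25], [11.0, 10.5])
print(z_mid, H)                                   # ~0.225, ~80 km/s/Mpc
```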

Table 1 The 31 Hubble parameter data points, H(z), and their respective \(1\,\sigma \) errors \(\sigma _H(z)\) from the CC data [39]. The units for both \(H(z), \sigma _H(z)\) are km/s/Mpc
Fig. 1 The CC Hubble parameter data, H(z), displayed in Table 1

3.2 \([f \sigma _8](z)\) data

Precise measurements of \([f \sigma _{8}](z)\) can be obtained with the Redshift Space Distortions (RSD) approach, by studying the peculiar velocities caused by local gravitational potentials, which introduce distortions in the two-point correlation function of cosmic objects [40]. In fact, the calculation of the anisotropic two-point correlation function, \(\xi (s, \mu )\) [41], allows us to measure \(f\sigma _{8}\), where \(\sigma _{8}\) is the root-mean-square linear fluctuation of the matter distribution at the scale of 8 Mpc/h (for other approaches to study matter clustering see, e.g., [42,43,44,45]). The literature reports diverse compilations of \([f\sigma _8](z)\) measurements (see, e.g., [46, 47]), which we update here. Our compilation, shown in Table 2 and in Fig. 2, is built according to the following criteria:

  1. we consider \([f\sigma _8](z)\) data obtained from disjoint or uncorrelated redshift bins when the measurements concern the same cosmological tracer, and data from possibly correlated redshift bins when different cosmological tracers were analyzed;

  2. we consider the latest measurement of \([f\sigma _8](z)\) when the same survey collaboration performed two or more measurements corresponding to several data releases.

Table 2 Updated compilation of 26 \([f\sigma _8](z)\) data with their respective \(1\sigma \) errors, \(\sigma _{f\sigma _8}\)
Fig. 2 The \([f\sigma _8](z)\) data displayed in Table 2

3.3 Joint data

The binned data given in Table 3 were obtained by simultaneously binning the \([f\sigma _8](z)\) and H(z) measurements (from Tables 2 and 1, respectively) in 5 redshift bins: (0.0, 0.30],  (0.30, 0.60],  (0.60, 0.85],  (0.85, 1.4],  (1.4, 2.0], with mean redshifts \({\bar{z}} = \,0.15, \,0.45, \,0.725, \,1.125, \,1.7\). The number of data pairs \((f\sigma _8, H)\) in each bin is (7, 9), (6, 10), (6, 2), (4, 6), (3, 4), respectively. The values of \([f\sigma _8](z)\) and H(z) in these 5 bins, and their errors, correspond to their variance-weighted means [63]. These binned data will be used in the \(\chi ^2\) joint analysis; a sketch of the binning procedure is given below.
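A minimal sketch of the variance-weighted binning used to build Table 3 follows (Python; the bin edges are those quoted in the text, while z, y, sigma are placeholders for the entries of Tables 1 and 2):

```python
import numpy as np

EDGES = [0.0, 0.30, 0.60, 0.85, 1.4, 2.0]         # redshift bin edges quoted in the text

def weighted_bin(z, y, sigma, edges=EDGES):
    """Variance-weighted mean and error of y(z) in each redshift bin."""
    z, y, sigma = (np.asarray(v, dtype=float) for v in (z, y, sigma))
    means, errors = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (z > lo) & (z <= hi)
        w = 1.0 / sigma[sel]**2                   # inverse-variance weights
        means.append(np.sum(w * y[sel]) / np.sum(w))
        errors.append(np.sqrt(1.0 / np.sum(w)))
    return np.array(means), np.array(errors)

# usage sketch: fs8_bin, fs8_err = weighted_bin(z_fs8, fs8, sig_fs8)
#               H_bin,   H_err   = weighted_bin(z_H,   H,   sig_H)
```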

4 Analyses and results

In this section we describe the parameter space analyses of the f(R) model given in Eq. (4), using the observational data described in Sect. 3. To obtain the theoretical predictions for H(z) and f(z) from the chosen model, Eq. (4), we integrate Eqs. (9) and (19) numerically and use their results to assemble the likelihood functions of our statistical analyses.

Standard Bayesian inference is used for the parameter estimation. To investigate how well the data constrain the parameter space of the model, the datasets are first considered separately and later combined in a joint analysis.

4.1 H(z) and \([f\sigma _8](z)\) observational constraints

Recently, the tension between different estimates of the Hubble constant has drawn the attention of cosmologists. The local determination of the Hubble constant by the SH0ES collaboration [64] is \(H_0^{\tiny {\hbox {SHOES}}}= 74.03 \pm 1.42 \) km/s/Mpc, while the value inferred by the Planck collaboration [4] from the Cosmic Microwave Background (CMB) analysis, in a flat \(\varLambda \)CDM framework, is \(H_0^{\hbox {\tiny Planck}} = 67.36 \pm 0.54\) km/s/Mpc.

This divergence in the \(H_0\) measurements is the main reason to look for alternative approaches that are independent of the \(H_0\) value [65]. In our analyses, we shall consider \(H(z)/H_0\) instead of H(z) data, and for this we use \(H_0 = H_0^{\tiny {\hbox {Planck}}}\) [4].

For the model given in Eq. (4) and the H(z) data, the log-likelihood function is given, up to an additive constant, by

$$\begin{aligned} \ln {\mathcal {L}}(z|\theta ) = -\frac{1}{2}\sum _{i}^{N_H} \, \frac{( H^i_{th}(z|\theta ) - H^i_{obs}(z))^2}{\sigma _i ^2}, \end{aligned}$$
(21)

where \(N_H\) is the number of points in the dataset, \(\theta ={(\varOmega _{m0},\uplambda )}\) are the free parameters of the model, \(H^i_{th}(z|\theta )\) is the theoretical Hubble parameter at redshift \(z_i\), and \(H^i_{obs}(z)\) and \(\sigma _i\) are the observed values of the Hubble parameter and their errors, given in Table 1.

The theoretical Hubble parameter depends on the current value \(H_0\). To eliminate this dependency, we follow the approach introduced in [66], which consists of analytically marginalizing the likelihood over the \(H_0\) parameter, yielding (up to an additive constant)

$$\begin{aligned} -2\ln {\mathcal {L}}= \varGamma -\frac{B^2}{A} + \ln {A} - 2\ln {\left[ 1+\text {erf} \left( \frac{B}{\sqrt{2A}}\right) \right] }, \end{aligned}$$
(22)

where, by the definition of \(E(z) \equiv H_{th} / H_0\),

$$\begin{aligned} A= & {} \sum ^N_{i=1} \frac{E^2(z_i)}{\sigma _i^2}, \end{aligned}$$
(23)
$$\begin{aligned} B= & {} \sum ^N_{i=1} \frac{E(z_i)H_{obs}(z_i)}{\sigma _i^2}, \end{aligned}$$
(24)

and

$$\begin{aligned} \varGamma = \sum ^N_{i=1} \frac{H_{obs}^2(z_i)}{\sigma _i^2}. \end{aligned}$$
(25)
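A compact sketch of this marginalized statistic, Eqs. (22)–(25), could read as follows (Python; E_th is the model prediction \(E(z) = H_{th}/H_0\) evaluated at the data redshifts, and the returned quantity is \(-2\ln {\mathcal {L}}\) up to an additive constant):

```python
import numpy as np
from scipy.special import erf

def chi2_H_marginalized(E_th, H_obs, sigma):
    """H0-marginalized statistic for the H(z) data, Eqs. (22)-(25)."""
    E_th, H_obs, sigma = (np.asarray(v, dtype=float) for v in (E_th, H_obs, sigma))
    A = np.sum(E_th**2 / sigma**2)                        # Eq. (23)
    B = np.sum(E_th * H_obs / sigma**2)                   # Eq. (24)
    Gamma = np.sum(H_obs**2 / sigma**2)                   # Eq. (25)
    return Gamma - B**2 / A + np.log(A) \
        - 2.0 * np.log(1.0 + erf(B / np.sqrt(2.0 * A)))   # Eq. (22)
```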

Similarly, the log-likelihood function for the \([f\sigma _8](z)\) data is given, up to an additive constant, by

$$\begin{aligned} \ln {\mathcal {L}}(z|\theta ) = -\frac{1}{2}\sum _{i}^{N_f} \, \frac{( [f\sigma _8]^{i}_{th}(z|\theta ) - [f\sigma _8]^{i}_{obs}(z))^2}{\sigma _i ^2}, \end{aligned}$$
(26)

where we have the same parameter space \(\theta ={(\varOmega _{m0},\uplambda )}\), \([f\sigma _8]^{i}_{th}(z|\theta )\) is the theoretical growth function given by the model with parameters \(\theta \) at redshift \(z_i\), \([f\sigma _8]^{i}_{obs}(z)\) and \(\sigma _i\) are the observed values and their errors, given in Table 2, and \(N_f\) is the size of the \(f\sigma _8\) dataset. The \(f\sigma _8\) function does not have \(H_0\) as a free parameter and therefore does not need to be marginalized. The parameter space is the same for both observables, H(z) and \([f\sigma _8](z)\), and the joint likelihood function is given by the product of the individual likelihoods,

$$\begin{aligned} {\mathcal {L}} = {\mathcal {L}}_{H}\times {\mathcal {L}}_{f\sigma _8}. \end{aligned}$$
(27)
Table 3 Binned data of \([f\sigma _8](z)\) and H(z) obtained by calculating the variance weighted mean in each redshift bin (see the text for details)

The Bayesian inference of the parameters is obtained through the expected values of the posterior density \(p(\theta |z)\). Since the posterior distribution of the model is not known analytically, i.e., it cannot in general be approximated by a normal (Gaussian) distribution, we sample it using Bayes’ theorem, which establishes

$$\begin{aligned} p(\theta |z) \propto {\mathcal {L}}(z|\theta ) \times \varPi (\theta ), \end{aligned}$$
(28)

where \(\varPi (\theta )\) is the prior.

The parameter space of the model is explored with the Markov chain Monte Carlo (MCMC) methodology to generate a set of samples from the posterior distribution. To implement the MCMC we use emcee [67], an open-source Python affine-invariant ensemble sampler.

The posterior distribution was sampled with flat priors over the following intervals in the \(n=1\) case: \(\uplambda ^{-1} \in [0.001, 1.8]\) and \(\varOmega _{m0} \in [0.2, 0.6]\) for the H(z) data set; \(\uplambda ^{-1} \in [0.001, 2.0]\) and \(\varOmega _{m0} \in [0.15, 0.3]\) for the \([f\sigma _8](z)\) data set; and \(\uplambda ^{-1} \in [0.001, 2.0]\) and \(\varOmega _{m0} \in [0.2, 0.4]\) for the combined data set. For the \(n=2\) case we also use flat priors in all the MCMC runs, with the same intervals throughout: \(\uplambda ^{-1} \in [0.1, 2]\) and \(\varOmega _{m0} \in [0.15, 0.4]\). A sketch of the sampling setup is given below.
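A minimal sketch of the corresponding sampling setup is shown below (Python with emcee). The functions model_H_over_H0 and model_fs8 are placeholders for the numerical solutions of Eqs. (9) and (19) discussed above, chi2_H_marginalized stands for the marginalized statistic of Eq. (22), and the prior intervals are those quoted for the n = 1 joint analysis:

```python
import numpy as np
import emcee

def log_prior(theta):
    inv_lam, Om0 = theta                          # flat priors of the n = 1 joint run
    if 0.001 < inv_lam < 2.0 and 0.2 < Om0 < 0.4:
        return 0.0
    return -np.inf

def log_posterior(theta, z_H, H_obs, sig_H, z_f, fs8_obs, sig_f):
    lp = log_prior(theta)
    if not np.isfinite(lp):
        return -np.inf
    E_th = model_H_over_H0(z_H, theta)            # placeholder: E(z) = H_th/H0
    fs8_th = model_fs8(z_f, theta)                # placeholder: theoretical [f sigma_8](z)
    lnL_H = -0.5 * chi2_H_marginalized(E_th, H_obs, sig_H)   # Eq. (22)
    lnL_f = -0.5 * np.sum((fs8_th - fs8_obs)**2 / sig_f**2)  # Eq. (26)
    return lp + lnL_H + lnL_f                     # joint likelihood, Eq. (27)

ndim, nwalkers = 2, 32
p0 = np.column_stack([np.random.uniform(0.5, 1.5, nwalkers),     # lambda^-1
                      np.random.uniform(0.25, 0.35, nwalkers)])  # Omega_m0
# sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior,
#                                 args=(z_H, H_obs, sig_H, z_f, fs8_obs, sig_f))
# sampler.run_mcmc(p0, 5000, progress=True)
# samples = sampler.get_chain(discard=1000, flat=True)
```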

Fig. 3 MCMC simulations for the f(R) model with \(n=1\), considering H(z) data and the flat priors \(\uplambda ^{-1} \in [0.001, 1.8]\) and \(\varOmega _{m0} \in [0.2, 0.6]\)

Fig. 4 MCMC simulations for the f(R) model with \(n=1\), considering \([f\sigma _8](z)\) data and the flat priors \(\uplambda ^{-1} \in [0.001, 2.0]\) and \(\varOmega _{m0} \in [0.15, 0.3]\)

Fig. 5 MCMC simulations for the f(R) model with \(n=1\), considering joint analyses of the H(z) and \([f\sigma _8](z)\) data and the flat priors \(\uplambda ^{-1} \in [0.001, 2.0]\) and \(\varOmega _{m0} \in [0.2, 0.4]\)

Fig. 6 MCMC simulations for the f(R) model with \(n=2\), considering H(z) data and the flat priors \(\uplambda ^{-1} \in [0.1, 2]\) and \(\varOmega _{m0} \in [0.15, 0.4]\)

Fig. 7 MCMC simulations for the f(R) model with \(n=2\), considering \([f\sigma _8](z)\) data and the flat priors \(\uplambda ^{-1} \in [0.1, 2.0]\) and \(\varOmega _{m0} \in [0.15, 0.4]\)

Fig. 8 MCMC simulations for the f(R) model with \(n=2\), considering joint analyses of the H(z) and \([f\sigma _8](z)\) data and the flat priors \(\uplambda ^{-1} \in [0.1, 2.0]\) and \(\varOmega _{m0} \in [0.15, 0.4]\)

We summarize our results in Table 4, and the parameter space contours obtained can be seen in blue in Figs. 3, 4, and 5 for the \(n=1\) case; and in green in Figs. 6, 7, and 8 for the \(n=2\) case.

Table 4 Results of our likelihood analyses for the cosmological parameters and their uncertainties

4.2 \([f \sigma _8 - H]\) diagram analyses

Recently, Linder [2] and Matsumoto et al. [26] have proposed a joint analysis of the H(z) and \([f\sigma _8](z)\) cosmological observables as a way to break the degeneracy between DE and MG models, which also allows one to identify the redshift regime where the tested model most affects the growth of cosmic structures.

In Moresco et al. [63], the authors used a joint statistical analysis of both these observables and applied the method of Linder [2] to analyze individually the parameters of \(\varLambda \)CDM + \(\varSigma m_\nu \) and wCDM (for cosmological parameter analyses with massive neutrinos see, e.g., [68]). They obtained \(1\sigma \) constraints on the parameters from growth structure data at low redshifts (\(z < 2\)) and from the last Planck (2018) data release. Of particular interest was the degeneracy between models with massive neutrinos and a modified growth parameter \(\gamma \): according to that analysis, the models investigated provide a better fit than the flat \(\varLambda \)CDM model as constrained by the Planck mission [69]. The authors then simulated data points in the \(f\sigma _8 - H\) plane and found that the models could be distinguished with high statistical significance using this plane. This shows that the joint analysis proposed by Linder [2] and Matsumoto et al. [26] can be contrasted with data to distinguish models that are degenerate in the cosmological parameter fits. In particular, we are interested in studying how the Starobinsky f(R) model (see Sect. 2.1) can be constrained by using \([f\sigma _8](z)\) and H(z) data in the diagram proposed by the aforementioned authors.

Fig. 9 Comparison of the evolution of the ratios of observables in the Starobinsky f(R) model to their \(\varLambda \)CDM values. Each point is the theoretical prediction for a given redshift. Different colored curves correspond to different values of the model parameter, \(\uplambda ^{-1} = 0.3, 0.7, 1\), with \(\varOmega _{m0} = 0.3\) and \(n=1\)

One can build an f(R) modified gravity model that mimics the background evolution of \(\varLambda \)CDM, like the one presented in Sect. 2.1 and the well-known Hu–Sawicki model [21]. Both models have a cosmological evolution similar to \(\varLambda \)CDM, whereas the growth of structures is quite different from that of the \(\varLambda \)CDM model, owing to the scale dependence of the solution \(\delta (k, a)\) of Eq. (19). In Refs. [2, 26] it is shown how the \(f\sigma _8-H\) plot can help distinguish between MG and \(\varLambda \)CDM-type models (i.e., \(\varLambda \)CDM, wCDM or \(w_0 w_a\)CDM models with different parameters). To perform a similar analysis for our model, we fix the scale of the perturbations at \(k = 0.1\,h\)/Mpc, around the scale where measurements and homogeneity assumptions in the \(\varLambda \)CDM model are made [48, 70], and a scale where linear perturbation theory is valid in f(R) theories and nonlinear effects can be disregarded [71].

In Fig. 9 we show, separately, the ratios \(H / H_{\varLambda CDM}\) and \([f\sigma _8] / [f\sigma _{8\varLambda CDM}]\) between our Starobinsky f(R) model and their \(\varLambda \)CDM values. For \(H / H_{\varLambda CDM}\) the curves are very similar, requiring high precision to distinguish the \(< 2\%\) absolute difference between the values, whereas for \([f\sigma _8](z)\) there is a high degeneracy in some redshift intervals between the same model with different parameters. This degeneracy makes it difficult to use data from these redshift intervals to constrain the true \(\uplambda \) parameter of the model, while also requiring a higher degree of precision to distinguish between the curves. For higher values of redshift, the degeneracy between the \(f\sigma _8\) curves worsens, as they must converge to the \(\varLambda \)CDM value by construction. Thus the redshift interval that allows us to constrain the true parameter of the model with good precision and low degeneracy between the curves depends strongly on our ability to measure the local H(z) history with high precision, and on the value of the model parameter. Using the \(f\sigma _8 / f\sigma _{8\varLambda CDM} \times H/H_{\varLambda CDM}\) joint analysis devised by [26], and the similar \(H(z)/H_0 \times [f\sigma _8](z)\) diagram from [2], to plot the curves for our model, we see that the degeneracy between the different values of the parameters is manifestly broken over the whole parameter space, as shown in Fig. 10 (see also the sketch below).
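Given the model and \(\varLambda \)CDM predictions on a common redshift grid, the conjoined curves of Figs. 9–11 can be assembled as in the sketch below (Python with matplotlib; H_model, fs8_model, H_lcdm and fs8_lcdm are placeholders for the numerical predictions described above, evaluated at \(k = 0.1\,h\)/Mpc for each value of \(\uplambda \)):

```python
import matplotlib.pyplot as plt

def conjoined_curve(H_model, fs8_model, H_lcdm, fs8_lcdm, label):
    """Parametric curve (H/H_LCDM, fsigma8/fsigma8_LCDM), traced by redshift."""
    plt.plot(H_model / H_lcdm, fs8_model / fs8_lcdm, label=label)

# for lam_inv in (0.3, 0.5, 0.7, 1.0):          # placeholder prediction routine
#     H_model, fs8_model = starobinsky_prediction(lam_inv, Om0=0.3, n=1, k=0.1)
#     conjoined_curve(H_model, fs8_model, H_lcdm, fs8_lcdm,
#                     label="lambda^-1 = %.1f" % lam_inv)
# plt.xlabel(r"$H/H_{\Lambda CDM}$")
# plt.ylabel(r"$f\sigma_8/f\sigma_{8,\Lambda CDM}$")
# plt.legend(); plt.show()
```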

Fig. 10 Curves on the \([f\sigma _8] / [f\sigma _8]_{\varLambda CDM} \times H/H_{\varLambda CDM}\) plane for different values of the \(\uplambda \) parameter, for the \(n=1\) case. The curves, while having a similar profile, have different curvatures as well as different start and end points. From left to right we have \(\uplambda ^{-1} = 0.3, 0.5, 0.7, 1\)

With the degeneracy broken, the only limiting factors in distinguishing the curves in the parameter space are the precision and the confidence intervals of the measurements. Figure 11 shows how the parameter space can be further distinguished, now including the \(\varLambda \)CDM model. While the expansion history of our (and of general) f(R) models follows the \(\varLambda \)CDM model closely enough that it cannot be distinguished from the standard model at the background level, when one conjoins the \([f\sigma _8](z)\) evolution with the H(z) expansion of the models, the degeneracy is largely lifted. Although some degree of degeneracy between the curves remains, with the \(H(z)/H_0\) data spanning a large interval, the difficulty in distinguishing between models lies, once again, in the precision of the available data.

Fig. 11 \([f\sigma _8](z) \times H(z)/H_0\) curves for the models \(\varLambda \)CDM and Starobinsky f(R) with different values for the parameter \(\uplambda \) and \(n=1\)

In [2] the author gives an in-depth analysis of the relevant redshift range for better distinguishing dark energy models using this joint analysis of observables, also previewing which future surveys will provide measurements precise enough to further break the degeneracy between models. An extensive analysis using similar methods with other MG models can be found in [26]. In [63] a similar analysis is done considering parameter extensions of the concordance \(\varLambda \)CDM model, with a likelihood approach. Here we follow [63] in using the \(f\sigma _8 - H\) diagram to further distinguish between the parameters of our model.

4.2.1 Goodness of fit between models

We have produced a set of 5 binned data pairs, \((f\sigma _8(z)\), H(z)), for an equal number of redshift bins (see Table 3); they were obtained by calculating the variance-weighted mean in each redshift bin. We use these data in a goodness-of-fit test, via a \(\chi ^2\) methodology, to check which of the parameter sets obtained from the MCMC runs in Sect. 4.1 best fits the joint data. This analysis complements our likelihood analysis, where we obtained best-fit parameters from the three different sets of data described in Sect. 3. Here, we use the binned data to check which of the pairs of parameters obtained from the exploration of parameter space best fits the joint data. Hereafter we call each of the three different best fits obtained from the MCMC runs a “model”: the one with the parameters given by the run on the H(z) data alone, the one from the \(f\sigma _8(z)\) data alone, and the one from the run on both datasets (see Table 4).

Fig. 12 Goodness of fit test for the binned data of Table 3. The blue curve corresponds to \(\varLambda \)CDM, the orange one to the Starobinsky \(n=1\) model with its best-fit parameter \(\uplambda \), and the green one to the \(n=2\) model with its best-fit parameter, both found from the minimization of Eq. (32) in the MCMC analyses

We devised a simple 2D minimum squared weighted deviation (a type of \(\chi ^2\)) using the binned data from Table 3. For each data point we have the individual error on each variable,

  • \(\sigma _{f\sigma _8} \equiv \) \([f\sigma _8](z)\) error,

  • \(\sigma _{H} \equiv H(z)\) error,

and since the data for both observables are independent measurements, i.e., they are not correlated, we may write

$$\begin{aligned} \sigma ^2 \equiv \sigma ^2_{\text {joint}} = \sigma ^2_{H} + \sigma ^2_{f\sigma 8}, \end{aligned}$$
(29)

as the joint error for both variables (the variances added in quadrature).

For each model we have a predicted value \((f\sigma _8(z_i)\), \(H(z_i))\) at redshift \(z_i\), and a data point \((x_i, y_i)\) in the \(f\sigma _8 \times H(z)\) plane. The squared residual between the model prediction \(P_i \equiv (f\sigma _8(z_i),H(z_i))\) and the observed data \(O_i \equiv (x_i, y_i)\) is given by

$$\begin{aligned} (P_i - O_i)^2 = ||(f\sigma _8(z_i) - x_i, H(z_i) - y_i)||^2, \end{aligned}$$
(30)

where \(||\cdot ||\) denotes the Euclidean norm, so that Eq. (30) is the squared distance between the two points \((f\sigma _8(z_i),H(z_i))\) and \((x_i, y_i)\) in the \(f\sigma _8 \times H\) plane.

The \(\chi ^2\) is then defined as the sum of the squared residuals weighted by the joint error of each data point

$$\begin{aligned} \chi ^2 \equiv \sum ^n_i \frac{(P_i - O_i)^2}{\sigma _i^2}. \end{aligned}$$
(31)

The reduced mean squared weighted deviation for each model analyzed is the ratio between \(\chi ^2\) and the number of degrees of freedom \(\nu \), where \(\nu = n - m = 5 - 2 = 3\) for the Starobinsky models and \(\nu = 5 - 1 = 4\) for the flat \(\varLambda \)CDM model:

$$\begin{aligned} \chi ^2_\nu = \frac{\chi ^2}{\nu }. \end{aligned}$$
(32)

In our case the flat \(\varLambda \)CDM model has one free parameter, \(\varOmega _{\varLambda 0}\) (with \(\varOmega _{m0}\) constrained by the algebraic relation \(\varOmega _{\varLambda 0} + \varOmega _{m0} = 1\)), one fewer than the Starobinsky models analysed, which have two: \(\varOmega _{m 0}\) and \(\uplambda ^{-1}\). A sketch of the computation of this statistic is given below.
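A minimal sketch of the 2D statistic of Eqs. (29)–(32) follows (Python; the data arrays stand for the five binned pairs of Table 3 and the model arrays for the corresponding predictions; n_params is 2 for the Starobinsky cases and 1 for flat \(\varLambda \)CDM):

```python
import numpy as np

def reduced_chi2(fs8_model, H_model, fs8_data, H_data, sig_fs8, sig_H, n_params):
    """Reduced 2D chi^2 of Eqs. (29)-(32) in the fsigma8-H plane."""
    arrays = (fs8_model, H_model, fs8_data, H_data, sig_fs8, sig_H)
    fs8_model, H_model, fs8_data, H_data, sig_fs8, sig_H = map(np.asarray, arrays)
    sigma2 = sig_H**2 + sig_fs8**2                               # joint error, Eq. (29)
    resid2 = (fs8_model - fs8_data)**2 + (H_model - H_data)**2   # squared residual, Eq. (30)
    chi2 = np.sum(resid2 / sigma2)                               # Eq. (31)
    nu = len(fs8_data) - n_params                                # degrees of freedom
    return chi2 / nu                                             # Eq. (32)
```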

Each MCMC exploration of the parameter space results in a best fit of the model, corresponding to the maximum of the likelihood sampled by the MCMC method. We performed three sets of MCMC runs for each of the cases \(n=1\) and \(n=2\), each on a different data set described in Sect. 3; here we treat each result as a different model, and test the models against the binned data (see Table 3).

Table 5 Goodness of fit test, measured by \(\chi _\nu ^2\) through the analysis of the binned data in Table 3, to compare the behavior of the \(\varLambda \)CDM and 3 Starobinsky models (each with a best-fitted \(\uplambda \) parameter provided by the MCMC runs considering 3 data sets)

We then calculate Eq. (32) for each of our models; the minimization of this quantity gives a naive estimate of the model that best fits the data in the \(f\sigma _8 \times H\) plane. The results of this naive analysis, shown in Table 5, allow us to differentiate between models with different values of \(\uplambda \) obtained from the MCMC runs. The model that best fits the data according to this criterion is plotted in Fig. 12.

From the \(\chi ^2\) statistics we see that the \(\varLambda \)CDM model is, for this joint data set, the one with the smallest \(\chi ^2_\nu \) value, naively giving an “overfit” to the data, and that the \(n=2\) case provides a better fit than the \(n=1\) case for all data sets. This is expected, given the observational problems mentioned before concerning the \(n=1\) model, and given that the parameter n controls the deviation from \(\varLambda \)CDM, with \(n=1\) being the positive integer value that deviates the most from the concordance model. As for the “overfit” of the \(\varLambda \)CDM model, it is also important to note that this model has one parameter fewer than the f(R) models, which lowers its reduced \(\chi ^2\) value.

While one could say that \(\varLambda \)CDM fits the joint data better than any version of the Starobinsky model according to the \(\chi ^2\) test, some observations on the methods and data are in order. First, it is interesting to note that, in the \(n=1\) case, the \(f\sigma _8\) data alone give a better fit to the joint data than the result of the MCMC run on both the \([f\sigma _8](z)\) and H(z) data. This could happen for diverse reasons, one being that the H(z) data actually worsen the results obtained from the \([f\sigma _8](z)\) data when analyzed together. As expected theoretically, background quantities do not constrain modified gravity models suitably: the MCMC run on the H(z) data alone, shown in Fig. 3, was not able to constrain the free parameters, in contrast with the analysis of the \([f\sigma _8](z)\) data. In the \(n=2\) case, the joint data constrain the \(\uplambda \) parameter significantly better than either H or \(f\sigma _8\) alone, while also giving a better fit in the \(\chi ^2\) test in general, as can be seen from Fig. 12. This points once again to a fault of the \(n=1\) model.

Also, the binned data presented in Table 3, built from the H(z) and \([f\sigma _8](z)\) datasets, give us 5 data points for the goodness-of-fit analysis. As a test of whether the binning choice affects the \(\chi ^2\) statistics, we considered other binned sets and found similar results. The advantage of this approach is that it allows one to break the degeneracy between models with dissimilar \(\uplambda \) parameters, but due to the large errors at high-z the analysis is not as accurate as desirable to clearly discriminate between the \(\varLambda \)CDM and the Starobinsky models. This could also explain the “overfitting” of the \(\varLambda \)CDM model, which would give a \(\chi ^2\) closer to 1 if the errors were better constrained. In fact, more data points and tighter error constraints are needed to improve the significance of the goodness of fit. This means not only precise measurements of these observables, in particular at high-z, but also data in more redshift intervals. From our MCMC analysis, we observe that the \([f\sigma _8](z)\) data were able to better constrain the parameters of the f(R) model in both cases; thus results from upcoming surveys such as DESI, SKA, and EUCLID [72] will probe the growth of cosmic structures with enough precision to greatly improve the significance of this kind of analysis.

It is important to note that the conjoined analysis of Table 5, even with low statistical significance, was able to differentiate between models within the same f(R) theory, and in particular with the same n parameter, which we saw are highly degenerate in the H(z) observable and partially degenerate in \([f\sigma _8](z)\). Thus the proposal of [2, 26] to break this degeneracy between models using the conjoined data is effective.

4.3 Remarks on stability and solar system test

From the results of the MCMC runs in Table 4, and from the constraints on the theory in Eqs. (11) and (12), which give \(\uplambda ^{-1} < 0.21\) for the \(n=1\) case and \(\uplambda ^{-1} < 1.578\) for \(n=2\), we see that no best fit for \(\uplambda \) in the \(n=1\) case gives a stable model of the theory, while in the \(n=2\) case the best fit obtained with the \(f\sigma _8\) dataset does not pass the de Sitter stability criterion. This once again exposes some of the issues with the \(n=1\) model.

As for the solar system test, the constraint comes mainly from the PPN parameter \(\gamma \) (not to be confused with the growth index of Eq. (2)), which, from the Cassini probe, has to satisfy [1, 73]

$$\begin{aligned} |\gamma -1| < 4\times 10^{-4}. \end{aligned}$$
(33)

In a recent paper on solar system tests of f(R) theories [73], it was shown that a model with the parameters \(\uplambda = 1\) (i.e., \(\uplambda ^{-1} = 1\)), \(R_0 = 4.17H_0^2\) and \(n=2\) easily passes the constraint (33) and the chameleon tests in the solar system. From the best-fit parameters in Table 4, we see that the best-fit parameters for \(n=2\) are well within the constraint obtained in [73], since \(R_0\uplambda = 6\uplambda (1-\varOmega _{m0})H_0^2 < 4H_0^2\) for all the cases.

For the \(n=1\) case, it is known since the original Starobinsky’s work [1] that the model does not satisfy the constraints from the solar system test.

5 Conclusions and final remarks

In this paper, we have used cosmic expansion and structure growth data to constrain the free parameters of a relevant f(R) gravity model, namely the Starobinsky model. We used the Bayesian Markov Chain Monte Carlo method to explore the parameter space of this model with both sets of data individually and with a joint likelihood analysis. Furthermore, we also used the recent method proposed by Linder [2] and Matsumoto et al. [26] to complement the results of the MCMC statistical analysis. This approach, based on a joint analysis of H and \(f\sigma _8\) data, has proven able to further distinguish between models that were shown to be degenerate in the theoretical predictions of the H(z) and \([f\sigma _8](z)\) observables over the range of the free parameters of the f(R) model.

In the end, our results show that, in the f(R) Starobinsky model, for \(n=1\) the parameters are not very well constrained but the \(f\sigma _8\) data give a good fit to the joint data, whereas for \(n=2\) the parameters are significantly better constrained and the joint data give a very good fit. The \(\varLambda \)CDM model has the smallest \(\chi ^2\) value in the test, as expected. This joint analysis thus provides another observational motivation for \(n\ge 2\) in the f(R) Starobinsky model. However, more –and more precise– data are needed in order to determine the preferred model with good statistical significance. The joint binned data set still leaves room for overfitting by the \(\varLambda \)CDM model and for large error margins in the parameter space of the f(R) models. Therefore, the Starobinsky model cannot be discarded as a possible alternative to the \(\varLambda \)CDM model, still passing all the observational tests for \(n\ge 2\).

A possible continuation of this work would be to use this joint analysis to compare diverse cosmologically motivated f(R) models, such as the one in [21] and the one in this paper, in order to break the degeneracy between different f(R) theories. The results of this paper show that this method is promising for further distinguishing between modified gravity models that are degenerate in certain cosmological probes. The extension of this analysis is left for future work.