1 Introduction

The standard hot Big-Bang model of cosmology is described by a flat \(\Lambda \)CDM universe, in which roughly 70% of the energy density comprises the cosmological constant (or any dark energy fluid with equation of state \(w \equiv P / \rho \) close to \(-1\)), 25% cold (non-baryonic) dark matter, and 5% baryons [4]. This model has two episodes of acceleration: one in the early universe caused by inflation [5], posited to solve the horizon and flatness problems of the standard hot Big-Bang model [6], and another in the late universe, caused by dark energy [7]. This model has been spectacularly confirmed by the Planck 2018 CMB observations [8] along with other large-scale structure probes. There are, however, a few lingering data-driven problems with the standard \(\Lambda \)CDM paradigm, such as the Hubble constant tension between local and high-redshift measurements [9, 10], the \(\sigma _8\) tension between the CMB and galaxy clusters [11, 12], the Lithium-7 problem in Big-Bang nucleosynthesis [13], anomalies in the CMB at low multipoles [14], etc. A few works have also challenged some of the most well-established tenets of the standard cosmological model, viz. cosmic acceleration [15] and even cosmic expansion [16].

Independent of the above data-driven problems, there are also conceptual problems with the standard model. The best-fit model of scalar-field driven inflation (an essential pillar of the standard hot Big-Bang model) with flat potentials suffers from a number of fine-tuning issues [17]. Furthermore, we do not yet have laboratory evidence for any cold dark matter candidate, despite searches spanning more than three decades [18]. If dark energy turns out to be a cosmological constant, a non-zero value would be very problematic from the point of view of quantum field theory [19, 20].

Therefore, because of some of the above problems, many alternatives to the standard model have been constructed. One such model is the \(R_h=ct\) universe, proposed by Melia [21,22,23]. In this model, the size of the Hubble sphere, \(R_h(t)=ct\), is upheld at all times, in contrast to the \(\Lambda \)CDM model, where this coincidence holds only at the current epoch, i.e. \(R_h(t_0)=ct_0\). This model has \(a(t) \propto t\) and \(H(z)=H_0(1+z)\). One direct consequence is that the rate of expansion \(\dot{a}\) is constant, and the pressure and energy density satisfy the equation of state \(p=-\frac{\rho }{3}\). This is known as the zero active mass condition, and has been argued by Melia to be a necessary requirement imposed by the symmetries of FRW universes [24]. (See however Ref. [25] for objections to this argument.) Melia has also argued that this model provides a cosmological basis for the origin of the rest mass energy relation, \(E=mc^2\) [26], although this has been disputed [27]. The \(R_h=ct\) model also has several antecedents and generalizations, discussed in Refs. [28, 29], and an up-to-date review of all such models can be found in Ref. [30]. This model has been tested against a whole slew of cosmological observations by Melia and collaborators, such as cosmic chronometers [3], quasar core angular size measurements [31], quasar X-ray and UV fluxes [32], Type Ia SN [33], strong lensing [34], cluster gas mass fractions [35], etc., and found to be in better agreement with the data than the \(\Lambda \)CDM model. However, other researchers have reached opposite conclusions and have argued that this model is inconsistent with observations [1, 36,37,38,39,40,41,42]. Even before this model was introduced, there were severe observational constraints on power-law cosmologies, within which this model can be subsumed [43, 44]. These results in turn have also been contested by Melia and collaborators [45]. Conceptual problems have also been raised against this model [25, 46,47,48,49,50,51], although some have been countered [52]. We note, however, that this model has yet to reproduce the Cosmic Microwave Background temperature and polarization anisotropy measurements.

In this work, we try to adjudicate one such conflict between two of the above works: Ref. [1] (BS12 hereafter) and Refs. [2, 3] (MM13 and MY18 hereafter), which reached diametrically opposite conclusions when analyzing Hubble parameter (H(z)) measurements. BS12 reconstructed a non-parametric fit for H(z) using Gaussian Process Regression (GPR hereafter) from 18 cosmic chronometer measurements and 8 BAO measurements spanning the redshift range \(0.09 \le z \le 0.73\). They argued, based on a visual inspection of the reconstructed H(z) and its derivatives, that the \(\Lambda \)CDM model is a much better fit than the \(R_h=ct\) model. Soon thereafter, MM13 however pointed out that 19 unbinned H(z) measurements obtained from chronometers support \(R_h=ct\) over \(\Lambda \)CDM. This assertion was based on the AIC, BIC, and KIC tests from information theory and on \(\chi ^2\)/dof. Most recently, MY18 used 30 H(z) measurements from cosmic chronometers and, similar to BS12, used GPR to reconstruct a non-parametric H(z). Model comparison of \(\Lambda \)CDM vs \(R_h=ct\) was done by calculating the normalized area difference between each model and the reconstructed H(z). They argued that, with this procedure, the \(R_h=ct\) model is a better fit than \(\Lambda \)CDM. Here, we carry out an independent analysis of H(z) data, using the latest measurements from chronometers.

The outline of this paper is as follows. We discuss the GPR technique and Bayesian model comparison technique in Sect. 2 and Sect. 3 respectively. The key points made in the two conflicting sets of papers BS12 versus MM13, MY18 are discussed in Sect. 4. The description of our datasets and analysis can be found in Sect. 5. Our results using H(z) measurements can be found in Sect. 6. A comparison of the two models using the \(Om(z_1,z_2)\) statistic can be found in Sect. 7. We conclude in Sect. 8.

2 Gaussian process regression

Both groups (BS12 and MY18) have used GPR for their analysis. Therefore, we provide an abridged introduction to GPR before discussing the results of their analyses. A more detailed explanation can be found in Section 2 of Ref. [53]. GPR is a widely used technique in astronomy, as it allows us to smoothly interpolate between datapoints in a non-parametric fashion, thereby increasing the effective number of degrees of freedom; however, it does not provide more information than the underlying data. A Gaussian process is analogous to a Gaussian distribution, but it describes a distribution over functions instead of random variables. To describe this distribution of functions, we need a mean function \(\mu (x)\) and a covariance function \( cov(f(x),f(\tilde{x})) = k(x,\tilde{x}) \) connecting the values of f evaluated at x and \(\tilde{x}\). There are many possible choices for the covariance function. Both papers used a squared exponential (Gaussian) covariance function, so in this paper we also use a Gaussian kernel for GPR. For a Gaussian kernel, \(k(x,\tilde{x})\) is given by:

$$\begin{aligned} k(x,\tilde{x}) = \sigma _f^2 \exp \left( -\frac{(x-\tilde{x})^2}{2l^2} \right) . \end{aligned}$$

Here, \(\sigma _f\) and l are hyperparameters which describe the 'bumpiness' of the function: \(\sigma _f\) sets the typical amplitude of variations in the function, while l sets the characteristic length scale over which it varies.

A random function f(x) can be generated using the covariance matrix. Let \({\mathbf {X}}^{*}\) be a set of points \(\{x_i^*\}\); one can then generate a vector \({\mathbf {f}}^{*}\) of function values at \({\mathbf {X}}^{*}\), with \(f^*_i = f(x_i^*)\), as

$$\begin{aligned} {\mathbf {f}}^{*} \sim {\mathcal {N}}({\mu ^*},K({\mathbf {X}}^{*},{\mathbf {X}}^{*})). \end{aligned}$$

The notation \({\mathcal {N}}\) indicates that \(f(x^*)\), the Gaussian process evaluated at \(x^*\), is a random value drawn from a (multivariate) normal distribution. The observational data can be written in the same way as

$$\begin{aligned} {\mathbf {y}} \sim {\mathcal {N}}({\mu },K({\mathbf {X}},{\mathbf {X}})+C) \end{aligned}$$

where C is the covariance matrix of the data. If the data are uncorrelated, the covariance matrix is simply \(\mathrm {diag}(\sigma _i^2)\). Using the values of \({\mathbf {y}}\) at \({\mathbf {X}}\), we can reconstruct \({\mathbf {f}}^{*}\) using

$$\begin{aligned} \overline{{\mathbf {f}}^{*}} = {\mu ^*} + K({\mathbf {X}}^{*},{\mathbf {X}})[K({\mathbf {X}},{\mathbf {X}})+C]^{-1}({\mathbf {y}}-\mu ) \end{aligned}$$

and

$$\begin{aligned} cov({\mathbf {f}}^{*}) = K({\mathbf {X}}^{*},{\mathbf {X}}^{*}) - K({\mathbf {X}}^{*},{\mathbf {X}})[K({\mathbf {X}},{\mathbf {X}})+C]^{-1}K({\mathbf {X}},{\mathbf {X}}^{*}) \end{aligned}$$

where \(\overline{{\mathbf {f}}^{*}}\) and \(cov({\mathbf {f}}^{*})\) are the mean and covariance of \({\mathbf {f}}^{*}\), respectively. The diagonal elements of \(cov({\mathbf {f}}^{*})\) provide the variance of \({\mathbf {f}}^{*}\). More details can be found in Ref. [53]. Both BS12 and MY18 implement GPR in Python using the GaPP package, developed by Seikel and collaborators [53].
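To make the above expressions concrete, the following is a minimal numpy sketch of the reconstruction step for H(z). It is not the GaPP pipeline used in the actual analysis (which also handles training of the hyperparameters); the data points, reconstruction redshifts, and hyperparameter values below are purely illustrative.

```python
import numpy as np

def sq_exp_kernel(x1, x2, sigma_f, ell):
    """Squared-exponential kernel k(x, x~) = sigma_f^2 exp(-(x - x~)^2 / (2 ell^2))."""
    return sigma_f**2 * np.exp(-0.5 * (x1[:, None] - x2[None, :])**2 / ell**2)

def gp_reconstruct(x, y, yerr, x_star, sigma_f, ell, mu=0.0):
    """Mean and covariance of f* at x_star, following the equations above (zero mean function)."""
    K   = sq_exp_kernel(x, x, sigma_f, ell) + np.diag(yerr**2)   # K(X,X) + C
    Ks  = sq_exp_kernel(x_star, x, sigma_f, ell)                 # K(X*,X)
    Kss = sq_exp_kernel(x_star, x_star, sigma_f, ell)            # K(X*,X*)
    alpha  = np.linalg.solve(K, y - mu)
    f_mean = mu + Ks @ alpha
    f_cov  = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return f_mean, f_cov

# Illustrative (hypothetical) H(z) points, not the Table 1 data
z    = np.array([0.1, 0.4, 0.9, 1.5])
Hz   = np.array([71.0, 85.0, 110.0, 150.0])
Herr = np.array([12.0, 8.0, 12.0, 14.0])

z_star = np.linspace(0.05, 1.9, 100)
H_rec, H_cov = gp_reconstruct(z, Hz, Herr, z_star, sigma_f=100.0, ell=2.0)
H_rec_err = np.sqrt(np.diag(H_cov))   # 1-sigma reconstruction error at each z_star
```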

3 Model comparison summary

Techniques for comparing two models can be broadly classified into three distinct categories: frequentist, information-theoretic, and Bayesian [54,55,56,57,58]. In this work we shall only apply Bayesian model comparison, since it is argued to be the most robust of the different model comparison techniques [56, 59]. We briefly summarize this technique below; more details can be found in Refs. [56, 58, 59] or some of our previous works [60, 61].

In Bayesian model comparison, we compute the probability that the data were generated by a given model, known as the Bayesian evidence (Z), which appears as the normalization of the posterior in Bayes' theorem [56]:

$$\begin{aligned} P(\Theta |D,M) = \frac{P(D|\Theta ,M) P(\Theta |M)}{P(D|M)} \end{aligned}$$
(1)

where \(P(\Theta |D,M)\) is the posterior, \(P(D|\Theta ,M)\) is the likelihood, \(P(\Theta |M)\) is the prior, and P(D|M) is the evidence, also sometimes referred to as the marginal likelihood. Note that unlike the other model comparison tests, the Bayesian evidence does not rely on the best-fit parameter values of a given model; it integrates the likelihood over the entire prior range of the parameters. The model with the higher evidence, i.e., the higher probability that the data were generated from that model, is the better model for describing the data. From the Bayesian evidences of the two models, we can calculate the Bayes factor, which is simply the ratio of the two evidences and is given by:

$$\begin{aligned} B = \frac{Z_1}{Z_2}. \end{aligned}$$
(2)

For the Bayes factor, we evaluate the ratio of the evidence of the \(\Lambda \)CDM to the evidence for the \(R_h=ct\) model. The significance can be evaluated using the Jeffreys scale [56].
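For completeness, the evidence appearing in Eqs. 1 and 2 is the likelihood marginalized over the prior,

$$\begin{aligned} Z \equiv P(D|M) = \int P(D|\Theta ,M)\, P(\Theta |M)\, d\Theta , \end{aligned}$$

so a model is rewarded only to the extent that it fits the data well over its allowed prior volume; models with large prior volumes that do not improve the fit are automatically penalized.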

4 Summary of BS12, MM13, and MY18

As mentioned in the introduction, there is a large body of literature comparing the \(R_h = ct\) model with the \(\Lambda \)CDM model. We focus on the particular case of these two sets of papers (BS12 versus MM13/MY18), along with a few others that use only H(z) measurements, where conflicting results have been reached despite similar analyses. We then briefly mention some other works which compared the two models using only the expansion history.

BS12 reconstructed the deceleration parameter q(z) from the Union2.1 Type Ia Supernova dataset with GPR, and showed from a visual inspection that the reconstructed q(z) is better fit by the \(\Lambda \)CDM model. They also used Hubble rate data from 18 cosmic chronometer and 8 BAO measurements, reconstructed H(z) with GPR, and plotted it against the predicted values of H(z) from the \(\Lambda \)CDM model and the \(R_h = ct\) model. They compared the reconstructed H(z), its first and second derivatives, as well as the Om(z) diagnostic [62] against the theoretical predictions of the two models. They again used visual inspection of these plots to conclude that the \(\Lambda \)CDM model is a better fit to the data than \(R_h = ct\). Very soon after BS12, MM13 considered 19 unbinned H(z) measurements from cosmic chronometers and fit these data to both models. They found that the \(\chi ^2\)/DOF (or reduced \(\chi ^2\)) is equal to 0.745 and 0.777 for \(R_h=ct\) and \(\Lambda \)CDM (with best-fit parameters \(\Omega _M=0.32\) and \(H_0= 68.9 \pm 2.4\) km/s/Mpc), respectively. Therefore, the reduced \(\chi ^2\) was marginally smaller for \(R_h=ct\). Furthermore, for a \(\Lambda \)CDM model with parameters fixed to \(\Omega _M=0.27\) and \(H_0=73.8 \pm 2.4\) km/s/Mpc (values obtained from probes other than the chronometer data), the \(\chi ^2\)/DOF is 0.9567, which is greater than that for the \(R_h=ct\) universe. However, no comparison of the goodness of fit based on the \(\chi ^2\) p.d.f. was made. They also found smaller values of AIC, BIC, and KIC for the \(R_h=ct\) universe compared to \(\Lambda \)CDM. However, we note that the difference in information criteria between the two models did not cross the threshold of 10 needed for one model to be decisively favored over the other. They further criticized the SN data analysis in BS12, arguing that the data used were optimized for a \(\Lambda \)CDM cosmology. They also argued that the BAO data analyzed in BS12 include the non-linear evolution of the matter density and velocity fields, and hence are not model-independent. Therefore, their analysis was done using only chronometers.

A similar analysis using the latest cosmic chronometer data (consisting of 30 measurements) and GPR was carried out in MY18. They used an analytical approach to compare the two models after reconstructing H(z) with GPR, and argued that \(R_h=ct\) performs better than the \(\Lambda \)CDM model, contradicting the conclusion of BS12. To quantify this, they constructed mock data sets using Gaussian random variables and computed the normalized absolute area difference between each mock realization and the reconstructed function. For each model, they then calculated the corresponding differential area by replacing the mock data set with the model prediction, and from the resulting distribution estimated the probability (p-value) of each model. From this analysis they concluded that the \(R_h=ct\) model is the better of the two for the chronometer data.

Besides the above two sets of papers, Ref. [39] showed using AIC and BIC that a combination of the JLA Type Ia SN sample and 30 H(z) measurements from chronometers and BAO strongly supports the \(\Lambda \)CDM model over the \(R_h=ct\) universe. They also found, using AIC and BIC, that the chronometer-only measurements by themselves do not decisively favor either model. Haridasu et al. [38] did a joint analysis of Type Ia SN, BAO, GRB, and chronometer H(z) data and compared the likelihood of the \(\Lambda \)CDM model with \(R_h=ct\) using AIC and BIC. They found that both \(\Delta \)AIC and \(\Delta \)BIC between the two models are greater than 20, thereby decisively ruling out the \(R_h=ct\) model. Hu and Wang showed, from a test of the cosmic distance duality relation using a sample of galaxy clusters and Type Ia SN, that the \(\Lambda \)CDM model is strongly favored over \(R_h=ct\), with both \(\Delta \)AIC and \(\Delta \)BIC greater than 10 [40]. Tu et al. used a combination of strong lensing, Type Ia supernovae, BAO, and cosmic chronometers to argue that \(\Lambda \)CDM is moderately favored over the \(R_h=ct\) model, with the natural logarithm of the Bayes factor greater than five [41].

5 Datasets and analysis

The H(z) data from cosmic chronometers are obtained by comparing the relative ages of galaxies at different redshifts; assuming an FRW metric, the Hubble parameter is given by [63]:

$$\begin{aligned} H(z) = -\frac{1}{1+z}\frac{ dz}{d t}. \end{aligned}$$
(3)

Based on the measurement of the age difference, \(\Delta t\), between two passively evolving galaxies that are separated by a small redshift interval \(\Delta z\), we can approximate dz/dt by \(\Delta z/ \Delta t\). This differential age method is much more reliable than one based on absolute age determinations for galaxies, as absolute stellar ages are more vulnerable to systematic uncertainties than relative ages.
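As a rough numerical illustration of Eq. 3 (with purely hypothetical numbers, not an actual measurement from Table 1), consider two passively evolving galaxy samples separated by \(\Delta z = 0.025\) around \(z \simeq 0.4\), with a measured differential age of \(\Delta t \simeq 0.3\) Gyr:

```python
GYR_INV_TO_KM_S_MPC = 977.8   # 1 Gyr^-1 expressed in km/s/Mpc

z_mean, delta_z, delta_t = 0.4, 0.025, 0.3        # hypothetical values; delta_t in Gyr
dz_dt = delta_z / delta_t                         # finite-difference estimate of |dz/dt|
H_z = dz_dt / (1.0 + z_mean) * GYR_INV_TO_KM_S_MPC
print(f"H(z = {z_mean}) ~ {H_z:.0f} km/s/Mpc")    # ~58 km/s/Mpc for these toy numbers
```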

Even though cosmic chronometers probe only the expansion history of the universe, they have been used for a variety of cosmological inferences, such as determination of \(H_0\) [64,65,66,67], transition redshift from deceleration to acceleration [68, 69], cosmic distance duality relation [70], \(\sigma _8\) estimation [71], dark energy equation of state [72, 73], etc. The complete data set of 31 measurements of H(z) at redshifts \(0.07<z<1.965\) from cosmic chronometers is listed in Table 1. This data set was obtained from the compilation in Table III of Ref. [71]. A graphical summary of this unbinned data, along with the reconstructed H(z) using GPR can be found in Fig. 1.

Although BS12 (and also Ref. [39]) used H(z) measurements from BAO to rule out the \(R_h=ct\) model, we have only used the Hubble parameter data obtained from cosmic chronometers. This is due to various concerns regarding combining data from these two sources for parameter estimation within \(\Lambda \)CDM and for testing the \(R_h=ct\) universe [2, 74]. One problem with using BAO data to assess the viability of an alternative to the \(\Lambda \)CDM model is that the measurement of the Hubble parameter from BAO requires the assumption of a particular cosmological model, unlike the model-independent measurements from cosmic chronometers. All BAO measurements are scaled by the size of the sound horizon at the drag epoch, \(r_s\), and computing \(r_s\) requires the assumption of a fiducial model. Most analyses which employ BAO measurements use the value of \(r_s\) obtained for the \(\Lambda \)CDM model, which would induce a bias towards \(\Lambda \)CDM when comparing it with other models. Another concern is that one also needs to model the non-linear evolution of the density and velocity fields, which is not model-independent [2, 31]. Therefore, no BAO data were used in MM13 and MY18, whereas both BAO and chronometer data were used in BS12. In light of these problems, we present our results for model comparison without the BAO data.

Table 1 H(z) data from cosmic chronometers along with references to original sources. This list was compiled from Ref. [71]
Fig. 1

Plot showing H(z) chronometer data along with the best-fit \(\Lambda \)CDM model and the \(R_h = ct\) model, with best-fit parameters obtained using unbinned data. Also shown is H(z) reconstructed non-parametrically (along with \(1\sigma \) and \(2\sigma \) errors in reconstruction). The reconstruction was done with Gaussian Process Regression using the GaPP package

The first step in model comparison is to find the best-fit values of the free parameters in \(\Lambda \)CDM as well as the \(R_h=ct\) universe model. This is obtained by minimizing the \(\chi ^2\) functional given by:

$$\begin{aligned} \chi ^2= \sum _{i=1}^N \left( \frac{H_i(z) -H^{\Lambda CDM/R_h=ct}(z,\theta )}{\sigma _i}\right) ^2, \end{aligned}$$
(4)

where \(H_i(z)\) denote the various Hubble parameter measurements, N is the total number of datapoints used, \(H^{\Lambda CDM/R_h=ct}(z,\theta )\) is the Hubble parameter predicted by the \(\Lambda \)CDM or \(R_h=ct\) cosmology, \(\sigma _i\) denotes the error in \(H_i(z)\), and \(\theta \) denotes the parameter vector of the corresponding model.

In the \(R_h = ct\) model, H(z) is given by:

$$\begin{aligned} H(z)=H_0(1+z), \end{aligned}$$
(5)

whereas for the \(\Lambda \)CDM model, H(z) is:

$$\begin{aligned} H(z)=H_0 \sqrt{\Omega _M(1+z)^3+(1-\Omega _M-\Omega _{\Lambda })(1+z)^2+\Omega _{\Lambda }} \end{aligned}$$
(6)

where \(\Omega _M\) and \(\Omega _{\Lambda }\) are the density parameters of matter and the cosmological constant, respectively. Note that for a flat \(\Lambda \)CDM model, \(\Omega _{\Lambda }=1-\Omega _M\), which reduces the number of free parameters by one, and Eq. 6 simplifies to:

$$\begin{aligned} H(z)=H_0 \sqrt{\Omega _M(1+z)^3+1-\Omega _M}. \end{aligned}$$
(7)

In both BS12 and MM13, a flat \(\Lambda \)CDM model was used for the model comparison. Therefore, in this work we also restrict ourselves to the flat \(\Lambda \)CDM model (\(\Omega _k = 0\)), with H(z) given by Eq. 7.
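For concreteness, a minimal sketch of the \(\chi ^2\) minimization of Eq. 4 for the two models (Eqs. 5 and 7) is given below; the data arrays are hypothetical placeholders rather than the Table 1 measurements, and scipy's default optimizer settings are used.

```python
import numpy as np
from scipy.optimize import minimize

def H_lcdm(z, H0, Om):
    """Flat LambdaCDM expansion rate, Eq. 7."""
    return H0 * np.sqrt(Om * (1 + z)**3 + 1 - Om)

def H_rhct(z, H0):
    """R_h = ct expansion rate, Eq. 5."""
    return H0 * (1 + z)

def chi2(model, H, sigma):
    """Eq. 4."""
    return np.sum(((H - model) / sigma)**2)

# Hypothetical placeholder data (the actual fits use the 31 points of Table 1)
z   = np.array([0.1, 0.4, 0.9, 1.5])
H   = np.array([71.0, 85.0, 110.0, 150.0])
sig = np.array([12.0, 8.0, 12.0, 14.0])

best_lcdm = minimize(lambda p: chi2(H_lcdm(z, *p), H, sig), x0=[70.0, 0.3])
best_rhct = minimize(lambda p: chi2(H_rhct(z, *p), H, sig), x0=[65.0])
print("LCDM  best fit (H0, Omega_M):", best_lcdm.x)
print("Rh=ct best fit (H0):         ", best_rhct.x)
```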

Since Bayesian model comparison does not depend upon the best-fit values, we do not have to maximize any likelihood; we only need to choose priors for the two models. For \(\Lambda \)CDM, we used two sets of priors. The first set assumes a uniform distribution for \(\Omega _M\) and \(H_0\). For the second set, we choose Gaussian priors centered around the best-fit parameters of the 2018 Planck cosmology [8]. The \(R_h=ct\) universe has only one free parameter, \(H_0\), and we used the same (uniform) \(H_0\) prior as for the \(\Lambda \)CDM model.Footnote 1 A summary of all the priors used for model comparison for both models can be found in Table 2. In this work, the Bayesian evidence was computed using the dynesty package [81], which implements the nested sampling technique.
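To make the procedure concrete, the following is a minimal sketch of how the evidences and the Bayes factor could be computed with dynesty for the uniform-prior case, assuming a Gaussian likelihood built from Eq. 4. The data arrays and prior ranges below are illustrative placeholders (the ranges actually used are listed in Table 2), so the numbers it returns are not the published results.

```python
import numpy as np
import dynesty

# Hypothetical placeholder data; the published results use the 31 chronometer points of Table 1
z   = np.array([0.1, 0.4, 0.9, 1.5])
H   = np.array([71.0, 85.0, 110.0, 150.0])
sig = np.array([12.0, 8.0, 12.0, 14.0])

def loglike_lcdm(theta):
    H0, Om = theta
    model = H0 * np.sqrt(Om * (1 + z)**3 + 1 - Om)        # Eq. 7
    return -0.5 * np.sum(((H - model) / sig)**2 + np.log(2 * np.pi * sig**2))

def loglike_rhct(theta):
    model = theta[0] * (1 + z)                            # Eq. 5
    return -0.5 * np.sum(((H - model) / sig)**2 + np.log(2 * np.pi * sig**2))

# Prior transforms map the unit cube to the (illustrative) uniform prior ranges
def prior_lcdm(u):
    return np.array([50.0 + 50.0 * u[0],    # H0 ~ U(50, 100) km/s/Mpc
                     1.0 * u[1]])           # Omega_M ~ U(0, 1)

def prior_rhct(u):
    return np.array([50.0 + 50.0 * u[0]])   # H0 ~ U(50, 100) km/s/Mpc

s1 = dynesty.NestedSampler(loglike_lcdm, prior_lcdm, ndim=2, nlive=500)
s1.run_nested(print_progress=False)
s2 = dynesty.NestedSampler(loglike_rhct, prior_rhct, ndim=1, nlive=500)
s2.run_nested(print_progress=False)

ln_B = s1.results.logz[-1] - s2.results.logz[-1]   # ln(Bayes factor), LCDM relative to R_h = ct
print("ln Z (LCDM)   =", s1.results.logz[-1])
print("ln Z (R_h=ct) =", s2.results.logz[-1])
print("ln B          =", ln_B)
```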

6 Results

We now present our results for model comparison using the chronometer dataset. We carried out two different analyses: the first uses the unbinned data, and the second uses H(z) reconstructed with the non-parametric GPR method. For each of these datasets, we used the two different sets of priors for \(\Lambda \)CDM outlined in the previous section. For the second analysis, we repeat the procedure of BS12, wherein H(z) is reconstructed at many redshifts using GPR. The GPR was done using the GaPP software. The GPR-reconstructed H(z) for chronometers, along with the original unbinned measurements, is shown in Fig. 1, together with the best-fit \(\Lambda \)CDM and \(R_h = ct\) models. For model comparison with GPR, we use 100 reconstructed measurements uniformly distributed between the lowest and highest available redshifts.

Table 2 The priors used for the analysis. \({\mathcal {U}}(x,y)\) denotes a top-hat or a uniform prior between x and y. \({\mathcal {N}}(x,y)\) denotes a Gaussian prior with a mean of x and scale parameter of y. The Gaussian priors for \(\Lambda \)CDM are centered around the best fit values of the 2018 results of Planck collaboration [8], with the scale parameter equal to \(1\sigma \) error of these results. The priors on \(H_0\) are given in units of km/s/Mpc
Table 3 Comparison of the Bayes factor for \(R_h=ct\) and \(\Lambda \)CDM using the unbinned chronometer measurements listed in Table 1, for the two different sets of priors on \(\Lambda \)CDM (cf. Table 2). \(\log Z\) denotes the logarithm of the Bayesian evidence. The Bayes factor is defined as the ratio of the evidence for the \(\Lambda \)CDM model to the evidence for the \(R_h=ct\) universe model. When uniform priors are used for \(\Lambda \)CDM, the Bayesian evidences for the two models are almost identical, with neither model being preferred. When we use Gaussian priors centered on the Planck best-fit values [8], \(\Lambda \)CDM is very strongly preferred over \(R_h=ct\)
Table 4 Model comparison tests using the GPR-reconstructed measurements of the chronometer data listed in Table 1. The explanation of all the columns is the same as in Table 3. When uniform priors are used, neither model is preferred, whereas \(\Lambda \)CDM is decisively favored if we use Gaussian priors obtained from the 2018 Planck best-fit measurements [8]

6.1 Model comparison using unbinned data

Our model comparison results for the unbinned analysis, using both prior choices, are summarized in Table 3. When uniform priors for \(\Lambda \)CDM are chosen, the Bayes factor (defined as the ratio of the Bayesian evidence for the \(\Lambda \)CDM model to that for \(R_h=ct\)) is close to one, and hence does not prefer either model over the other. However, if we choose Gaussian priors centered around the Planck best-fit values, then \(\Lambda \)CDM is very strongly favored over \(R_h=ct\) on the Jeffreys scale. Therefore, we disagree with MM13 that \(R_h=ct\) is favored when only the chronometer data are considered.

6.2 Model comparison using GPR data

Our results for model comparison using the data reconstructed with GPR can be found in Table 4. The Bayes factor again only marginally favors \(\Lambda \)CDM when uniform priors are used. When we use the Planck-based priors, \(\Lambda \)CDM is decisively favored over \(R_h=ct\).

Therefore, in summary, we disagree with MY18 that \(R_h=ct\) provides a better fit than the \(\Lambda \)CDM model: none of our tests prefer \(R_h=ct\); with uniform priors neither model is decisively favored, while with the Planck-based priors \(\Lambda \)CDM is strongly or decisively favored. At the same time, we note that the \(R_h=ct\) model cannot currently be ruled out using chronometers if we use uniform priors on \(\Omega _M\) and \(H_0\).

7 Diagnosis using Om statistic

We now explore whether we can distinguish between the two models using the two-point \(Om(z_1,z_2)\) statistic, evaluated for pairs of redshifts (\(z_1\), \(z_2\)). The \(Om(z_1,z_2)\) statistic is defined as [82]:

$$\begin{aligned} Om(z_1,z_2)= \frac{h^2(z_1)-h^2(z_2)}{(1+z_1)^3-(1+z_2)^3} \end{aligned}$$
(8)

where \(h(z)=H(z)/H_0\). The \(Om(z_1,z_2)\) statistic has been used to map out the expansion history of the universe and also as a null test of \(\Lambda \)CDM in a number of works [74, 82,83,84,85,86]. For the \(\Lambda \)CDM model, \(Om(z_1,z_2)\) has the remarkable property that it is independent of \(z_1\) and \(z_2\) and is equal to \(\Omega _M\) [83]. Therefore, computing \(Om(z_1,z_2)\) from H(z) measurements enables us to carry out a model-independent test of \(\Lambda \)CDM and simultaneously obtain an estimate of \(\Omega _M\). For the \(R_h=ct\) universe, \(Om(z_1,z_2)\) is given by

$$\begin{aligned} Om^{R_h=ct}= \frac{(1+z_1)^2-(1+z_2)^2}{(1+z_1)^3-(1+z_2)^3}. \end{aligned}$$
(9)

Therefore, for the \(R_h=ct\) model, \(Om(z_1,z_2)\) is not constant but depends on \(z_1\) and \(z_2\).
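The constancy of \(Om(z_1,z_2)\) for flat \(\Lambda \)CDM can be seen directly from Eq. 7, since the \((1-\Omega _M)\) terms cancel in the difference of the normalized expansion rates:

$$\begin{aligned} h^2(z) = \Omega _M(1+z)^3 + 1 - \Omega _M \quad \Rightarrow \quad Om(z_1,z_2) = \frac{\Omega _M\left[ (1+z_1)^3-(1+z_2)^3\right] }{(1+z_1)^3-(1+z_2)^3} = \Omega _M , \end{aligned}$$

whereas substituting \(h(z)=1+z\) for the \(R_h=ct\) model yields Eq. 9.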

From the 31 H(z) measurements, we obtain a total of \(^{31}C_2\) or 465 \(Om(z_1,z_2)\) data points, which are shown in Fig. 2. The errors are obtained by Gaussian error propagation from the errors in \(H(z_1)\) and \(H(z_2)\). As we can see, for small values of the redshift difference the errors in \(Om(z_1,z_2)\) are quite large, and although they decrease with increasing \(z_2-z_1\), they usually remain of the same order as \(Om(z_1,z_2)\) itself.

To carry out the model comparison, we need to determine the total number of free parameters in \(\Lambda \)CDM and \(R_h=ct\). For \(\Lambda \)CDM, this is equal to one, since \(H_0\) is degenerate with \(\Omega _M\): choosing a different \(H_0\) simply leads to a different \(\Omega _M\), and irrespective of which value of \(H_0\) is used, \(Om(z_1,z_2)\) remains a constant, independent of the redshift difference. Since \(Om(z_1,z_2)\) is constant for the \(\Lambda \)CDM model, the best-fit maximum likelihood estimate is simply the weighted mean of all the \(Om(z_1,z_2)\) measurements. For \(R_h=ct\), the only free parameter is \(H_0\), since varying \(H_0\) rescales the whole curve vertically by a constant factor.
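As a minimal illustration (not the actual analysis code), the following sketch assembles the \(Om(z_1,z_2)\) pairs, propagates the Gaussian errors, and computes the weighted-mean estimate of \(\Omega _M\). The H(z) arrays and the reference \(H_0\) value are hypothetical placeholders rather than the Table 1 data.

```python
import numpy as np
from itertools import combinations

# Hypothetical placeholder chronometer points (the actual analysis uses the 31 values of Table 1)
z   = np.array([0.1, 0.4, 0.9, 1.5])
H   = np.array([71.0, 85.0, 110.0, 150.0])   # km/s/Mpc
sig = np.array([12.0, 8.0, 12.0, 14.0])      # km/s/Mpc
H0  = 70.0   # reference value used to form h(z) = H(z)/H0; changing it rescales all Om values uniformly

om_vals, om_errs = [], []
for i, j in combinations(range(len(z)), 2):
    denom = (1 + z[i])**3 - (1 + z[j])**3
    h1, h2 = H[i] / H0, H[j] / H0
    om_vals.append((h1**2 - h2**2) / denom)                      # Eq. 8
    # Gaussian error propagation from sigma_H(z1) and sigma_H(z2)
    om_errs.append(2.0 * np.sqrt((h1 * sig[i])**2 + (h2 * sig[j])**2) / (H0 * abs(denom)))

om_vals, om_errs = np.array(om_vals), np.array(om_errs)

# Flat LCDM predicts a constant Om(z1,z2) = Omega_M, so its maximum-likelihood
# estimate is the inverse-variance weighted mean of all the pairs.
w = 1.0 / om_errs**2
print("Weighted-mean Omega_M estimate:", np.sum(w * om_vals) / np.sum(w))
```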

For \(\Lambda \)CDM, we get

$$\begin{aligned} \chi ^2/dof = 185.4/350 \end{aligned}$$

and for \(R_h=ct\) we get

$$\begin{aligned} \chi ^2/dof = 185.2/350. \end{aligned}$$

For this fit, we removed the four H(z) points with the largest error bars, so the total number of \(Om(z_1,z_2)\) data points used in the fits is \(^{27}C_2 = 351\). As we can see, both \(\chi ^2/dof\) values are smaller than one and are very close to each other, making the \(Om(z_1,z_2)\) statistic ineffective for this model comparison. For illustrative purposes, we show the best-fit curves along with some of the \(Om(z_1,z_2)\) points (with error bars removed) in Fig. 3. Therefore, it is not possible to distinguish between the two models using the current H(z) chronometer data.

Fig. 2

Plot of \(Om(z_1,z_2)\) calculated from the chronometer data using Eq. 8. These points were calculated from combinations of the 31 Hubble measurements in pairs, amounting to 465 points. The theoretical plots are not included because the theoretical curves cannot be distinguished at this scale

Fig. 3

This plot shows the theoretical curves for the best-fit \(\Lambda \)CDM and \(R_h=ct\) models, along with the data for \(Om(z_1,z_2)\) (grey points). For this plot, the four H(z) points with the largest error bars, which produced very large values of \(Om(z_1,z_2)\), have been removed. Since this plot is for illustrative purposes, we have also omitted the error bars on the \(Om(z_1,z_2)\) values for clarity; the actual error bars are much larger than the differences between the two models. Therefore, it is not possible to distinguish between the two models using \(Om(z_1,z_2)\) measurements

8 Conclusions

In this work, we try to independently assess the viability of \(\Lambda \)CDM vs the \(R_h=ct\) universe using only H(z) measurements from cosmic chronometers, in order to resolve conflicting claims between two groups of authors. In 2012, Bilicki and Seikel [1] claimed, using H(z) measurements from chronometers and BAO, that the \(R_h=ct\) model is conclusively ruled out. This was contested by Melia and collaborators [2, 3], who showed using H(z) measurements from chronometers that the \(R_h=ct\) universe is favored over \(\Lambda \)CDM. They also pointed out that BAO H(z) measurements cannot be used to test \(R_h=ct\) models, since they implicitly assume \(\Lambda \)CDM. A few other works [36, 38, 39, 41] also found that Type Ia SN, H(z) measurements from chronometers, and BAO rule out the \(R_h=ct\) model.

In order to settle the conflicting results between the above two groups of authors, we considered measurements from chronometers only (to emulate the analysis in Refs. [2, 3]). We did not consider the BAO measurements, given the circularity involved in using them to test non-\(\Lambda \)CDM universes [2, 3]. We carried out the model comparison using both the unbinned data and a non-parametric reconstruction obtained with GPR. For the model comparison, we used the Bayesian technique of computing the Bayes factor between the two models. We used two different priors for \(\Lambda \)CDM: a uniform prior over a wide parameter range, and Gaussian priors centered around the 2018 Planck best-fit \(\Lambda \)CDM cosmology. A summary of the priors used can be found in Table 2.

Our results for both sets of priors and both datasets can be found in Tables 3 and 4. When we use uniform priors, the difference in significance between the two models is negligible for both datasets. However, for the priors centered around the Planck 2018 best-fit \(\Lambda \)CDM values, we find that \(\Lambda \)CDM is very strongly favored over \(R_h=ct\) for the unbinned dataset and decisively favored for the GPR-reconstructed dataset. Therefore, we conclude that, using the chronometer H(z) data, the \(R_h=ct\) model is not preferred over \(\Lambda \)CDM.

We also investigated whether the Om statistic calculated from redshift pairs, which has been used in previous literature to test the \(\Lambda \)CDM model [82, 83], can discriminate between the two models. Unfortunately, the current error bars on \(Om(z_1,z_2)\) estimated from the chronometer H(z) data are too large to enable a robust model comparison.

Therefore, in summary, we disagree with the claims in both Ref. [1] and Refs. [2, 3], and conclude that neither model is ruled out or decisively favored using only H(z) measurements from chronometers, if we use uniform priors on the parameters of both models. A more stringent test would involve using the CMB and other large-scale-structure-based probes in a theory-independent fashion.