Gaussian discriminators between $\Lambda$CDM and wCDM cosmologies using expansion data

The Gaussian linear model provides a unique way to obtain the posterior probability distribution as well as the Bayesian evidence analytically. For expansion rate data, the Gaussian linear model can be applied to $\Lambda$CDM, wCDM, and a non-flat $\Lambda$CDM. In this paper, we simulate expansion data with various precisions, obtain the Bayesian evidence, and use it to discriminate between the models. The data uncertainty is in the range $\sigma\in(0.5,10)\%$ and two different sampling rates are considered. Our results indicate that it is possible to discriminate a $w=-1.02$ (or $w=-0.98$) model from $\Lambda$CDM $(w=-1)$ with $\sigma=0.5\%$ uncertainty in the expansion rate data. Finally, we perform parameter inference with both MCMC and the Gaussian linear model, using currently available expansion rate data, and compare the results.


I. INTRODUCTION
ΛCDM cosmology offers a simple and consistent concordance model that has agreed with observations for several decades [1,2]. Separately, the ΛCDM model gives excellent agreement with cosmic microwave background data [3,4], as well as late-time measurements of cosmic expansion [5]. However, recent observations have suggested a growing tension in the reported values of the Hubble constant [6][7][8], among other tensions. Together with the long-standing consistency issues with the model [9], this points to possible deviations from ΛCDM cosmology entering the observational regime. In this new context, it is imperative to understand the accuracy of observational data needed to discriminate between viable alternatives.
Cosmological tensions in the ΛCDM model have led to a reevaluation of the foundations of the model, such as the role of the cosmological principle [10,11], the nature of dark matter and dark energy (such as Refs. [12][13][14]), and the fundamental description of gravity [15][16][17][18][19]. One alternative that has gained popularity in recent years is the wCDM model. In this model the dark energy equation-of-state parameter $w$ may differ from the cosmological-constant value $w=-1$, so that the dark energy density evolves across the cosmic history of the Universe. It remains an open question whether observational constraints can point to such an equation of state in the near future. Thus, it is important to understand what precision would be needed to discriminate between these two models.
Among all tensions and issues in ΛCDM, the so-called $H_0$ tension is the most severe one. Considerable effort has been devoted to the problem, so far without a reliable and satisfactory solution (see [20] for more details). Notice that, alongside the model modifications considered so far, describing a data set with a model-independent approach might be very useful in this case [21][22][23].
For a typical cosmological setup, one naturally uses a Markov chain Monte Carlo (MCMC) approach to infer parameter values from the latest observational data sets. However, given the plethora of cosmological models being proposed, this approach only constrains the free parameters and says nothing about model comparison. For this purpose, statistical measures that rely on simplifying assumptions, including the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the deviance information criterion (DIC), have been used for model selection. The Bayesian evidence, on the other hand, provides a robust and reliable measure for model selection. Unfortunately, computing it involves a high-dimensional integration over the product of likelihood and prior, which is computationally very expensive. To overcome this, numerical approaches such as nested sampling [24][25][26] and the Savage-Dickey density ratio [27] have been developed. However, when the model is linear in its free parameters, the Gaussian linear model (GLM) provides an analytic solution for the Bayesian evidence. The formalism for a Gaussian or flat prior has been presented in [28]. Moreover, Bayesian model selection has been used in [29,30] to assess the reliability of the Bayesian evidence. In this work, we consider three important cosmological models which are linear in their parameters and apply the GLM method to understand how the precision of the expansion rate data affects the significance of the model discrimination through the Bayesian evidence.
The structure of this paper is as follows: In Sec. II, we give the basic formalism of the Gaussian linear model (GLM) and introduce the analytic formulae for the posterior distribution and the evidence. In Sec. III, we give the details of the simulated data for our models and present the results of applying the GLM to these data. Then in Sec. IV, we describe the currently available Hubble data from different observations and perform an MCMC parameter inference to obtain the best-fit values of the parameters as well as their uncertainties. We also apply the GLM method to the observational data and compare the results with those of the MCMC. Finally, we conclude and discuss the main points of our findings in Sec. V.

II. GAUSSIAN LINEAR MODEL
arXiv:2203.01817v1 [astro-ph.CO] 3 Mar 2022
In this section, we briefly provide the basic formalism of the GLM. In this scenario, a database is modeled by a function which is linear in its parameters, $y(x)=\sum_j \theta_j X_j(x)$, where each $X_j(x)$ is an arbitrary function of $x$ and the $\theta_j$ are the free parameters. Notice that the base functions can very well be non-linear functions of $x$. Assuming an $n_{\rm obs}$-dimensional database $(x_i, y_i, \tau_i)$, the likelihood function is a multivariate Gaussian in the parameters [28]. Here the maximum likelihood occurs at $\theta_0$ and $L$ denotes the likelihood Fisher matrix.
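For reference, the likelihood can be written out explicitly. The following is a sketch in the standard GLM notation of [28], assuming independent Gaussian uncertainties $\tau_i$ on the $y_i$; the symbols $A$, $b$ and $\mathcal{L}_0$ are labels assumed here for illustration:

$$A_{ij}=\frac{X_j(x_i)}{\tau_i},\qquad b_i=\frac{y_i}{\tau_i},\qquad L=A^{T}A,\qquad \theta_0=L^{-1}A^{T}b,$$

$$p(y|\theta)=\mathcal{L}_0\exp\left[-\frac{1}{2}(\theta-\theta_0)^{T}L\,(\theta-\theta_0)\right],\qquad \mathcal{L}_0=\frac{1}{(2\pi)^{n_{\rm obs}/2}\prod_i\tau_i}\exp\left[-\frac{1}{2}(b-A\theta_0)^{T}(b-A\theta_0)\right].$$

The factor $\mathcal{L}_0$ is the value of the likelihood at its maximum, so the exponent inside it is $-\chi^2_{\rm min}/2$.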
In order to perform the Bayesian parameter inference, we need to define a prior on the free parameters. We consider a Gaussian prior, where $\theta_{\rm pri}$ ($\Sigma_{\rm pri}$) is the mean (covariance matrix) of the prior and $n_{\rm par}$ denotes the number of free parameters. Using Bayes' theorem, the posterior distribution is proportional to the product of the likelihood and the prior, and is therefore itself a multivariate Gaussian in the parameters. In addition to the Bayesian parameter inference, the GLM provides the Bayesian evidence, which is a key quantity in model comparison. The Bayesian evidence involves an integration over the whole parameter space, $p(d|M)=\int p(d|\theta,M)\,p(\theta|M)\,d\theta$. Since both the prior and the likelihood are multivariate Gaussians in the GLM, the integral has an analytical solution, which is expressed through determinants of the matrices involved; here $|\Sigma|$ denotes the determinant of $\Sigma$. In addition, as we mentioned before, it is possible to have a solution in the case of flat priors. For more details, see [28].

$|\ln B_{01}|$    Strength of evidence
< 1               Inconclusive
1                 Weak evidence
2.5               Moderate evidence
5                 Strong evidence
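Written out explicitly, the Gaussian prior, the posterior and the evidence take the following form. This is a sketch of the standard result presented in [28]; the shorthand $P$, $F$, $\bar\theta$ and $\mathcal{L}_0$ (the value of the likelihood at its maximum $\theta_0$) is assumed here for illustration:

$$p(\theta)=\frac{|\Sigma_{\rm pri}|^{-1/2}}{(2\pi)^{n_{\rm par}/2}}\exp\left[-\frac{1}{2}(\theta-\theta_{\rm pri})^{T}\Sigma_{\rm pri}^{-1}(\theta-\theta_{\rm pri})\right],$$

$$p(\theta|d)\propto\exp\left[-\frac{1}{2}(\theta-\bar\theta)^{T}F\,(\theta-\bar\theta)\right],\qquad P\equiv\Sigma_{\rm pri}^{-1},\quad F=L+P,\quad \bar\theta=F^{-1}\left(L\theta_0+P\theta_{\rm pri}\right),$$

$$p(d|M)=\mathcal{L}_0\,\frac{|F|^{-1/2}}{|\Sigma_{\rm pri}|^{1/2}}\exp\left[-\frac{1}{2}\left(\theta_0^{T}L\theta_0+\theta_{\rm pri}^{T}P\theta_{\rm pri}-\bar\theta^{T}F\bar\theta\right)\right].$$

The last line follows by completing the square in the product of likelihood and prior and integrating the remaining Gaussian, which contributes $(2\pi)^{n_{\rm par}/2}|F|^{-1/2}$.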
When comparing two models $M_0$ and $M_1$, using Bayes' theorem, it is straightforward to obtain $p(M_0|d)/p(M_1|d)=B_{01}\,p(M_0)/p(M_1)$, where $B_{01}=p(d|M_0)/p(d|M_1)$ is the Bayes factor. Usually, the prior on the models $p(M)$ is taken to be flat, so the Bayes factor is the key quantity in Bayesian model comparison. The value of the Bayes factor should be interpreted on an empirically calibrated scale to compare given models. The Jeffreys' scale [31] and the Kass-Raftery scale [32] provide two well-known ways to interpret the Bayes factor. These two scales are presented in Tab. I and Tab. II. Notice that in both scales the Bayes factor favors the model with the larger evidence.
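As a small illustration of how a Bayes factor is read off such a scale, the sketch below maps a value of $\ln B_{01}$ to a verbal category. The thresholds 1, 2.5 and 5 are taken from the strength-of-evidence table in this section; the function name `jeffreys_strength` and the treatment of each threshold as an inclusive lower bound are assumptions made for illustration.

```python
def jeffreys_strength(ln_b01):
    """Return (favoured model, strength label) for a given ln B_01.

    A positive ln B_01 favours M_0, a negative one favours M_1.
    Thresholds are assumed to be inclusive lower bounds of each category.
    """
    favoured = "M0" if ln_b01 > 0 else "M1"
    size = abs(ln_b01)
    if size < 1.0:
        return favoured, "inconclusive"
    if size < 2.5:
        return favoured, "weak"
    if size < 5.0:
        return favoured, "moderate"
    return favoured, "strong"


if __name__ == "__main__":
    for value in (0.43, 3.0, -6.0):
        print(value, jeffreys_strength(value))
```

For an inconclusive result the favoured-model entry carries little meaning and is returned only for completeness.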

III. SIMULATED DATA AND RESULTS
In order to apply the GLM in the cosmological context, we need a linear model. For the ΛCDM model, the squared Hubble parameter as a function of redshift can be written in two equivalent forms, where $H_0 = 100h$ is the current expansion rate of the universe and $\Omega_m$ is the matter density parameter. Interestingly, the second form is linear in its parameters and can be treated as a GLM whose free parameters are $\Omega_m h^2$ and $h^2$. In addition to ΛCDM, the wCDM model (for a fixed value of $w$) and the non-flat ΛCDM (NΛCDM) can also be written as GLMs; in the latter case, $\Omega_k$ is the curvature density parameter and $\Omega_k h^2$ is an additional free parameter. Since these models are GLMs, it is easy to find the parameter values which maximize the likelihood, the mean and covariance of the posterior distribution, and the Bayesian evidence.
To obtain the posterior distributions as well as the evidence, we need to define a prior on the free parameters. To avoid any possible prior bias, we consider a wide Gaussian prior on the free parameters. We use a multivariate Gaussian with a specified mean and covariance matrix, where the first, second and third rows of the covariance matrix correspond to $\Omega_m h^2$, $\Omega_k h^2$ and $h^2$, respectively. We have examined different means and covariance matrices to check the robustness of our results. As long as the priors are wide enough, our results are unchanged and there is no prior bias.
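The linear forms referred to above can be reconstructed from the standard Friedmann equation. The expressions below are a sketch that assumes radiation is neglected, $H$ is measured in km s$^{-1}$ Mpc$^{-1}$, and $w$ is held fixed; they may differ in presentation from the numbered equations of the paper. For ΛCDM,

$$H^2(z)=H_0^2\left[\Omega_m(1+z)^3+1-\Omega_m\right]=100^2\left[\Omega_m h^2\left((1+z)^3-1\right)+h^2\right],$$

for wCDM,

$$H^2(z)=100^2\left[\Omega_m h^2\left((1+z)^3-(1+z)^{3(1+w)}\right)+h^2\,(1+z)^{3(1+w)}\right],$$

and for NΛCDM,

$$H^2(z)=100^2\left[\Omega_m h^2\left((1+z)^3-1\right)+\Omega_k h^2\left((1+z)^2-1\right)+h^2\right].$$

In each case the data variable is $y=H^2$, the parameters $\Omega_m h^2$, $\Omega_k h^2$ and $h^2$ enter linearly, and the redshift-dependent brackets play the role of the base functions $X_j(z)$. For $w=-1$ the wCDM expression reduces to the ΛCDM one.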

A. ΛCDM and wCDM
In order to understand how the precision and sampling rate of an expansion rate database affect the model comparison, we simulate expansion rate data with different precisions and use the GLM method to perform a model comparison. To compare ΛCDM and wCDM, two sampling rates have been considered. In the first case, we simulate 100 data points in the range $z \in (0, 3)$ with uncertainty $\sigma \in (0.5, 10)\%$ using the ΛCDM model. For the second sampling strategy, we simulate 50 data points in the redshift range $z \in (0, 2)$ with the same uncertainty range. We then compute the evidence for both ΛCDM and wCDM on the simulated data. In our analysis, the EoS is selected in the range $w \in (-1.1, -0.9)$. Moreover, we consider a Gaussian distribution for the uncertainty, so the data points are randomly generated in each simulation and the Bayes factor itself follows a distribution. To account for this statistical fluctuation, we generate 40 simulated data sets and compute the mean of the Bayes factor. We have examined other values for the number of data sets, and the results remain essentially the same when more data sets are considered. The mean Bayes factor $\ln B_{01} = \ln B_\Lambda - \ln B_w$ for the two strategies is shown in Figs. (1) and (2), respectively. The value of the Bayes factor for each cell is shown as a numerical value on the cell.
From Fig. (1), it is clear that discriminating ΛCDM from wCDM with uncertainty larger than $10^{-1.4} \sim 4\%$ is almost impossible for a wide range of the EoS. With $\sim 3\%$ uncertainty, we see strong evidence only for $w \sim -0.9$ or $w \sim -1.1$, which is already disfavored by other observations. On the other hand, with $\sigma \leq 1\%$, the chance of discriminating increases significantly. For example, with $\sigma = 0.5\%$, a 2% deviation in the EoS of dark energy ($w = -1.02$ or $w = -0.98$) could be detected with strong evidence.
Notice that, as we mentioned above, the base model for the simulated data is ΛCDM and the Bayes factor is computed for ΛCDM and wCDM. Conversely, if we consider wCDM as the base model for the simulated data and compute the Bayes factor as $\ln B_{01} = \ln B_w - \ln B_\Lambda$, the results are the same. In this case, a positive Bayes factor indicates more evidence in favor of wCDM.
In addition to the precision of each data point, the sampling rate of a database affects the model comparison. For a lower-cadence database, the results are presented in Fig. (2). Overall, the pattern is the same as for the first sampling strategy, but the strength of the Bayes factor decreases. For example, the extreme case in the first sampling strategy is $\ln B_{01} \sim 80$ for $\sigma = 0.5\%$ and $w = -0.9$, while for the second strategy the number decreases to $\ln B_{01} \sim 33$, which is around 60% less than the former.
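The simulation procedure described in this subsection can be sketched in a few lines of plain Python. This is a minimal illustration and not the code used for the figures. Several ingredients are assumptions introduced here: the fiducial values $h=0.7$ and $\Omega_m=0.3$, evenly spaced redshifts, a diagonal prior with unit variance, noise added directly to $y=(H/100)^2$ with standard deviation $\tau=2\sigma y$ (first-order propagation of a fractional error $\sigma$ on $H$), and the function names. Both models have two linear parameters, so the 2x2 matrix algebra is written out by hand.

```python
import math
import random


def basis_lcdm(z):
    # y = (H/100)^2 = Omega_m h^2 * [(1+z)^3 - 1] + h^2 * 1
    return ((1.0 + z) ** 3 - 1.0, 1.0)


def basis_wcdm(z, w):
    # Same two parameters; dark energy scales as (1+z)^(3(1+w)), with w held fixed
    de = (1.0 + z) ** (3.0 * (1.0 + w))
    return ((1.0 + z) ** 3 - de, de)


def log_evidence(rows, y, tau, prior_mean, prior_var):
    """Analytic GLM log-evidence for two parameters and a diagonal Gaussian prior."""
    l11 = l12 = l22 = g1 = g2 = bb = log_tau = 0.0
    for (x1, x2), yi, ti in zip(rows, y, tau):
        a1, a2, bi = x1 / ti, x2 / ti, yi / ti
        l11 += a1 * a1          # likelihood Fisher matrix L = A^T A
        l12 += a1 * a2
        l22 += a2 * a2
        g1 += a1 * bi           # g = A^T b = L theta_0
        g2 += a2 * bi
        bb += bi * bi
        log_tau += math.log(ti)
    p1, p2 = 1.0 / prior_var[0], 1.0 / prior_var[1]
    f11, f12, f22 = l11 + p1, l12, l22 + p2      # posterior Fisher matrix F
    det_f = f11 * f22 - f12 * f12
    r1 = g1 + p1 * prior_mean[0]                 # r = L theta_0 + P theta_pri
    r2 = g2 + p2 * prior_mean[1]
    t1 = (f22 * r1 - f12 * r2) / det_f           # posterior mean = F^-1 r
    t2 = (f11 * r2 - f12 * r1) / det_f
    # b^T b + theta_pri^T P theta_pri - theta_bar^T F theta_bar
    quad = bb + p1 * prior_mean[0] ** 2 + p2 * prior_mean[1] ** 2 - (t1 * r1 + t2 * r2)
    n_obs = len(y)
    return (-0.5 * n_obs * math.log(2.0 * math.pi) - log_tau
            - 0.5 * math.log(det_f)
            - 0.5 * math.log(prior_var[0] * prior_var[1])
            - 0.5 * quad)


def mean_ln_bayes_factor(w, sigma, n_points=100, z_max=3.0, n_sets=40, seed=1):
    """Mean of ln B_01 = ln Z_LCDM - ln Z_wCDM over n_sets LCDM mock data sets."""
    rng = random.Random(seed)
    omh2_true, h2_true = 0.147, 0.49             # assumed fiducial: Omega_m=0.3, h=0.7
    zs = [z_max * (i + 1) / n_points for i in range(n_points)]
    rows_l = [basis_lcdm(z) for z in zs]
    rows_w = [basis_wcdm(z, w) for z in zs]
    y_true = [omh2_true * x1 + h2_true * x2 for x1, x2 in rows_l]
    tau = [2.0 * sigma * yt for yt in y_true]    # assumed error model on y
    prior_mean, prior_var = (0.15, 0.5), (1.0, 1.0)   # assumed wide prior
    total = 0.0
    for _ in range(n_sets):
        y = [rng.gauss(yt, t) for yt, t in zip(y_true, tau)]
        total += (log_evidence(rows_l, y, tau, prior_mean, prior_var)
                  - log_evidence(rows_w, y, tau, prior_mean, prior_var))
    return total / n_sets


if __name__ == "__main__":
    for w in (-0.9, -0.98, -1.0):
        print(w, mean_ln_bayes_factor(w, sigma=0.005))
```

The exponent in `log_evidence` uses the identity that the $\theta_0$-dependent terms of the analytic evidence combine into $b^Tb$, so the likelihood Fisher matrix never has to be inverted on its own. The numerical values depend on the assumed fiducial model, prior and error model, and need not match those shown in the figures.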
B. ΛCDM and the non-flat ΛCDM
As we mentioned above, the NΛCDM can also be written in the form of a GLM. In this case, we follow the same strategies as in the previous subsection. The parameter $\Omega_k h^2$ is selected in the range $\Omega_k h^2 \in (-0.05, 0.05)$ to simulate data points. We simulate the Hubble data using NΛCDM and compute the evidence for both NΛCDM and ΛCDM. The results are shown in Figs. (3) and (4).
Our results indicate that with $\sigma > 3\%$ there is no chance of discriminating between the flat and non-flat ΛCDM models. The evidence becomes more significant for both positive and negative curvature as the uncertainty decreases. We see very strong evidence for $\Omega_k h^2 \sim 0.02$ with $\sigma < 1\%$, and it becomes much stronger for larger values of $|\Omega_k h^2|$. Furthermore, interestingly, we see negative evidence for high-accuracy data, $\sigma < 1\%$, when $\Omega_k h^2 \sim 0$. This is due to the Occam's razor effect, which favors the simpler model. The effect arises because an extra free parameter that is not significantly constrained by the data makes the evidence smaller compared to the model without that parameter. In these cases, the Bayes factor favors the simpler model, which in our case is the flat ΛCDM. Of course, the constraint on $\Omega_k h^2$ improves significantly for larger values of $|\Omega_k h^2|$, so the evidence becomes positive and the more complex model is favored.
The results for the second sampling strategy are shown in Fig. (4). As in the previous case, the strength of the evidence decreases for a lower-cadence sampling rate. Here, the extreme value of the Bayes factor decreases by around 75% for the second sampling rate.
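The Occam's razor effect can be made explicit for Gaussian models. As a sketch (the shorthand $F$ for the posterior Fisher matrix and $\bar\theta$ for the posterior mean is assumed here, and the second step assumes a prior that is wide and centred not too far from $\bar\theta$), the evidence factorises into a best-fit likelihood and a volume ratio:

$$p(d|M)=\mathcal{L}(\bar\theta)\,p(\bar\theta)\,(2\pi)^{n_{\rm par}/2}|F|^{-1/2}\simeq\mathcal{L}(\bar\theta)\left(\frac{|F^{-1}|}{|\Sigma_{\rm pri}|}\right)^{1/2}.$$

The second factor is the ratio of the posterior volume to the prior volume; it is never larger than one, and each additional parameter multiplies it by a further factor below one. When the simulated $\Omega_k h^2$ is close to zero, the best-fit likelihood is almost the same for the flat and non-flat models, so this volume penalty decides the comparison in favor of the flat model. For larger $|\Omega_k h^2|$ the gain in $\mathcal{L}(\bar\theta)$ outweighs the penalty.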

IV. EVIDENCE FOR CURRENT AVAILABLE HUBBLE DATA
In this section, we apply the GLM approach to the current observational data. The expansion rate database in our analysis is a combination of Hubble parameter measurements from cosmic chronometers and BAO measurements. The database has been collected in [33]. In addition, the local $H_0$ measurement (the SH0ES data) [5] has been added to the database. For ΛCDM, wCDM and the non-flat ΛCDM, the results of the MCMC analysis are shown in Table (III). To perform the MCMC analysis, we use the public python package pymc3 [34]. In this analysis, we adopt the wide Gaussian prior on the free parameters introduced in Sec. III. The 1σ uncertainty of each parameter is shown along with its mean value. These values are estimated from the sample of parameters generated by the MCMC algorithm. We then apply the GLM formalism to the same database to obtain the MLE, the mean and covariance of the posterior, and the evidence. The results are presented in Tab. (IV). Clearly, the results from the GLM are in good agreement with those of the MCMC. Notice that, in the case of the GLM, we have an analytic posterior distribution for the free parameters and compute the mean and 1σ uncertainty directly from the distribution.
In our analysis, we find $\ln B_{01} = \ln B_\Lambda - \ln B_N = 0.43$, which indicates inconclusive evidence for the considered models. In fact, this result was expected because of the low precision of the observational data points. Notice that most data points from cosmic chronometers have uncertainties larger than 10%, whereas the uncertainty of the expansion rate data from BAO is less than 10%, and the uncertainty of the most precise one is $\sigma \sim 3.5\%$.
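To illustrate how the MLE and the analytic posterior follow from the GLM without any sampling, the sketch below treats flat ΛCDM with parameters $(\Omega_m h^2, h^2)$ and $y=(H/100)^2$. It uses a handful of noise-free mock points rather than the compilation of [33]; the mock redshifts, the 5% errors, the prior values, the first-order error propagation from $H$ to $y$, and the function name `glm_posterior` are all assumptions made for illustration.

```python
import math


def glm_posterior(z, hubble, sigma_h, prior_mean, prior_var):
    """Return (MLE, posterior mean, posterior covariance) for flat LCDM as a GLM.

    Parameters are theta = (Omega_m h^2, h^2) with a diagonal Gaussian prior.
    """
    l11 = l12 = l22 = g1 = g2 = 0.0
    for zi, hi, si in zip(z, hubble, sigma_h):
        y = (hi / 100.0) ** 2
        tau = 2.0 * hi * si / 100.0 ** 2       # first-order error on y (assumption)
        a1 = ((1.0 + zi) ** 3 - 1.0) / tau     # base function X_1 / tau
        a2 = 1.0 / tau                         # base function X_2 / tau
        b = y / tau
        l11 += a1 * a1
        l12 += a1 * a2
        l22 += a2 * a2
        g1 += a1 * b
        g2 += a2 * b
    # Maximum-likelihood point: theta_0 = L^-1 A^T b
    det_l = l11 * l22 - l12 * l12
    mle = ((l22 * g1 - l12 * g2) / det_l, (l11 * g2 - l12 * g1) / det_l)
    # Posterior Fisher matrix F = L + Sigma_pri^-1 and mean F^-1 (L theta_0 + P theta_pri)
    p1, p2 = 1.0 / prior_var[0], 1.0 / prior_var[1]
    f11, f12, f22 = l11 + p1, l12, l22 + p2
    det_f = f11 * f22 - f12 * f12
    r1 = g1 + p1 * prior_mean[0]
    r2 = g2 + p2 * prior_mean[1]
    mean = ((f22 * r1 - f12 * r2) / det_f, (f11 * r2 - f12 * r1) / det_f)
    cov = ((f22 / det_f, -f12 / det_f), (-f12 / det_f, f11 / det_f))
    return mle, mean, cov


# Mock, noise-free data generated from Omega_m h^2 = 0.147 and h^2 = 0.49
z_mock = [0.1, 0.5, 1.0, 1.5, 2.0]
h_mock = [100.0 * math.sqrt(0.147 * ((1.0 + zi) ** 3 - 1.0) + 0.49) for zi in z_mock]
err_mock = [0.05 * hi for hi in h_mock]
mle, mean, cov = glm_posterior(z_mock, h_mock, err_mock, (0.15, 0.5), (1.0, 1.0))

if __name__ == "__main__":
    print("MLE:", mle)
    print("posterior mean:", mean)
    print("1-sigma:", math.sqrt(cov[0][0]), math.sqrt(cov[1][1]))
```

The 1σ uncertainties are the square roots of the diagonal of the posterior covariance $F^{-1}$, which is the quantity one would compare with the spread of an MCMC sample.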

V. CONCLUSION
The landscape of cosmological models has expanded drastically in recent years, driven by the combined open problems of cosmological tensions and the internal consistency issues of gravitational models. In this work, we apply the GLM to three cosmological models, namely ΛCDM, non-flat ΛCDM and wCDM, in order to assess the data set precision needed to differentiate between each pair of these models.
Vanilla ΛCDM is our base model for considering any modification to the concordance model. Here, we compare ΛCDM with wCDM, where the equation-of-state parameter is allowed to take values other than $-1$. These two models are central to modern cosmology and have Friedmann equations represented by Eq. (11) and Eq. (13), respectively. In addition, we also consider ΛCDM with a possible non-flat component in Eq. (12). There have been recent suggestions in the literature that such a scenario may be preferable in the context of recent data reported by the Planck collaboration [35].
The GLM approach presented here takes the Friedmann equation components and, through a calculation resulting in the Bayes factor, can determine whether enough precision is present to differentiate between pairs of models. In Figs. (1,2) this is done for the ΛCDM and wCDM models, where two sampling strategies are shown with very consistent results. Here, we find that for an equation-of-state parameter that veers away from the ΛCDM value, the approach indicates a high confidence for differentiating these models. Specifically, we see strong evidence in favor of ΛCDM when the data are simulated with uncertainty $\sigma \sim 0.5\%$ and the rival model is wCDM with $w = -1.02$ or $w = -0.98$. Moreover, we show how different sampling rates affect the Bayes factor. The pattern of the results is almost the same for our two sampling strategies, but the value of the Bayes factor is smaller for the lower-cadence sampling.
In Figs. (3,4) we repeat the analysis for ΛCDM and a non-flat ΛCDM setting. Our results indicate that it is impossible to discriminate between these models with $\sigma > 3\%$ even if $\Omega_k h^2$ is in the range $(-0.05, 0.05)$. The strength of the evidence becomes much larger for $\Omega_k h^2$ not close to zero with uncertainty $\sigma < 3\%$. In addition, we see clear evidence of the Occam's razor effect when the curvature density is close to zero. To make this procedure as transparent as possible, we are making the code for this analysis available for others to use and improve upon$^{1}$.