Model selection applied to reconstructions of the Dark Energy

The main aim of this paper is to perform a model comparison for several reconstructions of the key properties that describe the dark energy of the Universe, i.e. its energy density and equation of state (EoS). We carry out this process using binning and linear-interpolation methodologies, on top of which we incorporate a correlation function mechanism. An extension of the two of them was also introduced, where internal amplitudes are allowed to vary in height as well as in position. The reconstructions were made with data from the Hubble parameter, Supernovae Type Ia and Baryon Acoustic Oscillations (H+SN+BAO), all of which span a range from z = 0.01 to z = 2.34. First we perform the parameter estimation for each of the reconstructions, and then provide a model selection through the Bayesian evidence. Throughout this process we found a better fit to the data, up to 4σ compared to ΛCDM, and the presence of some interesting features, i.e.
an oscillatory behaviour at late times, a decrease in the dark energy density component at early times and a transition to the phantom divide-line in the EoS. To discern these features from noisy contributions, we include a principal component analysis and found that some of these characteristics should be taken into account to satisfy current observations.


I. INTRODUCTION
Since the discovery of the current accelerated expansion of the Universe, cosmologists have been challenged to explain the cause of this mysterious phenomenon. To provide a possible explanation, the idea of a new exotic source was introduced, called dark energy (DE). In addition to the cosmological constant Λ, the simplest candidate for the dark energy, there is also a still-unknown key component for structure formation in the Universe, coined Cold Dark Matter (hereafter CDM). These two dark components comprise the basis of the standard cosmological model, or ΛCDM. Even though the ΛCDM model has had remarkable achievements, i.e. it satisfies most of the current cosmological observations on different scales, it still presents several shortcomings. On the theoretical side there is, for instance, the disagreement between cosmology and quantum field theory regarding vacuum-energy predictions [1]; on the observational side several problems have begun to arise, for example the discrepancy in the measurements of the Hubble constant H_0 among datasets [2]. This so-called Hubble tension seems to grow as observations become more precise [3], and even a variable H_0 (also named "running") has been found [4][5][6][7][8], further indicating the standard model's inability to reconcile late-time data with high-redshift data. The presence of these issues opens up the possibility of considering alternatives beyond the ΛCDM model.
A natural extension to the cosmological constant is to allow a redshift dependence, either through the equation of state (EoS) w_DE(z) or the energy density ρ_DE(z) of the dark energy. Different proposals with a dynamical behaviour have been presented. Some of them include scalar fields in the form of Quintessence, Phantom [9][10][11] or a combination of the two, called Quintom models [12]; others intend to modify the theory of gravity, such as f(R) models [13], anisotropic massive Brans-Dicke gravity [14] or theories with extra dimensions such as brane-world models [15,16].
On the other hand, in the absence of a fundamental and well-defined theory, several parameterizations of cosmological functions have been suggested to gain insight into the general DE behaviour and hence to look for possible deviations from the cosmological constant [17]. Some of them involve quantities such as the deceleration parameter q(z) [18][19][20], the jerk parameter j(z) [21,22] or, more predominantly, the dark energy EoS w_DE(z). Focusing on the EoS, we can find a plethora of parameterizations for w_DE(z) (or w_DE(t)), for instance in terms of power-law, logarithmic, exponential and trigonometric components, or combinations of them [17,23,24]. These approaches can even be classified by the number of parameters that describe w_DE(z), i.e. with a single, two or more parameters [25,26]. Similarly, there have been parameterizations for the dark energy density, although not as many as for the EoS, i.e. the Generalised Emergent Dark Energy (GEDE) [21,27,28], the Graduated Dark Energy (gDE) [29][30][31][32], the Early Dark Energy (EDE) [33] and the Energy-Momentum Log-gravity (EMLG) [34]. Generally, in these models an associated EoS can be obtained, but the main assumptions are made at the level of the density; that is, the GEDE considers directly the evolution of the density parameter, and in the gDE model an inertial mass density is introduced which allows a possible transition to negative effective energy densities. Even though these parametric forms usually provide a better fit to the data, they have the limitation of assuming an a priori functional form, which may lead to biased or misleading model-dependent results, regardless of the DE nature. To avoid these possible issues, non-parametric and model-independent techniques are used. They allow us to extract information directly from the data to detect features within cosmological functions. That is, the goal of non-parametric and
model-independent approaches is to reconstruct (infer) an unknown quantity without a predefined shape. Non-parametric reconstructions may include Artificial Neural Networks or Gaussian processes, which have proven to be useful in cosmology as more data become available [35][36][37][38][39][40][41][42][43][44][45][46]. Examples of model-independent ones are higher-order terms in a Taylor series [17], Fourier series expansions [47] and the Padé approximation [48], just to mention a few. Another example of a model-independent reconstruction (relevant to this work) was introduced by [49], which considers adding a correlation function term to the data. An interesting result obtained with this technique is that a dynamical form of w_DE(z) is preferred over a constant value at a 3.5σ significance level [3]. Also, in [50] the same method was applied directly to the DE energy density, finding a dynamical behaviour and even evidence in its favor when compared with the standard model, although this only happened when some of the parameters' priors were reduced. Moreover, a similar path consists of combining the Principal Component Analysis (PCA) with the goodness of fit [51], where PCA helps to remove noisy oscillations induced by the reconstruction method in order to find deviations from the cosmological constant [52]. In a similar fashion, the study in [53] introduced the nodal reconstruction, in which a piecewise linear interpolation (or cubic splines) represents the main functional form to be reconstructed. This approach has been utilized to recover the EoS directly from the data [53]; its final form presented a well-constrained redshift dependency with a small bump at z ≈ 1.3, but beyond z ≈ 1.5 the data were too inaccurate to provide a good prescription. Later on, a similar form of this method was applied in [54,55] and in [56] (although there it is referred to as the knot-spline reconstruction) and showed that this type of model-independent approach can be used to find specific features by
allowing some central knots to vary both in height and position. More recently, an improvement of these methods has been applied to the dark energy EoS [57].
The main aim of this work is to perform model-independent reconstructions of the dark energy features and to provide a model comparison through the Bayesian evidence and goodness of fit. Our analysis is carried out among the nodal reconstruction [57], the correlation function method [49], an extension of the two in which internal amplitudes are allowed to vary in height as well as in position and, finally, for comparison, some of the parametric proposals. Even though these techniques are applicable to any function describing the dark energy, we focus on the EoS and the energy density. Once the reconstruction is carried out, several cosmological functions can be derived from it, i.e. the Hubble parameter H(z), the deceleration parameter q(z) and the Om(z) diagnostic. Finally, we perform a PCA to discern possibly important features from noise contributions. The novelty of this work is the joint study of the EoS and the energy density using the model-independent approaches (nodal and step functions), followed by model comparison through the Bayesian evidence, the derivation of several functions from the results and a PCA to find the preferred behaviours of DE, so as to keep, discard or propose other models/parameterizations. The paper is organized as follows: in section II we describe the reconstruction methodology, where the PCA method is also introduced and explained. In section III we provide a brief review of the underlying theory and datasets, and some specifications about the parameter estimation and model selection. In section IV we present the main results, and finally in section V we give our conclusions.

II. RECONSTRUCTION METHODOLOGY
One of the primary reconstruction methods used in this paper is built on a set of step functions. In this approach, steps or bins are utilized to describe any function f, where the steps are connected via hyperbolic tangents to preserve continuity. The target function takes the form

f(z) = f_1 + Σ_{i=1}^{N-1} [(f_{i+1} − f_i)/2] [1 + tanh((z − z_i)/ξ)],

where N is the number of bins, f_i the amplitude of the bin value, z_i the position where the bin begins on the z axis and ξ a smoothness parameter.
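As an illustration, the tanh-joined binning above can be sketched as follows (a minimal sketch assuming the conventional normalization of the transition term; the function name and defaults are ours, not from the paper's code):

```python
import numpy as np

def binned_f(z, amplitudes, z_edges, xi=0.01):
    """Step-function reconstruction: N bins joined by tanh transitions.

    amplitudes : bin heights f_i (length N)
    z_edges    : positions z_i where each bin after the first begins (length N-1)
    xi         : smoothness parameter controlling the transition width
    """
    z = np.asarray(z, dtype=float)
    f = np.full_like(z, amplitudes[0])
    for fi, fi1, zi in zip(amplitudes[:-1], amplitudes[1:], z_edges):
        # each term shifts the function by f_{i+1} - f_i around z_i
        f += 0.5 * (fi1 - fi) * (1.0 + np.tanh((z - zi) / xi))
    return f
```

For example, with two bins of amplitudes −1.0 and −0.5 separated at z = 1, the function stays near −1.0 well below z = 1 and near −0.5 well above it; a larger ξ widens the transition.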
Another approach used in this paper is the nodal reconstruction [57]. This type of reconstruction consists of performing interpolations, using linear, cubic or higher-order splines, to fill in the gaps between a certain number of nodes. The simplest example is the linear interpolation; that is, given two coordinates (z_i, f_i) and (z_{i+1}, f_{i+1}), the interpolated function is

f(z) = f_i + (f_{i+1} − f_i) (z − z_i)/(z_{i+1} − z_i),   for z_i ≤ z ≤ z_{i+1}.

The interpolation could also be made with higher-order polynomials (or splines) to preserve smoothness; nonetheless, it may then present heavy correlations between nodes or introduce unwanted noise to the reconstruction, and it may also present numerical problems when the amplitude values change too abruptly or when the nodes are positioned too close together, as shown in the appendix of [53].
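A piecewise-linear nodal reconstruction with constant extrapolation beyond the outer nodes (as adopted for z > 3 later in the text) is one line with NumPy; this is an illustrative sketch, not the SimpleMC implementation:

```python
import numpy as np

def nodal_f(z, z_nodes, f_nodes):
    """Linear interpolation between nodes (z_i, f_i); np.interp holds the
    first/last amplitude constant outside the node range."""
    return np.interp(z, z_nodes, f_nodes)
```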
In both types of reconstruction the nodes are located at certain positions z_i with variable amplitudes f_i. For instance, Figure 1 illustrates both types of reconstruction, along with the effect of the smoothness parameter and of the interpolation method.
A modified version of the binning and nodal reconstruction techniques is to let the internal positions z_i be free parameters, which allows us to capture more specific features at certain places [55,56]. Notice that in this version the internal variable positions have to be kept sorted to avoid possible overlapping in the reconstructions. This approach gives more degrees of freedom (one for each variable z_i) to the reconstruction. When using the linear interpolation the expected behaviour is straightforward, as it only varies the lines between nodes; in the binning process, however, it affects the width of the bins, so it is easier to find specific features in the positions rather than in the amplitudes.
Finally, there exists the possibility of either overfitting, by using a very complex model with a large number of bins (nodes), or underfitting, by not capturing enough features due to the use of just a few bins (nodes). Both of these possible issues can be managed by performing a model comparison among reconstructions through the Bayesian evidence; that is, we modulate the impact of additional parameters and their priors to find the most suitable description of the data. This method follows the principle of simplicity and economy of explanation known as Occam's razor [58], which states: "the simplest theory compatible with the available evidence ought to be preferred".

Correlation Function method
This method is applied on top of the binning approach in order to obtain a function that evolves smoothly [59]. The idea behind it is to treat the function in place as a random field evolving along with a correlation function ξ, for instance

ξ(δz) = ξ(0) / [1 + (δz/z_c)²],    (3)

with ξ(0) being the normalization factor and z_c representing a smoothing distance. The correlation function (3), named CPZ [49], has a characteristic correlation length after which its contribution decreases, hence providing stronger correlations between neighboring bins when they are located at distances smaller than z_c. There exist several alternatives that can reproduce this behaviour up to a certain degree, like the exponential fall-off ξ(δz) = ξ(0) e^{−δz/z_c} or the power law ξ(δz) = (δz/z_c)^{−n}, but the CPZ form has a more transparent dependence on the parameters f_i, a relatively simpler behaviour and it constrains the high-frequency modes better [49]. Throughout this work we use the values z_c = 0.3 and ξ(0) = 0.1, since they normalize the shorter-wavelength modes of the data [59].
Assuming the amplitudes f_i are equally distributed with the same width ∆ = z_{i+1} − z_i, the average of f(z) over the i-th bin is

f̄_i = (1/∆) ∫_{z_i}^{z_i + ∆} f(z) dz.

The variation from the fiducial model averaged over the bin is δf_i = f̄_i − f_fid, where the fiducial model is the underlying scheme upon which our reconstruction depends. In this way the covariance matrix can be obtained as

C_ij ≡ ⟨δf_i δf_j⟩ = (1/∆²) ∫∫ ξ(|z − z'|) dz dz',

and therefore the associated prior is

P_prior ∝ exp[ −(1/2) δf^T C^{−1} δf ].

Finally, once the prior is given (or equivalently χ²_prior = −2 ln P_prior), our total χ² to minimize becomes

χ²_total = χ²_data + χ²_prior.

The fiducial model could be one previously known; for instance, in the case of the dark energy EoS, it could be the cosmological constant with w_DE(z) = −1 or ρ_DE(z) = constant. However, demanding a behaviour like the cosmological constant may create a bias when performing the reconstruction. The same would be true for any other fixed fiducial model, so we opted for the floating prior proposed by [49],

f_fid,i = (1/N_j) Σ_{|z_j − z_i| ≤ z_c} f_j,

where z_j is the position of the amplitude f_j and N_j is the number of amplitudes that fulfill the condition |z_j − z_i| ≤ z_c. This floating prior makes sure that the parameter values stay continuous by evaluating the mean value of each parameter with its immediate neighbors.
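A compact sketch of the prior term, with the bin-averaged covariance approximated by evaluating the CPZ correlation at the bin centres and the floating fiducial taken as the local mean within z_c (the function name and this simplification are ours):

```python
import numpy as np

def cpz_prior_chi2(z_bins, f, z_c=0.3, xi0=0.1):
    """chi^2_prior from the CPZ correlation function with a floating fiducial.

    z_bins : centres of the bins
    f      : amplitudes f_i of the reconstruction
    """
    z_bins = np.asarray(z_bins, float)
    f = np.asarray(f, float)
    dz = np.abs(z_bins[:, None] - z_bins[None, :])
    C = xi0 / (1.0 + (dz / z_c) ** 2)        # CPZ correlation xi(|z_i - z_j|)
    near = (dz <= z_c).astype(float)         # neighbours within the distance z_c
    f_fid = (near @ f) / near.sum(axis=1)    # floating prior: local mean of f_j
    delta = f - f_fid
    return float(delta @ np.linalg.solve(C, delta))
```

A flat set of amplitudes coincides with its floating fiducial and so carries zero prior penalty; rapid bin-to-bin oscillations are penalized instead.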
It is important to mention that the correlation function method applied here is used as an "agnostic" way of reducing overfitting, agnostic in the sense that our imposed correlations are mainly obtained via a data-driven approach rather than a theory-driven one, i.e. effective field theories (EFTs) such as Quintessence or Horndeski [60][61][62][63][64]. This could introduce a small bias against EFTs, particularly Quintessence, as explained in [60]. Also, since this method is being used as a way to diminish overfitting, it could even prevent some interesting features from appearing or wash out existing ones, such as the disappearance of the wiggles in [63,64] when using a theory prior. To see whether any such bias may arise, we also include an instance in which we do not use the correlation function method (as seen in Fig. 5).

Bayesian statistics
In order to perform the parameter estimation we follow the Bayes theorem,

P(u|D, M) = L(D|u, M) P(u|M) / E(D|M),

where u is the vector of parameters of the hypothesis M (or model) to assess, D the experimental (observational) data, P(u|D, M) the posterior probability distribution, L(D|u, M) the likelihood, P(u|M) the prior distribution and E(D|M) the Bayesian evidence. Once the Bayesian evidence is computed for two models M_1 and M_2, the Bayes factor is defined as

B_{1,2} = E(D|M_1) / E(D|M_2).

This quotient, together with the Jeffreys scale shown in Table I [58,65], is a useful empirical tool for performing model selection; that is, it gives an insight into how good a model M_1 is when compared to a model M_2. An example of this method can be seen in [66], where the authors compared some DE models to ΛCDM. In this work, M_1 will correspond to ΛCDM and M_2 to any of our reconstructions, in order to make a direct comparison with the standard model. Even so, it is important to mention that the Jeffreys scale is empirical in nature and can sometimes rule out the true model [67]; added to this, we have the dependence of the Bayesian evidence on the priors and on the model constraints [68]. So, even though it is a great tool for comparison, it should not be taken as completely decisive when performing model selection.
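To make the use of such a scale concrete, here is a small helper mapping ln B onto qualitative strengths of evidence; the threshold values (1, 2.5, 5) follow a commonly used convention and are our assumption, since Table I is not reproduced here:

```python
def jeffreys_strength(lnB):
    """Interpret the log Bayes factor ln B_{1,2} = ln[E(D|M1)/E(D|M2)] on a
    Jeffreys-type scale. Threshold values vary between references; the
    ones below follow a common convention."""
    a = abs(lnB)
    if a < 1.0:
        strength = "inconclusive"
    elif a < 2.5:
        strength = "weak"
    elif a < 5.0:
        strength = "moderate"
    else:
        strength = "strong"
    return strength, ("M1" if lnB > 0 else "M2")
```

For instance, a value like ln B = −0.124 falls in the inconclusive band, with the sign mildly favouring M_2.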

Principal Component Analysis
After performing the reconstruction we can carry out a Principal Component Analysis (PCA) on the parameter space to draw conclusions about the data and how well they constrain the parameter space. Once the parameter inference is complete, we compute the Fisher matrix of the parameters w_i, which is F = C^{−1}, where C is the covariance matrix. Then, we diagonalize F to find a basis in which the parameters are uncorrelated,

F = W^T D W,

where the rows of W are the eigenvectors e_i(z) of the basis in which our parameters are uncorrelated and D is a diagonal matrix. If p is the vector of the best-fit values of our f_i, then the new uncorrelated parameters are q = W p.
The diagonal elements of D are the eigenvalues d_i, each with its corresponding e_i(z) and q_i. These d_i are related to the errors as

σ_i = 1/√d_i.

Then, we can reconstruct the function in place,

f(z_n) = Σ_i q_i e_i(z_n),

with standard deviation

σ²(z_n) = Σ_i e_i(z_n)² σ_i²,

where z_n is the location of each parameter f_i. Now we can choose any number of principal components (PCs) to reconstruct the function (from one PC up to the original number of parameters in the reconstruction). If we keep the same number of PCs as there are bins, the reconstruction looks the same, as no variance has been removed; but by removing the PCs with the smallest d_i contributions we remove the noisiest aspects of the reconstruction (those with the biggest errors σ_i). A common practice is to keep enough PCs to maintain 95% of the information or variance.
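The steps above can be sketched as follows (a minimal sketch: the eigendecomposition and variance formula follow this section, with NumPy's convention of eigenvectors in columns, so q = W^T p here is equivalent to q = W p with row eigenvectors):

```python
import numpy as np

def pca_filter(C, p, keep):
    """Filter a binned reconstruction by keeping its `keep` best-constrained
    principal components.

    C    : parameter covariance matrix from the inference
    p    : best-fit amplitudes f_i
    keep : number of principal components retained
    """
    F = np.linalg.inv(C)             # Fisher matrix F = C^-1
    d, W = np.linalg.eigh(F)         # F = W diag(d) W^T, eigenvectors in columns
    order = np.argsort(d)[::-1]      # largest d_i (smallest sigma_i) first
    d, W = d[order], W[:, order]
    q = W.T @ p                      # uncorrelated parameters
    f_filtered = W[:, :keep] @ q[:keep]          # drop the noisiest modes
    var = (W[:, :keep] ** 2) @ (1.0 / d[:keep])  # sigma^2(z_n) = sum e_i^2 / d_i
    return f_filtered, var
```

Keeping all components returns the original amplitudes; dropping the low-d_i modes removes the noisiest directions.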

III. THEORY AND DATASETS
For a homogeneous and isotropic flat universe described by the Friedmann-Robertson-Walker metric, the Friedmann equation describing the dynamical evolution, in terms of redshift z, can be written as

H²(z) = H₀² [Ω_r,0 (1+z)⁴ + Ω_m,0 (1+z)³ + Ω_Λ,0],    (14)

where the Hubble parameter H(z) represents the expansion rate of the Universe, with H_0 being its value today. Here we have the components of the Universe written in terms of the dimensionless density parameters Ω_i(z) ≡ ρ_i(z)/ρ_c(z), where their contributions at the present time (represented by the subscript 0) are Ω_r,0 for the relativistic particles, Ω_m,0 for the matter content (baryons and dark matter) and Ω_Λ,0 for the cosmological constant; ρ_c is the critical density of a spatially flat Universe. Setting aside the cosmological constant and allowing a dynamical dark energy component, the last term in equation (14) is replaced by Ω_Λ,0 → ρ_DE(z)/ρ_c,0. Furthermore, if the dark energy is assumed to be a perfect fluid with a barotropic EoS, then once we compute the EoS we are able to derive the energy density through

ρ_DE(z)/ρ_c,0 = Ω_DE,0 exp[ 3 ∫₀^z (1 + w_DE(z'))/(1 + z') dz' ].    (15)

On the other hand, the dark energy density could, in general, come from an effective model contribution and not necessarily from a physical component. Hence ρ_DE(z) may have any shape (including negative values). Therefore, if the shape of ρ_DE(z)/ρ_c,0 is obtained by any process, we are able to derive its associated equation of state:

w_DE(z) = (1+z)/3 · [dρ_DE(z)/dz]/ρ_DE(z) − 1.    (16)

Finally, if the shape of the energy density is obtained, either directly or through w_DE(z), we are able to compute some derived functions, for instance the Om diagnostic,

Om(z) = [H²(z)/H₀² − 1] / [(1+z)³ − 1],

which provides a null test for the cosmological constant [69], and the deceleration parameter

q(z) = −1 + (1+z) H'(z)/H(z).

Data sets

The Hubble parameter tells us the expansion rate of the Universe, and it can be expressed as a function of the redshift through the Friedmann equation. This parameter can also be assembled by gathering measurements of old stars known as cosmic chronometers, as they work as "standard clocks" in cosmology. That is, H(z) can be
obtained by calculating the derivative of the cosmic time with respect to the redshift as

H(z) = −1/(1+z) · dz/dt ≈ −1/(1+z) · ∆z/∆t,

where the rate ∆z/∆t is measured via the difference in age of the cosmic chronometers. In this work we use the collection of these cosmic chronometers [70][71][72][73][74][75][76] (written as H in the datasets), which can be found in the repository [77].
Similarly to standard clocks, type Ia supernovae (SNIa) are coined the "standard candles". The distance modulus of the SNIa is derived from the empirical relation obtained when observing light curves,

μ = m*_B − M_B + αX − βC,

where X is the stretch parameter, C is the color parameter, M_B is the absolute magnitude, m*_B is the B-band apparent magnitude, and α and β are nuisance parameters. The Pantheon sample measured the apparent magnitude m_B = m*_B + αX − βC, with a fixed absolute magnitude M_B. Also, for a given cosmological model the distance modulus is

μ(z) = 5 log₁₀ [d_L(z)/10 pc],

from which the luminosity distance, D_L(z) = H₀ d_L(z), can be calculated in terms of redshift as d_L(z) = (1+z) r(z), with r(z) being the comoving distance

r(z) = c ∫₀^z dz'/H(z').

In this work we use the full catalogue of 1048 supernovae from the Pantheon SNIa sample, covering a redshift range 0 < z < 2 [78] (written as SN in the datasets). The associated full covariance matrix comprises a statistical and a systematic part, and along with the data, both are provided in the repository [79].
On the other hand, the baryon acoustic oscillations (BAOs) are used as "standard rulers" in cosmology, as they measure the angular distance D_A = r(z)/(1+z). The BAO scale is set by the radius of the sound horizon,

r_d = ∫_{z_d}^∞ c_s(z)/H(z) dz,

with c_s(z) being the sound speed of the baryon-photon fluid and z_d the drag epoch [80]. Since the size of r_d depends on the cosmological model, the BAO actually constrains the ratio D_A(z)/r_d. The sound horizon is calibrated using Big Bang Nucleosynthesis [81]. The BAO datasets used here contain the SDSS DR12 Galaxy Consensus, BOSS DR14 quasars (and eBOSS), Lyman-α DR14 auto-correlation, SDSS MGS and 6dFGS, located at different redshifts up to 2.36. For a more complete explanation see [82][83][84][85][86][87] and references therein.
To compute the χ² for each data sample, we have

χ² = (d_m − d_obs)^T C_data^{−1} (d_m − d_obs),

where d_m and d_obs are our model predictions and the observables, respectively, and C_data is the covariance matrix associated with each of the datasets. Since the observations of each dataset are independent from each other, the joint χ² can be calculated as

χ²_joint = χ²_H + χ²_SN + χ²_BAO.
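The χ² assembly above is straightforward to code; a minimal sketch (function names are ours):

```python
import numpy as np

def chi2(model, obs, cov):
    """Gaussian chi^2 for one dataset: r^T C^-1 r, with r = d_m - d_obs."""
    r = np.asarray(model, float) - np.asarray(obs, float)
    return float(r @ np.linalg.solve(np.asarray(cov, float), r))

def chi2_joint(*chi2_values):
    """Independent datasets (H, SN, BAO): their chi^2 values simply add."""
    return sum(chi2_values)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is the usual numerically safer choice for the correlated SN covariance.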

Models and priors
Since the Bayesian evidence is very susceptible to the number of parameters and their prior distributions, it is worth being careful when selecting them. A summary of the additional parameters along with their prior ranges is displayed in Table II.
First, to provide a comparison with the reconstructions, we constrain some parameterized models: the wCDM model, which corresponds to a constant EoS w_DE(z) = w_c, and the Chevallier-Polarski-Linder (CPL) EoS [26], in which w_DE(z) = w_0 + w_a z/(1+z), with w_0, w_a and w_c free parameters to be estimated from the data. The flat priors for w_0 and w_c are the same, [−2.0, 0.0], and the flat prior used for w_a is [−2.0, 2.0]. Then, inspired by the idea of a density capable of a possible transition to ρ_DE ≤ 0 at high redshifts, similar to the one introduced by [29], we propose a simple parameterization with the shape of a sigmoid function,

ρ_DE(z)/ρ_c,0 = (1 − Ω_m,0) [k_0 − 1/(1 + e^{10 (z_cut − z)})],

with z_cut the redshift value where the transition may occur and k_0 = 1 + 1/(1 + e^{10 z_cut}) a constant which compensates by the necessary amount so that ρ_DE(z = 0)/ρ_c,0 = 1 − Ω_m,0, to account for the Friedmann constraint. The parameter z_cut has a flat prior within the range [0.0, 3.0]. The sharpness of this sigmoid function comes from the factor in the exponential argument: if this number is larger (smaller), the transition to zero is sharper (smoother).
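Our reading of the sigmoid density (reconstructed here from the stated k_0 and the Friedmann constraint; the exact expression in the published version may differ) can be checked numerically:

```python
import numpy as np

def rho_de_sigmoid(z, z_cut, Om0=0.3):
    """Sigmoid-shaped rho_DE(z)/rho_c0: equals 1 - Om0 at z = 0 and decays
    toward zero for z > z_cut. The factor 10 sets the sharpness of the
    transition, as described in the text."""
    z = np.asarray(z, dtype=float)
    k0 = 1.0 + 1.0 / (1.0 + np.exp(10.0 * z_cut))   # enforces the z = 0 constraint
    return (1.0 - Om0) * (k0 - 1.0 / (1.0 + np.exp(10.0 * (z_cut - z))))
```

With z_cut = 2 the density sits at 1 − Ω_m,0 today and has effectively vanished by z = 3, matching the qualitative behaviour described above.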
Throughout all the reconstructions we let the data decide the level of complexity of the two main functions within the range 0 ≤ z ≤ 3; that is, we place the nodes and bins (free parameters) over this range, and for z > 3 we adopt a constant value corresponding to the last amplitude.
In the first set of reconstructions, the positions of the nodes and bins are kept fixed and uniformly distributed within z = [0, 3]. In both types of reconstruction one is relatively free to choose any number of amplitudes, so we used from 1 to 6 (and then, without loss of generality, jumped to 20) to see how the fit to the data improved and how the Bayesian evidence responded. Notice that the wCDM model is equivalent to the EoS reconstruction with one bin. In particular, we allow the possibility of a negative energy density; hence the amplitudes of the nodes and bins for ρ_DE(z)/ρ_c,0 were set to move freely on the ordinate with flat priors [0.0, 1.5] for z < 1.5, and [−1.5, 1.5] otherwise, since it is beyond z = 1.5 that a switch to negative energy density generally appears. For the amplitudes in the EoS we use flat priors of [−2.5, 1.0]. An important point is that when incorporating the correlation function method with a floating prior (χ²_prior), or CPZ, it is recommended to choose a large number of bins; in our case, we used 20 bins and obtained consistent results.
Whereas in the first set of reconstructions the positions z_i of each parameter remained fixed, one may argue that the location of the amplitudes could bias the results. To check this point, in the second set of reconstructions the amplitudes vary and the internal positions are also allowed to move freely along the z-direction (but without overlapping each other). For the reconstruction of w_DE(z) we consider 6 parameters: 4 varying amplitudes and 2 internal positions; similarly, for ρ_DE(z)/ρ_c,0 we use 5 parameters: 3 amplitudes and 2 internal positions. We will refer to them as 4y2x (3y2x), or simply internal z_i. Whereas the priors for the amplitudes remain the same as in the first set, the priors for the internal positions z_i are [0.2, 1.4] and [1.6, 2.8], respectively.
To perform the reconstruction analysis we used a modified version of the SimpleMC code [88] along with dynesty [89], a nested-sampling algorithm which allows the Bayesian evidence to be computed efficiently. For the number of live points we followed the general rule [90] of using 50 × ndim live points, where the dimensionality ndim corresponds to the number of parameters, so the total number of live points depends on the reconstruction and the number of bins/nodes. As for the stopping criterion, we used an accuracy of 0.01, which indicates the maximum difference between samples. Within the SimpleMC code we have also implemented the Principal Component Analysis and the correlation function method with the floating prior. For the functional posterior confidence contour plots we used the python package fgivenx [91]. See Ref. [92], and references therein, for an extended review of the cosmological parameter inference and model selection procedure.
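A toy setup illustrating the dynesty configuration described above (the Gaussian likelihood is a hypothetical stand-in for the H+SN+BAO one; the prior range matches the EoS amplitudes quoted earlier):

```python
import numpy as np

def loglike(theta):
    # hypothetical Gaussian likelihood centred on w = -1 (stand-in for data)
    return -0.5 * np.sum((theta + 1.0) ** 2 / 0.1 ** 2)

def prior_transform(u):
    # map the unit cube onto the flat prior [-2.5, 1.0] used for EoS amplitudes
    return -2.5 + 3.5 * u

def compute_evidence(ndim=2):
    """Nested sampling with 50 x ndim live points and dlogz = 0.01,
    following the settings quoted in the text (requires dynesty)."""
    import dynesty
    sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                    nlive=50 * ndim)
    sampler.run_nested(dlogz=0.01)
    return sampler.results.logz[-1]   # ln of the Bayesian evidence
```

Evidences computed this way for two models give ln B_{1,2} directly by subtraction, which is then read against the Jeffreys scale.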

IV. RESULTS
We present the results in four subsections, regarding the EoS, the energy density, the functions derived from our results and the PCA used to distinguish noise from signal. The best-fit parameter values, the logarithm of the Bayes factor (ln B_ΛCDM,i) and the goodness of fit (∆χ²) are presented in Table III; complementary to this table, both quantities are displayed in Fig. 2. The confidence regions for the parameterizations are shown in Fig. 3, whilst the reconstructions are shown in Figs. 4 and 5.

Dark Energy Equation of State
The best-fit values (with standard deviations) for the wCDM and CPL parameterizations correspond to w_c = −0.99 ± 0.06, and w_0 = −1.01 ± 0.08 with w_a = 0.12 ± 0.47, respectively. That is, the models with one or two parameters, wCDM, CPL and the two-parameter reconstructions, produce very similar results, which means they are statistically consistent with the cosmological constant within the 68% confidence region (see Figure 3 and the first column of Figure 4). These results can also be validated by comparing the ∆χ² presented in Table III.
When computing the Bayes factors of all the models (green region of the top panel in Fig. 2), one observes, in general, a moderate evidence against the reconstructions, regardless of the number of extra parameters, compared to the standard model. However, in the reconstructions certain features at high redshift become apparent as more parameters are added, and therefore an enhancement of the goodness of fit. For instance, when using three amplitudes (second column of Figure 4), a bump-like shape appears at z ≈ 1.5, after which the general form prefers a crossing of the phantom divide line (PDL), i.e. for z ≳ 2 the amplitude values lean toward w_DE < −1 outside the 68% confidence region, which represents an improvement of ∆χ² ∼ −3. If we continue the process of adding amplitudes to the reconstructions, the main trend preserves a bump, now located at about z ≈ 1.2, and a crossing of the PDL at z ∼ 1.5. This is not true, though, when having 5 extra parameters (and we will have a similar problem with the density with 4 extra parameters), although the reason for this is mainly related to the reconstruction methodology; for a detailed explanation please refer to Appendix A. We also obtained that the cosmological constant is slightly outside of the 95% confidence contours at several places (see the last column of Figure 4) and, according to Table III, by using the definition of statistical significance in terms of the standard deviation σ, that is, the signal-to-noise

FIG. 2:
In this graph we show the differences in the ∆χ² and the Bayes factor between ΛCDM and our reconstructions for w_DE(z) and ρ_DE/ρ_c,0. The green shades show the strength of evidence according to the Jeffreys scale and the orange shades show the 1 to 4σ levels of statistical significance S/N ≡ √|∆χ²|. Ideally we want the upper markers to stay as close as possible to the black line at 0 (preferably to cross it), which is an indication of a better Bayes factor, and the lower ones to be far away from zero, which indicates a better fit to the data. The stars indicate the fit of the CPL EoS (top) and of the energy density with a sigmoid (bottom), the crosses indicate the internal z_i reconstructions and the triangles the reconstructions with 20 bins plus the correlation function. The binning reconstructions are plotted with blue lines, and the nodal ones with red lines.
ratio S/N ≡ √|∆χ²|, it represents a ∼3σ deviation from ΛCDM based on the improvement in the fit alone. These two features play a key role in identifying the correct dark energy model. If future surveys confirm their existence, the cosmological constant and single-scalar-field theories (with minimal assumptions) might be in serious trouble, as they cannot reproduce these essential features, and therefore alternative models should be taken into account. Furthermore, in the internal-z_i reconstructions the internal positions are able to localize the bump and the PDL crossing, and the results resemble the previous ones; see the last two columns of Figure 5. Besides the presence of the bump and the crossing of the PDL, we notice that at z = 0 the 68% confidence contour lies just below the limit w_DE = −1. An important point to stress is that the freedom of the internal positions led to a better fit to the data compared to the reconstructions with the same number of parameters but fixed positions, as displayed by the x-markers in the top panel of Figure 2.
Finally, to corroborate our findings, we include two reconstructions with 20 parameters: binning and binning with the CPZ correlation function, shown in columns 1 and 2 of Figure 5. For these particular reconstructions we have only focused on the binning method, as it provides a better fit to the data. As commented above, the correlation method is incorporated to create a function that evolves smoothly and, in general, both reconstructions share the same structure: a bump located at z ≈ 1.2, a crossing of the PDL at z ≈ 1.5 and a slight preference for wDE < −1 at z = 0. Besides these three features (found already in the internal models), there is also an oscillatory behaviour throughout the whole structure, which yields a deviation of about 4σ from ΛCDM. Even though this result may be considered as overfitting, due to the large number of parameters and the small ∆χ², its Bayes factor is as good as that of the reconstructions with fewer amplitudes. Also, the authors in [47] obtained a similar shape by using only three parameters in a Fourier basis and concluded a deviation of about 3σ from the cosmological constant, as we did here through a different mechanism.
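The correlation method referred to here assigns a prior covariance between bin amplitudes that decays with their separation in redshift, which is what makes the reconstructed function evolve smoothly. A minimal sketch, assuming the standard CPZ form ξ(δz) = ξ(0)/[1 + (δz/z_c)²], with illustrative values of ξ(0) and z_c rather than the ones used in this work:

```python
import numpy as np

def cpz_covariance(z_bins, xi0=0.1, zc=0.3):
    """Prior covariance between binned amplitudes following the
    CPZ correlation form xi(dz) = xi0 / (1 + (dz/zc)^2).
    xi0 and zc are illustrative values, not the fitted ones."""
    dz = np.subtract.outer(z_bins, z_bins)   # pairwise separations
    return xi0 / (1.0 + (dz / zc) ** 2)

z_centres = np.linspace(0.0, 2.0, 20)   # e.g. 20 bins in redshift
C = cpz_covariance(z_centres)
# Neighbouring bins are strongly correlated (a smooth function),
# while widely separated bins are nearly independent.
```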

Dark Energy Density
Similarly to the previous section, with one extra parameter, through the z_cut in the sigmoid function for ρDE, we obtained a better fit to the data by more than 1σ, and its Bayes factor results in a negative value, BΛCDM,i = −0.124 ± 0.131, indicating a slight evidence in its favor, although still within 1σ given the error. The constraint on the z_cut parameter corresponds to 2.101 ± 0.413, which provides an insight into a possible vanishing energy density beyond z = 2. This feature is also noticeable when looking at the 1-bin reconstruction, where the cosmological constant is on the edge of the 68% confidence contour for z > 1.5 (see Fig. 3).
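A minimal sketch of a sigmoid-type ρDE(z)/ρc,0 with the transition behaviour described above, assuming a logistic form with a hypothetical steepness parameter s (the exact functional form and fitted values in the paper may differ):

```python
import math

def rho_de_sigmoid(z, omega_de=0.69, z_cut=2.1, s=0.3):
    """Sigmoid dark-energy density rho_DE(z)/rho_c0: roughly
    constant (omega_de) at low redshift, dropping toward zero
    past z_cut. The steepness s is an illustrative choice."""
    return omega_de / (1.0 + math.exp((z - z_cut) / s))

low_z = rho_de_sigmoid(0.0)    # close to omega_de
high_z = rho_de_sigmoid(3.0)   # nearly vanishing density
```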
By incorporating additional parameters into the reconstruction of ρDE(z)/ρc,0, the fit to the data becomes better than that of the standard model, as shown by the negative values of ∆χ² on the bottom panel of Fig. 2, with Bayes factors indicating an evidence in favor (but still inconclusive) of our reconstructions, also reflected in the improvement of the χ². This may be happening because the data have a preference for an energy density with a bump located at z ≈ 1.2, after which, as the redshift becomes larger, ρDE(z) decreases until reaching a zero value, and even passing through negative values, while still being statistically consistent with zero. That is, with at least two amplitudes our reconstruction methodology is flexible enough to present this behavior, as seen in the first three columns of Fig. 4. As we continue adding more parameters, the presence of a possible sign change in the energy density becomes more pronounced. This transition occurs near z ≈ 2, and in the region around z ≈ 2.5 the deviation from the cosmological constant peaks. The general behaviour of our reconstructions is reflected by these two main features, which together provide a deviation of up to 2.8σ from the standard model.
In the same manner as for the reconstruction of the EoS, one may argue that the position of these two features (the bump at z ∼ 1.2 and the vanishing energy density for z > 1.5) could be biased because of the particular location of the amplitudes. Even so, it is stated in [93] that these particular features and their positions are prompted by the Lyman-α BAO data (as we will further discuss in the PCA subsection). In order to find an optimal place for the internal positions, we set them free by allowing them to move along the z-axis. Because of this additional freedom, the internal reconstruction (or 3y2x model) is able to localize these features and provides a better fit to the data, compared to the reconstruction with three fixed amplitudes. Moreover, despite having 5 extra parameters, the binning reconstruction has a negative Bayes factor, which favours this model over the rest of the reconstructions (displayed as crosses in the bottom panel of Fig. 2).
Lastly, as was done for the EoS, we incorporated a 20-bin reconstruction and a 20-bin reconstruction with the correlation function. Both present more substructure, such as an oscillatory behavior at late times, and also exhibit a transition to a null or negative density at z > 2, as seen in Fig. 5. Analogous to the guidance offered by the oscillatory demeanor found in the EoS reconstruction, its apparent wavering nature at z < 1.5 could encourage the study of an oscillatory basis such as a Fourier series. Looking at the ∆χ² we notice a deviation from ΛCDM of about 3.4σ, and a Bayes factor comparable with those of the few-parameter reconstructions.
The drop-off behaviour of ρDE(z) and, perhaps, a transition to a negative energy density has been captured in other works [29,50,93-97], as it seems to alleviate the tension that arises when estimating the Hubble constant H0 with different datasets. It was also found in [8] that, when considering flat ΛCDM and binning the data, negative energy densities (Ωm,0 > 1) are expected at higher redshifts. Hence, it may be pertinent to study models with a similar demeanor. This behaviour does not necessarily imply a negative physical energy density per se, but it may be the indication of an effective energy density, similar to the one generated by the curvature component [98,99].
In general, and throughout the reconstruction process, we have found different features beyond the cosmological constant, which result in deviations of up to 4σ. One final interesting observation is that the reconstructions using bins are generally better than those using nodes, and the Bayes factor also shows an improvement. Likewise, there is a preference for the reconstruction with 20 bins over 20 bins plus the correlation function, reflected in the ∆χ²; in fact, there is a strong evidence against the 20 bins+prior model, according to the Jeffreys' scale.

Derived functions
Once we have obtained the general form of both the EoS and the energy density, we proceed to obtain their respective derived functions: the Hubble parameter H(z)/(1 + z), the Om diagnostic and the deceleration parameter q(z). From the reconstructed energy density we obtain the EoS as a derived function and, similarly, from the reconstructed EoS the energy density is inferred. These derived functions correspond to the 20-bin reconstruction, which produced the best fit, and their posterior probabilities are displayed in Fig. 6. In general, the functions coming from the energy density show an enhanced oscillatory behaviour compared to the functions derived from the EoS. When comparing the reconstructed EoS with the one deduced from ρDE/ρc,0 we notice an important difference: since we allowed for negative energy density values, the derived EoS presents a discontinuity at about z ≈ 2, seen in the best-fit model denoted by the red dotted line in Fig. 6. Such discontinuities have been found and studied in other reconstructions and different models, such as [29,100,101]. Regarding the derived energy density: when reconstructing ρDE/ρc,0 directly, its freedom in the parameter space allowed it to reach null values of the energy density, and even negative ones at high redshifts; but when a barotropic EoS is imposed, through the conservation equation, the derived energy density remains always positive, with a bump located at z ≈ 1.5 and a smooth drop-off at high redshifts.
In general, the Om diagnostic shows consistency with several parameterizations and reconstructions [103,104]. Nevertheless, in our reconstructions we have found a mixed behaviour between quintessence and phantom components, corroborated by the EoS. That is, there are certain places where Om(z) > Ωm,0 (quintessence) and others with Om(z) < Ωm,0 (phantom).
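The quintessence/phantom criterion just stated follows directly from the standard definition of the Om diagnostic; a minimal sketch, with the ΛCDM sanity check that Om(z) reduces to a constant Ωm,0:

```python
def om_diagnostic(z, E2):
    """Om diagnostic: Om(z) = (E^2(z) - 1) / ((1+z)^3 - 1),
    with E(z) = H(z)/H0. For flat LambdaCDM it is constant and
    equal to Omega_m0; Om(z) > Omega_m0 signals quintessence-like
    behaviour and Om(z) < Omega_m0 phantom-like behaviour."""
    return (E2 - 1.0) / ((1.0 + z) ** 3 - 1.0)

# Sanity check with flat LambdaCDM and Omega_m0 = 0.3:
om0, z = 0.3, 1.0
E2 = om0 * (1.0 + z) ** 3 + (1.0 - om0)
om_value = om_diagnostic(z, E2)   # recovers Omega_m0
```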
Finally, the deceleration parameter q(z) for the reconstructed EoS gives a transition to an accelerated Universe at around z ∼ 0.6, which is statistically consistent with the ΛCDM value and with results previously obtained [19,104,105]. On the other hand, when q(z) is reconstructed through the energy density, the universe goes through several short periods of acceleration-deceleration (a similar behaviour was seen in [50]); however, the main acceleration transition corresponds to z ∼ 0.45, unlike in ΛCDM, where the acceleration starts at z ∼ 0.7.
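The transition redshift quoted above can be located numerically from any tabulated H(z) via q(z) = −1 + (1+z) H′(z)/H(z); a short sketch with a ΛCDM check (illustrative, not the paper's pipeline):

```python
import numpy as np

def deceleration(z, H):
    """Deceleration parameter q(z) = -1 + (1+z) H'(z)/H(z),
    with H'(z) estimated by finite differences on tabulated H(z)."""
    return -1.0 + (1.0 + z) * np.gradient(H, z) / H

# LambdaCDM check (Omega_m0 = 0.3): the acceleration transition
# q(z_t) = 0 occurs where Omega_m (1+z)^3 = 2 (1 - Omega_m),
# i.e. z_t ~ 0.67.
om0 = 0.3
z = np.linspace(0.0, 2.0, 2001)
H = np.sqrt(om0 * (1.0 + z) ** 3 + (1.0 - om0))   # H(z)/H0
q = deceleration(z, H)
z_transition = z[np.argmin(np.abs(q))]
```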

PCA and the Bayes' Factor
By applying the principal component analysis to remove the noisiest modes (enough to preserve 95% of the variance) of the reconstructions, we see that the biggest changes happen mainly at z > 2.0. An example of this can be seen in Fig. 7, where we used the reconstruction with 20 bins for the energy density and the EoS. Since the only information here comes from the Lyman-α BAO (z ∼ 2.3), it may be reasonable to argue that the main tensions amongst models come from these high-redshift data (as has also been suggested in [106-108]), or perhaps there exists the possibility that large systematic errors are present in this dataset. This may be an indication that the dynamical DE behavior (namely the crossing of the PDL for the EoS and the null energy density at early times) is merely due to noisy data, although this can only be confirmed once a significant amount of information is obtained in this redshift region. This also has an effect on the Bayesian Evidence. As the parameters beyond z = 1.5 are the least constrained, they contribute little to nothing to the final evidence of the reconstruction, which means that we have to be very careful when utilizing the Bayes' Factor to directly perform model selection. Added to this is the fact that the Evidence is quite susceptible to the prior range. Such problems are common when performing reconstructions; a possible solution has been proposed in [68], although it is still a work in progress.
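The variance-based truncation described above can be sketched as follows; this is one standard way to project posterior samples onto the leading principal components, not necessarily the exact implementation used here:

```python
import numpy as np

def pca_filter(samples, keep_variance=0.95):
    """Project posterior samples of the binned amplitudes onto the
    leading principal components that retain `keep_variance` of the
    total variance, discarding the noisiest modes."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]    # reorder: descending
    frac = np.cumsum(vals) / vals.sum()
    n_keep = int(np.searchsorted(frac, keep_variance)) + 1
    P = vecs[:, :n_keep]                      # retained PCs
    return mean + (samples - mean) @ P @ P.T
```

Applied to the chains of a 20-bin reconstruction, such a filter would mostly alter the poorly constrained amplitudes at high redshift, consistent with what is shown in Fig. 7.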
Nevertheless, something that should be borne in mind is that certain characteristics pointing to dynamics are still present even when removing the noisiest PCs. In the energy density, these characteristics are the oscillatory nature at low redshift and the transition to a null or negative energy density at z ≈ 1.5; for the EoS, the preserved features include the oscillatory behavior, the bump at z ≈ 1.3, a preference for values below wDE = −1 at z = 0, and the crossing of the phantom divide line at late times.

V. CONCLUSIONS
Throughout this work we have studied several parametric forms and model-independent reconstructions of two properties of the Dark Energy: the EoS and the energy density. Regarding the parametric forms, we have analyzed the wCDM and CPL models, whereas for the energy density we introduced a simple form given by a sigmoid function to describe a transition behaviour. Then, to introduce more flexibility in the recovered functions, we also included two types of reconstructions: one based on step functions smoothly connected (bins) and one based on linear interpolations (nodes). Each of the reconstructions may have several amplitudes as free parameters with fixed locations, as well as variable positions to localize possible features. On top of the bin reconstruction we incorporated a floating prior that averages the set of neighboring amplitudes to behave as a mean between bins, hence preserving continuity among heights.
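A minimal sketch of the bin reconstruction just described, with steps smoothly connected by tanh transitions of width ξ (the smoothness parameter illustrated in Fig. 1); the amplitudes, transition positions and default ξ below are illustrative, not fitted values:

```python
import numpy as np

def binned_f(z, amplitudes, edges, xi=0.05):
    """Reconstruction by step functions smoothly connected with
    tanh transitions; xi is the smoothness parameter controlling
    the width of each transition between consecutive amplitudes."""
    z = np.asarray(z, dtype=float)
    f = np.full_like(z, amplitudes[0])
    for a_prev, a_next, z_i in zip(amplitudes, amplitudes[1:], edges):
        f += 0.5 * (a_next - a_prev) * (1.0 + np.tanh((z - z_i) / xi))
    return f

# Three-amplitude example for w_DE(z) with transitions at
# z = 0.8 and z = 1.6 (illustrative values only):
z = np.linspace(0.0, 2.5, 500)
w = binned_f(z, amplitudes=[-1.0, -0.8, -1.2], edges=[0.8, 1.6])
```

With a small ξ the function behaves as sharp bins; increasing ξ smooths it toward a node-like interpolation, which is the difference shown in Fig. 1.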
The wCDM and CPL parameterizations of the EoS showed little to no improvement over ΛCDM. However, with just a single parameter, the sigmoid function for the energy density provided a better fit to the data as well as a better Bayes' factor when compared to ΛCDM. Even though this parameterization is phenomenological, it could be an indication that these types of models should be studied further.
By adding complexity to the model-independent reconstructions, through extra amplitudes with fixed positions, the general outcome presented a dynamical behavior beyond the cosmological constant. In both reconstructed functions, ρDE(z) and wDE(z), there is a bump located at about z ∼ 1.2. This feature, along with a crossing of the phantom-divide-line in the EoS and a transition to a null energy density (or even negative values), yields deviations from a constant energy density of about 3σ. However, for these types of reconstructions the Bayes factor penalized the incorporation of parameters, displaying a moderate evidence against the reconstruction of wDE(z) and an inconclusive to weak evidence for ρDE(z).
In order to avoid a possible bias due to the location of the amplitudes, we have let the internal points move freely in z-space. The autonomy of the internal positions led to the localization of the features previously mentioned, and to an improvement in the fit of about 1σ in comparison with the same number of amplitudes at fixed positions. Another point to stress about the internal reconstructions is the enhancement of the Bayes' factor, which could even reach negative values, i.e. for ρDE(z) with bins and the 3y2x method.
Finally, when considering 20 parameters in the binned reconstruction we noticed an improvement in the fit of up to about 4σ. Nevertheless, a key result to bear in mind is that for the case of 20 bins+prior the Bayes' factor showed a strong evidence against this type of reconstruction. This is also reflected in a worse ∆χ² when compared to a reconstruction with the same number of parameters.
In general, the derived functions inherited the behaviour of the reconstructed ones: an oscillatory shape. These functions also exhibited consistency with other reconstruction methods: for instance, a dynamical DE behaviour, a different redshift for the transition from a decelerating universe to an accelerating one, and a better agreement with the BAO data. From the energy density reconstruction, the derived EoS presents a discontinuity at a redshift around z ∼ 2, which is necessary if the energy density transitioned to negative values; this has been found in several models and parameterizations.
By performing a principal component analysis we found that, in both types of reconstructions, the amplitudes beyond z = 1.5 are the least constrained, the predominant data around these redshift values being the BAO Lyman-α. This shows that the current Lyman-α BAO prefers a nearly null or negative energy density, or a transition from quintessence to a phantom dark energy. However, more precise data in this region are necessary to fully discern a dynamical DE; moreover, since the parameters in this region are unconstrained, the estimation of the Bayesian Evidence is directly affected, so any remarks about the Bayes' Factor should be made carefully. Nevertheless, it was found that some features could indeed be considered as signal, like the general oscillatory behaviour, the bump located at z ≈ 1.2 and wDE < −1 at the present epoch. Some concluding remarks can be summarized as follows: i) the binned reconstruction provided better results than the node method; ii) when incorporating the correlation function, the fit and the Bayes' Factor worsen considerably, even though the final shape is very much alike; iii) our model-independent reconstructions resulted in a better fit to the data (up to 4σ) and, in some cases, a better Bayes' Factor; iv) if future surveys confirm our results, the cosmological constant and single scalar field theories (with minimal assumptions) might be in serious trouble, as they cannot reproduce these essential features; this could be a great incentive to study DE models with this type of dynamical behavior and encourages the direct reconstruction of other functions, such as those that may lead to discontinuities in the EoS; finally, v) the PCA analysis showed great promise for some of the reconstructed features, such as the oscillations and a bump at intermediate redshifts.
As a future prospect, there remains the addition of other datasets that contain information from linear perturbations, such as the cosmic microwave background and the matter power spectrum. Also, with the resulting shape provided by our model-independent reconstructions, we may search for a parameterization or a physical model that incorporates the important characteristics already found. It would also be interesting to study some alternative PCA methods that have a mathematical basis to truly discern important features from noise.

FIG. 1: Example of the reconstruction plot for the f(z) function with 8 nodes and bins. We observe the difference in the reconstruction when using two values of the smoothness parameter ξ.

FIG. 3: These plots show the functional posterior probability: the probability of wDE(z) or ρDE(z)/ρc,0, normalised in each slice of constant z, with the colour scale giving confidence interval values. The 68% (1σ) and 95% (2σ) confidence intervals are plotted as black lines. From left to right: parameterized equations of state for wCDM and CPL, phenomenological sigmoid and 1-bin energy density reconstructions. The dashed black line corresponds to the standard ΛCDM values.

FIG. 4: From top to bottom: reconstructed wDE(z) with bins (and nodes), and reconstructed ρDE(z)/ρc,0 with bins (and nodes). It is easy to observe that there is more structure (more apparent features) in the reconstructions as the number of parameters increases (from left to right).

FIG. 5: From left to right: reconstructions with 20 bins, 20 bins with correlation function, internal z_i-bins and internal z_i-nodes. Purple figures correspond to the EoS, whereas green ones to the energy density.

FIG. 7: Applying the principal component technique to our reconstructions of the EoS and the energy density with 20 bins. By eliminating 5 PCs (which add up to about 5% of the total variance for each reconstruction) we obtain the orange figures, with slightly overestimated errors localized at z > 1.5.

FIG. 9: Functional posteriors of the EoS reconstruction with, from top to bottom, 4, 5 and 6 bins, using only Hubble parameter data.
ln B12      Odds        Probability   Strength of evidence
< 1.0       < 3:1       < 0.750       Inconclusive
1.0         ~ 3:1       0.750         Weak
2.5         ~ 12:1      0.923         Moderate
5.0         ~ 150:1     0.993         Strong

TABLE I :
Jeffreys' scale for model selection with the logarithm of the Bayes' factor, using the convention from [58].

TABLE II :
Additional parameters and their flat prior range.

TABLE III :
Mean values and standard deviations for the cosmological parameters. For each model, the last two columns present the Bayes Factor and ∆χ² = χ²ΛCDM − χ²i for fitness comparison. The datasets used are BAO+H+SN. Here ln EΛCDM = −530.79 (0.12).