Elucidating cosmological model dependence with H0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_0$$\end{document}

We observe that the errors on the Hubble constant H0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_0$$\end{document}, a universal parameter in any FLRW cosmology, can be larger in specific cosmological models than Gaussian processes (GP) data reconstruction. We comment on the prior mean function and trace the smaller GP errors to stronger correlations, which we show precludes all well studied dynamical dark energy models. We also briefly illustrate cosmographic expansions as another model independent cosmological reconstruction. Our analysis suggests that “cosmological model independence”, especially in the statement of Hubble tension, has become a misnomer.


Introduction
Cosmology rests upon assumptions. When one works with assumptions, timely contradictions are inevitable and these seed progress. Over two decades ago, the concordance flat ΛCDM model emerged from a set of contradictions. Today, cosmological tensions [1] point to problems with the assumption that the Universe is flat ΛCDM. Moreover, some assumptions underlying supernovae are in a state of flux [2][3][4], and even the assumption that the Universe is isotropic and homogeneous is being called into question [5][6][7][8][9][10]. This perpetual cycle of assumptions and contradictions is integral to cosmology. Recently, Gaussian Processes (GP) has become a staple of data-driven cosmology . In this letter, using the Hubble constant H 0 , we chip away at the widespread assumption that GP data reconstruction is cosmological model independent.
Cosmology strives to make robust statements across a host of cosmological models and this drives the "model independence" narrative. Working within parametric models, it is well established that Taylor expansion, or cosmography [60,61], offers one a glimpse of model independence, but the Cauchy-Hadamard theorem [62] (see [63]) confines one a e-mail: ocolgain@gmail.com (corresponding author) to low redshifts, z 1. Nevertheless, even within these restrictions, the Hubble constant H 0 can be determined in a bona fide model independent manner [64,65]. In contrast, GP reconstruction is a non-parametric technique that in principle allows one to extend "model independence" to higher redshifts. In practice, one reconstructs data through an assumption on the covariance matrix, or "kernel", and its "hyperparameters".
The commonly held belief that GP is model independent may be misleading for largely two reasons. First, cosmological inferences of H 0 from GP [66][67][68][69][70][71] at the percent level can be discrepant with local H 0 determinations [72] (see also [73]). 1 These local determinations are largely independent of the background cosmological model and only rely on the assumption of homogeneity and isotropy, which is needed to identify the observed rate of expansion with H 0 . If true, since the two determinations of H 0 are independent of the cosmological model, it is an immediate corollary that Hubble tension has no cosmological resolution, at least within the Friedmann-Lemaître-Robertson-Walker (FLRW) framework.
Admittedly, this may be true, so there may be no contradiction. Nevertheless, more seriously, Table 1 shows the average errors for 300+ flat ΛCDM mock realisations with forecasted DESI data [74], where GP based on commonly used kernels in the Matérn covariance matrix class with positive parameter ν (e.g. see [75]) is compared against the ubiquitous Chevallier-Polarski-Linder (CPL) model [76,77] for dynamical dark energy (DDE). As we argue later, similar results should hold for all parametric DDE models. The obvious question is how does a putative "model independent" technique outperform a specific model on errors? Recall that typically, where there's smoke, there's fire.
In essence, the overt problem with non-parametric techniques, such as GP, is that the implications for parametric In this note, we focus on H 0 , which, as remarked in [78], is an integration constant in the Friedmann equations, so it is universal to all FLRW cosmologies. Concretely, we show that while the correlations in simpler models such as flat ΛCDM and wCDM are typically stronger than the GP output, in turn correlations from GP are generically stronger than DDE models. As a result, GP represents a restriction on the parameter space of DDE models. This explains the smaller errors in Table 1 and highlights the problem with the assumption that GP is model independent.
Taylor's Theorem: Let n ≥ 1 be an integer and the let the function f : R → R be n times differentiable at the point z 0 ∈ R. Then there exists a function f n : R → R such that and lim z→z 0 f n (z) = 0.

Cauchy-Hadamard Theorem:
Consider the formal power series in z ∈ C of the form f (z) = ∞ n=0 c n (z − z 0 ) n , where z 0 , c n ∈ C. Then the radius of convergence R of f at the point z 0 is given by Observe that Taylor's theorem simply guarantees that provided the Hubble parameter H (z) is differentiable, which is usually the case, the remainder function f n (z) exists and approaches zero as z approaches z 0 . While one could perform this expansion at any redshift, it is natural to consider expansions around z = 0, and this is the basis for cosmographic (Taylor) expansions [60,61]. Note, working in the vicinity of z = 0 is also sufficient for determining H 0 .
Taylor expansions can be regarded as truly model independent reconstructions of H (z) to a given order in the vicinity of z 0 , only if they can cover all models to that order and precision. In practice, this is an impractical definition and usually one simply demands that the Taylor expansion covers a suitably large class of models. This is essentially the regime that Riess et al. [64] operate in to make local determinations of H 0 (see also [65]). The farther one goes from z = z 0 , the fewer models that are accurately described by the Taylor expansion.
As observed in [61], recalling the definition of the Hubble parameter in terms of the scale factor, H ≡ȧ/a, and the usual expression for a in terms of redshift z, a = 1/(1 + z), it is clear that the scale factor becomes singular at z = −1. We can extend z into the complex plane, where in the Cauchy-Hadamard language (2), this singularity corresponds to at least one of the c n becoming large at z ≈ −1. This in turn ensures R → 0 in its vicinity. For this reason, as stated in [61], the radius of convergence of any FLRW cosmology is at most |z| = 1. It should be clear that Taylor's theorem does not apply to expansions in (1 + z) about z = 0, e. g. [92,93], only expansions about z = −1, and there the radius of convergence is strictly zero. Together, these theorems make expansions in (1 + z), or log 10 (1 + z) [94,95] completely random in the sense that adding higher order terms does not improve convergence (see [96][97][98]).

Gaussian processes
GP is a method to smooth a given (sparse) dataset. In essence, given n observational real data points y = (y 1 , . . . y n ) at redshifts z = (z 1 , . . . z n ) with a covariance matrix C, one wishes to reconstruct a function f * = ( f (z * 1 ), . . . f (z * N )) underlying the data at N new points z * = (z * 1 , . . . , z * N ), where typically N ≥ n. Obviously, attempting to reconstruct data far outside the range of the original data will lead to questionable results. 2 In implementing GP, one has to make an assumption on how the reconstructed data points are correlated, and to do so, one introduces a new covariance matrix K (z * , z * ), typically called a kernel. The kernel K is a function of some hyperparameters, which in cosmological applications are usually taken to be two (σ f , f ). The most commonly used kernel is Gaussian, The other kernels that are commonplace in cosmological settings are the Matérn covariance functions, e.g. see [75], where Γ is the gamma function andK ν is a modified Bessel function. We refer the reader to [75] for a discussion on the optimal covariance matrices for cosmology. Here ν is a positive parameter and in the ν → ∞ limit one recovers the Gaussian kernel (3). It should be noted that the Matérn kernels are only mean square n-differentiable provided ν > n. This differentiability property is important when one is interested in the derivatives of H (z), but as we work here with OHD, this is less of a concern. In addition to the Gaussian, following [75], we will largely focus on ν = p + 1 2 , where p = 0, 1, 2, 3, 4.
The only problem now is to identify the hyperparameters and this is done through the following log normal likelihood: In a strict Bayesian sense, one should marginalise over the hyperparameters through an MCMC routine, e. g. [99]. However, this is computationally more expensive, so in practice it is common to simply optimise (6), by setting to zero the gradient of ln L,

Mean function
Going through the GP literature in cosmology, one can identify two schools of thought, or viewpoints, on how to deal with the mean function μ(z). Here, we follow Seikel et al. [14], where the zero mean function, μ(z) = 0, is imposed. This choice is closer to our interests, as it represents the methodology that has led to curiously small errors on H 0 [66][67][68][69][70][71]. When μ(z) = 0, irrespective of whether one optimises (6) or marginalises over the hyperparameters, there is very little difference to the results [68,75]. As is clear from (6) or (7), since y 2 σ 2 y ∼ C, the hyperparameter σ f is a large number. As a result, K C, which is clear from the values in Table 2. With this difference in scales, one can approximate and the mean and covariance matrix become to leading order: where D(z * , z) := K (z * , z)K (z, z) −1 is the "dressing matrix" which essentially dresses the original data y and covariance matrix C. It is a matrix of O(1) numbers. It should be clear from the leading order expressions that GP implemented with zero mean μ(z) is a mapping from data y into the mean f * , and a mapping from the covariance matrix C into the reconstructed covariance matrix cov( f * ).
The other school, primarily Shafieloo et al. [13], maintains that the prior on the mean is important. As is clear from (6) or (7), a reasonably competitive guess for the mean should lead to a small y − μ(z), which makes σ f a small number, (1). For this reason, one just recovers the input mean if one optimises the likelihood (6) and one must marginalise over the hyperparameters. This marks a key distinction between the two approaches. The other important difference is that one expects marginalisation to lead to a distribution of σ f peaked in the vicinity of σ f ≈ 0. Therefore, one is working in the opposite regime to (9) where now C K . In this case, the mean and the covariance to leading Fig. 1 The real CC and BAO data serving as the basis for mock realisations and sub-leading order are, It should be clear that these expressions are trivial at leading order: one gets out what one puts in. So, the sub-leading terms have to matter and this is where marginalisation helps. Nevertheless, the better the guess on the mean, which is typically inferred from specific models, the smaller the sub-leading terms, and hence the less relevant the GP becomes. Clearly, choosing a prior is a balancing act that represents additional modeling in this "model independent" approach and for this reason, we adopt the μ(z) = 0 viewpoint.

Data
We use OHD, which serves as the basis for mock realisations. More precisely, we make use of cosmic chronometer (CC) [100][101][102][103][104][105][106][107] and homogenised BAO [108][109][110][111][112][113][114][115][116][117][118][119] data. It should be stressed that the CC data largely comprises statistical errors only, and the systematic errors on H (z) are a work in progress [120]. A further caveat is that, as pointed out in [121], some of the cosmic chronometer data may be less reliable. That being said, this OHD will only serve as the basis for mock realisations of the flat ΛCDM cosmological model with the canonical parameters (H 0 , Ω m0 ) = (70, 0.3). Furthermore, we are not interested in the absolute value of H 0 , but the errors and only their relative values. We present the OHD in Fig. 1. From the real data, we extract the redshifts z i and the errors in the Hubble parameter σ H (z i ) . To perform the mocks, at each z i we choose a new H (z i ) value from a normal distribution about the flat ΛCDM value with standard deviation σ H (z i ) .

Analysis
In this section we focus on H 0 extracted from the GP reconstruction. This is arguably the simplest cosmological parameter that one can reconstruct from the data since it just involves an extrapolation beyond the last data point (z = 0.07 in our study) to z = 0. Before beginning, it is instructive to remove the BAO data and run the GP analysis for the CC data with a Gaussian kernel, just to validate our GP code. We find H 0 = 67.55 +4.75 −4.68 km/s/Mpc, which reproduces the result of Yu et al. [67], H 0 = 67.42 ± 4.75 km/s/Mpc, so there is no indication that our GP code is doing anything unusual. In particular, the errors are the same size. It should be stressed again that GP is simple linear algebra (5).
Here, we begin exploring the GP whereby the likelihood (6) is minimised. This represents a simplification, but it has been confirmed in [68,75] that this make little difference. In Table 2 we show how the inferred Hubble constant H 0 depends on the kernel for the full redshift range of the data 0 z 2.5. It should be stressed that we are using OHD, namely CC and BAO data, but since we average over a large number of mocks, we are reporting general trends. We find that errors on H 0 decrease with increasing ν, and our analysis shows that the smallest H 0 error is achieved for the Gaussian kernel. Our findings are in line with Table 1 of [66], where we have included the ν = 1/2 and ν = 3/2 entries just to fill out the picture. Note that results are kernel dependent. This trend is expected. The reason being that as ν increases one is increasing the differentiability of the kernel. This means that the class of reconstructed functions, here H (z), should be smoother, since their derivatives have to be continuous to a greater order. Ultimately, smoother functions enjoy stronger correlations and smaller errors.
Observe that the central H 0 values are all biased lower than the mock value H 0 = 70 km/s/Mpc. Independently, we have performed some fits with Taylor expansions and one observes the same phenomenon, which suggests that this biasing is down to the data. As is evident from Fig. 1, the error bars Table 3 Average values of H 0 for different models based on ∼ 300 mock realisations of the data in Fig. 1. This is to be compared with the GP results, in particular the ν = ∞ case, quoted in Table 2 increase at low redshifts, but the slope of H (z) is fixed by the BAO data, making it less likely that the visibly poorer quality CC data can affect the central value.
It is instructive to fit the same data to concrete models in order to compare the errors in H 0 . The result is reported in Table 3. Evidently, the errors on H 0 for both ΛCDM and wCDM are within the GP errors, but for CPL we find that the error on H 0 is larger. In order to ascertain if this is a fluke, we replace our OHD with the DESI H (z) determination forecasts in the extended redshift range 0.05 ≤ z ≤ 3.55 [74], where we assume the optimistic outcome that the five-year survey covers 14,000 deg 2 . Repeating the mocking exercise outlined in Sect. 3.2, we can see from Table 1 that even with the forecasted data, GP outperforms the CPL model by leading to smaller errors on H 0 . 3 We conclude that this is not an artifact of the dataset and that GP produces smaller errors on H 0 than CPL.

Correlations
Recall that the output from GP is a mean and a covariance matrix and that any covariance matrix C i j is simply a dressing of the correlation matrix D i j through the errors σ i , C i j = σ i σ j D i j (no summation). It is an easy task to take the output covariance matrix from GP and identify the underlying correlation matrix. Concretely, we have Next, one can fix z i = 0 and the first row of the correlation matrix gives us an indication of the correlations between H (z = 0) and H (z i ). These are shown in Fig. 2, where it is clear that the ν = 1 2 kernel has the weakest correlations with redshift, whereas the Gaussian kernel (ν = ∞) exhibits the strongest. It can also be observed that beyond ν = 3 2 , the differences in the correlations are not so pronounced. This explains not only the trend in the H 0 errors in Table 2, but also why the difference in the H 0 errors beyond ν = 3 2 is not so great. It is an undeniable fact that stronger correlations lead to smaller errors.
One can then extend this analysis to parametric models, so that a direct comparison can be made. In fit-  Table 2 as ν is increased ting parametric models, one typically ends up with an MCMC chain for the parameters, which within the twoparameter family of dynamical dark energy (DDE) models we study, amounts to a maximum of four parameters (H 0 , Ω m0 , w 0 , w a ). Concretely, we make use of a redshift model [122,123], the CPL model [76,77], as well as models due to Efstathiou [124], Jassal-Bagla-Padmanabhan (JBP) [125] and Barboza-Alcaniz (BA) [126]. See Table 4 for explicit expressions for the corresponding equations of state w(z) and dark energy densities X (z) := ρ de (z)/ρ de,0 = exp 3 z 0 1+w(z ) 1+z dz . As observed recently in [127], focusing on any particular DDE parametrisation risks biasing the search for DDE, so here we analyse a broad class of models (see also [128]). We will see that our conclusions are robust to the parametrisation.
From the MCMC chain, one can infer H (z i ) at the same redshifts as the GP. From there one can make a direct comparison by plotting H (z) and the confidence intervals. However, since the mean values can be displaced, it may be difficult to quantify the difference through plots. Nevertheless, one can boil any distinction down to numbers. Viewing H (z i ) as parameters in their own right, one can infer the corresponding covariance matrix and strip away the errors to leave the correlation matrix. In Fig. 3 we plot the correlations across the parametric models.
In line with expectations, the flat ΛCDM model exhibits the strongest correlations, next wCDM, while any of the DDE models, including the CPL model, are less strongly correlated. This is more or less the content of Table 3, though, over a large number of mocks; here we only use the real data in Fig. 1. Ultimately, since the data is the same, these correlations are simply an artifact of the number of parameters in the model. However, for DDE models, the correlations are also in line with observations made in [127], which is yet another consistency check. There it was noted that the errors on w a , Efstathiou [124] w 0 + w a ln(1 which propagate to all parameter errors, including H 0 , are larger for the JBP and CPL models, while the Efstathiou, BA and redshift models lead to smaller errors. Once again, this is evident from Fig. 3. As explained the figures are only for a single realisation of the data, while Tables 1 and 3 represent repeated mock realisations, so the conclusion that the errors on H 0 differ should be beyond doubt. So while Figs. 2 and 3 are based on real data including BAO, which assumes a fiducial flat ΛCDM cosmology, the fact that mocks return results consistent with smaller (larger) errors, or alternatively stronger (weaker) correlations, should convince readers that Figs. 2 and 3 are representative. Finally, the reader will note that we have mocked up on flat ΛCDM. However, we have checked that if one mocks up on one of the DDE models, for example the CPL model, and dials (w 0 , w a ) so that the CPL model fits the data better than flat ΛCDM, GP still leads to smaller errors than the best fitting DDE model. We expect our findings here are generic in nature.

Conclusions
"Model independence" has casually slipped into the cosmology lexicon. Here, our interest in the claims were piqued by notably small errors in H 0 that are even comparable in size to flat ΛCDM errors [66][67][68][69][70][71]. 4 We started with some observations on Taylor expansion and the regime where the corresponding cosmographic expansion may be regarded as model independent in a bona fide sense. As explained, these observations are rooted in 18th and 19th century math theorems, but this is routinely overlooked with dire consequences. Importantly, beyond the radius of convergence |z| = 1, higher order terms in any expansion no longer converge.
We also explained the difference between the two schools in the GP method [13,14]. Clearly, an assumption on the mean function, as advocated in [13], can lead to very different expressions at leading order. In particular, a competitive guess on the mean, not only represents extra modeling, but risks triviliasing the GP. Moreover, one can add a "nugget" or noise contribution to K + C [11,13], but this again is extra modeling. While these observations may not settle the debate, we believe they constitute progress.
We analysed the correlations relevant for the inferred H 0 values with μ(z) = 0 [14]. Consistent with observations on the errors, we found that GP leads to stronger correlations and thus smaller errors than well known parametric DDE models. If this is confirmed by the community, one must conclude that GP analysis is tantamount to fitting the wCDM model to the same data. While this may seem counter-intuitive, it is worth noting that GP has a myriad of applications, but in cosmology, H (z) is to first approximation a simple monotonically increasing function of redshift. Therefore, it is possible that the optimal kernels for cosmology have yet to be identified. It is also possible that a hybrid approach that involves inference from both data and a prior on the kernel is required, e. g. [130,131].
Finally, let us emphasise again that we have only analysed H 0 . Our motivation was that competing discrepant "model independent" H 0 determinations immediately lead to the conclusion that Hubble tension has no resolution in an FLRW cosmology. It is imperative to extend our results to other cosmological parameters, e. g. [47,[49][50][51][52][53][54][55][56][57][58][59]132], to ascertain the level of implicit or hidden model dependence in the GP approach. It is worth recalling again our comments on Taylor expansions, namely that a Taylor expansion beyond the strict vicinity of z ∼ 0 corresponds to a class of models. In essence, GP should be the same. The responsibility is on the GP community to properly define the class of models in a transparent manner.

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: All the data is in the public domain.] Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/. Funded by SCOAP 3 .