Detecting relic gravitational waves in the CMB: A statistical bias

Analyzing the imprint of relic gravitational waves (RGWs) on the cosmic microwave background (CMB) power spectra provides a way to determine the signal of RGWs. In this Letter, we discuss a statistical bias, which could exist in the data analysis and has the tendency to overlook the RGWs. We also explain why this bias exists, and how to avoid it.


I. INTRODUCTION
A stochastic background of relic gravitational waves was produced in the very early stage of the Universe due to the superadiabatic amplification of zero point quantum fluctuations of the gravitational field [1,2]. The relic gravitational waves have a wide range of spreading of the spectra, and their detection provides a direct way to study the physics in the early Universe.
Recently, there have been several experimental efforts to constrain the amplitude of relic gravitational waves in different frequencies. Among various direct observations, LIGO S5 has experimentally obtained so far the most stringent bound Ω gw (f ) ≤ 6.9 × 10 −6 around f ∼ 100Hz [3], which will be much improved by future observations, including the third-generation Einstein Telescope [4]. The timing studies on the millisecond pulsars by the PPTA and EPTA teams also reported upper limits Ω gw (f ) 10 −8 at f ∼ 1/yr [5,6]. In addition, there are two bounds on the integration Ω gw (f )d ln f 1.5 × 10 −5 , obtained by the big bang nucleosynthesis observation [7] and the cosmic microwave background radiation observation [8].
In this paper, we shall focus on the detection of relic gravitational waves by the cosmic microwave background (CMB) radiation observations. The RGWs leave well understood imprints on the anisotropies in temperature and polarization of CMB [9,10]. The theoretical analysis of these imprints along with the data (including T , C, E, B) from CMB experiments allows one to determine the RGW background by constraining the parameters: the tensor-to-scalar ratio r and the spectral index n t . The current observations of CMB by WMAP satellite place an interesting bound r ≤ 0.20 [11] by assuming n t = −r/8, which has been generalized in [12]. These bounds are equivalent to the constraints on the energy density Ω gw (f ) of relic gravitational waves at the lowest frequency range f ∼ 10 −17 Hz.
Detecting the relic gravitational waves remains one of the most important tasks for the upcoming CMB observations (see [13] for reviews). Due to the various large contaminations, in the near future, we can only expect to detect a signal of RGWs in a relative low signal-to- * Electronic address: wzhao7@mail.ustc.edu.cn noise ratio (S/N ). This result would guide the far future detections.
As for the whole data analysis, we expect that, the maximum value of the parameters in the posterior possibility density function (pdf) is unbiased for the 'true' values of the parameters, which is auto-satisfied when the S/N is high. However when S/N is low, the maximum values of the parameters sometimes lead to a biased guide for the 'true' values, which can be generated either by some systematics or by the statistics, and should be avoided in any data analysis.
In this Letter, we will point out that, a statistical bias could exist in the CMB data analysis for the detection of RGWs. We also explain why the bias does exist, and suggest the way to avoid it.

II. THE STATISTICAL BIAS
The primordial power spectrum of relic gravitational waves can be simply described by the following power-law formula: where k 0 is the pivot wavenumber, which can be arbitrarily chosen. A t (k 0 ) is the amplitude of RGWs, and n t is the spectral index. The value of n t is quite close to zero, predicted by the physical models of the early Universe. As usual, we can define the tensor-to-scalar ratio In order to discuss the statistical bias for the detection of RGWs in the data analysis, let us simulate the observable data for the Planck satellite, where we only consider the Planck instrumental noises at the 143GHz frequency channels [14]. We adopt the 'input' cosmological models as Ω b h 2 = 0.02267, Ω c h 2 = 0.1131, Ω Λ = 0.726, τ reion = 0.084, h = 0.705, A s = 2.445 × 10 −9 and n s = 1. The RGWs parameters are adopted as r =r = 0.05, n t =n t = 0. As we have discussed in the previous paper [15], this small r is expected to be detected at 2σ for the assumed noise level.
Based on this input cosmological model, and the assumed noise level, we simulate 500 data samples. For In the data analysis, we assume all the parameters, except for r and n t , are all fixed as their input values. For the parameter n t , one always presumes the relation n t = n s − 1 or n t = −r/8 in the data analysis [16] [17]. However, this assumption does depend on the special cosmological models. If they are not the truth, but presumed, the finial conclusion of the data analysis would deviate from the real physics.
In order to avoid this danger, the natural way is setting r and n t as free parameters. We choose the flat priors of them in the range r ∈ [0, 1] and n t ∈ [−3, 3]. We adopt the best-pivot wavenumber, which is k 0 = 0.0006Mpc −1 for the input model and the assumed noise level [18].
The most interesting final result is the maximum value in the 1-dimensional posterior pdf for the parameters r and n t . In this paper, we denote them by r ML and n tML . Of course, their values do depend on the simulated data. For different data samples, they have different values. We expect the distribution of these 500 r ML and n tML are around their input values. However, it may be not the truth in the real analysis. In Fig.1, we plot the distribution of r ML and n tML with blue shadows. This figure shows that, the distribution of n tML is peaked at zero, the input value. However, the distribution of r ML obviously approaches to r = 0, and biased the input value at r = 0.05. This suggests that, if we deal with the data analysis in this way, the resulting conclusion has the tendency to deviate from the 'true' value of r, and to overlook the RGWs.

III. UNDERSTANDING THE STATISTICAL BIAS
It is important to understand why this statistical bias does exist. In order to realize it, let us proceed the following analytical approximation for the likelihood analysis.
The primordial power spectrum of RGW in (1) can be rewritten as, which can be approximated as In this approximation, we have used |n t | ≪ 1. The total CMB power spectra C Y ℓ (Y = T, C, E, B) include the contributions of density perturbations and gravitational waves, i.e.
where C Y ℓ,s and C Y ℓ,t are the contributions of density perturbations and gravitational waves, separately. Note that C B ℓ,s = 0. By considering |n t | ≪ 1, the spectra C Y ℓ,t , as a function of r and n t , can be approximated as [18] Here C Y ℓ,t ≡ C Y ℓ,t (r = 1, n t = 0), and best-pivot multipole ℓ 0 = k 0 ×10 4 Mpc [18]. So P t (k) and C Y ℓ,t are all the linear combinations of the parameters r and rn t . Now, let us turn to the likelihood function. The exact form can be found in the previous works [19] [15,16,18]. In the analytical approximation, it can be well approximated by [18] The likelihood function (6) can be rewritten as [18] where we have defined the quantities (8) which are all independent of the variables r and n t . Obviously, the value of d Y ℓ depends on the data. For a larger number of different sample, the average value of d Y ℓ is d Y ℓ = a Y ℓ (r +rn t b ℓ ), due to the facts of D Y ℓ = C Y ℓ,s + C Y ℓ,t (r =r, n t =n t ) and C Y ℓ,t (r =r, n t = n t ) ≃ C Y ℓ (r +rn t b ℓ ).
Since we have adopted the best-pivot multipole ℓ 0 , which is defined by requiring [18] ℓ Y (a Y ℓ ) 2 b ℓ = 0, the likelihood (7) can be rewritten as [18] where C is a constant, and the other quantities are defined by The posterior pdf relates to the likelihood by the prior. Here, let us adopt the flat prior for the parameters r and n t , the 2-dimensional posterior pdf for the variables is which follows the 1-dimensional posterior pdf for r as follows, We notice that, when r p ≫ r s , corresponding to S/N ≫ 1 (see [18] for details), this pdf can be reduced that This is gaussian function for r, and peaks at r = r p with spread r s . From the expression of r p , we know that, the value of r p depends on the data D Y ℓ by the quantity d Y ℓ . However, the average value of r p for a larger number of sample isr p =r, i.e. r p is an unbiased estimator forr. This has been mentioned in the previous paper [18].
But here, we want to emphasize that, when r p is not much larger than r s , the peak of the posterior pdf in (13) is smaller than r p , due to the term 1/r. Especially when r p < 3r s , the peak of the pdf is very close to zero, which is never an unbiased estimator for the input valuer. This explains what we have found in the left panel of Fig.1.

IV. AVOIDING THE STATISTICAL BIAS
Now, let us consider the possible way to avoid this bias in the data analysis. Let us return to the likelihood function in (9). We find that, if considering r and z ≡ rn t as two independent parameters, this likelihood is a simple gaussian function for the uncorrected parameters r and z. Now, we adopt the flat prior for the variables r and z, and the posterior pdf for r and z becomes from which follows that the 1-dimensional posterior pdf for r is This pdf peaks at r = r p , which is an unbiased estimator for the input valuer. Similarly, we can also find that, the 1-dimensional posterior pdf for z peaks at z = z p , which is also an unbiased estimator forẑ ≡rn t . So the statistical bias in data analysis is elegantly avoided. In order to clearly show this result, we have analyzed the same 500 samples, by adopting the flat prior on r and z. In Fig.1, we plot the distribution of the r ML and z ML with the solid columns. As expected, we find that, these r ML and z ML are all distributed around at their input valuesr = 0.05 andẑ = 0, and the bias for the tensor-to-scalar ratio is naturally avoided. In this figure, we also plot the distribution of n tML , which also unbiased distributed around its input valuen t = 0.
It is interesting to compare the difference between the prior f (r, z) and the general prior f (r, n t ). They can be related by the Jacobi, i.e. f (r, n t ) = ∂(r, z) ∂(r, n t ) f (r, z) = rf (r, z).
This relation shows that, the flat prior f (r, z) = 1 exactly corresponds to f (r, n t ) = r. So, comparing with the analysis with flat prior f (r, n t ) = 1, the new flat prior f (r, z) induces a larger value of the variable r.

V. CONCLUSION
In this Letter, we find a statistical bias in the CMB data analysis for the detection of RGWs, when the signalto-noise ratio is not very high. This could overlook the signal of RGWs in the CMB data analysis. We explain why this bias does exist by the analytical approximation of the likelihood function, and also find this bias can be elegantly avoided by adopting the orthogonalized parameters r and z ≡ rn t , instead of the general parameters r and n t .
In the end, we should emphasize that a similar statistical bias might exist for any data analysis [20], which should be carefully treated.