1 Introduction

Recently, the BICEP2 (Background Imaging of Cosmic Extragalactic Polarization) Collaboration reported the detection of the B-mode polarization of the cosmic microwave background (CMB), which implies that primordial gravitational waves (PGWs) are likely to have been detected [1]. If confirmed by upcoming experiments, the BICEP2 result will have a great impact on fundamental physics. The tensor-to-scalar ratio derived from the observed B-mode power spectrum is unexpectedly large, \(r=0.20^{+0.07}_{-0.05}\), with \(r=0\) disfavored at the 7.0\(\sigma \) level [1]. This result is in tension with the upper limit \(r<0.11\) (95 % CL) deduced from the measurements of the temperature power spectrum by the Planck Collaboration (Planck+WP+highL, where WP refers to the WMAP 9-year polarization data and highL refers to the temperature data from ACT and SPT) [2]. One simple way of relieving this tension is to allow for a negative running of the scalar spectral index of order \(10^{-2}\), which challenges the design of inflation models, since the usual slow-roll inflation models predict a negligible running (of order \(10^{-4}\)).

To reduce the tension, more possibilities should be explored. One interesting suggestion is to consider additional sterile neutrino species in the universe [3, 4]. Since the tensor-to-scalar ratio \(r\) is found to be around 0.2, the standard cosmology should at least be extended to the \(\Lambda \)CDM+\(r\) model (this is now the base model, with seven parameters). The model with a sterile neutrino is thus naturally called the \(\Lambda \)CDM+\(r\)+\(\nu _s\) model, in which two additional parameters, \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\), are included. It has been shown that in the \(\Lambda \)CDM+\(r\)+\(\nu _s\) model the tension between Planck and BICEP2 can be greatly relieved at the expense of an increase of \(n_s\) [3, 4]. Moreover, including a sterile neutrino species relieves not only the tension between Planck and BICEP2 but also the tensions between Planck and other astrophysical observations, such as the \(H_0\) direct measurement, the cluster counts, and the galaxy shear measurement, all of which can be significantly reduced. Thus, the model with a sterile neutrino seems to be an economical choice for cosmology today. Furthermore, by combining the Planck+WP data with the baryon acoustic oscillations (BAO), \(H_0\), Sunyaev–Zeldovich (SZ) cluster counts, CMB lensing, galaxy shear, and BICEP2 data, it is found that in the \(\Lambda \)CDM+\(r\)+\(\nu _s\) model the existing cosmological data prefer \(\Delta N_\mathrm{eff}>0\) at the 2.7\(\sigma \) level and a nonzero mass of the sterile neutrino at the 3.9\(\sigma \) level [3]. (See also Ref. [4] for a similar analysis.)

Other proposals to address the large B-modes include, e.g., foregrounds or some unaccounted-for temperature-polarization leakage [12], non-standard inflation models or more general early-universe scenarios [13–21], large-field excursions [22, 23], primordial magnetic fields [24], topological defects [25, 26], spatial variation of \(r\) [27], and so on. The forthcoming data from, e.g., Planck and the Keck Array are expected to improve the foreground model and provide tighter constraints on the B-modes, which should help resolve the current tension.

In this paper, we will consider neutrinos and extra relativistic components within the base \(\Lambda \)CDM+\(r\) model. We will use the current data to constrain the models with neutrinos. The models we consider in this paper include: (i) the active neutrinos with additional parameter \(\sum m_{\nu }\), (ii) the extra relativistic components with additional parameter \(N_\mathrm{eff}\), (iii) the active neutrinos along with the extra relativistic components with additional parameters \(\sum m_{\nu }\) and \(N_\mathrm{eff}\), and (iv) the massive sterile neutrino with additional parameters \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\). The observational data we use in this paper are from Planck+WP+BAO, \(H_0\) direct measurement, Planck SZ cluster counts, Planck CMB lensing, cosmic shear measurement, and BICEP2. This work will provide a detailed cosmological analysis on the models with neutrinos under the consideration of the BICEP2 data.

The paper is organized as follows. In Sect. 2, we briefly describe the cosmological models with neutrinos and the observational data. In Sect. 3, we present the fit results and discuss them in detail. The conclusion is given in Sect. 4.

2 Models, parameters, and data

2.1 Cosmological models involving neutrinos

The cosmology with neutrinos has been described in detail and reviewed by the WMAP Collaboration [28–30] and the Planck Collaboration [2]. In this paper, our conventions are consistent with those adopted by the Planck Collaboration [2], i.e., those used in the camb Boltzmann code. We therefore do not describe the equations in detail but only specify the models with different parameters; for details of the cosmology with neutrinos we refer the reader to Refs. [2, 28–30].

Now that large PGWs are likely to have been detected, the base cosmology should be extended to the 7-parameter \(\Lambda \)CDM+\(r\) model. The base parameters for this model are

$$\begin{aligned} \{\omega _b,~\omega _c,~100\theta _\mathrm{MC},~\tau ,~n_s,~\ln (10^{10}A_s),~r_{0.05}\}, \end{aligned}$$

where \(\omega _b\equiv \Omega _b h^2\) and \(\omega _c\equiv \Omega _c h^2\) are the present-day baryon and cold dark matter densities, respectively, \(\theta _\mathrm{MC}\) is the approximation (used in CosmoMC) to the angular size of the sound horizon at the time of last-scattering \(r_s(z_*)/D_A(z_*)\), \(\tau \) is the Thomson scattering optical depth due to reionization, \(n_s\) and \(A_s\) are the spectral index and amplitude of the primordial curvature perturbations, respectively, and \(r_{0.05}\) is the tensor-to-scalar ratio at \(k_0=0.05\) Mpc\(^{-1}\). Other parameters, such as \(\Omega _\Lambda \), \(\Omega _m\), \(\sigma _8\), \(H_0\), \(r_{0.002}\), and so on, are the derived parameters.

In this base cosmology, there are three active neutrino species. Due to non-instantaneous decoupling corrections and other subtle corrections, the effective number of relativistic species in the standard cosmology is \(N_\mathrm{eff}=3.046\). A minimal-mass normal hierarchy for the neutrino mass is assumed in the base cosmology, i.e., only one massive eigenstate with \(m_\nu =0.06\) eV (\(\Omega _\nu h^2\approx \sum m_\nu /93.04~\mathrm{eV}\approx 0.0006\)).
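The relation \(\Omega _\nu h^2\approx \sum m_\nu /93.04~\mathrm{eV}\) quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function name and the second, heavier mass value are illustrative assumptions, not part of the analysis pipeline used in this paper.

```python
# Conversion quoted in the text: Omega_nu h^2 ~ sum(m_nu) / 93.04 eV.
EV_PER_UNIT_OMEGA_NU_H2 = 93.04


def omega_nu_h2(sum_m_nu_ev: float) -> float:
    """Approximate neutrino density parameter for a total mass given in eV."""
    return sum_m_nu_ev / EV_PER_UNIT_OMEGA_NU_H2


if __name__ == "__main__":
    # 0.06 eV is the base-cosmology value; 0.30 eV is a hypothetical heavier case.
    for label, mass in [("base cosmology", 0.06), ("hypothetical heavier case", 0.30)]:
        print(f"{label}: sum m_nu = {mass:.2f} eV -> Omega_nu h^2 ~ {omega_nu_h2(mass):.5f}")
```

For the base value of 0.06 eV this gives roughly 0.0006, in line with the number stated above.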

In this paper, we consider the extensions to this base cosmology. Neutrinos and extra relativistic components bring additional base parameters to the model.

  • Consider the total mass of active neutrinos. In this case, a degenerate model is assumed in which the three active neutrino species are degenerate in mass and the total mass \(\sum m_\nu \) is a free parameter. Thus, in this extension, one additional base parameter, \(\sum m_\nu \), is introduced.

  • Consider the extra neutrino-like radiation. In this case, the extra relativistic degrees of freedom are effectively massless. The total mass of active neutrinos \(\sum m_\nu \) is kept fixed at 0.06 eV, but the parameter \(N_\mathrm{eff}\) is free. Thus, in this extension, one additional base parameter, \(N_\mathrm{eff}\), is introduced.

  • Simultaneously consider the active neutrino mass and extra radiation. In this case, the parameters \(N_\mathrm{eff}\) and \(\sum m_\nu \) are both free. So, two additional parameters, \(N_\mathrm{eff}\) and \(\sum m_\nu \), are introduced.

  • Consider the massive sterile neutrino. In this case, the total mass of active neutrinos \(\sum m_\nu \) is kept fixed at 0.06 eV, but we add one massive sterile neutrino in the model. Thus, two additional parameters, \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\), are introduced.

We use flat priors for the base parameters. When the base parameters are varied, the prior ranges are chosen to be much wider than the posterior so that the results of parameter estimation are not affected. The priors are set following the Planck Collaboration [2]. In addition to these priors, a “hard” prior on the Hubble constant \(H_0\) of [20, 100] km s\(^{-1}\) Mpc\(^{-1}\) is imposed.

2.2 Observational data

We consider the following data sets:

  • Planck+WP: the CMB TT angular power spectrum data from Planck [2], in combination with the large-scale EE and TE polarization power spectrum data from 9-year WMAP [30].

  • BAO: the latest measurement of the cosmic distance scale from the Data Release 11 (DR11) galaxy sample of the Baryon Oscillation Spectroscopic Survey (BOSS) [which is part of the Sloan Digital Sky Survey III (SDSS-III)]: \(D_V(0.32)(r_{d,\mathrm{fid}}/r_d)=(1,\!264\pm 25)\) Mpc and \(D_V(0.57)(r_{d,\mathrm{fid}}/r_d)=(2,\!056\pm 20)\) Mpc, with \(r_{d,\mathrm{fid}}=149.28\) Mpc [31].

  • \(H_0\): the direct measurement of the Hubble constant using the cosmic distance ladder in the Hubble Space Telescope observations of Cepheid variables and type Ia supernovae, \(H_0=(73.8\pm 2.4)~\mathrm{km}~\mathrm{s}^{-1}~\mathrm{Mpc}^{-1}\) [34].

  • SZ: the counts of rich clusters of galaxies from the sample of Planck thermal Sunyaev–Zeldovich (SZ) clusters constrain the combination of \(\sigma _8\) and \(\Omega _m\), \(\sigma _8(\Omega _m/0.27)^{0.3}=0.78\pm 0.01\) [7].

  • Lensing: the CMB lensing power spectrum \(C_\ell ^{\phi \phi }\) from Planck [38], and also the combination of \(\sigma _8\) and \(\Omega _m\) given by the cosmic shear data of the weak lensing from the CFHTLenS survey, \(\sigma _8(\Omega _m/0.27)^{0.46}=0.774\pm 0.040\) [39].

  • BICEP2: the CMB angular power spectra (TT, TE, EE, and BB) data from BICEP2 [1].
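To illustrate what the BAO entry in the list above measures, the following sketch evaluates the volume-averaged distance \(D_V(z)=[D_M^2(z)\,cz/H(z)]^{1/3}\) in a flat \(\Lambda \)CDM background. It is a simplified stand-in, not the likelihood code used in our analysis: radiation and neutrino masses are neglected, the ratio \(r_{d,\mathrm{fid}}/r_d\) is taken to be close to unity, and the inputs \(H_0=67.3\) km s\(^{-1}\) Mpc\(^{-1}\) and \(\Omega _m=0.315\) are assumed illustrative values.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble(z: float, h0: float, omega_m: float) -> float:
    """H(z) in km/s/Mpc for flat LambdaCDM (radiation and neutrino mass neglected)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance(z: float, h0: float, omega_m: float, steps: int = 1000) -> float:
    """Line-of-sight comoving distance in Mpc via composite Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight / hubble(i * dz, h0, omega_m)
    return C_KM_S * total * dz / 3.0


def d_v(z: float, h0: float, omega_m: float) -> float:
    """Volume-averaged BAO distance D_V(z) in Mpc for a flat universe."""
    d_m = comoving_distance(z, h0, omega_m)
    return (d_m ** 2 * C_KM_S * z / hubble(z, h0, omega_m)) ** (1.0 / 3.0)


if __name__ == "__main__":
    for z in (0.32, 0.57):
        print(f"D_V({z}) ~ {d_v(z, 67.3, 0.315):.0f} Mpc")
```

With these assumed inputs the sketch returns distances within the quoted BOSS DR11 error bars, consistent with the statement below that Planck and BAO are in good agreement.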

In the case of the 6-parameter base \(\Lambda \)CDM model, the Planck data are in tension with several astrophysical observations, as discussed by the Planck Collaboration [2]. The Planck data are in good agreement with the BAO data, which are based on a simple geometrical measurement, so Planck+WP can always be safely combined with BAO. But the Planck data are in tension with the \(H_0\), SZ, and Lensing data. For the 6-parameter base \(\Lambda \)CDM model, the Planck+WP+highL data combination gives the fit results: \(H_0=(67.3\pm 1.2)~\mathrm{km}~\mathrm{s}^{-1}~\mathrm{Mpc}^{-1}\), \(\sigma _8(\Omega _m/0.27)^{0.3}=0.87\pm 0.02\), and \(\sigma _8(\Omega _m/0.27)^{0.46}=0.89\pm 0.03\) [2], which are in tension with the \(H_0\) direct measurement [34], the cluster counts [7], and the cosmic shear measurement [39] at the 2–3\(\sigma \) level. In addition, Planck is also in mild tension with the SNLS type Ia supernova compilation (at about the 2\(\sigma \) level).
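A rough way to quantify such tensions is to divide the difference between two measurements by their errors added in quadrature. The sketch below applies this to the \(H_0\) and cosmic shear numbers quoted above; it assumes independent Gaussian errors, which is a simplification, and the function name is purely illustrative.

```python
import math


def tension_sigma(value_a: float, err_a: float, value_b: float, err_b: float) -> float:
    """Difference of two measurements in units of their combined 1-sigma error.

    Assumes independent, Gaussian errors (a simplification).
    """
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


if __name__ == "__main__":
    # Planck+WP+highL fit vs. direct H0 measurement (km/s/Mpc), values from the text.
    print(f"H0 tension:    {tension_sigma(67.3, 1.2, 73.8, 2.4):.1f} sigma")
    # Planck+WP+highL fit vs. CFHTLenS cosmic shear, sigma_8 (Omega_m/0.27)^0.46.
    print(f"shear tension: {tension_sigma(0.89, 0.03, 0.774, 0.040):.1f} sigma")
```

Both comparisons come out between 2\(\sigma \) and 3\(\sigma \) under these assumptions.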

Given the complexity of these astrophysical data, the tensions may arise because some sources of systematic error in these astrophysical measurements are not completely understood. An alternative explanation is that the base \(\Lambda \)CDM model is incorrect or should be extended.

The possibility that the tensions between Planck and these astrophysical data imply new physics has been explored. For example, the tension between Planck and the \(H_0\) direct measurement might hint that dark energy is not the cosmological constant but some dynamical field (or fluid). It is shown in Ref. [44] that in a dynamical dark energy model, such as the constant \(w\) model or the holographic dark energy model, the tension between Planck and \(H_0\) is greatly reduced. The mild tension between the Planck data and the SNLS type Ia supernova data, however, may come from systematic errors, which could be largely removed by considering new effects in supernovae, such as the evolution of the color–luminosity parameter \(\beta \), as analyzed in Refs. [45, 46].

A sterile neutrino can also play a very significant role in relieving the tensions between Planck and the astrophysical observations. Including a sterile neutrino increases the early-time Hubble expansion rate and the free-streaming damping, which changes the acoustic scale and the growth of cosmic structure; thus the tensions between Planck and \(H_0\), cluster counts, and cosmic shear can simultaneously be greatly reduced when a massive sterile neutrino is considered [8–10]. Furthermore, very recently, it was shown that the tension between Planck and BICEP2 can also be significantly relieved when a sterile neutrino is included in the model [3, 4]. Therefore, in the \(\Lambda \)CDM+\(r\)+\(\nu _s\) model, almost all the tensions between Planck and other astrophysical observations can be simultaneously alleviated.
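The effect of extra relativistic species on the early-time expansion rate can be estimated from the standard expression \(\rho _\mathrm{rad}=\rho _\gamma \left[ 1+\frac{7}{8}\left( \frac{4}{11}\right) ^{4/3}N_\mathrm{eff}\right] \) together with \(H\propto \sqrt{\rho _\mathrm{rad}}\) deep in the radiation era. The following sketch is only an order-of-magnitude illustration under those assumptions (massless species, pure radiation domination); the function names are hypothetical and it is not a substitute for the full Boltzmann calculation.

```python
import math

# Energy density of one massless neutrino species relative to photons (~0.2271).
NEUTRINO_FACTOR = (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)


def radiation_density_ratio(n_eff: float) -> float:
    """Total radiation density in units of the photon density."""
    return 1.0 + NEUTRINO_FACTOR * n_eff


def early_hubble_boost(n_eff: float, n_eff_ref: float = 3.046) -> float:
    """Ratio H(n_eff) / H(n_eff_ref) in the radiation era, where H^2 scales with rho_rad."""
    return math.sqrt(radiation_density_ratio(n_eff) / radiation_density_ratio(n_eff_ref))


if __name__ == "__main__":
    for n_eff in (3.046, 3.5, 4.046):
        boost = early_hubble_boost(n_eff)
        print(f"N_eff = {n_eff:.3f}: early-time H larger by {100.0 * (boost - 1.0):.1f} %")
```

Under these assumptions, one additional fully thermalized species raises the radiation-era expansion rate by several percent.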

In this paper, we use the latest observational data to place constraints on the neutrino cosmological models. Since we use the same data sets for all models, we can make a direct comparison between them. We do not use the type Ia supernova data in this analysis because dynamical dark energy is not considered and also because the systematic errors in the supernova data cannot be well quantified [45, 46]. But we assume that the other astrophysical data sets, such as \(H_0\), SZ cluster counts, and cosmic shear, have accurately quantified estimates of systematic errors. Since there is no tension between Planck and BAO, we can always safely use the Planck+WP+BAO data combination. In order to measure the impact of the other astrophysical observations on the neutrino physics, we further combine the \(H_0\)+SZ+Lensing data in the analysis. Furthermore, to see the role the BICEP2 data play in constraining the neutrino cosmological models, we finally use a combination of all data, including the BICEP2 data. Thus, in our analysis, we use the data combinations: (i) Planck+WP+BAO, (ii) Planck+WP+BAO+\(H_0\)+SZ+Lensing, and (iii) Planck+WP+BAO+\(H_0\)+SZ+Lensing+BICEP2. In the next section, we report and discuss the fitting results of the neutrino cosmological models in the light of these data sets.

3 Results and discussions

For convenience, the four models considered in this paper are called: (i) \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \), (ii) \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\), (iii) \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\), and (iv) \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\), respectively. The one- and two-dimensional joint, marginalized posterior distributions of the parameters for the four models are shown in Figs. 1, 2, 3, 4. The gray, red, and blue contours (and curves) stand for the results of Planck+WP+BAO, Planck+WP+BAO+\(H_0\)+SZ+Lensing, and Planck+WP+BAO+\(H_0\)+SZ+Lensing+BICEP2 data combinations, respectively. Detailed fit values for the cosmological parameters are given in Tables 1, 2, 3, 4. In the tables, we quote the \(\pm 1\sigma \) errors, but for the parameters that cannot be well constrained, we quote the 95 % CL upper limits.

Fig. 1 Cosmological constraints on the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model

Fig. 2 Cosmological constraints on the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model

Fig. 3 Cosmological constraints on the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model

Fig. 4 Cosmological constraints on the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) model

Table 1 Fitting results for the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model. We quote \(\pm 1\sigma \) errors, but for the parameters that cannot be well constrained, we quote the 95 % CL upper limits
Table 2 Fitting results for the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model. We quote \(\pm 1\sigma \) errors, but for the parameters that cannot be well constrained, we quote the 95 % CL upper limits
Table 3 Fitting results for the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model. We quote \(\pm 1\sigma \) errors, but for the parameters that cannot be well constrained, we quote the 95 % CL upper limits
Table 4 Fitting results for the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) model. We quote \(\pm 1\sigma \) errors, but for the parameters that cannot be well constrained, we quote the 95 % CL upper limits

3.1 Constraints on the total mass of active neutrinos \(\sum m_{\nu }\)

Figure 1 and Table 1 summarize the fit results for the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model.

From Fig. 1, one can see that, compared with the Planck+WP+BAO data combination alone, the addition of the astrophysical data sets \(H_0\)+SZ+Lensing has a significant impact on the constraints on \(\sigma _8\) and \(\sum m_\nu \), whereas \(H_0\), \(n_s\), and \(r_{0.002}\) are not noticeably affected.

The combination of Planck+WP+BAO gives \(\sigma _8=0.811^{+0.031}_{-0.018}\), and when the data of \(H_0\)+SZ+Lensing are added, the fit result becomes \(\sigma _8=0.762\pm 0.012\).

The Planck+WP+BAO data alone cannot tightly constrain the neutrino mass; only an upper limit is obtained,

$$\begin{aligned} \sum m_\nu <0.28~\mathrm{eV}\quad (95\,\% \text{ CL }; \text{ Planck+WP+BAO }). \end{aligned}$$

However, when the \(H_0\)+SZ+Lensing data are included, the neutrino mass can be tightly constrained,

$$\begin{aligned}&\!\!\! \sum m_\nu =0.28\pm 0.07~\mathrm{eV}\\&\quad (68\,\% \text{ CL };~\text{ Planck+WP+BAO+ }H_0\text{+SZ+Lensing }). \end{aligned}$$

The posterior distribution is shown by the red curve in Fig. 1. Further including the BICEP2 data does not improve the constraint on the neutrino mass,

$$\begin{aligned}&\!\!\! \sum m_\nu =0.28^{+0.07}_{-0.08}~\mathrm{eV} \quad (68\,\% \\&\quad \text{ CL; } \text{ Planck+WP+BAO+ } H_0\text{+SZ+Lensing+BICEP2 }). \end{aligned}$$

The posterior distribution is shown in Fig. 1 by the blue curve which is nearly coincident with the red one. Thus, we find that in the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model the combined cosmological data prefer a nonzero total mass of active neutrinos at about the 4\(\sigma \) significance.

The BICEP2 data do not affect the other parameters either, except for the tensor-to-scalar ratio \(r\). In the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model, Fig. 1 and Table 1 show that the tension between Planck and BICEP2 cannot be effectively reduced. The Planck+WP+BAO data combination gives \(r_{0.002}<0.12\) (95 % CL), and further adding the \(H_0\)+SZ+Lensing data weakens the limit to \(r_{0.002}<0.15\) (95 % CL). Including the BICEP2 data changes the constraint on \(r\) to \(r_{0.002}=0.18^{+0.03}_{-0.04}\) (68 % CL).

3.2 Constraints on the effective number of relativistic species \(N_\mathrm{eff}\)

Figure 2 and Table 2 summarize the fit results for the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model.

The addition of the parameter \(N_\mathrm{eff}\) can slightly relieve the tension between Planck and \(H_0\). The Planck+WP+BAO data combination gives \(H_0=70.4^{+1.8}_{-1.9}\) km s\(^{-1}\) Mpc\(^{-1}\). In the same case we also find a high amplitude for the present-day matter fluctuations, \(\sigma _8=0.849\pm 0.020\). When the \(H_0\)+SZ+Lensing data are added, the value of \(H_0\) is not affected significantly, \(H_0=69.1\pm 1.4\) km s\(^{-1}\) Mpc\(^{-1}\), but the value of \(\sigma _8\) becomes much smaller, \(\sigma _8=0.792\pm 0.009\) (with the error also shrinking significantly).

In the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model, the constraint results for the parameter \(N_\mathrm{eff}\) are

$$\begin{aligned}&\!\!\!N_\mathrm{eff}=3.52^{+0.31}_{-0.32}\quad (68\,\% \text{ CL; } \text{ Planck+WP+BAO) },\\&\!\!\!N_\mathrm{eff}=2.97^{+0.20}_{-0.22}\quad \\&(68\,\% \text{ CL; } \text{ Planck+WP+BAO+ }H_0\hbox {+SZ+Lensing)},\\&N_\mathrm{eff}=3.07\pm 0.20\quad \\&(68\,\% \text{ CL; } \text{ Planck+WP+BAO+ }H_0\hbox {+SZ+Lensing+BICEP2)}, \end{aligned}$$

which are all consistent with the standard value of 3.046.

We also find that in the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model the upper limit for the tensor-to-scalar ratio becomes a little bit higher, \(r_{0.002}<0.15\), from the Planck+WP+BAO data, and this limit does not change when the \(H_0\)+SZ+Lensing data are added. So, this model cannot effectively alleviate the tension between Planck and BICEP2. When the BICEP2 data are included, the constraint on \(r\) becomes \(r_{0.002}=0.18\pm 0.04\).

The Planck+WP+BICEP2 constraints on the \(\Lambda \)CDM+\(r\)+ \(\sum m_\nu \) and \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) models were also discussed recently in Ref. [47].

3.3 Simultaneous constraints on \(N_\mathrm{eff}\) and \(\sum m_{\nu }\)

Figure 3 and Table 3 summarize the fit results for the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model.

In this model, the tension between Planck and \(H_0\) direct measurement can be significantly reduced. The Planck+WP+BAO data combination gives \(H_0=70.8^{+1.8}_{-2.1}\) km s\(^{-1}\) Mpc\(^{-1}\), which is improved to \(H_0=71.9\pm 1.6\) km s\(^{-1}\) Mpc\(^{-1}\) when the \(H_0\)+SZ+Lensing data are included. The Planck+WP+BAO data combination favors a high \(\sigma _8\) value, \(\sigma _8=0.821^{+0.041}_{-0.029}\), and the inclusion of the \(H_0\)+SZ+Lensing data improves the constraint to \(\sigma _8=0.759\pm 0.011\). Further adding the BICEP2 data does not change these constraints evidently.

In the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model, the constraint results for the parameters \(N_\mathrm{eff}\) and \(\sum m_{\nu }\) are

$$\begin{aligned}&\!\!\!\left. \begin{array}{l} N_\mathrm{eff} = 3.69^{+0.33}_{-0.40}~~(68\,\%~\mathrm{CL}) \\ \sum m_\nu < 0.50~ \mathrm{eV}~~(95\,\%~\mathrm{CL}) \end{array} \right\} \quad \text{(Planck+WP+BAO) },\\&\!\!\!\left. \begin{array}{l} N_\mathrm{eff} = 4.04^{+0.35}_{-0.34} \\ \sum m_\nu =0.58^{+0.14}_{-0.15}~\mathrm{eV} \end{array} \right\} \quad \\&\quad (68\% \text{ CL; } \text{ Planck+WP+BAO+ }H_0\hbox {+SZ+Lensing}),\\&\left. \begin{array}{l} N_\mathrm{eff} = 4.20\pm 0.32 \\ \sum m_\nu =0.63^{+0.13}_{-0.16}~\mathrm{eV} \end{array} \right\} \quad \\&\quad (68\% \text{ CL; } \text{ Planck+WP+BAO+ }H_0\hbox {+SZ+Lensing+BICEP2}). \end{aligned}$$

We find that with the basic data combination Planck+WP+BAO, only an upper limit on the total mass of active neutrinos can be given, but a weak preference for \(N_\mathrm{eff}>3.046\) at about the 1.6\(\sigma \) level is shown. Adding the \(H_0\)+SZ+Lensing data tightly constrains both \(\sum m_\nu \) and \(N_\mathrm{eff}\), giving evidence for a nonzero mass of active neutrinos and for \(\Delta N_\mathrm{eff}\equiv N_\mathrm{eff}-3.046>0\) at the 3.9\(\sigma \) and 2.9\(\sigma \) levels, respectively. Further adding the BICEP2 data improves the results to some extent, favoring \(\sum m_\nu >0\) and \(\Delta N_\mathrm{eff}>0\) at the 4.0\(\sigma \) and 3.6\(\sigma \) levels, respectively.
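The significance levels quoted above can be approximately reproduced from the marginalized constraints by dividing the offset from the reference value by the error bar on the side facing that value. This is a simple Gaussian-style estimate, assumed here only for illustration; it agrees with the quoted levels to within about 0.1\(\sigma \), with the small differences presumably due to rounding of the quoted numbers. The function name is hypothetical.

```python
def sigma_from_reference(best: float, err_plus: float, err_minus: float,
                         reference: float = 0.0) -> float:
    """Offset of `best` from `reference` in units of the error bar facing the reference.

    Uses the lower error if the best-fit value lies above the reference,
    and the upper error otherwise (a rough Gaussian-style estimate).
    """
    err = err_minus if best >= reference else err_plus
    return abs(best - reference) / err


if __name__ == "__main__":
    n_eff_standard = 3.046
    # (label, best fit, +error, -error, reference), values taken from the text.
    constraints = [
        ("N_eff, Planck+WP+BAO", 3.69, 0.33, 0.40, n_eff_standard),
        ("sum m_nu, +H0+SZ+Lensing", 0.58, 0.14, 0.15, 0.0),
        ("N_eff, +H0+SZ+Lensing", 4.04, 0.35, 0.34, n_eff_standard),
        ("sum m_nu, +BICEP2", 0.63, 0.13, 0.16, 0.0),
        ("N_eff, +BICEP2", 4.20, 0.32, 0.32, n_eff_standard),
    ]
    for label, best, up, down, ref in constraints:
        print(f"{label}: {sigma_from_reference(best, up, down, ref):.1f} sigma")
```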

It is interesting to compare the current results with those derived from data before Planck and BICEP2. For example, using the WMAP7+BAO+\(H_0\)+X-ray cluster data combination, Burenin obtained \(\sum m_\nu =0.47\pm 0.16\) eV and \(N_\mathrm{eff}=3.89\pm 0.39\) [6], which indicates the detections of \(\sum m_\nu >0\) and \(\Delta N_\mathrm{eff}>0\) at the 2.9\(\sigma \) and 2.2\(\sigma \) levels, respectively.

It is also important to show that this model is very helpful in reconciling the tension between Planck and BICEP2. With only the Planck+WP+BAO data, we find that the upper limit on the tensor-to-scalar ratio \(r\) is weakened to \(r_{0.002}<0.19\) (95 % CL). Once the \(H_0\)+SZ+Lensing data are included, the limit on \(r\) is further weakened to \(r_{0.002}<0.27\) (95 % CL), which is well compatible with the BICEP2 result, \(r_{0.002}=0.20^{+0.07}_{-0.05}\) [1]. Combining with the BICEP2 data, the \(r\) constraint is tightened to \(r_{0.002}=0.22^{+0.04}_{-0.05}\). We also notice that due to the positive correlation between \(n_s\) and \(r\) (see the \(n_s-r_{0.002}\) contours in gray and red in Fig. 3), once the tensor-to-scalar ratio \(r\) is increased, the scalar spectral index \(n_s\) is also enlarged. According to the fitting results, the exact scale-invariant perturbation spectrum cannot be excluded but actually is favored in this model.

3.4 Constraints on massive sterile neutrino with \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\)

The \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) model has been discussed in Refs. [3, 4]. In Ref. [3], this model is also called the \(\Lambda \)CDM+\(r\)+\(\nu _s\) model, with \(\nu _s\) denoting the sterile neutrino with two extra parameters \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\). In this paper, we repeat the calculations of Ref. [3], but we provide more information about the fit results. Figure 4 and Table 4 summarize the fit results for the \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) model.

It has been discussed in Refs. [3, 4] (see also Refs. [8–10]) that the sterile neutrino can reconcile the tensions between Planck and other astrophysical observations such as the direct measurement of \(H_0\) [34], the Planck SZ cluster counts [7], and the cosmic shear measurement [39]. Here, we can see from Fig. 4 and Table 4 that the combination of Planck+WP+BAO gives \(H_0=70.8^{+1.7}_{-2.1}\) km s\(^{-1}\) Mpc\(^{-1}\), and further combining with the \(H_0\), SZ, and Lensing data tightens the result to \(H_0= 70.7^{+1.5}_{-1.8}\) km s\(^{-1}\) Mpc\(^{-1}\). The Planck+WP+BAO data combination favors a high \(\sigma _8\) value, \(\sigma _8=0.812^{+0.038}_{-0.029}\), and the inclusion of the \(H_0\), SZ, and Lensing data lowers the value to \(\sigma _8=0.758^{+0.011}_{-0.012}\). Further adding the BICEP2 data does not change these results noticeably.

We now show the constraint results for the parameters \(N_\mathrm{eff}\) and \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) in this model:

$$\begin{aligned}&\left. \begin{array}{l} N_\mathrm{eff} =3.72^{+0.32}_{-0.40}~~(68\,\%~\mathrm{CL}) \\ m_{\nu ,\mathrm{sterile}}^\mathrm{eff}< 0.51~ \mathrm{eV}~~(95\,\%~\mathrm{CL}) \end{array} \right\} \quad \text{(Planck+WP+BAO) },\\&\left. \begin{array}{l} N_\mathrm{eff} = 3.75^{+0.34}_{-0.37} \\ m_{\nu ,\mathrm{sterile}}^\mathrm{eff}=0.48^{+0.11}_{-0.13}~\mathrm{eV} \end{array} \right\} \quad \\&\quad (68\,\% \text{ CL; } \text{ Planck+WP+BAO+ }H_0\hbox {+SZ+Lensing}),\\&\left. \begin{array}{l} N_\mathrm{eff} = 3.95\pm 0.33 \\ m_{\nu ,\mathrm{sterile}}^\mathrm{eff}=0.51^{+0.12}_{-0.13}~\mathrm{eV} \end{array} \right\} \\&\quad (68\,\% \text{ CL; } \text{ Planck+WP+BAO+ }\\&\quad H_0\hbox {+SZ+Lensing+BICEP2}). \end{aligned}$$

We find that the mass of the sterile neutrino cannot be well constrained using only the basic data combination Planck+WP+BAO, but the addition of the \(H_0\), SZ, and Lensing data significantly improves the constraint on the mass, strongly favoring a nonzero mass of the sterile neutrino at the 3.6\(\sigma \) statistical significance. The posterior distributions of \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) for the two cases are shown as gray and red curves, respectively, in Fig. 4, and the difference between the two curves is striking. This shows that the SZ cluster data (as well as the \(H_0\) and Lensing data) play an important role in constraining the mass of the sterile neutrino, as discussed in Refs. [3, 4]. Further including the BICEP2 data raises the evidence for a nonzero mass of the sterile neutrino to the 3.9\(\sigma \) significance. For the \(N_\mathrm{eff}\) constraints, the basic combination Planck+WP+BAO shows a preference for \(\Delta N_\mathrm{eff}>0\) at the 1.7\(\sigma \) level, and the inclusion of the \(H_0\)+SZ+Lensing data slightly strengthens this preference to the 1.9\(\sigma \) level. The BICEP2 data play a significant role in improving the constraint on \(N_\mathrm{eff}\), which can be seen directly from the posterior distribution curves in Fig. 4: further adding the BICEP2 data favors \(\Delta N_\mathrm{eff}>0\) at the 2.7\(\sigma \) level.

The sterile neutrino can help reconcile the tension between Planck and BICEP2, as analyzed in Refs. [3, 4]. Using only Planck+WP+BAO leads to \(r_{0.002}<0.20\) (95 % CL), and including \(H_0\)+SZ+Lensing gives \(r_{0.002}<0.23\) (95 % CL), consistent with the BICEP2 result. Further adding the BICEP2 data, we obtain the tightly constrained result \(r_{0.002}=0.21^{+0.04}_{-0.05}\). As pointed out in Refs. [3, 4], the increase of \(r\) comes at the expense of an increase of \(n_s\), due to the positive correlation between \(n_s\) and \(r_{0.002}\) (as shown by the gray and red contours in the \(n_s-r_{0.002}\) plane in Fig. 4). Hence, like the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model discussed in the last subsection, this model can resolve the tension between Planck and BICEP2, but at the same time cannot exclude the exact scale-invariant primordial perturbation spectrum.

The light massive sterile neutrino is motivated by the anomalies appearing in the short-baseline neutrino oscillation experiments [48–53]. It is of great interest to see that evidence for the existence of the light sterile neutrino can be found in the existing cosmological data with high statistical significance (see also Refs. [3, 4, 8–10]). Moreover, in this model almost all the tensions of Planck with other astrophysical observations can be simultaneously relieved.

The best-fit results, \(\Delta N_\mathrm{eff}\approx 1\) and \(m_\mathrm{sterile}^\mathrm{thermal}\approx m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\approx 0.5\) eV, derived in this paper and Refs. [3, 4], indicate a fully thermalized sterile neutrino with sub-eV mass. However, the short-baseline neutrino oscillation experiments prefer a sterile neutrino mass of around 1 eV. So, there is still a tension on the sterile neutrino mass between the cosmological data and the short-baseline neutrino oscillation data. The implication of this tension for cosmology deserves further investigation. See Ref. [54] for a recent discussion.
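The connection between the effective mass and the physical mass can be made explicit for a thermally distributed sterile neutrino. Assuming the relation \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}=(\Delta N_\mathrm{eff})^{3/4}\,m_\mathrm{sterile}^\mathrm{thermal}\), which we take here to be the convention of the Planck Collaboration [2], the sketch below converts between the two. The function name is illustrative, and the relation does not apply to other production mechanisms.

```python
def thermal_mass(m_eff_ev: float, delta_n_eff: float) -> float:
    """Physical mass (eV) of a thermally distributed sterile neutrino.

    Assumes m_eff = (Delta N_eff)^(3/4) * m_thermal; this relation is an
    assumption of the sketch and holds only for a thermal distribution.
    """
    if delta_n_eff <= 0.0:
        raise ValueError("delta_n_eff must be positive")
    return m_eff_ev / delta_n_eff ** 0.75


if __name__ == "__main__":
    # Fully thermalized case: Delta N_eff = 1, so the two masses coincide.
    print(f"m_thermal ~ {thermal_mass(0.5, 1.0):.2f} eV")
    # All-data constraints from the text: N_eff = 3.95, m_eff = 0.51 eV.
    print(f"m_thermal ~ {thermal_mass(0.51, 3.95 - 3.046):.2f} eV")
```

For \(\Delta N_\mathrm{eff}\) close to unity the two masses nearly coincide, which is why the best-fit values above point to a sub-eV physical mass.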

4 Conclusion

After the detection of the PGWs by the BICEP2 experiment, the base standard cosmology should at least be extended to the 7-parameter \(\Lambda \)CDM+\(r\) model. In this paper, we consider the extensions to this base \(\Lambda \)CDM+\(r\) model by including additional base parameters relevant to neutrinos and/or other neutrino-like relativistic components. Four neutrino cosmological models are considered, i.e., the \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \), \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\), \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\), and \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) models. We use the current observational data to constrain these models. The cosmological data used in this paper include: Planck+WP, BAO, \(H_0\), Planck SZ cluster, Planck CMB lensing, cosmic shear, and BICEP2 data. The main results of this paper are shown in Figs. 1, 2, 3, 4 and Tables 1, 2, 3, 4. Here, we summarize the findings from our analysis.

  • The \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \) model. With the Planck+WP+BAO data, we find a limit on the active neutrino mass, \(\sum m_\nu <0.28~\mathrm{eV}\) (95 % CL). Including the \(H_0\)+SZ+Lensing data leads to a strikingly tight constraint: \(\sum m_\nu =0.28\pm 0.07~\mathrm{eV}\), preferring a nonzero mass of active neutrinos at about the 4\(\sigma \) level. Further adding the BICEP2 data does not improve the constraint on the mass. We also find that this model cannot alleviate the tension on \(r\) between Planck and BICEP2.

  • The \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\) model. Using only the Planck+WP+BAO data gives \(N_\mathrm{eff}=3.52^{+0.31}_{-0.32}\), further adding the \(H_0\)+SZ+Lensing data gives \(N_\mathrm{eff}=2.97^{+0.20}_{-0.22}\), and the combination of all data (including BICEP2) leads to \(N_\mathrm{eff}=3.07\pm 0.20\). These results are consistent with the standard value of 3.046. We also find that this model cannot effectively alleviate the tension on \(r\) between Planck and BICEP2.

  • The \(\Lambda \)CDM+\(r\)+\(\sum m_\nu \)+\(N_\mathrm{eff}\) model. With the Planck+WP+BAO data, we obtain \(\sum m_\nu < 0.50\) eV (95 % CL) and \(N_\mathrm{eff} = 3.69^{+0.33}_{-0.40}\), so in this case only an upper limit on the total mass of active neutrinos can be given, but a weak preference for \(N_\mathrm{eff}>3.046\) at about the 1.6\(\sigma \) level is shown. Combining with the \(H_0\)+SZ+Lensing data leads to tight constraints, \(\sum m_\nu =0.58^{+0.14}_{-0.15}\) eV and \(N_\mathrm{eff} = 4.04^{+0.35}_{-0.34}\), giving evidence for a nonzero mass of active neutrinos and for \(\Delta N_\mathrm{eff}>0\) at the 3.9\(\sigma \) and 2.9\(\sigma \) levels, respectively. Further adding the BICEP2 data changes the results to \(\sum m_\nu =0.63^{+0.13}_{-0.16}\) eV and \(N_\mathrm{eff} = 4.20\pm 0.32\), favoring \(\sum m_\nu >0\) and \(\Delta N_\mathrm{eff}>0\) at the 4.0\(\sigma \) and 3.6\(\sigma \) levels, respectively. We also show that this model is very helpful in relieving the tension between Planck and BICEP2. The increase of \(r\) comes at the cost of an increase of \(n_s\), and consequently the exact scale-invariant spectrum cannot be excluded.

  • The \(\Lambda \)CDM+\(r\)+\(N_\mathrm{eff}\)+\(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}\) model. With the Planck+WP+BAO data, we obtain \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}< 0.51\) eV (95 % CL) and \(N_\mathrm{eff} =3.72^{+0.32}_{-0.40}\); thus in this case only an upper limit on the sterile neutrino mass can be derived, and a preference for \(\Delta N_\mathrm{eff}>0\) at the 1.7\(\sigma \) level is shown. Further including the \(H_0\)+SZ+Lensing data significantly improves the constraints, \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}=0.48^{+0.11}_{-0.13}\) eV and \(N_\mathrm{eff} = 3.75^{+0.34}_{-0.37}\), favoring a nonzero mass of the sterile neutrino and \(\Delta N_\mathrm{eff}>0\) at the 3.6\(\sigma \) and 1.9\(\sigma \) levels, respectively. Finally, adding the BICEP2 data changes the constraints to \(m_{\nu ,\mathrm{sterile}}^\mathrm{eff}=0.51^{+0.12}_{-0.13}\) eV and \(N_\mathrm{eff} = 3.95\pm 0.33\), showing evidence for a nonzero sterile neutrino mass and \(\Delta N_\mathrm{eff}>0\) at the 3.9\(\sigma \) and 2.7\(\sigma \) levels, respectively. This model is very helpful in relieving the tension between Planck and BICEP2; the increase of \(r\) comes at the expense of an increase of \(n_s\), so the exact scale-invariant spectrum cannot be excluded in this case either. The fitting results indicate a fully thermalized sterile neutrino with sub-eV mass, in tension with the short-baseline neutrino oscillation experiments, which prefer a sterile neutrino mass of around 1 eV. The implication of this tension for cosmology deserves further investigation.