Updated global analysis of neutrino oscillations in the presence of eV-scale sterile neutrinos

We discuss the possibility to explain the anomalies in short-baseline neutrino oscillation experiments in terms of sterile neutrinos. We work in a 3+1 framework and pay special attention to recent new data from reactor experiments, IceCube and MINOS+. We find that results from the DANSS and NEOS reactor experiments support the sterile neutrino explanation of the reactor anomaly, based on an analysis that relies solely on the relative comparison of measured reactor spectra. Global data from the $\nu_e$ disappearance channel favour sterile neutrino oscillations at the $3\sigma$ level with $\Delta m^2_{41} \approx 1.3$ eV$^2$ and $|U_{e4}| \approx 0.1$, even without any assumptions on predicted reactor fluxes. In contrast, the anomalies in the $\nu_e$ appearance channel (dominated by LSND) are in strong tension with improved bounds on $\nu_\mu$ disappearance, mostly driven by MINOS+ and IceCube. Under the sterile neutrino oscillation hypothesis, the p-value for those data sets being consistent is less than $2.6\times 10^{-6}$. Therefore, an explanation of the LSND anomaly in terms of sterile neutrino oscillations in the 3+1 scenario is excluded at the $4.7\sigma$ level. This result is robust with respect to variations in the analysis and used data, in particular it depends neither on the theoretically predicted reactor neutrino fluxes, nor on constraints from any single experiment. Irrespective of the anomalies, we provide updated constraints on the allowed mixing strengths $|U_{\alpha 4}|$ ($\alpha = e,\mu,\tau$) of active neutrinos with a fourth neutrino mass state in the eV range.

We discuss the possibility to explain the anomalies in short-baseline neutrino oscillation experiments in terms of sterile neutrinos. We work in a 3 + 1 framework and pay special attention to recent new data from reactor experiments, IceCube and MINOS+. We find that results from the DANSS and NEOS reactor experiments support the sterile neutrino explanation of the reactor anomaly, based on an analysis that relies solely on the relative comparison of measured reactor spectra. Global data from the ν e disappearance channel favour sterile neutrino oscillations at the 3σ level with ∆m 2 41 ≈ 1.3 eV 2 and |U e4 | ≈ 0.1, even without any assumptions on predicted reactor fluxes. In contrast, the anomalies in the ν e appearance channel (dominated by LSND) are in strong tension with improved bounds on ν µ disappearance, mostly driven by MINOS+ and IceCube. Under the sterile neutrino oscillation hypothesis, the p-value for those data sets being consistent is less than 2.6 × 10 −6 . Therefore, an explanation of the LSND anomaly in terms of sterile neutrino oscillations in the 3 + 1 scenario is excluded at the 4.7σ level. This result is robust with respect to variations in the analysis and used data, in particular it depends neither on the theoretically predicted reactor neutrino fluxes, nor on constraints from any single experiment. Irrespective of the anomalies, we provide updated constraints on the allowed mixing strengths |U α4 | (α = e, µ, τ ) of active neutrinos with a fourth neutrino mass state in the eV range. a modentle@uni-mainz.de b alvaro.cabezudo@kit.edu c jkopp@uni-mainz.de d pmachado@fnal.gov e michele.maltoni@csic.es f ivanj.martinez@estudiante.uam.es g schwetz@kit.edu

I. INTRODUCTION
For almost two decades, the possible existence of light sterile neutrinos-new species of neutral fermions participating in neutrino oscillation-has intrigued the neutrino physics community. The excitement is fuelled in particular by a number of unexpected experimental results: an unexplained excess of electron anti-neutrinos (ν e ) in a muon anti-neutrino (ν µ ) beam observed at a baseline of ∼ 30 m from the source in the LSND experiment [1]; a similar excess found by the MiniBooNE collaboration at higher energies and correspondingly larger baseline [2]; the disagreement between theoretically predictedν e fluxes from nuclear reactors and observations [3,4], known as the reactor anti-neutrino anomaly [5] (see also [6][7][8]); and a similar disagreement between expectations and observations in experiments using intense radioactive sources [9,10].
In this work, we update our previous analyses from refs. [11,14,21] to incorporate new experimental results. These are in particular the following: 1. New constraints onν e disappearance into sterile neutrinos from the reactor neutrino experiments Daya Bay [22], NEOS [23], and DANSS [24][25][26]. Unlike the results from previous short-baseline reactor experiments that have led to the reactor anti-neutrino anomaly, these new analyses are based on a comparison of measured spectra at different baselines rather than a comparison of data to theoretically predicted spectra. The new results are therefore insensitive to possible mismodelling of theν e emission from nuclear reactors. In particular, they are insensitive to an observed, but so far unexplained, bump at neutrino energies ∼ 5 MeV [27][28][29] 1 . Spectral distortions in the recent data from DANSS and NEOS lead to a hint in favour of sterile neutrinos at the 3σ level, which supports the previous reactor anomaly independent of flux predictions.
2. Daya Bay measurements of the individual neutrino fluxes from different fissible isotopes [37]. By combining the time evolution of the observed reactor anti-neutrino spectra with the known evolution of the reactor fuel composition, the Daya Bay collaboration was able to determine independently the neutrino fluxes from the two most important fissible isotopes in a nuclear reactor, 235 U and 239 Pu. Their analysis suggests that the discrepancy between predicted and observed fluxes stems mainly from 235 U, while the neutrino flux from 239 Pu appears consistent with predictions. (The other potentially relevant isotopes 238 U and 241 Pu are subdominant in Daya Bay.) In contrast, oscillations into sterile neutrinos would lead to equal flux deficits in all isotopes. Implications of these results for sterile neutrino models have been discussed previously in refs. [20,21]. In our previous paper [21] we have shown that both hypotheses (free flux normalizations versus sterile neutrino oscillations) give acceptable fits to Daya Bay data, and that the preference in favour of flux rescaling decreases once Daya Bay is combined with the global reactor data. We will update those results in section III A below. Finally, it has been demonstrated recently that the theoretical predictions for the time-dependence of reactor anti-neutrino fluxes on which the Daya Bay analysis is based may need to be refined [38,39]. In particular, the present analysis accounts neither for the time-dependent equilibration of decay chains nor for the possibility of neutron capture on fission products, which would lead to a non-linear dependence of anti-neutrino fluxes on the neutron flux in the reactor [38]. Taking these effects into account, Daya Bay's preference for the flux misprediction hypothesis is estimated to drop to well below 2σ [39].
3. Final results from OPERA [40] and ICARUS [41,42]. Both experiments constrain sterile neutrinos mixing with electron and muon neutrinos by searching for anomalous ν µ → ν e appearance in the CNGS beam.
4. Searches for sterile neutrinos in MINOS/MINOS+ [43] and in NOνA [44]. The first analysis combines charged current ν µ disappearance data and neutral current data from the MINOS experiment and from the MINOS+ setup operating the same detector in a higher energy beam. The second analysis is based on neutral current data from NOνA. Especially the MINOS/MINOS+ analysis places stringent bounds on sterile neutrino mixing with ν µ over a wide range of masses.
5. New solar neutrino data, including the 2055-day energy and day/night asymmetry spectrum from Super-Kamiokande phase 4 [45] and the measurement of neutrinos from the proton-proton (pp) fusion chain in the Sun recently presented by Borexino [46]. In addition, the results of all solar experiments have been updated to match the new solar neutrino fluxes predicted by the GS98 version of the Standard Solar Model presented in ref. [47].
6. Improved atmospheric neutrino data from Super-Kamiokande (including 1775 days of phase 4 data) from ref. [48], as well as the complete set of DeepCore 3-year data presented in ref. [49] and publicly released in ref. [50]. The calculations of atmospheric neutrino event rates for both detectors are based on the atmospheric neutrino flux calculations described in ref. [51].
7. First sterile neutrino limits from IceCube, based on one year of data [52][53][54]. This novel analysis exploits the fact that active-to-sterile oscillations of atmospheric neutrinos inside the Earth may be enhanced by a Mikheyev-Smirnov-Wolfenstein (MSW) resonance [55,56]. The resonance affects the anti-neutrino sector, and for sterile neutrino masses around 1 eV occurs at energies of order 1 TeV, an energy well above IceCube's detection threshold, but still low enough to benefit from a substantial flux [57,58]. Consequently, IceCube is able to set strong limits on sterile neutrino mixing with ν µ .
We will begin in section II by reviewing the formalism of neutrino oscillations in the presence of sterile neutrinos. Along the way, we will also fix our notation, such as our parameterization of the leptonic mixing matrix. In sections III to V, we will then discuss the status of the global data sets in the ν e → ν e , ν µ → ν e , and ν µ → ν µ channels (and the corresponding anti-neutrino channels) in turn. In particular, section III discusses the recent hints from reactor spectral data and section IV reviews the anomalies in the appearance channel. In sections V and VI, we present updated constraints on the mixing of a sterile neutrino with the ν µ and ν τ flavour from global data, respectively. We will finally combine all oscillation channels in section VII into a global fit. We will determine the goodness of fit at the global best fit point and quantify the tension between appearance and disappearance data. We will summarize our results and conclude in section VIII. Supplementary material can be found in the appendices.

II. NEUTRINO OSCILLATIONS IN THE PRESENCE OF STERILE NEUTRINOS
The topic of this paper are scenarios in which the standard three-flavor framework for neutrino oscillations is augmented by adding one sterile neutrinos ν s . We will refer to such scenarios as "3 + 1 models". We will comment on scenarios with more than one sterile neutrino in section VIII.
The oscillation probability for ν α → ν β transitions in vacuum (α, β = e, µ, τ, s) is given by Here, L is the baseline, E is the neutrino energy, U αj are the elements of the leptonic mixing matrix (which is 4 × 4 in a 3 + 1 model), and ∆m 2 jk ≡ m 2 j − m 2 k are the mass squared differences, with m j the neutrino mass eigenvalues. We will assume m 1,2, 3 1 eV, but allow m 4 to be larger, thus considering the case ∆m 2 41 > 0. For experiments in which matter effects play a significant role, in general the evolution equation should be solved numerically. In cases where a constant matter density is a good approximation, U αj and ∆m 2 jk in eq. (1) can be replaced by an effective mixing matrix and effective mass squared differences in matter. For anti-neutrino oscillations, U should be replaced by U * .
The mixing matrix U in vacuum can be written as a product of two-dimensional rotation matrices. Where an explicit parameterization is required, we choose where R ij (θ ij ) denotes a real rotation matrix in the (ij)-plane with rotation angle θ ij , and R ij (θ ij , δ ij ) includes in addition a complex phase δ ij . In most cases, however, we will present our results in terms of the parameterization-independent matrix elements U αβ . For the following discussion the so-called short-baseline limit of eq. (1) will be useful. This limit refers to the situation where ∆m 2 21 L/4E 1, ∆m 2 31 L/4E 1, so that standard three-flavor oscillations have not had time to develop yet. In this case, eq. (1) generically simplifies to As we will see later, the connection between the ν e → ν e , ν µ → ν µ , and ν µ → ν e oscillation probabilities, inferred from these equations, will prove to be crucial to test the compatibility between different oscillation data sets. An extended discussion of various other limiting cases and the corresponding parameter dependencies (including complex phases) can be found in ref. [14].  [23,29] ratio of NEOS and Daya Bay spectra DANSS [26] ratios of spectra at two baselines (updated w.r.t. [21]) Double Chooz [33] near detector rate RENO [69,70] near detector rate Daya Bay spectrum [71] spectral ratios EH3/EH1 and EH2/EH1 Daya Bay flux [37] individual fluxes for each isotope (EH1, EH2) KamLAND [72] very long-baseline reactor experiment (L (4) GALLEX [74,88] ν e from 51 Cr source SAGE [89,90] ν e from 51 Cr and 37 Ar sources TABLE I. Data sets included in our ν e /ν e disappearance analysis. The total number of data points is 594. More details can be found in ref. [21]; the only update with respect of [21] is new data from DANSS [26].
In the ν e andν e disappearance channels, the most important constraints on sterile neutrinos come from reactor experiments at short baseline (L 1 km). But we include also data from solar neutrinos, ν e scattering on 12 C, and radioactive source experiments. The data is summarized in table I. The following analysis is based on our earlier publication [21] where more details can be found. In section III A we give an update of the reactor neutrino analysis, high-lighting the impact of the recent results from the DANSS experiment [26],  TABLE II. Results on ( -) ν e disappearance from DANSS+NEOS, from a fit to all reactor data (both for free fluxes and fixed fluxes), and from a fit to the combined ( -) ν e disappearance data listed in table I. For each combination of data sets, we give the parameter values and the χ 2 value per degree of freedom at the best fit point. In all fits, we treat θ 14 and ∆m 2 41 as free parameters. For the "all reactor" sample, we also leave θ 13 free. In the " ( -) ν e disap." analyses, all parameters listed in eq. (6) are allowed to float. For the analyses with free reactor fluxes, there are two additional free parameters corresponding to the normalization of the 235 U and 239 Pu fluxes. The last two columns of the table give the ∆χ 2 between the no-oscillation hypothesis and the best fit, as well as the significance at which the no-oscillation hypothesis is disfavoured. It is obtained by assuming that ∆χ 2 follows a χ 2 distribution with two degrees of freedom (∆m 2 41 and |U e4 |).
whereas in section III B we present the global ( -) ν e disappearance analysis.

A. Updated reactor analysis
The reactor analysis includes the experiments listed in table I. The fit by now is dominated largely by the recent NEOS [23] and DANSS [26] results, as well as the latest data from Daya Bay. For the latter we include the ratios of spectra measured in experimental halls (EH) 3 and 1, and in experimental halls 2 and 1 [71], as well as the measurement of the individual neutrino fluxes from each fissible isotope [37]. The analysis presented here is based largely on ref. [21] where more details can be found. The important difference with respect to that analysis is the recent preliminary results from the DANSS experiment presented in December 2017 [26], which consists of a data sample of approximately four times increased exposure compared to the one shown in March 2017 [25] used in [21]. Another recent analysis including this latest DANSS data can be found in ref. [91].
Regarding reactor neutrino flux predictions we consider two scenarios: (i) fixed fluxes, where we set the uncertainties on the predicted anti-neutrino fluxes to the values estimated in the original publications [3,4]; (ii) free fluxes, where the normalizations of the neutrino fluxes from the four main fissible isotopes 235 U, 238 U, 239 Pu and 241 Pu are allowed to float freely. (A weak constraint ±20% at 1σ is included for the numerically subdominant fluxes from 238 U and 241 Pu to avoid unphysical values.) Note that we never rely on the predicted anti-neutrino spectra, only on the predicted rates. Even in the case of fixed fluxes, those analyses which use spectral information are based entirely on ratios of spectra at different baselines.
The new spectral data from DANSS are shown in the left panel of fig. 1. The DANSS experiment uses a movable detector. The plot shows the ratio of the spectra observed in two detector locations corresponding to baselines of 10 distortion, leading to a preference in favour of sterile neutrino oscillations, as illustrated by the red dashed curve in fig. 2. The remarkable observation is that the preferred region from DANSS overlaps with the one from NEOS, which also observes a spectral distortion consistent with sterile neutrino oscillations, see right panel of fig. 1. Results of the combined analysis of DANSS and NEOS are given in table II. We find that the no-oscillation hypothesis is disfavoured with respect to sterile neutrino oscillations at a significance of 3.3σ. Let us stress that this result is completely independent of reactor neutrino flux predictions. It is only based on bin-by-bin spectral comparison between two detector locations in DANSS, and between the spectra observed in NEOS and Daya Bay.
Combing all available reactor data, we obtain the results shown table II and fig. 2. These results confirm the 3σ hint in favour of sterile neutrinos from DANSS and NEOS in the analysis with free fluxes. If the fluxes are fixed and the predicted neutrino rate is used ("reactor anomaly"), the significance increases to 3.5σ, with a best fit point consistent with the DANSS/NEOS spectral indications. Note that in the analysis using fixed fluxes there is minor tension between "old" reactor data and the DANSS/NEOS best fit region, see fig. 2. Despite this small tension, the significance for sterile neutrinos increases from 3.3σ for NEOS+DANSS to 3.5σ for the global data. We conclude that recent data support the indication in favour of sterile neutrinos from the reactor anomaly, a conclusion that is solely based on spectral distortions, but independent of reactor flux predictions.
Let us comment on the impact of the Daya Bay measurements of the individual neutrino fluxes from different fissible isotopes [37] by using the time evolution of the observed reactor anti-neutrino spectra. These data have been used to compare the hypothesis H 1 of no- oscillations but free flux normalizations to the hypothesis H 0 that flux predictions [3,4] (including their error estimates) are correct and a sterile neutrino exists. Considering the test statistic Daya Bay data lead to T obs = 6.3, which prefers H 1 (flux-free) over H 0 (oscillations) at 2.7σ [21,37] (see, however, [39]). As shown previously [20,21], this preference decreases, once the global reactor data is combined with DayaBay data. Using the numbers given in table II, we find that with present combined reactor data, T obs = −1.3, which actually shows a slight preference for oscillations over the no-oscillation but flux-free hypothesis. Again the main driver for this are spectral distortions, which can be fit better by oscillations than by re-scaling fluxes.
We proceed now to combining reactor data with all other data on ( -) ν e disappearance listed in table I. In fitting these data we scan the following set of parameters (see eq. (2) for our Constraints on ν e /ν e disappearance in the 3+1 scenario. We show the preferred parameter regions at 95% and 99% CL, projected onto the plane spanned by the mixing matrix element |U e4 | 2 and the mass squared difference ∆m 2 41 . The parameter space inside the shaded areas and to the left of the exclusion curves is allowed. For the reactor analysis we adopt the conservative assumption of free flux normalizations. The red region includes all data listed in table I. The green curves show the limit on |U e4 | 2 obtained from atmospheric neutrino data from SuperK, IceCube and DeepCore, discussed in section V. mixing matrix convention): We fix θ 13 here since it is determined very accurately, and we have checked that its best fit value does not depend on the possible existence of sterile neutrinos [14]. The dependence on θ 24 and θ 34 appears due to solar neutrino data, which in addition to the ν e survival probability includes also NC data sensitive to ν e → ν s transitions. 2 The results are shown in the last two rows of table II and in fig. 3. We observe that the best fit point remains stable at ∆m 2 41 ≈ 1.3 eV 2 , in agreement with the reactor-only analysis. From fig. 3 we observe a slight tension between the global best fit point and the region favoured by the gallium anomaly. We have used the parameter goodness-of-fit (PG) test [92] to quantify the compatibility of the gallium anomaly with reactor data. We obtain for the PG test-statistic (see appendix A for a review) χ 2 PG = 4.7, irrespective of whether reactor fluxes are fixed or free. For 2 dof, this translates into a p-value of about 9% for the compatibility of reactors and gallium. From fig. 3 we see, however, that the combined best fit point of reactor and gallium data lies in the island around ∆m 2 41 ≈ 4.5 eV 2 , which is disfavoured by solar neutrinos as well as neutrino scattering on 12 C. For the global best fit point around ∆m 2 41 ≈ 1.3 eV 2 , the PG test comparing reactor and gallium data gives χ 2 PG = 6.9 (7.2) for fixed fluxes (free fluxes). This corresponds to a p-value of 3.1% (2.8%), indicating some minor tension between these data sets. Despite this tension, table II shows that the significance of rejecting no-oscillations of the combined fit increases by about two units in ∆χ 2 compared to the reactor-only analysis, both for the flux-free and flux-fixed analyses.
In fig. 3 we show also the bound on |U e4 | 2 obtained from the atmospheric neutrino experiments SuperKamiokande (SK), IceCube (IC), and DeepCore (DC), see section V for more details. We observe that this bound is comparable to the one from solar neutrino data. The effect of sterile neutrinos on low-energy atmospheric data as relevant for SK and DC has been discussed in the appendix of ref. [93]. It amounts mostly to a normalization effect of the electron and muon neutrino survival probability according to P αα ∝ (1 − 2|U α4 | 2 ) with α = e, µ. In our SK/DC analyses we assume a 20% correlated normalization error on e and µ-like events, and a 5% error on the ratio of them. Therefore, we can expect a 1σ bound of order 0.1 on |U α4 | 2 from those data alone. If either |U e4 | 2 or |U µ4 | 2 is independently constrained from any other data, the bound on the other one from SK/DC becomes significantly stronger, due to the correlated uncertainty. Since the high-energy data relevant for IC provide such an independent constraint on |U µ4 | 2 due to the resonant matter effect (see section V), the combined bound improves and we get |U e4 | 2 0.1 at 99% CL (2 dof). Note that we do not include atmospheric data in the global ( -) ν e disappearance analysis presented in this section, since in this work we classify atmospheric neutrino experiments as ( -) ν µ disappearance to be discussed below.
We conclude that global ( -) ν e disappearance data show a robust hint in favour of sterile neutrinos at the 3σ level, independent of reactor flux predictions. If reactor flux predictions (including their uncertainties) are assumed to be correct, the significance reaches 3.8σ.
The appearance channelν µ →ν e was the first oscillation channel to reveal possible hints for sterile neutrinos, namely in the LSND experiment [1]. This hint, which to date remains the oscillation anomaly with the largest statistical significance, was later reinforced at lower significance by MiniBooNE [2]. Other experiments, in particular KARMEN [94], NOMAD [95], E776 [96], ICARUS [97,98], and OPERA [40], have not been able to confirm the findings by LSND and MiniBooNE, albeit not ruling them out either. We summarize the data sets included in our analysis of ν e andν e appearance data in table III.
Compared to our previous publication, ref. [14], in which more technical details on our fits are given, we have added the following data sets: 1. New results from the ICARUS [97,98] and OPERA [40] experiments in the high energy (∼ 20 GeV) CNGS beam. Both experiments have searched for anomalous ν µ → ν e appearance, but have not found any evidence. They are thus able to impose constraints over a wide range of ∆m 2 41 values.
2. Decay-in-flight data from LSND. The neutrino oscillation analysis of LSND is based on a search for anomalousν e appearance in the neutrino flux from a stopped pion source. Since the LSND detector was placed downstream from the pion production target, it received not only ν µ ,ν µ , and ν e from π + decays at rest (DaR), but also neutrinos and anti-neutrinos from pions decaying in flight (DiF). A discussion of the impact of Experiment References Comments Data points LSND [1]ν µ from stopped pion source (DaR) 11 LSND [1] combined DaR and DiF data ( [2,99] ν µ andν µ from high-energy Fermilab beam 22 KARMEN [94]ν µ from stopped pion source 9 NOMAD [95] ν µ from high-energy CERN beam 1 E776 [96] ν µ from high-energy Brookhaven beam 24 ICARUS [97,98] ν µ from high-energy CERN beam 1 OPERA [40] ν µ from high-energy CERN beam 1 For LSND, we have carried out analyses using only decay-at-rest (DaR) data, or the combination with decay-in-flight (DiF) data. In the latter case we use a χ 2 table provided by the collaboration, which cannot be associated with a number of data points. The total number of data points in the appearance channel (when using LSND DaR data only) is 69.
DiF data in the context of the global sterile neutrino fit can be found in ref. [100]. The LSND collaboration has kindly provided tabulated χ 2 values from their combined DaR+DiF fit. The LSND fit is based on the two-flavour approximation, so to include the tabulated χ 2 values in our 4-flavour analysis, we compute at each parameter point the effective two-flavour mixing angle from the full four-flavour mixing matrix U . In the following, we will show results using both our previous fitting code that includes only DaR data as well as results based on the tabulated two-flavour χ 2 values from LSND for DaR+DiF data.
Our results are plotted in fig. 4, which shows the favoured parameter regions projected onto the sin 2 2θ µe -∆m 2 41 plane. We see that all ν e data sets are consistent among each other: a large chunk of the parameter region favoured by LSND and MiniBooNE is not probed by any of the other searches. The strongest constraints come from OPERA at ∆m 2 41 0.5 eV 2 , and from KARMEN at larger ∆m 2 41 . Note that data from E776 is combined with solar neutrino data because a fit to E776 data alone would not be meaningful as it would leave possible oscillations of the ν e andν e backgrounds into sterile states unconstrained. Fitting E776 data jointly with solar neutrino data provides a reasonable constraint on |U e4 |, cf. fig. 3.
The conclusions drawn from fig. 4 agree qualitatively with the ones from our earlier paper ref. [14]. Some constraints, in particular those from OPERA and ICARUS, have become significantly stronger and now disfavour values of sin 2 2θ µe 0.02 that were still allowed previously. Note that our OPERA and ICARUS limits deviate slightly from those published by the respective collaborations [40,97,98] because we include oscillations of the backgrounds. Moreover, for consistency with the other exclusion curves in fig. 4, we interpret the χ 2 values from our OPERA and ICARUS fits assuming two degrees of freedom. We have checked that our code reproduces the official limits from refs. [40,97,98] very well when the same assumptions as in the official publications are used. Let us mention that the global ( -) ν µ → ( -) ν e analysis has a relatively poor goodness of fit. For the combined best fit point using the LSND DaR analysis we find χ 2 min /dof = 89.9/(69 − 2), which corresponds to a p-value of 3.3%. This is mostly driven by the MiniBooNE low-energy excess, which cannot be fitted well in the 3 + 1 scenario, and by the contribution from E776 whose spectrum gives a relatively poor fit. This feature has been present also in our previous analysis [14], where a more detailed discussion can be found.
In all cases LSND dominates the appearance fit. LSND alone disfavours the no-oscillation hypothesis with ∆χ 2 = 44 (29) when using DaR (DaR+DiF) data. For the combined appearance analysis these numbers increase slightly, due to the hint for appearance in MiniBooNE data. We find that the no-oscillation hypothesis for all appearance data is disfavoured compared to the best fit by ∆χ 2 = 46 (35) when using LSND DaR (DaR+DiF) data.
Comparing the allowed regions with and without the inclusion of decay-in-flight data in LSND, we see that the impact on the global fit is relatively minor. This is because although the LSND region with DiF data extends to slightly smaller values of sin 2 2θ µe , MiniBooNE appearance data prefers smaller ∆m 2 41 and mixing angles (especially for the neutrino mode data), somewhat limiting the impact of LSND DiF data when LSND and MiniBooNE data are combined. We observe only a slight broadening of the parameter regions preferred by LSND and by the combination of all ν µ → ν e andν µ →ν e appearance data. We will see in section VII that this slightly reduces the tension between appearance and disappearance data, but does not remove it. V.
( -) ν µ DISAPPEARANCE DATA Searches for muon neutrino disappearance due to oscillations involving a fourth neutrino mass state have recently received a significant boost thanks to novel results on sterile neutrinos from atmospheric neutrino data (both in the TeV energy window from IceCube [52] and at lower energy from DeepCore [49]) as well as from a combined analysis of MINOS and MINOS+ charged current (CC) and neutral current (NC) data [43]. Also NOνA has presented a first search for sterile neutrinos based on NC data [44]. Searches for a deficit of NC events are of particular interest because they are sensitive to mixing of sterile neutrinos with any active neutrino flavor. As such, any deficit found would be a unique signature of sterile neutrinos. The new analyses by IceCube, DeepCore, MINOS/MINOS+, and NOνA complement, and significantly extend, the exclusion regions from the short-baseline experiments CDHS [101] and MiniBooNE [102,103], from Super-Kamiokande data on atmospheric neutrinos [48,104], and from MINOS [105].
The high-energy IceCube analysis from ref. [52] exploits the fact that active-to-sterile neutrino oscillations in matter are resonantly enhanced by the MSW effect [55,56] at an energy of Here ρ ⊕ is the mass density of the material through which neutrinos are propagating. It is on average ∼ 3 g/cm 3 in the Earth's crust and outer mantle, ∼ 5 g/cm 3 in the inner mantle, and between 10 and 13 g/cm 3 in the core [106]. Equation (8) implies that, for sterile neutrinos at the eV-scale, neutrino telescopes like IceCube can in principle observe maximal oscillations at TeV energies -a sweet spot well above the detection threshold, but still low enough for the atmospheric neutrino flux to be appreciable [57,58]. For larger or smaller ∆m 2 41 , the sensitivity is expected to dwindle as the resonance moves to energies with a lower neutrino flux, or moves below the energy threshold of the detector. A limiting factor to this analysis is the fact that, for ∆m 2 41 > 0 as considered here, the resonance is in the anti-neutrino sector. Since neutrino telescopes cannot distinguish neutrinos from antineutrinos on an event-by-event basis, and since anti-neutrino cross sections are smaller by about a factor of three than neutrino cross sections, the magnitude of the observable effect is reduced. 3 Moreover, for small mixing angles, the resonance width, is small, so that only a very small fraction of the energy spectrum is affected. The narrow width, combined with the limited experimental energy resolution, further reduces the sensitivity of IceCube. In eq. (9), V MSW 1.9 × 10 −14 eV × [ρ ⊕ /(g/cm 3 )] is the neutral current-induced MSW potential for muon and tau neutrinos. Finally, systematic uncertainties play a crucial role in the analysis from ref. [52]. Technical details on our implementation of the IceCube analysis are given in appendix B.
In addition to the TeV neutrino events discussed above, the IceCube collaboration has also observed atmospheric neutrinos in the tens-of-GeV range through its sub-detector DeepCore.
The information on sterile neutrinos which can be extracted from this low-energy sample is very similar to that provided by Super-Kamiokande atmospheric data, which has been discussed in detail in refs. [14,93]. As explained there, low-energy atmospheric neutrino data can put a strong bound on |U µ4 | 2 through the suppression of the P µµ oscillation probability which a mixing of ν µ with a heavy state would imply. Moreover, such data also constrains |U τ 4 | 2 because the zenith-angle dependence of P µµ is modified if oscillations driven by ∆m 2 31 deviate from vacuum-like ν µ → ν τ oscillations. The formalism for neutrino oscillations discussed in appendix D of ref. [14] for Super-Kamiokande phase 1-3 data is also applied here to phase 4 results as well as to DeepCore data.
The MINOS detector is particularly interesting for sterile neutrino searches as it has observed neutrino oscillations over a fairly wide range of energies: during the original MINOS run, the NuMI beam was tuned to a peak energy of ∼ 2 GeV, while in the MINOS+ phase, the peak energy was at about 6 GeV, with the spectrum extending to tens of GeV. Moreover, the MINOS collaboration has analysed not only CC ν µ disappearance sensitive mainly to U µ4 , but has also searched for disappearance in NC events. Since MINOS/MINOS+ has near and far detectors, the experiment is sensitive over a wide range of ∆m 2 41 values. For ∆m 2 41 ∼ 10 −3 -10 −1 eV 2 , an oscillation pattern can be observed in the far detector, while no oscillations are expected in the near detector. At larger mass squared difference, oscillations in the far detector enter the averaging regime. At ∆m 2 41 ∼ 1-100 eV 2 , oscillation patterns begin to emerge in the near detector. In our analysis of MINOS/MINOS+ data, we follow very closely the recommendations accompanying the MINOS/MINOS+ data release [43].
We have also implemented the NOνA neutral current analysis from ref. [44]. Due to the low number of events and the difficult reconstruction of the neutrino energy in NC events, only total rates are used in the analysis. The dominant background in this analysis are misidentified charged current events. Following ref. [44], we implement a 12.2% (15.3%) systematic uncertainty on the signal (background) rates. Compared to the MINOS/MINOS+ NC search, the narrow-band beam employed in NOνA means that the experiment is sensitive to a much smaller range of ∆m 2 41 values, namely between 0.05 eV 2 and 0.5 eV 2 . Even in this mass range, the NOνA search for sterile neutrinos is not competitive with other searches yet as it is suffers from large systematic uncertainties related to detector modelling and energy reconstruction, but it is expected to improve considerably in the future.
We summarize the ν µ /ν µ disappearance data sets included in our analysis in table IV. Details on the CDHS and MiniBooNE analyses are given in ref. [14] and in the references

99% CL 2 dof
FIG. 5. Constraints on the 3 + 1 scenario from ν µ /ν µ disappearance. We show the allowed parameter regions, projected onto the plane spanned by the mixing matrix element |U µ4 | 2 and the mass squared difference ∆m 2 41 . Note that the exclusion limit from NOνA is still too weak to appear in the plot. It is, however, included in the curve labelled "combined", which includes all data listed in table IV. The curve labelled DC+SK+IC combines all our atmospheric neutrino data; for this bound we have fixed the parameters θ 12 , θ 13 , θ 14 but minimize with respect to all other mixing parameters, including complex phases. For comparison, we also show the parameter region favoured by ν e disappearance and ν µ → ν e appearance data (using LSND DaR+DiF), projected onto the |U µ4 | 2 -∆m 2 41 plane; we show the allowed regions for the analyses with fixed and free reactor neutrino fluxes. therein. Our results are shown in fig. 5 as a function of the mixing matrix element |U µ4 | 2 and the mass squared difference ∆m 2 41 . The plot reveals strong limits of order |U µ4 | 2 10 −2 across a wide range of ∆m 2 41 values from ∼ 2 × 10 −1 eV 2 to ∼ 10 eV 2 . MINOS/MINOS+ gives an important contribution in most of the parameter space. The strong constraint from atmospheric neutrino data at ∆m 2 41 1 eV 2 is dominated by IceCube. At large masses, MiniBooNE and to some extent CDHS are competitive with the MINOS/MINOS+ bound. Comparing to the parameter region preferred by appearance and ν e /ν e disappearance data (which includes the oscillation anomalies), we see dramatic tension. Given the constraints on U e4 from reactor experiments, the values of sin 2 2θ µe ≡ 4|U e4 | 2 |U µ4 | 2 required by LSND and MiniBooNE can only be reached if |U µ4 | is large. This, however, is clearly disfavoured by multiple ν µ /ν µ disappearance null results. This is the origin of the severe tension in the global fit we are going to report below. As we are going to discuss, this tension has become very robust and does not rely on any single ( -) ν µ disappearance data set.

VI. CONSTRAINTS ON |U τ 4 |
Mixing between tau neutrinos and possible sterile states is particularly difficult to constrain since no ν τ sources are available. Nevertheless, constraints can be obtained in the following two ways: (i) studying matter effects. All active neutrino flavors experience an MSW potential caused by coherent forward scattering through Z boson exchange, while sterile neutrinos do not. This influences ν e disappearance observed in solar neutrino experiments, as well as ν µ disappearance observed in beam experiments and in atmospheric neutrinos. The latter yield particularly strong limits as they possess the longest baselines in matter.
(ii) exploiting neutral current events, which are sensitive to any disappearance of active neutrinos. This approach allows us to derive constraints from the sterile neutrino searches in MINOS/MINOS+ [43] and NOνA [44], and from SNO solar neutrino data [79][80][81]. The corresponding analysis codes used in our fit are the same as discussed in sections III and V. Compared to ref. [14], we have in particular added IceCube, DeepCore, MINOS/MINOS+, and NOνA data to the fit.
Our results are shown in the four panels of fig. 6. Each panel corresponds to a different fixed value of ∆m 2 41 , and the corresponding contours have been drawn based on the χ 2 differences relative to the best fit point for this fixed ∆m 2 41 . The difference in χ 2 between the individual best fit points and the global one are, however, very small, as indicated in each panel. The reason is that in all cases the best fit point is very close to zero mixing, and therefore has very similar χ 2 values. In defining the exclusion contours we have assumed a χ 2 distribution with two degrees of freedom. We see that depending on ∆m 2 41 , the limit on |U µ4 | is driven by MINOS/MINOS+, IceCube, or the short-baseline experiments MiniBooNE and CDHS, in agreement with fig. 5. The strongest constraints on |U τ 4 | typically come from atmospheric neutrinos. We find that the combined bound is independent of ∆m 2 41 and is given by |U τ 4 | 2 < 0.13 (0.17) at 90% (99%) CL. (10) Let us mention that recently ref. [108] has found a 2σ hint from Ice Cube data in favour of sterile neutrinos with non-zero ν 4 -ν τ mixing in the high-mass region, with ∆m 2 41 100 eV 2 . With our code we cannot reproduce their results and we do not find any hint for sterile neutrino mixing in that mass range. The origin of these different results is currently under investigation.

VII. THE DISAPPEARANCE-APPEARANCE TENSION
As discussed above, results on the ν e → ν e , ν µ → ν e , and ν µ → ν µ oscillation channels (and the corresponding anti-neutrino modes) over-constrain eV-scale sterile neutrino models. The reason can be easily understood by going to the short-baseline limit in which baselines are so short that oscillations induced by ∆m 2 31 and ∆m 2 21 did not yet develop. In this limit, eqs. (3) and (4) show that the bounds on |U e4 | and |U µ4 | from electron and muon disappearance data lead to a quadratic suppression of the effective amplitude sin 2 2θ eµ , eq. (7), relevant for ν µ → ν e appearance [109][110][111]. Thus constraints from disappearance data challenge an explanation of the anomalies in the appearance channel in terms of sterile neutrino oscillations. While this tension has persisted for a very long time, see for instance ref. [100], it has become exceedingly severe with recent data, rendering the sterile neutrino hypothesis as an explanation for the appearance anomalies very unlikely, see below. The results of the combined fit are summarized in table VI, which shows the results for ( -) ν e disappearance, ( -) ν µ disappearance, and ( -) ν e appearance data separately as well as combined. The total numbers of data points in these analyses are summarized in table V. The last column of that table also indicates which parameters need to be considered when counting degrees of freedom. For the ( -) ν µ disappearance data we do take into account complex phases Data set Reference Data points Relevant parameters ( -) ν e disappearance Table I 594 ∆m 2 31 , ∆m 2 41 , θ 12 , θ 14 , θ 24 , θ 34 ( -) ν µ disappearance Table IV 504 ∆m 2 31 , ∆m 2 41 , θ 23 , θ 14 , θ 24 , θ 34  TABLE VI. Parameter values at the global best fit point and at the best fit points obtained for subsets of the data. We also indicate the χ 2 per degree of freedom at the best fit points, as well as the corresponding goodness-of-fit values. The numbers of data points, and the parameters relevant to the counting of degrees of freedom are summarized in table V. For the global fit, we also indicate the results of the parameter goodness-of-fit test [92] comparing appearance to disappearance data. The labels "DaR" and "DiF" refer to the LSND analysis employed, where "DiF" implies the joint use of DaR+DiF data, see section IV. Note that, as the number of degrees of freedom for the LSND DiF data is not available, we do not list the corresponding goodness of fit values.
in the fit [14], but since numerically their effect is very small we do not count them as full dof. We do, however, treat the normalization of the atmospheric neutrino flux as a free parameter in the IceCube analysis. Concerning the appearance sample, for most of the data summarized in table III the short-baseline approximation holds, motivating the use of only the effective mixing angle quoted in table V. Exceptions are the long-baseline experiments ICARUS and OPERA, which depend on more parameters, but play a role neither for the appearance best fit point nor for the global best fit point. Therefore, we consider only two effective parameters for the appearance sample. For the global analysis we count seven parameters plus the IceCube global normalization. The reactor analysis with free fluxes has Appearance versus disappearance data in the plane spanned by the effective mixing angle sin 2 2θ µe ≡ 4|U e4 U µ4 | 2 and the mass squared difference ∆m 2 41 . The blue curves show limits from the disappearance data sets using free reactor fluxes (solid) or fixed reactor fluxes (dashed), while the shaded contours are based on the appearance data sets using LSND DaR+DiF (red) and LSND DaR (pink hatched). All contours are at 99.73% CL for 2 dof. two additional free parameters.
We would now like to quantify the tension between different subsets of the global data that is evident from fig. 5. We first note that combining all data sets we find a goodness-of-fit for the global best fit point around 65%, see table VI. This good p-value does not reflect the tension we found because many data points entering the global fit have only little sensitivity to sterile neutrino oscillations, thus diluting the power of a goodness-of-fit test based on χ 2 /dof.
A more reliable method for quantifying the compatibility of different data sets is the parameter goodness-of-fit (PG) test [92], which measures the penalty in χ 2 that one has to pay for combining data sets, see appendix A for a brief review of this test. If the global neutrino oscillation data were consistent when interpreted in the framework of a 3 + 1 model, any slicing into two statistically independent data sets A and B should result in an acceptable p-value from the PG test. To illustrate an inconsistency in the data, it is however sufficient to demonstrate that at least one way of dividing it leads to a poor value. Here, we choose to split the data into disappearance data encompassing the oscillation channels ( -) ν e → ( -) ν e and ( -) ν µ → ( -) ν µ , and appearance data covering the ( -) ν µ → ( -) ν e channel. Note that it is important to chose data sets independent of their "result". For instance, dividing data into "evidence" and "no-evidence" samples would bias the PG test.
The tension between appearance and disappearance data is shown graphically in fig. 7. The figure illustrates the lack of overlap between the parameter region favoured by appearance data (driven by LSND and MiniBooNE) and the strong exclusion limits from disappearance data. The tension persists independently of whether reactor fluxes are fixed or kept free, and whether the LSND DaR or DaR+DiF samples are used.  In columns 2-8, we list the χ 2 at the global best fit point (χ 2 min,global ), the χ 2 at the appearance best fit (χ 2 min,app ), the difference in χ 2 app between the appearance best fit point and the global best fit point (∆χ 2 app ), the χ 2 at the disappearance best fit (χ 2 min,disapp ), the difference in χ 2 disapp between the disappearance best fit point and the global best fit point (∆χ 2 disapp ), the χ 2 per dof for the PG test (χ 2 PG /dof, computed according to eq. (A1)), and the resulting p-value given by eq. (A3).
p-value of the PG test statistic we use two degrees of freedom, corresponding to the two parameters in common to appearance and disappearance data, see table V and the related discussion. We observe that for none of the analyses given in the table, the p-value for appearance and disappearance data being consistent exceeds 10 −5 , with the "best" compatibility of p = 2.6 × 10 −6 emerging for fixed reactor fluxes and using LSND DaR+DiF data. We conclude that the appearance/disappearance tension excludes a sterile neutrino oscillation explanation of the ( -) ν µ → ( -) ν e anomalies at the 4.7σ level. Note that the parameter goodness-of-fit for the analysis using free reactor fluxes is worse than the one for fixed reactor fluxes. The reason can be understood from the χ 2 numbers given in table VI. We see that the χ 2 min of ( -) ν e disappearance decreases by more (9.9 units) than the global best fit point (7 or 6 units for DaR or DaR+DiF, respectively), when leaving reactor fluxes free. Therefore, reactor data alone benefits more from free fluxes than the appearance/disappearance tension, which increases the χ 2 penalty to pay for the combination in the case of free fluxes.
In table VII we investigate the robustness of the appearance/disappearance tension. We show how the PG would improve if individual experiments or classes of experiments were removed from the fit. We stress that we are not aware of any strong reason to discard data from particular experiments. The sole purpose of this exercise is to demonstrate the impact of individual data sets and establish the robustness of our conclusion.
The first row in table VII corresponds to the global analysis using free reactor fluxes and LSND DaR+DiF data, which is the combination of data we use throughout this table. The remaining part of the table shows that very strong tension remains even after removing any individual experiment. In particular, the PG remains below ≈ 5 × 10 −6 when any of the ( -) ν µ disappearance data sets are removed, so it does not rely on the particular treatment of any of those experiments. Even when all reactor data are removed, the PG remains very small (3.8 × 10 −5 ).
The only significant improvement is obtained when removing LSND. The still somewhat low PG of 0.16% is a manifestation of the tension between the MiniBooNE excess and the disappearance data. But it is clear that the very strong appearance/disappearance tension is driven by LSND. Note also that this remains true when MiniBooNE is removed, and therefore the result does not depend on the low-energy excess in MiniBooNE.
The only way to reconcile LSND would be to discard ( -) ν µ disappearance data altogether. Note that even if we remove all ( -) ν e disappearance data, the PG remains low, at 2.4 × 10 −4 . The reason is the non-trivial constraint on |U e4 | from the data sample we call ( -) ν µ disappearance (defined in table IV), see fig. 3. Remarkably, just using ( -) ν µ disappearance plus solar neutrinos pushes the PG already to 7.4 × 10 −6 . This demonstrates once again that our conclusion is independent of reactor neutrino data.
We observe from table VII that the PG gets nearly an order of magnitude worse when removing the gallium data. The reason is the slight tension between gallium and reactor data discussed in section III B. If gallium is removed, the ( -) ν e disappearance fit alone improves, and therefore the tension with appearance data increases.
Finally, we have also performed a slightly different PG test, by dividing the data into ν µ disappearance versus the combined ν e appearance and ν e disappearance data. This corresponds to the samples compared in fig. 5. Using LSND DaR+DiF data and free reactor fluxes we obtain a χ 2 PG = 23.4. According to table V, the common parameters in those two data sets are ∆m 2 31 , ∆m 2 41 , θ 14 , θ 24 , θ 34 . Therefore, χ 2 PG has to be evaluated for 5 dof, leading to a p-value of 2.8 × 10 −4 .

VIII. DISCUSSION AND CONCLUSIONS
We have presented an updated global analysis of neutrino oscillation data within a 3 + 1 sterile neutrino mass scheme. We have obtained two main results, which can be summarized as follows: 1. Reactor neutrino data show a 3σ preference for sterile neutrino oscillations with ∆m 2 41 ≈ 1.3 eV 2 and |U e4 | ≈ 0.1. This is driven by recent data from DANSS and NEOS and is based only on the relative comparisons of measured energy spectra and is therefore independent of predictions for the reactor neutrino fluxes and spectra. If flux predictions are taken into account, the preference for sterile neutrino oscillations in global ( -) ν e disappearance data increases to 3.8σ.
2. Constraints on ( -) ν µ disappearance have become exceedingly strong, due to recent data from MINOS/MINOS+ and IceCube. This leads to very strong tension between the anomalies in the appearance sector (LSND and MiniBooNE) and disappearance data. We find that appearance and disappearance data are incompatible, with a parameter goodness-of-fit test yielding a p-value of less than 2.6 × 10 −6 . This result does not rely on any single experiment in the ( -) ν µ sector and is robust with respect to theoretical predictions of reactor fluxes; the p-value remains at 3.8 × 10 −5 even if all reactor data are removed. The tension is dominated by LSND; the MiniBooNE anomaly plays a subleading role.
Our results rule out the sterile neutrino oscillation hypothesis as an explanation of the LSND and MiniBooNE anomalies, but it remains a viable option for the reactor and gallium anomalies.
Some comments are in order. Our conclusion in item 1 above is largely based on preliminary data from DANSS presented at conferences [25,26]. Our results are in agreement with another recent analysis done outside the DANSS collaboration [91]. However, those results will need to be supported by an official publication by the collaboration.
Throughout this work we have restricted ourselves to the 3 + 1 scenario, adding just one mass state at the eV scale. However, we expect that the tension between appearance and disappearance data cannot be resolved by adding more sterile neutrinos. This has been quantitatively investigated previously, e.g. [14,93]. There, it had been shown that adding more neutrinos does not relax the tension. The reason is that the quadratic suppression of the ν µ → ν e oscillation amplitudes by constraints on the elements |U ei | and |U µi | (i ≥ 4) from disappearance data remains equally true in scenarios with more than one eV-scale mass states. Therefore we expect that our conclusion concerning the sterile neutrino explanation of appearance anomalies remains qualitatively true also for more sterile neutrinos.
Finally, we remind the reader that a completely orthogonal set of constraints on eVscale sterile neutrinos comes from cosmology. The standard picture is that active neutrinos evolve into a superposition of active and sterile states at temperatures MeV. Hard, flavour-sensitive collisions mediated by W and Z bosons collapse these superpositions into purely active or purely sterile states, with the relative probability given by the activesterile mixing angles. After a large number of collisions, active and sterile neutrinos come into thermal equilibrium. Because of this, the vanilla 3 + 1 model appears to be strongly disfavoured by constraints on the number of relativistic species N eff at the time of Big Bang Nucleosynthesis (BBN) [112] and during the recombination epoch [113]. Moreover, constraints on the sum of neutrino masses, m ν from Cosmic Microwave Background and structure formation data disfavour extra neutrino species with masses 0.3 eV [113]. However, these constraints are model-dependent, and in non-minimal scenarios they can be weakened or absent. A full review of such scenarios is well beyond the scope of this work, therefore we only mention a few exemplary ones: in particular, mechanisms discussed in the literature include new interactions in the sterile sector [114][115][116][117], an extremely low reheating temperature [118], large neutrino-anti-neutrino asymmetries [119], late entropy production [120], and the presence of matter and antimatter domains during BBN [121]. It is also worth noting that the prevailing tension between local and cosmological determinations of the Hubble constant would be relaxed if N eff is somewhat larger than in the SM [122].
Note added: While we were finalizing this work, the STEREO collaboration announced first results from their search for short-baseline neutrino oscillations at the ILL reactor in Grenoble [123]. At the moment, we do not expect these new exclusion limits to have a significant impact on the preferred region for the combined reactor data and global ( -) ν e disappearance data yet.

Appendix B: Details of the IceCube Fit
The event numbers measured by the IceCube detector have been provided in a grid with 210 bins [52,53], which depends on the reconstructed muon energy E µ (logarithmically spaced in 10 bins ranging from 400 GeV to 20 TeV) and the reconstructed muon direction (linearly spaced in 21 bins from cos θ = −1.02 to cos θ = 0.24). We make the assumption that the reconstructed muon direction is the same as the direction of the initial neutrino. The predicted number of events in bin number (ij) (where i indexes cos θ and j indexes E µ ) is computed according to Here, φ atm,f ± (E ν , θ i , N 0 , γ, R π/K ) is the atmospheric muon neutrino (+) or anti-neutrino (−) flux, which depends on the true neutrino energy E ν , the neutrino direction θ i , and on the nuisance parameters N 0 , γ, and R π/K discussed below. It also depends on the theoretical flux model, indicated by the subscript f . The effective area A d eff,± (E ν , E j µ , θ i ) in eq. (B1) encodes the detector response to a ν µ (+) orν µ (−) with energy E ν and direction θ i . The IceCube collaboration provides A d eff,± (E ν , E j µ , θ i ) in the form of a three-dimensional array in E µ , cos θ (same binning as for the data), and E ν (200 bins logarithmically spaced between 200 GeV and 1 PeV) [52]. Separate arrays are provided for different assumptions on the Digital Optical Module (DOM) efficiency, indicated by the superscript d.
The muon neutrino and anti-neutrino survival probabilityP ± µµ is computed using GLoBES [124,125], including a low-pass filter to suppress fast oscillation and to account for the limited energy resolution of the detector. For the production height of the neutrinos we interpolate linearly between 28 km for horizontal neutrinos and 18 km for vertical neutrinos [51]. To model the attenuation of the neutrino flux due to absorption in the Earth, we multiply the oscillation probability by an exponential damping factor given by where X(θ) is the column density along the neutrino trajectory and σ ± (E) the inclusive absorption cross-section for neutrinos and antineutrinos, respectively. The factor (1 − P ± µµ (E, L)) accounts for the fact that only the active flavors interact with matter. This formula holds exactly only for an oscillation probability independent of the length of the trajectory. We make the assumption that in much of the parameter space the oscillations are either averaged out, or the oscillation length is so long that the probability is approximately constant along the trajectory. We have checked that our results do not depend significantly on this assumption.
In the published IceCube fit [52], systematic uncertainties are included either as discrete or as continuous nuisance parameters. The only discrete nuisance parameter in our analysis is the theoretical flux model. We found that out of the seven flux models considered by the IceCube collaboration, only two contribute significantly, namely the ones tagged "PolyGonato QGSJET-II-04" and "Honda-Gaisser". We therefore restrict our analysis to these two discrete models. Hence the index f in eq. (B1) runs from 1 to 2.
The continuous nuisance parameters can be divided into two classes: those related to the neutrino flux, and those related to the detector response and the optical properties of the ice. In our analysis we use the following atmospheric neutrino flux uncertainties: • the normalization N 0 . Formally we assume a large uncertainty of 40% on the normalization, but results are very similar for completely free normalization. Therefore we consider N 0 to be effectively unconstrained.
• the tilt of the energy spectrum, which is parameterized by including a factor (E/E 0 ) γ , with a 5% error on the power law index γ and a central value of γ = 0; • the ratio between the pion and the kaon decay contributions to the flux, R π/K , with an error of 10%; • the ratio between the neutrino and the anti-neutrino fluxes, R ± , with an error of 2.5%.
Out of the uncertainties associated with the detector response and the ice properties, we only include the uncertainty on the DOM efficiency. As stated above, the tabulated effective area is provided for four different models for the DOM efficiency. We interpolate linearly between the per-bin-prediction for each DOM model and allow the minimizer to choose the optimal superposition of DOM models. Concerning the ice properties, we restricted ourselves to the nominal model because effective areas for each DOM efficiencies are only provided for the nominal ice model. For each point in the parameter space a χ 2 value is calculated from the theoretical predictions and the experimental values by means of a log-likelihood function.
We have cross-check our IceCube fit with a second version of the analysis, which was developed completely independently. This analysis is not using the GLoBES software but is based on a dedicated probability code and it uses a partially different approach to systematics. The most noteworthy difference is the treatment of the discrete systematics. In our second implementation we restrict ourselves to only one flux model, the "Honda-Gaissermodel". Several other discrete systematics associated with the detector response are treated as continuous quantities, and their effects on the number of events are assumed to be linear. In detail, in our second implementation we use: • the DOM efficiency, where as nominal value we have used the table corresponding to 99% efficiency, and as 1σ deviation we have used the table corresponding to 95% efficiency; • photon scattering in the ice, where the 1σ deviation is defined from the table corresponding to a 10% increase with respect to the nominal response; • photon absorption in the ice, where the 1σ deviation is defined as a 10% increase in the absorption rate with respect to the nominal response; • the azimuthal anisotropy in the scattering length due to the dust grain shear; here the 1σ deviation is obtained from the data set denoted 'SPICELEA ice model'; • the optical properties of the ice column surrounding each string, where the 1σ deviation is obtained from the data set labelled 'SPICEMIE ice model'. This data set does not include hole ice effects.
Furthermore, in our second implementation, we average the oscillation probability over the altitude of the neutrino production point. The averaged probability is given by P ± µµ (E ν , θ) = dh P ± µµ (E ν , cos θ, h) κ ± (E ν , cos θ, h) , where P ± µµ (E ν , cos θ, h) is the unaveraged oscillation probability for a neutrino produced at altitude h and κ ± (E ν , cos θ, h) is the distribution of production altitudes, normalized to one [51].
We find good agreement between our two implementations, and between each of our implementations and the official IceCube results [52]. We therefore conclude that our IceCube analysis is robust.