Possible Interpretations of IceCube High-Energy Neutrino Events

We discuss possible interpretations of the 37 high energy neutrino events observed by the IceCube experiment in the South Pole. We examine the possibility to explain the observed neutrino spectrum exclusively by the decays of a heavy long-lived particle of mass in the PeV range. We compare this with the standard scenario, namely, a single power-law spectrum related to neutrinos produced by astrophysical sources and a viable hybrid situation where the spectrum is a product of two components: a power-law and the long-lived particle decays. We present a simple extension of the Standard Model that could account for the heavy particle decays that are needed in order to explain the data. We show that the current data equally supports all above scenarios and try to evaluate the exposure needed in order to falsify them in the future.


Introduction
The particle physics community around the globe invested a few decades in a variety of experiments in a coordinated effort to scrutinize the Standard Model (SM) of electroweak interactions. The scientific achievements turned out to be impressive. Neutrino flavor oscillations have been discovered and confirmed by a string of solar, atmospheric, reactor and accelerator neutrino experiments [1], challenging our understanding of neutrino properties and their role in nature. The discovery of the Higgs boson by the LHC experiments in 2012 [2], finally set the crowning glory of the SM electroweak symmetry breaking mechanism to experimental probing. The IceCube neutrino observatory, located in the South Pole, reported in 2013 28 neutrino candidates in the energy range from 50 to 2000 TeV, constituting at the time a 4.1σ excess over the expected atmospheric background [3]. This unexpected discovery of high-energy cosmic neutrinos herald the dawn of neutrino astronomy and drives intriguing questions. Where do these neutrinos come from? How were they produced? Can they shed a light on the long lasting quest for the origin of cosmic rays? Are they related to the elusive Dark Matter (DM)?
Since this first announcement many authors have investigated the possible origin of these events. Among the possibilities explored are viable astrophysical sources [4][5][6][7][8][9][10][11][12][13], a possible DM connection [14][15][16][17], a leptoquark resonance [18], the decay of massive neutrinos [19], the decay of a very heavy long-lived particle [20,21] or even the possibility that these events could be understood in terms of novel interactions that neutrinos have with the cosmic neutrino background [22,23]. Finally, in Ref. [24] the authors showed that assuming a simple isotropic astrophysical power-law spectrum for the neutrinos, the current data is consistent with neutrino having only the SM interactions.
Recently IceCube has updated their analysis to include three years of data for a live time of 988 days. They have now 37 neutrino events with energy from 30 TeV to 2 PeV [25], strongly disfavoring a purely atmospheric explanation at 5.7σ. Interesting enough, there are still no events in the energy range from ∼ 400 TeV to 1 PeV. This may simply be a statistical fluctuation or a signature from the underlying physics. Furthermore, the energy distribution indicates that there may be a cutoff in the spectrum at PeV energies.
In this paper we discuss interpretations of the IceCube high-energy excess events. Among the candidates so far examined the following two scenarios seem to be "standard" and relatively model-independent: (a) a power-law spectrum to which astrophysical sources are kept in mind, (b) decays of a heavy Long-Lived Particle (LLP), but with a background power-law spectrum component. We investigate here, in addition to these two, the possibility that (c) decays of a heavy LLP alone can explain the IceCube data in its entire energy region. The last possibility was also studied in Ref. [15,16]. With the current data, we show that all the three above scenarios (a) -(c) can fit the current IceCube data well. The next question, then, is how these three scenarios can be distinguished. We investigate this problem by examining simulated data of IceCube for the future to establish the necessary exposure time to differentiate the three scenarios above. This paper is organized as follows. In Section 2, we discuss the generic properties of the LLP that are required to explain the features of the IceCube events. In Section 3, we describe the various sources of neutrinos in IceCube required for our data analysis. In Section 4, we perform an analysis of the IceCube data in view of the three scenarios (a) -(c). In Section 5 we discuss how our model can be falsified with future IceCube data. In Section 6 we construct a LLP model that could accommodate the IceCube observations. We discuss the LLP abundance in our model and conclude that it does not have to be a dominant part of the DM content of the Universe. Finally, in Section 7 we make our final remarks and conclusions. This paper is completed with two appendices: In Appendix A, we give additional details on the statistical treatment we employed in this work while in Appendix B, we list the confidence intervals of the best fit parameters for various scenarios considered in this study.

Can a Decaying Long-Lived Particle Alone Explain the IceCube data?
The spectacular detection of 37 neutrino events in the energy range from 30 TeV to 2 PeV within the three years data set of IceCube disfavors, as already mentioned, a purely atmospheric explanation at 5.7σ [25]. Moreover, neutrino events are absent in the energy range from about 400 TeV to 1 PeV and beyond 2 PeV, while the assumption of an unbroken E −2 flux, would predict three events beyond 2 PeV [25]. So the non-observation of events at higher energies seem to suggest there is a cutoff at the PeV scale. Although at the moment this is still consistent with statistical fluctuations, here we opt for the exciting possibility that these features could arise from the decays of a single LLP. Clearly, more than one species of decaying LLP with different masses could as well describe these features. We will take the minimal point of view and stick to a single species of LLP in this work.
To accommodate the two features, the dip and the cutoff, described above, we need a decaying LLP, Y , with mass, M Y , at the PeV scale and lifetime, τ Y , with the following properties: (i) It has to be long-lived. 1 In order for Y to remain today, it has to have a lifetime at least longer than the lifetime of the Universe, i.e, τ Y > t 0 4.4 × 10 17 s. 2 (ii) It has a two-body decay to at least a SM neutrino in the final state.
Assuming the LLP to be non-relativistic, the two-body decay Y → ν α N with ν α a SM neutrino of flavor α = e, µ, τ and N can be neutrino or another SM singlet field, will produce a peak in the energy spectrum at M Y /2. The cutoff in the energy spectrum will be at M Y /2 as well. As shown in Ref. [16], electroweak (EW) corrections can in fact soften the peak and give a low energy tail. Intriguingly, they also showed that a peak structure can also result from the decay Y →ē e due EW corrections (e.g. cascade radiation of massive gauge bosons) [27,28]. In the current work, we take into account EW corrections in a simple way by extrapolating the results of [28] to higher masses. We know this is not strictly correct as the EW corrections in that paper were computed at leading order and so their results are only valid as long as the particle is not too heavy. However, the calculation of higher order corrections would be beyond the scope of this work.
(iii) It has to admit at least another longer decay chain to neutrinos. In order to produce neutrinos with a continuum energy spectrum at lower energies, the LLP requires a longer decay chain to neutrinos, Y → ... → ν α ν β ..., as considered, for instance, in Refs. [15,16,29]. For example, if Y is a scalar, we can have Y → µμ, Y → ττ , Y → tt, Y → 2h etc (we use the standard notation where h is the SM Higgs, t for top and so on). Using PYTHIA [30], we generated neutrino energy spectra from the decay of the LLP at rest with its mass at the PeV scale. Experimenting with various channels, we found that in order to account for the excess of neutrino events at lower energies (where the atmospheric explanation alone cannot explain the data), two-body decays such as Y → 2h as well as four-body decays, e.g. Y → tttt or Y → 4h, can do the job. The decay spectra will also be modified by EW corrections and this will be partially taken into account by us as described above (ii). In fact, these corrections, in general, tend to shift neutrino events to lower energies. As we will show in the work, even considering only (ii) Y → ν α N , we will also have a low energy tail due to EW corrections and red shift effect from extra-galactic contributions (see also [16]).
In Sections 3 -5, we will consider the LLP to be a scalar Y with three possible decay channels: Y → ν α N , Y → 2h and Y → 4h. Here we assume N is a fermion singlet which is not a SM neutrino. If the reader is curious about its identity, he or she can go directly to Section 6 where we present a consistent model which can realize such a particle with the required decay channels. For simplicity, we will assume that Y decays with equal branching ratios to neutrinos of all flavors and suppress the flavor index in what follows, i.e. Y → ν N . For the decays Y → 2h and Y → 4h, we use PYTHIA to generate the energy spectrum of the neutrinos taking into account of SM particles decay and hadronization. 3 Then we consider neutrino oscillations to properly account for the neutrino flavors which arrive at the Earth. Finally before proceeding to data analysis, we want to stress that the above choice is by no means the unique decay channels for a LLP which can describe the suggestive features of the IceCube data but simply a working assumption.

IceCube Data Analysis: A Brief Description
We will consider in our calculations three different sources of neutrinos in IceCube: LLP decays, an unknown extra-galactic source and cosmic ray air showers.
The contributions of LLP decays to the IceCube neutrino flux have two different components: a galactic and a diffuse extra-galactic contribution. The neutrino differential flux from galactic LLP decays can be calculated as [15] where the integral on s is along the line of sight and r 2 = s 2 + r 2 0 − 2 s r 0 cos(l) cos(b), with −90 • ≤ b < 90 • and −180 • ≤ l < 180 • . Here r 0 = 8.5 kpc is the distance from the Sun to the galactic center. The term (1/N )dN/dE ν is the normalized neutrino energy spectrum calculated using PYTHIA in the context of the LLP scenario proposed. For the galactic LLP matter density we use the DM Einasto density profile, The EW corrections to the neutrino energy spectrum are taken into account only for the decay channels Y → ν N and Y → 2h as described above (ii) while for the channel Y → 4h, since there is no straightforward way to implement such corrections, we ignore them keeping in mind that these corrections tend to shift neutrino events to lower energies.
with the standard choices r s = 20 kpc,ᾱ = 0.17 and ρ 0 = 0.3 GeV cm −3 . From this we can calculate the galactic neutrino flux contribution as dΦ ν dE ν gal = 1.7 × 10 −12 cm 2 sr s The extra-galactic contribution to the neutrino flux [31] is given by where y = 1 + z, z being the red-shift. This can also be written as where the numerical values of the cosmological densities Ω DM h 2 = 0.120, Ω M h 2 = 0.143, Ω Λ h 2 = 0.685 were taken from [32] and the critical density ρ c = 1.054 × 10 −5 GeV/cm 3 from [33]. Here we want to stress that although in the fit we have fixed Ω DM h 2 = 0.120 to be all the DM density, in principle the LLP can constitute only part of the DM. The lifetime of the LLP τ Y obtained from the fit will have to be multiplied by a factor of κ if the LLP constitute only a fraction κ to the DM density. Hence τ Y obtained from the fit is the upper bound on the lifetime of Y . We refer the reader to Section 6.3 for a discussion on the LLP abundance. Again the term (1/N )dN/dE ν is the normalized neutrino energy spectrum calculated using PYTHIA in accordance to the LLP decay modes of a particular scenario. In the above M Y and τ Y are parameters to be fit to the data. The cosmic unknown neutrino source contribution was estimated as a power-law similar to [24,33] but using the convenient parametrization where C 0 and s are parameters to be fit to the experimental observations. The total number of neutrino events expected in the n-th energy bin of IceCube, N (E n ), is calculated as where T is the exposure time, here 988 days [25], Ω = 4π is the solid angle of coverage, A(E ν ) is the effective area taken from [3], E i and E j are the lower and upper energy limits of the bin and the sum j is performed over the considered contributions to the flux in each case we analyze.  Table 1. Summary of best fit parameters and p-values for the hypothesis H 0 considered in this work. PL stands for power-law, C 0 is given in GeV cm −2 s −1 sr −1 and r νN is the branching ratio of the channel Y → νN .
Finally we will also take into account the background that arises from cosmic ray air showers, mainly from muon, π/K and charm decays, simply by using the digitalized numbers for the atmospheric background from the IceCube paper [25]. We have extrapolated this background to higher energy bins.

Fitting the IceCube Data in Three Scenarios
We compare three scenarios in fitting the three years of IceCube data [25]: (a) a single power-law, (b) a power-law and a two-body LLP decay and (c) pure LLP decays according to the LLP model described in Section 6.
In the scenario (a) we have two free parameters, s and C 0 (Eq. 3.6), in the scenario (b) we have an extra free parameter, the lifetime τ Y while the LLP mass M Y is fixed to be either 2.2 or 4.0 PeV and the branching ratio r νN = BR(Y → νN ) = 1. Finally, in the scenario (c) we will also have two free parameters τ Y and r νN while M Y will also be fixed to be either 2.2 or 4.0 PeV.
In order to estimate the best fit values of the parameters in each case we use the method of maximum likelihood by constructing a probability distribution function (pdf). We also compute the p-value associated with each hypothesis H 0 . The description of our statistical procedure can be found in Appendix A.
The summary of best fit parameters and the corresponding p-values for the hypothesis considered in this work are given in Table 1.

Power-Law Fit (H 0 = I)
In Figure 1 we show the best fit curve for an unbroken power-law spectrum fit to the data. Although it is at this point unclear if this power-law behavior can be explained by a single astrophysical type of source, this hypothesis is the simplest one. We show the atmospheric background contribution (red curve), the power-law contribution (magenta curve) as well as the sum of the atmospheric background and the power-law signal fit (blue curve) for our best fit value at s = 2.02 and C 0 = 0.44 GeV cm −2 s −1 sr −1 , which corresponds to a p-value of 0.52. From this we see that the power-law contribution is only significant for energies > ∼ 100 TeV, the lower energy part of the spectrum is dominated by the atmospheric background. If this hypothesis is true there should be events in the gap between 400 TeV and 1 PeV and above 2 PeV in the future. An unbroken power-law spectrum such as this one may arise from optically thin galactic neutrino sources [9]. It has been pointed out that cosmic ray interactions with gas, such as expected around supernova remnants, seem to be able to produce smooth neutrino spectra [34]. Here a note is in order. Our best-fit value is compatible to IceCube spectral index fit [25] within 1 σ since at this confidence level 1.01 ≤ s ≤ 2.69. We show in Figure 2 the correlation between C 0 and the spectral index s for our fit. We can see that a small change in s can cause a significant change in C 0 and vice-versa, so the best fit values of these parameters are not at this point very significant.
Also we would like to comment on the new IceCube veto-based technique developed to study the neutrino spectrum between 10 and 100 TeV [35]. Using this new method they were able to better understand their background at lower energies. Although this could have an impact on the best fit values of the parameters in our analysis, we do not believe our conclusions would change. For completeness we present on Table 4 in Appendix B the 1, 2 and 3σ confidence intervals for s and C 0 for our fit. The confidence intervals for the fitted parameters can also be found on Table 4 in Appendix B. From these intervals we see that if one allows for both a LLP two-body decay as well as a contribution from a power-law spectrum, there is a minimum lifetime for the LLP compatible with the data, but longer lifetimes are clearly also possible because in this case in practice we get back to the single power-law case.

Pure Long-Lived Particle Decays Fit (H 0 = III, IV and V)
We have examined three scenarios for pure LLP decays. For H 0 = III we have the two comparable decay modes: Y → νN and Y → 4h. For H 0 = IV we have two comparable two-body decay modes: Y → νN and Y → 2h, whereas for H 0 = V we have only a single decay contribution Y → νN .
In Figure 4 we show the best fit curve for pure LLP decay into the two modes Y → νN and Y → 4h. On the left panel we show the case M Y = 2.2 PeV, for the best fit τ Y = 0.62 × 10 28 s and r νN = 0.14, corresponding to a p-value of 0.06. On the right panel we show the case M Y = 4 PeV, for the best fit τ Y = 0.73 × 10 28 s and r νN = 0.35, corresponding to a p-value of 0.46.
In Figure 5 we show the best fit curve for pure LLP decay into the two modes Y → νN and Y → 2h. On the left panel we show the case M Y = 2.2 PeV, for the best fit τ Y = 1.54 × 10 28 s and r νN = 0.57, corresponding to a p-value of 0.02. On the right    Clearly the cases with M Y = 2.2 PeV are very unfavorable, but the cases with M Y = 4 PeV are consistent with data. In fact, the fit with M Y = 4 PeV and the LLP decaying into νN and 2h presents the highest p-value of all cases studied. We see that the LLP decays start to contribute to the spectrum at energies > ∼ 70 TeV up to 2 PeV, so these scenarios predicts a sharp cutoff in the spectrum above 2 TeV that could be confirmed by future data.
In Figure 6 we show the correlation between τ Y and the branching ratio r νN for M Y = 4 PeV and hypotheses III and IV. We see the Y lifetime is slightly correlated with r νN , more so for IV than for III. Also when we have Y → 4h it is possible, even at 1 σ, to have r νN = 0 while when we have Y → 2h we need at least some Y → νN contribution in order to explain the data. Finally, we also investigate the single decay mode Y → νN for M Y = 4 PeV for which we get the best fit τ Y = 1.58 × 10 28 s and a corresponding p-value of 0.21. This case is shown in Figure 7. We see that in this case instead of just a peak at 2 PeV, there is a cascade tail of lower energies neutrinos due to EW corrections (partially also due to the extragalactic neutrinos from LLP decay that redshifts to lower energies). Nevertheless the LLP decay only starts to contribute at much higher energies, around 500 TeV, so up to that point the spectrum has to be entirely explained by the atmospheric background. Also this scenario predicts a cutoff in the spectrum after 2 PeV. Just by looking at Figure 7 we see that the data in the two bins bellow 500 TeV and the bin at 1 PeV seem to be higher than the theoretical prediction. You can find on Table 4 in Appendix B the allowed intervals of the parameters for all cases considered here.  To conclude this section we note that at this point the data seems to be equally compatible with a single power-law spectrum, a power-law + Y → νN spectrum or a spectrum due to a 4 PeV LLP decaying into νN and 2h or 4h.

Constraints from Gamma-ray and Antiproton Observations
We now briefly discuss the question of whether our LLP decay scenario for IceCube highenergy events is consistent with the limits imposed by diffuse gamma-ray and antiproton observations. Here, we limit ourselves to a sketchy description by just reviewing the results in the existing literatures. 4 As discussed by the authors of [36], the cascade gamma-ray bound is largely DM mass-independent at sufficiently high masses, because it is essentially bolometric in nature. It allows us relatively DM mass independent conclusion. Also the gamma-ray limits at very high masses is weaker than the limit for neutrinos which was obtained [36] assuming non observation of three years run of IceCube. When applied to our case, it means that models of LLP decay which explains IceCube neutrino excess would be free from the diffuse gamma ray bound, even though the new Fermi-LAT data at higher energies [37] makes the consistency more nontrivial [38].
More specifically in our case, the model we examined in this section is much safer than the generic LLP decay scenario, because the decay products, neutrinos, gammas, and electrons, etc. from Higgs boson is about 10 times less prominent compared to those from bb at low energies (see also [11]). It appears that the PAMELA antiproton limit is also cleared by our LLP scenario, as one can see in Figure 6 in [39]. By extrapolating threeorders of magnitude in the DM mass from those in Figure 6 the lifetime lower bound is well below 10 27 s.

Future Data Perspectives
In view of the fact that at this point the data seems to be compatible with many different cases we would like to establish the necessary exposure time in order to exclude a given hypothesis H j assuming the true explanation of the data is H i . We refer the reader to Appendix A for details of the statistical calculation.
In Figure 8 we show the curves for all the hypotheses best fit with the data extrapolated to 10 PeV but for the current exposure. From this we can clearly see that data at higher energies will be able to help to disentangle the various hypotheses. In Table 2 we show the results of our computations for all possible combinations of the hypotheses we consider. In the columns we place the true hypothesis H i while in the rows we place the hypothesis to be excluded H j . We define the exclusion time as T = f T × 988 days. In each cell we write the vector (f T , p × 100) in order to indicate the necessary exposure time and the corresponding p-value as a percentage computed at that time. Also, we use the notation (f T , p × 100) * to identify combinations that can be distinguish but in a very long time and (f T , p × 100) * * for those combinations that cannot be distinguish at any time.
We see that assuming a power-law spectrum, the solely LLP decay models can be excluded with twice the current data due to the absence of neutrino events beyond the cutoff energy. If we introduce a LLP component to the power-law it will take a longer time to distinguish from the solely LLP decay models because in this case the power-law spectral index turns out to be larger and predicts less events beyond the cutoff. A single power-law and a power-law plus the LLP decay is rather difficult to distinguish unless one (1.1, 2) (2.1, 0) (6, 1) (9, 5) (7, 3) - Table 2. Estimation of the exclusion time needed to eliminate a hypothesis. In each cell we write the vector (f T , p × 100). We use the notation (f T , p × 100) * to identify combinations that can be excluded but in a very long time and (f T , p×100) * * for those combinations that cannot be excluded.
constraints the lifetime of the LLP to a maximum value. If the future data is consistent with solely LLP decays it will be difficult to distinguish among the possible contributing decay mode scenarios, but the sharp cutoff in the spectrum will certainly indicate a LLP decay component.

A Model for Long-Lived Particle
In this section we present a consistent model for a LLP that could give rise to the channel decays used to fit the data in the previous sections. We must emphasize that although we rely on this particular model to fit the IceCube high-energy excess events, it is by no means the unique model that can explain the IceCube data with or without the power-law component. However, we hope that it serves as the existence proof of such models that can explain the data only by LLP decays. As seen in the previous sections, we can accommodate the IceCube data with the following decay channels: Y → νN , Y → 2h and Y → 4h. To accomplish this, we introduce two complex scalar fields, Y and X, that are singlets under the SM gauge group. For the fermionic sector, we introduce a new vectorlike pair of fermion doublets Ψ L = (ψ 0 L , ψ − L ) T and Ψ R = (ψ 0 R , ψ − R ) T , and another right-handed fermion singlet N R . This model has no SM gauge anomaly since Ψ L and Ψ R are vectorlike under the SM gauge interactions while N R is a singlet. In addition, we assume these new fields to be charged under a new U (1) X symmetry according to Table 3. In the following we will consider both possibilities that U (1) X is global or local, keeping in mind that if U (1) X is gauged, we need to introduce N L to cancel the U (1) X gauge anomaly. Table 3. New fields of the model and their respective assignments under SU (2) L × U (1) Y × U (1) X .

The Scalar Sector
Now we will describe the scalar sector of our model with the following scalar potential where H is the SM Higgs doublet. We assume that all the dimensionless couplings λ's are positive. The complex dimension one coupling µ XY can be made real by redefining X and/or Y fields. Without loss of generality, in the following, µ XY is taken to be real and positive. We further assume that the new physics scale X = w v = 174 GeV. As we will see below, the LLP is approximately Y R (the real part of Y ) and hence M Y is at the PeV scale. We also take M 2 Y > 0 such that no large vev will be induced for Y . Instead, a small vev for Y is induced through the µ XY term: where we have assumed the condition µ XY w 2 /M 3 Y 1. As we will see later, this condition is always fulfilled due to the long-lived requirement of the LLP. In fact if µ XY → 0, there will be a Z 2 symmetry such that: Ψ L,R → −Ψ L,R and Y → −Y (see Eq. (6.15)). Hence µ XY controls the lifetime of our LLP. A small µ XY is technically natural since in the limit µ XY → 0, there is an enhanced symmetry U (1) 2 which corresponds to independent phase rotations of X and Y .
U (1) X is spontaneously broken when X acquires a vacuum expectation value (vev) w. If the U (1) X is a global symmetry, we will have one massless Nambu-Goldstone boson (NGB). According to Ref. [40], if one considers 10 9 GeV w 10 12 GeV, the NGB will decouple much before the neutrino decoupling temperature T ∼ MeV and hence its temperature will red-shift to a much lower value and have small contribution to the energy density at the time of nucleosynthesis. Assuming the SM degrees of freedom, it is possible to have up to 37 NGBs. Alternatively, U (1) X can be gauged and instead of a NGB, we will have a massive gauge boson associated with U (1) X . In this case the w scale can be relaxed to a lower value.
In the following we write Then the mass matrices for the fields {h, Y R , X R } and {Y I , X I } are respectively and The longevity of LLP requires tiny µ XY , which from Eq. (6.2) implies small mixing between Y R and X R , h and Y R , and Y I and X I . On the other hand, the mixing between h and X R is controlled by the ratio For the SM Higgs boson mass M h = 125 GeV the still allowed branching ratio for the Higgs invisible decay width is in the ball-park of 20 % [41]. In our model, we have and constraining BR (h → X I X I ) 0.2, we obtain λ HX 0.01. (6.7) On the other hand, for a gauged U (1) X , there will be no such decay channel since the gauge boson associated with U (1) X breaking is much heavier than the SM Higgs with its mass given by For simplicity, we assume δ HX 1 such that the scalars Y R , Y I , X R and X I are approximately mass eigenstates with respective masses We assume M X R M Y R such that Y R cannot decay to X R but it can decay directly to 2h or indirectly to 4h through two off-shell X R . In the following, we will drop the subscript R for the LLP Y R and simply denote it as Y . From the context, there should be no confusion with the original complex field Y .
The decay widths for the decays of Y to scalars are given by Comparing the decay rates, we have where we define r Y ≡ M Y /w and we have used Eq. (6.9) for the mass M X R and also Eq. (6.2). Taking 10 9 GeV w 10 12 GeV and M Y ∼ 10 6 GeV, we have 10 −6 r Y 10 −3 . With the assumption M X R M Y , we have λ X r 2 Y . However with the assumption of small H − X mixing δ HX 1, we have λ HX /λ X λ H /(4λ HX ) M 2 h /(4λ HX v 2 ). Taking the maximum allowed value λ HX = 10 −2 from Eq. (6.7), we have λ HX /λ X 13. As we will see later in Section 6.3, under reasonable assumptions, λ HY and λ XY are required to be small, 10 −10 , in order not to over-produce Y . Taking all the above considerations into account, we can write down conservative upper bounds From the above, we see that in order to have Γ(Y → 4h) > Γ(Y → 2h), we need a coupling λ HY 7 × 10 −7 |λ XY − r 2 Y |. In addition, we also notice that the decay channel of Y to the NGB X I always dominates over the channel Y → 4h while it is generally faster than Y → 2h unless λ HY > |λ XY + r 2 Y |. Hence in the global U (1) X scenario, this channel Y → X I X I is usually the one which determines the lifetime of Y . 5 Requiring τ Y > t 0 4.4 × 10 17 s, we obtain using Eq. (6.10) GeV. (6.13) The constraint above does not apply in the gauged U (1) X scenario since the U (1) X gauge boson is assumed to be a lot heavier than Y . From Section 4, we see that the lifetimes which fit the data fall in the range τ Y ∼ 10 27−29 s, assuring Y to be long-lived. In this case, we can constrain µ XY using Eq. (6.11) and obtain µ XY 2.6 × 10 −19 M Y 10 6 GeV 5/2 10 10 GeV w 2 10 −11 λ HY GeV. (6.14) The fact that the bound (6.14) is smaller than the bound (6.13) and not the other way around is interesting and also crucial. It implies that in the global U (1) X scenario, if we can fit the neutrino flux to explain the IceCube excess, despite having a dominant decay channel of Y to NGB, the long-lived requirement on Y is automatically fulfilled. In all cases, we see that µ XY is constrained to be very small and the condition in obtaining the induced vev (6.2) is always valid.

The Fermionic Sector
Now we describe the new fermionic sector of our model. With the introduction of a pair of vectorlike fermion doublets Ψ L , Ψ R and a right-handed singlet N R , we have the following new terms where L = (ν L , e L ) T are the SM lepton doublets with the flavor index suppressed and H = iσ 2 H * with σ 2 the Pauli matrix. We assume that M Ψ > PeV such that Y cannot decay into it. After electroweak symmetry breaking, a mixing of the new fermions with the SM leptons will be induced such that where we have defined the mass matrices with y e the 3×3 SM charged lepton Yukawa matrix and y Ψ a 3-column vector. Without loss of generality, we can choose a basis where y e =ŷ e is diagonal and real. Diagonalizing the mass matrix for charged leptons above, due to the very small u/M Ψ , the charged leptons mass eigenstates are still m α ≡ (ŷ e ) αα v to a good approximation. For the neutral leptons, since we have introduced only one N R , there is only one massive active neutrino with Dirac mass given by As we will see shortly, the longevity of Y implies an extremely small contribution to neutrino mass. This could be easily modified to accommodate the neutrino oscillation data for example by introducing two other heavy right-handed SM singlets uncharged under U (1) X or more generally by introducing a dimension five Weinberg operator [42].
Since this is not directly relevant for our current study, we won't pursue it further.
Next the leptonic decay widths of Y to charged and neutral leptons are, respectively, 20) and Taking the ratio of these widths we have Assuming the dimensionless couplings are of the same order and considering, for example, m τ /v, the ratio above is still suppressed by very small u/M Ψ . Hence the decays of Y into neutrinos will always largely dominate over the decays into charged leptons and can be ignored. Using Eq. (6.19), the total decay width of Y to neutrinos can be rewritten as Requiring the lifetime of Y to be longer than the age of the Universe t 0 , we have In order to maximize the contribution to neutrino mass, we take for instance r Y = 10 −6 and µ XY = 10 −17 GeV (see Eqs. (6.13) and (6.14)), we still only have m ν 10 −19 eV. We can also translate the bound (6.22) into a bound for the dimensionless couplings using Eq. (6.19) as follows α |(y Ψ ) α | 2 y ν 6 × 10 −20 M Ψ 10 6 GeV . (6.23) As we will see in Section 6.3, for M Ψ ∼ PeV, we need (y Ψ ) α 10 −11 in order not to over-produce Y and this in turn implies y ν 10 −9 .
Finally, we would like to comment on possible phenomenological constraints on the fermion sector. In Ref. [43], it was shown that the contributions to electric dipole moment of the electron and charged lepton flavor violating processes are much suppressed beyond the current bounds if new heavy vectorlike leptons have masses beyond ∼ 100 TeV. In our scenario, besides having M Ψ 100 TeV, the constraints on the model parameters from the requirement of the longevity of Y render these phenomenological constrains completely irrelevant.

Does the Long-lived Particle Constitute Most of the DM?
Here we would like to study the abundance of the LLP in our model. The abundance of the LLP Y is bounded from above by the DM abundance of the Universe. Clearly the abundance of a PeV LLP depends on the re-heating temperature T RH of the Universe after inflation. If T RH M Y , no Y would be generated. If T RH ∼ M Y , some amount of Y would be generated. If T RH M Y , it is possible to generate significant amount of Y . Here we will consider the last and most constraining scenario and determine the constraints on our model parameters in order not to over-produce Y . First let us estimate what is the upper bound on a PeV Y abundance. From Planck 2013 measurement, the ratio between DM over baryon densities are [32] where we denote Y x ≡ n x s, the x number density normalized by the entropy density s = (2π 2 /45)g T 3 . During the radiation dominated period of the Universe, the relativistic degrees of freedom g = 106.75, assuming only the SM particles. Taking the baryon mass M B = 1 GeV and Y B = 8.8 × 10 −11 , we have Hence the above is the upper bound for the abundance Y Y of our LLP Y of mass M Y . First, let us discuss the possibility to have the correct Y Y in the thermal freeze-out scenario. In this scenario, the scattering processes Y Y ↔ HH, Y Y ↔ L L and Y Y ↔ X I X I (the last scattering process is relevant only for global U (1) X scenario) as shown in Figure 9 are fast to keep Y in thermal equilibrium at T M Y . Since the electroweak symmetry breaking remains unbroken, L and H refer to the SM lepton and Higgs doublets respectively. For gauged U (1) X , if T RH M g with M g the U (1) X gauge boson mass, this gauge interactions can also keep Y in thermal equilibrium. The final Y Y can be estimated from the time when these scattering processes freeze out. In the following we will define the temperature as z ≡ M Y /T . Due to fast reactions which keep Y in equilibrium, Y Y is equal to its equilibrium abundance Y eq Y (z) = 45/(2π 4 g )z 2 K 2 (z) with K 2 (z) the modified Bessel function of type 2 until the time of freeze-out at Such a late freeze-out implies large annihilating cross sections. We found that for M Y ∼ PeV, such large cross sections cannot be obtained if λ HY and y Ψ (also λ XY for global U (1) X case) are bounded to be perturbative i.e.
4π. Since thermal freeze-out is not a viable scenario, for gauged U (1) X , we have to ensure that the gauge interactions between Y and U (1) X gauge boson are not in thermal equilibrium. Since the gauge coupling is generically large, we can only suppress this interaction by assuming T RH much smaller than the mass of the gauge boson M g such that there is no gauge boson for gauge scatterings to take place.
In the following, we will study the scenario in which Y abundance is obtained from the freeze-in scenario where Y is very weakly coupled to other particles in the thermal bath and is generated from the interactions with those particles. With the assumption T RH M g , the processes we have to consider are shown in Figure 9. We only show t-channel scattering processes and it is understood that in the calculation, one should also include the u-channel scattering processes. For gauged U (1) X , we have to consider processes in the top two rows of Figure 9: For global U (1) X , we also have to take into account the scattering with the NGB and consider two additional scatterings HH ↔ X I X I and Figure 9. The scattering and decay processes which help to populate Y abundance in the early Universe. For gauged U (1) X , only the processes in the top two rows are relevant while for global U (1) X , we also have to include the scattering processes with NGB in the last row. For the t-channel scattering processes, it is understood that we also have to include the u-channel scattering processes.
The SM particles H and L are necessarily in thermal equilibrium due to fast gauge interactions. If T RH M Ψ , the vectorlike fermion doublets Ψ L , Ψ R which are charged under SU (2) L × U (1) Y will also be in thermal equilibrium due to fast gauge interactions. If T RH M X R , taking the value allowed by invisible Higgs decay constraint (6.7) λ HX = 10 −2 , X R will also be thermal equilibrium due to the fast scatterings HH ↔ X R X R . Finally for the case of global U (1) X , taking λ HX = 10 −2 , X I will also be in thermal equilibrium. On the other hand, Y will be populated due to the scatterings H, L and X I and the decays of Ψ R and X R as shown in Figure 9. We can write down the Boltzmann equation to describe the generation of Y where γ a is the thermal averaged reaction density for the corresponding process a, H = 4π 3 /45 √ g T 2 /M Pl is the Hubble expansion rate with the Planck mass M Pl = 1. 22 × 10 19 GeV. In writing the equation above, we have also taken H, L , Ψ R , X R and X I to be in thermal equilibrium. In fact, since y Ψ 1 (weakly coupled) and γ L L ↔Y Y ∝ y 4 Ψ while γ Ψ R ↔ L Y ∝ y 2 Ψ , we will ignore the sub-dominant scattering process L L ↔ Y Y . In addition with Y Y Y eq Y , we can further simplify Eq. (6.26) by dropping the terms with Y Y /Y eq Y . Finally we obtain a very simple Boltzmann equation In the above the thermal averaged reaction densities are given by In the solution above we have ignored the phase space factors in the decays which are negligible when a Ψ , a X R 1. In Eq. (6.32), the contributions in the square bracket are in the following order HH ↔ Y Y , Ψ R ↔ L Y , X R ↔ Y Y and X I X I ↔ Y Y . The last contribution is only relevant for the global U (1) X scenario. We have verified the solution above by numerically solving Eq. (6.26). All in all, we require |y Ψ |, λ XY , λ HY 10 −11 − 10 −10 in order not to have Y exceeding the PeV DM abundance Y DM ∼ 10 −16 as in Eq. (6.25).
Finally we would like to reiterate that in the fit of Sections 4 and 5, we have assumed the LLP constitutes all the DM. If the LLP constitutes a fraction κ of the DM, then its lifetime has to be shorter by a factor of κ in order to maintain the observed flux by IceCube. Hence, we cannot choose an arbitrarily small κ and at some point when κτ Y < t 0 , it will cease to be the LLP. Since τ Y ∝ µ −2 XY (see Eqs. (6.10)-(6.12)), to obtain κτ Y , µ XY has to increase by factor of 1/ √ κ. In the gauged U (1) X scenario, we are allowed to take κ as low as ∼ 10 −10 . On the other hand, for the global U (1) X scenario, due to the fast decay of Y into NGB, we require κ 10 −4 which can be inferred from Eqs. (6.13) and (6.14). In other words, the LLP has to constitute at least about 0.01 % of the DM in the global U (1) X scenario.

Conclusion
The serendipitous discovery of high-energy extraterrestrial neutrinos by IceCube inaugurates an extremely exciting era for neutrino astronomy. It is clear that the neutrino candidate events above 60 TeV cannot be explained by atmospheric neutrinos alone. Moreover, the absence of events in the region 500 TeV -1 PeV and above 2 PeV seem to suggest a neutrino spectrum beyond a single power-law. This will have to be confirmed or rejected by future data.
We investigate the possibility that the IceCube neutrino events with energies from 30 TeV to 2 PeV can be explained by (a) solely a power-law spectrum, (b) decays of a PeV scale LLP with power-law spectrum component and (c) solely decays of a PeV scale LLP. For scenario (c), we study a simple scenario where a scalar LLP, Y , decays dominantly into Y → νN , Y → 2h or Y → 4h. We present a simple extension of the SM where two extra complex scalars, singlets under the SM group, a vectorlike pair of fermion doublets and a right-handed fermion singlet are charged under a global or gauge U (1) X . We show that this model can give rise to a LLP that can decay dominantly according to the above modes while the LLP abundance is generated through freeze-in mechanism. In particular, we find that if U (1) X is a global symmetry, such LLP has to constitute at least 0.01% of the total DM of the Universe due to the long-lived constraint.
Using the current IceCube three-year data set of 37 events [25], we find that all the three scenarios (a)-(c) above fit the data equally well due to the low statistics. Our results for these fits are summarized in Table 1. In order to disentangle various scenarios above, we simulate the future IceCube data based on current hypotheses up to neutrino energies of 10 PeV. The exposure time needed to disentangle various scenarios by future data is summarized in Table 2. Assuming a power-law spectrum, the solely LLP decay models can be excluded with twice the current data. Other combinations will take longer time and in some cases it will be impossible to distinguish between hypotheses.

A Some Details on the Statistical Treatment
For the statistical treatment we mainly follow [44,45]. Through this work we consider several hypotheses in order to explain the data observed by IceCube. In this appendix we refer to any of these hypotheses as H 0 (θ i ), where θ i are the free parameters of the hypotheses. In order to estimate the preferred values of θ i , we maximize the likelihood function with respect to θ i , where f (k n ; µ n (θ i )) is the Poisson probability distribution function for measuring k n events in the bin n assuming that the mean value is given by µ n (θ i ). Notice that the results obtained from this procedure are equivalent to those obtained from the minimization of the function χ 2 (θ i ; k n ) = −2 ln(L(θ i ; k n )). Therefore, for simplicity and numerical stability, we normally use χ 2 (θ i ; k n ) to report our results. The estimators of the parameters θ i are defined asθ i , the minimum of the statistic is given by χ 2 min ≡ χ 2 (θ i , k n ) and the mean values of the hypothesis H 0 (θ i ) are given by µ n (θ i ). In the following we show how we compute the p-value associated to the observed χ 2 min and the corresponding confidence intervals for the variables θ i .

A.1 p-value
In order to quantify the level of agreement or incompatibility of the hypothesis H 0 (θ i ) with respect to the current data, it is useful to evaluate the probability of obtaining the current value of χ 2 min assuming that the data is indeed generated by the hypothesis H 0 (θ i ). In practice, we must construct the probability distribution function, f (t|H 0 ), with t = χ 2 min (θ i ,k n ) andk n ∼ P ois(µ n (θ)). In general, this function is given by the normalized frequency distribution of t obtained from several random realizations of H 0 (θ). Using f (t|H 0 ) we are able to compute the p-value associated to χ 2 min , which is defined as the probability to find t in the region of lesser or equal incompatibility with H 0 than the level of incompatibility observed with the current data, Thus, a very small value of p implies that the observed level of incompatibility is quite unlikely to be found, which also suggest that H 0 is not a good representation of the data. When this is the case, it is said that the hypothesis H 0 is rejected at (1 − p) confidence level.

A.2 Intervals
For simplicity, we compute the intervals using a Bayesian point of view. From Bayes theorem, the probability density function of the parameters θ i can be obtained from the product of the likelihood function L(θ i ; k n ) and the joint prior of the parameters θ i , which we define as π(θ i ). We assume that each prior is constant in a finite interval and null otherwise. Thus, the domain of the variables θ i is given by a finite hyper volume U , which is obtained from the product of the priors. The normalized p.d.f. for θ i ∈ U is given by, From this expression is direct to compute the coverage probability of a region of parameters V ∈ U , since we just have to integrate the previous expression on the region V. However, it is not direct to obtain an interval such that the coverage probability is some fixed number (1 − α), because this procedure involves an integral equation. Furthermore, it is not easy to find a rule to discriminate within degenerate solutions. These problems can be addressed in a simple way when we consider the discrete form of Eq. (A.3).
Therefore, we generate a quite big sample of points θ i covering the domain U . In general we use N = (10 4 , 500 2 , 200 3 ) when the number of free parameters is n = (1, 2, 3) respectively. In the same procedure we are able to compute the pairs {θ i , L(θ i ; k n )} j , with j = 1..N . Using the elements of this set we can compute the coverage probability of some particular region, but first we need to fix the rule to discriminate within degenerate intervals. For instance, we require that the resulting interval contains the most likely points. This is analogous to considering a symmetric interval around the center of a Gaussian distributed variable. To implement this requirement, we just order the set {θ i , L(θ i ; k n )} j in descending order in L(θ i ; k n ), then the new set satisfies the condition L(θ i ; k n ) j ≥ L(θ i ; k n ) k with k = j + 1..N . Finally, the hyper volume with (1 − α) coverage probability is given by the set {θ i , L(θ i ; k n )} j with j = 1..M, such that where we have assumed that the volume element ∆θ i is constant everywhere. The final procedure only involves a loop on the variable j that runs in unit steps until the previous equation is violated, which determines the value of M . The intervals at some confidence level for a given variable θ i correspond to the extreme values of the set {θ i , L(θ i ; k n )} j with j = 1, ..., M .

A.3 Exclusion time
In this section we compute the necessary exposure time in order to exclude a given hypothesis H j assuming that the hypothesis H i is the true explanation of the current data. This procedure is based in the computation of p-values between pairs of hypotheses at different times. Below, we describe the algorithm step by step:

B Confidence Intervals
The summary of the confidence intervals for each hypothesis H 0 are showed in Table 4.  Table 4. Confidence level intervals. The lifetime τ Y is in units of ×10 28 s, the normalization of the power-law C 0 is in units of GeV cm −2 sr −1 s −1 and r νN is the branching ratio of the channel Y → νN . The intervals with format [a, b] * are scanned in logarithmic scale.