Testing cosmic ray composition models with very large volume neutrino telescopes

The composition in terms of nuclear species of the primary cosmic ray flux is largely uncertain in the knee region and above, where only indirect measurements are available. The predicted fluxes of high-energy leptons from cosmic ray air showers are influenced by this uncertainty. Different models have been proposed. Similarly, these uncertainties affect the measurement of lepton fluxes in very large volume neutrino telescopes. Uncertainties in the cosmic ray interaction processes, mainly deriving from the limited amount of experimental data covering the particle physics at play, could also produce similar differences in the observable lepton fluxes and are affected as well by large uncertainties. In this paper we analyse how considering different models for the primary cosmic ray composition affects the expected rates in the current generation of very large volume neutrino telescopes (ANTARES and IceCube). We observe that, a certain degree of discrimination between composition fits can be already achieved with the current IceCube data sample, even though in a model-dependent way. The expected improvements in the energy reconstruction achievable with the next generation neutrino telescopes is be expected to make these instruments more sensitive to the differences between models.


Introduction
When a primary cosmic ray (CR) reaches the top of the atmosphere, it penetrates it until it collides with an air nucleus. This usually happens at an altitude of about 10-20 km [1]. An extensive cascade of particles, showering down in the atmosphere, results from this interaction. The most abundant hadronic products are pions and kaons, neutral or charged. Neutral pion decays induce an electromagnetic cascade in the atmosphere; charged mesons can decay leptonically. These leptons constitute the entirety of the atmospheric events that can be detected in a very large volume neutrino telescope (VLVνT), placed at large depths under sea, lake water, or ice. Even though these atmospheric leptons represent the foreground to the cosmic searches to which VLVνTs are devoted, these experiments can provide an insight into the study of atmospheric lepton fluxes and thus into CR physics.
Extensive air-shower arrays indirectly measure the cosmic ray flux at Earth by observing the shower products arriving on the ground, coming from the electromagnetic component and the hadronic component -the latter being addressed by measuring muons penetrating underground. The observed distributions of these particles can be correlated to the primary energy and species. Indeed, the direct measurements obtained by satellite and balloon flights, using magnetic deflection, allow accessing the mass spectrum of primary CR up to some tens of TeV; the per-species power-law behaviour could actually be extrapolated to higher energy but then the spectral features present above hundreds TeV/few PeV energies (namely the CR knee) appear. A proper extrapolation is thus not bound by direct measurements since, because of the low flux, a proper measurement is not fully achievable. Indeed, in this energy range, CR air showers can be only observed by ground-based experiments, where in general only a distinction between a lighter and a heavier component could be obtained. The behaviour of each individual CR species around the knee and beyond is affected by large uncertainties, and current measurements do not completely agree in the description of the observed spectra. An overview of these factors, as well as of those measurement pointing at possible features in the CR spectrum will be presented in section 3 and their effect on the observable lepton fluxes in section 4.
The flux of atmospheric neutrinos can be measured, from sub-GeV to few hundreds-TeV energies, with the current generation of neutrino detectors and, in particular, with VLVνTs, described in section 2. This flux is the convolution of the primary cosmic ray flux at the top of the atmosphere and the neutrino yield per primary particle. This yield is affected by several factors, mostly related to the efficiency in producing the neutrino mother particles, i.e. hadrons decaying into leptons. At a first approximation, the neutrino flux can be written as where the first term AE −γp ν accounts for the primary CR spectrum, assumed e.g. to behave as a power law with spectral index −γ p . The sum then runs over the neutrino mother particles m: the coefficients B m and c m are related to the physics of the hadronic interaction at play. Finally, the term m is the so-called critical energy, for which the interaction length equals the decay length of the particle [1]; above this energy all the mother particles would interact before decaying, thus reducing the quantity of neutrinos at the highest energies and making the spectrum steeper.
For muon neutrinos below 100 TeV, m = {π, κ}, almost exclusively; this represent the conventional atmospheric neutrino flux. At higher energy, shorterlived particles such as charmed hadrons start contributing to the overall flux, producing the harder prompt atmospheric neutrino component. More details will be given in section 4.
Being the decay kinematics well-known, the overall flux is determined mostly by the primary flux, and by the production efficiency and interaction probability of the parent hadrons. Our knowledge of the former is affected by large uncertainties, both for what concerns the spectral behaviour of the individual components of the flux, and the nature itself of the primary particles species. The latter, i.e. the description of the primary hadronic interaction and the description of the further interactions of these hadrons as the cascade of particles showers-down in the atmosphere, is rather uncertain because the interactions at play are low-transferred-momentum processes, not fully studied at accelerator experiments since the pseudo-rapidity ranges at play, are largely outside the reach for current detectors at large hadronic colliders, and similarly the parton distribution functions used in the description of these interactions are affected by large uncertainties, both in the theoretical models and in the experimental results.
In order to determine the effect of different CR composition models in the measurement of lepton fluxes at VLVνT, a simplified analysis is presented in this paper, taking into account the current generation of neutrino telescopes, built or currently being built. A Monte Carlo study of the possible performance of VLVνTs in the study of atmospheric neutrino fluxes is set-up in section section 5.1. The statistical tools used in the analysis, including an overview of the systematic effects affecting the possible measurement are presented in section 5.2. The results of this study are shown in section 6.

Very large volume neutrino telescopes
High-energy neutrinos 1 can be detected in a large volume of transparent medium (ice or water) by observing the Cherenkov light emitted by the relativistic particles induced by neutrino interactions. This light can be detected with a threedimensional array of photo-multiplier tubes (PMTs), being these PMTs hosted in pressure-resistant glass spheres and distributed along a certain number of vertical strings to cover a large volume, large enough to allow the weaklyinteracting-only neutrinos to produce a measurable amount of events in the detector.
Charged current (CC) weak interactions of muon neutrinos produce a longlived relativistic muon, that can be identified as a straight track passing through the detector volume. Neutral current (NC) neutrino interactions, as well as CC interactions of electron neutrinos, would induce hadronic and electromagnetic particle showers in the medium, which can be identified as almost-point-like light sources, given the short elongation of these particle cascades with respect to the spacing between photo-sensors. Also tau neutrino interaction usually produce shower events, even though in CC interactions the tau lepton can travel a few tens of meters and be properly identified, allowing the observation of two separate particle showers.
Thanks to the long lever-arm of the passing-through track, track-like events provide sub-degree angular resolution, while the almost-spherical pattern of light in the detector given by cascades does not allow an extremely precise direction reconstruction (few degrees in water, roughly ten degrees in ice). On the other side, cascade-like events allow for an almost calorimetric measurement of the deposited energy, since most of the light is emitted within a limited amount of space. In contrast, the energy resolution in the case of track-like events is limited by the fact that only a part of the muon track is observed within the instrumented volume, and this muons usually arrives at the detector after having travelled large distances in the medium and thus after having lost a significant amount of its original energy.
The measurement of the muon energy is related to the energy loss processes at play as the muon travels through the detector. Light is emitted along the track, and above 500 GeV in water/ice the muon loses most of its energy because of radiative processes, which are proportional to its energy. Radiative losses induce cascades of charged particles along the track, whose light yield is proportional to the energy lost in the event. Thus, a measurement of the energy loss along the track dE/dX provides an estimate of the muon energy.
Track-like events in VLVνTs allow for a selection of a very pure sample of muon neutrino-induced events; in addition, the long path travelled by the muon increases the volume into which the neutrino interaction can take place, thus significantly increasing the number of detectable events. The larger statistics accessible with this kind of events can be extremely valuable for the study of atmospheric leptons. Event samples made of cascades present more limitations in terms of fiducial volume and background contamination.

Currently active VLVνTs
Four large volume neutrino telescopes are currently taking data: IceCube at the geographic South Pole and ANTARES in the Mediterranean Sea have been completed about 10 years ago. Baikal-GVD, in Lake Baikal, Russia, and the KM3NeT/ARCA telescope in the Mediterranean Sea, off-shore the coasts of Sicily, are currently being build.
The IceCube detector is composed of 86 in-ice, 1-km-long, vertical strings, each holding 50 10-inch PMTs, covering a volume of 1 km 3 at a depth between 1500 and 2500 m in the Antarctic Ice Shell. The IceCube collaboration has provided the first observations of a high-energy cosmic neutrino signal, using different data samples [3][4] [5] and of a first candidate cosmic source of high energy neutrinos [6].
The ANTARES telescope is made of 12 vertical lines, each holding 75 10inch PMTs, instrumenting ∼0.01 km 3 of deep-sea waters off-shore the coasts of Southern France, at a depth of 2000-2500 m [7]. Having been taking data continuously since 2007, ANTARES provides the longest neutrino telescope data-stream in the Northern Hemisphere. The currently available data-set has already let to significant results in searches for neutrino emissions from the Southern sky [8][9] as well as the observation of a mild excess of high-energy neutrino which could be attributed to a diffuse cosmic neutrino flux [10].
The Baikal-GVD experiment, currently being deployed in Lake Baikal, has started data taking in 2015 and is at the moment the largest-volume neutrino detector in the Northern hemisphere [11]. Analyses of shower-like events have already shown results in searches for cosmic neutrino fluxes [12]. At the moment no results are available for muon neutrinos, thus the possible performance of the Baikal-GVD detector is not studied in this paper.
The KM3NeT Collaboration aims at building a km 3 -scale neutrino detector (ARCA) off the coasts of Sicily, at a depth of 2800-3500 m [13]. The first string, with its 18 Digital Optical Modules, made of 31 3-inch PMTs, is currently taking data. The goal of KM3NeT/ARCA is to do all-neutrino-flavour astronomy, and discover galactic neutrino emitters. Along with this, a highly significant observation of an IceCube-like diffuse signal is expected within a short time with the full detector, thus allowing for an independent confirmation of this signal. KM3NeT/ARCA is expected to outperform the current generation of neutrino telescopes in terms of angular and energy resolution as well as in the purity of the neutrino sample [14].
3 High-energy cosmic rays at Earth The per-particle spectra of primary cosmic rays can be measured with rather good precision using balloon-borne experiments, as well as with experiments on satellite in low Earth orbit. These measurements span energies from the sub-GeV range to several tens of TeV. Elements heavier than iron can also be identified and their spectrum can be measured. A review of recent measurements can be found in reference [15]. As a matter of fact, the per-particle spectra follow power-law behaviour over the full range of energies covered by direct CR measurement, except for small kinks whose significance is not too relevant here.
Due to load limitations for satellites and balloons, these experiments cannot be too large and thus fail at measuring with the same precision the spectra of particles at very high energy/rigidity. For primary energies above a hundred of TeV, only indirect measurement of the CR flux are possible. These measurements are based on the observation of the cascade of particles induced by the CR interaction at the top of the atmosphere. Different technologies are exploited to measure the properties of the cascade of particles, either as it develops through the atmosphere, or as it arrives on the ground, or when these particles reach a certain depth -or, also, by means of a combination of these measurements. Indirect measurements go from 100 TeV to 100 EeV, but fail at identifing the nature of the CR nuclei on an event-by-event basis.
It is however possible to obtain, by means of statistical methods and using Monte Carlo simulations, an estimation of the CR composition at the highest energy as either light (i.e dominated by protons and helium nuclei) or heavier (whatever higher in atomic number). In any case these measurements are largely dependent on the CR interaction model as well as on the systematic effects in the estimation of the CR properties. The recent results from the Pierre Auger Observatory hint at a CR composition becoming heavier at the highest energies [16], even though some tension is present with the most recent results obtained by the Telescope Array Collaboration which prefer a lighter composition at the highest energies [17], above 10 19 eV.
At intermediate energies, around the knee region, several experiments have attempted to measure the composition on the basis of the shower topology and to perform per-particle spectral measurements; even though these measurements are as strongly model dependent on the hadronic interaction model considered, this has provided some hints on the behaviour of these components. In particular the KASCADE-Grande [18] and the ARGO-YBJ [19] have tested the composition of the CR spectrum around the knee and actually do find different spectral breaks for the the light component. It should be noted that these measurements are in general affected by energy scale uncertainties which could end-up mimicking such behaviours. The LHAASO experiment [20] is expected to provide high-precision measurement of the CR spectrum in the knee region [21] while Auger-Prime [22] should improve the knowledge of the very-high energy spectrum of CRs.
Various attempts have been made to reconstruct the all-particle flux in terms of its individual components, e.g assuming power-law shape extrapolation from direct composition and flux measurements combined with rigidity-dependent cut-offs to accomodate the behaviour in the knee region and beyond. An overview of these possible fits to the all-particle spectrum is provided in reference [23]. In general, several mass groups are considered for the primary composition and different fits become possible to describe the spectrum. In addition, different source populations can be assumed, inducing different rigiditydependent cutoffs. Some example are the so-called Hillas models (H3a and H4a, in the following, assuming 3 or 4 source populations) or the Gaisser-Stanev-Tilav models (GST-3gen or GST-4gen in the following). A more global approach is followed by the authors of the Global Spline Fit [24], where the constraints on the power-law spectral behaviour and rigidity dependent cut-offs are relaxed.
In general, given the same per-nucleon energy of the primary CR particle, heavier nuclei would generate more individual showers due to the fragmentation of the nucleus as it traverse the atmosphere. This would change the content of the overall air shower at the bottom of the atmosphere and subsequentely the amount of muons and neutrinos reaching large depth. The next section will show the results of neutrino flux simulations performed with different CR composition fit results.

Neutrinos in cosmic ray air showers
In order to get the neutrino flux induced by cosmic ray air showers at the surface of the Earth, thus before they can be detected in a VLVνT, this flux must be either computed analytically or obtained from a simulation of the CR shower. The standard framework used for a full simulation of the CR air shower is CORSIKA [25]. This framework allows for a detailed simulation of air shower with different possible assumptions, and the full propagation of particles through the atmosphere accounting for energy loss processes, as well as interactions. In addition, also the atmospheric profile can be customised, thus its influence on the final lepton fluxes can be taken into account.
Analytical solutions are possible for the computation of lepton fluxes through the atmosphere [1], even though these can only be considered approximations since some assumptions have to be made in order to make the calculation doable. The precision of these calculation is discussed in reference [26].
The MCEq software [27] has been developed to perform a precise computation of the evolution of particle densities along the cascade in the Earth's atmosphere. The cascade equations are solved by means of matrix computations taking into account the particle physics of user-selected hadronic interaction models such as e.g. SIBYLL 2.3c [28] or EPOS-LHC [29], as well as cosmic ray input fluxes [30]. Similarly, user-defined atmosphere density profiles can be used, as in the CORSIKA framework. These features will be used extensively here to benchmark the performances of VLVνT in measuring features of the neutrino spectrum at the detector.
Above 100 GeV and below a few hundreds of TeV the flux of atmospheric muon neutrinos is dominated by the so-called conventional component, induced by the decays of long-lived charged mesons (basically pions and kaons). In the 10-100 TeV energy range, the muon neutrino flux is almost entirely produced by kaon decays [31]. In this region of the energy spectrum, the conventional flux is asymptotically steeper than the primary CR spectrum by a factor ∝ E −1 because of the competition between decay and interaction processes due to the long lifetime of the mother particles [1]. Above these energies, the contribution of short-lived hadrons becomes dominant because these follow a harder energy spectrum, similar to that of the primary particle since their livetime is way too short to allow any of them to interact before decaying in the atmosphere, thus always producing neutrinos.
In figure 1 the expected muon neutrino flux for the conventional and prompt component is show for events from zenith equal to 90 • , 60 • and 0 • according to the H3a and GST 3-gen CR composition fits. A mid-latitude atmosphere is considered. Two effects are visible: since the prompt flux is constant over zenith while the conventional flux is largely zenith dependent, the influence of the prompt component is much more evident for vertical events than close to the horizon. A large difference between the two CR parameterisation appears in the 10-100 TeV energy range and above some tens of PeV; however, only the former would be visible in the current generation of neutrino telescopes since the neutrino detection rate above the PeV threshold is very low and dominated by the diffuse cosmic signal that, if isotropic as currently claimed and if extending up to those energies, would make any detector blind to this feature.
In figure 2-left the zenith-dependent relative difference between the conventional muon neutrino flux predicted when assuming the H3a CR composition fit versus the GST 3-gen is show. The relative difference also appears to be slightly zenith dependent and, since the neutrino zenith is properly reconstructed in neutrino telescopes, such features would still be present when considering the effect of the detector response. On the right of figure 2 the same kind of plot is reported, but now taking into account the differences induced when considering a different CR interaction model (in particular SIBYLL 2.3c compared to EPOS-LHC). In this case the overall effect is less significant, being of the order of few percent, against the difference coming from the CR fit which is of the order of 10-20%. Its influence will be considered sub-dominant here and an estimation of the contribution of this effect will be provided as a systematic uncertainty. It should be noted that in both plots the differences above 100 TeV would be smeared out by the existence of the cosmic neutrino flux, which would behave in the same way regardless of the neutrino flux model and thus produce null differences. Its influence will be discussed in the next section when considering the simulated analysis at a VLVνT.
An additional factor that could be considered when comparing predictions of atmospheric leptons is the nature of atmosphere in which the CR interaction takes place. Differences in temperature directly translate into differences in the air density and thus in the CR interaction target density, as well as in the density of the medium traversed by hadrons and leptons along the atmosphere column. In particular, in the case of polar atmospheres the differences between summer and winter can be remarkable and thus the zenith-dependent flux can be largely affected. This effect is expected to be milder at intermediate latitudes since the summer/winter asymmetry is less pronounced. These differences are significant in the IceCube analyses but will not be considered any further in this particular study. It should be however noticed that this is an effect that must be taken into account when considering a higher-level analysis with real data, which goes beyond the scope of this paper.    and IceCube [32] neutrino telescopes are shown as a function of the neutrino energy as solid lines. In the case of the IceCube detector also the zenith dependent ones are shown, as dashed lines, in four bins of the neutrino zenith angle. depends on both the neutrino energy and zenith. Neutrino telescopes are usually assumed to be homogeneously sensitive in azimuth. Being the atmospheric neutrino flux (at least for the conventional component) also zenith dependent, the event distribution at the detector must be computed considering both the energy and zenith angle of the incoming neutrino.
The IceCube effective area used in this work has been obtained from reference [32], already provided in bins of zenith angle. This area corresponds to the one after the final event selection, and can thus be directly used to compute the detected event rate. For ANTARES, a zenith-averaged effective area was used in the search for a diffuse flux of cosmic neutrinos [10]. We obtain the corresponding effective area in different zenith ranges assuming that the selection efficiency is negligibly dependent on the neutrino zenith and thus the zenith dependency is only due to neutrino absorption through the Earth, which can be approximately estimated. Even though this approximation is not really correct, it is sufficient for the scope of this paper.The zenith-averaged effective areas, plotted as the averaged between ν µ andν µ , are shown in figure 3. In addition, the zenith dependence of the IceCube one is also presented, to show the order of magnitude of the effect. The effective areas from KM3NeT/ARCA and GVD-Baikal for an atmospheric neutrino-dominated sample have not been released and thus will not be used in this work. It should be noted that both experiment are expected to reach a size similar to that of IceCube within the next decade, with comparable effective areas.
The event rates at the detector as a function of the true neutrino energy are obtained by multiplying the zenith-dependent atmospheric muon neutrino flux (as in section 4) by the effective area described here above, taking into account the observed solid angle and by the observation time. Since these distributions are obtained in the true neutrino energy while in VLVνTs, the neutrino energy is not directly measured for ν µ events a smearing of these distributions is required. Indeed, for almost all events, the resulting muon track is only partially observed. This muon may have travelled several kilometres before being detected and thus only part of its original energy is observable at the detector. Analogously, part of the neutrino energy is released in an hadronic shower which is not directly observable since it might also be kilometres away from the apparatus. Finally, the energy of the muon at the detector is affected by a some uncertainty since the energy estimation in VLVνTs is based on sampling energy losses within the instrumented volume -a quantity that is proportional to the muon energy above a TeV -which is a stochastic process. In order to obtain a model of these effects, the event rate distributions are smeared so that these energy-estimation-related uncertainties can be accounted for. Since the correct and complete treatment of these effects would require full Monte Carlo simulations of the detector, a simplified smearing is applied here; in particular this smearing corresponds to a Gaussian distribution with width equal to the reported energy resolution of ANTARES [33] and IceCube [34].This allows us to tentatively reproduce what the detected and estimated neutrino energy distributions should look like in a VLVνT.
In addition to the smearing effect due to the limited energy resolution, an energy scale uncertainty can affect the measurement. This effect corresponds to a systematic shift of the muon energy estimation, which could also be energy dependent (and with a non-linear dependence). This effect could be induced either by a limited knowledge of the light propagation properties in the medium or of the optical module efficiency, or by a combination of these effects. This systematic shift can be considered as an unknown unknown, i.e. a systematic effect that could affect data collected by a VLVνT and might not be present in Monte Carlo simulations, e.g. because of the lack of an energy calibration. Indeed, the most reliable energy calibration source in a VLVνT is the flux of atmospheric leptons itself and this energy scale effect can be in principle fitted under a certain number of assumptions when doing the measurement of the energy spectrum. Still, its limited knowledge can strongly affect any energydependent measurement such as that of the atmospheric or cosmic neutrino spectrum. Results will be presented here assuming that the energy smearing might either not have such a bias or it might be affected by it, thus the Gaussian smearing would have a certain width, given by the energy resolution, but also it might not be centred at zero.
In figure 4 an example of the energy estimator distribution is shown for different values of energy resolution and energy shift, after the simulation of 10 years of data taking of an IceCube-like detector. Having defined the log-energy resolution as E res = log 10 E reco E true (2) three values are considered, with E res = 0.15, 0.25 and 0.35 in the logarithm of the true energy, as well as an energy shift equal to 0.1 in the logarithm under the assumption that the resolution is 0.25. Changing the energy resolution modifies the spectrum making it smearing out differences and allowing more low-energy events to migrate to the high-reconstructed energy bins. This is due to the fact that the atmospheric neutrino energy spectrum is steeply falling and thus more events are expected at low energies with respect to the highest energies. For this reason, the probability that a low-energy event would pollute the high-energy queue of the distribution is higher. An energy shift would mostly affect the low-energy region, causing a shift in the turn-on regime, i.e. in the energy range where the detector effiency is lower; at the highest energies also this effect would be smeared out by the resolution.

Statistical analysis tools
A binned maximum likelihood approach is followed, similarly to what has been done by the ANTARES and IceCube Collaborations in their searches for a diffuse flux of cosmic neutrinos [35] [10]. It should be noted that the results of a search similar to the one presented here has been presented by the IceCube collaboration in reference [32]: the scope of this paper is however to understand to which extend such a procedure should be improved for a VLVνTs to study neutrino fluxes and how much improvement in the current reconstruction performances and statistics would be needed to constrain different atmospheric neutrino flux models. Most of the statistical tools used in this work are taken from the ROOT [36] and RooFit [37] frameworks.
Pseudo-data sets are generated from different models and the distribution of a simulated energy estimator is produced, accounting for the detector energy resolution smearing and allowing for an unknown energy scale systematic effect to influence the distribution, as described in the previous section. Together with this, also a normalisation systematic effect is present in this simulated data, allowing the overall flux to fluctuate around the expected value. In this work, this possible fluctuation is assumed constant over the full energy range energy, even though the flux normalisation uncertainty is energy dependent and probably a more complex influence, such as a polynomial behaviour of this uncertainty with energy could be present [38]. An energy-uniform effect, as well as a normalisation effect which is linerly-dependentt on the energy -not so dissimilar to an energy scale effect -can be fitted by the likelihood procedure without affecting significantly the result. Higher order effects have also been considered in The latest IceCube analysis: the fitting procedure has been shown to be robust against these effects as well [32].
Similarly, an energy-uniform tilt in the spectral behaviour would induce an overall shift in the data, which is analogous to that assumed to come from a possible energy mis-calibration of the detector. The two effects are not factorisable with the current approach. In addition to the smearing effects due to the neutrino energy reconstruction, the distributions are effected also by statistical fluctuations, considered here as poissonian given the large per-bin statistics expected in a km 3 -scale VLVνT.
The zenith-and energy-binned distribution of pseudo-data is compared to the one expected from the model, smeared as well by the expected energy resolution of the experiment, but only by this. The flux normalisation is fitted using a likelihood maximisation algorithm, separately for the conventional, the prompt and the cosmic components since the three are affected by different uncertainties, most of which are uncorrelated. The likelihood function used here, L(d|h), is given by the product of the individual likelihoods L i (d|h) computed for each bin i of the energy distribution observed in pseudo-data sets compared to those of the models. The individual likelihood is given by a Poisson probability term; in particular where, dropping the (d|h) from the formula, µ i is the expected number of events in the i-th bin from the simulated templates according to the hypothesis h, N is the number of bins in the energy estimator histogram for each event sample and k i is the number of events observed in pseudo-data d for that event sample in that bin. In order to estimate the sensitivity of an experimental configuration, a teststatistics TS is defined as a log-likelihood difference: i.e. testing the data against one model or the other, and by comparing the difference of the TS distributions obtained for several pseudo-experiments for data generated according to different hypotheses. The TS is distributed as a Gaussian distribution. The median sensitivity is defined by considering the area of the Gaussian of the TS distribution beyond the median of the TS distribution obtained by exchanging hypotheses. Figure 5 shows the TS distribution obtained by generating 2000 pseudo-data sets of an IceCube-sized detector with an energy resolution equal to 0.25 in the logarithm of the neutrino energy (and under the assumption of a null-cosmic flux). These pseudo-experiments are generated according to the H3a model (in red) or the GST 3-gen (in blue) and the TS value is computed for each of them. A Gaussian fit is performed on both distributions. Given the mean values of the two Gaussian curves and their variances, the median sensitivity is computed. In this particular, unrealistic case, it corresponds to a p-value of 0.02 meaning that 10 years of an IceCube-sized neutrino telescope with this precision in the reconstruction of the muon neutrino energy would be able to separate the two models to roughly a 2σ level. In section 6 several possible outcomes of the procedure will be discussed. In order to evaluate the impact induced by systematic effects, these could be added as nuisance parameters in the fitting procedure, as done in references [10] [32]. For simplicity, since not all the detector-related uncertainties can be accounted for here, we computed the effect of systematic uncertainties by means of a χ 2 comparison between the expected distributions at the detector and taking the χ 2 difference induced by a certain effect as an estimate of the corresponding decrease in terms of significance. The χ 2 is defined as where µ H1 i is the observed number of events in the i-th bin according to the hypothesis H 1 and ν H0 i is the expectations from H 0 . H 0 and H 1 are modified to account for a systematic effect producing a certain H 0 and H 1 and χ 2 (H 0 , H 1 ) is computed. The influence of a systematic effect is estimated to be of the order of being the square root of this ∆χ 2 and estimation of the significance drop.

Results
We first analysed the case of a detector with the size of ANTARES. In this case, given the limited effective area of the neutrino telescope, the possible separation between any different model remains at the level of 0.8 in p-value even assuming an extremely optimistic energy resolution of the detector of 0.35 in the logarithm of the neutrino energy. When using a realistic value as from [33], the TS distributions become basically overlapping. The only effect that appears to be measurable within 1σ with ANTARES is a change of slope in energy spectrum larger that |∆γ p | = 0.1. The small ANTARES effective area does not allow the model-induced fluctuations shown in figure 2 to become relevant above the statistical fluctuations. This result is close to what had been reported in reference [39].
The configurations considered in this work for 10 years of data acquisition of an IceCube-sized neutrino telescope provide some more interesting results. When considering the best fit result from the search of a cosmic flux as in reference [32] and a realistic assumption of the neutrino energy resolution of 0.5 in the logarithm of the energy, a discrimination between the H3a and GST 3-gen model is of the order of 1.8σ.
The influence of the cosmic flux in this evaluation is estimated by considering the strategy described previously. Different cosmic components are in-troduced in the pseudo-experiments ranging from the lowest-normalisation and softest-spectrum cosmic fit allowed within 1σ in reference [32] to the highestnormalisation and hardest-spectrum one provided in the same reference. As a result, we find that the discrimination power changes by roughly ±0.2σ when considering different cosmic spectra. Indeed, for lower cosmic normalisation the discrimination power increases; a similar effect is present for harder cosmic spectra. Prompt neutrino fluxes are currently affected by very large uncertainties. Their contribution can be considered sub-dominant, even though still relevant, because of the presence of a clear cosmic flux which appears in the same energy region. Some indication of its relevance in a full-spectrum analysis can be found in reference [42]. Dedicated studies should be performed to better understand the relevance of the prompt component in the measured high-energy neutrino fluxes; some discrimination power probably would come from the measurement of electron neutrino spectra in VLVνT. This channel has not been treated here.
In order to test the possible limits of this analysis, we have considered an improved neutrino energy reconstruction pushing the IceCube one to the best expected results that could be obtained, e.g., in an underwater neutrino telescope such as KM3NeT/ARCA [14]. The observed effect corresponds to an improvement of about 0.3σ in sensitivity. An improved energy resolution also makes the analysis less sensitive to the effects induced by the uncertainty on the background given by the diffuse flux of cosmic neutrinos. This is partly shown in figure 6 where, in the same fashion as in figure 2, the H3a and GST 3-gen models are compared, once the cosmic background is also added to the observed distributions. However, the effect of the energy smearing is accounted for in this figure, considering a resolution of 0.35 (top) and 0.5 (bottom) in the logarithm of the neutrino energy and adding the expectation from a cosmic flux to that of the atmospheric signal. The presence of a cosmic component removes completely the differences at PeV energies. In a similar way, the limited energy resolution smears the effect because low-energy events, being more abundant, can easily pollute the region where the highest-energy ones are expected.
Finally, the effect shown in figure 1 given by a possible energy scale uncertainty, has been considered. In order to do so, the standard configuration, e.g. the one corresponding to two plots from figure 6, without any systematic shift has been compared to data-sets where a shift in the logarithm of the energy equal to -0.1 had been applied in pseudo data-sets that were to be compared to models where this shift had not been applied. As a result, as shown inf figure 7 the differences between the expectations decrease in the region where it was peaking before by a large factor, and a relevant difference between the distribution appears at the lowest energies while in the standard scenario no difference was visible below 1 TeV. An opposite effect is present when a positive shift is applied. Thus this effect can definitely bias an analysis such as the one described here. This effect is somewhat milder when the energy resolution is improved.
This kind of uncertainty can be accounted for in a full-likelihood fitting procedure as a nuisance parameter, as e.g. done in reference [10]. If this effect is constant with energy, it can be disentangled from the spectral measurement  by using constraints from events below 1 TeV, where the relative differences between models are smaller, the neutrino statistics are larger and the effects induced by energy scale uncertainties are more significant. Given this, it is possible that in a complete analysis this effect might become even milder. In these computations, the uncertainty given by the direction reconstruction has been neglected; this is surely not affecting the result above 10 TeV, where the angular resolution of VLVνT is well below the size of the zenith bins considered here. At lower energies, also the angular resolution should start playing a role; however, the contribution to the total significance coming from these lower energies is not too large, and the total effect is only partly zenith dependent in that energy range.
As already mentioned along the paper, some effects have been neglected here and would certainly play a role in the overall estimation of the sensitivity. One effect that could probably be significant when approaching the problem as done here, would be that induced by seasonal variation of the atmospheric lepton fluxes measured at the South Pole; however, the current knowledge of the temperature profile of the atmosphere would probably guarantee that this effect could be kept under-control in a fully defined analysis using complete detector responses.
One further aspects that could induce systematic deviations in the observable zenith-energy distribution at the detector would be related to the limited knowledge of neutrino cross sections at very high energies. Indeed, in these conditions, only VLVνT can attempt a measurement of this quantity just using atmospheric neutrinos, actually by measuring deviations in the 2D zenith-energy distribution in a very similar way to what has been proposed here [40]. If one assumes that the neutrino cross section should follow a scaling by a constant factor (or by a linearly-energy-dependent one) with respect to the standard model prediction, this can be probably absorbed in a full-likelihood analysis as nuisance parameter under the assumption that the inner-Earth model, in terms of density, is known with a much smaller uncertainty. In general, the effect due to a different neutrino cross-section would mostly affect the measurements in the 10-100 TeV range, in the region when absorption through the Earth becomes non-neglibile, where also this analysis is most sensitive. However, effect which depend on the neutrino cross-section would be a more remarkable zenith-dependent behaviour, especially in the region of the Earth core.
Another systematic uncertainty related to the particle physics at play is the one induced by the hadronic model in the CR interaction and shower development in the atmosphere. As shown in figure 2 its influce does not show a huge energy dependent effect and seems quite linear over the energy range. We assumed here that its impact would not be too large when the central expectation values are taken into account. Given the uncertainties on these prediction, we cannot exclude they could significantly change the picture, but probably further improvements on their precision could be foreseen.
As a matter of fact, a polished, even though model-dependent, analysis could be already performed with the current generation of km 3 neutrino telescopes, and the separation power between different models could be of the order of 1.5 to 2σ; larger-volume neutrino telescopes are foreseen in the next future (KM3NeT/ARCA and GVD-Baikal) or on a slightly longer term (such as IceCube-Gen2 [41]). Given the improvement in performance expected for these instruments, a better insight on this question could be given if large and pure atmospheric neutrino data-sets are collected, benefiting from both the larger exposure and the more precise measurement of the muon neutrino energy. The possibility of combining different data-samples collected by different telescopes should not be underestimated, since this would help controlling detector systematical uncertainties e.g. in the energy calibration which could be of great importance in the proper estimation of the behaviour of atmospheric neutrinos. Finally, expanding this to the electron neutrino channel might be challenging because of the lower statistics achievable and the necessity of considering further uncertainties coming from neutral current events, but this channel might also be worth testing in the future.