1 Introduction

Primary Cosmic Rays (CRs) are ionized nuclei, mainly protons, that were discovered by Victor Hess and others in a number of balloon flights [1]. Despite extensive studies since then, the origin, propagation, and interaction of high-energy CRs in the atmosphere are not yet fully understood [2]. This work focuses on the latter point demonstrating a deficit of deeply penetrating muons in the simulations of CR interactions in the atmosphere using the up-to-date hadronic interaction models. Although the KM3NeT data are in better agreement with older parametric simulations [3] (Figure 2), those are based on underground and underwater muon measurements and the hadronic interaction models used are inconsistent with the LHC results.

The measurement of the number of muons in CR events with the Pierre Auger Observatory has revealed a deficit of GeV muons in CR simulations with respect to the data [4, 5]. This result triggered similar analyses by several experiments, such as the EAS-MSU Array [6], the IceCube Neutrino Observatory [7], the KASCADE-Grande experiment [8], the NEVOD-DECOR detector [9], the SUGAR array [10], the Telescope Array [11], and the Yakutsk array [12]. For several of these experiments, the muon deficit becomes more and more pronounced as the energy increases, starting from 20 to 40 PeV up to the highest energies of tens of EeV. The issue connected with this deficit is known as the Muon Puzzle [13].

In this work, the zenith distribution of the rate of high-energy muons arising from Extensive Air Showers (EAS) is probed at a depth of several kilometres with the KM3NeT underwater telescopes. The minimal muon energy at sea level required to reach the KM3NeT detectors is around 500 GeV [14] with the majority of muons having energies in the TeV range as shown below in Sect. 3.

A verification that the production of TeV muons is described properly serves also as a check that the neutrino and gamma ray fluxes from cosmic sources are correctly predicted. Since neutrinos and muons are mostly produced in the weak decays of mesons, one may expect that the increase of muons is due to an increase of mesons decaying predominantly in muons, and compensated by a decrease in mesons decaying electromagnetically with a production of gamma rays (see, for example, [15, 16]). The increased neutrino production and decreased gamma ray production means that the gamma ray to neutrino conversion predictions for the cosmic sources used in studies of the neutrino telescope collaborations could gain a boost for the neutrino fluxes [17,18,19].

Several other large-volume neutrino telescopes reported results on atmospheric muon measurements, namely AMANDA [20], ANTARES [21], Baikal-GVD [22], IceCube [23], NEMO [24], and NESTOR [25]. The quoted results were compared with the pre-LHC hadronic interaction models in contrast to this work where the post-LHC models are used.

This paper is organised as follows. The description of the KM3NeT detectors is given in Sect. 2. Details on the models used in the simulation are provided in Sect. 3. The motivation and procedure of the tuning of a fast Monte Carlo (MC) generator based on the detailed MC simulations are described in Sect. 4. The muon reconstruction performance and the estimation of systematic uncertainties are given in Sect. 5 and Sect. 6, respectively. The main results of the comparison of KM3NeT data with MC simulations are presented and discussed in Sect. 7. Finally, the conclusions are summarised in Sect. 8.

2 The KM3NeT neutrino telescopes

The KM3NeT research infrastructure comprises two neutrino telescopes at the bottom of the Mediterranean Sea [26]. The detection technology is the same for both detectors. In the KM3NeT telescopes, Cherenkov photons induced by highly-relativistic charged particles are detected by an array of three-inch PhotoMultiplier Tubes (PMTs). The PMTs are placed in pressure-resistant 17-inch diameter glass spheres and there are thirty one PMTs in each sphere. This covers most of the light arrival directions which is suitable for the downward-going atmospheric muons detection. Such sphere, together with the associated readout electronics inside it [27], comprises the Digital Optical Module (DOM) [28].

The KM3NeT DOMs are attached to a pair of vertical ropes, forming a Detection Unit (DU) consisting of eighteen DOMs. Each DU is secured to the sea floor with an anchor and has a buoy at the top to keep the DU in an almost vertical position. The detector geometry varies slowly with deep-water currents. The position and orientation of each DOM are continuously measured with an acoustic positioning system, and with tilt and compass sensors [29]. The dynamic positioning calibration of the detectors can be further corroborated and enhanced using muons passing through the detector, by means of a likelihood maximisation procedure [30].

Whenever the signal pulse read-out of a PMT exceeds a predefined threshold voltage, the signal is recorded. In particular, the start time of the signal and the pulse duration are saved. Together with the PMT identification number these data form a “hit”. A set of causally-connected hits triggers an “event”. The data acquisition is based on the all-data-to-shore concept [26]. The event triggering and reconstruction are performed onshore. The data are taken continuously in consecutive runs of a few hours.

Neutrino telescopes can detect different event topologies, coming from all-flavour neutrino interactions and from atmospheric muons. Track-like events correspond to muons crossing the instrumented volume of the detector. This topology is mostly caused by atmospheric muons or by muons from charged-current neutrino interactions. Shower-like (or cascade-like) events are caused by interactions of neutrinos where only electromagnetic and hadronic showers are produced. In this work, events are reconstructed under the assumption of a track-like topology [31]. The overwhelming majority of such reconstructed events is due to the downward-going atmospheric muons.

The KM3NeT/ARCA telescope is under construction offshore the coast of Sicily, Italy, at a depth of about 3.5 km. In its final configuration, the KM3NeT/ARCA detector will consist of two “building blocks”, each comprising 115 DUs. The horizontal spacing between the DUs is about 90 m and the vertical distance between the DOMs on each DU is 36 m, optimised for the detection of high-energy cosmic neutrino signals. The KM3NeT/ORCA detector is located about 40 km offshore the coast of Toulon, France, at about 2.5 km depth. KM3NeT/ORCA will comprise one building block of 115 DUs. The distance between the DUs is around 20 m and the DOMs on each DU are on average 9 m apart, in order to measure neutrino oscillations of GeV atmospheric neutrinos.

The overburden difference between the KM3NeT detectors is almost one kilometre. This, together with different detector energy thresholds and geometries, allows for measurements of deeply-penetrating atmospheric muons in complementary energy ranges. The analysis presented in this work is performed with data taken by both the KM3NeT/ARCA and KM3NeT/ORCA telescopes in a configuration of six DUs at each site. In the following, these configurations are denoted as ARCA6 and ORCA6 for the two detectors, correspondingly.

3 Simulation of high-energy muons in KM3NeT

The simulation of muons for the KM3NeT telescopes is divided into four steps:

  1. 1.

    Simulation of the interactions of the primary CRs with the air nuclei and the subsequent propagation, interaction, and decay in the atmosphere of the secondary particles. This is done with a detailed MC simulation software. By detailed simulations, it is implied that all the relevant interaction, decay, and energy loss processes are taken into account and treated stochastically using step-by-step Monte Carlo sampling methods.

  2. 2.

    Propagation of muons in water from sea level down to the KM3NeT detectors. The propagation routine uses MC methods for proper treatment of the stochastic muon energy losses and the muon direction changes.

  3. 3.

    Generation of the Cherenkov light from muons and its detection by the PMTs. Fast sampling of the PMT photoelectrons for nominal quantum efficiency is done from pre-generated tables. Conversion of the simulated photoelectrons to hits is performed using the measured efficiencies, PMT timing characteristics, and the optical background.

  4. 4.

    Reconstruction of muon tracks.

3.1 Simulations of primary cosmic-ray interactions

For the first step, CORSIKA version 7.7410 [32] has been used for this work. Since the minimal muon energy required to reach the depth of the KM3NeT detectors is about 500 GeV at sea level [14], the lower limit on the primary CR energy is conservatively set to 1 TeV per nucleon in the simulation. For each nucleus, the energy range in which simulations are carried out is divided into 3 sub-ranges (named “TeV low”, “TeV high”, and “PeV”) to obtain sufficient statistics for all energies of interest. A summary of the production energy ranges and the number of generated CR showers in each production is reported in Table 1.

Table 1 Five nuclei and three energy ranges have been used for the simulation using the CORSIKA code. The Table reports for each nucleus and for the three energy bands the interval of energy and, in brackets, the number of simulated events

Five nuclei are used as primaries in the simulation: hydrogen (\({}{^{1}_{1}\textrm{H}}\)), helium (\({}{^{4}_{2}\textrm{He}}\)), carbon (\({}{^{12}_{6}\textrm{C}}\)), oxygen (\({}{^{16}_{8}\textrm{O}}\)), and iron (\({}{^{56}_{26}\textrm{Fe}}\)). These simulation samples are used together to describe the expected muon flux at sea level. Each sample must be weighted in order to reproduce the input primary spectrum and its composition. Other primaries are taken into account by increasing the flux weights of C, O, and Fe, according to the flux of nuclei that have not been simulated.

The Global Spline Fit (GSF) is chosen as the model for the CR flux and mass-composition [33]. This model is a data-driven parameterisation of the CR flux, obtained using only the assumption that the flux is smoothly varying with energy. Since the GSF, in contrast with other mass composition models such as H3a [34] and GST [35], does not rely on the assumption that the primary CR components should be described by power-law energy spectra with rigidity-dependent cutoffs, it describes data without assumptions on the physical origin of the measured CRs. Additionally, the GSF incorporates the data-driven uncertainties on the total flux and on the flux of individual nuclear species, which are used in this work for the systematic uncertainties studies described in Sect. 6.

The UrQMD 1.3 model [37] is used to simulate the elastic and inelastic interactions of hadrons below 80 GeV in air. The simulation of high-energy hadronic interactions is performed with the post-LHC model Sibyll 2.3d [38].

The flux of particles produced in the CR air shower depends on the density profile of the atmosphere. NRLMSIS 2.0 [36] is used to obtain the atmosphere density profile. Since the atmospheric conditions vary depending on time and location, a fit is performed on the model predictions averaged over a 3 year period (2019–2021) and over the KM3NeT/ORCA and KM3NeT/ARCA sites, 42\(^{\circ }\)48\('\) N 06\(^{\circ }\)02\('\) E and 36\(^{\circ }\)16\('\) N 16\(^{\circ }\)06\('\) E, respectively. In CORSIKA the atmosphere is divided into five layers with the fifth layer at the top of the atmosphere. The atmospheric overburden as a function of the altitude above sea level, T(h) (g/cm\(^2\)), is parameterised in each layer [32]. For the first four layers, T(h) is parameterised as an exponential function:

$$\begin{aligned} T(h) = a_i + b_i \cdot e^{-\frac{h}{c_i}}, 1 \le i \le 4, \end{aligned}$$
(1)

while in the fifth layer the T(h) dependence is linear:

$$\begin{aligned} T(h) = a_5 - b_5 \cdot \frac{h}{c_5}. \end{aligned}$$
(2)

Here, i is the layer number, and \(a_i\), \(b_i\), and \(c_i\) are the fitting parameters. The values of fitted parameters are reported in Table 2.

Table 2 Values of the parameters used in Eqs. (1) and (2) resulting from the T(h) fit

3.2 Propagation of muons in seawater

To obtain muon kinematic properties in seawater at a cylindrical surface surrounding the KM3NeT detector, the can, the gSeaGen code [39] is used. The propagation of atmospheric muons resulting from CORSIKA is added to the gSeaGen code for this work since originally gSeaGen was developed as a GENIE-based [40] application for the neutrino simulation. The muon propagation routine used in gSeaGen is the PROPOSAL software [41], specifically version 6 of the code. The can height and radius correspond to those of the instrumented volume enlarged by 4 times the maximum of the absorption lengths in seawater, i.e., 70 m \(\times \) 4, to account for the light emitted along the path of muons passing outside the can, but close enough to reach the instrumented volume.

Table 3 Relative content of the seawater elements used in the “ANTARESWater” model and in the updated model obtained in this work. The content of hydrogen atoms is predefined to be equal to 2, the content of other elements is calculated relative to hydrogen

The following muon energy loss processes are taken into account in the PROPOSAL calculations: ionisation (Groom, Mokhov, Striganov parameterisation [42] with the density effect correction calculated using Sternheimer parameterisation [43]), bremsstrahlung (Kelner et al. [44]), \(e^{+}/e^{-}\) pair production (Kokoulin and Petrukhin [45]), and photo-nuclear interactions (Abramowicz and Levy [46]). The Landau-Pomeranchuk-Migdal effect [47] is also accounted for in PROPOSAL, although the effect becomes noticeable only at muon energies greater than 10 PeV; it is negligible for atmospheric muons detected in KM3NeT. The uncertainty on the underwater flux of muons caused by the inaccuracies in the models of the energy loss processes is about a few per cent at most [48].

The seawater composition and mass density used in the simulations correspond to the “ANTARESWater” model in PROPOSAL. The composition is taken from [49] and corrected for the salinity at the ANTARES site (38.48\(\permille \) [50]). The ANTARES location was near the KM3NeT/ORCA detector; the salinity at the KM3NeT/ARCA site differs from the ANTARES one at a level below 1% [51]. With this correction, the seawater content of all elements except for hydrogen and oxygen is multiplied by a factor corresponding to the relative difference in salinity (38.48/35); the correction for the oxygen coming from the sea salt sulphate was neglected. In this work, the updated seawater composition [52] (Table 4), corrected to account also for oxygen in sulphates, is tested. First, the abundance of each sea salt molecule is re-scaled accounting for higher salinity at the ANTARES site with respect to the salinity of the “Reference Seawater”, i.e., 35\(\permille \). Then, molar masses of seawater molecules are calculated as the ratio between the relative content of molecules constituting the sea salt and the atomic weights of elements as listed in PROPOSAL, which also accounts for different isotope abundances. After that, the hydrogen atomic content is normalised such that there are two atoms of hydrogen in the “solution” which is used in PROPOSAL. The contribution of other elements is then re-scaled using the proportion obtained in the hydrogen normalisation. The updated seawater composition and the composition from the “ANTARESWater” model are given in Table 3.

A test is performed in order to evaluate the influence of the new water composition on the muon energy losses. The differences in the energy losses between the two composition models for TeV muons travelling 2 and 3 km in water are found at a level below 0.5%. The original “ANTARESWater” model is used for the following in this work.

3.3 Detector response simulation

The light generation, detector response simulation, and track reconstruction are performed with internal KM3NeT software. The detector response simulation takes into account the calibrated position and orientation of the PMTs, values of PMT quantum efficiencies and the environmental optical background. This background includes the contribution of photons induced in the water by the products of the \({}{^{40}\textrm{K}}\) \(\beta \)-decays, bioluminescent light [53], and other radioactive decays of nuclei in the DOM glass and water.

3.4 Characteristics of the reconstructed events

The same reconstruction algorithms are applied to both data and simulations assuming the track-like topology of events. The events produced by the optical background are filtered by applying quality cuts on the output of the track reconstruction algorithm. Cuts are applied on the number of hits used by the algorithm, \(n_{\text {hits}}>50\), and on the likelihood of the reconstruction, \(\Lambda >20\). Hereafter, those events that passed such quality cuts are referred to as “reconstructed events”.

The rate of reconstructed events in the simulations as a function of the nucleus energy of primary CRs is shown in Fig. 1. The highlighted area indicates the 90% fraction of the total number of events counting from the maximum of the distribution, i.e., the highest density interval. This energy range spans from 3 to 320 TeV for ORCA6 and from 4 TeV to 1 PeV for ARCA6; the median values of the distributions are 26 TeV and 60 TeV, respectively. Most of the events originate from proton and helium nuclei.

Fig. 1
figure 1

The rate of simulated atmospheric muon events reconstructed with the ORCA6 (left) and ARCA6 (right) detectors as a function of the primary CR energy. The unshaded area corresponds to the 90% fraction of events (the highest density interval). Statistical uncertainties are shown as vertical error bars

Muons from the same CR interaction arrive in packed bundles at the KM3NeT detectors and they are reconstructed as a single track. The expected rate of muons at sea level from the bundles that are reconstructed with the ORCA6 and ARCA6 detectors as a function of the individual muon energy is shown in Fig. 2 for different ranges of the cosines of the zenith angle, \(\cos \theta \), where \(\cos \theta = 1\) indicates vertically down-going muons. Each muon from the reconstructed bundles that reached the can is used to fill the distributions. More inclined muons have higher energies at sea level since a larger amount of the water overburden is travelled.

The pseudorapidity variable characterises the direction of the outgoing muon with respect to the primary CR. It is defined as \(\eta =-\ln [\tan (\alpha / 2)]\), where \(\alpha \) is the angle between the primary particle and the secondary muon at sea level. The pseudorapidity distributions of reconstructed muons reaching the ORCA6 and ARCA6 detectors are plotted in Fig. 3. The distributions are filled using each muon that reaches the can. The peak of the distributions is located at \(\eta \simeq 9\). Hence, muons detected by the KM3NeT telescopes originate from hadronic interactions in the very forward region of pseudorapidities. This region is not fully covered by accelerator experiments [13].

Fig. 2
figure 2

Sea level rate of generated muons from the reconstructed bundles as a function of the muon energy. Different colours indicate different zenith angle ranges; vertical muons have \(\cos \theta = 1\). Statistical uncertainties are shown as vertical error bars

Fig. 3
figure 3

Pseudorapidity distribution of muons reaching the ORCA6 (left plot) and ARCA6 (right plot) cans. Only reconstructed events are used in the distribution. The unshaded area shows the pseudorapidity range that includes the 90% fraction of events (the highest density interval). Statistical uncertainties are shown as vertical error bars

4 Tuning of the MUPAGE parameters on the CORSIKA simulation

The CORSIKA software provides detailed simulations of EAS. Given the steeply falling primary cosmic ray spectrum, high-energy events are much rarer than low-energy events, and the properties of air showers are highly stochastic. Thus, in order to get representative statistics of atmospheric muons deep underwater, the demand for CPU time for simulations is high. The KM3NeT simulations are based on the Run-by-Run (RbR) approach similar to the ANTARES one [54], i.e., simulations are subdivided into batches corresponding to the actual data-taking periods to reproduce the time variability of the detector conditions. In particular, environmental noise is sampled from the data, the calibrated position and orientation of PMTs and the values of the PMT quantum efficiencies are considered individually for each run in the simulation. A separate muon sample is generated for each run to prevent bias associated with using the same sample. Performing RbR simulations with CORSIKA is not presently feasible due to the extensive CPU time required.

Currently, the simulation of atmospheric muons in KM3NeT is based on the fast MC generator MUPAGE [55, 56]. It generates the muon bundle kinematic properties for a certain water depth and zenith angle at a plane perpendicular to the bundle axis. This generation is based on parametric formulas describing the flux of single and multiple muons in the bundle, the differential energy spectrum, and the distance from the bundle axis. The default MUPAGE parameters were obtained from a detailed EAS simulation performed with the HEMAS package [57] and then fitting the results to the MACRO measurements [58].

A framework has been developed to tune the MUPAGE parameters using the CORSIKA simulation software and the most recent available models to describe high-energy hadronic interactions, Sibyll 2.3d [38], and the CR flux, GSF [33]. The goal of this tuning is to combine the advantage of a quick parameterised simulation with the features coming from a detailed MC simulation. The MUPAGE code and the parametric formulas themselves are kept unmodified. The numerical values of some parameters have been changed according to the method described below.

In this tuning procedure, atmospheric muons at sea level from CORSIKA simulations are propagated down through the water to a plane perpendicular to the CR shower axis using the PROPOSAL software [41]. The propagation is performed for seven different depths, from 2 km down to 3.5 km in steps of 250 m.

Muon bundle kinematic properties are fitted at each depth following the approach described in the MUPAGE paper [55]. Five fits are performed to obtain the new values of the MUPAGE parameters that aim to describe the following muon bundle kinematic properties:

  • flux of single muons in the bundles as a function of the zenith angle,

  • flux of multiple muons in the bundles as a function of the bundle multiplicity,

  • normalised energy spectrum of single muons in the bundles,

  • lateral spread of multiple muons in the bundles,

  • normalised energy spectrum of multiple muons in the bundles.

The new values of the MUPAGE parameters obtained using CORSIKA simulations are given in Tables 4 and  5. Comparison between the MUPAGE tuned on CORSIKA, original MUPAGE, and CORSIKA for the five aforementioned kinematic properties of muon bundles at the can are plotted in Fig. 4.

Table 4 Best fit values of the parameters [55] that describe the zenith dependence of the flux of single muons in the bundles and the number of muons in the bundles obtained in this work
Table 5 Best fit values of the parameters [55] that describe the energy dependence of single muon bundles and the muon lateral spread obtained in this work

A comparison of the tuned and unmodified (nominal) MUPAGE with CORSIKA for the single muon flux as a function of the cosine of the simulated zenith angle is given in Fig. 4a. The MUPAGE functions with the tuned parameters well describe the CORSIKA distributions for muons with a direction ranging from vertical to very inclined. The statistical uncertainties are underestimated for the very inclined CORSIKA muons where the proton contribution is missing due to the lack of statistics in the “TeV high” production (see Table 1 for the production label and Fig. 1 for the number of simulated proton showers).

The muon bundle fluxes are compared in Fig. 4b. A good agreement between the tuned MUPAGE and CORSIKA is present up to a multiplicity of five. The difference for larger multiplicities can be ascribed to the radial distribution of muons in bundles: muons in the bundles with multiplicity four and above are sampled from the same radial distribution in the MUPAGE code, which is dominated by bundles consisting of four muons. However, in the CORSIKA simulations the radial distribution becomes more compact as the multiplicity grows, hence, more muons are crossing the can with respect to MUPAGE at high multiplicities. A possible fix for the issue is the change of the MUPAGE formulas which may be performed in future works. Here, the discrepancy in the bundle multiplicity does not affect the results as shown in Fig. 6 since the inclusive muon rate underwater is dominated by bundles with low multiplicity. With inclusive atmospheric muon flux all muons originating from the whole CR energy spectrum are indicated, in contrast to the muons measured in EAS detectors from Ultra-High Energy (UHE) CR interactions only.

The energy spectra of single muons are compared in Fig. 4c. Both the nominal and the tuned MUPAGE follow the spectrum obtained with CORSIKA.

The fit of the muon lateral spread distribution does not converge in the presented version of the tuning procedure. In particular, for the lateral distribution of inclined bundles, the MUPAGE parametric formulas do not approximate well the distribution as simulated with CORSIKA. The comparison between the CORSIKA and MUPAGE distributions for the lateral spread of two-muon bundles reaching the can is shown in Fig. 4d. This distribution is dominated by vertical muons for which the MUPAGE functions with the adjusted parameters provide a good description.

Several MUPAGE functions that aim to describe the multiple muon bundle energy dependence do not approximate the distributions resulting from CORSIKA with Sibyll 2.3d and GSF on the full parameter space, and thus, the parameter values are kept nominal and are not reported in the two tables above. The rate of events as a function of the bundle energy is plotted in Fig. 4e. Although the MUPAGE parameters for the energy distribution of multi-muon bundles were not changed, the agreement between the tuned MUPAGE and CORSIKA has improved, thanks to the changes of parameter values connected with the multiplicity distributions.

5 Muon reconstruction performance

To estimate the performance of the reconstruction algorithms, a comparison between the distributions of the cosine of the zenith angle from the MC reconstructed and generated events separately for the ORCA6 and ARCA6 detectors is done as shown in the top part of Fig. 5. A discrepancy between the two distributions starts to emerge for events with \(\cos \theta < 0.5\) for ORCA6 and with \(\cos \theta < 0.6\) for ARCA6. Even though the KM3NeT angular resolution is at the sub-degree level [26], the very steep dependence of the muon flux on the true \(\cos \theta \) causes the few well-reconstructed inclined events to be hidden by a small fraction of mis-reconstructed more vertical muons. The scatter plot for the true generated \(\cos \theta \) versus the reconstructed one is given at the bottom of Fig. 5. It can be noted also that all muons are assumed to be collinear with the bundle axis in MUPAGE. The median angular spread between muons in a bundle and the bundle axis is evaluated using CORSIKA. This value is around 0.14\(^{\circ }\), with 90% of muons deviating from the bundle axis by less than 0.38\(^{\circ }\). Hence, the parallel muon approximation should not affect the zenith distribution.

Fig. 4
figure 4

Comparison of the CORSIKA simulations (blue points) with the MUPAGE tuned on CORSIKA (red points) and the nominal MUPAGE (green points) for muons on the can from events which passed the trigger condition. The livetime of the MUPAGE simulations is about 43 (52) hours for the tuned MUPAGE and 35 (43) hours for the nominal one for ORCA6 (ARCA6). The same amount of generated events corresponds to the different values of livetime due to the differences in the MUPAGE parameter values. The comparison is shown for five distributions: (a) the single muon flux as a function of the cosine of the zenith angle; (b) the muon bundle flux as a function of the bundle multiplicity; (c) the single muon flux as a function of the muon energy; (d) the flux of two-muon bundles at the top of the can as a function of the distance to the bundle axis; (e) the muon bundle flux as a function of the bundle energy. The distributions for ORCA6 are shown on the left plots and for ARCA6 are on the right plots. The ratios between MUPAGE and CORSIKA are on the bottom plots. Statistical uncertainties are shown as vertical error bars. These uncertainties can be underestimated for the very inclined single muons from CORSIKA and for several bins in the single muon energy distribution for the very high-energy muons

Fig. 5
figure 5

Top: muon rate as a function of the cosine of the true zenith angle (red points) compared to the reconstructed one (blue points) for ORCA6 (left plot) and ARCA6 (right plot). Statistical uncertainties are shown as vertical error bars. Bottom: cosine of the true zenith angle versus the reconstructed one. The colour scale represents the rate of events. Solid and dashed black lines indicate 50%, 16% and 84% quantiles, respectively

The performance of the energy reconstruction for atmospheric muons is also evaluated. This analysis is carried out with data collected in a six-DU configuration for both KM3NeT/ORCA and KM3NeT/ARCA, this corresponds to 5% and 2.5% of the final design detector size, respectively. With the limited instrumented volume of the detectors, only a fraction of the muon energy is deposited within the sensitive volume of the telescopes. Therefore, the energy reconstruction algorithms fail to properly estimate the muon energy and the reconstructed spectrum does not match with the true MC one. The reconstruction performance is expected to improve for larger detector configurations.

It is worth noticing that for muons with energies below 100 GeV radiative losses are negligible and their energy is reconstructed from the visible track length. The track length of 100 GeV muons is about 400 m, which is longer than the sensitive size of the KM3NeT/ORCA detector, especially considering slightly-inclined muons that do not pass through the whole vertical length of the detector (\(\sim \)300 m). This limits the performance of the energy reconstruction for KM3NeT/ORCA. For the KM3NeT/ARCA detector muons below 100 GeV are almost never reconstructed due to the sparser module distribution with respect to KM3NeT/ORCA. For muons above 100 GeV the contribution of radiative energy losses becomes important. The energy loss per unit path length in this regime is roughly proportional to the muon energy. However, the number of detected photons used as a proxy for the energy losses can be severely affected by the mis-modelling of water properties, in particular the light absorption length in seawater. For direction reconstruction the effect is milder and it is evaluated in Sect. 6.

6 Systematic uncertainty estimation

Several systematic uncertainty effects on the reconstructed muon rate are considered when comparing data with the expectation from the MC simulation, as listed below:

  • the primary CR flux normalisation and mass composition,

  • the light absorption length in seawater,

  • the light detection efficiency,

  • the high-energy hadronic interaction model.

The primary CR flux uncertainties are estimated using the complete CORSIKA simulations. All the other uncertainties mentioned are evaluated using simulated MC samples produced with MUPAGE tuned on CORSIKA.

The total flux uncertainty is calculated using the limits on the all-particle flux available in the GSF model, while fixing the CR composition. A relative uncertainty of 6–8% on the muon rate is obtained as a function of the zenith angle.

The effects of the uncertainties on the primary CR composition are also calculated using the per-mass-group flux estimations in the GSF model. In practice, the light (proton) and heavy (iron) components of the flux are varied within their uncertainties while keeping the total flux value unchanged. First, the CR proton flux is assumed to be at the maximal possible value within the quoted uncertainties. From this the fluxes of all other components are recomputed starting from iron, so that the same average flux value as the original GSF model prediction is obtained. Then, the same procedure is repeated but this time increasing the iron component while decreasing the flux of the other primary species. The results of this uncertainty calculation are at a level of 6–7%.

Finally, since 5 types of primary nuclei have been simulated with CORSIKA (Table 1) and 28 primaries are available in the GSF model, the following approach is used to account for all the nuclei available in GSF. The hydrogen and helium flux weights, \(w^{\text {H}}_{\text {CR}}\) and \(w^{\text {He}}_{\text {CR}}\), are taken directly from the GSF table containing the flux values for each nucleus. The carbon weight, \(w^{\text {C}}_{\text {CR}}\), is the sum of the GSF fluxes of nuclei with a charge from 3 to 6, \(w^{\text {C}}_{\text {CR}} = \sum _{Z=3}^{6} w^{Z}_{\text {CR}}\), the oxygen weight is \(w^{\text {O}}_{\text {CR}} = \sum _{Z=7}^{10} w^{Z}_{\text {CR}}\), and the iron weight is \(w^{\text {Fe}}_{\text {CR}} = \sum _{Z=11}^{28} w^{Z}_{\text {CR}}\). This assignment of the flux weights is noted as “Average composition” in Table 6. A test is performed in order to estimate the bias introduced by that particular assignment considering two edge cases. The first one is to assign larger weights to lighter primaries, “Lighter composition” in Table 6, and the second one is to increase the weights of the heavier primaries, “Heavier composition”. This results in an error of about −0.5% (3%) introduced by the different scheme of the flux weight assignment.

The new results from the direct CR experiments in the TeV energy range have been published after the GSF model development was completed [59,60,61,62]. The uncertainties on the CR flux and composition may be further reduced in future models of the CR flux.

Table 6 Scheme of the CR flux weight assignment. This corresponds to the ranges of atomic numbers of elements listed in the second, third, and fourth columns that are simulated in CORSIKA as a primary in the first column. The average composition is used as default in this work

The light absorption length that is used in the simulations is the result of the measurements of the NEMO Collaboration [63]. They reported the variation of the absorption length for five different measurements during 1999–2003 at a level of ±10% for 400–500 nm wavelength. The light absorption length is modified in MC simulations by ±10% to estimate the effect of this uncertainty in the measured muon rate. Then, the ratios between the MC simulations with the standard absorption length and with the modified ones are computed. This leads to a difference in the muon rate at a level of about ±5% for ORCA6 and about ±10–20% for ARCA6. A higher uncertainty in ARCA6 could be attributed to the different detector geometries: the ORCA6 detector layout is denser and light travels shorter distances in water with respect to ARCA6. Hence, the influence of different absorption length values is less pronounced in ORCA6. For ARCA6, the uncertainty is also zenith-dependent. In fact, most of the vertical muons are reconstructed close to a single DU, so information from all eighteen DOMs of the DU is used in the reconstruction. For inclined muons the number of DOMs close to the muon track is much smaller, thus hits from the DOMs at larger distances also become important in the reconstruction and they are more affected by the changes in the absorption length.

Fig. 6
figure 6

Top: a comparison between the MUPAGE tuned on CORSIKA (red) and the CORSIKA MC simulations (blue) for the reconstructed rate of muon events as a function of the cosine of the zenith angle for ORCA6 (left) and ARCA6 (right). Bottom: ratio between the tuned MUPAGE and CORSIKA distributions. Statistical uncertainties are shown as vertical error bars. The livetime of the MUPAGE simulations is about 43 (52) hours for ORCA6 (ARCA6)

It is expected that the light scattering does not change the number of detected photons, thus, the number of reconstructed events is not significantly affected. The quality of the reconstruction for tracks may slightly degrade [26], but it is less relevant for this work. Hence, the uncertainty on the light scattering length in seawater is not considered here.

The light absorption and scattering lengths are mostly affected by the particle concentration. The long-term monitoring of the deep-water properties was performed in the period 2009–2015 at the ANTARES site [64] which is close to the ORCA one. The baseline of the suspended particle concentration did not change noticeably in seven years of monitoring, however, seasonal variations are present.

The PMT efficiency is calculated for each PMT by measuring the coincidence rates between PMTs in each single DOM and comparing the rates to the ones obtained with the GEANT4 simulation of radioactive nucleus decays in the seawater and in the DOM glass [65]. The PMT efficiency is calculated every several hours and then used in the RbR simulations. The efficiency drops by \(\sim \)20% for the upward-looking PMTs in the first year after deployment due to sedimentation [66]. Then, it remains stable and even occasionally increases probably due to the partial cleaning. For other PMTs, the efficiency loss is much slower. Values of the overall light detection efficiency for each PMT in the KM3NeT DOMs are known with a precision of about 10%. The uncertainties are mainly coming from the radioactive element composition in the DOM components (\({}{^{40}\textrm{K}}\) in the glass sphere, mainly) and the difference of PMT efficiency for the inclined light from the nearby source (radioactive elements decay) and for the mostly parallel light from muons. In order to evaluate the effect from this uncertainty the MC simulations are repeated with the PMT efficiency values re-scaled by ±10% for the evaluation of the corresponding systematics. The resulting uncertainties are \(\sim \)5% for ORCA6 and \(\sim \)10–30% for ARCA6 depending on the zenith angle. Also in this case, the larger uncertainties for the ARCA6 detector can be explained by its geometry, i.e. the spatial distribution of DOMs in KM3NeT/ARCA is less dense than in KM3NeT/ORCA. The uncertainty associated with the PMT efficiency is zenith-dependent for ARCA6 for the same reason as for the absorption length.

The uncertainty induced by differences in the high-energy hadronic interaction models is estimated using the MCEq softwareFootnote 1 [67] as a faster and compatible alternative to CORSIKA [68]. Three post-LHC models are used for the muon flux calculation, namely EPOS-LHC [69], QGSJETII-04 [70], and Sibyll 2.3c [71]. The CORSIKA simulation results are obtained in this work with the Sibyll 2.3d [38] model for the interactions, which is not available in MCEq at the time of writing. The uncertainty is calculated as the ratio between the hadronic interaction model that provides the minimum and the maximum sea-level flux for a certain muon energy with respect to the flux averaged over the results obtained with the three models considered. To evaluate the uncertainty for the KM3NeT simulation results, the flux of muons reaching the ORCA6 and ARCA6 cans as a function of their energy at sea level is used. The uncertainty value is obtained by convolving the MCEq results and the muon fluxes at sea level. The uncertainties for the ORCA6 (ARCA6) detector are at a level of 3% (4%).

The results of the CORSIKA simulation at the reconstruction level are compared with simulations carried out with the tuned MUPAGE described above using the inputs from the CORSIKA samples (Fig. 6). The tuned MUPAGE is in agreement with CORSIKA within statistical errors for ARCA6, and the disagreement between the tuned MUPAGE and CORSIKA is below 3% for ORCA6. Therefore, the tuned MUPAGE is used as a proxy for CORSIKA for the reconstructed muon rate as a function of the zenith angle. A slight disagreement between the tuned MUPAGE and CORSIKA is not considered as a systematic uncertainty in this work.

Seasonal variation of the atmospheric muon flux is not considered as a systematic uncertainty. As described above, the CORSIKA simulations are performed with an atmosphere model averaged over a three-year period. The ORCA6 data considered in this analysis was taken during the period from February 2020 to January 2021 considering 6 h of data in each month. Hence, the seasonal variations are also averaged. For the ARCA6 detector, however, the data-taking period considered here spanned from May to July 2021. The seasonal variation of the atmospheric muon flux has been measured for ORCA6 [72]. The muon flux for the summer period is \(\sim \)3% higher with respect to the annual average. An additional check is performed using the MCEq software by comparing the summer and winter muon fluxes at sea level as a function of the zenith angle. The fluxes are calculated for 1, 5, and 10 TeV muons. The ratio of the fluxes in summer and winter is flat in zenith and differs from unity at a level of 3–4% for all three muon energies considered.

7 Results and discussion

The comparison of the KM3NeT data with the simulations obtained with the MUPAGE tuned on CORSIKA, including the systematic uncertainties discussed above, is shown in Figs. 7 and 8 for the ORCA6 and ARCA6 detectors, respectively. A small fraction of livetime, corresponding to 4.5 days (2.5 \(\times \) 10\(^{6}\) events) in ORCA6 and to 1.5 days (1.3 \(\times \) 10\(^{5}\) events) in ARCA6, is used.

Fig. 7
figure 7

Top: reconstructed muon rate as a function of the cosine of the zenith angle for ORCA6. The data points are shown in black, the simulations are in blue. Different systematic uncertainties are summed linearly and plotted as coloured bands. Bottom: the ratio between the data and the simulations. Statistical uncertainties are shown as vertical error bars

Fig. 8
figure 8

Top: the rate of the atmospheric muon events reconstructed with the ARCA6 detector as a function of the cosine of the zenith angle. The MC simulation points are plotted in blue with the systematic uncertainties summed linearly and plotted as coloured bands. The data points are shown in black. Bottom: the ratio between the data and the simulations. Statistical uncertainties are shown as vertical error bars

7.1 Comparison between the ARCA6 and ORCA6 results

The MC simulation samples obtained with the Sibyll 2.3d hadronic interaction model and the GSF CR flux model underestimate the muon rates measured with both the ORCA6 and ARCA6 detectors.

In the case of ORCA6, the ratio between data and simulations is flat in the considered zenith angle range; \(\sim \)40% more muons are observed in data than in simulations and this discrepancy cannot be accounted for by the statistical and systematic uncertainties considered in this work.

The shape of the ARCA6 data/MC ratio is not flat, in contrast to ORCA6. The depth of the top part of the ORCA6 detector is about 2.3 km, while ARCA6 is located deeper with the top part being at about 2.8 km. Also, the energy threshold of the two detectors is different, \(\le \)10 GeV for ORCA6 and \(\le \)100 GeV for ARCA6. The difference in the energy thresholds can be compensated using the additional slant depth for ORCA6, i.e. 360 m which corresponds to muon losses of 90 GeV. Thus, similar results are expected for equivalent depth, e.g. the value of the ARCA6 \(\cos \theta = 1\) corresponds to the ORCA6 \(\cos \theta = 2.3/(2.8+0.36) \approx 0.73\). However, the ratio values are different, 1.5 and 1.4 for ARCA6 and ORCA6, respectively.

Fig. 9
figure 9

Vertical muon spectrum at sea level from the CORSIKA simulation used in this work compared to two analytical models, Gaisser (orange line) and Bugaev (green line), and to data from ground-based experiments. The ratio of models and data to CORSIKA is shown on the bottom plot. Statistical and systematic uncertainties are shown as vertical error bars. The sea-level energy intervals for 90% of vertical muons (\(0.975< \cos \theta < 1\)) from reconstructed events for ARCA6 and ORCA6 are shown as coloured bands

A possible explanation for the differences in the shape and the mismatch in the ratio values between the detectors could be related to the optical properties of water used in the simulation, in particular to light absorption. By increasing the absorption length by 10%, the data/MC ratio for ARCA6 is flat in \(\cos \theta \) and at a level of 40% above unity which is the same as for ORCA6. The same change in the absorption length for the ORCA6 detector modifies the result by \(\sim \)5% and does not change the shape of the ratio. It is quite realistic that the optical properties of seawater are not the same at both detector sites and the difference can be within a 10% uncertainty.

7.2 Spectral index of the muon flux

The different zenith angles correspond to different slant depths travelled and, thus, different energy thresholds for muons at sea level. Therefore, the flat data/MC ratio seen in ORCA6 hints that the shape of the simulated muon energy distribution at sea level is correct even if there is a discrepancy in the normalisation. Since the shape of the muon energy spectrum at sea level depends on the primary CR flux, it may be stated that the GSF model predictions are in agreement with the KM3NeT data. The 90% fraction of events reconstructed in ORCA6 originates from primaries with energies in the 3–350 TeV region as shown in Fig. 1. In this energy range, the GSF approximation for the flux may be considered as a power-law spectrum and is fitted with the corresponding function. The fitted value of the spectral index is \(-2.607 \pm 0.006\).

7.3 Comparison with other neutrino telescopes

The IceCube Collaboration has also published the results on the characterisation of the muon flux properties deep under ice [23]. One of the results is the comparison of the data with the simulations in terms of the muon zenith distribution. The IceCube simulations were performed with CORSIKA using the Sibyll 2.1 (pre-LHC) hadronic interaction model [73] and two CR flux models, H3a [34] and GST [35]. Even though the models are different with respect to the ones used in this work, there is also an underestimation of the muon flux in the simulations with respect to the IceCube data at the level of 20% for vertical muons. The TeV muon energy spectrum at sea level predicted by the Sibyll 2.1 model is about 10% higher with respect to the one predicted by Sibyll 2.3d and the difference is almost energy independent; the H3a primary proton flux used in IceCube is higher by \(<10\%\) than the one from GSF used in this work. Therefore, the IceCube result is compatible with the one found in this work for vertical muons once the differences between the models are accounted for.

In the IceCube analysis, the toy model simulation was fitted to data in order to get the power law index of the CR spectrum which provides the flat data/MC ratio. The spectral indices obtained for the trigger and high-quality data were \(-2.715\) and \(-2.855\), respectively. Both values differ from the one that provides the flat data/MC ratio for ORCA6 in this work, which is \(-2.607\). Differences in the KM3NeT and IceCube results for the spectral index may be explained by the systematics of the two detectors and by the different hadronic models used.

The ANTARES measurement of the zenith distribution of the atmospheric muon flux has been published in [21]. The detailed MC simulations with CORSIKA and pre-LHC hadronic interaction models underestimate the ANTARES data, although, the difference with respect to the MC simulation is within the systematic uncertainties for ANTARES.

The underwater muon rate as a function of the zenith angle has been also measured by the Baikal Collaboration [22]. The results show good agreement between the data and simulations. The hadronic interaction model used in these simulations is pre-LHC and the CR composition does not include the most recent direct measurements.

7.4 Comparison with the inclusive muon flux at sea level

The measured discrepancy between data and simulations in KM3NeT can be investigated considering also sea-level measurements of the muon flux. The high-energy muon flux can be measured at sea level up to several TeV. This threshold depends on the maximum detectable momentum of a muon, given by the relative momentum resolution of the spectrometer used to carry out such a measurement [74]. Several experiments were able to directly measure the absolute muon spectrum at sea level in the energy range relevant for KM3NeT: the L3+cosmic experiment [74], the Nottingham CR spectrometer [75, 76], and the Kiel spectrographs [77].

The vertical muon flux at sea level resulting from the CORSIKA simulations used in this work is compared to the data from the experiments mentioned above, and to two analytical models, namely the Gaisser [2] and Bugaev [78] models. This comparison is shown in Fig. 9.

The CORSIKA simulations using Sibyll 2.3d and GSF underestimate the sea-level muon flux predicted by the analytical models at a level of \(\sim \)30%. The first L3+cosmic data point, the results from the Nottingham spectrometer obtained in 1984 [76], and the Kiel spectrograph data point are also above the CORSIKA predictions by \(\sim \)20–40%. This is compatible with the discrepancy seen by ORCA6 within the quoted uncertainties. The Nottingham data of 1968 [75] and the second L3+cosmic point are in agreement with the CORSIKA predictions and in disagreement with the two models considered here.

A similar discrepancy between the data for the inclusive flux of lower energy (GeV) muons at sea level and the MC simulations with Sibyll 2.3d and GSF is reported in [48]. This discrepancy is at a level of \(\sim \)20–30% for muons in the energy range from 1 GeV up to several hundred GeV.

Muons contributing to the inclusive flux originate mainly from the first interactions in the EAS. Thus, they are mainly coming from CRs whose energies are higher by one order of magnitude with respect to the muon energy [71]. The inclusive TeV muon flux detected by the KM3NeT telescopes is also mainly driven by the first interactions. Therefore, it is worth highlighting that a similar muon deficit is observed for GeV muons coming from GeV CRs [48] and for TeV muons coming from TeV primaries as shown in this work. On the other hand, the deficit described in the Muon Puzzle is also for GeV muons but those muons are decay products of pions born after several stages of the shower cascade development.

7.5 On the atmospheric neutrino flux

Atmospheric muons with energies of a few TeV are produced mainly in the decays of charged pions [71] (Figure 5) which also give rise to neutrinos. The neutrino energy is lower than that of the muon of the same pion decay by a factor \(\sim \)4 [2] (Equation (6.14)). Hence, the TeV muon deficit observed in this work would correspond to the muon neutrino deficit in the \(\sim \)250 GeV energy domain. However, the inclusive neutrino flux for energies above 200 GeV is dominated by neutrinos coming from kaon decays [71] (Figure 5). Therefore, the deficit of atmospheric muons does not directly lead to the deficit of atmospheric neutrinos. Comparisons with atmospheric neutrino data can provide a complementary measurement for hadronic interaction studies.

7.6 On the first interactions probed by the Pierre Auger Observatory

The Pierre Auger Observatory has reported measurements of the fluctuations in the number of muons in EAS produced by UHECRs [79]. These measurements agree with the MC simulations with the recent high-energy hadronic interactions models, in particular with Sibyll 2.3d. The fluctuations in the muon number are believed to be mainly determined by the first interaction of the CR primaries with the atmosphere [80]. Since muons detected by the KM3NeT telescopes originate mainly from the first interactions in EAS, these detectors provide direct probes of such interactions. As demonstrated in this work, there is a discrepancy between the KM3NeT data and the Sibyll 2.3d hadronic model. Therefore, it is important to notice that even though the hadronic models are able to describe the fluctuations of the muon number for UHECRs, they fail in the description of the absolute number of TeV muons originating from the first interactions for lower CR energies (TeV–few PeV range) as reported here.

8 Conclusions

The most recent direct measurements of the primary CR flux describe the per-nucleus spectrum up to several hundreds of TeV, mainly driven by the CREAM and AMS-02 detectors [33]. Above these energies, the CR flux can only be measured indirectly through air shower observations, whose results are strongly dependent on various systematic uncertainties, most importantly those concerning the nature of hadronic interactions. Even though the knowledge on hadronic interaction has been significantly improved by the LHC measurements, the lack of accelerator data in the forward interaction region does not allow to describe CR air showers accurately. This leaves space for discrepancies, such as those observed recently in the EAS shower measurements. In this work, the high-energy muon contribution from CR showers is studied with the KM3NeT underwater neutrino detectors.

A detailed simulation using the CORSIKA MC software has been used in this work to extract updated parameters for the fast underwater muon flux generator MUPAGE, namely using recent input for the hadronic interaction model (Sybil 2.3d) and the CR flux (the GSF model). This parameterisation allows for a comparison of simulations with data from the ORCA6 and ARCA6 detectors, and reveals a deficit in the simulations with respect to the data at the \(\sim \)40% level for TeV-energy atmospheric muons. This deficit is weakly dependent on the muon inclinations and thus on the muon path length or muon energy at sea level. The deficit is compatible with the sea-level measurement and models of TeV-scale muon flux.

The observed deficit of TeV muons indicates that the neutrino production in cosmic sources may be underestimated with respect to the flux of the accelerated nuclei and with respect to the gamma ray flux. Once the new hadronic interaction model is obtained as a solution to the observed atmospheric muon deficit, the gamma ray and neutrino production in cosmic ray accelerators could be revisited.

An overview of the discrepancies in different muon energies coming from different primary CR energies is reported in this work. This provides additional inputs to the Muon Puzzle observed in the measurement of GeV muons from ultra-high energy CR air showers. The recent attempts to solve the discrepancies in the Muon Puzzle should be extended to describe a broader phase space region where the issue is observed.

Other muon kinematic properties important for understanding the discrepancy origin include muon bundle multiplicity, lateral spread, and underwater energy spectrum. Proper reconstruction of these observables is expected with the completed KM3NeT detectors.