UHECRs mass composition from $$X_{\mathrm{max}}$$Xmax distributions

The atmospheric depth where the energy deposit profile of secondary particles from extensive air showers (EAS) reaches its maximum, $X_{\rm max}$, is related to the primary particle mass. The mass composition of the ultra-high energy cosmic rays (UHECRs) can be inferred from measurements of $X_{\rm max}$ distributions in each energy interval, by fitting these distributions with Monte Carlo (MC) templates for four primary species (p, He, N and Fe). On the basis of simulations, we show that a high abundance of some intermediate elements in the $X_{\rm max}$ distributions, e.g. Ne or Si, may affect the quality of the fit and also the reconstructed fractions of different species with respect to their true values. We propose a method for finding the"best combination"of elements in each energy interval from a larger set of primaries (p, He, C, N, O, Ne, Si and Fe) which best describes the $X_{\rm max}$ distributions. Applying this method to the $X_{\rm max}$ distributions measured by the Pierre Auger Observatory (2014), we found that the"best combination"of elements which best describe the data suggest the presence of Ne or Si in some low energy bins for the EPOS-LHC model.


Introduction
The mass composition of the UHECRs is one of the most important ingredients needed when trying to elucidate the origin and acceleration mechanisms of these most energetic particles in the universe. The most reliable observable from extensive air showers (EAS) used to infer the mass composition is the X max parameter [1], the atmospheric depth where the energy deposit profile of secondary particles reaches its maximum. This parameter is related to the mass of the primary particle which initiate the shower, X max ∝ − ln A, with larger mean values and dispersion for light primary para e-mail: nicusorarsene@spacescience.ro b e-mail: octavian.sima@partner.kit.edu ticles in comparison with the heavier nuclei. Experimentally, the mass composition of UHECRs was inferred from measurements of the first two moments of the X max distributions ( X max and σ X max ) as a function of the primary energy, by the Pierre Auger [2][3][4], High-Resolution Fly's Eye (HiRes) [5] and the Telescope Array [6] Collaborations. Despite of the large data acquisition time (the Pierre Auger Observatory is operating since 2004) and large acceptance of the experiments, the reconstruction of the mass composition is affected by large uncertainties mainly due to the unknown interaction cross sections at highest energies, experimental systematic uncertainties and poor statistics at the highest energies.
In [7] the Pierre Auger Collaboration show that using only the limited information given by the first two moments of the X max distributions, degeneracies may be induced when interpreting the mass composition of a given X max distribution, e.g. different mixes of primary particles can have identical mean and dispersion. To get information on fractions of individual nuclei, the Pierre Auger Collaboration used the entire shape of X max distributions fitting them with MC templates for (p, He, N, Fe). The fits were performed with a binned maximum-likelihood method and the goodness of the fits was characterised with p-value. With the use of this method the Auger data for E > 10 17.8 eV could be described well with mixed compositions consisting of p, He and N (as representative for the intermediate mass elements), while fractions of Fe were close to zero in most of the energy bins.
In this work we show that fitting the X max distributions with (p, He, N, Fe) elements, the fit quality is affected if some intermediate elements, e.g. Ne/Si are in fact present. For that, we propose a method for finding the best combination from a larger set of possible elements (p, He, C, N, O, Ne, Si and Fe) to fit the data. Applying this method to the Auger data reported in [4] we observe a slight improvement of p-values when Ne/Si are considered as additional fitting elements in some energy bins, especially at energies below the ankle (E < 10 18.6 eV) where the statistics in the data is larger.
In Sect. 2 we describe the simulation procedure to obtain the X max Probability Density Functions (PDFs) for each primary species in the energy range lg(E/eV) = [17.8 − 19.3]. In Sect. 3 we show the influence of Ne/Si abundance on the goodness of fit parameter p-value. In Sect. 4 we present the fit results on X max distributions measured by the Pierre Auger Observatory (until 2014), considering the elements which best describe the data. Section 5 concludes the paper.

Simulations
We used the CONEX v4r37 simulation code [8,9] to generate the X max distributions for each element (p, He, C, N, O, Ne, Si and Fe) in 15 energy intervals of 0.1 in log(E/eV ) starting from E = 10 17.8 eV up to E = 10 19.3 eV. Three high energy hadronic interaction models were employed, EPOS-LHC [10], QGSJETII-04 [11] and Sibyll 2.1 [12]. The zenith angle of the showers were sampled from an isotropic distribution in the interval θ = [0 • −60 • ]. The statistics of the simulation data set consists in 10 4 -10 5 events per each primary species per hadronic interaction model in each energy interval. A PDF of X max for a nuclear species in a given energy interval consists in a binned X max distribution in the range (0-2000) g/cm 2 with a bin width = 20 g/cm 2 . The true X max values given by the CONEX simulations were modified to account for the detector acceptance and experimental resolution (Eqs. (7) and (8) from [4]). An example of PDFs of X max for proton and iron induced showers in the energy interval lg(E/eV) = [19.0−19.1] for two hadronic interaction models is presented in Fig. 1.
We will use these PDFs in the next section to generate random X max distributions with different mixes of primary particles to observe the behavior of the goodness of fit estimator p-value as a function of different prior abundances, when the X max distributions are fitted with the four fixed PDFs (p, He, N and Fe).

Influence of Ne/Si on the goodness of fit
The results on mass composition of primary cosmic rays at energies E > 10 17.8 eV reported in [7] indicate a modulation of the abundances of primary protons, He and N nuclei as a function of energy. The experimental X max distributions in each energy interval were fitted with four primary PDFs (p, He, N and Fe) following a binned maximum-likelihood procedure. Different astrophysical models suggest a variation of the abundance of different elements as a function of energy below and above the ankle [13][14][15][16]. In such a scenario, the observed modulation of the reconstructed fractions might be biased as a consequence of a high abundance of an intermediate element not included into the fitting procedure, in the case when the X max distributions are fitted with the same fixed four species (p, He, N and Fe) over the entire energy range.
We performed the following test. Using individual X max values obtained from simulations as explained in Sect. 2, we build X max distributions for each energy bin considering random abundances of eight primary species (p, He, C, N, O, Ne, Si and Fe). We generated a large number of such distributions (3 × 10 4 ) to ensure that we cover all possible mixes. The statistics in each distribution is of the same magnitude as in the Auger data. Then, using a binned maximum-likelihood procedure we fit these X max distributions with 4 PDFs (p, He, x, Fe), where x was varied from C to Si.
The minimizing quantity, − ln L, in this fitting procedure is defined as: where n i stands for the measured counts in the "i"-th bin of an X max distribution and y i represents the MC prediction for that bin [17]. The p-value parameter represents the probability of obtaining a worse fit than that observed, even if the distribution predicted by the fit results is correct: where is the incomplete gamma function, nd f represents the number of degrees of freedom, and χ 2 represents the sum of the square of residuals using the parameters computed by the likelihood method. Note that the p-values calculated using Eq. 2 differ from those calculated in [7]. We make the approximation that L behaves like a χ 2 variable while in [7] the p-value parameters are calculated in a more realistic way, using mock data sets of the predicted fractions with size equal to the real data sets. Even if the absolute p-values might be affected by this approximation we consider that the relative variation of p-values with the components included in the fit is significant. In addition, we mention that the pvalues obtained by us with Eq. 2 do not differ significantly from those obtained with the method used in [7], therefore we consider that the main conclusion of this paper will be not affected by this choice. Indeed, the best results were obtained when the distributions were fitted with p, He, Fe and any of CNO nuclei. Further, we tried to check what is the capability of this fitting method to reconstruct these four abundances (p, He, N, Fe) if one of the primary species has a high prior abundance. An example of the evolution of the fit quality as a function of different abundance of nuclear species in the X max distributions is represented in Fig. 2  As we can see in Fig. 2, the probability of obtaining a good p-value decreases with the increase of abundances of Ne or Si and with increase of statistics in X max distributions, when the fitting procedure includes only four PDFs (p, He, N and Fe). Moreover, we found that the reconstructed fractions of protons, He and Fe differ from the true fractions by up to 20% in some cases (i.e. when the abundance of Ne or Si is > 40%). When fitting the same distributions with five elements including Si (p, He, N, Si and Fe), the fit quality is not affected by the higher prior abundances of Ne or Si. These results are presented in Fig. 3. We observed that in the energy bins where the statistics is very small (e.g. lg(E/eV) = [19.2−19.3], N = 87), the higher abundance of Ne or Si does not affect the quality of the fit. For these energies the reliable estimations on the mass composition can not be obtained due to poor statistics available in data.

Fitting Auger X max distributions
We fit the experimental X max distributions measured at the Pierre Auger Observatory [4], with the four fixed PDFs (p, He, N and Fe) on the entire energy range lg(E/eV) = [17.8−19.3]. The results we have obtained are in a very good agreement with those reported in [7], since we build our PDFs based on the same version of CONEX code, employing the same versions of hadronic interaction models and considering the same binned maximum-likelihood fitting procedure.
We found that the most appropriate approach to fit the experimental X max distributions in each energy interval is to consider all possible combinations of PDFs from a larger set of nuclear elements (p, He, C, N, O, Ne, Si and Fe) and then to find the "best combination" of elements which best describe the data. Thus, the number of elements from a "best combination" may vary between 1 and 8. We will refer from now on to this fitting approach as "best combination".
It is worth mentioning that in the minimization procedure of the log-likelihood (Eq. 1) we do not neglect the empty bins and the nd f parameter is calculated as the number of bins in the X max distribution minus the number of parameters considered in the fit. Therefore, the computation of the pvalue parameter takes into account the number of parameters considered in the fit.
In Fig. 4 we give an example of a X max distribution measured by the Pierre Auger Observatory in the energy interval lg(E/eV) = [17.9 − 18.0]. We found that the "best combination" (Fig. 4 left) suggests that the shape of the distribution is best described only by two elements, protons and O, with p-value= 0.35, for the case of EPOS-LHC model. In Fig. 4 right we present the results obtained by fitting the same X max distribution with 4 PDFs (p, He, N and Fe). In this case we obtain a worse p-value= 0.22.
A direct comparison of the two fitting procedures is presented in Figs. 5, 6 and 7 for EPOS-LHC, QGSJETII-04 and Sibyll 2.1 for the entire energy range. In the case of QGSJETII-04 (Fig. 6) and Sibyll 2.1 (Fig. 7) we observe negligible differences if the X max distributions are fitted with the four fixed PDFs (p, He, N and Fe) or if we use the "best combination" of elements. The only modification consists in a slight improvement of the p-value parameter over the entire energy range for the "best combination" case. Important to mention that the number of elements from the "best combination" consists in two or three elements over the entire energy range for each hadronic interaction model. The error bars (statistical uncertainties) of the fitted fractions should not be compared with those from [7] since they are computed considering different methods. We have employed the MINOS technique based on L = 1/2 rule, while in [7] the authors used the Feldman-Cousins procedure in which the parameter uncertainties are computed in a more rigorous way by enforcing unitarity. Most likely the uncertainties from Figs. 5, 6 and 7 from our manuscript are underestimated.
The most interesting aspect is observed in the case of EPOS-LHC model (Fig. 5) at the lower energies. We found that for some energy intervals, e.g. lg(E/eV) = [18.1−18.2], [18.4−18.5], the "best combination" suggest the presence of Ne or Si in Auger data with a slight improvement of the p-value parameter. This aspect is in agreement with our results from Sect. 3, where we found that a high prior abundance of Ne or Si (> 20%) may affect the quality of fit if the X max distributions are fitted with the four PDFs (p, He, N and Fe). Without making speculations, one can consider that the results presented in this paper could be a hint for the presence of the heavier elements (20 < A < 39) around the ankle, as predicted in [16].

Discussions and conclusions
In this paper we investigated the capability to infer the mass composition of the primary UHECRs from measurements of X max distributions. Using simulated X max distributions for a large set of primary species (p, He, C, N, O, Ne, Si and Fe), we build X max distributions with random mixes of elements for each energy interval in the energy range lg(E/eV) = [17.8−19.3]. We found that a high prior abundance of Ne or Si can bias the reconstructed fractions of elements if the distributions are fitted with four fixed PDFs (p, He, N and Fe). We found that the fit quality decreases with increasing the Ne/Si abundance and with increasing the statistics in the X max distributions.
We proposed an alternative approach to infer the mass composition from the X max distributions which finds the "best combination" of elements best describing the distributions from a larger set of primaries. Applying this method to the X max distributions measured by the Pierre Auger Observatory until 2014, it was shown that in some low energy  circles represent the fitted fractions found for the "best combination" method and black stars stand for Auger 2014 results [7] bins, only for the EPOS-LHC model the "best combination" of elements suggests the presence of Ne or Si, with a slight improvement of the p-value parameter. Since we have shown using simulations that a high Ne/Si prior abundance will affect the fit quality if the X max distribution is fitted with four PDFs (p, He, N and Fe), we consider that it is important to take into account further elements in future studies.

Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: The datasets generated during and/or analysed during the current study available from the corresponding author on reasonable request.] Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/. Funded by SCOAP 3 .