1 Introduction

The steepening of the energy spectrum of cosmic rays (CRs) at around \(10^{15.5}\) eV, first reported in [1], is referred to as the “knee” feature. A widespread view for the origin of this bending is that it corresponds to the energy beyond which the efficiency of the accelerators of the bulk of Galactic CRs is steadily exhausted. The contribution of light elements to the all-particle spectrum, largely dominant at GeV energies, remains important up to the knee energy after which the heavier elements gradually take over up to a few \(10^{17}\) eV [2,3,4,5,6]. This fits with the long-standing model that the outer shock boundaries of expanding supernova remnants are the Galactic CR accelerators, see e.g. [7] for a review. Hydrogen is indeed the most abundant element in the interstellar medium that the shock waves sweep out, and particles are accelerated by diffusing in the moving magnetic heterogeneities in shocks according to their rigidity. That the CR composition gets heavier for two decades in energy above the knee energy could thus reflect that heavier elements, although sub-dominant below the knee, are accelerated to higher energies, until the iron component falls off steeply at a point of turn-down around \({\simeq }\,10^{16.9}\) eV. Such a bending has been observed in several experiments at a similar energy, referred to as the “second knee” or “iron knee” [8,9,10,11]. The recent observations of gamma rays of a few \(10^{14}~\)eV from decaying neutral pions, both from a direction coincident with a giant molecular cloud [12] and from the Galactic plane [13], provide evidence for CRs indeed accelerated to energies of several \(10^{15}~\)eV, and above, in the Galaxy. A dozen sources emitting gamma rays up to \(10^{15}~\)eV have even been reported [14], and the production could be of hadronic origin in at least one of them [15]. However, the nature of the sources and the mechanisms by which they accelerate CRs remain in general undecided. In particular, that particles can be effectively accelerated to the rigidity of the second knee in supernova remnants is still under debate, see e.g. [16].

Above \(10^{17}\) eV, the spectrum steepens in the interval leading up to the “ankle” energy, \({\sim }5{\times }10^{18}\) eV, at which point it hardens once again. The inflection in this energy range is not as sharp as suggested by the energy limits reached in the Galactic sources to accelerate iron nuclei beyond the iron-knee energy [17]. Questions arise, then, on how to make up the all-particle spectrum until the ankle energy. The hardening around \(10^{17.3}\) eV in the light-particle spectrum reported in [18] is suggestive of an extragalactic contribution to the all-particle spectrum steadily increasing. It has even been argued that an additional component is necessary to account for the extended gradual fall-off of the spectrum and for the mass composition in the iron-knee-to-ankle region, be it of Galactic [17] or extragalactic origin [19].

While the concept that the Galactic-to-extragalactic transition occurs somewhere between \(10^{17}\) eV and a few \(10^{18}\) eV is well-accredited, a full understanding of how it occurs is hence lacking. The approximately power-law shape of the spectrum in this energy range may mask a complex superposition of different components and phenomena, the disentanglement of which rests on the measurements of the all-particle energy spectrum, and of the abundances of the different elements as a function of energy, both of them challenging from an experimental point of view. On the one hand, the energy range of interest is accessible only through indirect measurements of CRs via the extensive air showers that they produce in the atmosphere. Therefore, the determination of the properties of the CRs, especially their mass and energy, is prone to systematic effects. On the other hand, different experiments, different instruments and different techniques of analysis are used to cover this energy range, so that a unique view of the CRs is only possible by combining measurements the matching of which inevitably implies additional systematic effects.

The aim of this paper is to present a measurement of the CR spectrum from \(10^{17}\) eV up to the highest observed energies, based on the data collected with the surface-detector array of the Pierre Auger Observatory. The Observatory is located in the Mendoza Province of Argentina at an altitude of 1400 m above sea level at a latitude of \(35.2^\circ \) S, so that the mean atmospheric overburden is 875 g/cm\(^2\). Extensive air showers induced by CR-interactions in the atmosphere are observed via a hybrid detection using a fluorescence detector (FD) and a surface detector (SD).

Fig. 1

The layout of the SD and FD of the Pierre Auger Observatory is shown above. The respective fields of view of the five FD sites are shown in blue and orange. The 1600 SD locations which make up the SD-1500 are shown in black while the stations which belong only to the SD-750 and the border of this sub-array are highlighted in cyan

The FD consists of five telescopes at four sites which look out over the surface array, see Fig. 1. Four of the telescopes (shown in blue) cover an elevation range from \(0^\circ \) to \(30^\circ \) while the fifth, the High Elevation Auger Telescopes (HEAT), covers an elevation range from \(30^\circ \) to \(58^\circ \) (shown in red). Each telescope is used to collect the light emitted from air molecules excited by charged particles. After first selecting the UV band with appropriate filters (310–390 nm), the light is reflected off a spherical mirror onto a camera of 22\(\times \)20 hexagonal, 45.6 mm, photo-multiplier tubes (PMTs). In this way, the longitudinal development of the particle cascades can be studied and the energy contained within the electromagnetic sub-showers can be measured in a calorimetric way. Thus the FD can be used to set an energy scale for the Observatory that is calorimetric and so is independent of simulations of shower development.

The SD, the data of which are the focus of this paper, consists of two nested hexagonal arrays of water Cherenkov detectors (WCDs). The layout, shown in Fig. 1, includes the SD-1500, with detectors spread apart by 1500 m and totaling approximately 3000 km\(^2\) of effective area. The detectors of the SD-750 are instead spread out by 750 m, yielding an effective area of 24 km\(^2\). SD-750 and SD-1500 include identical WCDs, cylindrical tanks of pure water with a 10 m\(^2\) base and a height of 1.2 m. Three 9” PMTs are mounted to the top of each tank and view the water volume. When relativistic secondaries enter the water, Cherenkov radiation is emitted, reflected via a Tyvek lining into the PMTs, and digitized using 40 MHz 10-bit Flash Analog to Digital Converters (FADCs). Each WCD along with its digitizing electronics, communication hardware, GPS, etc., is referred to as a station.

Using data collected over 15 years with the SD-1500, we recently reported the measurement of the CR energy spectrum in the range covering the region of the ankle up to the highest energies [20, 21]. In this paper we extend these measurements down to \(10^{17}\) eV using data from the SD-750: not only is the detection technique consistent but the same methods are used to treat the data and build the spectrum. The paper is organized as follows: we first explain how, with the SD-750 array, the surface array is sensitive to primaries down to \(10^{17}\) eV in Sect. 2; in Sect. 3, we describe how we reconstruct the showers up to determining the energy; we illustrate in Sect. 4 the approach used to derive the energy spectrum from SD-750; finally, after combining the spectra measured by SD-750 and SD-1500, we present the spectrum measured using the Auger Observatory from \(10^{17}\) eV upwards in Sect. 5 and discuss it in the context of other measurements in Sect. 6.

2 Identification of showers with the SD-750: from the trigger to the data set

The implementation of an additional set of station-level trigger algorithms in mid-2013 is particularly relevant for the operation of the SD-750. Their inclusion in this work extends the energy range over which the SD-750 triggers with \(>98\%\) probability from \(10^{17.2}\) eV down to \(10^{17}\) eV.

To identify showers, a hierarchical set of triggers is used, ranging in scope from the individual station level up to the selection of events and the rejection of random coincidences. The trigger chain, extensively described in [22], has been used since the start of the data taking of the SD-1500, and was subsequently adopted for the SD-750. In short, station-level triggers are first formed at each WCD. They are then combined with those from other detectors and examined for spatial and temporal correlations, leading to an array trigger, which initiates data acquisition. After that, a similar hierarchical selection of physics events out of the combinatorial background is ultimately made.

We describe in this section the design of the triggers (Sect. 2.1). We then illustrate their effect on the data, at the level of the amplitude of detected signals (Sect. 2.2) and on the timing of detected signals in connection with the event selection (Sect. 2.3). Finally we describe the energy at which acceptance is 100% (Sect. 2.4). A more detailed description of the trigger algorithms can be found in Appendix A.

2.1 The electromagnetic triggers

Using the station-level triggers, the digitized waveforms are constantly monitored in each detector for patterns consistent with what would be expected as a result of air-shower secondary particles (primarily electrons and photons of 10 MeV on average, and GeV muons) entering the water volume (see Footnote 1). The typical morphologies include large signals, not necessarily spread in time, such as those close to the shower core, or sequences of small signals spread in time, such as those near the core in low-energy showers, or far from the core in high-energy ones. Atmospheric muons, hitting the WCDs at a rate of 3 kHz, are the primary background. The output from the PMTs has only a small dependence on the muon energy. The electromagnetic and hadronic background, while also present, yields a total signal that is usually less than that of a muon. Consequently, the atmospheric muons are the primary impediment to developing a station-level trigger for small signal sizes without contaminating the sampling of an air shower with spurious muons.

Originally, two triggers were implemented in the station firmware: threshold (TH), better suited to detecting muons, and time-over-threshold (ToT), better suited to identifying the electromagnetic component. Both of these have settings which require the signal to be higher in amplitude or longer than what is observed for a muon traveling vertically through the water volume. As such, they have the inherent limitation of being insensitive to signals which are smaller than (or equal to) that of a single muon, thus prohibiting the measurement of pure electromagnetic signals, which are generally smaller.

To bolster the sensitivity of the array to such small signals, two additional triggers were designed. The first, time-over-threshold-deconvolved (ToTd), first removes the typical exponential decay created by Cherenkov light inside the water volume, after which the ToT algorithm is applied. The second, multiplicity-of-positive-steps (MoPS), is designed to select small, non-smooth signals, a result of many electromagnetic particles entering the water over a longer period of time than a typical muon pulse. This is done by counting the number of instances in the waveform where consecutive bins are increasing in amplitude. Both of the trigger algorithms are described in detail in Appendix A.
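The counting idea behind MoPS can be illustrated with a minimal sketch. It is not the firmware algorithm, whose actual conditions and settings are given in Appendix A: the function name `count_positive_steps` and the `min_rise` parameter are hypothetical, and the trace is assumed to be a plain sequence of ADC counts.

```python
from typing import Sequence


def count_positive_steps(trace: Sequence[int], min_rise: int = 1) -> int:
    """Count runs of consecutive rising bins whose total rise is >= min_rise.

    `min_rise` (in ADC counts) is a hypothetical knob, not a firmware setting.
    """
    steps = 0
    rise = 0  # accumulated rise of the run currently being followed
    for previous, current in zip(trace, trace[1:]):
        if current > previous:
            rise += current - previous
            continue
        # The run (if any) ended on this bin: keep it if it rose enough.
        if rise > 0 and rise >= min_rise:
            steps += 1
        rise = 0
    if rise > 0 and rise >= min_rise:  # run still open at the end of the trace
        steps += 1
    return steps


# A single sharp pulse followed by a smooth decay gives one positive step,
# while a ragged trace made of several small arrivals gives several.
muon_like = [0, 10, 8, 6, 4, 2, 1, 0]
spread_out = [0, 1, 2, 1, 3, 3, 4]
```

In this toy picture, a trigger condition would compare the returned count with a multiplicity threshold, so that a smooth muon-like pulse is rejected while a ragged, extended trace is accepted.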

The implementation of the ToTd and MoPS (the rate of which is around 0.3 Hz, compared to 0.6 Hz of ToT and 20 Hz of TH) did not require any modification in the logic of the array trigger, which calls for a coincidence of three or more SD stations that pass any combination of the triggers described above with compact spacing, spatially and temporally [22]. We note that in spite of the low rate of the ToTd and MoPS relative to TH and ToT, the array rate more than doubled after their implementation. This, as will be shown in the following, is due to the extension of measurements to the more abundant, smaller signals.

2.2 Effect of ToTd and MoPS on signal amplitudes

The ToTd and MoPS triggers extend the range over which signals can be observed at individual stations into the region which is dominated by the background muons that are created in relatively low energy air showers. By remaining insensitive to muon-like signals, these two triggers increase the sensitivity of the SD to the low-energy parts of the showers that have previously been below the trigger threshold.

The effects of the additional triggers can be seen in the distribution of the observed signal sizes. An example of such a distribution, based on one month of air-shower data, is shown in Fig. 2.

Fig. 2

Distribution of the signal sizes at individual stations which pass the TH and ToT triggers (solid black) and signals which pass only the ToTd and/or MoPS triggers (dashed red)

The signal sizes are shown in the calibration unit of one vertical equivalent muon (VEM), the total deposited charge of a muon traversing vertically through the water volume [22]. For the stations passing only the ToT and TH triggers (shown in solid black), the distribution of deposited signals is the convolution of three effects: the uniformity of the array, the decreasing density of particles as a function of perpendicular distance to the shower axis (henceforth referred to as the axial distance), and the shape of the CR spectrum, which results in the negative slope above \({\simeq }\,7\) VEM. Furthermore, the efficiency of the ToT and TH triggers decreases at small signal sizes. The range of additional signals that are now detectable via the ToTd and MoPS triggers is shown in dashed red. As expected, the ToTd and MoPS triggers increase the probability that the SD detects small-amplitude signals, namely between 0.3 and 5 VEM. That the high-signal tail of this distribution ends near 10 VEM is consistent with a previous study [24] that estimated that the ToT+TH triggers were fully efficient above this value.

Fig. 3

The increase in station multiplicity when including the ToTd and MoPS triggers versus the original multiplicity with only ToT and TH. The black circles show the median increase in that multiplicity bin

The additional sensitivity to small air-shower signals also increases the multiplicity of triggered stations per event. This increase is characterized in Fig. 3, which shows the number of additional triggered stations per event as a function of the number of stations that pass the TH and ToT triggers, after removing spuriously triggered stations. The median increase of multiplicity in each horizontal bin is shown by the black circles and indicates a typical increase of one station per event.

2.3 Effects of ToTd and MoPS on signal timing

The increased responsiveness of the ToTd and MoPS algorithms to smaller signals, specifically due to the electromagnetic component, also has an effect on the observed timing of the signals. In general, the electromagnetic signals are expected to be delayed with respect to the earliest part of the shower which is muon-rich, the delay increasing with axial distance. Further, in large events, stations that pass these triggers tend to be on the edge of the showers, where the front is thicker, thus increasing the variance of the arrival times. Such effects can be seen through the distribution of the start times for stations that pass the ToTd and MoPS triggers.

Fig. 4

Distributions of start times with respect to a plane front for stations that pass the ToT and TH algorithms, in blue and in green, respectively. The signals due to ToTd and MoPS are shown in red. Positive residuals correspond to a delay with respect to the plane wave expectation

The residuals of the pulse start times with respect to a plane front fit of the three stations with the largest signals in the event are shown in Fig. 4 for different trigger types. The entries shown in blue correspond to stations that passed the ToT algorithm, the ones in green to stations that pass the TH trigger (but not the ToT trigger), and those in red to stations that pass only the ToTd and/or MoPS triggers. For each of the trigger types, there is a clear peak near zero, which reflects the approximately planar shower front close to the core. Stations that pass the TH condition, but not the ToT one, tend to capture isolated muons, including background muons arriving randomly in time. This explains the vertical offset, flat and constant, in the green curve. In turn, the lack of such a baseline shift in the blue and red distributions gives evidence that the ToT, ToTd and MoPS algorithms reject background muons effectively. This is particularly successful for the ToTd and MoPS that accept very small signals, of approximately 1 VEM in size. One can see that these distributions have different shapes and that, in particular, the start time distributions of signals that pass the ToTd and MoPS have much longer tails than those of the ToT triggers, including a second distribution beginning around 1.5 \(\upmu \)s possibly due to heavily delayed electromagnetic particles.

The extended time portion of showers accessed by the ToTd and MoPS triggers has implications for the procedure used to select physical events from the triggered ones [22]. In this process, non-accidental events, as well as non-accidental stations, are disentangled on the basis of their timing. First, we identify the combination of three stations that form a triangle with at least two legs 750 m long and that has the largest summed signal among all such configurations. These stations make up the event seed and the arrival times of the signals are fit to a plane front. Additional stations are then kept if their temporal residual, \(\Delta t\), is within a fixed window, \(t_\text {low}< \Delta t < t_\text {high}\). Motivated by the differing time distributions, updated \(t_\text {low}\) and \(t_\text {high}\) values were calculated based on which trigger algorithm was satisfied. Using the distributions of timing residuals, shown in Fig. 4, the baseline was first subtracted. Then the limits of the window, \(t_\text {low}\) and \(t_\text {high}\), were chosen such that the middle 99% of the distribution was kept. The trigger-wise limits are summarized in Table 1.
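The window definition described above (baseline subtraction, then the central 99% of the residual distribution) can be sketched as follows. This is only an illustration under stated assumptions: the histogram binning, the way the flat baseline is estimated, and the function name `central_window` are hypothetical, and the numbers in the toy input are not those of Table 1.

```python
import numpy as np


def central_window(bin_edges, counts, baseline=0.0, fraction=0.99):
    """Return (t_low, t_high) enclosing the central `fraction` of a histogram.

    `baseline` is a flat per-bin level (e.g. from random muons) that is
    subtracted first; how it is estimated is left to the caller.
    """
    edges = np.asarray(bin_edges, dtype=float)
    signal = np.clip(np.asarray(counts, dtype=float) - baseline, 0.0, None)
    # Cumulative distribution evaluated at the bin edges, from 0 to 1.
    cdf = np.concatenate(([0.0], np.cumsum(signal))) / signal.sum()
    tail = 0.5 * (1.0 - fraction)
    t_low = float(np.interp(tail, cdf, edges))
    t_high = float(np.interp(1.0 - tail, cdf, edges))
    return t_low, t_high


# Toy input: a flat distribution of residuals on [-1, 1] (arbitrary units)
# sitting on top of a flat baseline of 10 counts per bin.
toy_edges = np.linspace(-1.0, 1.0, 201)
toy_counts = np.full(200, 110.0)
toy_low, toy_high = central_window(toy_edges, toy_counts, baseline=10.0)
```

For the flat toy distribution the window is simply the central 99% of the range, i.e. close to (-0.99, 0.99); for the asymmetric distributions of Fig. 4 the two limits would differ in magnitude.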

Table 1 Temporal window limits \(t_\text {low}\) and \(t_\text {high}\) used to remove stations from an event, for each station-level trigger algorithm

2.4 Effect of the ToTd and MoPS on the energy above which acceptance is fully-efficient

Most relevant to the measurement of the spectrum is the determination of the energy threshold above which the SD-750 becomes fully efficient. To derive this, events observed by the FD were used to characterize this quantity as a function of energy and zenith angle. The FD reconstruction requires only a single station be triggered to yield a robust determination of the shower trajectory. Using the FD events with energies above \(10^{16.8}\) eV, the lateral trigger probability (LTP), the chance that a shower will produce a given SD trigger as a function of axial radius, was calculated for all trigger types. The LTP was then parameterized as a function of the observed air-shower zenith angle and energy. It is important to note that because the LTP is derived using observed air showers as a function of energy, this calculation reflects the efficiency as a function of energy based on the true underlying mass distribution of primary particles. Further details of this method can be found in [25].

The SD-750 trigger efficiency was then determined via a study in which isotropic arrival directions and random core positions were simulated for fixed energies between \(10^{16.5}\) and \(10^{18}\) eV. Each station on the array was randomly triggered using the probability given by the LTP. The set of stations that triggered was then checked against the compactness criteria of the array-level triggers, as described in [22]. The resulting detection probability for showers with zenith angles \(<40^\circ \) is shown as a solid blue line in Fig. 5 as a function of energy. The detection efficiency becomes almost unity (\(>98\%\)) at around \(10^{17}\) eV (see Footnote 2). For comparison, we show in the same figure, in dashed red, the detection efficiency curve for the original set of station-triggers, TH and ToT, in which the full efficiency is attained at a larger energy, i.e., around \(10^{17.2}\) eV.

Fig. 5

The detection efficiency of the SD-750 for air showers with \(\theta <40^\circ \) is shown for the original (dashed red) and expanded (solid blue) station-level trigger sets with bands indicating the systematic uncertainties. The trigger efficiency was determined using data above \(10^{16.8}\) eV and is extrapolated below this energy (shown in gray)

A description for the detection efficiency, \(\epsilon (E)\), below \(10^{17}\) eV, will be important for unfolding the detector effects close to the threshold energy (see Sect. 4). This quantity was fit using the results of the LTP simulations with \(\theta < 40^\circ \) and is well-parameterized by

$$\begin{aligned} \begin{aligned} \epsilon (E)&= \frac{1}{2}\left[ 1 + {\text {erf}}\left( \frac{\lg (E / \text {eV}) - \mu }{\sigma } \right) \right] , \end{aligned} \end{aligned}$$
(1)

where \({\text {erf}}(x)\) is the error function, \(\mu = 16.4 \pm 0.1\) and \(\sigma = 0.261 \pm 0.007\).
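As a quick numerical illustration of Eq. (1), the sketch below evaluates \(\epsilon (E)\) with the central values of \(\mu \) and \(\sigma \) quoted above, ignoring their uncertainties. The function name `detection_efficiency` is an illustrative choice.

```python
import math

MU = 16.4      # central value of mu quoted with Eq. (1)
SIGMA = 0.261  # central value of sigma quoted with Eq. (1)


def detection_efficiency(energy_ev: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Evaluate Eq. (1): 0.5 * [1 + erf((lg(E/eV) - mu) / sigma)]."""
    return 0.5 * (1.0 + math.erf((math.log10(energy_ev) - mu) / sigma))


eff_at_mu = detection_efficiency(10 ** 16.4)  # half efficiency by construction
eff_at_1e17 = detection_efficiency(1e17)      # near the full-efficiency threshold
```

By construction the efficiency is 50% at \(\lg (E/\text {eV}) = \mu \), and with these central values it exceeds 98% at \(10^{17}\) eV, in line with the threshold quoted in the text.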

For events used in this analysis, there is an additional requirement regarding the containment of the core within the array: only events in which the detector with the highest signal is surrounded by a hexagon of six stations that are fully operational are used. This criterion not only ensures adequate sampling of the shower but also allows the aperture of the SD-750 to be evaluated in a purely geometrical manner [22]. With these requirements, the SD-750 data set used below consists of about 560,000 events with \(\theta < 40^\circ \) and \(E>10^{17}\) eV recorded between 1 January 2014 and 31 August 2018. The minimum energy cut is motivated by the lowest energy to which we can cross-calibrate with adequate statistics the energy scale of the SD with that of the FD (see Sect. 3.3). The corresponding exposure, \({\mathcal {E}}\), after removal of time periods when the array was unstable (see Footnote 3; \({<}2\)% of the total), is \({\mathcal {E}}=(105\pm 4)\) km\(^2\) sr yr.

3 Energy measurements with the SD-750

In this section, the method for the estimation of the air-shower energy is detailed together with the resulting energy resolution of the SD-750 array. The measurement of the actual shower size is first described in Sect. 3.1 after which the corrections for attenuation effects are presented in Sect. 3.2. The energy calibration of the shower size after correction for attenuation is presented in Sect. 3.3. The energy resolution function is finally derived in Sect. 3.4.

3.1 Estimation of the shower size

The general strategy for the reconstruction of air showers using the SD-750 array is similar to that used for the SD-1500 array which is detailed extensively in [26]. In this process, the arrival direction is obtained using the start times of signals, assuming either a plane or a curved shower front, as the degrees of freedom allow. The lateral distribution of the signal is then fitted to an empirically-chosen function to infer the size of the air shower, which is used as a surrogate for the primary energy. The reconstruction algorithm thus produces an estimate of the arrival direction and the size of the air shower via a log-likelihood minimization.

The lateral fall-off of the signal, S(r), with increasing distance, r, to the shower axis in the shower plane is modeled with a lateral distribution function (LDF). The stochastic variations in the location and character of the leading interaction in the atmosphere result in shower-to-shower fluctuations of the longitudinal development that propagate onto fluctuations of the lateral profile, sampled at a fixed depth. Showers induced by identical primaries at the same energy and at the same incoming angle can thus be sampled at the ground level at a different stage of development. The LDF is consequently a quantity that varies on an event-by-event basis. However, the limited degrees of freedom, as well as the sparse sampling of the air-shower particles reaching the ground, prevent the reconstruction of all the parameters of the LDF for individual events. Instead, an average LDF, \(\langle S(r)\rangle \), is used in the reconstruction to infer the expected signal, \(S(r_\text {opt})\), that would be detected by a station located at a reference distance from the shower axis, \(r_\text {opt}\) [27, 28]. This reference distance is chosen so as to minimize the fluctuations of the shower size, down to \(\simeq \, 7\%\) in our case. The observed distribution of signals is then adjusted to \(\langle S(r)\rangle \) by scaling the normalization, \(S(r_\text {opt})\), in the fitting procedure.

The reference distance, or optimal distance, \(r_\text {opt}\), has been determined on an event-by-event basis by fitting the measured signals to different hypotheses for the fall-off of the LDF with distance to the core as in [28]. Via a fit of many power-law-like functions, the dispersion of signal expectations has been observed to be minimal at \(r_\text {opt}\simeq \, 450\) m, which is primarily constrained by the geometry of the array. The expected signal at 450 m from the core, S(450), has thus been chosen to define the shower-size estimate.

The functional shape chosen for the average LDF is a parabola in a log-log representation of \(\langle S(r)\rangle \) as a function of the distance to the shower core,

$$\begin{aligned} \ln \langle S(r) \rangle = \ln S(450)+\beta \,\rho + \gamma \,\rho ^2, \end{aligned}$$
(2)

where \(\rho =\ln (r/(450\,\text {m}))\), and \(\beta \) and \(\gamma \) are two structure parameters. The overall steepness of the fall-off of the signal from the core is governed by \(\beta \), while the concave deviation from a power-law function is given by \(\gamma \). The values of \(\beta \) and \(\gamma \) have been obtained in a data-driven manner, by using a set of air-shower events with more than three stations, none of which have a saturated signal. The zenith angle and the shower size are used to trace the age dependence of the structure parameters based on the following parameterization in terms of the reduced variables \(t=\sec \theta - 1.27\) and \(u=\ln S(450) - 5\):

$$\begin{aligned} \beta= & {} (\beta _0 + \beta _1 t + \beta _2 t^2)(1 + \beta _3 u), \end{aligned}$$
(3)
$$\begin{aligned} \gamma= & {} \gamma _0 + \gamma _1 u. \end{aligned}$$
(4)

For any specific set of values \({\mathbf {p}}=\{\beta _i, \gamma _i\}\), the reconstruction is then applied to calculate the following \(\chi ^2\)-like quantity, globally to all events:

$$\begin{aligned} Q^2({\mathbf {p}})=\frac{1}{N_\text {tot}}\sum _{k=1}^{N_\text {events}}\sum _{j=1}^{N_k}\frac{(S_{k,j}-\langle S(r_j,{\mathbf {p}})\rangle )^2}{\sigma _{k,j}^2}. \end{aligned}$$
(5)

The sum over \(N_k\) stations is restricted to those with observed signals larger than 5 VEM to minimize the impact of upward fluctuations of the station signals far from the core and hence to avoid biases from trigger effects, and to stations more than 150 m away from the core. The uncertainty \(\sigma _{k,j}\) is proportional to \(\sqrt{S_{k,j}}\) [26]. \(N_\text {tot}\) is the total number of stations in all such events. The best-fit {\(\beta _i\), \(\gamma _i\)} values are collected in Table 2.
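Equations (2) to (5) can be put together in a short sketch. Several simplifications are assumed here and are not part of the analysis itself: S(450) is taken as given for each event rather than re-fitted for each trial set of parameters, the proportionality constant in \(\sigma _{k,j}\propto \sqrt{S_{k,j}}\) is exposed as a hypothetical `sigma_scale`, and the event layout (a dictionary with `theta`, `s450`, `r`, `signal`) and the parameter values in the example are placeholders, not the values of Table 2.

```python
import numpy as np

R_OPT = 450.0  # reference distance in metres


def structure_parameters(theta, s450, b, g):
    """Eqs. (3) and (4): beta and gamma from zenith angle (rad) and S(450) (VEM)."""
    t = 1.0 / np.cos(theta) - 1.27
    u = np.log(s450) - 5.0
    beta = (b[0] + b[1] * t + b[2] * t**2) * (1.0 + b[3] * u)
    gamma = g[0] + g[1] * u
    return beta, gamma


def average_ldf(r, s450, beta, gamma):
    """Eq. (2): expected signal at axial distance r (metres)."""
    rho = np.log(np.asarray(r, dtype=float) / R_OPT)
    return s450 * np.exp(beta * rho + gamma * rho**2)


def q2(events, b, g, sigma_scale=1.0):
    """Eq. (5), restricted to stations with S > 5 VEM and r > 150 m."""
    total, n_tot = 0.0, 0
    for ev in events:
        beta, gamma = structure_parameters(ev["theta"], ev["s450"], b, g)
        r = np.asarray(ev["r"], dtype=float)
        s = np.asarray(ev["signal"], dtype=float)
        keep = (s > 5.0) & (r > 150.0)
        expected = average_ldf(r[keep], ev["s450"], beta, gamma)
        sigma2 = sigma_scale**2 * s[keep]  # sigma proportional to sqrt(S)
        total += float(np.sum((s[keep] - expected) ** 2 / sigma2))
        n_tot += int(keep.sum())
    return total / n_tot


# Placeholder parameters (NOT the best-fit values of Table 2).
b_demo = (-2.0, 0.5, 0.0, 0.0)
g_demo = (-0.1, 0.0)
beta_demo, gamma_demo = structure_parameters(0.3, 50.0, b_demo, g_demo)
r_demo = [200.0, 450.0, 800.0]
event_demo = {
    "theta": 0.3,
    "s450": 50.0,
    "r": r_demo,
    "signal": average_ldf(r_demo, 50.0, beta_demo, gamma_demo),
}
q2_demo = q2([event_demo], b_demo, g_demo)
```

Since the demo signals are generated from the same LDF that is being tested, the resulting \(Q^2\) vanishes; with real data, \(Q^2({\mathbf {p}})\) would be minimized over the six parameters.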

Table 2 Best-fit {\(\beta _i\), \(\gamma _i\)} values defining the structure parameters of the LDF

3.2 Correction of attenuation effects

There are two significant observational effects that impact the precision of the estimation of the shower size. Both of these effects are primarily a result of the variable slant depth that a shower must traverse before being detected with the SD. Since the mean atmospheric overburden is 875 g/cm\(^2\) at the location of the Observatory, nearly all observed showers in the energy range considered in this analysis have already reached their maximum size and have started to attenuate [29]. Thus, an increase in the slant depth of a shower results in a more attenuated cascade at the ground, directly impacting the observed shower size.

The first observational effect is related to the changing weather at the Observatory. Fluctuations in the air pressure equate to changes in the local overburden and thus showers observed during periods of relatively high pressure result in an underestimated shower size. Similarly, variations in the air density change the Molière radius, which in turn affects the lateral spread of the shower particles. The increased lateral spread of the secondaries, or equivalently, the decrease in the density of particles on the ground, also leads to a systematically underestimated shower size. Both the air density and pressure have typical daily and yearly cycles that imprint similar cycles upon the estimation of the shower size.

The relationship between these two atmospheric parameters and the estimated shower sizes has been studied using events detected with the SD [30]. From this relationship, a model was constructed to scale the observed value of S(450) to what would have been measured had the shower been instead observed at a time with the daily and yearly average atmosphere. When applying this correction to individual air showers, the measurements from the weather stations located at the FD sites are used. The values of S(450) are scaled up or down according to these measurements, resulting in a shift of at most a few percent. The shower size is eventually the proxy of the air-shower energy, which is calibrated with events detected with the FD (see Sect. 3.3). Since the FD operates only at night when, in particular, the air density is relatively low, the scaling of S(450) to a daily and yearly average atmosphere corrects for a \({\simeq }\,0.5\%\) shift in the assigned energies.

The second observational effect is geometric, wherein showers arriving at larger zenith angles have to go through more atmosphere before reaching the SD. To correct for this effect, the Constant Intensity Cut (CIC) method [31] is used. The CIC method relies on the assumption that cosmic rays arrive isotropically, which is consistent with observations in the energy range considered [32]. The intensity is thus expected to be independent of arrival direction after correcting for the attenuation. Deviations from a constant behavior can thus be interpreted as being due to attenuation alone. Based on this property, the CIC method allows us to determine the attenuation curve as a function of the zenith angle and therefore to infer a zenith-independent shower-size estimator.

We empirically chose a functional form which describes the relative amount of attenuation of the air shower,

$$\begin{aligned} f_\text {CIC}(\theta ) = 1 + a x + bx^2. \end{aligned}$$
(6)

The scaling of this function is normalized to the attenuation of a shower arriving at \(35^\circ \) by choosing \(x = \sin ^2 35^\circ - \sin ^2 \theta \). For a given air shower, the observed shower size can be scaled using Eq. (6) to get the equivalent signal of a shower arriving with the reference zenith angle, \(S_{35}\), via the relationship \(S(450) = S_{35}\,f_\text {CIC}(\theta )\).
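The conversion from S(450) to \(S_{35}\) amounts to a one-line rescaling, sketched below. The coefficients used in the example are placeholders chosen only to show the sign of the effect; they are not the fitted values of Table 3, and the shower-size dependence of the coefficients introduced later in this section is ignored.

```python
import math

SIN2_REF = math.sin(math.radians(35.0)) ** 2  # reference zenith angle of 35 degrees


def f_cic(theta_deg: float, a: float, b: float) -> float:
    """Eq. (6) with x = sin^2(35 deg) - sin^2(theta)."""
    x = SIN2_REF - math.sin(math.radians(theta_deg)) ** 2
    return 1.0 + a * x + b * x * x


def s35(s450: float, theta_deg: float, a: float, b: float) -> float:
    """Invert S(450) = S35 * f_CIC(theta) to get the zenith-independent size."""
    return s450 / f_cic(theta_deg, a, b)


# Placeholder coefficients (NOT the values of Table 3).
A_DEMO, B_DEMO = 1.0, 0.0
s35_vertical = s35(20.0, 0.0, A_DEMO, B_DEMO)   # scaled down: less attenuated than at 35 deg
s35_inclined = s35(20.0, 40.0, A_DEMO, B_DEMO)  # scaled up: more attenuated than at 35 deg
```

By construction \(f_\text {CIC}(35^\circ ) = 1\), so a shower arriving at the reference angle is left unchanged, while vertical showers are scaled down and more inclined ones are scaled up.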

Isotropy implies that \({\mathrm {d}N/\mathrm {d}\sin ^2\theta }\) is constant. Thus, the shape of \(f_\text {CIC}(\theta )\) is determined by finding the parameters a and b for which the cumulative distribution function (CDF) of events with \(S(450) > S_\text {cut}\, f_\text {CIC}(\theta )\) is linear in \(\sin ^2 \theta \), as assessed with an Anderson-Darling test [33]. The parameter \(S_\text {cut}\) defines the size of a shower with \(\theta = 35^\circ \) at which the CIC tuning is performed, the choice of which is described below.
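The flatness criterion can be sketched with a hand-written Anderson-Darling statistic for uniformity in \(\sin ^2\theta \). This is a minimal illustration, not the fitting code of the analysis: the function names, the rescaling of \(\sin ^2\theta \) to the unit interval with a maximum zenith angle of \(40^\circ \), and the callable `f_cic` interface are assumptions.

```python
import numpy as np


def anderson_darling_uniform(u):
    """Anderson-Darling statistic A^2 of a sample against Uniform(0, 1)."""
    u = np.sort(np.clip(np.asarray(u, dtype=float), 1e-12, 1.0 - 1e-12))
    n = u.size
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum (2i - 1) [ln u_(i) + ln(1 - u_(n+1-i))]
    return float(-n - np.sum((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1]))) / n)


def cic_flatness(theta_deg, s450, s_cut, f_cic, theta_max_deg=40.0):
    """A^2 of sin^2(theta) for events with S(450) > s_cut * f_cic(theta)."""
    theta_deg = np.asarray(theta_deg, dtype=float)
    s450 = np.asarray(s450, dtype=float)
    selected = s450 > s_cut * f_cic(theta_deg)
    sin2 = np.sin(np.radians(theta_deg[selected])) ** 2
    return anderson_darling_uniform(sin2 / np.sin(np.radians(theta_max_deg)) ** 2)


# Deterministic toy samples: an evenly spaced grid (flat) and a skewed version.
grid = (np.arange(1000) + 0.5) / 1000.0
a2_flat = anderson_darling_uniform(grid)
a2_skewed = anderson_darling_uniform(grid**2)

# Toy events that are exactly flat in sin^2(theta) up to 40 degrees.
toy_theta = np.degrees(np.arcsin(np.sqrt(grid * np.sin(np.radians(40.0)) ** 2)))
a2_events = cic_flatness(toy_theta, np.full(1000, 100.0), 10.0, lambda th: np.ones_like(th))
```

In a tuning loop, `cic_flatness` would be evaluated for trial values of a and b, and the pair giving the smallest statistic (the flattest distribution) would be retained.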

Since the attenuation that a shower undergoes before being detected is related to the depth of shower maximum and the particle content, the shape of \(f_\text {CIC}(\theta )\) is dependent on both the energy and the average mass of the primary particles at that energy. Further, this implies that a single choice of \(S_\text {cut}\) could introduce a mass and/or energy bias. Thus, Eq. (6) was extended to allow the polynomial coefficients, \(k \in \{a,\,b\}\), to be functions of S(450) via \(k( S(450)) = k_0 + k_1 y + k_2 y^2\) where \(y = \lg (S(450) / \text {VEM})\). The function \(f_\text {CIC}(\theta , S(450))\) was tuned using an unbinned likelihood.
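The size-dependent extension can be sketched as follows. The coefficient triples below are hypothetical placeholders (the fitted values belong to Table 3), and the function names are illustrative.

```python
import math

# Hypothetical coefficient triples (k0, k1, k2); NOT the fitted Table 3 values.
A_COEFFS = (1.5, 0.1, 0.0)
B_COEFFS = (-1.0, 0.0, 0.0)

def coeff(coeffs, s450_vem):
    """k(S(450)) = k0 + k1*y + k2*y**2 with y = lg(S(450)/VEM)."""
    k0, k1, k2 = coeffs
    y = math.log10(s450_vem)
    return k0 + k1 * y + k2 * y * y

def f_cic(theta_deg, s450_vem):
    """Size-dependent attenuation f_CIC(theta, S(450))."""
    x = math.sin(math.radians(35.0)) ** 2 - math.sin(math.radians(theta_deg)) ** 2
    a = coeff(A_COEFFS, s450_vem)
    b = coeff(B_COEFFS, s450_vem)
    return 1.0 + a * x + b * x * x
```

Because the coefficients depend on the observed S(450) rather than on \(S_{35}\), the corrected size still follows from a single division, \(S_{35} = S(450)/f_\text {CIC}(\theta , S(450))\).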

The fit was performed so as to guarantee equal intensity of the integral spectra using eight threshold values of \(S_\text {cut}\) between 10 and 70 VEM, evenly spaced in log-scale. These values were chosen to avoid triggering biases on the low end and the dwindling statistics on the high end. The best fit parameters are given in Table 3. The resulting 2D distribution of the number of events, in equal bins of \(\sin ^2\theta \) and \(\lg S_{35}\), is shown in Fig. 6, bottom panel. It is apparent that the number of events above any \(\sin ^2{\theta }\) value is equalized for any constant line for \(\lg S_{35}\gtrsim 0.7\). The magnitude of the CIC correction is \((-27\pm 4)\)% for vertical showers (depending on S(450)) and \(+15\)% for a zenith angle of \(40^\circ \).

Fig. 6

Top: histogram of reconstructed shower sizes and zenith angles. The solid black line represents the shape of \(f_\text {CIC}\) at 10 VEM. Bottom: same distribution but as a function of corrected shower size, \(S_{35}\), and zenith angle. The dashed black line indicates the mapping of the solid black line in the top figure after inverting the effects of the CIC correction

Table 3 The energy dependence of the CIC parameters (Eq. (6)) is given below

3.3 Energy calibration of the shower size

Fig. 7

Correlation between the SD shower-size estimator, \(S_{35}\), and the reconstructed FD energy, \(E_\text {FD}\), for the selected hybrid events

The conversion of the shower size, corrected for attenuation, into energy is based on a special set of showers, called golden hybrid events, which can be reconstructed independently by the FD and by the SD. The FD allows for a calorimetric estimate of the primary energy except for the contribution carried away by particles that reach the ground. The amount of this so-called invisible energy, \({\simeq }\,20\%\) at \(10^{17}\) eV and \({\simeq }\,15\%\) at \(10^{18}\) eV, has been evaluated using simulations [34] tuned to measurements at \(10^{18.3}\) eV so as to correct for the discrepancy in the muon content of simulated and observed showers [35]. The empirical relationship between the FD energy measurements, \(E_\text {FD}\), and the corrected SD shower size, \(S_{35}\), allows for the propagation of the FD energy scale to the SD events.

FD events were selected based on quality and fiducial criteria aimed at guaranteeing a precise estimation of \(E_\text {FD}\) as well as at minimizing any acceptance biases towards light or heavy mass primaries introduced by the field of view of the FD telescopes. The cuts used for the energy calibration are similar to those described in [29, 36]. They include the selection of data when the detectors are properly operational and the atmospheric properties, such as cloud coverage and the vertical aerosol depth, are suitable for a good determination of the air-shower profile. A further quality selection includes requirements on the uncertainties of the energy assignment (less than 12%) and of the reconstruction of the depth at the maximum of the air-shower development (less than 40 g cm\(^{-2}\)). A possible bias due to a selection dependency on the primary mass is avoided by using an energy-dependent fiducial volume determined from data as in [29].

Restricting the data set to events with \(E_\text {FD} \ge 10^{17}\) eV (to ensure that the SD is operating in the regime of full efficiency), there are 1980 golden-hybrid events available to establish the relationship between \(S_{35}\) and \(E_\text {FD}\). Forty-five events in the energy range between \(10^{16.5}\) eV and \(10^{17}\) eV are included in the likelihood as described in [37]. As \(S_{35}\) depends on the mass composition of the primary particles, the relation between \(S_{35}\) and \(E_\text {FD}\), shown in Fig. 7, accounts for the trend of the composition change with energy inherently as the underlying mass distribution is directly sampled by the FD. Measurements of \(\langle X_\text {max}\rangle \) suggest that this composition trend follows a logarithmic evolution up to an energy of \(10^{18.3}\) eV, beyond which the number of events available for this analysis is too small to affect the results in any way [36]. So we choose a power-law type relationship,

$$\begin{aligned} E_{\mathrm{SD}}=A S_{35}^B, \end{aligned}$$
(7)

which is expected from Monte-Carlo simulations in the case of a single logarithmic dependence of \(X_\text {max}\) with energy. The energy of an event with \(S_{35} = 1\) VEM arriving at the reference angle, A, and the logarithmic slope, B, are fitted to the data by means of a maximum likelihood method which models the distribution of golden-hybrid events in the plane of energies and shower sizes. The use of these events allows us to infer A and B while accounting for the clustering of events in the range \(10^{17.4}\) to \(10^{17.7}\) eV observed in Fig. 7 due to the fall-off of the energy spectrum combined with the restrictive golden-hybrid acceptance for low-energy, dim showers. A comprehensive derivation of the likelihood function can be found in [37].

The probability density function entering the likelihood procedure, detailed in [37], is built by folding the cosmic-ray intensity, as observed through the effective aperture of the FD, with the resolution functions of the FD and of the SD. Note that to avoid the need to model accurately the cosmic-ray intensity observed through the effective aperture of the telescopes (and thus to reduce reliance on mass assumptions), the observed distribution of events passing the cuts described above is used. The FD energy resolution, \(\sigma _\text {FD}(E)/E_\text {FD}\), is typically between 6% and 8% [38]. It results from the statistical uncertainty arising from the fit to the longitudinal profile, the uncertainties in the detector response, the uncertainties in the models of the state of the atmosphere, and the uncertainties in the expected fluctuations from the invisible energy. The SD shower-size resolution, \(\sigma _\text {SD}(S_{35})/S_{35}\), is, on the other hand, comprised of two terms, the detector sampling fluctuations, \(\sigma _\text {det}(S_{35})\), and the shower-to-shower fluctuations, \(\sigma _\text {sh}(S_{35})\). The former is obtained from the sum of the squares of the uncertainties from the reconstructed shower size and zenith angle, and from the attenuation-correction terms that make up the \(S_{35}\) assignment. The latter stem from the stochastic nature of both the depth of first interaction of the primary and the subsequent development of the particle cascade. This contribution thus depends on the CR mass composition and on the hadronic interactions in air showers. For this reason, the derivation of A and B follows a two-step procedure. A first iteration of the fit is carried out by using an educated guess for \(\sigma _\text {sh}(S_{35})\), as expected from Monte-Carlo simulations for a mass-composition scenario compatible with data [29]. 
The total resolution \(\sigma _\text {SD}(S_{35})/S_{35}\) is then extracted from data as explained next in Sect. 3.4 and used in a second iteration.

Table 4 The systematic uncertainties on the FD energy scale are given below. Lines with multiple entries represent the values at the low and high end of the considered energy range (\(\simeq \) 10\(^{17}\) and \(\simeq \) 10\(^{19}\) eV, respectively)

The resulting relationship is shown as the red line in Fig. 7 with best-fit parameters such that \(A=(13.2\pm 0.3)\) PeV and \(B=1.002\pm 0.006\). The goodness of the fit is supported by the \(\chi ^2/\text {NDOF} = 2120/1978\) (\(p = 0.013\)). We use these values of A and B to calibrate the shower sizes in terms of energies by defining the SD estimator of energies, \(E_\text {SD}\), according to Eq. (7). The SD energy scale is set by the calibration procedure and thus it inherits the A and B calibration-parameters uncertainties and the FD energy-scale uncertainties, listed in Table 4. The systematic uncertainty, after addition in quadrature, of the energy scale is about 14% and is almost energy independent. The energy independence is a consequence of the 10% uncertainty of the FD calibration, which is the dominant contribution.
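For illustration, Eq. (7) with the central values of A and B quoted above can be written as a short sketch. The function names are hypothetical and the parameter uncertainties are ignored.

```python
A_EV = 13.2e15   # A = 13.2 PeV, central value quoted in the text
B = 1.002        # logarithmic slope, central value quoted in the text

def energy_sd(s35_vem):
    """SD energy estimate E_SD = A * S35**B (Eq. (7)), in eV."""
    return A_EV * s35_vem ** B

def s35_from_energy(energy_ev):
    """Inverse of Eq. (7): shower size (in VEM) corresponding to a given energy."""
    return (energy_ev / A_EV) ** (1.0 / B)
```

With these values, the threshold energy of \(10^{17}\) eV corresponds to \(S_{35}\) of roughly 7.5 VEM.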

3.4 Resolution function of the SD-750 array

The SD resolution as a function of energy is needed in several steps of the analysis. In the regime of full efficiency, it can be considered as a Gaussian function centered on the true energy, the width of which reflects the statistical uncertainty associated with the detection and reconstruction processes on one hand, and the stochastic development of the particle cascade on the other hand. The combination of the two can be estimated for the golden hybrid events, thus allowing us to account for the contribution of the shower-to-shower fluctuations in a data-driven way.

Each event observed by the SD and FD results in two independent measurements of the air-shower energy, \(E_\text {SD}\) and \(E_\text {FD}\), respectively. Unlike for the SD, the FD directly provides a view of the shower development so a total energy resolution, \(\sigma _\text {FD}(E)\), can be estimated for each of the golden hybrid events. Using the known \(\sigma _\text {FD}(E)\), the resolution of the SD can be determined by studying the distribution of the ratio of the two energy measurements.

Fig. 8

An example of the ratio of the energy assignments for the SD and FD is shown with black crosses for the energy bin indicated in the plot. The best fit ratio distribution for this bin is shown by the black line

For two independent, Gaussian-distributed random variables, X and Y, their ratio, \(z=X/Y\), produces a ratio distribution that depends on the means (\(\mu _X\), \(\mu _Y\)) and standard deviations (\(\sigma _X\), \(\sigma _Y\)) of the two variables, \({\text {PDF}}(z; \mu _X, \mu _Y, \sigma _X, \sigma _Y)\). Likewise, the ratio of the two energy measurements, \(z = E_\text {SD} / E_\text {FD}\), follows such a distribution to first order. Because the FD sets the energy scale of the Observatory, there is inherently no bias in the energy measurements with respect to its own scale and thus, on average, \(\mu _\text {FD}(E)=1\). Using the golden hybrid data set, the ratio distribution was fit in an unbinned likelihood analysis, \({\text {PDF}}(z; \mu _\text {SD}(E), 1, \sigma _\text {SD}(E), \sigma _\text {FD}(E))\).
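The idea behind the ratio method can be checked with a small Monte-Carlo sketch. It does not implement the full ratio PDF or the unbinned likelihood fit; it only illustrates that, to first order, the width of \(z = E_\text {SD}/E_\text {FD}\) is the quadratic sum of the two relative resolutions, so that the SD term can be isolated once \(\sigma _\text {FD}\) is known. The resolution values below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed relative resolutions for one energy bin (illustrative values only).
SIGMA_SD = 0.15
SIGMA_FD = 0.07

n = 200_000
e_sd = rng.normal(1.0, SIGMA_SD, n)   # SD energy in units of the true energy
e_fd = rng.normal(1.0, SIGMA_FD, n)   # FD energy, unbiased by construction
z = e_sd / e_fd

width_mc = z.std()
width_first_order = np.hypot(SIGMA_SD, SIGMA_FD)

# With sigma_FD known, the SD contribution is recovered to first order.
sigma_sd_estimate = np.sqrt(width_mc ** 2 - SIGMA_FD ** 2)
```

The small residual difference between `width_mc` and `width_first_order` comes from higher-order terms of the ratio distribution, which the full PDF used in the fit accounts for.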

Fig. 9

The total SD energy resolution, as calculated using the golden hybrid events (red circles) is shown in bins with equal statistics. The parameterization of the resolution is shown by the solid blue line and the corresponding 68% confidence interval in dashed lines. The energy resolution, calculated using mass-weighted MC air showers (gray squares), is shown as a verification of the method

An example of the measured energy-ratio distributions is shown in Fig. 8 with the fitted curve overlaid on the data points. Carrying out the fit in different energy bins, the SD resolution, shown by the red points in Fig. 9, is represented by,

$$\begin{aligned} \frac{\sigma _\text {SD}(E)}{E} = (0.06 \pm 0.02) + (0.05 \pm 0.01) \sqrt{\frac{1\,\text {EeV}}{E}}. \end{aligned}$$
(8)

The corresponding curve is overlaid in blue, bracketed by the 68% confidence region.
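For reference, the central value of Eq. (8) can be evaluated as in the following sketch (the function name is hypothetical, and the parameter uncertainties are not propagated):

```python
import math

def sd_relative_resolution(energy_ev):
    """Central value of Eq. (8): sigma_SD(E)/E = 0.06 + 0.05*sqrt(1 EeV / E)."""
    return 0.06 + 0.05 * math.sqrt(1.0e18 / energy_ev)
```

This gives a relative resolution of about 22% at \(10^{17}\) eV and 11% at \(10^{18}\) eV.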

To measure the spectrum above the \(10^{17}\) eV threshold, the knowledge of the resolution function, which induces bin-to-bin migration of events, and of the detection efficiency is also required for energies below this threshold. As a verification, particularly in the energy region where Eq. (8) is extrapolated, a Monte-Carlo analysis was performed. A set of 325,000 CORSIKA [39] air showers was used, consisting of proton, helium, oxygen, and iron primaries with energies above \(10^{16}\) eV. EPOS-LHC [40] was used as the hadronic interaction model. The air showers were run through the full SD simulation and reconstruction algorithms. The events were weighted based on the primary mass according to the Global Spline Fit (GSF) model [41] to account for the changing mass-evolution near the second knee and ankle. The reconstructed values of S(450) were corrected by applying the energy-dependent CIC method to obtain values for \(S_{35}\) and these values were then calibrated against the Monte-Carlo energies. During the calibration, a further weighting was performed based on the energy distribution of golden hybrid events to account for the hybrid detection efficiency. Following the calibration procedure, each MC event was assigned an energy in the FD energy scale (i.e. \(E_\text {MC} \rightarrow S_{35} \rightarrow E_\text {FD}\)).

The SD energy resolution was calculated using the mass-weighted simulations and is shown in gray squares in Fig. 9. Indeed, the simulated and measured SD resolutions show a similar trend and agree to within the uncertainties, supporting the golden hybrid method.

In the energy region at and below \(10^{17}\) eV, systematic effects also come into play in the energy estimate. An energy-dependent offset, a bias, is thus expected in the resolution function for several reasons:

  1. The application of the trigger below threshold, combined with the finite energy resolution, causes an overestimate of the shower size, on average, which is then propagated to the energy assignment.

  2. The linear relationship assumed in Eq. (7) cannot account for a possible sudden change in the evolution of the mass composition with energy. Such a change would require a broken power law for the energy calibration relationship.

  3. In the energy range where the SD is not fully efficient, the SD efficiency is larger for light primary nuclei, thus preventing a fair sampling of \(S_{35}\) values over the underlying mass distribution.

Because there is an insufficient number of FD events which pass the fiducial cuts below \(10^{17}\) eV, the bias was characterized using the same air-shower simulations as used for the resolution cross-check. The remaining relative energy bias is shown in Fig. 10.

Fig. 10

The bias of the energy assignment for the SD-750 was studied using Monte Carlo simulations, weighted according to the GSF model [41]. The ratio of the assigned and expected values as a function of energy is shown (red circles) along with the parameterization (blue line) given in Eq. (9)

The ratio between the reconstructed and expected values is shown as the red points as a function of \(E_\text {FD}\). A larger bias of \(\simeq \) 20% is seen at low energies, where upward fluctuations are necessarily selected by the triggering conditions. In the range considered for the energy spectrum, \(E > 10^{17}\) eV, the bias is 3% or less. To complete the description of the SD resolution function, the relative bias was fit to an empirical function,

$$\begin{aligned} b_\text {SD}(E)= b_0 (\lg \tfrac{E}{\mathrm {eV}} - b_1)\exp \left( -b_2(\lg \tfrac{E}{\mathrm {eV}} - b_3)^2\right) + b_4. \nonumber \\ \end{aligned}$$
(9)

The corresponding best fit parameters (blue line in Fig. 10) are given in Table 5.

Table 5 Best-fit parameters for the relative energy bias of the SD-750, \( b_\text {SD}(E)\), given in Eq. (9)

4 Measurement of the energy spectrum

To build the energy spectrum from the reconstructed energy distribution, we need to correct the raw spectrum, obtained as \(J^\text {raw}_i=N_i/({\mathcal {E}}\Delta E_i)\), for the bin-to-bin migrations of events due to the finite accuracy with which the energies are assigned. The energy bins are chosen to be regularly sized in decimal logarithm, \(\Delta \lg E_i=0.1\), commensurate with the energy resolution. The level of migration is driven by the resolution function, the detection efficiency in the energy range just below the threshold energy, and the steepness of the spectrum. To correct for these effects, we use the bin-by-bin correction approach presented in [21]. It consists of folding the detector effects into a proposed spectrum function, \(J(E,{\mathbf {k}})\), with free parameters, \({\mathbf {k}}\), such that the result describes the set of the observed number of events \(N_i\). The set of expectations, \(\nu _i\), is obtained as \(\nu _i({\mathbf {k}})=\sum _j R_{ij}\mu _j({\mathbf {k}})\), where the \(R_{ij}\) coefficients (reported in a matrix format in the Supplementary material) describe the bin-to-bin migrations, and where \(\mu _j\) are the expectations in the case of an ideal detector obtained by integrating the proposed spectrum over \(E_j\) and \(E_j+\Delta E_j\) scaled by \({\mathcal {E}}\). The optimal set of free parameters, \(\hat{{\mathbf {k}}}\), is inferred by minimizing a log-likelihood function built from the Poisson probabilities to observe \(N_i\) events when \(\nu _i(\hat{{\mathbf {k}}})\) are expected.
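The forward-folding step can be sketched in a few lines. Everything below is a toy under stated assumptions: a single power law as the proposed function, an arbitrary exposure, and a Gaussian migration matrix with a constant 15% resolution and full efficiency. It is not the \(R_{ij}\) matrix of the Supplementary material. The Poisson deviance is used as the quantity to minimize; it differs from \(-2\ln {\mathcal {L}}\) only by a term that does not depend on the parameters.

```python
import numpy as np

lg_edges = np.linspace(17.0, 19.0, 21)     # 20 bins of width 0.1 in lg(E/eV)
edges = 10.0 ** lg_edges
EXPOSURE = 100.0                            # toy exposure (assumed), km^2 sr yr

def mu_ideal(j0, gamma, e0=1.0e17):
    """Ideal-detector counts: exposure times the integral of
    J(E) = j0 * (E/e0)**(-gamma) over each energy bin."""
    def primitive(e):
        return j0 * e0 / (1.0 - gamma) * (e / e0) ** (1.0 - gamma)
    return EXPOSURE * (primitive(edges[1:]) - primitive(edges[:-1]))

def toy_migration_matrix(n_bins, rel_resolution=0.15):
    """Toy R_ij: fraction of events from true bin j reconstructed in bin i."""
    idx = np.arange(n_bins)
    sigma_lg = rel_resolution / np.log(10.0)        # relative width -> width in lg E
    delta_lg = 0.1 * (idx[:, None] - idx[None, :])
    r = np.exp(-0.5 * (delta_lg / sigma_lg) ** 2)
    return r / r.sum(axis=0, keepdims=True)         # every true event lands somewhere

def deviance(n_obs, nu):
    """Poisson deviance between observed counts and folded expectations."""
    n_obs = np.asarray(n_obs, dtype=float)
    safe = np.where(n_obs > 0, n_obs, 1.0)          # avoids log(0) in empty bins
    return 2.0 * np.sum(nu - n_obs + n_obs * np.log(safe / nu))

R = toy_migration_matrix(len(edges) - 1)
mu = mu_ideal(j0=1.0e-13, gamma=3.3)
nu = R @ mu                                          # nu_i = sum_j R_ij mu_j
```

Because the spectrum is steep, more events migrate upward from the populated low-energy bins than downward, so away from the edges of the toy binning the folded expectation `nu` exceeds `mu`.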

Fig. 11

Residuals of the SD-750 raw spectrum with respect to the power-law function \(J^\text {ref}(E)\). Data points from the SD-1500 spectrum measurement are superimposed

To choose the proposed function, we plot in Fig. 11 the residuals (red dots) of the SD-750 raw spectrum with respect to a reference function, \(J^\text {ref}(E)\), that fits the SD-1500 spectrum below the ankle energy down to the SD-1500 threshold energy, \(10^{18.4}\) eV. A re-binning was applied at and above \(10^{19}\) eV to avoid too large statistical fluctuations.

The reference function in this energy range, as reported in [21], is

$$\begin{aligned} J^\text {ref}(E)=J_0^\text {ref}\left( \frac{E}{10^{18.5}\,\text {eV}}\right) ^{-\gamma _1^\text {ref}}, \end{aligned}$$
(10)

with \(J_0^\text {ref}=1.315{\times }10^{-18}\) km\(^{-2}\) yr\(^{-1}\) sr\(^{-1}\) eV\(^{-1}\) and \(\gamma _1^\text {ref}=3.29\). The residuals of the SD-1500 unfolded spectrum with respect to \(J^\text {ref}(E)\) are also shown as open squares in Fig. 11. The sharp transition at \({\simeq }\,10^{18.7}\) eV to a different power law corresponds to the spectral feature known as the ankle. Such a transition is also observed, with much lower sensitivity, using data from the SD-750 array. Below \({\simeq }\,10^{18.7}\) eV and down to \({\simeq }\,10^{17.4}\) eV, one can see a shift of the raw SD-750 spectrum compared to \(J^\text {ref}(E)\). This is expected from a combination of primarily the resolution effects to be unfolded and of a possible mismatch, within the energy-dependent budget of uncorrelated uncertainties, of the SD-1500 and SD-750 \(E_\text {SD}\) energy scales. Below \({\simeq }\,10^{17.4}\) eV, a slight roll-off begins. Overall, these residuals are suggestive of a power-law function to describe the data leading up to the ankle energy where the spectrum hardens, with a gradually changing spectral index over the lowest energies studied. Consequently, the proposed function is chosen as three power laws with transitions occurring over adjustable energy ranges,

$$\begin{aligned} J(E,{\mathbf {k}}) {=} J_0 \left( \frac{E}{10^{17}\,\text {eV}}\right) ^{-\gamma _0} \prod _{i=0}^1\left[ 1{+}\left( \frac{E}{E_{ij}}\right) ^{\frac{1}{\omega _{ij}}}\right] ^{(\gamma _i{-}\gamma _j)\omega _{ij}}, \nonumber \\ \end{aligned}$$
(11)

with \(j=i+1\). The normalization factor \(J_0\), the three spectral indices \(\gamma _i\), and the transition parameter \(\omega _{01}\) constitute the free parameters in \({\mathbf {k}}\). The transition parameter \(\omega _{12}\), constrained with much more sensitivity using data from the SD-1500, is fixed at \(\omega _{12}=0.05\) [21].
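A minimal implementation of Eq. (11) is sketched below, together with a numerical check that the local spectral index approaches \(\gamma _1\) between the two breaks and \(\gamma _2\) above the ankle. The values of \(\gamma _1\), \(E_{01}\), \(E_{12}\), \(\omega _{01}\) and \(\omega _{12}\) are those quoted in the text; `j0`, \(\gamma _0\) and \(\gamma _2\) are placeholders, since the fitted values belong to Table 6. Function names are hypothetical.

```python
import numpy as np

def spectrum(energy_ev, j0, gammas, breaks_ev, omegas, e_norm=1.0e17):
    """Eq. (11): power law with smooth transitions between consecutive indices.

    gammas    -- (gamma_0, gamma_1, gamma_2)
    breaks_ev -- (E_01, E_12)
    omegas    -- (omega_01, omega_12)
    """
    e = np.asarray(energy_ev, dtype=float)
    j = j0 * (e / e_norm) ** (-gammas[0])
    for i, (e_break, omega) in enumerate(zip(breaks_ev, omegas)):
        exponent = (gammas[i] - gammas[i + 1]) * omega
        j = j * (1.0 + (e / e_break) ** (1.0 / omega)) ** exponent
    return j

def local_index(energy_ev, **kw):
    """Numerical spectral index, -dlnJ/dlnE, from a small finite difference."""
    lo, hi = 0.99 * energy_ev, 1.01 * energy_ev
    return -np.log(spectrum(hi, **kw) / spectrum(lo, **kw)) / np.log(hi / lo)

# j0, gamma_0 and gamma_2 are placeholders; the other values are quoted in the text.
params = dict(j0=1.0, gammas=(2.6, 3.34, 2.5),
              breaks_ev=(1.24e17, 3.9e18), omegas=(0.49, 0.05))
```

The small value of \(\omega _{12}\) makes the ankle transition sharp, whereas the larger \(\omega _{01}\) spreads the low-energy transition over several bins.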

Fig. 12

Unfolded energy spectrum derived using data from the SD-750 array

Table 6 Best-fit values of the spectral parameters (Eq. (11)). The parameter \(\omega _{12}\) is fixed to the value constrained in [21]. Note that the parameters \(\gamma _0\) and \(E_{01}\) correspond to features below the measured energy region and are treated only as aspects of the unfolding fixed to their best-fit values to infer the uncertainties of the measured spectral parameters

Combining all the ingredients at our disposal, we obtain the final estimate of the spectrum, \(J_i\), unfolded for the effects of the response of the detector and shown in Fig. 12. It is obtained as

$$\begin{aligned} J_i=\frac{\mu _i}{\nu _i}J^\text {raw}_i = c_i\,J^\text {raw}_i, \end{aligned}$$
(12)

where the \(\mu _i\) and \(\nu _i\) coefficients are estimated using the best-fit parameters \(\hat{{\mathbf {k}}}\). Their ratios define the bin-by-bin corrections used to produce the unfolded spectrum. The correction applied extends from 0.84 at \(10^{17}\) eV to 0.99 around the ankle (see Appendix B). The best-fit spectral parameters are reported in Table 6, while the statistical correlations between the parameters are detailed in Appendix B (Table 9). The goodness-of-fit of the forward-folding procedure is attested by the deviance of 15.9, which, if considered to follow the C statistics [42], can be compared to the expectation of \(16.2\pm 5.6\) to yield a p-value of 0.50.
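The bin-by-bin correction itself is a one-line operation once \(\mu _i\) and \(\nu _i\) are available. The sketch below uses two illustrative bins whose counts, exposure and bin widths are invented so that the correction factors equal the two extreme values quoted above; the function name is hypothetical.

```python
import numpy as np

def unfold(n_obs, mu, nu, exposure, delta_e):
    """J_i = (mu_i / nu_i) * J_raw_i, with J_raw_i = N_i / (exposure * dE_i)."""
    n_obs, mu, nu, delta_e = (np.asarray(v, dtype=float)
                              for v in (n_obs, mu, nu, delta_e))
    j_raw = n_obs / (exposure * delta_e)
    c = mu / nu                       # bin-by-bin correction factors
    return c * j_raw, c

# Invented inputs for illustration only.
flux, corr = unfold(n_obs=[100, 50], mu=[84.0, 49.5], nu=[100.0, 50.0],
                    exposure=2.0, delta_e=[1.0, 2.0])
```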

Fig. 13

Unfolded energy spectrum of the SD-750, scaled by \(E^{2.6}\)

The fitting function is shown in Fig. 13, superimposed to the spectrum scaled by \(E^{2.6}\), allowing one to better appreciate its characteristics, from the turn-over at around \(10^{17}\) eV up to a few \(10^{19}\) eV, thus including the ankle. The turn-over is observed with a very large exposure, unprecedented at such energies. However, as indicated by the magnitude of the transition parameter, \(\omega _{01}\simeq 0.49\), the change of the spectral index occurs over an extended \(\Delta \lg E\simeq 0.5\) energy range, so that the spectral index \(\gamma _0\) cannot be observed but only indirectly inferred. Also, the value of the energy break, \(E_{01}\simeq 1.24{\times }10^{17}\) eV, turns out to be close to the threshold energy. These two facts thus imply that, while a spectral break is found beyond any doubt, it cannot wholly be characterised, as only the higher energy portion is actually observed. Consequently, the fit values describing \(E_{01}\) and \(\gamma _0\) are not to be considered as true measurements but as necessary parameters in the fit function, the statistical resolutions of which are on the order of 35%. Once we infer their best-fit values, we use these values as “external parameters” to estimate the uncertainties of the other spectral parameters. This procedure gives rise to an increase of the systematic uncertainties, but is necessary as \(E_{01}\) and \(\gamma _0\) are not directly observed. Beyond the smooth turn-over around \(E_{01}\), the intensity can be described by a power-law shape as \(J(E)\propto E^{-\gamma _1}\), up to \(E_{12} = \left( 3.9\pm 0.8\right) {\times }10^{18}\) eV, the ankle energy, the value of which is within 1.4\(\sigma \) of that found with the much larger exposure of the SD-1500 measurement of the spectrum, namely \((5.0\pm 0.1){\times }10^{18}\) eV. 
Also the value of \(\gamma _1 = 3.34\pm 0.02\) is within 1.8\(\sigma \) of that obtained with the SD-1500 between \(10^{18.4}\) and \(10^{18.7}\) eV (\(3.29 \pm 0.02\)).

The characteristics of the measured spectrum can also be studied by looking at the evolution of the spectral index as a function of energy, \(\gamma (E)\). Rather than relying on the empirically chosen unfolding function, this slope parameter can be directly fit using the unfolded values of J(E). Power-law fits were performed for a sliding window of width \(\Delta \lg E = 0.3\). The resulting spectral indices are shown in Fig. 14.
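A sliding-window fit of this kind can be sketched as follows. The sketch assumes an unweighted least-squares fit of a straight line in log-log space over all points within \(\pm 0.15\) of each centre in \(\lg E\); the weighting actually used for Fig. 14 is not specified here, and the function name is hypothetical.

```python
import numpy as np

def sliding_index(lg_e, flux, width=0.3):
    """Fit J ~ E**(-gamma) in windows of `width` in lg E centred on each point."""
    lg_e = np.asarray(lg_e, dtype=float)
    lg_j = np.log10(np.asarray(flux, dtype=float))
    centres, gammas = [], []
    for centre in lg_e:
        mask = np.abs(lg_e - centre) <= width / 2 + 1e-9
        if mask.sum() < 3:
            continue                  # window runs off the edge of the data
        slope, _ = np.polyfit(lg_e[mask], lg_j[mask], 1)
        centres.append(centre)
        gammas.append(-slope)
    return np.array(centres), np.array(gammas)
```

For bins of width 0.1 in \(\lg E\), each window then contains three spectral points, and the first and last points of the spectrum do not receive an estimate.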

Fig. 14

Evolution of the spectral index with energy. The measured spectral points were fit to power laws within a sliding window of \(\Delta \lg E = 0.3\). The values of \(\gamma _1\) and \(\gamma _2\) are represented by the dashed and dash-dotted lines, for reference

The values of the spectral index fits present a consistent picture of the evolution. Beginning at the lowest energies shown, \(\gamma (E)\) increases first quite rapidly, finally approaching a value of 3.3 leading up to the ankle asymptotically. Unsurprisingly, this is the value found for \(\gamma _1\) in the unfolding of both the SD-750 and SD-1500 spectra [21].

The systematic uncertainties that affect the measurement of the spectrum are dominated by the overall uncertainty of the energy scale, detailed in [43], which is itself dominated by the absolute calibration of the fluorescence telescopes (10%). The total uncertainty in the energy scale is \(\sigma _E / E = 14\)%. Once propagated, the steepness of the spectrum as a function of energy amplifies this uncertainty, roughly as \(\sigma _{J}/J = (\gamma _1 - 1)\sigma _E / E\), resulting in a total flux uncertainty of \(\sigma _{J}/J \simeq 35\)%. However, for a more exact calculation of the uncertainty, the energies of the individual events were shifted by \(\pm 14\)% and the unfolding procedure was repeated. The result is shown as dashed red lines in Fig. 15.
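The amplification can be made explicit for a pure power law. If every energy is rescaled by a factor \(1+\epsilon \), the intensity at fixed energy becomes \(J'(E) = J(E/(1+\epsilon ))/(1+\epsilon )\), i.e. it changes by a factor \((1+\epsilon )^{\gamma -1}\), which reduces to \((\gamma -1)\epsilon \) for small shifts. The sketch below evaluates this idealized case with the quoted \(\gamma _1\); it ignores the spectral breaks and the unfolding, which the full procedure described above accounts for.

```python
def flux_shift(rel_energy_shift, gamma):
    """Ratio J'(E)/J(E) at fixed E when all energies are scaled by (1 + shift),
    for a pure power law J ~ E**(-gamma)."""
    return (1.0 + rel_energy_shift) ** (gamma - 1.0)

GAMMA_1 = 3.34                                   # spectral index quoted in the text
up = flux_shift(+0.14, GAMMA_1) - 1.0            # energies shifted up by 14%
down = flux_shift(-0.14, GAMMA_1) - 1.0          # energies shifted down by 14%
linear = (GAMMA_1 - 1.0) * 0.14                  # first-order estimate
```

In this idealized case the shifts are roughly +36% and -30%, bracketing the first-order estimate of about 33%.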

Fig. 15

Systematic uncertainties in the flux measurement as a function of energy. The main contributions are shown separately

Beyond that of the energy scale, the additional uncertainties are subdominant but are important to understand as they have energy dependence and some are uncorrelated with other flux measurements made at the Observatory. Such knowledge is particularly important for the combination of the two SD spectra presented later in Sect. 5. The most relevant of these energy-dependent uncertainties is associated with the procedure of the forward-folding itself. The uncertainties in the resolution function and in the detection efficiency all contribute a component to the overall unfolding uncertainty. The forward-folding process was hence repeated by shifting, within the statistical uncertainties, the parameterizations of the energy resolution (Eq. (8)) and of the efficiency, and by bracketing the bias with the pure proton/iron mass primaries below full efficiency. The impact of the resolution uncertainties on the unfolding procedure is the largest, in particular at the highest energies. On the other hand, the energy bias and reduced efficiency below \(10^{17}\) eV only impact the first few bins. These various components are summed in quadrature and are shown by the dotted blue line in Fig. 15. These influences are clearly seen to impact the spectrum by \({<}4\%\).

The last significant uncertainty in the flux is related to the calculation of the geometric exposure of the array. This quantity has been previously studied and is 4% for the SD-750 which directly translates to a 4% energy-independent shift in the flux [24].

The resulting systematic uncertainties of the spectral parameters are given in Table 6. For completeness, beyond the summary information provided by the spectrum parameterization, the correlation matrix of the energy spectrum is given in the Supplementary material. It is obtained by repeating the analysis on a large number of data sets, sampling randomly the systematic uncertainties listed above.

5 The combined SD-750 and SD-1500 energy spectrum

The spectrum obtained in Sect. 4 extends down to \(10^{17}\) eV and at the high-energy end overlaps with the one recently reported in [21] using the SD-1500 array. The two spectra are superimposed in Fig. 16. Beyond the overall consistency observed between the two measurements, a combination of them is desirable to gather the information in a single energy spectrum above \(10^{17}\) eV obtained with data from both the SD-750 and the SD-1500 of the Pierre Auger Observatory. We present below such a combination considering adjustable re-scaling factors in exposures, \(\delta {\mathcal {E}}\), and \(E_\text {SD}\) energy scales, \(\delta E_\text {SD}\), within uncorrelated uncertainties.

Fig. 16

Superimposed SD spectra to be combined scaled by \(E^{2.6}\), the SD-750 (red circles) and the SD-1500 (black squares)

Fig. 17

SD energy spectrum after combining the individual measurements by the SD-750 and the SD-1500 scaled by \(E^{2.6}\). The fit using the proposed function (Eq. (13)) is overlaid in red along with the one sigma error band in gray

The combination is carried out using the same bin-by-bin correction approach as in Sect. 4. The joint likelihood function, \({\mathcal {L}}({\mathbf {s}},\delta {\mathcal {E}},\delta E_\text {SD})\), is built from the product of the individual Poissonian likelihoods pertaining to the two SD measurements, \({\mathcal {L}}_{750}\) and \({\mathcal {L}}_{1500}\). These two individual likelihoods share the same proposed function,

$$\begin{aligned} J(E,{\mathbf {s}}) = J_0 \left( \frac{E}{E_0}\right) ^{-\gamma _0} \frac{\prod _{i=0}^3\left[ 1+\left( \frac{E}{E_{ij}}\right) ^{\frac{1}{\omega _{ij}}}\right] ^{(\gamma _i-\gamma _j)\omega _{ij}}}{\prod _{i=0}^3\left[ 1+\left( \frac{E_0}{E_{ij}}\right) ^{\frac{1}{\omega _{ij}}}\right] ^{(\gamma _i-\gamma _j)\omega _{ij}}},\nonumber \\ \end{aligned}$$
(13)

with \(j=i+1\) and \(E_0 = 10^{18.5}\) eV. As in [21], the transition parameters \(\omega _{12}\), \(\omega _{23}\) and \(\omega _{34}\) are fixed to 0.05. In this way, the same parameters \({\mathbf {s}}\) are used during the minimisation process to calculate the set of expectations \(\nu _i({\mathbf {s}},\delta {\mathcal {E}},\delta E_\text {SD})\) of the two arrays. For each array, a change of the associated exposure \({\mathcal {E}}\rightarrow {\mathcal {E}}+\delta {\mathcal {E}}\) impacts the \(\nu _i\) coefficients accordingly, while a change in energy scale \(E_\text {SD}\rightarrow E_\text {SD}+\delta E_\text {SD}\) impacts as well the observed number of events in each bin. Additional likelihood factors, \({\mathcal {L}}_{\delta {\mathcal {E}}}\) and \({\mathcal {L}}_{\delta E_\text {SD}}\), are thus required to control the changes of the exposure and of the energy-scale within their uncorrelated uncertainties. The likelihood factors described below account for \(\delta {\mathcal {E}}\) and \(\delta E_\text {SD}\) changes associated with the SD-750 only. We have checked that allowing additional free parameters, such as the \(\delta {\mathcal {E}}\) corresponding to the SD-1500, does not improve the deviance of the best fit by more than one unit, and thus their introduction is not supported by the data.

Both likelihood factors are described by Gaussian distributions with a spread given by the uncertainty pertaining to the exposure and to the energy-scale. The joint likelihood function reads then as

$$\begin{aligned} {\mathcal {L}}({\mathbf {s}},\delta {\mathcal {E}},\delta E_\text {SD})={\mathcal {L}}_{750}\times {\mathcal {L}}_{1500}\times {\mathcal {L}}_{\delta {\mathcal {E}}}\times {\mathcal {L}}_{\delta E_\text {SD}}. \end{aligned}$$
(14)

The allowed change of exposure, \(\delta {\mathcal {E}}\), is guided by the systematic uncertainties in the SD-750 exposure, \(\sigma _{\mathcal {E}}/{\mathcal {E}}=4\%\). Hence, the constraining term for any change in the SD-750 exposure reads, dropping constant terms, as

$$\begin{aligned} -2\ln {\mathcal {L}}_{\delta {\mathcal {E}}}(\delta {\mathcal {E}}) = \left( \frac{\delta {\mathcal {E}}}{\sigma _{\mathcal {E}}}\right) ^2. \end{aligned}$$
(15)

Likewise, uncertainties in A and B, \(\delta A\) and \(\delta B\), translate into uncertainties in the SD-750 energy scale. Statistical contributions stem from the energy calibration of \(S_{35}\), which are by essence uncorrelated to those of the SD-1500. Other uncorrelated contributions of the systematic uncertainties from the FD energy scales propagated to the SD-1500 and SD-750 could enter into play. The magnitude of such systematics, \(\sigma _{\mathrm {syst}}\), is difficult to quantify. By testing several values for \(\sigma _{\mathrm {syst}}\), we have checked, however, that such contributions have a negligible impact on the combined spectrum. Hence, the constraining term for any change in energy scale can be considered to stem from statistical uncertainties only and reads as

$$\begin{aligned} -2\ln {\mathcal {L}}_{\delta E_\text {SD}}(\delta A,\delta B)= & {} [\sigma ^{-1}]_{AA}(\delta A)^2+[\sigma ^{-1}]_{BB}(\delta B)^2 \nonumber \\&+\,2[\sigma ^{-1}]_{AB}(\delta A)(\delta B), \end{aligned}$$
(16)

where the notation \([\sigma ]_{ij}\) stands for the coefficients of the variance-covariance matrix of the A and B best-fit estimates and \([\sigma ^{-1}]\) is the inverse of this matrix.
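The two constraining terms of Eqs. (15) and (16) can be sketched numerically as follows. This is a minimal illustration rather than the analysis code: the function names are hypothetical, and the covariance matrix in the example is a placeholder, since the actual variance-covariance matrix of A and B is not quoted in this section. Only the 4% exposure uncertainty is taken from the text.

```python
import numpy as np


def exposure_penalty(delta_exposure, sigma_exposure):
    """Return -2 ln L for an exposure shift (Eq. 15), constant terms dropped."""
    return (delta_exposure / sigma_exposure) ** 2


def energy_scale_penalty(delta_a, delta_b, covariance):
    """Return -2 ln L for shifts of A and B (Eq. 16).

    `covariance` is the 2x2 variance-covariance matrix of the A and B
    best-fit estimates; the quadratic form uses its inverse.
    """
    inverse = np.linalg.inv(np.asarray(covariance, dtype=float))
    shift = np.array([delta_a, delta_b], dtype=float)
    # shift^T [sigma^-1] shift expands to the three terms of Eq. (16).
    return float(shift @ inverse @ shift)


# Relative exposure shift against the 4% uncertainty quoted in the text.
penalty_exposure = exposure_penalty(0.014, 0.04)

# Placeholder covariance (arbitrary units), NOT the values of the analysis.
toy_covariance = [[4.0, -1.0], [-1.0, 1.0]]
penalty_scale = energy_scale_penalty(1.0, 1.0, toy_covariance)
```

In a fit, these two numbers would simply be added to the deviances of the two arrays, which is the logarithmic form of the product in Eq. (14).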

Table 7 Best-fit values of the combined spectral parameters (Eq. (13)). The parameters \(\omega _{12}\), \(\omega _{23}\) and \(\omega _{34}\) are fixed to the value constrained in [21]. Note that the parameters \(\gamma _0\) and \(E_{01}\) correspond to features below the measured energy region and should be treated only as aspects of the combination

The outcome of the forward-folding fit is the set of parameters \({\mathbf {s}}\), \(\delta {\mathcal {E}}\), \(\delta A\) and \(\delta B\) that allow us to calculate the expectation values \(\mu _i\) and \(\nu _i\), and thus the correction factors \(c_i\), for both arrays separately. The resulting combined spectrum, obtained as

$$\begin{aligned} J^\text {comb}_i=\frac{c_{i,750}\,N_{i,750}+c_{i,1500}\,N_{i,1500}}{{\mathcal {E}}_i^\text {eff}\,\Delta E_i}, \end{aligned}$$
(17)

is shown in Fig. 17. Here, the observed number of events \(N_{i,750}\) in each bin is calculated at the re-scaled energies, while the effective exposure, \({\mathcal {E}}_i^\text {eff}\), is the shifted one of the SD-750 in the energy range where \(N_{i,1500}=0\), the one of the SD-1500 in the energy range where \(N_{i,750}=0\), and the sum \({\mathcal {E}}_{750}+\delta {\mathcal {E}}+{\mathcal {E}}_{1500}\) in the overlapping energy range. The set of spectral parameters is collected in Table 7, while the corresponding correlation matrix is reported in Appendix B (Table 11) for \(\delta {\mathcal {E}}\), \(\delta A\) and \(\delta B\) fixed to their best-fit values. The change in exposure is \(\delta {\mathcal {E}}/{\mathcal {E}}=+1.4\%\), while the one in energy scale follows from \(\delta A/A=-2.5\%\) and \(\delta B/B=+0.8\%\). The goodness of fit is supported by a deviance of 37.2, compared with an expected value of \(32\pm 8\). We also note that the parameters describing the spectral shape are in agreement with those of the two individual spectra from the SD arrays.
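The bin-by-bin combination of Eq. (17) can be illustrated with a short sketch. The function name and all input numbers are hypothetical and chosen only to exercise the three exposure regimes described above. Bins are assigned to a regime by testing the counts for zero, which is an assumption about how the energy ranges would be identified in practice.

```python
import numpy as np


def combined_spectrum(c_750, n_750, c_1500, n_1500,
                      exposure_750, exposure_1500, delta_exposure, bin_widths):
    """Combined flux per energy bin following Eq. (17)."""
    c_750 = np.asarray(c_750, dtype=float)
    n_750 = np.asarray(n_750, dtype=float)
    c_1500 = np.asarray(c_1500, dtype=float)
    n_1500 = np.asarray(n_1500, dtype=float)
    bin_widths = np.asarray(bin_widths, dtype=float)

    # Overlap region: sum of the shifted SD-750 and the SD-1500 exposures.
    effective = np.full(n_750.shape, exposure_750 + delta_exposure + exposure_1500)
    # Bins where N_{i,1500} = 0: shifted SD-750 exposure only.
    effective = np.where(n_1500 == 0, exposure_750 + delta_exposure, effective)
    # Bins where N_{i,750} = 0: SD-1500 exposure only.
    effective = np.where(n_750 == 0, exposure_1500, effective)

    return (c_750 * n_750 + c_1500 * n_1500) / (effective * bin_widths)


# Toy inputs: three bins (SD-750 only, overlap, SD-1500 only).
flux = combined_spectrum(
    c_750=[1.0, 1.0, 1.0], n_750=[100, 50, 0],
    c_1500=[1.0, 1.0, 1.0], n_1500=[0, 200, 80],
    exposure_750=100.0, exposure_1500=1000.0, delta_exposure=1.4,
    bin_widths=[1.0, 2.0, 4.0],
)
```

With these toy inputs the first bin is normalised by the shifted SD-750 exposure alone, the second by the summed exposure, and the third by the SD-1500 exposure alone.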

The impact of the systematic uncertainties, dominated by those in the energy scale, on the spectral parameters is reported in Table 7. For completeness, beyond the summary information provided by the spectrum parameterization, the correlation matrix of the energy spectrum itself is also given in the Supplementary material.

Fig. 18 SD-750 spectrum (solid red circles) near the second knee along with the measurements from Akeno [44], GAMMA [45], IceTop [9], KASCADE-Grande [46], TALE [10], Tien Shan [47], Tibet-III [48], Tunka-133 [11] and Yakutsk [49]. The experiments that set their energy scale using calorimetric observations are indicated by solid colored markers while those with an energy scale based entirely on simulations are shown by gray markers

6 Discussion

We have presented here a measurement of the CR spectrum in the energy range between the second knee and the ankle, which is covered with high statistics by the SD-750, including 560,000 events with zenith angles up to \(40^\circ \) and energies above \(10^{17}\) eV. The measurement includes a total exposure of 105 km\(^2\) sr yr and an energy scale set by calorimetric observations from the FD telescopes. We note a significant change in the spectral index, occurring over a width that is much broader than that of the ankle feature.

Such a change has been observed by a number of other experiments, and via various detection methods. Most notably, the nature of this feature was linked to a softening of the heavy-mass primaries beginning at \(10^{16.9}\) eV by the KASCADE-Grande experiment, leading to the moniker iron knee [8]. Additional analyses by the Tunka-133 [50] and IceCube [9] collaborations have given further evidence that high-mass particles are dominant near \(10^{17}\) eV and thus that it is their decline that largely defines the shape of the all-particle spectrum. The hypothesis is also supported by a preliminary study of the distributions of the depths of the shower maximum, \(X_\text {max}\), measured at the Auger Observatory [36, 51]. These have been parametrized according to the hadronic models EPOS-LHC [40], QGSJetII-04 [52] and Sibyll2.3 [53]. From these parametrizations, the evolution over energy of the fractions of different mass groups, from protons to Fe-nuclei, has been derived. From all three models, a fall-off of the Fe component above \(10^{17}\) eV is inferred. The consistency of all these observations strongly supports a scenario of Galactic CRs characterised by a rigidity-dependent maximum acceleration energy for particles with charge Z, namely \(E_\text {max}(Z)\simeq ZE_\text {max}^\text {proton}\), to explain the knee structures.
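As a rough numerical illustration of the scaling \(E_\text {max}(Z)\simeq ZE_\text {max}^\text {proton}\), the sketch below places the proton maximum energy at the knee energy of about \(10^{15.5}\) eV quoted in the Introduction. Identifying the two is an assumption made here for illustration only, and the function name is hypothetical.

```python
import math


def max_energy_ev(charge_z, proton_max_energy_ev):
    """Rigidity-dependent maximum energy: E_max(Z) = Z * E_max(proton)."""
    return charge_z * proton_max_energy_ev


# Assumption for illustration: proton cut-off placed at the knee energy.
PROTON_MAX_EV = 10 ** 15.5

for name, charge in [("H", 1), ("He", 2), ("C", 6), ("Fe", 26)]:
    energy = max_energy_ev(charge, PROTON_MAX_EV)
    print(f"{name:>2} (Z={charge:2d}): log10(E_max/eV) = {math.log10(energy):.2f}")
```

Under this assumption the iron cut-off falls near \(10^{16.9}\) eV, in line with the energy at which KASCADE-Grande reported the onset of the heavy-primary softening.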

The measurements of the all-particle flux from various experiments [9,10,11, 44,45,46,47,48,49] in the energy region surrounding the second knee are shown in Fig. 18. Experiments which set their energy scale using calorimetric measurements are plotted using colored markers (Auger SD-750, TA TALE, Tunka-133, Yakutsk) while the measurements shown in gray markers represent MC-based energy assignments. The spread between various experiments is statistically significant. However, all these measurements are consistent with the SD-750 spectrum within the 14% energy scale systematic uncertainty. Understanding the nature of the offsets in the energy scales is beyond the scope of this paper. However, we note that the TALE spectrum agrees rather well with the SD-750 spectrum, offset by 5 to 6% in energy. The agreement is notable given that, at and above the ankle, an energy scale offset of around 11% is required to bring the spectral measurements with the SD-1500 of the Auger Observatory and the SD of the Telescope Array into agreement [54].

Additionally, we have presented a robust method to combine energy spectra. Using the result from the SD-750 and a previously reported measurement using the SD-1500, a unified SD spectrum was calculated by combining the respective observed fluxes, energy resolutions, and exposures. The result has partial coverage of the second knee and full coverage of the ankle, an additional inflection at \({\simeq }\,1.4{\times }10^{19}\) eV, and the suppression. This procedure is applied to spectra inferred from a single detector type (i.e. water-Cherenkov detectors), but can be used for the combination of any spectral measurements for which the uncorrelated uncertainties can be estimated.

The impressive regularity of the all-particle spectrum observed in the energy region between the second knee and the ankle can hide an underlying intertwining of different astrophysical phenomena, which might be exposed by looking at the spectrum of different primary elements. In the future, further measurements will allow separation of the intensities due to the different components. On the one hand, \(X_\text {max}\) values will be determined down to \(10^{17}\) eV using the three HEAT telescopes. On the other hand, the determination of the muon component of EAS above \(10^{17}\) eV will be possible using the new array of underground muon detectors [35], co-located with the SD-750. This will help us to study whether the second knee stems from, for instance, the steep fall-off of an iron component, as expected for Galactic CRs characterized by a rigidity-dependent maximum acceleration energy for particles with charge Z, namely \(E_\text {max}(Z)\simeq ZE_\text {max}^\text {proton}\). In addition, we will be able to extend the measurement of the energy spectrum below \(10^{17}\,\)eV with a denser array of 433 m-spaced detectors and with the analysis of the Cherenkov light in FD events [55]. The extension will allow us to lower the threshold and to explore the second-knee region in more detail.