1 Introduction

Quarkonium production in high-energy hadronic collisions is an important tool for studying the perturbative and non-perturbative aspects of quantum chromodynamics (QCD) [1, 2]. Quarkonia are bound states of either a charm and anti-charm quark pair (charmonia) or a bottom and anti-bottom quark pair (bottomonia). In hadronic collisions, the scattering process leading to the production of the heavy-quark pair involves momentum transfers at least as large as twice the mass of the considered heavy quark, hence it can be described with perturbative QCD calculations. In contrast, the binding of the heavy-quark pair is a non-perturbative process as it involves long distances and soft momentum scales. Describing quarkonium production measurements in proton–proton (pp) collisions at various collision energies represents a stringent test for models and, in particular, for the investigation of the non-perturbative aspects that are treated differently in the various approaches. These measurements also provide a crucial reference for the investigation of the properties of the quark–gluon plasma formed in nucleus–nucleus collisions and of the cold nuclear matter effects present in proton–nucleus collisions [2, 3].

Quarkonium production can be described by various approaches that essentially differ in the treatment of the hadronization part. The Color Evaporation Model (CEM) [4, 5] considers that the quantum state of every heavy-quark pair produced with an invariant mass above its production threshold and below twice the mass of the lightest open heavy-flavor meson (D or B) evolves into a quarkonium. In this model, the probability to obtain a given quarkonium state from the heavy-quark pair is parametrized by a constant phenomenological factor. The Color Singlet Model (CSM) [6] assumes no evolution of the quantum state of the pair from its production to its hadronization. Only color-singlet heavy-quark pairs are thus considered to form quarkonium states. Finally, in the framework of Non-Relativistic QCD (NRQCD) [7], both color-singlet and color-octet heavy-quark pairs can evolve towards a bound state. Long Distance Matrix Elements are introduced in order to parametrize the binding probability of the various quantum states of the heavy-quark pairs. They can be constrained from existing measurements and do not depend on the specific production process under study (pp, electron–proton, etc.).
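As a compact reminder of how the models above differ, the factorized forms commonly quoted in the literature can be sketched as follows. The notation is an assumption made here for illustration and is not taken from this article: \(F_{H}\) is the constant phenomenological factor of the CEM for the state \(H\), \(m_{Q}\) the heavy-quark mass, \(m_{M}\) the mass of the lightest open heavy-flavor meson, \(n\) the color, spin and angular-momentum state of the pair, and \(\langle \mathcal{O}^{H}[n] \rangle\) the corresponding Long Distance Matrix Element.

```latex
% Sketch only: symbols F_H, m_Q, m_M, n are illustrative notation, not the article's.
\begin{align}
  \sigma_{H}^{\mathrm{CEM}}
    &= F_{H} \int_{2m_{Q}}^{2m_{M}}
       \frac{\mathrm{d}\sigma_{Q\bar{Q}}}{\mathrm{d}m_{Q\bar{Q}}}\,
       \mathrm{d}m_{Q\bar{Q}}, \\
  \sigma_{H}^{\mathrm{NRQCD}}
    &= \sum_{n} \hat{\sigma}\!\left(Q\bar{Q}[n]\right)\,
       \langle \mathcal{O}^{H}[n] \rangle .
\end{align}
```

In this notation, the CSM corresponds to keeping only the color-singlet term of the sum over \(n\).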

This article presents measurements of the inclusive production cross section of charmonium (\(\textrm{J}/\psi \) and \(\psi \mathrm{(2S)}\)) and bottomonium (\(\Upsilon \mathrm{(1S)}\), \(\Upsilon \mathrm{(2S)}\), and \(\Upsilon \mathrm{(3S)}\)) states in pp collisions at a center-of-mass energy \(\sqrt{s} = 5.02\) TeV with the ALICE detector. The analysis is performed in the dimuon decay channel at forward rapidity (\(2.5< y < 4\)). In this rapidity interval, the total, transverse momentum (\(p_{\textrm{T}}\)) and rapidity (\(y\)) differential cross sections for \(\textrm{J}/\psi \) as well as the total cross section for \(\psi \mathrm{(2S)}\) were published by the ALICE collaboration based on an earlier data sample [8, 9], corresponding to a factor of 12 smaller integrated luminosity. The present measurements, with improved statistical precision, supersede those from the earlier publications. The \(p_{\textrm{T}}\) and \(y\) differential measurements for the \(\psi \mathrm{(2S)}\) and \(\Upsilon \mathrm{(1S)}\) as well as the total cross sections for all the measured \(\Upsilon \) states are presented here for the first time at \(\sqrt{s} = 5.02\) TeV and at forward rapidity. The \(p_{\textrm{T}}\) coverage of the \(\textrm{J}/\psi \) measurement is extended up to 20 GeV/\(c\).

For \(\textrm{J}/\psi \), the inclusive differential cross sections are obtained as a function of \(p_{\textrm{T}}\) for \(p_{\textrm{T}} < 20\) GeV/\(c\) and as a function of \(y\) for \(p_{\textrm{T}} < 12\) GeV/\(c\). For \(\psi \mathrm{(2S)}\) and \(\Upsilon \mathrm{(1S)}\), they are obtained for \(p_{\textrm{T}} < 12\) GeV/\(c\) and \(p_{\textrm{T}} < 15\) GeV/\(c\), respectively. Only the \(p_{\textrm{T}}\)-integrated cross sections are measured for \(\Upsilon \mathrm{(2S)}\) and \(\Upsilon \mathrm{(3S)}\) due to statistical limitations. The inclusive \(\psi \mathrm{(2S)}\)-to-\(\textrm{J}/\psi \) ratio is also presented as a function of \(p_{\textrm{T}}\) and \(y\). The comparison of the \(\textrm{J}/\psi \) cross section with recent results from LHCb [10] is discussed. The results are compared with previous ALICE measurements performed at \(\sqrt{s} = 2.76\), 7, 8, and 13 TeV [9, 11, 12, 13]. Earlier comparisons with LHCb quarkonium results at \(\sqrt{s} = 7\), 8, and 13 TeV [14, 15, 16, 17] were performed in [9, 12, 13]. Finally, the results are compared with theoretical calculations based on NRQCD and CEM.

The measurements reported here are inclusive and correspond to a superposition of the direct production of quarkonium and of the contribution from the decay of higher-mass excited states (predominantly \(\psi \mathrm{(2S)}\) and \(\chi _c\) for \(\textrm{J}/\psi \); \(\Upsilon \mathrm{(2S)}\), \(\chi _b\), and \(\Upsilon \mathrm{(3S)}\) for \(\Upsilon \mathrm{(1S)}\); \(\Upsilon \mathrm{(3S)}\) and \(\chi _b\) for \(\Upsilon \mathrm{(2S)}\); and \(\chi _b\) for \(\Upsilon \mathrm{(3S)}\)). For \(\textrm{J}/\psi \) and \(\psi \mathrm{(2S)}\), a non-prompt contribution from beauty hadron decays is also present.

The article is organized as follows: the ALICE detectors used in the analysis and the data sample are briefly described in Sect. 2, the analysis procedure is presented in Sect. 3, and in Sect. 4 the results are discussed and compared with theoretical calculations and measurements at other center-of-mass energies from ALICE.

2 Apparatus and data samples

The ALICE setup and its performance are described in detail in Refs. [18, 19]. In this section, the subsystems relevant for this analysis are presented.

Muons from quarkonium decays are detected in the muon spectrometer within the pseudorapidity range \(-4<\eta <-2.5\) (see footnote 1) [20]. The muon spectrometer consists of a front absorber located along the beam direction (z) between \(-0.9\) and \(-5\) m from the interaction point (IP), five tracking stations (MCH), located between \(-5.2\) and \(-14.4\) m from the IP, an iron wall at \(-14.5\) m, and two triggering stations (MTR), placed at \(-16.1\) and \(-17.1\) m from the IP. Each station is made of two layers of active detection material, with cathode pad and resistive plate techniques employed for the muon detection in the tracking and triggering devices, respectively. A dipole magnet with a 3 T\(\times \)m field integral deflects the particles in the vertical direction for the measurement of the muon momentum. The hadronic particle flux originating from the collision vertex is strongly suppressed thanks to the front absorber with a thickness of 10 interaction lengths. Throughout the spectrometer length, a conical absorber at small angle around the z axis reduces the background from secondary particles originating from the interaction of large-angle primary particles with the beam pipe. The 1.2 m thick iron wall positioned in front of the triggering stations stops the punch-through hadrons escaping the front absorber, as well as low-momentum muons from pion and kaon decays. In addition, a rear absorber downstream of the trigger stations ensures protection against the background generated by beam–gas interactions.

Two layers of silicon pixel detectors (SPD) with a cylindrical geometry, covering \(|\eta |<2.0\) and \(|\eta |<1.4\), respectively, are used for the determination of the collision vertex. They are the two innermost layers of the Inner Tracking System (ITS) [21] and surround the beam pipe at average radii of 3.9 and 7.6 cm. The T0 quartz Cherenkov counters [22] are made of two arrays positioned on each side of the IP at \(-70\) cm and 360 cm. They cover the pseudorapidity ranges \(-3.3< \eta < -3.0\) and \(4.6< \eta < 4.9\), respectively. The T0 is used for luminosity determination and background rejection. Similarly, the V0 scintillator arrays [23] are located on both sides of the IP at \(-90\) and 340 cm and cover the pseudorapidity ranges \(-3.7< \eta < -1.7\) and \(2.8< \eta < 5.1\), respectively. These are used for triggering, for luminosity determination and, together with the T0 detectors, for rejecting beam–gas events using offline timing selections.

A minimum bias trigger is issued by the V0 detector [23] when a logical AND of the signals from the two V0 arrays, one on each side of the IP, is produced. Single muon, same-sign dimuon, and opposite-sign dimuon triggers are defined by an online estimate of the \(p_{\textrm{T}}\) of the muon tracks using a programmable trigger logic circuit. A predefined \(p_{\textrm{T}}\) threshold of 0.5 GeV/\(c\) is set in order to remove the low-\(p_{\textrm{T}}\) muons, mainly coming from \(\pi \) and K decays. The muon trigger efficiency reaches \(50\%\) at this threshold value and saturates for \(p_{\textrm{T}} > 1.5\) GeV/\(c\). Events containing an opposite-sign dimuon trigger in coincidence with the minimum bias trigger are selected for the quarkonium analysis.

The data sample of pp collisions at \(\sqrt{s} = 5.02\) TeV used for the measurements reported in this article was collected in 2017 with the opposite-sign dimuon trigger, and corresponds to an integrated luminosity \(L_{\textrm{int}} = 1229.9 \pm 0.4\) (stat.) \(\pm \) 22.1 (syst.) nb\(^{-1}\) [24]. The luminosity determination is based on dedicated van der Meer scans [25], where the cross sections seen by two different minimum bias triggers based on the V0 and T0 signals are derived [24]. The numbers of T0- and dimuon-trigger counts, measured with scalers on a run-by-run basis without any data acquisition veto, are used along with the T0-trigger cross section to calculate the integrated luminosity of the analyzed data sample. Another method, using reconstructed minimum bias events triggered with the V0 detector only, is used as a cross-check of the first method. In this method, the luminosity is computed as the ratio of the number of equivalent minimum bias events to the V0-trigger cross section. The number of equivalent minimum bias events is evaluated as the product of the total number of dimuon-triggered events and the inverse of the probability of having dimuon-triggered events in a minimum bias triggered data sample recorded with only the V0 [26]. The two methods give compatible values and the one based on T0 is used, as it gives a smaller total uncertainty (see Sect. 3.4).
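The arithmetic of the V0-based cross-check method can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the event count, trigger probability and V0 cross section used in the example are made-up placeholders rather than values from the analysis, and combining the quoted statistical and systematic uncertainties in quadrature is an assumption made here for the purpose of the example.

```python
import math


def equivalent_min_bias_events(n_dimuon_events: float, p_dimuon_given_mb: float) -> float:
    """Number of equivalent minimum bias events: dimuon-triggered events
    times the inverse probability of a dimuon trigger in a minimum bias sample."""
    return n_dimuon_events / p_dimuon_given_mb


def luminosity_from_v0(n_dimuon_events: float, p_dimuon_given_mb: float,
                       sigma_v0_nb: float) -> float:
    """Integrated luminosity in nb^-1 as N_MB^eq divided by the V0-trigger cross section."""
    return equivalent_min_bias_events(n_dimuon_events, p_dimuon_given_mb) / sigma_v0_nb


def total_relative_uncertainty(value: float, stat: float, syst: float) -> float:
    """Relative uncertainty with stat and syst added in quadrature (assumption)."""
    return math.hypot(stat, syst) / value


# Quoted luminosity of the analyzed sample (values from the text, in nb^-1).
L_INT_NB, STAT_NB, SYST_NB = 1229.9, 0.4, 22.1
relative_total = total_relative_uncertainty(L_INT_NB, STAT_NB, SYST_NB)

# Placeholder inputs, NOT from the analysis: 1e6 dimuon-triggered events,
# a 1e-3 dimuon-trigger probability and a 5e7 nb (50 mb) V0 cross section.
toy_luminosity = luminosity_from_v0(1.0e6, 1.0e-3, 5.0e7)

if __name__ == "__main__":
    print(f"total relative uncertainty: {100 * relative_total:.2f}%")
    print(f"toy luminosity: {toy_luminosity:.1f} nb^-1")
```

With the quoted numbers, the systematic term dominates the total uncertainty on the luminosity.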

3 Analysis procedure

3.1 Track selection

The number of detected quarkonia is estimated by pairing muons of opposite charges and by fitting their invariant mass (\(m_{\mu ^{+}\mu ^{-}}\)) distribution. Reconstructed tracks must meet several selection criteria. The pseudorapidity of each muon candidate must be within the geometrical acceptance of the muon spectrometer (\(-4< \eta < -2.5\)). Muons are identified and selected by applying a matching condition between the tracking system and the trigger stations. A selection on the transverse position \(R_{\text {abs}}\) of the muon at the end of the front absorber (\(17.6< R_{\text {abs}} < 89.5\) cm) rejects tracks crossing the thickest sections of the absorber. Finally, the contamination from tracks produced by background events, like beam–gas collisions, is reduced by applying a selection on the product of the track momentum and the transverse distance to the primary vertex [27]. Opposite-sign (OS) muon pairs are then formed in the range \(2.5< y < 4\). The considered \(p_{\textrm{T}}\) interval varies according to the studied resonance given the available data sample: \(p_{\textrm{T}} < 20\) GeV/\(c\) for the \(p_{\textrm{T}}\)-differential \(\textrm{J}/\psi \) study; \(p_{\textrm{T}} < 12\) GeV/\(c\) for \(\psi \mathrm{(2S)}\) and for the \(y\)-differential and (\(p_{\textrm{T}}\),\(y\))-differential \(\textrm{J}/\psi \) studies; and \(p_{\textrm{T}} < 15\) GeV/\(c\) for \(\Upsilon \mathrm{(nS)}\).
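The selection criteria listed above can be sketched as a small filter on muon pairs. This is a minimal illustration, not the analysis code: the `Muon` container and function names are hypothetical, the trigger-matching and momentum-times-distance selections are represented by pre-computed booleans because their definitions are not given in the text, and the sign flip applied to the pair rapidity is an assumption reflecting that the muons are measured at negative pseudorapidity while the results are quoted for \(2.5< y < 4\).

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

MUON_MASS = 0.1056583745  # GeV/c^2


@dataclass
class Muon:
    """Hypothetical container for one reconstructed muon track."""
    px: float  # GeV/c
    py: float  # GeV/c
    pz: float  # GeV/c
    charge: int
    r_abs_cm: float        # transverse position at the end of the front absorber
    matches_trigger: bool  # tracker track matched to a trigger-station track
    passes_pdca: bool      # momentum x distance-to-vertex selection (details not in text)

    @property
    def energy(self) -> float:
        return math.sqrt(self.px**2 + self.py**2 + self.pz**2 + MUON_MASS**2)

    @property
    def eta(self) -> float:
        # Pseudorapidity from the polar direction; undefined for exactly zero pT.
        p = math.sqrt(self.px**2 + self.py**2 + self.pz**2)
        return math.atanh(self.pz / p)


def passes_single_muon_selection(mu: Muon) -> bool:
    return (-4.0 < mu.eta < -2.5
            and 17.6 < mu.r_abs_cm < 89.5
            and mu.matches_trigger
            and mu.passes_pdca)


def select_pair(mu1: Muon, mu2: Muon, pt_max: float) -> Optional[Tuple[float, float, float]]:
    """Return (mass, pT, y) of an accepted opposite-sign pair, or None if rejected."""
    if mu1.charge * mu2.charge >= 0:
        return None
    if not (passes_single_muon_selection(mu1) and passes_single_muon_selection(mu2)):
        return None
    e = mu1.energy + mu2.energy
    px, py, pz = mu1.px + mu2.px, mu1.py + mu2.py, mu1.pz + mu2.pz
    mass = math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))
    pt = math.hypot(px, py)
    # Sign flipped so that the spectrometer acceptance maps onto positive y (assumption).
    y = -0.5 * math.log((e + pz) / (e - pz))
    if not (2.5 < y < 4.0 and pt < pt_max):
        return None
    return mass, pt, y
```

A call such as `select_pair(mu_plus, mu_minus, pt_max=20.0)` would then correspond to the \(p_{\textrm{T}}\) interval used for \(\textrm{J}/\psi \), with the returned mass filling the invariant mass distribution.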

3.2 Signal extraction

A fit to the OS dimuon invariant mass distribution is performed separately for the charmonium and bottomonium mass regions, in each \(p_{\textrm{T}}\) and \(y\) interval considered. In both cases, a maximum log-likelihood fitting method is used. In order to evaluate the systematic uncertainties on the charmonium and bottomonium signal extraction, several fitting functions and ranges are considered, and the parameters that are fixed during the fitting procedure are varied, as described below.

Fig. 1

Examples of fits to the OS dimuon invariant mass distribution in the mass region \(2< m_{\mu ^{+}\mu ^{-}} < 5\) GeV/\(c^{2}\) for \(p_{\textrm{T}} < 20\) GeV/\(c\) (left), and \(7< m_{\mu ^{+}\mu ^{-}} < 13\) GeV/\(c^{2}\) for \(p_{\textrm{T}} < 15\) GeV/\(c\) (right). The \(\textrm{J}/\psi \), \(\psi \mathrm{(2S)}\) and \(\Upsilon \mathrm{(nS)}\) signals are modelled with extended Crystal Ball functions, while the background is described by a pseudo-Gaussian with a width increasing linearly with the invariant mass. The fit is performed on the full data sample. The widths of the \(\psi \mathrm{(2S)}\), \(\Upsilon \mathrm{(2S)}\) and \(\Upsilon \mathrm{(3S)}\), for these examples, are fixed to 73 MeV/\(c^{2}\), 156 MeV/\(c^{2}\) and 161 MeV/\(c^{2}\), respectively.
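For readers unfamiliar with the signal shape named in the caption, a Gaussian core with a power-law tail on each side can be sketched as below. This is one common parametrization of such a double-sided Crystal Ball and may differ in detail from the form defined in Ref. [28]; the function and parameter names are hypothetical and the function is left unnormalized. The helper for the \(\psi \mathrm{(2S)}\) parameters reflects the constraints described in the text for the charmonium fit (mass difference fixed to the PDG value, width scaled by a fixed ratio); the mass values used for the difference are approximate PDG numbers assumed here.

```python
import math

# Approximate PDG mass difference m(psi(2S)) - m(J/psi) in GeV/c^2 (assumed values).
PDG_MASS_DIFF = 3.6861 - 3.0969


def extended_crystal_ball(x: float, mean: float, sigma: float,
                          alpha_l: float, n_l: float,
                          alpha_r: float, n_r: float) -> float:
    """Unnormalized double-sided Crystal Ball: Gaussian core, power-law tails.

    alpha_l, alpha_r (> 0) set where the tails start, in units of sigma;
    n_l, n_r (> 0) set how fast the tails fall.
    """
    t = (x - mean) / sigma
    if t < -alpha_l:
        a = (n_l / alpha_l) ** n_l * math.exp(-0.5 * alpha_l**2)
        b = n_l / alpha_l - alpha_l
        return a * (b - t) ** (-n_l)
    if t > alpha_r:
        a = (n_r / alpha_r) ** n_r * math.exp(-0.5 * alpha_r**2)
        b = n_r / alpha_r - alpha_r
        return a * (b + t) ** (-n_r)
    return math.exp(-0.5 * t * t)


def tied_psi2s_shape(mean_jpsi: float, sigma_jpsi: float,
                     width_ratio: float = 1.01) -> tuple:
    """psi(2S) mean and width derived from the free J/psi parameters."""
    return mean_jpsi + PDG_MASS_DIFF, width_ratio * sigma_jpsi
```

The tail coefficients are chosen so that the function is continuous where the Gaussian core meets each power-law tail; in a fit, the tail parameters are typically the ones fixed from simulation or from other data.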

In the charmonium mass region (\(2< m_{\mu ^{+}\mu ^{-}} < 5\) GeV/\(c^{2}\)), the fit is performed using the same functional form to describe the \(\textrm{J}/\psi \) and \(\psi \mathrm{(2S)}\) signals, on top of an ad-hoc function to describe the background. The signal shapes considered are either two extended Crystal Ball functions or two pseudo-Gaussian functions [28]. For both functional forms, the \(\textrm{J}/\psi \) pole mass and width are left free during the fit procedure, while the \(\psi \mathrm{(2S)}\) mass is bound to the \(\textrm{J}/\psi \) one by fixing the mass difference between the two states according to the PDG values [29]. The width of the \(\psi \mathrm{(2S)}\) signal is also bound to the \(\textrm{J}/\psi \) one by fixing the ratio of the two widths. The ratio was obtained via a fit to a large data sample from pp collisions at \(\sqrt{s} = 13\) TeV [9], which gives 1.01 ± 0.05. A variation of +5\(\%\) of the \(\psi \mathrm{(2S)}\)-to-\(\textrm{J}/\psi \) width ratio central value, corresponding to the difference observed between data and Monte Carlo (MC) simulation at \(\sqrt{s} = 13\) TeV (see footnote 2), induces a variation of the \(\textrm{J}/\psi \) yield at the per mille level and is therefore neglected, while the impact of this variation on the \(\psi \mathrm{(2S)}\) yield enters the systematic uncertainty. The parameters descr