1 Introduction

Studies involving heavy quarkonia provide a unique insight into the nature of quantum chromodynamics (QCD) near the boundary of the perturbative and non-perturbative regimes. However, despite a long history of research, quarkonium production in hadronic collisions still presents significant challenges to both theory and experiment.

In high-energy hadronic collisions, charmonium states can be produced either from short-lived QCD sources (referred to as prompt production) or from long-lived sources – decays of beauty hadrons (non-prompt production). These two sources can be distinguished experimentally by measuring the distance between the production and decay vertices of the charmonium state. Feed-down decays of higher charmonium states contribute to the production of \(J/\psi \) mesons for both of these sources, and should be taken into account when comparing with theoretical predictions (there is no significant feed-down contribution to \(\psi (2{\textrm{S}})\) production). While calculations within the framework of perturbative QCD (see e.g. Refs. [1, 2]) have been reasonably successful in describing the non-prompt contributions, a satisfactory understanding of the prompt production mechanisms is still to be achieved.

Methods developed within the non-relativistic QCD (NRQCD) approach provide a framework for describing quarkonium production processes, leading to a variety of models differing in their accuracy and predictive power. In particular, Ref. [3] introduced a number of phenomenological parameters – long-distance matrix elements (LDMEs) – which can be extracted from fits to the experimental data, and are expected to describe the cross-sections and differential spectra for several sets of data reasonably well [4,5,6,7]. However, various attempts to build a universal library of LDMEs to be used to describe a wider range of measurements such as the polarisation of quarkonia [8,9,10,11], their associated production [12, 13] or the production of quarkonium in a wider range of processes (e.g. photo- and electro-production) have not been particularly successful [14,15,16,17,18]. An alternative approach to the description of the hadronisation process of heavy quarkonia is offered by the Colour Evaporation Model (CEM) [19,20,21], which offers a simpler framework with fewer parameters, but has its own problems in describing the data [22]. A combination of ATLAS results with cross-section and polarisation measurements from CMS [7, 23, 24], LHCb [6, 25,26,27,28,29] and ALICE [11, 30,31,32,33,34,35,36] now includes a variety of charmonium production characteristics in a wide kinematic range, thus providing a wealth of information for a new generation of theoretical models.

One way to add qualitatively new information is to extend the kinematic range of quarkonium production measurements. ATLAS has previously measured the inclusive differential cross-section for \(J/\psi \) production in pp collisions at \(\sqrt{s} = 7\) and 8 TeV [4], as well as the differential cross-sections for the production of \(\chi _c\) states [37], and for \(\psi (2{\textrm{S}})\) production [5]. In most of these measurements, ATLAS exploited a dimuon trigger with a muon transverse momentum (\(p_{\textrm{T}}\)) threshold of 4 GeV, with the high-\(p_{\textrm{T}}\) reach limited mainly by the dimuon trigger’s performance to about 100 GeV: at higher \(p_{\textrm{T}}\) values the angular resolution of the muon trigger system is not sufficient to separate the two almost collinear muons. This paper describes a measurement of \(J/\psi \) \((\psi (2{\textrm{S}}))\) meson production via decay in the dimuon channel, at \(\sqrt{s} = 13\) TeV for meson transverse momenta of 8–360 GeV (8–140 GeV), which is a much broader range than in previous measurements. This was made possible by the use of two different triggers. Production of \(J/\psi \) and \(\psi \)(2S)  at low \(p_{\textrm{T}}\), between 8 and \(60~\text {GeV}\), is measured using a dimuon trigger requiring a pair of muons to each pass a \(p_{\textrm{T}}\) threshold of \(4~\text {GeV}\), while at high \(p_{\textrm{T}}\) a single-muon trigger with a \(p_{\textrm{T}}\) threshold of \(50~\text {GeV}\) was used. This allowed measurements to be performed for transverse momenta as high as \(360~\text {GeV}\) for \(J/\psi \) and \(140~\text {GeV}\) for \(\psi \)(2S). The measurements include the double-differential cross-sections for production of the two vector charmonium states (separately for the prompt and non-prompt production mechanisms), the non-prompt fraction for each state, and the prompt and non-prompt \(\psi (2{\textrm{S}})\)-to-\(J/\psi \) production ratios.

The paper is organised as follows. A brief description of the ATLAS detector is given in Sect. 2. The event selection and the analysis strategy are explained in Sect. 3, followed by a description of the systematic uncertainties affecting the measurement in Sect. 4. Results and comparisons with theoretical calculations are presented in Sect. 5, followed by a summary in Sect. 6.

2 The ATLAS detector

The ATLAS experiment [38] at the LHC is a multipurpose particle detector with a forward–backward symmetric cylindrical geometry and a near \(4\pi \) coverage in solid angle. It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer. The inner tracking detector covers the pseudorapidity range \(|\eta | < 2.5\). It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. Metal/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range \((|\eta | < 1.7).\) The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to \(|\eta | = 4.9\). The muon spectrometer surrounds the calorimeters and is based on three large superconducting air-core toroidal magnets with eight coils each. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. The muon spectrometer has a system of precision chambers for tracking and fast detectors for triggering. A two-level trigger system is used to select events. The first-level trigger is implemented in hardware and uses a subset of the detector information to accept events at a rate below 100 kHz. This is followed by a software-based trigger that reduces the accepted event rate to 1 kHz on average depending on the data-taking conditions. An extensive software suite [39] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Table 1 Parameterisation of the fit model. Here G, CB, E and P denote Gaussian, Crystal Ball, exponential and second-order polynomial functions, respectively, with different indices corresponding to different parameters. The parameterisation of the \(i=1\) term is modified as described in the text and shown in Eq. (4)

3 Analysis strategy

3.1 Event selection

Data for this analysis were taken during the LHC proton–proton collision runs at \(\sqrt{s}= 13~\text {TeV}\) in the years 2015–2018. For lower values of the transverse momentum \(p_{\textrm{T}}\) of the dimuon system, between 8 and \(60~\text {GeV}\), a dimuon trigger was used, requiring a pair of muons to each pass a \(p_{\textrm{T}}\) threshold of \(4~\text {GeV}\). This trigger ran unprescaled during 2015 data-taking, collecting an integrated luminosity of \(2.6~\text {fb}^{-1}\). In the high \(p_{\textrm{T}}\) range between 60 and \(360~\text {GeV}\), a single-muon trigger with a \(p_{\textrm{T}}\) threshold of \(50~\text {GeV}\) was used, unprescaled throughout the full Run 2 data-taking, providing a total integrated luminosity of \(140~\text {fb}^{-1}\). The selected events were required to contain a pair of oppositely charged muons of high quality (using the tight identification requirements defined in Ref. [40]), with \(p_{\textrm{T}}>4~\text {GeV}\) and \(|\eta |<2.4\). In the low \(p_{\textrm{T}}\) range the two muons were required to match the two trigger objects of the dimuon trigger, while in the high \(p_{\textrm{T}}\) range at least one of the muons was required to have \(p_{\textrm{T}}> 52.5~\text {GeV}\) and match the trigger object. The two ID tracks attributed to the muons were fitted to a common vertex, and the dimuon invariant mass \(m_{\mu \mu }\) was required to satisfy \(2.6< m_{\mu \mu } < 4.2~\text {GeV}\). The transverse distance \(L_{xy}\) between the primary vertex and the dimuon vertex was used to calculate the meson’s pseudo-proper decay time

$$\begin{aligned} \tau = \frac{m_{\mu \mu }}{p_{\textrm{T}}}\frac{L_{xy}}{c}, \end{aligned}$$

where \(p_{\textrm{T}}\) is the reconstructed transverse momentum of the dimuon system, and c is the speed of light. The primary vertex is chosen as the reconstructed collision interaction vertex whose z coordinate is nearest to the point of closest approach of the dimuon system’s trajectory to the beam axis. If an event has more than one selected dimuon candidate, all candidates are retained and treated independently.
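As a minimal sketch of the pseudo-proper decay time defined above, the following Python function evaluates \(\tau \) for a single dimuon candidate. The function name, argument names and the choice of units (GeV for mass and momentum, mm for \(L_{xy}\), ps for \(\tau \)) are assumptions made for illustration only; they are not taken from the analysis software.

```python
# Speed of light in mm/ps, so that L_xy given in mm yields tau in ps.
C_MM_PER_PS = 0.299792458


def pseudo_proper_time(m_mumu_gev, pt_gev, lxy_mm):
    """Return tau = (m_mumu / pT) * (L_xy / c) in picoseconds.

    m_mumu_gev : dimuon invariant mass in GeV
    pt_gev     : dimuon transverse momentum in GeV
    lxy_mm     : signed transverse decay distance in mm
    """
    if pt_gev <= 0.0:
        raise ValueError("dimuon pT must be positive")
    return (m_mumu_gev / pt_gev) * (lxy_mm / C_MM_PER_PS)


# Illustrative (made-up) candidate: m = 3.097 GeV, pT = 20 GeV, L_xy = 1 mm.
tau_example = pseudo_proper_time(3.097, 20.0, 1.0)
```

A negative \(L_{xy}\) gives a negative \(\tau \), which is why the fitted decay time range can extend below zero.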

3.2 Cross-section determination

The phase space of the measurement is divided into 34 intervals in dimuon \(p_{\textrm{T}}\) covering the range from 8 to 360 GeV, and 3 intervals in absolute rapidity |y| with boundaries at 0, 0.75, 1.5 and 2.0, thus producing 102 analysis bins overall. In each \((p_{\textrm{T}},y)\) bin, a two-dimensional unbinned maximum-likelihood fit to the distribution of dimuon candidates in invariant mass \(m_{\mu \mu }\) and pseudo-proper decay time \(\tau \) of the \(\psi \) meson is performed to obtain the raw yields \(N_{\psi }^{\textrm{P},\textrm{NP}}\) for prompt (P) and non-prompt (NP) \(\psi \) mesons, where \(\psi = J/\psi , \; \psi (2S)\). The raw yields are then corrected to account for the geometrical acceptance \({\mathcal {A}(\psi )}\), the trigger and reconstruction efficiencies \(\epsilon _{\textrm{trig}}\) and \(\epsilon _{\textrm{reco}}\), and the trigger and reconstruction correction scale factors \( \epsilon _{\textrm{trigSF}}\) and \( \epsilon _{\textrm{recoSF}}\), averaged over that bin. Several low \(p_{\textrm{T}}\) bins are divided into narrower sub-bins to obtain raw yields in finer granularity, which are then corrected and summed to give the final yield in the corresponding analysis bin. This procedure helps to reduce measurement biases due to modelling assumptions in the regions of phase space with large statistical power.

The prompt (P) and non-prompt (NP) double-differential production cross-sections for \(\psi = J/\psi , \; \psi (2S)\) in each analysis bin are calculated as

$$\begin{aligned}&\frac{\textrm{d}^{2}\sigma ^{\textrm{P},\textrm{NP}} (pp\rightarrow \psi )}{\textrm{d}p_{\textrm{T}}\textrm{d}y} \times \mathcal {B}(\psi \rightarrow \mu ^+ \mu ^-) \\&\quad = \frac{1}{{\mathcal {A}(\psi )} \epsilon _{\textrm{trig}} \epsilon _{\textrm{trigSF}} \epsilon _{\textrm{reco}} \epsilon _{\textrm{recoSF}}}\; \frac{N_{\psi }^{\textrm{P},\textrm{NP}}}{\;\Delta p_{\textrm{T}} \; \Delta y \;\int \!{\mathcal {L}} \textrm{d}t}, \end{aligned} \tag{1}$$

where \(\Delta p_{\textrm{T}}\) and \(\Delta y\) are bin widths in \(p_{\textrm{T}}\) and rapidity, and \(\int \!{\mathcal {L}} \textrm{d}t\) is the corresponding integrated luminosity. Bin migration effects are discussed in Sect. 3.4. The acceptance \({\mathcal {A}}(\psi )\) is defined as the probability that a \(\psi \) state with (true) momentum within an analysis bin survives the following acceptance selections imposed on the two muons (assuming \(p_{\textrm{T}}(\mu _1)>p_{\textrm{T}}(\mu _2)\)) in the two \(\psi \) \(p_{\textrm{T}}\) ranges:

  • low \(p_{\textrm{T}}\) range, \(p_{\textrm{T}}(\psi )< 60\) GeV: \(p_{\textrm{T}}(\mu _1)>4\) GeV, \(p_{\textrm{T}}(\mu _2)>4\) GeV, \(|\eta (\mu _1)|, |\eta (\mu _2)|<2.4\);

  • high \(p_{\textrm{T}}\) range, \(p_{\textrm{T}}(\psi )\ge 60\) GeV: \(p_{\textrm{T}}(\mu _1)>52.5\) GeV, \(p_{\textrm{T}}(\mu _2)>4\) GeV, \(|\eta (\mu _1)|, |\eta (\mu _2)|<2.4\).

The acceptance calculation is performed using Monte Carlo (MC) generator-level kinematic variables, with resolution effects taken into account at the efficiency correction stage. An isotropic angular distribution of muons in the \(\psi \) decay frame is assumed. Since the spin alignment of the \(\psi \) states may affect the acceptance, a number of non-isotropic spin-alignment scenarios are used to calculate correction factors for the measured cross-sections (see Appendix). For a given spin-alignment scenario, systematic uncertainties due to the acceptance calculation are small (see Sect. 4). Changing to a different spin-alignment scenario, however, can lead to noticeable changes in the cross-sections and other measured quantities, including variations as a function of \(p_{\textrm{T}}\) (see Sect. 5).
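The acceptance definition above can be illustrated with a toy generator-level calculation. The sketch below is not the procedure used in the analysis: it evaluates the acceptance at a single \((p_{\textrm{T}}, y)\) point rather than averaging over an analysis bin, it assumes isotropic decays, and all function names, the default sample size and the random seed are hypothetical choices made for this example. Only the muon selection thresholds are taken from the text.

```python
import math
import random

M_MU = 0.1056584   # muon mass in GeV
M_JPSI = 3.0969    # J/psi mass in GeV


def decay_muons(pt, y, mass, rng):
    """Decay a psi of given pT and rapidity isotropically into two muons.

    Returns the lab-frame three-momenta (px, py, pz) of the two muons.
    """
    # psi four-momentum, with its azimuth chosen along the x axis.
    mt = math.hypot(mass, pt)
    big_e = mt * math.cosh(y)
    big_p = (pt, 0.0, mt * math.sinh(y))
    # Isotropic muon direction in the psi rest frame.
    cos_t = rng.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    p_star = math.sqrt(0.25 * mass * mass - M_MU * M_MU)
    e_star = 0.5 * mass
    q = (p_star * sin_t * math.cos(phi),
         p_star * sin_t * math.sin(phi),
         p_star * cos_t)
    muons = []
    for sign in (1.0, -1.0):  # the two muons are back to back
        qs = tuple(sign * c for c in q)
        dot = sum(a * b for a, b in zip(qs, big_p))
        # Lorentz boost from the psi rest frame to the lab frame.
        k = dot / (mass * (big_e + mass)) + e_star / mass
        muons.append(tuple(a + k * b for a, b in zip(qs, big_p)))
    return muons


def passes_selection(muons, pt_psi):
    """Apply the muon pT and |eta| requirements listed in the text."""
    kin = []
    for px, py, pz in muons:
        pt = math.hypot(px, py)
        abs_eta = abs(math.asinh(pz / pt)) if pt > 0.0 else float("inf")
        kin.append((pt, abs_eta))
    kin.sort(reverse=True)  # leading muon first
    lead_cut = 4.0 if pt_psi < 60.0 else 52.5
    return (kin[0][0] > lead_cut and kin[1][0] > 4.0
            and all(abs_eta < 2.4 for _, abs_eta in kin))


def acceptance(pt, y, n_trials=20000, mass=M_JPSI, seed=1):
    """Fraction of toy decays at fixed (pT, y) that pass the selection."""
    rng = random.Random(seed)
    n_pass = sum(passes_selection(decay_muons(pt, y, mass, rng), pt)
                 for _ in range(n_trials))
    return n_pass / n_trials
```

For example, `acceptance(100.0, 0.0)` estimates the acceptance for a central \(J/\psi \) at \(p_{\textrm{T}}=100\) GeV in the single-muon trigger regime. A non-isotropic spin-alignment scenario would replace the uniform `cos_t` sampling with the corresponding angular distribution.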

The reconstruction efficiency \(\epsilon _{\textrm{reco}}\) and trigger efficiency \(\epsilon _{\textrm{trig}}\) are calculated using samples of fully simulated \(J/\psi \) and \(\psi \)(2S) events, including appropriate trigger information. Correction scale factors \(\epsilon _{\textrm{recoSF}}\) and \(\epsilon _{\textrm{trigSF}}\) account for the differences between simulated and real data. See Sect. 3.4 for more details.

The non-prompt fractions for \(\psi = J/\psi , \psi (2{\textrm{S}})\) are defined as

$$\begin{aligned} F^{\textrm{NP}}_{\psi }(p_{\textrm{T}}, y)&= \frac{\textrm{d}^{2}\sigma ^{\textrm{NP}} (pp\rightarrow \psi )}{\textrm{d}p_{\textrm{T}}\textrm{d}y} \\&\quad \times \left[ \frac{\textrm{d}^{2}\sigma ^{\textrm{P}} (pp\rightarrow \psi )}{\textrm{d}p_{\textrm{T}}\textrm{d}y} + \frac{\textrm{d}^{2}\sigma ^{\textrm{NP}} (pp\rightarrow \psi )}{\textrm{d}p_{\textrm{T}}\textrm{d}y}\right] ^{-1}. \end{aligned} \tag{2}$$

Finally, \(\psi \)(2S)-to-\(J/\psi \) production ratios are defined separately for the prompt and non-prompt production mechanisms as

$$\begin{aligned}&R^{\textrm{P},\textrm{NP}}(p_{\textrm{T}}, y) \\&\quad = \frac{\textrm{d}^{2} \sigma ^{\textrm{P}, \textrm{NP}} (pp\rightarrow \psi (2{\textrm{S}}))}{\textrm{d}p_{\textrm{T}}\textrm{d}y}\times \mathcal {B} (\psi (2{\textrm{S}})\rightarrow \mu ^+ \mu ^-) \\&\qquad \times \left[ \frac{\textrm{d}^{2}\sigma ^{\textrm{P}, \textrm{NP}} (pp\rightarrow J/\psi )}{\textrm{d}p_{\textrm{T}}\textrm{d}y} \times \mathcal {B}(J/\psi \rightarrow \mu ^+ \mu ^-)\right] ^{-1}. \end{aligned} \tag{3}$$

In calculating these quantities, the event yields, efficiencies, and acceptance corrections are used in accord with Eq. (1); uncertainties in the fraction and ratio measurements partially cancel out.
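The arithmetic of Eqs. (1)–(3) can be sketched in a few lines of Python. The function and argument names are illustrative assumptions, and the numbers in the example are invented purely to show the calculation; they are not measured values. Note that the branching fraction cancels in the non-prompt fraction, so it can be computed directly from the products of cross-section and branching fraction.

```python
def diff_xsec_times_br(n_raw, acc, eff_trig, sf_trig, eff_reco, sf_reco,
                       d_pt, d_y, lumi):
    """Eq. (1): d2sigma/(dpT dy) x B(psi -> mu mu).

    The result is in units of [1/lumi] per GeV, e.g. fb/GeV when the
    integrated luminosity is given in fb^-1.
    """
    correction = acc * eff_trig * sf_trig * eff_reco * sf_reco
    return n_raw / (correction * d_pt * d_y * lumi)


def non_prompt_fraction(xsec_prompt, xsec_non_prompt):
    """Eq. (2): non-prompt fraction for one psi state in one bin."""
    return xsec_non_prompt / (xsec_prompt + xsec_non_prompt)


def psi2s_to_jpsi_ratio(xsec_br_psi2s, xsec_br_jpsi):
    """Eq. (3): ratio of (cross-section x branching fraction) values."""
    return xsec_br_psi2s / xsec_br_jpsi


# Invented inputs for a single bin, for illustration only.
example_xsec = diff_xsec_times_br(
    n_raw=1000.0, acc=0.5, eff_trig=0.8, sf_trig=1.0,
    eff_reco=0.5, sf_reco=1.0, d_pt=2.0, d_y=0.75, lumi=2.6)
```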

Fig. 1 Mass (left) and pseudo-proper decay time (right) projections of the fit result for selected analysis (sub-)bins. The values of \(\chi ^2/\)d.o.f. for the corresponding 2-dimensional fits are, from top to bottom: 1.09, 0.89, 1.20 and 0.93

Fig. 2 Total, statistical, and systematic uncertainties (in %) as functions of \(p_{\textrm{T}}\) for the differential (a) prompt \(J/\psi \) and (b) non-prompt \(\psi (2{\textrm{S}})\) cross-sections, and for the non-prompt fractions of (c) \(J/\psi \) and (d) \(\psi (2{\textrm{S}})\), in the rapidity slice \(0.00 \le |y|<0.75\). The main components of the systematic uncertainties are also shown

Fig. 3 Differential cross-sections for (a) prompt and (b) non-prompt production of \(J/\psi \) mesons. For visual clarity, a scaling factor of 1, 10, or 100 is applied to the rapidity slices \(0.00 \le |y|<0.75\), \(0.75\le |y|<1.5\), and \(1.5\le |y|<2.0\), respectively. For each data point, the horizontal bar spans the \(p_{\textrm{T}}\) range covered by that bin, with the horizontal position of each point representing the mean \(p_{\textrm{T}}\) in that bin. The vertical uncertainty range (obscured by the marker for some values) combines both the statistical (the inner bar) and total uncertainty. Uncertainties related to spin alignment or integrated luminosity are not included. Data up to \(60~\text {GeV}\) were taken with a dimuon trigger with integrated luminosity \(2.6~\text {fb}^{-1}\); data above \(60~\text {GeV}\) were taken with a single-muon trigger with integrated luminosity \(140~\text {fb}^{-1}\)

3.3 Fit model

The fit model’s probability distribution function \(F(m_{\mu \mu },\tau )\) contains seven terms,

$$\begin{aligned} F(m_{\mu \mu },\tau ) = \sum _{i=1}^{7} \kappa _i P_i(m_{\mu \mu },\tau ), \end{aligned}$$

with fractions \(\kappa _i\), describing four signal contributions and three types of background. Terms \(i=1,2\) describe prompt and non-prompt \(J/\psi \) signal respectively; terms \(i=3,4\) correspond to prompt and non-prompt \(\psi \)(2S) signal. Term \(i=5\) describes the prompt background, where non-resonant dimuons are produced at the primary vertex (e.g. Drell–Yan pairs). Term \(i=6\) describes single-sided non-prompt background, mainly for dimuon continuum events where the two muons originate from the (cascade) decay of a single b-hadron, while term \(i=7\) describes the double-sided part of the non-prompt continuum, where the two muons originate from different b-hadrons, yielding a secondary vertex which may appear on either side of the beamline. For \(i=2\)–7, each term is factorised into a function \(f_i(m_{\mu \mu })\) of dimuon mass \(m_{\mu \mu }\) and a function \(h_i(\tau )\) of pseudo-proper decay time \(\tau \), where the latter is convolved with a decay time resolution function \(R(\tau )\):

$$\begin{aligned} P_i(m_{\mu \mu },\tau ) = f_{i}(m_{\mu \mu })\cdot \left[ h_{i}(\tau )\otimes R(\tau )\right] . \end{aligned}$$

The term with \(i=1\), which describes the prompt \(J/\psi \) peak, has a similar structure but allows for some correlations between \(m_{\mu \mu }\) and \(\tau \), as described below.

The decay time resolution function \(R(\tau )\) is parameterised as a sum of three Gaussian functions, \(G_{\textrm{A}}, G_{\textrm{B}}\) and \(G_{\textrm{C}}\), with the relative weights of the first two, \(\omega _{\textrm{A}}\) and \(\omega _{\textrm{B}}\), treated as free parameters:

$$\begin{aligned} R(\tau ) = \omega _{\textrm{A}} G_{\textrm{A}}(\tau ) + \omega _{\textrm{B}} G_{\textrm{B}}(\tau ) + (1-\omega _{\textrm{A}} -\omega _{\textrm{B}}) G_{\textrm{C}}(\tau ). \end{aligned}$$

Based on MC studies with fully simulated signal samples, the means of the three Gaussian functions are fixed to zero, and the widths are linked by \(\sigma _{\textrm{B}} = 2\sigma _{\textrm{A}}\) and \(\sigma _{\textrm{C}} = 4\sigma _{\textrm{A}}\), where \(\sigma _{\textrm{A}}=0.04\) ps is fixed to the smallest value found in test fits.
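The resolution model described above is simple enough to write out directly. The following Python sketch implements the triple-Gaussian \(R(\tau )\) with zero means and widths \(\sigma _{\textrm{A}}\), \(2\sigma _{\textrm{A}}\) and \(4\sigma _{\textrm{A}}\); the function names and the example weights are hypothetical, while \(\sigma _{\textrm{A}}=0.04\) ps is the value quoted in the text.

```python
import math

SIGMA_A_PS = 0.04  # narrowest resolution width in ps, as quoted in the text


def gauss(x, sigma):
    """Normalised zero-mean Gaussian density."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def resolution(tau, w_a, w_b, sigma_a=SIGMA_A_PS):
    """R(tau) = w_a G_A + w_b G_B + (1 - w_a - w_b) G_C.

    The widths are linked by sigma_B = 2 sigma_A and sigma_C = 4 sigma_A.
    """
    w_c = 1.0 - w_a - w_b
    return (w_a * gauss(tau, sigma_a)
            + w_b * gauss(tau, 2.0 * sigma_a)
            + w_c * gauss(tau, 4.0 * sigma_a))


# Example evaluation with illustrative (not fitted) weights.
r_at_zero = resolution(0.0, w_a=0.6, w_b=0.3)
```

Because each Gaussian is individually normalised and the weights sum to one, \(R(\tau )\) integrates to unity for any choice of \(\omega _{\textrm{A}}\) and \(\omega _{\textrm{B}}\).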

The parameterisations of the functions \(f_i(m_{\mu \mu })\) and \(h_i(\tau )\) are summarised in Table 1. The mass lineshapes of the \(J/\psi \) and \(\psi \)(2S) peaks, \(f_{i}(m_{\mu \mu })\) for \(i=1\) to 4, are parameterised as weighted sums of two Gaussian functions and a Crystal Ball function [41], which are the same for the prompt and non-prompt components. Based on MC studies, the weights are common to \(J/\psi \) and \(\psi \)(2S), while the ratios of the peak positions and the widths are fixed to the ratio \(\beta \) of the masses [42] of the two states. Parameters of the Crystal Ball function were kept the same for both prompt and non-prompt \(J/\psi \) and also \(\psi \)(2S), as verified from the MC studies. The lifetime distributions \(h_{1}(\tau )\) and \(h_{3}(\tau )\) of prompt \(J/\psi \) and \(\psi \)(2S), respectively, are parameterised as delta functions, while for non-prompt \(\psi \)(2S), a single-sided exponential function is used for \(h_{4}(\tau )\). Since the non-prompt \(J/\psi \) sample is larger and has a wider observed \(\tau \) range, its decay time distribution \(h_{2}(\tau )\) is described by a superposition of two single-sided exponential functions with slopes related by \(\gamma _1 = b\gamma _2\), where the constant \(b=1.4\) is obtained from test fits using real data. All these exponential functions are convolved with the resolution function \(R(\tau )\).
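The convolution of a single-sided exponential with a Gaussian has a closed form, which by linearity extends to the triple-Gaussian resolution function. The sketch below illustrates this for one exponential component. The function names, the parameterisation in terms of a mean lifetime, and the example weights are assumptions made for illustration; in particular, the text does not specify whether the fitted "slopes" are rates or lifetimes.

```python
import math


def exp_conv_gauss(tau, lifetime, sigma):
    """Single-sided exponential with mean `lifetime`, convolved with a
    zero-mean Gaussian of width `sigma` (closed-form expression)."""
    lam = 1.0 / lifetime
    arg = (lam * sigma * sigma - tau) / (math.sqrt(2.0) * sigma)
    return (0.5 * lam
            * math.exp(0.5 * lam * (lam * sigma * sigma - 2.0 * tau))
            * math.erfc(arg))


def exp_conv_resolution(tau, lifetime, w_a, w_b, sigma_a=0.04):
    """Same exponential convolved with the triple-Gaussian R(tau).

    Convolution is linear, so the result is the weighted sum of three
    single-Gaussian convolutions with widths sigma_a, 2 sigma_a, 4 sigma_a.
    """
    w_c = 1.0 - w_a - w_b
    return (w_a * exp_conv_gauss(tau, lifetime, sigma_a)
            + w_b * exp_conv_gauss(tau, lifetime, 2.0 * sigma_a)
            + w_c * exp_conv_gauss(tau, lifetime, 4.0 * sigma_a))
```

A two-exponential shape such as \(h_{2}(\tau )\) would be a weighted sum of two such terms with different lifetimes, and a prompt (delta-function) component convolved with \(R(\tau )\) is simply \(R(\tau )\) itself.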

In the \(i=1\) term, describing the prompt \(J/\psi \) peak, the product of the narrowest Gaussian term in \(f_1(m_{\mu \mu })\) and the narrowest Gaussian term in \(R(\tau )\) was replaced by a bivariate Gaussian function in \(m_{\mu \mu }\) and \(\tau \) with a correlation coefficient \(\rho =0.3\),

$$\begin{aligned}&\omega _0 G_1(m_{\mu \mu }; \sigma _1) \cdot f_{\textrm{A}} G_{\textrm{A}}(\tau ; \sigma _{\textrm{A}}) \\&\quad \mapsto \omega _0 f_{\textrm{A}} G_{\textrm{BV}} (m_{\mu \mu },\tau ; \sigma _1, \sigma _{\textrm{A}}, \rho ), \end{aligned} \tag{4}$$

to take into account the observed correlation between the measured values of these quantities. The effect of this correlation was found to be negligible for other terms.
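For reference, a bivariate Gaussian of the kind used in Eq. (4) can be written as follows. This is a generic textbook form, sketched under the assumption that the mass variable is measured relative to the peak position and that the decay time term has zero mean; the function name and the example widths are hypothetical.

```python
import math


def bivariate_gauss(dm, tau, sigma_m, sigma_t, rho=0.3):
    """Zero-mean bivariate Gaussian density in (dm, tau).

    dm      : m_mumu minus the peak position
    sigma_m : mass width (sigma_1 in the text)
    sigma_t : decay time width (sigma_A in the text)
    rho     : correlation coefficient between the two variables
    """
    one_minus_rho2 = 1.0 - rho * rho
    u = dm / sigma_m
    v = tau / sigma_t
    z = (u * u - 2.0 * rho * u * v + v * v) / (2.0 * one_minus_rho2)
    norm = 2.0 * math.pi * sigma_m * sigma_t * math.sqrt(one_minus_rho2)
    return math.exp(-z) / norm


def gauss(x, sigma):
    """Normalised zero-mean one-dimensional Gaussian density."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
```

Setting `rho=0` recovers the product of two independent one-dimensional Gaussians, which is the uncorrelated form that the bivariate term replaces.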

Parameterisations for the background terms are selected using both the experience gained from similar analyses at lower energies [4] and physics considerations. In the prompt background term, \(i=5\), the mass distribution is modelled by a second-order polynomial, while the non-prompt mass distributions for \(i=6\) and 7 are parameterised as exponential functions, with independent parameters. The decay time distribution is a delta function for the prompt term \(i=5\), a single-sided exponential function for the main non-prompt term \(i=6\), and a symmetric double-sided exponential function for the last term, \(i=7\). Each of these is also convolved with the decay time resolution function \(R(\tau )\).

Fits are performed in each (sub-)bin using an unbinned maximum-likelihood method, in the dimuon mass range from 2.7 to 4.1 GeV and the decay time range between \(-1\) and 11 ps. Twenty of the 29 parameters are determined from the fit, with the rest fixed to predetermined values obtained from test fits using samples of MC simulated \(J/\psi \) and \(\psi \)(2S) signal events. Uncertainties due to variations of the fit model and assumptions about the fixed parameters are estimated during the studies of systematic uncertainties described in Sect. 4. In particular, it is found that a reliable determination of the cross-sections for prompt and non-prompt production of the \(\psi \)(2S)  meson (with significance better than \(5\sigma \)) is only possible up to \(p_{\textrm{T}}= 140\) GeV, mainly due to poorer mass resolution and a lower signal-to-background ratio at higher \(p_{\textrm{T}}\). For \(p_{\textrm{T}}>140\) GeV the yield of \(\psi \)(2S) was fixed to a constant fraction (0.07) of the yield of \(J/\psi \), by extrapolating the results from the fits at lower transverse momenta.

Figure 1 shows the mass and pseudo-proper decay time projections of the fits in several sample bins, together with the associated pull distributions. The quality of the fits, assessed by calculating a two-dimensional \(\chi ^2\) value, is found to be good in all (sub-)bins.

The main parameters determined from the fits are the prompt and non-prompt yields of \(J/\psi \) and \(\psi (2{\textrm{S}})\) states. The cross-sections, non-prompt fractions and production ratios were then calculated using Eqs. (1)–(3), taking into account correlations between fit parameters. The results for all measured quantities are presented in Sect. 5.

Fig. 4 Differential cross-sections for (a) prompt and (b) non-prompt production of \(\psi (2{\textrm{S}})\) mesons. For visual clarity, a scaling factor of 1, 10, or 100 is applied to the rapidity slices \(0.00 \le |y|<0.75\), \(0.75\le |y|<1.5\), and \(1.5\le |y|<2.0\), respectively. For each data point, the horizontal bar spans the \(p_{\textrm{T}}\) range covered by that bin, with the horizontal position of each point representing the mean \(p_{\textrm{T}}\) in that bin. The vertical uncertainty range (obscured by the marker for some values) combines both the statistical (the inner bar) and total uncertainty. Uncertainties related to spin alignment or integrated luminosity are not included. Data up to \(60~\text {GeV}\) were taken with a dimuon trigger with integrated luminosity \(2.6~\text {fb}^{-1}\); data above \(60~\text {GeV}\) were taken with a single-muon trigger with integrated luminosity \(140~\text {fb}^{-1}\)

Fig. 5 Non-prompt production fraction of (a) \( J/\psi \) and (b) \(\psi (2{\textrm{S}})\) mesons. For visual clarity, a vertical shift of 0, 0.2, or 0.4 is applied to the rapidity slices \(0.00 \le |y|<0.75\), \(0.75\le |y|<1.5\), and \(1.5\le |y|<2.0\), respectively. For each data point, the horizontal bar spans the \(p_{\textrm{T}}\) range covered by that bin, with the horizontal position of each point representing the mean \(p_{\textrm{T}}\) in that bin. The vertical uncertainty range (obscured by the marker for some values) combines both the statistical (the inner bar) and total uncertainty. Uncertainties related to spin alignment or integrated luminosity are not included. Data up to \(60~\text {GeV}\) were taken with a dimuon trigger with integrated luminosity \(2.6~\text {fb}^{-1}\); data above \(60~\text {GeV}\) were taken with a single-muon trigger with integrated luminosity \(140~\text {fb}^{-1}\)

Fig. 6 The \(\psi \)(2S)-to-\(J/\psi \) production ratio for the (a) prompt and (b) non-prompt production mechanisms. For visual clarity, a vertical shift of 0, 0.02, or 0.04 is applied to the rapidity slices \(0.00 \le |y|<0.75\), \(0.75\le |y|<1.5\), and \(1.5\le |y|<2.0\), respectively. For each data point, the horizontal bar spans the \(p_{\textrm{T}}\) range covered by that bin, with the horizontal position of each point representing the mean \(p_{\textrm{T}}\) in that bin. The vertical uncertainty range (hidden by the marker for some values) combines both the statistical (the inner bar) and total uncertainty. Uncertainties related to spin alignment or integrated luminosity are not included. Data up to \(60~\text {GeV}\) were taken with a dimuon trigger with integrated luminosity \(2.6~\text {fb}^{-1}\); data above \(60~\text {GeV}\) were taken with a single-muon trigger with integrated luminosity \(140~\text {fb}^{-1}\)

Fig. 7 Spin-alignment hypothesis correction factors for the \(J/\psi \) (a) differential cross-section and (b) non-prompt production fraction, for a number of spin-alignment scenarios. The correction factors are approximately the same for \(J/\psi \) and \(\psi \)(2S), for the prompt and non-prompt production mechanisms, and also for the three rapidity regions. The discontinuities at \(p_{\textrm{T}}=60\) GeV are due to the transition from a low-\(p_{\textrm{T}}\) dimuon trigger to a high-\(p_{\textrm{T}}\) single-muon trigger, and the corresponding change in event acceptance

Fig. 8 Ratios of various theoretical predictions (described in the text) to the data points from this measurement, for the prompt production of (a) \(J/\psi \) and (b) \(\psi \)(2S) in the central rapidity region. In each \(p_{\textrm{T}}\) bin, the shaded area represents the ratio of the theoretical prediction to the measured value, with the vertical spread showing the uncertainties of the respective model. Error bars on the black circles show fractional uncertainties of this measurement

3.4 Efficiency corrections

As shown in Eq. (1), the yields obtained from two-dimensional maximum-likelihood fits in each (sub-)bin are subject to acceptance corrections, followed by the corrections for reconstruction and trigger efficiencies obtained from \(J/\psi \) and \(\psi \)(2S) MC simulations, and by the correction scale factors that account for differences between data and MC simulation.

MC samples used for efficiency determinations were produced either by the PYTHIA 8 generator [43] with the A14 set of tuned parameters [44], or by a custom particle gun generator producing single \(\psi \) states with a given distribution, followed by their decay into a dimuon final state. The generated events were passed through the full ATLAS detector simulation [45] based on GEANT4 [46], and were reconstructed using the same software as the real data. The reconstruction efficiency \(\epsilon _{\textrm{reco}}\) in each analysis (sub-)bin is defined as the ratio

$$\begin{aligned} \epsilon _{\textrm{reco}} = \frac{N_{\textrm{reco}}}{N_{\textrm{true}}}, \end{aligned}$$

where \(N_{\textrm{reco}} \) is the reconstructed yield within the bin boundaries defined in terms of reconstructed variables, with fiducial cuts applied to reconstructed variables, while \(N_{\textrm{true}}\) is the true (generated) yield within the bin boundaries defined in terms of true variable values, with fiducial cuts applied to the true variable values. This definition takes into account detector resolution smearing of the kinematic variables used to define the fiducial cuts and bin boundaries, and hence includes bin migration effects between neighbouring bins.

The trigger efficiency \(\epsilon _{\textrm{trig}}\), also obtained using MC simulations, is defined fully in terms of reconstructed variables as

$$\begin{aligned} \epsilon _{\textrm{trig}} = \frac{N_{\textrm{trig}}}{N_{\textrm{reco}}}, \end{aligned}$$

where \(N_{\textrm{trig}} \) is the number of triggered events among the reconstructed events. Since this measurement used two different triggers, the trigger efficiency was calculated accordingly. Further correction scale factors, \(\epsilon _{\textrm{recoSF}}\) and \(\epsilon _{\textrm{trigSF}}\), are applied to account for any differences between data and MC events at reconstruction level and trigger level (see Eq. (1)). These are evaluated in dedicated tag-and-probe studies with auxiliary triggers, using a mixture of \(J/\psi \rightarrow \mu ^+\mu ^-\) and \(Z \rightarrow \mu ^+\mu ^-\) decays (see Ref. [40] for details).
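The two efficiency definitions can be illustrated with a toy event loop. The sketch below is only an illustration of the counting logic: the event record layout (`true_pt`, `reco_pt`, `triggered`) is hypothetical, only a single \(p_{\textrm{T}}\) bin is considered, and the fiducial and rapidity requirements are omitted for brevity. It shows how counting \(N_{\textrm{reco}}\) in reconstructed variables and \(N_{\textrm{true}}\) in true variables folds migration across bin boundaries into \(\epsilon _{\textrm{reco}}\).

```python
def bin_efficiencies(events, pt_lo, pt_hi):
    """Return (eff_reco, eff_trig) for the bin pt_lo <= pT < pt_hi.

    N_true is counted using true pT, N_reco using reconstructed pT, and
    N_trig is the triggered subset of the reconstructed events.
    """
    n_true = sum(1 for e in events if pt_lo <= e["true_pt"] < pt_hi)
    reco = [e for e in events
            if e["reco_pt"] is not None and pt_lo <= e["reco_pt"] < pt_hi]
    if n_true == 0 or not reco:
        raise ValueError("empty bin: efficiencies are undefined")
    n_trig = sum(1 for e in reco if e["triggered"])
    return len(reco) / n_true, n_trig / len(reco)


# Toy simulated events (invented values) around a 10-11 GeV bin.
toy_events = [
    {"true_pt": 10.5, "reco_pt": 10.4, "triggered": True},
    {"true_pt": 10.9, "reco_pt": 11.1, "triggered": True},   # migrates out
    {"true_pt": 9.9,  "reco_pt": 10.1, "triggered": False},  # migrates in
    {"true_pt": 10.2, "reco_pt": None, "triggered": False},  # not reconstructed
    {"true_pt": 10.7, "reco_pt": 10.6, "triggered": True},
]
eff_reco, eff_trig = bin_efficiencies(toy_events, 10.0, 11.0)
```

With this definition \(\epsilon _{\textrm{reco}}\) is not a strict binomial fraction, since events can enter the numerator without being counted in the denominator.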

4 Systematic uncertainties

Systematic effects from a variety of sources were studied, and appropriate corrections and uncertainties were assigned to all measured quantities. The systematic uncertainties can be broadly grouped into those related to reconstruction, trigger, and acceptance corrections, and those related to the fit model.

In order to assess the systematic uncertainties related to the fit model, the fit model was varied in several ways. As mentioned in Sect. 3.3, in the nominal fits some of the parameters describing the lineshapes of the signal peaks in the mass and lifetime domains were fixed to the values obtained from signal MC studies. These include the values of parameters \(\alpha \) and n of the Crystal Ball function, the constant \(\beta \) linking the positions of the \(J/\psi \) and \(\psi \)(2S) mass peaks, and the factor b relating the slopes of the two exponential functions describing the distribution of \(J/\psi \) decay times, \(\tau \). These parameters were allowed to float, one at a time, and the fits were repeated. Some variations covered changes in the decay time resolution parameterisation, with widths of the three Gaussian functions changed independently. In other variations, alternative parameterisations were chosen for the mass dependence of the background terms, and the fits were repeated. In one of the variations the correlation parameter \(\rho \) from Eq. (4) was set to 0. Finally, in the highest \(p_{\textrm{T}}\) bins, \(p_{\textrm{T}}>140\) GeV, the fixed fraction 0.07, relating the \(\psi \)(2S) and \(J/\psi \) yields, was varied by 0.01 to cover the observed range at lower \(p_{\textrm{T}}\) values, and the fits were run again. After each rerun, changes in the measured yields were recorded. The outcome of this process, in each analysis bin, was a number of measurements of each yield, scattered around the result of the nominal fit for that yield. It was assumed that the probability distribution of these variations around the nominal value was uniform between the smallest and largest measured values, and, therefore, the corresponding systematic uncertainty was evaluated as the standard deviation of this uniform distribution.
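The standard deviation of a uniform distribution on an interval of width \(w\) is \(w/\sqrt{12}\), so the prescription above reduces to a one-line calculation. The sketch below assumes that the input list holds the yields from the varied fits for one analysis bin; whether the nominal yield is included in that list is not specified in the text and is left to the caller. The function name and the example yields are invented for illustration.

```python
import math


def fit_model_systematic(varied_yields):
    """Standard deviation of a uniform distribution spanning the smallest
    and largest yields obtained from the fit-model variations."""
    if len(varied_yields) < 2:
        raise ValueError("need at least two variations")
    spread = max(varied_yields) - min(varied_yields)
    return spread / math.sqrt(12.0)


# Invented yields from four fit variations in one bin.
syst_example = fit_model_systematic([980.0, 1010.0, 1040.0, 995.0])
```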

Acceptance-related systematic uncertainties are governed by the size of the samples used to generate the corresponding acceptance maps. The chosen size ensured that these uncertainties are small relative to the total systematic uncertainties. Changes in acceptance due to different spin-alignment hypotheses were treated separately (see Appendix).

Systematic uncertainties due to trigger and reconstruction efficiency corrections include the uncertainties on the scale factors [40] as well as those due to the sizes and shapes of the MC samples. These were added in quadrature.

The total systematic uncertainty in each bin is calculated by summing in quadrature the uncertainties from the above-mentioned sources. The total uncertainty is calculated as the sum in quadrature of the total systematic and statistical uncertainties.
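As a sketch of this combination, assuming the per-bin uncertainties are expressed in the same units (here percent), the quadrature sums could be written as follows. The helper name and all numerical values are hypothetical placeholders, not results of this measurement:

```python
import math

def quadrature_sum(*components):
    """Combine independent uncertainties by summing them in quadrature."""
    return math.sqrt(sum(c * c for c in components))

# Hypothetical relative uncertainties (in percent) for a single analysis bin.
fit_model = 3.0
acceptance = 1.0
trigger_and_reco = 2.0
statistical = 1.5

total_systematic = quadrature_sum(fit_model, acceptance, trigger_and_reco)
total = quadrature_sum(total_systematic, statistical)
print(f"systematic: {total_systematic:.2f}%, total: {total:.2f}%")
```

Summing in quadrature treats the sources as uncorrelated, which is the assumption implicit in this kind of combination.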

An additional systematic uncertainty comes from the determination of the integrated luminosity. In the \(p_{\textrm{T}}< 60~\text {GeV}\) region, where only 2015 data contributes to this measurement, the integrated luminosity has a 1.13% uncertainty [47], while for \(p_{\textrm{T}}\ge 60~\text {GeV}\) the uncertainty in the combined 2015–2018 integrated luminosity is 0.83%, as obtained using the LUCID-2 detector [48].

As an illustration, in Fig. 2 the total, statistical, and systematic uncertainties, together with the main individual contributions to the systematic uncertainty, are shown for the central rapidity slice for the differential cross-sections of prompt \(J/\psi \) and non-prompt \(\psi (2{\textrm{S}})\) mesons, as well as for the non-prompt fractions of \(J/\psi \) and \(\psi \)(2S) mesons, as functions of \(p_{\textrm{T}}\). For the cross-sections, apart from a few high \(p_{\textrm{T}}\) bins, the uncertainties are largely systematic, while for the non-prompt fractions and \(\psi \)(2S)-to-\(J/\psi \) production ratios, statistical uncertainties dominate in many bins because the systematic uncertainties partially cancel out. The uncertainties are of similar size for the prompt and non-prompt cross-sections. Comparable results are obtained in other rapidity slices.

5 Results

The measured double-differential cross-sections for prompt and non-prompt \(J/\psi \) production in the nominal isotropic spin-alignment scenario are presented in Fig. 3a and b, respectively. The same quantities for \(\psi \)(2S) are shown in Fig. 4a and b. The non-prompt production fractions for \(J/\psi \) and \(\psi \)(2S) are presented in Fig. 5a and b. Finally, the \(\psi \)(2S)-to-\(J/\psi \) production ratios are presented in Fig. 6a and b for the prompt and non-prompt production mechanisms, respectively.

While the non-prompt fractions shown in Fig. 5 increase steadily with \(p_{\textrm{T}}\) up to about 100 GeV, they are almost constant for both \(J/\psi \) and \(\psi (2{\textrm{S}})\) in the high \(p_{\textrm{T}}\) range, which suggests similar \(p_{\textrm{T}}\)-dependences for the prompt and non-prompt differential cross-sections at very high transverse momenta.

5.1 Acceptance and spin alignment corrections

The transition between the low-\(p_{\textrm{T}}\) dimuon trigger and the high-\(p_{\textrm{T}}\) single-muon trigger at \(p_{\textrm{T}}=60\) GeV presents a particular challenge because of the sharp change in event kinematics. The corresponding changes in the acceptance and efficiency correction factors are significant and could lead to discontinuities in the measured distributions.

Since the spin alignment of \(\psi \) states may be different for the prompt and non-prompt production mechanisms, additional correction factors may be needed for all measured distributions. In order to quantitatively assess the possible impact of \(\psi \) spin alignment, correction factors are calculated for a variety of scenarios. It was found that the dependence on the polar angle \(\theta \) in the helicity frame of the \(\psi \) state causes the largest variation, so the angular dependence of \(\psi \rightarrow \mu ^+\mu ^-\) decays is assumed to be \(\propto (1+\lambda _{\theta }\cos ^2\theta )\). The correction factors are shown in Fig. 7a and b for the differential cross-sections and non-prompt fractions respectively, where the values \(\lambda _{\theta } = \pm 0.20\) are chosen to reflect the approximate level of experimental knowledge [7, 36, 49] and theoretical understanding [50,51,52] of this parameter. The correction factors are shown for prompt \(J/\psi \) in the central rapidity range, but were found to be essentially the same for \(J/\psi \) and \(\psi \)(2S), for the prompt and non-prompt production mechanisms, and also for the three rapidity regions.
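The effect of such an angular dependence on the acceptance can be sketched with a toy reweighting. The example below is an illustration under stated assumptions, not the procedure used in this analysis: it assumes an isotropically generated sample, a hypothetical acceptance requirement \(|\cos \theta |<0.6\), and a correction factor defined as the ratio of the isotropic acceptance to the acceptance under the alternative hypothesis. All function names are hypothetical.

```python
def spin_alignment_weight(cos_theta, lam):
    """Weight that reshapes an isotropic cos(theta) distribution into
    one proportional to 1 + lam * cos^2(theta), normalised to unit mean."""
    return 3.0 * (1.0 + lam * cos_theta ** 2) / (3.0 + lam)

def correction_factor(cos_thetas, accepted, lam):
    """Ratio of the isotropic acceptance to the acceptance for a given lam."""
    weights = [spin_alignment_weight(c, lam) for c in cos_thetas]
    acc_iso = sum(accepted) / len(accepted)
    acc_lam = sum(w for w, a in zip(weights, accepted) if a) / sum(weights)
    return acc_iso / acc_lam

# Toy isotropic sample: cos(theta) on a uniform grid over [-1, 1].
cos_thetas = [-1.0 + 2.0 * (i + 0.5) / 2000 for i in range(2000)]
# Hypothetical acceptance: only decays with |cos(theta)| < 0.6 are accepted.
accepted = [abs(c) < 0.6 for c in cos_thetas]

for lam in (-0.20, 0.0, 0.20):
    print(f"lambda_theta = {lam:+.2f}: factor = {correction_factor(cos_thetas, accepted, lam):.4f}")
```

In this toy setup a positive \(\lambda _{\theta }\) moves decays towards large \(|\cos \theta |\), outside the assumed acceptance, so the factor is above unity, while a negative value gives a factor below unity. The realistic acceptance depends on the muon kinematics and is not captured by a simple cut on \(\cos \theta \).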

The potential bias due to the spin-alignment assumption is especially noticeable at the \(p_{\textrm{T}} = 60\) GeV transition, and indeed a step can be seen at this point in the \(J/\psi \) non-prompt fraction in Fig. 5a, reflecting a possible issue in the spin-alignment modelling. Correction factors for some other values of \(\lambda _{\theta }\) are presented in the Appendix. These can be used for further studies, if more precise spin-alignment data and/or improved modelling become available in the future.

Fig. 9

Ratios of various theoretical predictions (described in the text) to the data points from this measurement, for non-prompt production of a \(J/\psi \) and b \(\psi \)(2S) in the central rapidity region. In each \(p_{\textrm{T}}\) bin, the shaded area represents the ratio of the theoretical prediction to the measured value, with the vertical spread showing the uncertainties of the respective model. Error bars on the black circles show fractional uncertainties of this measurement

5.2 Theory comparison: prompt production

Model calculations of prompt production of charmonium are usually based on perturbative QCD for the production of the \(c\bar{c}\) pair, and differ in the mechanism of hadronisation and formation of the bound state with specific quantum numbers.

The predictions of a model using the non-relativistic QCD approach to charmonium production cross-sections at next-to-leading order (NLO NRQCD) [53], using predetermined LDMEs [54, 55], are shown in comparison with our measurements of the \(J/\psi \) and \(\psi \)(2S) production cross-sections in the top panels of Fig. 8a and b respectively. The predictions of the model largely overlap with the data points within the theoretical uncertainties, which include variations of the renormalisation, factorisation and NRQCD scales. However, the predictions seem to overestimate the cross-sections at high \(p_{\textrm{T}}\).

One generalisation of the NRQCD approach is a model which aims to improve the description by taking into account the transverse degrees of freedom of the initial gluons in the colliding protons (\(k_T\)-factorisation model) [56, 57]. The predictions of this model, obtained with the PEGASUS event generator [58] using the LDMEs determined in Ref. [59], are compared with our measured results as shown in the middle panels of Fig. 8a and b for \(J/\psi \) and \(\psi (2{\textrm{S}})\) respectively, where the theoretical uncertainties only account for variations of the renormalisation scale. The predictions of the model tend to underestimate the cross-sections at low \(p_{\textrm{T}}\), while in the high \(p_{\textrm{T}}\) region the range of comparison is limited by the availability of the transverse-momentum-dependent parton distribution function (PDF) of the gluon [60].

A different approach is used by the Improved Colour Evaporation Model (ICEM) [61], which assigns a fixed fraction of the \(c\bar{c}\) production cross-section below the open charm threshold to individual charmonium states. Comparisons of ICEM predictions with the parameter values and their uncertainties previously determined from fits to LHCb data at 7 TeV [10, 62] are shown in the bottom panels of Fig. 8a and b for \(J/\psi \) and \(\psi (2{\textrm{S}})\) respectively. The model seems to predict somewhat harder \(p_{\textrm{T}}\) spectra than observed in the data for both \(J/\psi \) and \(\psi \)(2S), and tends to underestimate the cross-section for \(\psi \)(2S).

5.3 Theory comparison: non-prompt production

Theoretical calculations of non-prompt charmonium production are based on perturbative QCD for the production of a \(b \bar{b}\) quark pair, their hadronisation into a pair of b-hadrons, and their subsequent decay into a charmonium state with specific quantum numbers. Predictions from one such model, based on fixed-order–next-to-leading-log (FONLL) QCD calculations [1, 2], were obtained using the web-based tool [63] with default parameter values, and are shown in comparison with our measurements in the top panels of Fig. 9a and b for \(J/\psi \) and \(\psi (2{\textrm{S}})\) respectively. Here the uncertainties cover variations of both the renormalisation scale and the charm quark mass. Agreement is good at lower \(p_{\textrm{T}}\) values, but the FONLL model predicts higher cross-sections for \(J/\psi \) at the high \(p_{\textrm{T}}\) end.

Another set of predictions, based on the next-to-leading-order QCD calculation in the general-mass–variable-flavour-number scheme (GM–VFNS) [64], is shown in the middle panels of Fig. 9a and b. Parameter values were determined in Refs. [54, 65], with uncertainties originating from renormalisation scale dependence. These predictions lead to similar results, but the deviation from data at the highest \(p_{\textrm{T}}\) values is somewhat more pronounced, especially in the \(J/\psi \) case.

Finally, the NRQCD model with \(k_T\)-factorisation can also be used to predict the \(p_{\textrm{T}}\) distributions of vector charmonia through the non-prompt production mechanisms [58, 66] (see bottom panels of Fig. 9a and b). Where available, the shapes of \(p_{\textrm{T}}\) distributions are reproduced fairly well for both \(J/\psi \) and \(\psi \)(2S), but the limitations of the transverse-momentum-dependent model for the gluon PDF show up at even lower charmonium \(p_{\textrm{T}}\) values. Also, in this model the cross-section for \(\psi \)(2S) non-prompt production at low \(p_{\textrm{T}}\) is somewhat underestimated.

Overall, none of the models considered here is able to describe the data over the whole measured range of transverse momenta. The general trend shown by all theoretical models is a slower-than-observed decrease of the cross-section with \(p_{\textrm{T}}\), which could be related to insufficiently accounting for PDF evolution and/or possible dependence of LDMEs on transverse momentum. In any case, these measurements should help refine theoretical models of hadronic production of quarkonium at the highest available energies and at transverse momenta well beyond 100 GeV.

6 Summary

This paper describes a measurement of the double-differential production cross-sections of \(J/\psi \) and \(\psi (2{\textrm{S}})\) charmonium states in pp collisions at \(\sqrt{s}=13~\text {TeV}\), performed through their decays into dimuons and using 140 fb\(^{-1}\) of data collected by the ATLAS detector at the LHC during Run 2. The cross-sections for each of the two states are measured separately for prompt and non-prompt production mechanisms. The non-prompt fractions for each state are also measured, along with the \(\psi (2{\textrm{S}})\)-to-\(J/\psi \) production ratios. The rapidity range of the measurement is \(|y|<2\). For the \(\psi (2{\textrm{S}})\) meson the transverse momentum range is \(8{-}140~\text {GeV}\), while for the \(J/\psi \) state the results cover a much wider transverse momentum range, from 8 to \(360~\text {GeV}\), extending well beyond the range of previous measurements. In the high \(p_{\textrm{T}}\) range the results show similar \(p_{\textrm{T}}\)-dependences for the prompt and non-prompt differential cross-sections, with the non-prompt fractions being nearly constant for both \(J/\psi \) and \(\psi (2{\textrm{S}})\) states.

The results are compared with a number of theoretical predictions, which describe the data with varying degrees of success. The extended \(p_{\textrm{T}}\) reach of this measurement provides important fresh input for future tuning of theoretical models.