1 Introduction

Three decades ago, Matsui and Satz first suggested that charmonia, bound states of c- and \(\bar{c}\)-quarks, could be a sensitive probe of the hot, dense system created in nucleus–nucleus (A+A) collisions [1]. They postulated that Debye screening of the quark colour charge in a hot plasma would lead to the dissociation of quarkonium bound states in the medium, such as \(J/\psi \) or \(\psi (2\mathrm {S})\), when the Debye length becomes smaller than the quarkonium binding radius. Therefore, the suppression of quarkonium production should be significantly larger for \(\psi (2\mathrm {S})\) than for \(J/\psi \) because the smaller binding energy of the \(\psi (2\mathrm {S})\) facilitates its dissociation in the medium. This is referred to as sequential melting [2, 3]. In this picture, the suppression of different quarkonium states could therefore provide information related to the temperature and degree of deconfinement of the medium formed in heavy-ion collisions.

There have been numerous experimental and theoretical investigations since then that have demonstrated that other effects are also present in addition to colour screening in a deconfined plasma [4,5,6]. First, it has been shown that over a wide range of interaction energies there is already a modification in the production of \(J/\psi \) mesons in systems where a large volume of quark–gluon plasma does not appear to form, such as in proton–nucleus collisions [7,8,9]. Second, it has been shown by the ALICE Collaboration that not only a suppression of quarkonium is observed in ion–ion collisions as reported by several collaborations [10,11,12,13,14], but also an enhancement may play a role leading to an increase in the observed yields of \(J/\psi \) at low transverse momentum, \(p_{\text {T}}\), relative to higher transverse momenta [15, 16]. This observation has led to the interpretation that recombination of charm quarks and anti-quarks from the medium can play a role by providing an additional mechanism of quarkonium formation [17,18,19].

Finally, similarities between the suppression of \(J/\psi \) and the suppression of charged hadrons and D-mesons suggest that high-\(p_{\text {T}}\) \(J/\psi \)s may also be sensitive to parton energy loss in the medium [20, 21]. At LHC energies, \(J/\psi \) originates not only from the immediate formation of the composite \( c \bar{c}\) bound state (prompt \(J/\psi \)), but also from the decay of b-hadrons, which result in a decay vertex separated from the collision vertex by up to a few millimetres (non-prompt \(J/\psi \)). When a secondary vertex can be identified, using for instance the precise tracking system of the ATLAS experiment [22], it offers the intriguing possibility of using \(J/\psi \) production to study the propagation of b-quarks in the hot dense medium. Suppression of the production of b-hadrons in the medium, in the most naive picture, is caused by a completely different phenomenon from the suppression of \(c \bar{c}\) bound states. While \(c\bar{c}\) bound state formation may be inhibited by colour screening from a hot and deconfined medium, the suppression of high-\(p_{\text {T}}\) b-quark production is commonly attributed to energy loss of propagating b-quarks by collisional or radiative processes or both [23], not necessarily suppressing the total cross section but more likely shifting the yield to a lower \(p_{\text {T}}\). Quantum interference between the amplitudes for b-hadron formation inside and outside of the nuclear medium may also play a role [24].

The modification of prompt \(J/\psi \) production is not expected to be similar to the modification of non-prompt \(J/\psi \) production, since quite different mechanisms can contribute to those two classes of final states [6]. Simultaneous measurements of prompt and non-prompt charmonia are therefore essential for understanding the physics mechanisms of charmonium suppression in heavy-ion collisions.

This paper reports measurements of prompt and non-prompt per-event yields, non-prompt fraction and nuclear modification factors, \(R_{\mathrm {AA}}\), of the \(J/\psi \) and \(\psi (2\mathrm {S})\). The results are reported for Pb+Pb collisions at \(\sqrt{s_{\mathrm {NN}}}\) = 5.02 TeV in the dimuon decay channel and are presented for a 0–80% centrality range, \(9< p_{\text {T}} ^{\mu \mu } < 40\ \text {GeV}\) in dimuon transverse momentum, and \(-2< y_{\mu \mu } < 2\) in rapidity.

To quantify quarkonium suppression in Pb+Pb collisions with respect to \(pp\) collisions, the cross-section for quarkonium production in \(pp\) collisions needs to be measured. This was done in a previous ATLAS publication [25].

Section 2 describes the ATLAS detector, Sect. 3 discusses the selection procedure applied to the data, the data analysis is presented in Sect. 4 and systematic uncertainties in Sect. 5. Results and a summary of the paper are presented in Sects. 6 and 7.

2 ATLAS detector

The ATLAS detector [22] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting air-core toroid magnets with eight coils each.

The inner-detector system is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the pseudorapidity range \(|\eta | < 2.5\). A high-granularity silicon pixel detector covers the vertex region and typically provides three measurements per track, the first hit being normally in the innermost layer. Since 2015 the detector has been augmented by the insertable B-layer [26], an additional pixel layer close to the interaction point which provides high-resolution hits at small radius to improve the tracking and vertex reconstruction performance, significantly contributing to the reconstruction of displaced vertices. It is followed by a silicon microstrip tracker which comprises eight cylindrical layers of single-sided silicon strip detectors in the barrel region, and nine disks in the endcap region. These silicon detectors are complemented by a transition radiation tracker (TRT), which enables radially extended track reconstruction up to \(|\eta | = 2.0\).

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta |< 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\), to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by a steel/scintillator-tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadronic endcap calorimeters situated at \(1.5< |\eta | < 3.2\). The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules (FCal) situated at \(3.1< |\eta | < 4.9\), optimized for electromagnetic and hadronic measurements respectively.

The muon spectrometer comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroids. The precision chamber system covers the region \(|\eta | < 2.7\) with three layers of monitored drift tubes, complemented by cathode strip chambers in the forward region, where the background is the highest. The muon trigger system covers the range of \(|\eta | < 2.4\) with resistive plate chambers in the barrel, and thin gap chambers in the endcap regions.

In addition to the muon trigger, two triggers are used in Pb+Pb collisions to select minimum-bias events for the centrality characterization. These are based on the presence of a minimum amount of transverse energy in all sections of the calorimeter system (\(|\eta | < 3.2\)) or, for events which do not meet this condition, on the presence of substantial energy deposits in both zero-degree calorimeters (ZDCs), with a threshold set just below the one-neutron peak, which are primarily sensitive to spectator neutrons in the region \(|\eta | > 8.3\). Those two triggers were found to be fully efficient in the centrality range studied in this analysis.

A two-level trigger system is used to select events of interest [27]. The first-level (L1) trigger is implemented in hardware and uses a subset of detector information to reduce the event rate to a design value of at most 100 kHz. This is followed by a software-based high-level trigger (HLT), which reduces the event rate to a maximum value of 1 kHz.

3 Event and data selection

The analysis presented in this paper uses data from Pb+Pb collisions at a nucleon–nucleon centre-of-mass energy of \(\sqrt{s_{_\text {NN}}}\) = 5.02 TeV and \(pp\) collisions at a centre-of-mass energy of \(\sqrt{s}\) = 5.02 TeV recorded by the ATLAS experiment in 2015. The integrated luminosity of the previously analysed \(pp\) sample is \(25~\text{ pb }^{-1}\). The integrated luminosity of the Pb+Pb sample is \(0.42~\mathrm {nb^{-1}}\).

Events were collected using a trigger requiring that the event contains at least two reconstructed muons. In the previously analysed \(pp\) sample both muons must generate an L1 muon trigger and be confirmed by the HLT, while in the Pb+Pb sample only one muon is required to be seen by the L1 muon trigger and confirmed by the HLT; the second muon is only required to pass the HLT. At both levels the muon must satisfy the requirement of \(p_\mathrm {T} > 4\) GeV, as reconstructed by the trigger system.

Monte Carlo (MC) simulations are used for performance studies, where the response of the ATLAS detector was simulated using Geant 4 [28, 29]. Prompt (\(pp \rightarrow J/\psi \rightarrow \mu \mu \)) and non-prompt (\(pp \rightarrow b\bar{b} \rightarrow J/\psi \rightarrow \mu \mu \)) samples of \(J/\psi \) were produced with the event generator Pythia 8.212 [30] and corrected for electromagnetic radiation with Photos [31]. The A14 set of tuned parameters [32] is used together with the CTEQ6L1 parton distribution function set [33]. These samples were used to study the trigger and reconstruction performance in \(pp\) collisions. In order to simulate \(J/\psi \) production in the high-multiplicity environment of Pb+Pb collisions, the generated events were overlaid with a sample of minimum-bias events produced with HIJING [34].

Muon candidates are required to pass the “tight” muon working point selection [35] without any TRT requirements, have \(p_{\text {T}} >4\) GeV, and \(|\eta |<2.4\) in addition to being the reconstructed muon associated, in \(\Delta R<0.01\), with the trigger decision. To be selected, a muon pair must be consistent with originating from a common vertex, have opposite charge, and an invariant mass in the range \(2.6<m_{\mu \mu }<4.2\) GeV. The dimuon candidate is further required to have \(p_\text {T}^{\mu \mu } > 9\) GeV to ensure that the pair candidates are reconstructed in a fiducial region where systematic uncertainties in the final results do not vary significantly relative to the acceptance and efficiency corrections.

The centrality of Pb+Pb collisions is characterized by the sum of the transverse energy, \(\sum E_\text {T}^\mathrm {FCal}\), evaluated at the electromagnetic scale (that is before hadronic calibration) in the FCal. It describes the degree of geometric overlap of two colliding nuclei in the plane perpendicular to the beam with large overlap in central collisions and small overlap in peripheral collisions. Centrality intervals are defined in successive percentiles of the \(\sum E_\text {T}^\mathrm {FCal}\) distribution ordered from the most central (highest \(\sum E_\text {T}^\mathrm {FCal}\) ) to the most peripheral collisions. A Glauber model analysis of the \(\sum E_\text {T}^\mathrm {FCal}\) distribution was used to evaluate the mean nuclear thickness function, \(\langle T_{\text {AA}} \rangle \), and the number of nucleons participating in the collision, \(\langle N_{\text {part}} \rangle \), in each centrality interval [36,37,38]. The centrality intervals used in this measurement are indicated in Table 1 along with their respective calculations of \(\langle T_{\text {AA}} \rangle \) and \(\langle N_{\text {part}} \rangle \).
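The percentile ordering described above can be illustrated with a short Python sketch. It is only a toy: the function name `centrality_thresholds` and the nearest-rank convention are assumptions made for illustration, and no correction is applied for the fraction of inelastic events lost to event selection, which a real centrality calibration would need.

```python
def centrality_thresholds(sum_et_values, percent_edges):
    """Return the sum-ET value at each centrality percentile edge.

    Events are ordered from the most central (largest sum ET) to the most
    peripheral, so an edge of 10 marks the boundary of the 0-10% class.
    Events with sum ET strictly above the returned value fall in the more
    central class (nearest-rank convention, assumed for illustration).
    """
    ordered = sorted(sum_et_values, reverse=True)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one event")
    thresholds = []
    for edge in percent_edges:
        # Index of the first event that lies outside the top `edge` percent.
        idx = min(int(n * edge / 100.0), n - 1)
        thresholds.append(ordered[idx])
    return thresholds
```

For a toy sample with sum-ET values 1, 2, ..., 100, the 10% edge sits at 90, so the ten events with values 91 to 100 form the 0–10% class.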

Table 1 The \(\langle T_{\text {AA}} \rangle \), \(\langle N_{\text {part}} \rangle \) values and their uncertainties in each centrality bin. These are the results from the Glauber modelling of the summed transverse energy in the forward calorimeters, \(\sum E_\text {T}^\mathrm {FCal}\)

The number of minimum-bias events, \(N_{\text {evt}}\), times the centrality fraction, is used to normalize the yield in the respective centrality class. Minimum-bias events are selected by requiring that they pass at least one of the two minimum-bias triggers. The analysed dataset corresponds, after correction for the trigger prescale factor, to \(2.99 \times 10^{9}\) Pb+Pb minimum-bias events.

4 Data analysis

The pseudo-proper decay time, \(\tau \), is used to distinguish between prompt and non-prompt charmonium production. It is defined as,

$$\begin{aligned} \tau = \frac{L_{xy}m_{\mu \mu }}{p_{\text {T}} ^{\mu \mu }}, \end{aligned}$$

where \(L_{xy}\) is the distance between the position of the reconstructed dimuon vertex and the primary vertex projected onto the transverse plane. A weight, \(w_{\mathrm {total}}\), is defined for each selected dimuon candidate using the relation:

$$\begin{aligned} w^{-1}_{\mathrm {total}} = A\times \epsilon _{\mathrm {reco}}\times \epsilon _{\mathrm {trig}}, \end{aligned}$$

where A is the acceptance, \(\epsilon _{\mathrm {reco}}\) is the reconstruction efficiency, and \(\epsilon _{\mathrm {trig}}\) is the trigger efficiency.
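The two per-candidate quantities defined above can be sketched in Python as follows. The function and argument names are hypothetical, and the pseudo-proper decay time is returned in the length unit of \(L_{xy}\) because the defining equation carries no explicit factor of c.

```python
def pseudo_proper_decay_time(l_xy, m_mumu, pt_mumu):
    """tau = L_xy * m_mumu / pT_mumu, in the length unit of l_xy.

    m_mumu and pt_mumu must be given in the same energy unit (e.g. GeV).
    """
    if pt_mumu <= 0.0:
        raise ValueError("dimuon pT must be positive")
    return l_xy * m_mumu / pt_mumu


def total_weight(acceptance, eff_reco, eff_trig):
    """w_total = 1 / (A * eps_reco * eps_trig) for one dimuon candidate."""
    product = acceptance * eff_reco * eff_trig
    if product <= 0.0:
        raise ValueError("acceptance and efficiencies must be positive")
    return 1.0 / product
```

For example, a candidate with acceptance 0.5, reconstruction efficiency 0.8 and trigger efficiency 0.5 (illustrative values only) receives a weight of 5.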

A two-dimensional unbinned maximum-likelihood fit to the invariant mass and pseudo-proper time distributions of weighted events is used to determine the yields of the prompt and non-prompt charmonium components as well as the contribution from background. A total of 31 572 events before applying the weights are used in the fit.

The differential cross sections for the production of prompt (p) and non-prompt (np) \(J/\psi \) and \(\psi (2\mathrm {S})\) in \(pp\) collisions were calculated in a previously published study [25] and are defined as:

$$\begin{aligned} \frac{\text {d}^2\sigma ^\text {p(np)}}{\text {d}p_{\text {T}} \text {d}y}\times B(\psi (\textit{n}\mathrm {S})\rightarrow \mu \mu ) = \frac{N_{\psi (\textit{n}\mathrm {S})}^\mathrm {p(np),~corr}}{\Delta p_{\text {T}} \times \Delta y\times \int {\mathcal {L}dt}}, \end{aligned}$$

where \(B(\psi (\textit{n}\mathrm {S})\rightarrow \mu \mu )\) is the branching ratio for charmonium states decaying into two muons [39], \(N_{\psi (\textit{n}\mathrm {S})}^\mathrm {p(np),~corr}\) is the prompt and non-prompt charmonium yield corrected for acceptance and detector effects, and \(\Delta p_{\text {T}} \) and \(\Delta y\) are the widths of the \(p_{\text {T}}\) and y bins. Following the same approach, the per-event yield of charmonium states measured in A+A collisions is calculated as:

$$\begin{aligned} \left. \frac{\text {d}^{2}N^\text {p(np)}}{\text {d}p_{\text {T}} \text {d}y}\right| _\text {cent} \times B(\psi (\textit{n}\mathrm {S})\rightarrow \mu \mu ) = \frac{1}{\Delta p_{\text {T}} \times \Delta y}\times \left. \frac{N_{\psi (\textit{n}\mathrm {S})}^\mathrm {p(np),~corr}}{N_\text {evt}}\right| _\text {cent}, \end{aligned}$$
(1)

where \(N_\text {evt}\) is the number of minimum-bias events and “cent” refers to a specific centrality class.
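As a minimal sketch of Eq. (1), the per-event yield in one bin can be computed as below. The function name and arguments are hypothetical; the normalization by the number of minimum-bias events times the centrality fraction follows the statement made in Sect. 3, and the result still includes the dimuon branching ratio.

```python
def per_event_yield(n_corr, n_evt_total, centrality_fraction, delta_pt, delta_y):
    """Per-event yield times branching ratio in one (pT, y, centrality) bin.

    n_corr              corrected (weighted) prompt or non-prompt yield
    n_evt_total         number of minimum-bias events
    centrality_fraction fraction of those events in the centrality class,
                        e.g. 0.1 for a 10%-wide class
    delta_pt, delta_y   bin widths in pT and rapidity
    """
    n_evt_in_class = n_evt_total * centrality_fraction
    if n_evt_in_class <= 0.0 or delta_pt <= 0.0 or delta_y <= 0.0:
        raise ValueError("event count and bin widths must be positive")
    return n_corr / n_evt_in_class / (delta_pt * delta_y)
```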

4.1 Acceptance and efficiency corrections

The kinematic acceptance \(A(p_{\text {T}},y)\) for a \(\psi (\textit{n}\mathrm {S})\) with transverse momentum \(p_{\text {T}} \) and rapidity y decaying into \(\mu \mu \) was obtained from a MC simulation and is defined as the probability that both muons fall within the fiducial volume \(p_\text {T}(\mu ^{\pm })>4\) GeV and \(|\eta (\mu ^{\pm })|<2.4\). Acceptance generally depends on the \(\psi (\textit{n}\mathrm {S})\) polarization. In this study, we assume that the \(\psi (\textit{n}\mathrm {S})\) are unpolarized following Refs. [40,41,42]. The effects of variations to this assumption have been considered and are discussed in Sect. 5. In order to apply the acceptance weight to each charmonium candidate, a simple linear interpolation is used in the mass range where the \(J/\psi \) and \(\psi (2\mathrm {S})\) overlap due to the detector resolution. The upper mass boundary for the \(J/\psi \) candidates is chosen to be 3.5 GeV and the lower mass boundary for the \(\psi (2\mathrm {S})\) candidates to be 3.2 GeV, resulting in a superposition range of 0.3 GeV. Within the interpolation range of \(m_{\mu \mu }\) = 3.2–3.5 GeV, the following function was applied for the acceptance correction:

$$\begin{aligned} A =A(J/\psi ) \times \frac{3.5 - m_{\mu \mu }}{0.3} + A(\psi (2\mathrm {S})) \times \frac{m_{\mu \mu } - 3.2}{0.3}. \end{aligned}$$
(2)

The difference between the \(J/\psi \) and \(\psi (2\mathrm {S})\) acceptance varies from 5% at low \(p_{\text {T}}\) to 0.05% at high \(p_{\text {T}}\).
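The interpolation of Eq. (2) can be sketched in Python as follows. The function name is hypothetical, and the behaviour outside the 3.2–3.5 GeV window (pure \(J/\psi \) acceptance below it, pure \(\psi (2\mathrm {S})\) acceptance above it) is an assumption inferred from the mass boundaries quoted in the text.

```python
M_LOW = 3.2   # GeV, lower mass boundary for psi(2S) candidates
M_HIGH = 3.5  # GeV, upper mass boundary for J/psi candidates


def interpolated_acceptance(m_mumu, acc_jpsi, acc_psi2s):
    """Acceptance applied to a candidate of invariant mass m_mumu (GeV).

    acc_jpsi and acc_psi2s are the J/psi and psi(2S) acceptances evaluated
    at the candidate's (pT, y).
    """
    if m_mumu <= M_LOW:
        return acc_jpsi
    if m_mumu >= M_HIGH:
        return acc_psi2s
    width = M_HIGH - M_LOW
    # Linear blend of Eq. (2): the two weights sum to one inside the window.
    return (acc_jpsi * (M_HIGH - m_mumu) / width
            + acc_psi2s * (m_mumu - M_LOW) / width)
```

The blend is continuous at both window edges, so no step appears in the weights of candidates near 3.2 GeV or 3.5 GeV.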

Trigger and reconstruction efficiencies were calculated for both data and MC simulation using the tag-and-probe (T&P) method. The method is based on the selection of an almost pure muon sample from \(J/\psi \rightarrow \mu \mu \) events collected with an auxiliary single-muon trigger, requiring one muon of the decay (tag) to be identified as the “tight” muon which triggered the read-out of the event and the second muon (probe) to be reconstructed by a system independent of the one being studied, allowing a measurement of the performance with minimal bias. Once the tag-and-probe sample is defined, the background contamination and the muon efficiency are measured with a simultaneous maximum-likelihood fit of two statistically independent distributions of the invariant mass: events in which the probe is or is not successfully matched to the selected muon [35, 43]. Both efficiencies were evaluated as a function of \(p_{\text {T}}\) and \(\eta \), in narrow bins, using muons from simulated \(J/\psi \rightarrow \mu \mu \) decays in order to build the efficiency map. Muon reconstruction efficiency increases from low to high \(p_{\text {T}}\) and decreases from central to forward rapidities. It varies between 60% and 90%, becoming almost constant for \(p_{\text {T}} >6\) GeV. The dimuon trigger efficiency is studied and factorized in terms of single-muon trigger efficiencies which increase from low to high \(p_{\text {T}}\) and from central to forward rapidities. Dimuon trigger efficiency increases from 50% to 85% between the lowest and highest dimuon \(p_{\text {T}}\).

In order to account for the difference between efficiencies in simulation and experimental data, the data-to-MC ratio, \(\epsilon ^\mathrm {data}_{\mathrm {reco}}/\epsilon ^{\mathrm {MC}}_{\mathrm {reco}}\), was parameterized as a function of \(p_\text {T}\) and centrality and applied as a multiplicative scale factor to the efficiency correction separately for the barrel and endcap regions of the muon spectrometer. This scale factor varies between 1.01 and 1.05. The inverse total weight, \(w^{-1}_{\mathrm {total}}\), after applying the scale factor, is shown in the left panel of Fig. 1, averaged in bins of the dimuon transverse momentum and rapidity. The right panel of Fig. 1 shows the centrality dependence of the muon reconstruction efficiency.

Fig. 1 (Left) Inverse total weight binned in the dimuon transverse momentum and rapidity for integrated centrality as estimated in MC simulation and corrected for differences between efficiencies in MC and experimental data. Decreases in efficiency at very central rapidity correspond to the \(|\eta |<0.1\) region not covered by the muon detectors. The weight is dominated by the acceptance correction. (Right) Muon reconstruction efficiency as a function of the summed transverse energy in the forward calorimeters, \(\sum E_\text {T}^\mathrm {FCal}\)

4.2 Fit model

The corrected prompt and non-prompt \(\psi (\textit{n}\mathrm {S})\) yields are extracted from two-dimensional weighted unbinned maximum-likelihood fits performed on invariant mass and pseudo-proper decay time distributions. A fit is made for each \(p_{\text {T}}\), y, and centrality interval measured in this analysis. The probability distribution function (PDF) for the fit [44] is defined as a normalized sum of seven terms listed in Table 2, where each term is factorized into mass-dependent and decay-time-dependent functions; these functions are described below. The PDF can be written in a compact form as:

$$\begin{aligned} \text {PDF}(m,\tau ) = \sum ^7_{i=1}\kappa _i f_i(m) \cdot h_i(\tau ) \otimes g(\tau ), \end{aligned}$$

where \(\kappa _i\) is the normalization factor of each component, \(f_i(m)\) and \(h_i(\tau )\) are distribution functions for the mass m and the pseudo-proper time \(\tau \) respectively; \(g(\tau )\) is the resolution function described by a sum of two Gaussian distributions; and the “\(\otimes \)” symbol denotes a convolution. The distribution functions \(f_i\) and \(h_i\) are defined by a Crystal Ball (\(C\!B\)) function [45], Gaussian (G), Dirac delta (\(\delta \)) and exponential (E) distributions; individual components are shown in Table 2. The fit is performed using the RooFit framework [46]. In order to stabilize the fit model, and reduce the correlation between parameters, a number of component terms listed in Table 2 share common parameters, are scaled to each other by a multiplicative scaling parameter, or are fixed to the value observed in MC simulation.

Table 2 Probability distribution functions for individual components in the default fit model used to extract the prompt (p) and non-prompt (np) contribution for \(J/\psi \) and \(\psi (2\mathrm {S})\) signal and background (Bkg). Symbols denote functions as follows: “\(C\!B\)” – Crystal Ball, “G” – Gaussian, “E” – exponential, and “\(\delta \)” – Dirac delta function

The signal mass shapes of the \(J/\psi \) and \(\psi (2\mathrm {S})\) are each described by the sum of a \(C\!B\) function, which covers the \(J/\psi \) invariant mass distribution’s low-side tail due to final-state radiation, and a single Gaussian function which share a common peak position treated as a free parameter. The width term in the \(C\!B\) function is equal to the Gaussian standard deviation times a free scaling term that is common to the \(J/\psi \) and \(\psi (2\mathrm {S})\). The \(C\!B\) low-mass tail and height parameters are fixed to the MC value. Variations of these two parameters are considered a part of the fit model’s systematic uncertainties. The mean of the \(\psi (2\mathrm {S})\) mass profile is set to be the mean of the \(J/\psi \) mass profile multiplied by the ratio of their known masses, \(m_{\psi (2\mathrm {S})}/m_{J/\psi } = 1.190\) [39]. The Gaussian width of the \(\psi (2\mathrm {S})\) is also set to be the width of the \(J/\psi \) multiplied by the same factor. Variations of this scaling term are considered a part of the fit model systematic uncertainties. The relative fraction of the \(C\!B\) and Gaussian functions, \(\omega \), is free but common to the \(J/\psi \) and \(\psi (2\mathrm {S})\).
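The parameter sharing described above can be made concrete with a small pure-Python sketch of the signal mass shape. This is not the RooFit model used in the analysis: the function names are hypothetical, the Crystal Ball form is the standard one with a low-side power-law tail, and taking \(\omega \) as the weight of the \(C\!B\) term (rather than of the Gaussian) is an assumption.

```python
import math

MASS_RATIO = 1.190  # m_psi(2S) / m_J/psi, as quoted in the text


def gaussian(x, mean, sigma):
    """Unit-normalized Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def crystal_ball(x, mean, sigma, alpha, n):
    """Unit-normalized Crystal Ball density with a low-side tail.

    Requires alpha > 0 and n > 1. The Gaussian core and the power-law tail
    join continuously at (x - mean) / sigma = -alpha.
    """
    z = (x - mean) / sigma
    core_at_join = math.exp(-0.5 * alpha * alpha)
    a = (n / alpha) ** n * core_at_join
    b = n / alpha - alpha
    tail_integral = n / alpha / (n - 1.0) * core_at_join
    core_integral = math.sqrt(math.pi / 2.0) * (1.0 + math.erf(alpha / math.sqrt(2.0)))
    norm = 1.0 / (sigma * (tail_integral + core_integral))
    if z > -alpha:
        return norm * math.exp(-0.5 * z * z)
    return norm * a * (b - z) ** (-n)


def jpsi_mass_pdf(m, mean, sigma_g, cb_scale, omega, alpha, n):
    """omega * CB + (1 - omega) * G with a common peak position.

    The CB width is the Gaussian width times the shared scale factor.
    """
    return (omega * crystal_ball(m, mean, cb_scale * sigma_g, alpha, n)
            + (1.0 - omega) * gaussian(m, mean, sigma_g))


def psi2s_mass_pdf(m, mean_jpsi, sigma_g_jpsi, cb_scale, omega, alpha, n):
    """psi(2S) shape: J/psi mean and Gaussian width scaled by the mass ratio."""
    return jpsi_mass_pdf(m, MASS_RATIO * mean_jpsi, MASS_RATIO * sigma_g_jpsi,
                         cb_scale, omega, alpha, n)
```

With this construction the \(\psi (2\mathrm {S})\) peak introduces no free shape parameters of its own, which is the stabilizing effect the text describes.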

The non-prompt signal pseudo-proper decay time PDFs are described by a single-sided exponential function (for positive \(\tau \) only) convolved with a lifetime resolution function given by a sum of two Gaussians. This double-Gaussian resolution function has a fixed mean at \(\tau = 0\) and free widths with a fixed relative fraction for the two single Gaussian components. The same resolution function is used to describe the prompt contribution by convolving it with a delta function.
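The convolutions in this paragraph have closed forms, which the following Python sketch evaluates. The names are hypothetical and the parameter values in the usage note are illustrative; the exponential-times-Gaussian expression is the standard exponentially modified Gaussian, evaluated naively, so it can overflow for \(\tau \) far below zero compared with the lifetime.

```python
import math


def gauss(t, sigma):
    """Zero-mean Gaussian density: a delta function convolved with the resolution."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def exp_conv_gauss(t, lifetime, sigma):
    """Single-sided exponential (t > 0) convolved with a zero-mean Gaussian."""
    arg = (sigma / lifetime - t / sigma) / math.sqrt(2.0)
    return (0.5 / lifetime
            * math.exp(0.5 * (sigma / lifetime) ** 2 - t / lifetime)
            * math.erfc(arg))


def prompt_time_pdf(t, sigma1, sigma2, frac1):
    """Prompt component: the double-Gaussian resolution function itself."""
    return frac1 * gauss(t, sigma1) + (1.0 - frac1) * gauss(t, sigma2)


def nonprompt_time_pdf(t, lifetime, sigma1, sigma2, frac1):
    """Non-prompt component: exponential decay smeared by both Gaussians."""
    return (frac1 * exp_conv_gauss(t, lifetime, sigma1)
            + (1.0 - frac1) * exp_conv_gauss(t, lifetime, sigma2))
```

Because convolution is linear, smearing with a sum of two Gaussians equals the weighted sum of the two single-Gaussian results, which is why `nonprompt_time_pdf` simply adds two terms. The smeared distribution keeps the mean of the underlying exponential, since the resolution function is centred at zero.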

The pseudo-proper decay time PDFs describing the background are represented by the sum of one prompt component and two non-prompt components. The prompt background component is described by a delta function convolved with a sum of two Gaussian functions. While one of the non-prompt background contributions is described by a single-sided decay model (for positive \(\tau \) only), the other is described by a double-sided decay model accounting for candidates of mis-reconstructed or non-coherent dimuon pairs resulting from Drell–Yan muons and combinatorial background. The same Gaussian resolution functions are used for the background and the signal. For the background parameterizations in the mass distribution, the three components (prompt, single-sided non-prompt, and double-sided non-prompt) were modelled with exponential functions.

Example fit projections are shown in Fig. 2. The important quantities extracted from the fit are: the number of signal \(J/\psi \), the number of signal \(\psi (2\mathrm {S})\), the non-prompt fraction of the \(J/\psi \) signal, and the non-prompt fraction of the \(\psi (2\mathrm {S})\) signal. From these values and the correlation matrix of the fit, all the measured observables and their uncertainties are extracted.

Fig. 2 Dimuon invariant mass for events with \(2.6< m_{\mu \mu } < 4.2\ \text {GeV}\) (left) and dimuon pseudo-proper lifetime (right). The data, corrected for acceptance times efficiency, are shown for the range \( 9< p_{\text {T}} < 40\ \text {GeV}\), \(|y| < 2.0\), and centrality 20–50% in Pb+Pb collisions. Superimposed on the data are the projections of the fit results

4.3 Observables

The suppression of charmonium states is quantified by the nuclear modification factor, which can be defined for a given centrality class as:

$$\begin{aligned} R_{\mathrm {AA}} = \frac{N_{\mathrm {AA}}}{\langle T_{\mathrm {AA}}\rangle \times \ \sigma _{pp}}, \end{aligned}$$
(3)

where \(N_\mathrm {AA}\) is the per-event yield of charmonium states measured in A+A collisions, \(\langle T_\mathrm {AA}\rangle \) is the mean nuclear thickness function and \(\sigma _{pp}\) is the cross section for the production of the corresponding charmonium states in \(pp \) collisions at the same energy [25].
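A minimal Python sketch of Eq. (3) is given below, mainly to make the unit bookkeeping explicit. The function name is hypothetical, and the choice of units (\(\langle T_\mathrm {AA}\rangle \) in mb\(^{-1}\), cross section in nb) is an assumption; any consistent pair of units works provided the product \(\langle T_\mathrm {AA}\rangle \times \sigma _{pp}\) is dimensionless.

```python
NB_PER_MB = 1.0e6  # 1 mb = 1e6 nb


def nuclear_modification_factor(per_event_yield, t_aa_inv_mb, sigma_pp_nb):
    """R_AA = N_AA / (<T_AA> * sigma_pp) for one kinematic and centrality bin.

    per_event_yield  yield per minimum-bias event in Pb+Pb
    t_aa_inv_mb      mean nuclear thickness function in mb^-1 (assumed unit)
    sigma_pp_nb      pp cross section in the same bin, in nb (assumed unit)
    Both the yield and the cross section must refer to the same bin and
    either both include or both exclude the dimuon branching ratio.
    """
    t_aa_inv_nb = t_aa_inv_mb / NB_PER_MB  # convert mb^-1 to nb^-1
    expected_yield = t_aa_inv_nb * sigma_pp_nb
    if expected_yield <= 0.0:
        raise ValueError("T_AA and sigma_pp must be positive")
    return per_event_yield / expected_yield
```

A value of 1 corresponds to a Pb+Pb yield equal to the \(pp\) cross section scaled by \(\langle T_\mathrm {AA}\rangle \), and values below 1 indicate suppression.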

In order to quantify the production of \(\psi (\mathrm {2\mathrm {S}})\) relative to \(J/\psi \) a ratio of nuclear modification factors, \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\) = \(R_\mathrm {AA}^{\psi (\mathrm {2\mathrm {S}})}/R_\mathrm {AA}^{J/\psi }\), can be used. However, in this analysis the numerator and denominator are not calculated directly from Eq. (3). Instead, it is advantageous to calculate the ratio in the equivalent form:

$$\begin{aligned} \rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi } = ( N_{\psi (2\mathrm {S})} / N_{J/\psi } )_{\text{ Pb }+\text{ Pb }} /( N_{\psi (2\mathrm {S})} / N_{J/\psi } )_{pp}. \end{aligned}$$

This formulation minimizes the systematic uncertainties due to a substantial cancelling-out of the trigger and reconstruction efficiencies for the two quarkonium systems because they are very similar in mass and they are measured in the identical final-state channel.
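The equivalence of the two forms can be checked numerically with the short Python sketch below. All names and input values are hypothetical; the point is only that the number of events, \(\langle T_\mathrm {AA}\rangle \) and the \(pp\) luminosity are common to both states and drop out of the ratio.

```python
def double_ratio(n_psi2s_pbpb, n_jpsi_pbpb, n_psi2s_pp, n_jpsi_pp):
    """(N_psi(2S) / N_J/psi) in Pb+Pb divided by the same ratio in pp."""
    return (n_psi2s_pbpb / n_jpsi_pbpb) / (n_psi2s_pp / n_jpsi_pp)


def ratio_of_r_aa(n_psi2s_pbpb, n_jpsi_pbpb, n_psi2s_pp, n_jpsi_pp,
                  n_evt, t_aa, lumi_pp):
    """R_AA(psi(2S)) / R_AA(J/psi), with each R_AA built as in Eq. (3)."""
    def r_aa(n_pbpb, n_pp):
        # per-event yield over <T_AA> times the pp cross section (N_pp / L)
        return (n_pbpb / n_evt) / (t_aa * (n_pp / lumi_pp))
    return r_aa(n_psi2s_pbpb, n_psi2s_pp) / r_aa(n_jpsi_pbpb, n_jpsi_pp)
```

For any positive choice of `n_evt`, `t_aa` and `lumi_pp`, the two functions return the same value up to floating-point rounding, so the normalization inputs and their uncertainties do not enter the double ratio.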

Also measured is the non-prompt fraction \(f_\mathrm {np}\), which is defined as the ratio of the number of non-prompt charmonia to the number of inclusively produced charmonia,

$$\begin{aligned} f_\mathrm {np}^{\psi (\textit{n}\mathrm {S})} = \frac{N_{\psi (\textit{n}\mathrm {S})}^\mathrm {np,corr}}{N_{\psi (\textit{n}\mathrm {S})}^\mathrm {np,corr} + N_{\psi (\textit{n}\mathrm {S})}^\mathrm {p,corr}}, \end{aligned}$$

where the non-prompt fraction can be determined for the \(J/\psi \) and \(\psi (2\mathrm {S})\) simultaneously. This observable has the advantage that acceptances and efficiencies are similar for the numerator and denominator, and thus systematic uncertainties are reduced in the ratio.

5 Systematic uncertainties

The main sources of systematic uncertainty in this measurement are the assumptions in the fitting procedure, the acceptance and efficiency calculations, and the \(pp\) luminosity and \(\langle T_\mathrm {AA}\rangle \) determination. The acceptance, and hence the corrected yields, depends on the spin-alignment state of the \(\psi (\textit{n}\mathrm {S})\). For prompt production, six alternative scenarios have been considered, corresponding to extreme cases of spin alignment, as explained in Ref. [44]. An envelope to the acceptance has been obtained from the maximum deviations from the assumption of unpolarized production. In the non-prompt case a map weighted to the CDF result [47] for \(B\rightarrow J/\psi \) spin-alignment is used as a variation. Since the polarization of charmonia in \(pp\) collisions was measured to be small [40,41,42], its modification due to the nuclear environment is neglected and the spin-alignment uncertainty is assumed to cancel out in \(R_{\text{ AA }}\) and \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\). Changes in the yields due to bin migration effects are at the per-mil level and thus no correction is needed. Table 3 shows the systematic uncertainties affecting the three measured observables. The total systematic uncertainty is calculated by summing the different contributions in quadrature and is derived separately for \(pp\) and Pb+Pb results. No difference in the uncertainties was observed between prompt and non-prompt production. The yield extraction uncertainties, which are dominated by the uncertainty in the muon reconstruction, increase from central to forward rapidity, and from high to low \(p_{\text {T}}\). The double \(R_\mathrm {AA}\) ratio, \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\), has a substantially larger fit uncertainty than the other observables; this is because the signal-to-background ratio for the \(\psi (2\mathrm {S})\) is much smaller than for the \(J/\psi \). For \(R_\mathrm {AA}\) and \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\) the correlations between the uncertainties in the \(pp\) and Pb+Pb samples are taken into account.
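The quadrature sum used for the total systematic uncertainty can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the listed contributions are mutually uncorrelated, which is the condition under which a quadrature sum is appropriate.

```python
import math


def total_systematic(contributions):
    """Combine independent uncertainty contributions in quadrature.

    contributions: iterable of uncertainties in a common unit, for example
    relative uncertainties in percent.
    """
    return math.sqrt(sum(c * c for c in contributions))
```

For instance, two contributions of 3% and 4% (illustrative values only) combine to a total of 5%.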

Table 3 Systematic uncertainties of the \(J/\psi \) yield, \(R^{J/\psi }_\text {AA}\) and \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\) measured in Pb+Pb collisions. “Uncorr.” refers to point-to-point uncorrelated uncertainties and “Corr.” refers to global uncertainties from various sources

5.1 Proton–proton luminosity and mean nuclear thickness uncertainties

The integrated luminosity determined for the 2015 \(pp\) data was calibrated using data from dedicated beam-separation scans, also known as van der Meer scans. Sources of systematic uncertainty similar to those examined in the 2012 \(pp \) luminosity calibration [48] were studied in order to assess the systematic uncertainties for the 2015 data. The combination of these systematic uncertainties results in an uncertainty in the luminosity during \(pp\) collisions at \(\sqrt{s} = 5.02\ \text {TeV}\) of \(\delta \mathcal {L}/\mathcal {L} = \pm 5.4\%\). The uncertainty in the value of the nuclear overlap function \(\langle T_{\text {AA}} \rangle \) is estimated by varying the Glauber model parameters [38] and is shown in Table 1. This uncertainty is treated as fully correlated across \(p_{\text {T}}\) and y bins for the same centrality and it is reported separately from other uncertainties. For the case of the \(R_{\text{ AA }}\) evaluated as a function of \(N_{\text {part}}\), the \(T_{\text {AA}}\) uncertainty is added in quadrature with other uncertainties.

5.2 Trigger and reconstruction efficiency uncertainty

Several sources of systematic uncertainty were examined to assess the uncertainties of the muon efficiency determination. The statistical uncertainty of the fitted scale factors is propagated as a systematic uncertainty. The signal and background fit models used to extract the data efficiency in the T&P method are changed to assess systematic uncertainties related to the choice of signal and background PDFs. A Chebyshev polynomial is used instead of an exponential function for the background model variation, and a single Gaussian function is used instead of a weighted sum of Gaussian and \(C\!B\) functions for the signal mass resolution model variation.

For the reconstruction efficiency, the difference between the “true” muon efficiency given by the fraction of generator-level muons that are successfully reconstructed and the efficiency determined using the T&P method in MC simulation is also assigned as a correlated systematic uncertainty. The accuracy of dimuon chain factorization was estimated using MC simulation. The difference between the initial number of dimuons in the sample and the number of dimuons after trigger selection and correction was assessed as the systematic uncertainty, having a value of 3%. The centrality-dependent corrections have an uncertainty of \(\mathcal {O}\)(1%). These uncertainties apply to the cross sections but most cancel out in the ratios of \(\psi (2\mathrm {S})\) to \(J/\psi \) yields, leaving a residual difference of less than 1%.

5.3 Fit model uncertainty

The uncertainty associated with the particular choice of PDFs was evaluated by varying the PDF of each component, using ten alternative models. In each variation of the fit model, all measured quantities were recalculated and compared to the nominal fit. The root mean square of all variations was then assigned as the fit model’s systematic uncertainty. The signal mass PDF was varied by replacing the \(C\!B\) plus Gaussian function with a double Gaussian function, and varying parameters of the \(C\!B\) model, which were originally fixed. For the signal decay time PDF, a single exponential function was changed to a sum of two exponential functions. The background mass PDFs were varied by replacing exponential functions with second-order Chebyshev polynomials in order to describe the prompt, non-prompt and double-sided background terms. Finally, the decay time resolution was varied by using a single Gaussian function in place of the double Gaussian function.
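A minimal sketch of the root-mean-square prescription is given below. It assumes that the quantity entering the root mean square is the deviation of each alternative fit from the nominal fit, which is one reading of the text; the function name and the example yields are hypothetical.

```python
import math

def fit_model_systematic(nominal, variations):
    """Root mean square of the deviations of the alternative fits from the nominal fit."""
    if not variations:
        raise ValueError("at least one variation is required")
    squared = [(value - nominal) ** 2 for value in variations]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical yields from a nominal fit and four alternative fit models.
nominal_yield = 1000.0
varied_yields = [1010.0, 990.0, 1020.0, 980.0]
fit_syst = fit_model_systematic(nominal_yield, varied_yields)
print(f"fit-model systematic uncertainty: {fit_syst:.1f}")
```

The same function can be applied to any measured quantity that is recalculated for each variation, such as the non-prompt fraction.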

The stability of the nominal fitting procedure is quantified by comparing the yield of a randomly weighted MC simulation sample of prompt and non-prompt \(J/\psi \) with the fit output of the same sample. The comparison shows a 1% difference in the yield extractions and non-prompt fraction. This is assigned as an additional systematic uncertainty in the yields and non-prompt fraction value, which, however, cancels out in the \(\psi (2\mathrm {S})\) to \(J/\psi \) ratio. An extra systematic uncertainty is added to the \(\psi (2\mathrm {S})\) to \(J/\psi \) ratio to account for a \(2\%\) bias introduced by the acceptance interpolation (see Eq. (2)). This value comes from comparing the fit results from a sample that is corrected with a standalone acceptance and another that uses the interpolation. The difference between the two samples was found to be significant only when the signal-to-background ratio was small, which is typical for the \(\psi (2\mathrm {S})\).

6 Results

6.1 Prompt and non-prompt \(J/\psi \) per-event yields for Pb+Pb collisions

The per-event yields are defined as the number of \(J/\psi \) produced per bin of \(p_{\text {T}}\), y and centrality intervals normalized by the width of the \(p_{\text {T}}\) and y bin and the number of events, \(N_\text {evt}\), measured in minimum-bias data for each centrality class, as defined in Eq. (1). The resulting per-event yields and non-prompt fraction for \(J/\psi \) production are shown in Figs. 3 and 4 respectively, as a function of transverse momentum, for three centrality slices and rapidity range \(|y|<2\). The vertical error bars in the \(J/\psi \) per-event yields shown in Fig. 3 are the combined systematic and statistical uncertainties. The non-prompt fraction appears to be essentially centrality-independent and to have a slightly different slope from that found in pp collisions [25].
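The normalization described above can be illustrated with a short sketch. It follows the wording of the text (corrected yield divided by the bin widths and by the number of minimum-bias events) rather than reproducing Eq. (1) itself; the function name and all input numbers are hypothetical.

```python
def per_event_yield(n_corrected, n_events, pt_low, pt_high, y_low, y_high):
    """Corrected yield per event, per unit transverse momentum and per unit rapidity."""
    pt_width = pt_high - pt_low
    y_width = y_high - y_low
    if n_events <= 0 or pt_width <= 0 or y_width <= 0:
        raise ValueError("event count and bin widths must be positive")
    return n_corrected / (n_events * pt_width * y_width)

# Hypothetical inputs: 2000 corrected J/psi candidates in 9 < pT < 11 GeV
# and |y| < 2, from 1e6 minimum-bias events in one centrality class.
example_yield = per_event_yield(2000.0, 1.0e6, 9.0, 11.0, -2.0, 2.0)
print(f"per-event yield: {example_yield:.2e} per GeV per unit rapidity")
```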

Fig. 3

Pb+Pb per-event yields of prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right) as a function of \(p_{\text {T}}\) for three different centrality slices in the rapidity range \(|y| < 2\). The centroids of the \(p_{\text {T}}\) bins are the mean value of the transverse momentum distributions of dimuons in the \(J/\psi \) mass region, corrected for acceptance \(\times \) efficiency. The vertical error bars are the combined systematic and statistical uncertainties, where the dominant source is the systematic uncertainty with the exception of the last bin. Overlaid is a band representing the variation of the result in various spin-alignment scenarios

Fig. 4

(Left) Non-prompt fraction of \(J/\psi \) production in 5.02 TeV Pb+Pb collision data as a function of \(p_{\text {T}}\) for three different centrality slices in the rapidity range \(|y| < 2\). (Right) Comparison with the ATLAS 5.02 TeV \(pp\) collision data [25]. The vertical error bars are the combined systematic and statistical uncertainties, dominated by the statistical uncertainty

6.2 Nuclear modification factor, \(R^{J/\psi }_\mathrm {AA}\)

The influence of the hot dense medium on the production of the \(J/\psi \) mesons is quantified by the nuclear modification factor, given in Eq. (3), which compares production of charmonium states in Pb+Pb collisions to the same process in \(pp\) collisions, taking geometric factors into account. The results of the measurement of this observable are presented as a function of transverse momentum in Figs. 5 and 6, rapidity in Fig. 7, and centrality in Fig. 8; the last is presented as a function of the mean number of participants. The error box on the right-hand side of the plots located at the \(R_{\mathrm {AA}}\) value of 1 indicates the correlated systematic uncertainties of the measurement, while the error boxes associated with data-points represent the uncorrelated systematic uncertainties, and the error bars indicate the statistical uncertainties. The results exhibit agreement with previous measurements performed by CMS at \(\sqrt{s_\text {NN}}= 2.76\) and 5.02 TeV in a similar kinematic region [11, 12], as can be seen in Figs. 5, 7 and 8, where the CMS results are plotted together with total uncertainties, which are dominated by systematic uncertainties.
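As an illustration of how such a ratio is formed, the sketch below assumes the commonly used form of the nuclear modification factor, in which the Pb+Pb per-event yield is divided by the product of the mean nuclear overlap function \(\langle T_{\text {AA}} \rangle \) and the \(pp\) cross section in the same kinematic bin. Eq. (3) is not reproduced here, so this form, the function name and the input values are assumptions made for the example.

```python
def nuclear_modification_factor(per_event_yield_pbpb, mean_taa_inv_mb, sigma_pp_mb):
    """R_AA = per-event yield / (<T_AA> * sigma_pp), with consistent units.

    mean_taa_inv_mb : mean nuclear overlap function in mb^-1
    sigma_pp_mb     : pp cross section in the same kinematic bin, in mb
    """
    if mean_taa_inv_mb <= 0 or sigma_pp_mb <= 0:
        raise ValueError("<T_AA> and the pp cross section must be positive")
    return per_event_yield_pbpb / (mean_taa_inv_mb * sigma_pp_mb)

# Hypothetical inputs chosen only to make the arithmetic easy to follow.
example_raa = nuclear_modification_factor(5.0e-6, 20.0, 1.0e-6)
print(f"R_AA = {example_raa:.2f}")
```

A value of 1 would indicate that Pb+Pb production scales with the geometric expectation from \(pp\) collisions, while values below 1 indicate suppression.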

Figure 5 shows the nuclear modification factor as a function of \(p_{\text {T}}\) for production of prompt and non-prompt \(J/\psi \), for \(|y| < 2\), and for four selections of centrality. In this figure, it can be seen that the production of \(J/\psi \) is strongly suppressed in central Pb+Pb collisions. In the kinematic range plotted, as a function of \(p_{\text {T}}\), the nuclear modification factor for both prompt and non-prompt \(J/\psi \) production is seen to be in the range \(0.2< R_{\mathrm {AA}} < 1\), depending on the centrality slice, with a minimum value in the 0–10% centrality range of 0.229 ± 0.017(stat) ± 0.016(syst) for prompt \(J/\psi \) and 0.290 ± 0.034(stat) ± 0.021(syst) for non-prompt \(J/\psi \). For \(p_{\text {T}} > 12\) GeV, a small increase in \(R_\mathrm {AA}\) with increasing \(p_{\text {T}}\) is observed in the prompt \(J/\psi \) production, as shown in Fig. 6 (left), similar in shape and size to that observed for charged particles and D-mesons [49,50,51], typically attributed to parton energy-loss processes and, for the case of charmonia, also to coherent radiation from the pre-resonant \(q\bar{q}\) pair [20, 21]. In Fig. 6 (right), one can see the prompt \(J/\psi \) \(R_{\text{ AA }}\) evaluated for the 0–20% centrality bin compared with several models, showing that the data are consistent with the colour screening and colour transparency picture [52,53,54], as well as parton energy-loss [20, 21]. The \(R_{\text{ AA }}\) value for non-prompt \(J/\psi \) is seen to be approximately constant as a function of \(p_{\text {T}}\) within the uncertainties, also consistent with a parton energy-loss mechanism [55, 56].

Fig. 5

The nuclear modification factor as a function of \(p_{\text {T}}\) for the prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right) for \(|y|<2\), in the 0–80% centrality bin (top) and in the 0–10%, 20–40%, and 40–80% centrality bins (bottom). The statistical uncertainty of each point is indicated by a narrow error bar. The error box plotted with each point represents the uncorrelated systematic uncertainty, while the shaded error box at \(R_{\mathrm {AA}}\)=1 represents correlated scale uncertainties

Fig. 6

(Left) Comparison of prompt and non-prompt \(J/\psi \) \(R_{\text{ AA }}\) with the \(R_{\text{ AA }}\) of charged particles [49] and D-mesons [51]. (Right) Comparison of the \(R_{\text{ AA }}\) for prompt \(J/\psi \) production with different theoretical models. The statistical uncertainty of each point is indicated by a narrow error bar. The error box plotted with each point represents the uncorrelated systematic uncertainty, while the shaded error box at \(R_{\mathrm {AA}}\)=1 represents correlated scale uncertainties

In Fig. 7, the nuclear modification factor is presented as a function of rapidity for production of prompt and non-prompt \(J/\psi \) for transverse momenta \(9< p_\mathrm {T} < 40\) GeV and for four selections of centrality. It can be seen from the figure that the \(R_\mathrm {AA}\) exhibits a modest dependence on rapidity, as expected from Ref. [57] and explained by the boost invariance of the medium in the central rapidity region. These patterns are seen to be similar for both prompt and non-prompt \(J/\psi \) production. Figure 8 presents the nuclear modification factor as a function of centrality, expressed as the number of participants, \(N_\mathrm {part}\), for production of prompt and non-prompt \(J/\psi \) for \(|y| < 2\), and for \(9< p_\mathrm {T} < 40\) GeV. In the kinematic range plotted, as a function of centrality, the nuclear modification factor for both prompt and non-prompt \(J/\psi \) decreases from the most peripheral bin, 60–80%, to the most central bin, 0–5%, with a minimum value of 0.217 ± 0.010(stat) ± 0.020(syst) for prompt and 0.264 ± 0.017(stat) ± 0.023(syst) for non-prompt. Suppression by a factor of about 4 or 5 for both the prompt and non-prompt \(J/\psi \) mesons in central collisions, together with \(\mathrm {R}_\mathrm {pPb}\) of charmonia being consistent with unity [25], is a very striking sign that the hot dense medium has a strong influence on the particle production processes. The two classes of meson production have essentially the same pattern, which is unexpected because the two cases are believed to have quite different physical origins: the non-prompt production should be dominated by b-quark processes that extend far outside the deconfined medium, whereas the prompt production happens predominantly within the medium.
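The quoted suppression factor of about 4 or 5 is simply the inverse of the minimum \(R_{\mathrm {AA}}\) values given above. The sketch below makes this arithmetic explicit. Combining the statistical and systematic uncertainties in quadrature and propagating them to first order are assumptions made for the illustration, and the function name is hypothetical.

```python
import math

def suppression_factor(r_aa, stat, syst):
    """Return 1/R_AA and its uncertainty from first-order error propagation."""
    if r_aa <= 0:
        raise ValueError("R_AA must be positive")
    total = math.hypot(stat, syst)        # assumption: independent uncertainties
    return 1.0 / r_aa, total / r_aa ** 2  # |d(1/R)/dR| = 1/R^2

# Minimum R_AA values quoted in the text for the most central bin.
prompt_factor, prompt_err = suppression_factor(0.217, 0.010, 0.020)
nonprompt_factor, nonprompt_err = suppression_factor(0.264, 0.017, 0.023)
print(f"prompt: {prompt_factor:.1f} +/- {prompt_err:.1f}")
print(f"non-prompt: {nonprompt_factor:.1f} +/- {nonprompt_err:.1f}")
```

The correlated scale uncertainties shown as a shaded box in the figures are not included in this estimate.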

Fig. 7

The nuclear modification factor as a function of rapidity for the prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right) for \(9< p_{\text {T}} < 40\ \text {GeV}\), in the 0–80% centrality bin (top) and in the 0–10%, 20–40%, and 40–80% centrality bins (bottom). The statistical uncertainty of each point is indicated by a narrow error bar. The error box plotted with each point represents the uncorrelated systematic uncertainty, while the shaded error box at \(R_{\mathrm {AA}}\)=1 represents correlated scale uncertainties

Fig. 8

The nuclear modification factor as a function of the number of participants, \(N_\mathrm {part}\), for the prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right) for \(9< p_{\text {T}} < 40\ \text {GeV}\) and for rapidity \(|y|<2\). The statistical uncertainty of each point is indicated by a narrow error bar. The error box plotted with each point represents the uncorrelated systematic uncertainty, while the shaded error box at \(R_{\mathrm {AA}}\)=1 represents correlated scale uncertainties

Fig. 9

\(\psi (2\mathrm {S})\) to \(J/\psi \) double ratio, as a function of the number of participants, \(N_\mathrm {part}\), for prompt meson production compared with different theoretical models (left) and non-prompt meson production (right). The narrow error bar represents the statistical uncertainties while the error box represents the total systematic uncertainty

6.3 \(\psi (2\mathrm {S})\) to \(J/\psi \) yield double ratio

The double ratio of \(\psi (2\mathrm {S})\) production to \(J/\psi \) meson production, \(\rho _\text {PbPb}^{\psi (2\mathrm {S})/J/\psi }\), is shown in Fig. 9 for the centrality bins of 0–10%, 10–20%, 20–50%, 50–60% and 60–80%. These results represent a measurement complementary to an earlier measurement of \(\psi (2\mathrm {S})\) to \(J/\psi \) yield ratios at the same centre-of-mass energy made by the CMS Collaboration [58]. This ratio, which compares the suppression of the two mesons, can be interpreted in models in which the binding energy of the two mesons is estimated [59], leading to different survival probabilities in the thermal medium, or in which the formation mechanisms differ, such as different susceptibility of the two mesons to recombination processes [60, 61]. If the non-prompt \(J/\psi \) and \(\psi (2\mathrm {S})\) originate from b-quarks losing energy in the medium and hadronizing outside of the medium, then the ratio of their yields should be unity. This statement should be true for the ratio expressed as a function of any kinematic variable. By contrast, prompt \(J/\psi \) and \(\psi (2\mathrm {S})\), or their pre-resonant states, should traverse the hot and dense medium. Considering both mesons as composite systems, with potentially different formation mechanisms and different binding energies, they may respond differently to the hot dense medium. This interpretation is supported by the results of Fig. 9, which shows the ratio of \(\psi (2\mathrm {S})\) to \(J/\psi \) production as a function of the number of collision participants, \(N_\mathrm {part}\). The ratio is consistent with unity within the experimental uncertainties for non-prompt mesons, while for prompt mesons the ratio is different from unity. These data support the enhanced suppression of prompt \(\psi (2\mathrm {S})\) relative to \(J/\psi \). This observation is consistent with the interpretation that the most tightly bound quarkonium system, the \(J/\psi \), survives the temperature of the hot and dense medium with a higher probability than the more loosely bound state, the \(\psi (2\mathrm {S})\). It is, however, also consistent with the radiative energy-loss scenario as shown in Ref. [20]. Irrespective of the underlying mechanism for the charmonium suppression, one may expect less ambiguity in the interpretation of this result since quark recombination processes, in which \(J/\psi \) mesons are formed from uncorrelated \(c\bar{c}\) pairs in the plasma and which are important at small \(p_\mathrm {T}^{\psi (\textit{n}\mathrm {S})}\), should not play a significant role here [17, 18, 62].
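For orientation, a double ratio of this kind can be sketched as the \(\psi (2\mathrm {S})\) to \(J/\psi \) yield ratio in Pb+Pb divided by the same ratio in \(pp\). The sketch assumes this standard form; the function name and the yields are hypothetical and chosen only to illustrate the arithmetic.

```python
def double_ratio(n_psi2s_pbpb, n_jpsi_pbpb, n_psi2s_pp, n_jpsi_pp):
    """(psi(2S)/J/psi yield ratio in Pb+Pb) divided by (the same ratio in pp)."""
    if min(n_jpsi_pbpb, n_psi2s_pp, n_jpsi_pp) <= 0:
        raise ValueError("denominator yields must be positive")
    ratio_pbpb = n_psi2s_pbpb / n_jpsi_pbpb
    ratio_pp = n_psi2s_pp / n_jpsi_pp
    return ratio_pbpb / ratio_pp

# Hypothetical corrected yields (not values from the measurement).
example_rho = double_ratio(20.0, 1000.0, 400.0, 10000.0)
print(f"double ratio: {example_rho:.2f}")
```

A value of 1 means that the two states are suppressed equally, and a value below 1 means that the \(\psi (2\mathrm {S})\) is suppressed more strongly than the \(J/\psi \). Forming the ratio within each collision system first is what allows the efficiency-related uncertainties common to both states to cancel to a large extent.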

7 Summary

Measurements of \(J/\psi \) and \(\psi (2\mathrm {S})\) production are performed in the dimuon decay channel in Pb+Pb collisions at \(\sqrt{s_\mathrm {NN}}\) = 5.02 TeV with an integrated luminosity of \(0.42~\mathrm {nb}^{-1}\), and in pp collisions at \(\sqrt{s}\) = 5.02 TeV, with an integrated luminosity of \(25~\text{ pb }^{-1}\) collected with the ATLAS experiment at the LHC. Results are presented for prompt and non-prompt nuclear modification factors of the \(J/\psi \) mesons, as well as the yields and non-prompt fraction in the region with transverse momentum \(9< p_{\text {T}} < 40\ \text {GeV}\) and rapidity \(|y| < 2\).

Strong suppression of prompt and non-prompt \(J/\psi \) and \(\psi (2\mathrm {S})\) mesons is observed in Pb+Pb data. The maximum suppression of prompt and non-prompt \(J/\psi \) is observed for the most central collisions. The dependence of the nuclear modification factor \(R_{\mathrm {AA}}\) on centrality is approximately the same for prompt and non-prompt \(J/\psi \). The prompt \(J/\psi \) \(R_\mathrm {AA}\), as a function of \(p_{\text {T}}\), shows an increasing trend while the non-prompt \(J/\psi \) \(R_\mathrm {AA}\) is consistent with being constant as a function of \(p_{\text {T}}\) within the uncertainties.

The ratio of \(\psi (2\mathrm {S})\) to \(J/\psi \) meson production is measured for both the prompt and non-prompt mesons, and is shown as a function of centrality. Values consistent with unity are measured for the non-prompt mesons, while the values observed for the prompt mesons are below unity.