Measurement of the differential cross-sections of prompt and non-prompt production of J/ψ and ψ(2S) in pp collisions at $\sqrt{s} = 7$ and 8 TeV with the ATLAS detector

The production rates of prompt and non-prompt J/ψ and ψ(2S) mesons in their dimuon decay modes are measured using 2.1 fb$^{-1}$ and 11.4 fb$^{-1}$ of data collected with the ATLAS experiment at the Large Hadron Collider, in proton-proton collisions at $\sqrt{s} = 7$ and 8 TeV. Production cross-sections for both prompt and non-prompt sources, ratios of ψ(2S) to J/ψ production, and fractions of non-prompt to inclusive production for J/ψ and ψ(2S) are measured as a function of meson transverse momentum and rapidity. The measurements are compared to theoretical predictions. © 2015 CERN for the benefit of the ATLAS Collaboration. Reproduction of this article or parts of it is allowed as specified in the CC-BY-3.0 license.


Introduction
Measurements of the production of heavy quark-antiquark bound states (quarkonia) provide insight into the nature of quantum chromodynamics (QCD) close to the boundary between the perturbative and non-perturbative regimes. More than forty years after the discovery of the J/ψ, the investigation of hidden heavy-flavour production in hadronic collisions still presents significant challenges to both theory and experiment.
In high-energy hadronic collisions, charmonium states can be produced either directly by short-lived QCD sources ("prompt" production), or by long-lived sources such as decays of beauty hadrons ("non-prompt" production). These can be separated experimentally by measuring the distance between the proton-proton primary interaction and the decay vertex of the quarkonium state. While Fixed-Order with Next-to-Leading-Log (FONLL) calculations [1,2], within the framework of perturbative QCD, have been quite successful in describing the non-prompt contributions, a satisfactory understanding of the prompt production mechanisms is still to be achieved.
Early attempts to describe charmonium formation [25][26][27][28][29][30][31][32] using leading-order perturbative QCD gave rise to a variety of models, none of which could explain the large production cross-sections measured at the Tevatron [3,13,21,22,23]. Within the colour-singlet model (CSM) [33], next-to-next-to-leading-order (NNLO) contributions to the hadronic production of S-wave quarkonia were calculated without introducing any new phenomenological parameters. However, technical difficulties have so far made it impossible to perform the full NNLO calculation, or to extend those calculations to the P-wave states, so it is not entirely surprising that the predictions of the model fall significantly below the experimental data for inclusive production of J/ψ and Υ states [18,34].
Non-relativistic QCD (NRQCD) calculations that include colour-octet (CO) contributions [35] introduce a number of phenomenological parameters, the long-distance matrix elements (LDMEs), which are determined from fits to the experimental data and can hence describe the cross-sections and differential spectra satisfactorily [36]. However, attempts to describe the polarisation of S-wave quarkonium states using this approach have been less successful [37], prompting a suggestion [38] that a more coherent approach is needed to the treatment of polarisation within the QCD-motivated models of quarkonium production.
Neither the CSM nor the NRQCD model gives a satisfactory explanation for the measurement of prompt J/ψ production in association with the W [39] and Z [40] bosons: in both cases, the measured differential cross-section is larger than theoretical expectations [41][42][43][44].
It is also important to broaden the scope of comparisons between theory and experiment by providing a variety of experimental information on quarkonium production across a wider kinematic range. In this context, ATLAS has measured the inclusive differential cross-section of J/ψ production at $\sqrt{s} = 7$ TeV, using the data collected in 2010 with 2.3 pb$^{-1}$ of integrated luminosity [18], as well as the differential cross-sections of the production of χ$_c$ states (4.5 fb$^{-1}$) [14], and of the ψ(2S) in its J/ψππ decay mode (2.1 fb$^{-1}$) [9], at $\sqrt{s} = 7$ TeV with data collected in 2011. The cross-section and polarisation measurements from CDF [4], CMS [4,6,45], LHCb [8,10,12,[46][47][48]] and ALICE [5] cover a considerable variety of charmonium production characteristics in a wide kinematic range (transverse momentum $p_T \le 100$ GeV and rapidities $|y| < 5$), thus providing a wealth of information for a new generation of theoretical models.
This paper presents a high-statistics measurement of J/ψ and ψ(2S) production in the dimuon decay mode, both at $\sqrt{s} = 7$ TeV and at $\sqrt{s} = 8$ TeV. It is presented as a double-differential measurement in transverse momentum and rapidity of the quarkonium state, separated into prompt and non-prompt contributions, covering a range of transverse momenta $8 < p_T \le 110$ GeV and rapidities $|y| < 2.0$. Also reported are the prompt and non-prompt ratios of ψ(2S) to J/ψ production, and the non-prompt production fractions relative to inclusive production for both J/ψ and ψ(2S) mesons.

The ATLAS Detector
The ATLAS experiment [49] is a general-purpose detector consisting of an inner tracker, a calorimeter and a muon spectrometer. The inner detector (ID) directly surrounds the interaction point; it consists of a silicon pixel detector, a semiconductor tracker and a transition radiation tracker, and is embedded in an axial 2 T magnetic field. The ID coverage extends up to pseudorapidity¹ $|\eta| = 2.5$, and the ID is enclosed by a calorimeter system containing electromagnetic and hadronic sections. The calorimeter is surrounded by a large muon spectrometer (MS) in a toroidal magnet system. The MS consists of monitored drift tubes and cathode strip chambers, designed to provide precise position measurements in the bending plane in the range $|\eta| < 2.7$. In addition, resistive plate chambers (RPCs) and thin gap chambers (TGCs) with a coarse position resolution but a fast response time are used to trigger on muons in the ranges $|\eta| < 1.05$ and $1.05 < |\eta| < 2.4$, respectively.
The ATLAS trigger system [50] has three levels: the hardware-based Level-1 trigger and the two-stage High Level Trigger (HLT), comprising the Level-2 trigger and Event Filter (EF), which reduce the 20 MHz proton-proton collision rate to a rate of several hundred Hz for transfer to mass storage. At Level-1, the muon trigger searches for patterns of hits satisfying different transverse momentum thresholds using the RPCs and TGCs. Around these Level-1 hit patterns, "Regions-of-Interest" (RoI) are defined, which serve as seeds for the HLT muon reconstruction.

Candidate selection
The analysis is based on data recorded at the LHC during proton-proton collisions at centre-of-mass energies of 7 TeV and 8 TeV, collected in 2011 and 2012, respectively. These data samples correspond to total integrated luminosities of 2.1 fb$^{-1}$ and 11.4 fb$^{-1}$ for the 7 TeV and 8 TeV data, respectively.
Events were selected online using a trigger requiring two oppositely charged muon candidates, each passing the requirement $p_T > 4$ GeV. The muons are constrained to originate from a common vertex, which is fitted taking into account the track parameter uncertainties, and a requirement of $\chi^2/n_{\mathrm{dof}} < 20$ is applied to the vertex fit, which has one degree of freedom.
For the trigger used to collect data at 8 TeV, a 4 GeV muon $p_T$ threshold was also applied at Level-1. This threshold was not required for the 7 TeV data, and it lowers the trigger efficiency at low muon transverse momenta.
Events are selected by the offline analysis requiring at least two muons, identified by the muon spectrometer with tracks reconstructed in the ID. Due to the ID acceptance, muon reconstruction is possible only for $|\eta| < 2.5$. The selected muons are further restricted to $|\eta| < 2.3$, ensuring high-quality tracking and triggering, and reducing the contribution from fake muons. For the momenta of interest in this analysis, measurements of the muons are degraded by multiple scattering within the MS and so only the inner detector tracking information is considered. To ensure accurate inner detector measurements, each muon track must fulfil muon reconstruction performance selection requirements [51].
¹ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upward. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the beam pipe. The pseudorapidity $\eta$ is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$ and the transverse momentum $p_T$ is defined as $p_T = p\sin\theta$. The rapidity is defined as $y = 0.5\ln[(E + p_z)/(E - p_z)]$, where $E$ and $p_z$ refer to energy and longitudinal momentum, respectively. The η-φ distance between two particles is defined as $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
Muon candidates passing the quality criteria are required to have opposite charges. To correct accurately for trigger inefficiencies, each reconstructed muon candidate is required to match a trigger-identified candidate within a cone of $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} < 0.01$. All dimuon candidates with an invariant mass, as determined from a fit to a common vertex, within $2.6 < m(\mu\mu) < 4.0$ GeV and within the kinematic range $p_T(\mu\mu) > 8$ GeV, $|y(\mu\mu)| < 2.0$ are retained for the analysis. If multiple candidates are found in an event (occurring in approximately $10^{-6}$ of selected events), all candidates are retained. The properties of the dimuon system, such as invariant mass $m(\mu\mu)$, transverse momentum $p_T(\mu\mu)$, and rapidity $|y(\mu\mu)|$, are determined from the result of the fit.
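The kinematic definitions and the trigger-matching requirement can be illustrated with a minimal Python sketch. The function names and the representation of a muon as an (η, φ) pair are assumptions made for illustration only and are not taken from the analysis software; the wrap-around of the azimuthal difference is a standard convention that the text does not spell out.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))

def rapidity(energy, pz):
    """y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi] (assumed convention)."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def matches_trigger(offline, trigger, max_dr=0.01):
    """offline and trigger are hypothetical (eta, phi) pairs;
    0.01 is the cone size quoted in the text."""
    return delta_r(offline[0], offline[1], trigger[0], trigger[1]) < max_dr
```

Wrapping the azimuthal difference matters for muons close to φ = ±π, where a naive subtraction would give ΔR ≈ 2π for two nearly collinear directions.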

Methodology
The measurements are performed in intervals of dimuon $p_T$ and absolute value of the rapidity ($|y|$). The term prompt refers to the J/ψ or ψ(2S) states (hereafter called ψ to refer to either) produced from short-lived QCD decays, including feed-down from other charmonium states as long as they are also produced from short-lived sources; if the decay chain producing a ψ state includes long-lived particles such as b-hadrons, then such ψ mesons are labelled as non-prompt. Using a simultaneous fit to the invariant mass of the dimuon and its decay time (described below), prompt and non-prompt signal and background contributions can be extracted from the data.
The probability for the decay of a particle as a function of proper decay time $t$ follows an exponential distribution, $p(t) = (1/\tau_B)\, e^{-t/\tau_B}$, where $\tau_B$ is the lifetime of the particle. For each decay, the proper decay time can be calculated as $t = L/(\beta\gamma)$, where $L$ is the distance between the particle production and decay points, $\beta = v/c$ and $\gamma$ is the Lorentz factor. As the reconstruction of non-prompt ψ mesons does not fully reconstruct the b-hadron decay, the transverse momentum of the dimuon system and the reconstructed dimuon invariant mass are used to construct the "pseudo-proper decay time", $\tau = L_{xy}\, m(\mu\mu)/p_T(\mu\mu)$, where $L_{xy} \equiv \vec{L} \cdot \vec{p}_T(\mu\mu)/p_T(\mu\mu)$ is the signed projection of the distance of the dimuon decay vertex from the primary vertex, $\vec{L}$, onto its transverse momentum, $\vec{p}_T(\mu\mu)$. This is a good approximation to using the parent b-hadron information when the ψ and parent momenta are closely aligned, which is the case for the values of ψ transverse momenta considered here, and $\tau$ can therefore be used to statistically distinguish between the prompt and non-prompt processes.
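As an illustration of the pseudo-proper decay time definition, the following Python sketch computes $L_{xy}$ and τ for a single candidate. The argument layout, the choice of millimetres and GeV as input units, and the conversion to picoseconds through the speed of light are assumptions of this sketch rather than details given in the text.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in mm/ps

def pseudo_proper_decay_time(pv, dv, px, py, mass):
    """Return tau in ps (assumed output unit).

    pv, dv : (x, y) positions of the primary and dimuon decay vertices in mm
    px, py : transverse momentum components of the dimuon system in GeV
    mass   : reconstructed dimuon invariant mass in GeV
    """
    pt = math.hypot(px, py)
    lx, ly = dv[0] - pv[0], dv[1] - pv[1]
    # Signed projection of the transverse flight vector onto the pT direction
    l_xy = (lx * px + ly * py) / pt
    return l_xy * mass / pt / C_MM_PER_PS
```

A decay vertex displaced against the dimuon momentum gives a negative τ, which is why the resolution function and the double-sided background term are needed in the fit.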
If the event contains multiple primary vertices, the dimuon system is assigned to the primary vertex closest in z to the decay vertex. The effect of selecting an incorrect vertex has been shown [52] to have a negligible impact on the extraction of prompt and non-prompt contributions. If any of the muons from the dimuon candidate contributes to the construction of the primary vertex, the corresponding tracks are removed and the vertex is refitted.

Differential cross-section determination
The differential dimuon prompt and non-prompt production cross-sections times branching ratio are measured separately for J/ψ and ψ(2S) mesons according to the equations:

$$\frac{d^2\sigma(pp \to \psi)}{dp_T\,dy} \times \mathcal{B}(\psi \to \mu^+\mu^-) = \frac{N^{p}_{\psi}}{\int \mathcal{L}\,dt \times \Delta p_T \times \Delta y}, \qquad (1)$$

$$\frac{d^2\sigma(pp \to b\bar{b} \to \psi)}{dp_T\,dy} \times \mathcal{B}(\psi \to \mu^+\mu^-) = \frac{N^{np}_{\psi}}{\int \mathcal{L}\,dt \times \Delta p_T \times \Delta y}, \qquad (2)$$

where $\int \mathcal{L}\,dt$ is the integrated luminosity, $\Delta p_T$ and $\Delta y$ are the interval sizes in terms of dimuon transverse momentum and rapidity, respectively, and $N^{p(np)}_{\psi}$ is the number of observed prompt (non-prompt) J/ψ or ψ(2S) mesons in the slice under study, corrected for acceptance, trigger and reconstruction efficiencies. The intervals in $\Delta y$ combine the data from both negative and positive rapidities.
The determination of the cross-sections proceeds in several steps. First, a weight is determined for each selected dimuon candidate, equal to the inverse of the total efficiency for that candidate. Second, an unbinned maximum likelihood fit is performed to the distribution of weighted events in the dimuon invariant mass, $m(\mu\mu)$, and pseudo-proper decay time, $\tau$, to determine the yields of $J/\psi \to \mu^+\mu^-$ and $\psi(2S) \to \mu^+\mu^-$ produced in each $(p_T(\mu\mu), |y(\mu\mu)|)$ interval. These yields are determined separately for prompt and non-prompt processes. Finally, the differential cross-section times the $\psi \to \mu^+\mu^-$ branching fraction is calculated for each state by including the integrated luminosity and the $p_T$ and rapidity interval widths, as shown in Eqs. (1) and (2).
The total weight, $w_{\mathrm{tot}}$, for each dimuon candidate includes three factors: the fraction of produced $\psi \to \mu^+\mu^-$ decays with both muons in the kinematic region $p_T(\mu) > 4$ GeV and $|\eta(\mu)| < 2.3$ (defined as acceptance, $A$), the probability that a candidate within the acceptance satisfies the offline reconstruction selection ($\epsilon_{\mathrm{reco}}$), and the probability that a reconstructed event satisfies the trigger selection ($\epsilon_{\mathrm{trig}}$). The weights assigned to a given candidate when calculating the cross-sections are then given by:

$$w_{\mathrm{tot}}^{-1} = A \cdot \epsilon_{\mathrm{reco}} \cdot \epsilon_{\mathrm{trig}}.$$
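The weighting and normalisation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the numbers in the usage note are invented, and the corrected yield is taken as an input, whereas in the analysis it comes from the fit to the weighted events rather than from a plain sum of weights.

```python
def total_weight(acceptance, eff_reco, eff_trig):
    """Per-candidate weight: inverse of acceptance times the two efficiencies."""
    return 1.0 / (acceptance * eff_reco * eff_trig)

def differential_cross_section(corrected_yield, lumi, d_pt, d_y):
    """Cross-section times branching fraction in one (pT, |y|) interval.

    corrected_yield : acceptance- and efficiency-corrected prompt or
                      non-prompt yield (assumed to come from the fit)
    lumi            : integrated luminosity, e.g. in fb^-1
    d_pt, d_y       : interval widths in pT (GeV) and rapidity; how the
                      positive and negative rapidity halves enter d_y is
                      left to the caller
    """
    return corrected_yield / (lumi * d_pt * d_y)
```

For example, a candidate with hypothetical values A = 0.5, ε_reco = 0.8 and ε_trig = 0.5 would enter the fit with a weight of 5.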

Non-prompt fraction
The non-prompt fraction $f^{\psi}_{b}$ is defined as the number of non-prompt ψ (produced via the decay of a b-hadron) relative to the number of inclusively produced ψ decaying to muon pairs, after applying weighting corrections:

$$f^{\psi}_{b} \equiv \frac{N^{np}_{\psi}}{N^{np}_{\psi} + N^{p}_{\psi}},$$

where this fraction is determined separately for J/ψ and ψ(2S). The use of the fraction is advantageous since acceptance and efficiency corrections largely cancel and the total systematic uncertainty is reduced.

Ratio of ψ(2S) to J/ψ production in prompt and non-prompt processes
The ratio of ψ(2S) to J/ψ production, in their dimuon decay modes, is defined as:

$$R^{p(np)} = \frac{N^{p(np)}_{\psi(2S)}}{N^{p(np)}_{J/\psi}},$$

where $N^{p(np)}_{\psi}$ is the number of prompt (non-prompt) J/ψ or ψ(2S) mesons decaying into a muon pair in an interval, corrected for selection efficiencies and acceptance.
For the ratio measurements, similarly to the non-prompt fraction, the acceptance and efficiency corrections largely cancel, thus allowing a more precise measurement. The theoretical uncertainties on such ratios are also smaller, as several dependencies, such as parton distribution functions and b-hadron production spectra, cancel in the ratio.
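The cancellation argument can be made concrete with a small Python sketch of the two derived quantities. The function names and the yields used below are hypothetical; the point is only that a correction factor common to numerator and denominator drops out.

```python
def non_prompt_fraction(n_np, n_p):
    """Fraction of non-prompt psi among all psi, from corrected yields."""
    return n_np / (n_np + n_p)

def production_ratio(n_psi2s, n_jpsi):
    """Ratio of corrected psi(2S) to J/psi yields, for prompt or non-prompt."""
    return n_psi2s / n_jpsi

# A common multiplicative correction (hypothetical value) cancels:
scale = 1.25
f_nominal = non_prompt_fraction(300.0, 700.0)
f_scaled = non_prompt_fraction(300.0 * scale, 700.0 * scale)
```

Here `f_nominal` and `f_scaled` agree, whereas the individual yields, and hence the cross-sections, would shift by 25%.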

Acceptance
The kinematic acceptance $A(p_T, y)$ for a $\psi \to \mu^+\mu^-$ decay with $p_T$ and $y$ is given by the probability that both muons pass the fiducial selection ($p_T(\mu) > 4$ GeV and $|\eta(\mu)| < 2.3$). This is calculated using generator-level simulations. Detector-level corrections, which are found to be small, are applied to the results and are also considered as part of the systematic uncertainties.
The acceptance $A$ depends on five independent variables (the two muon momenta, constrained by the $m(\mu\mu)$ mass condition), chosen as the $p_T$, $|y|$ and azimuthal angle $\phi$ of the ψ meson in the laboratory frame, and two angles characterising the $\psi \to \mu^+\mu^-$ decay, $\theta^{\star}$ and $\phi^{\star}$. The angle $\theta^{\star}$ is the angle between the direction of the positive-muon momentum in the ψ rest frame and the momentum of the ψ in the laboratory frame, while $\phi^{\star}$ is defined as the angle between the dimuon production and decay planes in the laboratory frame. The ψ production plane is defined by the momentum of the ψ in the laboratory frame and the positive z-axis direction. The distributions in $\theta^{\star}$ and $\phi^{\star}$ differ for the various possible spin-alignment scenarios of the dimuon system.
The spin-alignment of the ψ may vary depending on the production mechanism, which in turn affects the angular distribution of the dimuon decay. Predictions of various theoretical models are quite contradictory, while the recent experimental measurements [7] indicate that the angular dependence of J/ψ and ψ(2S) decays is consistent with being isotropic. This analysis adopts the isotropic distribution in both $\cos\theta^{\star}$ and $\phi^{\star}$ as nominal.
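For orientation, the angular distribution of a vector quarkonium state decaying to two leptons is commonly written in the general form below. It is given here as an assumption about the parameterisation behind the spin-alignment scenarios: the coefficient names $\lambda_{\theta}$, $\lambda_{\phi}$ and $\lambda_{\theta\phi}$ are introduced for illustration, and the angles are the two decay angles defined above.

```latex
\frac{d^2N}{d\cos\theta^{\star}\,d\phi^{\star}} \propto
  1 + \lambda_{\theta}\cos^2\theta^{\star}
    + \lambda_{\phi}\sin^2\theta^{\star}\cos 2\phi^{\star}
    + \lambda_{\theta\phi}\sin 2\theta^{\star}\cos\phi^{\star}
```

In this form the isotropic hypothesis adopted as nominal corresponds to $\lambda_{\theta} = \lambda_{\phi} = \lambda_{\theta\phi} = 0$, for which the distribution is flat in both $\cos\theta^{\star}$ and $\phi^{\star}$.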
Seven extreme cases that lead to the largest possible variations of acceptance within the phase space of this measurement are identified. These cases, described in Table 1, are used to define a range in which the results may vary under any physically allowed spin-alignment assumptions.
For each of the two mass points (corresponding to the J/ψ and ψ(2S) masses), two-dimensional maps are produced as a function of dimuon $p_T(\mu\mu)$ and $|y(\mu\mu)|$ for the set of spin-alignment hypotheses. Acceptance maps are defined within the range $8 < p_T(\mu\mu) < 110$ GeV, $|y(\mu\mu)| < 2.0$, corresponding to the data considered in the analysis. Each map is defined by 100 slices in $|y(\mu\mu)|$ and 4400 in $p_T(\mu\mu)$, using 200k trials for each point. The statistical uncertainty of each point is below 0.1% and is neglected. As the reconstructed candidates in the data cover a range of masses, owing in part to the detector resolution for the signal mesons and in part to background contributions, the acceptance for a given candidate is interpolated linearly as a function of the reconstructed mass $m(\mu\mu)$ using the known J/ψ and ψ(2S) masses and acceptance maps.
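A toy version of the generator-level acceptance calculation for the isotropic hypothesis can be sketched as follows. This is a simplified illustration, not the procedure used for the maps: the function names, the random seed and the default number of trials are assumptions, only the isotropic case is covered, and the ψ azimuthal angle is fixed to zero because the fiducial requirements do not depend on it.

```python
import math
import random

M_MU = 0.1056584   # muon mass in GeV
M_JPSI = 3.0969    # J/psi mass in GeV
M_PSI2S = 3.6861   # psi(2S) mass in GeV

def boost(p_star, e_star, beta, gamma):
    """Boost a rest-frame momentum (px, py, pz) to the laboratory frame."""
    b_dot_p = sum(b * p for b, p in zip(beta, p_star))
    k = gamma * (gamma / (gamma + 1.0) * b_dot_p + e_star)
    return [p + k * b for p, b in zip(p_star, beta)]

def in_fiducial(p, pt_min=4.0, eta_max=2.3):
    """Muon fiducial selection quoted in the text."""
    pt = math.hypot(p[0], p[1])
    return pt > pt_min and abs(math.asinh(p[2] / pt)) < eta_max

def acceptance(pt, y, mass, n_trials=20000, seed=1):
    """Fraction of isotropic psi -> mu mu decays with both muons in acceptance."""
    rng = random.Random(seed)
    mt = math.hypot(mass, pt)                  # transverse mass
    e_psi = mt * math.cosh(y)
    p_psi = [pt, 0.0, mt * math.sinh(y)]
    beta = [c / e_psi for c in p_psi]
    gamma = e_psi / mass
    e_star = 0.5 * mass                        # muon energy in the psi rest frame
    p_mag = math.sqrt(e_star ** 2 - M_MU ** 2)
    passed = 0
    for _ in range(n_trials):
        cos_t = rng.uniform(-1.0, 1.0)         # isotropic: flat in cos(theta)
        sin_t = math.sqrt(1.0 - cos_t ** 2)
        phi = rng.uniform(0.0, 2.0 * math.pi)  # and flat in phi
        p_star = [p_mag * sin_t * math.cos(phi),
                  p_mag * sin_t * math.sin(phi),
                  p_mag * cos_t]
        mu1 = boost(p_star, e_star, beta, gamma)
        mu2 = [a - b for a, b in zip(p_psi, mu1)]  # momentum conservation
        if in_fiducial(mu1) and in_fiducial(mu2):
            passed += 1
    return passed / n_trials

def interpolate_acceptance(m, a_jpsi, a_psi2s):
    """Linear interpolation in reconstructed mass between the two mass points."""
    return a_jpsi + (a_psi2s - a_jpsi) * (m - M_JPSI) / (M_PSI2S - M_JPSI)
```

Because the decay is isotropic, the decay angles can be generated with respect to any fixed axes; a non-isotropic hypothesis would require the angles to be defined in the frame described above and each decay to be weighted accordingly.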
Figure 1 shows the $p_T$ acceptance projection for all the spin-alignment hypotheses for the J/ψ meson, and also shows the ratio of acceptances for ψ(2S) to J/ψ mesons for the isotropic hypothesis.

Figure 1: Projections of the acceptance maps as a function of $p_T$ for the J/ψ meson (left) under the various spin-alignment hypotheses. Ratio of acceptances for ψ(2S) to J/ψ (right), produced as a function of the dimuon $p_T$ and rapidity, where the central (isotropic) spin-alignment hypothesis has been assumed and the muon selection criteria $p_T(\mu) > 4$ GeV and $|\eta(\mu)| < 2.3$ are applied.

Muon reconstruction and trigger efficiency determination
The technique for correcting the data for trigger and reconstruction inefficiencies is the same for the 7 TeV and 8 TeV data, and is described in [9,34]. However, different efficiency maps are required for each set of data, and the 8 TeV corrections are detailed briefly below. For the 8 TeV data, the single-muon reconstruction efficiency is determined from a tag-and-probe study in dimuon decays [40]. The efficiency map is calculated as a function of $p_T(\mu)$ and $q \times \eta(\mu)$, where $q = \pm 1$ is the electrical charge of the muon, expressed in units of $e$.
The trigger efficiency correction consists of two components. The first represents the trigger efficiency for a single muon in intervals of $p_T(\mu)$ and $q \times \eta(\mu)$. For the dimuon system there is a second correction to account for reductions in efficiency due to closely spaced (in η-φ) muons firing only a single RoI, vertex quality cuts, and opposite-sign requirements. This correction is performed in three rapidity intervals: 0-1.0, 1.0-1.2 and 1.2-2.3. The correction is a function of $\Delta R(\mu\mu)$ in the first two rapidity intervals and a function of $\Delta R(\mu\mu)$ and $|y(\mu\mu)|$ in the last interval.
The final effect of the combination of the two components (single-muon efficiency map and dimuon corrections) mentioned above is illustrated in Figure 2.

Fitting Technique
The corrected prompt and non-prompt J/ψ and ψ(2S) signal yields are extracted from two-dimensional weighted unbinned maximum likelihood fits performed on the dimuon invariant mass, $m(\mu\mu)$, and pseudo-proper decay time, $\tau$, in each $p_T(\mu\mu)$-$|y(\mu\mu)|$ interval. Each interval is fitted independently of all the others. The probability density function (PDF) for each fit is defined as a normalised sum, where each term is factorised into mass-dependent and τ-dependent functions. The PDF can be written in a compact form as

$$\mathrm{PDF}(m, \tau) = \sum_{i} \kappa_i\, f_i(m) \cdot h_i(\tau) \otimes R(\tau),$$

where $\kappa_i$ represents the relative normalisation of the $i$th term (such that $\sum_i \kappa_i = 1$), $f_i(m)$ is the mass-dependent term, and $\otimes$ represents the convolution of the τ-dependent function $h_i(\tau)$ with the τ resolution term, $R(\tau)$. The latter is modelled by a double Gaussian distribution with both means fixed to zero and widths determined from the fit.
Table 2 shows the contributions to the overall PDF with the corresponding $f_i$ and $h_i$ functions. Here $G_1$ and $G_2$ are Gaussian distributions, $B_1$ and $B_2$ are Crystal Ball distributions [53] (see below), while $F$ is a uniform distribution and $C_1$ a first-order Chebyshev polynomial. The exponential functions $E_1$, $E_2$, $E_3$, $E_4$ and $E_5$ have different decay constants, where $E_5(|\tau|)$ is a double-sided exponential with the same decay constant on either side of $\tau = 0$. The parameter $\omega$ represents the fractional contribution between the $B$ and $G$ mass signal contributions, while $\delta(\tau)$ is used to represent the pseudo-proper decay time distribution of the prompt candidates.
In order to make the fitting procedure more robust, a number of component terms share common parameters, which reduces the number of free parameters to 22 in total. In detail, the signal mass shapes are described by the sum of a Crystal Ball shape² ($B$) and a Gaussian ($G$). For each of the J/ψ and ψ(2S), the $B$ and $G$ share a common mean and have freely determined widths, with the ratio between the $B$ and $G$ widths set to be common between the J/ψ and ψ(2S). The $B$ parameters $\alpha$ and $n$, describing the transition point of the low edge from a Gaussian to a power-law shape and the shape of the tail, respectively, are fixed, and variations are considered as part of the fit model systematic uncertainties.
The width of $G$ for the ψ(2S) is set to the width for the J/ψ multiplied by a free scaling parameter. The relative fraction of $B$ and $G$ is free, but common between the J/ψ and ψ(2S).
The non-prompt signal decay shapes ($E_1$, $E_2$) are described by an exponential (for positive τ only) convolved with a double Gaussian function, $R(\tau)$, describing the pseudo-proper decay time resolution for the non-prompt component; the same Gaussian response functions describe the prompt contributions. Both resolution Gaussians have means fixed at $\tau = 0$ and free widths. The decay constants of the J/ψ and ψ(2S) are separate free parameters in the fit.
The background contributions are described by a prompt and a non-prompt component, as well as a double-sided exponential convolved with a double Gaussian describing mis-reconstructed or non-coherent dimuon pairs. The same resolution function as in the signal is used to describe the background. For the non-resonant mass parameterisations, the non-prompt contribution is modelled by a first-order Chebyshev polynomial. The prompt mass contribution follows a flat distribution and the double-sided background uses an exponential. Variations of this fit model are considered as systematic uncertainties.
The important quantities extracted from the fit are: the fraction of events that are signal (prompt or non-prompt J/ψ or ψ(2S)); the fraction of signal events that are prompt; the fraction of prompt signal that is ψ(2S); and the fraction of non-prompt signal that is ψ(2S). From these parameters, and the weighted sum of events, all measured values are extracted.
² The Crystal Ball function is given by:

$$B(m; \alpha, n, \bar{m}, \sigma) = N \cdot \begin{cases} \exp\left(-\dfrac{(m - \bar{m})^2}{2\sigma^2}\right), & \text{for } \dfrac{m - \bar{m}}{\sigma} > -\alpha, \\[2ex] a \cdot \left(b - \dfrac{m - \bar{m}}{\sigma}\right)^{-n}, & \text{for } \dfrac{m - \bar{m}}{\sigma} \le -\alpha, \end{cases}$$

where $a = (n/|\alpha|)^n \exp(-|\alpha|^2/2)$, $b = n/|\alpha| - |\alpha|$ and $N$ is a normalisation factor.
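The building blocks of the signal model can be written down in closed form, as in the Python sketch below. It is an illustration under stated assumptions rather than the fit code: the function names are hypothetical, the Crystal Ball shape is left unnormalised, α is taken to be positive, and the convolution of the one-sided exponential with each resolution Gaussian uses the standard analytic result for an exponentially modified Gaussian.

```python
import math

def gauss(x, sigma):
    """Zero-mean normalised Gaussian."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def resolution(tau, sigma1, sigma2, frac):
    """Double Gaussian R(tau) with both means fixed at zero.
    Since delta(tau) convolved with R is R itself, this is also the prompt shape."""
    return frac * gauss(tau, sigma1) + (1.0 - frac) * gauss(tau, sigma2)

def exp_conv_gauss(tau, tau0, sigma):
    """One-sided exponential (tau > 0, decay constant tau0) convolved
    with a zero-mean Gaussian of width sigma."""
    arg = (sigma / tau0 - tau / sigma) / math.sqrt(2.0)
    return 0.5 / tau0 * math.exp(0.5 * (sigma / tau0) ** 2 - tau / tau0) * math.erfc(arg)

def non_prompt_shape(tau, tau0, sigma1, sigma2, frac):
    """Exponential convolved with the double Gaussian resolution."""
    return (frac * exp_conv_gauss(tau, tau0, sigma1)
            + (1.0 - frac) * exp_conv_gauss(tau, tau0, sigma2))

def crystal_ball(m, mean, sigma, alpha, n):
    """Unnormalised Crystal Ball shape with a power-law tail on the low side
    (alpha > 0 assumed)."""
    t = (m - mean) / sigma
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)

def signal_mass_shape(m, mean, sigma_b, sigma_g, alpha, n, omega):
    """omega * B + (1 - omega) * G with a common mean; normalisation of B omitted."""
    return (omega * crystal_ball(m, mean, sigma_b, alpha, n)
            + (1.0 - omega) * gauss(m - mean, sigma_g))
```

Using the analytic convolution avoids a numerical integral for every candidate, which matters in an unbinned fit where the PDF is evaluated for each event at every minimisation step.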
For 7 TeV data, 168 fits are performed across the range of $8 < p_T < 100$ GeV ($8 < p_T < 60$ GeV) for J/ψ (ψ(2S)) and $0 < |y| < 2$. For 8 TeV data, 172 fits are performed across the range of $8 < p_T < 110$ GeV and $0 < |y| < 2$, excluding the area where $p_T$ is less than 10 GeV and simultaneously $|y|$ is greater than 0.75. This region is excluded due to a steeply changing low trigger efficiency and correlation effects which lead to an artificial fluctuation of the measured cross-section across rapidity.
Figure 3 shows the fit results for one of the intervals considered in the analysis with the overall and individual contributions superimposed on the data, and projected onto the invariant mass and pseudo-proper decay time distributions for 7 TeV data. In Figure 4 the fit results are shown for one high-$p_T$ interval of 8 TeV data.

Bin migration corrections
In order to account for bin migrations due to the detector resolution, corrections in the dimuon system $p_T$ are derived by comparing analytic functions fitted to the $p_T$ spectra of dimuon events in data with and without convolution by the experimental resolution in $p_T$ (determined from the fitted mass resolution and measured muon angular resolutions), as described in [34].
The numbers of acceptance- and efficiency-corrected dimuon decays extracted from the fits in each $p_T$ and rapidity interval are corrected for the differences between the true and reconstructed values of the dimuon $p_T$. The correction factors applied to the fitted yields deviate from unity by no more than 1.5%, and for the majority of slices are smaller than 1%. The ratio measurement and non-prompt fractions are corrected by the corresponding ratios of bin migration correction factors. Using a similar technique, bin migration corrections as a function of $|y|$ are found to differ from unity by negligible amounts.

Systematic Uncertainties
The sources of systematic uncertainty considered, along with their minimum, maximum and median values in percent, are listed in Table 3. The impact of these uncertainties on the production cross-section measurements and on their ratios is shown for a representative interval at 7 TeV in Figures 5 and 6, respectively; the 8 TeV results behave very similarly. The methodology used to determine these uncertainties is described in [54]. The luminosity uncertainty is only applied to the J/ψ and ψ(2S) cross-section results.

Muon reconstruction and trigger efficiencies
To determine the systematic uncertainty on the muon reconstruction and trigger efficiency maps, each of the maps is reproduced in 100 pseudo-experiments. For each pseudo-experiment a new map is created by independently varying each bin content according to a Gaussian distribution about its estimated value, determined from the original map. In each pseudo-experiment, the total weight is recalculated for each dimuon $p_T$ and $|y|$ interval of the analysis. The RMS of the total weight pseudo-experiment distributions for each efficiency type is used as the systematic uncertainty.
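The pseudo-experiment procedure can be sketched in Python as follows. The data layout (a dictionary from map bin to efficiency, and a list giving the map bin of each candidate in one analysis interval), the function names and the seed are assumptions made for illustration; the RMS is taken here as the spread of the total weight about its mean over the pseudo-experiments, which is one reading of the text.

```python
import random
import statistics

def smear_map(eff_map, err_map, rng):
    """New map with every bin varied independently by a Gaussian of width err_map[bin].
    No protection against non-physical efficiencies is applied in this sketch."""
    return {key: rng.gauss(value, err_map[key]) for key, value in eff_map.items()}

def weight_spread(candidate_bins, eff_map, err_map, n_pseudo=100, seed=0):
    """Mean and RMS spread of the total weight in one analysis interval.

    candidate_bins : map-bin key of each candidate in the interval (hypothetical layout)
    eff_map        : nominal efficiency per map bin
    err_map        : uncertainty on the efficiency per map bin
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_pseudo):
        toy = smear_map(eff_map, err_map, rng)
        # Total weight of the interval for this pseudo-experiment
        totals.append(sum(1.0 / toy[key] for key in candidate_bins))
    return statistics.mean(totals), statistics.pstdev(totals)
```

The ratio of the returned spread to the mean would then give a relative uncertainty for that interval and efficiency type.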
The ID tracking efficiency is in excess of 99.5%, and an uncertainty of 1% is applied to account for the inner detector reconstruction efficiency (0.5% per muon, added coherently).This uncertainty is applied to the differential cross-sections and is assumed to cancel in the fractions of non-prompt to inclusive production for J/ψ and ψ(2S) and ratios of ψ(2S) to J/ψ production.
For the trigger efficiency $\epsilon_{\mathrm{trig}}$, there is an additional correction that accounts for correlations between the two trigger muons. This correction is varied by its uncertainty, and the shift in the resultant total weight relative to its central value is added in quadrature to the uncertainty from the map. The choice of triggers is known [55] to introduce a small lifetime-dependent efficiency loss, but it is determined to have a negligible effect on the prompt and non-prompt yields and no correction is applied in this analysis.

Fit model systematics
The uncertainty due to the fit procedure was determined by varying one component at a time in the fit model described in section 4.6, creating a set of new fit models. For each new fit model, all measured quantities were recalculated, and in each $p_T$ and $|y|$ interval the maximal variation from the central fit model was used as its systematic uncertainty.
The following variations of the central fit model were considered:
• Signal model mass shape: using double Gaussians in place of the Crystal Ball plus Gaussian model; and varying the α and n parameters of the B model, which are originally fixed.
• Signal model decay time shape: a double exponential was used to describe the decay time distribution for the ψ non-prompt signal.
• Background mass shapes: varying the mass model using exponentials or quadratic Chebyshev polynomials to describe the components of the prompt, non-prompt and double-sided background terms.
• Background decay time shape: for the non-prompt component, a single exponential was considered.
• Decay time resolution: using a single Gaussian in place of the double Gaussian to model the lifetime resolution (also the prompt lifetime shape); and varying the mixing terms for the two Gaussians.
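The way the variations listed above are combined into one uncertainty per interval can be expressed in a few lines of Python. The function name, the dictionary layout and the numbers in the usage note are hypothetical.

```python
def fit_model_systematic(nominal, variations):
    """Largest absolute deviation from the nominal result over all fit-model variations.

    nominal    : measured quantity from the central fit model in one interval
    variations : mapping from a variation label to the re-measured quantity
    """
    return max(abs(value - nominal) for value in variations.values())
```

For instance, with a nominal value of 0.30 and alternative results of 0.31, 0.28 and 0.305, the assigned uncertainty would be 0.02, driven by the single largest shift rather than by a sum over variations.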

Bin migrations
As the corrections to the results due to bin migration effects are close to unity in all regions, the difference between the correction factor and unity is applied as the uncertainty.

Fit model closure test
Closure tests were performed to validate the fit model. Two reconstructed Monte Carlo samples that passed the same reconstruction procedure as the real data were used, one containing prompt J/ψ and the other containing non-prompt J/ψ from b-hadron decays. In each of 25 tests, a pseudo-data sample was created with a varying composition of prompt and non-prompt J/ψ drawn from these Monte Carlo samples. The fit was performed on that sample and the estimated numbers of prompt and non-prompt events were compared with the true ones. The results of the closure test are shown in Figure 7: the fractional differences between the true and estimated numbers of entries for prompt (dashed line) and non-prompt (continuous line) are less than 5%, and the estimates agree satisfactorily with the truth values within the fitted statistical uncertainties.

Results
The J/ψ and ψ(2S) non-prompt and prompt production cross-sections are presented, corrected for acceptance and detector efficiencies under the assumption of isotropic decays, as described in section 4.1. Also presented are the fractions of non-prompt decays relative to all decays of J/ψ and ψ(2S) mesons, described in section 4.2, and the ratios of ψ(2S) to J/ψ decays for prompt and non-prompt production, described in section 4.3. All the results presented have been derived using an unpolarised spin-alignment scenario.

Production cross-sections
All production cross-sections presented in this section are corrected for acceptance and detector efficiencies with the assumption of isotropic decays, as described in section 4.1. Figures 8 and 9 show the prompt and non-prompt differential cross-sections, respectively, of J/ψ and ψ(2S) as functions of $p_T$ and $|y|$, together with the relevant theoretical predictions.

Non-prompt production fractions
The results for the fractions of non-prompt decays relative to all decays of J/ψ and ψ(2S) are presented as a function of $p_T$ for slices of rapidity in Figure 10. In each rapidity slice, the non-prompt fraction is seen to increase as a function of $p_T$ and has no strong dependence on either rapidity or centre-of-mass energy.

Production ratios of ψ(2S) to J/ψ
The results of the fits for the ratio of ψ(2S) to J/ψ decaying to a muon pair and produced in prompt and non-prompt processes are presented as a function of $p_T$ for slices of rapidity in Figure 11. The non-prompt ratio is shown to be relatively flat across the considered range of $p_T$, for each slice of rapidity. For the prompt ratio, an increase as a function of $p_T$ is observed, with no strong dependence on rapidity or centre-of-mass energy.

Comparison with theory
For prompt ψ production, comparisons are made to NLO NRQCD calculations [56], as shown in Figure 12, where comparisons between the theory and data, plotted as a ratio, are provided for J/ψ and ψ(2S) at both 7 and 8 TeV centre-of-mass energies. The uncertainties in the theory prediction originate from the choice of scale, charm quark mass and long-distance matrix elements, and are discussed further in [57]. Figure 12 shows fair agreement between the theoretical calculation and the data points over the whole $p_T$ range. The ratio of theory to data shows no dependence on rapidity.
For non-prompt ψ production, comparisons are made to fixed-order next-to-leading-logarithm (FONLL) theoretical predictions [1,2]. Figure 13 shows the ratios of J/ψ and ψ(2S) FONLL predictions to the data, as functions of $p_T$ and in slices of rapidity, for centre-of-mass energies of 7 and 8 TeV. For J/ψ, the agreement is generally good, although the theory predicts slightly harder $p_T$ spectra than observed in the data. For ψ(2S), the shapes of data and theory appear to be in satisfactory agreement, although the theory predicts higher yields than the data suggest. There is no observed dependence on rapidity in the comparisons between theory and data for non-prompt J/ψ and ψ(2S) production.

Comparison of cross-sections at 8 TeV with 7 TeV
It is also of interest to compare the cross-sections measured at the two centre-of-mass energies. The same comparison is made between the corresponding theoretical predictions.
Figure 14 shows the ratios of the 8 TeV to 7 TeV cross-sections for prompt (left) and non-prompt (right) J/ψ (top) and ψ(2S) (bottom), for both data (red) and theoretical predictions (green). The error bars on the data ratios represent the propagated uncertainties of the individual points. For the theoretical ratios the uncertainties have been neglected, since the high correlation between them makes it impossible to assign an uncertainty properly.
Since the 8 TeV results have slightly finer pT intervals than the 7 TeV results, the 7 TeV pT intervals were used for the direct comparisons and production ratios, and where more than one 8 TeV point fell within a 7 TeV interval, a weighted average of those points was used. The ratios of data and the ratios of theoretical predictions agree within the uncertainties of the data. The shapes of both the data ratios and the theory ratios suggest that the production cross-section at higher pT also increases with increasing centre-of-mass energy.
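The rebinning and ratio described in this section can be sketched as follows. This is only an illustration under stated assumptions, not the analysis code: the text does not specify the weights used in the weighted average, so bin widths are assumed here, uncertainties are treated as uncorrelated, and all function names and numbers are hypothetical.

```python
import math

def merge_bins(fine_edges, values, errors, lo, hi):
    """Average the fine-binned points that lie inside the coarse interval [lo, hi].

    ASSUMPTION: weights are the fine-bin widths and the errors are uncorrelated.
    """
    weighted_sum = 0.0
    variance = 0.0
    total_width = 0.0
    for i, (a, b) in enumerate(zip(fine_edges[:-1], fine_edges[1:])):
        if a >= lo and b <= hi:
            width = b - a
            weighted_sum += width * values[i]
            variance += (width * errors[i]) ** 2
            total_width += width
    if total_width == 0.0:
        raise ValueError("no fine bins inside the coarse interval")
    return weighted_sum / total_width, math.sqrt(variance) / total_width

def ratio_with_error(num, num_err, den, den_err):
    """Ratio num/den with relative errors added in quadrature (no correlations)."""
    r = num / den
    return r, r * math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)

# Toy numbers (hypothetical): two 8 TeV bins falling inside one 7 TeV bin.
edges_8tev = [10.0, 10.5, 11.0]
xsec_8tev = [4.0, 2.0]
err_8tev = [0.4, 0.2]
merged, merged_err = merge_bins(edges_8tev, xsec_8tev, err_8tev, 10.0, 11.0)
ratio, ratio_err = ratio_with_error(merged, merged_err, 2.5, 0.25)
```

Width weighting keeps the merged value equal to the average differential cross-section over the coarse interval; an inverse-variance weighting would be another plausible reading of "weighted average".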

Conclusion
The prompt and non-prompt production cross-sections, the non-prompt production fractions of the J/ψ and ψ(2S) decaying into two muons, and the prompt and non-prompt ratios of ψ(2S) to J/ψ production were measured in the rapidity range |y| < 2.0 for transverse momenta between 8 and 110 GeV. This measurement was carried out using 2.1 fb−1 (11.4 fb−1) of pp collision data at a centre-of-mass energy of 7 TeV (8 TeV) recorded by the ATLAS experiment at the LHC. This measurement is the latest in a series of related measurements of the production of charmonium states made by ATLAS at 7 TeV, χc → J/ψγ [14] and ψ(2S) → J/ψππ [9], and extends the measurements to √s = 8 TeV.
The non-prompt theory predictions from the FONLL model are in reasonable agreement with the data across the range of pT and y considered, for both J/ψ and ψ(2S), although the pT spectra in theory tend to be harder than in data. For ψ(2S) the theory tends to predict a slightly higher yield than observed.
The prompt theory predictions from the NRQCD model, which includes colour-octet contributions with various matrix elements tuned to earlier collider data, are in good agreement with the observed data points. Together with related measurements made by other LHC collaborations, these results will help improve the understanding of the dynamics of quarkonium production in hadronic collisions.

Plots for approval [not part of main paper]

Average weight distributions
A summary of the mean weight and its RMS for each analysis bin at 8 TeV is presented in Figures 15 and 16 for the kinematic acceptance and muon reconstruction corrections, respectively, while Figure 17 shows the RMS of the trigger weight distributions. These figures are for illustrative purposes only, as each candidate had an individual weight applied, not an average weight as shown here.
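A per-bin summary of this kind could be produced along the following lines. This is a minimal sketch with hypothetical names and toy inputs; in particular, the RMS is taken here as the spread of the weights about their mean, which is an assumption about the convention used.

```python
import math
from collections import defaultdict

def find_bin(x, edges):
    """Return the index of the half-open interval [edges[i], edges[i+1]) containing x."""
    for i in range(len(edges) - 1):
        if edges[i] <= x < edges[i + 1]:
            return i
    return None  # outside the analysis range

def weight_summary(candidates, pt_edges, y_edges):
    """Mean and RMS of per-candidate weights in each (pT, |y|) analysis bin.

    candidates: iterable of (pt, y, weight) tuples (hypothetical layout).
    ASSUMPTION: RMS means the spread about the mean, not sqrt(mean of w^2).
    """
    groups = defaultdict(list)
    for pt, y, weight in candidates:
        ipt = find_bin(pt, pt_edges)
        iy = find_bin(abs(y), y_edges)
        if ipt is not None and iy is not None:
            groups[(ipt, iy)].append(weight)
    summary = {}
    for key, weights in groups.items():
        mean = sum(weights) / len(weights)
        rms = math.sqrt(sum((w - mean) ** 2 for w in weights) / len(weights))
        summary[key] = (mean, rms)
    return summary

# Toy candidates (hypothetical): the last one falls outside the pT range and is skipped.
candidates = [(9.0, 0.1, 1.0), (9.5, -0.2, 3.0), (12.0, 0.9, 2.0), (200.0, 0.1, 5.0)]
summary = weight_summary(candidates, [8.0, 10.0, 14.0], [0.0, 0.25, 1.0])
```

As the text notes, such averages are only a diagnostic: the measurement itself applies the individual weight of each candidate.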

Figure 2: Average dimuon trigger weight in the intervals of p T (µµ) and |y(µµ)| studied in this set of measurements.

Figure 3: Projections of the fit result over the mass (left) and pseudo-proper decay time (right) distributions for data collected at 7 TeV for one typical interval. The data are in black, superimposed with the individual components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.

Figure 4: Projections of the fit result over the mass (left) and pseudo-proper decay time (right) distributions for data collected at 8 TeV for an interval of high pT. The data are in black, superimposed with the individual components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.

Figure 5: Statistical and systematic contributions to the fractional uncertainty on the prompt (left column) and non-prompt (right column) J/ψ (top row) and ψ(2S) (bottom row) cross-sections for 7 TeV, shown for the region 0.75 < |y| < 1.00.

Figure 6: Breakdown of the contributions to the fractional uncertainty on the non-prompt fractions for J/ψ (top left) and ψ(2S) (top right), and the prompt (bottom left) and non-prompt (bottom right) ratios for 7 TeV, shown for the region 0.75 < |y| < 1.00.

Figure 7: Results of the closure test of the fit model. The fractional difference between the true and the estimated number of entries is shown for prompt (dashed line) and non-prompt (continuous line) contributions.

Figure 8: The differential prompt cross-section times dimuon branching fraction of J/ψ (left) and ψ(2S) (right) as a function of pT for each slice of rapidity. The top row shows the 7 TeV results and the bottom row the 8 TeV results. For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The centre of each bin on the horizontal axis represents the mean of the weighted pT distribution. The horizontal error bars represent the range of pT for the bin, and the vertical error bars cover the statistical and systematic uncertainty (with the same multiplicative scaling applied). The NLO NRQCD theory predictions are shown along with the measurements.

Figure 9: The differential non-prompt cross-section times dimuon branching fraction of J/ψ (left) and ψ(2S) (right) as a function of pT for each slice of rapidity. The top row shows the 7 TeV results and the bottom row the 8 TeV results. For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The centre of each bin on the horizontal axis represents the mean of the weighted pT distribution. The horizontal error bars represent the range of pT for the bin, and the vertical error bars cover the statistical and systematic uncertainty (with the same multiplicative scaling applied). The FONLL theory predictions are shown along with the measurements.

Figure 10: The non-prompt fraction of J/ψ (left) and ψ(2S) (right), as a function of pT for each slice of rapidity. The top row shows the 7 TeV results and the bottom row the 8 TeV results. For each increasing rapidity slice, an additional factor of 0.2 is applied to the plotted points for visual clarity. The centre of each bin on the horizontal axis represents the mean of the weighted pT distribution. The horizontal error bars represent the range of pT for the bin, and the vertical error bars cover the statistical and systematic uncertainty (with the same multiplicative scaling applied).

Figure 11: The ratio of ψ(2S) to J/ψ production times muon pair branching fraction for prompt (left) and non-prompt (right) production as a function of pT for each slice of rapidity. The top row shows the 7 TeV results and the bottom row the 8 TeV results. The centre of each bin on the horizontal axis represents the mean of the weighted pT distribution. The horizontal error bars represent the range of pT for the bin, and the vertical error bars cover the statistical and systematic uncertainty.

Figure 12: The ratios of the NRQCD theoretical predictions to data for the differential prompt cross-section of J/ψ (left) and ψ(2S) (right) as a function of pT(µµ) for each rapidity slice. The top row shows the 7 TeV results and the bottom row the 8 TeV results. The error on the data is the relative error of each data point, while the error bars on the theory prediction are the relative error of each theory point.

Figure 13: The ratios of the FONLL theoretical predictions to data for the differential non-prompt cross-section of J/ψ (left) and ψ(2S) (right) as a function of pT(J/ψ) for each rapidity slice. The top row shows the 7 TeV results and the bottom row the 8 TeV results. The error on the data is the relative error of each data point, while the error bars on the theory prediction are the relative error of each theory point.

Figure 14: The ratios of differential cross-sections for 8 TeV over 7 TeV for prompt (left) and non-prompt (right) J/ψ (top) and ψ(2S) (bottom), for both data (red) and theoretical predictions (green). The theoretical predictions used are NRQCD on the left and FONLL on the right. The error on the data ratio is the propagated error of each data point, while the error on the ratio of theory predictions is ignored due to correlations.

Figure 15: Left: the mean weight for each analysis bin of pT and |y| from the kinematic acceptance correction. The mean is calculated using the weight distribution for all candidates that pass the selections and fall into a particular bin. Right: the value of the RMS for the corresponding distributions.

Figure 20: Projections of the fit result over the pseudo-proper lifetime distribution in different slices of invariant mass for data collected at 7 TeV. The data are in black, superimposed with the components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.

Figure 24: Projections of the fit result over the mass distribution in different slices of pseudo-proper lifetime for data collected at 7 TeV. The data are in black, superimposed with the components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.

Figure 26: Projections of the fit result over the mass and pseudo-proper decay time distributions for data collected at 8 TeV for one typical interval. The data are in black, superimposed with the components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.

Figure 32: Comparison of the prompt and non-prompt J/ψ production cross-section times branching fraction into two muons between the ATLAS 8 TeV data and LHCb 8 TeV data [12] in a similar, but non-overlapping, rapidity interval, shown as a function of pT. Solid points describe the ATLAS data for the highest rapidity interval 1.75 < |y| < 2.0, while hollow points show the lowest rapidity region 2.0 < y < 2.5 from LHCb.

Table 1: Values of angular coefficients describing the considered spin-alignment scenarios.

Table 2: Description of the fit model PDF (see equation 3). Components of the probability density function used to extract the prompt (P) and non-prompt (NP) contributions for J/ψ and ψ(2S) signal and background (Bkg).

Table 3: Summary of the minimum and maximum contributions along with the median of the systematic uncertainties as percentages. Maximum values are quoted for 7 and 8 TeV.

Figure 23: Projections of the fit result over the mass and pseudo-proper decay time distributions for data collected at 7 TeV for one typical interval. The data are in black, superimposed with the components of the fit result projections, where the total prompt and non-prompt components are represented by the dashed and dotted lines, respectively, and the shaded areas show the signal ψ prompt and non-prompt contributions.