Energy Dependence of Direct-Quarkonium Production in pp Collisions from Fixed-Target to LHC Energies: Complete One-Loop Analysis

We compute the energy dependence of the P_T-integrated cross section of directly produced quarkonia in pp collisions at next-to-leading order (NLO), namely up to alpha_s^3, within nonrelativistic QCD (NRQCD). Our analysis is based on the idea that the P_T-integrated and the P_T-differential cross sections can be treated as two different observables. The colour-octet NRQCD parameters needed to predict the P_T-integrated yield can thus be extracted from the fits of the P_T-differential cross sections at mid and large P_T. For the first time, the total cross section is evaluated in NRQCD at full NLO accuracy using the recent NLO fits of the P_T-differential yields at RHIC, the Tevatron and the LHC. Both the normalisation and the energy dependence of the J/psi, psi' and Upsilon(1S), we obtained, are in disagreement with the data irrespective of the fit method. The same is true if one uses CEM-like colour-octet NRQCD parameters. If, on the contrary, one disregards the colour-octet contribution, the existing data in the TeV range are well described by the alpha_s^3 contribution in the colour-singlet model --which, at alpha_s^4, however shows an unphysical energy dependence. A similar observation is made for eta(c,b). This calls for a full NNLO or for a resummation of the initial-state radiation in this channel. In any case, past claims that colour-octet transitions are dominantly responsible for low-P_T quarkonium production are not supported by our results. This may impact the interpretation of quarkonium suppression in high-energy proton-nucleus and nucleus-nucleus collisions.


Introduction
Understanding the production mechanism of low-P T quarkonia in nucleon-nucleon collisions is of fundamental importance to properly use them as probes of deconfinement or collectivity in heavy ion collisions. Indeed, most of the analysis of quarkonium production in nucleus-nucleus colli-sions are carried out on the bulk of the cross section, namely at low P T .
Recently, comparisons between ALICE data [1] without P T cut and CMS data [2] with P T cut in PbPb collisions at √ s NN = 2.76 TeV showed an unexpected suppression pattern of the charmonia, at variance with the simple picture of quarkonium melting in deconfined quark matter [3]. However, to properly interpret this observation, it is essential to rule out the possibility that a part of the effect observed could be due to a difference in the production mechanism in individual nucleon-nucleon collisions at low and at larger P T . The propagation of a colour-octet pair in a deconfined medium certainly differs from that of a colour-singlet pair; this can result into a different nuclear suppression (see e.g. [6]). On the contrary, as regards the bottomonia, the observation of the expected sequential-suppression pattern has been claimed by CMS [4,5]. Further, the effect of normal nuclear matter may also significantly depend on how the pair is produced: the recently revived fractional energy loss [7,8] would for instance act on long-lived colour-octet states and probably differently if the heavy quark state is already produced colourless at short distance, as postulated in the CSM [9]. Saturation effects in pA collisions also do depend on the colour state of the perturbatively produced heavy-quark pair [10][11][12] Despite the possibility that NRQCD factorisation would not hold at low P T , several NRQCD analyses have thus been carried earlier to evaluate the impact of the colouroctet channels to the P T -integrated J/ψ yields [13][14][15]. A first study of the impact of initial state radiations (ISR) on the very low P T J/ψ's and Υ's was recently carried out successfully in NRQCD [16] -yet at the cost of introducing additional non-perturbative parameters.
Whereas, based on an analysis of the sole early RHIC data, Cooper et al. argued [14] that the universality of NRQCD was safe and that colour-singlet contributions to the P T -integrated J/ψ yields were negligible, the global analysis of Maltoni et al. at NLO showed [15] that the colour-octet Long-Distance Matrix Elements (LDMEs) required to describe the total prompt J/ψ yield from fixedtarget energies to RHIC were one tenth of that expected from the -leading-order-fit of the P T -differential cross sections at Tevatron energies.
Such fits of the P T -differential J/ψ cross sections have recently been extended to NLO -i.e. one-loop-accuracy on the prompt J/ψ yields -some of them focusing on the larger P T data and explicitly including the feed-down contributions [17,18], some enlarging the analysis beyond hadroproduction and including rather low-P T data [19]-and on the Υ(nS ) yields [20,21]. Thanks to these studies, we can significantly extend the existing NRQCD studies of the P Tintegrated cross section by combining in a coherent manner, the hard parts -or Wilson coefficients-up to α 3 S , first derived by [22] and which we have systematically checked with FDC [23], with the NRQCD matrix elements fitted at NLO on the P T dependence of the yields. One can indeed consider the P T -integrated and the P T -differential cross sections as two different observables -their Born contributions involve different diagrams -and such a procedure is not a all trivial physics-wise.
As we detail later, our results show that the data do not allow for a global description of both the P T -integrated and P T -differential quarkonium yields. As a point of comparison, we also had a look at Colour-Evaporation-Modellike (CEM) predictions derived from NRQCD following the work of [26] and we found out that it cannot reproduce P Tintegrated yields using the LDMEs obtained following the relations of [26] after identifying the minimal singlet transition to that of the CSM. A contrario, results obtained from the traditional CEM implementation at one loop do not show a similar issue.
The inability of colour-octet dominance within NRQCD to provide a global description of both low and large P T data is in line with the recent findings [28][29][30][31] that the sole LO colour-singlet contributions are sufficient to account for the magnitude of the total cross section and its dependence in rapidity, dσ/dy, from RHIC, Tevatron all the way to LHC energies. Any additional contribution in this energy range creates a surplus 1 as compared to data.
However, as we also study in a dedicated section, the total NLO CSM cross section shows a weird energy dependence at LHC energies. The problem is striking for the J/ψ, less for the Υ. In any case, one should be very careful in interpreting these results. In particular, such NLO results cannot be considered as a improvement of the LO ones. We also observe the same issue for η c and η b production for which there is no final-state-gluon radiation at Born order. We are therefore tempted to attribute this behaviour to large loop contributions which become negative at low P T , rather than to specific effects related to the 3 S 1 production per se. A quick inspection of the rather concise one-loop results [32] for η c and η b production in the TMD factorisation formalism unfortunately does not reveal any obvious negative contributions and does not help in the understanding of this rather general issue of quarkonium production in collinear factorisation.
The same problems appear with some CO channels as well and may therefore cast doubts on the reliability of our results in the √ s region where some contributions shows a strange behaviour -in particular at LHC energies. At this stage, we are not able to conclude from our observations whether these problems are indicative of the break down of NRQCD factorisation at low P T and low x or at low x only. However, for sure, none of the above observations can reasonably support the idea that CO transitions are dominant at low P T . Such a conclusion would be for the least premature.
The structure of the paper is as follows. In section 2, we detail the procedure to evaluate the P T -integrated yield at one-loop accuracy in NRQCD and we explain the idea underlying this first complete one-loop analysis. In section 3, we explain our selection of LDMEs determined at NLO. In section 4, we briefly comment on the existing world data sets for J/ψ, ψ(2S ) and Υ(1S ) 2 . We also explain how we estimate the direct yields. In section 5, we show and discuss our results for the first full one-loop NRQCD analysis of quarkonium hadroproduction. To go further in the interpretation of some of our results, we discuss in section 6 the prediction of NRQCD using CEM-like LDMEs. This is also compared with the conventional approach based on quark-hadron duality. Section 7 focuses on CSM results both for the 3 S 1 states considered here and for the 1 S 0 states for which analytical results exist. Our conclusions are presented in section 8.

Generalities
Following the NRQCD factorisation, the cross-section for quarkonium hadroproduction can be expressed from the parton densities in the colliding hadrons, f (x), a hard-partthe partonic cross section-for the production of a heavyquark pair with zero relative velocity, v, in a definite angularmomentum, spin and colour state, and a LDME connected 2 Whereas the Υ analysis of Gong et al. [21] treats the Υ(1S ), Υ(2S ) and Υ(3S ), the lack of knowledge on the χ b (2P) and χ b (3P) yields and their corresponding feed down to Υ(2S ) and Υ(3S ) makes the analysis of their direct yield delicate; we have thus decided not to consider these in the present study. Our choice has been confirmed by the recent LHCb result [63] that a large fraction of the Υ(2S ) and Υ(3S ) yield actually comes from χ b (2P) and χ b (3P) decays -up to 40% in the Υ(3S ) case. Fig. 1: Representative diagrams contributing (a-b) at Born order to i + j → Q, (c-e) both at Born order to i + j → Q+jet and at one loop to i + j → Q, (f) at one loop to i + j → Q, (g-k) at one loop to i + j → Q+jet.
to the hadronisation probabilities of the intermediate state into the quarkonium. Namely, one has for the production of a quarkonium Q along with some unidentified set of particles X, where the indices i, j run over all partonic species and n denotes the colour, spin and angular momentum states of the intermediate QQ pair. For the 3 S 1 quarkonium states, the first CO states which appear in the v expansion are the 1 S [8] 0 , 3 S [8] 1 and 3 P [8] J states, in addition to the leading v contribution 3 S [1] 1 from a CS transition. One however has to note that, for hadroproduction, whereas the CO contributions already appear at α 2 S (Fig. 1a  & 1b), the CS one only appear at α 3 S (Fig. 1c). These α 2 S CO graphs nevertheless do not contribute to the production of quarkonia with a nonzero P T since they would be produced alone without any other hard particle to recoil on.
The Born contributions from CS and CO transitions are indeed different in nature: the former is the production of a quarkonium in association with a recoiling gluon, which could form a jet, while the latter is the production of a quarkonium essentially alone at low P T .
Let us now have a look at the α 3 S CO contributions ( Fig. 1d-1f) which are then NLO -or one loop-corrections to quarkonium production and which are potentially plagued by the typical divergences of radiative corrections. Yet, the real-emission α 3 S corrections to CO contributions ( Fig. 1d & 1e) can also be seen as Born-order contributions to the production of a quarkonium + a jet -or, equally speaking, of a quarkonium with P T Λ QCD . As such, they do not show any soft divergences for P T 0. These are supposed to be the leading contribution to the P T -differential cross section in most of the data set taken at hadron collider (Tevatron, RHIC and LHC). These are now known up to one-loop accuracy, namely up to α 4 S (see e.g. [17-21, 34, 35]) ( Fig. 1g  & 1h).
It is important to note that one cannot avoid dealing with the divergences appearing at α 3 S if one study the P Tintegrated cross section.
where q denotes u, d, s. At α 3 S , the QCD corrections to the aforementioned channels include real ( Fig. 1d & 1e) and virtual (Fig. 1f) corrections. One encounters UV, IR and Coulomb singularities in the calculation of the virtual corrections. The UVdivergences from the self-energy and triangle diagrams are removed by the renormalisation procedure. Since we follow the same lines as [18,35] where all the procedure is described, we do not repeat its description. As regards the realemission corrections, they arise from 3 kinds of processes (not all drawn): As usual, the phase-space integrations generate IR singularities, which are either soft or collinear and can be conveniently isolated by slicing the phase space into different regions. Here we adopt the two-cutoff phase space slicing method to deal with the problem [36]. As we previously alluded to, the α 3 S CS contribution is particular since, in the limit v = 0, it would be strictly speaking Born order for both the production of a quarkonium and of a quarkonium + a jet. It arises from the well-known process: Our calculations is equivalent to a previous work by Maltoni et al. [15,22] and we have checked that we reproduce their results for all the relevant channels. As announced in the introduction, one of the novelty in our study resides in the use of the LDMEs fitted at the same order, i.e. one loop, to the P T -differential cross sections. As such, this is the first global NLO analysis of hadroproduction.
Since we also look at data at rather low energies, we also included a CS channel via γ exchange. Indeed, as noted in a different context in [37], the QED CS contributions via γ are naturally as large as the CO 3 S [8] 1 transition via g -the α em suppression being compensated by the small relative size of the 3 S [8] 1 CO LDME (O(10 −3 )) as compared to the 3 S [1] 1 CS LDME (O(1)). The real-emission contributions arise from whereas the loop contributions are only from Fig. 1a ( Fig. 1f) with the s-channel gluon replaced by a γ would depict the Born (a one-loop) contribution. We have however found that they do not matter in the regions which we considered.
3 Constraints on the LDMEs from the P T -differential cross section The CS LDMEs can either be extracted from the leptonic decay width at NLO or can be estimated by using a potential model result, which gives for the Buchmuller-Tye potential [38] |R J/ψ (0)| 2 = 0.81 GeV 3 , |R ψ(2S ) (0)| 2 = 0.53 GeV 3 and |R Υ(1S ) (0)| 2 = 6.5 GeV 3 . As regards the CO LDMEs, they can only be extracted from data. As we discussed above, our aim is to analyse the P T -integrated yield using the constraints from the P T dependence of the yields.

J/ψ
In the J/ψ case, we will use the results of five fits of this dependence [17,18,[39][40][41]. The first two were limited to pp data but explicitly took into account the effect of the feeddown 3 The latter fit was based on a wider set of data including ep and γγ systems but the feed-down effects were only implicitly included through constant fractions for these systems. The fourth one includes the recent η c measurement at P T ≥ 6 GeV by LHCb [42] by relying on the heavy-quark spin symmetry of NRQCD which relates colour-octet matrix elements of spin-singlet and triplet quarkonia with the same principal quantum number n. The fifth one incorporates the leading-power fragmentation corrections together with the usual NLO corrections, which results in a different short-distance coefficient and allows for different LDMEs. 3 In [17], Ma et al. used both the prompt J/ψ yield and polarisation data from CDF(run II). In [18], Gong et al. chose to fit the CDF and LHCb experimental data for the yield only (no polarisation data).
1 ) from 5 NLO (i.e. at one loop) fits of the P T dependence of the yields, which we will use to compute [the energy dependence of] the P T -integrated yields. Ref.
Another recent fit [43] took the η c measurement into account. The LDME values which they found fall into range considered for [40], therefore we do not use it separately.
In Ref.
They proceeded to two fits with different P T cuts. We use that for P T > 7 GeV and limit ourselves to the central val- As a central value, we choose the middle of the allowed interval. The same group has however recently improved their analysis by taking into account the feed-down [44]. As aforementioned, they in turn performed a new fit [40] including η c data. The 6 sets of LDMEs to be used to probe the allowable parameter space of the fit are given in Table 1.
As mentioned above, in [39], Butenschoen et al. proceeded to a global fit of prompt J/ψ data from pp, γγ, γp systems 5 . Since γγ, γp mostly lies at low P T , they also considered data at rather low P T from RHIC. They did not included NLO predictions for χ c in the fit. Rather they assumed a constant direct fraction, for instance 36 % for hadroproduction.

ψ(2S )
Buttenschoen et al. did not provide a fit of ψ(2S ) in [39] due to the lack of data besides those from pp collisions. The LDMEs which we consider for ψ(2S ) are therefore only from [17] and [18]. For the former fit, the values are obtained in the same way as for the The resulting values as well as those from [18] are gathered in Table 2 Table 2: Same as Table 1 for ψ(2S ).
In [44], the authors of [17] tried to refit the existing data with a larger P T cut-off. Such a fit already badly overshoots mid-P T data. We therefore do not consider it in this work. For the same reason, we have not considered the fit of [45] since it only reproduces the ψ(2S ) data in an admittedly narrow -high P T -range.

Υ(1S )
As regards the Υ(1S ), there are two NLO analyses from [20] and [21]. However, Wang et al. used in [20] a different value of the NRQCD factorisation scale µ Λ which we use in the present evaluation, that is µ Λ = m b . To perform a correct comparison would have required a new evaluation of the hard coefficients with their choice of µ Λ to use their LDME values. In addition, although they did consider the effects of excited feed-down, they have not disentangled the direct contribution to that of the feed-down in their LDME extraction. The central values of [21] are gathered in Table 3. Table 3: Same as Table 1 for Υ(1S ). Ref.

World data and feed-down effects
As regards the data for J/ψ and ψ(2S ), we drew on the extensive set used in [15] with the exception that we only kept data: derived from more than 100 events at a given √ s; from pp or pp collisions only in order to avoid dealing with nuclear effects; where dσ/dy was derived at y = 0. To this set, we have added data published later than 2006 which includes data from the LHC. We have also added one point from the CDF collaboration at the Tevatron 6 . All the quoted uncertainties are combined in quadrature together with that of the direct fraction 7 which we assumed to be energy independent and F direct J/ψ = 60 ± 10% [28]. As regards the ψ(2S ), the data sets are very scarce especially if one focuses on P T -integrated yields at y = 0. In fact, there is only data from ISR-Clark et al. [47] averaged over √ s = 52.4 and 62.7 GeV and from PHENIX at √ s = 200 GeV. CDF measured the cross section at √ s = 1.96 TeV for |y| < 0.6 but only for P T > 2 GeV [53]. In order to use this precise measurement, we have extrapolated it by assuming the same ratio dσ(P T <2GeV)  enough P T to perform a model-independent enough extrapolation 8 .  As regards the Υ(1S ), the data set is surprisingly wider than that of ψ(2S ) despite a significantly smaller production cross section. It is certainly due to the larger energy of the decay leptons and to the smaller background. For a long time, it was considered that only half of the (low-P T ) Υ(1S ) were directly produced (F direct Υ(1S ) = 50 ± 10%) based on an early CDF measurement [61]. Recent LHCb studies of χ b production [62,63] along with Υ(2S , 3S ) cross section measurements [59,60,64,65], rather indicate that two thirds of the Υ(1S ) are directly produced, we will therefore opt for F direct Υ(1S ) = 66 ± 10%. Yet, a number of experiments could not resolve the 3 Υ states. In this case, one should apply [28] a slightly smaller direct fraction which we take to be F direct Υ(1S +2S +3S ) = 60 ± 10%. As we take this fraction to be energy independent, we chose a conservative estimate of their uncertainty. Table 4 shows the J/ψ data set, Table 5 that of ψ(2S ) and Table 6 that of Υ. 8 One could however use the LHCb and ALICE measurements in the forward region since the rapidity dependence is certainly is better control than the P T dependence from 6 GeV downwards.

Complete NLO results within NRQCD
In the numerical computation at NLO, the CTEQ6M PDF [66] 9 , and the corresponding two-loop QCD coupling constant α s are used 10 . The charm quark mass, m c , is set by default to 1.5 GeV and the bottom quark one, m b , to 4.5 GeV. Our default choices for the renormalisation, factorisation, and NRQCD scales are µ R = µ F = µ 0 with µ 0 = 2m Q and µ Λ = m Q , respectively. When other choices are made, in particular to estimate the theoretical uncertainty due to the lack of knowledge of corrections beyond NLO, they are indicated on the corresponding plots. We have taken δ s = 10 −3 and δ c = δ s /50 for the two phase space cutoffs -the insensitivity of the result on the chosen values for these cut-off has been checked. Our results for direct J/ψ, ψ(2S ) and Υ(1S ) are shown on respectively Fig. 2 (a), (b) and (c).
We first discuss the comparison between the five fits and the J/ψ data ( Fig. 2 (a)). Without a surprise, our study shows that the global fits including rather low P T /m Q data, that is the one of Butenschoen et al. [39] provides the only acceptable description of the total cross section. We however note that the latter fit does not provide a good description of the J/ψ polarisation data and, as recently noted [68], it yields to negative cross section for J/ψ + γ at large P T . Finally, it does not allow [69] to describe the η c data. The fits of Gong et al. [18], and Ma et al. [17,40] greatly overshoot the data in the energy range between RHIC and the Tevatron, whereas these fits a priori provide a good description of the P T -differential cross section at these energies.
The fit of Bodwin et al. [41] gives the worse account of the P T -integrated J/ψ data in the whole energy range. Indeed, the new ingredient of [41] allows one to describe high-P T data with a large 1 S [8] 0 LDME (see Table 1) -as for [18] but without negative LDMEs for the other octet LDMEswhich results in too large a yield at low P T .
In addition, we also note the strange energy dependence of at least the P-wave octet channel. We postpone its discussion to section 5.1 where this is analysed in more detail for the 1 S [8] 0 transition and, in section 7, where we discuss a similar observation for the CSM yield at NLO.
As regards the ψ(2S ) (Fig. 2 (b)), our NLO NRQCD results do not reproduce the data at all at RHIC energies and, since both fits as dominated by the P-wave octet channel, shows a nearly unphysical behaviour at LHC energies.
The comparison for the Υ(1S ) (Fig. 2 (c)), is more encouraging. At RHIC energies and below, the agreement is even quite good, while at Tevatron and LHC energies, the NLO NRQCD curves only overshoot the data by a factor of 2.
We finally note that from RHIC to LHC energies, the LO CSM contributions (the blue in all the plots) accounts well for the data. The agreement is a bit less good for ψ(2S ) if we stick only to the default/central value. This is not at all a surprise and is in line with the previous conclusions made in [28][29][30][31]. In fact, strangely enough, it seems that it is only at low energies (below √ s = 100 GeV) that the CO contributions would be needed to describe the data. The more recent data from the LHC and the Tevatron tend to agree more with the LO CSM.
Overall, this shows -unless the resummation of ISR modifies our predictions by a factor of ten-that it would be difficult to achieve a global description of the total and P T -differential yield and its polarisation at least for the charmonia.
As we discussed in the introduction, a first resummation study has recently been performed within NRQCD [16]. When combined with the results of [17], this resummation yields [16] to a good description of the low-P T data. It should however be stressed that this study introduces 3 new parameters g 1,2,3 to parametrise the so-called W NP function used the CSS resummation procedure. Moreover, we stress that such values of the CO LDMEs would result in a negative NLO P T -differential cross section for J/ψ + γ at large P T where NRQCD factorisation should normally hold.
To close the discussion of the theory-data comparison, let us note that the 3 S [8] 1 channel alone would provide a decent energy dependence. If we were to refit the low-P T data and thus obtain a dominance of the 3 S [8] 1 channel, the yield at large P T would nevertheless dominantly be transversely polarised in disagreement with existing data (e.g. [70]). Yet, the better energy dependence of 3 S [8] 1 at NLO with the respect to the other octet channels, which shows a flat energy dependence in the TeV region, is probably a fortunate "accident". Indeed, most of the 3 S [8] 1 yield up α 4 s is in fact not at one loop, but for the -suppressedqq contribution, since the gg → 3 S [8] 1 is zero. The curve for 3 S [8] 1 -as well as 3 S [1] 1shown on Fig. 2 (a) is effectively a Born-order one. Clearly, these behave better than the channels where the loop contributions are allowed.

Behaviour at high
√ s and scale dependence of the 1 S [8] 0 contribution In view of the observations just made in the previous paragraph, we have found it useful to analyse more carefully the behaviour of one specific channel. We have decided to look more carefully at the simplest one, that is from the 1 S [8] 0 transition in particular at how the scale choices influence the behaviour of the yield at large √ s. Analytical results for the hard-scattering partonic amplitude squared can be found in appendix C of [22]. We have used them in a small numerical code to convolve it with PDFs and checked them with the results of FDC. The advantage of using FDC is that we can easily cut on P T and y.  Fig. 3: The cross section for the production of a J/ψ from only a colour octet 1 S [8] 0 cc state as a function of the cmsenergy for various choice of the mass and scales.
As one could have anticipated from the band of the lower panel of Fig. 2 (a), one observes that for a wide range of scale choices, the energy dependence remains extremely flat. For some of the choice where µ F > µ R , one even sees that the cross section clearly decreases and becomes negative -when the yield becomes negative the curve stops. This is of course not satisfactory. At this stage, we are not able to conclude from our observations -normalisation and high energy issue-whether these point at the break down of NRQCD factorisation, NRQCD universality or should force us to continue questioning our understanding of the midand high-P T quarkonium production mechanisms. To investigate this a bit further, we look at the predictions of another approach -the colour evaporation model-in the next section and later in the colour-singlet model both for 3 S 1 and 1 S 0 quarkonia.

Colour-Evaporation-Model-like NRQCD evaluation
To go further in our investigations of QCD one-loop effects on the energy dependence of quarkonium production, we have found it useful to compare our results with those of the Colour-Evaporation Model (CEM) which directly follows from the quark-hadron duality [71,72]. The quarkonium production cross section is obtained by considering the cross section to produce a QQ pair in an invariant mass region compatible with its hadronisation into a quarkonium, namely between 2m Q and the threshold to produce open heavy flavour hadrons, 2m H . To this, one should multiply a phenomenological factor accounting for the probability that the pair eventually hadronises into a given quarkonium state. Overall, one considers In a sense, the factor P direct Q , i.e. the probability (or fraction) of QQ pair in the relevant invariant mass region to directly hadronise into Q, plays a similar role as the LDMEs in NRQCD, except that its size can be guessed. Indeed, it is expected [73] that one ninth -one colour singlet QQ configuration out of 9 possible-of the open charm cross section in this invariant mass region eventually hadronise into a "stable" quarkonium. Taking into account this factor 9, in the case of J/ψ, it was argued [73] that a simple statistical counting, which would give: where the sum over i runs over all the charmonium states below the DD threshold, could describe the existing data in the late 90's. The solid turquoise curve -computed at NLO as opposed to the analysis of [73]-on Fig. 4 (a) illustrates this 11 . Following the fit of Vogt in [75], P direct J/ψ lies between 1.5 % and 2.5 %. This indeed remarkably coincides with the simple statistical counting 12 . For the Υ, the corresponding quantity is of similar size, between 2 % and 5 %, although following the state-counting argument, one may expect a smaller number than for J/ψ. Let us nevertheless stress that a violation of Eq. 9 cannot be used to invalidate the CEM since this relation completely ignores phase-space constraints. What CEM predicts is that P direct Q is process independent.
In [26], Bodwin et al. studied the connexion between the CEM and NRQCD. Following [26] up to v 2 corrections, only 4 intermediate QQ states contribute to 3 S 1 quarkonium production in a CEM-like implementation of NRQCD. One 11 It has been obtained with MCFM [74] with m c = 1.5 GeV, µ R = µ F = 2m c and m H = m D . 12 As discussed in [24,25], further counting rules involving the Pwaves do not work and illustrate the limit of the model. indeed has: All these nonvanishing LDMEs are then fixed if one makes the reasonable assumption that O3 S 1 ( 3 S [1] 1 ) is indeed the usual CS LDME, i.e. 2N C 4π (2J + 1) |R(0)| 2 . As compared to the results presented in the previous section, the only additional piece to perform a full one-loop analysis is the hard part for 1 S [1] 0 which normally needs not to be considered for 3 S 1 production at this level of accuracy in v. Computing it with FDC [23] does not present any difficulty.   1 curve is the same as in the previous section. One notes that, as for the P-wave octet, the 1 S [1] 0 curve strangely flattens out in the J/ψ case at high energies. We will come back to this in the next section.
The total CEM-like contribution greatly overshoots the data, by a factor as large as 100. This was to be expected since (i), following Eq. 10, all the LDMEs are roughly of the same size, (ii) the 3 S [1] 1 is roughly compatible with the data and (iii) the hard part for the other transitions appear at α 2 S , are not suppressed as P T → 0 and are thus expected to give a larger contribution than the 3 S [1] 1 if one disregards the LDME.
Of course, one could question our assumption that O3 S 1 ( 3 S [1] 1 ) = 2N C 4π (2J + 1) |R(0)| 2 and rather fit O3 S 1 ( 1 S [1] 0 ) . In both the J/ψ and Υ(1S ) cases, the corresponding LDMEs would then approximately be 100 times smaller. In particular, the singlet transition would be 100 times less probable that what one expects from the leptonic decay. This would be an unlikely and dramatic violation of factorisation which should have implications elsewhere. In particular, a pair produced at short distances with the same quantum number as the physical state, among these the colour, would have a much larger probability to be broken up before eventually hadronising than expected.
Although it is not as obvious as in the NRQCD formulae of Eq. 10, where the hypotheses of the CEM are translated into direct relations between CO and CS transition probabilities, the same should happen in Eq. 8 where all the colour configurations are summed over and then considered on the same footage. In a process where the CS configurations dominate, such as qq → γ → QQ, CSM and CEM predictions necessarily differ. Contrary to NRQCD which encompasses the CSM, the CEM does not encompass the CSM. If one agrees with the data, the other cannot. The matter is then how precise the predictions and the data are to rule out one approach or the other.
Overall, one has to acknowledge that the conventional CEM central curves -as simplistic as the underlying idea of the model can be-give an account (Fig. 4 (a) & (b)) of the world data points as satisfactory as the LO CSM. The latter seems to underestimate the data at low energies while the former only has trouble to account for the TeV J/ψ points; the slope being more problematic than the normalisation which can be adjusted. All this is qualitative since the theoretical uncertainties on the CEM are as large as that on the open heavy-flavour production which are admittedly large (see [27] for an up-to-date discussion on the cc case).

Energy dependence of the colour-singlet channel at Born and one-loop accuracy
As we just stated, the LO CSM curves are providing a surprisingly good description of the J/ψ and Υ data at high energies without adjusting -and even less twisting-any parameters. Although the central LO CSM curves agree with the data, the conventional theoretical uncertainties -from the arbitrary scales and the heavy-quark mass-are large (see the lower panels of Fig. 2)). It is therefore very natural to look whether these uncertainties are reduced at oneloop accuracy. Such an observation was already made for the Υ case in [28] but this study was limited to a single √ s, i.e. 200 GeV.

Spin-triplet quarkonia: J/ψ and Υ
Contrary the CO channels, the one-loop corrections to the CS channels only arise at α 4 S (see e.g. Fig. 1j & 1k). Nevertheless, these are know since 2007 [76] and can also be computed with FDC as done in [77]. In particular, there is no specific difficulty to integrate the α 3 S and α 4 S contributions in P T since they are finite at P T = 0.
However, as already noted in [29], such NLO results tends to shows negative values at low P T which can have a non-negligible impact on the total (i.e. P T -integrated) cross section. To our knowledge, the energy dependence of the CSM at NLO has never been studied in detail. This is done below.
Figs. 5 show the energy dependence of the NLO CSM (7 curves). Note that, if a curve is not shown until 14 TeV, this indicates that the total yield got negative. The 3 red curves correspond to the default scale choices (µ R = µ F = 2m Q ) and are indicative of the heavy-quark mass uncertainty, on the order of a factor of 4 for the J/ψ and 2.5 for the Υ. In the former case, all 3 curves end up to be negative somewhere between 500 GeV and 2 TeV. Note also that the upper curve at low energies, i.e. for m c = 1.4 GeV, is the first to get negative and crosses the other ones as if the negative contribution were more important for lighter systems 13 . In the Υ case, these 3 curves do not become negative at high energies -we have checked it up to √ s = 100 TeV. Nonetheless, they start to significantly differ from the LO curves (3 blue curves) above 1 TeV, contrary to the good LO vs NLO convergence found at RHIC energies in [28]. One might thus be tempted to identify this weird energy behaviour to a low-x effect.
Going further in the J/ψ case, one can vary the factorisation and renormalisation scales about the default choice. Doing so, one obtains two classes of curves. For µ R > µ F (pink and orange), the yield remains positive, but it is not less unphysical for it to be practically constant as the energy   increases between 1 and 10 TeV ! The only way to recover a semblance of increase is to take a large value of µ R -and seemingly also a small value for µ F . Obviously, whatever the reason for this behaviour is, for large enough µ R , the QCD corrections which are proportional to α s (µ R ) necessarily get smaller and any difference between LO and NLO results should decrease. In the Υ case, as for the J/ψ case, both curves with µ R > µ F (pink and orange) correspond to the highest yields at high energies and the lowest at low energies. When one chooses µ F > µ R (purple and green), the high-energy yields become negative both for J/ψ and Υ. In many respect, these observations are very similar to those made on the 1 S [8] 0 . Such a pathological behaviour may thus not be related to the nature to the final state (see also next section).
Large NNLO corrections are expected to show up at large energies (low x) as discussed in [78]. It is not clear if they could provide a solution to this issue. Another way to solve this might be to resum initial state radiation as done in the CEM [79] and for some CO channels [16].
At the light of such results, the most that one can reasonably say is that the NLO CSM results may be reliable for Υ up to 200 GeV and for J/ψ up to 60 GeV, that is up to √ s about 20 times the quarkonium mass. Above these value, the best that we have is the Born order results.
7.2 Spin-singlet pseudo-scalar quarkonia: η c and η b Contrary to the spin-triplet case, one can obtain analytical formulae [22,80] for the spin-singlet pseudo-scalar production cross section such as that of η c and η b . This can in principle be of some help to understand the weird energy behaviour of the CS 3 S 1 yield and of some CO channels. Indeed, the LO production occurs as for some CO channels without final-state-gluon radiation. In fact the final state is simply colourless.
As can be seen on Figs. 6, the issue is similar in many respects but for the fact that one does not obtain negative yields for the η b . For the η c , the curves for µ R > µ F remains positive at high energies -as for the J/ψ. One also sees that the crossing of the central LO and NLO curves occurs at larger √ s than for the 3 S 1 states. However, such small quantitative differences may be due to the fact that we computed the y-integrated cross sections using the analytical expressions of [22] instead of the y-differential cross section at y = 0.
We have investigated this in more details by looking at the different NLO contributions (the real emissions from gg and qg fusion as well as the virtual (loop) contributions) in order to see which channels induce the negative contributions and for which scale/mass values. However, it must be stressed that the decomposition between these different contributions depend on the regularisation method used. For instance, the decomposition is drastically different when using FDC -with sometimes a very large cancellation between the positive real-emission gg contribution and the negative sum of the Born and loop gg contributions -and the formulae of [22,80] -where all the gg contributions are gathered. Yet, we checked that we obtain exactly the same results with both methods; the regularisation method or numerical instabilities cannot be the source of the issues observed above.
As regards the qg contribution, only µ F /m Q matters to tell whether it will change sign. For µ F close to m Q and below, it will be positive (negative) at small (large) √ s. Otherwise, it remains negative for any √ s. The value of µ R only influences the normalisation.
As what concerns the gg contributions, which are expected to be dominant at high energies, both µ F /m Q and µ R /µ F matter. For µ F m Q , the gg contribution monotonously increase as function √ s irrespective of   µ R /µ F . For µ F m Q , the gq one is rather small and, despite being negative, does not induce a turn over in the increase of the cross section. For µ F 2m Q , the gg contribution gets negative at large √ s for µ R ≤ µ F . For µ R > µ F , it remains always positive. Yet the sum gg + gq can still become negative since, in some cases, gq increases faster with √ s. All this seems in lines, for the gg [gq] part, with the formulae (C.25) and (C.26) [(C.32) and (C.35)] in the appendix C of [22]. Both contributions indeed exhibit logarithms of µ F /m Q multiplied by a factor function of m Q /ŝ.
As we can see, the results are already difficult to interpret for the spin singlet case. For the spin triplet, we do not have similar analytical results. We can only guess that the structure is similar.
What seems surprising is that when one inspects similar expressions for the η Q production at NLO in the TMD factorisation [32], such negative terms do not appear as ob-vious. We are thus entitled to wonder whether such a formalism, which automatically resums logarithm of transverse momenta, may provide the solution to this issue. Another possible solution may be the consideration of NNLO corrections, which may show opposite signs to that at NLO. This is obviously beyond the scope of our analysis.

Conclusion
We have performed the first full analysis of the energy dependence of the quarkonium-production cross section at one-loop accuracy both in NRQCD and in the CSM. Taken at face value, our results would indicate a severe break down of NRQCD universality -in line with the previous analysis of Maltoni et al. [15] -unless one keeps the LDMEs close to the fit of Butenschoen and Kniehl, which however disagrees with the J/ψ polarisation measurements and the η c cross sections.
The situation is however slightly more intricate since we have uncovered a weird -sometimes unphysical-behaviour at large energies where one approaches the small-x regime where non-linear effects in the parton densities may be relevant. This certainly casts doubts on the numerical values we obtained at LHC energies since collinear factorisation on which we based our analysis could break down.
Yet, up to a few hundred GeV, the energy dependence of the different octet channels at NLO seems well behaved and there is no reason to doubt this. In this region, the NLO yield prediction by NRQCD after fitting the mid-and high-P T quarkonium data -i.e. the yield and its polarisation-would overshoot the data by a factor ranging from 4 to more than 10. The same holds true at LO (see Appendix A). To reproduce the data, the CO LDMEs should be much smaller than what they are found in order to be to reproduce the Tevatron and LHC P T -differential cross section in the case of the J/ψ and ψ(2S ).
On the other hand, the LO CSM provides a decent energy dependence in agreement with the existing data, except for √ s ≤ 40 GeV and is therefore up to ten times below the -data-overshooting-CO contributions. At one loop, the results are however ill-behaved for the charmonia and the cross sections can even become negative at large √ s for some -reasonable-scale choices. The situation is a bit better for Υ(1S ). The same occurs for the spin-singlet quarkonia. In this case, one-loop results exist in the TMD factorisation approach and do not seem to be prone to such an issue. In the case of double J/ψ production, the energy dependence at one loop seems well-behaved [81]. Finally, we investigated the energy dependence of the yield in the CEM where the final states are treated rather differently and we did not find any specific problems 14 . We are therefore tempted to attribute this problem to initial-state effects.
In [82], Ma and Venugopalan obtained a good description of the low-P T J/ψ data over a wide range of energy by, on the one hand, using the LDMEs from [17] -our second set-and, on the other, a CGC-based computation of the low-P T dependence. In reproducing the data, they found that the CS contribution is only 10% of the total yield. This 10% is reminiscent of the factor 10 between the CS and CO in our "collinear" study. From our viewpoint, it looks as if the specific ingredient of this CGC-based computation would correspond to an effective reduction of the two-gluon flux 15 by a factor of 10. It is therefore very interesting to find out new processes which would be sensitive to this physics.
The negative yields obtained in the collinear case -observed for 1 S [8] 0 , 1 S [1] 0 , 3 S [1] 1 , 3 P [8] J channels-could also be cured by adding the large contributions of the one-loop amplitude squared -thus positive. This may look like an adhoc solution which certainly questions the convergence of the perturbative series in α s . However, large NNLO corrections have already been discussed 10 years ago in [78]. Another path towards a solution may also be higher-twist contributions [83] where two gluons come from a single proton as recently rediscussed in [84]. In any case, whatever the explanation for this situation may be, past claims that colour-octet transitions are dominantly responsible for low-P T quarkonium production were premature in light of the results presented here.   Based on this fit, we derive the energy dependence for the direct J/ψ which is shown on Fig. 7. Without any surprise, the results badly overshoot the world data.