Feasibility of top quark measurements at LHCb and constraints on the large-$x$ gluon PDF

The forward LHCb acceptance opens interesting possibilities for studying precision Standard Model hard processes in a kinematic region beyond the reach of ATLAS and CMS. In this paper we perform a feasibility study for cross-section measurements of top quark pairs with the LHCb detector, with an analysis of signal and background rates for selected final states, and determine the potential precision achievable at $\sqrt{s} =$ 7 and 14~TeV. We then study the dependence of the theoretical uncertainties on the pseudorapidity distribution of top quarks produced in pair production at NLO, and observe that a cross-section measurement at high pseudorapidity has enhanced sensitivity to the high-$x$ gluon PDF compared to measurements in the central region. Based on simulated pseudodata, the impact of a 14~TeV cross-section measurement on the gluon PDF and charge asymmetry is quantified.


Introduction
Top quark measurements at LHCb were initially proposed in Ref. [1]. It was demonstrated that $t\bar{t}$ production can be probed at high pseudorapidity by partially reconstructing ($\mu^\pm$ + b-jet) the total system. It was also proposed that this method of partial reconstruction can be used to measure the pair production charge asymmetry by comparing the rate of top ($\mu^+$-tagged) to anti-top ($\mu^-$-tagged) events as a function of lepton pseudorapidity within the LHCb acceptance. The main motivation is that, at a proton-proton collider, the rapidity of the heavy quarks can be written at leading order (LO) in terms of the momentum fractions $x_1$, $x_2$ of the incoming partons, the transverse mass $m_T = \sqrt{m^2 + p_T^2}$ and the partonic centre-of-mass energy squared $\hat{s} = 2 m_T^2 (1 + \cosh\Delta y)$.
This means that, at the momentum exchange scales required to produce pairs of top quarks, top quarks produced in the forward region have a high probability of having come from a high-$x_1$ incoming parton, where the ratio of quark to gluon parton distribution functions (PDFs) is larger, a consequence of the valence content of the proton. This results in less dilution of the charge asymmetry, which arises from the colour structure of interfering diagrams, $qX \to t\bar{t}Y$, with quarks in the initial state [2]. However, this also means that forwardly produced top quarks from $gg$-scattering processes are produced from incoming gluons at high $x_1$. This is presented in Fig. 1, where the ratio of production mechanisms, $(q\bar{q} + |qg|)/\mathrm{total}$, contributing to $t\bar{t}$ production is shown as a function of the arithmetic mean of the pseudorapidity distributions of $t$ and $\bar{t}$ (the pseudotop, $\tilde{t}$) for 7 (left) and 14 TeV (right) centre-of-mass energies. Note that the contribution from $gg$-scattering is dominant across the entire range of phase space for both centre-of-mass energies.
There have been large efforts in the QCD community to improve the precision of top quark pair production predictions, in particular the completion of the full next-to-next-to-leading order (NNLO) calculation [3,4,5,6] as well as the resummation of soft gluon emissions to next-to-next-to-leading log (NNLL) accuracy [7,8,9]. The reduced scale uncertainty in these predictions is crucial to gaining information on other sources of theoretical uncertainty such as the high-$x$ gluon PDF, $\alpha_s$ and the top mass. A recent study of the impact of these uncertainties on the inclusive cross-section at NNLO+NNLL accuracy can be found in Ref. [10], where it is observed that such a measurement, with minimal scale uncertainties, has the potential to strongly constrain the gluon PDF. It is clear that a differential result to the same accuracy is highly desirable and will be available in the not-too-distant future.
In fact, differential cross-section results and studies using approximate NNLO calculations and resummation techniques have already been obtained in [11,12,13,14,15]. To this end, we demonstrate the increased sensitivity of pair production cross-section measurements at high rapidity to the gluon PDF at NLO accuracy.
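The leading-order kinematic argument of the Introduction can be illustrated numerically. The sketch below uses the standard LO two-body relations $x_{1,2} = (m_T/\sqrt{s})\,(e^{\pm y_3} + e^{\pm y_4})$, which reproduce $\hat{s} = 2 m_T^2 (1 + \cosh\Delta y)$; the function names, the labels `y3`, `y4` for the two heavy-quark rapidities and the example transverse momentum are assumptions made for illustration only.

```python
import math

def parton_fractions(y3, y4, m, pt, sqrt_s):
    """Return (x1, x2) for LO production of a heavy-quark pair.

    y3, y4 : rapidities of the two heavy quarks (labels assumed here)
    m, pt  : heavy-quark mass and transverse momentum in GeV
    sqrt_s : hadronic centre-of-mass energy in GeV
    """
    m_transverse = math.sqrt(m**2 + pt**2)
    x1 = m_transverse / sqrt_s * (math.exp(y3) + math.exp(y4))
    x2 = m_transverse / sqrt_s * (math.exp(-y3) + math.exp(-y4))
    return x1, x2

def partonic_s(y3, y4, m, pt):
    """Partonic centre-of-mass energy squared, 2 m_T^2 (1 + cosh(dy))."""
    return 2.0 * (m**2 + pt**2) * (1.0 + math.cosh(y3 - y4))

if __name__ == "__main__":
    # Illustrative values only: pT = 50 GeV is a hypothetical choice.
    for label, (y3, y4) in {"central": (0.0, 0.0), "forward": (3.0, 2.0)}.items():
        x1, x2 = parton_fractions(y3, y4, 173.25, 50.0, 14000.0)
        print(f"{label}: x1 = {x1:.4f}, x2 = {x2:.5f}")
```

Moving both quarks forward raises $x_1$ and lowers $x_2$, which is the behaviour that makes forward production sensitive to high-$x$ partons.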

LHCb analysis at 7, 14 TeV
This section aims to provide an estimate of the potential statistical precision of a cross-section measurement achievable with the current 7 TeV data ($\int L\,dt = 1$ fb$^{-1}$) as well as with the projected 14 TeV data sample after 1 year of running ($\int L\,dt = 5$ fb$^{-1}$). As pointed out in Ref. [1], top quarks can be identified through their decay $t \to (W \to \mu\nu_\mu)b$, where the muon and the $b$ are registered by the detector. Indeed, in the full $t\bar{t}$ decay it is also possible to reconstruct a $b$ and a $\mu$ along with $W$ decay products, radiated jets (which tend to be forward) or $b$ quarks which do not come from the same parent top, as demonstrated in Ref. [16]. In the following analysis we will consider both $\mu b$ and $\mu bj$ final states. Using multiple final states, requiring a different number of b-tags, is a crucial cross-check of the background modelling, in particular of the $W$+($b$)jets processes. Given that top pairs are produced asymmetrically beyond LO, we introduce the 'pseudotop' object, $\tilde{t}$, whose distributions are defined as the arithmetic mean of the corresponding $t$ and $\bar{t}$ distributions. Thus, the $\mu b$ and $\mu bj$ final states are labelled as $\tilde{t}_{\mu b}$ and $\tilde{t}_{\mu bj}$. Introducing this definition removes the small bias introduced by the charge asymmetry. Given that the asymmetry in the backgrounds is driven by the quark valence content, which is well constrained by DIS data, the main uncertainty on the backgrounds arises from the total normalisation and so is not affected by this definition.

Signal and background
The $t\bar{t}$ signal and backgrounds are simulated using POWHEG [17,18,19,20,21,22,23,24] with the central CT10wnlo [25] PDF set and then matched to Pythia 8.176 [26], with the exception of $Z$+b-jets, where the matrix element is produced using MadGraph5 [27] with CTEQ6l1 [28]. The $t\bar{t}$ signal is generated with the reference factorisation and renormalisation scales fixed to the top mass ($m_t = 173.25$ GeV). It is found that the difference between the parton-level POWHEG+Pythia 8.176 and MCFM [29] pseudotop pseudorapidity distributions is negligible (see footnote 1). The main backgrounds are identified as single top, $W$+($b$)jets and $Z$+($b$)jets. The QCD background originating from di-b-jet production, where a secondary muon passes the isolation and kinematic cuts, has previously been shown to be negligible [1].
Given a di-b-jet background rejection of $O(10^{-5})$, and that the relative increase in the ratio $(pp \to t\bar{t})/(pp \to b\bar{b})$ from 7 to 14 TeV is $\approx 3$, this background is also ignored for the 14 TeV analysis. The t-channel single top process (ST, tch) is modelled in both the 4- and 5-flavour schemes: the 4-flavour cross-section is normalised to that found in the 5-flavour case, and the average of these distributions is plotted with a systematic error given by the envelope of the two descriptions (see footnote 2). There is a small combined contribution (below 10% of the $t\bar{t}$ signal) from $tW$ and s-channel single top which is not included. The $Z$+($b$)jets background arises from the leptonic decay of a $Z$ where only one of the leptons is detected in association with either a correctly identified (in $Zb$ or $Zg \to b\bar{b}$) or mis-tagged b-jet. The $W$+($b$)jets background is separated into $W$+jets, where the jet is light flavour (u,d,s,c,g) and is mis-tagged as a b-jet, and $W$+b-jets (where $g \to b\bar{b}$) with a correctly identified b-jet.

Footnote 1: Provided the switch SpaceShower:phiIntAsym is turned off during the shower. This switch introduces undesired colour-reconnection effects which are already accounted for correctly in the NLO matrix element.

Footnote 2: The top decay is not included in the matrix element in the 4-flavour scheme.

Selection and reconstruction
Jets are defined with a radius parameter $R$ of 0.5 and $p_T > 15$ GeV, and are built with the anti-$k_t$ algorithm using the FastJet3 [30] software. In this analysis, b-jets are defined to be jets which are matched to a parton-level b-quark from the hard process (within $R$). Charged leptons are also required to be isolated ($\Delta R(\mu, \mathrm{jet}) \geq R$), which is necessary to suppress the QCD background. It was found that reconstructed jets with an $R$ parameter of 0.7 better match the parton energy and increase the b-matching efficiency. However, the combination of the lepton isolation requirement and background reduction favours an $R$ parameter of 0.5. Kinematic cuts of $p_T > 60$ GeV on the leading b-jet and $p_T > 20$ GeV on the muon and sub-leading jet are imposed, which dramatically reduce the background whilst having a comparatively small effect on the signal. Jets and muons are also required to be within the pseudorapidity range $2 < \eta < 4.5$. An efficiency of 75% is applied to muons, which is estimated from the combined trigger efficiency of $\approx 75\%$ and an identification/tracking efficiency of $> 95\%$ for high-$p_T$ ($p_T > 20$ GeV) muons from Ref. [31]. A b-tagging efficiency of 70% with a corresponding mis-tag rate for light jets of 1% is assumed. A more detailed study could separate c-jet background processes and apply an appropriate charm mis-tag rate; this is left for the data analysis.
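The selection described above can be sketched for the $\mu b$ final state as follows. This is a minimal illustration under stated assumptions, not the analysis code: the `Candidate` container and the function names are hypothetical, events are assumed to arrive as already-clustered jets with a truth b-flag, and the muon and b-tag efficiencies are applied as simple multiplicative factors.

```python
import math
from dataclasses import dataclass

R_JET = 0.5                     # jet radius parameter, also used for isolation
ETA_MIN, ETA_MAX = 2.0, 4.5     # pseudorapidity acceptance quoted in the text

@dataclass
class Candidate:                # hypothetical container for a muon or a jet
    pt: float                   # transverse momentum in GeV
    eta: float
    phi: float
    is_b: bool = False          # jet matched to a b-quark from the hard process

def delta_r(a, b):
    """Angular separation in the (eta, phi) plane, with phi wrapped to [-pi, pi)."""
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.eta - b.eta, dphi)

def in_acceptance(obj):
    return ETA_MIN < obj.eta < ETA_MAX

def select_mu_b(muon, jets):
    """Apply the muon, isolation and leading b-jet requirements."""
    if not (muon.pt > 20.0 and in_acceptance(muon)):
        return False
    jets_in = [j for j in jets if j.pt > 15.0 and in_acceptance(j)]
    if any(delta_r(muon, j) < R_JET for j in jets_in):
        return False            # muon is not isolated from jets
    return any(j.is_b and j.pt > 60.0 for j in jets_in)

def expected_yield(n_selected, mu_eff=0.75, b_eff=0.70):
    """Scale a selected-event count by the assumed muon and b-tag efficiencies."""
    return n_selected * mu_eff * b_eff
```

The $\mu bj$ final state would add a requirement of a further jet with $p_T > 20$ GeV inside the acceptance.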
The expected number of pseudotop events in 1 fb$^{-1}$ as a function of reconstructed invariant mass (upper) and muon pseudorapidity (lower) for both the $\mu b$ (left) and $\mu bj$ (right) channels at 7 TeV is plotted in Fig. 2. The background and signal are stacked, and the resultant uncertainty band corresponds to the statistical error for the given choice of binning.
The single top (ST, tch) distribution is the envelope of the 4- and 5-flavour predictions, as previously mentioned. The 4-flavour differential cross-section tends to be slightly larger at high invariant masses, in particular beyond $m_t$, due to a larger number of events where the collinear spectator b-quark and the lepton are reconstructed together within the acceptance. The $\eta_\mu$ distributions (lower) have overwhelming statistical uncertainties in the pseudorapidity bins beyond $\eta = 3$. This is the region of phase space where the asymmetry in the $t/\bar{t}$ pseudorapidity distributions in pair production is largest, which indicates that more data is required for a statistically meaningful differential charge asymmetry measurement at LHCb.
The same analysis of signal and background at 14 TeV is also presented. The expected number of pseudotop events in 5 fb$^{-1}$ as a function of reconstructed invariant mass (upper) and of muon pseudorapidity (lower) for both the $\mu b$ (left) and $\mu bj$ (right) channels at 14 TeV is shown in Fig. 3. The larger data sample and the dramatic increase in the $t\bar{t}$ cross-section at 14 TeV suggest that high statistical precision will be achievable for several high-multiplicity pseudotop final states. The wider grey fill on the $t\bar{t}$ signal corresponds to the statistical precision expected with 5 fb$^{-1}$, while the black band corresponds to 50 fb$^{-1}$ (achievable after $\approx 10$ years of running). It is clear that high statistical precision ($< 2\%$) can be obtained across the entire acceptance in $\eta$, even for a fine choice of binning (bin width of 0.3 in $\eta$).
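The way the statistical precision improves between the 5 and 50 fb$^{-1}$ samples can be sketched with simple Poisson counting. The example assumes a background-free bin and a hypothetical yield of 400 events in 5 fb$^{-1}$; neither number is taken from the figures above.

```python
import math

def scale_yield(n_events, lumi_from, lumi_to):
    """Scale an expected event count linearly with integrated luminosity."""
    return n_events * lumi_to / lumi_from

def relative_stat_uncertainty(n_events):
    """Poisson relative uncertainty, 1/sqrt(N), ignoring backgrounds."""
    return 1.0 / math.sqrt(n_events)

if __name__ == "__main__":
    n_5fb = 400.0                          # hypothetical yield in one eta bin
    n_50fb = scale_yield(n_5fb, 5.0, 50.0)
    print(f"5 fb^-1 : {100 * relative_stat_uncertainty(n_5fb):.1f}%")
    print(f"50 fb^-1: {100 * relative_stat_uncertainty(n_50fb):.1f}%")
```

A tenfold increase in luminosity reduces the relative statistical uncertainty by a factor of $\sqrt{10} \approx 3.2$.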

Pair production cross-section
Given the promising signal yield and observability at 7 and 14 TeV, we study the theoretical uncertainties on the signal at the parton level within the LHCb acceptance. The parton-level NLO results are produced with MCFM and compared to the inclusive NNLO+NNLL (NNLO*) results presented in Ref. [10]. The LHCb cross-section for pseudotop production, $\sigma^{\rm LHCb}$, is the pseudotop cross-section integrated over the LHCb pseudorapidity acceptance. In accordance with Ref. [10], the theoretical uncertainties are obtained in the following way.

Top mass
The central top quark pole mass is assumed to be 173.25 GeV. The dependence of the cross-section on the top mass uncertainty, $\delta_{m_t}$, is found by varying the mass within the range $m_t \in [171.75, 174.75]$ GeV and then taking the average of the two variations. This range is in agreement with the current PDG value of $m_t = 173.07 \pm 0.52 \pm 0.72$ GeV [32] and the latest LHC combination of $m_t = 173.29 \pm 0.23 \pm 0.792$ GeV [33] from direct measurements.
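The mass variation can be sketched as below. The cross-section function is a toy linear model with a hypothetical slope of 3% per GeV, chosen only to make the arithmetic concrete; in practice each point would come from a separate MCFM run. The reading of "taking the average" as the mean of the upward and downward shifts is an assumption.

```python
M_CENTRAL = 173.25   # GeV, central pole mass used in the text
DELTA_M = 1.5        # GeV, half-width of the range [171.75, 174.75]

def toy_sigma(m, sigma0=100.0, slope=0.03):
    """Toy cross-section (arbitrary units) falling linearly with the mass."""
    return sigma0 * (1.0 - slope * (m - M_CENTRAL))

def mass_uncertainty(sigma_of_m, m0=M_CENTRAL, dm=DELTA_M):
    """Relative uncertainty: mean of the shifts at m0 - dm and m0 + dm."""
    s0 = sigma_of_m(m0)
    shift_low = abs(sigma_of_m(m0 - dm) - s0)
    shift_high = abs(sigma_of_m(m0 + dm) - s0)
    return 0.5 * (shift_low + shift_high) / s0

if __name__ == "__main__":
    rel = mass_uncertainty(toy_sigma)
    print(f"relative uncertainty for +-1.5 GeV: {100 * rel:.2f}%")
    print(f"per GeV: {100 * rel / DELTA_M:.2f}%")
```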

PDF
The following NLO PDF sets are studied: ABM11 (5flv) [34], CT10wnlo, HERAPDF1.5 [35], MSTW08nlo68cl [36] and NNPDF2.3nlo [37], where the central values $\alpha_s(M_Z) = 0.118$, 0.118, 0.1176, 0.120 and 0.119 are chosen for each set respectively. Asymmetric and symmetric uncertainties are evaluated in the usual way from the eigenvector members, where $X_i^\pm$ represents the observable calculated from eigenvector member $S_i^\pm$. The uncertainties obtained for each PDF collaboration are quoted at 1$\sigma$ confidence level (CL), where the CT10 uncertainties provided at 90% CL have been scaled down by a factor of 1.645. The PDFs are accessed through the LHAPDF interface [38].
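For Hessian sets, the commonly used "master formulae" for asymmetric and symmetric uncertainties can be sketched as follows. This is a standard convention assumed here for illustration; the function names are hypothetical, and Monte Carlo sets such as NNPDF would instead use the standard deviation over replicas.

```python
import math

def hessian_asymmetric(x0, x_plus, x_minus):
    """Return (delta_up, delta_down) from eigenvector pairs.

    x0      : observable for the central member
    x_plus  : observables for the S_i^+ members
    x_minus : observables for the S_i^- members
    """
    up = math.sqrt(sum(max(xp - x0, xm - x0, 0.0) ** 2
                       for xp, xm in zip(x_plus, x_minus)))
    down = math.sqrt(sum(max(x0 - xp, x0 - xm, 0.0) ** 2
                         for xp, xm in zip(x_plus, x_minus)))
    return up, down

def hessian_symmetric(x_plus, x_minus):
    """Symmetric uncertainty, half the quadrature sum of pair differences."""
    return 0.5 * math.sqrt(sum((xp - xm) ** 2
                               for xp, xm in zip(x_plus, x_minus)))

def rescale_90_to_68(delta):
    """Convert a 90% CL uncertainty to 1 sigma, as done for CT10 in the text."""
    return delta / 1.645

if __name__ == "__main__":
    # Hypothetical two-eigenvector example in arbitrary units.
    up, down = hessian_asymmetric(100.0, [103.0, 99.0], [98.0, 100.5])
    print(f"+{up:.2f} / -{down:.2f}")
```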

Scale
The scale uncertainty, $\delta_{\rm scale}$, is found by varying the factorisation and renormalisation scales, $\mu_F$ and $\mu_R$, independently by a factor of two in both directions around the top mass, subject to the constraint that the ratio $\mu_F/\mu_R$ also stays within this range. The central value is $\mu_F = \mu_R = m_t$.
In the pseudotop differential cross-section with respect to pseudorapidity, the magnitude of $\delta_{\rm PDF}$ increases with pseudorapidity, as this corresponds to events produced from partons at both very high and very low $x$, where the gluon and anti-quark PDFs respectively are not well known. There is also a rapidity dependence of $\delta_{\alpha_s}$, which arises indirectly from the gluon PDF: an increase in $\alpha_s$ leads to a smaller gluon PDF at lower values of $x$, while the momentum sum rule compensates for this by increasing the gluon PDF at large $x$, resulting in a rapidity-dependent uncertainty. There is also a small pseudorapidity dependence of the scale uncertainty due to differences in the physical scale, $Q^2$, for forward events.
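The restricted scale variation described above corresponds to the familiar seven-point prescription, which can be enumerated as in the sketch below. Taking the envelope of the resulting cross-sections as the uncertainty is an assumption of this example, as are the function names and the numbers in the usage lines.

```python
from itertools import product

def scale_variations(factors=(0.5, 1.0, 2.0)):
    """Pairs (k_F, k_R) with mu_F = k_F * m_t and mu_R = k_R * m_t,
    keeping only those with 1/2 <= mu_F / mu_R <= 2."""
    return [(kf, kr) for kf, kr in product(factors, repeat=2)
            if 0.5 <= kf / kr <= 2.0]

def scale_envelope(sigma_central, sigmas):
    """Upward and downward uncertainty from the envelope of the variations."""
    up = max(max(sigmas) - sigma_central, 0.0)
    down = max(sigma_central - min(sigmas), 0.0)
    return up, down

if __name__ == "__main__":
    for kf, kr in scale_variations():
        print(f"mu_F = {kf} m_t, mu_R = {kr} m_t")
    # Hypothetical cross-sections (arbitrary units) for the seven points.
    print(scale_envelope(100.0, [90.0, 95.0, 100.0, 104.0, 108.0, 97.0, 101.0]))
```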
The contributions from the individual sources of systematic uncertainty to the LHCb cross-section are now evaluated and compared to the inclusive NLO and NNLO* results from Ref. [10]. The total uncertainty is found by combining the individual uncertainties following the recommendation of the Higgs Cross Section Working Group [39]. The 7 and 14 TeV results are summarised in Table 1 and Table 2, where it is found that a 1 GeV uncertainty on $m_t$ translates into a 3.0% and 2.7% uncertainty on the cross-section at 7 and 14 TeV respectively.

Table 1. Summary of inclusive (inc.) and differential (LHCb) cross-sections at NNLO+NNLL (NNLO*) and NLO accuracy and associated theoretical uncertainties at 7 TeV, for PDF sets as described in the text.
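One commonly used way of combining such uncertainties adds the scale uncertainty linearly to the quadrature sum of the PDF, $\alpha_s$ and top mass uncertainties. The sketch below implements that convention as an assumption, since the explicit combination formula is not reproduced in the text above; the function name and inputs are illustrative.

```python
import math

def total_uncertainty(d_scale, d_pdf, d_alphas, d_mass):
    """Assumed convention: scale added linearly to the quadrature sum
    of the parametric (PDF, alpha_s, top mass) uncertainties."""
    return d_scale + math.sqrt(d_pdf**2 + d_alphas**2 + d_mass**2)

if __name__ == "__main__":
    # Hypothetical relative uncertainties in percent.
    print(total_uncertainty(d_scale=5.0, d_pdf=3.0, d_alphas=4.0, d_mass=0.0))
```

Adding the scale term linearly reflects that it is not a Gaussian uncertainty, which makes the combination more conservative than a pure quadrature sum.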
The enhanced sensitivity of measurements at high pseudorapidity can be seen by comparing the relative uncertainties for the inclusive and differential LHCb cross-sections. This comparison is done by taking the ratio of their relative uncertainties, $(\delta_X/\sigma)^{\rm LHCb} / (\delta_X/\sigma)^{\rm inc.}$, which highlights the sensitivity of measurements at LHCb to PDF uncertainties, in particular for the sets provided by NNPDF and CT10. The results are summarised in Tables 3 and 4.

Table 2. Summary of inclusive (inc.) and differential (LHCb) cross-sections at NNLO+NNLL (NNLO*) and NLO accuracy and associated theoretical uncertainties at 14 TeV, for PDF sets as described in the text.

Table 4. Ratio of relative uncertainties at 14 TeV LHCb/inclusive cross-sections at NLO.

The ABM prediction lies below the other predictions for the differential and inclusive NLO results, and for the NNLO results. At NNLO this can be understood from both a lower value of $\alpha_s(M_Z)$ and a softer gluon PDF at large $x$ [10,40]. At NLO, even for an identical best-fit value of $\alpha_s(M_Z)$, the prediction from ABM is substantially lower than that from CT10, as shown in Fig. 4. In fact, the discrepancy between the central value of ABM and the other predictions is enhanced at high rapidity as a result of the soft large-$x$ gluon PDF. The predictions from different eigenvectors were found to be very stable, with the exception of members 10 and 13, resulting in a small PDF uncertainty.
Although the PDF uncertainty is small, including LHCb $t\bar{t}$ data in a PDF fit will impact the central value of the gluon PDF in the large-$x$ region.
At NLO the contribution from the scale variation to the total uncertainty is dominant. However, given the recent theoretical advances in pair production predictions, it is clear that a cross-section measurement in the forward region can be used to constrain the gluon PDF description at high $x$. It is expected that the observed large ratio of the relative PDF uncertainties between inclusive and LHCb measurements is still present at NNLO. This can be seen by comparing the relative uncertainty on the gluon PDF as a function of $x$ for both the CT10 NLO and NNLO sets, for $\delta_{\rm PDF}$ (left) and $\delta_{\alpha_s}$ (right), as shown in Fig. 7. The uncertainties at NLO and NNLO are of comparable size.

Constraining the gluon PDF
Due to the high statistical precision expected within 1 year of running (5 fb$^{-1}$) at 14 TeV, a differential measurement in bins of pseudorapidity across the entire LHCb acceptance is viable. To demonstrate the potential power of such a measurement in constraining the gluon PDF, we apply a reweighting to the CT10 and NNPDF sets based on a hypothetical measurement of $\sigma^{\rm LHCb}$. This is done following the prescriptions of Refs. [41,42,43,44], where a Bayesian method based on statistical inference is used. The procedure is easily performed for the NNPDF Monte Carlo sets, while for CT10 (a Hessian set) it is necessary to first generate a set of random PDFs from the eigenvector set. This is done working in the basis of observables, spanning the $N$ eigenvectors. Hypothetical and random observables are generated from the eigenvector members using random numbers $R_{kj}$, which are Gaussian-distributed with zero mean and unit variance.
The choice of negative or positive displacement, $S_j^-$ or $S_j^+$, depends on the sign of $R_{kj}$. For the generated CT10 and NNPDF sets studied, the numbers of replicas are 1000 and 100 respectively. This procedure is applied to the evolved gluon PDF $g(x, Q^2)$ for CT10 and then compared to the Hessian result in Fig. 8, where the relative uncertainty for the replica and Hessian sets is plotted with respect to the Hessian central value. The difference between the two sets occurs at large $x$, where the PDF uncertainties are most asymmetric. Each replica is then assigned a weight computed from its $\chi^2_k$ with respect to the pseudodata, with the denominator fixing the normalisation. After applying this reweighting technique, the number of effective remaining replicas can be found by calculating the Shannon entropy. The effective numbers of replicas after applying the reweighting to the random NNPDF ($N_{\rm rep} = 100$) and CT10 ($N_{\rm rep} = 1000$) sets for different experimental uncertainties are provided in Table 5.
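The replica generation and reweighting steps can be sketched in the basis of observables as below. This is an illustration under stated assumptions rather than the code used for the study: random observables are built as $X_k = X_0 + \sum_j [X(S_j^\pm) - X_0]\,|R_{kj}|$, the weights are taken proportional to $\exp(-\chi^2_k/2)$ (the form appropriate for a single data point), and the effective number of replicas is $N_{\rm eff} = \exp\{(1/N_{\rm rep}) \sum_k w_k \ln(N_{\rm rep}/w_k)\}$. All names and input numbers are hypothetical.

```python
import math
import random

def random_observables(x0, x_plus, x_minus, n_rep, seed=1):
    """Generate n_rep random observables from Hessian eigenvector members."""
    rng = random.Random(seed)
    replicas = []
    for _ in range(n_rep):
        x = x0
        for xp, xm in zip(x_plus, x_minus):
            r = rng.gauss(0.0, 1.0)
            member = xp if r >= 0.0 else xm   # sign of R_kj picks S+ or S-
            x += (member - x0) * abs(r)
        replicas.append(x)
    return replicas

def replica_weights(observables, measured, sigma_exp):
    """Weights proportional to exp(-chi2/2), normalised to sum to N_rep."""
    raw = [math.exp(-0.5 * ((x - measured) / sigma_exp) ** 2)
           for x in observables]
    norm = sum(raw) / len(raw)
    return [r / norm for r in raw]

def effective_replicas(weights):
    """Effective number of replicas from the Shannon entropy of the weights."""
    n = len(weights)
    entropy = sum(w * math.log(n / w) for w in weights if w > 0.0) / n
    return math.exp(entropy)

if __name__ == "__main__":
    # Hypothetical observable (arbitrary units) with two eigenvector pairs.
    reps = random_observables(100.0, [103.0, 99.0], [98.0, 100.5], n_rep=1000)
    for rel_err in (0.04, 0.08):
        w = replica_weights(reps, measured=100.0, sigma_exp=rel_err * 100.0)
        print(f"{100 * rel_err:.0f}% uncertainty: N_eff = {effective_replicas(w):.0f}")
```

A smaller experimental uncertainty concentrates the weight on fewer replicas and therefore lowers $N_{\rm eff}$, which is the quantity tabulated for the different assumed precisions.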
The effect of this reweighting on the evolved gluon PDF is presented in Fig. 9. The largest sensitivity lies within the range $0.1 < x < 0.3$ for 14 TeV pseudodata. The experimental precision achievable at LHCb will therefore have a large impact on future PDF fits within this range. The choice of generating pseudodata from an observable calculated with a different PDF set (the HERAPDF1.5 central value) is considered in Table 6.

Table 5. Effective replicas after reweighting with the inclusion of an LHCb semi-inclusive measurement; the associated experimental uncertainty is within the range 4-8%.

Table 6. Effective replicas after reweighting with the inclusion of an LHCb semi-inclusive measurement generated from the HERAPDF1.5 central value; the associated experimental uncertainty is within the range 4-8%.
For the convenience of the PDF collaborations, we list the eigenvectors (and their directions), for all studied asymmetric Hessian sets, which have a substantial impact on replicas with large $\chi^2_k$ values. Given that the pseudodata values are centred on the observable calculated from the central Hessian member ($\sigma^{\rm LHCb}_0$), this can be quantified directly from the observable calculated for each eigenvector member (Table 7; columns: CT10, HERA, MSTW). The ratios of the evolved gluon and up quark PDFs with respect to the central value for these members are also presented in Fig. 11. The deviations at high $x$ are found to be largest for the gluon PDF, with the exception of the valence content for a few eigenvectors, demonstrating the dominance of the gluon PDF uncertainties on the observable $\sigma^{\rm LHCb}$.

Figure 11. Ratio of evolved quark and gluon PDFs ($f_p(x, Q^2)$) with respect to their corresponding central value for selected members, as described in the text.

These particular eigenvectors are only similar to the list obtained from calculating the inclusive $t\bar{t}$ cross-section. This is due to partial cancellation across the entire pseudorapidity range. Also plotted in Fig. 12 (right) is the ratio of differential cross-sections for the $\tilde{t}_{\mu b}$ final state passing all analysis cuts discussed in the previous section, again for eigenvectors 13$-$ and 13$+$, where the deviation from the central value is larger. This demonstrates that the analysis of the impact of a measurement of $\sigma^{\rm LHCb}$ on the gluon PDF is an underestimate, as more information is contained in a binned differential cross-section. In fact, the kinematic cuts applied in the analysis to improve the signal/background ratio ($p_T > 60$ and 20 GeV for the b-jet and muon respectively) select harder events, which are produced from incoming partons at higher $x_1$, improving the constraints at yet higher $x$. This is demonstrated in Fig. 13, where the incoming parton momentum fraction $x$ is plotted against the event momentum scale squared ($Q^2$).
The left plot corresponds to events where a parton-level top is within the LHCb acceptance, and the right plot to events passing the full analysis cuts. As a larger fraction of events are at high $x_1$ ($\langle x_1 \rangle = 0.28$) after applying the analysis cuts, the sensitivity within this region increases, which can be seen by comparing the bin-by-bin deviation in Fig. 12. Fully quantifying the sensitivity after applying analysis cuts will require a full NLO+PS study for all eigenvector members, as well as knowledge of the cuts which will eventually be used in the analysis.

Figure 13. Event momentum scale, $Q^2$, with respect to incoming parton momentum fraction $x$ for a pseudotop within the LHCb acceptance (left) and for the pseudotop final state $\tilde{t}_{\mu b}$ passing analysis cuts (right).

Application to the charge asymmetry
Improvements to the gluon PDF description at high $x$ are useful for reducing uncertainties on Standard Model (SM) processes, such as Higgs production, as well as on Beyond-SM (BSM) physics processes, which are often swamped by $t\bar{t}$ backgrounds. Another interesting application of an improved high-$x$ gluon PDF is to the prediction of the $t\bar{t}$ charge asymmetry, which is diluted by symmetric $gg$-scattering.
There is tension between NLO predictions and the observed charge asymmetry from forward-backward measurements with the full Tevatron data sets, Refs. [45,46,47,48], where the measured asymmetries are larger than expected. Although the same behaviour is not seen in the current LHC forward-central measurements, Refs. [49,50,51], it is difficult to draw any conclusion, as the combined uncertainties on the LHC measurements are of comparable size to the theoretical predictions. The small asymmetry prediction at the LHC, in comparison to the Tevatron, is a result of the large $gg$-dilution present in multi-TeV $pp$ collisions, as well as of the redefinition of asymmetry variables required because the initial state is symmetric.
The proposal of Ref. [2], and specifically for LHCb in Ref. [1], was to measure the production rate of $t/\bar{t}$ from pair production in the high pseudorapidity bins at the LHC, expressed as an asymmetry between the two rates. Due to the reduction in the dilution from $gg$-scattering, the asymmetry grows substantially with increasing pseudorapidity. With LHCb data sets of 5 and 50 fb$^{-1}$ at 14 TeV, the numbers of $t\bar{t} \to \mu b$ events passing the analysis cuts of Section 2 beyond $\eta = 3.2$ are $O(10^3)$ and $O(10^4)$ respectively. Therefore, an asymmetry measurement with these data sets will also be systematically dominated. We have already demonstrated the sensitivity of cross-section measurements at LHCb to the high-$x$ gluon uncertainties, meaning that the associated PDF systematic for the asymmetry is also large in comparison to central measurements.
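A counting-based estimate of such an asymmetry in a single pseudorapidity bin can be sketched as follows. The definition $A = (N_t - N_{\bar{t}})/(N_t + N_{\bar{t}})$ with a binomial statistical uncertainty is assumed here for illustration, and the event counts are hypothetical values of the order quoted above.

```python
import math

def charge_asymmetry(n_top, n_antitop):
    """Return (A, stat_error) for counts of mu+ and mu- tagged events.

    A = (N_t - N_tbar) / (N_t + N_tbar), with binomial uncertainty
    sqrt((1 - A^2) / N) for N = N_t + N_tbar.
    """
    total = n_top + n_antitop
    asym = (n_top - n_antitop) / total
    error = math.sqrt((1.0 - asym**2) / total)
    return asym, error

if __name__ == "__main__":
    for n_top, n_antitop in ((550, 450), (5500, 4500)):   # hypothetical counts
        a, err = charge_asymmetry(n_top, n_antitop)
        print(f"N = {n_top + n_antitop}: A = {a:.3f} +- {err:.3f} (stat)")
```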
To demonstrate how a cross-section measurement at LHCb impacts the PDF uncertainty of $A^{t\bar{t}}$, we perform a reweighting of the observables $A^{t\bar{t}}(S_k)$ generated from the CT10 and NNPDF replica sets, based on the assumption of a cross-section measurement $\sigma^{\rm LHCb}_{\rm fake}$. Note that in this case the reweighting of both the CT10 and NNPDF sets is done assuming the same cross-section, whereas in the previous section this was not the case. The predictions from the replica sets are combined in the following way,
$$X_{\rm central} = \tfrac{1}{2}\left[\max(X_1 + \delta X_1, X_2 + \delta X_2) + \min(X_1 - \delta X_1, X_2 - \delta X_2)\right],$$
$$\delta X = \tfrac{1}{2}\left[\max(X_1 + \delta X_1, X_2 + \delta X_2) - \min(X_1 - \delta X_1, X_2 - \delta X_2)\right].$$
The assumed cross-section of $\sigma^{\rm LHCb}_{\rm fake} = 145.1$ pb results in a decrease in the magnitude of the asymmetry, while for the smaller cross-section $\sigma^{\rm LHCb}_{\rm fake} = 129.5$ pb the opposite behaviour is observed, corresponding respectively to an increase and a decrease in the gluon PDF at high $x$ required to account for the assumed cross-section. The asymmetry expectation and associated uncertainty are provided in Table 8, where the relative shifts in the asymmetry after reweighting are also included. The change to the overall relative uncertainty on the asymmetry is found to be negligible for the given choices of pseudodata cross-sections. The largest shift to the central value is in the region $2.0 < \eta < 3.0$, which is the region where the contribution to $\sigma^{\rm LHCb}$ is largest (see Fig. 5).

Table 8. Summary of $A^{t\bar{t}}$ with respect to pseudorapidity (bins: $0.0 < \eta < 1.0$, $1.0 < \eta < 2.0$, $2.0 < \eta < 3.0$, $3.0 < \eta < 4.0$, $4.0 < \eta < 5.0$) and the relative shift of this asymmetry after reweighting assuming a cross-section measurement within the LHCb acceptance.
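The envelope combination of two predictions quoted above can be written as a short function. The function name and the numerical inputs below are hypothetical; the arithmetic follows the max/min prescription given in the text.

```python
def combine_envelope(x1, dx1, x2, dx2):
    """Combine two predictions x1 +- dx1 and x2 +- dx2 via their envelope.

    Returns (central, uncertainty), where the central value is the midpoint
    of the envelope and the uncertainty is its half-width.
    """
    upper = max(x1 + dx1, x2 + dx2)
    lower = min(x1 - dx1, x2 - dx2)
    return 0.5 * (upper + lower), 0.5 * (upper - lower)

if __name__ == "__main__":
    # Hypothetical asymmetry predictions (in percent) from two replica sets.
    central, delta = combine_envelope(1.0, 0.2, 1.1, 0.1)
    print(f"{central:.2f} +- {delta:.2f}")
```

By construction the combined band contains both individual bands, so the result is never more precise than the less precise input.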

Discussion
The chosen experimental uncertainty range of 4-8% is an estimate of the systematic reach of future measurements at LHCb. It is expected that the largest uncertainties arise from background/signal modelling, the b-tagging efficiency and mis-tag rate, and the luminosity. Current cross-section results from ATLAS [52] have already achieved a total relative uncertainty below 5%; it is therefore not unreasonable to expect similar precision from measurements with the upgraded LHCb detector, especially given that the same simulation technology is available to LHCb (currently NLO matched to parton shower) and that the luminosity uncertainty at LHCb [53] is of similar size to that of Ref. [52]. It is therefore expected that future $t\bar{t}$ cross-section measurements at LHCb have the potential to reduce uncertainties on the high-$x$ gluon PDF by up to 20%. A direct application of such an improvement is to better predict the charge asymmetry within the LHCb acceptance. Given that the prediction of the $t\bar{t}$ charge asymmetry is dependent on the high-$x$ gluon PDF, comparisons between BSM scenarios and perturbative QCD will rely on such improvements.
The analysis strategy presented for top reconstruction relies on isolating the charged lepton in the decay $t \to (W \to \mu\nu_\mu)b$ from any additional jets in the event; the main motivation for this is to reduce the QCD $b\bar{b}$ background. However, this also removes highly boosted top quark decays in which the top decay products are very close together.
There are many BSM scenarios which contain top partners as a solution to the hierarchy problem, for example [54,55]. If these new particles are kinematically accessible at the LHC, and they decay via top quarks -which is often the case -then boosted top quarks are an interesting signal for BSM [56,57,58,59,60,61]. In the case of boosted top quarks at LHCb, an investigation into the separation power of very energetic fat jets and top decays should be undertaken. Semi-leptonic decays may be promising due to the excellent impact parameter resolution for charged leptons.
We conclude that a cross-section measurement with the current 7 TeV data set is statistically limited. However, given the larger data set available at 8 TeV (2 fb$^{-1}$) and an increase in $\sigma_{t\bar{t}} \cdot \mathrm{Acc}$, a statistical precision of 6% is achievable in the highly populated bins (see Fig. 2). It is also worth investigating the precision achievable in the electron channel, which could further improve the statistics. At 14 TeV the impact of a cross-section measurement on the gluon PDF ultimately depends on the experimental precision. Measurements of the background cross-sections, such as $W$+($b$)jets, will be a necessary ingredient in achieving high precision.

Acknowledgements
We are grateful to Amanda Cooper-Sarkar, Ulrich Haisch and Victor Coco for many useful suggestions and in particular Juan Rojo for advice which helped improve the study. The research of R.G. is supported by an STFC Postgraduate Studentship.