On the ratio of tt̄bb̄ and tt̄jj cross sections at the CERN Large Hadron Collider

Triggered by ongoing experimental analyses, we report on a study of the cross section ratio σ(pp → tt̄bb̄)/σ(pp → tt̄jj) at the next-to-leading order in QCD, focusing on both present and future collider energies: √ s = 7, 8, 13 TeV. In particular, we provide a comparison between our predictions and the currently available CMS data for the 8 TeV run. We further analyse the kinematics and scale uncertainties of the two processes for a single set of parton distribution functions, with the goal of assessing possible correlations that might help to reduce the theoretical error of the ratio and thus enhance the predictive power of this observable. We argue that the different jet kinematics makes the tt̄bb̄ and tt̄jj processes uncorrelated in several observables, and show that the scale uncertainty is not significantly reduced when taking the ratio of the cross sections.


Introduction
In order to establish whether the scalar resonance observed at the Large Hadron Collider (LHC) around 125 GeV [1,2] matches the properties of the Standard Model (SM) Higgs boson, quantities such as its couplings to fermions have to be measured with high precision. Of special interest are the Yukawa couplings to top (Y_t) and bottom (Y_b) quarks. Massive as they are, these quarks are ideal candidates for probing the nature of the new particle and, more generally, of the electroweak symmetry breaking mechanism.
For a SM Higgs boson with the observed mass value, the dominant decay mode is H → bb [3]. The overwhelming QCD background discourages Higgs searches in the direct production channel pp → H → bb. Attention is focused instead on Higgs production in association with one or more additional objects [4][5][6][7], because backgrounds are easier to control in such an environment.
Among all the associated production mechanisms that have been explored by the ATLAS and CMS Collaborations, the ttH channel plays an important role. The production rate is directly sensitive to the Yukawa coupling Y_t, providing a unique opportunity to probe this quantity without making any assumption on physics beyond the Standard Model [8][9][10][11]. When the Higgs decays into b quarks, this channel is also a direct probe of Y_b and adds to the information provided by the VH(H → bb) channel, where V = Z/W±.
However, the ttH(H → bb) final state is very challenging to measure. Search strategies employed by both experiments are based on the full reconstruction of the ttbb final state from charged leptons, missing energy and jets [12][13][14][15]. Using b-jet tagging, events with four b-jets are isolated, and the decays of the two candidate top quarks are reconstructed. Afterwards, the two b-jets which have not been associated with top decays are assigned to the decay of the candidate Higgs boson. The identification of such decay products is clearly not free of ambiguities. The so-called combinatorial background is responsible for a substantial smearing of the Higgs boson peak in the bb invariant mass. Together with the possibility of misidentifying light jets as b-jets, this represents a serious obstacle to the observation of the Higgs signal and demands good control of the dominant backgrounds as a prerequisite for a successful analysis.
The process of ttbb production in QCD is the most important irreducible background for the signal under consideration. With the help of b-jet tagging algorithms, it is possible to isolate the contribution of this process from the most general reducible background represented by ttjj production. Instead of extracting absolute cross sections, one can measure the production rate of ttbb normalized to the inclusive ttjj sample. This procedure has been explored by both CMS and ATLAS Collaborations [16][17][18] and has the advantage that many experimental systematics, including luminosity uncertainty, lepton identification and jet reconstruction efficiency, are expected to cancel in the ratio. The overall systematic error should thus be dominated by the efficient and clean identification of bottom jets, referred to as the b-jet tagging efficiency, as well as the tagging efficiency for the light flavour jets, referred to as the mistag rate.
On the theory side, the QCD backgrounds pp(pp̄) → ttbb and pp(pp̄) → ttjj have been calculated at next-to-leading order (NLO) in QCD [19][20][21][22][23][24]. Fairly moderate corrections, of O(15% − 30%), have been found for both processes. The estimated theoretical uncertainties due to the truncation of higher-order terms in the perturbative expansion are of the same size. In addition, first results for tt production in association with either two light or two bottom jets, supplemented with a parton shower, have recently appeared [25][26][27]. Scale variations before and after matching have been assessed to be rather similar. Each of these calculations, however, has been carried out with different sets of cuts, jet algorithms, values of the top quark mass and parton distribution functions (PDFs). This makes a determination of the cross section ratio possible only at the price of introducing undesired additional theoretical uncertainties.
The purpose of this paper is twofold. First, we would like to provide a systematic analysis of ttbb and ttjj backgrounds and extract the most accurate NLO predictions for the cross section ratio, to be used in comparisons with the available LHC data. The second goal is to examine whether the ratio has enhanced predictive power for Higgs searches, by investigating possible correlations between the two processes in the quest of reducing theoretical errors.
The paper is structured as follows. In Section 2 we assess the kinematical range of validity of our predictions, i.e. we motivate which phase space restrictions, particularly on the transverse momentum of jets, should be applied for our fixed-order results to be reliable. Beyond these limits, the stability of the perturbative expansion is likely to be endangered, and resummation of higher-order effects is required. We estimate these limits by studying leading-order ttjj production matched with the Pythia parton shower, and use the results to determine the kinematical setup for our predictions. In Section 3 we examine next-to-leading order differential cross sections for both ttbb and ttjj processes, analysing similarities and possible correlations between the two backgrounds. Subsequently, in Section 4 we provide the results for the ratio and the absolute cross sections for three different collider energies: √s = 7, 8 and 13 TeV. Section 5 is devoted to a comparison with the currently available CMS data at √s = 8 TeV. Finally, in Section 6 we draw our conclusions.

Leading Order Results with Parton Shower
We begin our analysis by exploring the validity domain of our perturbative calculation. To this end, we have generated an inclusive parton-level sample of pp → ttjj, produced with Helac-Phegas [28][29][30] in the Les Houches event file format [31] and interfaced with the general purpose Monte Carlo program Pythia 6.4 (version 6.427) [32] to include initial- and final-state shower effects. The event sample simulates pp collisions at √s = 8 TeV using parton-level cuts on p_Tj, y_j and $\Delta R_{jj} = \sqrt{\Delta\phi_{jj}^2 + \Delta y_{jj}^2} > 0.4$, where p_Tj, y_j and ∆R_jj denote the transverse momentum, the rapidity and the distance between the two jets in the (y, φ) plane, respectively. We use the leading-order (LO) CTEQ PDF set, i.e. CT09MC1 [33], with µ = m_t. The top quark mass is set to the value m_t = 173.5 GeV [34] and top quarks are assumed to be stable. Jets are reconstructed out of the partonic final state emerging after the shower, using the anti-k_T jet clustering algorithm [35] provided by the FastJet package [36,37]. The jet cone size is set to R = 0.5, and reconstructed jets are required to satisfy a further set of selection cuts.

To allow for a more direct comparison with our fixed-order results, we stop the evolution at the end of the perturbative phase. In other words, we neglect effects related to hadronization, underlying events or multiple pp interactions. Also, decays of the top quark and QED radiation from quarks are switched off. All the other Pythia parameters have been left unchanged and correspond to default settings. We have considered two different variants of the shower, both provided within Pythia 6.4: the transverse-momentum ordered shower (dubbed Pythia p_T) and the virtuality-ordered or mass-ordered shower (dubbed Pythia Q²). The starting scale for the shower has been set to p_Tj^min and $m_{jj}^{\min} = p_{T_j}^{\min}\sqrt{2(1 - \cos R)}$, respectively.

As a consistency check, we have compared the total rate obtained after showering with the LO expectation based on our selection cuts; the resulting cross sections are given in Eq. (2.3). The two showered results, based on different shower ordering variables, agree within 3% and are comparable with the LO cross section.

In a subsequent step, we compare leading-order predictions at the differential level before and after showering. Figure 1 shows distributions of the transverse momentum of the two hardest jets, the dijet invariant mass and the transverse momentum of the ttj_1 system, where j_1 denotes the hardest jet. We observe that the p_T and invariant mass distributions are not strongly modified by the parton shower. Shape differences are within the corresponding theoretical errors, which are not shown in the plots for better readability. On the other hand, the transverse momentum distribution of the ttj_1 system shows a sizeable discrepancy in the low-p_T region. Note that at leading order, momentum conservation enforces the equality p_T(ttj_1) = p_T(j_2), where j_2 is the second hardest jet, and thus the distributions of these two observables coincide. When the parton shower is turned on, the extra radiation allows the presence of additional jets, and the direct relation between the two quantities is lost. A large Sudakov suppression is visible starting approximately below p_T(ttj_1) = 40 GeV, while the fixed-order result displays a sharp peak. This discrepancy indicates that dominant higher-order effects endanger the stability of the perturbative expansion in the small p_T region, and suggests the choice of basic selection cuts for a reliable fixed-order analysis given in Eq. (2.4). The specific value of the cut on the maximum jet rapidity is dictated by the detector acceptance and the experimental requirements for the bottom flavor jet reconstruction [38]. We report for completeness the total LO cross sections that we obtain using the cuts

Next-to-leading Order Differential Cross Sections
Having established a safe kinematical domain, we now turn to examine the behaviour of differential cross sections for both processes, pp → ttbb and pp → ttjj. As already mentioned, we are interested in investigating similarities and correlations between the two backgrounds with the goal of reducing theoretical uncertainties in the cross section ratio.
Our NLO results are based on the NLO CTEQ PDF set, i.e. CT10 [39], using µ_R = µ_F = µ_0 for the renormalization and factorization scales, where µ_0 is the central scale of the process. Jets are reconstructed using the anti-k_T clustering algorithm with resolution parameter R = 0.5. We require the presence of at least two jets and impose the selection cuts of Eq. (2.4). No restriction on the kinematics of a possible third jet is applied. All the next-to-leading order results presented in this paper have been obtained with the help of the package Helac-NLO [40], which consists of Helac-1loop [41][42][43] and Helac-Dipoles [44,45]. The integration over the phase space has been performed using Kaleu [46].

To understand similarities and differences between the two backgrounds, it is helpful to identify the dominant partonic subprocesses. In the case of pp → ttbb, at LO in the perturbative expansion, the most important production mechanism is the scattering of two gluons (see Figure 2-A). With our choice of selection cuts, the gg channel contributes about 90% of the total LO cross section at √s = 8 TeV. On the other hand, pp → ttjj is governed by two comparably important channels, namely the gg channel (49%) and the qg/gq channel (40%) (see Figure 2-B and C). We note that the contribution of the process gg → ttqq, which is related to the ttbb final state, amounts to 2.6% and is almost negligible compared to the dominant contributions. These facts suggest that the two backgrounds ttbb and ttjj might show different features in the jet kinematics, which would of course have a negative impact on correlations.
A collection of observables is reported in Figure 3, where the NLO distributions have been normalized to the corresponding absolute cross sections in order to highlight shape differences between the two processes. We focus here on quantities related to jet activity, such as rapidity and transverse momentum distributions of the first and the second hardest jet, invariant mass and separation between the two jets. Note that the requirement of two hard jets with a resolution parameter R = 0.5 and p_Tj^min = 40 GeV implies a lower bound on their invariant mass, of the order of m_jj^min = 19.8 GeV.

Figure 4. Comparison of the normalized leading order and next-to-leading order differential cross sections for pp → ttbb and pp → ttjj at the LHC with √s = 8 TeV. The following distributions are shown: invariant mass of the two hardest jets, separation between those jets and rapidity of the first and the second hardest jet.
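The lower bound on the dijet invariant mass quoted above can be checked numerically. The sketch below evaluates m_jj^min = p_Tj^min √(2(1 − cos R)), the same expression used as the starting scale of the mass-ordered shower; the function name `min_dijet_mass` is introduced here purely for illustration.

```python
import math


def min_dijet_mass(pt_min: float, r: float) -> float:
    """Return pt_min * sqrt(2 * (1 - cos(r))), in the units of pt_min."""
    return pt_min * math.sqrt(2.0 * (1.0 - math.cos(r)))


# Values quoted in the text: p_T^min = 40 GeV and R = 0.5.
m_min = min_dijet_mass(40.0, 0.5)
print(f"m_jj^min = {m_min:.1f} GeV")  # m_jj^min = 19.8 GeV
```

For two massless jets with equal transverse momentum this expression coincides with the pair invariant mass at vanishing rapidity difference and azimuthal separation R, which is one plausible reading of where the bound comes from.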
We observe large shape differences in several observables, in line with our expectations. First of all, the b-jets show a preference for the central region of the detector in comparison with light jets. This difference is to be ascribed mainly to the contribution of the qg/gq channel, which favours the emission of jets at larger rapidities than the gg channel. Note that, contrary to the ttjj case, in ttbb production the qg/gq channel is absent at LO and becomes available only at NLO.
In general, jets from the ttjj background show a much harder spectrum compared to ttbb. Sizeable differences can also be seen in the invariant mass and in the ∆R_jj separation between the two jets. In fact, using our cut selection, the ttbb background is dominated by the gg → ttg(g → bb) production mechanism (see Figure 2-A.2), which naturally favours the production of b-jet pairs with small invariant mass. In the case of ttjj, there is an interplay between two different mechanisms. On the one hand, gg → ttg(g → gg) (Figure 2-B.2) is relevant for small values of m_jj and gives a signature quite similar to the bb case. On the other hand, gluon radiation off initial-state partons (see e.g. Figure 2-B.1) provides an equally important contribution due to collinear enhancements. Thus, light jets with large rapidities and large ∆R_jj separation are also likely to be produced in the ttjj case, which explains the quite different ∆R_jj spectrum. All the kinematical features described above are rather insensitive to higher-order corrections, as shown in Figure 4, where we compare normalized LO and NLO differential cross sections.

Despite sizeable differences in the jet activity, ttbb and ttjj might still show some similarity connected to the underlying basic process they have in common, i.e. top quark pair production. To check this, we report in Figure 5 normalized distributions of a few observables related to the top quark kinematics, namely the invariant mass of the tt system and the averaged transverse momentum of the top quarks. Indeed, the distributions show very good agreement in shape, indicating some level of correlation. The rather different jet kinematics that characterizes the two backgrounds has a minimal influence on the underlying heavy tt system.

Next-to-leading Order Cross Section Ratio
In this Section we present NLO predictions for the ratio σ_ttbb/σ_ttjj at the LHC for √s = 7, 8 and 13 TeV. In addition to the basic selection cuts of Eq. (2.4), we also report results for R = 0.8 and ∆R_jj > 0.8 to check whether the impact of higher-order corrections is stable against these two parameters. Indeed, we want to be confident that our choice ∆R_jj^min = R = 0.5 is well within the range of stability of the perturbative expansion.

LHC @ 7 TeV
We start with the LHC results at √s = 7 TeV. The NLO cross sections and the cross section ratio are reported in Table 1 for the two different jet separation cuts and jet resolution parameters. Since at NLO jets may have some structure, i.e. two partons can be inside a jet, an interplay between two different effects can be observed. On the one hand, the simultaneous decrease of the ∆R_jj separation cut results in higher total NLO cross sections. On the other hand, a smaller resolution parameter R means that the probability of parton radiation outside the area with distance R is higher. This may translate into a larger number of soft jets with p_Tj < p_Tj^min and a lower total NLO cross section. Since for the ttjj final state many events are concentrated around ∆R_jj = π, the NLO cross section is only mildly affected by a change in the ∆R_jj cut from 0.8 to 0.5. Accordingly, the effect associated with the resolution parameter R dominates, leading to the lower NLO cross section.
With ∆R_jj > 0.5 and R = 0.5, i.e. for the values that have been used in the experimental studies [17], we obtain our predictions for the absolute cross sections together with their scale dependence. The theoretical uncertainty associated with neglected higher-order terms in the perturbative expansion can be estimated by varying the renormalization and factorization scales up and down by a factor of 2 around the central scale of the process, i.e. µ_0. The scale dependence is indicated by the upper/lower value, which corresponds to 0.5µ_0/2µ_0. Our estimated scale uncertainties for the integrated cross sections are of the order of 14% − 24% (14% − 20% after symmetrisation). In addition, we find that the size of the NLO QCD corrections is moderately affected by lowering both ∆R_jj and R, i.e. changes of the order of 15% or less are visible. Since those changes are within our theoretical errors, we conclude that ∆R_jj^min = R = 0.5 is still perturbatively valid and a fixed-order NLO calculation can be considered reliable.
We now turn to estimating the theoretical error for the cross section ratio. Given that there is no unique prescription for this in the literature, we decided to evaluate it using three different approaches.

The first one assumes that the two background processes are not correlated, and consists in calculating all possible cross section ratios R = ttbb(µ_1)/ttjj(µ_2), where µ_1, µ_2 ∈ {0.5µ_0, µ_0, 2µ_0}. All combinations other than the central one are considered, namely, in units of µ_0, (µ_1, µ_2) = {(2, 2), (2, 1), (2, 0.5), (0.5, 2), (0.5, 1), (0.5, 0.5), (1, 0.5), (1, 2)}. The theoretical error band is determined by taking the minimum and maximum values of the resulting ratios. With this approach, which we name uncorrelated, we get a scale uncertainty of 30% for the cross section ratio after symmetrisation of the error estimate.

The second approach assumes that some degree of correlation exists, so the possible combinations to be evaluated are restricted to the subset (µ_1, µ_2) = {(2, 2), (0.5, 0.5)}. If ttbb and ttjj are indeed correlated, a reduction of the scale uncertainty in the ratio should be expected. With this approach, named correlated, only a minor reduction in the size of the scale uncertainty is observed. The theoretical error band for the ratio is now 22% and is of the same order as the error for the absolute cross sections.

The third and last approach uses the relative errors of the absolute cross sections as input. We treat these quantities as uncorrelated and add the errors in quadrature, separately for the cases 0.5µ_0 and 2µ_0. This approach, which we name relative error, gives the result

$$\sigma^{\rm NLO}_{t\bar{t}b\bar{b}} / \sigma^{\rm NLO}_{t\bar{t}jj}\ (\text{relative error}) = 0.0105^{\,+0.0022\,(21\%)}_{\,-0.0029\,(28\%)}\,. \qquad (4.5)$$

After symmetrisation of the error estimate, the final scale uncertainty is 24%.
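The three prescriptions for the scale uncertainty of the ratio can be sketched in a few lines. This is a minimal illustration, not the procedure as implemented in the analysis: the function names and the input cross sections `toy_bb` and `toy_jj` are hypothetical placeholders (they are not the values of Tables 1-3), and assigning the 0.5µ_0 variation to the upper error in the relative-error approach is an assumption based on the upper/lower convention stated in the text.

```python
import math
from itertools import product

SCALE_FACTORS = (0.5, 1.0, 2.0)  # xi, with mu = xi * mu_0


def ratio_scale_bands(sig_bb, sig_jj):
    """Return the central ratio and (up, down) absolute errors per approach.

    sig_bb, sig_jj: dicts mapping the scale factor xi to a cross section.
    """
    central = sig_bb[1.0] / sig_jj[1.0]

    def envelope(ratios):
        # Distance of the largest/smallest ratio from the central value.
        return max(max(ratios) - central, 0.0), max(central - min(ratios), 0.0)

    # Uncorrelated: all (xi_1, xi_2) pairs except the central one (8 ratios).
    uncorrelated = envelope([
        sig_bb[a] / sig_jj[b]
        for a, b in product(SCALE_FACTORS, repeat=2)
        if (a, b) != (1.0, 1.0)
    ])

    # Correlated: both cross sections evaluated at the same scale.
    correlated = envelope([sig_bb[x] / sig_jj[x] for x in (0.5, 2.0)])

    # Relative error: relative errors of the two cross sections added in
    # quadrature, separately for xi = 0.5 and xi = 2.
    def rel(sig, x):
        return abs(sig[x] - sig[1.0]) / sig[1.0]

    # Assumption: xi = 0.5 gives the upper error, xi = 2 the lower one.
    relative = tuple(
        central * math.hypot(rel(sig_bb, x), rel(sig_jj, x)) for x in (0.5, 2.0)
    )
    return {
        "central": central,
        "uncorrelated": uncorrelated,
        "correlated": correlated,
        "relative error": relative,
    }


def symmetrised_percent(central, up, down):
    """Average of the upper and lower errors, in percent of the central value."""
    return 100.0 * 0.5 * (up + down) / central


# Hypothetical inputs, for illustration only.
toy_bb = {0.5: 2.5, 1.0: 2.0, 2.0: 1.5}
toy_jj = {0.5: 220.0, 1.0: 200.0, 2.0: 170.0}
bands = ratio_scale_bands(toy_bb, toy_jj)
for name in ("uncorrelated", "correlated", "relative error"):
    up, down = bands[name]
    print(name, f"{symmetrised_percent(bands['central'], up, down):.1f}%")

# Symmetrising the errors quoted in Eq. (4.5) reproduces the 24% in the text.
print(f"{symmetrised_percent(0.0105, 0.0022, 0.0029):.0f}%")  # 24%
```

Because the correlated combinations are a subset of the uncorrelated ones, the uncorrelated band always contains the correlated band, which is why it is the more conservative of the two.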

LHC @ 8 TeV
We repeat the same procedure for the case √s = 8 TeV. The NLO cross sections are reported in Table 2, together with the cross section ratio for the two different jet separation cuts and jet resolution parameters. Our conclusions are similar to the case of √s = 7 TeV and are therefore only briefly summarized here. We have evaluated the absolute cross sections and the corresponding theoretical errors for ∆R_jj^min = R = 0.5, as well as the cross section ratio, with scale uncertainties estimated according to the three methods described in the previous Subsection. After symmetrisation, the final theoretical errors amount to 32% for the uncorrelated case and 26% for the correlated one. Using the relative error approach, we find 26%. Scale uncertainties for the absolute ttbb and ttjj cross sections are of the order of 15% − 24% (14% − 21% after symmetrisation) and therefore comparable with the uncertainty of the ratio.

LHC @ 13 TeV
The case of √s = 13 TeV shows a similar pattern. The NLO cross sections for ttbb and ttjj are reported in Table 3, again for two different values of the jet resolution parameter R and jet separation cut ∆R_jj. For ∆R_jj^min = R = 0.5, the theoretical error in the uncorrelated case is 38%, whereas the correlated approach gives 34% and the relative-error approach 27%. We observe that the cross section ratio and its uncertainty increase with the center-of-mass energy, and that the difference between the uncertainties evaluated in the correlated and uncorrelated approaches becomes smaller. The theoretical error on the ratio is in this case slightly larger than the corresponding one on the absolute cross sections.

We summarize our predictions in Figure 6, where the cross section ratio is presented as a function of the collider center-of-mass energy. The plot shows three different error bands according to the three methods employed for the uncertainty estimation. The error bands are relatively independent of the method adopted, with the uncorrelated approach being the most conservative one. We have therefore adopted the uncorrelated approach for our comparison with the LHC data at √s = 8 TeV, which is discussed in the next Section.
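The symmetrised scale uncertainties quoted for the three collider energies can be collected in one place to check the two trends stated above. The sketch below only tabulates percentages that appear in the text; the dictionary layout and variable names are illustrative choices.

```python
# Symmetrised scale uncertainty on the ratio (in %), as quoted in the text,
# keyed by the center-of-mass energy in TeV.
uncertainty = {
    7: {"uncorrelated": 30, "correlated": 22, "relative error": 24},
    8: {"uncorrelated": 32, "correlated": 26, "relative error": 26},
    13: {"uncorrelated": 38, "correlated": 34, "relative error": 27},
}

# Difference between the uncorrelated and correlated estimates per energy.
gaps = {
    energy: values["uncorrelated"] - values["correlated"]
    for energy, values in uncertainty.items()
}

# Approach giving the largest uncertainty at each energy.
most_conservative = {
    energy: max(values, key=values.get) for energy, values in uncertainty.items()
}

print(gaps)               # {7: 8, 8: 6, 13: 4}
print(most_conservative)  # 'uncorrelated' at every energy
```

The gap between the two estimates shrinks from 8 to 4 percentage points between 7 and 13 TeV, and the uncorrelated approach gives the largest uncertainty at each energy.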

Comparison with CMS Results at √ s = 8 TeV
We now compare our NLO predictions with the corresponding measurement of the ratio σ_ttbb/σ_ttjj by the CMS Collaboration [17], based on a data sample corresponding to an integrated luminosity of 19.6 fb⁻¹ collected at √s = 8 TeV in the dilepton decay mode.
The experimental result refers to jets with |η_j| < 2.5 and ∆R_jj > 0.5. A total systematic uncertainty of 22.6% has been estimated by CMS, where the dominant contributions for p_Tj > 40 GeV come from the mistag rate (12.6%) and the b-jet tagging efficiency (11.2%) [17]. Several experimental systematic uncertainties are reduced by taking the cross section ratio, as expected. The measured ratio for p_Tj > 40 GeV can be directly compared with the corresponding Helac-NLO prediction at √s = 8 TeV. We recall that we have adopted here the most conservative, uncorrelated approach for our theoretical error estimate. The comparison is illustrated in Figure 7, where the experimental uncertainties are treated as uncorrelated and thus added in quadrature. The total experimental error obtained in this way amounts at present to ±0.0064 (29%).
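The quadrature arithmetic behind the experimental error quoted above can be made explicit. The sketch below uses only numbers that appear in the text. It assumes that the individual systematic sources are uncorrelated, and that the total error is the quadrature sum of the 22.6% systematic uncertainty and one further component, presumably statistical. The derived quantities are implied values, not numbers taken from the CMS publication, and they inherit the rounding of the quoted 29%.

```python
import math

# Numbers quoted in the text (percentages are relative to the measured ratio).
total_syst = 22.6       # total systematic uncertainty, %
mistag = 12.6           # mistag rate contribution, %
btag = 11.2             # b-jet tagging efficiency contribution, %
total_exp_abs = 0.0064  # total experimental error on the ratio
total_exp_rel = 29.0    # the same error, in %

# Two dominant systematic sources combined in quadrature (assumed uncorrelated).
dominant = math.hypot(mistag, btag)

# Systematic uncertainty left for all other sources, under the same assumption.
remaining_syst = math.sqrt(total_syst**2 - dominant**2)

# Central value implied by the absolute and relative total errors.
implied_ratio = total_exp_abs / (total_exp_rel / 100.0)

# Component that must be added in quadrature to the systematic uncertainty
# to reach the total (assumed here to be the statistical error).
implied_other = math.sqrt(total_exp_rel**2 - total_syst**2)

print(f"dominant systematics:   {dominant:.1f}%")
print(f"remaining systematics:  {remaining_syst:.1f}%")
print(f"implied measured ratio: {implied_ratio:.4f}")
print(f"implied extra error:    {implied_other:.1f}%")
```

Under these assumptions the two dominant sources combine to about 17%, the implied measured ratio is about 0.022, and the additional component is about 18%.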

Conclusions
In this paper we have presented the first consistent NLO theoretical predictions for the cross section ratio σ_ttbb/σ_ttjj, in order to enable high-quality comparisons with the data collected at the LHC. We have considered both present and future collider energies, √s = 7, 8 and 13 TeV, exploring different methods to provide estimates of the scale uncertainty of our predictions that are as realistic as possible. We have found that our estimate is relatively independent of the method applied. The method which assumes ttbb and ttjj to be uncorrelated should be taken as the most conservative one. Moreover, we have shown that the scale uncertainty of the ratio, at the level of 20% − 30%, is comparable with the error on the absolute cross sections σ(ttbb) and σ(ttjj). Given that this uncertainty is the dominant theoretical error for the processes at hand, we conclude that the ratio shows the same theoretical accuracy as the individual cross sections.
We recall that top quark decays are not included in our study. This corresponds to the unrealistic situation of a perfect top quark reconstruction with all decay channels included. Besides, the two light or b-jets are always assumed not to be misassociated with the top quark decay products. It is clearly desirable to include top quark decays and study how they affect the cross section ratio. However, we expect a moderate impact, provided the same method of reconstructing the tt system is used in both processes, because the top quarks show rather similar kinematics in the ttbb and ttjj backgrounds. This correlation might help to better distinguish whether the reconstructed b-jets come from the tt pair or, e.g., from the QCD g → bb splitting.
The results presented in this paper have been obtained at the partonic level, and parton shower effects should, in principle, be included. We expect the parton shower to play an important role in the case of loose cuts on the jet p_T, i.e. for p_Tj ≪ 40 GeV, where the reliability of a genuine fixed-order calculation is questionable. First results for tt production in association with up to two jets merged with a parton shower have recently started to appear [25], but the assumed kinematical restriction on the jet p_T (40 GeV, 60 GeV or 80 GeV) still seems too high to shed light on such effects. We note that the estimated uncertainty on the absolute cross section for the production of the ttjj system presented there is comparable with our estimates. Similar conclusions apply to the case of ttbb production, recently matched to the parton shower [26]. Scale variations before and after matching have been assessed to be rather similar, at the level of 20% − 30%, which is again in agreement with our estimates. For all these reasons, we believe that parton shower effects will have a minimal impact on our results in the considered kinematical range.
Finally, we have presented a comparison between our NLO predictions and the currently available CMS data for √s = 8 TeV. The present level of agreement is not striking but still within the uncertainties. We note that a new measurement of the cross section ratio, based on the complete data sample collected by the CMS experiment, is underway. This enlarged data sample, including other top quark decay channels, will provide a more accurate measurement, which we look forward to comparing with our predictions.