Hadroproduction of tt¯bb¯\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ t\overline{t}b\overline{b} $$\end{document} final states at LHC: predictions at NLO accuracy matched with parton shower

We present predictions for the hadroproduction of tt¯bb¯\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ t\overline{t}b\overline{b} $$\end{document} final states at the LHC with collision energies s\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \sqrt{s} $$\end{document} = 8 TeV and 14 TeV at NLO accuracy matched with parton shower, as obtained with PowHel + PYTHIA. We quantify the effects of parton shower and hadronization. We find these are in general moderate except the effect of the decay of heavy particles, which can modify significantly some distributions, like that of the invariant mass of the two leading b-jets. We also show kinematic distributions obtained with cuts inspired to those recently employed by the CMS collaboration. For these predictions, we present the theoretical uncertainty bands, related to both scale and PDF variations. We find that these uncertainties are only moderately affected by the change of the collision energy.


Introduction
The ttbb hadroproduction process was one of the first four-parton hadroproduction processes studied a few years ago in order to test and show the capabilities of new unitarityinspired methods for evaluating one-loop amplitudes, as well as the potential of new developments in the more traditional tensor-reduction approach. Pioneering works in this direction were performed by Bredenstein et al. [1,2], as well as by the HELAC-NLO collaboration [3]. In both cases the role of next-to-leading-order (NLO) QCD corrections was shown, by considering rather loose systems of cuts. The agreement between predictions obtained by different methods was taken as a prove of the correctness of the new conceptual developments, that were subsequently applied successfully to other multiparticle hadroproduction processes, reaching a level of unforeseen complexity [4][5][6][7].
A further application was represented by the study of ttbb production at the TeVatron [8]. Nowadays, a renewed interest for this process has arisen, driven not only by further theoretical developments, including the possibility of matching complex NLO QCD calculations to parton shower (PS) approaches, but also by the requests of the experimental collaborations working at LHC, that need input to reduce the uncertainties on the estimate of the ttbb contribution to the total background in their searches for the Higgs boson in the ttH channel, with the H boson decaying in a bb pair [9][10][11][12]. The theoretical uncertainty quoted in recent experimental analyses on the contribution of this irreducible non-resonant background component amounts to a few tens percent (∼ 50% in refs. [11,12]). We expect that theoretical estimates with the best possible theoretical accuracy within present reach of the most advanced event generators can help reduce this uncertainty. Thus, one of the aims of this paper is to present predictions for differential cross sections at NLO QCD accuracy matched to PS and hadronization, resulting in predictions at the hadron level, using cuts similar to those employed by the experimental collaborations.
First predictions concerning ttbb computations with NLO QCD accuracy matched with PS were presented by our group in refs. [13,14], making use of PowHel. More recently, JHEP03(2015)083 ttbb predictions obtained by using the MC@NLO matching algorithm [15], as encoded in SHERPA [16], and massive b-quarks, with a fixed pole mass m b = 4.75 GeV, have been reported in ref. [17], using 1-loop amplitudes computed by OpenLoops [18]. Our computation differs from that presented in ref. [17] in the following aspects: we use (i) massless b-quarks in the generation of NLO matrix elements, (ii) a different matching algorithm (POWHEG [19,20]), and (iii) a different Shower Monte Carlo program (PYTHIA).
As pointed out in ref. [17], the finite mass of the b-quarks allows for computing the contribution of ttjj events with the jets originating from two gluons that both separately split into a collinear bb pair. Each pair itself becomes part of one b-jet (as the b-quarks are sufficiently collinear to be unresolved by the jet algorithm), so the final state contains a ttquark pair in association with two b-jets. Strictly speaking, such contributions are higher order in perturbation theory and are not included in our computation with NLO+PS accuracy, where collinear bb pairs can only be produced by gluon splitting in the PS. On the other hand, in ref. [17], one of those g → bb splittings is included in the real radiation matrix elements (at LO accuracy for these contributions), while the other is generated by the parton shower, and such double g → bb splittings were found to give a significant contribution to the tt+2 b-jet final states when the t-quarks are kept stable.
This paper represents an extension of our previous studies, relying on a more advanced PowHel implementation, on the basis of recent POWHEG-BOX developments, and includes a detailed study of the theoretical uncertainties affecting our results in the most general case, including top decay and the full SMC chain up to hadronization and hadron decay.

Method
Predictions presented in this paper were obtained using events generated by PowHel [21], and stored in Les Houches event (LHE) files [22]. Our computational framework, PowHel, is an event generator that uses the HELAC-NLO codes [23] (HELAC-1LOOP [24] and HELAC-Dipole [25]) for the computation of the amplitudes required as input by the POWHEG-BOX [26]. We use the latter for matching NLO QCD computations with PS, according to the POW-HEG method, generating events stored in the Les Houches format (LHE). Details of event generation for ttbb final states were presented in ref. [13], where thorough checks of the correctness of the event files were also discussed. In particular, we treated the b-quarks as massless, and used five massless flavours, but neglected the small contribution to the cross section from the b-quarks in the initial state. Feynman graphs containing g → bb splittings are included in our computations. Such splittings are singular when the massless b-quarks become collinear. To avoid this divergence, we employ a generation cut on the invariant mass of the bb pair (has to be above 2 GeV), which is chosen such that it is harmless when physical cuts are also employed [13]. In practice, this means that the physical cuts must be chosen such that the invariant mass of b-jets should be significantly larger than 2 GeV. We tested that distributions are independent of this generation cut if m bb is above about 20 GeV. We set the mass of the t-quark to m t = 173.2 GeV for the events at 8 TeV and to m t = 172.6 GeV for those at 14 TeV. The latter choice was made in the first calculation at 14 TeV [3], against which we checked our predictions at NLO accuracy in ref. [13].

JHEP03(2015)083
Subsequent radiation emissions on top of the LHEs are generated by shower Monte Carlo (SMC) programs. SMCs can be further used to generate hadronization and hadron decay, as well as the decay of the heavy elementary particles in the narrow width approximation (NWA). For making predictions in this paper, we used the PYTHIA SMC code [27], in the Fortran version 6.4.28. However, other SMCs can be used as well to further shower the events generated by PowHel.
Following ref. [13], we fix the default value of the renormalization and factorization scales equal to µ R = µ F = µ 0 = H T /2. Here H T is the sum of the transverse masses of partons in the final state, a fairly standard choice for final states with high multiplicity [28], using the underlying Born kinematics. Then we consider their simultaneous variation by a factor of two around this value, leading to scale uncertainty bands corresponding to scales in the range [µ 0 /2, 2µ 0 ]. We also consider µ R and µ F variations separately in this range and we verify that the uncertainties associated to off diagonal variations (µ R , µ F /2), (µ R /2, µ F ), (µ R , 2µ F ), (2µ R , µ F ) lie within the bands of diagonal ones. Our scale uncertainty bands represent the envelope coming from these variations, leaving out the antipodal choices (µ R /2, 2µ F ) and (2µ F , µ R /2) according to the prescription of ref. [29].
We used the CT10NLO, MSTW2008NLO and NNPDF sets of PDFs with 2-loop running α S and 5 active flavours, as available in the LHAPDF interface [30], drawing a PDF uncertainty band as the envelope of the results corresponding to the three central values of these PDF sets. For practical reasons, due to the intensive CPU requirements of our calculations, we neglected uncertainties coming from considering the statistical variations around each central PDF distribution. We expect that differences between different PDF sets are more pronounced than differences within each PDF set, as already found for the hadroproduction of ttH final states in the HXSWG reports [14,31].
As discussed in ref. [13], suppression factors are used in order to suppress the generation of the events in those regions of the phase space that are expected to be less relevant from the experimental point of view, considering the typical cuts applied in the experimental analyses. These factors however, lead to events characterized by a wide weight distribution, extending without any explicit limit both in the negative and in the positive weight region. This may lead to spikes in the kinematic distributions. We expect these spikes would disappear with significantly more statistics, but the present capabilities of computer resources limit the total number of events we can generate to several millions. 1 We thus implemented an automated spike-elimination procedure applied after SMC, relying on the fact that the same LHE can lead to different SMC emissions, i.e. can populate different bins of the final differential distributions at the hadron level, depending on the random number sequence in the SMC generator.

Phenomenology
Using PowHel one can make predictions at four different stages in the evolution of the final state: (i) at the parton level using NLO accuracy, (ii) from the pre-showered POWHEG JHEP03(2015)083 simulation (referred to Les Houches events, or LHEs), formally at the NLO accuracy, (iii) after decay of the heavy particles, (iv) at the hadron level after full SMC. A possible alternative to the last stage is (iv') after PS switching off hadronization and hadron decay, but in this case the heavy quarks are automatically kept stable by PYTHIA. Utilizing these options, we study the effect of these various stages of event evolution on several differential distributions, before making predictions at the hadron level.

SMC effects
As the number and type of particles are very different after the different stages of event evolution, selection cuts employed at the different stages may change the cross section significantly. Thus, to understand the effect of the PS and hadronization on the shapes of the distributions without changing normalization, we first performed a phenomenological analysis where the cuts are applied on the LHEs, whereas distributions are plotted after different stages of the evolution. This way we can ensure that the selection cuts do not introduce a bias, that is as if we used an inclusive event sample. The cuts applied on jets formed from the LHEs, reconstructed using the k ⊥ algorithm [32,33] with R = 0.4, as implemented in FastJet 3.0.6 [34] are the following: we require at least one band oneb-jet with (a) p T > 20 GeV, (b) |η| < 2.5, and (c) m bb > 100 GeV, as relevant for Higgs studies. We do not apply cuts on b-jets emerging from the decay of top quarks, implemented at LO accuracy in the NWA in the SMC, which neglects spin correlations.
In figures 1-6 we present the comparisons of distributions at different stages of event evolution. Each plot contains five predictions, corresponding to the list (i-iv') presented in the introduction of this section.
First we study the effect of the PS by comparing differential distributions from the LHEs and after PS, keeping heavy particles stable. The corresponding predictions are marked as 'LHE' and 'PS'. In the middle section of each plot we also show the ratio PS/LHE of these predictions together with the ratio of the prediction at NLO accuracy to that from the LHEs, already studied in ref. [13], where a fairly uniform increase of the distributions was found from the LHEs. Here we find that the effect of the PS is to 'bring back' the predictions close to the NLO ones with two exceptions: (i) the rapidity distribution of the hardest b-jet, which remains close to the prediction from LHEs even after shower evolution, and (ii) the distributions of transverse momenta for small p ⊥ (below about 50 GeV). Thus, the effect of the PS is in general small on top of the predictions at NLO level when the t-quarks are kept stable, or looking from an experimental point of view, when those are reconstructed from their decay products.
Secondly we study the effect of the full SMC by comparing differential distributions from the LHEs, after decay and after SMC, corresponding to the full simulation including PS, hadronization and hadron decay. The corresponding predictions are labelled as 'decay' and 'SMC' and the ratio of these is also shown in the lower panel of each plot. Here we find significant changes in almost all distributions. The decay of the heavy quarks modifies the shapes, which are further modified by the SMC in the case of distributions of transverse momenta, or invariant mass of the bb-jet pair. In case of rapidity distributions the effect of the SMC is negligible.  We select isolated leptons with p ⊥, > 25 GeV, |η | < 2.5 and minimal separation from all jets in the pseudorapidity-azimuthal angle plane ∆R = 0.4. Looking at kinematic distributions of those leptons, such as the transverse momentum and pseudorapidity of the hardest isolated lepton plotted in figure 7, it is clear that these are not affected by SMC effects, as expected. Such leptons appear in the decay of the heavy quarks, and neither the PS nor the hadronization change those. The same is true for the W transverse mass, reconstructed for W 's decaying leptonically, not shown here.
On the other hand, the distributions most affected by the SMC are the total number of (b +b)-jets, the total number of jets (b and non-b), the p ⊥ of the hardest non-b jet. We show the last of these in figure 8. In this case the prediction at NLO accuracy diverges for small transverse momentum, which is screened by the Sudakov factor, in case of parton shower matched prediction. The decay of the heavy quarks hardens this distribution significantly through jets emerging from the hadronic decay of the W bosons, which is subsequently softened by the PS and hadronization.

Analyses with cuts at the hadron level
In order to make predictions at the hadron level including scale and PDF uncertainties, we decided to use selection cuts at the hadron level, inspired by the CMS note ref. [35], to identify events in the dileptonic decay mode. These are rather involved, which amply shows the flexibility of our method: 1. We require the events to have at least one pair of isolated opposite sign leptons with p ⊥, > 20 GeV and |η | < 2.4. Dilepton candidate events with invariant mass m < 12 GeV are removed to suppress multijet final states, at essentially no penalty for the collected signal. A lepton is considered isolated if the sum of the transverse momenta of the charged hadrons, neutral hadrons and photons in a cone of ∆R < 0.3 around it divided by its transverse momentum, I rel , does not exceed I max rel = 0.15.
2. To remove the large background from events with Z-boson and jets, with Z decaying into leptons, we require that the invariant mass of the lepton pair defined above (e + e − or µ + µ − ) falls outside a ±15 GeV window centered at m Z .
3. Signal events typically contain large missing transverse energy due to the presence of neutrinos from decays of the W -bosons. We require a minimum missing transverse energy, / p ⊥ > 30 GeV, for the e + e − or µ + µ − dilepton final states, but not for the mixed flavour dilepton final state.
4. We cluster jets from hadrons, photons and non-isolated leptons using the anti-k ⊥ algorithm [36] with R = 0.5. The jets are all required to have |η j | < 2.5 and p ⊥,j > 40 or 20 GeV. We performed the analysis with both cuts and report in the following the corresponding cross-sections, but present distributions here for the former only. The trend of the distributions related to the latter in fact do not present particular additional features which modify our general conclusions. 5. We require at least four well-separated b-jets after SMC, ∆R(j i , j k ) > 0.5 for i, k ∈ {1, 2, 3, 4}, as well as ∆R(j i , k ) > 0.5 for i ∈ {1, 2, 3, 4}, and k ∈ {1, 2} for the selected opposite-sign leptons. We assume 100% b-tagging efficiency using MCTRUTH.
With these set of cuts the cross sections for various choices of the default scale and for different PDF sets are presented in tables 1 and 2 for a collider energy of 14 TeV and in tables 3 and 4 for a collider energy of 8 TeV. These cross-section values were obtained at the hadron level by showering events generated by PowHel with a p T -ordered version of the PYTHIA PS, by using the Perugia 2011 tune [37]. Using the default version of PYTHIA, where the PS is virtuality-ordered in absence of tunes, gives rise to values a few percent lower.
In table 5 we exhibit the numbers of expected events for an integrated luminosity of 19.6 fb −1 at 8 TeV, for minimum transverse momentum of the jets fixed to both 40 and 20 GeV. These marginally agree with the numbers of events expected in the CMS data sample, obtained on the basis of a LO+PS computation by MadGraph [38] + PYTHIA (first line in table 2 and table 1 Table 2. Total cross sections in fb at 14 TeV in the different dilepton channels with cuts listed in the text. The minimum transverse momentum of the jets is set to 20 GeV.  Table 3. Total cross sections in fb at 8 TeV in the different dilepton channels with cuts listed in the text. The minimum transverse momentum of the jets is set to 40 GeV.  Table 4. Total cross sections in fb at 8 TeV in the different dilepton channels with cuts listed in the text. The minimum transverse momentum of the jets is set to 20 GeV. only be indicative for two reasons. On the one hand our predictions assume 100% btagging efficiency, while the experimental analysis has smaller. On the other hand, we require at least four b-jets, treating all b-jets on the same footing, while the experimental analysis requires only two b-jets, because they do not include the b-jets coming from t-JHEP03(2015)083  Table 5. Number of expected ttbb events at 8 TeV in the different dilepton channels with cuts listed in the text. The minimum transverse momentum of the jets is set to 40 GeV (first line) and 20 GeV (second line), respectively. The uncertainty corresponds to the variation of the renormalization and factorization scales around their default value by factors of half and two. quark decays in this count. Furthermore, in order to identify b-jets we kept the lowest lying B-hadrons stable in PYTHIA for simplicity, tagging as b-jets those including at least a B-hadron, whereas the experiment reconstructs b-jets from their decay products, using displaced vertex information.
Recently an independent study of ttbb and ttjj hadroproduction, also inspired by the same CMS note, appeared in ref. [39]. This work differs from our one because it is an NLO study, including a simplified sets of cuts applied on parton level events (also including top decays, but neglecting parton shower, hadronization and hadron decay effects).
In figures 9-12 we present several kinematic distributions obtained with these selection cuts (for the case of minimum transverse momentum of jets fixed to 40 GeV): the transverse momenta and rapidities of the two hardest b-jets, the transverse momentum, rapidity and invariant mass distribution of the hardest b-jet pair and the invariant mass distribution of the hardest b-jet and hardest charged lepton. In each figure we show two predictions, one for 8 TeV c.m. energy (smaller cross sections) and the other for 14 TeV (larger cross sections). The shapes of the rapidity distributions are very similar at the two energies, the result of the larger collision energy just amounting to about a five to six-fold increase of the JHEP03(2015)083  cross section. On the other hand, in case of the invariant mass and transverse momentum distributions, the spectra at 14 TeV also become considerably harder with respect to those at 8 TeV.
In the same figures we exhibit two bands: one corresponding to the envelope of standard variations of the renormalization and factorization scales, between half and twice the default scale µ 0 , while the other to the envelope of dependence on the PDF set (CT10NLO, MSTW2008NLO and NNPDF). In order to see these better, in the lower panels we show all predictions normalized to our default choice: predictions with µ 0 = H ⊥ /2 and CT10NLO PDF set. The middle panels corresponds to the cross sections at 14 TeV, while the lower ones to those at 8 TeV. We see that the scale dependence for these distributions follows the scale dependence found at the NLO accuracy [13]. It is similar in size (about +38%-26% at 14 TeV and +41%-29% at 8 TeV, for a jet p ⊥ cut at 40 GeV) and also uniform in shape.

JHEP03(2015)083
The PDF bands are much narrower. In general, we find that up to statistical fluctuations, the smallest cross sections are obtained with CT10NLO, while MSTW2008NLO and NNPDF give similar, at most ∼ 10% larger values than our default, with MSTW predictions slightly larger than NNPDF ones. These results show that this complicated final state with rather exclusive experimental selection cuts can be modelled at NLO accuracy matched to SMC with theoretical uncertainties that are comparable to what we can find at fixed order.
We present an example for lepton distributions in figure 13(a) where the spectrum of the transverse momentum of the hardest lepton is depicted. This lepton emerges in the decay of one of the t-quarks, i.e. it is not present in the NLO matrix elements. Nevertheless, all what we pointed out for the b-jet distributions concerning the scale and PDF uncertainties are valid for this (and other leptonic distributions, not shown here), too.
Finally, in figure 13(b) we present the distribution of the transverse momentum of the ttbb system, emphasizing the low p ⊥ -region, where we see the effect of Sudakov damping. This distribution is divergent for p ⊥ = 0 at fixed order in perturbation theory, while it is finite in the matched prediction. We find decreasing scale dependence with decreasing p ⊥ , although with naturally worsening statistics.

Conclusions
We have studied the pp → ttbb process including NLO QCD corrections computed with five massless flavours matched with parton shower, using the PowHel event generator interfaced with the PYTHIA SMC. This computation, challenging due to the presence of two massless b's in the final state, producing singular underlying Born configurations, opens the road to JHEP03(2015)083 Figure 13. Same as figure 9, as for the transverse momentum of (a) the hardest lepton, (b) the ttbb system. a unified treatment of ttbb and ttjj. The latter process can also produce tt+2 b-jet final states when both jets are tagged as b-jets due to g → bb splittings. Such contributions are not included in our computation.
We presented predictions at different stages of event evolution in order to show the effect of the SMC in modifying NLO distributions. We found that the effect of the parton shower over the predictions at NLO accuracy are usually very small except for rapidity distributions and small values of transverse momenta. The effect of the hadronization is to soften the transverse momentum or invariant mass spectra and can be up to 30%. However, the largest effect is due to the decay of the heavy quarks. Kinematic distributions of the leptonic decay products of the heavy quarks are not affected by the hadronization.
We paid special attention to the identification of the sources of uncertainties and in the estimate of the uncertainty bands at the hadron level. We found that the scale dependences change only very moderately with c.m. energy, and are very similar to those of the predictions at the NLO accuracy reported in ref. [13], altough the latter were obtained with looser systems of cuts. These scale dependences are fairly uniform for all distributions shown when we employ our default scale, the half of the sum of transverse masses in the final state, which is a standard choice for final states with high multiplicity [28] and for those involving heavy quark pairs in multiscale dynamical regimes [29]. The PDF uncertainties are much smaller, with CT10NLO yielding the smallest cross section in general, while MSTW2008NLO and NNPDF giving very similar results that are up to about 10% higher. These results show that our events can model the ttbb final states with uniform theoretical uncertainties similar in size to those found at fixed order.
Final states consisting of a tt and a bb quark pair constitute important backgrounds for Higgs boson production in association with a tt-pair. Our events can be used to optimize the selection of the signal events to find the best signal/background ratio in the various decay channels. For this purpose sets of LHEs can be downloaded from our web-page for JHEP03(2015)083 both the ttH signal (http://grid.kfki.hu/twiki/bin/view/DbTheory/TthProd) and for the background (http://grid.kfki.hu/twiki/bin/view/DbTheory/TtbbProd), or new ones can be requested for different choices of scale, PDF, and parameter values.