1 Introduction

At the Large Hadron Collider, searches for \(t \bar{t} H\) production in the \(H\rightarrow b \bar{b} \) channel are plagued by a large QCD background, which is dominated by \(t \bar{t} b\bar{b} \) production, and the availability of precise theoretical predictions for this multi-particle background process is of crucial importance for the sensitivity of \(t \bar{t} H(b \bar{b})\) analyses. The process \(pp\rightarrow t \bar{t} b\bar{b} \) is also very interesting on its own, as it provides a unique laboratory to explore the QCD dynamics of heavy-quark production and to test state-of-the-art Monte Carlo predictions in a nontrivial multi-scale environment.

As a result of its \(\alpha _\mathrm{S}^4\) dependence, the leading-order (LO) \(t \bar{t} b\bar{b} \) cross section is highly sensitive to variations of the renormalisation scale. The uncertainty corresponding to standard factor-two scale variations amounts to 70–80% at LO, and the inclusion of next-to-leading order (NLO) QCD corrections [1,2,3] is mandatory. At NLO, the scale dependence goes down to 20–30%, and in order to avoid excessively large K-factors and potentially large corrections beyond NLO, the renormalisation scale should be chosen in a way that accounts for the fact that the typical energies of the b-jet system are far below the hardness of the underlying \(pp\rightarrow t \bar{t} \) process [3].
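The amplification of scale variations by an \(\alpha _\mathrm{S}^4\) dependence can be illustrated with a minimal sketch. It keeps only the renormalisation-scale dependence of the coupling at one loop and ignores PDFs and the factorisation scale, so it is not expected to reproduce the 70–80% quoted above. The reference value \(\alpha _\mathrm{S}(M_Z)=0.118\), the choice of five active flavours and the central scale of 100 GeV are illustrative assumptions, not the setup used in this paper.

```python
import math

def alpha_s_one_loop(mu, mu_ref=91.1876, alpha_ref=0.118, n_f=5):
    """One-loop running coupling at scale mu (GeV).

    mu_ref, alpha_ref and n_f are illustrative assumptions.
    """
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return alpha_ref / (1.0 + b0 * alpha_ref * math.log(mu**2 / mu_ref**2))

def lo_scale_band(mu0, power=4, factor=2.0):
    """Ratios sigma(mu0/factor)/sigma(mu0) and sigma(mu0*factor)/sigma(mu0)
    for a cross section that scales like alpha_s**power."""
    central = alpha_s_one_loop(mu0) ** power
    low_scale = alpha_s_one_loop(mu0 / factor) ** power / central
    high_scale = alpha_s_one_loop(mu0 * factor) ** power / central
    return low_scale, high_scale

if __name__ == "__main__":
    low_scale, high_scale = lo_scale_band(100.0)
    print(f"mu0/2: {low_scale:.2f} x central, 2*mu0: {high_scale:.2f} x central")
```

Since a relative shift \(\delta \) of the coupling translates into roughly \(4\delta \) in the cross section, even a moderate running of \(\alpha _\mathrm{S}\) over a factor-two scale range produces a shift of several tens of percent at LO.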

The first NLOPS simulation of \(pp\rightarrow t \bar{t} b\bar{b} \) was carried out in Powhel  [4, 5] by combining NLO matrix elements in the five-flavour (5F) scheme with parton showers by means of the Powheg method [6, 7]. Shortly after, an NLOPS generator based on four-flavour (4F) \(pp\rightarrow t \bar{t} b\bar{b} \) matrix elements became available in the Sherpa+OpenLoops framework [8], which implements an improved version [9] of the MC@NLO matching method [10]. Thanks to the inclusion of b-mass effects, \(t \bar{t} b\bar{b} \) matrix elements in the 4F scheme are applicable to the full b-quark phase space, including regions where one b-quark remains unresolved. Thus the 4F scheme guarantees a consistent NLOPS description of inclusive \(t \bar{t} +b\)-jet production with one or more b-jets. On the contrary, NLOPS \(t \bar{t} b\bar{b} \) generators based on 5F matrix elements with massless b-quarks suffer from collinear \(g\rightarrow b \bar{b} \) singularities that require ad-hoc restrictions of the physical phase space through generation cuts.

In Ref. [8] it was pointed out that matching and shower effects play an unexpectedly important role in \(t \bar{t} +b\)-jet production. This is due to the fact that two hard b-jets can arise from two hard jets that each involve a collinear \(g\rightarrow b \bar{b} \) splitting. In NLOPS simulations of \(pp\rightarrow t \bar{t} b\bar{b} \), such configurations result from the combination of a \(g\rightarrow b \bar{b} \) splitting that is described at NLO accuracy through \(t \bar{t} b\bar{b} \) matrix elements together with a second \(g\rightarrow b \bar{b} \) splitting generated by the parton shower. The impact of this so-called double-splitting mechanism can be of similar magnitude to the \(t \bar{t} H(b \bar{b})\) signal, and a thorough understanding of the related matching and shower uncertainties is very important for \(t \bar{t} H\) analyses.

A first assessment of NLOPS uncertainties was presented in Ref. [11] through a tuned comparison of NLOPS \(t \bar{t} b\bar{b} \) simulations in Powhel [4, 5], Sherpa+OpenLoops [8] and Madgraph5aMC@NLO [12]. On the one hand, this study revealed significant differences between the two generators based on the MC@NLO matching method, i.e. Sherpa and Madgraph5aMC@NLO. Such differences were found to be related to a pronounced dependence on the shower starting scale in Madgraph5aMC@NLO. On the other hand, in spite of the fact that Sherpa+OpenLoops and Powhel implement different matching methods and different parton showers, the predictions of these two generators turned out to be quite consistent. However, due to the limitations related to the use of the 5F scheme in Powhel (which have been overcome only very recently with the 4F upgrade of Powhel [13]), the agreement between Powhel and Sherpa+OpenLoops did not allow any firm conclusion to be drawn in the study of Ref. [11].

To date the assessment of theoretical uncertainties in \(t \bar{t} b\bar{b} \) production remains an important open problem. In this context one should address the question of which of the various NLOPS methods and tools on the market are more or less appropriate to describe the process at hand. Moreover, in order to address such issues in a systematic way, it is desirable to develop a better picture of the QCD dynamics that drive \(t \bar{t} +b\)-jet production. In this spirit, this paper starts with a discussion of the various possible frameworks for theoretical simulations of \(t \bar{t} +b\)-jet production at NLOPS accuracy. In particular, we present detailed studies on the role of \(g\rightarrow b \bar{b} \) splittings and discuss the advantages and disadvantages of the 4F and 5F schemes. To this end we quantify the relative importance of \(g\rightarrow b \bar{b} \) splittings of initial-state (IS) and final-state (FS) type by using approximations based on collinear QCD factorisation, as well as by decomposing \(pp\rightarrow t \bar{t} b\bar{b} \) matrix elements into diagrams involving IS and FS \(g\rightarrow b \bar{b} \) splittings. These studies demonstrate that \(t \bar{t} +b\)-jet production is widely dominated by \(pp \rightarrow t \bar{t} g\) followed by FS \(g\rightarrow b \bar{b} \) splittings. This holds also for observables where initial-state splittings are expected to be enhanced, such as in regions with a single resolved b-jet. These findings support the use of NLOPS generators based on \(pp\rightarrow t \bar{t} b\bar{b} \) matrix elements in the 4F scheme, where b-mass effects guarantee a consistent treatment of FS \(g\rightarrow b \bar{b} \) splittings. We also consider more inclusive simulations of \(t \bar{t} +\)jets production based on multi-jet merging [14,15,16,17,18,19]. In this case we find that \(t \bar{t} +b\)-jet observables suffer from an unexpectedly strong dependence on the parton-shower modelling of \(g\rightarrow b \bar{b} \) splittings.

Motivated by the above findings we present a new Powheg generator for \(pp\rightarrow t \bar{t} b\bar{b} \) in the 4F scheme. At variance with the Powhel generator of Ref. [13], this new Powheg generator is implemented in the Powheg-Box-Res framework [22] using OpenLoops, which guarantees a very fast evaluation of the required \(2\rightarrow 4\) and \(2\rightarrow 5\) matrix elements. The new generator also supports top-quark decays including spin-correlation effects. Moreover, in order to guarantee a more consistent resummation of QCD radiation, the separation of the so-called singular and finite parts in the Powheg-Box is not restricted to initial-state radiation as in Ref. [13] but is applied also to final-state radiation (see Sect. 3).

As far as the Powheg methodology is concerned, we pay particular attention to issues related to the multi-scale nature of the process at hand. In particular we point out that the treatment of the recoil associated with NLO radiation can induce sizeable distortions of the underlying \(t \bar{t} b\bar{b} \) cross section. This technical inconvenience restricts the domain of applicability of QCD factorisation in a way that can jeopardise the efficiency of event generation and can also lead to unphysical resummation effects. Fortunately, such issues can be avoided by means of a Powheg-Box mechanism that restricts the resummation of real radiation to kinematic regions where QCD factorisation is fulfilled to reasonably good accuracy.

Predictions for \(pp\rightarrow t \bar{t} +b\)-jets at the 13 TeV LHC are presented for various cross sections and distributions with emphasis on the discussion of theoretical uncertainties. In addition to QCD scale variations, uncertainties related to the matching method and intrinsic shower uncertainties are analysed in detail. In particular, we consider different approximations for the modelling of \(g\rightarrow b \bar{b} \) splittings as well as \(\alpha _\mathrm{S}\) scale uncertainties in Pythia. Moreover, we compare Pythia to Herwig. Finally, to gain further insights into the size and the nature of matching and shower uncertainties we present a consistent comparison of Powheg+Pythia generators of \(t \bar{t} b\bar{b} \) and inclusive \(t \bar{t} \) production against corresponding generators based on Sherpa+OpenLoops.

The new \(t \bar{t} b\bar{b} \) generator will soon be publicly available on the Powheg-Box web site [23].

The paper is organised as follows. In Sect. 2 we study the role of \(g\rightarrow b \bar{b} \) splittings in \(t \bar{t} +b\)-jet production, and we point out the advantages of Monte Carlo generators based on \(pp\rightarrow t \bar{t} b\bar{b} \) matrix elements in the 4F scheme as compared to more inclusive generators of \(t \bar{t} \) production in the 5F scheme. Technical aspects of the new Powheg generator and the setup for numerical simulations are discussed in Sect. 3. In particular, in Sect. 3.1 we review aspects of the Powheg method that can play a critical role for multi-scale processes like \(pp\rightarrow t \bar{t} b\bar{b} \). Detailed predictions and uncertainty estimates for cross sections and distributions for \(pp\rightarrow t \bar{t} b\bar{b} \) with stable and unstable top quarks can be found in Sects. 4 and 5, respectively. Our main findings are summarised in Sect. 6.

2 Anatomy of \(\varvec{t \bar{t} +b}\)-jet production and \(\varvec{g\rightarrow b \bar{b}}\) splittings

Events with \(t \bar{t} +b\)-jets final states arise from an underlying \(pp\rightarrow t \bar{t} \) process that takes place at scales of the order of 500 GeV and is accompanied by the production of b-jets at typical transverse momenta of a few tens of GeV. The production of b-jets is governed by IS or FS \(g\rightarrow b \bar{b} \) splittings and is enhanced in kinematic regions where the \(p_\mathrm {T}\) of individual b-quarks becomes small or the \(b \bar{b} \) pair becomes collinear and, possibly, also soft.

The understanding of the QCD dynamics that governs b-jet production is a crucial prerequisite for a reliable theoretical description of \(t \bar{t} +b\)-jet production and related uncertainties. In this spirit, this section compares various theoretical frameworks for the description of \(t \bar{t} +b\)-jet production at NLO QCD accuracy with a special focus on the role of \(g\rightarrow b \bar{b} \) splittings. Specifically, we compare inclusive or merged simulations of \(t \bar{t} +\)multi-jet production in the 5F scheme against a description based on \(t \bar{t} b\bar{b} \) matrix elements in the 4F scheme, pointing out the advantages of the latter.

Numerical studies presented in this section are based on the setup specified in Sects. 3.2 and 3.6 and have been performed with Sherpa+OpenLoops.

2.1 NLOPS \(\varvec{t \bar{t}}\) simulations in the five-flavour scheme

Inclusive NLOPS generators of \(t \bar{t} \) production [24, 25] are based on \(pp\rightarrow t \bar{t} \) NLO matrix elements matched to parton showers in the 5F scheme. In this framework, as illustrated in Fig. 1, \(t \bar{t} +b\)-jet events are generated starting from \(2\rightarrow 3\) tree matrix elements of type \(gb\rightarrow t \bar{t} b\) or \(gg/q\bar{q}\rightarrow t \bar{t} g\). In the latter case, \(t \bar{t} b\bar{b} \) events arise via FS \(g\rightarrow b \bar{b} \) shower splittings. Instead, in the case of \(gb\rightarrow t \bar{t} b\) the final-state b-quark emerges from the matrix element, while \(g\rightarrow b \bar{b} \) splittings generate the initial-state b-quark through the evolution of the 5F PDFs. The unresolved spectator b-quark associated with such IS \(g\rightarrow b \bar{b} \) splittings is emitted by the parton shower via backward evolution. The main advantage of the 5F scheme lies in the resummation of potentially large \(\alpha _\mathrm{S}\ln (m_t/m_b)\) terms associated with the evolution of the b-quark density. However, such logarithmic effects are typically rather mild at the LHC [26]. Moreover, as we will show in Sect. 2.3, \(t \bar{t} +b\)-jet production is largely dominated by topologies with FS \(g\rightarrow b \bar{b} \) splittings. For this reason, \(t \bar{t} +b\)-jet predictions based on NLOPS \(t \bar{t} \) generators suffer from the twofold disadvantage given by the direct dependence on the parton-shower modelling of FS \(g\rightarrow b \bar{b} \) splittings plus the LO nature of the underlying \(t \bar{t} g\) matrix element.

2.2 \(\varvec{t \bar{t}}+\)multi-jet merging in the five-flavour scheme

As a possible strategy to reduce the sensitivity to the parton shower and increase the accuracy of theoretical predictions we consider \(t \bar{t} +\)multi-jet merging in the 5F scheme. In this approach, a tower of NLOPS simulations for \(t \bar{t} +0,1,\ldots ,N\) jet production is merged into a single inclusive sample [17,18,19]. This is achieved by clustering QCD partons into jets with a certain \(k_\mathrm {T}\)-resolution, \(Q_{\mathrm {cut}}\), which is known as the merging scale. At LO, the phase-space regions with \(N=0,1,\ldots , N_{\mathrm {max}}\) resolved jets (\(k_T> Q_{\mathrm {cut}}\)) are described in terms of \(t \bar{t} +N\)-jet LOPS simulations. The LOPS simulation with \(N_{\mathrm {max}}\) jets fills also the phase space with \(N>N_{\mathrm {max}}\) resolved jets by means of the parton shower. At NLO, the resolution criterion used to separate regions of different jet multiplicity is exactly the same as at LO, while the basic difference with respect to LO merging lies in the fact that \(t \bar{t} +N\)-jet LOPS simulations are replaced by corresponding NLOPS simulations. Thus in NLO (LO) merging the effective number of resolved jets that is described at NLOPS (LOPS) accuracy is \(N_{\mathrm {eff}}=\mathrm {min}\{N,N_{\mathrm {max}}\}\), while the \((N_{\mathrm {eff}}+1)^{\mathrm {th}}\) resolved or unresolved jet is described at LOPS (pure PS) accuracy, and all remaining resolved or unresolved jets are described at pure PS accuracy.
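The accuracy counting described above can be spelled out as a small lookup. The following sketch is only a bookkeeping aid. The function name `jet_accuracy` and its arguments are hypothetical and do not correspond to any interface of Sherpa or Madgraph5aMC@NLO.

```python
def jet_accuracy(k, n_resolved, n_max, order="NLO"):
    """Formal accuracy of the k-th jet (k = 1, 2, ...) in a merged sample.

    n_resolved: number N of jets resolved above the merging scale.
    n_max:      highest jet multiplicity included at matrix-element level.
    order:      "NLO" or "LO" merging.
    """
    if order not in ("LO", "NLO"):
        raise ValueError("order must be 'LO' or 'NLO'")
    if k < 1:
        raise ValueError("jets are counted from k = 1")
    n_eff = min(n_resolved, n_max)
    if k <= n_eff:
        # The first N_eff jets come from matched matrix elements.
        return "NLOPS" if order == "NLO" else "LOPS"
    if k == n_eff + 1:
        # One extra jet: real-emission matrix element at NLO, shower at LO.
        return "LOPS" if order == "NLO" else "PS"
    # Everything beyond that is left to the parton shower.
    return "PS"

if __name__ == "__main__":
    for k in range(1, 5):
        print(k, jet_accuracy(k, n_resolved=3, n_max=2, order="NLO"))
```

For example, with \(N_{\mathrm {max}}=2\) and three resolved jets, NLO merging describes the first two jets at NLOPS accuracy, the third at LOPS accuracy, and any further jet at pure PS accuracy.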

Fig. 1

Sample \(t \bar{t} b\bar{b} \) diagrams with IS (left) and FS (right) \(g\rightarrow b \bar{b} \) splittings. In NLOPS simulations of inclusive \(t \bar{t} \) production in the 5F scheme the black subtopologies are described in terms of tree matrix elements, while the orange lines correspond to parton shower emissions

Multi-jet merging for \(t \bar{t} +\)jet at NLO can be performed in a fully automated way within the Sherpa  [27] and Madgraph5aMC@NLO  [12] frameworks. However, the fact that \(t \bar{t} +b\)-jet events constitute only a small fraction of a \(t \bar{t} +\)jets sample poses very high requirements in terms of Monte Carlo statistics. Moreover, in order to minimise the dependence on parton-shower modelling, such simulations should be performed using a small merging scale and including a sufficiently high number of NLO jets, \(N_{\mathrm {max}}\). The required CPU resources grow very fast at large \(N_{\mathrm {max}}\) and small \(Q_{\mathrm {cut}}\), and state-of-the-art merged simulations can handle up to two jets at NLO [28] at present.

Fig. 2

Sample diagrams describing the interplay between matrix elements (black) and parton shower (orange) for \(t \bar{t} +b\)-jet events in merged LOPS simulations of \(t \bar{t} +0,1,2\) jet production in the 5F scheme. In regions where the \(b \bar{b} \) system and its parent gluon are produced below the merging scale, the event is described through hard \(t \bar{t} \) matrix elements plus parton-shower branchings (a). When the parent gluon becomes harder, the parton shower is used only to model collinear \(g\rightarrow b \bar{b} \) splittings (b). Above the merging scale, if \(g\rightarrow b \bar{b} \) splittings do not belong to the two hardest branchings they are still left to the parton shower, while \(2\rightarrow 4\) matrix elements are used to account for other light jets (c). Otherwise, the event is described through \(t \bar{t} b\bar{b} \) matrix elements (d)

The multi-jet merging description of \(t \bar{t} b\bar{b} \) events with FS \(g\rightarrow b \bar{b} \) splittings is sketched in Fig. 2 for the case of \(t \bar{t} +0,1,2\)-jet merging at LO. In regions where the \(b \bar{b} \) pair and/or the parent gluon are emitted at small scales, \(g\rightarrow b \bar{b} \) splittings are expected to be generated by the parton shower, while hard b-jet pairs are expected to arise from \(t \bar{t} b\bar{b} \) matrix elements. However, as we will see, typical \(t \bar{t} +b\)-jet events involve additional light jets that are emitted at harder scales with respect to the \(g\rightarrow b \bar{b} \) branching. In that case, \(2\rightarrow 4\) matrix elements account only for light jets, and \(g\rightarrow b \bar{b} \) splittings are left to the parton shower. In general, the relative importance of matrix elements and parton shower depends on the resolution scale \(Q_{\mathrm {cut}}\), and using a finite resolution is mandatory in the 5F scheme, since collinear \(g\rightarrow b \bar{b} \) splittings are divergent, i.e. \(t \bar{t} b\bar{b} \) matrix elements cannot be used in the full phase space.

In Figs. 3 and 4 we analyse the matrix-element content of a \(t \bar{t} +0,1,2\)-jet merged simulation at LO. These studies are based on the MEPS@LO method [16] implemented in Sherpa, but the main findings are expected to hold also for other merging methods. Besides the b-jet multiplicity distribution, in Figs. 3 and 4 we plot differential observables in the presence of ttbb cuts, i.e. requiring \(N_b\ge 2\) b-jets as defined in Sect. 3.6. For jets we apply the acceptance cuts Eq. (39) and, in order to maximise the possibility to resolve jets at matrix-element level, we choose a merging scale lower than the jet-\(p_\mathrm {T}\) threshold, \(Q_{\mathrm {cut}}=20\) GeV.

Fig. 3

Breakdown of merged LOPS simulations of \(pp\rightarrow t \bar{t} +0,1,2\) jets production at 13 TeV into the contributions from matrix elements with \(t \bar{t} +\)0,1 and 2 generic QCD partons: distributions in the number of b-jets with \(p_\mathrm {T}>25\) GeV (a), the invariant mass of the two leading b-jets (b), and the \(p_\mathrm {T}\) of the leading b-jet (c)

As expected, in Fig. 3 we find that \(t \bar{t} +b\)-jet observables with \(N_b\ge 2\) resolved b-jets are largely dominated by \(t \bar{t} +\)2-parton matrix elements. This holds also for \(N_b\ge 1\). However, the breakdown of the merged sample into contributions from matrix elements with different b-quark multiplicity in Fig. 4 reveals that, in spite of the low merging scale, the cross section for producing one or more b-jets is dominated by matrix elements with zero b-quarks. The contribution of \(t \bar{t} b\bar{b} \) matrix elements hardly exceeds 50% even in the region of large b-jet \(p_{\mathrm {T}}\) or large invariant mass of the b-jet pair. This counterintuitive feature can be attributed to the fact that, in \(t \bar{t} +\)jet events that involve \(g\rightarrow b \bar{b} \) splittings, the two hardest QCD branchings are typically associated with the emission of the parent gluon of the \(b \bar{b} \) pairs and/or with the production of other light jets. As a consequence, in LOPS merged samples with \(N_{\mathrm {max}}=2\), \(g\rightarrow b \bar{b} \) splittings are left to the parton shower. We have verified that the contribution of \(t \bar{t} b\bar{b} \) matrix elements remains relatively low even when the merging procedure is extended up to 3 or 4 jets with \(Q_{\mathrm {cut}}=20\) GeV. Moreover, we have checked that increasing the merging scale leads to a further suppression of the contribution of \(t \bar{t} b\bar{b} \) matrix elements.

Figure 4d demonstrates that \(t \bar{t} +2\,b\)-jet events are indeed accompanied by abundant emission of extra light jets, and the importance of \(t \bar{t} b\bar{b} \) matrix elements decreases with increasing light-jets multiplicity. Moreover, even in the bin with zero additional light jets it turns out that the contribution of \(t \bar{t} b\bar{b} \) matrix elements remains below 50%. This is probably due to the fact that \(g\rightarrow b \bar{b} \) splittings tend to take place at branching scales below the jet-\(p_\mathrm {T}\)-threshold of 25 GeV.

In summary, contrary to naive expectations, \(t \bar{t} +\)jets samples based on LOPS merging do not guarantee a matrix-element description of b-jet production but largely rely on the parton-shower modelling of \(g\rightarrow b \bar{b} \) splittings. In the case of multi-jet merging at NLO, to a certain extent \(g\rightarrow b \bar{b} \) shower splittings should be matched to \(t \bar{t} b\bar{b} \) and \(t \bar{t} b\bar{b} g\) tree matrix elements. Nevertheless, based on the above observations, the theoretical accuracy in the description of b-jet production is expected to remain between the LOPS and the pure PS level. Finally, in the light of the presence of abundant light-jet radiation with a typical hardness beyond that of b-jets, the role of hard radiation on top of the \(t \bar{t} b\bar{b} \) system should be studied with great care also in the context of NLOPS simulations of \(t \bar{t} b\bar{b} \) production (see Sect. 3.1).

Fig. 4

Breakdown of merged LOPS simulations of \(t \bar{t} +0,1,2\) jets production at 13 TeV into the contributions from matrix elements with \(t \bar{t} +\)0,1 and 2 b-quarks. Same observables a–c as in Fig. 3 and distribution in the exclusive number of light jets with \(p_\mathrm {T}>25\) GeV (d) in the presence of ttbb cuts

2.3 \({t \bar{t} b\bar{b}}\) production in the four-flavour scheme

In order to minimise the dependence on parton-shower modelling and to maximise the use of higher-order matrix elements, in the following we will adopt a description of \(t \bar{t} +b\)-jet production based on \(t \bar{t} b\bar{b} \) matrix elements in the 4F scheme. In this scheme, b-quarks are treated as massive partons, and \(g\rightarrow b \bar{b} \) splittings are free from collinear singularities. Thus \(t \bar{t} b\bar{b} \) matrix elements can be used in the entire phase space. Generic \(t \bar{t} b\bar{b} \) topologies where b-quarks emerge from IS and FS splitting processes are illustrated in Fig. 5. In the case of FS \(g\rightarrow b \bar{b} \) splittings, \(t \bar{t} b\bar{b} \) matrix elements with \(m_b>0\) can be extended to the collinear regime, where the \(b \bar{b} \) pair becomes unresolved within a single b-jet. Similarly, 4F \(t \bar{t} b\bar{b} \) matrix elements describe also collinear IS \(g\rightarrow b \bar{b} \) splittings, where the spectator b-quark is emitted in the beam direction and remains unresolved, while the \(bg\rightarrow t \bar{t} b\) sub-process with a single b-jet corresponds to the description of \(t \bar{t} +b\)-jet production at LO in the 5F scheme. Thus, \(t \bar{t} b\bar{b} \) matrix elements provide a fully inclusive description of \(t \bar{t} +b\)-jet production, and NLO predictions in the 4F scheme yield NLO accuracy both for observables with two b-jets and for more inclusive observables with a single resolved b-jet.

The inclusion of \(m_b\) effects in \(g\rightarrow b \bar{b} \) splittings represents a clear advantage of the 4F scheme with respect to the 5F scheme. However, the 4F scheme has the disadvantage that potentially large \(\alpha _\mathrm{S}\ln (m_b/Q)\) terms that arise from IS \(g\rightarrow b \bar{b} \) splittings are not resummed through the PDF evolution. In the following, in order to assess the relevance of this limitation, we decompose the \(pp\rightarrow t \bar{t} b\bar{b} \) LO cross section into contributions from IS and FS \(g\rightarrow b \bar{b} \) splittings. Since the \(q\bar{q}\) channel involves only FS \(g\rightarrow b \bar{b} \) splitting, we focus on the gg channel and we first consider a naive diagrammatic splitting of the \(gg\rightarrow t \bar{t} b\bar{b} \) matrix element,

$$\begin{aligned} \mathscr {M}_{t \bar{t} b\bar{b}}= \mathscr {M}_{\mathrm {IS},t \bar{t} b\bar{b}} + \mathscr {M}_{\mathrm {FS},t \bar{t} b\bar{b}}+ \mathscr {M}_{\mathrm {rem},t \bar{t} b\bar{b}}. \end{aligned}$$
(1)

The terms \(\mathscr {M}_{\mathrm {IS},t \bar{t} b\bar{b}}\) and \(\mathscr {M}_{\mathrm {FS},t \bar{t} b\bar{b}}\) correspond, respectively, to 18 diagrams with IS \(g\rightarrow b \bar{b} \) splittings and 16 diagrams with FS \(g\rightarrow b \bar{b} \) splittings. Generic diagrams with IS and FS splittings are depicted in Fig. 5a–b. The term \(\mathscr {M}_{\mathrm {rem},t \bar{t} b\bar{b}}\) corresponds to two remaining diagrams where an s-channel gluon splits into an on-shell b-quark and an off-shell b-line coupled to the \(t \bar{t} \) system. Its numerical impact turns out to be negligible. Based on Eq. (1) we split the \(t \bar{t} b\bar{b} \) cross section into three terms,

$$\begin{aligned} \mathrm {d}\sigma _{t \bar{t} b\bar{b}}= \mathrm {d}\sigma _{\mathrm {IS},t \bar{t} b\bar{b}} +\mathrm {d}\sigma _{\mathrm {FS},t \bar{t} b\bar{b}} +\mathrm {d}\sigma _{\mathrm {int},t \bar{t} b\bar{b}}, \end{aligned}$$
(2)

where the IS and FS parts are defined as

$$\begin{aligned} \mathrm {d}\sigma _{\mathrm {IS},t \bar{t} b\bar{b}}&=\frac{|\mathscr {M}_{\mathrm {IS},t \bar{t} b\bar{b}}|^2}{|\mathscr {M}_{t \bar{t} b\bar{b}}|^2}\, \mathrm {d}\sigma _{t \bar{t} b\bar{b}}, \nonumber \\ \mathrm {d}\sigma _{\mathrm {FS},t \bar{t} b\bar{b}}&=\frac{|\mathscr {M}_{\mathrm {FS},t \bar{t} b\bar{b}}|^2}{|\mathscr {M}_{t \bar{t} b\bar{b}}|^2}\, \mathrm {d}\sigma _{t \bar{t} b\bar{b}}, \end{aligned}$$
(3)

while \(\mathrm {d}\sigma _{\mathrm {int},t \bar{t} b\bar{b}}\) consists of the interference between \(\mathscr {M}_{\mathrm {IS},t \bar{t} b\bar{b}}\) and \(\mathscr {M}_{\mathrm {FS},t \bar{t} b\bar{b}}\) plus a minor contribution from \(\mathscr {M}_{\mathrm {rem},t \bar{t} b\bar{b}}\).
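The bookkeeping behind Eqs. (1)–(3) can be made explicit with a toy example. In the sketch below each amplitude is represented by a single complex number, whereas the actual amplitudes carry helicity and colour indices that are summed over. The numerical values are arbitrary and chosen only to show how a sizeable squared IS term can be compensated by a negative interference. The function name `decompose` is hypothetical.

```python
import math

def decompose(m_is, m_fs, m_rem=0j):
    """Fractions of |M|^2 assigned to the IS, FS and interference parts.

    Toy version of Eqs. (1)-(3): each amplitude is one complex number.
    The interference fraction is defined as the remainder, so it also
    absorbs everything that involves m_rem.
    """
    total = abs(m_is + m_fs + m_rem) ** 2
    if total == 0.0:
        raise ZeroDivisionError("full squared amplitude vanishes")
    w_is = abs(m_is) ** 2 / total
    w_fs = abs(m_fs) ** 2 / total
    w_int = 1.0 - w_is - w_fs
    return w_is, w_fs, w_int

if __name__ == "__main__":
    # Arbitrary toy amplitudes (assumption): |IS| = 0.5, |FS| = 1,
    # relative phase chosen such that cos(phi) = -0.25.
    m_fs = 1.0 + 0.0j
    m_is = complex(-0.125, 0.5 * math.sqrt(0.9375))
    print(decompose(m_is, m_fs))
```

With these toy inputs the FS fraction equals one, the IS fraction is 0.25 and the interference fraction is -0.25, so the three terms add up to the full result although neither the IS part nor the interference is small.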

Fig. 5

Generic leading-order \(gg\rightarrow t \bar{t} b\bar{b} \) topologies. The first line shows the most general form of topologies with IS (a) and FS (b) \(g\rightarrow b \bar{b} \) splittings. The second line shows the generic form of those topologies with IS (c) and FS (d) splittings that turn out to dominate \(gg\rightarrow t \bar{t} b\bar{b} \). The labels \(ij=56, 65\) stand for the \(b \bar{b}\) system, while \(\alpha =1,2\) indicates the initial-state gluon that generates the radiation

In order to check the soundness of the above gauge-dependent separation we compare it to an alternative definition of IS and FS \(g\rightarrow b \bar{b} \) contributions based on the collinear limits of the \(gg\rightarrow t \bar{t} b\bar{b} \) matrix element. In this case we define

$$\begin{aligned} \mathrm {d}\sigma _{\mathrm {IS}\otimes t \bar{t}}&= \frac{|\mathscr {M}_{\mathrm {IS}\otimes t \bar{t}}|^2}{|\mathscr {M}_{t \bar{t} b\bar{b}}|^2}\,\mathrm {d}\sigma _{t \bar{t} b\bar{b}}, \nonumber \\ \mathrm {d}\sigma _{\mathrm {FS}\otimes t \bar{t}}&= \frac{|\mathscr {M}_{\mathrm {FS}\otimes t \bar{t}}|^2}{|\mathscr {M}_{t \bar{t} b\bar{b}}|^2}\,\mathrm {d}\sigma _{t \bar{t} b\bar{b}}. \end{aligned}$$
(4)

Here \(|\mathscr {M}_{\mathrm {IS}\otimes t \bar{t}}|^2\) and \(|\mathscr {M}_{\mathrm {FS}\otimes t \bar{t}}|^2\) describe the collinear limits of the topologies depicted in Fig. 5c, d, respectively. Note that, for simplicity, we consider only the leading collinear enhancements where the \(b \bar{b} \) system originates either through the combination of \(g\rightarrow b \bar{b} \) and \(b\rightarrow gb\) IS splittings (\(\mathscr {M}_{\mathrm {IS}\otimes t \bar{t}}\)) or via IS \(g\rightarrow gg\) plus FS \(g\rightarrow b \bar{b} \) splittings (\(\mathscr {M}_{\mathrm {FS}\otimes t \bar{t}}\)). For events with external momenta

$$\begin{aligned} g(p_1)\,g(p_2) \rightarrow t (p_3)\, \bar{t}(p_4)\, b(p_5)\, \bar{b}(p_6), \end{aligned}$$
(5)

the collinear limits take the general form

$$\begin{aligned}&\big |\mathscr {M}_{\mathrm {IS}\otimes t \bar{t}/\mathrm {FS}\otimes t \bar{t}}\big |^2 = (8\pi \alpha _\mathrm{S})^2 \nonumber \\&\qquad \times \max _{\begin{array}{c} \alpha =1,2 \\ ij=56,65 \end{array}}\left\{ \frac{K_{\mathrm {IS}/\mathrm {FS}}(p_\alpha ,p_i,p_j)}{(p_\alpha -p_i-p_j)^2}\, \big |\mathscr {M}_{gg\rightarrow t \bar{t}}\big |^2_{p_\alpha \rightarrow z p_\alpha }\right\} ,\nonumber \\ \end{aligned}$$
(6)

where \(\alpha \in \{1,2\}\) and \(ij\in \{56,65\}\) specify, respectively, the IS gluon emitter and the ordering of the \(b \bar{b} \) pair as depicted in Fig. 5c, d. \(K_{\mathrm {IS}/\mathrm {FS}}(p_\alpha ,p_i,p_j)\) are the corresponding splitting kernels. The choice of \(\alpha \) and ij specifies a particular topology, and the maximum in Eq. (6) defines \(\big |\mathscr {M}_{\mathrm {IS}\otimes t \bar{t}}\big |^2\) and \(\big |\mathscr {M}_{\mathrm {FS}\otimes t \bar{t}}\big |^2\) as the collinear limit of the most likely topology of IS and FS type. The splitting kernels read

$$\begin{aligned} K_{\mathrm {IS}}(p_\alpha ,p_i,p_j)&=\frac{1}{(p_\alpha -p_i)^2-m_b^2} \frac{P_{gq}(x_{\mathrm {IS}})}{x_{\mathrm {IS}}} \frac{P_{qg}(y_{\mathrm {IS}})}{y_{\mathrm {IS}}},\nonumber \\ K_{\mathrm {FS}}(p_\alpha ,p_i,p_j)&=\frac{-1}{(p_i+p_j)^2} \frac{P_{gg}(z)}{z} P_{gq}(x_{\mathrm {FS}}) , \end{aligned}$$
(7)

where

$$\begin{aligned} P_{gq}(x)&=T_R\left[ 1-2x(1-x)\right] , \ P_{qg}(x)=C_F\left[ \frac{1+(1-x)^2}{x}\right] , \nonumber \\ P_{gg}(x)&= 2C_A\left[ \frac{x}{1-x}+\frac{1-x}{x}+x(1-x)\right] , \end{aligned}$$
(8)

with \(T_R=1/2\), \(C_F=4/3\) and \(C_A=3\). The various momentum fractions are set to

$$\begin{aligned} x_{\mathrm {IS}}&=\frac{E_\alpha -E_i}{E_\alpha },&y_{\mathrm {IS}}&=\frac{E_\alpha -(E_i+E_j)}{E_\alpha -E_i}, \nonumber \\ x_{\mathrm {FS}}&=\frac{E_j}{E_i+E_j},&z&= x_{\mathrm {IS}}\,y_{\mathrm {IS}}\,. \end{aligned}$$
(9)
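As a reading aid, Eqs. (7)–(9) can be transcribed directly into code. The sketch below follows the formulas as written, with four-momenta represented as plain `(E, px, py, pz)` tuples. All function names are hypothetical, and the value of \(m_b\) has to be supplied by the caller since it is part of the setup defined elsewhere in the paper.

```python
T_R, C_F, C_A = 0.5, 4.0 / 3.0, 3.0

def p_gq(x):
    """g -> q qbar splitting function of Eq. (8)."""
    return T_R * (1.0 - 2.0 * x * (1.0 - x))

def p_qg(x):
    """q -> g q splitting function of Eq. (8)."""
    return C_F * (1.0 + (1.0 - x) ** 2) / x

def p_gg(x):
    """g -> g g splitting function of Eq. (8)."""
    return 2.0 * C_A * (x / (1.0 - x) + (1.0 - x) / x + x * (1.0 - x))

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def msq(p):
    """Minkowski square with metric (+,-,-,-)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def fractions(p_a, p_i, p_j):
    """Energy fractions x_IS, y_IS, x_FS and z of Eq. (9)."""
    e_a, e_i, e_j = p_a[0], p_i[0], p_j[0]
    x_is = (e_a - e_i) / e_a
    y_is = (e_a - (e_i + e_j)) / (e_a - e_i)
    x_fs = e_j / (e_i + e_j)
    return x_is, y_is, x_fs, x_is * y_is

def k_is(p_a, p_i, p_j, m_b):
    """IS kernel of Eq. (7)."""
    x_is, y_is, _, _ = fractions(p_a, p_i, p_j)
    propagator = msq(sub(p_a, p_i)) - m_b ** 2
    return p_gq(x_is) / x_is * p_qg(y_is) / y_is / propagator

def k_fs(p_a, p_i, p_j):
    """FS kernel of Eq. (7)."""
    _, _, x_fs, z = fractions(p_a, p_i, p_j)
    return -1.0 / msq(add(p_i, p_j)) * p_gg(z) / z * p_gq(x_fs)
```

Note that \(z=x_{\mathrm {IS}}\,y_{\mathrm {IS}}\) reduces to \((E_\alpha -E_i-E_j)/E_\alpha \), i.e. the energy fraction left to the initial-state gluon after the emission of the \(b \bar{b} \) pair. Both kernels are negative for physical configurations, so that their ratio to the spacelike invariant \((p_\alpha -p_i-p_j)^2\) in Eq. (6) is positive.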

Finally, the underlying \(gg\rightarrow t \bar{t} \) squared matrix element in Eq. (6) reads

$$\begin{aligned} \big |\mathscr {M}_{gg\rightarrow t \bar{t}}\big |^2_{p_\alpha \rightarrow z p_\alpha }= & {} \left( 4\pi \alpha _\mathrm{S}\right) ^2 \theta \left( z -\frac{m_{t \bar{t}}}{p_1\cdot p_2}\right) \nonumber \\&\times \Bigg \{ \bigg [\frac{(p_{1}\cdot p_{3})^2}{(p_{1}\cdot p_{2})^2} +\frac{(p_{2}\cdot p_{3})^2}{(p_{1}\cdot p_{2})^2}\nonumber \\&-\frac{2m_t^2}{(p_{1}\cdot p_{2})} +\frac{m_t^4}{(p_{1}\cdot p_{3})(p_{2}\cdot p_{3})} \bigg ] \nonumber \\&\times \left[ \frac{(p_{1}\cdot p_{2})^2}{6(p_{1}\cdot p_{3})(p_{2}\cdot p_{3})}-\frac{3}{8}\right] \Bigg \}_{p_\alpha \rightarrow z p_\alpha },\nonumber \\ \end{aligned}$$
(10)

where helicity/colour sums and average factors are included, and the momentum of the IS emitter has to be rescaled by z.

Fig. 6

Breakdown of the \(pp\rightarrow t \bar{t} b\bar{b} \) cross section into contributions from topologies with IS and FS splittings at fixed-order LO in the 4F scheme: distributions in the \(p_\mathrm {T}\) of the leading b-jet with ttb (a) and ttbb (b) cuts, and distributions in the invariant mass (c) and \(\varDelta R\) separation (d) of the two leading b-jets with ttbb cuts. The complete \(gg/q\bar{q}\rightarrow t \bar{t} b\bar{b} \) matrix-element prediction (solid red) is split according to Eq. (2) into contributions from topologies of IS (solid blue) and FS (solid green) type and their interference (solid purple). This is compared to the gauge-invariant breakdown of Eq. (4) into IS (dashed blue) and FS (dashed green) parts based on the collinear limits of the \(t \bar{t} b\bar{b} \) matrix element. Note that the \(q\bar{q}\) channel consists solely of FS \(g\rightarrow b \bar{b} \) contributions

Numerical results for the diagrammatic decomposition Eq. (2) and the collinear decomposition Eq. (4) of \(pp\rightarrow t \bar{t} b\bar{b} \) at \(\sqrt{s}=13\) TeV are shown in Fig. 6. The first two plots display the leading b-jet \(p_\mathrm {T}\) distribution in the presence of ttb and ttbb cuts as defined in Sect. 3.6, i.e. requiring \(N_b\ge 1\) and \(N_b\ge 2\) b-jets, respectively. In the ttbb phase space, topologies with \(g\rightarrow b \bar{b} \) FS splittings turn out to be surprisingly close to the full matrix element, with deviations that do not exceed the 10% level in the entire spectrum. This agreement remains remarkably good also in the inclusive ttb phase space, where IS splitting processes with one unresolved b-quark are expected to be more pronounced. Actually, with ttb and ttbb cuts the pure IS contribution ranges between 20–40% and 15–75%, respectively, but is almost entirely cancelled by the IS–FS interference.

The fact that the collinear approximations Eq. (4) agree rather well with the corresponding squared Feynman diagrams up to relatively high \(p_\mathrm {T}\) confirms that \(t \bar{t} b\bar{b} \) production is dominated by the topologies in Fig. 5c, d. On the other hand, the importance of interference effects provides strong motivation for using exact \(t \bar{t} b\bar{b} \) matrix elements, while collinear approximations such as those in Eq. (6) or in the parton-shower modelling of \(g\rightarrow b \bar{b} \) splittings should be used with due caution.

The above considerations apply also to the \(m_{b \bar{b}}\) and \(\varDelta R_{b \bar{b}}\) distributions in Fig. 6c, d. In particular, we observe that topologies with FS \(g\rightarrow b \bar{b} \) splittings are very close to the full matrix element in the whole \(m_{b \bar{b}}\) spectrum as well as for \(\varDelta R_{b \bar{b}}<2\). At the same time, for \(50\,\text {GeV}< m_{b \bar{b}}< 200\,\text {GeV} \) and \(1<\varDelta R_{b \bar{b}}<2.5\), i.e. in the range of interest for \(t \bar{t} H(b \bar{b})\) analyses, we observe that IS splitting contributions and negative interference effects grow fast and tend to become very sizable. Thus a naive separation into contributions from IS and FS splittings is not applicable at large \(m_{b \bar{b}}\) and \(\varDelta R_{b \bar{b}}\). On the other hand, in the region of moderate invariant mass and \(\varDelta R\) separation, which contains the bulk of the \(t \bar{t} b\bar{b} \) cross section, interference effects are rather small, and \(b \bar{b} \) pairs turn out to originate almost entirely from FS \(g\rightarrow b \bar{b} \) splittings.

In summary, given that the 5F scheme is based on the LO process \(gb\rightarrow t \bar{t} b\), where FS \(g\rightarrow b \bar{b} \) splittings and interference effects are entirely neglected, the above observations provide strong motivation for a description of \(t \bar{t} +b\)-jet production based on \(t \bar{t} b\bar{b} \) matrix elements in the 4F scheme.

3 Technical aspects and setup of NLOPS \(\varvec{t \bar{t} b\bar{b}}\) simulations

In this section we introduce a new Powheg generator based on \(t \bar{t} b\bar{b} \) matrix elements in the 4F scheme. Special emphasis is devoted to some technical aspects of the Powheg method that turn out to play an important role for multi-scale processes like \(pp\rightarrow t \bar{t} b\bar{b} \). In addition we describe the setup used for the \(t \bar{t} +b\)-jet simulations presented in Sects. 4 and 5, i.e. all relevant input parameters, scale choices and parton shower settings, as well as the treatment of theoretical uncertainties and the definitions of physics objects and selection cuts. Finally we provide details on the treatment of top-quark decays.

The new \(t \bar{t} b\bar{b} \) generator is implemented in the Powheg-Box-Res framework [22], and the relevant LO and NLO matrix elements are computed by OpenLoops [29,30,31] through its Powheg-Box-Res interface [32]. For the evaluation of one-loop integrals OpenLoops employs the Collier library [33,34,35,36] and, alternatively, CutTools [37, 38] together with the OneLOop library [39]. While we do not apply the resonance-aware method [22], the Powheg-Box-Res framework allows us to make use of new technical features, such as the automated implementation of scale variations, a Rivet [40] interface and the option to unweight events partially. This \(t \bar{t} b\bar{b} \) generator will soon be publicly available on the Powheg-Box webpage [23].

3.1 Powheg methodology

In the following, we briefly review the Powheg method [6, 7] with emphasis on the separation of radiation into singular and finite parts. In this context we discuss technical subtleties that arise in the case of multi-scale processes like \(pp\rightarrow t \bar{t} b\bar{b} \).

The master formula for the description of NLO radiation in the Powheg approach consists of two contributions,

$$\begin{aligned} \mathrm {d}\sigma = \mathrm {d}\sigma _\mathrm {s}+ \mathrm {d}\sigma _\mathrm {f}, \end{aligned}$$
(11)

which arise from the splitting of real emission into singular (s) and finite (f) parts,

$$\begin{aligned} R(\varPhi _{\mathrm {R}}) = R_\mathrm {s}(\varPhi _{\mathrm {R}})+R_\mathrm {f}(\varPhi _{\mathrm {R}}). \end{aligned}$$
(12)

Here \(R(\varPhi _{\mathrm {R}})\) should be understood as the squared real-emission matrix element, and \(\varPhi _{\mathrm {R}}\) as the corresponding phase space. Similarly, Born and virtual contributions in the Born phase space are denoted as \(B(\varPhi _{\mathrm {B}})\) and \(V(\varPhi _{\mathrm {B}})\). The splitting Eq. (12) is implemented as

$$\begin{aligned} R_{\mathrm {s}}(\varPhi _{\mathrm {R}})&= F(\varPhi _{\mathrm {R}})\, R(\varPhi _{\mathrm {R}}), \nonumber \\ R_{\mathrm {f}}(\varPhi _{\mathrm {R}})&= \left[ 1-F(\varPhi _{\mathrm {R}})\right] \, R(\varPhi _{\mathrm {R}}), \qquad \end{aligned}$$
(13)

where \(F(\varPhi _{\mathrm {R}})\in [0,1]\) is a damping function that fulfils \(F\rightarrow 1\) and \(F\rightarrow 0\), respectively, in the infrared and hard regions of phase space (see below).

The singular part of real radiation is resummed according to the Powheg formula

$$\begin{aligned}&\mathrm {d}\sigma _\mathrm {s}=\bar{B} (\varPhi _{\mathrm {B}}) \,\mathrm {d}\varPhi _{\mathrm {B}}\nonumber \\&\quad \times \bigg [ \varDelta (q_{{\text {cut}}}) + \sum _{\alpha } \varDelta (k_{\mathrm {T},\alpha }) \frac{R_{\mathrm {s},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}) )}{B (\varPhi _{\mathrm {B}})} \,\mathrm {d}\varPhi _{{\text {rad}}} \bigg ],\nonumber \\ \end{aligned}$$
(14)

where real emission is further split into FKS sectors [41],

$$\begin{aligned} R_\mathrm {s}= \sum _{\alpha } R_{\mathrm {s},\alpha },\quad R_\mathrm {f}= \sum _{\alpha } R_{\mathrm {f},\alpha }, \end{aligned}$$
(15)

which isolate collinear singularities arising from individual emitters. In each sector, the emission phase space \(\varPhi _{\mathrm {R}}\) is factorised into the Born phase space \(\varPhi _{\mathrm {B}}\) and a one-particle radiation phase space \(\varPhi _{{\text {rad}}}\) through an appropriate FKS mapping,

$$\begin{aligned} (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}) \;\longrightarrow \; \varPhi _{\mathrm {R}} = \varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}). \end{aligned}$$
(16)

The term within square brackets in Eq. (14) generates the hardest radiation according to an emission probability R / B. The parameter \(k_{\mathrm {T},\alpha }=k_{\mathrm {T},\alpha }(\varPhi _\mathrm {rad})\) stands for the hardness of the radiated parton, and radiation harder than \(k_{\mathrm {T},\alpha }\) is excluded by means of corresponding Sudakov form factors,

$$\begin{aligned} \varDelta (q) = \exp \left[ - \sum _\alpha \,\int _{k_{\mathrm {T},\alpha }\, >\, q} \frac{R_{\mathrm {s},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}) )}{B (\varPhi _{\mathrm {B}})}\, \mathrm {d}\varPhi _{{\text {rad}}} \right] . \end{aligned}$$
(17)

The term \(\varDelta (q_{{\text {cut}}})\) in Eq. (14) represents the no-emission probability above the infrared cutoff \(q_{{\text {cut}}}\), and Sudakov form factors account for unresolved multiple emissions in a way that cancels infrared singularities while preserving the differential NLO cross section \(\bar{B} (\varPhi _{\mathrm {B}})\) in the Born phase space. The latter is defined by integrating out the singular part of real radiation,

$$\begin{aligned} \bar{B} (\varPhi _{\mathrm {B}})&= B (\varPhi _{\mathrm {B}}) + V (\varPhi _{\mathrm {B}}) \nonumber \\&\quad + \sum _{\alpha } \int R_{\mathrm {s},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}\,,\varPhi _{{\text {rad}}}) ) \,\mathrm {d}\varPhi _{{\text {rad}}}\,.\quad \end{aligned}$$
(18)

Here infrared cancellations between V and \(R_\mathrm {s}\) are controlled via FKS subtraction. The remaining finite part of NLO radiation is treated as in fixed-order calculations,

$$\begin{aligned} \mathrm {d}\sigma _\mathrm {f}= \sum _\alpha R_{\mathrm {f},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}))\, \mathrm {d}\varPhi _{\mathrm {B}}\, \mathrm {d}\varPhi _{{\text {rad}}}\,. \end{aligned}$$
(19)

Note that flux and symmetry factors as well as the convolution with PDFs are implicitly understood in Eqs. (14) and (19).
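
The statement that the Sudakov form factors preserve \(\bar{B} (\varPhi _{\mathrm {B}})\) can be made explicit with a short check. The following is only a sketch in the notation of Eqs. (14) and (17); it assumes that \(\varDelta \rightarrow 1\) at the upper kinematic limit of the radiation phase space, which is denoted here as \(k_{\mathrm {T}}^{\max }\) (a symbol introduced only for this illustration). Differentiating Eq. (17) with respect to q gives

$$\begin{aligned} \frac{\mathrm {d}\varDelta (q)}{\mathrm {d}q} = \varDelta (q)\, \sum _\alpha \int \delta \left( k_{\mathrm {T},\alpha }-q\right) \frac{R_{\mathrm {s},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}) )}{B (\varPhi _{\mathrm {B}})}\, \mathrm {d}\varPhi _{{\text {rad}}}, \end{aligned}$$

so that the emission term of Eq. (14), integrated over all radiation above the cutoff, becomes a total derivative,

$$\begin{aligned} \sum _{\alpha } \int _{k_{\mathrm {T},\alpha }\,>\,q_{{\text {cut}}}} \varDelta (k_{\mathrm {T},\alpha })\, \frac{R_{\mathrm {s},\alpha } (\varPhi _{\alpha } (\varPhi _{\mathrm {B}}, \varPhi _{{\text {rad}}}) )}{B (\varPhi _{\mathrm {B}})} \,\mathrm {d}\varPhi _{{\text {rad}}} = \int _{q_{{\text {cut}}}}^{k_{\mathrm {T}}^{\max }} \frac{\mathrm {d}\varDelta (q)}{\mathrm {d}q}\,\mathrm {d}q = 1-\varDelta (q_{{\text {cut}}}). \end{aligned}$$

Under this assumption the square bracket in Eq. (14) integrates to one, and the integral of \(\mathrm {d}\sigma _\mathrm {s}\) over the radiation phase space reduces to \(\bar{B} (\varPhi _{\mathrm {B}})\,\mathrm {d}\varPhi _{\mathrm {B}}\).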

Let us now come back to the details of the separation of the singular and finite parts of real emission in Eq. (13). Technically, the damping function F is implemented based on the kinematics of the actual FKS sector, i.e. 

$$\begin{aligned} R_{\mathrm {s},\alpha }(\varPhi _{\alpha }) = R_{\alpha }(\varPhi _{\alpha }) - R_{\mathrm {f},\alpha } (\varPhi _{\alpha }) =F_{\alpha } (\varPhi _{\alpha }) R_{\alpha }(\varPhi _{\alpha }). \end{aligned}$$
(20)

The default functional form of F in Powheg-Box  [42, 43] is

$$\begin{aligned} F_{\alpha } (\varPhi _{\alpha }) = F_{\mathrm {damp},\alpha } (\varPhi _{\alpha })\, F_{\mathrm {bzd}, \alpha } (\varPhi _{\alpha }), \end{aligned}$$
(21)

where

$$\begin{aligned} F_{\mathrm {damp},\alpha } (\varPhi _{\alpha }) = \frac{h_{\mathrm {damp}} ^2}{h_{\mathrm {damp}} ^2+k_{\mathrm {T},\alpha }^2} \end{aligned}$$
(22)

is the usual factor that smoothly shifts the weight of real radiation from \(R_\mathrm {s}\) to \(R_\mathrm {f}\) when the hardness of the emission, \(k_{\mathrm {T},\alpha }\), becomes of the order of \(h_{\mathrm {damp}} \) or higher. Note that the freely adjustable \(h_{\mathrm {damp}} \) parameter in Powheg plays a role analogous to that of the resummation scale \(\mu _Q\) in the MC@NLO method. This is because both parameters act as a \(k_{\mathrm {T}}\)-threshold that separates the radiative phase space into a hard region, which is described by fixed-order matrix elements, and a singular region, where large logarithms of soft and collinear origin are resummed to all orders by means of Sudakov form factors. More precisely, in the MC@NLO approach the factor \(R_{\mathrm {s},\alpha }/B\) and the terms \(\varDelta \) in Eq. (14) correspond, respectively, to the parton-shower emission probability and the associated Sudakov form factors. Thus, in the MC@NLO framework, Eq. (14) corresponds to the weight of so-called soft events supplemented by the probability of the first parton-shower emission and its no-emission counterpart. In analogy with Eqs. (21) and (22), the first MC@NLO shower emission is also modulated by a certain damping function. The related reference scale \(\mu _Q\), i.e. the MC@NLO counterpart of the scale \(h_{\mathrm {damp}} \) in Powheg, corresponds to the upper bound of the first shower emission in MC@NLO. Thus MC@NLO predictions are sensitive to the choice of the shower starting scale.Footnote 4 On the contrary, the first Powheg emission in Eqs. (11)–(13) is entirely determined by the matrix element, which also dictates the scale at which the shower starts emitting further partons. Thus the Powheg method has the advantage of being essentially independent of the shower starting scale. 
More generally, thanks to the fact that the first emission is completely independent of the parton shower, Powheg predictions are characterised by a rather mild sensitivity to systematic uncertainties associated with the parton shower.

In addition to the well-known \(h_{\mathrm {damp}} \)-dependent damping mechanism Eq. (22), the Powheg-Box also implements a theta functionFootnote 5 of the form [42, 44]

$$\begin{aligned} F_{\mathrm {bzd}, \alpha } (\varPhi _{\alpha }) = \theta \left( h_{\mathrm {bzd}}-\frac{R_{\alpha }(\varPhi _{\alpha })}{\mathscr {R}_{\alpha }(\varPhi _{\alpha })}\right) , \end{aligned}$$
(23)

where \(\mathscr {R}_{\alpha }\) corresponds to the infrared (soft and collinear) approximation of the full matrix element. Schematically it has the factorised form

$$\begin{aligned} \mathscr {R}_{\alpha }(\varPhi _\alpha ) = \mathscr {K}_{\alpha }(\varPhi _{\mathrm {rad}}) B(\varPhi _{\mathrm {B}}), \end{aligned}$$
(24)

with an FKS kernel \(\mathscr {K}_{\alpha }(\varPhi _{\mathrm {rad}})\) and an underlying Born contribution \(B(\varPhi _{\mathrm {B}})\), whose kinematics is determined by the inverse of the mapping Eq. (16) in the actual sector \(\alpha \). By default, the cut-off parameter \(h_{\mathrm {bzd}} \) in Eq. (23) is set equal to 5. In this way, in the vicinity of IR singularities, where \(R_\alpha /\mathscr {R}_{\alpha }\rightarrow 1\), radiative contributions are attributed to \(R_\mathrm {s}\) and resummed according to Eq. (14). On the contrary, when the real emission matrix element largely exceeds the IR approximation Eq. (24), the resummation of the full R / B kernel according to Eq. (14) is not well justified, and corresponding events are attributed to the finite remnant Eq. (19) through the theta function Eq. (23). In the standard Powheg-Box, and in Ref. [13], the damping function Eq. (21) is applied only to initial-state radiation. However, in the present \(t \bar{t} b\bar{b} \) generator we have extended it to all (massless or massive) final-state emitters that have an associated FKS sector, in order to ensure a consistent resummation of QCD radiation off b-quarks.
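
The separation of Eqs. (13) and (21)–(23) can be illustrated with a minimal Python sketch. It is not the Powheg-Box implementation: the function names are hypothetical, the inputs `r_full` and `r_approx` stand for the values of \(R_\alpha \) and \(\mathscr {R}_\alpha \) at a given phase-space point and are assumed to be positive numbers supplied by the user, and the theta function is taken to vanish when its argument is zero.

```python
def f_damp(kt, h_damp):
    """Smooth damping factor of Eq. (22)."""
    return h_damp**2 / (h_damp**2 + kt**2)


def f_bzd(r_full, r_approx, h_bzd=5.0):
    """Theta function of Eq. (23): 1 if R / R_approx < h_bzd, else 0.

    r_approx is assumed to be strictly positive.
    """
    return 1.0 if r_full < h_bzd * r_approx else 0.0


def split_real(r_full, r_approx, kt, h_damp, h_bzd=5.0):
    """Return (R_s, R_f) according to Eqs. (13) and (21)."""
    f = f_damp(kt, h_damp) * f_bzd(r_full, r_approx, h_bzd)
    r_s = f * r_full            # singular part, resummed via Eq. (14)
    r_f = (1.0 - f) * r_full    # finite remnant, Eq. (19)
    return r_s, r_f


# Emission with k_T = h_damp and R only twice the IR approximation:
# half of the weight goes to R_s, half to R_f.
print(split_real(2.0, 1.0, kt=10.0, h_damp=10.0))
# Same emission, but R exceeds h_bzd times the IR approximation:
# the whole weight is moved to the finite remnant.
print(split_real(12.0, 1.0, kt=10.0, h_damp=10.0, h_bzd=5.0))
```

By construction the two parts always add up to the full real-emission weight, so that lowering \(h_{\mathrm {bzd}} \) or \(h_{\mathrm {damp}} \) only moves weight from \(R_\mathrm {s}\) to \(R_\mathrm {f}\).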

The requirement \(R_{\alpha }(\varPhi _{\alpha }) < h_{\mathrm {bzd}} \,\mathscr {K}_{\alpha }(\varPhi _{\mathrm {rad}})\,B(\varPhi _{\mathrm {B}})\) was originally introduced in order to avoid possible divergences of \(R(\varPhi _\alpha )/B(\varPhi _\mathrm {B})\) due to so-called Born zeros, i.e. phase space regions where \(B(\varPhi _{\mathrm {B}})\rightarrow 0\). Such divergences cancel in the \(\bar{B}/B\) ratio, i.e. they are not physical, and are not related to IR radiation. Corresponding \(\varPhi _\mathrm {B}\) regions should thus be attributed to the finite remnant. Otherwise they could lead to dramatic inefficiencies in the event generation. More generally, the damping factor Eq. (23) can play an important role also in case of multi-scale processes where the Born cross section involves enhancement mechanisms at scales well below the hard energy of the full process. Such enhancements can compete with the ones due to soft and collinear QCD radiation in a way that is somewhat analogous to Born zeros.

In the case of \(pp\rightarrow t \bar{t} b\bar{b} \), such effects can arise from the interplay of soft and collinear enhancements due to NLO light-jet radiation and to the generation of the \(b \bar{b} \) system in regions with \(m_{b \bar{b}}\ll m_{t \bar{t} b\bar{b}}\) and/or \(p_{\mathrm {T},b \bar{b}}\ll m_{t \bar{t} b\bar{b}}\). For example, let us consider a \(gg\rightarrow t \bar{t} b\bar{b} g\) event with a gluon emission of ISR type. Its kinematics is generated starting from a \(gg\rightarrow t \bar{t} b\bar{b} \) Born event through a mapping of type Eq. (16), which creates the required gluon recoil by boosting the final state of the \(gg\rightarrow t \bar{t} b\bar{b} \) Born event in the transverse direction. The relevant boost factor, \(\gamma =1/(1-\beta ^2)^{1/2}\), is determined by \(p_{\mathrm {T},j}=p'_{\mathrm {T},t \bar{t} b\bar{b}}=\gamma \beta E_{t \bar{t} b\bar{b}}\), where \(E_{t \bar{t} b\bar{b}}\) is the energy of the \(t \bar{t} b\bar{b} \) system in the Born event. If we assume, for simplicity, that the gluon is emitted in the same azimuthal direction as the \(b \bar{b} \) system in the Born event, then the \(b \bar{b} \) transverse momentum of the radiative event becomes \(p'_{\mathrm {T},b \bar{b}}=\gamma (p_{\mathrm {T},b \bar{b}}-\beta E_{b \bar{b}})\), where \(p_{\mathrm {T},b \bar{b}}\) and \(E_{b \bar{b}}\) are the \(b \bar{b} \) transverse momentum and energy in the Born event. Thus, the FKS mapping can lead to a very significant reduction of \(p_{\mathrm {T},b \bar{b}}\). More precisely, for radiative events with

$$\begin{aligned} \frac{p_{\mathrm {T},j}}{p_{\mathrm {T},b \bar{b}}}=(1+\varepsilon )\frac{E_{t \bar{t} b\bar{b}}}{E_{b \bar{b}}}, \end{aligned}$$
(25)

the effect of the FKS boost on the \(b \bar{b} \) system amounts to

$$\begin{aligned} \frac{p'_{\mathrm {T},b \bar{b}}}{p_{\mathrm {T},b \bar{b}}}=\tilde{\varepsilon }=\gamma -1-\varepsilon . \end{aligned}$$
(26)

Thus, since the bulk of the \(t \bar{t} b\bar{b} \) cross section is characterised by \(E_{t \bar{t} b\bar{b}} \gg E_{b \bar{b}}\) and \(\gamma \sim 1\), in the case of hard QCD radiation with \(\tilde{\varepsilon }\ll 1\) the FKS mapping can lead to a drastic reduction of the \(p_{\mathrm {T}}\) of the \(b \bar{b} \) system. As a result, in the region of small \(m_{b \bar{b}}\), the ISR boost can enhance the \(R_\alpha (\varPhi _\alpha )/\mathscr {R}_\alpha (\varPhi _\alpha )\) ratio by up to a factorFootnote 6 \((p_{\mathrm {T},b \bar{b}}/p'_{\mathrm {T},b \bar{b}})^2\sim 1/{\tilde{\varepsilon }}^2\). This violates the main assumption that justifies the Powheg formula Eq. (14), namely \(R_\alpha (\varPhi _\alpha )/B(\varPhi _{\mathrm {B}})\sim \mathscr {R}_\alpha (\varPhi _\alpha )/B(\varPhi _{\mathrm {B}})=\mathscr {K}_\alpha (\varPhi _\mathrm {rad})\), which requires a sufficiently hard \(t \bar{t} b\bar{b} \) process as compared to the \(k_\mathrm {T}\) of NLO radiation. In particular, due to the sensitivity of the Born amplitude to scales of the order \({p_{\mathrm {T},b \bar{b}}}\sim (E_{b \bar{b}}/E_{t \bar{t} b\bar{b}})\, p_{\mathrm {T},j}\ll p_{\mathrm {T},j}\), the factorisation formula Eq. (24) is not fulfilled.
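
The relation between Eqs. (25) and (26) can be verified numerically with a short Python sketch. It follows the simplified picture used above (gluon emitted in the same azimuthal direction as the \(b \bar{b} \) system); the function name and the input values are chosen only for illustration and do not correspond to any specific event.

```python
import math


def bb_pt_after_isr_boost(e_ttbb, e_bb, pt_bb, eps):
    """Apply the transverse boost described in the text to the bb system.

    Returns (pt_j, gamma, pt_bb_prime) for Born-level inputs
    E_ttbb, E_bb, pT_bb and the parameter epsilon of Eq. (25).
    """
    pt_j = (1.0 + eps) * pt_bb * e_ttbb / e_bb       # Eq. (25)
    gamma_beta = pt_j / e_ttbb                       # pT_j = gamma*beta*E_ttbb
    gamma = math.sqrt(1.0 + gamma_beta**2)           # gamma^2 = 1 + (gamma*beta)^2
    pt_bb_prime = gamma * pt_bb - gamma_beta * e_bb  # gamma*(pT_bb - beta*E_bb)
    return pt_j, gamma, pt_bb_prime


# Illustrative numbers (GeV): a soft bb system inside a hard ttbb event.
pt_j, gamma, pt_prime = bb_pt_after_isr_boost(600.0, 100.0, 50.0, 0.05)
ratio = pt_prime / 50.0
print(pt_j, gamma, ratio, gamma - 1.0 - 0.05)  # last two numbers agree, Eq. (26)
```

For these inputs a jet with a transverse momentum of about 315 GeV reduces the \(p_{\mathrm {T}}\) of the \(b \bar{b} \) system to less than a tenth of its Born value, which illustrates how strongly the mapping can distort the kinematics of the underlying Born event.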

Fortunately, this problematic behaviour emerges only in relatively hard regions of the \(\varPhi _\mathrm {rad}\) phase space.Footnote 7 Thus, as a remedy it is natural to shift such events into the finite remnant by means of the damping factor Eq. (23). In fact, in the case of \(t \bar{t} b\bar{b} \) production we have found that the \(h_{\mathrm {bzd}} \)-dependent cut plays an important role for the efficient generation of Les Houches events (LHEs) as well as for a consistent scale dependence. Moreover, applying a large \(h_{\mathrm {bzd}} \) cut we have observed a significant enhancement of the QCD scale dependence. This can be attributed to the fact that scale variations in the soft term Eq. (14) are restricted to the \(\bar{B}\) factor Eq. (18), where the unphysical distortions of the \(b \bar{b} \) kinematics induced by the FKS mappings can jeopardise the natural cancellation of virtual and real contributions associated with a given Born configuration.

As discussed in Sect. 4, Powheg predictions for ttbb observables are rather stable with respect to variations of \(h_{\mathrm {bzd}} \). Thus, in order to avoid an unphysical enhancement of the scale dependence, we have reduced \(h_{\mathrm {bzd}} \) from its default value of 5 to \(h_{\mathrm {bzd}} =2\). This guarantees a more reasonable consistency with the fixed-order scale dependence without shifting an excessive fraction of the cross section from \(\mathrm {d}\sigma _\mathrm {s}\) to \(\mathrm {d}\sigma _\mathrm {f}\).

3.2 Input parameters, PDFs and scale choices

The predictions in Sects. 4 and 5 are based on the following input parameters, scale choices and PDFs. Heavy-quark mass effects are included throughout using

$$\begin{aligned} m_{t} = 172.5 \;\text {GeV},\quad m_{b} = 4.75 \;\text {GeV}. \end{aligned}$$
(27)

All other quarks are treated as massless in the perturbative part of the calculations. Since we use massive b-quarks, for the PDF evolution and the running of \(\alpha _\mathrm{S}\) we adopt the 4F scheme. Thus, for consistency, we renormalise \(\alpha _\mathrm{S}\) in the decoupling scheme, where top- and bottom-quark loops are subtracted at zero momentum transfer. In this way, heavy-quark loop contributions to the evolution of the strong coupling are effectively described at first order in \(\alpha _\mathrm{S}\) through the virtual corrections.

For the calculation of hard cross sections at LO and NLO, as well as for the generation of the first Powheg emission, we use the NNPDF30_nlo_as_0118_nf_4 parton distributions [45] as implemented in LHAPDF [46] and the corresponding \(\alpha ^{(4\mathrm {F})}_\mathrm{S}\).Footnote 8 To assess PDF uncertainties we re-evaluate the weights of LHEs with 100 different PDF replicas, while using the nominal PDF set for parton showering.

Since it scales with \(\alpha _\mathrm{S}^4\), the \(t \bar{t} b\bar{b} \) cross section is highly sensitive to the choice of the renormalisation scale \(\mu _\mathrm{R}\), and this choice plays a critical role for the stability of perturbative predictions. Following [8, 11], we adopt a scale choice of the form

$$\begin{aligned} \mu _\mathrm{R}=\xi _\mathrm{R}\sqrt{\mu _{t \bar{t}}\,\mu _{b \bar{b}}}, \end{aligned}$$
(28)

with the scale-variation factor \(\xi _\mathrm{R}\in [0.5,2]\). This dynamic scale choice accounts for the fact that \(t \bar{t} b\bar{b} \) production is characterised by two widely separated scales, which are related to the \(t \bar{t} \) and \(b \bar{b} \) systems and are chosen as the geometric average of the respective transverse energies,

$$\begin{aligned} \mu _{b \bar{b}}=\sqrt{E_{\mathrm {T},b}E_{\mathrm {T},\bar{b}}}, \quad \mu _{t \bar{t}}=\sqrt{E_{\mathrm {T},t}E_{\mathrm {T},\bar{t}}}\;. \end{aligned}$$
(29)

The transverse energies \(E_{\mathrm {T},i}=\sqrt{m_i^2+p^2_{\mathrm {T},i}}\) are defined in terms of the rest masses \(m_i\) and the transverse momenta \(p_{\mathrm {T},i}\) of the bare heavy quarks. The scales Eq. (29) are computed according to physical kinematics, i.e. without projecting real emission events to the underlying Born phase space. The choice Eq. (28) is applied to all (N)LO matrix elements apart from the \(\alpha _\mathrm{S}\) factor that results from the R / B ratio in Eq. (14). In that case \(\alpha _\mathrm{S}\) is evaluated at the transverse momentum of the hardest Powheg emission, and that \(\alpha _\mathrm{S}(k_{\mathrm {T},\alpha })\) factor is not subject to scale variations.

For the factorisation scale \(\mu _\mathrm{F}\) we useFootnote 9

$$\begin{aligned} \mu _\mathrm{F}=\xi _\mathrm{F}\,\frac{H_\mathrm {T}}{2} = \frac{\xi _\mathrm{F}}{2} \sum _{i=t,\bar{t}, b, \bar{b},j} E_{\mathrm {T},i}, \end{aligned}$$
(30)

where \(\xi _\mathrm{F}\in [0.5,2]\), and the total transverse energy of the \(t \bar{t} b\bar{b} \) system, \(H_\mathrm {T}\), is computed in terms of bare-quark transverse momenta including also QCD radiation at NLO. Our nominal predictions correspond to \(\xi _\mathrm{R}=\xi _\mathrm{F}=1\), and to quantify scale uncertainties we take the envelope of the seven-point variation \((\xi _\mathrm{R},\xi _\mathrm{F})=(0.5,0.5)\), (0.5, 1), (1, 0.5), (1, 1), (1, 2), (2, 1), (2, 2).
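
As an illustration, the scale definitions of Eqs. (28)–(30) and the seven-point variation can be sketched in Python as follows. The function names are hypothetical, the inputs are the transverse momenta of the bare heavy quarks and of the NLO jet (if present), and the light jet is assumed to be massless, so that its transverse energy equals its transverse momentum.

```python
import math

M_T, M_B = 172.5, 4.75  # heavy-quark masses in GeV, Eq. (27)

# Seven-point variation of (xi_R, xi_F); the pairs (0.5, 2) and (2, 0.5) are omitted.
SEVEN_POINT = [(0.5, 0.5), (0.5, 1.0), (1.0, 0.5), (1.0, 1.0),
               (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]


def e_t(mass, pt):
    """Transverse energy E_T = sqrt(m^2 + pT^2)."""
    return math.sqrt(mass**2 + pt**2)


def mu_r(pt_t, pt_tbar, pt_b, pt_bbar, xi_r=1.0):
    """Renormalisation scale of Eqs. (28) and (29)."""
    mu_tt = math.sqrt(e_t(M_T, pt_t) * e_t(M_T, pt_tbar))
    mu_bb = math.sqrt(e_t(M_B, pt_b) * e_t(M_B, pt_bbar))
    return xi_r * math.sqrt(mu_tt * mu_bb)


def mu_f(pt_t, pt_tbar, pt_b, pt_bbar, pt_j=0.0, xi_f=1.0):
    """Factorisation scale of Eq. (30); pt_j = 0 for Born-like events."""
    h_t = (e_t(M_T, pt_t) + e_t(M_T, pt_tbar)
           + e_t(M_B, pt_b) + e_t(M_B, pt_bbar) + pt_j)
    return xi_f * h_t / 2.0


def scale_envelope(values):
    """Lower and upper edge of the envelope of a list of predictions."""
    return min(values), max(values)


# Illustrative kinematics (GeV): scales for each of the seven variations.
pts = (120.0, 90.0, 40.0, 30.0)
for xi_r, xi_f in SEVEN_POINT:
    print(xi_r, xi_f, mu_r(*pts, xi_r=xi_r), mu_f(*pts, pt_j=25.0, xi_f=xi_f))
```

In an actual analysis `scale_envelope` would be applied bin by bin to the seven predictions of a given observable; here it only serves to make the definition of the envelope explicit.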

For the Powheg-Box parameters \(h_{\mathrm {bzd}} \) and \(h_{\mathrm {damp}} \), which control the resummation of NLO radiation according to Eqs. (21)–(23) as discussed in Sect. 3.1, we set

$$\begin{aligned} h_{\mathrm {bzd}} =2 \end{aligned}$$
(31)

and

$$\begin{aligned} h_{\mathrm {damp}} =\frac{H_\mathrm {T}}{2} = \frac{1}{2} \sum _{i=t,\bar{t}, b, \bar{b}} E_{\mathrm {T},i}\;. \end{aligned}$$
(32)

Here the various \(E_{\mathrm {T},i}\) are defined in the underlying Born phase space. To account for the uncertainties associated with these choices we apply the independent variations \(h_{\mathrm {bzd}} =2\), 5, 10 and \(h_{\mathrm {damp}} ={H_\mathrm {T}}/{4},{H_\mathrm {T}}/{2},H_\mathrm {T},1.5\,m_t\), varying both parameters one at a time.Footnote 10

The above choices for \(\mu _\mathrm{R},\mu _\mathrm{F}\) and \(h_{\mathrm {damp}} \), as well as the employed PDFs correspond to the setup recommended in [11].

3.3 Parton shower settings and variations

By default, LHEs are showered with Pythia 8.2 using the A14 tune,Footnote 11 where ISR and FSR parameters as well as the MPI activity have been tuned in a single step using most of the available \(t \bar{t} \) ATLAS data from Run 1 [47]. In the A14 tune, \(m_b=4.75\) GeV and \(\alpha ^{(5\mathrm {F})}_\mathrm{S}(M_Z)=0.127\), both for ISR and FSR, while in the default Monash tune \(\alpha ^{(5\mathrm {F})}_\mathrm{S}(M_Z)=0.13650\). Since the shower evolution is implemented in the 5F scheme we shower events using 5F PDFs. Specifically, we choose the NNPDF30_nlo_as_0118 5F PDFs.Footnote 12

The interplay between Powheg and Pythia is controlled by the scalup parameter, which describes the hardness of radiation in LHEs and may be taken as starting scale for Pythia. However, in order to avoid inconsistencies due to the fact that the Pythia evolution variable does not coincide with the definition of hardness in Powheg, we apply the following two-step procedure based on the PowhegHooks class. Instead of starting below scalup, we instruct Pythia to generate radiation up to the kinematic limit by setting

figure a

Then, to guarantee the correct ordering of emissions in Powheg and Pythia, we apply a veto on each Pythia emission that is harder than scalup according to the Powheg-Box definition of hardness. This is achieved by setting

figure b

The remaining PowhegHooks settings are left to their default values.
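
The listings referred to above are not reproduced in this text. Purely as an illustration of the two-step procedure, the following Python sketch collects Pythia 8 settings that are commonly used for this purpose. The specific setting names and values are an assumption made here and should not be read as a record of the configuration actually used; attaching the PowhegHooks class to the Pythia instance is done in the driver program and is not shown.

```python
# Assumed settings, for illustration only (not taken from the text above).
POWHEG_MATCHING_SETTINGS = [
    # Step 1: let ISR and FSR populate the phase space up to the kinematic limit.
    "SpaceShower:pTmaxMatch = 2",
    "TimeShower:pTmaxMatch = 2",
    # Step 2: veto shower emissions harder than scalup, with the hardness
    # computed according to the Powheg-Box definition.
    "POWHEG:veto = 1",
    "POWHEG:pTdef = 1",
]


def render_settings(settings):
    """Return the settings as the text of a Pythia command file."""
    return "".join(line + "\n" for line in settings)


print(render_settings(POWHEG_MATCHING_SETTINGS))
```

The output can be written to a command file and read by the Pythia driver program; all PowhegHooks settings not listed would remain at their default values.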

At LOPS level we set the shower starting scale equal to \(H_{\mathrm {T}}/2\) and vary it up and down by a factor two in order to assess the related uncertainty. At NLOPS, the shower starting scale is dictated by the kinematics of real emission matrix elements in the Powheg method. Thus, at variance with NLOPS predictions based on the MC@NLO method, Powheg predictions are free from uncertainties related to the choice of the shower starting scale.

In order to assess uncertainties due to the parton-shower modelling of \(g\rightarrow b \bar{b} \) splittings we vary the parameter TimeShower:weightGluonToQuark, which allows one to select various optional forms of the \(g\rightarrow Q\bar{Q}\) splitting kernel in Pythia 8. The default is option 4, which corresponds to the splitting probability [48]

$$\begin{aligned} \mathrm {d}P_{g\rightarrow b \bar{b}}&= \frac{\alpha _\mathrm{S}(p_T^2)}{2\pi }\frac{\mathrm {d}m_{b \bar{b}}^2}{m_{b \bar{b}}^2} \frac{\beta _b}{2} \nonumber \\&\quad \times \left[ z^2+(1-z)^2+8r_b z(1-z)\right] (1-\delta )^3, \end{aligned}$$
(33)

where \(r_b=m_b^2/m_{b \bar{b}}^2\), \(\beta _b=\sqrt{1-4r_b}\) and \(\delta =m^2_{b \bar{b}}/m^2_{\mathrm {dipole}}\). The factor \((1-\delta )^3\), which suppresses the production of high-mass \(b \bar{b} \) pairs, is derived from the \(H\rightarrow gb \bar{b} \) matrix element by interpreting \(m_H\) as the mass of a gluon dipole, \(m_{\mathrm {dipole}}\). Omitting the factor \((1-\delta )^3\) in Eq. (33) corresponds to option 2 and results in a DGLAP splitting probability of type \(\gamma ^*\rightarrow b \bar{b} \) with mass effects. More precisely, in option 2, \(g\rightarrow b \bar{b} \) splittings are generated based on massless kinematics, and the \(r_b\) mass correction is implemented through reweighting, while massive kinematics is restored through momentum reshuffling. Option 3, which implements massive DGLAP splittings in a more realistic way, involves an additional \((1+\delta )/(1-\delta )\) factor that leads to a significant enhancement of the \(g\rightarrow b \bar{b} \) rate. This option is excluded by LEP/SLC data and also by direct measurements of \(t \bar{t} +b\)-jet production [49]. Finally, option 1 corresponds to option 2 with \(r_b=0\) and yields very similar results. Thus, for the assessment of \(g\rightarrow b \bar{b} \) shower uncertainties we will compare options 4 and 2.
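
The difference between options 4 and 2 can be made concrete with a small Python sketch that evaluates the dimensionless factor multiplying \(\alpha _\mathrm{S}/(2\pi )\,\mathrm {d}m_{b \bar{b}}^2/m_{b \bar{b}}^2\) in Eq. (33). This is only an evaluation of the formula, not a reproduction of the Pythia implementation (which, for option 2, proceeds via massless kinematics, reweighting and momentum reshuffling); the function name and the threshold treatment are choices made for this illustration.

```python
import math


def g_to_bb_weight(z, m_bb, m_dipole, m_b=4.75, suppress_high_mass=True):
    """Kernel factor of Eq. (33), without alpha_S/(2 pi) and dm^2/m^2.

    suppress_high_mass=True keeps the (1 - delta)^3 factor (option 4),
    suppress_high_mass=False drops it (option 2).
    Assumes 0 <= z <= 1 and m_bb < m_dipole.
    """
    r_b = m_b**2 / m_bb**2
    if 4.0 * r_b >= 1.0:
        return 0.0  # below the bb production threshold
    beta_b = math.sqrt(1.0 - 4.0 * r_b)
    delta = m_bb**2 / m_dipole**2
    weight = 0.5 * beta_b * (z**2 + (1.0 - z)**2 + 8.0 * r_b * z * (1.0 - z))
    if suppress_high_mass:
        weight *= (1.0 - delta)**3
    return weight


# Ratio of option 4 to option 2 for a bb pair with half the dipole mass.
w4 = g_to_bb_weight(0.5, 50.0, 100.0, suppress_high_mass=True)
w2 = g_to_bb_weight(0.5, 50.0, 100.0, suppress_high_mass=False)
print(w4 / w2)  # equals (1 - delta)^3 with delta = 0.25
```

The ratio depends only on \(\delta \), which makes explicit that the two options differ most for \(b \bar{b} \) pairs whose invariant mass approaches the dipole mass.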

In addition to the functional form of the heavy-quark splitting kernel we also vary the scale of \(\alpha _\mathrm{S}\) in the parton shower. To this end, we set TimeShower:weightGluonToQuark to 6 and 8, which corresponds to options 2 and 4 with \(\alpha _\mathrm{S}(p_T^2)\) replaced by \(\alpha _\mathrm{S}(m_{b \bar{b}}^2)\) in the heavy-quark splitting kernel Eq. (33). Moreover, using TimeShower:renormMultFac, we vary \(\alpha _\mathrm{S}(p_\mathrm {T}^2)\rightarrow \alpha _\mathrm{S}(\xi p_\mathrm {T}^2)\) with prefactors \(\xi =0.1,1,10\) both for options 2 and 4. This latter variation is applied to all final-state QCD splittings, i.e. also splittings of type \(g\rightarrow gg\), \(q\rightarrow q g\), etc.

3.4 Comparisons against alternative generators

In order to assess systematic uncertainties related to the parton shower and the matching scheme, in Sects. 4.2 and 4.3 we compare Powheg+Pythia predictions of \(t \bar{t} b\bar{b} \) production against corresponding predictions generated with Powheg+Herwig and with Sherpa  [8]. The Powheg+Pythia and Sherpa generators of \(t \bar{t} b\bar{b} \) production are also compared against corresponding generators of inclusive \(t \bar{t} \) production in the 5F scheme.Footnote 13

In the case of Herwig  [50] we apply the angular ordered shower using version 7.1, setting \(m_b = 4.75\) GeV, and leaving the strong coupling to its default value, \(\alpha _S(m_Z) = 0.126234\). To restrict the hardness of Herwig emissions according to the value of scalup in the LHEs we set

figure c

In the case of Sherpa we use version 2.2.4 with its default tuneFootnote 14 for Sherpa's dipole shower [52]. The relevant one-loop matrix elements are computed with OpenLoops, and matching to the parton shower is based on the Sherpa implementation [9] of the MC@NLO method [10], dubbed SMC@NLO. As for the hard cross section we use the same input parameters, PDFs and scale settings as specified in Sect. 3.2 for the case of Powheg. Moreover, as motivated in Sect. 3.1 we identify the resummation scale \(\mu _Q\) in Sherpa with the \(h_{\mathrm {damp}} \) parameter in Powheg, i.e. we set \(\mu _Q=H_\mathrm {T}/2\). In the Sherpa simulation the NNPDF30_nlo_as_0118_nf_4 PDF set is used throughout, i.e. also for parton showering.

For Powheg and Sherpa simulations of inclusive \(t \bar{t} \) production we use the same setup as for the corresponding \(t \bar{t} b\bar{b} \) generators, with the only exceptions being the QCD scales, \(\mu _\mathrm{R}=\mu _\mathrm{F}=0.5\sqrt{E_{\mathrm {T},t}E_{\mathrm {T},\bar{t}}}\), and the choice of the NNPDF30_nlo_as_0118_nf_5 PDF set. In this setup the inclusive NLO cross section amounts to \(\sigma _{t \bar{t}}=815\) pb, which is only 2% below the NNLO prediction of \(832^{+45}_{-50}\) pb [53].

3.5 Simulations with stable or decayed top quarks

In Sects. 4 and 5 predictions for \(t \bar{t} b\bar{b} \) production are presented both for the case of stable top quarks and with spin-correlated top decays. Simulations with stable top quarks make it possible to avoid the combinatorial complexity that results from the presence of four b-quarks in decayed \(t \bar{t} b\bar{b} \) events. In this way one can focus on the production of the \(b \bar{b} \) pair that is governed by QCD dynamics and which represents the main source of theoretical uncertainty in \(pp\rightarrow t \bar{t} b\bar{b} \). Moreover, results with stable top quarks can be compared to the benchmarks of Refs. [8, 11]. Since top quarks do not hadronise, when we switch off top decays we disable hadronisation and, following Refs. [8, 11], we also deactivate multi-parton interactions (MPIs) and QED radiation in the parton shower. This is achieved by setting

figure d

For the case of decaying top quarks we show results both with hadronisation and MPI switched off or on, while the QED shower is always activated and hadrons are kept stable throughout. For the implementation of spin-correlated decays in the Powheg framework we follow the approach of Ref. [54], which has already been employed in the Powheg-Box framework in Refs. [25, 55, 56]. More precisely, we use resonant tree matrix elements for the full \(2\rightarrow 8\) Born processes \(q\bar{q}/gg\rightarrow t (\rightarrow b ij)\bar{t}(\rightarrow \bar{b} kl)b \bar{b} \), where ij and kl stand for the leptons or quarks from W decays, and corresponding \(2\rightarrow 9\) processes with an additional external gluon at the level of the \(pp\rightarrow t \bar{t} b\bar{b} \) sub-process. In the \(2\rightarrow 8(9)\) matrix elements we include only topologies with two intermediate top resonances. This accounts for spin correlations as well as for off-shell effects associated with the top and the W propagators. Technically, top decays are generated starting from on-shell \(t \bar{t} b\bar{b} \) events with a veto algorithm based on the ratio between \(2\rightarrow 8(9)\) matrix elements and corresponding \(2\rightarrow 4(5)\) matrix elements for the underlying \(pp\rightarrow t \bar{t} b\bar{b} \)(+jet) process.

As additional input parameters for top decays we use [57]

$$\begin{aligned} M_{W} = 80.385 \;\text {GeV},\quad G_{\mathrm {F}} = 1.1663787\cdot 10^{-5}\; \text {GeV} ^{-2}, \end{aligned}$$
(34)

the total widths

$$\begin{aligned} \varGamma _t=1.329\;\text {GeV},\quad \varGamma _W=2.089\;\text {GeV} \,, \end{aligned}$$
(35)

and the branching ratios

$$\begin{aligned} \mathrm {BR}_{t\rightarrow b\ell _i\nu _j}&= \mathrm {BR}_{W\rightarrow \ell _i\nu _j}= \frac{\delta _{ij}}{3}\mathrm {BR}_{W\rightarrow \mathrm {lept}}, \end{aligned}$$
(36)
$$\begin{aligned} \mathrm {BR}_{t\rightarrow b u_i d_j}&= \mathrm {BR}_{W\rightarrow u_i d_j}= \frac{|V_{ij}|^2}{2}\mathrm {BR}_{W\rightarrow \mathrm {had}}, \end{aligned}$$
(37)

where we assume a 100% branching ratio for \(t\rightarrow bW\) decays. For the total W-boson branching ratios into leptons and hadrons we use the values [57]

$$\begin{aligned} \mathrm {BR}_{W\rightarrow \mathrm {had}}=0.675,\quad \mathrm {BR}_{W\rightarrow \mathrm {lept}}=0.325\,, \end{aligned}$$
(38)

which include state-of-the-art higher-order corrections.
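As a small worked example of Eqs. (36) and (38), the sketch below evaluates the leptonic branching ratios per flavour and the fraction of \(t \bar{t} \) pairs in which both W bosons decay to an electron or a muon. It assumes that the two W decays are independent and ignores leptons from \(\tau \) decays. The function and variable names are illustrative only.

```python
BR_W_LEPT = 0.325  # Eq. (38)
BR_W_HAD = 0.675   # Eq. (38)


def br_w_to_lepton(i, j):
    """BR(W -> l_i nu_j) from Eq. (36): (delta_ij / 3) * BR(W -> lept)."""
    return BR_W_LEPT / 3.0 if i == j else 0.0


# Probability that a single W decays to e or mu (flavour-diagonal only).
br_e_or_mu = br_w_to_lepton("e", "e") + br_w_to_lepton("mu", "mu")

# Both W bosons decay to e or mu, assuming independent decays.
dilepton_fraction = br_e_or_mu ** 2
```

With these inputs the dilepton (\(e\) or \(\mu \)) fraction comes out at about 4.7% before any acceptance cuts.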

3.6 Jet observables and acceptance cuts

For the reconstruction of jets we use the anti-\(k_\mathrm {T}\) [58] algorithm with \(R=0.4\). We select jets that fulfil

$$\begin{aligned} p_{\mathrm {T}}>25\,\text {GeV},\quad |\eta |<2.5, \end{aligned}$$
(39)

both for the case of light jets and b-jets. At parton level, we define as b-jet a jet that contains at least one b-quark, i.e. jets that contain a \(b \bar{b} \) pair arising from a collinear \(g\rightarrow b \bar{b} \) splitting are also tagged as b-jets. At particle level, i.e. when hadronisation is switched on, we tag as b-jets those jets that are matched to a B-hadron using the ghost method as implemented in FastJet  [59].

When studying \(t \bar{t} b\bar{b} \) production with stable top quarks, in Sects. 2 and 4, we categorise events according to the number \(N_b\) of b-jets that do not arise from top decays and fulfil the acceptance cuts Eq. (39). For the analysis of cross sections and distributions we consider an inclusive selection with \(N_b\ge 1\) and a more exclusive one with \(N_b\ge 2\). We refer to them as ttb and ttbb selections, respectively.
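The event categorisation just described can be sketched as follows. This is a minimal illustration that assumes jets have already been clustered and tagged. The `Jet` container and its `from_top_decay` flag are hypothetical names, not part of any analysis framework used in this paper.

```python
from dataclasses import dataclass


@dataclass
class Jet:
    pt: float                     # transverse momentum in GeV
    eta: float                    # pseudorapidity
    is_b: bool                    # b-tag (parton- or particle-level definition)
    from_top_decay: bool = False  # Monte Carlo truth flag (assumption)


def passes_acceptance(jet, pt_min=25.0, eta_max=2.5):
    """Acceptance cuts of Eq. (39)."""
    return jet.pt > pt_min and abs(jet.eta) < eta_max


def count_additional_b_jets(jets):
    """Number N_b of accepted b-jets that do not arise from top decays."""
    return sum(
        1 for j in jets
        if j.is_b and not j.from_top_decay and passes_acceptance(j)
    )


def selections(jets):
    n_b = count_additional_b_jets(jets)
    return {"ttb": n_b >= 1, "ttbb": n_b >= 2}


event = [
    Jet(pt=60.0, eta=0.3, is_b=True),
    Jet(pt=30.0, eta=-2.7, is_b=True),  # fails the |eta| cut
    Jet(pt=45.0, eta=1.1, is_b=False),  # light jet
]
result = selections(event)
```

Since the ttbb selection is a subset of the ttb one, every event that passes ttbb also passes ttb.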

In Sect. 5 we present predictions for \(t \bar{t} b\bar{b} \) production with top-quark decays in the dilepton channel. In this case we require two oppositely charged leptons, \(\ell =e\) or \(\mu \), with

$$\begin{aligned} p_{\mathrm {T},\ell }>20\,\text {GeV}, \quad |\eta _{\ell }|< 2.5. \end{aligned}$$
(40)

Charged leptons are dressed with collinear photon radiation within a cone of radius 0.1. We do not apply any cut on missing transverse energy. Jets are defined as for the case of stable top quarks, and we select events with at least four b-jets that fulfil the acceptance cuts Eq. (39).
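The lepton dressing and the cuts of Eq. (40) can be illustrated with the sketch below. It assumes massless particles given as `(pt, eta, phi)` tuples and a cone defined in \((\eta ,\phi )\) space around the bare lepton. These conventions and all function names are assumptions made for the example.

```python
import math


def to_cartesian(pt, eta, phi):
    """Three-momentum components of a massless particle."""
    return pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)


def delta_r(a, b):
    d_eta = a[1] - b[1]
    d_phi = (a[2] - b[2] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)


def dress_lepton(lepton, photons, cone=0.1):
    """Add the momenta of all photons within the cone to the bare lepton."""
    px, py, pz = to_cartesian(*lepton)
    for photon in photons:
        if delta_r(lepton, photon) < cone:
            qx, qy, qz = to_cartesian(*photon)
            px, py, pz = px + qx, py + qy, pz + qz
    pt = math.hypot(px, py)
    return pt, math.asinh(pz / pt), math.atan2(py, px)


def passes_lepton_cuts(lepton, pt_min=20.0, eta_max=2.5):
    """Lepton acceptance cuts of Eq. (40)."""
    return lepton[0] > pt_min and abs(lepton[1]) < eta_max


bare = (19.0, 0.0, 0.0)
photons = [(2.0, 0.0, 0.05), (5.0, 1.0, 2.0)]  # only the first is collinear
dressed = dress_lepton(bare, photons)
```

In this example the bare lepton fails the \(p_{\mathrm {T}}\) cut, while the dressed lepton passes it once the collinear photon is recombined.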

Table 1 Cross sections for \(pp\rightarrow t \bar{t} b\bar{b} \) at \(\sqrt{s}\)=13 TeV and their ratios in the phase space regions with \(N_b\ge 1\) (ttb) and \(N_b\ge 2\) (ttbb) b-jets as well as in the region \(m_{b_1b_2}>100\,\)GeV of the ttbb phase space (ttbb\(_{100}\)). Nominal fixed-order predictions at LO and NLO accuracy are compared to corresponding LOPS and NLOPS predictions of Powheg+Pythia. Also cross sections at LHE level are reported. Uncertainties correspond to the envelope of the 7-point factor-two variations of \(\mu _\mathrm{R}\) and \(\mu _\mathrm{F}\)

4 Predictions for \(\varvec{t \bar{t} b\bar{b}}\) production with stable top quarks

In this section we present numerical predictions for \(pp\rightarrow t \bar{t} b\bar{b} \) at \(\sqrt{s}=13\) TeV in the 4F scheme. The presented results have been obtained with Powheg+OpenLoops using the setup of Sect. 3. Top quarks are kept stable throughout as specified in Sect. 3.5, and we study cross sections and distributions in the inclusive ttb phase space with \(N_b\ge 1\) b-jets, as well as in the ttbb phase space with \(N_b\ge 2\).

4.1 NLOPS predictions with perturbative uncertainties

In this section we compare (N)LO and (N)LOPS predictions focusing on NLO and matching effects as well as perturbative and PDF uncertainties. Table 1 presents cross sections in the ttb and ttbb phase space, as well as in the presence of an additional cut, \(m_{b_1b_2}>100\,\)GeV, on the invariant mass of the two hardest b-jets. At fixed order we find perfect agreement with the NLO results of Ref. [11]. The various phase space regions feature similar NLO uncertainties, around 25–30%, while corresponding LO scale variations are roughly a factor two larger. Both at LO and NLO, scale uncertainties are strongly dominated by \(\mu _\mathrm{R}\) variations. The large \(\sigma _{\mathrm {ttb}}/\sigma _{\mathrm {ttbb}}\) ratio, which exceeds a factor 5, reflects the appearance of large logarithms of \(m_b\) when a b-quark becomes unresolved. As shown in Sect. 2.3, such logarithms are mainly due to FS \(g\rightarrow b \bar{b} \) splittings. Thus the use of 4F PDFs, where \(\ln (m_b/Q)\) effects of IS origin are not resummed in the PDF evolution, is well justified. Note also that \(\ln (m_b/Q)\) effects in \(\sigma _{\mathrm {ttb}}\) are present already at LO. Thus they do not jeopardise the convergence of the perturbative expansion. In fact, \(\sigma _{\mathrm {ttb}}/\sigma _{\mathrm {ttbb}}\) turns out to be very stable with respect to NLO corrections. The same holds for \(\sigma _{\mathrm {ttbb}}/\sigma _{\mathrm {ttbb}_{100}}\).
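For concreteness, the 7-point envelope used for the scale uncertainties can be sketched as below. The sketch assumes the conventional definition, in which \(\mu _\mathrm{R}\) and \(\mu _\mathrm{F}\) are rescaled by factors 0.5, 1 and 2 and the two combinations with opposite rescalings are discarded. The function `toy_sigma` is a purely illustrative stand-in for a cross-section calculation.

```python
from itertools import product


def seven_point_variations():
    """Scale factors (xi_R, xi_F) with 1/2 <= xi_R / xi_F <= 2."""
    factors = (0.5, 1.0, 2.0)
    return [
        (xi_r, xi_f)
        for xi_r, xi_f in product(factors, repeat=2)
        if 0.5 <= xi_r / xi_f <= 2.0
    ]


def envelope(cross_section):
    """Return the nominal value and the relative upper/lower variations."""
    central = cross_section(1.0, 1.0)
    values = [cross_section(xi_r, xi_f) for xi_r, xi_f in seven_point_variations()]
    return central, max(values) / central - 1.0, min(values) / central - 1.0


# Toy cross section (assumption): depends only on the renormalisation scale.
def toy_sigma(xi_r, xi_f):
    return 1.0 / xi_r


points = seven_point_variations()
central, up, down = envelope(toy_sigma)
```

A cross section that depends only on \(\mu _\mathrm{R}\), as in the toy example, yields an envelope that is fixed entirely by the \(\xi _\mathrm{R}=0.5\) and \(\xi _\mathrm{R}=2\) points.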

At variance with [8], where LO calculations were performed using LO PDFs and the corresponding value of \(\alpha _\mathrm{S}\), here, in order to obtain a more realistic picture of the convergence of the \(\alpha _\mathrm{S}\)-expansion, we use NLO inputs throughout (see footnote 15). This approach increases the NLO K-factors from 1.15–1.25 [8] to 1.80–1.95. This observation raises some concerns regarding the possible presence of significant higher-order corrections beyond NLO and calls for a better understanding of the origin of the large K-factor at NLO. This question as well as the search for possible improvements is deferred to future studies.

Comparing fixed-order (N)LO cross sections against (N)LOPS ones we find that matching and showering effects are almost negligible in \(\sigma _{\mathrm {ttb}}\), while in the case of \(\sigma _{\mathrm {ttbb}}\) they slightly exceed 10%, and in the Higgs-signal region, \(m_{b_1b_2}>100\,\text {GeV} \), they approach 30%. As pointed out in Ref. [8], such effects can be understood in terms of \(t \bar{t} +2b\)-jet production via double \(g\rightarrow b \bar{b} \) splittings. In practice, one of the b-jets results from a \(g\rightarrow b \bar{b} \) splitting in the \(t \bar{t} b\bar{b} \) matrix element, while the second one is created by the parton shower via a further \(g\rightarrow b \bar{b} \) collinear splitting. This interpretation is confirmed by the fact that the enhancement at hand is not present in the LHE-level cross sections presented in Table 1. In fact, double splittings are generated only at NLOPS level through parton showering. Double-splitting enhancements in Table 1 behave in a qualitatively similar way as in Refs. [8, 11], but their size turns out to depend on the employed NLOPS generator. As compared to Ref. [11], we observe that the NLOPS/NLO correction to \({\sigma _\mathrm {ttbb}}\) in Table 1 (+12%) is twice as large as in Sherpa (+6%), very close to Powhel (+13%, see footnote 16) and well below the prediction of Madgraph5aMC@NLO (+41%).

As for scale variations, in Table 1 we see that their impact at NLOPS tends to be 5–10% higher as compared to fixed-order NLO. This is consistent with the behaviour of Madgraph5aMC@NLO and Powhel in Ref. [11], while Sherpa features a significantly lower scale uncertainty. Such differences may be an artefact of the incomplete implementation of scale variations in the various NLOPS tools. In the case of Powheg, as anticipated in Sect. 3.1, we have found that increasing \(h_{\mathrm {bzd}} \) can lead to unphysical enhancements of the scale uncertainty. This effect is mostly visible in the ttbb phase space, where the maximum scale variation amounts to \(+40\%\) for \(h_{\mathrm {bzd}} =2\) and grows up to \(+45\%\) and \(+54\%\) when setting \(h_{\mathrm {bzd}} =5\) and 50, respectively. Based on these observations, as default for our \(t \bar{t} b\bar{b} \) simulations we have set \(h_{\mathrm {bzd}} =2\). This choice guarantees a decent consistency with fixed-order scale variations without altering the matching procedure in a drastic way. In particular, when \(h_{\mathrm {bzd}} \) is reduced from its standard Powheg-Box value of 5 down to 2, we have checked that the fraction of the \(t \bar{t} +b\)-jet cross section that is shifted from the singular part Eq. (14) to the finite remnant Eq. (19) amounts to only 10–20%. This holds for all considered distributions in the ttb and ttbb phase space.

Differential observables with ttb and ttbb cuts are presented in Figs. 7 and 8. The inclusive b-jet multiplicity distribution in Fig. 7a extends the results of Table 1, which correspond to \(N_b\ge 1,2\), to the bins with \(N_b\ge 3,4\). The latter are populated by events that result from the interplay of real-emission matrix elements and \(g\rightarrow b \bar{b} \) parton-shower splittings. Thus they feature an enhanced scale dependence.

For kinematic distributions that are inclusive with respect to NLO QCD radiation, NLOPS scale variations have a minor impact on shapes and amount essentially to a normalisation shift, similar to what is observed at the level of the ttb and ttbb cross sections. In contrast, in the case of the light-jet \(p_\mathrm {T}\) spectra, scale variations increase from about 30% in the soft region up to 100% in the hard tails. This is consistent with the fact that such observables are only LOPS accurate and depend on \(\alpha _\mathrm{S}^5(\mu _\mathrm{R})\). The effect of PDF variations is clearly subleading as compared to scale uncertainties and has little impact on shapes.
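The way a higher power of \(\alpha _\mathrm{S}\) amplifies the renormalisation-scale dependence of a LO-accurate observable can be illustrated with one-loop running. The sketch below is not meant to reproduce the numbers quoted in this paper: the value \(\alpha _\mathrm{S}=0.1\) at the central scale, the use of one-loop running with \(n_f=4\), and the neglect of any scale dependence other than the overall \(\alpha _\mathrm{S}^n\) factor are all simplifying assumptions.

```python
import math


def alpha_s_one_loop_ratio(alpha_s, xi, n_f=4):
    """alpha_s(xi * mu) / alpha_s(mu) from one-loop running."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return 1.0 / (1.0 + b0 * alpha_s * math.log(xi ** 2))


def relative_variation(alpha_s, xi, power, n_f=4):
    """Relative change of a quantity proportional to alpha_s**power."""
    return alpha_s_one_loop_ratio(alpha_s, xi, n_f) ** power - 1.0


# Illustrative input (assumption): alpha_s = 0.1 at the central scale.
var_alpha4_down = relative_variation(0.1, 0.5, power=4)
var_alpha5_down = relative_variation(0.1, 0.5, power=5)
var_alpha5_up = relative_variation(0.1, 2.0, power=5)
```

In this toy setup, halving the scale raises an \(\alpha _\mathrm{S}^5\) quantity more than an \(\alpha _\mathrm{S}^4\) one. This is qualitatively in line with the larger variations seen in the light-jet spectra.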

Fig. 7
figure 7

Predictions for \(pp\rightarrow t \bar{t} b\bar{b} \) at \(\sqrt{s}\)=13 TeV: distributions in the inclusive number of additional b-jets (a), the \(p_{\mathrm {T}}\) of the first b-jet (b) and the first light jet (c) with ttb cuts, and in the \(p_{\mathrm {T}}\) of the second b-jet with ttbb cuts (d). Results at LO and NLO are in blue and red, respectively, and dashed lines correspond to fixed-order (N)LO predictions, while solid curves represent (N)LOPS predictions. The bands illustrate the envelope of 7-point \(\mu _{\mathrm {R}},\mu _{\mathrm {F}}\) variations. Absolute predictions are shown in the main frame. The first ratio plot shows LO, LOPS and NLOPS predictions normalised to fixed-order NLO. The second ratio plot displays the relative effect of PDF uncertainties applied to NLOPS predictions. Top quarks are kept stable throughout

Fig. 8
figure 8

Distributions in the \(p_{\mathrm {T}}\) of the second b-jet (a), in the \(p_\mathrm {T}\) of the first light jet (b), and in the invariant mass (c) and the \(\varDelta R\) separation (d) of the first two b-jets with ttbb cuts throughout. Predictions and uncertainties as in Fig. 7

Comparing (N)LOPS predictions to the respective fixed-order (N)LO results, we observe that matching and shower effects remain almost negligible also at the level of distributions in the ttb phase space. As for the ttbb region, the NLOPS effects of order 10% observed in \(\sigma _{\mathrm {ttbb}}\) turn out to be quite sensitive to the kinematics of b-jets. In particular, as expected from the QCD dynamics of double \(g\rightarrow b \bar{b} \) splittings [8], the most pronounced effects are observed in the tails of the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions, where the NLOPS/NLO ratio approaches a factor two. In the Higgs signal region, \(m_{b_1b_2}\sim 125\) GeV, the NLOPS enhancement is around 1.25 and well consistent with Ref. [8].
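For reference, the two observables that are most sensitive to double splittings, \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\), can be computed from jet kinematics as in the sketch below. It assumes jets given as `(pt, eta, phi, mass)` tuples and a \(\varDelta R\) defined with pseudorapidity. The exact conventions of the analysis may differ, and the function names are illustrative.

```python
import math


def four_momentum(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) for a jet with the given kinematics."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(jet1, jet2):
    e, px, py, pz = (
        a + b for a, b in zip(four_momentum(*jet1), four_momentum(*jet2))
    )
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero


def delta_r(jet1, jet2):
    d_eta = jet1[1] - jet2[1]
    # Wrap the azimuthal difference into [-pi, pi).
    d_phi = (jet1[2] - jet2[2] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)


b1 = (50.0, 0.0, 0.0, 0.0)
b2 = (50.0, 0.0, math.pi, 0.0)
m_bb = invariant_mass(b1, b2)
dr_bb = delta_r(b1, b2)
```

For massless jets the invariant mass reduces to \(m^2 = 2 p_{\mathrm {T},1} p_{\mathrm {T},2}(\cosh \varDelta \eta - \cos \varDelta \phi )\), so large \(m_{b_1b_2}\) and large \(\varDelta R_{b_1b_2}\) both correspond to widely separated b-jets.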

Comparing fixed-order NLO predictions to LO ones we find that, in spite of the fairly large K-factors observed in Table 1, the shapes of distributions turn out to be quite stable with respect to higher-order QCD corrections. In the case of (N)LOPS predictions, the situation is different, especially for the shape of the light-jet \(p_\mathrm {T}\) spectra, which receives significant NLO distortions. This is not surprising, since at LOPS the light-jet \(p_\mathrm {T}\) is entirely generated by the parton shower. Thus the NLOPS/LOPS ratio should be regarded as a LO matrix-element correction to the parton-shower approximation, rather than an NLO correction in the perturbative sense.

Significant differences between NLOPS and LOPS shapes are observed also in the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions. Since the respective NLO and LO shapes are very similar, this behaviour can be attributed to the parton shower. More precisely, it can be understood as a side effect of the above-mentioned NLOPS/LOPS correction to the light-jet \(p_\mathrm {T}\) spectra, which is converted into a double-splitting effect by \(g\rightarrow b \bar{b} \) splittings inside the light jet.

4.2 Shower uncertainties

In Figs. 9 and 10 we study the sensitivity of (N)LOPS predictions to parton-shower and matching uncertainties for the same observables considered in Sect. 4.1.

The ratios displayed in the upper frames illustrate the net effect of parton showering by comparing full NLOPS predictions against results at LHE level. In addition, to assess parton-shower uncertainties, NLOPS predictions based on Pythia are compared to the corresponding results obtained with Herwig.

In the ttb phase space, apart from a mild distortion of the light-jet spectrum, the net effect of parton showering is essentially negligible. In contrast, in the ttbb phase space it increases the cross section by about 5% and tends to grow in the tails of distributions. The most sizable shower effects are observed in the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions, where they reach up to 50–100%. This behaviour is well consistent with the enhancement of the NLOPS/NLO ratio observed in Fig. 8, and the fact that it is driven by the parton shower provides further support to its interpretation in terms of double \(g\rightarrow b \bar{b} \) splittings.

In spite of the important role of parton showering, it is reassuring to observe that the sensitivity of NLOPS predictions to the choice of parton shower is small. In fact, the typical agreement between results based on Pythia and Herwig is at the level of 10% both in the ttb and ttbb selections, except for the leading light-jet spectrum in the ttbb selection, where the difference reaches almost 20% in the tail. Sizeable deviations at the level of over 20% are observed only when requiring more than three b-jets. As discussed in Sect. 3.1, the mild sensitivity of Powheg predictions to the choice of parton shower is due to the fact that the first emission is completely independent of the parton shower in the Powheg approach.

The ratios shown in the central frames of Figs. 9 and 10 illustrate (N)LOPS uncertainties related to the modelling of \(g\rightarrow b \bar{b} \) splittings and variations of \(\alpha _\mathrm{S}\) in Pythia (see Sect. 3.3). At LOPS also variations of the shower starting scale (scalup) are shown.

The fact that \(t \bar{t} b\bar{b} \) 4F matrix elements populate the whole \(b \bar{b} \) phase space restricts the effect of \(g\rightarrow b \bar{b} \) shower splittings to events with four or more b-quarks. Thus, only the cross sections with \(N_b\ge 3,4\) b-jets suffer from sizable shower uncertainties. Vice versa, all considered observables with ttb or ttbb cuts turn out to be very stable, with typical shower uncertainties of a few percent at NLOPS. This holds also for the observables that are most sensitive to double splittings, i.e. \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\), the only exception being the tail of the \(\varDelta R_{b \bar{b}}\) distribution, where double-splitting effects can reach 50% of the NLOPS cross section, while \(g\rightarrow b \bar{b} \) shower uncertainties can reach 15%.

Predictions at LOPS depend also on the choice of the shower starting scale. This uncertainty is especially sizeable in the case of the light-jet spectrum, where scalup acts as a cutoff. A sizeable scalup dependence is visible also in the LOPS predictions for the \(p_\mathrm {T}\)-distributions of b-jets, which indicates that such observables are rather sensitive to QCD radiation. Let us recall that the scalup dependence disappears completely in NLOPS simulations based on the Powheg approach.

Ratios plotted in the lower frames of Figs. 9 and 10 show the dependence of NLOPS predictions on the choice of the \(h_{\mathrm {damp}} \) and \(h_{\mathrm {bzd}} \) parameters, which control the separation of the first emission into events of soft and hard type in the Powheg-Box framework (see Sect. 3.3). The \(h_{\mathrm {damp}} \) band is obtained by varying \(h_{\mathrm {damp}} =H_\mathrm {T}/4\), \(H_\mathrm {T}/2\), \(H_\mathrm {T}\), \(1.5m_t\) with the value of \(h_{\mathrm {bzd}} \) fixed to 2, while the \(h_{\mathrm {bzd}} \) band is obtained by varying \(h_{\mathrm {bzd}} =2\), 5, 10 with fixed \(h_{\mathrm {damp}} =H_\mathrm {T}/2\). Observables that are inclusive with respect to light-jet radiation reveal a remarkably small dependence, typically of the order of a few percent, on the choice of \(h_{\mathrm {damp}} \) and \(h_{\mathrm {bzd}} \). Non-negligible but moderate uncertainties are found only in the light-jet spectra, which are enhanced by up to 20% when \(h_{\mathrm {bzd}} \) is increased from 2 to 10. Investigating simultaneous variations of \(h_{\mathrm {damp}} \) and \(h_{\mathrm {bzd}} \) (not plotted) we have found that the size of the \(h_{\mathrm {damp}} \) variation band is fairly stable with respect to the value of \(h_{\mathrm {bzd}} \) within the considered range.
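To make the role of the two parameters more tangible, the sketch below separates a real-emission weight into a singular and a finite part. It assumes the damping forms commonly quoted for the Powheg-Box, namely a factor \(h_{\mathrm {damp}}^2/(h_{\mathrm {damp}}^2+k_\mathrm {T}^2)\) multiplied by a step function that vanishes when the real matrix element exceeds \(h_{\mathrm {bzd}} \) times its soft/collinear approximation. The precise definitions used in this paper are those of Sect. 3.1 and may differ in detail. All names are illustrative.

```python
def damping_factor(kt, h_damp):
    """Smooth suppression of hard emissions (assumed functional form)."""
    return h_damp ** 2 / (h_damp ** 2 + kt ** 2)


def bzd_factor(real_me, approx_me, h_bzd):
    """Step function: 0 if the real matrix element exceeds h_bzd times
    its soft/collinear approximation, 1 otherwise (assumed form)."""
    return 1.0 if real_me < h_bzd * approx_me else 0.0


def split_real_emission(real_me, approx_me, kt, h_damp, h_bzd):
    """Return the (singular, finite) parts of the real-emission weight."""
    f = damping_factor(kt, h_damp) * bzd_factor(real_me, approx_me, h_bzd)
    return f * real_me, (1.0 - f) * real_me


# Toy numbers (assumptions): emission with kt equal to h_damp.
singular, finite = split_real_emission(
    real_me=3.0, approx_me=2.0, kt=100.0, h_damp=100.0, h_bzd=2.0
)
# Same emission, but with a real matrix element far above its approximation.
singular_hard, finite_hard = split_real_emission(
    real_me=5.0, approx_me=2.0, kt=100.0, h_damp=100.0, h_bzd=2.0
)
```

In this picture, lowering \(h_{\mathrm {bzd}} \) or \(h_{\mathrm {damp}} \) moves a larger share of the real emission into the finite remnant, while the sum of the two parts is unchanged.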

Fig. 9
figure 9

Relative impact of shower effects and uncertainties in (N)LOPS simulations of \(pp\rightarrow t \bar{t} b\bar{b} \) at \(\sqrt{s}\)=13 TeV: distributions in the inclusive number of additional b-jets (a), the \(p_{\mathrm {T}}\) of the first b-jet (b) and the first light jet (c) with ttb cuts, and in the \(p_{\mathrm {T}}\) of the second b-jet with ttbb cuts (d). All results are normalised to nominal NLOPS predictions with Pythia 8. The upper frame compares NLOPS results based on Pythia 8 (PY8) or Herwig 7 (HW7) against LHE results. The central frame compares NLOPS (red) and LOPS (blue) predictions with uncertainties related to \(\alpha _\mathrm{S}\) variations and to the modelling of \(g\rightarrow b \bar{b} \) splittings in Pythia (red NLO band and blue LO band). At LOPS, also variations of the shower starting scale scalup=\(H_\mathrm {T}/4,H_\mathrm {T}/2,H_\mathrm {T}\) are shown (grey band). The lower frame illustrates the relative effect of \(h_{\mathrm {damp}} =H_\mathrm {T}/4,H_\mathrm {T}/2,H_\mathrm {T},1.5 m_t\) variations (HDAMP) and \(h_{\mathrm {bzd}} =2,5,10\) variations (BZD). Top quarks are kept stable throughout

Fig. 10
figure 10

Distributions in the \(p_{\mathrm {T}}\) of the second b-jet (a), in the \(p_\mathrm {T}\) of the first light jet (b), and in the invariant mass (c) and the \(\varDelta R\) separation (d) of the first two b-jets with ttbb cuts throughout. Predictions and uncertainties as in Fig. 9

4.3 Comparisons against other \(\varvec{t \bar{t} b\bar{b}}\) and \(\varvec{t \bar{t}}\) generators

In Figs. 11 and 12 we compare \(t \bar{t} +b\)-jet predictions based on Powheg+Pythia and Sherpa. This comparison is done both for (N)LOPS \(pp\rightarrow t \bar{t} b\bar{b} \) generators in the 4F scheme and for corresponding generators of inclusive \(t \bar{t} \) production in the 5F scheme. Specifically, in the case of Powheg we use hvq [25]. As detailed in Sect. 3.4, input parameters, QCD scales and matching parameters are chosen as coherently as possible across all generators. In this spirit, the parameter \(h_{\mathrm {damp}} =H_\mathrm {T}/2\) in Powheg is identified with the resummation scale \(\mu _Q\) in the SMC@NLO framework of Sherpa. As for the parton showers, we instead simply use standard settings, i.e. we do not try to improve the agreement between generators by tuning the Pythia and Sherpa showers.

Fig. 11
figure 11

Predictions for \(pp\rightarrow t \bar{t} +b\)-jets at \(\sqrt{s}\)=13 TeV: distributions in the inclusive number of additional b-jets (a), the \(p_{\mathrm {T}}\) of the first b-jet (b) and the first light jet (c) with ttb cuts, and in the \(p_{\mathrm {T}}\) of the second b-jet with ttbb cuts (d). The various ratio plots compare \(t \bar{t} +b\)-jet observables as described in LOPS (blue) and NLOPS (red) simulations based on \(pp\rightarrow t \bar{t} b\bar{b} \) or \(pp\rightarrow t \bar{t} \) matrix elements in Powheg+Pythia or Sherpa. In the ratios shown in the upper and middle frame Powheg predictions are normalised to Sherpa ones for the case of \(pp\rightarrow t \bar{t} b\bar{b} \) and \(pp\rightarrow t \bar{t} \) simulations, respectively. The third frame displays the ratio of \(t \bar{t}\) to \(t \bar{t} b\bar{b} \) Powheg predictions. For all ratios the numerator and denominator are evaluated at the same order, and uncertainties are applied only to the numerator. They correspond to the combination in quadrature of \(h_{\mathrm {damp}} \) and \(h_{\mathrm {bzd}} \) variations with the uncertainties due to the modelling of \(g\rightarrow b \bar{b} \) splittings and the choice of \(\alpha _\mathrm{S}\) and scalup in Pythia (see Sects. 3.2, 3.3). Top quarks are kept stable throughout

Fig. 12
figure 12

Distributions in the \(p_{\mathrm {T}}\) of the second b-jet (a), in the \(p_\mathrm {T}\) of the first light jet (b), and in the invariant mass (c) and the \(\varDelta R\) separation (d) of the first two b-jets with ttbb cuts throughout. Predictions and uncertainties as in Fig. 11

The ratios in the upper frames of Figs. 11 and 12 show Powheg \(pp\rightarrow t \bar{t} b\bar{b} \) predictions normalised to corresponding Sherpa predictions at LOPS and NLOPS accuracy. The bands describe the combination in quadrature of all matching and shower uncertainties (see footnote 17) in Powheg+Pythia (referred to as shower uncertainties in the following), while only nominal Sherpa predictions are considered in the ratios. Comparing LOPS predictions gives direct insights into the different modelling of radiation in Pythia and Sherpa. For observables that are inclusive with respect to jet radiation we find deviations of 10–40% and comparably large shower uncertainties. In contrast, in the jet-\(p_\mathrm {T}\) distributions the LOPS predictions of Pythia are far above the ones by Sherpa, with differences that can reach a factor 2.5 in the tails. These differences are perfectly consistent with LOPS shower uncertainties, which are dominated by variations of the Pythia starting scale.
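The combination in quadrature used for the bands can be written down in a few lines. The sketch assumes that each source of uncertainty is summarised by a single relative deviation per bin and that the sources are treated as uncorrelated. How upward and downward variations are handled separately is not specified here, and the source names and numbers in the example are purely illustrative.

```python
import math


def combine_in_quadrature(relative_uncertainties):
    """Quadrature sum of uncorrelated relative uncertainties."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# Illustrative relative deviations in one bin (assumed values).
sources = {
    "hdamp": 0.03,
    "hbzd": 0.04,
}
total = combine_in_quadrature(sources.values())
```

Because the sum is quadratic, the total is dominated by the largest individual source. This is why the LOPS bands are controlled by the variation of the shower starting scale.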

Moving to NLOPS reduces the direct dependence on the parton shower. At the same time, differences between the Powheg and SMC@NLO matching methods come into play. In practice, at NLOPS we observe a drastic reduction of shower uncertainties, especially in the light-jet and b-jet \(p_\mathrm {T}\)-distributions. Also the differences between Powheg and Sherpa become very small at NLOPS. The ttb and ttbb cross sections agree at the percent level, and differential b-jet observables deviate by more than 5% only in the tails of the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions. Even the light-jet spectra in the ttb and ttbb phase space deviate by less than 10–20% up to high \(p_\mathrm {T}\), in spite of the limited formal accuracy (LOPS) of such observables. In the light of these results, NLOPS theoretical uncertainties related to the matching scheme and the parton shower seem to be well under control in \(pp\rightarrow t \bar{t} b\bar{b} \). In particular, their impact appears to be clearly subleading as compared to QCD scale uncertainties.

In the central frames of Figs. 11 and 12 we compare (N)LOPS generators of inclusive \(t \bar{t} \) production based on Powheg+Pythia and Sherpa. In this case, the \(g\rightarrow b \bar{b} \) final-state splittings that give rise to \(t \bar{t} +b\)-jet signatures are entirely controlled by the parton shower. At LOPS, also the parent gluon that splits into \(b \bar{b} \) is generated by the parton shower. Nevertheless, the ttb and ttbb LOPS cross sections predicted by Powheg and Sherpa deviate by less than 30–40%. Instead, as expected, the shapes of \(t \bar{t} +b\)-jet observables vary very strongly, and in all considered light-jet and b-jet distributions Pythia results exceed Sherpa ones by a factor of two and even more. This excess is well consistent with the estimated LOPS shower uncertainties. At NLOPS, only \(g\rightarrow b \bar{b} \) splittings are controlled by the parton shower, while the emission of their parent gluon is dictated by LO matrix elements. Consequently, we observe a drastic reduction of shower uncertainties as compared to LOPS. Also the differences between Powheg and Sherpa are largely reduced at NLO; nevertheless, they remain quite significant in various distributions.

To provide a more complete picture of the uncertainties of inclusive \(t \bar{t} \) simulations, in the lower frames of Figs. 11 and 12 we compare Powheg+Pythia generators of inclusive \(t \bar{t} \) production and \(t \bar{t} b\bar{b} \) production. Shower uncertainties are shown only for the \(t \bar{t} \) generator. At LOPS, the \(t \bar{t} \) generator is strongly sensitive to the modelling of \(pp\rightarrow t \bar{t} g\) through initial-state gluon radiation in Pythia. As a result, the \(t \bar{t} \) generator overestimates the ttb and ttbb cross sections by about 90% and 50%, respectively. This excess is strongly sensitive to scalup, and in the \(p_\mathrm {T}\)-distributions it is confined to the regions below 100–200 GeV, while the tails are strongly suppressed. Also the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions feature strong shape differences as compared to LOPS \(t \bar{t} b\bar{b} \) predictions.

Such differences go down significantly at NLOPS. The ttb and ttbb cross sections predicted by the \(t \bar{t} \) generator overshoot \(t \bar{t} b\bar{b} \) results by only 15–20%, and also b-jet observables feature an improved agreement with \(t \bar{t} b\bar{b} \) predictions. Nevertheless, in b-jet observables we find quite significant shape differences, especially for the \(m_{b_1b_2}\) and \(\varDelta R_{b_1b_2}\) distributions, and shower uncertainties remain far above the ones of the \(t \bar{t} b\bar{b} \) generator (see upper frame). As for the light-jet spectra, \(t \bar{t} \) predictions turn out to lie above \(t \bar{t} b\bar{b} \) ones by about a factor of two in the tails. In principle, with the help of parton shower tuning NLOPS \(t \bar{t} \) generators may be amenable to a reasonable description of inclusive \(t \bar{t} +b\)-jet observables. However, in the light of the above results it should be clear that NLOPS \(t \bar{t} b\bar{b} \) generators are mandatory in order to achieve an acceptable level of shower systematics.

5 \(\varvec{t \bar{t} b\bar{b}}\) production with top-quark decays

In this section we present NLOPS results of the Powheg+Pythia \(t \bar{t} b\bar{b} \) generator with leptonic top-quark decays. More precisely we consider final states with oppositely charged electrons and/or muons. By default hadronisation and MPI are deactivated in Pythia, and their effect is shown separately. As detailed in Sect. 3.5, our implementation of top decays is based on resonant \(pp\rightarrow t \bar{t} b\bar{b} \rightarrow 2\ell 2\nu b \bar{b} \)(+j) matrix elements, where spin correlations are consistently taken into account.

Fig. 13
figure 13

Distributions in the b-jets of the reconstructed \(t \bar{t} b\bar{b} \) system for \(pp\rightarrow t \bar{t} +b\)-jets with dileptonic top decays at \(\sqrt{s}\)=13 TeV. Inclusive number of additional b-jets (a), distribution with ttb cuts in the \(p_{\mathrm {T}}\) of the first b-jet (b) and distributions with ttbb cuts in the \(p_{\mathrm {T}}\) of the first b-jet (c) and light-jet (d) as well as in the invariant mass (e) and \(\varDelta R\) (f) of the first and second b-jet. All results are based on Powheg+Pythia with hadronisation and MPI switched off. The ratio corresponds to NLOPS predictions with Pythia decays (ttbb+PSdecay) or stable top quarks (ttbb) normalised to corresponding ones with spin-correlated decays (ttbb+decay). Top-decay products are subject to acceptance cuts, while predictions with stable top quarks are normalised to ttbb+decay ones at the level of the ttbb cross section

Fig. 14
figure 14

Predictions for \(pp\rightarrow t \bar{t} +b\)-jets at \(\sqrt{s}\)=13 TeV after leptonic top-quark decays. Four b-jets and two leptons within acceptance are required without any distinction between b-jets from \(t \bar{t} b\bar{b} \) production and decay. Distributions in the inclusive number of b-jets (a) and in the \(p_\mathrm{T}\) of the first (b), second (c) and third (d) b-jet. All results are based on Powheg+Pythia, and in the lower frame nominal NLOPS predictions with spin-correlated decays without (ttbb + decay) and with hadronisation (ttbb + decay + HAD) and multi-parton interactions (ttbb + decay + HAD + MPI) are compared to corresponding ones with Pythia decays (ttbb + PSdecay)

Fig. 15
figure 15

Distributions in the invariant mass (a) and \(\varDelta R\) (b) of the first and second b-jet, in the \(p_\mathrm{T}\) of the leading lepton (c), and in the azimuthal \(\varDelta \phi \) separation of the two charged leptons (d). Predictions and ratios as in Fig. 14

Top-quark decays are due to weak interactions and, up to small corrections of \(\mathscr {O}(\varGamma _t/m_t)\), their effect factorises with respect to \(t \bar{t} b\bar{b} \) production. Thus, while they strongly increase the complexity of \(t \bar{t} +b\)-jet events, top decays are not expected to interfere with the QCD dynamics of \(pp\rightarrow t \bar{t} b\bar{b} \) in a significant way. In order to verify this hypothesis, in Fig. 13 we compare NLOPS \(pp\rightarrow t \bar{t} b\bar{b} \) simulations with stable and decayed top quarks. To this end, based on Monte Carlo truth, all reconstructed jets are split into two subsets associated with \(t \bar{t} b\bar{b} \) production and top decays. Specifically, jets that contain a parton originating from showered top-decay products are attributed to top decays, otherwise to \(t \bar{t} b\bar{b} \) production (see footnote 18). At the level of top decays we require two b-jets and two charged leptons within the acceptance cuts Eqs. (39) and (40), while for the “reconstructed” \(t \bar{t} b\bar{b} \) system we consider the same cuts and observables as for the case of stable top quarks. In order to mimic the leptonic branching ratio and the efficiency of acceptance cuts on top-decay products, the normalisation of the \(t \bar{t} b\bar{b} \) simulation with stable top quarks is adapted to the predictions with decayed top quarks. This is done at the level of the ttbb cross section through a constant normalisation factor.

As shown in Fig. 13, \(t \bar{t} +b\)-jet observables with stable top quarks and reconstructed top decays turn out to agree quite well: b-jet cross sections and distributions deviate by only 5–10%, and also in the light-jet \(p_{\mathrm {T}}\)-distribution decay effects hardly exceed 10%. These differences can be understood as an indirect effect of the acceptance cuts on top-decay products, which results from the correlation between the kinematics of the \(t \bar{t} \) system and the additional jets.

Keeping in mind that realistic b-jet observables consist of a combinatorial superposition of b-jets from \(t \bar{t} b\bar{b} \) production and from top decays, the fact that Monte Carlo truth acceptance cuts on top decays have only a minor effect on the production of b-jets suggests that the essential features observed in \(pp\rightarrow t \bar{t} b\bar{b} \) production, such as double-splitting effects, are expected to show up also in the presence of top decays.

In order to assess the importance of spin correlations, in Fig. 13 we also compare spin-correlated top decays to isotropic decays generated by Pythia. At the level of reconstructed \(t \bar{t} b\bar{b} \) observables this comparison does not reveal any significant effect of spin correlations.

A more realistic analysis of \(t \bar{t} b\bar{b} \) production and decay is presented in Figs. 14 and 15, where b-jet and leptonic observables are defined at the level of the full final state, and two charged leptons and four b-jets within the acceptance cuts of Eqs. (39) and (40) are required, without any distinction between \(t \bar{t} b\bar{b} \) production and decay.

Comparing spin-correlated and isotropic top decays, in b-jet observables we find no significant deviation, and significant spin-correlation effects show up only in the azimuthal correlation of the two charged leptons.
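For reference, the azimuthal separation of the two charged leptons can be computed as in the following minimal sketch. The function name is hypothetical, and the convention of mapping the result to the interval \([0,\pi ]\) is an assumption, since the precise definition of the observable is not restated in this section.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation of two particles, mapped to the range [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    # Differences above pi correspond to going around the other way.
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi
```

The wrap-around step matters for leptons on either side of the \(\phi =\pm \pi \) boundary, where a naive difference would report a nearly back-to-back pair as almost collinear or vice versa.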

In Figs. 14 and 15 we also assess the relative impact of hadronisation and multi-parton interactions (MPI). It turns out that b-jet observables are very stable with respect to hadronisation, with differences between parton and hadron level that do not exceed the few percent level. The same holds for MPI effects.

The above results indicate that insights into the QCD dynamics of \(t \bar{t} b\bar{b} \) production gained through studies with stable top quarks at parton level should hold true also in the presence of top decays and hadronisation.

6 Summary and conclusions

Searches for \(t \bar{t} H\) production in the \(H\rightarrow b \bar{b} \) channel call for a precise theoretical description of the irreducible \(t \bar{t} +b\)-jet background. To shed light on the QCD dynamics that governs this nontrivial multi-scale process, in the first part of this paper we have analysed the relative importance of the various mechanisms that lead to the radiation of b-quarks off \(pp\rightarrow t \bar{t} \) events. To this end we have compared the role of \(pp\rightarrow t \bar{t} b\bar{b} \) topologies involving initial-state and final-state \(g\rightarrow b \bar{b} \) splittings. Using a naive diagrammatic splitting, as well as gauge-invariant collinear approximations, we have demonstrated that the \(t \bar{t} +b\)-jet cross section is strongly dominated by b-jet production via final-state \(g\rightarrow b \bar{b} \) splittings. This holds both for phase-space regions with two resolved b-jets and for regions with only one. These findings support the use of NLOPS generators based on \(pp\rightarrow t \bar{t} b\bar{b} \) matrix elements in the four-flavour scheme, while we have pointed out that \(t \bar{t} +b\)-jet predictions based on \(t \bar{t} +\)multi-jet merging rely very strongly on the parton-shower modelling of \(g\rightarrow b \bar{b} \) splittings.

Motivated by these observations we have introduced a new \(pp\rightarrow t \bar{t} b\bar{b} \) Powheg generator in the 4F scheme. This tool is based on the Powheg-Box-Res framework, and all relevant matrix elements are computed with OpenLoops. When applied to a multi-scale process like \(pp\rightarrow t \bar{t} b\bar{b} \), the Powheg method can lead to subtle technical issues. In particular, we have pointed out that the FKS mappings which generate the recoil associated with the first Powheg emission can enhance the amplitude of the underlying \(t \bar{t} b\bar{b} \) Born process in a way that leads to anomalously large weights as compared to the behaviour expected from the factorisation of soft and collinear radiation. Fortunately, such anomalies arise only from events with finite transverse momenta and not in the soft and collinear limits. Moreover, the Powheg-Box framework provides a mechanism that automatically attributes such events to the so-called finite remnant, where QCD radiation is handled as in fixed-order NLO calculations. This mechanism, which is controlled by the \(h_{\mathrm {bzd}} \) parameter in Eq. (23), plays an important role for the efficiency of event generation. Moreover, it makes it possible to avoid artefacts that can result from the application of QCD factorisation and resummation far away from their validity domain.

We have discussed predictions of the new Powheg generator and theoretical uncertainties for various \(t \bar{t} +b\)-jet cross sections and distributions at the 13 TeV LHC. At variance with previous studies, in order to provide a better picture of the perturbative convergence, we have evaluated QCD corrections using the same \(\alpha _\mathrm{S}\) value and the same PDFs at LO and NLO. The resulting NLO K-factors turn out to be close to two, even if the renormalisation scale is chosen in a way that is expected to absorb large logarithms associated with the running of \(\alpha _\mathrm{S}\). The question of the origin of such large higher-order effects and the search for possible remedies, such as improved scale choices, deserve to be addressed in future studies.

Scale uncertainties at fixed-order NLO amount to 25–30% and are dominated by renormalisation-scale variations. At NLOPS they tend to increase in a similar way as in Madgraph5aMC@NLO, while in Sherpa they tend to decrease [11]. However, this behaviour may be an artefact of the incomplete implementation of scale variations in NLOPS generators.
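As an illustration of how a scale-variation band is typically extracted, the following minimal sketch computes the relative envelope of factor-two variations of the renormalisation and factorisation scales. The set of variations used for the numbers quoted above is not restated in this section, so the sketch assumes the common seven-point convention, in which the two antipodal rescalings are discarded. The function name and the input layout (a dictionary mapping rescaling factors \((\xi _R,\xi _F)\) to cross sections) are hypothetical.

```python
def scale_envelope(xsec_by_scale, central=(1.0, 1.0)):
    """Return the relative (down, up) scale uncertainty around the central value.

    xsec_by_scale maps (xi_R, xi_F) rescaling factors to cross sections.
    Antipodal variations with xi_R / xi_F outside [1/2, 2] are discarded
    (assumed seven-point convention).
    """
    nominal = xsec_by_scale[central]
    kept = [
        sigma
        for (xi_r, xi_f), sigma in xsec_by_scale.items()
        if 0.5 <= xi_r / xi_f <= 2.0
    ]
    down = (min(kept) - nominal) / nominal
    up = (max(kept) - nominal) / nominal
    return down, up
```

Restricting the input to entries with \(\xi _F=1\) yields the pure renormalisation-scale band, which can be compared with the full envelope to check which of the two scales dominates.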

Comparing predictions at NLO, LHE and LOPS level reveals significant shower effects at the level of 10% in the \(t \bar{t} b\bar{b} \) cross section and up to 30% or more for the invariant-mass and \(\varDelta R\) distributions of b-jet pairs. These effects can be attributed to double \(g\rightarrow b \bar{b} \) splittings [8] and are qualitatively and quantitatively consistent with the findings of Refs. [8, 11]. For the \(p_\mathrm {T}\)-distribution of light-jet radiation, the predictions of the new Powheg generator are quite close to fixed-order NLO and also quite stable with respect to variations of the parameters \(h_{\mathrm {damp}} \) and \(h_{\mathrm {bzd}} \), which separate real radiation into singular and finite parts. This good stability is guaranteed by the \(h_{\mathrm {bzd}} \)-dependent mechanism mentioned above.
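The two b-jet pair observables mentioned above can be evaluated as in the following minimal sketch. The function names and the representation of four-momenta as \((E,p_x,p_y,p_z)\) tuples are hypothetical, and the use of rapidity rather than pseudorapidity in \(\varDelta R\) is an assumption.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Angular distance in the rapidity-azimuth plane."""
    # math.remainder maps the azimuthal difference to [-pi, pi].
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(y1 - y2, dphi)


def invariant_mass(p1, p2):
    """Invariant mass of a pair of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against small negative values caused by rounding.
    return math.sqrt(max(m2, 0.0))
```

Collinear b-jet pairs populate the region of small \(\varDelta R\) and small invariant mass, while pairs from well-separated emissions populate the tails of both distributions.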

To assess pure shower uncertainties we have compared Powheg samples generated with Pythia 8 and Herwig 7. In addition, we have considered systematic uncertainties due to the modelling of \(g\rightarrow b \bar{b} \) splittings and the choice of \(\alpha _\mathrm{S}\) in Pythia. At NLOPS, all shower uncertainties turn out to be rather small and clearly subleading with respect to QCD scale variations. As a further independent estimate of matching and shower uncertainties we have compared NLOPS \(t \bar{t} b\bar{b} \) generators based on Powheg+Pythia and Sherpa, finding remarkable agreement both for \(t \bar{t} +b\)-jet cross sections and distributions. We have also shown that matching and shower uncertainties increase considerably if NLO corrections are not taken into account. The same holds for NLOPS generators of inclusive \(t \bar{t} \) production as compared to \(t \bar{t} b\bar{b} \) generators.

Finally, we have presented predictions for \(pp\rightarrow t \bar{t} b\bar{b} \) with spin-correlated top decays. In this context we have shown that hadronisation and MPI effects are almost negligible. Thus, the key features of the QCD dynamics of \(t \bar{t} b\bar{b} \) production at parton level are expected to hold true also at particle level after top decays.

The new \(t \bar{t} b\bar{b} \) Powheg generator will be made publicly available in the near future, and its application to experimental analyses may lead to significant steps forward in the understanding of the QCD dynamics of \(t \bar{t} +b\)-jet production and in the control of the theoretical uncertainties that plague \(t \bar{t} H(b \bar{b})\) searches.