Lepton-pair production in association with a $b\bar{b}$ pair and the determination of the $W$ boson mass

We perform a study of lepton-pair production in association with bottom quarks at the LHC based on the predictions obtained at next-to-leading order in QCD, both at fixed order and matched with a QCD parton shower. We consider a comprehensive set of observables and estimate the associated theoretical uncertainties by studying the dependence on the perturbative QCD scales (renormalisation, factorisation and shower) and by comparing different parton-shower models (Pythia8 and Herwig++) and matching schemes (MadGraph5_aMC@NLO and POWHEG). Based on these results, we propose a simple procedure to include bottom-quark effects in neutral-current Drell-Yan production, going beyond the standard massless approximation. Focusing on the inclusive lepton-pair transverse-momentum distribution $p_{\bot}^{l^+l^-}$, we quantify the impact of such effects on the tuning of the simulation of charged-current Drell-Yan observables and the $W$-boson mass determination.


Introduction
The production of a pair of high-transverse-momentum leptons in hadron-hadron collisions is one of the historical testing grounds of perturbative Quantum Chromodynamics (QCD). At the lowest order (Born approximation), it proceeds through the parton level amplitude qq → Z/γ * → + − , which once folded with parton distributions, gives the (first order) prediction for the inclusive rate, the so called Drell-Yan (DY) process. As shown a long time ago, higher-order QCD corrections [1] are important and need to be included to improve both the precision and accuracy of the calculation. Predictions for more exclusive final states can also be calculated in perturbative QCD, including for example QCD jets or heavy quarks (bottom o top quarks).
The theoretical interest in this process is matched (or even surpassed) by the experimental one: dilepton pairs in high-energy collisions have been always considered golden final states for Standard Model measurements as well as for new physics searches. In the long and impressive list of experimental results which feature an + − final state at hadron colliders, the measurement of the inclusive lepton-pair transverse-momentum distribution, conventionally dubbed p + − ⊥ , in neutral-current (NC) DY, has now reached an impressive level of accuracy at the LHC. Using 8 TeV measurements, ATLAS [2] and CMS [3] have attained a total experimental uncertainty below the 0.5% level in a large interval of transverse-momentum values, ranging between 2 and 50 GeV. These achievements represent a formidable challenge for the theoretical predictions which need to combine approximate results obtained with different techniques (fixed higher-order corrections vs resummation to all orders of logarithmically-enhanced terms) matched together, to perform a sensible test of the Standard Model (SM).
The approximate inclusion of initial-state logarithmically-enhanced corrections to all perturbative orders is necessary to perform a meaningful comparison with differential distributions of the leptons and is known up to next-to-next-to-leading logarithmic (NNLL) QCD accuracy [20,21] with respect to log(p V ⊥ /m V ), where p V ⊥ is the lepton-pair transverse momentum and m V is the relevant gauge boson mass (V = W, Z); these corrections have been implemented in simulation codes such as ResBos [22] or DYqT/DYRes [23,24].
The problem of merging fixed-order and all-order results, avoiding double counting, has been separately discussed in the context of QCD [22,[25][26][27][28][29][30] and in the EW [31][32][33] computations. QCD and EW results have to be combined together to obtain a realistic description of the DY final states: general-purpose Shower Monte Carlo programs, such as Shower (PS), formally retaining only LL accuracy in the respective logarithmic expansions. The combined matching with exact matrix elements of QCD and QED PS, respecting the NLO-QCD and NLO-EW accuracy on the quantities inclusive with respect to additional radiation has been presented in [38][39][40][41]. For a systematic comparison of the tools that simulate the DY processes including higher-order radiative corrections see Ref. [42].
Given the very precise experimental results available for all the relevant observables in the NC-DY process, it is necessary to carefully quantify all possible sources of uncertainties, including those coming from sets of radiative corrections which are formally subdominant in the perturbative expansions in the strong and electromagnetic couplings. These higherorder corrections include contributions from subprocesses with additional coloured particles in the final state. Among them, the production of a lepton pair in association with a bottom quark pair is of special interest. In this case the presence of (at least) two wellseparated perturbative scales, m b and m Z , where Λ QCD m b m Z , can potentially lead to large perturbative logarithmic corrections whose impact needs to be carefully assessed on a observable-by-observable basis. Starting from the pragmatic point of view that b quarks can be found as partons in the proton, it can be easily checked that at the LHC bb → Z provides a small but non negligible fraction of the total cross section for inclusive lepton-pair production. As this contribution affects both the normalisation and the shape of the kinematic distributions, a careful analysis is required that can also estimate bottom mass effects.
A first estimate of the relative importance of contributions from different flavours of quarks in DY processes can be obtained by computing the individual contributions of quarks to the total cross section for NC-DY in the so-called five-flavour scheme (5FS), i.e. in terms of five massless active quarks. Results are shown in Table 1. While this decomposition is not physical per se as it is ambiguous beyond NLO, it allows to appreciate the precision needed in the predictions of NC-DY production through heavy quark flavours.  Table 1: Flavour decomposition of the total cross section within the acceptance cuts described in Section 2, computed at NLO accuracy with five active massless quarks in the proton.
Although in this case all the active flavours in the proton are described as massless, in the case of heavy quarks the effect of their mass m Q is introduced in an initial condition that controls the evolution equations of the respective parton densities; the latter start to  be non-zero at an energy scale of O(m Q ). These boundary conditions, combined with all the other constraints satisfied by the proton PDFs, result a heavy quark PDFs behaviour which is typically quite different from those of light quarks, leading to significant differences in observables like the p + − ⊥ distribution. In Figure 1 we appreciate the shape of the various contributions initiated by different quark flavours, which display a harder spectrum in the case of heavy quarks. We conclude that given the present experimental uncertainty, the bottom-quark contribution to the p + − ⊥ distribution deserves a dedicated study of the residual uncertainties due to the treatment of the bottom mass effects. With the latter, we mean an improved description of the bottom-quark kinematics, including mass-dependent terms, at NLO-QCD.
The p + − ⊥ distribution provides access to a large range of scales and therefore it offers a stringent test of perturbative QCD in different regimes: in the low-momentum region it is sensitive to non-perturbative QCD contributions and possibly to the flavour structure of the proton [43]. The precise knowledge of this part of the p + − ⊥ spectrum makes it possible to calibrate the non-perturbative models that describe the partonic transverse degrees of freedom inside the proton. Such models, implemented e.g. in QCD PS, are then used to simulate other scattering processes, and their uncertainties propagate in the prediction of the new observables. A striking example is the determination of the W -boson mass m W [44][45][46], which relies on the p + − ⊥ input to obtain an accurate simulation of the W -boson transverse-momentum spectrum and in turn of the leptonic final state. Eventu-ally, the extraction of m W displays a strong sensitivity to the modelling assumptions for the low-momentum part of the p ν ⊥ spectrum. In this context, it is important to remark that the heavy-quark contribution is different in CC-and NC-DY, because of the different initial-state flavour structure, following from electric charge conservation and Cabibbo-Kobayashi-Maskawa (CKM) mixing. More specifically, the bottom-quark effects which are present in NC-DY are marginal in CC-DY as the bottom-quark density appears only in the CKM-suppressed cb initiated subprocess. If (part of) the perturbative effects are nonuniversal and flavour dependent, as it is the case for the bottom-quark contributions, then a non-perturbative model based on the fit of NC-DY data could include in its parameterisation these effects and erroneously propagate them also to processes like CC-DY, where instead they are absent or marginal. In summary, improving the accuracy and precision of the heavy-quark contributions to the inclusive Z-boson production, is relevant: i) to reduce the amount of information which has to be encoded in a model that describes the low-momentum part of the gauge-boson transverse-momentum spectrum; ii) to capture some non-universal flavour-dependent contributions, which distinguish massless and massive quarks, leaving for the non-perturbative model, to a greater extent, only universal, flavour-independent effects.
Understanding heavy-quark contributions to lepton-pair production benefits also from the analysis of exclusive final states where the leptons are associated to a pair of bottomantibottom quarks, which are explicitly tagged in terms of either b−jets or B hadrons. The presence of additional energy scales, such as the masses and transverse momenta of the measured b quarks, imposes non-trivial constraints on the structure of the radiative corrections that have to be included in the simulations to obtain accurate predictions. Understanding these final states is also propedeutic to that of other heavy systems, e.g., a Higgs boson or a tt pair, accompanied by a bb pair.
The production of + − bb, with the inclusion of NLO-QCD corrections has been discussed in Refs. [47][48][49][50][51], and more recently in Ref. [52], for final states with at least one or with two tagged b-quark jets, in the so called four-flavour scheme (4FS), namely using a parameterisation of the proton structure in terms of only four active quarks and considering bottom quarks in the final state as massive. The matching of fixed-order matrix elements with a Parton Shower has been implemented in Ref. [53] in the Mad-Graph5 aMC@NLO framework [54] and in Ref. [55] in the Sherpa framework.
Given on the one hand the very high level of precision necessary to obtain sensible results in the description of p + − ⊥ and eventually in the determination of m W , and, on the other hand, the link between exclusive and inclusive final states characterised by the presence of heavy quarks, we deem necessary to scrutinise the theoretical uncertainties affecting the prediction of the observables for + − bb final states. To this aim, we present a systematic comparison of two different schemes of matching between fixed order results with a QCD PS 1 , namely the MadGraph5 aMC@NLO and the POWHEG ones; we expose the impact of different treatments for the QCD PS phase space assignment and we present the phenomenological results obtained with two QCD PS models, namely Pythia8 and

Herwig++.
To summarise we i) thoroughly compare the implementations of the production of a lepton pair in association with a bb pair in the 4FS between two available Monte Carlo event generators, with a systematic analysis of all the relevant QCD theoretical uncertainties; ii) consider the effects of including bottom-quark-mass contributions on the inclusive transverse-momentum spectrum of the lepton pair; iii) estimate the impact that such contributions may have on the determination of the W -boson mass.
The paper is structured as follows. In Section 2 we describe the setup employed for the numerical simulations; in Section 3 we study lepton-pair production in association with bottom quarks in the 4FS, we compare the implementations in the Mad-Graph5 aMC@NLO and POWHEG frameworks and discuss several sources of theoretical uncertainties for inclusive observables. We defer to Appendix A an extensive comparison of more exclusive observables. In Section 4, in order to evaluate the effects of the bottomquark mass, we consistently combine the 4FS prediction for + − bb with the usual 5FS inclusive lepton-pair calculation and study the transverse-momentum distribution p In Section 5 we consider the impact of bottom-quark mass effects on CC-DY observables and on the determination of the W -boson mass. We draw our conclusions in Section 6.

Setup of the simulations
In this work we study the processes for one leptonic family, in a setup typical of the LHC, with √ S = 13 TeV. Unless stated otherwise, the simulations have been run at NLO+PS accuracy with the codes MadGraph5 aMC@NLO (all the processes have been generated within the same computational framework) and POWHEG-BOX. Both codes have been interfaced with the same QCD-PS programs, namely Pythia8 (version 8.215, Monash tune) [35,57] and Herwig++ (version 2.7.1) [36,58]. We did not include any QED effect via QED PS. The simulation of the underlying event is not performed. For the proton parton-density parameterisation we use the NNPDF 3.0 NLO PDFs with α s (m Z ) = 0.118, with the same flavour-number scheme as for the hard process [59]. The SM parameters are set to the following values [60,61]: It is understood that the quoted value of m b is employed only in the 4FS process, Eq. 2.2.
For the central value of the renormalisation and factorisation scales, we use for all samples the lepton-pair transverse mass divided by four: For the 5FS NC-DY, this choice was advocated in Ref. [62]. The only exception to what stated above is represented by the samples for charged-current Drell-Yan used in Section 5.2, where the transverse mass of the (reconstructed) W boson is used: In Eq. 2.5 (2.6) M 2 and p + − ⊥ (p ν ⊥ ) are respectively the squared invariant mass and the transverse momentum of the lepton pair (lepton-neutrino pair).
In the simulation of processes 2.1 and 2.2, a generation cut M ( + , − ) > 30 GeV is applied in order to avoid the singularity related to the photon contribution. At the analysis level, for the processes 2.1 and 2.2, we apply a cut on the transverse momentum of each lepton, p ⊥ ( ± ) > 20 GeV, and on their pseudorapidity, η( ± ) < 2.5, together with an invariant-mass cut around the Z peak, |M ( + , − ) − m Z | < 15 GeV. In process 2.3 we impose a cut on the charged-lepton transverse momentum and on the missing transverse energy (the transverse momentum of the neutrino), p ⊥ ( + ) > 20 GeV, p miss ⊥ > 20 GeV, and again a pseudorapidity cut |η( + )| < 2.5 for the charged lepton.

Lepton-pair production in association with bottom quarks in the 4FS
In this section we study the process pp → + − bb + X in the 4FS. The bottom quarks, absent in the proton PDFs, are treated as massive final-state hard partons. In Section 3.1 we compare the formulation of two matching recipes to combine fixed-and all-order results with NLO-QCD accuracy, with special attention to the details of the inclusion of multiple parton radiation and to the perturbative sources of uncertainty. We then discuss phenomenological results, obtained with the setup outlined in Section 2: in Section 3.2 we first determine a typical scale which characterises the process and then in sections 3.3 and 3.4 we compare respectively the results of the various matched schemes for the transverse momentum of the + − bb system and the effect of higher-order corrections and of the matching for the lepton-pair transverse momentum. The interested reader can find more details on other differential observables in the Appendix A.

MadGraph5 aMC@NLO and POWHEG-BOX implementations
In this section we compare two different matching schemes, namely those implemented in the MadGraph5 aMC@NLO and POWHEG-BOX Monte Carlo event generators. The aim is to disentangle genuine bottom-quark effects from those due to a different treatment of higher-order emissions in the two approaches.
The simulation of scattering processes in hadron collisions requires not only the inclusion of fixed-order corrections in order to obtain a reliable estimate of the overall normalisation of the cross sections, but also the inclusion of multiple parton emissions at all orders in order to achieve a realistic description of the shape of the distributions. The possibility of simultaneously preserving the NLO accuracy for all the observables that are regular when the radiative corrections are included, together with the description of multiple parton emissions, is achieved by matching fixed-and all-order results. Different matching schemes have been proposed in the literature, and here we focus on MC@NLO [26] and POWHEG [27]; they share the same fixed-order accuracy, but differ for the inclusion of subsets of higher-order terms. The latter are beyond the accuracy of the calculation with respect to the coupling constant expansion and are formally subleading in a logarithmic expansion in powers of log(p V ⊥ /m V ), where p V ⊥ represents a generic transverse-momentum variable that yields a singularity of the amplitude in the limit p V ⊥ → 0, and m V is the invariant mass of the system whose transverse momentum is described by p V ⊥ ; although subleading, these terms can nevertheless have a sizeable numerical impact on the predictions, in particular for those observables that have only the lowest order accuracy.
The matching of fixed-and all-orders corrections should avoid double counting between the two contributions and respect the ordering of the emissions of QCD partons, in order to preserve the logarithmic accuracy of the results. In a Monte Carlo approach, the hardest QCD parton, with respect to the radiation ordering parameter t, plays a special role, for it receives the exact matrix element corrections of the fixed-order calculation. The subsequent emissions are instead generated by the QCD-PS and the associated phase-space volume is part of the matching prescription.
A generic scattering process, whose lowest order (LO) is characterised by the presence of k final state particles, receives radiative corrections due to the emission of n additional partons. In the MC@NLO approach an event is generated according to the following steps: 1. The event weight is split in two contributions, called standard (S-events) and hard (H-events), which describe final states with respectively k and k + 1 final state particles; both standard and hard terms are matched with a QCD-PS that generates n additional parton emissions using: i) the standard Sudakov form factor computed in the collinear approximation; ii) an approximated phase-space measure; iii) an upper limit for the hardness of the emission set by a scale Q sh called shower scale.
2. H-events account for the exact real matrix-element corrections describing the first real emission, evaluated in the full phase space with exact integration measure. The double counting in the generation of the hardest parton between the PS and the exact matrix element is avoided with an appropriate counterterm.
3. S-events account for all the terms entering a NLO cross sections (Born, virtual, counterterms, etc.) except for the real-emission matrix elements and the corresponding counterterms.
4. The shower scale Q sh associated to each S-event is extracted from a probability distribution. The latter parametrically depends, event-by-event, on a reference scale, which we denote with the symbol µ sh and which is computed considering the S-event kinematics. For the corresponding H-event, the maximum of the allowed values by the same distribution is used. The details of this procedure and the functional form of the distribution of are given in Section 2.4.4 of Ref. [54].
In the POWHEG approach an event with n additional partons is generated according to the following steps: 1. Each LO configuration is rescaled by a factor, usually denoted byB, that accounts for virtual corrections and the integral over the first real emission. This rescaling guarantees the full NLO accuracy for inclusive quantities.
2. The expression of the POWHEG Sudakov form factor depends on the splitting, in the full real-emission matrix elements R, between the singular R s and a remaining regular part R f , controlled by a scale h according to: where the damping factor f (h, t) depends on the radiation variable t, it goes to 1 in the collinear limit t → 0 and vanishes for large t. The scale h defines the region where the Sudakov suppression is active and the effects of multiple parton emissions are systematically included. 4. The fact that the first parton with emission variable t =t is by construction the hardest is obtained: i) computing the product of the Sudakov form factor for an emission witht with the corresponding real-emission matrix element; ii) limiting the QCD-PS phase space at the valuet of the emission variable t.
5. The QCD-PS populates the available phase space, assigning to the ordering variable t a maximum value equal to the shower scale Q sh of the event; this value is by default t =t (wheret varies on a event by event basis); however, a redefinition of the Q sh value, different thant, is allowed in the generation of the events based on the non singular part R f of the real-emission matrix element (remnant events), without spoiling the accuracy of the calculation; once Q sh is assigned, the generation of n − 1 additional emissions proceeds in the PS approximation for the branching probability and integration measure.
These two approaches share the NLO-QCD accuracy in the prediction of the total inclusive cross section, and differ by the inclusion of terms of higher order in the perturbative expansion in powers of α s . As already said, the latter are formally subleading with respect to the enhancement due to log(p V ⊥ /m V ) factors, but can nevertheless be numerically sizeable, depending on the phase space region under study.
We note that in general when PS programs take in short-distance events, a comparison is performed between the shower scale Q sh provided in the event (in the event-record field SCALUP) and the corresponding scale that would be associated by the code based on its own phase space evaluation. Since all the PS emissions must be ordered with respect to the hardness parameter t, the smallest value between the two is eventually used in the QCD-PS.
We focus now our attention on the actual distribution of Q sh in the event samples produced in the two Monte Carlo frameworks, i.e. the distribution of the values of the SCALUP field of the event records.
In Figure 2 (left plot) we show the histograms obtained with MadGraph5 aMC@NLO for different choices of µ sh , in the 4FS simulation. 2 For reference, we also show the same distribution in the 5FS simulation.
In Figure 2 (right plot) we show the distributions obtained in the POWHEG-BOX 4FS simulation, for the events describing the singular part and the regular reminder of the real matrix element (B and remnant events respectively), for different values of the damping factor h. In the POWHEG-BOX default setup, Q sh coincides with the transverse momentum of the first emission. As said, it is possible to preserve the logarithmic accuracy of the calculation with a different choice of Q sh for the remnant events. 3 We recall that the generation probability of a remnant event depends also on the scale h introduced in the POWHEG Sudakov form factor, as it can be seen from the plot. 2 The choice µ sh = √ŝ has been the default in MadGraph5 aMC@NLO up to version 2.5.2. From version 2.5.3, µ sh = HT /2 is the new default. In these newer versions, it is still possible to use µ sh = √ŝ , by setting the i scale=0 in the subroutine assign ref scale inside montecarlocounter.f.
3 For example, we tested a different option in Figure 15 where we also considered the distributions of From the comparison of the two plots, one may appreciate how different the distributions of Q sh are in the event samples generate by the two frameworks. This fact and the different structure of the matching procedure itself, impact the final results of the simulations (after showering), formally beyond the claimed accuracy but still in a phenomenologically relevant way. We also note that the final numerical results, after showering, depend on the PS ordering variable, so that different QCD-PS models may yield different results even if using the same sample, Q sh distribution.
To summarise, the formulation of the matching and the interface with the QCD PS are closely entangled; their ambiguities and prescriptions represent an important source of theoretical uncertainty, beside the canonical ones, related e.g. to the choice of the renormalisation and factorisation scales. These additional uncertainties are relevant in the study of the shape of the kinematic distributions, in particular of those sensitive to the details of real radiation. 3.2 Identification of the reference energy scale for lepton-pair production in association with a b-quark pair In Ref. [62], the production of a Higgs boson in association with a massive bb pair is considered and, following the discussion of Ref. [63], a universal logarithmic factor L ≡ log Q 2 (z)/m 2 b , associated to each g → bb splitting is identified. We adapt this approach to the case under study of the subprocess g/q(p 1 ) g/q(p 2 ) → + (q + ) − (q − )b(k 1 )b(k 2 ) and obtain that the universal corrections have the form where M 2 ( + , − ) is the squared lepton-pair invariant mass. The effective scale that characterises the process is Figure 3: Event distribution, in + − bb production, with respect to the variable M defined in Eq. 3.2. The red arrow corresponds to the peak of the distribution.
In Figure 3 we plot the distribution dσ/dM and observe the presence of a peak at M ∼ 25 GeV. We interpret this value as one of the typical energy scales that characterise the process and justify, following Ref. [62], our choices described in Section 2 for the renormalisation and factorisation scales in the 4FS.

The + − bb transverse-momentum distribution
In this section we consider the transverse-momentum distribution of the + − bb system and present numerical results in different approximations in Figure 4. This quantity is of technical interest, because it makes it possible to study the impact of QCD radiation on this system, with interesting features due to the presence of coloured particles in the final state, whose emissions contribute to the recoil of + − bb.
In the left plot of Figure 4 we show the distribution of the logarithm of the trans-verse=momentum distribution of the + − BB system 4 computed with different combinations of generators and of scales 5 : in black we present the divergent distribution computed in the fixed order NLO calculation (we stress that this quantity is only LO accurate in this calculation); in blue we show results obtained with MadGraph5 aMC@NLO and Pythia8 , using as reference shower scale, µ sh , the partonic variable √ŝ /2 (dashed) or √ŝ /4 (solid); in red we show results obtained with MadGraph5 aMC@NLO and Pythia8 , using as µ sh the sum of the final-state transverse masses H T /2 (dashed) or H T /4 (solid); in green we show results obtained with POWHEG and Pythia8 , setting the value of the scale h in the damping factor equal to m Z (dashed) or m Z /4 (solid). In the upper inset of the right plot of Figure 4, we compare the different combinations of tools and scales of the left plot with the fixed-order NLO-QCD results. As a function of the transverse momentum of the system we observe: the Sudakov suppression at low momenta; the redistribution of the events due to the unitarity of the PS and of the matching procedure, yielding an increase of the distribution at intermediate momenta; a decrease of the distributions compared to the fixed order results occurs at large momenta, for both MadGraph5 aMC@NLO and the POWHEG-BOX and irrespective of the choices of the PS parameters.
In the lower inset of the left plot we show the PDF and renormalisation/factorisation scale uncertainties, which are at the ±20% level for transverse momenta smaller than 120 GeV, but increase and reach the ±45% level for transverse momenta close to 1 TeV, and are dominated by the scale uncertainties.
In the lower inset of the right plot we compare again with the fixed-order NLO-QCD curve the result obtained with MadGraph5 aMC@NLO combined with Pythia8 (blue) and with Herwig++ (brown) using for µ sh the partonic variable √ŝ /2 (dashed) or √ŝ /4 (solid). We observe similar trends of the two QCD-PS models but a quantitative significant difference in the comparison of the bands obtained with a variation of the shower scale in the same range, in particular for the size of the bands corresponding to the √ŝ /2 scale choice.

in different approximations
In this section we study the transverse-momentum distribution p + − ⊥ of the lepton pair, in presence of a bb pair in the final state, inclusive over the b-quark contributions.
The process pp → + − bb is studied in the 4FS in different perturbative approximations, namely at LO, at fixed NLO-QCD, including QCD-PS effects matched with the LO or with the NLO-QCD results. At variance with the 5FS case, where the p → 0, this observable in the 4FS is regular in the same limit at fixed order and a fortiori after matching with a QCD PS. The regular behaviour of p + − ⊥ in the 4FS is due to the bottom-quark mass, which acts as a regulator for the singularity associated with the limit p distribution is sensitive to large logarithmic corrections due to QCD initial-state radiation, mostly from the gg-initiated subprocess 6 . The origin of these large effects can be understood by considering the two mechanisms that yield a transverse momentum of the lepton pair: i) the LO distribution of the γ * /Z boson in a three-body final state and ii) the recoil against QCD radiation of the + − bb system. While the former is regular in the whole phase space, the latter is sensitive to the presence of collinear divergences due to initial-state radiation. In fact, the transverse-momentum distribution of the + − bb system is divergent at fixed order for vanishing transverse momentum and requires the resummation of logarithmically-enhanced terms to all orders to become regular. After the resummation, the transverse-momentum distribution of the + − bb system is still sensitive to logarithmically-enhanced corrections, which contribute in turn to the second of the two mechanisms that yields the p + − ⊥ distribution, explaining why the prediction of the latter requires not only a fixed-order calculation but also the matching with multiple parton emissions at all orders via QCD-PS.
In Figure 5 (left panel) we compare the 4FS distributions in different perturbative approximations: at fixed-order LO and NLO and, after the matching of MadGraph5 aMC@NLO with the Pythia8 QCD-PS, with LO+PS and NLO+PS accuracy. We use the inputs described in Section 2 and set µ sh = √ŝ /4 as the reference shower scale. In Figure 5 (right panel) we show the relative impact of the various approximations relative to the LO results. The NLO corrections (green) yield a large K-factor of O(70%), flat almost the whole p  Figure 6 we study different sources of theoretical uncertainty, using the same colour code and comparing the same approximations as in Figure 4. In the left plot, upper inset,  We observe a common trend of the corrections due to multiple parton emissions, which are negative down to −30%, with respect to the fixed NLO prediction, for p + − ⊥ < 20 GeV and positive up to +15% for larger values. From the upper inset of the right plot of Figure 6 we observe that in the low-p + − ⊥ region the POWHEG-BOX corrections are slightly smaller in size than those of MadGraph5 aMC@NLO.
In the left plot, lower inset, we show the uncertainty bands associated to scale variations (the renormalisation and factorisation scales are varied independently in the interval [µ/2, 2µ]) and to PDFs. As it can be seen, scale variations provide an uncertainty of O(±20%) which is quite independent on the value of p + − ⊥ , and represent by far the dominant source of uncertainty. PDF uncertainties are much smaller, below 3%. In the right plot, lower inset, we compare the PS Pythia8 and Herwig++, both matched to MadGraph5 aMC@NLO and with the same reference shower scale µ sh . We observe that accidentally the combination of MadGraph5 aMC@NLO with Herwig++ yields results which are in size and shape similar to those obtained with the POWHEG-BOX and
In Figure 7 we compare the predictions of MadGraph5 aMC@NLO at LO+PS and NLO+PS accuracy obtained with Pythia8 and Herwig++ as QCD-PS (left plot), with either µ sh = √ŝ /4 or µ sh = √ŝ /2; we show the relative difference with respect to the NLO fixed-order prediction in the right panel. The differences of order ±15% at LO+PS (green and red bands) are reduced down to ±7% at NLO+PS, because the first real emission is described, in the latter case, by the exact matrix element and the PS differences appear from the second emission.

Inclusive lepton-pair transverse-momentum distribution
In this section we discuss the prediction of the inclusive lepton-pair transverse-momentum distribution and propose a formulation that includes a refined treatment of the bottomquark contributions, exploiting the advantages of both 4FS and 5FS formulations of the lepton-pair production process. Two procedures are commonly followed to calculate high-energy processes characterised by a hard scale Q, that involve the production of heavy quarks such as the bottom quark.
In the so called "massive" or four-flavour scheme (4FS) the heavy quarks do not contribute to the proton wave-function because the value of their mass, larger than the one of the proton, makes their creation in pairs possible only in high-energy interactions. In this scheme the active degrees of freedom are n f light quarks, while the heavy quarks are decoupled and do not contribute to the running of the strong coupling constant nor to the PDF evolution; in particular a bottom-quark PDF is absent. The validity of this approach is guaranteed when the hard scale Q of the process is comparable to the heavy-quark mass m b . The latter acts as a natural cut-off in the case of additional collinear emissions.
In the case when a hierarchy between the heavy-quark mass and the hard scale of the process is present (m b Q) it is possible that large corrections enhanced by log Q 2 m 2 b appear in the cross sections, spoiling the convergence of the perturbative expansion, while powers of the ratio m b /Q are naturally suppressed. The initial-state logarithmic corrections can be resummed to all orders via the Altarelli-Parisi equations and reabsorbed in the definition of a bottom-quark proton PDF, while in the final-state case it is possible to introduce appropriate fragmentation functions. The bottom quark belongs then to the light quarks present in the proton (n f = 5) and contributes to the running of the strong coupling constant. This approach is called "massless" or five-flavour scheme (5FS).
The advantages of the 5FS are related to the lower multiplicity of scattering particles: the simplicity of the final-state structure makes it possible to include higher-order radiative corrections more easily than in the corresponding 4FS processes. In addition, the presence of a bottom PDF in the proton resums to all orders initial-state collinear logarithms due to gluon emissions. As of today, the final-state higher multiplicity in the 4FS forbids the inclusion of corrections beyond NLO-QCD. On the other hand, the exact description of the massive-quark kinematics is already present at LO and can be analysed in detail upon inclusion of the NLO corrections and also after matching with a QCD PS. In addition, as argued in Section 3.2 based on the results of Ref. [62], possibly large logarithms log Q 2 m 2 b are in fact suppressed by phase space effects, and the effective scale Q 2 is parametrically lower than the vector boson mass. One therefore expects that in this region bottom mass effects to be more relevant than the collinear logarithms and the 4FS could be preferred over the 5FS. Another motivation for employing the 4FS scheme is provided by the inclusive gaugeboson transverse-momentum distribution, which has a peak in the interval between three and eight GeV, namely at values comparable to the bottom-quark mass. The simulation of the lepton-pair transverse momentum distribution in shower Monte Carlo codes based on the 5FS description requires, in the case of the bottom-induced partonic contributions, that the emission of real radiation stops at a transverse momentum scale of O(m b ), that the b parton be put on its mass shell and that the hadronisation of b quark into B hadron takes place. This is typically handled by an ad hoc procedure in the QCD PS, which features intrinsic ambiguities. In the 4FS instead, the lepton-pair transverse momentum distribution receives an exact matrix-element description including the O(α s ) corrections in the full range from zero GeV up to the kinematic limit.

Bottom-quark contributions to DY in 4FS and 5FS
We are interested in combining the advantages of the 4FS and 5FS approaches, in order to improve the description of the bottom-quark effects in the lepton-pair transversemomentum distribution. 7 The merging of 4FS and 5FS results is in principle possible provided that double counting is avoided. To this aim, equivalent terms that contribute in the two schemes need to be identified, then subtracted from the 5FS description of the process and added back as evaluated in the 4FS. The rationale behind this combination is the possibility of exploiting the improved description offered by the 4FS of the heavy-quark contribution to observables like the gauge-boson transverse momentum at low-/intermediate-momentum values. Figure 8: Comparison of the p + − ⊥ distribution in the plain 5FS with the contribution associated to the bottom quark, the latter evaluated in different schemes and approximations.
In the 5FS, at tree level, the DY process occurs through quark-antiquark annihilation, the partonic cross section starts at O(G 2 µ ) and bottom-initiated subprocesses are already present. Since we are interested in the bottom contribution, we remark that this density, generated inside the proton by a radiative mechanism, is proportional to α s and it contains, via Altarelli-Parisi evolution, the resummation to all orders in α s of terms enhanced by a factor log(µ F /m b ).
In Figure 8 we show, with NLO+PS accuracy, in black dashed the complete p + − ⊥ distribution in the 5FS and in red dashed the contribution given by the subprocesses initiated by at least one bottom PDF. The size of the latter is consistent with the overall contribution of O(4%) to the total cross section, but the peak of the distribution is at a larger value than the one of the all-flavour p + − ⊥ distribution (10 GeV vs. 3 GeV). After the matching of exact NLO matrix elements with a QCD-PS that simulates parton radiation to all orders, we have to consider the possibility that the emitted gluons split into bb pairs, which appear as final-state hard partons; such terms are of O(α 2 s G 2 µ ) (when the initial state contains only light quarks) or higher. Since it is not possible to make a distinction between initial-and final-state bottom contributions, we are lead to define the bottom contribution to DY in the 5FS as the one given by all the events that contain at least one B hadron in the final state (generated in the hadronisation phase of the QCD-PS). We recall that in the 5FS the cross section is evaluated with five active flavours contributing to the strong coupling-constant running, inducing a bottom contribution also in the subprocesses initiated by light quarks and gluons; the latter are not tagged by the B hadron selection.
In the 4FS, the bottom quark in the proton is by definition absent; lepton-pair production in association with a bb pair starts at O(α 2 s G 2 µ ), with the strong coupling-constant running with four active flavours. This LO cross section is exact in the description of the kinematics of the massive bb pair. In a NLO-QCD accurate calculations, also terms of O(α 3 s G 2 µ ) are exactly included. In this scheme, heavy-quarks contributions to the α s running are decoupled and included in the renormalisation condition. After matching with a QCD PS, additional bb pairs might be created, although with suppressed rate, starting from O(α 4 s G 2 µ ). In Figure 8

distribution
As discussed in Section 4.1, the improvement over the plain 5FS description can be obtained by the subtraction of the bottom-related contributions and their replacement with the 4FS results.
We define two physical distributions, namely the production of a lepton pair strictly without B hadrons (our B-vetoed 5FS calculation, that we label 5FS-Bveto) and the production of a lepton-pair accompanied by at least one B hadron (our 4FS results), which are complementary with respect to the additional particles beside the lepton pair. 8 The orthogonality of the two quantities allows us to take their sum and to consider it as our alternative prediction for any DY observable, in particular for the lepton-pair transversemomentum distribution, with respect to the treatment of the bottom-quark effects dσ mass dp The impact of our combination is illustrated by the ratio R(p + − ⊥ ) of the shape of our alternative combination for the p + − ⊥ distribution over the corresponding results obtained in the plain 5FS, defined as The ratio in Eq. 4.2 is defined for a generic Parton Shower model, which we label with tuneX. In Figure 9 we show the function R(p , computed using, in all the terms that enter in its definition, the same matching scheme (MadGraph5 aMC@NLO in the left plot, POWHEG-BOX in the right plot) and QCD PS model (Pythia8). We argue that the ratio deviates from one because of the different content of perturbative terms associated to the treatment of the bottom quark, and also for the choice of the Parton Shower phase space. The different approximations shown in Figure 9 correspond to: 5FS computation  For what concerns the effects due to the shower parameters, the choice of µ sh has always a visible effect on the shape of the MadGraph5 aMC@NLO predictions, and reflects the pattern of the Q sh probability distribution. In the POWHEG-BOX, the main effect is due to the variation of h, while the variation of the prescription for Q sh in the "remnant" events has no effect in practice.
One may wonder whether the effects of the improved prediction can be enhanced in some region of the lepton-pair phase space: in fact, if the dominant effects enter through terms of O (m b /M ( + , − ), log(m b /M ( + , − )) ), with M ( + , − ) being the lepton-pair invariant mass, one may expect a dependence of these effects with respect to the γ * /Z virtuality. A study in this direction is further motivated by Ref. [2] (see in particular Figures 14 and 15 therein), where no single generator gives a satisfactory description of the p + − ⊥ spectrum across different lepton-pair invariant-mass or rapidity bins, with the data-theory disagreement at the level of several tens of percent. In Figures 10 and 11 we show the ratio R(p + − ⊥ ) defined in Eq. 4.2 and already studied in Figure 9, now analysed in different bins of lepton-pair invariant mass and rapidity, respectively. The binning corresponds to the one adopted in Ref. [2] (with the exception of the invariant-mass bin below 30 GeV, which we do not consider). For the sake of simplicity, we only show Mad-Graph5 aMC@NLO predictions, and, among these, only those with µ sh ∼ √ŝ (which are those giving the largest effect in Figure 9). However, we have also performed the same analysis with the POWHEG-BOX and found similar results. As a function of the lepton-pair invariant mass, one indeed observes a trend in the corrections: they are flat and positive (+3-4%) in the bin with the smallest invariant mass (30 GeV < M ( + , − ) < 46 GeV), while going at higher invariant masses the corrections are smaller and become negative(-1%) in the full p + − ⊥ range of the largest invariant-mass bin (116 GeV < M ( + , − ) < 150 GeV). As expected, the bin around the Z peak has a shape which closely follows the one considered in the inclusive case shown in Figure 9.
The effect is identical to that of the inclusive case for rapidity up to |y( + , − )| < 1.6 At larger values, the effect of the improved prediction becomes flatter and negative (-1%). Given the size and the shape of these effects, we conclude that the data-generator differences found in Ref. [2] cannot be attributed to heavy-quark effects. The new NNLO computations which have been recently published for Z+jet production [15][16][17][18][19] hint that these discrepancies may be due to missing higher-order QCD effects.

Interplay between NC-DY and CC-DY
Effects due to the strong interaction in a non-perturbative regime, which cannot be evaluated from first principles, affect several observables studied at hadron colliders.
The NC-DY, thanks to the full kinematic reconstruction of the lepton pair, allows us to perform a precise tuning of the models describing non-perturbative contributions to the p + − ⊥ distribution, at small values of the transverse momentum. Under the assumption that these long-distance physics effects are universal, it is possible to use these models to predict other observables.
In CC-DY the neutrino transverse momentum is inferred from the study of the recoil of the whole hadronic system that accompanies the lepton pair, but a precision measurement of p ν ⊥ is not possible at the level necessary for the m W determination; furthermore, a stand-alone determination of the non-perturbative effects relevant to describe p ν ⊥ at small transverse momenta is not possible. For the above reasons, the parameters fitted from NC-DY are used in the simulation of CC-DY, to predict the p ν ⊥ distribution. An imperfect evaluation of the non-perturbative parameters in the NC-DY fit will propagate to CC-DY and in turn affect the m W determination. The different heavy-quark flavor content of the initial state in NC-DY with respect to CC-DY suggests that the nonperturbative parameters fitted in NC-DY might not be fully universal as one would wish in view of a high-precision simulation of CC-DY, where a bottom quark in the initial state is in fact absent. In Section 4.2 we have proposed a way to include in the description of the p + − ⊥ distribution an explicit treatment of the non-universal elements peculiar of the bottom quarks, in particular due to mass corrections. We can employ this method and re-fit the non-perturbative parameters, so that such a fit can be sensitive only to effects which are (more) universal, in the sense that they are common to light and heavy quarks.

Transferring the bottom-quark effects to the simulation of charged-current Drell-Yan
In this section we try to estimate how our alternative prediction of the inclusive p + − ⊥ distribution, which accurately includes the bottom quarks contributions, will affect the prediction of the p ν ⊥ distribution and, in turn, the m W determination. We do not rely on the experimental data, but rather try to explore the role of the bottom quark with a simplified approach based only on the available simulation tools. We make the following assumptions: i) it is possible to tune the parameters of the QCD-PS to perfectly describe the shape of the p + − ⊥ data in the 5FS (we call this setup tune1); ii) given the smallness of the bottom-quark effects, it is possible to find a second combination of the QCD-PS parameters to perfectly describe the shape of the p + − ⊥ data also when we use our alternative prediction Eq. 4.1 for the perturbative cross section (we call this setup tune2); iii) the parameters of the QCD-PS describing non-perturbative effects are universal, i.e. flavor independent, and constant, i.e. energy-scale independent. We assume that our perturbative description provides the bulk of the prediction and non-perturbative effects are just a correction that compensates for the different perturbative approximations. As a consequence of iii), the non-perturbative parameters contribute to the description of the gauge-boson transverse-momentum distribution, irrespective of the boson, W or Z, as a function only of the transverse-momentum value.
If tune1 and tune2 provide the same exact description of the shape of the data in the fiducial region (fid) defined by the acceptance cuts, we can write 1 σ exp fid dσ exp dp , (5.1) where the last equality follows from Eq. 4.2 and we use the labels (exp, 5FS, mass) to indicate the experimental data, the plain 5FS massless simulation and our alternative predictions. From these equalities we read that the function R expresses the difference in the predictions of the shapes computed with the same 5FS massless partonic cross section, using tune1 or tune2 In summary, the function R represents the impact of the improved perturbative treatment of bottom-quark effects; alternatively, if these effects can be perfectly absorbed in a QCD-PS tune, it describes the difference of the predictions obtained in the plain 5FS, using either the plain 5FS tune or the tune derived from the improved partonic cross section. In our study, we would like to simulate CC-DY using tune2, i.e. with a Parton Shower that has been tuned to account for the bottom-quark effects, and compare these predictions with the standard ones based on tune1. Since tune2 is not yet available, we can mimic the CC-DY results corresponding to this tune in the following way: we work with the plain 5FS code interfaced to a tune1 QCD-PS and we reweigh by 1/R(p ν ⊥ ) each event according to its lepton-pair transverse momentum p ν ⊥ . This last combination allows us to assess the impact in the CC-DY simulation of an improved treatment of the bottom-quark effects in the NC-DY fit. The reweighting of p ν ⊥ then propagates to all the other leptonic observables used in the m W determination and leads eventually to a shift in the measured m W value.

Template-fit determination of m W
The procedure of template fit to a distribution of experimental data consists in the comparison with the data of several theoretical distributions, the templates, obtained varying the fit parameter, in our example m W . The template that maximises the agreement with the data selects the preferred, i.e. the measured value of the fit parameter. In the present study we do not directly compare the theoretical distributions with the data. We choose one set of input parameters as reference and prepare the templates accordingly, letting m W vary in a given range. We then simulate the distribution with a second set of inputs, keeping m W at a fixed nominal value m W 0 . We fit this distribution, that we call pseudodata, with the templates based on the first set of inputs. The preferred value m W ,j is in general different than m W 0 , because the fitting procedure tries to accommodate the distortion induced by the second set of inputs with a shift of m W . The difference m W ,j − m W 0 is an estimate of  Figure 12: Comparison of templates generated for different m W values in the CC-DY in the 5FS (orange shaded areas). In the top row we show the POWHEG-BOX results, while in the bottom row we present the equivalent curves from MadGraph5 aMC@NLO. In blue, red and green we display the curves obtained by using the reweighting function R, for several different setups.
the difference between the two results that would be observed, if one would use templates based on these two sets of inputs to fit the same experimental data.
In the present study we use all our central choices for the input parameters, as described in Section 2 to compute the templates for the CC-DY lepton transverse momentum and lepton-pair transverse mass in the plain 5FS using tune1, i.e. our default Pythia8 tune. We generate the distributions corresponding to different values of m W with the reweighting technique described in refs. [66,67]. For illustration we show in Figure 12, for the transverse mass of the W boson as well as for the transverse momentum of the charged lepton, the comparison of templates computed with different m W values in a range up to ±20 MeV about the central PDG value: we observe a spread of the curves at the Jacobian peak in correspondence of the W resonance. The same figure also shows the effect of the p ν ⊥ reweighting on these observables, according to our alternative predictions as shown in Figure 9. These curves represent the pseudodata that will be employed in the fit, which are obtained again in the plain 5FS with default Pythia8 tune, PDG values for the masses and the p ν ⊥ -dependent reweighting by R(p ν ⊥ ), described in Section 5.1, to simulate the impact of the improved bottom-quark treatment. For both templates and pseudodata we consider the shape of the distributions: we define a range of values of the fitted variable around its Jacobian peak and normalise the distribution to the corresponding integral. In this way we enhance the sensitivity of the template fit procedure to the precise position of the peak.
The level of agreement between templates and pseudodata can be assessed with the least squares method. The standard definition of a χ 2 indicator, assumes that all the bins are uncorrelated and that each contributes according to its statistical error, represented by σ 2 k . For each template j we compute χ 2 j ; as a function of j we should obtain a parabola whose minimum indicates the preferred value of the fit parameter. We perform two independent fits on the lepton transverse momentum and on the W -boson transverse mass, which is defined, starting from the transverse momentum of the charged lepton and of the neutrino 9 , as where φ + ν is the azimuthal angle between the two leptons. The fit is performed in the following ranges, which correspond to the ones employed by ATLAS [46]: The granularity of the m W scan is of 1 MeV. In Figure 13 we show the χ 2 parabolas and the shift induced by reweighting the p ν ⊥ distribution with our alternative p + − ⊥ description. The left column of the figure refers to the transverse mass, the right column to the lepton transverse momentum. Plots in the top row are obtained with the POWHEG-BOX, those in the bottom row with Mad-Graph5 aMC@NLO. As far as the transverse mass is concerned, all induced shifts are compatible with zero. In fact this observable is known to be insensitive to the details of the p ν ⊥ modelling [68]. When the lepton transverse momentum is considered, the mass shifts are of the order ∆ m W ∼ 4 − 5 MeV when the 5FS is improved with the fixed-order 4FS prediction, and of ∆ m W ∼ 1 − 2 MeV or ∆ m W ∼ 3 MeV for the predictions improved with the NLO+PS 4FS, in the POWHEG-BOX and MadGraph5 aMC@NLO respectively. We take the results based on the fixed-order NLO 4FS calculation as a technical benchmark, while we consider the results obtained matching fixed-and all-orders calculations, discussed in detail in Section 3.4, as more accurate in the description of these transversemomentum distributions. In conclusion, our estimate of the m W mass shift due to b-quark effects, in the measurement from the lepton transverse-momentum distribution, is in general smaller than 5 MeV. We conclude our section by investigating how the extracted 9 We identify the neutrino transverse momentum and the missing transverse energy, i.e. p ⊥ (ν) ≡ p miss   Figure 14 and can be justified with a comparison with Figure 12, where the reweighted distributions and the templates are compared. We should consider, for a given template j, which bins contribute the most to the χ 2 j value and this can be guessed at glance by checking when a reweighted distribution follows and when it deviates from the template. Above the Jacobian peak the reweighting procedure yields a shape qualitatively different than any of the templates, so that the inclusion of the bins where the deviation is more pronounced may affect the position of the minimum of the χ 2 parabola. This problem is particularly evident when considering the fixed-order results (green dots in Figure 12), which correspond to the lower right plot of Figure 14.  Figure 14: Dependence on the fit window of the template-fit results, in the case of the lepton transverse-momentum distribution. The black marks correspond to the fit range used in Figure 13.

Conclusions
The high luminosity of the LHC together with the stunning performances of the detectors (ATLAS, CMS, and also LHCb) have turned Drell-Yan processes into high-precision arenas where to test our understanding of the fundamental interactions on the one hand and to perform the most precise measurements of the parameters of the SM, on the other hand. At this level of precision one needs to control not only higher-order perturbative effects, QCD as well as EW, but also less obvious ones, such as non-perturbative or parametric effects. An interesting example, discussed in this work, is given by the contribution from bottom quarks to Drell-Yan processes which so far has been considered in the massless approximation. In fact, the associated production of a lepton pair together with a bb pair is a rather complicated process, featuring several (if not all the) aspects that make the description of final states involving b quarks an interesting challenge for theorists as well as for experimentalists.
In this paper we have considered + − bb production in the 4FS at the LHC with the main goal of assessing the accuracy and precision currently achievable of + − observables inclusive over the bottom quarks. To this aim we have employed state-of-the-art Monte Carlo tools accurate at NLO in QCD and matched to parton showers. We have shown that predictions from different NLO MC tools for quantities that are inclusive with respect to the bottom quark in the + − bb final state are in agreement within the expected uncertainties. We have employed a simple prescription which makes it possible to consistently include the contributions from massive bottom quarks into the inclusive DY production calculated in the 5FS, and studied their effects together with the associated uncertainties. In so doing we have been able to estimate that the residual uncertainties have a small but visible (∆m W < 5 MeV) impact on the W -mass extraction. The stability of this prescription, with respect to the inclusion of higher-order QCD corrections, could be further explored with the help of codes which make it possible to match QCD-PS with NNLO-QCD accurate predictions for Drell-Yan processes [28][29][30]. We have also performed an extensive study in the 4FS of the observables that are exclusive on the bottom quarks. This analysis, which is documented in Appendix A, reveals differences between formally equivalent methods that are larger than the (estimated) associated uncertainties, at least in some cases. A thorough comparison of many distributions has allowed us to identify the regions in phase space where the differences arise. Assessing the origin of such discrepancies in the specific case of + − bb and providing a resolution will be an important task for the SM and BSM programme of the LHC (see e.g., the measurement of HZ-associated production at small transverse momentum and the search for dark matter in the missing-transverse energy +b-jet final states are two examples directly related to + − bb). A deeper understanding of the treatment of the bottom quark contributions can be crucial also for the precision prediction of very important final states like ttbb. This, however, needs a dedicated effort which goes beyond the scope of our work and it is left for future investigations.

Acknowledgements
We would like to thank Maarten Boonekamp, Stefano Camarda, Rikkert Frederix, Stefano Frixione, Luca Perrozzi, Paolo Torrielli, for many stimulating discussions. MZ is grateful to the Physics Department of the University of Milan for the hospitality in many different occasions. This work has received funding from the European Union's Horizon 2020 research and innovation programme as part of the Marie Sklodowska-Curie Innovative Training Network MCnetITN3 (grant agreement no. 722104). EB is supported by the Collaborative Research Center SFB676 of the DFG, "Particles, Strings and the early Universe". A. Appendix: Differential observables in + − bb production In this appendix we compare results obtained with different PS and/or matching schemes for various differential observables in pp → + − bb production, possibly distinguishing different signatures depending on the number of tagged b-jets.
We use the setup described in Section 2. After parton shower and hadronisation, hadrons are clustered into jets using the anti-k T algorithm [69] as implemented in FastJet [70,71], using a radius parameter R = 0.4. Jets are required to satisfy the following conditions A jet is considered as a B-tagged jet if at least one B-flavoured hadron is found among its constituents. For fixed-order predictions we apply the same jet-clustering algorithm to QCD partons (gluons and quarks, including the b), and we consider a jet as B-tagged if at least one b quark appears among its constituents. In both cases we assume a 100% B-tagging efficiency and zero mis-tagging rate.
The point of this comparison is to stress the fact that the differences that emerge by employing different matching approaches and QCD PS models (as one can appreciate in figures 4, and 15-29), make it apparent that higher-order terms with respect to the α s expansion, subleading in the counting of logarithmic enhancing factors, can nevertheless be numerically sizeable. We consider the width of the envelope of the different uncertainty bands presented in these figures as a conservative quantity useful to characterise our level of understanding of the observable under consideration and of the accuracy of our simulations. When we observe a similar shape in the correction factors expressing the impact of all the terms beyond NLO-QCD, we tend to consider the envelope a reliable conservative estimate of the residual uncertainties; when this is the case, in all the plots considered the envelope has a width typically of O(±20%) with respect to its mid point, a value that also represent the typical uncertainty from scale variations in most kinematic configurations (scale and PDF uncertainties are shown for all differential observables). When instead we observe different trends in the corrections, rather than quoting a very large uncertainty, we can only argue that the comparison is signalling the presence of a quantity whose description is very sensitive to the details of the radiation and deserves further analytical and numerical investigation. The colour code employed in all figures in this appendix is the same as in Figure 4.

A.1 Jet multiplicities
The first observable we investigate is the number of reconstructed b jets, shown in Figure 15. With respect to the normal layout of the figures, for this specific observable we also show as a green-patterned band the uncertainty related to the variation of Q sh in the "remnant" events of the POWHEG-BOX samples, as described in Section 2. Higher-order QCD corrections play a non trivial role in the jet reconstruction, yielding in turn sizeable effects.  Figure 15: The b-jet multiplicities. Same colour codes and approximations of figure 4.
The b-jet multiplicity is thus the first quantity that has to be discussed, for a correct interpretation also of the other observables. The largest bin is the one with zero b-jets, because the production of b quarks is due to the collinear splitting of the incoming gluons, so that the transverse momentum of the jet that includes the b quark does not fulfil the jet definition. The number of events with 1 or 2 b jets depends on the transverse momentum distribution of the final state b andb at NLO-QCD. Higher-order corrections beyond NLO-QCD, simulated with a QCD-PS, yield a redistribution of the events. We observe that in MadGraph5 aMC@NLO there is a moderate stability of the 0-jet and 1-jet bins (changes do not exceed the ±5% level) and an increase of the 2-jets bin with respect to the fixed NLO prediction. The precise description of the effects in the first two bins and their overall stability depend on the details of the QCD-PS and PS phase space adopted. The increase of the third bin is due to a migration of events from the 1-jet to the 2-jets bin. Even if the absolute number of events that migrate is not large, the percentage effect is large, of O(+20%), because of the steeply-falling shape of the distribution. The hard recoil of the + − bb system, compared to the fixed-order prediction, in MadGraph5 aMC@NLO at intermediate transverse momentum values (see Figure 4) may explain the larger number of events with both b quarks passing the b-jet requirements.
In the POWHEG-BOX case we observe an increase of the 0-jet bin and a corresponding reduction of the rates with 1 and 2 b jets (the observables with a genuine NLO accuracy), independently of the value of the h scale in the damping parameter, if the default prescription for Q sh (i.e. the transverse momentum of the first, hardest emission) is used. Variations in the shower scale of the "remnant" events give instead effects comparable with those from shower-scale variations in MadGraph5 aMC@NLO , in the 2 b-jets bin. The latter is the most sensitive to changes in the treatment of the "remnant" events, characterised by a large transverse momentum of the first parton. Although the variation of Q sh will not be shown for the other observables presented in this section, the reader should keep in mind that the h variation in the POWHEG-BOX may give only a partial estimate of the theoretical uncertainties, and that other sources of uncertainty exist.
Events with 3 or 4 b jets are due to additional splittings via the QCD-PS and are affected by large parametric uncertainties.   In Figure 16  (above 100 GeV), there is a good shape agreement among the different matched predictions, while differences in normalisation reflect those the 1-jet bin in Figure 15. In Figure 17 we show the results obtained for the p where the lepton pair is detected in association with at least 2 b jets. Corrections with respect to fixed NLO span from +30% for MadGraph5 aMC@NLO+Pythia8 down to -20% for POWHEG-BOX+Pythia8 but they are rather flat in shape. This behaviour is associated to the positive impact of QCD-PS corrections in the value of the 2 b-jets multiplicity. In both Figures 16 and 17 we observed a fair compatibility of the predictions computed with the different options of matching scheme and PS, once the differences in normalisation are accounted for.  In Figure 18 we consider a final state with at least 2 b jets and study the invariant-mass distribution of the hardest b-jet pair. This observable shows a great sensitivity to the details of the matching, in particular on secondary g → bb splittings generated by the PS. At very low invariant masses we observe a large negative correction in all matching schemes, due to the definition of b jet and to the action of the QCD-PS: at fixed NLO the b jets contain, beside the b quarks, at most one additional parton and the jet mass is therefore rather close to the b-quark mass; the inclusion of additional partons via QCD-PS rapidly increases the total jet mass, with a consequent migration of events to the larger dijet-mass bins and a corresponding depletion of the first ones. At larger invariant masses, for m(b 1 b 2 ) > 50 GeV, we observe that the PS corrections obtained with MadGraph5 aMC@NLO are positive, with the predictions matched to Pythia8 reaching the +40% level when m(b 1 b 2 ) ∼ 500 GeV. The effect of matching to Herwig++ is milder and flatter with respect to fixed NLO, while the POWHEG-BOX predictions lie below it. In Figure 18 the differences between the various matching options are a consequence of the jet definition, because the largest fraction of the radiative effects due to collinear emissions is integrated in the jet cone. Looking at the uncertainty bands due to shower-scale (in MadGraph5 aMC@NLO) and h variations (in the POWHEG-BOX), we notice that the latter are visibly smaller than the former. A similar behaviour has been observed for the same observable in the context of the ttbb implementation in the POWHEG-BOX [72].

A.3 Invariant mass of the two hardest b jets
A.4 Invariant mass of the two hardest B hadrons with or without tagged b jets    In Figures 19, 20 and 21 we study a more exclusive observable with respect to the one of Figure 18, i.e. we consider the production of a pair of B hadrons and plot the invariant-mass distribution of the pair made by the two hardest B hadrons in the event,  in events characterised by the presence of at least 0, 1 or 2 b jets respectively. We do not require that the two hardest B hadrons belong to any of the tagged jets, nor we ask that they satisfy any cut in order to be detected.
In the case where no b jet is explicitly requested, shown in Figure 19, we observe that the MadGraph5 aMC@NLO+Pythia8 results are largely independent of the choice of the variable and of the interval used to extract Q sh , but that there is a strong sensitivity to the QCD-PS model, with differences between Pythia8 and Herwig++ at the 20% level. Curiously enough, MadGraph5 aMC@NLO+Herwig++ is rather similar to POWHEG-BOX+Pythia8. All matched predictions are considerably softer than those at fixed NLO, in which the B hadrons are replaced by the b quarks, because of the loss of energy due to the fragmentation of the latter into the former.
In the case with at least 1 b jet, shown in Figure 20, we observe in the Mad-Graph5 aMC@NLO results that the choice of the variable used to extract Q sh , namelyŝ vs H T /2, yields differences at the 10 − 20% level at large invariant masses. From the Mad-Graph5 aMC@NLO+Pythia8 histograms one can appreciate the fact that, once the matching scheme and the PS are fixed, the pattern of the predictions closely follows those of the shower-scale distribution shown in Figure 2, with the hardest prediction corresponding to the largest shower-scale. The differences between Pythia8 and Herwig++ are sizeable through the whole invariant mass spectrum, both in shape and in size of the corrections. We stress that these effects are due to terms beyond NLO-QCD in the perturbative expansion. As in the case without explicitly asking extra b jets, predictions with Mad-Graph5 aMC@NLO+Herwig++ are close to those with POWHEG-BOX+Pythia8.
Finally, in the case with at least 2 b jets, shown in Figure 21, we observe in the MadGraph5 aMC@NLO results that there is a good agreement between the different options of matching fixed-and all-orders results and between Pythia8 and Herwig++.
We also observe the large size of the radiative effects in the first two invariant mass bins, where the higher orders enhance the cross section up to a factor +120%, while at large invariant masses the corrections range from being negative (-20%) to being compatible with zero. This large correction is explained as due to the appearance via QCD-PS of events where both B hadrons belong to the same jet (because of secondary g → bb splittings) and turn out to be the hardest pair, but, at the same time, have a small invariant mass. For what concerns the POWHEG-BOX predictions, they fall below and are manifestly softer than fixed NLO for values of the invariant mass starting at 100 GeV, with a depletion of rate that can reach −50% at m(B 1 , B 2 ) ∼ 500 GeV.
A.5 y − φ distance of the two hardest b jets  We introduce the distance ∆R(ij) ≡ (y i − y j ) 2 + (φ i − φ j ) 2 between particles i and j, whose rapidity and azimuthal angle are denoted with y and φ, and show, in Figure 22, the distribution for the distance between the two hardest b jets. For this observable, matched predictions can display large differences. More in detail, while the POWHEG-BOX+Pythia8 prediction is rather close to the fixed-order one, with almost no visible shape distortion, the MadGraph5 aMC@NLO ones show sizeable deviations, particularly for ∆R(b 1 , b 2 ) > π. In this region, the MadGraph5 aMC@NLO+Pythia8 prediction with the largest shower scale can lead to rates which are larger than the fixed-order predictions by a factor 1.5-2. This is partially mitigated by the choice of smaller values for the shower scale ∼ H T or by matching with Herwig++. In fact, in these two cases, predictions show a rather similar behaviour: up to ∆R(b 1 , b 2 ) = π they lie quite close to the fixed-order one; starting from ∆R(b 1 , b 2 ) = π the rate is enhanced with respect to the fixed-order one, up to a factor 1.4 at ∆R(b 1 , b 2 ) = 4.5. Finally, for very large ∆R(b 1 , b 2 ), these matched predictions seem suppressed with respect to the fixed-order one, although the statistics for this kinematic region is quite poor.
A.6 y − φ distance of the two hardest B hadrons with or without tagged b jets We show in Figures 23, 24     between the two hardest B hadrons, in presence of an increasing number of b jets (at least 0, 1 and 2). As it has been the case for the corresponding invariant-mass distributions (Figures 19-21) we do not require that the two hardest B hadrons belong to any of the tagged jets, nor any condition for tagging the B hadrons is required. When no b jet is explicitly required (Figure 23)  viations, which remain well below 10% over all the range that we display. Conversely, the MadGraph5 aMC@NLO-matched predictions show quite large discrepancies: when Pythia8 is employed, the prediction is suppressed at small and large distances (∆R < 3 and ∆R > 6 − 7), while it is mildly enhanced (up to +10%) at intermediate distances.
While at small and moderate distances the behaviour of the Pythia8-matched predictions is only marginally dependent on the choice of shower scale, and the departure from the fixed-order prediction reaches at most 20% at very small distances, at large distances such a dependence is apparent, with larger shower scales leading to bigger suppressions, with the predictions suppressed by a factor two or more with respect to the fixed-order one. If instead MadGraph5 aMC@NLO+Herwig++ is employed, the behaviour is even more complicated, but overall the deviations with respect to the fixed-order predictions are smaller than with Pythia8: at very small distances the Herwig++-matched prediction lies below the fixed-order one, with a suppression of 20%. For distances in the range 1 < ∆R(B 1 , B 2 ) < 4, the matched prediction lies 5% above the fixed-order one, while in the range 4 < ∆R(B 1 , B 2 ) < 9 it is again below, with a suppression between 10% and 15%. Finally, at very large distances ∆R(B 1 , B 2 ) > 9, the matched prediction returns ∼ 20% above the fixed-order one. This enhancement of the Mad-Graph5 aMC@NLO+Herwig++ prediction has been also observed for bbH associated production [73] and for charged-Higgs production in association with a top quark [74]. Requiring at least one b jet partially mitigates these discrepancies: the POWHEG-BOX prediction is very similar to the MadGraph5 aMC@NLO+Herwig++ one, and both are also similar to the one obtained with MadGraph5 aMC@NLO+Pythia8 and with a shower scale ∼ H T , up to ∆R(B 1 , B 2 ) = 6 (for larger distances, the latter prediction predicts a suppressed rate with respect to the former ones). When these matched predictions are compared to the fixed-order one, the behaviour is not much different from the case without extra jets: predictions are suppressed (up to -20%) for small distances (∆R(B 1 , B 2 ) < 1), for intermediate distances (1 < ∆R(B 1 , B 2 ) < 4) they behave similarly to the fixed-order one, while at larger distances they are again suppressed (-30% for the POWHEG-BOX and MadGraph5 aMC@NLO+Herwig++ and up to -50% for MadGraph5 aMC@NLO+Pythia8 with Q sh ∼ H T ). Finally, the MadGraph5 aMC@NLO+Pythia8 with Q sh ∼ √ŝ follows the other matched predictions up to ∆R(B 1 , B 2 ) = 3.5. It then keeps growing with respect to the fixed-order one for about one unit of distances, where the enhancement with respect to the fixed-order prediction reaches +10%, finally for large distances it predicts suppressed rates with respect to the fixed order one, with the suppression reaching -40%. For this last prediction, a variation of Q sh by a factor two can have an effect as large as 10% for ∆R(B 1 , B 2 ) > 4, while for the other predictions the shower-scale dependence is much smaller. Finally, when two b jets are required, Figure 25, the ∆R(B 1 B 2 ) distributions closely follow the corresponding counterparts for the distance between the two b jets, shown in Figure 22.   In figures 26, 27 we show the transverse momentum of the hardest and the second hardest b jets. For the transverse momentum of the hardest jet, the general behaviour of matched computations is to be softer than the fixed-order prediction, and this effect is more pronounced for predictions matched with Pythia8 than for the ones matched with Herwig++. For small values of the transverse momentum, differences among the matched simulations are moderate (at the level of 10%) and reflect the pattern observed for the one-jet multiplicity displayed in Figure 15, while for larger values such differences are mitigated. For the second-hardest b jet, no visible distortions of the matched spectra with respect to the fixed-order one can be appreciated, and differences in rate reflect those of the two-jet bin in Figure 15.  In Figures 28, 29 we show the pseudo-rapidity distributions of the hardest and the second hardest b jets. As it has been the case for their transverse-momentum counterpart, differences in rate reflect the one-jet and two-jet bins of Figure 15. Besides these differences, it is worth to note that matched predictions have the general tendency to populate more the forward and backward regions with respect to the fixed-order ones. Such a tendency is more pronounced for the first jet than for the second, and when larger values of the shower scale (∼ŝ) are employed, in particular for Herwig++.  Figure 29: Pseudo-rapidity distribution of the second-hardest b jet.