Combining single and double parton scatterings in a parton shower

Double parton scattering (DPS) processes in which there is a perturbative \mbox{``$1\to2$''} splitting in both protons overlap with loop corrections to single parton scattering (SPS). Any fundamental theoretical treatment of DPS needs to address this double-counting issue. In this paper, we augment our Monte-Carlo simulation of DPS, \textsf{dShower}, to be able to generate kinematic distributions corresponding to the combination SPS+DPS without double counting. To achieve this, we formulate a fully-differential version of the subtraction scheme introduced in Diehl {\em et al.} (JHEP 06 (2017) 083). A shower is attached to the subtraction term, and this is combined with the \textsf{dShower} DPS shower along with the usual SPS shower. We perform a proof-of-concept study of this new algorithm in the context of $\mathrm{Z}^0\mathrm{Z}^0$ production. Once the subtraction term is included, we verify that the results do not depend strongly on the artificial ``DPS-SPS demarcation'' scale $\nu$. As part of the development of the new algorithm, we improve the kinematics of the $1\to2$ splitting in the DPS shower (and subtraction term), allowing the daughter partons to have a relative transverse momentum. Several reasonable choices for the transverse profile in the $1\to2$ splitting are studied. We find that many kinematic distributions are not strongly affected by the choice, although we do observe some differences in the region where the transverse momenta of both bosons are small.


Introduction
Double parton scattering (DPS) is where one has two separate hard parton-parton collisions in the same proton-proton collision, producing two sets of final states that we shall denote by A and B. In terms of the total cross section for the production of A+B, DPS is formally a power suppressed effect compared to the usual single parton scattering (SPS) mechanism [1][2][3]. However, DPS populates the final-state phase space in a different way to SPS, with the result that when making more-differential measurements, DPS can play an important role, and there are various regions of phase space where DPS contributes at the same level as SPS. One generic example is the region where the transverse momenta of both A and B are small [4,5], and for many processes (such as double J/Ψ production [6]), another is the region where A and B are widely separated in rapidities. For certain processes where the SPS mechanism is suppressed by small or multiple coupling constants, DPS can compete with SPS even at the level of the total cross section -a well known example is same-sign WW production [7,8]. The importance of DPS relative to SPS increases with collider energy (as lower momentum fractions are probed, where the population of partons is greater), such that DPS is more relevant at the Large Hadron Collider (LHC) than at any previous collider, and will be yet-more relevant at any future higher-energy protonproton collider. DPS can also be an important effect in proton-nucleus and nucleus-nucleus collisions, with certain contributions to DPS rising more quickly with the nucleon number A than SPS does [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24] (for a review, see [25]). Finally, DPS reveals information about the proton structure that is not accessible via any SPS process -namely, correlations between partons. For all of these reasons, the experimental measurement of DPS contributions to various processes at the LHC, and the ability to make corresponding theoretical predictions of these contributions, is of great interest and importance.
The simplest and crudest approach to make theoretical predictions for DPS is to assume that two partons entering a DPS process from a given proton are uncorrelated to one another. This leads to the "pocket formula", in which the DPS cross section for A + B is computed as the product of SPS cross sections for A and B, divided by a geometrical prefactor σ eff . Here, the kinematics of the final state A + B in DPS events is simply that obtained by overlaying SPS A and B events. The simulations of DPS (and more general multiple parton interactions, MPI) in general-purpose event generators such as Herwig [26][27][28][29][30][31][32][33][34], Pythia [35][36][37][38][39][40][41][42] and Sherpa [43][44][45] (in particular, the AMISIC++ model [46]) are fundamentally based on the pocket-formula picture. These Monte-Carlo simulations are key tools in experimental extractions of DPS, precisely because many such extractions rely on the different kinematic "shapes" of DPS (A, B) and SPS A + B events, and Monte-Carlo generators provide fully-differential predictions of these shapes (for both SPS and DPS). The number of kinematic distributions used to extract the DPS contribution in past analyses ranges from two in the ATLAS and CMS extractions of DPS in W + 2 jets [47,48], three or four in the ATLAS and CMS extractions in the four-jet process [49][50][51], to eleven in the recent CMS extraction in same-sign WW [52].
The pocket-formula picture of DPS cannot be the complete one, however, and over the past few years a complete theoretical framework for the description of DPS in Quantum Chromodynamics (QCD) has been developed [5,[53][54][55][56][57][58][59][60][61][62][63][64] (see [13,65,66] for reviews). One key aspect is that in QCD, the two partons entering the DPS process from a proton can have a common origin in a single parton splitting perturbatively into two (the "1 → 2 splitting") [5,56,57,67]. Treating this splitting appropriately requires a formalism in which the transverse separation between the partons y is taken into account. 1 Inclusion of the 1 → 2 splitting leads to potential double counting issues; most notably, the process in which one has a 1 → 2 splitting in both protons overlaps with a loop correction to SPS (see Figure 2). The DPS description is clearly more appropriate at large y = |y|, whilst the SPS one is appropriate at smaller y ∼ 1/Q h , with Q h the hard scale. A QCD framework that consistently incorporates the 1 → 2 splittings in DPS and overcomes the double counting issues was developed by M. Diehl, JRG and K. Schönwald [61], and will be referred to here as the DGS framework. The first core aspect of this framework is that the DPS cross section is written in terms of y-dependent double parton density functions (dPDFs), which are integrated over y down to a cut-off ∼ 1/ν. The parameter ν is an unphysical scale, taken to be of order Q h . The second core aspect of the framework is the inclusion of a "subtraction term" into the total cross section for the production of A + B (in addition to the DPS and SPS terms), which cancels the dependence on ν order-by-order in the strong coupling α s , as well as ensuring that the total cross section smoothly interpolates between the DPS description at large y and the SPS description at small y, as is intuitively appropriate.
Other effects also exist beyond the pocket-formula picture. The dPDFs should be "aware" of the constraints associated with the finite number of valence quarks in the proton (and the fact that its composition is uud) and the fact that the momentum of all partons should add up to the proton momentum. Formally this information is encoded in the number and momentum sum rules for the dPDFs [53,[68][69][70][71][72], which place non-trivial constraints on their structure. The MPI model in Pythia 8 in fact takes account of number and momentum sum-rule constraints in an approximate way, by ordering the interactions in scale and rescaling the PDFs following each hard interaction [37]. In addition to this, there can be non-perturbative correlations between the parton momentum fractions and y in the dPDFs, and correlations in spin, colour and flavour between partons [5,73] (for a review, see [74]). All of these types of effects can lead to differences in the DPS rate and/or DPS shapes (for examples of these, see [8,70,[75][76][77][78][79][80]), where effects on the DPS shapes are particularly important with regards to the correct experimental extraction of DPS contributions.
In light of this, there is a need for an improved approach to generate event-level DPS predictions that goes beyond the pocket formula and, ideally, is based on the full QCD framework of [61]. One possible approach involves reweighting events generated by an existing Monte-Carlo generator; this approach has been used to incorporate certain 1 → 2 splitting effects [81,82] and the effect of quark spin correlations [78,80]. In our work, we have chosen to take a different approach, building a whole new Monte-Carlo simulation of DPS from the ground up based on the DGS framework, which we believe to be advantageous in terms of flexibility, ease of use, and future development. We refer to this algorithm as dShower. In a previous work [79] we developed a parton-shower description of the DPS term, with proper account of the y dependence and 1 → 2 splitting effects, and a cut-off on the y integral ∼ 1/ν ∼ 1/Q h . That is, we recast the first core aspect of the DGS framework into a parton-shower description. The goal of the present work is to do the same for the second core aspect of the DGS framework, and develop a parton shower that can generate both DPS and SPS events without double counting. This requires a formulation of the DGS subtraction scheme at the fully-differential level, with an appropriate parton shower for all terms. In order to achieve this goal, we adapt techniques used in the matching of fixed nextto-leading-order (NLO) computations to the parton shower [83][84][85][86][87][88][89][90][91][92][93][94][95][96]. Also in that context, there is a potential double counting issue (for example, between the real-emission process in the NLO fixed-order process and the first emission in the shower), and a subtraction scheme is needed to remove this double counting.
The paper is organised as follows. In Section 2 we present a brief review of the DGS framework, along with an overview of the key features of the DPS shower that we developed in [79]. Section 3 describes in detail our implementation of the DGS subtraction scheme at the differential level in the parton shower. As part of this procedure, we alter one aspect of the DPS shower from its formulation in [79]; whereas previously the 1 → 2 splitting occurred with the two daughter partons having no transverse momentum relative to the parent, we now add the possibility for the daughter partons to have a relative transverse momentum k ⊥ ∼ 1/y drawn from a distribution g(k ⊥ , y). This is beneficial in terms of being able to construct a subtraction term that cancels both the DPS at small y and the SPS at large y at the differential level, and yields a more realistic DPS description at large y. We construct the algorithm in the context of on-shell vector-boson pair production [97][98][99][100] (in fact, up to the next-to-leading order [101,102]). Extension of this procedure to more complex processes is in principle straightforward.
In Section 4 we present numerical results from the algorithm in the context of on-shell Z 0 Z 0 production. Our purpose here is not to perform a full phenomenological study of Z 0 Z 0 production, but rather to study the behaviour and performance of the algorithm. Thus, in this proof-of-concept study we include only the O(α 2 s ) gg → Z 0 Z 0 loop-induced process in the SPS piece, and divide this contribution by 10 -this is to boost the relative importance of DPS and reduce the Monte-Carlo statistics needed to obtain distinguishable DPS effects. We perform the important validation check that the subtraction term cancels the ν dependence of the DPS term, and investigate the effect of various sensible choices for the profile g(k ⊥ , y) in the DPS term (with corresponding choices in the subtraction term). We also show that in several distributions we see a difference in the SPS+DPS results compared to the SPS results alone, in the context of this toy study.
Finally, in Section 5, we conclude and discuss potential future directions.

Review of the dShower algorithm
In this section, a review of the algorithm proposed in [79] is given. This algorithm is based on the QCD framework developed by M. Diehl, JRG and K. Schönwald [61] (DGS framework) whose main features are gathered in the following. This section also introduces the subtraction scheme presented in [61] that addresses the double-counting issue mentioned in the introduction.

The DGS framework
In a proton-proton collision happening at a centre-of-mass energy of √ s, the total cross section for the production of a final state A + B via a process involving two separate hard interactions ij → A and kl → B is given by the factorisation formula 2 [5,53,[60][61][62][63][64] σ DPS (A,B) (s) = (2.1) See Figure 1 for an illustration of a DPS process. Here,σ ij→A andσ kl→B are the partonlevel cross sections for the subprocesses ij → A and kl → B. The symmetry factor in front of the sum is equal to one half if A = B and to unity otherwise. The functions F ij (x 1 , x 2 , y, µ 2 ) are the y-dependent dPDFs; note that in this work we will only consider the case in which the two hard scatters are at equal scales, such that there is only one scale µ 2 in the dPDFs. A dPDF is proportional to the joint probability (or, more specifically, the number density) of finding two partons of flavours i and j within the same proton with longitudinal momentum fractions x 1 and x 2 when those partons participate in two different hard interactions characterised by the same scale Q h [5]. The evolution of the dPDFs with respect to the factorisation scale µ is described by the homogeneous double DGLAP equations [5,61]. It is customary to choose µ ∼ Q h . The impact parameter y gives the relative distance between the two partons.
For small values of y, the dominant behaviour of the dPDFs can be expressed in terms of the single PDFs (sPDFs) and a perturbative 1 → 2 splitting kernel. At leading order (LO) in the strong coupling α s , this perturbative splitting expression reads [5] F spl,pt This expression includes the effects of the 1 → 2 splitting mechanism presented in the introduction. More precisely, it takes into account the fact that the pair of partons ij can originate from the perturbative splitting of a parton k with longitudinal momentum fraction x 1 + x 2 . The flavour k is uniquely determined by the flavours i and j for LO QCD splittings. If there is no flavour k such that the branching k → i + j is allowed, because of colour or flavour considerations, then the perturbative splitting expression for the pair ij is equal to zero. This small-y expression involves the unregularised splitting kernel P k→i+j (z) (see e.g. [53]) and the sPDF f k of parton k, which gives the probability of probing such a flavour k at the scale µ.
In [61], the y-dependent dPDFs are modelled as the sum of an intrinsic part and a splitting part. The evolutions of both components as a function of µ are given by the (homogeneous) double DGLAP equations. For the intrinsic part, the initial condition for the evolution is a product of sPDFs multiplied by a phase-space factor and a Gaussian in y. The starting scale for the evolution is chosen to be µ 0 Λ QCD , where Λ QCD ∼ 1 GeV is the typical non-perturbative scale of QCD. In contrast, the input for the evolution of the splitting part of the dPDFs, is the perturbative splitting expression given in Equation (2.2) (multiplied by a Gaussian factor that suppresses this expression for y 1/Λ QCD ) . The input is then evolved starting from the scale µ y = b 0 /y * with y * = y/ 1 + y 2 /y 2 max , y max = 0.5 GeV −1 , b 0 = 2e −γ E 1.12 and γ E , the Euler-Mascheroni constant [61]. The scale µ y p p y Figure 1. Sketch of a DPS at a pp collider leading to the production of the final state A + B. The transverse distance y between the partons is represented.
Example of a process which can be seen either as a DPS or as an SPS. If the hard process is defined by the black box, then it is a DPS with the two subprocesses qq → A and qq → B. In the case where the hard process is defined by the green box, then one has the SPS gg → A + B. The pieces which are not included within the boxes are integrated out inside the PDFs.
is not simply 1/y to avoid the sPDF and the strong coupling present in Equation (2.2) being evaluated at a scale which is below Λ QCD when y → +∞. Instead, µ y → b 0 /y max 2.24 GeV, which is still in the perturbative regime. This construction for the dPDFs ensures that the dominant behaviour of the dPDFs at small y is given by the perturbative splitting expression written in Equation (2.2), as required.
The function Φ(yν) in Equation (2.1) is a cut-off at small y values. It regulates the divergence of the DPS cross section which appears when y → 0 (recall the 1/y 2 behaviour in Equation (2.2)). This power divergence is related to a double-counting issue between SPS and DPS, which is inherent to the 1 → 2 splitting mechanism. More specifically, a DPS process where 1 → 2 splittings occur in both protons (commonly referred to in the literature as a "1v1" DPS process) can also be considered as a loop correction to the SPS process. The latter description is actually the more appropriate one at small y where the entire loop process is contained in a small space-time volume. An illustration of this double-counting issue is given in Figure 2. In the following, the Heaviside function Θ(yν − b 0 ) will be used as a cut-off, as was also done in the numerical studies of [61].
Introducing the cut-off Φ(yν) simply regulates the DPS cross section: it does not solve the double-counting issue. There is double counting between the SPS and DPS contributions for all y > b 0 /ν, where the DPS (SPS) term gives a poor description for small (large) y values. The simple sum of SPS and DPS terms has a strong dependence on the unphysical parameter ν. These two related problems are cured by defining the total cross section for the production of a final state A + B as [61] σ tot where σ SPS A+B is the usual total cross section for the production of the final-state A + B via SPS given by the factorisation formula [103][104][105][106] as The subtraction term σ sub (A,B) is the integral over y of a quantity dσ sub (A,B) /d 2 y that is defined to satisfy dσ sub (A,B) /d 2 y dσ DPS (A,B) /d 2 y for y ∼ 1/ν and dσ sub (A,B) /d 2 y dσ SPS A+B /d 2 y for y 1/ν. When the two partons are well separated, the subtraction and SPS terms cancel and one is left with the DPS description which is valid in this region of the phase space. At small y, the subtraction and DPS terms cancel and leave the SPS term, which is the appropriate description in this region. Such a scheme removes the double counting and ensures a smooth transition between the SPS and DPS regimes. To achieve this objective in practice, the following form for the subtraction term σ sub (A,B) is taken 3 This term is nothing else but the DPS cross section given by Equation (2.1), but with the full dPDFs replaced by their small-y perturbative expressions written in Equation (2.2). Let us briefly sketch how this term satisfies the requirements. At small y ∼ 1/ν ∼ 1/Q h , the DPS cross section is dominated by the 1v1 term, and there is little room for evolution between µ y and Q h , such that dσ 1v1,pt (A,B) /d 2 y dσ DPS (A,B) /d 2 y and we recover the SPS term in this limit. SPS loop contributions are typically written as an integral over momenta rather than positions, but it is known that at large y the dominant contribution to the SPS loop term has the form of Equation (2.5) [5,57], such that dσ 1v1,pt (A,B) /d 2 y dσ SPS A+B /d 2 y and we recover the DPS term. We will only consider the unpolarised colour-singlet term in the DPS and subtraction cross sections here, for simplicity and because this is typically the dominant contribution to DPS. In this case, at large y, we only replace the unpolarised colour-singlet piece of the SPS loop by the DPS description, and all spin/colour/flavour interference/correlation contributions remain described by the SPS term.
Since the DPS and subtraction terms coincide in the vicinity of the cut-off y = b 0 /ν, up to higher order terms in α s , the leading dependence of the two terms on ν is the same, 3 Note that in [61], the subtraction term in fact comprises two terms: σ 1v1,pt (A,B) , which removes double counting between DPS and SPS, and σ 2v1,pt (A,B) , which removes double counting between DPS and the socalled "twist 2 × twist 4" mechanism. The twist 2 × twist 4 mechanism and σ 2v1,pt (A,B) do not contribute at the leading logarithmic level when we take ν ∼ Q h (as we do here), and we do not consider them in what follows. and cancels out. Using the change of variables u = yν, one can show that this leading behaviour is ∝ ν 2 : (2.6) In later sections, the implementation of this subtraction scheme within a parton-shower algorithm as well as a numerical example of this implementation will be presented. A key aspect of this implementation will be the cancellation of the ν dependence of the DPS and subtraction terms, as in Equation (2.3), albeit now at the differential level.

The dShower algorithm
The aim of the algorithm proposed in [79] is to simulate exclusive parton-level DPS events. The starting point is to select two hard scatters with their respective kinematics according to the DPS cross section introduced in Equation (2.1). A value for y is also sampled according to the cross section. After that, the two hard scatters are evolved simultaneously using a variant of the usual parton-shower algorithms. In particular, the evolution of the initialstate partons which are initiating the two hard scatters is guided by the y-dependent dPDFs presented in the previous section. More precisely, consider a pair of partons of flavours i and j belonging to the same proton with momentum fraction x 1 and x 2 and participating in two different hard interactions characterised by the same hard scale Q h . The probability that this pair remains resolved during a backward evolution from the scale Q 2 h down to a lower scale Q 2 and then appears as coming either from the pair i j or the pair ij is [79] (2.8) By iterating Equation (2.7), QCD emissions are attached to the incoming partons and their effects are consistently included. Once an emission has occurred at a scale Q emi < Q h , the evolution is carried on, but with starting scale Q emi instead of Q h . The algorithm stops when the evolution scale Q reaches Λ QCD . The algorithm described in [79] also includes the possibility that the two incoming partons inside the same proton may be resolved into a single parton. This phenomenon, referred to as "merging", aims to give a geometrical picture of the backward evolution of the system that is consistent with the 1 → 2 splitting mechanism. The merging procedure proceeds as follows. At the scale Q = µ y 1/y, the backward evolution gets frozen and the merging happens with a probability given by , (2.9) where F spl ij is the splitting part of the full dPDF F ij , which is obtained as explained in the previous section. In the case where the merging does not happen, then the evolution of the pair ij is carried on as before, but with the term corresponding to the 1 → 2 splitting mechanism removed from the expression of the full dPDF (i.e. the splitting part is omitted and only the intrinsic one remains). In the case where the merging happens, the two partons i and j are merged into a single parton k with momentum fraction x 1 +x 2 . The evolution of this single parton k from the scale µ y down to the non-perturbative scale Λ QCD is carried on using the conventional one-parton branching algorithm. For the whole procedure to work, one needs to have Q h 1/y. With our choice for the cut-off Φ(yν), this can be ensured at the cost of requiring that ν Q h . This is one of the limitations of the algorithm. In order to be able to include the case ν > Q h , one would need to combine forward and backward evolutions, which is beyond the scope of this work.
In the procedure introduced in [79], the merging of the two partons i and j happens at zero transverse momentum. More precisely, the four-momenta p i and p j of partons i and j after the merging occurred are aligned with the beam axis in the laboratory frame. It will be seen in a later section how one can modify the kinematics such that p i and p j get a non-vanishing transverse momentum during the merging procedure.

Implementation of the subtraction scheme
As mentioned previously, there is a potential double counting issue between DPS processes in which there is a 1 → 2 splitting in both protons (referred to as 1v1 events), and loop corrections to SPS. The subtraction scheme introduced by the DGS framework removes the double counting in the physical quantity -the cross section for the production of A + B via both DPS and SPS -via the master formula, Equation (2.3). This equation is written at the inclusive level. However, we require a subtraction scheme that can be implemented in a parton-shower framework where the DPS part is generated using the dShower algorithm, such that we can simulate full events for the combination of SPS and DPS without double counting. This subtraction scheme must be formulated at the fully-differential level, and its construction will be detailed below.
We note that more-differential formulations of the DGS framework do exist -in particular a formulation differential in the transverse momenta of the two produced systems A and B was obtained in [63]. The framework constructed in that paper can be used to resum logarithms of the transverse momenta p ⊥ over the hard scale Q h to, in principle, arbitrary accuracy. However, in this formulation, the DPS and subtraction terms have different y values in amplitude and conjugate (termed y + and y − ), and there are further terms associated with interference between DPS and SPS. These features are necessary in the full all-order framework with transverse-momentum dependence. However, such features do not appear to be amenable to a probabilistic parton-shower treatment (and some kind of amplitudelevel parton branching framework [107][108][109][110] would presumably be needed). In this work we take a simpler approach, neglecting DPS/SPS interference, having only a single value of y in the DPS and subtraction terms, and making the most "physically reasonable" choices of transverse-momentum profiles g(k ⊥ , y) in the 1 → 2 splitting (to be discussed shortly). Our treatment should be sufficient to achieve (at least) leading logarithmic accuracy for a broad set of observables, and represents the best we can achieve in the context of a probabilistic approach.
In Section 3.2, the subtraction term at the differential level will be constructed by combining the cross section σ sub (A,B) = σ 1v1,pt (A,B) with a shower algorithm. As suggested by Equation (2.5) itself, the subtraction term is "SPS-like" in terms of the shower (there is only one parton in each proton) so the shower algorithm will be the usual one-parton branching one. The kinematics of the subtraction term, which results from this combination, should match the SPS one for large y whereas it should coincide with the DPS one for small y. In order to best satisfy both requirements, and following the spirit of the DGS subtraction approach, we decide to assign to the subtraction term the same kinematics as the one generated by the dShower algorithm for a 1v1 event where no QCD emissions occurred before the merging phase, which is forced to happen at a scale ∼ Q h . Such DPS events are referred to as "1v1,pt" events in the following.
The cancellation between the subtraction term and DPS at small y occurs essentially by definition. In the implementation of the dShower algorithm, the DPS events corresponding to small y ∼ 1/ν ∼ 1/Q h are 1v1 events, where 1 → 2 splittings occur in both protons. These splittings occur very close in scale to Q h such that there is little room for emissions above the scale µ y ∼ ν ∼ Q h of the 1 → 2 splittings. At small y and for ν ∼ Q h , 1v1,pt events are indistinguishable from 1v1 events (up to small corrections), and thus the subtraction term matches the DPS one.
At large y values, the kinematics of the subtraction term needs to be equivalent to the SPS kinematics (to be more specific, the unpolarised colour-singlet contribution to SPS). In the following, Z 0 Z 0 production is used as an illustration. Here, for the SPS process, we will consider only the O(α 2 s ) loop-induced process initiated by a pair of gluons, see Figure 3, since this is the contribution that overlaps with DPS (i.e. has a large-y tail). It is also gauge invariant and well-defined on its own. The topology of the only graph in the loop-induced contribution that has a large-y tail is the one in Figure 3b, such that the topologies of SPS and 1v1,pt events match. The choice to start the shower with a forced double merging at scale ∼ Q h for all y in 1v1,pt events ensures that the shower starting scales match between the SPS and 1v1,pt (and thus subtraction) terms at large y. On the other hand, with the current version of the dShower algorithm, a reasonable kinematic match between the subtraction and SPS terms at large y cannot be achieved. The kinematics of the loop-induced process leads at LO to bosons that have a non-vanishing transverse momentum with respect to the beam axis, even without the shower. In contrast, the equivalent topology obtained with a DPS 1v1,pt event gives bosons which are produced along the beam axis at LO, since partons are merged with zero relative transverse momenta. In Section 3.1, an improved merging kinematics for the DPS (and subtraction) term will be proposed such that it follows more closely the SPS kinematics at large y. This will yield an improved description at large y overall -the cancellation between SPS and the subtraction Figure 3. Examples of graphs contributing to the loop-induced gg → Z 0 Z 0 process. The graph in (b) has the same topology as a 1v1,pt event.
term will be more complete, and the mergings in the remaining DPS term, which are then dressed by QCD emissions with dShower, will have more realistic kinematics.

Merging with non-vanishing transverse momentum
Before presenting the new kinematics which includes a non-vanishing transverse momentum, the old kinematics developed in [79] is reviewed in detail.

The old procedure
Consider a pair of hard scatters that was evolved from a hard scale Q h down to the scale Q = µ y with the double-parton branching algorithm presented earlier. At this resolution scale, the two incoming partons i and j inside the proton moving along the +z-axis in the laboratory frame have momentap i,j = ξ i,j ( √ s/2)(1; 0, 0, 1), where the momentum fractions ξ i,j will be referred to as the "pre-kick" momentum fractions in the following. The merging happens with a probability equal to F spl ij (ξ i , ξ j , y, µ 2 y )/F ij (ξ i , ξ j , y, µ 2 y ). Before implementing the merging, one needs to apply longitudinal boosts to these partons (and their daughters) in order to recover overall momentum conservation. Indeed, some parton emissions might have been added to the two hard scatters during their common evolution from Q h down to µ y . Adding these emissions breaks momentum conservation since some partons turn into virtual particles. In particular, the partons which are initiating the hard scatters are now space-like and have acquired a transverse momentum by recoiling against the emissions, whereas they used to be light-like and moving along the beam axis. The longitudinal boosts are determined by requiring the invariant mass and the rapidity of each hard scatter to remain as they were before the shower algorithm [30,79]. The longitudinal boosts have the following form The parameter λ is the exponential of the rapidity associated to the longitudinal boost. Therefore, a boost with λ 1 does not change the initial momenta too much. In practice, if the parton emissions that were added are hard, then λ may be larger than unity. After applying the boosts, the two partons i and j extracted from the proton have momenta in the laboratory frame. Since the old procedure does not add any transverse momentum to these latter momenta, the resulting parton after merging has a momentum given by momentum fractions x i = λ i ξ i and x j = λ j ξ j are usually different from the "pre-kick" ones ξ i and ξ j . This ensures that the emissions prior to the merging phase do not break momentum conservation.

The new procedure
With the new procedure, the two partons i and j involved in the merging are now allowed to have a non-vanishing transverse momentum k ⊥ . More precisely, before applying the boosts, the momenta in the laboratory frame are defined as with ϕ some azimuthal angle. The energies and longitudinal components of these two momenta are related to the pre-kick momentum fractions ξ i,j as follows We also define the virtualities of these momenta as These relations lead to One is left with three degrees of freedom: k ⊥ , Q 2 i and Q 2 j . Momentum conservation gives us one constraint. Indeed, when one sumsp i andp j , one would like to get a light-like momentum along the +z axis. This implies E i + E j = p z i + p z j which can be rewritten as Unfortunately, this is the only constraint. Let us now apply the longitudinal boosts that restore overall momentum conservation, as in the old procedure. The two boosted momenta p i and p j should now add up to a light-like momentum along the beam axis. Given that the two boosts are in general different (λ i = λ j ), this is possible only if E i,j = p z i,j . These two last constraints imply that With this prescription, the resulting parton after the merging has a light-like momentum moving along the +z-axis, as with the old procedure. Partons i and j now have a transverse momentum which will be propagated to the final states by recoil. For k ⊥ = 0, one recovers exactly the old kinematics.
The only remaining degree of freedom is thus k ⊥ . Naively, k ⊥ should be a function of three parameters: ξ i , ξ j and µ y . Intuitively, one also expects k ⊥ ∼ µ y . This is not enough to fix an expression for k ⊥ and several choices are thus possible. The choice that is made in this work will be presented shortly.
Let us now consider a 1v1,pt event i.e. there are no emissions before the double merging. With this new procedure, after the double merging and boosts, the partons inside the proton moving along the +z axis have four-momenta whereas the ones moving along the −z axis have momenta with k + ⊥ and k − ⊥ the transverse momenta generated during the merging procedure. In the case of Z 0 Z 0 production, the pre-kick momentum fractions are given by with M Z the Z 0 mass and Y 1,2 the rapidities of the bosons in the laboratory frame. According to momentum conservation, the Z 0 bosons now have momenta Both bosons thus get a transverse momentum given by p ⊥1,2 = ±p ⊥ , with p ⊥ = k + ⊥ + k − ⊥ . Therefore, the transverse momenta of the bosons produced in a 1v1,pt event are directly related to the choice of k ⊥ profile made. In such a 1v1,pt event, extra emissions may be attached to the merged system after the merging phase, thus modifying further the transverse-momentum distributions of the bosons. For the purposes of comparing 1v1,pt (i.e. subtraction) and SPS events, those additional emissions are actually not relevant because they lead to the exact same effects in both event types, and in the study in the next part of the section, we will neglect their effect. Since there are no prior emissions before the double merging, the λ coefficients can be analytically calculated. One finds that they are all equal to 1 + p 2 ⊥ /M 2 Z . The post-kick momentum fractions are thus and depend explicitly on p ⊥ . They lead to a squared invariant mass of the diboson system equal to (3.14)

Choice of the transverse profile
Whatever choice for k ⊥ is made, the kinematics of a 1v1,pt event obtained with this choice should match as closely as possible the SPS kinematics for large y values. In this work, rather than aiming for an exact match, we will adopt a simple choice for the transverse profile in the merging procedure, which should nevertheless reproduce the SPS kinematics at large y reasonably well. More specifically, values for k ⊥ will be sampled randomly according to the following distribution which is normalised as β is a free parameter of the model that controls the width of the distribution. In the following, β = 1 will be used but the impact of different choices for β will be discussed in a later section. The distribution is represented for a few values of y in Figure 4. One can see that the distribution peaks at k ⊥ = 1/( √ 2β y) µ y , as desired. It will now be shown how this choice leads to a reasonable match between the 1v1,pt events and the SPS events at large y in the case of Z 0 Z 0 production.
Loop diagrams are generally computed as integrals over internal momenta rather than positions, and no full result exists for the gg → Z 0 Z 0 loops differential in the transverse partonic separation y. However, the small-p ⊥ behaviour of the loop-induced process gg → Z 0 Z 0 is dominated by the contribution from the region of large y values [5,57]. Therefore, if the kinematics of a 1v1,pt event and the SPS one lead to the same behaviour at small p ⊥ , then one can state that the two kinematics match to a reasonable degree of accuracy in the large-y region (and thus, that the kinematics of the subtraction and SPS terms also match in the large y region). This can be checked by studying the p ⊥ distribution of the produced bosons. For the 1v1,pt events, p ⊥ is defined as the sum of the two vectors k + ⊥ and k − ⊥ , which are selected according to Equation (3.15). This quantity is thus distributed according to with the following normalisation In the SPS cross section, the y parameter is integrated over. One thus needs to do the same for the 1v1,pt events in order to be able to compare. The 1v1,pt cross section differential in p ⊥ is given by Equation (2.5), but with the profile h(p ⊥ , y) inserted into the y integral. Then the p ⊥ distribution of the bosons obtained for a 1v1,pt event can be estimated to be with Ei(x) the exponential integral function defined as (3.20) In the limit where p ⊥ ν, one gets This is, at least, not too far from the log 2 (p 2 ⊥ /ν 2 ) behaviour one obtains for the p ⊥ spectrum of the loop-induced process for small p ⊥ values [57,97,111]. This behaviour leads to a divergence when p ⊥ → 0 referred to as the "DPS singularity", since this one also originates from the double counting between SPS and DPS. This singularity is however integrable, meaning that integrating log 2 (p 2 ⊥ /ν 2 ) down to p ⊥ = 0 yields a finite result. In the case of a 1v1,pt event, the log(p 2 ⊥ /ν 2 ) behaviour obtained in Equation (3.21) leads also to an integrable singularity.

Subtraction scheme at the differential level
The new kinematics presented in the previous section was introduced so that the 1v1,pt events and the SPS kinematics lead to similar behaviours at large y values. The subtraction term will then correctly reproduce the DPS one at small y and approximately the SPS one at large y, both at the inclusive and differential levels. The objective now is to create a shower algorithm that can simulate event shapes for the combination SPS+DPS without double counting. The procedure which will be presented in the following uses ideas from matching [83][84][85][86][87][88][89][90][91][92][93] between NLO matrix elements and parton showers. Similarly to the MC@NLO method [93][94][95][96], we decide to split the cross section for the production of a final state A + B into two terms. More precisely, for any observable O, we write symbolically This formula is the differential version of Equation (2.3). The operators S 1 and S 2 encapsulate the effects of the one-parton and two-parton branching algorithms respectively. In other words, S 1 is the usual shower algorithm whereas S 2 is the dShower algorithm (including the merging procedure) recalled in Section 2.2. The quantities t 1 and t 2 are the starting scales of the shower algorithms. Usually, it is the type of shower algorithm that is implemented that determines which scale should be used. However, they should be related to the hard scales of the corresponding hard scatters. As explained in Section 2.2, one must impose t 2 ν. In order to achieve the best matching between DPS and subtraction terms at small y, one must take t 1 = t 2 , as will be discussed later. The two operators S 1 and S 2 are unitary, meaning that they cannot modify the value of the total cross section σ tot A+B , but only the event shapes. One thus has two types of events: SPS-like events (first term of Equation (3.22)) and DPS-like events (second term). For an SPS-like event, there is only one hard scatter and its kinematics is sampled according to σ SPS A+B − σ sub (A,B) . The event is then showered using the one-parton branching algorithm. The DPS-like events start from two hard scatters whose kinematics are selected according to σ DPS (A,B) . The dShower algorithm S 2 is then applied to this pair of hard scatters. The DPS-like events include all the contributions to DPS (1v1 contribution as well). Since y is not an observable, one needs to integrate over it in the second term of Equation (3.22). S 2 contains an implicit dependence on y due to the way the merging procedure is implemented, recall Section 2.2. Note that for each term in Equation (3.22), both the shower and cross section parts can contribute to the total value of O.
Let us now explain how Equation (3.22) is implemented from an algorithmic point of view. The first technical aspect is that the phase spaces for SPS-like and DPS-like events are different. More precisely, in the instance of diboson production via SPS, the kinematics of the diboson system can be parametrised by three non-trivial 4 variables Φ 1 = {Y 1 , Y 2 , p 2 ⊥ }, with Y 1 and Y 2 the rapidities of the two bosons and p 2 ⊥ the transverse momentum squared of the bosons with respect to the beam axis in the laboratory frame. All the relevant kinematic quantities can be derived from these three variables, as illustrated in Section 3.1. In the case of two hard scatters, the same rapidities Y 1 and Y 2 can be used to characterise the kinematics of the two bosons. At LO, the two bosons are produced with zero transverse momenta so there is no need for the variable p 2 ⊥ in the DPS case. The bosons get a non-vanishing transverse momentum afterwards via the shower algorithm S 2 . The phase space for DPS can thus be encapsulated in the variable Φ 2 = {Y 1 , Y 2 , y}, with y the impact parameter. Since Φ 1 = Φ 2 , one has to choose the event type before sampling the kinematics. This can be done with the following algorithm [38,93] 1. Select a random number R uniformly between 0 and 1. If one has R < M 1 /( then the event is an SPS-like one, otherwise it is a DPS-like one. 2. Select a phase-space point Φ i according to the distribution p i (Φ i ), i being equal to 1 or 2, depending on the event type previously determined. Calculate the corresponding quantity w i (Φ i ).
3. Accept the event with a probability given by w i (Φ i )/M i . In the case where the event is rejected, then go back to the first step. If the event is accepted then apply the corresponding shower algorithm S i .
Here, the event weight w i (Φ i ) is defined for i = 1, 2 as with σ 1 = σ SPS A+B − σ sub (A,B) and σ 2 = σ DPS (A,B) . The functions p i (Φ i ) are some positive-definite distributions normalised to unity which are used during the importance-sampling procedure to increase the efficiency of the Monte-Carlo method. The number M i is defined as the maximum value of the event weight w i (Φ i ) over the whole phase space parametrised by Φ i , thus ensuring that w i (Φ i )/M i < 1. On average, the events are generated with the correct weight σ tot A+B since where the right-hand-side of the equation is the sum of two terms: the first one (second one) is the product averaged over the corresponding phase space of the weight associated to an SPS-like (DPS-like) event with the probability to accept this event type. Also, on average, the relative probability to select the event type i is σ i /σ tot A+B , as desired.
The second technical aspect is linked to the fact that the implementation of Equation (3.22) implies the handling of events with negative weights, as in the MC@NLO procedure. Indeed, for some specific values of Φ 1 , it may happen that w 1 (Φ 1 ) < 0. The algorithm proposed above can be adapted to account for such cases by accepting the SPSlike events with a probability equal to |w 1 (Φ 1 )|/M 1 instead of simply w 1 (Φ 1 )/M 1 . In that case, M 1 must be defined as the maximum of |w 1 (Φ 1 )|. When constructing histograms, the SPS-like events with w 1 (Φ 1 ) < 0 contribute with a weight −1 whereas the ones with w 1 (Φ 1 ) > 0 and the DPS-like events are recorded with weight +1. Such a procedure ensures that the average weight of an SPS-like event is σ 1 . Indeed, one can write which is the product averaged over the phase space parametrised by Φ 1 of the weight associated to an SPS-like event in the histograms with the probability to accept an SPSlike event. This is similar to what is proposed in the MC@NLO implementation [93][94][95][96]. In order for the whole procedure to be working efficiently, the fraction of events with negative weights should not be too large, typically a few percent.

Analytical expression
Let us now understand how the subtraction term is coupled to the one-parton branching algorithm S 1 , as indicated by Equation (3.22). First of all, the algorithm that implements Equation (3.22) requires to be able to calculate dσ sub (A,B) /dΦ 1 . We recall that Φ 1 includes the variable p ⊥ , such that we need a suitable p ⊥ profile for this term. As mentioned in the beginning of this section, we choose to assign to the subtraction term the p ⊥ profile that is generated by the dShower algorithm for a 1v1,pt event (i.e. a 1v1 event with no QCD emissions before the merging phase). This latter profile was derived earlier in Section 3.1.3 for diboson production. One can thus insert the profile h(p ⊥ , y) given by Equation (3.17) inside the subtraction term as follows whereσ Z is the partonic cross section for the process qq → Z 0 . The c q coefficients are the couplings of the Z 0 with the incoming quarks q and only depend on the flavour of those quarks. The sum over q includes all the quark flavours which are allowed. The factor two in front of that sum accounts for the symmetry between the branchings g → qq and g →qq. There is some freedom in choosing which momentum fractions X should be used in the splitting kernels and in the gluon sPDFs f g : one could use either the pre-kick or the post-kick fractions defined in Section 3.1. The scale µ should be set to the hard scale appropriate to the process, although there are several potential choices. We will come back to this question shortly. Provided the scale µ does not depend on y, we can straightforwardly perform the y integral in Equation (3.27) analytically, yielding (3.28) This last expression is what is needed for the implementation of Equation (3.22). Indeed, the subtraction term is now written as an integral over Φ 1 . Inserting the p ⊥ profile does not change the dependence of the subtraction term on ν since which is the same dependence as in Equation (2.6). The p ⊥ profile of the subtraction term is represented in Figure 5 for two values of ν.

Choices of scales and momentum fractions
Let us now discuss the choice of scale µ in the subtraction term, as well as the momentum fractions X ± 1 and X ± 2 . We will also discuss the issues of the choice of renormalisation/factorisation scales in the SPS and DPS terms, which we shall refer to here as µ SPS and µ DPS respectively, and the choice of shower starting scales t 1 and t 2 in Equation (3.22).
Clearly, all renormalisation/factorisation scales should be set to be of the order of the hard scale Q h . But for the SPS, DPS (and subtraction) terms slightly different choices of hard scale may be optimal, even though formally the differences will be beyond the accuracy of the computation. Customary choices for µ SPS in the context of Z 0 Z 0 production are µ SPS = m ZZ [112][113][114][115][116], µ SPS = m ZZ /2 [117][118][119][120][121][122] and µ SPS = M Z [121,[123][124][125], with m ZZ the invariant mass of the diboson system given by Equation (3.14). By contrast, for Z 0 Z 0 production via DPS one would typically choose µ DPS = M Z . At large y, the SPS term should predominantly produce the bosons with p ⊥ ∼ 1/y M Z , such that at such y values one can drop p ⊥ in dynamic scales like m ZZ and write this as a function of M Z and the rapidities Y i alone. To achieve best matching between the subtraction and DPS at small y, and subtraction and SPS at large y, the optimal choice of µ in the subtraction term would then be a y-dependent choice that tends to µ DPS at small y, and to µ SPS (p ⊥ = 0) at large y (this in practice could be implemented via appropriate profile scales [61,126,127]). With this choice, one can straightforwardly follow the procedure above up to Equation (3.27) (since the scales are independent of p ⊥ ), but would no longer be able to perform the y integral analytically to obtain Equation (3.28).
An alternative possibility is to choose µ to either be µ SPS (or µ SPS (p ⊥ = 0)) or µ DPS . In this case the matching between the subtraction term and either DPS or SPS will be degraded at small y or large y, where the degradation in matching will be, in general, more observable at small y (since this is the leading-power SPS region). This would favour the choice µ = µ DPS in this case. Now let us discuss the choice of starting scales t i for the showers. We set the shower starting scales for the SPS and subtraction terms to be equal (= t 1 ), as written in Equation (3.22). The reason for this is that then we can treat these terms together as SPS-like events in the algorithm. This in turn minimises the number of events with negative weights -given that the SPS term is usually much larger that the subtraction term, one is ensured that the combination d(σ SPS ZZ − σ sub (Z,Z) )/dΦ 1 is positive-definite over a large region of the phase space parametrised by Φ 1 . As in the MC@NLO method, a minimal fraction of negative-weight events is desired because, for a given accuracy, the larger the fraction is, the higher the statistics needs to be. If one separates the scales of the SPS and subtraction terms, then one has to split the SPS-like events into pure SPS events and subtraction counter-events which contribute to the histograms with weight −1. This will increase the number of negative weights drastically.
It is in principle possible to choose the shower starting scale to be different from the renormalisation/factorisation scale in each term, although having such a mismatch between the cross section expression and shower is somewhat unnatural. If we want to match the shower starting scale with the renormalisation/factorisation scale, the constraint that the shower starting scales of the SPS and subtraction terms are equal implies that µ = µ SPS . If µ DPS = µ SPS , this choice is incompatible with µ = µ DPS .
In the Z 0 Z 0 production example we study here, we will simply set all renormalisation, factorisation and shower starting scales to M Z . In such a case, where we set µ DPS = µ SPS , we can achieve all desired properties above simultaneously. Now we discuss which momentum fractions X should be used in the expression of the subtraction term. To achieve the best match between the DPS and subtraction terms at small y, the pre-kick fractions ξ constitute a better choice than the post-kick fractions x. Indeed, the DPS cross section uses the pre-kick fractions given by Equation (3.11). Moreover, the post-kick fractions contain an explicit dependence on p 2 ⊥ , see Equation (3.13), which technically prevents us from inserting the integral over p 2 ⊥ in Equation (3.26).

Numerical checks
It will now be shown how the subtraction term performs numerically. The first step is to check that the kinematics of the subtraction term is indeed equal to that of a DPS 1v1,pt event. The kinematics corresponding to a 1v1,pt event can be simulated by combining the cross section σ 1v1,pt (Z,Z) defined by Equation (2.5) (with µ = M Z ) with the dShower algorithm S 2 . By definition, the shower evolution of a 1v1,pt event starts with a forced double merging at t 2 = M Z , in contrast with a usual 1v1 event where the merging phase happens at the scale µ y 1/y which is below t 2 . To highlight this technical difference, the shower algorithm used to shower the 1v1,pt events is denoted by S 2 . Since the evolution of a 1v1,pt event starts directly with the merging phase, there are no emissions before this phase, as mentioned before. Recall that at small y ∼ 1/ν ∼ 1/M Z the 1v1,pt DPS term coincides with the full one. The subtraction term in this comparison is simply the corresponding term in Equation (3.22) i.e. the cross section given by Equation (3.28) coupled with the shower algorithm S 1 , with t 1 = M Z . In the rest of this section, ν = M Z is used. The effects of a variation in ν are studied in Section 4.1. The only differences between the two terms are then the shower algorithm, the way the phase space is sampled (recall that Φ 1 = Φ 2 ) and the choices of scales and momentum fractions.
In the following figures, the two previously described terms S 1 (M Z ) ⊗ dσ sub (Z,Z) /dO and d 2 y S 2 (M Z ) ⊗ dσ 1v1,pt (Z,Z) /(dO d 2 y) are designated by "Sub" and "1v1,pt" respectively. The results for √ s = 13 TeV were obtained using the 3-flavour MSTW2008 set of LO sPDFs [128,129]  cross-section formulae and in the showers. We only include three flavours to avoid to have to deal with the different mass thresholds that would add further complications to the problem. The showers are angular ordered and stop when the evolution scale reaches the value of 2 GeV. No cuts are applied to the hard process qq → Z 0 ⊗ qq → Z 0 . We take M Z = 91.188 GeV. In Figures 6 and 7, the histograms of the transverse momenta of the Z 0 bosons and of the Z 0 Z 0 pair are given for several choices of momentum fractions X ( Figure 6) and scale µ (Figure 7). These two histograms give complementary pieces of information since the transverse momenta of the Z 0 bosons are mostly determined by the cross section whereas the transverse momentum of the Z 0 Z 0 pair is particularly sensitive to the shower activity. Indeed, the transverse momentum of the Z 0 Z 0 pair must balance that of all the extra parton emissions in order to achieve overall momentum conservation. In all the histograms, the error bars represent the statistical errors due to the use of Monte-Carlo techniques. As motivated above, the choice µ = M Z and X = ξ for both the PDFs and splitting kernels leads to the best match between the 1v1,pt and subtraction terms, at least for the presented distributions. With this choice, the subtraction term should reproduce the DPS one at small y, since this latter is equal to the 1v1,pt term in that region.
The second step is to check the large-y region. For y 1/ν, the subtraction term should match the unpolarised, colour-singlet part of the SPS loop-induced term. The subtraction term S 1 (M Z ) ⊗ dσ sub (Z,Z) /dO will now be compared to the loop-induced SPS cross section coupled to the S 1 (M Z ) algorithm. In the region y 1/ν, the choice of scale µ and fractions X does not matter as much as it does for y ∼ 1/ν because the p ⊥ values are here small and the different choices thus coincide. In this study, our focus will be on comparing the overall shapes of the two terms (particularly at small p ⊥ ν) rather than making precise numerical comparisons between the two -in any case the magnitudes of the two should not coincide even at low p ⊥ , as the full SPS loop-induced term contains additional colour, spin and flavour interference/correlation contributions, that are not contained in our subtraction term.
In this study, the loop-induced cross section was computed using the matrix-element generator OpenLoops 2 [131][132][133][134][135]. The factorisation scale and the argument of the strong coupling are set to M Z . In the OpenLoops 2 calculation one has all six quark flavours running inside the loop (with all quarks treated as massless except the top quark), instead of the three massless flavours in the calculation of the subtraction term. However, since we only aim at a rough shape comparison between the SPS and subtraction terms, this mismatch is not critical. We use the same 3-flavour α s in both the SPS and subtraction terms. In the SPS calculation, we use the default values for the Higgs and top masses, M H = 125 GeV and M t = 172 GeV.
In Figure 8, the subtraction term is compared to the SPS one. Here, the histograms are normalised to unity because we are mainly concerned with the shapes of the two different terms, as mentioned above. It can be seen that the p ⊥ spectra for the boson pair exactly match. This is because the p ⊥ spectrum of the Z 0 Z 0 pair is mainly controlled by the shower algorithm used and the two terms are showered with the exact same algorithm S 1 (M Z ). Nevertheless, the curves obtained for the p ⊥ spectrum of the Z 0 bosons do not coincide. This is due to the fact that the Z 0 p ⊥ is strongly determined by the cross section. The p ⊥ profile which was inserted in the expression of the subtraction term is the p ⊥ spectrum of a 1v1,pt event, and ensures an accurate subtraction in the region y ∼ 1/ν. However, this profile only approximates the p ⊥ spectrum of an SPS event and hence does not perfectly match the SPS cross section in the region y 1/ν. In particular, the small-p ⊥ behaviour obtained with the subtraction term is log(p 2 ⊥ /ν 2 ) instead of the log 2 (p 2 ⊥ /ν 2 ) that can be extracted from the SPS cross section, recall Section 3.1.3. It will be seen in a later section how one can modify the transverse profile used in the merging kinematics to improve the matching between the SPS and the subtraction terms in the large-y region.

Numerical results
In this last section, the results obtained from the numerical implementation of Equations (2.3) and (3.22) are presented for Z 0 Z 0 production via SPS and DPS at √ s = 13 TeV. The set of sPDFs, the running scheme for the strong coupling and the choices of scales and momentum fractions are identical to the ones mentioned in the previous section. In particular, the factorisation scales and the arguments of the couplings in all the cross sections as well as the starting scales of the showers are set to be equal to M Z . The cross sections are computed either analytically or with OpenLoops 2. As before, in this numerical study, we will only include the loop-induced process in the SPS piece, although in principle one can also add other SPS processes on top of the loop-induced one (such as the qq → Z 0 Z 0 Born process). For the DPS cross section written in Equation (2.1), the set of y-dependent dPDFs that is used is the 3-flavour DGS set originally developed in [61] and improved in [79]. The results are presented at parton level, meaning that there is no hadronisation phase. In each event, there are at most two different hard scatters.
In this study we choose to rescale the SPS cross section by a factor 1/10. This is to counteract the fact that the DPS cross section is power suppressed with respect to the SPS one [5]. Such a rescaling is of course not physical, but is helpful in this proof-of-concept study to distinguish the DPS process from the SPS one in the histograms and to enhance  the sensitivity to the ν variation. We recall here that the SPS term does not contain any dependence on the parameter ν and the cancellation of the dependence on this unphysical parameter only occurs between the subtraction term and the DPS one.

Validation
Let us start by studying the impact of the subtraction term. The histograms presented in Figure 9 were produced setting σ sub (Z,Z) = 0, whereas the ones in Figure 10 were obtained using all the terms present in Equation (3.22). As expected, removing the subtraction term induces a strong dependence on the scale ν in the event shapes. The same effect can be observed for the total cross sections, see Table 1.
In the case where the subtraction term is included, the fact that the event shapes are independent of ν (up to subleading terms in α s ) can be understood as follows. As we increase ν from an initial value of the order 5 of Q h , a positive contribution is added to the DPS term at small y ∼ 1/Q h . For the dominant 1v1 part of this, the double merging occurs very close to the two hard scatters. The additional 1v1 events hence develop a topology that is similar to the one usually associated to an SPS event. However, as we increase ν, the subtraction term gets a nearly identical additional contribution at small y. This means that the term that is subtracted from the SPS cross section is larger, recall Equations (3.22) and (3.28), which implies fewer actual SPS events. The two mechanisms are designed to cancel each other. In practice, a slight dependence on ν may appear for some observables, however. This can be due to the fact that only the leading contributions were included in the definition of the DPS and subtraction terms. Adding higher order corrections to both terms would reduce the residual ν dependence (a key result that is needed for this is obtained in [136]). In practice, the observables are even less sensitive to a ν variation than it appears in this proof-of-concept study because the DPS and subtraction terms are relatively small compared to the SPS one (recall the factor 1/10 applied to the SPS cross section). Let us briefly comment on the number of events with negative weights that are generated by our algorithm. The fraction of events that are accepted with a negative weight is rather small: 0.4% for ν = M Z and drops to 0% for ν = M Z /2 and ν = M Z /4. Therefore, these events do not affect the efficiency of the algorithm. The fraction of events would be even smaller if the SPS cross section were not rescaled.

Improving the matching at large y
In Figure 8a, it was observed that the shapes of the Z 0 p ⊥ spectra produced by the SPS and subtraction terms do not coincide, even at small p ⊥ Q h . This is due to a mismatch for large y values between the p ⊥ profile of the subtraction term and the SPS cross section.
argument presented here also works for ν > Q h .
It is actually possible to calculate the p ⊥ profile corresponding to the contribution to the SPS process which overlaps with DPS (i.e. the loop-induced process) in the large-y region. This was achieved in [5] and the p ⊥ profile of the unpolarised, colour-singlet contribution to SPS for large y values can be approximated to be The factor in front of the integral ensures that the profile is correctly normalised: The p ⊥ profile in Equation (4.1) contains ultraviolet divergences at y + = 0 and y − = 0, where y ± = y ± z/2. However, no such divergences exist in the actual SPS cross section. This is because the integrand in Equation (4.1) is only valid in the region in which |y ± | 1/Q h ∼ 1/ν, which is the region of the integral where a DPS description is most appropriate.
The region in which one of y ± goes to zero whilst the other stays finite is the region of the integral where an SPS/DPS interference description is most appropriate. The "DPS" region |y ± | 1/Q h ∼ 1/ν ultimately yields the leading behaviour of the SPS cross section ∝ log 2 (p 2 ⊥ /ν 2 ) (mentioned in Section 3.1.3 and [57,111]), whilst the "DPS/SPS interference" region yields a subleading behaviour ∝ log(p 2 ⊥ /ν 2 ). Here, we are predominantly interested in the leading low-p ⊥ behaviour associated with the DPS region. To extract this behaviour, we can simply insert ultraviolet regulators in Equation (4.1) to cut off the integrand when |y ± | ∼ 1/ν. In this work, we will regulate the ultraviolet divergences by adding a term b 2 0 /ν 2 to each denominator factor in Equation (4.1), yielding: Integrating this profile over y as in Equation (3.19), one obtains (4.4) The function K 0 (x) is one of the modified Bessel functions of the second kind and reads In the limit where p ⊥ ν, one gets which gives the leading log 2 (p 2 ⊥ /ν 2 ). Note that the regularisation in Equation (4.3) changes the normalisation of the profile. This can be rectified by replacing the factor y 4 by (y 2 + b 2 0 /ν 2 ) 2 in this same equation. This substitution then modifies the result obtained in Equation (4.4) but does not change the leading log 2 (p 2 ⊥ /ν 2 ) behaviour that is extracted from this result for small values of p ⊥ . Using another regularisation scheme has the same effect: it changes the subleading terms, but not the leading one. If the p ⊥ profile derived in Equation (4.4) is then used to construct the subtraction term then the p ⊥ spectra obtained from the subtraction and SPS terms coincide in the small-p ⊥ region, up to corrections going like log(p 2 ⊥ /ν 2 ) and terms which are not logarithmically enhanced. The problem here is that the kinematics of the subtraction term must also match the one of a DPS 1v1,pt event in the small-y region and it is cumbersome to design a transverse profile g(k ⊥ , y) for the merging kinematics whose convolution with itself leads to a p ⊥ profile as given by Equation (4.4) (recall Equation (3.17)). This is the reason why the transverse profile g(k ⊥ , y) was chosen to be Gaussian in this work, see Equation (3.15). Such a form leads to a resulting p ⊥ profile h(p ⊥ , y) that can be analytically calculated and at least has a reasonably similar behaviour, once integrated over y, as the one given by Equation (4.4) in the small-p ⊥ region.
In Figure 11, the approximated SPS p ⊥ profile given by Equation (4.4) is compared to the one given by Equation (3.19) for several values of β. This latter profile was obtained from a Gaussian distribution g(k ⊥ , y). It can be observed that the shape of the SPS profile is best reproduced for β = 2. This is confirmed in Figure 12 where the SPS term is compared to the subtraction term for several values of β. One observes in the plots that whatever the value of β is, the shape of the subtraction term does not match that of the SPS term at the lowest p ⊥ values. This is due to the fact that changing the parameter β cannot change the log(p 2 ⊥ /ν 2 ) behaviour obtained from the resulting p ⊥ profile for small p ⊥ values, which does not match the SPS log 2 (p 2 ⊥ /ν 2 ). In this sense the Gaussian ansatz is not ideal. One has to keep in mind, however, that in fact the transverse profile g(k ⊥ , y) of the 1 → 2 splitting does not play a role at the leading-logarithmic level in the transverse-momentum distributions of the Z 0 bosons, so these considerations are technically beyond our intended accuracy. The Gaussian ansatz implements in a simple way the physical intuition that the partons in the 1 → 2 splitting should be given a relative transverse momentum k ⊥ ∼ 1/y.
In Figures 13 and 14, the results obtained by combining all the contributions as described in Equation (3.22) are given for several values of β. One can notice in Figure 13 that in general the value of β does not affect too much the resulting kinematic distributions. In order to observe a discrepancy, one needs to study the small-p ⊥ region with extreme cuts on either the invariant mass or the transverse momentum of the Z 0 Z 0 pair, see Figure 14. The fact that the results do not depend strongly on the value of β is expected: the discrepancy between the different choices is not a leading-logarithmic effect.
One may wonder whether it is possible to improve the Gaussian ansatz -i.e. define a class of profiles g(k ⊥ , y) such that the resulting p ⊥ profile behaves as log 2 (p 2 ⊥ /ν 2 ) in the small-p ⊥ region. To achieve such a goal, let us revisit the equations of Section 3.1.3. We recall that the small-p ⊥ behaviour of the loop-induced SPS term is dominated by contributions from the region 1/ν |y ± | 1/p ⊥ (the logarithmic integrations for y ± are "cut off" at values of order 1/p ⊥ by the exponential factor in Equation (4.3)). In a similar way, the dominant small-p ⊥ behaviour of the subtraction term under the Gaussian ansatz arises from the region 1/ν y 1/p ⊥ -we have a logarithmic integration over y that extends Figure 11. Different p ⊥ profiles for the subtraction term for ν = M Z . The profile given by Equation (3.19) which corresponds to a Gaussian distribution g(k ⊥ , y) is represented for three values of β (red, blue and green curves). The approximated "true" profile (black) and the fitted profile (magenta) are given respectively by Equation (4.4) and Equation (4.10). The fitted profile corresponds to a decreasing Gaussian distribution g(k ⊥ , y), as given by Equation (4.9). The area under each curve is equal to ν 2 /(2b 2 0 ).  Figure 12. Transverse momenta of the Z 0 bosons as produced by the SPS and subtraction terms. The subtraction term corresponding to a Gaussian distribution g(k ⊥ , y) is given for several values of β. The fitted profile corresponds to a decreasing Gaussian distribution g(k ⊥ , y). The SPS setup is the reference in the ratio plot. The histograms are normalised to unity. between y ∼ 1/ν (where it is cut off by the factor Φ) and y ∼ 1/p ⊥ (where it is cut off by the Gaussian factor), recall Equation (3.19). For the purposes of computing the leading low-p ⊥ behaviour, one can replace the Gaussian factor in Equation (3.17) by a simple cut-off imposing yp ⊥ < 1, yielding for the p ⊥ distribution:

SPS
This agrees with Equation (3.21) at the leading-logarithmic level. This insight allows us to design an h(p ⊥ , y) that yields a double logarithmic behaviour in the small-p ⊥ limit. We need an expression which is strongly suppressed for yp ⊥ > 1, as for the Gaussian ansatz, but which is proportional to −y 2 log(yp ⊥ ) in the limit yp ⊥ 1 rather than y 2 . Then, the leading low-p ⊥ behaviour will be proportional to (recall Equation (4.7)) Such a profile h(p ⊥ , y) can be obtained for example from the following form for g(k ⊥ , y): The width of the Gaussian in this expression has been chosen such that when this profile is used to construct the subtraction term, the coefficient of the log 2 (p 2 ⊥ /ν 2 ) term in the p ⊥ distribution is the same as the corresponding coefficient in Equation (4.6). Unfortunately, we were not able to obtain the p ⊥ profile of the subtraction term corresponding to Equation (4.9) analytically. However, one can perform a fit of this profile, using the following functional form: In Figure 11, the fit of the p ⊥ profile is compared to the approximated SPS profile given by Equation (4.4) and to profiles corresponding to a Gaussian g(k ⊥ , y). One can see that this fitted profile more closely approximates the shape of the SPS profile than the other ones for small values of p ⊥ . This is due to the fact that the two profiles have the same double-logarithmic behaviour in the small-p ⊥ region.
Using an approximation of the p ⊥ profile instead of the exact expression does mean that the matching between 1v1,pt events and the subtraction term is to some extent degraded. In Figure 15, the subtraction term corresponding to the fitted profile given in Equation (4.10) is compared to the 1v1,pt DPS term, as defined in Section 3.3. As a reminder, the transverse momenta k ⊥ of the merging partons in a 1v1,pt event are selected according to g(k ⊥ , y), which is here the "decreasing Gaussian" given by Equation (4.9). In this figure, it can be observed that the two terms start to disagree at large p ⊥ values. This is in contrast with the case where g(k ⊥ , y) is a bare Gaussian, where the p ⊥ profile of the subtraction term can be   Figure 15. Transverse momenta of the Z 0 bosons as produced by the 1v1,pt and subtraction terms for a decreasing Gaussian form of g(k ⊥ , y). The 1v1,pt setup is the reference in the ratio plot and is defined as in Section 3.3. The histograms are not normalised to unity.
analytically calculated. Indeed, it was noticed in Figure 6 that the 1v1,pt and subtraction terms overlap perfectly in this instance.
The mismatch at large Z 0 p ⊥ leads to an imperfect subtraction between the DPS and subtraction terms at small y and large p ⊥ . However, one notes that when all contributions are combined, the use of a fitted p ⊥ profile instead of an analytical result does not have a strong impact on the kinematic distributions, including the Z 0 p ⊥ -see Figure 13, where the fitted-profile result agrees well with the Gaussian-ansatz results, even at large p ⊥ . This is because the subtraction term for the decreasing Gaussian ansatz falls more steeply than the SPS term, such that it is much smaller than SPS at large p ⊥ -see Figure 12. Since the large-p ⊥ region is dominated by contributions from the small-y region, the DPS term should also be much smaller than the SPS term at large p ⊥ . The mis-cancellation seen in Figure 15 is then numerically unimportant in the combination.
Both the Gaussian ansatz (with adjustable β) and the decreasing Gaussian ansatz (using the fitted profile of Equation (4.10) in the subtraction term) are available as options in the code.

Distinguishing DPS from SPS
As previously mentioned, we do not aim here at a full phenomenological analysis of DPS in the Z 0 Z 0 production process. However, even in the context of our toy set-up where we only have the loop-induced process in the SPS piece, and this is multiplied by 1/10, it is interesting to investigate in what kinematic regions we can observe the largest impact from the DPS process.  We recall from Section 2.1 that the DPS cross section is generically not well-defined on its own, since it depends on the unphysical parameter ν, and that the well-defined combination is the total cross section SPS+DPS-sub. How can we then define a separation of SPS and DPS? Note that, from a theoretical point of view, the SPS cross section for pp → Z 0 Z 0 is perfectly defined on its own. Therefore, we can compare the signal produced by the SPS process on its own to the one obtained when combining SPS and DPS. Any discrepancy between the two we attribute to DPS. In this way we effectively define the quantity "DPS-sub" to be the DPS contribution, putting the large-y parts of 1v1 loops that are not already described by the SPS term into the DPS contribution.
In Figures 16, 17 and 18, some event shapes are given. The setups of the simulations are the same as before. More precisely, the label "SPS+DPS" refers to the results obtained using Equation (3.22) for ν = M Z and β = 2 i.e. by combining SPS and DPS. The "SPS" curves were again produced with the loop-induced process only, with the cross section multiplied by a factor 1/10. The comparison shows that the inclusion of DPS leads to more events in the regions of small transverse momenta and small invariant masses. It is natural that DPS should be concentrated in this region since, at LO, the bosons are produced with zero transverse momenta in the DPS process. Combining the DPS process with the SPS one should then add to the SPS cross section a contribution that is peaked at zero transverse momentum and at an invariant mass of 2M Z , recall Equation (3.14). This leads us to propose an upper cut on either the transverse momenta of the bosons (or of the pair) or the invariant mass of the pair as a useful cut to distinguish DPS from SPS. Moreover, the results presented in Figure 17 seem to advocate an upper cut on the difference in azimuthal angles ∆ϕ ZZ and a lower cut on the absolute value of the difference in pseudorapidities ∆η ZZ of the bosons as discriminating cuts. For instance, in Figure 18b, the p ⊥ spectrum of the pair was produced for both setups by only accepting the events that satisfy ∆ϕ ZZ < 2. This seems to enhance the discrepancy between the two setups, especially in the region of small transverse momenta which is the region where the DPS contribution is expected to be important. Removing the factor of 1/10 in the SPS piece will reduce the differences that can be observed between the SPS and SPS+DPS curves. Including the other contributions to the SPS process may affect the event shapes observed for Z 0 Z 0 production, which may lead to different discriminating cuts being appropriate. However, this is probably not the case since our reasoning uses rather general distinguishing characteristics of the DPS signal. Moreover, the proposed cuts are used in many phenomenological and experimental analyses to distinguish the DPS signal from the background SPS signal. For instance, similar cuts were already proposed in the context of a phenomenological study of Z 0 Z 0 production in [137] and for the CMS extraction of DPS in same-sign WW production, where discriminating variables of the kind we discussed were used to train boosted decision trees [52]. For an extensive review of experimental extractions of DPS, where in many places such variables are used to discriminate DPS and SPS, see Chapters 6-8 of [138].

Summary
In this work, the Monte-Carlo simulation of DPS dShower introduced in [79] has been augmented such that SPS and DPS processes can be combined in a consistent manner for the first time. This is a non-trivial task; simply adding up SPS and DPS leads to a doublecounting issue both at the inclusive and differential levels. At the inclusive level, the problem of combining DPS and SPS without double counting was solved in [61], via the inclusion of a subtraction term. The objective of this work was to extend this subtraction scheme to the differential level in such a way that it can be implemented within a probabilistic parton-shower algorithm.
This required several steps. First of all, the kinematics of the 1 → 2 splittings was modified such that a relative transverse momentum k ⊥ ∼ 1/y was generated between the daughter partons (with y the partonic transverse separation). In the original dShower algorithm [79], the daughter partons were produced with zero relative k ⊥ . This new kinematics is more realistic, and ensures that the kinematics of "1v1,pt" DPS events (in which 1 → 2 splittings occur in both protons and there are no QCD emissions above the characteristic scale of the 1 → 2 splittings) mimic more closely at large y the kinematics of an SPS event, whose topology at such y values is equivalent to the 1v1,pt one (see Figure 3).
Then, a subtraction term was introduced, whose kinematics was chosen to be the one generated by the shower algorithm for a 1v1,pt DPS event. With such a choice (and with the modification to the DPS algorithm just described), the kinematics of the subtraction term matches the DPS one at small y by definition, and approximately matches the SPS one at large y, thus extending the subtraction scheme at the differential level. Finally, each term was combined with a shower algorithm, such that event shapes corresponding to the production of a given final state via both SPS and DPS could be simulated without double counting. The overall design of the subtraction scheme in the shower is to a certain extent similar to techniques used in the matching of NLO computations to the parton shower [83][84][85][86][87][88][89][90][91][92][93].
This subtraction scheme was implemented in the new version of the dShower simulation, thus allowing the combination of SPS and DPS. The implementation was numerically validated at parton level in the context of Z 0 Z 0 production. In our proof-of-concept study, the SPS term was the loop-induced process initiated by a pair of gluons since it is the only contribution that overlaps with the DPS process and that has a large-y tail. This SPS term was divided by 10, to boost the visibility of the DPS contribution and reduce the required statistics. We studied the dependence of the algorithm on the quantity ν, an unphysical parameter that effectively demarcates SPS and DPS. Once the subtraction term is included, the results show a rather small dependence of the cross section and event shapes on this scale, as should be the case. We also investigated several different sensible choices for the k ⊥ profile g(k ⊥ , y) in the 1 → 2 splitting process and subtraction term, including an "optimal" choice for which the behaviour of the subtraction term matches that of the SPS loop-induced term at small p ⊥ . For many distributions, almost no difference was observed between the different choices, with a small difference being observed in the region of phase space where the transverse momenta of both bosons are small. The implementation of this subtraction scheme generates some counter-events that contribute to the histograms with a negative weight (as is also encountered in NLO+shower matching schemes such as MC@NLO [93][94][95][96]). However, it was shown that it is possible to limit the fraction of events with negative weights to a few percent if one couples the SPS cross section and the subtraction term to the exact same shower algorithm.
Using the toy set-up described above, we also studied in what kinematic regions the inclusion of DPS has an observable impact. Our results indicate that upper cuts on p Z ⊥ , p ZZ ⊥ , m ZZ and ∆ϕ ZZ as well as a lower cut on |∆η ZZ | will lead to an enhanced DPS contribution. This is consistent with previous experimental and phenomenological studies of DPS.
In the future, it would be interesting to use this algorithm to make a proper phenomenological analysis of Z 0 Z 0 production and other processes of interest such as W + W − production. For such studies it would be desirable to include at least the Born SPS process in addition to the loop-induced one, massive quark flavours, decays of the bosons, and hadronisation of the low-scale partons. It would also be interesting to study the effects of different sets of sPDFs and dPDFs in the simulation, or to adapt the algorithm such that it can handle unequal-scale dPDFs. The new PDF interpolation library ChiliPDF [139] could help to achieve such goals. The first aspect would help to assess the uncertainties related to the PDFs, whereas the second one would be relevant for DPS processes that involve hard scatters characterised by two different scales such as four-jet or W+2 jet production.