A Monte-Carlo Simulation of Double Parton Scattering

In this work, a new Monte-Carlo simulation of double parton scattering (DPS) at parton level is presented. The simulation is based on the QCD framework developed recently by M. Diehl, J. R. Gaunt and K. Sch\"{o}nwald. With this framework, the dynamics of the $1\to2$ perturbative splittings is consistently included inside the simulation, with the impact-parameter dependence taken into account. The simulation evolves simultaneously two hard systems from a common hard scale down to the hadronic scale. The evolution is performed using an angular-ordered parton shower which is combined with a set of double parton distributions that depend explicitly on the inter-parton distance. An illustrative study is performed in the context of same-sign WW production at the LHC, with the quark content of the proton being limited to three flavours. In several distributions we see differences compared to DPS models in Herwig, Pythia, and the DPS"pocket formula".


Introduction
In high-energy proton-proton collisions such as the ones that occur at the Large Hadron Collider (LHC), the underlying event can be an important background to a variety of signals. Therefore, in order to accurately describe experimental data, it is necessary to develop a simulation of the underlying event [1,2]. Current event generators such as Herwig [3][4][5], Pythia [6,7] and Sherpa [8][9][10], model the underlying event with a good agreement with experimental data. However, these models can be improved in order to reach an even higher accuracy, which is required for the next generation of proton-proton colliders and for beyond-the-standard-model searches.
At parton level, the underlying event is generated by two major components: multiple parton interactions (MPI) and parton showers. Most of the parton showers currently implemented are at a leading-logarithm (LL) accuracy, which means that the leading logarithms are fully resummed within a so-called Sudakov form factor. In fact, they include also some next-to-leading-logarithm (NLL) effects, mostly via colour coherence, four-momentum conservation, running of the strong coupling, etc... [2] In order to improve the simulation of the underlying event, one can try to implement new parton showers that would include more aspects such as spin correlations, subleading-colour effects, new choices of kinematics, next-to-leading-order (NLO) splitting kernels, transverse-momentum-dependent parton distributions (TMD), amplitude-level evolutions, etc... Many efforts have been made in those directions recently [11][12][13][14][15][16][17][18][19][20][21][22][23][24].
Another way of improving the simulation of the underlying event is to develop new models of MPI. At cross-section level, a scattering with n parton-parton interactions (nPS) is usually suppressed compared to a scattering with a single parton-parton interaction (SPS). More precisely, the ratio between the total cross sections scales as [25,26] with Q h the energy scale at which the proton is probed and Λ ∼ 1 GeV, the characteristic scale of the proton. One can see that for high energy processes where Q h Λ, the nPS total cross section is strongly suppressed compared to the SPS one. However, for scales Q h of the order of Λ, nPS is unsuppressed. In fact, most of the underlying event is composed of these so-called "soft" MPI. Moreover, nPS constitutes a systematic background of SPS signals. For high energy scales such as at the LHC, one can expect the ratio nPS/SPS to be enhanced. Indeed, the protons are probed at lower momentum fractions, where the population of partons is greater [25]. Therefore, MPI models must be included inside the event generators in order to accurately describe experimental data.
An mPDF is a complicated object which takes into account all the correlations between the n partons belonging to the same proton. Those correlations originate for example from kinematic constraints, quantum-number conservation rules (i.e. the sum rules) or from dynamical effects such as the 1 → 2 parton splittings that occur inside the proton [28]. The mPDFs are non-perturbative quantities, and their calculation for arbitrary n is far beyond the current state of the art (for progress in the case n = 2, see [29]). Theoretically, they can be extracted from the "light-cone" wavefunction of the proton [30][31][32][33], but this wavefunction contains Fock states with an arbitrary number of particles, which makes its modelling presently impossible. Furthermore, unlike for the single parton distribution functions (sPDFs), one cannot constrain the mPDFs using current experimental data.
The usual event generators thus adopt the following strategy to model MPI [34][35][36][37][38]. First, a hard process and its kinematics are selected, without taking into account the possible presence of secondary parton-parton interactions. After this, secondary partonparton interactions are added. Each system is then evolved by using a parton shower. In the current event generators, the mPDFs are typically written as a product of sPDFs. This ansatz is then modified in order to take into account the kinematic constraints and the quantum-number conservation rules [37,38].
This ansatz leads to a reasonably good description of MPI. However, it fails to take into account some parton-parton correlations which are not necessarily negligible. For example, models based on the light-cone wavefunction of the proton with only three quarks showed strong correlations at large momentum fractions, especially in spin and in colour [31,32], and these correlations should be implemented. Also, the dynamical correlations due to the 1 → 2 splittings should be included as well. These dynamical correlations are important [28,39,40], especially for small momentum fractions and large energy scales, where the other correlations tend in general to be reduced [41,42]. Therefore, there is a need for a realistic set of mPDFs. This set must be based on theoretical works, since very little experimental data have been gathered so far.
Unfortunately, no set of mPDFs is available for the moment. However, the case n = 2, referred to as double parton scattering (DPS), has been widely studied [26][27][28][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57]. Despite the lack of experimental data, several groups managed to produce sets of double parton distribution functions (dPDFs) that represent an improvement over the simple product ansatz. These sets were produced by solving the double DGLAP equations (dDGLAP), an extension of the usual DGLAP equations. The first analytical solutions of the dDGLAP equations were derived in [43][44][45], but those solutions cannot be used as such for phenomenological studies where numerical solutions are needed instead. Fortunately, the first numerical set of leading-order (LO) dPDFs was generated by JG and W. J. Stirling and is referred to as GS09 [28]. With this set, the contributions to the dPDFs from the 1 → 2 splittings are included, although without any dependence on the inter-parton distance. This set showed some differences with the usual ansatz made for dPDFs, especially considering the longitudinal correlations between partons [28]. Later, significant progress was made in the theoretical description of DPS, in particular with regards to incorporating the 1 → 2 splittings in a consistent way, with the impact-parameter dependence fully taken into account. This new approach to DPS led to new QCD frameworks which were developed by four different groups, namely B. Blok et. al. [47,51], M. G. Ryskin and A. M. Snigirev [46,49], A. V. Manohar and W. J. Waalewijn [50] and M. Diehl, JG and K. Schönwald [54]. As mentioned above, the total cross section for DPS is suppressed compared to the SPS one for high energy scales. However, for differential cross sections, the situation can be rather different. For example, consider the final state A + B characterised by the scale Q h . This final state can be obtained with the SPS pp → A + B or with the DPS composed of the two subprocesses pp → A and pp → B. Let 1 q A and q B be the transverse momenta of the final states A and B with respect to the beam axis in the centre-of-mass frame of the pp system. For small momenta i.e. |q A | ∼ |q B | ∼ Λ for DPS, and |q A + q B | ∼ Λ for SPS, it can be shown that [26,27] dσ DPS (A,B) (1.2) Thus, at differential level, DPS and SPS contribute with the same strength in some regions 2 of phase space. For some processes, SPS can even be suppressed by a higher multiplicity of couplings. An example of such a process is same-sign WW production (i.e. W + W + or W − W − ) represented in Figure 1. Both diagrams include vertices whose strength is given by the electroweak coupling α w . If one ignores the decays of the W bosons, it can be seen that the cross section for WW production via DPS scales as O(α 2 w ), whereas the one for WW production via SPS is of order 3 O(α 4 w ). Since in the perturbative regime (which is our framework) the couplings are relatively small, it turns out that the SPS and DPS cross sections are comparable in the instance of same-sign WW production. For these reasons, DPS cannot be neglected and needs to be accurately modelled. The DPS contribution to this process has been extensively studied from the theoretical side [58][59][60][61][62][63][64][65] and the first experimental evidence for DPS-initiated same-sign WW has recently been obtained by the CMS collaboration [66].
The aim of this paper is thus to present a simulation of DPS which is based on the QCD framework developed in [54], referred to later as the DGS framework. This framework has many advantages which makes it well-suited to a Monte-Carlo implementation. For example, it involves dPDFs which are defined for an individual hadron and which have a probabilistic interpretation. The parton-level simulation which is proposed in this work consists in a phase-space generator and an angular-ordered parton shower. First, two hard systems are generated using the full DPS cross section. Thereafter, the two systems are showered simultaneously in order to give a set of final-state partons. In both steps of the simulation, the dynamics of the 1 → 2 splittings as well as the impact-parameter dependence are fully taken into account, with dPDFs depending explicitly on this parameter.
It is not the first time that the 1 → 2 splittings are included within a simulation of the underlying event. Indeed, B. Blok and P. Gunnellini developed an approach to include 1 → 2 splitting effects inside the MPI machinery of Pythia 8 [67,68]. However, in this approach, the idea is to reweight the effective cross section 4 of the DPS process on an event-by-event basis with a factor that takes into account the 1 → 2 splitting effects. A similar strategy was used in [65] to study the impact of the spin correlations between partons in same-sign WW production, where here the effects of 1 → 2 splittings were not considered. The approach of [67,68] utilises a reweighting factor for the effective cross section that was calculated for a particular set of dPDFs, together with some appropriate choice of scale for the chosen processes. These dPDFs were calculated relying on particular factorisation assumptions. It is not clear how straightforward it would be for a general end-user to adapt this approach to incorporate different dPDFs and use it for arbitrary processes and kinematics. In the simulation presented in this work, the approach is process independent and does not rely on a particular set of dPDFs, which makes it more flexible and user-friendly. A first-principles implementation is also convenient for future developments, such as a consistent combination of SPS and DPS processes. Note that this simulation constitutes the first implementation of the full DGS framework; neither the work from [67,68], nor the one from [65] implements all the features of the framework. It is also the first simulation which gives a geometrical picture of the evolution which is consistent with the 1 → 2 splitting mechanism. 5 The first objective of this simulation is to allow a phenomenological study of any set of dPDFs that suits the DGS framework as well as the sum rules that those dPDFs must satisfy [28]. It is important to emphasise here that the simulation is not bound to a specific set of dPDFs. Other dPDF sets can be plugged into this simulation without having to modify the whole approach, as long as they follow the prescription given in [54]. The second objective is to test some features related to the overall DGS framework, such as the dynamics of the 1 → 2 splittings or the impact-parameter dependence. The new aspects of this simulation might improve the current MPI models in the future, which is the underlying goal of this work.
The plan of this paper is the following. First, reviews of parton showers and DPS are given in Sections 2 and 3 respectively. Then, the simulation of DPS which has been developed is introduced in Section 4 and the results of the simulation in the context of W + W + pair production are presented in Section 5. Finally, the main ideas and the conclusions are gathered in the summary.
2 Current parton shower algorithms

Selection of the hard process
Let us consider a proton-proton collision that happens at a centre-of-mass energy √ s. The collision leads to the production of a final state A. In an event generator, the collision is described as the interaction between a parton i coming from one proton with a parton j coming from the other. In the centre-of-mass frame of the pp system, partons i and j have four-momenta 6 p i,j = x 1,2 ( √ s/2)(1; 0, 0, ±1), which results in a squared invariant mass of The total cross section for the process pp → A can be written using the factorisation formula [69] σ The integrated parton-level cross sectionσ ij→A describes the short-range physics (matrix element + phase space) of the interaction between the two partons i and j coming from the incoming protons. This partonic cross section may be calculated using perturbation theory, in practice up to a few terms. The factorisation scale µ 2 can be seen as an artificial delineation between short-range (σ ij→A ) and long-range (PDFs) physics. Below µ 2 , the physics is absorbed into the PDFs, which are universal (i.e. independent of the process ij → A). One should have µ 2 Λ 2 ∼ 1 GeV 2 in order for perturbation theory to be applicable forσ ij→A .
In an event generator, Monte-Carlo techniques [2,6] are used to select a hard process ij → A as well as its kinematics according to Equation (2.1). This equation should be kept in mind since it will be compared later to its equivalent for DPS.

QCD radiation
Once a hard process has been selected, it needs to be evolved in order to take into account the extra emissions of colour-charged particles [2]. First, an evolution variable Q 2 is defined. The evolution then brings the system from the hard scale Q 2 h down to Q 2 0 ∼ Λ 2 by going downwards in Q 2 i.e. from the hard process down to the non-perturbative regime. Within 6 Here, the transverse momenta of partons i and j with respect to the beam axis (called "primordial" or "intrinsic" transverse momenta) are neglected i.e. partons i and j are assumed to be collinear with the incoming protons. This framework is called "collinear factorisation". In the PDFs, those transverse momenta are integrated over. the parton-shower framework, this is done by defining a branching probability for each QCD branching. For final-state radiation (FSR) i.e. radiation coming from the final-state legs of the system, these branching probabilities are calculated by approximating matrix elements in the limit where the emitted parton is collinear to the radiating one, referred to as the collinear limit. The expression of the branching probabilities at LO for FSR can be found in [2,70].
Radiation associated with the partons which initiate the hard process is called initialstate radiation (ISR). In the case of proton-proton collisions, these partons actually come from protons. Hence, the inner structure of the proton needs to be taken into account in order to give an accurate description of ISR. This can be achieved using PDFs.
The main idea in the case of ISR is to work with the ensemble of partons described by the PDFs and to relate the branching probability to the variation of this set under a change in the evolution variable Q 2 . This variation is calculated with the DGLAP equations. The system of partons is evolved by using the so-called "backward evolution" [71,72]. The aim is thus to reconstruct the past history of the parton which initiates the hard process by using a conditional branching probability. This probability is defined as the rate at which the number of partons of flavour i changes during a variation dQ 2 of the scale. More precisely [71,72]: This quantity is the probability that a given parton of flavour i, with longitudinal momentum fraction x, is resolved during an evolution from a scale Q 2 h down to Q 2 and then appears as coming from a parton of flavour k with momentum fraction x/z [71,72]. The functions P k→i (z) are the LO unregularised 7 splitting kernels and α s is the strong coupling. The exponential factor is the no-emission probability. It ensures unitarity, which means that the probability density defined by Equation (2.2) is normalised to unity [70]. This noemission probability is closely related to the Sudakov form factor defined in perturbative calculations; the main difference being the presence of a non-perturbative PDF ratio. In both cases, the role of the exponential factor is to resum some large logarithms that can spoil the perturbative expansion. For this reason it is customary to refer to the no-emission probability as the Sudakov form factor and this latter term will be used in the following. The inner structure of the proton is correctly taken into account in Equation (2.2) with the presence of a ratio of PDFs; the evolution is "guided" by the PDFs. This expression is valid in the collinear limit only, since the splitting kernels in the DGLAP equations are derived with this approximation.
The evolution variable Q 2 should be chosen to be proportional to the virtuality of the off-shell parton which is radiating. In Pythia 6, the evolution variable was chosen to be exactly the virtuality [73], whereas in Pythia 8, Sherpa [8], Dire [74], Vincia [75] and Ariadne [76] it is the transverse momentum squared p 2 ⊥ of the emitted parton with respect to the initial direction of the radiating one. In Herwig, the evolution variable for FSR is defined to be [77] where p 2 k is the four momentum squared of the parton k which is currently radiating and m k , its rest mass. In the case of ISR, the initial-state parton i gets a space-like virtuality during the backward evolution which is given by For the branching k → i+j of a final-state parton k, if E k is its energy, thenq 2 E 2 k θ 2 ij in the limit θ ij → 0, where θ ij is the opening angle between the three-momenta of partons i and j. Therefore, an evolution downwards inq 2 implies that the angle θ ij must decrease through the evolution. This property is referred to as angular ordering [3,77]. It has an important consequence, namely that the shower is coherent. This means that in an event in which the branching k → i + j happens, partons i and j radiate soft gluons coherently, so that at angles greater than θ ij , it is as if the gluons are radiated by parton k [78]. A parton shower which is not coherent radiates too much and does not reproduce the results that one can obtain with matrix-element calculations in the limit where the emitted gluons are soft (referred to as the soft region).
In the simulation of DPS which will be presented later, the evolution variable will be Q 2 =q 2 , as in Herwig.

Kinematics
The branching probabilities are used to select probabilistically a set of phase-space points {(Q 2 n , z n )}, which represents all the extra emissions that have been added to the original hard process. The algorithm which is employed to achieve that is called the "veto algorithm" [2,3,6]. Each new iteration of the algorithm uses the scale Q 2 of the previous one as its Q 2 h in the expression of the branching probability. Once the shower has been generated, it is necessary to set up kinematics that preserve four-momentum conservation and ensure that all final-state partons are on-mass-shell. Multiple strategies exist. In the following, the kinematics defined in Herwig [3] will be presented since the evolution variable Q 2 =q 2 is used.
Let us consider the branching k → i + j. Within the parton-shower framework, this branching is a phase-space point (q 2 , z). The four-momenta of the two new partons coming from the branching need to be constructed from these two shower variables. The first quantity which needs to be defined is the virtuality of the radiating parton. Its value can be extracted from Equations (2.3) and (2.4) for FSR and ISR respectively. The virtualities are necessary to compute the magnitude p ⊥ of the transverse momentum of partons i and j with respect to the direction of the momentum of their mother k. This one is given by [3] For FSR, partons i and j are set on-mass-shell so p 2 i,j = m 2 i,j . Therefore In the case of ISR, it is partons k and j that are set on-mass-shell. Moreover, partons k and i are assumed to be massless. Thus The magnitude p ⊥ defined above is not enough to fully specify the relative transverse momentum of partons i and j. Indeed, one needs to define the orientation of this momentum in the plane perpendicular to the direction of parton k. This is achieved by selecting an azimuthal angle ϕ. This angle is typically uniformly distributed between 0 and 2π. However, this flat distribution can be biased to take into account the azimuthal correlations between the partons which are due to the fact that the spins and polarisations carried by the partons lead to additional interferences [78]. The variables p ⊥ and ϕ are enough to describe the transverse motion of partons i and j with respect to the direction of parton k. The longitudinal motion is fixed by the shower variable z. Parton i carries a fraction z of the longitudinal component of the momentum of parton k whereas parton j carries a fraction 1 − z.
The kinematics of a branching k → i + j can be iterated in order to construct the kinematics of the whole shower. In the case of FSR, each colour-charged final-state leg of the hard process generates its own shower and is therefore called the progenitor. The result of the shower is a jet of partons which are distributed around the direction of this progenitor. To each progenitor is associated another leg of the hard process which carries the matching colour/anticolour. This other leg is referred to as colour partner. The shower is generated in the rest frame of the progenitor-partner pair, referred to as a dipole. The +z direction is defined to be along the direction of the momentum of the progenitor. This frame is most of the time different from the centre-of-mass frame of the pp system (laboratory frame). In particular, the z-axis of the dipole may not coincide with the beam axis of the laboratory frame. The relative transverse momenta of the subsequent branchings are then computed iteratively using Equation (2.6). The longitudinal parts are calculated by applying the definitions of the longitudinal fractions z n . At the end of the procedure, the whole shower generated by the progenitor is boosted back to the laboratory frame. After the kinematics has been constructed for FSR, all the progenitors now have a time-like virtuality, since they radiated. This is to be expected, but the problem is that the kinematics of the hard process was selected by considering the progenitors on-mass-shell. Therefore, the kinematics which has been newly established breaks four-momentum conservation. A way to recover momentum conservation is to boost the generated jets along the direction of their respective progenitor such that the invariant mass √ŝ of the hard process remains the same as the one which was selected before 8 the shower [3].
For ISR, the procedure is similar. The hard process is initiated by two partons which have been extracted from the proton beams and with momenta p i,j = x 1,2 ( √ s/2)(1; 0, 0, ±1) in the laboratory frame. A colour partner is assigned to each one of them and the showers are generated in the dipole rest frames, as for FSR. In the case where the two initialstate partons form a colour singlet (e.g. W or Z 0 productions), the z-axis of the dipole is aligned with the beam axis. However, in the case where an initial state is colour connected to a final state (e.g. in gg → gg scattering) then the z-axis of the dipole is not aligned with the beam axis and one needs to boost the resulting shower. After the backward evolution has been performed, the new partons that are extracted from the beams have momenta p i,j = x 1,2 ( √ s/2)(1; 0, 0, ±1) in the laboratory frame. The fractions x 1,2 can be related to the fractions x 1,2 selected before the shower by using the definitions of the z n : x 1,2 = x 1,2 / n z n . The transverse part of the kinematics is calculated iteratively with respect to the z-axis in the dipole rest frame. Each shower is then boosted back to the laboratory frame. The two partons which are initiating the hard process now have acquired a transverse momentum and a space-like virtuality. This is because some emissions were attached to those two partons, which turned them into virtual particles. However, the kinematics of the hard process before the shower was established with the momenta p i,j . Therefore, momentum conservation is also lost in the ISR case. One solution here is to rescale the momenta of the partons that are initiating the hard process such that the invariant mass squaredŝ = x 1 x 2 s and the rapidity Y = (1/2) ln(x 1 /x 2 ) of the hard process remain equal to their original values (i.e. before ISR). The rescaling factors are found by solving two algebraic equations and they define the longitudinal boosts that must be applied to the two partons extracted from the beams, as well as to the emissions which have been attached to them. The transverse kick has to be absorbed globally by the whole final state by applying a transverse boost to this latter. Indeed, no transverse kick can be given to the momenta p i,j since it is convenient to keep those momenta along the z-axis (i.e. collinear to the incoming protons) [3]. All the details regarding the construction of the kinematics for FSR and ISR can be found in [3]. Before closing this section, let us come back to two technical aspects which must be mentioned. First, it has not been specified yet from which scale Q 2 h the evolution should start. There are multiple answers to this question. In Herwig, each progenitor starts its evolution with its own starting scale. The starting scaleq 2 h,i of a progenitor i is related to the starting scaleq 2 h,j of its colour partner j via the relationship [2,3] where m 2 ij is the dipole mass squared. In the case where i and j are both initial-state partons, referred to as an initial-initial (II) dipole, m 2 ij is simply the invariant mass squared s of the hard process. However, if the dipole is stretched between an initial-state parton 8 Recall that the kinematics of the hard process is selected using Equation (2.1). and a final-state parton (IF/FI dipole), then m 2 ij is notŝ anymore and is instead equal to the Mandelstam variable −t (or −û). The condition given by Equation (2.8) ensures that the soft region of the dipole phase space is correctly covered by the respective showers of partons i and j, without overlap. One can see that there is a certain degree of freedom in Equation (2.8). Indeed, any combination of (q 2 h,i ,q 2 h,j ) that satisfies this condition will ensure that the soft region is covered. However, those combinations will give different results outside the soft region. This is because an angular-ordered shower describes correctly the emission pattern of a dipole in the soft and collinear regions only. Outside these regions, matrix elements must be used. In particular, the hard non-collinear region of the phase space (referred to as the "dead cone") is not populated by an angular-ordered shower and must be filled with matrix elements [77]. The issues related to the dead cone in the case of DPS will not be addressed in this work. In the default Herwig, the choicesq 2 h,i =q 2 h,j =ŝ andq 2 h,i =q 2 h,j = −t are made for the II and IF/FI dipoles respectively [3]. It will be seen later that this choice should be modified for the DPS case.
The second technical aspect is the argument of the running strong coupling α s that is used in the parton shower. One possible choice is the evolution variableq 2 . However, it is argued in [79][80][81][82] that, in the context of an angular-ordered shower, using the transverse momentum squared p 2 ⊥ takes into account some NLO effects, which makes this choice better thanq 2 itself. This strategy is referred to as the "Monte-Carlo scheme". Deriving the same result in the case of DPS is beyond the scope of this work. Nevertheless, we decide to use the same scheme in the following and some arguments for making such a choice will be given in Section 4.3. Note that consideringq 2 or the virtuality as the argument of the strong coupling are valid choices too in the context of a LO shower, since the differences between those choices lead to contributions which are beyond the accuracy of the shower.

Double vs. single parton scattering
In the following, a review of multiple parton interactions and, in particular, double parton scattering is given.

Current MPI models
Equation (2.1) describes a proton-proton collision as a single parton-parton collision. However, many partons can be extracted from the same proton and those partons may also initiate other parton-parton collisions referred to as secondary interactions. There are several models for MPI. Within Herwig, the number of secondary interactions is selected according to some distribution derived from the eikonal model [34,35,83]. Each subsystem is thereafter showered independently. At the end of the procedure, if four-momentum conservation is violated, 9 the event is regenerated. This is repeated until all kinematic constraints are fulfilled [34]. In Pythia 8, the strategy is different. The generation of secondary interactions is combined with ISR and FSR in a unique sequence of decreasing transverse-momentum values. More specifically, a probability, differential in p 2 ⊥ , to have 9 For example, the partons have extracted more energy than is available inside the proton.
a secondary interaction is defined [36][37][38]. As for the eikonal model used in Herwig, this differential probability is derived from the cross sections of the QCD 2 → 2 processes. 10 During a common evolution which is performed by going downwards in p 2 ⊥ , three scales are generated with the veto algorithm: one for ISR, one for FSR and one for MPI. The highest scale determines what actually happens. This procedure is called "interleaved evolution" [73,84].
The common evolution of the different subsystems involves mPDFs. As mentioned in the introduction, a typical ansatz for the mPDFs is to assume that they can be expressed as a product of sPDFs. In the instance of DPS, this means that the dPDFs are written as [28] where y is the distance between the two partons in the plane transverse to the momentum of the proton. The y-dependence of the dPDFs has been factorised out into some distribution F (y) which one assumes to be flavour and scale independent. This latter is usually modelled by using the electromagnetic form factor of the proton with a characteristic size of the order of the radius of the proton [34,84]. Equation (3.1) is convenient, but it fails to take into account the correlations between the partons belonging to the same proton. Some of these correlations are a consequence of the number and momentum sum rules that the mPDFs must verify in order to give a realistic description of the proton. The number sum rules state that the proton has two valence u quarks and one valence d quark. In the case of SPS, their expressions can be found in a textbook and read  where the valence components of the sPDFs are defined 11 as f uv (x, µ 2 ) = f u (x, µ 2 ) − fū(x, µ 2 ) (and similarly for d v ). In the following, this kind of relation will be symbolically written as u v = u −ū. The momentum sum rule imposes that all the partons probed at a given scale must carry the full momentum of the proton they belong to. More specifically, the sPDFs must satisfy the following equation with n f the number of quark flavours considered. An extension of these sum rules for the DPS case has been proposed in [28]. They are referred to as the Gaunt-Stirling (GS) sum rules and have been extensively studied [28,55,85]. It is explained in [55] that it is the integrals 12 over y of the dPDFs that satisfy the GS sum rules.
The GS sum rules strongly constrain the dPDFs. For example, if one defines the dPDF d v d v as the probability density to extract two valence d quarks from the same proton, then this dPDF must be identically zero at all scales. Also, finding a parton with momentum fraction x 1 reduces the probability to find a second parton with a fraction x 2 which is close to the value 1 − x 1 . In order to approximately include such effects, the usual sPDFs are rescaled and renormalised inside Pythia 8. For example, the probability to extract a valence d quark with momentum fraction x 2 , knowing that a parton of flavour i has been extracted beforehand at a scale µ 2 1 with a fraction x 1 , is given by [37] f dv is equal to zero if i = d v and unity otherwise. This ensures that at most one valence d quark can be extracted from the proton. The rescaling ensures that the kinematic constraint x 1 + x 2 ≤ 1 is fulfilled. It can be checked that such a distribution satisfies the following sum rule Similar distributions are defined for the other flavours. With such a scheme, the factorised form of the dPDFs given by Equation (3.1) is replaced by the following expression inside Pythia 8 [86] F Py These dPDFs approximately satisfy the GS sum rules [86] and are clearly symmetric in the sense that F Py ij (x 1 , x 2 , y, µ 2 1 , µ 2 2 ) = F Py ji (x 2 , x 1 , y, µ 2 2 , µ 2 1 ). The rescaling introduced in Equation (3.4) generates a suppression of the dPDFs close to the kinematic limit. More precisely, the distribution given by Equation (3.4) tends smoothly towards zero whenever the kinematic boundary x 1 + x 2 = 1 is approached. In QCD studies, it is customary to implement this kinematic suppression by adding a phase-space factor to Equation , with p ≥ 1 and Θ, the Heaviside function [39].
More features are implemented inside the MPI model of Pythia 8 [37]. For example, the concept of "companion" quark is introduced to take into account the fact that sea quarks are always produced in pairs. More precisely, each time a sea quark is extracted from a proton, it leaves behind its corresponding antiquark which is included inside the structure of the beam remnant. This latter quark is referred to as the "companion".
In Herwig, the factorised form (3.1) is preserved, without adding any phase-space factor. As mentioned above, the kinematic constraint is enforced by vetoing the events that do not satisfy it. Moreover, the MPI machinery cannot extract too many valence quarks since the backward evolutions of the secondary interactions are forced to terminate on a gluon, whereas the one of the hard process finishes necessarily on a valence quark. In practice, this is achieved by evolving the secondary interactions using sPDFs with the valence contributions subtracted out [34].
In the following, unless explicitly mentioned, the F ij will refer to the y-dependent dPDFs as defined in [54] i.e. the factorised form given by Equation (3.1) will not be used and has been recalled here for historical reasons only.

Review of double parton scattering
It has been seen that a factorisation formula can be written for SPS, recall Equation (2.1). In the same way, a factorisation formula can also be derived for DPS [27,52,54,87,88]. Consider a final state A+B produced during a pp collision at a centre-of-mass energy of √ s.
It is assumed that the production of such a final state can be described with the subprocesses pp → A and pp → B i.e. a DPS. It is possible to define two different factorisation scales µ 2 A and µ 2 B , one for each subprocess. The total cross section for the process pp → A + B via a DPS process only is [28,54] This formula can be seen as a product of two Equations (2.1), especially for the short-range part. Indeed, the short-range part of Equation (3.7) is the product of the parton-level cross sections for the subprocesses ij → A and kl → B, where these have associated squared invariant masses ofŝ 12 = x 1 x 2 s andŝ 34 = x 3 x 4 s respectively. The long-range part is more complicated. It involves the dPDFs F ij (x 1 , x 2 , y, µ 2 A , µ 2 B ). It can be intuitively understood that the transverse distance y between the two partons has to be the same for the two incoming protons in order for the two pairs of partons to actually collide in two separate hard interactions [25]. Since y is not a measurable quantity, this degree of freedom must be integrated over in order to give a physical cross section. The function Φ is a cut-off at small y = |y| which will be discussed later. The quantity in front of the sum in Equation (3.7) is a symmetry factor which is equal to one half if A = B and to unity otherwise [28]. It comes from phase-space integration. It is worth recalling that Equations (2.1) and (3.7) are valid for partons that are collinear with the incoming protons (i.e. the primordial transverse momenta are neglected in the hard processes and in measurements). Illustrations of this formula are given in Figure 2.   the integral over y can be performed. This leads to the definition of an effective cross section 13 [28,35] and Equation (3.7) can be recast as the so-called "DPS pocket formula" where σ SPS A and σ SPS B are the SPS cross sections for the processes pp → A and pp → B given by Equation (2.1). This formula is particularly simple to use in practice to estimate the size of the DPS contribution. Historically, this formula has been used to experimentally measure the effective cross section σ eff [46,89,90]. More precisely, the D0 and CDF collaborations have measured σ eff at a pp collider by using the DPS contribution to pp → γ + 3 jets, with A = γ + 1 jet and B = 2 jets [91][92][93]. The D0 collaboration found σ eff = 16.4 mb whereas the CDF collaboration measured σ eff = 14.5 mb. In [89,90], it is explained that including some theoretical considerations leads to a lower value than the one extracted by the D0 and CDF collaborations. More measurements have been performed recently at the LHC by the ATLAS, CMS and LHCb collaborations [94][95][96][97][98][99][100][101]. What can be remembered from these measurements is that σ eff ∼ 1/Λ 2 i.e. it is of the order of the transverse area of the proton [25]. Note that σ eff is process independent according to Equation (3.1). However, if one uses Equation (3.9) to define σ eff (as experimentalists do), then one may find that it depends on process, scale, etc...
Let us now present the main features of the DGS framework introduced in [54]. This framework involves y-dependent dPDFs i.e. no factorisation ansatz is used. These dPDFs satisfy the following two properties: i. These dPDFs satisfy the homogeneous dDGLAP evolution equations [27,54]. These equations can be derived by considering the renormalisation of the y-dependent dPDF operator. In the case where the two factorisation scales are set to be equal i.e.
whereP i →i (z) are the usual regularised splitting kernels. This equation is basically the sum of two usual DGLAP terms. The only differences are the presence of the dPDFs and the fact that the upper boundary of the integral is not unity anymore but is now determined by the kinematic condition with the scale µ 2 therefore turns out to be simply the sum of the contributions from the evolution of each one of the two partons i and j. It is important to note that the impact parameter y does not contribute to the evolution at all.
ii. At small y and for scales µ ∼ 1/y, the dPDFs should be given, up to formally powersuppressed corrections, by a perturbative splitting expression involving the sPDFs. This can be derived by considering the operator product expansion of the dPDFs at small y. At LO, the expression is given by [27] F spl,pt This term takes into account the fact that the pair of partons ij can originate from the perturbative splitting of a parton k with longitudinal momentum fraction The flavour k is uniquely determined by the flavours i and j for LO QCD splittings so no sum is needed. If there is no flavour k such that the branching k → i + j is allowed, because of colour or flavour considerations, then the perturbative splitting expression for the pair (i, j) is equal to zero. This small-y expression involves the unregularised splitting kernel P k→i+j (z) and the sPDF of parton k, which gives the probability of probing such a flavour k at the scale µ. There is no need to regularise the splitting kernel since virtual loops cannot lead to a 1 → 2 splitting [28].
In [54] a model set of dPDFs was constructed satisfying these constraints, and it is this set of dPDFs that is used in our numerical studies. We slightly adjusted this set to approximately take account of momentum and number sum-rule constraints, as discussed in Section 4.6 and Appendix B. However, it is important to remind the reader that the framework which will be presented in the next section is not tied to this particular dPDF set. One can use any dPDFs that are consistent with the DGS framework (namely satisfying Properties (i.) and (ii.) above) and that approximately satisfy the momentum and number sum-rule constraints.
Let us now briefly review the model set of dPDFs used in [54]. The authors modelled the dPDFs as follows where F int ij is called the intrinsic component of the dPDFs and it contains the non-perturbative contributions to the dPDFs. The term F spl ij is referred to as the splitting component and corresponds to the contribution from the perturbative 1 → 2 splittings. Both F int ij and F spl ij separately satisfy the homogeneous dDGLAP equations [26,27,54]. Each component has its own starting scale for the evolution. The intrinsic component is initialised at the scale µ 0 = 1 GeV by which is basically Ansatz (3.1) with a phase-space factor times a Gaussian in y with a width which depends on x 1,2 and on the flavours i and j. The y shape comes from the link between dPDFs and generalised parton distributions (GPDs) in the approximation of uncorrelated partons [27,42,102]. The widths of the Gaussians h ij (x 1 , x 2 ) are defined as [42] where the α i and B i coefficients are obtained using GPD phenomenology and are usually scale-dependent. More precisely, at low scale, values for these coefficients are determined by fitting models for GPDs to data for electromagnetic form factors [102] as well as for J/ψ photoproduction and deeply virtual Compton scattering [103,104]. In the procedure sketched in [54], the dependence on x 1,2 is actually removed for simplicity so the functions h ij (x 1 , x 2 ) are brought back to flavour-dependent coefficients which are where q i stands for any quark or antiquark. Those coefficients are obtained by setting x 1 = x 2 = 10 −3 in Equation (3.14) and using the values of the α i and B i coefficients given in [42]. The initial condition (3.13) is then evolved according to the homogeneous dDGLAP equations from the scale µ 2 0 up to µ 2 . For the splitting component, the starting scale 14 is µ y = b 0 /y * , with y * = y/ 1 + y 2 /y 2 max , b 0 = 2e −γ E 1.12 and y max = 0.5 GeV −1 . The input is then This expression reduces for small y to the perturbative splitting expression given in Equation (3.11), thereby ensuring that Property (ii.) above is satisfied. At small y, the splitting component behaves like 1/y 2 [26,27,54]. From a naive power counting, this would mean that the component F int ij is negligible compared to F spl ij for y → 0. However, this might not be necessarily the case in practice.
The input for the splitting component is not only defined by the small-y expression given by Equation (3.11). In particular, some modelling is added. First, a Gaussian factor is included in order to suppress the expression at large y values, as for the intrinsic component. Second, the starting scale is defined to be µ y and not b 0 /y. This is to avoid the sPDF and the strong coupling being evaluated at a scale which is outside the perturbative regime. Indeed, with this choice, µ y → b 0 /y max 2.24 GeV when y → +∞, which is still in the perturbative regime. The input (3.16) is also evolved by using the homogeneous dDGLAP equations, but starting from the scale µ y [54].
Let us now come back to the general DGS framework. The 1/y 2 behaviour of the splitting component is troublesome. Indeed, the expression diverges when y → 0 and leads to an unphysical DPS cross section. As often in Feynman graph calculations, a divergence in the formulae is a manifestation of a problem of double counting. In that case, it is a double counting between DPS and SPS. Indeed, a qq pair with separation y may 15 be resolved as a single gluon at resolution scales smaller than 1/y. This phenomenon is described by the 1 → 2 splitting mechanism. The double counting issue then appears since a DPS process with 1 → 2 splittings in both protons may also be regarded as a loop correction to the SPS process. The sketch given in Figure 3 shows how the same Feynman diagram can be seen either as an SPS or as a DPS.
The factorisation formula for DPS thus breaks down for small y because of double counting between SPS and DPS. This issue can be solved by a two-step procedure explained in detail in [26,27,54]. The first step is to regulate the DPS total cross section at small y. This is done by including the function Φ inside the factorisation formula (3.7). The function Φ is chosen so that Φ(u) → 1 for u → +∞ and Φ(u) → 0 for u → 0, which indeed regulates the integral. The cut-off scale ν separates DPS from SPS and can be seen as some new factorisation scale. In [54], it is argued that one should choose ν ∼ Q h , with Q h the hard scale which characterises the final-state A + B. In the following, the function Φ(u) is chosen to be the Heaviside function Θ(u − b 0 ). The second step is to make a subtraction 14 Including the coefficient b0 inside the definition of the starting scale µy simplifies some expressions in [54]. The physics is not changed since b0 is of the order of unity. 15 The colour configuration of the qq pair must be in the octet representation. that removes the double counting between DPS and SPS. The total cross section for the production of the final-state A + B thus becomes 16 where σ sub is the integral over y of a quantity dσ sub /d 2 y that is defined to satisfy dσ sub /d 2 y dσ DPS (A,B) /d 2 y for y 1/Q h and dσ sub /d 2 y dσ SPS A+B /d 2 y for y 1/Q h . Thus, when the two partons are well separated, the production of A + B is described as a DPS with the DPS cross section. In this region, the approximations used to derive the DPS cross section hold and Φ(yν) 1. In contrast, when the partons get really close (y ∼ 1/Q h ), the DPS description is not valid anymore and the SPS cross section is used instead. It can be shown that the dependence of the total cross section σ A+B on the unphysical cut-off ν is removed via a cancellation between the ν-dependent parts of σ DPS (A,B) and σ sub . More specifically, in [54], the subtraction term is defined as the DPS cross section given by Equation (3.7), but with the dPDFs replaced by the fixed order splitting expression (i.e. Equation (3.16) without the Gaussian factor and with the scale µ y replaced by the generic scale µ). The function Φ(u) is also inserted inside the subtraction term. Since the dPDFs are dominated by the perturbative splitting expression at small y, one can understand that the ν-dependences cancel, at least order by order in QCD.

The approach
The main idea is to use Equation (3.7) to generate two separate hard processes with their respective kinematics. After that, the two hard processes are showered simultaneously during a common evolution guided by the dPDFs. This procedure leads to the set of final-state partons. Let us compare with what is already done in the current event generators. The possibility of choosing two separate hard processes already exists. However, their kinematics are selected according to the usual SPS cross section given by Equation (2.1). The kinematics are then rectified in order to take into account the kinematic constraints that link the two hard processes. The total DPS cross section is then calculated with the DPS pocket formula given by Equation (3.9). Regarding the evolution of these two hard processes, the current models of MPI shower them almost independently in the sense that the dynamical correlations between the different subsystems are not taken into account. Using dPDFs should then catch some of these dynamical correlations. The impact parameter y is present in MPI models. However, the y-dependent part of the dPDFs is factorised out into a function F (y), as in Equation (3.1). The function F (y) is then modelled and used to calculate the average number n MPI of secondary interactions. This is the eikonal model [34,35,38,83]. In Herwig, the number of secondary interactions is calculated straight from the eikonal model [34,35], whereas the function F (y) is used to weight the probability to have a new secondary interaction in Pythia 8 [38]. In this latter event generator, the factor F (y) is now dependent on the longitudinal fraction x in order to take into account the longitudinal correlations between the partons [38]. In this work, the longitudinal correlations are included by using the y-dependent dPDFs instead of the factorisation form given by Equation (3.1).

Selection of the two hard processes
In order to select kinematic and flavour configurations for each one of the two hard processes, one needs to sample random variables according to Equation (3.7). The generic method to achieve this is well known for the usual SPS cross section formula and is explained in great detail in [6]. In the case of DPS, the strategy is broadly the same. The major difference is the dimension of the phase space which jumps from three to seven. Indeed, each hard process is characterised by three non-trivial variables. The last variable is the impact parameter y. More specifically, Equation (3.7) is rewritten as The Mandelstam variablest A andt B specify the transverse momenta of the outgoing particles in each one of the two 2 → 2 processes. The factor 2π comes from the fact that the dPDFs have been assumed to be independent of the azimuthal angle of the vector y. In other terms, the dPDFs only depend on the magnitude y. In the following, the equal-scale case will be considered. This is because there is no set of unequal-scale y-dependent dPDFs available yet. However, an extension of the algorithm presented below to the unequal-scale case is in principle achievable, see Section 4.5. The equal-scale case is suitable for processes such as same-sign WW production where one has µ 2 A µ 2 B m 2 W , with m W , the mass of the W boson. More generally, the prescription which will be used in the following is to set the common factorisation scale µ 2 to be equal to min(µ 2 A , µ 2 B ). From a parton-shower point of view, this is the most realistic choice since the two hard processes should be resolved at the start of the evolution. In contrast, note that the arguments of the couplings used in the expressions of the differential parton-level cross sections do not have to be the same for the two hard processes.
The phase-space boundaries are mainly determined by the cuts (e.g. on the transverse momenta) and the kinematic constraints. The two main constraints are x 1 + x 3 ≤ 1 and x 2 + x 4 ≤ 1. If one selects first the variables for the subprocess pp → A, then the variables τ B and Y B are constrained by some limits which are functions of τ A and Y A .
The constraints on the variable y are less trivial. Theoretically, the integral goes up to y = +∞. However, one expects the integrand to fall to zero quickly for values of y larger than the radius of the proton. For this reason, one can in practice cut off the y integral at some value y cut , if this value is much larger than the radius of the proton. The value y cut = 8 GeV −1 will be used here. This choice will be motivated in Section 5.1. Note that even if y > y max , µ y stays larger than b 0 /y max because of the y * -prescription introduced in Equation (3.16). The lower limit is given by the function Φ(yν). In our case Φ(yν) = Θ(yν − b 0 ), which implies that the integral is non-zero for y > b 0 /ν (or µ y < ν). In the following, the choice ν = Q h is made, with Q h , the hard scale. The lower limit is thus b 0 /Q h . This is a natural requirement since the DPS description is not valid for y 1/Q h .

Combining parton showers and dPDFs
The aim now is to use the dPDFs to guide the ISR evolution of the two hard processes. This idea is not completely new and has already been investigated in the past for Pythia 8 [38,73]. However, the authors were using the model of mPDFs presented in Section 3.1 which is based on factorising the mPDFs as products of sPDFs. An evolution which uses the y-dependent dPDFs is proposed here.
Let us consider two partons of flavours i and j belonging to the same incoming proton with momentum fractions x 1 and x 2 and participating in two different hard processes characterised by the same hard scale Q 2 h . Simulating ISR for DPS requires to perform a simultaneous backward evolution of the two partons belonging to the same proton, starting from the scale Q 2 h . In order to do so, one needs to define a branching probability for the pair of flavours ij. This can be achieved by using the homogeneous dDGLAP equations (3.10) in the same spirit as what is usually done with the conventional parton showers [38,73]. One can write where the fact that the argument of α s should be the transverse momentum squared p 2 ⊥ of the branching has been anticipated, see Section 2.3. Also, the factorisation scale µ 2 has been replaced by the evolution variable Q 2 , since the context of parton shower is now considered. The splitting kernels are now the unregularised ones, since the phase-space is regulated by a set of cut-offs. In order to get a well-defined probability which satisfies unitarity, one needs to add a Sudakov form factor as follows is the ingredient needed to simulate ISR for DPS. The physical interpretation is the following. dP ISR ij is the probability that the pair ij remains resolved during an evolution starting from the scale Q 2 h down to the scale Q 2 . After that, the pair ij might appear as coming either from the pair i j (first term in (4.2)) or the pair ij (second term). In practice, this choice is made by selecting a scale for each one of the two channels. The highest scale determines which channel actually happens. This method is referred to as the "competing veto algorithm" [105].
The form of Equation (4.2) can be used to motivate the fact that p 2 ⊥ was chosen to be the argument of α s in the case of DPS too. Indeed, Equation (4.2) can be seen as the sum of two usual ISR branching probabilities as used in the SPS case, the main differences being the presence of dPDFs instead of sPDFs and the different upper boundaries for the integrals. For high-energy collisions, one expects the momentum fractions to be rather small (typically x 1,2 ∼ 10 −3 ). Therefore, in practice, one has 1 − x 1,2 1 and the kinematic conditions are hence similar to the SPS case ones. One can thus expect that most of the arguments presented in [79][80][81][82] should hold for the DPS case too. As a reminder, the choice of the argument of the strong coupling leads to contributions which are beyond the accuracy of a LO shower so this choice should not have a significant impact on the numerical results. Equation (4.2) also motivates the fact that the whole shower evolution in the case of DPS is gauge invariant. Indeed, the fact that the DPS ISR evolution is similar to the sum of two usual SPS ISR evolutions leads us to divide the radiation pattern into FSR and ISR in the same way as in the SPS case, where both components make use of the gaugeinvariant Altarelli-Parisi splitting functions for their calculation. Moreover, the dPDFs have a gauge-invariant operator definition [27] and evolve according to gauge-invariant renormalisation group equations (recall Equation (3.10)). Finally, the "initial conditions" described in Section 3.2 reduce at small y and adequate scales into the appropriate gaugeinvariant perturbative splitting expressions (recall Equation (3.11)).
Let us now come back to the DPS ISR evolution. Two regimes are defined. The evolution from the hard scale Q 2 h down to the scale µ 2 y can be done using the two components of the dPDFs i.e. F ij = F int ij + F spl ij . Since the splitting component is not defined for Q 2 ≤ µ 2 y according to the procedure prescribed by [54], the evolution from the scale µ 2 y down to the scale Q 2 0 is performed using the intrinsic part only i.e. F ij = F int ij . The philosophy of the backward evolution of DPS is then the following. At the starting scale Q 2 h of the evolution, the two partons of flavour i and j belonging to the same protons have a size of order 1/Q h ≤ y so one can talk about DPS. However, after an evolution downwards in Q 2 which leads to Q 2 = µ 2 y , the partons now have a size of order 1/Q = y and the pair ij might be resolved into a single parton of flavour k. In the following, such a phenomenon will be referred to as "merging" since, from a backward-evolution point of view, it seems that the two partons of flavours i and j merge into a single parton of flavour k. The merging happens with a probability given by p Mrg = F spl ij (x 1 , x 2 , y, µ 2 y )/F ij (x 1 , x 2 , y, µ 2 y ). In the case where a merging happens, then a simple backward evolution of the single parton is performed. If no merging occurs then the two partons remain resolved as a pair and the evolution is carried on with the intrinsic part of the dPDFs only. More details about the merging procedure will be given in a dedicated section. An illustration of the backward evolution is given in Figure 4. The algorithm has the following structure: 1. Define two hard processes with their common scale Q 2 h . Those two hard processes are initiated by four partons of flavour i, j, k and l. 4. After the parton shower has been performed (i.e. Step (3)), the scale Q 2 is now equal to µ 2 y . Consider a proton in which a pair of partons with flavours i and k and with fractions x 1 and x 3 is resolved. If there exists a parton of flavour h such that the branching h → i + k exists then take a random number R uniformly distributed between 0 and 1. If R < p Mrg = F spl i k (x 1 , x 3 , y, µ 2 y )/F i k (x 1 , x 3 , y, µ 2 y ) then merge the two partons into a single one with momentum fraction x 1 + x 3 . After the merging, a usual backward evolution of the single parton h is performed from the scale µ 2 y until the minimum scale Q 2 0 is reached. In the case where such a flavour h does not exist or if R > p Mrg then proceed with Step (5). Do the same for the other proton.
5. Evolve each remaining pair of partons downwards in Q 2 from the scale µ 2 y down to some infrared cut-off Q 2 0 ∼ 1 GeV 2 . In the expression of dP ISR ik , only the intrinsic part of the dPDFs is now used i.e. F ik = F int ik .
In the absence of merging, the construction of the kinematics at the end of the shower follows the exact same procedure as the one sketched in Section 2.3. More precisely, the kinematics is constructed for each one of the two hard processes so that the respective invariant mass and rapidity of each subsystem are conserved. Since the two hard processes are separate, no complications appear. The kinematic constraints, imposing that the partons inside the same proton cannot carry more energy than what is available, are implemented within the evolution (recall the boundaries of the integral inside Equation (4.2)). The generation of FSR is the same as in the SPS case, since no PDFs are involved.  The main difference with the SPS case is the choice of the starting scale Q 2 h for the ISR evolution. As mentioned in Section 2.3, this choice depends on the types of dipoles involved and the partons initiating the hard process might have different starting scales. The problem with DPS is that the two partons extracted from the same proton must start with the same scale, since only a set of equal-scale dPDFs is available. Let us take the example of same-sign WW production. Here, one starts with two II dipoles, since the W bosons are colour singlets. If the first hard process is initiated by partons i and j and the second by partons k and l, then the conditions on the starting scales in an angular-ordered shower (Q 2 =q 2 ) readq whereŝ A andŝ B are the invariant masses squared of the two hard processes. The fact that the dPDFs use the same scale imposesq 2 h,i =q 2 h,k andq 2 h,j =q 2 h,l . With the constraints above, this implies thatŝ A =ŝ B , which is not satisfied in general. Therefore, it is not possible to satisfy at the same time both of the constraints given in Equation (4.4). The choice that is made in this instance is to set all the starting scales equal to min(ŝ A ,ŝ B ). This breaks one of the conditions of Equation (4.4) but, in the case of WW production, one expects thatŝ A ŝ B m 2 W so the violation is not too large. Some complications arise when one wants to study W + 2 jets or 4-jet production. Indeed, in those cases, there is no reason whyŝ A should be comparable toŝ B . Moreover, the colour configuration might lead to some IF/FI dipoles. In these cases, the strategy which is adopted is to set the starting scales of the four incoming partons equal to a common scaleq 2 h . For the IF/FI dipoles, the starting scale for the FSR evolution of the final-state parton which is linked to the initial-state parton is then set tot 2 /q 2 h , instead of simply −t as in the SPS case. This will ensure that, for any IF/FI dipole, the conditionq 2 h,iq 2 h,j =t 2 is satisfied. The choice of the common scaleq 2 h depends on the dipole configuration. In the case where there are two II dipoles among the list of dipoles, then one should chooseq 2 h = min(ŝ A ,ŝ B ), as mentioned before. In contrast, if there is only one II dipole belonging, for example, to the hard process A, then the most reasonable choice seems to beq 2 h =ŝ A . Finally, if there are only IF/FI dipoles, then one should setq 2 h = min({−t a }), with the index a going through the list of IF/FI dipoles which are present. This is the most straightforward solution but more sophisticated strategies will be investigated in the future. Unfortunately, there is not really a solution for the case of two II dipoles withŝ A andŝ B very different. This is because a realistic description of W + 2 jets and 4-jet production via DPS requires a set of dPDFs with two different scales. Note that 4-jet production via DPS was studied in [67,68], since the approach of the authors was using unequal-scale dPDFs.

Kinematics
Some difficulties appear when one allows mergings to happen. Let us consider two partons i and j with momentum fractions x 1 and x 2 initiating two different hard processes. At the scale Q 2 = µ 2 y , they merge into a single parton k. The two partons i and j now get a space-like virtuality. However, the only scale which is present is the scale µ 2 y and there is some arbitrariness in how the virtualities should be related to that scale. Moreover, the construction of the kinematics is now troublesome. As explained in Section 2.3, momentum conservation gets broken by the fact that the two partons that initiate the hard process obtain a transverse momentum and a space-like virtuality because of the ISR evolution. This is solved by applying a longitudinal boost on each side, so that the invariant mass squaredŝ and the rapidity Y of the system are conserved. In the case of SPS, this works since one has two degrees of freedom (two boosts) and two constraints (ŝ and Y ). In the case of DPS, the situation is different. For example, if a merging happens inside one beam but not within the other one, then the whole system is initiated by three partons. Therefore, one is left with only three degrees of freedom (one longitudinal boost for each parton). This is not enough to satisfy the fact that the invariant mass and the rapidity of each subsystem must be conserved, which in total gives four constraints. The situation becomes worse if two mergings happen, which then reduces the number of degrees of freedom to two. The system is thus overconstrained in the case of merging. This issue has already been mentioned and discussed for the transverse-momentum-ordered shower of Pythia 8 in [73]. There, the merging is referred to as "joined interaction". The authors proposed two solutions to such a problem. The first one is to drop the statement that parton k must have a momentum fraction equal to x 1 +x 2 . This removes a constraint and allows the kinematics to be established in the case of a transverse-momentum-ordered shower. Nevertheless, the momentum fractions used as arguments for the dPDFs must be adapted in order to account for such a change. The corrections are then of order O(µ 2 y /Q 2 h ). With the second solution, this constraint is kept but no transverse momentum is given to the virtual partons i and j, which also allows the construction of the kinematics. Those prescriptions were proposed for a shower with a local-recoil strategy. For a global-recoil strategy as in angular-ordered showers, one needs to adapt them.
No ultimate solution has been found yet. However, the procedure presented in the following seems to give reasonable results, although there is room for improvement. At the scale Q 2 = µ 2 y , the evolution of the two hard processes gets frozen and the probability p Mrg is evaluated for each proton. If no mergings happen, then the evolution is carried on and the construction of the kinematics is the same as in the SPS case. If at least one merging happens, then the first step is to construct at the scale Q 2 = µ 2 y the individual kinematics of each hard process using the procedure described in Section 2.3. This is done before implementing the mergings. The two hard processes are thus still separated. After that, the idea is to define a new hard process which absorbs the two hard processes and all the emissions that have occurred so far. This new hard process is characterised by a squared invariant massŝ and a rapidity Y , which can be calculated. Before actually implementing the mergings, this new hard process is initiated by four partons: i and k on one side, and j and l on the other side. It is convenient to come back to a hard process initiated by only two partons. Therefore, one can define some pseudo-initiators with four-momenta p i + p k and p j + p l respectively. Those momenta are actually the ones which are assigned to the mother partons after merging. Let us take the example where partons i and k merge into a single parton of flavour h and partons j and l do not merge. The new hard process is then physically initiated by the three partons h, j and l. The momentum of h is defined as p h = p i + p k , which leads to x h = x i + x k . Here, it is important to emphasise that the momentum fractions x i and x k are not exactly the same as the momentum fractions which were generated by the shower and used to evaluate the probability p Mrg . These latter ones will be referred to as ξ i and ξ k . The fact that x i,k = ξ i,k is the price to pay to allow emissions before the merging phase and to be able to conserve the invariant mass and the rapidity of each hard system. More precisely, the momentum fractions x i,k and ξ i,k are related by the longitudinal boosts that are applied to the incoming partons before actually implementing the mergings. The longitudinal boosts have the following form In practice, λ 1 since the shower should not alter too much the initial kinematics of the hard systems. Before applying the boosts, partons i and k have momenta ( √ s/2) ξ i,k (1; 0, 0, 1) in the laboratory frame. After applying the longitudinal boosts, the two momenta are p i,k = ( √ s/2) λ i,k ξ i,k (1; 0, 0, 1). Thus, after implementing the merging, parton h has a momentum fraction given by In this work, no transverse momentum is given to partons i and k. However, a variant of this procedure that generates a transverse momentum for partons i and k will be investigated in future works. Since p i and p k are light-like momenta along the beam pipe and pointing in the same direction, their sum p h is necessarily a light-like momentum along the beam pipe too. Adding emissions to parton h can thus only turn its light-like momentum into a space-like momentum. For the calculations, a pseudo-initiatorg with momentum pg = p j + p l is defined. This pseudo-initiator is simply a mathematical tool and is not physically implemented. The new hard process thus has a squared invariant mass and a rapidity given bŷ An illustration of this example is given in Figure 5. After this has been done, the evolution is carried on from the scale µ 2 y down to Q 2 0 . In this example, a simple backward evolution of parton h is performed, whereas the evolution of the pair jl is carried on as described in Section 4.3. At the end of the shower (i.e. Q 2 = Q 2 0 ), the three partons that are extracted from the beams are h , j and l . During the evolution, subsequent emissions were attached to the two partons h andg that initiate the new hard process defined previously. Because of these emissions, partons h andg are now virtual particles and their momenta are not the light-like momenta p h and pg anymore. Instead, partons h andg have obtained a transverse momentum and a space-like virtuality, which break momentum conservation. In order to recover momentum conservation, one can now use the same strategy as the one used in the usual SPS case. In particular, the kinematics can be constructed. One has two degrees of freedom (a rescaling factor for parton h and one for partong) and two constraints (ŝ and Y ). The kinematics is thus constructed so that the invariant mass and the rapidity of the new hard process are conserved. The rescaling factor for parton h defines the longitudinal boost that must be applied to parton h and its shower, whereas the one for the pseudoinitiatorg gives the longitudinal boost which is applied to both partons j and l , as well as their respective shower.
Regarding the evolution after the mergings, one needs to redefine the dipoles since the colour flow might have been modified, as it will be seen in the next section. This means that new starting scales must be assigned to each parton. The strategy is similar to the one proposed in Section 4.3. All the partons initiating the new hard process start with the initial scale µ 2 y , regardless of whether they belong to an II or an IF/FI dipole. In the case of an IF/FI dipole, this means that the final-state parton linked to the initial-state parton starts with a scale min(µ 2 y ,t 2 /µ 2 y ). Here, it is made sure that the maximum starting scale for any parton is µ 2 y , since the evolution stopped at this scale. With this prescription, each parton gets a unique chance to radiate within an interval of scales. This should avoid double-counting issues. Indeed, if the final-state parton were starting its evolution with the scalet 2 /µ 2 y , then, in the case wheret 2 /µ 2 y > µ 2 y , the parton could radiate again in the interval [µ 2 y ,t 2 /µ 2 y ] despite this possibility already being covered during the first evolution down to µ 2 y , before the merging phase. For the same reason, the starting scale for a parton belonging to a final-final (FF) dipole is set to min(µ 2 y , m 2 dip ), where m dip is the dipole mass. Although the strategy mentioned above seems to remove the double-counting issues, it is not clear whether it produces under-counting issues or not. To answer this question, one would need a full matching between the emission patterns of the system before and after the merging phase, which is beyond the scope of this work. In the case of W + W + production, it will be seen later 17 that the cross section is dominated by the intrinsic × intrinsic part, which leads to a sampling of the variable y biased towards the large values. Therefore, µ y is usually close to the infrared cut-off Q 0 of the shower, leaving little room for extra emissions after the merging phase. In contrast, the merging phase may happen much earlier during the shower evolution for other processes such as Z 0 Z 0 production. For these processes, further work is needed regarding the choice of the starting scales after the merging phase.
The kinematics for FSR also needs to be discussed in the case of merging. After the mergings have happened, the new dipole configuration might generate further FSR. Those emissions come either from the decay products of the two hard subsystems or from the final-state partons generated previously by ISR. When those particles radiate, they obtain a virtuality, which breaks momentum conservation. As described in Section 2.3, momentum conservation is recovered by boosting the resulting jets along the direction of their respective progenitor. Here, the invariant mass which is conserved is √ŝ . Unfortunately, this procedure may in general alter the individual kinematics of each hard system. In particular, there is no guarantee that the invariant mass of each hard system will be preserved, since it is the invariant mass of the new hard process which is conserved instead. This might be an issue since one expects the invariant masses to be distributed according to the cross section formula. Nevertheless, the situation becomes better in the case where the decay products of the hard systems are colour singlets. In this specific case no QCD radiation is attached to those particles so they do not get any virtuality. Thus, the invariant mass can be preserved. For instance, in the case of WW pair production, each W boson may decay into a pair of leptons. In the absence of an electroweak shower, no extra emissions are attached to those leptons. However, after merging, those leptons must be considered as FSR progenitors, even if they do not radiate. This is because they must balance the momenta of the jets in the equations that state global momentum conservation. The crucial point here is to consider the sum of the two leptons as a single FSR progenitor, and not each lepton on its own. It is this sum which is then used within the equations. The same boost is thus applied to both leptons. This ensures that the invariant mass of the lepton pair (which is the W mass in our instance) is preserved. A similar strategy unfortunately does not work in the case where the decay products are colour charged (e.g. Z 0 boson decaying into jets), since their virtuality will be modified due to additional QCD radiation.

Colour flow
In the case of merging, the colour flow needs to be corrected. This is due to the fact that ISR is performed as a backward evolution. Under the leading-colour approximation [106], which is the framework of current parton showers [2], each time a new parton is emitted, a new colour is generated. Thus, the colours of the partons that are meant to merge do not match. This would mean that the merging cannot occur, at least from a colour point of view. Therefore, the colours need to be matched. The main idea is illustrated in Figure 6. In this example, the backward evolution leads to a green antiquark which should be merged with a blue-purple gluon. This implies that the colours blue and green should be set to be equal. The strategy is to change the most recent colour, which is green in this instance. This aims to disturb the colour flow as little as possible. In the case of a double merging, the colour flow needs to be corrected twice. In this case, one needs to make sure that a colour is not modified more than once. Otherwise, this might lead to some final-state gluon with a colour equal to its anticolour (referred sometimes as "singlet gluons"), which should not be included under the leading-colour approximation. If such a situation appears, despite the precautions implemented, then the configuration must be vetoed, since such a gluon would cause problems during the hadronisation phase.
The treatment of colour flow in the case of merging is not fundamental in the context of a parton-level simulation. However, the modification of the colour flow due to a merging might affect the hadronisation phase significantly [2]. This is the reason why this issue has been addressed in this work.

Parton showering with different scales
As mentioned in Section 4.3, a realistic description of DPS with hard processes characterised by two different scales requires unequal-scale dPDFs. Such a set is not available yet. However, one can already extend the algorithm which has been proposed previously. The evolution of the dPDFs with two different factorisation scales has been discussed for instance in [28,107].
Let us consider two hard processes characterised by the scales µ 2 A and µ 2 B and initiated by four partons: i(x 1 ) and k(x 3 ) on one side, and j(x 2 ) and l(x 4 ) on the other side (recall Figure 2). The kinematics of the two hard processes can be selected using Equation with the unequal-scale dPDFs. The pairs ik and jl need to be evolved. The starting scales are now allowed to be different. Let us assume thatq 2 h,k <q 2 h,i . The strategy is to evolve parton i from the scaleq 2 h,i down toq 2 h,k by using the following branching probability which needs to be corrected by the appropriate Sudakov factor, as usual. This is nothing but the usual DGLAP equations, except that the sPDFs are replaced by the unequal-scale dPDFs and the upper boundary of the integral is now 1 − x 3 instead of unity. Thus, the evolution affects parton i only and parton k acts simply as a spectator. Once the scaleq 2 i has been brought to the scaleq 2 h,k , the scales are now equal. The evolution can then be carried on using the equal-scale dPDFs and the procedure described in Section 4.3.

Defining valence and sea components for the dPDFs
In event generators, one wants to be able to decompose the dPDFs involving u and d quarks into valence and sea components which have a literal interpretation as probabilities to find valence and sea quarks inside the proton. This is particularly important for some hadronisation models. For example, a bookkeeping of valence and sea quarks is required in order to establish the structure of the beam remnants in Pythia 8 [37]. Moreover, such a separation allows us to enforce some physical requirements inside the shower evolution of a pair of partons. For instance, the fact that the dPDF d v d v is identically zero at all scales ensures that the pair of partons cannot be evolved back to a pair of valence d quarks. This latter property is the reason why the valence and sea components are separated in the shower evolution 18 described in Section 4.3. Note that such a separation is not strictly necessary for the parton-level simulation that is presented here. However, we intend ultimately to incorporate this model into an existing event generator with a hadronisation model, and so make this valence-sea separation with this in mind.
The conventional definition of the valence-valence distribution is However, such a definition leads to negative values. This is an issue since one wants to be able to interpret u v u v as a probability density, which therefore must be positive-definite. Moreover, applying the conventional scheme given by Equation (4.10) assigns non-zero values to the splitting part [u v u v ] spl of the dPDF whereas one would expect [u v u v ] spl to be identically zero at all scales. Indeed, this parton configuration does not involve any sea quark so no terms due to a 1 → 2 perturbative splitting can contribute to the dPDF u v u v . Therefore, there is a need to define a new scheme. This scheme should satisfy the following properties: 1. The valence and sea components must be positive-definite.
2. They must sum up to the full dPDFs. For example, one should be able to write and so forth.
3. They must satisfy the intuitively expected dDGLAP evolution equations. For example, in the case of the u v d s distribution, the associated evolution equation should not contain any term that involves the distribution gd s . This is because the u quark is a valence quark so it cannot come from the branching of a gluon at a lower scale.
4. The valence-valence distributions must satisfy the intuitively expected number sum rules. More precisely, the valence-valence distribution q v q v must satisfy the relation 12) with N qv the number of valence quarks.
Such a scheme can be derived. The valence-valence, valence-sea and sea-sea components are defined as follows: for the intrinsic part, and for the splitting part. The scheme for the intrinsic part is the conventional one given by Equation (4.10). However, the scheme for the splitting part is more complicated. The idea is to set [u v u v ] spl to the desired value i.e. zero for all scales. More generally, the splitting part of any valence-valence component must be set to zero for all scales since these components do not get any contributions from 1 → 2 perturbative splittings. The other components are then constructed so that Properties (2) and (3) are satisfied. Also, the correct initial conditions must be applied. For example, the [u sū ] spl distribution must be initialised by the 1 → 2 splitting term since a gluon is allowed to split into a u sū pair. In contrast, the distributions [u s u s ] spl and [ūū] spl are initialised to zero since these configurations cannot originate from the splitting of a gluon. These initial conditions are guaranteed by the way the distributions are defined. This has the important consequence of breaking the symmetry between the distributions u s u s ,ūū and u sū . Note that this symmetry is maintained for the intrinsic part, as long as it is satisfied at the initial scale of the evolution. The same scheme is used for the d sector. There is no need to define such a scheme for the s, c and b sectors since the proton does not contain any s, c or b valence quark. However, one needs to take care of the distributions that couple u and d. In particular, the constraint [u v d v ] spl = 0 must be imposed since u v d v is a valence-valence component. The scheme for the intrinsic part is the conventional one. For the splitting part, the following one will be used: The last configurations which have not been specified yet can be defined according to the conventional scheme for both the intrinsic and the splitting parts. For example, one can write u v g = ug −ūg and u s g =ūg for a configuration involving a u quark and a gluon. It can be checked using the equations above that the scheme satisfies Property (2). The fact that the distribution u v d s verifies Property (3) is explicitly shown in Appendix A.
Similar steps to those given in Appendix A can be used to show that the other distributions also satisfy Property (3); for brevity, we do not present these explicitly here. Property (4) is strongly dependent on the inputs used for the dPDFs at the initial scale of the evolution (recall Equations (3.13) and (3.16)). In particular, the DGS set of y-dependent dPDFs generated in [54] does not satisfy Property (4). In order to approximately recover the number sum rules in the valence-valence sector, the initial conditions of the DGS set were modified. The modifications which were made as well as their impact on the sum rules are presented in Appendix B. Note that no specific modifications were made to improve the way the sum rules are verified in the other sectors.
A general argument that Property (1) holds can be given following a similar logic as the one presented in Section 5.3 of [108]. The distributions defined in the scheme satisfy the expected evolution equations (see Property (3)) and are all initialised with a value which is positive (or zero). The evolution equations state that the derivative of a dPDF with respect to the scale is equal to the convolution of some dPDFs with the regularised splitting kernels. If the splitting kernels are positive-definite, then the derivative will start with a positive value, which means that the dPDF will increase with the scale and therefore remain positive at higher values of the scale. At LO, the splitting kernels can be divided into a positive-definite part and a negative part 19 which is the virtual contribution to the kernel at z = 1. From the structure of the DGLAP equations, one can notice that the negative contribution to the evolution of a dPDF is proportional to the dPDF itself. Therefore, the negative contribution cannot change the sign of the dPDF, which thus remains positive.

Setups of the simulations
The simulation introduced in the previous section has been used to generate partonlevel events for W + W + pair production via DPS only. The results will be compared with Pythia 8 and with Herwig. The two hard processes are ud → W + → e + ν e and ud → W + → µ + ν µ . 20 For all simulations, the factorisation scale and the argument of the couplings used in the cross-section calculations are set to be equal to min( √ŝ A , √ŝ B ). The produced leptons are constrained to have a transverse momentump ⊥ > 20 GeV and a rapidity |η| < 5 in the centre-of-mass frame of the collision. It is assumed that the neutrinos can be exactly reconstructed. The set of LO sPDFs that is used is the 3-flavour MSTW2008 one [109,110]. In order to be consistent with the sPDFs, the scheme for the strong coupling α s will be the 3-flavour scheme developed by the same authors [111]. With this scheme, the value of α s at the Z 0 mass is α s (m Z ) = 0.126. For the production of the W bosons, only the channel ud → W + will be considered. However, the strange quark is included within the shower, as well as the u and d quarks. The shower cut-off of the simulation for both ISR and FSR is set to p ⊥,0 = 1 GeV (which means thatq > 2 GeV). For Pythia 8 and Herwig, the hadronisation phase, MPI and matrix-element corrections are switched off.
In the plots presented in the next section, the names refer to the following setups. The curves named "Pythia" and "Herwig" refer to the results obtained with Pythia 8.2.40 and Herwig 7.1.4 respectively. The implementation of the algorithm described in Section 4 will be referred to as "dShower". This simulation uses the set of y-dependent dPDFs generated in [54], with the scheme defined in Section 4.6. The setup "dSh-NoSpl" refers to the same setup as "dShower", but with the splitting part of the dPDFs set equal to zero i.e. F = F int . This implies that the possiblity for merging is switched off (i.e. p Mrg = 0). A comparison between those two setups will allow us to estimate the size of the contribution of the splitting part of the dPDFs. In order to measure the effect of the dPDFs y-dependence, 21 two other setups are defined. The idea is to use dPDFs that do not depend on y, unlike the DGS set. In these setups, the y-dependence is removed using a factorisation ansatz for the dPDFs. More precisely, the y-dependent part of the dPDFs is factorised into a function F (y). The first setup, "Fact", uses a product of sPDFs as in Equation (3.1). For this setup, a usual angular-ordered shower is added. The second setup, "GS09", instead employs the GS09 set (limited to three flavours) developed in [28]. For this last setup, no shower is added, since this would require dedicated work. 22 For the setups that are based on a factorisation ansatz, the value of the effective cross section σ eff needs to be consistent with the inputs for the dPDFs given in Equations (3.13) and (3.16). More specifically, the value of the effective cross section is directly linked to the values of the widths h ij which are given in Equation (3.15). In the case of W + W + production, the main contribution comes from the width h ud of the ud distribution. One can now find out which σ eff the value of h ud corresponds to. The effective cross section is given by Equation (3.8). For the function F (y), one can use the same Gaussian form factor as the one used in Equations (3.13) and (3.16). Note that this construction does not take into account the fact that the effective width of the ud distribution will gradually change during the evolution due to the mixing of this distribution with ug, gd and gg which have different widths. The value obtained for σ eff will thus be a rough estimate. The function F (y) can be decomposed into intrinsic and splitting parts, as for the dPDFs. Therefore, the effective cross section gets contributions from intrinsic × intrinsic, intrinsic × splitting and splitting × splitting pieces which are referred to in the literature as 2v2, 2v1 and 1v1 contributions respectively. In the case of W + W + production, the 2v2 contribution is the dominant one since one needs at least two pertubative splittings to reach the configuration ud (e.g. u → ug followed by g → dd) [54]. The contributions involving splitting parts will then be neglected for this calculation. In this context, the effective cross section can be approximated to be This leads to σ eff 8πh ud = 69.09 mb, which is the value that will be used in the following. This is larger than the values measured by CDF and D0. However, the value of h ud which 21 And not the cumulative effect of the dPDFs in addition to the parton shower. 22 The GS09 set includes contributions from 1 → 2 splittings and the shower needs to be consistent with this aspect.
has been used corresponds to fits to data for GPDs [102]. Equation (5.1) can be used to justify the value of the cut-off y cut = 8 GeV −1 defined in Section 4.2. Indeed, if one limits the integral in Equation (5.1) to the region defined by 0 < y < y cut , then one gets Thus, in the case of WW pair production, 99% of the full integral to infinity is taken into account with this value of the cut-off. One may worry about the 2v1 and 1v1 contributions. However, these contributions get at least one factor 1/y 2 from the splitting part, which makes the integral converge faster. The approximation y < y cut = 8 GeV −1 is therefore valid in the context of WW pair production.

Total DPS cross section and partonic luminosity
The first observable which will be studied is the total DPS cross section. The contribution from SPS is here omitted. For the setups using the y-dependent dPDFs, this means that the cross section is calculated using Equation (3.7) only. The subtracting term σ sub and the SPS cross section σ SPS introduced in Equation (3.17) are thus neglected. In the case of same-sign WW pair production, σ sub is irrelevant as it has been numerically shown in [54]. This is again because there is no direct LO splitting that leads to the configuration ud. The fact that σ sub can be neglected also implies that the dependence of σ DPS on the unphysical scale ν should be negligible. 23 In contrast, σ SPS may not be negligible. The results are presented in Table 1.
The first aspect which should be noticed is that the GS09 set, Pythia 8 and the setup dSh-NoSpl all lead to a smaller cross section than the one which could be predicted from the DPS pocket formula. This is due to the fact that these three setups include number effects and a suppression of the dPDFs near the kinematic limit. More precisely, GS09 and dSh-NoSpl use phase-space factors whereas Pythia 8 uses a rescaling of the PDFs, see Section 3.1. Both number effects and the kinematic suppressions result in a decrease of the cross section. The observation made for the GS09 set is consistent with the results presented in [62]. One may wonder why the setup Fact gives a slightly lower cross section than the one obtained with Equation (3.9). This is because the DPS pocket formula uses the total cross section for single W + production, which does not include the kinematic constraint x 1 + x 2 ≤ 1. This constraint is in contrast implemented for the setup Fact.
The dShower algorithm gives a higher cross section than that predicted with the DPS pocket formula. This is because the splitting part of the dPDFs is now included and the 2v1 and 1v1 terms enhance the cross section. The GS09 also includes the 2v1 and 1v1 contributions, albeit not correctly taking into account the y-dependence. However, the cross section for GS09 remains smaller than the one found with Equation (3.9). The main difference is that the 2v1 and 1v1 terms get geometrical enhancements when one uses 23  y-dependent dPDFs [40,47]. This explains why the cross section found for dShower is higher than the one obtained with the GS09 set. In order to better understand the differences between all the setups, one needs to consider a less inclusive quantity than the total cross section. Here, the partonic luminosity will be used. It is closely related to the total cross section, but offers the possibility to study a PDF set in different regions of phase space. In the case of DPS, it is defined as where here we use µ = ν = m W , hard system is produced at zero rapidity and the other one is at Y . This expression will be used for the setups dShower and dSh-NoSpl. If the setups involve dPDFs f ij that do not depend on y (e.g. GS09 and Pythia 8), then one can use Equation (3.8) to simplify the expression of the luminosity: If one now uses the ansatz given by Equation (3.1) (e.g. for the setup Fact), then one gets the following formula The luminosity L is plotted as a function of the rapidity Y of the second hard process in Figure 7. An important feature that appears in the ratio plot is the fact that all the ratios with respect to the setup Fact are roughly constant over a large band of rapidities, and then start to decrease around Y = 3. This is an effect of the kinematic suppression of the dPDFs. Indeed, the higher the value of the rapidity is, the closer to the kinematic limit the system is and the dPDFs get suppressed. This suppression is not present for the setup Fact.
The shape of the luminosity for the GS09 set requires some explanations. For low values of the rapidity, the kinematic suppression is not relevant so the 2v1 and 1v1 contributions raise the luminosity above the one obtained with a simple factorisation ansatz. However, for higher values of the rapidity, the phase-space factor of the GS09 dPDFs starts to suppress the dPDFs since the system is tending towards the kinematical limit. This results in a drop of the luminosity, which goes below the Fact luminosity.
The setups Fact, Pythia 24 and dSh-NoSpl only involve the 2v2 contributions. Therefore, one would expect similar shapes for the luminosity for these three setups. On one hand, Pythia 8 leads to a luminosity which is very close to the one obtained with the factorisation ansatz; the only difference being the drop at large Y for the Pythia setup which is caused by the kinematic suppression. On the other hand, the luminosity for dSh-NoSpl is significantly smaller than the two other ones. This is a direct effect of the evolution of the y-dependent dPDFs of the dSh-NoSpl setup. With our choice for the value of σ eff , one can consider that the three dPDF sets start at low scale with the same y-dependence for the dPDFs that involve two quarks. More precisely, this y-dependence is Gaussian and reads In the case of the setups Fact and Pythia, this Gaussian factor is integrated inside the luminosity expression and leaves behind the factor 1/σ eff . In particular, the y-dependence is fixed and equal to that at the starting scale. In the case of the dSh-NoSpl setup, the evolution alters the y-dependence of the intrinsic part of the dPDFs due to mixing with other dPDFs which have different widths. Thus, the shape is no longer Gaussian at high scales. As it can be seen in Figure 8, the tail of the y-distribution of the dPDFs has been dampened by the evolution. This effect has already been observed and discussed in [42]. The dampening is even stronger for large values of the rapidity. Since the luminosity at a given value of Y is the area under the curve represented in Figure 8, the dampening of the y-distribution results in a lower value of the luminosity. Thus, the fact that the dSh-NoSpl setup shows a smaller luminosity is a consequence of the evolution of the intrinsic part of the y-dependent dPDFs.

Asymmetry
A relevant observable for same-sign WW pair production via DPS is the lepton pseudorapidity asymmetry. It is defined as [62] where σ is the inclusive DPS cross section and η ,1 and η ,2 are the pseudorapidities of the two charged leptons in the centre-of-mass frame of the pp collision. A lot of attention has been brought to this observable during the last few years [25,62,65]. The asymmetry is sensitive to the correlations between the two hard systems. Indeed, if the two hard processes are completely independent, then there is no reason why the probability for the leptons to be emitted in the same hemisphere would differ from that to be emitted in different hemispheres. The resulting asymmetry is therefore equal to zero. In contrast, 24 The rescaling of the PDFs in Pythia 8 might recreate some 2v1 or 1v1 features, however. with parton correlations, the fact that a lepton is produced in one hemisphere affects the probability for the second lepton to be located in the same hemisphere.
In order to probe large momentum fractions, it is useful to define a minimum value for the rapidity η min such that |η ,1 |, |η ,2 | > η min . For high values of η min , the parton correlations should contribute significantly and a large asymmetry should be observed. The asymmetry is given as a function of η min in Figure 9 for all the different setups. It can be noticed that all the setups which include a kinematic suppression of the dPDFs as well as number effects lead to a significant asymmetry which increases with η min . In contrast, the setups Fact and Herwig give an asymmetry which is roughly constant and close to zero. For the setup Fact, the two hard systems are completely independent so this result is expected. The DPS model implemented within Herwig also uses two independent hard processes. However, the backward evolution of the second hard process is performed using a modified version of the sPDFs, as mentioned in Section 3.1. This might explain the rise of the asymmetry for large values of η min .
The dShower setup gives the largest asymmetry so including y-dependent 2v1 and 1v1 contributions seem to enhance parton correlations. As explained in the previous section, the GS09 setup does not lead to the same asymmetry since this setup does not employ y-dependent dPDFs. The effects of the 2v1 and 1v1 contributions are thus less significant for this setup. The dSh-NoSpl setup gives a smaller asymmetry than the one obtained with dShower most likely because it only includes the 2v2 terms. It is somewhat surprising that Pythia 8 leads to a higher asymmetry than the ones generated with the GS09 and the dSh-NoSpl setups. This is perhaps linked to the way the dPDFs are implemented inside Pythia 8, see Section 3.1. The effects of including the 1 → 2 splittings with the y-dependence has already been investigated in [112]. It was found there that the y-dependent 2v1 and 1v1 contributions increase the asymmetry, which is consistent with our observation.

Event shapes
This section will end with some event shapes for √ s = 14 TeV. In all the histograms which will be presented, the error bands represent the statistical errors due to the use of Monte-Carlo techniques. The first one is the mass spectrum of the W bosons represented in Figure 10. This plot is introduced for validation purposes. The mass spectrum is given for the dShower setup with and without shower. One can see that the two mass spectra exactly match. This is an important result since it means that the parton shower with mergings does not alter the mass distribution of the W bosons, whose shape is determined by the cross section of the hard process, recall Equation (3.7). This has been achieved with the procedure described in Section 4.4. The mass spectrum obtained with Pythia 8 is also represented for comparison. More validation plots are given in Appendix C.
Let us now study some pseudorapidity distributions. In Figure 11a, the pseudorapidity distribution of the two charged leptons is given. This distribution is sensitive to the sup- pression of the dPDFs near the kinematic boundaries. Indeed, leptons with high rapidities imply that the system reaches the kinematic boundaries and suppressions occur. Therefore, a setup which includes kinematic suppressions of the dPDFs should have fewer events that contain leptons with high rapidities. This is what is observed on the plot. The setup Fact, which does not include any kinematic suppression, is above the other setups in the high-rapidity ranges, whereas it is below in the central region. This distribution has already been discussed in [25,62].
The pseudorapidity distribution of the charged particles produced in the event is given in Figure 11b. This observable is less sensitive to the kinematic suppression of the dPDFs. Indeed, it can be noticed that the setups Fact and Pythia give similar results in the central region, even if the Pythia setup includes some suppression effects whereas the setup Fact does not. Moreover, it seems that using a set of y-dependent dPDFs makes a modest difference. However, as for the previous rapidity distribution, including the splitting part of the dPDFs does not have a strong effect. Indeed, on both plots of Figure 11, the setups dShower and dSh-NoSpl overlap.
In Figures 12 and 13, some properties of the WW pair are presented. First, let us discuss the pseudorapidity difference of the two W bosons given in Figure 12. This observable is closely related to the asymmetry discussed earlier and is thus sensitive to parton correlations. The main idea is that in presence of suppression effects, the system will try to avoid the kinematic boundaries by producing the W bosons far apart i.e. with a large rapidity difference. This is what can be observed on the plot where the setups dShower, dSh-NoSpl and Pythia generate more events with large rapidity differences than the Fact setup that does not have any suppression effect. It is interesting to notice that including the splitting part of the dPDFs makes a difference for this distribution. This difference was already present on the asymmetry plot. The last observable which will be discussed is the transverse momentum of the WW pair, see Figure 13. This observable is interesting since it is directly linked to the characteristics of the shower. Indeed, at LO, the DPS cross section for WW pair production predicts that the W bosons are created with zero transverse momenta. The two W bosons actually get a transverse momentum by recoiling against the partons which have been generated by ISR. Their transverse momenta is thus a pure product of the shower. The most important piece of information that should be extracted from Figure 13 is that the dShower setup leads to fewer events with a small p WW ⊥ than the dSh-NoSpl setup. The most sensible explanation is that including the splitting part of the dPDFs results in a larger Sudakov factor and thus in a stronger suppression at small p ⊥ . To see that, one needs to recall the branching probability for ISR defined in the case of DPS by Equation (4.3). The strength of the Sudakov factor is determined by the following quantity where a pair of partons ij with momentum fractions x 1 and x 2 evolves backwards into a pair i j with fractions x 1 /z and x 2 via the QCD branching i → i. For the case of W + W + production, the most relevant branchings are g →d and g → u. In the case of the setups dShower and dSh-NoSpl, the splitting kernels are exactly the same so the differences are necessarily due to the ratio of dPDFs. In order to investigate the differences between the two setups, the dPDF ratio gu/du, corresponding to the backward branching g →d in the presence of a u quark, is plotted as a function ofq in Figure 14. It can be seen that including the splitting part of the dPDFs may considerably increase the dPDF ratio. More precisely,  Figure 12. Difference between the pseudorapidities of the two W bosons. The setup dShower is the reference in the ratio plot.
the lower the value of y is, the larger the ratio is. This might have been expected since the splitting part of the gu distribution behaves like 1/y 2 . In the case of the dSh-NoSpl setup, where only the intrinsic part of the dPDFs is taken into account, the ratio is close to the one obtained with a factorisation ansatz. Moreover, this ratio does not seem to be too sensitive to the value of y. Similar behaviours are seen for the ratio gd/dd with ad quark, as well as for the ratios gu/uu and gd/ud corresponding to the backward branching g → u in the presence of a u quark and ad quark respectively. These observations are consistent with what can be observed on the p ⊥ spectrum. Indeed, the setups dSh-NoSpl and Fact have similar dPDF ratios, which results in a similar Sudakov suppression at small p ⊥ . In contrast, the dShower setup includes the splitting part which leads to a larger ratio and thus to a stronger suppression at small p ⊥ .
A similar difference between the setups dShower and dSh-NoSpl can be observed for high values of p WW ⊥ . More precisely, the setup dShower generates more events with a large p WW ⊥ than the dSh-NoSpl setup. The reason is the same. For high p ⊥ values, the Sudakov factor is close to unity and is thus irrelevant in that region. The probability to have an emission at high p ⊥ is therefore driven by the terms which are present in Equation (5.8).
Since including the splitting part of the dPDFs considerably enhances the dPDF ratio, it results that the dShower setup has a higher probability to emit in the high p ⊥ region than the dSh-NoSpl setup.

Summary
In this work, a parton-level simulation of DPS has been introduced. This simulation is based on the general DGS framework and includes the 1 → 2 perturbative splittings as well   as the y-dependence. Some first numerical results have been obtained by combining the set of y-dependent dPDFs developed in [54] with an angular-ordered shower. Several issues relative to parton showering in the context of DPS have been addressed. More specifically, a lot of work has been dedicated to the kinematics and the treatment of the colour flow in the case of merging. As a by-product of this work, a new scheme for defining the valence and sea components of the dPDFs has been proposed. This scheme is consistent with the evolution equations and ensures that each component is initialised with a positive value. The simulation has been used to study same-sign WW pair production. The results show some differences with the conventional approaches to dealing with DPS. In particular, using a y-dependent set of dPDFs clearly enhances the 2v1 and 1v1 contributions to the cross section, as has been predicted in [40,47]. This results in a larger cross section and in a larger asymmetry. From a parton-shower point of view, the y-dependence of the splitting part seems to have a sizeable effect on the dPDF ratios involved in the branching probabilities. In the case of W + W + production, this leads to a stronger Sudakov suppression at small p ⊥ values as well as a higher probability to emit at large p ⊥ values. Aside from these interesting effects, the impact of including the mergings is rather small. This is because of our choice of process. Indeed, in the case of WW pair production, a merging is possible only if at least one emission has happened beforehand during the backward evolution. In order to more clearly see the impact of the mergings, it would be more interesting in the future to study other processes such as Z 0 Z 0 production which contains partonic configurations like uū or dd which can be more easily merged.
Several directions can be indicated for future works. The most important one is the development of a shower evolution with two different hard scales. With the current state of the simulation, the only DPS processes which can be studied are the ones that involve two hard systems with comparable scales such as WW or Z 0 Z 0 production. It would be very helpful to be able to study other processes such as W + 2 jets or 4-jet production, where the two scales are usually very different.
The kinematics of the shower in case of merging still requires some further work. Indeed, the one which has been established in this work preserves the invariant mass of the resonance only if the decay products of that resonance are all colour singlets. Therefore, it is necessary to develop in the future a procedure that preserves the invariant mass whatever the colour charge of the decay products is. This aspect is relevant if one wants to study Z 0 Z 0 production with at least one of the Z 0 bosons decaying into jets. It is recalled here that it is the FSR kinematics which is creating problems, whereas the ISR one seems to be performing as expected.
The simulation should also be improved in order to allow more realistic studies. A first upgrade would be for example to extend the quark content from three to five flavours. The DGS set of y-dependent dPDFs has already been extended to include five flavours [112]. The remaining task would be to include the mass thresholds inside the shower evolution. A second upgrade would be to link the simulation to a hadronisation model so that it generates hadronic final states. This would allow to study the impact of the mergings on the colour flow.
Finally, some work is needed to implement Equation (3.17) inside the simulation in a consistent way. Naively, the simulation should be able to switch from a DPS description to an SPS description for y 1/Q h , with Q h , the hard scale. Such a simulation would allow for example to model WW pair production with both DPS and SPS processes taken into account. Including the subtraction term from Equation (3.17) should also remove any dependence on the unphysical scale ν. Some techniques used in event generators for the matching-and-merging procedure [113][114][115][116][117][118][119][120][121][122][123] might be helpful to achieve such a goal.
The ideas developed in the context of this simulation could be used to try to improve the current MPI models. For example, a system containing n different interactions could be divided into n(n−1)/2 pairs of interactions. Each one of these pairs can be seen as a DPS so the approach introduced in this work could be applied to each pair separately. The shower evolution developed for DPS and the whole merging procedure could then be embedded inside the MPI evolution. This would hopefully include some of the parton correlations.

Acknowledgments
BC would like to thank Michael Seymour for his advice, comments and thoughtful remarks towards this work, as well as Torbjörn Sjöstrand and Peter Skands for useful and inspiring discussions. The help of Matthew De Angelis is also greatly acknowledged. This work has received funding from the European Union's Horizon 2020 research and innovation programme as part of the Marie Skłodowska-Curie Innovative Training Network MCnetITN3 (grant agreement no. 722104). The histograms have been produced with Rivet [124] and the sketches with Axodraw [125].

A Evolution equations for valence and sea components
Let us take the example of the u v d s distribution. For this specific dPDF, one expects the following dDGLAP equation which is basically Equation (3.10) with the following notation 25 In order to show that the distribution u v d s defined with the scheme given in Section 4.6 actually satisfies Equation (A.1), the starting point is to write the dDGLAP equations for 25 The operator ⊗ 2 is defined in a similar way.  The proof for the other distributions follows a similar reasoning. One needs to write the dDGLAP equations for the plain dPDFs and combine them according to the definitions of the valence and sea distributions given in the scheme.

B Initial conditions and number sum rules
With the initial conditions given by Equation (3.13), the DGS set of y-dependent dPDFs does not satisfy the number sum rules. To see that, let us focus on the d v d v dPDF. In Figure 15, the evolution of d v d v as a function of the scale µ is given. One can see that the quantity d v d v generated with the input (3.13) is not identically zero. As mentioned in Section 3.1, this is unphysical. Fortunately, this issue can be solved by modifying the initial conditions for the intrinsic part. Indeed, it is argued in [28] that one should subtract from Equation (3.13) for the dd distribution the following term This has the effect of removing the valence-valence contribution from the full dd dPDF at the starting scale µ 0 and thus forbidding this configuration. Since the homogeneous dDGLAP equations preserve the sum rules [28], this constraint should be present at higher scales too. For the u sector, one can apply the same method. More specifically, the initial conditions should take into account the fact that extracting a valence u quark halves the probability of finding a second valence u quark. This is fulfilled by subtracting the following term from Equation (3.13) for the uu distribution It is shown in [28] that including these subtraction terms, referred to as "number effect" terms, considerably improve the way the dPDFs describe the GS sum rules, at least for dPDFs that do not depend on y. For y-dependent dPDFs, this procedure seems to work too. Indeed, in Figure 15, it can be seen that the initial conditions which include the number effect terms lead to a d v d v distribution which is zero for all scales, as wanted. This automatically implies that the number sum rule for the d v d v distribution given in Section 4.6 (see Property (4)) is satisfied. In Figure 16, the quantity is represented as a function of x 2 for the two types of initial conditions. According to the sum rule stated for the u v u v distribution, this quantity should be equal to the distribution u v (x 2 , µ 2 ), which is also given in the figure. It appears that the initial conditions which include the number effect terms improve significantly the way the sum rule for u v u v is verified.

C Validation plots and scale variations
In Figures 17 and 18, the pseudorapidities η of the leptons, the rapidities y W of the W bosons and the asymmetry are given for the dShower setup with and without shower. Those observables are determined by the DPS cross section (3.7) and the shower should not affect their shape too much. The issue is that the procedure introduced in Section 4.4 shuffles around the individual kinematics of each hard process. A solution has been found to preserve the invariant masses of the W bosons, recall Figure 10. In contrast, no solution has been implemented yet in order to conserve the values of the rapidities which hence might be altered by the shower. This can be seen in Figure 17, especially for the central region of the y W distribution. Fortunately, the differences remain modest. The asymmetry seems to be slightly affected by the shower for large values of η min . It is important to specify here that the higher η min is, the lower the statistics is since more events get vetoed. Figure 15. Valence-valence component of dd for two different kinds of initial conditions. The distribution in blue has been generated with initial conditions that include the number effect terms, whereas the one in red use initial conditions that do not include any number effect term. Figure 16. Verification of the number sum rule for u v u v for the two types of initial conditions. The blue and red curves give the evolution of the distribution defined in Equation (B.3) as a function of x 2 for µ = ν = 80 GeV. The green curve gives the distribution u v (x 2 , µ 2 ). According to the number sum rule, the ratio should be equal to unity.
It has been explained in Section 3.2 that the ν-dependences of the DPS cross section and of the subtraction term in Equation (3.17) cancel each other, at least order by order. The problem is that the simulation of DPS introduced in this work does not include this subtraction term. Therefore, the results of the simulation are sensitive to the choice made for the scale ν. This sensitivity can be assessed by varying the scale ν. It is customary to vary the scale upwards (ν ← 2ν) and downwards (ν ← ν/2). In our case, the variation upwards is not possible. Indeed, the scale ν was originally chosen to be the hard scale Q h , see Section 4.2. Hence, a variation upwards implies that the scale ν is now 2Q h and it might happen that the value of µ y is larger than the hard scale Q h , since the lower limit for the range of y values is now b 0 /(2Q h ). Unfortunately, the approach developed in Section 4.3 cannot handle such a configuration because it relies on the fact that µ y < Q h . Nevertheless, a variation downwards is possible. The results of the scale variation are given in Figures 19  and 20. It can be seen that the scale variation does not affect the event shapes. This is because the dominant contribution in the case of W + W + production is the 2v2 part of the cross section, which is not too sensitive to the value of ν. The asymmetry is nonetheless slightly reduced.