PineAPPL: combining EW and QCD corrections for fast evaluation of LHC processes

We introduce PineAPPL, a library that produces fast-interpolation grids of physical cross sections, computed with a general-purpose Monte Carlo generator, accurate to fixed order in the strong, electroweak, and combined strong-electroweak couplings. We demonstrate this unique ability, which distinguishes PineAPPL from similar software available in the literature, by interfacing it to MadGraph5_aMC@NLO. We compute fast-interpolation grids, accurate to next-to-leading order in the strong and electroweak couplings, for a representative set of LHC processes for which EW corrections may have a sizeable effect on the accuracy of the corresponding theoretical predictions. We formulate a recommendation on the format of the experimental deliverables so that they can be consistently compared with computations that incorporate EW corrections, and specifically so that parton distribution functions can be determined to the same accuracy.


Introduction
With the recent completion of Run II, the Large Hadron Collider (LHC) has accumulated data from an integrated luminosity of approximately 150 fb⁻¹ [1]. This represents only a small fraction of the anticipated 3000 fb⁻¹ that will eventually be recorded in the forthcoming twenty years of LHC operation. Nevertheless, the statistical uncertainty of the data has already shrunk to unprecedentedly small values, typically 1 % or less, a fact that will allow for precision tests of the Standard Model (SM) and for indirect searches of New Physics only if theoretical predictions become comparatively precise. This entails the computation of additional higher-order contributions to the fixed-order perturbative expansion, on the one hand, and an increasingly sophisticated determination of the Parton Distribution Functions (PDFs) of the proton [2,3], on the other hand.
In the first respect, because Quantum Chromodynamics (QCD) dominates the interactions occurring within colliding protons at the LHC, much effort has been devoted to the computation of higher-order QCD corrections: fully-differential next-to-leading order (NLO) results, possibly matched to a parton shower, are currently automated in various general-purpose Monte Carlo generators [4][5][6] (see also ref. [7] for a review), while an increasing number of next-to-next-to-leading order (NNLO) predictions are becoming available for processes with various degrees of inclusiveness (see e.g. ref. [8] and references therein). The computation of higher-order corrections in the electroweak (EW) and combined QCD+EW theory has also witnessed a comprehensive progress. Frameworks were developed in which the QCD and EW couplings are simultaneously treated as small parameters in the perturbative expansion, and the computation of theoretical predictions, accurate to NLO in both (including multi-coupling QCD-EW terms), is automated [9][10][11]. For an extensive and recent review, see ref. [12].
In the second respect, contemporary PDF sets [13][14][15] incorporate a significant amount of LHC data, which is analysed with NNLO QCD theory by default. No EW corrections are systematically included in the theoretical description of the experimental observables to which PDFs are optimised, except for QED effects if a photon PDF is determined [16][17][18][19][20]. The resulting relative PDF uncertainty, which accounts only for the uncertainty of the data and for residual methodological inefficiencies inherent to each PDF determination, can be as low as 1 % at the EW scale [14]. Theoretical uncertainties, possibly of comparable size (e.g. from missing higher-order terms in the QCD perturbative expansion), have started to be incorporated into PDF uncertainties only very recently [21,22].
The two respects are intertwined, as they both concur to determine the accuracy of the theoretical predictions that are matched to the precision of the data. In particular, taking advantage of the automation pioneered in refs. [9][10][11], perturbative corrections that arise from the simultaneous expansion in both the QCD and EW couplings should start to be incorporated in calculations for LHC processes as standard. The reason is twofold. First, one expects NNLO QCD and NLO EW corrections to be of comparable size because, at the EW scale, the QCD and EW running couplings become similar, α_s² ∼ α. If NNLO QCD corrections are included by default in the computations, NLO EW corrections should be taken into account as well. Second, the virtual exchange of soft or collinear weak bosons leads to Sudakov logarithms [23,24] (see also ref. [12] and references therein), which can make the coefficients of the EW series grow faster than their QCD counterparts. This behaviour is relevant in phase-space regions associated with large mass scales (roughly of the order of a TeV), where several LHC data sets (e.g. Z-boson transverse-momentum distributions) enter both the validation of the SM and the search for new physics.
The consistent inclusion of QCD and EW corrections in precision computations for LHC processes entails the solution of two separate problems. First, a problem of efficiency: fast-interpolation grids should be constructed, whereby partonic matrix elements are stored independently of the PDFs, so that hadronic cross sections can be recomputed quickly for any PDF set and scale choice.

• PineAPPL offers an easy-to-use interface written in the C programming language that allows MC integrators to read and write PineAPPL grids. C was chosen because it can be easily interfaced with both Fortran and C++, the two main programming languages in which most MC integrators are written. The interface consists of roughly thirty functions, of which only a handful are needed in practice. We also provide and support a Python package based on this C interface. See appendix A.3 for an example.
• PineAPPL has been explicitly interfaced to mg5_aMC (v3+) [5,11],¹ for which it supersedes aMCfast [29]; see appendix A.2 for an example of a runcard. While the usage of mg5_aMC(v3+) + PineAPPL is similar to that of mg5_aMC(v2) + aMCfast + APPLgrid, it is, however, more efficient. For instance, the latter typically required a substantial amount of memory (close to 120 GB for Drell-Yan) if a grid for a differential distribution with more than 10 bins was generated. The memory usage is substantially reduced (to roughly 1 GB) after 'optimisation' of the grids, i.e. an optimisation of the grid representation in memory. However, this required a two-step procedure: first, producing an unoptimised grid in order to identify those parts where the cross section is either zero or extremely suppressed; second, after those parts are removed, filling an optimised grid with a small number of bins. PineAPPL avoids this by using a more space-efficient representation from the start. This leads to substantially faster run times, in particular for simple processes, because the grids do not need to be optimised and their combination is faster.

Cross sections in a multi-coupling expansion
Fixed-order partonic cross sections for a process a + b → X supported by PineAPPL are written as an expansion in powers of the strong coupling α_s, the electromagnetic coupling α, and the logarithms of ξ_R = µ_R/Q and ξ_F = µ_F/Q. This cross section is differential with respect to the observable O, which, in general, is a function of the phase space and subject to the usual conditions (soft and collinear safety, etc.). In experiments, but also for many calculations where the phase-space integration is performed using MC techniques, finite statistics does not allow for the exact reconstruction of the dependence of the cross section on the observable O; instead, it is sufficient to reconstruct it in bins of O.

¹ A version of mg5_aMC interfaced to PineAPPL can be downloaded from https://launchpad.net/mg5amc-pineappl/trunk, and it will be included as standard in the next mg5_aMC release.

The Ellis-Sexton scale Q², if chosen dynamically, depends on the phase space; we nevertheless assume the fractions ξ_R and ξ_F to be phase-space constants in any case. This allows for variations around the central scale choice ξ_R = ξ_F = 1, but it does not otherwise allow for arbitrary changes of the scales. The terms with powers m > 0 and n > 0 vanish for the central scale choice and are only required for variations of the factorisation and renormalisation scales. To estimate the perturbative QCD uncertainty (no EW uncertainty is covered by this method) one typically uses scale variations. Common prescriptions are the 7-point and 9-point scale variations, which evaluate the cross section using respectively the following values:

(ξ_R, ξ_F)_{7-pt} ∈ {(1, 1), (1/2, 1/2), (2, 2), (1/2, 1), (1, 1/2), (2, 1), (1, 2)};   (2.4)

(ξ_R, ξ_F)_{9-pt} ∈ {(1, 1), (1/2, 1/2), (2, 2), (1/2, 1), (1, 1/2), (2, 1), (1, 2), (1/2, 2), (2, 1/2)}.   (2.5)

The (asymmetric) uncertainties are then given as the minimum and maximum values (the envelope), measured from the central value (1, 1). As is clear from eq.
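To make the envelope prescription concrete, the following sketch evaluates a toy cross section at the 7- and 9-point scale choices of eqs. (2.4) and (2.5) and extracts the asymmetric uncertainties. The toy function and all names are purely illustrative and are not part of the PineAPPL API.

```python
import math

# 7- and 9-point scale-variation sets, as in eqs. (2.4) and (2.5).
SEVEN_POINT = [(1, 1), (0.5, 0.5), (2, 2), (0.5, 1), (1, 0.5), (2, 1), (1, 2)]
NINE_POINT = SEVEN_POINT + [(0.5, 2), (2, 0.5)]

def scale_envelope(xsec, points):
    """Return (central, lower, upper): the envelope of scale variations
    measured from the central value at (1, 1)."""
    central = xsec(1.0, 1.0)
    values = [xsec(xi_r, xi_f) for xi_r, xi_f in points]
    return central, min(values) - central, max(values) - central

# Toy cross section (pb) with a mild logarithmic scale dependence;
# a real application would convolve a grid with a PDF set instead.
def toy_xsec(xi_r, xi_f):
    return 100.0 * (1.0 + 0.05 * math.log(xi_r) - 0.03 * math.log(xi_f))

central, lo, hi = scale_envelope(toy_xsec, NINE_POINT)
```

For this toy function the extreme values are reached at the two extra 9-point combinations (1/2, 2) and (2, 1/2), which is why the 9-point envelope can be wider than the 7-point one.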
(2.1), the EW coupling α is assumed not to vary dynamically, but to be constant over phase space. This, however, covers the most common choices of the coupling, which are (not necessarily in this order) α(0), α(M_Z), and α_{G_µ}. The problem that PineAPPL solves can now be stated: approximately reconstruct the functions w_ab^{(k,l,m,n,o)}(x_1, x_2, Q²) (2.6) of eq. (2.2) from a set of N evaluations, for specific momentum fractions, scales, and values of the observable given by the MC integrator, together with the corresponding values of the weights, eq. (2.6). This problem is solved by finding an appropriate representation of eq. (2.7), as described in section 2.2.1. Using eqs. (2.1) and (2.8), PineAPPL can then quickly calculate hadronic cross sections for an arbitrary number of PDF sets and perform scale variations.
Note that we have omitted the dependence of the weights w, the observable O, and the scales µ_F, µ_R, and Q² on the specific kinematics for which they are computed. Indeed, beyond LO, different kinematic configurations have to be considered (in ref. [29], for example, they are labelled with an index α, see eq. (12) therein). In the FKS subtraction scheme [34,35] employed in mg5_aMC, one set of kinematics is needed for each counterterm (soft, collinear, and soft-collinear); however, this is not the general case. In Catani-Seymour subtraction [36,37], for example, different dipoles have different phase spaces and therefore different scales. PineAPPL remains completely blind to this fact, and a consistent treatment is ensured by filling each event into the grid using the numerical value of Q².

Grid representations
We now explain the details of how the phase-space weights w_ab in eq. (2.3) are represented.

4-tuples.
A straightforward representation is given by 4-tuples, i.e. a list of the momentum fractions x_1 and x_2, the scale Q², and the phase-space weight w for each phase-space point; 4-tuples are sufficient to reconstruct the differential cross sections. For each combination (a, b, k, l, m, n, o) (see section 2.2 for their definition) we save the following 4-tuples,

(x_{1,i}, x_{2,i}, Q²_i, w_i),  i = 1, …, N.   (2.9)

The reconstruction of the differential cross sections is then done by simply multiplying each phase-space weight w with the PDFs evaluated at the arguments given in the 4-tuple, and summing over all indices a, b, k, l, m, n, o, and i. The 4-tuple representation is very easy to implement and test. Furthermore, it reproduces the exact numerical value that is also calculated by the MC integrator. However, the price one has to pay is the size of the 4-tuples. For example, NLO QCD+EW Drell-Yan lepton-pair production (see section 2.3.1 and section 3.2.1) needs 159 MB of storage for a target precision of 1 % of the integrated cross section. While this is an acceptable size, increasing the precision by an order of magnitude would require roughly 100 times that size, due to the Monte Carlo convergence that goes as 1/√N, with N the number of 4-tuples. With increasing size the speed of the convolution also degrades, because it essentially becomes bound by the speed with which the 4-tuples can be read from disk. However, due to the uncompressed nature of this representation, it can serve as an intermediate format to develop and quickly cross-check more space-efficient representations, one of which we discuss next.
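The 4-tuple convolution just described can be sketched as follows. The toy PDF stands in for a real set (e.g. one accessed through LHAPDF), and the stored tuples are invented for illustration; indices (a, b, k, l, m, n, o) and the coupling factors are suppressed.

```python
# Minimal sketch of reconstructing a cross section from stored 4-tuples
# (x1, x2, Q2, w): multiply each weight by PDFs evaluated at the stored
# arguments and sum. All inputs here are illustrative.

def convolve(tuples, pdf1, pdf2):
    """Sum w * f1(x1, Q2) * f2(x2, Q2) over all stored phase-space points."""
    return sum(w * pdf1(x1, q2) * pdf2(x2, q2) for x1, x2, q2, w in tuples)

# Toy PDF: f(x) ~ (1 - x)^3 / x, ignoring the Q2 dependence.
def toy_pdf(x, q2):
    return (1.0 - x) ** 3 / x

tuples = [
    (0.10, 0.2, 8315.0, 0.5),  # (x1, x2, Q2 in GeV^2, weight)
    (0.05, 0.3, 8315.0, 0.2),
]
result = convolve(tuples, toy_pdf, toy_pdf)
```

Because the sum runs over every stored point, the cost of this representation grows linearly with N, which is exactly the size and speed limitation discussed above.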

Lagrange-interpolation grid.
A different strategy is to partition a subset H of the (x_1, x_2, Q²) space along each axis into a small number of bins and to insert the phase-space weights w into the corresponding discrete bins. Using the bin centres and their values one already has a straightforward representation of eq. (2.6), but given a finite number of bins this approach usually yields an insufficient approximation of the cross section. Increasing the number of bins improves the precision, but it also increases the space requirements. This problem is in turn solved using interpolation methods, which increase the precision for the same number of bins. We use the 'Lagrange-interpolation grid' method presented in ref. [25] with the parameters published in ref. [29], which give sufficient precision (see section 3). For the sake of completeness, the following summarises the interpolation algorithm.
This method first maps (x_1, x_2, Q²) → (y_1, y_2, τ), with y_i = y(x_i) and τ = τ(Q²). (2.11) The function y(x) maps events with large x effectively linearly and events with small x effectively logarithmically onto y. This reflects our knowledge of PDFs, which behave differently in those two regions, and thereby increases the precision of the interpolation. For the convolution of a grid with a PDF set the inverse functions are also needed; they involve W_0(x), the principal branch of the Lambert W function (or product logarithm), which satisfies the relation W(x) exp(W(x)) = x. Following ref. [25] (see eq. (17) therein), we furthermore divide the weights, before filling them into the grid, by a function ω(x_1, x_2). This flattens the interpolated function in the region x → 1, where the PDFs are small and tend towards zero, and enhances it in the small-x region. The effect of this step is an improvement of the precision that depends on the initial states and on the process, but it can be as large as a factor of 10 to 100 (one or two more correct digits compared to the MC result). Before performing a convolution this step is inverted by simply multiplying the interpolated grid values with ω(x_1, x_2). The final step is filling the weights into the grid, which maps the variables (y_1, y_2, τ) onto the 3-dimensional Lagrange-interpolation grid with N_y = 50 points in each y direction and N_τ = 30 points in the τ direction. The interpolation orders s_y and s_τ are 3 in each dimension, and only the subspace H defined above is covered by the grid. To illustrate the filling step we give an example in figure 1, where, for simplicity, we have chosen a static scale, so that we do not need to interpolate in the τ direction, and where we have also limited the number of grid points to N_y = 10. Each grid point has an associated numerical value a_{i,j}, and the set of all numerical values {a_{i,j}} for all grid indices (i, j) ∈ [0, N_y) × [0, N_y) constitutes 'the grid'.
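The x ↔ y change of variables above can be sketched as follows, assuming the common form y(x) = ln(1/x) + a(1 − x); the parameter value a = 5 is an illustrative choice, not necessarily PineAPPL's default. The exact inverse is formally given by the Lambert W function, but here it is obtained by Newton iteration, which avoids any special-function dependency.

```python
import math

A = 5.0  # illustrative steepness parameter of the assumed mapping

def y_of_x(x):
    """y(x) = ln(1/x) + A*(1 - x): logarithmic at small x, ~linear near x = 1."""
    return math.log(1.0 / x) + A * (1.0 - x)

def x_of_y(y, tol=1e-12):
    """Invert y(x) by Newton's method; y(x) is strictly decreasing on (0, 1],
    so the iteration converges monotonically from the initial guess below."""
    x = math.exp(-y)  # ignore the linear term for the initial guess
    for _ in range(100):
        f = y_of_x(x) - y
        fp = -1.0 / x - A  # dy/dx
        step = f / fp
        x -= step
        if abs(step) < tol * x:
            break
    return x
```

Because f(x) = y(x) − y is convex and decreasing, the Newton iterates increase monotonically towards the root and never leave (0, 1], so no safeguarding is needed for this sketch.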
Inserting a specific weight W = w(x_1, x_2)/ω(x_1, x_2) into the grid is shown in figure 1 as a small black square inside the larger grey one.

[Figure 1. Example of a 2-dimensional 10 × 10 grid, which is being filled at the location marked with the small black square at (5.8, 4.8). Each side of the grey square starting at k_i = 4 and k_j = 3 has a length of s_y + 1 = 4. This square marks the grid values (grey dots) that are being updated using eq. (2.14).]

We have defined the grid points at specific positions, but the points given by the MC will land somewhere between them. The interpolation order s_y then defines a square with side length s_y + 1 around its centre (u(y(x_1)), u(y(x_2))), given by the MC. All grid points with indices (i, j) covered by the grey square are then updated according to eq. (2.14), with the Lagrange basis functions of eq. (2.15), where the product runs over all indices of the grid points covered by the grey square in figure 1, starting from the smallest indices in the square, k_i and k_j. Finally, we remap

u(y) = (y − y_min)/∆y,   (2.16)

using y_min = y(x_max), y_max = y(x_min), and the grid spacing ∆y = (y_max − y_min)/(N_y − 1), so that the integer part of u(y) gives the grid-point index, e.g. u(y_min) = 0 and u(y_max) = N_y − 1, and the fractional part of u(y) gives the relative location between the nearest grid points.
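The filling step can be sketched in one dimension: a weight w landing at continuous grid coordinate u is spread over the s + 1 = 4 nearest nodes with Lagrange basis functions, so that later sums over nodes reproduce polynomials of degree ≤ s exactly. The interpolation order 3 follows the text; the grid size, the clamping of the window at the edges, and all names are illustrative.

```python
def lagrange_basis(nodes, i, u):
    """L_i(u) for the given integer nodes: product over j != i of
    (u - n_j)/(n_i - n_j)."""
    num, den = 1.0, 1.0
    for j, nj in enumerate(nodes):
        if j != i:
            num *= u - nj
            den *= nodes[i] - nj
    return num / den

def fill(grid, u, w, order=3):
    """Distribute weight w at coordinate u over order+1 neighbouring nodes,
    clamping the window so it stays inside the grid."""
    k = min(max(int(u) - order // 2, 0), len(grid) - order - 1)
    nodes = list(range(k, k + order + 1))
    for i, node in enumerate(nodes):
        grid[node] += w * lagrange_basis(nodes, i, u)

grid = [0.0] * 10
fill(grid, 5.8, 2.0)  # weight 2.0 lands between nodes 5 and 6

# Exactness check: node sums reproduce any cubic at u = 5.8,
# here p(n) = n^3 - n.
value = sum(a * (n ** 3 - n) for n, a in enumerate(grid))
```

Since the basis functions sum to one, the total grid content equals the filled weight, and summing grid values against any polynomial of degree up to 3 evaluated at the nodes returns the weight times that polynomial at u, which is the property the interpolation relies on.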

Perturbative orders
In section 2.2 we labelled the different perturbative orders using the indices k, l, m, n; their values are process specific. In general, however, we define as leading order (LO) the set of contributions, for all possible initial states ab leading to the same final state X, for which the sum of the coupling exponents in eq. (2.1) is smallest, i.e. k + l = p, where p = min(k + l). This number is usually determined by the number of external particles. For many processes there is only one contribution at LO (in terms of k and l), but this is not true in general. Indeed, when a process has multiple quark lines, colourless particles (photons, ...) and coloured particles (gluons, ...) can be exchanged between them, making it possible to have more than one combination of α_s and α. A typical example at the LHC is (on-shell) top-pair production, which has three different contributions at LO: O(α_s²), O(α_s α), and O(α²). Each contribution receives a higher-order correction with an additional power of α_s or α, which in general leads to at least two next-to-leading order (NLO) corrections. The correction with the largest power of α_s can be unambiguously called 'the' QCD correction, and the one with the largest power of α 'the' EW correction. All remaining corrections are of combined type, meaning that, in general, they cannot be attributed to either a strong or an electroweak origin. However, for the sake of simplicity and with a slight abuse of notation, it is customary to call NLO QCD (EW) corrections those corrections of order α_s (α) times the couplings of the LO contribution with the largest power of α_s. In the example above, the NLO QCD and EW corrections to top-pair production denote those at O(α_s³) and O(α_s² α), respectively. In this paper, in particular when discussing results in section 3, we will explicitly list the orders that are considered in the cross section at LO, NLO QCD, and NLO QCD+EW accuracy.
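The bookkeeping above can be sketched as follows: starting from the LO coupling combinations (k, l) with k + l = p, each NLO contribution adds one power of α_s or α, and the conventional 'QCD'/'EW' labels are attached relative to the LO term with the largest power of α_s. The function and its output layout are illustrative, not PineAPPL's internal representation.

```python
# Each contribution is a pair (k, l): k powers of alpha_s, l powers of alpha.

def orders(lo_couplings):
    """Return the LO set, the NLO set, and the conventional NLO QCD/EW labels."""
    nlo = sorted({(k + 1, l) for k, l in lo_couplings}
                 | {(k, l + 1) for k, l in lo_couplings}, reverse=True)
    # LO term with the largest alpha_s power fixes the naming convention.
    k_max, l_at_kmax = max(lo_couplings)
    return {
        "LO": sorted(lo_couplings, reverse=True),
        "NLO": nlo,
        "NLO QCD": (k_max + 1, l_at_kmax),
        "NLO EW": (k_max, l_at_kmax + 1),
    }

# On-shell top-pair production: three LO contributions,
# O(alpha_s^2), O(alpha_s alpha), O(alpha^2).
ttbar = orders([(2, 0), (1, 1), (0, 2)])
```

For top-pair production this reproduces the labels quoted in the text: the NLO QCD correction is (3, 0), i.e. O(α_s³), and the NLO EW correction is (2, 1), i.e. O(α_s² α), while (1, 2) and (0, 3) are the remaining combined-type contributions.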
Given the typical sizes of the couplings, α_s² ∼ α, it is naively expected that, within the same order, i.e. for fixed k + l, terms with larger powers of α_s (larger k) dominate over those with smaller k (and correspondingly larger powers of α). In practice, however, this naive expectation does not always hold, due to dynamical effects. Some examples are vector-boson scattering processes [38,39], top-pair production in association with a W boson and four-top production [40], and Higgs production in association with a bottom-quark pair [41].

Example: Drell-Yan lepton-pair production at the LHC
To give an example of eq. (2.1), Drell-Yan lepton-pair production up to NLO contains, schematically (with some arguments of the phase-space weights suppressed),

dσ = α² dσ^(0,2) + α_s α² dσ^(1,2) + α³ dσ^(0,3).

The first term is the LO contribution, the second the NLO QCD correction, and the third the NLO EW correction. Note that all terms depend on the renormalisation scale only indirectly, through α_s, because 1) higher-order terms in α never generate a renormalisation-scale dependence (in the α schemes that are valid according to section 2.2), and 2) higher-order QCD corrections only introduce an explicit renormalisation-scale dependence in counterterms with vertices with more than two gluons. At NLO such terms are not present for this process, so that the terms proportional to log(ξ_R²) vanish. Both NLO corrections, however, receive contributions from a collinear counterterm that depends on the factorisation scale. Since this process has a single LO contribution, combined QCD-EW corrections first appear at next-to-next-to-leading order (NNLO), which includes the QCD correction at O(α_s² α²), the EW correction at O(α⁴), and a single combined correction at O(α_s α³).
Note that all initial states that lead to the same final state have to be taken into account. These include the photon-photon initial state, which appears already at LO. In the corresponding Feynman diagrams all particles are colourless, so that the photon-photon initiated contributions only receive EW corrections. The EW corrections also introduce quark-photon contributions, in analogy to the quark-gluon contributions introduced by QCD corrections.

Validation and interpretation of PineAPPL grids
In this section we demonstrate the capabilities of the PineAPPL library by interfacing it to mg5_aMC, and by computing fast-interpolation grids, accurate to NLO QCD and NLO QCD+EW, for a common set of LHC processes in which EW corrections may sizeably affect the accuracy of the theoretical predictions. In order to consider realistic kinematics for these processes, we rely on a representative set of LHC measurements. Our aim is twofold. First, we want to validate the results obtained with PineAPPL; second, we want to assess the impact of the EW corrections for the specific experimental setups. We describe the settings employed for the computations, the corresponding results for each process, and possible implications for the determination of PDFs.
We generate each process by means of the Universal FeynRules Output (UFO) [56] model loop_qcd_qed_sm_Gmu, included as standard in mg5_aMC. It contains the UV and R₂ counterterms relevant to NLO QCD and EW corrections, the latter in the G_µ scheme. The model features five massless quark flavours, sets the CKM matrix equal to the identity, and is compatible with the usage of the complex-mass (CM) scheme [57,58] for all massive particles, see ref. [11] for details. We use this scheme for all processes that involve only massless particles in the final state. The photon is always considered as part of the proton in the initial state and of any hadronic jet produced in the final state: in particular, photon-induced (PI) contributions are consistently included at LO and NLO.³ We use a PDF set that contains a photon PDF, namely NNPDF31_nlo_as_0118_luxqed [19]. We evaluate the PDF uncertainty associated with the theoretical predictions a posteriori, that is, we convolve the fast-interpolation grid generated with PineAPPL with each member of the PDF set and compute the associated standard deviation. Monte Carlo weights are stored as Lagrange-interpolation grids.
The central values of the renormalisation and factorisation scales, µ R and µ F , are chosen in a process-specific way, as discussed in sec. 3.2. In order to estimate the missing higher-order uncertainty, we allow the events to be reweighted on-the-fly in the Monte Carlo generation upon scale variations, with the technique presented in ref. [59]. To this purpose, we use the default mg5_aMC implementation, whereby the factorisation and renormalisation scales are varied down to a factor 1/2 and up to a factor 2, and the envelope from the nine-point scale variations is constructed, see equation (2.5). However, we note that PineAPPL allows the user to determine the envelope with any point prescription, see appendix A.1 for an example.
The values of the relevant physical parameters are chosen as follows, where M_Z, M_W, and m_t are the Z-boson, W-boson, and top-quark masses, respectively, Γ_Z and Γ_W are the widths of the Z and W bosons, and G_µ is the value of the Fermi coupling. The value of the strong coupling is chosen consistently with the PDF set. The definition of observables and cuts is process-specific and follows the corresponding experimental measurements, see section 3.2. When relevant, final-state photons and massless charged fermions (leptons and light quarks) are recombined if they satisfy the condition ∆R_fγ < 0.1, where ∆R_fγ is the fermion-photon distance. In this case the sum of their momenta is assigned to the charged fermion, and the photon is removed from the event. Kinematic observables and cuts are defined starting from the recombined momenta. If we were also interested in jet-related observables, photons surviving the recombination would have to be clustered together with the coloured partons.⁴ Finally, although contributions corresponding to the radiation of a heavy boson are formally of the same perturbative order as the EW corrections, they are not included in our computations. In fact, while nothing prevents one from including these contributions a posteriori, as they are finite, their impact is either smaller than that of 'standard' EW corrections, or in any case negligible with respect to the total cross section (see refs. [63][64][65] for some process-specific cases).
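The recombination step above can be sketched as follows: a photon within ∆R < 0.1 of a massless charged fermion has its four-momentum added to the fermion's and is removed from the event. The momentum convention (px, py, pz, E), the sample momenta, and the sequential merging order are illustrative assumptions, not the mg5_aMC implementation.

```python
import math

def eta_phi(p):
    """Pseudorapidity and azimuth from a (px, py, pz, E) four-momentum."""
    px, py, pz, _ = p
    pt = math.hypot(px, py)
    return math.asinh(pz / pt), math.atan2(py, px)

def delta_r(p1, p2):
    """Distance in the (eta, phi) plane, with phi wrapped to (-pi, pi]."""
    eta1, phi1 = eta_phi(p1)
    eta2, phi2 = eta_phi(p2)
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def recombine(fermions, photons, r_cut=0.1):
    """Merge each photon within r_cut of its nearest charged fermion into
    that fermion's momentum; return (recombined fermions, surviving photons)."""
    fermions = [list(p) for p in fermions]
    surviving = []
    for gamma in photons:
        dists = [delta_r(f, gamma) for f in fermions]
        i = min(range(len(fermions)), key=lambda j: dists[j])
        if dists[i] < r_cut:
            fermions[i] = [a + b for a, b in zip(fermions[i], gamma)]
        else:
            surviving.append(gamma)
    return [tuple(f) for f in fermions], surviving

# Toy event: one lepton, one collinear photon (merged), one wide-angle photon.
leptons, photons = recombine(
    [(50.0, 0.0, 0.0, 50.0)],
    [(5.0, 0.2, 0.1, 5.005), (0.0, 10.0, 0.0, 10.0)],
)
```

Observables and cuts would then be evaluated on the merged lepton momentum, as stated in the text.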

Results for specific processes and measurements
We focus on the following three processes: Drell-Yan lepton-pair production, top-quark pair production, and Z-boson (lepton-pair) production with non-zero transverse momentum in proton-proton collisions. These are some of the most commonly and most precisely measured processes at the LHC, which are widely studied to test the SM and/or search for new physics. We therefore expect them to allow us to clearly show the benefit of being able to make fast and reliable theoretical predictions accurate to NLO QCD+EW with PineAPPL. In the following, we shall present the experimental measurements, the process-specific settings, and the phenomenological results for each of these processes.
When presenting the results, for each of the processes and measurements considered, we compute differential cross sections for the observables defined in the experimental analyses in two different ways: directly, by means of mg5_aMC, and a posteriori, by convolving the fast-interpolation grid produced by PineAPPL with the PDF set specified in section 3.1. We refer to the first result as the MC result, and to the second as the PineAPPL result. We repeat the computation for theories accurate to NLO QCD and to NLO QCD+EW, respectively. The corresponding orders of the strong and EW couplings that we consider are specified for each process. In each case, we determine the PDF uncertainty (coming from the PDF ensemble), the scale uncertainty (coming from variations of the factorisation and renormalisation scales), and the Monte Carlo uncertainty (coming from the finite number of events generated). In this last respect, we consider by default high-statistics computations, whereby we require a relative Monte Carlo precision of 0.1 ‰ on the integrated cross section. While this choice does not affect the validation of the PineAPPL result against the MC result, it ensures that the statistical uncertainty of the computation remains negligible in comparison to the PDF and scale uncertainties, as we will explicitly demonstrate. This is a desirable feature to correctly interpret the size of the EW corrections. An example that validates the PineAPPL result in the case of a low-statistics run is nevertheless provided in appendix C.
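The a-posteriori PDF uncertainty described above, i.e. convolving the same grid once per PDF member and taking the standard deviation across members, can be sketched as follows. The per-member predictions are toy numbers standing in for the per-member convolution results of one bin; real sets may prescribe different uncertainty formulas (e.g. replica versus eigenvector conventions), which is not modelled here.

```python
import math

def pdf_uncertainty(predictions):
    """Mean and sample standard deviation across PDF members for one bin."""
    n = len(predictions)
    mean = sum(predictions) / n
    var = sum((p - mean) ** 2 for p in predictions) / (n - 1)
    return mean, math.sqrt(var)

# Toy per-member cross sections (pb) for a single bin of one observable.
members = [101.2, 99.8, 100.5, 100.1, 99.9]
mean, sigma = pdf_uncertainty(members)
```

The key practical point is that the (expensive) grid only has to be filled once; looping over PDF members is then just repeated convolutions, which is what makes the a-posteriori uncertainty evaluation cheap.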
Our goal is indeed twofold. On the one hand, we aim to validate the interpolation grids generated with PineAPPL: to this purpose we shall verify that the MC and the PineAPPL results are identical up to numerical inaccuracies due to the grid interpolation. This equivalence must hold for any choice of renormalisation and factorisation scale and should not depend on the MC uncertainty of the binned cross section. On the other hand, we aim to study the size of the EW corrections, in particular with respect to the kinematics of each process, and to three kinds of uncertainties: the PDF uncertainty, the scale uncertainty, and the uncertainty of the experimental data.
We present these comparisons in figures 2, 3, 4, and 5. The format of the plots is the same across all figures. The first panel displays the relative difference (in per mille) between the PineAPPL and the MC results for the central scale choice and upper/lower edges of the scale-uncertainty envelope, for theories accurate to both NLO QCD and NLO QCD+EW. The following three panels present the theoretical predictions, accurate to either NLO QCD or NLO QCD+EW, always normalised to the former; on top of the theoretical predictions, the PDF uncertainty, the scale uncertainty and the Monte Carlo uncertainty are displayed in turn. The relative uncertainty of the experimental data is also shown for comparison. We shall now discuss the results for each process and data set.

Drell-Yan lepton pair production.
Experimental measurements and process features. We select the single-differential invariant mass distribution of the lepton pair, M_ℓℓ, measured by the ATLAS experiment at a centre-of-mass energy of 7 TeV in the high-mass region (M_ℓℓ > 116 GeV) [66]. We also select the single-differential rapidity distribution, y_ℓℓ, in slices of the invariant mass of the lepton pair, M_ℓℓ, measured by the CMS experiment at a centre-of-mass energy of 7 TeV [67]. These measurements are currently included as standard in the NNPDF3.1 [14] and MMHT2014 [13] PDF sets, although with appropriate kinematic cuts that remove the bins at the largest values of the invariant mass, where EW corrections become sizeable. As explained in section 2.3.1, the process has a single LO contribution, O(α²); at NLO, the QCD contribution is O(α_s α²), while the EW contribution is O(α³). Our NLO QCD computation includes the O(α²) and O(α_s α²) contributions, while our NLO QCD+EW computation includes the O(α²), O(α_s α²), and O(α³) contributions. Combined QCD-EW corrections occur only at NNLO, and are therefore not considered here. EW corrections for this process were computed in refs. [11,[68][69][70]] (see also [71] and references therein). The process receives contributions from 13 (35) parton luminosities at NLO QCD (NLO QCD+EW), see appendix B for details.
Process-specific settings. We use a fixed value for the renormalisation and factorisation scales, µ_R = µ_F = M_Z, where M_Z is the mass of the Z boson, for the ATLAS measurement, and the scale µ_R = µ_F = M_ℓℓ, where M_ℓℓ is the central value of each invariant mass slice, for the CMS one. In the case of ATLAS, we require p_T^ℓ > 25 GeV, |η_ℓ| < 2.5, and 116 GeV < M_ℓℓ < 1500 GeV for the transverse momentum and the pseudorapidity of each lepton and for the invariant mass of the lepton pair, respectively. Conversely, in the case of CMS, we require p_T^ℓ1 > 14 GeV, p_T^ℓ2 > 9 GeV, |η_ℓ| < 2.4, |y_ℓℓ| < 2.4, and 20 GeV < M_ℓℓ < 1500 GeV for the transverse momenta and pseudorapidities of the leptons, and for the rapidity and the invariant mass of the lepton pair.
Numerical results. We first consider the single-differential measurement of lepton pairs at large invariant masses performed by the ATLAS experiment at 7 TeV. From figure 2 we immediately observe that the validation of the PineAPPL result against the MC result is successful. The relative difference between the two is of the order of 0.1 ‰ at most, with negligible fluctuations across different invariant mass bins. The agreement is similarly good irrespective of the perturbative accuracy of the theory (NLO QCD or NLO QCD+EW) or of the scale choice. As explicitly demonstrated in appendix C, the good agreement is also independent of the numerical precision of the Monte Carlo run.
The measurement is mainly driven by qq̄ scattering: specifically, the leading (next-to-leading) contribution comes from the uū/cc̄ (dd̄/ss̄) parton luminosity, which accounts for about 55 % (49 %) of the cross section in the lowest invariant mass bins, and 68 % (22 %) in the largest invariant mass bins.⁵ The PI contribution rises from about 1.3 % in the lowest bin to about 3.6 % in the highest bin.⁶ Overall, the EW corrections range between −5 % around M_ℓℓ ≈ 150 GeV, −2 % to −3 % for intermediate invariant mass values, 150 GeV ≲ M_ℓℓ ≲ 700 GeV, and −6 % to −10 % for the largest invariant mass bin,

⁵ The size of these contributions may depend on the input PDF set. Here and in the following, we always quote results obtained from the NNPDF31_nlo_as_0118_luxqed PDF set.

⁶ For the MRST2004qed PDF set [72], which is used to subtract PI contributions (see section 4), the value in the highest bin is roughly twice as large, 6.9 % (also in absolute numbers).
-14 -  M ¯ > 1000 GeV, see figure 2. For this reason, the data points with M ¯ > 210 GeV were not included in the NNPDF3.1 analysis [14]. The NLO QCD+EW corrections always lead to a reduction of the cross section in comparison to the NLO QCD prediction. The size of this shift is comparable to the data uncertainty at small values of M ¯ , and rapidly becomes negligible with respect to it as the value of the invariant mass increases and the data uncertainty blows up. This fact suggests a couple of observations in light of the inclusion of EW corrections in a fit of PDFs. First, the description of the more precise bins in the low invariant mass range is likely to change, and will possibly become more accurate should the inclusion of EW corrections improve the data/theory agreement. Second, the kinematic cut that excludes any data point at large M ¯ can be safely removed: any shift in the predictions induced by the more accurate NLO QCD+EW theory is likely to be easily accommodated by the large data uncertainty. However, due to the increased Run-II LHC luminosity, data will become more precise.
In comparison to the PDF uncertainty, the size of the EW corrections is always larger, especially at the boundaries of the distribution. This fact suggests that, once included in a global fit, EW corrections will make the PDFs more accurate. In comparison to the scale uncertainty, the size of the EW correction is similar, except in the four bins at the largest invariant mass, where the EW correction is significantly larger than the scale uncertainty. This fact suggests that the impact of NNLO QCD corrections [73-77] is comparable to that of the NLO QCD+EW ones, except at very large values of the invariant mass, where the EW correction dominates. This result stresses the need to include the EW corrections in order to obtain an accurate description of the large invariant mass bins. Finally, the Monte Carlo uncertainty remains negligible in comparison to the data, PDF and scale uncertainties, and to the size of the EW correction. Our conclusions should therefore not be affected by the generation of too few Monte Carlo events.

We then turn our attention to the double-differential measurement performed by the CMS experiment at 7 TeV. For illustrative purposes, we report only four out of the six invariant mass bins: below the Z-boson mass peak, 45 GeV < M_ℓℓ < 60 GeV; on the peak, 60 GeV < M_ℓℓ < 120 GeV; above the peak, 120 GeV < M_ℓℓ < 200 GeV; and at very high invariant masses, 200 GeV < M_ℓℓ < 1500 GeV, see figure 3. Analogous plots for the remaining low invariant mass bins are collected in appendix C. From figure 3, first of all we validate the PineAPPL result: its relative difference with respect to the MC result is always below a fraction of a per mille, again irrespective of the accuracy of the theory, of the scale choice, and of the kinematic bin considered.
As in the case of the ATLAS high-mass Drell-Yan measurement, the CMS measurement is also dominated by qq̄ scattering. The leading (next-to-leading) contribution to the 45 GeV < M_ℓℓ < 60 GeV invariant mass bin comes from the uū/cc̄ (dd̄/ss̄) parton luminosity, which accounts for about 70 % (22 %) of the double-differential cross section, with small fluctuations across the rapidity range. The PI contribution decreases from about 4 % at zero rapidity to 1.5 % in the largest rapidity bin. The situation is slightly different in the 60 GeV < M_ℓℓ < 120 GeV invariant mass bin, where the leading (next-to-leading) contribution comes instead from the dd̄/ss̄ (uū/cc̄) parton luminosity, which accounts for about 60 % (44 %) of the double-differential cross section at small rapidities, and for about 56 % (50 %) at large rapidities. In the remaining two invariant mass bins, the leading (next-to-leading) contribution comes again from the uū/cc̄ (dd̄/ss̄) parton luminosity, which accounts for about 69 % to 95 % (38 % to 30 %) and 57 % to 70 % (48 % to 34 %) of the cross section, respectively for 120 GeV < M_ℓℓ < 200 GeV and 200 GeV < M_ℓℓ < 1500 GeV, in the corresponding rapidity intervals; the PI contributions range from 3.7 % to 0.6 % and from 7.3 % to 1.6 % in the two invariant mass bins, respectively, for increasing rapidity.
The way in which NLO QCD+EW corrections affect the theoretical prediction for the double-differential cross section (with respect to its NLO QCD counterpart) depends on the invariant mass bin. In the 45 GeV < M_ℓℓ < 60 GeV region, they enhance the cross section by about 11 % across the whole rapidity range. This is mostly due to photon-radiation effects on events with M_ℓℓ ≳ M_Z at the Born level, whose invariant mass is shifted to lower values. In the 60 GeV < M_ℓℓ < 120 GeV region, EW corrections suppress the cross section by about 2 %, again across the whole rapidity range; in the 120 GeV < M_ℓℓ < 200 GeV bin, the suppression is around 4 % to 5 %; and in the 200 GeV < M_ℓℓ < 1500 GeV bin, the suppression increases further, to about 6 % to 7 % for rapidities y_ℓℓ < 2.0. For this reason, for instance, the data points with M_ℓℓ > 200 GeV and y_ℓℓ > 2.2 were not included in the NNPDF3.1 analysis [14]. In general, the size of the EW corrections is comparable to or slightly larger than the data uncertainty, except in the invariant mass bin 45 GeV < M_ℓℓ < 60 GeV, where the shift due to the EW correction overshoots the data uncertainty by about a factor of ten, and at large rapidities, where the shift due to the EW correction, although it can become large, is always a fraction of the data uncertainty. Because EW effects are subtracted from the data used in PDF fits (see section 4), a good agreement between data and theory is usually achieved without the inclusion of EW corrections. However, as already observed in the case of the ATLAS measurement, should EW corrections be included in a fit of PDFs, the latter are likely to become more accurate: even though the apparent description of the data will not improve, by including the more precisely predicted bins in the low invariant mass range, the PDFs will resemble the underlying truth more closely. Furthermore, the kinematic cut that excludes data points at large invariant mass and/or rapidity can be safely removed.

Figure 3. Same as figure 2, but for the CMS double-differential Drell-Yan lepton-pair measurement at a centre-of-mass energy of 7 TeV [67]. Only four of the six available invariant mass bins are displayed: below the Z-boson mass peak, 45 GeV < M_ℓℓ < 60 GeV; on the peak, 60 GeV < M_ℓℓ < 120 GeV; above the peak, 120 GeV < M_ℓℓ < 200 GeV; and at very high invariant masses, 200 GeV < M_ℓℓ < 1500 GeV. Results for the slices 20 GeV < M_ℓℓ < 30 GeV and 30 GeV < M_ℓℓ < 45 GeV can be found in figure 8.
In comparison to the PDF uncertainty, the size of the EW corrections is always larger. We therefore anticipate that, even if the agreement between the more accurate theory (including EW corrections) and the data remains the same, the PDFs will nevertheless become overall more accurate. In comparison to the scale uncertainty, the size of the EW correction is similar, except on the Z-boson mass peak, 60 GeV < M_ℓℓ < 120 GeV, where the scale uncertainty exceeds the size of the EW correction by about a factor of five. This is due to the choice of invariant mass window around the Z peak, in which positive and negative EW corrections almost cancel. Finally, the Monte Carlo uncertainty remains negligible in comparison to the data, PDF and scale uncertainties, and to the size of the EW correction, except for a couple of bins at forward/backward rapidity in the highest invariant mass bins. Improving the Monte Carlo precision will require generating a larger number of events, possibly with cuts that select only the kinematic bins affected by the largest MC uncertainties. If this turned out to be computationally too expensive, it would be desirable to treat this uncertainty as an additional theoretical uncertainty in the PDF fit [78].

Top-quark pair production.
Experimental measurements and process features. We select the single-differential distributions in either the transverse momentum of the top quark, p_T^t, or the invariant mass of the top-quark pair, m_tt̄, measured by the ATLAS and CMS experiments at a centre-of-mass energy of 8 TeV [79,80]. These measurements have been extensively studied in the context of PDF fits in refs. [8,81-83] (see also refs. [65,84] for studies related to the photon density) and are included by default in the CT18 [15] analysis. Because EW corrections are significantly smaller for distributions differential in the rapidity of either the top quark or the top-quark pair [84], the latter distributions were preferred for inclusion in the NNPDF3.1 analysis [14]. The process receives pure QCD contributions at LO, O(α_s^2), and at NLO, O(α_s^3). These orders make up our NLO QCD computation. The NLO QCD+EW computation includes the O(α_s^2) and O(α_s^3) QCD contributions, the LO contribution O(α_s α) and the NLO contribution O(α_s^2 α). We do not consider the LO contribution O(α^2) nor the NLO contributions O(α_s α^2) and O(α^3), which have been shown to be negligible [11,84]. EW corrections for this process were computed in refs. [11,65,84-93]. The process receives contributions from 7 (37) parton luminosities at NLO QCD (NLO QCD+EW), see appendix B for details.
Process-specific settings. We employ the following functional forms for the renormalisation and factorisation scales: µ_R = µ_F = √(m_t^2 + (p_T^t)^2)/2 for the distribution differential in the transverse momentum of the top quark, and µ_R = µ_F = H_T/4, with H_T = √(m_t^2 + (p_T^t)^2) + √(m_t^2 + (p_T^t̄)^2), for the distribution differential in the invariant mass of the top-quark pair, with m_t, p_T^t and p_T^t̄ the mass of the top quark and the transverse momenta of the top and antitop quarks, respectively. These choices were demonstrated to maximise the convergence of the perturbative expansion [94]. No cuts are imposed.
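To make the two scale choices concrete, the following sketch evaluates them for given top and antitop transverse momenta. The numerical top-mass value is our assumption for illustration (the section does not specify it), and H_T is taken as the sum of the top and antitop transverse masses, following the convention of ref. [94]:

```python
import math

M_TOP = 173.3  # GeV; illustrative value, not specified in this section

def transverse_mass(m, pt):
    """m_T = sqrt(m^2 + p_T^2)."""
    return math.sqrt(m * m + pt * pt)

def scale_pt_top(pt_top):
    """mu_R = mu_F = m_T(top)/2, used for the p_T^t distribution."""
    return 0.5 * transverse_mass(M_TOP, pt_top)

def scale_mtt(pt_top, pt_antitop):
    """mu_R = mu_F = H_T/4 with H_T = m_T(t) + m_T(tbar),
    used for the m_ttbar distribution."""
    ht = transverse_mass(M_TOP, pt_top) + transverse_mass(M_TOP, pt_antitop)
    return ht / 4.0
```

At vanishing transverse momenta both definitions reduce to m_t/2, so the two choices coincide at threshold and differ only in the boosted regime.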
Numerical results. In figure 4 we report the distributions differential in the transverse momentum of the top quark, p_T^t, and in the invariant mass of the top-quark pair, m_tt̄. Analogous plots for the distributions differential in the rapidity of either the top quark or the top-quark pair are collected in appendix C. From figure 4, we immediately validate the PineAPPL result: its relative difference with respect to the MC result is at most 0.4 ‰, irrespective of the accuracy of the theory, of the choice of scale and of the distribution considered.
The process receives its leading contribution from the gg channel, which varies between 81 % and 61 % (76 % and 83 %) of the p_T^t (m_tt̄) differential cross section as the value of the transverse momentum of the top quark (the invariant mass of the top-quark pair) increases. The largest PI contribution comes from γg scattering, which accounts for about 0.5 % to 1 % (0.5 % to 0.7 %) of the cross section, and is almost entirely (90 %) due to the LO contribution at O(α_s α); the contribution of the other PI parton luminosities is comparatively negligible. Overall, the EW corrections suppress the p_T^t (m_tt̄) distribution by about 0.2 % to 3.5 % (0.5 % to 0.2 %) for increasing values of p_T^t (m_tt̄), except in the first bin of the p_T^t distribution, where they enhance the cross section by about 1 %. The size of these shifts, however, always remains significantly smaller than the data uncertainty. As a consequence, we anticipate that the more accurate NLO QCD+EW theory is likely to be easily accommodated by the large data uncertainty, should the data be fitted with the inclusion of EW corrections.
The size of the EW correction is comparable to that of the PDF uncertainty, except at large values of transverse momentum or invariant mass, where the former becomes larger than the latter. This fact suggests that, once included in a global fit, EW corrections can improve the accuracy of the PDFs. In comparison to the scale uncertainty, the size of the EW corrections remains negligible: despite the fact that the factorisation and renormalisation scales have been chosen to optimise the convergence of the perturbative expansion, NNLO QCD corrections remain large [97-105], as expected in a process mostly initiated by gluons. Their inclusion is therefore mandatory in a fit of PDFs. Finally, the Monte Carlo statistical uncertainty remains negligible in comparison to the data, PDF and scale uncertainties, and to the size of the EW correction. Our conclusions are therefore not affected by Monte Carlo inefficiencies.

Z-boson production with non-zero transverse momentum.
Experimental measurements and process features. We select the single-differential distribution in the transverse momentum of the sum of the two leptons (the 'Z boson'), p_T^ℓℓ, measured by the CMS experiment at a centre-of-mass energy of 13 TeV [106]. This measurement, which has not been included in any PDF determination so far, has very small experimental uncertainties (at the percent level or below). EW corrections are therefore expected to be essential to achieve a good description of it, and to constrain the PDFs accurately, together with NNLO QCD corrections, which are already well known [107-113]. Analogous measurements, from the ATLAS [95] and CMS [96] experiments at a centre-of-mass energy of 8 TeV, were partly included (upon selection of an appropriate kinematic cut that excluded bins with large EW corrections) in a dedicated study [114], in the NNPDF3.1 PDF set [14] and in variants of the CT18 PDF set [15]. In the QCD computation, we consider a single LO contribution, O(α_s α^2), and a single NLO contribution, O(α_s^2 α^2). In the NLO QCD+EW computation, we supplement these with another LO and NLO contribution, of orders O(α^3) and O(α_s α^3), respectively; contributions of order O(α^4) are not considered (see ref. [62]). EW corrections for this process were computed in refs. [11,115-118]. The process receives contributions from 101 (166) parton luminosities, see appendix B.
Process-specific settings. As in the case of Drell-Yan, we use a fixed value for the renormalisation and factorisation scales, µ_R = µ_F = M_Z, where M_Z is the Z-boson mass. Consistently with the experimental analysis, we require p_T^ℓ > 25 GeV, |η_ℓ| < 2.4, |M_ℓℓ − M_Z| < 15 GeV, |y_ℓℓ| < 2.4 and 20 GeV < p_T^ℓℓ < 1500 GeV for the transverse momentum and pseudorapidity of each lepton, and for the invariant mass, rapidity and transverse momentum of the lepton pair. We finally discard all bins with p_T^ℓℓ < 20 GeV to avoid the kinematic region where resummation effects are sizeable.
Numerical results. In figure 5 we report the distribution differential in the transverse momentum of the Z boson. Also in this case, the PineAPPL result is well validated, as it differs from the MC result by at most 0.5 ‰. Neither the accuracy of the theory nor the choice of scale alters this conclusion.
The process receives its leading contribution from the qg- and q̄g-initiated channels, which account for about 65 % of the cross section, with some variations in the relative contributions across the kinematic range. We anticipate the inclusion of EW corrections to be relevant for an accurate fit of these data.
The size of the EW correction is between four and twenty times larger than that of the PDF uncertainty: as noted in the previous cases, this fact suggests that, once included in a PDF fit, EW corrections can improve the accuracy of the PDFs. In comparison to the scale uncertainty, the size of the EW corrections remains negligible at small values of p_T^ℓℓ, roughly p_T^ℓℓ ≲ 400 GeV, while it becomes up to four times larger than the scale uncertainty in the two bins at the largest values of p_T^ℓℓ. In this kinematic region, NLO EW corrections may therefore become even more relevant than NNLO QCD corrections, and their inclusion in a fit of PDFs to this data set is therefore mandatory. Finally, the Monte Carlo uncertainty is well under control, as it remains mostly negligible in comparison to the PDF, scale and data uncertainties, and to the size of the EW correction.

Subtraction of EW effects from data
The ability to perform theoretical calculations simultaneously accurate in both the QCD and EW couplings is not by itself sufficient for a consistent comparison with experimental measurements. In this section we formulate some guidelines to facilitate this task. We focus on the problem of data from which EW effects have been (partially) subtracted: comparing such data to theory predictions that include those effects leads to a double-counting issue. Our guidelines are intended to make the reader aware of this emerging issue, whose definitive solution remains beyond the scope of this work.
A first example is the subtraction of (allegedly irreducible) background processes which should not be considered as such. A blatant case is neutral-current Drell-Yan, where the signal process is the production of an opposite-sign lepton pair, which starts at O(α^2). Because this process is usually thought of as a quark-initiated s-channel mechanism (qq̄ → γ*/Z → ℓ⁺ℓ⁻), in many analyses the PI component, γγ → ℓ⁺ℓ⁻ in the t channel, is considered a different process, and therefore treated as a background and subtracted. The subtraction from the measured data is performed by calculating the theoretical predictions for the double-photon-initiated contribution, possibly including (ill-defined) higher-order corrections. For example, in refs. [119,120] (a similar statement appears also in an older analysis [66]), one reads: 'The photon-induced process, γγ → ℓ⁺ℓ⁻, is simulated at LO using Pythia 8 and the MRST2004qed PDF set [72]. The expected yield for this process also accounts for NLO QED/EW corrections from references [121,122], which decrease the yield by approximately 30 %.'
Such a distinction, which is unphysical and inconsistent with quantum mechanics, may be somehow justified at LO. Beyond this order, it is simply wrong. Indeed, at O(α^3) the reaction qγ → ℓ⁺ℓ⁻q becomes possible, which includes both kinds of topologies discussed above, and needs both in order to yield an IR-finite result, see figure 6 (as a consequence, one cannot even speak of EW corrections to γγ → ℓ⁺ℓ⁻). While it is common sense that the QCD counterpart of this subtraction should never be performed (no one would consider 'subtracting' the gluon-initiated contribution to top-pair production in top analyses), seemingly the same does not hold when EW corrections are considered.
A second example is the removal of EW effects from data; these can be either the full EW corrections or just a part of them. In either case, a comparison of such data with an NLO EW-accurate simulation aimed at extracting some parameter would be meaningless, as some of the effects included in the latter have been removed from the former. The typical example relevant for the LHC is the deconvolution of the effects due to multiple-photon radiation off light particles in the final state. This applies mostly to processes such as neutral- or charged-current Drell-Yan, especially when electrons are considered. The problem lies in the fact that electrons, and to a lesser extent muons, tend to radiate photons, an effect that is not accounted for in QCD-only matrix elements. Thus, the leptons measured in the detector are less energetic, and this fact is compensated for by inverting a photon shower. The resulting dataset is e.g. referred to as pre-FSR, with observables defined in terms of Born-level electrons (see e.g. ref. [95] for its definition). These datasets are needed for, and correctly used in, QCD-only PDF determinations, since the EW corrections to some DY observables can be significant, and neglecting them would degrade the quality of the fit. In fits including fixed-order EW corrections, the problem with this definition, besides double counting, is that the first photon emission is included exactly at the matrix-element level, while the inclusion of subsequent emissions would require matching with a QED shower, which is not yet available for general processes.
It is interesting to note that one can tune a QED parton shower to mimic NLO EW effects for specific processes and observables, so that a prediction accurate only at NLO QCD displays a remarkable agreement with one at NLO QCD+EW once the photonic shower is included (see also the behaviour of predictions showered with Photos [123-125] in ref. [126]). However, this kind of agreement always emerges a posteriori, and cannot be guaranteed in general. The transverse momentum of the Z boson discussed in section 3.2.3 is such an example: the difference between the two datasets is rather small (less than 6 %), but the NLO EW corrections are as large as 25 %. Furthermore, the deconvolution of QED effects in the data introduces a dependence on the program (and possibly on its specific version) employed for the shower inversion. This is especially problematic if the deconvolved datasets are the only ones published, since undoing the deconvolution exactly can be very difficult, or practically impossible.
A more physical definition of leptonic observables makes use of either bare leptons (the leptons as they emerge after FSR) or dressed leptons (leptons and photons clustered together, with their momenta combined, in analogy with jets in QCD). The problem with the former is that electrons are never measured as bare particles, because of the finite resolution of the electromagnetic calorimeter. As far as muons are concerned, while in principle the concept of a bare muon is physical, it should be kept in mind that modern general-purpose codes employed to compute EW corrections treat leptons as massless, to ensure the numerical stability of the matrix elements; in this case, using bare leptons is not collinear safe. Dressed leptons avoid all these shortcomings, with the further advantage of being inclusive over extra collinear emissions. This encourages exploring the possibility of employing a dressed-lepton definition, regardless of the lepton flavour. We acknowledge that this practice is already followed in some experimental analyses: indeed, to mention two examples discussed in this paper, ref. [95] publishes data for dressed leptons, together with the Born-level and bare ones, while ref. [106] employs a dressed-lepton definition. We therefore recommend that these ways of presenting the experimental data become standard in the future.
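A minimal sketch of such a dressed-lepton definition is given below. The cone size ΔR = 0.1 and the input record layout are our assumptions for illustration; each experimental analysis fixes its own dressing prescription:

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    dphi = abs(phi1 - phi2)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi  # wrap the azimuthal difference
    return math.hypot(eta1 - eta2, dphi)

def dress_lepton(lepton, photons, cone=0.1):
    """Add to the bare lepton the four-momenta of all photons within
    a cone of Delta R < `cone`, in analogy with jet clustering.
    Each particle is a dict with keys 'eta', 'phi' and 'p4', where
    'p4' is a four-momentum tuple (E, px, py, pz)."""
    e, px, py, pz = lepton["p4"]
    for ph in photons:
        if delta_r(lepton["eta"], lepton["phi"], ph["eta"], ph["phi"]) < cone:
            e += ph["p4"][0]
            px += ph["p4"][1]
            py += ph["p4"][2]
            pz += ph["p4"][3]
    return (e, px, py, pz)
```

Because the momenta of nearby photons are simply added back to the lepton, the definition is inclusive over collinear emissions, which is precisely the property discussed above.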

Conclusions and Outlook
The systematic inclusion of EW corrections in accurate theoretical computations for several LHC processes is becoming more and more important in order to match the increasing precision of the data. In this paper we simplified the computational aspect of this task, building upon the automation of QCD and EW computations pioneered in recent years [9-11]. Specifically, we developed PineAPPL, a new library that stores perturbative calculations from an external Monte Carlo generator in a PDF-independent way using interpolation grids. This offers the advantage of fast a posteriori convolutions with PDFs, for example to study the uncertainties coming from different PDF sets and/or the strong coupling α_s, and to determine the PDFs themselves, a task for which fast-interpolation grids are fundamental. We tested PineAPPL together with mg5_aMC and found a relative precision of 10^−4 to 10^−5 with respect to the MC result, which is excellent for all foreseeable practical purposes. Although we used mg5_aMC, we note that PineAPPL is not tied in any way to a specific Monte Carlo generator, and can easily be interfaced with any of them.
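The mechanism that makes a posteriori convolutions possible can be illustrated with a deliberately simplified toy model: each Monte Carlo weight is divided out by the PDFs used at generation time and accumulated on a grid of momentum-fraction nodes, so that any PDF set can be convolved with the stored weights afterwards. This sketch is not PineAPPL's actual algorithm; it uses nearest-node binning, a single partonic channel and no scale dependence, whereas PineAPPL interpolates over x1, x2 and Q² nodes and keeps one sub-grid per channel and perturbative order:

```python
# Toy illustration of a PDF-independent grid: each Monte Carlo weight is
# divided by the PDFs used during the run and accumulated on (x1, x2) nodes;
# the convolution with any PDF set is then a cheap sum over the nodes.
X_NODES = [0.001, 0.01, 0.1, 0.5, 1.0]

def nearest_node(x):
    """Index of the grid node closest to the momentum fraction x."""
    return min(range(len(X_NODES)), key=lambda i: abs(X_NODES[i] - x))

class ToyGrid:
    def __init__(self):
        # one weight table per partonic channel; a single channel for brevity
        self.w = [[0.0] * len(X_NODES) for _ in X_NODES]

    def fill(self, x1, x2, weight, f1, f2):
        """Store the weight stripped of the PDFs used at generation time."""
        self.w[nearest_node(x1)][nearest_node(x2)] += weight / (f1(x1) * f2(x2))

    def convolute(self, f1, f2):
        """A posteriori convolution with an arbitrary pair of PDFs."""
        return sum(self.w[i][j] * f1(X_NODES[i]) * f2(X_NODES[j])
                   for i in range(len(X_NODES)) for j in range(len(X_NODES)))
```

Filling the toy grid once and then convolving it with two different 'PDFs' yields two different predictions without re-running the Monte Carlo, which is the operation a PDF fit needs to perform millions of times.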
We emphasise that a distinguishing feature of PineAPPL is the support for arbitrary coupling orders not only in the strong, but also in the electroweak coupling. This enables us to generate, for the first time, NLO EW and NLO combined QCD-EW interpolation grids. Using mg5_aMC we calculated and showcased the impact of these corrections for specific measurements of some representative LHC processes: Drell-Yan lepton-pair production, top-pair production, and Z-boson production with non-zero transverse momentum.
Finally, we discussed the issue of the subtraction of EW corrections from experimental data, which becomes important when theoretical predictions including EW corrections are compared to them. With the development of PineAPPL, all technical requirements are now fulfilled for producing the first PDF fit of LHC data including EW and combined QCD-EW corrections. This will have at least two advantages. First, phase-space regions exhibiting large EW corrections are usually cut away in PDF fits; including them increases the number of data points in a fit and thereby indirectly enlarges a PDF set's interpolation region. Second, it makes it possible to use experimental data that are closer to the actual measurement, without the need to compensate for missing EW corrections. We plan to address this task in a future work.

Acknowledgments
We would like to thank Stefan Dittmaier, Stefano Forte, Zahari Kassabov, Davide Pagani, and Juan Rojo for a critical reading of the manuscript, and Rikkert Frederix for discussions about aMCfast. We also acknowledge discussions with Florencia Canelli, Stefano Camarda, Paolo Francavilla, Abideh Jafari, Andreas Jung, Elizaveta Shabalina, and Wolfgang Wagner about the treatment of EW effects in experimental data. S.C. and C.S. are supported by the European Research Council under the European Union's Horizon 2020 research and innovation Programme (grant agreement no. 740006). E.R.N. is supported by the European Commission through a Marie Skłodowska-Curie Action (grant agreement no. 752748). M.Z. thanks the Nikhef institute in Amsterdam, where he was employed when this paper was started.

A Installation and usage of PineAPPL
PineAPPL currently consists of three parts: 1) the library itself, which is a dependency of the other parts; 2) the helper program pineappl, which allows one to read PineAPPL grids from the command line and make predictions with them (explained in section A.1); and 3) the C interface, which is intended to be used in Monte Carlo integrators to generate the grids.

A.1 Demonstration of pineappl
The program pineappl can be used to perform quick convolutions and other calculations with existing grids on the command line. If started without any arguments, it prints its help message and lists all supported subcommands.

Convolutions.
The most important subcommand is convolute, which performs the convolution of a single grid with one or more PDF sets. As an example, we consider the grid produced for the ATLAS Drell-Yan high-mass lepton-pair production from section 3.2.1, convoluted with NNPDF31_nlo_as_0118_luxqed as the main PDF set and with CT18NLO as a second PDF set. The output shows all 13 bins with the lower (xmin) and upper (xmax) limits of the invariant mass M_ℓℓ of the lepton pair, the differential cross section dσ/dM_ℓℓ (diff), the integrated cross section (M_ℓℓ^max − M_ℓℓ^min) dσ/dM_ℓℓ (integ), and the perturbative uncertainty estimated from a 7-point scale variation (the envelope given by neg unc and pos unc). The uncertainty estimate can alternatively use a 3- or a 9-point scale variation, by passing the optional switch --scales 3 or --scales 9, respectively. The (differential) results for the second PDF set (CT18NLO) are shown in absolute numbers and also as a percentage relative to the result of the first PDF set.

A second subcommand breaks the prediction down into the individual perturbative orders. The first four columns are the same as in convolute, and the remaining ones show all orders normalised to the sum of the leading orders, which in this case is only the O(α^2) one. Absolute numbers are shown if the switch --absolute (or -a) is passed to the program.

A third subcommand lists the contributions of the individual partonic channels; the first three columns are known from convolute, and the following ones show the largest channels (their number can be adjusted with the switch --limit). For the grid at hand, channel #0 represents the up-type quark-antiquark contributions (shown with PDG ids 2 and 4 for up and charm quarks, which have the same matrix elements), channel #15 is the same channel with its initial states transposed, channels #5 and #20 are the down-type quark-antiquark channels, and channel #30 is the photon-photon channel. The remaining channels are all smaller than the photon-photon one.
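Schematically, the convolution described above is obtained with a single command; the grid file name here is purely illustrative:

```shell
# convolve one grid with a main PDF set and compare against a second set;
# --scales 9 selects the 9-point scale-variation envelope (7-point is default)
pineappl convolute DY-HIGH-MASS.pineappl \
    NNPDF31_nlo_as_0118_luxqed CT18NLO --scales 9
```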
The factors of 1 appearing in the channel definitions are not important here, but in general they can contain CKM matrix elements and charge factors which, if kept inside the squared matrix elements, would not allow a single matrix element to be shared among different quark flavours, and would therefore slow down the calculation. A complete list of all channels and of their contributions to the cross section, for all of the processes discussed in section 3, is collected in appendix B.

A.2 Sample runcard for mg5_aMC@NLO
The following run card was used to produce the results shown in section 3.2.1. The only difference with respect to a standard mg5_aMC run is the switch set iappl 1, which enables the filling of a PineAPPL grid. For a complete set of runcards and patches see https:

A.3 Example Monte Carlo program in C++
The following listing shows how to set up PineAPPL through its C interface in a simple Monte Carlo integrator calculating the double-photon contribution to Drell-Yan lepton-pair production at the LHC. All PineAPPL functions carry the prefix pineappl_. The full example, together with a makefile, can be found at https://github.com/N3PDF/pineappl/tree/master/examples/capi-dy-aa. The documentation of the C API is available at https://docs.rs/pineappl_capi.

Installation of Rust
All parts are written in Rust, so a Rust compiler and the related tools are needed. On operating systems with a bash shell (such as Linux or macOS) the installation is as simple as

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

which downloads the compiler rustc, the package manager cargo, and a few other helpful tools. When the installation has completed, make sure to read and follow the instructions printed on screen. See also https://www.rust-lang.org/tools/install for more details and for installation instructions for other operating systems.

Installation of the command-line program pineappl
The command-line program pineappl is compiled and installed using

$ cargo install pineappl_cli

This program also needs LHAPDF [127] installed; make sure that the environment variables PATH, LD_LIBRARY_PATH, and PKG_CONFIG_PATH are properly set. For usage instructions simply type pineappl in your shell and read the help message.

Installation of the C-language interface (optional)
For the C interface you first need to install cargo-c,

$ cargo install cargo-c

and then download the PineAPPL repository, compile the C interface and finally install it into a directory DIRECTORY as follows:

$ git clone https://github.com/N3PDF/pineappl/
$ cd pineappl/pineappl_capi/
$ cargo cinstall --release --prefix=DIRECTORY

The last command installs the C header pineappl_capi.h, the library, and a pkg-config file (pineappl_capi.pc) into the directory specified as DIRECTORY. Make sure that the environment variables PATH, LD_LIBRARY_PATH, and PKG_CONFIG_PATH are properly set. The latter is needed for pkg-config --cflags --libs pineappl_capi to work, which prints the necessary compiler and linker flags.
After being installed, one can compile and link against the library. See appendix A.3 for an example.

B Parton Luminosities
In this appendix, we present the complete breakdown of the parton luminosities entering the NLO QCD+EW predictions for the various measurements considered in section 3. For each parton luminosity, we indicate its percentage contribution to the cross section in a given bin. For the CMS double-differential Drell-Yan measurement, a separate table is devoted to each invariant mass slice, with table 7 corresponding to 120 GeV < M_ℓℓ < 1500 GeV. In each table, the bins are in the rapidity of the lepton pair, 0.0 < y_ℓℓ < 2.4, and each of them has a width of 0.1 (0.2 in the largest invariant mass range).

C Low statistics and complementary results
In this appendix we collect some additional plots, in the same format as those presented in figures 2-5. Specifically: figure 7 is the same as figure 2, but for a low-statistics MC run; figure 8 is the same as figure 3, but for the two missing lepton-pair invariant mass bins, 20 GeV < M_ℓℓ < 30 GeV and 30 GeV < M_ℓℓ < 45 GeV; and figure 9 is the same as figure 4, but for the distributions in the rapidity of the top quark, y_t, and in the rapidity of the top-quark pair, y_tt̄. In this case the factorisation and renormalisation scales are kept equal to H_T/4, see section 3.2.2. From figure 7, we observe that the validation of the PineAPPL result remains successful: its difference with respect to the MC result is at most 3 ‰, as usual irrespective of the accuracy of the theory and of the choice of scale. As expected, however, the result is largely unreliable for drawing any conclusion about the size of the EW corrections: large fluctuations are seen in the predictions, and the MC uncertainty dominates over the PDF and scale uncertainties. Figure 8 demonstrates that the two bins at the lowest invariant mass of the Drell-Yan lepton-pair measurement by the CMS experiment display very similar features to the bin at immediately larger values of invariant mass, see figure 3: the EW correction enhances the cross section by 2 % to 3 % across the whole rapidity range; this shift is slightly larger than the PDF uncertainty, but is largely overshot by the scale uncertainty; the MC uncertainty remains comparatively negligible.
Finally, figure 9 allows us to further validate the PineAPPL result against the MC result for the top-quark rapidity distributions, and to check explicitly that the size of the EW corrections is in this case negligible with respect to that in the companion top-quark transverse momentum and top-quark pair invariant mass distributions, see figure 4, consistently with what was already observed in ref. [84].

Figure 9. Same as figure 4, but for the distributions in the rapidity of the top quark, y_t, and in the rapidity of the top-quark pair, y_tt̄.