The dark side of the proton

We study the sensitivity of the High-Luminosity LHC to a light baryonic dark photon B, primarily coupled to quarks, as a constituent of the proton. This is achieved by allowing for a dark photon parton distribution function (PDF) in the PDF evolution equations. Depending on the mass and coupling of the dark photon, the evolution of standard quark and gluon PDFs is distorted to varying degrees. By analysing the effect of the dark photon on the tails of Drell-Yan invariant mass distributions, we demonstrate the potential of the LHC in determining competitive bounds on dark photon parameter space.


Introduction
The majority of the visible sector cosmological energy budget is comprised of hadrons, yet it is rendered visible by the photon, which itself makes up only a tiny fraction of the energy budget and does not behave as matter. It is not unreasonable to expect that the moment the curtains to the dark sector are drawn back it will be rays of dark light that flood detectors and not necessarily the dominant matter component. Thus, the most effective strategy to unveil the particle physics of the dark sector may be to search for new light states carrying a vanishingly small fraction of the dark energy budget; perhaps, even, dark photons (hereafter referred to as 'B'). In recent years searches for dark photons have gained momentum, both theoretically and experimentally; see, for example, the recent reviews [1][2][3], which paint a picture of the breadth of activities in this area.
Being Abelian vectors, dark photons can naturally be light, thus no specific mass scale is particularly deserving of attention than another. As a result experimental search strategies should endeavour to cover as broad a mass range as possible. Below m B 1 GeV a variety of intensity frontier experiments have significant sensitivity to the presence of dark photons, however above this mass scale only high energy accelerators have the capability to probe dark photon parameters.
Pursuing this program, [4,5] use deep inelastic scattering (DIS) data from HERA, and projected data at the Large Hadron Electron Collider (LHeC), to derive bounds on a particular class of dark photon models, in which the dark photon is introduced via kinetic mixing with the SM electroweak bosons. In these studies, the dark photon is treated as a mediator of DIS, hence modifying the theoretical expressions for the DIS structure functions, which allows for the extraction of bounds. Further, in [5] it is noted that a fully-consistent treatment using this approach requires a simultaneous fit of both parton distribution functions (PDFs) and dark photon parameters; here, the interplay is a mild second-order effect, yielding a small relaxation of the constraints derived in [4] (however, in [6,7] it was shown that at the reach and precision of the high-luminosity phase of the Large Hadron Collider (HL-LHC), simultaneous analysis of PDFs and BSM effects will be significantly more impactful).
What if a dark photon was baryonic, being primarily coupled to quarks over leptons? In this case, PDF effects take centre-stage, and it becomes reasonable to consider the dark photon not simply as a mediator of DIS, but as a constituent of the proton in its own right. This is not without precedent; whilst the vast majority of New Physics (NP) searches at the LHC involve processes initiated by coloured partons, namely quarks and gluons, it is well-known that quantum fluctuations can give rise to non-coloured partons inside hadrons, although with much smaller abundance. A key example is the inclusion of photons and leptons as constituents of the proton, which can play a crucial role in achieving precise phenomenological predictions at the LHC. In the recent LUXqed publication it was shown that the photon PDF can be determined in a model-independent manner, using DIS structure function data [8,9]. These results brought an extremely accurate determination of the photon PDF, that superseded the previous model-driven or purely data-driven analyses [10,11]; now, the LUXqed method has been incorporated in several global PDF sets [12][13][14]. Going beyond just photon PDFs, the LUXqed approach has since been extended to the computation of W and Z boson PDFs [15], and lepton PDFs [16]. Whilst the impact of the photon PDFs is sizeable in a number of kinematic regions, the impact of lepton PDFs is rather small at Run III. However lepton-initiated processes will become an important feature in the near future, in particular in the HL-LHC phase, which will provide the largest proportion of new high-energy particle physics data in the next 20 years [17][18][19][20].
In this spirit, we might reasonably ask whether the proton could contain small contributions from a dark photon, the consideration of which could be important in the near future. In this work, we assess the impact of the inclusion in the proton of a new, light baryonic dark photon B with mass in the range m B ∈ [2, 80] GeV, coupling primarily to quarks via the effective interaction Lagrangian: where the dark fine structure constant is of the order α B ∼ 10 −3 . The dark photon's parton distribution enters into the PDF evolution equations in the same way as the photon PDF, except for a flavour-universal coupling and a non-zero mass threshold. The other PDFs, particularly the quarks and antiquarks, are modified by the presence of a dark photon, especially in the large-x region; this gives rise to significantly different predictions for key observables that can be measured at a very high degree of precision at the LHC. In this work we focus on the high invariant-mass Drell-Yan (DY) differential distributions, whose theoretical description is significantly affected by the distortion of the quark and antiquark PDFs due to the presence of a non-zero dark partonic density. For the first time in the literature, we demonstrate the strong sensitivity to this dark photon's mass and coupling of the precise measurements of the high-mass Drell-Yan tails at the HL-LHC, by looking at the data-theory agreement using the standard PDFs and the PDFs modified by the presence of a non-zero dark photon distribution. Whilst the sensitivity of collider measurements to BSM colored partons in the proton has been shown to be very strong [21,22] -as one would expect given that light coloured particles very rapidly distort both the DGLAP evolution and the running of α s (µ) [23] -in this work, we show that in the near future we will still be able to competitively probe the presence of a dark parton that couples to quarks via a much more subtle QED-like mixing.
In Sec. 2 we describe how PDF evolution is modified by the presence of a non-zero dark photon distribution, and state the order of our calculation in QCD and QED perturbation theory. Additionally, we show the dark photon distributions that we obtain, and display how the other parton distributions are modified by the presence of the dark photon. Following this, in Sec. 3 we consider how the presence of the dark photon PDF affects precision theoretical predictions for DY processes at the HL-LHC. If observations at the HL-LHC are SM-like, this could be used to place bounds on the dark photon content of the proton and hence constrain the parameter space of this model. In certain mass ranges we find that these projected bounds are competitive with existing limits, demonstrating the extraordinary ability of the precision-QCD era of LHC physics to probe new light dark sector physics.

Determination of the dark PDF contribution
In order to produce a PDF set which include a non-zero dark photon distribution, we follow the method described in [24], which constitutes a first exploration into the effects of the inclusion of lepton PDFs. In this study, simple ansätze for the functional forms of the light lepton PDFs (electrons and muons) are postulated at the initial PDF parametrisation scale Q 0 = 1.65 GeV, based on the assumption that initial-state leptons are primarily generated by photon splitting, while leptons that are heavier than the initial-scale (namely the tauon) are dynamically generated at their mass threshold and kinematical mass effects are neglected, as is done for all heavy partons in the Zero-Mass Variable-Flavour-Number scheme (ZMVFN) [25,26]. All parton flavours, including the lepton ansätze alongside initial quark, gluon and photon PDFs drawn from some fixed baseline PDF fit, are then evolved using an appropriately modified version of the PDF evolution equations, the Dokshitzer-Gribov-Lipatov-Altarelli-Parisi (DGLAP) equations [27][28][29], thus producing a final PDF set now including lepton PDFs.
Analogously in this work, given that the dark photon mass range that we consider here is above 2 GeV, we dynamically generate a PDF at the dark photon mass threshold m B . We then evolve this new distribution alongside quark, antiquark, gluon and photon PDFs drawn from a baseline set. Hence, via the interplay between the flavours generated by the DGLAP evolution, the resulting quark, gluon and photon PDFs differ relative to the original reference set evolved excluding dark photons, allowing the impact of the dark photon inclusion to be assessed (and, in the subsequent section, bounds from HL-LHC projected pseudodata to be extracted).
In this section, we describe this procedure in more detail. We begin by explicitly showing the modification to the DGLAP equations required by the presence of dark photons in the proton. We then display and discuss the resulting 'dark PDF sets', and compare them to baseline PDF sets excluding the dark photon. In particular, we analyse the dark luminosities, which show an appreciable deviation from their standard counterparts for sufficiently large values of the coupling α B ; this motivates the phenomenological study that we present in Sec. 3.

The DGLAP equations in the presence of dark photons
As is well known, in order to combine QCD and electroweak calculations at hadron colliders, the PDF evolution must be determined using the coupled QCD⊗QED DGLAP evolution equations [30][31][32]. Here, we modify these equations by adding the leading order evolution of a dark photon PDF. In order to assess the impact of such a dark photon PDF in the evolution, it is essential to include all QCD and QED contributions of the same magnitude as the leading dark contribution. Indeed, to include the terms multiplied by α B ∼ 10 −3 consistently, we must also include the terms multiplied by α s ∼ 10 −1 , α 2 s ∼ 10 −2 , α 3 s ∼ 10 −3 , α ∼ 10 −2 and αα s ∼ 10 −3 in the evolution (and further it is no loss to include the terms multiplied by α 2 ∼ 10 −4 ); in particular, we work at NNLO in QCD, NLO in QED, and include QCD-QED interference; furthermore, we always include a photon PDF. On the other hand, the lepton PDFs determined in [16,17,19,20] give a contribution that is more than one order of magnitude smaller than the dark photon contributions determined in this work, thus we can safely ignore them.
With the orders and flavours specified, the modified DGLAP equations which we use in this work can be stated as: where µ 2 is the factorisation scale, n f the number of active flavours, q i (q i ) the parton density of the i th (anti)quark, g the gluon PDF, γ the photon PDF, and B the new dark photon PDF. The symbol ⊗ denotes the usual Mellin convolution: Evolution equations for antiquarks can be obtained by employing conjugation invariance. The matrix elements P ij (with i, j = q,q, g, γ, B) are perturbatively calculable functions 1 called splitting functions. We can decompose the splitting functions into series of the form: where we follow the notation of [33,34]; the upper indices indicate the (QCD,QED,Dark) order of the calculation (where in this work we have added an additional 'Dark' index, corresponding to the powers of the dark coupling α B ). The QCD contributions to the splitting functions were fully computed up to O(α 3 s ) in [35,36] 2 , the mixed QED and QCD contribution P (1,1,0) ij was computed in [33], and the NLO QED contribution P (0,2,0) ij was computed in [34].
The coefficients P (0,0,1) ij can be calculated directly by finding the most collinearlydivergent parts of the four dark splitting channels pictured in Fig. 2.1. The only non-zero contributions are given by ij = qq, qB, Bq and BB (the same results for the antiquarks can be obtained by charge conjugation). We briefly summarise the calculation in Appendix A. However, a detailed calculation is not strictly necessary, since the form of the interaction 1 Technically, mathematical distributions. 2 They are also partially known at O(α 4 s ) [37], but this contribution are not yet fully known and are not included in any public PDF evolution code. Furthermore, their contribution is beyond the accuracy needed in the current analysis.
Lagrangian Eq. (1.1) is identical to that of the electromagnetic-hadronic interaction in the SM, except with a universal coupling 1 3 g B to all quarks and antiquarks. It follows that the splitting function contributions provided by the dark photon B will be identical (up to a factor of 1 9 , due to our convention for the universal coupling) to those provided by the photon γ; in particular, we can quote the required leading-order splitting functions by comparing to [24]: Here, the + notation used in P (0,0,1) qq (x) denotes the usual plus-distribution, defined by: (1)).

(2.5)
To solve the modified DGLAP equations (2.1), we must also specify initial conditions for the dark photon at the initial scale Q 0 = 1.65 GeV. Given that we consider masses m B ∈ [2, 80] GeV, we set B(x, Q 2 ) = 0 for all Q < m B , and we generate the dark photon PDF dynamically at the threshold Q = m B from PDF evolution, similar to the treatment of heavy quarks in the ZM-VFN scheme [25,26], and the tauon PDF in [24]. Hence the dark photon PDF is always proportional to the dark photon coupling α B and to log(Q 2 /m 2 B ) for Q > m B .

PDF sets with dark photons
We have implemented the modified DGLAP equations described in Sec. 2.1 in the public APFEL PDF evolution code; more detail regarding the code implementation is given in App. B. Using the modified code, we produce a PDF set and a corresponding LHAPDF grid [38] including dark photons, for each given value of the dark photon mass and coupling that we consider. We focus on the introduction of a dark photon into the evolution of the NNPDF3.1luxQED set [12] 3 , which provides our SM baseline, namely an NNLO global PDF analysis of all standard parton flavours together with a photon PDF (the photon PDF in this set is determined using the LUXqed method [9]).
In this section, we display the key results from a 'dark PDF set' in a particular scenario that is permitted according to the bounds given in Ref. [40], namely: which corresponds to taking g B = 1.94 × 10 −1 . As described above, a massless dark photon is generated dynamically at the threshold Q = m B , and is set to zero before this threshold is reached. We have chosen a sufficiently high (admissible) value of the coupling to display the impact upon PDFs and parton luminosities. In Fig. 2.2, we display both the photon and dark photon PDFs in our representative dark set (obtained by setting the dark photon coupling and mass at the values given in Eq. (2.6)) at the scales Q = 100 GeV and Q = 1 TeV, and show their relative PDF uncertainties. As anticipated, the dark photon PDF features the same functional form as the photon PDF (this is to be expected since the photon and dark photon splitting functions are identical up to scaling), but its density is smaller since α B α. Furthermore, it can be shown that increasing α B , and also moderately decreasing m B , increases the similarity of the dark photon and photon PDFs. The dark photon uncertainty is mostly comparable to the photon uncertainty up to x ∼ 0.4, and then increases faster than the photon uncertainty. This is due to the dark photon being generated off the singlet PDF (the sum of all quarks and antiquarks) at its mass threshold with a rather small coupling; in particular, the dark photon uncertainty is comparable to the uncertainty of the singlet PDF scaled by a factor of α B . This makes it comparable to the photon PDF uncertainty (for the choice of α B and m B of Eq. (2.6)), except in the large-x region where the singlet PDF uncertainty dramatically increases, resulting in the dark photon PDF uncertainty to consistently increase up to ∼ 10% at x ∼ 0.6. We have verified that for larger couplings the uncertainty increases, as one would expect. Now that we have introduced a new parton in the proton, it is interesting to ask how much 'space' it takes up; this can be quantified by determining the momentum carried by the dark photon at different energy scales. As usual, the momentum fraction carried by any given parton flavour f at the scale Q is defined by: (2.7) In Table. 2.1, we give a comparison between the momentum carried by the dark photon, the photon and the singlet for the representative dark PDF set computed using the values specified in Eq. (2.6), and compare them to the baseline SM PDF set, at Q = 100 GeV and Q = 1 TeV. We observe that the fraction of the proton momentum carried by the dark photon increases with the scale Q, which is to be expected by analogy with the photon's behaviour. Depending on the coupling and the mass of the dark photon, the latter carries up to a fraction of percent of the proton momentum's fraction at Q ≈ 1 TeV. Crucially, the presence of a dark photon in the DGLAP equations also modifies the evolution of all other flavours of PDFs due to the coupling of the PDFs via the modified DGLAP equations Eq. (2.1). We expect that the modification of the quark and antiquark flavours is strongest, as the dark photon is directly coupled to them. We also anticipate a modification to the gluon and photon PDFs, but these will be second order effects, so we expect that they will be smaller in comparison. Moreover, the density of each of the flavours should reduce, as the new dark photon 'takes up space' in the proton which was previously occupied by the other flavours. Results are shown in Fig. 2.3, in which the ratio between the central value of the dark-photon modified singlet (u-valence) PDF and the central value of the baseline singlet (u-valence) PDF are displayed and compared to the current 68% C.L. PDF uncertainty.
We observe that the modification of the singlet becomes visible at about x ∼ 0.2 and reaches 3% at larger values of x ∼ 0.5. This is well within the 68% C.L. uncertainty of the singlet PDF from the baseline NNPDF3.1luxQED NNLO set. However, thanks to the Ratio u V ratios @ 1TeV inclusion of a vast number of new datasets and the increased precision of the methodology used in global PDF analysis, the recent NNPDF4.0 NNLO set [39] displays significantly smaller large-x uncertainty. Such a decrease in PDF uncertainties goes in the direction indicated by the dedicated study on how PDF uncertainties will decrease in future, thanks to the inclusion of precise HL-LHC measurements [41]. In particular, to give an indication of how the modification of PDFs due to the presence of a dark photon might come into tension with decreasing PDF uncertainties during the HL-LHC phase, we display the projected 68% PDF uncertainties at the HL-LHC determined in the 'optimistic' scenario, Scenario 3, of Ref. [41]. In this case, should PDF uncertainties decrease to the level predicted by Ref. [41], the distorted singlet PDF approaches the edge of the projected PDF uncertainty at x ∼ 0.1 − 0.3 region, for the values given in Eq. (2.6). This is particularly relevant for the analysis that we present in the next section.

Probing the Dark Sector
In this section we review the existing constraints on the dark photon. Subsequently, in order to assess the impact of a non-zero dark photon parton density on physical observables, we plot the parton luminosities when the dark photon is included, as compared to our baseline SM set. We compare the predicted deviations with the current PDF uncertainties and with the projected PDF uncertainties at the HL-LHC. Finally, we motivate and present our analysis of projected HL-LHC Drell-Yan data and compare the maximal sensitivity we can achieve to the existing bounds derived in the literature.

Review of existing constraints on the dark photon
To appreciate the utility of the dark photon PDF at colliders, we may compare to alternative probes. Recent works considering this class of baryonic dark photon models include [40,[42][43][44][45][46]. There are a variety of competing constraints on this scenario, of varying theoretical robustness. One class of constraints, first considered in detail in [42], is theoretical and concerns the mixed U(1) B −EW anomalies. Suppose we envisage that the UV-completion of the model Eq. (2.6) is perturbative with U(1) B linearly realised. In that case the mixed anomaly must be UV-completed by some fermions with electroweak charges. In this perturbative UV-completion they will obtain their mass from spontaneous U(1) B -breaking. As a result, they will be coupled to the longitudinal mode of B and an additional Higgs-like scalar with a Yukawa coupling λ ∝ M F /v B , where M F is the fermion mass, v B is the U(1) B -breaking expectation value, and we have assumed three sets of fermions with the same charge (1/3) as the left-handed fermions, for simplicity. On the other hand we have g B ∝ M B /v B following from the charge and symmetry-breaking vacuum expectation value. As a result, we expect: where the precise numerical factors are taken from [42]. Thus, requiring perturbativity λ 4π implies an upper bound on g B , where the factor 1/3 follows from the fact that each family of fermions is in triplicate to mirror the QCD multiplicity of the SM quarks. This limit is shown as a dashed line in Fig. 3.4 where we have taken M F ≥ 90 GeV for the electroweak-charged fermions.
However, a number of implicit assumptions have been made which can weaken upon further inspection. To see this, consider cancelling the anomaly with N copies of the above class of fermions. In this case the limit becomes: Hence we see that this theoretical limit makes not only the assumption of a weakly-coupled UV-completion, but also depends on assumptions of minimality of the UV completion as well. As a result, while this limit does guide the eye as to the nature of the UV-completion, it cannot be considered a strong theoretical limit on the model parameters. The only truly model-independent theoretical limit comes from considering the scale at which the validity of the IR theory itself breaks down. Given that the quark interactions are vector-like there is no possibility of tree-level unitarity violation in quark scattering mediated by B, thus we must look to quantum effects. In this case the mixed-anomaly becomes relevant and renders the theory non-unitary unless [47]: where α W is the SU(2) fine structure constant at the electroweak scale and M Λ is the energy scale at which the theory becomes strongly coupled. Numerically this is which is too weak to be relevant for our purposes. As a result we conclude that the effective theory considered here is valid throughout the energy scales under investigation. However, we note that, as shown in [46], the mixing with the Z-boson is sensitive to the details of the UV-completion; for this reason we restrict the mass range under investigation to m B ≤ 80 GeV, above which these UV-dependent effects can be important. There are three relevant classes of experimental constraints. The first concerns the exotic Z-boson decays Z → Bγ. These constraints were calculated in [43] based on the LEP analysis for Z → Hγ, H →hadrons [48]. 4 This limit, relevant to the higher mass range, is shown in red in Fig. 3.4. The second class of constraints at lower masses concerns exotic Υ decays [51,52], where the constraint is dominated by limits on Υ(1S) → 2 jets [53], shown in blue in Fig. 3.4. Finally, there are additional searches for hadronically decaying resonances at hadron colliders [46,[54][55][56]. The strongest are from CMS B+ISR searches [57,58], shown in yellow in Fig. 3.4.

Effects of the dark photon on parton luminosities
In Sec. 2, we showed that the presence of a dark photon modifies all other flavours of PDFs via the mixing associated with the DGLAP evolution equations, with a modification that is proportional to α B and the logarithm of m B . In order to assess the impact of a dark photon parton density on physical observables, and thus extract the sensitivity that the LHC can achieve on the parameters of the model, in the following subsection we compare the size of the dark parton luminosities to luminosities involving the other partons, and assess the impact of the dark photon on the dominant partonic channels.
Parton luminosities are doubly differential quantities defined as: where S is the squared centre-of-mass energy of the hadronic collision, M X is the invariant mass of the partonic final state, y is the rapidity of the partonic final state, and f i (x, Q) is the PDF of the i th parton evaluated at the scale Q. Different choices for Q can be adopted in order to improve predictions of a particular process and/or distribution. At the level of pure luminosities, without the convolution with any specific matrix element, the factorisation scale can be naturally set to Q = M X . For plotting purposes, it is useful to define the M X -differential luminosities, given by: We first compare the size and the M X -dependence of the different parton luminosities in the candidate dark PDF set obtained by setting the mass and the coupling to the values indicated in Eq. (2.6). In Fig. 3.1 we plot Φ BB , Φ Bq as compared to Φ qq , Φ γγ . We observe  that, while the BB channel is suppressed by two powers of the dark coupling, and its size never exceeds more than a fraction of a percent of the qq luminosity, the Bq channel grows from about 2% of the qq luminosity at M X ∼ 1 TeV to about 8% of the qq luminosity at larger values of the invariant mass. Its contribution exceeds that of γγ scattering by one order of magnitude.
We now turn to assess the change in the other luminosities, as a result of the inclusion of a non-zero dark photon parton density. In Fig. 3.2 we display the ratio of the dark-photon modified quark-antiquark integrated luminosity Φ Dark qq with the baseline one, Φ SM qq at the centre of mass energy √ S = 14 TeV, for different values of the α B and m B parameters, starting from our benchmark values, Eq. (2.6). In each figure, the dark blue band corresponds the 68% C.L. PDF uncertainty of the NNLO baseline NNPDF3.1luxQED set, while the green bands show the projected PDF uncertainty on the parton luminosity at the HL-LHC; this estimate for the uncertainty on the PDF luminosity is obtained from the 'optimistic' scenario, Scenario 3, analysed in [41], as above. Starting from the values of Eq. (2.6), we observe that the deviation in the qq luminosity due to the presence of the dark photon is significant compared to the size of the projected PDF uncertainties at the HL-LHC. Decreasing the mass of the dark photon by a factor of 10 increases the impact of the dark photon on qq initiated observables, while increasing the coupling by less than a factor of 2 brings the luminosity beyond the edge of the 68% C.L. error bands.
Crucially, the effect of the dark photon is much larger in the qq-initiated processes than in any of the other channels, including qq, qg and gg. This motivates the study of the high-mass Drell-Yan tails that we put forward in the following section.

Constraints from precise measurements of high-energy Drell-Yan tails
Given that the qq channel is the most affected by the presence of a non-zero dark photon parton density, in this study we focus on precise measurements of the high-mass Drell-Yan tails at the HL-LHC. It is important to note that these projected data are not included in the fit of the input PDF set used as a baseline, otherwise, as was explicitly shown in [6,7,59], the interplay between the fit of the new physics parameters and the fit of the PDF parametrisation at the initial scale might distort the results.
To generate the HL-LHC pseudodata for neutral-current high-mass Drell-Yan cross sections at √ S = 14 TeV, we follow the procedure of [6]. Namely, we adopt as reference the CMS measurement at 13 TeV [60] based on L = 2.8 fb −1 . The dilepton invariant mass distribution m is evaluated using the same selection and acceptance cuts of [60], but now with an extended binning in m to account for the increase in luminosity. We assume equal cuts for electrons and muons, and impose |η | ≤ 2.4, p lead T ≥ 20 GeV, and p sublead T ≥ 15 GeV for the two leading charged leptons of the event. We restrict ourselves to events with m greater than 500 GeV, so that the total experimental uncertainty is not limited by our modelling of the expected systematic errors, by making our projections unreliable. To choose the binning, we require that the expected number of events per bin is bigger than 30 to ensure the applicability of Gaussian statistics. Taking into account these considerations, our choice of binning for the m distributions at the HL-LHC both in the muon and electron channels are displayed in Fig. 3.3 with the highest energy bins reaching m 4 TeV. In total, we have two invariant mass distributions of 12 bins each, one in the electron and one in the muon channels.
Concerning uncertainties, in [6] this data is produced by assuming that the HL-LHC phase will operate with a total integrated luminosity of L = 6 ab −1 (from the combination of ATLAS and CMS, which provide L = 3 ab −1 each), and also assuming a five-fold reduction in systematic uncertainty compared to [60]. We regard this scenario as optimistic in this paper; we also manipulated the projected data so that it reflected a more conservative possibility, where the total integrated luminosity of the high-mass Drell-Yan tail measurements is L = 3 ab −1 (say, for example, they are made available only by either ATLAS or CMS) and with a two-fold (rather than a five-fold) reduction in systematic uncertainties.
For these projections, the reference theory is the SM, with theoretical predictions evaluated at NNLO in QCD including full NLO EW corrections (including in particular the photon-initiated contributions); note, however, that the Drell-Yan production has been recently computed at N 3 LO in QCD [61,62]. In the kinematical region that is explored by our HL-LHC projections (m > 500 GeV), the perturbative convergence of the series is good and the N 3 LO computation is included within the NNLO prediction, with missing higher order uncertainty going from about 1% to a fraction of a percent. Given the good perturbative convergence of the matrix element calculation, and the absence of N 3 LO PDFs that match the accuracy of the N 3 LO computation of the matrix element, we use the NNLO QCD and NLO EW accuracy of our calculations, both for the SM baseline and for the dark-photon modified PDF set that we use to compute the maximal sensitivity to the dark photon parameters.
The central PDF set used as an input to generate the theoretical prediction is the SM baseline that we use throughout the paper, namely the NNLO NNPDF3.1luxQED set. The central values for the HL-LHC pseudodata are then generated by fluctuating the reference theory prediction by the expected total experimental uncertainty, namely where λ, r i are univariate Gaussian random numbers, δ exp tot,i is the total (relative) experimental uncertainty corresponding to this specific bin (excluding the luminosity and normalisation uncertainties), and δ exp L is the luminosity uncertainty, which is fully correlated amongst all the pseudodata bins of the same experiment. We take this luminosity uncertainty to be δ exp L = 1.5% for both ATLAS and CMS, as done in [41].
To obtain bounds on the dark photon mass and coupling, we select a grid of benchmark points (m B , α B ) in the dark photon parameter space; our scan consists of 21 points, distributed as a rectangular grid with masses m B = 2, 5,8,10,20,50,80 GeV and couplings α B = 10 −3 , 2 × 10 −3 , 3 × 10 −3 . We then construct dark PDF sets at each of these benchmark points (thus a total of 21 PDF sets, in each case including quarks, antiquarks, the gluon, the photon and the dark photon PDFs), using the appropriate values of m B , α B , and hence compute theoretical predictions in both the optimistic and conservative scenarios at each grid point. The predictions are produced assuming that the primary contribution comes from the qq channel; in particular, we note that the partonic diagrams that include a dark photon in the initial state (such as Bq →ql + l − or Bq → ql + l − ) are suppressed by two powers of α B , one from the dark photon PDF and one from the matrix element, and therefore are suppressed beyond the accuracy of our calculation.
In Fig. 3.3 we display the data-theory comparison between the HL-LHC pseudodata in the electron channel, generated according to Eq. (3.7), and both the SM theoretical predictions obtained using the NNLO baseline PDF set NNPDF3.1luxQED and the predictions obtained using the dark PDF sets produced with the dark photon mass and coupling set to (m B = 5 GeV, α B = 3 × 10 −3 ) and (m B = 5 GeV, α B = 5 × 10 −3 ) respectively. We also display the ratio between the central values of those predictions and the central values of the pseudodata as compared to their relative experimental uncertainty in both the optimistic and conservative scenarios. We see that whilst the SM predictions are within the 1σ experimental uncertainty (by construction), the dark-photon modified predictions display significant deviations. In the bottom inset we show the ratio between the predictions obtained in the two representative dark photon scenarios to the central SM theoretical predictions obtained with the the baseline SM PDF set. PDF uncertainties are shown; we display both the current PDF uncertainty of the NNLO baseline PDF set NNPDF3.1luxQED and the projected PDF uncertainties at the HL-LHC, obtained as described at the beginning of this section. Comparing the size of the PDF uncertainties to the size of the projected experimental uncertainties at the HL-LHC, we observe that whilst the current PDF uncertainties are comparable to the experimental uncertainties of the projected data, the projected HL-LHC uncertainties are subdominant as compared to the experimental uncertainties of the pseudodata.
The χ 2 -statistic of the resulting dark PDF set's predictions on high-luminosity highmass neutral-current DY data is defined as: where ||v|| 2 A = v T Av, D is the projected data, T(m B , α B ) are the theoretical predictions using a dark PDF set containing a dark photon of mass m B and coupling α B , and Σ is the total covariance matrix (incorporating both experimental and theoretical uncertainties):  unrealistic to assume that the PDF uncertainty will not decrease as compared to the uncertainty of the NNPDF3.1luxQED baseline, given that we already know that in the updated NNPDF4.0 set [39] the uncertainty of the large-x quarks and antiquarks has already decreased by a sizeable amount thanks to the inclusion of precise LHC data. We thus decide to use the projected PDF uncertainties determined in [41]; in particular, we use Scenario 1 of [41] (the conservative scenario) when we consider the conservative experimental scenario, and we use Scenario 3 of [41] (the most optimistic scenario) when we consider the optimistic experimental scenario. In Appendix C we discuss how our results depend on the assumptions we make on PDF uncertainties. Assuming that the projected PDF uncertainties at the HL-LHC that we display in the bottom inset of Fig. 3.3 are realistic, even in the most optimistic scenario they still amount to 4% to 6% in the largest bins. Therefore, their contribution is much larger than the scale uncertainty of the Drell-Yan matrix element at NNLO in QCD; hence PDF uncertainty is the dominant theory uncertainty on the predictions, and thus it is this contribution that is included in the theory covariance matrix.
To compute the contribution of PDF uncertainties to the theory covariance matrix, we build the theoretical covariance as defined in [63]: where the theoretical predictions for the differential cross section dσ th,(r) i are computed using the SM theory and the r th replica from the baseline PDF set, with PDF uncertainties rescaled by the HL-LHC uncertainty reduction, and averages · rep are performed over the N rep = 100 replicas of this PDF set.
We define the difference in χ 2 to be: where χ 2 0 is the χ 2 -statistic when predictions from the baseline set are used instead. For each fixed m B = m * B in the scan, we then model ∆χ 2 (m * B , α B ) as a quadratic in α B and determine the point at which ∆χ 2 = 3.8, corresponding to a confidence of 95% in a one-parameter scan. Hence, we construct 95% confidence bounds on m B , α B (and hence m B , g B via an appropriate conversion) as displayed in Fig. 3.4. There, the purple (dashed) projected bounds are computed in the conservative scenario excluding (including) the PDF theory covariance matrix, and the green (dashed) projected bounds are computed in the optimistic scenario excluding (including) the PDF theory covariance matrix.
We observe that the projected HL-LHC sensitivity to the detection of a dark photon is competitive with existing experimental bounds, across a large range of possible masses, especially m B ∈ [2, 6]∪ [25,50] GeV. Even in the most conservative scenario including PDF uncertainty (shown as a dashed purple line), the projected sensitivity remains competitive with experiment. Furthermore, one of the useful features of our projected sensitivity is that it is uniformly excluding across a large range (because of the logarithmic dependence on m B ); when compared to individual bounds one at a time, for example the Υ-decay bounds, or the anomaly bounds, our projected sensitivity is powerful.  The solid green and purple lines correspond to projected bounds obtained excluding projected PDF uncertainty, whilst the dashed lines correspond to projected bounds obtained including projected PDF uncertainty, as discussed in the text.

Conclusions
While the dark sector has evaded a non-gravitational detection for decades it is possible that the first clues may be right under our noses, hidden in the dark depths of the proton.
To illustrate this point we have considered the possible existence of a baryonic dark photon.
Being coupled to quarks it may, through radiative effects, be present in the proton and carry some fraction of its momentum at colliders. In this work we have explicitly calculated the dark photon PDF and hence the dark parton luminosities at the LHC at leading order in the dark photon coupling. Since any momentum fraction carried by the dark photon is not carried by the SM partons, the leading effect of the dark photon is to take away part of their momentum. Importantly, this affects precision predictions for event distributions at colliders. Being a high-precision observable at hadron colliders, DY production at high energies is sensitive to the dark photon content of the proton, through the reduction in light quark PDFs. To this end, we have shown that future DY measurements at the HL-LHC are competitive with present lower energy probes in constraining regions of dark photon parameter space. This reveals a new facet of the era of precision hadron collider physics we are currently entering.
Of course our projected sensitivity is based on the phenomenology tools that we currently have, while the actual analysis of real data at the HL-LHC will be able to use the tools that will be developed by then. Most importantly new global PDF sets will be made available, which will possibly be done at N 3 LO, and resulting PDFs can be consistently used with N 3 LO computations of the matrix elements. Moreover the associated PDF uncertainty will most certainly include a missing higher order uncertainty component in the PDF uncertainty that is not currently included in any of the global NNLO PDF fits. The actual sensitivity calculated with real data at the HL-LHC and the refined tools that the PDF community is working towards will be compared with the analysis we have put forward in this work, and the power of the LHC in constraining or revealing the presence of a dark photon will be unveiled.
It is widely accepted that we must endeavour to search in every corner of parameter space for evidence of the dark sector. This has led to a blossoming of novel theoretical ideas and ingenious experimental advances as an ever-greater range of viable dark sector possibilities come under scrutiny. At the same it is important to reflect on our limited view of the dark sector and, perhaps, look inwards. In this work we have pursued this strategy in a very literal sense, asking "Could dark sector constituents lie within the baryons that make up the visible world?" Our results suggest that they could, leaving their fingerprints in SM processes at the HL-LHC.

A Calculation of the dark photon splitting functions
In this Appendix, we explicitly compute the leading order dark photon splitting functions.
We begin by reminding the reader of one possible definition of leading order splitting functions. Consider a process initiated by some parton, which has the potential to radiate another parton before participating in a hard reaction (either a virtual parton, which is reabsorbed, or a real parton which is emitted). Suppose that radiation occurs via a vertex with coupling g (which could be taken to be g s , e or g B , the strong, electric, or dark couplings as appropriate), and corresponding fine structure constant α = g 2 /4π. The cross-section for the radiative process can be shown to take the form: Here, σ Born (xp) is the cross-section when parton radiation does not occur, l T is the transverse momentum of the radiated parton and x is the fraction of momentum of the parton that goes on to take part in the hard reaction. The factor P (x) is called the splitting function and depends on the type of parton initiating the process, the type of parton radiated, and the type of parton that goes on to participate in the hard reaction; typically it is written P ij (x) where i is the parton which goes on to participate in the hard reaction, j is the initial parton, and the third parton flavour is left implied by the structure of the theory. In particular, we see that it multiplies the most collinearly divergent part of the integrand on the right hand side of Eq. (A.1). We now explicitly compute the splitting functions for each of the four dark photon splitting channels, shown in Fig. 2.1.

Contribution to
Consider a general leading order partonic process where a quark q scatters off another particle k to produce some final state X via some hard interaction, as shown in Fig The amplitude for this process can be written as M(p)u(p), where M(p) is the amplitude for the unknown hard scattering part of the diagram (taking as an argument the fourmomentum p of our incoming quark q), including the final state X and the target particle k, and u(p) is the appropriate plane-wave spinor coming from the initial quark q. After summing over all spins the amplitude gives rise to a cross-section: where the sum over X includes an integral over the phase space of the final state X.
On the other hand, the initial quark q may also radiate a real dark photon before it participates in the interaction. Such a contribution arises from the diagram shown in Let l be the momentum of the outgoing dark photon. Then the amplitude for this diagram is: where M(p − l) is the same amplitude for the hard subgraph above, but this time with momentum p − l entering the graph. (p) denotes the polarisation of the dark photon. In order to compute the splitting function contribution P (0,0,1) qq (x), we must determine the most collinearly-divergent part of the amplitude of Eq. (A.3). To this end, we write l in terms of its Sudakov decomposition: where l T is some transverse four-vector, η is a null vector satisfying 2η · p = 1, and x is a constant chosen to make the decomposition true. In particular, we see that as l T → 0, we have l → (1 − x)p, i.e. the initial and radiated partons become collinear, and the parton which goes on to participate in the hard interaction has momentum fraction x. Therefore, in this notation, the collinear limit is l T → 0. We now write Eq. (A.3) in terms of l T , and hence determine the most collinearly divergent parts that we are required to retain. Via a short calculation, we can obtain: Retaining only the divergent terms as l T → 0, the modulus-squared spin-sum/averaged amplitude is then given by: Inserting into the standard cross-section formula, and appropriately changing integration variables, reveals that the most collinearly-divergent part of the cross-section associated to this process is given by: As it stands, we see that σ NLO,real (p) also contains a soft divergence as x → 1, which characterises the divergence that occurs when the radiated parton's energy tends to zero. The soft divergent part is cancelled by adding up the virtual contribution, which corresponds to a dark photon being emitted by the quark, and later being reabsorbed. Fortunately, there is a clever probability argument which sidesteps actual computation of the virtual corrections. Let's begin by writing the virtual corrections as: where A is some appropriately chosen constant. It follows that the sum of the virtual corrections and the leading order graph is given by (1 + A)σ Born (p). Thus adding the virtual corrections, leading order graph and the real emission graph, we have: We interpret the coefficient of σ Born (xp) as the possible outcomes for the initial quark: it either radiates a B-boson, radiates a virtual quark which later recombines with the original quark, or it does none of the above. At this order, these are the only possibilities, and hence the probabilities of these events must sum to 1: where we introduce the regulator that cuts off the integral at 1 − by using the Heaviside step function Θ. We can now compute the (divergent) value of A that enforces this condition, finding: This reveals that the splitting function can be written (in a distributional sense) as: Manipulating this distribution shows that P (0,0,1) qq (x) can be written in the more concise form: where the plus distribution is defined in Eq. (2.5).
(x), we consider the radiation of an antiquark from a dark photon, as shown in Fig. A.3. kq B X Figure  The amplitude for this diagram is given by: As before, we can use the Sudakov decomposition to write the amplitude in terms of the transverse momentum l T : where we drop all terms that remain finite as l T → 0. Further simplification of the spinor structure, and ignoring terms that remain finite as l T → 0 throughout, the modulus-squared spin-sum/averaged amplitude becomes: Inserting into the cross-section formula, we have: and hence we deduce that the splitting function is given by: In this case, there are no virtual corrections to take care of at this order.
where A denotes the virtual contribution and the 1 denotes the contribution from the Born amplitude for the dark photon initiated process. Solving this equation, we find that A = −2/27, and it follows that the splitting function is given by: Contribution to P Again, the splitting function can be found using a simple probability argument. We note that: P To see why this equation holds, recall that P (0,0,1) qq (x) is the probability that a quark of momentum p will split into a quark of momentum xp (which participates in a hard interaction) and a dark photon of momentum (1 − x)p, whilst P (0,0,1) Bq (1 − x) is the probability that a quark of momentum p will split into a dark photon of momentum xp (which participates in a hard interaction) and a dark photon of momentum (1 − x)p. The only difference is the participant in the hard reaction, whose only effect on the splitting function is to determine whether virtual corrections are necessary or not. Hence Eq. (A.21) follows.
Therefore, discarding the virtual corrections from P

B Code implementation of the coupled evolution
The solution of the DGLAP equations in the presence of QED corrections [64][65][66] has been implemented in several tools; in particular, they are implemented in APFEL [67], which provides a public code, accurate and flexible, that can be used to perform PDF evolution up to NNLO in QCD and NLO in QED, using a variety of heavy flavours schemes. We implemented the modified DGLAP equations (2.1) in APFEL. The evolution is performed in x-space (rather than Mellin N -space), and uses a rotated basis of PDFs such that a maximal number of PDF flavour combinations evolve independently. If we define the following vector of PDFs: Here, the dots denote the relevant SM matrix, with the quark-quark splitting function corrected with a dark contribution as appropriate. This equation (together with the other decoupled scalar equations) is solved using an adaptive step-size fifth-order Runge-Kutta method, as described in [67].
To solve the modified DGLAP equations (2.1), we must also specify initial conditions for the dark photon; this is where we make appropriate ansätze for the functional form of the dark photon at the initial scale Q 0 = 1.65 GeV. If the mass of the dark photon m B were less than the scale Q 0 , we could postulate a functional form for the initial dark photon PDF assuming that the dark photon PDF is primarily generated by quark splitting. An appropriate initial condition in this case would be given by: (P Bq j ⊗ (q j (x, Q 2 0 ) +q j (x, Q 2 0 )), . On the other hand, in our phenomenologically relevant region, we have m B > 2 GeV; thus in our case, we always have m B > Q 0 . As a result, we set B(x, Q 2 ) = 0 for all Q < m B and we generate the dark photon PDF dynamically at the threshold Q = m B from PDF evolution, similar to the treatment of heavy quarks [25,26], and the tauon PDF in [24].

C PDF uncertainties
Here we quantify the effect of PDF uncertainty. PDF uncertainties have an experimental component and a theoretical one; at the moment, the highest perturbative order that is included by default in a PDF fit is NNLO (given that the PDF evolution at N 3 LO is not fully known yet), and as of yet no PDF set includes the component of the PDF uncertainty associated with missing higher order uncertainties in the theory predictions used in the fit. We have indications [68] that the inclusion of theory uncertainties in the PDF fits will not significantly enhance the PDF uncertainties in phenomenologically relevant observables. Hence, in this study, we make assumptions on the PDF uncertainties based on the best tools that we have available. There are several assumptions that can be made: • Assumption A: Ignore PDF uncertainties.
• Assumption B: Assume that the correct estimate for PDF uncertainties at the end of the HL-LHC phase is the one given in Scenario 3 of Ref. [41].
• Assumption C: Assume that the correct estimate for PDF uncertainties at the end of the HL-LHC phase is the one given in Scenario 1 of Ref. [41]. This is the least optimistic projections for PDF uncertainties.
• Assumption D: Assume that the PDF uncertainties will not decrease in the next 15-20 years from the current level, hence use the PDF uncertainty associated with the NNPDF3.1luxQED NNLO baseline.
In this paper, we adopt 'Assumption B' for our optimistic scenario, and we adopt 'Assumption C' for our conservative scenario. In Table C.1, we report the maximal 95% sensitivity obtained by varying these assumptions from our default choice. We observe that the uncertainties associated with 'Assumption D' would spoil the sensitivity, but this scenario is extremely far from realistic.