Precision Probes of QCD at High Energies

New physics, that is too heavy to be produced directly, can leave measurable imprints on the tails of kinematic distributions at the LHC. We use energetic QCD processes to perform novel measurements of the Standard Model (SM) Effective Field Theory. We show that the dijet invariant mass spectrum, and the inclusive jet transverse momentum spectrum, are sensitive to a dimension 6 operator that modifies the gluon propagator at high energies. The dominant effect is constructive or destructive interference with SM jet production. We compare differential next-to-leading order predictions from POWHEG to public 7 TeV jet data, including scale, PDF, and experimental uncertainties and their respective correlations. We constrain a New Physics (NP) scale of 3.5 TeV with current data. We project the reach of future 13 and 100 TeV measurements, which we estimate to be sensitive to NP scales of 8 and 60 TeV, respectively. As an application, we apply our bounds to constrain heavy vector octet colorons that couple to the QCD current. We project that effective operators will surpass bump hunts, in terms of coloron mass reach, even for sequential couplings.


Introduction
The hierarchy problem of the Standard Model (SM) singles out the TeV scale as special. New particles and dynamics are expected to appear around this scale, in order to allow for a natural explanation of the small value of the Higgs boson mass. Ongoing searches for New Physics (NP) at the Large Hadron Collider (LHC) are targeting the TeV energy scale. The LHC is operating at 13 TeV (at or close to its design energy), and although much more data will be collected, the reach on NP, expressed in terms of the mass M of particles that can be directly produced, is beginning to asymptote. If there is NP near the TeV scale, we are confronted with the possibility that it has mass too heavy for direct production at the LHC.
If NP is too heavy to be directly produced, it can nevertheless leave traces at low energies through virtual effects. When an experiment is performed at an energy E M , the effects of NP can be described in a completely general way in terms of the SM Effective Field Theory (EFT) [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16] L Here one includes all operators of dimension d bigger than 4 that can be constructed out of SM fields.
The value of an observable O(E), characterized by some typical energy E, will receive corrections with respect to its SM value O SM (E) which can be written schematically as where m SM is a typical SM mass scale and the dots represent terms further suppressed by inverse powers of M . κ and κ are model dependent coefficients which are typically expected to be of order 1. Notice that the corrections shown in Eq. 2 arise from the interference between the SM and the the NP and, if κ, κ = 0, are the leading corrections to the SM calculation. If one is able to perform a measurement of the observable O(E) with a precision ∆O/O ≡ δ < 1 then the combination M/ √ c i can be constrained with some accuracy. Let us consider two separate cases depending on the value of E: • E ∼ m SM . The energy used to probe the operator is of the order of some SM mass scale. This is the case in which one tries to constrain the NP measuring the branching ratio of known SM particles, for instance the Higgs boson or a B-meson. In this case the two terms in Eq. 2 are of the same size and one is sensitive to • E m SM . This is the case in which an operator in Eq. 1 is probed at high energies, for instance looking at its effect on the final state kinematical distribution for some SM process. In this case, assuming κ ∼ 1, one is sensitive to M until Given a fixed total accuracy δ, probing an observable at higher energy enhances the effect of the operators in Eq. 1 and allows to probe heavier NP. The smallest value of δ that can be achieved at the LHC for a given observable is limited by systematic uncertainties of both experimental and theoretical origins. However, the LHC is able to probe a variety of SM processes at very high energies and with very large luminosity. One then expects that, in certain cases, the accuracy gain associated to larger energy will be able to overcome the limited precision associated to δ. Studies along these lines have been performed already by various collaborations [17][18][19][20][21][22][23][24][25][26][27]. In particular, Ref. [28] has shown that already with existing 8 TeV LHC data, neutral and charged current Drell-Yan measurements are able to set world leading constraints on certain electroweak observables previously studied at LEP.
In this paper we apply this philosophy to the study of QCD at high energies. We are motivated by the existence of publicly available experimental measurements of both the dijet invariant mass distribution and the inclusive transverse momentum distribution performed by both ATLAS and CMS at 7 TeV center of mass energy [29][30][31]. Although other measurements of jet cross sections have been published by the experimental collaborations, even at higher collider energies, we found that only the aforementioned publications provide all the necessary statistical information, in particular the full set of correlated systematic uncertainties, which is needed to perform realistic fits to the data. Both inclusive jet and dijet cross sections have been studied as a probe of contact operators at the Tevatron [32][33][34]. By the previous argument, the LHC has a clear advantage over the Tevatron due to its higher center of mass energy.
We focus our attention on a specific dimension 6 operator [35] We normalize the size of the operator to the W boson mass, and define its strength with the dimensionless coefficient Z. In addition to being a reasonable benchmark to perform our analysis, the operator in Eq. 5 can be understood as an oblique observable for the QCD sector of the SM [35]. This operator is generated in a variety of BSM scenarios. As a weakly coupled example, in Section 5 we consider a new heavy vector boson, a color octet under strong interactions, coupled universally to quarks through the SU (3) c gauge current. In a strongly coupled theory where the gluon is composite at a mass scale Λ c we also expect Z to be generated with a size Z ∼ m 2 W /Λ 2 c . Assuming that CP is conserved by the NP, there is only one additional dimension 6 operator in the SM EFT that can be written in terms of the gluon fields only This operator, however, does not interfere with any 2 → 2 tree-level SM amplitude [36] and we do not discuss its effects in this paper. 1 1 In [35] the Z parameter was defined through Process A SM A (1) A (2) qq → qq 4 m 4 W Table 1: Contribution of Eq. 5 to 2 → 2 quark amplitudes. We write |M| 2 = g 4 where |M| 2 is the matrix element square summed over final state and averaged over initial ones. A SM is the SM value of the matrix element square. No sum over final state quark flavor is included for qq → q q . For the process p 1 p 2 → p 3 p 4 the Mandelstam variables are defined as usual as s = (p 1 + p 2 ) 2 , t = (p 1 − p 3 ) 2 , and u = (p 1 − p 4 ) 2 .
A simple way to understand the effect of Eq. 5 on jet processes at the LHC is to notice that it induces a higher derivative correction to the gluon kinetic term, so that the transverse part of the gluon propagator, P T G , is modified by a constant term proportional to Z The modified propagator will affect, at leading order, the matrix element squared of 2 → 2 quark processes as shown in Table 1. While at fixed angle the SM matrix element asymptotes to a constant at high energy, the Z dependent terms introduce an energy growing behavior. This will be reflected in an anomalous high energy behavior of kinematical distributions for pp → jets at the LHC. An alternative way to reach the same conclusions is to use the equation of motion of the gluon field to rewrite the operator in Eq. 5 in terms of four fermion contact operators It follows that at LO Eq. 5 will only contribute to the cross section of 2 → 2 processes with Using the Bianchi identity for the gluon field, D µ G A νρ + D ρ G A µν + D ν G A ρµ = 0 and integration by parts we find Eq. 5 is thus equivalent to the definition of [35] up to the inclusion of the GGG term.
external quarks. 2 LHC jet physics has already been used to constrain the SM EFT, by both theorists [17, 32-34, 37, 38] and the experimental collaborations, ATLAS [39][40][41][42][43][44][45] and CMS [46][47][48][49][50]. On the theory side, fully-exclusive NLO QCD predictions for jet production have been available for some time [51][52][53]. Very recently, the NNLO leading-color predictions were presented for the single inclusive jet transverse momentum and dijet invariant mass [54,55]. In this study, we improve on previous theory analyses by using state of the art fully differential NLO predictions interfaced to parton shower, by using the most recent public data, and crucially by including all significant sources of uncertainties both of experimental and theoretical origin. To the best of our knowledge, the operator in Eq. 5 has not been previously considered by the experimental collaborations.
The remainder of the paper is organized as follows. In Section 2, we use the 7 TeV public analyses from both CMS and ATLAS to constrain Z. In Section 3, we project the bounds on Z to higher energies and luminosities, in particular we consider the case of 8 and 13 TeV LHC and a possible circular pp collider running at a center of mass energy of 100 TeV. In Section 4, we discuss the validity of our bounds within the EFT framework. In Section 5, we introduce a simple UV completion of Eq. 5 in terms of a new heavy color octet boson coupled to the SM fermions through the SU (3) c gauge current. We compare the reach of ordinary resonance searches to our bounds extracted using Eq. 5. In Section 6, we discuss other operators affecting jet physics at high energies. Finally, we conclude in Section 7. In particular Fig. 12 in Section 7 summarizes our main results.

Bounds from existing searches
To constrain the operator in Eq. 5 we use the existing double differential measurements of the dijet cross section and inclusive jet cross section performed by both ATLAS and CMS collaborations at a center of mass energy of 7 TeV [29][30][31]. As anticipated in the previous Section, what makes these older analyses the most relevant to us is the fact that both collaborations provide the full statistical information which is needed to use these measurements to constrain NP. This information includes, in particular, the full statistical covariance matrix for the correlated systematic uncertainties. 3 The experimental analysis are briefly summarized in Table 2.
In order to compare with the available experimental data we have to calculate, for a given experimental distribution, both the SM prediction and the corrections from Eq. 5.
The SM predictions for all the distributions include QCD effects at the Next to Leading Order 2 A third way to reach again the same conclusion is to perform the following field redefinition at order Z  (50) GeV i/2 < y * < (i + 1)/2 |y| < 3, R = 0.6 i = 0, . . . , 5 ATLAS inclusive jet [30] 4.5 Table 2: Summary of the experimental searches used for our analysis. The double differential measurements are always presented in terms of an energy variable (the dijet invariant mass, m jj , for the dijet search and the jet transverse momentum, p T , for the inclusive jet one) and a rapidity variable. The variable y * , used by the ATLAS dijet analysis, corresponds to y * ≡ |y 1 − y 2 |/2, where y 1,2 are the rapidities of the two jets. Notice that y * = | log tanθ/2| whereθ is, at leading order, the scattering angle in the center of mass frame. R is the jet radius.
(NLO) interfaced to a parton shower using the POWHEG method [57,58] as implemented in the POWHEG-BOX program [59,60]. Showering and hadronization are performed with Pythia6 [61] and Pythia8 [62]. Jets are reconstructed using the anti-k T jet algorithm [63], with different radius parameters. We produced results for R = 0.4, 0.6, and 0.7. We concentrate on the choices R = 0.6 for ATLAS and R = 0.7 for CMS, as these values are the ones for which a better fit of the SM background is obtained. 4 The theory predictions are calculated using parton distribution functions from NNPDF3.0 [64]. This choice is motivated by the availability of replicas obtained fitting a reduced set of experimental measurements which do not include jet data. As it will be explained shortly, this turns out to be useful to estimate to which extent the effects of NP affecting jet measurements could be hidden by the PDF fitting procedure. More details about the generation of POWHEG predictions can be found in Appendix A. We consider four sources of theoretical uncertainty: scale variations, PDFs, α s choice and showering and hadronization effects. We evaluate the scale uncertainty on the cross section by varying the renormalization and factorization scale, µ r and µ f . We consider seven choices of scales (µ r , µ f ) = (a, b) × p  . The PDF uncertainty grows with the energy and dominates over the scale one for the NNPDF30 nlo as 0118 nojet PDF set (S no-jet ). The PDF uncertainty associated to the NNPDF30 nlo as 0118 PDF set (S jet ) is roughly one half of the one associated to S no-jet .
PDFs uncertainties and correlations across all energy and rapidity bins are evaluated using the standard NNPDF replicas prescription. 5 The α s uncertainty is evaluated using the PDF4LHC recommendation [65]. Two additional PDF sets are provided employing two different values of the strong coupling constant, α (65). Similarly to the scale uncertainty, for any bin i we take the uncertainty on the cross section associated with α s to be given by (σ max Finally, in order to estimate the uncertainty due to showering and hadronization we perform the showering and hadronization stages with two different shower Monte Carlo programs, Pythia6 [61] and Pythia8 [66] and with three different tunes. We associate an error to the result σ i obtained for each shower program and tune choice and we perform a convolution of the results to produce an error estimate using the same procedure we used for scale uncertainty. The shower and hadronization errors obtained in this way are included in the fit. For the observables under consideration this uncertainty is usually small, of order few percents. 5 Each NNPDF set comes with a central value and a set of replicas. Given two bins i and j in which a cross section has to be calculated, the ij element of the PDF covariance matrix is given by whereσ i and σ (r) i are the cross section in the i-th bin calculated with the central value PDF and the r-th replica respectively. N R = 100 is the number of replicas in a given NNPDF set.
In Fig. 1 we display the fractional value of the leading theoretical uncertainties for the dijet and inclusive p T analyses performed by ATLAS, in the central rapidity bin. Similar results hold for the analogue analyses performed by CMS. The scale uncertainty is almost constant in energy, of order 10% and dominates at low energies. For the PDF uncertainty, we show two possible PDF choices, both from NNPDF3.0. The first one, dubbed S jet , is the standard NLO set with α s (m Z ) = 0.118 and set name NNPDF30 nlo as 0118. The second one, corresponding instead to set name NNPDF30 nlo as 0118 nojet and dubbed S no-jet , does not include jet data, both from Tevatron [67] and low luminosity Run-I LHC (ATLAS inclusive jet at 2.76 and 7 TeV [68], CMS 7 TeV inclusive jet and dijet [31] and tt production at ATLAS [69][70][71] and CMS [72][73][74].) For both sets, the PDF uncertainty is negligible at low energies, but can become the dominant source of theory uncertainty at high energy in the case of S no-jet . The fractional uncertainty associated with S jet is roughly half the size of the one from S no-jet , throughout the energy range. The reason is the gluon PDF, which is strongly constrained by the jet data included in S jet .
The large uncertainty associated with S no-jet introduces an important limitation in trying to constrain the SM EFT by measuring the high energy tail of some kinematical distribution involving jets. One might be tempted to only use S jet , given its smaller uncertainty, but there is a possible concern in doing so. If NP is present, it could contaminate the same jet data that are used in S jet to constrain the PDFs. Using S jet to constrain this same NP would then be circular and potentially result in an artificially strong bound. However, we point out that contact operators in general, and Eq. 5 in particular, give their largest contributions in the central kinematical region, while PDF extraction is typically more sensitive to the forward region, where the cross section is larger. Given this caveat, we will proceed by using both sets S jet and S no-jet in the following, as comparitive benchmarks to estimate the sensitivity to the Z parameter.
In Fig. 2, we compare our calculations with the experimental results for both dijet and inclusive p T , for both ATLAS and CMS, and for the two most central rapidity bins. While the black error bars represent the fractional size of the 1σ experimental uncertainties, the shaded region displays the fractional size of the theory uncertainty, calculated as the sum in quadrature of all the effects described above. The two theory uncertainty bands correspond to the two choices of PDF sets, the wider one being associated to S no-jet and the smaller one to S jet .
In order to get a sense of the quality of the fit we build a χ 2 statistic where σ th i and σ exp i are the theory and experimental cross section in the i-th bin, respectively, and Σ is the uncertainty covariance matrix constructed as Σ exp is the covariance matrix provided by the experimental collaboration and Σ th is the theory covariance matrix including scale, PDF, α s , and hadronization uncertainties. We notice that with two exceptions (the 0.5 ≤ y * < 1 bin in the ATLAS dijet and inclusive jet measurements when fitted with S jet ) all the p-values show good consistency between theory and data.
Inclusive jet  The results of the double-differential fit are presented in Appendix B. We briefly discuss them here to point out their main features. While the fits to each individual rapidity bin have a reasonable p-value, the p-values worsen when one tries to fit multiple rapidity bins. This is especially the case for ATLAS inclusive p T and dijet, where including more than two rapidity bins leads to a vanishingly small p-value. The possibility of this behavior is due to the existence of correlated uncertainties. Including only the central rapidity bin of one analysis, the main source of uncorrelated uncertainty is from the PDF. When two or more rapidity bins are used, one has to take into account the cross correlation of the associated PDF uncertainty. This cross correlation is shown in Fig. 3.
As expected, in a given rapidity bin the PDF uncertainties are highly correlated for nearby energy bins. What may be surprising is that nearby energy bins are also highly correlated across different rapidity bins. The reason for this is that for a given center of mass energy, √ s, and a given m jj bin, 6 approximately a single value of x 1 x 2 = m 2 jj /s is sampled by the PDFs (where x 1 and x 2 are the two parton momentum fractions). When fitting a double differential distribution, part of the uncertainty associated to the PDFs will drop because of this correlation, and this can lead to a worsening of the fit. Clearly such a degradation is only expected if some of the uncertainties (either theoretical or experimental) have been underestimated. We are not able to pin point with certainty the source of the problem, which seems to mainly affect the ATLAS results. We note that Ref. [75] makes a similar observation, when fitting the inclusive jet p T double differential distribution at 8 TeV.
We stress, however, that we find good agreement between data and our calculation in the central region of each search. This agreement persists if we fit the ATLAS dijet and inclusive jet data using a fixed order NLO calculation from NLOJet++ [76]. The details of this additional check can be found in Appendix B.

Analysis
S no-jet -1bin S no-jet -2bins S jet -1bin ATLAS dijet [-19.8 Table 2, while different columns correspond to different PDF sets or different numbers of rapidity bins included in the fit. The entries marked with '*' are those for which the fit excludes the SM (Z = 0) at 95% CL.
We can now proceed to constrain the operator in Eq. 5. For a given experimental measurement, the cross section in a certain energy bin is quadratic in the Z parameter σ SM i is the value of the SM cross section in the i-th bin. For each analysis, we use the leading order formulas in Table 1 to calculate the values of σ (1) i and σ (2) i , by integrating the partonic cross sections over the phase space defined by the experimental cuts for each analyses. The PDF integration is performed in Mathematica using the available PDF grids from the LHAPDF set [77]. The inclusion of the NLO correction to the BSM calculation can result in a reduction of the BSM cross section by 10−20% at large m jj /p T [78].
This will result in a similar weakening of the bound on Z. We notice that these BSM NLO effects will only represent a small multiplicative correction of order 10% to the NP scale which is constrained by the searches. For this reason we consider their inclusion of lower priority with respect to the evaluation of SM NLO effects and their associated uncertainties.
In order to constrain Z we construct a likelihood function where Σ is the same covariance matrix of Eq. 15. Using the normalized likelihood function L(Z) as the posteriori probability distribution for Z (assuming a flat prior on Z) we calculate Confidence Level (CL) intervals for Z as the iso-contours of L containing a given probability p. We validate our procedure by reproducing the contact operator bounds set by ATLAS [29] and CMS using the same analyses [79]. The results of our validation are shown in Appendix C and they are consistent with the results of the experimental collaborations.
Our ability to constrain Z relies on the fact that the ratios σ (1) i /σ SM i and σ (2) i /σ SM i grow as the appropriate energy variable, either the invariant mass or the transverse momentum, is increased. In particular the effect of Z is such that positive values of Z correspond to a positive interference with the SM (σ  We display the results of the fit in Table 3. We present them in three different ways corresponding to different choices of the PDF set and the number of rapidity bins included in the fit. Using only the most central bin of each analysis and S no-jet , we find a very similar constraint on Z. We notice in particular that the 95% CL interval is asymmetric, with the limit being weaker for Z < 0 where there is negative interference with the SM. This is due to the fact that the large systematic uncertainties, in particular the PDF ones, make the likelihood non-Gaussian, and the limit non-symmetric. We also present the same results using either S no-jet and two rapidity bins, or S jet and a single rapidity bin.
As explained above, the fits combining two rapidity bins have, in general, lower p-values at the Z = 0 point. This in particular implies that the ATLAS dijet and CMS inclusive-jet fit exclude the SM at the 95% CL. We notice however that for the other searches the constraint improves significantly with respect to S no-jet /1 bin, and the reason for this is the large correlation of the PDF uncertainties across different rapidity bins. We do not show the results of the fit by adding more rapidity bins, because we find the addition of additional bins, beyond the two most central, to have a negligible impact on the constraint on Z.
For the S jet /1 bin fit, the size of the 95% CL interval is again smaller than the one of the S no-jet /1 bin fit, and comparable to that of the S no-jet /2 bin one. Again, including additional rapidity bins do not significantly impact the bound.
We conclude this section by noting that, already with half of the full LHC center of mass energy and a small fraction of the final available luminosity, jet physics is able to place powerful constraints on modifications of the QCD sector, such as those represented by Eq. 5. The constraint on the Z parameter is at the level of 5 × 10 −4 level, corresponding to a mass scale 95% CL bounds on Z × 10 4 for  Table 4: Projected 95% CL constraints on the Z parameter for the 8 TeV LHC with 20 fb −1 of integrated luminosity. The two rows correspond to the limit extracted from either the dijets or the inclusive jet analysis, respectively, while the three different columns correspond to different PDF sets or different number of rapidity bins included in the fit.

Projected reach
In this section, we project the reach on Z to values of the center of mass energies and luminosities for which either the full statistical information regarding the measurements, or the measurements themselves, are not yet available. In particular, we show the projected reach for a possible future 100 TeV pp collider.
As for the 7 TeV dataset, we consider two different kinematical distributions: dijet invariant mass and inclusive jet p T . At 8 and 13 TeV, the analyses are defined using the same kinematic cuts as the ATLAS ones listed in Table 2. At 100 TeV, the η cut for the dijet analysis is raised to |η| < 5. We proceed to calculate a Z dependent prediction for all the distributions, at the different energies, in the same way we did in the previous section for the 7 TeV analyses. In Fig. 4, we show the breakdown of the theory uncertainties for the central rapidity bin of the dijet invariant mass distribution, at both 13 and 100 TeV. The behavior of the various uncertainties is indeed similar to the 7 TeV analysis, as shown in Fig. 1.
In order to estimate the reach on Z we consider the following likelihood function The covariance matrix, Σ, is given by Like in the previous section, Σ th is the theory covariance matrix including scale, PDF, α s , and hadronization uncertainties. As an estimate of the experimental uncertainties, we assume Σ exp to be the sum of two components, one completely uncorrelated (δ U ) and one completely correlated At 8 and 13 TeV, we take (δ U , δ C ) = (2%, 5%), while at 100 TeV, (δ U , δ C ) = (5%, 10%). We validated this choice by redoing our 7 TeV, replacing the true experimental uncertainties with the fixed values (δ U , δ C ) = (2%, 5%). We verify that the bounds of Table 3   . conservative uncertainties for our 100 TeV projections. The statistical part of the covariance matrix is defined by where L is the integrated luminosity.
In Table 4 we show the 8 TeV projections. For the inclusive jet analysis, we use the binning and cuts of the very recent ATLAS analysis [75], for which the statistical information regarding the correlation of experimental errors is not yet publicly available. For the dijet one, we instead adopt the same binning and cuts that we used at 7 TeV. The projections show a similar constraining power of the dijet and inclusive jet analyses. Furthermore, they show once again the crucial role of the PDF uncertainty in limiting the reach. This is evident from the different size of the 95% CL interval extracted from S no-jet /1 bin and S no-jet /2 bin or S jet /1 bin, the latter having comparable size Z ≈ 2 × 10 −4 .
A similar trend emerges for our projections at 13 and 100 TeV. The dijet and the inclusive jet search both have similar constraining power on Z at fixed center of mass energy and luminosity. We notice again that CLs extracted from the S no-jet /2 bin and S jet /1 bin fits are similar in size and stronger by roughly a factor of 2 than those extracted from the S no-jet /1 bin fit. One important message from our results is the role of the luminosity. At 13 TeV the reach is of order Z ≈ 10 −4 with 40 fb −1 , and only improves by roughly 50% with the full luminosity of HL-LHC, corresponding to a NP scale of order M ≡ m W / √ Z ≈ 8 TeV . In particular we find almost no improvement in going from 300 fb −1 to 3 ab −1 . This is attributed to the large size of the theory systematic uncertainties.
Increasing the energy by roughly a factor of 8, going from LHC to a future 100 TeV pp collider, results on the other hand in a very strong sharpening of the bounds, by roughly a factor of 30, corresponding to Z ≈ 1.5 × 10 −6 and M 60 TeV.
The similarity, observed throughout our study, between the results of the S no-jet /2 bin fits and  the S jet /1 bin ones, implies that the double differential distribution of both dijets and inclusive jets is effectively able to constrain the PDFs, avoiding a possible contamination from NP. As we discussed in the previous section, this happens because measurements of the cross section in the same invariant mass (or transverse momentum) bin, performed at different rapidities, provide two independent measurements of the same value of x 1 x 2 .
We will now point out an alternative way to reduce the large PDF uncertainty associated to S no-jet : combining measurements performed at different collider center of mass energies. The reason for this can be exemplified by noticing that the dijet cross section at 8 TeV, for m jj = 2.5 TeV, constrain the same value of x 1 x 2 ≡ m 2 jj /s that is constrained by a measurement performed at 13 TeV for m jj = 4 TeV. The existence of a large correlation of PDF uncertainties across bins with the same value of m jj / √ s is shown in Fig. 5. We can thus use this observation and combine 8, 13, and 100 TeV projections by carefully including their correlated PDF uncertainties. We use S no-jet and fit only the most central bin at each energy. The scale uncertainty is taken to be completely correlated across all bins included in the fit and the uncertainties in Eq. 20 are included assuming no correlation across different center of mass energies. The results of the 8 + 13 TeV and 8 + 13 + 100 TeV are shown in Table 6, and they support our reasoning. The 8 + 13 TeV combination is as constraining as the S no-jet /2 bin and S jet /1 bin ones, while the 8 + 13 + 100 TeV delivers a slightly stronger bound.

Validity of the effective field theory
One possible concern in using high energy processes to probe the SM EFT, is the fact that, like any EFT, its validity will break down at some high energy scale. In the case of the operator in Eq. 5, this comes about because of its nature as a higher derivative correction to the gluon kinetic term. The gluon propagator can be written down, at all orders in Z, giving Expanding the above expression for small Z, one obtains Eq. 9. At an energy E ∼ m W / √ Z, higher order corrections, for instance from operators of dimension 8 and above, are expected to become relevant, invalidating the effective description in term of Eq. 5. 7 The scale Λ max (Z) ≡ m W / √ Z represent the maximal cutoff associated to Eq. 5: depending on the specific UV completion, the NP is expected to appear at or before Λ max . 8 Note that Λ max (Z) is only an approximate definition of the maximal scale at which the EFT will break down. A more precise description requires specifying both the UV theory replacing Eq. 5, and the value of the required accuracy.
Since we are using high energy jet data to extract bounds on Eq. 5, we have to make sure that the EFT we are using is under control. Studies along this direction are already present in the literature in the framework of the SM EFT and DM production at colliders [28,[81][82][83][84][85][86][87][88][89][90][91]. We adopt the following simple procedure. For a given analysis we recalculate the bound on Z including only events for which the relevant energy variable, m jj or p T , is below a given value Λ cut . The 95% CL bound now becomes a function of Λ cut , Z(Λ cut ). This bound will be consistent with the validity of the EFT if when using events with m jj < Λ cut and 7 For Z < 0, Eq. 22 implies the existence of a ghost with mass m W / √ Z. However, no general positivity argument exists to imply Z > 0 as explained in Ref. [80]. 8 It is interesting to note that when the equation of motion for the gluon field is used to rewrite Eq. 5 as a four-fermion operator (see Eq. 11), the naive cutoff of the EFT, defined as the energy scale at which the qq → qq amplitude becomes O(16π 2 ), is larger than Λ max by a factor 4π/g s . There is no contradiction, as Eq. 11 only captures the physics of Eq. 5 at leading order.  Figure 6: 95% CL bounds on Z from the 7 TeV ATLAS and CMS analyses described in the text, as a function of Λ cut . For a given value of Λ cut only events with m jj < Λ cut for dijets (left panel) or p T < Λ cut for inclusive jets (right panel), are used to extract the limit. The gray area is defined as Z > m 2 W /Λ 2 cut for dijets and Z > m 2 W /(4Λ 2 cut ) for inclusive jets, and roughly corresponds to the region in which the EFT description is no longer a valid approximation (see Eq. 23).
when using events with p T < Λ cut . The factor 1/4 appearing in Eq. 24 is chosen because for a given invariant mass m jj the maximal available p T is m jj /2.
In Fig. 6, we show the 95% CL bounds on Z, for Z > 0, as a function of Λ cut for the 7 TeV LHC analyses. We see, as expected, that the bound degrades lowering the value of Λ cut . We also see how the large uncertainties associated to the PDF set S no-jet make the limits barely consistent with the validity of the EFT. On the other hand the bounds extracted using S jet lie, for most cases, below the shaded region in the Figure, representing the bound in Eq. 23.A quantitatively similar result holds if, in place of S jet , we use S no-jet and fit the two most central rapidity bins.
The Λ cut plot for the 8 TeV dijet projection is shown in Fig. 7. Again, while the bound extracted from S no-jet is barely consistent with the validity of the EFT, the one obtained from S jet is well within the allowed region.
Finally, in Fig. 8 we show the Λ cut dependence of our 13 TeV and 100 TeV projections. In this case, we use S no-jet , but combine 8+13 TeV and 8+13+100 TeV results in order to constrain the PDF variations. For the 100 TeV case, Fig. 8 shows the effect of the 13+100 TeV combination: lowering Λ cut below 10 TeV pushes the bound on Z, extracted from the 100 TeV data alone, outside the validity of the EFT. The inclusion of the 13 TeV dataset brings the bound back into the allowed region. Similar results, which are not shown here, can be obtained from our inclusive jet projections.
Note that the energy-combination bounds fall more steeply than the naive expectation of Λ −2 cut . The reason for this is the different energy dependence of the gluon parton luminosity (which controls the background) and the valence quark one (which controls the signal), the latter falling more slowly with energy than the former.
These Λ cut plots have many interesting features. They show, for instance, which energy scales are responsible for setting the bound on Z. As Λ cut is increased, each bound improves up to a

Derivative expansion breakdown
Dijets (8 TeV -20 fb -1 ) S jet /1bin (solid) Figure 7: Λ cut plot for the 8 TeV dijet projection. We use only the central rapidity bin, y * < 0.5, to extract the limit. We show the different behavior between PDF sets S no-jet and S jet . Fitting the first two rapidity bins of the dijet distribution yields quantitatively similar results to the S jet curve. 95% CL limits are shown for both Z > 0 (constructive interference with the SM) and Z < 0 (destructive interference with the SM).
certain Λ * cut , above which the bound flattens out. This is the energy at which the suppression due to the PDF luminosity becomes too small to overcome the energy growth due to Z. We see that for dijets, this energy scale corresponds to roughly Λ * cut ∼ 3 TeV at √ s = 8 TeV, and Λ * cut ∼ 5, 50 TeV for center of mass energies of 13 and 100 TeV, respectively. As we will demonstrate in the next Section, our Λ cut -dependent bound on Z can be used to constrain explicit NP models generating Eq. 5 at low energy. If M is the mass scale at which new states are present, one can identify Λ cut = c × M , with c 1 and use Z < Z(Λ cut ) to constrain M .

Constraints on a vector color octet
A simple and motivated extension of the SM, that can be constrained by our bounds, is the theory of a new massive color octet vector boson, G, which couple to the SM quark SU (3) c current [92][93][94][95][96] where D µ is the SU (3) c covariant derivative for the adjoint representation, and T A are the SU (3) c generators in the fundamental representation. The model is described by two parameters: the mass M of the new color octet and its coupling g G to the quark current. The Lagrangian in Eq. 25 has to be considered an effective description below the energy scale Λ = (4π/g G )M , at which G self-interactions become strong. Eq. 25 can emerge as the low energy description of a composite theory in which G represents an excited state of the gluon or, for instance, as the

Derivative expansion breakdown
Dijets (S no-jet /1bin) It is worth pointing out the existence of simple weakly coupled UV completions of Eq. 25. Consider an extension of the SM in which the SU (3) c color group is replaced by an SU (3) 1 × SU (3) 2 gauge theory with gauge couplings g 1 and g 2 . We take the SM quarks to transform under SU (3) 1 . Introducing a scalar Φ, transforming as a bi-fundamental under the gauge group, and assuming Φ obtains a vacuum expectation value Φ = V × I 3×3 , the gauge group is broken down to SU (3) 1+2 , which is identified with SM color. The orthogonal combination obtains a mass from the kinetic term of Φ, The physical states, the gluon G and the heavy octet G, are given by the linear combinations Matching to Eq. 25 it follows that When G is light enough, it can be singly produced at the LHC through qq → G, and observed through its decay to dijets. The experimental signature is a bump in the dijet invariant mass  distribution. On the other hand, as the mass of G is increased, these searches are expected to become ineffective. In this heavy octet limit, integrating out G in Eq. 25 generates, at leading order in 1/M 2 , the effective four-fermion operator According to Eq. 11, this four-fermion operator corresponds to a value of Z given by In Fig. 9, we show how G affects the dijet and inclusive jet cross sections at the LHC for √ s = 13 TeV. We compare the calculation of pp → G → jj, using Eq. 25, with the EFT prediction from Eq. 30. We use two benchmark points for the mass, the first one, m G = 12 TeV, in which the octet is too heavy to be directly produced and the second one, m G = 7 TeV, in which the new vector can be produced on-shell. For this second case we also increase the width of G with respect to the value predicted by Eq. 25 Fig. 9 shows that the EFT calculation is able to describe the cross section with better than 10% accuracy throughout its naive regime of applicability: for the heavy benchmark point this is the whole kinematical range available at 13 TeV, while for the light benchmark case we take it to be m jj < M − Γ G /2 for the dijet analysis and 2p T < M − Γ G /2 for the inclusive jet one. Conventional bump-hunt searches (see for instance ATLAS [42,44,45,47,[97][98][99] and CMS [100][101][102][103][104][105][106][107][108]) are insensitive to the effects of a dijet resonance that is too heavy to be produced directly. Conversely, a fit to the differential distribution in terms of Eq. 5 or, more generally, the SM EFT, is able to capture the virtual effects of G in a theoretically solid, model independent way. Even for a lighter resonance, the tail of low energy events with either m jj < Λ cut or p T < Λ cut , displayed in the bottom row panels of Fig. 9, would be missed by a search looking for a narrow resonance.
We can now compare the projected reach of traditional dijet resonance searches on the parameter space of the model in Eq. 25, with our limits on Z. We do this in Fig. 10. The LHC limits on G are taken from Ref. [109]. For the bound on Z, we use the one shown in Fig. 8 for the 8+13 TeV combination. We take Λ cut = M − Γ G /2. We see that, while at low masses and couplings the resonant searches provide the strongest constraint on the model, their reach degrades both at large M and at large g G , as G becomes a wide resonance. Conversely, the EFT bounds are able to access a much wider region in masses and coupling, up to 30 TeV for g G ≈ 3. Notice that even for a vector resonance which is as weakly coupled as QCD, g G = g s , the EFT bound extends up to 10 TeV. As already noticed, even a large increase in luminosity, going from 40 fb −1 to 300 fb −1 only results in a modest increase in the reach. This example opens up the exciting possibility of testing, and maybe even discovering, NP within the dataset already available from Run-II of the LHC.

Other operators
In this Section, we comment on other dimension 6 operators that can affect jet physics at high energies. Requiring that there is at least one 2 → 2 SM amplitude with which the dimension 6 operator can interfere at tree-level, generating an effect that grows with energy, all the relevant terms can be written as a linear combination of four quarks operators. Assuming that the NP satisfies Minimal Flavor Violation [110], the full set of four quark operators has been classified by Ref. [17]. Neglecting terms suppressed by the Yukawa couplings, or involving nontrivial flavor contractions, the list of operators is the following: In Eq. 32, the operators on the right involve a color octet contraction, while those on the left a singlet one. In order to fix the notation we normalize the operators in Eq. 32 according to where v ≈ 246 GeV is the Higgs vacuum expectation value. As mentioned in the Introduction, the operator in Eq. 5 is a linear combination of the octet operators in Eq. 32. In particular  Table 7: 95% CL projected bounds on the W parameter, extracted from the dijet invariant mass distribution at 8 and 13 TeV. We compare the dijet reach with the neutral and charged Drell-Yan bounds from Ref. [28]. Note that the 8 TeV pp → entry corresponds to the recasting of actual data from ATLAS and CMS [28], whereas the other 8 TeV bounds are projections.
Including all the operators in Eq. 32 modifies the dijet/inclusive jet cross section in a given bin in a way analogous to Eq. 16, where the c I are the coefficients of the dimension 6 operators in Eq. 32, both singlets and octets. We could now in principle repeat our fits using Eq. 35 to construct a likelihood function, analogously to Eq. 18. Due to the suppressed size of the SM interference of many combinations of operators in Eq. 32, the resulting likelihood function would be highly non-Gaussian and the result of the global fit would be impossible to present in a closed form. For this reason, instead of attempting to be fully general, in the following we present 8 and 13 TeV projections for the bounds on a few motivated combinations of operators in Eq. 32.

The W parameter
The W parameter is an oblique correction to the electroweak sector of the SM [35] induced by the following dimension 6 operator Notice the analogy with the definition of Z in Eq. 5. Using the equation of motion of the W boson, Eq. 36 corresponds to The bounds on W we obtain using 8 and 13 TeV projections for the dijet invariant mass distribution are shown in Table 7. Similar bounds are obtained using the inclusive jet p T distribution. The bounds on W in Table 7 are weaker than those on Z extracted from dijets. This is due to the g 2 2 /g 2 s suppression of the operator in Eq. 37 compared to Eq. 11. 95% CL bounds on c I × 10 3
Our bounds use the 8 and 13 TeV calculation for the dijet invariant mass distribution. We use only the central rapidity bin and S jet in the fit.
We find it useful to compare these bounds on W with the bounds obtained in Ref. [28], studying neutral and charged Drell-Yan processes at the LHC. We find that leptonic final states are superior in constraining W, which is a consequence of the smaller systematic uncertainties impacting Drell-Yan cross sections.

Quark compositeness
Composite Higgs models with partial compositeness, which avoid the flavor problem by imposing large flavor symmetries in the strong sector, may require some of the light SM quark chiralities to be strongly coupled to the composite sector in order to explain the top quark mass [111]. If one of the SM quark chiralities is composite one expects operators in Eq. 32 to be present in the low energy theory with coefficients where Λ c and g c are the scale of compositeness and the coupling of the SM quarks to the strongly interacting sector, respectively. In Table 8 we show the bounds on four of these operators, assuming that only one of them at a time is present in the Lagrangian. This is a reasonable assumption since, if only one of the chiralities is composite, the operators which are shown are expected to be the most important ones. Notice that we don't include O uu , and O dd . The reason for this is that, at the LHC, the BSM dijet cross section is dominated by processes which are initiated by uu, dd, and ud. Keeping only terms involving the first generation, one can show that O (3) qq and O (8) uu,dd can be rewritten, using Fierz identities, into combinations of operators appearing in Table 8. The bounds shown in Table 8 are asymmetric as a consequence of suppressed interference with the SM and the importance of the quadratic terms in the cross section (see Eq. 35). Jet data constrain universal u R compositeness more strongly than universal d R compositeness. The reason for this is the larger value of the u quark PDF with respect to the d one. At 13 TeV, the bounds on the scale Λ c can be as strong as 70 TeV, for maximal g c ∼ 4π.
Dijet -8TeV 20fb -1 Dijet -13TeV 300fb -1 gaussian (dashed) Figure 11: 95% CL contours for the coefficients r V /M A and r A /M A , in Eq. 40, extracted from our dijets projections at 8 and 13 TeV. We use S jet , fitting only to the central rapidity bin. The allowed region is the one bounded by the lines. The dashed contours are obtained from the Gaussian approximation to the likelihood function, by discarding quadratic terms from Eq. 35.

The axigluon
A more general scenario than the one presented in Section 5 is represented by a color octet vector A, of mass M A , whose interactions with quarks is a combination of vector and axial couplings [92-94, 112, 113]. This so-called axigluon has interactions, Integrating out A generates the following dimension 6 operator in the low energy theory: In Fig. 11, we show the combined bounds bounds on the quantities r V /M A and r A /M A , obtained from dijet projection at 8 and 13 TeV. We find that, for r V , r A ∼ 1, the bound on M A is of order 5 and 10 TeV for center of mass energies of 8 and 13 TeV, respectively. Red and blue correspond to fits to ATLAS and CMS data, respectively, whereas gray corresponds to future projections at 8 and 13 TeV.
We focused on the parameter Z defined in Eq. 5 and reproduced here for convenience Z can be understood as an oblique correction for the QCD sector of the SM. Its effect is to modify the gluon propagator and to induce energy growing terms in 2 → 2 quark amplitudes.
The results of our fits to the 7 TeV data, and projections at 8 and 13 TeV, are summarized in Fig. 12. We display our projections in terms of the mass scale Λ = m W / |Z|. We see that dijets and the inclusive jet p T spectra have similar constraining power. The public 7 TeV data constrain Λ 3.5 TeV, and our 8 TeV projections push this scale to 4−5 TeV. 13 TeV data are expected to double this reach, as shown in the figure. Our projections show, in particular, that the EFT reach quickly saturates: going from 40 fb −1 to 300 fb −1 at 13 TeV only provides a minor increase in the value of Λ, see Table 5. We also extract the potential reach of a future pp collider, with 100 TeV center of mass energy, finding a bound on Λ of order 80 TeV.
Despite the existence of significant uncertainties of both theoretical and experimental origin, high energy QCD measurements can be a powerful probe of the SM EFT. We find, in particular, that our results are competitive with the bounds from Drell-Yan in [28] when they can both be applied to the same UV theory. If we assume, for instance, that all the SM gauge bosons are composite at the same energy scale Λ C , our 13 TeV-300 fb −1 dijet projections imply Λ C 11 TeV. This is comparable with the results of [28] reporting Λ C 9.6 TeV with the same energy and luminosity, obtained as a constraint on the parameter W from charged Drell-Yan projections.
A crucial element, that enters our fits, is the evaluation of systematic uncertainties of theoretical origin, in particular the scale and the PDF uncertainties. The uncertainty associated to the PDFs is the major limiting factor in constraining Z. We discussed three possible ways to deal with this large systematic uncertainty. The first is to use a PDF set including high energy jet data in its global fit (what we call S jet in the paper), reducing the uncertainty associated to the gluon PDF. The second possibility is to exploit PDF correlations across different rapidity bins by performing a doubly differential fit. Finally, the third option is to exploit PDF correlations across different center of mass energies by performing an 8+13 TeV data combination. These three different methods give similar results when they are all applicable. In particular, both the doubly differential fit and the energy combination do not suffer from the potential problem of overfitting the data, in case the new physics we are trying to constrain also affects the jet data entering the determination of S jet .
The possibility to fit Eq. 5 to a differential cross section, either the dijet invariant mass distribution or the inclusive jet transverse momentum distribution, is of fundamental importance. Any EFT is valid only below a certain energy scale, and this scale has to be specified in order to define the theory. The maximal value of the EFT cutoff, for Eq. 5, is given by Λ max ∼ m W / |Z|. However, the actual EFT cutoff can be lower in specific UV completions. Fitting to a differential distribution allows us to assess how the bound on Z changes by lowering this cutoff, and therefore to understand when the EFT framework is useful to describe the data.
We applied the cutoff dependent bounds, shown in Figs. 7 and 8, in the context of a specific UV completion of Eq. 5, in which Z is generated by integrating out a massive vector boson coupled to the SM SU (3) c quark current. We find that the EFT is able to probe a large region of the parameter space of the model, surpassing the mass reach of traditional searches for narrow dijet resonances (see Fig. 10).
If a deviation from the SM prediction is measured in some high energy observable, then comparing different differential distributions will become of fundamental importance. Given the existence of systematics uncertainties, it is only by comparing multiple distributions and examining their correlations, within the SM EFT framework, that a call for a discovery can be made.
Our results can be extended in various directions. It would be interesting to use the emerging NNLO results of [54,55] to reduce the uncertainty associated to the scale variation of the SM predictions. A second improvement in our analysis would be to include NLO BSM effects in our calculation for a more precise estimate of the NP scale. Finally, more operators from the SM EFT can be considered. We initiated this program in Section 6, but have not presented bounds on all possible operators. We leave these extensions for future work.

A Generation of the SM predictions
In this appendix we document the input parameters and the procedures we followed to obtain the theory predictions and uncertainties for the production of jets in the SM. As explained in the main text, we used the POWHEG-BOX program [59,60], which provides NLO QCD corrections to jet pair production interfaced with a parton shower.
Events have been generated in parallel runs on a cluster, taking advantage of the parallelization features of the POWHEG-BOX-V2. An example of the input parameters used to produce the runs used for this analysis, considering the parallelization of the runs over 2000 cores, is shown below: The value of the parameter bornsuppfact is particularly important to obtain a satisfactory coverage of the phase space and reasonable statistical uncertainties in the tails of the invariant mass and jet transverse momentum distributions studied here. When producing predictions for a 100 TeV collider, bornsuppfact was raised to 25 TeV.
The value of the parameter hdamp affects instead the goodness of the agreement of POWHEG predictions with data, especially for the dijet mass distribution in the central rapidity bin. Despite the fact that POWHEG predictions are NLO accurate for inclusive quantities like the invariant mass of the two hardest jet, or the inclusive jet transverse momentum, the POWHEG formula allows for terms beyond NLO to be present in the final predictions. The value of the hdamp parameter ultimately determines the numerical size of these terms. For a more detailed explanation of the role of hdamp, and its relation with the fraction of the real-emission matrix element that is exponentiated by the POWHEG formula, we refer interested readers to the original papers [114,115]. For this study, we treated the value of hdamp as a nuisance parameter: we produced predictions for different values (hdamp= 50, 125, 250, 500, 1000 and ∞ ) and we obtained the best overall fit to ATLAS data using hdamp= 250 GeV, which we choose as our baseline value.
Finally, we shower each event with two different shower Monte Carlo programs, Pythia6 and Pythia8. When showering with Pythia6, we use the Perugia 0 tune [116] while showering with Pythia8 is performed with both the Monash 2013 [117] and the ATLAS A14 [118] tunes. We use the convolution of these results to determine the showering and hadronization uncertainties, which is included in our fit as described in Section 2.

B Double-differential fit
In order to estimate the quality of our fit to the experimental measurements, we define a p-value estimator as where CDF χ 2 N (χ 2 ) is the cumulative function for the the chi-squared distribution with N degrees of freedom and χ 2 is evaluated at the SM prediction (corresponding to Z = 0). The p-values for the fit of all individual y bins, and for their sequential combination, are shown in Fig. 13. While individual bins have in general a sizable p-value, their combination does not and quickly degrades in the case of ATLAS data, both dijet and inclusive p T . In particular, the inclusion of 3 or more rapidity bins brings the p-value below 10 −5 (outside the range of the plot), indicating a very poor quality of the fit. For a given search, the p-value for the two different PDF sets are usually comparable and, due to the larger uncertainties, S no-jet gives a better fit than S jet . In order to investigate the robustness of the POWHEG predictions, we also considered QCD results at fixed NLO, independently obtained by NLOJet++ [76]. This allows us to estimate the effect of the terms beyond NLO which are included in the POWHEG formula, and the importance of the choice of the renormalization and factorization scales in the goodness of the fit. For the NLOJet++ [76] predictions, the renormalization and factorization scales used the same functional dependence, µ = p T e 0.3y * , adopted in the ATLAS publication [29], which is different from the underlying-Born p T used in POWHEG. The availability of these fixed order predictions in the form of APPLgrid [119] grids allowed us to quickly obtain scale and PDF variations for the observables measured by ATLAS, that is the dijet invariant mass distribution and the inclusive jet p T distribution at 7 TeV. Non-perturbative corrections extracted by experimental collaborations were also available through APPLgrid and applied to these results. The resulting p-values are presented in the two lower panels of Fig. 13, and show a similar behavior with respect to our POWHEG predictions, with just a mild improvement for the dijet doubly differential fit.

C Validation
Both ATLAS and CMS have used their 7 TeV data to constrain contact operators affecting jet physics (although not the contact operator that corresponds to Z, from Eq. 11). In this section we validate our analysis strategy by comparing our results against theirs. Both experimental analysis considered the same four-fermion operator where q L stands for left handed quark doublet and ζ = ±1. The value of ζ reflect the sign of the SM-NP interference contribution, being negative for ζ = 1 and positive for ζ = −1.
The ATLAS [29] analysis uses part of their 7 TeV data including only dijet invariant masses with m jj > 1.31 TeV and y * < 0.5. The limits are presented for destructive interference (ζ = 1) only, and for different choices of PDFs. Using NNPDF2.1 and R = 0.6, the bound on the scale of the operator they obtain is Λ > 7.0 TeV. Using the same dataset we extract a limit of 9.1 TeV.
CMS [79] uses data from Ref. [31], restricting the analysis to inclusive jet production, with p T > 501 GeV and |η| < 0.5. Using CTEQ6.6 PDFs, they obtain a bound of 9.9 TeV and 14.3 TeV, for ζ = +1 and ζ = −1, respectively. Using the same data, the bounds we obtain are 10 TeV and 19 TeV, for the same scenarios. Notice however that a precise comparison is hindered by the fact that we are using NNPDF3.0 to extract the bounds.
The results of our validation procedure are summarized in Table 9. Our bounds are stronger than the experimental one, but always within 30% of their value. While our calculation of the NP contribution is at leading order in the strong interaction, NLO corrections are included by ATLAS. It is known that their effect is to reduce the bounds on Λ by tens of percent [78], so bringing our prediction in agreement with the ATLAS result.  Table 9: Comparison between the bound on the operator of Eq. 42 presented by the experimental collaborations and the limit extracted with our fitting procedure.