Associated jet and subjet rates in light-quark and gluon jet discrimination

We show that in studies of light quark- and gluon-initiated jet discrimination, it is important to include the information on softer reconstructed jets (associated jets) around a primary hard jet. This is particularly relevant when adopting a small radius parameter for reconstructing hadronic jets. The probability of having an associated jet as a function of the primary jet transverse momentum ($p_T$) and radius, the minimum associated jet $p_T$ and the association radius is computed up to next-to-double logarithmic accuracy (NDLA), and the predictions are compared with results from the Herwig++, Pythia6 and Pythia8 Monte Carlo (MC) generators. With the help of a multivariate analysis, we demonstrate the improvement in quark-gluon discrimination obtained from the associated jet rate variable. The associated jet rates are found to be only mildly sensitive to the choice of parton shower and hadronization algorithms, as well as to the effects of initial state radiation and underlying event. In addition, the number of $k_T$ subjets of an anti-$k_T$ jet is found to be an observable that leads to a rather uniform prediction across different MC's, in broad agreement with predictions in NDLA, in contrast to the often used number of charged tracks observable.


Introduction
Hadronic jets are the most abundant objects at a proton-proton collider like the LHC, and it is a major challenge to separate the signals being looked for from standard model (SM) backgrounds in multijet final states. One promising direction that has recently received attention in both theoretical and experimental studies is the separation of light quark-initiated jets from gluon-initiated ones in these search channels. Quarks are often encountered in the decays of new particles predicted by scenarios beyond the standard model, as well as in the decay of the weak bosons, Higgs and top quark in the SM itself. On the other hand, in the corresponding SM backgrounds involving multiple hard jets, there is a larger fraction of gluon-initiated jets from QCD radiation. Here, quark- or gluon-initiated jets (henceforth simply referred to as quark and gluon jets) refer to the parton in the hard process at leading order in perturbation theory that initiates the parton shower. Based on the difference in the radiation pattern of quarks and gluons, a likelihood-based discriminant can be built to separate decay jets from QCD radiation jets with a certain efficiency [1].
Several variables have been proposed to separate quark and gluon jets, mostly relying on the fact that a gluon of similar energy leads to more soft emissions compared to a quark. This includes both discrete variables like the number of charged tracks inside the jet cone, as well as continuous ones like the width of a jet and energy-energy-correlation (EEC) angularity [1][2][3][4][5]. ATLAS and CMS collaborations have also studied the discrimination of light quarks from gluons along these lines with the 7 and 8 TeV LHC data respectively [6,7]. Using data samples with "enriched quark and gluon content", data-based taggers were also developed, and compared to the predictions from Monte Carlo (MC) simulations. While there are differences between the predictions of different MC's, as well as between the data-based tagger and the MC results, they are consistent with each other within the large systematic uncertainties at present.
An important question in this regard is the proper choice of a jet algorithm and radius parameter. In the LHC environment, in order to keep the contribution of the underlying event and multiple proton-proton collisions at a minimum, the standard choice for multijet processes is the anti-$k_t$ algorithm with radius parameter $R = 0.4$. In addition, in the ATLAS study mentioned above, jets are required to satisfy an isolation criterion: a jet is considered isolated if there is no other reconstructed jet within a cone of size $\Delta R < 0.7$ (where $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ is the standard distance measure in the pseudorapidity-azimuthal angle plane). An optimum choice for the jet radius parameter was discussed in Refs. [8,9] for quark and gluon jets as a function of their transverse momenta ($p_T$), and it was observed that one usually requires a larger radius for a gluon jet in order for the parton $p_T$ to be close to the jet $p_T$. However, for experimental purposes it is advantageous to use a fixed and small radius parameter for the jets, for the reasons mentioned above. Therefore, we propose to recover the missed information on radiation from the parent parton outside the chosen jet radius by including softer reconstructed jets that can be present (with a calculable probability) within a certain radius of a primary hard jet. These softer jets are referred to as "associated jets" in this study. It is important to note here that imposing an isolation criterion as above while studying quark and gluon jet properties might not be appropriate, since it rejects a fraction of the jet candidates beforehand, and thus biases the sample towards jets where the initial quark or gluon has not radiated outside the adopted jet radius.
We first compute the associated jet rates in QCD to next-to-double logarithmic accuracy in Sec. 2, and then compare the analytical results with those from different parton shower MC's in Sec. 3. Using the information on the presence (or absence) of associated jets can improve the discrimination of quarks and gluons. We demonstrate this through a multivariate analysis in Sec. 4. Several combinations of jet discrimination variables are tried out, and an attempt is made to determine an optimum choice. Even though we include standard discrimination variables like the number of charged tracks as inputs to our multivariate analysis, it should be emphasized that they are subject to MC ambiguities stemming from parton shower algorithms and their associated parameters, and tunings of hadronization and underlying event (UE) models. However, in order to judge the improvement in tagger performance on using the associated jet rates, we compare the performance of different sets of variables within the same MC.
In Secs. 5 and 6 we study the use of the number of subjets of a jet (defined with an exclusive $k_t$ algorithm) in place of the number of charged tracks, since the different MC predictions tend to be similar for the former observable. We compute the subjet rates up to NDLA as well, and compare the NDLA results with predictions from different MC's. Our results on both associated jets and subjets are summarized in Sec. 7. We discuss the 2-dimensional joint distributions of the three discrimination variables used as inputs in the multivariate analysis in an Appendix.

Associated jet rates: analytical calculations
To begin with, let us define the longitudinally invariant jet algorithms [10][11][12][13] adopted in this study. The distance measures between each pair of objects $i$ and $j$ ($d_{ij}$), and between an object and the beam ($d_{iB}$), are given by
$$d_{ij} = \min\left(p_{ti}^{2p},\, p_{tj}^{2p}\right) \frac{\Delta R_{ij}^2}{R^2}, \qquad d_{iB} = p_{ti}^{2p}, \qquad \Delta R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2,$$
where $p_{ti}$, $y_i$ and $\phi_i$ are the transverse momentum, rapidity and azimuth of object $i$, and $R$ is the jet radius parameter. The jet algorithm in use is fixed by the parameter $p$, with $p = 1, 0, -1$ for the $k_t$ [11], Cambridge/Aachen [14,15] and anti-$k_t$ [13] algorithms, respectively. At any stage of clustering, if a $d_{ij}$ is the smallest measure we combine objects $i$ and $j$. If a $d_{iB}$ is the smallest we call $i$ a jet and remove it from the clustering list. This procedure is continued until there are no more objects left to cluster.
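The clustering procedure described above can be sketched in a few lines. This is a minimal, naive $O(N^3)$ illustration and not the implementation used in this study; the function names (`from_pt_y_phi`, `trans_mom`, `cluster`), the massless-particle input and the four-momentum-sum recombination are assumptions made for the example.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless four-momentum (px, py, pz, E) from pt, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(y), pt * math.cosh(y))

def trans_mom(p):
    return math.hypot(p[0], p[1])

def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    """Squared distance in the rapidity-azimuth plane."""
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2

def cluster(particles, R=0.4, power=-1):
    """Generalized k_t clustering; power = 1, 0, -1 for k_t, C/A, anti-k_t."""
    objs = [tuple(x) for x in particles]
    jets = []
    while objs:
        # Smallest beam distance d_iB = pt^(2p).
        i_beam = min(range(len(objs)),
                     key=lambda i: trans_mom(objs[i]) ** (2 * power))
        d_min = trans_mom(objs[i_beam]) ** (2 * power)
        pair = None
        # Look for a pair distance d_ij below the smallest beam distance.
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d_ij = (min(trans_mom(objs[i]) ** (2 * power),
                            trans_mom(objs[j]) ** (2 * power))
                        * delta_r2(objs[i], objs[j]) / R ** 2)
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        if pair is None:
            # A beam distance is smallest: promote the object to a jet.
            jets.append(objs.pop(i_beam))
        else:
            # A pair distance is smallest: merge by adding four-momenta.
            i, j = pair
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            del objs[j]  # j > i, so remove j first
            del objs[i]
            objs.append(merged)
    return sorted(jets, key=trans_mom, reverse=True)
```

For example, `cluster([from_pt_y_phi(100.0, 0.0, 0.0), from_pt_y_phi(20.0, 0.1, 0.0), from_pt_y_phi(30.0, 0.0, 2.0)])` merges the two nearby particles and leaves the distant one as a separate jet.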
Once a primary jet $j$ has been defined, say using the anti-$k_t$ algorithm with radius parameter $R$, we define a nearby jet $i$ with $p_{tj} > p_{ti} > p_a$ and $R < \Delta R_{ij} < R_a$ as an associated jet. Thus the associated jet rates are functions of the primary jet $p_t = p_j$, its radius $R$, the association radius $R_a$ and the minimum associated jet $p_t = p_a$. In Fig. 1 we illustrate the idea of an associated jet schematically, and show the relevant variables that determine the associated jet rate.
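The selection of associated jets can be written down directly from this definition. The sketch below assumes jets are given as dictionaries with keys `pt`, `y` and `phi` and takes the leading jet as the primary one; these conventions and the function names are illustrative choices, not part of the original analysis code.

```python
import math

def delta_r(jet_a, jet_b):
    """Distance in the rapidity-azimuth plane between two jets."""
    dphi = abs(jet_a["phi"] - jet_b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(jet_a["y"] - jet_b["y"], dphi)

def associated_jets(jets, R=0.4, R_a=0.8, p_a=20.0):
    """Jets i with p_tj > p_ti > p_a and R < DeltaR_ij < R_a around the leading jet j."""
    primary = max(jets, key=lambda j: j["pt"])
    return [j for j in jets
            if j is not primary
            and p_a < j["pt"] < primary["pt"]
            and R < delta_r(primary, j) < R_a]
```

The number of associated jets per event, `len(associated_jets(jets))`, is the quantity whose distribution defines the rates $P_n$.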
Figure 1. A schematic illustration of associated jets, and the relevant variables which determine the associated jet rate (see text for details). Labels in the figure: hard process, jet, associated jet; $\Delta R_{ij} < R_a$, $p_{ti} > p_a$, $p_{tj} > p_{ti}$.
In perturbative QCD, the rate of $n$-jet production from a primary object of type $i$ ($i = q, g$ in this case), $R_n^i$, can be obtained from the associated generating function [16][17][18][19]
$$\Phi_i(u) = \sum_n u^n R_n^i .$$
We can recover the jet rates by differentiating at $u = 0$,
$$R_n^i = \frac{1}{n!} \left. \frac{\partial^n \Phi_i}{\partial u^n} \right|_{u=0} .$$
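As a purely illustrative check of how rates are extracted from a generating function by differentiating at $u = 0$, consider a toy Poisson-like form. This form is an assumption chosen for simplicity, with $\nu$ a hypothetical mean number of resolved emissions; it is not the NDLA result of this paper.

```latex
\begin{align}
\Phi(u) &= u\, e^{\nu (u-1)}
         = \sum_{n=1}^{\infty} u^{n}\, \frac{e^{-\nu}\, \nu^{\,n-1}}{(n-1)!}, \\
R_n &= \frac{1}{n!} \left. \frac{\partial^n \Phi}{\partial u^n} \right|_{u=0}
     = \frac{e^{-\nu}\, \nu^{\,n-1}}{(n-1)!},
\qquad \sum_{n=1}^{\infty} R_n = \Phi(1) = 1 .
\end{align}
```

The overall factor of $u$ counts the primary object itself, so that $R_1$ is the probability of no additional resolved jet, and the normalization $\Phi(1) = 1$ expresses that the rates sum to unity.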
The jet rates $R_n^i = R_n^i(p_j, \xi)$ are functions of the trigger jet transverse momentum $p_j$ and the evolution scale for parton showering, which for hadron-hadron collisions is taken as $\xi = \Delta R^2/2$. This is equivalent to the evolution scale for coherent parton showering, $\xi \equiv 1-\cos\theta$, with $\theta$ being the emission angle ($\Delta R^2/2 \approx \theta^2/2 \approx 1-\cos\theta$). To be resolved, an emission must have $\xi > \xi_j = R^2/2$ and $p_t > p_a$. Since the jet rates $R_n^i$ include the trigger jet $j$, the probability of $n$ associated jets for a jet of type $i$ with transverse momentum $p_j$ is
$$P_n^i = R_{n+1}^i(p_j, \xi_a) .$$
Here, $\xi_a = R_a^2/2$, with $R_a$ being the association radius defined above. The generating functions $\Phi_i(u)$ were computed in the context of $e^+e^-$ collisions in Ref. [16], up to next-to-double logarithmic accuracy (NDLA). Here, leading double and next-to-double logarithms refer to $\alpha_S^n \log^{2n}$ and $\alpha_S^n \log^{2n-1}$, where the logarithms are those of $R_a/R$ and/or $p_j/p_a$. For $p_a$ sufficiently large, these terms are determined by the timelike showering of final-state partons, while contributions from initial-state showers and the underlying event can be avoided. Following the same methods as in Ref. [16] for hadron-hadron collisions, for $\xi > \xi_j$ and $p_j > p_a$, we have the quark and gluon generating functions to NDLA
Here, the running coupling is evaluated at the transverse momentum scale of the emission, $k_t^2 = z^2 p_j^2 \xi$. Defining $\bar{\alpha}_S = \alpha_S(p_j^2 \xi)/\pi$, i.e. in terms of the coupling at the hard scale, we have to NDLA
The solution for the quark generating function is easily seen to be
We can solve for the gluon generating function by iteration, and then substitute in this equation to get the complete solution.
For brevity we define the following logarithms:
In terms of these variables the NDLA quark generating function is
where, including the full $P_{gq}$ splitting function (footnote 1), (2.10)
Defining similarly (footnote 2)
we solve the gluon generating function by iteration to second order in $u$, which gives the probabilities for 0 or 1 associated jets: (2.12)
Footnote 1: We keep terms that are formally power-suppressed in order to satisfy the boundary condition $P_0 = 1$ when $p_a = p_j$.
Footnote 2: We drop the $\bar{\alpha}_S^2$ term in $\Gamma_f$ as it is beyond NDLA and does not affect the boundary condition.

Associated jet rates: comparison with Monte Carlo
We are now in a position to compare the NDLA predictions for associated jet rates discussed in the previous section with the results obtained using the Herwig++ [20] and Pythia8 [21] event generators, where the quark- and gluon-initiated jets are simulated using the $Z+q$ and $Z+g$ processes at leading order in QCD (with the $Z$ boson subsequently decayed to $\nu\bar{\nu}$). The event samples were generated for proton-proton collisions at the 13 TeV LHC, using the CTEQ6L1 [22] parton distribution functions (PDF) for the Pythia generators and the default MRST LO** [23] PDF and UE model for Herwig++. Subsequently, we used a modified version of DELPHES2 [24] for including detector effects. For observables based on charged tracks to be discussed in the following, we use a minimum $p_T$ threshold of 1 GeV for each track. All jets are reconstructed with an anti-$k_t$ algorithm [13,25] with radius parameter $R = 0.4$, and are required to have $p_T > 20$ GeV. In addition, the leading jet is required to be central with $|\eta| < 2$.
In Fig. 2 we show the probability of obtaining $n$ associated jets, $P_n$, as a function of the jet $p_T$ for $n = 0$, $1$ and $n > 1$, for quark- and gluon-initiated jets, in the left and right columns respectively. The association radius is set to $R_a = 0.8$ and the minimum associated jet transverse momentum is $p_a = 20$ GeV. In the MC simulations, $P_n$ has been computed as a function of $p_T(j_s)$, which is the vector sum of the leading jet and associated jet $p_T$'s. The jet rates are studied as a function of $p_T(j_s)$, as it is closer to the transverse momentum of the parton that initiates the final state shower.
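A minimal sketch of the $p_T(j_s)$ variable, assuming each jet is represented by a dictionary with keys `pt` and `phi` (a hypothetical convention for this example), is:

```python
import math

def pt_vector_sum(jets):
    """Magnitude of the vector sum of transverse momenta of the given jets."""
    px = sum(j["pt"] * math.cos(j["phi"]) for j in jets)
    py = sum(j["pt"] * math.sin(j["phi"]) for j in jets)
    return math.hypot(px, py)
```

Here `jets` would contain the leading jet together with its associated jets; for a jet without associated jets the function simply returns the leading jet $p_T$.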
We see that the functional behaviour with respect to the jet $p_T$ in the MC computation and the NDLA calculation are similar, although there are some differences in the values of $P_n$. In particular, the MC prediction of $P_1$ for quark and gluon jets is higher than the NDLA result, especially at higher $p_T(j_s)$, with Herwig++ giving rise to a slightly larger $P_1$ compared to Pythia8. For a quark jet, the probability of having at least one associated jet ranges from around 15% to 25% as we go from $p_T(j_s) = 200$ GeV to $p_T(j_s) = 500$ GeV, and at higher $p_T(j_s)$ the probability essentially remains the same. For gluon jets, the corresponding probability ranges from around 30% to 40% as we go from $p_T(j_s) = 200$ GeV to $p_T(j_s) = 500$ GeV. The larger probability to have an associated jet around a gluon can thus be utilized to better discriminate it from quarks, as we shall see in the next section.
The NDLA computation includes only the time-like showering of the final state partons, and ignores some power-suppressed effects due to momentum conservation and hadronization. On the other hand, the MC results shown above include momentum conservation and hadronization as well as the effects of initial state radiation (ISR) and multiple parton interactions (MPI). In order to quantify the effect of ISR and MPI, we compare the predictions for $P_n$ with and without ISR and MPI in Herwig++, Pythia8 as well as in Pythia6 [26] (we use the version Pythia 6.4.28 with the AUET2B-CT6L tune) in Fig. 3. It is clear from this figure that the impact of ISR and MPI is rather small for our choice of the association radius $R_a = 0.8$, thereby making the predictions stable against such effects. For this choice of $R_a$, we can see that Pythia8 shows the largest variation due to such effects, followed by Pythia6, while the effects are negligible for Herwig++ (see footnote 5). Furthermore, the MC results become closer to the NDLA ones when ISR and MPI effects are switched off.
We also investigated the effects of momentum conservation, by changing the recombination scheme in the anti-$k_t$ jet algorithm from the default E-scheme to the "winner-take-all" scheme introduced in [27], which is less sensitive to recoils in the parton shower [28]. Such a change increases the MC associated jet rates very slightly. We believe this is because the axis of the leading jet is moved away from the overall momentum vector of the system. The effects are roughly proportional for quark and gluon jets, so they would not affect discrimination significantly.

Variables for quark-gluon separation
A large number of variables have been surveyed in the context of quark-gluon discrimination, constructed out of either track-based observables or calorimeter-based ones [1][2][3][4][5]. While the former category has the practical advantage of being more accurate due to better track momentum resolution as well as being less prone to pile-up contamination, the latter category can be used for jets with larger rapidities outside the tracker coverage. The most widely studied variables include the number of charged tracks inside the jet cone ($n_{ch}$), the jet width [1] and energy-energy-correlation (EEC) angularity [4]. The jet width is defined as
$$w = \frac{\sum_i p_{T,i}\, \Delta R_i}{\sum_i p_{T,i}},$$
where $\Delta R_i$ is the distance of track $i$ from the jet axis and the sum goes over all the tracks associated to the jet. A similar track-based EEC variable, denoted by $C_1^{(\beta)}$, can be defined as
$$C_1^{(\beta)} = \frac{\sum_i \sum_{j>i} p_{T,i}\, p_{T,j}\, (\Delta R_{ij})^{\beta}}{\left(\sum_i p_{T,i}\right)^2} .$$
Here again the sums over $i$ and $j$ run over all the tracks associated to the jet with $j > i$, while $\beta$ is a tunable parameter. It has been demonstrated in Refs. [3,4] that smaller values of the exponent $\beta$ lead to a better quark-gluon separation, and $\beta = 0.2$ is found to be optimal from perturbative calculations and MC studies based on the Herwig++ and Pythia8 generators. We have compared the performance of the jet width variable $w$ and the EEC variable $C_1^{(\beta)}$ in the multivariate analyses (MVA) to be discussed below, and find that in all cases $C_1^{(\beta=0.2)}$ leads to a better separation of gluons from quarks. Therefore, in the following, we only show results based on $n_{ch}$ (with each charged track having $p_T > 1$ GeV) and $C_1^{(\beta=0.2)}$. In addition, we shall include the associated jet information as well as the jet mass variable and compare the performance of the different MVA methods. As seen in the previous section, for $n = 1$ or $n > 1$, the probability of finding $n$ associated jets, $P_n$, is significantly larger for gluon jets compared to quark-initiated ones across the whole $p_T$ range of interest. Therefore, the presence (or absence) of an associated jet within a certain distance $R_a$ of a high-$p_T$ jet can be used to further improve the separation.
Footnote 5: However, we have checked that if we take a larger association radius, $R_a > 1.2$, the ISR effects become appreciable in Herwig++.
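The two track-based shape variables can be illustrated with a short sketch. It assumes the commonly used definitions, $w = \sum_i p_{T,i}\Delta R_i / \sum_i p_{T,i}$ and $C_1^{(\beta)} = \sum_{i<j} p_{T,i}\,p_{T,j}\,(\Delta R_{ij})^\beta / (\sum_i p_{T,i})^2$, and represents each track as a hypothetical tuple `(pt, eta, phi)`; it is not the analysis code of this study.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    dphi = abs(phi1 - phi2)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)

def jet_width(tracks, jet_eta, jet_phi):
    """pT-weighted mean distance of tracks (pt, eta, phi) from the jet axis."""
    sum_pt = sum(t[0] for t in tracks)
    weighted = sum(t[0] * delta_r(t[1], t[2], jet_eta, jet_phi) for t in tracks)
    return weighted / sum_pt

def c1(tracks, beta=0.2):
    """Two-point energy correlation angularity over all track pairs i < j."""
    sum_pt = sum(t[0] for t in tracks)
    total = 0.0
    for i in range(len(tracks)):
        for j in range(i + 1, len(tracks)):
            dr = delta_r(tracks[i][1], tracks[i][2], tracks[j][1], tracks[j][2])
            total += tracks[i][0] * tracks[j][0] * dr ** beta
    return total / sum_pt ** 2
```

For two tracks of equal $p_T$ placed symmetrically at a distance 0.1 on either side of the jet axis, `jet_width` gives 0.1 and `c1` with `beta=2` gives 0.01.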
As the boundary between the signal and background regions in the space spanned by the variables is non-linear, it is beneficial to adopt a multivariate analysis strategy rather than a cut-based one. For this purpose, we employed a Boosted Decision Tree (BDT) algorithm with the help of the TMVA toolkit [29] in the ROOT framework. The training of the classifier was performed with $Z+q$-jet and $Z+g$-jet samples, and we generated the above MC samples uniformly distributed in jet $p_T$. The input variables for the two-variable training are taken to be $n_{ch}$ and $C_1^{(\beta=0.2)}$, while for three-variable trainings we further include the variable $m_J/p_{T,J}$, where $m_J$ is the jet mass and $p_{T,J}$ is the transverse momentum of the leading jet. The information on the number of associated jets is included in the form of two categories ($n = 0$ or $n \geq 1$) in the MVA.
It should be emphasized that the MC prediction of the discrimination variables, especially the number of charged tracks $n_{ch}$, is quite sensitive not only to the parton shower (PS) algorithm adopted and the related parameters, but also to the tuning of the hadronization and underlying event models. This is expected, since $n_{ch}$ is not an infrared safe quantity, and only the ratio $n_{ch}^{\rm gluon}/n_{ch}^{\rm quark}$ converges rather slowly to the ratio of the colour factors $C_A/C_F$ for high jet $p_T$ [30]. The disagreement between different MC's can therefore be reduced only by appropriate tuning at the LHC energies. With this limitation of the MC predictions in view, in this study, we compare the performance of different MVA methods within the same MC generator to estimate the improvement in adding associated jet related observables. We also show the quark-gluon separation as predicted by the different MC's for comparison. In Appendix A we present details of the distributions of the discrimination variables and the differences between the MC predictions for them.

Performance in MVA
Based on the BDT analysis, we obtain the efficiencies of tagging quark ($\epsilon_q$) and gluon jets ($\epsilon_g$) as a function of the cut on the BDT score. It is more useful to compare the ratio of the tagging efficiencies as a function of $\epsilon_q$, in order to judge the separation power of a "quark-rich" signal from a "gluon-rich" background. In Figs. 4-6 (left column) we show the ratio of the quark and gluon tagging efficiencies, $\epsilon_q/\epsilon_g$, as a function of $\epsilon_q$, for $400 < p_T(j_s) < 500$ GeV, with the event samples generated with all the three MC codes. Four different MVA methods are shown corresponding to different choices for the discrimination variables:
• Method-1: Two variables, $n_{ch}$ and $C_1$ with $\beta = 0.2$.
• Method-2: Two variables, $n_{ch}$ and $C_1$ with $\beta = 0.2$, with two categories determined in terms of the number of associated jets ($n = 0$ or $n \geq 1$).
• Method-3: Three variables, $n_{ch}$, $C_1$ with $\beta = 0.2$ and $m_J/p_{T,J}$.
• Method-4: Three variables, $n_{ch}$, $C_1$ with $\beta = 0.2$ and $m_J/p_{T,J}$, with two categories determined in terms of the number of associated jets ($n = 0$ or $n \geq 1$).
We can quantify the improvement in quark-gluon separation using $\epsilon_g$(Method-1)/$\epsilon_g$(Method-{2,3,4}) as a function of $\epsilon_q$, as shown in Figs. 4-6 (right). For example, for an operating point of $\epsilon_q = 0.4$, we can obtain an improvement of around 10%, 15% and 20% using Methods 2, 3 and 4 respectively, when compared to Method-1. The differences between the improvement factors obtained using the three MC's are found to be small.
[Figure panel: $\epsilon_g$ ratio versus quark efficiency $\epsilon_q$, for $p_T(j_s) \in [400, 500]$ GeV, Pythia6.]
In order to estimate the change in tagger performance as we consider lower $p_T$ jets, we show in Fig. 7 the same results as in Fig. 4, but now with $150 < p_T(j_s) < 200$ GeV. The improvement on adding associated jet rates is still appreciable, although it is somewhat reduced compared to the higher $p_T$ range. The fluctuations in the $\epsilon_g$ ratio for lower values of $\epsilon_q$ in Fig. 7 are due to low MC statistics.
We can see in Figs. 4-6 that there is an improvement in going from a two-variable analysis to a three-variable one by including the variable $m_J/p_{T,J}$. This can be understood as follows. The jet mass variable is related to the $C_1$ class of observables, since $C_1^{(\beta=2)} \simeq m_J^2/p_{T,J}^2$ in the small-angle limit. For a jet dominated by a single hard emission, $C_1^{(\beta=0.2)}$ and $m_J/p_{T,J}$ are then two independent variables belonging to the $C_1$ class which carry all the information on this hardest emission, and including both of them improves the tagger performance. For this reason, further addition of a third variable in the $C_1$ class does not change the performance appreciably, a fact that we explicitly checked by a separate MVA analysis. There is a further improvement in the quark-gluon separation when the information on the number of associated jets is included at the level of categories in both the two- and three-variable MVA analyses.
Since the associated jet rates carry the additional information of radiation outside the jet cone, Methods 2 and 4 lead to further improvements as compared to Methods 1 and 3, respectively.
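The figure of merit used in this comparison, the reduction in gluon efficiency at a fixed quark efficiency, can be sketched as follows. The function names and the convention that larger scores are more quark-like are assumptions of this example, and ties in the scores are not treated specially.

```python
def cut_for_quark_efficiency(quark_scores, target_eq):
    """Score threshold that keeps (approximately) a fraction target_eq of quark jets."""
    ranked = sorted(quark_scores, reverse=True)
    k = max(1, round(target_eq * len(ranked)))
    return ranked[k - 1]

def efficiency(scores, cut):
    """Fraction of jets with classifier score at or above the cut."""
    return sum(1 for s in scores if s >= cut) / len(scores)

def gluon_efficiency_ratio(quark_ref, gluon_ref, quark_new, gluon_new, target_eq=0.4):
    """eps_g(reference method) / eps_g(new method) at the same quark efficiency."""
    eg_ref = efficiency(gluon_ref, cut_for_quark_efficiency(quark_ref, target_eq))
    eg_new = efficiency(gluon_new, cut_for_quark_efficiency(quark_new, target_eq))
    return eg_ref / eg_new
```

A ratio above one means that the new set of variables accepts fewer gluon jets than the reference set at the chosen quark efficiency.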
Method 4 leads to the best performance out of the four different MVA's considered. In fact, we find that there is an alternative way to include the associated jet rate information of Method 4, namely by using the modified jet mass variable $m(j_s)/p_{T,J}$ in Method 3. Here, $m(j_s)$ is the jet mass computed by adding the leading jet and associated jet four-momenta. Because of a larger associated jet rate, for the same $p_T(j_s)$, $m(j_s)$ is higher for a gluon jet compared to a quark, while $p_{T,J}$ is lower. Therefore, using either associated jet rate categories and $m_J/p_{T,J}$, or using only the variable $m(j_s)/p_{T,J}$, leads to the same MVA performance, as shown in Fig. 8.
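The modified mass variable can be illustrated by summing four-momenta explicitly. In this sketch each jet is a hypothetical tuple `(pt, eta, phi, mass)`; the helper names are illustrative and not taken from the analysis.

```python
import math

def four_vector(pt, eta, phi, mass):
    """(px, py, pz, E) from transverse momentum, pseudorapidity, azimuth and mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (px, py, pz, energy)

def system_mass_and_pt(jets):
    """Invariant mass and transverse momentum of the summed jet four-momenta."""
    px, py, pz, energy = (sum(c) for c in zip(*(four_vector(*j) for j in jets)))
    m2 = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    return math.sqrt(max(m2, 0.0)), math.hypot(px, py)
```

With `jets` holding the leading jet and its associated jets, the first return value corresponds to $m(j_s)$ and the second to $p_T(j_s)$; dividing the former by the leading jet $p_{T,J}$ gives the variable discussed above.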

Subjet rates in jets: analytical calculations
The number of charged tracks inside a jet cone, $n_{ch}$ (with each track having transverse momentum above a threshold, usually taken to be around 1 GeV), is often used as a good discriminating variable. However, as mentioned earlier, the MC predictions for this observable are quite sensitive not only to the parton shower (PS) algorithm and the related parameters, but also to the tuning of the hadronization and underlying event models. On the other hand, we find that the number of subjets of a primary jet leads to a more uniform prediction across the MC's, and thus can be better suited to quark-gluon separation studies. The number of subjets as a quark-gluon separation variable was considered earlier in Ref. [1]. In this study, we compute the subjet rates to NDLA, and show a detailed comparison with different MC's. We find the subjets of jet $j$ with the exclusive $k_t$ algorithm, which applies the dimensionless distance measure
$$y_{ik} = \frac{\min\{p_{ti}^2,\, p_{tk}^2\}\, \Delta R_{ik}^2}{R^2\, p_j^2}$$
to its constituent objects and clusters them as discussed for a generalized $k_t$ algorithm in Sec. 2, until the smallest $y_{ik}$ is above $y_{\rm cut}$. Thus the subjet rates are functions of the jet $p_t = p_j$, the jet radius $R$, and $y_{\rm cut}$.
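The subjet counting can be sketched as a reclustering of the jet constituents. The example assumes the dimensionless measure $y_{ik} = \min\{p_{ti}^2, p_{tk}^2\}\,\Delta R_{ik}^2/(R\,p_j)^2$, massless constituents, four-momentum-sum recombination, and no beam distance inside the jet; these are assumptions of this illustration, as are all function names.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless four-momentum (px, py, pz, E)."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(y), pt * math.cosh(y))

def trans_mom(p):
    return math.hypot(p[0], p[1])

def delta_r2(a, b):
    ya = 0.5 * math.log((a[3] + a[2]) / (a[3] - a[2]))
    yb = 0.5 * math.log((b[3] + b[2]) / (b[3] - b[2]))
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (ya - yb) ** 2 + dphi ** 2

def count_subjets(constituents, jet_pt, R, y_cut):
    """Merge the closest pair repeatedly until the smallest y_ik exceeds y_cut."""
    objs = [tuple(c) for c in constituents]
    scale2 = (R * jet_pt) ** 2
    while len(objs) > 1:
        y_min, pair = None, None
        for i in range(len(objs)):
            for k in range(i + 1, len(objs)):
                y_ik = (min(trans_mom(objs[i]), trans_mom(objs[k])) ** 2
                        * delta_r2(objs[i], objs[k]) / scale2)
                if y_min is None or y_ik < y_min:
                    y_min, pair = y_ik, (i, k)
        if y_min > y_cut:
            break  # all remaining objects are resolved subjets
        i, k = pair
        merged = tuple(a + b for a, b in zip(objs[i], objs[k]))
        del objs[k]  # k > i, so remove k first
        del objs[i]
        objs.append(merged)
    return len(objs)
```

Calling `count_subjets(constituents, jet_pt, R, math.exp(-L))` for increasing `L` resolves progressively softer and more collinear structure inside the jet.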
In this section, we compute the subjet rates to NDLA, i.e. considering double and next-to-double logarithms, $\alpha_S^n L^{2n}$ and $\alpha_S^n L^{2n-1}$, where now $L = \ln(1/y_{\rm cut})$. The relevant generating functions in this case are those given in Refs. [10,19]:
The Sudakov factors for no resolvable emission are now
Hence the rates for 1, 2 or 3 subjets in a quark jet are:
where $\Delta_f = \Delta_q^2/\Delta_g$, and for a gluon jet we have

Subjet rates in jets: comparison with Monte Carlo
We now compare the above results with Monte Carlo predictions. MC samples of quark and gluon jets were prepared for the subjet analysis using the same setup as in the associated jet study in Sec. 3; however, detector effects and minimum $p_T$ cuts for the charged and neutral hadrons were not included for this analysis. In this sense, our study of the subjet rates should be taken as illustrative, and we do not include the subjet rates in an MVA analysis in this paper. As we shall see in the following, one needs to go down to at least $L = 4$ to have some discrimination power. This corresponds to going down to 0.1 for the $\Delta R$ resolution, which is the typical size of calorimeter cells, although the $\Delta R$ separation of subjets would be larger when the subjet $p_T$ is smaller compared to the primary jet $p_T$. Therefore, in a proper analysis, combining track and calorimeter information is essential, and a detailed experimental study is necessary, which is beyond the scope of this paper. Figure 9 shows comparisons between the resummed results of Eqs. (5.9,5.10) and the MC results for jets with $p_{T,J} \in [500, 600]$ GeV and $R = 0.4$. For quark jets the different MC's agree quite well with each other and with the resummed calculations, the MC predictions being somewhat below the resummed 1-subjet rate for $L > 4$, and vice versa for 2 subjets. Hadronization effects are small for $L < 7$, after which the 1- and 2-subjet rates are suppressed and the higher subjet rates are therefore enhanced. At this value of $R\,p_{Tj}$, $L = 7$ corresponds to resolving subjets with $\min\{p_{ti}, p_{tj}\}\Delta R_{ij} \sim 6$ GeV.
For gluon jets the agreement between the resummed results and the Monte Carlos is still quite close for 1 subjet. For 2 and 3 subjets the peak rates are in roughly the same place but have higher values than the resummed ones, with the effect that the rate for 4 or more subjets is substantially suppressed. Once again the hadronization effects are small for $L < 7$, after which the 1- and 2-subjet rates are suppressed and the higher subjet rates are enhanced, actually bringing the latter into close agreement with the analytical calculations. In conclusion, the fairly good agreement between the Monte Carlos and the resummed 1-, 2- and 3-subjet rates for $R = 0.4$ and $L$ not too large ($L < 5$, subjet resolution above about 15 GeV) suggests that in this range those subjet rates can be used for quark-gluon discrimination. At larger jet radii, the agreement remains similar, as we have checked using $R = 0.8$.
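The correspondence between $L$ and the subjet resolution scale quoted in the text can be checked with a one-line estimate. It assumes that the resolution variable is normalized to $(R\,p_j)^2$, so that $y_{\rm cut} = e^{-L}$ translates into $\min\{p_{ti}, p_{tj}\}\Delta R_{ij} = R\,p_j\,e^{-L/2}$; the function name is hypothetical.

```python
import math

def subjet_resolution_scale(L, R, jet_pt):
    """Scale min(p_ti, p_tj) * DeltaR_ij (in GeV) resolved at y_cut = exp(-L)."""
    return R * jet_pt * math.exp(-0.5 * L)
```

For $R = 0.4$ and a jet $p_T$ of 500 GeV this gives roughly 6 GeV at $L = 7$ and roughly 16 GeV at $L = 5$, in line with the values quoted above for the $[500, 600]$ GeV bin.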

Summary
To summarize our findings, we show that in studies of light quark and gluon jet separation at the LHC, it is important to include the information on associated jet rates around a primary hard jet. Associated jet rates are defined as the probability of finding at least one softer reconstructed jet around the primary hard jet under consideration. This probability is found to be substantially higher for a gluon-initiated jet compared to a quark-initiated one. Since commonly a small jet radius parameter is adopted in LHC studies of hadronic jets, the associated jet rates carry the information on the radiation outside the chosen jet radius.
We compute the associated jet rates up to NDLA in perturbative QCD, as a function of the primary jet and minimum associated jet $p_T$'s, as well as the jet radius and association radius parameters. The NDLA results are thereafter compared with predictions from different parton shower MC's. Since the NDLA predictions include only the time-like showering of the final state partons, we demonstrate the effects of ISR and MPI in the MC predictions as well, and it is observed that the NDLA predictions are closer to the MC's when ISR and MPI are switched off. Overall, the associated jet rates are not very sensitive to these effects as long as the association radius is not too large.
The probability of having at least one associated jet for a primary gluon jet is roughly a factor of two larger than for a quark jet, with a small variation in this number as a function of the jet p T . This fact makes the presence or absence of associated jets a good variable for quark-gluon discrimination studies. We demonstrate the impact of including the associated jet rate information by including this variable in an MVA analysis, along with the well-studied variables of number of charged tracks, energy-energy-correlation angularities and jet mass. Comparing different two and three variable MVA's with and without the associated jet information, we find that including the associated jets leads to an improvement of around 10% in rejecting gluons, for a fixed quark selection efficiency of 0.4. We also show that using a three variable MVA with associated jet categories leads to the best performance, with an improvement of 20% in rejecting gluons, for the same quark efficiency as above.
Since for the number of charged tracks variable the MC predictions tend to differ, and are dependent on the parton shower and underlying event parameter tunes, we explore the number of $k_t$ subjets of an anti-$k_t$ jet as a quark-gluon separation variable. We compute the number of subjets to NDLA, and compare the resummed predictions with different MC's. The different MC predictions are found to be rather uniform, with the resummed predictions being broadly in agreement with them. However, for gluon jets the peak rates for 2 and 3 subjets are found to be lower in the resummed computation, which might arise due to higher-order effects that are in general bigger for gluons.
Figure caption (beginning lost): ... in Herwig++ and Pythia8, for quark and gluon jets with $p_T(j_s) \in [400, 500]$ GeV having $n_{\rm Ajet} = 0$ and $\geq 1$ associated jets.
In Figs. 10-12 we show 2-dimensional plots of the joint distributions of the three discrimination variables used in the MVA presented in Section 4, for the two Monte Carlo event generators Herwig++ and Pythia8. The following features may be observed:
• There are differences between the distributions predicted by the two Monte Carlos, those of Pythia8 being somewhat narrower for quark jets and substantially narrower for gluon jets.
• The distributions of the infrared-unsafe variable $n_{ch}$ show the greatest differences, with those of Pythia8 being larger at high $n_{ch}$. This could be due to differences in tuning of the non-perturbative parameters of the generators.
• The above features are reflected in the likelihood plots, showing the probability ratio $P_q/(P_q + P_g)$, and account for the higher discrimination efficiency predicted by Pythia8 (Fig. 5 vs Fig. 4).
• The quark-gluon discrimination in the events with associated jets is weaker than that for $n_{\rm Ajet} = 0$. This is expected because the events are selected according to $p_T(j_s)$, the sum of leading and associated jet $p_T$'s. Therefore those with associated jets have leading jets with lower $p_T$'s, which have lower discriminating power.
• Nevertheless the inclusion of the associated jet category improves the MVA performance, because the probability of an associated jet is lower for quark jets.
Figure caption (beginning lost): ... and $m_J/p_{T,J}$ in Herwig++ and Pythia8, for quark and gluon jets with $p_T(j_s) \in [400, 500]$ GeV having $n_{\rm Ajet} = 0$ and $\geq 1$ associated jets.
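The probability ratio $P_q/(P_q + P_g)$ shown in the likelihood plots can be estimated bin by bin from histograms of quark and gluon jets. The sketch below assumes two lists of bin counts with identical binning and normalizes each sample to unit area; the function name and the use of `None` for empty bins are choices made for this example.

```python
def likelihood_ratio(quark_counts, gluon_counts):
    """Per-bin P_q / (P_q + P_g) from binned quark and gluon samples."""
    n_q = sum(quark_counts)
    n_g = sum(gluon_counts)
    ratios = []
    for q, g in zip(quark_counts, gluon_counts):
        p_q = q / n_q  # normalized quark probability in this bin
        p_g = g / n_g  # normalized gluon probability in this bin
        ratios.append(p_q / (p_q + p_g) if p_q + p_g > 0 else None)
    return ratios
```

Values near 1 mark quark-like regions of the variable space and values near 0 mark gluon-like regions, while 0.5 indicates no separation in that bin.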