Dichroic subjettiness ratios to distinguish colour flows in boosted boson tagging

$N$-subjettiness ratios are in wide use for tagging heavy boosted objects, in particular the ratio of 2-subjettiness to 1-subjettiness for tagging boosted electroweak bosons. In this article we introduce a new, \emph{dichroic} ratio, which uses different regions of a jet to determine the two subjettiness measures, emphasising the hard substructure for the 1-subjettiness and the full colour radiation pattern for the 2-subjettiness. Relative to existing $N$-subjettiness ratios, the dichroic extension, combined with SoftDrop (pre-)grooming, makes it possible to increase the ultimate signal significance by about $25\%$ (for $2\,\text{TeV}$ jets), or to reduce non-perturbative effects by a factor of $2{-}3$ at $50\%$ signal efficiency while maintaining comparable background rejection. We motivate the dichroic approach through the study of Lund diagrams, supplemented with resummed analytical calculations.


Introduction
With the increasingly high energy scales probed by the Large Hadron Collider (LHC), massive electroweak bosons ($H$/$Z$/$W$) and top quarks are often produced with a transverse momentum much larger than their mass. In this boosted regime, when they decay hadronically, they are reconstructed as single jets that have to be separated from the much more common quark- and gluon-initiated jets. Over the past few years, several techniques relying on jet substructure, i.e. on the internal dynamical properties of jets, have been devised in order to achieve this task. These techniques are now routinely used in LHC analyses and new-physics searches.
There are three common families of methods used to separate boosted heavy objects from standard QCD jets: (i) taggers, which impose that a jet contain two hard cores (or three for a top quark), a situation more common in signal jets than in QCD jets, which are dominated by soft-gluon radiation; an increasingly widespread technique for tagging is the modified MassDrop tagger (mMDT) [1,2] and its generalisation, SoftDrop [3], which will be our chosen tools here; (ii) radiation constraints, which constrain soft-gluon radiation inside jets, expected to be larger in QCD jets than in colourless weak-boson decays; a widespread way of applying radiation constraints is to cut on jet shapes, for example the ratio $\tau_2/\tau_1$, where $\tau_N$ is the $N$-subjettiness [4][5][6]; (iii) groomers, which clean the fat jets of soft-and-large-angle radiation, often dominated by the Underlying Event and pileup, hence ensuring a better mass resolution.
To reach a large discriminating power, it is helpful to combine several of these techniques. Since taggers and groomers share many similarities, one often starts by applying a tagger/groomer and then imposes a cut on the value of a jet shape computed on that tagged/groomed jet. Finally, one selects jets with a (groomed) mass close enough to the weak boson mass.
In this paper, we introduce the concept of "dichroic" subjettiness ratios for applying radiation constraints. Starting from an object in which two hard prongs have been identified ("tagged"), the dichroic variant of subjettiness differs from standard subjettiness ratios because it uses different (sub)jets for the numerator and denominator of the $\tau_2/\tau_1$ ratio. The reason for calling this "dichroic" is that the two different (sub)jets that are used are dominated by two distinct colour flows. In particular we will use a large jet for calculating $\tau_2$ and a smaller, tagged subjet for $\tau_1$. Calculating $\tau_2$ on the large jet provides substantial sensitivity to the different colour flows of signal (colour singlet when viewed at large angles) and background (colour triplet for a quark jet or octet for a gluon jet). Calculating $\tau_1$ on the tagged subjet ensures that it is not substantially affected by the overall colour flow of the large jet, but rather is governed essentially by the invariant mass of the two-prong structure found by the tagger. The resulting dichroic $\tau_2/\tau_1$ ratio gives enhanced performance compared to existing uses of $N$-subjettiness, which adopt the same (sub)jet for numerator and denominator (see e.g. [7][8][9][10][11] for recent examples).
Performance of radiation-based discrimination involves two criteria: the ability to distinguish signals from backgrounds and the robustness of that discrimination, notably its insensitivity to non-perturbative effects. As discussed already in [12], these two criteria are often in tension, because the region of large-angle soft kinematics on one hand provides substantial discrimination power, but is also the region where the Underlying Event and hadronisation have the largest impact. A point central in our discussion will be the trade-off between these aspects. To reduce the tension between discrimination power and perturbative robustness we will show how the dichroic subjettiness ratio can be used in combination not just with tagging but also a separate (pre-)grooming step.

Setup and useful tools for discussion
Before introducing the dichroic tools in Section 3, let us first discuss the individual building blocks used in our new combination and introduce a simple framework to facilitate the discussion of the underlying physics and expected performance.

A tagger, a groomer and a jet shape
We will concentrate on the modified MassDrop tagger, used here as a tagger, $N$-subjettiness as a radiation-constraining jet shape, and SoftDrop as a groomer. These are all common choices in the literature, but we believe that our generic strategy can be extended to other combinations if needed. To ease the physics discussion below, let us briefly recall how these methods are defined.
The modified MassDrop tagger and SoftDrop both start by reclustering the jet with the Cambridge/Aachen algorithm. They then recursively undo the last step of the clustering, splitting the current jet $j$ into two subjets $j_1$ and $j_2$. The procedure stops if the splitting is symmetric enough, i.e. if
\begin{equation}
z > z_\text{cut} \left(\frac{\theta_{12}}{R}\right)^{\beta}, \qquad z \equiv \frac{\min(p_{t1}, p_{t2})}{p_{t1} + p_{t2}}, \tag{2.1}
\end{equation}
with $p_{ti}$ the transverse momentum of the subjet $j_i$, $\theta_{12}$ their angular separation in the rapidity-azimuthal angle plane and $R$ the jet radius. If the symmetry condition Eq. (2.1) is not met, the procedure is recursively applied to the subjet with the larger $p_t$. Eq. (2.1) with $\beta = 0$ corresponds to the mMDT, while SD generalises it to the case of $\beta \neq 0$. Note that to some extent mMDT and SD have both tagging and grooming properties. When we use mMDT and SD together, the $z_\text{cut}$ parameter of SD will be renamed $\zeta_\text{cut}$ in order to avoid confusion.
$N$-subjettiness is defined as follows: for a given jet, one finds a set of $N$ axes $a_1, \dots, a_N$ (see below) and introduces
\begin{equation}
\tau_N^{(\beta_\tau)} = \frac{1}{p_t R^{\beta_\tau}} \sum_{i \in \text{jet}} p_{ti} \min\left(\theta_{i a_1}^{\beta_\tau}, \dots, \theta_{i a_N}^{\beta_\tau}\right), \tag{2.2}
\end{equation}
where the sum runs over all the constituents of the jet, of momentum $p_{ti}$ and with an angular distance $\theta_{i a_j} = \sqrt{\Delta y_{i a_j}^2 + \Delta\phi_{i a_j}^2}$ to the axis $a_j$; $\beta_\tau$ is a free parameter and in what follows we will concentrate on the case $\beta_\tau = 2$. This specific choice has shown good performance in Monte-Carlo numerical simulations and considerably simplifies the physical discussions below. However, the techniques introduced in this paper straightforwardly apply to other values of $\beta_\tau$, including the frequent experimental choice $\beta_\tau = 1$, and we will comment on this in Section 4.3 and Appendix A.
We still need to specify how to choose the $N$-subjettiness axes. In practice, there are several methods that one can use. Common choices include using exclusive $k_t$ axes or using "minimal" axes, i.e. the set of axes that minimises $\tau_N$. We will instead consider the case of exclusive axes obtained by declustering the result of a generalised-$k_t$ clustering with $p = 1/2$ [13,14]. The motivation behind this choice has been explained in [12] and is related to the fact that, since it preserves the ordering in mass, it produces results very close to the much more complex minimal axes.
Since weak bosons radiate less than QCD jets, the ratio $\tau_{21} = \tau_2/\tau_1$ is expected to be smaller for weak bosons and one imposes a cut $\tau_{21} < \tau_\text{cut}$ as a radiation constraint to distinguish weak bosons from the QCD background.
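As an illustration of the definitions in Eqs. (2.1) and (2.2), the following minimal Python sketch evaluates the symmetry condition for a single declustering step and computes $\tau_N$ for a set of axes that is supplied by hand. The function names, the toy constituents and the choice of axes are assumptions made for this example only; in a real analysis the axes would come from the generalised-$k_t$ declustering described above, and the implementation would be the one provided by FastJet and fjcontrib.

```python
import math

def symmetry_condition(pt1, pt2, theta12, z_cut=0.1, beta=0.0, R=1.0):
    """Evaluate Eq. (2.1) for one declustering step; returns (passes, z).

    beta = 0 corresponds to mMDT, beta != 0 to SoftDrop.
    """
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (theta12 / R) ** beta, z

def n_subjettiness(constituents, axes, pt_jet, beta_tau=2.0, R=1.0):
    """Evaluate Eq. (2.2) for constituents (pt, y, phi) and axes (y, phi)."""
    total = 0.0
    for pt, y, phi in constituents:
        distances = []
        for y_axis, phi_axis in axes:
            dphi = abs(phi - phi_axis)
            if dphi > math.pi:          # wrap the azimuthal difference
                dphi = 2.0 * math.pi - dphi
            theta = math.hypot(y - y_axis, dphi)
            distances.append(theta ** beta_tau)
        total += pt * min(distances)
    return total / (pt_jet * R ** beta_tau)

# Toy jet (hypothetical numbers): two hard prongs and one soft emission.
constituents = [(600.0, 0.0, 0.0), (400.0, 0.0, 0.2), (10.0, 0.0, 0.5)]
pt_jet = 1010.0  # scalar pt sum, used here as a stand-in for the jet pt
tau1 = n_subjettiness(constituents, [(0.0, 0.08)], pt_jet)
tau2 = n_subjettiness(constituents, [(0.0, 0.0), (0.0, 0.2)], pt_jet)
tau21 = tau2 / tau1
```

With the two axes placed on the hard prongs, only the soft emission contributes to $\tau_2$, which is why $\tau_{21}$ is small for this two-prong configuration.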

A useful graphical representation
To guide our discussion, it is helpful to consider the available phasespace for radiation inside a (QCD) jet in the soft-and-collinear limit and see how the various methods under consideration constrain that phasespace. This is conveniently done using Lund diagrams [16]. Consider an emission at an angle $\theta$ from the jet axis, carrying a fraction $z$ of the transverse momentum of the parent parton. Lund diagrams represent the two-dimensional phasespace for emissions using the angle, or more precisely $\log(1/\theta)$, on the horizontal axis, and the relative transverse momentum, $\log(k_t/p_{t,\text{jet}}) = \log(z\theta)$, on the vertical axis. As shown in Fig. 1, a line of constant momentum fraction $z$ corresponds to a diagonal line with $\log(k_t) = \text{constant} - \log(1/\theta)$ and a line of a given mass, $m^2 \sim z\theta^2$ in the soft and small-angle approximation, corresponds to a diagonal line with $\log(k_t) = \text{constant} + \log(1/\theta)$.
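The mapping onto the Lund plane can be made concrete with a short Python sketch. The function names are hypothetical and the jet radius and jet $p_t$ are both set to 1, which is an assumption of this example; the sketch simply checks that constant-$z$ and constant-mass lines have slopes $-1$ and $+1$ in these coordinates.

```python
import math

def lund_coordinates(z, theta):
    """Map an emission (z, theta) to (log(1/theta), log(z*theta)).

    Angles are in units of the jet radius and kt in units of the jet pt.
    """
    return math.log(1.0 / theta), math.log(z * theta)

def mass_scale(z, theta):
    """Soft, small-angle mass-like scale z*theta^2 of an emission."""
    return z * theta ** 2

# Two emissions with the same momentum fraction z = 0.1:
x1, y1 = lund_coordinates(0.1, 0.5)
x2, y2 = lund_coordinates(0.1, 0.25)
slope_constant_z = (y2 - y1) / (x2 - x1)      # expected slope: -1

# Two emissions with the same mass scale z*theta^2 = 0.01:
x3, y3 = lund_coordinates(0.04, 0.5)
x4, y4 = lund_coordinates(0.16, 0.25)
slope_constant_mass = (y4 - y3) / (x4 - x3)   # expected slope: +1
```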
In the soft-and-collinear approximation, sufficient for the following discussion, each emission comes with a weight
\begin{equation}
d^2\omega = \frac{2\alpha_s(k_t)\, C_R}{\pi}\, d\log\frac{1}{\theta}\, d\log\frac{1}{z}, \tag{2.3}
\end{equation}
with $C_R$ the colour factor of the parton initiating the jet, i.e. $C_F = 4/3$ or $C_A = 3$ respectively for quark and gluon jets. The strong coupling constant, $\alpha_s$, is evaluated at a scale equal to the transverse momentum of the emission relative to its emitter. Apart from running-coupling effects and subleading corrections in the hard-collinear and soft-large-angle regions, this weight is uniform over the Lund plane.

Figure 1: Lund diagram representing the phasespace available for an emission from the jet initial parton at an angle $\theta$ and carrying a momentum fraction $z$. The diagram shows a given emission (the solid dot) as well as lines with the same momentum fraction, $k_t$ and mass scales.
In the leading logarithmic approximation, the radiation in a jet is a superposition of independent and strongly-ordered (primary) emissions in that plane, as well as secondary emissions emitted from the primary emissions and which can be represented as extra Lund triangles (leaves) originating from each of the primary emissions, tertiary emissions emitted from secondary ones, etc... Leaves will be discussed in more detail below.
To illustrate how one can use this pictorial representation to discuss physics processes, let us consider the case of the (full) jet mass distribution $m^2/\sigma\, d\sigma/dm^2$. The corresponding Lund diagram is represented in the left panel of Fig. 2. One first needs an emission that provides the dominant contribution to the mass of the jet, i.e. an emission such that $m^2 = z\theta^2 p_t^2$ or such that $z\theta^2 = \rho = (m/(p_t R))^2$, where we have conveniently normalised the angles in units of the jet radius $R$ and introduced the dimensionless (squared) mass $\rho$ instead of $m^2$. For simplicity, we shall assume a jet radius of $R = 1$ from now on. The integrated weight for emissions that generate a (normalised) jet mass equal to $\rho$ is
\begin{equation}
R'_\text{full}(\rho) = \int_0^1 \frac{d\theta^2}{\theta^2} \int_0^1 \frac{dz}{z}\, \frac{\alpha_s(z\theta p_t)\, C_R}{\pi}\, \rho\, \delta(z\theta^2 - \rho) \overset{\text{f.c.}}{=} \frac{\alpha_s C_R}{\pi} \log\frac{1}{\rho}, \tag{2.4}
\end{equation}
where for the last equality we have illustrated the structure of the answer in a fixed coupling approximation, as indicated by the superscript "f.c.". Modulo corrections induced by the running of the strong coupling, the logarithmic behaviour basically comes from the integration over the solid line of equal mass in the Lund representation.
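We also need to impose that no emissions occur at larger mass. This induces a Sudakov suppression $\exp[-R_\text{full}(\rho)]$, with
\begin{equation}
R_\text{full}(\rho) = \int_0^1 \frac{d\theta^2}{\theta^2} \int_0^1 \frac{dz}{z}\, \frac{\alpha_s(z\theta p_t)\, C_R}{\pi}\, \Theta(z\theta^2 > \rho) \overset{\text{f.c.}}{=} \frac{\alpha_s C_R}{2\pi} \log^2\frac{1}{\rho}. \tag{2.5}
\end{equation}
The double-logarithmic behaviour corresponds to the shaded area in the Lund diagram. Note that $R'_\text{full}(\rho)$ defined in Eq. (2.4) is the derivative of $R_\text{full}(\rho)$ with respect to $\log(1/\rho)$.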
In the end, the leading-logarithmic (LL) result for the cross-section can be written as
\begin{equation}
\frac{\rho}{\sigma} \frac{d\sigma}{d\rho}\bigg|_\text{full} = R'_\text{full}(\rho)\, e^{-R_\text{full}(\rho)}. \tag{2.6}
\end{equation}
This expression has a simple graphical representation: a prefactor corresponding to the emission setting the mass, the solid line in the Lund diagram, and a Sudakov suppression for larger masses, the shaded area in the Lund diagram.
Let us now consider the jet mass distribution after the application of the mMDT. This is represented in the right panel of Fig. 2. In this case [2], emissions with $z < z_\text{cut}$ are discarded by the mMDT recursive procedure, so that both the prefactor $R'_\text{mMDT}$ for having an emission setting the jet mass and the Sudakov exponent $R_\text{mMDT}$ are restricted to $z > z_\text{cut}$ and we have (assuming $\rho \ll z_\text{cut} \ll 1$)
\begin{align}
\frac{\rho}{\sigma} \frac{d\sigma}{d\rho}\bigg|_\text{mMDT} &= R'_\text{mMDT}(\rho)\, e^{-R_\text{mMDT}(\rho)}, \tag{2.7a}\\
R'_\text{mMDT}(\rho) &\overset{\text{f.c.}}{=} \frac{\alpha_s C_R}{\pi} \log\frac{1}{z_\text{cut}}, \tag{2.7b}\\
R_\text{mMDT}(\rho) &\overset{\text{f.c.}}{=} \frac{\alpha_s C_R}{\pi} \left[\log\frac{1}{z_\text{cut}} \log\frac{1}{\rho} - \frac{1}{2}\log^2\frac{1}{z_\text{cut}}\right]. \tag{2.7c}
\end{align}
Compared to the full mass result, Eq. (2.6), the prefactor is smaller but the Sudakov suppression is also less important. In practice, we will therefore have a suppression of the QCD background at intermediate masses but an increase at very small masses. More generally, we see that in order to have a large suppression of the QCD background, we want a method that keeps the prefactor small but gives a large Sudakov suppression. This will be a key element of our dichroic approach.
Next, we consider the signal (electroweak boson) jets in the context of Lund diagrams. For, say, a $W$ boson, the original splitting, $W \to q\bar{q}$, occurs on a line of constant mass $m = m_W$ and, since the corresponding splitting function does not have a $1/z$ divergence at small $z$, this splitting will be concentrated close to the large-$z$ end of that constant-mass line, with the small-$z$ tail exponentially suppressed (in our logarithmic choice of axes). As a direct consequence, no emissions are possible at larger mass and there will not be any Sudakov factor for that region. For simplicity in our discussion below, we will assume a constant splitting function in $z$, which would be the case e.g. for a Higgs boson or an unpolarised $W$ boson. For an mMDT $z_\text{cut}$ condition, this yields a signal efficiency of $1 - 2z_\text{cut}$. Subsequent emissions from the original $q\bar{q}$ pair will happen as if they were secondary emissions from these two quarks, i.e. essentially in two separate Lund planes, each of them restricted to angles smaller than the separation $\theta_{q\bar{q}}$ between the two quarks, because of angular ordering.
One of those Lund planes (that for the softer of the qq pair) will be represented as a leaf, cf. Fig. 3. Now that we have discussed how mass distribution and radiation constraints are represented in terms of Lund diagrams, we will use Lund diagrams to discuss more complex substructure methods, leaving corresponding analytic expressions to section 5.
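To make the comparison between the full and mMDT mass spectra more tangible, the sketch below evaluates the fixed-coupling, leading-logarithmic structure "prefactor times Sudakov factor" of Eqs. (2.6) and (2.7) numerically. The closed forms used for the exponents are the standard double-logarithmic areas of the corresponding Lund regions and are an assumption of this example, as are the value of the coupling and all function names; running-coupling and subleading effects are ignored.

```python
import math

ALPHA_S = 0.1       # illustrative fixed coupling (assumption)
C_F = 4.0 / 3.0     # quark colour factor

def r_full(rho, c_r=C_F, alpha_s=ALPHA_S):
    """Fixed-coupling Sudakov exponent for the full jet mass."""
    return alpha_s * c_r / (2.0 * math.pi) * math.log(1.0 / rho) ** 2

def r_full_prime(rho, c_r=C_F, alpha_s=ALPHA_S):
    """Derivative of r_full with respect to log(1/rho)."""
    return alpha_s * c_r / math.pi * math.log(1.0 / rho)

def r_mmdt(rho, z_cut=0.1, c_r=C_F, alpha_s=ALPHA_S):
    """Fixed-coupling Sudakov exponent with z > z_cut (rho << z_cut << 1)."""
    log_rho, log_zc = math.log(1.0 / rho), math.log(1.0 / z_cut)
    return alpha_s * c_r / math.pi * (log_zc * log_rho - 0.5 * log_zc ** 2)

def r_mmdt_prime(z_cut=0.1, c_r=C_F, alpha_s=ALPHA_S):
    """Prefactor for an emission with z > z_cut setting the mass."""
    return alpha_s * c_r / math.pi * math.log(1.0 / z_cut)

def spectrum_full(rho):
    return r_full_prime(rho) * math.exp(-r_full(rho))

def spectrum_mmdt(rho, z_cut=0.1):
    return r_mmdt_prime(z_cut) * math.exp(-r_mmdt(rho, z_cut))

def signal_efficiency(z_cut=0.1):
    """mMDT signal efficiency for a splitting function flat in z."""
    return 1.0 - 2.0 * z_cut

# Example: an 80 GeV mass window centre for a 2 TeV jet with R = 1.
rho_w = (80.0 / 2000.0) ** 2
ratio_at_w = spectrum_mmdt(rho_w) / spectrum_full(rho_w)
ratio_small = spectrum_mmdt(1e-8) / spectrum_full(1e-8)
```

Within these assumptions, `ratio_at_w` is below one (the background is suppressed at intermediate masses) while `ratio_small` is above one (the reduced Sudakov suppression wins at very small masses), in line with the qualitative statement made after Eq. (2.7).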

Radiation constraints (N -subjettiness)
Let us now examine how a cut on $N$-subjettiness on the full jet affects the pattern of allowed radiation. Our discussion will be in a context where the full jet has a specified mass, denoted through $\rho$. The constraints imposed by a cut on the $N$-subjettiness ratio $\tau_{21}$ can then again be presented quite straightforwardly in terms of Lund diagrams, at least in the small-$\tau_{21}$ limit, which is what we will consider in our discussion.
Say that we have a first emission with an angle $\theta_1$ and momentum fraction $z_1$ that dominates the jet mass, $\rho = z_1\theta_1^2$. It can be shown [12] that, in our leading-logarithmic approximation, $\tau_N$ (with $\beta_\tau = 2$) will be dominated by the $N$th largest $z\theta^2$. We therefore have $\tau_1 \equiv \rho$ and imposing a cut $\tau_{21} < \tau_\text{cut}$ is equivalent to vetoing emissions down to a "mass-like scale" $z\theta^2 = \rho\tau_\text{cut}$, for both primary and secondary emissions. This is represented in Fig. 3 for QCD and signal jets, where the extra constraint on $N$-subjettiness corresponds to an extra Sudakov factor represented by the blue shaded region. In the background case, the leaf that emerges from the plane corresponds to a region of secondary emissions, while in the signal it corresponds to the region of emissions from the softer of the $q\bar{q}$ pair. Assuming a background mainly consisting of quark jets, the main parts of the plane in the two figures are both associated with a $C_F$ colour factor, while the leaf in the background case is associated with a $C_A$ colour factor, in contrast with the $C_F$ factor for the signal, and correspondingly represented with a darker shade of blue.
We see that we now have a Sudakov suppression for both the signal and the QCD background. Since the vetoed area is larger for the background than for the signal, the former is more suppressed than the latter, implying a gain in discriminating power. Furthermore, since, for a given $\tau_\text{cut}$, the vetoed area increases when $\rho$ gets smaller, the discriminating power will also be larger for more boosted jets.

Figure 3: Lund diagram for QCD background jets (left) and signal jets (right) corresponding to the requirement of a given full jet mass with a cut on the $N$-subjettiness ratio $\tau_{21}$. The red shaded region (present only in the background case) corresponds to the Sudakov vetoed region for the mass, as in Fig. 2, together with the prefactor for having an emission on the solid red line. The blue shaded region corresponds to the additional veto coming from the cut on $N$-subjettiness. The dashed/dotted red line for the signal case represents the fact that, for signal jets, small-$z$ configurations are exponentially suppressed. The region that emerges from the plane is referred to as a "leaf" and in the left-hand diagram represents secondary emissions from emission 1, while in the right-hand diagram it represents emissions from the softer of the two prongs of the decay.
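The leading-logarithmic statement that $\tau_N$ (for $\beta_\tau = 2$) is set by the $N$th largest $z\theta^2$ can be turned into a few lines of Python. This is only a sketch of that approximation under strong ordering, not of the full definition in Eq. (2.2); the function names and the toy emissions are hypothetical.

```python
def tau_n_leading_log(emissions, n):
    """Leading-log tau_N: the n-th largest z*theta^2 among the emissions.

    `emissions` is a list of (z, theta) pairs; returns 0.0 if there are
    fewer than n emissions.
    """
    scales = sorted((z * theta ** 2 for z, theta in emissions), reverse=True)
    return scales[n - 1] if len(scales) >= n else 0.0

def passes_tau21_cut(emissions, tau_cut):
    """True if tau_2 / tau_1 < tau_cut in the leading-log approximation."""
    tau1 = tau_n_leading_log(emissions, 1)
    tau2 = tau_n_leading_log(emissions, 2)
    return tau2 < tau_cut * tau1

# Toy strongly-ordered emissions: z*theta^2 = 0.018, 0.002, 0.0001.
emissions = [(0.2, 0.3), (0.05, 0.2), (0.01, 0.1)]
rho = tau_n_leading_log(emissions, 1)   # jet mass scale, tau_1 = rho
```

In this picture a cut $\tau_{21} < \tau_\text{cut}$ rejects the jet as soon as any emission other than the leading one has $z\theta^2 > \rho\tau_\text{cut}$, which is the veto region shaded in blue in Fig. 3.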

Combining mMDT/SD with N -subjettiness
We can now present the main proposal of this paper concerning the dichroic combination of a tagger with a radiation constraint. The discussion below assumes that we use SoftDrop or the modified MassDrop tagger as our tagger and a cut on τ 21 as a radiation constraint, but we believe that the core argument can also be applied to other shapes, for example to energy correlation functions [18][19][20].
Let us consider a high-$p_t$ large-radius ($R \sim 1$) jet on which we have applied an mMDT (or SD) tagger. The original large-radius jet will be called the full jet. The part of the jet that remains after the mMDT/SD tagging procedure will be called the tagged jet, and has an angular size comparable to the angle between the two hard prongs identified by the tagger. The $N$-subjettiness variables $\tau_1$ and $\tau_2$ can be evaluated either on the full or the tagged jet and there are three combinations of interest:
\begin{align}
\tau_{21}^\text{full} &= \frac{\tau_2^\text{full}}{\tau_1^\text{full}}, \tag{3.1a}\\
\tau_{21}^\text{tagged} &= \frac{\tau_2^\text{tagged}}{\tau_1^\text{tagged}}, \tag{3.1b}\\
\tau_{21}^\text{dichroic} &= \frac{\tau_2^\text{full}}{\tau_1^\text{tagged}}. \tag{3.1c}
\end{align}
The first two options are currently widely used in the literature (see e.g. [7][8][9][10][11] for recent examples). The third, "dichroic", option is a new combination, and is the subject of this paper (see footnote 6).
To understand how these different variants work, we will take two approaches. First we will consider what values of $\tau_{21}$ arise for different kinematic configurations involving three particles in the jet, i.e. two emissions in the case of QCD jets, and the original two prongs plus one additional emission in the case of signal jets. Then we will use this information to understand how a cut on $\tau_{21}$ constrains the radiation inside the jet.
During this discussion it will be useful to keep in mind the core difference between signal and background jets. In the case of the background jets, the whole Lund plane and the leaf can contain emissions, as shown in Fig. 3 (left). In the case of signal jets, emissions are mostly limited to the region shown in blue in Fig. 3 (right), i.e. at angles smaller than the decay opening angle and transverse momenta smaller than the mass. The leaves in the two cases have different colour factors; however, we will neglect this aspect in our discussion (see footnote 7). Rather we will concentrate on the differences that arise at large angle, i.e. from the different coherent radiation patterns of coloured versus net colour-neutral objects.
We consider the situation where, after the tagger has been applied, the tagged jet mass is dominated by emission "a", i.e. $\rho \approx z_a\theta_a^2$ (in the case of the signal jet this is the softer of the two prongs). The Lund-plane phasespace can then be separated into 3 regions depicted in Fig. 4. Region A (in red) is the region that is constrained to be free of radiation by the fact that the tagger has triggered on emission a. This corresponds to the region where both $z\theta^2 > z_a\theta_a^2$ and Eq. (2.1) are satisfied. It is responsible for the Sudakov exponent associated with the tagger, cf. Eq. (2.7).

Figure 4: In each Lund diagram, emission "a" corresponds to the emission that dominates the mMDT/SD jet mass. This defines three regions: region A (red) is vetoed by mMDT, region B (blue) contains the constituents of the mMDT/SD jet and region C (green) is the difference between the mMDT/SD jet and the full jet. Emissions "b" and "c" are respectively in regions B and C, and the three plots correspond to three different orderings of $z_c\theta_c^2$ compared to $z_a\theta_a^2$ and $z_b\theta_b^2$. The table below the plots shows the corresponding value of $\tau_{21}$ for both the QCD background (where all three regions have to be included) and the signal (where only regions A and B are present). For simplicity, "b/a" stands for $(z_b\theta_b^2)/(z_a\theta_a^2)$, and so forth.

Footnote 6: One can be tempted to also consider a fourth option where $\tau_1$ is computed on the full jet and $\tau_2$ on the tagged jet. It is straightforward to show, following the same arguments as below, that this is not the best combination, as one might expect intuitively.

Footnote 7: At low $p_t$ a significant part of $\tau_{21}$'s discriminating power is arguably associated with the leaf and, for gluon-initiated background jets, with the part of the main Lund plane that is at small angles compared to the decay opening. This is mostly equivalent to quark-gluon discrimination, which is known to be only moderately effective [18,[21][22][23] and not to improve significantly at high $p_t$. These effects are included in the analytic calculations of section 5.
Of the remaining phasespace, region B (blue) corresponds to emissions that are contained inside the tagged jet. It is populated in both signal and background cases. It contains not only emissions that satisfy the mMDT/SD condition ($z > z_\text{cut}$ in the case of mMDT), but also emissions with $z\theta^2 < z_a\theta_a^2$ and $\theta < \theta_a$, due to the Cambridge/Aachen declustering used by mMDT/SD. Region C (green) corresponds to emissions that are in the original full jet, but not in the tagged jet. It is uniformly populated in the background case, while in the signal case it is mostly empty of radiation, except at the left-hand edge (initial-state radiation) and the right-hand edge (leakage of radiation from the colour-singlet $q\bar{q}$ decay). The emission with the largest $z\theta^2$ in each of regions B and C will respectively be labelled b and c and we will assume strong ordering between emissions, as in section 2.
There are three kinematic cases to consider for the relative $z\theta^2$ ordering of emissions a, b and c, cf. Fig. 4. In each case, Fig. 4 gives the result for each of the $\tau_{21}$ variants, for both background and signal. The signal case simply assumes that there are no emissions in region C, which is appropriate in a double-logarithmic approximation. The results are expressed as a shorthand, $i/j \equiv z_i\theta_i^2/z_j\theta_j^2$. The case of the signal is particularly simple: since $z_b\theta_b^2 < z_a\theta_a^2$ and there is nothing in region C, all variants give
\begin{equation}
\tau_{21} = \frac{z_b\theta_b^2}{z_a\theta_a^2}.
\end{equation}
Given that the signal result is always the same, the performance of the signal/background discrimination will be best for the method that gives the largest background $\tau_{21}$ result (recall that one enhances signal relative to background by requiring $\tau_{21} < \tau_\text{cut}$).
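Let us examine the background separately for each of the three kinematic cases shown in Fig. 4:

For $z_c\theta_c^2 < z_b\theta_b^2 < z_a\theta_a^2$, all three $\tau_{21}$ variants give the same result as for the signal, $z_b\theta_b^2/z_a\theta_a^2$.

For $z_b\theta_b^2 < z_c\theta_c^2 < z_a\theta_a^2$, $\tau_{21}^\text{tagged}$ is still given by $z_b\theta_b^2/z_a\theta_a^2$, but $\tau_{21}^\text{full}$ and $\tau_{21}^\text{dichroic}$ now both take the larger value of $z_c\theta_c^2/z_a\theta_a^2$. They should therefore perform better in this case.

Finally, for $z_b\theta_b^2 < z_a\theta_a^2 < z_c\theta_c^2$, $\tau_{21}^\text{tagged}$ is again given by $z_b\theta_b^2/z_a\theta_a^2$; $\tau_{21}^\text{full}$ is given by $z_a\theta_a^2/z_c\theta_c^2$, since $\tau_1$ is dominated by emission c, while $\tau_2$ is dominated by emission a. Depending on the exact configuration, $\tau_{21}^\text{full}$ may be larger or smaller than $z_b\theta_b^2/z_a\theta_a^2$ and so may or may not be advantageous. $\tau_{21}^\text{dichroic}$ has a value of $z_a\theta_a^2/z_a\theta_a^2 = 1$, which is always larger than the signal and larger than the other two variants.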
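The three kinematic cases can be checked with a small Python sketch that works directly with the mass-like scales $z\theta^2$ of emissions a, b and c, using the leading-logarithmic rule that $\tau_N$ is the $N$th largest scale in the (sub)jet on which it is computed. The function name and the numerical values are assumptions chosen for illustration.

```python
def tau21_variants(a, b, c=None):
    """Leading-log tau_21 for the tagged, full and dichroic variants.

    a: z*theta^2 of the emission setting the tagged-jet mass,
    b: largest z*theta^2 in region B (inside the tagged jet, b < a),
    c: largest z*theta^2 in region C (full jet only), or None if
       region C is empty, as assumed for signal jets.
    """
    full_scales = sorted([a, b] + ([c] if c is not None else []), reverse=True)
    tau1_full, tau2_full = full_scales[0], full_scales[1]
    tau1_tagged, tau2_tagged = a, b
    return {
        "tagged": tau2_tagged / tau1_tagged,
        "full": tau2_full / tau1_full,
        "dichroic": tau2_full / tau1_tagged,
    }

a, b = 0.01, 0.001
signal = tau21_variants(a, b)              # region C empty
case1 = tau21_variants(a, b, c=0.0001)     # c < b < a
case2 = tau21_variants(a, b, c=0.004)      # b < c < a
case3 = tau21_variants(a, b, c=0.05)       # b < a < c
```

For these inputs all variants give $b/a = 0.1$ for the signal and in the first case, the full and dichroic variants rise to $c/a = 0.4$ in the second case, and in the third case the full variant gives $a/c = 0.2$ while the dichroic variant reaches 1, so the dichroic ratio is never smaller than the other two.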
Overall therefore, $\tau_{21}^\text{dichroic}$ is expected to be the best of the three variants.
Alternatively, we can also see the benefit of the dichroic combination by examining directly how emissions are constrained when one applies a given cut on the $\tau_{21}$ ratio, similarly to the discussion in Section 2.3. We have represented the Lund diagrams relevant for our discussion in Fig. 5, where we have used the same regions A, B and C as in the above discussion. We start by considering a jet for which we already have applied the mMDT/SD procedure, resulting in a (mMDT/SD) mass $\rho$ dominated by emission "a". This automatically comes with a mMDT/SD prefactor and Sudakov suppression represented by the solid red line and shaded light red area (region A) in Fig. 5. For $\tau_{21}^\text{tagged}$, the cut only constrains emissions inside the tagged jet: it vetoes emissions in region B down to $z\theta^2 = \rho\tau_\text{cut}$ and leaves region C unconstrained.
The situation for $\tau_{21}^\text{full}$ is a bit more involved and we have three cases to consider. The first case is when there is (at least) one emission in region C with $z\theta^2 > z_a\theta_a^2/\tau_\text{cut} \equiv \rho/\tau_\text{cut}$ and is represented in Fig. 5(ii). Let us then call emission "c" the emission in region C with the largest $z\theta^2$, which thus comes with a Sudakov suppression imposing that there are no other emissions in region C with $z\theta^2 > z_c\theta_c^2$. Emission "c" will dominate $\tau_1$ so that the cut on $\tau_{21}$ will come with an extra suppression factor in region C extending from $z_c\theta_c^2$ down to $z\theta^2 = z_c\theta_c^2\tau_\text{cut}$. Consequently, all emissions down to $z_c\theta_c^2\tau_\text{cut}$ are vetoed as depicted in Fig. 5(ii). The second case is when the emission in region C with the largest $z\theta^2$ satisfies $z_a\theta_a^2 \equiv \rho < z_c\theta_c^2 < \rho/\tau_\text{cut}$. This region, represented by the hatched area in Fig. 5(ii), is entirely forbidden because it would give a value of $\tau_{21} \geq z_a\theta_a^2/z_c\theta_c^2$, which is always larger than $\tau_\text{cut}$. The third case is when there are no emissions in region C with $z\theta^2 > \rho$. This directly comes with a Sudakov suppression in region C vetoing emissions down to $z\theta^2 = \rho$. In this case, $\tau_1$ is dominated by emission "a" and the constraint on $\tau_{21}$ further vetoes emissions with $\rho\tau_\text{cut} < z\theta^2 < \rho$ in both regions B and C. These two vetoes combine to veto all emissions down to $\rho\tau_\text{cut}$, as represented in Fig. 5(iii).
If instead we use our new $\tau_{21}^\text{dichroic}$ variable, we are always in the situation of Fig. 5(iii), where we veto all emissions down to a mass scale $\rho\tau_\text{cut}$ in both regions B and C. This new version therefore comes with the strongest Sudakov suppression, i.e. of the three $\tau_{21}$ variables it is the one that, for background jets, is least likely to have a small $\tau_{21}$ value. Given that the three $\tau_{21}$ variants behave similarly to each other for signal, the signal-to-background discrimination should be improved for the dichroic variant.
With our dichroic method, we actually recover the same overall Sudakov suppression as the one we had when measuring the full jet mass and cutting on the full $N$-subjettiness (see Section 2.3 and Fig. 3 (left)). The gain of our new method (3.1c) compared to this full $N$-subjettiness case comes from the fact that the prefactor associated with the jet mass is now subject to the constraint imposed by the tagger. If we take for example the case of the mMDT, this prefactor would be largely suppressed for the background, going from $\sim \alpha_s \log(1/\rho)$ for full $N$-subjettiness to $\sim \alpha_s \log(1/z_\text{cut})$ for the dichroic method, while the signal would only be suppressed by a much smaller factor $\sim 1 - 2z_\text{cut}$. Additionally, measuring the tagged jet mass instead of the full jet mass significantly reduces ISR and non-perturbative effects which would otherwise affect the resolution of the signal mass peak (see also [24,25]).
Finally, we note that the gain in performance is expected to increase for larger boosts due to region C getting bigger (double-logarithmically in ρ).

Dichroic subjettiness with SoftDrop (pre-)grooming
Since $\tau_{21}^\text{dichroic}$ uses $\tau_2$ computed on the full jet, including all the soft radiation at large angles, we can expect this observable to be quite sensitive to poorly-controlled non-perturbative effects (hadronisation and the Underlying Event) and to pileup.
The standard strategy to mitigate these effects is to kill two birds with one stone: use mMDT (or SD) both as a two-prong tagger and as a groomer, and impose the $\tau_{21}$ constraint on the result. This is equivalent to the $\tau_{21}^\text{tagged}$ variant discussed above (Fig. 5(a)), with the drawback and loss of performance described in the previous Section.
We show here how we can achieve a background rejection that is larger than for $\tau_{21}^\text{tagged}$ and more robust with respect to non-perturbative effects than $\tau_{21}^\text{dichroic}$. Conceptually, the idea is that the tagger and groomer achieve two different tasks: the tagger selects a two-prong structure in the jet, imposing a rather hard constraint on the soft radiation in order to do so, leading to a small $R'$ prefactor for the jet mass. This is not quite what we want from a groomer, which should get rid of the soft-and-large-angle radiation while retaining enough of the jet substructure to have some discriminating power when using radiation constraints.

Figure 6: Phasespace constraints on QCD jets obtained from our new combination including grooming: we first groom the jet, e.g. with SoftDrop (SD). We then compute both the jet mass and $\tau_1$ on the tagged jet (here using the mMDT), yielding the solid red line prefactor and the shaded red region (A) for the Sudakov exponent. We then impose a cut on the $\tau_{21}$ ratio with $\tau_2$ computed on the SD jet, leading to the extra shaded blue and green regions (B and C) for the Sudakov exponent.
This suggests the following picture: we first apply a "gentle" grooming procedure to the jet, like a SoftDrop procedure with a positive value of $\beta$. This is meant to clean the jet of the unwanted soft junk while retaining as much as possible the information about the perturbative radiation in the jet. We can then carry on with the dichroic method presented in the previous Section, i.e. use a more aggressive tagger, like mMDT, to compute the jet mass and $\tau_1$, and compute $\tau_2$ on the SD (pre-)groomed jet:
\begin{equation}
\tau_{21,\text{groomed}}^\text{dichroic} = \frac{\tau_2^\text{groomed}}{\tau_1^\text{tagged}}.
\end{equation}
This is depicted in Fig. 6, where regions A and B are the same as in the previous Section, but now region C indicates the region where emissions are kept by the groomer but rejected by the tagger. Similarly, we can introduce
\begin{equation}
\tau_{21,\text{groomed}}^\text{full} = \frac{\tau_2^\text{groomed}}{\tau_1^\text{groomed}}.
\end{equation}
Note that we will always choose our mMDT-tagging and SD-grooming parameters such that the tagged jet is the same whether tagging is performed before or after grooming. For mMDT-tagging with parameter $z_\text{cut}$ and SD-grooming with parameters $\zeta_\text{cut}$ and $\beta$, this implies $\zeta_\text{cut} \leq z_\text{cut}$ and $\beta \geq 0$.
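The role of the conditions $\zeta_\text{cut} \leq z_\text{cut}$ and $\beta \geq 0$ can be illustrated at the level of a single splitting: for $\theta \leq R$, any splitting that passes the mMDT condition then also passes the SoftDrop condition, so the groomer never removes what the tagger would keep. The Python sketch below scans a grid of $(z, \theta)$ values to check this; it is a simplified per-splitting check rather than a full declustering, and the function names, grid and default parameter values ($z_\text{cut} = 0.1$, $\zeta_\text{cut} = 0.05$, $\beta = 2$) are illustrative assumptions.

```python
def passes_mmdt(z, z_cut=0.1):
    """mMDT condition: Eq. (2.1) with beta = 0."""
    return z > z_cut

def passes_softdrop(z, theta, zeta_cut=0.05, beta=2.0, R=1.0):
    """SoftDrop condition: z > zeta_cut * (theta / R)**beta."""
    return z > zeta_cut * (theta / R) ** beta

# Grid of splittings with 0 < z <= 0.5 and 0 < theta <= R = 1.
grid = [(i / 100.0, j / 100.0) for i in range(1, 51) for j in range(1, 101)]

# Splittings kept by the tagger but removed by the groomer (should be none).
violations = [(z, th) for z, th in grid
              if passes_mmdt(z) and not passes_softdrop(z, th)]

# Splittings kept by the groomer but rejected by the tagger: these are the
# ones that populate region C of Fig. 6.
region_c = [(z, th) for z, th in grid
            if passes_softdrop(z, th) and not passes_mmdt(z)]
```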
Using the same arguments as in Section 3.1, we can show straightforwardly that this method will have a larger rejection than the other two variants where one would be computing the jet mass on the mMDT-tagged jet and the $\tau_{21}$ ratio either on the mMDT-tagged jet, $\tau_{21,\text{groomed}}^\text{tagged} \equiv \tau_{21}^\text{tagged}$, or on the SD-groomed jet, $\tau_{21,\text{groomed}}^\text{full}$, owing to a larger Sudakov suppression of the background, for a similar signal efficiency.
Compared to the other possible situation where both the jet mass and the $\tau_{21}$ ratio are computed on the SD-groomed jet, the dichroic variant would have a smaller $R'$ prefactor, associated with mMDT instead of SD. This again leads to a larger background rejection.
Because of the initial grooming step, the groomed dichroic subjettiness ratio is expected to be less discriminating than the ungroomed version introduced in Section 3.1. Indeed, the associated Sudakov exponent is smaller since we have amputated part of the soft-large-angle region. One should however expect that this groomed variant will be less sensitive to non-perturbative effects. Overall, there is therefore a trade-off between effectiveness, in terms of achieving the largest suppression of the QCD background for a given signal efficiency, and perturbative robustness, in terms of limiting the sensitivity to poorly-controlled non-perturbative effects.

Performance in Monte-Carlo simulations
Let us now investigate the effectiveness and robustness of dichroic subjettiness ratios in Monte-Carlo simulations, using Pythia 8.186 [35], at a centre-of-mass energy of $\sqrt{s} = 13$ TeV. Our signal sample consists of $WW$ events, while for the background we use dijet events. Jets are reconstructed with the anti-$k_t$ [13] algorithm with $R = 1$, and in determining signal and background efficiencies we keep all jets above a given $p_t$ cut. We use the modified MassDrop tagger with $z_{\text{cut}} = 0.1$ for the 2-prong tagging and vary the cut on the $\tau_{21}$ ratio. Whenever a SoftDrop (SD) grooming procedure is included, we use $\zeta_{\text{cut}} = 0.05$ and $\beta = 2$ as illustrative parameter choices (recall that the SoftDrop condition is imposed as $z > \zeta_{\text{cut}} (\theta_{12}/R)^{\beta}$ instead of Eq. (2.1), i.e. we use separate symbols $z_{\text{cut}}$ and $\zeta_{\text{cut}}$ for the parameters of mMDT and SD, respectively). Jet reconstruction and manipulation are performed with FastJet 3.2.0 [14,36] and fjcontrib 1.024 [37].

Figure 7: $\tau_{21}$ distributions for jets in dijet (solid lines) and $WW$ (dashed lines) events, again imposing $p_t > 2$ TeV and including SoftDrop grooming. Different colours correspond to different combinations of jets used for the computation of the jet mass, $\tau_1$ and $\tau_2$, as indicated in the legend, our new dichroic combination being plotted in black. We have selected jets with a mass between 60 and 100 GeV. The cross-section used for normalisation, $\sigma$, is defined after the jet $p_t$ and mass cut, so that all curves integrate to one.

We start by examining the $\tau_{21}$ distribution. This is plotted in Fig. 7 for both QCD jets (solid lines) in dijet events and $W$ jets (dashed lines) in $WW$ events. We select jets above 2 TeV and always apply SoftDrop grooming. In practice, we use parton-level events, and impose a cut on the reconstructed jet mass (SD-groomed or mMDT-tagged), $60 < m < 100$ GeV.
We consider four cases: the τ full 21,groomed distribution when we cut on the SD-groomed mass and the τ tagged 21 , τ full 21,groomed and τ dichroic 21,groomed distributions when we cut on the mMDT-tagged mass. As expected, the distributions for signal (W ) jets are peaked at smaller values of τ 21 than the corresponding distribution for background (QCD) jets. Fig. 7 shows that all the signal distributions, and in particular the three options where one measures the mMDT-tagged jet mass, are very similar. This is in agreement with our discussion in the previous Section. Comparatively the background distributions look rather different. The case where everything is computed from the mMDT-tagged jet (the solid blue curve) peaks at smaller values of τ 21 as expected from its smaller Sudakov suppression, related to the fact that this combination puts no constraints on large-angle emissions (region C in the previous Section). Furthermore, the dichroic combination, the solid black curve in Fig. 7, is expected to have the largest suppression and is indeed peaked at larger τ 21 values, translating into a larger discrimination against signal jets.
Note that the $\tau_{21}$ distribution for the dichroic combination also shows a peak for $\tau_{21} > 1$ that we have not discussed in our earlier argumentation. This comes from events with multiple emissions in region C and will be discussed briefly in our analytic calculations in Section 5.

Figure 8: Mass distribution for QCD jets with $p_t > 2$ TeV (anti-$k_t$, $R = 1$) at parton level, including SoftDrop grooming. The dashed lines, in red for the SD-groomed jet and in blue for the mMDT-tagged jet, are the mass distributions with no constraint on $N$-subjettiness. The solid lines have an additional cut $\tau_{21} < 0.3$, with different combinations of jets used for the computation of the jet mass, $\tau_1$ and $\tau_2$ as indicated in the legend, our dichroic combination being plotted using a solid black line. The cross section used for normalisation, $\sigma$, is that for jets above the $p_t$ cut.
Results for the mass distribution obtained for background (QCD dijet) jets at parton level (without UE) are presented in Fig. 8. As in Fig. 7, SoftDrop grooming has always been applied prior to any additional tagging or $N$-subjettiness cut. Again, we can identify most of the features discussed in Section 3. First of all, if we compare the mMDT-tagged mass (dashed blue curve) to the SD-groomed jet mass (dashed red curve), we see that the latter is smaller than the former at small masses, owing to the larger Sudakov factor $R_{\text{SD}} > R_{\text{mMDT}}$, but larger at intermediate masses, due to the larger prefactor $R'_{\text{SD}} > R'_{\text{mMDT}}$.
Then, we can consider the effect of the additional constraint on the $\tau_{21}$ ratio, taken here as $\tau_{21} < 0.3$ for illustrative purposes. If we compute $\tau_{21}$ on the same jet as for the mass ($\tau_{21,\text{groomed}}^{\text{full}}$ in solid red and $\tau_{21}^{\text{tagged}}$ in solid blue for the SD-groomed and mMDT-tagged jets respectively), we see that the cut reduces the background, that the reduction increases for smaller masses and that the reduction is larger for the SD-groomed jet than for the mMDT-tagged jet. This last point reflects the fact that the Sudakov suppression associated with the $N$-subjettiness cut is larger when both the mass and $\tau_{21}$ are computed on the SD-groomed jet (Fig. 3(left)) than when both the mass and $\tau_{21}$ are computed on the mMDT-tagged jet (Fig. 5(left)). Then, when measuring the mMDT-tagged jet mass, one sees that computing $\tau_{21}$ on the SD-groomed jet ($\tau_{21,\text{groomed}}^{\text{full}}$, the solid green curve in Fig. 8) shows a larger suppression than computing $\tau_{21}$ on the mMDT-tagged jet, although the difference is reduced at very small masses. Finally, if we consider our new, dichroic case, Eq. (3.2) ($\tau_{21,\text{groomed}}^{\text{dichroic}}$, the solid black curve), we see a larger suppression than in all other cases, as expected from our earlier arguments.

Signal v. background discrimination and other performance measures
To further test the performance of our new method, we have also studied ROC (receiver operating characteristic) curves, shown in Fig. 9 for parton-level simulations and in Fig. 10 for hadron-level events including hadronisation and the Underlying Event. In all cases, we impose the constraint that the mass is between 60 and 100 GeV. Efficiencies are given relative to the inclusive cross-section for having jets above our p t cut.
Let us first discuss the result of parton-level simulations, Fig. 9, where the dichroic ratio is again represented by the black curves. Without grooming (the left-hand plot in the figure), our method shows a substantial improvement compared to all other combinations considered, outperforming them by almost 30% in background rejection at a signal efficiency of 50% and by more than a factor of 2 at a signal efficiency of 40%. After SoftDrop grooming (right-hand plot), the dichroic method, i.e. computing the jet mass and τ 1 on the mMDT-tagged jet and τ 2 on the SD-groomed jet, still shows an improvement, albeit less impressive than what is observed using the full jet to compute τ 2 .
If instead we consider the results at hadron level, including both the perturbative parton shower as well as non-perturbative effects, Fig. 10, we see that the dichroic subjettiness ratio still does a better job than the other variants but the gain is smaller. For example, measuring the mMDT-tagged mass with a cut on the groomed dichroic ratio, τ dichroic 21,groomed , the optimal choice in Fig. 10, is only slightly better than the next best choice where one measures the SD-groomed mass and imposes a constraint on τ full 21,groomed . This is because in going from parton to hadron level, the ρ groomed -τ full 21,groomed curve has moved down more than the ρ tagged -τ dichroic 21,groomed curve, i.e. the former is getting a significantly larger boost in its discriminating power from non-perturbative effects. 12 This is potentially problematic, because one does not necessarily want signal-to-background discrimination power for a multi-TeV object to be substantially driven by the physics that takes place at a scale of 1 GeV, physics that cannot, with today's techniques, be predicted from first principles. Additionally, phenomena happening on a scale of 1 GeV are difficult to measure reliably.
It would be interesting to investigate non-perturbative effects in greater depth, both analytically, e.g. following the approach used in [2], or by studying their dependence across different Monte-Carlo generators and associated tunes. However, for the purpose of this article, we limit ourselves to using the results from Pythia 8. In evaluating the overall performance of different $\tau_{21}$ combinations we will consider both the signal significance and the size of non-perturbative effects. We will use the following alternative to ROC curves. For a given method and $p_t$ cut, we first determine the $\tau_{21}$ cut required to obtain a desired signal efficiency (at hadron level). For that value of the $\tau_{21}$ cut, we can compute the signal significance, defined as $\varepsilon_S/\sqrt{\varepsilon_B}$ (computed at hadron level), which is a measure of the discriminating power of the method; we then estimate non-perturbative effects as the ratio of the background efficiency at hadron level to the background efficiency at parton level, which is a measure of robustness against non-perturbative effects. We will show results for a range of different signal-efficiency choices and jet $p_t$ cuts.
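The procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the inputs are plain lists of $\tau_{21}$ values for jets that already pass the $p_t$ and mass-window selection, efficiencies are normalised to the number of jets above the $p_t$ cut (the hypothetical `n_*` arguments), and the cut is taken as $\tau_{21} \le \tau_{\text{cut}}$. None of the names below come from the original analysis code.

```python
import math


def efficiency(tau21_values, n_inclusive, tau_cut):
    """Fraction of the inclusive sample in the mass window with tau21 <= tau_cut."""
    return sum(1 for t in tau21_values if t <= tau_cut) / n_inclusive


def working_point(sig_hadron, n_sig, bkg_hadron, n_bkg_hadron,
                  bkg_parton, n_bkg_parton, target_eff):
    """Return (tau_cut, significance, np_effect) for a target signal efficiency.

    tau_cut is fixed on the hadron-level signal sample; the same cut is then
    applied to the hadron-level and parton-level background samples.
    """
    ordered = sorted(sig_hadron)
    n_needed = math.ceil(target_eff * n_sig)
    if not 1 <= n_needed <= len(ordered):
        raise ValueError("target signal efficiency cannot be reached")
    tau_cut = ordered[n_needed - 1]

    eps_s = efficiency(sig_hadron, n_sig, tau_cut)
    eps_b_hadron = efficiency(bkg_hadron, n_bkg_hadron, tau_cut)
    eps_b_parton = efficiency(bkg_parton, n_bkg_parton, tau_cut)
    if eps_b_hadron == 0 or eps_b_parton == 0:
        raise ValueError("no background jets pass the cut; sample too small")

    significance = eps_s / math.sqrt(eps_b_hadron)  # eps_S / sqrt(eps_B)
    np_effect = eps_b_hadron / eps_b_parton         # hadron / parton level
    return tau_cut, significance, np_effect
```

A value of `np_effect` close to 1 indicates that the background rejection is essentially determined at parton level, while values far from 1 signal a strong reliance on hadronisation and the Underlying Event.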
In Fig. 11, which highlights the key performance features of the dichroic method, we plot the signal significance versus the non-perturbative effects for different methods. In the left-hand panel, the curves correspond to a range of $\tau_{21}$ cuts for jets with $p_t > 2$ TeV. The points on the curves correspond to different signal efficiencies (starting from 0.05, in steps of 0.05, and with $\varepsilon_S = 0.5$ indicated by a bigger point). In the right-hand panel, the points on the curves correspond to different $p_t$ cuts, with the $\tau_{21}$ cut adjusted (as a function of $p_t$) so as to ensure a constant signal efficiency of 0.5. To avoid a proliferation of curves, the result for the ungroomed $\rho_{\text{full}}$-$\tau_{21}^{\text{full}}$ combination is not shown, since it is obvious from the ROC curves in Figs. 9 and 10(left) that it is extremely sensitive to non-perturbative effects.
In both plots, we see that the dichroic method comes with larger discriminating power and a relatively limited sensitivity to non-perturbative effects, provided one first applies a grooming step. Without the grooming step, one observes a much larger sensitivity to non-perturbative effects, as one might expect. It also appears that the performance gain increases when the boost, i.e. the jet $p_t$, increases. This was also expected from our arguments in Section 3. Finally, compared to the common setups in the literature, namely modified MassDrop tagging with a cut on $\tau_{21}$ applied either on the mMDT-tagged jet ($\rho_{\text{tag}}$-$\tau_{21}^{\text{tagged}}$, the dot-dashed blue curve) or on the full jet ($\rho_{\text{tag}}$-$\tau_{21}^{\text{full}}$, the dashed green curve), our dichroic method with grooming (solid black) gives up to a factor of two improvement in signal significance, with comparable non-perturbative effects. Considering other combinations that have not been widely used experimentally, $\tau_{21,\text{groomed}}^{\text{full}}$ with either a groomed ($\rho_{\text{groom}}$, solid red) or a tagged ($\rho_{\text{tag}}$, solid green) jet mass both perform well; however, $\tau_{21,\text{groomed}}^{\text{dichroic}}$ still remains the best, with an optimal significance that is about 25% larger, and smaller non-perturbative corrections for any given signal significance.

Figure 11: Signal significance plotted versus the non-perturbative effects for the QCD background (defined as the ratio between the background "fake" tagging rate at hadron and parton level). Different curves correspond to different combinations indicated in the legend. For the solid curves, a SoftDrop ($\beta = 2$ and $\zeta_{\text{cut}} = 0.05$) grooming is applied, while no grooming is applied for the dashed curves. In the left-hand plot, we impose a 2 TeV $p_t$ cut on the initial jet. The symbols on each curve then correspond to a signal efficiency (computed at hadron level) ranging from 0.05 upwards in steps of 0.05, with the large symbol on each line corresponding to $\varepsilon_S = 0.5$ and the efficiency at the right-hand extremity explicitly labelled. In the right-hand plot, the signal efficiency (computed at hadron level) is fixed to be 0.5 and the $p_t$ cut on the jet is varied between 500 GeV and 3 TeV (in steps of 500 GeV, labelled explicitly for the groomed dichroic ratio), with the large symbol on each line corresponding to a 3 TeV cut.

As a final check, we have studied the dependence of the signal efficiency on the $\tau_{21}$ cut, as shown in Fig. 12. Comparing the left and right-hand plots, it appears clearly that applying SoftDrop grooming helps to reduce non-perturbative effects, which otherwise significantly lower the signal efficiency. It is also interesting to notice that without grooming, the signal efficiency obtained with our dichroic method (the dashed black curve on the left plot of Fig. 12) only reaches its plateau for cuts on $\tau_{21}$ larger than 1, already at parton level. This can likely be attributed to initial-state radiation in the jet at angles larger than the decay angle of the $W$ boson. These effects are strongly reduced by SoftDrop grooming (see also the discussion in Section 5).
In the end, a more complete study would include variations of the SD parameters and of the cuts on the mass. A brief investigation of the SD parameters shows that our choice of β = 2 and ζ cut = 0.05 seems a decent default, at least for the process and kinematic domain under study. However, in view of the good signal efficiency reached when computing both the jet mass and τ 21 with SoftDrop, it might also be interesting to investigate our dichroic combination where we also use SoftDrop for the tagger instead of the mMDT. An extensive analytic study foreseen in a follow-up paper [38] would allow for a systematic study of these effects. Such an analytic understanding could also be of use in the context of building decorrelated taggers [10].

Brief comparison with other tools
To complete our Monte Carlo studies, in Fig. 13 we compare the performance of $\tau_{21,\text{groomed}}^{\text{dichroic}}$ with various other tools: mMDT tagging alone, SoftDrop grooming alone ($\beta = 2$ as above), and also the $Y_m$ variant [25] of Y-splitter [39], combined either with SoftDrop (pre-)grooming or with trimming, as described in detail in Ref. [25] (see also Ref. [24]). Whereas in the analogous Fig. 11 all curves involved the same signal efficiency, here this is no longer the case. Accordingly, efficiencies are reported versus $p_t$ in Table 1. Let us start by examining the pure mMDT result: as known already from [2], it provides mild tagging, has small non-perturbative corrections and only a modest dependence on $p_t$. SoftDrop ($\beta = 2$), when used alone, has slightly lower significance and larger non-perturbative corrections. These two tools have the highest signal efficiencies, of about 63% and 76% respectively at 2 TeV.
Next we examine combinations that involve $Y_m$-splitter. Recall that this tool undoes the last clustering of a generalised-$k_t$ ($p = 1/2$) clustering of the jet constituents, determines $y = \min(p_{t1}^2, p_{t2}^2)\,\Delta R_{12}^2/m^2$ on the two resulting prongs, and then imposes a cut $y > y_{\text{cut}}$. This cut is similar in its effect to $z_{\text{cut}}$ in mMDT. When used in conjunction with SD (pre-)grooming, the highest-mass emission that passes the SD cut is also the one that is unclustered by $Y_m$-splitter and so it is required to pass the $y_{\text{cut}}$ condition. As a result, the constraint in the Lund plane turns out, at the leading-log level, to be identical to that obtained with $\tau_{21,\text{groomed}}^{\text{dichroic}}$ and the condition $\tau_{\text{cut}} = 1$, with a Sudakov suppression vetoing all emission down to a mass scale $\rho$ in the SD-groomed jet, and a small prefactor $\sim \alpha_s \ln(1/y_{\text{cut}})$. This is reflected in Fig. 13, where one sees that the $\tau_{21,\text{groomed}}^{\text{dichroic}} < 1$ curve (black open diamonds) is remarkably similar to the SD+$Y_m$-splitter curve (red open squares). The advantage of the $\tau_{21,\text{groomed}}^{\text{dichroic}}$ variable is that one can further adjust the choice of $\tau_{\text{cut}}$, whereas with SD+$Y_m$-splitter that freedom is not available.
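As an illustration of the $y$ variable quoted above, the sketch below evaluates $y = \min(p_{t1}^2, p_{t2}^2)\,\Delta R_{12}^2/m^2$ for two prongs specified by transverse momentum, rapidity and azimuth. The function names are hypothetical, the azimuths are assumed to lie within a single $2\pi$ range, and the declustering step that produces the two prongs is not shown.

```python
import math


def delta_r2(rap1, phi1, rap2, phi2):
    """Squared rapidity-azimuth distance, with the azimuthal difference wrapped to [0, pi]."""
    dphi = abs(phi1 - phi2)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rap1 - rap2) ** 2 + dphi ** 2


def y_variable(pt1, rap1, phi1, pt2, rap2, phi2, jet_mass):
    """y = min(pt1^2, pt2^2) * DeltaR_12^2 / m^2 for the two prongs of a jet of mass m."""
    return min(pt1, pt2) ** 2 * delta_r2(rap1, phi1, rap2, phi2) / jet_mass ** 2


def passes_y_cut(pt1, rap1, phi1, pt2, rap2, phi2, jet_mass, y_cut):
    """Apply the condition y > y_cut; the value of y_cut is left to the caller."""
    return y_variable(pt1, rap1, phi1, pt2, rap2, phi2, jet_mass) > y_cut
```

For two prongs with $p_t$ of 100 and 900 GeV separated by $\Delta R_{12} = 0.2$ in an 80 GeV jet, this gives $y = 0.0625$.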
Of the various $Y_m$ combinations considered in Ref. [25], the one that gave the best signal-to-background discrimination was $Y_m$ with trimming, shown as red solid squares in Fig. 13. Overall it performs less well than the mMDT plus $\tau_{21,\text{groomed}}^{\text{dichroic}}$ combination with $\varepsilon_S$ fixed to 0.4, even though it has a broadly similar signal efficiency.
Another point to discuss concerns the choice of $\beta_\tau$ in the $N$-subjettiness definition, Eq. (2.2). Many experimental uses of $N$-subjettiness ratios have concentrated on the choice $\beta_\tau = 1$, while throughout this article we have used $\beta_\tau = 2$. A discussion of the $\beta_\tau = 1$ case is given in Appendix A, including comparisons of dichroic and normal variants. The dichroic variant always performs best for $\beta_\tau = 1$ as well, and so in the brief summary that we give here we will only show dichroic results.
An argument often given for the choice of $\beta_\tau = 1$ is that it is less sensitive to non-perturbative effects. Fig. 13 (right) shows groomed (filled symbols, solid lines) and ungroomed (open symbols, dashed lines) results for $\beta_\tau = 1$ (squares and triangles) and $\beta_\tau = 2$ (circles). For the $\beta_\tau = 1$ case, we have considered either exclusive-$k_t$ axes with the standard E-scheme four-vector recombination (triangles), or exclusive-$k_t$ axes with the winner-takes-all (WTA) recombination scheme (squares) [15,40,41]. In both the SD-groomed and ungroomed cases, the non-perturbative corrections are somewhat smaller for $\beta_\tau = 1$ (except in the WTA groomed case). In the ungroomed case, $\beta_\tau = 1$ also leads to better signal discrimination. However, once SD-grooming is included, the signal discrimination is best for the $\beta_\tau = 2$ case. If one is concerned about the slightly larger non-perturbative effects for the SD-groomed $\beta_\tau = 2$ case, then one can slightly increase the $\tau_{\text{cut}}$ choice: in Fig. 11(right), where $\tau_{\text{cut}}$ was chosen so as to obtain a higher signal efficiency of $\varepsilon_S = 0.5$, the $\tau_{21,\text{groomed}}^{\text{dichroic}}$ ($\beta_\tau = 2$) performance is very similar to the $\tau_{21,\text{groomed}}^{\text{dichroic}}$ ($\beta_\tau = 1$, $\varepsilon_S = 0.4$) performance in Fig. 13(right). Therefore, it is the SD-groomed, $\beta_\tau = 2$, dichroic ratio that appears to give the best overall performance.

Figure 13: Signal significance and non-perturbative effects for background, for jet $p_t$ cuts ranging from 500 GeV to 3 TeV in steps of 500 GeV, as in Fig. 11(right). The 3 TeV point is always labelled with a larger symbol. The plots compare $\tau_{21,\text{groomed}}^{\text{dichroic}}$ ($\beta_\tau = 2$) with a range of other tools, including $Y_m$-splitter (left) and $\beta_\tau = 1$ dichroic subjettiness ratios (right). Where the $\beta_\tau$ value is not explicitly labelled, it is equal to 2. Note that the default signal-efficiency working point for the dichroic subjettiness ratios is 0.4 here rather than the 0.5 chosen in Fig. 11. The signal efficiencies for other cases are given in Table 1.

Table 1: Signal efficiencies for the various tools shown in Fig. 13.
There are a number of other variables that one might also consider, notably energy-correlation functions (ECFs) [18][19][20]. In particular we expect that dichroic ratios may also be of use for the most recent set of ECFs discussed in Ref. [20], a number of which are designed to have similarities to $N$-subjettiness. Their study is, however, beyond the scope of this work.

Brief analytic calculations
In this Section, we consider brief analytic calculations relating to the observables we have presented so far. Our main goal here is to illustrate that the discussion from Section 3, where we used Lund diagrams to motivate dichroic subjettiness ratios, does indeed capture the qualitative picture observed in Monte-Carlo simulations. To that aim, it is sufficient to use leading-logarithmic accuracy, where we control double logarithms, i.e. terms $\alpha_s^n \ln^j\rho\,\ln^k\tau_{\text{cut}}\,\ln^\ell z_{\text{cut}}\,\ln^m\zeta_{\text{cut}}$ with $j + k + \ell + m = 2n$, assuming $\rho, \tau_{\text{cut}}, z_{\text{cut}}, \zeta_{\text{cut}} \ll 1$. Note that, recently, several jet substructure methods have been understood at higher accuracy, see e.g. [17,42], and we intend to provide a more precise calculation in future work [38].
In practice, we will express everything in terms of the fundamental building block $T_\alpha$ of Eq. (5.1) (cf. Fig. 14(left)), where angles are normalised to the jet radius $R$ and we use the 1-loop running-coupling prescription, $\alpha_s(z\theta p_t R) = \alpha_s/(1 + 2\alpha_s\beta_0 \log z\theta)$, with $\alpha_s \equiv \alpha_s(p_t R)$ and $\beta_0 = (11 C_A - 4 n_f T_R)/(12\pi)$. Explicit expressions for $T_\alpha$ are given in Appendix B and are mostly taken from Ref. [12]. Note that $T_\alpha(\rho, z_{\text{cut}}; C_R) = 0$ for $z_{\text{cut}} < \rho$.

Figure 14: Lund diagrams associated with various analytic calculations. Left: the basic building block $T_\alpha$, Eq. (5.1), used to write all Sudakov exponents. Centre: representation of the full jet Sudakov $R_{\text{full}}(\rho, \tau_{\text{cut}}, z)$, Eq. (5.3a), including secondary emissions. Right: representation of the full jet Sudakov $R_{\text{SD}}(\rho, \tau_{\text{cut}}, z)$, Eq. (5.3c), including secondary emissions. For both the centre and right plots, the dot indicated by $z$ corresponds to the emission dominating the jet mass, and we will integrate over allowed values of its momentum fraction $z$.
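The 1-loop running-coupling prescription quoted above is simple to evaluate numerically. The following sketch assumes QCD colour factors $C_A = 3$, $T_R = 1/2$ and, as an illustrative choice not taken from the text, $n_f = 5$ active flavours; the function names are hypothetical and the logarithm is the natural one.

```python
import math

CA = 3.0   # SU(3) adjoint Casimir
TR = 0.5   # standard normalisation of the fundamental generators


def beta0(nf=5):
    """One-loop beta-function coefficient, (11 CA - 4 nf TR) / (12 pi)."""
    return (11.0 * CA - 4.0 * nf * TR) / (12.0 * math.pi)


def alpha_s_emission(z, theta, alpha_s_ref, nf=5):
    """One-loop coupling at the scale z * theta * pt * R.

    alpha_s_ref is alpha_s(pt R); theta is normalised to the jet radius R,
    so z * theta = 1 returns alpha_s_ref itself.
    """
    denom = 1.0 + 2.0 * alpha_s_ref * beta0(nf) * math.log(z * theta)
    if denom <= 0.0:
        raise ValueError("scale at or below the one-loop Landau pole")
    return alpha_s_ref / denom
```

Since $\log z\theta < 0$ for soft or collinear emissions, the coupling grows as $z\theta$ decreases, and some regularisation at low scales is needed before the denominator vanishes.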
In the above expressions, $z$ corresponds to the momentum fraction of the emission dominating the jet mass (emission "a" in Figs. 4 and 5). Compared to the simple R factor that we had in Section 2.2, we keep the $z$ integration explicit since the secondary emissions, the $C_A$ terms, depend explicitly on $z$. In all cases, the integration over $z$ runs over the region kinematically allowed by the tagger defining the jet mass. The Sudakov exponent in these expressions is then essentially given by the jet on which we compute $\tau_2$. While we only target leading-logarithmic accuracy, our results also include the single-logarithmic contributions coming from hard collinear splittings, which are often phenomenologically important. They appear as the $b_i$ factors in the expressions above. Finally, as expected, if one takes the limit $\beta \to \infty$ of the SD results, one recovers the full results. Also, the limit $\beta \to 0$ of (5.2c) reduces to (5.2b).
So far, we have not yet discussed the case where $\rho$ is computed from the mMDT-tagged jet and $\tau_{21}$ from the full jet. This is more involved due to the two separate kinematic configurations involved (see Fig. 4(b-c)). In the end, for $\rho_{\text{mMDT}}$ and $\tau_{21}^{\text{full}}$ (assuming $\rho < z_{\text{cut}}$), we find Eq. (5.5), and a similar expression with "full" replaced by "SD" for the case where $\tau_{21}$ is calculated on the SD jet. In that expression, we have used $\rho_c = z_c\theta_c^2$ and $R_{\text{out,full}}(\rho_c, \tau, z_c) = T_0(\rho_c\tau, z_{\text{cut}}; \ldots)$.

The configurations contributing to the last two lines of Eq. (5.5) come from jets with at least one emission in region C (discarded by mMDT) with $\rho_c \equiv z_c\theta_c^2 > \rho/\tau_{\text{cut}}$. They result in an extra contribution to the mass distribution, which would then be larger than what we obtain with our dichroic combination (Eq. (5.2d) or, equivalently, the first line of Eq. (5.5)). When using the dichroic combination, these configurations would all have $\tau_{21} \ge 1$ (up to $\tau_{21} = z_{\text{cut}}/\rho$). In particular, for a cut $\tau_{21} < \tau_{\text{cut}}$ with $\tau_{\text{cut}} > 1$, the dichroic combination leads to Eq. (5.7), with
$$R_{\text{out,full}}(\rho\tau) = T_0(\rho\tau, z_{\text{cut}}; C_R), \qquad (5.8\text{a})$$
$$R_{\text{out,SD}}(\rho\tau) = R_{\text{out,full}}(\rho\tau) - T_\beta(\rho\tau, \zeta_{\text{cut}}; C_R). \qquad (5.8\text{b})$$
This result splits into two contributions, corresponding to the two terms in the round bracket on the second line of (5.7): the first term comes from configurations where there is no emission in region C with $z\theta^2 > \rho\tau_{\text{cut}}$, and it corresponds to values of $\tau_{21}^{\text{dichroic}} < 1$ (this is manifest, because in Eq. (5.7), given for $\tau_{\text{cut}} > 1$, it has no dependence on $\tau_{\text{cut}}$). For the second contribution, the part corresponding to values of $\tau_{21}^{\text{dichroic}} \ge 1$, there is an emission "c" with $z_c\theta_c^2 > \rho\tau_{\text{cut}}$. To guarantee $\tau_{21} < \tau_{\text{cut}}$, we then need to veto emissions (both primary and secondary) with $z\theta^2 > \rho\tau_{\text{cut}}$.
Note that this second contribution itself includes two sub-contributions: the case where emission "c" is the only emission in region C with $z\theta^2 > \rho$, yielding a contribution to the $\tau_{21}$ distribution proportional to $\delta(\tau_{21} - 1)$ (recall that $\tau^{\text{full}}$ […]), and the case with further such emissions, yielding a continuum with $\tau_{21} > 1$ in the $\tau_{21}$ distribution (see Fig. 7 as well as the right plot of Fig. 15 below). One can calculate the $\delta(\tau_{21} - 1)$ contribution to the $\tau_{21}$ distribution by taking the difference between (5.7) and (5.2d) for $\tau_{\text{cut}} \to 1$, which gives Eq. (5.9), with $R_{\text{full}}(\rho)$ the full jet mass Sudakov, Eq. (5.4a); Eq. (5.9) is equal to the $\tau_{\text{cut}} \to 1$ limit of the second term in round brackets in Eq. (5.7).
In practice the $\delta$-function contribution gets smeared out to values of $\tau_{21} > 1$ through the effect of multiple emissions. Note that it is relatively straightforward to check that the limit $\tau_{\text{cut}} \to 1$ in Eq. (5.5) and the limit $\tau_{\text{cut}} \to z_{\text{cut}}/\rho$ in Eq. (5.7) both tend to the mMDT jet mass distribution.
From the equations above, the τ 21 distribution, for a given jet mass, can be obtained by taking the derivative with respect to τ cut and normalising by the jet mass distribution without any cut on τ 21 . Background efficiencies can also be obtained straightforwardly by integrating any of the above mass distributions over the allowed mass window.
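The step from a mass distribution with a cut $\tau_{21} < \tau_{\text{cut}}$ to a $\tau_{21}$ distribution can be sketched numerically as follows. This assumes that the analytic result has been tabulated on a grid of $\tau_{\text{cut}}$ values at fixed jet mass; the function name and argument names are hypothetical, and a simple finite difference stands in for the derivative.

```python
def tau21_distribution(tau_cuts, sigma_with_cut, sigma_no_cut):
    """Finite-difference estimate of (1/sigma) d sigma / d tau21 at fixed jet mass.

    tau_cuts:       increasing grid of tau_cut values
    sigma_with_cut: mass distribution with tau21 < tau_cut, one entry per grid point
    sigma_no_cut:   mass distribution without any tau21 cut (the normalisation)

    Returns the bin midpoints and the normalised density in each bin.
    """
    midpoints, density = [], []
    for i in range(len(tau_cuts) - 1):
        width = tau_cuts[i + 1] - tau_cuts[i]
        midpoints.append(0.5 * (tau_cuts[i] + tau_cuts[i + 1]))
        density.append(
            (sigma_with_cut[i + 1] - sigma_with_cut[i]) / (width * sigma_no_cut)
        )
    return midpoints, density
```

If the grid extends up to the largest allowed value of $\tau_{21}$, the density multiplied by the bin widths sums to one, which provides a simple consistency check on the tabulated input.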
For signal jets, we assume that if the jet mass is not within some reasonable window around the boson mass, then the jet is discarded. We then find the signal efficiency $\varepsilon_S$ of Eq. (5.10), with $z_{\min} = \rho$, $z_{\text{SD}}(\rho)$ or $z_{\text{cut}}$ depending on whether the mass is computed on the full jet, the SD-groomed jet or the mMDT-tagged jet, respectively. The $\tau_{21}$ distribution for a given jet mass can be obtained by taking the derivative of $\varepsilon_S$ with respect to $\tau_{\text{cut}}$ (and normalising appropriately). In Eq. (5.10) the Sudakov exponent is given by an expression valid for small $\tau$. Here we target double-logarithmic accuracy, $\alpha_s^n \ln^{2n}\tau$, though we also include a set of finite-$z$ and hard-splitting corrections that were found to be numerically important in Ref. [12] (cf. Eq. (A.24)). These represent only a subset of next-to-leading logarithmic terms. Note that for $z \ll 1$ ($1 - z \ll 1$) the term on the fourth (second) line of that expression is zero because of the last of the $\Theta$-functions in Eq. (5.1), while the term on the third (first) line corresponds to the leaf in Fig. 3(right). For simplicity, in our numerical results we will use $p_{\text{sig}}(z) = 1$ in Eq. (5.10). Eq. (5.10) also includes a factor $f_{\text{ISR}}$ that accounts for the effect of initial-state radiation (ISR). Such effects are present both for signal and background jets and are generically single-logarithmic. As such they are subleading compared to the double logarithms that we resum.
Nevertheless, if we consider signal jets and examine the limit of large $p_t$ with $M$, $\tau_{\text{cut}}$, etc. all fixed, then because of the absence of double logarithms of $\rho$, single-logarithmic ISR effects $(\alpha_s \ln\rho)^n$ can be numerically dominant [24]. Physically, they are associated with the requirement that ISR should not substantially modify the mass of the signal jet. The correction involves $(\alpha_s \ln\rho)^n$ terms only when the mass is determined on the full jet; in the corresponding expression for $f_{\text{ISR}}$, a non-global contribution (formally of the same logarithmic order) is ignored for simplicity. In that expression, $\delta M$ is the size of the mass window in which signal jets are accepted, and a full treatment of all single-logarithmic corrections would also need to account for logarithms of $\delta M/M$. A more complete treatment of $f_{\text{ISR}}$ would be relevant for precise phenomenological applications. The finite $\mathcal{O}(\alpha_s)$ component associated with high-$p_t$ emissions could be obtained e.g. using POWHEG [44], aMC@NLO [45] or at NNLO using MATRIX [46] or MCFM [47].
We can now compare our analytic predictions with the Monte-Carlo results from the previous Section. We use $\alpha_s(M_Z) = 0.1383$, as in the Pythia 8 simulations presented in the previous Section, and freeze the coupling for scales below $\mu_{\text{fr}} = \tilde{\mu}_{\text{fr}}\, p_t R$, which we set to 1 GeV. We start with the QCD mass distributions, shown on the left plot of Fig. 15, to be compared to the Monte-Carlo results presented in Fig. 8. Globally, we see that our analytic calculation correctly captures the main patterns discussed earlier. We note however that the analytic distributions, especially those involving the full jet mass, are less peaked than the Monte-Carlo ones. This is likely due to subleading logarithmic corrections, like multiple-emission corrections, which would effectively increase the Sudakov exponent.

Figure 15: Same as Figs. 8 and 7, now obtained from our analytic calculation instead of Monte-Carlo simulations. In the right-hand plot, for clarity, the $\delta$-function that appears at $\tau_{21,\text{groomed}}^{\text{dichroic}} = 1$ (dijets) has been represented with finite width and scaled down by a factor of 5.
The $\tau_{21}$ distributions for both QCD jets and signal ($W$) jets are shown in the right plot of Fig. 15, to be compared with Fig. 7. The ordering between the different curves is well captured by our analytic expressions. Differences related to the simplicity of our leading-logarithmic approximation are larger than what was seen for the mass distribution. First, our analytic calculations are non-zero when $\tau_{21} \to 1$. This region is however not under control within our strongly-ordered approximation. Similarly, the kink observed for $\tau_{21} \sim 0.5$ is not physical. It comes from the onset of the secondary-emission contribution which starts, in our formulas, at $\tau_{21} = b_g$. The analytic calculation for our dichroic combination is given by the black curves in the right plot of Fig. 15. The dijet case clearly has a contribution proportional to $\delta(\tau_{21} - 1)$ (cf. Eq. (5.9)) (scaled down by a factor of 5 for clarity), which is not observed in the Monte-Carlo results. In practice, additional emissions at smaller $z\theta^2$ would also contribute to $\tau_{21}$, and they would transform the $\delta(\tau_{21} - 1)$ contribution into a Sudakov peak at $\tau_{21} \gtrsim 1$, which is visible in the Monte-Carlo simulations. We are currently working on a better analytic calculation, lifting the assumption that emissions are strongly ordered in $z\theta^2$ [38].

Finally, let us turn to the ROC curves, plotted in Fig. 16. We again see that they reproduce the main qualitative features observed in Section 4. There are however quantitative differences between our analytic results and the Monte-Carlo simulations. For example, our calculation over-estimates the signal efficiencies. A more quantitative description would require a more precise analytic treatment including subleading corrections, beyond the strong-ordering approximation, and fixed-order corrections for signal efficiencies.

Conclusion
In this paper we have examined the interplay between boosted-object tagging algorithms, mMDT or SoftDrop, and radiation constraints, notably as imposed through $N$-subjettiness cuts. The analysis points to a new $N$-subjettiness ratio,
$$\tau_{21}^{\text{dichroic}} = \frac{\tau_2^{\text{full}}}{\tau_1^{\text{tagged}}},$$
where the numerator is evaluated on the full jet, while the denominator is evaluated on the set of constituents left after the tagging stage. The name "dichroic" comes from the fact that the large-angle colour flow, present in backgrounds but not signals, gets directed exclusively to the numerator and not the denominator. It is this feature that leads to an enhanced significance in distinguishing (colour-singlet) signals from (colour-triplet or octet) backgrounds, notably compared to current widely used $N$-subjettiness ratios.
As well as considering signal-significance, it is important to keep non-perturbative effects under control: a method that is overly reliant on non-perturbative physics for its discrimination power is one for which signal-efficiency and background-rejection estimates may be highly model-dependent, and correspondingly uncertain. It is also likely to be subject to large detector effects. We have found that the combination of τ dichroic 21 with a light grooming step based on SoftDrop (β = 2), τ dichroic 21,groomed = τ SD 2 /τ tagged 1 is effective in maintaining good signal-to-background significance while substantially limiting non-perturbative effects. The overall behaviour of our dichroic τ 21 variable, with grooming, was illustrated in Fig. 7: the τ 21 distribution for signal jets is left largely unmodified by the change to a dichroic variant (black dashed curve versus any of the other dashed curves), whereas the distribution for background jets is shifted to substantially higher values of τ 21 (black solid curve versus any of the other solid curves), increasing the ability to distinguish signal and background. Figures 11 and 13 provide a summary of the signal-significance (vertical axis) and non-perturbative corrections (horizontal axis) for a range of boosted-object identification methods. The points along the lines correspond to different signal-efficiency working points (Fig. 11(left)) or p t cuts (the other plots). One sees that τ dichroic 21,groomed with β τ = 2, in black, provides the best signal significance of any of the methods and that, for a given signal significance, it tends to limit the size of non-perturbative effects relative to other methods.
In addition to the Lund-plane based arguments given in section 3 and the Monte Carlo studies of section 4, we have also outlined the analytic leading-logarithmic structure of different combinations of taggers and τ 21 ratios. As well as bringing insight into the behaviour of different taggers, such calculations provide a basis for the future design of "decorrelated" [10] combinations of taggers and dichroic radiation constraints, providing background rejection that is independent of the tagged jet mass and thus straightforward to use in the context of data-driven background estimates.
A Dichroic subjettiness ratios for $\beta_\tau = 1$
In Section 3, we have argued in favour of the dichroic subjettiness ratios using $N$-subjettiness with $\beta_\tau = 2$. In this appendix, we briefly discuss the case $\beta_\tau = 1$, for which the dichroic variant can also be considered. Note that for $\beta_\tau = 1$, we have defined the $N$-subjettiness axes through an exclusive-$k_t$ declustering. This can be done either using the standard E-scheme four-vector recombination or the winner-takes-all (WTA) recombination scheme. For simplicity, we will focus on E-scheme results here. A brief comparison between the two axis choices is shown in Fig. 13(right). Fig. 17 shows ROC curves similar to those presented in Figs. 9 and 10, this time including results for $\beta_\tau = 1$ as dashed lines.
We can make several observations based on these plots. First, as for $\beta_\tau = 2$, we see that the dichroic ratio also outperforms the other combinations for $\beta_\tau = 1$. The performance gain is however smaller, especially with SD grooming.
In terms of the sensitivity to non-perturbative effects, we see that $N$-subjettiness ratios with $\beta_\tau = 1$ are rather stable even without any SD grooming step. This small sensitivity to non-perturbative effects might have been anticipated since the corresponding $k_t$ cut is less affected by soft-and-large-angle emissions than for $\beta_\tau = 2$. A consequence of this observation is that grooming is less critical when using a cut on $N$-subjettiness ratios with $\beta_\tau = 1$, and without SD grooming the dichroic combination shows a more sizeable performance gain compared to the other approaches, cf. the bottom-left plot of Fig. 17.
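The scaling behind this statement can be made explicit in a simplified picture. As a sketch only, assume a strongly ordered, soft-collinear configuration in which $\tau_1$ is dominated by a single emission with momentum fraction $z_1$ at angle $\theta_1$, and $\tau_2$ by a further, softer emission with momentum fraction $z_2$ at angle $\theta_2$ from the nearest axis (this notation is introduced here for illustration). Any common normalisation then cancels in the ratio and one has, approximately,

\[
\tau_{21}^{(\beta_\tau)} \simeq \frac{z_2\,\theta_2^{\beta_\tau}}{z_1\,\theta_1^{\beta_\tau}}
\quad\Longrightarrow\quad
\tau_{21}^{(1)} \simeq \frac{z_2\,\theta_2}{z_1\,\theta_1},
\qquad
\tau_{21}^{(2)} \simeq \frac{z_2\,\theta_2^{2}}{z_1\,\theta_1^{2}}.
\]

For $\beta_\tau = 1$ a cut on the ratio therefore acts on the relative transverse momentum $z\theta$ of the emission, whereas for $\beta_\tau = 2$ the extra power of $\theta$ gives more weight to emissions at large angles.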

C Example code for dichroic subjettiness ratios
In this last appendix, we briefly indicate how dichroic subjettiness ratios can be implemented using tools available in FastJet and fjcontrib. First, besides the standard FastJet headers needed for jet clustering, one needs to include the following headers:

    #include <fastjet/contrib/ModifiedMassDropTagger.hh> // mMDT tagger
    #include <fastjet/contrib/SoftDrop.hh>               // optional SD grooming
    #include <fastjet/contrib/Nsubjettiness.hh>          // tau1 and tau2

Then, one should declare the basic objects needed for tagging, computing $\tau_1$ and $\tau_2$, and, optionally, grooming.

Note that all parameters here are given as examples and have not been optimised. Also, when used with events contaminated by pileup, a proper pileup mitigation technique should be implemented. This can for example be done by passing a fastjet::Subtractor to the mMDT and SD via the set_subtractor method, and using a GenericSubtractor [48] or a ConstituentSubtractor [32] for the $N$-subjettiness variables. Alternatively one can use methods that carry out event-wide pileup suppression such as PUPPI [34] or SoftKiller [33].
Finally, for a given jet (jet in the example below), one can compute the dichroic subjettiness ratio using
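A minimal sketch of this last step is given below. It is not the authors' listing: the helper name dichroic_tau21 is hypothetical, the function is templated on the jet and $\tau_N$ types so that it compiles without FastJet, and the FastJet usage shown in the comments relies on assumed parameter values (z_cut = 0.1 for mMDT, $\beta = 2$ and z_cut = 0.05 for SoftDrop) and an assumed axis choice (KT_Axes). An unnormalised measure is assumed so that the numerator and denominator, which are evaluated on different sets of constituents, share the same normalisation.

```cpp
#include <limits>

// Dichroic ratio: tau_2 on the full (or lightly groomed) jet divided by
// tau_1 on the tagged jet. Jet, Tau1 and Tau2 are template parameters so
// that this sketch does not depend on FastJet.
//
// Assumed usage with FastJet and fjcontrib (illustrative values only):
//
//   using namespace fastjet::contrib;
//   ModifiedMassDropTagger mMDT(0.1);  // tagger, z_cut = 0.1
//   SoftDrop sd(2.0, 0.05);            // grooming, beta = 2, z_cut = 0.05
//   Nsubjettiness tau1(1, KT_Axes(), UnnormalizedMeasure(2.0));
//   Nsubjettiness tau2(2, KT_Axes(), UnnormalizedMeasure(2.0));
//
//   fastjet::PseudoJet tagged  = mMDT(jet);  // tagged == 0 if tagging fails
//   fastjet::PseudoJet groomed = sd(jet);    // use jet itself for no grooming
//   if (tagged != 0) {
//     double ratio = dichroic_tau21(tagged, groomed, tau1, tau2);
//   }
template <class Jet, class Tau1, class Tau2>
double dichroic_tau21(const Jet& tagged_jet,     // output of the tagger
                      const Jet& numerator_jet,  // full or groomed jet
                      const Tau1& tau1,          // callable: 1-subjettiness
                      const Tau2& tau2) {        // callable: 2-subjettiness
  // Denominator: only the hard two-prong structure kept by the tagger.
  const double denominator = tau1(tagged_jet);
  // Numerator: sensitive to the large-angle radiation outside the tagged jet.
  const double numerator = tau2(numerator_jet);
  // Guard against a vanishing denominator (e.g. a one-particle tagged jet).
  if (!(denominator > 0.0)) return std::numeric_limits<double>::quiet_NaN();
  return numerator / denominator;
}
```

Passing the tagged jet and the numerator jet as two separate arguments keeps the dichroic character explicit: replacing numerator_jet by the tagged jet itself would give back a ratio in which both subjettiness values are computed on the same constituents.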