1 Introduction

The study of fundamental interactions between elementary particles is the primary goal of particle physics. These interactions are usually probed via different types of scattering processes, and dedicated experiments give a unique opportunity to probe them in a controlled way. Today’s advanced colliders are experiments of this type and reveal a great deal of information about the fundamental interactions of nature. These high-energy, high-luminosity colliders provide ample opportunities to study the fractions of longitudinal and transverse polarization of heavy particles emerging either from the Standard Model (SM) or from beyond the SM (BSM) scenarios. For example, models like supersymmetry [1] or composite Higgs models [2] allow the production of a heavy resonance, which subsequently decays to a pair of essentially longitudinally polarized W or Z bosons [3, 4]. In the SM, the study of the polarization fractions of the heavy bosons is particularly compelling since it reveals the true nature of the electroweak symmetry breaking. WW production via vector boson fusion (VBF) [5, 6] tends to give longitudinal W bosons at high energies. This is because the Goldstone nature of the W dominates at high energies. On the other hand, measuring the fraction of longitudinally polarized W bosons in a process might reveal hints of new physics. The polarization study of the W boson is, therefore, an important check for SM or BSM scenarios.

Fig. 1

Sketch of the two-body decay of W in (left) its rest frame and (right) the lab frame

The simplest way of examining the polarization of the W is via its decay products. The study is usually done using the standard variable \(\cos \theta _*\), where \(\theta _*\) is the angle between the propagation direction of the W in the lab frame and one of the decay products in the rest frame of the W. Since the leptonic channels are the cleanest channels at a collider, most phenomenological studies of W polarization have been carried out in the leptonic channel. Experimental collaborations at the LHC have already measured the polarization fractions of the SM W boson: the CMS collaboration in leptonic W+jet events [7] and the ATLAS collaboration in semi-leptonic \(t\bar{t}\) events [8]. Although the leptonic decay modes are clean, they produce an invisible neutrino, which makes the study difficult. On the other hand, in the hadronic decay modes of the W, both jets can be observed and the polarization study does not suffer from the ambiguity of the missing energy. However, signals in the hadronic modes are always tricky to separate from the huge QCD background, especially at a hadron collider. In addition, effects like pile-up (PU) and underlying event (UE) add another level of difficulty to studies in the hadronic modes. Nevertheless, a better understanding of such effects and recent advances in mitigating them allow us to study polarization in the hadronic channels as well.

Interesting developments in this direction make use of machine learning or jet substructure-based analysis. A boosted, hadronically decaying W boson can be reconstructed as a fat jet via jet clustering algorithms using a large radius parameter. Due to the resonance decay of such an on-shell W boson, the W fat jet tends to have two prongs inside. These two prongs can be detected as two subjets using jet substructure techniques. A new variable \(p_\theta \), the ratio between the difference in energies of the two subjets in a W fat jet and the magnitude of the three-momentum of the W fat jet, has been proposed in Ref. [9]. There, the authors showed that this variable is a proxy for the variable \(\cos \theta _*\), i.e. \(p_\theta =|\cos \theta _*|\) at the parton level. The advantage of the proxy variable \(p_\theta \) over the standard variable \(\cos \theta _*\) is that the former can be calculated in the lab frame without going back to the rest frame of the W boson. The reconstruction of the variable depends crucially on how accurately the two subjets are identified. In Ref. [9], N-subjettiness was used to find the two subjet axes after grooming with the Mass Drop tagger [10]. In this paper, we show that the polarization study using N-subjettiness [11] can be improved if we do not use any grooming method, especially in the region of \(p_\theta \rightarrow 1\). We also use the Soft Drop tagger [12] to find the subjets, which likewise yields good results.

This article is organized as follows. We briefly discuss the polarization states of W boson in Sect. 2. Jet substructure and study of polarization using jet substructure are discussed in Sect. 3. The template models and calculation of variables are described in Sect. 4. Section 5 discusses the main result of our study, and finally, we summarize our work in Sect. 6.

2 W Boson polarization

A massive particle with spin j has a total of \(2j+1\) polarization (helicity) states. Distinguishing among these polarization states is a difficult task in itself. On the other hand, the study of polarization states tells us about the short-distance interaction a particle has gone through. For example, a polarization study can reveal whether an interaction is parity conserving or violating, or the underlying structure of the interaction the particle has gone through. The same can be true for charge conjugation or time-reversal symmetry. This work focuses on the polarization states of a spin-one particle, namely the W boson. It has three polarization states, comprising one longitudinal and two transverse polarization states. Longitudinal and transverse polarization states are identified with the eigenstates of \({\hat{p}}\cdot \vec J\), where \({\hat{p}}\) is the unit vector along the three-momentum and \(\vec J\) is the angular momentum vector. Longitudinal polarization states are those which have eigenvalue 0, while the transverse states are those which have eigenvalues \(\pm 1\). The angular distribution of the decay products of the W boson depends on its polarization state.

One of the most popular ways to study the polarization of an unstable particle is via the angular distributions of its decay products. For a massive W boson, which has two different types of polarization states, viz. longitudinal and transverse, the polarization can be determined using its two-body decay products. If a W decays to two massless particles q and \(q'\) in the lab frame, then one can boost back to its rest frame, with the z-axis taken along the propagation direction of the W boson in the lab frame. In the rest frame, one can then measure the angle \(\theta _*\) between the z-axis and one of the decay products. This is depicted in Fig. 1: the decay of the W in its rest frame is shown in the left panel, and the same decay in the lab frame is shown in the right panel. When integrated over the azimuthal angle in the rest frame of the W, the angular distribution of one of the decay products in the rest frame of the W can be expressed as

$$\begin{aligned} \frac{1}{\sigma }\frac{d\sigma }{d\cos \theta _*}&= f_0\frac{3}{4}\sin ^2\theta _* + f_- \frac{3}{8}\left( 1+\cos \theta _*\right) ^2 \nonumber \\&\quad + f_+\frac{3}{8}\left( 1-\cos \theta _*\right) ^2 \end{aligned}$$
(1)
$$\begin{aligned}&= f_0\frac{3}{4} (1-\cos ^2\theta _*) + f_T \frac{3}{8}\left( 1+\cos ^2\theta _*\right) \nonumber \\&\quad + f_D\frac{3}{4}\cos \theta _*, \end{aligned}$$
(2)

where \(f_{0,\pm }\) are the fractions of the different polarization states present in the W sample, \(f_T=f_++f_-\) is the total transverse polarization fraction, and \(f_D=f_--f_+\). We should note that, in practical cases, the helicity states of the W interfere with each other [13,14,15] to give rise to the final distribution. Integration over the full range of the decay azimuthal angle eliminates the interference terms, although the application of cuts (like a maximum \(\eta \) cut on hadrons or jets) reinstates some of the interference terms between the different polarization states of the W boson [16,17,18].
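As a quick numerical cross-check of Eqs. (1) and (2), the following minimal Python sketch evaluates both forms of the distribution and verifies that they agree and integrate to unity when \(f_0+f_-+f_+=1\). The function names and the midpoint integration are our own illustrative choices and are not part of the analysis chain described in this paper.

```python
def w_decay_angular_pdf(cos_theta, f0, f_minus, f_plus):
    """Eq. (1): normalized dsigma/dcos(theta*) for given polarization fractions."""
    longitudinal = f0 * 0.75 * (1.0 - cos_theta ** 2)   # (3/4) sin^2(theta*)
    minus_term = f_minus * 0.375 * (1.0 + cos_theta) ** 2
    plus_term = f_plus * 0.375 * (1.0 - cos_theta) ** 2
    return longitudinal + minus_term + plus_term


def w_decay_angular_pdf_eq2(cos_theta, f0, f_minus, f_plus):
    """Eq. (2): the same distribution written with f_T and f_D."""
    f_t = f_plus + f_minus
    f_d = f_minus - f_plus
    return (f0 * 0.75 * (1.0 - cos_theta ** 2)
            + f_t * 0.375 * (1.0 + cos_theta ** 2)
            + f_d * 0.75 * cos_theta)


def integrate(func, lower=-1.0, upper=1.0, steps=2000):
    """Simple midpoint-rule integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * width) for k in range(steps)) * width
```

For example, `integrate(lambda c: w_decay_angular_pdf(c, 0.5, 0.3, 0.2))` should return a value very close to 1.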

By limiting ourselves to a measurement of \(|\cos \theta _*|\) and \(f_-=f_+\), the anticipated distribution is given by

$$\begin{aligned} \frac{1}{\sigma }\frac{d\sigma }{d|\cos \theta _*|}&= f_0\frac{3}{2} (1-|\cos \theta _*|^2) + f_T \frac{3}{4}\left( 1+|\cos \theta _*|^2\right) . \end{aligned}$$
(3)
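As a short check of Eq. (3), one can fold Eq. (2) about \(\cos \theta _*=0\), i.e. add the contributions at \(+|\cos \theta _*|\) and \(-|\cos \theta _*|\). The term linear in \(\cos \theta _*\) (proportional to \(f_D\)) is odd and cancels in the sum, while the even terms double:

$$\begin{aligned} \frac{1}{\sigma }\frac{d\sigma }{d|\cos \theta _*|}&= \left. \frac{1}{\sigma }\frac{d\sigma }{d\cos \theta _*}\right| _{+|\cos \theta _*|} + \left. \frac{1}{\sigma }\frac{d\sigma }{d\cos \theta _*}\right| _{-|\cos \theta _*|} \\&= 2\left[ f_0\frac{3}{4} (1-|\cos \theta _*|^2) + f_T \frac{3}{8}\left( 1+|\cos \theta _*|^2\right) \right] . \end{aligned}$$

Integrating over \(0\le |\cos \theta _*|\le 1\) gives \(f_0+f_T\), so the folded distribution is normalized to unity whenever \(f_0+f_T=1\).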

The variable \(|\cos \theta _*|\) is defined in the rest frame of W while the W is produced in the lab frame, which, in general, is not the rest frame of W. So, a variable that mimics the variable \(|\cos \theta _*|\) but calculated in the lab frame is useful. In Ref.  [9], one such variable has been suggested:

$$\begin{aligned} p_\theta = \frac{|\Delta E|}{|\vec p_W|}, \end{aligned}$$
(4)

where \(\Delta E\) is the difference in energy of the two decay products and \(\vec p_W\) is the three-momentum of the W in the lab frame. In addition, we have also used another variable, \(z_j\) (the momentum balance of the decay products), in our polarization study of the W. The variable is defined as

$$\begin{aligned} z_j = \frac{p_T^\text {leading}}{p^W_T}. \end{aligned}$$
(5)

This variable \(z_j\) is the ratio of the transverse momentum \(p_T\) of the harder decay product to that of the W boson [19]. Both transverse momenta are measured in the lab frame. Since we study boosted W jets, variables calculated in the lab frame are much more useful, as they can be expressed in terms of the subjets inside the fat jet. In the subsequent discussion, the variables calculated at the MG5 parton level will be referred to as \(\cos \theta _*\) and \(z_{j_*}\). The same variables calculated from the subjets of the reconstructed boosted jets will be denoted by \(p_\theta \) and \(z_{j}\), respectively.
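The statement that \(p_\theta \) of Eq. (4) reproduces \(|\cos \theta _*|\) at the parton level can be illustrated with a small, self-contained Python sketch. It assumes massless decay products, a W mass of 80.4 GeV, and a W moving along the x-axis (so that its transverse momentum equals \(|\vec p_W|\)); the function names and the default azimuthal angle are hypothetical choices made only for illustration.

```python
import math

W_MASS_GEV = 80.4  # assumed W mass; not a value quoted in the text


def boost_along_x(four_vector, beta):
    """Boost (E, px, py, pz) along +x with velocity beta (units with c = 1)."""
    energy, px, py, pz = four_vector
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (energy + beta * px), gamma * (px + beta * energy), py, pz)


def lab_frame_variables(cos_theta_star, w_momentum, phi_star=0.7):
    """Return (p_theta, z_j) for W -> q q' with massless decay products.

    theta* is measured from the W flight direction (+x) in the W rest frame.
    """
    sin_theta_star = math.sqrt(1.0 - cos_theta_star ** 2)
    e_star = 0.5 * W_MASS_GEV  # each product carries m_W / 2 in the rest frame
    q1 = (e_star,
          e_star * cos_theta_star,
          e_star * sin_theta_star * math.cos(phi_star),
          e_star * sin_theta_star * math.sin(phi_star))
    q2 = (e_star, -q1[1], -q1[2], -q1[3])  # back-to-back partner

    beta = w_momentum / math.hypot(w_momentum, W_MASS_GEV)
    lab1, lab2 = boost_along_x(q1, beta), boost_along_x(q2, beta)

    p_theta = abs(lab1[0] - lab2[0]) / w_momentum            # Eq. (4)
    leading_pt = max(math.hypot(lab1[1], lab1[2]),
                     math.hypot(lab2[1], lab2[2]))
    z_j = leading_pt / w_momentum                            # Eq. (5), pT_W = |p_W| here
    return p_theta, z_j
```

Calling `lab_frame_variables(0.3, 500.0)` returns a `p_theta` equal to 0.3 up to floating-point precision, independently of the W momentum.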

3 Jet substructure based polarization study

As already discussed, the theoretical distributions of \(\cos \theta _*\) for longitudinally and transversely polarized W bosons are proportional to \((1-|\cos \theta _*|^2)\) and \((1+|\cos \theta _*|^2)\), respectively. Most of the existing polarization studies of W or Z bosons are done with leptonic final states. However, in the case of a W boson, the leptonic final state contains a neutrino, which interacts only weakly and leaves no trace in the detector. This makes the polarization study in leptonic decay modes difficult. On the other hand, the hadronic decay modes of the W produce two jets in the detector. This, in principle, should make the polarization study of the W easier. In practice, however, the need to reconstruct the jets, and from them the W boson, effectively makes the study at a collider harder. Moreover, eliminating the huge QCD background, especially at a hadron collider, adds another level of difficulty. Recent developments in the study of jets and their substructure ease some of the work of finding jets, or subjets inside a fat jet. Although we do not report a signal-background type of analysis in this article, we show that the \(p_\theta =|\cos \theta _*|\) distribution can be reproduced with good accuracy for longitudinally and transversely polarized W bosons using the hadronic decay channel, when the W is boosted and gives rise to a fat jet.

If the momentum of the decaying W boson is high enough, its decay products tend to be collinear. In the case of hadronic decay modes, these decay products will again shower and form collimated objects including all the collinear decay products. Now, a jet clustering algorithm may not be able to distinguish between these highly collinear decay products and will cluster these collimated final states into a single jet. These jets are popularly known as fat jets or boosted jets. These boosted jets are of high interest in the study of boosted topologies. In our study, we, too, have considered boosted W jets. Once the W fat jet is found, the job remains to effectively find subjets inside the fat jet. In this study, we have used two different jet substructure-based methods to find the two subjets inside the boosted W jets. These two methods are described in the next two subsections.

3.1 Finding subjets with N-subjettiness

In the polarization study of W boson, as described above, the \(p_\theta \) variable relies on the effectiveness of the construction of the subjets inside the boosted W jet. The jet substructure observable N-subjettiness  [11], which tries to find out N lobes inside a fat jet, can be used to identify the subjets inside a boosted jet. In particular, in the case of a fat W jet, we are interested in \(N=2\), i.e. 2-subjettiness. For N number of candidate axes \(a_1, a_2, \ldots a_N\), the N-subjettiness method first calculates

$$\begin{aligned} {\tilde{\tau }}_N = \sum _{i\in J} p_{T_i}\min \left\{ \Delta R(i,a_1)^\beta ,\Delta R(i,a_2)^\beta ,\ldots ,\Delta R(i,a_N)^\beta \right\} , \nonumber \\ \end{aligned}$$
(6)

where J represents the full jet and

$$\begin{aligned} \Delta R(i,a) \equiv \sqrt{\left( \eta _i-\eta _a\right) ^2 + \left( \phi _i - \phi _a\right) ^2}, \end{aligned}$$
(7)

with \(\eta \) and \(\phi \) being the pseudorapidity and azimuthal angle, respectively. The weighting exponent \(\beta \) has been kept in the distance measure to keep it more general [20]. This form of \({\tilde{\tau }}_N\) with the exponentiated distance measure is analogous to the generalized angularities, a class of jet substructure observables [21,22,23,24,25]. Other forms of distance measures are suggested in Ref. [26]. In order to calculate the final N-subjettiness variable \(\tau _N\), a minimization of \({\tilde{\tau }}_N\) is then performed by varying the directions of the axes, i.e.

$$\begin{aligned} \tau _N = \min _{\{a_1, a_2,\ldots ,a_N\}} {\tilde{\tau }}_N. \end{aligned}$$
(8)

The minimization procedure follows a variation of Lloyd’s algorithm [27], a k-means clustering algorithm developed to find k clusters in a statistical distribution. N-subjettiness proceeds in four basic steps, each of which admits many different choices:

  1. Initialization: This step focuses on choosing the seed axes for the further minimization of \({\tilde{\tau }}_N\). Many different choices for the seed axes are available in Refs. [20, 26]. For this study, we have taken the seed axes to be the axes of the subjets obtained from generalized exclusive \(k_t\) clustering of the constituents of the fat jet.

  2. Assignment: Once N candidate axes are chosen, a subcluster is assigned to each of the axes. Then the constituent particles inside the jet are associated with the nearest subcluster. The nearness is defined by the distance measure given in Eq. (7). These subclusters, with the collection of the particles after the final iteration, are defined to be the N subjets of the original jet.

  3. Update: This step updates the old axes to new ones. The condition for the iteration is set by the minimization condition of \({\tilde{\tau }}_N\). In this work, we have followed the iteration procedure for general values of \(1\le \beta <3\) discussed in Ref. [20]. One may follow Ref. [26] for other kinds of distance measures and their iteration procedures.

  4. Iteration: In this step, the last two steps, namely the Assignment and Update steps, are repeated until a desired accuracy is reached. The measure of accuracy is usually defined as the average shift of the axes in the \(\eta \)-\(\phi \) plane. For the \((n+1)^\text {th}\) iteration, the accuracy is defined as

    $$\begin{aligned} \Delta ^{(n+1)} \equiv \frac{1}{N}\sum _{A = 1}^{N} \Delta R(\hat{a}_A^{(n)},\hat{a}_A^{(n+1)}). \end{aligned}$$
    (9)
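The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: it uses \(\beta =2\), for which the Update step reduces to the \(p_T\)-weighted centroid of each subcluster (ordinary k-means), it assumes the jet lies away from the \(\phi =\pm \pi \) boundary when averaging, and it takes the seed axes as input. It is neither the \(\beta =1\) update of Ref. [20] nor the FastJet Contrib implementation, and all function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Eq. (7), with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def tau_tilde(constituents, axes, beta=2.0):
    """Eq. (6) for fixed axes; constituents are (pT, eta, phi) tuples."""
    return sum(pt * min(delta_r(eta, phi, a_eta, a_phi) ** beta
                        for a_eta, a_phi in axes)
               for pt, eta, phi in constituents)


def minimize_axes(constituents, seed_axes, tolerance=1e-6, max_iterations=100):
    """Lloyd-like iteration for beta = 2 (illustrative simplification)."""
    axes = list(seed_axes)                      # Initialization (seeds supplied)
    groups = [[] for _ in axes]
    for _ in range(max_iterations):
        # Assignment: attach every constituent to its nearest axis.
        groups = [[] for _ in axes]
        for pt, eta, phi in constituents:
            nearest = min(range(len(axes)),
                          key=lambda k: delta_r(eta, phi, axes[k][0], axes[k][1]))
            groups[nearest].append((pt, eta, phi))
        # Update: for beta = 2 the minimum is the pT-weighted centroid.
        new_axes = []
        for axis, group in zip(axes, groups):
            total_pt = sum(pt for pt, _, _ in group)
            if total_pt == 0.0:
                new_axes.append(axis)           # empty subcluster: keep old axis
                continue
            new_axes.append((sum(pt * eta for pt, eta, _ in group) / total_pt,
                             sum(pt * phi for pt, _, phi in group) / total_pt))
        # Iteration: stop once the average axis shift of Eq. (9) is small.
        shift = sum(delta_r(old[0], old[1], new[0], new[1])
                    for old, new in zip(axes, new_axes)) / len(axes)
        axes = new_axes
        if shift < tolerance:
            break
    return axes, groups
```

The returned `groups` play the role of the N subjets, and `tau_tilde(constituents, axes)` evaluated at the returned axes approximates \(\tau _N\) of Eq. (8) for this simplified update.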

In this work, we have used the choices for the above steps that are implemented in FastJet Contrib [11]. These choices are briefly described below, and the most effective ones for our study are discussed in the next section.

For our work, the choice of seed axes is based on the exclusive \(k_t\) clustering algorithm  [28], requiring it to return exactly N subjets. For this exclusive algorithm, the generalized measure of \(k_t\) distances is parametrized by an exponent p:

$$\begin{aligned} d_{ij} = \min \left\{ p_{T_i}^{2p}, p_{T_j}^{2p} \right\} \Delta R\left( i,\,j\right) , \qquad d_{iB} = p_{T_i}^{2p}, \end{aligned}$$
(10)

where \(d_{ij}\) is the distance between the \(i\text {th}\) and \(j\text {th}\) constituents and \(d_{iB}\) is called the beam distance for the \(i\text {th}\) constituent. The value \(p=1\) represents the \(k_t\) algorithm and \(p=0\) represents the Cambridge–Aachen algorithm. The limit \(p\rightarrow 0\) tends to allow far-away soft radiation to be clustered at a later stage. This clustering history is useful in the analysis of transversely polarized W jets. We have chosen the winner-takes-all (WTA) recombination scheme [29,30,31] in the exclusive clustering algorithm because of its recoil-free nature. Furthermore, one might choose to start from seed axes different from the ones returned by the exclusive \(k_t\) algorithm. This can be easily achieved by adding random noise to the subjet directions returned by the exclusive \(k_t\) algorithm. However, we have chosen not to use any such random noise. The procedure without random noise is known as one-pass minimization [26]. On the other hand, the exponent \(\beta \) of the distance measure in Eq. (6) is not very important in the polarization study. We have chosen \(\beta =1\), for which the final subjet axes after the minimization point towards the actual radiation direction, like a “median” [20]. For \(\beta =2\), the final axes are usually directed towards the mean direction of the subjet radiation.
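To make the recoil-free property of the WTA scheme concrete, the following minimal Python sketch contrasts it with a \(p_T\)-weighted average of the directions. Pseudojets are represented as hypothetical `(pT, eta, phi)` tuples, and the azimuthal wrap-around is ignored in the weighted average; this is an illustration only, not the FastJet implementation.

```python
def wta_recombine(first, second):
    """Winner-takes-all: summed pT, direction of the harder input."""
    harder = first if first[0] >= second[0] else second
    return (first[0] + second[0], harder[1], harder[2])


def pt_weighted_recombine(first, second):
    """pT-weighted average direction, shown only for contrast with WTA."""
    total_pt = first[0] + second[0]
    eta = (first[0] * first[1] + second[0] * second[1]) / total_pt
    phi = (first[0] * first[2] + second[0] * second[2]) / total_pt
    return (total_pt, eta, phi)
```

Merging a hard particle `(100.0, 0.0, 0.0)` with a soft, wide-angle one `(10.0, 0.5, 0.5)` leaves the WTA axis exactly on the hard particle, whereas the weighted average is pulled towards the soft emission. This displacement is the recoil that the WTA scheme avoids.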

3.2 Finding subjets with soft drop

The Soft Drop grooming/tagging method [12] was proposed to groom away the soft, wide-angle constituents inside a jet, which come predominantly from PU or UE. We also use the Soft Drop algorithm to find the subjets inside the boosted jet. To explain how, we first describe the Soft Drop algorithm, which performs the following three steps.

  1. The algorithm first goes back to the last stage of jet clustering. Let \(j_1\) and \(j_2\) be the two subjets giving rise to the final jet J.

  2. It then checks the condition \(\dfrac{\min \{p_{T_{j_1}}\!, \ p_{T_{j_2}}\}}{p_{T_{j_1}}+p_{T_{j_2}}} > z_\text {cut} \left( \dfrac{\Delta R(j_1,j_2)}{R}\right) ^{\beta _\text {SD}}\).

  3. If the condition in step 2 is satisfied, the algorithm declares J to be the final groomed jet. Otherwise, it discards the softer subjet, promotes the harder one to J, and restarts from step 1.

In this way, we get the final groomed jet. One may consider the two subjets \(j_1\) and \(j_2\) of the final groomed jet to be the two prongs of a two-pronged jet. This is a good approximation since, for a two-pronged jet, the constituents of each prong cluster first and the two prongs then combine to give the final jet. Here \(z_\text {cut}\) and \(\beta _\text {SD}\) are the parameters of the Soft Drop groomer and R is the radius parameter of the clustering algorithm. An important observation is that as \(\beta _\text {SD}\rightarrow \infty \) or \(z_\text {cut}\rightarrow 0\), the algorithm returns the original ungroomed jet as the final Soft Drop jet. In this work, we use Soft Drop as one of the methods to find the subjets of the boosted jets, and we tested a number of different choices of these two parameters. The best choices for our purposes are given in the next section.
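The three steps of the Soft Drop procedure described above can be sketched as follows. This is a minimal illustration, assuming the clustering history is available as a binary tree of hypothetical `PseudoJet` records; it is not the FastJet Contrib implementation, and the class and function names are our own.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PseudoJet:
    """Hypothetical record for a node in the clustering tree."""
    pt: float
    eta: float
    phi: float
    parents: Optional[Tuple["PseudoJet", "PseudoJet"]] = None


def delta_r(a, b):
    """Distance in the eta-phi plane with the azimuth wrapped into [-pi, pi)."""
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.eta - b.eta, dphi)


def soft_drop(jet, z_cut, beta_sd, radius=1.0):
    """Return (groomed_jet, (j1, j2)); the pair is None if no splitting passes."""
    current = jet
    while current.parents is not None:
        j1, j2 = current.parents                       # step 1: undo last clustering
        z = min(j1.pt, j2.pt) / (j1.pt + j2.pt)
        threshold = z_cut * (delta_r(j1, j2) / radius) ** beta_sd
        if z > threshold:                              # step 2: Soft Drop condition
            return current, (j1, j2)                   # step 3: accept this jet
        current = j1 if j1.pt >= j2.pt else j2         # step 3: drop softer branch
    return current, None
```

With \(z_\text {cut}=0\), or with a very large \(\beta _\text {SD}\), the first splitting always passes and the function returns the original jet, in line with the ungroomed limit noted above.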

As already mentioned, the two variables \(p_\theta \) and \(z_j\) have been suggested in the literature [9, 19]; the main objective of this report is to improve on the analysis. As explained above, the MG5 parton-level variable \(p_\theta ^*\) calculated in the lab frame correctly reproduces \(|\cos \theta _*|\), where \(\theta _*\) is the angle between the boost axis of the W and one of its decay products in the W rest frame. In the \(\theta _* \rightarrow \pi /2\) limit, both decay products make almost the same angle with respect to the boost axis of the decaying W and, after the boost to the lab frame, they share almost equal energy inside the fat jet. However, if \(\theta _* \rightarrow 0\) or \(\pi \), one decay product is parallel to the boost axis, while the other is anti-parallel to it. Hence, after the boost, the parallel one becomes highly energetic, and the anti-parallel one becomes soft and wide angle. Because of this, finding the subjets effectively is difficult in the case of \(|\cos \theta _*|\rightarrow 1\). Our study mainly focuses on this region, so that the discrimination between longitudinal and transverse W bosons can be improved. We applied the above two methods to improve upon the earlier studies available in the literature.
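The energy sharing described above can be made quantitative. For massless decay products, boosting the rest-frame energies \(m_W/2\) along the W direction with velocity \(\beta _W=|\vec p_W|/E_W\) gives the lab-frame energies

$$\begin{aligned} E_{1,2} = \frac{E_W}{2}\left( 1\pm \beta _W\cos \theta _*\right) , \qquad \frac{E_\text {soft}}{E_W} = \frac{1-\beta _W|\cos \theta _*|}{2}. \end{aligned}$$

As an illustration, assuming \(m_W\simeq 80.4\) GeV and \(|\vec p_W|=500\) GeV, one finds \(\beta _W\simeq 0.987\), so that at \(|\cos \theta _*|=1\) the softer decay product carries only about 0.6% of the W energy (roughly 3 GeV), whereas at \(\cos \theta _*=0\) the two products share the energy equally.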

4 Generation of templates

As discussed previously, we are interested in reconstructing boosted W jets and in separating the two differently polarized types of boosted W bosons using the variables \(p_\theta \) and \(z_j\). For this, we need to calibrate two kinds of samples, one with fully longitudinally polarized boosted W bosons and another with fully transversely polarized boosted W bosons. For this purpose, we used two specific interaction Lagrangians.

4.1 Template models

Longitudinally polarized W bosons can be generated via a fictitious scalar particle \(\phi _s\), which has a Higgs-like coupling to the W bosons and gluons,

$$\begin{aligned} \mathcal {L}_s = c_s^w \phi _s W^\mu W_\mu + c_s^g \phi _s G^{\mu \nu } G_{\mu \nu }, \end{aligned}$$
(11)

where \(c_s^w\) and \(c_s^g\) are the coupling constants. The scalar interacts with the W bosons through a term that is not gauge invariant. This is needed to pick out longitudinal W bosons at high energies. If the two W bosons are produced via the s-channel process through \(\phi _s\), they will have a small admixture of transverse W bosons. However, the fraction of transverse W bosons is suppressed by a factor \(\frac{m^4_W}{E^4}\). The admixture of transverse W bosons is thus less than \(10^{-3}\) for W bosons with energies higher than 500 GeV.
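The size of this suppression is easy to check numerically. The short Python sketch below evaluates the scaling factor \(m_W^4/E^4\) only, ignoring any order-one coefficients, and assumes \(m_W=80.4\) GeV; the function name is hypothetical.

```python
W_MASS_GEV = 80.4  # assumed W mass; not a value quoted in the text


def transverse_admixture(energy_gev, w_mass_gev=W_MASS_GEV):
    """Scaling estimate (m_W / E)^4 of the transverse admixture."""
    return (w_mass_gev / energy_gev) ** 4
```

For `energy_gev = 500.0` this gives about \(6.7\times 10^{-4}\), below the \(10^{-3}\) level quoted above, and the factor falls further as the energy increases.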

On the other hand, transversely polarized W bosons can be produced by using non-renormalizable dimension-5 interaction terms with the help of a fictitious pseudo-scalar field \(\phi _{ps}\). It can couple to W’s and gluons via terms like

$$\begin{aligned} \mathcal {L}_{ps} = d_s^w \phi _{ps} W^{\mu \nu } \tilde{W}_{\mu \nu } + d_s^g \phi _{ps} G^{\mu \nu } \tilde{G}_{\mu \nu }, \end{aligned}$$
(12)

where \(d_s^w\) and \(d^g_s\) are the coupling constants. From Eq. (12), it can be shown that the amplitude for W boson production from the pseudo-scalar vertex has the form \(\mathcal {M} \propto \epsilon _{\mu \nu \rho \sigma } p_1^\mu \epsilon _1^\nu p_2^\rho \epsilon _2^\sigma \), where \(\epsilon _{\mu \nu \rho \sigma }\) is the fully antisymmetric tensor and \(p_i\), \(\epsilon _i\) represent the four-momentum and polarization vector of the \(i\text {th}\) W. This form of the amplitude ensures that only purely transverse polarization vectors contribute.
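A brief way to see this, sketched here in the rest frame of \(\phi _{ps}\) with the two W bosons moving along \(\pm z\) with energy E and momentum p, is to write the momenta and the longitudinal polarization vectors explicitly:

$$\begin{aligned} p_{1,2}^\mu = (E,0,0,\pm p), \qquad \epsilon _L^\mu (p_{1,2}) = \frac{1}{m_W}(p,0,0,\pm E). \end{aligned}$$

All of these four-vectors lie in the t-z plane. If either W is longitudinally polarized, then \(p_1\), \(p_2\) and that polarization vector are three vectors in a two-dimensional subspace and are therefore linearly dependent, so their contraction with the fully antisymmetric tensor vanishes. A non-zero amplitude thus requires both polarization vectors to have components in the x-y plane, i.e. both W bosons to be transversely polarized.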

4.2 Details of generation of sample events

We implemented the two models in FeynRules [32] to generate Universal FeynRules Output (UFO) [33] files. We then used these UFO files in MadGraph5 [34] to generate \(pp\rightarrow W^+W^-\) sample events at a centre-of-mass energy \(\sqrt{s}\) = 14 TeV. The events were generated separately for longitudinally and transversely polarized W bosons with an intermediate scalar (\(\phi _s\)) and pseudo-scalar (\(\phi _{ps}\)), respectively. At parton-level event generation, we demand that the W bosons have \(p_T>300\) GeV in both cases. In order to generate a relatively pure longitudinal sample, the events producing W bosons via \(\phi _s\) were subjected to an additional lower cut of 500 GeV on the magnitude of the three-momentum of each W boson. Since we are interested in the boosted W regime produced via heavy resonance decay, the masses of the scalar \(\phi _s\) and the pseudo-scalar \(\phi _{ps}\) are chosen to be 1 TeV. From each event, we study the polarization of one boosted W boson via its hadronic decay mode. Therefore, one W boson was allowed to decay hadronically, and the other one was forced to decay leptonically during the event generation using MadGraph5. We then used Pythia8 [35, 36] to shower and hadronize the parton level events generated by MadGraph5, with the Monash 2013 Tune [37] to take care of the underlying event and multi-parton interactions in the proton-proton collisions. The Pythia8 generated hadrons were then passed on to the Delphes [38] detector simulation.

We carry out the analysis at three different levels. They are (a) parton level, (b) pythia level, and (c) delphes level. Parton level means the variables were calculated at the parton level final state, i.e. after event generation in MadGraph5. Pythia level means the variables were calculated after showering and hadronization by Pythia8. The variables calculated after detector effects using Delphes are represented by delphes level. The details of the event generation at these three different levels are described below.

  • Parton level: From the partons generated by MadGraph5, we construct the variables by using the relations given in Eqs. (4) and (5). The variables generated at the parton level will be labelled as \(\cos \theta _*\) and \(z_{j_*}\).

  • Pythia level: The analyses with the hadrons generated by Pythia8 will be referred to as pythia level analyses. No further cuts on the Pythia8 generated hadrons have been imposed at this stage.

  • Delphes level: The analyses with the particles generated by the Delphes-implemented particle flow algorithm will be referred to as delphes level analyses. No further cuts on the Delphes generated particles have been imposed at this stage.

For the analyses at pythia level and delphes level, we cluster the final state hadrons/particles within \(|\eta |<\) 4.0. We used the Cambridge–Aachen algorithm with radius \(R_0\) = 1.0 for clustering of the jets with the help of FastJet [39, 40]. Events with one fat jet with \(p_T>300\) GeV were taken into consideration for further analyses. A jet is tagged as a W jet if its mass lies between 60 and 100 GeV. We then used the two different methods described in the previous section, viz. N-subjettiness and Soft Drop, to find the two subjets of the W jet using FastJet Contrib [11, 12, 26]. Since we expect two subjets in a fat W jet, we have taken \(N=2\) for the calculation of the N-subjettiness variable \(\tau _N\), as defined in Eqs. (6) and (8), in all cases. After finding the two subjets of the fat jet, we reconstruct the two variables \(p_\theta \) and \(z_j\) as expressed in Eqs. (4) and (5).
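The W-jet selection described above amounts to a cut on the transverse momentum and the invariant mass of the fat jet. A minimal Python sketch, assuming the jet constituents are available as `(E, px, py, pz)` tuples in GeV and treating the mass window as inclusive (our assumption), could look as follows; the function names are hypothetical and the jet clustering itself is not reproduced here.

```python
import math


def jet_kinematics(constituents):
    """Sum (E, px, py, pz) constituents and return (pT, mass) of the jet."""
    energy = sum(c[0] for c in constituents)
    px = sum(c[1] for c in constituents)
    py = sum(c[2] for c in constituents)
    pz = sum(c[3] for c in constituents)
    mass_squared = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    # Guard against small negative values from rounding.
    return math.hypot(px, py), math.sqrt(max(mass_squared, 0.0))


def is_w_candidate(constituents, min_pt=300.0, mass_window=(60.0, 100.0)):
    """Apply the pT > 300 GeV and 60-100 GeV mass-window selection."""
    pt, mass = jet_kinematics(constituents)
    return pt > min_pt and mass_window[0] <= mass <= mass_window[1]
```

A jet made of two massless constituents of 250 GeV each with a suitable opening angle passes the selection, whereas a single massless constituent, having zero jet mass, does not.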

4.3 Best case scenarios

Different choices of axes and distance measures in the N-subjettiness technique, and different values of \(z_\text {cut}\) and \(\beta _\text {SD}\) in the Soft Drop groomer, can be useful depending on which type of polarization we want to study. In order to bring the distribution of \(p_\theta \) close to that of \(|\cos \theta _*|\), we have varied the parameters of Soft Drop and N-subjettiness. We first performed a thorough scan over these parameters to get a good match to the parton level distribution, which is essentially the theoretical distribution of \(|\cos \theta _*|\). We did not separately match the distribution of the \(z_j\) variable to that of its parton level counterpart \(z_{j_*}\), since the variables \(p_\theta \) and \(z_j\) are highly correlated. In our study, we found that we need to take different values of the parameters for the longitudinal case than for the transverse case. We tabulate in Table 1 the parameter choices that best fit the \(p_\theta \) and \(z_j\) distributions to their parton level counterparts for both the longitudinal and transverse cases.

Fig. 2

Normalized distribution of angular variable \(p_\theta \) (upper panel) and momentum balance \(z_j\) (lower panel) for longitudinally polarized W. The distributions are shown for parton level (blue dashed), pythia level (green solid), and delphes level (red dash-dotted) analyses with N-subjettiness (left panel) and Soft Drop (right panel) techniques to find the subjets

Fig. 3

Normalized distribution of the angular variable \(p_\theta \) (upper panel) and momentum balance \(z_j\) (lower panel) for transversely polarized W. The convention for the colours and the level are similar to Fig. 2

In general, the matching of the pythia or delphes level distribution to the original (parton level) distribution does not depend much on the value of the exponent \(\beta \) in the N-subjettiness distance measure. The reason is that once a set of candidate axes is chosen, the formation of the subjets (the subclusters in the Assignment step) is independent of the exponent \(\beta \). There is, however, a weak dependence on \(\beta \) in the further iterative steps, which guide towards the final axes choice by minimizing \({\tilde{\tau }}_N\). The value \(\beta =1\) directs the axes towards the actual radiation direction (‘median’ direction) and \(\beta =2\) directs the axes towards the mean direction [20]. However, for collimated radiation, the median direction and mean direction are almost collinear. More importantly, the subjets remain the same on a jet-by-jet basis in almost all the events. On the other hand, the choice of the value of p in the exclusive algorithm, which is needed to find the seed axes, is crucial and is usually different for transversely and longitudinally polarized W bosons. A value of p close to zero implies that the Cambridge–Aachen version of the algorithm is in use. In this case, no weight is given to the \(p_T\) of the constituents. This allows the algorithm to merge faraway radiation at a later stage. This helps set the seed axes near the actual hadronic decay products of a transversely polarized W boson, for which most of the decays are expected to be wide. On the other hand, a value of p away from zero helps for the longitudinal W boson, since its decays are expected to be narrower. The recombination scheme of the exclusive clustering algorithm is chosen to be the WTA scheme [29,30,31]. This is a recoil-free recombination of two four-momenta, in which the direction of the resultant momentum is taken to be that of the more energetic of the two. The polarization study depends only weakly on the choice of the recombination scheme. Furthermore, we have chosen one-pass minimization to keep the N-subjettiness observable infrared-and-collinear safe and deterministic for any given jet [20].

The choice of parameters in the Soft Drop subjet finding method is also crucial. Since the transversely polarized W boson decay products are expected to be widely separated, a weaker grooming, i.e. a larger \(\beta _\text {SD}\) and a smaller \(z_\text {cut}\), is useful. It is the opposite, i.e. stronger grooming, in the case of longitudinal W boson. The effect of grooming is further discussed in Sect. 4.4.

We show this matching in Fig. 2 for longitudinally polarized W bosons (the Lagrangian is given in Eq. (11)). As mentioned earlier, we carried out the analysis at three different levels, viz. (a) parton level, (b) pythia level, and (c) delphes level. In all the panels of Fig. 2, blue dashed, green solid, and red dash-dotted histograms represent the distributions of the variables at parton level, pythia level, and delphes level, respectively. We can see that both the N-subjettiness analysis and the Soft Drop analysis provide good matching for the longitudinal case for the variables \(p_\theta \) and \(z_j\). The distribution is a little off near the value 1. This is because one of the subjets is very soft in that region, and hence it is difficult to reconstruct that soft subjet effectively in that phase-space region.

The same analysis has been done for the transverse case as well. Figure 3 shows the distributions of the same variables for the parton, pythia, and delphes level analyses. The conventions (colour, label etc.) are the same as those of Fig. 2. We see the same feature here again, i.e. the distribution is not very accurate near 1 because one of the subjets is very soft there.

Figures 2 and 3 help us to understand the choices of different parameters as tabulated in Table 1. The \(p_\theta \) distributions peak near 0 for the case of longitudinal W, whereas they peak near 1 for the case of transverse W. This indicates that the decay products of a longitudinal W boson tend to take an almost equal share of its energy. In contrast, the energy share of the decay products of a transverse W tends to be asymmetric. Thus aggressive grooming, e.g. high \(z_\text {cut}\) or small \(\beta _\text {SD}\), tends to throw away the softer subjet of a transverse W boson. Therefore, when Soft Drop is used to find the subjets inside a boosted transverse W, we need a very small \(z_\text {cut}\) and a large \(\beta _\text {SD}\) in order to keep the softer subjet of the final jet. On the other hand, a slightly higher \(z_\text {cut}\) and a lower \(\beta _\text {SD}\) are preferable for the longitudinal W bosons.

Table 1 Parameter choices for the two techniques used to find subjets inside W fat jet. These values of parameters are taken to optimize the templates for longitudinally and transversely polarized W. For N-subjettiness calculation, an ‘unnormalized measure’ with \(R_0=0.2\) was used

Fig. 4 Demonstration of effect of grooming in the N-subjettiness scenario. The \(p_\theta \) distribution for (left) longitudinal and (right) transversely polarized W boson for weak, moderate, strong grooming scenario

4.4 Effect of grooming

We now show the effect of grooming in the N-subjettiness subjet finding method. We have already explained that \(\beta _\text {SD} \rightarrow \infty \) or \(z_\text {cut} = 0.0\) corresponds to the ungroomed scenario of a fat jet. To illustrate the effect, we consider three different levels of grooming: (a) weak, (b) moderate, and (c) strong. The exact values of the parameters for these three levels of grooming are as follows:

$$\begin{aligned} \text {weak grooming:}&\quad \beta _\text {SD} = 3.0,\,z_\text {cut}=0.10, \\ \text {moderate grooming:}&\quad \beta _\text {SD} = 2.0,\,z_\text {cut}=0.15, \\ \text {strong grooming:}&\quad \beta _\text {SD} = 1.0,\,z_\text {cut}=0.20. \end{aligned}$$
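
To make the three levels concrete, the following minimal sketch evaluates the standard Soft Drop condition, \(z > z_\text {cut}\,(\Delta R/R_0)^{\beta _\text {SD}}\) with \(z = \min (p_{T1}, p_{T2})/(p_{T1}+p_{T2})\), for a single two-prong splitting. The function name, the reference radius \(R_0 = 0.8\) used as default, and the example kinematics are hypothetical and serve only as an illustration.

```python
# Grooming levels as quoted in the text above.
GROOMING = {
    "weak": {"beta_sd": 3.0, "z_cut": 0.10},
    "moderate": {"beta_sd": 2.0, "z_cut": 0.15},
    "strong": {"beta_sd": 1.0, "z_cut": 0.20},
}


def passes_soft_drop(pt1, pt2, delta_r, z_cut, beta_sd, r0=0.8):
    """Return True if the splitting survives the Soft Drop condition.

    r0 = 0.8 is an assumed reference radius, not a value from the analysis.
    """
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (delta_r / r0) ** beta_sd


if __name__ == "__main__":
    # Hypothetical asymmetric splitting: z = 0.05 at an opening angle of 0.4.
    for name, cfg in GROOMING.items():
        print(name, passes_soft_drop(95.0, 5.0, 0.4, **cfg))
```

For this asymmetric splitting the thresholds are 0.0125 (weak), 0.0375 (moderate), and 0.10 (strong), so the soft prong is kept by weak and moderate grooming but dropped by strong grooming, whereas a symmetric splitting passes all three.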

The \(p_\theta \) distributions for the three grooming scenarios with the N-subjettiness subjet finding method are plotted in Fig. 4. Red, green, and blue histograms correspond to strong, moderate, and weak grooming, respectively. The distributions corresponding to the ungroomed scenario are shown by grey histograms in Fig. 4. The \(p_\theta \) distributions shown are for the delphes level analysis.

It can be seen from the distributions that as the grooming becomes stronger, the \(p_\theta \) distribution deviates from its ungroomed distribution, which was closer to the parton level distribution. As the grooming becomes stronger, the softer or wider subjet component of a fat jet tends to get removed. As a result, such fat jets effectively become one-prong jets and are removed from the sample, and hence no jet survives in the \(p_\theta \rightarrow 1\) region. The effect of grooming is seen near \(p_\theta \rightarrow 1\) in both the transverse and longitudinal scenarios. The effect is, however, more pronounced in the transverse case since its original distribution peaks at \(p_\theta =1\).

5 Result

Having generated the templates of \(p_\theta \) and \(z_j\) for longitudinal and transverse W bosons, we now demonstrate the usefulness of these templates. This is shown via (a) the separability between the templates of the two differently polarized W boson samples, and (b) the extraction of the fraction of longitudinal/transverse W bosons from an admixture of the two samples.

Fig. 5 ROC curves to illustrate the separability between two templates, viz. longitudinal and transverse. The parameter choices are taken to optimize the longitudinal template

Fig. 6 ROC curves to illustrate the separability between two templates, viz. longitudinal and transverse. The parameter choices are taken to optimize the transverse template

5.1 Separability

We check the separability between the longitudinal and transverse W bosons in terms of Receiver Operating Characteristic (ROC) curves. Whenever the distribution of a variable differs between two types of sources, we may quantify their separability via ROC curves. These curves show how strongly one distribution can be rejected at a given acceptance level of the other. This is usually done in signal and background analyses, where the main aim is to accept the signal and reject the background effectively. However, ROC curves can also give us a sense of the separability of two distributions. Although the longitudinal and transverse distributions do not constitute a signal and a background in the usual sense, we have drawn their ROC curves to show their separability via this method. If two distributions are identical, the area under the ROC curve is 0.5. If the two distributions are separated, the area under the curve varies from 0.5 to 1, with 1 corresponding to complete separation. Hence, the closer the area under the ROC curve is to 1, the better the two distributions can be separated.
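
The two limiting values of the area under the ROC curve can be checked with a small sketch. It uses the rank-based identity that the area equals the probability that a randomly chosen signal score exceeds a randomly chosen background score, with ties counted as one half. The function name and the toy scores are illustrative and are not part of the analysis.

```python
def roc_auc(signal_scores, background_scores):
    """Area under the ROC curve from pairwise score comparisons.

    Each (signal, background) pair contributes 1 if the signal score is
    larger, 0.5 if the two are equal, and 0 otherwise.
    """
    if not signal_scores or not background_scores:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                total += 1.0
            elif s == b:
                total += 0.5
    return total / (len(signal_scores) * len(background_scores))
```

Two identical samples give an area of 0.5, and two samples with no overlap in score give 1, in line with the interpretation above. The pairwise loop scales quadratically with the sample size, so it is meant only as an illustration of the definition.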

We consider two separate cases to show our result. In one case, we want to select longitudinally polarized W events over the transverse ones (see Fig. 5), and here we use the parameter choice for the longitudinal W best-case scenario from Table 1. In the other case, we try to select the transversely polarized W dominated region over the longitudinal one (see Fig. 6). Here we use the jet substructure (JSS) parameters as per the transverse W best-case scenario from Table 1. In both cases we use two feature variables, namely \(p_\theta \) and \(z_j\). In order to get better separability, we explore some recently developed techniques like gradient-boosted decision trees  [41]. We used the XGBoost [41] toolkit for gradient boosting. For the gradient-boosted decision tree method of separation, we consider \(\sim \)5500 estimators and a maximum depth of 4, where the learning rate is chosen according to how well the longitudinally and transversely polarized samples can be separated at the different levels of the analysis (parton level, pythia level, or delphes level). For the parton level analysis in both cases, our learning rate is 0.03, and we have used 80% of our total dataset for training and 20% for validation, whereas for the other levels of analysis, the learning rate is 0.001 and we used 70% of the data for training and 30% for validation. At this point, we note that there is a possibility of overtraining of the BDT classifier for a given data sample. In the case of overtraining, the training sample gives extremely good accuracy but the test sample fails to achieve that. In such scenarios, we can observe noticeable differences between the ROC curves of the training and testing samples. We have explicitly checked that, with our choice of parameters, the classifier is not overtrained.
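
The settings quoted above can be collected in a small configuration sketch. The keyword names `n_estimators`, `max_depth`, and `learning_rate` follow the scikit-learn style interface of XGBoost; the helper names, the `train_fraction` key, the random seed, and the use of a shuffled index split are assumptions of this sketch and are not taken from the analysis code.

```python
import random


def bdt_settings(level):
    """Hyperparameters quoted in the text, keyed by analysis level."""
    if level not in ("parton", "pythia", "delphes"):
        raise ValueError("unknown analysis level: %r" % (level,))
    is_parton = level == "parton"
    return {
        "n_estimators": 5500,  # the text quotes roughly 5500 estimators
        "max_depth": 4,
        "learning_rate": 0.03 if is_parton else 0.001,
        "train_fraction": 0.8 if is_parton else 0.7,  # not an XGBoost keyword
    }


def train_validation_split(n_events, train_fraction, seed=0):
    """Shuffle event indices and split them into training and validation sets."""
    indices = list(range(n_events))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * n_events))
    return indices[:n_train], indices[n_train:]
```

A check for overtraining along the lines described above would then compare the area under the ROC curve on the training indices with that on the validation indices; a large gap between the two would indicate overtraining.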

In Fig. 5, we show the ROC curves. In this BDT classifier, the template for longitudinally polarized W bosons is taken to be the signal and that of transversely polarized W bosons is taken to be the background. In the left panel, we show the ROC curves for parton level, pythia level, and delphes level analyses with the N-subjettiness technique. In the right panel, the corresponding curves are shown for the Soft Drop technique. Conversely, Fig. 6 shows the ROC curves where the signal consists of the events with transversely polarized W and the background events come from longitudinally polarized W decays. Here also, the left and the right panels show the separability using the N-subjettiness and Soft Drop techniques, respectively. For all the cases, the areas under the ROC curves are listed in Table 2.

Table 2 Area under the ROC curves for different levels of analysis

Fig. 7 ROC curves to illustrate the separability between two templates, viz. longitudinal and transverse. N-subjettiness parameters are optimized for (left) longitudinal and (right) transverse polarizations of W boson. In both panels, blue solid, red dashed, green dotted, and black dash-dotted lines correspond to the ungroomed, weak, moderate, and strong grooming scenarios, respectively. Weak, moderate, and strong groomings are as per the definitions provided in Sect. 4.4

Table 2 shows that the best separability is achieved at the parton level among all the cases. Among the two subjet finding methods, N-subjettiness achieves better separability in both scenarios. Overall, the transverse W best-case scenarios perform better than their longitudinal counterparts. This is somewhat expected. Better reconstruction of the transverse W bosons means a better reconstruction of the variable near the \(p_\theta \rightarrow 1\) region. The separability between the two distributions comes primarily from this \(p_\theta \rightarrow 1\) region. Therefore, the template that is optimized for the transverse case separates the two differently polarized W bosons better. In fact, the area under the ROC curve decreases monotonically as we increase p from 0.05 (transverse W best-case) towards 0.6 (longitudinal W best-case) and beyond.

As explained before, the value \(p = 0\) represents the seed axes choices of the two subjets via the exclusive Cambridge–Aachen clustering algorithm. This, therefore, suggests that a simpler exclusive Cambridge–Aachen algorithm as the seed axes finding procedure would provide the best result in terms of the separability.

5.2 Effect of grooming in separability

We now compare the separability for different types of grooming. To showcase the effect of grooming, we have taken four different grooming scenarios, viz. ungroomed, weak, moderate, and strong. We take the same set of N-subjettiness parameters for the weak, moderate, and strong grooming as defined in Sect. 4.4. We then prepared the \(p_\theta \) templates with the N-subjettiness subjet finding procedure as described in the previous sections. These templates have been taken at the delphes level analysis. The separation of the two templates, namely transverse and longitudinal, is then studied via ROC curves.

In Fig. 7, we show the separability via ROC curves of the four differently groomed scenarios. In the left panel of the figure, the curves correspond to the optimal N-subjettiness parameter choices for longitudinal W. In this case, the longitudinal template is taken to be the signal and the transverse one to be the background. The ROC curves for the ungroomed, weak, and moderate grooming scenarios are almost overlapping, and the strong grooming curve is a little separated from the other three. However, the areas under the curves, as written alongside the legends, show that there is a clear ordering from the ungroomed to the strong grooming scenario. This is expected for the reason explained in Sect. 4.4. In the right panel of Fig. 7, we show the ROC curves corresponding to the N-subjettiness parameters optimized for transverse W. All the curves are drawn with the transverse template as the signal and the longitudinal one as the background. Here again, a similar feature is observed, i.e. the ungroomed scenario gives the best separability, and the strong grooming provides the worst separability. Overall, as explained earlier, the parameter choices optimized for transverse W in the ungroomed scenario give the best separability among all the cases considered.

5.3 Template fitting

We now use the above templates of longitudinal and transverse W bosons to extract information from a mixed sample. For this study, we first prepared sample events that contain an admixture of longitudinal and transverse W bosons. We then fit these mixed sample events with the templates that we generated earlier. Let L(x) and T(x) be the distributions of a variable x for the two templates of longitudinally and transversely polarized W bosons, respectively. These distributions are obtained after the detector simulation and hence are not necessarily the same as the parton level distributions. Let a mixed sample have the distribution M(x) for the same variable x. The fraction, \(\alpha \), of longitudinally polarized W bosons in the mixed sample may be estimated by minimizing the following quantity.

$$\begin{aligned} \chi ^2 = \sum _{i\,\in \,\text {bins}} \left| M(x_i)-\alpha L(x_i) - (1-\alpha ) T(x_i)\right| ^2 \end{aligned}$$
(13)

The minimization over the fraction \(\alpha \) gives the estimate for \(\alpha \) as

$$\begin{aligned} \alpha = \frac{\sum _{i\,\in \,\text {bins}}\left( M(x_i)-T(x_i)\right) \left( L(x_i)-T(x_i)\right) }{\sum _{i\,\in \,\text {bins}}\left( L(x_i)-T(x_i)\right) ^2} \end{aligned}$$
(14)
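
Equation (14) follows from setting \(\partial \chi ^2/\partial \alpha = 0\) in Eq. (13) after writing the summand as \(\left[ (M-T) - \alpha (L-T)\right] ^2\). A minimal sketch of this closed-form estimate is given below; the function name and the toy binned templates are hypothetical and serve only to check the formula on an exactly mixed sample.

```python
def fit_longitudinal_fraction(mixed, long_tmpl, trans_tmpl):
    """Closed-form least-squares estimate of alpha, following Eq. (14).

    All three arguments are binned distributions with identical binning.
    """
    if not (len(mixed) == len(long_tmpl) == len(trans_tmpl)):
        raise ValueError("all histograms must have the same number of bins")
    numerator = sum(
        (m - t) * (l - t) for m, l, t in zip(mixed, long_tmpl, trans_tmpl)
    )
    denominator = sum((l - t) ** 2 for l, t in zip(long_tmpl, trans_tmpl))
    if denominator == 0.0:
        raise ValueError("templates are identical; alpha is undetermined")
    return numerator / denominator


if __name__ == "__main__":
    # Toy templates (made-up numbers): L peaks at low x, T peaks at high x.
    L = [0.4, 0.3, 0.2, 0.1]
    T = [0.1, 0.2, 0.3, 0.4]
    M = [0.25 * l + 0.75 * t for l, t in zip(L, T)]
    print(fit_longitudinal_fraction(M, L, T))
```

For a mixed histogram that is an exact linear combination of the two templates, the estimate reproduces the input fraction up to floating-point rounding. With statistically independent samples, the estimate fluctuates around the true fraction, and it is not constrained to lie between 0 and 1.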

In this part of the study, we used the delphes level distributions as our templates for longitudinally and transversely polarized W. We then prepared mixed sample events with three different longitudinal fractions, namely 25%, 50%, and 75%, and estimated the value of \(\alpha \) for each of these mixed samples. The estimated values are presented in Table 3.

Table 3 Estimated fraction (4th column) and actual fraction (3rd column) of longitudinally polarized W in a mixed sample for the two techniques used to find the subjets

We have done this analysis with both the subjet finding methods, viz. N-subjettiness as well as the Soft Drop method, and for both the scenarios with the templates being optimized best for longitudinally and transversely polarized W. In this part of the analysis, we used only \(p_\theta \) variable. We can see from Table 3 that the fraction \(\alpha \) can be estimated with relatively good accuracy for the case when the template is best optimized for transversely polarized W in the N-subjettiness subjet finding method.

6 Summary and outlook

To summarize, we have studied the polarization states of hadronically decaying boosted W bosons. We have considered 14 TeV centre-of-mass energy at the LHC in this study. We first generated samples of approximately pure longitudinal and transverse W bosons by taking appropriate template models and a high enough \(p_T\) cut to keep the hadronic W as a fat jet. The analysis was done using the angular variable \(p_\theta \) (a proxy for \(|\cos \theta _*|\)) and the momentum balance \(z_j\), calculated using the momenta and energies of the two subjets inside the boosted W bosons. We employed the techniques of N-subjettiness and Soft Drop to find the two subjets inside W fat jets. The analysis was done at three different levels, viz. (a) parton level, (b) pythia level, and (c) delphes level. The different parameters of N-subjettiness and Soft Drop were optimized to achieve a better match to the parton level distribution of these two variables for longitudinally and transversely polarized W bosons separately. Although the optimized values of the parameters are different in the two differently polarized cases, the separability is quite good in both cases. We then used the templates to get an estimate of the fraction of longitudinally polarized W in a set of mixed sample events. The estimates are better when the template is optimized for transversely polarized W than when it is optimized for the longitudinal case.

The primary improvement of this study is to find the subjets inside a fat jet with relatively better accuracy. These techniques can be used in studies where the subjets inside a boosted jet need to be found. Although we did not carry out a signal-background analysis in this study, this technique can be used for such studies. This improvement may also be achieved with other boosted objects like Z, H, t, or other heavy BSM particles.