Identifying a new particle with jet substructures

We investigate a potential of measuring properties of a heavy resonance X, exploiting jet substructure techniques. Motivated by heavy higgs boson searches, we focus on the decays of X into a pair of (massive) electroweak gauge bosons. More specifically, we consider a hadronic Z boson, which makes it possible to determine properties of X at an earlier stage. For $m_X$ of O(1) TeV, two quarks from a Z boson would be captured as a"merged jet"in a significant fraction of events. The use of the merged jet enables us to consider a Z-induced jet as a reconstructed object without any combinatorial ambiguity. We apply a conventional jet substructure method to extract four-momenta of subjets from a merged jet. We find that jet substructure procedures may enhance features in some kinematic observables formed with subjets. Subjet momenta are fed into the matrix element associated with a given hypothesis on the nature of X, which is further processed to construct a matrix element method (MEM)-based observable. For both moderately and highly boosted Z bosons, we demonstrate that the MEM with current jet substructure techniques can be a very powerful discriminator in identifying the physics nature of X. We also discuss effects from choosing different jet sizes for merged jets and jet-grooming parameters upon the MEM analyses.

Abstract: We investigate a potential of measuring properties of a heavy resonance X, exploiting jet substructure techniques. Motivated by heavy higgs boson searches, we focus on the decays of X into a pair of (massive) electroweak gauge bosons. More specifically, we consider a hadronic Z boson, which makes it possible to determine properties of X at an earlier stage. For m X of O(1) TeV, two quarks from a Z boson would be captured as a "merged jet" in a significant fraction of events. The use of the merged jet enables us to consider a Z-induced jet as a reconstructed object without any combinatorial ambiguity. We apply a conventional jet substructure method to extract four-momenta of subjets from a merged jet. We find that jet substructure procedures may enhance features in some kinematic observables formed with subjets. Subjet momenta are fed into the matrix element associated with a given hypothesis on the nature of X, which is further processed to construct a matrix element method (MEM)-based observable. For both moderately and highly boosted Z bosons, we demonstrate that the MEM with current jet substructure techniques can be a very powerful discriminator in identifying the physics nature of X. We also discuss effects from choosing different jet sizes for merged jets and jet-grooming parameters upon the MEM analyses.

Introduction
The Large hadron collider (LHC) has played an important role in deepening our understanding of electroweak symmetry breaking by discovering the Higgs particle. As the LHC experiment reaches the energy scale of tera electronvolt (TeV), it is of paramount importance to study potential new physics such as various extended Higgs sectors, existence of other fundamental scalars [1][2][3], vector resonances under the set-up of composite models [4][5][6][7][8][9], and so on. While the SM-like Higgs field involves the electroweak symmetry breaking, resonances in extended Higgs sectors would have sizable decay branching ratios to heavy SM particles including the weak gauge bosons, the Higgs, and the top quark, if kinematically allowed. As increased center-of-mass energy at the LHC enables us to probe heavier new particles of O(1) TeV, a substantial mass gap between a new particle and a heavy SM state would result in a large boost of the latter, accompanying highly collimated objects along the boost direction of the latter in the final state. While the leptonic decay products of the above-listed heavy SM particles often carry advantages in conducting data analyses owing to their cleanness, hadronic decay products are expected to play an important role in not only discovery opportunity but property measurement at the early stage due to their larger branching fractions. However, their jetty nature at the detection level renders associated analyses challenging because of significant overlaps between the final state jets, requiring robust analysis tools to deal with such hadronic objects reliably. A promising venue in developing relevant techniques is the field of jet substructure [10].
A successful application of the jet substructure techniques is to tag single-jet-looking objects from decays of boosted, heavy SM states (e.g., t/W/Z/H) against structureless or single-prong QCD jets [11]. The idea is that one can capture hadrons from the decay of a heavy SM particle, using a single "merged" jet which is defined by a proper choice of the jet size. An expected benefit from utilizing a resultant (massive) merged jet is mitigation of the systematics which often arises in considering multi-particle final states (e.g., combinatorial ambiguity), by reducing the number of reconstructed objects. The price for it is the possibility that even a normal QCD jet may acquire a sizable mass in combination with underlying QCD activities including pile-ups. 1 In this regard, there are dedicated studies • to reduce corruptions from irrelevant hadrons for a given jet [13][14][15][16], and • to differentiate a jet resulting from a boosted heavy SM state from an ordinary QCD jet by looking into its substructure [13,[17][18][19][20][21][22][23][24].
Many proposed methods along the line have been successfully implemented for analyzing the LHC data, and concurrently the sensitivities for the high mass region have been improved by reducing relevant SM backgrounds efficiently. In this context, it is interesting to question how far the characteristic features in kinematic observables are captured after subjet isolations, if included are various realistic effects such as parton shower, hadronization/fragmentation, detector response, and jet clustering. We first point out that rather precise identification of the features is viable in some controlled environment, despite the presence of realistic effects. Motivated by the spin-parity determination of the SM Higgs boson [25] and the diboson resonance [26,27] through massive bosonic intermediary states in relevant decay processes, we focus on the analysis of W/Z-induced two-prong jets and examine well-motivated angular variables formed with reconstructed subjets. In the case of production of a new, bosonic heavy resonance, the jet substructure techniques are relevant to the channels of W W , ZZ, and Zγ in which the associated final state is, at least, partially hadronic.
For a sufficiently boosted, heavy state V , the angular separation ∆R between its two decay products is given by where m V and P V T denote the mass and the transverse momentum of particle V . As usual jet substructure techniques begin with identifying a "merged" jet by a fairly large fixed cone size to capture all constituent jets followed by a declustering procedure to find subjets, the hardness of P V T is crucial in choosing a reasonable cone size, hence too a successful subjet analysis. Moreover, considering the fact that the generic shape analysis demands global information, we see that a proper definition of merged jets is a key component for posterior analyses. In particular, the phase-space reduction induced by fixing a cone size for merged jets would cause adverse distortions of the kinematic distributions of interest, becoming an obstacle to render it hard to decode the physics behind signals. To illustrate these points, we employ two benchmark points for a heavy resonance decaying into a ZZ final state in order to cover kinematically distinctive regions, one for the moderately boosted Z case and the other for the highly boosted one. We contrast/compare them in terms of the angle particularly sensitive to the CP state of the resonances. We there explicitly show that remarkably, jet substructure techniques preserve useful information quite well.
Being confident of the above single-variable analysis, we then move our focus onto matrix element method (MEM)-based observables which allow us to make full use of all available information encrypted in four-momenta of final state particles [28][29][30][31][32][33][34][35]. Unlike other statistical methods based on distributions of multiple observables, the MEM is predicated on a straightforward and elegant interpretation on the probability measure P, that is, the quantum amplitude of a given process with hypothesis α is schematically given as follows: where M is the matrix element for hypothesis α and W is the transfer function introduced to map parton-level momentum vectors ({q}) to reconstruction-level ones ({p reco }).
Markedly, the usefulness of the MEM has been proven in discriminating different spin/CP state hypotheses [25,28,29,33,34,36]. In particular, the MEM was a driving force to determine various properties of the SM Higgs particle in the four-lepton channel, which has been considered as one of the most exciting achievements at the LHC. In more detail, by identifying the interaction between the Higgs boson and a Z-boson pair, it has been shown that the Higgs boson is indeed related to the SU (2) L × U (1) Y gauge symmetry breaking mechanism. We note that this channel comes with ten degrees of freedom compared to its competing diphoton channel with only four degrees of freedom although the former involves smaller statistics than the latter. Therefore, given low statistics, it is imperative to combine different information from various degrees of freedom in an optimized way, for which the MEM is well-suited. We remind that many of the collider studies for the decay of a heavy resonance into the final state particles via massive SM states often advocate fully leptonic channels in not only search for new particles but measurement of their properties, due to the clean nature of leptonic final states even at the reconstruction level. While it is challenging to extract useful information from hadronic decay products unlike leptonic ones, the remarkable discriminating power of the MEM motivates us to construct an MEM-based kinematic discriminant (KD) using four-momenta of subjets. We then investigate how much the discrimination potential is retained in the context of jet substructure techniques again employing the benchmark scalar resonances.
To convey our main ideas coherently, we organize this paper as follows. In Section 2, we begin with the discussion on the phase-space reduction by introducing a fixed cone size. In Section 3, we provide a brief review of angular distributions for discriminating the spin and the CP states of heavy resonances, and discuss the impact of the phase-space reduction upon kinematic observables, in particular, CP-sensitive ones. We then present our main results obtained from the MEM-based analyses in conjunction with the jet substructure techniques in Sections 4 and 5. Finally, Section 6 is reserved for our conclusions and outlook.

Phase-space reduction
We begin this section by estimating the cone size R for "Merged Jets" (MJ) to capture all the two visible particles v 1,2 emitted from a highly boosted massive particle (e.g., . For simplicity, we assume that the two partonic decay products are massless and well-approximated to two subjets j 1,2 which are the constituents of a merged jet. We define P T (MJ) and m MJ as the laboratory-frame transverse momentum and the mass of a merged jet, respectively. With the assumption of P T (MJ) m MJ , simple kinematics in leading-order QCD leads to where z is defined as min(P T (j 1 ) ,P T (j 2 ) ) , i.e., the fractional transverse momentum of the leading subjet (say, j 1 ) with respect to the total transverse momentum. Here the equality is obtained in the limit of z = 1/2.
We then closely look at the relation between R and the angular separation ∆R 12 of two subjets which is defined as ∆R 12 ≡ ∆η 2 12 + ∆φ 2 12 , where ∆η 12 and ∆φ 12 denote the differences between the two subjets in pseudorapidity and azimuthal angle in the laboratory frame, respectively. The angular distance between j 1 and j 2 in the laboratory frame can be expressed in terms of the polar angle θ and the azimuthal angle φ of the leading subjet in the heavy particle rest frame relative to the boost direction to the laboratory frame [37]: where cosh η = E MJ /m MJ is a Lorentz boost factor of the MJ. One can show that ∆R 12 has a minimum at θ = π/2 and φ = 0 for any fixed η [37]. Therefore, a necessary condition to capture the two subjets for a given η is that the cone size R should be greater than the lower limit of ∆R 12 :  where the last step is done by setting cosh η in the transverse plane and taking a large transverse momentum limit. Note that this asymptotic behavior is identical to the estimate in eq. (2.1). Now if we set the cone size to be R MJ , all events with R < R MJ are accepted. We then translate this inequality to the upper bound for the polar angle θ: This inequality implies that fixing the cone size for MJs confines the polar angle to a certain range, resulting in a reduction of the accessible phase space.
To visualize this observation, we exhibit cos θ distributions of quarks (say, b) from Higgs or Z gauge boson decays. To minimize any effects on the angular distributions from their production, we assume that a pair of H or Z bosons are produced via the decay of a heavy scalar S, for example, gg → S → HH/ZZ. Trivially, cos θ for the Higgs boson case has a flat distribution. On the other hand, a Z boson has transverse and longitudinal polarization components, and thus its coupling to particle S is described in a somewhat complicated manner. Denoting M Z and Λ as the Z gauge boson mass and a scale parameter, we define the interaction Lagrangian between S and Z as where Z µν andZ µν are the field strength tensor and the dual field strength tensor for the Z boson, respectively. In M S M Z limit, the first term takes care of the interaction of the longitudinal polarization component while the other two describe that of the transverse polarization components [28,38], and the resulting differential cross section in cos θ is given by (2.7) Figure 1 displays our numerical results with parton-level Monte Carlo simulation for which the input mass of the heavy resonance S is 1 TeV for illustration. As mentioned above, we take the decay process of H or Z into a bottom quark pair. The upper-left panel shows the unit-normalized distributions of ∆R bb for the Higgs boson (orange histogram) and the Z gauge boson (blue histogram). The red and the blue dashed lines mark the locations corresponding to R = 0.8 and R = 1.2, respectively, allowing us to develop our intuition on what fraction of events are tagged. The other three panels (upper-right for the Higgs boson, lower-left for the longitudinal Z, and lower-right for the transverse Z) demonstrate the unit-normalized cos θ distributions with R ≤ 0.8 (red histogram) and R ≤ 1.2 (blue histogram) and compare them with the corresponding theory expectation represented by solid black lines. We clearly observe that a fixed cone size for MJs may distort the shape of differential distributions. Hence, when investigating physics governing experimental signatures with kinematic distributions including angular observables, one should conduct a careful examination on how much of partonic information would be missing by the introduction of a fixed cone size for MJs in reconstructing final state objects.

Angular correlations among final state particles
As in the case of the SM Higgs boson whose first signature appeared in the final state with γγ and ZZ, if a heavy new particle X respects the SM electroweak gauge symmetry, it may appear as a resonance in the final state with ZZ, W W , Zγ, and γγ. We divide them into three categories according to the number of angular degrees of freedom measured in the X rest frame.
1. X → γγ: Two angular degrees of freedom as (θ * , Φ * ) 2. X → Zγ: Four angular degrees of freedom as (θ * , Φ * , θ 1 , φ 1 ) 3. X → ZZ/W + W − : Six angular degrees of freedom as (θ * , Φ * , θ 1 , φ 1 , θ 2 , φ 2 ) 2   As schematically shown in Figure 2, the decay of X into two gauge bosons V 1 V 2 involves two degrees of freedom, polar angle θ * and azimuthal angle Φ * of V 1 (or equivalently V 2 ) about the beam axis. In a similar manner, each of the two gauge bosons involves two degrees of freedom, polar angle θ i and azimuthal angle φ i of one of the decay products relative to the V i boost direction in the V i rest frame. Another degree of freedom comes with the rapidity of the whole decay system which encodes the information of initial state partons through the parton distribution functions. However, imposing a rapidity cut on the reconstructed heavy resonance, we anticipate that any of its associated impact upon kinematic observables becomes mild [34].
The two angles θ * and Φ * describe the decay process of X, and they are evaluated as follows: where | X implies that all relevant physical quantities are measured in the rest frame of particle X. The overall azimuthal angle Φ * contains the information on the helicity state of X, but any detectable effect would be imprinted if there is interference among various helicity states. In the case where one does not include accompanying particles with X, the helicity of X is fixed by initial partons resulting in incoherent superpositions and does not have any useful information. On the other hand, we can understand the spin state of X from the polar angle θ * , although the detailed distribution in cos θ * also depends on the initial state partons because the set of possible helicity states of X is determined by the sum of helicities of the initial partons. Thus the determination of the helicity or spin of X with the Φ * or θ * variable is closely connected to the production mechanism through partons involved in the proton beam.
As we showed in Section 2, we can infer the polarization of V i from its decaying angles θ i , which is crucial information for understanding the coupling of X-V 1 -V 2 , and they are defined as follows: where the decay products of V 1 and V 2 are distinguished merely to avoid any potential notational confusion (see also Figure 2(c) for relevant decay products). Of various properties of resonance X, the CP state would be one of the highly non-trivial properties to be identified in collider analyses. Out of the six angular variables, the difference between two decaying planes of V 1 and V 2 , i.e., φ ≡ φ 1 − φ 2 , provides the strongest discriminating power between different CP states [28,33,38], 3 and this quantity is evaluated as In the rest of this study, we focus on the determination of the CP state of X assuming that X is a scalar S, as other properties such as the spin of X or the interaction to a longitudinal or transverse component of V i can be measured by other angular variables explained above. We remark that if there are interactions between S and the longitudinal polarization of V i through either a tree level coupling SV µ i V iµ or higher dimensional operator S D µ H † D µ H, we can distinguish them with those for CP-odd scalar (i.e., J P C = 0 −+ ) where it mostly interacts with the transverse polarization vector of V i . We therefore consider only higher dimensional operators of dimension 5, keeping the SM electroweak gauge symmetry SU (2) L × U (1) Y : where W a µν and B µν are field strength tensors of SU (2) L and U (1) Y , respectively, whileW a µν andB µν are their corresponding dual field strength tensors. After electroweak symmetry breaking, the couplings between S and mass eigenstate vector bosons can be described as where coupling constants can be written in terms of c Y , c W , and the Weinberg angle θ w as follows: As two coupling constants c Y and c W determine four decay modes of S, at least two decay channels should arise. For example, if the S → γγ channel is observed, one can expect to observe at least either of S → ZZ or S → Zγ channel as well. However, S → W + W − may not be available, as it depends only on c W which could vanish if S were SU (2) L -singlet. As briefly discussed before, φ plays an important role in determining the CP state of resonance S. In this sense, Zγ and γγ final states are irrelevant because they do not involve two decaying planes. In our numerical study, we focus on S → ZZ which subsequently decay semileptonically, i.e., qq + − . One reason for this choice is that the qq + − final state is expected to offer a better handle in inferring the underlying decay mode than the fully hadronic decay channel in which there exists non-negligible chance to misidentify observed events as S → W + W − due to the issue of jet mass resolution [27,40]. 4 Compared to the fully leptonic channel, the semileptonic channel surely enjoys higher statistics due to the larger branching fraction of Z into quark pairs, allowing us to expect better signal sensitivity. However, in a more realistic situation, this naive expectation is not straightforwardly applied. Once we take SM backgrounds into consideration, we are forced to impose severe cuts to suppress huge backgrounds including Z +j's so that we may end up with a similar order of sensitivity compared to the 4 channel. More specifically, it turns out that for m S 700 GeV, the signal sensitivity expected from the semileptonic channel becomes comparable to that from the fully leptonic channel [41,42]. Remarkably, the jet substructure techniques come into play in this high-mass regime. Note again that merged jets from major backgrounds contain a single quark together with additional QCD activities from radiation, whereas signal merged jet consists of two partons. Therefore, jet substructure techniques enable us to control and reduce SM backgrounds more efficiently.
On top of background rejection, we pro-actively utilize jet substructure methods to extract partonic information from a merged jet initiated by V i → qq. As explicitly demonstrated in Section 2, the procedures in the methods effectively restrict relevant phase space of final states, and in particular, the accessible region in θ i angles may be significantly affected. The coefficients for the differential distributions in φ are related to θ i in the narrow width approximation (NWA) as follows [28]: where we average contributions from different quark and anti-quark flavors as we cannot discern them. Here η defines a Lorentz boost factor as cosh η = M S /(2M Z ). Certainly, the above expressions imply that jet clustering procedures alter φ distributions by limiting θ i angles. If there were no restrictions on θ i , integrating θ i over the full ranges of (0, π) would give rise to differential distributions in φ as However, as we pointed out in the previous section, fixing the angular separation between relevant subjets results in shrinking accessible phase space with respect to θ i (see also eq. (2.5)), and therefore, to appropriately interpret outputs from any data analyses for discrimination the CP state of S, we need to have a solid understanding of relevant effects. We shall closely look at this observation in the next section, taking a couple of benchmark points (BP) with different jet size parameters in Cambridge/Aachen (C/A) algorithm [43,44]. The following benchmark points are chosen to cover different kinematical regions; one is for the moderately boosted Z and the other is for a highly boosted kinematics of Z.
• BP1 : M S = 750 GeV with a large jet size of R MJ = 1.2, • BP2 : M S = 1500 GeV with a decent jet size of R MJ = 0.6.
For the mass choice in BP1, we expect moderately boosted phase space in which the associated merged jet analysis becomes comparable to analyses based on a normal jet size because the efficiency for tagging a single merged jet with C/A of R = 1.3 becomes similar to that for tagging two ordinary jets with the anti-k t algorithm of R = 0.5 [42,45,46]. We find that typical Lorentz boost factors in the two BPs are large enough (e.g., cosh η 4.17 for BP1 and cosh η 8.33 for BP2) to simplify eq. (3.14) as Note that the second term in this expression differs from that of eq. (3.15) by a sign. Denoting two relevant coefficients by C 1 and C 2 , we have from which we find where we define the ratio of C 2 to C 1 as R φ . Hence, a better discriminating power is expected with a larger R φ . The expressions in eqs. (3.16) and (3.17) suggest that this ratio at the parton level without any restriction on the phase space converges to a quarter.
Naively, one would expect that this ratio becomes smaller as more realistic effects are taken into account. However, we will find in the next section that remarkably, phase space reduction by a fixed cone size renders R φ greater than 1/4, achieving better identification on the CP state.

Results with jet substructure techniques
In this section, we present our results with jet substructure techniques in conjunction with Monte Carlo simulation with various realistic effects such as parton shower, hadronization/fragmentation, and detector responses taken into account. To this end, we take a chain of simulation programs. We first create our model files using FeynRules [47] and plug them into a Monte Carlo event generator MadGraph5 [48] with parton distribution functions parameterized by NN23LO1 [49]. The generated events are further pipelined to Pythia 6.4 [50] for taking care of showering and hadronization/fragmentation, and to

Event reconstruction
As we discussed earlier, for our benchmark points belonging to a high mass regime, Z decay products are likely to be highly collimated. Denoting the angular distance between the two (fermionic) decay products as ∆R ff , its distribution develops a peak as shown in Figure 3, which is inherited from a Jacobian peak in the Z-boson transverse momentum distribution. The last statement can be understood by restricting eq. (2.4) into the transverse plane, i.e., where m ff is the invariant mass of two decay products. The minimum opening distance is obtained with the numerator set to be the mass of S Note that m ff follows the usual Breit-Wigner distribution around M Z so that some small fraction of events are found even below the expected minimum value 2 csc −1 (M S /(2M Z )) in the ∆R ff distributions exhibited in Figure 3. 5 Predicated upon this parton level assessment, we determine the isolation criteria for leptonic decay products of Z bosons and a jet size for clustering merged jets to capture hadronic decaying Z bosons into a single jet.  Table 1. Isolation parameters for each benchmark point to reconstruct an individual lepton from a Z boson decay.

Lepton isolation criteria
To reconstruct individual Z boson-induced leptons without any confusion with heavy flavor quark-induced leptons, we require the following isolation criteria: where is a candidate for an isolated lepton, and i's are any particles in the vicinity of the lepton candidate which satisfies Isolation parameters for each benchmark point are tabulated in Table 1 which are the standard parameters [54,55] except R iso . We choose R iso to have an isolated lepton according to Figure 3.  Table 3 for a given merged jet size. A jet size R MJ is chosen such that tagging efficiencies with different R MJ 's remain unchanged.

Tagging a merged jet
We begin with applying C/A algorithm to cluster particles from a hadronically decaying Z boson. As this is a sequential recombination algorithm based on the angular separation between two objects, it is useful for us to access sub-clusters by the angular order, in particular, to evaluate the φ angular variable. The algorithm combines two objects, which have the smallest angular distance, by adding up their momenta. This combining process continues until every clustered object is isolated from the others by an angular distance R MJ . Here R MJ defines the jet size for a merged jet in the C/A algorithm. In language of the k T algorithm [56], the C/A algorithm is equivalent to the sequential clustering with a metric between objects, where B denotes the beam line. 6 For each iteration, two objects which have the smallest d ij are combined. If d iB is smallest, the object i is promoted to a C/A jet and escapes from clustering. The iteration terminates if all objects are identified as jets. After a completion of clustering, we can obtain an angular hierarchy of sub-clusters by simply rewinding the clustering procedure. We then match sub-clusters to partons from a Z gauge boson, imposing relevant cuts to reduce the possibility of mistagging QCD jets as a Z-induced jet. For this purpose, we employ the Mass Drop Tagger (MDT) [13] whose procedure is briefly summarized below. The MDT essentially traces back the clustering sequence of a C/A jet and attempts to find subjets satisfying the symmetric conditions.
(1) Clustering: We cluster energy deposits in calorimeters using the C/A algorithm of a jet radius R = R MJ .
(2) Splitting to look into a substructure: We rewind the last clustering sequence of a jet j, labelling two subjets as j 1 and j 2 with m j 1 > m j 2 .
(3) Checking symmetric conditions: Major backgrounds in our case would be Z(→ + − ) + js where a quark-initiated jet appear as a merged jet. In this case, the most of the energy deposits are inclined to be localized along the momentum direction of the initial quark, so that there is a high chance of unbalanced energy sharing between two subjets including mass and p T . In contrast, a signal MJ consists of two prongs (i.e., two quarks) that ensure democratic energy sharing in two subjets. To quantify this difference, the MDT demands an upper bound µ * and a lower bound y * on µ and y respectively.
This procedure is useful to discriminate prongs in subjects from soft showering, on top of reducing backgrounds. If subjets do not satisfy above criteria, the MDT procedure redefines j 1 as j and repeats the rewinding procedure in (2).
Once the MDT tags a signal MJ and locates two prongs in the MJ, it decontaminates QCD corruptions in subjets by reclustering energy deposits in the MJ again with the C/A algorithm of a small radius jet size R filt , (4) Filtering: We reculster constituents of an MJ with the C/A algorithm of radius, to find n new subjets {s 1 , s 2 , · · · , s n } ordered in descending P T . Here R * is the maximum of allowed size of subjets to minimize the QCD contamination. The MDT takes into account an O(α s ) correction from hard radiation, by allowing up to three subjets in redefining an MJ as Assigning subjets to prongs from Z: If we have only two subjets {s 1 , s 2 }, we take these two subjets as two particles from a Z boson. In the case where we have three subjets {s 1 , s 2 , s 3 }, we merge s 3 with other subjet s i which has the smaller angular distance from s 3 . By this merging process, we identify subjets {j 1 , j 2 } in an MJ as We summarize parameters of the MDT procedures for two benchmark points in Table 2. As a MDT procedure has a cut y * on the phase space, we expect certain effects on the angular distributions in return as the cone size of a merged jet restricts the polar angles of decaying particles from Z bosons. Since a jet clustering procedure with the MDT is a key process to recover the parton-level information from the corresponding reconstruction-level information, we investigate effects from the MDT to understand phase-space distortion in reconstruction-level analyses.

Phase-space distortion from a jet substructure
As discussed earlier, constructing a merged jet to capture partons from the decay of a heavy (boosted) particle often accompanies cuts to suppress the rate to misidentify an ordinary QCD jet as a merged jet. In the MDT procedure, symmetric cuts µ * and y * are utilized to reduce single-prong jets from QCD backgrounds. While the µ * cut does not give any strong restriction on signal MJs, the y * cut may provide a limit on the phase space in subjets from a Z boson. Suppose that the softer subjet j 2 carries away z fraction of the total momentum, i.e., zP T (MJ) . We then find that symmetric cut y in eq. (4.6) can be expressed as where in the first step we make use of eq. (2.1) in the limit of P T (MJ) m MJ . A similar expression is available for the harder subjet j 1 which takes away the momentum of (z − 1)P T (MJ) . Solving the two inequalities for z (one from eq. (4.10) and the other from the corresponding one for j 1 ), we find that for a given y * , the momentum sharing z should be confined to a region defined as which, in turn, restricts the angular distance between the two subjets in terms of y * , To develop our intuition of the effects from this restriction, we put symmetric conditions of MDT to parton-level simulation data for S → ZZ → qq + − . In a parton-level simulation, only y * cut in eq. (4.6) becomes effective. Thus we apply a symmetric cut y * to two quarks from Z boson. We then plot distributions of ∆R qq with the events whose y value is greater than a certain y * . Figure 4 shows those distributions for three different y * values (red solid histogram for y * = 0, blue dotted histogram for y * = 0.09, and green dotted histogram for y * = 0.3) with M S = 750 GeV (left panel) and M S = 1500 GeV (right panel). We impose a basic cut of P T (MJ) > P * T (MJ) = 0.40 M S as well, which is well-motivated in the sense of reducing backgrounds by focusing on the central region.
We clearly observe that as we increase the y * cut, more phase space with large ∆R qq is removed. This implies that even though we begin with a sufficiently large cone size R MJ  est is an estimated distance between subjets evaluated from eq. (4.12) with a symmetric cut y * of BDRS.
to retain most of phase space as in Table 2, the y * cut in the MDT procedure effectively restricts the available region of ∆R qq below R (y * ) est . Moreover, we find that ∆R qq is smaller than typical choices of R MJ , for example, ∆R qq ≤ R which are marked by blue arrows in Figure 4. The resulting restriction on cos θ (i.e., the Z rest-frame polar angle of the harder quark relative to the Z boost direction) can be derived from eqs. (2.5) and (4.12): where the approximation is valid up to the second order in (R MJ /2). Thus as far as R MJ is larger than R (y * ) est , the cone size R MJ does not have direct effect on the phase space deformation compared to cuts from jet substructures, which are introduced to reduce background QCD jets. Our Monte Carlo study indeed confirms this observation. In Figure 5, we contrast the φ distributions at the parton level with those at the detector level. For parton-level distributions, we restrict the angular distance between two quarks from a Z boson decay by two different upper bounds of ∆R qq for each benchmark point. In the case of M S = 750 GeV (upper panels), the two upper bounds are chosen to be 1.0 (red lines) and 1.4 (blue lines) to have R MJ = 1.2 between them. Similarly, in the case of M S = 1.5 TeV (lower panels), they are chosen to be 0.6 (red lines) and 1.0 (blue lines). We clearly see that φ distributions depart further from the theory expectation (solid black curves) with the smaller cone size ∆R qq , whether the resonance is CP-even (left panels) or CP-odd (right panels). When it comes to detector-level analyses, however, once we introduce a fairly hard y * cut resulting in R (y * ) est < R MJ , the above-discussed parton-level effect simply disappears. Corresponding dotted lines in Figure 5 clearly support our expectation that final φ distributions are not much different even with different R MJ values. 7 We also understand this point from "constant" MJ-tagging efficiencies even with different C/A jet sizes in Table 2. Although we vary the size of MJs with different R MJ values, the overall cut on the angular distance of a quark pair is determined by R (y * ) est , allowing us to have a "stable" MJ-tagging rate.
Another important message from this series of exercises is that the impact of analyses cuts upon detector-level reconstructed objects is in more favor of our goal of discriminating CP states, unlike typical expectations in detector-level data analyses. More specifically, the difference of φ distributions between CP-even and CP-odd cases appears enhanced even after incomplete integrations over angular variables such as polar angles θ i in eqs., (3.14) and (3.15). This enhancement overcomes the adverse effects of detector resolution which often degrade subsequent data analyses.

Analysis with matrix element methods
In this section, we discuss further analyses with matrix element methods using fourmomentum information of subjets obtained by the jet substructure technique delineated in the previous section. We begin with a general overview for CP state discrimination with various measures, followed by matrix element methods and our main results with them.

Determining the CP property
To deal with experimental systematic errors properly and maximize distinctive asymmetric features between the φ differential distributions for CP-even and CP-odd resonances, a simple measure A φ has been introduced [57] for the S → ZZ → 4 channel: where N simply denotes the number of events. One may make use of the below-defined cumulative probability over A φ as a measure to determine unusualness for any observation A obs φ under the given hypothesis, where P implies the associated probability. An alternative method to obtain a probability density function (pdf ) is the kernel density estimation (KDE). One may estimate pdf from simulated data, and use the estimated function f KDE in performing a log-likelihood ratio test as to obtain the most powerful test between two simple hypotheses at a given significance level α according to the Neyman-Pearson lemma [58,59]. The pdf P (q φ |0 PC ) for a test statistic q φ with a given hypothesis 0 PC and the given number of events N evt is calculated from a huge number of pseudo-experiments which are generated with hypothesis 0 PC . The corresponding cumulative probabilities based on q φ are However, the above approaches, which are based on φ distributions, rely on the projection of our observed momenta of visible particles, {p reco } = {p j 1 , p j 2 , p − , p + }, into a single angular variable φ. Although in our study, phase-space reduction by cuts in jet substructure methods can enhance the difference between two CP hypotheses as we have observed in the previous section, it does not guarantee whether this projection attains the best sensitivity in cases where there exist at least three correlated angular variables as in eqs. (3.14) and (3.15). In the next section, we instead directly convert the observed momenta into a probability under a given model hypothesis. We then utilize this probability as a likelihood ratio test between different hypotheses, the CP state of a scalar resonance S in our study.

Matrix Element Method
As briefly mentioned in eq. (1.2), the probability based on the matrix element in a given hypothetical process α is given by where f p i (x i ) is a parton distribution function of parton p i inside the beam with a fractional energy of x i . Π q i describes the phase space of parton-level particles q i which are related to observed momentums {p reco } of corresponding particles. If detectors were perfect, such a relation would be trivial. However, as instrumental effects including detector smearing and responses become important factors in precise measurements, transfer functions W (q i , {p reco }) are introduced to map the information from reconstructed particles to the parton-level input for MEM by modelling energy smearing, in particular, effects in jet reconstruction stemming from showering, hadronization/fragmentation, and jet energy scales with gaussian functions that were obtained in the course of understanding top-quark properties in the Tevatron experiments [60][61][62][63][64][65]. In our study, we take a conservative approach for which we set W (q i , {p reco }) to be a delta function as we are not aware of precise information on detector responses. We instead model a pdf based on the reconstruction level distributions as we describe below. We remark that for the case at hand, all kinematic information can be restored with measured four-momenta of visible particles, meaning that the x i 's in parton distribution functions become fixed. Hence, the probability evaluated from a matrix element can be simplified as follows: We then decompose the matrix element into the production part p 1 p 2 → S and the decay part S → j 1 , j 2 , + , − through a narrow width approximation (NWA) which is valid as long as the decay width Γ S of resonance S is negligible compared to its mass M S . We also note that S is a scalar particle so that any helicity connections with partons in production part are disconnected unlike higher spin cases [66]. Therefore, we have the ratio of probabilities with different CP hypotheses 0 PC as where we dropped the common parts involving a production mode. Here we use the fact that cross sections σ 0 PC are fixed by the observed value. There is a subtlety in calculating a matrix element M as the current jet algorithms cannot specify the charge or flavor for light quarks. In order to deal with this issue, we symmetrize a matching between subjet (j 1 , j 2 ) and (q,q) as which alters the above-given probability ratio to the symmetrized ratio called a kinematic discriminant (KD) Predicated upon the KD, we construct two pdf 's P 0 PC (KD) = P (ln KD|0 PC ) with a large number of reconstructed events for each hypothesis that are prepared with Monte Carlo simulation at the detector level, ensuring the consideration of various experimental effects.
Assuming that each pseudo-experiment is independent and identically distributed, we set two likelihoods L(0 PC ) with the fixed number of events N evt Corresponding test statistic q M is defined as the log-likelihood ratio, A pdf of P (q M |0 PC ) for a test statistic q M with a given hypothesis 0 PC and a given number of events N evt is calculated from a huge number of pseudo-experiments. Cumulative probabilities based on q M are given by 14)  Table 3. Event selection criteria and corresponding efficiencies for each benchmark point. As rapidities of final particles do not depend on the CP states of a spin 0 particle, the efficiencies in different CP states of S are the same.

Results
We finally present our main results on distinguishing CP-even and CP-odd states in this section, comparing three methods, two with angular variables A φ and φ , and the other with an MEM-based variable. To maximize relevant performances, we first construct pdf s for test statistics in both methods based on the log-likelihood ratio. In our analyses we do not consider backgrounds since (1) we compare the performance of each method in the best case, and (2) background subtraction can be performed with s Plot [67]. We rather focus on studying effects from cuts to reduce backgrounds. Detailed information on potential backgrounds and recipes to take them into consideration in KD-based analyses shall be provided in Appendix A, and we simply continue our discussion here, taking the dominant reducible backgrounds. As discussed earlier, the main SM backgrounds to two leptons plus a single MJ are Z(→ + − ) + js where a QCD jet can mimic a merged jet by dressing up a mass due to QCD contaminations [41,46,[68][69][70]. Obviously, the resonance mass window cut is useful to suppress backgrounds, i.e., the invariant mass formed by a merged jet and a lepton pair should fall into the range around the mass of S. We set a different mass range in each benchmark point to consider effects of smearing. To reduce backgrounds further, it is noteworthy that for a signal event, a merged Z-jet and a di-leptonic Z are typically symmetric since they originate from a single resonance, whereas for a background event, the corresponding objects are asymmetric because a quark-initiated jet and Z are expected to have a sizable mass gap between them. This observation motivates us to introduce a P T asymmetric variable y ZZ [69] that is expected to be a reasonable choice to reduce Z + js backgrounds: Cut-efficiency flows for our benchmark points are summarized in Table 3. Imposing those cuts on Monte Carlo event samples, we conduct posterior analyses to determine the CP state of S using the pdf s from three methods listed below. We present our results for M S = 750 GeV in Figure 6 and M S = 1500 GeV in Figure 7: A φ in the first row, q φ in the second row, and q M in the third row. The first two columns show their distributions under 0 ++ hypothesis (red histogram) and 0 −+ hypothesis (blue histogram) with different numbers of events N event = 10 (first column) and N event = 50 (second column).
In discriminating different hypotheses 0 ++ and 0 −+ with a test static X, we calculate p 0 −+ (X obs ; X) to reject the 0 −+ hypothesis in favor of 0 ++ (Type I error α) [29], exhibiting "Brazilian" plots in the third column, i.e., 1σ (green) and 2σ (yellow) bands around the peak in p 0 ++ (X obs ; X) according to the number of required events to separate the two hypotheses. 1 We clearly observe that the most CP-sensitive method is to utilize a test static with the MEM-based log-likelihood ratio. In our study, we find that the separating power of the MEM-based method remarkably does not change much with resonance mass M S , providing its robustness over a wide range of mass space. In BP1 of M S = 750 GeV representing the moderate boost region, the required number of events to have Type I error in the level of 3σ is N (3σ) event = 18 +14 −10 within 1σ deviation for 0 ++ hypothesis. For BP2 of M S = 1.5 TeV representing highly boosted region, the corresponding required number of events is N (3σ) event = 21 +17 −12 . This can be understood in terms of the distortion in the differential distribution in φ as it is the most crucial angular variable encapsulated in the MEM to determine the CP state. If we neglect restriction on the phase space of leptons as lepton isolation R iso is larger than the minimum angular distance ∆R ff in eq. (4.2), the most relevant one is from the phase-space reduction in the jet clustering procedure as in eq. (4.15). The corresponding coefficient ratios R φ , which are defined in eq. (3.21), for the two BPs are As the shapes in φ distributions between the two benchmark points become similar after MDT procedures, we anticipate that N (3σ) event for both BPs are of same order, accordingly.

Conclusion
In this paper, we illustrated that the Matrix Element Method (MEM) could be a powerful method for identifying properties of a new particle in the final state with two jets and two leptons via a pair of SM gauge bosons. As a concrete example, we focused on discriminating CP property of a spin-0 resonance which decays into a pair of SM gauge bosons. A substantial mass gap between the new heavy particle and SM gauge bosons often results in collimated decay products of SM gauge bosons, which brings us a few challenges in applying the MEM for measuring the CP property. The prescription and the focus in our paper are respectively as follows;  Figure 6. We compare performances among various methods for the BP of M S = 750 GeV with R MJ = 1.2, the method based on A φ variable of eq. (5.1) in the first row, a log-likelihood ratio test based on the φ angular distributions from eq. (5.4) in the second row, and a log-likelihood ratio test based MEM of eq. (5.13) in the third row. For the statistics, we generate 5×10 6 pseudo-experiment.
• to reduce combinatorial issues and to have a systematic approach, we adopt a merged jet to capture two quarks from the decay of an SM gauge boson, and • we studied effects from jet clusterings and associated jet substructure methods on the phase space of visible particles.
A certain degree of prejudice in performing data analyses with detector-level reconstructed particles is typically expected in comparison with relevant theoretical expectations. However, our study based on both analytical calculations and Monte Carlo simulation demonstrated that restrictions on the phase space invoked by jet clustering procedures could enhance the difference in angular distributions for new particles with different CP states, unlike the naive expectation stated above. We also showed that the performance of our data analyses does not significantly depend on the size of MJs, as internal cuts in jet grooming procedures affect the phase space for visible particles stronger. We believe that our finding here benefits the determination of a reasonable size of MJs to make a balance between analysis performance and enhancement of the ratio of signal-over-background.
In our analyses with the MEM, we refrained from integrating partonic phase space through transfer functions which map reconstructed objects to the partonic phase space. We rather modeled a probability density function (pdf ), generating many pseudo-experiments with Monte Carlo simulation for a given signal hypothesis. This procedure can take into account various effects from jet clustering procedures together with detector effects as well as offer computational advantages compared to the situation where 2 N vis -dimensional integration is required in dealing with transfer functions. within M S ± 50 GeV 0.037% 0.82% 0.68% Cross section (σ) -3.16 fb 0.0671 fb 0.0609 fb Table 4. Cut flows for major backgrounds of BP1. In a signal region using a merged jet, Z + jets becomes the dominant background compared to reducible electroweak process of pp → ZZ.
Two benchmark points were selected to cover various phase space regions from the moderately boosted one to the highly boosted one. According to our numerical studies, discriminating different CP states requires O (20) signal events at the level of 3σ significance over the mass range of a new particle from over 700 GeV that the current LHC has the equal level of sensitivity in a merged jet with dileptons compared to a full leptonic channel [41].  Table 4.
To estimate contributions from above backgrounds, we perform LO Monte Carlo simulation for ZV and Z + js. The cut flow is shown in Table 4. In Figure 8 we show m (MJ, + , − ) distribution for the backgrounds and signal after all the cuts except m (MJ, + , − ) in Table 4. We find the Z + js far dominates ZV background.
To include the background, the pdf of a set of discriminator variables {x} for signal and background hypotheses can be written as where r x = σ x /σ total as we build f from amplitudes. The best discriminator is the momenta {p} themselves, and we need matrix elements of signal and background to convert the momenta into a probablity of observing them under a given hypothesis. Then, the likelihood ratio q M is defined by with R σ = σ bkg /σ sig . In cases where we control backgrounds efficiently so that R σ < 1, we use the following approximation In Figure 9 we show the discrimination power including the background estimation. Comparing with the similar plot ( Figure 6 second row) but without background contamination, more events are required to discriminate the CP property of the scalar. Therefore, applying new techniques to reduce the background would help to probe the properties of the new particles.  Figure 9. Similar results to Figure 6 for BP of M S = 750 GeV with backgrounds. We used zeroth order q M about R σ . We assumed that N signal = N bkg .
As the main background is from Z + js where a faked fat jet is from a QCD jet, to include the background in the MEM, an analytic matrix element of Z + js should be provided after the MDT. The leading order and NLO results are shown in Refs. [71,72]. However, it would be difficult to go beyond next-to-leading logarithmic (NLL) accuracy since the original definition of MDT has non-global logarithms (NGLs) [71,72]. While a modified Mass Drop Tagger (mMDT)/soft drop [16,71,72] have been proposed, and NNLL accuracy has been shown recently [73]. However, this is beyond the scope of the paper where a simple MDT is used, so we ignore the discussion of the MEM including backgrounds.

B Phase space restriction from other jet methods
Besides the MDT, there are also some jet methods widely used in the literature. In the following, we give some comments on how the jet methods affect the phase space.
• Trimming: The trimming procedure [15] uses a k t algorithm to divide the whole jet into some subjets with a size R sub , and then removes the subjets with p T i /p jet T < f cut , where f cut is a parameter and p T i is the transverse momentum of the i-th subjet. The remained subjets are reclustered as a trimmed jet. Similar to the MDT, the f cut give a cut on the p T fraction of subjet z > f cut . Like eq. (4.12), it effectively reduce the cone size of fat jets.
• Pruning: The pruning method [14] uses the C/A or k t algorithm to cluster the jets.
At each recombination step 1, 2 → p, either min(p T 1 , p T 2 )/p T > z cut or ∆R j1,j2 < m J /p T J needs to be satisfied. Interestingly, both cuts give some upper limit on the cone size of fat jets.
• N-subjettness: N-subjettiness [23,24] is defined as  where d 0 = Σ k p T k R 0 . For a two prong fat jet, usually τ 21 = τ 2 /τ 1 is computed, and an upper limit is used. Since τ 2 = 0 at the parton level, it is better to check the effect at the reconstruction level. In Figure 10 we show the ∆R qq between two partons from a Z boson decay after τ 21 cuts. The τ 21 cut reduces all the ∆R qq region, so it could be taken as an independent cut after jet grooming (MDT, trimming, pruning).