Jet substructure and probes of CP violation in Vh production

We analyse the hVV (V = W, Z) vertex in a model independent way using Vh production. To that end, we consider possible corrections to the Standard Model Higgs Lagrangian, in the form of higher dimensional operators which parametrise the effects of new physics. In our analysis, we pay special attention to linear observables that can be used to probe CP violation in this vertex. By considering the associated production of a Higgs boson with a vector boson (W or Z), we use jet substructure methods to define angular observables which are sensitive to new physics effects, including an asymmetry which is linearly sensitive to the presence of CP odd effects. We demonstrate how to use these observables to place bounds on the presence of higher dimensional operators, and quantify these statements using a log likelihood analysis. Our approach allows one to probe separately the hZZ and hWW vertices, involving arbitrary combinations of BSM operators, at the Large Hadron Collider.


Introduction
Both before and after the discovery of a new resonance at the Large Hadron Collider (LHC) [1,2], much attention has been focused on how to efficiently determine its spin and couplings. Deviations from Standard Model (SM) behaviour would signal the presence of new physics beyond the Standard Model (BSM), and there are significant motivations for expecting such deviations to be present at some level, not least because new physics is expected to explain or clarify the nature of electroweak symmetry breaking. There are two main approaches for addressing BSM corrections to the Higgs sector. The first is to postulate the existence of a specific theory, and analyse how the particle content leads to corrections to SM observables. This approach must necessarily be used for collider experiments whose energy exceeds the lowest energy scale associated with the new physics (e.g. a new particle mass). The second possibility is to use effective field theory techniques to write down possible corrections to the SM Lagrangian in the form of additional operators, which ultimately arise from integrating out the new degrees of freedom in a particular BSM model. One may systematically classify these operators according to their mass dimension, such that higher-dimensional ones are suppressed by increasing powers of the new physics scale. For a given mass dimension, there is a finite independent set of possible operators. By including all of these (in a chosen basis), one allows for the most general corrections to the SM. This approach has the benefit of being completely model-independent, but at the price of being applicable only for energy scales which are below the lowest new physics scale. This is a reasonable assumption to make, given that current studies (such as those referred to above) appear to show only small deviations, if any, from the Standard Model.
In this paper, we focus on the coupling of the Higgs to vector bosons V = W, Z. The operators relevant for these interactions have been classified in [44-46]. It is important to understand that bounds derived on the hZZ vertex do not automatically translate to bounds on the hWW vertex. For example, as argued in ref. [38], violation of custodial symmetry can arise naturally in new physics models. While higher dimension operators may be constrained from precision tests as well as Higgs rates [41,47,48], the constraints depend on various assumptions. Unambiguous and definitive constraints can only be determined by directly probing the nature of the hZZ and hWW vertices separately. In order to determine whether or not the higher dimension operators are present in nature, one must study various scattering processes that involve the hZZ and hWW vertices. The decay of the Higgs boson to Z boson pairs at the LHC has been studied in [13,14,16,24,49-54], which focused on the fully leptonic decay channel. Combined with LHC data, this disfavours the possibility that the recently discovered boson is purely pseudoscalar at ~2-3σ significance [55-59]. The decay of the Higgs to W boson pairs is more difficult in principle, due to the limited kinematic resolution inherent in having missing energy in the final state. This mode has been investigated in [13,15,19,21,60,61]. As ref. [15] in particular makes clear, the kinematic cuts used to select events in this case may overly diminish the signal for BSM effects.
Another possibility is to study the production of the Higgs boson via vector boson fusion, and angular observables exist for distinguishing various BSM scenarios [62-65]. However, a deficiency of this mode is that it is not possible to unambiguously separate BSM contributions to the hWW and hZZ vertices. Furthermore, ref. [26] argued that the momentum dependence associated with an anomalous hVV vertex can have a dramatic effect on the rapidities of the quarks that emit the vector bosons, and consequently on the acceptance of the event selection cuts.
Given the above difficulties, the possibility has been explored of using a future electron-positron collider, such as the proposed International Linear Collider (ILC) or equivalent [66-70]. The different BSM corrections have different CP properties, which manifest themselves in different angular distributions of the decay products of the Higgs and associated particles. An $e^+e^-$ collider can explore this in detail using polarised beams. In addition, the partonic centre of mass energy is known precisely in such a collider, and one may distinguish different contributions to the VVh vertices using the fact that they lead to different power-like growths of associated Higgs production cross-sections near threshold [66]. However, it is still not possible to unambiguously determine BSM corrections to the WWh and ZZh vertices separately at such a collider (footnote 1). Whilst this is possible at the LHeC (a proposed $e^-p$ facility) [71,72], it is clearly advantageous to use the LHC itself to achieve this.
In this paper, we show that one can indeed distinguish the presence of higher dimensional operators at the LHC, using the associated production of a Higgs with a vector boson (Vh production), where the Higgs boson decays to a pair of b quarks. For many years, it was thought to be impossible to analyse this mode, due to the presence of large QCD backgrounds. This situation has changed due to the development of jet substructure techniques, as pioneered in [73]. By requiring the Higgs boson to be boosted, the b quark pair from its decay will be approximately collinear. One may then distinguish the boosted Higgs signal by looking for a fat jet, containing two smaller subjets (modulo a filtering procedure) each of which reconstructs the b mass. Subsequently, a number of approaches for utilising jet substructure have been developed [74-78], together with analytic understanding [79,80] and applications in experimental analyses [81-91]. As the present authors already pointed out in [27], reconstructing both the Higgs momentum and the associated vector boson opens up the use of polarisation-related methods for Vh production, analogous to those used in the $e^+e^-$ studies mentioned above: the spin state of the associated vector boson is influenced by the presence of higher dimensional operators in the Higgs sector, so that angular observables involving the vector boson decay products can be used to constrain BSM physics. Furthermore, this can be done separately for the Zh and Wh channels, allowing one to independently elucidate the nature of the hZZ and hWW vertices.
The structure of our paper is as follows. Throughout the remainder of this introduction, we discuss the framework we are using for higher dimensional operators in more detail. In section 2, we describe the details of our simulations and the selection cuts. We also make a note of higher order effects in both Wh and Zh production and describe kinematic reconstruction issues. In section 2.6 we describe the increased sensitivity to non-SM couplings of the hVV vertex, mentioned earlier. In section 3 we construct and describe various angular observables that are able to discriminate the different non-SM couplings of the hVV vertex. In section 4 we construct likelihoods out of various observables and estimate the required luminosity for the 14 TeV LHC to constrain the anomalous hVV couplings. In section 5 we construct a CP-odd asymmetry that is linearly sensitive to the CP-odd coupling and hence to CP violating effects in the hVV vertex. Finally in section 6 we summarize and conclude.

Higher Dimensional Operators
As already mentioned in the introduction, one may encapsulate the structure of BSM physics in a model-independent way by adding higher dimensional operators to the SM Lagrangian. One starts by classifying all possible higher-dimensional operators that can serve as corrections to the Standard Model Lagrangian and that are gauge-invariant, an exercise which was first carried out in [44,92,93]. There is a single dimension five operator which, after electroweak symmetry breaking, is responsible for neutrino masses and mixings. One is then motivated to proceed to dimension six operators, of which there are many; ref. [44] lists over a hundred. However, not all of these are independent, as one may use equations of motion to relate them. To this end, ref. [46] showed that there are 59 independent operators. Reference [45] expressed these in a basis more directly suited to Higgs boson physics, and also discussed how the coefficients of these operators scale differently if electroweak symmetry breaking is weakly or strongly coupled. Effective operators for a hypothetical spin one or spin two Higgs boson have been presented in [94], which also discusses their implementation in a computational framework inclusive of next-to-leading order matrix element corrections and parton shower effects. A pedagogical review of the literature may be found in section 2 of ref. [95].

Footnote 1: It is not easy to study the anomalous WWh couplings via the process $e^+e^- \to \nu\bar{\nu}h$ since there are large irreducible backgrounds to this process and a high degree of beam polarization as well as measurements of the polarization of the final states are required. See ref. [68] for details.
Since our objective is to study the Lorentz structure of the hVV vertex for massive gauge bosons, we will only concern ourselves with a subset of the operators. It is sufficient to consider the following three operators:

Here $\Lambda$ is the scale of new physics and the multiplicative Wilson coefficients $b_{WW}$, $c_{WW}$ and $b_{hW}$ parametrize the relative strengths of these operators. Let us write the coupling of a scalar state to two vector bosons in the form $i\Gamma^{\mu\nu}(p, q)\,\epsilon_\mu(k_1)\,\epsilon^*_\nu(k_2)$, where $p = k_1 + k_2$, $q = k_1 - k_2$ and $\{\epsilon_\mu(k_i)\}$ ($i = 1, 2$) are the polarization vectors of the two gauge bosons. Then the vertices arising from adding the above operators to the SM are (V = W, Z):

Here $(\alpha_V, \beta_V, \gamma_V)$ are effective coupling strengths and $\epsilon_{\mu\nu\rho\sigma}$ is the completely antisymmetric tensor. At tree level in the SM $\alpha_V = 1$ and $\beta_V = \gamma_V = 0$. The parameter $x = 1$ when V = W and $x = \cos\theta_W$ when V = Z. Note that while the terms with couplings $\alpha_V$ and $\beta_V$ are CP-even, the term with coupling $\gamma_V$ is CP-odd. The terms with couplings $(\beta_V, \gamma_V)$ may be generated within the SM at higher orders of perturbation theory, although the resulting couplings are likely to be very small. Significantly large values of these couplings would be a signal for BSM physics. Such a vertex may be generated by including higher dimensional operators in the Standard Model Lagrangian.
The parameters entering the above vertex are

and

for the Z and W boson cases respectively. Here we have introduced a factor $(1 + a_V)$ in front of the SM Lagrangian in each case, such that $a_V \neq 0$ denotes the presence of BSM corrections. In the effective Lagrangian approach, one should ideally set $a_V = 0$. However, several models of BSM, including the MSSM, predict Higgs couplings that have the same tensor structure as the SM, but a different strength. Setting $a_V \neq 0$ then allows for a study of such a possibility. In the rest of this work we use the following rescaled dimensionless couplings:

For completeness, we write down the hZZ and hWW vertices in terms of these couplings below.
One should note that, since we will always consider this vertex in processes where the V bosons are connected to external fermions, the terms ) vanish due to current conservation and we will not consider them any further. Note that the extra factors of cos

The aim of this paper is to perform a detailed study of angular observables designed to distinguish the three contributions to each hVV vertex: SM, BSM CP even, and BSM CP odd, building upon the preliminary study of [27]. We discuss the details of our analysis framework in the following section.

Event Simulation and Selection
We consider Vh production (V = Z, $W^\pm$), where the V decays leptonically, and the h boson decays to a $b\bar{b}$ pair. Further, we use jet substructure techniques not just to suppress QCD backgrounds but also to reconstruct the parton-parton centre-of-mass frame for Vh production. In this section we describe the tools and methods used for our analysis, including the selection cuts utilised for both Wh and Zh production. We simulate all processes using MadGraph5 [96], having implemented the effective Lagrangian in FeynRules [97,98]. The output is interfaced with Pythia6 [99] for showering and hadronization. We use the 'Z2Star' tune for Pythia6, including initial and final state radiation along with effects of multiple interactions, and use the CTEQ6L1 parton distribution functions [100]. We use FastJet [101] to cluster jets.
Our selection cuts for these signal processes are as follows.

Zh production
For Zh production we require:
1. A fat jet of radius $R = \sqrt{\Delta y^2 + \Delta\phi^2} = 1.2$ and transverse momentum $p_T > 200$ GeV. After applying the mass drop and filtering procedure of [73] on this fat jet, we require no more than three subjets with $p_T > 20$ GeV, $|\eta| < 2.5$, and radius $R_{\rm sub} = \min(0.3, R_{bb})$, where $R_{bb}$ is the separation of the two hardest subjets, both of which must be b-tagged. In addition, we also require that the invariant mass of this jet system reconstructs the Higgs mass in the range 110-140 GeV.
2. Exactly 2 leptons (transverse momentum $p_T > 20$ GeV, pseudo-rapidity $|\eta| < 2.5$) of same flavour and opposite charge, with invariant mass within 10 GeV of the Z mass $m_Z$. These should be isolated: the sum of all particle transverse momenta in a cone of radius $R = 0.3$ about each lepton should not exceed 10% of the lepton transverse momentum.

Table 1 (caption fragment): V+jets corresponds to the Z+jets background for the Zh process and W+jets for the Wh process. For the last three columns the SM contribution was set to zero and each of the values of $b_{V1}$, $b_{V2}$, $c_V$ was set to reproduce the SM total cross-section before applying cuts.
The first selection requirement listed above is used to reconstruct the decaying Higgs. The requirement for a fat jet with large transverse momentum means that we are looking at events with a highly boosted Higgs. Note that by allowing for a third hard subjet inside the fat jet, the procedure accommodates an extra jet, in addition to the two b-jets, arising from gluon radiation off the b quarks. The second requirement allows for reconstruction of the Z boson, where the isolation criterion removes most of the $t\bar{t}$ and QCD background. The third requirement ensures that the Higgs and the Z boson lie in almost the same plane of production as is expected for the signal.
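The first two requirements can be illustrated with a minimal sketch, assuming that jets and leptons have already been reconstructed upstream (for instance with FastJet). The classes `Lepton` and `Subjet`, the function names and the value of `Z_MASS` are hypothetical and introduced only for illustration; leptons are treated as massless, and the isolation sum is assumed to be precomputed. This is not the analysis code used for the results quoted here.

```python
import math
from dataclasses import dataclass

Z_MASS = 91.19  # GeV; assumed value, not quoted in the text


@dataclass
class Lepton:
    pt: float           # transverse momentum in GeV
    eta: float          # pseudo-rapidity
    phi: float          # azimuthal angle in radians
    flavour: str        # "e" or "mu"
    charge: int         # +1 or -1
    cone_pt_sum: float  # summed pT of other particles within R = 0.3


@dataclass
class Subjet:
    pt: float
    eta: float
    b_tagged: bool


def four_momentum(lep):
    """(E, px, py, pz) of a lepton, treated as massless (an approximation)."""
    return (lep.pt * math.cosh(lep.eta),
            lep.pt * math.cos(lep.phi),
            lep.pt * math.sin(lep.phi),
            lep.pt * math.sinh(lep.eta))


def pair_mass(a, b):
    """Invariant mass of a lepton pair."""
    e, px, py, pz = (x + y for x, y in zip(four_momentum(a), four_momentum(b)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def passes_z_selection(leptons):
    """Requirement 2: exactly two isolated same-flavour, opposite-charge leptons."""
    hard = [l for l in leptons if l.pt > 20.0 and abs(l.eta) < 2.5]
    if len(hard) != 2:
        return False
    a, b = hard
    isolated = all(l.cone_pt_sum <= 0.10 * l.pt for l in hard)
    return (isolated and a.flavour == b.flavour and a.charge == -b.charge
            and abs(pair_mass(a, b) - Z_MASS) < 10.0)


def passes_higgs_selection(fat_jet_pt, subjets, candidate_mass):
    """Requirement 1, applied to the output of mass drop and filtering."""
    hard = sorted((s for s in subjets if s.pt > 20.0 and abs(s.eta) < 2.5),
                  key=lambda s: s.pt, reverse=True)
    if fat_jet_pt <= 200.0 or not 2 <= len(hard) <= 3:
        return False
    return (hard[0].b_tagged and hard[1].b_tagged
            and 110.0 <= candidate_mass <= 140.0)
```

Counting leptons before applying the isolation requirement is one possible reading of "exactly 2 leptons"; an analysis that counts only isolated leptons would differ for events with additional non-isolated leptons.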
After cuts, the only significant surviving background process is Z + jets. Cross-sections at Leading Order (LO) after cuts are shown in table 1. The $h \to b\bar{b}$ branching ratios were taken from Ref. [102]. The cross-sections for backgrounds, the SM, the pure CP-odd operator $c_V \neq 0$ and the two BSM CP-even operators $b_{V1} \neq 0$ and $b_{V2} \neq 0$ are shown, with all other couplings set to zero in each case. The values of the BSM couplings are chosen so as to reproduce the SM total cross-section, without any selection cuts. Note that in spite of this choice of couplings, the cross-sections after cuts for the BSM cases are much higher than the SM, implying a greater acceptance for the BSM cases. We will elaborate on this point later.

W h production
For Wh production we require the following:
1. The Higgs reconstructed as above.
5. No additional jet activity with $p_T^{\rm jet} > 30$ GeV and rapidity $|y| < 3$ (to suppress single top and top pair production backgrounds).
The difference between Zh and Wh is that in the latter case only the transverse momentum of the W boson can be determined in the detector. It is possible to some extent, however, to reconstruct the neutrino momentum from missing energy requirements, as discussed in what follows. The LO cross-sections for the signal and major backgrounds are detailed in table 1. Once again, the choice of couplings $(b_{W1}, b_{W2}, c_W)$ is such that the total cross-section (before any kinematic cuts) is identical to the SM total cross-section. As in the case of Zh, we see that the cross-section after cuts is larger for the BSM couplings, indicating a higher acceptance of the selection cuts to BSM physics. In the following we describe certain detector effects that we have considered in our analysis.

Detector effects
While a full detector simulation is beyond the scope of this study, it is still important to check that the inefficiencies of a detector do not dilute the effects that are observable with exact reconstruction. To this end we use the Delphes 3 package [103] for a fast simulation of detector response. We use the detector simulation parameters tuned for the CMS detector, with some modifications:
• The lepton isolation radius R is set to a reduced value of 0.3, to allow for isolation of leptons in high transverse momentum events where the leptons will be collimated.
• We modify the jet reconstruction algorithm for the detection of a boosted Higgs as described above.
We use the Delphes package since it has been shown to give good agreement with data [103]. However, we do not carry out any validation with experiment for our choice of parameters as this is beyond the scope of the discussion presented here.

Higher order effects
At LO and at NLO in QCD, Vh production occurs through quark-initiated processes. The NLO (QCD) correction to the LO process is given entirely by corrections to the Drell-Yan process [104-106]. The NLO process produces extra QCD radiation in the initial state, thus affecting observables such as the transverse momentum of the final state particles. It should be noted that such effects are large only near the threshold of the transverse momentum cuts, due to collinear and/or soft initial state radiation [107]. In fact the use of asymmetric transverse momentum cuts on the V boson ($p_T^V > 150$ GeV) and the Higgs ($p_T^h > 200$ GeV) means that most of this effect will be concentrated in the region $p_T^V < 200$ GeV. We omit this region of phase space in our final results.
The K-factor (ratio to the LO cross-section) for NLO (QCD) is about 1.2. In the special kinematic region of the boosted analysis the K-factor for both Zh and Wh was found to be ~1.5 in Ref. [73]. For the $Zb\bar{b}$ background the K-factor was found to be about the same, while for $Wb\bar{b}$ the K-factor is higher, about 2.5. The other main background, $t\bar{t}$ production, was found to have a K-factor of ~2. We will use these values of the K-factor in our analysis of likelihoods in section 4.
Furthermore, we simulate the kinematics of an extra jet using the MLM matching procedure [108] with one additional jet for both signal and background. We have checked that our results do not vary significantly with the addition of this extra jet. For Wh production, since we veto events with additional hard jets to remove backgrounds, the effect of extra radiation on the observables we consider is negligible.

Reconstructing the neutrino momentum
One must reconstruct the neutrino in Wh production to determine our angular observables. We identify the neutrino transverse momentum $p_{T\nu}$ with the missing transverse momentum $\not{p}_T$. As explained previously, the missing transverse momentum is approximated by taking the negative vector sum of the transverse momenta of all particles that can be detected (> 0.5 GeV). In order to evaluate the full four momentum of the neutrino, we demand that the squared sum of the neutrino and lepton momenta be equal to the squared W boson mass, $(p_\nu + p_{l_1})^2 = M_W^2$, and solve the resulting quadratic equation. Comparing with the "true" Monte-Carlo generated neutrino momentum, we find that choosing a given solution out of the two possible ones reconstructs the true neutrino momentum 50% of the time, with 5% giving imaginary solutions. One may improve on this by comparing the boosts in the z direction of the Higgs, $\beta_z^h$, and of the reconstructed W, $\beta_z^W$. The solution with the minimum value of $|\beta_z^W - \beta_z^h|$ gives the true neutrino momentum in 65% of cases. We thus present all our results using the latter algorithm.
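The reconstruction described above can be sketched as follows. The sketch assumes massless lepton and neutrino, four-vectors stored as `(E, px, py, pz)` tuples in GeV, and an assumed value `M_W = 80.4` GeV; the function names are hypothetical. How events with imaginary solutions are treated is not specified in the text, so here they simply yield no candidate.

```python
import math

M_W = 80.4  # GeV; assumed value for illustration


def neutrino_pz_solutions(lep, met_x, met_y, m_w=M_W):
    """Real solutions for the neutrino p_z from (p_nu + p_l)^2 = M_W^2.

    lep is the charged-lepton four-vector (E, px, py, pz), taken as massless.
    Returns an empty list when the quadratic has no real roots.
    """
    e_l, px_l, py_l, pz_l = lep
    pt_l2 = px_l ** 2 + py_l ** 2
    pt_nu2 = met_x ** 2 + met_y ** 2
    mu = 0.5 * m_w ** 2 + px_l * met_x + py_l * met_y
    disc = mu ** 2 - pt_l2 * pt_nu2
    if disc < 0.0:
        return []
    root = e_l * math.sqrt(disc)
    return [(mu * pz_l + sign * root) / pt_l2 for sign in (+1.0, -1.0)]


def beta_z(p):
    """Longitudinal boost p_z / E of a four-vector."""
    return p[3] / p[0]


def pick_neutrino(lep, met_x, met_y, higgs, m_w=M_W):
    """Choose the solution minimising |beta_z(W) - beta_z(h)|, or None."""
    best = None
    for pz in neutrino_pz_solutions(lep, met_x, met_y, m_w):
        e_nu = math.sqrt(met_x ** 2 + met_y ** 2 + pz ** 2)
        nu = (e_nu, met_x, met_y, pz)
        w_boson = tuple(a + b for a, b in zip(lep, nu))
        gap = abs(beta_z(w_boson) - beta_z(higgs))
        if best is None or gap < best[0]:
            best = (gap, nu)
    return None if best is None else best[1]
```

The closed form follows from squaring $E_l E_\nu = \mu + p_{lz} p_{\nu z}$ with $\mu = M_W^2/2 + \vec{p}_{T l}\cdot\vec{p}_{T\nu}$; a negative discriminant corresponds to the imaginary solutions mentioned in the text.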

Sensitivity to anomalous couplings
It has been observed that the momentum dependence of the BSM couplings of the hVV vertex pushes the $p_T$ and invariant mass ($\sqrt{\hat{s}_{Vh}}$) distributions to larger values [25,26,109], due ultimately to the extra momentum factors present in the BSM vertices. This is confirmed in the distributions shown in the plots in figure 1. The plot on the left shows the transverse momentum distribution of the W boson in Wh production in the SM (black solid line) compared with each of the pure BSM couplings (with the SM contribution set to zero), using the selection cuts described in the previous section. The values of the couplings have been chosen so that they reproduce the SM cross-section when no cuts are applied. We see that the effect of all the BSM couplings is to push the $p_T^W$ distribution to larger values. The effect is most pronounced for the coupling $b_{W2}$, while the $b_{W1}$ and $c_W$ couplings have a strong but less pronounced effect on this distribution. In the right plot we show the same distribution but this time for admixtures of the SM coupling ($a_W = 0$) with each of the BSM couplings. The effect of the BSM couplings is still easily discernible, though less prominent than in the plot on the left. The larger $p_T$ distributions of the Vh system also lead to larger Higgs boosts and a reduced separation $R_{ll}$ and $R_{bb}$ between the leptons (from the decay of the gauge bosons) and b jets (from the decay of the Higgs) respectively. As mentioned in the previous section, this effect is further quantified by the results of table 1, in which the cross-section for pure BSM processes after cuts is significantly higher than the SM result, after imposing that the cross-sections agree before cuts.
In figure 2, we consider the SM coupling ($a_W = 0$) supplemented by either the $c_W$ (CP-odd) coupling or the $b_{W1}$ coupling, applied to the Wh channel. We show the ratio of the SM+BSM and SM cross-sections both for the total cross-section ($R^{\pm}_{\rm tot} = \sigma^{{\rm SM+BSM}^{\pm}}_{\rm tot}/\sigma^{\rm SM}_{\rm tot}$) and for the cross-section after applying selection cuts ($R^{\pm}_{\rm jetsub} = \sigma^{{\rm SM+BSM}^{\pm}}_{\rm jetsub}/\sigma^{\rm SM}_{\rm jetsub}$). Here $R^+$ and $R^-$ correspond to the cases ($a_W = 0$, $b_{W1} \neq 0$) and ($a_W = 0$, $c_W \neq 0$) respectively. As is expected, both ratios increase with the strength of the BSM couplings and approach unity as the BSM couplings tend to zero. $R^+_{\rm tot}$ shows a faster rise with coupling strength than $R^-_{\rm tot}$. This is because the interference term in the matrix element squared for the CP-odd coupling does not contribute to the cross-section (footnote 2). Importantly, $R_{\rm jetsub}$ (for both couplings) increases at a faster rate than $R_{\rm tot}$ with increasing values of the corresponding couplings. Similar results also hold for the second CP-even anomalous coupling $b_{W2}$. These ratios are therefore quite sensitive to the presence of anomalous couplings. Whilst not directly experimentally measurable, they can be determined by comparing an experimental measurement of the Vh signal with a precise theoretical prediction for the SM-only contribution. If the resulting ratio lies away from unity, this constitutes a strong indication of BSM physics. Another feature that should be noted is that the ratio of the ratios ($R^{\pm}_{\rm jetsub}/R^{\pm}_{\rm tot}$) increases at a faster rate for the CP-odd coupling than it does for the CP-even coupling. This is in agreement with the results of table 1, where it was observed that the acceptance to the selection cuts of the pseudo-scalar state was higher than that of a scalar with anomalous coupling $b_{W2}$.
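The different rise of the two ratios can be made explicit with a schematic expansion in a single anomalous coupling $g$, with the SM coupling held fixed. The symbols $\sigma_{\rm int}$ and $\sigma_{\rm BSM}$ are introduced here purely for illustration and denote the interference and pure BSM contributions per unit coupling; the sketch assumes only one BSM coupling is switched on at a time.

```latex
\sigma^{\rm SM+BSM}(g) = \sigma^{\rm SM} + g\,\sigma_{\rm int} + g^{2}\,\sigma_{\rm BSM}
\quad\Longrightarrow\quad
R(g) = 1 + g\,\frac{\sigma_{\rm int}}{\sigma^{\rm SM}} + g^{2}\,\frac{\sigma_{\rm BSM}}{\sigma^{\rm SM}} .
% CP-odd case: the interference term does not contribute to the rate, so
R^{-}(c_W) = 1 + c_W^{2}\,\frac{\sigma_{\rm BSM}}{\sigma^{\rm SM}} .
```

For small couplings the CP-even ratio therefore departs from unity linearly, while the CP-odd ratio departs only quadratically, consistent with the faster rise of $R^+_{\rm tot}$ noted above.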

Angular observables
In this section, we consider differential observables that can distinguish between the different BSM vertices occurring in the hVV interaction. One such observable, the transverse mass of the Vh system, has been used at the Tevatron to probe the hWW vertex [110]; however, it has been shown to be ineffective at the LHC [25]. Furthermore, the CP-odd coupling contributes to this observable only through quadratic terms in the matrix element squared, so it is not the most sensitive observable (footnote 3). Angular observables, as we will show, can be linearly sensitive to the anomalous couplings. This is useful in that one may construct asymmetry parameters that are manifestly zero for the SM, such that any non-zero measurement constitutes discovery of new physics. Note that in the context of effective theory analysis, constructing observables that are linear in the anomalous couplings is of paramount importance.
The tensor structure of the BSM vertices will be reflected in the angular distribution of the decay products of the gauge boson (footnote 4). To this end, we construct various angular observables that could discriminate between the different vertex structures. The momenta of the V and Higgs bosons are reconstructed from the leptons and jets as follows:

$$p_h = p_{b_1} + p_{b_2} + p_j, \qquad p_V = p_{l_1} + p_{l_2},$$

where $\{p_{b_i}\}$ are the momenta of the b jets, $p_j$ is the momentum of the light quark jet if it is reconstructed, and $p_{l_1}$ and $p_{l_2}$ are the momenta of the lepton and the anti-lepton respectively (for Wh, $p_{l_1}$ corresponds to the lepton momentum and $p_{l_2}$ to the neutrino).

Footnote 2: In practice, if the squared term is larger than the interference term, this is an indication that the effective theory framework is breaking down. Here, however, we are merely quantifying the effect by which selection cuts enhance BSM effects, by fixing the BSM cross-section to be artificially high (equivalent to an unphysically low cut-off scale).

Footnote 3: Care must be taken if such quadratic terms become important, as this signals a potential breakdown of the effective theory description.
With these momenta, we may define

$$\cos\theta^* = \frac{\vec{p}^{\,(V)}_{l_1} \cdot \vec{p}_V}{|\vec{p}^{\,(V)}_{l_1}|\,|\vec{p}_V|}.$$

Here $\vec{p}^{\,(Y)}_X$ corresponds to the three momentum of the particle X in the rest frame of the particle Y. If Y is not specified then the momentum is defined in the lab frame. The parameter $\cos\theta^*$ corresponds to the angle between the direction of the decay lepton in the rest frame of the V boson and the direction of flight of the V boson in the lab frame. This angle, first defined in [66], encodes the W boson polarization.
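As an illustration of this definition, the following sketch boosts the lepton into the V rest frame and takes the cosine of the angle with the lab-frame V direction. Four-vectors are assumed to be `(E, px, py, pz)` tuples; the helper names are hypothetical.

```python
import math


def boost_to_rest_frame(p, frame):
    """Boost four-vector p into the rest frame of the four-vector 'frame'."""
    e_f, fx, fy, fz = frame
    beta = (fx / e_f, fy / e_f, fz / e_f)
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return tuple(p)  # 'frame' is already at rest
    gamma = 1.0 / math.sqrt(1.0 - b2)
    e, px, py, pz = p
    bp = beta[0] * px + beta[1] * py + beta[2] * pz
    coef = (gamma - 1.0) * bp / b2 - gamma * e
    return (gamma * (e - bp),
            px + coef * beta[0], py + coef * beta[1], pz + coef * beta[2])


def cos_angle(u, v):
    """Cosine of the angle between two three-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def cos_theta_star(lepton, v_boson):
    """cos(theta*) between the lepton (V rest frame) and the V flight direction (lab)."""
    lepton_rest = boost_to_rest_frame(lepton, v_boson)
    return cos_angle(lepton_rest[1:], v_boson[1:])
```

For example, a lepton emitted along the V flight direction in the V rest frame gives `cos_theta_star` equal to one, and a lepton emitted perpendicular to it gives zero.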
The SM and BSM couplings lead to mostly longitudinal and transverse W bosons respectively. That $\cos\theta^*$ then effectively distinguishes SM and BSM effects can be seen in figure 3. We see that the SM distribution (black solid line) peaks at $\cos\theta^* = 0$ and vanishes at $\cos\theta^* = \pm 1$. The distribution for the BSM coupling $b_{W2}$ (green dot-dashed line) closely follows the SM distribution. This is expected since the tensor structure of the vertex in both these cases is the same. In contrast, the couplings $b_{W1}$ and $c_W$ produce distributions that have minima at $\cos\theta^* = 0$. They too appear to vanish at $\cos\theta^* = \pm 1$; however, this is an effect of the selection cuts. Without any selection cuts, the distribution for these two cases peaks at $\cos\theta^* = \pm 1$.
The behaviour of this distribution for each of the couplings can be understood as follows. For a transversely polarized W boson, the decay lepton spins align themselves perpendicular to the direction of motion of the W boson and give rise to a distribution of the form $(1 \pm \cos\theta^*)^2$, while in the case of a longitudinally polarized W boson, the spins of the decay leptons align themselves along the direction of the W boson and give rise to a distribution of the form $\sin^2\theta^*$. The two BSM couplings $b_{W1}$ and $c_W$ produce more transversely polarized W boson states, while the SM coupling and the coupling $b_{W2}$ produce more longitudinally polarized W bosons. Using this distribution, it is therefore possible to differentiate vertex structures with couplings $b_{W1}$ or $c_W$ from vertex structures with couplings $a_W$ or $b_{W2}$, but not to differentiate between $b_{W1}$ and $c_W$ or between $a_W$ and $b_{W2}$.
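The two shapes quoted above can be combined into a single normalised distribution. The polarisation fractions $f_0$, $f_+$, $f_-$ are symbols introduced here for illustration, with $f_0 + f_+ + f_- = 1$; which sign of $(1 \pm \cos\theta^*)^2$ goes with which transverse helicity depends on the lepton charge and on the angle convention, so the assignment below is an assumption.

```latex
\frac{1}{N}\frac{dN}{d\cos\theta^*}
  = \frac{3}{4}\, f_0 \,\sin^2\theta^*
  + \frac{3}{8}\, f_+ \,(1+\cos\theta^*)^2
  + \frac{3}{8}\, f_- \,(1-\cos\theta^*)^2 ,
\qquad
\int_{-1}^{1} \sin^2\theta^* \, d\cos\theta^* = \frac{4}{3}, \quad
\int_{-1}^{1} (1\pm\cos\theta^*)^2 \, d\cos\theta^* = \frac{8}{3} .
```

A sample dominated by $f_0$ peaks at $\cos\theta^* = 0$, as for the SM and $b_{W2}$, while a sample dominated by $f_\pm$ peaks at $\cos\theta^* = \pm 1$ before cuts, as for $b_{W1}$ and $c_W$.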
In figure 4 we present plots of the same observable $\cos\theta^*$ for admixtures of the SM coupling with each of the BSM couplings. We present three cases corresponding to ($a_W = 0$, $b_{W1} = 0.1$), ($a_W = 0$, $c_W = 0.1$) and ($a_W = 0$, $b_{W2} = 0.01$). We see that differences, though reduced, are still discernible in this distribution. Similar results also hold for Zh production.
To fully distinguish the CP even ($b_{W1}$) and odd ($c_W$) BSM contributions, one must construct CP-odd observables, which is difficult in principle for a proton-proton collider [111]. For Zh production, Ref. [112] considered two such observables, although these are sensitive to radiation and hadronization corrections; Ref. [113] defined observables which are insensitive to the CP nature of BSM contributions. Ref. [49] examined CP-odd asymmetries in Wh production with the decay $h \to W^{(*)}W^*$, though the effect of the BSM CP even term was not considered. The hint for possible CP-odd observables comes from looking at the matrix element squared for the process $q(k_1)\,\bar{q}'(k_2) \to W^+(p_W)\,h(p_h) \to l^+(p_1)\,\nu_l(p_2)\,h(p_h)$ shown in appendix A. The interference term between the CP-odd coupling and the SM coupling ($a_W c_W$) is proportional to $\epsilon_{\mu\nu\rho\sigma}\, k^\mu (p_h - k_1)^\nu p_W^\rho p_1^\sigma$, where $k = k_1 + k_2$ and $\epsilon_{\mu\nu\rho\sigma}$ is the anti-symmetric Levi-Civita tensor. Such a term depends on the angle between the plane of production of the Wh system and the direction of flight of the lepton. This is depicted in figure 5.
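The geometric content of this contraction can be checked numerically with a small sketch. It assumes four-vectors stored as `(E, px, py, pz)` and the sign convention $\epsilon_{0123} = +1$ (conventions differ between references); the function names are hypothetical. With this convention the contraction of four vectors equals the determinant of the matrix whose rows are those vectors.

```python
from itertools import permutations


def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation of 0..n-1, counted via transpositions."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def epsilon_contract(a, b, c, d):
    """epsilon_{mu nu rho sigma} a^mu b^nu c^rho d^sigma with epsilon_{0123} = +1."""
    total = 0.0
    for perm in permutations(range(4)):
        m, n, r, s = perm
        total += permutation_sign(perm) * a[m] * b[n] * c[r] * d[s]
    return total


def cp_odd_structure(k1, k2, p_h, p_w, p_lep):
    """The contraction appearing in the SM/CP-odd interference term."""
    k = tuple(x + y for x, y in zip(k1, k2))
    p_h_minus_k1 = tuple(x - y for x, y in zip(p_h, k1))
    return epsilon_contract(k, p_h_minus_k1, p_w, p_lep)
```

In the partonic centre-of-mass frame, where $k$ has only a time component, the contraction reduces to a triple product of three-momenta and therefore vanishes whenever the lepton momentum lies in the plane spanned by the beam axis and the W boson, which is the geometric statement made in the text.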
We now construct the following angles based on the observation made above.
We use the same notation for the momenta as described below eq. (3.2). For an $e^+e^-$ collider where the direction (as well as the energy) of the lepton and anti-lepton are well known, it is sufficient to define the normal to the plane of production with the cross-product between any one of the leptons and the direction of flight of the Higgs or gauge boson. Note that the choice of vectors in $e^+e^-$ collisions completely fixes whether the normal points 'below' or 'above' the plane of production. At the LHC, the information about the direction of the quark or anti-quark is not known and hence it is difficult to fix the direction of the normal to the plane of production. However, it is known that the valence quarks are likely to carry a larger fraction of the proton momentum. The direction of the normal to the plane of production can then be fixed by the momenta of the V and Higgs bosons. We use this fact to construct the first angle in eq. (3.4), $\cos\delta^+$, which corresponds to the angle between the direction of flight of the lepton (with the momentum evaluated in the rest frame of the V boson) and the plane formed by the V boson and the Higgs. In an $e^+e^-$ collider, where the centre-of-mass and lab frames coincide, the gauge and Higgs bosons will be produced back to back. However, at the LHC one can take advantage of the asymmetric collision energies of the partons, which results in a difference between the centre-of-mass and lab frames. For asymmetric collisions, the plane defined by the cross-product between the V boson and Higgs directions coincides with the plane of production. In figure 6 we show the distribution of this observable for the SM as well as for the anomalous couplings. In the plot on the left, the SM prediction is compared to the prediction for each of the anomalous couplings: $b_{W1} \neq 0$ (blue dotted line), $b_{W2} \neq 0$ (green dot-dashed) and $c_W \neq 0$ (red dashed), with all other couplings set to zero in each case. All distributions show a dip at $\cos\delta^+ = 0$. This is created by the transverse momentum cut on the leptons, since low $p_T$ leptons will always be perpendicular to the normal to the plane of production.
We see from this distribution that for $b_{W1} \neq 0$ the leptons are produced mostly in the plane of production, while for $c_W \neq 0$ the leptons tend to be produced mostly perpendicular to the plane of production. For two cases, the SM and $b_{W2} \neq 0$, the distribution is flat (without cuts) and has a slight dip at $\cos\delta^+ = 0$ due to the $p_T$ cuts, as explained above. This observable clearly discriminates between the $b_{W1} \neq 0$ and $c_W \neq 0$ cases, a feature that was absent in the distributions of the observables discussed earlier.
More interesting effects can be seen in this observable when we consider admixtures of each of the anomalous couplings with the SM. In the right plot of figure 6 the distributions for three cases are compared with the SM expectation: ($a_W = 0$, $b_{W1} = 0.1$) (blue dotted line), ($a_W = 0$, $b_{W2} = 0.01$) (green dot-dashed) and ($a_W = 0$, $c_W = 0.1$) (red dashed), with all other couplings set to zero in each case. As expected, the distribution for the case ($a_W = 0$, $b_{W2} = 0.01$) follows the SM distribution closely. The CP-even ($a_W = 0$, $b_{W1} = 0.1$) case is similar to the pure anomalous coupling case ($a_W = -1$, $b_{W1} \neq 0$). We have checked that the interference term alone for the case ($a_W = 0$, $b_{W1} = 0.1$) produces a similar distribution to the case ($a_W = -1$, $b_{W1} \neq 0$), and is therefore linearly sensitive to $b_{W1}$. For the CP-violating case ($a_W = 0$, $c_W = 0.1$) the distribution is skewed towards positive values of $\cos\delta^+$. This is due to the presence of the Levi-Civita tensor in the interference term of the matrix element squared, as described above. Note that the distribution would peak at negative values of $\cos\delta^+$ if the sign of the coupling $c_W$ were reversed. This observable is therefore linearly sensitive to $c_W$, and hence to its sign. We will use this fact to construct asymmetries in the next section.
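The construction of this angle can be sketched numerically. The following is a minimal illustration, assuming four-vectors stored as (E, px, py, pz) in GeV and assuming the normal to the plane is taken as the cross-product of the V and Higgs lab-frame momenta in that order; the function names and the toy kinematics are hypothetical and are not taken from the analysis code.

```python
import numpy as np

def boost_to_rest_frame(p, frame):
    """Boost four-vector p = (E, px, py, pz) into the rest frame of `frame`."""
    p = np.asarray(p, dtype=float)
    frame = np.asarray(frame, dtype=float)
    beta = frame[1:] / frame[0]           # velocity of the frame (c = 1)
    beta2 = beta @ beta
    if beta2 == 0.0:                      # frame already at rest: nothing to do
        return p.copy()
    gamma = 1.0 / np.sqrt(1.0 - beta2)
    bp = beta @ p[1:]
    energy = gamma * (p[0] - bp)
    spatial = p[1:] + ((gamma - 1.0) * bp / beta2 - gamma * p[0]) * beta
    return np.concatenate(([energy], spatial))

def cos_delta_plus(p_lepton, p_v, p_h):
    """Cosine of the angle between the lepton direction (V rest frame) and the
    normal p_V x p_h to the V-h plane (lab frame). The ordering of the
    cross-product, and hence the overall sign, is an assumption."""
    lep = boost_to_rest_frame(p_lepton, p_v)[1:]
    normal = np.cross(np.asarray(p_v, float)[1:], np.asarray(p_h, float)[1:])
    return float(lep @ normal / (np.linalg.norm(lep) * np.linalg.norm(normal)))

# Toy kinematics (GeV): V along +x, Higgs along +y, massless lepton along +z.
m_v, m_h = 80.4, 125.0
p_v = np.array([np.hypot(100.0, m_v), 100.0, 0.0, 0.0])
p_h = np.array([np.hypot(100.0, m_h), 0.0, 100.0, 0.0])
p_l = np.array([40.0, 0.0, 0.0, 40.0])
value = cos_delta_plus(p_l, p_v, p_h)
```

In this toy configuration the lepton points along the normal in the lab frame, and the boost into the V rest frame tilts it towards the plane, so the result is positive and less than one.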
The second observable we consider is slightly more complicated in construction. The momenta of the two leptons from the decay of the gauge boson are evaluated in the frame in which the Higgs would be at rest, were its three-momentum reversed (footnote 5). Then $\cos\delta^-$ is the angle between the plane formed by the two leptons in this frame and the V boson in the lab frame. This angle is related to the angle $\phi$ depicted in figure 5. The distribution of this observable for the SM is compared with three other cases in the left plot of figure 7, each with a single non-zero BSM coupling ($b_{W1}$, $b_{W2}$ or $c_W$) and the other couplings set to zero. The behaviour of this angle is very similar to that of $\cos\delta^+$. There are two noticeable differences. Firstly, in the case $c_W \neq 0$ the distribution of $\cos\delta^-$ shows a more pronounced difference from the SM than the distribution of $\cos\delta^+$. Secondly, for $b_{W1} \neq 0$ the opposite is true, and $\cos\delta^+$ shows the greater difference from the SM distribution. This is also true when we set ($a_W = 0$, $b_{W1} = 0.1$). For the CP-violating case, the same skewed behaviour of the distribution that was observed for $\cos\delta^+$ reappears here. As usual, the distributions for the coupling $b_{W2}$, in both the pure and mixed cases of the left and right plots of figure 7, closely follow the SM expectation.
The last observable we consider, $\Delta\phi_{lV}$, is the difference in azimuthal angle between the lepton momentum (evaluated in the rest frame of the V boson) and the V boson momentum. The distribution of this observable is shown in figure 8. The left plot compares the SM expectation with three cases of the pure BSM couplings. The right plot compares the SM prediction with admixtures of the SM with each of the BSM couplings. For all the cases we consider, there is a significant difference from the SM distribution of this observable. The most striking difference, however, is for the pure CP-even case ($b_{W1} \neq 0$) which, unlike the other cases, displays a minimum in the $\Delta\phi_{lV}$ distribution. Differences between the distributions remain, although reduced, when considering admixtures of the SM coupling ($a_W$) with each of the BSM couplings.
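An azimuthal difference of this kind has to be wrapped so that it does not jump by $2\pi$. A minimal sketch follows, assuming the beam is along the z axis, that the inputs are three-momenta (px, py, pz) already evaluated in the appropriate frames, and that the result is wrapped to $[-\pi, \pi)$; the range convention and the function name are assumptions made for illustration.

```python
import math

def delta_phi(p_a, p_b):
    """Azimuthal angle difference phi_a - phi_b, wrapped to [-pi, pi)."""
    phi_a = math.atan2(p_a[1], p_a[0])
    phi_b = math.atan2(p_b[1], p_b[0])
    return (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi

# Lepton momentum (V rest frame) along +y, V momentum (lab frame) along +x.
quarter_turn = delta_phi((0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
```

Taking the absolute value of the result folds the observable onto $[0, \pi]$ if an unsigned angle is preferred.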
We also show in figure 9 the distributions of the observables described above for the backgrounds to $Wh$ production listed in table 1. The distributions of the various angles follow the SM distribution, except for the angles $\Delta\phi_{lV}$ and $\cos\theta^*$. For completeness, we also show the distributions of the various angles defined in eq. (3.2) and eq. (3.4) for $Zh$ production. In figure 10, the SM distribution (black solid line) is compared with the predictions of the three different BSM couplings. The values of the couplings are chosen as in table 1, so as to reproduce the SM total cross-section (before cuts). The distributions display a behaviour similar to that of the analogous distributions in $Wh$ production.
In figure 11, the SM distribution (black solid line) is compared with three cases which involve admixtures of the SM and BSM couplings. The asymmetries in the distributions of $\cos\delta^+$ and $\cos\delta^-$ that one observes in $Wh$ production for the CP-violating case ($a_Z = 1$, $c_Z = 0.1$), although present, are far less prominent in $Zh$ production. The reason for this difference can be ascertained by looking at the CP-violating term in the matrix element squared. For $Wh$ production, as described earlier, this term was simply proportional to a Levi-Civita tensor of the form $\epsilon_{\mu\nu\rho\sigma} k^\mu (p_h - k_1)^\nu p_W^\rho p_1^\sigma$. The CP-violating term in the matrix element squared for $Zh$ production has several instances of the Levi-Civita tensor that come with opposite signs. These do not cancel out (as they do in $Wh$ production) since they are multiplied by axial and vector couplings (which are of different strengths). As a result, the distributions of $\cos\delta^+$ and $\cos\delta^-$ receive contributions from Levi-Civita tensors of opposite sign, and hence display a reduced skewness in comparison to the analogous distributions in $Wh$ production.
We now have a set of observables that can discriminate not only the SM coupling from BSM couplings but also between the various BSM couplings, as evidenced by the distributions presented in this section. In order to fully assess the discriminating power of these observables and to estimate the typical luminosities that one would require at a 14 TeV LHC to rule out the various anomalous couplings, we perform a multi-variable likelihood analysis in the next section.

Multi-Variable Likelihood analysis
In the previous section we described the various observables that one could use in order to probe anomalous couplings in $Vh$ production. We found that the transverse momentum of the V boson (or the Higgs), the angle $\cos\theta^*$ and any one of the correlated observables defined in eq. (3.4) can be used for this purpose. According to the Neyman-Pearson lemma, the likelihood ratio provides the most powerful test statistic for discriminating between two simple hypotheses. Therefore, in order to assess the sensitivity of these observables to anomalous couplings at the LHC, we perform a three-dimensional extended binned-likelihood analysis. The procedure we follow is outlined below.
We set the SM expectation, with $a_W = 0$, as our null hypothesis. The alternate hypotheses are chosen to be the various cases which involve any one of the BSM couplings. We define our likelihood as a function of a set of three observables. These are $p_T^W$, $\cos\theta^*$ and any one of the observables defined in eq. (3.4). In fact, we perform this analysis for three different definitions of the likelihood ($L$), which depend on the choice of observable, namely $L(p_T^W, \cos\theta^*, \cos\delta^+)$, $L(p_T^W, \cos\theta^*, \cos\delta^-)$ and $L(p_T^W, \cos\theta^*, \Delta\phi_{lV})$. As a first step we produce three-dimensional histograms with the various combinations of observables listed above. The choice of range and bins for each of the observables is listed below.
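As an illustration of this first step, a three-dimensional histogram can be filled and flattened into a single array of bin contents as sketched here. The bin counts and ranges in `BINS` and `RANGES` are hypothetical placeholders and not the choices made in the analysis, and the uniformly distributed toy events stand in for simulated events.

```python
import numpy as np

# Hypothetical binning, for illustration only (not the analysis choices).
BINS = (4, 5, 5)
RANGES = ((200.0, 600.0), (-1.0, 1.0), (-1.0, 1.0))  # pT [GeV], cos(theta*), cos(delta+)

def histogram_3d(events, weights=None):
    """events: array of shape (n_events, 3). Returns flattened bin contents."""
    counts, _edges = np.histogramdd(
        np.asarray(events, dtype=float), bins=BINS, range=RANGES, weights=weights
    )
    return counts.ravel()

# Toy events drawn uniformly inside the histogram ranges.
rng = np.random.default_rng(1)
toy_events = np.column_stack([
    rng.uniform(200.0, 600.0, 1000),
    rng.uniform(-1.0, 1.0, 1000),
    rng.uniform(-1.0, 1.0, 1000),
])
bin_contents = histogram_3d(toy_events)
```

Flattening is harmless because the binned likelihood is a product over bins and does not depend on how they are ordered.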
Using these histograms we can now determine the likelihood function. Let $t_i$ be the expected bin height (or number of events) of the $i$th bin derived from theory (in our case from Monte Carlo simulations). The probability that the $i$th bin will have $n_i$ observed events (observed bin height) is a Poissonian probability given by
$$P(n_i \,|\, t_i) = \frac{t_i^{n_i}\, e^{-t_i}}{n_i!}.$$
We can now proceed to determine the probability of generating the full distribution for all of the histogram bins by multiplying the probabilities for each of the bins. The binned likelihood is then given by a product of Poisson distributions,
$$L(\mathrm{data} \,|\, X) = \prod_{i=1}^{N} \frac{t_i^{n_i}\, e^{-t_i}}{n_i!}.$$
Here $N$ is the number of bins, $t_i$ is the expected number of events under the hypothesis $X$ and $n_i$ is the number of observed events. The likelihood ratio $Q$ is then defined from the binned likelihoods of the null and alternate hypotheses.
We now use these three-dimensional histograms to generate "pseudo-data". This is done by using the theoretically determined $t_i$ to generate Poisson-distributed random numbers, which correspond to our pseudo-data. We repeat this procedure for all bins in order to generate pseudo-data. We then determine the distribution of the likelihood ratio $Q$ by generating $5 \times 10^3$ "pseudo-events". A typical distribution for $Q$ is shown in figure 12.
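The pseudo-data procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the test statistic is taken to be $Q = -2\ln[L(\mathrm{data}|\mathrm{alt})/L(\mathrm{data}|\mathrm{null})]$, which is one common convention and may differ from the definition used in the analysis; the four-bin expectations `t_sm` and `t_bsm` are toy numbers; and all function names are hypothetical.

```python
import numpy as np

def log_likelihood(n, t):
    """Poisson binned log-likelihood, omitting ln(n_i!) (it cancels in ratios).
    Every expected count t_i must be strictly positive."""
    return np.sum(n * np.log(t) - t, axis=-1)

def q_statistic(n, t_null, t_alt):
    """Assumed convention: Q = -2 ln[ L(n | alt) / L(n | null) ]."""
    return -2.0 * (log_likelihood(n, t_alt) - log_likelihood(n, t_null))

def sample_q(t_true, t_null, t_alt, n_pseudo=5000, seed=0):
    """Distribution of Q over pseudo-experiments generated from t_true."""
    rng = np.random.default_rng(seed)
    t_true = np.asarray(t_true, dtype=float)
    # One Poisson draw per bin and per pseudo-experiment.
    pseudo = rng.poisson(t_true, size=(n_pseudo, t_true.size))
    return q_statistic(pseudo, np.asarray(t_null, float), np.asarray(t_alt, float))

# Toy expectations with equal totals, so only the shape discriminates.
t_sm = np.array([50.0, 50.0, 50.0, 50.0])
t_bsm = np.array([70.0, 55.0, 45.0, 30.0])
q_sm = sample_q(t_sm, t_sm, t_bsm, seed=1)     # pseudo-data generated under the null
q_bsm = sample_q(t_bsm, t_sm, t_bsm, seed=2)   # pseudo-data generated under the alternate
```

With this sign convention, pseudo-data generated under the null hypothesis give positive $Q$ on average and pseudo-data generated under the alternate give negative $Q$; the separation of the two distributions sets the discriminating power.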
Using the distribution of $Q$, we can determine the p-value for excluding the alternate (BSM) hypothesis (footnote 6). We include the effects of backgrounds completely, but only profile over the various nuisance parameters that arise from detector effects and selection cuts. The results of this procedure are shown in figure 13 for the pure BSM cases, where we show the variation of the p-value of the BSM hypothesis against the luminosity. To assess the sensitivity of each of the observables, we set the coupling strengths to values that reproduce the SM cross-section after applying all the cuts, i.e. ($a_W = -1$, $b_{W1} = 0.1$), ($a_W = -1$, $c_W = 0.1$) and ($a_W = -1$, $b_{W2} = 0.007$), with all other couplings set to zero for each of these cases. This choice of couplings hence eliminates the rate information from the analysis. We stress that this is done to check the discriminating power of the observables under consideration. The horizontal line indicates exclusion of the alternate hypothesis at 95% confidence level. A second horizontal line is shown in some cases below the first one, and this indicates exclusion of the alternate hypothesis at the $3\sigma$ level. For two of the couplings, $c_W$ and $b_{W1}$, we observe that the likelihood constructed with the $\Delta\phi_{lV}$ observable provides a slightly stronger discriminant. In both cases we find that exclusion of the pure BSM hypothesis at 95% confidence level is possible with $\sim 50$ fb$^{-1}$ of luminosity. The coupling $b_{W2}$ can be excluded with even less data, with exclusion at 95% confidence level possible with just 30 fb$^{-1}$ of luminosity. All the likelihoods produce similar results in this case. This is as expected, since the strongest discriminator for this coupling is the transverse momentum distribution, while the angular observables for $b_{W2}$ are not very different from the SM predictions. An important point to note is that we have set the couplings to very small values. This does not correctly reproduce the Higgs partial decay widths. For example, if the Higgs were a pseudo-scalar, then in order to reproduce the SM decay width in $h \to V^{(*)}V^{(*)}$ decays, the coupling $c_V$ should have a value $\sim 3$. For such a large value of the coupling, the $Vh$ production channel can easily rule out the pseudo-scalar hypothesis with $\sim 20$ fb$^{-1}$ of data. We would like to emphasize that although a pure pseudoscalar hypothesis has been ruled out by an analysis of the Higgs decaying to the four-lepton channel, the same is not true for the $hWW$ coupling [52] (footnote 7). Since it is not easy to reconstruct the kinematics of the final state in the case of the Higgs decaying through W bosons, what we suggest is that it is possible to use only the rate information from the decay, coupled with an analysis of $Wh$ production (as prescribed here), to rule out the pure pseudoscalar hypothesis for the $hWW$ coupling with a relatively small amount of luminosity.
We are, however, more interested in the case where there are admixtures of the SM coupling and the BSM couplings. In figure 14 we show the variation of the p-value for the alternate hypothesis with luminosity for a 14 TeV LHC. For the case ($a_W = 0$, $b_{W1} = 0.1$), we find that the likelihood function constructed with $\Delta\phi_{lV}$ does slightly better than the other two likelihood functions. We find that the BSM hypothesis for this choice of coupling strengths can be excluded at 95% confidence level with about 100 fb$^{-1}$ of luminosity. For the CP-violating case, ($a_W = 0$, $c_W = 0.1$), as expected, the two likelihood functions constructed with $\cos\delta^+$ and $\cos\delta^-$ appear to be the strongest discriminators, with 95% confidence level exclusion of the BSM hypothesis possible with about 100 fb$^{-1}$ of data. Finally, for the case ($a_W = 0$, $b_{W2} = 0.01$), we observe, once again, that the likelihoods constructed with $\cos\delta^+$ and $\cos\delta^-$ do slightly better than the likelihood constructed with $\Delta\phi_{lV}$. We find that 95% confidence level exclusion of this BSM hypothesis is possible with about 50 fb$^{-1}$ of luminosity.
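The dependence of the p-value on luminosity can be illustrated with a toy scan. The sketch below makes several assumptions that are not taken from the analysis: expected bin counts are obtained by scaling per-bin cross-sections (toy values, in fb) by the luminosity; the statistic is $Q = -2\ln[L(\mathrm{alt})/L(\mathrm{null})]$; and the p-value of the alternate hypothesis is defined as the fraction of alternate-hypothesis pseudo-experiments that look at least as null-like as the median null pseudo-experiment. Backgrounds and nuisance parameters are ignored.

```python
import numpy as np

def sample_q(t_true, t_null, t_alt, n_pseudo, rng):
    """Q = -2 ln[L(alt)/L(null)] for Poisson pseudo-data drawn from t_true."""
    pseudo = rng.poisson(t_true, size=(n_pseudo, t_true.size))
    log_ratio = np.sum(pseudo * np.log(t_alt / t_null) - (t_alt - t_null), axis=-1)
    return -2.0 * log_ratio

def p_value_alt(sigma_null, sigma_alt, lumi, n_pseudo=5000, seed=0):
    """Toy p-value for excluding the alternate hypothesis at luminosity lumi [fb^-1]."""
    rng = np.random.default_rng(seed)
    t_null = np.asarray(sigma_null, dtype=float) * lumi   # expected counts, null
    t_alt = np.asarray(sigma_alt, dtype=float) * lumi     # expected counts, alternate
    q_null = sample_q(t_null, t_null, t_alt, n_pseudo, rng)
    q_alt = sample_q(t_alt, t_null, t_alt, n_pseudo, rng)
    # Assumed definition: alternate pseudo-experiments beyond the null median.
    return float(np.mean(q_alt >= np.median(q_null)))

# Toy per-bin cross-sections after cuts (fb), with equal totals.
sigma_sm = [0.5, 0.5, 0.5, 0.5]
sigma_bsm = [0.7, 0.55, 0.45, 0.3]
p_low_lumi = p_value_alt(sigma_sm, sigma_bsm, lumi=10.0)
p_high_lumi = p_value_alt(sigma_sm, sigma_bsm, lumi=100.0)
```

Scanning `lumi` over a grid and recording where the p-value drops below 0.05 gives the analogue, for this toy, of reading off the 95% confidence level crossing in a p-value against luminosity plot.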
We also perform this analysis for $Zh$ production. The variation of the p-value of the alternate hypothesis with luminosity for a 14 TeV LHC is shown in figure 15. Once again we compare the results of three different likelihood functions constructed out of three different combinations of observables, namely $L(p_T^Z, \cos\theta^*, \cos\delta^+)$, $L(p_T^Z, \cos\theta^*, \cos\delta^-)$ and $L(p_T^Z, \cos\theta^*, \Delta\phi_{lV})$. In contrast to $Wh$ production, we find that all three likelihoods have a discriminating power not very different from one another. The smaller cross-section for $Zh$ production implies that the luminosities at which the various hypotheses can be excluded are higher than for the corresponding hypotheses in $Wh$ production. The luminosities at which we find exclusion of the BSM hypothesis at 95% confidence level are as follows:
• $b_{Z1} = 0.12$, with all other couplings set to zero: $\sim 100$ fb$^{-1}$.
The luminosities listed in this section are all within the projected value of 300 fb$^{-1}$ for the LHC [114]. We reiterate here that an analysis of the $hVV$ vertex in the $Vh$ production mode was not conceived before, due to the small cross-section in this channel. We have shown that such an analysis is indeed possible. The increased acceptance of BSM physics under the cuts employed in a boosted analysis plays a crucial role in improving the sensitivity of $Vh$ production to BSM physics. We have constructed observables that are linearly sensitive to BSM couplings. Further, in this section we have shown that our observables are quite powerful, and that exclusion of various BSM hypotheses is possible with a relatively small amount of data. The importance of our analysis is that it provides a direct method of studying the $hWW$ vertex. As mentioned earlier, other channels such as VBF and $h \to W^{(*)}W^{*}$ do not provide clean probes of the same (footnote 8). It should be noted that a likelihood analysis has several nuisance parameters (from detector effects and selection cuts) that are a source of uncertainty in this analysis. In order to reduce the uncertainties in probing anomalous couplings, we look at the possibility of constructing asymmetries in the next section. This is best suited to the CP-violating case, where it is easy to construct asymmetries that would vanish in the case of CP conservation. Note that it is possible to construct asymmetries that are non-zero for the SM and are also linear in the anomalous couplings (footnote 9). However, here we focus our attention on CP violation.

Asymmetries
In this section, we define asymmetry parameters related to the angular observables of section 3. There are a number of motivations for this. Firstly, asymmetry parameters defined in terms of ratios are typically theoretically cleaner than kinematic distributions, due to cancellation of PDF and scale uncertainties, as well as reduced sensitivity to radiative corrections. They are also experimentally easier and cleaner to measure, being related to simple counting experiments, recording the number of events in well defined regions of phase space.
Such asymmetries can be constructed using the observables $\cos\delta^+$ and/or $\cos\delta^-$ defined in eq. (3.4). In this section we will consider the asymmetry constructed out of the observable $\cos\delta^+$ only (footnote 10), as follows. For $W^+$ events (tagged using the sign of the decay lepton) one defines
$$A(\cos\delta^+) = \frac{\sigma(\cos\delta^+ > 0) - \sigma(\cos\delta^+ < 0)}{\sigma(\cos\delta^+ > 0) + \sigma(\cos\delta^+ < 0)}, \tag{5.1}$$
defining minus this quantity for $W^-$ events. For ($a_W = 0$, $c_W = 0.1$), we find the value of this asymmetry, after applying all selection cuts, to be $A(\cos\delta^+) = 0.315$ for $W^+h$ production. We also verify that the value of the asymmetry for all other cases (BSM, SM and backgrounds) is less than $1 \times 10^{-3}$, which is within the statistical uncertainty of our procedure and can safely be assumed to be vanishing. We emphasize that the vanishing of this asymmetry holds true even after including detector effects. This makes it a robust observable to probe CP violation. Since the transverse momentum cuts increase the acceptance of the BSM vertex, the value of the asymmetry depends on this kinematic cut. The asymmetry also depends on rapidity cuts, since the observable depends on the cross-product of the Higgs and gauge boson momenta. We perform a simple parton-level analysis of $W^+h$ production with the $W^+$ decaying leptonically and the Higgs decaying to a b-quark pair. In order to mimic the cuts of a boosted analysis, we apply the following cuts at the parton level:
1. Transverse momentum of the leptons $p_T^l > 30$ GeV; rapidity $|y_l| < 2.5$; separation from b-quarks $\Delta R_{lb} > 0.3$, where $l = \{e^+, \mu^+\}$.
We note that the value of the asymmetry calculated at the parton level is in very good agreement with the asymmetry calculated using the full boosted analysis simulation described in the previous sections. We evaluate the variation of this asymmetry with the strength of the coupling $c_W$ using a parton-level analysis. The variation of this asymmetry is shown in figure 16.
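Because the asymmetry is a counting ratio, it is straightforward to estimate from a list of per-event $\cos\delta^+$ values. The sketch below assumes unweighted events passing all cuts, so that cross-sections are proportional to event counts, and uses the standard binomial estimate $\sqrt{(1 - A^2)/N}$ for the statistical uncertainty; the function name and the `w_charge` argument are hypothetical.

```python
import math

def asymmetry(cos_values, w_charge=+1):
    """Counting asymmetry of eq. (5.1) and its binomial uncertainty.

    cos_values: per-event values of cos(delta+) after cuts (unweighted events).
    w_charge:   +1 for W+ events, -1 for W- events (the sign is flipped for W-).
    """
    n_pos = sum(1 for c in cos_values if c > 0)
    n_neg = sum(1 for c in cos_values if c < 0)
    total = n_pos + n_neg
    a = w_charge * (n_pos - n_neg) / total
    err = math.sqrt((1.0 - a * a) / total)   # binomial uncertainty on A
    return a, err

# Toy sample: three events above the plane of production, one below.
a_toy, err_toy = asymmetry([0.5, 0.2, -0.1, 0.9])
```

For weighted Monte Carlo events the counts would be replaced by sums of weights, and the uncertainty estimate would need to be adapted accordingly.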
We see that the sign of the asymmetry depends on the sign of $c_W$. We also observe that the asymmetry peaks for a value of $c_W \sim 0.1$, while a minimum is observed for $c_W \sim -0.1$. The extrema signify regions where the interference plays an important role. For larger values of $|c_W|$ the quadratic term (in the matrix element squared) starts contributing more strongly to the total cross-section, thus reducing the value of this asymmetry. However, this would correspond to a kinematic region where one no longer trusts the effective theory framework.
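The competition between the linear and quadratic terms can be made explicit with a schematic parametrisation. This is an illustrative assumption and not a result of the analysis: the numerator of the asymmetry is taken to be pure interference, linear in $c_W$, while the denominator is the SM rate plus a term quadratic in $c_W$, assuming the CP-odd interference does not contribute to the total rate. The constants $\alpha$ and $\beta$ are hypothetical and depend on the cuts.

```latex
\begin{align}
A(c_W) &= \frac{\alpha\, c_W}{1 + \beta\, c_W^2}, \\
\frac{dA}{dc_W} &= \frac{\alpha\,(1 - \beta\, c_W^2)}{(1 + \beta\, c_W^2)^2} = 0
  \quad\Longrightarrow\quad c_W = \pm\frac{1}{\sqrt{\beta}}, \\
A\!\left(\pm\frac{1}{\sqrt{\beta}}\right) &= \pm\frac{\alpha}{2\sqrt{\beta}} .
\end{align}
```

In this toy form the asymmetry is odd in $c_W$, grows linearly for small couplings, and falls off as $1/c_W$ once the quadratic term dominates. Extrema at $c_W \simeq \pm 0.1$ would correspond to $\beta \simeq 100$.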
In $Zh$ production, with ($a_Z = 1$, $c_Z = 0.1$), the asymmetry is found to be $\sim 0.02$, an order of magnitude less than the asymmetry in the case of $Wh$ production. The much smaller value is due to the presence of different vector and axial-vector couplings of the quarks and leptons with the Z boson, as explained earlier. Stronger probes of CP violation in the $hZZ$ vertex can be found in ref. [116].

Conclusion
The ongoing attempts to pin down the nature of the recently discovered Higgs-like particle constitute a major global effort in contemporary particle physics. In this paper, we have considered the associated production of the Higgs with a massive gauge boson at the LHC as an alternative to VBF production, in probing anomalous contributions to the $hWW$ vertex. Similar analyses in VBF can be quite difficult due to large backgrounds and our inability to reconstruct the final state momenta. Furthermore, in VBF production there is a significant contribution from the $hZZ$ vertex, reducing the ability to cleanly distinguish this from the $hWW$ vertex. We have shown that such a separation is indeed possible in $Vh$ production, despite the smaller cross-section. This has been made possible by the use of modern jet-substructure techniques. Consistent with previous studies [25,26,109], we find that the very same selection criteria that are applied to eliminate backgrounds also enhance the sensitivity to BSM physics. This is ultimately due to the additional momentum factors that correct the $hVV$ vertices in an effective theory framework, which boost the Higgs to higher transverse momenta on average.
Building on the preliminary work of ref. [27], we constructed angular observables that are sensitive to new physics. To test the ability of these observables to probe the tensor structure of the $hVV$ vertex, we performed a log-likelihood analysis. Three-dimensional likelihood functions were constructed with different combinations of the observables. We found that with a relatively small amount of data (less than 150 fb$^{-1}$ of luminosity), it is possible to exclude all the different cases of couplings we have considered. For example, we found that the CP-violating case ($a_W = 0$, $c_W = 0.1$) could be excluded at 95% confidence level with $\sim 90$ fb$^{-1}$ of luminosity at a 14 TeV LHC.
Finally, we constructed an asymmetry that is sensitive to the amount of CP violation in $hWW$ interactions. The asymmetry vanishes for all CP-conserving cases. In particular, it is zero in the SM, such that any non-zero measurement constitutes an unambiguous discovery of new physics. We checked that the asymmetry is robust against hadronization, radiation and detector effects.
The results of our paper merit further investigation, including implementation in future experimental analyses. We furthermore anticipate other useful applications that may result

A Matrix elements for V h production
In this appendix, we collect the matrix elements for $Vh$ production at leading order, including the effects of the higher-dimensional operators described in section 1.
We evaluate the squared matrix element for the Feynman diagram shown in figure 17. We do not consider the decay of the Higgs boson, since we assume it to be spin zero, and therefore its decay products will not carry information about the $hWW$ vertex. We use the following notation to identify the parts of the matrix element that are proportional to each of the anomalous couplings of eqs. (1.2, 1.3).
where $M_i$, $i = \{a_W, b_{W1}, b_{W2}, c_W\}$, are the matrix elements generated from the coupling with coefficient $i$. In keeping with the philosophy of the effective field theory approach, we keep only terms which are at most linear in the BSM couplings (constituting the interference of the BSM physics with the SM). Quadratic terms would necessitate the inclusion also of dimension-eight operators. The results are:
Here $\epsilon\{k_1, k_2, p_1, p_2\} = \epsilon^{\mu\nu\rho\sigma} k_{1\mu} k_{2\nu} p_{1\rho} p_{2\sigma}$, with $\epsilon^{\mu\nu\rho\sigma}$ the Levi-Civita tensor, and the remaining factor corresponds to the propagators for the W bosons.