Fingerprinting Higgs Suspects at the LHC

We outline a method for characterizing deviations from the properties of a Standard Model (SM) Higgs boson. We apply it to current data in order to determine to what degree the SM Higgs boson interpretation is consistent with experiment. We find that the SM Higgs boson is consistent with the current data set at the 82 % confidence level, based on the excess events reported by CMS and ATLAS, which are interpreted to be related to the mass scale m_h = 124-126 GeV, and on published CL_s exclusion regions. We perform a global fit in terms of two parameters characterizing the deviation from the SM value in the gauge and fermion couplings of a Higgs boson. We find two minima in the global fit and identify observables that can remove this degeneracy. An update for Moriond 2012 data is included in the Appendix, which finds that the SM Higgs boson is now consistent with the current data set at only the 94 % confidence level (which corresponds to ~ 2 sigma tension compared to the best fit point).

Recently, ATLAS and CMS have reported significant exclusion limits [...] hjj → γγjj channel at CMS [11], which would correspond to a SM Higgs boson produced through vector boson fusion. We will use these experimental results in this paper to determine to what degree current data is selecting a SM Higgs boson or not.
These experimental hints, along with the lack of any clear evidence of new states discovered to date at the LHC, suggest that an effective theory of the EWSB sector, including a light scalar resonance and the approximate symmetries of SU(2)_c as well as MFV, is currently an appropriate description of the data. We will examine recent results from the LHC in this framework and ascertain to what degree experiment is selecting a SM Higgs doublet at present. We perform a broad analysis of this question in Sections II and III in terms of two parameters characterizing the deviation from the SM value in the gauge and fermion couplings of a Higgs boson. We find two nearly degenerate minima in the global fit. The effective Lagrangian we employ emerges naturally in composite Higgs scenarios. We comment on their interpretation in terms of current data.
The LHC results should be interpreted in the context of the indirect evidence for the SM Higgs and its properties in EWPD. We examine the consistency of the results of our global fit to LHC Higgs-like data and EWPD in Section IV.
Finally, in Section V, we discuss how future measurements can be presented in a manner that can more precisely and efficiently refine the understanding of such an effective theory, with the aim of clarifying whether the correct description of EWSB is the SM Higgs mechanism or not. In particular we point out the utility of ratios of best fit signal strengths, which can be provided by the experimental collaborations and which have the ability to resolve the degeneracy in the two best fit regions we find when examining LHC Higgs-like data. This is due to the fact that, although an individual experimental signal can be faked by a Higgs-like degree of freedom, it is difficult for such a degree of freedom to simultaneously reproduce the Higgs predictions in other channels with different dependencies on effective couplings.

II. THE EFFECTIVE THEORY
We consider an effective Lagrangian with a light scalar resonance, denoted as h in the following, that includes the Goldstone bosons associated with the breaking of SU(2) × U(1)_Y → U(1)_Q and the SM field content. A minimal description of these degrees of freedom is given by an effective chiral Lagrangian with a nonlinear realization of the SU(2) × U(1)_Y symmetry. The Goldstone bosons eaten by the W±, Z bosons are denoted by π^a, where a = 1, 2, 3, and are grouped as Σ(x) = exp(i σ_a π^a / v), with σ_a the Pauli matrices and v = 246 GeV. The Σ(x) field transforms linearly under SU(2)_L × SU(2)_R as Σ(x) → L Σ(x) R†, where L, R indicate the transformation on the left and right under SU(2)_L and SU(2)_R, respectively, while SU(2)_c is the diagonal subgroup of SU(2)_L × SU(2)_R, under which the scalar resonance h transforms as a singlet. A derivative expansion of such a theory, Eq. (2), is given in Refs. [12][13][14], whose composite-model notation we adopt for later convenience.
Minimal Flavour Violation dictates that the sole source of flavour violation is the Yukawa couplings, that h is a singlet in flavour space, and that its couplings to fermions (c, c_2, ...) are flavour-universal. (Note that MFV is also compatible with c_i proportional to a combination of the SM Yukawa matrices.) We adopt this assumption, although we note that the LHC data is at present essentially only sensitive to flavour violation linked to the large top coupling (scaled by c from its SM value).
R is weakly gauged in such theories. We will fit the current data including the leading linear Higgs-like coupling effects in the Lagrangian of Eq. (2).
We will neglect, however, the dimension five operators h G_µν G^µν, h W_µν W^µν, h B_µν B^µν and other higher dimension operators in the fit, focusing on the effect of a and c. (See Ref. [15] for a discussion of the impact of these operators on Higgs production.) This can be justified by UV model building, for example in the composite Higgs case. In general, the coefficients a, b, b_3, c, c_2, ... are arbitrary numerical parameters subject to experimental constraints. Model building in the UV of this effective theory fixes relations between the parameters.
The completion of the theory with a SM Higgs boson fixes a = b = c = d_3 = d_4 = 1 and b_3 = c_2 = 0, and higher order terms in the polynomial expansion of the h field vanish. In this case h becomes part of a linear multiplet. [...] The fermiophobic scenario can be interpreted to correspond to the singular point where c = 0, a = 1, while a more general interpretation of this hypothesis is c = 0 with a unfixed, but constrained by current data.

III. FITTING ENHANCED CROSS SECTIONS
The experimental hints for the Higgs boson that have been reported to date provide, in each channel individually, only marginal evidence. The best fit values of the signal strengths µ = (σ × Br)/(σ × Br)_SM for particular Higgs masses are summarized in Table I.
In Table I we also give in the second column the expected mass sensitivity of the various experimental signatures, and in the third column the best fit value for µ together with the 1σ experimental error and the 95% confidence level limit. If the mass resolution is not explicitly quoted, as in the pp → ZZ* → ℓ+ℓ−ℓ+ℓ− and the pp → WW* → ℓ+νℓ−ν̄ results given by ATLAS, we display as an estimate the mass sensitivity for these channels as quoted by CMS. Since the mass sensitivity is such that the best fit masses can reasonably be interpreted to overlap, we do not introduce a correction factor in the fit to shift to a common mass value. We also display, in column two, the local significance when given by the experiments. In column four we schematically list the leading sensitivity of the signal in terms of the [...] forming a global fit for a particular mass value m_h ≈ 124 GeV. The CMS photon measurements are split up in accordance with Ref. [10]. The γγ events are classified by the conversion of the photon in the crystal, which defines a parameter R_9, and by their location in the detector, either endcap (e) or barrel (b). The data we fit to also have associated exclusion curves. We take these exclusions into account by another procedure described in the text. Note that the τ+τ− searches at ATLAS are included in the exclusion analysis but are not fit to in the signal strength best fit, as the corresponding experimental error is not available.
The excesses of events with approximately the same mass scale in various channels are suggestive of a resonance that could be interpreted as evidence of a light Higgs boson. We will assume that these excesses of events correspond to the same underlying physics and fit the data to discern to what degree the excesses are consistent with a SM Higgs boson interpretation. Our procedure to perform a global fit to the current data is as follows.
First we fit to reported values of µ i including the deviations in the SM predictions by allowing the parameters a and c to deviate from their SM values of 1. We include in the fits the effects of modified production cross sections and branching ratios due to the rescaling of the SM couplings by the parameters a and c. In order to carry out these fits we are required to make a set of assumptions, which are summarized and discussed in the Appendix. We will illustrate the sensitivity of the fit to the various assumptions by varying them in the results presented in Section III A.
For example, consider the case of the event yield used to construct each µ_i for pp → γγ, which in this discussion we will assume is only produced through gluon fusion. The SM prediction for the number of such Higgs boson events with an integrated luminosity ∫L dt can be schematically written as the gluon fusion cross section times the h → γγ branching ratio, integrated over the luminosity and over the phase space allowed by the selection cuts. We will assume that the effects of the coefficients a, c are to simply rescale the number of events in the various signal cross sections. We neglect small shape differences in the differential distributions that could affect the event yield as (a, c) deviate from (1, 1). Then the integration over luminosity with phase space cuts will essentially cancel in the constructed theoretical prediction for the ratio µ_i. The sole effect of the deviation from (1, 1) for (a, c) will be to generate an excess/suppression of events compared to the SM prediction. This can be directly fit to the reported experimental best fit value for this ratio. This is the procedure we will adopt.
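The rescaling just described can be illustrated with a minimal sketch for gg → h → γγ. It assumes leading-order coupling dependence only; the branching ratios are rounded values for m_h near 125 GeV and the loop coefficients 1.28 and 0.28 are the commonly quoted approximate W and top contributions to h → γγ. These numbers and all function names are illustrative assumptions, not the inputs used in the fit.

```python
# Approximate SM branching ratios for m_h near 125 GeV (rounded, illustrative;
# assumed here, not the inputs used in the paper's fit).
BR_SM = {
    "bb": 0.577, "WW": 0.215, "gg": 0.086, "tautau": 0.063,
    "cc": 0.029, "ZZ": 0.026, "gamgam": 0.0023,
}


def width_rescaling(a, c):
    """Leading-order ratio Gamma(a, c) / Gamma_SM for each decay mode."""
    return {
        "bb": c ** 2, "tautau": c ** 2, "cc": c ** 2,
        "gg": c ** 2,                      # top loop only
        "WW": a ** 2, "ZZ": a ** 2,
        # W loop and top loop interfere destructively in the SM;
        # the coefficients are approximate (assumption).
        "gamgam": (1.28 * a - 0.28 * c) ** 2,
    }


def total_width_rescaling(a, c):
    """Ratio of total widths, weighting each mode by its SM branching ratio."""
    scale = width_rescaling(a, c)
    return sum(BR_SM[m] * scale[m] for m in BR_SM) / sum(BR_SM.values())


def mu_ggf_gamgam(a, c):
    """Signal strength for gg -> h -> gamma gamma: production times Br ratio."""
    production = c ** 2                    # gluon fusion through the top loop
    br_ratio = width_rescaling(a, c)["gamgam"] / total_width_rescaling(a, c)
    return production * br_ratio


if __name__ == "__main__":
    for point in [(1.0, 1.0), (1.0, 0.8), (1.0, -0.8)]:
        print(point, round(mu_ggf_gamgam(*point), 3))
```

The luminosity and cut acceptance never appear because, under the assumption stated in the text, they cancel in the ratio to the SM prediction.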
We construct a χ² measure for a two parameter fit in the following way. We define the matrix C as the covariance matrix of the observables, and ∆θ_i as a vector of the difference between the observed and predicted value of the ratio, as a function of (a, c). The χ² measure is then given by χ²(a, c) = Σ_ij ∆θ_i (C⁻¹)_ij ∆θ_j. The minimum χ²_min is determined, and the 65%, 90% and 99% best fit CL regions are given by ∆χ² < 2.1, 4.61, 9.21, respectively, for χ² = χ²_min + ∆χ². The confidence level regions are defined by the cumulative distribution function for a two parameter fit.
The matrix C is taken to be diagonal, with the 1σ theory and experimental errors added in quadrature for each observable in the diagonal element. As correlation coefficients are currently not supplied by the experimental collaborations, off-diagonal correlation coefficients are neglected. For the experimental errors we use the quoted 1σ errors on the reported signal strength. The errors δµ_i for the individual channels i are made symmetric.
For theory predictions of the cross section values and related errors, we use the numbers given on the webpage of the LHC Higgs Cross Section Working Group [30] for m_h = 124 GeV and √s = 7 TeV. We symmetrize the total cross section error and propagate the error to get a theory error on µ: with r_i(a, c) the appropriate rescaling factor for each cross section σ_i, and δσ_i the error on each σ_i, the error δµ follows by error propagation.
For each search channel, ATLAS and CMS provide [10, 11, 26, 29] an exclusion upper limit µ^i_L on each 'signal strength', so that µ_i > µ^i_L is excluded at 95% C.L. This value is reported in Table I. The final exclusion limit µ_L from combining all channels is also provided as a function of the Higgs mass, with a SM Higgs boson mass being excluded whenever µ_L < 1.
We obtain such combined limits by a simple χ² procedure, solving Eq. (8) for µ_L, where µ̂ is the average of the individual µ̂_i, the measured signal strengths for each channel (which make µ^i_L larger when they are nonzero; see footnote 6). When there are no excesses (or for the purpose of calculating expected limits) one simply sets µ̂_i = 0, in which case our simple recipe combines in quadrature the limits from different channels. Again we neglect correlations in the measured limits on the signal strengths. In applying this procedure to our case, note that we have mapped the reported CL_s exclusion curve as [...] Table I (plus the ATLAS h → τ+τ− analysis).
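For the no-excess case, the quadrature combination can be sketched in a few lines. The sketch assumes that "in quadrature" means the inverse squared limits add, as expected for independent Gaussian channels centred at zero signal. It does not implement the general formula with nonzero measured signal strengths, and the function name is hypothetical.

```python
import math


def combine_limits_no_excess(channel_limits):
    """Combine per-channel 95% CL upper limits on the signal strength.

    Assumes independent channels with no excess (all measured signal
    strengths set to zero), so that 1/mu_L^2 = sum_i 1/(mu_L^i)^2.
    """
    inverse_square_sum = sum(1.0 / limit ** 2 for limit in channel_limits)
    return 1.0 / math.sqrt(inverse_square_sum)


if __name__ == "__main__":
    # Toy per-channel limits, not taken from Table I.
    toy_limits = [2.0, 3.0, 4.5]
    combined = combine_limits_no_excess(toy_limits)
    print("combined limit:", combined)
    print("SM excluded at this mass:", combined < 1.0)
```

Under this assumption the combined limit is always stronger than the best single-channel limit, and two equally sensitive channels improve the limit by a factor of √2.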
The results of the fit are dependent on the SM values used for the masses of the known particles, gauge coupling constants in the SM, etc. We summarize the SM inputs, which we have used, in Appendix A.
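As a compact summary of the χ² ingredients of this section, the following sketch evaluates a diagonal-covariance χ² and the ∆χ² thresholds of a two parameter fit. It relies on the fact that the χ² cumulative distribution for two degrees of freedom is 1 − exp(−x/2). The function names and the brute-force grid scan are illustrative assumptions, not the minimization procedure actually used.

```python
import math


def delta_chi2_threshold(confidence_level):
    """Delta chi^2 enclosing `confidence_level` for a two parameter fit.

    For two degrees of freedom the chi^2 CDF is 1 - exp(-x / 2),
    so the threshold is -2 ln(1 - CL).
    """
    return -2.0 * math.log(1.0 - confidence_level)


def chi2_diagonal(mu_obs, mu_pred, sigma_exp, sigma_th):
    """chi^2 with a diagonal covariance matrix.

    Each diagonal element is the sum of the squared experimental and
    theory errors (errors added in quadrature); correlations are neglected.
    """
    total = 0.0
    for obs, pred, s_exp, s_th in zip(mu_obs, mu_pred, sigma_exp, sigma_th):
        variance = s_exp ** 2 + s_th ** 2
        total += (obs - pred) ** 2 / variance
    return total


def scan(chi2_of_point, a_values, c_values):
    """Brute-force grid scan; returns (chi2_min, a_best, c_best)."""
    return min((chi2_of_point(a, c), a, c) for a in a_values for c in c_values)


if __name__ == "__main__":
    for cl in (0.65, 0.90, 0.99):
        print(cl, round(delta_chi2_threshold(cl), 2))
```

The thresholds obtained this way can be compared directly with the values 2.1, 4.61 and 9.21 quoted in the text.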

A. Results
Using the above described procedure we perform a global fit to the currently available data, with results shown in Fig. 1. The global fit results are combined with the exclusion contours in Fig. 2 [...] 1), to a nearly degenerate point [...] in terms of (a, c, χ²_min). The SM point for comparison is (1, 1, 5.44). It is of interest to determine a means by which the best fit region degeneracy can be resolved. We discuss such an approach in Section V.
Footnote 6: We contrast this semi-empirical approximate formula with a more precise determination of the combined limit in Appendix B.
Note that in Figs. 1, 2 we present results which we label as 'inclusive' or 'gg only'. These fits differ in how the production cross sections are treated. For 'gg only', in the signals with γγ, τ+τ− final states we only rescale the dominant gluon fusion production channel, neglecting subdominant channels. For results labeled as 'inclusive', the following production channels are included for each signal: (ii) γγ production via gluon fusion, vector boson fusion (VBF), tth production, and associated production with W± and Z. (Note that the γγ events that have associated jets, interpreted to come from VBF, are treated exclusively.) (iii) bb production is summed over associated hW± and hZ production. (iv) τ+τ− production is summed over gluon fusion, vector boson fusion, tth production, and associated production with W± and Z. Previous analyses at CMS are reported to only include the VBF initial state, as the tagging jets are used to eliminate Drell-Yan Z → τ+τ− events. The updated results use a modelling of the Drell-Yan spectrum based on measurements of Drell-Yan produced Z → µ+µ− events, so that the updated analysis is more inclusive.
We do not find dramatic differences in the fits using the two different procedures, and thus are led to consider our approximations used in performing the rescalings to be satisfactory. See Appendix A for a detailed discussion.
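The difference between the 'gg only' and 'inclusive' treatments can be sketched as a cross-section-weighted rescaling of the production rate. The cross sections below are rounded values for √s = 7 TeV and m_h near 125 GeV, quoted from memory as illustrative inputs; they, the dictionary layout and the function name are assumptions, and acceptance differences between production modes are ignored.

```python
# Approximate SM production cross sections in pb at sqrt(s) = 7 TeV for
# m_h near 125 GeV (rounded, illustrative; not the inputs used in the fit).
SIGMA_SM = {"ggF": 15.3, "VBF": 1.21, "WH": 0.57, "ZH": 0.32, "ttH": 0.086}

# Which coupling rescales each production mode at leading order:
# "c" for fermion (top) couplings, "a" for gauge boson couplings.
COUPLING = {"ggF": "c", "VBF": "a", "WH": "a", "ZH": "a", "ttH": "c"}


def production_rescaling(a, c, modes):
    """sigma(a, c) / sigma_SM summed over the listed production modes."""
    factor = {"a": a ** 2, "c": c ** 2}
    rescaled = sum(SIGMA_SM[m] * factor[COUPLING[m]] for m in modes)
    return rescaled / sum(SIGMA_SM[m] for m in modes)


if __name__ == "__main__":
    a, c = 1.1, 0.8
    gg_only = production_rescaling(a, c, ["ggF"])
    inclusive = production_rescaling(a, c, ["ggF", "VBF", "WH", "ZH", "ttH"])
    bb_modes = production_rescaling(a, c, ["WH", "ZH"])
    print("gg only:", gg_only)
    print("inclusive gamma gamma / tau tau:", inclusive)
    print("associated production (bb):", bb_modes)
```

Because gluon fusion dominates the sum, the inclusive factor stays close to c² unless a and c differ strongly, which is in line with the mild differences found between the two procedures.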
As can be inferred from Fig. 1 and Fig. 2, the direct fit to the data and the exclusion curves are selecting the same region of parameter space. We also performed exclusive fits to the following subsets of data. We combined the bb and τ+τ− data for a test of the fermion couplings and combined the W+W−, ZZ data [...]
This effect manifests itself through the high energy behavior of the longitudinal degrees of freedom in high energy scattering still growing with energy, and in EWPD through loop diagrams involving the longitudinal degrees of freedom of the gauge bosons [33]. In EWPD, the corrections to the gauge boson propagators can be expressed in terms of shifts of the parameters S, T, U [34][35][36]. These shifts depend on a cutoff scale Λ, which approximately represents the mass of the new states that are required in this framework to unitarize longitudinal gauge boson scattering. For EWPD we use the results of the Gfitter collaboration [37]: S = 0.02 ± 0.11, T = 0.05 ± 0.12, U = 0.07 ± 0.12.
We also use the correlation coefficient matrix of S, T, U. We perform joint fits to LHC data and EWPD by adding the corresponding entries to an enlarged covariance matrix (including S, T, U in terms of a) in the global fit. Correlations among EWPD observables are included.
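To give a feeling for the size of the EWPD effect, the sketch below evaluates leading-logarithm shifts of S and T as a function of a. The explicit expressions are the forms commonly used in the literature for a scalar with rescaled gauge coupling, ∆S = −(1 − a²)/(6π) log(m_h/Λ) and ∆T = 3(1 − a²)/(8π cos²θ_W) log(m_h/Λ), with Λ = 4πv/√|1 − a²|. They are assumed here and are not transcribed from this text; the value of cos²θ_W is an approximate input.

```python
import math

V_EW = 246.0          # GeV, electroweak scale as quoted in the text
COS2_THETA_W = 0.77   # approximate cos^2 of the weak mixing angle (assumed)


def cutoff_scale(a):
    """Assumed cutoff Lambda = 4 pi v / sqrt(|1 - a^2|), in GeV.

    Returns infinity in the SM limit a^2 = 1, where no new states are needed.
    """
    gap = abs(1.0 - a * a)
    if gap == 0.0:
        return math.inf
    return 4.0 * math.pi * V_EW / math.sqrt(gap)


def delta_s(a, m_h=124.0):
    """Leading-log shift of S (assumed form); vanishes for a^2 = 1."""
    if a * a == 1.0:
        return 0.0
    log_term = math.log(m_h / cutoff_scale(a))
    return -(1.0 - a * a) / (6.0 * math.pi) * log_term


def delta_t(a, m_h=124.0):
    """Leading-log shift of T (assumed form); vanishes for a^2 = 1."""
    if a * a == 1.0:
        return 0.0
    log_term = math.log(m_h / cutoff_scale(a))
    return 3.0 * (1.0 - a * a) / (8.0 * math.pi * COS2_THETA_W) * log_term


if __name__ == "__main__":
    for a in (0.8, 0.9, 1.0, 1.1):
        print(a, cutoff_scale(a), delta_s(a), delta_t(a))
```

With these forms a < 1 pushes S up and T down, which is why EWPD constrain a more tightly than c in a joint fit.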
We then perform a new global minimization and joint fit. The results are given in [...] states, one expects the properties of the Higgs will change. Nevertheless, when approaching the data in effective field theory, the symmetries that are known to be at least approximately present at the weak scale are already highly constraining, as discussed in Section II. One is therefore led to the effective description of the data which we have adopted (see footnote 7), and global fits to precision Higgs data will allow the SM hypothesis of EWSB to be experimentally tested in the near future.
As the LHC will be running at [...] Fig. 5. We restrict ourselves to a subset of the channels used in this analysis. The prospects of reducing the degeneracy in the best fit regions rely mostly on further observations related to signal events in σ × BR(h → γγ). Here we have rescaled using the previously defined 'inclusive' rescaling and combined the production channels that contribute to the analyses of signal channels. Note that each individual channel has a mapping away from the SM point of (1, 1) in the (a, c) plane to a family of degenerate points. This leads to two best fit areas, which have to be resolved. This can be done by realizing that the mapping is not the same when comparing among different channels.
Experimental results can be presented in such a manner as to aid in this pursuit. When best fit signal strengths are presented for a common Higgs mass scale, correlation coefficients should be supplied as well, which will lead to more accurate fits. The effective theory approach makes clear that it is instructive to experimentally provide ratios of various best fit signal strengths, so that the parameters (a, c) can be more precisely determined. In particular, it would be helpful if the experimental collaborations provide results that allow the degeneracy in the two fit minima to be resolved. Ratios of effective signal strengths can help in resolving this degeneracy. In Fig. 6 we show the effective production contour ratios for two combinations of extracted best fit signal strengths in the (a, c) plane. In these ratios we have included superscripts for the various production channels to make explicit the ratios to be constructed. For example, σ^hZZ means the combination of the production channels discussed in Section III A that are included in ZZ signal events. These ratios are obviously 1 in the SM. It is important to note that comparing theoretical and experimental determinations of such ratios, which include sets of best fit signal strengths simultaneously, will allow the degeneracy of the best fit regions to be significantly reduced.
Footnote 7: The robustness of the requirement of exact MFV is less significant than SU(2)_c in the EWSB sector.
Such combinations can also be experimentally appealing when they allow systematic uncertainties to be cancelled, such as photon systematic uncertainties in µ^VBF_γγ / µ_γγ.
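How such ratios separate the two best fit regions can be seen in a simplified sketch. It assumes leading-order rescalings, a single production mode per signal strength, and the approximate h → γγ loop coefficients 1.28 and 0.28. The two ratios below are chosen for illustration and are not necessarily those shown in Fig. 6.

```python
def gamgam_width_rescaling(a, c):
    """Gamma(h -> gamma gamma) / SM with approximate loop coefficients."""
    return (1.28 * a - 0.28 * c) ** 2


def ratio_gamgam_over_zz(a, c):
    """mu_gamgam / mu_ZZ for a common production mode.

    The production factor and the total width cancel, leaving only the
    ratio of partial width rescalings, which is sensitive to the sign of c.
    """
    return gamgam_width_rescaling(a, c) / a ** 2


def ratio_vbf_over_ggf(a, c):
    """mu^VBF / mu^ggF for a common final state.

    The branching ratio (and final state systematics) cancel, leaving
    a^2 / c^2, which is blind to the sign of c.
    """
    return a ** 2 / c ** 2


if __name__ == "__main__":
    for point in [(1.0, 1.0), (1.0, -1.0), (1.1, -0.7)]:
        print(point, ratio_gamgam_over_zz(*point), ratio_vbf_over_ggf(*point))
```

In this simplified picture a point with c < 0 can mimic the SM in one ratio while differing clearly in the other, which is the sense in which different channels map the (a, c) plane differently.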

VI. CONCLUSIONS
We have examined the current LHC data in an effective theory to determine to what degree the SM Higgs hypothesis is emerging from the data. To this end we have performed global fits of best fit signal strengths and exclusion regions, taking into account current data. The SM Higgs hypothesis turns out to be consistent with the data at the 82 % CL. In our global fits we find that there are two best fit regions. We have determined experimentally accessible ratios of best fit signal strengths for a specific Higgs mass value that will allow the degeneracy in the best fit regions to be significantly reduced with sufficient data collected at 8 TeV c.m. energy.

Appendix A: Fitting Assumptions
In order to perform the fit we have made several assumptions, some of which can clearly be relaxed with more input from experimental collaborations. In this Appendix we discuss these assumptions in more detail.
(i) We have assumed that the excess events reported in Table I, which have various best fit mass values, correspond to the same underlying physics, which we are assuming can be characterized by the Lagrangian in Eq. (2). We have argued that symmetry considerations lead us to consider this Lagrangian, but the degree to which the reported best fit values of µ_i can be associated with a particular mass scale is less clear. The mass resolution in all the various channels is larger than the spread of the Higgs masses listed in Table I. [...] It is reasonable, however, to consider such effects to support our association of the excess with a common mass scale, which we choose to be 124 GeV.
(ii) We have assumed that the given best fit values of µ_i in the various channels are uncorrelated with one another, neglecting both theory and experimental correlations. This is due to the lack of correlations reported by the experimental collaborations. This assumption is the easiest to address with further experimental input. Our fitting procedure can be easily modified to include such correlation coefficients as off-diagonal elements in the covariance matrix. The signals naively expected to be most strongly correlated are the γγ events. We have studied the effects of correlations on the fit by introducing pseudo-correlations through a correlation coefficient of size 0.5 between all γγ data. The fit results are robust against such pseudo-correlations, and we still find two best fit regions covering a similar parameter space as without correlations taken into account. We have also examined the robustness of the fit results against a set of other pseudo-correlations randomly chosen and assigned as off-diagonal elements in the covariance matrix. Again we find robust fit results when globally fitting the data.
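The pseudo-correlation test of item (ii) can be sketched as follows: a covariance matrix is built with a common correlation coefficient among a chosen subset of channels, and the χ² is evaluated with the full matrix. The helper names are hypothetical, the inputs are toy numbers rather than the data of Table I, and a small Gaussian elimination replaces a linear algebra library to keep the example self-contained.

```python
def covariance_with_pseudo_correlation(sigmas, correlated, rho=0.5):
    """Covariance matrix with coefficient `rho` among the `correlated` indices."""
    n = len(sigmas)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cov[i][i] = sigmas[i] ** 2
    for i in correlated:
        for j in correlated:
            if i != j:
                cov[i][j] = rho * sigmas[i] * sigmas[j]
    return cov


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        partial = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def chi2(delta, cov):
    """chi^2 = delta^T C^{-1} delta for a general covariance matrix."""
    weighted = solve(cov, delta)
    return sum(d * w for d, w in zip(delta, weighted))


if __name__ == "__main__":
    sigmas = [1.0, 1.0, 2.0]          # toy errors; channels 0 and 1 play the diphoton role
    delta = [1.0, 1.0, 0.0]           # toy observed minus predicted values
    uncorrelated = covariance_with_pseudo_correlation(sigmas, [], 0.0)
    correlated = covariance_with_pseudo_correlation(sigmas, [0, 1], 0.5)
    print(chi2(delta, uncorrelated), chi2(delta, correlated))
```

Repeating the (a, c) scan with both matrices and comparing the resulting confidence regions is one way to check the robustness described above.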
(iii) We have assumed that the effect of rescaling the cross section and the branching ratios can be directly associated with a rescaling of µ i , neglecting the effect that rescaling the various channels modifies the differential distributions. This assumption can fail in the presence of higher dimensional operators. The modification of the differential distributions will affect µ i due to experimental selection cuts modifying the shape of the differential distributions. However, we consider this effect to be subdominant to the effect we have incorporated by fitting to an unfixed (a, c) with production cross sections and branching ratios modified accordingly. Furthermore, we rescaled the various branching ratios according to the couplings included in the leading order formulae of the decay widths. This procedure is consistent when including only QCD corrections to the decay widths, however when including higher order EW corrections further effects due to a, c differing from 1 are neglected. Again, this effect is subdominant to the effects that we have retained.
(iv) We have also neglected, in using µ_i, that this result is determined assuming the SM in the combination of sub-channels, leading to the reported µ_i ratio. We also expect this effect to be subdominant to the effects that we have retained. This is in particular the case if the sub-channel combination is dominated by a particular final state. The analyses that use sub-channel combinations are the WW, ZZ, ττ channels. We have tested the robustness of the fit by fitting with two procedures shown in [...] Table I to the value in [...] Table II. On the right we have also added the ATLAS ττ and Tevatron data on pp → bb and pp → W+W− as shown in Table II.
In this section we present updated results including the data presented at Moriond 2012 [40]. The most significant change in the data that was used in Table I is an update to the ATLAS measurement of pp → WW → ℓ+νℓ−ν̄. In addition, ATLAS reported best fit signal strengths in the pp → bb and pp → ττ channels, while CDF and DØ reported a broad excess in pp → bb events. Further, CMS has now also supplied best fit signal strengths as a function of m_h, allowing various Higgs mass hypotheses to be fit to. In this Appendix we include these experimental results in our fit and supply supplementary plots for various Higgs masses (refining also our determination of the 95% C.L. exclusion limits).
The updated data that we supplement Table I with is given in Table II. Due to an apparent inconsistency between the ATLAS best fit signal strength plot for pp → bb and the corresponding ATLAS CL_s limit plot, we do not use the bb best fit signal strength value in the combined fit. We show in Fig. 7 the effect of the Moriond 2012 data on our previously reported fit results.
We also show joint fits for the Higgs mass values m_h = 119.5, 124, 125 GeV, where we have taken the experimentally reported µ̂ and the corresponding theory predictions at a common m_h, made possible by the release of the required data by CMS after version one of this paper. The data we use is given in Table III [...] Table III and its caption. The red dashed line is the ATLAS exclusion limit as described in the text, the blue solid line is the CMS limit and the combined CMS and ATLAS limit is included as a black solid line.
[...] degree of contamination of this signal with gg initial state Higgs events to enable a consistent treatment of the reported best fit signal strengths. One can demonstrate how the VBF signal interpolates between the two results shown in Fig. 8 by adding in contamination due to σ(gg → h) events with our consistent rescaling procedure. One finds the series of plots shown in Fig. 9 for various degrees of contamination. Note that CMS also reports W+W− jj events, which in principle offer a similar discrimination of the parameter space as the γγ jj signal. However, contamination due to σ(gg → h) events will again exist and is not reported by the collaborations. We do not use this data as a separate channel at present. The update to the data has a small effect on the CL of the SM Higgs hypothesis compared to the best fit value of the current data.
Assuming no contamination due to gg for VBF events one finds that our previously reported global fit with the Moriond 2012 data update (but without correcting to a single Higgs mass value in the experimental best fit signal strengths) has the SM hypothesis residing on a 94 % CL curve around the best fit value of (a, c).
Assuming a 3% contamination of the VBF events due to gg, the SM hypothesis remains consistent with the data at 93 % CL compared to the best fit value. For a direct comparison, the fermiophobic scenario with a = 1, c = 0 is consistent with the data at 96 % CL for the same global fit. When a 3% contamination due to gg events for the VBF diphoton signal of CMS is assumed, the same fermiophobic scenario is consistent with the data at the 88 % CL. We do not consider a fermiophobic scenario to be favoured by the global data or the pattern of deviations from the SM in the current data set. With such marginal signal events, statistical fluctuations in the data are still present and affect the pattern of deviations.
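The role of the gg contamination can be illustrated with a one-line mixing formula for the production rescaling of a VBF-tagged sample. This sketch assumes the contamination is the fraction of SM-expected tagged events that originate from gluon fusion, and it covers the production factor only; the decay rescaling would multiply it. The function name and this reading of "contamination" are assumptions.

```python
def vbf_tagged_production_rescaling(a, c, gg_fraction):
    """Production rescaling of a VBF-tagged sample with gg contamination.

    `gg_fraction` is the assumed fraction of SM-expected tagged events that
    come from gluon fusion (rescaled by c^2); the rest is genuine VBF
    (rescaled by a^2).
    """
    return (1.0 - gg_fraction) * a ** 2 + gg_fraction * c ** 2


if __name__ == "__main__":
    for fraction in (0.0, 0.03, 0.10, 0.30):
        sm_like = vbf_tagged_production_rescaling(1.0, 1.0, fraction)
        fermiophobic = vbf_tagged_production_rescaling(1.0, 0.0, fraction)
        print(fraction, sm_like, fermiophobic)
```

The SM point is unaffected by the mixing, while points with a² ≠ c² move with the contamination fraction, which is why the preferred region in the (a, c) plane shifts between the contaminated and uncontaminated fits.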
Finally, the 95% CL exclusion curves from ATLAS and CMS in the plots of this Appendix have been determined using a more precise method than Eq. (8). In the same spirit as Ref. [39], for each individual search channel i, we have first approximated the corresponding probability density function of the signal strength parameter µ by a Gaussian p_i(µ) ∝ exp[−(µ − µ̄_i)²/(2σ²_obs,i)]. We have obtained the quantities µ̄_i and σ_obs,i trying to get the best approximation to the reported 95% CL limit in that channel, µ^i_L (obtained from the equation ∫_0^{µ^i_L} p_i(µ) dµ = 0.95). In the case of CMS channels, we use the approximation of Ref. [39] (which uses σ_obs,i ≃ σ_exp,i = µ^i_L,exp/1.96 and obtains µ̄_i by solving the equation that determines µ^i_L). In the case of ATLAS data, we find better agreement with the reported limits by directly using µ̄_i = µ̂_i and σ_obs,i as provided [footnote 8]. To illustrate the precision of our approximations to the exclusion limits we compare them with the official 95% CL limit on σ/σ_SM in Fig. 10. In the left (right) panel we show the CMS (ATLAS) limit, with the official curve in black. The red curve is the simple approximation of Eq. (8) and the green curves are more precise determinations of the limit as explained above. The dashed green line corresponds to an approximate determination of µ̄_i and σ_obs,i as in Ref. [39]. The solid green line (only