A twisted tale of the transverse-mass tail

We propose a tantalizing possibility that misinterpretation of the reconstructed missing momentum may have yielded the observed discrepancies among measurements of the $W$-mass in different collider experiments. We introduce a proof-of-principle scenario characterized by a new physics particle, which can be produced in association with the $W$-boson in hadron collisions and contributes to the net missing momentum observed in a detector. We show that these exotic events pass the selection criteria imposed by various collaborations at reasonably high rates. Consequently, in the presence of even a handful of these events, a fit based on the ansatz that the missing momentum is primarily due to neutrinos (as happens in the Standard Model) yields a $W$-boson mass that differs from its true value. Moreover, the best-fit mass depends on the nature of the collider and the center-of-mass energy of collisions. We construct a barebones model that demonstrates this possibility quantitatively while satisfying current constraints. Interestingly, we find that the nature of the new physics particle and its interactions appear as a variation of the physics of Axion-like particles after a field redefinition.


Introduction
It has been over a decade since the discovery of the Higgs boson at the Large Hadron Collider (LHC). Unfortunately, the large set of searches designed to look for traces of physics beyond the Standard Model (BSM) of particle physics has returned empty-handed, without any definitive signature of new physics (NP). More importantly, all searches that we have designed in order to discover the "well-motivated" models, which are constructed to address a host of issues ranging from naturalness to dark matter, have only yielded exclusion plots (see, e.g., [1–13]). On the other hand, there have been exciting but scattered hints of NP emerging from the intensity frontier and from cosmological measurements (see, e.g., [14–17]). Not surprisingly, considerable effort has gone into interpreting these anomalies in terms of BSM physics.
The recent (and the most precise) measurement of the $W$-boson mass ($M_W$) by the CDF collaboration at the Tevatron [18] has given rise to much excitement. After analyzing 8.8 fb$^{-1}$ of data, the collaboration finds the $M_W$ measurement to be in tension with all previous direct and indirect measurements in a statistically significant way*:

$$
\begin{aligned}
M_W &= 80.4335 \pm 0.0094~\text{GeV} &&: \text{CDF [18]}, \\
M_W &= 80.360 \pm 0.016~\text{GeV} &&: \text{ATLAS [19]}, \\
M_W &= 80.354 \pm 0.032~\text{GeV} &&: \text{LHCb [20]}, \\
M_W &= 80.375 \pm 0.023~\text{GeV} &&: \text{D0 [21]}, \\
M_W &= 80.3545 \pm 0.0057~\text{GeV} &&: \text{Precision Electroweak [22]}.
\end{aligned}
\tag{1.1}
$$

At this juncture, one can attribute the discrepancy between the CDF and the other measurements (the ones given above and also various LEP results [23–26]) to underestimated or unaccounted-for systematics, and simply wait for future measurements from the LHC before speculating over possible BSM implications. However, we take a contrasting viewpoint and attempt to find an interpretation where the existing tension between these measurements (both direct and indirect) can be reduced.
There have been multiple proposals exploring a plethora of BSM solutions (see, e.g., ) along with various proposed corrections to electroweak (EW) precision observables (see [54–63] and references therein) to address the discrepancy. The underlying theme of all these attempts is to introduce NP that modifies precision EW observables such that the precision fit of $M_W$ becomes compatible with the CDF measurement; this approach therefore ignores all other direct measurements.
The discrepancy between the CDF measurement of $M_W$ and that predicted by EW precision fits may in itself be taken to be a hint for BSM physics. However, when compared with the other experimental measurements by ATLAS, LHCb, and LEP, which are consistent with each other and with the EW fit at 1σ, the implications of the CDF result become much more nuanced. In this work, we attempt to address the question of whether one can reconcile the CDF value of $M_W$ not just with the EW precision fit but also with measurements from other colliders. In particular, we ask whether an NP interpretation exists where $M_W$ remains the same as in the precision EW fit, but its measurements at different colliders yield differing values.
Remarkably, we do find such a scenario. The all-important observation that allows us to reconcile these different measurements is that precise $M_W$ measurements rely on leptonic decays of the $W$, which give rise to neutrinos in the final state. Since the exact reconstruction of the $W$ four-vector is not possible, experimental collaborations use various kinematic variables sensitive to the $W$-boson mass, the most important of which is the transverse mass, $M_T$. It is defined using only the transverse components of the lepton momentum ($\vec{p}_T^{\,\ell}$) and the missing transverse momentum (namely, $\vec{p}_T^{\,\rm miss}$):

$$
M_T^2 = 2\left(p_T^{\ell}\, p_T^{\rm miss} - \vec{p}_T^{\,\ell}\cdot\vec{p}_T^{\,\rm miss}\right), \qquad p_T^{\ell} = |\vec{p}_T^{\,\ell}|, \quad p_T^{\rm miss} = |\vec{p}_T^{\,\rm miss}|.
\tag{1.2}
$$
Note that, if the missing momentum is entirely due to the neutrino from $W$ decay, the transverse mass shows a kinematic endpoint, $M_T \le M_W$. Even though smearing, energy mismeasurements, and hadronic activity (especially in proton colliders) in the event soften the kinematic edge, a precise extraction of $M_W$ is possible after taking various systematics into consideration, with the underlying assumption that the missing momentum is mostly due to the neutrino from $W$-decay. We find that breaking this assumption slightly gives us the desired result. If NP gives rise to events where a $W$ is produced along with a BSM invisible state, say $\Phi$, the missing momentum observed in these events becomes larger than the neutrino transverse momentum. Using the definition of Equation (1.2), it is a straightforward exercise to show that

[Equation (1.3)]

Therefore, if these events pass the event selection criteria designed by the experiments, one expects more events in the tail of the $M_T$ distribution. We intuit that if this entire set of events, i.e., Standard Model (SM) single-$W$ events + SM background events + NP events, is fitted with the SM-only hypothesis to find the $W$-mass, one inadvertently obtains a best fit slightly larger than the true $M_W$. The working principle in our framework is therefore rather simple: (i) we need a light NP particle, $\Phi$, which decays mostly to the dark sector (or is sufficiently long-lived), so that it gives rise to missing momentum in the detector; and (ii) we need an irrelevant operator that allows for the production $p + p(\bar{p}) \to W + \Phi$. In this paper, we show that such a naive set-up reconciles the CDF measurement of $M_W$ with the precision electroweak measurement on one hand, and with results from LEP, ATLAS, and LHCb on the other. We take $\Phi$ to be a real scalar (SM gauge singlet) and invoke the following operator

[Equation (1.4)]

where $\kappa$ is a dimensionless complex coupling constant, $\Lambda$ is the scale of the irrelevant operator, and $g_w$ is the weak coupling constant. Apart from this, we also assume that $\Phi$ decays mostly to the dark sector. Even though Equation (1.4) implies a non-zero width of $\Phi$ to SM states (if allowed by kinematics), this width would be phase-space and $m_\Phi^2/\Lambda_{\rm eff}^2$ suppressed, where $\Lambda_{\rm eff} = \Lambda/|\kappa|$ is the effective scale of the operator. Consequently, the fractional width of $\Phi$ to SM states can be made negligible by assuming marginal coupling of $\Phi$ to the dark sector. The physics of the $M_W$ measurement is, however, independent of the details of such couplings and, therefore, we do not present any explicit model of dark-$\Phi$ interactions.
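The kinematic argument above can be illustrated with a toy calculation. The sketch below assumes the standard transverse-mass definition, $M_T^2 = 2(p_T^\ell p_T^{\rm miss} - \vec{p}_T^{\,\ell}\cdot\vec{p}_T^{\,\rm miss})$, and a hypothetical event in which an invisible $\Phi$ adds 10 GeV of transverse momentum along the neutrino direction. The function name and the numbers are illustrative; the toy ignores the recoil of the $W$ against $\Phi$ and all detector effects.

```python
import math

def transverse_mass(lep_pt, lep_phi, miss_px, miss_py):
    """M_T (GeV) from the lepton pT/phi and the net missing transverse momentum."""
    miss_pt = math.hypot(miss_px, miss_py)
    # Dot product of the lepton and missing transverse momentum vectors.
    dot = lep_pt * (math.cos(lep_phi) * miss_px + math.sin(lep_phi) * miss_py)
    return math.sqrt(max(0.0, 2.0 * (lep_pt * miss_pt - dot)))

# Toy SM event: W at rest in the transverse plane, lepton and neutrino back to back.
M_W = 80.3545            # GeV, precision electroweak value quoted in the text
pt = M_W / 2.0
nu_px, nu_py = -pt, 0.0  # neutrino opposite to the lepton (lepton phi = 0)
mt_sm = transverse_mass(pt, 0.0, nu_px, nu_py)

# Hypothetical invisible Phi carrying 10 GeV of pT along the neutrino direction.
phi_px, phi_py = -10.0, 0.0
mt_np = transverse_mass(pt, 0.0, nu_px + phi_px, nu_py + phi_py)

print(f"M_T (neutrino only) = {mt_sm:.2f} GeV")
print(f"M_T (neutrino + Phi) = {mt_np:.2f} GeV")
```

In this configuration the neutrino-only event sits exactly at the endpoint $M_T = M_W$, while the extra invisible momentum pushes the reconstructed $M_T$ beyond it, which is the tail population the text refers to.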
As we show next, this minimal and naive set-up is sufficient for the purpose of resolving the discrepancies observed around the $W$-mass measurement. To support our claim through quantitative statements, we use simulations, the details of which we provide in Section 2. In Section 3 we obtain the range of the effective scale of our operator, $\Lambda_{\rm eff}$, which is compatible with all direct measurements along with the EW precision fit value. After determining $\Lambda_{\rm eff}$, in Section 4, we discuss observables that may constrain the existence of $\Phi$ and its interaction in Equation (1.4). Once we obtain the allowed space for NP consistent with all the observables discussed, we make predictions for future $M_W$ measurements at the LHC (at 13 TeV and with an integrated luminosity of 500 fb$^{-1}$). Later, in Section 5, we dig into the origin of the crucial operator in Equation (1.4), which compels us to consider questions regarding aspects of EW symmetry. We provide several scenarios which allow us to address these questions. It is outside the scope of this work to give a complete classification of all possible models of ultraviolet (UV) physics that may lead to our effective theory of the $W$-mass anomaly and to study the phenomenological consequences of all these different classes. We leave these for future endeavors. We conclude in Section 6.

Simulation Details
In this section, we support our claim through quantitative statements, for which we perform simulations relevant to the $M_W$ measurements at CDF, at ATLAS, and at LHCb (as given in Equation (1.1)). The task of calculating the effect of Equation (1.4) on the determination of $M_W$ requires a careful understanding and reproduction of the analyses performed by each of these collaborations. This task is rather difficult (especially in the context of Tevatron analyses) since efficient and vetted fast simulators for CDF or D0 are not readily available. This implies that it is simply not feasible to fit the "observed data" to determine the Wilson coefficient in Equation (1.4). In this work, we therefore take an alternate approach. We use the range of $M_W$ (as reported by the corresponding experimental collaborations) that best represents the observed data to determine the allowed strength of the operator in Equation (1.4).
From our analysis, it is clear that, by construction, a sizable fraction of NP events pass the set of cuts, which are designed to select a pure sample of SM $W$ events in any of these experiments. We find that the number of such NP events depends on the center-of-mass energy of collisions and the cuts themselves. Therefore, the shift in the fitted $M_W$ from its true value critically depends on the specifics of the analysis.
Even though the details of the exact procedures we employ for different measurements in Equation (1.1) are far removed from each other, here we summarize the steps that characterize all these studies.
• In this work, we choose the true mass of the $W$-boson (denoted by $\overline{M}_W$ from now on) to be the one determined using precision electroweak observables:

$$
\overline{M}_W = 80.3545 \pm 0.0057~\text{GeV}.
\tag{2.1}
$$

• Using $M_W = \overline{M}_W$, we generate a large sample of matched $W(\ell\nu)$ + jets events at the parton level, for which we utilize MadGraph-v3.4.1 [64]. The inputs to the matrix element generator are a set of parton-level cuts, which we list in Table 1, a factorization/renormalization scale, and a parton distribution function (PDF) set.
For factorization/renormalization scales, we use the default MadGraph values, whereas for the PDF we use NNPDF23 NLO [65, 66]. Subsequently, all parton-level events are passed through Pythia-v8.306 [67] for showering and hadronization. In order to avoid double counting, we employ the MLM scheme [68] and use xqcut = 30 GeV. We use Delphes-v3.5.0 [69] to provide a realistic detector environment whenever we can. For ATLAS, we use the default card as provided in Delphes. We will mention additional steps/details specific to individual measurements later.

Table 1. Cuts and selection criteria for simulating $W(\ell\nu)$ + jets events for CDF, for ATLAS@7 TeV, and for LHCb@13 TeV.
• We impose selection cuts as tabulated in Table 1. Note that we closely follow the cuts given in the respective experimental reports [18–20]. These sets of cuts consist of variables already discussed in Equation (1.2), except for the pseudo-rapidity of the lepton (namely, $\eta_\ell$) and the transverse hadronic recoil variable $u_T$. The working definition of $u_T$ employed in this work is collider-specific, so we describe it later.
• We analyze the final sample of selected events and calculate observables. For the rest of this work, we denote the set of observables needed for estimating $M_W$ by $\mathcal{X}$. For example, in the case of CDF, $\mathcal{X}$ consists of $\{M_T, p_T^{\ell}, p_T^{\rm miss}\}$ defined in Equation (1.2). This list is also summarized in Table 1. The outcomes of this step are histograms corresponding to the variables in $\mathcal{X}$, i.e., for every observable $x \in \mathcal{X}$ we obtain a histogram $X$ which represents $L \times d\sigma/dx$, $L$ being the integrated luminosity.
• We repeat all the steps above after setting $M_W = \overline{M}_W + \Delta$, where $\overline{M}_W$ is the true mass and $\Delta$ represents the shift in the mass parameter. We denote histograms of the variable $x$ for a given $\Delta$ by $X(\Delta)$. In this notation, therefore, histograms for $M_W = \overline{M}_W$ are simply $X(0)$.
• We also require simulated event samples for NP. We implement the operator of Equation (1.4) in MadGraph and repeat the above procedure to generate the corresponding histograms. For NP, we denote these histograms by $X^{\rm NP}(\Lambda_{\rm eff})$, because of their obvious dependence on $\Lambda_{\rm eff}$.
• Finally, for each value of $\Delta$ we find the preferred value of $\Lambda_{\rm eff}$ by minimizing the function $D^2$ defined via

[Equation (2.2)]

In the above, $X_b$ represents the number of events in bin $b$ of the histogram $X$, and $\sigma_{X_b}^2$ is the variance of the same bin. The sum runs over all bins in the fitting range. We specify the fitting range for the three measurements in Table 1.
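The last step of the recipe can be sketched in a few lines. Since the explicit form of $D^2$ is not reproduced here, the code assumes $D^2 = \sum_b \left[X_b(\Delta) - X_b(0) - X_b^{\rm NP}(\Lambda_{\rm eff})\right]^2 / \sigma_{X_b}^2$, assumes the NP yield scales as $1/\Lambda_{\rm eff}^2$ (as expected for a single insertion of the operator without SM interference), and assumes a per-bin variance made of a Poisson term plus a fractional systematic term. All function names and toy numbers are hypothetical.

```python
import math

def bin_variance(counts, syst_frac):
    """Assumed per-bin variance: Poisson term plus a fractional systematic term."""
    return [n + (syst_frac * n) ** 2 for n in counts]

def d2(shifted, nominal, np_ref, scale, variance):
    """Assumed D^2: sum_b (X_b(Delta) - X_b(0) - scale * X_b^NP)^2 / var_b."""
    return sum((s - n - scale * p) ** 2 / v
               for s, n, p, v in zip(shifted, nominal, np_ref, variance))

def best_lambda_eff(shifted, nominal, np_ref, lambda_ref, syst_frac=0.0):
    """Lambda_eff minimising the assumed D^2 (inf if no NP admixture is preferred).

    np_ref is the NP histogram generated at lambda_ref; the NP yield is taken
    to scale as (lambda_ref / Lambda_eff)^2, so D^2 is quadratic in that scale
    and its minimum is available in closed form. Bins must be non-empty.
    """
    var = bin_variance(shifted, syst_frac)
    num = sum((s - n) * p / v for s, n, p, v in zip(shifted, nominal, np_ref, var))
    den = sum(p * p / v for p, v in zip(np_ref, var))
    scale = num / den
    if scale <= 0.0:
        return math.inf
    return lambda_ref / math.sqrt(scale)

# Toy histograms (events per bin); the "shifted" template is built to equal
# the nominal one plus 25 times the NP reference shape at Lambda_ref = 1 TeV.
toy_nominal = [1000.0, 800.0, 400.0, 100.0]
toy_np_ref = [1.0, 1.0, 1.0, 1.0]
toy_shifted = [n + 25.0 * p for n, p in zip(toy_nominal, toy_np_ref)]

toy_best = best_lambda_eff(toy_shifted, toy_nominal, toy_np_ref, lambda_ref=1.0,
                           syst_frac=0.01)
print(f"preferred Lambda_eff = {toy_best:.3f} TeV")
```

Scanning `d2` around the minimum for each $\Delta$ would then give a confidence belt in $\Lambda_{\rm eff}$; the closed-form minimum is used here only because the assumed $D^2$ is quadratic in the NP normalization.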
Before we summarize the results of our study, we need to mention analysis-specific details.
Even though we mention $W \to \ell\nu$ states in Table 1, we work with $W \to e\nu_e$ for Tevatron and ATLAS, whereas we use $W \to \mu\nu_\mu$ for LHCb. As mentioned before, we employ semi-realistic detector environments as implemented in Delphes for our ATLAS study.
For Tevatron and for LHCb, we simply proceed directly to the analysis stage, skipping the detector-simulation step. Since muons at LHCb are well reconstructed with high efficiency and the muon $p_T$ is the only observable, we expect our results for LHCb to be realistic. For Tevatron, however, the results are sensitive to details. In Appendix A, we show the comparison of confidence bands that correspond to different levels of detail (but using the same set of cuts in Table 1). In particular, we show the difference between analyzing the output of Pythia directly, analyzing after taking into account QED corrections given by ResBos-v2.0 [70] utilizing Reference [71], and finally analyzing after taking into account smearing as given in Reference [71]. Also, in Appendix A, we discuss the differences between using only the $M_T$ variable for minimization and combining all three variables $\{M_T, p_T^{\ell}, p_T^{\rm miss}\}$. Given these issues, we choose to use histograms after ResBos2 for all three variables but with a broad range of systematics (0–5%) that mostly captures the uncertainty associated with our Tevatron-specific analyses.
Finally, note that both the CDF and the ATLAS collaborations use the variable $u_T$, which is a measure of the hadronic recoil. An upper cut on the hadronic recoil preferentially selects $W$ bosons with small $p_T$. For Tevatron, we use the sum of all momenta for all final-state hadrons and photons within $|\eta| \le 3.6$ to calculate the recoil, whereas for ATLAS we use the sum of all jets and photons within $|\eta| \le 4.9$.
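As a minimal sketch of the recoil definition just described, the function below takes $u_T$ to be the magnitude of the vector sum of transverse momenta of the objects inside the pseudo-rapidity acceptance. The function name, the `(pt, eta, phi)` tuple layout, and the toy particle list are assumptions made for illustration.

```python
import math

def hadronic_recoil(particles, eta_max):
    """|u_T|: magnitude of the vector pT sum of objects with |eta| <= eta_max.

    particles: iterable of (pt, eta, phi) tuples for the objects entering the
    recoil (hadrons and photons for the Tevatron-like case, jets and photons
    for the ATLAS-like case); the charged lepton from the W is not included.
    """
    accepted = [(pt, phi) for pt, eta, phi in particles if abs(eta) <= eta_max]
    ux = sum(pt * math.cos(phi) for pt, phi in accepted)
    uy = sum(pt * math.sin(phi) for pt, phi in accepted)
    return math.hypot(ux, uy)

# Toy event: two central objects and one forward object (pT in GeV).
toy_particles = [(10.0, 0.5, 0.0), (5.0, 1.0, math.pi), (20.0, 4.2, 0.0)]

u_t_tevatron = hadronic_recoil(toy_particles, eta_max=3.6)  # forward object dropped
u_t_atlas = hadronic_recoil(toy_particles, eta_max=4.9)     # forward object kept
print(u_t_tevatron, u_t_atlas)
```

The same toy event yields different recoil values under the two acceptances, which is one reason the $u_T$ cut selects different event populations at the two colliders.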

Strength of new physics compatible with all measurements
With the details of the analysis in hand, we now determine the range of $\Lambda_{\rm eff}$ that can simultaneously satisfy all direct measurements. We take analysis-specific systematics into account due to the issues outlined previously. We begin by plotting all the histograms that play a role in determining $M_W$ in Figure 1. In each of these plots we show the distributions corresponding to $M_W = \overline{M}_W$ (the true mass, shaded) and to $M_W = \overline{M}_W + \Delta$ (black lines), where for numerical demonstration we have taken $\Delta = 1$ GeV. In each of these variables, there is a characteristic scale (related to the mass of the $W$-boson), beyond which the distribution falls. A larger $M_W$ increases the characteristic scale, which results in a rightward shift of the edge of $M_T$ and slightly harder $p_T^{\ell}$ and $p_T^{\rm miss}$. On the other hand, the same plots for the NP events (evaluated here for $\Lambda_{\rm eff} = 1$ TeV, shown by colored lines, and scaled by $10^4$ for legibility) have comparatively flatter distributions in the range of the plot. Consequently, these add "relatively" more events in the bins where the SM distribution falls rapidly, shifting the histograms slightly towards larger values of the kinematic variables. Therefore, as argued at the beginning of this work, the distribution for $M_W = \overline{M}_W$, when combined with a suitably weighted NP distribution, may mimic the shape corresponding to a higher $M_W$.

Figure 1. In each panel, the different histograms correspond to SM with $\Delta = 0$ (shaded), $\Delta = 1$ GeV (black line) (large $\Delta$ chosen for demonstration), and the NP process with $\Lambda_{\rm eff} = 1$ TeV (colored line). For legibility, we scale the NP numbers by $10^4$.
Following the recipe described above, we can determine the confidence belts in $\Lambda_{\rm eff}$ for each value of $\Delta$. Note, however, that the location of the minimum of $D^2$ in Equation (2.2), as well as the width of the confidence belt, depends on the assigned variance in each bin of the histograms. As explained before, we also need to add a systematics component to the variance, which reflects the uncertainties due to scale, generator, detector elements, etc. To take this into account, we perform our analysis by varying the systematics between 0% and 5%. We give the result of the minimization procedure in the three plots of Figure 2, corresponding to CDF (left), ATLAS@7 TeV (center), and LHCb (right). As mentioned before, we are more prone to systematics in the context of Tevatron analyses, because of which we show the 68% confidence level (CL) contours for 5% systematics, in addition to the 0% and 1% ones. Note that, while extracting the bands for the CDF analysis, we rescaled the histograms generated after Pythia simulations by the bin-by-bin N$^3$LL + NNLO factors as given by the ResBos2 package and quoted in Reference [71]. We indicate, using dotted lines, the upper limits of the $M_W$ measurements ($\Delta M_W + 1\sigma$) reported by the D0, ATLAS, and LHCb collaborations, and with the shaded region we show the 1σ limits (i.e., $\Delta M_W \pm 1\sigma$) corresponding to CDF. Note that, for the CDF 1σ range, we have allowed for the possible 10 MeV downward shift, as reported in Reference [71].
Our first observation is that $\Lambda_{\rm eff} \to \infty$, which corresponds to $\kappa \to 0$ for any finite $\Lambda$, is inconsistent with CDF (even when we include 5% systematics in our analysis). Secondly, contours corresponding to 0% and 1% systematics are contained within the 5% systematics band, as expected. In particular, we find that one needs 0.12 TeV < $\Lambda_{\rm eff}$ < 0.35 TeV (68% CL using 5% systematics) in order to predict the right shift of $M_W$ at CDF. Of this, 0.15 TeV < $\Lambda_{\rm eff}$ < 0.35 TeV is simultaneously allowed by the D0 and CDF measurements.
As opposed to Tevatron, for ATLAS@7 TeV and LHCb we expect the systematics to be much better under control, for reasons already mentioned. Hence, for these, we show results with 0% and 1% systematics only. For both these experiments, we find that there is a wide range of $\Lambda_{\rm eff}$ for which the NP hypothesis is allowed by the corresponding measurements of $M_W$, namely, $\Lambda_{\rm eff}$ > 0.18 TeV for ATLAS and $\Lambda_{\rm eff}$ > 0.13 TeV for LHCb. As expected, the bands are consistent with $\Delta = 0$ for $\Lambda_{\rm eff} \to \infty$.

Figure 3. The 68% bands from all experiments which provide $M_W$ measurements. We show results for 5% systematics for Tevatron (teal) and 1% for both ATLAS (red) and LHCb (violet), overlaid with the experimental measurements (1σ). The solid vertical lines (grey) give the range of $\Lambda_{\rm eff}$ that is simultaneously consistent with all the experiments.
In Figure 3, we simultaneously plot the results obtained from the simulations corresponding to CDF, ATLAS@7 TeV, and LHCb. The shaded bands, teal for CDF, red for ATLAS@7 TeV, and violet for LHCb, show the 68% CL bands obtained by minimizing Equation (2.2) with respect to the parameter $\Lambda_{\rm eff}$. For CDF, we use 5% systematics per bin, while for ATLAS and LHCb we use 1% systematics. The different bands, overlaid on the measurements, clearly convey the message that there is an overlap between the observations at CDF, ATLAS, and LHCb. This region of overlap (solid vertical gray lines in Figure 3) determines the range for which the NP scenario is 'consistent' with all the $M_W$ measurements (at 68% CL) and is given by:

$$
0.18~\text{TeV} < \Lambda_{\rm eff} < 0.35~\text{TeV}.
\tag{3.1}
$$

As we do not have access to all the details of the experimental analyses, it is impossible for us to pinpoint the systematics associated with the various experiments. Hence, the best we can do is to use well-motivated, albeit somewhat ad hoc, values of systematics. However, we stress that even if we take 1% systematics for all the experiments, we still get a non-vanishing range of $\Lambda_{\rm eff}$ that satisfies all the measurements. In Appendix B we show the results for different systematics at different levels of confidence.

Before proceeding to the next part of our analysis, note that we have not discussed the measurements by the LEP collaborations [23–26] at all. Given that, in our hypothesis, the NP particle couples only to the quarks, we expect the LEP results to remain consistent with the EW precision measurements. We have, however, performed simulations for the D0 experiment ($p\bar{p}$ collisions) to check if our NP operator can simultaneously accommodate the D0 measurement of $M_W$ as well. We find that the range of $\Lambda_{\rm eff}$ consistent with D0 includes the range quoted above in Equation (3.1). We have not shown this band in Figure 3 for readability. Instead, we show the D0 bands in Figure 7 in Appendix B.
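The overlap quoted in Equation (3.1) is simply the intersection of the per-experiment ranges given in this section. A short sketch, using only the 68% CL numbers stated in the text (the helper name `intersect` and the dictionary keys are illustrative):

```python
import math

def intersect(intervals):
    """Intersection of open intervals (lo, hi); returns None if it is empty."""
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return (lo, hi) if lo < hi else None

# 68% CL ranges of Lambda_eff in TeV, as quoted in the text.
ranges_tev = {
    "CDF (5% systematics)": (0.12, 0.35),
    "ATLAS@7TeV": (0.18, math.inf),
    "LHCb": (0.13, math.inf),
}

overlap = intersect(list(ranges_tev.values()))
print(overlap)  # (0.18, 0.35)
```

The lower edge is set by ATLAS and the upper edge by CDF, so tightening either measurement would shrink the window directly.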

Constraints from other measurements
With $\Lambda_{\rm eff}$ determined in the previous section (see Equation (3.1)), we now focus on constraints imposed by experiments performed at energy scales similar to the ones that enter the $M_W$ measurements, i.e., from high-energy colliders. Two obvious measurements that should constrain the operator in Equation (1.4) are the following:
• the single $W$ production cross section, and
• the $pp \to WW \to e\mu + p_T^{\rm miss}$ differential cross section.
Both these measurements have been performed by the ATLAS collaboration using 13 TeV LHC data, the former with 81 pb$^{-1}$ of data [72] and the latter with 36.1 fb$^{-1}$ of data [73].

Single W production cross-section measurements
We begin the discussion with the single $W$ channel. Even though the underlying processes corresponding to the $W$ cross-section measurement and the $W$ mass measurement are identical, the two analyses are different. For the mass measurement, ATLAS uses the data in the bins given by the fitting ranges (given in Table 1), while the cross-section measurement includes the high-momentum data as well. In fact, it is the events in these high-momentum bins ($\gg M_W$) that we use to derive the bounds from the $W$ cross-section data. To obtain constraints on $\Lambda_{\rm eff}$ from this channel, we compare our SM single $W$ + SM background + NP hypothesis against the experimental observation. For the background (SM $W$ + SM background), we use the data provided in the experimental paper [72], and we simulate the NP contribution $pp \to W\Phi$ + jets in MadGraph, followed by Pythia for showering and Delphes for detector simulation. We use the anti-$k_t$ algorithm [74] with $p_T^{\rm min} = 20$ GeV and $R = 0.6$ to cluster calorimeter elements within $|\eta| < 5$. For subsequent analysis, we impose the same cuts on the kinematic variables ($\mathcal{X}$) and the same selection criteria on the number of final-state particles as used by ATLAS. These cuts and selection criteria are given in Table 2. Note that, in our analysis, we use only the electron channel.
In our study, we use the differential distributions for the $M_T$, $p_T^{\ell}$, and $p_T^{\rm miss}$ variables. Furthermore, we use the same binning for the variables as the experimental report [72]. Lower bins for all these observables are background-dominated; therefore, we concentrate on the high-energy tails and impose analysis-level cuts on the variables as follows:

[Equation]

We take the sum of the events in all the bins passing these cuts from Reference [72] to constrain our NP scenario and use the Bayesian method to obtain 95% CL exclusions. For the three distinct variables ($M_T$, $p_T^{\ell}$, $p_T^{\rm miss}$), we get three different limits, given by:

[Equation]

Clearly, $p_T^{\ell}$ provides the most stringent constraint. Unlike $M_T$ and $p_T^{\rm miss}$, no information about missing transverse momentum is needed to construct $p_T^{\ell}$, leading to smaller systematics for this variable.
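The limit-setting step can be sketched as a single-bin counting experiment. The text does not specify the prior or the treatment of systematics, so the code below assumes a flat prior on the signal yield, a Poisson likelihood with a perfectly known background, and an NP yield that scales as $1/\Lambda_{\rm eff}^2$. The function names are hypothetical and the numbers in the usage lines are not taken from Reference [72].

```python
import math

def bayesian_upper_limit(n_obs, bkg, cl=0.95, steps=200_000):
    """Upper limit on the signal yield s (flat prior on s >= 0, Poisson likelihood).

    The posterior is proportional to (s + bkg)^n_obs * exp(-(s + bkg)); it is
    integrated numerically with the midpoint rule up to a generous s_max.
    """
    s_hat = max(0.0, n_obs - bkg)
    s_max = s_hat + 10.0 * math.sqrt(n_obs + 1.0) + 20.0
    ds = s_max / steps
    mu_hat = s_hat + bkg
    # Reference log-posterior at its maximum, subtracted to avoid overflow.
    log_ref = (n_obs * math.log(mu_hat) if n_obs > 0 else 0.0) - mu_hat
    weights = []
    for i in range(steps):
        mu = (i + 0.5) * ds + bkg
        log_p = (n_obs * math.log(mu) if n_obs > 0 else 0.0) - mu
        weights.append(math.exp(log_p - log_ref))
    target = cl * sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if running >= target:
            return (i + 1) * ds
    return s_max

def lambda_eff_lower_limit(n_obs, bkg, np_ref_yield, lambda_ref, cl=0.95):
    """Translate the yield limit into a lower limit on Lambda_eff.

    np_ref_yield is the NP yield predicted at lambda_ref; the yield is assumed
    to scale as (lambda_ref / Lambda_eff)^2.
    """
    s_up = bayesian_upper_limit(n_obs, bkg, cl)
    return lambda_ref * math.sqrt(np_ref_yield / s_up)

# Illustrative usage with made-up numbers.
s_up_empty = bayesian_upper_limit(n_obs=0, bkg=0.0)
print(f"95% CL upper limit for an empty, background-free bin: {s_up_empty:.3f}")
```

For an empty, background-free bin this procedure recovers the familiar value of about three signal events at 95% CL. A realistic analysis would also marginalize over background and systematic uncertainties, which this sketch omits.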

WW cross-section measurements
We now move on to the constraints from the $WW$ cross-section measurement. Similar to the single $W$ case, we use the background estimates given in the experimental paper [73]. For consistency, we mimic the experimental analysis as far as possible, focusing on the $pp \to WW \to e\mu + p_T^{\rm miss}$ channel. The collaboration selects events with exactly one hard electron and one hard muon and uses the following variables to characterize these events:
• $p_T^{{\rm lead},\ell}$: momentum of the hardest lepton in the event,
• $p_T^{e\mu}$: transverse momentum of the $e\mu$ system,
• $m_{e\mu}$: invariant mass of the $e\mu$ system,
• $p_{T,{\rm track}}^{\rm miss}$: transverse momentum computed using jet and lepton tracks.
In addition, the collaboration imposes a veto on $b$-tagged jets with $p_T > 20$ GeV and $|\eta| < 2.5$. For unflavored jets, the veto is for $p_T > 35$ GeV and $|\eta| < 4.5$. In Table 3, we list the kinematic cuts and the selection criteria that the collaboration imposes on the events.

Table 3. Event selection criteria for $WW$ and $WW\Phi$ production at $\sqrt{s} = 13$ TeV.

We also impose the same cuts and selection criteria on signal events. For the signal, we simulate $pp \to WW\Phi$ in MadGraph and allow the $WW$ system to decay to $e\mu + p_T^{\rm miss}$ only. We then pass the simulated parton-level events through Pythia for subsequent showering and hadronization. After hadronization and showering, the events are passed through Delphes, with the default ATLAS card. Note, in particular, that we use the same jet definition as in the single $W$ analysis.
In computing the $pp \to WW\Phi$ cross-section, we find that the amplitude shows a power-law growth with the partonic center-of-mass energy, $\sqrt{\hat{s}}$, up to energies much higher than the suppression scale $\Lambda$ of the irrelevant operator in Equation (1.4). This growth, beyond the UV cut-off of the theory, is clearly due to the amplitude picking up unphysical modes. This implies that we are extending the amplitude to energies beyond the range of computability of the effective theory. In order to regulate our result and force it to be in the regime of trustable computability, we impose a cut-off on the energy of the NP events following the prescription in Reference [75]. To be specific, we only include NP events for which the invariant mass of the $WW\Phi$ system (namely, $M_{WW\Phi}$) is less than $\Lambda$.
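The cut-off prescription amounts to a per-event filter on the invariant mass of the $WW\Phi$ system. A minimal sketch, assuming each particle is available as an `(E, px, py, pz)` tuple at parton level (the function names and the toy momenta are illustrative):

```python
import math

def invariant_mass(four_vectors):
    """Invariant mass of the summed (E, px, py, pz) four-vectors."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    return math.sqrt(max(0.0, e * e - px * px - py * py - pz * pz))

def passes_eft_cutoff(w_plus, w_minus, phi, cutoff):
    """Keep the NP event only if M_WWPhi is below the operator scale Lambda."""
    return invariant_mass([w_plus, w_minus, phi]) < cutoff

# Toy momenta in GeV (not physical W bosons, just to exercise the filter).
toy_w_plus = (100.0, 0.0, 0.0, 100.0)
toy_w_minus = (100.0, 0.0, 0.0, -100.0)
toy_phi = (50.0, 50.0, 0.0, 0.0)

toy_mass = invariant_mass([toy_w_plus, toy_w_minus, toy_phi])
print(f"M_WWPhi = {toy_mass:.1f} GeV")
print(passes_eft_cutoff(toy_w_plus, toy_w_minus, toy_phi, cutoff=300.0))
```

Because the filter depends on $\Lambda$ itself rather than on $\Lambda_{\rm eff}$, the surviving signal yield is no longer a function of $\Lambda/|\kappa|$ alone.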
With the cut on $M_{WW\Phi}$ and the kinematic/selection cuts listed in Table 3 applied to the signal events, we use the differential distribution with respect to $p_T^{{\rm lead},\ell}$ to obtain constraints. We focus on $p_T^{{\rm lead},\ell}$ as the other available distributions (e.g., $p_T^{e\mu}$, $m_{e\mu}$, and angular variables) are less sensitive. Furthermore, ATLAS has much better control over both statistical and systematic uncertainties for the $p_T^{{\rm lead},\ell}$ distribution, compared to the other variables. As mentioned earlier, the NP effects are most prominent in the tails of the momentum distributions. The experimental analysis consolidates the events with $p_T^{{\rm lead},\ell} > 190$ GeV into one 'overflow' bin. We use the events in this overflow bin to obtain the exclusion. We use 10% systematics, as reported in Reference [73] for $p_T^{{\rm lead},\ell}$.

Since we explicitly introduce a scale $\Lambda$ in our analysis, our result from the di-boson process is qualitatively different from all the earlier results. Earlier, the physics was insensitive to the simultaneous scaling $\kappa \to a\kappa$ and $\Lambda \to a\Lambda$, since ultimately $\Lambda_{\rm eff} = \Lambda/|\kappa|$ remained invariant. However, the 'elevation' of $\Lambda$ to the role of the explicit cut-off introduces scale dependence. Hence, the constraint obtained from the $WW$ analysis is essentially on the coefficient $|\kappa|$ for a varying $\Lambda$. In Figure 4, we show the 95% CL exclusion for 10% systematics, as obtained from this analysis, in the $|\kappa|$-$\Lambda$ plane (violet shaded region). The contour gives the maximum $|\kappa|$ for a given $\Lambda$. For example, for $\Lambda = 1$ TeV, it imposes $|\kappa| \le 0.62$. It is clear from the plot that for scales below 500 GeV, however, the constraints from the $W$ cross-section measurement become important. Note that previously we found that single $W$ production gives $\Lambda_{\rm eff} > 0.15$ TeV from the $p_T^{\ell}$ measurement. Here, we translate this bound to the $|\kappa|$-$\Lambda$ plane (shown in red). In the figure, we also indicate the region (in gray) 'disallowed' by fitting the different $M_W$ measurements, taking 5% systematics for CDF and 1% for both ATLAS and LHCb. Given all the exclusions, the region of parameter space allowed (in white) is given by $1 \lesssim |\kappa| \lesssim 3$ and $\Lambda \lesssim 0.65$ TeV. Note that we have checked other exclusive channels with di-bosons and jets in the final state [76–79] and find that the bounds discussed here are the strongest.
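The translation between bounds on $\Lambda_{\rm eff}$ and regions of the $|\kappa|$-$\Lambda$ plane follows from $\Lambda_{\rm eff} = \Lambda/|\kappa|$. The sketch below uses only numbers quoted in the text (the $M_W$-fit window of Equation (3.1), the single-$W$ bound $\Lambda_{\rm eff} > 0.15$ TeV, and the $WW$ bound $|\kappa| \le 0.62$ at $\Lambda = 1$ TeV); the function names are illustrative.

```python
def kappa_window(lam, lam_eff_min=0.18, lam_eff_max=0.35):
    """|kappa| range for which Lambda/|kappa| lies in (lam_eff_min, lam_eff_max).

    All scales are in TeV; defaults are the M_W-fit window of Equation (3.1).
    """
    return lam / lam_eff_max, lam / lam_eff_min

def kappa_max_single_w(lam, lam_eff_bound=0.15):
    """Largest |kappa| allowed by the single-W bound Lambda_eff > lam_eff_bound."""
    return lam / lam_eff_bound

# At Lambda = 0.5 TeV the M_W fits prefer roughly 1.4 < |kappa| < 2.8,
# which lies below the single-W ceiling of about 3.3.
lo_05, hi_05 = kappa_window(0.5)
ceiling_05 = kappa_max_single_w(0.5)

# At Lambda = 1 TeV the M_W fits would need |kappa| > 2.9, while the WW
# analysis quoted in the text allows at most |kappa| = 0.62.
lo_10, _ = kappa_window(1.0)
compatible_at_1tev = lo_10 <= 0.62

print((round(lo_05, 2), round(hi_05, 2)), round(ceiling_05, 2), compatible_at_1tev)
```

This reproduces the qualitative picture described above: the window preferred by the $M_W$ fits survives only at low $\Lambda$, where the $WW$ constraint weakens and the single-$W$ bound takes over.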
It is to be noted that the bounds we derive from $pp$ collisions are different from the collider bounds that exist in the literature. The existing bounds do not affect us as these are sensitive to the decay channels of $\Phi$, e.g., multi-lepton [80, 81], 2ℓ2γ [81], multi-photon [82], and 2ℓ2h [83]. Also, $\Phi$ couples to the Higgs, the electron, the photon, and gluons only at multi-loop order. Therefore, Higgs → invisible bounds [81], constraints from electron colliders and beam dumps (e.g., [84, 85]), and constraints where $\Phi$ is produced from $gg$ fusion [86] are not relevant for our NP scenario. Similarly, the $W$ and $Z$ boson decay widths are affected either at higher order or with phase-space suppression. Hence, we do not consider these bounds.

Projections for W mass measurement from 13 TeV LHC data
After obtaining the allowed range of $\Lambda_{\rm eff}$, we use our NP hypothesis to predict the $M_W$ extraction expected from the 13 TeV LHC data. To be specific, we simulate for the ATLAS detector assuming an integrated luminosity of 500 fb$^{-1}$. Although we do not explicitly simulate for CMS, the predictions for ATLAS should act as a proxy for CMS as well. We generate the NP events and follow the same prescription as used for the 7 TeV simulations. We use the same cuts, the same fitting ranges, and the same bin widths. From this exercise, we predict (at 68% CL) for LHC@13 TeV the following ranges of $\Delta$ for two different systematics:

[Equation]

In Figure 5, we present these contours for 0% (darker brown) and 1% systematics (lighter brown). We also show the range of $\Lambda_{\rm eff}$ that is currently allowed (gray band) and the ATLAS@7 TeV measurement (1σ) of $M_W$ (dotted lines).

Figure 5. Predictions for the expected shift in $M_W$ ($\Delta$) for ATLAS@13 TeV at 500 fb$^{-1}$ (brown band). We also show the range of $\Lambda_{\rm eff}$ as allowed from current measurements of $M_W$ at different colliders. The horizontal dotted line indicates the current measurement of $\Delta$ at ATLAS@7 TeV.

Electroweak Considerations
So far in this work, we have outlined an interesting and bare-minimal scenario, which accommodates a remarkable feature that makes the task of extracting $M_W$ from leptonic decays of the $W$ in hadron colliders highly nontrivial. In fact, conventional strategies with the SM hypothesis simply give an incorrect estimate. The result that the extracted mass depends on the nature of the colliders and/or the center-of-mass energy of collisions is intriguing. The simplicity of the scenario lets it hide from the ensemble of NP searches.
In the remaining part of this work, we speculate about the nature and ultraviolet aspects of the scenario. Even though we do not suggest particular renormalizable UV completions of Equation (1.4), our discussion here is geared towards finding possible further constructions, still in terms of irrelevant operators, that address questions regarding the EW symmetry and the flavor symmetry. As we show now, there is a multitude of possibilities even at this intermediate level. Finding and classifying all possible renormalizable UV completions is a completely different task and we leave it for future endeavors.
We begin this exercise by noting that if the complex parameter $\kappa$ is purely imaginary (i.e., $\kappa = ik$), the theory described in Equation (1.4) is equivalent to more familiar constructions of Axion-like particles (ALPs). A field-dependent redefinition of the left-handed $u$ and $d$ quarks eliminates the operator in Equation (1.4) but gives rise to new ones:

where $\cdots$ represents additional terms of order $(\Phi/f_\Phi)^2$ or more, and terms suppressed by at least one power of $16\pi^2$. The redefinition we use is chiral in nature, and hence the anomaly associated with the electromagnetic current gives rise to the operator $\Phi F \tilde{F}$. Note that no $\Phi G \tilde{G}$ is generated, since the redefinition includes opposite (field-dependent) phases for $u_L$ and $d_L$. Even the mass-dependent operators and the $\Phi F \tilde{F}$ term are not independent. A suitable redefinition of the right-handed quarks can eliminate the mass terms in Equation (5.1) as well as the anomaly term, at the cost of a new term involving $\partial\Phi$. Note that we can reach the same cleaner-looking Lagrangian if we instead employ a vectorial redefinition of the $u$ and $d$ quarks (rather than the chiral ones in Equation (5.1)).
(5.2)

As mentioned earlier, these recasts bring the unusual operator in Equation (1.4) into the well-studied paradigm of ALP physics and make the task of building further models and deriving constraints simpler. The guiding principle for building the UV model that gives rise to the apparent shift of the $W$-mass is, therefore, straightforward: the UV model must result in Equation (1.4) and/or the derivative operator in Equation (5.1) in terms of left-handed quarks, but there should not be any quark field redefinitions that can eliminate both at the same time. Consequently, for the rest of this work, we use the derivative operator in Equation (5.1) as the starting point for further constructions while discussing issues of flavor and EW symmetry. Generalizing it in flavor space, we write the operator in a convenient manner:

where $i, j$ are flavor indices, $q_L$ represents the usual left-handed doublets, and $\sigma_3$ is the Pauli matrix. This form allows us to jump directly into the flavor question. Arbitrary $k_{ij}$ is simply ruled out by large flavor-changing neutral currents (FCNCs) (for a recent review see Reference [87]). The UV model must include considerations from the flavor sector. A safer ansatz is $k_{ij} = k\,\delta_{ij}$, which does not give rise to any new flavor-breaking spurions ‡. However, given specific models, one might require small non-diagonal $k_{ij}$ elements to counter loop-induced FCNCs.
The left-handed quark doublets are also electroweak doublets, and the operators in Equation (5.3) violate electroweak symmetry. Not surprisingly, the requirement that Equation (5.3) arise from a fully electroweak theory is a lot more demanding. The simplest construct is to take $k/f_\Phi$ to be proportional to the Higgs vacuum expectation value (vev) $v$. For example, when the Higgs is replaced with its vev, the following electroweak operator yields Equation (5.3):

(5.4)

This scheme obtains the dimension $D = 5$ operator from what is truly a $D = 7$ operator. Because of this, one expects the scale in the UV (namely, $\Lambda$) to be far lower than the apparent scale $f_\Phi$ as long as one takes k ∼ k. This seemingly low $\Lambda$ does not necessarily imply the existence of additional new degrees of freedom at low energies. For an explicit construction, see, for example, Reference [88], which, in fact, deals with ALP-like scenarios.
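The statement that $\Lambda$ sits well below $f_\Phi$ can be illustrated with a rough power-counting sketch. It assumes (this is not taken from the text) that the $D = 7$ operator carries a coefficient `k_tilde / Lambda**3` and two Higgs fields, so that replacing the Higgs by its vev gives $k/f_\Phi \sim \tilde{k}\, v^2/\Lambda^3$; factors of order one, such as the 1/2 in $\langle H^\dagger H\rangle = v^2/2$, are dropped. The function name, the coefficient ratio, and the benchmark value of $f_\Phi$ are hypothetical.

```python
# Rough power counting, ASSUMING k / f_phi ~ k_tilde * v**2 / Lambda**3.
# Order-one factors are ignored; this is an illustration, not the paper's result.

def uv_scale(f_phi, v=246.0, coeff_ratio=1.0):
    """Return Lambda (GeV) solving 1/f_phi = coeff_ratio * v**2 / Lambda**3.

    f_phi       : apparent scale of the D = 5 operator in GeV
    v           : Higgs vev in GeV
    coeff_ratio : k_tilde / k, ratio of D = 7 to D = 5 coefficients (hypothetical)
    """
    return (coeff_ratio * v ** 2 * f_phi) ** (1.0 / 3.0)

f_phi = 1000.0               # hypothetical benchmark: f_phi = 1 TeV
lambda_uv = uv_scale(f_phi)  # UV scale implied by the matching above

print(f"f_phi = {f_phi:.0f} GeV  ->  Lambda ~ {lambda_uv:.0f} GeV")
```

For this benchmark the implied $\Lambda$ is roughly 0.39 TeV, below $f_\Phi$, because $\Lambda^3 \sim v^2 f_\Phi$ interpolates between the electroweak scale and $f_\Phi$ whenever $f_\Phi > v$.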
A far more creative and attractive avenue is to obtain the coupling in Equation (5.3) from an electroweak $D = 5$ operator. This requires an electroweak triplet $\Sigma \equiv \Sigma^a t^a \equiv \{\Sigma^\pm, \Sigma^3\}$.
Further model building is necessary to accommodate $\Sigma^\pm$, since these have to be heavier than the EW scale to avoid bounds from the $W/Z$ widths. The light neutral state ($\Phi$) can be obtained by introducing another electroweak singlet (say $\Sigma^0$). It is trivial to design a potential (using only marginal and relevant operators) with the $\Sigma$ fields and the Higgs field, where one obtains a nearly massless light scalar after the Higgs is replaced by its vev. This requires choosing the coupling constants of the different operators suitably and also cancelling quantum corrections with bare terms. Since we give no importance to the amount of 'naturalness', we do not foresee any problem with constructing a model along these lines.

‡ Note that bringing in additional quark flavors changes the best fit and exclusion plots in
The lack of a concrete model makes a discussion about contributions to the EW $T$ parameter moot. Any positive contribution to the $T$ parameter from the triplet [89,90] can be counteracted by the presence of heavy fermions or kinetic mixing (see, e.g., [91][92][93][94]), which might be present in the UV model. Also, one cannot but notice that the phenomenological constraints and best-fit values for the $W$-mass measurements will be quite different for any of the UV scenarios discussed here. For example, one has to take into account the $\Sigma^\pm$ contributions to $p + p(\bar{p}) \to \Phi + W$ to re-derive the best-fit plots, find constraints on the mass of $\Sigma^\pm$, and look for additional signals through which the model might be discovered at the LHC. All these discussions are beyond the scope of this work. Similarly, a proper discussion of flavor constraints (see, e.g., [87,[95][96][97]) should include a full model that determines the relationships between the different parameters and also the running of the couplings to low energies.
TB acknowledges the hospitality provided to him by TIFR where a substantial amount of this work was completed.

A Additional considerations for the CDF analyses
In this Appendix, we discuss some subtleties related to our CDF analyses. We have discussed the methodology in the main text and argued in favor of the validity of our analysis. However, as we are unable to incorporate some aspects of detector simulation and statistical nuances, we perform additional checks to establish the robustness of our results. To verify that the systematics we use capture the effects of detector smearing and final-state radiation, we perform an auxiliary analysis. In this analysis, we find the 68% CL bands on $\Lambda_{\rm eff}$ after convolving the Pythia output with both the smeared and unsmeared ResBos2 factors [71]. As smearing affects $M_T$ the most [71], we use only $M_T$ for this study. In the left panel of Figure 6, we plot these bands with 5% systematics for the unsmeared case (lighter shade) and without systematics for the smeared case (darker shade). For reference, we also show the band obtained using only the Pythia output (solid borders). From the figure, it is clear that the effect of smearing is encapsulated by the band with no smearing but with 5% systematics.
As a second check, we compare the 68% CL band on $\Lambda_{\rm eff}$ obtained using only $M_T$ with the band obtained by combining all the kinematic variables $\{M_T, p_T^\ell, p_T^{\rm miss}\}$. In the right panel of Figure 6, we show these bands for the best fit obtained by using only $M_T$ (lighter shade) and all the variables (darker shade). As expected, we get a tighter band for the case where all the variables are combined. These comparisons ensure that the correlations between the different variables, which we cannot take into account, do not substantially modify our conclusions.
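The comparison above can be mimicked with a toy one-parameter scan. The sketch below is only illustrative: it assumes parabolic $\chi^2$ curves with made-up minima and widths standing in for the three variables, sums them as if they were uncorrelated, and reads off the 68% CL interval from the standard $\Delta\chi^2 \le 1$ criterion for one parameter. None of the numbers or names come from the analysis itself.

```python
# Toy 68% CL intervals from a one-parameter chi-square scan, for a single
# observable and for a naive (uncorrelated) combination of three observables.

def interval_68(grid, chi2_values):
    """Return (low, high) of the grid points with chi2 - chi2_min <= 1."""
    chi2_min = min(chi2_values)
    allowed = [x for x, c in zip(grid, chi2_values) if c - chi2_min <= 1.0]
    return min(allowed), max(allowed)

def parabola(best_fit, sigma):
    """Hypothetical chi-square curve with a given minimum and width."""
    return lambda x: ((x - best_fit) / sigma) ** 2

# Hypothetical scan points for Lambda_eff in TeV (not the paper's values).
grid = [0.10 + 0.001 * i for i in range(301)]

# Made-up curves standing in for M_T, p_T^lepton, and p_T^miss.
curves = {
    "MT": parabola(0.22, 0.030),
    "pTl": parabola(0.21, 0.040),
    "pTmiss": parabola(0.23, 0.050),
}

chi2_mt = [curves["MT"](x) for x in grid]
# Neglecting correlations, the combined chi-square is the plain sum.
chi2_all = [sum(f(x) for f in curves.values()) for x in grid]

band_mt = interval_68(grid, chi2_mt)
band_all = interval_68(grid, chi2_all)
width_mt = band_mt[1] - band_mt[0]
width_all = band_all[1] - band_all[0]

print(f"M_T only : {band_mt[0]:.3f} to {band_mt[1]:.3f} TeV")
print(f"combined : {band_all[0]:.3f} to {band_all[1]:.3f} TeV")
```

In this toy setup the combined band is narrower than the $M_T$-only band, which is the qualitative behaviour described in the text; a treatment with correlations would replace the plain sum by a full covariance.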

B Further details of our analysis
In the main text, we presented the results for a particular combination of systematics, viz., 5% and 1% for CDF and the LHC experiments, respectively. We have used a larger systematic uncertainty for the Tevatron analyses, as our handle on the experimental details is much weaker there than for the LHC experiments. We have, however, ensured that our central results are robust against a variation of the systematics for the different experiments.
These analyses clearly indicate that the central observation is not a result of overestimating the variances in the denominator of Equation (2.2). That is, the NP scenario we have discussed is not merely an artefact of our numerics but a viable physical

For the sake of completeness, we note that when we take all the systematics to be zero, we still cannot completely rule out the NP scenario, with 0.19 TeV < $\Lambda_{\rm eff}$ < 0.24 TeV allowed at 99% CL. Even at 95% CL, $\Lambda_{\rm eff} \simeq 0.21$ TeV is still allowed. Along with CDF, ATLAS, and LHCb, in Figure 7 we have also plotted the band obtained from the D0 experiment. For the analysis corresponding to D0, we have used the same $p\bar{p}$ ($\sqrt{s} = 1.96$ TeV) raw data used for CDF, and analysed it with the cuts and fitting ranges given in the corresponding D0 paper [21]. We refrained from showing the details of the D0 simulations in the main text, as the D0 constraints are no stronger than the ones already obtained from CDF and ATLAS (for the different systematic combinations that we have checked).
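The role of the systematics in the denominator can be made concrete with a minimal sketch. It assumes a binned $\chi^2$ in which each bin has a Poisson statistical variance plus a flat fractional systematic added in quadrature; this is a common choice, not necessarily the exact form of Equation (2.2). The bin contents and the name `sys_frac` are made up for illustration.

```python
# Binned chi-square with a statistical and a fractional systematic variance.
# ASSUMED form: var = N_exp + (sys_frac * N_exp)**2 in each bin.

def chi_square(observed, expected, sys_frac=0.0):
    """Sum over bins of (N_obs - N_exp)^2 / (var_stat + var_sys)."""
    total = 0.0
    for n_obs, n_exp in zip(observed, expected):
        var_stat = n_exp                    # Poisson variance (assumption)
        var_sys = (sys_frac * n_exp) ** 2   # flat fractional systematic (assumption)
        total += (n_obs - n_exp) ** 2 / (var_stat + var_sys)
    return total

# Made-up bin contents, purely for illustration.
expected = [1000.0, 2500.0, 4000.0, 2500.0, 1000.0]
observed = [1040.0, 2450.0, 4100.0, 2530.0, 980.0]

# Compare the 0%, 1%, and 5% systematics settings discussed in the text.
chi2_by_sys = {f: chi_square(observed, expected, f) for f in (0.0, 0.01, 0.05)}

for frac, value in chi2_by_sys.items():
    print(f"systematics = {frac:.0%}: chi2 = {value:.3f}")
```

Raising `sys_frac` inflates every denominator and lowers the $\chi^2$, loosening the allowed band; repeating the fit with zero systematics, as done above, checks that a conclusion does not rest on an overestimated variance.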

Figure 1. Distributions of different kinematic variables corresponding to CDF (top), ATLAS (middle), and LHCb (bottom). In each row, the left, center, and right plots show the histograms corresponding to $M_T$, $p_T^\ell$, and $p_T^{\rm miss}$

Figure 4. Allowed (white) region consistent with all the measurements of $M_W$ (at $1\sigma$), along with the 95% CL exclusions obtained from ATLAS measurements of the $W \to \ell + p_T^{\rm miss}$ (red) and $WW \to e\mu + p_T^{\rm miss}$ (violet) cross sections.

Figures 2-4, where the biggest effect arises because of the strange quark. Converting to the basis of Equation (1.4), one finds an additional operator with the replacement $V^{\rm CKM}_{ud}\, d_L \to V^{\rm CKM}_{us}\, s_L$.

Figure 6. Left: 68% CL bands using Pythia results only (with 5% systematics) (transparent, solid boundaries), using Pythia + ResBos2 (5% systematics) (lighter shade), and using Pythia + ResBos2 + smearing (0% systematics) (darker shade). For all three bands, we use the $M_T$ distributions only. Right: 68% CL bands obtained by using the $M_T$ variable only (lighter shade) and by combining all kinematic variables ($M_T$, $p_T^\ell$, $p_T^{\rm miss}$) (darker shade).

Figure 7. In the left (right) panel, we plot the 68% (95%) CL bands obtained from our analysis for all four hadron collision experiments: CDF, D0, ATLAS, and LHCb. These bands are overlaid on the experimental measurements of $M_W$ from all these experiments.
Left: 68% CL bands corresponding to 0%, 1%, and 5% systematic uncertainties for the CDF experiment, overlaid on the CDF (+ResBos2 [71]) and D0 measurements of $M_W$ at $1\sigma$. Centre: 68% CL bands corresponding to ATLAS, overlaid on the ATLAS $M_W$ measurement at $1\sigma$. Right: 68% CL band for LHCb, overlaid on the LHCb $M_W$ measurement using $p_T^\ell$ only.
the histogram. The statistical component of the variance is rather straightforward. Using the notation established above, we take σ

Table 2. Event selection criteria for $W$ and $W\Phi$ production at $\sqrt{s} = 13$ TeV.