Bottom-Flavored Mono-Tau Tails at the LHC

We study the effective field theory sensitivity of an LHC analysis for the $\tau \nu$ final state with an associated b-jet. To illustrate the improvement due to the b-tagging, we first recast the recent CMS analysis in the $\tau\nu$ channel, using an integrated luminosity of 35.9 fb$^{-1}$ at $\sqrt{s}=13$ TeV, and provide limits on all the dimension-six effective operators which contribute to the process. The expected limits from the b-tagged analysis are then derived and compared. We find an improvement of approximately $30\%$ in the bounds for operators with a b quark. We also discuss in detail possible angular observables to be used as discriminators between dimension-six operators with different Lorentz structure. Finally, we study the impact of these limits on some simplified scenarios aimed at addressing the observed deviations from the Standard Model in lepton flavor universality ratios of semileptonic B-meson decays. In particular, we compare the collider limits on those scenarios set by our analysis, either with or without the b-tagging and assuming an integrated luminosity of 300 fb$^{-1}$, with relevant low-energy flavor measurements.


Introduction
The high-energy tails of two-to-two scattering processes at the LHC are some of the most sensitive probes of New Physics (NP) at the collider. In the absence of direct evidence for new physics, and assuming the mass scale of new particles lies above the energy reach of the collisions, these searches can provide very strong and model-independent limits on dimension-six operators. Scattering amplitudes involving such operators grow with the square of the energy, $E^2$, compared to the corresponding Standard Model (SM) amplitudes. This enhancement of new physics effects at high energies can be leveraged to compensate for the limited statistical and systematic precision of these processes, allowing the limits obtained in this way to be competitive with those derived from precision low-energy data. For instance, it has already been shown that high-energy tails of 2 → 2 processes at the LHC can provide complementary information to low-energy flavor physics on the flavor structure of New Physics [1][2][3][4][5][6][7][8][9] or even be competitive with LEP in constraining electroweak precision tests [10][11][12][13].
In the case of the process at hand, pp → τν, the relevant operators are semileptonic four-fermion operators. In the formalism of the SM Effective Field Theory (SMEFT) and in the Warsaw basis [14], the ones which show a growth with energy of the scattering amplitude, compared to the SM, are
$$[C^{(3)}_{lq}]_{ijkl}\,(\bar l_i \gamma_\mu \sigma^I l_j)(\bar q_k \gamma^\mu \sigma^I q_l) + [C_{ledq}]_{ijkl}\,(\bar l^\alpha_i e_j)(\bar d_k q^\alpha_l) + [C^{(1)}_{lequ}]_{ijkl}\,(\bar l^\alpha_i e_j)\,\epsilon_{\alpha\beta}\,(\bar q^\beta_k u_l) + \text{h.c.} \qquad (1)$$
Lepton and quark doublets are $l_i = (\nu^i_L, \ell^i_L)$ and $q_i = (V^*_{ji} u^j_L, d^i_L)$, respectively, where $V$ is the Cabibbo-Kobayashi-Maskawa (CKM) matrix.
This specific process is particularly interesting now due to its close connection with the measurements of lepton flavor universality (LFU) ratios of semileptonic B-meson decays, $R(D^{(*)}) = {\rm Br}(B \to D^{(*)}\tau\nu)/{\rm Br}(B \to D^{(*)}\ell\nu)$ (with $\ell = e, \mu$) [15][16][17][18][19][20][21][22][23][24][25], which, in a combined fit of BaBar, Belle, and LHCb data, show a deviation from the SM prediction at the ∼ 3σ level [26], hinting at the possible presence of new physics in the b → cτν transition.$^{1}$ Since the mass scale of new resonances indicated by these deviations lies in the few-TeV range, testing this process in high-energy scattering at the LHC is particularly well motivated. CMS [27] and ATLAS [28] searches in the τν channel have been recast to provide limits on EFT operators in [6,9].
The main goal of this work is to design an LHC analysis of the pp → τ ν process, including also the requirement of a b-jet in the final state. This is expected to improve the sensitivity on operators involving a b quark, such as those involved in the R(D ( * ) ) observables. In order to quantify the gain in sensitivity due to the b-tagging, and to validate our background analysis, we also recast the CMS analysis of the pp → τ ν search [27]. We thus provide the present EFT limits from this search, as well as the future sensitivity of the searches for both cases with and without the b-tagging.
In Section 2 we describe the EFT operators employed in the analysis, and the approach used to derive the EFT dependence of the cross section in each bin of the transverse mass. In Section 3 we validate our analysis and simulation for pp → τν against the CMS analysis in [27]. After that, we perform a new analysis for pp → τν + b to improve the sensitivity further. We also discuss the potential of some angular distributions for extracting more information on the tensor structure of four-fermion operators. In Section 4 we obtain the present limits and future sensitivity on the EFT coefficients from both the τν and τν + b analyses. In Section 5 we discuss some implications of these constraints on some flavor structures, comparing with low-energy flavor measurements such as $R(D^{(*)})$, B → τν, and τ decays. We conclude in Section 6. In the Appendices we provide the cross section fit in terms of EFT coefficients and the full differential cross section of the 2 → 3 process, as well as some simulation details.
$^{1}$ The light lepton channels are instead consistent with each other and with the SM expectation.

EFT contributions to high-energy tails
New physics effects in low-energy flavor observables are usually discussed in terms of an effective Hamiltonian defined at the low-energy scale with quarks in the mass basis.
For the charged-current transitions at hand, the relevant effective Lagrangian is usually defined as
$$\dots\; + C^{ij}_{T}\,(\bar u_i \sigma^{\mu\nu} P_L d_j)(\bar\tau \sigma_{\mu\nu} P_L \nu_\tau) + \text{h.c.} \qquad (2)$$
These coefficients, evaluated at the matching scale, can be easily translated into those in the linear basis, Eq. (1):
$$\dots\,[C^{(3)}_{lq}]_{33kj}\,,\;\dots \qquad (3)$$
Going from the matching scale down to the low-energy scale relevant for flavor processes, the anomalous dimension induced by QCD interactions must be taken into account [29]. It can be noted that the $O_{VLL}$ operator has no QCD anomalous dimension. The $O_{VRL}$ operator is generated at dimension-6 in the SMEFT only via anomalous W boson couplings to right-handed quarks, and at energies above the electroweak scale it is therefore resolved into a vertex correction for the W, so it does not behave as a four-fermion operator (no growth with energy of the scattering amplitude). It can also be generated as a dimension-8 operator, thus receiving a further $v^2/\Lambda^2$ suppression compared to dimension-6 operators. For this reason we keep it in the analysis done in the mass basis but drop it in the SMEFT analysis.
The parametrization in Eq. (2) is convenient for discussing low-energy flavor observables, but also for the high-energy tails studied here, as it features no interference among different EFT coefficients in the limit of negligible fermion masses.$^{2}$ We thus implement the effective operators in Eq. (2) in a FeynRules [30] model.
Since these semileptonic operators contribute to the scattering amplitude with a single insertion, in general the cross section is quadratic in the EFT coefficients and can be written as the sum of the SM contribution, SM-EFT interference terms, and quadratic terms [Eq. (4)], where i, j are flavor indices and X runs over all possible operators in Eq. (2). Operators with the top quark do not contribute to this process. This leaves thirty EFT coefficients (six from each type of operator). In the limit of negligible fermion masses, the interference terms $\sigma^{ij,X}_{\rm SM-EFT}$ vanish for all operators, except for the one associated with $C_{VLL}$. We obtain the linear and quadratic terms by simulating them separately using MadGraph5_aMC@NLO [31]. The complete cross section dependence on the EFT coefficients is provided in Appendix A.
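The quadratic dependence described above can be sketched in a few lines of Python. This is a minimal illustration for a single real coefficient in a single bin; the function name and the numerical inputs are hypothetical placeholders and are not taken from the fit in Appendix A.

```python
def eft_cross_section(c, sigma_sm, sigma_int, sigma_quad):
    """Cross section in one bin for a single real EFT coefficient c.

    sigma_sm   : SM contribution
    sigma_int  : SM-EFT interference term (multiplies c)
    sigma_quad : EFT-squared term (multiplies c**2)
    All three inputs are assumed to be in the same units.
    """
    return sigma_sm + c * sigma_int + c**2 * sigma_quad


def bsm_only(c, sigma_int, sigma_quad):
    """BSM part only (interference plus quadratic), i.e. without the SM term."""
    return eft_cross_section(c, 0.0, sigma_int, sigma_quad)


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the function.
    print(eft_cross_section(0.1, sigma_sm=1.0, sigma_int=0.2, sigma_quad=5.0))
```

For operators whose interference with the SM vanishes in the massless limit, `sigma_int` would simply be set to zero.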
Employing the EFT approach to discuss high-energy tails of scattering processes comes with important caveats regarding the validity of the EFT expansion. By assumption, the energy scale of new states should be much above the typical energy of the process, $M^2_{\rm NP} \gg \hat s$, where $\sqrt{\hat s} \sim 1$ TeV in our case. Due to the growth with the energy of the EFT scattering amplitude, the cross section in the most sensitive bins is dominated by the EFT-squared contribution, rather than the SM-EFT interference. Since quadratic terms are formally of order $\sim 1/M^4_{\rm NP}$, like the interference of possible dimension-8 operators with the SM, the validity and generality of the approach could be questioned if their inclusion were to affect the results. Nevertheless, in the case of single tree-level mediators this is not an issue, since it turns out that the interference of dimension-8 operators with the SM is always smaller than the interference of dimension-6 terms with the SM, if $M^2_{\rm NP} > \hat s$, as shown in [9]. A cancellation between dimension-six and dimension-eight contributions would require a specific multi-mediator scenario with tuned couplings.
Even if the mediator has a mass lower than the scattering energy, thus invalidating the EFT expansion, the limits obtained in the EFT approach can still be indicative of the true limits. In the case of a mediator exchanged in the s-channel, the true signal includes a resonance and is always larger than the EFT prediction, implying that the bounds obtained in the EFT would be conservative [5]. In the case of an exchange in the t or u channel, instead, the true signal can be smaller but, as shown in [6], the EFT limits approximate well those obtained in the complete model. We refer to [9] for a more detailed discussion of possible caveats due to the EFT expansion in the pp → τν(+b) process at the LHC.
$^{2}$ Only $O_{SL}$ and $O_T$ with the same flavor content have a non-vanishing interference among themselves.
Boosting flavor precision at the LHC

Tagging bottom flavor
Tagging a b-quark is beneficial in two respects. First, while the dominant SM contribution to the τν final state comes from the parton distribution functions (PDFs) of light quarks, the beyond the SM (BSM) contributions of interest are initiated by cb and ub initial state partons. Tagging a b-quark therefore suppresses mainly the SM contribution, and thus the sensitivity of the cross section to the EFT coefficients is enhanced. Secondly, by tagging a b-quark, one can restrict the analysis to the subset of four-fermion operators where one of the fields is a b-quark, thus reducing the dimensionality of the EFT parameter space entering the analysis. The dimension could be further reduced by an extra c-tagging.
The relevant collider search is pp → τ + $\not\!\!E_T$ + b. Inclusive τν resonance searches without b-tagging using data at $\sqrt{s} = 13$ TeV have been performed in [27,28]. To the best of our knowledge, no experimental search in pp → τν + b is available. Collider studies of the process pp → τν + b in the context of W′ and leptoquark searches have been performed in [32,33].

Validation against CMS τ ν analysis
We adopt the analysis of the CMS τν resonance search at $\sqrt{s} = 13$ TeV [27] with an integrated luminosity of 35.9 fb$^{-1}$, recasting it to derive the sensitivity on the EFT coefficients. We collect all simulation details in Appendix C. Here, we focus on describing our main analysis procedure and results.
We first identify the isolated leptons according to the criterion $p_T(l)/(p_T(l)+p_T(\text{cone})) > 0.85$, where $p_T(\text{cone})$ is the surrounding transverse momentum within an isolation cone of size $R_{\rm iso} = 0.3$. Any events with isolated leptons with $p_T(l) > 20$ GeV and $|\eta(l)| < 2.5$ are vetoed. All particles in the event are clustered by FastJet 3.1.3 [34] using the anti-$k_T$ algorithm [35] with a jet size of R = 0.5. Events with at least one jet that satisfies $p_T(j) > 20$ GeV and $|\eta(j)| < 2.5$ are selected.$^{3}$ Jets are classified into four categories depending on whether they match to heavy flavors or to a truth-level tau lepton, namely b-, c-, τ-jets and light jets. Jets are first iterated over to identify τ-jet candidates. While the CMS analysis in [27] uses a sophisticated multivariate-based (MVA-based) τ-jet identification, we classify a jet as a τ-jet candidate if a truth-level tau lepton from the hard process is found within a distance of R = 0.25 from the jet axis. Events with more than one τ-jet candidate are vetoed. The remaining jets are then searched for b-hadrons or c-hadrons inside them to identify b- and c-jet candidates. If a b-hadron (c-hadron) is found inside a jet, it is declared to be a b-jet candidate (c-jet candidate).
The leftover jets are classified as light jets. The missing transverse momentum $\vec p_T^{\,\rm miss}$ is defined as the negative vectorial sum of all visible reconstructed objects such as the τ-jet and QCD-jets.

Figure 1: The misidentification rate $\epsilon_{j\to\tau}$ of j → τ for the VLoose working point, taken from the CMS performance of reconstruction and identification of tau leptons using data at $\sqrt{s} = 13$ TeV [36].
Similarly to the analysis in [27], we adopt the very loose (VLoose) working point for the tag and mistag rates of the MVA-based τ-jet identification taken from [36] (see Fig. 4 of [36]). The tag rate at the VLoose working point is roughly 70%, $\epsilon_{\tau\to\tau} = 0.7$, whereas the mistag rate $\epsilon_{j\to\tau}$ is shown in Fig. 1. The mistag rate decreases with increasing $p_T(\tau)$ and is smaller than 0.4% for $p_T(\tau) \gtrsim 80$ GeV. In applying the mistag rate in Fig. 1 to QCD-jets in the τν analysis, we do not distinguish heavy flavor jets from light jets. Since the mistag rate is not available in [36] for $p_T(\tau) > 300$ GeV, we assume that it saturates at the smallest value in Fig. 1 in that range.
The analysis cuts imposed in the CMS analysis [27] are given in Eq. (5) and, to reflect the back-to-back configuration of the τν system, in Eq. (6),
$$0.7 < p_T^{\tau}/p_T^{\rm miss} \;\dots \qquad (6)$$
where $\vec p_T^{\,\tau}$ is the transverse momentum of the τ-jet, while its magnitude is denoted by $p_T^{\tau}$ (similarly for the missing transverse momentum). The variable Δφ in Eq. (6) is an azimuthal angle. Finally, events that pass the cuts in Eqs. (5) and (6) are binned in the transverse mass, $m_T$, defined in Eq. (7).
Following the description above, we validate our background simulation against the CMS analysis; the comparison is given in Table 1. The first two $m_T$ bins in Table 1 are in good agreement with the CMS result (values in parentheses), except for the $t\bar t$ background, which differs by more than a factor of two (in the conservative direction), while our estimate of the last bin turns out to be more conservative except for the dominant background, W+jets.$^{4}$ We did not investigate the discrepancy in Table 1 further, due to the limited information available in Ref. [27]. We point out, however, that the dominant background, W+jets, agrees well with the CMS analysis, and thus the sensitivities on the EFT coefficients derived from either our estimate or the CMS one will be similar.
The same set of cuts in Eqs. (5) and (6) …

Figure 3: The $N_j$ distribution of the signal with $C^{23}_{VLL} = 1$ and backgrounds. Events are restricted to include at least two jets, $N_j \ge 2$, and fewer than two b-jets, $N_b < 2$, and satisfy $p_T(\tau) > 50$ GeV (for the leading jet if no τ-jet is found) and $p_T^{\rm miss} > 100$ GeV.
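As an illustration of the kinematic variables used in this selection, the following Python sketch computes the azimuthal separation, a back-to-back requirement, and a transverse mass. It assumes the standard definition $m_T = \sqrt{2\,p_T^\tau\,p_T^{\rm miss}\,(1-\cos\Delta\phi)}$, which may differ in detail from Eq. (7). The function names are hypothetical, and the thresholds are left as arguments because only the lower bound 0.7 on the $p_T$ ratio is quoted in the text.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation folded into the range [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def transverse_mass(pt_tau, pt_miss, dphi):
    """Standard transverse mass of the tau-jet + missing-momentum system."""
    return math.sqrt(2.0 * pt_tau * pt_miss * (1.0 - math.cos(dphi)))


def passes_back_to_back(pt_tau, pt_miss, dphi, ratio_min, ratio_max, dphi_min):
    """Back-to-back requirement: pT ratio inside a window and large delta-phi.

    ratio_min = 0.7 is the value quoted in the text; ratio_max and dphi_min
    must be supplied by the user, since their values are not given here.
    """
    ratio = pt_tau / pt_miss
    return ratio_min < ratio < ratio_max and dphi > dphi_min
```

For an exactly back-to-back configuration ($\Delta\phi = \pi$) with equal momenta, the transverse mass reduces to $2\,p_T^\tau$.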

Analysis of τ ν with an associated b-jet
For the analysis with a b-jet, the event selection is the same as in Section 3.2, except that only events with at least two jets are considered. The extra jets, in addition to the τ-jet, in signal samples are likely to include a b-jet, whereas those in the background samples are likely light jets faking b-jets. Events with more than one τ-jet or b-jet are vetoed. We adopt the tag and mistag rates for b-jet identification given in Eq. (8), along with the VLoose working point for the τ-identification explained in Section 3.2.
W+jets is the irreducible SM contribution to the τν(+b) channel, and will interfere with the contribution from the EFT operators with the same helicity structure. In order to develop our analysis, we choose $C^{cb}_{VLL} = 1$ as the benchmark point for signal events. To avoid double counting the SM contribution, we take only BSM event samples from the interference and quadratic terms in Eq. (4). The benchmark signal events were generated through the process pp → τν matched up to an extra jet (using $k_T$-jet MLM matching [37]) in the 5-flavor scheme. The distributions of the same variables used in the CMS analysis described in Section 3.2 are illustrated in Fig. 2, where τ refers to the τ-jet (or the leading jet if not found). As is evident in Fig. 2, they continue to be efficient discriminators for the τν process with the associated b-jet.
We impose on the events the cuts given in Eq. (9) and, similarly, to reflect the back-to-back configuration of the τν system, those in Eq. (10). The cuts on $p_T(\tau)$ and $p_T^{\rm miss}$ in Eq. (9) were relaxed, compared to those in the τν analysis in Section 3.2, to retain more events. Additionally, we impose a cut on the jet multiplicity, Eq. (11), whose definition includes the τ-jet as well; this cut is efficient in reducing the $t\bar t$ background, as is evident in Fig. 3. The cuts in Eqs. (9)-(11) were not optimized (similar cuts are also found in [32]). We leave optimizing the cuts using multivariate methods or machine learning for future work.
Interestingly, we find that the dominant contribution of W+jets to the signal region comes from fakes, as illustrated in Fig. 4. To be specific, most τ-tagged jets in W+jets are found not to be in a back-to-back configuration with the missing transverse momentum, and what mimics the signal topology are fakes. Therefore, the estimation of the W+jets background becomes sensitive to the $p_T$-dependent tau mistag rate. In contrast, the signal region for the signal events is enriched in τ-tagged jets, as is evident in Fig. 4. Although W+jets is an irreducible background in terms of Feynman diagrams, this property makes it a kinematically reducible background to the signal, which implies a further suppression of the interference between the signal and background. Assuming this property remains true even at the level of dimension-8 operators, it will help in establishing a better EFT expansion, namely $\sigma_{\rm dim6^2} \gg \sigma_{\rm SM-dim8}$. According to the jet flavor distribution in Fig. 4, the c-jet population in W+jets is close to 16%, followed by a few % of b-tagged jets. Given the mistag rates in Eq. (8), we find that the dominant contribution to W+jets comes from c-jets faking b-jets, followed by b-jets and light jets faking b-jets (the last two have similar sizes). While we used a rather conservative mistag rate for c-jets, any improvement will further reduce the W+jets background. Note, however, that the signal from the bcτν-type operator benefits from the higher mistag rate for c-jets, as the extra jet can easily be c-flavored, as seen in the left panel of Fig. 4.
As an estimate of the systematic uncertainty for the backgrounds, we rescaled each uncertainty in the CMS analysis in Table 1 with the ratio of events between the two analyses. These were summed in quadrature for the total number of background events.
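The rescaling and quadrature sum described above amount to the following arithmetic, shown here as a minimal Python sketch. The function names are hypothetical, and any numbers passed in are for the user to supply from Tables 1 and 2.

```python
import math


def rescaled_systematic(sys_cms, n_cms, n_ours):
    """Scale a CMS systematic uncertainty by the ratio of event yields
    between the b-tagged analysis (n_ours) and the CMS analysis (n_cms)."""
    return sys_cms * n_ours / n_cms


def total_systematic(per_background):
    """Combine per-background systematic uncertainties in quadrature."""
    return math.sqrt(sum(s**2 for s in per_background))
```

This treats the individual background uncertainties as uncorrelated, which is what a quadrature sum implies.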
Our final background estimates for pp → τν + b are reported in Table 2.

Figure 5: The definition of the three angles in our coordinate system for the process cg → τνb. We factorized the 2 → 3 process effectively as the product of a 2 → 2 process and a 1 → 2 process. The artificially introduced intermediate momentum k corresponds to the momentum of the τν system (whether or not it is associated with a resonance).
As was mentioned in Section 3.1, the b-tagging is beneficial as it suppresses mainly the SM contribution, W+jets for instance, while retaining most BSM signals from the operators with a b-quark. Indeed, we can see by comparing Tables 1 and 2 that the size of W+jets is significantly reduced by simply demanding a b-tagged jet. In contrast, our benchmark signal with $C^{cb}_{VLL} = 1$ is reduced at most by a factor of three in the presence of the b-tagging, as illustrated in Table 3.

Studying angular distributions
The heavy flavor tagging can improve the sensitivity on operators involving a b-quark but has little or no impact on the different tensor structures. In order to increase the sensitivity on these, the natural candidates are angular observables. Furthermore, in the case of an observed deviation from the SM, studying angular distributions can help to resolve the degeneracy in operator space that would otherwise be present.
For a better understanding of the angular dependence, we evaluate analytically the partonic differential cross sections with respect to the various angles defining our coordinate system of the 2 → 3 process, consisting of five variables, namely $\sqrt{\hat s}$, z, θ, ψ, φ. The three angles are illustrated in Fig. 5. When the process is thought of as a 2 → 2 process like, for instance, cg → (τν) + b, by treating τν effectively as one particle (whether or not it is associated with a resonance), we use θ to refer to the polar angle in the rest frame of this effective 2 → 2 process. On the other hand, ψ refers to the polar angle of the τν system in its rest frame. The remaining angle φ denotes the relative angle between the two planes of the τν system and the aforementioned effective 2 → 2 process. The variable z is the fraction of the partonic energy $\sqrt{\hat s}$ flowing into the τν system. A more detailed description is given in Appendix B.
Assuming, for simplicity, that all particles in the processes are massless and that all EFT coefficients are real, the partonic differential cross section from the BSM contribution is evaluated in Eq. (12), where the integration over φ and z has been performed (see Appendix B for the full differential cross section before the integration). While the $O_{VLL}$ operator interferes with the SM contribution from W boson exchange, we have not generalized the differential cross section in Eq. (12) to include it, for a technical reason.$^{5}$ Instead, we included the SM contribution numerically (see Fig. 6). Although the SM contribution has a visible effect at lower energies, $\sqrt{\hat s} \ll \mathcal{O}({\rm TeV})$, its effect is found to be negligible around the TeV scale for the chosen EFT coefficient in Fig. 6. One notes that the interference term between $O_{SL}$ and $O_T$ in Eq. (12) disappears in our massless limit upon integrating over the polar angle. The distinction between operators with different Lorentz structures will be pronounced in the differential distribution of the polar angle ψ of the τν system, namely dσ/d cos ψ.
We can integrate the partonic cross section in Eq. (12) over θ. However, to avoid the singularity in the forward region, namely near θ ∼ 0 (from the t-channel diagram of the process), as is evident in Eq. (12), we need to impose a cut on the $p_T$ of the b-quark. For the given cut $p_T(b) \ge p_T^{\min}$ and fixed energy $\sqrt{\hat s}$, the differential cross section is obtained by integrating over θ and z, with the boundary values of cos θ fixed by the $p_T(b)$ cut. We performed the integration numerically for a fixed partonic energy $\sqrt{\hat s} = 1$ TeV with $p_T^{\min} = 20$ GeV. The resulting differential angular distribution is shown in Fig. 6. As is evident in Fig. 6, the distribution of dσ/d cos ψ looks promising as a discriminant for the different Lorentz structures of four-fermion operators. However, the distributions in Fig. 6 could be far from reality, as they are affected by kinematic cuts.

Figure 6: … operator with the best-fit value $C^{cb}_{VLL}|_{\text{best-fit}} = 0.068$ including the SM contribution. In both plots, the energy of the system is fixed to be $\sqrt{\hat s} = 1$ TeV and $p_T(b) \ge 20$ GeV was imposed.
We investigate the implication of the kinematic cuts on the angular distributions using partonic MC events of the pp → τνb process, in terms of the angular variables shown in Fig. 7, motivated by what has been explored in the single top process [38]. The angular variables in Fig. 7 are more suited for experimental measurements, whereas those in Fig. 5 were more convenient for the analytic evaluation. $\theta_N$ in Fig. 7 is the angle of $\vec p_\tau$ with respect to the normal vector $\vec N = \vec p_b \times \vec p_{\tau\nu}$, whereas the variable $\theta^*$ is the polar angle of the τ vector (denoted by $\vec p_\tau$) with respect to the τν vector ($\vec p_{\tau\nu}$) in the τν rest frame. $\theta^*$ is related to ψ in our coordinate system through the relation $\theta^* = \pi - \psi$. One could also define the angle between $\vec p_\tau$ and $\vec p_b$ (the 3-vector of the b) in Fig. 7. We found that its differential distribution is more pronounced than that of cos θ*, while having a similar shape.
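The two angular variables can be computed from 3-vectors as in the following Python sketch. It assumes the momenta are already expressed in the frame appropriate to each definition (the boost to the τν rest frame is not performed here), and the function names are hypothetical.

```python
import math


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cos_angle(a, b):
    """Cosine of the angle between two non-zero 3-vectors."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def cos_theta_n(p_tau, p_b, p_taunu):
    """cos(theta_N): angle of p_tau w.r.t. the normal N = p_b x p_taunu."""
    return cos_angle(p_tau, cross(p_b, p_taunu))


def cos_theta_star_from_psi(psi):
    """cos(theta*) from psi, using theta* = pi - psi."""
    return math.cos(math.pi - psi)
```

Since $\theta^* = \pi - \psi$, the two distributions are mirror images of each other: $\cos\theta^* = -\cos\psi$.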
After imposing the CMS-type cuts of Section 3.2 on the τ-lepton and the missing transverse momentum, the resulting angular distributions are illustrated in Fig. 8. Comparing the two plots in the left panels of Figs. 6 and 8, we observe that both edges of the distributions in Fig. 8 are depleted due to the kinematic cuts.$^{7}$ Interestingly, the distribution from the tensor operator becomes more pronounced. On the other hand, the distribution of dσ/d cos $\theta_N$ in the presence of kinematic cuts does not look promising.
We have not implemented the angular observables described in this section in our analysis, as this requires a more detailed study at the hadron level, including a realistic reconstruction of neutrinos. We leave a more comprehensive study of them for future work.

Sensitivity on EFT coefficients
In this Section we present the limits on the EFT coefficients in Eq. (2) obtained by recasting the CMS τν analysis [27]. We then compare the prospects for an integrated luminosity of 300 fb$^{-1}$ using the same analysis with those derived from our analysis with a b-tagged jet.
For each of the three $m_T$ bins the total cross section is the sum of the SM background cross section, as detailed in the previous sections, and the EFT contribution consisting of the interference and quadratic terms of Eq. (4). From this cross section we build a log-likelihood by assuming the number of events in each bin follows a Gaussian distribution.
Given the sufficiently large number of expected events in each bin, the central limit theorem ensures that using a Gaussian distribution instead of a Poisson one is a good approximation. We thus have
$$\chi^2 = \sum_{\rm bins} \frac{\left[ L\left(\sigma_{\rm SM,\,bin} + \sigma_{\rm EFT,\,bin}(C^{ij}_X)\right) - N^{\rm obs}_{\rm ev,\,bin} \right]^2}{\sigma^2_{\rm bin}} \qquad (15)$$
where $L$ indicates the luminosity, $\sigma_{\rm SM,\,bin}$ is the SM prediction for the cross section in each bin, $\sigma_{\rm EFT,\,bin}(C^{ij}_X)$ is the EFT-dependent cross section, and $N^{\rm obs}_{\rm ev,\,bin}$ is either the observed number of events in that bin (for recasting the CMS analysis) or is fixed to the expected number of events in the SM for the prospects. The variance $\sigma^2_{\rm bin}$ is obtained, for each bin, by combining in quadrature the statistical and systematic uncertainties. Correlations between different bins are neglected since they are not reported by the experiment.

Table 4: Limits on the EFT coefficients in Eq. (2), evaluated at the 1 TeV scale and switched on one at a time, using the CMS τν analysis at $\sqrt{s} = 13$ TeV and an integrated luminosity of 35.9 fb$^{-1}$. In the third and fourth columns we show the prospects with a luminosity of 300 fb$^{-1}$ for the same τν analysis and for the τν + b-jet analysis we propose, respectively.
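The binned Gaussian fit described above can be sketched as follows in Python, for a single real coefficient switched on at a time. The dictionary keys, the grid scan, and the use of $\Delta\chi^2 = 3.84$ for a 95% CL interval with one parameter are choices made for this illustration, not a description of the code used for the results in the tables.

```python
def chi2(c, bins, lumi):
    """Gaussian chi-square summed over bins for one real coefficient c.

    Each bin is a dict with keys (names are hypothetical):
      sigma_sm, sigma_int, sigma_quad : cross-section terms in the bin
      n_obs : observed (or SM-expected) number of events
      var   : variance, statistical and systematic combined in quadrature
    """
    total = 0.0
    for b in bins:
        sigma = b["sigma_sm"] + c * b["sigma_int"] + c**2 * b["sigma_quad"]
        total += (lumi * sigma - b["n_obs"]) ** 2 / b["var"]
    return total


def interval_95(bins, lumi, c_min=-1.0, c_max=1.0, steps=20001):
    """Scan c on a uniform grid and return the outer edges of the region
    with chi2 - chi2_min <= 3.84 (95% CL for one parameter)."""
    grid = [c_min + (c_max - c_min) * k / (steps - 1) for k in range(steps)]
    values = [chi2(c, bins, lumi) for c in grid]
    best = min(values)
    allowed = [c for c, v in zip(grid, values) if v - best <= 3.84]
    return min(allowed), max(allowed)
```

The function returns only the outermost edges of the allowed region, so a disconnected region (possible when the interference term is sizeable) would need a finer treatment.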

Sensitivity from CMS τ ν analysis and future prospects
In order to extract the present EFT limits from the CMS measurements in the τν channel, we fix the integrated luminosity to 35.9 fb$^{-1}$ and employ the CMS prediction for SM background events, see Table 1. We also use their estimate for the systematic uncertainty in each bin and combine it in quadrature with the statistical uncertainty. We checked that using our results instead of the CMS prediction for the SM backgrounds does not sizeably affect the results of the fit.
By setting the integrated luminosity to 300 fb$^{-1}$ and the number of events to the expected number in the SM, we obtain the future prospects for the EFT limits. We scale both statistical and systematic uncertainties as $\sqrt{L}$, assuming that systematic uncertainties will also decrease with time thanks to improved SM computations and understanding of the detector performance. We avoid extrapolating to the full HL-LHC luminosity since it is expected that the analysis will qualitatively improve with more data, for example thanks to the finer binning in the transverse mass that will be allowed when more events are collected, as well as improved experimental techniques.
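The luminosity projection can be sketched as below. This Python snippet assumes that "scaling as $\sqrt{L}$" refers to the absolute uncertainties on the event counts, so that relative uncertainties shrink as $1/\sqrt{L}$; this reading, like the function name, is an assumption of the illustration.

```python
import math


def project_bin(n_events, stat, syst, lumi_old, lumi_new):
    """Project one bin from lumi_old to lumi_new.

    Event yields scale linearly with luminosity; absolute statistical and
    systematic uncertainties are both scaled with sqrt(lumi_new / lumi_old).
    Returns the projected yield and the combined variance.
    """
    r = lumi_new / lumi_old
    n_new = n_events * r
    stat_new = stat * math.sqrt(r)
    syst_new = syst * math.sqrt(r)
    return n_new, stat_new**2 + syst_new**2
```

With this scaling the combined variance grows linearly with the luminosity ratio, while the yield also grows linearly, so the relative precision improves as $1/\sqrt{L}$.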
The present limits and future prospects on all the EFT coefficients, switched on one at a time, are collected in Table 4 (second and third column, respectively). We also derived 2D limits in all pairs of mass-basis EFT coefficients and checked that no relevant correlations are present, as expected from the fact that coefficients with different fermion flavor or chirality do not interfere with each other. The present limits obtained from the CMS analysis are in agreement with those derived in [6]. Comparing our 2D limits with those reported in [39], we also find good agreement. Using the relations in Eq. (3) we translate the $\chi^2$ of Eq. (15) into a function of the SMEFT coefficients in Eq. (1). The corresponding single-coefficient limits are shown in Table 5.
In this scenario the only large correlation between coefficients is between the $[C^{(3)}_{lq}]_{3333}$ and $[C^{(3)}_{lq}]_{3323}$ coefficients, since for both the leading contribution to pp → τν is due to the same bc → τν partonic process, as will be discussed in more detail below.
In the ancillary files chSQ LEFT CMS36fb.m and chSQ SMEFT CMS36fb.m we provide the complete χ 2 functions for the CMS recast in the two EFT bases, so that limits can be easily derived in any specific direction in the EFT coefficient space.

Sensitivity from the τ ν + b analysis
In a completely analogous manner we obtain the future prospects for the proposed τ ν analysis with an associated b-jet, discussed in Section 3.3. We use the estimate for the SM background contributions, and their systematic uncertainty, reported in Table 2 and the cross-section dependence on EFT operators with a b-quark, obtained with the same analysis.
The expected 95% CL intervals for each coefficient, taken one at a time, with an integrated luminosity of 300 fb$^{-1}$ are collected in the fourth column of Tables 4 and 5. Comparing with the expected bounds obtained for the same integrated luminosity from the analysis without the b-jet requirement (third column), we observe a 30-35% improvement in the sensitivity on those EFT coefficients. This improvement, at fixed luminosity, is larger than the one obtained when increasing the luminosity from 36 to 300 fb$^{-1}$ with the standard analysis. The improvement from the b-tagging is consistent with what has been found in the pp → µµ(+b-jet) channel in [7], where the limit, for a luminosity of 36 fb$^{-1}$, improved by ∼ 33% when compared with the analysis without the b-tag done in [5].

Flavor physics from collider tails
In this section we discuss what information on the flavor structure of New Physics can be extracted from high-p T tails of pp → τ ν(+b) at LHC, and how this compares with limits from low-energy flavor processes. This topic has already been the focus of several works in recent years, see [6,9] for τ ν searches and [1-5, 7-10] for other leptonic final states.
Since the main focus of our work is on the high-energy tails with a b-tagged jet, we concentrate on operators involving a b quark. For the purpose of illustration, among the operators in Eq. (2) we focus for the moment on the left-handed vector operator $O_{VLL}$. The two charged-current contact interactions involving a b quark are cb → τν and ub → τν, generated by the $C^{cb}$ and $C^{ub}$ coefficients, respectively. Since, by assumption, the new physics mediators should be above the energy scale of the collisions, these coefficients should be matched to the SMEFT operators in Eq. (1) (see Eq. (3) for the relations), as given in Eq. (16). The three $[C^{(3)}_{lq}]_{33i3}$ coefficients involved in these partonic transitions also generate, via CKM misalignment, contributions to other transitions involved in pp → τν, listed in Eq. (17). Note, however, that these transitions do not contribute to pp → τνb. Depending on the specific direction in UV flavor space of the SMEFT coefficients $[C^{(3)}_{lq}]_{33i3}$, the collider signal rate can be enhanced with respect to the contribution arising only from $C^{cb}$, thanks to the different parton luminosities and CKM factors.
Let us consider $C^{cb}$ and $C^{ub}$, which also contribute to pp → τνb. The naive estimate of the interference term between the SM and BSM amplitudes, taking into account the PDF luminosity, is given in Eq. (18), where $\hat\sigma_{ij}$ denotes the partonic cross section and $C^{cb}$, $C^{ub}$ were pulled out of the partonic cross sections for clear comparison, whereas the quadratic terms are given in Eq. (19). Switching on a single $[C^{(3)}_{lq}]_{33i3}$ coefficient at a time, the interference and quadratic terms in Eqs. (18) and (19) become … , where $\kappa_{i3} = V_{ui}/V_{ci}$ (= 4.22, 0.24, 0.09 for i = 1, 2, 3, respectively) and the PDF luminosity ratio is $L_{ub}/L_{cb} \approx (13, 24, 50)$ for a partonic scattering energy of $\sqrt{q^2} = (0.5, 1, 2)$ TeV, respectively, given the collision energy $\sqrt{s} = 13$ TeV. Also, the quadratic terms in the EFT are larger than the interference with the SM for the parameter space and energy range relevant for the bounds.
In the case of the $[C^{(3)}_{lq}]_{3313}$ coefficient, the contribution from the $ub$ initial state to the quadratic term of the cross section is enhanced by a large factor $\sim (4.22)^2 (L_{ub}/L_{cb})$ compared to the $cb$ initial state and thus dominates. For the $[C^{(3)}_{lq}]_{3323}$ coefficient, on the other hand, the contributions from the $ub$ and $cb$ initial states are of the same order at 1 TeV, with $ub$ ($cb$) becoming more important at higher (lower) energies. For $[C^{(3)}_{lq}]_{3333}$, the suppression due to the small numerical coefficient $|V_{ub}/V_{cb}|^2 = \kappa_{33}^2 = (0.09)^2$ is not compensated by the enhancement due to the up-quark PDF even up to 2 TeV of scattering energy; therefore the contribution from the $ub$ initial state is subdominant with respect to the one from $cb$ (see [6] for a related discussion).
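The size of these enhancement factors can be checked with a few lines of arithmetic. The sketch below is only illustrative: it takes the values of $\kappa_{i3}$ and $L_{ub}/L_{cb}$ quoted in the text and assumes that the $ub$/$cb$ ratio of quadratic terms scales as $\kappa_{i3}^2 \, L_{ub}/L_{cb}$, ignoring any difference in the partonic cross sections. The function and variable names are hypothetical.

```python
# Inputs quoted in the text: kappa_i3 = V_ui / V_ci and L_ub / L_cb at three energies.
KAPPA = {1: 4.22, 2: 0.24, 3: 0.09}
LUMI_RATIO = {0.5: 13.0, 1.0: 24.0, 2.0: 50.0}  # key: partonic energy in TeV


def quadratic_ratio(i, energy_tev):
    """Naive ub/cb ratio of quadratic terms for index i: kappa_i3^2 * L_ub / L_cb.

    Assumption: identical partonic cross sections for the ub and cb channels.
    """
    return KAPPA[i] ** 2 * LUMI_RATIO[energy_tev]


if __name__ == "__main__":
    for i in (1, 2, 3):
        row = ", ".join(
            f"{e} TeV: {quadratic_ratio(i, e):.3g}" for e in LUMI_RATIO
        )
        print(f"i = {i}: {row}")
```

Under these assumptions the ratio is far above one for $i = 1$, crosses one around 1 TeV for $i = 2$, and stays below one for $i = 3$ even at 2 TeV, in line with the discussion above.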
Another potentially interesting case is the contribution from $tb$ initial states for the operator with $[C^{(3)}_{lq}]_{3333}$. The relative contribution from $tb$ initial states roughly scales like $|V_{tb}/V_{cb}|^2 (L_{tb}/L_{cb})$ (with $|V_{tb}/V_{cb}| \sim 24.9$), up to the different phase-space contribution. Although the top PDF in the proton is negligibly small, including a top quark in the final state, $pp \to \tau\nu + tb$, would completely modify the set of backgrounds and may be worth investigating.
To illustrate this discussion quantitatively, we show in Fig. 9 the 95% CL limits and prospects in the planes of several pairs of SMEFT coefficients. [...] Fig. 9 is due to the fact that both limits arise mostly from the same partonic process $cb \to \tau\nu$, that is, the $C^{cb}_{V_{LL}}$ coefficient, while the weaker limit in the perpendicular direction arises from $ub \to \tau\nu$, due to the CKM suppression. In the central panels of Fig. 9 [...] $[C^{(3)}_{lq}]_{33i3}$ applies. Finally, in the bottom panel of Fig. 9 we illustrate the constraints in the plane of the scalar and tensor operators, $[C^{(1)}_{lequ}]_{3332}$ and the corresponding tensor coefficient [...].

Figure 9: 95% CL limits and prospects in several pairs of SMEFT coefficients, while other operators are set to zero. The solid (dashed) green lines are $1(2)\sigma$ contours from the $R(D^{(*)})$ fit [26], while orange lines are 95% CL limits from $B \to \tau\nu$.

The branching ratio $\mathrm{Br}(B \to \tau\nu)$ is given in Eq. (21), where $\mathrm{Br}(B^- \to \tau^-\bar\nu)_{\rm SM} = (7.92 \pm 0.55) \times 10^{-5}$ [42] and the combination of experimental measurements is $\mathrm{Br}(B^- \to \tau^-\bar\nu)_{\rm exp} = (1.09 \pm 0.24) \times 10^{-4}$ [43]; the corresponding $2\sigma$ limits are obtained by taking one coefficient at a time. For all the 2D planes in Fig. 9 we also show with solid (dashed) green lines the $1(2)\sigma$ contour from the $R(D^{(*)})$ fit (the RG evolution from $m_b$ up to 1 TeV is included, which is relevant for scalar and tensor operators [29]) and with orange lines the 95% CL limit from $B \to \tau\nu$. Comparing the low-energy limits with those from high-$p_T$ tails, we see that the expected sensitivity at 300 fb$^{-1}$ of our analysis with b-tagging starts to probe regions not already excluded by flavor measurements (see, for example, the upper two panels).
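As a rough cross-check of how tightly $B \to \tau\nu$ constrains new-physics contributions, one can form the ratio of the measured branching ratio to the SM prediction quoted above. The sketch below assumes uncorrelated Gaussian uncertainties added in quadrature, which is a simplification and not necessarily the treatment used for the limits in this work. The function names are hypothetical.

```python
import math

BR_SM, BR_SM_ERR = 7.92e-5, 0.55e-5    # SM prediction quoted in the text [42]
BR_EXP, BR_EXP_ERR = 1.09e-4, 0.24e-4  # experimental combination quoted in the text [43]


def ratio_with_error(num, num_err, den, den_err):
    """Return num/den and its uncertainty (relative errors added in quadrature).

    Assumption: the two uncertainties are Gaussian and uncorrelated.
    """
    ratio = num / den
    rel_err = math.hypot(num_err / num, den_err / den)
    return ratio, ratio * rel_err


def two_sigma_interval(central, err):
    """Symmetric 2-sigma interval under a Gaussian approximation."""
    return central - 2.0 * err, central + 2.0 * err


if __name__ == "__main__":
    r, dr = ratio_with_error(BR_EXP, BR_EXP_ERR, BR_SM, BR_SM_ERR)
    lo, hi = two_sigma_interval(r, dr)
    print(f"Br_exp / Br_SM = {r:.2f} +- {dr:.2f}, 2 sigma: [{lo:.2f}, {hi:.2f}]")
```

With these inputs the ratio is compatible with one within $2\sigma$, so the allowed window for any single coefficient is set by the width of this interval once the dependence of the branching ratio on that coefficient is inserted.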
Furthermore, while the leptonic decay $B \to \tau\nu$ only tests the specific combination of EFT coefficients in Eq. (21), LHC searches put independent limits on all of them, thanks to the vanishing interference between different coefficients. One could also expect the limits from high-$p_T$ tails to improve substantially at the HL-LHC, thanks to the larger number of events, finer binning, and possibly the addition of angular distributions.

Footnote 8: The combination $C^{cb}_{SL}(\mathrm{TeV}) = -4\, C^{cb}_{T}(\mathrm{TeV})$ is generated at the UV matching scale by integrating out the leptoquark $S_1 \sim (\mathbf{3}, \mathbf{1}, -1/3)$. The QCD RG evolution down to $m_b$ modifies it to $C^{cb}_{SL}(m_b) \approx -8\, C^{cb}_{T}(m_b) \in [0.113 \div 0.170]_{1\sigma}$, which is the value quoted in [40].
Going beyond operators with a b quark, let us consider low-energy observables constraining other $u_i d_j \tau\nu$ contact interactions. The other quark pairs are $ud$, $us$, $cd$, and $cs$.
The observables most sensitive to the first two are $\tau^- \to \nu\pi^-(K^-)$ decays, while charm transitions are tested in (semi-)tauonic D-meson decays. In order to compare the sensitivity reach on EFT operators, let us focus for simplicity on the left-handed operators $C^{ij}_{V_{LL}}$. Tau decays to pions and kaons are tested at the per-mille level, and the resulting limits [44] provide the following $2\sigma$ intervals: $C^{ud}_{V_{LL}} \in [-9.2, 1.6] \times 10^{-3}$, $C^{us}_{V_{LL}} \in [-2.8, -0.02] \times 10^{-2}$. Note that the latter does not include zero due to some tension with the SM.
Comparing these limits with those from $pp \to \tau\nu$ in Table 4, we observe that present LHC constraints are comparable, while future limits will be stronger. As regards the comparison with D-meson decays, a detailed analysis was done recently in [9], to which we refer for details. The limits obtained from (semi-)leptonic decays are $C^{cd}_{V_{LL}} \in [-0.21, 0.27]$ and $C^{cs}_{V_{LL}} \in [-1.4, 7.0] \times 10^{-2}$. In this case, too, the high-$p_T$ limits are stronger.

Collider limits for Rank-One-Flavor-Violation
In several new physics scenarios, the UV physics responsible for the contributions to $R(D^{(*)})$ and $pp \to \tau\nu$ couples only to a specific combination of left-handed quarks. For example, the vector leptoquark $U_1^\mu \sim (\mathbf{3}, \mathbf{1}, 2/3)$, which is one of the favoured scenarios for addressing the B-anomalies, couples to left-handed quarks and leptons; here we retain only the coupling to third-generation leptons, as it is the one contributing to $pp \to \tau\nu$. The coupling to left-handed quarks and third-generation leptons is thus parametrized by a vector $g_{i3}$ in $U(3)_q$ flavor space. As a consequence, the structure of the SMEFT coefficients is of rank one, $[C^{(3)}_{lq}]_{33ij} \propto g_{i3}\, g^*_{j3}$. The same rank-one structure is generated in other single-leptoquark scenarios and in all cases where the new physics flavor structure is induced via the mixing of SM quark doublets with a single vector-like fermion. The generalisation of this flavor structure has been dubbed Rank-One-Flavor-Violation (ROFV) in [45].
Following this hypothesis, we can parametrize the SMEFT coefficients as

$$[C^{(3)}_{lq}]_{33ij} = C_L\, \hat n_i\, \hat n_j^* , \qquad (25)$$

where $C_L \in \mathbb{R}$ and $\hat n_i$ is a unit vector in $U(3)_q$ flavor space:

$$\hat n = \left( \sin\theta\cos\phi\, e^{i\alpha_1},\; \sin\theta\sin\phi\, e^{i\alpha_2},\; \cos\theta \right), \qquad (26)$$

with $\theta \in [0, \pi/2]$, $\phi \in [0, 2\pi)$, $\alpha_{1,2} \in [-\pi/2, \pi/2]$. The directions aligned with down quarks form the chosen orthonormal basis in this space. Another possible choice of basis is the one aligned with up quarks, and the rotation between the two bases is given by the CKM matrix. In the left panel of Fig. 10, we draw the directions in $(\theta, \phi)$ associated with each SM quark direction (the corresponding $\alpha_{1,2}$ phases are not shown), and the corresponding $(\theta, \phi)$ are also shown as dots in the right panel.

Figure 10: Under the ROFV assumption (for $\alpha_1 = \alpha_2 = 0$), we show with a red-coloured region the 95% CL exclusion from the CMS $pp \to \tau\nu$ search, assuming that the best-fit value of $R(D^{(*)})$ is reproduced. The red and purple lines correspond to the expected 95% CL limits with 300 fb$^{-1}$ from $pp \to \tau\nu$ and $pp \to \tau\nu b$, respectively. We also report the 95% CL limits from $B \to \tau\nu$ (green), $\tau \to \nu K$ (cyan), and $\tau \to \nu\pi$ (orange).
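With this parametrization, the combination of coefficients contributing to $R(D^{(*)})$ is $C^{cb}_{V_{LL}} \propto C_L\, \hat n_3^* \sum_i V_{ci}\, \hat n_i$. By imposing that the measured value $C^{cb}_{V_{LL}} = 0.068 \pm 0.017$ from $R(D^{(*)})$ is reproduced (for example at the best-fit point), we can fix the overall coefficient $C_L$ in Eq. (27) as a function of $C^{cb}_{V_{LL}}$ (i.e. of $R(D^{(*)})$) and of the other parameters:

$$C_L \propto \frac{C^{cb}_{V_{LL}}\big|_{\text{best-fit}}}{\cos\theta \left( \cos\theta\, V_{cb} + \sin\theta\sin\phi\, e^{i\alpha_2}\, V_{cs} + \sin\theta\cos\phi\, e^{i\alpha_1}\, V_{cd} \right)} ,$$

with the constant of proportionality fixed by the matching relations in Eq. (3).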
By plugging this into the definition of the ROFV structure of the SMEFT coefficients in Eq. (25), all of the $[C^{(3)}_{lq}]_{33ij}$ depend only on $\theta$, $\phi$, and the two phases $\alpha_{1,2}$. Fixing the phases, for instance to zero, we can study the collider limits (and prospects) from $pp \to \tau\nu(+b)$ in the plane of $\theta$ and $\phi$, see Fig. 10. In the same figure we also report the 95% CL limits from low-energy processes sensitive to the $[C^{(3)}_{lq}]_{33ij}$ coefficients, specifically $B \to \tau\nu$ (green), $\tau \to \nu K$ (cyan), and $\tau \to \nu\pi$ (orange), as discussed in the previous section. The limits from D-meson decays, instead, are too weak for any value of $\theta$ and $\phi$.
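The rank-one construction can be made concrete with a short numerical sketch. It assumes the component ordering $\hat n = (\sin\theta\cos\phi\, e^{i\alpha_1}, \sin\theta\sin\phi\, e^{i\alpha_2}, \cos\theta)$ in the down-aligned basis, as suggested by the angular combination appearing in the expression for $C_L$. The function names are hypothetical, and the CKM values used in the example call are rough placeholders rather than fitted inputs. The overall normalization relating $C_L$ to $C^{cb}_{V_{LL}}$ is deliberately left out.

```python
import cmath
import math


def rofv_direction(theta, phi, alpha1=0.0, alpha2=0.0):
    """Unit vector n in quark-flavor space (down-aligned basis, ordering assumed)."""
    return (
        math.sin(theta) * math.cos(phi) * cmath.exp(1j * alpha1),
        math.sin(theta) * math.sin(phi) * cmath.exp(1j * alpha2),
        complex(math.cos(theta)),
    )


def rank_one_matrix(c_l, n):
    """Coefficient matrix [C]_ij = C_L * n_i * conj(n_j)."""
    return [[c_l * ni * nj.conjugate() for nj in n] for ni in n]


def cb_angular_factor(n, v_cd, v_cs, v_cb):
    """conj(n_3) * sum_i V_ci n_i: the angular combination multiplying C_L for cb."""
    return n[2].conjugate() * (v_cd * n[0] + v_cs * n[1] + v_cb * n[2])


if __name__ == "__main__":
    n = rofv_direction(theta=0.3, phi=1.0)
    # Placeholder CKM entries, for illustration only.
    print(cb_angular_factor(n, v_cd=-0.22, v_cs=0.97, v_cb=0.04))
```

For $\theta = 0$ the direction is aligned with the b quark and the angular factor reduces to $V_{cb}$, while for generic angles the $V_{cs}$ term can dominate, which is why the value of $C_L$ required by $R(D^{(*)})$ varies strongly across the $(\theta, \phi)$ plane.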

Conclusions
In this work, we derived the sensitivity to the EFT coefficients of four-fermion operators from a collider study of both the $\tau\nu$ and $\tau\nu + b$ channels at the LHC. The former has been extensively considered in the literature, including in the context of the anomalous $R(D^{(*)})$ measurements [6] and in comparing the sensitivity against D-meson decays [9]. Using the existing CMS $\tau\nu$ analysis with an integrated luminosity of 35.9 fb$^{-1}$ at $\sqrt{s} = 13$ TeV, we obtained the constraints on all EFT coefficients of four-fermion operators contributing to the process, both in the mass-eigenstate basis and in the gauge-invariant Warsaw basis.
The likelihood function for all coefficients is provided alongside this work in ancillary files, allowing the reader to study limits in any direction in the EFT space.
Using the $\tau\nu$ analysis to validate our procedure and our estimates of all the background channels against the CMS results, we studied the possibility of including bottom-flavor tagging by devising a dedicated analysis. The impact of b-jet tagging on the EFT analysis is mainly twofold. First, it allows one to focus only on the subset of EFT operators involving a b quark. Second, demanding a b-tagged jet suppresses the SM backgrounds while retaining most of the b-enriched signal events, thus improving the sensitivity to that subset of EFT coefficients. Comparing the sensitivity with that of the analysis without a b-jet, we estimate the improvement in the EFT limits to be approximately 30% for the same luminosity.
We also discussed possible strategies for distinguishing operators with different Lorentz structures using angular observables. To isolate the pure angular properties from the impact of a realistic neutrino reconstruction, we worked at the parton level, assuming perfect neutrino reconstruction in both our analytic evaluation and the MC simulation. The differential distribution of the polar angle $\theta^*$ (equivalent to $\psi$ in our analytic evaluation) in the $\tau\nu$ rest frame shows promising discrimination power. While a realistic neutrino reconstruction might be the major limitation, this observable certainly deserves further detailed investigation. We provided the full analytic differential cross section of the $2 \to 3$ process in Appendix B. This allows one to study other sets of angular observables analytically, as well as to transform easily to other coordinates.
Comparing the limits, and prospects, on pairs of coefficients derived from mono-$\tau$ tails with those from low-energy flavor measurements, specifically $R(D^{(*)})$ and $B \to \tau\nu$, we find that in some cases the LHC prospects with a luminosity of 300 fb$^{-1}$ and the b-tagging requirement start to be competitive. Furthermore, the higher luminosity reachable at the HL-LHC is expected to further improve the picture by reducing the statistical uncertainty, allowing more $m_T$ bins at high energy, and possibly enabling the study of angular distributions. A dedicated analysis is left for future work.
In several ultraviolet completions of the semi-tauonic operators $[O^{(3)}_{lq}]_{33ij}$, the mediators couple to a single direction in quark-flavor space. For instance, this is automatic for single-leptoquark exchange [45]. In this case the EFT coefficient matrix is a rank-one tensor: $[C^{(3)}_{lq}]_{33ij} = C_L\, \hat n_i\, \hat n_j^*$, where $\hat n_i$ is the unit vector in quark-flavor space. In our analysis, we have shown that, once the overall $C_L$ coefficient is fixed by the $R(D^{(*)})$ measurement and under a simplifying assumption for the phases, the collider limits on $[C^{(3)}_{lq}]_{33ij}$ can be recast in terms of two angles that nicely visualize the collider probes of the flavor (mis)alignment. In the plane of these two angles, the collider limits from mono-tau tails are competitive with the constraints from $B \to \tau\nu$ and from $\tau$ decays to $\nu\pi$ and $\nu K$.
The collider strategy we presented aims to improve the sensitivity to semileptonic four-fermion operators in the SMEFT containing a b quark. This is part of a larger effort by the community to extract the largest possible amount of information on EFT extensions of the Standard Model from LHC data, which will help us better understand the nature of NP.

Acknowledgments
We thank KyeongPil Lee and Hwidong Yoo for useful discussions regarding the experi-

A Cross section in terms of EFT coefficients
In Table 4, we have presented one-dimensional sensitivity by switching on only one operator at a time. Here, we present our result for the EFT cross section, keeping all operators.
Along with the background in Table 1, one should be able to construct the complete likelihood function in the space of EFT coefficients, cf. Eq. (15). Following the description in Section 3.2, the BSM cross section, in fb, after imposing the same cuts as those in the CMS analysis using 35.9 fb$^{-1}$ at $\sqrt{s} = 13$ TeV, is the sum of interference terms and quadratic terms in each of the three $m_T$ bins. The quadratic terms read

$$\begin{aligned} \dots ={}& 55620\,(C^{11}_{SL})^2 + 1345\,(C^{12}_{SL})^2 + 0.03106\,(C^{13}_{SL})^2 + 694.2\,(C^{21}_{SL})^2 + 3146\,(C^{22}_{SL})^2 + 3.091\,(C^{23}_{SL})^2 \\ &+ 55540\,(C^{11}_{SR})^2 + 1340\,(C^{12}_{SR})^2 + 0.03109\,(C^{13}_{SR})^2 + 686.9\,(C^{21}_{SR})^2 + 3151\,(C^{22}_{SR})^2 + 3.093\,(C^{23}_{SR})^2 \\ &+ 245200\,(C^{11}_{T})^2 + 5627\,(C^{12}_{T})^2 + \dots \end{aligned}$$

$$\begin{aligned} \dots ={}& 170400\,(C^{11}_{SL})^2 + 3942\,(C^{12}_{SL})^2 + 0.08904\,(C^{13}_{SL})^2 + 1846\,(C^{21}_{SL})^2 + 6909\,(C^{22}_{SL})^2 + 6.539\,(C^{23}_{SL})^2 \\ &+ 169500\,(C^{11}_{SR})^2 + 3938\,(C^{12}_{SR})^2 + 0.08971\,(C^{13}_{SR})^2 + \dots \end{aligned}$$

Note that the large numerical factors in front of many EFT coefficients are artifacts of our definition of the EFT coefficients. They do not invalidate the EFT expansion. Similarly, the BSM cross section for the $pp \to \tau\nu b$ process, following the description in Section 3.3, can be written as a sum of interference and quadratic terms in the three $m_T$ bins. As we briefly mentioned in Section 3.4, interference terms between the $O_T$ and $O_{SL}$ operators exist due to the kinematic cuts. We found that they are small enough to be ignored.
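To show how such a quadratic form is used in practice, the following sketch evaluates the scalar-left part of the first quadratic expression listed above for a chosen set of coefficients and converts it into an expected yield. It is illustrative only: interference terms and all other operators are omitted, the cross section is assumed to already include the selection cuts, and the function names are hypothetical.

```python
# Quadratic coefficients (in fb) of the scalar-left operators, copied from the
# first quadratic expression in this appendix; keys are the flavor indices
# (i, j) of C^{ij}_{SL}.
QUAD_SL_FB = {
    (1, 1): 55620.0, (1, 2): 1345.0, (1, 3): 0.03106,
    (2, 1): 694.2, (2, 2): 3146.0, (2, 3): 3.091,
}


def quadratic_xsec_fb(coeffs, table=QUAD_SL_FB):
    """Sum_k a_k * C_k^2 for the coefficients that are switched on (others zero).

    Assumption: real coefficients and no interference contribution.
    """
    return sum(table[key] * value ** 2 for key, value in coeffs.items())


def expected_events(xsec_fb, lumi_fb_inv):
    """Expected signal yield N = sigma * L (sigma in fb, L in fb^-1)."""
    return xsec_fb * lumi_fb_inv


if __name__ == "__main__":
    sigma = quadratic_xsec_fb({(2, 3): 0.5})  # only C^{23}_{SL} switched on
    print(f"sigma = {sigma:.4f} fb, N(35.9 fb^-1) = {expected_events(sigma, 35.9):.1f}")
```

Storing the coefficients in a dictionary keyed by flavor indices keeps the sketch close to the printed formula; the same pattern extends to the other operators and bins once their numerical coefficients are filled in from the ancillary files.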

B Calculation of differential cross section
The analytic evaluation of the $2 \to 3$ amplitude helps to understand precisely the energy-growing behavior of the amplitude and the various angular distributions. In this section, we calculate the helicity amplitudes and the differential cross section of the process $cg \to \tau\nu + b$, which is relevant for the $R(D^{(*)})$ anomaly. For the helicity amplitudes, we consider only the $O^{cb}_{V_{LL}}$ operator as an example (see Section 2 for the definition). For the differential cross section, we include all operators and express the result in terms of the angles defining our coordinate system.

B.1 Coordinate and four momenta
The $2 \to 3$ scattering process can be described in terms of 5 independent kinematic variables. Among many choices, we adopt the coordinate system of Eq. (39), in terms of the variables $(\sqrt{\hat s}, z, \theta, \phi, \psi)$, where $\sqrt{\hat s}$ is the total energy of the entire system, $z$ the fraction of energy flowing into the $k_1 k_2$ system, namely $E_{k_1} + E_{k_2} = z\sqrt{\hat s}$, $\theta$ the polar angle between the $p_1$ and $k_1 + k_2$ directions, $\phi$ the angle between the two planes made of the $(p_1, k_3)$ and $(k_1, k_2)$ pairs, and $\psi$ the polar angle between the $k_1$ and $k_1 + k_2$ directions in the $k_1 k_2$ rest frame. They are illustrated in Fig. 11.

Figure 11: Our coordinate system in the $p_1 p_2$ center-of-mass frame and four momenta (see also [...]).
The two incoming momenta $p_1$, $p_2$ and the three outgoing momenta $k_1$, $k_2$, $k_3$ in the $p_1 p_2$ center-of-mass frame are parametrized in terms of the variables in Eq. (39) as in Eq. (40), which involves the back-to-back directions $(1, \sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ and $(1, -\sin\theta\cos\phi, -\sin\theta\sin\phi, -\cos\theta)$, and where the momentum $k = k_1 + k_2$ has the invariant mass $m_k^2 = (2z - 1)\hat s$. Note that the $2 \to 3$ process can be effectively factorized into $2 \to 2$ and $1 \to 2$ via the intermediate momentum $k$ (whether or not the intermediate momentum is associated with a resonance). The momenta $k_1$ and $k_2$ in Eq. (40) are obtained by boosting those in the $k_1 k_2$ rest frame, which point along $(1, \sin\psi, 0, \cos\psi)$ and the opposite direction, along the z-axis with the boost factor $\gamma = E_k/m_k = z/\sqrt{2z - 1}$.

Figure 12: The t-channel diagrams of $cg \to \tau^+ \nu b$ from the W-boson exchange in the SM (left) and the four-fermion operator (right).

Figure 13: The s-channel diagrams of $cg \to \tau^+ \nu b$ from the W-boson exchange in the SM (left) and the four-fermion operator (right).
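The kinematics described in this subsection can be verified numerically. The sketch below assumes massless outgoing particles, the $k_1 k_2$ system moving along $+z$ with energy $z\sqrt{\hat s}$, and $k_3$ recoiling along $-z$; these prefactors are a reconstruction consistent with $m_k^2 = (2z - 1)\hat s$ rather than a copy of the original parametrization, and the function names are hypothetical. It is valid for $1/2 < z < 1$.

```python
import math


def minkowski_sq(p):
    """Invariant square p^2 with metric (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def boost_z(p, beta):
    """Boost the four-vector p along +z with velocity beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    e, px, py, pz = p
    return (gamma * (e + beta * pz), px, py, gamma * (pz + beta * e))


def final_state(s_hat, z, psi):
    """Massless k1, k2, k3 in the p1 p2 centre-of-mass frame (k1 + k2 along +z)."""
    rs = math.sqrt(s_hat)
    m_k = math.sqrt((2.0 * z - 1.0) * s_hat)  # invariant mass of the k1 k2 system
    beta = (1.0 - z) / z                      # |k| / E_k with E_k = z * rs
    half = 0.5 * m_k
    k1_rest = (half, half * math.sin(psi), 0.0, half * math.cos(psi))
    k2_rest = (half, -half * math.sin(psi), 0.0, -half * math.cos(psi))
    k3 = ((1.0 - z) * rs, 0.0, 0.0, -(1.0 - z) * rs)
    return boost_z(k1_rest, beta), boost_z(k2_rest, beta), k3


if __name__ == "__main__":
    k1, k2, k3 = final_state(s_hat=4.0, z=0.8, psi=0.6)
    total = tuple(a + b + c for a, b, c in zip(k1, k2, k3))
    print("total four-momentum:", total)
```

The check confirms that the three outgoing momenta sum to $(\sqrt{\hat s}, 0, 0, 0)$ and that $(k_1 + k_2)^2 = (2z - 1)\hat s$, independently of $\psi$; the angles $\theta$ and $\phi$ only enter through the orientation of the incoming momenta.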

B.2 Helicity amplitude
The t-channel amplitude in Fig. 12 involves the momentum $q = k_3 - p_2$ and the left-handed fermion current $j^\mu_L = \bar u(k_1)\gamma^\mu P_L v(k_2)$. Similarly, the s-channel amplitude in Fig. 13 involves $q = p_1 + p_2$ and the same current $j^\mu_L$. The t- and s-channel momenta squared follow from the parametrization in Eq. (40). Using the expressions for the spinors in terms of our coordinates, the t-channel amplitudes are evaluated in Eq. (46), and the s-channel amplitudes are obtained in the same way. The helicity amplitudes for the other operators in Eq. (2) can be obtained similarly. The overall amplitude grows like $\sim \sqrt{\hat s}$, as expected, whereas the BSM amplitude grows like $\sim \hat s$ with respect to the SM amplitude, as dictated by the Lorentz structure of the $O^{cb}_{V_{LL}}$ operator. As is evident in Eq. (46), the t-channel amplitude is singular in the forward region, $\theta \sim 0$, which leads to a logarithmic growth of the cross section, regulated by the bottom quark mass $m_b$. In practice, we need to regulate the large logarithm with a $p_T$ cut on the b-jet that is higher than $m_b$, since the coupling is roughly $\alpha_s \sim 1/\log(E^2/\Lambda^2_{\rm QCD})$.

C.2 Signal simulation for $pp \to \tau\nu$, $\tau\nu b$
The four-fermion operators were implemented in FeynRules [30] and the resulting UFO [...] In this work, we choose the 5-flavor scheme as our default for both the $\tau\nu$ and $\tau\nu b$ processes$^9$. The signal samples for the $\tau\nu b$ process were simulated through the $pp \to \tau\nu + 0, 1j$ process with MadGraph5_aMC@NLO v2.3.3, interfaced with Pythia v6.4, and they were matched at LO up to one extra jet using $k_T$-jet MLM matching. Since the b-jet is tagged for the $\tau\nu b$ process, we generate the signal events only for four-fermion operators with a b quark, such as $(bu)(\tau\nu)$ and $(bc)(\tau\nu)$, with all possible Lorentz structures. The signal samples for the inclusive $\tau\nu$ analysis, by contrast, were generated with MadGraph5_aMC@NLO v2.3.3, interfaced with Pythia v8.219 [47], without matching.
We numerically estimate the (partial) k-factor of the signal cross sections by comparing the signal rates of $pp \to \tau\nu$ without matching with those of the available matched samples of $pp \to \tau\nu + 0, 1j$ described above. The comparison is presented in Table 6, where the crude estimate of the k-factor is found to be roughly one. We have also checked that the differential distributions of all relevant kinematic variables agree well between the two cases.
We also point out that the signal rates obtained from $pp \to \tau\nu b$ at the matrix-element level without matching are not appropriate for the study of $\tau\nu b$. Not only do the unmatched $\tau\nu b$ processes severely underestimate the signal rates (as illustrated in Table 7), but the discrepancy in the differential distributions between the unmatched $\tau\nu b$ and matched $\tau\nu + 0, 1j$ samples is also not negligible.