Probing a $Z'$ with non-universal fermion couplings through top quark fusion, decays to bottom quarks, and machine learning techniques

The production of heavy neutral mass resonances, $Z'$, has been widely studied theoretically and experimentally. Although the nature, mass, couplings, and associated quantum numbers of this hypothetical particle are yet to be determined, current LHC experimental results have set strong constraints assuming the simplest beyond Standard Model (SM) hypotheses. We present a new feasibility study on the production of a $Z'$ boson at the LHC, with family non-universal couplings, considering proton–proton collisions at $\sqrt{s} = 13$ and 14 TeV. Such a hypothesis is well motivated theoretically and it can explain observed differences between SM predictions and experimental results, as well as being a useful tool to further probe recent results in searches for new physics considering non-universal fermion couplings.
We work under two simplified phenomenological frameworks where the $Z'$ masses and couplings to the SM particles are free parameters, and consider final states of the $Z'$ decaying to a pair of b quarks. The analysis is performed using machine learning techniques to maximize the sensitivity. Despite being a well motivated physics case on its own merit, such scenarios have not been fully considered in ongoing searches at the LHC. We note the proposed search methodology can be a key mode for discovery over a large mass range, including low masses, traditionally considered difficult due to experimental constraints. In addition, the proposed search is complementary to existing strategies.


I. INTRODUCTION
The standard model (SM) of particle physics has successfully explained a plethora of experimental observations involving weak, electromagnetic, and strong interactions over the last few decades. However, as experiments probe new questions and increasing energies, observations indicate the SM is incomplete and might be a low-energy remnant of a more complete theory. A multitude of theoretical models have been proposed to overcome the SM limitations. Although the initial motivations and resulting implications of these models can vary, a common characteristic is the manifestation of new particles that can be probed in proton-proton (pp) collisions at CERN's Large Hadron Collider (LHC).
Numerous ideas have been proposed to probe physics beyond the SM, motivating a large volume of searches at the LHC. Nonetheless, extensive searches have found no firm indication of new phenomena, largely constraining theories and setting exclusion limits up to multi-TeV on the masses of new particles predicted by those theories [1-8]. Possible explanations for the lack of evidence point to either new particles being too massive or having too low a production rate in existing colliders, or new physics having different features compared to what is traditionally assumed in many beyond SM theories and searches, thus remaining concealed in processes not yet investigated. In particular, many searches conducted so far at the LHC rely heavily on the assumption that these hypothesized new particles have similar couplings to all generations of fermions, including couplings to the partons inside the proton, thus favoring LHC production modes through light quarks. Therefore, if new phenomena are within the reach of the LHC, both in energy and production rate, they might manifest with different features compared to what is assumed in searches at high energy colliders, thus requiring new efforts and experimental quests.
In this paper, we consider a different scenario in which new particles have non-universal fermion couplings, favoring higher-generation fermions, which we refer to as anogenophilic particles. In particular, we consider a new neutral vector gauge boson, $Z'$, with couplings only to third generation fermions, referred to as tritogenophilic. This physics case is interesting both theoretically and because of recent results in precision measurements, offering a new physics phase space not yet fully explored at the LHC.
The mass, quantum numbers, and couplings of new hypothetical mediators can be open parameters to be determined experimentally, making the new physics phase space broadly defined. Thus, initial ATLAS/CMS searches for these new types of particles were conducted considering models with democratic couplings to all fermion families, and focused on Drell-Yan production mechanisms with light quarks (e.g., $q\bar{q} \to Z'$), and final states with muons and electrons with high signal acceptance and a narrow "bump" in the reconstructed invariant mass spectrum of lepton pairs sitting above a smooth and steeply falling background distribution [72,73]. However, from the phenomenological point of view, when couplings to light quarks are suppressed in pp colliders, relative to higher-generation fermions, new production mechanisms become dominant to generate and discover beyond SM resonances. They are produced in association with other SM particles and give rise to rare and peculiar signatures. The phenomenology of purely top-philic $Z'$ [74-76] scenarios, as well as models with a $Z'$ that couples to top quarks and tau/muon [77, 78] leptons, have already been studied in the literature. Furthermore, a CMS search has been performed for a neutral resonance coupling to top quarks and decaying to muons or electrons [79].
In this paper, we perform a previously unexamined feasibility study on the production of a more general tritogenophilic $Z'$ produced through the fusion of a $t\bar{t}$ pair ($t\bar{t}Z'$) and decaying to a pair of b quarks ($Z' \to b\bar{b}$), as in Figure 1. We consider the final state where one of the two remaining tops from the fusion process, referred to as spectator top quarks, decays to bW and the W boson subsequently decays to an electron or muon plus its neutrino. Such a choice balances the lower $W \to \ell\nu$ branching fraction compared to W boson decays into two quarks, with a cleaner final state. This has the double advantage of mitigating the large background from full-hadronic SM quantum chromodynamics (QCD) processes, and of overcoming the otherwise overwhelming event rate that is outside the typical trigger bandwidth at the LHC, rendering the search sensitive to a wide range of $Z'$ masses. For $Z'$ masses below the $2m_t$ kinematic production threshold, where $Z' \to t\bar{t}$ decays are not permitted, the $Z'$ decay to $b\bar{b}$ is the dominant discovery mode. Furthermore, the analysis strategy proposed in this paper provides enhanced sensitivity compared to other approaches already used in searches at the LHC [1,4,80-82]. Above $2m_t$, the reduced jet multiplicity of the $Z' \to b\bar{b}$ final state, in comparison to $Z' \to t\bar{t}$, favors the experimental reconstruction of the $Z'$ mass. In this work, machine learning techniques are used to maximize the experimental sensitivity.

II. SAMPLES AND SIMULATION
Signal and background samples are generated with MadGraph5_aMC (v2.6.3.2) [83] considering pp beams colliding with a center-of-mass energy of $\sqrt{s} = 13$ TeV and $\sqrt{s} = 14$ TeV. All samples are generated using the NNPDF3.0 NLO [84] set for parton distribution functions (PDFs). Parton level events are then interfaced with the PYTHIA (v8.2.05) [85] package to include parton fragmentation and hadronization processes, while DELPHES (v3.4.1) [86] is used to simulate detector effects, using the CMS detector geometric configurations and parameters, for performance of particle reconstruction and identification. At parton level, jets are required to have a minimum transverse momentum ($p_T$) of 20 GeV and pseudorapidity ($\eta$) $|\eta| < 5.0$. The cross sections in this paper are obtained with the aforementioned parton-level selections. The MLM algorithm [87] is used for jet matching and jet merging. The xqcut and qcut variables of the MLM algorithm, related to the minimal distance between partons and the energy spread of the clustered jets, are set to 30 and 45, respectively, as a result of an optimization process requiring the continuity of the differential jet rate as a function of jet multiplicity.

The signal samples are generated considering the production of a $Z'$ and two associated top quarks ($pp \to Z' t\bar{t}$), inclusive in $\alpha_{EWK}$ and $\alpha_{QCD}$. For our benchmark signal scenario, we consider the simplified model in Ref. [88] where the $Z'$ masses and couplings to the SM particles are free parameters, and defined as variations of the SM Z boson couplings (i.e., variations of the so-called Sequential Standard Model, SeqSM). The $Z'$ coupling to the first and second generation SM quarks is defined as $g_{Z'q\bar{q}} = g_q \times g_{Zq\bar{q}}$, where $g_{Zq\bar{q}}$ is the SM Z boson coupling to first and second generation quarks and $g_q$ is a "modifier" for the coupling. Similarly, the $Z'$ coupling to the third generation SM quarks is defined as $g_{Z',b/t,\bar{b}/\bar{t}} \times g_{Z,b/t,\bar{b}/\bar{t}}$, where $g_{Z',b/t,\bar{b}/\bar{t}}$ is the modifier to the SeqSM coupling. We refer to this model as "simplified phenomenological model 1" (SPM1). In all cases considered, the modifiers for the $Z'$ couplings to $t\bar{t}$ and $b\bar{b}$ are equal to each other, and thus for simplicity we henceforth refer to those modifiers as $g_{Z't\bar{t}}$. Therefore, a scenario with $g_{Z't\bar{t}} = 1$ has similar $Z'$ couplings to top/bottom quarks as the SeqSM.

Signal samples were created for $m(Z')$ ranging from 250 GeV to 2000 GeV. Table I lists the production cross sections for different $Z'$ masses, considering pp collisions at $\sqrt{s} = 13$ TeV and 14 TeV, and for two representative $g_q$ coupling scenarios with $g_{Z't\bar{t}} = 1$. The $g_q = 0$ case is a proxy for the tritogenophilic scenarios, where the couplings of the $Z'$ to light quarks are suppressed. The $g_q = 1$ case allows for non-negligible couplings to light quarks, and thus other $t\bar{t}Z'$ production processes can contribute, such as initial state radiation of a $Z'$ from a light quark.
In addition to our primary signal benchmark model described above, we also consider a tritogenophilic scenario where the $Z'$ is a color singlet vector particle whose effective couplings are not suppressed by factors of the electroweak mixing angles (as in the SeqSM) and whose relevant interactions to top/bottom quarks are given by the following renormalizable Lagrangian:

$$\mathcal{L} = \bar{q}\,\gamma_\mu \left(c_L P_L + c_R P_R\right) q\, Z'^{\mu} = c_t\, \bar{q}\,\gamma_\mu \left(\cos\theta\, P_L + \sin\theta\, P_R\right) q\, Z'^{\mu}, \qquad q = t, b,$$

where $P_{L/R} = (1 \mp \gamma_5)/2$ are the chirality projection operators, $c_t = \sqrt{c_L^2 + c_R^2}$ is the $Z'$ coupling to top/bottom quarks, and $\tan\theta = c_R/c_L$ is the tangent of the chirality angle. We consider the case where the $Z'$ couplings to top and bottom quarks are equal to each other, and thus for simplicity we henceforth refer to those couplings as $c_t$. This type of simplified model, which we refer to as "simplified phenomenological model 2" (SPM2), has been studied in Refs. [74-76], and it has been shown that $t\bar{t}Z'$ production is independent of $\theta$. We have checked that this is indeed the case. Thus, we only consider $\theta = \pi/2$. Although the signal kinematic distributions for this particular model are similar to those of SPM1, the $t\bar{t}Z'$ production cross sections for SPM2 are larger than those of SPM1 when $c_t = g_{Z't\bar{t}}$, since the SPM2 Lagrangian does not contain suppression terms from the electroweak mixing angles. Our primary motivation in using SPM2 is to compare the projected discovery reach of the analysis strategy proposed in this paper with other strategies, such as those in Ref. [75], which considers the $Z' \to t\bar{t}$ decay mode.
Several sources of background are considered for our studies, including production of top quark pairs ($t\bar{t}$), Z/W bosons with associated jets (V+jets), QCD multijet, associated production of a Higgs (h) or a $Z/\gamma^*$ boson from $t\bar{t}$ fusion processes (denoted $t\bar{t}h$ and $t\bar{t}X$), and associated production of four t quarks ($t\bar{t}t\bar{t}$). Since our signal topology targets final states with four bottom quarks ($Z' \to b\bar{b}$ and $t\bar{t} \to bWbW$), the $t\bar{t}$, V+jets, and QCD multijet backgrounds do not meaningfully contribute to our studies (< 1% of the total background). The $t\bar{t}h$, $t\bar{t}X$, and $t\bar{t}t\bar{t}$ processes are the dominant sources of background events. The $t\bar{t}h$ and $t\bar{t}X$ processes become important backgrounds when h and $Z/\gamma^*$ decay to a pair of bottom quarks. Table II shows the production cross sections for the dominant backgrounds, at $\sqrt{s} = 13$ TeV and 14 TeV.
Table I (column headers): $Z'$ mass (GeV); $\sigma_{g_q=0}$ (fb) and $\sigma_{g_q=1}$ (fb) at 13 TeV; $\sigma_{g_q=0}$ (fb) and $\sigma_{g_q=1}$ (fb) at 14 TeV.

The total event rates are determined using $N = \sigma \times L \times \epsilon$, where $N$ represents the total yield of events, $L$ the integrated luminosity considered (for this study, 150 fb$^{-1}$, 300 fb$^{-1}$, and 3000 fb$^{-1}$), and $\epsilon$ represents any efficiencies which might reduce the total event yield (e.g., particle identification efficiencies). The $L = 150$ fb$^{-1}$ scenario represents an estimate for the amount of data already collected by the ATLAS and CMS experiments, while the other luminosity scenarios are the expectations for the next decade of pp data taking at the LHC. All production cross sections are computed at tree-level. Since the k-factors associated with higher-order corrections to QCD production cross sections are typically greater than one, our estimates of the sensitivity are conservative.
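As a minimal sketch of the yield formula above, the function below evaluates $N = \sigma \times L \times \epsilon$ for the three luminosity scenarios. The function name `expected_yield` and the numerical inputs (a 2 fb cross section and a 5% total efficiency) are hypothetical placeholders chosen for illustration; they are not values from Table I or Table II.

```python
def expected_yield(cross_section_fb, luminosity_fb_inv, efficiency=1.0):
    """Return N = sigma * L * epsilon.

    cross_section_fb: production cross section in fb.
    luminosity_fb_inv: integrated luminosity in fb^-1.
    efficiency: product of all selection and identification efficiencies.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return cross_section_fb * luminosity_fb_inv * efficiency


# Hypothetical inputs for illustration only (not taken from the paper's tables).
SIGMA_FB = 2.0
EFFICIENCY = 0.05

for lumi in (150.0, 300.0, 3000.0):  # luminosity scenarios in fb^-1
    print(f"L = {lumi:6.0f} fb^-1 -> N = {expected_yield(SIGMA_FB, lumi, EFFICIENCY):.1f}")
```

Because the yield is linear in $L$, moving from 150 fb$^{-1}$ to 3000 fb$^{-1}$ scales both signal and background yields by a factor of 20 for fixed cross section and efficiency.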
Following Ref. [89], we consider three possible "working points" for the identification of the b-jet candidates in DELPHES: (i) the "Loose" working point of the DeepCSV algorithm, which gives an 85% b-tagging efficiency and 10% light quark mis-identification rate; (ii) the "Medium" working point of the DeepCSV algorithm, which gives a 70% b-tagging efficiency and 1% light quark mis-identification rate; and (iii) the "Tight" working point of the DeepCSV algorithm, which gives a 45% b-tagging efficiency and 0.1% light quark mis-identification rate. The choice of b-tagging working points is determined through an optimization process which maximizes discovery reach. The "Medium" working point was ultimately shown to provide the best sensitivity and was therefore chosen for this study. For muons (electrons), the assumed identification efficiency is 95% (85%), with a 0.3% (0.6%) mis-identification rate [90-92].
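To illustrate the trade-off between the working points quoted above, the sketch below computes the probability that four genuine b jets are all tagged, alongside the per-jet light quark mis-identification rate. It assumes the per-jet efficiencies are constant and independent between jets, which is a simplification made here for illustration and not a statement about the optimization actually performed; the names `WORKING_POINTS` and `prob_all_tagged` are hypothetical.

```python
# (b-tagging efficiency, light quark mis-identification rate) per working point,
# as quoted in the text for the DeepCSV algorithm.
WORKING_POINTS = {
    "Loose": (0.85, 0.10),
    "Medium": (0.70, 0.01),
    "Tight": (0.45, 0.001),
}


def prob_all_tagged(efficiency, n_jets=4):
    """Probability that n_jets genuine b jets are all tagged.

    Assumes a constant per-jet efficiency and independent tagging decisions.
    """
    return efficiency ** n_jets


for name, (eff, mistag) in WORKING_POINTS.items():
    print(f"{name:6s}: P(4 b tags) = {prob_all_tagged(eff):.3f}, "
          f"per-jet mistag rate = {mistag:.3f}")
```

Under these assumptions a looser working point retains more signal events with four b tags but admits more light jets, so the best choice depends on the background composition.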

III. DATA ANALYSIS USING THE GRADIENT BOOST ALGORITHM
The analysis of signal and background events is performed using a machine learning event classifier, namely gradient boosted decision trees (BDTs) [93]. Machine learning offers advantages over traditional event classification methods. In particular, machine learning models consider all kinematic variables in tandem, efficiently traversing the high-dimensional space of event kinematics, thereby enabling them to enact complicated selection criteria that incorporate that high-dimensional space in its entirety.
This method iteratively trains decision trees to learn the residuals between predictions and expected values yielded by the tree trained just before it, thereby greedily minimizing error at each iteration. BDTs have been employed to great effect previously in classification problems arising in collider physics (e.g., [94-100]).
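The residual-fitting idea described above can be sketched with a toy, pure-Python booster. This is an illustration under simplifying assumptions and not the classifier used in the analysis: it uses a single input feature, depth-one trees (stumps), and a squared-error loss, whereas production classifiers typically use deeper trees and a logistic loss. All function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a depth-one tree: the threshold split minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # no valid split: predict the mean residual everywhere
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, x):
    correction = sum(stump_predict(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction


# Toy data: label 0 ("background-like") at low x, label 1 ("signal-like") at high x.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
model = fit_boosted_stumps(xs, ys)
print(predict(model, 0.15), predict(model, 0.85))
```

The learning rate shrinks each correction, so the remaining residual decays geometrically with the number of rounds on this toy data set.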
Simulated signal and background events are initially filtered, before being passed to the BDT algorithm, requiring at least four well reconstructed and identified b-jet candidates, at least two jets not tagged as b jets, and exactly one identified light lepton ($\ell$), that could be either an electron (e) or a muon ($\mu$). Selected jets must have $p_T > 30$ GeV and $|\eta(j)| < 5.0$, while b-jet candidates with $p_T > 30$ GeV and $|\eta(b)| < 2.5$ are chosen. The $\ell$ object must pass a $p_T > 25$ GeV threshold and be within $|\eta(\ell)| < 2.5$. Overlapping objects in $\eta$-$\phi$ space are removed using a minimum $\Delta R$ among all particle candidates ($p_i$) above 0.3, where $\Delta R(p_i, p_j) = \sqrt{(\Delta\phi(p_i, p_j))^2 + (\Delta\eta(p_i, p_j))^2}$. These filtering criteria will be henceforth referred to as pre-selections. Table III summarizes these pre-selections for the analysis.
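The pre-selection above can be sketched as a simple event filter. The event representation (lists of dictionaries with `pt`, `eta`, `phi` keys) and the function names are assumptions made for this illustration. One simplification is labeled in the code: the sketch vetoes an event containing any overlapping pair, whereas the text describes removing overlapping objects, and the removal priority is not specified here.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(obj1, obj2):
    """Delta R = sqrt(dphi^2 + deta^2) between two objects."""
    return math.hypot(delta_phi(obj1["phi"], obj2["phi"]),
                      obj1["eta"] - obj2["eta"])


def passes_preselection(bjets, light_jets, leptons, min_delta_r=0.3):
    """Apply the kinematic thresholds and multiplicity requirements."""
    good_b = [j for j in bjets if j["pt"] > 30.0 and abs(j["eta"]) < 2.5]
    good_j = [j for j in light_jets if j["pt"] > 30.0 and abs(j["eta"]) < 5.0]
    good_l = [l for l in leptons if l["pt"] > 25.0 and abs(l["eta"]) < 2.5]
    if len(good_b) < 4 or len(good_j) < 2 or len(good_l) != 1:
        return False
    # Simplification: veto the event if any pair of selected objects overlaps,
    # instead of removing one object of the pair.
    objects = good_b + good_j + good_l
    return all(delta_r(a, b) > min_delta_r
               for i, a in enumerate(objects) for b in objects[i + 1:])


def make(pt, eta, phi):
    return {"pt": pt, "eta": eta, "phi": phi}


# Hypothetical event with well-separated objects.
bjets = [make(50.0, 0.0, phi) for phi in (0.0, 1.0, 2.0, 3.0)]
light_jets = [make(50.0, 2.0, 0.0), make(50.0, 2.0, 1.0)]
leptons = [make(40.0, -1.5, 0.0)]
print(passes_preselection(bjets, light_jets, leptons))
```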
Events passing this pre-selection are used as input for the BDT algorithm, which classifies them as signal or background, using a probability factor. We implement the BDT algorithm using the canonical scikit-learn [101] and xgboost [102] libraries. In particular, we employed the XGBClassifier class in the latter library with 250 iterations, a max depth of 7, a learning rate of 0.1, and default parameters otherwise, although we note that model performance was found to be largely independent of hyperparameters.

As can be seen from Figures 2 and 3, for $m(Z')$ values beyond the electroweak scale, the relatively large leading and subleading b-jet $p_T$ is a key feature attributed to the heavy $Z'$ with respect to the mass of the bottom quarks, thus resulting in an average $p_T(b_{1,2})$ of approximately $m(Z')/2$. This kinematic feature provides a useful handle to discriminate high $m(Z')$ signal events amongst the large SM backgrounds, which have lower average $p_T(b_{1,2})$ constrained by the top quark and/or Higgs masses. The $\Delta R$ separation between $b_1$ and $b_2$ is determined by the amount of momentum transfer to the resonant particles in each process ($Z'$, h, or t), which in turn depends on the masses of those particles. Therefore, the discrimination provided by $\Delta R(b_1, b_2)$ improves as $m(Z')$ becomes larger (Figure 4). Variables sensitive to the leptonic decay of a spectator top quark (for example, a transverse mass of the lepton, b jet, and $p_T^{miss}$ system near $m_t$) are also included, as this is an important feature in our signal (Figure 1).

A trained BDT can return the discriminating power of each of its inputs: we found that the plotted kinematic variables (i.e., $p_T(b_1)$, $p_T(b_2)$, $\Delta R(b_1, b_2)$, and $m(b_1, b_2)$) were among the most productive variables from this standpoint, producing about 60-75% of signal significance (depending on $Z'$ mass), but the inclusion of all 47 variables does provide a non-trivial enhancement.
Table III: Preliminary event selection criteria used to filter events that are passed to the gradient boosting algorithm. A $\Delta R(p_i, p_j) > 0.3$ requirement is applied to all the particle candidate pairs $p_i$, $p_j$.

Figure 6 shows the distributions for the output of the BDT algorithm for a SPM1 signal benchmark point with $m(Z') = 350$ GeV and $\{g_q, g_{Z't\bar{t}}\} = \{0, 1\}$, and the dominant backgrounds. Figure 7 shows the BDT output for $m(Z') = 500$ GeV and $\{g_q, g_{Z't\bar{t}}\} = \{1, 1\}$. The distributions in Figures 6 and 7 are normalized to an area under the curve of unity. Table IV shows the expected event yields per bin, normalized to cross section times luminosity times pre-selection efficiency, for a particular choice of bin ranges of the BDT output. The bins are counted from 1 to 100, going from left to right, such that bin 1 is the leftmost bin near a BDT output of 0, and bin 100 is the rightmost bin near a BDT output of 1. The backgrounds dominate over the SPM1 benchmark signal yields in a large part of the BDT output spectrum, especially near zero, where the background yields are about six orders of magnitude larger. The presence of signal will be observed as an enhancement in the yields near a BDT output of unity.
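The bin numbering convention described above can be sketched as follows. The sketch assumes 100 bins of equal width covering the BDT output range [0, 1]; the equal-width choice and the function name `bdt_bin` are assumptions made for illustration.

```python
def bdt_bin(score, n_bins=100):
    """Map a BDT output in [0, 1] to a 1-based bin index.

    Assumes equal-width bins; a score of exactly 1 falls in the last bin.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("BDT output must lie in [0, 1]")
    return min(int(score * n_bins) + 1, n_bins)


def fill_histogram(scores, weights, n_bins=100):
    """Accumulate weighted yields per bin (index 0 holds bin 1)."""
    yields = [0.0] * n_bins
    for score, weight in zip(scores, weights):
        yields[bdt_bin(score, n_bins) - 1] += weight
    return yields


# Hypothetical scores and per-event weights for illustration.
hist = fill_histogram([0.0, 0.004, 0.5, 0.999, 1.0], [1.0, 1.0, 2.0, 0.5, 0.5])
print(hist[0], hist[50], hist[99])
```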

Figure 7: Output of the gradient boosting algorithm for a $Z'$ signal with mass of 500 GeV and $g_q = 1$ coupling, and for the most relevant backgrounds ($\sqrt{s} = 14$ TeV). The distributions are normalized to unity.
Table IV: Event yields for the main backgrounds and the signal point for $m(Z') = 1.0$ TeV, for some of the bin entries for the output of the gradient boosting algorithm. The events correspond to the 14 TeV, $g_q = 0$, and 3000 fb$^{-1}$ luminosity scenario.

Using the BDT distributions normalized to cross section times pre-selection efficiency times luminosity, we calculate the expected experimental signal significance of the proposed search methodology, for different signal models, LHC operation conditions, and integrated luminosity scenarios. As noted earlier, we consider three values for the total integrated luminosity at the LHC: (i) 150 fb$^{-1}$, which is approximately the amount of pp data already collected by the ATLAS and CMS experiments; (ii) 300 fb$^{-1}$, expected in the next few years; and (iii) 3000 fb$^{-1}$, expected by the end of the High Luminosity LHC era. The significance is calculated using the expected bin-by-bin yields of the BDT output distribution in a profile likelihood fit, using the RooFit [103] package, part of the ROOT framework developed at CERN. Similar to Refs. [88,104-108], the signal significance $Z_{sig}$ is determined using the probability of obtaining the same test statistic with the background-only hypothesis and the signal plus background hypothesis, defined as the local p-value. The value of $Z_{sig}$ corresponds to the point where the integral of a Gaussian distribution between $Z_{sig}$ and $\infty$ results in a value equal to the local p-value.
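The conversion between a local p-value and $Z_{sig}$ described above can be sketched with the standard normal distribution. The second function is a deliberately simpler stand-in for the profile likelihood fit: a binned Asimov estimate with no nuisance parameters, included only to show how bin-by-bin yields combine into one significance. It is not the procedure used for the results in this paper, and all names and example yields are hypothetical.

```python
import math
from statistics import NormalDist


def p_from_z(z_sig):
    """One-sided Gaussian tail probability: integral from z_sig to infinity."""
    return 0.5 * math.erfc(z_sig / math.sqrt(2.0))


def z_from_p(p_value):
    """Invert the tail integral: the z_sig whose upper tail equals p_value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(p_value)


def asimov_significance(signal, background):
    """Binned Asimov significance without systematic uncertainties.

    Stand-in for a full profile likelihood fit: sums
    2 * ((s + b) * ln(1 + s / b) - s) over bins and takes the square root.
    """
    total = 0.0
    for s, b in zip(signal, background):
        if b <= 0.0:
            raise ValueError("each bin needs a positive background yield")
        total += 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return math.sqrt(total)


# Hypothetical yields in the three highest BDT bins, for illustration only.
signal_yields = [2.0, 5.0, 12.0]
background_yields = [40.0, 15.0, 6.0]
z = asimov_significance(signal_yields, background_yields)
print(f"Z = {z:.2f}, local p-value = {p_from_z(z):.3g}")
```

In this stand-in the bins with the largest signal-to-background ratio dominate the sum, which mirrors why the enhancement near a BDT output of unity drives the sensitivity.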
Systematic uncertainties are incorporated into the significance calculation as nuisance parameters, using a lognormal prior for normalization and a Gaussian prior for shape related uncertainties. The systematic uncertainties are based on both experimental and theoretical constraints. A 3% systematic uncertainty is used to account for experimental errors on the estimation of the integrated luminosity collected by experiments. This is a reasonable and conservative choice based on Ref. [109]. A systematic uncertainty is included due to the choice of PDF, with respect to the default set used to produce the simulated signal and background samples. The PDF uncertainties were calculated following the PDF4LHC prescription [103], and result in up to 5% systematic uncertainty, depending on the process. The effect of the chosen PDF set on the shape of the BDT output distribution is negligible. Other theoretical uncertainties were considered, such as the absence of higher-order contributions to the signal cross sections, which can alter the pre-selection efficiency and shapes of kinematic distributions which are fed into the BDT algorithm. This uncertainty is calculated by varying the renormalization and factorization scales by a factor of two with respect to the nominal value, and by considering the full change in the bin-by-bin yields of the BDT output distribution. They are found to be at most 3% in a given bin.

For experimental uncertainties related to the reconstruction and identification of bottom quarks, Ref. [110] reports a systematic uncertainty of 1-5%, depending on $p_T$ and $\eta$ of the b-jet candidate. However, we assume a conservative 5% uncertainty per b-jet candidate, independent of $p_T$ and $\eta$, which is correlated between signal and background processes with genuine bottom quarks, and correlated across BDT bins for each process. The electron and muon reconstruction, identification, and isolation requirements have an uncertainty of 2%, while a conservative 3% systematic uncertainty is set on the variation of the electron and muon energy/momentum scale and resolution [111,112]. We assumed 2-5% jet energy scale uncertainties, depending on $\eta$ and $p_T$, resulting in shape-based uncertainties on the BDT output distribution that range from 1% to 4%, depending on the BDT bin. Finally, we consider a 10% systematic uncertainty associated with possible errors on the background predictions, which are uncorrelated between background processes.
Figure 8 shows the SPM1 signal significance as a function of $Z'$ mass, for the $\{g_q, g_{Z't\bar{t}}\} = \{0, 1\}$ and $\{g_q, g_{Z't\bar{t}}\} = \{1, 1\}$ coupling scenarios, assuming $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$. A signal significance of 1.69σ is our threshold to define expected exclusion at 95% confidence level, while 3σ (5σ) significance defines evidence (discovery) of new physics. For the $\{g_q, g_{Z't\bar{t}}\} = \{1, 1\}$ scenario, the analysis shows potential to exclude masses below 1.0 TeV, and achieve greater than 3σ (5σ) signal sensitivity for $Z'$ masses below 800 (675) GeV. For the SPM1 scenario with $\{g_q, g_{Z't\bar{t}}\} = \{0, 1\}$, the expected exclusion range is $m(Z') < 780$ GeV, and the 3σ (5σ) reach is $m(Z') < 600$ (500) GeV.

Figure 9 shows the results for the same scenarios, but considering pp collisions at $\sqrt{s} = 14$ TeV and integrated luminosities of 300 fb$^{-1}$ and 3000 fb$^{-1}$. For the $\{g_q, g_{Z't\bar{t}}\} = \{1, 1\}$ scenario and assuming an integrated luminosity of 3000 fb$^{-1}$, the expected exclusion bound goes up to $m(Z') < 1.7$ TeV, while the 3σ reach improves to $m(Z') < 1.45$ TeV.

We also estimate the expected signal significance for different SPM1 coupling scenarios of the $Z'$ boson to t/b quarks. Figure 10 shows the signal significance for different $g_{Z't\bar{t}}$ and $m(Z')$ scenarios, with suppressed couplings to first and second generation quarks ($g_q = 0$), assuming $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$. Figure 11 shows the corresponding results for the same $\{g_{Z't\bar{t}}, m(Z')\}$ combinations, but using $g_q = 1$. The results for $\sqrt{s} = 14$ TeV, assuming 300 fb$^{-1}$ and 3000 fb$^{-1}$, are presented in Figures 12-15 for different $\{g_q, g_{Z't\bar{t}}, m(Z')\}$ combinations.

Figure 8: Signal significance as a function of the $Z'$ reconstructed mass, at $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$ luminosity, for the $g_q = 0, 1$ and $g_{Z't\bar{t}} = 1$ benchmark coupling scenarios. The 1.69σ reference point for exclusion, and the 3σ and 5σ points for discovery sensitivity are shown as red-dashed lines.

Figure 9: Signal significance as a function of the $Z'$ reconstructed mass, at $\sqrt{s} = 14$ TeV and 300 fb$^{-1}$ (3000 fb$^{-1}$) luminosity, for the $g_q = 0, 1$ and $g_{Z't\bar{t}} = 1$ benchmark coupling scenarios. The 1.69σ reference point for exclusion, and the 3σ and 5σ points for discovery sensitivity are shown as red-dashed lines.
Table V shows the SPM2 signal significance as a function of $m(Z')$ and integrated luminosity, for the $\{c_t, \theta\} = \{1, \pi/2\}$ scenario, assuming $\sqrt{s} = 14$ TeV. The expected SPM2 exclusion range is $m(Z') < 1.5$ TeV at $L = 300$ fb$^{-1}$, while the 5σ discovery reach is $m(Z') < 1.5$ TeV for the 3000 fb$^{-1}$ expected by the end of the high luminosity LHC era.

V. DISCUSSION
As the LHC continues to run with pp collisions at the highest energy, and with the slow increase in luminosity expected of the high-luminosity program of the accelerator, it is important to consider why certain searches for new physics have not provided strong evidence for discovery, and to consider unexplored possibilities. In this work, we examine the phenomenology of a vector like $Z'$ boson favoring higher-generation fermions (anogenophilic), in particular coupling to third generation fermions (tritogenophilic). This scenario is well motivated and arises in many theories that extend the SM [38-45]. It also seems to appear as a possible, although not yet confirmed, pattern in precision measurements of the B-physics sector [46-63] and the measurement of the muon anomalous magnetic moment [64].

A BDT algorithm is used to optimize the signal to background separation and maximize exclusion or discovery potential. Various coupling scenarios for the $Z'$ have been considered, including suppressed couplings to light flavour quarks ($g_q = 0$), enhanced couplings to third generation fermions, and preferential couplings to top and bottom quarks ($g_{Z't\bar{t}}$). Under the SPM1 $g_q = 1$ ($g_q = 0$) scenario, at $\sqrt{s} = 13$ TeV and integrated luminosity of 150 fb$^{-1}$, $Z'$ masses up to 1.0 TeV (780 GeV) can be excluded at 95% confidence level, while 5σ discovery potential exists for masses below 675 GeV (500 GeV). For the high luminosity era of the LHC with $\sqrt{s} = 14$ TeV and integrated luminosity of 3000 fb$^{-1}$, $Z'$ masses up to 1.70 TeV (1.25 TeV) can be excluded for the SPM1 $g_q = 1$ ($g_q = 0$) scenario, while the 5σ discovery reach is $m(Z') < 1.25$ TeV (900 GeV). For the SPM2 benchmark scenario with $c_t = 1$ and $\theta = \pi/2$, the discovery (exclusion) reach is $m(Z') < 1.5$ TeV for an integrated luminosity of 3000 (300) fb$^{-1}$ at $\sqrt{s} = 14$ TeV (Table V).

As noted previously, the projected sensitivity using the SPM2 scenario serves as a good comparison with other search strategies. For example, the authors of Ref. [75] examined the high luminosity LHC sensitivity to these anogenophilic scenarios using the $pp \to t\bar{t}Z' \to t\bar{t}t\bar{t}$ final state with boosted top tagging algorithms, and reported a projected 2σ reach of approximately $m(Z') < 1.5$ TeV for the same coupling scenario of $c_t = 1$, assuming an integrated luminosity of 3000 fb$^{-1}$. That result is to be compared with the stronger projected significance of > 5.41σ for $m(Z') < 1.5$ TeV in Table V, using the strategy presented in this paper. Additionally, Ref. [75] reports that a > 5σ discovery reach is attainable for $m(Z') = 1.5$ TeV if $c_t > 1.65$. For comparison, Table V already shows a significance of 5.41σ for $m(Z') = 1.5$ TeV with a smaller coupling of $c_t = 1$.

We also point out that these comparisons are conservative since the studies outlined in Ref. [75] assume a 100% branching ratio of $Z' \to t\bar{t}$, which would not be the case if the $Z'$ couples to both top and bottom quarks.
The main result of this paper is that probing heavy neutral gauge bosons produced in association with spectator top quarks, and decaying to a pair of bottom quarks, can be a key search methodology. It represents the most important anogenophilic/tritogenophilic mode for discovery at $m(Z') < 2m_t$ where the $Z' \to t\bar{t}$ decay is kinematically forbidden, and remains competitive with the $Z' \to t\bar{t}$ decay mode at TeV scale masses, benefiting from the possibility to reconstruct the $Z'$ mass from the two highest-$p_T$ b jets and resulting in events with reduced jet multiplicity. Furthermore, even if a $Z'$ boson is discovered in other search channels when $m(Z')$ is large, a $t\bar{t}Z' \to t\bar{t}b\bar{b}$ search remains a key part of the search program at the LHC in order to establish the couplings of the $Z'$ to all fermions. In particular, whereas a $t\bar{t}Z' \to t\bar{t}t\bar{t}$ search can measure the $Z'$ mass and coupling to top quarks, the proposed $t\bar{t}Z' \to t\bar{t}b\bar{b}$ search can additionally measure the $Z'$ coupling to bottom quarks.
The proposed data analysis represents a competitive alternative to complement searches already being conducted at the LHC. Those searches are based on the analysis of the mass distribution of two b-quark jets, in the resolved or boosted regime, using events whose triggers require high-$p_T$ jets [1,4,82], b-quark jets [81], or a photon [80]. In the analysis strategy considered here, we can instead rely on the presence of an electron or muon originating from the decay of a spectator top, which allows an unbiased selection of b-quark jets originating from the $Z'$, or on the possibility to define a trigger using both a light lepton and jets, in order to select particles with lower energy.
For the above reasons, we conclude that the proposed analysis strategy should be considered in future $Z'$ searches at the LHC, by both the ATLAS and the CMS collaborations.

Figure 1: Representative Feynman diagram for the production of a $Z'$ boson through the fusion of a top quark pair, where the $Z'$ decays to a pair of bottom quarks and the two spectator top quarks decay semi-leptonically.

Figures 2 , 3 , 4 ,
and 5, show relevant kinematic distributions for two SPM1 signal points and dominant backgrounds, normalized to the area under the curve (unity).The distributions correspond to the b-jet candidate with the highest p T (b 1 ), the second b-jet candidate with the highest p T (b 2 ), the ∆R separation between the b 1 and b 2 candidates, and the reconstructed mass between the b 1 and b 2 , m(b 1 , b 2 ), respectively.These distributions are among the variables identified by the BDT algorithm with the highest signal to background discrimination power.
Figure 4 shows greater discrimination between background and signal processes as $m(Z')$ becomes larger. Finally, as noted previously, an advantage of the $Z' \to b\bar{b}$ final state in comparison to $Z' \to t\bar{t}$ is the experimental reconstruction of the $Z'$ mass, which is observed as a peak in the $m(b_1, b_2)$ signal distributions in Figure 5 near the true $m(Z')$ value. On the other hand, the background $m(b_1, b_2)$ distributions show a peak near $m(h) = 125$ GeV for the $t\bar{t}h$ background, or a broad distribution for the other backgrounds, indicative of the combination of two b jets from different decay vertices. We note that the $Z' \to b\bar{b}$ decay width depends on $g_{Z't\bar{t}}^2 \times m_b^2 / m(Z')^2$ and is thus suppressed by the relatively small bottom quark mass with respect to the $g_{Z't\bar{t}}$ and $m(Z')$ values considered in these studies. Therefore, the width of the $m(b_1, b_2)$ signal distributions is driven by the experimental resolution in the reconstruction of the b-jet momenta, as well as the probability that the two leading b jets are the correct pair from the $Z'$ decay.

In addition to the variables shown in Figures 2-5, a variety of other kinematic variables were included as inputs to the BDT algorithm. In particular, 47 such variables were used in total, and these included the momenta of b jets and light-quark jets (not tagged as b jets); invariant masses of pairs of b jets and of the two leading light jets; angular differences between b jets, between light-quark jets, and between the lepton and b jets; and transverse masses derived from the lepton-$p_T^{\text{miss}}$ pair and lepton-$p_T^{\text{miss}}$-b triplets. The variables $m(b_i, b_j)$ for $i, j = 1$ provide some additional discrimination between signal and background when the leading b jets are not a $Z'$ decay candidate. The transverse mass variables are designed to be sensitive to a leptonic decay of the W boson and t quark (i.e., $m_{jj}$ and $m_T(\ell, p_T^{\text{miss}})$ should be near $m_W$, and $m_T(\ell, b, p_T^{\text{miss}})$ …
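The transverse-mass inputs mentioned above can be sketched as follows. The exact definitions used in the analysis are not given in this section, so the formulas below are assumptions: the standard two-body transverse mass for a massless lepton, and a common generalisation in which the lepton-plus-b system is treated as a single visible object. The function names are hypothetical.

```python
import math


def mt_lepton_met(lep_pt: float, lep_phi: float, met: float, met_phi: float) -> float:
    """m_T of a massless lepton and the missing transverse momentum:
    m_T^2 = 2 * pT(lep) * pT(miss) * (1 - cos(delta_phi))."""
    mt2 = 2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi))
    return math.sqrt(max(mt2, 0.0))


def mt_visible_met(vis_px: float, vis_py: float, vis_mass: float,
                   met: float, met_phi: float) -> float:
    """m_T of a visible system (e.g. lepton + b jet) and the missing momentum.

    Assumed definition: m_T^2 = (E_T_vis + pT_miss)^2 - |pT_vis + pT_miss|^2,
    with E_T_vis = sqrt(m_vis^2 + pT_vis^2).
    """
    et_vis = math.sqrt(vis_mass ** 2 + vis_px ** 2 + vis_py ** 2)
    met_px = met * math.cos(met_phi)
    met_py = met * math.sin(met_phi)
    mt2 = (et_vis + met) ** 2 - (vis_px + met_px) ** 2 - (vis_py + met_py) ** 2
    return math.sqrt(max(mt2, 0.0))  # guard against tiny negative rounding
```

With these definitions, the lepton-$p_T^{\text{miss}}$ transverse mass has a kinematic endpoint at the W boson mass for events in which the lepton and neutrino come from the same W decay, which is the property the BDT inputs are meant to exploit.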

Figure 2: Transverse momentum distributions for the b-quark jet with the highest transverse momentum, for two signal points with masses of 350 GeV and 1000 GeV and dominant backgrounds.

Figure 3: Transverse momentum distributions for the b-quark jet with the second-highest transverse momentum, for two signal points with masses of 350 GeV and 1000 GeV and dominant backgrounds.

Figure 4: Distributions of the $\Delta R$ angular separation between the highest ($b_1$) and second-highest ($b_2$) transverse momentum b-quark jets, for two signal points with masses of 350 GeV and 1000 GeV and dominant backgrounds.

Figure 5: Invariant mass distributions for the pair formed by the highest ($b_1$) and second-highest ($b_2$) transverse momentum b-quark jets, for two signal points with masses of 350 GeV and 1000 GeV and dominant backgrounds.

Figure 6: Output of the gradient boosting algorithm for a $Z'$ signal with a mass of 350 GeV and $g_q = 0$ coupling, and the dominant backgrounds. The distributions are normalized to unity.

Figure 8: Expected signal significance as a function of reconstructed mass, at $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$ luminosity, for the $g_q = 0, 1$ and $g_{Z't\bar{t}} = 1$ benchmark coupling scenarios. The 1.69σ reference point for exclusion, and the 3σ and 5σ points for discovery sensitivity, are shown as red dashed lines.

Figure 9: Expected signal significance as a function of reconstructed mass, at $\sqrt{s} = 14$ TeV and 300 fb$^{-1}$ (3000 fb$^{-1}$) luminosity, for the $g_q = 0, 1$ and $g_{Z't\bar{t}} = 1$ benchmark coupling scenarios. The 1.69σ reference point for exclusion, and the 3σ and 5σ points for discovery sensitivity, are shown as red dashed lines.

Figure 10: Projected signal significance for the $g_q = 0$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$.

Figure 11: Projected signal significance for the $g_q = 1$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 13$ TeV and 150 fb$^{-1}$.

Figure 12: Projected signal significance for the $g_q = 0$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 14$ TeV and 300 fb$^{-1}$.

Figure 13: Projected signal significance for the $g_q = 1$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 14$ TeV and 300 fb$^{-1}$.

Figure 14: Projected signal significance for the $g_q = 0$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 14$ TeV and 3000 fb$^{-1}$.

Figure 15: Projected signal significance for the $g_q = 1$ benchmark model for different $g_{t\bar{t}}$ coupling scenarios and $Z'$ masses. The estimates are performed at $\sqrt{s} = 14$ TeV and 3000 fb$^{-1}$.

Table I: Signal cross sections, calculated with MadGraph, for different $Z'$ masses and couplings to first- and second-generation quarks. The values in this table are calculated with $g_{Z't\bar{t}} = 1$.

Table II: Cross sections calculated with MadGraph for the dominant background processes.

Table V: Projected signal significance for our second simplified model, considering the $c_t = 1$ coupling scenario with varying $Z'$ masses. The calculations are performed at $\sqrt{s} = 14$ TeV and assuming both 300 fb$^{-1}$ and 3000 fb$^{-1}$.
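To give a rough feel for how a projected significance depends on integrated luminosity, the sketch below converts cross sections into expected yields and evaluates a single-bin, statistics-only Asimov significance. This is an assumption for illustration only: the statistical procedure actually used for the figures and tables is not described in this section and may differ (for example, a binned fit to the BDT output with systematic uncertainties). All function names and any input values are hypothetical.

```python
import math


def expected_yield(xsec_pb: float, lumi_fb: float, efficiency: float) -> float:
    """Expected events N = sigma * L * efficiency (1 pb = 1000 fb)."""
    return xsec_pb * 1000.0 * lumi_fb * efficiency


def asimov_significance(s: float, b: float) -> float:
    """Single-bin, statistics-only Asimov approximation:
    Z = sqrt(2 * ((s + b) * ln(1 + s / b) - s)), for b > 0."""
    if b <= 0.0:
        raise ValueError("background yield must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log1p(s / b) - s))


def scale_significance(z: float, lumi_from: float, lumi_to: float) -> float:
    """With fixed cross sections and efficiencies, and no systematic
    uncertainties, the significance grows as sqrt(L)."""
    return z * math.sqrt(lumi_to / lumi_from)
```

Under these assumptions, moving from 300 fb$^{-1}$ to 3000 fb$^{-1}$ improves a statistics-only significance by a factor of $\sqrt{10} \approx 3.2$; systematic uncertainties would reduce this gain.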