Probing effective field theory operators in the associated production of top quarks with a Z boson in multilepton final states at $\sqrt{s} = $ 13 TeV

A search for new top quark interactions is performed within the framework of an effective field theory using the associated production of either one or two top quarks with a Z boson in multilepton final states. The data sample corresponds to an integrated luminosity of 138 fb$^{-1}$ of proton-proton collisions at $\sqrt{s} =$ 13 TeV collected by the CMS experiment at the LHC. Five dimension-six operators modifying the electroweak interactions of the top quark are considered. Novel machine-learning techniques are used to enhance the sensitivity to effects arising from these operators. Distributions used for the signal extraction are parameterized in terms of Wilson coefficients describing the interaction strengths of the operators. All five Wilson coefficients are simultaneously fit to data and 95% confidence level intervals are computed. All results are consistent with the SM expectations.


Introduction
The standard model (SM) of particle physics is supported by a vast number of measurements from experiments covering a broad energy range up to the TeV scale. However, the SM falls short of explaining several observed phenomena, such as neutrino oscillations [1] and the baryon asymmetry in our universe [2], and cannot accommodate the existence of dark matter and dark energy [3]. More generally, there are many indications that the SM corresponds to a low-energy approximation of a more fundamental theory beyond the standard model (BSM). Several BSM models explain these phenomena by postulating the existence of new particles or mechanisms. However, these hypothetical new particles may be too massive to be directly accessible at the CERN LHC.
Even in the absence of any direct observation of a new particle, new phenomena may manifest themselves indirectly through corrections at the quantum-loop level, thus leading to observable deviations in well-established processes. Such deviations can be interpreted in a coherent and model-independent manner using the approach of effective field theory (EFT) [4,5]. An EFT corresponds to an approximation at low energy of an underlying theory characterized by an energy scale Λ that is well above the typical energies accessible at colliders. New, effective interactions between the SM fields are introduced by extending the SM Lagrangian with higher-order operators. The interaction strength of an operator of dimension d is characterized by a dimensionless Wilson coefficient (WC) and is proportional to 1/Λ d−4 . This factor suppresses the contribution from higher-order operators, implying that effects of new interactions can be approximated with a finite set of WCs associated with lower-order operators. Since operators with d = 5 or 7 violate lepton and/or baryon number conservation [6], we restrict our analysis to dimension-six operators. The effective Lagrangian can then be written as where L SM is the SM Lagrangian, O i are dimension-six operators, and c i are the corresponding WCs that can be constrained from experimental data.
The large top quark mass m t = 172.44 ± 0.49 GeV [7] corresponds to a Yukawa coupling to the Higgs boson close to unity. This suggests that the top quark may play a special role within the SM, and that its precise characterization may shed light on the electroweak symmetry breaking mechanism [8-10]. The high integrated luminosity and center-of-mass energy at the LHC make it possible to study rare top quark processes, such as the associated production of top quarks with a Z boson, where top quarks are either produced in pairs (ttZ), or singly in the tZq and tWZ channels. Representative tree-level Feynman diagrams are shown for the three signal processes in Fig. 1. Whereas the ttZ and tZq processes have comparable inclusive cross sections of about 800 fb and were already observed by both the ATLAS and CMS Collaborations [11][12][13][14], the tWZ process has a much smaller cross section and has not been observed yet. These three processes are of major interest because they probe the coupling of the top quark to the Z boson at tree level. Numerous BSM extensions predict sizable modifications to this coupling [15][16][17], which is among the least constrained by the available data in the top quark sector.
The focus of the present analysis is the study of the electroweak interactions of the top quark, and in particular the ttZ interaction. Consequently, we only consider EFT operators involving third-generation quarks and gauge bosons, which interfere with the SM production of the ttZ or tZq signal. This means that these operators must affect either process-or both-at order 1/Λ 2 . We restrict our study to CP-conserving effects, and thus we ignore the imaginary components of complex WCs. Finally, the O tG operator related to the chromomagnetic dipole moment of the top quark is ignored, since it can be probed with much better sensitivity in tt events [18]. As a result, we focus on a subset of five operators, namely: O tZ , O tW , O 3 ϕQ , O − ϕQ , and O ϕt [19]. The O tZ and O tW operators induce electroweak dipole moments of the top quark, O 3 ϕQ is the left-handed SU(2) triplet current operator, and the O − ϕQ and O ϕt neutral-current operators modify the interactions of the Z boson with left-and right-handed top quarks, respectively. Comprehensive descriptions of their effects on top quark interactions are given in Refs. [20,21].
A key aspect of this study is the use of novel multivariate analysis (MVA) techniques based on machine learning to enhance the sensitivity to new phenomena arising from the EFT operators. Since new operators usually affect the distributions of multiple observables, MVA techniques that exploit correlations in high-dimensional data are well suited for EFT measurements. We train machine-learning algorithms for two purposes. Firstly, a multiclass classifier is trained to distinguish between different SM processes, and is used to define subregions enriched either in signal or background events. Secondly, binary classifiers are trained to separate events generated according to the SM from events generated with nonzero WC values for one or more EFT operators. They are used to construct powerful discriminating observables that are ultimately fit to data to compute two confidence intervals for each WC, one keeping the other WCs fixed to zero and the other treating all five WCs as free parameters. The core ideas for these binary classifiers appeared recently in the literature and since then have garnered increased attention, notably because they were shown to outperform traditional approaches based on single observables in several case studies at the generator level [22][23][24]. This motivates the application of this technique for the first time in an LHC analysis involving the interference between EFT operators and the SM amplitude.
The binary classifiers are trained with simulated samples whose event weights are parameterized as functions of the five WCs of interest. These samples are passed through a full detector simulation, and are used to search for new interactions without making any simplifying assumption regarding the parton shower and detector response. A previous CMS analysis in the top quark sector employed this approach to parameterize new interactions directly at the detector level in the context of an EFT [25]. It used data collected in 2017 and targeted mul-tilepton final states. That analysis set constraints on 16 WCs simultaneously by performing counting experiments in various event categories, which were defined based on the jet and lepton multiplicities, the presence of a Z boson candidate, and the sum of the lepton charges.
The data sample of proton-proton (pp) collisions at √ s = 13 TeV used in this paper was collected by the CMS experiment during Run 2 of the LHC (2016-2018) and corresponds to an integrated luminosity of 138 fb −1 , of which 36.3 [26], 41.5 [27], and 59.7 [28] fb −1 were recorded in 2016, 2017, and 2018, respectively. We target events in which the decays of the top quark and Z boson lead to final states with three or four leptons, either electrons or muons; this also includes a small contribution from leptonic tau lepton decays. Unless stated otherwise and throughout this paper, the term lepton refers exclusively to electrons and muons, generically denoted by . These multilepton final states offer high trigger efficiencies, and a more favorable signal-to-background ratio compared with hadronic channels.
The paper is organized as follows. We provide a brief overview of the CMS detector in Section 2. Section 3 describes the data and simulated event samples, as well as the parameterization of the weights for simulated events. Section 4 details the object and event reconstructions, while the event selection and categorization are presented in Section 5. The estimation of backgrounds is presented in Section 6. The MVA is described in Section 7, the systematic uncertainties affecting the measurements in Section 8, and the signal extraction procedure in Section 9. We discuss the results in Section 10, and conclude with a summary of the paper in Section 11. Tabulated results are provided in HEPData [29].

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
Events of interest are selected using a two-tiered trigger system. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [30]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [31]. A more detailed description of the CMS detector, together with a definition of the coordinate system and of the relevant kinematic variables, can be found in Ref. [32].

Data sample and Monte Carlo simulations
The data sample was recorded using a combination of single-, double-, and triple-lepton trigger algorithms, whose thresholds on the transverse momentum p T with respect to the beam axis vary between data-taking periods depending on the instantaneous luminosity. For instance, the minimum p T threshold of the single-electron (-muon) triggers ranges between 25-35 (22-27) GeV. For events selected in this analysis, the combined trigger efficiency is nearly 100% both in data and simulation. Event samples produced via Monte Carlo (MC) simulation are used to estimate the contributions of signal processes and most background processes, as well as to train machine-learning algorithms. The signal samples are generated at leading order (LO) in perturbative quantum chromodynamics (QCD) and incorporate EFT effects, whereas background samples do not include EFT effects and are generated at next-to-LO (NLO) whenever possible. Additional signal samples generated at NLO without EFT effects are used for validation purposes, and to train a classification algorithm aiming to separate different SM processes, as described in Section 7. The latter samples will be referred to explicitly as "SM signal samples". The mass of the top quark is set to m t = 172.5 GeV in simulation.
The SM signal samples for the tZq and ttZ processes, as well as the samples for several background processes (WZ, ttW, tttt, multiboson production, and Vγ, where V denotes either a W or Z boson), are generated at NLO using MADGRAPH5 aMC@NLO 2.4.2 [33]. The SM tWZ sample and other background samples are generated at LO using MADGRAPH5 aMC@NLO 2.4.2 (tHq, tHW, tt γ, ttVV, ttVH, ttHH) or MCFM 7.0.1 [34,35] (gg → ZZ), or at NLO using POWHEG 2.0 [36][37][38] (qq → ZZ [39], ttH [40]). The SM tZq sample is generated in the four-flavor scheme, in which only up, down, strange, and charm quarks are considered as sea quarks of the proton, whereas the SM tWZ sample is generated in the five-flavor scheme, in which bottom quarks are considered as sea quarks of the proton as well and may appear in the initial state of pp scattering processes [41].
The NNPDF3.1 [42] set of parton distribution functions (PDFs) is used to simulate signal samples for all three years, and background samples for 2017-2018, and the NNPDF3.0 [43] set of PDFs is used to generate background samples for 2016. The parton showering, hadronization, and underlying event are modeled using PYTHIA 8.2 [44] with the tune CP5 [45] for the 2017-2018 samples, as well as for the 2016 samples for the signal, ttW, ttH, tt γ, Zγ, and tttt processes, whereas the tunes CUETP8M1 or CUETP8M2T4 [46,47] are used to simulate other background samples for 2016. The matching of matrix elements (MEs) to the parton shower (PS) is done using the FxFx [48] merging scheme for NLO samples, and the MLM scheme [49] for LO samples.
The presence of simultaneous pp collisions in the same or nearby bunch crossings, referred to as pileup, is modeled by superimposing inelastic pp interactions simulated using PYTHIA 8.2 on all generated events. Generated events are passed through a detailed simulation of the CMS detector based on GEANT4 [50], and are reconstructed with the same version of the CMS event reconstruction software used for data.
Residual differences between data and simulation are corrected by modifying the weights of generated events, or by varying relevant simulated quantities. Such differences are observed in: the pileup distribution; the reconstruction and identification efficiencies for electrons and muons; the jet energy scale and resolution; the efficiency to identify jets originating from the hadronization of bottom quarks and the corresponding misidentification rates for light (u, d, c, s)-quark and gluon jets; and the resolution in missing transverse momentum.

Simulation of the signal samples
The signal samples including EFT effects are generated at LO using MADGRAPH5 aMC@NLO 2.6.5 and the NNPDF3.1 PDF set. The decays of top quarks and W bosons are simulated with the MADSPIN program [51].
We generate the signal events following a similar approach to that outlined in Ref. [19]. The EFT model used in the present analysis focuses on dimension-six operators that give rise to interactions involving at least one top quark. The degrees of freedom of this model are defined as linear combinations of Warsaw-basis operator coefficients [52], and the mapping between both bases is given in Table 1. Since this model only allows for tree-level generation, the ttZ sample is generated with an extra parton in the final state to improve its accuracy. The MLM merging scheme is used to match the MEs to the PS. However, the tZq sample does not include an extra final-state parton because this matching procedure cannot be performed correctly for single top quark processes in the t channel (due to the presence of a final-state light quark in the MEs), and neither does the tWZ sample to avoid overlap with ttZ production. We consider Feynman diagrams including at most one EFT vertex, in which the top quark is produced. Table 1: List of dimension-six EFT operators considered in this analysis and their corresponding WCs. The linear combinations of WCs to which they correspond in the Warsaw basis are indicated. The abbreviations s W and c W denote the sine and cosine of the weak mixing angle, respectively. The definitions of the relevant Warsaw-basis operators can be found in Ref. [19].
Operator WC Mapping to Warsaw-basis coefficients

Parameterization of the event weights
To interpret potential deviations in signal production in terms of new interactions, we parameterize the weights of the simulated signal events with the five WCs of interest. Within an EFT framework, a given ME may be decomposed into its SM and EFT components as where M SM is the SM ME, M i are the MEs associated with the new operators, and c i are WCs. A production cross section, either inclusive or differential, is proportional to the square of the total ME and can thus be written as a polynomial of second order in the WCs. The weight for a given event can then be parameterized with a five-dimensional (5D) quadratic function of the WCs as w c where the sums run over the five WCs, c is the set of WC values, and the s 0 , s 1 , s 2 , s 3 coefficients are associated with: the SM amplitude; the interference between SM and EFT amplitudes; EFT amplitudes; and the interference among EFT amplitudes, respectively. Although individual coefficients may be negative or null, the sum of all components always yields a physical distribution.
Following the procedure adopted in Ref. [25], this analysis makes extensive use of the possibility offered by the MADGRAPH5 aMC@NLO event generator to assign multiple weights to an event, which represent the infinitesimal contributions from this event to the total cross section at different points in the EFT phase space. Each simulated event is associated with weights corresponding to different WC values, which are used to build an overdetermined system of equations to solve for all the per-event coefficients-21 in total-of Eq. (3). By summing the quadratic functions of all simulated events for a given signal process, one can then evaluate its differential cross section as a function of any quantity, at any point in the 5D EFT phase space. The parameterized event yields of the signal samples are normalized such that they equal their SM NLO theoretical predictions [53,54] when all WCs are set to zero.

Event reconstruction
The particle-flow algorithm [55] combines information from all subdetectors to reconstruct individual particles in an event, and to identify them either as electrons, muons, photons, charged hadrons, or neutral hadrons.
The candidate vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects in this context are the jets, clustered using the jet-finding algorithm [56,57] with the tracks assigned to candidate vertices as inputs, and the negative vector p T sum of those jets. Reconstructed lepton candidates are required to have track parameters compatible with the primary vertex as origin.
Electron candidates are reconstructed within the range |η| < 2.5 by combining the energy measurement in the ECAL with the momentum measurement in the tracker [58]. They are required to satisfy p T > 7 GeV, and their identification relies upon an MVA algorithm trained with observables related to the ECAL and tracker measurements. Electrons originating from photon conversion are efficiently removed by requiring that candidate tracks have at most one missing hit in the innermost tracker layers.
Muon candidates are reconstructed within the range |η| < 2.4 as tracks in the tracker consistent with either a track or several hits in the muon system, and associated with calorimeter deposits compatible with the muon hypothesis [59]. They are required to satisfy p T > 5 GeV, as well as a set of quality criteria designed to reject hadrons punching through the calorimeters and muons produced by in-flight decays of kaons and pions.
Electron and muon candidates satisfying the aforementioned selection criteria are referred to as "loose leptons". Additional selection criteria are applied to select genuine "prompt" leptons produced in decays of W and Z bosons and leptonic τ decays, while rejecting "nonprompt" leptons (NPLs) mainly originating from b hadron decays, hadron misidentification, and the conversion of photons not produced in the hard scattering interaction. Background events containing at least one NPL, arising mostly from tt+jets and Z/γ production, will be referred to as "NPL background" throughout this paper. The rejection of NPLs is significantly improved by the use of MVA discriminants based on boosted decision trees [60]. They take as input several observables related to the lepton and to the jet activity in its vicinity. Electron and muon candidates satisfying a selection on the MVA discriminants are referred to as "tight leptons".
Jet candidates are reconstructed offline from energy deposits in the calorimeter towers, and clustered using the anti-k T algorithm [56,57] with a distance parameter of 0.4. They are required to satisfy p T > 25 GeV and |η| < 5, and must not overlap with any loose lepton within ∆R = √ (∆η) 2 + (∆φ) 2 < 0.4, where φ is the azimuthal angle. The contribution from pileup to jet momentum is mitigated by excluding charged hadrons associated with pileup vertices from the clustering. The energy of reconstructed jets is corrected for residual pileup effects, and calibrated as a function of jet p T and η [61,62]. We apply the more stringent requirement p T > 60 GeV to jets reconstructed within the range 2.7 < |η| < 3 to suppress calorimeter noise.
Jets passing these selection criteria are categorized into central and forward jets, the former satisfying the condition |η| < 2.4, and the latter 2.4 < |η| < 5. The presence of a high-p T forward jet in the event, referred to as a recoiling jet, is a characteristic feature of tZq production that is used in the MVA to isolate the contribution from this process. The phase space extension due to the inclusion of forward jets increases the acceptance of the tZq signal by about 25% in the trileptonic signal region defined in Section 5.
Jets arising from bottom quarks are identified (b tagged) with the DEEPJET deep neural network algorithm [63,64], within the ranges |η| < 2.4 in 2016 and |η| < 2.5 in 2017-2018 because of the Phase-1 upgrade of the CMS pixel detector [65]. Jets passing the medium working point of the algorithm are referred to as "b jets". For central jets with p T > 30 GeV, this corresponds to a selection efficiency of about 75% for jets arising from bottom quarks, and to a misidentification rate for light-quark and gluon jets of about 1% [66].
The missing transverse momentum vector p miss T is computed as the negative vector p T sum of all the particle-flow candidates in an event, and its magnitude is denoted by p miss T [67]. The p miss T is modified to account for corrections to the energy scale of the reconstructed jets in the event.
After the final-state particles have been identified and selected, we combine their information to reconstruct unstable particles that are expected in the topologies of the signal processes (see Fig. 1). These higher-level objects are used for event categorization and to construct powerful observables provided as input to the MVA algorithms presented in Section 7.
The Z boson candidates are reconstructed from pairs of opposite-sign same-flavor leptons having invariant masses within 15 GeV of the true mass m Z of the Z boson. While this analysis considers events with either three or four leptons, we do not reconstruct tetraleptonic events further since their kinematic information is not exploited in the signal extraction procedure described in Section 9. In trileptonic events with multiple Z boson candidates, only the candidate whose mass is closest to m Z is considered. The vector p miss T is associated with the undetected neutrino arising from the leptonic top quark decay. The longitudinal neutrino momentum is obtained by applying the W boson mass constraint and solving an analytical equation. The leptonically decaying top quark is then reconstructed using the four-momentum of the neutrino, the remaining selected lepton, and one b jet. In case two neutrino solutions or multiple b jets are reconstructed, the combination resulting in a top quark mass closest to m t is chosen. The W boson transverse mass m W T is reconstructed from the W → ν decay products and constitutes a powerful observable to discriminate background processes such as ZZ, Z/γ , and Zγ. To improve the selection of the tZq signal, we identify the remaining jet with the largest p T in the event as the recoiling jet, with a veto on b jets. In the rare case that an event contains only b jets, this veto is removed. The lepton asymmetry q |η( )| is defined as the product of the charge and absolute pseudorapidity of the lepton originating from the top quark decay, and provides additional discrimination power for the tZq signal.

Event selection and categorization
This analysis targets the associated production of top quarks with a Z boson, in events where the Z boson and at least one top quark decay leptonically. The corresponding experimental signatures are characterized by the presence of: three or four prompt leptons; high p miss T due to the neutrino(s) from W boson decay(s); at least one b jet from top quark decay; and possibly additional light-quark jets, either produced in the decays of top quarks or W bosons, or recoiling against the single top quark.
The signal region (SR) targeting trileptonic signal events drives the sensitivity of the analysis, and is denoted by SR-3 . An additional SR targeting tetraleptonic ttZ events is included, which is denoted by SR-ttZ-4 . Although this region contains a much smaller amount of data compared with the SR-3 , it is pure in ttZ events and thus provides additional sensitivity to operators impacting the cross section of this signal.
In the SR-3 (SR-ttZ-4 ) we require the presence of three (four) tight leptons with p T > 25, 15, 10 (and 10) GeV. To reject lepton pairs originating from low-mass hadron decays, which are not well modeled by the MC simulation, events containing a pair of loose leptons having an invariant mass below 12 GeV are discarded. Since this analysis focuses on the ttZ interaction, we require the presence of exactly one Z boson candidate in events entering the SRs. They are also required to contain at least two jets, and at least one b jet. Additionally, we require the sum of the lepton charges to be zero in events entering the SR-ttZ-4 . A multiclass classifier is then exploited to divide the SR-3 into process-specific subregions, as explained in Section 7. Figure 2 shows data-to-simulation comparisons in the SR-3 for several observables that are most relevant for this classification, before performing a fit to data as described in Section 9 (pre-fit). Unc. Data tZq tWZ  : Pre-fit data-to-simulation comparisons for several observables in the SR-3 . From left to right and upper to lower, the distributions correspond to: the relative azimuthal angle ∆φ between the two leptons from the Z boson decay; the maximum DEEPJET discriminant among all selected jets; the absolute pseudorapidity of the recoiling jet; the b jet multiplicity; the lepton asymmetry; and p miss T . The lower panels display the ratios of the observed event yields to their predicted values. The NPL background is modeled with the procedure based on control samples in data described in Section 6. The hatched band represents the total uncertainty in the prediction. Underflows and overflows are included in the first and last bins, respectively.
The main irreducible background contribution in the SR-3 arises from the production of two vector bosons, predominantly via the WZ → 3 process. The ZZ → 4 process also contributes, in cases where one lepton fails the lepton identification or is out of detector acceptance. These events contain on average fewer jets (b tagged or not) compared with signal processes, and are efficiently suppressed by the requirements on the jet and b jet multiplicities. Background processes with three vector bosons (WWW, WWZ, WZZ, and ZZZ) are considered as well and represent a minor contribution; together with the ZZ process, they are denoted by VV(V). Other processes with top quarks contributing as irreducible backgrounds are ttH, ttW, tHW, tHq, ttVV, ttVH, ttHH, and tttt; they are denoted by t(t)X. The contribution from processes including a photon produced in the hard scattering interaction is denoted by Xγ, and is dominated by Zγ and tt γ events yielding two prompt leptons plus a photon that undergoes an asymmetric conversion. The NPL background is modeled with the procedure based on control samples in data described in Section 6.
Two control regions (CRs) are enriched in the main background processes. They are included in the signal extraction procedure to constrain the uncertainties in the cross sections of these backgrounds from the data and are defined as follows: • A CR enriched in WZ events ("WZ CR") is defined similarly to the SR-3 , except that events containing one or more b jet(s) are rejected and no requirement on the minimal jet multiplicity is applied. The purity in WZ events is increased by requiring p miss T > 50 GeV, and the invariant mass of the trileptonic system m 3 to be larger than m Z + 15 GeV. • A CR enriched in ZZ events ("ZZ CR") is defined by requiring the presence of exactly four tight leptons that are compatible with two Z boson candidates. No requirements on the jet and b jet multiplicities are applied.
All regions are orthogonal by construction, and the leptons selected in the CRs must satisfy the same p T thresholds as used in the SRs. A summary of the main selection requirements applied in each region is provided in Table 2.

Background estimation
Irreducible backgrounds containing genuine prompt leptons are reliably estimated from MC simulations and are constrained in dedicated CRs. The Xγ background is strongly suppressed by the lepton identification criteria and its remaining contribution is estimated from MC simulations. However, the NPL background is known to be much more challenging to model. Thus, its contribution is estimated using the data-based misidentification probability method [60]. This method relies on the selection of samples of events satisfying the same selection criteria as defined in Section 5 for the different regions, except that the lepton identification requirements are relaxed. We refer to these samples as the application regions (ARs) of the misidentification probability method.
Estimates of the NPL background contribution to the different regions are obtained by applying suitably-chosen weights to the corresponding AR events. These weights quantify the probability for NPLs passing the relaxed identification requirements to be misidentified as tight leptons. They are measured in a data sample enriched in events composed uniquely of jets produced via the strong interaction, referred to as multijet events, in which an NPL originating from one of the jets is reconstructed. These data are collected with single muon (electron) triggers, some of which require the presence of an additional jet with p T > 40 (30) GeV. Events are selected if they contain exactly one lepton passing the relaxed identification criteria, plus at least one jet separated by ∆R > 0.7 from the lepton. The weights are measured separately for electrons and muons, and parameterized with the p T and η of the lepton candidate. The selected events are divided into "pass" and "fail" samples, depending on whether the lepton passes or fails the tight identification criteria, respectively.
The contribution from multijet events dominates the fail sample and is estimated directly from the data, after subtracting the small contamination from prompt lepton events (mostly from W+jets, Z/γ , diboson, and tt+jets production) based on MC simulations. It is then fit to the data in the pass sample, along with the contribution from prompt lepton events as estimated from the simulations, to determine the number of multijet events in the pass sample. The misidentification probability in a given category is computed as w = N pass /(N pass + N fail ), where N pass and N fail are the number of multijet events in the pass and fail samples, respectively. To prevent potential double counting in the NPL background estimation, all tight leptons selected in signal and control regions are matched to their generator-level equivalents in the simulation, and simulated events containing at least one lepton qualifying as nonprompt are discarded. Further details on the procedure for the estimation of the NPL background can be found in Ref. [60].

Multivariate analysis
This analysis makes extensive use of MVA techniques based on neural networks (NNs) to enhance the sensitivity to new phenomena arising from the EFT operators of interest. Firstly, a multiclass classifier is trained to separate the contributions from the main SM processes in the SR-3 . It is used to define three subregions enriched in the tZq, ttZ, and background processes. Secondly, binary classifiers are trained to separate events generated according to the SM from events generated with nonzero WC values. The responses of these NNs represent optimal observables that are used for the signal extraction in the SR-3 subregions.
All trainings are performed with MC simulations in the SR-3 , using the TENSORFLOW [68] package with the KERAS interface [69] and ADAM optimizer [70]. The training phase aims to reduce the cross-entropy loss function [71], and the NN weights are updated using batch gradient descent. Potential overtraining is mitigated using dropout [72], L2 regularization [73], and early stopping in case the minimized function has reached a stable minimum. The number of equidistant bins N bins and the range of each NN output distribution are adjusted such that the total expected event yield is above one in all bins. The bins are numbered from 1 to N bins .

Discrimination between SM processes
The SR-3 selection criteria are designed to retain a large proportion of signal events while rejecting most background events. However, basic selection criteria cannot isolate efficiently the rare signals from the overwhelming background processes that yield similar final states. Moreover, each signal process probes the WCs in unique ways, and the shapes of their kinematic distributions may be impacted differently in the presence of new interactions. To take advantage of this complementarity, it is thus desirable to separate the signal processes from one another.
To this end, we train a multiclass NN classifier, denoted hereafter by "NN-SM". It is tasked with isolating the tZq and ttZ signals from the major WZ, t(t)X, and VV(V) backgrounds. The tWZ signal is not targeted explicitly because it has a comparatively small event yield and is kinematically close to the ttZ process. This classifier is trained using the SM signal samples, which represent our best available SM predictions at NLO accuracy for these processes.
The NN-SM includes three hidden layers, each with 100 rectified linear units [74], and has three output nodes labeled "tZq", "ttZ", and "Others". The response of the NN-SM is used to divide the SR-3 into three subregions, each enriched in a particular class of events, which we label accordingly "SR-tZq", "SR-ttZ", and "SR-Others". The softmax activation function is used for all output nodes. Their activation values may be interpreted as the probability for a given event to be either tZq signal, ttZ signal, or background. Since the softmax operation normalizes the output vector to unit magnitude by construction, we assign a given event entering the SR-3 to either of the orthogonal subregions based on the output node that is most activated by this event. The set of 33 variables used as input to the NN-SM comprises 12 "high-level" variables that are designed to improve the separation between the different process classes. The threemomenta (p T , η, φ) of the three leading leptons and up to three leading jets in the events, as well as the b tagging discriminants for these jets, are also included and improve the classification performance. All the input variables are listed in Table 3. Their distributions and correlations with other variables are verified to be properly modeled by the simulations.
The pre-fit data-to-simulation comparisons for the distributions of the three output nodes of the NN-SM are shown in Fig. 3. For each distribution, only the events that have their maximum value in the corresponding output node are included.

Discrimination between the SM and EFT scenarios
In a second step, we leverage the EFT parameterization of the signal event weights to train NNs tasked with separating events generated either according to the SM, or according to EFT Table 3: Input variables to the NN-SM and to the eight NN-EFTs. A dash indicates that the variable is not used. The three-momentum of an object includes the p T , η, and φ components of its momentum. The symbol t denotes the lepton produced in the decay of the top quark; j denotes the recoiling jet; b denotes the b jet associated with the leptonic top quark decay; ( Z 1 , Z 2 ) denote the leptons produced in the Z boson decay; cos θ Z is the cosine of the angle between the direction of the Z boson in the detector reference frame, and the direction of the negatively-charged lepton from the Z boson decay in the rest frame of the Z boson. Other observables are defined in Section 4. scenarios in which at least one WC is nonzero. These classifiers are denoted by "NN-EFTs" hereafter. They are used to design observables with optimal sensitivity to new effects arising from the targeted operators. These techniques are based on ideas developed in the context of likelihood-free inference techniques, which are described extensively in Ref. [22].
Classification algorithms were already used in previous analyses adopting an EFT framework. For instance, Ref. [75] takes advantage of a multiclass classifier to separate SM events from events corresponding to the pure EFT contribution to the targeted signal process, as well as to distinguish among different classes of operators. However, the interference between EFT oper-ators and the SM amplitude-and among EFT amplitudes-were always either absent, or were voluntarily neglected in the design of these algorithms. This is because the shapes of the kinematic distributions due to pure EFT contributions are independent of the WCs, which makes it possible to train an algorithm on simulated samples whose kinematic properties are unambiguously defined. On the contrary, interference terms introduce a dependence of the shapes of the kinematic distributions on the WCs, which cannot be dealt with efficiently using the most commonly used, problem-specific algorithms. This is the first time that an LHC analysis uses machine-learning techniques that account for interference in the training phase to target EFT effects.
The NN-EFTs are binary NN classifiers trained to discriminate between the SM and EFT scenarios. We segment this challenging task by targeting individual WCs with separate trainings, and by targeting the tZq and ttZ signals separately, since both processes lead to significantly different event topologies. As anticipated, we find that for WC values close to the current exclusion limits, the O − ϕQ and O ϕt operators have negligible impacts on the shapes of the kinematic distributions of the tZq and ttZ processes [20]. We therefore do not train dedicated classifiers for these cases, but we rely on the sensitivity of this analysis to the signal cross sections to constrain these WCs. Additional classifiers, referred to explicitly as "NN-5D", are trained with events sampled over the phase space spanned by c tZ , c tW , and c 3 ϕQ , and will be used to constrain multiple WCs simultaneously. Consequently, eight NN-EFTs are trained and labeled according to the WC(s) and signal process that they target: NN-c tZ -tZq, NN-c tZ -ttZ, NN-c tW -tZq, NNc tW -ttZ, NN-c 3 ϕQ -tZq, NN-c 3 ϕQ -ttZ, NN-5D-tZq, and NN-5D-ttZ. The output distributions of the NNs targeting either the tZq or ttZ process are ultimately used to extract the signal in the SR-tZq or SR-ttZ subregions, respectively, as described in Section 9. Separate sets of statistically independent samples are used to perform the trainings and the signal extraction.
Each NN-EFT classifier is trained on a range of distinct EFT scenarios. The corresponding subsets of training events are sampled uniformly over the range [−5, 5] for c tZ and c tW , and [−10, 10] for c 3 ϕQ . These ranges for the training hypotheses are chosen to be broader than the existing direct constraints on the corresponding WCs. This avoids biasing the trainings on preexisting results and may help the classifiers learn the characteristic features associated with new interactions, since these features are more prominent at larger WC values. The exact definition for these ranges has only a minor impact on the classification performance. We rely upon the capability of these NNs to learn from training events corresponding to various hypotheses, and to interpolate between these events to construct an abstract model optimized to separate reconstructed events that are either SM-or EFT-like.
The NN-EFTs include three hidden layers, each with 50 rectified linear units. The sigmoid activation function is used for the single output node. The sets of input variables are defined independently for each classifier and are listed in Table 3. The choice of input variables is informed by generator-level studies of their expected sensitivities, as well as by their correlations, importance rankings, and modeling. Multiple tests were performed using different architectures, activation functions, optimizers, and learning and dropout rates, and the configuration performing best on a set of simulated events not used during the training phase was retained.

Systematic uncertainties
Many different sources of systematic uncertainty may alter the event yields, the shapes of the observables used in the final fits to data, or both. They are treated as correlated between the different regions and data-taking periods, unless stated otherwise. A summary of the sources of systematic uncertainty included in the measurements is provided in Table 4.

Experimental uncertainties
• Integrated luminosity The integrated luminosities of the 2016, 2017, and 2018 datataking periods are individually known with uncertainties in the 1.2-2.5% range [26][27][28]. The total Run 2 integrated luminosity has an uncertainty of 1.6%. • Trigger efficiency The combination of single-, double-, and triple-lepton trigger algorithms used in this analysis achieves a trigger efficiency of nearly 100% for events passing the SR or CR selection criteria, both in data and simulation, and therefore no correction is applied. However, a 2% uncertainty is assigned to the event yields of the MC samples independently for each year to account for the statistical uncertainty in the efficiency measurement in data. • Pileup Event weights are corrected to account for differences in the pileup distributions between data and MC samples, before any event selection is applied. To estimate the corresponding uncertainty, the pp inelastic cross section is varied by ±4.6% [76] and the variations are propagated to the event weights. • Lepton identification and isolation efficiencies Event weights are corrected to account for differences in the lepton identification and isolation efficiencies between data and simulation, as a function of the lepton p T and η. These corrections are varied independently for electrons and muons, within the uncertainties in the corresponding measurements. • b jet identification efficiency Event weights are corrected to make the simulated distribution of the DEEPJET discriminant match that measured in the data. These corrections are parameterized with the jet p T and η, and the corresponding uncertainties are measured in tt+jets and multijet events [63]. Several sources of uncertainty are considered, which are related to the contaminations and statistical fluctuations in the measurement regions of the corrections. Uncertainties that are statistical in nature are uncorrelated between the years. • Jet energy and missing transverse momentum The jet energy scale is measured with an uncertainty of a few percent, depending on the jet p T and η, in Z/γ → ee, Z/γ → µµ, γ+jets, dijet, and multijet events [62]. The resulting effect on signal and background distributions is evaluated by varying the energies of jets in simulated events within their uncertainties, recalculating all kinematic observables, and reapplying the event selection criteria. This effect is split between several subsources of uncertainty that are either correlated or uncorrelated between the years. The uncertainties in the jet energy resolution and in the vector p miss T are evaluated in a similar way, and have smaller effects than the uncertainty in the jet energy scale. These uncertainties are treated as uncorrelated between the years. • L1 ECAL trigger inefficiency A gradual shift in the timing of the inputs of the ECAL L1 trigger in the forward region (|η| > 2.4) constitutes an additional source of inefficiency in the 2016 and 2017 data-taking periods [30]. Dedicated event weights are applied to simulated events to account for this effect and are varied within their uncertainties.

Theoretical uncertainties
Several sources of theoretical uncertainty related to the modeling of the signal samples are included. Except for the uncertainties in the SM cross sections, these uncertainties impact both the event yields through changes in signal acceptance, and the shapes of the observables used for the signal extraction.
• PDF and QCD coupling The uncertainty in the PDF prediction is estimated by reweighting signal events according to 100 eigenvectors of the NNPDF3.1 PDF set. The total uncertainty is measured following the PDF4LHC recommendations [77]. The value of the QCD coupling α S is also varied independently within its uncertainty. • Renormalization and factorization scales The effect of missing higher-order corrections on the distributions of the discriminating observables is estimated by varying the renormalization and factorization scales (µ R and µ F ) up and down by a factor of two with respect to their nominal values, avoiding cases in which the two variations are done in opposite directions. • SM cross sections We assign uncertainties to the SM cross sections of the signal processes based on theoretical predictions computed at NLO accuracy. We sum in quadrature the uncertainties in the PDF, the renormalization scale, and the factorization scale. This results in a normalization-only uncertainty of +10.0 −11.6 % for the ttZ process [53], and of ±3.3% for the tZq process [54]. As there is yet no experimental evidence for the existence of the tWZ process, we assign a more conservative uncertainty of 20% to its cross section.
• Parton shower The uncertainty in the PS simulation is estimated by varying the renormalization scale up and down by a factor of two with respect to its nominal value, independently for initial-and final-state radiation (ISR and FSR). The effect of the ISR uncertainty is uncorrelated between the signal processes. • Additional radiation Comparisons between the SM tZq sample (generated at NLO accuracy) and the tZq sample used for the signal extraction (generated at LO without an extra final-state parton) show good agreement, except for a discrepancy in the jet multiplicity distribution that is not covered by other uncertainties. We include this effect by adding a systematic uncertainty based on these differences in bins of jet multiplicity, which only affects the shapes of observables for the tZq signal. ]. An additional extrapolation uncertainty of 6% is applied to WZ events that enter any of the signal regions, to account for differences in the multiplicity of heavy-flavor jets between the WZ CR and SRs, as estimated from simulations. • Misidentified-lepton probability estimation The misidentification probability measurements described in Section 6 are affected by several sources of uncertainty related to the limited amount of data in the measurement region, the subtraction of the prompt lepton contamination, and potential differences in kinematic properties of NPL candidates between the measurement and application regions. The weights obtained with this procedure are varied independently for electrons and muons within each uncertainty. Since these uncertainties were estimated in the context of Ref.

Background uncertainties
[60], we apply an additional 30% uncertainty to the event yield of the NPL background to account for differences in the definitions of the application regions of this analysis.

Signal extraction
The statistical analysis of the data relies upon the construction of a binned likelihood function L(data|θ, ν) that incorporates all the necessary information regarding the different model parameters. The WCs constitute the parameters of interest, denoted by θ, and all the systematic uncertainties defined in Section 8 are treated as nuisance parameters, denoted by ν. The parameterization of the signal event weights with the WCs makes it possible to vary the event yields in all bins, to describe any point in the parameter space spanned by the WCs. Statistical fluctuations due to the finite number of simulated events are incorporated into the likelihood function via the approach described in Refs. [84,85]. The compatibility of given values of the model parameters with the data is quantified by performing a maximum likelihood fit in which the negative log-likelihood (NLL) function −2 ln(L) is minimized and the nuisance parameters are profiled, simultaneously in the six regions (four SRs and two CRs) for the three years.
We perform a one-dimensional (1D) likelihood scan for each WC by repeating the maximum likelihood fit in steps of that WC, while the four other WCs are fixed to their SM values of zero. Any point in the scan is described by the difference −2∆ ln(L) between its NLL value and that of the global minimum. The boundaries of the 68 and 95% confidence level (CL) confidence intervals are defined as the intersections of the NLL function with values of 1 and 3.84, respectively. Five individual 1D likelihood scans are thus required to construct confidence intervals for all five WCs. In addition, we perform two-dimensional (2D) likelihood scans in which two WCs are scanned simultaneously, while the three remaining WCs are fixed to zero; the boundaries of the 68 and 95% CL confidence intervals are defined as the intersections of the NLL function with values of 2.30 and 5.99, respectively. The corresponding plots shown in Section 10 illustrate the correlations between pairs of WCs. Finally, we also perform a single 5D maximum likelihood fit in which all five WCs are treated as parameters of interest, and we compute the corresponding 95% CL confidence intervals.
Different observables are used to extract the signal in the SR-tZq and SR-ttZ, depending on the number of parameters of interest (one, two, or five) and on the WCs that are probed, while the observables used in all other regions are the same for all fits. In the SR-Others we use the m W T distribution since it provides good separation between the main background processes in this region. We perform simple counting experiments in the SR-ttZ-4 , WZ CR, and ZZ CR regions, as they are already very pure in their target processes and do not require any further discrimination. The distributions of the different NN classifiers are used to extract the signal in the SR-tZq and SR-ttZ subregions, which drive almost entirely the sensitivity of this analysis. For the 1D fits to the c tZ , c tW , and c 3 ϕQ coefficients, we use the distributions of the dedicated NN-EFT classifiers. For the 1D fits to the c − ϕQ and c ϕt coefficients, we use the distributions of the tZq and ttZ output nodes of the NN-SM classifier. The good discrimination power of the latter observables between the relevant SM processes improves the sensitivity to the signal event yields. The 2D and 5D fits are performed to the distributions of the NN-5D classifiers, which are trained to learn the features associated with several EFT operators simultaneously. The observables used in the different fits are listed for each region in Table 5. Table 5: Observables used in each region for the different fits. The NN-SM is trained to separate different SM processes, while the other NNs are trained to identify new effects arising from one or more EFT operators, as described in Section 7.

Results
The distributions that are common to all fits are shown in Fig. 4, after performing the fit (postfit) in which all five WCs are treated as free parameters. The post-fit distributions of the observables used in the SR-tZq and SR-ttZ for the 5D fit and for the 1D fits to the c tZ , c tW , and c 3 ϕQ coefficients, are shown in Figs. 5-6. The post-fit distributions of the response of the NN-SM in the SR-tZq and SR-ttZ, which are used for the 1D fits to the c − ϕQ and c ϕt coefficients, are not shown because they do not exhibit any shape discrimination for the considered EFT scenarios (as explained in Section 7.2), and because they are very similar to the corresponding pre-fit distributions shown in Fig. 3. We observe good agreement between data and simulation in all distributions, and illustrate the separation powers of the different NN-EFT classifiers for benchmark EFT scenarios. The expected and observed 95% CL confidence intervals obtained from the 1D and 5D fits are provided in Table 6. All intervals include the SM expectations of zero for the WC values. The optimal combination of the available kinematic information by the NN-EFT classifiers significantly reduces the widths of the confidence intervals, as estimated by repeating the fits when performing simple counting experiments in all regions. The widths of the expected 1D confidence intervals for c tW , c tZ , and c 3 ϕQ are reduced by about 55, 40, and 20%, respectively. Those for c − ϕQ and c ϕt change only marginally, as anticipated since these WCs mainly affect the signal cross sections. The widths of the expected 5D confidence intervals for c tW , c 3 ϕQ , c tZ , c − ϕQ , and c ϕt are reduced by about 70, 70, 60, 35, and 20%, respectively. The corresponding observed intervals exhibit similar or slightly larger improvements. This indicates that leveraging kinematic information becomes all the more important when aiming to constrain multiple WCs simultaneously, as it helps with disentangling effects from different EFT operators that have an interplay.
The impacts from different groups of sources of systematic uncertainty on each individual WC are listed in Table 7. We find that the uncertainties in the c tZ and c tW coefficients are primarily due to the limited number of data events in the most sensitive bins of the SR-3 distributions. Conversely, the measurements of the c 3 ϕQ , c − ϕQ , and c ϕt coefficients are limited by systematic uncertainties, with those in the signal and background cross sections being dominant. As anticipated, since we mostly rely on our sensitivity to the ttZ event yield to constrain the c − ϕQ and c ϕt coefficients, the corresponding confidence intervals are significantly impacted by the uncertainty assigned to the SM cross section of the ttZ process.
One-dimensional likelihood scans are shown for all WCs in Fig. 7. The 1D likelihood scan of c ϕt exhibits a double-minima structure; while the data favor the negative minimum, they are  compatible with the SM at 95% CL. It was verified with MC pseudo-experiments generated under the SM hypothesis that the negative minimum is favored about 20% of the time due to statistical fluctuations. Two-dimensional likelihood scans are shown in Fig. 8 for the pairs of WCs that are most correlated. Table 6: Expected and observed 95% CL confidence intervals for all WCs. The intervals in the first and second columns are obtained by scanning over a single WC, while fixing the other WCs to their SM values of zero. The intervals in the third and fourth columns are obtained by performing a 5D fit in which all five WCs are treated as free parameters. As explained in Section 9, the 1D intervals are obtained from separate fits to different observables in the SR-tZq and SR-ttZ, while the 5D intervals are obtained from a single fit.

Summary
A search for new top quark interactions has been performed within the framework of an effective field theory (EFT) using the associated production of either one or two top quarks with a Z boson in multilepton final states. The data sample corresponds to an integrated luminosity of 138 fb −1 of proton-proton collisions at √ s = 13 TeV collected by the CMS experiment. Five dimension-six operators modifying the electroweak interactions of the top quark were considered. The event yields and kinematic properties of the signal processes were parameterized with Wilson coefficients (WCs) describing the interaction strengths of these operators.
A multivariate analysis relying upon machine-learning techniques was designed to enhance the sensitivity to effects arising from the EFT operators. A multiclass neural network was trained to distinguish between standard model (SM) processes and was used to define three subregions enriched in tZq, ttZ, and background events. Additional neural networks were trained to separate events generated according to the SM from events generated with nonzero WC values, and were used to construct optimal observables. This is the first time that machinelearning techniques accounting for the interference between EFT operators and the SM amplitude have been used in an LHC analysis.
Results were extracted from a simultaneous fit to data in six event categories. Two confidence intervals were determined for each WC, one keeping the other WCs fixed to zero and the other treating all five WCs as free parameters. Two-dimensional contours were produced for pairs of WCs to illustrate their correlations. All results are consistent with the SM at 95% confidence level. and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies: [5] C. P. Burgess, "Introduction to effective field theory", Ann.    [11] CMS Collaboration, "Observation of single top quark production in association with a Z boson in proton-proton collisions at √ s = 13 TeV", Phys. Rev. Lett. 122 (2019) 132003, doi:10.1103/PhysRevLett.122.132003, arXiv:1812.05900.