1 Introduction

In proton–proton (pp) collisions at the CERN LHC, the pure electroweak (EW) production of a lepton–neutrino pair (\(\ell \nu \)) in association with two jets (\(\mathrm {jj}\)) includes production via vector boson fusion (VBF). This process has a distinctive signature of two jets with large energy and separation in pseudorapidity (\(\eta \)), produced in association with a lepton–neutrino pair. This EW process is referred to as EW \(\mathrm{W} \mathrm {jj}\), and the two jets produced through the fragmentation of the outgoing quarks are referred to as “tagging jets”.

Figure 1 shows representative Feynman diagrams for the EW \(\mathrm{W} \mathrm {jj}\) signal processes, namely VBF (Fig. 1, left), bremsstrahlung-like (Fig. 1, center), and multiperipheral (Fig. 1, right) production. Gauge cancellations lead to a large negative interference between the VBF diagram and the other two diagrams, with the larger interference coming from bremsstrahlung-like production. Interference with multiperipheral production is limited to cases where the lepton–neutrino pair mass is close to the \(\mathrm{W} \) boson mass.

Fig. 1 Representative Feynman diagrams for lepton–neutrino production in association with two jets from purely electroweak amplitudes: vector boson fusion (left), bremsstrahlung-like (center), and multiperipheral (right) production

In addition to the purely EW signal diagrams described above, there are other processes, not purely EW, that lead to the same \(\ell \nu \mathrm {jj}\) final states and can interfere with the signal diagrams in Fig. 1. This interference effect between the signal production and the main Drell–Yan (DY) background processes (\(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\)) is small compared to the interference effects among the EW production amplitudes, but needs to be included when measuring the signal contribution. Figure 2 (left) shows one example of \(\mathrm{W} \) boson production in association with two jets that has the same initial and final states as those in Fig. 1. A process that does not interfere with the EW signal is shown in Fig. 2 (right).

Fig. 2 Representative diagrams for \(\mathrm{W} \) boson production in association with two jets (\(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\)) that constitute the main background for the measurement

The study of EW \(\mathrm{W} \mathrm {jj}\) processes is part of a more general investigation of standard model (SM) VBF and scattering processes that includes the measurements of EW \(\mathrm{Zjj} \) processes, Higgs boson production [1,2,3], and searches for physics beyond the SM [4]. The properties of EW \(\mathrm{W} \mathrm {jj}\) events that are isolated from the backgrounds can be compared with SM predictions. Probing the additional hadronic activity in selected events can shed light on the modeling of the additional parton radiation [5, 6], which is important for signal selection and the vetoing of background events.

Higher-dimensional operators outside the SM can generate anomalous trilinear gauge couplings (ATGCs) [7, 8], so the measurement of the coupling strengths provides an indirect search for beyond-the-SM physics at mass scales not directly accessible at the LHC.

At the LHC, the EW \(\mathrm{W} \mathrm {jj}\) process was first measured by the CMS Collaboration using pp collisions at \(\sqrt{s}=8\,\text {Te}\text {V} \) [9] and then by the ATLAS Collaboration at both \(\sqrt{s}=8\,\text {Te}\text {V} \) and \(\sqrt{s}=7\,\text {Te}\text {V} \) [10]. The closely related EW \(\mathrm{Zjj} \) process was first measured during Run 1 by the CMS Collaboration using pp collisions at \(\sqrt{s}=7\,\text {Te}\text {V} \) [11], and then at \(\sqrt{s}=8\,\text {Te}\text {V} \) by both the CMS [12] and ATLAS [13] Collaborations. Measurements of the EW \(\mathrm{Zjj} \) process using data samples of pp collisions at \(\sqrt{s}=13\,\text {Te}\text {V} \) have been performed by ATLAS [14] and by CMS [15]. Considering leptonic final states in the same kinematic region, the EW \(\mathrm{W} \mathrm {jj}\) cross section is about a factor of 10 larger than the EW \(\mathrm{Zjj} \) cross section. All results so far agree with the expectations of the SM within a precision of 10–20%.

This paper presents measurements of the EW \(\mathrm{W} \mathrm {jj}\) process with the CMS detector using pp collisions collected at \(\sqrt{s}=\)13\(\,\text {Te}\text {V}\) during 2016, corresponding to an integrated luminosity of 35.9\(\,\text {fb}^{-1}\). A multivariate analysis based on a boosted decision tree (BDT), following the methods developed for the EW Zjj measurement [11, 12], is used to separate signal events from the large \(\mathrm{W} \)+jets background. The analysis of the 13\(\,\text {Te}\text {V}\) data offers the opportunity to measure the cross section at a higher energy than previously done and to reduce the uncertainties obtained with previous measurements, given both the larger integrated luminosity and the larger predicted total cross section.

This paper is organized as follows: Sect. 2 describes the experimental apparatus and Sect. 3 the event simulations. Event selection procedures are described in Sect. 4, together with the selection efficiencies and background estimations using control regions (CRs). Section 5 describes an estimation of the multijet background from quantum chromodynamics (QCD), based on CRs in data. Section 6 discusses a correction applied to the simulation as a function of the invariant mass \(m_\mathrm {jj}\). Section 7 presents distributions of the main discriminating variables in data. Section 8 details the strategy adopted to extract the signal from the data, and the corresponding systematic uncertainties are summarized in Sect. 9. The cross section and anomalous coupling results are presented in Sects. 10 and 11, respectively. Section 12 presents a study of the additional hadronic activity in an EW \(\mathrm{W} \mathrm {jj}\) enriched region. Finally, a brief summary of the results is given in Sect. 13.

2 The CMS detector and physics objects

The central feature of the CMS apparatus is a superconducting solenoid of 6\(\,\text {m}\) internal diameter, providing a magnetic field of 3.8\(\,\text {T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the \(\eta \) coverage provided by the barrel and endcap detectors to \(|\eta |\) = 5.2. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.

The tracker measures charged particles within the range \(|\eta | < 2.5\). It consists of 1440 pixel and 15,148 strip detector modules. For nonisolated particles with transverse momenta \(1< p_{\mathrm {T}} < 10\,\text {Ge}\text {V} \) and \(|\eta | < 1.4\), the track resolutions are typically 1.5% in \(p_{\mathrm {T}}\) and 25–90 (45–150) \(\upmu \)m in the transverse (longitudinal) impact parameter [16].

The energy of electrons is measured after combining the information from the ECAL and the tracker, whereas their direction is measured by the tracker. The momentum resolution for electrons with \(p_{\mathrm {T}} \approx \) 45\(\,\text {Ge}\text {V}\) from \({\mathrm{Z}} \rightarrow \mathrm{e} \mathrm{e} \) decays ranges from 1.7 to 4.5%. It is generally better in the barrel region than in the endcaps, and also depends on the bremsstrahlung energy emitted by the electron as it traverses the material in front of the ECAL [17].

Muons are measured in the range \(|\eta | < 2.4\), with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. Matching muons to tracks measured in the silicon tracker results in a relative transverse momentum resolution for muons with \(20<p_{\mathrm {T}} < 100\,\text {Ge}\text {V} \) of 1.3–2.0% in the barrel and better than 6% in the endcaps. The \(p_{\mathrm {T}}\) resolution in the barrel is better than 10% for muons with \(p_{\mathrm {T}}\) up to 1\(\,\text {Te}\text {V}\) [18].

Events of interest are selected using a two-tiered trigger system [19]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100\(\,\text {kHz}\) within a time interval of less than 4 \(\upmu \)s. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full-event reconstruction software optimized for fast processing, and reduces the event rate to around 1\(\,\text {kHz}\) before data storage.

A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [20].

3 Simulation of signal and background events

Signal events are simulated at leading order (LO) using the MadGraph 5_amc@nlo (v2.3.3) Monte Carlo (MC) generator [21], interfaced with pythia (v8.212) [22] for parton showering (PS) and hadronization. The NNPDF30 [23] parton distribution functions (PDFs) are used to generate the events. The underlying event is modeled using the CUETP8M1 tune [24]. The simulation does not include extra partons at matrix element (ME) level. The signal is defined in the kinematic region with parton transverse momentum \(p_\mathrm {T j} > 25\,\text {Ge}\text {V} \), and diparton invariant mass \(m_\mathrm {jj} >120\,\text {Ge}\text {V} \). The simulated cross section for the \(\ell \nu \)jj final state (with \(\ell \) = e, \(\mu \) or \(\tau \)), applying the above requirements, is \(\sigma _\mathrm {LO}(\mathrm {EW}~\ell \nu \mathrm {jj})= 6.81 ^{+0.03}_{-0.06} \,\text {(scale)}\pm 0.26\,\text {(PDFs)}\,\text {pb} \), where the first uncertainty is obtained by changing simultaneously the factorization (\(\mu _\mathrm {F}\)) and renormalization (\(\mu _{\mathrm {R}}\)) scales by factors of 2 and 1/2, and the second one reflects the uncertainties in the NNPDF30 PDFs. The LO signal cross section and relevant kinematic distributions estimated with MadGraph 5_amc@nlo are in agreement within 2–5% with the next-to-leading-order (NLO) predictions of the vbfnlo generator (v2.6.3) [25,26,27], which include QCD NLO corrections to the LO ME-level diagrams evaluated with MadGraph 5_amc@nlo. For additional comparisons, signal events produced with MadGraph 5_amc@nlo are also processed with the herwig++ (v2.7.1) [28] PS, using the EE5C [29] tune.

An additional signal sample that includes NLO QCD corrections but does not include the s-channel contributions to the final state has been generated with powheg (v2.0) [30,31,32], based on the vbfnlo ME calculations [33, 34]. In the powheg sample the \(m_\mathrm {jj} >120\,\text {Ge}\text {V} \) condition is applied on the two \(p_{\mathrm {T}}\)-leading parton-level jets, after clustering the ME final state partons with the \(k_{\mathrm {T}}\)-algorithm [35,36,37], with a distance parameter \(D=0.8\), as done in Ref. [33]. The powheg sample has also been processed alternatively with pythia and herwig++ parton showering (PS) and hadronization programs, as done for the MadGraph 5_amc@nlo samples. In the following, results obtained with the powheg signal samples are given as a cross check of the main results obtained with the MadGraph 5_amc@nlo signal samples.

Events coming from processes including ATGCs are generated with the same settings as the SM sample, but include additional information for reweighting in the three-dimensional effective field theory (EFT) parameter space, which is described in more detail in Sect. 11. The ‘EWdim6NLO’ model [8, 21] is used for the generation of anomalous couplings.

Background \(\mathrm{W} \) boson events are also simulated with MadGraph 5_amc@nlo using (1) an NLO ME calculation with up to three final-state partons generated from QCD interactions, and (2) an LO ME calculation with up to four partons from QCD interactions. The ME-PS matching is performed following the FxFx prescription [38] for the NLO case, and the MLM prescription [39, 40] for the LO case. The NLO background simulation is used to extract the final results, while the independent LO samples are used to perform the multivariate discriminant training. The inclusive \(\mathrm{W} \) boson production is normalized to \(\sigma _\text {th}(\mathrm{W})=61.5 \,\text {nb} \), as computed at next-to-next-to-leading order (NNLO) with fewz (v3.1) [41].

The evaluation of the interference between \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) processes relies on the predictions obtained with MadGraph 5_amc@nlo. A dedicated sample of events arising from the interference terms is generated directly by selecting the contributions of order \(\alpha _\mathrm {s}\alpha _\mathrm {EW}^3\), and passed through the full detector simulation to estimate the expected interference contribution.

Other backgrounds are expected from events with one electron or muon and missing transverse momentum together with jets in the final state. Events from top quark pair production are generated with powheg (v2.0) [30,31,32], and normalized to the inclusive cross section calculated at NNLO, including next-to-next-to-leading logarithmic corrections, of 832\(\,\text {pb}\) [42, 43]. Single top quark processes are modeled at NLO with powheg [30,31,32, 44] and normalized to cross sections of \(71.7\pm 2.0\,\text {pb} \), \(217\pm 3\,\text {pb} \), and \(10.32\pm 0.20\,\text {pb} \), respectively, for the tW (powheg v1) [45], t-, and s-channel production [42, 46]. The diboson (VV) production processes (\(\mathrm{W} \mathrm{W} \), \(\mathrm{W} \mathrm{Z} \), and \(\mathrm{Z} \mathrm{Z} \)) are generated with pythia and normalized to NNLO cross section computations obtained with mcfm (v8.0) [47].

The contribution from QCD multijet processes is derived via an extrapolation from a QCD data CR with the lepton relative isolation selection inverted. All background simulations make use of the pythia PS model with the CUETP8M1 tune.

A detector simulation based on Geant4 (v9.4p03) [48, 49] is applied to all the generated signal and background samples. The presence of multiple pp interactions is incorporated by simulating additional interactions (pileup), both in-time and out-of-time with respect to the hard interaction, with a multiplicity that matches the distribution observed in data. The average pileup is measured to be about 23 additional interactions per bunch crossing.

4 Reconstruction and selection of events

Events containing exactly one isolated, high-\(p_{\mathrm {T}}\) lepton and at least two high-\(p_{\mathrm {T}}\) jets are selected. Isolated single-lepton triggers are used to acquire the data, where the lepton is required to have \(p_{\mathrm {T}} >27\,\text {Ge}\text {V} \) for the electron trigger and \(p_{\mathrm {T}} >24\,\text {Ge}\text {V} \) for the muon trigger.

The offline analysis uses candidates reconstructed by the particle-flow (PF) algorithm [50]. In the PF event reconstruction, all stable particles in the event (i.e., electrons, muons, photons, and charged and neutral hadrons) are reconstructed as PF candidates using information from all subdetectors to obtain an optimal determination of their direction, energy, and type. The PF candidates are used to reconstruct the jets and the missing transverse momentum.

The reconstructed primary vertex (PV) with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is the primary \(\mathrm{p} \mathrm{p} \) interaction vertex. The physics objects are the objects returned by a jet finding algorithm [51, 52] applied to all charged particle tracks associated with the vertex, along with the corresponding associated missing transverse momentum. Charged tracks identified as hadrons from pileup vertices are omitted in the subsequent PF event reconstruction [50].

Offline electrons are reconstructed from clusters of energy deposits in the ECAL that match tracks extrapolated from the silicon tracker [17]. Offline muons are reconstructed by fitting trajectories based on hits in the silicon tracker and in the muon system [53]. Reconstructed electron or muon candidates are required to have \(p_{\mathrm {T}} >20\,\text {Ge}\text {V} \). Electron candidates are required to be reconstructed within \(|\eta |\le 2.4\), excluding the barrel-to-endcap transitional region \(1.444< |\eta | < 1.566\) of the ECAL [20]. Muon candidates are required to be reconstructed in the fiducial region \(|\eta |\le 2.4\). The track associated with a lepton candidate is required to have both its transverse and longitudinal impact parameters compatible with the position of the PV of the event.

The leptons are required to be isolated; the isolation (I) variable is calculated from PF candidates and is corrected for pileup on an event-by-event basis [54]. The scalar \(p_{\mathrm {T}}\) sum of all PF candidates reconstructed in an isolation cone with radius \(\varDelta R=\sqrt{\smash [b]{(\varDelta \eta )^{2}+(\varDelta \phi )^{2}}}=0.4\) around the lepton’s momentum vector, excluding the lepton itself, is required to be less than 6% of the electron or muon \(p_{\mathrm {T}}\) value. For additional offline analysis, the isolated lepton is required to have \(p_{\mathrm {T}} >25\) \(\,\text {Ge}\text {V}\) for the muon channel and \(p_{\mathrm {T}} >30\) \(\,\text {Ge}\text {V}\) for the electron channel. Events with more than one lepton satisfying the above requirements are rejected. The lepton flavor samples are exclusive and precedence is given to the selection of muons.
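The isolation requirement can be illustrated with a minimal Python sketch. It assumes that PF candidates are available as simple \((p_{\mathrm {T}}, \eta , \phi )\) tuples with the lepton itself already excluded, and it omits the event-by-event pileup correction; the function names and the example values are hypothetical and are not taken from the analysis software.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def relative_isolation(lepton, candidates, cone=0.4):
    """Scalar pT sum of candidates inside the cone, divided by the lepton pT.

    `lepton` and each entry of `candidates` are (pt, eta, phi) tuples.
    The lepton must not be part of `candidates` (assumption of this sketch).
    """
    lep_pt, lep_eta, lep_phi = lepton
    pt_sum = sum(
        pt for pt, eta, phi in candidates
        if delta_r(lep_eta, lep_phi, eta, phi) < cone
    )
    return pt_sum / lep_pt

# Hypothetical example: two soft candidates near the lepton, one far away.
lepton = (40.0, 0.5, 1.0)
candidates = [(1.0, 0.6, 1.1), (0.8, 0.4, 0.9), (5.0, 2.0, -2.0)]
iso = relative_isolation(lepton, candidates)
is_isolated = iso < 0.06  # nominal requirement quoted in the text
```

In this example only the two nearby candidates enter the sum, so the lepton passes the 6% requirement.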

The missing transverse momentum vector, \({\vec p}_{\mathrm {T}} ^{\text {miss}}\), is calculated offline as the negative of the vector sum of transverse momenta of all PF objects identified in the event [55], and the magnitude of this vector is denoted \(p_{\mathrm {T}} ^{\text {miss}}\). Events are required to have \(p_{\mathrm {T}} ^{\text {miss}}\) in excess of 20\(\,\text {Ge}\text {V}\) in the muon channel and 40\(\,\text {Ge}\text {V}\) in the electron channel. The tighter requirement for the electron channel reduces the corresponding higher background of QCD multijet events. The transverse mass (\({m}_{\mathrm {T}}\)) of the lepton and \({\vec p}_{\mathrm {T}} ^{\text {miss}}\) four-vector sum is then required to exceed 40\(\,\text {Ge}\text {V}\) in both channels.
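As an illustration of the last two requirements, the sketch below evaluates the transverse mass with the commonly used massless-lepton expression \(m_{\mathrm {T}}=\sqrt{2\,p_{\mathrm {T}}^{\ell }\,p_{\mathrm {T}}^{\text {miss}}\,(1-\cos \varDelta \phi )}\). This closed form is an assumption of the example (the text above only states that the lepton and \({\vec p}_{\mathrm {T}} ^{\text {miss}}\) are combined), and the function names are hypothetical.

```python
import math

# Minimum pT^miss per channel in GeV, as quoted in the text.
MET_MIN = {"muon": 20.0, "electron": 40.0}
MT_MIN = 40.0  # GeV, both channels

def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Transverse mass of the lepton + pT^miss system, neglecting the lepton mass."""
    dphi = lep_phi - met_phi
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))

def passes_met_and_mt(channel, lep_pt, lep_phi, met, met_phi):
    """Apply the channel-dependent pT^miss cut and the common mT cut."""
    if met <= MET_MIN[channel]:
        return False
    return transverse_mass(lep_pt, lep_phi, met, met_phi) > MT_MIN

# Hypothetical event: lepton and pT^miss back to back in azimuth.
mt_example = transverse_mass(40.0, 0.0, 40.0, math.pi)
```

For a back-to-back configuration the expression reduces to \(2\sqrt{p_{\mathrm {T}}^{\ell }\,p_{\mathrm {T}}^{\text {miss}}}\), which is a convenient sanity check.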

Jets are reconstructed by clustering PF candidates with the anti-\(k_{\mathrm {T}}\) algorithm [51, 56] using a distance parameter of 0.4. The jet momentum is the vector sum of all particle momenta in the jet and is typically within 5–10% of the true momentum over the whole \(p_{\mathrm {T}}\) spectrum and detector acceptance.

An offset correction is applied to jet energies because of the contribution from pileup. Jet energy corrections are derived from simulation, and are confirmed with in situ measurements of the energy balance in dijet, multijet, photon+jet, and Z+jets events with leptonic Z boson decays [57]. Loose jet identification criteria are applied to reject misreconstructed jets resulting from detector noise [58]. Loose criteria are also applied to remove jets heavily contaminated with pileup energy (clustering of energy deposits not associated with a parton from the primary \(\mathrm{p} \mathrm{p} \) interaction) [58, 59]. The efficiency of the jet identification is greater than 99%, with a rejection of 90% of background pileup jets with \(p_{\mathrm {T}} \simeq 50\,\text {Ge}\text {V} \) and \(|\eta |\le 2.5\). For jets with \(|\eta | > 2.5\) and \(30< p_{\mathrm {T}} < 50\,\text {Ge}\text {V} \), the efficiency is approximately 90% and the pileup jet rejection is approximately 50%. The jet energy resolution (JER) is typically \({\approx }\)15% at 10\(\,\text {Ge}\text {V}\), 8% at 100\(\,\text {Ge}\text {V}\), and 4% at 1\(\,\text {Te}\text {V}\) for jets with \(|\eta |\le 1\) [57]. Jets reconstructed with \(p_{\mathrm {T}} \ge 15\) \(\,\text {Ge}\text {V}\) and \(|\eta |\le 4.7\) are used in the analysis.

The two highest \(p_{\mathrm {T}}\) jets are defined as the tagging jets, and are required to have \(p_{\mathrm {T}} >50\) \(\,\text {Ge}\text {V}\) and \(p_{\mathrm {T}} >30\) \(\,\text {Ge}\text {V}\) for the leading and subleading (in \(p_{\mathrm {T}}\)) jet, respectively. The invariant mass of the two tagging jets is required to satisfy \(m_\mathrm {jj}>200\,\text {Ge}\text {V} \).
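The tagging-jet selection can be sketched as follows. The example assumes that jets are given as \((p_{\mathrm {T}}, \eta , \phi , m)\) tuples in \(\text {GeV}\) and that they already satisfy the identification and acceptance criteria described above; the function names are hypothetical.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pt, eta, phi, mass) to Cartesian (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def dijet_mass(jet1, jet2):
    """Invariant mass of the sum of two jet four-vectors."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*jet1), four_vector(*jet2)))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def select_tagging_jets(jets):
    """Return the two pT-leading jets if they pass the pT and m_jj cuts, else None."""
    ordered = sorted(jets, key=lambda jet: jet[0], reverse=True)
    if len(ordered) < 2:
        return None
    lead, sublead = ordered[0], ordered[1]
    if lead[0] <= 50.0 or sublead[0] <= 30.0:
        return None
    if dijet_mass(lead, sublead) <= 200.0:
        return None
    return lead, sublead

# Hypothetical event with two forward-backward jets and one soft central jet.
jets = [(20.0, 0.1, 1.0, 0.0), (100.0, 2.0, 0.0, 0.0), (100.0, -2.0, math.pi, 0.0)]
tagging = select_tagging_jets(jets)
```

For massless jets the result agrees with the closed form \(m_\mathrm {jj}^2 = 2\,p_{\mathrm {T}1}\,p_{\mathrm {T}2}\,(\cosh \varDelta \eta - \cos \varDelta \phi )\), which makes the role of the pseudorapidity separation explicit.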

The transverse momentum of the \(\mathrm{W} \) boson (\(\vec {p}_{\mathrm {T} \mathrm{W}}\)) is evaluated as the vector sum of the lepton transverse momentum and \({\vec p}_{\mathrm {T}} ^{\text {miss}}\). The event \(p_{\mathrm {T}} \) balance, \(R(p_{\mathrm {T}})\), is then defined as

$$\begin{aligned} R(p_{\mathrm {T}})= \frac{| \vec {p}_{\mathrm {T} \mathrm {j}_1}+\vec {p}_{\mathrm {T} \mathrm {j}_2}+\vec {p}_{\mathrm {T} \mathrm{W}} |}{ |\vec {p}_{\mathrm {T} \mathrm {j}_1} | +|\vec {p}_{\mathrm {T} \mathrm {j}_2} | + |\vec {p}_{\mathrm {T} \mathrm{W}} | } \end{aligned}$$
(1)

where \(\vec {p}_{\mathrm {T} \mathrm {j}_1}\) and \(\vec {p}_{\mathrm {T} \mathrm {j}_2}\) are the transverse momenta of the two tagging jets.

Finally, events are required to have \(R(p_{\mathrm {T}}) < 0.2\). This has a negligible effect on the analysis sensitivity and allows the definition of a nonoverlapping control sample with \(R(p_{\mathrm {T}}) > 0.2\) that is used to derive a correction to the invariant mass based on a CR in data, as described in Sect. 6.
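A minimal sketch of Eq. (1) is given below. It assumes that each object is represented by its transverse momentum magnitude and azimuthal angle; the function names and example values are hypothetical.

```python
import math

def pt_components(pt, phi):
    """Cartesian components of a transverse momentum vector."""
    return pt * math.cos(phi), pt * math.sin(phi)

def pt_balance(jet1, jet2, lepton, met):
    """R(pT) of Eq. (1); every argument is a (pt, phi) pair."""
    lx, ly = pt_components(*lepton)
    mx, my = pt_components(*met)
    wx, wy = lx + mx, ly + my          # W boson pT = lepton pT + pT^miss (vector sum)
    j1x, j1y = pt_components(*jet1)
    j2x, j2y = pt_components(*jet2)
    numerator = math.hypot(j1x + j2x + wx, j1y + j2y + wy)
    denominator = jet1[0] + jet2[0] + math.hypot(wx, wy)
    return numerator / denominator

# Hypothetical balanced event: the W boson recoils against the two jets.
r_balanced = pt_balance((50.0, 0.0), (50.0, 0.0), (60.0, math.pi), (40.0, math.pi))
# Hypothetical fully unbalanced event: all objects point in the same direction.
r_unbalanced = pt_balance((50.0, 0.0), (50.0, 0.0), (60.0, 0.0), (40.0, 0.0))
in_signal_region = r_balanced < 0.2
```

By construction \(0 \le R(p_{\mathrm {T}}) \le 1\), so the threshold at 0.2 splits the events into the nominal sample and the nonoverlapping control sample.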

A multivariate analysis technique, described in Sect. 8, is used to provide an optimal separation of the \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) components of the inclusive \(\ell \nu \mathrm {jj}\) spectrum. The main discriminating variables are the dijet invariant mass \(m_\mathrm {jj}\) and pseudorapidity separation \(\varDelta \eta _\mathrm {jj}\).

Angular variables useful for signal discrimination include the \(y^*\) Zeppenfeld variable [6], defined as the difference between the rapidity of the \(\mathrm{W} \) boson \(y_{\mathrm{W}}\) and the average rapidity of the two tagging jets, i.e.,

$$\begin{aligned} y^*=y_{\mathrm{W}}-\frac{1}{2}(y_{\mathrm {j}_1}+y_{\mathrm {j}_2}), \end{aligned}$$
(2)

and the \(z^*\) Zeppenfeld variable [6] defined as

$$\begin{aligned} z^*=\frac{ y^* }{ \varDelta y_\mathrm {jj} }, \end{aligned}$$
(3)

where \(\varDelta y_\mathrm {jj}\) is the dijet rapidity separation.
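Equations (2) and (3) can be evaluated as in the following sketch. It assumes that the rapidities of the \(\mathrm{W} \) boson candidate and of the two tagging jets are already available, and it takes \(\varDelta y_\mathrm {jj}\) as the absolute rapidity difference, which is an assumption of this example; the function name is hypothetical.

```python
def zeppenfeld_variables(y_w, y_j1, y_j2):
    """Return (y*, z*) following Eqs. (2) and (3).

    Delta y_jj is taken here as |y_j1 - y_j2| (assumption of this sketch).
    """
    y_star = y_w - 0.5 * (y_j1 + y_j2)
    delta_y_jj = abs(y_j1 - y_j2)
    if delta_y_jj == 0.0:
        raise ValueError("z* is undefined for jets with identical rapidity")
    return y_star, y_star / delta_y_jj

# Hypothetical event: W boson candidate between two well-separated jets.
y_star, z_star = zeppenfeld_variables(0.5, 2.0, -2.0)
```

With this convention, a \(\mathrm{W} \) boson candidate located between the tagging jets has \(|z^*| < 0.5\).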

Table 1 reports the expected and observed event yields after the initial selection and after imposing a minimum value for the final multivariate discriminant output applied to define the signal-enriched region used for the studies of additional hadronic activity described in Sect. 12.

Table 1 Event yields expected for background and signal processes using the initial selections and with a selection on the multivariate analysis output (BDT) that provides similar signal and background yields. The yields are compared to the data observed in the different channels and categories. The total uncertainties quoted for signal, \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) and diboson backgrounds, and processes with top quarks (\(\mathrm{t}{\bar{\mathrm{t}}}\) and single top quarks) include the systematic uncertainties

4.1 Discriminating quarks from gluons

Jets in signal events are expected to originate from quarks, whereas for background events it is more probable that jets are initiated by a gluon. A quark-gluon likelihood (QGL) discriminant [11] is evaluated for the two tagging jets with the intent of distinguishing the nature of each jet.

The QGL discriminant exploits differences in the showering and fragmentation of quarks and gluons, making use of the following internal jet composition observables: (1) the particle multiplicity of the jet, (2) the minor root-mean-square of distance between the jet constituents in the \(\eta \)–\(\phi \) plane, and (3) the \(p_{\mathrm {T}}\) distribution function of the jet constituents, as defined in Ref. [60].

The variables are used as inputs to a likelihood discriminant on gluon and quark jets constructed from simulated dijet events. The performance of the QGL discriminant is evaluated and validated using independent, exclusive samples of \(\mathrm{Z} \)+jet and dijet data [60]. Corrections to the simulated QGL distributions and related systematic uncertainties are derived from a comparison of simulation and data distributions.

5 The QCD multijet background

The QCD multijet contribution is estimated by defining a multijet-enriched CR with inverted lepton isolation criteria for both the muon and electron channels. In the nominal selection both lepton types are required to pass the relative isolation requirement \(I<0.06\), whereas the multijet-enriched CRs are defined with the same event selection but with isolation requirements \(0.06<I<0.12\) and \(0.06<I<0.15\) for the muon and electron channels, respectively. It is then assumed that the \(p_{\mathrm {T}} ^{\text {miss}}\) distribution of QCD events has the same shape in the nominal region and in the multijet-enriched CR.

The various components, with floating \(\mathrm{W} \)+jets and QCD multijet background scale factors, are simultaneously fitted to the \(p_{\mathrm {T}} ^{\text {miss}}\) data distributions, independently in the muon and electron channels, and the expected QCD multijet yields in the nominal regions are derived.

The contribution of QCD multijet processes in any other observable (x) used in the analysis is then normalized to the yields obtained above from the fit to the \(p_{\mathrm {T}} ^{\text {miss}}\) distribution, and the shape for the distribution x is taken as the difference between data and all simulated background contributions in the x distribution in the multijet-enriched CR.
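The template construction described above can be sketched as follows. The example assumes binned yields stored as plain lists and clips negative differences at zero, a choice made for this illustration only and not stated in the text; the function name and the numbers are hypothetical.

```python
def qcd_template(data_cr, sim_cr, fitted_yield):
    """Build the QCD multijet template for an observable x.

    data_cr      : observed yields per bin in the multijet-enriched CR
    sim_cr       : summed simulated (non-QCD) yields per bin in the same CR
    fitted_yield : QCD yield in the nominal region from the pT^miss fit
    """
    # Shape: data minus simulation in the CR (negative bins clipped, an assumption).
    shape = [max(d - s, 0.0) for d, s in zip(data_cr, sim_cr)]
    total = sum(shape)
    if total <= 0.0:
        raise ValueError("no residual QCD contribution in the control region")
    # Normalization: rescale the shape to the fitted nominal-region yield.
    return [fitted_yield * value / total for value in shape]

# Hypothetical three-bin example.
template = qcd_template([100.0, 80.0, 60.0], [40.0, 30.0, 30.0], fitted_yield=70.0)
```

The returned template carries the shape observed in the CR and integrates to the yield obtained from the \(p_{\mathrm {T}} ^{\text {miss}}\) fit.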

The estimation of the QCD multijet contribution based on a CR in data is validated by checking the modeling of other variables that discriminate QCD multijets from \(\mathrm{W} \)+jets, such as the W transverse mass and the minimum difference in \(\phi \) between the missing transverse momentum and the jets. Good agreement with the data is observed in all distributions. The stability of the \(\mathrm{W} \)+jets fitted normalization is checked by varying the selection requirements for the fitted region and repeating the QCD extraction fit. The observed variations in fitted normalization when varying the \(m_\mathrm {T}\)(W) and \(p_{\mathrm {T}} ^{\text {miss}}\) selection requirements with respect to the fit region definition are much smaller than the systematic uncertainties.

Although \(\mathrm{b} \) tagging is not used in this analysis, a \(\mathrm{b} \)-tagging discriminant output [61] is used to check the fitted \(\mathrm{W} \)+jets background normalization as well as the \(\mathrm{t}{\bar{\mathrm{t}}} \) normalization from simulation, and they agree with data within the uncertainties. Finally, the selections on \(m_\mathrm {jj} \), \(p_{\mathrm {T}} ^{\text {miss}}\), and \(m_\mathrm {T}\)(W) are also loosened in order to verify that the \(\mathrm{W} \)+jets background scale factor is not biased by these requirements.

6 The \(m_\mathrm {jj} \) correction

A systematic overestimation of the simulation yields is caused by a partial mistiming of the signals in the forward region of the ECAL endcaps (\(2.5<\vert \eta \vert <3.0\)). This effect, which increases with increasing \(m_\mathrm {jj} \), is observed in both electron and muon channels. A correction for this effect is derived in the nonoverlapping signal-depleted CR obtained by requiring that the event transverse momentum balance \(R(p_{\mathrm {T}})\), defined in Sect. 4, exceeds 0.2.

A third-order polynomial correction is first applied to the \(\mathrm{W} \)+jets simulation separately in the muon and electron channels in order to match the \(R(p_{\mathrm {T}})\) distribution in data. The magnitude of the applied \(R(p_{\mathrm {T}})\) corrections is about 10%. The uncertainty in this correction due to the limited statistical precision of the simulation as well as data is propagated to the fitted \(\mathrm{W} \)+jets templates.

A correction to the \(m_\mathrm {jj} \) prediction from simulation is derived in the signal-depleted \(R(p_{\mathrm {T}})>0.2\) CR via a third-order polynomial fit to the ratio of data to the overall prediction from simulation for signal and background as a function of \(\ln (m_\mathrm {jj}/\,\text {Ge}\text {V})\). The electron and muon channels are combined when deriving the \(m_\mathrm {jj} \) correction. The uncertainty in the correction includes the data statistical component as well as the systematic uncertainty due to the limited statistical precision of the simulation.

Figure 3 shows the fitted correction including the uncertainty. This correction is applied to all simulated results, including the signal, and the corresponding uncertainty is propagated to the signal extraction fits.
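A procedure of this kind can be sketched with NumPy as shown below. The binning, the yields, and the treatment of the uncertainties (Poisson data uncertainty combined in quadrature with the simulation statistical uncertainty) are assumptions of this example, and the function name is hypothetical; the sketch is not the fit code used in the analysis.

```python
import numpy as np

def fit_mjj_correction(mjj, data, sim, sim_err):
    """Fit a third-order polynomial in ln(m_jj/GeV) to the data/simulation ratio.

    Returns a numpy.poly1d object that maps ln(m_jj/GeV) to a correction factor.
    """
    mjj, data, sim, sim_err = (np.asarray(a, dtype=float) for a in (mjj, data, sim, sim_err))
    x = np.log(mjj)
    ratio = data / sim
    # Relative uncertainty: Poisson term for data plus simulation statistics.
    rel_err = np.sqrt(1.0 / data + (sim_err / sim) ** 2)
    sigma = ratio * rel_err
    # numpy.polyfit expects weights of 1/sigma for Gaussian uncertainties.
    coefficients = np.polyfit(x, ratio, deg=3, w=1.0 / sigma)
    return np.poly1d(coefficients)

# Hypothetical CR yields: the simulation increasingly overestimates data at high m_jj.
mjj = np.array([250.0, 400.0, 600.0, 900.0, 1300.0, 1800.0, 2500.0, 3500.0])
sim = np.array([5000.0, 3000.0, 1500.0, 700.0, 300.0, 120.0, 50.0, 20.0])
true_ratio = 1.0 - 0.05 * (np.log(mjj) - np.log(200.0))
data = sim * true_ratio
correction = fit_mjj_correction(mjj, data, sim, sim_err=0.02 * sim)
weight_at_1tev = correction(np.log(1000.0))
```

The fitted function would then be evaluated per event at \(\ln (m_\mathrm {jj}/\,\text {GeV})\) and applied as a weight to the simulation; varying the input points within their uncertainties and refitting gives the corresponding systematic variations.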

Fig. 3 Data divided by simulation as a function of \(\ln (m_\mathrm {jj}/\,\text {Ge}\text {V})\) in a signal-depleted control sample with \(R(p_{\mathrm {T}})>0.2\). This distribution is fit by a third-order polynomial (solid black line) in order to derive a correction on the simulation \(m_\mathrm {jj} \) prediction. The points are varied by the uncertainty, including the effect of the limited number of simulated events, and refitted in order to derive the systematic variations on the correction (dashed lines) corresponding to a standard deviation (SD)

7 Distributions of discriminating variables

Figure 4 shows the \(p_{\mathrm {T}} ^{\text {miss}}\) and \({m}_{\mathrm {T}}\)(W) distributions after the event preselection. The dijet invariant mass and pseudorapidity difference (\(\varDelta \eta _\mathrm {jj} \)) after preselection are presented in Fig. 5, and Fig. 6 shows the \(y^\star \) and \(z^\star \) distributions after the event preselection. The distributions of the QGL likelihood output values in data and simulation for the two tagging jets are shown in Fig. 7. The prediction from simulated events and the data agree within total uncertainties for all discriminating variables.

Fig. 4 Distribution of the missing transverse momentum (upper) and the lepton–\(p_{\mathrm {T}} ^{\text {miss}}\) system transverse mass (lower) after the event preselection for the selected leading lepton in the event, in the muon (left) and electron (right) channels. In all plots the last bin contains overflow events

Fig. 5 Dijet invariant mass (upper) and pseudorapidity difference (lower) distributions after the event preselection, in the muon (left) and electron (right) channels. In all plots the last bin contains overflow events

Fig. 6 Distributions of the “Zeppenfeld” variables \(y^\star \)(W) (upper) and \(z^\star \)(W) (lower) after event preselection in the muon (left) and electron (right) channels. In all plots the first and last bins contain underflow and overflow events, respectively

Fig. 7 The QGL output for the leading (upper) and subleading (lower) quark jet candidates in the preselected muon (left) and electron (right) samples

8 Signal discriminants and extraction procedure

The \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) signal is characterized by a large pseudorapidity separation between the tagging jets, due to the small-angle scattering of the two initial partons. Because of both the topological configuration and the large energy of the outgoing partons, \(m_\mathrm {jj}\) is also expected to be large, and can be used to distinguish the \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) processes. The correlation between \(\varDelta \eta _\mathrm {jj}\) and \(m_\mathrm {jj}\) is expected to be different in signal and background events; therefore, these characteristics are expected to yield a high separation power between \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) production. In addition, in signal events it is expected that the \(\mathrm{W} \) boson candidate is produced centrally in the rapidity region defined by the two tagging jets. As a consequence, signal events are expected to yield lower values of \(z^*\) compared to the DY background. Other variables that are used to enhance the signal-to-background separation are related to the kinematics of the event or to the properties of the jets that are expected to be initiated by quarks. The variables that are used in the multivariate analysis are: (1) \(m_\mathrm {jj}\), (2) \(\varDelta \eta _\mathrm {jj}\), (3) \(z^*\), and (4) the QGL values of the two tagging jets.

The multivariate discriminant is built by training a boosted decision tree (BDT) with the tmva package [62] to achieve an optimal separation between the \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) processes. The simulated events that are used for the BDT training are not used for the signal extraction.

To improve the sensitivity for the extraction of the signal component, the transformation that originally projects the BDT output value in the [−1,\(+\)1] interval is changed to \(\mathrm {BDT'} = \tanh ^{-1}((\mathrm {BDT}+1)/2)\). This allows the purest signal region of the BDT output to be better sampled while keeping an equal-width binning of the BDT variable.
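The effect of this transformation can be illustrated with a minimal sketch. The function name `bdt_prime` and the clipping parameter `eps` are assumptions introduced here for illustration (the clipping only keeps the value finite at exactly BDT = +1) and are not part of the analysis.

```python
import math

def bdt_prime(bdt: float, eps: float = 1e-9) -> float:
    """Map a BDT score in [-1, +1] to BDT' = atanh((BDT + 1) / 2)."""
    if not -1.0 <= bdt <= 1.0:
        raise ValueError("BDT score must lie in [-1, +1]")
    x = (bdt + 1.0) / 2.0      # rescale the score to [0, 1]
    x = min(x, 1.0 - eps)      # assumption: clip so that atanh stays finite at BDT = +1
    return math.atanh(x)

# Equal-width steps in BDT' correspond to increasingly fine steps in BDT near +1.
for score in (-1.0, 0.0, 0.9, 0.99):
    print(f"BDT = {score:+.2f} -> BDT' = {bdt_prime(score):.3f}")
```

Because the inverse hyperbolic tangent grows without bound as its argument approaches one, the region close to BDT = +1 is stretched over many equal-width BDT' bins.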

Figure 8 shows the distributions of the discriminants for the two leptonic channels. Good overall agreement between simulation and data is observed in all distributions, and the signal presence is visible at high BDT’ values.

Fig. 8
figure 8

Data and MC simulation BDT’ output distributions for the muon (upper) and electron (lower) channels, using the BDT output transformed with the \(\tanh ^{-1}\) function to enhance the purest signal region. The ratio panel shows the statistical uncertainty from the simulation as well as the independent systematic uncertainties from the leading sources

A binned maximum likelihood is built from the expected rates for each process, as a function of the value of the discriminant, which is fit to extract the strength modifiers for the \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) processes, \(\mu = \sigma ({\mathrm {EW}~\mathrm{W} \mathrm {jj}}) / \sigma _\mathrm {LO}({\mathrm {EW}~\ell \nu \mathrm {jj}})\) and \(\upsilon = \sigma ({\mathrm{W}})/\sigma _\text {NNLO}({\mathrm{W}})\). Nuisance parameters are added to modify the expected rates and shapes according to the estimate of the systematic uncertainties affecting the measurement.

The interference between the \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) processes is included in the fit procedure, and its strength scales as \(\sqrt{\mu \upsilon }\). The interference model is derived from the MadGraph 5_amc@nlo simulation described in Sect. 3.

The parameters of the model (\(\mu \) and \(\upsilon \)) are determined by maximizing the likelihood. The statistical methodology follows the one used in other analyses [63] using asymptotic formulas [64]. In this procedure the systematic uncertainties affecting the measurement of the signal and background strengths (\(\mu \) and \(\upsilon \)) are constrained with log-normal probability distributions.
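The structure of the fit model can be sketched as follows, assuming that the expected yield in each bin is the sum of the signal template scaled by \(\mu \), the DY template scaled by \(\upsilon \), the interference template scaled by \(\sqrt{\mu \upsilon }\), and the remaining backgrounds. Nuisance parameters are omitted, and all template values and function names below are hypothetical and serve only as an illustration.

```python
import math

def expected_yields(mu, upsilon, signal, dy, interference, other):
    """Expected events per bin: mu*S + upsilon*B + sqrt(mu*upsilon)*I + other backgrounds."""
    scale_int = math.sqrt(mu * upsilon)
    return [mu * s + upsilon * b + scale_int * i + o
            for s, b, i, o in zip(signal, dy, interference, other)]

def binned_nll(observed, expected):
    """Poisson negative log-likelihood summed over bins (constant terms dropped)."""
    return sum(e - n * math.log(e) for n, e in zip(observed, expected))

# Hypothetical templates, for illustration only (not taken from the analysis)
signal = [5.0, 20.0, 60.0]
dy = [900.0, 300.0, 40.0]
interference = [-2.0, 3.0, 4.0]
other = [100.0, 50.0, 10.0]
observed = [1000, 370, 112]

nominal = expected_yields(1.0, 1.0, signal, dy, interference, other)
print(nominal, binned_nll(observed, nominal))
```

In a complete fit this function would be minimized with respect to \(\mu \), \(\upsilon \), and the nuisance parameters; the sketch only shows how the interference term couples the two strength modifiers.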

9 Systematic uncertainties

The main systematic uncertainties affecting the measurement are classified into experimental and theoretical according to their sources. Some uncertainties affect only normalizations, whereas others affect both the normalization and shape of the BDT output distribution.

9.1 Experimental uncertainties

The following experimental uncertainties are considered.

Integrated luminosity:

A 2.5% uncertainty is assigned to the value of the integrated luminosity [65].

Trigger and selection efficiencies:

Uncertainties in the efficiency corrections based on control samples in data for the leptonic trigger and offline selections are included and amount to a total of 2–3% depending on the lepton \(p_{\mathrm {T}} \) and \(\eta \), for both the e and \(\mu \) channels. These uncertainties are estimated by comparing the lepton efficiencies expected in simulation and measured in data with a “tag-and-probe” method [66].

Jet energy scale and resolution:

The uncertainty in the energy of the jets affects the event selection and the computation of the kinematic variables used to calculate the discriminants. Therefore, the uncertainty in the jet energy scale (JES) affects both the expected event yields and the final shapes. The effect of the JES uncertainty is studied by rescaling up and down the reconstructed jet energy by \(p_{\mathrm {T}}\)- and \(\eta \)-dependent scale factors [57]. An analogous approach is used for the jet energy resolution (JER).

QGL discriminator:

The uncertainty in the performance of the QGL discriminator is measured using independent \(\mathrm{Z} \)+jet and dijet data, after comparing with the corresponding simulation predictions [60]. Shape variations corresponding to the full differences between the data and the simulation are used as estimates of the uncertainty.

Pileup:

Pileup can affect the identification and isolation of the leptons or the corrected energy of the jets. When the jet clustering algorithm is run, pileup can distort the reconstructed dijet system because of the contamination of tracks and calorimetric deposits. This uncertainty is evaluated by generating alternative distributions of the number of pileup interactions, corresponding to a 4.6% uncertainty in the total inelastic pp cross section at \(\sqrt{s}=13\,\text {Te}\text {V} \) [67].

Limited number of simulated events:

For each signal and background simulation, shape variations for the distributions are considered by shifting the content of each bin up or down by its statistical uncertainty [68]. This generates alternatives to the nominal shape that are used to describe the uncertainty from the limited number of simulated events.

\(m_\mathrm {jj} \) correction:

As described in Sect. 6, the \(m_\mathrm {jj} \) prediction from simulation is corrected to match the distribution in data in a signal-depleted \(R(p_{\mathrm {T}})>0.2\) control region. The uncertainty in this correction is derived by varying the fitted points within the statistical uncertainty from data and simulation combined and refitting the correction.

QCD multijet background template:

As described in Sect. 5, the QCD multijet prediction is extrapolated from the data in a nonoverlapping CR. The uncertainty in the QCD multijet background template shape is derived by taking the envelope of the shapes obtained when varying the lepton isolation requirement used to define the multijet-enriched CR. A 50% uncertainty in the QCD multijet background normalization is also included.
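As an illustration of the treatment of the limited number of simulated events listed above, the following sketch builds alternative templates by shifting one bin at a time by its statistical uncertainty. Treating each bin as an independent variation and truncating negative yields at zero are assumptions made here; the function name is hypothetical.

```python
def mc_stat_variations(contents, errors):
    """Return one (up, down) template pair per bin, shifting only that bin by its uncertainty."""
    variations = []
    for i, err in enumerate(errors):
        up = list(contents)
        down = list(contents)
        up[i] += err
        down[i] = max(down[i] - err, 0.0)  # assumption: keep yields non-negative
        variations.append((up, down))
    return variations

# Hypothetical two-bin template with its per-bin statistical uncertainties
for up, down in mc_stat_variations([10.0, 4.0], [1.0, 0.5]):
    print(up, down)
```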

9.2 Theoretical uncertainties

The following theoretical uncertainties are considered in the analysis.

PDF:

The PDF uncertainties are evaluated by comparing the nominal distributions to those obtained when using the alternative PDFs of the NNPDF set, including \(\alpha _\mathrm {s}\) variations.

Factorization and renormalization scales:

To account for theoretical uncertainties, signal and background shape variations are built by changing the values of \(\mu _\mathrm {F}\) and \(\mu _{\mathrm {R}}\) from their defaults by factors of 2 or 1/2 in the ME calculation, simultaneously for \(\mu _\mathrm {F}\) and \(\mu _{\mathrm {R}}\), but independently for each simulated sample.

Signal acceptance:

A 5% uncertainty on the signal yield is assigned to account for differences between the prediction for the LO signal with respect to the NLO predictions of the vbfnlo generator (v2.6.3).

Normalization of top quark and diboson backgrounds:

Diboson and top quark production processes are modeled with MC simulations. An uncertainty in the normalization of these backgrounds is assigned based on the PDF and \(\mu _\mathrm {F}\), \(\mu _{\mathrm {R}}\) uncertainties, following calculations in Refs. [42, 43, 47].

Interference between \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) and \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\):

An overall normalization and a shape uncertainty are assigned to the interference term in the fit, based on an envelope of predictions with different \(\mu _\mathrm {F}\), \(\mu _{\mathrm {R}}\) scales.

Parton showering model:

The uncertainty in the PS model and the event tune is assessed as the full difference of the acceptance and shape predictions using pythia and herwig++.

\(R(p_{\mathrm {T}})\) correction:

As described in Sect. 6, the \(R(p_{\mathrm {T}})\) prediction from \(\mathrm{W} \)+jets simulation is corrected to match the distribution in data with all expected contributions other than \(\mathrm{W} \)+jets subtracted. The uncertainty in this correction is derived by varying the fitted points within the statistical uncertainty from data and simulation combined and refitting the correction.

10 Measurement of the \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) production cross section

The signal strength, defined with the \(\ell \nu \)jj final state in the kinematic region described in Sect. 3, is extracted from the fit to the BDT output distribution as discussed in Sect. 8. Figure 9 shows the BDT’ distribution in the muon and electron channels for data and simulation after the fit, where the grey uncertainty band includes all systematic uncertainties. Good agreement is observed between the data and simulation within the uncertainties.

Fig. 9
figure 9

Data compared with simulation for the BDT’ output distribution for the muon (upper) and electron (lower) channels, after the fit. The grey uncertainty band in the ratio panel includes all systematic uncertainties

In the muon channel, the signal strength is measured to be

$$\begin{aligned} \mu =0.91 \pm 0.02\,\text {(stat)} \pm 0.12\,\text {(syst)} =0.91\pm 0.12\,\text {(total)}, \end{aligned}$$

corresponding to a measured signal cross section

$$\begin{aligned} \begin{aligned} \sigma ({\mathrm {EW}~\ell \nu \mathrm {jj}})&= 6.22 \pm 0.12\,\text {(stat)} \pm 0.74\,\text {(syst)} \,\text {pb} \\&=6.22\pm 0.75 \,\text {(total)}\,\text {pb}. \end{aligned} \end{aligned}$$

In the electron channel, the signal strength is measured to be

$$\begin{aligned} \mu =0.92\pm 0.03\,\text {(stat)} \pm 0.13\,\text {(syst)} =0.92\pm 0.13 \,\text {(total)}, \end{aligned}$$

corresponding to a measured signal cross section

$$\begin{aligned} \begin{aligned} \sigma ({\mathrm {EW}~\ell \nu \mathrm {jj}})&= 6.27 \pm 0.19\,\text {(stat)} \pm 0.80\,\text {(syst)} \,\text {pb} \\&=6.27\pm 0.82\,\text {(total)}\,\text {pb}. \end{aligned} \end{aligned}$$

The results obtained for the different lepton channels are compatible with each other, and in agreement with the SM predictions.

From the combined fit of the two channels, the signal strength is measured to be

$$\begin{aligned} \mu =0.91\pm 0.02\,\text {(stat)} \pm 0.10\,\text {(syst)} =0.91\pm 0.10\,\text {(total)}, \end{aligned}$$

corresponding to a measured signal cross section

$$\begin{aligned} \begin{aligned} \sigma ({\mathrm {EW}~\ell \nu \mathrm {jj}})&= 6.23 \pm 0.12\,\text {(stat)} \pm 0.61\,\text {(syst)} \,\text {pb} \\ {}&=6.23\pm 0.62\,\text {(total)}\,\text {pb}, \end{aligned} \end{aligned}$$

in agreement with the MadGraph 5_amc@nlo LO prediction \(\sigma _\mathrm {LO}(\mathrm {EW}\,\ell \nu \mathrm {jj})=6.81^{+0.03}_{-0.06}\,\text {(scale)}\pm 0.26\,\text {(PDF)} \,\text {pb} \). In the combined fit, the DY strength is \(\upsilon =0.88\pm 0.07\). Using the statistical methodology described in Sect. 8, the background-only hypotheses in the electron, muon, and combined channels are all excluded with significance above five standard deviations. Table 2 lists the major sources of uncertainty and their impact on the measured precision of \(\mu \). The largest sources of experimental uncertainty are the \(m_\mathrm {jj} \) correction, the JES, and the limited number of simulated events, while the largest sources of theoretical uncertainty are the \(\mu _\mathrm {F}\), \(\mu _{\mathrm {R}}\) scale uncertainties and the uncertainty in the signal acceptance, derived by comparing the LO signal prediction with the prediction from the vbfnlo generator.

Table 2 Major sources of uncertainty in the measurement of the signal strength \(\mu \), and their impact. The total uncertainty is separated into four components: statistical, number of simulated events, experimental, and theory. The experimental and theory components are further decomposed into their primary individual uncertainty sources
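As a cross-check of the combined LO result quoted above, the short sketch below adds the statistical and systematic uncertainties in quadrature and converts the signal strength into a cross section by multiplying it by the LO prediction. The assumption that the two uncertainty components are uncorrelated, and the helper name, are introduced here for illustration.

```python
import math

def total_uncertainty(stat: float, syst: float) -> float:
    """Quadrature sum of two uncertainty components, assumed uncorrelated."""
    return math.hypot(stat, syst)

sigma_lo = 6.81                      # pb, LO prediction quoted in the text
mu, stat, syst = 0.91, 0.02, 0.10    # combined signal strength from the text

print(round(total_uncertainty(stat, syst), 2))   # total uncertainty in mu
print(round(total_uncertainty(0.12, 0.61), 2))   # total uncertainty in the cross section (pb)
print(round(mu * sigma_lo, 2))                   # cross section implied by the rounded mu (pb)
```

The product of the rounded signal strength and the LO prediction differs slightly from the quoted 6.23 pb, which is presumably due to the rounding of \(\mu \) to two decimal places.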

The signal strength is also measured with respect to the NLO signal prediction, as described in Sect. 3. In the muon channel, the signal strength is measured to be

$$\begin{aligned} \mu _{\mathrm {NLO}}=0.91 \pm 0.02\,\text {(stat)} \pm 0.12\,\text {(syst)} =0.91\pm 0.12\,\text {(total)}. \end{aligned}$$

In the electron channel, the signal strength is measured to be

$$\begin{aligned} \mu _{\mathrm {NLO}}=0.89 \pm 0.03\,\text {(stat)} \pm 0.12\,\text {(syst)} =0.89\pm 0.12\,\text {(total)}. \end{aligned}$$

From the combined fit of the two channels, the signal strength is measured to be

$$\begin{aligned} \mu _{\mathrm {NLO}}=0.90 \pm 0.02\,\text {(stat)} \pm 0.10\,\text {(syst)} =0.90\pm 0.10\,\text {(total)}, \end{aligned}$$

corresponding to a measured signal cross section

$$\begin{aligned} \begin{aligned} \sigma ({\mathrm {EW}~\ell \nu \mathrm {jj}})&= 6.07 \pm 0.12\,\text {(stat)} \pm 0.57\,\text {(syst)} \,\text {pb} \\ {}&=6.07\pm 0.58\,\text {(total)}\,\text {pb}, \end{aligned} \end{aligned}$$

in agreement with the powheg NLO prediction \(\sigma _\mathrm {NLO}(\mathrm {EW}\,\ell \nu \mathrm {jj})=6.74^{+0.02}_{-0.04}\,\text {(scale)}\pm 0.26\,\text {(PDF)} \,\text {pb} \).

11 Limits on anomalous gauge couplings

It is useful to look for signs of new physics via a model-independent EFT framework. In the framework of EFT, new physics can be described as an infinite series of new interaction terms organized as an expansion in the mass dimension of the operators.

In the EW sector of the SM, the first higher-dimensional operators containing bosons are six-dimensional [8]:

$$\begin{aligned} \begin{aligned} {\mathcal {O}}_{WWW}&= \frac{c_{WWW}}{\varLambda ^2}W_{\mu \nu }W^{\nu \rho }W_{\rho }^{\mu },\\ {\mathcal {O}}_{W}&= \frac{c_{W}}{\varLambda ^2}(D^{\mu }\varPhi )^{\dagger }W_{\mu \nu }(D^{\nu }\varPhi ),\\ {\mathcal {O}}_{B}&= \frac{c_{B}}{\varLambda ^2}(D^{\mu }\varPhi )^{\dagger }B_{\mu \nu }(D^{\nu }\varPhi ),\\ \widetilde{{\mathcal {O}}}_{WWW}&= \frac{{\widetilde{c}}_{WWW}}{\varLambda ^2}{\widetilde{W}}_{\mu \nu }W^{\nu \rho }W_{\rho }^{\mu },\\ \widetilde{{\mathcal {O}}}_{W}&= \frac{{\widetilde{c}}_{W}}{\varLambda ^2}(D^{\mu }\varPhi )^{\dagger }{\widetilde{W}}_{\mu \nu }(D^{\nu }\varPhi ), \end{aligned} \end{aligned}$$
(4)

where, as is customary, group indices are suppressed and the mass scale \(\varLambda \) is factorized from the coupling constants c. In Eq. (4), \(W_{\mu \nu }\) is the SU(2) field strength, \(B_{\mu \nu }\) is the U(1) field strength, \(\varPhi \) is the Higgs doublet, and operators with a tilde are the magnetic duals of the field strengths. The first three operators are charge and parity conserving, whereas the last two are not. Models with operators that preserve charge conjugation and parity symmetries can be included in the calculation either individually or in pairs. With these assumptions, the values of coupling constants divided by the mass scale \(c/\varLambda ^2\) are measured.

These operators have a rich phenomenology since they contribute to many multiboson scattering processes at tree level. The operator \({\mathcal {O}}_{WWW}\) modifies vertices with three to six vector bosons, whereas the operators \({\mathcal {O}}_{W}\) and \({\mathcal {O}}_{B}\) modify both the HVV vertices and vertices with three or four vector bosons. A more detailed description of the phenomenology of these operators can be found in Ref. [69]. Modifications to the ZWW and \(\gamma \)WW vertices are investigated in this analysis, since these modify the \( \mathrm{p} \mathrm{p} \rightarrow \mathrm{W} \mathrm {jj} \) cross section.

Previously, modifications to these vertices have been studied using anomalous trilinear gauge couplings (ATGCs) [70]. The relationship between the dimension-six operators in Eq. (4) and ATGCs can be found in Ref. [8]. The most stringent limits on ATGC parameters were previously set by LEP [71], CDF [72], D0 [73], ATLAS [74, 75], and CMS [76, 77].

11.1 Statistical analysis

The measurement of the coupling constants uses templates in the \(p_{\mathrm {T}}\) of the lepton from the \(\mathrm{W} \rightarrow \ell \nu \) decay. Because this variable is well measured and invariant under longitudinal Lorentz boosts, it is robust against mismodeling and well suited for this purpose. An additional requirement of \(\mathrm {BDT} >0.5\), optimized for the expected sensitivity to the ATGC signal, is applied; it improves the expected limits by 20–25% with respect to the expected limits without a BDT selection. In each channel, four bins in the range \( 0< p_{\mathrm {T}} ^\ell < 1.2 \,\text {Te}\text {V} \) are used, where the last bin contains overflow and its lower edge has been optimized separately for each channel.

For each signal MC event, 125 weights are assigned that correspond to a \(5{\times } 5{\times } 5\) grid in \((c_{WWW}/\varLambda ^2,\, c_{W}/\varLambda ^2,\, c_{B}/\varLambda ^2)\). Equal bins are used in the intervals \([-15, 15]\,\text {Te}\text {V} ^{-2}\) for \(c_{WWW}/\varLambda ^2\), \([-40, 40]\,\text {Te}\text {V} ^{-2}\) for \(c_{W}/\varLambda ^2\), and \([-175, 175]\,\text {Te}\text {V} ^{-2}\) for \(c_{B}/\varLambda ^2\).
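A possible layout of such a grid is sketched below. It assumes that the five points per axis are equally spaced and include the interval endpoints, so that the SM point (0, 0, 0) is part of the grid; the exact placement of the points is not specified in the text, and the names used here are hypothetical.

```python
from itertools import product

def linspace(lo, hi, n):
    """n equally spaced values from lo to hi, endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

# Intervals in TeV^-2, as quoted in the text
RANGES = {"cWWW": (-15.0, 15.0), "cW": (-40.0, 40.0), "cB": (-175.0, 175.0)}

axes = [linspace(lo, hi, 5) for lo, hi in RANGES.values()]
grid = list(product(*axes))  # one (cWWW, cW, cB) tuple per event weight

print(len(grid))
print(axes[0])
```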

To construct the \(p_{\mathrm {T}} ^\ell \) templates, the associated weights calculated for each event are used to construct a parametrized model of the expected yield in each bin as a function of the values of the dimension-six operators’ coupling constants. For each bin, the ratios of the expected signal yield with dimension-six operators to the one without (leaving only the SM contribution) are fitted at each point of the grid to a quadratic polynomial. The highest \(p_{\mathrm {T}} ^\ell \) bin has the largest statistical power to detect the presence of higher-dimensional operators. Figure 10 shows examples of the final templates, with the expected signal overlaid on the background expectation, for three different hypotheses of dimension-six operators. The SM distribution is normalized to the expected cross section.
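The quadratic parametrization can be illustrated in one dimension with the sketch below, which fits the yield ratio in a single \(p_{\mathrm {T}} ^\ell \) bin as a function of one coupling. The ratio values are generated from a hypothetical quadratic and are not analysis results; the full model depends on three couplings, whereas only a one-parameter slice is shown. The function name is an assumption.

```python
import numpy as np

def fit_yield_ratio(couplings, ratios):
    """Least-squares fit of R(c) = r0 + a*c + b*c^2; returns (r0, a, b)."""
    b, a, r0 = np.polyfit(couplings, ratios, deg=2)  # highest power first
    return r0, a, b

# Five grid points for one coupling (TeV^-2) and hypothetical ratios to the SM yield
c = np.linspace(-15.0, 15.0, 5)
ratio = 1.0 + 0.002 * c + 0.004 * c**2

r0, a, b = fit_yield_ratio(c, ratio)
print(r0, a, b)
```

A quadratic form is the natural choice because the amplitude is linear in the coupling, so the yield contains an SM term, an interference term linear in the coupling, and a pure anomalous term quadratic in it.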

A simultaneous binned fit for the values of the ATGCs is performed in the two lepton channels. A profile likelihood method, the Wald Gaussian approximation, and Wilks’ theorem [78] are used to derive confidence intervals at 95% confidence level (CL). One-dimensional and two-dimensional limits are derived on each of the three ATGC parameters and each combination of two ATGC parameters while all other parameters are set to their SM values. Systematic and theoretical uncertainties are represented by the individual nuisance parameters with log-normal distributions and are profiled in the fit.
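In the asymptotic regime, Wilks' theorem implies that the confidence intervals correspond to thresholds on \(-2\varDelta \ln L\) given by quantiles of a \(\chi ^2\) distribution with one or two degrees of freedom. The sketch below computes these thresholds and the resulting interval for a purely parabolic (Gaussian) likelihood; the function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def wilks_threshold(cl: float, ndof: int) -> float:
    """Threshold on -2*Delta(ln L) for confidence level cl and 1 or 2 parameters."""
    if ndof == 1:
        z = NormalDist().inv_cdf(0.5 + cl / 2.0)  # two-sided Gaussian quantile
        return z * z                              # chi2 quantile for one degree of freedom
    if ndof == 2:
        return -2.0 * math.log(1.0 - cl)          # chi2 quantile for two degrees of freedom
    raise ValueError("only 1 or 2 parameters are handled in this sketch")

def parabolic_interval(best_fit: float, sigma: float, cl: float = 0.95):
    """Interval for a Gaussian likelihood, -2*Delta(ln L) = ((c - best_fit) / sigma)^2."""
    half_width = sigma * math.sqrt(wilks_threshold(cl, 1))
    return best_fit - half_width, best_fit + half_width

print(wilks_threshold(0.95, 1), wilks_threshold(0.95, 2))
print(parabolic_interval(0.0, 1.0))
```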

11.2 Results

No significant deviation from the SM expectation is observed. Limits on the EFT parameters are reported and also translated into the equivalent parameters defined in an effective Lagrangian (LEP parametrization) in Ref. [79], without form factors: \(\lambda ^{\gamma } = \lambda ^{\mathrm{Z}} = \lambda \), \(\varDelta {\kappa ^{\mathrm{Z}}} = \varDelta {g_1^{\mathrm{Z}}}-\varDelta {\kappa ^\gamma } \, \tan ^2\theta _{{\mathrm{W}}}\). The parameters \(\lambda \), \(\varDelta {\kappa ^{\mathrm{Z}}}\), and \(\varDelta {g_1^{\mathrm{Z}}}\) are considered, where the \(\varDelta \) symbols represent deviations from their respective SM values.

Results for the one-dimensional limits are listed in Table 3 for \(c_{WWW}\), \(c_W\), and \(c_B\), and in Table 4 for \(\lambda \), \(\varDelta g_{1}^{\mathrm{Z}}\), and \(\varDelta \kappa ^{\mathrm{Z}}\); two-dimensional limits are shown in Figs. 11 and 12. The results are dominated by the sensitivity in the muon channel due to the larger acceptance for muons. An ATGC signal is not included in the interference between EW and DY production; the effect of this omission on the limits is small (\({<}\)3%). The LHC semileptonic WZ analysis using 13\(\,\text {Te}\text {V}\) data currently sets the most stringent limits on \(c_{WWW}/\varLambda ^2\) and \(c_W/\varLambda ^2\), while the WW analysis using 8\(\,\text {Te}\text {V}\) data currently sets the tightest limits on \(c_B/\varLambda ^2\). This analysis is most sensitive to \(c_{WWW}/\varLambda ^2\), where the limit is slightly less restrictive but comparable.

Fig. 10
figure 10

Distributions of \(p_{\mathrm {T}} ^\ell \) in data and SM backgrounds, and various ATGC scenarios in the muon (left) and electron (right) channels, before the fit. For each ATGC scenario plotted a particular parameter is varied while the other ATGC parameters are fixed to zero. The lower panels show the ratio between data and prediction minus one with the statistical uncertainty from simulation (grey hatched band) as well as the leading systematic uncertainties in the shape of the \(p_{\mathrm {T}} ^\ell \) distribution

Table 3 One-dimensional limits on the ATGC EFT parameters at 95% CL
Table 4 One-dimensional limits on the ATGC effective Lagrangian (LEP parametrization) parameters at 95% CL
Fig. 11
figure 11

Expected and observed two-dimensional limits on the EFT parameters at 95% CL

Fig. 12
figure 12

Expected and observed two-dimensional limits on the ATGC effective Lagrangian (LEP parametrization) parameters at 95% CL

11.3 Combination with the VBF Z boson production analysis

As mentioned in Sect. 1, the closely related EW \(\mathrm{Zjj} \) process has been measured by CMS at \(\sqrt{s}=13\,\text {Te}\text {V} \) [15]. This result included constraints on ATGC EFT parameters obtained via a fit to the \(p_{\mathrm {T}} \)(Z) distribution, an experimentally clean observable sensitive to deviations from zero in the ATGC parameters. Both the EW \(\mathrm{Zjj} \) and EW \(\mathrm{W} \mathrm {jj}\) analyses are sensitive to anomalous couplings related to the WWZ vertex. A simultaneous binned likelihood fit for the ATGC parameters is performed to the \(p_{\mathrm {T}} \)(Z) distribution in EW \(\mathrm{Zjj} \) production and the \(p_{\mathrm {T}} ^\ell \) distribution in EW \(\mathrm{W} \mathrm {jj}\) production. In the combined fit, the primary uncertainty sources, including the JES and JER uncertainties, are correlated. Results for the one-dimensional limits are listed in Table 5 for \(c_{WWW}\), \(c_W\), and \(c_B\), and in Table 6 for \(\lambda \), \(\varDelta g_{1}^{\mathrm{Z}}\), and \(\varDelta \kappa ^{\mathrm{Z}}\); two-dimensional limits are shown in Figs. 13 and 14.

Table 5 One-dimensional limits on the ATGC EFT parameters at 95% CL from the combination of EW \(\mathrm{W} \mathrm {jj}\) and EW \(\mathrm{Zjj} \) analyses
Table 6 One-dimensional limits on the ATGC effective Lagrangian (LEP parametrization) parameters at 95% CL from the combination of EW \(\mathrm{W} \mathrm {jj}\) and EW \(\mathrm{Zjj} \) analyses
Fig. 13
figure 13

Expected and observed two-dimensional limits on the EFT parameters at 95% CL from the combination of EW \(\mathrm{W} \mathrm {jj}\) and EW \(\mathrm{Zjj} \) analyses

Fig. 14
figure 14

Expected and observed two-dimensional limits on the ATGC effective Lagrangian (LEP parametrization) parameters at 95% CL from the combination of EW \(\mathrm{W} \mathrm {jj}\) and EW \(\mathrm{Zjj} \) analyses

12 Study of the hadronic and jet activity in \(\mathrm{W} \)+jet events

Having established the presence of the SM signal, the properties of the hadronic activity in the selected events can be examined, in particular in the region in rapidity between the two tagging jets, where low hadronic activity is expected (rapidity gap). The production of additional jets in the rapidity gap, in a region with a larger contribution of \(\mathrm {EW}\,\mathrm{W} \mathrm {jj}\) processes, is explored in Sect. 12.1. Studies of the rapidity gap hadronic activity using track-only observables are presented in Sect. 12.2. Finally, a study of hadronic activity vetoes, using both PF jets and track-only observables, is presented in Sect. 12.3. A significant suppression of the hadronic activity in signal events is expected because the final-state objects originate from EW interactions, in contrast with the radiative QCD production of jets in \(\mathrm {DY}\,\mathrm{W} \mathrm {jj}\) events.

In all these studies, event distributions are shown with a requirement of BDT \(>0.95\) on the output value, which selects a signal-enriched region with a similar fraction of signal and background events. None of the BDT input observables listed in Sect. 8 are related to additional hadronic activity observables; as a consequence, the BDT output requirement introduces no bias in the additional hadronic activity observables. The reconstructed distributions are compared directly to the prediction obtained with a full simulation of the CMS detector. In the BDT \(>0.95\) region, the dominant uncertainty in the prediction from simulation is due to the limited number of generated events.

12.1 Jet activity studies in a high-purity region

For this study, aside from the two tagging jets used in the preselection, all PF jets with \(p_{\mathrm {T}} >15\,\text {Ge}\text {V} \) found within the pseudorapidity gap of the tagging jets, \(\eta ^\text {tag jet}_\text {min}< \eta < \eta ^\text {tag jet}_\text {max}\), are used. For the estimation of the background contributions, the normalizations obtained from the fit discussed in Sect. 10 are used.

The \(p_{\mathrm {T}}\) of the leading additional jet in \(\mathrm{W} \mathrm {jj}\) events, as well as the scalar \(p_{\mathrm {T}}\) sum (\(H_{\mathrm {T}} \)) of all additional jets, are shown in Figs. 15 and 16, comparing data and simulations including the signal prediction from MadGraph 5_amc@nlo interfaced with either pythia or herwig++ parton showering. The comparison reveals a deficit in the simulation predictions with pythia parton showering for the rate of events with lower additional jet activity, whereas the tail of higher additional activity is generally in better agreement.

A suppression of additional jets is observed in data compared with the background-only simulation shapes. In the simulation of the signal, the additional jets are produced by the PS (see Sect. 3), so studying these distributions provides insight on the PS model in the rapidity gap region.
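The two jet-activity observables used in this study can be sketched as follows, assuming that each additional jet is represented by a hypothetical (\(p_{\mathrm {T}}\), \(\eta \)) pair and that the tagging jets have already been removed from the list. Assigning a value of zero to events without any additional jet is an assumption made for this illustration.

```python
def gap_jet_activity(tag_etas, jets, pt_min=15.0):
    """Return (leading pT, HT) of additional jets lying between the two tagging jets.

    tag_etas: pseudorapidities of the two tagging jets
    jets:     (pt, eta) pairs of the additional jets, pT in GeV
    """
    eta_lo, eta_hi = min(tag_etas), max(tag_etas)
    gap_pts = [pt for pt, eta in jets if pt > pt_min and eta_lo < eta < eta_hi]
    leading = max(gap_pts, default=0.0)  # assumption: 0 when no additional jet is found
    return leading, sum(gap_pts)

# Hypothetical event: only the first and last jets are in the gap and above threshold
event_jets = [(40.0, 0.1), (20.0, -3.0), (10.0, 1.0), (18.0, 2.0)]
print(gap_jet_activity((-2.5, 3.0), event_jets))
```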

Fig. 15
figure 15

Leading additional jet \(p_{\mathrm {T}}\) (\(p_{\mathrm {T}}\) (j3)) for \(\mathrm {BDT} > 0.95 \) in the muon (left) and electron (right) channels including the signal prediction from MadGraph 5_amc@nlo interfaced with pythia parton showering (upper) and herwig++ parton showering (lower). In all plots the last bin contains overflow events, and the first bin contains events where no additional jet with \(p_{\mathrm {T}}\) \(>15\) \(\,\text {Ge}\text {V}\) is present

Fig. 16
figure 16

Total \(H_{\mathrm {T}}\) of the additional jets for \(\mathrm {BDT} > 0.95 \) in the muon (left) and electron (right) channels including the signal prediction from MadGraph 5_amc@nlo interfaced with pythia parton showering (upper) and herwig++ parton showering (lower). In all plots the last bin contains overflow events, and the first bin contains events where no additional jet with \(p_{\mathrm {T}}\) \(>15\) \(\,\text {Ge}\text {V}\) is present

12.2 Study of charged hadron activity

For this study, a collection is formed of high-purity tracks [80] with \(p_{\mathrm {T}} > 0.3\,\text {Ge}\text {V} \), uniquely associated with the main PV in the event. Tracks associated with the lepton or with the tagging jets are excluded from the selection. The association between the selected tracks and the reconstructed PVs is carried out by minimizing the longitudinal impact parameter, which is defined as the z-distance between the PV and the point of closest approach of the track helix to the PV, labeled \(d_z^\mathrm {PV}\). The association is required to satisfy the conditions \(d_z^\mathrm {PV}<2\,\text {mm} \) and \(d_z^\mathrm {PV}<3\delta d_z^\mathrm {PV}\), where \(\delta d_z^\mathrm {PV}\) is the uncertainty in \(d_z^\mathrm {PV}\).
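The track-to-vertex association can be sketched as below. For simplicity, a single z coordinate per track is used for all vertices and \(d_z^\mathrm {PV}\) is taken as an absolute distance; both are assumptions of this illustration, as are the function and argument names.

```python
def is_main_pv_track(track_z, dz_err, pv_z, main_index=0, max_dz=2.0, max_signif=3.0):
    """Check whether a track is associated with the main PV (all lengths in mm).

    The track is assigned to the vertex with the smallest |dz| and is kept only if
    that vertex is the main PV and dz < max_dz and dz < max_signif * dz_err.
    """
    dz = [abs(track_z - z) for z in pv_z]
    best = min(range(len(pv_z)), key=dz.__getitem__)  # index of the closest vertex
    return best == main_index and dz[best] < max_dz and dz[best] < max_signif * dz_err

# Hypothetical vertices at z = 0 mm (main PV) and z = 10 mm (pileup vertex)
vertices = [0.0, 10.0]
print(is_main_pv_track(0.5, 0.3, vertices))  # close to the main PV and within 3 sigma
print(is_main_pv_track(0.5, 0.1, vertices))  # fails the significance requirement
print(is_main_pv_track(9.0, 1.0, vertices))  # closest to the pileup vertex
```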

A collection of “soft-track” jets is defined by clustering the selected tracks using the anti-\(k_{\mathrm {T}}\) clustering algorithm [51] with a distance parameter of \(R=0.4\). The use of track jets represents a clean and well-understood method [81] to reconstruct jets with energy as low as a few \(\,\text {Ge}\text {V}\). These jets are not affected by pileup because of the association of the constituent tracks with the hard scattering vertex [82].

Track jets of low \(p_{\mathrm {T}}\) and within \(\eta ^\text {tag jet}_\text {min}< \eta < \eta ^\text {tag jet}_\text {max} \) are considered for the study of the hadronic activity between the tagging jets, and referred to as “soft activity” (SA). For each event, the scalar \(p_{\mathrm {T}}\) sum of the soft-track jets with \(p_{\mathrm {T}}\) \(>1\) \(\,\text {Ge}\text {V}\) is computed, and referred to as the “soft \(H_{\mathrm {T}} \)” variable. Figures 17 and 18 show the distribution of the leading soft-track jet \(p_{\mathrm {T}}\) and soft \(H_{\mathrm {T}}\) in the signal-enriched region (BDT \(>0.95\)), for the electron and muon channels, compared to predictions from pythia and herwig++ PS models. The plots show some disagreement between the data and the predictions, in particular in the regions of small additional activity, when compared with the pythia predictions.

Fig. 17
figure 17

Leading additional soft-activity (SA) jet \(p_{\mathrm {T}}\) for BDT \(> 0.95\) in the muon (left) and electron (right) channels including the signal prediction from MadGraph 5_amc@nlo interfaced with pythia parton showering (upper) and herwig++ parton showering (lower)

Fig. 18
figure 18

Total soft activity (SA) jet \(H_{\mathrm {T}}\) for BDT \(> 0.95\) in the muon (left) and electron (right) channels including the signal prediction from MadGraph 5_amc@nlo interfaced with pythia parton showering (upper) and herwig++ parton showering (lower). In all plots the last bin contains overflow events

12.3 Study of hadronic activity vetoes

The efficiency of a hadronic activity veto corresponds to the fraction of events with a measured gap activity below a given threshold. This efficiency is studied as a function of the applied threshold for various gap activity observables. The veto thresholds studied here start at 15\(\,\text {Ge}\text {V}\) for gap activities measured with standard PF jets, while they go down to 1\(\,\text {Ge}\text {V}\) for gap activities measured with soft-track jets.
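The definition of the veto efficiency can be written as a short sketch. The optional per-event weights, which would be needed for simulated samples, and the example values are assumptions introduced here.

```python
def veto_efficiency(activities, threshold, weights=None):
    """Fraction of events whose gap activity (in GeV) lies below the threshold."""
    if not activities:
        raise ValueError("at least one event is required")
    if weights is None:
        weights = [1.0] * len(activities)  # unweighted events, as for data
    passed = sum(w for a, w in zip(activities, weights) if a < threshold)
    return passed / sum(weights)

# Hypothetical gap activities for four events, scanned over several thresholds
gap_activity = [0.0, 5.0, 20.0, 40.0]
for thr in (15.0, 30.0, 50.0):
    print(thr, veto_efficiency(gap_activity, thr))
```

By construction the efficiency is a non-decreasing function of the threshold and approaches one once the threshold exceeds the largest activity in the sample.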

Figure 19 shows the gap activity veto efficiency of combined muon and electron events in the signal-enriched region when placing an upper threshold on the \(p_{\mathrm {T}}\) of the additional third jet, on the \(H_{\mathrm {T}}\) of all additional jets, on the leading soft-activity jet \(p_{\mathrm {T}}\), or on the soft-activity \(H_{\mathrm {T}}\). The observed efficiency in data is compared to expected efficiencies for background-only events, and efficiencies for background plus signal events where the signal is modeled with pythia or herwig++. Data points clearly disfavor the background-only predictions and are in reasonable agreement with the presence of the signal with the herwig++ PS predictions for gap activities above 20\(\,\text {Ge}\text {V}\), while the signal with pythia PS seems to generally overestimate the gap activity. In the events with very low gap activity, in particular below 10\(\,\text {Ge}\text {V}\) as measured with the soft-track jets, the data indicate gap activities also below the herwig++ PS predictions. In addition, the expected efficiencies are included for background plus signal events where the signal is modeled with powheg (Sect. 3) with herwig++ PS. The powheg plus herwig++ prediction is in good agreement with the LO plus herwig++ prediction.

Fig. 19
figure 19

Hadronic activity veto efficiencies in the signal-enriched \(\mathrm {BDT}>0.95\) region for the muon and electron channels combined, as a function of the leading additional jet \(p_{\mathrm {T}}\) (upper left), additional jet \(H_{\mathrm {T}}\) (upper right), leading soft-activity jet \(p_{\mathrm {T}}\) (lower left), and soft-activity jet \(H_{\mathrm {T}}\) (lower right). The data are compared with the background-only prediction as well as background+signal with pythia parton showering and background+signal with herwig++ parton showering. In addition, the background+signal prediction from powheg plus herwig++ parton showering is included. The uncertainty bands include only the statistical uncertainty in the prediction from simulation, and the data points include only the statistical uncertainty in data

13 Summary

The cross section for the electroweak production of a \(\mathrm{W} \) boson in association with two jets is measured in the kinematic region defined by a dijet invariant mass \(m_\mathrm {jj} >120\,\text {Ge}\text {V} \) and jet transverse momenta \(p_\mathrm {T j} > 25\,\text {Ge}\text {V} \). The data sample corresponds to an integrated luminosity of \(35.9~\mathrm {fb}^{-1}\) of proton–proton collisions at centre-of-mass energy \(\sqrt{s}=13\,\text {Te}\text {V} \) recorded by the CMS Collaboration at the LHC. The measured cross section \(\sigma _\mathrm {EW}(\mathrm{W} \mathrm {jj})= 6.23 \pm 0.12 \,\text {(stat)} \pm 0.61 \,\text {(syst)} \,\text {pb} \) agrees with the leading-order standard model prediction. This is the first observation of this process at \(\sqrt{s}=13\,\text {Te}\text {V} \).

A search is performed for anomalous trilinear gauge couplings (ATGCs) associated with dimension-six operators as given in the framework of an effective field theory. No evidence for ATGCs is found, and the corresponding 95% confidence level intervals on the coefficients of the dimension-six operators are \(-2.3< c_{{\mathrm{W} \mathrm{W} \mathrm{W}}}/\varLambda ^2 < 2.5\,\text {Te}\text {V} ^{-2}\), \(-8.8< c_{\mathrm{W}}/\varLambda ^2 < 16\,\text {Te}\text {V} ^{-2}\), and \(-45< c_{\mathrm {B}}/\varLambda ^2 < 46\,\text {Te}\text {V} ^{-2}\). These results are combined with previous results on the electroweak production of a Z boson in association with two jets, yielding the limit on the \(c_{{\mathrm{W} \mathrm{W} \mathrm{W}}}\) coupling \(-1.8< c_{{\mathrm{W} \mathrm{W} \mathrm{W}}}/\varLambda ^2 < 2.0\,\text {Te}\text {V} ^{-2}\).

The additional hadronic activity and the efficiencies for gap activity vetoes are studied in a signal-enriched region. Generally reasonable agreement is found between the data and the quantum chromodynamics predictions with the herwig++ parton shower and hadronization model, while the pythia model predictions typically show greater activity in the rapidity gap between the two tagging jets.