Measurement of the hadronic activity in events with a Z and two jets and extraction of the cross section for the electroweak production of a Z with two jets in pp collisions at sqrt(s) = 7 TeV

The first measurement of the electroweak production cross section of a Z boson with two jets (Zjj) in pp collisions at sqrt(s) = 7 TeV is presented, based on a data sample recorded by the CMS experiment at the LHC with an integrated luminosity of 5 inverse femtobarns. The cross section is measured for the lljj (l = e, mu) final state in the kinematic region m[ll]>50 GeV, m[jj]>120 GeV, transverse momenta pt[j]>25 GeV and pseudorapidity abs(eta[j])<4.0. The measurement, combining the muon and electron channels, yields sigma = 154 +/- 24 (stat.) +/- 46 (exp. syst.) +/- 27 (th. syst.) +/- 3 (lum.) fb, in agreement with the theoretical cross section. The hadronic activity, in the rapidity interval between the jets, is also measured. These results establish an important foundation for the more general study of vector boson fusion processes, of relevance for Higgs boson searches and for measurements of electroweak gauge couplings and vector boson scattering.


Introduction
The cross section for the electroweak (EW) production of a central W or Z boson in association with two jets that are well separated in rapidity is quite sizable at the Large Hadron Collider (LHC) [1]. These electroweak processes have been studied in the context of rapidity intervals in hadron collisions [2,3], as a probe of anomalous triple-gauge-boson couplings [4], and as a background to Higgs boson searches in vector boson fusion (VBF) processes [5][6][7][8]. There are three classes of diagrams to be considered in the EW production of W and Z bosons with two jets: VBF processes, bremsstrahlung, and multiperipheral processes. A full calculation reveals a large negative interference between the pure VBF process and the other two categories [1,3]. Figure 1 shows representative Feynman diagrams for these EW ℓℓqq production processes. A representative Feynman diagram for Drell-Yan (DY) production in association with two jets is shown in figure 2. This process is the dominant background in the extraction of the EW ℓℓqq cross section. In what follows we designate as "tagging jets" the jets that originate from the fragmentation of the outgoing quarks in the EW processes shown in figure 1. The study of these processes establishes an important foundation for the more general study of vector boson fusion processes, of relevance for Higgs boson searches and for measurements of electroweak gauge couplings and vector-boson scattering. VBF Higgs boson production in proton-proton (pp) collisions at the LHC has been extensively investigated [9,10] as a way to discover the particle and measure its couplings [11][12][13]. Recent searches by the ATLAS and CMS collaborations for the standard model (SM) Higgs boson include analyses of the VBF final states [14,15].

JHEP10(2013)062
In particular, the study of the processes shown in figure 1 can improve our understanding of the selection of tagging jets as well as that of vetoing additional parton radiation between forward-backward jets in VBF searches [5][6][7][8]. The measurement of the electroweak production of the Zjj final state is also a precursor to the measurement of elastic vector boson pair scattering at high energy, an important physics goal for future analyses of LHC data.
In this work we measure the cross section for electroweak Z boson production in association with two jets in pp collisions at a center-of-mass energy of 7 TeV, where the Z boson decays into µ+µ− or e+e−, using a data sample collected in 2011 by the Compact Muon Solenoid (CMS) experiment with an integrated luminosity of 5.1 fb−1 for the µ+µ− mode and 5.0 fb−1 for the e+e− mode. We extract the cross section under the assumption that the theory correctly describes the shape of the kinematical distributions of the dominant background from Drell-Yan production in association with two jets.

The signal-to-background ratio for the cross section measurement is small. In order to confirm the presence of a signal, two methods of signal extraction are employed and two different jet algorithms are used. While providing a similar performance, these two types of jet algorithms use different methods to combine the information from the subdetectors, different energy corrections, and different methods to account for the energy from the additional minimum-bias events (pileup).
In a separate study, measurements of the hadronic activity in Drell-Yan events are presented. These include the level of hadronic activity in the rapidity interval between the two tagging jets and the properties of multi-jets in events with a Z boson.
The plan of the paper is as follows: in section 2 we describe the CMS detector, reconstruction, and event simulation; in section 3 we discuss the event selections; sections 4 and 5 are devoted to the study of the hadronic activity in Drell-Yan events; in section 6 we present the measurement of the cross section for the EW Zjj production; finally, in section 7 we summarize our main results.

CMS detector, reconstruction, and event simulation
A detailed description of the CMS detector can be found in ref. [16]. The CMS experiment uses a right-handed coordinate system, with the origin at the nominal interaction point, the x axis pointing to the center of the LHC ring, the y axis pointing up, and the z axis along the counterclockwise-beam direction as viewed from above. The polar angle θ is measured from the positive z axis and the azimuthal angle φ is measured in the x-y plane. The pseudorapidity η is defined as −ln[tan(θ/2)], which equals the rapidity y = (1/2) ln[(E + p_z)/(E − p_z)] for massless particles.
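The relation between pseudorapidity and rapidity quoted above can be checked numerically. The following is a minimal sketch, assuming the standard definition y = (1/2) ln[(E + p_z)/(E − p_z)]; the function names and the example momentum are illustrative and not taken from the analysis software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln[tan(theta/2)], with theta the polar angle from the +z axis."""
    return -math.log(math.tan(theta / 2.0))

def rapidity(energy, pz):
    """y = 0.5 * ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

# Hypothetical massless particle: momentum magnitude p (GeV) at polar angle theta,
# so that E = p and pz = p * cos(theta).
p, theta = 50.0, 0.3
eta = pseudorapidity(theta)
y = rapidity(p, p * math.cos(theta))
```

For a massive particle the two quantities differ, which is why the text restricts the equality to the massless case.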
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter providing a field of 3.8 T. Within the field volume are a silicon pixel and strip tracker, a crystal electromagnetic calorimeter (ECAL), and a brass/scintillator hadron calorimeter (HCAL) providing coverage for pseudorapidities |η| < 3. The forward calorimeter modules extend the coverage of hadronic jets up to |η| < 5. Muons are measured in gas-ionization detectors embedded in the steel magnetic flux return yoke.
The first level (L1) of the CMS trigger system, composed of custom hardware processors, uses information from the calorimeters and the muon detectors to select the most interesting events. The high-level-trigger processor farm further decreases the event rate from ∼100 kHz of L1 accepts to a few hundred Hz, before data storage.
Muons are reconstructed [17] by fitting trajectories based on hits in the silicon tracker and the muon system. Electrons are reconstructed [18] from clusters of energy deposits in the ECAL matched to tracks in the silicon tracker.
Two different types of jets are used in the analysis: jet-plus-track (JPT) and particle-flow (PF) jets [19]. The JPT jets are reconstructed calorimeter jets whose energy response and resolution are improved by incorporating tracking information according to the JPT algorithm [20]. Calorimeter jets are first reconstructed from energy deposits in the calorimeter towers clustered with the anti-kT jet algorithm [21,22] with a distance parameter of 0.5. Charged-particle tracks are associated with each jet, based on the spatial separation in η-φ between the jet axis and the track momentum vector measured at the interaction vertex. The associated tracks are projected onto the surface of the calorimeter and classified as in-cone tracks if they point within the jet cone around the jet axis. The tracks bent out of the jet cone due to the magnetic field are classified as out-of-cone tracks. The momenta of the charged tracks are used to improve the measurement of the energy of the associated calorimeter jet. For in-cone tracks the expected average energy deposition in the calorimeters is subtracted and the energy of the tracks (assuming that they are charged pions) is added to the jet energy. For out-of-cone tracks the energy of the tracks is added directly to the jet energy. The direction of the jet is re-calculated with the tracks. As a result of the JPT algorithm, both the energy and the direction of the jet are improved.
The CMS particle flow algorithm [23,24] combines the information from all relevant CMS sub-detectors to identify and reconstruct particle candidates in the event: muons, electrons, photons, charged hadrons, and neutral hadrons. Charged hadrons are reconstructed from tracks in the tracker. Photons and neutral hadrons are reconstructed from energy clusters in the ECAL and HCAL, respectively, that are separate from the extrapolated position of tracks. A neutral particle overlapping with charged particles in the calorimeters is identified from a calorimeter energy excess with respect to the sum of the associated track momenta. Particle flow jets (PF jets) are reconstructed using the anti-k T jet algorithm with a distance parameter of 0.5, clustering particles identified by the particle flow algorithm.
The signal process for this analysis is the electroweak production of a dilepton pair in association with two jets (EW ℓℓjj, ℓ = e, µ). It is simulated with MadGraph version 5 [25,26] interfaced with pythia 6.4.25 [27] for parton showering (PS) and hadronization. The CTEQ6L1 parton distribution functions [28] are used in the event generation by MadGraph. The electroweak pp → ℓℓjj processes in MadGraph include WZ production where the W boson decays into two quarks and ZZ production where one of the Z bosons decays into two quarks. The requirement $m_{jj} > 120$ GeV applied at the MadGraph generation level reduces the contribution from these processes to a negligible level in the defined signal phase space. For the leading order generators, j stands for partons. For next-to-leading order calculations, a jet algorithm is applied to the final state partons and j stands for the parton jets.
Background Z+jets (labeled DY ℓℓjj) and ditop (tt) processes are generated with MadGraph via a matrix element (ME) calculation that includes up to four jets at parton level. The ME and parton shower (ME-PS) matching is performed following the ktMLM prescription [26]. The generation of the DY ℓℓjj background does not include the electroweak production of the Z boson with two jets. The diboson production processes WW, WZ, and ZZ are generated with pythia. The mcfm program [29] is also used for the evaluation of the theoretical uncertainty of the DY ℓℓjj background predictions. The dynamic scale $\mu_0 = \sum_{i=1}^{n} p_T^{i}$ with n final state particles (partons, not jets; n = 4, 5) is used with the QCD factorization and renormalization scales set equal, $\mu_F = \mu_R = \mu_0$.
Generated events are processed through the full CMS detector simulation based on Geant4 [30,31], followed by a detailed trigger emulation and the standard event reconstruction. Minimum-bias events are superimposed upon the hard interaction to simulate the effects of additional interactions per beam crossing (pileup). The multiplicity distribution of the pileup events in the simulation is matched with that observed in data. The pythia parameters for the underlying event were set according to the Z2 tune [32].
The signal cross section per lepton flavor, at next-to-leading order (NLO), is calculated to be $\sigma_{\text{NLO}}$(EW ℓℓjj) = 166 fb. The calculation is carried out with the vbfnlo program [33] with the factorization and renormalization scales set to $\mu_R = \mu_F = 90$ GeV and with CT10 parton distribution functions [34]. The calculation is performed in the following kinematical region: a dilepton invariant mass $m_{\ell\ell}$ above 50 GeV, jet transverse momentum $p_T^{j} > 25$ GeV, jet pseudorapidity $|\eta_j| < 4$, and dijet invariant mass $m_{jj} > 120$ GeV. The kinematic distributions for the signal generated by vbfnlo at leading order agree with those produced by the MadGraph generator.
The interference effects between the EW ℓℓjj and DY ℓℓjj production processes were evaluated with the MadGraph, sherpa [35], CompHEP [36], and vbfnlo programs by the authors of these programs and were found to be negligible.

Event selection
For the muon channel, the candidate events were selected by a trigger that required the presence of two muons. The requirement applied by the trigger on the muon transverse momenta changed with increasing instantaneous luminosity. As a consequence, the analyzed data sample is divided into three sets corresponding to the following different thresholds: (i) both muons have p µ T > 7 GeV, (ii) p µ 1 T > 13 GeV and p µ 2 T > 8 GeV, and (iii) p µ 1 T > 17 GeV and p µ 2 T > 8 GeV. Events in the electron channel were selected by a trigger that required the presence of two electrons with p e 1 T > 17 GeV and p e 2 T > 8 GeV. Offline, the muon candidates used in the analysis are identified by an algorithm [17], which starts from the tracks measured in the muon chambers, and then matches and combines them with the tracks reconstructed in the inner tracker. Muons from the in-flight decays of hadrons and punch-through particles are suppressed by applying a requirement on the goodness-of-fit over the number of degrees of freedom, χ 2 /dof < 10, of the global fit including the hits in the tracker and muon detectors.
In order to ensure a precise estimate of momentum and impact parameter, only tracks with more than 10 hits in the inner tracker and at least one hit in the pixel detector are used. We require hits in at least two muon detectors, to ensure a precise momentum estimate at the trigger level, and to suppress remaining background from misidentified muon candidates. Cosmic muons are rejected by requiring a transverse impact parameter distance to the beam spot position of less than 2 mm. These selection criteria provide an efficiency of 96% for prompt muons with p T > 20 GeV. The efficiency is defined as a ratio where the denominator is the number of generated muons with p T > 20 GeV within the geometrical acceptance and the numerator is the number of those muons that pass the selection criteria described above.
The electron candidates are required to pass a set of criteria which is 90% efficient for prompt electrons with pT > 20 GeV [18]. The electron identification variables used in the selection are (i) the spatial distance between the track and the associated ECAL cluster, (ii) the size and the shape of the shower in ECAL, and (iii) the hadronic leakage. The track transverse impact parameter is used to discriminate electrons from conversions. Tracks from conversions have, on average, a greater distance to the beam axis. In order to reject electrons from conversions, candidates are allowed to have at most one missing hit among those expected in the innermost tracker layers.
Electrons and muons from heavy-flavor decays and contained in hadronic jets are suppressed by imposing a restriction on the presence of additional tracks around their momentum direction. The additional tracks are summed in a cone of radius $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} < 0.3$ around the lepton candidate. Only tracks consistent with originating from the vertex corresponding to the hardest proton-proton scattering are used in the evaluation, so as to be insensitive to contributions from pileup interactions in the same bunch crossing. A relative isolation variable, $I_{\text{trk}} = \sum p_T^{\text{trk}} / p_T$, is evaluated for each lepton.
The dimuon channel selection "Zµµ" is defined by the following set of requirements: the two highest-pT muons must have pT > 20 GeV, |η| < 2.4, and must satisfy the muon quality criteria described above. The muons are required to have opposite charge and have a relative isolation of $I_{\text{trk}} < 0.1$. The dimuon invariant mass is required to be within ±15 GeV of the Z boson mass $m_Z = 91.2$ GeV.
The following set of requirements defines the "Zee" dielectron selection: the two highest-pT electrons must have pT > 20 GeV, |η| < 2.4, and satisfy the electron quality criteria described previously. The electrons are required to have opposite charge and a relative isolation of $I_{\text{trk}} < 0.1$.
The dielectron invariant mass is required to be within ±20 GeV of m Z , a larger mass range than that for m µµ since the dielectron Z-peak is wider because of electron bremsstrahlung effects in the tracker material.
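The track-based relative isolation and the Z mass windows described above can be sketched as follows. This is an illustrative sketch only: the dictionary layout of the lepton and track records is an assumption, and the track list is assumed to already exclude the lepton's own track and to contain only tracks compatible with the hard-scattering vertex.

```python
import math

M_Z = 91.2  # GeV, Z boson mass used in the selection

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def track_isolation(lepton, tracks, cone=0.3):
    """Sum of track pT inside the cone, divided by the lepton pT."""
    pt_sum = sum(t["pt"] for t in tracks
                 if delta_r(lepton["eta"], lepton["phi"], t["eta"], t["phi"]) < cone)
    return pt_sum / lepton["pt"]

def in_z_window(m_ll, half_width):
    """True if the dilepton mass is within +-half_width (GeV) of M_Z."""
    return abs(m_ll - M_Z) < half_width

# Hypothetical lepton and nearby tracks (values invented for illustration).
lepton = {"pt": 40.0, "eta": 0.5, "phi": 3.1}
tracks = [
    {"pt": 1.5, "eta": 0.6, "phi": -3.1},  # close in phi after wrap-around
    {"pt": 5.0, "eta": 1.5, "phi": 3.0},   # outside the cone
    {"pt": 1.0, "eta": 0.5, "phi": 3.0},   # inside the cone
]
iso = track_isolation(lepton, tracks)
```

The wrap-around in `delta_phi` matters for leptons near φ = ±π; without it the first track above would wrongly fall outside the cone. The windows in the text correspond to `half_width=15` for muons and `half_width=20` for electrons.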
The two highest-pT jets in the event with $|\eta_j| < 4.7$ (labeled $j_1$ and $j_2$) are selected as the tagging jets. The selection criteria are optimized by maximizing the signal significance, defined as $N_S/\sqrt{N_B}$, where $N_S$ and $N_B$ are the numbers of signal and background events passing the selection criteria, expected from the Monte Carlo (MC) signal and DY samples, with an integrated luminosity of 5 fb−1. The requirements on the momentum and pseudorapidity of the tagging jets ($p_T^{j_1}$, $p_T^{j_2}$, $\eta_j$), the dijet invariant mass ($m_{j_1 j_2}$), and the Z boson rapidity in the rest frame of the tagging jets, $y^* = y_Z - 0.5\,(y_{j_1} + y_{j_2})$, are varied in order to reach maximum signal significance. The optimized selection criteria shown in table 1, with the corresponding selection labels, result in an expected signal significance of about three for an integrated luminosity of 5 fb−1, for each of the dilepton channels.
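The two quantities used in the optimization, y* and the significance N_S/√N_B, are simple to compute. The sketch below also shows a one-dimensional threshold scan; the scan routine, the unweighted event counts, and all numbers are assumptions for illustration, whereas the analysis uses luminosity-normalized simulated yields and varies several requirements at once.

```python
import math

def y_star(y_z, y_j1, y_j2):
    """Z boson rapidity relative to the tagging jets: y* = y_Z - 0.5 (y_j1 + y_j2)."""
    return y_z - 0.5 * (y_j1 + y_j2)

def significance(n_signal, n_background):
    """Signal significance N_S / sqrt(N_B)."""
    return n_signal / math.sqrt(n_background)

def best_threshold(signal_values, background_values, thresholds):
    """Return (threshold, significance) maximizing N_S/sqrt(N_B) for 'value > threshold'."""
    best = None
    for cut in thresholds:
        n_s = sum(1 for v in signal_values if v > cut)
        n_b = sum(1 for v in background_values if v > cut)
        if n_b == 0:
            continue  # significance is undefined without background
        z = significance(n_s, n_b)
        if best is None or z > best[1]:
            best = (cut, z)
    return best

# Hypothetical dijet masses (GeV) for toy signal and background events.
toy_signal = [300.0, 500.0, 700.0, 900.0]
toy_background = [150.0, 200.0, 250.0, 350.0, 400.0, 800.0]
toy_best = best_threshold(toy_signal, toy_background, [100.0, 300.0, 600.0])
```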
The signal efficiencies for the Z µµ selection without additional requirements, with the tagging jet requirement TJ1, with the TJ1 and the Z boson rapidity requirement YZ, and with the TJ1, YZ, and TJ2 requirements are 0.36, 0.23, 0.17, and 0.06, respectively. These efficiencies are valid both in the case of JPT and PF jet reconstruction. The signal efficiencies for the dielectron channel are respectively 0.33, 0.21, 0.16, and 0.06. The efficiency is defined as a ratio where the denominator is the number of signal events generated by MadGraph with m jj > 120 GeV and the numerator is the number of events that passed the selections described above.

Table 1. Tagging jet selections: the optimized selection criteria with the corresponding selection labels.
Figure 3. Distribution of the absolute difference in the pseudorapidity of the tagging jets, $\Delta\eta_{j_1 j_2} = |\eta_{j_1} - \eta_{j_2}|$ (left) and the tagging jet pT for both jets, $j_1$ and $j_2$ (right) for the DY µµjj, EW µµjj, and VBF Higgs boson production processes.
The above event selection criteria are different from those suggested for Higgs boson searches in the VBF channel [5][6][7][8]. In particular, higher p T thresholds are used on the tagging jets, the rapidity separation between the tagging jets is not used, and a central jet veto is not applied. This is because the kinematics for EW Zjj production and VBF Higgs boson production are different. The former includes two additional contributions, bremsstrahlung and multiperipheral processes, as shown in figure 1. These additional processes and the interference between them lead to higher average jet transverse momenta in comparison to the VBF production process alone. This is due to the fact that the EW jj process involves transversely polarized W bosons, while the main contribution to VBF boson production involves longitudinally polarised W bosons. Figure 3 shows the simulated distributions of the absolute pseudorapidity difference of the two tagging jets, ∆η j 1 j 2 = |η j 1 − η j 2 | (left), and the tagging jets p T (right) for the DY µµjj, the EW µµjj, and the VBF Higgs boson production processes. The m j 1 j 2 distributions for EW µµjj production and VBF Higgs boson production processes are very similar and are not shown here.
The event selection is performed with JPT and PF jets in the dimuon channel and with PF jets in the dielectron channel. Table 2 shows the event yields after each selection step in the µ+µ− channel. The observed and expected numbers of events from signal and background processes are shown for the different selection requirements. The two jet algorithms result in similar yields. Table 3 shows the event yields after each selection step with PF jets in the e+e− channel.
Table 3. Event yields in the e+e− channel after each selection step for the data, the signal Monte Carlo and the backgrounds. The expected contributions from the signal and background processes are evaluated from simulation, for 5 fb−1 of integrated luminosity.
The uncertainty on the estimation of the dominant DY jj background from simulation is comparable with the expected number of signal events. The signal can therefore only be extracted by analyzing the distributions that are most sensitive to the difference between the signal and backgrounds.
Distributions for data and simulation after the Zµµ selection and jet tagging requirement TJ1 are shown in figures 4 and 5. In these and the following figures the histograms with the labels "DY" and "ttbar" show the contributions from the DY ℓℓjj and tt processes. The labels "WZ", "ZZ", and "WW" apply to the diboson production processes WZ, ZZ, and WW. The label "EW" shows the contribution from the signal process, EW Zjj.
[…] statistical uncertainties. The region between the two lines with the labels JES Up and JES Down shows the ±1σ uncertainty of the simulation prediction due to the jet energy scale (JES) uncertainty.
The ratio of the data to the expected contribution of the signal plus background is systematically below unity and outside the 1 σ JES uncertainty in some regions. However, it is consistent with unity within the systematic uncertainty in the MadGraph simulation of the dominant DY jj background. The systematic uncertainty due to the QCD scale is expected to be between the uncertainty given by the NLO and LO calculations, which are 8% and 25%, respectively, as calculated by the mcfm program. The choice of the QCD scale is discussed in section 6.2.
Figures 4 and 5 illustrate the overall level of agreement between data and simulation. It is evident from the figures that the signal fraction is small; this is why the extraction of the signal requires the special methods described in section 6.
In sections 4 and 5 we describe the measurements of the hadronic activity in the rapidity interval between the tagging jets and the measurements of the radiation patterns in multijet events in association with a Z boson. The selected data sample is dominated by DY jj events which are referred to as "DY Zjj events" in the following two sections.

Hadronic activity in the rapidity interval between tagging jets
A veto on the hadronic activity in the rapidity interval between the VBF tagging jets has been proposed [5][6][7][8] as a tool to suppress backgrounds in the searches for a Higgs boson produced in VBF. In the following, a study of the hadronic activity in this rapidity interval is presented. Although a veto is not used on the hadronic activity to select the EW jj process, the studies provided in this section and in section 5 can be considered as a test of the agreement between the data and the simulation for the dominant DY jj background. The data sample is selected with the Z µµ and Z ee requirements described in section 3. The requirements on the jets are described in this section and in section 5.

Central hadronic activity measurement using jets
The hadronic activity in the rapidity interval between the tagging jets is studied as a function of the pseudorapidity separation between the tagging jets, the pT threshold of the tagging jets, and the dijet invariant mass, $m_{j_1 j_2}$. The hadronic activity is measured through the efficiency of the central jet veto, defined as the fraction of selected events with no third jet ($j_3$) with $p_T^{j_3} > 20$ GeV in the pseudorapidity interval between the tagging jets, delimited by $\eta^{\text{tag jet}}_{\min}$ and $\eta^{\text{tag jet}}_{\max}$, the minimal and maximal pseudorapidities of the tagging jets. The central jets from pileup interactions are suppressed with the tracker information. Tables 4 and 5 show the efficiencies measured from the data and those obtained from the MadGraph DY ℓℓjj simulation for the different requirements on the pT of the tagging jets, the pseudorapidity separation between them, and their invariant mass. The measured efficiency is shown with the statistical uncertainties. The contribution of the EW Zjj, tt, and diboson processes is not subtracted from the data measurements since it does not […]. The veto efficiencies obtained from data and the MadGraph simulation are in good agreement.
Table 4. Efficiency of the central jet veto with $p_T^{j_3} > 20$ GeV for three different selections on the tagging jets for a pseudorapidity separation of $\Delta\eta_{j_1 j_2} > 3.5$, measured in data and predicted by the […]
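The central jet veto efficiency defined above is a simple counting ratio. The sketch below is illustrative: the event record layout is hypothetical, and the veto interval is taken to be the open interval between the two tagging-jet pseudorapidities, which is an assumption because the exact boundaries are not given in this text.

```python
def passes_central_jet_veto(tag_eta1, tag_eta2, other_jets, pt_min=20.0):
    """True if no additional jet with pT > pt_min lies between the tagging jets in eta."""
    eta_min, eta_max = sorted((tag_eta1, tag_eta2))
    return not any(j["pt"] > pt_min and eta_min < j["eta"] < eta_max
                   for j in other_jets)

def veto_efficiency(events):
    """Fraction of events that pass the central jet veto."""
    passed = sum(1 for ev in events
                 if passes_central_jet_veto(ev["tag_eta"][0], ev["tag_eta"][1],
                                            ev["other_jets"]))
    return passed / len(events)

# Hypothetical events (values invented for illustration).
toy_events = [
    {"tag_eta": (-2.0, 2.0), "other_jets": [{"pt": 30.0, "eta": 0.1}]},   # vetoed
    {"tag_eta": (-2.0, 2.0), "other_jets": [{"pt": 15.0, "eta": 0.0}]},   # too soft
    {"tag_eta": (1.0, 3.0), "other_jets": [{"pt": 40.0, "eta": -0.5}]},   # outside interval
    {"tag_eta": (-1.5, 2.5), "other_jets": []},                            # no extra jet
]
toy_efficiency = veto_efficiency(toy_events)
```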

Central hadronic activity measurement with track jets
As the hadronic activity in the rapidity interval between the tagging jets is expected to be small (soft) in the case of a purely electroweak Zjj production, the contribution from any additional pileup interaction in the event needs to be avoided or carefully subtracted. For this reason, an additional study of the interjet hadronic activity is performed using only charged tracks that clearly originate from the hard-scattering vertex in the event.
For this study a collection of tracks is built with reconstructed high-purity tracks [37] with p T > 300 MeV that are uniquely associated with the main primary vertex in the event. Tracks associated with the two leptons or with the tagging jets are not included. The association between the tracks and the reconstructed primary vertices is carried out by minimizing the longitudinal distance d z (PV) between the primary vertex (PV) and the point of closest approach of the track helix to that PV. The association is required to satisfy d z (PV) < 2 mm and d z (PV) < 3δd z (PV), where δd z (PV) is the uncertainty on d z (PV). The main primary vertex in the event is chosen to be that with the largest scalar sum of transverse momenta, for all tracks used to reconstruct it. A collection of "soft track jets" is built by clustering the tracks with the anti-k T clustering algorithm [22] with a distance parameter of 0.5. The use of track jets represents a clean and well understood method [38] to reconstruct jets with energy as low as a few GeV. Crucially, these jets are not affected by pileup because of the association of their tracks with the hard-scattering vertex [39].
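The track-to-vertex association and the choice of the main primary vertex can be sketched as below. This is a simplified illustration: the longitudinal distance is approximated by the difference of z coordinates, lengths are in mm, and the function names and inputs are hypothetical rather than taken from the CMS reconstruction software.

```python
def associate_track(track_z, track_dz_err, vertex_z, max_dz_mm=2.0, max_nsigma=3.0):
    """Index of the closest vertex in z if both requirements hold, else None.

    Requirements from the text: d_z < 2 mm and d_z < 3 * (uncertainty on d_z).
    """
    if not vertex_z:
        return None
    idx = min(range(len(vertex_z)), key=lambda i: abs(track_z - vertex_z[i]))
    dz = abs(track_z - vertex_z[idx])
    if dz < max_dz_mm and dz < max_nsigma * track_dz_err:
        return idx
    return None

def main_vertex(vertex_track_pts):
    """Index of the vertex with the largest scalar sum of track pT."""
    return max(range(len(vertex_track_pts)), key=lambda i: sum(vertex_track_pts[i]))

# Hypothetical vertex positions along z (mm) and the pT (GeV) of their tracks.
toy_vertex_z = [0.0, 15.0, -30.0]
toy_vertex_pts = [[1.0, 2.0], [20.0, 35.0], [0.5]]
```

Only tracks whose returned index equals `main_vertex(...)` would enter the soft track-jet clustering in this sketch.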
For the purpose of studying the central hadronic activity between the tagging jets, only soft track jets with pseudorapidity η tag jet min + 0.5 < η < η tag jet max − 0.5 are considered. The scalar sum (H T ) of the transverse momenta of up to three soft track jets is used as a monitor of the hadronic activity in the rapidity interval between the two jets. The soft H T distribution is shown in figure 6 for DY Zjj events for p j 1 ,j 2 T > 65, 40 GeV. The expectations from the simulation for the hadronic activity between the tagging jets are in good agreement with the data.
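The soft H_T observable can be sketched as follows. The selection of the three highest-pT track jets is an assumption (the text only says "up to three"), and the record layout and numbers are hypothetical.

```python
def soft_ht(tag_eta1, tag_eta2, track_jets, margin=0.5, max_jets=3):
    """Scalar pT sum of up to max_jets soft track jets between the tagging jets.

    Only track jets with eta_min + margin < eta < eta_max - margin are considered.
    """
    eta_min, eta_max = sorted((tag_eta1, tag_eta2))
    central = [j["pt"] for j in track_jets
               if eta_min + margin < j["eta"] < eta_max - margin]
    central.sort(reverse=True)  # assumption: keep the highest-pT track jets
    return sum(central[:max_jets])

# Hypothetical soft track jets (pT in GeV).
toy_track_jets = [
    {"pt": 4.0, "eta": 0.0},
    {"pt": 2.0, "eta": 1.0},
    {"pt": 1.0, "eta": -1.0},
    {"pt": 3.0, "eta": -0.5},
    {"pt": 10.0, "eta": 1.8},  # too close to the forward tagging jet
]
toy_ht = soft_ht(-2.5, 2.0, toy_track_jets)
```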
The evolution of the average $H_T$ for DY Zjj events as a function of the dijet invariant mass $m_{j_1 j_2}$ and the pseudorapidity difference $\Delta\eta_{j_1 j_2}$ between the tagging jets is shown in figure 7. For better visibility the symbols at each measured point are slightly displaced along the x axis. Good agreement is observed between the simulation and the data for the different mass and pseudorapidity intervals.

Measurements of the radiation patterns in multijet events in association with a Z boson
In hard multijet events in association with a Z boson, the observables referred to as "radiation patterns" are:
• the number of jets $N_j$;
• the total scalar sum ($H_T$) of jets with |η| […];
• the difference in the pseudorapidity, $\Delta\eta_{j_1 j_2}$, between the two most forward-backward jets (which are not necessarily the two highest-pT jets);
• the cosine of the azimuthal angle difference, $\cos|\phi_{j_1} - \phi_{j_2}| = \cos\Delta\phi_{j_1 j_2}$, between the two most forward-backward jets.
These observables are investigated following the prescriptions and suggestions in ref. [40], where the model dependence is estimated by comparing the predictions from mcfm [29], pythia, alpgen [41]+pythia, and the hej [42] programs.
The observables N j , H T , ∆η j 1 j 2 , and cos ∆φ j 1 j 2 are measured for jets with p T > 40 GeV. The events are required to satisfy the Z µµ and Z ee selection criteria. Figures 8 and 9 show the average number of jets and the average cos ∆φ j 1 j 2 as a function of the total H T and ∆η j 1 j 2 . The MadGraph + pythia (ME-PS) predictions are in reasonable agreement with the data.
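The four radiation-pattern observables can be computed per event as in the sketch below. It is an illustration only: the jet record layout is hypothetical, H_T is taken as the scalar sum of jet pT, and no pseudorapidity bound is applied to the H_T sum because that bound is not legible in this text.

```python
import math

def radiation_pattern(jets, pt_min=40.0):
    """Return N_j, H_T, delta-eta and cos(delta-phi) for jets above pt_min.

    delta-eta and cos(delta-phi) use the most forward and most backward jets,
    which need not be the two highest-pT jets.
    """
    selected = [j for j in jets if j["pt"] > pt_min]
    if len(selected) < 2:
        return None
    most_backward = min(selected, key=lambda j: j["eta"])
    most_forward = max(selected, key=lambda j: j["eta"])
    return {
        "n_jets": len(selected),
        "ht": sum(j["pt"] for j in selected),
        "delta_eta": most_forward["eta"] - most_backward["eta"],
        "cos_dphi": math.cos(most_forward["phi"] - most_backward["phi"]),
    }

# Hypothetical event (pT in GeV).
toy_jets = [
    {"pt": 100.0, "eta": 0.2, "phi": 0.0},
    {"pt": 60.0, "eta": -2.1, "phi": 3.0},
    {"pt": 45.0, "eta": 1.9, "phi": 3.0},
    {"pt": 30.0, "eta": 3.5, "phi": 1.0},  # below the pT threshold
]
toy_pattern = radiation_pattern(toy_jets)
```

In this toy event the highest-pT jet is central, so the forward-backward pair is formed by the second and third jets.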
Signal cross section measurement

Signal extraction using the dijet mass fit
The signal cross section in the µ + µ − channel is extracted from a fit of the m j 1 j 2 data distribution obtained after the Z µµ selection and requirements TJ1 and YZ described in section 3. The distribution is fitted to the DY µµjj background and the EW µµjj signal processes with MC templates. Figure 10 shows the m j 1 j 2 distribution where the expected contributions from the dominant DY µµjj background and the EW µµjj signal are evaluated from the fit, while the contributions from the small tt and diboson backgrounds are estimated from simulation.
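A two-template fit of a binned dijet mass distribution can be sketched as follows. The text does not specify the fit method, so this sketch assumes a weighted least-squares (chi-square) fit with Poisson variances taken from the data; the function name, the closed-form solution, and all histogram contents are hypothetical.

```python
def fit_two_templates(data, signal, background, fixed=None):
    """Fit data ~ s * signal + b * background + fixed, bin by bin.

    Returns (s, b), the scale factors of the signal and background templates.
    'fixed' holds contributions taken from simulation and not varied in the fit.
    """
    if fixed is None:
        fixed = [0.0] * len(data)
    a11 = a12 = a22 = c1 = c2 = 0.0
    for d, sig, bkg, fix in zip(data, signal, background, fixed):
        w = 1.0 / max(d, 1.0)  # inverse Poisson variance, guarded for empty bins
        r = d - fix
        a11 += w * sig * sig
        a12 += w * sig * bkg
        a22 += w * bkg * bkg
        c1 += w * sig * r
        c2 += w * bkg * r
    # Solve the 2x2 normal equations; det is nonzero when the shapes differ.
    det = a11 * a22 - a12 * a12
    s = (c1 * a22 - c2 * a12) / det
    b = (a11 * c2 - a12 * c1) / det
    return s, b

# Hypothetical four-bin templates (events per bin), rising signal and falling background.
toy_signal = [1.0, 3.0, 6.0, 8.0]
toy_background = [100.0, 60.0, 30.0, 10.0]
toy_fixed = [2.0, 2.0, 1.0, 1.0]
toy_data = [1.2 * s + 0.9 * b + f
            for s, b, f in zip(toy_signal, toy_background, toy_fixed)]
toy_s, toy_b = fit_two_templates(toy_data, toy_signal, toy_background, toy_fixed)
```

The fit can separate the two components only because their shapes differ across the mass range; if the templates were proportional, the determinant above would vanish.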

Systematic uncertainties
The sources and the absolute values of the systematic uncertainties on the estimated value of s, the ratio of measured over expected EW Zjj yields, are described below and summarized in table 6.
The following effects are taken into account in the extraction of the signal cross section from the fit of the $m_{j_1 j_2}$ distribution:
• The theoretical uncertainty on the $m_{j_1 j_2}$ shape for the dominant DY µµjj background process. The $m_{j_1 j_2}$ shape given by the NLO calculation of mcfm is used to correct the shape of MadGraph with jets built from partons and propagated to the reconstructed dijet mass with a procedure that matches the reconstructed and the parton jets. The fit is then repeated with the modified shape. The systematic uncertainty is taken as $s_{\text{NLO}} - s_{\text{MadGraph}}$, where $s_{\text{NLO}}$ and $s_{\text{MadGraph}}$ are the values of the parameter s extracted from the fit of the $m_{j_1 j_2}$ distribution given by MadGraph with and without corrections to the NLO shape. The uncertainty of the $m_{j_1 j_2}$ shape at NLO due to the uncertainties in the QCD factorization and renormalization scales, $\mu_F$ and $\mu_R$, is much smaller than the difference between the shapes given by MadGraph and the NLO calculations. The QCD scale in the NLO calculations is varied from $\mu_0/2$ to $2\mu_0$. The $m_{j_1 j_2}$ shape uncertainty due to the PDFs is found to be negligible.

Table 6. Sources and absolute values of the systematic uncertainties on the estimated ratio s of measured over expected EW Zjj yields. The simulation of the signal includes $m_{jj} > 120$ GeV.
• The theoretical uncertainty of the signal acceptance. The acceptance is obtained using the NLO calculation vbfnlo as well as using MadGraph. Since vbfnlo does not generate events that can be passed through the detector simulation, the following parton-level requirements, similar to those used in the analysis, were applied: $p_T^{\ell} > 20$ GeV, $|\eta^{\ell}| < 2.4$, $p_T^{j} > 50$ GeV, $|\eta^{j}| < 3.6$. The acceptance is calculated as the ratio of the cross section with parton-level selection to the cross section with the selection in the MadGraph simulation of the signal ($m_{jj} > 120$ GeV; see section 2). The 5% difference between the vbfnlo and MadGraph acceptances is taken as the systematic uncertainty. The $m_{jj}$ shapes given by the vbfnlo program and the MadGraph simulation are found to be very similar, and therefore the shape difference is not included in the signal modeling uncertainty. The signal acceptance used in the analysis is evaluated, however, with MadGraph, applying the selections as described in section 2.
• The uncertainty on the jet energy scale (JES). The m j 1 j 2 fit is repeated with events simulated with the jet energy varied by the JES uncertainty [19]. The difference between the values of the parameter s extracted from the fit with simulated events with the adjusted jet energy is taken as the systematic uncertainty.
• The uncertainty on the jet energy resolution (JER). The m j 1 j 2 fit is repeated with events simulated with the correction factor varied by the JER uncertainty [19]. The difference between the values of the parameter s extracted from the fit using simulated events with the adjusted data-to-simulation correction factor is taken as the systematic uncertainty.

• The uncertainty on the pileup modeling via re-weighting of the simulated events according to the distribution of the number of interactions per beam crossing. The distribution is re-evaluated with the total inelastic cross section varied by ±5% around the nominal value of 68 mb, based on a set of models consistent with the cross section measured by the CMS experiment [45].
• The uncertainty due to the limited number of events available in the simulated samples (MC statistics).
• The uncertainties on the expected yields of $t\bar{t}$ and diboson events corresponding to the theoretical cross section prediction uncertainties.
In addition, the following systematic uncertainties are included in the estimation of the cross section:
• the estimated 2.2% uncertainty on the integrated luminosity [46],
• the 1% uncertainty on the data-to-simulation correction factor for the efficiency of the lepton reconstruction, identification, isolation, and trigger, which is measured with $Z \to \ell\ell$ events.
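The pileup re-weighting listed among the uncertainties above can be sketched as follows. This is a minimal illustration, not the procedure used in the analysis: it assumes that the number of interactions per crossing follows a single Poisson distribution whose mean scales linearly with the inelastic cross section, and the means `SIM_MEAN` and `NOMINAL_MEAN` are hypothetical numbers chosen only for the example. In practice the target distribution would be built from the measured luminosity profile.

```python
import math

def poisson_pmf(mean, n_max):
    """Poisson probabilities for n = 0..n_max, renormalised to sum to 1."""
    probs = [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]
    total = sum(probs)
    return [p / total for p in probs]

def pileup_weights(target, simulated):
    """Per-multiplicity event weight: target probability / simulated probability."""
    return [t / s if s > 0 else 0.0 for t, s in zip(target, simulated)]

# Hypothetical inputs, for illustration only (not taken from the paper).
SIM_MEAN = 10.0      # mean number of interactions used to generate the simulation
NOMINAL_MEAN = 9.0   # mean assumed to correspond to the nominal 68 mb cross section
N_MAX = 50           # largest multiplicity considered

simulated = poisson_pmf(SIM_MEAN, N_MAX)

# Nominal weights plus the +/-5% variations of the inelastic cross section.
weights = {}
for label, scale in (("down", 0.95), ("nominal", 1.00), ("up", 1.05)):
    # Assumption: the mean multiplicity is proportional to the inelastic cross section.
    target = poisson_pmf(NOMINAL_MEAN * scale, N_MAX)
    weights[label] = pileup_weights(target, simulated)
```

Each simulated event with `n` interactions would then be weighted by `weights[label][n]`, and the fit repeated with the "up" and "down" weights to evaluate the change in $s$.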

Signal extraction using MVA analysis
The signal is extracted with multivariate analyses in the $\mu^+\mu^-$ and $e^+e^-$ channels. The events are required to pass the $Z_{\mu\mu}$ or $Z_{ee}$ selection criteria and the tagging jet requirement TJ1. A boosted decision tree with decorrelation (BDTD option in the tmva package [47]) is trained to give a high output value for signal-like events based on the following observables:
• $p_T^{j_1}$, $p_T^{j_2}$, $m_{j_1 j_2}$, $\Delta\eta_{j_1 j_2}$, and $y^*$ variables, as defined in section 3;
• $p_T^{\ell\ell}$: the $p_T$ of the dilepton system;
• $y_{\ell\ell}$: the rapidity of the dilepton system;
• $\eta_{j_1} + \eta_{j_2}$: the sum of the pseudorapidities of the two tagging jets;
• $\Delta\phi_{j_1 j_2}$: the azimuthal separation of the two tagging jets;
• $\Delta\phi(\ell\ell, j_1)$ and $\Delta\phi(\ell\ell, j_2)$: the azimuthal separations between the dilepton system and the two tagging jets.
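As an illustration of how the dijet inputs listed above can be computed from the tagging-jet kinematics, the sketch below treats each jet as a massless object described by $(p_T, \eta, \phi)$. The function and dictionary key names are hypothetical, and the massless approximation is an assumption made for this example. The variable $y^*$ is omitted because its definition is given in section 3 and is not reproduced here.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal separation folded into the interval [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def dijet_observables(jet1, jet2):
    """Compute dijet inputs from two jets given as (pt, eta, phi) tuples.

    Assumption: jets are massless, so
    m^2 = 2 * pt1 * pt2 * (cosh(delta_eta) - cos(delta_phi)).
    """
    pt1, eta1, phi1 = jet1
    pt2, eta2, phi2 = jet2
    dphi = delta_phi(phi1, phi2)
    mass_sq = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(dphi))
    return {
        "pt_j1": pt1,
        "pt_j2": pt2,
        "m_j1j2": math.sqrt(max(mass_sq, 0.0)),
        "deta_j1j2": abs(eta1 - eta2),
        "eta_sum": eta1 + eta2,
        "dphi_j1j2": dphi,
    }

# Hypothetical forward/backward jet pair (pt in GeV).
example = dijet_observables((100.0, 2.0, 0.0), (50.0, -2.0, math.pi))
```

The same `delta_phi` helper can be applied to the dilepton system and each tagging jet to obtain $\Delta\phi(\ell\ell, j_1)$ and $\Delta\phi(\ell\ell, j_2)$.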
In the $e^+e^-$ channel the gluon-quark likelihood values for the tagging jets are also used as inputs. In the DY $\ell\ell jj$ background about 50% of the jets originate from gluons, while in the EW $\ell\ell jj$ signal process the tagging jets are initiated only by quarks. A likelihood discriminator separates the gluon-originated jets from the quark-originated jets. The discriminator makes use of five internal jet properties, built from the jet constituents. These are related to the two angular spreads (root mean square) of the constituents in the $\eta$-$\phi$ plane, the asymmetry (pull) of the constituents with respect to the center of the jet, the multiplicity of the constituents, and the maximum energy fraction carried by a single constituent. The five input variables and the gluon-quark likelihood output have been validated using the multijet, Z+jet, and photon+jet samples, for which the relative differences between data and simulation are within 10%. To assess the systematic uncertainty from the use of this tool, the gluon-quark likelihood output in the simulated samples has been modified in accordance with the differences observed in the three samples. The use of the gluon-quark likelihood discriminator decreases the statistical uncertainty of the measured signal in the $e^+e^-$ channel by 5%.
The BDT is trained with EW $\ell\ell jj$ simulated events for the signal model and with DY $\ell\ell jj$ and $t\bar{t}$ simulated events for the background model. The BDT output value is proportional to the probability that the event belongs to the signal: the higher the value, the higher the probability. The BDT output distributions for the two lepton modes from various production mechanisms are shown in figure 11, where the expected contributions from the signal and background processes are evaluated from simulation.
The signal cross section is extracted from the fit of the BDT output distributions for data with the method described in section 6.1 for the $m_{j_1 j_2}$ distributions. For the $\mu^+\mu^-$ channel, the best fits are $s = 0.90 \pm 0.19$ (stat.), $b = 0.905 \pm 0.006$ (stat.) with JPT jets and $s = 0.85 \pm 0.18$ (stat.), $b = 0.937 \pm 0.007$ (stat.) with PF jets. For the $e^+e^-$ channel, with PF jets, the best fit is $s = 1.17 \pm 0.27$ (stat.), $b = 0.957 \pm 0.010$ (stat.). The value of the parameter $b$ obtained from the fit is below unity by 5-10%. It is, however, consistent with unity within the JES uncertainty and the systematic uncertainty in the MadGraph simulation of the DY $\ell\ell jj$ process, as discussed in section 3. Figure 12 shows the BDT output distributions for the $\mu^+\mu^-$ (left) and $e^+e^-$ (right) channels, where the expected contributions from the dominant DY $\ell\ell jj$ background and the EW $\ell\ell jj$ signal processes are evaluated from the fit; the contributions from the small $t\bar{t}$ and diboson backgrounds are taken from the simulation estimates.
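The fit procedure itself is defined in section 6.1 and is not reproduced here. As a rough illustration of a two-parameter template fit of this kind, the sketch below scales a signal template by $s$ and a DY template by $b$, keeps the small backgrounds fixed, and solves a weighted least-squares problem in closed form. The chi-square estimator, the function name `fit_s_b`, and all template contents are assumptions made for the example; the analysis may well use a different estimator, such as a binned likelihood.

```python
import math

def fit_s_b(data, signal, dy_bkg, fixed_bkg):
    """Weighted least-squares fit of data_i = s*signal_i + b*dy_bkg_i + fixed_bkg_i.

    Returns (s, b, s_err, b_err). The per-bin variance is approximated by the
    observed count (an assumption of this sketch).
    """
    a_ss = a_sb = a_bb = r_s = r_b = 0.0
    for d, sig, dy, fix in zip(data, signal, dy_bkg, fixed_bkg):
        var = max(d, 1.0)      # guard against empty bins
        resid = d - fix        # subtract the backgrounds that are not fitted
        a_ss += sig * sig / var
        a_sb += sig * dy / var
        a_bb += dy * dy / var
        r_s += sig * resid / var
        r_b += dy * resid / var
    # Solve the 2x2 normal equations with Cramer's rule.
    det = a_ss * a_bb - a_sb * a_sb
    s = (r_s * a_bb - r_b * a_sb) / det
    b = (r_b * a_ss - r_s * a_sb) / det
    # The parameter covariance is the inverse of the normal-equation matrix.
    return s, b, math.sqrt(a_bb / det), math.sqrt(a_ss / det)

# Hypothetical BDT-output templates (events per bin), low to high output.
signal = [2.0, 5.0, 10.0, 20.0, 30.0]
dy_bkg = [5000.0, 2000.0, 600.0, 150.0, 40.0]
fixed_bkg = [50.0, 30.0, 15.0, 6.0, 2.0]   # stand-in for ttbar plus diboson

# Pseudo-data built with known scale factors, to check that the fit recovers them.
pseudo_data = [0.9 * sg + 0.95 * dy + fx for sg, dy, fx in zip(signal, dy_bkg, fixed_bkg)]
s_fit, b_fit, s_err, b_err = fit_s_b(pseudo_data, signal, dy_bkg, fixed_bkg)
```

Because the DY template dominates almost every bin, $b$ is constrained far more tightly than $s$ in such a fit, which is qualitatively the pattern seen in the quoted statistical uncertainties.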

The signal is clearly visible at high values of the BDT output (>0.25) in both the dimuon and dielectron channels, whether the dominant DY $\ell\ell jj$ background is evaluated from simulation (figure 11) or from the fit (figure 12).
In figure 12 the bottom panels show the significance observed in data (histogram) and expected from simulation (solid purple line), while the dashed blue line shows the background modeling uncertainty. The observed signal significance in bin $i$ of the BDT output distribution is calculated as

[equation not recoverable from the source text]

where $N_i^{\mathrm{EW}\,Zjj}$ is the number of simulated signal EW $Zjj$ events from the fit. The background modeling uncertainty is calculated as

[equation not recoverable from the source text]

where $N_i^{\mathrm{mcfm}}$ is the number of the simulated background events obtained from a new fit. The fit uses a modified BDT output distribution for the DY $\ell\ell jj$ process. This distribution is evaluated using the $m_{j_1 j_2}$ shape obtained from the NLO calculation of mcfm, as explained in section 6.2.
The sources of the systematic uncertainties on the estimated signal value of $s$ are those discussed in section 6.2. The absolute values of the systematic uncertainties on the value of $s$ for the BDT analysis are shown in table 7 for the $\mu^+\mu^-$ and $e^+e^-$ modes. The uncertainties are smaller than those from the $m_{j_1 j_2}$ fit analysis since the BDT approach provides better separation between signal and background.
The BDT analysis in the $\mu^+\mu^-$ channel is repeated for events passing the additional requirement of $|y^*| < 1.2$, as used in the $m_{j_1 j_2}$ analysis. In this case the best fit values

Results
The presence of the signal is confirmed in the dimuon and dielectron channels by using two alternative jet reconstruction algorithms and two methods of signal extraction.
The BDT analysis provides smaller uncertainties on the parameter $s$, and therefore the result is based on this analysis. The measured cross section is $\sigma_{\mathrm{meas}} = s \times \sigma_{\mathrm{MadGraph}}(\mathrm{EW}\ \ell\ell jj)$, where $\sigma_{\mathrm{MadGraph}}(\mathrm{EW}\ \ell\ell jj) = 162$ fb per lepton flavor is the cross section obtained from the MadGraph simulation using CTEQ6L1 [28].
The signal cross section given by MadGraph is obtained for event generation with the following selections at the parton level: $m_{\ell\ell} > 50$ GeV, $p_T^{j} > 25$ GeV, $|\eta^{j}| < 4.0$, $m_{jj} > 120$ GeV. The parton-level requirements on the jet $p_T$ and $\eta$ maximize the signal selection efficiency relative to the actual selection applied to the data, while keeping the fraction of the events which fail the parton-level requirements but pass the data selection criteria at a negligible level.

The measured cross sections agree with the theoretical value of $\sigma_{\mathrm{vbfnlo}}(\mathrm{EW}\ \ell\ell jj) = 166$ fb, calculated with next-to-leading order QCD corrections using the same parton-level selections as those applied in the signal event generation by MadGraph. The cross sections obtained in the $\mu^+\mu^-$ and $e^+e^-$ analyses using PF jets are combined, and the average cross section is:

$\sigma_{\mathrm{EW}}(\ell = e, \mu) = 154 \pm 24\ (\mathrm{stat.}) \pm 46\ (\mathrm{exp.\ syst.}) \pm 27\ (\mathrm{th.\ syst.}) \pm 3\ (\mathrm{lum.})$ fb. (6.8)
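The text does not spell out how the two channels are combined. As a plausibility check only, the sketch below converts the PF-jet fit results quoted in section 6.3 into cross sections with $\sigma_{\mathrm{meas}} = s \times 162$ fb and forms an inverse-variance weighted average using the statistical uncertainties alone. The choice of weighting is an assumption of this example, and the treatment of systematic uncertainties and their correlations in the actual combination is not addressed. With these inputs the average reproduces the central value and statistical uncertainty of the combined result.

```python
import math

SIGMA_MADGRAPH_FB = 162.0   # EW lljj cross section per lepton flavor, from the text

def weighted_average(measurements):
    """Inverse-variance weighted mean of (value, uncertainty) pairs."""
    weights = [1.0 / err**2 for _, err in measurements]
    mean = sum(w * val for w, (val, _) in zip(weights, measurements)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

# Best-fit s and its statistical uncertainty with PF jets, as quoted in the text.
s_fits = {"mumu": (0.85, 0.18), "ee": (1.17, 0.27)}

# sigma_meas = s * sigma_MadGraph for each channel (values in fb).
xsec = {ch: (s * SIGMA_MADGRAPH_FB, err * SIGMA_MADGRAPH_FB)
        for ch, (s, err) in s_fits.items()}

combined, combined_stat = weighted_average(list(xsec.values()))
print(f"combined: {combined:.0f} +/- {combined_stat:.0f} (stat.) fb")
# prints "combined: 154 +/- 24 (stat.) fb"
```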

Summary
A measurement of the electroweak production of a Z boson in association with two jets in pp collisions at $\sqrt{s} = 7$ TeV has been carried out with the CMS detector using an integrated luminosity of 5 fb$^{-1}$. The cross section for the EW $\ell\ell jj$ ($\ell = e, \mu$) production process, with $m_{\ell\ell} > 50$ GeV, $p_T^{j} > 25$ GeV, $|\eta^{j}| < 4.0$, $m_{jj} > 120$ GeV, is $\sigma = 154 \pm 24$ (stat.) $\pm 46$ (exp. syst.) $\pm 27$ (th. syst.) $\pm 3$ (lum.) fb. The measurement is in agreement with the theoretical cross section of 166 fb, obtained with calculations including next-to-leading order QCD corrections based on the CT10 [34] parton distribution functions. A significance of 2.6 standard deviations has been obtained for the observation of EW production of the Z boson with two tagging jets. The measured hadronic activity in events with Drell-Yan production in association with two jets is in good agreement with simulation. This is the first measurement of EW production of a Z boson with two jets at a hadron collider, and constitutes an important foundation for the more general study of vector boson fusion processes, of relevance for Higgs boson searches and for measurements of electroweak gauge couplings and vector-boson scattering.
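For orientation, the four uncertainty components quoted above can be added in quadrature to give a total uncertainty, and the ratio of the central value to that total gives a naive estimate of the significance. This is only a back-of-the-envelope check under the assumption that the components are independent and Gaussian; the text does not state how the quoted significance is derived, so agreement with it should not be read as a description of the method used.

```python
import math

CENTRAL_FB = 154.0
components_fb = {"stat": 24.0, "exp_syst": 46.0, "th_syst": 27.0, "lumi": 3.0}

# Quadrature sum, assuming uncorrelated components.
total_fb = math.sqrt(sum(v * v for v in components_fb.values()))

# Naive significance: central value divided by its total uncertainty.
naive_significance = CENTRAL_FB / total_fb

# Cross-check: a 2.2% luminosity uncertainty on 154 fb is about 3 fb.
lumi_check_fb = 0.022 * CENTRAL_FB

print(f"total = {total_fb:.1f} fb, ratio = {naive_significance:.1f}")
# prints "total = 58.6 fb, ratio = 2.6"
```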

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWF and FWF (