Measurement of associated W + charm production in pp collisions at sqrt(s) = 7 TeV

Measurements are presented of the associated production of a W boson and a charm-quark jet (W + c) in pp collisions at a center-of-mass energy of 7 TeV. The analysis is conducted with a data sample corresponding to a total integrated luminosity of 5 inverse femtobarns, collected by the CMS detector at the LHC. W boson candidates are identified by their decay into a charged lepton (muon or electron) and a neutrino. The W + c measurements are performed for charm-quark jets in the kinematic region pt[jet]>25 GeV, abs(eta)<2.5, for two different thresholds for the transverse momentum of the lepton from the W-boson decay, and in the pseudorapidity range abs(eta[ell])<2.1. Hadronic and inclusive semileptonic decays of charm hadrons are used to measure the following total cross sections: sigma(pp to W + c + X) times B(W to ell nu) = 107.7 +/- 3.3 (stat.) +/- 6.9 (syst.) pb (pt[ell]>25 GeV) and sigma(pp to W + c + X) times B(W to ell nu) = 84.1 +/- 2.0 (stat.) +/- 4.9 (syst.) pb (pt[ell]>35 GeV), and the cross section ratios sigma(pp to W+ + c bar + X)/sigma(pp to W- + c + X) = 0.954 +/- 0.025 (stat.) +/- 0.004 (syst.) (pt[ell]>25 GeV) and sigma(pp to W+ + c bar + X)/sigma(pp to W- + c + X) = 0.938 +/- 0.019 (stat.) +/- 0.006 (syst.) (pt[ell]>35 GeV). Cross sections and cross section ratios are also measured differentially with respect to the absolute value of the pseudorapidity of the lepton from the W-boson decay. These are the first measurements from the LHC directly sensitive to the strange quark and antiquark content of the proton. Results are compared with theoretical predictions and are consistent with the predictions based on global fits of parton distribution functions.


Introduction
The study of associated production of a W boson and a charm (c) quark at hadron colliders (hereafter referred to as W + c production) provides direct access to the strange-quark content of the proton at an energy scale of the order of the W-boson mass ($Q^2 \sim (100\,\text{GeV})^2$) [1][2][3]. This sensitivity is due to the dominance of $sg \to W^- + c$ and $\bar{s}g \to W^+ + \bar{c}$ contributions at the hard-scattering level (Fig. 1). Recent work [4] indicates that precise measurements of this process at the Large Hadron Collider (LHC) may significantly reduce the uncertainties in the strange quark and antiquark parton distribution functions (PDFs) and help resolve existing ambiguities and limitations of low-energy neutrino deep-inelastic scattering (DIS) data [5]. More precise knowledge of the PDFs is essential for many present and future precision analyses, such as the measurement of the W-boson mass [6]. An asymmetry between the strange quark and antiquark PDFs has also been proposed as an explanation of the NuTeV anomaly [5], making it crucial to measure observables related to this asymmetry with high precision. W + c production receives contributions at the few-percent level from the processes $dg \to W^- + c$ and $\bar{d}g \to W^+ + \bar{c}$, which are Cabibbo suppressed [7]. Overall, the $W^- + c$ yield is expected to be slightly larger than the $W^+ + \bar{c}$ yield at the LHC because of the participation of down valence quarks in the initial state. A key property of the $qg \to W + c$ reaction is the presence of a charm quark and a W boson with opposite-sign charges.
The $pp \to W + c + X$ process is a sizable background for signals involving bottom or top quarks and missing transverse energy in the final state. Particularly relevant cases are top-quark studies and third-generation squark searches.
Measurements of the pp → W + c + X cross section and of the cross section ratio σ(pp → W + c-jet + X)/σ(pp → W + jets + X) have been performed with a relative precision of about 20-30% at the Tevatron [8][9][10] hadron collider using semileptonic charm hadron decays.
We present a detailed study of the $pp \to W + c + X$ process with the Compact Muon Solenoid (CMS) detector, using a data sample corresponding to a total integrated luminosity of $5\,\text{fb}^{-1}$ collected in 2011 at a center-of-mass energy of 7 TeV. We measure the total cross section and the cross section ratio $R_c^{\pm} = \sigma(W^+ + \bar{c})/\sigma(W^- + c)$ using the muon and electron decay channels of the W boson. Charm-quark jets are identified within the fiducial region of transverse momentum $p_T^{\text{jet}} > 25$ GeV and pseudorapidity $|\eta^{\text{jet}}| < 2.5$ using exclusive hadronic, inclusive hadronic, and semileptonic decays of charm hadrons. Furthermore, the cross section and the $R_c^{\pm}$ ratio are measured as a function of the pseudorapidity of the lepton from the W decay, thus probing a wide range in the Bjorken x variable, which at leading order can be interpreted as the momentum fraction of the proton carried by the interacting parton.
This paper is organized as follows: the CMS detector is briefly described in Section 2 and the general analysis strategy is outlined in Section 3. The samples used to carry out the measurement and the event selection criteria are presented in Sections 4 and 5. Section 6 details the measurement of the total cross section and Sections 7 and 8 are devoted to studies of the differential cross section and the charge ratio. Results and comparisons with theoretical predictions are discussed in Section 9. Finally, we summarize the results of this paper in Section 10.

CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the field volume are a silicon pixel and strip tracker, an electromagnetic calorimeter (ECAL), and a brass/scintillator hadron calorimeter (HCAL). Muons are detected in gas-ionization detectors embedded in the steel flux return yoke of the magnet.
The CMS experiment uses a right-handed coordinate system with the origin at the nominal interaction point, the x axis pointing to the center of the LHC ring, the y axis pointing up (perpendicular to the LHC plane), and the z axis along the anticlockwise-beam direction. The polar angle θ is measured from the positive z axis and the azimuthal angle φ is measured in the x-y plane. The pseudorapidity is given by η = − ln(tan(θ/2)).
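The pseudorapidity definition above can be sketched in a few lines of Python. This is only an illustration of the formula η = −ln(tan(θ/2)) and its inverse; the function names are chosen here and are not taken from any CMS software.

```python
import math

def pseudorapidity(theta: float) -> float:
    """Pseudorapidity eta = -ln(tan(theta/2)) for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta: float) -> float:
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))
```

A particle emitted perpendicular to the beam (θ = π/2) has η = 0, and the tracker limit |η| = 2.5 corresponds to a polar angle of roughly 9 degrees from the beam axis.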
The tracker measures charged-particle trajectories in the pseudorapidity range |η| ≤ 2.5. It consists of 1440 silicon pixel and 15 148 silicon strip detector modules. It provides an impact parameter resolution of 15 µm and a transverse momentum (p T ) resolution of about 1% for charged particles with p T around 40 GeV. The ECAL consists of nearly 76 000 lead tungstate crystals, which provide coverage in pseudorapidity |η| ≤ 1.479 in a cylindrical barrel region and 1.479 ≤ |η| ≤ 3.0 in two endcap regions (EE). A preshower detector, consisting of two planes of silicon sensors interleaved with a total of three radiation lengths of lead, is located in front of the EE. The ECAL has an ultimate energy resolution of better than 0.5% for unconverted photons with transverse energies (E T ) above 100 GeV. The energy resolution is 3% or better for the range of electron energies relevant for this analysis. The HCAL is a sampling device with brass as passive material and scintillator as active material. The combined calorimeter cells are grouped in projective towers of granularity ∆η × ∆φ = 0.087 × 0.087 at central rapidities and 0.175 × 0.175 at forward rapidities. Muons are detected in the pseudorapidity range |η| ≤ 2.4, with detection planes based on three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. A high-p T muon originating from the interaction point produces track segments in typically three or four muon stations. Matching these segments to tracks measured in the inner tracker results in a p T resolution between 1% and 2% for p T values up to 100 GeV. The first level of the CMS trigger system, composed of custom hardware processors, is designed to select the most interesting events in less than 1 µs using information from the calorimeters and muon detectors. The high-level trigger processor farm further decreases the event rate to a few hundred hertz before data storage. A more detailed description of CMS can be found elsewhere [11].

Analysis strategy
We study W + c associated production in final states containing a $W \to \ell\nu$ decay (where $\ell = \mu$ or e) and a leading jet with charm-quark content. Jets originating from a c ($\bar{c}$) parton are identified using one of the three following signatures: a displaced secondary vertex with three tracks and an invariant mass consistent with a $D^+ \to K^-\pi^+\pi^+$ ($D^- \to K^+\pi^-\pi^-$) decay; a displaced secondary vertex with two tracks consistent with a $D^0 \to K^-\pi^+$ ($\bar{D}^0 \to K^+\pi^-$) decay and associated with a previous $D^{*+}(2010) \to D^0\pi^+$ ($D^{*-}(2010) \to \bar{D}^0\pi^-$) decay at the primary vertex; or a semileptonic decay leading to a well-identified muon. In total, since both electron and muon channels are considered in the W-boson decay, six different final states are explored.
The $D^{\pm}$, $D^{*\pm}(2010)$, and $c \to \ell\nu + X$ decays provide a direct measurement of the charm-quark jet charge, which is a powerful tool to disentangle the W + c signal component from most of the background processes. We define two types of distributions: opposite-sign distributions, denoted by OS, are built from samples containing a W boson and a charm-quark jet with opposite charge signs; same-sign distributions, denoted by SS, are built from samples where the W boson and the charm-quark jet have the same charge sign. The final distributions used in the analysis are obtained by subtracting the SS distribution from the OS distribution (referred to as OS − SS) for any given variable. This subtraction has no effect on the signal at leading order. In contrast, $W + c\bar{c}$ and $W + b\bar{b}$ events provide the same OS and SS contributions and are suppressed in OS − SS distributions. Moreover, any OS − SS asymmetry present in $t\bar{t}$, single-top-quark, or W + light-quark jet backgrounds is found to be negligible according to simulations. As a consequence, OS − SS distributions are largely dominated by the W + c component, allowing for many detailed studies of the $pp \to W + c + X$ process.
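The OS − SS subtraction described above can be illustrated with a minimal sketch. It assumes each candidate carries a measured value of some variable, a W charge (taken from the lepton), and a charm-jet charge; the function name, the binning convention, and the Poisson error treatment are illustrative assumptions rather than the analysis code.

```python
def os_minus_ss(values, w_charges, charm_charges, edges):
    """Per-bin OS - SS counts and Poisson uncertainties for one variable.

    edges is an increasing list of bin edges; each bin is [low, high).
    """
    nbins = len(edges) - 1
    os_counts = [0] * nbins
    ss_counts = [0] * nbins
    for x, q_w, q_c in zip(values, w_charges, charm_charges):
        for i in range(nbins):
            if edges[i] <= x < edges[i + 1]:
                if q_w * q_c < 0:   # opposite-sign pair
                    os_counts[i] += 1
                else:               # same-sign pair
                    ss_counts[i] += 1
                break
    diff = [o - s for o, s in zip(os_counts, ss_counts)]
    # Independent Poisson counts: variances add under subtraction.
    err = [(o + s) ** 0.5 for o, s in zip(os_counts, ss_counts)]
    return diff, err
```

A charge-symmetric background such as g → cc̄ populates the OS and SS counts equally on average, so it cancels in `diff` while still contributing to `err`.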
Using displaced secondary vertices is a simple way to suppress backgrounds, such as Drell-Yan events, W + light-quark jet, and multijet final states with no heavy-flavour content. It also reduces backgrounds containing b-hadron decays, which often lead to secondary vertices with a higher track multiplicity than a typical D-meson decay.
The sample containing semileptonic charm decays is complementary; it is a larger data sample but is more affected by backgrounds, in particular Drell-Yan events. Exclusive identification of $D^{\pm}$ and $D^{*\pm}(2010)$ final states allows for a precise accounting of systematic uncertainties in charm branching fractions and acceptances for cross section measurements. However, only charge identification is strictly required for studies that are independent of the overall W + c normalization, such as relative differential measurements or measurements of the $\sigma(W^+ + \bar{c})/\sigma(W^- + c)$ ratio.
In order to improve the statistical precision, we also employ inclusive selections of charm hadron decays, i.e. without requiring the identification of the full final state, thus allowing for decays with one or more neutral particles. Inclusive samples of events with three-track and two-track secondary vertices are selected by loosening the invariant mass constraints. Even with these relaxed criteria, simulations predict that the background contributions to the OS − SS subtracted distributions in these inclusive samples are small compared with the signal yield.

Data and Monte Carlo samples and signal definition
The analysis reported in this paper was performed with a data sample of proton-proton collisions at $\sqrt{s} = 7$ TeV collected with the CMS detector in 2011. A detailed data certification process [12] guarantees that the data set available for analysis, corresponding to an integrated luminosity $L = 5.0 \pm 0.1\,\text{fb}^{-1}$, fulfills the quality requirements for all detectors used in this analysis. Candidate events for the muon decay channel of the W boson are selected online by a single-muon trigger that requires a reconstructed muon with $p_T > 24$ GeV. Candidate events for the electron channel are selected by a variety of electron triggers. Trigger conditions were tightened throughout the 2011 data run to cope with the increasing instantaneous luminosity of the LHC. Most of the data used in this analysis are selected by requiring an electron [...]
Muon and electron candidates are reconstructed following standard CMS algorithms [13,14]. Jets, missing transverse energy, and related quantities are computed using particle-flow techniques [15] in which a full reconstruction of the event is developed from the individual particle signals in the different subdetectors. Jets are reconstructed from the particle-flow candidates using an anti-$k_T$ clustering algorithm [16] with a distance parameter of 0.5. Charged particles with tracks not originating at the primary vertex are not considered for the jet clustering, and the extra energy clustered in jets from the presence of additional pp interactions (pileup events) is subtracted from the jet energy [17,18]. Finally, energy corrections derived from data and simulated samples are applied to correct for η- and $p_T$-dependent detector effects [19].
Large samples of events simulated with Monte Carlo (MC) techniques are used to evaluate signal and background efficiencies. The W-boson signal (W → µν and W → eν) as well as other electroweak processes (such as Z → µµ, Z → ee, W → τν, and Z → ττ production) are generated with the MADGRAPH [20] (v5.1.1) event generator, interfaced to the PYTHIA [21] (v6.4.24) program for parton shower simulation. The MADGRAPH generator produces parton-level events with a vector boson and up to four partons in the final state on the basis of matrix element calculations. It has been shown to reproduce successfully the observed jet multiplicity and kinematic properties of W + jets final states at the LHC energy regime [22]. The matrix element/parton shower matching scale $m^2$ is equal to $(10\,\text{GeV})^2$ and the factorization and renormalization scales are set to $Q^2 = M_{W/Z}^2 + p_{T,W/Z}^2$. Constraints on the phase space at the generator level are not imposed, except for the condition $M_{\ell\ell} > 10$ GeV in the case of Z(γ*) production.
Potential backgrounds in this analysis come from $t\bar{t}$ and single-top-quark production. A sample of $t\bar{t}$ events is generated with the MADGRAPH generator interfaced to PYTHIA. Single-top-quark events are generated in the t-channel, s-channel, and tW associated modes with the next-to-leading-order (NLO) generator POWHEG [23] (v1.0), interfaced with PYTHIA. The PDF set used in these POWHEG productions is CT10 [24]. We also consider the small contributions from diboson (WW, WZ, ZZ) events and quantum chromodynamics (QCD) multijet events using PYTHIA. All leading-order (LO) generations use the CTEQ6L1 PDF set [25] with parameters set for the underlying event according to the Z2 tune [26].
Cross sections for single W and Z production processes are normalized to the predictions from FEWZ [27] evaluated at next-to-next-to-leading order (NNLO) using the MSTW08NNLO [28] PDF set. The $t\bar{t}$ cross section is taken at NNLO from Ref. [29]. For the rest of the processes, cross sections are normalized to the NLO cross section predictions from MCFM [30] using the MSTW08NLO PDF set. The QCD multijet cross section is evaluated at LO.
Several minimum-bias interactions, as expected from the projected running conditions of the accelerator, are superimposed on the hard scattering to simulate the real experimental conditions of multiple pp collisions occurring simultaneously. To reach an optimal agreement with the experimental data, the simulated distributions are reweighted according to the actual number of interactions (an average of nine) occurring given the instantaneous luminosity for each bunch crossing. Generated events are processed through the full GEANT4 [31] detector simulation, trigger emulation, and event reconstruction chain of the CMS experiment. Predictions derived from the MC-simulated samples are normalized to the integrated luminosity of the data sample.
At the hard-scattering level we identify W + c signal events as those containing an odd number of charm partons in the final state. This choice provides a simple operational definition of the process and ensures that pure QCD splittings of the $g \to c\bar{c}$ type are associated with the background. Events containing b quarks in the final state are always classified as W + b + X in order to correctly identify b → c decays. The W + c signal reference is defined at the hard-scattering level of MADGRAPH, which provides an implicit parton-jet matching for a jet separation parameter of $R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 1$ that is suitable for comparisons with the NLO theoretical predictions of MCFM at the 1% level. The phase space definition at the generator level is chosen in order to approximately match the experimental selections used in the analysis. For charm partons we require $p_T^{c} > 25$ GeV, $|\eta^{c}| < 2.5$. Differential measurements are performed as a function of the absolute value of the lepton pseudorapidity $|\eta^{\ell}|$, whereas total cross sections and average ratios require $|\eta^{\ell}| < 2.1$.
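The operational signal definition above (odd number of charm partons, with any b quark taking precedence) can be written as a short classifier. The sketch below assumes the final-state partons are available as a list of PDG identifiers (|id| = 4 for charm, 5 for bottom, 21 for gluons); the function name and the returned labels are hypothetical.

```python
def classify_w_plus_jets(parton_pdg_ids):
    """Label a W + jets event from the PDG ids of its final-state partons."""
    n_b = sum(1 for pid in parton_pdg_ids if abs(pid) == 5)
    n_c = sum(1 for pid in parton_pdg_ids if abs(pid) == 4)
    if n_b > 0:
        # Events with b quarks are always W + b + X, so b -> c decays are not
        # mistaken for prompt charm.
        return "W+b"
    if n_c % 2 == 1:
        # Odd number of charm partons: W + c signal.
        return "W+c"
    if n_c > 0:
        # Even, nonzero number: charm pairs such as g -> c cbar (background).
        return "W+ccbar"
    return "W+light"
```

For example, a final state with one charm quark and one gluon is labelled as signal, while a charm-anticharm pair from gluon splitting is labelled as background.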
Potential dependencies on the center-of-mass energy of the hard scattering process are explored by considering two different transverse momentum thresholds for the charged leptons from the W-boson decay: $p_T^{\ell} > 25$ GeV and $p_T^{\ell} > 35$ GeV. The $p_T^{\ell} > 25$ GeV case is analyzed in the W → µν channel only.

Event selection
The selection of W-boson candidates closely follows the criteria used in the analysis of inclusive W → µν and W → eν production [32]. The leptonic decay of a W boson into a muon or an electron and a neutrino is characterized by the presence of an isolated lepton with high transverse momentum. The neutrino escapes detection, causing an apparent imbalance in the transverse energy of the event. Experimentally, the magnitude of the vector momentum imbalance in the plane perpendicular to the beam direction defines the missing transverse energy of an event, $E_T^{\text{miss}}$. In W-boson events, this variable is an estimator of the transverse energy of the undetected neutrino.
Muon tracks are required to have a transverse momentum $p_T^{\mu} > 25$ GeV and to be measured in the pseudorapidity range $|\eta^{\mu}| < 2.1$. A muon isolation variable, $I_{\text{rel}}^{\mu}$, is defined as the sum of the transverse energies of neutral particles and momenta of charged particles (except for the muon itself) in a $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.4$ cone around the direction of the muon, normalized to the muon transverse momentum. The muon is required to be isolated from any other detector activity according to the criterion $I_{\text{rel}}^{\mu} < 0.12$. Electron candidates with $p_T^{e} > 35$ GeV are accepted in the pseudorapidity range $|\eta^{e}| < 2.1$ with the exception of the region $1.44 < |\eta^{e}| < 1.57$, where service infrastructure for the detector is located, thus degrading the performance. The electron isolation variable, $I_{\text{rel}}^{e}$, is defined as the sum of the transverse components of ECAL and HCAL energy deposits (excluding the footprint of the electron candidate) and transverse momenta of tracks reconstructed in the inner tracker in a $\Delta R = 0.3$ cone around the electron direction, normalized to the electron $p_T$. An isolated electron must satisfy $I_{\text{rel}}^{e} < 0.05$. The background arising from Drell-Yan processes is reduced by removing events containing additional muons (electrons) with $p_T > 25\ (20)$ GeV in the pseudorapidity region $|\eta^{\mu}| < 2.4$ ($|\eta^{e}| < 2.5$). Finally, the reconstructed transverse mass, $M_T$, must be large. It is built from the transverse momentum of the isolated lepton, $p_T^{\ell}$, and the missing transverse energy in the event, $M_T \equiv \sqrt{2\,p_T^{\ell}\,E_T^{\text{miss}}\,[1 - \cos(\phi^{\ell} - \phi_{E_T^{\text{miss}}})]}$, where $\phi^{\ell}$ and $\phi_{E_T^{\text{miss}}}$ are the azimuthal angles of the lepton and the $E_T^{\text{miss}}$ vector. In the muon channel, $M_T$ must be greater than 40 GeV. A higher threshold is set in the electron channel, $M_T > 55$ GeV, since a condition on this variable ($M_T > 50$ GeV) is already included in the online trigger selection. This requirement reduces the QCD multijet background to a negligible level in the muon channel.
Residual QCD background in the electron channel is estimated from the experimental $E_T^{\text{miss}}$ distribution. It is found to be negligible after subtraction of the SS component.
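The transverse mass requirement can be sketched as follows, assuming the usual definition $M_T = \sqrt{2\,p_T^{\ell}\,E_T^{\text{miss}}\,[1 - \cos\Delta\phi]}$ with Δφ the azimuthal separation between the lepton and the missing transverse energy. The thresholds are those quoted in the text (40 GeV for muons, 55 GeV for electrons); the function names and the channel labels are illustrative.

```python
import math

# Offline M_T thresholds in GeV, as quoted in the text.
MT_THRESHOLD = {"muon": 40.0, "electron": 55.0}

def transverse_mass(pt_lepton, met, phi_lepton, phi_met):
    """Transverse mass in GeV from lepton pT, missing ET and their azimuths."""
    dphi = phi_lepton - phi_met
    return math.sqrt(2.0 * pt_lepton * met * (1.0 - math.cos(dphi)))

def passes_mt_requirement(mt, channel):
    """True if the event satisfies the channel-dependent M_T threshold."""
    return mt > MT_THRESHOLD[channel]
```

A lepton and a neutrino of 40 GeV each, emitted back to back in the transverse plane, give $M_T$ = 80 GeV, close to the W-boson mass; collinear configurations give $M_T$ near zero and are rejected.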
A W + jets sample is selected by demanding the presence of at least one jet with $p_T^{\text{jet}} > 25$ GeV in the pseudorapidity range $|\eta^{\text{jet}}| < 2.5$, thus ensuring that the jet passes through the tracker volume, and hence achieving the best possible jet $p_T$ resolution. A W + c candidate sample is further selected by searching for a distinct signature of a charmed particle decay among the constituents of the leading jet associated with the W boson, as introduced in Section 3. For that purpose, events with a secondary vertex consistent with the decay of a relatively long-lived quark are kept. Secondary vertices are reconstructed using an adaptive vertex finder [33] algorithm with well understood performance [34]. This algorithm is stable with respect to alignment uncertainties and is an essential component of the vertex-based b-tagging algorithms used in the CMS experiment. In its default implementation, used in this analysis, tracks within a $\Delta R = 0.3$ cone around the jet axis that have a transverse momentum larger than 1 GeV and a probability of originating from the primary vertex below 50% are considered to come from a secondary vertex. Finally, only secondary vertices with a transverse decay length significance with respect to the primary vertex position larger than 3 are kept.
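The track preselection and the decay length requirement quoted above can be illustrated with a small sketch. It is not the adaptive vertex finder itself: the `Track` container, its field names, and the helper functions are hypothetical, and only the numerical cuts (ΔR < 0.3, $p_T$ > 1 GeV, primary-vertex probability < 50%, significance > 3) come from the text.

```python
import math
from dataclasses import dataclass

@dataclass
class Track:
    pt: float            # transverse momentum in GeV
    eta: float           # pseudorapidity
    phi: float           # azimuthal angle in radians
    prob_primary: float  # probability of originating from the primary vertex

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def select_secondary_tracks(tracks, jet_eta, jet_phi):
    """Tracks eligible to form a secondary vertex inside the jet."""
    return [t for t in tracks
            if delta_r(t.eta, t.phi, jet_eta, jet_phi) < 0.3
            and t.pt > 1.0
            and t.prob_primary < 0.5]

def passes_decay_length(lxy, sigma_lxy):
    """Transverse decay length significance larger than 3."""
    return sigma_lxy > 0.0 and lxy / sigma_lxy > 3.0
```

The wrap of Δφ matters for jets near φ = ±π, where a naive difference would place nearby tracks almost 2π apart.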
A search for $D^{\pm}$ and $D^0$ charm meson decays is carried out in those events having reconstructed secondary vertices with three or two tracks, respectively. In addition, a W + c candidate sample with the charm quark decaying semileptonically is selected from the events with an identified muon among the particles constituting the jet. These samples are described in more detail in the following subsections.

Selection of exclusive D ± decays
We identify $D^{\pm} \to K^{\mp}\pi^{\pm}\pi^{\pm}$ decays in the selected W + jets sample using secondary vertices with three tracks and a reconstructed invariant mass within 50 MeV of the $D^{\pm}$ mass, 1869.5 ± 0.4 MeV [35]. The kaon mass is assigned to the track that has opposite sign to the total charge of the three-prong vertex and the remaining tracks are assumed to have the mass of a charged pion. This assignment is correct in more than 99% of the cases, since the fraction of doubly Cabibbo-suppressed decays is very small: $B(D^+ \to K^+\pi^+\pi^-)/B(D^+ \to K^-\pi^+\pi^+) = 0.00577 \pm 0.00022$ [35]. Figure 2 shows the OS − SS distributions of the reconstructed invariant mass for $D^{\pm}$ candidates associated with W → µν and W → eν decays. They are compared with the predictions obtained from the simulated MC samples. We distinguish two different contributions in the W + c prediction. A resonant W + c component is composed of those events with a $D^{\pm}$ meson decaying into the $K^{\mp}\pi^{\pm}\pi^{\pm}$ final state at generator level; it is visible as a clear peak around the $D^{\pm}$ mass in Fig. 2. A nonresonant component arises from W + c events where the charm meson decays to any final state other than $K^{\mp}\pi^{\pm}\pi^{\pm}$. The reconstructed invariant mass distribution in this case extends as a continuum over the whole spectrum. The distribution presented in Fig. 2 is almost exclusively populated by W + c events. The contribution from the non-(W + c) processes introduced in Section 4 is shown as part of the background.
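The mass assignment rule for three-prong vertices lends itself to a short worked example. The sketch below assumes each track is given as (px, py, pz, charge) in GeV; the kaon and pion masses are the standard charged-meson values, the $D^{\pm}$ mass is the one quoted in the text, and the function names are illustrative.

```python
import math

M_KAON = 0.493677  # GeV, charged-kaon mass
M_PION = 0.139570  # GeV, charged-pion mass
M_DPM = 1.8695     # GeV, D+- mass quoted in the text

def three_prong_mass(tracks):
    """Invariant mass (GeV) of three tracks given as (px, py, pz, charge)."""
    vertex_charge = sum(t[3] for t in tracks)
    if len(tracks) != 3 or abs(vertex_charge) != 1:
        raise ValueError("expected three tracks with total charge +1 or -1")
    energy = px = py = pz = 0.0
    for tpx, tpy, tpz, charge in tracks:
        # Kaon hypothesis for the track whose charge is opposite to the
        # vertex charge; pion hypothesis for the other two.
        mass = M_KAON if charge == -vertex_charge else M_PION
        energy += math.sqrt(mass ** 2 + tpx ** 2 + tpy ** 2 + tpz ** 2)
        px += tpx
        py += tpy
        pz += tpz
    return math.sqrt(max(energy ** 2 - px ** 2 - py ** 2 - pz ** 2, 0.0))

def in_exclusive_window(mass, half_width=0.050):
    """True if the mass lies within 50 MeV of the D+- mass."""
    return abs(mass - M_DPM) < half_width
```

For a vertex of charge +1 the single negative track is treated as the kaon, matching the $D^+ \to K^-\pi^+\pi^+$ hypothesis; the charge-conjugate case follows from the same rule.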
The MC prediction for the $D^{\pm}$ signal is scaled by the ratio of the branching fractions $B(c \to D^{\pm} \to K^{\mp}\pi^{\pm}\pi^{\pm})$ used in the simulation and measured experimentally. The branching fraction used in the PYTHIA simulation, (1.528 ± 0.008)%, is about 25% smaller than the experimental measurement, (2.08 ± 0.10)%. This value is the combination of three measurements performed at LEP [36][37][38] of this branching fraction times the relative partial decay width of the Z boson into charm-quark pairs, $R_c = \Gamma(Z \to c\bar{c})/\Gamma(Z \to \text{hadrons})$. The original LEP measurements are divided by the latest experimental value from the PDG [35] of $R_c = 0.1721 \pm 0.0030$. In the combination of these three experiments, we have assumed that experimental systematic uncertainties are uncorrelated among the measurements, given the substantially different sources of uncertainty considered by each experiment, whereas the experimental uncertainty in $R_c$ is propagated in a correlated way. Agreement between data and predictions is reasonable, although a small signal excess over the predictions (of about 10%) is visible in Fig. 2.
For illustration purposes, the sum of a Gaussian function to describe the signal plus a second-degree polynomial for the nonresonant background is fitted to the data distribution. The PDG value of the $D^{\pm}$ mass is reproduced precisely in all cases.

Selection of exclusive D * ± (2010) decays
The first step in the identification of $D^{*+}(2010) \to D^0\pi^+$ ($D^{*-}(2010) \to \bar{D}^0\pi^-$) decays is the selection of a secondary vertex with two tracks of opposite charge, as expected from a $D^0 \to K^-\pi^+$ ($\bar{D}^0 \to K^+\pi^-$) decay. This two-track system is combined with a primary track having $p_T > 0.3$ GeV found in a cone of $\Delta R = 0.1$ around the direction of the $D^0$ candidate momentum. The secondary track with charge opposite to the charge of the primary track is assumed to be the kaon in the $D^0$ decay. Only combinations with a reconstructed mass differing from the $D^0$ mass (1864.86 ± 0.13 MeV [35]) by less than 70 MeV are kept. The $D^{*\pm}(2010)$ signal is identified as a peak in the distribution of the difference between the reconstructed $D^{*\pm}(2010)$ and $D^0$ masses near the expected value, $m^{\text{rec}}(D^{*\pm}(2010)) - m^{\text{rec}}(D^0) = 145.421 \pm 0.010$ MeV [35].
The OS − SS distribution of the reconstructed mass difference $m^{\text{rec}}(D^{*\pm}(2010)) - m^{\text{rec}}(D^0)$ is shown in Fig. 3. Both W → µν and W → eν decays are considered, with transverse momentum requirements of $p_T^{\mu} > 25$ GeV and $p_T^{e} > 35$ GeV, respectively. The resonant W + c component is composed here of those events with a $D^{*\pm}(2010)$ meson decaying into the $D^0\pi^+$; $D^0 \to K^-\pi^+$ ($\bar{D}^0\pi^-$; $\bar{D}^0 \to K^+\pi^-$) final state at generator level; it is visible as a clear peak around the nominal mass difference $m^{\text{rec}}(D^{*\pm}(2010)) - m^{\text{rec}}(D^0)$ in Fig. 3. The nonresonant component comes from W + c events where the charm meson decays to any final state other than $D^0\pi^+$; $D^0 \to K^-\pi^+$ ($\bar{D}^0\pi^-$; $\bar{D}^0 \to K^+\pi^-$). Note that the amount of background predicted by the simulation, and also observed in data, is extremely small.

The MC prediction for the full $D^{*\pm}(2010)$ decay chain is scaled by the ratio between the product of the branching fractions for the decay chain, $B(c \to D^{*+}(2010)) \times B(D^{*+}(2010) \to D^0\pi^+) \times B(D^0 \to K^-\pi^+)$, used in the simulation and the experimental measurement. The product of the branching fractions used in the PYTHIA simulation is (0.743 ± 0.005)%, which is about 20% larger than our estimation of the experimental value, (0.622 ± 0.020)%. The latter number is a weighted average that uses as inputs the dedicated measurements of this product times $R_c$ by ALEPH [37] and OPAL [39], as well as the measurement of $B(c \to D^{*+}(2010)) \times B(D^{*+}(2010) \to D^0\pi^+)$ by DELPHI [40]. To obtain the charm fractions needed for the W + c cross section normalization, the ALEPH [37] and OPAL [39] measurements are divided by the world-average $R_c$ experimental value and the DELPHI [40] measurement is multiplied by the world-average $B(D^0 \to K^-\pi^+) = 0.0388 \pm 0.0005$, both taken from the PDG [35]. Also in this case, experimental systematic uncertainties are assumed to be uncorrelated among the three LEP measurements and the experimental uncertainty in $R_c$ is propagated in a correlated way.
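The size of the branching-fraction rescaling for the two exclusive channels can be checked with simple arithmetic. The sketch below assumes that the simulated yield is multiplied by (measured value)/(simulated value) and that the two uncertainties are uncorrelated; both are assumptions made for illustration, and the helper name is hypothetical. The input numbers are the percentages quoted in the text.

```python
def ratio_with_uncertainty(num, num_err, den, den_err):
    """Ratio num/den with uncorrelated relative uncertainties added in quadrature."""
    ratio = num / den
    rel = ((num_err / num) ** 2 + (den_err / den) ** 2) ** 0.5
    return ratio, ratio * rel

# D+- -> K pi pi: measured (2.08 +- 0.10)% vs simulated (1.528 +- 0.008)%.
sf_dpm, sf_dpm_err = ratio_with_uncertainty(2.08, 0.10, 1.528, 0.008)

# D*(2010) decay chain: measured (0.622 +- 0.020)% vs simulated (0.743 +- 0.005)%.
sf_dstar, sf_dstar_err = ratio_with_uncertainty(0.622, 0.020, 0.743, 0.005)
```

Under these assumptions the $D^{\pm}$ prediction is scaled up by a factor of about 1.36 and the $D^{*\pm}(2010)$ prediction is scaled down by a factor of about 0.84, with the uncertainty dominated in both cases by the experimental measurement.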
A small excess of data over the theoretical predictions is also observed in this channel.

Selection of semileptonic charm decays
In addition to the previous exclusive channels, we consider the identification of charm-quark jets via semileptonic decays of the c quark. Only jets containing semileptonic decays into muons are considered. Muons in jets are identified with the same criteria used for muon identification in W-boson decays, with the exception that the isolation requirements are not applied.
Since the OS − SS strategy effectively suppresses all backgrounds except Drell-Yan processes, additional requirements are applied in order to reduce the Drell-Yan contamination to manageable levels without affecting the signal in an appreciable way. We require $p_T^{\mu} < 25$ GeV, $p_T^{\mu}/p_T^{\text{jet}} < 0.6$, and $p_T^{\text{rel}} < 2.5$ GeV, where $p_T^{\mu}$ denotes here the transverse momentum of the muon identified inside the jet and $p_T^{\text{rel}}$ is its transverse momentum with respect to the jet direction. We also require the invariant mass of the dilepton system to be above 12 GeV, in order to avoid the region of low-mass resonances. Finally, dimuon events with an invariant mass above 85 GeV are rejected. The latter requirement is not applied to the sample with W-boson decays into electrons, which is minimally affected by high-mass dilepton contamination.
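The Drell-Yan suppression requirements listed above can be collected into a single predicate. This is a sketch only: the argument names and the channel labels are hypothetical, while the thresholds are those quoted in the text (all momenta and masses in GeV).

```python
def passes_semileptonic_selection(mu_pt, jet_pt, pt_rel, m_dilepton, w_channel):
    """Apply the in-jet muon and dilepton mass requirements.

    mu_pt and pt_rel refer to the muon found inside the jet; m_dilepton is the
    invariant mass of that muon and the lepton from the W decay; w_channel is
    "muon" or "electron".
    """
    if not (mu_pt < 25.0 and mu_pt / jet_pt < 0.6 and pt_rel < 2.5):
        return False
    if m_dilepton <= 12.0:
        # Low-mass resonance region.
        return False
    if w_channel == "muon" and m_dilepton > 85.0:
        # High-mass dimuon veto, applied in the muon channel only.
        return False
    return True
```

The asymmetry between the channels is visible in the last condition: an event with a 90 GeV dilepton mass is rejected when the W decays to a muon but kept when it decays to an electron.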
For the input semileptonic branching fraction of charm-quark jets, we employ the value $B(c \to \ell) = 0.091 \pm 0.005$, which is the average of the inclusive value, 0.096 ± 0.004 [35], and of the exclusive sum of the individual contributions from all weakly decaying charm hadrons, 0.086 ± 0.004 [35,41]. The uncertainty is increased in order to cover both central values within one standard deviation. This value is consistent with the PYTHIA value present in our simulations (9.3%).
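The quoted value can be reproduced with a short calculation. The sketch below takes the unweighted mean of the two inputs and enlarges the uncertainty to the larger of the individual uncertainties and half the spread between the central values; this is one plausible reading of "covering both central values within one standard deviation", not necessarily the exact prescription used, and the function name is hypothetical.

```python
def combine_covering_both(a, a_err, b, b_err):
    """Mean of two values with an uncertainty that covers both central values."""
    central = 0.5 * (a + b)
    half_spread = 0.5 * abs(a - b)
    return central, max(a_err, b_err, half_spread)

# Inclusive value and exclusive sum quoted in the text.
br_central, br_err = combine_covering_both(0.096, 0.004, 0.086, 0.004)
```

With these inputs the half-spread (0.005) exceeds the individual uncertainties (0.004), which reproduces 0.091 ± 0.005.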

Selection of inclusive D ± and D * ± (2010) decays
Enlarged samples of W + c candidates are selected from the events with secondary vertices with three or two tracks, in order to increase the size of the samples available for the differential measurements. We refer to them as inclusive three-prong and two-prong samples, respectively.
Candidates for charm meson decays in the $D^{\pm} \to K^{\mp}\pi^{\pm}\pi^{\pm}$ decay mode are selected among the events with a secondary vertex with three tracks and with a vertex charge equal to ±1, which is computed as the sum of the charges associated with the tracks constituting the vertex. The mass assignment for the secondary tracks follows the procedure described in Section 5.1. However, the constraint that the invariant mass of the secondary vertex be compatible with the $D^{\pm}$ nominal mass within 50 MeV is not required in this case. The OS − SS distribution of the reconstructed invariant mass in events with three prongs is presented in Fig. 5. In addition to the resonant peak at the $D^{\pm}$ mass, there is a nonresonant spectrum with lower values of the invariant mass corresponding mainly to $D^{\pm}$ decays with one or more unaccounted neutral particles in the final state. For the differential cross section measurement, we consider the region of the invariant mass spectrum $m(K^{\mp}\pi^{\pm}\pi^{\pm}) < 2.5$ GeV. This results in a sample five times larger than the $D^{\pm}$ exclusive sample.
Similarly, candidates for $D^0$ charm meson decays are reconstructed in the W + jets events with a displaced secondary vertex built from two tracks of opposite curvature. The two tracks are assumed to correspond to the decay products of a $D^0$. The decay chain $D^{*\pm}(2010) \to D^0\pi^{\pm}$, $D^0 \to K^{\mp}\pi^{\pm}$ is identified according to the procedure described in Section 5.2, but dropping the $D^0$ mass constraint $|m(K^{\mp}\pi^{\pm}) - 1864.86\,\text{MeV}| < 70$ MeV. Figure 6 shows the OS − SS distributions of the mass difference $m(K^{\mp}\pi^{\pm}\pi^{\pm}) - m(K^{\mp}\pi^{\pm})$, where one of the pions is the closest track from the primary pp interaction vertex. The peak at $m(K^{\mp}\pi^{\pm}\pi^{\pm}) - m(K^{\mp}\pi^{\pm}) \sim 145$ MeV corresponds to the nominal $D^{*\pm}(2010) - D^0$ mass difference [35]. W + c events are still the dominant contribution at larger values of the mass difference.
The remaining background is small and it is mainly due to residual W + light-quark jets, $W + c\bar{c}$, and $t\bar{t}$ production. We select the events with an invariant mass difference $m(K^{\mp}\pi^{\pm}\pi^{\pm}) - m(K^{\mp}\pi^{\pm}) < 0.7$ GeV. The size of the sample is increased by a factor of ∼25 with respect to the exclusive $D^{*\pm}(2010)$ sample.
Figure 6: Inclusive two-prong samples: distribution of the difference between the invariant mass of the two-track system and the closest track from the primary pp interaction vertex and the invariant mass of the two secondary vertex tracks ($m(K^{\mp}\pi^{\pm}\pi^{\pm}) - m(K^{\mp}\pi^{\pm})$), assuming the decay chain $D^{*\pm}(2010) \to D^0\pi^{\pm} \to K^{\mp}\pi^{\pm}\pi^{\pm}$. The sharp peak at 145 MeV reflects the nominal mass difference between the invariant mass of the $D^0$ and the primary-pion system and the $D^0$ mass for the decay $D^{*\pm}(2010) \to D^0\pi^{\pm}$. The left plot is for W → µν events, with $p_T^{\mu} > 25$ GeV. The right plot is for W → eν events, with $p_T^{e} > 35$ GeV. The distributions are presented after subtraction of the SS component.

Measurement of the W + c cross section
The measurement of the W + c cross section is performed with several different final states containing a well-identified W → ℓν decay plus a leading jet with charm content. We use the exclusive D ± and D * ± (2010) samples and the semileptonic sample, described in Section 5. Two sets of measurements are provided: one with p ℓ T > 25 GeV, using only W → µν decays, and a second one with p ℓ T > 35 GeV, using both W → µν and W → eν decays.
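For orientation, a counting measurement of this kind is usually written in the form below. This is a sketch that assumes the standard counting expression applies here. The symbols \mathcal{L}_{int} (integrated luminosity) and \mathcal{B} (charm fraction times branching fraction for the final state considered) are labels chosen for illustration, while N sel, N bkg, and C are the quantities discussed in this section.

```latex
\sigma(W + c) \;=\; \frac{N_{\text{sel}} - N_{\text{bkg}}}{\mathcal{L}_{\text{int}}\;\mathcal{B}\;C}
```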

The factor C accounts for limited acceptances and efficiencies. In W + c events, less than 20% of the events have a well-identified secondary vertex, while less than 50% of the muons from semileptonic charm decays have sufficiently high energy to be reconstructed and identified in the muon spectrometer. The simulated W + jets sample generated by MADGRAPH + PYTHIA is used to calculate the fraction of events within the fiducial region that fulfil the criteria for the several charm-quark jet categories. These simulated samples are corrected for any differences between data and MC description in lepton trigger, identification and reconstruction efficiencies. Scaling factors, defined as the ratio efficiency data /efficiency MC as a function of the lepton pseudorapidity, are determined with samples of Z → ℓ + ℓ − events. An invariant mass (m ℓ + ℓ − ) constraint and tight quality requirements assigned to one of the leptons ("tag") allow the other lepton to be used as a probe to test the different steps in lepton identification ("tag-and-probe" method) [32]. The precision in the factor C is limited by the size of the MC sample employed; its statistical uncertainty is propagated as a systematic uncertainty to the W + c cross section.
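The data/MC scaling factor described above can be illustrated with a minimal sketch. It assumes simple binomial uncertainties on the probe efficiencies and uncorrelated data and MC samples; the function names and the probe counts are hypothetical and are not taken from the analysis.

```python
from math import sqrt


def efficiency(n_pass, n_total):
    """Return (efficiency, binomial uncertainty) for n_pass probes out of n_total."""
    eff = n_pass / n_total
    err = sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


def scale_factor(data_pass, data_total, mc_pass, mc_total):
    """Data/MC efficiency ratio with uncorrelated error propagation."""
    eff_data, err_data = efficiency(data_pass, data_total)
    eff_mc, err_mc = efficiency(mc_pass, mc_total)
    sf = eff_data / eff_mc
    err = sf * sqrt((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2)
    return sf, err


# Hypothetical probe counts in one lepton pseudorapidity bin (not from the paper).
sf, sf_err = scale_factor(data_pass=9000, data_total=10000,
                          mc_pass=9200, mc_total=10000)
print(f"scale factor = {sf:.4f} +/- {sf_err:.4f}")
```

In practice one such factor would be computed per pseudorapidity bin and applied as an event weight to the simulation.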
The signal region for the D ± channel is defined by the constraint ∆m(D ± ) ≡ |m rec (D ± ) − 1.87 GeV| < 0.05 GeV, where m rec (D ± ) is the reconstructed mass of the D ± candidate (Fig. 2). The same requirement is applied to the MC simulations in order to determine the correction factor C. We estimate values of C = 0.1114 ± 0.0033 (p µ T > 25 GeV) and C = 0.0834 ± 0.0032 (p e T > 35 GeV), where the quoted uncertainties are statistical only. The background is fully dominated by the nonresonant W + c component. It is subtracted from the selected number of events in the data window by using the number of events selected in a control region away from the resonance, extending up to a difference of 200 MeV with respect to the nominal D ± mass, 0.05 GeV < ∆m(D ± ) < 0.20 GeV. This number is scaled by the ratio N[∆m(D ± ) < 0.05 GeV]/N[0.05 GeV < ∆m(D ± ) < 0.20 GeV] observed in the simulation in order to obtain the number of background events expected in the reference window. This procedure is largely independent of uncertainties in the charm fractions present in PYTHIA. Systematic biases due to the assumed nonresonant background subtraction are expected to be negligible compared to the statistical uncertainty, given the approximate agreement between data and MC distributions.
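The control-region subtraction described above can be sketched as follows. This is a simplified illustration: it treats the data counts as plain Poisson numbers (the analysis works with OS − SS yields), ignores the statistical uncertainty of the simulation, and uses hypothetical function names and event counts.

```python
from math import sqrt


def sideband_subtraction(n_window, n_sideband, mc_window_bkg, mc_sideband_bkg):
    """Return (signal yield, statistical uncertainty) after sideband subtraction.

    n_window, n_sideband: data counts in the signal window and in the control region.
    mc_window_bkg, mc_sideband_bkg: simulated nonresonant counts in the same regions.
    """
    transfer = mc_window_bkg / mc_sideband_bkg   # window/control ratio from simulation
    n_bkg = n_sideband * transfer                # background expected in the window
    n_sig = n_window - n_bkg
    # Poisson errors on the two data counts only; MC statistics are not included.
    err = sqrt(n_window + transfer ** 2 * n_sideband)
    return n_sig, err


# Hypothetical counts, for illustration only.
n_sig, n_sig_err = sideband_subtraction(n_window=1000, n_sideband=600,
                                        mc_window_bkg=50, mc_sideband_bkg=150)
print(f"signal yield = {n_sig:.1f} +/- {n_sig_err:.1f}")
```

Because only the window-to-control ratio is taken from the simulation, the absolute normalization of the simulated nonresonant component drops out.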
The signal region for the D * ± (2010) channel is restricted to the interval ∆m(D * ± (2010)) ≡ |m rec (D * ± (2010)) − m rec (D 0 ) − 145 MeV| < 5 MeV, where m rec (D * ± (2010)) − m rec (D 0 ) is the reconstructed mass difference between the D mesons (Fig. 3). The same procedure is applied to the MC simulations in order to determine the correction factor C. We estimate values of C = 0.0849 ± 0.0040 (p µ T > 25 GeV) and C = 0.0559 ± 0.0036 (p e T > 35 GeV), where the quoted uncertainties are statistical only. As in the D ± case, the background is subtracted from the selected number of data events by using the number of events in a sideband sample, 5 MeV < ∆m(D * ± (2010)) < 20 MeV. This number is scaled by the ratio N[∆m(D * ± (2010)) < 5 MeV]/N[5 MeV < ∆m(D * ± (2010)) < 20 MeV] observed in the simulation.
For the semileptonic channel, N sel is given by the number of events with a W-boson candidate decaying into a high-p T muon or electron and an identified muon inside the jet passing the requirements described in Section 5.3. The correction factors C for the different lepton thresholds are estimated in the MC simulation as C = 0.2035 ± 0.0021 (p µ T > 25 GeV) and C = 0.1706 ± 0.0021 (p e T > 35 GeV), where the quoted uncertainties are statistical only. The number of background events remaining after selection is estimated from the simulated samples. In the sample with two opposite-sign muons, the residual Drell-Yan background corresponds to events with significant missing transverse energy and one low-p T muon inside a jet. Potential discrepancies between data and MC description in this particular phase space region are evaluated by analyzing the Drell-Yan-dominated control sample with dimuon invariant masses above 85 GeV. A correction factor of 1.2 ± 0.1 provides agreement between data and MC simulation in this region and it is applied to estimate the background in the signal region. The uncertainty in this correction factor is propagated as a systematic uncertainty in the cross section measurement.
This takes into account possible differences in the description of events below and around the Z-boson peak, where this factor is derived. Table 1 contains all the relevant inputs used in the measurements and the resulting cross sections in the different subchannels. The sources of systematic uncertainties affecting the measurement are discussed in Section 6.1.
Table 1: Cross section results for three specific final states. Here N sel is the estimated number of selected events in the signal region (around the resonance in the case of D ± and D * ± (2010) final states). N sel − N bkg is the estimate for the signal events after background subtraction using the method described in the text, C is the acceptance and efficiency correction factor, and σ(W + c) is the measured W + c cross section after correction for the charm fractions as discussed in the text. Results obtained with the sample of W bosons decaying into a muon and a neutrino and for the two muon transverse momentum thresholds (p µ T > 25 GeV and p µ T > 35 GeV) are shown in the first two blocks of the table. Results obtained when the W boson decays into an electron and a neutrino (p e T > 35 GeV) are given in the lowest block of the table. All uncertainties quoted in the table are statistical, except for the measured cross sections, which include systematic uncertainties due to the sources discussed in Section 6.1.
N sel    N sel − N bkg    C (%)    σ(W + c) [pb]
14215 ± 196    9867 ± 237    20.4 ± 0.2    106.5 ± 2.6 (stat.) ± 9.6 (syst.)
1209 ± 55    981 ± 79    11.4 ± 0.4    82.9 ± 6.7 (stat.) ± 6.4 (syst.)
For each W-boson decay channel and lepton p T threshold considered, the cross sections measured from the three charm meson decay samples are consistent and are combined. Measurements performed in the muon and electron channel with a lepton p T threshold of 35 GeV are also combined. The combination is a weighted average of the individual measurements taking into account their statistical and systematic uncertainties. Systematic uncertainties arising from a common source and affecting several measurements are considered to be fully correlated.
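A weighted average with fully correlated common systematic uncertainties can be sketched as a best-linear-unbiased-estimate (BLUE) style combination. The text only states that a weighted average is used, so the specific method below is an assumption, and the function names and input numbers are hypothetical. The covariance is built as V_ij = δ_ij stat_i² + syst_i syst_j, with a single common source for brevity.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combine(values, stat, common_syst):
    """Combine measurements with uncorrelated stat and one fully correlated syst source.

    Returns (combined value, total uncertainty, weights).
    """
    n = len(values)
    cov = [[common_syst[i] * common_syst[j] + (stat[i] ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    u = solve(cov, [1.0] * n)          # u = V^-1 * 1
    norm = sum(u)                      # 1^T V^-1 1
    weights = [ui / norm for ui in u]
    mean = sum(w * v for w, v in zip(weights, values))
    return mean, (1.0 / norm) ** 0.5, weights


# Hypothetical cross sections in pb with stat and common syst errors (illustration only).
mean, err, weights = combine([105.0, 110.0, 108.0], [6.0, 8.0, 3.0], [4.0, 4.0, 5.0])
print(f"combined = {mean:.1f} +/- {err:.1f} pb, weights = {[round(w, 2) for w in weights]}")
```

The weights always sum to one; a fully correlated source does not average down, which is why the combined uncertainty never falls below the common systematic contribution.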

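For p ℓ T > 25 GeV the combination yields σ(pp → W + c + X) × B(W → ℓν) = 107.7 ± 3.3 (stat.) ± 6.9 (syst.) pb. For p ℓ T > 35 GeV we obtain σ(pp → W + c + X) × B(W → ℓν) = 84.1 ± 2.0 (stat.) ± 4.9 (syst.) pb. The average cross sections are dominated by the measurements in the semileptonic channel (∼50%), followed by the D ± channel (∼30%) and the D * ± (2010) channel (∼20%). The weight of the W → µν channel in the cross section measurement with a lepton p T threshold of 35 GeV is ∼30% higher than the contribution from the W → eν channel.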
These measurements are largely background-free. The overall relative uncertainty, 6-7%, is dominated by systematic uncertainties in the theoretical modeling of the signal and by experimental uncertainties in the efficiency of the selection criteria. A detailed comparison with theoretical predictions is provided in Section 9.

Systematic uncertainties in the W + c cross section measurement
The various sources of systematic uncertainties are presented in Table 2. The limited precision in the branching fractions of the charm decays is one of the dominant sources of uncertainties.
Tracking reconstruction inefficiencies are intrinsically small (< 1% [42]). Given the nature of the method used to build secondary vertices, tracks are assigned to either the primary or secondary vertex in a way that may be different in data and MC simulation. In order to estimate the size of a potential discrepancy, the set of secondary tracks is either increased by adding a nearby primary track or decreased by dropping one of the original secondary tracks. A systematic uncertainty of 3.3% in the measured cross sections is estimated from the observed differences at the resonant D 0 and D ± peaks between data and simulation. Its impact on the final cross sections is reduced after combination with the results from the semileptonic channel, which is free of this uncertainty.
Uncertainties due to the pileup modeling are calculated using a modified pileup profile obtained with a minimum bias cross section increased by its estimated uncertainty, ≈6%. Jet energy scale uncertainties are extracted from dedicated CMS studies [19], which also take into account possible variations in the jet flavour composition. Additional E miss T effects are estimated by smearing the M T distribution in simulation in order to match the M T shape observed in data. Their impact is ≈2% on the final measurement.
Lepton trigger and selection inefficiencies are included in the simulation by applying the corresponding data/MC scale factors determined in dedicated "tag-and-probe" studies as a function of the lepton pseudorapidity. For muons we estimate a 0.7% uncertainty according to CMS studies on dimuon events in the Z-boson mass peak. In the electron case we take the difference between the results obtained with and without the efficiency scale factors, because the missing transverse energy requirements at the trigger level cannot be fully accounted for by "tag-and-probe" techniques. The effect of momentum and energy resolution corrections determined at the Z-boson mass peak is also propagated as an additional uncertainty. We combine the uncertainties due to lepton identification, isolation, and trigger efficiencies with the uncertainty in the lepton momentum and energy resolution in a single entry in Table 2.
The efficiency uncertainty for muons inside jets is taken to be 3.0% according to dedicated studies in multijet events. The systematic uncertainty arising from the Drell-Yan background subtraction in the semileptonic channel is determined as the change in the cross section when the correction factor to the MC simulation is varied within its uncertainties.
The propagation of the statistical uncertainty in the factor C to the cross section is not negligible due to the limited size of the MC samples used. The uncertainties related to initial-state radiation (ISR) are estimated by recalculating the factor C from samples generated with different renormalization and factorization scales (half and twice the default scale Q 2 used in the generation). The average value of the meson energy fraction in charm decays is varied by 4%, which is about twice the uncertainty in the D * ± (2010) fragmentation determined at LEP [37,39], in order to cover possible uncertainties in the assumed shape. Other theoretical uncertainties in C include PDF effects and potential biases due to the adoption of the MADGRAPH jet-parton matching scheme as the reference to be compared with the MCFM calculations (≈ 1%).
The integrated luminosity measurement has a 2.2% uncertainty [43]. Physics backgrounds, including the gluon-splitting W + cc̄ component, have a negligible contribution to the systematics compared with the statistical uncertainties in the background subtraction.
Table 2: Breakdown of the different contributions to the total systematic uncertainty (∆ syst ) in the combined σ(W + c) measurements in the fiducial region given by p jet T > 25 GeV, |η jet | < 2.5, |η ℓ | < 2.1 for two different thresholds of the transverse momentum of the lepton from the W-boson decay: p ℓ T > 25 GeV (muon channel only) and p ℓ T > 35 GeV (muon and electron channels combined).

Characterization of W + c kinematics
The high signal purity of the selected samples allows a deeper study of the properties of W + c events. Figure 7 shows the distributions of the jet pseudorapidity and the jet momentum fraction carried by the D ± candidates (top row of plots) and the D * ± (2010) candidates (middle row of plots), while the jet pseudorapidity and the jet momentum fraction carried by the muon are shown for the semileptonic candidates (bottom row of plots). The momentum fraction is directly related to the charm fragmentation function. The normalization of the W + c component in the simulation has been scaled by a factor of 1.1 in order to match approximately the experimental rate measured in data. Electron and muon channels are added in order to enhance the statistical power of the comparison. All distributions show reasonable agreement with the predictions of MADGRAPH + PYTHIA, although the experimental charm fragmentation spectra are slightly harder than the predicted ones.

Measurement of the differential cross section as a function of the lepton pseudorapidity
The W + c cross section is also measured differentially with respect to the absolute value of the pseudorapidity of the lepton from the W-boson decay. We first determine the normalized differential cross section, (1/σ(W + c)) dσ(W + c)/d|η|. The absolute differential cross section is derived from the normalized one by scaling to the average cross section presented in the previous section. The correction factors C norm i are given in Table 3. For C norm i only selection requirements related to the W-boson identification and jet selection are applied; these will be used to correct the observed events in the semileptonic sample. This procedure is done separately for events with a secondary vertex using the correction factors C norm SV , which are applied to the events in the inclusive three- and two-prong samples. Global factors correcting for effects independent of the pseudorapidity of the lepton from the W-boson decay affect equally all bins and cancel in the normalization. The statistical uncertainty in the C norm i factors is propagated as a systematic uncertainty to the normalized differential cross section.
Table 3: Correction factors C norm used for the calculation of the differential measurements. Statistical uncertainties in C norm are typically 0.3% while in C norm SV they are roughly 1%.
The number of events selected, N sel,i , in the inclusive three-prong sample is subject to the constraint that the invariant mass of the three tracks from the vertex, m(K ∓ π ± π ± ), is smaller than 2.5 GeV. The events included in the inclusive two-prong sample have a mass difference of less than 0.7 GeV between (1) the invariant mass of the two-track system plus the closest track from the primary pp interaction, m(K ∓ π ± π ± ), and (2) the invariant mass of the two-track system, m(K ∓ π ± ). For the semileptonic channel N sel,i is given by the number of events with a W-boson candidate decaying into a high-p T lepton and an identified muon inside the jet passing the requirements described in Section 5.3.
The assignment to the corresponding ith bin in the differential distribution is determined by the absolute value of the pseudorapidity of the lepton from the W-boson decay.
The normalized differential cross sections are presented graphically in Fig. 8. The numbers of OS − SS events in each lepton pseudorapidity bin for the three charm meson decay samples are detailed in Tables 11, 12, and 13 of Appendix A, together with the expected residual background N bkg,i and the numerical values of the normalized cross sections. The estimation of this background contamination has large statistical uncertainties due to the limited size of the MC samples, mainly for the data with a displaced secondary vertex. This uncertainty is propagated to the differential cross sections as a systematic uncertainty in the measurement. Unlike the W → eν sample, there is a sizable background contribution in the W → µν sample arising from Drell-Yan events. The normalized differential cross sections measured with the different W + c subsamples and for the two W → ℓν decay channels are consistent. Therefore, the results obtained in the W → µν channel with p µ T > 25 GeV are averaged, as are the results for the W → µν and W → eν channels with p ℓ T > 35 GeV. These combinations are a weighted average of the individual measurements taking into account their statistical and systematic uncertainties. Systematic uncertainties arising from a common source and affecting several measurements are considered to be fully correlated. The existing statistical correlations among the normalized cross sections in the five pseudorapidity bins are included in the combination. These averaged values are given in Table 4. The corresponding correlation matrices are presented in Table 5.

The normalized differential cross sections obtained for p µ T > 25 GeV and p ℓ T > 35 GeV are combined with the respective W + c cross sections presented in Section 6 to obtain the absolute differential cross sections, dσ(W + c)/d|η|. Results are shown in Table 6. Normalized differential cross section and total cross section measurements are essentially uncorrelated and the full covariance matrices for the absolute differential cross sections can be obtained by propagating the information contained in Tables 4 and 5 and the total uncertainty in the W + c cross sections.
Table 4: The normalized differential cross section as a function of the absolute value of the lepton pseudorapidity. These results are the average of the three samples (inclusive three-prong, inclusive two-prong, and semileptonic). The left column shows the results obtained with the W → µν sample for muons with p T > 25 GeV, while the right column combines the results obtained with the W → µν and W → eν samples for leptons with p T > 35 GeV.

Systematic uncertainties in the normalized differential cross section measurement
The dominant source of systematic uncertainty in the normalized differential cross sections from the three samples is the limited size of the MC samples. It impacts the statistical accuracy in the estimation of the residual background after the SS subtraction, and to a lesser extent, in the determination of the correction factors C norm i . As summarized below, most of the other sources that have been discussed in Section 6 have a negligible impact in the differential distributions since their effects largely cancel out in the ratios.
Differential distributions are mostly independent of jet energy scale effects since they are measured as a function of the pseudorapidity of the lepton from the W-boson decay and the spanned jet kinematic region is similar in all cases, independently of the pseudorapidity of the lepton. Possible effects due to jet energy scale uncertainties are evaluated by changing the jet energy scale in the simulated W + c sample in accord with the results of dedicated studies by CMS [19]. The variations observed in the resulting differential distribution can be largely explained by statistical fluctuations in the MC sample.
The calibration factors for lepton momentum scale and resolution have been derived from detailed studies of the position and width of the Z-boson peak [44,45]. The systematic uncertainty in the normalized differential cross section is estimated in the W → eν channel by comparing the resulting distributions with and without calibration corrections. Variations are smaller than 1% in the barrel, and of the order of 1.5% in the endcap region. In the W → µν channel the measurement is repeated many times, varying the muon calibration factors within their uncertainties and comparing to the values obtained when applying the central value of the correcting factors. The width of the resulting distribution is taken as the systematic uncertainty arising from limited knowledge of the muon momentum scale and resolution. Uncertainties between 0.2% and 0.4% in the normalized differential distributions are obtained, depending on the particular muon pseudorapidity bin, the sample selection, and the p µ T threshold. We estimate a residual ∼0.35% systematic uncertainty in the muon efficiency scaling factors, which are treated as uncorrelated among the different pseudorapidity bins. For the W → eν channel, the effect of the efficiency corrections in the measured ratios (∼0.25%) is computed and taken as an estimation of the systematic uncertainty.
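The toy procedure used for the muon calibration uncertainty can be illustrated schematically. The sketch below is not the analysis code: it replaces the normalized differential cross section by a simpler stand-in observable (the fraction of muons above a p T threshold), varies only a single global scale factor with a Gaussian of assumed width, and uses a hypothetical flat p T spectrum.

```python
import random
import statistics


def fraction_above(pts, threshold, scale):
    """Fraction of muons whose rescaled pT passes the threshold."""
    return sum(1 for pt in pts if pt * scale > threshold) / len(pts)


def calibration_spread(pts, threshold, scale_sigma, n_toys=500, seed=1):
    """Standard deviation of the observable over toys with a Gaussian-varied scale."""
    rng = random.Random(seed)
    toys = [fraction_above(pts, threshold, rng.gauss(1.0, scale_sigma))
            for _ in range(n_toys)]
    return statistics.pstdev(toys)


# Hypothetical muon pT spectrum in GeV, flat between 20 and 60 (not CMS data).
_rng = random.Random(0)
muon_pts = [_rng.uniform(20.0, 60.0) for _ in range(5000)]

# Assumed 0.2% uncertainty on the momentum scale, for illustration only.
spread = calibration_spread(muon_pts, threshold=25.0, scale_sigma=0.002)
print(f"systematic spread of the observable: {spread:.5f}")
```

The width of the toy distribution plays the role of the systematic uncertainty; in the analysis the same idea is applied bin by bin to the normalized differential cross section.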
In the modeling of the background remaining after the SS subtraction, the only physical process with a visible contribution to the final sample is Drell-Yan production, which, when one of the two muons is inside a jet, mimics the semileptonic sample in the W → µν channel. The correction factor (1.2 ± 0.1) applied to the Drell-Yan prediction is varied by one sigma and the differential distribution is reevaluated. Variations smaller than 0.3% are observed and taken as the associated systematic uncertainty. Top-quark contributions have also been varied by 6% for tt̄ production and by 15% for single-top-quark production. Variations in the differential distributions are smaller than 0.2%. A total systematic uncertainty of 0.3% is assumed to account for the background subtraction.
It is observed that the uncertainties related to the parton distribution function of the strange quark within the same PDF set are smaller than, or equal to, the differences between the central values obtained with MSTW08 [28], CT10 [24], and NNPDF23 [46]. However, no variation in the C correction factors computed with these sets of PDFs is observed and therefore no change is expected in the final result.
Systematic uncertainties arising from other sources, such as knowledge of the event pileup or the average energy fraction in charm fragmentation have been evaluated with the W + c MC sample and are found to be negligible.
The systematic uncertainties in the absolute differential cross sections given in Table 6 are dominated by the uncertainties in the total W + c cross section. The relative importance of the different sources essentially follows the breakdown of the contributions presented in Table 2. The effect of the limited MC statistics is increased because both measurements, total and normalized differential cross sections, are affected.
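The propagation from normalized to absolute differential cross sections can be sketched as follows, assuming (as stated in Section 7) that the normalized shape and the total cross section are uncorrelated. With f_i the normalized value in bin i and σ the total cross section, the covariance of a_i = f_i σ is σ² Cov(f_i, f_j) + f_i f_j (δσ)². The function name and the numbers are hypothetical.

```python
def absolute_differential(norm, norm_cov, sigma_tot, sigma_tot_err):
    """Scale normalized bin values by the total cross section and propagate covariances.

    norm: normalized differential values f_i.
    norm_cov: covariance matrix of the f_i.
    sigma_tot, sigma_tot_err: total cross section and its total uncertainty.
    """
    n = len(norm)
    values = [f * sigma_tot for f in norm]
    cov = [[sigma_tot ** 2 * norm_cov[i][j]
            + norm[i] * norm[j] * sigma_tot_err ** 2
            for j in range(n)] for i in range(n)]
    return values, cov


# Two hypothetical bins with anticorrelated shape uncertainties (illustration only).
values, cov = absolute_differential(
    norm=[0.6, 0.4],
    norm_cov=[[0.0004, -0.0004], [-0.0004, 0.0004]],
    sigma_tot=100.0,
    sigma_tot_err=5.0,
)
errors = [cov[i][i] ** 0.5 for i in range(len(values))]
print(values, errors)
```

The second term is common to all bins, which is why the uncertainty in the total cross section dominates the absolute differential result and induces positive correlations between bins.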

Measurement of the cross section ratio σ(W + + c)/σ(W − + c)
Cross section ratios σ(W + + c)/σ(W − + c) are also measured for the three specific final states discussed in the previous section. They are determined as the ratio of the OS − SS yields in the samples in which the lepton from the W-boson decay is positively or negatively charged. Both the total cross section ratio and the ratio as a function of the absolute value of the pseudorapidity of the lepton from the W-boson decay are determined.
The numbers for N + sel and N − sel are extracted from the same subsamples used for the differential cross section measurement presented in the previous section and by separating the events according to the sign of the lepton from the W-boson decay. The background contributions N + bkg and N − bkg to N + sel and N − sel have a small effect in the ratio and are neglected in the calculation. The largest effect is due to the Drell-Yan contamination in the W → µν channel and that is reduced by requiring that the transverse momentum of the muon inside the jet be less than 12 GeV. No efficiency corrections are applied since they affect the positively and negatively charged samples equally and cancel in the ratio. Figure 9 presents the cross section ratios R ± c (|η ℓ |) obtained from the three samples. The numerical values of the cross section ratio are detailed in Table 14 in Appendix A. The last row of each set of results in the table gives the cross section ratio for the full lepton absolute pseudorapidity range [0, 2.1].
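Under the approximations stated above (backgrounds neglected, efficiencies cancelling), the ratio reduces to a ratio of OS − SS yields. The block below is a sketch of that reduced form and of its statistical uncertainty, assuming independent Poisson counts; the symbols N_OS and N_SS for the opposite-sign and same-sign counts in each lepton-charge sample are introduced here for illustration.

```latex
R^{\pm}_{c} \simeq \frac{N^{+}_{\text{sel}}}{N^{-}_{\text{sel}}},
\qquad
N^{\pm}_{\text{sel}} = N^{\pm}_{\text{OS}} - N^{\pm}_{\text{SS}},
\qquad
\delta N^{\pm}_{\text{sel}} = \sqrt{N^{\pm}_{\text{OS}} + N^{\pm}_{\text{SS}}},
\\[4pt]
\frac{\delta R^{\pm}_{c}}{R^{\pm}_{c}}
= \sqrt{\left(\frac{\delta N^{+}_{\text{sel}}}{N^{+}_{\text{sel}}}\right)^{2}
      + \left(\frac{\delta N^{-}_{\text{sel}}}{N^{-}_{\text{sel}}}\right)^{2}}
```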
The effect of neglecting the background is estimated to be of the order of 0.3% and 0.2% for the inclusive cross section ratio in the inclusive three- and two-prong samples, respectively. It is 1% (0.3%) in the semileptonic sample in the W → µν (W → eν) channel. In the ratios as a function of the absolute value of the pseudorapidity, the largest effect is for the highest |η| bin for all samples (∼1%) except for the semileptonic sample in the W → µν channel where it reaches ∼4%. Other sources of systematic uncertainties in the cross section ratios are those related to lepton reconstruction, identification, and, in particular, any lepton-charge-dependent effect that may affect the W + and W − candidate samples differently. The systematic uncertainty in the cross section ratio due to lepton momentum scale and resolution is estimated following the same technique used for the normalized differential cross section. The uncertainties in the W → eν channel are smaller than 1% in the barrel, and approximately 1.5% in the endcap region. They vary in the range 0.4-0.8% in the W → µν channel, depending again on the muon pseudorapidity bin, the sample, and the muon p T threshold. They reduce to ∼0.2-0.3% for the inclusive cross section ratios since the effect of muon momentum correction factors for the muon pseudorapidity bins cancels to a large extent, thus decreasing the final uncertainty. The correction factors to the lepton reconstruction efficiencies for positively and negatively charged leptons are the same within their statistical uncertainty and thus no additional systematic uncertainties are assigned to this source.
The lepton charge misassignment in CMS is smaller than 0.3% for electrons [47] and of the order of 10 −4 for muons [48]. The associated systematic uncertainty in the cross section ratio is proportional to the relative difference between W + + c and W − + c production. Since this is small because the measured cross section ratios are close to 1, the total effect is neglected.
The cross section ratios, both total and as a function of the lepton pseudorapidity, measured with the different W + c samples and for the two W → ℓν decay channels are consistent. The results obtained in the W → µν channel with p µ T > 25 GeV are averaged, as are the results for the W → µν and W → eν channels with p ℓ T > 35 GeV. Statistical and systematic uncertainties of the individual measurements are taken into account in the combination process. Systematic uncertainties arising from a common source and affecting several measurements are considered to be fully correlated.
The following averaged R ± c ratios in the full pseudorapidity interval are derived: R ± c (p ℓ T > 25 GeV) = 0.954 ± 0.025 (stat.) ± 0.004 (syst.), R ± c (p e T > 35 GeV) = 0.927 ± 0.029 (stat.) ± 0.012 (syst.), and R ± c (p ℓ T > 35 GeV, muon and electron channels combined) = 0.938 ± 0.019 (stat.) ± 0.006 (syst.). The corresponding averaged values as a function of the absolute value of the pseudorapidity are presented in Table 7. A larger production yield of W − + c than of W + + c is expected because the former process involves a d quark whereas the latter involves a d̄ (sea) antiquark. This prediction is confirmed since the measured cross section ratio σ(W + + c)/σ(W − + c) is smaller than 1.0. The difference in production between W + + c and W − + c is not constant over the full pseudorapidity range. Production cross sections are similar in the central region, R ± c ∼1, for absolute values of the pseudorapidity of the lepton smaller than 0.35. The ratio reduces to about 0.8 for the most forward lepton pseudorapidity. A decrease of the cross section ratio with the lepton pseudorapidity is expected, since in this case we are probing a region of Bjorken x where the difference between the d and d̄ contributions is larger.

Results and comparisons with theoretical predictions
The measured total and differential cross sections and cross section ratios can be compared to analytical calculations from the MCFM program. The W + c process is available in MCFM up to O(α s 2 ) with a massive charm quark (m(c) = 1.5 GeV). The MCFM predictions for this process do not include contributions from gluon splitting into a cc̄ pair, but only contributions where the strange (or the down) quark couples to the W boson. The implementation of W + c follows the calculation for the similar W+top-quark process [49].
The parameters of the calculation have been adjusted to match the experimental measurement: p jet T > 25 GeV and |η jet | < 2.5. Two sets of predictions are computed, utilizing the different lepton p T thresholds used in the analysis: p ℓ T > 25 GeV in the W → µν channel and p ℓ T > 35 GeV in the W → µν and in the W → eν channel.
We show predictions for three NNLO PDF sets: MSTW2008, CT10, and NNPDF2.3. These three PDF sets have in common the use of a global data set with a wide variety of observables to constrain PDFs, and, in particular, they include neutrino charm production data to provide information on the strange-quark content of the proton. In addition, we compare with predictions using the NNPDF2.3 coll NNLO set [50], which is based on high energy collider data only, and thus does not rely on the neutrino DIS charm information. In particular, it includes W and Z production data from ATLAS, CMS, and LHCb, and leads to a larger strangeness content of the proton than that of global PDF sets. These four sets span a wide range of values for the strange-quark PDF, and the strangeness content from other PDF analyses falls within this interval. NNPDF2.3 has the smallest strangeness, and NNPDF2.3 coll the largest one. We have also computed the theoretical predictions for the ABM11 [51], JR09 [52], and HERAPDF1.5 [53,54] PDF sets and we discuss these results below as well.
Both the factorization and the renormalization scales are set to the value of the W-boson mass. To estimate the uncertainty from missing higher perturbative orders, cross section predictions are computed by varying independently the factorization and renormalization scales to twice and half the nominal value (with the constraint that the ratio of scales is never larger than two). The envelope of the cross sections with these scale variations defines the theoretical scale uncertainty.
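The constrained scale variation described above corresponds to seven of the nine combinations of factors 1/2, 1, and 2. A minimal sketch of the envelope computation follows; the cross section function is a hypothetical toy with an arbitrary logarithmic scale dependence, standing in for an actual MCFM evaluation, and the W-boson mass is rounded to 80.4 GeV.

```python
from itertools import product
from math import log

MW = 80.4  # approximate W-boson mass in GeV, used as the nominal scale


def scale_points(factors=(0.5, 1.0, 2.0), max_ratio=2.0):
    """Independent (k_F, k_R) variations with the ratio of scales never above max_ratio."""
    return [(kf, kr) for kf, kr in product(factors, repeat=2)
            if max(kf / kr, kr / kf) <= max_ratio]


def scale_envelope(xsec, nominal_scale=MW):
    """Return (nominal, downward shift, upward shift) over the allowed scale points."""
    nominal = xsec(nominal_scale, nominal_scale)
    values = [xsec(kf * nominal_scale, kr * nominal_scale) for kf, kr in scale_points()]
    return nominal, min(values) - nominal, max(values) - nominal


def toy_xsec(mu_f, mu_r):
    """Hypothetical cross section in pb with a mild logarithmic scale dependence."""
    return 100.0 * (1.0 - 0.05 * log(mu_r / MW) + 0.02 * log(mu_f / MW))


nominal, down, up = scale_envelope(toy_xsec)
print(f"sigma = {nominal:.1f} {down:+.1f} {up:+.1f} pb (scale)")
```

The two excluded points are (1/2, 2) and (2, 1/2), where the scales differ by a factor of four.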
The value of α s (M Z ) in the calculation is set to the central value given by the respective PDF groups. Uncertainties in the predicted cross sections associated with α s (M Z ) are smaller than the uncertainties from the PDFs, and have been neglected in the following comparisons.

Total cross section
The measured total cross sections are consistent with theoretical expectations. However, there are significant variations depending on the PDF set used in the prediction. The detailed theoretical predictions are summarized in Table 8 where the central value of the prediction is given, together with the uncertainty due to the PDF variations within each set. The experimental results reported in this document are also included in the table. The size of the PDF uncertainties depends on the different methodology used by the various groups. In particular, they depend on the parametrization of the strange-quark PDF and on the definition of the one-standarddeviation uncertainty band. In the case of NNPDF2.3 coll , the larger uncertainties arise from the lack of direct constraints on strangeness in a collider-only fit.
These predictions are compared graphically to the experimental measurement in Fig. 10. Only PDF uncertainties are shown. Scale uncertainties in the total cross section are of the order of ±5%. From Fig. 10 we see that measured W + c cross sections agree with the theoretical predictions using the PDF sets introduced above within theoretical and experimental uncertainties. The total cross sections for ABM11, JR09, and HERAPDF1.5 are respectively 98.9 pb (78.0 pb), 80.0 pb (63.4 pb) and 96.9 pb (76.7 pb) for a lepton p T threshold of 25 (35) GeV. As discussed 9.2 Differential cross section 25 Table 8: Predictions for σ(W + c) from MCFM at NLO. Kinematic selection follows the experimental requirements: p jet T > 25 GeV, |η jet | < 2.5, and |η | < 2.1. Partons are joined using an anti-k T algorithm with a distance parameter of 1. Theoretical predictions are computed with MCFM for two different thresholds in the lepton p T : p T > 25 (35) GeV in the first (second) column of predictions. For every PDF set, the central value of the prediction is given, together with the relative uncertainty as prescribed from the PDF set. The uncertainty associated with scale variations is ±5%. The last row in the table gives the experimental results presented in this document. in [4], the strangeness in ABM11 and HERAPDF1.5 is close to that of MSTW and NNPDF, hence the similarities in the predictions.  Table 9 presents the predictions for (1/σ(W + c)) dσ(W + c)/d|η|. The differences among the central value of the predictions obtained with the various PDF sets are of the same order as the associated uncertainties (at 68% confidence level, CL). As in the case of the inclusive cross section, the different size of the associated uncertainties arises from the different assumptions of PDF groups about the strange quark and antiquark content of the proton and from the different experimental inputs included [3]. 
As expected, PDF uncertainties increase at forward pseudorapidities, where the range of Bjorken x is outside that covered by available data sensitive to strangeness. Systematic uncertainties due to the scale variations are smaller than 1% for all muon pseudorapidity bins.
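As a rough illustration of the total cross section comparison above, the following Python sketch computes the difference between each quoted central prediction (ABM11, JR09, HERAPDF1.5) and the measured value, in units of the combined uncertainty. The measured values and the ±5% scale uncertainty are taken from this document; the function name `pull` and the choice to add uncertainties in quadrature are assumptions made for illustration. PDF uncertainties are not included because they are not quoted in the text for these three sets, so the resulting numbers overstate any tension.

```python
import math

# Measured sigma(W + c) x B(W -> l nu) in pb: (value, stat, syst), as quoted in this document.
MEASURED = {
    25: (107.7, 3.3, 6.9),
    35: (84.1, 2.0, 4.9),
}

# NLO MCFM central values in pb quoted in the text, keyed by lepton pT threshold (GeV).
PREDICTED = {
    "ABM11": {25: 98.9, 35: 78.0},
    "JR09": {25: 80.0, 35: 63.4},
    "HERAPDF1.5": {25: 96.9, 35: 76.7},
}

SCALE_REL_UNC = 0.05  # scale uncertainty of the order of +-5%, as quoted in the text


def pull(measured, stat, syst, predicted, scale_rel_unc=SCALE_REL_UNC):
    """Return (prediction - measurement) / total uncertainty.

    Assumption: statistical, systematic, and scale uncertainties are
    independent and added in quadrature. PDF uncertainties are NOT included.
    """
    total = math.sqrt(stat**2 + syst**2 + (scale_rel_unc * predicted) ** 2)
    return (predicted - measured) / total


for name, by_threshold in PREDICTED.items():
    for threshold, sigma_pred in by_threshold.items():
        value, stat, syst = MEASURED[threshold]
        p = pull(value, stat, syst, sigma_pred)
        print(f"{name:11s} pT > {threshold} GeV: pull = {p:+.2f}")
```

A negative pull means the central prediction lies below the measurement. A full comparison would also fold in the PDF uncertainty of each set, as done graphically in Fig. 10.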

Differential cross section
The theoretical predictions are compared with the average of the experimental measurements presented in Section 7. Figure 11 (Fig. 12) compares the measurements and predictions for the normalized cross sections (absolute cross sections). There is agreement between the measured distributions and the theoretical predictions. We note that a comparison among the several predictions in Figs. 11 and 12 may lead to different conclusions. For instance, NNPDF2.3$_{\text{coll}}$ gives the smallest prediction in the first rapidity bin in Fig. 11, whereas it gives the highest value in Fig. 12. The normalized differential cross sections probe the shape of the strange-quark PDF, whereas the behaviour of the absolute differential cross sections is also driven by the overall magnitude of the strange-quark PDF.
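The distinction between shape and magnitude can be illustrated with a small Python sketch. The per-bin numbers and the set names `set_a` and `set_b` are hypothetical and are not taken from Tables 8 or 9; the sketch only shows that a prediction can be the largest in a given bin of the absolute cross section while being the smallest in the same bin after normalization. It assumes that each entry is the cross section integrated over one $|\eta|$ bin, so that the normalization is simply the sum over bins.

```python
def normalize(bin_cross_sections):
    """Divide each bin by the sum over bins, leaving only the shape."""
    total = sum(bin_cross_sections)
    return [x / total for x in bin_cross_sections]


# Hypothetical per-bin cross sections in pb for two made-up PDF sets.
set_a = [30.0, 28.0, 24.0, 18.0]  # larger overall magnitude, flatter shape
set_b = [27.0, 24.0, 19.0, 13.0]  # smaller magnitude, more central shape

norm_a = normalize(set_a)
norm_b = normalize(set_b)

# First bin: set_a is larger in absolute terms but smaller after normalization.
print("absolute  :", set_a[0], "vs", set_b[0])
print("normalized:", round(norm_a[0], 3), "vs", round(norm_b[0], 3))
```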

Charged cross section ratio
Theoretical predictions for $\sigma(W^+ + \bar{c})$ and $\sigma(W^- + c)$ production are computed independently under the same conditions as described above and for the same lepton pseudorapidity intervals used in the analysis. Expectations for the cross section ratio $\sigma(W^+ + \bar{c})/\sigma(W^- + c)$ are derived from them and presented in Table 10. The last row in each block of predictions gives the prediction of the charged cross section ratio for the full lepton pseudorapidity interval, $|\eta^{\ell}| < 2.1$. We note that this ratio is sensitive to the strangeness asymmetry in the proton, but also to the down quark and antiquark asymmetry from the Cabibbo-suppressed process $gd \to W^- c$ ($g\bar{d} \to W^+ \bar{c}$). The $d$-$\bar{d}$ asymmetry is larger in absolute value than the difference between strange quarks and antiquarks.
Table 9: The $(1/\sigma(W + c))\, d\sigma(W + c)/d|\eta|$ theoretical predictions calculated with MCFM at NLO. Kinematic selection follows the experimental requirements: $p_T^{\text{jet}} > 25$ GeV, $|\eta^{\text{jet}}| < 2.5$, and $|\eta^{\ell}| < 2.1$. Partons are joined using an anti-$k_T$ algorithm with a distance parameter of 1. Predictions for $W \to \ell\nu$ when the transverse momentum of the lepton from the W boson is larger than 25 GeV are given in the first block of the table. The second block of predictions is for $W \to \ell\nu$ production with $p_T^{\ell} > 35$ GeV. For every PDF set, the central value of the prediction is given, together with the relative uncertainty as prescribed from the PDF set. The uncertainty associated with scale variations is smaller than 1%.
Table 10: Theoretical predictions for $R_c^{\pm}(\eta^{\ell}) \equiv \sigma(W^+ + \bar{c})(|\eta^{\ell}|)/\sigma(W^- + c)(|\eta^{\ell}|)$ calculated with MCFM at NLO. Kinematic selection follows the experimental requirements: $p_T^{\text{jet}} > 25$ GeV, $|\eta^{\text{jet}}| < 2.5$, and $|\eta^{\ell}| < 2.1$. Partons are joined using an anti-$k_T$ algorithm with a distance parameter of 1. Predictions for $W \to \ell\nu$ when the transverse momentum of the lepton from the W boson is larger than 25 GeV are given in the first block of the table. The second block of predictions is for $W \to \ell\nu$ production with $p_T^{\ell} > 35$ GeV. For each PDF set, the central value of the prediction is given, together with the relative uncertainty as prescribed from the PDF set. The uncertainty associated with scale variations is of the order of 1-2%.
Both the central values and the associated PDF uncertainties are quite different for the various sets of predictions. These differences arise from the assumptions underlying each global fit. For instance, the CT10 set assumes equal strange quark and antiquark content in the proton, leading to a charged cross section ratio almost exclusively driven by the $d$-$\bar{d}$ asymmetry and with a very small PDF uncertainty in the prediction. On the other hand, both MSTW08 and NNPDF2.3 provide independent parametrizations of the strangeness asymmetry, thus resulting in larger PDF uncertainties. The MSTW08 and NNPDF2.3 predicted values for the $\sigma(W^+ + \bar{c})/\sigma(W^- + c)$ ratio in the full pseudorapidity region are smaller than in the CT10 case. As before, PDF uncertainties increase for large values of the lepton pseudorapidity. Systematic uncertainties in the cross section ratio due to the scale variations are smaller than 1% for the full lepton absolute pseudorapidity range [0, 2.1] and of the order of 1-2% for the smaller pseudorapidity bins of the differential measurement.
Differences among the predictions are relatively large for some of the lepton pseudorapidity bins, ∼4-5%, although this difference is covered by one standard deviation of the PDF uncertainties. All PDF sets predict a decrease of the charged ratio with the absolute value of the lepton pseudorapidity as a consequence of the higher $d$-$\bar{d}$ asymmetry at large values of Bjorken x. The decrease with $|\eta^{\ell}|$ is more pronounced in the case of NNPDF2.3.
Averaged cross section ratios obtained in Section 8 are compared with theoretical predictions. Figure 13 shows the measurements and the predictions for the total cross section ratios and Fig. 14 shows the cross section ratios as a function of the absolute value of the lepton pseudorapidity.
The theoretical predictions based on the CT10 PDF set agree with the measured cross section ratios. Predictions from NNPDF2.3 and NNPDF2.3$_{\text{coll}}$ are well within the uncertainty of the measurements, whereas expectations using MSTW08 lie about 1.5 standard deviations below the measurements.
For the cross section ratio as a function of the absolute value of the lepton pseudorapidity, there is agreement between the measurements and the theoretical predictions, especially when the transverse momentum of the lepton from the W-boson decay is larger than 35 GeV.

Summary and conclusions
The associated production of a W boson with a charm-quark jet in pp collisions at $\sqrt{s} = 7$ TeV is experimentally established for the first time, using a data sample collected by the CMS experiment during the 2011 LHC run with an integrated luminosity of 5 fb$^{-1}$. The signature of W-boson production together with a charm-quark jet is observed by identifying the leptonic decay of the W boson into a muon or an electron and a neutrino and by reconstructing exclusive and inclusive final states from the decay of charm hadrons. In total, distinct W + c signals are observed independently in six different final states.
The high performance of the CMS tracking detector and the algorithms devised for secondary-vertex reconstruction allow the efficient selection of candidate samples with a displaced secondary vertex having three or two tracks corresponding to the decay products of charm mesons. Clear signals of $D^{\pm}$ mesons are observed through the reconstruction of the decay mode $D^{\pm} \to K^{\mp} \pi^{\pm} \pi^{\pm}$ in events with three-track secondary vertices and from $D^0$ production in the decay chain $D^{*\pm}(2010) \to D^0 \pi^{\pm}$ with the subsequent decay $D^0 \to K^{\mp} \pi^{\pm}$ in events with two-track secondary vertices. In addition, efficient muon identification among the particles constituting the jet leads to an independent W + c sample with an identified muon from the semileptonic decay of the charm quark.
The analysis exploits the intrinsic charge correlation in W + c production between the charge of the W boson and the charge of the c quark, which are always of opposite sign. The W-boson decay into a well-identified charged lepton and the final-state mesons allow us to determine unequivocally the signs of both the W boson and the charm-quark jet candidates. Independent opposite-sign and same-sign samples of events are hence defined. The background contributions from processes that are charge symmetric are subtracted in an essentially model-independent way through a same-sign sample subtraction from the opposite-sign sample in the relevant variables used in the analysis.
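The same-sign subtraction can be sketched bin by bin in Python. The function name `os_minus_ss` and the event counts are hypothetical, and the sketch assumes independent Poisson-distributed counts in the opposite-sign (OS) and same-sign (SS) samples, so the variance of the difference is the sum of the two counts. It is a simplified illustration, not the procedure used in the analysis.

```python
import math


def os_minus_ss(n_os, n_ss):
    """Return per-bin OS - SS yields and their statistical uncertainties.

    Assumption: OS and SS counts are independent and Poisson distributed,
    so the uncertainty on the difference is sqrt(N_OS + N_SS).
    """
    if len(n_os) != len(n_ss):
        raise ValueError("OS and SS histograms must have the same binning")
    yields = [o - s for o, s in zip(n_os, n_ss)]
    errors = [math.sqrt(o + s) for o, s in zip(n_os, n_ss)]
    return yields, errors


# Hypothetical event counts per bin of some analysis variable.
n_os = [520, 480, 410, 300]
n_ss = [120, 110, 100, 80]

yields, errors = os_minus_ss(n_os, n_ss)
for i, (y, e) in enumerate(zip(yields, errors)):
    print(f"bin {i}: OS - SS = {y} +- {e:.1f}")
```

Charge-symmetric backgrounds populate the OS and SS samples equally on average, so they cancel in the difference, while their statistical fluctuations still contribute to the uncertainty.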
The high purity of the resulting samples allows us to perform various measurements in an almost background-free environment. The sample of candidate events from the semileptonic decay of charm mesons is affected by a larger background, mainly in the $W \to \mu\nu$ channel, but it provides a larger statistical power so that the final precision attained in the measurements in the three charm meson final states is similar. Furthermore, the large number of events in the inclusive three- and two-prong samples and in the semileptonic sample permits us to perform differential measurements.
A detailed analysis of W + c production at $\sqrt{s} = 7$ TeV is presented. The study is done for the kinematic region $p_T^{\text{jet}} > 25$ GeV, $|\eta^{\text{jet}}| < 2.5$, in the lepton pseudorapidity range $|\eta^{\ell}| < 2.1$, and for two different thresholds for the transverse momentum of the lepton from the W-boson decay: $p_T^{\ell} > 25$ GeV in the W-boson muon decay channel only, and $p_T^{\ell} > 35$ GeV in both the muon and the electron W-boson decay channels. Results obtained in the three charm decay samples and in the two W-boson decay modes are fully consistent and are thus combined to increase the final precision of the measurements.
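How combining consistent measurements improves precision can be illustrated with an inverse-variance weighted average. This is a simplified sketch: the combination procedure is not described in this section, and the sketch assumes fully uncorrelated uncertainties, which would not hold for shared systematic effects. The function name `combine` and the input numbers are hypothetical.

```python
import math


def combine(values, errors):
    """Inverse-variance weighted average of independent measurements.

    Returns (mean, uncertainty, chi2). Assumption: the input uncertainties
    are uncorrelated; correlated systematics would need a covariance matrix.
    """
    weights = [1.0 / e**2 for e in errors]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    uncertainty = math.sqrt(1.0 / weight_sum)
    # chi2 of the inputs around the combined value, a simple consistency check
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    return mean, uncertainty, chi2


# Hypothetical cross sections in pb from three decay samples.
values = [108.0, 105.0, 110.0]
errors = [9.0, 8.0, 10.0]

mean, uncertainty, chi2 = combine(values, errors)
print(f"combined = {mean:.1f} +- {uncertainty:.1f} pb, chi2 = {chi2:.2f} for {len(values) - 1} dof")
```

The combined uncertainty is always smaller than the smallest input uncertainty, and a chi2 close to the number of degrees of freedom indicates that the inputs are mutually consistent.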
The measured cross section ratios are the first evidence for an asymmetry in the $W^+ + \bar{c}$ and $W^- + c$ production. Total cross sections and cross section ratios are also measured as a function of the absolute value of the pseudorapidity of the lepton from the W-boson decay, thus probing a wide range of Bjorken x of the parton distribution of the proton. These measurements provide the first direct constraint from LHC data on the strange quark and antiquark content of the proton and constitute a valuable input for future global PDF analyses.
These measurements are compared with theoretical predictions calculated with MCFM at next-to-leading order in perturbative QCD using various sets of parton distribution functions. The PDF groups make different assumptions in their global fits about the total strange-quark content of the proton and about the $s$-$\bar{s}$ asymmetry. An overall agreement between the experimental results and the theoretical predictions is observed, which validates the fitted strange quark and antiquark parton distribution functions at an energy significantly higher than those of previous experiments. In particular, the predicted total cross sections based on those PDF sets that include low-energy DIS data in their fits agree with the measurements. Theoretical calculations also predict differential cross section shapes in agreement with the measured ones. The observed $W^- + c$ yield is slightly larger than the $W^+ + \bar{c}$ yield, as expected from the dominance of the d quark over the d antiquark in the proton.

A Normalized differential cross section and cross section ratios as a function of the lepton pseudorapidity
Table 11: Estimated number of OS − SS events in the inclusive three-prong sample (defined in Section 5.4). The estimated numbers of remaining background events after SS subtraction are given in the third column. The normalized differential cross section as a function of the absolute value of the lepton pseudorapidity is shown in the last column. The first two blocks of the table present the results from the $W \to \mu\nu$ sample, with $p_T^{\mu} > 25$ GeV and $p_T^{\mu} > 35$ GeV. The results from the $W \to e\nu$ sample, with $p_T^{e} > 35$ GeV, are given in the lowest block of the table. The first uncertainty in the normalized differential cross section is due to the statistical size of the data sample and the second one is the systematic uncertainty from the sources discussed in Section 7.
Table 12: Estimated number of OS − SS events in the inclusive two-prong sample (defined in Section 5.4). The estimated numbers of remaining background events after SS subtraction are given in the third column. The normalized differential cross section as a function of the absolute value of the lepton pseudorapidity is shown in the last column. The first two blocks of the table present the results from the $W \to \mu\nu$ sample, with $p_T^{\mu} > 25$ GeV and $p_T^{\mu} > 35$ GeV. The results from the $W \to e\nu$ sample, with $p_T^{e} > 35$ GeV, are given in the lowest block of the table. The first uncertainty in the normalized differential cross section is due to the statistical size of the data sample and the second one is the systematic uncertainty from the sources discussed in Section 7.

Table 13: Estimated number of OS − SS events in the semileptonic sample (defined in Section 5.3). The estimated numbers of remaining background events after SS subtraction are given in the third column. The normalized differential cross section as a function of the absolute value of the lepton pseudorapidity is shown in the last column. The first two blocks of the table present the results from the $W \to \mu\nu$ sample, with $p_T^{\mu} > 25$ GeV and $p_T^{\mu} > 35$ GeV. The results from the $W \to e\nu$ sample, with $p_T^{e} > 35$ GeV, are given in the lowest block of the table. The first uncertainty in the normalized differential cross section is due to the statistical size of the data sample and the second one is the systematic uncertainty from the sources discussed in Section 7.