1 Introduction

The associated production of a W boson and a single charm (c) quark (\(\textrm{W}+\textrm{c}\)) in proton–proton (pp) collisions at the CERN LHC is directly sensitive to the strange quark (\(\textrm{s}\)) content of the colliding protons at an energy scale of the order of the W  boson mass [1]. This sensitivity comes from the dominance of the \({\textrm{sg}} \rightarrow \textrm{W}+\textrm{c}\) contribution over the Cabibbo-suppressed process \({\textrm{dg}} \rightarrow \textrm{W}+\textrm{c}\) at tree level (see Fig. 1). Therefore, this process provides valuable information on the strange quark parton distribution function (PDF), which is one of the least constrained PDFs of the proton. Accurate measurements of the \(\textrm{W}+\textrm{c}\) production cross section and of the \(R_\textrm{c}^{\pm }= \sigma ({\hbox {W}}^{+}+{\bar{\text {c}}})/\sigma ({\hbox {W}}^{-}+{\textrm{c}})\) cross section ratio can be used to further constrain the strange quark PDF, and to probe the level of asymmetry between the \(\textrm{s}\) and \(\bar{\textrm{s}}\) PDFs [2,3,4].

Furthermore, the production of \(\textrm{W}+\textrm{c}\) events provides a useful calibration sample for the measurements and searches at the LHC involving electroweak bosons and c  quarks in the final state [5, 6]. Precise measurements of \(\textrm{W}+\textrm{c}\) production can be used to check the theoretical calculations of this process and its modeling in the currently available Monte Carlo (MC) event generators.

The \(\textrm{W}+\textrm{c}\) production in \({\textrm{pp}}\) collisions at the LHC has been reported by the CMS [7,8,9], ATLAS [10, 11], and LHCb [12] Collaborations at center-of-mass energies \(\sqrt{s}= 7\), 8, and 13\(\,\text {Te}\hspace{-.08em}\text {V}\). Measurements of \(\textrm{W}+\textrm{c}\) fiducial cross sections and the \(R_\textrm{c}^{\pm }\) cross section ratio were performed in those analyses by identifying charm events through the reconstruction of exclusive decays of charm hadrons, or finding secondary vertices or muons inside a jet.

Fig. 1 Leading order Feynman diagrams for the associated production of a W boson and a charm quark. The electric charges of the W boson and c quark have opposite signs

In this paper, we present a measurement of the \(\textrm{W}+\textrm{c}\) production cross section and cross section ratio \(R_\textrm{c}^{\pm }\) at \(\sqrt{s}=13\,\text {Te}\hspace{-.08em}\text {V} \) using the data collected in 2016–2018. The precision is improved compared with previous CMS measurements. In particular, the uncertainty in the \(R_\textrm{c}^{\pm }\) measurement is halved, reaching a precision of 1%. Measurements are performed in four independent channels, depending on the method used for identifying the c  quarks and the W boson decay mode (electron or muon). Jets are tagged as originating from the hadronization of c  quarks (c jet) by the presence of either muons or secondary vertices inside the jets. The combination of the measurements in the four channels, the use of the large data set collected at \(\sqrt{s}=13\,\text {Te}\hspace{-.08em}\text {V} \), and the reduction of systematic uncertainties, lead to more precise measurements.

A key property of \(\textrm{W}+\textrm{c}\) production is the opposite sign of the electric charges of the W boson and c  quark. This feature allows the suppression of most of the background events, which exhibit bottom or charm quarks and antiquarks with equal probability and identical kinematics, such as top quark-antiquark or \(\textrm{W}+\hbox {c}\bar{\text {c}}\) production. The statistical subtraction of the distributions of physical observables for events where the reconstructed charges of the W boson and the c  quark have opposite sign (OS) and same sign (SS) leads to the effective removal of these backgrounds [7, 8]. This technique, referred to as \(\text {OS-SS}\) subtraction, enhances the sensitivity to the \({\textrm{sg}}\rightarrow \textrm{W}+\textrm{c}\) process, and therefore to the strange quark PDF.
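The bin-by-bin arithmetic behind the \(\text {OS-SS}\) subtraction can be illustrated with a minimal sketch. The function name `os_minus_ss`, the bin contents, and the Poisson treatment of the statistical uncertainty (OS and SS counts taken as independent) are assumptions made for illustration and are not taken from the analysis software.

```python
import math

def os_minus_ss(os_counts, ss_counts):
    """Return per-bin (OS - SS) yields and their statistical uncertainties.

    Assumes independent Poisson counts, so the variance of the
    difference is the sum of the two counts.
    """
    if len(os_counts) != len(ss_counts):
        raise ValueError("OS and SS histograms must share the same binning")
    diff = [n_os - n_ss for n_os, n_ss in zip(os_counts, ss_counts)]
    err = [math.sqrt(n_os + n_ss) for n_os, n_ss in zip(os_counts, ss_counts)]
    return diff, err

# Hypothetical bin contents: a charge-symmetric background populates
# the OS and SS histograms equally, while the signal is OS only.
symmetric_bkg = [100, 80, 60]
signal_os = [50, 40, 30]
os_hist = [b + s for b, s in zip(symmetric_bkg, signal_os)]
ss_hist = list(symmetric_bkg)

diff, err = os_minus_ss(os_hist, ss_hist)
```

In this toy input the symmetric background cancels exactly and only the OS signal survives. In real data the cancellation holds on average, and the subtracted background still contributes to the statistical uncertainty.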

The \(\text {OS-SS}\) cross sections \(\sigma ({\hbox {W}}^{+}+\bar{\text {c}})\equiv \sigma ({\textrm{pp}} \rightarrow {\textrm{W}}^{+}+\bar{\textrm{c}}){\mathcal {B}}({\hbox {W}}^{+}\rightarrow \ell ^+ \upnu )\), \(\sigma ({\hbox {W}}^{-}+{\textrm{c}})\equiv \sigma ({\textrm{pp}} \rightarrow {\textrm{W}}^{-}+\textrm{c}){\mathcal {B}}({\textrm{W}}^{-}\rightarrow \ell ^-\bar{{\upnu }})\) (where \({\mathcal {B}}\) denotes the branching fraction), their sum \(\sigma (\textrm{W}+\textrm{c})\equiv \sigma ({\hbox {W}}^{+}+\bar{\text {c}})+\sigma ({\hbox {W}}^{-}+{\textrm{c}})\), and the cross section ratio \(R_\textrm{c}^{\pm }\equiv \sigma ({\hbox {W}}^{+}+\bar{\text {c}})/\sigma ({\hbox {W}}^{-}+{\textrm{c}})\) are measured. Cross sections are measured inclusively and differentially, the latter as functions of the transverse momentum (\(p_{\textrm{T}} ^\ell \)) and pseudorapidity (\(\eta ^\ell \)) of the lepton from the W boson decay. Measurements are unfolded to both the particle and parton levels in a fiducial region of phase space defined in terms of the kinematics of the lepton from the W boson decay (\(p_{\textrm{T}} ^{\ell } > 35\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^\ell | < 2.4\)) and of the c jet (\(p_{\textrm{T}} ^{\textrm{c}\text { jet}}>30\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\textrm{c}\text { jet}} |<2.4\)).

The theoretical cross section for \(\textrm{W}+\textrm{c}\) production at the LHC [13] is well known at the next-to-leading order (NLO) accuracy in perturbative quantum chromodynamics (QCD). Recently, the first computation of next-to-NLO (NNLO) QCD corrections was published [14, 15]. The measurements presented here are compared with the predictions of these NNLO QCD calculations, which include NLO electroweak (EW) corrections. The measurements are also compared with the predictions of the parton-level MC program \(\textsc {mcfm} \) [16], which implements calculations at NLO in QCD using several proton PDF sets.

The paper is structured as follows: the CMS detector is briefly described in Sect. 2, and the data and simulated samples used are presented in Sect. 3. Sections 4 and 5 describe the physics object reconstruction and the selection of the \(\textrm{W}+\textrm{c}\) signal sample. Section 6 reviews the most important sources of systematic uncertainties and their impact on the measurements. Cross section and cross section ratio measurements, compared with the NLO QCD theoretical predictions using different PDF sets, are detailed in Sect. 7. The comparisons of the measurements with the NNLO QCD calculations are presented in Sect. 8. The main results of the paper are summarized in Sect. 9.

Tabulated results are provided in HEPData [17].

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the magnetic volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Additional forward calorimetry complements the coverage provided by the barrel and endcap detectors. The silicon tracker measures charged particles within the pseudorapidity range \(|\eta |< 2.5\). For nonisolated particles of \(1< p_{\textrm{T}} < 10\,\text {Ge}\hspace{-.08em}\text {V} \) and \(|\eta | < 1.4\), the track resolutions are typically 1.5% in \(p_{\textrm{T}} \) and 20–75 \(\upmu \hbox {m}\) in the transverse impact parameter [18]. The upgrade of the pixel tracking detector [19] in early 2017, which includes additional layers and places the innermost layer closer to the interaction point, significantly improves the performance of heavy-flavor jet identification [20]. Muons are measured in the pseudorapidity range \(|\eta | < 2.4\), with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive plate chambers. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, is reported in Ref. [21].

Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 \(\upmu \hbox {s}\) [22]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [23].

3 Data and simulated samples

This analysis is performed using a data sample of \(\hbox {pp}\) collisions at \(\sqrt{s}=13\,\text {Te}\hspace{-.08em}\text {V} \) collected by the CMS experiment during the 2016 (\(36.3{\,\text {fb}^{-1}} \)), 2017 (\(41.5{\,\text {fb}^{-1}} \)), and 2018 (\(59.8{\,\text {fb}^{-1}} \)) data-taking periods with a total integrated luminosity of \(138{\,\text {fb}^{-1}} \).

The experimental signature of the signal events, an isolated high-\(p_{\textrm{T}} \) lepton together with a c jet, is also present in other background processes. Sources of background include top quark production (\(\hbox {t}\bar{{\hbox {t}}}\) and single top quark), diboson (\({\textrm{WW}}\), \({\textrm{WZ}}\), and \({\textrm{ZZ}}\)) processes (collectively denoted as \({\textrm{VV}}\)), the production of a Z boson (or a virtual photon) in association with jets (\(\textrm{Z}+\text {jets}\)), and \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) or \(\textrm{W}+ \textrm{b}\bar{\textrm{b}}\) events.

Samples of signal and background events are simulated using MC event generators based on fixed-order perturbative QCD calculations, supplemented with parton showering and multiparton interactions. Simulated samples of \(\textrm{W}+\text {jets}\) and \(\textrm{Z}+\text {jets}\) events are produced at NLO accuracy with the MadGraph5_amc@nlo [24] (version 2.6.3) matrix element generator with up to two partons in the final state. The decay of the W and Z bosons to tau leptons is included in the \(\textrm{W}+\text {jets}\) and \(\textrm{Z}+\text {jets}\) simulations. Samples of \(\hbox {t}\bar{{\hbox {t}}}\) and single top (s-, t-, and \({\textrm{tW}}\) channels) events are generated at NLO accuracy with powheg v2.0 [25]. The cross sections for \(\textrm{W}+\text {jets}\), \(\textrm{Z}+\text {jets}\), \(\hbox {t}\bar{{\hbox {t}}}\), and single top production are obtained at NNLO in QCD [26, 27]. The diboson production is modeled with samples of events generated with pythia8 [28] (version 8.219).

The simulated \(\textrm{W}+\text {jets}\) sample is composed of W bosons accompanied by jets originating from quarks of all flavors and gluons. Simulated \(\textrm{W}+\text {jets}\) events are classified according to the flavor of the outgoing generated partons as: (i) \(\textrm{W}+ \textrm{b}\) if at least one bottom quark was generated in the hard process; (ii) \(\textrm{W}+\textrm{c}\) if a single charm quark was created in the hard process; (iii) \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) if a \(\hbox {c}\bar{\text {c}}\) pair was present in the event; (iv) \(\textrm{W}+ {\textrm{udsg}} \) if no c  or b  quarks were produced.
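The flavor classification above amounts to an ordered set of tests on the outgoing partons. The sketch below is illustrative only: it assumes the partons are available as a list of PDG identifiers (4 for c, 5 for b, negative values for antiquarks), and it places any event with two or more charm quarks or antiquarks in the \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) category, which is a simplification. The function name `classify_w_jets` is hypothetical.

```python
def classify_w_jets(parton_pdg_ids):
    """Classify a simulated W+jets event by outgoing-parton flavor.

    The order of the tests mirrors categories (i)-(iv) in the text:
    bottom takes precedence over charm, and charm over light flavors.
    """
    n_b = sum(1 for pid in parton_pdg_ids if abs(pid) == 5)
    n_c = sum(1 for pid in parton_pdg_ids if abs(pid) == 4)
    if n_b >= 1:
        return "W+b"
    if n_c == 1:
        return "W+c"
    if n_c >= 2:
        # Assumption: any multi-charm final state is treated as a c-cbar pair.
        return "W+ccbar"
    return "W+udsg"
```

For example, `classify_w_jets([-4, 21])` yields `"W+c"`, while an event containing both a b and a c quark is assigned to `"W+b"`.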

Data collected in different running periods are modeled with specific simulation configurations. For simulations corresponding to 2016 detector conditions, the NLO NNPDF3.0 [29] PDF set is used, whereas the MC samples for 2017–2018 make use of the NNLO NNPDF3.1 [30] PDF set. The parton showering, hadronization, and the underlying events are modeled by pythia v8.212 (v8.230) using the CUETP8M1 [31, 32] (CP5 [33]) tune for the 2016 (2017–2018) samples. The jet matching and merging scheme for the MadGraph5_amc@nlo samples is FxFx [34].

In the pythia8 simulations, the charm fragmentation fractions, defined as the probabilities for c  quarks to hadronize as particular charm hadrons, corresponding to \({\hbox {D}}^{\pm }\), \(\hbox {D}^{0}/\bar{\text {D}}^{0}\), \({\hbox {D}}_{\textrm{s}}^{\pm }\) and \(\Lambda _{\textrm{c}}^{\pm }\) hadrons, are corrected to match those in Ref. [35]. In addition, the leptonic and hadronic decay branching fractions of those hadrons are corrected to agree with more recent measurements [36].

Generated events are processed through a full Geant4-based [37] CMS detector simulation and trigger emulation. Simulated events are reconstructed with the same algorithms used to reconstruct collision data.

The simulated samples incorporate additional \(\hbox {pp}\) interactions in the same or nearby bunch crossings (pileup) to reproduce the experimental conditions. Simulated events are weighted so the pileup distribution matches the experimental data.
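One common way to implement such a weighting is a per-bin ratio of normalized pileup distributions. The sketch below assumes this ratio method and uses hypothetical histogram contents; the text does not specify the exact procedure used in the analysis.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights that reshape the simulated pileup distribution.

    Both histograms are normalized to unit area first, so the weights
    change the shape of the simulation but not its overall yield.
    """
    if len(data_hist) != len(mc_hist):
        raise ValueError("histograms must share the same binning")
    data_total = sum(data_hist)
    mc_total = sum(mc_hist)
    weights = []
    for n_data, n_mc in zip(data_hist, mc_hist):
        if n_mc > 0:
            weights.append((n_data / data_total) / (n_mc / mc_total))
        else:
            # Empty simulated bins cannot be reweighted; assign zero weight.
            weights.append(0.0)
    return weights

# Hypothetical distributions of the number of pileup interactions
weights = pileup_weights([10, 30, 60], [25, 50, 25])
```

Each simulated event would then receive the weight of the bin corresponding to its generated number of pileup interactions.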

4 Object reconstruction

The global event reconstruction (also called particle-flow event reconstruction [38]) reconstructs and identifies each individual particle in an event, with an optimized combination of all subdetector information. In this process, the identification of the particle type (photon, electron, muon, charged or neutral hadron) plays an important role in the determination of the particle direction and energy. Photons are identified as ECAL energy clusters not linked to the extrapolation of any charged-particle trajectory. Electrons are identified as a primary charged-particle track together with potentially many ECAL energy clusters, corresponding to the extrapolation of this track to the ECAL and to possible bremsstrahlung photons emitted along the path through the tracker material. Muons are identified as tracks in the central tracker consistent with either a track or several hits in the muon system, and associated with calorimeter deposits compatible with the muon hypothesis. Charged hadrons are identified as charged-particle tracks identified neither as electrons nor as muons. Finally, neutral hadrons are identified as HCAL energy clusters not linked to any charged-hadron trajectory, or as a combined ECAL and HCAL energy excess with respect to the expected charged-hadron energy deposit.

The primary vertex (PV) is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [39].

The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the track momentum at the PV, the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons associated with the track. The energy of muons is obtained from the corresponding track momentum. The energy of charged hadrons is determined from a combination of the track momentum and the corresponding ECAL and HCAL energies, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.

The electron momentum is estimated by combining the energy measurement in the ECAL with the momentum measurement in the tracker. The momentum resolution for electrons with \(p_{\textrm{T}} \approx 45\,\text {Ge}\hspace{-.08em}\text {V} \) from \(\textrm{Z}\rightarrow {\textrm{ee}}\) decays ranges from 1.6 to 5.0%. It is generally better in the barrel region than in the endcaps, and also depends on the bremsstrahlung energy emitted by the electron as it traverses the material in front of the ECAL [40, 41]. Matching muons to tracks measured in the silicon tracker results in a \(p_{\textrm{T}} \) resolution of 1% in the barrel and 3% in the endcaps for muons with \(p_{\textrm{T}}\) up to 100\(\,\text {Ge}\hspace{-.08em}\text {V}\) [42].

For each event, hadronic jets are clustered from these reconstructed particles using the infrared- and collinear-safe anti-\(k_{\textrm{T}}\) algorithm [43, 44] with a distance parameter of 0.4. The jet momentum is determined as the vector sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5–10% of the true momentum over the entire \(p_{\textrm{T}}\) spectrum and detector acceptance. Pileup interactions can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded, and an offset correction is applied to correct for remaining contributions from neutral particles. Jet energy corrections are derived from simulation to bring the measured response of jets to that of particle level jets on average. In situ measurements of the momentum balance in dijet, \(\text {photon} + \text {jet}\), \(\textrm{Z}+ \text {jet}\), and multijet events are used to account for any residual differences in the jet energy scale (JES) between data and simulation [45]. The jet energy resolution (JER) amounts typically to 15–20% at 30\(\,\text {Ge}\hspace{-.08em}\text {V}\), 10% at 100\(\,\text {Ge}\hspace{-.08em}\text {V}\), and 5% at 1\(\,\text {Te}\hspace{-.08em}\text {V}\). Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures.

The missing transverse momentum vector \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) is the projection on the plane perpendicular to the beams of the negative vector momenta sum of all particles that are reconstructed with the particle-flow algorithm. The \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) is modified to account for corrections to the energy scale of the reconstructed jets in the event. The missing transverse momentum, \(p_{\textrm{T}} ^\text {miss} \), is defined as the magnitude of the \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) vector, and it is a measure of the transverse momentum of particles leaving the detector undetected [46].

The trigger, reconstruction, and selection efficiencies are corrected in simulations to match those observed in the data. Lepton efficiencies (\(\epsilon _{\ell }\)) are evaluated with data samples of dilepton events in the \(\textrm{Z}\) boson mass peak with the tag-and-probe method [47], and correction factors \(\epsilon _{\ell }^\text {data}/\epsilon _{\ell }^{\textrm{MC}}\), binned in \(p_{\textrm{T}} ^\ell \) and \(\eta ^\ell \) of the leptons, are implemented.
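Applying binned correction factors of this kind reduces to a two-dimensional table lookup. The sketch below is purely illustrative: the bin edges and the scale-factor values are hypothetical placeholders, not the factors used in the analysis, and leptons outside the table are left uncorrected by assumption.

```python
from bisect import bisect_right

# Hypothetical binning and data/MC efficiency ratios (placeholders only)
PT_EDGES = [35.0, 50.0, 100.0, 1000.0]   # GeV
ETA_EDGES = [-2.4, 0.0, 2.4]
SCALE_FACTORS = [                        # indexed as [pt bin][eta bin]
    [0.98, 0.97],
    [0.99, 0.98],
    [1.00, 0.99],
]

def lepton_scale_factor(pt, eta):
    """Look up the efficiency correction for a lepton with given pt and eta."""
    i = bisect_right(PT_EDGES, pt) - 1
    j = bisect_right(ETA_EDGES, eta) - 1
    in_range = 0 <= i < len(SCALE_FACTORS) and 0 <= j < len(SCALE_FACTORS[0])
    if not in_range:
        return 1.0  # assumption: no correction outside the tabulated range
    return SCALE_FACTORS[i][j]
```

A simulated event would be weighted by the product of the scale factors of its selected leptons.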

5 Event selection

Events with a high-\(p_{\textrm{T}} \) lepton from the decay of a W boson are selected online by a trigger algorithm that requires the presence of an electron (muon) candidate with minimum \(p_{\textrm{T}} \) of 27, 32, and 32 \(\text {Ge}\hspace{-.08em}\text {V} \) (24, 27, and 24 \(\text {Ge}\hspace{-.08em}\text {V} \)) during the 2016, 2017, and 2018 data-taking periods, respectively. Electrons and muons are selected using tight identification criteria following the reconstruction algorithms discussed in Refs. [40, 42]. The analysis follows the selection strategy used in Ref. [8] and requires the presence of a high-\(p_{\textrm{T}} \) isolated lepton in the region \(|\eta ^{\ell } | < 2.4\) and \(p_{\textrm{T}} ^{\ell } > 35\,\text {Ge}\hspace{-.08em}\text {V} \).

The combined isolation variable, \(I_{\text {comb}}\), quantifies additional hadronic activity around the selected leptons. It is defined as the sum of the transverse momenta of neutral hadrons, photons, and charged hadrons in a cone with \(\varDelta R = \sqrt{\smash [b]{(\varDelta \eta )^2 +(\varDelta \phi )^2}}<0.3\) (0.4) around the electron (muon) candidate, excluding the contribution from the lepton itself, where \(\phi \) is the azimuthal angle in radians. Only charged particles originating from the PV are included in the sum to minimize the contribution from pileup interactions. The contribution of neutral particles from pileup vertices is estimated and subtracted from \(I_{\text {comb}}\). For electrons, this contribution is evaluated with the jet area method described in Ref. [48]; for muons, it is assumed to be half the \(p_{\textrm{T}} \) sum of all charged particles in the cone originating from pileup vertices. The factor one-half accounts for the expected ratio of neutral to charged particle production in hadronic interactions. The lepton candidate is considered to be isolated if \(I_{\text {comb}}/p_{\textrm{T}} ^{\ell } < 0.15\). Events with an additional isolated lepton with \(p_{\textrm{T}} ^\ell >20\,\text {Ge}\hspace{-.08em}\text {V} \) are rejected to suppress the contribution from \(\textrm{Z}+\text {jets}\) and \(\hbox {t}\bar{{\hbox {t}}}\) events.
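For muons, the pileup-corrected isolation described above can be sketched as follows. The inputs are lists of particle \(p_{\textrm{T}} \) values inside the isolation cone; the function names are hypothetical, and clamping the corrected neutral term at zero is an assumption that is commonly made but not stated in the text.

```python
def muon_combined_isolation(charged_pv_pt, neutral_pt, photon_pt, charged_pu_pt):
    """Combined isolation with the neutral pileup estimated from charged pileup.

    The neutral pileup contribution is taken as half the pt sum of
    charged particles from pileup vertices, as described in the text.
    """
    neutral = sum(neutral_pt) + sum(photon_pt) - 0.5 * sum(charged_pu_pt)
    # Assumption: a negative corrected neutral sum is set to zero.
    return sum(charged_pv_pt) + max(0.0, neutral)

def is_isolated(i_comb, lepton_pt, threshold=0.15):
    """Relative isolation requirement I_comb / pt < 0.15."""
    return i_comb / lepton_pt < threshold

# Hypothetical cone contents (GeV)
i_comb = muon_combined_isolation([1.0, 0.5], [2.0], [1.0], [4.0])
```

With these inputs the neutral term is 3.0 minus 2.0, giving a combined isolation of 2.5 GeV, which passes the requirement for a 40 GeV muon.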

The transverse mass (\(m_\text {T} \)) of the lepton and \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) is defined as

$$\begin{aligned} m_\text {T} \equiv \sqrt{2~p_{\textrm{T}} ^\ell ~p_{\textrm{T}} ^\text {miss} ~[1-\cos (\phi _\ell -\phi _{p_{\textrm{T}} ^\text {miss}})]}, \end{aligned}$$

where \(\phi _\ell \) and \(\phi _{p_{\textrm{T}} ^\text {miss}}\) are the azimuthal angles of the lepton and the \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) vector. Events with \(m_\text {T} < 55\,\text {Ge}\hspace{-.08em}\text {V} \) are discarded from the analysis to suppress the contamination from events composed uniquely of jets produced through the strong interaction, referred to as QCD multijet events. The contribution of this background was evaluated with two methods: (i) using a QCD multijet simulation; and (ii) by means of data control regions, inverting the selection requirements in transverse mass and lepton isolation to infer the contribution in the signal region. The contamination after \(\text {OS-SS}\) subtraction is negligible.
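The transverse mass definition and the associated requirement translate directly into code. The following sketch uses hypothetical function names and takes momenta in GeV and angles in radians.

```python
import math

def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass of the lepton and missing transverse momentum."""
    return math.sqrt(
        2.0 * lepton_pt * met * (1.0 - math.cos(lepton_phi - met_phi))
    )

def passes_mt_requirement(mt, threshold=55.0):
    """Events with mT below the threshold are discarded."""
    return mt >= threshold
```

For a back-to-back configuration with \(p_{\textrm{T}} ^\ell = p_{\textrm{T}} ^\text {miss} = 40\,\text {GeV}\), the formula gives \(m_\text {T} = 80\,\text {GeV}\), close to the W boson mass. When the lepton and \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) are collinear, \(m_\text {T} \) vanishes.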

In addition to the requirements that select events with a W boson, we require the presence of at least one jet with \(p_{\textrm{T}} ^{\text {jet}}>30\,\text {Ge}\hspace{-.08em}\text {V} \) and \(|\eta ^{\text {jet}} |<2.4\). Jets with an angular separation between the jet axis and the selected isolated lepton \(\varDelta R ({\text {jet}},\ell )<0.4\) are not considered.
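A minimal sketch of this jet selection, including the removal of jets overlapping with the selected lepton, is given below. The dictionary-based event representation and the function names are assumptions made for illustration.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def select_jets(jets, lepton, min_pt=30.0, max_abs_eta=2.4, min_dr=0.4):
    """Keep jets in acceptance that do not overlap with the lepton."""
    return [
        jet for jet in jets
        if jet["pt"] > min_pt
        and abs(jet["eta"]) < max_abs_eta
        and delta_r(jet["eta"], jet["phi"], lepton["eta"], lepton["phi"]) >= min_dr
    ]

# Hypothetical event content
lepton = {"eta": 0.0, "phi": 0.0}
jets = [
    {"pt": 50.0, "eta": 0.1, "phi": 0.1},   # overlaps with the lepton
    {"pt": 45.0, "eta": 1.0, "phi": 2.0},   # kept
    {"pt": 20.0, "eta": -1.0, "phi": -2.0}, # below the pt threshold
    {"pt": 60.0, "eta": 3.0, "phi": 1.0},   # outside the eta acceptance
]
selected = select_jets(jets, lepton)
```

The wrap-around in `delta_phi` matters for objects on either side of \(\phi = \pm \pi \), which would otherwise appear far apart.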

5.1 Identification of charm jets 

Hadrons with b  and c  quark content decay through the weak interaction with lifetimes of the order of \(10^{-12}\,\hbox {s}\) and mean decay lengths larger than 100 \(\upmu \hbox {m}\) at the energies relevant for this analysis. Secondary vertices well separated from the PV can be identified and reconstructed from the charged particle tracks. In a sizeable fraction of the heavy-flavor hadron decays (\({\approx }\)10–15% [36]) there is a muon in the final state. We make use of these properties to define two independent data samples enriched with jets originating from a c  quark: (i) the semileptonic (SL) channel, where a muon coming from the semileptonic decay of a c  hadron is identified inside a jet; and (ii) the secondary vertex (SV) channel, where a displaced SV is reconstructed inside a jet. The charge of the c  quark is determined from the charge of the muon in the SL channel, and the charges of the SV tracks in the SV case, as described in more detail below.

If an event fulfills both the SL and SV selection requirements (about 6% of the selected events), it is assigned to the SL channel. Thus, the SL and SV channels are mutually exclusive, i.e., the samples selected in each channel are statistically independent.

These two signatures also feature weakly decaying b hadrons. Events from processes involving the associated production of W bosons and b quarks are abundantly selected in the two categories. The dominant background contribution stems from \(\hbox {t}\bar{{\hbox {t}}}\) production, where a pair of W bosons and two b jets are produced in the decays of the top quark-antiquark pair. This final state mimics the analysis topology when at least one of the W bosons decays leptonically and one of the b jets contains an identified muon or a reconstructed SV. However, this background is effectively suppressed by the \(\text {OS-SS}\) subtraction. A \(\hbox {t}\bar{{\hbox {t}}}\) event will be categorized as OS (SS) when the lepton from the W decay and the muon or SV from the b  quark are coming from the same (different) top quark. The probability of identifying a muon or an SV inside the b (or \(\bar{\text {b}}\)) jet with opposite or same charge as the charge of the W candidate is expected to be the same, thus producing an equal amount of OS and SS events.

Top quark-antiquark events where one of the W bosons decays hadronically into a \(\hbox {c}\bar{\text {s}}\) (or \(\bar{\text {c}}\hbox {s}\)) pair may result in additional event candidates if the SL or SV signature originates from the c  jet. This topology produces genuine OS events, which contribute to the remaining background contamination after \(\text {OS-SS}\) subtraction. Similarly, single top quark production also produces OS events, but at a lower level because of the smaller production cross section. These remaining background contributions after \(\text {OS-SS}\) subtraction are estimated with simulations and are subtracted in the cross section measurements.

The production of a W boson and a single bottom quark through the process \({\textrm{qg}} \rightarrow \textrm{W}+ \textrm{b}\), similar to the one sketched in Fig. 1, produces OS events, but it is heavily Cabibbo-suppressed and its contribution is negligible. The other source of a W boson and a b quark is \(\textrm{W}+ \textrm{b}\bar{\textrm{b}}\) events where the \(\hbox {b}\bar{\text {b}}\) pair originates from a gluon splitting mechanism. These events are also charge-symmetric, since it is equally likely to identify the b jet with the same charge as that of the W boson or with the opposite charge. This contribution also cancels out after the \(\text {OS-SS}\) subtraction. The same argument applies to \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) events.

5.1.1 Event selection in the SL channel

The \(\textrm{W}+\textrm{c}\) events with a semileptonic c  quark decay are selected by requiring a reconstructed muon among the constituents of any of the selected jets. Semileptonic c  quark decays into electrons are not considered because of a high background in identifying electrons inside jets. The muon candidate must satisfy the same reconstruction and identification quality criteria as those imposed on the muons from the W boson decay, except for isolation, and must be reconstructed in the region \(|\eta | < 2.4\), with \(p_{\textrm{T}} ^{\upmu }<25\,\text {Ge}\hspace{-.08em}\text {V} \) and \(p_{\textrm{T}} ^{\upmu }/p_{\textrm{T}} ^{\text {jet}}<0.6\). The \(p_{\textrm{T}} \) requirements reduce the contamination from prompt muons overlapping with or misreconstructed as jets. No minimum \(p_{\textrm{T}} \) threshold is explicitly required, but the muon reconstruction algorithm sets a natural threshold of around 3 (2)\(\,\text {Ge}\hspace{-.08em}\text {V}\) in the barrel (endcap) region since the muon must traverse the material in front of the muon detector and penetrate deep enough into the muon system to be reconstructed and satisfy the identification criteria. If more than one such muon is identified, the one with the highest \(p_{\textrm{T}} \) is selected.

Additional requirements are applied for the event selection in the \(\hbox {W}\rightarrow \upmu \upnu \) channel, since the selected sample is affected by a sizeable contamination from dimuon \(\textrm{Z}+\text {jets}\) events, where one of the muons from the \(\textrm{Z}\) decay is reconstructed inside a jet. The track of the muon coming from a semileptonic decay of a charm hadron tends to have a considerable transverse impact parameter with respect to the PV. We require the transverse impact parameter significance (IPS) of the muon in the jet, defined as the muon transverse impact parameter divided by its uncertainty, to be larger than 2. In addition, events with a dimuon invariant mass close to the \(\textrm{Z}\) boson mass peak (\(70<m_{{\upmu \upmu }}<110\,\text {Ge}\hspace{-.08em}\text {V} \)) are discarded. Furthermore, \(m_{{\upmu \upmu }}\) must be larger than \(12\,\text {Ge}\hspace{-.08em}\text {V} \) to suppress the background from low-mass resonances.
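The additional requirements for the \(\hbox {W}\rightarrow \upmu \upnu \) channel can be summarized in a short sketch. The dimuon mass is computed here in the massless approximation, which is an assumption adequate for muons at these momenta; the function names are hypothetical.

```python
import math

def dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two muons, neglecting the muon mass."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(0.0, m2))

def passes_w_mu_sl_requirements(ips, m_mumu):
    """IPS > 2, low-mass veto, and Z boson mass window veto (GeV)."""
    in_z_window = 70.0 < m_mumu < 110.0
    return ips > 2.0 and m_mumu > 12.0 and not in_z_window
```

Two back-to-back central muons of 45.6 GeV each give a mass of 91.2 GeV, which falls inside the vetoed Z boson window regardless of the IPS value.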

Table 1 Data and background event yields (with statistical uncertainties) after selection and \(\text {OS-SS}\) subtraction for the SL channels (electron and muon W decay modes)
Table 2 Simulated signal and background composition (in percentage) of the SL sample after selection and \(\text {OS-SS}\) subtraction. The \(\hbox {W} + \hbox {Q}\bar{\hbox {Q}} \) stands for the sum of the contributions of \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) and \(\textrm{W}+ \textrm{b}\bar{\textrm{b}}\)

The normalizations of the \(\hbox {t}\bar{{\hbox {t}}}\) and \(\textrm{Z}+\text {jets}\) backgrounds are derived from data control samples. A \(\textrm{Z}+\text {jets}\) data control sample is defined using the same selection criteria as the analysis but inverting the \(m_{{\upmu \upmu }}\) requirement to select events close to the \(\textrm{Z}\) boson mass peak (\(70<m_{{\upmu \upmu }}<110\,\text {Ge}\hspace{-.08em}\text {V} \)). A normalization factor of \(1.08\pm 0.01\) is required to match the \(\textrm{Z}+\text {jets}\) simulation with data. The \(\hbox {t}\bar{{\hbox {t}}}\) data control sample is established by selecting events with the same requirements as the analysis and additionally demanding at least three high-\(p_{\textrm{T}} \) jets, two of which are tagged as \({\textrm{b}}\) jets (using the loose working point of the DeepCSV b-tagging algorithm [49]), with the remaining jet containing a muon. A normalization factor of \(0.92\pm 0.02\) is required to bring the \(\hbox {t}\bar{{\hbox {t}}}\) simulation into agreement with the data. The uncertainty in the background normalization factors reflects the statistical uncertainty of the data and the simulations in the control samples. Once the absolute normalizations of the \(\textrm{Z}+\text {jets}\) and \(\hbox {t}\bar{{\hbox {t}}}\) background contributions are determined, the \(\textrm{W}+\text {jets}\) simulation is scaled so that the sum of the events from all predicted contributions is equal to the number of events in the selected data sample. The normalization factor of the \(\textrm{W}+\text {jets}\) simulation (0.95) has only a minor effect on the contribution of the (small) predicted \(\textrm{W}+ {\textrm{udsg}} \) background. The overall normalization of the \(\textrm{W}+\textrm{c}\) signal simulation is irrelevant for the analysis, since it is only used for acceptance and efficiency calculations.
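The scaling of the \(\textrm{W}+\text {jets}\) simulation follows from requiring that the total prediction equals the data yield. The sketch below illustrates the arithmetic only: the event yields are hypothetical, the function name is an assumption, and the default factors are the \(\hbox {t}\bar{{\hbox {t}}}\) and \(\textrm{Z}+\text {jets}\) normalizations quoted in the text.

```python
def w_jets_scale(n_data, n_w_jets, n_ttbar, n_z_jets, n_other,
                 k_ttbar=0.92, k_z_jets=1.08):
    """Scale factor for W+jets so that the total prediction matches data.

    Solves n_data = k * n_w_jets + k_ttbar * n_ttbar
                    + k_z_jets * n_z_jets + n_other for k.
    """
    fixed = k_ttbar * n_ttbar + k_z_jets * n_z_jets + n_other
    return (n_data - fixed) / n_w_jets

# Hypothetical yields chosen for illustration only
k = w_jets_scale(n_data=11000.0, n_w_jets=10000.0,
                 n_ttbar=1000.0, n_z_jets=500.0, n_other=40.0)
```

With these toy yields the fixed backgrounds sum to 1500 events and the resulting factor is 0.95.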

Events are classified as OS or SS depending on the electric charges of the lepton from the W boson decay and the muon inside the jet.

Table 1 shows the event yields in the \(\hbox {W}\rightarrow \hbox {e}\upnu \) and \(\hbox {W}\rightarrow \upmu \upnu \) channels after the selection requirements described above and after \(\text {OS-SS}\) subtraction. The \(\hbox {W}\rightarrow \upmu \upnu \) channel has a significantly lower yield due to the additional requirements to reduce the sizeable Z+jets background. The background yields, as estimated with the simulations, are also included in the table. The signal and background composition of the selected sample according to simulation is shown in Table 2. The fraction of signal \(\textrm{W}+\textrm{c}\) events in the \(\hbox {W}\rightarrow \hbox {e}\upnu \) channel is above 80%, whereas in the \(\hbox {W}\rightarrow \upmu \upnu \) channel it drops to 74% because of the additional \(\textrm{Z}+\text {jets}\) background (around 6%). The dominant background, \(\hbox {t}\bar{{\hbox {t}}}\) production, where one of the W bosons decays leptonically and the other hadronically with a charm quark in the final state, amounts to approximately 10%.

Figure 2 shows the \(p_{\textrm{T}} \) distributions of the muon inside the jet (upper) and of the lepton from the W decay (lower), for events in the selected SL sample, after the background normalization corrections described above. The simulations agree with the data within uncertainties.

5.1.2 Event selection in the SV channel

An independent \(\textrm{W}+\textrm{c}\) sample is selected by looking for secondary decay vertices of charmed hadrons within the reconstructed jets. Displaced SVs are reconstructed with either the simple secondary vertex (SSV) [50] or the inclusive vertex finder (IVF) [51, 52] algorithms. Both algorithms follow the adaptive vertex fitter technique [53] to construct an SV, but differ in the track selection used. The SSV algorithm takes as input the tracks constituting the jet, whereas the IVF algorithm starts from a displaced track with respect to the PV (seed track) and tries to build a vertex from nearby tracks in terms of their separation distance in three dimensions and their angular separation around the seed track. The IVF vertices are then associated to the closest jet in a cone of \(\varDelta R=0.3\). Both SSV and IVF vertices always start with input tracks with a minimum \(p_{\textrm{T}} \) of \(1\,\text {Ge}\hspace{-.08em}\text {V} \) to minimize the effects from poorly reconstructed tracks. Vertices reconstructed with the IVF algorithm are considered first. If no IVF vertex is found, SSV vertices are searched for, thus providing additional event candidates (about 3%). If more than one SV is reconstructed within a jet, the one with the highest \(p_{\textrm{T}} \), computed from its associated tracks, is selected. If there are several jets with an SV, only the SV associated to the jet of highest \(p_{\textrm{T}} \) is selected.

At least three tracks must be associated with an SV for it to be considered. This requirement largely reduces the contamination of jets coming from the hadronization of light-flavor quarks (\(\textrm{u}\), \(\textrm{d}\), and \(\textrm{s}\)) or gluons. It also reduces the systematic uncertainty associated with the SV reconstruction efficiency. To ensure that the SV is well separated from the PV, we require the displacement significance, defined as the three dimensional distance between the PV and SV, divided by its uncertainty, to be larger than 8. This requirement suppresses the \(\textrm{W}+ {\textrm{udsg}} \) background contribution below 1%.
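The SV selection logic described above can be sketched as follows. This is an illustration under stated assumptions, not the reconstruction software: the `SecondaryVertex` and `Jet` classes are hypothetical containers, the fallback from IVF to SSV is assumed to be decided jet by jet, and the quality requirements are assumed to be applied before the highest-\(p_{\textrm{T}}\) choice.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SecondaryVertex:
    pt: float            # GeV, computed from the associated tracks
    n_tracks: int        # number of tracks associated with the SV
    significance: float  # 3D PV-SV distance divided by its uncertainty

@dataclass
class Jet:
    pt: float
    ivf_vertices: List[SecondaryVertex] = field(default_factory=list)
    ssv_vertices: List[SecondaryVertex] = field(default_factory=list)

MIN_TRACKS = 3          # at least three tracks per SV
MIN_SIGNIFICANCE = 8.0  # displacement significance larger than 8

def passes_quality(sv: SecondaryVertex) -> bool:
    return sv.n_tracks >= MIN_TRACKS and sv.significance > MIN_SIGNIFICANCE

def best_sv_in_jet(jet: Jet) -> Optional[SecondaryVertex]:
    """IVF vertices are considered first; SSV vertices only if no IVF vertex exists."""
    candidates = jet.ivf_vertices or jet.ssv_vertices
    good = [sv for sv in candidates if passes_quality(sv)]
    # If several SVs remain in the jet, keep the one with the highest pT.
    return max(good, key=lambda sv: sv.pt) if good else None

def select_event_sv(jets: List[Jet]) -> Optional[SecondaryVertex]:
    """Return the SV of the highest-pT jet that has a selected SV, or None."""
    for jet in sorted(jets, key=lambda j: j.pt, reverse=True):
        sv = best_sv_in_jet(jet)
        if sv is not None:
            return sv
    return None
```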

Fig. 2

Distributions after OS-SS subtraction of the \(p_{\textrm{T}} \) of the muon inside the \(\textrm{c}\text { jet} \) (upper) and the \(p_{\textrm{T}} \) of the lepton from the W decay (lower) for events in the SL sample, summing up the contributions of the W boson decay channels to electrons and muons. The contributions of the various processes are estimated with the simulated samples. The statistical uncertainty in the data is smaller than the size of the data dots. The hatched areas represent the sum in quadrature of statistical and systematic uncertainties in the MC simulation. The ratio of data to simulation is shown in the lower panels. The uncertainty band in the ratio includes the statistical uncertainty in the data, and the statistical and systematic uncertainties in the MC simulation

To classify the event as OS or SS, we measure the sign of the charge of the charm quark produced in the hard interaction. For charged charm hadrons, the sum of the charges of the decay products reflects the charge of the c quark. For neutral charm hadrons, the charge of the closest hadron produced in the fragmentation process can indicate the charge of the c quark [54, 55]. Hence, we assign a charge equal to the sum of the charges of the particle tracks associated with the SV. If the SV charge is zero, we assign the charge of the PV track that is closest in angular separation to the SV. For this purpose, we only consider PV tracks with \(p_{\textrm{T}} >0.3\,\text {Ge}\hspace{-.08em}\text {V} \) and within an angular separation from the SV direction of 0.1 in the \((\eta , \phi )\) space. If a nonzero charge cannot be assigned, the event is rejected. According to the simulation, the charge assignment procedure provides a nonzero charge for 99% of the selected SVs, and the sign of the charge is correctly assigned in 83% of the cases.
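A minimal sketch of this charge assignment is given here for illustration. The function name, the argument layout, and the treatment of the 0.1 boundary as a strict inequality are assumptions of this example, not details taken from the analysis.

```python
def sv_charge(sv_track_charges, pv_tracks, min_pt=0.3, max_dr=0.1):
    """Assign a charge to a secondary vertex.

    sv_track_charges: list of +1/-1 charges of the tracks associated with the SV.
    pv_tracks: list of (charge, pt_in_GeV, delta_r_to_sv_direction) tuples
               for tracks associated with the primary vertex.
    Returns a nonzero integer charge, or None if no charge can be assigned
    (in which case the event would be rejected).
    """
    total = sum(sv_track_charges)
    if total != 0:
        return total
    # Fall back to the closest PV track passing the pT and angular requirements.
    nearby = [(dr, q) for q, pt, dr in pv_tracks
              if pt > min_pt and dr < max_dr and q != 0]
    if not nearby:
        return None
    return min(nearby)[1]
```

Only the sign of the returned value enters the OS or SS classification.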

The modeling of the SV charge assignment in the simulation has been validated with data. Events passing both the SL and SV selection criteria are used to compare the charges of the muon inside the jet and the SV. In 95% of these events the charges agree. The difference in the charge assignment efficiency between data and simulation, around 1%, is taken as a systematic uncertainty in the cross section measurements, as detailed in Sect. 6.

The SV reconstruction efficiency in the simulation is calibrated using data. The events of the SL sample are used to compute data-to-simulation scale factors for the efficiency of charm identification through the reconstruction of an SV [5, 49]. The fraction of events in the SL sample with an SV is computed for data and simulation, and the ratio of data to simulation is applied as a scale factor to the simulated \(\textrm{W}+\textrm{c}\) signal events in the SV sample. The calculated scale factor is \(0.93 \pm 0.03\), where the uncertainty accounts for statistical and systematic effects. The systematic uncertainty includes contributions from uncertainties in the pileup description, JES and JER, lepton efficiencies, background subtraction, and modeling of charm production and decay fractions in the simulation.
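The scale factor is a ratio of two fractions, as in the sketch that follows. The function name is hypothetical and the event counts are invented toy numbers, chosen only so that the result matches the quoted central value of 0.93.

```python
def sv_efficiency_scale_factor(n_sl_data, n_sl_sv_data, n_sl_sim, n_sl_sv_sim):
    """Data-to-simulation scale factor for the SV identification efficiency.

    n_sl_*:    yields in the SL sample.
    n_sl_sv_*: yields in the SL sample that also contain a selected SV.
    """
    fraction_data = n_sl_sv_data / n_sl_data
    fraction_sim = n_sl_sv_sim / n_sl_sim
    return fraction_data / fraction_sim

# Toy yields (hypothetical): 18.6% of SL events have an SV in data, 20% in simulation.
scale_factor = sv_efficiency_scale_factor(10000.0, 1860.0, 10000.0, 2000.0)
```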

Table 3 shows the event yields in the \(\hbox {W}\rightarrow \hbox {e}\upnu \) and \(\hbox {W}\rightarrow \upmu \upnu \) channels after the selection requirements and \(\text {OS-SS} \) subtraction. The background yields, as estimated with the simulations, are also included. The contributions of the backgrounds were rescaled using the normalization factors described in Sect. 5.1.1. The signal and background composition of the selected sample, as predicted by the simulation, is shown in Table 4. The purity of signal \(\textrm{W}+\textrm{c}\) events is above 80%. The dominant backgrounds come from \(\hbox {t}\bar{{\hbox {t}}}\) (8%) and single top (9%) production.

Table 3 Data and background event yields (with statistical uncertainties) after selection and \(\text {OS-SS}\) subtraction for the SV channels (electron and muon W decay modes)
Table 4 Simulated signal and background composition (in percentage) of the SV sample after selection and \(\text {OS-SS}\) subtraction. The \(\hbox {W} + \hbox {Q}\bar{\hbox {Q}} \) stands for the sum of the contributions of \(\textrm{W}+ \textrm{c}{\bar{\textrm{c}}}\) and \(\textrm{W}+ \textrm{b}\bar{\textrm{b}}\)

The event selection requirements are summarized in Table 5 for the four selection channels of the analysis, defined by the two W boson decay channels (electron and muon) and the two charm identification channels (SL and SV).

Table 5 Summary of the selection requirements for the four selection channels of the analysis

Figure 3 shows the distributions, after \(\text {OS-SS}\) subtraction, of the corrected SV mass and the SV transverse momentum divided by the jet transverse momentum, \(p_{\textrm{T}} ^{\text {SV}}\)/\(p_{\textrm{T}} ^{\text {jet}}\). The latter is a representative observable of the energy fraction of the charm quark carried by the charm hadron in the fragmentation process. We define the corrected SV mass, \(m_\text {SV}^\text {corr}\), as the invariant mass of all charged particles associated with the SV, assumed to be pions, \(m_\text {SV}\), corrected for additional particles, either charged or neutral, that may have been produced but were not reconstructed [56]:

$$\begin{aligned} m_\text {SV}^\text {corr} = \sqrt{m^2_\text {SV} + p^2_\text {SV} \sin ^2 \theta } + p_\text {SV} \sin \theta , \end{aligned}$$

where \(p_\text {SV}\) is the modulus of the vectorial sum of the momenta of all charged particles associated with the SV, and \(\theta \) is the angle between the momentum vector sum and the vector from the PV to the SV. The corrected SV mass is thus the minimum mass the long-lived hadron can have that is consistent with the direction of its momentum.
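The corrected mass formula translates directly into a short function. This is a minimal sketch: the function name and the choice of GeV and radians as units are assumptions of the example.

```python
import math

def corrected_sv_mass(m_sv, p_sv, theta):
    """Corrected SV mass.

    m_sv:  invariant mass of the charged particles at the SV (GeV).
    p_sv:  modulus of their vectorial momentum sum (GeV).
    theta: angle (radians) between the momentum sum and the PV-to-SV vector.
    """
    # Momentum component transverse to the flight direction of the hadron.
    p_perp = p_sv * math.sin(theta)
    return math.sqrt(m_sv**2 + p_perp**2) + p_perp
```

When the momentum sum points exactly along the flight direction (\(\theta = 0\)), no momentum is missing in the transverse direction and the correction vanishes, so the function returns \(m_\text {SV}\).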

The normalization of the single top quark background is fixed with data. Single top quark events populate the tail of the \( m_\text {SV}^\text {corr}\) distribution. A normalization factor of \(1.5\pm 0.2\) for the single top quark contribution was required to match data and simulation predictions. The same rescaling is applied to the SL and SV samples.

Fig. 3

Distributions after \(\text {OS-SS}\) subtraction of the corrected SV mass (upper) and SV transverse momentum divided by the jet transverse momentum (lower) for events in the SV sample, summing up the contributions of the W boson decay channels to electrons and muons. The contributions from all processes are estimated with the simulated samples. The statistical uncertainty in the data is smaller than the size of the data dots for most of the data points. The hatched areas represent the sum in quadrature of statistical and systematic uncertainties in the MC simulation. The ratio of data to simulation is shown in the lower panels. The uncertainty band in the ratio includes the statistical uncertainty in the data, and the statistical and systematic uncertainties in the MC simulation

6 Systematic uncertainties

The impact of various sources of uncertainty in the measurements presented in Sect. 7 is estimated by recalculating the cross sections with the relevant parameters varied up and down by one standard deviation of their uncertainties.

The combined uncertainty in the trigger, reconstruction, and identification efficiencies for isolated leptons results in an uncertainty in the cross section measurements of about 2% (1%) for the \(\hbox {W}\rightarrow \hbox {e}\upnu \) (\(\hbox {W}\rightarrow \upmu \upnu \)) channel. The uncertainty in the identification efficiency of nonisolated muons inside jets is approximately 3%, according to dedicated studies with \(\textrm{Z}+\text {jets}\) events. This uncertainty affects only the SL channel.

The effects of the uncertainty in the JES and JER are assessed by varying up and down the \(p_{\textrm{T}} \) values of jets with the corresponding uncertainty factors. The JES and JER uncertainties are also propagated to \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \). The resulting uncertainty in the cross section is about 2% (1%) for the SL (SV) channel. The uncertainty from a \({\vec p}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}} \) mismeasurement in the event is estimated by varying within its uncertainty the contribution of the energy unassociated with reconstructed particle-flow objects. The effect in the cross section measurement is \({<}0.5\%\). Uncertainties in the pileup modeling are calculated using a modified pileup profile obtained by changing the mean number of interactions by \({\approx }5\%\). This variation covers the uncertainty in the \(\hbox {pp}\) inelastic cross section [57] and in the modeling of the pileup simulation. It results in less than 0.5% uncertainty in the cross section measurements.

The integrated luminosities of the 2016, 2017, and 2018 data-taking periods are individually known with uncertainties in the 1.2–2.5% range [58,59,60], whereas the total 2016–2018 integrated luminosity has an uncertainty of 1.6%. The improvement in precision arises from the (uncorrelated) time evolution of some systematic effects.

The uncertainty in the scale factor correcting the SV reconstruction efficiency in the simulation propagates into a systematic uncertainty of 3% in the cross section. The uncertainty in the SV charge determination is estimated as the difference (1%) in the rate obtained in data and simulation of correct SV charge assignment in the validation test described in Sect. 5.1.2.

Because of the dependence of the SV reconstruction efficiency on the SV displacement, we have evaluated the effect produced by an imperfect modeling of this observable by reweighting the SV displacement significance distribution of the simulation to match that of the data. The resulting uncertainty in the cross section measurement is 1–2%. In addition, the stability of the results with the minimum SV displacement significance requirement was checked by changing the threshold from 8 to 7. The effect in the results is also at the 1% level.

The background contributions are evaluated with the simulations validated in data control samples, as discussed in Sect. 5.1.1. The uncertainty in the predicted background levels has an effect of 1% in the cross section measurements.

The signal samples used for the acceptance and efficiency calculations were generated with MadGraph+ pythia8 using the NNPDF3.0 and NNPDF3.1 PDF sets. The envelope of the systematic variations (replicas) of the nominal PDF is assumed to be the systematic uncertainty due to an imperfect knowledge of the PDFs, as recommended in Ref. [61]. The effect is approximately \(1\%\). The statistical uncertainty in the determination of the selection efficiency using the simulated samples is 1%, and is propagated as an additional systematic uncertainty.

In the signal and background modeling, no uncertainty is included in the simulation of higher-order terms in perturbative QCD (parton shower). The \(\text {OS-SS}\) subtraction technique removes the contribution to \(\textrm{W}+\textrm{c}\) production coming from charm quark-antiquark pair production, rendering the measurement insensitive to this effect. The uncertainty in the modeling of the hard process in the signal simulation is assessed by independently changing the QCD factorization and renormalization scales by factors of 0.5 and 2 relative to the nominal value. The resulting uncertainty in the cross section measurement is negligible.

To estimate the effect produced by the uncertainties in the corrected values used in the simulation for the charm fragmentation and decay branching fractions [35, 36], we have varied those values within their uncertainties. The impact in the cross section measurements is 1–2%, both for the fragmentation and decay branching fractions.

The main systematic uncertainties are summarized in Table 6 for the four selection channels of the analysis. Overall, the total systematic uncertainty in the \(\textrm{W}+\textrm{c}\) fiducial cross section is approximately \(5\%\) in all channels.
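As a rough cross-check of the quoted total, the individual contributions given in the text can be added in quadrature. The sketch assumes that the sources are uncorrelated, which the text does not state explicitly. It uses the SL electron channel values, and where the text gives a range or an upper bound the number chosen is an assumption of this example.

```python
import math

# Per-source uncertainties in percent, as quoted in the text for the SL electron channel.
sources = {
    "lepton efficiency": 2.0,
    "muon-in-jet identification": 3.0,
    "JES and JER": 2.0,
    "missing pT": 0.5,             # quoted as < 0.5 (assumed value)
    "pileup": 0.5,                 # quoted as < 0.5 (assumed value)
    "luminosity": 1.6,
    "background": 1.0,
    "PDF": 1.0,
    "simulated sample size": 1.0,
    "charm fragmentation": 1.5,    # quoted as 1-2 (assumed midpoint)
    "charm decay fractions": 1.5,  # quoted as 1-2 (assumed midpoint)
}

def quadrature_sum(values):
    """Combine uncorrelated uncertainties by summing them in quadrature."""
    return math.sqrt(sum(v * v for v in values))

total_percent = quadrature_sum(sources.values())
```

With these inputs the total comes out slightly above 5%, in line with the approximate value quoted above.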

Table 6 Summary of the main systematic uncertainties, in percentage of the measured fiducial cross section, for the four selection channels of the analysis

7 Cross section measurements

The \(\textrm{W}+\textrm{c}\) production cross section measurements are restricted to a phase space region that is close to the experimental fiducial volume with optimized sensitivity for the signal process. Cross sections are measured inclusively within the fiducial phase space region and differentially as a function of \(p_{\textrm{T}} ^{\ell }\) and \(|\eta ^{\ell } |\). Cross section measurements are performed independently in four channels, defined by the two charm identification methods (SL and SV) and the two W boson decay channels (electron and muon). The four measurements are combined to improve the precision.

Measurements are unfolded to the particle and parton levels. At both levels, the fiducial region is defined by a lepton at the generator level coming from the decay of a W boson with \(p_{\textrm{T}} ^{\ell }>35\,\text {Ge}\hspace{-.08em}\text {V} \) and \(|\eta ^{\ell } | < 2.4\), together with a generator-level c  jet with \(p_{\textrm{T}} ^{\textrm{c}\text { jet}} > 30\,\text {Ge}\hspace{-.08em}\text {V} \) and \(|\eta ^{\textrm{c}\text { jet}} | < 2.4\). The \(\text {OS-SS}\) subtraction is also applied at generator level. This removes the charge-symmetric contributions that mostly originate from gluon splitting into a charm quark-antiquark pair. The c  jet must be well separated from the lepton by an angular distance \(\varDelta R (\textrm{c}\text { jet},\ell )>0.4\). Jets at the generator level are clustered using the anti-\(k_{\textrm{T}}\) jet algorithm with a distance parameter \(R = 0.4\). At the particle level, jets are formed using generator particles produced after the hadronization process. At the parton level, jets are constructed from the hard interaction partons.

For all channels under study, the \(\textrm{W}+\textrm{c}\) cross section is determined using the following expression:

$$\begin{aligned} \sigma (\textrm{W}+\textrm{c})= \frac{Y_{\text {sel}}-Y_{\text {bkg}}}{\mathcal {C} \, \mathcal {L}}, \end{aligned} \tag{1}$$

where \(Y_{\text {sel}}\) is the selected \(\text {OS-SS}\) event yield, and \(Y_{\text {bkg}}\) the background yield in data after \(\text {OS-SS}\) subtraction, estimated from simulation and normalized using the data control samples described in Sect. 5.1. \(\mathcal {L}\) is the integrated luminosity of the data sample.

The factor \(\mathcal {C}\) corrects for acceptance and efficiency losses in the selection process of \(\textrm{W}+\textrm{c}\) events produced in the fiducial region at the generator level. It also subtracts the contributions from \(\textrm{W}+\textrm{c}\) events outside the kinematic region of the measurements and from \(\textrm{W}+\textrm{c}\) events with \(\hbox {W}\rightarrow \uptau \upnu \), \(\uptau \rightarrow \textrm{e}+ X\) or \(\uptau \rightarrow \upmu + X\). It is calculated, using the sample of simulated signal events, as the ratio between the event yield of the selected \(\textrm{W}+\textrm{c}\) sample (according to the procedure described in Sects. 5.1.1 and 5.1.2 and after \(\text {OS-SS}\) subtraction) and the number of \(\text {OS-SS}\) \(\textrm{W}+\textrm{c}\) events satisfying the phase space definition at the generator level. Independent correction factors \(\mathcal {C}\) are computed at the particle and parton levels, and for the four selection channels.
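Equation (1) and the definition of \(\mathcal {C}\) can be written as two small functions. This is a sketch with hypothetical function names and invented toy numbers, not the values of the analysis.

```python
def correction_factor(n_selected_os_ss, n_generated_os_ss):
    """C: selected OS-SS signal yield in simulation divided by the number of
    OS-SS signal events in the fiducial region at the generator level."""
    return n_selected_os_ss / n_generated_os_ss

def cross_section(y_sel, y_bkg, c_factor, luminosity):
    """Eq. (1): (Y_sel - Y_bkg) / (C * L).

    The result is in pb if the luminosity is given in pb^-1.
    """
    return (y_sel - y_bkg) / (c_factor * luminosity)

# Toy inputs (hypothetical): C = 2%, 100 pb^-1 of data.
c_toy = correction_factor(2000.0, 100000.0)
sigma_toy = cross_section(12000.0, 2000.0, c_toy, 100.0)
```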

7.1 Measurements at the particle level

Cross section measurements, unfolded to the particle level, are presented in this section. The fiducial \(\textrm{W}+\textrm{c}\) production cross section measurements computed with Eq. (1) for the four channels separately are shown in Table 7, together with the event yields and the \(\mathcal {C}\) correction factors. The different \(\mathcal {C}\) values reflect the different reconstruction and selection efficiencies in the four channels. In the SL channel, less than 5% of the signal charm hadrons generated in the fiducial region of the analysis produce a muon in their decay with enough momentum to reach the muon detector and get reconstructed. Similarly, in the SV channel, less than 5% of the events with a charm hadron decay remain after SV reconstruction, SV charge assignment, and \(\text {OS-SS}\) subtraction. The remaining inefficiency, accounted for in the \(\mathcal {C}\) correction factor, is due to the selection requirements of the samples.

Table 7 Measured production cross sections \(\sigma (\textrm{W}+\textrm{c})\) unfolded to the particle level in the four selection channels together with statistical (first) and systematic (second) uncertainties. The acceptance times efficiency values (\(\mathcal {C}\)) are also given

Results obtained for the \(\textrm{W}+\textrm{c}\) cross sections in the four different channels are consistent within the uncertainties, and are combined using the best linear unbiased estimator method [62] that takes into account individual uncertainties and their correlations. Systematic uncertainties arising from a common source and affecting several measurements are considered as fully correlated. In particular, all systematic uncertainties are assumed fully correlated between the electron and muon channels, except those related to the lepton reconstruction. The \(\chi ^2\) of the combination is 4.8 (three degrees of freedom), corresponding to a p-value of 0.19. The combined measured cross section unfolded to the particle level is:

$$\begin{aligned} \begin{aligned} \sigma (\hbox {W}+\hbox {c})&= 148.7 \pm 0.4 \,\text {(stat)} \pm 5.6 \,\text {(syst)} \,\hbox {pb}. \end{aligned} \end{aligned}$$
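The p-value quoted for the combination can be cross-checked from the \(\chi ^2\) value alone. The sketch uses the closed-form upper-tail probability of a \(\chi ^2\) distribution with three degrees of freedom; the function name is an assumption of this example.

```python
import math

def chi2_pvalue_3dof(chi2):
    """Upper-tail probability of a chi-square distribution with 3 degrees of freedom.

    Closed form for 3 degrees of freedom:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return (math.erfc(math.sqrt(chi2 / 2.0))
            + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0))

p_value = chi2_pvalue_3dof(4.8)
print(f"{p_value:.2f}")  # prints 0.19
```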

Measurements are compared with the predictions of the MadGraph5_amc@nlo MC generator, as shown in Fig. 4. In the predictions, two different NNPDF PDF sets (versions 3.0 and 3.1) are used. The two predictions differ as well in the tune used in pythia8 for the parton showering, hadronization, and underlying event modeling (CUETP8M1 and CP5). The predicted cross sections are about 10% (using NLO NNPDF3.0) and 20% (NNLO NNPDF3.1) higher than the measured value, with relative uncertainties close to 10%. The uncertainty associated with the MC predictions includes the uncertainties associated with the renormalization and factorization scales, as well as the uncertainty related to the PDFs used in the simulation. The scale uncertainties are estimated using a set of weights provided by the generator that corresponds to independent variations of the scales by factors of 0.5, 1, and 2. The prediction is obtained for all combinations (excluding the cases where one scale is reduced and the other is increased at the same time) and their envelope is quoted as the uncertainty. The uncertainty in the PDFs is estimated using different Hessian eigenvectors of each PDF set.
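The scale variation procedure amounts to evaluating seven of the nine possible combinations of scale factors and taking the envelope. The sketch illustrates it with a hypothetical `predict` callable standing in for the generator weights; the toy scale dependence is invented for the example.

```python
from itertools import product

def scale_combinations(factors=(0.5, 1.0, 2.0)):
    """All (renormalization, factorization) factor pairs, excluding the two
    cases where one scale is halved and the other doubled."""
    return [(r, f) for r, f in product(factors, repeat=2)
            if max(r, f) / min(r, f) < 4.0]

def scale_envelope(predict):
    """predict: callable (mu_r_factor, mu_f_factor) -> cross section.

    Returns the nominal prediction and the upward and downward envelope widths.
    """
    nominal = predict(1.0, 1.0)
    values = [predict(r, f) for r, f in scale_combinations()]
    return nominal, max(values) - nominal, nominal - min(values)

# Toy scale dependence (hypothetical), for illustration only.
def toy_prediction(mu_r, mu_f):
    return 100.0 * mu_r ** -0.1 * mu_f ** 0.05

nominal, up, down = scale_envelope(toy_prediction)
```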

Fig. 4

Comparison of the measured fiducial \(\sigma (\textrm{W}+\textrm{c})\) cross section unfolded to the particle level with the predictions from the MadGraph5_amc@nlo simulation using two different PDF sets (NLO NNPDF3.0 and NNLO NNPDF3.1). Two different tunes (CUETP8M1 and CP5) for the parton showering, hadronization and underlying event modeling in pythia8 are also used. Horizontal error bars indicate the total uncertainty in the predictions

The \(\sigma (\textrm{W}+\textrm{c})\) production cross section is also measured differentially as a function of \(|\eta ^\ell |\) and \(p_{\textrm{T}} ^\ell \). The total sample is divided into subsamples according to the value of \(|\eta ^\ell |\) or \(p_{\textrm{T}} ^\ell \), and the cross section is computed using Eq. (1). The binning of the differential distributions is chosen such that each bin is sufficiently populated to perform the measurement. Event migration between neighboring bins caused by detector resolution effects is evaluated with the simulated signal sample and is negligible. Measurements in the four channels are combined assuming that systematic uncertainties are fully correlated among bins of the differential distributions.

Fig. 5

Measured differential cross sections \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}|\eta ^\ell |}\) (upper) and \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}{p_{\textrm{T}} ^\ell }}\) (lower) unfolded to the particle level, compared with the predictions of the MadGraph5_amc@nlo simulation. Two different PDF sets (NLO NNPDF3.0 and NNLO NNPDF3.1) are used. Error bars on data points include statistical and systematic uncertainties. Symbols showing the theoretical expectations are slightly displaced in the horizontal axis for better visibility. The ratios of data to predictions are shown in the lower panels. The uncertainty in the ratio includes the uncertainties in both data and prediction

Table 8 Measured production cross sections \(\sigma (\textrm{W}+\textrm{c})\) unfolded to the parton level in the four selection channels together with statistical (first) and systematic (second) uncertainties. The acceptance times efficiency values (\(\mathcal {C}\)) are also given

Systematic uncertainties in the differential \(\sigma (\textrm{W}+\textrm{c})\) cross section measurements are in the range of 4–6%. The main sources of systematic uncertainty, as discussed in Sect. 6, are related to the charm hadron fragmentation and decay fractions in the simulation (2%), and the efficiency of identifying an SV or a muon inside a jet (3%).

The \(\sigma (\textrm{W}+\textrm{c})\) differential cross section as a function of \(|\eta ^{\ell } |\) and \(p_{\textrm{T}} ^\ell \), obtained after the combination of the measurements in the SL, SV, electron, and muon channels, is shown in Fig. 5, compared with the predictions from the MadGraph5_amc@nlo simulation. Observed shape differences are within 10%.

7.2 Measurements at the parton level

The measurements are also unfolded to the parton level, including an additional correction to account for the c  quark fragmentation and hadronization processes. Results of the fiducial cross sections in the four selection channels are presented in Table 8. The combination of the measurements is:

$$\begin{aligned} \sigma (\hbox {W}+\hbox {c}) = 163.4 \pm 0.5\,\text {(stat)} \pm 6.2\,\text {(syst)} \,\hbox {pb}. \end{aligned}$$

The fiducial cross section measured at the parton level is expected to be slightly larger than that at the particle level. During the hadronization and jet clustering processes, the momentum of the c quark gets smeared and biased towards slightly smaller values. A fraction of charm quarks near the \(p_{\textrm{T}} ^{\textrm{c}}>30\,\text {Ge}\hspace{-.08em}\text {V} \) threshold of the fiducial region of the measurement do not result in c jets with \(p_{\textrm{T}} ^{\textrm{c}\text { jet}} > 30\,\text {Ge}\hspace{-.08em}\text {V} \). On the other hand, a number of \(\textrm{W}+\textrm{c}\) events with a c quark with \(p_{\textrm{T}} ^{\textrm{c}} < 30\,\text {Ge}\hspace{-.08em}\text {V} \) get reconstructed with a generator-level jet with \(p_{\textrm{T}} ^{\textrm{c}\text { jet}} > 30\,\text {Ge}\hspace{-.08em}\text {V} \). The net effect is the observed reduction of about 10% of the cross section at the particle level.
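The size of this reduction follows from the central values of the two combined measurements (uncertainties are ignored in this simple check):

$$\begin{aligned} \frac{\sigma ^{\text {particle}}(\hbox {W}+\hbox {c})}{\sigma ^{\text {parton}}(\hbox {W}+\hbox {c})} = \frac{148.7\,\hbox {pb}}{163.4\,\hbox {pb}} \approx 0.91, \end{aligned}$$

that is, a particle-level cross section about 9% below the parton-level one.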

The measurements unfolded to the parton level are compared with analytical calculations of \(\textrm{W}+\textrm{c}\) production. We have used the \(\textsc {mcfm} \) 9.1 program [16] to evaluate the cross section predictions in the phase space of the analysis: \(p_{\textrm{T}} ^{\ell }>35\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\ell } |<2.4\), \(p_{\textrm{T}} ^{\textrm{c}\text { jet}}>30\,\text {Ge}\hspace{-.08em}\text {V} \), and \(|\eta ^{\textrm{c}\text { jet}} |<2.4\). Jets are clustered in \(\textsc {mcfm} \) using the anti-\(k_{\textrm{T}}\) jet algorithm with a distance parameter \(R = 0.4\). The \(\textrm{W}+\textrm{c}\) process description is available in \(\textsc {mcfm} \) up to \(\mathcal {O}({\alpha _\textrm{S}}^2)\) with a massive charm quark (\(m_{\textrm{c}} =1.5\,\text {Ge}\hspace{-.08em}\text {V} \)). The contributions from gluon splitting into \(\hbox {c}\bar{\text {c}}\) are not included. We have computed predictions for the following NLO PDF sets: MSHT20 [63], CT18 [64], CT18Z [64], ABMP16 [65], NNPDF3.0 [29], and NNPDF3.1 [30]. The LHAPDF6 library [66] was used to access the PDF sets. All of the PDF sets were derived using strangeness-sensitive experimental data, including LHC W/Z and jet production cross section measurements. The NNPDF and MSHT20 sets additionally incorporate the CMS \(\textrm{W}+\textrm{c}\) production data at \(\sqrt{s}=7\,\text {Te}\hspace{-.08em}\text {V} \). CT18Z differs from CT18 because the former includes the ATLAS W/Z \(7\,\text {Te}\hspace{-.08em}\text {V} \) precision measurements [67], leading to an enhancement of the strange PDF. The PDF parameterizations of the MSHT20 and NNPDF groups allow for strangeness asymmetry.

The factorization and the renormalization scales are set to the value of the W boson mass [36]. The uncertainty from missing higher perturbative orders is estimated by computing cross section predictions varying independently the factorization and renormalization scales to twice and half their nominal values, with the constraint that the ratio of scales is never larger than 2. The envelope of the resulting cross sections with these scale variations defines the theoretical scale uncertainty. In the calculation, the value of the strong coupling constant at the energy scale of the \(\textrm{Z}\) boson mass, \(\alpha _\textrm{S} (m_{\textrm{Z}})\), is set to the value recommended by each of the PDF groups. Uncertainties in the predicted cross sections associated with \(\alpha _\textrm{S} (m_{\textrm{Z}})\) are evaluated as half the difference in the predicted cross sections evaluated with a variation of \(\varDelta (\alpha _\textrm{S})=\pm 0.002\).

Table 9 Predictions for \(\sigma (\textrm{W}+\textrm{c})\) production from \(\textsc {mcfm} \) at NLO in QCD for the phase space of the analysis. For every PDF set, the central value of the prediction is given, together with the uncertainty as prescribed from the PDF set, and the uncertainties associated with the scale variations and with the value of \(\alpha _\textrm{S} \). The total uncertainty is given in the last column. The last row in the table gives the experimental result presented in this paper

The theoretical predictions for the fiducial \(\textrm{W}+\textrm{c}\) cross section in the phase space of the measurements are summarized in Table 9. The central value of the prediction is provided together with the relative uncertainties arising from the PDF variations within each set, the choice of scales, and \(\alpha _\textrm{S} \). The size of the PDF uncertainties depends on the different input data and methodology used by the various groups. In particular, they depend on the parameterization of the strange quark PDF and on the definition of the one standard deviation uncertainty band. The maximum difference between the central values of the various PDF predictions is \(\sim \)10%. This difference is comparable to the total uncertainty in each of the individual predictions. Theoretical predictions are slightly larger than the measured cross section but are in agreement within the uncertainties, as depicted in Fig. 6.

Fig. 6

Comparison of the experimental measurement of \(\sigma (\textrm{W}+\textrm{c})\), unfolded to the parton level, with the predictions from the NLO QCD \(\textsc {mcfm} \) calculations using different NLO PDF sets. Horizontal error bars indicate the total uncertainty in the predictions

Fig. 7

Measured differential cross sections \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}|\eta ^\ell |}\) (upper) and \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}{p_{\textrm{T}} ^\ell }}\) (lower) unfolded to the parton level, compared with the predictions from the \(\textsc {mcfm} \) NLO calculations using different NLO PDF sets. Error bars on data points include statistical and systematic uncertainties. Symbols showing the theoretical expectations are slightly displaced in the horizontal axis for better visibility. The ratios of data to predictions are shown in the lower panels. The uncertainty in the ratio includes the uncertainties in both data and prediction

The predictions for the \(\sigma (\textrm{W}+\textrm{c})\) production cross section, computed in intervals of \(|\eta ^\ell |\) and \(p_{\textrm{T}} ^\ell \), are compared with the measured values in Fig. 7. The predictions are generally consistent with the measurements within uncertainties, except for the highest \(p_{\textrm{T}} ^\ell \) bin.

7.3 Measurements of the cross section ratio \(\sigma ({\hbox {W}}^{+}+\bar{\text {c}})/\sigma ({\hbox {W}}^{-}+{\textrm{c}})\)

The cross section ratio \(\sigma ({\hbox {W}}^{+}+\bar{\text {c}})/\sigma ({\hbox {W}}^{-}+{\textrm{c}})\) is measured in the four channels as the ratio of the \(\text {OS-SS}\) event yields in which the lepton from the W boson decay is positively or negatively charged:

$$\begin{aligned} R_\textrm{c}^{\pm }= \frac{\sigma ({\hbox {W}}^{+}+\bar{\text {c}})}{\sigma ({\hbox {W}}^{-}+{\textrm{c}})} =\frac{Y_{\text {sel}}^{+}-Y_{\text {bkg}}^{+}}{Y_{\text {sel}}^{-}-Y_{\text {bkg}}^{-}}. \end{aligned} \tag{2}$$

The \(\text {OS-SS}\) background contributions, \(Y_{\text {bkg}}^{+}\) and \(Y_{\text {bkg}}^{-}\), estimated with the simulations, are subtracted from the selected event yields \(Y_{\text {sel}}^{+}\) and \(Y_{\text {sel}}^{-}\). The statistical uncertainty in the background contributions in the four analysis channels is treated as a source of systematic uncertainty (0.5–0.8%) in the cross section ratio.
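Equation (2) can be written as a one-line function. The sketch uses a hypothetical function name and invented toy yields, chosen only so that the ratio is close to the measured order of magnitude.

```python
def charge_ratio(y_sel_plus, y_bkg_plus, y_sel_minus, y_bkg_minus):
    """Eq. (2): ratio of background-subtracted OS-SS yields for positively
    and negatively charged leptons from the W boson decay."""
    return (y_sel_plus - y_bkg_plus) / (y_sel_minus - y_bkg_minus)

# Toy OS-SS yields (hypothetical).
r_toy = charge_ratio(10500.0, 1000.0, 11000.0, 1000.0)
```

No luminosity or correction factor appears, because they are common to numerator and denominator when the efficiencies are charge independent.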

Most of the reconstruction and selection efficiencies cancel out in the measurement of the cross section ratio \(R_\textrm{c}^{\pm }\). Possible efficiency differences between positive and negative leptons and SVs are included as systematic uncertainties. We evaluate effects stemming from charge confusion and charge-dependent reconstruction efficiencies.

The probability of assigning the incorrect charge to a lepton is studied with data using \(\textrm{Z}\rightarrow \ell \ell \) events reconstructed with SS or OS leptons. For the muons, the charge misidentification probability is negligible (\(<10^{-3}\)). For the electrons, the effect is around 1% but propagates into a negligible uncertainty in the cross section ratio. The charge confusion rate for the SVs is significantly larger, 17%, as described in Sect. 5.1.2. However, assuming that the charge confusion probability is the same for positive and negative SVs, the effect in the cross section ratio cancels out.

Potential differences in the reconstruction efficiencies of positive and negative leptons or SVs are studied with the \(\textrm{W}+\textrm{c}\) MC simulation. Efficiency ratios are calculated independently for the four channels of the analysis and are consistent with unity within the statistical uncertainty (1.2–1.4%). No corrections are made in the \(R_\textrm{c}^{\pm }\) measurements but the statistical uncertainties in the efficiency ratios are treated as systematic uncertainties.

Table 10 Measured production cross section ratio \(R_\textrm{c}^{\pm }\) in the four selection channels. Statistical (first) and systematic (second) uncertainties are also given
Fig. 8

Comparison of the experimental measurement of \(R_\textrm{c}^{\pm }\) with the NLO QCD \(\textsc {mcfm} \) calculations using different NLO PDF sets. Horizontal error bars indicate the total uncertainty in the predictions

The \(R_\textrm{c}^{\pm }\) measurements in the four channels are presented in Table 10. The four measurements are combined, treating as fully correlated the systematic uncertainties in the electron, muon, and SV reconstruction efficiencies that affect several channels. The \(\chi ^2\) of the combination is 3.3 (three degrees of freedom), corresponding to a p value of 0.35. The combined cross section ratio measurement is:

$$\begin{aligned} R_\textrm{c}^{\pm }= 0.950 \pm 0.005\,\text {(stat)} \pm 0.010\,\text {(syst)}. \end{aligned}$$
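The p value quoted for the combination can be cross-checked with a few lines of code. The sketch below uses the closed-form survival function of a \(\chi ^2\) distribution with three degrees of freedom, so only the standard library is needed; the function name is an illustrative choice.

```python
import math

def chi2_sf_3dof(x):
    """Survival function P(chi2 > x) for exactly three degrees of freedom.

    For k = 3 the chi-square CDF is erf(sqrt(x/2)) - sqrt(2x/pi) * exp(-x/2),
    so the survival function is its complement.
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

p_value = chi2_sf_3dof(3.3)
print(f"p = {p_value:.2f}")
```

For \(\chi ^2 = 3.3\) this evaluates to approximately 0.35, in agreement with the value given in the text.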

The precision in the \(R_\textrm{c}^{\pm }\) measurement has been improved by a factor of two with respect to previous CMS measurements [7,8,9], leading to the most precise measurement of \(R_\textrm{c}^{\pm }\) to date.

In Fig. 8 the \(R_\textrm{c}^{\pm }\) measurement is compared with the \(\textsc {mcfm} \) calculations using various PDF sets. Theoretical predictions for \(\sigma ({\hbox {W}}^{+}+\bar{\text {c}})\) and \(\sigma ({\hbox {W}}^{-}+{\textrm{c}})\) are computed independently under the same conditions explained in Sect. 7.2 and for the same \(|\eta ^{\ell } |\) and \(p_{\textrm{T}} ^{\ell }\) ranges used in the analysis. Expectations for \(R_\textrm{c}^{\pm }\) are derived from them and presented in Table 11. All theoretical uncertainties are significantly reduced in the cross section ratio prediction.

Table 11 Theoretical predictions for \(R_\textrm{c}^{\pm }\) calculated with \(\textsc {mcfm} \) at NLO. The kinematic selection follows the experimental requirements. For every PDF set, the central value of the prediction is given, together with the uncertainty as prescribed from the PDF set, and the uncertainties associated with the scale variations and with the value of \(\alpha _\textrm{S} \). The total uncertainty is given in the last column. The last row in the table gives the experimental result presented in this paper

The \(R_\textrm{c}^{\pm }\) observable is sensitive to a potential strangeness asymmetry in the proton but also to the down quark and antiquark asymmetry through the Cabibbo-suppressed down quark contribution to the \(\textrm{W}+\textrm{c}\) production. In the absence of strangeness asymmetry, as in the PDF sets CT18 and ABMP16, the predicted \(R_\textrm{c}^{\pm }\) value in the kinematical region of the analysis ranges from 0.955 to 0.964 with a small uncertainty of about 0.2%. The predictions calculated using PDF sets that include strangeness asymmetry in the proton (MSHT20 and NNPDF) are about 2% lower, ranging from 0.935 to 0.948 with a 2% uncertainty as a result of the larger uncertainty associated with the difference between the strange quark and antiquark PDFs. Within experimental and theoretical uncertainties, the measured \(R_\textrm{c}^{\pm }\) value is consistent with both sets of predictions.
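The compatibility statement above can be illustrated with simple arithmetic on the quoted numbers. The sketch below is not part of the analysis: it assumes Gaussian, uncorrelated uncertainties, combines the statistical and systematic uncertainties of the measurement in quadrature, and treats the endpoints of each quoted prediction range as central values carrying the quoted relative uncertainty (0.2% and 2%). The labels and the function name are illustrative.

```python
import math

R_MEAS, STAT, SYST = 0.950, 0.005, 0.010
MEAS_UNC = math.hypot(STAT, SYST)  # total experimental uncertainty

# (label, predicted value, relative uncertainty), taken from the ranges in the text.
PREDICTIONS = [
    ("symmetric, low end", 0.955, 0.002),
    ("symmetric, high end", 0.964, 0.002),
    ("asymmetric, low end", 0.935, 0.02),
    ("asymmetric, high end", 0.948, 0.02),
]

def pull(prediction, rel_unc):
    """Difference between data and prediction in units of the combined uncertainty."""
    pred_unc = prediction * rel_unc
    return (R_MEAS - prediction) / math.hypot(MEAS_UNC, pred_unc)

for label, value, rel in PREDICTIONS:
    print(f"{label:22s} pull = {pull(value, rel):+.2f}")
```

Under these simplifying assumptions all four differences stay below two standard deviations, which is the sense in which the measurement does not discriminate between the two groups of PDF sets.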

Fig. 9

Measured cross section ratio \(R_\textrm{c}^{\pm }\) as a function of the absolute value of \(\eta ^\ell \) (upper) and \(p_{\textrm{T}} ^\ell \) (lower), compared with the NLO QCD \(\textsc {mcfm} \) calculations using different NLO PDF sets. Error bars on data points include statistical and systematic uncertainties. Symbols showing the theoretical expectations are slightly displaced in the horizontal axis for better visibility. The ratios of data to predictions are shown in the lower panels. The uncertainty in the ratio includes the uncertainties in both data and prediction

Table 12 Predictions for \(\sigma (\textrm{W}+\textrm{c})\) in the phase space of the analysis. For each QCD and EW order, the central values of the OS, SS and \(\text {OS-SS}\) predictions are given, together with the statistical, scales, PDF, and total uncertainties of the \(\text {OS-SS}\) prediction. All values are given in pb. The last row in the table gives the experimental result presented in this paper

The cross section ratio \(R_\textrm{c}^{\pm }\) is also measured differentially as a function of \(|\eta ^\ell |\) and \(p_{\textrm{T}} ^\ell \). The measurements are compared with the \(\textsc {mcfm} \) predictions in Fig. 9. The predictions are generally consistent with the measurements, with some small deviations in shape within 5%. The cross section ratio decreases with \(|\eta ^\ell |\) from \(R_\textrm{c}^{\pm }\sim 1\) in the central region to about 0.87 for the most forward lepton pseudorapidity values. This behaviour is expected since different Bjorken x regions are being probed. At larger x values, corresponding to higher values of \(|\eta ^\ell |\), \({\textrm{W}}^{-}+\textrm{c}\) production increases relative to \({\hbox {W}}^{+}+\bar{\text {c}}\) because of the growing contribution initiated by the valence down quark. The differences between the predictions made using PDF sets with and without strange quark asymmetry grow with increasing \(|\eta ^\ell |\) and \(p_{\textrm{T}} ^\ell \). However, with the current uncertainties, the data cannot distinguish between the two sets of predictions.

8 Comparison with predictions using NNLO QCD and NLO EW calculations

The first computation of NNLO QCD corrections for \(\textrm{W}+\textrm{c}\) production has recently been presented [14, 15]. The latest calculations include full off-diagonal CKM dependence up to NNLO QCD accuracy, and the dominant NLO EW corrections. In addition, a modified anti-\(k_{\textrm{T}}\) jet algorithm (flavored anti-\(k_{\textrm{T}}\) [68]) is used to guarantee that the computations are infrared safe. This is important for a fair comparison between theory predictions and experimental measurements, since experimental results are derived using the anti-\(k_{\textrm{T}}\) jet algorithm.

Predictions corresponding to the phase space of the CMS measurements presented in this paper, \(p_{\textrm{T}} ^{\ell }>35\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\ell } |<2.4\), \(p_{\textrm{T}} ^{\textrm{c}\text { jet}}>30\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\textrm{c}\text { jet}} |<2.4\), \(\varDelta R ({\text {jet}},\ell )>0.4\), have been specifically computed for the purpose of this comparison, using the charge-dependent flavored anti-\(k_{\textrm{T}}\) jet algorithm with parameter \(a=0.1\), and the same input parameters as in Ref. [15]. The theoretical cross sections are provided at LO, NLO, and NNLO QCD accuracies. At LO, the \(\textrm{W}+\textrm{c}\) process is defined at order \(\mathcal {O}(\alpha _\textrm{S} \alpha ^2)\) in the strong and EW couplings. At NLO, the QCD corrections include all virtual and real contributions of order \(\mathcal {O}(\alpha _\textrm{S} ^2\alpha ^2)\). In the same way, at NNLO accuracy all double-virtual, double-real, and real-virtual contributions of order \(\mathcal {O}(\alpha _\textrm{S} ^3\alpha ^2)\) are included. The calculation is carried out in the 5-flavor scheme with massless bottom and charm quarks. NLO EW corrections of order \(\mathcal {O}(\alpha _\textrm{S} \alpha ^3)\) are calculated including all virtual corrections and the real corrections involving single real photon emission to cancel the corresponding IR divergences appearing in the EW one-loop amplitude.
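The fiducial requirements listed at the start of the preceding paragraph can be written as a simple predicate. This is a minimal sketch under stated assumptions: the `Obj` container and the function names are hypothetical, momenta are in GeV, and \(\varDelta R\) is taken as the usual distance in the \((\eta ,\phi )\) plane.

```python
import math
from dataclasses import dataclass

@dataclass
class Obj:
    """Hypothetical container for a lepton or a charm jet."""
    pt: float   # transverse momentum in GeV
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle in radians

def delta_r(a, b):
    """Distance in the (eta, phi) plane, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, dphi)

def in_fiducial(lepton, cjet):
    """Apply the lepton, c jet, and Delta R requirements quoted in the text."""
    return (
        lepton.pt > 35.0
        and abs(lepton.eta) < 2.4
        and cjet.pt > 30.0
        and abs(cjet.eta) < 2.4
        and delta_r(cjet, lepton) > 0.4
    )

print(in_fiducial(Obj(40.0, 1.0, 0.0), Obj(35.0, -0.5, 2.0)))
```

Wrapping the azimuthal difference matters for objects on either side of \(\phi = \pm \pi \), which would otherwise appear far apart.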

Fig. 10

Comparison of the experimental measurement of \(\sigma (\textrm{W}+\textrm{c})\) with the \(\text {OS-SS}\) LO, NLO, and NNLO QCD predictions, and NLO EW corrections. The NNLO QCD NNPDF3.1 PDF set is used for computing all the predictions. CMPP stands for the authors of the calculations [15]. Horizontal error bars indicate the total uncertainty in the predictions

Fig. 11

Comparison of the measured differential cross sections \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}|\eta ^\ell |}\) (upper) and \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}{p_{\textrm{T}} ^\ell }}\) (lower) with the \(\text {OS-SS}\) LO, NLO, and NNLO QCD predictions, and NLO EW corrections. The NNLO QCD NNPDF3.1 PDF set is used for computing all the predictions. CMPP stands for the authors of the calculations [15]. Error bars on data points include statistical and systematic uncertainties. Symbols showing the theoretical expectations are slightly displaced in the horizontal axis for better visibility. The ratios of data to predictions are shown in the lower panels. The uncertainty in the ratio includes the uncertainties in both data and prediction

The nominal renormalization and factorization scales are both set to \(\frac{1}{2}(E_{\text {T,W}}+p_{\textrm{T}} ^{\textrm{c}\text { jet}})\), where \(E_{\text {T,W}}{=}\sqrt{\smash [b]{M_{\textrm{W}}^2{+}(\vec {p}_{\text {T}}^{\ell }{+}\vec {p}_{\text {T}}^{ \upnu })^2}}\). To estimate missing higher-order QCD corrections, the scale uncertainty is obtained by independently varying the two scales by factors of 0.5, 1, and 2, and taking the envelope of the predictions obtained with all variations, excluding the cases where one scale is reduced and the other is increased at the same time.
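The scale variation procedure described above amounts to a seven-point envelope, which can be sketched as follows. The function names and the toy cross section are hypothetical; a real calculation would replace `toy_xsec` with the prediction evaluated at the rescaled renormalization and factorization scales.

```python
import math
from itertools import product

FACTORS = (0.5, 1.0, 2.0)

def seven_point_variations():
    """All (mu_R, mu_F) factor pairs except (0.5, 2) and (2, 0.5)."""
    return [(r, f) for r, f in product(FACTORS, repeat=2) if max(r, f) / min(r, f) <= 2.0]

def scale_envelope(xsec):
    """Return (nominal, up, down), where up and down span the envelope of the variations."""
    nominal = xsec(1.0, 1.0)
    values = [xsec(r, f) for r, f in seven_point_variations()]
    return nominal, max(values) - nominal, nominal - min(values)

def toy_xsec(r, f):
    """Toy scale dependence (assumption, for illustration only)."""
    return 100.0 * (1.0 - 0.03 * math.log2(r) + 0.01 * math.log2(f))

nominal, up, down = scale_envelope(toy_xsec)
print(f"{nominal:.1f} +{up:.1f} -{down:.1f}")
```

Dropping the two opposite-direction variations removes the pairs in which the ratio of the two scales differs from its nominal value by a factor of four.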

The calculation was performed for the most representative PDF set, which allows for strange asymmetry, NNPDF3.1. The NNLO QCD PDF set was used for computing the predictions for all orders, following the PDF4LHC recommendation [61]. To evaluate the PDF uncertainty of the NNPDF3.1 sets, specialized minimal PDF sets [69], which contain only 8 replicas, were used. The PDF uncertainty is calculated as the square root of the quadratic sum of the differences between the cross section obtained with the nominal PDF and that obtained with each replica.
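The PDF uncertainty prescription stated above reduces to a sum in quadrature over the replicas of the minimal set. A minimal sketch, with a hypothetical function name and made-up replica values used only to exercise the formula:

```python
import math

def pdf_uncertainty(nominal, replicas):
    """Square root of the quadratic sum of (replica - nominal) differences."""
    return math.sqrt(sum((value - nominal) ** 2 for value in replicas))

# Hypothetical cross sections (arbitrary units) for a nominal PDF and 8 replicas.
NOMINAL = 100.0
REPLICAS = [101.0, 99.0, 100.0, 100.0, 102.0, 100.0, 100.0, 100.0]

print(f"PDF uncertainty = {pdf_uncertainty(NOMINAL, REPLICAS):.2f}")
```

No division by the number of replicas appears here, following the wording of the prescription in the text.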

In Table 12, the theoretical predictions for the OS, SS, and \(\text {OS-SS}\) inclusive fiducial cross section are given at LO, NLO, and NNLO QCD accuracies. The QCD corrections show good perturbative convergence, since the NNLO QCD corrections are significantly smaller than the NLO ones. The NNLO correction for the \(\text {OS-SS}\) cross section is negative, about \(-2\)%. This occurs because the NNLO QCD corrections to SS are larger than those for OS; at LO there is no SS contribution to the \(\textrm{W}+\textrm{c}\) process and the first SS contribution enters at NLO. The cross section calculated at NNLO QCD including NLO EW corrections is also shown in Table 12. The EW corrections amount to \(-2\)%. They were included as a multiplicative factor with negligible statistical uncertainty.
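The sign of the NNLO correction to the \(\text {OS-SS}\) cross section, and the multiplicative treatment of the EW correction, can be illustrated with toy numbers. The values below are hypothetical and are not those of Table 12; they are chosen only so that the SS contribution grows more than the OS one when going from NLO to NNLO.

```python
def os_minus_ss(os_xsec, ss_xsec):
    """OS-SS cross section from the separate OS and SS predictions."""
    return os_xsec - ss_xsec

# Toy (OS, SS) cross sections in arbitrary units (assumption, not from the paper).
TOY = {"NLO": (100.0, 2.0), "NNLO": (101.0, 5.0)}

nlo = os_minus_ss(*TOY["NLO"])
nnlo = os_minus_ss(*TOY["NNLO"])
relative_nnlo_correction = nnlo / nlo - 1.0

# The text quotes EW corrections of about -2%, applied as a multiplicative factor.
K_EW = 0.98
nnlo_ew = nnlo * K_EW

print(f"NNLO/NLO - 1 = {relative_nnlo_correction:+.1%}, NNLO x EW = {nnlo_ew:.2f}")
```

In this toy case the OS prediction rises by one unit while the SS prediction rises by three, so the \(\text {OS-SS}\) difference decreases even though both individual corrections are positive.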

At LO and NLO the total uncertainty in the predictions is dominated by the scale uncertainty (around 5% at NLO). At NNLO the scale uncertainty is reduced to 1%, and the PDF uncertainty (4%) dominates. The inclusion of NNLO QCD corrections provides a more precise determination of the strange quark content of the proton from the cross section observable.

The \(\text {OS-SS}\) predictions are compared with the fiducial cross section measurement in Fig. 10. The \(\text {OS-SS}\) subtraction reduces the NNLO corrections, but does not remove them completely. The inclusion of the NNLO corrections decreases the uncertainty in the prediction and also brings it closer to the experimental measurement. The EW NLO corrections further improve the agreement between the theoretical prediction and experimental data. The theoretical prediction and the experimental measurement agree within uncertainties.

No efficiency correction has been applied to account for the different flavor assignments in the jet algorithms of the predictions (flavored anti-\(k_{\textrm{T}}\)) and the experimental measurements (anti-\(k_{\textrm{T}}\)). In Ref. [15] the difference in the predictions from the standard anti-\(k_{\textrm{T}}\) and the flavored anti-\(k_{\textrm{T}}\) algorithms is studied. Due to the lack of flavored infrared safety for the standard anti-\(k_{\textrm{T}}\) algorithm, such a comparison can be done only at NLO with the help of a parton shower. The difference in the fiducial cross section predictions is below 1%. Similarly, the effect on the NNLO theoretical \(\textrm{W}+\textrm{c}\) cross section prediction of using variations of the flavored anti-\(k_{\textrm{T}}\) algorithm and the flavored \(k_{\textrm{T}}\) algorithm is studied. Differences are also below 1%.

Table 13 Theoretical predictions for \(R_\textrm{c}^{\pm }\). For each QCD order, the central values are given, together with the MC statistical, scales, PDF, and total uncertainties. The last row in the table gives the experimental result presented in this paper
Fig. 12

Comparison of the experimental measurement of \(R_\textrm{c}^{\pm }\) with the \(\text {OS-SS}\) LO, NLO and NNLO QCD predictions. The NNLO QCD NNPDF3.1 PDF set is used for computing all the predictions. CMPP stands for the authors of the calculations [15]. Horizontal error bars indicate the total uncertainty in the predictions

The predictions are also compared with the differential cross section measurements \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}|\eta ^\ell |}\) and \({\hbox {d}\sigma (\hbox {W}+\hbox {c})/\hbox {d}{p_{\textrm{T}} ^\ell }}\) in Fig. 11. The NLO correction is approximately flat in \(|\eta ^\ell |\) while it is larger at low and high values of \(p_{\textrm{T}} ^\ell \). The NLO predictions are very similar to those shown in Fig. 7 calculated with \(\textsc {mcfm} \) at NLO using the same PDF set (NNPDF3.1). The NNLO correction is small and does not change the shape of the NLO predictions. The EW NLO correction is flat in \(|\eta ^\ell |\) and gets larger with \(p_{\textrm{T}} ^\ell \), from 0.99 in the first bin to 0.90 in the highest \(p_{\textrm{T}} ^\ell \) bin.

Predictions for the \(\text {OS-SS}\) cross section ratio \(R_\textrm{c}^{\pm }\) have also been computed and are collected in Table 13. In computing the scale variation of \(R_\textrm{c}^{\pm }\), the scale uncertainty for the positive and negative signatures is taken as correlated. The \(R_\textrm{c}^{\pm }\) observable is rather stable under perturbative QCD corrections, varying by less than 1% from LO to NNLO accuracy. The NLO EW correction does not affect \(R_\textrm{c}^{\pm }\), the change being smaller than 0.1%.

Comparisons of the predictions with the fiducial inclusive and differential measurements are presented in Figs. 12 and 13. The inclusion of the NNLO QCD correction does not change the good agreement already observed with the predictions at NLO.

Fig. 13

Comparison of the measured differential cross section ratio \(R_\textrm{c}^{\pm }\) as a function of the absolute value of \(\eta ^\ell \) (upper) and \(p_{\textrm{T}} ^\ell \) (lower) with the \(\text {OS-SS}\) LO, NLO, and NNLO QCD predictions. The NNLO QCD NNPDF3.1 PDF set is used for computing all the predictions. CMPP stands for the authors of the calculations [15]. Error bars on data points include statistical and systematic uncertainties. Symbols showing the theoretical expectations are slightly displaced in the horizontal axis for better visibility. The ratios of data to predictions are shown in the lower panels. The uncertainty in the ratio includes the uncertainties in the data and prediction

9 Summary

The associated production of a W boson with a charm quark (\(\textrm{W}+\textrm{c}\)) in proton–proton (\({\textrm{pp}}\)) collisions at a center-of-mass energy of 13\(\,\text {Te}\hspace{-.08em}\text {V}\) was studied with a data sample collected by the CMS experiment corresponding to an integrated luminosity of \(138{\,\text {fb}^{-1}} \). The \(\textrm{W}+\textrm{c}\) process is selected based on the presence of a high transverse momentum lepton (electron or muon), coming from a W boson decay, and a jet with the signature of a charm hadron decay. Charm hadron decays are identified either by the presence of a muon inside a jet or by reconstructing a secondary decay vertex within the jet. Measurements are combined from the four different channels: electron and muon W boson decay channels, muon and secondary vertex charm identification channels.

Cross section measurements, within a fiducial region defined by the kinematics of the lepton from the W boson decay and the jet originating from the charm quark (\(p_{\textrm{T}} ^{\ell }>35\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\ell } |<2.4\), \(p_{\textrm{T}} ^{\textrm{c}\text { jet}}>30\,\text {Ge}\hspace{-.08em}\text {V} \), \(|\eta ^{\textrm{c}\text { jet}} |<2.4\)), are unfolded to the particle and parton levels. Cross sections are also measured differentially, as functions of \(|\eta ^\ell |\) and \(p_{\textrm{T}} ^\ell \). The cross section ratio for the processes \({\hbox {W}}^{+}+\bar{\text {c}}\) and \({\hbox {W}}^{-}+\hbox {c}\) is measured as well, achieving the highest precision in this measurement to date.

The measured fiducial \(\sigma (\textrm{W}+\textrm{c})\) production cross section unfolded to the particle level is:

$$\begin{aligned}{} & {} \sigma ({\textrm{pp}} \rightarrow \textrm{W}+\textrm{c}){\mathcal {B}}(\textrm{W}\,\rightarrow \ell \upnu ) \\{} & {} \quad = 148.7 \pm 0.4\,\text {(stat)} \pm 5.6\,\text {(syst)} \,\hbox {pb}. \end{aligned}$$

The cross section measurement unfolded to the parton level yields:

$$\begin{aligned}{} & {} \sigma ({\textrm{pp}} \rightarrow \textrm{W}+\textrm{c}){\mathcal {B}}(\textrm{W}\,\rightarrow \ell \upnu )\\{} & {} \quad = 163.4 \pm 0.5\,\text {(stat)} \pm 6.2\,\text {(syst)} \,\hbox {pb}. \end{aligned}$$

The measured \(\sigma ({\hbox {W}}^{+}+\bar{\text {c}})/\sigma ({\hbox {W}}^{-}+{\textrm{c}})\) cross section ratio is:

$$\begin{aligned} \frac{\sigma ({\textrm{pp}} \rightarrow {\textrm{W}}^{+}+\bar{\textrm{c}})}{\sigma ({\textrm{pp}} \rightarrow {\textrm{W}}^{-}+\textrm{c})} = 0.950 \pm 0.005\,\text {(stat)} \pm 0.010 \,\text {(syst)}. \end{aligned}$$

The measurements are compared with theoretical predictions. The particle level measurements are compared with the predictions of the MadGraph5_amc@nlo MC generator. The parton level cross section measurements are compared with NLO QCD calculations from the \(\textsc {mcfm} \) program using different PDF sets and with recently available NNLO QCD calculations including NLO EW corrections. The predicted fiducial cross section and cross section ratio are consistent with the measurements within uncertainties. The NNLO QCD and NLO EW corrections improve the agreement between the predicted and measured cross sections. Despite the improvement in precision of the cross section ratio measurement compared with previous studies, discrimination between predictions using symmetric or asymmetric strange quark and antiquark PDFs would require a further reduction of experimental and theoretical uncertainties. The theoretical uncertainty is dominated by the PDF uncertainties. The inclusion of the cross section measurements in future PDF fits should improve the modeling of the strange parton distribution function of the proton.