1 Introduction

Precise knowledge of the structure of the proton, expressed in terms of parton distribution functions (PDFs), is important for interpreting results obtained in proton–proton (\(\mathrm {p}\) \(\mathrm {p}\)) collisions at the CERN LHC. The PDFs are determined by comparing theoretical predictions obtained at a particular order in perturbative quantum chromodynamics (pQCD) to experimental measurements. The precision of the PDFs, which affects the accuracy of the theoretical predictions for cross sections at the LHC, is determined by the uncertainties of the experimental measurements used, and by the limitations of the available theoretical calculations. The flavor composition of the light quark sea in the proton and, in particular, the understanding of the strange quark distribution is important for the measurement of the \(\mathrm {W}\) boson mass at the LHC [1]. Therefore, it is of great interest to determine the strange quark distribution with improved precision.

Before the start of LHC data taking, information on the strange quark content of the nucleon was obtained primarily from charm production in (anti)neutrino-iron deep inelastic scattering (DIS) by the NuTeV [2], CCFR [3], and NOMAD [4] experiments. In addition, a direct measurement of inclusive charm production in nuclear emulsions was performed by the CHORUS experiment [5]. At the LHC, the production of \(\mathrm {W}\) or \(\mathrm {Z} \) bosons, inclusive or associated with charm quarks, provides an important input for tests of the earlier determinations of the strange quark distribution. The measurements of inclusive \(\mathrm {W}\) or \(\mathrm {Z} \) boson production at the LHC, which are indirectly sensitive to the strange quark distribution, were used in a QCD analysis by the ATLAS experiment, and an enhancement of the strange quark distribution with respect to other measurements was observed [6].

Fig. 1: Dominant contributions to \(\mathrm {W}{+}\mathrm {c}\) production at the LHC at leading order in pQCD.

The associated production of \(\mathrm {W}\) bosons and charm quarks in pp collisions at the LHC probes the strange quark content of the proton directly through the leading order (LO) processes \(\mathrm {g}+ \overline{\mathrm {s}}\rightarrow \mathrm {W}^+{+}\overline{\mathrm {c}} \) and \(\mathrm {g}+ \mathrm {s}\rightarrow \mathrm {W}^-{+}\mathrm {c} \), as shown in Fig. 1. The contribution of the Cabibbo-suppressed processes \(\mathrm {g}+ \overline{\mathrm {d}}\rightarrow \mathrm {W}^+{+}\overline{\mathrm {c}} \) and \(\mathrm {g}+\mathrm {d}\rightarrow \mathrm {W}^-{+}\mathrm {c} \) amounts to only a few percent of the total cross section. Therefore, measurements of associated \(\mathrm {W}{+}\mathrm {c}\) production in pp collisions provide valuable insights into the strange quark distribution of the proton. Furthermore, these measurements allow important cross-checks of the results obtained in the global PDF fits using the DIS data and measurements of inclusive \(\mathrm {W}\) and \(\mathrm {Z} \) boson production at the LHC.

Production of \(\mathrm {W}{+}\mathrm {c}\) in hadron collisions was first investigated at the Tevatron [7,8,9]. The first measurement of the cross section of \(\mathrm {W}{+}\mathrm {c}\) production in \(\mathrm {p}\mathrm {p}\) collisions at the LHC was performed by the CMS Collaboration at a center-of-mass energy of \(\sqrt{s}= 7\,\text {Te}\text {V} \) with an integrated luminosity of 5\(\,\text {fb}^{-1}\)  [10]. This measurement was used for the first direct determination of the strange quark distribution in the proton at a hadron collider [11]. The extracted strangeness suppression with respect to \(\overline{\mathrm {u}}\) and \(\overline{\mathrm {d}}\) quark densities was found to be in agreement with measurements in neutrino scattering experiments. The cross section for \(\mathrm {W}{+}\mathrm {c}\) production was also measured by the ATLAS experiment at \(\sqrt{s}= 7\,\text {Te}\text {V} \) [12] and used in a QCD analysis, which supported the enhanced strange quark content in the proton suggested by the earlier ATLAS analysis [6]. A subsequent joint QCD analysis [13] of all available data that were sensitive to the strange quark distribution demonstrated consistency between the \(\mathrm {W}{+}\mathrm {c}\) measurements by the ATLAS and CMS Collaborations. In Ref. [13], possible reasons for the observed strangeness enhancement were discussed. Recent results of an ATLAS QCD analysis [14], including measurements of inclusive \(\mathrm {W}\) and \(\mathrm {Z} \) boson production at \(\sqrt{s}= 7\,\text {Te}\text {V} \), indicated an even stronger strangeness enhancement in disagreement with all global PDFs. In Ref. [15], possible reasons for this observation were attributed to the limitations of the parameterization used in this ATLAS analysis [14]. The associated production of a \(\mathrm {W}\) boson with a jet originating from a charm quark is also studied in the forward region by the LHCb experiment [16].

In this paper, the cross section for \(\mathrm {W}{+}\mathrm {c}\) production is measured in pp collisions at the LHC at \(\sqrt{s}= 13\,\text {Te}\text {V} \) using data collected by the CMS experiment in 2016 corresponding to an integrated luminosity of 35.7\(\,\text {fb}^{-1}\). The \(\mathrm {W}\) bosons are selected via their decay into a muon and a neutrino. The charm quarks are tagged by a full reconstruction of the charmed hadrons in the process \(\mathrm {c}\rightarrow {\mathrm {D}^{*}(2010)^{\pm }}\rightarrow \mathrm {D}^0 + {\pi }_{\text {slow}}^{\pm } \rightarrow \mathrm {K}^{\mp } + {\pi ^{\pm }}+ {\pi }_{\text {slow}}^{\pm } \), which has a clear experimental signature. The pion originating from the \({\mathrm {D}^{*}(2010)^{\pm }}\) decay receives very little energy because of the small mass difference between \({\mathrm {D}^{*}(2010)^{\pm }}\) and \(\mathrm {D}^0 \)(1865) and is therefore denoted a “slow” pion \({\pi }_{\text {slow}}^{\pm } \). Cross sections for \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) production are measured within a selected fiducial phase space. The \(\mathrm {W}{+}\mathrm {c}\) cross sections are compared with theoretical predictions at next-to-leading order (NLO) QCD, which are obtained with mcfm 6.8 [17,18,19], and are used to extract the strange quark content of the proton.

This paper is organized as follows. The CMS detector is briefly described in Sect. 2. The data and the simulated samples are described in Sect. 3. The event selection is presented in Sect. 4. The measurement of the cross sections and the evaluation of systematic uncertainties are discussed in Sect. 5. The details of the QCD analysis are described in Sect. 6. Section 7 summarizes the results.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6\(\text { m}\) internal diameter, providing a magnetic field of 3.8\(\text { T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.

The silicon tracker measures charged particles within the pseudorapidity range \(|\eta | < 2.5\). It consists of 1440 silicon pixel and 15 148 silicon strip detector modules. For nonisolated particles of \(1< p_{\mathrm {T}} < 10\,\text {Ge}\text {V} \) and \(|\eta | < 1.4\), the track resolutions are typically 1.5% in \(p_{\mathrm {T}}\) and 25–90 (45–150) \(\,\upmu \text {m}\) in the transverse (longitudinal) impact parameter [20]. The reconstructed vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is taken to be the primary \(\mathrm {p}\mathrm {p}\) interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [21, 22] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the \(p_{\mathrm {T}}\) of those jets. Muons are measured in the pseudorapidity range \(|\eta | < 2.4\), with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. The single muon trigger efficiency exceeds 90% over the full \(\eta \) range, and the efficiency to reconstruct and identify muons is greater than 96%. Matching muons to tracks measured in the silicon tracker results in a relative transverse momentum resolution, for muons with \(p_{\mathrm {T}}\) up to 100\(\,\text {Ge}\text {V}\), of 1% in the barrel and 3% in the endcaps. The \(p_{\mathrm {T}}\) resolution in the barrel is better than 7% for muons with \(p_{\mathrm {T}}\) up to 1\(\,\text {Te}\text {V}\)  [23]. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [24].

3 Data and Monte Carlo samples and signal definition

Candidate events for the muon decay channel of the \(\mathrm {W}\) boson are selected by a muon trigger [25] that requires a reconstructed muon with \(p_{\mathrm {T}} ^{\mu } > 24\,\text {Ge}\text {V} \). The presence of a high-\(p_{\mathrm {T}}\) neutrino is implied by the missing transverse momentum, \({\vec p}_{\mathrm {T}}^{\text {miss}} \), which is defined as the negative vector sum of the transverse momenta of the reconstructed particles.
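The definition of the missing transverse momentum as a negative vector sum can be illustrated with a minimal sketch. The function name and the `(pt, phi)` input layout are assumptions made for this illustration and do not correspond to the CMS reconstruction software.

```python
import math

def missing_pt(particles):
    """Return (magnitude, phi) of the negative vector sum of transverse momenta.

    `particles` is an iterable of (pt, phi) pairs, with pt in GeV and phi in
    radians (hypothetical input layout, chosen for this illustration).
    """
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py), math.atan2(py, px)
```

A single reconstructed particle with no visible recoil therefore yields a missing transverse momentum of equal magnitude pointing in the opposite direction.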

Muon candidates and \({\vec p}_{\mathrm {T}}^{\text {miss}} \) are reconstructed using the particle-flow (PF) algorithm [26], which reconstructs and identifies each individual particle with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained directly from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex determined by the tracking detector, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The muon momentum is obtained from the track curvature in both the tracker and the muon system, and identified by hits in multiple stations of the flux-return yoke. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for both zero-suppression effects and the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.

The \({\mathrm {D}^{*}(2010)^{\pm }}\) meson candidates are reconstructed from tracks formed by combining the measurements in the silicon pixel and strip detectors through the CMS combinatorial track finder [20].

The signal and background processes are simulated using Monte Carlo (MC) generators to estimate the acceptance and efficiency of the CMS detector. The corresponding MC events are passed through a detailed Geant4 [27] simulation of the CMS detector and reconstructed using the same software as the real data. The presence of multiple pp interactions in the same or adjacent bunch crossings (pileup) is incorporated by simulating additional interactions (both in-time and out-of-time with respect to the hard interaction) with a vertex multiplicity that matches the distribution observed in data. The simulated samples are normalized to the integrated luminosity of the data using the generated cross sections. To simulate the signal, inclusive \(\mathrm {W}\)+jets events are generated with MadGraph 5_amc@nlo (v2.2.2) [28] using the NLO matrix elements, interfaced with pythia8 (8.2.12) [29] for parton showering and hadronization. A matching scale of 10\(\,\text {Ge}\text {V}\) is chosen, and the FxFx technique [30] is applied for matching and merging. The renormalization and factorization scales, \(\mu _{\mathrm {r}} ^2\) and \(\mu _{\mathrm {f}} ^2\), are set to \(\mu _{\mathrm {r}} ^2=\mu _{\mathrm {f}} ^2 = m^2_{\mathrm {W}} + p^2_{\mathrm {T},\mathrm {W}}\). The proton structure is described by the NNPDF3.0nlo [31] PDF set. To enrich the sample with simulated \(\mathrm {W}{+}\mathrm {c}\) events, an event filter that requires at least one muon with \(p_{\mathrm {T}} ^{\mu } > 20\,\text {Ge}\text {V} \) and \(|\eta ^{\mu } | < 2.4\), as well as at least one \({\mathrm {D}^{*}(2010)^{\pm }}\) meson, is applied at the generator level.

Several background contributions are considered, which are described in the following. An inclusive \(\mathrm {W}\)+jets event sample is generated using the same settings as the signal events, but without the event filter, to simulate background contributions from \(\mathrm {W}\) events that do not contain \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons. Events originating from Drell–Yan (DY) production with associated jets are simulated with MadGraph 5_amc@nlo (v2.2.2) with \(\mu _{\mathrm {r}} ^2\) and \(\mu _{\mathrm {f}} ^2\) set to \(m^2_{\mathrm {Z}} + p^2_{\mathrm {T},\mathrm {Z}}\). Events originating from top quark–antiquark pair (\({\mathrm {t}\overline{\mathrm {t}}} \)) production are simulated using powheg (v2.0) [32], whereas single top quark events are simulated using powheg (v2.0) [33, 34] or powheg (v1.0) [35], depending on the production channel. Inclusive production of \(\mathrm {W}\mathrm {W}\), \(\mathrm {W}\mathrm {Z} \), and \(\mathrm {Z} \mathrm {Z} \) bosons and contributions from the inclusive QCD events are generated using pythia8. The CUETP8M1 [36] underlying event tune is used in pythia8 for all samples except the \({\mathrm {t}\overline{\mathrm {t}}} \) sample, where the CUETP8M2T4 [37] tune is applied.

The dominant background originates from processes like \(\mathrm {u}+ \overline{\mathrm {d}}\rightarrow \mathrm {W}^++ \mathrm {g}^* \rightarrow \mathrm {W}^++ \mathrm {c} \overline{\mathrm {c}} \) or \(\mathrm {d}+ \overline{\mathrm {u}}\rightarrow \mathrm {W}^-+ \mathrm {g}^* \rightarrow \mathrm {W}^-+ \mathrm {c} \overline{\mathrm {c}} \), with \(\mathrm {c}\) quarks produced in gluon splitting. In the \(\mathrm {W}{+}\mathrm {c}\) signal events the charges of the \(\mathrm {W}\) boson and the charm quark have opposite signs. In gluon splitting, an additional \(\mathrm {c}\) quark is produced with the same charge as the \(\mathrm {W}\) boson. At the generator level, an event is considered as a \(\mathrm {W}{+}\mathrm {c}\) event if it contains at least one charm quark in the final state. In the case of an odd number of \(\mathrm {c}\) quarks, the \(\mathrm {c}\) quark with the highest \(p_{\mathrm {T}} \) and a charge opposite to that of the \(\mathrm {W}\) boson is considered as originating from a \(\mathrm {W}{+}\mathrm {c}\) process, whereas the other \(\mathrm {c}\) quarks in the event are labeled as originating from gluon splitting. In the case of an even number of \(\mathrm {c}\) quarks, all are considered to come from gluon splitting. Events containing both \(\mathrm {c}\) and \(\mathrm {b}\) quarks are considered to be \(\mathrm {W}{+}\mathrm {c}\)  events, since \(\mathrm {c}\) quarks are of higher priority in this analysis, regardless of their momentum or production mechanism. Events containing no \(\mathrm {c}\) quark and at least one \(\mathrm {b}\) quark are classified as \(\mathrm {W}+ \mathrm {b}\). Otherwise, an event is assigned to the \(\mathrm {W}+\mathrm {u}\mathrm {d}\mathrm {s}\mathrm {g} \) category.
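The generator-level classification rules above can be summarized in a short sketch. The function name, the input layout, and the returned labels are assumptions chosen for illustration; they are not taken from the analysis code.

```python
def classify_event(c_quarks, n_b_quarks, w_charge):
    """Label a generated W+jets event following the rules described in the text.

    c_quarks   : list of (pt, charge_sign) for final-state charm quarks,
                 with charge_sign = +1 for c and -1 for anti-c (assumed layout).
    n_b_quarks : number of final-state b quarks.
    w_charge   : +1 or -1.

    Returns (category, index), where index points to the charm quark
    attributed to the W+c process, or is None if all charm quarks are
    attributed to gluon splitting (or if there is no charm quark).
    """
    if not c_quarks:
        # No charm: W+b if any b quark is present, otherwise light flavor/gluon.
        return ("W+b" if n_b_quarks > 0 else "W+udsg"), None

    signal_index = None
    if len(c_quarks) % 2 == 1:
        # Odd number of c quarks: the highest-pT one with charge opposite
        # to the W boson is attributed to W+c, the rest to gluon splitting.
        opposite = [i for i, (_, q) in enumerate(c_quarks) if q == -w_charge]
        if opposite:
            signal_index = max(opposite, key=lambda i: c_quarks[i][0])
    # Charm takes priority over b quarks, so any event with charm is W+c.
    return "W+c", signal_index
```

For example, `classify_event([(10.0, -1)], 0, +1)` labels a \(\mathrm {W}^+\) event with a single anti-charm quark as W+c and attributes that quark to the signal process.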

The contribution from gluon splitting can be significantly reduced using data. Events in which the \(\mathrm {W}\) boson and the charm quark have the same charge sign, where the charge of the charm quark is correlated with the charge sign of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson, are background due to gluon splitting. Since gluon splitting contributes equally to same-sign and opposite-sign pairs, this background can be removed by subtracting the same-sign distribution from the opposite-sign distribution. The measurement is performed in the central kinematic range and is not sensitive to the contributions of processes \(\mathrm {c}+ \mathrm {g}\rightarrow \mathrm {W}+ \mathrm {s}\) with a spectator charm quark.

The \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }} \) measurement is performed for the validation and tuning of MC event generators by means of a Rivet plugin [38]. This requires a particle-level definition without constraints on the origin of \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons. Therefore, any contributions from \({\mathrm {B}}\) meson decays and other hadrons, although amounting to only a few pb, are included as signal for this part of the measurement.

4 Event selection

The associated production of \(\mathrm {W}\) bosons and charm quarks is investigated using events in which \(\mathrm {W}\rightarrow \mathrm {\mu }+ {\overline{\nu }_{\mu }}\) and the \(\mathrm {c}\) quark hadronizes into a \({\mathrm {D}^{*}(2010)^{\pm }}\) meson. The reconstruction of the muons from the \(\mathrm {W}\) boson decays and of the \({\mathrm {D}^{*}(2010)^{\pm }}\) candidates is described in detail in the following.

4.1 Selection of \(\mathrm {W}\) boson candidates

Events containing a \(\mathrm {W}\) boson decay are identified by the presence of a high-\(p_{\mathrm {T}}\) isolated muon and \({\vec p}_{\mathrm {T}}^{\text {miss}}\). The muon candidates are reconstructed by combining the tracking information from the muon system and from the inner tracking system [23], using the CMS particle-flow algorithm. Muon candidates are required to have \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), \(|\eta ^{\mu } | < 2.4\), and must fulfill the CMS “tight identification” criteria [23]. To suppress contamination from muons contained in jets, an isolation requirement is imposed:

$$\begin{aligned} \frac{1}{p_{\mathrm {T}} ^{\mu }} \left[ \sum ^{\mathrm {CH}} p_{\mathrm {T}} + \max \left( 0., \sum ^{\mathrm {NH}} p_{\mathrm {T}} + \sum ^{\mathrm {EM}} p_{\mathrm {T}}- 0.5 \sum ^{\mathrm {PU}} p_{\mathrm {T}} \right) \right] \le 0.15, \end{aligned}$$

where the \(p_{\mathrm {T}} \) sum of PF candidates for charged hadrons (CH), neutral hadrons (NH), photons (EM) and charged particles from pileup (PU) inside a cone of radius \(\varDelta R \le 0.4\) is used, and the factor 0.5 corresponds to the typical ratio of neutral to charged particles, as measured in jet production [26].
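The isolation requirement can be written out as a minimal sketch. The function and argument names are assumptions; the inputs are taken to be the \(p_{\mathrm {T}}\) sums, in GeV, of the PF candidates inside the \(\varDelta R \le 0.4\) cone.

```python
def relative_isolation(pt_mu, sum_ch, sum_nh, sum_em, sum_pu):
    """Pileup-corrected relative isolation of a muon.

    sum_ch, sum_nh, sum_em, sum_pu are the pT sums of charged hadrons,
    neutral hadrons, photons, and charged pileup particles in the cone.
    """
    # Neutral component, corrected by 0.5 * charged pileup and floored at zero.
    neutral = max(0.0, sum_nh + sum_em - 0.5 * sum_pu)
    return (sum_ch + neutral) / pt_mu


def is_isolated(pt_mu, sum_ch, sum_nh, sum_em, sum_pu, threshold=0.15):
    """Apply the isolation requirement quoted in the text (<= 0.15)."""
    return relative_isolation(pt_mu, sum_ch, sum_nh, sum_em, sum_pu) <= threshold
```

The `max(0, ...)` term prevents an over-subtraction of the pileup estimate from producing a negative neutral contribution.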

Events in which more than one muon candidate fulfills all the above criteria are rejected in order to suppress background from DY processes. Corrections are applied to the simulated samples to adjust the trigger, isolation, identification, and tracking efficiencies to the observed data. These correction factors are determined through dedicated tag-and-probe studies.

The presence of a neutrino in an event is assured by imposing a requirement on the transverse mass, which is defined as the combination of \(p_{\mathrm {T}} ^{\mu } \) and \({\vec p}_{\mathrm {T}}^{\text {miss}} \):

$$\begin{aligned} m_{\mathrm {T}} \equiv \sqrt{{2 \, p_{\mathrm {T}} ^{\mu } \, |{\vec p}_{\mathrm {T}}^{\text {miss}} | \, (1 - \cos (\phi _{\mathrm {\mu }} - \phi _{{\vec p}_{\mathrm {T}}^{\text {miss}}}))}}. \end{aligned}$$
(1)

In this analysis, \(m_{\mathrm {T}} > 50\,\text {Ge}\text {V} \) is required, which results in a significant reduction of background.
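As an illustration of Eq. (1), the transverse mass can be computed as in the following sketch, where the function and argument names are assumptions and `pt_miss` denotes the magnitude of \({\vec p}_{\mathrm {T}}^{\text {miss}} \).

```python
import math

def transverse_mass(pt_mu, phi_mu, pt_miss, phi_miss):
    """Transverse mass (GeV) of the muon and missing-pT system, Eq. (1)."""
    return math.sqrt(2.0 * pt_mu * pt_miss * (1.0 - math.cos(phi_mu - phi_miss)))


def passes_mt_cut(pt_mu, phi_mu, pt_miss, phi_miss, threshold=50.0):
    """Apply the m_T > 50 GeV requirement quoted in the text."""
    return transverse_mass(pt_mu, phi_mu, pt_miss, phi_miss) > threshold
```

A muon and a missing transverse momentum of 40 GeV each, emitted back to back, give \(m_{\mathrm {T}} = 80\,\text {Ge}\text {V} \), whereas a collinear configuration gives \(m_{\mathrm {T}} = 0\).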

4.2 Selection of \({\mathrm {D}^{*}(2010)^{\pm }}\) candidates

The \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons are identified by their decays \({\mathrm {D}^{*}(2010)^{\pm }}\rightarrow \mathrm {D}^0 + {\pi }_{\text {slow}}^{\pm } \rightarrow \mathrm {K}^{\mp } + {\pi ^{\pm }}+ {\pi }_{\text {slow}}^{\pm } \) using the reconstructed tracks of the decay products. The branching fraction for this channel is \((2.66 \pm 0.03)\%\) [39].

The \(\mathrm {D}^0 \) candidates are constructed by combining two oppositely charged tracks with transverse momenta \(p_{\mathrm {T}} ^{\text {track}} > 1\,\text {Ge}\text {V} \), assuming the \(\mathrm {K}^{\mp } \) and \({\pi ^{\pm }}\) masses. The \(\mathrm {D}^0 \) candidates are further combined with a track of opposite charge to the kaon candidate, assuming the \({\pi }^{\pm }\) mass, following the well-established procedure of Refs. [40, 41]. The invariant mass of the \(\mathrm {K}^{\mp } {\pi ^{\pm }}\) combination is required to be \(|m(\mathrm {K}^{\mp } {\pi ^{\pm }}) - m(\mathrm {D}^0) | < 35\,\text {Me}\text {V} \), where \(m(\mathrm {D}^0) = 1864.8 \pm 0.1\,\text {Me}\text {V} \) [39]. The candidate \(\mathrm {K}^{\mp } \) and \({\pi ^{\pm }}\) tracks must originate at a fitted secondary vertex [42] that is displaced by not more than 0.1\(\text { cm}\) in both the xy-plane and z-coordinate from the third track, which is presumed to be the \({\pi }_{\text {slow}}^{\pm } \) candidate. The latter is required to have \(p_{\mathrm {T}} ^{\text {track}} > 0.35\,\text {Ge}\text {V} \) and to be in a cone of \(\varDelta R \le 0.15\) around the direction of the \(\mathrm {D}^0 \) candidate momentum. The combinatorial background is reduced by requiring the \({\mathrm {D}^{*}(2010)^{\pm }}\) transverse momentum \(p_{\mathrm {T}} ^{\mathrm {D}^*} > 5\,\text {Ge}\text {V} \) and by applying an isolation criterion \(p_{\mathrm {T}} ^{\mathrm {D}^*}/ \sum p_{\mathrm {T}} > 0.2\). Here \(\sum p_{\mathrm {T}} \) is the sum of transverse momenta of tracks in a cone of \(\varDelta R \le \) 0.4 around the direction of the \({\mathrm {D}^{*}(2010)^{\pm }}\) momentum. The contribution of \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons produced in pileup events is suppressed by rejecting candidates with a \(z\text {-distance} > 0.2\text { cm} \) between the muon and the \({\pi }_{\text {slow}}^{\pm } \). After applying all selection criteria, the contribution of \(\mathrm {D}^0 \) decays other than \(\mathrm {K}^{\mp } {\pi ^{\pm }}\) is negligible compared to the uncertainties.

The \({\mathrm {D}^{*}(2010)^{\pm }}\) meson candidates are identified using the mass difference method [41] via a peak in the \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) distribution. Wrong-charge combinations with \(\mathrm {K}^{\pm }{\pi ^{\pm }}\) pairs in the accepted \(\mathrm {D}^0 \) mass range mimic the background originating from light-flavor hadrons. By subtracting the wrong-charge combinations, the combinatorial background in the \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) distribution is mostly removed. The presence of nonresonant charm production in the right-charge \(\mathrm {K}^{\mp } {\pi ^{\pm }}{\pi ^{\pm }}\) combinations introduces a small normalization difference of \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0) \) distributions for right- and wrong-charge combinations, which is corrected utilizing fits to the ratio of both distributions.
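The mass difference method can be sketched as follows: the three tracks are given kaon and pion mass hypotheses, and \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) is the difference between the three-track and two-track invariant masses. The function names and the `(pt, eta, phi)` track layout are assumptions for this illustration, and the particle masses are standard values that are not quoted in the text.

```python
import math

M_KAON = 0.493677  # GeV, charged kaon mass (assumed standard value)
M_PION = 0.139570  # GeV, charged pion mass (assumed standard value)

def four_vector(pt, eta, phi, mass):
    """Return (E, px, py, pz) in GeV for a track with a mass hypothesis."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px**2 + py**2 + pz**2 + mass**2), px, py, pz)

def invariant_mass(vectors):
    """Invariant mass of the sum of four-vectors."""
    e, px, py, pz = (sum(component) for component in zip(*vectors))
    return math.sqrt(max(0.0, e**2 - px**2 - py**2 - pz**2))

def delta_m(kaon, pion, slow_pion):
    """Return (m_Kpi, m_Kpipi - m_Kpi) in GeV for three (pt, eta, phi) tracks."""
    k_vec = four_vector(*kaon, M_KAON)
    pi_vec = four_vector(*pion, M_PION)
    slow_vec = four_vector(*slow_pion, M_PION)
    m_d0 = invariant_mass([k_vec, pi_vec])
    return m_d0, invariant_mass([k_vec, pi_vec, slow_vec]) - m_d0
```

Because the \(\mathrm {D}^0 \) mass resolution largely cancels in the difference, the signal appears as a narrow peak in \(\varDelta m\) just above the pion mass.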

4.3 Selection of \(\mathrm {W}{+}\mathrm {c}\) candidates

An event is selected as a \(\mathrm {W}{+}\mathrm {c}\) signal if it contains a \(\mathrm {W}\) boson and a \({\mathrm {D}^{*}(2010)^{\pm }} \) candidate fulfilling all selection criteria. The candidate events are split into two categories: with \(\mathrm {W}^\pm +{\mathrm {D}^{*}(2010)^{\pm }} \) combinations falling into the same sign (SS) category, and \(\mathrm {W}^\mp + {\mathrm {D}^{*}(2010)^{\pm }} \) combinations falling into the opposite sign (OS) category. The signal events consist of only OS combinations, whereas the \(\mathrm {W}+ \mathrm {c} \overline{\mathrm {c}} \) and \(\mathrm {W}+ \mathrm {b} \overline{\mathrm {b}} \) background processes produce the same number of OS and SS candidates. Therefore, subtracting the SS events from the OS events removes the background contributions from gluon splitting. The contributions from other background sources, such as \( \mathrm {t}\overline{\mathrm {t}}\) and single top quark production, are negligible.

The number of \(\mathrm {W}{+}\mathrm {c}\) events corresponds to the number of \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons after the subtraction of light-flavor and gluon splitting backgrounds. The invariant mass of \(\mathrm {K}^{\mp } {\pi ^{\pm }}\) candidates, which are selected in a \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) window of ±1\(\,\text {Me}\text {V}\), is shown in Fig. 2, along with the observed reconstructed mass difference \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\). A clear \(\mathrm {D}^0 \) peak at the expected mass and a clear \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) peak around the expected value of \(145.4257 \pm 0.0017\,\text {Me}\text {V} \) [39] are observed. The remaining background is negligible, and the number of \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons is determined by counting the number of candidates in a window of \(144< \varDelta m(\mathrm {D}^*, \mathrm {D}^0) < 147\,\text {Me}\text {V} \). Alternatively, two different functions are fitted to the distributions, and their integral over the same mass window is used to estimate the systematic uncertainties associated with the method chosen.
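The counting procedure can be illustrated by a minimal sketch that counts candidates in the \(\varDelta m\) window separately for OS and SS combinations and subtracts them. The function name and the list-based inputs are assumptions, and the wrong-charge subtraction described in Sect. 4.2 is omitted for brevity.

```python
def count_os_minus_ss(delta_m_os, delta_m_ss, low=144.0, high=147.0):
    """Number of candidates in the delta-m window after OS - SS subtraction.

    delta_m_os, delta_m_ss : delta-m values in MeV for opposite-sign and
                             same-sign W + D* combinations (assumed inputs).
    low, high              : window edges in MeV, as quoted in the text.
    """
    def in_window(values):
        return sum(1 for value in values if low < value < high)

    return in_window(delta_m_os) - in_window(delta_m_ss)
```

With three OS candidates and one SS candidate inside the window, for instance, the function returns 2.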

Fig. 2: Distributions of the reconstructed invariant mass of \(\mathrm {K}^{\mp } {\pi ^{\pm }}\) candidates (upper) in the range \(|\varDelta m(\mathrm {D}^*, \mathrm {D}^0)-0.1454 |<0.001\,\text {Ge}\text {V} \), and the reconstructed mass difference \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) (lower). The SS combinations are subtracted. The data (filled circles) are compared to MC simulation with contributions from different processes shown as histograms of different shades.

5 Measurement of the fiducial \(\mathrm {W}{+}\mathrm {c}\) cross section

The fiducial cross section is measured in a kinematic region defined by requirements on the transverse momentum and the pseudorapidity of the muon and the transverse momentum of the charm quark. The simulated signal is used to extrapolate from the fiducial region of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson to the fiducial region of the charm quark. Since the \({\mathrm {D}^{*}(2010)^{\pm }}\) kinematics are integrated over at the generator level, the only kinematic constraint on the corresponding charm quark arises from the requirement on the transverse momentum of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson. The correlation of the kinematics of charm quarks and \({\mathrm {D}^{*}(2010)^{\pm }}\) mesons is investigated using simulation, and the requirement of \(p_{\mathrm {T}} ^{\mathrm {D}^*} > 5\,\text {Ge}\text {V} \) translates into \(p_{\mathrm {T}} ^\mathrm {c} > 5\,\text {Ge}\text {V} \). The distributions of \(|\eta ^{\mu } | \) and \(p_{\mathrm {T}} ^\mathrm {c} \) in the simulation reproduce very well the fixed-order NLO prediction obtained with mcfm 6.8 [17,18,19]. The kinematic range of the measured fiducial cross section corresponds to \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), \(|\eta ^{\mu } | < 2.4\), and \(p_{\mathrm {T}} ^\mathrm {c} > 5\,\text {Ge}\text {V} \).

The fiducial \(\mathrm {W}{+}\mathrm {c}\) cross section is determined as:

$$\begin{aligned} \sigma (\mathrm {W}{+}\mathrm {c}) = \frac{N_{\text {sel}} \, \mathcal {S}}{\large \mathcal {L} _{\text {int}} \, \mathcal {B} \, \mathcal {C} }, \end{aligned}$$
(2)

where \(N_{\text {sel}} \) is the number of selected \(\text{ OS } - \text{ SS }\) events in the \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) distribution and \(\mathcal {S}\) is the signal fraction. The latter is defined as the ratio of the number of reconstructed \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) candidates originating from \(\mathrm {W}{+}\mathrm {c}\) to the number of all reconstructed \({\mathrm {D}^{*}(2010)^{\pm }}\). It is determined from the MC simulation, includes the background contributions, and varies between 0.95 and 0.99. The integrated luminosity is denoted by \(\mathcal {L} _{\text {int}} \). The combined branching fraction \(\mathcal {B}\) for the channels under study is a product of \(\mathcal {B}(\mathrm {c}\rightarrow {\mathrm {D}^{*}(2010)^{\pm }}) = 0.2429\) ± 0.0049 [43] and \(\mathcal {B}({\mathrm {D}^{*}(2010)^{\pm }}\rightarrow \mathrm {K}^{\mp } + {\pi ^{\pm }}+ {\pi }_{\text {slow}}^{\pm })\) = 0.0266 ± 0.0003 [39]. The correction factor \(\mathcal {C}\) accounts for the acceptance and efficiency of the detector. It is determined using the MC simulation and is defined as the ratio of the number of reconstructed \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) candidates to the number of generated \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) originating from \(\mathrm {W}{+}\mathrm {c}\) events that fulfill the fiducial requirements. In the measurement of the \(\mathrm {W}^+{+}\overline{\mathrm {c}}\) (\(\mathrm {W}^+{+}\mathrm {D}^*(2010)^-\)) and \(\mathrm {W}^-{+}\mathrm {c}\) (\(\mathrm {W}^-{+}\mathrm {D}^*(2010)^+\)) cross sections, the factor \(\mathcal {C}\) is determined separately for different charge combinations.
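Equation (2) can be evaluated as in the following sketch. The function and variable names are assumptions; the two branching fractions are the values quoted above, while any event yield or correction factor passed to the function in practice would come from the analysis itself.

```python
B_C_TO_DSTAR = 0.2429      # B(c -> D*(2010)+-), value quoted in the text
B_DSTAR_TO_KPIPI = 0.0266  # B(D*(2010)+- -> K pi pi_slow), value quoted in the text

# Combined branching fraction entering Eq. (2), approximately 0.00646.
B_COMBINED = B_C_TO_DSTAR * B_DSTAR_TO_KPIPI

def fiducial_cross_section(n_sel, signal_fraction, lumi, branching, correction):
    """Evaluate Eq. (2): sigma = N_sel * S / (L_int * B * C).

    The result is in the inverse units of `lumi`; for example, a luminosity
    given in pb^-1 yields a cross section in pb.
    """
    return n_sel * signal_fraction / (lumi * branching * correction)
```

For the \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) cross section, the same function applies with `branching` set to `B_DSTAR_TO_KPIPI` alone and `correction` defined at the level of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson.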

The measurement of the \(\mathrm {W}{+}\mathrm {c}\) cross section relies to a large extent on the MC simulation and requires extrapolation to unmeasured phase space. To reduce the extrapolation and the corresponding uncertainty, the cross section for \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) production is also determined in the fiducial phase space of the detector-level measurement, \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), \(|\eta ^{\mu } | < 2.4\), \(|\eta ^{\mathrm {D}^*} | < 2.4\) and \(p_{\mathrm {T}} ^{\mathrm {D}^*} > 5\,\text {Ge}\text {V} \), in a similar way by modifying Eq. (2) as follows: only the branching fraction \(\mathcal {B} = \mathcal {B}({\mathrm {D}^{*}(2010)^{\pm }}\rightarrow \mathrm {K}^{\mp } +{\pi ^{\pm }}+{\pi }_{\text {slow}}^{\pm })\) is considered and the factor \(\mathcal {C}\) is defined as the ratio between the numbers of reconstructed and of generated \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) candidates in the fiducial phase space after \(\text{ OS } - \text{ SS }\) subtraction.

The cross sections are determined inclusively and also in five bins of the absolute pseudorapidity \(|\eta ^{\mu } | \) of the muon originating from the \(\mathrm {W}\) boson decay. The number of signal (\(\text{ OS } - \text{ SS }\)) events in each range of \(|\eta ^{\mu } | \) is shown in Fig. 3. Good agreement between the data and MC simulation within the statistical uncertainties is observed.

Fig. 3: Number of events after \(\text{ OS } - \text{ SS }\) subtraction for data (filled circles) and MC simulation (filled histograms) as a function of \(|\eta ^{\mu } | \).

5.1 Systematic uncertainties

The efficiencies and the assumptions relevant for the measurement are varied within their uncertainties to estimate the systematic uncertainty in the cross section measurement. The resulting shift of the cross section with respect to the central result is taken as the corresponding uncertainty contribution. The various sources of the systematic uncertainties in the \(\mathrm {W}{+}\mathrm {c}\) production cross section are listed in Table 1 for both the inclusive and the differential measurements.

  • Uncertainties associated with the integrated luminosity measurement are estimated as 2.5% [44].

  • The uncertainty in the tracking efficiency is 2.3% for the 2016 data. It is determined using the method described in Ref. [45], which exploits the ratio of branching fractions between the four-body and two-body decays of the neutral charm meson to all-charged final states.

  • The uncertainty in the branching fraction \(\mathcal {B}(\mathrm {c}\rightarrow {\mathrm {D}^{*}(2010)^{\pm }})\) is 2.4% [43].

  • The muon systematic uncertainties are 1% each for the muon identification and isolation, and 0.5% for the trigger and tracking corrections. These are added in quadrature and the resulting uncertainty for muons is 1.2%, which is referred to as the ‘muon uncertainty’.

  • The uncertainty in the determination of \(N_{\text {sel}} \) is estimated from the difference between the results obtained with a Gaussian and with a Crystal Ball fit [46]. The largest value of this uncertainty determined differentially, 1.5%, is applied throughout.

  • Uncertainties in the modeling of kinematic observables of the generated \({\mathrm {D}^{*}(2010)^{\pm }}\) meson are estimated by reweighting the simulated \(p_{\mathrm {T}} ^{\mathrm {D}^*} \) and \(|\eta ^{\mathrm {D}^*} | \) distributions to the shape observed in data. The respective uncertainty in the inclusive cross section measurement is 0.5%. Due to statistical limitations, this uncertainty is determined inclusively in \(|\eta ^{\mu } |\).

  • The uncertainty in the difference of the normalization of the \(\varDelta m(\mathrm {D}^*, \mathrm {D}^0)\) distributions for \(\mathrm {K}^{\mp } {\pi ^{\pm }}{\pi ^{\pm }}\) and \(\mathrm {K}^{\pm }{\pi ^{\pm }}{\pi }^\mp \) combinations (‘background normalization’) is 0.5%.

  • Uncertainties in the measured \({\vec p}_{\mathrm {T}}^{\text {miss}} \) are estimated in Ref. [47] and result in an overall uncertainty of 0.9% for this analysis.

  • Uncertainties due to the modeling of pileup are estimated by varying the total inelastic cross section used in the simulation of pileup events by 5%. The corresponding uncertainty in the \(\mathrm {W}{+}\mathrm {c}\) cross section is 2%.

  • The uncertainty related to the requirement of a valid secondary vertex, fitted from the tracks associated with a \(\mathrm {D}^0 \) candidate, is determined by calculating the \({\mathrm {D}^{*}(2010)^{\pm }}\) reconstruction efficiency in data and MC simulation for events with and without applying this selection criterion. The number of reconstructed \({\mathrm {D}^{*}(2010)^{\pm }}\) candidates after the SS event subtraction is compared for events with or without a valid secondary vertex along with the proximity requirement (\(\varDelta _{xy} < 0.1\text { cm} \), \(\varDelta _z < 0.1\text { cm} \)). The difference in efficiency between data and MC simulation is calculated and an uncertainty in the inclusive cross section of \(-1.1\)% is obtained. Since this variation is not symmetric, the uncertainty is one-sided.

  • The PDF uncertainties are determined according to the prescription of the PDF group [31]. These are added in quadrature to the uncertainty related to the variation of \(\alpha _S (m_\mathrm {Z})\) in the PDF, resulting in an uncertainty of 1.2% in the inclusive cross section.

  • The uncertainty associated with the fragmentation of the \(\mathrm {c}\) quark into a \({\mathrm {D}^{*}(2010)^{\pm }}\) meson is determined through variations of the function describing the fragmentation parameter \(z = p_{\mathrm {T}} ^{\mathrm {D}^*}/p_{\mathrm {T}} ^\mathrm {c} \). The investigation of this uncertainty is inspired by a dedicated measurement of the \(\mathrm {c}\rightarrow {\mathrm {D}^{*}(2010)^{\pm }} \) fragmentation function in electron-proton collisions [48], in which the fragmentation parameters in various phenomenological models were determined with an uncertainty of 10%. In the pythia MC event generator, the fragmentation is described by the phenomenological Bowler–Lund function [49, 50], in the form

    $$\begin{aligned} f(z) = \frac{1}{z^{r_c \, b \, m_q^2}} (1-z)^a \exp (-b \, m_\perp ^2 / z) \, c, \end{aligned}$$

    with \(m_\perp = \sqrt{m_{\mathrm {D}^*}^2 + (p_{\mathrm {T}} ^{\mathrm {D}^*})^2}\). The shape of f(z) is controlled by the two parameters a and b. In the case of charm quarks, \(r_c = 1\) and \(m_q = 1.5\,\text {Ge}\text {V} \) are the pythia standard settings in the CUETP8M1 tune, whereas the value of \(m_\perp \) is related to the average transverse momentum of generated \({\mathrm {D}^{*}(2010)^{\pm }}\) in the MC sample. The parameters a, b and c are determined in a fit to the simulated distribution of f(z), where c is needed for the normalization. Since the presence of a jet is not required in the analysis, the charm quark transverse momentum is approximated by summing up the transverse momenta of tracks in a cone of \(\varDelta R \le 0.4\) around the axis of the \({\mathrm {D}^{*}(2010)^{\pm }}\) candidate. The free parameters are determined as \(a = 1.827 \pm 0.016\) and \(b = 0.00837 \pm 0.00005\,\text {Ge}\text {V} ^{-2}\). To estimate the uncertainty, the parameters a and b are varied within ±10% around their central values, following the precision achieved for the fragmentation parameters in Ref. [48]. An additional constraint on the upper boundary of the a parameter in pythia is consistent with this 10% variation. The resulting uncertainty in the cross section is 3.9%.
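
The Bowler–Lund form and the ±10% variation of a and b can be sketched as follows. This is an illustration only, not the procedure used in the analysis: the value of \(m_\perp \) is an assumption (built from a hypothetical \(p_{\mathrm {T}}\) of 10 GeV), the normalization c is fixed to 1 instead of being refitted, and the names `bowler_lund` and `varied_shapes` are introduced here.

```python
import math

def bowler_lund(z, a, b, m_perp, r_c=1.0, m_q=1.5, c=1.0):
    """f(z) = c * z**(-r_c*b*m_q**2) * (1-z)**a * exp(-b*m_perp**2/z).

    Masses in GeV, b in GeV^-2; c is an overall normalization.
    """
    if not 0.0 < z < 1.0:
        raise ValueError("z must lie strictly between 0 and 1")
    power = z ** (-r_c * b * m_q ** 2)
    return c * power * (1.0 - z) ** a * math.exp(-b * m_perp ** 2 / z)

# Central values quoted in the text.
A_CENTRAL = 1.827
B_CENTRAL = 0.00837  # GeV^-2

# ASSUMPTION: illustrative transverse mass from m_D* = 2.010 GeV and a
# hypothetical p_T of 10 GeV; the value used in the analysis is not quoted.
M_PERP = math.sqrt(2.010 ** 2 + 10.0 ** 2)

def varied_shapes(z, rel=0.10):
    """Evaluate f(z) at the central point and with a or b shifted by +-rel."""
    shapes = {"central": bowler_lund(z, A_CENTRAL, B_CENTRAL, M_PERP)}
    variations = [("a_up", 1 + rel, 1.0), ("a_down", 1 - rel, 1.0),
                  ("b_up", 1.0, 1 + rel), ("b_down", 1.0, 1 - rel)]
    for name, fa, fb in variations:
        shapes[name] = bowler_lund(z, A_CENTRAL * fa, B_CENTRAL * fb, M_PERP)
    return shapes

shapes_at_half = varied_shapes(0.5)
```

In a full study each varied shape would be renormalized and used to reweight the simulated events, which this sketch does not attempt.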

Table 1 Systematic uncertainties [%] in the inclusive and differential \(\mathrm {W}{+}\mathrm {c}\) cross section measurement in the fiducial region of the analysis. The total uncertainty corresponds to the sum of the individual contributions in quadrature. The contributions listed in the top part of the table cancel in the ratio \(\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}})/ \sigma (\mathrm {W}^-{+}\mathrm {c}) \)
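
As a rough illustration of the quadrature sum, the sketch below combines only the inclusive values quoted in the list above. It assumes that the one-sided secondary-vertex contribution enters the downward total only, and the labels are shorthand introduced here. Table 1 may contain contributions that are not quoted in the text, so the result is not expected to reproduce the total given there.

```python
import math

def quadrature_sum(values):
    """Square root of the sum of squares of the given contributions."""
    return math.sqrt(sum(v * v for v in values))

# Inclusive uncertainties in percent, as quoted in the list in the text.
symmetric = {
    "luminosity": 2.5,
    "tracking": 2.3,
    "branching fraction": 2.4,
    "muons": 1.2,
    "N_sel determination": 1.5,
    "D* kinematics": 0.5,
    "background normalization": 0.5,
    "missing pT": 0.9,
    "pileup": 2.0,
    "PDF": 1.2,
    "fragmentation": 3.9,
}
# ASSUMPTION: the one-sided -1.1% enters the downward total only.
one_sided_down = {"secondary vertex": 1.1}

total_up = quadrature_sum(symmetric.values())
total_down = quadrature_sum(list(symmetric.values())
                            + list(one_sided_down.values()))
```

The fragmentation term dominates the sum, so the total is far more sensitive to that entry than to the sub-percent contributions.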

5.2 Cross section results

The numbers of signal events and the inclusive fiducial cross sections with their uncertainties are listed in Table 2 together with the ratio of \(\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}})/ \sigma (\mathrm {W}^-{+}\mathrm {c}) \). For the differential measurement of the \(\mathrm {W}{+}\mathrm {c}\) cross section, the numbers of signal events are summarized in Table 3 together with the corrections \(\mathcal {C}\) derived using MC simulations in each \(|\eta ^{\mu } | \) bin. The results are presented for \(\mathrm {d}\sigma (\mathrm {W}{+}\mathrm {c}) /\mathrm {d}|\eta ^{\mu } | \), as well as for \(\mathrm {d}\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}}) /\mathrm {d}|\eta ^{\mu } | \) and for \(\mathrm {d}\sigma (\mathrm {W}^-{+}\mathrm {c}) /\mathrm {d}|\eta ^{\mu } | \).

Table 2 Inclusive cross sections of \(\mathrm {W}{+}\mathrm {c}\) and \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) production in the fiducial range of the analysis. The correction factor \(\mathcal {C}\) accounts for the acceptance and efficiency of the detector

Table 3 Number of signal events, correction factors \(\mathcal {C}\) accounting for the acceptance and efficiency of the detector, and the differential cross sections in each \(|\eta ^{\mu } | \) range for \(\mathrm {W}{+}\mathrm {c}\) (upper), \(\mathrm {W}^+{+}\overline{\mathrm {c}}\) (middle) and \(\mathrm {W}^-{+}\mathrm {c}\) (lower)

The measured inclusive and differential fiducial cross sections of \(\mathrm {W}{+}\mathrm {c}\) are compared to predictions at NLO (\(\mathcal {O}(\alpha _s^2)\)) that are obtained using mcfm 6.8. Similarly to the earlier analysis [11], the mass of the charm quark is chosen to be \(m_{{c}} = 1.5\,\text {Ge}\text {V} \), and the factorization and the renormalization scales are set to the value of the \(\mathrm {W}\) boson mass. The calculation is performed for \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), \(|\eta ^{\mu } | < 2.4\), and \(p_{\mathrm {T}} ^\mathrm {c} > 5\,\text {Ge}\text {V} \). In Fig. 4, the measurements of the inclusive \(\mathrm {W}{+}\mathrm {c}\) cross section and the charge ratio are compared to the NLO predictions calculated using the ABMP16nlo [51], ATLASepWZ16nnlo [14], CT14nlo [52], MMHT14nlo [53], NNPDF3.0nlo [31], and NNPDF3.1nlo [54] PDF sets. The values of the strong coupling constant \(\alpha _S (m_\mathrm {Z})\) are set to those used in the evaluation of a particular PDF. The details of the experimental data, used for constraining the strange quark content of the proton in the global PDFs, are given in Refs. [14, 31, 52, 53, 55]. In these references, the treatment of the sea quark distributions in different PDF sets is discussed, and the comparison of the PDFs is presented. The ABMP16nlo PDF includes the most recent data on charm quark production in charged-current neutrino-nucleon DIS collected by the NOMAD and CHORUS experiments in order to improve the constraints on the strange quark distribution and to perform a detailed study of the isospin asymmetry of the light quarks in the proton sea [56]. Despite differences in the data used in the individual global PDF fits, the strangeness suppression distributions in ABMP16nlo, NNPDF3.1nlo, CT14nlo and MMHT14nlo are in good agreement with each other and disagree with the ATLASepWZ16nnlo result [14].

The predicted inclusive cross sections are summarized in Table 4. The PDF uncertainties are calculated using prescriptions from each PDF group. For the ATLASepWZ16nnlo PDFs no respective NLO set is available and only Hessian uncertainties are considered in this paper. For other PDFs, the variation of \(\alpha _s(m_Z)\) is taken into account as well. The uncertainties due to missing higher-order corrections are estimated by varying \(\mu _{\mathrm {r}} \) and \(\mu _{\mathrm {f}} \) simultaneously by a factor of 2 up and down, and the resulting variation of the cross section is referred to as the scale uncertainty, \(\varDelta \mu \). Good agreement between NLO predictions and the measurements is observed, except for the prediction using ATLASepWZ16nnlo. For the cross section ratio \(\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}})\)/\(\sigma (\mathrm {W}^-{+}\mathrm {c})\), all theoretical predictions are in good agreement with the measured value. In Table 5, the theoretical predictions for \(\mathrm {d}\sigma (\mathrm {W}{+}\mathrm {c}) /\mathrm {d}|\eta ^{\mu } | \) using different PDF sets are summarized. In Fig. 5, the measurements of differential \(\mathrm {W}{+}\mathrm {c}\) and \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) cross sections are compared with the mcfm NLO calculations and with the signal MC prediction, respectively. Good agreement between the measured \(\mathrm {W}{+}\mathrm {c}\) cross section and NLO calculations is observed except for the prediction using the ATLASepWZ16nnlo PDF set. The signal MC prediction using NNPDF3.0nlo is presented with the PDF and \(\alpha _s\) uncertainties and accounts for simultaneous variations of \(\mu _{\mathrm {r}} \) and \(\mu _{\mathrm {f}} \) in the matrix element by a factor of 2. The \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) cross section is described well by the simulation.
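
The scale-variation prescription can be sketched as below, with a single scale \(\mu = \mu _{\mathrm {r}} = \mu _{\mathrm {f}}\) varied by a factor of 2 up and down. The function `toy_sigma` is a purely hypothetical stand-in for the NLO calculation (the mcfm interface is not modelled), and its logarithmic scale dependence is an assumption chosen only to make the example run.

```python
import math

def scale_uncertainty(sigma_of_mu, mu0, factor=2.0):
    """Return (central, upward shift, downward shift) of the cross section
    when the common scale is varied to mu0*factor and mu0/factor."""
    central = sigma_of_mu(mu0)
    shifts = [sigma_of_mu(mu0 * factor) - central,
              sigma_of_mu(mu0 / factor) - central]
    return central, max(max(shifts), 0.0), min(min(shifts), 0.0)

M_W = 80.4  # GeV, approximate W boson mass used as the central scale

def toy_sigma(mu, sigma0=1000.0, slope=0.05):
    """HYPOTHETICAL cross section (pb) with a mild logarithmic scale dependence."""
    return sigma0 * (1.0 - slope * math.log(mu / M_W))

central, delta_up, delta_down = scale_uncertainty(toy_sigma, M_W)
```

Taking the largest upward and downward shifts separately keeps the uncertainty asymmetric when the scale dependence is not symmetric around the central choice.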

Table 4 The NLO predictions for \(\sigma (\mathrm {W}{+}\mathrm {c})\), obtained with mcfm [17,18,19]. The uncertainties account for PDF and scale variations
Fig. 4

Inclusive fiducial cross section \(\sigma (\mathrm {W}{+}\mathrm {c}) \) and the cross section ratio \(\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}})/\sigma (\mathrm {W}^-{+}\mathrm {c}) \) at 13\(\,\text {Te}\text {V}\). The data are represented by a line with the statistical (total) uncertainty shown by a light (dark) shaded band. The measurements are compared to the NLO QCD prediction using several PDF sets, represented by symbols of different types. All used PDF sets are evaluated at NLO, except for ATLASepWZ16, which is obtained at NNLO. The error bars depict the total theoretical uncertainty, including the PDF and the scale variation uncertainty

Table 5 Theoretical predictions for \(\mathrm {d}\sigma (\mathrm {W}{+}\mathrm {c}) /\mathrm {d}|\eta ^{\mu } | \) calculated with mcfm at NLO for different PDF sets. The relative uncertainties due to PDF and scale variations are shown
Fig. 5

Left: Differential cross sections of \(\sigma (\mathrm {W}{+}\mathrm {c}) \) production at 13\(\,\text {Te}\text {V}\) measured as a function of \(|\eta ^{\mu } | \). The data are presented by filled circles with the statistical (total) uncertainties shown by vertical error bars (light shaded bands). The measurements are compared to the QCD predictions calculated with mcfm at NLO using different PDF sets, presented by symbols of different style. All used PDF sets are evaluated at NLO, except for ATLASepWZ16, which is obtained at NNLO. The error bars represent theoretical uncertainties, which include PDF and scale variation uncertainty. Right: \(\sigma (\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}) \) production differential cross sections presented as a function of \(|\eta ^{\mu } | \). The data (filled circles) are shown with their total (outer error bars) and statistical (inner error bars) uncertainties and are compared to the predictions of the signal MC generated with MadGraph 5_amc@nlo and using NNPDF3.0nlo to describe the proton structure. PDF uncertainties and scale variations are accounted for and added in quadrature (shaded band)

6 Impact on the strange quark distribution in the proton

The associated \(\mathrm {W}{+}\mathrm {c}\) production at 13\(\,\text {Te}\text {V}\) probes the strange quark distribution directly in the kinematic range of \(\langle x \rangle \approx 0.007\) at the scale of \(m^2_{\mathrm {W}} \). The first measurement of a fiducial \(\mathrm {W}{+}\mathrm {c}\) cross section in \(\mathrm {p}\mathrm {p}\) collisions was performed by the CMS experiment at a center-of-mass energy \(\sqrt{s}= 7\,\text {Te}\text {V} \) with a total integrated luminosity of 5\(\,\text {fb}^{-1}\)  [10]. The results were used in a QCD analysis [11] together with measurements of neutral- and charged-current cross sections of DIS at HERA [57] and of the lepton charge asymmetry in \(\mathrm {W}\) production at \(\sqrt{s}= 7\,\text {Te}\text {V} \) at the LHC [11].

The present measurement of the \(\mathrm {W}{+}\mathrm {c}\) production cross section at 13\(\,\text {Te}\text {V}\), determined as a function of the absolute pseudorapidity \(|\eta ^{\mu } | \) of the muon from the \(\mathrm {W}\) boson decay and \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), is used in an NLO QCD analysis. This analysis also includes an updated combination of the inclusive DIS cross sections [58] and the available CMS measurements of the lepton charge asymmetry in \(\mathrm {W}\) boson production at \(\sqrt{s}= 7\,\text {Te}\text {V} \) [11] and at \(\sqrt{s}= 8\,\text {Te}\text {V} \) [59]. These latter measurements probe the valence quark distributions in the kinematic range \(10^{-3} \le x \le 10^{-1}\) and have indirect sensitivity to the strange quark distribution. The earlier CMS measurement [10] of \(\mathrm {W}{+}\mathrm {c}\) production at \(\sqrt{s}= 7\,\text {Te}\text {V} \) is also used to exploit the strange quark sensitive measurements at CMS in a joint QCD analysis. The correlations of the experimental uncertainties within each individual data set are taken into account, whereas the CMS data sets are treated as uncorrelated to each other. In particular, the measurements of \(\mathrm {W}{+}\mathrm {c}\) production at 7 and 13\(\,\text {Te}\text {V}\) are treated as uncorrelated because of the different methods of charm tagging and the differences in reconstruction and event selection in the two data sets.

The theoretical predictions for the muon charge asymmetry and for \(\mathrm {W}{+}\mathrm {c}\) production are calculated at NLO using the mcfm program [17, 18], which is interfaced to applgrid 1.4.56 [60].

Version 2.0.0 of the open-source QCD fit framework for PDF determination xFitter [61, 62] is used with the parton distributions evolved using the Dokshitzer–Gribov–Lipatov–Altarelli–Parisi equations [63,64,65,66,67,68] at NLO, as implemented in the qcdnum 17-00/06 program [69].

The Thorne–Roberts [70, 71] general mass variable flavor number scheme at NLO is used for the treatment of heavy quark contributions with heavy quark masses \(m_{\mathrm {b}} = 4.5\,\text {Ge}\text {V} \) and \(m_{\mathrm {c}} = 1.5\,\text {Ge}\text {V} \), which correspond to the values used in the signal MC simulation in the cross section measurements. The renormalization and factorization scales are set to Q, which denotes the four-momentum transfer for the case of the DIS data and the mass of the \(\mathrm {W}\) boson for the case of the muon charge asymmetry and the \(\mathrm {W}{+}\mathrm {c}\) measurement. The strong coupling constant is set to \(\alpha _s (m_{\mathrm {Z}}) = 0.118\). The \(Q^2\) range of HERA data is restricted to \(Q^2 \ge Q^2_{\min } = 3.5\,\text {Ge}\text {V} ^2\) to ensure the applicability of pQCD over the kinematic range of the fit. The procedure for the determination of the PDFs follows the approach used in the earlier CMS analyses [11, 59]. In the following, a similar PDF parameterization is used as in the most recent CMS QCD analysis [59] of inclusive \(\mathrm {W}\) boson production.

The parameterized PDFs are the gluon distribution, \(x\mathrm {g} \), the valence quark distributions, \(x\mathrm {u}_{\mathrm {v}}\) and \(x\mathrm {d}_{\mathrm {v}}\), and the \(\mathrm {u}\)-type and \(\mathrm {d}\)-type anti-quark distributions, \(x\overline{\mathrm {u}}\) and \(x\overline{\mathrm {d}}\), with \(x\mathrm {s}\) (\(x\overline{\mathrm {s}}\)) denoting the strange (anti-)quark distribution. The initial scale of the QCD evolution is chosen as \(Q_0^2 = 1.9\,\text {Ge}\text {V} ^2\). At this scale, the parton distributions are parameterized as:

$$\begin{aligned} x \mathrm {u}_\mathrm {v}(x)&= A_{\mathrm {u}_{\mathrm {v}}} ~ x^{B_{\mathrm {u}_{\mathrm {v}}}} ~ (1-x)^{C_{\mathrm {u}_{\mathrm {v}}}} ~(1+E_{\mathrm {u}_{\mathrm {v}}} x^2), \end{aligned}$$
(3)
$$\begin{aligned} x \mathrm {d}_{\mathrm {v}}(x)&= A_{\mathrm {d}_{\mathrm {v}}} ~ x^{B_{\mathrm {d}_{\mathrm {v}}}} ~ (1-x)^{C_{\mathrm {d}_{\mathrm {v}}}}, \end{aligned}$$
(4)
$$\begin{aligned} x \overline{\mathrm {u}}(x)&= A_{\overline{\mathrm {u}}} ~ x^{B_{\overline{\mathrm {u}}}} ~ (1-x)^{C_{\overline{\mathrm {u}}}} ~(1+E_{\overline{\mathrm {u}}} x^2), \end{aligned}$$
(5)
$$\begin{aligned} x \overline{\mathrm {d}}(x)&= A_{\overline{\mathrm {d}}} ~ x^{B_{\overline{\mathrm {d}}}} ~ (1-x)^{C_{\overline{\mathrm {d}}}}, \end{aligned}$$
(6)
$$\begin{aligned} x \overline{\mathrm {s}}(x)&= A_{\overline{\mathrm {s}}} ~ x^{B_{\overline{\mathrm {s}}}} ~ (1-x)^{C_{\overline{\mathrm {s}}}}, \end{aligned}$$
(7)
$$\begin{aligned} x \mathrm {g}(x)&= A_{\mathrm {g}} ~ x^{B_{\mathrm {g}}} ~ (1-x)^{C_{\mathrm {g}}}. \end{aligned}$$
(8)

The normalization parameters \(A_{\mathrm {u}_{\mathrm {v}}}\), \(A_{\mathrm {d}_{\mathrm {v}}}\), \(A_\mathrm {g}\) are determined by the QCD sum rules, the B parameter is responsible for small-x behavior of the PDFs, and the parameter C describes the shape of the distribution as \(x \rightarrow 1\). The strangeness fraction \(f_\mathrm {s}= \overline{\mathrm {s}}/( \overline{\mathrm {d}}+ \overline{\mathrm {s}})\) is a free parameter in the fit.

The strange quark distribution is determined by fitting the free parameters in Eqs. (3)–(8). The constraint \(A_{\overline{\mathrm {u}}} = A_{\overline{\mathrm {d}}}\) ensures the same normalization for \(\overline{\mathrm {u}}\) and \(\overline{\mathrm {d}}\) densities at \(x \rightarrow 0\). It is assumed that \(x\mathrm {s}= x\overline{\mathrm {s}}\).
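
The functional form of Eqs. (3)–(8) can be evaluated with a short sketch. All parameter values below are hypothetical and are not fit results; they are chosen only so that the example runs and satisfies \(A_{\overline{\mathrm {u}}} = A_{\overline{\mathrm {d}}}\). The x-dependent ratio computed by `strangeness_fraction` is an illustrative simplification of the strangeness fraction \(f_\mathrm {s}\), which enters the fit as a parameter.

```python
def xpdf(x, A, B, C, E=0.0):
    """x*f(x) = A * x**B * (1 - x)**C * (1 + E * x**2) at the starting scale."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return A * x ** B * (1.0 - x) ** C * (1.0 + E * x * x)

# HYPOTHETICAL parameter values (not taken from the fit).
TOY_PARAMS = {
    "ubar": dict(A=0.11, B=-0.17, C=8.0, E=15.0),
    "dbar": dict(A=0.11, B=-0.13, C=5.0),
    "sbar": dict(A=0.08, B=-0.15, C=7.0),
}

def strangeness_fraction(x, params=TOY_PARAMS):
    """sbar / (dbar + sbar) evaluated at a given x from the toy parameters."""
    xs = xpdf(x, **params["sbar"])
    xd = xpdf(x, **params["dbar"])
    return xs / (xd + xs)

fs_example = strangeness_fraction(0.01)
```

Because B sets the power of x and C the power of (1 - x), changing B mainly moves the distribution at small x while changing C mainly affects the fall-off toward x = 1.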

In the earlier CMS analysis [11], the assumption \(B_{\overline{\mathrm {u}}} = B_{\overline{\mathrm {d}}}\) was applied. An alternative assumption \(B_{\overline{\mathrm {u}}} \ne B_{\overline{\mathrm {d}}}\) led to a significant change in the result, which was included in the parameterization uncertainty. In the present analysis, the B parameters of the light sea quarks are independent of each other, \(B_{\overline{\mathrm {u}}} \ne B_{\overline{\mathrm {d}}} \ne B_{\overline{\mathrm {s}}}\), following the suggestion of Ref. [15].

For all measured data, the predicted and measured cross sections together with their corresponding uncertainties are used to build a global \(\chi ^2 \), minimized to determine the initial PDF parameters [61, 62]. The quality of the overall fit can be judged based on the global \(\chi ^2 \) divided by the number of degrees of freedom, \(n_{\mathrm {dof}}\). For each data set included in the fit, a partial \(\chi ^2 \) divided by the number of measurements (data points), \(n_{\mathrm {dp}}\), is provided. The correlated part of \(\chi ^2 \) quantifies the influence of the correlated systematic uncertainties in the fit. The global and partial \(\chi ^2 \) values for each data set are listed in Table 6, illustrating a general agreement among all the data sets.
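
How a \(\chi ^2 \) splits into a partial and a correlated part can be illustrated with a simplified sketch that has a single fully correlated systematic source, treated through one nuisance parameter. This is an assumption made for illustration: the \(\chi ^2 \) definition actually used in the fit framework is more elaborate, and the function name and inputs are introduced here.

```python
def chi2_with_nuisance(data, theory, sigma_unc, beta):
    """Minimize chi2(b) = sum_i (d_i - t_i - b*beta_i)^2 / sigma_i^2 + b^2 over b.

    sigma_unc: uncorrelated uncertainty per point.
    beta: shift of each point for a one-sigma change of the correlated source.
    Returns (partial chi2, correlated chi2, fitted nuisance parameter b).
    """
    resid = [d - t for d, t in zip(data, theory)]
    weights = [1.0 / (s * s) for s in sigma_unc]
    # Analytic minimum: b * (1 + sum w*beta^2) = sum w*beta*resid
    num = sum(w * bt * r for w, bt, r in zip(weights, beta, resid))
    den = 1.0 + sum(w * bt * bt for w, bt in zip(weights, beta))
    b = num / den
    partial = sum(w * (r - b * bt) ** 2
                  for w, bt, r in zip(weights, beta, resid))
    correlated = b * b
    return partial, correlated, b

# Two points that both lie one unit above the prediction.
partial_ex, correlated_ex, b_ex = chi2_with_nuisance(
    data=[11.0, 11.0], theory=[10.0, 10.0],
    sigma_unc=[1.0, 1.0], beta=[1.0, 1.0])
```

In this example the common offset is partly absorbed by the nuisance parameter, so the total of 2/3 is smaller than the value of 2 obtained without a correlated source.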

Table 6 The partial \(\chi ^2 \) per number of data points, \(n_{\mathrm {dp}}\), and the global \(\chi ^2 \) per number of degrees of freedom, \(n_{\mathrm {dof}}\), resulting from the PDF fit

The PDF uncertainties are investigated according to the general approach of HERAPDF 1.0 [57]. The experimental PDF uncertainties arising from the uncertainties in the measurements are estimated by using the Hessian method [72], adopting the tolerance criterion of \(\varDelta \chi ^2 = 1\). The experimental uncertainties correspond to 68% confidence level. Alternatively, the experimental uncertainties in the measurements are propagated to the extracted QCD fit parameters using the MC method [73, 74]. In this method, 426 replicas of pseudodata are generated, with measured values for the cross sections allowed to vary within the statistical and systematic uncertainties. For each of them, the PDF fit is performed and the uncertainty is estimated as the root-mean-square around the central value. Because of possible non-Gaussian tails in the PDF uncertainties, the MC method is usually considered to be more robust and to give more realistic uncertainties, in particular for PDFs not strongly constrained by the measurements, e.g., in the case of too little or not very precise data. In Fig. 6, the distributions of the strange quark content \(\mathrm {s}(x,Q^2)\), and of the strangeness suppression factor \(r_{\mathrm {s}}(x,\mu _f^2)=(\mathrm {s}+\overline{\mathrm {s}})/(\overline{\mathrm {u}}+\overline{\mathrm {d}})\) are presented.
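
The MC replica method can be sketched with a toy in which the PDF fit is replaced by a weighted mean. Several simplifications are assumed: each point is smeared with one Gaussian of width equal to its total uncertainty (correlations are ignored), the central value is taken from the fit to the unsmeared data, and the names `replica_uncertainty` and `weighted_mean_fit` are hypothetical.

```python
import math
import random

def replica_uncertainty(data, sigma_unc, fit, n_replicas=426, seed=1):
    """Propagate data uncertainties to a fitted quantity with pseudodata replicas.

    Returns (central, rms), where rms is the root-mean-square of the replica
    results around the central value.
    """
    rng = random.Random(seed)
    central = fit(data, sigma_unc)
    results = []
    for _ in range(n_replicas):
        pseudo = [rng.gauss(d, s) for d, s in zip(data, sigma_unc)]
        results.append(fit(pseudo, sigma_unc))
    rms = math.sqrt(sum((r - central) ** 2 for r in results) / len(results))
    return central, rms

def weighted_mean_fit(values, sigma_unc):
    """Toy stand-in for the PDF fit: inverse-variance weighted mean."""
    weights = [1.0 / (s * s) for s in sigma_unc]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# HYPOTHETICAL data points with unit uncertainties.
central_mc, rms_mc = replica_uncertainty(
    [10.0, 12.0, 11.0], [1.0, 1.0, 1.0], weighted_mean_fit)
```

For this linear toy the replica spread should approach the analytic uncertainty of the weighted mean; the method becomes more informative for fits in which the parameters respond nonlinearly to the data.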

Fig. 6

The \(\mathrm {s}\) quark distribution (upper) and the strangeness suppression factor (lower) as functions of x at the factorization scale of 1.9\(\,\text {Ge}\text {V} ^2\) (left) and \(m^2_{\mathrm {W}} \) (right). The results of the current analysis are presented with the fit uncertainties estimated by the Hessian method (hatched band) and using MC replicas (shaded band)

Fig. 7

The strangeness suppression factor as a function of x at the factorization scale of 1.9\(\,\text {Ge}\text {V} ^2\) (left) and \(m^2_{\mathrm {W}} \) (right). The results of the current analysis (hatched band) are compared to ABMP16nlo (dark shaded band) and ATLASepWZ16nnlo (light shaded band) PDFs

Fig. 8

The distributions of \(\mathrm {s}\) quarks (upper panel) in the proton and their relative uncertainty (lower panel) as functions of x at the factorization scale of 1.9\(\,\text {Ge}\text {V} ^2\) (left) and \(m^2_{\mathrm {W}} \) (right). The result of the current analysis (filled band) is compared to the result of Ref. [11] (dashed band). The PDF uncertainties resulting from the fit are shown

In Fig. 7, the strangeness suppression factor is shown in comparison with the ATLASepWZ16nnlo and ABMP16nlo PDFs, similar to Fig. 1 in Ref. [15]. Whereas the CMS result for \(r_{\mathrm {s}}(x)\) is close to the ABMP16nlo PDF, it differs significantly from the ATLASepWZ16nnlo PDF for \(x > 10^{-3}\). The significant excess of the strange quark content in the proton reported by ATLAS [14] is not observed in the present analysis.

To investigate the impact of model assumptions on the resulting PDFs, alternative fits are performed, in which the heavy quark masses are varied as \(4.3\le m_\mathrm {b}\le 5.0\,\text {Ge}\text {V} \), \(1.37\le m_\mathrm {c}\le 1.55\,\text {Ge}\text {V} \), and the value of \(Q^2_{\min }\) imposed on the HERA data is varied in the interval \(2.5 \le Q^2_{\min }\le 5.0\,\text {Ge}\text {V} ^2\). Variations of the PDF parameterization, following Ref. [59], are also investigated. These variations do not alter the results for the strange quark distribution or the suppression factor significantly, compared to the PDF fit uncertainty. Since each global PDF group uses its own assumptions for the values of heavy quark masses and cutoffs on the DIS data, these model variations are not quantified further.

To compare the results of the present PDF fit with the earlier determination of the strange quark content in the proton at CMS [11], the “free-s” parameterization of Ref. [11] is used. There, a flexible form [70, 71] for the gluon distribution was adopted, allowing the gluon to be negative. Furthermore, the condition \(B_{\overline{\mathrm {u}}} = B_{\overline{\mathrm {d}}} = B_{\overline{\mathrm {s}}}\) was applied in the central parameterization, while \(B_{\overline{\mathrm {d}}} \ne B_{\overline{\mathrm {s}}}\) was used to estimate the parameterization uncertainty. A complete release of the condition \(B_{\overline{\mathrm {u}}} = B_{\overline{\mathrm {d}}} = B_{\overline{\mathrm {s}}}\) was not possible because of limited data input, in contrast to the current analysis. The same PDF parameterization was used in the ATLASepWZ16nnlo analysis [14]. The results are presented in Fig. 8. The central value of the \(\mathrm {s}\) quark distribution obtained in the present analysis is well within the experimental uncertainty of the results at \(\sqrt{s}= 7\,\text {Te}\text {V} \), while the PDF uncertainty is reduced.

7 Summary

Associated production of \(\mathrm {W}\) bosons with charm quarks in proton–proton collisions at \(\sqrt{s}= 13\,\text {Te}\text {V} \) is measured using the data collected by the CMS experiment in 2016 and corresponding to an integrated luminosity of 35.7\(\,\text {fb}^{-1}\). The \(\mathrm {W}\) boson is detected via the presence of a high-\(p_{\mathrm {T}}\) muon and missing transverse momentum, suggesting the presence of a neutrino. The charm quark is identified via the full reconstruction of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson decaying to \(\mathrm {D}^0 + {\pi }_{\text {slow}}^{\pm } \rightarrow \mathrm {K}^{\mp } + {\pi ^{\pm }}+ {\pi }_{\text {slow}}^{\pm } \). Since in \(\mathrm {W}{+}\mathrm {c}\) production the \(\mathrm {W}\) boson and the \(\mathrm {c}\) quark have opposite charge, contributions from background processes, mainly \(\mathrm {c}\) quark production from gluon splitting, are largely removed by subtracting the events in which the charges of the \(\mathrm {W}\) boson and of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson have the same sign. The fiducial cross sections are measured in the kinematic range of the muon transverse momentum \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), pseudorapidity \(|\eta ^{\mu } | < 2.4\), and transverse momentum of the charm quark \(p_{\mathrm {T}} ^\mathrm {c} > 5\,\text {Ge}\text {V} \). The fiducial cross section of \(\mathrm {W}{+}{\mathrm {D}^{*}(2010)^{\pm }}\) production is measured in the kinematic range \(p_{\mathrm {T}} ^{\mu } > 26\,\text {Ge}\text {V} \), \(|\eta ^{\mu } | < 2.4\), transverse momentum of the \({\mathrm {D}^{*}(2010)^{\pm }}\) meson \(p_{\mathrm {T}} ^{\mathrm {D}^*} > 5\,\text {Ge}\text {V} \) and \(|\eta ^{\mathrm {D}^*} | < 2.4\), and compared to the Monte Carlo prediction. The measurements are performed inclusively and in five bins of \(|\eta ^{\mu } | \).

The obtained values for the inclusive fiducial \(\mathrm {W}{+}\mathrm {c}\) cross section and for the cross section ratio are:

$$\begin{aligned} \sigma (\mathrm {W}{+}\mathrm {c})&= 1026 \pm 31\,\text {(stat)}\,{}^{+76}_{-72}\,\text {(syst)} \text { pb}, \end{aligned}$$
(9)
$$\begin{aligned} \frac{\sigma (\mathrm {W}^+{+}\overline{\mathrm {c}})}{\sigma (\mathrm {W}^-{+}\mathrm {c})}&= 0.968 \pm 0.055\,\text {(stat)}\,{}^{+0.015}_{-0.028}\,\text {(syst)}. \end{aligned}$$
(10)

The measurements are in good agreement with the theoretical predictions at next-to-leading order (NLO) for different sets of parton distribution functions (PDFs), except for the one using the ATLASepWZ16nnlo PDF. To illustrate the impact of these measurements in the determination of the strange quark distribution in the proton, the data are used in a QCD analysis at NLO together with inclusive DIS measurements and earlier results from CMS on \(\mathrm {W}{+}\mathrm {c}\) production and the lepton charge asymmetry in \(\mathrm {W}\) boson production. The strange quark distribution and the strangeness suppression factor \(r_{\mathrm {s}}(x,\mu _f^2)=(\mathrm {s}+\overline{\mathrm {s}})/(\overline{\mathrm {u}}+\overline{\mathrm {d}})\) are determined and agree with earlier results obtained in neutrino scattering experiments. The results do not support the hypothesis of an enhanced strange quark contribution in the proton quark sea reported by ATLAS [14].