Measurement of inclusive charged-particle b-jet production in pp and p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV

A measurement of the inclusive b-jet production cross section is presented in pp and p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV, using data collected with the ALICE detector at the LHC. The jets were reconstructed in the central rapidity region $|\eta|<0.5$ from charged particles using the anti-$k_{\rm T}$ algorithm with resolution parameter $R=0.4$. Identification of b jets exploits the long lifetime of b hadrons, using the properties of secondary vertices and impact parameter distributions. The $p_{\rm T}$-differential inclusive production cross section of b jets, as well as the corresponding inclusive b-jet fraction, are reported for pp and p-Pb collisions in the jet transverse momentum range $10 \le p_{\text{T, ch jet}} \le 100$ GeV/$c$, together with the nuclear modification factor, $R_{\rm pPb}^{\text{b-jet}}$. The analysis thus extends the lower $p_{\rm T}$ limit of b-jet measurements at the LHC. The nuclear modification factor is found to be consistent with unity, indicating that the production of b jets in p-Pb at $\sqrt{s_{\rm NN}} = 5.02$ TeV is not affected by cold nuclear matter effects within the current precision. The measurements are well reproduced by POWHEG NLO pQCD calculations with PYTHIA fragmentation.


Introduction
Charm and beauty quarks arise from hard scattering processes with large four-momentum transfer ($Q^2$). In the subsequent hadronization process they lose their initial virtuality and produce short-lived heavy-flavor hadrons, which can be reconstructed either through their weak hadronic decays or indirectly via their semi-leptonic decay channels. In case of proton-proton collisions, the inclusive production cross section of heavy-flavor hadrons can be calculated with perturbative quantum chromodynamics (QCD) using the factorization approach, which assumes that the collision process can be described by a convolution of parton distribution functions (PDFs), a short-distance parton-level cross section, and a fragmentation function. This factorization was proven to be valid at the leading power of $Q$ [1], as well as the leading power corrections $\mathcal{O}(1/Q^2)$ [2,3]. The concept of the QCD factorization is often extrapolated to proton-nucleus collisions by replacing the usual PDFs with nuclear PDFs (nPDFs), while keeping the short-distance parton-level cross section and the fragmentation function the same [4][5][6][7][8][9]. However, there are also additional phenomena which may or may not be incorporated into the nPDFs, for instance soft gluon interactions between the incoming and/or outgoing hadrons causing $k_{\rm T}$-broadening and energy loss of partons in the cold nuclear matter [10][11][12][13]. These effects may break the QCD factorization in nuclear collisions, thus making the nPDFs process dependent. They are often accounted for as extra modification factors or convolutions with extra functions in various models [14][15][16][17]. The differences between the factorization of proton-proton and proton-nucleus collisions are in general referred to as cold-nuclear-matter (CNM) effects.
The overall impact of the CNM effects on the resulting $p_{\rm T}$-differential inclusive production cross section spectrum can be quantified by means of the nuclear modification factor, defined as the ratio of the particle yield measured in proton-nucleus collisions and the expected yield that would be obtained from a superposition of independent pp collisions. The sensitivity of heavy-flavor probes to CNM effects can be expected to differ from that of light-flavor probes due to the mass-dependent jet fragmentation [18][19][20].
Small collision systems such as pp or p-A provide a natural reference for the more complex nucleus-nucleus collisions. Nuclear matter in these ultra-relativistic heavy-ion collisions can reach extremely high energy densities and temperatures, and transform into its hot and dense deconfined phase, the quark-gluon plasma (QGP) [21][22][23]. Initial parton showers interact with the medium via collisional and radiative processes that cause dissipation and redistribution of energy inside the parton shower. This results in the suppression of high-$p_{\rm T}$ hadrons and jets [24][25][26][27][28] in nucleus-nucleus collisions and the modification of the jet substructure [29][30][31][32], the so-called jet quenching. Since heavy-flavor quarks are mainly produced in initial hard processes and since their numbers remain largely unchanged in the later stages of the reaction [33,34], they provide a unique opportunity to study the space-time evolution of the QGP. In this context, small collision systems represent an important test for theoretical models that account for the system-size-dependent evolution of the QGP signatures as well as CNM effects. Understanding CNM effects is therefore essential for the accurate quantification of the effects of a hot and dense medium in heavy-ion measurements.
The reconstruction of jets containing heavy-flavor hadrons provides more direct access to the primary heavy-flavor parton kinematics than an inclusive measurement of heavy-flavor hadrons. By measuring heavy-flavor jets, production and fragmentation effects can be studied separately. The ALICE Collaboration reported production of charm-tagged jets in pp collisions at $\sqrt{s} = 7$ TeV [35]. Measurements of beauty-tagged jets (b jets) in pp and p-Pb collisions were performed by the CMS experiment [36]. They reported the nuclear modification factor for b jets with transverse momentum larger than 50 GeV/$c$. The ALICE detector has excellent tracking capabilities for low-$p_{\rm T}$ charged particles, which makes it possible to measure b jets at low transverse momenta. This provides a unique opportunity at the LHC to study nuclear modification of b jets down to the region where the energy scale of the jets is comparable to the b-quark mass, which increases sensitivity to mass-dependent effects. In this paper, we present the first measurement of the inclusive charged-particle b-jet $p_{\rm T}$-differential cross section and the b-jet fraction, down to jet transverse momentum $p_{\text{T, ch jet}} = 10$ GeV/$c$ in p-Pb and pp collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV. The measured $p_{\rm T}$ distributions were used to obtain the nuclear modification factor of b jets, $R_{\rm pPb}^{\text{b-jet}}$, in the transverse momentum range $10 \le p_{\text{T, ch jet}} \le 100$ GeV/$c$. The paper is organized as follows: the next section introduces the experimental setup and data sets used for these measurements. Jet reconstruction and the b-jet tagging procedures are described in Sec. 3. Section 4 deals with the correction steps that were applied in the analysis. These include corrections for b-jet tagging efficiency, b-jet tagging purity, and unfolding of the jet momentum smearing due to underlying event fluctuations and instrumental effects. Systematic uncertainties are discussed in Sec. 5. Section 6 is devoted to the discussion of the final results. The paper is summarized in Sec. 7.

Experimental setup and data sets
The ALICE detector [37,38] consists of a central barrel, a forward muon arm, and a set of forward detectors that are used for triggering and event characterization. The central barrel hosts detection systems that provide tracking and particle identification. The most important ones for this analysis are the Inner Tracking System (ITS) and the Time Projection Chamber (TPC). The ITS is a 6-layer silicon tracker, which allows for precise reconstruction of primary interaction and secondary decay vertices. The two innermost layers of the ITS are formed by the Silicon Pixel Detector (SPD). All detectors of the central barrel are placed in a solenoidal magnet that provides a field of 0.5 T along the beam direction.
The present analysis is based on the p-Pb and pp collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV taken by ALICE in 2016 and 2017, respectively. For p-Pb collisions, the beam energies of colliding protons and Pb nuclei were asymmetric: the protons had 4 TeV, while Pb nuclei had an energy of 1.59 TeV per nucleon. In the laboratory frame, this resulted in a rapidity ($y$) shift of the nucleon-nucleon center-of-mass system by $\Delta y = 0.465$ in the direction of the proton beam.
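The quoted center-of-mass energy and rapidity shift can be cross-checked with a short calculation. The sketch below is only an illustration: it assumes that both beams have the same magnetic rigidity, so that the energy per nucleon of the ion beam is $Z/A$ times the proton energy, it neglects all masses, and it uses $Z = 82$ and $A = 208$ for lead. The function name and arguments are hypothetical.

```python
import math


def cm_energy_and_shift(e_proton_tev, z=82, a=208):
    """Return (sqrt(s_NN) in TeV, rapidity shift) for a p-A collision.

    Assumes equal magnetic rigidity of the two beams and massless kinematics.
    """
    e_nucleon_tev = e_proton_tev * z / a  # energy per nucleon of the ion beam
    sqrt_s_nn = 2.0 * math.sqrt(e_proton_tev * e_nucleon_tev)
    delta_y = 0.5 * math.log(e_proton_tev / e_nucleon_tev)  # equals 0.5 * ln(A/Z)
    return sqrt_s_nn, delta_y


sqrt_s_nn, delta_y = cm_energy_and_shift(4.0)
print(f"sqrt(s_NN) = {sqrt_s_nn:.2f} TeV, delta_y = {delta_y:.3f}")
```

Under these assumptions the shift depends only on the ratio $A/Z$ and not on the beam energy itself.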
The main triggering device for the data sets used here is the V0 detector [39], consisting of two scintillator arrays V0A and V0C. They cover the full azimuth angle in the forward and backward pseudorapidity ranges 2.8 < η < 5.1 and −3.7 < η < −1.7, respectively. The minimum bias trigger (MB) is defined by a coincidence of V0A and V0C signals. Timing of the V0A and V0C signals is also used to reject background from beam-gas interactions.
Pile-up events constitute less than 1% (0.5%) of triggered events in pp (p-Pb) collisions. They were identified and rejected using an algorithm that utilizes track segments, formed by hits in the SPD, to recognize events with multiple primary vertices. The remaining undetected pile-up events constitute a negligible fraction of the analysed sample.
The p-Pb data set corresponds to an integrated luminosity of $L_{\rm pPb} = (298 \pm 11)\ \mu{\rm b}^{-1}$ ($624 \times 10^{6}$ MB events) [40], and the pp data set to $L_{\rm pp} = (18.9 \pm 0.4)\ {\rm nb}^{-1}$ ($968 \times 10^{6}$ MB events) [41]. Only events with the location of the reconstructed primary vertex along the beam axis within $|z_{\rm vtx}| < 10$ cm were retained to assure a uniform detector coverage at midrapidity.

Jet reconstruction and b-jet identification
The analysis uses high-quality tracks [35] reconstructed in the pseudorapidity range $|\eta_{\rm track}| < 0.9$ that have at least one hit in either of the two SPD layers. In the regions where the SPD was inefficient, high-quality tracks were supplemented with complementary tracks that do not have a hit in the SPD, to achieve azimuthal uniformity in the tracking acceptance. The momentum resolution of complementary tracks is improved by constraining the origin of the track to the primary vertex. Complementary tracks constitute about 3.5% of all primary tracks. The tracking efficiency for primary tracks with $p_{\rm T} > 1$ GeV/$c$ varies with $p_{\rm T}$ between 70 and 85%. Primary-track momentum resolution is about 0.7% at $p_{\rm T} = 1$ GeV/$c$, 1.6% at $p_{\rm T} = 10$ GeV/$c$, and 4% at $p_{\rm T} = 50$ GeV/$c$. The spatial resolution of the track impact parameter with respect to the primary vertex is better than 75 µm for charged-particle tracks with transverse momentum $p_{\rm T} > 1$ GeV/$c$ and better than 20 µm for tracks with $p_{\rm T} > 20$ GeV/$c$ [35,38]. More information about the track selection can be found in Ref. [35].
Jets were reconstructed using the infrared and collinear safe anti-$k_{\rm T}$ algorithm [42] from the FastJet package [43]. The resolution parameter was set to $R = 0.4$, which ensures that most of the momentum of the initial parton (approximately 70% to 90% in the range of the current measurement) falls within the jet cone [44]. The jets were constructed from charged particles having $p_{\rm T,track} > 0.15$ GeV/$c$ and pseudorapidity $|\eta_{\rm track}| < 0.9$. Their four-momenta were combined using the $p_{\rm T}$ recombination scheme, which considers all particles to be massless [43]. The pseudorapidity coverage of the reconstructed jets was constrained to $|\eta_{\rm jet}| < 0.9 - R = 0.5$ to select only jets that are fully contained within the TPC acceptance.
The reconstructed transverse momentum for jets, $p^{\rm reco}_{\text{T, ch jet}}$, is obtained using the measured transverse momentum of charged-particle jets, $p^{\rm raw}_{\text{T, ch jet}}$, corrected for the mean contribution of the underlying event using the formula $p^{\rm reco}_{\text{T, ch jet}} = p^{\rm raw}_{\text{T, ch jet}} - \rho \times A_{\rm jet}$ [45]. Here $A_{\rm jet}$ denotes the area of the jet and $\rho$ is the mean underlying event $p_{\rm T}$ density. The mean underlying event $p_{\rm T}$ density was calculated on an event-by-event basis using the estimator introduced by CMS [46].
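The subtraction step can be illustrated with a minimal sketch. The density estimator below is an occupancy-corrected median of cluster $p_{\rm T}$ per unit area, written in the spirit of the estimator of Ref. [46]; its exact definition (for instance which clusters enter the median) is not given in this text, so the form used here is an assumption, as are all function names and the toy numbers.

```python
import statistics


def rho_median_with_occupancy(bkg_clusters, acceptance_area):
    """Median pT/area of background clusters, scaled by an occupancy factor.

    bkg_clusters: list of (pt, area) pairs for clusters that contain tracks.
    acceptance_area: total eta-phi area in which clusters could be found.
    The occupancy factor is an assumed way to account for empty regions.
    """
    if not bkg_clusters:
        return 0.0
    median_density = statistics.median(pt / area for pt, area in bkg_clusters)
    occupancy = sum(area for _, area in bkg_clusters) / acceptance_area
    return median_density * occupancy


def subtract_underlying_event(pt_raw, jet_area, rho):
    """p_T^reco = p_T^raw - rho * A_jet (can be negative for very soft jets)."""
    return pt_raw - rho * jet_area


# Toy event with three background clusters given as (pT in GeV/c, area).
rho = rho_median_with_occupancy([(0.6, 0.5), (1.0, 0.5), (0.2, 0.4)], 2.8)
pt_reco = subtract_underlying_event(20.0, 0.5, rho)
```

Using a median rather than a mean keeps the density estimate insensitive to the few hard jets in the event.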
Identification of b jets is based on kinematic variables related to the lifetime of b-hadrons ($c\tau \approx 500$ µm), and the large impact parameter of beauty-hadron-decay daughters. Several discriminator variables were defined and applied in two distinct b-jet tagging methods that are presented in this paper, the impact parameter (IP) method (based on the distance of closest approach, DCA, of the individual jet tracks to the primary vertex), and the displaced secondary vertex (SV) method (based on the topology of a reconstructed secondary vertex using a subset of the jet tracks). The tagging of the b-jet candidates utilizes global tracks only and exploits the high spatial resolution of the SPD. The b-jet $p_{\rm T}$-differential spectra were separately obtained with the two tagging algorithms and eventually combined to improve the accuracy of the measurement. While the IP method generally provides better b-jet tagging efficiency, the SV method has been proven to be more stable at low $p_{\rm T}$. Both methods are discussed in detail below. For more information on b-jet tagging algorithms, the reader may refer to Refs. [47][48][49].

b-jet tagging based on impact parameter
The impact parameter of a track can be measured either in three dimensions or in the projection on the plane perpendicular to the beam axis. This analysis used the latter definition (denoted $d_{xy}$) to exploit the better resolution of the ITS in this plane.
The sign of the impact parameter is determined as the sign of the scalar product of the jet axis and the impact parameter vector pointing from the primary vertex to the point of closest approach. Tracks originating from a secondary vertex tend to have positive impact parameter values because of the mother particle decay length. On the other hand, the tracks originating from the primary vertex can have both positive and negative impact parameter values due to finite resolution which smears the impact parameter of primary tracks symmetrically around the primary vertex. Discrimination among different jet flavors was based on the impact parameter significance ($Sd_{xy}$), defined as the ratio of the impact parameter to its estimated resolution. The impact parameter resolution largely depends on the $\eta$ and $p_{\rm T}$ of the tracks. Figure 1 (top left) shows the probability distribution of the impact parameter significance for tracks belonging to different jet flavors, as determined from a detector-level simulation, where PYTHIA 8 Monash 2013 [50] events were processed with an ALICE GEANT 3-based particle transport model [51].
On average, tracks associated to b jets have larger $Sd_{xy}$ values when compared to c jets and light-flavor jets. This means that the impact parameter has a strong discriminating power in distinguishing between the different jet flavors.

This analysis uses the track counting algorithm [47], which arranges the $Sd_{xy}$ values of tracks in a jet in descending order. A jet was tagged as a b jet if the second largest impact parameter significance value (see Fig. 1 top right) was greater than a certain threshold parameter $Sd^{\rm min}_{xy}$. The default threshold parameter that was chosen in this analysis is $Sd^{\rm min}_{xy} = 2.5$, which gives an average tagging efficiency of 55% with average purity of 42% for b jets with $20 < p^{\rm reco}_{\text{T, ch jet}} < 40$ GeV/$c$. This choice provided an optimum balance between good efficiency and good background rejection. Discrimination based on the tracks with the largest as well as the third largest impact parameter significance value (see Fig. 1 bottom left) was used for consistency checks.
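The track counting decision described above can be sketched in a few lines. The function name and arguments are hypothetical; only the ordering of the significances, the use of the second largest value, and the default threshold of 2.5 are taken from the text.

```python
def track_counting_tag(sd_xy_values, n_th=2, threshold=2.5):
    """Tag a jet if its n-th largest signed Sd_xy exceeds the threshold.

    sd_xy_values: signed impact parameter significances of the jet tracks.
    n_th=2 corresponds to the default choice; n_th=1 or 3 correspond to
    the variants used for consistency checks.
    """
    if len(sd_xy_values) < n_th:
        return False  # not enough tracks to define the discriminator
    ordered = sorted(sd_xy_values, reverse=True)
    return ordered[n_th - 1] > threshold


print(track_counting_tag([5.1, 3.0, -0.4]))  # True: second largest value is 3.0
print(track_counting_tag([6.0, 1.2, 0.3]))   # False: second largest value is 1.2
```

Requiring the second rather than the first largest significance makes the tag less sensitive to a single mismeasured track.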
The purity and b-jet tagging efficiency of the selected b-jet sample presented in Sec. 4.1.1 were determined using the jet probability algorithm [47,49,53]. This algorithm evaluates a combined impact parameter significance of tracks inside the jet and estimates a likelihood that all tracks associated with the jet originated from the primary vertex.
Reconstructed tracks were classified based on different geometric and tracking features. The algorithm defines a resolution function $R_{\rm IP}$ for each category, by fitting the negative side of the signed $Sd_{xy}$ distribution (bottom right panel of Fig. 1). This fit is carried out on the negative part of the distribution because in this range it is predominantly populated by primary tracks originating from the primary vertex.
The resolution functions corresponding to the different track categories were used to calculate the track probability $P_{\rm tr}$. This $P_{\rm tr}$ corresponds to the probability that a high-quality jet constituent track with an impact parameter significance $Sd_{xy}$ is coming from the primary vertex:

$$P_{\rm tr}(Sd_{xy}) = \frac{\int_{-\infty}^{-|Sd_{xy}|} R_{\rm IP}(S)\,{\rm d}S}{\int_{-\infty}^{0} R_{\rm IP}(S)\,{\rm d}S},$$

where the integration is done over the negative side of the impact parameter significance distribution. A large impact parameter value results in a small $P_{\rm tr}$.
The jet probability (JP) is then calculated by combining the $P_{\rm tr}$ values of tracks within a given jet according to the equation [47,49,53]:

$${\rm JP} = \Pi \times \sum_{k=0}^{N_{\rm tr}-1} \frac{(-\ln \Pi)^{k}}{k!}, \qquad \Pi = \prod_{i=1}^{N_{\rm tr}} P_{{\rm tr},i},$$

where $N_{\rm tr}$ is the number of tracks that enter the calculation. Only tracks with positive $Sd_{xy}$ are selected to calculate the jet probability. The JP discriminates between different jet flavors only in a very narrow interval ($0 < {\rm JP} < 0.2$). This distribution is therefore not convenient for discrimination. For this reason, the $-\ln({\rm JP})$ quantity was used as a discriminator in our analysis to determine the b-jet tagging efficiency using a data-driven method. As shown in Fig. 2, the $-\ln({\rm JP})$ distribution decreases much faster for light-flavor jets and c jets when compared to b jets, allowing for an effective statistical discrimination of b jets.
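The jet probability calculation can be illustrated with a minimal sketch. Two assumptions are made loudly here. First, the resolution function is replaced by a unit-width Gaussian, for which the track probability reduces to ${\rm erfc}(|Sd_{xy}|/\sqrt{2})$; the analysis instead uses fitted resolution functions per track category. Second, the combination uses the standard jet-probability expression of the tagging literature, ${\rm JP} = \Pi \sum_{k<N} (-\ln\Pi)^k/k!$ with $\Pi$ the product of the $N$ track probabilities. All names are hypothetical.

```python
import math


def track_probability(sd_xy):
    """P_tr for an ASSUMED unit-width Gaussian resolution function."""
    return math.erfc(abs(sd_xy) / math.sqrt(2.0))


def jet_probability(sd_xy_values, min_tracks=2):
    """Combine track probabilities of positive-Sd_xy tracks into JP.

    Returns None when fewer than min_tracks tracks have positive Sd_xy,
    in which case the discriminator is treated as undefined.
    """
    positive = [s for s in sd_xy_values if s > 0]
    if len(positive) < min_tracks:
        return None
    pi = math.prod(track_probability(s) for s in positive)
    if pi <= 0.0:
        return 0.0  # guard against floating-point underflow
    minus_ln_pi = -math.log(pi)
    series = sum(minus_ln_pi**k / math.factorial(k) for k in range(len(positive)))
    return pi * series


light_like = jet_probability([0.5, 0.8, -0.3])
b_like = jet_probability([4.0, 6.0, 1.5])
print(-math.log(light_like), -math.log(b_like))
```

Jets with several strongly displaced tracks end up at large $-\ln({\rm JP})$, which is why this quantity separates the flavors better than JP itself.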

b-jet tagging based on secondary vertex reconstruction
Secondary vertices (SV), where the weak decay of the beauty hadrons takes place, are in most cases well displaced from the primary vertex of the collision due to the lifetime of beauty hadrons. Beauty hadrons primarily decay to non-prompt charm particles, which typically have a similarly long lifetime. The SV algorithm reconstructs the secondary vertices inside the jets from triplets of jet-constituent tracks. This choice was motivated by the typical decay patterns of beauty hadrons. From all of these reconstructed secondary vertices, the algorithm selects the most displaced vertex for the b-jet tagging. The vertex reconstruction quality is described by the dispersion of the reconstructed secondary vertex, $\sigma_{\rm SV} = \sqrt{d_1^2 + d_2^2 + d_3^2}$, where $d_{1,2,3}$ are the distances of closest approach of the three tracks to the secondary vertex.
This algorithm uses the decay length $L_{xy}$ as a discriminator. The decay length is the distance between the primary vertex and the secondary vertex measured in the plane transverse to the beam axis. The significance is then defined by dividing $L_{xy}$ by its uncertainty, $SL_{xy} = L_{xy}/\sigma_{L_{xy}}$. The b-tagging is then performed by considering both the SV dispersion $\sigma_{\rm SV}$ and the decay length significance $SL_{xy}$.
The default operating point of the tagging in the analysis is $SL_{xy} > 7$ and $\sigma_{\rm SV} < 0.03$ cm. These selection values were determined by optimizing for high b-tagging efficiency and low c-quark and light-flavor mistagging rates based on simulations. Figure 3 shows examples of the $SL_{xy}$ and $\sigma_{\rm SV}$ distributions for jets having different flavors as obtained from PYTHIA 8 simulations using the Monash tune [52] followed by an ALICE detector level MC simulation and reconstruction. Figure 4 shows examples of the $SL_{xy}$ and $\sigma_{\rm SV}$ probability distributions in pp and p-Pb collision data.
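The SV tagging decision at the default operating point can be sketched as follows. The sketch assumes that the dispersion is the quadrature sum of the three track-to-vertex distances and that "most displaced" means the candidate with the largest $L_{xy}$; both are assumptions of this illustration, and the function names and the tuple layout of a vertex candidate are hypothetical. Vertex finding itself is not shown.

```python
import math


def sv_dispersion(d1, d2, d3):
    """Assumed dispersion: quadrature sum of the track-to-vertex distances (cm)."""
    return math.sqrt(d1**2 + d2**2 + d3**2)


def sv_tag(candidates, min_significance=7.0, max_dispersion=0.03):
    """Apply the SL_xy and sigma_SV selection to the most displaced vertex.

    candidates: list of (l_xy, sigma_l_xy, dispersion) tuples in cm,
    one per three-prong secondary vertex found in the jet.
    """
    if not candidates:
        return False
    l_xy, sigma_l_xy, dispersion = max(candidates, key=lambda c: c[0])
    significance = l_xy / sigma_l_xy
    return significance > min_significance and dispersion < max_dispersion


vertices = [
    (0.05, 0.01, sv_dispersion(0.005, 0.004, 0.006)),
    (0.20, 0.02, sv_dispersion(0.010, 0.012, 0.008)),
]
print(sv_tag(vertices))  # True: the most displaced vertex passes both selections
```

Tightening either threshold trades tagging efficiency for purity, which is the optimization referred to above.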

Corrections to the b-tagged jet spectrum
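The raw $p_{\rm T}$ spectrum of b-jet candidates (${\rm d}N^{\rm tagged}/{\rm d}p^{\rm reco}_{\text{T, ch jet}}$) that was obtained after applying the tagging algorithms was corrected for the b-jet tagging efficiency, $\varepsilon_{\rm b}$, and the purity of the selected b-jet sample, $P_{\rm b}$:

$$\frac{{\rm d}N^{\rm raw}_{\text{b jet}}}{{\rm d}p^{\rm reco}_{\text{T, ch jet}}} = \frac{{\rm d}N^{\rm tagged}}{{\rm d}p^{\rm reco}_{\text{T, ch jet}}} \times \frac{P_{\rm b}}{\varepsilon_{\rm b}}. \tag{3}$$

The resulting spectrum is then corrected for jet reconstruction efficiency and momentum smearing due to detector effects and background fluctuations by means of unfolding. All corrections are discussed below in detail.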
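The bin-by-bin efficiency and purity correction can be written as a short sketch, assuming that the tagged yield in each bin is multiplied by the purity and divided by the tagging efficiency. The function name and the example numbers are hypothetical; the efficiency and purity values are the averages quoted for the IP method.

```python
def correct_tagged_spectrum(n_tagged, purity, efficiency):
    """Scale each bin of the tagged spectrum by purity / efficiency."""
    if not (len(n_tagged) == len(purity) == len(efficiency)):
        raise ValueError("all inputs must have the same number of bins")
    return [n * p / e for n, p, e in zip(n_tagged, purity, efficiency)]


# One bin with 1100 tagged candidates, 42% purity, 55% tagging efficiency.
raw_b_jets = correct_tagged_spectrum([1100.0], [0.42], [0.55])
```

The purity removes the charm and light-flavor contamination, while the efficiency restores the b jets that the tagger missed.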

Tagging efficiency
The b-jet tagging algorithms discussed in Section 3 do not identify all produced b jets. The probability that a given tagging algorithm correctly identifies a jet originating from a b quark as a b jet is called the tagging efficiency. Similarly, one can also define the mistagging efficiency as the probability that a jet originating from a charm quark or a light-flavor parton is falsely tagged as a b jet. The efficiency of a given algorithm for tagging or mistagging is defined as

$$\varepsilon_i = \frac{N^{\rm tagged}_i}{N^{\rm total}_i}, \tag{4}$$

where $i$ is the jet flavor (b, c or light-flavor), $N^{\rm tagged}_i$ is the number of tagged $i$ jets, and $N^{\rm total}_i$ is the total number of $i$ jets.

Tagging efficiency of the IP algorithm
The tagging efficiency of the IP algorithm was estimated based on the semi-data-driven method outlined in Refs. [36,47], where the $-\ln({\rm JP})$ distributions are fitted with a set of detector-level MC templates, which describe the shape of the jet probability distributions corresponding to b jets, c jets, and light-flavor jets. The templates for p-Pb were obtained from a MC simulation based on the EPOS event generator [54] with embedded PYTHIA 6 events, where particles are propagated through a model of the ALICE detector using GEANT 3 [51]. The simulated events were then reconstructed as events in data. The templates for pp collisions were obtained similarly, using the PYTHIA 8 MC event generator.
Two jet samples were created: a sample that contains the jets satisfying the tagging requirement (tagged sample), and another sample that contains the inclusive jets before applying the tagging algorithm (untagged sample). The associated $-\ln({\rm JP})$ distributions from data were fitted with the corresponding b, c, and light-flavor jet templates using a binned maximum likelihood fit. The fitting procedure was done separately for the tagged jet (with $Sd^{\rm min}_{xy} = 2.5$) and the inclusive (untagged) jet samples, see Fig. 5. The b-jet tagging efficiency is then obtained as the ratio of the number of identified b jets to the number of b jets before identification:

$$\varepsilon_{\rm b} = \frac{C_{\rm b}\, f^{\rm tag}_{\rm b}\, N^{\rm tag}_{\rm data}}{f^{\rm untag}_{\rm b}\, N^{\rm untag}_{\rm data}},$$

where $f^{\rm untag}_{\rm b}$ and $f^{\rm tag}_{\rm b}$ denote the b-jet fractions before and after tagging, respectively, which are extracted from the fits; $N^{\rm untag}_{\rm data}$ and $N^{\rm tag}_{\rm data}$ give the numbers of jets before and after tagging, which were extracted from data; finally, $C_{\rm b}$ is the fraction of b jets for which the jet probability can be defined, i.e. b jets having at least two constituent tracks with positive $Sd_{xy}$. This factor was estimated from MC. The $C_{\rm b}$ is $\approx 80\%$ at 10 GeV/$c$, increases to 98% at 40 GeV/$c$, and remains at that value for $p_{\rm T} > 40$ GeV/$c$.

Figure 6: The b-jet tagging efficiency extracted from the data-driven method using the IP algorithm in pp and p-Pb collisions.

Figure 6 shows the b-jet tagging efficiency of the IP method in pp and p-Pb collisions. As an alternative for $-\ln({\rm JP})$ in the template fitting, other discriminators were also used to check consistency and estimate systematic uncertainties. The alternative discriminators were the jet mass distribution [55] and the distribution of energy fraction $f_E$ carried by the secondary vertex in the jet. Both of them provide results that are consistent with the standard analysis within one standard deviation. The systematic uncertainty on the tagging efficiency is estimated by fitting the $f_E$ distribution instead of $-\ln({\rm JP})$. While JP may be correlated with the IP, there is no such correlation in the case of $f_E$. The good match between efficiencies and purities obtained with the different methods excludes the possibility that any such correlation affects the results. Finally, it is worth noting that the template fit procedure yields results with large systematic uncertainties for $p_{\text{T, ch jet}} < 20$ GeV/$c$, so the interval $10 < p_{\text{T, ch jet}} < 20$ GeV/$c$ was omitted in the IP analysis. The reason for these uncertainties is that the individual templates have rather similar shapes, causing instabilities in the fitting algorithm and thus reducing the discrimination power of the fit.
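The data-driven efficiency can be illustrated with a short sketch. It assumes that the number of b jets is given by the fitted b-jet fraction times the number of jets in each sample, and that the untagged count is divided by $C_{\rm b}$ to account for b jets without a defined jet probability. The function name and all numbers are hypothetical.

```python
def ip_tagging_efficiency(f_tag, n_tag, f_untag, n_untag, c_b):
    """Data-driven b-jet tagging efficiency from template-fit fractions.

    f_tag, f_untag: fitted b-jet fractions in the tagged and untagged samples.
    n_tag, n_untag: numbers of jets in the tagged and untagged samples.
    c_b: fraction of b jets for which the jet probability is defined.
    """
    n_b_tagged = f_tag * n_tag
    n_b_before_tagging = f_untag * n_untag / c_b
    return n_b_tagged / n_b_before_tagging


# Hypothetical inputs for a single jet-pT interval.
eff_b = ip_tagging_efficiency(f_tag=0.40, n_tag=5000, f_untag=0.04,
                              n_untag=100000, c_b=0.98)
```

Because both fractions come from fits to data, the simulation enters only through the template shapes and $C_{\rm b}$.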

Tagging efficiency of the SV algorithm
For the SV method, tagging and mistagging efficiencies of beauty, charm, and light-flavor jets were estimated based on the same detector-level MC simulation data sets that were used in the IP method. While the IP algorithm used the MC simulation to get templates and assessed the reconstruction efficiency with a data-driven method, the SV algorithm obtained the efficiency directly from the MC simulation via Eq. (4). In particle-level simulations, a jet was counted as a b jet if there was a beauty hadron present with a three-momentum vector contained within the jet cone. An analogous definition was also used for c jets and the remaining jets were considered to be light-flavor jets. Figure 7 presents the efficiencies as a function of jet momentum in pp and p-Pb collisions. The figure shows that tagging with the default selection criteria yields similar performance in both systems, ensuring suppression of light-flavor jets by two orders of magnitude. Comparing the tagging efficiencies of the IP and SV methods, it can be seen that the efficiency of the IP method is about a factor of two higher because of the less stringent selections that are applied.

Purity of the b-jet sample
The b-jet tagging algorithms introduced in Sec. 3 select not only b jets but also a certain fraction of charm and light-flavor jets, cf. Sec. 4.1. Given the higher production cross section of light-flavor and charmed jets, this leads to a significant sample contamination that needs to be corrected for. The purity of the tagged sample of b-jet candidates, $P_{\rm b}$, is defined as the fraction of true b jets over the total number of tagged jets,

$$P_{\rm b} = \frac{N^{\rm tagged}_{\text{b jet}}}{N^{\rm tagged}}.$$

Here $N^{\rm tagged}_{\text{b jet}}$ is the number of tagged true b jets and $N^{\rm tagged}$ is the number of all tagged jets. One of the biggest challenges in the b-jet analysis is to obtain an accurate purity estimate.

b-jet purity from the IP tagging
In the IP method analysis, b-jet purity is estimated using a data-driven method based on the jet probability discriminator. A linear combination of detector-level MC templates corresponding to pure beauty, charm, and light-flavor jets was fitted to the $-\ln({\rm JP})$ distribution measured in data in a similar way as discussed in Sec. 4.1.1. Figure 8 shows the resulting b-jet purity for the IP method with $Sd^{\rm min}_{xy} = 2.5$ in pp and p-Pb collisions. The template fitting procedure was repeated with other discriminators to assess the corresponding systematic uncertainty, as detailed in Sec. 4.1.1. In Fig. 8, one can see that the purity in p-Pb collisions is slightly higher than that in pp collisions. This effect arises from small differences between the two systems. Note that this difference is much smaller than the systematic uncertainties corresponding to the purity calculation.
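The idea of extracting flavor fractions from a template fit can be illustrated with a small, self-contained sketch. It is not the fitter used in the analysis: it maximizes a binned multinomial likelihood with fixed, normalized templates using expectation-maximization updates, which is a simple stand-in for a binned maximum likelihood fit. The function name, templates, and data are hypothetical, and template statistical uncertainties are ignored.

```python
def fit_template_fractions(data, templates, n_iter=1000):
    """Fit fractions f_j so that sum_j f_j * template_j describes the data shape.

    data: list of bin contents of the measured discriminator distribution.
    templates: list of per-flavor templates, each normalized to unit sum.
    Returns fractions that sum to one (EM iterations on a binned likelihood).
    """
    n_templates = len(templates)
    n_total = sum(data)
    fractions = [1.0 / n_templates] * n_templates
    for _ in range(n_iter):
        updated = [0.0] * n_templates
        for i, n_i in enumerate(data):
            model_i = sum(f * t[i] for f, t in zip(fractions, templates))
            if model_i <= 0.0:
                continue  # no template populates this bin
            for j in range(n_templates):
                updated[j] += n_i * fractions[j] * templates[j][i] / model_i
        fractions = [u / n_total for u in updated]
    return fractions


# Hypothetical four-bin templates for b, c, and light-flavor jets.
b_tpl = [0.05, 0.15, 0.30, 0.50]
c_tpl = [0.20, 0.40, 0.30, 0.10]
lf_tpl = [0.70, 0.20, 0.08, 0.02]
tagged_data = [420.0, 250.0, 190.0, 140.0]  # toy tagged-jet distribution
f_b, f_c, f_lf = fit_template_fractions(tagged_data, [b_tpl, c_tpl, lf_tpl])
purity = f_b  # b-jet fraction of the tagged sample
```

When the templates have similar shapes the fit becomes poorly constrained, which matches the instability reported for the lowest jet-$p_{\rm T}$ interval.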

b-jet purity from the SV tagging
The purity of the b-jet candidate sample tagged with the SV method was estimated based on a hybrid method that utilizes both data-driven template fitting and simulations. In p-Pb data, the purities were primarily determined by fitting the invariant mass distribution of the most displaced secondary vertex with beauty, charm, and light-flavor templates. The invariant mass was calculated from the three prongs that were used to reconstruct the secondary vertex, assuming that all tracks have the mass of a charged pion. These templates were obtained from the detector-level EPOS simulation with embedded PYTHIA 6 events. Analogous fits were done also for the pp data using detector-level PYTHIA templates. The fits were done in several $p_{\rm T}$ intervals. Figure 9 shows a typical example of template fit in pp and p-Pb collisions. The small statistical samples, however, prevented the use of the template fitting method for jets with momenta larger than 30-40 GeV/$c$. Therefore, the purity was also estimated based on POWHEG HVQ simulations [56] with the CTEQ6M parton distribution function (PDF) set [57]. In the case of the p-Pb system, the EPS09 nPDF set was applied in addition [58], and the rapidity shift was taken into account. Simulated particle-level charm and beauty jet $p_{\rm T}$ spectra were subjected to instrumental (efficiency and detector effects) and background fluctuation effects to estimate the c- and b-jet contributions in the inclusive raw jet spectrum before tagging. The purity was then estimated in each $p^{\rm reco}_{\text{T, ch jet}}$ bin as

$$P_{\rm b} = \frac{\varepsilon_{\rm b} N_{\rm b}}{\varepsilon_{\rm b} N_{\rm b} + \varepsilon_{\rm c} N_{\rm c} + \varepsilon_{\rm lf}\,(N_{\rm incl} - N_{\rm b} - N_{\rm c})},$$

where $\varepsilon_{\rm b}$, $\varepsilon_{\rm c}$, and $\varepsilon_{\rm lf}$ are tagging and mistagging efficiencies for beauty, charm, and light-flavor jets, respectively; and $N_{\rm b}$ ($N_{\rm c}$) is the estimated contribution of beauty (charm) jets in the raw inclusive untagged jets $N_{\rm incl}$.
Nevertheless, this purity estimate relies on model parameters that cannot be directly validated, i.e., quark masses as well as renormalization and factorization scales used in the computation of the beauty and the charm production cross section. Hence, a statistical analysis was carried out comparing simulated purities with purities obtained by the data-driven invariant mass template fit method simultaneously in a broad range of tagging selection criteria. This was done in order to determine the simulation configurations that are consistent with the results of the data-driven method. Consistency was defined with a $\chi^2/{\rm NDF} < 10$ goodness-of-description test taking into account the total number of degrees of freedom (NDF) in the simultaneous comparison. The configuration space covered variations of the QCD renormalization and factorization scales by factors 0.5-2 with respect to the default values, and variations of the quark masses in the range 4.5-5 GeV/$c^2$ for b-quarks and 1.3-1.7 GeV/$c^2$ for c-quarks. The variation of the heavy quark masses only has a small effect on the observed b-jet sample purity, below 2% for the b-quark and negligible for the c-quark. Changing the factorization (renormalization) scales in the simulation of the b-quark spectrum by a factor of 2 affects the purity in the same (opposite) direction by 4 to 8%, while a factor of 2 change in the renormalization or factorization scales in the simulation of the c-quark spectrum causes a 2 to 6% effect on the resulting purity in either direction.
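The simulation-based purity estimate described above can be sketched as follows. The sketch assumes that the light-flavor contribution is what remains of the inclusive untagged sample after removing the estimated beauty and charm contributions, and that each flavor is tagged with its own efficiency. The function name and the example numbers are hypothetical.

```python
def sv_purity(n_incl, n_b, n_c, eff_b, eff_c, eff_lf):
    """b-jet purity of the tagged sample in one reconstructed jet-pT bin.

    n_incl: inclusive (untagged) raw jet yield.
    n_b, n_c: estimated beauty and charm contributions to n_incl.
    eff_b, eff_c, eff_lf: tagging and mistagging efficiencies.
    """
    n_lf = n_incl - n_b - n_c  # assumed light-flavor remainder
    if n_lf < 0:
        raise ValueError("heavy-flavor yields exceed the inclusive yield")
    tagged_b = eff_b * n_b
    tagged_total = tagged_b + eff_c * n_c + eff_lf * n_lf
    return tagged_b / tagged_total


# Hypothetical bin: 3% beauty and 10% charm in the inclusive sample.
purity = sv_purity(n_incl=100000, n_b=3000, n_c=10000,
                   eff_b=0.30, eff_c=0.05, eff_lf=0.002)
```

The example shows why the purity is sensitive to the simulated beauty and charm cross sections: they set the relative weights of the three terms in the denominator.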
Simulations with accepted configurations were then used to determine the purities in the p-Pb as well as the pp data. Figure 10 shows a comparison of the b-jet sample purity obtained for the default tagging with the template fit method and the POWHEG-simulation-based approach. All accepted configurations were used to assess the systematic uncertainty related to the purity of the tagged b-jet candidate sample.

Figure 9: Invariant mass distribution of the combination of three prongs, forming the most displaced secondary vertex in jets with $20 < p^{\rm reco}_{\text{T, ch jet}} < 30$ GeV/$c$, tagged with the default selection $SL_{xy} > 7$ and $\sigma_{\rm SV} < 0.03$ cm for pp (left) and p-Pb (right) collisions. The data (black points) are fitted with detector-level MC templates corresponding to beauty, charm, and light-flavor jets to assess the purity of the b-jet candidate sample. See text for further information on MC.

Detector effects and unfolding
The measured jet spectra were affected by distortions stemming from two main sources: instrumental effects and local background fluctuations with respect to the mean underlying event density. These two effects smeared the true jet spectrum and can be corrected for via an unfolding procedure. The corrections were assumed to factorize and were thus handled with a product of two matrices that were determined separately [59]. The instrumental effects were accounted for by constructing a response matrix that is based on a b-jet sample generated with PYTHIA 8 [50], and subsequently processed with an ALICE GEANT 3-based particle transport model [51]. The detector-level jets were matched to the particle-level jets based on geometry. This was done by minimizing their angular distance $\Delta R = \sqrt{\Delta\varphi^2 + \Delta\eta^2}$, where $\Delta\varphi$ and $\Delta\eta$ are, respectively, the differences in azimuthal angle and pseudorapidity between given particle-level and detector-level jets. One-to-one correspondence between particle-level and detector-level jets was required, and $\Delta R$ was constrained to be less than 0.25 [60]. The instrumental effects cause a similar shift in the jet energy scale of reconstructed charged-particle b jets and inclusive untagged jets, with untagged jets being shifted by about 1% more. The shift is $p_{\rm T}$-dependent: it is about 2% for b jets with 10 GeV/$c$ and increases to about 18% for 100 GeV/$c$ b jets. The jet energy scale resolution of b jets and inclusive untagged jets is likewise similar, with b jets having about 1% smaller resolution than the untagged jets. The resolution for 10 GeV/$c$ b jets is about 17% and increases to approximately 22% for 100 GeV/$c$ b jets.
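The geometric matching used to fill the response matrix can be sketched as follows. The sketch realizes the one-to-one requirement as a mutual closest-neighbor condition, which is one possible interpretation and an assumption of this illustration; jets are represented as hypothetical `(eta, phi)` tuples and the function names are not taken from the analysis code.

```python
import math


def delta_r(jet_a, jet_b):
    """Angular distance between two (eta, phi) jet axes."""
    d_eta = jet_a[0] - jet_b[0]
    d_phi = math.remainder(jet_a[1] - jet_b[1], 2.0 * math.pi)  # wrap to [-pi, pi]
    return math.hypot(d_eta, d_phi)


def closest_index(jet, others):
    """Index of the jet in `others` that is closest to `jet`, or None."""
    if not others:
        return None
    return min(range(len(others)), key=lambda k: delta_r(jet, others[k]))


def match_jets(det_jets, part_jets, max_dr=0.25):
    """Return (detector index, particle index) pairs of mutually closest jets."""
    pairs = []
    for i, det in enumerate(det_jets):
        j = closest_index(det, part_jets)
        if j is None:
            continue
        mutual = closest_index(part_jets[j], det_jets) == i
        if mutual and delta_r(det, part_jets[j]) < max_dr:
            pairs.append((i, j))
    return pairs


det = [(0.10, 0.20), (-0.30, 3.10)]
part = [(0.12, 0.25), (-0.28, -3.12), (0.40, 1.50)]
print(match_jets(det, part))  # [(0, 0), (1, 1)]
```

The second pair in the example is matched across the $\varphi = \pm\pi$ boundary, which is why the azimuthal difference has to be wrapped before computing $\Delta R$.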
The matrix that describes momentum smearing due to background fluctuations was obtained with two methods based on track embedding and the random cone technique (RC) [61]. In the track-embedding approach, a track was embedded perpendicular in azimuth to the axis of the tagged b-jet candidate. This region is expected to be dominated by the underlying event. The resulting momentum smearing is
$$\delta p_{\rm T} = p_{\rm T,ch\,jet}^{\rm raw,\,emb} - \rho A_{\rm jet} - p_{\rm T}^{\rm emb},$$
where $p_{\rm T,ch\,jet}^{\rm raw,\,emb}$ is the reconstructed momentum of the jet with the embedded track, $A_{\rm jet}$ is its area, $\rho$ is the estimated underlying event $p_{\rm T}$ density, and $p_{\rm T}^{\rm emb}$ is the transverse momentum of the embedded track.
In the RC approach, momentum smearing was calculated using a cone with radius R cone = 0.4 placed in a random position in the η − ϕ plane in an event. This cone must not overlap with the leading and the sub-leading jets in the event and must be fully inside the acceptance of the central barrel.
The momentum smearing is calculated from the tracks inside the cone as
$$\delta p_{\rm T} = p_{\rm T}^{\rm RC} - \rho\,\pi R_{\rm cone}^{2},$$
where $p_{\rm T}^{\rm RC}$ denotes the sum of the $p_{\rm T}$ of the tracks inside the cone. Only events which contained a tagged b-jet candidate were selected for the calculation of $\delta p_{\rm T}$.
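The two $\delta p_{\rm T}$ estimators can be illustrated with a short sketch. It assumes the usual definitions, namely the summed track $p_{\rm T}$ in the cone minus $\rho\,\pi R_{\rm cone}^{2}$ for the random cone, and the jet momentum minus $\rho A_{\rm jet}$ and the embedded-track momentum for the embedding case. Tracks are assumed to be `(pt, eta, phi)` tuples, and the function names are hypothetical.

```python
import math


def delta_pt_random_cone(tracks, cone_eta, cone_phi, rho, r_cone=0.4):
    """delta pT for a random cone: summed track pT inside the cone minus rho * pi * R^2."""
    pt_sum = 0.0
    for pt, eta, phi in tracks:
        # Wrap the azimuthal difference into [-pi, pi).
        d_phi = (phi - cone_phi + math.pi) % (2.0 * math.pi) - math.pi
        if math.hypot(eta - cone_eta, d_phi) < r_cone:
            pt_sum += pt
    return pt_sum - rho * math.pi * r_cone ** 2


def delta_pt_embedding(pt_jet_raw_emb, jet_area, rho, pt_embedded):
    """delta pT for track embedding: background-subtracted jet pT minus the embedded track pT."""
    return pt_jet_raw_emb - rho * jet_area - pt_embedded
```

The selection of the cone position (no overlap with the leading and sub-leading jets, full containment in the acceptance) is not part of this sketch and would have to be applied before calling `delta_pt_random_cone`.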
The $\delta p_{\rm T}$ matrices obtained with the track embedding and RC techniques provided consistent unfolded b-jet spectra, and the difference was accounted for as a systematic uncertainty. The track embedding technique was used in the standard analysis, and the RC method as a systematic variation.
By default, unfolding of the raw b-jet spectrum defined in Eq. (3) was performed using the singular value decomposition (SVD) method [62] implemented in the RooUnfold package [63]. The optimal regularization parameter value was found to be four for the SV analysis and eight for the IP analysis. The stability of the unfolded solutions was also tested with Bayesian unfolding [64] and $\chi^2$ unfolding. These algorithms provided results consistent with the SVD method, and the differences were taken into account in the systematic uncertainties.
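To illustrate the idea behind the iterative Bayesian cross-check, the following is a minimal pure-Python sketch of a D'Agostini-style iteration. It is not the RooUnfold implementation and it ignores uncertainties and regularization by smoothing. The function name `bayes_unfold` and the convention `response[j][i] = P(measured bin j | true bin i)` are assumptions made for this example.

```python
def bayes_unfold(measured, response, prior, iterations=4):
    """Iterative Bayesian unfolding of a binned spectrum.

    measured  : list of measured yields per detector-level bin
    response  : response[j][i] = P(measured bin j | true bin i)
    prior     : initial guess of the true spectrum (only its shape matters)
    """
    n_meas, n_true = len(measured), len(prior)
    # Efficiency of each true bin: probability to be reconstructed in any measured bin.
    eff = [sum(response[j][i] for j in range(n_meas)) for i in range(n_true)]
    norm_prior = float(sum(prior))
    probs = [p / norm_prior for p in prior]
    unfolded = list(prior)
    for _ in range(iterations):
        unfolded = [0.0] * n_true
        for j in range(n_meas):
            # Expected relative population of measured bin j under the current prior.
            folded = sum(response[j][k] * probs[k] for k in range(n_true))
            if folded == 0.0:
                continue
            for i in range(n_true):
                if eff[i] > 0.0:
                    # Bayes' theorem: P(true i | measured j), corrected for efficiency.
                    unfolded[i] += response[j][i] * probs[i] / (eff[i] * folded) * measured[j]
        total = sum(unfolded)
        # The unfolded shape becomes the prior of the next iteration.
        probs = [u / total for u in unfolded]
    return unfolded
```

In this scheme the number of iterations plays the role of the regularization parameter: few iterations keep the result close to the prior, while many iterations approach the unregularized solution.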

b-jet cross section and nuclear modification factor
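The $p_{\rm T}$-differential b-jet production cross section was calculated as
$$\frac{{\rm d}^2\sigma^{\rm b\,jet}}{{\rm d}p_{\rm T,ch\,jet}\,{\rm d}\eta_{\rm jet}} = \frac{1}{L}\,\frac{{\rm d}^2 N^{\rm b\,jet}_{\rm unfolded}}{{\rm d}p_{\rm T,ch\,jet}\,{\rm d}\eta_{\rm jet}},$$
where ${\rm d}^2 N^{\rm b\,jet}_{\rm unfolded}/{\rm d}p_{\rm T,ch\,jet}\,{\rm d}\eta_{\rm jet}$ is the unfolded $p_{\rm T}$-differential yield of b jets and $L$ is the integrated luminosity corresponding to minimum bias events, which was quoted for the pp and p-Pb data samples in Sec. 2.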
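Modification of the b-jet spectrum in p-Pb collisions due to nuclear matter effects was then quantified with the nuclear modification factor [65], which compares the $p_{\rm T}$-dependent production rates in p-Pb to the rates expected from the independent superposition of pp collisions,
$$R_{\rm pPb}^{\text{b-jet}} = \frac{1}{A}\,\frac{{\rm d}^2\sigma_{\rm pPb}^{\rm b\,jet}/{\rm d}p_{\rm T,ch\,jet}\,{\rm d}\eta_{\rm jet}}{{\rm d}^2\sigma_{\rm pp}^{\rm b\,jet}/{\rm d}p_{\rm T,ch\,jet}\,{\rm d}\eta_{\rm jet}},$$
where $A = 208$ is the number of nucleons in the Pb nucleus.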
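The normalization steps of this section amount to simple bin-by-bin arithmetic, sketched below. The sketch assumes that the unfolded yields are already divided by the bin widths in $p_{\rm T,ch\,jet}$ and $\eta_{\rm jet}$. The function names are hypothetical, and any numbers passed to them in an example are toy values, not measured results.

```python
A_PB = 208  # number of nucleons in the Pb nucleus


def cross_section(unfolded_yield, luminosity):
    """Differential cross section per bin: unfolded differential yield divided by the integrated luminosity."""
    return [n / luminosity for n in unfolded_yield]


def r_ppb(sigma_ppb, sigma_pp, mass_number=A_PB):
    """Nuclear modification factor per bin: p-Pb cross section over A times the pp cross section."""
    return [s_ppb / (mass_number * s_pp) for s_ppb, s_pp in zip(sigma_ppb, sigma_pp)]
```

A value of `r_ppb` equal to one in a given bin corresponds to p-Pb production that scales exactly as an independent superposition of `A_PB` nucleon-nucleon collisions.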

Combining the results of the IP and SV methods
The p T -differential b-jet production cross sections obtained from the IP and SV methods were combined using the Best Linear Unbiased Estimator (BLUE) method [66,67]. The BLUE method is used to combine different measurements of the same physical quantity, where the uncertainties of the individual measurements are correlated between the measurements to some extent. Besides the b-jet cross section in pp and p-Pb collisions, the BLUE method was also used to obtain the combined nuclear modification factor R b-jet pPb given that correlated systematic uncertainties cancel to a different degree in the individual ratios for the IP and SV analyses.
The combined results were obtained under the following considerations. The systematic uncertainties from tagging and purity extraction were assumed to be uncorrelated between the two methods. The contributions to the systematic uncertainty from the tracking efficiency and $p_{\rm T}$ resolution, as well as from the contamination by secondary tracks, were treated as fully correlated. Since the same data set was used in the two methods, the statistical uncertainty is partially correlated. The correlation coefficient $\rho_{\rm stat}$ was estimated from $\sigma_{\rm IP}$, $\sigma_{\rm SV}$, and $\sigma_{\rm IP\cap SV}$, where $\sigma_{\rm IP}$ ($\sigma_{\rm SV}$) is the statistical uncertainty corresponding to the jet sample from the IP (SV) method, and $\sigma_{\rm IP\cap SV}$ is the statistical uncertainty corresponding to the sample selected by both the IP and the SV methods. The correlation coefficients for the statistical uncertainty are $\rho_{\rm stat} = 0.35$ for pp collisions and $\rho_{\rm stat} = 0.27$ for p-Pb collisions. For the background fluctuations and unfolding uncertainties, which were partially correlated between both methods, an arbitrarily chosen correlation coefficient value of 0.5 was used, with values 0 and 1 used as consistency checks. Correlation coefficients between other parameters were varied similarly, and the resulting systematic uncertainty from these choices was found to be negligible.
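For two measurements of the same quantity, the BLUE combination has a closed form, which the following sketch illustrates. It assumes that each uncertainty source is described by a pair of absolute uncertainties and one correlation coefficient, and that the sources are mutually independent, so that their covariance matrices add. The function name `blue_combine` and all inputs are hypothetical.

```python
import math


def blue_combine(x1, x2, sources):
    """Combine two measurements with the Best Linear Unbiased Estimator.

    sources : list of (sigma1, sigma2, rho) tuples, one per uncertainty source,
              where rho is the correlation of that source between the two measurements.
    Returns (combined value, combined uncertainty, (weight1, weight2)).
    """
    # Total covariance matrix elements, summed over independent sources.
    c11 = sum(s1 * s1 for s1, _, _ in sources)
    c22 = sum(s2 * s2 for _, s2, _ in sources)
    c12 = sum(rho * s1 * s2 for s1, s2, rho in sources)
    denom = c11 + c22 - 2.0 * c12
    if denom <= 0.0:
        raise ValueError("covariance matrix is singular for these inputs")
    # Weights that minimize the variance under the constraint w1 + w2 = 1.
    w1 = (c22 - c12) / denom
    w2 = 1.0 - w1
    value = w1 * x1 + w2 * x2
    variance = w1 * w1 * c11 + w2 * w2 * c22 + 2.0 * w1 * w2 * c12
    return value, math.sqrt(variance), (w1, w2)
```

For instance, `blue_combine(10.0, 12.0, [(1.0, 1.0, 0.35)])` gives equal weights but a larger combined uncertainty than in the uncorrelated case, which shows why the correlation coefficients discussed above matter for the combined result.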

Sources of systematic uncertainties
Systematic uncertainties of the p T -differential b-jet cross section and R b-jet pPb were assessed by varying the selection and correction procedures. Table 1 lists the possible sources of systematic uncertainties, and the adopted variations, with respect to the standard selection procedures and methods used to obtain the central values of the results. These variations are discussed in more detail below. Table 2 provides a summary of all uncertainties, reported separately for the IP and SV analyses, as well as for the combined results obtained with the BLUE method. The two analyses were developed largely independently from each other. In the IP analysis, all uncertainties were considered as symmetrical, while in the SV analysis, most of the uncertainties were considered as asymmetrical. Systematic uncertainties due to the tracking efficiency and p T resolution, tagging, contamination by secondary tracks, and background fluctuations were treated as correlated between the pp and p-Pb systems. Hence, these were partially propagated into R b-jet pPb , taking the correlation into account. All the other uncertainties were considered uncorrelated and were fully propagated. The different types of correlated systematic uncertainties on the R b-jet pPb were determined by simultaneously varying the pp and the p-Pb results to make sure that the correlations cancel out. Since the combination with the BLUE method requires symmetric uncertainties, two SV spectra were made, one with the lower and one with the upper uncertainties. These spectra were combined with the IP spectrum separately, and a conservative choice was made by taking the maximum of the lower and upper boundaries point-by-point in the combined result. The individual uncertainty sources are discussed in detail in the following paragraphs.
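The partial cancellation of correlated uncertainties in a ratio can be illustrated with first-order error propagation. This closed-form expression is an assumption made for illustration only, since the analysis determined the correlated contributions by simultaneously varying the pp and p-Pb results. The function name `ratio_rel_unc` is hypothetical.

```python
import math


def ratio_rel_unc(rel_num, rel_den, rho):
    """Relative uncertainty of a ratio from the relative uncertainties of the
    numerator and denominator, for one source with correlation coefficient rho."""
    variance = rel_num ** 2 + rel_den ** 2 - 2.0 * rho * rel_num * rel_den
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, variance))
```

With `rho = 1` and equal relative uncertainties the contribution cancels completely, while with `rho = 0` the two contributions add in quadrature, which corresponds to the fully propagated, uncorrelated case.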

Tracking efficiency
The systematic uncertainty on tracking efficiency is about 4% [68]. This uncertainty translates into an uncertainty on the energy scale of reconstructed jets. The resulting effect on the b-jet spectra was estimated by constructing an instrumental response matrix from which 4% of tracks were randomly removed. This matrix represents the downward uncertainty on the reconstruction efficiency. It is assumed that a 4% variation towards higher tracking efficiency would affect the results symmetrically. The tracking efficiency uncertainty is one of the major sources of systematic uncertainties on the b-jet cross section. It tends to increase with increasing b-jet p T .
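The random rejection of tracks used for this variation can be sketched as below. This is an illustration only: the function name `drop_tracks` is hypothetical, tracks can be any Python objects, and the subsequent jet finding and response-matrix construction are not shown.

```python
import random


def drop_tracks(tracks, reject_fraction=0.04, seed=None):
    """Return a copy of the track list in which each track is independently
    removed with probability reject_fraction."""
    rng = random.Random(seed)
    # rng.random() is uniform in [0, 1), so a track survives with probability 1 - reject_fraction.
    return [track for track in tracks if rng.random() >= reject_fraction]
```

Jets would then be reconstructed from the reduced track sample and matched to the particle-level jets to build the varied response matrix.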

p T resolution of tracks
The p T resolution of tracks was discussed briefly in Sec. 3 and more details can be found in Ref. [38]. The systematic uncertainty on track transverse momentum resolution was estimated from the azimuthal variation of the p T spectrum of positively and negatively charged particles following the procedure described in Ref. [69]. The resulting effect of these variations on the b-jet cross section spectra was investigated by unfolding the b-jet spectrum with an instrumental response matrix that reflected the observed local variations in track p T smearing.

Contamination from secondary tracks
Contamination of jets from secondary tracks, due to weak decays, was corrected for using the instrumental matrix. This correction is MC based and relies on the secondary track fractions from the simulations. As a systematic variation, these fractions were taken from a data-driven approach where DCA distributions of tracks to the primary vertex were fitted with templates corresponding to primary tracks and secondary tracks. This resulted in a systematic shift in jet energy scale. In the SV analysis this uncertainty was treated as one-sided since the true fraction of secondary tracks is expected to fall between the two calculations.
The production of long-lived strange particles is known to be poorly described by the PYTHIA MC event generator [70][71][72][73]. Decays of $K^0_{\rm S}$ and strange baryons were, however, found to contribute less than 1% to the constructed light-flavor SV invariant mass templates that are used for the data-driven purity estimate. Possible variations of the strangeness in simulations would therefore have a negligible impact on the shape of this template and on the extracted purity. A similar situation holds for the IP templates, where decays of long-lived strange particles contribute at the percent level only. Omitting the long-lived strange particles from the construction of the templates led to negligible changes of the fit results.

Underlying event fluctuations
This uncertainty was estimated by comparing the spectra unfolded using $\delta p_{\rm T}$ matrices constructed with the track embedding and the random cone methods. This resulted in a one-sided uncertainty in the SV analysis.
Table 2: Statistical and systematic uncertainties, in percent, corresponding to three representative $p_{\rm T,ch\,jet}$ ranges for the pp and p-Pb cross sections, as well as for the $R_{\rm pPb}^{\text{b-jet}}$. Uncertainties of the IP and SV methods are quoted separately. Wherever applicable, the table also reports the resulting combined uncertainties. Both the upper and lower values are listed for the asymmetric SV systematic uncertainties. An additional uncertainty from the normalization by the integrated luminosity [40,41] is quoted in the last row.

b-jet tagging efficiency and purity in the IP method
The uncertainty was estimated by varying the default impact parameter significance and template fit discriminator. The working point of the tagging selection criterion, set by default as $Sd_{xy}^{\rm min} = 2.5$, was varied in the range from 1 to 4. This resulted in variations in the data-driven b-jet tagging efficiency and purity that were propagated to the b-jet cross sections.
Similarly, the energy fraction carried by charged tracks associated with the secondary jet vertex, $f_E$, was used as an alternative template-fitting discriminator. Differences between these methods were added in quadrature with the uncertainties from the fitting method to establish the overall uncertainty on the template fitting.
The purity of the selected b-jet candidates can in principle also be affected by the admixture of the long-lived strange $V^0$ particles ($K^0_{\rm S}$ and $\Lambda/\overline{\Lambda}$), which result in decay-daughter tracks with large impact parameters. The possible effect of these daughter tracks on the purity and efficiency of the IP tagging was tested by ignoring those tracks that, when combined with other tracks of the same event, yield an invariant mass compatible with the $K^0_{\rm S}$ or $\Lambda$ hypothesis. The corresponding systematic effect on the resulting b-jet spectrum was found to be negligible.

b-jet tagging efficiency and purity in the SV method
The default tagging selection, SL xy > 7 and σ SV < 0.03 cm, was chosen to fall into a region where the simulation adequately describes the data. The variations were performed such that one parameter was kept at its default value while the other parameter was altered. In this study, SL xy was varied from 6 to 8, and σ SV was varied from 0.02 to 0.05 cm. Since these two parameters are correlated, the envelope of the systematic variations was considered, constructed using the point-by-point maximal upper and lower variations.
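A point-by-point envelope of systematic variations can be built as in the following sketch. It assumes that the nominal spectrum and each varied spectrum are lists with identical binning. The function name `envelope` is hypothetical and the example values are toy numbers.

```python
def envelope(nominal, variations):
    """Point-by-point envelope of systematic variations.

    Returns (lower, upper): for each bin, the largest downward and the largest
    upward deviation of any varied spectrum from the nominal one (both >= 0).
    """
    lower, upper = [], []
    for i, ref in enumerate(nominal):
        diffs = [varied[i] - ref for varied in variations]
        upper.append(max(0.0, max(diffs, default=0.0)))
        lower.append(max(0.0, -min(diffs, default=0.0)))
    return lower, upper
```

For example, `envelope([10.0, 20.0], [[11.0, 19.0], [9.5, 18.0], [10.2, 20.5]])` returns `([0.5, 2.0], [1.0, 0.5])`, that is, an asymmetric uncertainty in each bin.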
In the SV method, the major source of systematic uncertainty on the b-jet cross sections stems from the purity assessment of the tagged b-jet candidate sample. The uncertainty was evaluated by repeating the analysis with each of the accepted POWHEG purity curves shown in Fig. 10. The uncertainty is defined by the envelope of the resulting spectrum variations.
The POWHEG configurations that provide a statistically acceptable description of the purity are determined based on template fits in the p-Pb system. Since the same configurations are used in the pp system, the assumptions on the CNM effects in POWHEG p-Pb simulations will, counter-intuitively, affect the purity estimation in the pp system. This effect was estimated based on the comparison of the POWHEG simulations to the existing heavy-flavor $R_{\rm pPb}$ measurements [74,75]. This additional, independent uncertainty on the SV-method purity in the pp system was found to be a few percent at low $p_{\rm T}$ and vanishes towards higher $p_{\rm T}$.
In the SV method, since a three-prong secondary vertex is required and the purity is determined based on template fitting of the invariant mass distribution, a possible incorrect modelling of $V^0$ particles has a negligible impact on the purity.

Unfolding
Both the IP and SV methods use SVD unfolding in the standard analysis. To establish the uncertainty stemming from the choice of the unfolding method, the spectra were also unfolded with the Bayesian method and, in the IP analysis, additionally with the $\chi^2$ method. The sensitivity to the choice of the regularization parameter was investigated by changing its value by ±1. The unfolding was also repeated with the lower $p_{\rm T}$ limit of the input spectrum changed from $p_{\rm T} = 5$ GeV/$c$ to $p_{\rm T} = 1$ GeV/$c$. The SV analysis also considered a different binning of the input $p_{\rm T}$ spectrum. Both methods used the b-jet POWHEG spectrum as the default prior function in the respective standard analyses. In the IP analysis, the unfolding was repeated using the measured, as well as the $\chi^2$-unfolded spectra as priors. In the SV analysis, the unfolding was repeated by taking as priors the POWHEG b-jet spectra resulting from different scale and mass variations. The root mean square (RMS) of the differences between these variations and the standard analysis spectra was taken as the uncertainty in the IP analysis. In the SV analysis, the statistical and systematic parts were separated using pseudo-experiments with randomized input spectra. The pseudo-experiments were carried out for the standard analysis configuration as well as for each systematic variation. The maximum deviations at each $p_{\rm T,ch\,jet}$ value were taken as asymmetric uncertainties.
An additional systematic uncertainty stems from the limited knowledge of very low momentum jet production, determined from PYTHIA simulations when constructing the response matrix. This was estimated by using a matrix that was truncated below $p_{\rm T,ch\,jet} = 5$ GeV/$c$, and the resulting deviation with respect to the standard analysis spectrum was added in quadrature to the total uncertainty.
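The RMS of the deviations and the quadrature sum used in this section can be written down in a few lines. This is a minimal sketch with hypothetical function names. It assumes that all spectra share the same binning and that the RMS is taken over the number of variations, which is one possible normalization convention.

```python
import math


def rms_deviation(standard, variations):
    """Per-bin root mean square of the differences between the varied spectra
    and the standard-analysis spectrum."""
    n_var = len(variations)
    return [
        math.sqrt(sum((varied[i] - ref) ** 2 for varied in variations) / n_var)
        for i, ref in enumerate(standard)
    ]


def add_in_quadrature(*components):
    """Per-bin quadrature sum of several uncertainty components."""
    return [math.sqrt(sum(c * c for c in values)) for values in zip(*components)]
```

For example, `add_in_quadrature([3.0], [4.0])` returns `[5.0]`.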

Normalization
There are also uncertainties on the normalization of the differential cross section, which are propagated to the nuclear modification factor. The normalization uncertainties are discussed in detail in Ref. [40] for pp collisions and in Ref. [41] for p-Pb collisions.

Results and discussion
Figure 11 presents the $p_{\rm T}$-differential production cross section of b jets obtained from the IP and SV analyses in pp and p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV. For easier comparison across the two systems, the p-Pb cross section is normalized by the number of Pb nucleons $A = 208$. The results obtained with the two methods are consistent within uncertainties.
Figure 11: Comparison of the $p_{\rm T}$-differential production cross section of charged-particle anti-$k_{\rm T}$ $R = 0.4$ b jets measured in pp and p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV using the IP and SV methods. Systematic and statistical uncertainties are shown as boxes and error bars, respectively. The additional common normalization uncertainty due to luminosity is denoted $\sigma_{L}^{\rm Sys}$ and is quoted separately.
The combined b-jet cross sections are compared with NLO pQCD calculations by the POWHEG dijet tune with PYTHIA 8 fragmentation [76,77], see Fig. 12. The measured b-jet cross section is described by the calculations within the experimental and theoretical uncertainties. The quoted theoretical uncertainties on the POWHEG results include the uncertainties obtained by changing the renormalization and factorization scales by a factor 0.5-2, the variation of $\alpha_{\rm s}$, and the variation of the PDFs of the CT14NLO parton distribution function [78] and the EPPS16 nPDF [79] in the POWHEG calculations. The uncertainties from CT14NLO and EPPS16 were propagated according to the Hessian prescription of the authors of these parameterizations (Eq. 53 of Ref. [79]). The uncertainty on $\alpha_{\rm s}$ was estimated by varying the strong coupling from 0.111 to 0.123.
Figure 12: Top panels: The combined differential production cross section of charged-particle anti-$k_{\rm T}$ $R = 0.4$ b jets measured in pp (left) and p-Pb (right) collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV. The data are compared with an NLO pQCD prediction by the POWHEG dijet tune with PYTHIA 8 fragmentation [76,77]. Systematic and statistical uncertainties are shown as boxes and error bars, respectively. The additional common normalization uncertainty due to luminosity, $\sigma_{L}^{\rm Sys}$, is quoted separately. Bottom panels: Ratio of the theory calculations to the data.
Figure 13 shows the fraction of charged-particle b jets among inclusive charged-particle jets in pp and p-Pb collisions. The reference $p_{\rm T}$-differential cross sections of inclusive charged-particle jet production in pp and p-Pb were taken from Refs. [80] and [81], respectively. The inclusive-jet and the b-jet measurements were obtained from different data samples, collected in different periods. Although the uncertainties corresponding to track reconstruction may be partly correlated, as a conservative approach, both the statistical and systematic uncertainties of the inclusive and b-jet cross sections were considered as uncorrelated. The measured b-jet fractions are compared with calculations of the POWHEG dijet tune with PYTHIA 8 fragmentation [76,77]. In the p-Pb case, the EPPS16 nuclear PDF set was also applied. The measured b-jet fraction is described by these calculations within uncertainties.
Figure 13: b-jet fraction.
Figure 14 (left) shows the nuclear modification factor of charged-particle b jets obtained from the IP and SV methods. The $R_{\rm pPb}^{\text{b-jet}}$ values of the two methods are consistent within uncertainties. Figure 14 (right) displays the combined b-jet nuclear modification factor $R_{\rm pPb}^{\text{b-jet}}$ as a function of $p_{\rm T,ch\,jet}$, compared to the NLO pQCD calculation by the POWHEG dijet tune with PYTHIA 8 fragmentation [76,77]. The NLO $R_{\rm pPb}^{\text{b-jet}}$ was estimated from the ratio of the b-jet spectra obtained with the EPPS16 and CT14NLO parton distribution functions. The $R_{\rm pPb}^{\text{b-jet}}$ is consistent with unity within uncertainties, as well as with a mild modification of $R_{\rm pPb}^{\text{b-jet}} \approx 1.1 \pm 0.1$ predicted by antishadowing in the EPPS16 nuclear PDFs, in the full $10 < p_{\rm T,ch\,jet} < 100$ GeV/$c$ range of the measurement. The pQCD calculations describe the data within uncertainties. These results indicate that there are no strong nuclear matter effects present in b-jet production at midrapidity in p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV.
Figure 14: Left: The nuclear modification factor $R_{\rm pPb}^{\text{b-jet}}$ of the inclusive charged-particle anti-$k_{\rm T}$ $R = 0.4$ b jets as a function of $p_{\rm T}$ from the IP and SV method. Right: The nuclear modification factor $R_{\rm pPb}^{\text{b-jet}}$ obtained from combining the IP and SV method results as a function of $p_{\rm T,ch\,jet}$, compared with the calculation by the POWHEG dijet tune with PYTHIA 8 fragmentation [76,77]. Systematic and statistical uncertainties are shown as boxes and error bars, respectively. There is an additional normalization uncertainty of 4.37% due to luminosity, which is quoted separately.
Figure 15 shows the $R_{\rm pPb}^{\text{b-jet}}$ for charged-particle b jets measured by ALICE as a function of jet $p_{\rm T}$, compared with the measurement of the CMS collaboration for full-jet b jets [36]. Since the jets from CMS also include the neutral particles, the $p_{\rm T}$ scales do not compare directly. Note that there is an additional ≈ 22% scaling uncertainty on the CMS data from the pp reference that was computed using PYTHIA simulations. Despite the different jet definitions and rapidity ranges used in the two measurements, the ALICE and CMS data are fully compatible in the overlap region. A substantial nuclear modification of b-jet production by cold nuclear matter can be excluded in the whole range from $p_{\rm T,ch\,jet} > 10$ GeV/$c$ (approximately corresponding to $p_{\rm T,full\,jet} \gtrsim 15$ GeV/$c$) up to $p_{\rm T,full\,jet} < 400$ GeV/$c$.
Figure 15: The nuclear modification factor $R_{\rm pPb}^{\text{b-jet}}$ for charged-particle b jets measured by the ALICE experiment, compared with the b-jet measurement from the CMS experiment [36]. The CMS measurement represents $R = 0.3$ fully reconstructed b jets within $-2.5 < \eta_{\rm jet} < 1.5$. There is an additional 22% scaling uncertainty from the PYTHIA pp reference on the CMS data that is not shown in the figure. The ALICE $R_{\rm pPb}^{\text{b-jet}}$ data have an additional normalization uncertainty of 4.37%.
The inclusive-jet $R_{\rm pPb}$ is consistent with unity as well as with $R_{\rm pPb}^{\text{b-jet}}$. This suggests that jets in the given $p_{\rm T,ch\,jet}$ range may only be subject to mild cold nuclear matter effects, regardless of the jet-initiating parton.
There is a global normalization uncertainty of 4.37% on the $R_{\rm pPb}^{\text{b-jet}}$ data from the luminosity calculation, and 11.6% on the inclusive-jet $R_{\rm pPb}$ data due to the luminosity calculation and the scaling of the pp reference.

Summary
A measurement of the $p_{\rm T}$-differential b-jet production cross sections in pp and p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV is presented in this paper, in the transverse momentum range $10 \le p_{\rm T,ch\,jet} \le 100$ GeV/$c$ and the central rapidity region. The lower $p_{\rm T}$ reach of the current measurements is unprecedented at the LHC. The fraction of b jets relative to inclusive jets in pp collisions is around 0.02 in the lowest interval, $10 \le p_{\rm T,ch\,jet} < 20$ GeV/$c$, and saturates at about 0.03 for $p_{\rm T,ch\,jet} \ge 30$ GeV/$c$. There is no significant difference between the b-jet fractions measured in pp and p-Pb collisions. The nuclear modification factor $R_{\rm pPb}^{\text{b-jet}}$ is found to be consistent with unity within the current precision, implying no strong cold nuclear matter effects on b-jet production in p-Pb collisions at $\sqrt{s_{\rm NN}} = 5.02$ TeV. The b-jet measurements are described by NLO pQCD POWHEG calculations with PYTHIA 8 fragmentation within uncertainties.
In the low jet transverse momentum range, jet energy loss by radiative and collisional mechanisms in a hot and dense medium is expected to be strongly mass dependent. The current results, which exploit the excellent tracking capabilities of ALICE and reach down to p T,ch jet = 10 GeV/c, provide a baseline for future measurements of nuclear modification in Pb-Pb collisions.