Machine learning-based jet and event classification at the Electron-Ion Collider with applications to hadron structure and spin physics

We explore machine learning-based jet and event identification at the future Electron-Ion Collider (EIC). We study the effectiveness of machine learning-based classifiers at relatively low EIC energies, focusing on (i) identifying the flavor of the jet and (ii) identifying the underlying hard process of the event. We propose applications of our machine learning-based jet identification in the key research areas at the future EIC and current Relativistic Heavy Ion Collider program, including enhancing constraints on (transverse momentum dependent) parton distribution functions, improving experimental access to transverse spin asymmetries, studying photon structure, and quantifying the modification of hadrons and jets in the cold nuclear matter environment in electron-nucleus collisions. We establish first benchmarks and contrast the estimated performance of flavor tagging at the EIC with that at the Large Hadron Collider. We perform studies relevant to aspects of detector design including particle identification, charge information, and minimum transverse momentum capabilities. Additionally, we study the impact of using full event information instead of using only information associated with the identified jet. These methods can be deployed either on suitably accurate Monte Carlo event generators, or, for several applications, directly on experimental data. We provide an outlook for ultimately connecting these machine learning-based methods with first principles calculations in quantum chromodynamics.

The future Electron-Ion Collider (EIC) will map out the structure of nucleons and nuclei with unprecedented precision and allow for novel studies of quantum chromodynamics (QCD), including explorations of the mechanism of hadronization and the properties of cold nuclear matter [1]. The EIC center-of-mass energy of up to $\sqrt{s} = 141$ GeV will allow jets to play a major role in the EIC science program. Jets are collimated sprays of particles observed in detectors that arise from the multiple soft and collinear emissions of highly energetic scattered quarks and gluons. The measured energy and direction of a jet are good proxies for the corresponding quantities of the quark or gluon that initiates the observed jet. This close correspondence between partons and jets is expected to hold at the EIC [2,3]. In this work, we explore the use of machine learning to classify different flavors of jets as well as the underlying hard processes in events. Because of the close connection between jets and partons, accurate classification of jets as well as the underlying hard processes has important implications for the major scientific goals of the EIC. We will outline specific applications of these methods that we envision at the future EIC. Moreover, we expect that the tools and applications discussed in this work are also relevant to the ongoing experimental program at the Relativistic Heavy Ion Collider (RHIC).
The rapid progress in artificial intelligence and machine learning over the last decade has led to various applications in nuclear and high energy physics, including in the areas of classification, generative modeling, regression, and inference [4-8]. In the context of the EIC, machine learning techniques have been proposed to determine kinematic variables in Deep Inelastic Scattering (DIS) [9,10] and to extract quantum correlation functions such as parton distribution functions (PDFs), transverse momentum dependent distribution functions (TMDs), generalized parton distributions (GPDs), and fragmentation functions [11-15].
In high-energy collider physics, the classification of jets (quark vs. gluon, QCD vs. W/Z, proton-proton vs. heavy-ion) has been studied with increasing sophistication over the past years [16-32]. An important goal of these studies is to increase the sensitivity to potential signals of physics beyond the Standard Model. At the Large Hadron Collider (LHC), machine learning-based jet taggers have succeeded in significantly outperforming traditional jet taggers [33], including architectures based on Convolutional Neural Networks [19,34], deep sets [35], and transformer models [36]. Compared to tagging techniques based on traditional observables, algorithms based on supervised machine learning have both benefits and drawbacks. The principal benefit is the ability of the machine learning algorithm to take advantage of the full information at hand and thereby significantly outperform algorithms based on simple closed-form observables. The drawbacks are that the results can be difficult to interpret and connect to first principles, since it is unclear what the machine is learning, and that the (simulated or experimental) data used in the training process may contain biases. Numerous efforts have been made to address the limitations of machine learning-based algorithms, both to improve the interpretability of machine-learned results and to bring the biases of the training data set under control [33]. For example, in Refs. [37-40] complete bases of jet substructure observables were introduced which span the phase space of emissions inside jets. This large set of perturbatively calculable observables can rival the performance of successful machine learning algorithms, which has allowed for insights into how machines learn and an improved understanding of their good performance. In this way, machine learning-based algorithms have driven progress not only in the performance of the classification tasks but also in pushing analytical approaches forward. A key question at the EIC is how large an improvement in performance machine learning-based tagging algorithms can provide, and which applications of these algorithms can drive forward the physics goals of the EIC, as they have at the LHC.
In this work, we explore the application of machine learning-based classifiers at the comparatively low EIC energies. Typically, jet classification studies are carried out at LHC energies where the jet transverse momentum $p_T^{\rm jet}$ is $\mathcal{O}(100-1000)$ GeV. Instead, at the EIC, jets will be produced predominantly with $p_T^{\rm jet}$ of 10-30 GeV. Moreover, an important aspect of machine learning for jets is the sparsity of the data compared to typical tasks in computer vision. Due to the relatively low number of hadrons that make up EIC jets, we expect an increased level of sparsity compared to LHC jets. We will explore whether the reduced energy and increased sparsity of jets at the EIC affect the performance of machine learning-based classifiers. Additionally, at the low EIC energies Monte Carlo event generators must rely on a larger amount of non-perturbative modeling, which can cause biases in the simulated data used to train machine learning classifiers. For the study of spin physics and cold nuclear matter effects, we will propose an approach that can mitigate this issue by training directly on experimental data.
We will explore both quark flavor jet tagging and quark vs. gluon jet tagging, as well as tagging the underlying hard process in the event. For jet flavor tagging, as examples we consider the binary classification tasks u vs. d quark [41], ud vs. s, and uds vs. c jet tagging, and we compare to the jet charge [42,43] as a reference. We expect that the flavor tagging of jets will be an important component to constrain collinear and transverse momentum dependent PDFs, which we discuss more below. For quark vs. gluon tagging and tagging the underlying hard process in events, we consider the classification of $qq$, $q\bar q$ vs. $gg$ di-jet topologies as well as the classification of direct vs. resolved photon contributions to the photoproduction di-jet cross section. The latter provides a new opportunity to enhance constraints on the parton-in-photon PDF.
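For reference, the jet charge used as a baseline above is commonly defined as a transverse-momentum-weighted sum of constituent charges, $Q_\kappa = \sum_i q_i\,(p_{Ti}/p_T^{\rm jet})^\kappa$. The following minimal sketch assumes this standard definition; the function name and the default value of $\kappa$ are illustrative choices and are not taken from this work.

```python
import numpy as np

def jet_charge(pt, charge, pt_jet, kappa=0.5):
    """pT-weighted jet charge Q_kappa = sum_i q_i * (pT_i / pT_jet)**kappa.

    kappa is a free parameter; the default here is an illustrative assumption.
    """
    pt = np.asarray(pt, dtype=float)
    charge = np.asarray(charge, dtype=float)
    z = pt / pt_jet  # momentum fraction carried by each constituent
    return float(np.sum(charge * z**kappa))

# Toy jet (made-up numbers): constituents with charges +1, -1, 0
q_lin = jet_charge(pt=[6.0, 3.0, 1.0], charge=[+1, -1, 0], pt_jet=10.0, kappa=1.0)
q_sum = jet_charge(pt=[6.0, 3.0, 1.0], charge=[+1, -1, 0], pt_jet=10.0, kappa=0.0)
```

With $\kappa = 0$ the observable reduces to the plain sum of charges, while larger $\kappa$ emphasizes the hardest constituents.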
In addition to using the in-jet information to train the classifiers, we will investigate how the performance can be improved by not only making use of the particles inside the jet but also out-of-jet particles in the event. By making use of this additional information, we extend the jet classification task to the classification of the underlying hard process in the event. For typical applications that we foresee at the future EIC, the classification of the underlying hard process can improve measurements that are not necessarily limited to in-jet dynamics. Event-wide classification algorithms can help to improve the measurement of spin asymmetries and studies of cold nuclear matter. The machine-learned event-wide information can also be mapped to traditional observables like N-jettiness ratios [44] or the jet pull [45,46].
In addition to baseline studies of the jet flavor tagging performance, we will gauge the importance of several aspects of detector design within the future detector setup at the EIC [1]. We will consider the role of particle identification (PID) information, charge information, and minimum transverse momentum capabilities on the jet flavor tagging performance. In these studies, which serve as a first step, we retain the ideal efficiency and PID capabilities for the detected particle species, leaving the implementation of a simulated detector response and the impact on the jet energy scale and resolution for future work. Nonetheless, by varying the minimum $p_T$ of the particles input to the classifier as well as comparing the performance when PID or charge information is included, we will elucidate baseline considerations on the importance of reconstructing low-$p_T$ particles and reconstructing PID vs. charge information. For strange quark initiated jets, we additionally study the impact of identifying the weakly decaying strange hadrons from their decay products by comparing the jet flavor tagging performance when the classifier is supplied with the undecayed strange hadrons vs. their decay products.
We foresee several specific applications of machine learning-based jet and event classification to the major physics goals of the EIC, several aspects of which we will discuss in further detail in Section II:
(i) Strengthening constraints on (transverse momentum dependent) PDFs. The flavor tagging of jets will be an important component to constrain collinear and transverse momentum dependent PDFs. For example, charm-tagged jets can increase the sensitivity to the (collinear) strange quark PDF in charged current events [47]. In addition, di-jet events with charm and anti-charm tagging can also help to constrain the gluon TMD, including the gluon Sivers function at the EIC [48]. Jet substructure observables have also been proposed to constrain the gluon PDF at the LHC [49]. New opportunities relevant to RHIC and the EIC include the gluon helicity distribution, the parton-in-photon PDF, spin-dependent TMD PDFs, and fragmentation functions, some of which we explore quantitatively in this work. In our studies, we find that machine learning-based classifiers outperform traditional observables like the jet charge, and therefore we expect that machine learning can significantly enhance the constraints on PDFs. Machine-learned classifiers provide an upper bound on the information content contained in the jet or event [38] and can be used to design closed-form observables using symbolic regression techniques that are calculable in perturbative QCD [36,50,51]. Additionally, machine-learned event-by-event classifiers may eventually be directly included in global analyses of quantum correlation functions like PDFs.
(ii) Enhancing the sensitivity to transverse single spin asymmetries. Transverse Single Spin Asymmetries (TSSAs) constitute some of the hallmark measurements at RHIC and the future EIC, and they provide constraints on the spin structure of the proton. TSSAs are defined as the difference of cross sections where the incoming protons have different transverse spin (↑↓) orientations, see Eq. (1). However, it has generally been challenging to measure non-zero TSSAs, in particular those associated with jets [52]. Recently, the STAR Collaboration used the jet charge as an additional measurement to increase the size of the asymmetry [53], which was also proposed in Ref. [54]. In this paper, we propose that an enhancement of the TSSA signal can be achieved by including an additional machine-learned measurement, Eq. (2), which is here given by the parameters θ. This can be achieved by formulating the regression task in Eq. (2) as a classification problem of jets (or events) that are obtained in scattering processes with differently polarized protons (↑↓) in the initial state. By applying a classifier trained to distinguish jets in events with different initial spin orientations as an additional measurement, similar to the jet charge, larger spin asymmetries may be obtained, which can provide better constraints on the corresponding quantum correlation functions in global analyses.
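For orientation, the asymmetry and the associated optimization over θ can be written in the following generic form. This is a sketch assuming the usual normalization to the spin-summed cross section; the symbol $A_{UT}$ and the explicit $\arg\max$ notation are assumptions and may differ in detail from Eqs. (1) and (2).

```latex
A_{UT} = \frac{d\sigma^{\uparrow} - d\sigma^{\downarrow}}{d\sigma^{\uparrow} + d\sigma^{\downarrow}},
\qquad
\theta^{*} = \underset{\theta}{\arg\max}\; \bigl| A_{UT}(\theta) \bigr| .
```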
(iii) Elucidating cold nuclear matter effects. One of the goals of the EIC is to achieve an understanding of the transport properties of nuclear matter such as the jet transport coefficient $\hat q$, which denotes the mean square of the momentum transfer between a propagating hard jet and the nuclear medium [55-59]. This can be achieved by comparing jet observables in eA collisions to those in ep collisions, similar to the jet quenching program comparing AA and pp collisions at RHIC and the LHC. The entire basis for extracting such properties of nuclear matter is the difference between eA and ep observables. By training machine learning methods to distinguish these two classes of events, one can use interpretable machine learning methods to gain insight into the type of information responsible for these differences, and thereby make connections to calculable observables in perturbative QCD [50]. Additionally, by tagging quark and gluon jets separately, one can achieve a more detailed understanding of the jet quenching interaction. This has remained challenging in the AA jet quenching program [30,60], and the eA program will offer a cleaner environment where such techniques may be more likely to succeed.
The remainder of this work is organized as follows. In Section II, we propose several applications of the machine learning-based jet identification carried out in this paper to the scientific program of the EIC. In Section III, we discuss the event generation setup of our studies and present the different machine learning algorithms used in this work. In Section IV, we present results for jet flavor classification at the EIC, and in Section V we extend the classification to identify underlying hard processes in full events. In Section VI, we draw conclusions and provide an outlook.

II. MACHINE LEARNING APPLICATIONS TO HADRON STRUCTURE AND SPIN PHYSICS
In this Section, we propose several applications of machine learning-based jet classification algorithms to the scientific program of the EIC and the ongoing RHIC program. In Section II A, we provide a high-level description of applications where machine learning-based jet tagging can yield stronger constraints. In Section II B, we outline a proposal to enhance spin asymmetries with machine learning that is largely independent of model biases.

A. Maximizing jet flavor tagging performance
The measurement of longitudinal and transverse spin asymmetries provides constraints on the spin decomposition of the proton. Using QCD factorization, initial and final state spin effects can be disentangled within global analyses of the available data. As a representative example, we consider TSSAs where one of the incoming protons is transversely polarized, see Eq. (1). These asymmetries are generally small due to the cancellation of different parton contributions with opposite sign. In particular, measurements of Sivers [61] and Collins [62] asymmetries are often close to zero due to large cancellations between different PDFs and fragmentation functions.
As discussed also in Section I, since these asymmetries are small, experimental measurements at RHIC and the future EIC are challenging. Due to the relatively large experimental uncertainties, the measured asymmetries are often small or consistent with zero. Measurements have been performed by STAR using di-jet correlations $p^\uparrow + p \to {\rm dijets} + X$ [52,53] as well as with single-inclusive measurements by PHENIX and STAR using pions $p^\uparrow + p \to \pi + X$, jets, open heavy-flavor mesons, and photons [63-67]. Theory calculations corresponding to the di-jet measurements have also recently been performed [68-70].
The small size of the asymmetries is due to approximate cancellations which can be understood from momentum sum rules. In the following, we will consider the Schäfer-Teryaev sum rule [71,72] and the Burkardt sum rule [73-75], which are satisfied by the Collins and Sivers functions, respectively. Both of these sum rules state that the average transverse momentum should vanish when summed over either the outgoing hadron flavors (Collins) or the incoming parton flavors (Sivers). We note that the derivation of these sum rules involves bare quark and gluon operators, and it is therefore unknown how much the sum rules are violated due to renormalization. Nevertheless, they provide an intuitive understanding of the large cancellations between different quark flavors to first order.
First, we consider the Sivers function $f_{1T}^{\perp a}(x, k_T^2)$, which describes the longitudinal momentum fraction $x$ and transverse momentum $k_T$ anisotropy of partons inside a transversely polarized proton. Here the superscript $a = q, \bar q, g$ denotes the parton inside the proton. Including appropriate prefactors and formally integrating over the transverse momentum dependence, we find the moment in Eq. (3), where $M$ is the proton mass. The Burkardt sum rule for the Sivers function states that the integral of this moment over $x$, summed over all partons $a = q, \bar q, g$, vanishes [74], see Eq. (4). Under the assumption that the valence quark distributions dominate, the Burkardt sum rule leads to u- and d-quark Sivers functions that have opposite signs and similar magnitudes. This expected behavior of the u- and d-quark Sivers functions has been confirmed by recent global analyses [76-78]. At the EIC, the Sivers function can be measured, for example, in lepton-jet correlations [79,80].
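In a commonly used convention, the transverse moment and the Burkardt sum rule described above take the following form. The prefactor $k_T^2/(2M^2)$ is an assumption here, since normalization conventions vary in the literature and may differ from those of Eqs. (3) and (4).

```latex
f_{1T}^{\perp(1)a}(x) = \int d^2 k_T \, \frac{k_T^2}{2M^2} \, f_{1T}^{\perp a}(x, k_T^2),
\qquad
\sum_{a=q,\bar q,g} \int_0^1 dx \, f_{1T}^{\perp(1)a}(x) = 0 .
```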
In order to obtain larger spin asymmetries, and hence a greater sensitivity to the underlying Sivers function, we propose that machine learning-based classifiers can be included to isolate the contribution of different quark flavors.
Second, we consider the Collins fragmentation function $H^{\perp}_{1,h/q}(z, P_\perp^2)$ as an example of spin-dependent dynamics in the final state where large flavor cancellations are expected. It describes the longitudinal momentum fraction $z$ and transverse momentum $P_\perp$ distribution of a final-state hadron that originates from a transversely polarized parton. After integrating out the transverse momentum dependence, we find the moment in Eq. (5), where $M_h$ is the mass of the observed hadron. The Schäfer-Teryaev sum rule for the Collins function states that the integral over the longitudinal momentum fraction vanishes after we sum over all hadron species $h$ [71,72], see Eq. (6). The cancellation that results here from summing over all hadrons is typically avoided by measuring identified hadrons in the final state, e.g. by measuring $\pi^+$ and $\pi^-$ cross sections separately [66]. Nevertheless, there can be further cancellations, which can be seen as follows. For simplicity, we now assume that isospin symmetry holds and that the light parton-to-pion fragmentation process dominates. In this case, only two fragmentation channels remain. The favored fragmentation functions are pion fragmentation functions for a valence parton u or d, i.e. $H^{\perp}_{1,\pi^+/u}$ and $H^{\perp}_{1,\pi^-/d}$, respectively, and the unfavored fragmentation functions are pion fragmentation functions of u or d that are not a valence parton, i.e. $H^{\perp}_{1,\pi^-/u}$ and $H^{\perp}_{1,\pi^+/d}$. Assuming that the fragmentation of light partons to pions dominates, this then implies that the favored and unfavored contributions cancel according to the Schäfer-Teryaev sum rule in Eq. (6) for a given parton. If we now choose to measure an identified hadron, say $\pi^+$, we expect the cancellation given in Eq. (7). As the flavor of the jet corresponds to the flavor of the fragmenting parton, up to higher order corrections in QCD, this cancellation is relevant when we consider, for example, the distribution of identified hadrons inside jets [81-85].
[Figure 1, caption fragment: ... photoproduction, where we include both the direct and the resolved contribution. At leading order, the final state consists of the scattered electron in the forward direction close to the beam axis and a di-jet pair, which can be initiated by both quarks and gluons. In both cases, the transverse momentum of the jets is measured relative to the beam axis in the laboratory frame.]
In particular, one can study azimuthal asymmetries that involve the correlation of the transversity PDF and the Collins fragmentation function of the hadron inside the jet [86,87]. Therefore, in order to avoid the cancellation in Eq. (7), we propose that a machine-learned classifier can be used to tag the flavor of the observed jet. Moreover, recently Refs. [88,89] proposed to measure spin asymmetries similar to the Collins asymmetry directly using jets instead of identified hadrons. Sum rules analogous to Eq. (6) apply and are expected to lead to small spin asymmetries. In order to address this problem, Refs. [54,88,89] proposed the use of the jet charge as an additional measurement, which avoids large cancellations between different quark flavors. Here we also propose that the use of machine learning-based classification of the jet flavor can enhance the size of the asymmetry compared to more traditional observables. We expect that various other measurements and science goals of the EIC and RHIC will greatly benefit from machine-learned classifiers that can identify the jet flavor or the hard-scattering event. While some of them will be discussed in this work, we leave more detailed quantitative studies of the following topics for future work:
• Quark flavor and quark vs. gluon jet identification can help to improve the sensitivity to the longitudinally polarized gluon distribution ∆g. In particular, it may be possible to distinguish the positive and negative solutions for ∆g that were found in recent global analyses [90]. See also Refs. [91-95] for recent discussions and experimental results.
• Quark vs. gluon jet classification may help to improve measurements of the gluon Sivers function at RHIC and the future EIC [79,96].
• The techniques discussed here may also improve searches for physics beyond the Standard Model at the EIC [97-99]. For example, in Ref. [100] jet charge-weighted TSSAs were proposed in this context.
• Exclusive / diffractive processes involving jets can provide constraints on GPDs and Wigner functions [101][102][103].We expect that machine learning based classifiers may help to better pin down these higher-dimensional parton distribution functions along with knowledge about the exact kinematics of the di-jet events [104,105].

B. Maximizing the size of spin asymmetries
In the previous Section, we discussed several areas where machine learning-based jet and event flavor tagging can play an important role to support the EIC and RHIC science programs. We implicitly adopted a "UV definition" of the jet flavor. In this case, the flavor of a jet is defined as the hard parton that initiates the jet and it can directly be used in Monte Carlo event generators. The machine learning algorithm is then trained to recover the assigned flavor label from the IR physics, i.e. the hadrons that make up the jet [106]. There are theoretical ambiguities associated with this approach [107], and since the UV label of the jet flavor is not accessible experimentally, machine learning algorithms have to be trained on simulated data. This definition has been widely used for machine learning studies of jet classification at the LHC, and various approaches have been developed to minimize the biases of this approach. For example, data-driven methods [108] and weakly supervised learning [109] have been introduced, which are tailored toward the physics goals at the LHC.
In the spin physics context, we propose an alternative approach to directly train the machine learning algorithms at the observable level. We expect that several of the science goals discussed in the previous Section can be achieved without explicitly relying on simulated data. Spin asymmetries such as TSSAs in Eq. (1) are measured as the difference between cross sections with different (longitudinal or transverse) spin orientations of the particles in the initial state. Therefore, we can train a machine learning algorithm that directly maximizes the size (with positive or negative sign) of the spin asymmetry, see Eq. (8), which repeats Eq. (2). The machine learning algorithm is represented here by the set of parameters θ. The optimization of a machine learning algorithm using Eq. (8) only requires experimentally accessible, hadron-level information. The training does not explicitly require a UV definition of the jet flavor and it does not rely on simulated data. As discussed above, spin asymmetries are often small due to cancellations between different flavor combinations. By including a machine learning algorithm that uses the size of the spin asymmetry as an optimization metric or loss function, we can effectively achieve a flavor separation. Within QCD factorization, the sum rules in Eqs. (4) and (6) provide the direct connection of the machine learning algorithm that is optimized using the objective function in Eq. (8) to parton-level quantities.
Instead of solving the regression problem in Eq. (8) directly, we can formulate the task as a classification problem where the machine learning techniques discussed in the following Sections can be applied. By training a classifier that distinguishes jets produced in events where the incoming proton has opposite transverse or longitudinal spin orientation, we can find a classifier that maximizes the corresponding spin asymmetry. We note that this approach is similar to other discrimination tasks where the training labels are known, such as between jets in proton-proton and heavy-ion collisions, where the trained classifier can be used to maximize the deviation of the nuclear modification factor from unity, Eq. (9). See Ref. [50] for more details. The identification of a machine-learned classifier can be performed directly on data before unfolding [50] or using corrected full events [110]. Subsequently, an observable can be identified that is calculable in perturbative QCD and that approximates the performance of the machine-learned classifier. For example, this can be achieved using complete sets of observables such as the N-subjettiness basis or Energy Flow Polynomials (EFPs) that will be discussed below. Note that in the case where the classifier is trained on uncorrected detector-level inputs, one must carefully account for systematic differences in detector conditions between the two data samples. In this case, the designed observables will not be optimal but will rather approximate maximally discriminating observables, where the deviation of the approximate observable from the optimal observable is driven by the extent to which detector conditions are understood. These identified observables can then be measured with traditional techniques that correct for detector and background effects. In this way, one can verify, without any reliance on machine learning, whether the designed observables maximize spin asymmetries. Ultimately, the observables can then be included in global analyses of quantum correlation functions.
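The idea of using a classifier output as an additional measurement can be illustrated with a toy calculation of count asymmetries in bins of a classifier score. The sketch below is an assumption-laden illustration: the function names, the raw-count asymmetry (which ignores beam polarization and luminosity normalization), and all numbers are hypothetical and not taken from this work.

```python
import numpy as np

def count_asymmetry(n_up, n_down):
    """Raw asymmetry (N_up - N_down) / (N_up + N_down); NaN for empty bins."""
    n_up = np.atleast_1d(np.asarray(n_up, dtype=float))
    n_down = np.atleast_1d(np.asarray(n_down, dtype=float))
    total = n_up + n_down
    return np.divide(n_up - n_down, total,
                     out=np.full_like(total, np.nan), where=total > 0)

def asymmetry_in_score_bins(scores, is_up, edges):
    """Asymmetry of event counts in bins of a (hypothetical) classifier score."""
    scores = np.asarray(scores, dtype=float)
    is_up = np.asarray(is_up, dtype=bool)
    n_up, _ = np.histogram(scores[is_up], bins=edges)
    n_down, _ = np.histogram(scores[~is_up], bins=edges)
    return count_asymmetry(n_up, n_down)

# Toy sample (made-up numbers): scores for spin-up / spin-down events
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
is_up = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=bool)

inclusive = count_asymmetry(is_up.sum(), (~is_up).sum())
binned = asymmetry_in_score_bins(scores, is_up, edges=[0.0, 0.5, 1.0])
```

In this toy sample the inclusive asymmetry vanishes while the two score bins carry asymmetries of opposite sign, mimicking the cancellation between contributions that a classifier-based selection is meant to separate.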

III. SIMULATION AND TRAINING SETUP
To perform the studies in the remainder of this article, we generate simulated events using the Monte Carlo event generator PYTHIA6 [111], which serve as the training data for the (supervised) machine learning-based classification algorithms. In the following, we describe the simulated event sample and the machine learning architecture.

A. Event generation
We generate two data sets for the following studies, both using PYTHIA6 [111] with the eRHIC tune [112]. We use CTEQ6.1 [113] and SAS 1D-LO [114] proton and photon PDFs, respectively. See Figure 1 for an illustration of the two processes in the respective data sets.
First, we generate jet samples using leading-order (LO) DIS as the hard-scattering process for the jet flavor tagging studies discussed in Section IV. At LO, the final state consists of the scattered electron and a single jet originating from different quark flavors. The LO DIS process has the process number 99 in PYTHIA6. We then identify the jet flavor with the flavor of the underlying quark in the LO DIS process ($\gamma^* q \to q$). We require the photon virtuality and inelasticity to be in the range $25 < Q^2 < 1000$ GeV$^2$ and $0.1 < y < 0.85$, respectively.
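The kinematic selection above can be sketched in a few lines using the standard definitions $Q^2 = -(k - k')^2$ and $y = (P \cdot q)/(P \cdot k)$ with $q = k - k'$. The function names, the massless four-vectors, and the 18 GeV on 275 GeV beam configuration in the example are illustrative assumptions; only the cut values are taken from the text.

```python
import numpy as np

def minkowski_dot(a, b):
    """Minkowski product with metric (+, -, -, -) for vectors (E, px, py, pz)."""
    return a[0] * b[0] - np.dot(a[1:], b[1:])

def dis_kinematics(k_in, k_out, p_in):
    """Return (Q2, y) from the incoming/scattered electron and incoming proton."""
    k_in, k_out, p_in = (np.asarray(v, dtype=float) for v in (k_in, k_out, p_in))
    q = k_in - k_out                     # exchanged photon momentum
    q2 = -minkowski_dot(q, q)            # virtuality Q^2 = -q^2
    y = minkowski_dot(p_in, q) / minkowski_dot(p_in, k_in)  # inelasticity
    return float(q2), float(y)

def passes_dis_cuts(q2, y):
    """Cuts quoted in the text: 25 < Q^2 < 1000 GeV^2 and 0.1 < y < 0.85."""
    return bool(25.0 < q2 < 1000.0 and 0.1 < y < 0.85)

# Example with assumed beam energies (massless approximation, GeV)
k_in = [18.0, 0.0, 0.0, -18.0]     # incoming electron along -z
p_in = [275.0, 0.0, 0.0, 275.0]    # incoming proton along +z
k_out = [15.0, 9.0, 0.0, -12.0]    # hypothetical scattered electron
q2, y = dis_kinematics(k_in, k_out, p_in)
selected = passes_dis_cuts(q2, y)
```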
Since gluons do not contribute at LO in DIS, we generate a second data set for our studies in Section V of quark vs. gluon jet tagging using di-jet events in low-$Q^2$ photoproduction, including both the direct and resolved photon contributions. At LO, the final state consists of the scattered electron in the forward direction close to the beam axis and a di-jet pair, which can be initiated by both quarks and gluons. We require low $10^{-5} < Q^2 < 1$ GeV$^2$, while maintaining the same cut on the inelasticity $0.1 < y < 0.85$. We identify quark and gluon jets in the photoproduction events using the PYTHIA6 resolved processes 11 ($qq \to qq$), 12 ($q\bar q \to q\bar q$), 53 ($gg \to q\bar q$) and 13 ($q\bar q \to gg$), 68 ($gg \to gg$) and the direct photon-gluon fusion processes 135 ($\gamma^*_T g \to q\bar q$), 136 ($\gamma^*_L g \to q\bar q$), where the subscripts $T$ and $L$ denote the transverse and longitudinal polarization contributions, respectively. For the quark vs. gluon jet studies in Section V A, we neglect the resolved process 28 ($qg \to qg$) and the direct QCD Compton processes 131 ($\gamma^*_T q \to qg$) and 132 ($\gamma^*_L q \to qg$) in order to avoid ambiguity in labeling processes with $qg$ final states, whereas in Section V B we include them. We note that jet cross sections and jet substructure observables in EIC photoproduction events were considered in Refs. [115-117]. In addition, a comparison of PYTHIA6 results to jet substructure data from HERA was performed in Ref. [117].
In accordance with experimental particle detection capabilities, we include all particles with a lifetime of $c\tau > 1$ cm in the event and in the jet reconstruction. This includes the particle species listed in Eq. (10) and the corresponding anti-particles. Particles with $c\tau < 1$ cm, such as neutral pions $\pi^0$, are decayed until daughters with $c\tau > 1$ cm are produced. The scattered electron is identified as the leading electron in the event and removed before we run the jet clustering algorithm.
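The labeling by process number described above can be summarized as a simple lookup. The sketch below uses only the PYTHIA6 process numbers quoted in the text; the set names, function names, and the convention of returning `None` for the $qg$ processes excluded in Section V A are illustrative assumptions.

```python
# PYTHIA6 process numbers as quoted in the text
QUARK_FINAL_STATE = {11, 12, 53, 135, 136}   # qq or q qbar final states
GLUON_FINAL_STATE = {13, 68}                 # gg final states
MIXED_QG = {28, 131, 132}                    # qg final states (ambiguous label)
DIRECT = {131, 132, 135, 136}                # direct-photon processes
RESOLVED = {11, 12, 13, 28, 53, 68}          # resolved-photon processes

def quark_gluon_label(process_id):
    """Return 'quark' or 'gluon', or None for the qg processes that are dropped."""
    if process_id in QUARK_FINAL_STATE:
        return "quark"
    if process_id in GLUON_FINAL_STATE:
        return "gluon"
    if process_id in MIXED_QG:
        return None
    raise ValueError(f"unexpected process number: {process_id}")

def photon_label(process_id):
    """Return 'direct' or 'resolved' for any of the listed processes."""
    if process_id in DIRECT:
        return "direct"
    if process_id in RESOLVED:
        return "resolved"
    raise ValueError(f"unexpected process number: {process_id}")

labels = [quark_gluon_label(p) for p in (11, 68, 28, 135)]
```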
From the scattered electron we determine the virtuality $Q^2$ of the exchanged photon. We leave the implementation of a simulated detector response for future work, and as a first step we will instead examine the impact of PID, charge information, and minimum transverse momentum requirements on the performance of these taggers in Section IV. Jets are reconstructed in the laboratory frame with the anti-$k_T$ algorithm [118]. We choose a jet radius parameter of $R = 1.0$ and we consider the rapidity range $|\eta_{\rm lab}| < 4$. For the single jets identified in the DIS events, we require the transverse momentum of the identified jets to be $p_T^{\rm jet} > 10$ GeV. We consider a sample of 14M events satisfying these criteria. For the di-jets identified in the low-$Q^2$ photoproduction events, we require the transverse momentum of the leading jet to be $p_{T1}^{\rm jet} > 8$ GeV and that of the subleading jet to be $p_{T2}^{\rm jet} > 5$ GeV, and we require the third leading jet to have $p_{T3}^{\rm jet} < 4$ GeV. We consider a sample of 1.4M events satisfying these criteria, corresponding to an integrated luminosity of approximately 1 fb$^{-1}$ [117], or roughly one month of EIC runtime (estimated at approximately 10 fb$^{-1}$ per year [1]). The size of the sample is determined by studying when the classification performance approximately saturates as the statistics are increased.
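The di-jet transverse momentum requirements can be expressed as a short selection function. This is a minimal sketch: the function name is hypothetical, the thresholds are those quoted above, and treating events without a third jet as passing the veto is an assumption.

```python
def select_dijet(jet_pts, pt1_min=8.0, pt2_min=5.0, pt3_max=4.0):
    """Return True if an event passes the di-jet selection on jet pT (in GeV)."""
    pts = sorted(jet_pts, reverse=True)  # order jets by decreasing pT
    if len(pts) < 2:
        return False
    if pts[0] <= pt1_min or pts[1] <= pt2_min:
        return False
    # Veto a hard third jet; events with only two jets pass (assumption)
    return len(pts) < 3 or pts[2] < pt3_max

results = [
    select_dijet([9.0, 6.0]),        # two jets above threshold
    select_dijet([6.0, 9.0, 3.5]),   # soft third jet, unordered input
    select_dijet([9.0, 6.0, 4.5]),   # third jet too hard
    select_dijet([7.0, 6.0]),        # leading jet too soft
]
```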
We use supervised training throughout this work.In order to deploy supervised models on real data, one can either perform training on simulations or in some cases on experimental data itself.Training models on Monte Carlo simulations may be viable at the EIC if event simulations based on parton showers are sufficiently reliable in the sense that the machine learning algorithms can be trained on a suite of simulations and deployed on data.In addition, some of the studies proposed here can be performed on experimental data before unfolding of detector effects.
For example, classifying different spin orientations or classifying eA vs. ep observables are cases where the training labels are known experimentally, unlike the case in quark flavor or quark vs. gluon classification.This will allow for the identification of suitable observable for which unfolding can be performed afterwards [50].Furthermore, recent advances in unfolding methods may allow one to unfold entire events [110].We leave a more detailed exploration of these aspects for future work.

B. Machine learning algorithms
In the next Sections, we will study the binary classification of jets with different quark flavor and of quark vs. gluon jets using machine learning algorithms. For this task we choose deep sets [119][120][121] as our default classifier, which were introduced as Particle Flow Networks (PFNs) [35] for data obtained in high-energy collisions in particle and nuclear physics. The information about the particles in a jet or collider event can be considered as a set of four-vectors whose length varies event by event. A PFN is a neural network that is invariant with respect to permutations of the input variables and naturally handles inputs of different length. This choice of machine learning architecture is natural for data in collider physics since the number of particles varies event by event and there is no inherent ordering of the particles inside a jet or of the particles in the entire event. We note that other machine learning-based classifiers were found to perform similarly or worse for analogous tasks at LHC energies [35,36]. The PFNs take as input the information of all the particles inside a reconstructed jet. We represent the per-particle input variables as

$$p_i = (z_i, \eta_i, \phi_i, \mathrm{PID}_i), \tag{11}$$

where $z_i = p_{T i}/p_T^{\rm jet}$ is the transverse momentum of particle $i$, measured with respect to the beam axis and normalized to the jet transverse momentum, and $\eta_i$, $\phi_i$ are its rapidity and azimuthal angle. Following Ref. [35], we take $\eta_i$, $\phi_i$ relative to the (E-scheme [122]) jet axis. Lastly, $\mathrm{PID}_i$ in Eq. (11) denotes the particle identification number. See Eq. (10) for the different particles in our data set. We map the PIDs to numerical values in an interval centered around zero. The numerical values are separated by 0.1, and particles (anti-particles) are assigned positive (negative) values.
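The per-particle inputs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species ordering in `ABS_PDG_ORDER`, the choice to give self-conjugate particles a positive value, and the function names are hypothetical, since the text specifies only the 0.1 spacing and the sign convention for particles and anti-particles.

```python
import numpy as np

# Assumed ordering of |PDG code|, for illustration only; the text does not give it.
ABS_PDG_ORDER = [22, 11, 13, 211, 321, 130, 2212, 2112]
STEP = 0.1  # spacing between neighbouring PID values, as stated in the text


def pid_to_feature(pdg_code):
    """Map a PDG code to a signed value: positive for particles, negative for anti-particles."""
    rank = ABS_PDG_ORDER.index(abs(pdg_code)) + 1
    return STEP * rank if pdg_code > 0 else -STEP * rank


def particle_features(pt, eta, phi, pdg, jet_pt, jet_eta, jet_phi):
    """Return an (N, 4) array of (z, eta - eta_jet, phi - phi_jet, PID value)."""
    z = np.asarray(pt, dtype=float) / jet_pt
    d_eta = np.asarray(eta, dtype=float) - jet_eta
    # Wrap the azimuthal difference into [-pi, pi).
    d_phi = (np.asarray(phi, dtype=float) - jet_phi + np.pi) % (2.0 * np.pi) - np.pi
    pid = np.array([pid_to_feature(code) for code in pdg])
    return np.stack([z, d_eta, d_phi, pid], axis=1)
```

Wrapping the azimuthal difference matters for jets close to $\phi = \pm\pi$, where a naive subtraction would place nearby particles almost $2\pi$ away from the jet axis.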
A PFN denoted $f(p_1, \ldots, p_N)$ takes as input the kinematics of the $N$ particles in the event. It is constructed such that it satisfies $f(p_1, \ldots, p_N) = f(p_{\pi(1)}, \ldots, p_{\pi(N)})$, where $\pi$ denotes the permutation operator. The required permutation invariance can be achieved by expressing $f$ as [119]

$$f(p_1, \ldots, p_N) = F\Big(\sum_{i=1}^{N} \Phi(p_i)\Big). \tag{12}$$

Here $\Phi$, $F$ denote fully connected feed-forward neural networks with a certain number of hidden layers. The connections between hidden layers are parametrized in terms of weights and each node has a bias term. The per-particle neural network $\Phi: \mathbb{R}^4 \to \mathbb{R}^d$ maps the input to a $d$-dimensional latent space. The summation operation in latent space leads to the permutation invariance of $f$. The second neural network is a map between the latent space and the final output of the binary classifier, $F: \mathbb{R}^d \to \mathbb{R}$.

FIG. 2. The jet charge distribution for EIC jets with $p_T^{\rm jet} > 10$ GeV produced in high-$Q^2$ events as shown on the left side of Figure 1. The three panels show the results for different flavor discrimination: u vs. d (left), ud vs. s (middle), and uds vs. c (right) for a jet charge parameter of $\kappa = 0.5$, see Eq. (15). The jet charge is able to distinguish u from d, s reasonably well, whereas it is a relatively poor discriminator for u vs. c or q vs. g. Note that a peak at $Q_\kappa = 0$ arises from jets that contain only neutral particles, which happens more frequently compared to its counterpart at the LHC due to lower particle multiplicity at the EIC.
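The deep-sets structure of a PFN, a per-particle network followed by a sum in latent space and a second network acting on the result, can be illustrated with a minimal NumPy sketch. The weights below are random and untrained, a single sigmoid output is used for brevity, and all names (`init_layers`, `dense_relu`, `pfn`) are hypothetical; the sketch only demonstrates permutation invariance and variable-length input, not the trained models of this work.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_layers(sizes):
    """Random (untrained) weights and zero biases for a stack of dense layers."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def dense_relu(x, layers):
    """Apply dense layers, each followed by a ReLU activation."""
    for weights, bias in layers:
        x = np.maximum(x @ weights + bias, 0.0)
    return x


# Phi: R^4 -> R^d with two hidden layers of 100 nodes and latent dimension d = 256.
PHI = init_layers([4, 100, 100, 256])
# F: three hidden layers of 100 nodes followed by a single output node.
F_HIDDEN = init_layers([256, 100, 100, 100])
W_OUT = rng.normal(scale=0.1, size=100)


def pfn(particles):
    """Evaluate F(sum_i Phi(p_i)) for an (N, 4) array of particles."""
    latent = dense_relu(np.asarray(particles, dtype=float), PHI).sum(axis=0)
    logit = dense_relu(latent, F_HIDDEN) @ W_OUT
    return float(1.0 / (1.0 + np.exp(-logit)))
```

Because the particles enter only through the sum over $\Phi(p_i)$, reordering the rows of the input leaves the output unchanged up to floating-point rounding, and jets with different particle multiplicities need no padding in this sketch.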
For quark vs. gluon jet classification, we additionally consider Energy Flow Networks (EFNs), which are Infrared-Collinear (IRC) safe architectures closely related to PFNs [35]. IRC safety is built into the permutation invariant neural network in Eq. (12) by constructing an EFN denoted

$$f(p_1, \ldots, p_N) = F\Big(\sum_{i=1}^{N} z_i\, \Phi(\hat p_i)\Big), \tag{13}$$

where every particle inside a jet is written in terms of its transverse momentum fraction $z_i$ and a two-component vector $\hat p_i = (y_i, \phi_i)$ that contains the angular variables. Due to the linear weighting of $\Phi$ with the momentum fraction $z_i$, the resulting expression is IRC safe [35]: a particle with $z_i \to 0$ drops out of the sum, and a collinear splitting that shares $z_i$ between two particles at the same angle leaves the sum unchanged.
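The IRC safety of the EFN latent sum can be checked numerically. The following sketch uses a small random (untrained) stand-in for $\Phi$; the weights, the latent dimension, and the function names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(2, 16))
W2 = rng.normal(size=(16, 8))


def phi(angles):
    """Hypothetical stand-in for the per-particle network acting on (y, phi) only."""
    return np.tanh(np.tanh(angles @ W1) @ W2)


def efn_latent(z, angles):
    """Latent-space sum of an EFN: sum_i z_i * Phi(angles_i)."""
    z = np.asarray(z, dtype=float)
    angles = np.asarray(angles, dtype=float)
    return (z[:, None] * phi(angles)).sum(axis=0)


z = np.array([0.5, 0.3, 0.2])
angles = np.array([[0.1, -0.2], [0.0, 0.3], [-0.4, 0.1]])
base = efn_latent(z, angles)

# Collinear splitting: the first particle (z = 0.5) becomes two particles
# with z = 0.2 and z = 0.3 at the same angular position.
z_split = np.array([0.2, 0.3, 0.3, 0.2])
angles_split = np.vstack([angles[0], angles])
collinear = efn_latent(z_split, angles_split)

# Soft emission: an extra particle with vanishing momentum fraction.
z_soft = np.append(z, 0.0)
angles_soft = np.vstack([angles, [0.7, -0.9]])
soft = efn_latent(z_soft, angles_soft)
```

Both `collinear` and `soft` agree with `base` up to floating-point rounding for any choice of $\Phi$, since only the linear dependence on $z_i$ is used. A PFN, in which $z_i$ enters as an ordinary input of $\Phi$, does not have this property in general.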
We parametrize the functions Φ and F in Eqs. ( 12) and ( 13) in terms of DNNs, using the EnergyFlow package [35] with Keras [123]/TensorFlow [124].For Φ we use two hidden layers with 100 nodes each and a latent space dimension of d = 256.For F we include three layers with 100 nodes each.For each dense layer we use the ReLU activation function [125] and we use the softmax activation function for the final output layer of the classifier.We train the neural networks using the Adam optimizer [126] and the binary cross entropy loss function [127], and train for 10 epochs with a batch size of 500.We do not perform additional hyperparameter optimization for the PFNs or EFNs.We reserve 20% of the training sample as a validation set, and an additional 20% as a test set on which all metrics are reported.We train the models using an NVIDIA A100 GPU on the Perlmutter supercomputer.
For quark vs. gluon tagging, we will also consider dense neural networks (DNNs) that take as input a list of observables that are IRC safe and generally calculable within perturbative QCD. The resulting classifier is generally Sudakov safe [38]. The observables that are taken as input to the DNN form a complete basis of observables. As an example we consider Energy Flow Polynomials (EFPs) [37]. Alternatively, one could consider the N-subjettiness basis [38][39][40]. The EFPs constitute a linear basis of jet substructure observables and they are defined as

$$\mathrm{EFP}_G = \sum_{i_1=1}^{N} \cdots \sum_{i_{|V|}=1}^{N} z_{i_1} \cdots z_{i_{|V|}} \prod_{(k,l) \in E} \theta_{i_k i_l}, \tag{14}$$

where we sum over all particles inside the jet and $z_i$, $\theta_{ij}$ denote the longitudinal momentum fraction of particle $i$ and the relative angle between particles $i$ and $j$, respectively. The subscript $G = (V, E)$ indicates that EFPs are defined in terms of a graph with vertices $V$ and edges $E$ that specifies which terms are included on the right hand side of Eq. (14). See Ref. [37] for more details. We note that this basis is insensitive to quark flavor differences but provides a powerful discriminant for quark vs. gluon jet tagging. In addition, EFPs provide an increased degree of interpretability compared to PFNs. For the EFP DNNs, we use 3 hidden layers containing between 32 and 512 nodes, each with a ReLU activation function [125], followed by a sigmoid activation for the final output layer. We train the neural network with the Adam optimizer [126], a learning rate between 0.001 and 0.01, and a batch size of 1000, with the binary cross entropy loss function of Ref. [127]. We use Keras [123]/TensorFlow [124] for the implementation, and determine the number of nodes in each hidden layer and the learning rate using a hyperparameter optimization with the Hyperband algorithm [128] implemented in Keras Tuner [129]. As in the case of the PFNs, we reserve 20% of the training sample as a validation set (on which the hyperparameter tuning is performed), and an additional 20% as a test set on which all metrics are reported.
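To make the graph-based definition of the EFPs concrete, the following brute-force sketch evaluates an EFP for a given edge list by summing over all assignments of particles to vertices. It is meant for small examples only and is not the optimized algorithm of the EnergyFlow package; the function names and the choice of $\theta_{ij}$ as the rapidity-azimuth distance are assumptions made here.

```python
import itertools

import numpy as np


def pairwise_angles(y, phi):
    """Rapidity-azimuth distance between all particle pairs (assumed angular measure)."""
    y = np.asarray(y, dtype=float)
    phi = np.asarray(phi, dtype=float)
    d_y = y[:, None] - y[None, :]
    d_phi = (phi[:, None] - phi[None, :] + np.pi) % (2.0 * np.pi) - np.pi
    return np.sqrt(d_y**2 + d_phi**2)


def efp(z, theta, n_vertices, edges):
    """Brute-force EFP: sum over all maps from graph vertices to particles."""
    z = np.asarray(z, dtype=float)
    total = 0.0
    for idx in itertools.product(range(len(z)), repeat=n_vertices):
        term = np.prod(z[list(idx)])      # one factor z per vertex
        for k, l in edges:                # one factor theta per edge
            term *= theta[idx[k], idx[l]]
        total += term
    return float(total)


# Two-particle example: the single-edge graph gives sum_ij z_i z_j theta_ij.
theta = pairwise_angles([0.0, 0.3], [0.0, 0.4])
line_efp = efp([0.6, 0.4], theta, n_vertices=2, edges=[(0, 1)])
```

The cost of this direct evaluation grows as $N^{|V|}$ with the number of particles $N$, which is why it is only practical for illustration.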
The performance of a classifier can be assessed by various metrics quantifying the rates of correct and incorrect identification of the two classes. There are four possible outcomes of a classifier's prediction, "True/False Positive/Negative", where "True/False" denotes whether the classifier prediction was correct, and "Positive/Negative" refers to the predicted class label. In this work, we will use the following conventions for the positive class:
• u vs. d classification (Section IV A): d
• ud vs. s classification (Section IV C): s
• uds vs. c classification (Section IV C): c
• $qq/q\bar q$ vs. $gg$ classification (Section V A): $qq/q\bar q$
• direct vs. resolved classification (Section V B): direct
We will consider two metrics in this work, the Receiver Operating Characteristic (ROC) curve and the Precision-Recall (PR) curve. These metrics are defined in terms of the following quantities, each of which is a cumulative distribution function of the corresponding probability distribution:
• True Positive Rate (recall) = True Positives / (True Positives + False Negatives)
• False Positive Rate = False Positives / (False Positives + True Negatives)
• Precision = True Positives / (True Positives + False Positives)
The ROC curve shows the true positive rate vs. the false positive rate as the decision threshold is varied. A random classifier corresponds to a diagonal line with an area under the curve (AUC) of 0.5, and the better a classifier is, the closer the curve is to the upper left edge of the plot, with a perfect classifier having AUC = 1. The ROC curve does not depend on the relative proportions of the two classes, and we will use it for classification tasks where there is not a large imbalance in the proportions of the two classes, such as u vs. d and q vs. g classification.

FIG. 3. Illustration of particles inside the jet (black) and out-of-jet radiation (red), which we also take into account to classify both the flavor of the jet as discussed in Section IV B and the underlying hard process in the event as discussed in Section V. As an example, we show jet production in a high-$Q^2$ DIS scattering process.
The PR curve shows the precision vs. recall as the decision threshold is varied.The PR curve explicitly depends on the relative proportions of the two classes, since the precision is a measure of the purity of the predicted positive class.A random classifier based solely on the relative proportions of the two classes follows a line of constant precision.The larger the precision and recall, the better the classifier is.A classifier with high precision but low recall returns only a small fraction of positive cases (low efficiency) but most of them being identified correctly (high purity), whereas a classifier with low precision but high recall returns a large fraction of positive cases (high efficiency) but with many of them being identified incorrectly (low purity).We will use the PR for classification tasks where there is a large imbalance in the proportions of the two classes, such as strange and charm jet classification.
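The ROC and PR curves discussed above can be computed from classifier scores with a short NumPy sketch. It assumes distinct scores (ties are not treated specially) and labels of 1 for the positive class and 0 for the negative class; the function names are hypothetical.

```python
import numpy as np


def roc_pr_points(scores, labels):
    """Return (FPR, TPR, precision) as the decision threshold is lowered past each score."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")   # highest score first
    sorted_labels = labels[order]
    true_pos = np.cumsum(sorted_labels)          # positives accepted so far
    false_pos = np.cumsum(1 - sorted_labels)     # negatives accepted so far
    n_pos = sorted_labels.sum()
    n_neg = len(sorted_labels) - n_pos
    tpr = true_pos / n_pos                       # recall (efficiency)
    fpr = false_pos / n_neg
    precision = true_pos / (true_pos + false_pos)  # purity
    return fpr, tpr, precision


def roc_auc(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule, starting from (0, 0)."""
    x = np.concatenate([[0.0], fpr])
    y = np.concatenate([[0.0], tpr])
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))
```

In this convention the last entry of `precision`, where every jet is accepted, equals the fraction of positive-class jets in the sample, which is the constant-precision baseline of a random classifier mentioned above.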

IV. JET FLAVOR TAGGING
Using the LO DIS events described in Section III A, we now study various binary classifications of quark-jet flavors.We consider several different classification groupings: u vs. d, ud vs. s, and uds vs. c quark jets.We will study the role of PID information, charge information, and minimum particle transverse momentum thresholds on the performance of the classifiers, as well as the role of both in-jet and out-of-jet particles.
We will benchmark our machine learning-based algorithms against the energy-weighted jet charge [130]

$$Q_\kappa = \sum_{i \in \mathrm{jet}} z_i^\kappa\, Q_i, \tag{15}$$

where $z_i = p_{T i}/p_T^{\rm jet}$ denotes the longitudinal momentum fraction of the hadrons $i$ inside the jet and $Q_i$ is their electric charge. The weighting factor $z_i^\kappa$ reduces the sensitivity to experimental uncertainties and $\kappa$ is a free parameter that we will vary in our numerical studies below. The jet charge is soft safe but collinear unsafe, which means that theoretical calculations require a nonperturbative input that needs to be determined from experiment. Theoretical calculations of the jet charge were performed in Refs. [42,43]. Extensions of the jet charge definition in Eq. (15) were proposed in Refs. [41,131]. Theoretical work on defining the flavor of jets can be found in Refs. [132][133][134][135][136]. Experimental measurements at the LHC can be found in Refs. [137][138][139]. In Figure 2, we show the jet charge distributions for the LO DIS jets considered in this Section. The jet charge is able to distinguish jets initiated by quarks of different electric charge reasonably well, such as u from d, s, whereas it is a relatively poor discriminator for u vs. c since they have the same electric charge, and similarly for q vs. g (not shown here). The jet charge thereby serves as a reference to which the performance of our machine learning-based algorithms can be compared.
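The energy-weighted jet charge used as the benchmark can be written as a short function. This is a minimal sketch; the function name and argument layout are assumptions, and charges are given in units of the elementary charge.

```python
import numpy as np


def jet_charge(pt, charge, jet_pt, kappa=0.5):
    """Energy-weighted jet charge: sum over constituents of z_i**kappa * Q_i."""
    z = np.asarray(pt, dtype=float) / jet_pt
    return float(np.sum(z**kappa * np.asarray(charge, dtype=float)))


# Example: a 16 GeV jet with a positive 4 GeV hadron, a negative 1 GeV hadron,
# and the remaining 11 GeV carried by a neutral particle.
q_example = jet_charge([4.0, 1.0, 11.0], [+1, -1, 0], jet_pt=16.0, kappa=0.5)
```

For this example $z = (0.25, 0.0625, 0.6875)$, so $Q_\kappa = \sqrt{0.25} - \sqrt{0.0625} = 0.25$. A jet containing only neutral particles gives exactly $Q_\kappa = 0$, which is the origin of the peak at zero described in the caption of Figure 2.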
In order to study the role of PID information and charge information, we consider three variations of the information input to the PFN training:
• "PFN w/ PID": $p_i = (z_i, \eta_i, \phi_i, \mathrm{PID}_i)$
• "PFN w/ charge": $p_i = (z_i, \eta_i, \phi_i, Q_i)$
• "PFN w/o PID, charge": $p_i = (z_i, \eta_i, \phi_i)$
We note that the "PFN w/ charge" classifier uses the same experimental information as the jet charge, whereas the "PFN w/ PID" uses full PID information, which is not used by the jet charge. Similarly, we vary the minimum transverse momentum of jet constituents input to the PFN training between $p_{T,\rm particle} = 0.1$ GeV and 0.4 GeV. While we do not consider the exact PID capabilities or single-particle efficiencies of the proposed EIC detectors, these variations provide a first-order estimate of the importance of PID and minimum particle transverse momentum detection capabilities. They serve as an initial quantification of the value that may be gained in jet tagging performance by investing in improved PID or minimum particle transverse momentum capabilities.

FIG. 6. ROC curves for u vs. d jet flavor tagging using PFNs with PID information for jets with $p_T^{\rm jet} > 10$ GeV, using either in-jet information as input or using both in-jet and out-of-jet information as input. We consider two different cuts on the minimum $p_{T,\rm particle}$ required of both the in-jet and out-of-jet particles (0.1 GeV and 0.4 GeV), which illustrate that soft out-of-jet particles play a significant role in boosting the classification performance. The jet charge with $\kappa = 0.3$ is shown for reference.

A. u vs. d quark jets
To begin, we consider the classification of u vs. d initiated jets. Our results are shown in Figure 4. We find that while the jet charge is a fairly good discriminator of u vs. d jets, the PFN (which uses the full four-vector information of the final-state particles) improves the performance when either charge information is included or even more so when PID information is included. When neither PID nor charge information is included, the classifier cannot significantly distinguish u jets from d jets in PYTHIA6. The increase in performance when adding PID information rather than charge information is fairly small, especially noting that experimental PID capabilities are not perfectly efficient as assumed in our studies. We will see in Section IV C, however, that for strange and charm quark jet identification, PID information provides a substantial improvement in performance.

FIG. 7. ROC curve (left) and PR curve (right) for ud vs. s jet flavor tagging using the jet charge and PFNs for jets with $p_T^{\rm jet} > 10$ GeV and $p_{T,\rm particle} > 0.1$ GeV. We consider several variations of the input to the PFN, providing either PID information for all particles, charge information for all particles, or neither. All curves are constructed from particles with a decay length $c\tau > 1$ cm (in which the weakly-decaying strange hadrons $K^0_S$, $\Lambda^0$, $\Xi^0$, $\Xi^-$, $\Sigma^\pm$, $\Omega^-$ and their associated antiparticles are undecayed), except the curve labeled $c\tau > 10$ cm, which is constructed from particles with a decay length $c\tau > 10$ cm (in which the above weakly-decaying strange hadrons are decayed). The dashed black lines correspond to a random classifier.
Next, we consider the role of the minimum transverse momentum of jet constituents input to the PFN training.Figure 5 shows the results when varying the minimum threshold between p T,particle > 0.1 − 0.4 GeV.We find only a minor difference in the classifier performance when varying the minimum p T,particle between 0.1 GeV and 0.4 GeV, suggesting that the minimum p T,particle detector requirements are not essential for classifying jet flavor using the in-jet information.We will see, however, in the next Section that this has a stronger impact when considering the out-of-jet particles.

B. Out-of-jet information
The motivation of machine learned-jet classification at the EIC and RHIC is quite different compared to the LHC.For example, at the LHC di-jet reference processes can be used as calibration and the resulting classifier can be applied to identify jets in multi-jet events to search for physics beyond the Standard Model.Instead, at RHIC and the EIC the focus will be on improving for example measurements of spin asymmetries as discussed above or to improve constraints on cold nuclear matter effects.Therefore, at RHIC and the EIC, the classifier does not need to be limited to the particles inside the identified jet.We note that event-wide information was also used in classification studies at the LHC, see for example Refs.[140,141].In this Section, we investigate how the performance can be improved by not only making use of the particles inside the jet but also out-of-jet particles to classify the jet flavor, as shown in Figure 3.While we have used a relatively large jet radius R = 1.0 in these studies, this choice is somewhat arbitrary and neglects the role of large-angle radiation and correlations across the entire event.We therefore compare the performance of a PFN supplied with only in-jet particles to that of a PFN supplied with both in-jet and out-of-jet particles.
Figure 6 shows the results of this comparison. We show the comparison for two different minimum $p_{T,\rm particle}$ thresholds, 0.1 GeV and 0.4 GeV. We find that the difference between the in-jet classifier and the in-jet + out-of-jet classifier is significant for the case $p_{T,\rm particle} > 0.1$ GeV, whereas the difference is almost negligible for $p_{T,\rm particle} > 0.4$ GeV. This suggests that the soft out-of-jet particles play a significant role in boosting the classification performance, even though the soft in-jet particles had little impact (see Figure 5). This motivates further study of the origin and role of out-of-jet radiation, since our results suggest it can provide a significant boost in jet (or event) flavor tagging performance. In Section V we will revisit the role of out-of-jet particles in order to classify the underlying hard process of the event.

C. Strange and charm
We now turn to the identification of strange- and charm-quark initiated jets. Since strange- and charm-initiated jets are considerably rarer than up- or down-initiated jets (for our kinematics, the relative u : d : s : c ratios are approximately 33 : 5 : 1 : 2), we quantify the classification performance using both the ROC curve and the precision-recall curve. In fact, strange jets are even rarer than charm jets: although the proton PDF contains a larger quantity of strange than charm, the overall cross section for charm is larger due to its larger electric charge.
Strange and charm jets also differ from up and down jets in that strange and charm hadrons have limited decay lifetimes.In the case of strange quarks, there are a variety of weakly decaying strange hadrons with lifetimes 1 cm < cτ < 10 cm (namely K 0 S , Λ 0 , Ξ 0 , Ξ − , Σ ± , Ω − and their associated antiparticles) which therefore decay on a length scale comparable to the size of the innermost tracking layers of collider experiments [142].We therefore will contrast the classification performance depending on whether the PFN is provided the undecayed strange hadrons or only the decay products of these hadrons.In the case of charm quarks, on the other hand, all charm hadrons decay with lifetimes much shorter than cτ = 1 cm, and cannot be directly detected by experiments but rather must be reconstructed using the invariant mass of decay products of exclusive charm hadrons or by tagging displaced vertices.A large literature exists on charm-jet tagging algorithms, but we will not pursue performance comparisons here [143][144][145][146].
Figure 7 shows the results for ud vs. s jet classification with final-state particle decay lengths of $c\tau > 1$ cm and $c\tau > 10$ cm, respectively. We find several notable differences compared to the u vs. d classification. First, the PFN with PID dramatically outperforms the jet charge. We also provide as a reference a simple "Leading strange tagger", which classifies the jet flavor purely based on whether the highest-$p_T$ particle in the jet is a strange hadron. The PFN dramatically outperforms this tagger as well. This provides a clear illustration of the value of machine learning-based jet flavor identification. Second, the overall performance of ud vs. s tagging is significantly improved when PID information is provided relative to charge information, especially when the weakly decaying strange hadrons with $c\tau > 1$ cm are included as input to the PFN. If only charge information is supplied, the performance decreases substantially. This provides a clear illustration that PID information is highly valuable for obtaining the best possible strange-jet tagging performance. We leave further studies to future work, such as whether it is sufficient to provide PID information for the leading particle only rather than for all particles, which could substantially lessen the experimental effort. Third, if neither PID nor charge information is provided, the performance is yet again substantially worse; however, it is still notably better than in the u vs. d case. This illustrates the relative importance of particle identification vs. fragmentation in determining the jet flavor, since when neither PID nor charge information is provided the machine learning algorithm can only learn from the differences in fragmentation between ud and s jets.
Figure 8 shows the results for uds vs. c jet classification.In this case, the jet charge is not expected to be a good discriminator, since u (which dominates the uds sample) and c jets have the same electric charge.We find similarly strong performance of the PFN classifier when PID information is included, with an even larger benefit of providing PID information relative to charge information.Additionally, we note that the PFN that is supplied with neither PID nor charge information performs better than the previous cases, illustrating that the amount of information in the fragmentation pattern unrelated to particle PID or charge plays an increasing role for heavier quarks, as expected [147,148].

V. HARD PROCESS TAGGING
The classification of the underlying hard process is often of primary interest instead of the classification of a single jet. To do this, we propose to not only utilize the particles inside the reconstructed jet but to also take as input particles outside the jet, similar to the studies done in Section IV B and shown in Figure 3. Note that we still require a jet with a given transverse momentum to identify the entire event to ensure the presence of a hard scale, which allows for the interpretation or applicability of perturbative techniques in QCD. The additional information contained in the dynamics of particles outside the reconstructed jet can generally increase the performance of the machine learning algorithm. We note that event type classification using triggers and machine learning was discussed in Ref. [4] and references therein. In contrast to Ref. [4], we aim here at identifying the underlying hard process in the event at parton level. As discussed in Section III B above, the in-jet information that is used to train machine learned classifiers can be captured by complete sets of observables like N-subjettiness and EFPs. Similar observable bases can be constructed for out-of-jet information, and correlations between jets (such as in photoproduction events, see Figure 1) can be captured by observables like the jet pull [45,46].

FIG. 9. ROC curves for quark vs. gluon jet tagging at the EIC using the leading jet information from quark and gluon di-jets in low-$Q^2$ photoproduction events containing $qq$, $q\bar q$, $gg$ topologies. The leading jet is required to be $p_T^{\rm jet\,1} > 8$ GeV, the subleading jet to be $p_T^{\rm jet\,2} > 5$ GeV, and the third leading jet to be $p_T^{\rm jet\,3} < 4$ GeV, see Figure 1. We consider several models: (i) PFN including PID information, (ii) EFN, (iii) DNN with EFPs for two different dimensions d, and (iv) jet mass. [Plot legend: Particle Flow Network (w/ PID); Energy Flow Network; Energy Flow Polynomials (DNN), d = 7; Jet mass. Vertical axis: True Positive Rate = True q / Total q.]
We consider two examples of event classification in this Section. In both cases, we use low-$Q^2$ photoproduction events that contain a di-jet signal with the transverse momentum of the leading jet required to be $p_T^{\rm jet\,1} > 8$ GeV and the subleading jet to be $p_T^{\rm jet\,2} > 5$ GeV, as described in Section III A. First, we consider the classification of quark vs. gluon jet topologies by discriminating $qq$ or $q\bar q$ di-jet topologies from $gg$ topologies. Second, we consider the classification of direct vs. resolved photoproduction processes. Similar to the in-jet particles, we normalize the transverse momentum of out-of-jet particles relative to the leading jet transverse momentum, $z_i = p_{T i}/p_T^{\rm jet\,1}$. Since we divide by the transverse momentum of the leading jet in the event, we have $z_i < 1$ for both in-jet and out-of-jet particles. Moreover, we measure the values $(\eta_i, \phi_i)$ of out-of-jet particles relative to the leading jet axis.

FIG. 10. ROC curves for quark vs. gluon event tagging at the EIC with PFNs including PID information and considering as input to the PFN either the leading jet particles, the leading and subleading jet particles, or all particles in the event with $p_{T,\rm particle} > 0.1$ GeV. Here we consider quark and gluon di-jets in low-$Q^2$ photoproduction events containing $qq$, $q\bar q$, $gg$ topologies. The leading jet is required to be $p_T^{\rm jet\,1} > 8$ GeV, the subleading jet to be $p_T^{\rm jet\,2} > 5$ GeV, and the third leading jet to be $p_T^{\rm jet\,3} < 4$ GeV, see Figure 1.
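The treatment of in-jet and out-of-jet particles described in this Section, with all momenta normalized to the leading jet and all angles measured from its axis, can be sketched as a small selection function. The function name, the `in_leading_jet` flag (assumed to come from the jet finder), and the argument layout are hypothetical.

```python
import numpy as np


def build_event_input(pt, eta, phi, in_leading_jet, lead_jet,
                      pt_min=0.1, use_out_of_jet=True):
    """Return (z, eta - eta_jet, phi - phi_jet) rows relative to the leading jet.

    lead_jet is a (pt, eta, phi) tuple; in_leading_jet flags the jet constituents.
    """
    pt = np.asarray(pt, dtype=float)
    eta = np.asarray(eta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    lead_pt, lead_eta, lead_phi = lead_jet

    keep = pt > pt_min                       # minimum particle transverse momentum
    if not use_out_of_jet:
        keep &= np.asarray(in_leading_jet, dtype=bool)

    z = pt[keep] / lead_pt                   # normalize to the leading jet pT
    d_eta = eta[keep] - lead_eta
    d_phi = (phi[keep] - lead_phi + np.pi) % (2.0 * np.pi) - np.pi
    return np.stack([z, d_eta, d_phi], axis=1)
```

Calling the function with `use_out_of_jet=False` reproduces an in-jet-only input, while raising `pt_min` from 0.1 to 0.4 GeV removes the soft particles whose impact is studied in Section IV B.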

A. Quark vs. gluon jet tagging
We consider events with quark and gluon di-jet topologies by considering both direct and resolved processes that result in qq-, q q-, or gg-initiated di-jets, as described in Section III A. We then train PFNs using either (i) the particles in the leading jet, (ii) the particles in both the leading and subleading jet, or (iii) all particles in the event with p T,particle > 0.1 GeV.
Figure 9 shows the classification performance of quark vs. gluon jet event topologies when trained with only the leading jet particles. Models trained with the leading jet particles correspond most closely to previous studies of the classification of quark vs. gluon single jets [27,34]. We consider a PFN trained with PID information, as well as an IRC-safe EFN, which performs slightly worse than the PFN, as is typical in quark vs. gluon jet classification at the LHC. While the performance at the low EIC jet energies considered here is lower than quark vs. gluon classification with high-$p_T^{\rm jet}$ jets at the LHC, the PFN and EFN are still able to achieve substantial classification performance and large improvements compared to single observables such as the jet mass. Additionally, we compare to a DNN that uses IRC-safe EFPs with dimension d = 7 as input, which gives a performance that approaches that of the deep set models.
Figure 10 shows the classification performance of quark vs. gluon jet event topologies as the subleading jet particles and out-of-di-jet particles are added to the PFN training input. The performance increases significantly as this additional information is included.

B. Direct vs. resolved processes and improved constraints on photon structure
Next, we consider discriminating events that arise from direct vs. resolved photoproduction processes. The direct processes correspond to those where the low-$Q^2$ quasi-real photon directly participates in the hard-scattering process. Instead, the resolved process corresponds to the case where the parton content of the photon is resolved and only quarks and gluons are involved in the hard-scattering process that produces the di-jet pair in the final state. The resolved process provides access to the nonperturbative parton-in-photon PDF, which is of particular interest at the future EIC. While the unpolarized di-jet photoproduction cross section has been measured at HERA by the H1 Collaboration [149], the polarized cross section will be measured for the first time at the EIC and the helicity parton-in-photon PDFs are currently unconstrained from experimental data. See Refs. [116,117,150-152] for theoretical work on the partonic structure of photons in the context of the future EIC.
An approximate separation of the direct and resolved contribution can significantly enhance the sensitivity to the nonperturbative parton-in-photon PDF.Traditionally, this separation has been achieved by measuring a multidifferential cross section including the kinematic variable x γ , which is defined in terms of the electron energy and the di-jet transverse momenta and rapidities [149,150].At LO in QCD, x γ corresponds to the momentum fraction of the parton inside the photon such that for x γ → 1(0), the direct (resolved) process dominates.Instead, here we propose that the performance can be augmented using a machine learning-based binary classifier that can make use of the full event information.
In order to explore this approach, we train a PFN to distinguish the direct vs. resolved photoproduction processes using the leading and subleading jet information. We consider events with the same quark and gluon di-jet topologies described in the previous Section, except that we additionally include $qg$-initiated di-jets alongside the $qq$-, $q\bar q$-, and $gg$-initiated di-jets. Figure 11 shows the classification performance of direct vs. resolved photoproduction processes. We find that the performance is worse than the quark vs. gluon di-jet topology classification shown in the previous Section, which is unsurprising given that the direct and resolved contributions contain both quark and gluon jets. We furthermore find that the impact of supplying PID information to the PFN is almost negligible in this case. It would be instructive to combine the information of the "QCD-inspired" variable $x_\gamma$ with the machine-learned classifier described here. Note that in Fig. 11 we have not included information from the electron. In addition, it would be interesting to combine the tagging of the direct vs. resolved process with jet flavor identification as discussed in previous Sections. This would allow for a flavor separation of the parton-in-photon PDFs. We note that a flavor separation based on identified hadrons inside the di-jets was employed in Ref. [150]. We leave the exploration of these topics as well as quantitative impact studies at the EIC for future work.

FIG. 11. ROC curves for direct vs. resolved process tagging at the EIC. Here we consider quark and gluon jets produced in low-$Q^2$ photoproduction events, see Figure 1.

VI. CONCLUSIONS AND OUTLOOK
We have presented first studies of machine learning-based jet and event classification using simulated events at the EIC. While jet flavor classification is more challenging at the EIC than with high transverse momentum jets at the LHC, machine learning-based classification algorithms offer important advantages in performance and many prospects for interpretability. We found that machine learning algorithms outperform traditional observables used to identify jet flavor, such as the jet charge. In order to provide input to the detector specifications at the EIC, we investigated the impact of PID information, charge information, and minimum particle transverse momentum requirements. We found that providing charge information is sufficient for u vs. d jet classification, but that PID information gives a large improvement to strange and charm jet tagging capabilities. We found that soft particles with $0.1 < p_T < 0.4$ GeV have only minor impact when jet flavor classification is performed using in-jet particles, but that out-of-jet soft particles give substantial improvement to the classification performance. The studies performed here can be extended to a full detector simulation and additional kinematics. Future detailed studies on the impact of PID for strange and charm quark-initiated jets may be useful, such as investigating whether it is important for all particles to be identified or whether a small number of leading particles is responsible for most of the flavor tagging performance. Another future direction is the exploration of different machine learning architectures. In this work, we limited ourselves to algorithms that are known to perform well in the LHC environment. The jet quark flavor classification could also be extended from binary classification to multi-label classification for all quark flavors simultaneously.
We have proposed several applications in which these methods will play an important role in the EIC science program. Jet flavor tagging can lead to improved constraints on collinear and transverse momentum dependent parton distribution functions, such as the strange quark PDF and the gluon TMD. Eventually, machine-learned event-by-event classifiers may be directly included in global analyses of quantum correlation functions like PDFs. Flavor tagging also provides opportunities to increase experimental access to transverse single spin asymmetries, since one can use the magnitude of the asymmetry as an objective function for the machine learning algorithm to maximize. Tagging the underlying hard process in events can enhance constraints on the parton-in-photon PDF when combined with previous work [150]. Additionally, the methods outlined here can be applied to classify ep vs. eA collisions and provide new insight into cold nuclear matter effects, and can be extended to produce increasingly interpretable results [50]. The supervised machine learning methods presented here to classify jet flavor require suitably accurate Monte Carlo event generators to produce training data. This will require repeating these studies with additional and increasingly sophisticated Monte Carlo event generators. On the other hand, the applications we proposed for (i) maximizing spin asymmetries and (ii) classifying ep vs. eA can be performed directly on experimental data since the training labels are known.
While the latter can currently only be explored using simulations, the former can be deployed immediately at RHIC.
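As a toy illustration of using the magnitude of an asymmetry as an objective, the sketch below scans a cut on a per-jet classifier score and keeps the cut that maximizes a raw counting asymmetry $A = (N_\uparrow - N_\downarrow)/(N_\uparrow + N_\downarrow)$ between two spin orientations. Everything here is an assumption made for illustration: the function names, the `(score, spin)` event format, and the threshold scan stand in for a trained model, and the sketch ignores beam polarization, azimuthal dependence, and statistical uncertainties.

```python
from typing import Iterable, List, Optional, Tuple


def raw_asymmetry(spins: Iterable[int]) -> float:
    """Counting asymmetry (N_up - N_down) / (N_up + N_down) for spins of +1 or -1."""
    spin_list = list(spins)
    n_up = sum(1 for s in spin_list if s > 0)
    n_down = sum(1 for s in spin_list if s < 0)
    total = n_up + n_down
    return 0.0 if total == 0 else (n_up - n_down) / total


def best_threshold(
    events: List[Tuple[float, int]],  # (classifier score, spin label +1/-1)
    thresholds: Iterable[float],
    min_events: int = 1,              # guard against tiny selected samples
) -> Tuple[Optional[float], float]:
    """Return (threshold, |A|) maximizing |A| over events with score >= threshold."""
    best: Tuple[Optional[float], float] = (None, -1.0)
    for t in thresholds:
        selected = [spin for score, spin in events if score >= t]
        if len(selected) < min_events:
            continue  # too few events for a meaningful asymmetry
        magnitude = abs(raw_asymmetry(selected))
        if magnitude > best[1]:
            best = (t, magnitude)
    return best


# Toy events: high scores are enriched in spin-up, low scores are balanced.
toy_events = [(0.9, +1), (0.8, +1), (0.7, -1), (0.3, -1), (0.2, +1), (0.1, -1)]
print(best_threshold(toy_events, thresholds=[0.0, 0.5, 0.75], min_events=3))
```

The `min_events` guard matters: without it, the scan rewards cuts that select a handful of events with a large but statistically meaningless asymmetry, so a realistic objective would need to account for the uncertainty on the asymmetry.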

FIG. 1. Illustration of the jet production processes considered in this work. Left: High-$Q^2$ electron-proton scattering. At leading order, the final state consists of the scattered electron and a single jet originating from different quark flavors. Right: Low-$Q^2$ photoproduction, where we include both the direct and the resolved contribution. At leading order, the final state consists of the scattered electron in the forward direction close to the beam axis and a di-jet pair, which can be initiated by both quarks and gluons. In both cases, the transverse momentum of the jets is measured relative to the beam axis in the laboratory frame.

FIG. 4. ROC curve for u vs. d jet flavor tagging using the jet charge and PFNs for jets with $p_T^{\rm jet} > 10$ GeV and $p_{T,{\rm particle}} > 0.1$ GeV. We consider three variations of the input to the PFN, providing either PID information for all particles, charge information for all particles, or neither.

FIG. 5. ROC curves for u vs. d jet flavor tagging using PFNs with PID information for jets with $p_T^{\rm jet} > 10$ GeV and different cuts on the minimum $p_{T,{\rm particle}}$ required of jet constituents.
FIG. 7. ROC curve (left) and PR curve (right) for ud vs. s jet flavor tagging using the jet charge and PFNs for jets with $p_T^{\rm jet} > 10$ GeV and $p_{T,{\rm particle}} > 0.1$ GeV. We consider several variations of the input to the PFN, providing either PID information for all particles, charge information for all particles, or neither. All curves are constructed from particles with a decay length $c\tau > 1$ cm (in which the weakly-decaying strange hadrons $K^0_S$, $\Lambda^0$, $\Xi^0$, $\Xi^-$, $\Sigma^\pm$, $\Omega^-$ and their associated antiparticles are undecayed), except the curve labeled $c\tau > 10$ cm, which is constructed from particles with a decay length $c\tau > 10$ cm (in which the above weakly-decaying strange hadrons are decayed). The dashed black lines correspond to a random classifier.
FIG. 8. ROC curve (left) and PR curve (right) for uds vs. c jet flavor tagging using the jet charge and PFNs for jets with $p_T^{\rm jet} > 10$ GeV and $p_{T,{\rm particle}} > 0.1$ GeV. We consider several variations of the input to the PFN, providing either PID information for all particles, charge information for all particles, or neither. The dashed black lines correspond to a random classifier.
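For readers less familiar with the ROC and PR curves referenced in these captions, the following minimal sketch shows how the underlying points can be computed from classifier scores and true labels. The function name `roc_pr_points` and the toy inputs are hypothetical, and the axis conventions used in the figures of this work may differ from the true positive rate, false positive rate, and precision returned here.

```python
from typing import List, Sequence, Tuple


def roc_pr_points(
    scores: Sequence[float], labels: Sequence[int]
) -> List[Tuple[float, float, float, float]]:
    """Return (threshold, TPR, FPR, precision) for each distinct score threshold.

    labels: 1 for signal, 0 for background. A jet is tagged as signal
    when its score is greater than or equal to the threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one signal and one background example")
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr = tp / n_pos             # signal efficiency (recall)
        fpr = fp / n_neg             # background mis-tag rate
        precision = tp / (tp + fp)   # purity of the tagged sample
        points.append((t, tpr, fpr, precision))
    return points


toy_scores = [0.9, 0.8, 0.4, 0.2]
toy_labels = [1, 0, 1, 0]
for point in roc_pr_points(toy_scores, toy_labels):
    print(point)
```

In these conventions a random classifier lies on the diagonal TPR = FPR of the ROC plane, while its precision equals the signal fraction of the sample, so the PR baseline depends on the class balance.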
[Figure residue: direct vs. resolved process classification; label "(true direct) / (total direct)"; curves: Particle Flow Network (w/ PID) and Particle Flow Network (w/o PID, charge).]