Jet charge identification in the e⁺e⁻ → Z → q q̄ process at Z pole

Abstract: Accurate jet charge identification is essential for precise electroweak and flavor measurements at the high-energy frontier. We propose a novel method, the Leading Particle Jet Charge method (LPJC), which determines the jet charge from the information of the leading charged particle. Tested on Z → b b̄ and Z → c c̄ samples at a center-of-mass energy of 91.2 GeV, the LPJC achieves an effective tagging power ϵ_eff of 20%/9% for c/b jets, respectively. In combination with the Weighted Jet Charge method (WJC), we develop a Heavy Flavor Jet Charge method (HFJC), which achieves an effective tagging power ϵ_eff of 39%/20% for c/b jets, respectively. This paper also discusses how the jet charge identification performance depends on the fragmentation process of heavy flavor jets and on critical detector performances.


Introduction
In the Standard Model (SM) of particle physics, quarks and gluons carry color charge and are unable to propagate freely in spacetime due to the phenomenon of color confinement [1,2]. As a result, when these colored SM particles are produced in high-energy colliders, they fragment into sets of collimated final state particles, primarily hadrons [3], creating streams of particles known as jets. It is of great interest to distinguish the original species of colored SM particles within a jet, for instance, in measuring the properties of the Higgs boson at the high-energy frontier [4][5][6].
Quarks or gluons can be identified from the final state particles corresponding to the initial colored particle. Technically, quark jets can be differentiated from gluon jets at Large Hadron Collider (LHC) experiments [7][8][9][10][11][12]. Moreover, at both the LHC and future electron-positron (e⁺e⁻) Higgs factories, c/b jets and light jets can be distinguished from each other using flavor tagging algorithms [13][14][15]. Finally, quarks and anti-quarks can be identified using jet charge identification algorithms, as done at both LEP and LHC experiments.
Accurately measuring the jet charge is crucial for determining the asymmetry between the production of particles and their anti-particles in EW processes, as it directly affects the production rates of charged particles and their subsequent decay processes. The forward-backward asymmetry (A_FB) is a fundamental observable that provides insights into the underlying electroweak interactions [16][17][18][19][20][21][22][23][24][25]. Similarly, the electroweak mixing angle (sin²θ_W) is a fundamental parameter that characterizes the strength of the electroweak force [26,27]. Accurate measurement of the jet charge is essential in determining A_FB and sin²θ_W. Furthermore, in flavor physics, the time-dependent CP measurement plays a crucial role in studying the violation of CP symmetry in particle decays [28][29][30][31].
The performance of heavy flavor jet charge identification is measured by the efficiency ϵ_tag, the misjudgment rate ω, and the effective tagging power ϵ_eff [32,33], defined as

$$\epsilon_{\mathrm{tag}} = \frac{N_{\mathrm{selected}}}{N_{\mathrm{all}}}, \tag{1.1}$$

$$\omega = \frac{N_{\mathrm{wrong}}}{N_{\mathrm{selected}}}, \tag{1.2}$$

$$\epsilon_{\mathrm{eff}} = \epsilon_{\mathrm{tag}}\,(1 - 2\omega)^2, \tag{1.3}$$

where N_all denotes the total number of input particles, N_selected the number of particles that have been tagged, and N_wrong the number of particles that have been misidentified. The selection efficiency ϵ_tag indicates the level of final state particle utilization, and ω denotes the misjudgment rate of all tagged samples. The misjudgment rate ω is usually converted into the dilution factor r = 1 − 2ω, which reflects the correct judgment rate. A dilution factor r = 0 indicates a fully diluted flavor (no possible distinction between b/c jets and b̄/c̄ jets), whereas a dilution factor r = 1 indicates a perfectly tagged flavor. The effective tagging power ϵ_eff indicates the total performance of jet charge tagging. Physics measurements reliant on jet charge identification, such as sin²θ_W, typically have an accuracy proportional to 1/√(ϵ_eff N) [17], where N is the number of events observed.
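As an illustration of these figures of merit, the following minimal Python sketch evaluates them from raw counts, assuming the standard forms ϵ_tag = N_selected/N_all, ω = N_wrong/N_selected and ϵ_eff = ϵ_tag(1 − 2ω)². The function names and the toy counts are illustrative assumptions, not taken from the analysis code; the second function only applies the quoted 1/√(ϵ_eff N) scaling.

```python
from math import sqrt


def tagging_performance(n_all, n_selected, n_wrong):
    """Return (eps_tag, omega, dilution, eps_eff) from raw counts."""
    eps_tag = n_selected / n_all        # fraction of inputs that receive a tag
    omega = n_wrong / n_selected        # fraction of tagged inputs with the wrong sign
    dilution = 1.0 - 2.0 * omega        # r = 1 - 2*omega
    eps_eff = eps_tag * dilution ** 2   # effective tagging power
    return eps_tag, omega, dilution, eps_eff


def precision_gain(eps_eff_new, eps_eff_old):
    """Ratio of statistical uncertainties (old / new) at fixed N,
    assuming the uncertainty scales as 1 / sqrt(eps_eff * N)."""
    return sqrt(eps_eff_new / eps_eff_old)


# Toy counts (illustrative only, not from the paper).
eps_tag, omega, dilution, eps_eff = tagging_performance(1000, 800, 200)

# Gain when moving from eps_eff = 20% to eps_eff = 39% (the c-jet values quoted in the abstract).
gain_c = precision_gain(0.39, 0.20)
```

Under this scaling, raising ϵ_eff from 20% to 39% is statistically equivalent to collecting almost twice as many events.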
Jet charge identification has been investigated at several collider experiments. The LEP experiments employ the c/b jet charge for the determination of the asymmetry A_FB and sin²θ_W in the production of b/c quark pairs [18][19][20][21][22][23][24][25][26][34][35][36][37] and for neutral B meson oscillation studies [34,36]. Five methods have been used in LEP experiments for jet charge identification, relying on leptons, c hadrons, momentum-weighted jet charge, vertex charge, and kaons. DELPHI achieves a b purity of 90% in double-tagged events [19]. The ATLAS and CMS Collaborations at the LHC use dijet events to differentiate between quark jets and gluon jets to test different aspects of the strong interaction [15][38][39][40][41][42], and use muons and momentum-weighted jet charge to measure B⁰_s decay parameters, achieving an ϵ_eff of 1.49% [43]. LHCb uses the b-tagging algorithm for precision measurements of CP violation via neutral B meson decay channels, using opposite-side e, µ, K, c hadron, and vertex charge and same-side π, p, K [32,44]; combined with Quantum ML, it reaches an effective tagging power ϵ_eff of about 8% (transverse momentum p_T > 60 GeV) [45]. The concept of jet charge identification can be extended to heavy flavor factories, such as Belle II and BABAR. These factories distinguish between b quarks and b̄ quarks for time-dependent CP analyses using hadronic B decays with flavor-specific final states at the Υ(4S) resonance. With a combination of different tagging signatures and neural networks, BABAR achieves an effective tagging power ϵ_eff of 31.2% [46,47], while Belle II obtains an effective tagging power ϵ_eff of 30.0% [33,48].
The e⁺e⁻ Higgs factory has been identified as the foremost priority for future high energy particle collider experiments [49]. Several facilities have been proposed, including CEPC [50,51], FCC-ee [52][53][54], ILC [55], CLIC [56], C³ [57], and Helen [58]. These multipurpose factories not only allow for precision Higgs property measurements, but also offer considerable potential for accurate measurements in electroweak (EW) physics, quantum chromodynamics (QCD), and flavor physics, and for searches for new physics. In particular, the circular e⁺e⁻ Higgs factories, such as the CEPC and the FCC-ee, could deliver trillions of Z bosons and provide excellent opportunities for those explorations. Accurate jet charge identification has a significant impact on the scientific reach of these e⁺e⁻ Higgs factories.
In this study, we investigate the jet charge identification performance at a future e⁺e⁻ Higgs factory. The influence of light flavors is neglected because only a fraction of 0.051/0.005 of light flavor jets would be misidentified as c/b jets, respectively [59]. We generate approximately 10 million Z → b b̄ and Z → c c̄ events using three distinct generators for comparison: Whizard 1.95 [60][61][62], Herwig [63], and Sherpa [64]. In Whizard 1.95, the built-in Pythia 6 [65] is used for jet fragmentation and hadronization. The uncertainty of the effective tagging power ϵ_eff arising from the finite statistics of the Monte Carlo (MC) samples is of order 10⁻⁴; it is negligible and thus omitted in this study. We have developed an algorithm, LPJC, to identify the charge of a single jet using information from the final state leading charged particle within it. Combining it with the conventional method that uses the energy-weighted sum of the jet particles' charges, proposed in [66], we developed a Heavy Flavor Jet Charge method (HFJC), achieving an effective tagging power ϵ_eff of 39%/20% for c/b jets, respectively. The effective tagging power ϵ_eff can be significantly improved to 45%/37% for c/b jets, respectively, by additionally identifying the origin of the final state leading charged particles.
This paper is organized as follows. Section 2 describes LPJC and its performance, together with a brief analysis of how the performance depends on jet hadronization. Section 3 presents the corresponding information for the WJC method. Section 4 introduces the Heavy Flavor Jet Charge method (HFJC), which combines the WJC and LPJC methods, and its performance, which summarizes the characteristic jet charge performance at Z pole operation. In Sections 1-4, we assume an ideal detector, while in Section 5 we discuss the impact of actual detector performance and quantify the variation of the effective tagging power ϵ_eff under different conditions. A short summary and a discussion of perspectives are given in Section 6.

Leading Particle Jet Charge method (LPJC)
Generated in high energy colliders, b/c quarks typically fragment into b/c hadrons (hadrons that contain a b/c quark), comprising both excited and ground states. This study designates the most energetic ground-state heavy hadron within a jet as the leading heavy hadron. Specifically, in c jets these include D⁺, D⁰, D_s⁺, Λ_c⁺, while in b jets these comprise B̄⁰, B⁻, B̄⁰_s, Λ_b⁰, which collectively account for 99.7% and 99.8% of all heavy hadrons, respectively. The mass, lifetime τ, decay length cτ and proportion of these hadrons are summarized in Table 1. According to Whizard 1.95, approximately 67% and 83% of the final state leading charged particles in c/b jets, respectively, come from leading heavy hadron decays, which occur at a sizeable distance from the interaction point (IP). Apart from these, final state charged particles can also originate in the proximity of the IP, primarily through QCD fragmentation. For simplicity, all origins in the proximity of the IP are collectively referred to as QCD fragmentation. Figure 1 shows a display of a Z → b b̄ event where the trajectories of charged particles are depicted by curves. All particles within the event can be clustered into two back-to-back jets based on the plane perpendicular to the thrust axis [68]. The enlarged view of the event display shows the IP and the secondary vertices (SV), which can be used to distinguish between particles originating from leading heavy hadron decays and those produced via QCD fragmentation. A heavy flavor jet can be regarded as a composition of one leading heavy hadron along with a set of light hadrons, predominantly pions.
The jet charge identification algorithm takes the information of final state particles as input and determines the charge of the initial heavy quark. The observable properties of the jet, such as momentum flow, multiplicity, and charge, reflect the characteristics of the heavy quarks. Figure 2 illustrates the multiplicity of final state charged and neutral particles in c/b jets, which peaks at a value of 10 and exhibits an extended tail towards larger values. This section introduces the LPJC method and evaluates its performance in various decay processes.

Methodology of LPJC
In this section, we introduce the LPJC algorithm and its performance quantified by the effective tagging power ϵ_eff. The LPJC method is composed of three key steps. First, all final state particles in each event are clustered into two back-to-back jets. Second, the charged particle with the highest energy in each jet is selected and identified as the final state leading charged particle. These are then classified into sub-groups based on their types: e, µ, K, π, p, which covers almost all final state charged particles in c/b jets. Finally, the jet charge is determined using the charge and particle identification (PID) information of the final state leading charged particle in each sub-group, and the asymmetry between the two charge signs within one sub-group defines the corresponding misjudgment rate ω.
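The second and third steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code: each jet is a hypothetical list of dictionaries with keys `species`, `charge` and `energy`; the jets are assumed to be already clustered; and the tag convention for each species is taken to be whichever charge correlation with the truth quark sign is more frequent in the sample, so that ω is the minority fraction.

```python
from collections import Counter


def leading_charged(jet):
    """Return the highest-energy charged particle of a jet, or None if there is none."""
    charged = [p for p in jet if p["charge"] != 0]
    return max(charged, key=lambda p: p["energy"]) if charged else None


def lpjc_omega_by_species(jets, quark_signs):
    """Misjudgment rate per leading-particle species.

    quark_signs[i] is the truth sign (+1 or -1) of the quark initiating jets[i].
    """
    same, total = Counter(), Counter()
    for jet, q_sign in zip(jets, quark_signs):
        lead = leading_charged(jet)
        if lead is None:
            continue  # jet stays untagged
        total[lead["species"]] += 1
        if lead["charge"] * q_sign > 0:
            same[lead["species"]] += 1
    omega = {}
    for species, n in total.items():
        frac_same = same[species] / n
        # Assumption: the more frequent correlation defines the tag convention.
        omega[species] = min(frac_same, 1.0 - frac_same)
    return omega


# Toy sample (illustrative only): four jets whose leading charged particle is a kaon.
toy_jets = [
    [{"species": "K", "charge": -1, "energy": 12.0},
     {"species": "pi", "charge": +1, "energy": 3.0}],
    [{"species": "K", "charge": -1, "energy": 9.0},
     {"species": "gamma", "charge": 0, "energy": 20.0}],
    [{"species": "K", "charge": +1, "energy": 8.0}],
    [{"species": "K", "charge": +1, "energy": 7.0}],
]
toy_truth = [+1, +1, -1, +1]
toy_omega = lpjc_omega_by_species(toy_jets, toy_truth)
```

In the toy sample three of the four leading kaons carry the sign opposite to the quark, so the opposite-sign convention is chosen and the remaining jet counts as a misjudgment.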
The energy spectrum of final state charged particles is shown in figure 3. The x-axis represents the energy of the final state leading charged particles, ranging from 0 to 35 GeV and peaking between 5 and 10 GeV. The y-axis represents the energy of all final state charged particles. The entries along the diagonal represent the final state leading charged particles. The LPJC criterion for distinguishing a c jet from a c̄ jet (and similarly a b jet from a b̄ jet) is the type and charge of the final state leading charged particle. The correlation between the initial quark charge and these characteristics is as follows. The electric charge of each final state leading charged particle exhibits an obvious asymmetry that reflects the jet charge misjudgment rate ω, as defined in eq. (1.2). Figures 4 and 5 show this asymmetry, with the red lines dividing the pies into two parts that indicate the correct judgment (N_R) and misjudgment (N_W) rates. The charge sign for each species of final state leading charged particle exhibits an obvious asymmetry between c jets and c̄ jets, which can be interpreted as that of the quark. Therefore, the misjudgment rate ω can be calculated accordingly. The effective tagging power ϵ_eff, calculated according to eq. (1.3), accounts for the effective reduction of events due to flavor dilution and represents the overall performance of the jet charge identification.
Table 2 provides the effective tagging power ϵ_eff values for each species of final state leading charged particle. The quantity ϵ_tag,max represents the upper limit on the tagging efficiency ϵ_tag. The efficiency values in Table 2 are used for the subsequent calculations. The different species of final state leading charged particles form a non-overlapping, almost full coverage of all events, so the total effective tagging power ϵ_eff is a straightforward sum of each category's effective tagging power ϵ_eff [32,33]. The misjudgment rate ω for the sum is then computed from the total effective tagging power ϵ_eff using eq. (1.3).
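The inversion used for the summed categories can be written out explicitly. The following is a short worked form, assuming eq. (1.3) has the standard shape ϵ_eff = ϵ_tag(1 − 2ω)² and restricting to ω ≤ 0.5; the category index k is introduced here for illustration only.

```latex
\epsilon_{\mathrm{eff}}^{\mathrm{tot}} = \sum_k \epsilon_{\mathrm{eff}}^{(k)}, \qquad
\epsilon_{\mathrm{tag}}^{\mathrm{tot}} = \sum_k \epsilon_{\mathrm{tag}}^{(k)}, \qquad
\omega^{\mathrm{tot}} = \frac{1}{2}\left(1 - \sqrt{\frac{\epsilon_{\mathrm{eff}}^{\mathrm{tot}}}{\epsilon_{\mathrm{tag}}^{\mathrm{tot}}}}\right)
```

The resulting ω is an effective misjudgment rate: it is the single value that, applied to all tagged jets, would reproduce the summed tagging power.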
The LPJC method effectively utilizes final state leading charged particles to determine the jet charge, establishing benchmark values for the effective tagging power ϵ eff .For b jet, these values are 8.9% by Whizard 1.95, 8.4% by Herwig, and 7.8% by Sherpa.Meanwhile, for c jet, these values are 20.2% by Whizard 1.95, 22.5% by Herwig, and 21.5% by Sherpa.

Jet charge performance dependencies
The correlation between final state particles and jet charge depends not only on the species and charge of the final state leading charged particle but also on its origin: the final state leading charged particle can be generated from leading heavy hadron decay or from QCD fragmentation. In the former case, the correlation is also sensitive to the species of the leading heavy hadron. This section analyzes the dependence of the misjudgment rate ω (the charge asymmetry of final state leading charged particles) on the following categories within a jet:
1. The type of the final state leading charged particle;
2. The type of the leading heavy hadron;
3. The source of the final state leading charged particle, i.e. whether it comes from leading heavy hadron decay or from QCD fragmentation.

Table 2. The LPJC effective tagging power ϵ_eff for each type of final state leading charged particle and the total effective tagging power ϵ_eff for c/b jets, which is the sum over types owing to their independence. The misjudgment rate ω for the sum is computed from the total effective tagging power ϵ_eff using eq. (1.3). Columns: Whizard 1.95, Herwig, Sherpa.
The jet charge performance for each decay source can be observed in the percentage distribution pie plots of figures 6 and 7 for c and b jets, respectively, using the same methodology as explained in section 2.1. The three rows in the figures categorize the sources as inclusive, leading heavy hadron decay, or QCD fragmentation. The percentage distribution of final state leading charged particles from the inclusive source is the sum of that from heavy hadron decay and that from QCD fragmentation. The four columns divide the leading heavy hadrons within the c/b jet into sub-groups, namely D⁺, D⁰, D_s⁺, Λ_c⁺ for c jets and B̄⁰, B⁻, B̄⁰_s, Λ_b⁰ for b jets.

Table 3. The "proportion" represents the fragmentation fraction of each heavy hadron. The total effective tagging power ϵ_eff of 22.1%/11.0% for c/b jets is the sum of the effective tagging power ϵ_eff of each sub-group weighted by the fragmentation fractions, as each leading heavy hadron type is independent of the others. Note that this differs from the value of 20%/9% for the inclusive source in Table 2.
The percentage distribution pie plots of final state leading charged particles vary with the species of the leading heavy hadron. Comparing the performance of jet charge identification for each leading heavy hadron type, we observe:
• Leptons have a probability of 5.4%/15.2% to be the leading jet particle for c/b jets, respectively. In c jets, the charges of these leading leptons faithfully represent the jet charge, since they are almost exclusively generated from leptonic/semileptonic decays of leading heavy hadrons, leading to an effective tagging power of 5.2%, contributed evenly by muons and electrons. In b jets, leading leptons can be generated not only from the decay of the b quark but also from the decay of the c quark produced in the b quark cascade decay, so the leading lepton has a typical misjudgment rate ω of 25.5%; the leading leptons therefore contribute an effective tagging power ϵ_eff of 3.6%.
• The leading kaons, with a probability of 28.4%/21.8% to be the leading jet particle for c/b jets, achieve an effective tagging power ϵ_eff of 10.4%/4.4% for c/b jets. Leading kaons from D⁺ and D⁰ decays deliver low misjudgment rates ω of 20.4%/4.7%. However, kaons from D_s decays can originate from either the c quark or the s quark, with opposite charge correlations, resulting in a high misjudgment rate ω of 35.2%. The misjudgment rate ω of leading kaons from the inclusive source is 19.7%/27.5% for c/b jets.
• Pions have the largest probability, 37.3%/56.2%, to be the leading jet particle for c/b jets. However, their misjudgment rate ω is the highest, 38.8%/46.3% for c/b jets, owing to their complex sources, resulting in an effective tagging power ϵ_eff of 2.9%/0.3% for c/b jets.
• With a probability of 8.9%/6.7%, leading protons achieve an effective tagging power ϵ_eff of 1.7%/0.5% for c/b jets. Protons from Λ_c and Λ_b decays yield a very small misjudgment rate ω of 0.1%/2.8%, respectively, while protons from QCD fragmentation carry the opposite charge correlation.
• Furthermore, for b jets, the protons have the opposite charge correlation because baryon number is conserved in baryon decays.
• If the final state leading charged particle is from a B⁰_s, the effective tagging power ϵ_eff vanishes due to its fast oscillation [69][70][71][72][73][74]. However, the lepton and the charged kaon with maximum momentum in the direction opposite to the B⁰_s, as well as the charged kaon flying in a direction similar to that of the B⁰_s, can achieve a good misjudgment rate ω of 22.5% and an effective tagging power ϵ_eff of 20.2% [75].
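Several of the numbers quoted in the list above can be cross-checked against each other. The sketch below is an illustration only, assuming ϵ_eff = ϵ_tag(1 − 2ω)² and taking the probability of being the leading particle as ϵ_tag; it reproduces the lepton and kaon tagging powers from their quoted probabilities and misjudgment rates, and sums the four quoted c-jet contributions. The helper names are hypothetical.

```python
from math import sqrt


def eps_eff(eps_tag, omega):
    """Effective tagging power for one category."""
    return eps_tag * (1.0 - 2.0 * omega) ** 2


def omega_from(eps_tag, eps_eff_value):
    """Effective misjudgment rate (omega <= 0.5) that reproduces a given eps_eff."""
    return 0.5 * (1.0 - sqrt(eps_eff_value / eps_tag))


# (probability to be the leading particle, misjudgment rate) as quoted in the text.
b_leptons = eps_eff(0.152, 0.255)   # quoted: 3.6%
c_kaons = eps_eff(0.284, 0.197)     # quoted: 10.4%
b_kaons = eps_eff(0.218, 0.275)     # quoted: 4.4%

# Sum of the quoted c-jet contributions: leptons, kaons, pions, protons.
c_total = 0.052 + 0.104 + 0.029 + 0.017
# Sum of the quoted probabilities to be the leading particle in c jets.
c_tag_total = 0.054 + 0.284 + 0.373 + 0.089
# Effective misjudgment rate of the summed c-jet categories.
c_omega_total = omega_from(c_tag_total, c_total)
```

The summed c-jet contributions give 20.2%, which matches the LPJC benchmark quoted for Whizard 1.95 in section 2.1.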
To summarize, the correlation between the final state leading charged particle and the jet charge strongly depends on the particle's origin, and the misjudgment rate ω takes values from 0 (excellent identification) to 0.5 (no identification power). For leading leptons inside c jets, the misjudgment rate ω can be reduced almost to zero, while a final state leading charged particle from a B⁰_s decay is not sensitive to the jet charge at all due to the fast oscillation of the B⁰_s. The misjudgment rate ω depends strongly on whether the particles are generated from leading heavy hadron decay or from QCD fragmentation. Information about the source can be extracted from a well-defined vertex (VTX) thanks to the long decay length of the leading heavy hadron, see Table 3. If the VTX can effectively differentiate between protons originating from heavy hadron decay and those from QCD fragmentation, LPJC can achieve an upper limit on the effective tagging power ϵ_eff of 35.7%/11.2% for c/b jets.

Weighted Jet Charge method (WJC)
The Weighted Jet Charge method (WJC) is based on the calculation of jet charge described in ref. [66]. In this study, the jet charge Q^κ_jet is calculated as the energy-weighted sum of the electric charges of the particles within a jet, defined as

$$Q^{\kappa}_{\mathrm{jet}} = \frac{\sum_i E_i^{\kappa}\, Q_i}{\sum_i E_i^{\kappa}}, \tag{3.1}$$

where the sum runs over all charged particles in the hemisphere of a particular jet, and E_i and Q_i are the energy and electric charge of the i-th particle. The parameter κ controls the relative contribution of low- and high-energy particles to the jet charge. Using the conventional WJC, the variation of the effective tagging power ϵ_eff with κ is illustrated in figure 8. The three convex curves from the three generators exhibit a consistent trend. The differences between generators may be due to the different nonperturbative hadronization models used to describe the transition from partonic states to the final hadronic states, namely phenomenological "string models" (Pythia) or "cluster models" (Herwig and Sherpa), and to the different secondary decay models. This might lead to different proportions of final state charged particles, thus affecting the effective tagging power ϵ_eff. The real fragmentation will be determined with real data in the future. The optimal κ values with Whizard 1.95 are found to be 0/0.2 for c/b jets, yielding effective tagging power ϵ_eff values of 25.8%/15.9%, respectively, as detailed in Table 4. The jet charge distributions, using b jets and b̄ jets as examples, are presented in figure 9.

The optimal κ value differs for each type of final state leading charged particle. The difference in κ values between b and c jets is due to the different masses and decay properties. The b quark has a longer lifetime compared to the c quark, leading to a different energy response compared to c jets. The same is true for different heavy hadrons in Table 5. Different decay channels of b/c jets may have distinct energy distributions, necessitating different κ values to accurately capture the charge distribution within the jet. Our analysis has accounted for these differences in κ values by calibrating the jet energy response separately for each case. This ensures that the measured jet properties are effectively attributed to the corresponding quark flavor. Therefore, distinguishing different types of final state leading charged particles can improve the jet charge performance.

In addition, when κ = 0, the misjudgment rate ω might be smaller than for other κ values, at the cost of efficiency. In this case, the samples can be further categorized into two groups according to whether Q^{κ=0}_jet = 0 or not. By optimizing κ for each type of final state leading charged particle and re-optimizing κ for the jets with Q^{κ=0}_jet = 0, we developed an Improved Weighted Jet Charge method (IWJC).
Jets can be categorized into several classes based on the species and origin of their leading charged particles.The optimal κ for each category can differ significantly from that of others.Therefore, calculating the jet charge using κ values determined from the species and origin of these leading particles can enhance identification performance.Additionally, in cases where κ is zero, the jet charge may coincidentally become zero.In such instances, a re-optimized κ is utilized.To incorporate these improvements, we have developed an Improved Weighted Jet Charge method (IWJC).The optimal parameter κ varies with the species and origin, summarized in Table 5.
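The role of κ and of the re-optimized κ can be illustrated with a short Python sketch. It assumes the normalized form Q^κ_jet = Σ E_i^κ Q_i / Σ E_i^κ (only the sign of Q^κ_jet matters for the tag, so the normalization does not change the decision); the function names, the toy jet and the fallback value κ = 0.3 are hypothetical and chosen purely for illustration.

```python
def weighted_jet_charge(particles, kappa):
    """Energy-weighted jet charge.

    particles: list of (charge, energy_GeV) for the charged particles of one hemisphere.
    """
    weights = [energy ** kappa for _, energy in particles]
    norm = sum(weights)
    if norm == 0.0:
        return 0.0  # no charged particles: jet stays untagged
    return sum(q * w for (q, _), w in zip(particles, weights)) / norm


def iwjc_sign(particles, kappa, fallback_kappa):
    """Tag decision (+1, -1 or 0); a second kappa is tried when the first gives exactly zero."""
    q = weighted_jet_charge(particles, kappa)
    if q == 0.0:
        q = weighted_jet_charge(particles, fallback_kappa)
    return (q > 0) - (q < 0)


# Toy jet (illustrative only): charges and energies in GeV.
toy_jet = [(-1, 12.0), (+1, 3.0), (+1, 2.0), (-1, 1.0)]

q_kappa0 = weighted_jet_charge(toy_jet, 0.0)      # plain average of the charges
q_kappa1 = weighted_jet_charge(toy_jet, 1.0)      # linear energy weighting
q_kappa100 = weighted_jet_charge(toy_jet, 100.0)  # dominated by the leading particle
decision = iwjc_sign(toy_jet, 0.0, 0.3)           # kappa = 0 gives zero, fallback decides
```

For this toy jet the charges cancel at κ = 0, while a large κ returns essentially the charge of the most energetic particle, which is the LPJC limit.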
Table 5. The optimal parameter κ for each type of final state particle from different sources. We treat κ = 100 as +∞, which corresponds to the LPJC method. Columns: Inclusive, Heavy hadron, QCD.

For e, µ, K, π from b hadron decay, as well as π from c hadron decay, the optimal κ approaches 0, corresponding to the plain sum of the jet particle charges. On the other hand, for protons from b hadron decay and e, µ, K, p from c hadron decay, the optimal κ approaches +∞, which means the reconstructed jet charge is determined by the jet particle with the highest energy, corresponding to the LPJC method. Using IWJC, we determined a benchmark value of the effective tagging power ϵ_eff, given in Table 6. Although the inclusive performance of LPJC is inferior to that of IWJC, LPJC does have advantages in specific categories of final state leading charged particle species. Therefore, an appropriate combination of LPJC and IWJC could lead to improved jet charge identification performance compared to either of them individually. To this end, we have developed a combined jet charge algorithm, the Heavy Flavor Jet Charge method (HFJC), which can be divided into six steps:
1. Categorize final state leading charged particles into five types: e, µ, K, π, p.
2. Perform LPJC and IWJC identification in each sub-category.
3. Categorize each type of final state leading charged particles into three groups: two decisions agree, two decisions disagree, and only one method has a decision.
4. For each group, calculate the corresponding misjudgment rate ω.
5. Calculate the effective tagging power ϵ_eff of each group using eq. (4.1).
6. Add the effective tagging power ϵ eff of each type of final state particles, which form an almost full coverage of all the events, to obtain the total effective tagging power ϵ eff .
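Steps 3 to 5 can be sketched for a single leading-particle species as follows. This is a simplified illustration under explicit assumptions, not the authors' implementation: each group's tagging power is evaluated as its event fraction times (1 − 2ω)² rather than with eq. (4.1), the LPJC decision is used as the reference decision whenever both methods tag the jet, and the record layout `(lpjc, iwjc, truth)` is hypothetical. Because (1 − 2ω)² is symmetric around ω = 0.5, a group in which the reference decision is mostly wrong still contributes, which corresponds to flipping the decision in that group.

```python
from collections import defaultdict


def hfjc_eps_eff(records):
    """Summed effective tagging power over the agree / disagree / single groups.

    records: iterable of (lpjc, iwjc, truth) with lpjc, iwjc in {+1, -1, 0}
    (0 means no decision) and truth in {+1, -1}.
    """
    records = list(records)
    groups = defaultdict(lambda: [0, 0])  # group name -> [n_tagged, n_wrong]
    for lpjc, iwjc, truth in records:
        if lpjc == 0 and iwjc == 0:
            continue  # untagged by both methods
        if lpjc != 0 and iwjc != 0:
            group = "agree" if lpjc == iwjc else "disagree"
            decision = lpjc  # assumption: LPJC taken as the reference decision
        else:
            group = "single"
            decision = lpjc if lpjc != 0 else iwjc
        groups[group][0] += 1
        groups[group][1] += int(decision != truth)

    total = 0.0
    for n_tagged, n_wrong in groups.values():
        omega = n_wrong / n_tagged
        total += (n_tagged / len(records)) * (1.0 - 2.0 * omega) ** 2
    return total


# Toy records (illustrative only).
toy_records = [
    (+1, +1, +1), (+1, +1, +1), (-1, -1, -1), (+1, +1, -1),  # agree, 1 of 4 wrong
    (+1, -1, -1), (-1, +1, +1),                              # disagree, LPJC wrong twice
    (+1, 0, +1), (0, -1, +1),                                # single, 1 of 2 wrong
    (0, 0, +1), (0, 0, -1),                                  # untagged
]
toy_total = hfjc_eps_eff(toy_records)
```

Step 6 then amounts to repeating this for each of the five leading-particle species and adding the results.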
According to refs. [32,33], the combined effective tagging power ϵ_eff is calculated with eq. (4.1), where s_i is the decision weight of the i-th method, ω_i is the misjudgment rate of the i-th method, and ξ_i is the tagging decision of the i-th method. The variable ξ_i takes the value +1 when the input is tagged as a positive jet, −1 when the input is tagged as a negative jet, and 0 when the input is untagged. The HFJC effective tagging power ϵ_eff values for each type of final state leading charged particle are shown in Table 7. Compared to LPJC and WJC, the leading leptons for c jets still deliver a misjudgment rate ω close to zero. The effective tagging power ϵ_eff of leading pions, kaons, and protons improves by factors of tens to hundreds. The total effective tagging power ϵ_eff improves by about 90%/130% for c/b jets.
The effective tagging power ϵ_eff of each method is summarized in Table 8. The columns compare the methods: Leading Particle Jet Charge (LPJC), conventional Weighted Jet Charge (WJC), Improved Weighted Jet Charge (IWJC), Heavy Flavor Jet Charge (HFJC), and the ideal cases in which the origin of the final state leading charged particle can be distinguished correctly and in which the b/c hadron type can be reconstructed correctly. The rows compare the three generators: Whizard 1.95, Herwig, and Sherpa. The improvement trends across the jet charge methods are nearly consistent among the three generators, and the differences between generators for the same method are below 11%/12% for c/b jets relative to Whizard 1.95.
For b jets, WJC, which uses the energy-weighted sum of jet charge, exhibits better jet charge performance than LPJC. For c jets, in contrast, LPJC exhibits better jet charge performance for final state e, µ, K. IWJC categorizes final state particles to improve on WJC, reaching an improvement of 15%/20%. By combining the methods above, HFJC achieves an effective tagging power ϵ_eff of 39.0%/20.4% for c/b jets. If the origin of the final state leading charged particle can be distinguished between leading heavy hadron decay and QCD fragmentation, the effective tagging power ϵ_eff could improve to 45%/37% for c/b jets, an improvement of 15%/80%. If each type of leading heavy hadron can be distinguished, the effective tagging power ϵ_eff reaches 56.1%/47.8% for c/b jets.

Impact of realistic detector effects
The jet charge performances described in the previous sections are obtained using truth level information, which, to some extent, corresponds to an ideal detector that has 100% acceptance, 100% reconstruction efficiency for all final state charged particles, and ideal flavor tagging, particle identification (PID) and momentum measurements. This section explores the disparity between the actual effective tagging power ϵ_eff and the maximum attainable limit. The jet charge performance will necessarily depend on the specific detector configuration and performance in actual experiments at the high energy frontier, so it is necessary to understand and quantify the influence of the detector parameters. We analyze this influence for the high energy frontier and Higgs factory detectors described in refs. [49,76,77], taking the CEPC baseline detector [51] as a typical example. The relevant detector effects are mainly as follows.
• Jet Flavor Tagging (FT) The jet charge performances described above are achieved on pure samples. In realistic HEP experiments, it is usually difficult to distinguish jets initiated by different species of quarks or gluons with high efficiency and purity.
The performance of FT can be characterized by a parameter matrix that converts the truth level (T) to the reconstructed level (re) for heavy flavor jets. For simplicity, we consider the FT matrix as a single-parameter matrix,

$$\begin{pmatrix} 1-\Omega & \Omega \\ \Omega & 1-\Omega \end{pmatrix},$$

where Ω is the misjudgment rate of flavor tagging. The efficiency for b/c jets can then be expressed as 1 − Ω.
Combining this with the quark charge information, the performance of the flavor-charge tagging (FC) can be characterized by a parameter matrix whose elements are calculated from the flavor tagging misjudgment rate Ω and the jet charge misjudgment rate ω. Here ω′_b(c) represents the jet charge misjudgment rate for a b (c) jet that is reconstructed as a c (b) jet. Through the charge correlation between final state particles and the b/c quark presented in section 2, ω′ can be expressed in terms of ω for leading e, µ, and π. The misjudgment rate of jet charge with flavor mixing, ω_FC, is obtained from the FC matrix, eq. (5.3), and the corresponding effective tagging power ϵ^FC_eff is given by eq. (5.4). The correlation between ϵ^FC_eff and the FT efficiency is illustrated in figure 10, which shows a clear correlation between the flavor tagging performance and the jet charge performance. At the CEPC, simulation studies show that in inclusive hadronic events at Z pole operation (Z → q q̄), c/b jets can typically be identified with an efficiency of 78.4%/91.1% [59], reaching an effective tagging power ϵ_eff of 33.8%/18.7% for c/b jets, respectively.
• Vertex resolution The vertex information plays a crucial role in distinguishing the origin of the final state leading charged particles, namely whether they arise from heavy hadron decay or QCD fragmentation. The HFJC* presented in Table 8 is based on ideal vertex resolution. However, the interdependence between vertex resolution and jet charge performance is of significant importance. The reconstructed impact parameter (IP) can be used to measure the vertex resolution, which represents how well the decay sources of final state charged particles are resolved. We investigate the influence of the IP resolution on the HFJC* performance. For simplicity, the distinction between heavy hadron decay and QCD fragmentation using the IP resolution can also be characterized by a single-parameter matrix given by

$$\begin{pmatrix} r_{\mathrm{T}} & r^{\mathrm{vtx}}_{\mathrm{mis}} \\ r^{\mathrm{vtx}}_{\mathrm{mis}} & r_{\mathrm{T}} \end{pmatrix}, \qquad r_{\mathrm{T}} = 1 - r^{\mathrm{vtx}}_{\mathrm{mis}}.$$

The correlation between ϵ_eff and r^vtx_mis is shown in figure 11.
The effective tagging power ϵ_eff of c/b jets decreases rapidly because the heavier hadron mass and larger decay length lead to a significant impact of the IP resolution on the jet charge measurement. The IP distributions of final state leading charged particles that arise from heavy hadron decay and from QCD fragmentation for c/b jets are displayed in figure 12, normalized to 1 to correspond to the matrix elements.
The peaks at higher values in these two plots are mainly from K_S/Λ decays.
The reconstructed IP separation between final state charged particles that arise from heavy hadron decay and from QCD fragmentation is better for b jets than for c jets, and the IP resolution has a bigger impact on the b jet than on the c jet charge identification performance. The benchmark IP cuts of 0.02/0.04 mm for c/b jets are marked with green dashed lines, which correspond to the triangle markers in figure 11, leading to an effective tagging power ϵ_eff of 39.0%/26.8% for c/b jets. Therefore, improving the IP resolution is crucial for enhancing the performance of jet charge measurements for b jets, and relatively less crucial for c jets. Further research into more advanced IP reconstruction techniques is worth pursuing.
• PID Jet charge performance is sensitive to PID performance, which encompasses lepton identification and hadron identification.

Lepton identification
The lepton identification reaches an efficiency of 98% and a misidentification rate of 1% for energies higher than 2 GeV [78]. The impact of lepton identification on effective tagging power ϵ eff can be ignored.

Hadron identification
The hadron identification (K, π, p) is comparatively inferior to lepton identification; therefore, its influence on jet charge effective tagging power ϵ eff merits investigation. For simplicity, the hadron identification performance can also be characterized by a single-parameter matrix over the three hadron species, where r T = 1 − 2r pid mis . The correlation between ϵ eff and r pid mis is illustrated in figure 13. The WJC remains unaffected by PID performance as the jet charge calculation involves all final state charged particles, irrespective of their type. In contrast, the effective tagging power ϵ eff of LPJC noticeably diminishes as r pid mis increases, ultimately reaching an ϵ eff of 5.5% that primarily stems from lepton information. The effective tagging power ϵ eff of c jet decreases faster than that of b jet because leading kaons have different origins and exhibit opposite charge correlation for c jet. The PID performance at future circular colliders is analyzed in [79,80], in which a dE/dx resolution better than 3% is pursued, resulting in a 4.6%/3.9% degradation of effective tagging power ϵ eff for c/b jet, respectively.
• Energy threshold The impact of energy threshold on jet charge effective tagging power ϵ eff is illustrated in figure 14. The WJC demonstrates greater sensitivity, as the selection efficiency significantly influences the total charge of a jet. In contrast, the LPJC method exhibits relative insensitivity. The degradation of effective tagging power ϵ eff can be effectively ignored at an energy threshold of 1 GeV.

• Other detector parameters
The momentum resolution and polar angle acceptance can be quantified by the Particle Flow Algorithm (PFA) [81] oriented baseline at Higgs factories, which gives δp/p around 0.1% or better and |cos(θ)| around 0.99. The track reconstruction can refer to [82], finding charged particles with transverse momentum p T as low as 0.1 GeV, or produced as far as 60 cm from the beam line. Under these conditions, the degradation of effective tagging power ϵ eff is negligible.
However, some dilution effects remain even after accounting for all the effects considered above and applying techniques to mitigate dilution. To correct for these effects, we use the residual dilution, which quantifies the extent to which the measured jet charge deviates from the true jet charge due to these unwanted contributions. We account for the residual dilution by applying a correction factor to the measured values.
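The single-parameter matrices used above for the vertex and hadron identification studies can be illustrated with a toy calculation. The sketch below assumes the standard definition ϵ eff = ϵ tag (1 − 2ω)^2, treats each category of leading particle as an independent tagging category, and uses made-up fractions and misjudgment rates. The function names and all numbers are hypothetical and are not taken from the analysis.

```python
def effective_tagging_power(eff_tag, omega):
    """Standard tagging power: eff_tag * (1 - 2*omega)^2 (assumed definition)."""
    return eff_tag * (1.0 - 2.0 * omega) ** 2


def mixing_matrix(r_mis, n):
    """Symmetric single-parameter matrix for n categories.

    Off-diagonal elements are r_mis; diagonal elements are
    r_T = 1 - (n - 1) * r_mis, i.e. 1 - r_mis for n = 2 (vertex case)
    and 1 - 2 * r_mis for n = 3 (K, pi, p case).
    """
    r_true = 1.0 - (n - 1) * r_mis
    return [[r_true if i == j else r_mis for j in range(n)] for i in range(n)]


def diluted_tagging_power(fractions, omegas, r_mis):
    """Toy tagging power after categories are confused with rate r_mis.

    fractions[i]: share of jets whose true category is i (hypothetical).
    omegas[i]:    misjudgment rate within true category i (hypothetical).
    """
    n = len(fractions)
    m = mixing_matrix(r_mis, n)
    total = 0.0
    for j in range(n):
        # Share of jets assigned to observed category j.
        g = sum(m[j][i] * fractions[i] for i in range(n))
        if g == 0.0:
            continue
        # Misjudgment rate of the mixture that lands in observed category j.
        w = sum(m[j][i] * fractions[i] * omegas[i] for i in range(n)) / g
        total += effective_tagging_power(g, w)
    return total


if __name__ == "__main__":
    # Hypothetical two-category example: heavy hadron decay vs. QCD fragmentation.
    fractions = [0.6, 0.4]
    omegas = [0.1, 0.4]
    for r in (0.0, 0.1, 0.3, 0.5):
        print(r, round(diluted_tagging_power(fractions, omegas, r), 4))
```

In this toy model the tagging power is largest when the categories are perfectly separated (r mis = 0) and falls to that of a single merged category when they are indistinguishable, which is the qualitative trend described for figures 11 and 13.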

Summary
The jet charge plays a crucial role in electroweak and flavor physics measurements at collider experiments. In this study, we evaluate the performance of heavy flavor jet charge identification at future Z factories using truth-level generators. The jet charge performance is quantified by the effective tagging power ϵ eff . We develop a Leading Particle Jet Charge method (LPJC) and combine it with Weighted Jet Charge (WJC) to create a Heavy Flavor Jet Charge method (HFJC), achieving an effective tagging power ϵ eff of 39%/20% for c/b jet, respectively.
The effective tagging power is highly dependent on the species of the final state leading charged particle, the species of the leading heavy hadron that the b/c quark hadronizes into, and the decay source of the final state particles. Using leading heavy decay products, particularly leptons from charged heavy mesons and protons from baryons (Λ c /Λ b ), LPJC achieves a very low misjudgment rate ω. In contrast, the final state leading charged particle from B 0 s contributes little to the jet charge identification due to the fast oscillation of B 0 s . However, measuring the flavor of the other b hadron produced in the event is promising to achieve an effective tagging power ϵ eff of 20.2% [75]. The high dependency on leading hadron type indicates a potential for specific channel measurements on jet charge. If the origin of the final state leading charged particle can be distinguished between leading heavy hadron decay and QCD fragmentation, the effective tagging power ϵ eff could be improved to 45%/37% for c/b jet. Furthermore, if each type of leading heavy hadron can be distinguished, the effective tagging power ϵ eff reaches 56.1%/47.8% for c/b jet. The former could, in principle, be approached by an excellent vertex system, while the latter is much more challenging; however, carefully analyzing the information of all the final state particles in a $Z \to q\bar{q}$ event can also provide certain effective tagging power ϵ eff for different leading heavy hadron species, especially in specific decay modes. The values of effective tagging power ϵ eff are summarized in Table 8.
The impact of critical detector performance on jet charge is also quantified. On the referenced detector [49,51,76,77], most of the effects have a relatively small impact on jet charge performance, while the PID performance has a more obvious effect on effective tagging power ϵ eff , quantified in figure 13. Although our analysis is at the truth level, the reconstructed results are expected to be similar.
A comparison of jet charge performance between colliders under different conditions is displayed in figure 15, where the color maps the corresponding effective tagging power ϵ eff , the white lines indicate contours of constant effective tagging power ϵ eff , the x-axis is the misjudgment rate ω, and the y-axis refers to the selection efficiency ϵ tag from all final state charged particles. The solid triangles emphasize the effective tagging power ϵ eff for CEPC employing the HFJC method with an ideal detector, corresponding to the ϵ tag,max in Table 2. The hollow triangles labeled "HFJC* Baseline VTX" represent the effective tagging power ϵ eff obtained by incorporating the baseline vertex information on top of the HFJC method represented by solid triangles. The hollow triangles labeled "HFJC Baseline FT" represent the effective tagging power ϵ eff achieved using the HFJC method but with baseline flavor tagging efficiency instead of ideal flavor tagging. The dots represent the results of recent jet charge analyses conducted by LHCb, ATLAS, BaBar, and Belle for specific decay modes, leading to a lower efficiency, as reported in [43,44] and the references therein. The results of Z factories are relatively close to the flavor factory case.
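The contours of constant ϵ eff in the (ω, ϵ tag ) plane can be reproduced with a few lines of code. The sketch below assumes the standard relation ϵ eff = ϵ tag (1 − 2ω)^2. The function names are hypothetical, and the 20% contour is used only as an example value.

```python
def effective_tagging_power(eff_tag, omega):
    """Tagging power for selection efficiency eff_tag and misjudgment rate omega."""
    return eff_tag * (1.0 - 2.0 * omega) ** 2


def efficiency_on_contour(eff_target, omega):
    """Selection efficiency needed to reach eff_target at a given omega.

    Returns None where the contour does not exist, i.e. when the required
    efficiency would exceed 1 or when omega = 0.5 (no charge information).
    """
    dilution_sq = (1.0 - 2.0 * omega) ** 2
    if dilution_sq == 0.0:
        return None
    eff_tag = eff_target / dilution_sq
    return eff_tag if eff_tag <= 1.0 else None


if __name__ == "__main__":
    # Trace an illustrative contour at eff_target = 0.20.
    for omega in (0.0, 0.1, 0.2, 0.25, 0.3, 0.4):
        print(omega, efficiency_on_contour(0.20, omega))
```

Along a contour, a higher misjudgment rate must be compensated by a higher selection efficiency, and each contour terminates where the required ϵ tag would exceed 1.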
This study provides insights into the factors that affect the correlation between jet charge and fragmentation methods of different generators. This result has substantial implications for precise electroweak and flavor physics measurements, such as the forward-backward asymmetry $A_{FB}$ and the electroweak mixing angle $\sin^2\theta_W$ [16][17][18], the time-dependent CP measurement [28][29][30][31], and the Higgs properties measurement at the high-energy frontier [5,6]. In the future, jet charge identification is expected to be extended to multi-jet events, other center-of-mass energy frontiers, and other collision environments such as pp colliders. Besides, investigating detector dependency on the full simulation level is the next logical step. The jet charge performance versus jet clustering algorithm is also of great interest. Moreover, the algorithm can be combined with other information, for example, secondary vertex charge, kaon charge for B 0 s , neutral kaon identification, jet shape, thrust, and multiplicity information [83]. Further advancements using machine learning techniques are also worth exploring.

Figure 1. The jet event display of an $e^+e^- \to Z \to b\bar{b}$ event at the Z pole generated by Whizard 1.95.

Figure 2. The multiplicity distributions of final state charged particles vs. neutral particles for c jet (left) and b jet (right) at the Z pole.

Figure 3. The energy spectrum of final state charged particles for c jet (left) and b jet (right). The entries along the diagonal represent the final state leading charged particles.

Figure 4. The percentages of species of final state leading charged particles within the c jet (left) and the $\bar{c}$ jet (right) by Whizard 1.95.

Figure 5. The percentages of species of final state leading charged particles within the b jet (left) and the $\bar{b}$ jet (right) by Whizard 1.95.

Figure 6. The percentage distribution pie plots of final state leading charged particles for c jet from the inclusive source (top), from heavy hadron decay (middle), and from QCD fragmentation (bottom) by Whizard 1.95. The four columns distinguish different types of leading c hadron. About 67% of final state leading charged particles are from c hadron decay: D + , D 0 , D + s , Λ + c , with proportions of 22%/63%/8%/7%, respectively.

Figure 7. The percentage distribution pie plots of final state leading charged particles for b jet from the inclusive source (top), from heavy hadron decay (middle), and from QCD fragmentation (bottom) by Whizard 1.95. The four columns distinguish different types of leading b hadron. About 83% of final state leading charged particles are from b hadron decay: $\bar{B}^0$, $B^-$, $\bar{B}^0_s$, $\Lambda^0_b$, with proportions of 42%/42%/8%/7%, respectively.

Figure 8. The WJC effective tagging power ϵ eff with different κ (from 0 to 1.0) for c jet (left) and b jet (right). The optimal κ is 0 for c jet and 0.2 for b jet.

Figure 9. The charge distributions for b jet with κ = 0.2.

Figure 10. The effective tagging power ϵ eff as a function of the b/c quark flavor tagging efficiency, using the HFJC method.

Figure 11. The HFJC* effective tagging power ϵ eff as a function of the mis-distinction rate r vtx mis between heavy hadron decay and QCD fragmentation, for b jet and c jet.

Figure 12. The reconstructed log IP (the logarithm of impact parameter) distributions of final state leading charged particles from heavy hadron decay and from QCD fragmentation for c jet (left) and b jet (right).

Figure 13. The effective tagging power ϵ eff as a function of the PID mis-identification rate r pid mis for b jet and c jet.

Figure 14. The effective tagging power ϵ eff as a function of the energy threshold for b jet and c jet.

Figure 15. The comparison of jet charge performance between different colliders under different conditions. The color maps the corresponding effective tagging power ϵ eff , the x-axis is the misjudgment rate ω, and the y-axis refers to the selection efficiency ϵ tag . The solid triangles emphasize the effective tagging power ϵ eff for CEPC employing the HFJC method with an ideal detector, corresponding to the ϵ tag,max in Table 2. The hollow triangles labeled "HFJC* Baseline VTX" represent the effective tagging power ϵ eff obtained by incorporating the baseline vertex information on top of the solid triangles. The hollow triangles labeled "HFJC Baseline FT" represent the effective tagging power ϵ eff achieved using the HFJC method but with baseline flavor tagging efficiency instead of ideal flavor tagging. The dots represent the results of related jet charge experiments of LHCb, ATLAS, BaBar, and Belle for each specific decay mode in [43,44] and their corresponding references.

Table 3. The effective tagging power ϵ eff for each type of the leading heavy hadrons and the total effective tagging power ϵ eff by Whizard 1.95.

Table 4. The WJC effective tagging power ϵ eff for the c/b jet, determined using three generators, with κ set to its optimal value in each case.

Table 6. The IWJC effective tagging power ϵ eff for the c/b jet using three generators.

Table 7. The HFJC effective tagging power ϵ eff for each type of final state leading charged particle and the total effective tagging power ϵ eff for the c/b jet.

Table 8. The effective tagging power ϵ eff for the c/b jet of each method: Leading Particle Jet Charge (LPJC), conventional Weighted Jet Charge (WJC), Improved Weighted Jet Charge (IWJC), Heavy Flavor Jet Charge (HFJC), and the ideal cases in which the origin of the final state leading charged particle can be distinguished correctly and in which the b/c hadron type can be reconstructed correctly. Results are presented using three generators for comparison: Whizard 1.95, Herwig, and Sherpa.