A Hadronization Model for Few-GeV Neutrino Interactions

We present a detailed description of a new hadronic multiparticle production model for use in neutrino interaction simulations. Its validity spans a wide invariant mass range starting from pion production threshold. This model focuses on the low invariant mass region which is probed in few-GeV neutrino interactions and is of particular importance to neutrino oscillation experiments using accelerator and atmospheric fluxes. It exhibits reasonable agreement with a wide variety of experimental data. We also describe measurements that can be made in upcoming experiments that can improve modeling in areas where uncertainties are currently large.


Introduction
In neutrino interaction simulations the hadronization model (or fragmentation model) determines the final state particles and 4-momenta given the nature of a neutrinonucleon interaction (CC/NC, ν/ν, target neutron/proton) and the event kinematics (W 2 , Q 2 , x, y). The modeling of neutrino-produced hadronic showers is important for a number of analyses in the current and coming generation of neutrino oscillation experiments: Calorimetry: Neutrino oscillation experiments like MINOS which use calorimetry to reconstruct the shower energy, and hence the neutrino energy, are sensitive to the modelling of hadronic showers. These detectors are typically calibrated using single particle test beams, which introduces a model dependence in determining the conversion between detector activity and the energy of neutrino-produced hadronic systems [1].
NC/CC Identification: Analyses which classify events as charged current (CC) or neutral current (NC) based on topological features such as track length in the few-GeV region rely on accurate simulation of hadronic particle distributions to determine NC contamination in CC samples.
Topological Classification: Analyses which rely on topological classifications, for instance selecting quasi-elastic-like events based on track or ring counting depend on the simulation of hadronic systems to determine feeddown of multi-particle states into  [7]. selected samples. Because of the wide-band nature of most current neutrino beams, this feeddown is non-neglible even for experiments operating in beams with mean energy as low as 1 GeV [2,3]. ν e Appearance Backgrounds: A new generation of ν µ → ν e appearance experiments are being developed around the world, which hope to measure θ 13 , resolve the neutrino mass hierarchy, and find evidence of CP violation in the lepton sector [4,5]. In these experiments background is dominated by neutral pions generated in NC interaction. The evaluation of NC backgrounds in these analysis can be quite sensitive to the details of the NC shower simulation and specifically the π 0 shower content and transverse momentum distributions of hadrons [6].
In order to improve Monte Carlo simulations for the MINOS experiment, a new hadronization model, referred to here as the 'AGKY model', was developed. We use the PYTHIA/JETSET [8] model to simulate the hadronic showers at high hadronic invariant masses. We also developed a phenomenological description of the low invariant mass hadronization since the applicability of the PYTHIA/JETSET model, for neutrino-induced showers, is known to deteriorate as one approaches the pion production threshold. We present here a description of the AGKY hadronization model and the tuning and validation of this model using bubble chamber experimental data.

Overview
The AGKY model, which is now the default hadronization model in the neutrino Monte Carlo generators NEUGEN [9] and GENIE-2.0.0 [10], includes a phenomenological description of the low invariant mass region based on Koba-Nielsen-Olesen (KNO) scaling [11], while at higher masses it gradually switches over to the PYTHIA/JETSET model. The transition from the KNO-based model to the PYTHIA/JETSET model takes place gradually, at an intermediate invariant mass region, ensuring the continuity of all simulated observables as a function of the invariant mass. This is accomplished by using a transition window [W tr min , W tr max ] over which we linearly increase the fraction of neutrino events for which the hadronization is performed by the PYTHIA/JETSET model from 0% at W tr min to 100% at W tr max . The default values used in the AGKY model are: The kinematic region probed by any particular experiment depends on the neutrino flux, and for the 1-10 GeV range of importance to oscillation experiments, the KNO-based phenomenological description plays a particularly crucial role. The higher invariant mass region where PYTHIA/JETSET is used is not accessed until a neutrino energy of approximately 3 GeV is reached, at which point 44.6% of charged current interactions are non-resonant inelastic and are hadronized using the KNO-based part of the model. For 1 GeV neutrinos this component is 8.3%, indicating that this model plays a significant role even at relatively low neutrino energies. At 9 GeV, the contributions from the KNO-based and PYTHIA/JETSET components of the model are approximately equal, with each handling around 40% of generated CC interactions. The main thrust of this work was to improve the modeling of hadronic showers in this low invariant mass / energy regime which is of importance to oscillation experiments.
The description of AGKY's KNO model, used at low invariant masses, can be split into two independent parts: • Generation of the hadron shower particle content • Generation of hadron 4-momenta These two will be described in detail in the following sections.
The neutrino interactions are often described by the following kinematic variables: where Q 2 is the invariant 4-momentum transfer squared, ν is the neutrino energy transfer, W is the effective mass of all secondary hadrons (invariant hadronic mass), x is the Bjorken scaling variable, y is the relative energy transfer, E ν is the incident neutrino energy, E µ and p L µ are the energy and longitudinal momentum of the muon, M is the nucleon mass and m is the muon mass.
For each hadron in the hadronic system, we define the variables z = E h /ν, x F = 2p * L /W and p T where E h is the energy in the laboratory frame, p * L is the longitudinal momentum in the hadronic c.m.s., and p T is the transverse momentum.

Low-W model: Particle content
At low invariant masses the AGKY model generates hadronic systems that typically consist of exactly one baryon (p or n) and any number of π and K mesons that are kinematically possible and consistent with charge conservation.
For a fixed hadronic invariant mass and initial state (neutrino and struck nucleon), the method for generating the hadron shower particles generally proceeds in four steps: Determine n ch : Compute the average charged hadron multiplicity using the empirical expression: The coefficients a ch , b ch , which depend on the initial state, have been determined by bubble chamber experiments. Determine n : Compute the average hadron multiplicity as n tot = 1.5 n ch [12]. Deterimine n: Generate the actual hadron multiplicity taking into account that the multiplicity dispersion is described by the KNO scaling law [11]: where P (n) is the probability of generating n hadrons and f is the universal scaling function which can be parametrized by the Levy function 1 (z = n/ n ) with an input parameter c that depends on the initial state. Fig.1 shows the KNO scaling distributions for νp (left) and νn (right) CC interactions. We fit the data points to the Levy function and the best fit parameters are c ch = 7.93 ± 0.34 for the νp interactions and c ch = 5.22 ± 0.15 for the νn interactions. Select particle types: Select hadrons up to the generated hadron multiplicity taking into account charge conservation and kinematic constraints. The hadronic system contains any number of mesons and exactly one baryon which is generated based on simple quark model arguments. Protons and neutrons are produced in the ratio 2:1 for νp interactions, 1:1 for νn andνp, and 1:2 forνn interactions. Charged mesons are then created in order to balance charge, and the remaining mesons are generated in neutral pairs. The probablilities for each are 31.33% (π 0 , π 0 ), 62.66% (π + , π − ), 1.5% (K 0 , K − ), 1.5% (K + , K − ), 1.5% (K 0 , K + ) and 1.5% (K 0 ,K 0 ). The probability of producing a strange baryon via associated production is determined from a fit to Λ production data:

Low-W model: Hadron system decay
Once an acceptable particle content has been generated, the available invariant mass needs to be partitioned amongst the generated hadrons. The most pronounced kinematic features in the low-W region result from the fact that the produced baryon is much heavier than the mesons and exhibits a strong directional anticorrelation with the current direction. Our strategy is to first attempt to reproduce the experimentally measured final state nucleon momentum distributions. We then perform a phase space decay on the remnant system employing, in addition, a p T -based rejection scheme designed to reproduce the expected meson transverse momentum distribution. The hadronization model performs its calculation in the hadronic c.m.s., where the z-axis is in the direction of the momentum transfer. Once the hadronization is completed, the hadronic system will be boosted and rotated to the LAB frame. The boost and rotation maintains the p T generated in the hadronic c.m.s.
In more detail, the algorithm for decaying a system of N hadrons is the following: Generate baryon: Generate the baryon 4-momentum P * N = (E * N , p * N ) using the nucleon p 2 T and x F PDFs which are parametrized based on experimental data [14,15]. The x F distribution used is shown in Fig.2. We do not take into account the correlation between p T and x F in our selection.
Remnant System: Once an accepted P * N has been generated, calculate the 4momentum of the remaining N-1 hadrons, (the "remnant" hadronic system) as P * R = P * X − P * N where P * X = (W, 0) is the initial hadron shower 4-momentum in the hadronic c.m.s. Decay Remnant System: Generate an unweighted phase space decay of the remnant hadronic system [16]. The decay takes place at the remnant system c.m.s. after which the particles are boosted back to the hadronic c.m.s. The phase space decay employs a rejection method suggested in [17], with a rejection factor e −A * p T for each meson. This causes the transverse momentum distribution of the generated mesons to fall exponentially with increasing p 2 T . Here p T is the momentum component perpendicular to the current direction.
Two-body hadronic systems are treated as a special case. Their decay is performed isotropically in the hadronic c.m.s. and no p T -based suppression factor is applied.

High-W model: PYTHIA/JETSET
The high invariant mass hadronization is performed by the PYTHIA/JETSET model [8]. The PYTHIA program is a standard tool for the generation of high-energy collisions, comprising a coherent set of physics models for the evolution from a few-body hard process to a complex multihadronic final state. It contains a library of hard processes and models for initial-and final-state parton showers, multiple parton-parton interactions, beam remnants, string fragmentation and particle decays. The hadronization model in PYTHIA is based on the Lund string fragmentation framework [18]. In the AGKY model, all but four of the PYTHIA configuration parameters are set to be the default values. Those four parameters take the non-default values tuned by NUX [19], a high energy neutrino MC generator used by the NOMAD experiment: • P ss controlling the ss production suppression: (PARJ(2))=0.21.

Data/MC Comparisons
The characteristics of neutrino-produced hadronic systems have been extensively studied by several bubble chamber experiments. The bubble chamber technique is well suited for studying details of charged hadron production in neutrino interactions since the detector can provide precise information for each track. However, the bubble chamber has disadvantages for measurements of hadronic system characteristics as well. The detection of neutral particles, in particular of photons from π 0 decay, was difficult for the low density hydrogen and deuterium experiments. Experiments that measured neutral pions typically used heavily liquids such as neon-hydrogen mixtures and Freon. While these exposures had the advantage of higher statistics and improved neutral particle identification, they had the disadvantage of introducing intranuclear rescattering which complicates the extraction of information related to the hadronization process itself. We tried to distill the vast literature and focus on the following aspects of ν/ν measurements made in three bubble chambers -the Big European Bubble Chamber (BEBC) at CERN, the 15-foot bubble chamber at Fermilab, and the SKAT bubble chamber in Russia. Measurements from the experiments of particular interest for tuning purposes can be broadly categorized as multiplicity measurements and hadronic system measurements. Multiplicity measurements include averaged charged and neutral particle (π 0 ) multiplicities, forward and backward hemisphere average multiplicities and correlations, topological cross sections of charged particles, and neutral -charged pion multiplicity correlations. Hadronic system measurements include fragmentation functions (z distributions), x F distributions, p 2 T (transverse momentum squared) distributions, and x F − p 2 T correlations ("seagull" plots). The systematic errors in many of these measurements are substantial and various corrections had to be made to correct for muon selection efficiency, neutrino energy smearing, etc. The direction of the incident ν/ν is well known from the geometry  [7,20].
of the beam and the position of the interaction point. Its energy is unknown and is usually estimated using a method based on transverse momentum imbalance. The muon is usually identified through the kinematic information or by using an external muon identifier (EMI). The resolution in neutrino energy is typically 10% in the bubble chamber experiments and the invariant hadronic mass W is less well determined.
The differential cross section for semi-inclusive pion production in neutrino interactions ν + N → µ − + π + X may in general be written as: where D π (x, Q 2 , z) is the pion fragmentation function. Experimentally D π is determined as: In the framework of the Quark Parton Model (QPM) the dominant mechanism for reactions (6) is the interaction of the exchanged W boson with a d-quark to give a u-quark which fragments into hadrons in neutrino interactions, leaving a di-quark spectator system which produces target fragments. In this picture the fragmentation function is independent of x and the scaling hypothesis excludes a Q 2 dependence; therefore the fragmentation function should depend only on z. There is no reliable way to separate the current fragmentation region from the target fragmentation region if the effective mass of the hadronic system (W ) is not sufficiently high. Most experiments required W > W 0 where W 0 is between 3 GeV/c 2 and 4 GeV/c 2 when studying the fragmentation characteristics. The caused difficulties in the tuning of our model because we are mostly interested in the interactions at low hadronic invariant masses.
We determined the parameters in our model by fitting experimental data with simulated CC neutrino free nucleon interactions uniformly distributed in the energy range from 1 to 61 GeV. The events were analyzed to determine the hadronic system characteristics and compared with published experimental data from the BEBC, Fermilab 15-foot, and SKAT bubble chamber experiments. We reweight our MC to the energy spectrum measured by the experiment if that information is available. This step is not strictly necessary for the following two reasons: many observables (mean multiplicity, dispersion, etc.) are measured as a function of the hadronic invariant mass W , in which case the energy dependency is removed; secondly the scaling variables (x F , z, etc.) are rather independent of energy according to the scaling hypothesis.
Some experiments required Q 2 > 1GeV 2 to reduce the quasi-elastic contribution, y < 0.9 to reduce the neutral currents, and x > 0.1 to reduce the sea-quark contribution. They often applied a cut on the muon momentum to select clean CC events. We apply the same kinematic cuts as explicitly stated in the papers to our simulated events. Fig.3 shows the average charged hadron multiplicity n ch (the number of charged hadrons in the final state, i.e. excluding the muon) as a function of W 2 . n ch rises linearly with ln(W 2 ) for W > 2GeV/c 2 . At the lowest W values the dominant interaction channels are single pion production from baryon resonances: Therefore n ch becomes 2(1) for νp(νn) interactions as W approaches the pion production threshold. For νp interactions there is a disagreement between the two measurements especially at high invariant masses, which is probably due to differences in scattering from hydrogen and deuterium targets. Our parameterization of low-W model was based on the Fermilab 15-foot chamber data. Historically the PYTHIA/JETSET program was tuned on the BEBC data. The AGKY model uses the KNO-based empirical model at low invariant masses and it uses the PYTHIA/JETSET program to simulation high invariance mass interactions. Therefore the MC prediction agrees better with the Fermilab data at low invariant masses and it agrees better with the BEBC data at high invariant masses. The production of strange particles via associated production is shown in Figures  8 and 9. The production of kaons and lambdas for the KNO-based model are in reasonable agreement with the data, while the rate of strange meson production from  JETSET is clearly low. We have investigated adjusting JETSET parameters to produce better agreement with data. While it is possible to improve the agreement with strange particle production data, doing so yields reduced agreement with other important distributions, such as the normalized charged particle distributions. Fig.4(a) shows the dispersion D − = ( n 2 − − n − 2 ) 1/2 of the negative hadron multiplicity as a function of n − . Fig.4(b) shows the ratio D/ n ch as a function of W 2 . The dispersion is solely determined by the KNO scaling distributions shown in Fig.1. The agreement between data and MC predictions is satisfactory. Fig.5(a) shows the average π 0 multiplicity n π 0 as a function of W 2 . Fig.5(b) shows the dispersion of the distributions in multiplicity as a function of the average multiplicity of π 0 mesons. As we mentioned it is difficult to detect π 0 's inside a hydrogen bubble chamber. Also shown in the plot are some measurements using heavy liquids such as neon and Freon. In principle, rescattering of the primary hadrons can occur in the nucleus. Some studies of inclusive negative hadron production in the hydrogen-neon mixture and comparison with data obtained by using hydrogen targets indicate that these effects are negligible [27]. The model is in good agreement with the data. n π 0 is 0(1/2) for νp(νn) interactions when the hadronic invariant mass approaches the pion production threshold, which is consistent with the expectation from the reactions (9)(10)(11). The model predicts the same average π 0 multiplicity for νp and νn interactions for W > 2GeV/c 2 . Fig.6 shows the average π 0 multiplicities n π 0 as a function of the number of negative hadrons n − for various W ranges. At lower W , n π 0 tends to decrease with n − , probably because of limited phase space, while at higher W n π 0 is rather independent of n − where there is enough phase space. Our model reproduces the     [30].
correlation at lower W suggested by the data. However, another experiment measured the same correlation using neon-hydrogen mixture and their results indicate that n π 0 is rather independent of n − for both W > 4GeV/c 2 and W < 4GeV/c 2 [28]. Since events with π 0 but with 0 or very few charged pions are dominant background events in the ν e appearance analysis, it is very important to understand the correlation between the neutral pions and charged pions; this should be a goal of future experiments [29]. Fig.7 shows the average charged-hadron multiplicity in the forward and backward hemispheres as functions of W 2 . The forward hemisphere is defined by the direction of the current in the total hadronic c.m.s. There is a bump in the MC prediction in the forward hemisphere for νp interactions at W ∼ 2GeV/c 2 and there is a slight dip in the backward hemisphere in the same region. This indicates that the MC may overestimate the hadrons going forward in the hadronic c.m.s. at W ∼ 2GeV/c 2 and underestimate the hadrons going backward. One consequence could be that the MC overestimates the energetic hadrons since the hadrons in the forward hemisphere of hadronic c.m.s. get more Lorentz boost than those in the backward hemisphere when boosted to the LAB frame. This may be caused by the way we determine the baryon 4-momentum and preferably select events with low p T in the phase space decay. These  Nev · dN dz , where N ev is the total number of interactions (events) and z = E/ν is the fraction of the total energy transfer carried by each final hadron in the laboratory frame. The AGKY predictions are in excellent agreement with the data. Fig.11 shows the mean value of the transverse momentum with respect to the current direction of charged hadrons as a function of W . The MC predictions match the data reasonably well. In the naive QPM, the quarks have no transverse momentum within the struck nucleon, and the fragments acquire a P f rag T with respect to the struck quark from the hadronization process. The average transverse momentum P 2 T of the hadrons will then be independent of variables such as x BJ , y, Q 2 , W , etc., apart from trivial kinematic constraints and any instrumental effects. Both MC and data reflect this feature. However, in a perturbative QCD picture, the quark acquires an additional transverse component, P 2 T QCD , as a result of gluon radiation. The quark itself may also have a primordial P 2 T prim inside the nucleon. These QCD effects can introduce dependencies of P 2 T on the variables x BJ , y, Q 2 , W , z, etc. Fig.12 shows the mean value of the transverse momentum of charged hadrons as a function of is the Feynman-x. As is well known, p T increases with increasing |x F | with a shape called the seagull effect. This effect is reasonably well modeled by the AGKY model.

Conclusions
In this paper we have described a new hadronic mutiparticle production model for use in neutrino simulations. This model will be useful for experiments in the few-GeV energy regime and exhibits satisfactory agreement with wide variety of data for charged, neutral pions as well as strange particles. Several upcoming expriments will have high-statistics data sets in detectors with excellent energy resolution, neutral particle containment, and particle identification. These experiments are in some cases considering possible running with cryogenic hydrogen and deuterium targets. These experiments will be operating in this few-GeV regime and have the potential to fill in several gaps in our understanding that will help improve hadronic shower modeling for oscillation experiments.
The upcoming generation of experiments have all the necessary prerequisites to significantly address the existing experimental uncertainties in hadronization at low invariant mass. These result from the fact that these detectors have good containment for both charged and neutral particles, high event rates, good tracking resolution, excellent particle identification and energy resolution, and the possibility of collecting data on free nucleons with cryogenic targets. The latter offers the possibility of addressing the challenge of disentangling hadronization modeling from intranuclear rescattering effects. Charged current measurements of particular interest will include clarifying the experimental discrepancy at low invariant mass between the existing published results as shown in Fig.7, the origin of which probably relates to particle misidentification corrections [22]. This discrepancy has a large effect on forward/backward measurements, and a succesful resolution of this question will reduce systematic differences between datasets in a large class of existing measurements. In addition, measurements of transverse momentum at low invariant masses will be helpful in model tuning. Measurements of neutral particles, in particular multiplicity and particle dispersion from free targets at low invariant mass, will be tremendously helpful. The correlation between neutral and charged particle multiplicities at low invariant mass is particularly important for oscillation simulations, as it determines the likelihood that a low invariant mass shower will be dominated by neutral pions.