GEANT4 simulation of hadronic interactions at 8–10 GeV/c: response to the HARP-CDP group

The results of the HARP-CDP group on the comparison of Geant4 Monte Carlo predictions versus experimental data are discussed. It is shown that the problems observed by the group are caused by an incorrect implementation of old features at the programming level, and by a lack of the nucleon Fermi motion in the simulation of quasi-elastic scattering. These drawbacks are not due to the physical models used. They do not manifest themselves in the most important applications of the Geant4 toolkit.


Introduction
Recently the HARP-CDP group published simulation results on hadron interactions with beryllium nuclei at energies below 10 GeV [1]. These results were obtained with the help of the GEANT4 Monte Carlo event generators [2,3]. Problems were emphasized in two-dimensional plots of P L vs. P T of the produced low energy secondary particles. Significant irregularities, such as the peaks and valleys shown in Fig. 1a, were not seen in the experimental data. 1 Because the GEANT3 and GEANT4 toolkits are widely used by experimental collaborations for design and analysis of various detectors, these results require explanation, and programming errors, where found, in the GEANT4 code must be located and fixed. In this report, we show that these irregularities are connected with incomplete or un-optimized solutions of well-known problems. 1 The dense regions in Fig. 1a correspond to the peaks, and the empty regions correspond to the valleys. a e-mail: Vladimir.Uzhinskiy@cern.ch Thanks to the HARP-CDP group, problems in hadron interactions with light nuclear targets have now been clearly demonstrated. However, some natural questions are not considered in their paper: does the observed structure also appear in hadron-hadron interactions and in the interactions of hadrons with heavy nuclei; is there any dependence of the structure on the interaction energy or on the multiplicity of produced particles, and so on?
In Sect. 2 we describe the main features of GEANT4 simulation package for a reader not familiar with GEANT4. There we pay attention to the coupling of the GEANT4 hadronic models in the transition from intermediate to high energy. A consideration of questions dealing with the pion spectra is presented in Sect. 3. We show that the observed structure is caused by an incorrect programming solution dating back to the origin of the hadronic code. We turn to the proton spectra in Sect. 4, and show that a strange peak observed by the HARP-CDP group is connected with the absence of Fermi-motion of the nuclear nucleons during the simulation of quasi-elastic scattering. A summary of our results is presented in a short conclusion.

GEANT4 simulation package
GEANT4 is a software toolkit for the simulation of the passage of particles through matter. It is used by a large number of experiments and projects in a variety of application domains, including high energy physics, astrophysics and space science, medical physics and radiation protection. GEANT4 physics processes cover diverse interactions over an extended energy range, from optical photons and thermal neutrons to the high energy reactions at the Large Hadron Collider (LHC) and in cosmic ray experiments. Particles tracked include leptons, photons, hadrons and ions. Various implementations of physics processes are offered, providing complementary or alternative modeling approaches. All aspects of the simulation process are included: the geometry of the system, the materials involved, the fundamental particles of interest, the generation of primary particles of events, the tracking of particles through materials and external electromagnetic fields, the physics processes governing particle interactions, the response of sensitive detector components, the generation of event data, the storage of events and tracks, the visualization of the detector and particle trajectories, and the capture for subsequent analysis of simulation data at different levels of detail and refinement.
At the heart of this software system is an abundant set of physics models to handle the interactions of particles with matter across a very wide energy range. Data and expertise have been drawn from many sources around the world and in this respect GEANT4 acts as a repository that incorporates a large part of all that is known about particle interactions; moreover it continues to be refined, expanded and developed. Strictly speaking, it is an organized collection of program codes for solving of the aforementioned problems.
GEANT4 electromagnetic physics manages the electromagnetic interactions of leptons, photons, hadrons and ions. The package includes the processes of ionization, bremsstrahlung, multiple scattering, Compton and Rayleigh scattering, photo-electric effect, pair conversion, annihilation, synchrotron and transition radiation, scintillation, refraction, reflection, absorption and Cherenkov effect.
The basic requirements on the physics modeling of hadronic interactions in a simulation toolkit span more than 15 orders of magnitude in energy. The energy ranges from thermal for neutron cross-sections and interactions, through 7 TeV (in the laboratory) for LHC experiments, to even higher for cosmic ray physics. In addition, depending on the setup being simulated, the full range or only a small part might be needed in a single application. The complex nature of hadronic showers and the particular needs of the experiment require the user to be able easily to vary the models for particular particles and materials depending on the situation.
For calorimeter simulation at colliders, for example, pion-nucleus interactions are fundamental, and leading particle effects, transverse momentum distributions, inclusive cross-sections, and the prediction of nuclear excitation energies largely define the quantities of interest for measurement and detector design. When simulating backgrounds in the muon systems of the large LHC experiments, critical items are the production of muons in hadronic showers, as well as the simulation of punch-through and low energy neutron interactions.
Three classes of models are distinguished for modeling final states. There are models that are largely based on evaluated or measured data, models that are predominantly based on parameterizations and extrapolation of experimental data under some theoretical assumptions, and models that are predominantly based on theory.
Data-driven models When experimental or evaluated data are available with sufficient coverage, the data-driven approach is considered to be the optimal way of modeling. Data-driven modeling is used in the context of neutron transport, photon evaporation, absorption at rest, calculation of inclusive cross sections, and isotope production. We also use data-driven modeling in the calculation of the inclusive scattering cross sections for hadron-nucleus scattering. Limitations exist at high projectile energies, for particles with short life-times, and for strange baryons, as well as the K 0 system. Theory-based approaches are employed to extract missing cross sections from the measured ones, or, at high energies, to predict these cross sections. The main data-driven models in GEANT4 deal with neutron-and proton-induced isotope production, and with the detailed transport of neutrons at low energies.
Parameterized models Parameterizations and extrapolations of cross-sections and interactions are widely used in the full range of hadronic shower energies, and for all kinds of reactions. In GEANT4, models based on this paradigm are available for low and high particle energies, and for stopping particles. They are exclusively the result of re-writes of models available from GEANT3, predominantly GHEISHA [4]. They include induced fission, capture, and elastic scattering, as well as inelastic final state production. The LEP and HEP models in the GEANT4 package are also of this type.
Theory-based models Theory-based modeling is the basic approach in many models that are provided by GEANT4 or are under development. It includes a set of different theoretical approaches to describing hadronic interactions, depending on the addressed energy range and computing performance needs.
Parton string models for the simulation of high energy final states (E CMS > O (5 GeV)) are provided and in continuous development. Now diffractive string excitation [5,6] (FTF) and dual parton model [7,8] or quark gluon string [9] (QGS) model are used. String decay is generally modeled using well established fragmentation functions [10].
For intermediate energies, two intra-nuclear transport models are provided. The Bertini-style cascade [11][12][13][14] (BERT) is a combination of theoretically motivated concepts and parameterizations, which is commonly applied up to energies of 10 GeV. Particles in the cascade are propagated along straight-line paths and interact with nucleons in the nucleus according to the free-space cross sections. Details of the nuclear density and Pauli blocking are taken into account. The binary cascade [15] (BIC) is a time-like model which numerically solves the equations of motion of particles in the field of the nucleus. A detailed description of the nuclear density is used and scattering is done using smeared resonance cross sections, taking the local phase space and the Pauli principle into account. These features provide more detail, such as final-state correlations, at the expense of greater CPU time. This model is typically valid at energies below 5 GeV.
The last phase of a nuclear interaction is nuclear evaporation. In order to model the behavior of excited, thermalised nuclei, variants of the classical Weisskopf-Ewing model [16] are used. Fission [17], and photon evaporation (see Physics Reference Manual at the GEANT4 Web page [21] under Documentation) can be treated as competitive channels in the evaporation model. Internal conversion is used as a competitive channel in photon evaporation.
As an alternative for all nuclear fragmentation models, including evaporation models, the chiral invariant phase space (CHIPS) model is available. It is a quark-level 3dimensional, SU(3) ⊗ SU(3) symmetric event generator for fragmentation of excited hadronic [18] and nuclear [19] systems into hadrons. It is applied to a wide range of hadronand lepton-nuclear [20] interactions. It is released in the toolkit as a final state generator for the reactions of pion capture at rest, anti-proton capture at rest, a fragmentation model for photo-and electro-nuclear reactions, and as a nuclear fragmentation model for residual nuclei absorbing the soft particles from the Quark-Gluon String or diffractive string.
In the context of the large HEP experiments, it has become evident that all modeling techniques-data-driven, parameterized, and theory-driven are necessary to satisfy the needs for hadronic simulation in an optimal manner. Datadriven modeling is known to provide the best, if not only, approach to low energy neutron transport for radiation studies in large detectors. Parameterization driven modeling has proven to allow for tuning of the hadronic shower for particle energies accessible to test-beam studies, and is widely used for calorimeter simulation. Theory-driven modeling is the approach that promises safe extrapolation of results toward energies beyond the test-beam region, and allows for maximal extendibility and customisability of the underlying physics.
The software is freely available at source-code level over the Web [21]. The GEANT4 toolkit is nowadays used by the worldwide scientific community in diverse experimental domains. GEANT4 is used in production by large scale high energy physics experiments as well as in smaller scale detector development projects. It is also employed for accurate simulation in mission critical applications in space science and astrophysics, and is the basis for accurate calculations in medical physics, nuclear medicine and radiation protection.
The GEANT4 toolkit offers a variety of options for physics processes and models over a wide range of energies for electromagnetic and strong interactions. For the same combination of projectile and target at a given energy, there can be several models or processes applicable with different accuracy, strengths and computational cost. It is possible to create numerous configurations of models in order to address the needs of a particular use case.
Making an optimal selection of a set of models among those available can present a daunting learning curve, especially for hadronic interactions. The absence of a unique effective theory of hadronic interactions, which is calculable at relatively low energies, is a significant obstacle. In addition the variety of approaches even at high energies, where scaling formula and applicability of perturbative Quantum Chromodynamics (QCD) enable simulation, provide several competing approaches (chains, diffraction, hadronic clusters, quark-gluon strings, etc.). The low energy hadronic models, whose approximate validity is often restricted to particular incident particles, target material types, and interaction energies require special efforts from users. By using a consistent, tailored set of models it is possible to address the requirements of a particular use case. Choosing among the GEANT4 hadronic models is made easier by a number of Physics Lists which are included in the GEANT4 toolkit release. Each Physics List is a complete and consistent collection of models chosen to be appropriate for a given use case. Hadronic use cases relevant to high energy physics applications include calorimeters, trackers and a typical general-purpose detector. At low energy the use cases of neutron dosimetry applications and nucleon penetration shielding are covered. Results already obtained for several use cases and Physics List are available, many obtained by users, and provide invaluable reference points and benchmarks.
A very important matter has to do with sampling in the case where two models partially overlap one another in a given energy range, e.g. in the region 12-25 GeV/c between LEP and QGS, and in the region 9.5-9.9 GeV/c between BERT and LEP. Because the model predictions are different as a rule, we apply the following method as a compromise solution for a smooth transition: the models are sampled randomly assuming that a weight of low energy model is decreased linearly from 100% to 0% with laboratory momentum growth in the overlapping region, so the model valid on the lower side is always chosen at the lower bound of the overlap (and correspondingly the model valid on the upper side is always chosen at the top of the overlap range).
The other important question is the coupling of high energy models with intra-nuclear cascade models. High energy models (QGS and FTF) produce high energy part of secondary particles in hadron-nucleus interactions, and for a reproduction of the low energy part it is necessary to consider the formation time of the particles and their cascading into a nucleus. The last is performed at the coupling of the high energy models with the low energy model (BIC). The corresponding combinations are QGSB and FTFB. The first combination became available starting from GEANT4 version 9.1. The second combination is accessible at GEANT4 version 9.2.
There are also the combinations of the models without the secondary particle cascading. An excitation energy of a nuclear residual is calculated in them, and the evaporation or fragmentation model of nuclear relaxation is applied. The corresponding combinations are QGSP, FTFP, QGSC and FTFC.

Analysis of pion spectra
The HARP-CDP experimental group studied particle production in the interactions of π + -and π − -mesons at a laboratory momentum of 8.0 GeV/c, and protons at a laboratory momentum of 8.9 GeV/c, with beryllium nuclei. They compared their experimental data to the results of simulations using GEANT4 version 9.1. The Physics Lists they studied are presented in Tables 1 and 2 where we have marked by stars the models or combination of models which are actually invoked by the Physics Lists within the given energy ranges. As seen from Table 1, the low energy models of GEANT4 were applied in most of the Physics Lists for projectile protons of 8.9 GeV/c. The high energy FTF model was used in FTFC, FTFP and FTFP_BERT Physics Lists only.
The simulation results for the secondary mesons [1] using the LHEP, QGSC and QGSP Physics Lists were described there as unacceptable (see Table 2 of [1]). All of these Physics Lists use the low energy parameterized model (LEP). Thus, the source of the problem is in the model.
Considering the model code as a "black box", we have to produce and check various hypotheses on the code operation and the model implementation varying the conditions of the simulations. Here we have an advantage over a real experiment: we can change the beam energy and the target material quite easily. However, changing the model assumptions is more complicated.
Hypothesis 1: It is natural to assume that the observed peak-and-valley effect is caused by the nuclear environment of the multi-particle production process. In this case, it must be absent in hadron-nucleon interactions, and become more important as the target mass increases. Even though the beryllium nucleus is light, the hypothesis must be checked.
The calculations we present in Fig. 1a show that the structure is present in hadron-hadron interactions at projectile energies below 10 GeV when we choose the GEANT4 low energy parameterized model (LEP). It disappears at higher energies when the QGSP_BERT physics List is chosen, which is the preferred list of the major LHC experiments. Figure 1a shows the two-dimensional distribution of π + mesons within four-particle states resulting from ppinteractions at 8 GeV/c. Some structure clearly appears here, as it does in the π 0 distribution. The corresponding distributions for protons and neutrons have no such structure. The other low energy models, the Bertini model and the binary cascade model, do not predict any structure.

Hypothesis 2:
The location of the ridges in the Fig. 1a suggests that they are caused by the two-body kinematical decay [22] of resonances with no smearing of the longitudinal and transverse momenta. However, this must be rejected because the old Gheisha model, from which the GEANT4 model is translated, does not consider the production of mesonic and baryonic resonances. Only the production of π mesons and nucleons is taken into account.
So, neither hypothesis works, and we must go deeper into the Gheisha model. Now it can be easily done because "the effect" is presented in hadron-hadron interactions.
At a given interaction energy, the Gheisha model first determines the multiplicity of secondary particles. After that, each particle is assigned a transverse momentum, P T . It is assumed that the distribution of the momenta has the form dW ∼ P T exp(−BP 2 T ) dP T . In the following, the longitudinal momenta are sampled in the center-of-mass system of the colliding particles. It is clear from Fig. 1a, that the particles' P T -distribution has no special features, and obeys the exponential law. The longitudinal momentum (P L ) distribution and the corresponding Feynman x (x F ) distribution have irregularities at a fixed P T . In the model (see subroutine genxpt.f in the Gheisha code) the values of x F are sampled discretely rather than continuously. There are 20 such discrete values, leading to the irregular structure in the twodimensional distributions.
This sampling method is used in the toolkit for simulations of the hadron-nucleus interactions at low energies. Thus the structure exists for the collisions of hadrons with heavy nuclei as well as light nuclei. The average multiplicities of the produced hadrons are reproduced quite well, though the P L -P T spectra have the layer structure which is strongly distorted for interactions within a substance. We believe that this drawback is not reflected in most practical applications where thick targets are used. The effect is also hidden in one-dimensional distributions for thin targets (see the publications of the HARP collaboration [23,24]).
As shown in Table 2, the low energy model, LEP, was applied in most of the Physics Lists for projectile mesons with momentum 8.0 GeV/c. The high energy FTF model was used in FTFC, FTFP and FTFP_BERT Physics Lists only. The Bertini cascade model was used in the QBBC and QGSP_ BERT lists. The problems were observed (see Table 1 A list of the physics models utilised in different GEANT4 Physics Lists for proton interactions. For each Physics List, the combination of models and the full extent of their coverage are listed in the 2nd and third column respectively. The fourth and fifth column make these combinations explicit, listing where each applies. An explanation of the stars see in the text Proton beam  Models  Energy range   LHEP  HE_GHEISHA  25 GeV-100 TeV  HEP  55 GeV-100 TeV   LE_GHEISHA  0 GeV-55 GeV  HEP⊕LEP  25 GeV-    The parameterized model was recently improved by sampling x F from a smooth distribution instead of a discrete one. It gives an acceptable result as shown in Fig. 1b. The other low energy models of GEANT4, the binary cascade model [15] (BIC) and the Bertini model [11][12][13][14] (BERT), do not  predict any structure in the P L -P T spectra for the secondary mesons.

Physics List
We have checked that the high energy quark-gluon string (QGS) and Fritiof (FTF) models by themselves do not predict structure in the P L -P T distributions of mesons. Such structure could only appear by considering the low energy secondaries of the initial interaction which cascade through the nucleus.
The HARP-CDP group results are summarized in Table 2 of their paper [1]. There the authors rate the various comparisons of simulation with data, using the terms "unacceptable (diffr. patt.)" , "acceptable", and "poor (shape)". The definition "good" was given only to the QBBC Physics List proposed by the group in an application to the reaction π + A → π + X. The "poor (shape)" was ascribed to the results of the QGSP_BERT, QGSP_BIC and FTFP Physics Lists. An example of the "poor shape" for QGSP_BIC is shown in Fig. 8 of their paper. We note that such terms are qualitative and subjective. We also note that the comparisons were made between GEANT4 calculations and the raw experimental data of the HARP-CDP group. They state that "for the comparison of the shapes of inclusive particle spectra between data and Monte Carlo simulation-which is the purpose of this paper-it is, therefore, sufficient to compare data with Monte Carlo generated distributions, without correction of losses from acceptance cuts and of migration stemming from finite resolution". The published experimental data [23,24] relevant to the considered energy range are preented in Fig. 2 together with our calculations. 2 Certainly, the calculations are not in excellent agreement with the experimental data, but they do reproduce the main features. The agreement depends on the coupling of the high energy model with the cascade model which must handle the propagation of lower energy particles within the nucleus. This is currently an open question and work in this area is in progress. The data are very useful for the tuning of the model.

Analysis of proton spectra
The low energy models of GEANT4 (LEP, BERT, BIC) do not predict any structure for the proton spectra in the hadronnucleus interactions. Unacceptable results, narrow peaks for secondary low energy protons near θ ∼ 70°, were reported by the HARP-CDP group for the QBBC, FTFC, FTFP, and FTFP_BERT Physics Lists. The authors supposed that the results are connected with "the kinematics of elastic scattering of the incoming particle with a proton at rest". We checked this hypothesis by simulating p-Be interactions without the consideration of inelastic scatterings of the projectile in the nucleus, and without Fermi motion of the target nucleons. The corresponding distribution is presented in Fig. 3a. As seen, the figure shows a well-defined curve corresponding to the kinematics of two-body elastic scattering [22]. The group of points near the origin is due to evaporated nucleons.
In the second stage of our investigation, the Fermi motion was added (see Fig. 3b). It is natural that the curve seen in Fig. 3a should now be diffused. Usually the nuclear nucleons are assigned a Fermi momentum in a Monte Carlo calculation. Because the nucleons must be bound in the nucleus, they are not on the mass-shell. This means that the nucleons are given a mass m * = m N − P 2 F /2m N − 2 , where m N is the nucleon mass in the free state, and is the nuclear binding energy per nucleon. The nucleons return to the mass-shell during the subsequent interactions of the projectile. The projectile momentum is thus decreased by the quantity P L m N − m * , and the target nucleon receives the corresponding fraction of the longitudinal momentum. This shift along the P L -axis is small and not easily seen in Fig. 3b.
Inelastic interactions are added in the third stage (see Fig. 3c). The smoothness of the plot depends, obviously, on the assumed elastic and inelastic cross sections, and also on the magnitude of the Fermi momentum. Once all of these effects are taken into account, there is practically no structure in the P L -P T spectra.
The elastic scatterings of a projectile on the nuclear nucleons (quasi-elastic scatterings) were introduced into the GEANT4 high energy hadronic models in order to improve the agreement of simulated shower shapes with those measured in LHC test-beam experiments. This resulted in a significant improvement, although some questions remain. The first question has to do with the influence of the elastic scatterings on a projectile. This was considered many years ago by R.J. Glauber and G. Matthiae [25], and beautiful results were obtained for p-A interactions at 19.2 and 24 GeV/c [26,27]. The spectra of nucleons ejected from the nucleus were not considered at that time. Thus it was a second question for us-how to simulate the spectra of low energy ejected nucleons?
In the beginning we neglected Fermi motion and this was reflected in the proton spectra observed by the HARP-CDP group. This omission was corrected in the last version of the GEANT4 (release 9.2).
Accounting for the Fermi motion of the nucleon ejected in quasi-elastic scattering is not a simple one. According to the Glauber theory, one has to write exact wave functions of all the possible states of a nucleus. This can only be done for hadron-deuteron scattering because the deuteron has no bound excited states. For heavy nuclei, one has to use a phenomenological approach like that presented above.
In addition, we note that there is no unique method of accounting for the Fermi motion in inelastic hadron-nucleus interactions which is completely correct from a theoretical point of view. Such a method would be equivalent to a solution of the many-body problem of nuclear reactions. Thus all practical methods can be criticized.
The program implementation now assigns the Fermi momentum to the involved target nucleons, and allow them to take part in the following intra-nuclear cascading. The analogous method is applied in many other high energy Monte Carlo generators. We have also introduced Fermi motion in other Monte Carlo codes for the 9.2 release of GEANT4.
Looking carefully at Fig. 3c, one can see a depletion region restricted by the condition 100 ≤ P 2 T + P 2 L ≤ 400 (MeV/c). The upper bound (400 MeV/c) corresponds to protons of 80 MeV kinetic energy. It is assumed in the GEANT4 pre-compound model (PRECO) of the nuclear residual excitation energy calculation that nucleons with energies below 80 MeV are captured by the residual nucleus, thus removing nucleons from the plot. The lower bound of the region depends on the assumed excitation energy of the nuclear residual. Thus, the coupling of high-and low-energy particle cascades in nuclei can bring, in principle, some irregularities in the P L -P T spectra.
Another source of irregularities in the P L -P T spectra is the unsolved question of how to treat low-mass strings. It is usually assumed at high energies that quark-gluon string creation takes place in the hadron-hadron interactions. If all strings have large masses, they can fragment in the usual manner. If some of the strings have low masses, which happens very often in hadron-nucleus interactions, they must be considered separately.
If a string mass is near the mass of a corresponding hadron with the same quark content, it is natural to consider the string as the hadron, and ascribe to it the corresponding mass. In this case it is obvious that energy-momentum conservation cannot be satisfied.
If the mass of the string is such that the production of light mesons is possible, an isotropic decay of the string into light particles is simulated as usual. A sharp boundary between the two cases leads to irregularities in the P L -P T spectra. This question is currently unsolved and we will not present the corresponding calculations. We believe that it can arise only in special cases.

Conclusion
The HARP-CDP group has performed important and useful tests of several GEANT4 models and identified obvious problems in secondary particle generation. We have shown that these problems are not due to faults in the physical models used in GEANT4, but are in fact caused by incorrect programming solutions to problems inherited from very early versions of the code. The defects have been repaired in recent versions of the GEANT4 toolkit. We have checked that the identified defects have no effect on the simulation of calorimeter detectors, and we believe that most other practical applications will also not be affected. We point out some remaining theoretical problems, but none of these pertain to the present tests.