Comparison of Geant4 hadron generation with data from the interactions with beryllium nuclei of +8.9 GeV/c protons and pions, and of -8 GeV/c pions

Hadron generation in the Geant4 simulation tool kit is compared with inclusive spectra of secondary protons and pions from the interactions with beryllium nuclei of +8.9 GeV/c protons and pions, and of -8.0 GeV/c pions. The data were taken in 2002 at the CERN Proton Synchrotron with the HARP spectrometer. We report on significant disagreements between data and simulated data especially in the polar-angle distributions of secondary protons and pions.


INTRODUCTION AND MOTIVATION
The HARP experiment arose from the realization that the differential cross-sections of hadron production in the collisions of few GeV/c protons with nuclei were known only within a factor of two to three. Consequently, the HARP spectrometer was designed to carry out a programme of systematic and precise measurements of hadron production by protons and pions with momenta from 3 to 15 GeV/c. The experiment was in operation at the CERN Proton Synchrotron in 2001 and 2002, with a set of stationary targets ranging from hydrogen to lead, including beryllium.
The data from the HARP spectrometer can be used, amongst other purposes, for the physics validation of hadron generators that are used in simulation tool kits such as Geant4 [1]. This is of interest for the correct interpretation of data that will be forthcoming, e.g., from experiments at the LHC [2].
In this paper, data are used from the HARP large-angle spectrometer that comprised a cylindrical Time Projection Chamber (TPC) and an array of Resistive Plate Chambers (RPCs) around the TPC. The purpose of the TPC was the measurement of the transverse momentum p T and of the polar angle θ of tracks, and particle identification by dE/dx. The purpose of the RPCs was a complementary particle identification by time of flight.
The data analysis that underlies the spectra shown in this paper rests on the calibrations of the TPC and the RPCs that our group published in Refs. [3] and [4]. For a more detailed account of our calibration work we refer to our collection of memos and analysis notes [5]. We recall that we disagree with the calibrations and physics results reported by the 'HARP Collaboration', as discussed in Refs. [6] and [7].
With a view to correcting for losses of secondary particles from acceptance cuts, and for migration due to finite detector resolution, the measurement of tracks in the detector must be simulated with a Monte Carlo program. We use the Geant4 tool kit for this purpose. It was at this point that we noticed peculiar structures in the polar-angle spectra of secondary particles generated by Geant4's LHEP 'physics list' that prevented the weighting of generated tracks by smooth functions. Further investigations showed that this is a rather common phenomenon across Geant4's hadronic physics lists. Figure 1 shows a typical example of an unphysical structure in generated longitudinal momentum p L versus transverse momentum p T of secondaries. Since the structure is genuinely connected with the polar angle θ, it tends to be washed out when integrating over either p L or p T 1) . That may explain why-as nearly as we can tell-these structures were not noticed before.

HADRON GENERATORS IN THE GEANT4 SIMULATION TOOL KIT
The Geant4 simulation tool kit provides several physics models of hadronic interactions of hadrons with nuclei, and collections of such models, termed physics lists. The latter are tailored with a view to optimizing performance for specific applications. Table 1 lists and characterizes a representative selection of physics lists of hadronic interactions in Geant4 2) , together with the used physics models and the energy ranges where the latter are considered to be reliable [8].
In the so-called 'low-energy' domain (defined as kinetic energy E of the incoming hadron below some 25 GeV), a modified version of the GHEISHA package of Geant3 is used in many physics lists: the Parametrized Low-Energy Model ('LE GHEISHA'). Optionally, for E below 1) All physical quantities in this paper refer to the laboratory system.  versus transverse momentum p T , as generated by Geant4's LHEP physics list for secondary π + from the interactions of +8.9 GeV/c beam π + with beryllium nuclei at rest. a few GeV, the Bertini Cascade [9] ('BERT') or the Binary Cascade [10] ('BIC') models can be enabled, with a view to simulating the cascading of final-state hadrons when they move through nuclear matter. As an alternative to LE GHEISHA, a modified version of the FRITIOF string fragmentation model [11] ('FTF') is available.
In the so-called 'high-energy' domain, mostly the Quark-Gluon String Model ('QGSM') is used, with FTF and the Parametrized High-Energy Model ('HE GHEISHA') as alternatives. Further terms that appear in Table 1 and are explained in Ref. [8], are 'PRECO' for the Precompound model, 'QEL' for the Quasi-elastic scattering model, and 'CHIPS' for the Chiral Invariant Phase Space model.
The energy ranges of models tend to overlap. In the overlap region, the model is chosen randomly but the choice is biased by the difference between the kinetic energy of the beam particle and the kinetic energy limits of the models.
Below, we compare the predictions of Geant4 hadronic physics lists with our data: the inclusive proton, π + and π − spectra that are generated by the interactions with beryllium nuclei of +8.9 GeV/c protons and π + , and of −8.0 GeV/c π − .

THE HARP LARGE-ANGLE SPECTROMETER 3.1 Physics performance
For the purpose of this paper, the essential physics performance parameters are the resolution and the scale of the transverse momentum p T of final-state particles, the resolution and the scale of the polar angle θ, and the separation of pions from protons. We briefly give evidence of the salient features, and refer the reader to our respective technical publications [3,4] for details.
The resolution of the inverse transverse momentum measured by the TPC depends slightly on the relative velocity β and on θ of the particles. It is in the range 0.20 < σ(1/p T ) < 0.25 (GeV/c) −1 . Figure 2 shows the difference of the inverse transverse momentum of positive   From the requirement that π + and π − with the same RPC time of flight have the same momentum, the momentum scale is determined to be correct to better than 2%, for both positively and negatively charged particles.
The polar angle θ is measured in the TPC with a resolution of ∼9 mrad, for a representative angle of θ = 60 • . To this a multiple scattering error has to be added which is ∼7 mrad for a proton with p T = 500 MeV/c and θ = 60 • , and ∼4 mrad for a pion with the same characteristics. The polar-angle scale is correct to better than 2 mrad.
As for the separation of pions from protons: the resolution of dE/dx in the TPC is 16% for a track length of 300 mm, and the system time-of-flight resolution is 175 ps. Figure 3 (a) shows the specific ionization dE/dx, measured by the TPC, and Fig. 3

Acceptance and migration
We discuss differences of inclusive spectra between data and simulated data in terms of the distribution in the polar angle θ, for different ranges of p T . The primary reason for this choice is that disagreements show up most clearly in θ. At the same time, θ is a well-measured experimental quantity. We also consider that the θ distribution provides the clue to the origin of the disagreements.
The physics performance parameters in the transverse momentum p T and polar angle θ are so good that finite resolution, or a small dependence of acceptance cuts on p T or θ, does not appreciably affect the comparison of data with simulated data (the chosen ranges of p T exceed by a factor of two or more the p T resolution, and the chosen bin size of 2 • (35 mrad) of θ exceeds by a factor of two the θ resolution). That is substantiated in Fig. 4 which shows examples of the dominant disagreements between data and simulated data: patterns reminiscent of diffractive scattering [ Fig. 4 (a)] and of elastic scattering [ Fig. 4 (b)]. The full lines show the Monte Carlo generated polar-angle distributions, while the crosses show the same after acceptance cuts, particle identification cuts, and with resolution effects included. Obviously, the structures seen in the simulated data are not appreciably altered by experimental acceptance and resolution.
We conclude that for the comparison of the shapes of inclusive particle spectra between data and simulated data-which is the purpose of this paper-it is sufficient to compare data with Monte Carlo generated distributions, without correction of losses from acceptance cuts and of migration stemming from finite resolution. Also, the comparison is intentionally restricted to kinematical regions where there is ample and unambiguous separation of pions from protons 3) . To identify a secondary particle as a pion or as a proton it is required that the measured dE/dx and time of flight are both consistent with the given particle hypothesis. In addition, the time of flight has to be inconsistent with the opposite hypothesis. For particles without either dE/dx or time-of-flight measurement, the cut on the available variable is tightened. Within the ac- cepted phase space, the particle identification efficiency is between 70% and 90%, while the contribution from wrong particle identification is below 5%.

DATA VERSUS SIMULATION FROM SELECTED GEANTPHYSICS LISTS
The combination of the choice of hadron generators with the choice of incoming beam particles and the choice of secondary hadrons, leads to a large a number of possible plots. With a view to simplifying matters, we select for several physics lists two plots each that are representative for the agreement and disagreement, respectively, between data and simulation. In all plots, we compare the Monte Carlo-generated θ distribution with the θ distribution of data. Positive beam particles have +8.9 GeV/c momentum, and negative beam particles have −8.0 GeV/c momentum. The target is a 5% λ abs thick stationary beryllium target. In Figs. 5 to 10, data are shown as crosses while simulated data are shown as full lines. As justified in Section 3.2, the data are not corrected for losses from acceptance cuts and migration stemming from finite resolution. With a view to emphasizing shape differences, the data are normalized to the simulated data in the angular range 20 • < θ < 125 • . Figures 5 to 10 show the comparisons for several physics lists from Table 1: LHEP, QGSC, QGSP BERT, QGSP BIC, QBBC, and FTFP.
There are three distinct problems visible in the comparison of θ distributions of data and simulated data: -an unphysical peak for secondary protons near θ = 70 • ; -an unphysical diffraction-like pattern for secondary pions; -a poor agreement in the shape. The different physics lists behave differently with respect to the type of disagreement, yet the unphysical peak for secondary protons near θ = 70 • , and the unphysical diffraction-like pattern for secondary pions appear as dominant problems.      With a view to elucidating the physics origin of these dominating problems, we examine more closely for the LHEP physics list the disagreements between data and simulated data for different combinations of incoming and secondary particle. Figure 11 shows for a specific range of p T all combinations of incoming and secondary protons, π + and π − . Fig. 11: LHEP physics list; polar-angle distributions of protons (left panels), π + (middle panels), and π − (right panels), for incoming protons (top row), incoming π + (middle row), and incoming π − (bottom row). Figure 12 presents for the LHEP physics list in four ranges of p T the spectra of secondary pions from incoming protons, and Fig. 13 the same for secondary protons from incoming pions. The conclusions from the comparison of data with simulated data are summarized in Table 2. We qualify the various physics lists in the order 'good', 'acceptable', 'poor', 'unacceptable'. Note that our qualification refers to the restricted polar-angle range 20 • < θ < 125 • .
None of the standard physics lists of Geant4 is qualified 'good'. The unphysical peak in the distribution of secondary protons around 70 • is consistent with the kinematics of elastic scattering of the incoming particle with a proton at rest. The diffractionlike pattern in the distribution of seconday pions is consistent with diffractive scattering of the incoming particle on a stationary disc with the diameter of a nucleon. We conjecture that the differences between data and simulated Geant4 data arise dominantly from an inadequate description of the elastic scattering and diffraction scattering of incoming beam particles on nucleons embedded in a nucleus.
For the analysis of our data, we have used for incoming beam protons the QGSP BIC physics list and see no strong reason to reconsider this choice. For incoming beam pions, none of the standard physics lists for hadronic interactions was acceptable, so we had to build our private HARP CDP physics list. This physics list starts from the QBBC physics list (see Table 1). Yet the Quark-Gluon String Model is replaced by the FRITIOF string fragmentation model for kinetic energy E > 6 GeV; for E < 6 GeV, the Bertini Cascade is used for pions, and the Binary Cascade for protons; elastic and quasi-elastic scattering is disabled 4) . We qualify our HARP CDP physics list as 'acceptable' in all four categories listed in Table 2.

SUMMARY
We have presented significant disagreements in the laboratory polar-angle distributions between data from the HARP large-angle spectrometer, and data simulated by various physics lists in the Geant4 simulation tool kit. An unphysical peak for secondary protons near θ = 70 • , and an unphysical diffraction-like pattern for secondary pions appear as dominant problems. 4) Our Monte Carlo simulation will be discussed in the necessary detail in a forthcoming paper.