Reconstruction of 400 GeV/c proton interactions with the SHiP-charm project

The SHiP-charm project was proposed to measure the associated charm production induced by 400 GeV/c protons in a thick target, including the contribution from cascade production. An optimisation run was performed in July 2018 at CERN SPS using a hybrid setup. The high resolution of nuclear emulsions acting as vertex detector was complemented by electronic detectors for kinematic measurements and muon identification. Here we present first results on the analysis of nuclear emulsions exposed in the 2018 run, which prove the capability of reconstructing proton interaction vertices in a harsh environment, where the signal is largely dominated by secondary particles produced in hadronic and electromagnetic showers within the lead target.


Introduction
The SHiP-charm project [1] aims at measuring the differential charm production cross section in a thick target, including the enhancement due to cascade production, which has never been measured so far.Elastic scattering followed by a deep inelastic interaction is the main source of this enhancement [2].The accurate prediction of charm hadroproduction rates is an essential ingredient to establish the sensitivity of a high-intensity proton beam dump experiment like SHiP (Search for Hidden Particles) [3] to new particles produced in charm decays and to make a precise estimation of the tauneutrino flux at SHiP.
An optimization run was performed in July 2018 at the H4 beam line of CERN SPS/North Area, in the same location used for the muon flux measurement [4].A thick target made of lead interleaved with nuclear emulsions was exposed to a 400 GeV/c proton-beam.The detector is a hybrid The challenge of the SHiP-charm measurement is twofold: reconstruct tracks and interaction vertices in a highdensity environment and search for rare decays of charmed hadrons.The track reconstruction and matching between emulsion and silicon pixel detectors in this optimization run has been described elsewhere [5].Here we focus on the identification of interaction vertices, whose success is a prerequisite for subsequent phases of the analysis.
The work presented in this paper not only contributes to understanding and addressing challenges in track and vertex reconstruction in high-density environments but also establishes the groundwork for broader applications in similar experimental contexts.In particular, our approach is of significant interest to the SND@LHC experiment [6,7], currently collecting data at CERN, which aims to study high-energy neutrinos produced by accelerators for the first time.Furthermore, our work may be of interest to the SHiP@ECN3 [8] experiment proposal, which includes a neutrino detector for conducting high-statistics studies of neutrinos.Both experiments operate in high-density environments, up to 10 6 tracks/cm 2 , primarily due to background tracks, all nearly incident at the same angle.This aspect differentiates these experiments from SHiP-charm and introduces additional challenges related to alignment and tracking accuracy, challenges not present in SHiP-charm, where high density is associated with particles produced in secondary interactions and electromagnetic showers.These connections with current experiments and future proposals underscore the broad relevance and potential applications of the methodology developed in our work.

Detector layout
The detector layout of the SHiP-charm experiment was optimised in order to provide full topological and kinematic reconstruction of the event.A picture of the overall setup installed in the H4-PPE134 experimental area is shown in Figure 1.
The topological reconstruction of proton interactions and the identification of charmed hadron decay vertices is performed within the target, which exploits the submicrometer and milliradian resolution of nuclear emulsions.
The target is constructed according to the Emulsion Cloud Chamber (ECC) technique, alternating 1 mm-thick passive material plates with 330 µm emulsion films.The ECC was placed on a motorised mechanical stage in order to ensure a uniform distribution of the proton beam over the whole emulsion surface of 125×100 mm 2 .A schematic drawing and a picture of the target mover are shown in Figure 2.During each spill the target moves along the horizontal axis (x) at the uniform speed of 2.6 cm/s, thus covering the horizontal dimension of the ECC.Between two consecutive spills the target moves along the vertical axis (y) by 1 or 2 cm, depending on the expected track density in different target configurations.The total target surface is consequently covered in 5 or 10 spills, respectively.
A magnetic spectrometer is located downstream of the target.The magnetic field is provided by the GOLIATH magnet [9], located in PPE134 area.In order to cope with the high multiplicity of tracks produced in each proton interaction, the upstream station is required to be highly segmented and withstand a high occupancy.Insertable B-Layer (IBL [10]) hybrid silicon pixel detectors were used for this purpose.Pixels have a size of 250 × 50 µm 2 ; pixel modules consist each of a planar sensor and two custom developed large FE-I4 front-end chips [11] with a sophisticated readout architecture.Each sensor is made of 160 columns and 336 rows, resulting in 53760 pixels.The pixel tracking station is made of six planes equipped with IBL double-chips modules.Every second plane is rotated by 90 • in order to provide a 50µm position accuracy in both coordinates.The upstream station covers a transverse area of about 33.6 × 37.0 mm 2 , sufficient to contain the beam spot and proton interaction products passing through the lead-emulsion target.The matching between the emulsion target and the silicon pixel detector was successfully performed [5].
The downstream station is made by a combination of two different technologies: Scintillating fibers (SciFi) (T3s and T4s) in the central 40 × 40 cm 2 region, where the track density is higher, and drift tubes (T3 and T4) in the outer region.The T3 and T4 stations consist each of four detection planes to provide XU and YV coordinates, where U and V planes have a stereo angle of ∼2.5 • with respect to X an Y, respectively.Each detector plane is made by 3 × 12 cm-wide mats of scintillating fibers [12].A mat is a matrix structure consisting of six staggered fibre layers with a horizontal pitch of 270 µm and a total length of 40 cm.
While the SciFi stations were built for the purpose of this measurement, drift tube chambers were adapted from modules built for the OPERA experiment [13].T3 and T4 stations provide the x-coordinate information in the external region downstream of the GOLIATH magnet.Drift tube modules were installed on both sides and above the region covered by the SciFi stations.
The most downstream component of the experiment is the Muon Filter, which is designed to identify muons with high efficiency, separating them from charged hadrons.At the same time, it has to reconstruct the muon track slope to match the corresponding track reconstructed in the upstream Magnetic Spectrometer and assign the momentum to the muon track.The Muon Filter consists of five concrete slabs, two 80 cm-thick and three 40 cm-thick, acting as hadron absorbers, interleaved with five Resistive Plate Chambers (RPCs), acting as trackers.The transverse size of the RPC planes is 195 × 125 cm 2 .The muon identification is performed on the basis of the number of crossed layers in the detector.The RPCs were designed and constructed to operate in avalanche mode, with a time resolution of about 1 ns.Two orthogonal sets of strips, 1 cm-wide, are used for 2D measurements with a position resolution of about 3 mm in both directions.

Data taking and simulation
The SHiP-charm optimisation run was performed in July 2018.The target was assembled in six different configurations in order to study the production of charmed hadrons at different depths, up to a total thickness of 280 mm, corresponding to about 1.6 interaction lengths.The most downstream section of the target is instrumented with nuclear emulsions (the ECC) and moved by the motorised stage.
Upstream of the ECC, lead blocks with lengths from 28 to 244 mm are positioned to act as a pre-shower, according to the scheme shown in Figure 3. Hereafter the six target configurations will be referred to as CHARMn, with n ranging from 1 to 6.
The ECC target of CHARM1 and CHARM2 is made of a sequence of 29 emulsion films alternated with 28 passive layers, while for configurations from CHARM3 to CHARM6 it consists of 57 emulsion films and 56 passive layers.Multiple runs were performed for the different configurations (CHARMx-RUNm, with m ranging from 1 to 6) in order to accumulate enough statistics in each portion of the target.A total number of 15.6 × 10 5 p.o.t. were integrated during the whole exposure.All runs used lead as passive material, except for the sixth run of CHARM1, which used 1 mm-thick tungsten layers.The proton beam intensity was measured by a beam counter located upstream of the target region.The temporal structure of the beam was consistent during the whole exposure, with a spill duration of 4.8 s.Its intensity, however, showed fluctuations from 7.7 × 10 3 to 13.8 × 10 3 protons/spill.The profile of the beam during the spill was monitored by the pixel station.The beam profile recorded in one spill is shown in Figure 4.The beam spot integrated during the spill has a transverse size of about 6 × 15 mm 2 .The elliptical shape is due to a translation of the beam center-of-gravity within the spill.
The SHiP-charm experimental apparatus was reproduced within the FairShip software, the official SHiP simulation framework derived from FairRoot [14], as shown in Figure 5.The geometry and the position of different sub-detectors were set taking into account measurements performed in situ by the CERN survey team.The magnetic-field map measured by the CERN staff in 2017 [9] was imported in the simulation of the GOLIATH magnet.The simulation of 400 GeV/c proton interactions within the target and the propagation of particles in detector materials is performed with GEANT4 [15,16], that was validated to provide reliable estimates in the SHiP-charm energy regime [4,17].Different simulation campaigns were performed in order to reproduce the six target configurations.
4 Data analysis

Track reconstruction in nuclear emulsions
The track left by a charged particle in an emulsion layer is recorded by a series of sensitised AgBr crystals, growing up to 0.6 µm diameter during the development process.Optical microscopes analyse the whole thickness of the emulsion, acquiring tomographic images at equally spaced depths.The acquired images are digitized, then an image processor recognizes the grains as clusters, i.e. groups of pixels of given size and shape.Then, a track in the emulsion layer (usually referred to as micro-track) is obtained connecting clusters belonging to different levels, as shown in the left panel of Figure 6.Since an emulsion film is formed by two emulsion layers, the connection of the two micro-tracks through the plastic base provides a reconstruction of the particle's trajectory in the emulsion film, called base-track.The reconstruction of particle tracks in the full volume requires connecting base-tracks in consecutive films.In order to define a global reference system, a set of affine transformations has to be computed to account for the different reference frames used for data taken in different films.Once all emulsion films are aligned, volume-tracks (i.e., charged tracks which crossed several emulsion films) can be reconstructed.The track finding and fitting is based on the Kalman-Filter algorithm and takes into account possible inefficiencies in the base-track reconstruction [18].
The vertex identification is initiated by two-track vertices defined according to minimal distance criteria.Topological cuts are used in order to reduce the combinatorial background.The final selection on the track pairs is based on a vertex probability calculated with the full covariance matrix of the involved tracks.Starting from pairs, n-tracks vertices are constructed using the Kalman-Filter technique.The off-line reconstruction tool used in the analysis reported in this document is FEDRA (Frame-work for Emulsion Data Reconstruction and Analysis) [19], an object-oriented tool based on C++ language and developed in the ROOT [20] framework.
The analysis of emulsion films was performed in dedicated laboratories in Naples and Zurich equipped with a new generation of optical microscopes, one of which is shown in the right panel of Figure 6.A recently developed upgrade of the European Scanning System (ESS) [21][22][23] was used.The use of a faster camera with smaller sensor pixels and a higher number of pixels combined with a lower magnification objective lens, together with a new software LASSO [24,25] has allowed to increase the scanning speed to 180 cm 2 /h [26], more than a factor ten larger than the previous generation.

Proton-beam characterisation
The number of protons impinging on ECC target units vary from 10 2 /cm 2 to 10 3 /cm 2 according to the configuration of the exposure.The data analysis shows that the track density increases with the depth in the brick due to the proton interactions, hadronic reinteractions and electromagnetic showers.The density of segments reconstructed in a single emulsion film extends up to 4.5×10 4 /cm 2 .
Figure 7 shows the characterisation of the proton beam in one of the ECC targets both in terms of angle (left) and position (right).The pattern observed in the position distribution reproduces the movement of the target with respect to the proton beam.The base-track efficiency is shown in Figure 8 as a function of the film number in one of the most upstream configurations.The average base-track efficiency is higher than 90%.A slight decrease in the efficiency is observed in downstream configurations due to higher track density.

Interaction-vertex identification
Several thousands of proton interaction vertices are expected in a single target unit (∼10 3 cm 3 ).400 GeV/c proton interactions produce on average more than ten charged particles and as many photons, having energies up to ∼50 GeV.This results in a large number of secondary hadronic reinteractions and electromagnetic showers, that increases the number of reconstructed vertices by two order of magnitudes.To set the scale, the unitary cell of the OPERA experiment [27,28] contained only a single neutrino-interaction vertex in the same volume.
The analysis of the SHiP-charm emulsion data therefore required the development of dedicated software and analysis tools to extract the signal from an unprecedented background rate.The main background source consists of low energy particles produced in hadronic and electromagnetic showers originated in primary protons interaction and in subsequent re-interactions downstream of the primary vertex.The yield of background vertices in the reconstructed sample is more than one order larger than the signal and shows an increasing trend in downstream configurations.
A full Monte Carlo simulation was performed in order to have a training sample that accurately reproduced data.The tracking and vertexing algorithms described in Section 4 were applied both on simulated and real data.
A multivariate classification was performed using boosted decision trees from the TMVA toolkit [29] to distinguish a signal of true interaction vertices from the background.The background is mainly due to the random combination of low-momentum tracks and electromagnetic showers that crowd the ECC volume.Five discriminating variables were selected: vertex probability, as provided by the fit procedure angular distance between tracks associated to the vertex mean impact parameter of tracks at the vertex maximum impact parameter of tracks at the vertex fill factor of tracks at the vertex, defined as the ratio between the number of base-tracks building up the track and the number of emulsion films downstream of the vertex.
Other variables were tested, such as the average slope of tracks associated to the vertex and distribution of tracks in the transverse plane, and resulted not to have good performances in the signal-to-background discrimination.
The left panel of Figure 9 shows the above mentioned variables for the training sample.The output of the BDT (V bdt ) is shown in the right panel of Figure 9: a good separation between signal and background distributions is observed.The final selection of the signal component is performed on the variable R sel , defined as the ratio between (1-V bdt ) and the track multiplicity at the reconstructed vertex.The distribution of R sel variable is shown in Figure 10 for data and simulation.The signal component is confined in the region R sel < 0.1, where a fairly good agreement between data and simulation is observed.The excess in the data for higher R sel values is due to very low (n < 4) multiplicity vertices that are mainly made of random combination of instrumental background tracks.This background component, indeed, is not included in the current version of the simulation software.The cut on the R sel variable was optimised in order to maximise the background rejection while keeping an high signal selection efficiency.Vertices having R sel < 0.05 are classified as interaction-vertex candidates.The angular distribution of tracks associated to interactionvertex candidates is shown in Figure 11.A good agreement is observed, both in normalisation and shape, thus validating the Monte Carlo simulation and the signal selection procedure.Fig. 11: Angular distribution of tracks associated to interaction-vertex candidates.The inset shows the region with slopes smaller than 0.014 rad.

Interaction-vertex characterisation
The reconstructed position of interaction-vertex candidates along the beam direction for the most upstream and the most downstream configuration is shown in Figure 12.The most upstream configuration shows very good agreement between data and Monte Carlo, both in normalisation and shape.A discrepancy between data is observed in downstream configurations which is due to inefficiencies in track reconstruction that affect the overall number of selected vertices without introducing relevant biases in the variables that characterise interaction-vertex candidates.The origin of the discrepancies are mainly related to an higher track density in downstream configurations, mainly coming from the overlap of hadronic and electromagnetic showers started in upstream regions.The presence of a large number of low-energy tracks can spoil the alignment precision between consecutive films, thus causing a discrepancy with respect to simulations, where those effects are not taken into account.
The signal sample selected with the above mentioned procedure is made of two components: primary protons interaction vertices and hadron re-interaction vertices.A display of a Monte Carlo event containing both vertex categories is shown in Figure 13.
The interaction vertex multiplicity for the most upstream and the most downstream configuration is shown in     In order to merge data reconstructed in different configurations, inefficiencies were corrected by applying a normalisation factor, which also scaled all data to the same number of incoming protons on target.
By adding data reconstructed in different runs and combining the six configurations it is possible to retrieve the overall distribution of interaction-vertex candidates in a ∼365 mm long emulsion/lead target.The overall distribution is shown in Figure 15 for data and simulation.Error bars on data points are obtained propagating the covariance matrix of the original histogram with the efficiency correction factor.
The distribution shown in Figure 15 is made by the sum of two components: primary protons and hadron reinteractions.While the primary-proton component follows an exponential distribution, hadron reinteractions can be parametrised as a second-order polynomial.A Chi-square fit was therefore performed on data points with an exponential function and a 2 nd degree polynomial.The area under the two curves results to be 58% and 42%, respectively.
The exponent of the exponential function provides an estimation of the proton interaction length in the emulsion/lead target of Errors are purely statistical.This result is compatible with expectations from the full simulation, that predicts an interaction length of (175±5) mm.

Conclusions
The analysis of the SHiP-charm emulsion data required the development of dedicated software and analysis tools to extract the signal from an unprecedented background rate.A good agreement between data and Monte Carlo expectations is found for the number of charged tracks defining the interaction vertex and the position of the vertex along the beam axis.These results prove the capability to reconstruct interaction vertices in a harsh environment.
The development of a Monte Carlo simulation that accurately described reconstructed data and the application of multivariate analysis techniques allowed to extract the primary proton interaction component in a ∼365 mm long emulsion/lead target and to evaluate the effective interaction length.The results to be in good agreement with expectations.Fig. 15: Position distribution of interaction-vertex candidates along the beam direction for data and Monte Carlo, merging results from the different configurations.Primary-proton and hadron-reinteraction components are shown in red and blue, respectively.The dashed line represents the fit to data points.Empty bins refer to regions where data points are not available since they correspond to marginal regions of consecutive target configurations where the vertex reconstruction is not possible due to geometrical acceptance . 20College of Industrial Technology, Nihon University, Narashino, Japan 21 Toho University, Funabashi, Chiba, Japan 22 Physics Education Department & RINS, Gyeongsang National University, Jinju, Korea 23 Gwangju National University of Education e , Gwangju, Korea 24 Jeju National University e , Jeju, Korea 25 Korea University, Seoul, Korea 26 Sungkyunkwan University e , Suwon-si, Gyeong Gi-do, Korea

Fig. 1 :
Fig. 1: Lateral view of the experimental apparatus for the charm measurement.The red arrow represents the beam direction.

Fig. 2 :
Fig. 2: Left: technical drawing of the target mover.Right: picture of the mechanical stage during a test exposure of an ECC target.

Fig. 4 :
Fig. 4: Left: beam profile in the transverse plane, as registered by the pixel detector in the sixth spill of CHARM2-RUN1.Right: position of the beam center-of-gravity as function of time during the spill.

Fig. 6 :
Fig. 6: Left: schematic layout of a nuclear emulsion film.Right: one of the optical microscopes used for the analysis of nuclear emulsions exposed in the SHiP-charm project.

Fig. 7 :
Fig. 7: Left: angular dispersion of the proton beam as reconstructed in one of the exposed ECC target units.Right: position distribution of incoming protons on the emulsion surface.

Fig. 8 :
Fig. 8: Film-by-film base-track efficiency as a function of the film number for reconstructed protons in CHARM1-RUN2 configuration.The average efficiency, amounting to 92 ± 2 %, is shown as horizontal red line.

Fig. 9 :
Fig. 9: Left: distribution of input variables used in the multivariate analysis.Right: output value of the BDT for signal (blue) and background (red).

Fig. 10 :
Fig. 10: Distribution of the R sel variable for data and simulated signal and background vertices.
Figure 14.The contribution of the primary proton and hadronreinteraction components is shown separately.As one might expect, the hadron-reinteraction component increases as the configuration number increases, going from 11% in CHARM1 to 59% in CHARM6.

Fig. 12 :
Fig. 12: Vertex position along the beam direction for interaction-vertex candidates reconstructed in CHARM1-RUN2 (left) and CHARM6-RUN1 (right).Data and Monte Carlo distributions have been normalised to the number of p.o.t.integrated in the analised run.

Fig. 13 :
Fig. 13: Display of a reconstructed Monte Carlo event where both the primary-proton interaction vertex and an hadronreinteraction vertex are reconstructed.

Fig. 14 :
Fig. 14: Charged track multiplicity for interaction-vertex candidates reconstructed in CHARM1-RUN2 (left) and CHARM6-RUN1 (right).Data and Monte Carlo distributions have been normalised to the number of p.o.t.integrated in the analysed run.