The aim of the SHiP experiment [1] is to search for very weakly interacting particles beyond the Standard Model which are produced by the interaction of 400 GeV/c protons from the CERN SPS with a beam dump. The SPS will deliver \(4\times 10^{13}\) protons on target (POT) per spill, with the aim of accumulating \(2\times 10^{20}\) POT during five years of operation. The target is composed of TZM (titanium-zirconium doped molybdenum, \(3.6 \lambda \)), W (\(9.2 \lambda \)) and Ta (\(0.5 \lambda \)), with thicknesses given in units of the nuclear interaction length \(\lambda \). This composition increases the charm cross-section relative to the total cross-section and reduces the probability that long-lived hadrons decay.

An essential task for the experiment is to keep the Standard Model background level to less than 0.1 event after \(2~\times ~10^{20}\) POT. About \(10^{11}\) muons per spill will be produced in the dump, mainly from the decay of \(\pi , K, \rho , \omega \) and charmed mesons. These muons would give rise to a serious background for many hidden particle searches, and hence their flux has to be reduced as much as possible. To achieve this, SHiP will employ a novel magnetic shielding concept [2] that will suppress the background by five orders of magnitude. The design of this shield relies on the precise knowledge of the kinematics of the produced muons, in particular the muons with a large momentum (>100 GeV/c) and a large transverse momentum (>3 GeV/c) as they can escape the shield and end up in the detector acceptance.

To validate the muon spectrum predicted by our simulation, and hence the design of the shield, the SHiP Collaboration measured the muon flux in a dedicated experiment in the 400 GeV/c proton beam at the H4 beam line of the SPS at CERN in July 2018 [3].

Experimental setup and data


The experimental setup, as implemented in FairShip (the SHiP software framework), is shown in Fig. 1. A cylindrical SHiP-like target (10 cm diameter and 154.3 cm length) was followed by a hadron absorber made of iron blocks (\(240 \times 240 \times 240~{\mathrm{cm}}^3\)) and surrounded by iron and concrete shielding blocks. The dimensions of the hadron absorber were optimised to stop pions and kaons while keeping a good \(p_T\) acceptance of traversing muons. The SPS beam counters (XSCI.022.480/481, S0 in Fig. 1) and beam counter S1 were used to count the number of POT seen by the experiment.

Fig. 1 Layout of the experimental setup to measure the \(\mu \)-flux. The FairShip coordinate system is also shown

A spectrometer was placed downstream of the hadron absorber. It consisted of four drift-tube stations (T1–T4, modified from the OPERA experiment [4]) with two stations upstream and two stations downstream of the Goliath magnet [5]. The drift-tubes were arranged in modules of 48 tubes, staggered in four layers of twelve tubes with a total width of approximately \(50~{\mathrm{cm}}\). The four modules of height \(110~{\mathrm{cm}}\) making up stations T1 and T2 were arranged in a stereo setup (\(x-u\) views for T1 and \(v-x\) views for T2), with a stereo angle of \(60^{\circ }\). T3 and T4 had only x views and were made of four modules of \(160~{\mathrm{cm}}\) height.

The drift-tube trigger (S2) consisted of two scintillator planes, placed before (S2a) and behind (S2b) the first two tracking stations.

A muon tagger was placed behind the two downstream drift-tube stations. It consisted of five planes of single-gap resistive plate chambers (RPCs), operated in avalanche mode, interleaved with \(1\times 80~{\mathrm{cm}}\) and \(3\times 40~{\mathrm{cm}}\) thick iron slabs. In addition, an \(80~{\mathrm{cm}}\) thick iron slab was positioned immediately upstream of the first chamber. The active area of the RPCs was \(190~{\mathrm{cm}} \times 120~{\mathrm{cm}}\) and each chamber was read out by two panels of x/y strips with a \(1~{\mathrm{cm}}\) pitch.

The two upstream tracking stations were centered on the beam line, whereas the two downstream stations and the RPCs were centered on the Goliath magnet opening to maximize the acceptance.

The data acquisition was triggered by the coincidence of S1 and S2. For more details on the DAQ framework, see [6], and for a description of the trigger and the DAQ conditions during data taking, see [7].

The protons were delivered in \(4.8~{\mathrm{s}}\) duration spills (slow extraction). There were either one or two spills per SPS supercycle, with intensities \(\sim 3\times 10^6\) protons per second. The 1-sigma width of the beam spot was 2 mm. For physics analysis, 20128 useful spills were recorded with the full magnetic field of 1.5 T, with \(2.81\times 10^{11}\) raw S1 counts. After normalization (see Sect. 3.1) this corresponds to \((3.25\pm 0.07)\times 10^{11}\) POT. Additional data were taken with the magnetic field switched off for detector alignment and tracking efficiency measurement.

Data analysis


The calculation of the number of POT delivered to the experiment must take the different signal widths and dead times of the various scintillators into account. Moreover, some protons from the so-called halo might fall outside the acceptance of S1 and will only be registered by S0.

In low-intensity runs these effects are small. We select some spills of these runs and split them into 50 slices of 0.1 s. We then determine the number of POT per slice and count the number of reconstructed muons in each slice, which should be independent of the intensity. By leaving the dead times as free parameters in a straight line fit, we find [8] that the number of POT required to have an event with at least one reconstructed muon is \(710\pm 15\). The systematic error of 15 POT accounts for the variation between the runs used for the normalization. The statistical error is negligible.

The efficiency of the trigger relies on the efficiency of detecting a muon signal in the two scintillator planes S2a and S2b (see Fig. 1 and [8]). Each plane is equipped with two photo-multipliers (PMs), and the signal of each of the PMs is recorded for each event. The calculated trigger inefficiency is less than 1‰ and is hence neglected. Multiplying the number of reconstructed muons found in the 20128 spills by 710, we calculated that this data set corresponds to \((3.25\pm 0.07)~\times ~10^{11}\) POT.
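The normalization arithmetic can be checked with a short sketch. It uses only the numbers quoted above (710 ± 15 POT per muon event and \(3.25\times 10^{11}\) POT); the event count is back-computed from these and is not a number quoted in the text, and the function name is illustrative.

```python
# Numbers quoted in the text.
POT_PER_MUON_EVENT = 710.0      # POT per event with at least one reconstructed muon
POT_PER_MUON_EVENT_ERR = 15.0   # systematic uncertainty on that factor
TOTAL_POT = 3.25e11             # quoted POT for the 20128 spills


def pot_from_events(n_events, k=POT_PER_MUON_EVENT, k_err=POT_PER_MUON_EVENT_ERR):
    """Return (POT, uncertainty) for n_events; the error scales with the factor."""
    pot = n_events * k
    return pot, pot * (k_err / k)


# Event count implied by the quoted total (back-computed, not a quoted number).
implied_events = TOTAL_POT / POT_PER_MUON_EVENT
pot, pot_err = pot_from_events(implied_events)
rel_err = POT_PER_MUON_EVENT_ERR / POT_PER_MUON_EVENT

print(f"implied muon events: {implied_events:.3e}")
print(f"POT = {pot:.3e} +/- {pot_err:.2e} ({100 * rel_err:.1f}%)")
```

The relative uncertainty 15/710 is about 2.1%, which reproduces the quoted ±0.07 on \(3.25\times 10^{11}\) POT.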

Fig. 2 A two-muon event (most events are single-muon events) in the event display. The blue crosses are hits in drift-tube stations T1 and T2, the red crosses are hits in T3 and T4. The green and light blue markers are hits in the RPC stations. The orange (blue) dotted lines are drift-tube (RPC) track segments in the y projection; the pink (red) curves are track segments in the x projection

Fig. 3 Average of all drift-tube residuals. The fit is a double Gaussian and the resulting hit resolution (\(\sigma _\mathrm{{mean}}\)) is the average of the two sigmas

Fig. 4 Effect of additional Gaussian smearing on the momentum distribution in the simulation, left p, right \(p_{T}\). The distributions correspond to the simulation truth before reconstruction (navy blue), the nominal resolution \(\sigma _\mathrm{{hit}}=270~\upmu \mathrm {m}\) (green) and a degraded resolution \(\sigma _\mathrm{{hit}}=350~\upmu \mathrm {m}\) (pink)


For the drift-tubes, the relation between the measured drift-time and the distance of the track to the wire (the “\(r\text {--}t\)” relation) is obtained from the Time to Digital Converter (TDC) distribution by assuming a uniformly illuminated tube. When reconstructing the data, the \(r\text {--}t\) relations are established first by looking at the TDC distributions of simple events (i.e. events with at least 2 and at most 6 hits per tracking station). In the simulation, the true drift radius is smeared with the expected resolution. The pattern recognition subsequently selects hits and clusters to form track candidates and provides the starting values for the track fit. The RPC pattern recognition proceeds similarly. Drift-tube tracks are then extrapolated to RPC tracks and tagged as muons if they have hits in at least three RPC stations. Figure 2 shows a two-muon event in the event display.
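The idea behind deriving an \(r\text {--}t\) relation from the TDC distribution can be sketched as follows. For a uniformly illuminated tube, the fraction of hits with drift time below t equals \(r(t)/r_{\max }\), so the scaled cumulative drift-time distribution gives r(t). This is a minimal illustration and not the FairShip implementation; the function names, the tube radius and the constant drift velocity in the toy check are assumptions.

```python
import numpy as np


def rt_relation(drift_times, r_max, n_bins=100):
    """Build an r-t table from drift times, assuming uniform tube illumination.

    For uniform illumination the fraction of hits with drift time below t
    equals r(t) / r_max, so r(t) is the scaled cumulative distribution.
    """
    counts, edges = np.histogram(drift_times, bins=n_bins)
    cdf = np.cumsum(counts) / counts.sum()
    # r at the upper edge of each time bin; r = 0 at the lowest edge.
    r_table = np.concatenate(([0.0], cdf)) * r_max
    return edges, r_table


def drift_radius(t, t_table, r_table):
    """Interpolate the drift radius for a measured drift time."""
    return np.interp(t, t_table, r_table)


# Toy check with a hypothetical constant drift velocity (not the real gas behaviour).
rng = np.random.default_rng(1)
R_MAX = 1.8        # cm, assumed inner tube radius (illustrative)
V_DRIFT = 0.0012   # cm/ns, assumed
true_r = rng.uniform(0.0, R_MAX, size=200_000)
times = true_r / V_DRIFT
t_table, r_table = rt_relation(times, R_MAX)
reco_r = drift_radius(times, t_table, r_table)
max_dev = np.max(np.abs(reco_r - true_r))
print(f"largest deviation from the true radius: {max_dev:.4f} cm")
```

The table is monotonic by construction, so the same approach works for a non-linear \(r\text {--}t\) relation as long as the illumination is uniform.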

Momentum resolution

The expected drift-tube hit resolution based on the OPERA results is 270 µm [4]. However, due to residual misalignment and imperfect \(r\text {--}t\) relations, the measured hit resolution was worse, 373 µm, as shown in Fig. 3. To study the impact of a degraded spatial drift-tube resolution, the momentum distribution from the simulation was folded with additional smearing, as shown in Fig. 4. The tails towards large momentum p and \(p_{T}\) are caused mainly by tracks fitted with wrong drift times due to background hits.

From Fig. 4 we conclude that the momentum resolution is not strongly affected by the degraded resolution of the drift-tubes that is observed. The effect of the degraded drift-tube resolution is therefore negligible for our studies of the momentum spectrum. To account for residual effects in the track reconstruction, the resolution in the simulation was set to 350 µm.
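As a rough illustration of what degrading the resolution from 270 µm to 350 µm means, the sketch below assumes that the additional smearing is an independent Gaussian term that adds in quadrature to the nominal resolution. This is an assumption made for the example; the text does not specify how the smearing is applied in the simulation.

```python
import math
import numpy as np

SIGMA_NOMINAL_UM = 270.0   # expected hit resolution quoted in the text
SIGMA_SIM_UM = 350.0       # resolution used in the simulation


def extra_smearing(sigma_target, sigma_base):
    """Width of the independent Gaussian that turns sigma_base into sigma_target."""
    return math.sqrt(sigma_target**2 - sigma_base**2)


sigma_extra = extra_smearing(SIGMA_SIM_UM, SIGMA_NOMINAL_UM)

# Toy residuals: nominal smearing plus the additional term.
rng = np.random.default_rng(7)
n = 500_000
residuals = rng.normal(0.0, SIGMA_NOMINAL_UM, n) + rng.normal(0.0, sigma_extra, n)
combined = residuals.std()
print(f"extra smearing: {sigma_extra:.1f} um, combined width: {combined:.1f} um")
```

Under this assumption, an extra term of roughly 223 µm is needed to go from the nominal to the degraded resolution.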

Table 1 Simulation samples made for SHiP background studies. \(\chi \) is the fraction of protons that produce heavy flavour

Fig. 5 Measured muon momentum distributions from data and simulation: top, full range in log scale; bottom, detail of the low momentum range with a linear scale. The distributions are normalized to the number of POT. For the simulation, some individual sources are highlighted: muons from charm (green), from dimuon decays of low-mass resonances in Pythia8 (cyan) and in Geant4 (turquoise), from photon conversion (dark green) and from positron annihilation (brown)

Fig. 6 Transverse momentum distributions from data and simulation: top, full range in log scale; bottom, detail of the lower transverse momentum range with a linear scale. The distributions are normalized to the number of POT. For the simulation, some individual sources are highlighted: muons from charm (green), from dimuon decays of low-mass resonances in Pythia8 (cyan) and in Geant4 (turquoise), from photon conversion (dark green) and from positron annihilation (brown)

Tracking efficiencies

The tracking efficiency in the simulation depends on the station occupancy, and in data and simulation the occupancies are different (apparently caused by different amounts of delta rays). By taking this into account, the efficiency in the simulation is reduced from 96.6 to 94.8%.

To determine the tracking efficiency in data, we use the RPCs to identify muon tracks in the data with the magnetic field turned off. We then take the difference between the tracking efficiency in the simulation with magnetic field off (96.9%) and the measured efficiency (93.6%) as the systematic error: 3.3%. For more details on the analysis and reconstruction, see [9].

Fig. 7 \(p_T\) distributions in slices of p for data and simulation. The units on the vertical axes are the number of tracks per bin, with the simulation normalised to the data

Table 2 Number of reconstructed tracks in different momentum bins per \(10^9\) POT per GeV/c for data and simulation. The statistical errors for data are negligible. For data, the uncertainties are dominated by the uncertainty in the POT normalization, \(2.1\%\). For the simulation, the main uncertainty is due to a different reconstruction efficiency in the simulation compared to data, \(3.3\%\)

Comparison with the simulation

A large sample of muons was generated (with Pythia6, Pythia8 [10] and GEANT4 [11] in FairShip) for the background studies of SHiP, corresponding to the number of POT shown in Table 1. Energy cuts (\(E_{\mathrm {min}}\)) of 1 GeV and 10 GeV were imposed to save computing time. The primary proton-nucleon interactions are simulated by Pythia8 (using the default tune). The emerging particles are transported by GEANT4 through the target and hadron absorber, producing a dataset referred to as “mbias” events. A special setting of GEANT4 was used to switch on muon interactions to produce rare dimuon decays of low-mass resonances. Since GEANT4 does not have production of heavy flavour in particle interactions, an extra procedure was devised to simulate heavy-flavour production not only in the primary pN collision but also in collisions of secondary particles with the target nucleons. For performance reasons, this was done with Pythia6. The mbias and charm/beauty datasets were combined by removing the heavy-flavour contribution from the mbias and inserting the cascade data with appropriate weights. The details of the full heavy-flavour production for both the primary and cascade interactions are described in [12].


The main objective of this study is to validate our simulations for the muon background estimation for the SHiP experiment. For this purpose, we compare the reconstructed momentum distributions (p and \(p_{T}\)) from data and simulation.

As discussed in the previous section (see also Fig. 4), the events outside the limits (\(p>350~\)GeV/c or \(p_{T}>5~\)GeV/c) are dominated by wrongly reconstructed trajectories due to background hits and the limited precision of the tracking detector. In SHiP, where the hadron absorber is 5 m long, only muons with momentum \(p>5~\)GeV/c have sufficient energy to traverse the entire absorber. We therefore restrict our comparison to 5 GeV/c \(<p<300~\)GeV/c and \(p_{T}<4~\)GeV/c. For momenta below 10 GeV/c, we rely only on the reconstruction with the tracking detector, since these muons do not reach the RPC stations. Above 10 GeV/c we require a match between drift-tube and RPC tracks.
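The selection described above can be written as a small predicate. This is a sketch only: the `Track` container and its field names are hypothetical and do not correspond to FairShip classes, and the treatment of a track at exactly 10 GeV/c is a choice made here, since the text does not specify it.

```python
from dataclasses import dataclass

P_MIN, P_MAX = 5.0, 300.0   # GeV/c, comparison range quoted in the text
PT_MAX = 4.0                # GeV/c
P_RPC_MATCH = 10.0          # GeV/c, above this an RPC match is required


@dataclass
class Track:            # hypothetical container, not a FairShip class
    p: float            # momentum in GeV/c
    pt: float           # transverse momentum in GeV/c
    rpc_matched: bool   # drift-tube track matched to an RPC track


def passes_selection(track: Track) -> bool:
    """Apply the kinematic window and the momentum-dependent RPC requirement."""
    if not (P_MIN < track.p < P_MAX and track.pt < PT_MAX):
        return False
    # Tracks at exactly 10 GeV/c are treated like the low-momentum case here.
    if track.p > P_RPC_MATCH:
        return track.rpc_matched
    return True


print(passes_selection(Track(p=7.0, pt=1.0, rpc_matched=False)))
print(passes_selection(Track(p=50.0, pt=1.0, rpc_matched=False)))
```

The first call accepts a low-momentum track without an RPC match, while the second rejects a 50 GeV/c track that has none.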

Figures 5 and 6 show the p and \(p_{T}\) distributions of muon tracks. The distributions are normalized to the number of POT for data (see Sect. 3.1) and simulation respectively. For the simulated sample, muons from some individual sources are also shown in addition to their sum.

In Fig. 7, we show the \(p_T\) distributions in slices of p. Table 2 shows a numerical comparison of the number of tracks in the different momentum bins.

Figure 8 shows the muon \(p-p_T\) distribution in data.

Fig. 8 \(p_T\) vs p for data. The units on the vertical axis are the number of tracks per \(p,p_{T}\) bin in the entire data set

Fig. 9 Ratio of data and MC tracks, \(R=\frac{N_{data}}{N_{MC}}\) in bins of p and \(p_T\)

Figure 9 gives a view of the differences between data and simulation in the \(p-p_T\) plane. Plotted is the difference between the number of data and simulated tracks divided by the sum of the tracks in data and simulation, in bins of p and \(p_T\).
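The caption of Fig. 9 quotes the ratio \(R=N_{data}/N_{MC}\), while the text describes the normalized difference \(A=(N_{data}-N_{MC})/(N_{data}+N_{MC})\). The two carry the same information, since \(A=(R-1)/(R+1)\). The sketch below computes both per bin; the counts are made up for illustration and are not taken from the measurement.

```python
import numpy as np


def compare_bins(n_data, n_mc):
    """Return (ratio, asymmetry) per bin; bins without entries give NaN."""
    n_data = np.asarray(n_data, dtype=float)
    n_mc = np.asarray(n_mc, dtype=float)
    total = n_data + n_mc
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(n_mc > 0, n_data / n_mc, np.nan)
        asym = np.where(total > 0, (n_data - n_mc) / total, np.nan)
    return ratio, asym


# Made-up counts in a 2 x 2 grid of (p, pT) bins, for illustration only.
data = np.array([[120.0, 30.0], [80.0, 0.0]])
mc = np.array([[100.0, 20.0], [100.0, 0.0]])
ratio, asym = compare_bins(data, mc)
print(ratio)
print(asym)
```

The asymmetry is bounded between -1 and 1, which makes it convenient for a colour scale, whereas the ratio is unbounded when the simulation predicts few tracks.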

For momenta above 150 GeV/c, the simulation underestimates tracks with larger \(p_T\), while the total number of tracks predicted is in agreement with the data within \(20\%\). The difference between data and simulation is probably caused by a different number of muons from pion and kaon decays. Increasing the contribution of muons from pion and kaon decays in the simulation was seen to reduce the difference between data and simulation.

The FLUKA [13, 14] generator is used to determine the radiation levels in the SHiP environment. To benchmark FLUKA with typical settings used for radiological estimates related to muons in the SHiP environment, the muon flux setup was implemented in FLUKA and the simulation with this setup was compared to that made with Pythia/GEANT4. The results of this comparison are given in Annex 1. This independent prediction provides additional support for the validity of the SHiP background simulation.


We have measured the muon flux from 400 GeV/c protons impinging on a heavy tungsten/molybdenum target. The underlying physics processes are a combination of the production of muons through decays of non-interacting pions and kaons, the production and decays of charm particles and low-mass resonances, and the transport of the muons through 2.4 m of iron. Differences of some 20–30% in the absolute rates are observed. The simulation underestimates contributions to larger transverse momentum for higher muon momenta. Given the complexity of the underlying processes, the agreement between the prediction by the simulation and the measured rate is remarkable.

Systematic errors for the track reconstruction (\(3\%\)) and the POT normalization (\(15~\mathrm {POT}/\mu \text {-}\mathrm {event}\)) have been studied and estimated.

A further understanding of the simulation and the data will be obtained with an analysis of di-muon events, the results of which will be the subject of a future publication.