In this chapter, we discuss methods of particle identification, also called object identification, because what we reconstruct or identify is usually not a single particle but an object such as a charged-particle trajectory, a jet (a cluster of many particles), missing transverse energy, and so on. We go through the common objects that are widely used in high energy physics.

6.1 Tracking and Vertexing

Tracking refers to the reconstruction of the trajectory of a charged particle, called a track. Once we find such a trajectory in a magnetic field, we can measure the momentum of the particle through the curvature of the trajectory. The energy of the particle can be estimated by assuming its type, or from measurements related to particle identification. Thus, tracking allows us to obtain a 4-momentum vector, which is what is ultimately needed in data analysis. For this reason, tracking capability is one of the most important features of most detectors in high energy physics.

Tracking can be divided into three parts: the measurement of the space hit points of the particles in the detector, the pattern recognition applied to the hit points to form a candidate track (referred to as track finding), and the fitting of the candidate track to obtain a smooth track, which is our best estimate of the true particle trajectory. We discuss these three steps in the following subsections.

The collection of tracks in an event further allows us to find or estimate, for example, the particle-particle collision points of the event, or the decay positions of short-lived particles such as \(K_S\), \(\varLambda \), b-hadrons, \(\tau \), etc. In either case, more than one particle emerges from a common location, which we call a vertex. Vertexing refers to the reconstruction of the vertex using the collection of tracks.

6.1.1 Space Hit Point

Tracking starts with searching for the space hit points of charged particles in the detector. A charged particle deposits its kinetic energy in material by ionisation when passing through it. This results in an electrical signal, or in scintillation light that is eventually converted to an electrical signal, in the sensor material. The sensor for tracking is usually segmented, allowing us to know the hit position of the charged particle.

If the tracking device were pixelated, had 100% efficiency without any fake hits (i.e. were a virtually perfect detector), and only one charged particle existed in an event, the space hit point could be determined uniquely without any ambiguity, and no further discussion of finding the space hit point would be needed. However, the real world is not so kind to us. A particle can traverse more than a single segment because of its incident angle, resulting in multiple hits in a sensor. Moreover, many particles are often generated by particle-particle collisions or by a beam hitting a target, and these particles may overlap and create multiple hits in a single sensor. There can also be false measurements due to the inefficiency or noise of the detector. Therefore, in order to determine or estimate the space hit points, we first have to apply clustering techniques to the set of raw hits in the sensor. Once we have a cluster of hits, the next step is to estimate the hit position of a particle from the cluster. Below we discuss the clustering techniques first, and then how to determine the hit position from the cluster.

Many tracking devices used so far are not pixelated. Instead, in many cases, they provide one-dimensional information from the wires in a chamber or the strip electrodes in a silicon sensor. Therefore, we need to convert such one-dimensional measurements into space hit points, which we discuss in the last part of the following subsections.

6.1.1.1 Hit Clustering

Below we consider a position-sensitive device in one dimension, such as a silicon strip sensor, a wire chamber or a fibre tracker. A basic approach to clustering is to group consecutive hits. This is rather straightforward if the detector is perfect: all you need is to set a (very low) threshold for each channel to define a single hit. In actual experiments, however, there can be dead channels, or noisy channels that always have a hit regardless of whether a particle passed. Special treatments are needed for these kinds of defects.

For example, in order to avoid fake tracks caused by noise hits, we mask noisy channels, i.e. ignore such channels when making a cluster. Or, if two silicon strip sensors can be paired because they are located very close to each other, requiring two hits on the pair can reduce the fake hits significantly. In contrast, a known dead channel is sometimes treated as if it had a hit when the two adjacent channels have hits.

In addition to defects such as noisy or dead channels, another type of care is needed when a large cluster is generated. A large cluster is possible when a particle hits a sensor obliquely. In this case, the energy deposit, and hence the signal size, in each channel is small, so the threshold defining a hit needs to be set low to reconstruct this type of track. But of course such lower thresholds can cause higher noise rates. Therefore, an optimisation of the thresholds is always key to achieving higher efficiency with fewer fake hits.

6.1.1.2 Hit Point Determination

The method to estimate the hit position depends on how the signal information in each channel is stored. In the so-called “binary readout”, where only the locations of channels with a signal above threshold are recorded, all we can do is take the average of the positions of the hit channels. For example, if only one hit is found in a silicon strip sensor, the location of the strip with the hit is regarded as the incident position of the particle. If there are two hits, the incident position is taken to be the middle of the hit strips, and so on. The position resolution of such a binary readout scheme is \(\frac{d}{\sqrt{12}}\), where d is the size of the segment of each channel, as discussed in Sect. 4.2.5.
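As a toy illustration, the averaging described above can be written in a few lines of Python; the strip indices and the 0.05 mm pitch below are arbitrary example values, not from the text:

```python
import math

def binary_cluster_position(hit_strips, pitch):
    """Binary readout: the position estimate is simply the average of the
    centres of the strips above threshold (strip index times pitch)."""
    return pitch * sum(hit_strips) / len(hit_strips)

# Single hit on strip 10 with a 0.05 mm pitch: the strip centre itself.
print(binary_cluster_position([10], 0.05))        # ~0.5 mm
# Two adjacent hits: the mid-point between the two strips.
print(binary_cluster_position([10, 11], 0.05))    # ~0.525 mm
# Expected RMS resolution of a single-strip binary measurement, d/sqrt(12):
print(0.05 / math.sqrt(12))
```

The single-strip case makes the \(d/\sqrt{12}\) figure concrete: the true position is uniform across the strip width, so the RMS of the estimate is the RMS of a uniform distribution of width d.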

In case the size of the collected charge at each channel is recorded by some means, the centre of gravity in terms of the collected charge can be taken as the particle hit position. Even more sophisticated approaches, such as the use of a lookup table, are sometimes employed. Either way, the position resolution can be improved compared to the binary readout scheme. Let's assume that we measure the position along the x-axis with a silicon strip detector whose strip pitch is d, and that we have two hits. Also, let \(Q_L\) and \(Q_R\) be the collected charges at the two strips. With the centre of gravity method, x, the particle incident position defined as the distance from the right-hand strip, is measured to be

$$ x = d \times \frac{Q_L}{Q_L+Q_R} \equiv d \times \frac{Q_L}{S} \; \; , $$

where S is the total charge accumulated by the two strips. Assuming the accumulated signal charge, S, is much larger than the noise, the uncertainty of measuring x can be written as

$$ \delta x = d \times \frac{\delta Q_L}{S} \sim d \times \frac{N}{S} \; \; , $$

where N is the noise. Since the binary readout gives a position resolution of \(d / \sqrt{12}\), a device with a signal-to-noise ratio (S/N) greater than \(\sqrt{12} \sim 3.5\) gains from using the analog information of the charge (Q). In many detectors, including silicon strip sensors, it is very common to have an S/N greater than 10 or 20. Hence, adding the analog information usually improves the position resolution.

6.1.1.3 From One-Dimensional Measurement to Space Point

In many cases, the tracking detector consists of devices that are capable of sensing only one-dimensional hit information, although pixelated sensors have recently become popular. For example, suppose that there are parallel wires forming a plane inside a gas volume. By detecting a signal from a wire, one can identify the position of the particle hitting the plane in the direction perpendicular to the wires, but not in the direction along the wires. Adding another layer parallel to the original plane does not change the situation. Instead, placing another layer tilted by some angle with respect to the wires in the first layer allows us to obtain two-dimensional hit information. This tilt angle is referred to as the stereo angle (Fig. 6.1: left).

Fig. 6.1
figure 1

Left: one-dimensional measurements by two layers with a stereo angle give a two-dimensional measurement. Right: two-dimensional measurement by a pixelated device

The set of two layers with a stereo angle can provide a two-dimensional measurement of a hit position on a certain plane. Since a three-dimensional space hit point is needed to reconstruct the particle trajectory, one must define the plane of the two-dimensional measurement. In case a pixel-type sensor is used, the plane is defined by the sensor itself. In the right of Fig. 6.1, for example, the x and y positions are determined by the location of the pixel with the hit, while z is given by the plane of the sensor. Of course the sensor has a finite thickness, and hence the “plane” must be pre-defined by convention, such as the front surface or the middle between the front and back surfaces, etc. In the case of two stereo sensor layers, such as two planes of wires (left of Fig. 6.1), the middle of the two layers is often defined to be the space-hit-point plane.

The simple or natural idea to get two-dimensional hit information from two one-dimensional measurements is to set the stereo angle to 90\(^\circ \). In many applications, however, it is difficult to have a large stereo angle because of the geometrical constraints of the detector, especially in collider detectors where inactive materials such as cables must be minimised to cover the \(4\pi \) acceptance. (In contrast, a fixed-target experiment may not face such space constraints and can easily achieve a 90\(^\circ \) stereo angle.) To minimise the inactive materials, it is preferable to have the readout electronics, including cables, at only one end of the detector. Therefore, many tracking detectors actually used in collider experiments have a very small stereo angle, so that the signals from the two planes are read out on the same side, at the cost of position resolution along the wire direction.
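The combination of two stereo measurements into a two-dimensional point can be sketched as follows. The geometry assumed here (first-layer wires along y, second layer rotated by a small angle alpha) is a simplification for illustration:

```python
import math

def stereo_to_xy(u1, u2, alpha):
    """Combine two 1-D stereo measurements into a 2-D point.

    u1: coordinate measured by the first plane (wires along y, so u1 = x).
    u2: coordinate measured by the second plane, whose wires are rotated
    by the stereo angle alpha, so u2 = x*cos(alpha) + y*sin(alpha).
    """
    x = u1
    y = (u2 - u1 * math.cos(alpha)) / math.sin(alpha)
    return x, y

# A hit at (x, y) = (1.0, 2.0) seen by planes with a 5-degree stereo angle:
alpha = math.radians(5.0)
u1 = 1.0
u2 = 1.0 * math.cos(alpha) + 2.0 * math.sin(alpha)
print(stereo_to_xy(u1, u2, alpha))  # ~(1.0, 2.0)
```

The \(1/\sin \alpha \) factor in y makes the trade-off above explicit: for a small stereo angle, any measurement error on u2 is magnified along the wire direction.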

6.1.2 Track Finding

Once a set of space hit points is identified, the next step is to form candidate tracks from the hit points. The idea itself is simple: select the space hit points based on a track hypothesis, for which one needs a model of the charged-particle trajectory. If only one particle travelled inside the tracking volume without any noise hits, there would be nothing more to discuss: all you would need is to connect the hit points. Usually, however, there are many particles creating many hit points, resulting in a very complicated hit pattern, plus noise hits. Given a particle trajectory model, the human eye is very good at this pattern recognition, i.e. you can pick out a set of hit points that form a track. In the old days of emulsions and bubble chambers, it was actually human eyes that worked as the track finder. But in most high energy physics experiments these days, one must use computers to find tracks among the enormous number of hit points, because of the very high data acquisition rate. This means that algorithms for track finding have to be provided.

The first thing to consider is the modelling of the charged-particle trajectory. In collider experiments, since a magnetic field usually fills the tracking volume to allow momentum measurement, the trajectory is basically a helix. In many fixed-target experiments, on the other hand, it is a straight line, possibly with a curvature in the single plane perpendicular to the magnetic field used for measuring momentum. There may be more models depending on the experiment. In any case, the point here is that the pattern recognition runs with a hypothesis on how a charged particle travels in the tracking volume.

The core of track finding is to select a set of hit points based on a given track model. There are, in general, two main groups of track finding algorithms, local and global methods, which we describe below.

6.1.2.1 Local Method

The concept “local” means that the algorithm tries to find a single track first, and then searches for the next one once the first is found. This procedure is repeated until no further candidates are found. In this way, multiple tracks are found sequentially.

The local method starts by finding an initial seed from the hit list. Consider tracking in a collider experiment such as ATLAS. The tracking algorithm picks up a hit on either the innermost or the outermost layer. Let's suppose that the algorithm looks for a track from the inside out, i.e. a hit on the innermost layer is picked up first. Then a hit on the next outer layer is searched for, based on the hypothesis that the track comes from the proton-proton interaction point with a helix trajectory, where a rough estimate of the interaction point needs to be provided. The segment formed by the hits on the innermost and the adjacent outer layer is examined to see whether it matches the track hypothesis under some criteria. The surviving segment becomes the track seed. Once the seed is found, the track candidate is extrapolated to the next outer layer and again examined for a hit. This procedure is repeated until the track candidate reaches the outermost layer of the tracking volume. The examination is often based on \(\chi ^2\) testing or a Kalman filter.Footnote 1

Whether to start from the inside out or from the outside in depends on the algorithm. Because the inner layers have denser hits than the outer ones, starting from the outside avoids unnecessary trials and hence saves CPU time. It can also reconstruct tracks from a secondary vertex located inside the tracking volume. This type of secondary vertex may come from long-lived particles such as \(K_S\). On the other hand, the algorithm starting from the inside has an advantage in efficiency, since it tries all possibilities, even though it costs CPU time. In ATLAS, both algorithms are used in parallel. The resulting track candidates from both algorithms are fed into the next stage of the tracking, the track fitting.

In many algorithms, after a track candidate is found, the space hit points used to reconstruct it are removed from the hit list for the next track candidates. The algorithm then searches for the next track candidate using the updated hit list, and the procedure continues until no more candidates satisfying the selection criteria are found. In dense experimental conditions such as hadron colliders, however, it may also be an option to leave some hits on the list, even though they are already used, for redundancy. The choice is up to you.
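The seed-and-extend loop of the local method can be caricatured in a few dozen lines. The model here (one coordinate per layer, a user-supplied straight-line `predict` function, a fixed matching window, and hit removal after each found track) is purely illustrative and far simpler than any real helix-based implementation:

```python
def find_tracks_local(layers, predict, tolerance):
    """Greedy inside-out track finding (a simplified sketch).

    layers: list of hit lists, ordered innermost to outermost.
    predict(track_hits, layer_index): expected hit position on that layer
    given the hits collected so far (the "track hypothesis").
    tolerance: matching window for accepting a hit.
    """
    tracks = []
    used = [set() for _ in layers]
    for i, seed in enumerate(layers[0]):
        if i in used[0]:
            continue
        track = [(0, i, seed)]                 # (layer, hit index, position)
        for lay in range(1, len(layers)):
            expected = predict([h for (_, _, h) in track], lay)
            best = None
            for j, hit in enumerate(layers[lay]):
                if j in used[lay]:
                    continue
                if abs(hit - expected) < tolerance and (
                        best is None
                        or abs(hit - expected) < abs(layers[lay][best] - expected)):
                    best = j
            if best is None:                   # extrapolation failed: give up
                break
            track.append((lay, best, layers[lay][best]))
        if len(track) == len(layers):          # require a hit on every layer
            tracks.append(track)
            for lay, j, _ in track:            # remove used hits from the list
                used[lay].add(j)
    return tracks

# Straight tracks from the origin: position grows linearly with layer radius.
layers = [[1.0, 5.0], [2.0, 10.1], [3.0, 14.9]]
predict = lambda hits, lay: hits[-1] * (lay + 1) / lay
print(len(find_tracks_local(layers, predict, 0.5)))  # 2
```

Swapping the `predict` function for a helix extrapolation, and the window test for a \(\chi ^2\) or Kalman-filter gate, turns this toy into the scheme described in the text.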

6.1.2.2 Global Method

The concept “global” means that the algorithm tries to find all tracks at once from the list of space hit points, in contrast to the local method. These algorithms are completely different from the one discussed so far, and their variety is wide, so we pick out and introduce two widely used ones.

Fig. 6.2
figure 2

An example of the track and space-hit-point distribution in the plane perpendicular to the beam axis in a collider experiment without a magnetic field. The coordinate origin is the collision point

Histogramming method:

For illustration purposes, suppose we want to find tracks in a collider detector without any magnetic field. Suppose also that the particle-particle collision point (collision point for short) is known. Figure 6.2 shows an example of the space hit points projected onto the plane perpendicular to the beam direction. If you histogram the azimuthal angle, \(\phi \), of the hit points in Fig. 6.2, the \(\phi \) distribution will have seven peaks, each corresponding to a track. By making such a histogram, one can identify the groups of hit clusters in an event all at once. This is the basic idea of the histogramming method. In the actual application with a magnetic field, a coordinate transformation is carried out first. Suppose x and y are the values in the original coordinates; then

$$ u = \frac{x}{x^2 + y^2} $$
$$ v = \frac{-y}{x^2 + y^2} $$

will map the circular trajectories passing through the collision point onto straight lines in the \((u, v)\) plane, so that the simple histogramming method for straight lines can be applied.
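A toy version of the \(\phi \)-histogramming idea, together with the u-v transformation above, might look like this; the binning, peak threshold and hit coordinates are invented for the example:

```python
import math
from collections import Counter

def conformal(x, y):
    """The u-v transformation above: circles through the origin
    are mapped onto straight lines in the (u, v) plane."""
    r2 = x * x + y * y
    return x / r2, -y / r2

def count_phi_peaks(hits, nbins=64, threshold=3):
    """Histogram the azimuthal angle of the hits; each straight track
    from the origin shows up as one populated bin."""
    counts = Counter(int((math.atan2(y, x) + math.pi) / (2 * math.pi) * nbins)
                     for x, y in hits)
    return sum(1 for n in counts.values() if n >= threshold)

# Two field-free tracks at phi = 0.3 and phi = 2.0 with three hits each:
hits = [(r * math.cos(p), r * math.sin(p))
        for p in (0.3, 2.0) for r in (1.0, 2.0, 3.0)]
print(count_phi_peaks(hits))  # 2
```

In the magnetic-field case one would first apply `conformal` to each hit and then histogram in the \((u, v)\) plane instead of in \(\phi \).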

Hough transformation:

Another famous example of a global method is the Hough transformation, which is widely used in digital image processing. Suppose we have a line in the \(x-y\) plane. Let r be the distance between the line and the coordinate origin, and \(\theta \) the angle between the x-axis and the normal to the line; then the line satisfies the following equation:

$$ r = x \cos \theta + y \sin \theta \; \; . $$

This means that any points \((x_0, y_0)\) on the line satisfy

$$ r(\theta ) = x_0 \cos \theta + y_0 \sin \theta \; \; . $$

Therefore, the set of lines passing through a point \((x_0, y_0)\) can be represented by a sine curve in the \((r, \theta )\) plane, which we call the Hough space. This mapping of the set of lines through a single point in the original \(x-y\) plane to a sine curve in the Hough space is called the Hough transformation. Let's now assume that we have five points on a straight line in the \(x-y\) plane. The set of straight lines passing through each point is transformed into the corresponding sine curve in the Hough space. Since there are five points, there are also five sine curves after the Hough transformation. Because the five points in the \(x-y\) plane lie on a common straight line, which we would like to find, there is a crossing point of the five sine curves in the \((r, \theta )\) plane. The crossing point \((r_0, \theta _0)\) gives us the straight line in the \(x-y\) plane, i.e. \(r_0 = x \cos \theta _0 + y \sin \theta _0\) is the expression of the line on which the five points lie. This is how the Hough transformation allows us to determine a line from a set of space hit points.

The transformation shown above is for a straight line. It is possible to choose a different transformation suitable for your application, for example for curves such as charged-particle trajectories in a magnetic field. In this case, assuming the trajectory to be a helix, i.e. a circle in the plane perpendicular to the magnetic field, the parameters representing the circle are the coordinates of the centre and the radius; hence there are three parameters instead of the two (r and \(\theta \)) of the straight line. By performing a similar Hough transformation into this three-dimensional parameter space, one obtains a curved surface for each measured point in the \(x-y\) plane. Repeating the Hough transformation for all the measured points gives a set of curved surfaces. The crossing of the curved surfaces represents the circle we want to determine in the \(x-y\) plane, i.e. an actual track.
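A minimal straight-line Hough transform over a discretised \((r, \theta )\) accumulator can be sketched as follows; the bin sizes and the five collinear points are arbitrary choices:

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, r_step=0.05):
    """Vote in a discretised (r, theta) accumulator; the most-voted cell
    gives the line r = x*cos(theta) + y*sin(theta) shared by the most
    points (the crossing point of the sine curves)."""
    acc = Counter()
    for x, y in points:
        for k in range(n_theta):
            theta = k * math.pi / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            acc[(round(r / r_step), k)] += 1
    (r_bin, k), votes = acc.most_common(1)[0]
    return r_bin * r_step, k * math.pi / n_theta, votes

# Five points on the line y = x, i.e. r = 0 at theta = 135 degrees:
pts = [(t, t) for t in (1, 2, 3, 4, 5)]
r0, theta0, votes = hough_lines(pts)
print(votes)  # 5: all five sine curves cross in one Hough-space cell
```

For the circle case described above, the same voting idea applies with a three-dimensional accumulator over (centre x, centre y, radius) instead of \((r, \theta )\).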

6.1.3 Track Fitting

The final step of the tracking is the fitting of the space hit points associated with each track candidate to the track model. The concept of track fitting is rather simple: it is the least squares method, minimising the difference between the measurements and the track hypothesis. More specifically, \(\chi ^2\) is defined to be

$$ \chi ^2(\theta ) = \sum _{i=1}^N \frac{( y_i - f(x_i; \theta ) )^2}{\sigma _i^2} \; \; , $$

where \(y_i\) is the measurement at \(x_i\), \(f(x_i; \theta )\) the prediction of the track based on some trajectory model, \(\sigma _i\) the measurement error, and \(\theta \) the set of parameters which you want to obtain by minimising the \(\chi ^2\). In the case of collider experiments, the trajectory of a charged particle is modelled by a helix that consists of five parameters. Assuming the measurements are described by a (generally correlated) Gaussian p.d.f., which is the case in many measurements, the \(\chi ^2\) can be written as

$$ \chi ^2 (\theta ) = \sum _{i,j=1}^N ( y_i - f(x_i; \theta ) ) \, (V^{-1})_{ij} \, ( y_j - f(x_j; \theta ) ) \; \; , $$

where V is the covariance matrix. This can be further written in general matrix notation as

$$ \chi ^2 = ( \textbf{y} - \textbf{f})^T V^{-1} (\textbf{y} - \textbf{f}) \; \; , $$

where \(\textbf{y}\) is the vector of the measurements and \(\textbf{f}\) the vector of predicted values. The track fitting searches for the parameters \(\theta \) that minimise the \(\chi ^2\). There is a variety of approaches to this minimisation. The details of the mathematical treatment can be found elsewhere (see Ref. [2] for example). Instead, we discuss a few points that need particular consideration in high energy physics.
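For the simplest case of uncorrelated errors and a straight-line model, the \(\chi ^2\) minimisation has a closed-form solution; this sketch (with invented hit values) shows the machinery before any helix model or Kalman filter is involved:

```python
def chi2_line_fit(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x, minimising
    chi2 = sum_i (y_i - a - b*x_i)^2 / sigma_i^2 analytically
    via the normal equations."""
    w = [1.0 / s ** 2 for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / det
    b = (S * Sxy - Sx * Sy) / det
    chi2 = sum(wi * (y - a - b * x) ** 2 for wi, x, y in zip(w, xs, ys))
    return a, b, chi2

# Noise-free hits on y = 1 + 2x give a perfect fit (chi2 ~ 0):
a, b, chi2 = chi2_line_fit([0, 1, 2, 3], [1, 3, 5, 7], [0.1] * 4)
print(round(a, 6), round(b, 6), round(chi2, 6))  # 1.0 2.0 0.0
```

A real track fit replaces the two parameters (a, b) with the five helix parameters and, typically, the analytic solution with an iterative or Kalman-filter minimisation.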

  • The Kalman filter is widely used these days because it can naturally handle effects due to the interaction of the particle with the material of the tracking detector, such as Coulomb multiple scattering and/or energy loss in the tracking device.

  • Speed is crucial. This is especially true for experiments in high-rate, high-multiplicity environments such as hadron colliders. Not only the efficiency and/or resolution but also the speed needs to be optimised in the algorithm.

Once the track fit converges, one finally obtains the reconstructed tracks. As all the track parameters are determined at this point, one can deduce the momentum of the reconstructed tracks, or the particles, at an arbitrary position. In collider experiments, the momentum at the collision point is the quantity of interest in most cases. In addition, the complete set of track parameters allows us to predict, or extrapolate, the particle trajectories outside the tracking volume, which is sometimes important in particle identification, such as electron or muon identification, as shown later in this chapter.

6.1.4 Vertex Finding

It is important to know the location of the collision point because it is where the particle reaction of interest happens, and hence where the momentum vectors of the generated particles are defined. Therefore, the measurement of the collision point event by event is a crucial ingredient of the physics analysis. It is of particular importance for analyses handling neutral particles, because the momentum vector of a neutral particle cannot be defined without knowing the location of the collision point.

In collider experiments, the collision point is often called the primary vertex. There is also another type of vertex. For example, b-hadrons generated in the collisions can travel a few mm at the LHC before decaying, creating a vertex at the decay point of the b-hadron where the decay products emerge and form a kink. This type of vertex, caused by the decay of some particle, is called a secondary vertex.

The concept of vertex finding is simple: after the tracking, one extrapolates two or more reconstructed tracks towards the region where the vertex is expected to exist. The intersection of the extrapolated tracks can be a vertex.

The actual vertex finding starts by picking up all reconstructed tracks passing some selection criteria that assure the quality of the tracks. The tracks are then examined, by least squares fitting or, equivalently, Kalman filtering, to decide which ones should be associated with the same vertex candidate. A track that significantly worsens the \(\chi ^2\) value is removed from the vertex association or down-weighted in the fit; the latter technique is called adaptive vertex fitting. The testing of the track association continues until the \(\chi ^2\) value stabilises. The fit after this removal process gives us the best possible vertex estimate.
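The remove-and-refit loop can be caricatured in one dimension by collapsing each track to its point of closest approach with an uncertainty; the 3-sigma cut and the numbers below are illustrative only:

```python
def fit_vertex_1d(points, sigmas, n_sigma=3.0):
    """Iterative 1-D vertex fit: take the weighted mean of the track
    points, drop the worst outlier beyond n_sigma, and repeat until
    every remaining track is compatible with the vertex."""
    pts, sigs = list(points), list(sigmas)
    while True:
        w = [1.0 / s ** 2 for s in sigs]
        vertex = sum(wi * p for wi, p in zip(w, pts)) / sum(w)
        worst = max(range(len(pts)),
                    key=lambda i: abs(pts[i] - vertex) / sigs[i])
        if abs(pts[worst] - vertex) < n_sigma * sigs[worst] or len(pts) <= 2:
            return vertex, len(pts)
        del pts[worst], sigs[worst]

# Four tracks compatible with one vertex plus an outlier (secondary decay):
v, n_tracks = fit_vertex_1d([0.01, -0.02, 0.0, 0.02, 3.0], [0.05] * 5)
print(n_tracks)  # 4: the outlier was removed from the association
```

A down-weighting scheme instead of the hard removal here would be the 1-D analogue of the adaptive vertex fit mentioned above.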

In order to improve the track reconstruction and the vertex finding, the track fitting is repeated after finding the vertex, followed by the vertex fitting again. This recursive process is of particular importance in dense environments such as hadron colliders, where not only a huge number of tracks but also many interaction/collision points, hence many vertices, exist per bunch crossing.

Fig. 6.3
figure 3

Event display obtained from real data in the ATLAS experiment. Reprinted under the Terms of Use from [3] ATLAS Experiment © 2022 CERN. All rights reserved. The reconstructed tracks are shown as lines, and the vertices as circles. There are 65 proton-proton interaction points on top of the one where a Z boson is produced and decays into a dimuon pair, shown as yellow lines

At hadron collider experiments, many interactions occur per bunch crossing, as just mentioned. At the LHC, for example, more than 20 interactions occur on average, as shown in Fig. 6.3, where 65 vertices are found while the bunch length is 7.5 cm. Out of these 65 interaction points, the next step is to find the vertex where our interesting event was generated. In the example of Fig. 6.3, two muons appearing from one of the 65 vertices are identified to be consistent with a Z decay. Since such interesting events often have a high momentum transfer between the colliding partons, called hard scattering (see Sect. 2.5), the selection of the vertex is based on some measure that represents the momentum transfer of the collision. For this purpose, the sum of the \(p_{\textrm{T}}\) of the tracks, or the number of tracks associated with each vertex, is frequently used. Note that hadron collider people sometimes call only this hard-scattering interaction point the primary vertex, although in the original definition “primary” means vertices generated by the collisions and “secondary” those from particle decays. They call the other collision-induced vertices the primary vertex “candidates” or pile-up vertices. Readers should be aware of this difference.
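The hard-scattering vertex selection by the scalar sum of track \(p_{\textrm{T}}\) reduces to a one-liner; the \(p_{\textrm{T}}\) values below are invented to mimic two soft pile-up vertices and a Z \(\rightarrow \mu \mu \) vertex:

```python
def select_hard_scatter(vertices):
    """Pick the vertex with the largest scalar sum of track pT.

    vertices: one list of track pT values (GeV) per reconstructed vertex;
    returns the index of the hard-scatter candidate."""
    return max(range(len(vertices)), key=lambda i: sum(vertices[i]))

# Pile-up vertices have many soft tracks; the Z -> mumu vertex wins:
pileup = [[0.6, 0.8, 1.1], [0.5, 0.7]]
z_vertex = [45.0, 38.0, 1.2]          # two hard muons plus a soft track
print(select_hard_scatter(pileup + [z_vertex]))  # 2
```

Replacing `sum(vertices[i])` with `len(vertices[i])` gives the alternative track-multiplicity criterion mentioned in the text.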

Once we know the location of the primary vertex, or the hard-scattering vertex, the momentum vector of a charged particle reconstructed by the tracking is recalculated with respect to the primary vertex. This is one of the pieces of information ultimately necessary in physics analyses using charged particles.

Special care needs to be taken to find secondary vertices, as the location of a secondary vertex is not known a priori, in contrast to the primary vertex finding, where the collision point is known to some degree. The basic procedure is the same as the primary vertex search; the only explicit difference is that the tracks not associated with the primary vertex are the candidates to form the secondary vertex. The idea here is simple, but this selection is sensitive to the performance of the secondary vertex finding, and also to the primary vertex reconstruction. For example, if the selection criteria are tight, i.e. if you select only the tracks whose impact parameter is far enough away from the primary vertex, the reconstruction efficiency of the secondary vertex will be poor, while the reconstructed primary vertex position is free from possible bias due to tracks emerging from the secondary vertex. Therefore, optimisation based on the physics requirements is of particular importance. In the end, the secondary vertex finding allows us to identify the decay positions of \(K_S\), \(\varLambda \), b-hadrons, and so on, which are important ingredients of physics analysis.

6.2 Electron and Photon

6.2.1 Interactions with Materials

Electrons and photons lose their energy by developing a characteristic shower in the electromagnetic (EM) calorimeter, called an EM shower, through the cascade of bremsstrahlung (\(e+\textrm{material}\rightarrow e\gamma \)) and \(e^+e^-\) pair production (\(\gamma +\textrm{material}\rightarrow e^+e^-\)), as shown in Fig. 6.4. These processes are based on the interaction of electrons and photons with the absorber materials, for example lead (Pb) in the ATLAS detector and PbWO\(_4\) crystal in the CMS detector.Footnote 2

Fig. 6.4
figure 4

Schematic view of an EM shower: a photon is injected into a calorimeter

A charged particle such as an electron interacts with the electrons in the atoms (or molecules) of the detector material through the EM interaction, ionising or exciting the atoms. This is how a charged particle loses energy in material, and it is called ionisation loss. This energy loss is described by the Bethe-Bloch formula. In addition, a charged particle radiates photons when it is decelerated in material, which is called bremsstrahlung. The ionisation loss is dominant in the low energy region (low \(\beta \gamma \)), while bremsstrahlung dominates in the high energy region (high \(\beta \gamma \)). The energy at which the ionisation and bremsstrahlung energy losses are equal is called the critical energy. It is about 7.3 MeV for electrons in Pb.

There are three kinds of interactions of a photon with material: the photoelectric effect, Compton scattering and \(e^+e^-\) pair production. At energies above \(2m_\textrm{electron} = 1.022\) MeV, \(e^+e^-\) pair production is dominant. Once an \(e^+e^-\) pair is created, the interactions of electrons with material apply. This is why, to first order, the detector response to a photon is similar to that to an electron, as mentioned in Sect. 5.4.2.

The development of an EM shower stops around the critical energy. One of the key parameters describing calorimeter performance is the radiation length \(X_0\), which is the length over which the energy of an electron is reduced to a fraction 1/e by passing through material. The unit of \(X_0\) is g/cm\(^2\) or cm: \(X_0\) is 0.56 cm for Pb and 0.89 cm for PbWO\(_4\). Typical EM calorimeters have at least 20\(X_0\) (ideally 25\(X_0\)) to stop EM showers. Materials with smaller \(X_0\) can stop EM showers within a smaller space.
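The role of \(X_0\) can be illustrated with the simple exponential attenuation law quoted above; real shower containment requires a proper longitudinal-profile parametrisation, so treat this only as an order-of-magnitude sketch:

```python
import math

def remaining_energy_fraction(depth_x0):
    """Fraction of the initial electron energy remaining after depth_x0
    radiation lengths, using the naive E(x) = E0 * exp(-x / X0) law for
    the primary electron."""
    return math.exp(-depth_x0)

# After 1 X0 the electron keeps a fraction 1/e of its energy:
print(round(remaining_energy_fraction(1.0), 3))   # 0.368
# In this naive single-particle picture, a 25 X0 calorimeter lets only
# a tiny fraction of the primary energy pass through:
print(remaining_energy_fraction(25.0) < 1e-10)    # True
```

This makes it plausible why 20-25 \(X_0\) suffices: the depth needed grows only logarithmically with the incident energy.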

The main materials in which electrons and photons lose their energy are those of the calorimeter. In most actual collider experiments, however, there is material in front of the calorimeter where bremsstrahlung, \(e^+e^-\) pair production, etc. are possible, for example the beam pipe and the inner tracking detectors. An electron can be scattered by the Coulomb force (called multiple scattering) or radiate photons via bremsstrahlung. A photon can be converted into an \(e^+e^-\) pair, as shown in Fig. 6.5, producing oppositely charged particles with a nearly zero opening angle and, in general, an imbalance in momenta. In this case, a positron and an electron are detected in the calorimeter instead of a photon. Such a photon is called a converted photon. The position where the conversion happens, called a conversion vertex, can be obtained from the two charged tracks reconstructed in the inner tracking detector. Photons that are detected in the calorimeter without conversion are often called unconverted photons, to distinguish them from converted photons.

Fig. 6.5
figure 5

Reprinted under the Creative Commons Attribution 4.0 International License from [4] © 2011 CERN for the benefit of the ATLAS Collaboration

A photon interacts with the inner tracking detector and is converted into \(e^+e^-\) (blue and red curves) in the ATLAS detector. The conversion vertex is shown with a brown point.

6.2.2 Reconstruction

Electrons and photons are reconstructed by clustering the cells of the EM calorimeter in which their energies are deposited. Typical algorithms are the sliding-window algorithm [5], the topological-clustering algorithm [6], etc. In the ATLAS experiment, the reconstruction of electrons and photons was based on the sliding-window algorithm with a seed size of \(3\times 5\) cells, while the topological-clustering algorithm was used for the isolation energy calculation [7, 8]. Figure 6.6 shows a cluster (yellow) from the sliding-window algorithm (\(5\times 7\)) and several clusters (red) from the topological-clustering algorithm. To perform the topological clustering, cells are categorised into, for example, three classes, 4\(\sigma \), 2\(\sigma \) and 0\(\sigma \) cells: the 4\(\sigma \) cells are those with an energy four or more times larger than their expected noise (\(\sigma \)), and the 2 (0)\(\sigma \) cells those with an energy of \(2-4\) (\(<2\)) \(\times \) \(\sigma \). In the clustering step, one of the 4\(\sigma \) cells is selected as a seed cell, and its neighbouring 4\(\sigma \) or 2\(\sigma \) cells in the three spatial directions are connected until there are no more neighbouring 4\(\sigma \) or 2\(\sigma \) cells. Finally, all the surrounding 0\(\sigma \) cells are connected to form the clusters used as electron and photon candidates.
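A one-dimensional caricature of the 4-2-0 scheme described above can make the growth rules concrete; real topological clustering works on neighbouring cells in three dimensions, and the energies and unit noise below are invented:

```python
def topo_cluster(energies, noises):
    """Simplified 1-D topological clustering (4-2-0 scheme).

    Seed on cells with E >= 4*sigma, grow through neighbours with
    E >= 2*sigma, then attach the surrounding cells (the 0-sigma layer).
    """
    n = len(energies)
    sig = [e / s for e, s in zip(energies, noises)]  # significance per cell
    clusters = []
    used = set()
    for seed in range(n):
        if sig[seed] < 4 or seed in used:
            continue
        cluster = {seed}
        frontier = [seed]
        while frontier:                      # grow via >= 2 sigma neighbours
            c = frontier.pop()
            for nb in (c - 1, c + 1):
                if 0 <= nb < n and nb not in cluster and sig[nb] >= 2:
                    cluster.add(nb)
                    frontier.append(nb)
        for c in list(cluster):              # attach the 0-sigma boundary
            for nb in (c - 1, c + 1):
                if 0 <= nb < n and nb not in cluster:
                    cluster.add(nb)
        used |= cluster
        clusters.append(sorted(cluster))
    return clusters

# One shower peaking in cell 3 on top of noise (sigma = 1 everywhere):
e = [0.1, 0.3, 2.5, 9.0, 3.1, 0.4, 0.2]
print(topo_cluster(e, [1.0] * 7))  # [[1, 2, 3, 4, 5]]
```

Cell 3 seeds the cluster, cells 2 and 4 join through the 2\(\sigma \) rule, and cells 1 and 5 are attached as the surrounding layer, while the far noise cells stay out.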

Electrons, being charged particles, can also be reconstructed through charged tracks using the inner-tracking-detector information (Sect. 6.1). Clusters matched to a charged track are classified as electrons, and those not matched to any charged track are classified as unconverted photons. The momentum of the matched charged track is recalculated taking into account possible energy loss due to bremsstrahlung in the detectors in front of the calorimeter. A cluster matched to a track pair from a reconstructed conversion vertex, or to a single track that has no hit in the innermost layer of the inner tracking detector, is classified as a converted photon. The energy calibration of electrons and photons is explained in Sect. 5.4.2.

Fig. 6.6

Reprinted under the Creative Commons Attribution 4.0 International License from [7] © CERN for the benefit of the ATLAS collaboration 2019. The \(5\times 7\) cells shown by yellow colour are those obtained using the sliding window algorithm. This is used to obtain electron energy. Other clusters (red colour) are obtained from the topological-clustering algorithm and are used to evaluate the isolation variable, which is calculated using clusters inside \(\varDelta R=0.4\) (a blue region)

Schematic view of clusters for an electron in the ATLAS experiment.

6.2.3 Identification

Reconstructed electrons and photons are candidates for true electrons and photons, respectively. Other particles, which are background to electrons and photons, can also be reconstructed by the same algorithms mentioned above. Such backgrounds are dominated by jets, which are explained in detail later (Sect. 6.4). A jet originates from a gluon or a quark, which produces a set of particles after hadronisation. A jet or a hadron can develop a hadronic shower in the calorimeter. The hadronic shower, shown in Fig. 6.7, has two components: a hadronic and an EM component. The hadronic component produces charged pions, charged kaons, protons, neutrons, etc. through the hadronic (strong) interaction.Footnote 3 In addition, neutral pions are also produced, but they are observed as photons: the lifetime of the neutral pion (\(8.5\times 10^{-17}\) s) is very short, and it immediately decays into two photons. This is the EM component of a hadronic shower.

Fig. 6.7

Schematic views of a hadronic shower: a hadron (\(n, p, \pi ^\pm \), etc.) is injected into a calorimeter

Electrons and photons can be separated from jets and hadrons using the differences between an EM shower and a hadronic shower: the lateral (\(=\)transverse) and longitudinal shower developments are different. For the lateral shower shape, an EM shower is relatively narrow, whereas a hadronic shower is wide since the constituents of a jet (\(\pi ^\pm , K^\pm , p, n, \gamma \), etc.) are spread out. This is the case even for a single hadron, whose hadronic shower can become wider by producing neutrons, etc. For the longitudinal shower shape, a hadronic shower develops into the hadronic calorimeter; in other words, the shower does not stop in the EM calorimeter. For example, when the EM calorimeter has several longitudinal layers, as in the ATLAS detector, the energy deposited in the outer layers of the EM calorimeter is larger for jets and hadrons than for electrons and photons. Variables for the identification can be defined using the calorimeter cells. Such variables are called shower-shape variables: for example, shower widths, ratios of the energy deposited in different layers of the calorimeter, etc. Figure 6.8 shows four variables for electrons in the ATLAS experiment: \(w_{\eta 2}\) and \(R_\eta \) represent the narrowness in the lateral direction, while \(R_\textrm{had1}\) and \(f_3\) describe the shower development in the longitudinal direction. Since more than 10 shower-shape and track variables (if necessary) are used, a multivariate analysis technique such as a combined likelihood, a neural network, or a boosted decision tree is adopted. Possible sources of misidentification for electrons and photons selected with the shower and track variables are given below.
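As a concrete example of a lateral shower-shape variable, \(R_\eta \) is an energy ratio between two cell windows of the second EM layer (the definition quoted in the caption of Fig. 6.8). A toy computation on a 2-D grid of cell energies might look as follows; the function name and the grid layout are illustrative assumptions.

```python
import numpy as np

def r_eta(cells):
    """R_eta as quoted in the text: energy in the central 3x3 cells over the
    energy in the 3x7 window of the EM second layer, centred on the hottest
    cell. A narrow EM shower gives a value close to 1."""
    i, j = np.unravel_index(np.argmax(cells), cells.shape)
    e3x3 = cells[i - 1:i + 2, j - 1:j + 2].sum()
    e3x7 = cells[i - 1:i + 2, j - 3:j + 4].sum()
    return e3x3 / e3x7
```

A narrow (electron-like) deposit concentrated in one cell yields \(R_\eta \approx 1\), while a wide (hadron-like) deposit spread uniformly over the window yields \(R_\eta \approx 9/21 \approx 0.43\), which is the separation the identification exploits.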

Fig. 6.8

Reprinted under the Creative Commons Attribution 4.0 International License from [7] © CERN for the benefit of the ATLAS collaboration 2019. \(w_{\eta 2}\) and \(R_\eta \) are a shower width and a ratio of the energy in \(3\times 3\) cells over the energy in \(3\times 7\) cells in the second layer of the EM calorimeter, respectively. \(R_\textrm{had1}\) is a ratio of the transverse energy (\(E_\textrm{T}\)) in the first layer of the hadronic calorimeter to \(E_\textrm{T}\) of the EM cluster. \(f_3\) is a ratio of the energy in the third layer to the total energy in the EM calorimeter. Signals are electrons from Z and \(J/\psi \) decay and backgrounds are from electron candidates from multijet production, \(\gamma \)+jets etc.

Distributions of two shower shapes for the electron identification from the ATLAS MC simulation studies.

Fig. 6.9

Schematic view of a fake electron from a jet: a charged pion overlaps with photons from a neutral pion decay inside a jet

Jets can be misidentified as electrons (called fake electrons), for example, when a charged pion overlaps with photon(s) from a neutral pion decay, an \(\eta \) decay, and so on inside a jet. This is illustrated in Fig. 6.9. One useful discriminating variable for this type of fake electron is E/p, shown in Fig. 6.10, where E is the energy measured in the calorimeter and p is the momentum measured in the inner tracking detector. For true electrons, E/p should be close to 1 because E and p originate from the same object, but for jets (fake electrons) there is no clear correlation between E and p because different particles can contribute to E and to p. This variable is included in the electron identification.

Fig. 6.10

Reprinted under the Creative Commons Attribution 4.0 International License from [7] © CERN for the benefit of the ATLAS collaboration 2019. E is energy measured in the calorimeter and p is momentum measured in the inner tracking detector

E/p distributions for electrons (signal) and hadrons (background) from the ATLAS MC simulation studies.

Not only jets but also other objects, such as \(\tau \)-jets and converted photons, are misidentified as electrons. A \(\tau \)-jet is misidentified when the \(\tau \) decays hadronically to one charged particle, a so-called one-prong decay (e.g. \(\tau \rightarrow \pi ^\pm \pi ^0\nu \)), for the same reason as for jets, that is, the overlap between the \(\pi ^\pm \) and the photons from the \(\pi ^0\). A simple \(\tau \)-veto algorithm is applied: electron candidates that are strongly identified as \(\tau \)-jets by a \(\tau \) identification are rejected. For converted photons, a cluster can be matched to a single charged track when one of the two conversion tracks is not reconstructed. To reduce such misidentification (see Fig. 6.11), the track is required to be associated with a primary vertex using impact parameters, since tracks from a conversion vertex have large impact parameters. In addition, a hit in the innermost layer of the inner tracking detector is required for electron candidates.

Fig. 6.11

Schematic views of conversions at the first layer (left) and at the beam pipe (right). When two tracks are reconstructed, both cases are categorised into conversions. On the other hand, in case one of the tracks is misreconstructed, only the left is taken as a conversion to reduce fake converted photons. No hit in the first layer of the inner tracking detector is required for conversions

Fig. 6.12

Schematic view of a fake photon from a jet: most of jet’s energy is carried by a neutral pion

Jets are misidentified as photons (called fake photons), for example, when a neutral pion in a jet carries most of the energy of the jet. This is illustrated in Fig. 6.12. In principle, two photons should be observed inside the jet because a neutral pion decays into two photons. To separate a single photon from a pair of photons, the finely segmented first layer is used in the ATLAS experiment, as shown in Fig. 6.13. A single cluster is observed for a photon (left) but two clusters for a \(\pi ^0\) (right).

Fig. 6.13

Reprinted under the Terms of Use from [9] ATLAS Experiment © 2022 CERN. All rights reserved. Photons are injected from the bottom to the top. The energy deposits are shown in yellow. In the right figure, two groups of energy deposits are observed in the first (finely segmented) layer of the calorimeter

Energy deposits in the three layers in the ATLAS EM calorimeter for a photon (left) and two photons from a \(\pi ^0\) (right).

6.3 Muon

Muons can be produced in decays of Higgs bosons, W/Z bosons, quarks, and new particles such as those predicted by SUSY. Therefore, the reconstruction and identification of muons with good quality over a wide range of momentum and solid angle are key to many of the most important physics analyses at energy-frontier experiments.

The muon is a second-generation lepton. Its characteristics are similar to those of the electron except for the mass: an electric charge of \(-e\), a spin of 1/2, and a mass (\(m_\mu \)) of \(105.6583715 \pm 0.0000035~\textrm{MeV}\). Muons hardly develop either an electromagnetic or a hadronic shower in our energy regime, but decay into an electron, an electron anti-neutrino, and a muon neutrino via the weak interaction. Therefore, the mean lifetime of the muon (\(\tau _\mu \)), even when flying in material, is very close to that in vacuum, \(2.1969811\pm 0.0000022~\upmu \textrm{s}\), which is relatively long. A muon with a momentum (\(p_\mu \)) of 1.0 GeV travels around 6 km within one mean lifetime in the laboratory frame;

$$\begin{aligned} c \tau _\mu \beta \gamma = c \tau _\mu \frac{p_\mu }{m_\mu } \approx 3 \times 10^{8}~(\mathrm{m/s}) \times 2.2 \times 10^{-6}~(\textrm{s}) \times \frac{1.0~(\textrm{GeV})}{0.105~(\textrm{GeV})} \approx 6 \times 10^{3}~(\textrm{m}) \end{aligned}$$
(6.1)

where c is the speed of light, \(\displaystyle \beta =\frac{v}{c}\) is the ratio of the muon velocity to the speed of light, and \(\displaystyle \gamma =\frac{1}{\sqrt{1-\beta ^2}}\) is the Lorentz boost factor.
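Equation (6.1) can be checked numerically. A minimal sketch, using the constants quoted above (the function name is an illustrative assumption):

```python
# Numerical check of Eq. (6.1): laboratory-frame mean decay length
# c * tau * beta * gamma, with beta * gamma = p / m.
C = 3.0e8               # speed of light [m/s]
TAU_MU = 2.1969811e-6   # muon mean lifetime [s]
M_MU = 0.1056583715     # muon mass [GeV]

def decay_length(p_gev):
    """Mean decay length [m] of a muon with momentum p [GeV]."""
    return C * TAU_MU * p_gev / M_MU

print(decay_length(1.0))  # a 1 GeV muon: roughly 6.2e3 m per mean lifetime
```

Note that \(c\tau _\mu \approx 659\) m already, so even a modest boost makes the muon effectively stable on detector scales.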

Considering these unique characteristics of the muon, in collider experiments muons can be detected by charged-particle detectors located both inside and outside the calorimeters. In this section, muon reconstruction and identification in collider experiments are described using the ATLAS detector as an example.

6.3.1 Muon Momentum Measurement

Muon reconstruction and identification in ATLAS rely on the inner tracking detector, described in Sect. 3.3.2, and the muon spectrometer (MS). Track reconstruction is first performed independently in the inner tracker and the MS. The information from both is then combined to form the muon tracks that are used in physics analyses.

6.3.1.1 Effect of Multiple Scattering

When a muon passes through the large volume of material in the detectors, the effect of multiple scattering needs to be taken into account. The multiple-scattering angle is regarded as the accumulation of many Rutherford scatterings. The probability of a single Rutherford scattering is inversely proportional to \(\sin ^4{\left( \frac{\theta }{2} \right) }\), where \(\theta \) is the scattering angle. The distribution has a sharp peak at \(\theta = 0\), meaning that \(\theta \) is typically very small. The mean-square multiple-scattering angle is statistically the accumulation of the small-angle Rutherford scatterings, \(\displaystyle \langle \theta ^2 \rangle =\sum _{i} \theta _i^2\). The multiple-scattering angle is approximately Gaussian distributed; the effect of the large-angle scattering, which also occurs for the \(\sin ^4{\left( \frac{\theta }{2} \right) }\) distribution, shows up in the tail of the distribution. The RMS multiple-scattering angle (\(\theta _0 \equiv \sqrt{\langle \theta ^2 \rangle }\)) can be expressed as

$$\begin{aligned} \theta _0 = \frac{13.6~\textrm{MeV}}{\beta c p} \sqrt{\frac{x}{X_0}} \left[ 1 + 0.038 \ln {\left( \frac{x}{X_0} \right) } \right] \propto \frac{1}{p} \sqrt{\frac{x}{X_0}}, \end{aligned}$$
(6.2)

where p and \(\beta c\) are the momentum and velocity of the muon, respectively, and \(x/X_0\) is the thickness of the scattering medium in units of the radiation length (\(X_0\)). If the uncertainty of the muon position measurements, \(\sigma _x\) in Eq. (5.2) or (5.3), is dominated by the multiple scattering in the detector materials, the momentum resolution is independent of \(p_{\textrm{T}}\):

$$\begin{aligned} \frac{\sigma _{p_{\textrm{T}}}}{p_{\textrm{T}}} \propto \sigma _x \cdot p_{\textrm{T}}\propto \theta _0 \cdot p_{\textrm{T}}\propto \frac{1}{p_{\textrm{T}}} \cdot p_{\textrm{T}}\approx \mathrm {const.} \end{aligned}$$
(6.3)
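Equations (6.2) and (6.3) can be verified numerically. The sketch below evaluates \(\theta _0\) and checks that \(\theta _0 \cdot p\) is constant, which is what makes the multiple-scattering term in the resolution \(p_{\textrm{T}}\)-independent (the function name is an illustrative assumption):

```python
import math

def theta_0(p_gev, x_over_x0, beta=1.0):
    """RMS multiple-scattering angle of Eq. (6.2) [rad], for momentum
    p [GeV] and a scatterer of thickness x/X0 radiation lengths."""
    return (13.6e-3 / (beta * p_gev)) * math.sqrt(x_over_x0) \
        * (1.0 + 0.038 * math.log(x_over_x0))

# theta_0 scales as 1/p, so theta_0 * p (and hence sigma_pT/pT) is constant:
print(theta_0(10.0, 0.1) * 10.0, theta_0(100.0, 0.1) * 100.0)
```

For a 1 GeV muon crossing 0.1 radiation lengths, \(\theta _0 \approx 4\) mrad, a useful order of magnitude to keep in mind when judging tracker alignment requirements.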

6.3.1.2 Contributions to Muon Momentum Resolution

The uncertainty of the position measurement \(\sigma _x\) usually comes from the accuracy of the hit-position measurement limited by the detector characteristics, the misalignment of detectors, multiple scattering in the detector, and fluctuations in the energy loss of muons traversing the material in front of the spectrometer. Figure 6.14 shows the contributions to the momentum resolution for the ATLAS MS as a function of transverse momentum [10]. The contribution of multiple scattering is independent of the transverse momentum and dominates at moderate momentum (\(30< p_{\textrm{T}}< 300\) GeV), while the contributions of the hit-position resolution (denoted as “Tube resolution and autocalibration” in Fig. 6.14) and the detector (chamber) alignment grow in proportion to \(p_{\textrm{T}}\) and dominate at high momentum (\(p_{\textrm{T}}>300\) GeV). At low momentum (\(p_{\textrm{T}}<30\) GeV), energy-loss fluctuations become dominant.

The ATLAS MS is designed to detect muons in the pseudorapidity region up to \(| \eta | = 2.7\) and to provide momentum measurements with a relative resolution better than 3% over a wide \(p_{\textrm{T}}\) range and up to 10% at \(p_{\textrm{T}}\approx 1~\textrm{TeV}\). In order to satisfy these requirements, the measurement precision of each hit on a muon track is required to be typically better than 100 \(\upmu \)m, which can be roughly estimated from Eq. (5.2). The uncertainty of the alignment of the chamber positions is required to be at the level of 30 \(\upmu \)m.

Fig. 6.14

Reprinted under the Creative Commons Attribution 3.0 License from [10] © 1997-2022 CERN

Contributions to the muon momentum resolution for the ATLAS MS as a function of transverse momentum.

Figure 6.15 shows the muon momentum resolution for the MS alone and for the combined measurement by the MS and the inner tracker [10]. At low momentum (\(p_{\textrm{T}}<30\) GeV), the measurement by the inner tracker is better owing to the better spatial resolution of the silicon strip and pixel detectors. On the other hand, at high momentum (\(p_{\textrm{T}}>30\) GeV), the measurement by the MS becomes better than that of the inner tracker because the MS occupies a much larger volume, which means L is larger in Eq. (5.2).

Fig. 6.15

Reprinted under the Creative Commons Attribution 3.0 License from [10] ATLAS Collaboration © 1997 CERN. The dashed curve is the resolution using only the inner tracker

The muon momentum resolution for the muon spectrometer alone and the combined measurements by the ATLAS MS and the ATLAS inner tracker as a function of the transverse momentum.

6.3.2 Examples of Muon Detectors

Because the muon spectrometers have to cover the wide surface area of the barrel and endcaps of the cylindrical detector system, they are required to be robust, mechanically strong, and inexpensive, as well as to provide good momentum resolution and high efficiency. Because muons give clear signatures of interesting physics processes such as \(H \rightarrow ZZ^* \rightarrow 4\mu \), the muon spectrometers are also used as trigger devices, providing fast information on the momenta, positions and multiplicity of muons traversing the detector. This is called the first-level muon trigger, which makes a trigger decision within a few microseconds using simple trigger logic implemented in hardware. Gas detectors satisfy these requirements. For instance, the ATLAS MS, shown in Figs. 6.16 and 6.17, consists of resistive plate chambers (RPC) and thin gap chambers (TGC) to provide fast muon trigger information, and monitored drift tube (MDT) chambers and cathode strip chambers (CSC) to reconstruct muon trajectories precisely. The ATLAS MS is divided into a barrel part (\(| \eta | < 1.05\)) and two endcaps (\(1.05< | \eta | < 2.7\)).

Fig. 6.16

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved

The muon spectrometer for the ATLAS experiment.

Fig. 6.17

Reprinted under the Creative Commons Attribution 3.0 License from [10] ATLAS Collaboration © 1997 CERN

The cross-section of the ATLAS muon spectrometer: r-z view (Left) and r-\(\phi \) view (Right).

Three large superconducting air-core toroid magnets provide magnetic fields with a bending integral of about 2.5 T\(\cdot \)m in the barrel and up to 6 T\(\cdot \)m in the endcaps, allowing the muon momentum to be measured independently of the inner tracking system with its solenoid magnet (Fig. 6.18). In the following sections, the RPC, TGC and MDT used in the ATLAS muon spectrometer are introduced as examples of muon chambers [11].

Fig. 6.18

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved

The magnetic fields provided by ATLAS toroid magnet.

6.3.2.1 Resistive Plate Chamber

In the barrel region (\(|\eta | \le 1.05\)), trigger signals are provided by a system of resistive plate chambers (RPCs). The RPC is a gaseous parallel-electrode-plate detector providing a typical space-time resolution of 1 cm \(\times \) 1 \(\textrm{ns}\) with digital readout. The mechanical structure of an RPC is shown in Fig. 6.19. Two resistive plates, made of phenolic-melaminic plastic laminate, are kept parallel to each other at a distance of 2 mm by insulating spacers. The gas gaps are filled with a gas mixture of \(\textrm{C}_2\textrm{H}_2\textrm{F}_4/\textrm{Iso}\)-\(\textrm{C}_4\textrm{H}_{10}/\textrm{SF}_6\) (94.7/5/0.3). The electric field between the plates of about \(4.9~\mathrm{kV/mm}\) allows avalanches to form along the ionising tracks towards the anode. Since all primary electron clusters form avalanches simultaneously in the strong and uniform electric field, a single signal is produced promptly after the passage of the particle. The intrinsic time jitter is less than 1.5 ns. The signal is read out via capacitive coupling to metallic strips mounted on the outer faces of the resistive plates. The total jitter of the RPC is less than 10 ns, which makes it possible to identify the 25 ns proton bunch crossing and to produce fast trigger signals. The readout pitch of the \(\eta \) and \(\phi \) strips is 23–35 mm. The \(\eta \) and \(\phi \) strips provide the bending view of the trigger detector and the second-coordinate measurement, respectively. The second-coordinate measurement, which cannot be provided by the MDT chambers (see Sect. 6.3.2.3), is also required for the offline pattern recognition.

Fig. 6.19

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved. The unit of the number in the figure is mm

Mechanical structure of an RPC chamber.

The RPC system is made up of three stations, each with two detector layers. Two stations, installed 50 cm from each other, are located near the centre of the magnetic field region and provide the low-\(p_\textrm{T}\) trigger (\(p_\textrm{T} > 6\) GeV), while the third station, at the outer radius of the magnet, allows the detection of muon trajectories with larger curvature and increases the \(p_\textrm{T}\) threshold to 20 GeV, thus providing the high-\(p_\textrm{T}\) trigger. The trigger logic requires three out of the four layers in the middle stations for the low-\(p_\textrm{T}\) trigger and, in addition, one of the two outer layers for the high-\(p_\textrm{T}\) trigger (Fig. 6.20).
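The majority-coincidence scheme just described can be expressed compactly. This is an illustrative sketch of the logic, not the actual ATLAS hardware implementation; the function name and the boolean-list interface are assumptions.

```python
def rpc_trigger(middle_layers, outer_layers):
    """Toy RPC coincidence logic: low-pT fires on >=3 of the 4 middle-station
    layers; high-pT additionally needs >=1 of the 2 outer-station layers.
    Layers are given as 0/1 hit flags."""
    low_pt = sum(middle_layers) >= 3          # 3-out-of-4 majority coincidence
    high_pt = low_pt and sum(outer_layers) >= 1
    return low_pt, high_pt
```

Requiring a majority rather than all four layers keeps the trigger efficient against single-layer inefficiencies while still suppressing random noise coincidences.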

Fig. 6.20

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved. Two stations of the RPC are below and above middle station of MDT chamber. Outer station is above the MDT in the large and below the MDT in the small sectors. Dimensions are in mm

Cross-section of the upper part of the barrel muon spectrometer.

6.3.2.2 Thin Gap Chamber

In the endcap region (\(1.05 \le |\eta | \le 2.4\)), trigger signals are provided by a system of thin gap chambers (TGCs). TGCs are multi-wire proportional chambers with the characteristic that the wire-to-cathode distance of 1.4 mm is smaller than the wire-to-wire distance of 1.8 mm, as shown in Fig. 6.21. The gas is a mixture of \(\textrm{CO}_2\) and n-\(\textrm{C}_5\textrm{H}_{12}\) (n-pentane) (55 : 45). The TGC operates in a quasi-saturated mode with a gas gain of about \(3 \times 10^{5}\). The high voltage on the wires (around 2800 V) and the small wire-to-wire distance allow the muon trajectory to be measured with a time resolution good enough to identify the 25 ns proton bunch crossing. The number of wires in a wire group varies from 6 to 31 as a function of \(\eta \), in order to match the granularity to the required momentum resolution. The wire groups measure the \(\eta \) coordinate of the muon trajectory. Two of the copper layers in the triplet and doublet modules, marked as “Cu stripes” in Fig. 6.21, are segmented into readout strips to read the azimuthal coordinate (\(\phi \)) of the muon trajectory.

Fig. 6.21

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved

TGC structure showing anode wire, graphite cathodes, G-10 layers and a pick-up strip, orthogonal to the wires (top) and cross-section of a TGC triplet and doublet module (bottom).

The inner wheel, formed by doublet modules, is placed in front of the endcap toroidal magnet, while the big wheel consists of seven layers (a triplet module plus two doublet modules), as shown in Fig. 6.22, and measures the muon trajectory in the bending direction of the toroidal magnet.

Fig. 6.22

Reprinted under the Terms of Use from [12] ATLAS Experiment © 2006 CERN. All rights reserved. The diameter of the Big wheel is about 25 m

Big wheel of TGC chamber.

6.3.2.3 Monitored Drift Tube

Over most of the \(\eta \) range, a precise measurement of the track coordinates in the principal bending direction of the toroidal magnetic field is provided by monitored drift tube (MDT) chambers. The MDT system achieves a sagitta accuracy of 60 \(\upmu \)m, corresponding to a momentum resolution of about 10% at \(p_{\textrm{T}}=1~\textrm{TeV}\).

The basic element of the MDT is a pressurised drift tube with a diameter of 29.970 mm, operating with Ar/\(\textrm{CO}_2\) gas (93% : 7%) at 3 bar. The electrons resulting from ionisation are collected at the central tungsten-rhenium wire, which has a diameter of 50 \(\upmu \)m and is held at a potential of 3080 V, as shown in the left figure of Fig. 6.23. The average drift velocity of the electrons is about 20.7 \(\upmu \)m/ns and the maximum drift time is about 700 ns. Making use of the radius-to-drift-time relation (r-t relation), the distance of a muon track passing through the tube from the anode wire can be measured as a drift circle. The shape of the r-t relation, which depends on parameters such as temperature, pressure, the magnetic field, and distortions caused by the positive ions produced in the ionisation, must be known with high accuracy in order to achieve a good spatial resolution.
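As a rough consistency check of the numbers quoted above, a deliberately simplified linear r-t relation can be sketched as follows. The real r-t relation is non-linear and must be calibrated, as noted in the text; the function name and the linear model are illustrative assumptions.

```python
V_DRIFT = 20.7e-3        # average drift velocity [mm/ns], from the text
T_MAX = 700.0            # maximum drift time [ns], from the text
R_TUBE = 29.970 / 2.0    # tube radius [mm]

def drift_radius(t_ns):
    """Toy linear r-t relation: drift radius [mm] for a measured drift time.
    v * t_max = 20.7 um/ns * 700 ns ~ 14.5 mm, roughly the tube radius."""
    if not 0.0 <= t_ns <= T_MAX:
        raise ValueError("drift time outside the tube range")
    return min(V_DRIFT * t_ns, R_TUBE)
```

Note that the quoted average velocity times the maximum drift time (\(\approx 14.5\) mm) is indeed close to the tube radius (\(\approx 15.0\) mm), so the quoted numbers are mutually consistent.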

Fig. 6.23

Reproduced by permission of IOP Publishing from [11] © IOP Publishing Ltd and SISSA. All rights reserved

Left: the cross-section of the MDT drift tube. Right: the mechanical structure of a MDT chamber.

The mechanical structure of an MDT chamber is shown in the right figure of Fig. 6.23. A chamber consists of two multi-layers of three or four drift-tube layers each. In order to monitor the internal geometry of the chamber, four optical alignment rays, two parallel and two diagonal, are installed; this is why the drift-tube detector in the ATLAS experiment is called the “Monitored” Drift Tube. The 1,150 MDT chambers are constructed from 354,000 tubes and cover an area of 5,500 \(\textrm{m}^2\). Each MDT chamber provides the information of a track segment. Muon tracks are reconstructed from the track segments obtained in the inner, middle and outer stations of the MDT chambers.

6.3.3 Muon Reconstruction

The muon reconstruction can be performed independently in the inner tracker and the MS. In the inner tracker, muons are reconstructed like any other charged particle, as described in Sect. 6.1. This section therefore focuses on the muon reconstruction in the MS and the combined muon reconstruction. More detail on the muon reconstruction in the ATLAS experiment is given in Ref. [13].

Using the drift circles in the MDTs and the clusters in the TGCs and RPCs, the muon reconstruction is subdivided into three stages: segment-finding, segment-combining and track-fitting.

Segment-finding starts with a search for hit patterns in a single station (i.e. the inner, middle and outer stations of the MDT, RPC and TGC chambers in the case of the ATLAS MS) to form track segments. The Hough transform is used to search for hits aligned on a trajectory in the detector. The track segments are reconstructed by a straight-line fit to the hits found in each layer.
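A minimal Hough transform for straight-line finding can be sketched as follows. This is a toy version of the technique named above, not the ATLAS code: each hit votes for all lines through it in \((\theta , \rho )\) parameter space, and the most-voted bin gives the segment candidate. The function name and binning are illustrative assumptions.

```python
import math
from collections import Counter

def hough_line(hits, n_theta=180, rho_bin=1.0):
    """Toy Hough transform: each hit (x, y) votes for lines
    rho = x cos(theta) + y sin(theta); the bin with the most votes is
    returned as (theta, rho, votes)."""
    acc = Counter()
    for x, y in hits:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(k, round(rho / rho_bin))] += 1   # accumulate a vote
    (k, rho_idx), votes = acc.most_common(1)[0]
    return math.pi * k / n_theta, rho_idx * rho_bin, votes
```

Because collinear hits all vote for the same \((\theta , \rho )\) bin while noise hits scatter their votes, the peak in the accumulator is robust against isolated noise hits, which is why the method suits pattern recognition in a single station.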

Full-fledged track candidates are built from segments, typically starting from the middle stations of the detector, where trigger hits from the TGCs or RPCs are available, and extrapolating back through the magnetic field to the segments reconstructed in the inner stations. Whenever a matching segment is found, it is added to the track candidate. The final track-fitting procedure takes into account all relevant effects: multiple scattering, non-uniformity of the magnetic field, inter-chamber misalignment, etc.

The physics analyses make use of four muon types.

  • Combined muon: muon tracks reconstructed independently by the inner tracker and the MS are combined with a global refit using the hits from both the inner tracker and the MS. In order to improve the fit quality, MS hits may be added to or removed from the track. Most muon tracks are reconstructed with the outside-in approach, where the muons are first reconstructed in the MS, then extrapolated inward and matched to a track reconstructed in the inner tracker. An inside-out reconstruction, which proceeds in the opposite direction, is also used as a complementary approach.

  • Segment-tagged muon: a track in the inner tracker is classified as a muon if it is associated with at least one local track segment in the MDT stations. Segment-tagged muons are used for low-\(p_{\textrm{T}}\) muons or for muons passing through \(\eta -\phi \) regions that are not fully covered by the MS stations.

  • Calorimeter-tagged muon: a track in the inner tracker is identified as a muon if it is associated with an energy deposit in the calorimeter compatible with a minimum-ionising particle (MIP). Muons passing through \(\eta -\phi \) regions where the MS coverage is incomplete are recovered as this type of muon.

  • Extrapolated muon: a muon track reconstructed based only on the MS, with a loose requirement on compatibility with originating from the interaction point. The track parameters of the muon are defined at the interaction point, taking into account the estimated energy loss of the muon in the calorimeters. Extrapolated muons are used to extend the acceptance of the muon reconstruction into the region not covered by the inner tracker.

When the same inner-tracker track is identified by two muon types, priority is given to combined muons, then to segment-tagged muons, and finally to calorimeter-tagged muons.

6.3.4 Muon Identification

Although muon candidates reconstructed by the muon spectrometer are mostly true muons, we want to identify their origin. Muons from the decays of heavy particles such as W, Z or Higgs bosons, or of new particles, are interesting for us and need to be reconstructed as “isolated” muons efficiently and precisely. Since muons from semi-leptonic decays of b- and c-hadrons and from \(\tau \) decays are also important for b-tagging and \(\tau \) identification, respectively, they need to be reconstructed as muons inside heavy-flavour jets and \(\tau \)-jets. On the other hand, muons from the decays of pions and kaons are regarded as “fake” muons and are eliminated from the muon candidates.

Muon candidates originating from in-flight decays of charged hadrons, mainly pion and kaon decays in the inner tracker, are often reconstructed with a distinctive kink in the track. Therefore, it is expected that the fit quality of the resulting combined track is poor and that the momenta measured by the MS and the inner tracker are not compatible. Muon identification is performed by applying quality requirements to suppress such background, to select prompt muons with high efficiency, and to guarantee a robust momentum measurement. The number of hits in the inner tracker and the MS, the \(\chi ^2\) of the combined muon track, and the difference between the transverse momentum measurements in the inner tracker and the MS relative to their uncertainties are used to classify muons into “Loose”, “Medium”, “Tight” and “High \(p_{\textrm{T}}\)” categories (the last for muons above 100 GeV, aimed at muons from exotic particles such as \(Z'\) and \(W'\) bosons). These categories are provided to address the specific needs of different physics analyses.

6.3.5 Muon Isolation

Muons originating from the decay of heavy particles such as W, Z or Higgs bosons are often produced isolated from the other particles, in contrast to the muons from semi-leptonic hadron decays such as \(b \rightarrow c\mu \nu \), which are embedded in jets. The measurement of the detector activity around a muon candidate, referred to as muon isolation, is a powerful tool for background rejection in many physics analyses. Both track-based and calorimeter-based isolation variables are often used.

The track-based isolation variable \(p_\textrm{T}^\textrm{varcone30}\) is defined as the scalar sum of the transverse momenta of the tracks with \(p_\textrm{T} > 1\) GeV in a cone of size \(\varDelta R = \min (10~\textrm{GeV}/p_\textrm{T}^{\mu }, 0.3)\) around the muon. The muon track itself is excluded from \(p_\textrm{T}^\textrm{varcone30}\). The cone size is thus chosen either to be \(p_\textrm{T}\)-dependent (\(\varDelta R = 10~\textrm{GeV}/p_\textrm{T}^\mu \)) or \(p_\textrm{T}\)-independent (\(\varDelta R = 0.3\)); the \(p_\textrm{T}\)-dependent cone size improves the performance for isolated muons with high transverse momentum. The calorimeter-based isolation variable \(E_\textrm{T}^\textrm{topocone20}\) is defined as the sum of the transverse energies of the topological clusters in a cone of size \(\varDelta R = 0.2\) around the muon. The isolation selection criteria are applied to the relative isolation variables, defined as \(p_\textrm{T}^\textrm{varcone30}/p_\textrm{T}^{\mu }\) and \(E_\textrm{T}^\textrm{topocone20}/p_\textrm{T}^{\mu }\). Several selection criteria are provided to address the specific needs of different physics analyses.
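The track-based definition above translates directly into code. A minimal sketch, assuming tracks are given as \((p_\textrm{T}, \eta , \phi )\) tuples with the muon's own track already removed (the function names are illustrative assumptions):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the phi difference wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def pt_varcone30(mu_pt, mu_eta, mu_phi, tracks, pt_min=1.0):
    """Scalar sum of track pT [GeV] with pT > 1 GeV inside the variable cone
    Delta R = min(10 GeV / pT_mu, 0.3), as defined in the text."""
    cone = min(10.0 / mu_pt, 0.3)
    return sum(pt for pt, eta, phi in tracks
               if pt > pt_min and delta_r(mu_eta, mu_phi, eta, phi) < cone)
```

For a 50 GeV muon the cone shrinks to \(\varDelta R = 0.2\); this shrinking is what keeps the decay products of a boosted resonance from spoiling the isolation of its own muons.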

6.3.6 Momentum Scale and Resolution

Although the simulation contains a detailed description of the detector, its description of the momentum scale and the momentum resolution is not perfect. For this reason, corrections to the simulated values are often applied. The momentum scale and resolution are parameterised by the following equation:

$$\begin{aligned} p_\textrm{T}^\textrm{Cor} = \frac{\displaystyle p_\textrm{T}^\textrm{MC} + \sum ^{1}_{n=0}s_n(\eta , \phi )\times \left( p_\textrm{T}^\textrm{MC}\right) ^n}{\displaystyle 1+ \sum ^{2}_{m=0}\varDelta r_m(\eta , \phi ) \times \left( p_\textrm{T}^\textrm{MC}\right) ^{m-1} g_m} \end{aligned}$$
(6.4)

where \(p_\textrm{T}^\textrm{MC}\) is the uncorrected transverse momentum in simulation, the \(g_m\) are normally distributed random variables with zero mean and unit width, and \(\varDelta r_m(\eta , \phi )\) and \(s_n(\eta , \phi )\) are the parameters representing the smearing of the momentum resolution and the scale corrections applied in a specific \((\eta , \phi )\) detector region, respectively.

The corrections to the momentum resolution are described by the denominator of Eq. (6.4), assuming that the relative \(p_\textrm{T}\) resolution can be parameterised by

$$\begin{aligned} \frac{\sigma {(p_\textrm{T})}}{p_\textrm{T}} = \frac{r_0}{p_\textrm{T}} \oplus r_1 \oplus r_2 \cdot p_\textrm{T} \end{aligned}$$
(6.5)

with \(\oplus \) denoting a sum in quadrature. As shown in Sect. 6.3.1, the second and third terms of Eq. (6.5) account mainly for multiple scattering and for resolution effects caused by the spatial resolution of the hit measurements and the misalignment of the muon spectrometer, respectively. The first term accounts for fluctuations of the energy loss in the detector material. The difference in momentum resolution between data and simulation is parameterised by \(\varDelta r_m(\eta , \phi )\). The momentum in simulation is smeared with the \(\varDelta r_m(\eta , \phi )\) by dividing the uncorrected muon momentum by the denominator of Eq. (6.4).

The numerator in Eq. (6.4) describes the momentum scale. The term \(s_1(\eta , \phi )\) corrects for inaccuracies in the description of the magnetic field integral and of the detector dimension in the direction perpendicular to the magnetic field, while \(s_0(\eta , \phi )\) corrects for the energy loss in the detector material.
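A minimal sketch of how the correction of Eq. (6.4) might be applied to a simulated muon is given below. The parameter values used in the usage note are illustrative assumptions, not measured calibration constants.

```python
import random

def correct_pt(pt_mc, s, dr, rng=random.Random(0)):
    """Apply the scale/smearing correction of Eq. (6.4) to a simulated
    transverse momentum pt_mc (GeV).  s = (s0, s1) are the scale
    parameters and dr = (dr0, dr1, dr2) the resolution-smearing
    parameters for one (eta, phi) region; the g_m are unit Gaussians.
    Illustrative sketch only."""
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    num = pt_mc + sum(s[n] * pt_mc**n for n in range(2))
    den = 1.0 + sum(dr[m] * pt_mc**(m - 1) * g[m] for m in range(3))
    return num / den
```

With all \(\varDelta r_m\) set to zero only the scale correction acts: for example, `correct_pt(50.0, s=(0.1, 0.01), dr=(0.0, 0.0, 0.0))` returns \(50 + 0.1 + 0.01 \times 50 = 50.6\) GeV.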

The momentum scale and resolution are usually studied using \(J/\psi \rightarrow \mu \mu \) and \(Z \rightarrow \mu \mu \) decays. Since the \(J/\psi \) and Z are narrow resonances and their masses are well known, the distributions of the invariant mass reconstructed from the two muons show clear peaks around 3 GeV and 91 GeV [14], respectively. Furthermore, the number of non-resonant background events from decays of light and heavy hadrons and from continuum Drell-Yan production is very small. The momentum scale and resolution are determined from data using a fit with templates derived from simulation, which compares the invariant mass distributions of \(J/\psi \rightarrow \mu \mu \) and \(Z \rightarrow \mu \mu \) candidates in data and simulation. The momentum in the range 5 GeV\(<p_\textrm{T}<\)20 GeV is corrected using \(J/\psi \rightarrow \mu \mu \) candidates and that in 20 GeV\(<p_\textrm{T}<\)300 GeV using \(Z \rightarrow \mu \mu \) candidates. Figure 6.24 shows the invariant mass distributions of \(J/\psi \rightarrow \mu \mu \) (left) and \(Z \rightarrow \mu \mu \) (right) candidate events reconstructed with combined muons [13]. The agreement between data and simulation becomes much better after the correction.

Fig. 6.24
figure 24

Reprinted under the Creative Commons Attribution 4.0 International License from [13] © CERN for the benefit of the ATLAS collaboration 2016. The upper panels show the invariant mass distribution for data and for the signal simulation, and for background estimate. The points show the data, the continuous line shows the simulation with the corrections of momentum scale and resolution, and the dashed lines show the simulation without the corrections

Dimuon invariant mass distribution of \(J/\psi \rightarrow \mu \mu \) (left) and \(Z \rightarrow \mu \mu \) (right) candidate events reconstructed with combined muons.

6.4 Jet Identification

6.4.1 Fragmentation: Partons to Particles

This section describes the identification and reconstruction of jets. A jet in high-energy physics is, naively speaking, a bunch of hadrons emitted in nearby directions. It is the observable consequence of parton(s) fragmenting into multi-hadron states. Here we give a short introduction to how we understand the “fragmentation” process, i.e. the underlying physics of partons transforming into long-lived hadrons, followed by a discussion of algorithms to identify and reconstruct jets.

The partons, i.e. quarks and gluons, obey the dynamics described by QCD with one and only one parameter, the strong coupling constant \(\alpha _S\). The coupling constant becomes smaller with the energy of the interaction, as a consequence of the renormalisation group equation, as shown in Fig. 2.6. The energy scale, denoted as \(\mu \), is given by the centre-of-mass energy of the partons in question. Since the energies involved in each parton reaction are not measurable, the choice of the energy scale for a process is, however, not uniquely given, and we leave the discussion to elsewhere. Here we merely point out that there are many choices: it could be the centre-of-mass energy or the transverse momentum of the two partons when discussing parton-parton collisions, often summed in quadrature with a heavy quark mass if a heavy quark is involved, or the mass of the particles (\(W, Z, \varUpsilon \) ...) when discussing particle decays.

Now let us take a simple example, the decay of a \(Z^0\) boson into a \(q\bar{q}\) pair, to understand how a parton fragments into a multi-hadron state. The energy scale is given as \(\mu = m_{Z^0}\). In this case, the quarks cannot be a pair of top quarks due to energy conservation. The quarks move fast and may radiate additional gluons, since each quark feels the force from the other quark through the colour charge it carries. The force is “strong”, as \(\alpha _S\) is about 0.1 at the mass scale of \(m_{Z^0}\). The radiated gluon is soft in most cases, i.e. typically collinear and/or carrying a small momentum fraction with respect to the parent quark, as in bremsstrahlung. But with a small probability, the gluon may be emitted at a large angle from both quarks and may carry a large momentum fraction, i.e. the radiated gluon may be hard.

The gluons and quarks still feel the colour force and may branch further, a gluon splitting into a \(q\bar{q}\) pair, \(g \rightarrow q\bar{q}\), or radiating another gluon, \(g \rightarrow gg\). The splitting of partons is repeated: this process is often called the “parton shower”. After some steps of radiation, the partons have branched into a many-parton state, most of them having another close-by parton, with invariant masses between these parton pairs much smaller than the initial mass \(m_{Z^0}\). In such a situation, the coupling constant describing the interaction of the partons becomes much larger, say 0.3 rather than 0.1. This accelerates the fragmentation process, and eventually all partons have invariant masses with their nearest partons below 1 GeV. This is the energy scale of \(\varLambda _{QCD}\), below which perturbative QCD (pQCD) is no longer applicable; one cannot describe the branching of partons by perturbation theory and needs help from a non-perturbative approach, e.g. lattice QCD.
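The degradation of the scale through repeated branching can be caricatured in a few lines. This is only a cartoon of the scale evolution under loudly stated assumptions (a random flat momentum fraction, no splitting functions, no angular ordering), not a real parton-shower algorithm; the function name and cutoff are illustrative.

```python
import random

def toy_shower(scale, cutoff=1.0, rng=random.Random(1)):
    """Cartoon of repeated 1 -> 2 branching: each step splits the
    available scale by a random fraction z, and a branch stops
    radiating once its scale falls below a ~1 GeV (Lambda_QCD-like)
    cutoff.  Returns the list of final branch scales."""
    if scale < cutoff:
        return [scale]
    z = rng.uniform(0.1, 0.9)   # momentum fraction of the branching
    return toy_shower(z * scale, cutoff, rng) + toy_shower((1 - z) * scale, cutoff, rng)

# start at the Z mass scale, as in the example in the text
final = toy_shower(91.0)
```

By construction the total scale is shared among the branches, and every branch ends below the cutoff, mimicking how a single hard parton ends up as many low-virtuality objects.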

The lattice QCD calculation tells us that the potential energy of a \(q\bar{q}\) pair is linear in the distance r between the pair, \(U(r) \propto r\). This means that the field lines of the strong force have roughly constant density and are concentrated in a tube-like region. The stored energy grows as the distance becomes larger. Once the stored energy exceeds the mass of two quarks, the total energy is lowered if a \(q\bar{q}\) pair is produced and the force lines are cut. This process continues until the relative distances between all the colour-neutral \(q\bar{q}\) pairs become so short that there no longer remains enough energy to produce an additional \(q\bar{q}\) pair, which is the end of the showering process. All the quarks and gluons are in bound states, i.e. mesons or baryons, at this stage. This last part of the transition is called hadronisation.

Prior to the discussion of jet algorithms, we should check whether the input to the algorithm, the four-momenta of the final state objects, is well defined. The hadronic final state can clearly be defined once we set a threshold on the lifetime of the final state particles. The boundary is often chosen such that B mesons and charm mesons decay but charged pions do not. The intermediate parton states before hadronisation, on the other hand, are less obvious in their definition and require certain criteria, which we discuss in the following section. One should also note that fragmentation is a process described by quantum field theory, and it is in principle not possible to assign a given parton or hadron to definite parents: what the theory gives us is the probability with which the daughter particle is assigned to given parents, unless the lifetime of the parent particle is long enough that quantum effects are negligible.

6.4.2 Defining Jets

While it is impossible to have a unique one-to-many correspondence between a parton and hadrons, one may still imagine that a spray of particles, or a jet, originates from a quark or a gluon, if it looks collimated and is away from the other activity, and may wish to relate the jet to the underlying parton. This relies on the fact that parton emission is mostly collinear if the parents of the hadronised partons have sufficiently high energy and move fast. In practice, however, jets are still “ambiguous”: for two close-by jets or a wide jet, it is often not straightforward to know whether one, two or more partons should be associated with the jets. A hadron away from the sprays may also be ambiguous in such an assignment, or left unassociated with any jet.

Things are even more complicated if we are to reconstruct jets using the detector information. Not only are the momenta of the particles smeared by the detector, but a significant fraction of the particles may escape detection. Also, a calorimeter measurement cannot resolve two close-by hadrons: their energy clusters may merge into a single cluster if the distance between the two hadrons at the calorimeter is less than a certain value.

This also applies when we extend the concept of jets to partons. Parton branching is also a quantum-mechanical process, and the final state partons are not uniquely related to their parent partons, as described above.

These facts all call for some definition of jets, i.e. a jet algorithm that defines the number of jets and their momenta. The algorithm has to be independent of the type of input: a few “primary” partons, many partons after the parton shower, hadrons before or after meson decays, or detector measurements. The algorithm should also identify, count and reconstruct the momenta of hard, i.e. high momentum, partons, while soft emissions near the hard partons should be absorbed into them, or discarded. In this sense, the algorithm should be insensitive to soft emissions, or ‘infrared safe’. In fact, the procedure of absorbing soft particles into harder jets is somewhat analogous to the procedure of renormalisation in theoretical calculations.

6.4.3 Jet Algorithms

Historically, there have been two kinds of jet algorithms used in high-energy physics, one called the cone algorithm and the other the cluster algorithm. The cone algorithm moves around a window defining the jet area, a circle in \(\eta -\phi \) space, \(\varDelta r = \sqrt{\varDelta \eta ^2 + \varDelta \phi ^2} < R\), where \(\varDelta r\), \(\varDelta \eta \) and \(\varDelta \phi \) are the distances between the jet centre and the position of the particle. R is called the cone radius, giving the angular boundary of jets. The algorithm iteratively searches for an energetic cluster, for which the sum of the transverse momenta \(p_\textrm{T}^\textrm{jet}\) of the particles inside the circle is maximal. The cluster is said to be a jet if \(p_\textrm{T}^\textrm{jet} > p_\textrm{T}^\textrm{thr}\), the jet threshold, which is the parameter ensuring that the jets are hard. After the algorithm is run, one may find many jets, but also many particle clusters below \(p_\textrm{T}^\textrm{thr}\), which are not qualified as jets and are discarded. This feature is suitable for hadron-hadron collisions, where hard jets are accompanied by many soft particles arising from soft emissions from beam remnants, particles from soft underlying events (rescattering of the outgoing proton remnants) and multi-parton interactions, often collectively called “underlying events”. In addition, particles from soft collisions pile up on the hard partons in high-luminosity collisions such as the main LHC runs (see Sect. 6.4.4).
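The cone idea can be sketched as follows. This is a deliberately simplified, hypothetical sketch: real cone algorithms iterate the cone axis to a stable centroid and handle seed choice and overlaps, all of which are omitted here.

```python
import math

def delta_r(a, b):
    """Eta-phi distance between two (pt, eta, phi) particles."""
    dphi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(a[1] - b[1], dphi)

def cone_jets(particles, R=0.4, pt_thr=10.0):
    """Simplified cone finder: try each remaining particle as a cone
    centre, promote the highest-sum-pT cone to a jet if its pT exceeds
    pt_thr, remove its constituents, and repeat.  Clusters below the
    threshold are discarded, as described in the text."""
    left, jets = list(particles), []
    while left:
        best = max(left, key=lambda c: sum(p[0] for p in left if delta_r(c, p) < R))
        members = [p for p in left if delta_r(best, p) < R]
        pt_jet = sum(p[0] for p in members)
        if pt_jet < pt_thr:
            break            # remaining clusters are all too soft
        jets.append((pt_jet, members))
        left = [p for p in left if p not in members]
    return jets
```

For example, two collimated particles of 30 and 10 GeV form a single 40 GeV jet, while an isolated 5 GeV particle falls below the threshold and is discarded.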

There still remains activity from particles within the cone not emerging from the hard partons. Although the amount of underlying-event and pile-up particles is certainly not constant, it is at least possible to subtract such contributions statistically, since the jet area is the same for every jet unless jets overlap and part of the area is shared; even in such a case the net area of the overlapping jets is still well defined. This is another virtue of the cone algorithm.

The historical cluster algorithm, on the other hand, assigns all the particles to one of the jets. This is suitable for \(e^+e^-\) collisions, where there are neither beam remnants, multi-parton interactions nor pile-up, but only soft emissions from the hard partons. The basic idea is that soft particles should always be merged with other particles or with the nearest cluster until the energy of the cluster exceeds a threshold (see Fig. 6.25). The “distance” between two particles, \(d_{ij}\), is defined in various ways, which gives variation and choice to the algorithm. The distance can be the invariant mass squared of the two particles (the original JADE algorithm) or the relative transverse momentum squared of the softer particle with respect to the harder one (often called \(k_\textrm{T}\) or \(k_\perp \)): \(d_{ij} = \min (p_\textrm{T}^{i2}, p_\textrm{T}^{j2})\). The algorithm sequentially combines the two particles with the smallest distance. In each combination step, the four-momenta of the two merged particles are combined to form a new particle. There are also many choices of how to combine the momenta of the two particles (called the “recombination scheme”): the merged particle may be massless or massive, energy may be conserved or not, etc. The definition of the distance and the recombination scheme should be chosen such that the jet observables of interest (momentum, energy, number, mass etc.) are well reproduced, and the choice may vary with the energy and type of the interaction. The combination is stopped when the distance between two jets, defined as \(y = d_{ij} / M^2\), rises above \(y_{cut}\), where M is normally chosen as the invariant mass of the first two outgoing partons from the \(e^+e^-\) collision, which equals the centre-of-mass energy of the \(e^+e^-\) collision in most cases, except for events with hard initial state radiation of photons.
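The sequential merging described above can be sketched for the JADE-style distance (invariant mass squared, E-scheme recombination by four-momentum addition). This is a minimal sketch under those assumptions, not a production clustering code.

```python
def inv_mass2(p, q):
    """Invariant mass squared of the sum of two four-vectors (E, px, py, pz)."""
    E, px, py, pz = (p[i] + q[i] for i in range(4))
    return E * E - px * px - py * py - pz * pz

def jade_cluster(particles, s, ycut):
    """JADE-style sequential clustering for e+e- events: repeatedly
    merge the pair with the smallest d_ij = m_ij^2 (E-scheme) until
    every remaining pair has y = m_ij^2 / s above ycut.  particles:
    list of four-vectors (E, px, py, pz); s = M^2, the squared
    centre-of-mass energy."""
    parts = [tuple(p) for p in particles]
    while len(parts) > 1:
        i, j = min(((a, b) for a in range(len(parts)) for b in range(a + 1, len(parts))),
                   key=lambda ab: inv_mass2(parts[ab[0]], parts[ab[1]]))
        if inv_mass2(parts[i], parts[j]) / s > ycut:
            break            # all remaining pairs are resolved jets
        merged = tuple(parts[i][k] + parts[j][k] for k in range(4))
        parts = [p for k, p in enumerate(parts) if k not in (i, j)] + [merged]
    return parts
```

For a \(Z^0\)-like event with two collinear fragments recoiling against a single particle, the collinear pair is merged first and two jets remain above \(y_{cut}\).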

Fig. 6.25
figure 25

A schematic drawing, showing how clusters with the nearest distance are merged each other to form a new cluster

The biggest advantage of the cluster algorithm over the cone algorithm is that it has none of the ambiguities originating from the iterative procedure of the cone algorithms. The cone algorithm needs seeds to start the iteration. It is well known that the number of jets and the jet momenta are largely affected by the choice of seed properties (the \(p_{\textrm{T}}\) threshold and the cone size defining a seed) when particles are densely populated. A jet could be split into two jets depending on the seed choice. This means that the result of the jet finding is affected by soft particles (which could become seeds). Indeed, a naive cone algorithm is not infrared safe, i.e. the result of the algorithm may depend on the presence of a particle with infinitesimally small energy.

A new class of cluster algorithms for hadron-hadron collisions was then invented, taking over virtues of the cone algorithms: (a) the algorithm works in \(\eta -\phi -p_{\textrm{T}}\) space so that it is invariant under longitudinal boosts, and (b) particles below the threshold are discarded. A typical arrangement is to introduce two particles with infinite momentum along the beam axis. In the algorithm, the distance between a particle and the beam, \(d_i\), is defined as \(d_i = p_\textrm{T}^{i2}\), where \(p_\textrm{T}^i\) is the transverse momentum of the ith particle, in addition to the \(k_\textrm{T}\) distance \(d_{ij}\) between two final state particles. If \(d_i\) of a particle, the distance to the beam axis, is smaller than all of its \(d_{ij}\)’s, the distances to the other final state particles, the particle is merged to the beam particle.

Also, \(d_{ij}\) is adjusted to the hadron collider environment. The first version of the algorithm, the \(k_\textrm{T}\) algorithm, uses the distance parameter as

$$\begin{aligned} d_{ij} = \min (p_\textrm{T}^{i2}, p_\textrm{T}^{j2}) \varDelta r_{ij}^2 / R^2 \end{aligned}$$
(6.6)

where \(\varDelta r\) is the distance used in the cone algorithms, \(\varDelta r = \sqrt{\varDelta \eta ^2 + \varDelta \phi ^2}\), and R is the radius parameter. The particle with the smallest \(p_{\textrm{T}}\) will be merged to the beam if \(\varDelta r\) is larger than R for all other particles, since then \(d_i\) is smaller than any of the \(d_{ij}\). It is merged to the nearest particle if \(\varDelta r < R\). In this way, the parameter R plays the role of the cone radius in cone algorithms (Fig. 6.26).

Fig. 6.26
figure 26

A schematic drawing, showing how clusters with large momenta are classified: to be merged each other to form a new cluster, or to the beam axis to be considered as a jet

The value of R gives the angular size of the jets. It is the parameter that determines to what extent hard parton radiation around the primary parton is included, in addition to the soft emissions and/or the collinear part of the radiated partons. The size should not be too small, in order to include the soft/collinear particles, but should not be too large either, since particles from beam-related activities (soft underlying events, multi-parton interactions and pile-up particles) then enter the jet area. Typical values used for QCD studies at the energy scale of weak interactions (\(p_\textrm{T}^\textrm{jet} \simeq m_Z / 2\)) are \(0.6-0.7\), to include the soft emission originating from the parent partons and thereby reduce the theoretical uncertainty in the pQCD description of the data. For higher energy interactions, \(0.4-0.5\) is preferred, in particular for searches for physics beyond the SM (BSM) at the TeV scale, to minimise the effect of soft particles on the momentum or mass reconstruction of the parent BSM particles.

The clustering procedure is finished when the remaining particles can no longer be merged with each other, but only with the beam particles. Particles above a given \(p_{\textrm{T}}\) threshold are defined as jets and the others are discarded, i.e. merged to the beam particle.

For the original \(k_\textrm{T}\) algorithm, where \(d_{ij}\) is defined as in Eq. (6.6), it is known that the jet area tends to extend beyond the area given by the parameter R. This feature is undesirable for mass reconstruction, as discussed just above. More recently the anti-\(k_\textrm{T}\) algorithm has become more popular, where \(d_{ij}\) is defined as:

$$ d_{ij} = \min ((p_\textrm{T}^{i})^{-2}, (p_\textrm{T}^{j})^{-2}) \varDelta r_{ij}^2 / R^2 $$

With this distance parameter, the algorithm first merges the pairs within the maximum allowed distance, \(\varDelta r_{ij} \simeq R\), making a merged particle in between. The same happens in the next iteration: the furthest particle from the new merged particle is absorbed. This implies that the direction of the jet particle, or the jet axis, may oscillate between the merged particles, but the axis is stabilised at a later stage, when only particles with small \(k_\textrm{T}\) with respect to the jet axis are left, which are eventually merged into the jet. As a consequence, the jet area has a clear circular boundary with radius R. The area of overlapping circles is absorbed into the more energetic jet. This gives jets very similar to those of the cone algorithm, with a well-defined area size, so one can statistically subtract the underlying events in the same way. The anti-\(k_\textrm{T}\) algorithm thus successfully combines the virtues of the cone and cluster algorithms and is now the most popular jet algorithm at the LHC.
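The inclusive clustering with beam distances can be sketched with a generalised exponent p, where \(p=-1\) reproduces the anti-\(k_\textrm{T}\) distance above and \(p=1\) the original \(k_\textrm{T}\) of Eq. (6.6). This is an illustrative sketch: the recombination simply adds \(p_{\textrm{T}}\) and \(p_{\textrm{T}}\)-weights the axis (ignoring \(\phi \) wrap-around), a simplification of the usual four-momentum E-scheme.

```python
import math

def gen_kt_jets(particles, R=0.4, p=-1):
    """Generalised-kT clustering sketch: d_ij = min(pT_i^2p, pT_j^2p)
    * dR_ij^2 / R^2 and beam distance d_iB = pT_i^2p.  p = -1 is
    anti-kT, p = 1 the original kT.  particles: list of (pt, eta, phi).
    A particle whose beam distance is the smallest is promoted to a jet."""
    def dist(a, b):
        dphi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
        dr2 = (a[1] - b[1]) ** 2 + dphi ** 2
        return min(a[0] ** (2 * p), b[0] ** (2 * p)) * dr2 / R ** 2

    parts, jets = [tuple(x) for x in particles], []
    while parts:
        ib = min(range(len(parts)), key=lambda i: parts[i][0] ** (2 * p))
        dib = parts[ib][0] ** (2 * p)
        pair = None
        if len(parts) > 1:
            pair = min(((i, j) for i in range(len(parts)) for j in range(i + 1, len(parts))),
                       key=lambda ij: dist(parts[ij[0]], parts[ij[1]]))
        if pair is None or dib < dist(parts[pair[0]], parts[pair[1]]):
            jets.append(parts.pop(ib))   # merged to the beam: a finished jet
        else:
            i, j = pair
            a, b = parts[i], parts[j]
            pt = a[0] + b[0]
            merged = (pt, (a[0] * a[1] + b[0] * b[1]) / pt,
                      (a[0] * a[2] + b[0] * b[2]) / pt)
            parts = [x for k, x in enumerate(parts) if k not in (i, j)] + [merged]
    return jets
```

With \(p=-1\), a hard 100 GeV particle first absorbs its soft 10 GeV neighbour and is promoted to a jet before the distant 20 GeV particle, illustrating the hardest-first behaviour of anti-\(k_\textrm{T}\).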

Naturally, the criteria defining the jets from partons are based on continuous parameters, such as \(p_{\textrm{T}}\) and R, which have no characteristic scale apart from \(\varLambda _{QCD}\), which is anyhow much below the typical jet momentum (\({>}O(10)\) GeV). There is some arbitrariness in the parameter values of such algorithms, and the parameters may have to be optimised for each application.

6.4.4 Calibrating Jet Measurements

As described in Sect. 6.4.1, jets are expected to be collimated at high energies, primarily because final state radiation is less pronounced for high-\(p_{\textrm{T}}\) jets, where the coupling constant \(\alpha _S\) is smaller. There, the individual hadrons constituting a jet cannot easily be resolved, and the calibration of the detector is performed at the level of jets instead of the constituent hadrons. The result of the jet calibration is often called the jet energy scale (JES).

Since a jet is defined by means of an algorithm, the momenta of jets depend on the choice of the algorithm and jet finder parameters (e.g. R). The calibration must be repeated for each choice of algorithm and parameter set. There is another choice to make: whether the energy is corrected to the “particle level”, i.e. to jets built from the momenta of final state particles, or to the parton level, where partons are the input to the jet algorithm. The general consensus is to correct to the particle level, i.e. the momentum of the particle-level jet that matches the detector-level jet in \(\eta -\phi \) space provides the reference. In this way, one avoids the theoretical uncertainty on the correction factor from particle-level to parton-level jets, which is expected to improve as the theory advances.

The simplest reconstruction of jets starts from calorimeter information only, as described in Sect. 6.4.3. The energy calibration of the calorimeter objects is done at the electromagnetic scale, ignoring that \(e/h \ne 1\) (see Sect. 5.4.3), or at the hadronic scale after applying the e/h correction. In principle, a calorimeter cluster can be replaced with a matched track to improve the momentum resolution, if the spatial resolution of the calorimeter is fine enough to resolve close-by particles.

There still remains a difference between particle-level and detector-level jets. In addition to the factors arising from calorimetry, such as longitudinal hadron shower leakage and the intrinsic dependence of the calorimeter response on the type of particle (the e/h ratio, the response to muons and neutrinos), the following effects specific to jets cause significant shifts in the measured momentum:

  • particles escaping outside the jet area

    If the jet is wider than the radius used in the jet finder, the particles outside the jet area are lost and the jet energy is underestimated. As explained above, the jet size is wider for low energy jets, since the partons radiate more often. This leakage, however, is part of the jet definition for both parton- and particle-level jets. Further leakage occurs when particles are bent out of the jet area by the solenoidal magnetic field applied to the central tracker. This is to be corrected through the jet energy scale.

    A gluon radiates additional partons more often than a quark because of the larger colour factor (9/4 vs. 1). In general, a gluon jet is therefore wider than a quark jet and its particle spectrum is softer: a gluon jet contains more particles with a smaller average energy than a quark jet. This again leads to more leakage through magnetic bending of the particles.

    An additional correction depending on jet properties, such as the transverse size of the jet or the number of tracks matched to the jet, can improve the jet energy resolution.

  • response of heavy-quark jets

    Some shift may remain for c-quark and b-quark jets (c/b-jets). A b-hadron may decay semi-leptonically into a lepton (\(e, \mu , \tau \)), a neutrino and a lighter c- or u-quark, and the c-hadron may again decay semi-leptonically. As a consequence, c/b-jets may contain one or more electrons or muons and one or more neutrinos. The momenta of the neutrinos cannot be measured; moreover, a muon leaves only about 2 GeV in the calorimeter (a MIP). Therefore, the energy responses for b- and c-quark jets are, in general, smaller than for other kinds of jets (uds or gluon jets, often called “light-flavour jets”). The actual difference depends on many factors, e.g. on how the muon momentum is taken into account, or on the e/h ratio, which may affect the energy of a jet containing an electron. In any case, it is common practice to apply an additional correction if a jet is identified as a heavy-quark jet.

  • pile-up

    One can safely assume that the events piling up on top of a collision of interest are all soft interactions (see Sect. 2.5). Such soft-interaction events are often called “minimum-bias” events, since they are collected with triggers imposing as few requirements as possible, e.g. a small energy deposit in the very forward part of the calorimeter. The average \(p_{\textrm{T}}\) from such minimum-bias events at the LHC energy (\(\sqrt{s} \simeq 14\) TeV) is about 2.4 GeV per unit of \(\eta -\phi \) area. This means that a jet with radius \(R = 0.4\) in an event with 20 additional minimum-bias interactions receives an average \(p_{\textrm{T}}\) offset of about 24 GeV, hence a very large shift in energy for jets with \(p_\textrm{T}^\textrm{jet} < O(100)\) GeV. In order to reduce the influence of pile-up particles, the expected average \(p_{\textrm{T}}\) from pile-up is subtracted. The actual value to be subtracted depends on the number of pile-up events. The average number of pile-up events can be estimated from the luminosity of the collisions. The Poisson fluctuation around the average can further be corrected by measuring the number of interactions per bunch crossing through, e.g. \(N_\textrm{PV}\), the number of primary vertices per crossing reconstructed with the central tracker.
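The pile-up offset quoted in the last bullet follows from a one-line estimate: the \(p_{\textrm{T}}\) density per minimum-bias event times the number of pile-up events times the jet catchment area. The sketch below takes the naive \(\pi R^2\) area, an assumption adequate for this back-of-the-envelope check.

```python
import math

def pileup_offset(rho=2.4, n_pu=20, R=0.4):
    """Expected average pT offset (GeV) of a jet of radius R from n_pu
    pile-up events.  rho is the average pT density per unit eta-phi
    area per minimum-bias event (about 2.4 GeV at sqrt(s) = 14 TeV, as
    quoted in the text); the jet area is taken as the naive pi*R^2."""
    return rho * n_pu * math.pi * R * R
```

With the default values this reproduces the roughly 24 GeV offset quoted above for an \(R = 0.4\) jet with 20 pile-up events.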

The residual difference from imperfect simulation of the detector and remaining miscalibration is corrected by in-situ measurements of the jet response. The most common way to determine the overall jet energy scale is to use physics processes with a jet whose energy can be deduced through energy-momentum conservation. In hadron collider experiments, for example, \(\gamma +\)jet or \(Z +\)jet production is widely used as the calibration source: the photon, or the Z reconstructed from a dilepton pair, serves as the reference for the jet energy, because the fluctuation of the energy deposited by an electromagnetic shower, or of a measured charged track momentum, is much smaller than that of a hadronic shower, leading to a more precise energy measurement than that of the hadron calorimeter. However, since momentum conservation holds only in the plane perpendicular to the beam axis at a hadron collider, what is conserved is \(p_{\textrm{T}}\), not p. More concretely, in \(\gamma +\)jet events the jet energy scale is adjusted so that the \(p_{\textrm{T}}\) of the jet is equal to that of the photon. The result of the calibration is illustrated in Fig. 6.27. A clear peak is seen in the \(\gamma +\)jet events, with the peak close to unity as expected.
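The balance method can be summarised in a small sketch: the mean of the per-event \(p_{\textrm{T}}\) ratio estimates the jet response, and its inverse is the residual scale correction. This is a hypothetical, simplified illustration (a plain mean over ideal back-to-back events; real analyses fit the ratio distribution in bins of reference \(p_{\textrm{T}}\) and \(\eta \)).

```python
def jet_energy_scale_from_balance(events):
    """In-situ jet response from gamma+jet pT balance.  events: list of
    (pt_jet, pt_gamma) pairs in GeV for back-to-back gamma+jet events.
    Returns (response, correction): the mean ratio <pT_jet / pT_gamma>
    and the residual scale correction 1/response."""
    ratios = [pt_jet / pt_gamma for pt_jet, pt_gamma in events]
    response = sum(ratios) / len(ratios)
    return response, 1.0 / response
```

For instance, if jets are systematically measured 5% low relative to the photon, the response comes out as 0.95 and the residual correction as about 1.053.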

Fig. 6.27
figure 27

Reprinted under the Creative Commons Attribution 4.0 International License from [15] © CERN for the benefit of the ATLAS collaboration 2014. The calorimeter region is restricted to be \(|\eta |<1.2\)

The ratio of \(p_{\textrm{T}}\) of jet to \(\gamma \) for the \(p_{\textrm{T}}\) range between 160 and 210 GeV.

With a similar concept of calibrating against the electromagnetic scale, \(Z \rightarrow q \bar{q}\) could in principle be used as a calibration source for jets, with the Z mass as the reference. In practice, however, this method does not work at hadron collider experiments, because of the overwhelming dijet background from QCD processes. In addition, the jet energy resolution is much worse than that of the electromagnetic energy measurement, making it difficult to see the resonant peak from \(Z\rightarrow q \bar{q}\).

6.5 Reconstructing Missing Momentum

Weakly interacting neutral particles, such as neutrinos or unknown neutral particles, escape collider detectors undetected. At hadron colliders, the longitudinal momentum of such particles cannot be inferred, since the longitudinal momentum of the colliding parton system is unknown. The missing transverse momentum, denoted as either \(\textbf{p}_\textrm{T}^\textrm{miss}\) or \(\textbf{E}_\textrm{T}^\textrm{miss}\), can still be reconstructed as the negative of the vector sum of the transverse momenta of the observed particles. A first approximation of the sum of the visible particle momenta could simply be obtained from the x and y components of the calorimeter cell energies, \(E_{i,\textrm{cell}}\sin \theta _i\cos \phi _i\) and \(E_{i,\textrm{cell}}\sin \theta _i\sin \phi _i\). This, however, misses the muon momenta, and detailed calibrations depending on the final state objects are ignored. Instead, one may measure each category of final state objects separately, with proper calibration and possibly with the help of the tracking and muon detectors, for example,

$$ \textbf{p}_\textrm{T}^\textrm{miss} = -\sum \left[ \textbf{p}_\textrm{T}^e + \textbf{p}_\textrm{T}^{\mu } + \textbf{p}_\textrm{T}^\gamma + \textbf{p}_\textrm{T}^\tau + \textbf{p}_\textrm{T}^\textrm{jets} + \textbf{p}_\textrm{T}^\textrm{others} \right] , $$

as is done in the ATLAS experiment. Here the \(\textbf{p}_\textrm{T}^\textrm{others}\) term is the momentum of the particles identified as neither a charged lepton, a jet nor a photon. This term, often called the “soft term”, includes the rest of the particles accompanying the hard interaction, such as particles from ISR, multi-parton interactions and underlying events, which should also be included in the \(\textbf{p}_\textrm{T}^\textrm{miss}\) calculation. The soft term, however, also includes particles from pile-up. Since the average transverse energy of a minimum-bias event is about 100 GeV, the total transverse energy of an event with \({>}20\) pile-up interactions is about 2 TeV and increases proportionally with the number of pile-up events. The resolution of this pile-up component directly affects the missing \(p_{\textrm{T}}\) calculation. Therefore, the performance of the missing \(p_{\textrm{T}}\) reconstruction strongly depends on how the soft term is estimated, and to a lesser extent on the jet term. In addition, any misreconstruction in the detector, such as noise in the calorimeter, affects \(\textbf{p}_\textrm{T}^\textrm{miss}\) through the soft term.
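The vector sum above can be sketched in a few lines. This is a minimal sketch assuming each calibrated object is reduced to a \((p_\textrm{T}, \phi )\) pair and the soft term to a single \((p_x, p_y)\) vector; the function name is illustrative.

```python
import math

def missing_pt(objects, soft_term=(0.0, 0.0)):
    """Missing transverse momentum as the negative vector sum of the
    calibrated hard objects (leptons, photons, jets, ...) plus the
    soft term.  objects: list of (pt, phi) in GeV; soft_term: (px, py)
    of the soft-term vector.  Returns (magnitude, phi) of pT_miss."""
    px = -sum(pt * math.cos(phi) for pt, phi in objects) - soft_term[0]
    py = -sum(pt * math.sin(phi) for pt, phi in objects) - soft_term[1]
    return math.hypot(px, py), math.atan2(py, px)
```

For a perfectly balanced event the result is zero; an imbalance of the hard objects, or a mismeasured soft term, translates directly into reconstructed \(\textbf{p}_\textrm{T}^\textrm{miss}\).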

Various algorithms have been developed to mitigate the degradation of the resolution with the number of pile-up events. A simple approach is to reconstruct the soft term using only the calorimeters or only the central tracker. The latter has the benefit that it can remove all track momenta not originating from the vertex of the hard scattering in question; it misses, however, the contribution from neutral particles like \(\pi ^0\) and \(K^0_L\). A further refined algorithm reweights the calorimeter soft term by the momentum fraction of the soft-term tracks from the primary vertex (PV): \(\varSigma _\textrm{tracks,PV} p_{\textrm{T}}/ \varSigma _\textrm{tracks}\). Each estimator has a different resolution and tail. Consider events with zero missing \(p_{\textrm{T}}\) at truth level: the tails of the soft term from the tracking and calorimeter algorithms turn out not to be strongly correlated. For this reason, it is often useful to use more than one algorithm to reduce the tail induced by the \(\textbf{p}_\textrm{T}^\textrm{miss}\) reconstruction. This also indicates that the choice of the soft-term reconstruction depends on the type of events in question.

6.6 Identification of b-Jet and \(\tau \)-Jet

In many types of physics analyses in collider experiments, the identification of b-quark jets (b-jets) or \(\tau \)-jets is of particular importance. For example, the Higgs boson has large branching fractions for \(H\rightarrow b \bar{b}\) and \(H \rightarrow \tau ^+ \tau ^-\). The top quark decays into b and W with almost 100% probability. This section describes methods to identify b-jets and \(\tau \)-jets.

6.6.1 b-Jet

There are two approaches for b-jet identification, or b-tagging. The first one exploits the fact that a b-hadron generated at the collision point travels a few mm before decaying (\(c\tau \) of \(B^0\) is 455 \(\upmu \)m, for example), leaving a secondary vertex or a collection of tracks with large impact parameters with respect to the primary vertex. We refer to this type of b-tagging as track-based tagging below. The second one exploits the fact that a b-hadron decay is associated with leptons with high probability: the branching fraction of the semi-leptonic decay of the b-quark is approximately 11%. In addition, the b-quark decays to a c-quark plus something with a probability close to 100%, and the semi-leptonic branching fraction of the c-quark is about 10%. Hence, the existence of a lepton near a jet can be a signature of a b-quark jet (or, in fact, also a c-quark jet). We refer to this second type of b-tagging as soft lepton tagging. In either method, all jets are candidates for b-jets, i.e. every jet is examined to determine whether it originates from a b-quark. In actual applications, the two methods are often combined. More precisely, there are several branches within track-based tagging, and the discriminants from each tagging method are often unified with a multivariate analysis technique for better discrimination.

6.6.1.1 Track Based Tagging

The track-based b-tagging makes use of the difference in lifetime between b-hadrons and the other, more common particles generated in the collisions, such as pions or protons. As schematically shown in Fig. 6.28, b-hadrons typically fly a few mm from the collision point before decaying, hence producing particles that emerge from a space point away from the primary vertex. On the other hand, light partons, such as u- or d-quarks or gluons, generate only light hadrons that appear from the beam-beam interaction point, giving charged particles associated with the primary vertex. Using this difference, track-based b-tagging algorithms either search for tracks with an impact parameter significantly away from zero, or explicitly reconstruct the secondary vertex formed by the decay products of the b-hadron. One thing the reader should keep in mind is the existence of particles generated at the primary vertex even in a b-jet, as by-products of the b-quark hadronisation, which is also shown in Fig. 6.28. These particles can degrade the b-tagging capability because they make b-jets resemble jets originating from light quarks. This becomes more pronounced as the momentum or energy of the original b-quark increases: more energetic partons end up with more particles through hadronisation, while the number of decay products of the b-hadron does not depend on the momentum of the parent b-hadron or b-quark, i.e. the fraction of particles from the primary vertex relative to those from the secondary vertex increases as the b-quark gets harder.

Fig. 6.28
figure 28

Schematic drawing of a b-jet. \(d_0\) is the impact parameter of the tracks with respect to the primary vertex. The signed impact parameter is defined as \(d_0\) projected onto the jet axis with a sign, where the sign is positive if the track crosses the jet axis on the side towards the jet direction as seen from the primary vertex, and negative if the crossing point is behind the primary vertex. The decay length, \(L_{xy}\), is defined as the b-hadron flight length, or more specifically the distance between the primary and secondary vertices in the x-y plane

Fig. 6.29
figure 29

Reprinted under the Creative Commons Attribution 4.0 International License from [16] © 2015 CERN for the benefit of the ATLAS Collaboration

Impact parameter significance distributions of tracks inside b-, c-, or light jets. The distributions are obtained from ATLAS simulation.

Figure 6.29 shows the signed impact parameter significance of tracks in simulated \(t\bar{t}\) events. The definition of the signed impact parameter is given in the caption of Fig. 6.28. As can be seen, b-jets have more tracks with large values compared to the jets originating from u-, d-, s-quarks or gluons, which are referred to as light jets. Note that a perfect detector with infinite position resolution would never give a negative value of the signed impact parameter, because the vector drawn from the primary to the secondary vertex should point along the jet direction. With such a detector, if it existed, the signed impact parameter distribution would have a sharp peak at zero plus a tail only on the positive side, due to the contribution from b- or c-hadrons etc. In other words, negative values are caused by the detector resolution, and the width of the peak around zero therefore represents the detector resolution.
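The sign convention can be illustrated with a simplified two-dimensional sketch, where the sign is taken from the projection of the primary-vertex-to-track vector onto the jet axis; this is a toy approximation of the crossing-point definition given in the caption of Fig. 6.28:

```python
import math

def signed_d0(pv, poca, jet_dir):
    """Signed transverse impact parameter: |d0| with a positive sign
    when the track's point of closest approach lies on the jet side
    of the primary vertex.  All arguments are 2D (x, y) tuples;
    a simplified sketch of the sign convention."""
    dx, dy = poca[0] - pv[0], poca[1] - pv[1]
    d0 = math.hypot(dx, dy)
    proj = dx * jet_dir[0] + dy * jet_dir[1]
    return d0 if proj >= 0 else -d0

# A track displaced towards the jet gets a positive sign, the same
# displacement opposite to the jet gets a negative sign:
ip_pos = signed_d0((0.0, 0.0), (0.1, 0.0), (1.0, 0.0))
ip_neg = signed_d0((0.0, 0.0), (0.1, 0.0), (-1.0, 0.0))
```

With a perfect detector only the positive branch would be populated by heavy-flavour decays, so the negative side provides an in-situ measure of the resolution.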

The simplest application of track-based tagging is to just count the number of tracks passing selection criteria designed to pick up tracks that do not come from the primary vertex. In slightly more complicated approaches, a likelihood of the track impact parameters is formed and used as the discriminant. One example is the jet probability algorithm, where the probability density function of the impact parameter for tracks coming from the primary vertex is created a priori, and the likelihood of each charged particle being consistent with originating from the primary vertex is calculated. Since there are typically many tracks in a jet, the likelihoods assigned to the individual tracks are combined to form a likelihood or discriminant for the jet in question. The benefit of this method is that one only examines whether a track is compatible with the hypothesis that it comes from the collision point, so no a priori knowledge of b-jets is required.
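The per-track probabilities are commonly combined with the classic jet-probability formula: with \(\varPi = \prod_i P_i\), the jet probability is \(\varPi \sum_{k=0}^{N-1}(-\ln \varPi)^k/k!\), where each \(P_i\) is the probability that a prompt track would have an impact parameter at least as large as observed. A sketch:

```python
import math

def jet_probability(track_probs):
    """Classic jet-probability combination: for per-track
    probabilities P_i (uniformly distributed in [0, 1] for prompt
    tracks), returns Pi * sum_{k=0}^{N-1} (-ln Pi)^k / k!
    with Pi = prod(P_i).  Small values indicate a b-jet."""
    prod = math.prod(track_probs)
    log_term = -math.log(prod)
    return prod * sum(log_term**k / math.factorial(k)
                      for k in range(len(track_probs)))

# A single track reduces to its own probability, while two strongly
# displaced tracks give a very small jet probability:
p_one = jet_probability([0.5])
p_two = jet_probability([1e-3, 1e-3])
```

The combination is constructed so that for jets made only of prompt tracks the output is again uniformly distributed, which makes a cut on it directly interpretable as a light-jet mistag rate.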

As a further refinement, one can construct a b-jet likelihood based on a probability density function for the tracks produced by b-hadron decays, obtained a priori, for example, from simulation. Taking the likelihood ratio of the b-jet hypothesis to the light-jet hypothesis gives, in principle, improved discrimination power over the jet probability method, where only the light-jet hypothesis is used. In actual applications, special care needs to be taken in forming the b-jet probability density function. We must use a correct probability density function and would like to confirm that the a priori knowledge or simulated data reproduces real data, but it is not easy to extract an unbiased b-jet sample with high purity. This is in contrast to the jet probability tagging, where a light-jet sample can easily be accumulated with high purity. Hence, the jet probability method is more robust, although its discriminating power is smaller, and it is suitable for use in the early stage of an experiment.
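A per-track likelihood-ratio tagger can be sketched as below; the two toy density functions stand in for distributions that would in practice be taken from simulation (b-jet hypothesis) and from data (light-jet hypothesis):

```python
import math

def jet_log_likelihood_ratio(track_ips, pdf_b, pdf_light):
    """Sum of per-track log-likelihood ratios of the b-jet to the
    light-jet hypothesis; larger values favour the b-jet hypothesis."""
    return sum(math.log(pdf_b(ip) / pdf_light(ip)) for ip in track_ips)

# Toy densities (widths are illustrative): b-hadron decays give a
# broad impact parameter distribution, prompt tracks a narrow one.
pdf_b = lambda ip: 0.5 * math.exp(-abs(ip) / 1.0)
pdf_light = lambda ip: 5.0 * math.exp(-abs(ip) / 0.1)

llr_displaced = jet_log_likelihood_ratio([2.0, 3.0], pdf_b, pdf_light)
llr_prompt = jet_log_likelihood_ratio([0.05, 0.02], pdf_b, pdf_light)
```

The discrimination power of this ratio depends entirely on how well `pdf_b` models reality, which is the calibration difficulty described above.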

Another type of track-based b-tagging algorithm explicitly reconstructs the secondary vertex produced by the decay of the b-hadron. The secondary vertex is reconstructed as already described in Sect. 6.1.4. After finding the secondary vertex, the b-tagging algorithm usually sets a threshold on the significance of the decay length defined in Fig. 6.28, which is the distance between the primary and secondary vertices, or the flight length of the b-hadron, in the plane perpendicular to the beam axis.
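A minimal sketch of the decay-length significance, with the uncertainty taken as a single toy input:

```python
import math

def lxy_significance(pv, sv, sigma_lxy):
    """Transverse decay length between the primary and secondary
    vertices, divided by its uncertainty; tagging typically requires
    this significance to exceed a threshold.  Positions are (x, y)
    tuples; units are arbitrary as long as they are consistent."""
    lxy = math.hypot(sv[0] - pv[0], sv[1] - pv[1])
    return lxy / sigma_lxy

# A 5 mm flight length measured with a 1 mm uncertainty:
sig = lxy_significance((0.0, 0.0), (3.0, 4.0), 1.0)
```

Cutting on the significance rather than on \(L_{xy}\) itself automatically accounts for the varying vertex resolution from jet to jet.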

Once the secondary vertex is formed, extra information, such as the invariant mass calculated from the tracks associated with the secondary vertex, can be extracted, resulting in a higher rejection power for light jets than the impact-parameter-based b-tagging method. However, the efficiency is a key issue, because at least two tracks are needed to find a secondary vertex, while even a single track can provide some discriminating power in the impact-parameter-based tagging.

In either type of algorithm, impact-parameter-based or secondary-vertex reconstruction, one also needs to remove tracks that emerge from a secondary vertex whose origin is not a b-hadron. Tracks generated by the decays of \(K_S\) and \(\varLambda \), and by photon conversions, are typical examples. In many applications, the algorithm looks for two-track combinations whose invariant mass is consistent with a \(K_S\), \(\varLambda \) or photon and removes them from the track list under consideration.
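This removal can be sketched as a two-track invariant-mass veto; the mass window below is illustrative:

```python
import math

M_PI = 0.13957  # charged-pion mass in GeV
M_KS = 0.49761  # K_S mass in GeV

def invariant_mass(p1, p2, m1, m2):
    """Invariant mass of two tracks given their 3-momenta (GeV)
    and assumed particle masses (GeV)."""
    e1 = math.sqrt(m1 * m1 + sum(c * c for c in p1))
    e2 = math.sqrt(m2 * m2 + sum(c * c for c in p2))
    px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max((e1 + e2)**2 - px*px - py*py - pz*pz, 0.0))

def is_ks_candidate(p1, p2, window=0.02):
    """Veto a track pair whose pi+pi- mass falls within an
    illustrative window around the K_S mass; such tracks would be
    dropped from the b-tagging track list."""
    return abs(invariant_mass(p1, p2, M_PI, M_PI) - M_KS) < window

# Back-to-back pions from a K_S decay at rest (p = 0.206 GeV each)
# reconstruct to the K_S mass and are flagged:
veto = is_ks_candidate((0.206, 0.0, 0.0), (-0.206, 0.0, 0.0))
```

The same pattern, with the appropriate mass hypotheses, applies to \(\varLambda \rightarrow p\pi \) and to photon conversions, where the e\(^+\)e\(^-\) pair mass is close to zero.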

6.6.1.2 Soft Lepton Tagging

A b-jet contains a charged lepton nearby with high probability, coming either from the direct semi-leptonic decay of the b-hadron or from the cascade decay through a c-hadron. Another possible source of charged leptons is the leptonic decays of W or Z bosons, but these do not produce additional jets, i.e. such leptons are “isolated”. This difference, isolated vs. non-isolated, is frequently and efficiently used to discriminate whether the charged lepton in question originates from a b-jet or from a W/Z. The typical application to identify a b-jet is therefore to require a jet to have a charged lepton nearby, for example, by requiring \(\varDelta R\) between the jet and the charged lepton to be smaller than a threshold. This technique is called soft lepton tagging.

In principle, any charged lepton can be used for soft lepton tagging. In practice, however, only muons are used, because of the difficulty in identifying \(\tau \)’s and non-isolated electrons. The identification of \(\tau \)-jets is discussed in the next section. A non-isolated electron often shares its electromagnetic shower with the shower or energy deposit of the jet constituents. This is in contrast to the muon case: only muons can penetrate the calorimeter and reach the muon detector even inside jets. Hence, non-isolated muons can still be identified with high efficiency and a low fake rate.

The possible background sources of soft lepton tagging with muons are punch-through (see Sect. 3.3.2) and decays of hadrons, which are essentially the backgrounds of muon identification itself. Thus, the muon identification capability mostly determines the performance of soft muon tagging.

To achieve a higher b-jet selection efficiency or to suppress the fake contribution from light jets, a kinematical requirement on the non-isolated muon is sometimes imposed. Suppose we know the direction of the jet axis. This axis is a good approximation of the initial b-quark momentum vector, or the flight direction of the b-hadron produced by the hadronisation of the initial b-quark. The non-isolated lepton momentum transverse to the b-hadron flight direction, or approximately the jet axis, can be as large as half of the b-hadron mass. On the other hand, there is no mechanism for hadrons yielded from light quarks to acquire momentum transverse to the jet axis other than the small contribution from the hadronisation process. Thus, the transverse momentum relative to the jet axis gives some discriminating power between b-jets and other types of jets.
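The transverse momentum relative to the jet axis can be computed as below (a toy sketch with 3-momenta in GeV):

```python
import math

def pt_rel(lepton_p, jet_axis):
    """Component of the muon momentum transverse to the jet axis;
    for muons from b-hadron decays this can reach roughly half the
    b-hadron mass, unlike muons inside light jets."""
    norm = math.sqrt(sum(c * c for c in jet_axis))
    longitudinal = sum(l * j for l, j in zip(lepton_p, jet_axis)) / norm
    p2 = sum(c * c for c in lepton_p)
    return math.sqrt(max(p2 - longitudinal * longitudinal, 0.0))

# A muon carrying 2 GeV of momentum transverse to a jet along x:
ptr = pt_rel((10.0, 2.0, 0.0), (1.0, 0.0, 0.0))
```

A cut requiring this quantity to exceed roughly a GeV preferentially keeps muons from b-hadron decays.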

6.6.2 \(\tau \)-Jet

The \(\tau \) identification is classified into a few categories based on whether the \(\tau \) decays leptonically or hadronically, and on the number of final-state particles. The branching fraction of the leptonic decay is about 35%. In the remaining 65% of cases, the \(\tau \) decays hadronically: with one charged particle (one prong) with a fraction of about 50%, and with three charged particles (three prong) with a fraction of about 15%.

In the case of leptonic decays, the \(\tau \) identification is essentially the identification of an isolated electron or muon, since the final state consists of an electron or muon and neutrinos, which are not detected. If there is no other neutrino in the event, the vector sum of the momentum of the isolated electron or muon and the missing \(E_\textrm{T}\) can be treated as the momentum of the \(\tau \). This is a rather straightforward and cleaner method compared to the identification of hadronic decays.
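In the transverse plane this is a simple vector sum; a sketch, valid only under the stated assumption that the event contains no other neutrino:

```python
import math

def leptonic_tau_pt(lepton, met):
    """Estimate of the tau transverse momentum as the vector sum of
    the visible lepton (px, py) and the missing-ET vector (px, py),
    all in GeV.  Assumes no other neutrino in the event."""
    px = lepton[0] + met[0]
    py = lepton[1] + met[1]
    return px, py, math.hypot(px, py)

# A 30 GeV muon plus 10 GeV of missing ET along the same direction:
px, py, tau_pt = leptonic_tau_pt((30.0, 0.0), (10.0, 0.0))
```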

The identification of hadronic decays is more complex, but important because of the larger branching fraction. In a hadronic decay, there are one or three charged particles, often associated with extra neutral particles such as \(\pi ^0\). Since we are now dealing with rather high-momentum \(\tau \)’s, the decay products are boosted and collimated, resulting in a \(\tau \)-jet. The particles inside the \(\tau \)-jet are basically the decay products of the \(\tau \); hence the number of particles, which does not depend on the \(\tau \) momentum, is typically smaller than in a quark- or gluon-induced jet (hadronic jet), where the number of particles depends on the momentum. Consequently, the particles, or the energy carried by the particles, inside a \(\tau \)-jet are more collimated than in a hadronic jet of the same energy. In addition, the hadronisation process can generate a particle whose momentum relative to the jet axis is greater than half of the \(\tau \) mass due to QCD radiation, while the maximum in a \(\tau \) decay is half of the \(\tau \) mass. This is another reason why the \(\tau \)-jet is more collimated. Regarding the width of the shower shape, the other important point to take care of is the electromagnetic shower, which mimics the \(\tau \) shower. As we have seen in the previous sections and chapters, the size or width of an electromagnetic shower is smaller than that of a hadronic shower. This means that an electron could fake a \(\tau \) if one just selects collimated jets. Therefore, we have to require the jet width to be narrower than a hadronic jet and wider than an electromagnetic shower at the same time.

Another feature of the \(\tau \)-jet is that the \(\tau \) has a finite lifetime, with \(c\tau = 87\) \(\upmu \)m, giving a possible decay vertex in addition to the primary vertex created by the collision. This means that track-based b-tagging techniques can in principle also discriminate \(\tau \)-jets from hadronic jets. However, this lifetime is much shorter than that of b-hadrons, so a method similar to track-based b-tagging alone does not provide sufficient discriminating power. Still, the track information helps to identify the \(\tau \) in combination with the jet shower shape variables.

The actual \(\tau \)-jet identification starts from reconstructing a jet. Usually, no special jet clustering algorithm for \(\tau \)-jets is used; an algorithm similar to that for hadronic jets is used, with the parameters possibly tuned for \(\tau \)-jet clustering. The jet considered here is a cluster based on energy deposits in the calorimeter. The next step is to select and associate tracks to the \(\tau \) candidate jet. Some quality cuts and a requirement on \(p_{\textrm{T}}\) are the standard criteria. If necessary, the \(\tau \)-jet candidate is categorised as one prong or three prong based on the number of associated tracks.

Fig. 6.30
figure 30

Reprinted under the Creative Commons Attribution 4.0 International License from [17] © 2011 CERN for the benefit of the ATLAS Collaboration. The red histogram shows the distribution for \(\tau \)’s in \(Z\rightarrow \tau \tau \) or \(W \rightarrow \tau \nu \) simulated events, while the black dots for inclusive jets in real data

One of the discriminating variables used in the \(\tau \) ID in ATLAS: the fraction of calorimeter energy in the region \(\varDelta R<0.1\) relative to the total energy in the jet.

Fig. 6.31
figure 31

Reprinted under the Creative Commons Attribution 4.0 International License from [17] © 2011 CERN for the benefit of the ATLAS Collaboration. The red histogram shows the distribution for \(\tau \)’s in \(Z\rightarrow \tau \tau \) or \(W \rightarrow \tau \nu \) simulated events, while the black dots for inclusive jets in real data

Another discriminating variable used in the \(\tau \) ID in ATLAS: the maximum \(\varDelta R\) between the tracks inside a jet and the jet axis.

Here we show some variables that are actually used in the ATLAS \(\tau \)-jet identification. Figure 6.30 shows the fraction of calorimeter energy in the region \(\varDelta R<0.1\) relative to the total energy in the jet, for \(Z\rightarrow \tau \tau \) or \(W\rightarrow \tau \nu \) MC and for real data, where most of the jets originate from light quarks or gluons. Figure 6.31 shows the maximum \(\varDelta R\) between the tracks inside a jet and the \(\tau \)-jet axis. As can be seen from these two figures, the energy flow of the \(\tau \)-jet is concentrated at the centre of the jet.
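The two variables can be sketched as follows; the cell and track representations are illustrative, not the ATLAS data model:

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def core_energy_fraction(cells, jet_eta, jet_phi):
    """Fraction of calorimeter energy within dR < 0.1 of the jet axis;
    `cells` is a list of (energy, eta, phi) tuples."""
    total = sum(e for e, _, _ in cells)
    core = sum(e for e, eta, phi in cells
               if delta_r(eta, phi, jet_eta, jet_phi) < 0.1)
    return core / total if total > 0 else 0.0

def max_track_delta_r(tracks, jet_eta, jet_phi):
    """Maximum dR between the associated tracks and the jet axis;
    `tracks` is a list of (pt, eta, phi) tuples."""
    return max(delta_r(eta, phi, jet_eta, jet_phi)
               for _, eta, phi in tracks)

# A collimated tau-like jet: most of the energy in the core,
# all tracks close to the axis:
frac = core_energy_fraction([(10.0, 0.0, 0.0), (2.0, 0.3, 0.0)], 0.0, 0.0)
rmax = max_track_delta_r([(5.0, 0.05, 0.0), (3.0, 0.2, 0.0)], 0.0, 0.0)
```

For \(\tau \)-jets the core fraction peaks near one and the maximum track \(\varDelta R\) near zero, while quark- and gluon-induced jets populate the opposite ends of both distributions.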

The \(\tau \) identification algorithms nowadays exploit multivariate analysis techniques, such as likelihoods, neural networks or boosted decision trees, based on the variables discussed above or variations thereof.