1 The Equations of Motion

Consider a charged particle with mass m and charge Q = q e, where e is the elementary charge and q is an integer, usually q = ±1. Its trajectory or position \({\boldsymbol{r}}(t) = (x(t), y(t), z(t))^{\mathsf{T}}\) in a magnetic field B(r), as a function of time, is determined by the equations of motion given by the Lorentz force F ∝ q v × B, where v = dr∕dt is the velocity of the particle. In vacuum, Newton’s second law reads [1]:

$$\displaystyle \begin{gathered} \frac{\mathrm{d}{\boldsymbol{p}}}{\mathrm{d} t} = k\hspace{0.5pt} q\hspace{0.5pt} {\boldsymbol{v}} ( t ) \times {\boldsymbol{B}} ( {\boldsymbol{r}} ( t ) ),{} \end{gathered} $$
(4.1)

where p = γ m v is the momentum vector of the particle, \(\gamma = (1 - v^2/c^2)^{-1/2}\) is the Lorentz factor, and k is a unit-dependent proportionality factor. If the units are GeV∕c for p, meter for r, and tesla for B, then \(k = 0.29979\ \mathrm{GeV}/c\ \mathrm{T}^{-1}\,\mathrm{m}^{-1}\). The trajectory is uniquely defined by the initial conditions, i.e., the six degrees of freedom specified for instance by the initial position and the initial momentum. If these are tied to a reference surface, five degrees of freedom are necessary and sufficient. Geometrical quantities other than position and velocity can also be used to specify the initial conditions. The collection \({\boldsymbol{q}} = (q_1, \ldots, q_m)\) of these quantities is called the initial track parameter vector or the initial state vector.
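As a quick numerical illustration of the constant k: for motion perpendicular to a uniform field, Eq. (4.1) implies a circle of radius R = p∕(k |q| B). The sketch below evaluates this relation; the function name `bending_radius` and the example values are illustrative assumptions and do not come from the text.

```python
K_UNIT = 0.29979  # k in GeV/c T^-1 m^-1, the value quoted in the text


def bending_radius(p_t, b_field, charge=1):
    """Radius (m) of the circle described in the plane transverse to a uniform field.

    p_t     : momentum component transverse to the field, in GeV/c
    b_field : field magnitude in tesla
    charge  : q, in units of the elementary charge
    """
    if b_field == 0 or charge == 0:
        raise ValueError("no bending for zero field or zero charge")
    return p_t / (K_UNIT * abs(charge) * abs(b_field))


# Hypothetical example: a singly charged 1 GeV/c track in a 2 T field.
radius = bending_radius(1.0, 2.0)
print(f"bending radius: {radius:.4f} m")
```

The same relation gives a useful order-of-magnitude check when choosing step sizes for the numerical methods discussed later.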

Equation (4.1) can be written in terms of the path length s(t) along the trajectory instead of t, giving [1]:

$$\displaystyle \begin{gathered}{} \frac{\mathrm{d}^2 {\boldsymbol{r}}}{\mathrm{d} s^2} = k\hspace{0.5pt}\psi \cdot \frac{\mathrm{d}{\boldsymbol{r}}}{\mathrm{d} s} \times {\boldsymbol{B}} ( {\boldsymbol{r}} ( s ))={\boldsymbol{\varGamma}} \left( s,{\boldsymbol{r}}(s), {\boldsymbol{\dot{{\boldsymbol{r}}}}} (s) \right), \end{gathered} $$
(4.2)

where \({\boldsymbol {\dot {{\boldsymbol {r}}}}} (s) \equiv \mathrm {d}{\boldsymbol {r}} / \mathrm {d} s,\ \psi =q/p\) and p = | p |. In simple cases, this equation has closed-form solutions. In a homogeneous magnetic field, the solution is a helix; it reduces to a straight line in the limit of B ≡ 0. In the general case of an inhomogeneous magnetic field, one has to resort to numerical methods such as Runge–Kutta integration of the equations of motion (Sect. 4.3.2.1), parametrization by polynomials or splines [1], or other approximations [2]; see Sect. 4.3.2.2.

Equation (4.2) can be expressed in terms of other independent variables. For example, if the equations of motion are integrated in a cylindrical detector geometry, the radius R is a natural integration variable. In a planar detector geometry, the position coordinate z could be the variable of choice [2, 3].

2 Track Parametrization

Different detector geometries often lead to different choices of the track parameters. However, the parametrization of the trajectory should comply with some basic requirements: the parameters should be continuous with respect to small changes of the trajectory; the choice of track parameters should facilitate the local expansion of the track model into a linear function; and the stochastic uncertainties of the estimated parameters should follow a Gaussian distribution as closely as possible. In order to fulfill, for instance, the continuity requirement, curvature should be used rather than radius of curvature, and inverse (transverse) momentum rather than (transverse) momentum.

In a barrel-type detector system typical for the central part of collider experiments, a natural reference surface of the track parameters is a cylinder with radius R, centered around the global z-axis, which usually coincides with the beam line. The track parameters are, in this case, defined at the point of intersection P between the track and the reference cylinder. In such a system, one possible choice of track parametrization is the following:

$$\displaystyle \begin{aligned} q_1 = {q}/{{p_{\mathrm{T}}}} ,\ q_2 = \phi ,\ q_3 = \tan \lambda ,\ q_4 = R \varPhi ,\ q_5 = z,{} \end{aligned} $$
(4.3)

where \({p_{\mathrm {T}}} = p\hspace{0.5pt}\cos \lambda \) is the transverse momentum, ϕ is the azimuth angle of the tangent of the track at P, λ is the dip angle (complement of the polar angle) of the tangent at P, and \(R\varPhi\) and z are the cylindrical coordinates of P in the global coordinate system; see Fig. 4.1.

Fig. 4.1

Track parametrization according to Eq. (4.3). The parameter \(\tan \lambda \) is the slope of the tangent at the reference point with respect to the (x, y)-plane
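The parametrization in Eq. (4.3) can be sketched as a conversion from a global position and momentum. The function name `barrel_parameters` is a hypothetical choice, and the point P is assumed to lie on the reference cylinder already, so that R is simply its transverse distance from the z-axis and Φ its azimuth.

```python
import math


def barrel_parameters(position, momentum, charge=1):
    """Return (q/pT, phi, tan(lambda), R*Phi, z) for a point P on the cylinder.

    position : global (x, y, z) of P in meters
    momentum : global (px, py, pz) at P in GeV/c
    charge   : q, in units of the elementary charge
    """
    x, y, z = position
    px, py, pz = momentum
    p_t = math.hypot(px, py)
    if p_t == 0.0:
        raise ValueError("transverse momentum is zero: q/pT is undefined")
    phi = math.atan2(py, px)          # azimuth of the track tangent
    tan_lambda = pz / p_t             # slope with respect to the (x, y)-plane
    radius = math.hypot(x, y)         # R of the reference cylinder
    big_phi = math.atan2(y, x)        # azimuth of the point P itself
    return (charge / p_t, phi, tan_lambda, radius * big_phi, z)


# Hypothetical example values.
print(barrel_parameters((0.0, 2.0, 1.0), (1.0, 0.0, 1.0)))
```

Note that the first parameter is the inverse transverse momentum rather than the transverse momentum itself, in line with the continuity requirement stated above.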

In a detector system based on planar detector elements, the natural reference surface is a plane. Such a surface is uniquely determined by a normal vector of the plane and the position of a reference point inside the plane. A local coordinate system is defined such that the u-axis is parallel to the normal vector and the v- and w-axes are inside the plane. A natural choice of track parameters is now

$$\displaystyle \begin{gathered} q_1 = \psi ,\ q_2 = \mathrm{d} v/\mathrm{d} u ,\ q_3 = \mathrm{d} w/\mathrm{d} u ,\ q_4 = v ,\ q_5 = w ,{} \end{gathered} $$
(4.4)

where ψ = q∕p, dv∕du is the tangent of the angle between the projection of the track tangent into the (u, v)-plane and the u-axis, dw∕du is the tangent of the angle between the projection of the track tangent into the (u, w)-plane and the u-axis, and v and w are the local coordinates of the intersection point of the track with the plane; see Fig. 4.2. The quantities dv∕du and dw∕du are also called direction tangents, as the tangent vector to the track, expressed in the local (u, v, w) system, is proportional to the vector \((1, \mathrm{d}v/\mathrm{d}u, \mathrm{d}w/\mathrm{d}u)^{\mathsf{T}}\).

Fig. 4.2

Track parametrization according to Eq. (4.4). The parameters dv∕du and dw∕du are the direction tangents of the track at the reference point with respect to the u-axis
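A minimal sketch of Eq. (4.4), assuming the plane is described by a reference point and three orthonormal unit vectors (the normal along the u-axis and two in-plane axes). All function and argument names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def planar_parameters(position, momentum, charge, ref_point, normal, v_axis, w_axis):
    """Return (psi, dv/du, dw/du, v, w) for a track crossing a reference plane.

    position, momentum : global position (m) and momentum (GeV/c) at the plane
    ref_point          : reference point inside the plane
    normal             : unit vector along the local u-axis
    v_axis, w_axis     : orthonormal unit vectors inside the plane
    """
    p = math.sqrt(dot(momentum, momentum))
    p_u = dot(momentum, normal)
    if p_u == 0.0:
        raise ValueError("track is parallel to the plane: direction tangents diverge")
    rel = tuple(a - b for a, b in zip(position, ref_point))
    return (
        charge / p,                    # psi = q/p
        dot(momentum, v_axis) / p_u,   # dv/du
        dot(momentum, w_axis) / p_u,   # dw/du
        dot(rel, v_axis),              # v
        dot(rel, w_axis),              # w
    )
```

The direction tangents diverge for tracks parallel to the plane, which is why this parametrization suits tracks that cross the detector planes at moderate angles.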

Another planar reference system is the curvilinear frame, which is useful for transporting uncertainties of track parameters; see Sect. 4.4. It is a hybrid local/global reference frame. The curvilinear plane is always orthogonal to the direction of the track with the parametrization:

$$\displaystyle \begin{aligned} q_1 = \psi ,\ q_2 = \phi ,\ q_3 = \lambda ,\ q_4 = x_{\bot},\ q_5 = y_{\bot},{} \end{aligned} $$
(4.5)

where \(x_{\bot}\) and \(y_{\bot}\) are orthogonal position coordinates inside the plane; see Fig. 4.3. The \(x_{\bot}\)-axis is parallel to the global (x, y)-plane. The azimuth and dip angles ϕ and λ are defined at the point of intersection of the track with the curvilinear plane, but their values are measured in the global Cartesian coordinate system. The tangent vector is thus proportional to \((\cos \lambda \cos \phi ,\cos \lambda \sin \phi ,\sin \lambda ){{ }^{\mathsf {T}}}\).

Fig. 4.3

Track parametrization according to Eq. (4.5). The parameters \(x_{\bot}\) and \(y_{\bot}\) are the distances from the reference point in the plane perpendicular to the track. The direction of the tangent is measured in global polar coordinates

In addition to the above-mentioned, surface-based frames, a global, Cartesian coordinate frame is also frequently used. In this frame, the track parameters are the position vector r and the momentum vector p at that point. Since these are not tied to a surface, six parameters are needed in order to uniquely specify the state of the track. The rank of their covariance matrix is at most five.

3 Track Propagation

The track model, given by the solution of the equations of motion, describes the functional dependence of the state vector \({\boldsymbol{q}}_j\) at a surface j on the state vector \({\boldsymbol{q}}_i\) at a different surface i:

$$\displaystyle \begin{gathered} {} {\boldsymbol{q}}_j = \boldsymbol{f}_{j{\hspace{0.5pt}|\hspace{0.5pt}} i}\left({\boldsymbol{q}}_i\right). \end{gathered} $$
(4.6)

The function \(\boldsymbol{f}_{j|i}\) is called the track propagator from surface i to surface j; see Fig. 4.4. When closed-form solutions of the equations of motion exist, e.g., in the two situations of vanishing magnetic field and homogeneous magnetic field, the track propagator can be written as an explicit function of the path length. In the straight-line solution of the equations of motion in a vanishing magnetic field, it is easy to also derive an analytical formula for the path length between two surfaces. For the helical solution in a homogeneous magnetic field, however, such an analytical formula exists only for propagation to cylinders with symmetry axis parallel to the field direction or to planes orthogonal to the field direction. Otherwise, a Newton iteration or a parabolic approximation has to be used to find the path length.

Fig. 4.4

Track propagator from surface i to surface j

3.1 Homogeneous Magnetic Fields

The helical track propagator takes the solution to Eq. (4.2) as a starting point. The solution [4] can be written as:

$$\displaystyle \begin{gathered} {} {\boldsymbol{r}} (s) = {\boldsymbol{r}}_0 + \frac{\delta}{K} \left( \theta - \sin \theta \right) \cdot {\boldsymbol{h}} + \frac{\sin \theta}{K} \cdot {\boldsymbol{t}}_0 + \frac{\alpha}{K} \left( 1 - \cos \theta \right) \cdot {\boldsymbol{n}}_0, \end{gathered} $$
(4.7)

where r(s) is the position vector of the point on the helix at path length s from the reference point \({\boldsymbol{r}}_0\) (at s = 0), h = B∕| B | is the normalized magnetic field vector, t = p∕p is the unit tangent vector to the track, \({\boldsymbol {n}} = \left ( {\boldsymbol {h}} \times {\boldsymbol {t}} \right )/ \alpha \) with α = | h × t |, δ = h ⋅ t, K = −k ψ | B |, and θ = K s. In the following, the subscript “0” indicates quantities defined at the initial point s = 0. Any point along the trajectory can be specified by a corresponding value of s. The equation of the unit tangent vector t is found by differentiating Eq. (4.7) with respect to s,

$$\displaystyle \begin{gathered} {} {\boldsymbol{t}}(s) = \frac{\mathrm{d} {\boldsymbol{r}} (s)}{\mathrm{d} s} = \delta \left( 1 - \cos \theta \right) \cdot {\boldsymbol{h}} + \cos \theta \cdot {\boldsymbol{t}}_0 + \alpha \sin \theta \cdot {\boldsymbol{n}}_0. \end{gathered} $$
(4.8)

For a given value of s, any desired set of track parameters can be calculated from Eq. (4.7) (positions) and Eq. (4.8) (directions). In the helical track model, the momentum p is constant and therefore has the same value for all s.
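Equations (4.7) and (4.8) translate almost literally into code. The sketch below is one possible transcription, not a reference implementation: the name `helix_propagate` is hypothetical, the product α n₀ is computed directly as h × t₀ to avoid dividing by α for tracks parallel to the field, and the straight-line fallback for K = 0 is an added convenience.

```python
import math

K_UNIT = 0.29979  # k in GeV/c T^-1 m^-1, as quoted with Eq. (4.1)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def helix_propagate(r0, t0, psi, b_vec, s):
    """Return (r, t) at path length s according to Eqs. (4.7) and (4.8).

    r0, t0 : initial position (m) and unit tangent vector
    psi    : q/p in (GeV/c)^-1
    b_vec  : homogeneous magnetic field in tesla
    """
    b_mag = math.sqrt(dot(b_vec, b_vec))
    big_k = -K_UNIT * psi * b_mag
    if big_k == 0.0:
        # Straight-line limit: no field or vanishing q/p.
        return tuple(r + s * t for r, t in zip(r0, t0)), tuple(t0)
    h = tuple(b / b_mag for b in b_vec)
    theta = big_k * s
    delta = dot(h, t0)
    alpha_n0 = cross(h, t0)  # equals alpha * n0
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    r = tuple(
        r0[i]
        + delta * (theta - sin_t) / big_k * h[i]
        + sin_t / big_k * t0[i]
        + (1.0 - cos_t) / big_k * alpha_n0[i]
        for i in range(3)
    )
    t = tuple(
        delta * (1.0 - cos_t) * h[i] + cos_t * t0[i] + sin_t * alpha_n0[i]
        for i in range(3)
    )
    return r, t
```

For a track perpendicular to the field, propagating by s = 2π∕|K| should return the particle to its starting point with its original direction, which makes a convenient sanity check.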

3.2 Inhomogeneous Magnetic Fields

In an inhomogeneous magnetic field, the equations of motion have no exact closed-form solutions, and one has to resort to numerical, approximate solutions.

3.2.1 Runge–Kutta Methods

Runge–Kutta methods are iterative algorithms for the approximate numerical solutions of ordinary differential equations, given initial values. Among Runge–Kutta methods, the Runge–Kutta–Nyström algorithm is specifically designed for second-order equations such as Eq. (4.2). In the fourth-order version, a step of length h, starting at \(s = s_n\), is computed by [1]:

$$\displaystyle \begin{gathered} {} \begin{aligned} {\boldsymbol{r}}_{n+1}&={\boldsymbol{r}}_n+h\hspace{0.5pt}{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h^2\hspace{0.5pt}({\boldsymbol{k}}_1+{\boldsymbol{k}}_2+{\boldsymbol{k}}_3)/6,\\ {\boldsymbol{\dot{{\boldsymbol{r}}}}}_{n+1}&={\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h\hspace{0.5pt}({\boldsymbol{k}}_1+2{\boldsymbol{k}}_2+2{\boldsymbol{k}}_3+{\boldsymbol{k}}_4)/6, \end{aligned} \end{gathered} $$
(4.9)

with the intermediate stages k defined by

$$\displaystyle \begin{gathered} {} \begin{aligned} {\boldsymbol{k}}_1&={\boldsymbol{\varGamma}} (s_n,{\boldsymbol{r}}_n,{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n),\\ {\boldsymbol{k}}_2&={\boldsymbol{\varGamma}} (s_n+h/2,{\boldsymbol{r}}_n+h\hspace{0.5pt}{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n/2+h^2{\boldsymbol{k}}_1/8,{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h{\boldsymbol{k}}_1/2),\\ {\boldsymbol{k}}_3&={\boldsymbol{\varGamma}} (s_n+h/2,{\boldsymbol{r}}_n+h\hspace{0.5pt}{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n/2+h^2{\boldsymbol{k}}_1/8,{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h{\boldsymbol{k}}_2/2),\\ {\boldsymbol{k}}_4&={\boldsymbol{\varGamma}} (s_n+h,{\boldsymbol{r}}_n+h\hspace{0.5pt}{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h^2{\boldsymbol{k}}_3/2,{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+h{\boldsymbol{k}}_3), \end{aligned} \end{gathered} $$
(4.10)

where \({\boldsymbol{r}}_n\) is the position of the particle at \(s=s_n\), \({\boldsymbol {\dot {{\boldsymbol {r}}}}}_n\) is the unit tangent vector, and Γ is defined in Eq. (4.2). The magnetic field needs to be looked up three times per step, at the positions \({\boldsymbol {r}}_n,\ {\boldsymbol {r}}_n+h\hspace{0.5pt}{\boldsymbol {\dot {{\boldsymbol {r}}}}}_n/2+h^2{\boldsymbol {k}}_1/8\), and \({\boldsymbol {r}}_n+h\hspace{0.5pt}{\boldsymbol {\dot {{\boldsymbol {r}}}}}_n+h^2{\boldsymbol {k}}_3/2\). If the field at the final position \({\boldsymbol{r}}_{n+1}\), which is the starting position of the next step, is approximated by the field used for \({\boldsymbol{k}}_4\), only two lookups are required per step.
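One step of Eqs. (4.9) and (4.10) can be sketched as follows. The names `rkn4_step`, `gamma`, and `lincomb` are hypothetical, and `field` stands for any user-supplied callable that returns the field vector in tesla at a given position. The field is looked up exactly three times, and the two middle stages share one lookup.

```python
K_UNIT = 0.29979  # k in GeV/c T^-1 m^-1


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lincomb(*terms):
    """Sum of coefficient * vector over all (coefficient, vector) pairs."""
    return tuple(sum(c * v[i] for c, v in terms) for i in range(3))


def gamma(rdot, b_vec, psi):
    """Right-hand side of Eq. (4.2) for a field value that is already looked up."""
    return lincomb((K_UNIT * psi, cross(rdot, b_vec)))


def rkn4_step(r, rdot, psi, field, h):
    """Advance (r, rdot) by one step of length h; also return the four stages."""
    k1 = gamma(rdot, field(r), psi)
    r_mid = lincomb((1.0, r), (h / 2.0, rdot), (h * h / 8.0, k1))
    b_mid = field(r_mid)  # shared by k2 and k3
    k2 = gamma(lincomb((1.0, rdot), (h / 2.0, k1)), b_mid, psi)
    k3 = gamma(lincomb((1.0, rdot), (h / 2.0, k2)), b_mid, psi)
    r_end = lincomb((1.0, r), (h, rdot), (h * h / 2.0, k3))
    k4 = gamma(lincomb((1.0, rdot), (h, k3)), field(r_end), psi)
    w = h * h / 6.0
    r_new = lincomb((1.0, r), (h, rdot), (w, k1), (w, k2), (w, k3))
    rdot_new = lincomb((1.0, rdot), (h / 6.0, k1), (h / 3.0, k2),
                       (h / 3.0, k3), (h / 6.0, k4))
    return r_new, rdot_new, (k1, k2, k3, k4)
```

Returning the stages alongside the new state is a design choice made here so that a step-size controller can reuse them without recomputation.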

If integration variables other than the path length s are used, for instance the radius R or the position coordinate z, the integration equations Eq. (4.9) are very similar. The difference is that the step length h must be expressed in terms of R or z, and that the function Γ has a different form when expressed in other variables [3].

If the field is (almost) homogeneous, as for example in a solenoid, the step size h can be chosen to be constant; otherwise a variable step size, taking into account the local inhomogeneity of the field, is more efficient. Determining the variable step size along the propagation is done by a step-size selection algorithm. The essence of such an algorithm is to assess the local error 𝜖 of the Runge–Kutta step and compare it to a user-defined error tolerance τ. The so-called embedded Runge–Kutta pairs, originally invented by Fehlberg [5], provide a measure of the local error in an elegant way by producing solutions of different orders during the same step with little extra computational cost. The higher-order solution is denoted \({\boldsymbol{r}}_{n+1}\), whereas the lower-order solution is denoted \({\hat {\boldsymbol {{\boldsymbol {r}}}}}_{n+1}\). The difference

$$\displaystyle \begin{gathered} \epsilon = |\hspace{0.5pt}{\boldsymbol{r}}_{n+1} - {\hat{\boldsymbol{{\boldsymbol{r}}}}}_{n+1}\hspace{0.5pt}| \end{gathered} $$
(4.11)

between these solutions constitutes a measure of the error of the step. A popular algorithm for the step size \(h_{n+1}\) of step n + 1 is [6]

$$\displaystyle \begin{gathered} {} h_{n+1} = h_n \left( \frac{\tau}{\epsilon} \right)^{{1}/(q+1)}, \end{gathered} $$
(4.12)

where \(h_n\) is the size of step n and q is the order of the lower-order solution. This algorithm shortens the step size if the local error is larger than the tolerance and lengthens it if the local error is smaller than the tolerance, forcing the local error to oscillate around the tolerance.

The original version of the Runge–Kutta–Nyström algorithm is not of the embedded type and therefore contains no direct recipe for estimating the local error. It was realized by the authors of [7], however, that a fourth-order derivative of r can be formed by a combination of the various stages calculated along the step. This derivative is implicitly the difference between a fourth-order solution and a third-order solution and can therefore be used as a measure of the local error 𝜖:

$$\displaystyle \begin{gathered} \epsilon = h^2 \cdot |{\boldsymbol{k}}_1 - {\boldsymbol{k}}_2 -{\boldsymbol{k}}_3 + {\boldsymbol{k}}_4| . \end{gathered} $$
(4.13)

This error measure can be used for step-size selection according to Eq. (4.12) and constitutes an adaptive version of the Runge–Kutta–Nyström algorithm when used alongside the integration steps in Eq. (4.9).
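The error estimate of Eq. (4.13) and the step-size rule of Eq. (4.12) can be sketched as two small functions. The default `order=3` reflects the statement that the estimate compares a fourth-order with a third-order solution. The clamping factors `min_factor` and `max_factor` are a common safeguard assumed here, not something prescribed by the text.

```python
import math


def local_error(h, k1, k2, k3, k4):
    """Error measure of Eq. (4.13) from the four stages of one step."""
    diff = [a - b - c + d for a, b, c, d in zip(k1, k2, k3, k4)]
    return h * h * math.sqrt(sum(x * x for x in diff))


def next_step_size(h, eps, tol, order=3, min_factor=0.25, max_factor=4.0):
    """Step size for the next step according to Eq. (4.12).

    h     : size of the step just taken
    eps   : local error of that step
    tol   : user-defined error tolerance
    order : order q of the lower-order solution
    """
    if eps <= 0.0:
        return h * max_factor  # error below resolution: grow as far as allowed
    factor = (tol / eps) ** (1.0 / (order + 1))
    return h * min(max_factor, max(min_factor, factor))
```

In a full propagator one would typically also reject and repeat a step whose error exceeds the tolerance; that policy is left out of this sketch.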

3.2.2 Approximate Analytical Formula

An approximate analytical formula for track extrapolation in an inhomogeneous field is described in [2]. The magnetic field B(z) is assumed to depend only on the z-coordinate. The particle is assumed to move predominantly in the z-direction, so that z can serve as the independent variable, and the track parameters are \(x, y, t_x, t_y, \psi\), where \(t_x\) and \(t_y\) are the direction tangents. In this parametrization, the equations of motion read:

$$\displaystyle \begin{aligned} x'&=t_x, \\ y'&=t_y,\\ t_x^{\prime}&=h\cdot[t_x t_y B_1-(1+t_x^2)B_2+t_yB_3]={\boldsymbol{a}}(z)\cdot{\boldsymbol{B}}(z),\\ t_y^{\prime}&=h\cdot[(1+t_y^2)B_1-t_xt_yB_2-t_xB_3]={\boldsymbol{b}}(z)\cdot{\boldsymbol{B}}(z),\\ \psi'&=0, \end{aligned} $$
(4.14)

where \(h=k\hspace{0.5pt}\psi \left (1+t_x^2+t_y^2\right )^{1/2}\) and the prime denotes differentiation with respect to z. The aim is to find formulas for the extrapolation of \((t_x, t_y)\) from \(z_0\) to \(z_{\mathrm{e}}\); the extrapolation of x and y can then be performed by integration of the direction tangents.
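The derivatives of the direction tangents can be evaluated directly from the cross product in Eq. (4.2) rewritten with z as the independent variable, which gives a \(t_x'\) whose middle term multiplies \(B_2\) and a \(t_y'\) whose leading term multiplies \(B_1\). The function name below is a hypothetical choice, and the field components are passed in as already looked-up values.

```python
import math

K_UNIT = 0.29979  # k in GeV/c T^-1 m^-1


def direction_tangent_derivatives(tx, ty, psi, b_vec):
    """Return (dtx/dz, dty/dz) for field components b_vec = (B1, B2, B3) in tesla.

    tx, ty : direction tangents dx/dz and dy/dz
    psi    : q/p in (GeV/c)^-1
    """
    b1, b2, b3 = b_vec
    h = K_UNIT * psi * math.sqrt(1.0 + tx * tx + ty * ty)
    dtx = h * (tx * ty * b1 - (1.0 + tx * tx) * b2 + ty * b3)
    dty = h * ((1.0 + ty * ty) * b1 - tx * ty * b2 - tx * b3)
    return dtx, dty
```

For a track along the z-axis (both tangents zero), only the transverse field components bend the track, and a field purely along z leaves both derivatives at zero.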

Let \(T(z) = T(t_x(z), t_y(z))\) be a function of \(t_x\) and \(t_y\). Then

$$\displaystyle \begin{gathered} T'=\displaystyle\frac{\partial T}{\partial t_x}\hspace{0.5pt} t^{\prime}_x+\displaystyle\frac{\partial T}{\partial t_y}\hspace{0.5pt} t^{\prime}_y=\left(\displaystyle\frac{\partial T}{\partial t_x}\hspace{0.5pt}{\boldsymbol{a}}+\displaystyle\frac{\partial T}{\partial t_y}\hspace{0.5pt}{\boldsymbol{b}}\right)\cdot{\boldsymbol{B}}=\sum_{i_1=1}^3 T_{i_1}\hspace{0.5pt} B_{i_1}. \end{gathered} $$
(4.15)

The derivatives of the functions \(T_{i_1}\) can be represented in a similar way:

$$\displaystyle \begin{gathered} T_{i_1}^{\prime}=\displaystyle\frac{\partial T_{i_1}}{\partial t_x}\hspace{0.5pt} t_x^{\prime}+\displaystyle\frac{\partial T_{i_1}}{\partial t_y}\hspace{0.5pt} t_y^{\prime}=\left(\displaystyle\frac{\partial T_{i_1}}{\partial t_x}\hspace{0.5pt}{\boldsymbol{a}}+\displaystyle\frac{\partial T_{i_1}}{\partial t_y}\hspace{0.5pt}{\boldsymbol{b}}\right)\cdot{\boldsymbol{B}}=\sum_{i_2=1}^3 T_{i_1 i_2}\hspace{0.5pt} B_{i_2}. \end{gathered} $$
(4.16)

When this process is continued, the following functions can be defined recursively:

$$\displaystyle \begin{aligned} T^{\prime}_{i_1\ldots i_{k-1}}&=\sum_{i_k=1}^3 T_{i_1\ldots i_k} B_{i_k}, \end{aligned} $$
(4.17)
$$\displaystyle \begin{aligned} T_{i_1\ldots i_k}&=\displaystyle\frac{\partial T_{i_1\ldots i_{k-1}}}{\partial t_x}\hspace{0.5pt} a_{i_k} + \displaystyle\frac{\partial T_{i_1\ldots i_{k-1}}}{\partial t_y}\hspace{0.5pt} b_{i_k}. \end{aligned} $$
(4.18)

The exact relation

$$\displaystyle \begin{gathered} T(z_{\mathrm{e}})=T(z_0)+\int_{z_0}^{z_{\mathrm{e}}} T'(z)\hspace{0.5pt}\mathrm{d}z \end{gathered} $$
(4.19)

can then be expanded into:

(4.20)

If the expansion is terminated after n steps, the compact form of Eq. (4.20) reads:

(4.21)

The integral over the field components is taken along an approximate trajectory. Setting T = t x and T = t y gives the desired extrapolation formulas. For a comparison with a Runge–Kutta solver, see [2].
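The recursion in Eqs. (4.17) and (4.18) can be made concrete by carrying out the first substitutions by hand. The following is a sketch obtained by inserting Eq. (4.15) into Eq. (4.19) and then applying Eq. (4.19) to \(T_{i_1}\) itself; the integration variables \(z_1, z_2, \ldots\) and the remainder symbol \(R_n\) are notation assumed here for illustration:

$$\displaystyle \begin{aligned} T(z_{\mathrm{e}})&=T(z_0)+\sum_{i_1=1}^3\int_{z_0}^{z_{\mathrm{e}}} T_{i_1}(z_1)\hspace{0.5pt} B_{i_1}(z_1)\hspace{0.5pt}\mathrm{d}z_1\\ &=T(z_0)+\sum_{i_1=1}^3 T_{i_1}(z_0)\int_{z_0}^{z_{\mathrm{e}}} B_{i_1}(z_1)\hspace{0.5pt}\mathrm{d}z_1+\sum_{i_1,i_2=1}^3\int_{z_0}^{z_{\mathrm{e}}} B_{i_1}(z_1)\int_{z_0}^{z_1} T_{i_1 i_2}(z_2)\hspace{0.5pt} B_{i_2}(z_2)\hspace{0.5pt}\mathrm{d}z_2\hspace{0.5pt}\mathrm{d}z_1. \end{aligned} $$

Repeating the substitution n times leaves coefficients evaluated at \(z_0\) that multiply nested integrals over field components only:

$$\displaystyle \begin{gathered} T(z_{\mathrm{e}})=T(z_0)+\sum_{k=1}^{n}\ \sum_{i_1,\ldots,i_k=1}^3 T_{i_1\ldots i_k}(z_0)\int_{z_0}^{z_{\mathrm{e}}} B_{i_1}(z_1)\int_{z_0}^{z_1} B_{i_2}(z_2)\cdots\int_{z_0}^{z_{k-1}} B_{i_k}(z_k)\hspace{0.5pt}\mathrm{d}z_k\cdots\mathrm{d}z_1+R_n, \end{gathered} $$

where \(R_n\) collects the remaining nested integral that still contains \(T_{i_1\ldots i_{n+1}}\) under the integral sign.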

4 Error Propagation

The task of transporting the covariance matrix of the track parameters is essential for any track reconstruction algorithm, either from one set of track parameters to another in the same reference surface, or during track propagation from one surface to another.

This so-called error propagation is straightforward when the transformed track parameter vector is a strictly linear function of the initial track parameter vector. The track propagator in Eq. (4.6), however, is in general a non-linear function, and exact error propagation is not feasible. The common solution is approximate linearized error propagation; see Sect. 3.2.3.2 and Fig. 4.5. The derivatives of the track propagator \(\boldsymbol{f}_{j|i}\) are collected in the Jacobian matrix \(\boldsymbol{F}_{j|i}\):

$$\displaystyle \begin{gathered} {} \boldsymbol{F}_{j{\hspace{0.5pt}|\hspace{0.5pt}} i} = \displaystyle\frac{\partial {\boldsymbol{q}}_j}{\partial {\boldsymbol{q}}_i}. \end{gathered} $$
(4.22)

The covariance matrix \({\boldsymbol{C}}_i\) of the track parameters at surface i is transported to surface j according to:

$$\displaystyle \begin{gathered} {\boldsymbol{C}}_j \approx \boldsymbol{F}_{j{\hspace{0.5pt}|\hspace{0.5pt}} i}\hspace{0.5pt} {\boldsymbol{C}}_i\hspace{0.5pt} \boldsymbol{F}_{j{\hspace{0.5pt}|\hspace{0.5pt}} i} {{}^{\mathsf{T}}} . \end{gathered} $$
(4.23)
Fig. 4.5

Error propagation from surface i to surface j. The error bars around the track positions and the cones around the direction vectors symbolize the growing uncertainty of the track parameters

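A general method for computing the Jacobians is numerical differentiation [8, Section 5.7], by propagating a reference track and five other, nearby tracks from surface i to surface j. Consider a reference track with parameter vector \({\boldsymbol{q}}_i\) at surface i. A small variation of component l (l = 1, …, 5) in \({\boldsymbol{q}}_i\) is introduced by adding to \({\boldsymbol{q}}_i\) the vector \({\boldsymbol{\varDelta}}_l = (\delta_{1l}\hspace{0.5pt}h_l, \ldots, \delta_{5l}\hspace{0.5pt}h_l)^{\mathsf{T}}\), where \(\delta_{ml}\) is the Kronecker delta and \(h_l\) is the size of the variation. The corresponding change in parameter k (k = 1, …, 5) at surface j is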

$$\displaystyle \begin{gathered} \varDelta_{kl} = f_k ({\boldsymbol{q}}_i + {\boldsymbol{\varDelta}}_{l}) - f_k ({\boldsymbol{q}}_i), \end{gathered} $$
(4.24)

where \(f_k\) is component k of the track propagator \(\boldsymbol{f}_{j|i}\). The elements \((\boldsymbol{F}_{j|i})_{kl}\) of the numerical Jacobian matrix \(\boldsymbol{F}_{j|i}\) are then obtained by evaluating

$$\displaystyle \begin{gathered} (\boldsymbol{F}_{j{\hspace{0.5pt}|\hspace{0.5pt}} i})_{kl} = \frac{\varDelta_{kl}}{h_l}. \end{gathered} $$
(4.25)

This procedure works for all track propagators, irrespective of whether or not they are in closed form.
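The forward-difference recipe of Eqs. (4.24) and (4.25), together with the covariance transport of Eq. (4.23), can be sketched in a few lines. The function names are hypothetical, and the two-parameter straight-line `drift` propagator is a toy stand-in for a real five-parameter track propagator.

```python
def numerical_jacobian(propagator, q, steps):
    """Forward-difference Jacobian of `propagator` at the parameter vector q.

    propagator : callable mapping a parameter list to a parameter list
    steps      : variation h_l applied to component l
    """
    f0 = propagator(list(q))
    jac = [[0.0] * len(q) for _ in f0]
    for l, h_l in enumerate(steps):
        shifted = list(q)
        shifted[l] += h_l                      # q + Delta_l
        f_l = propagator(shifted)
        for k in range(len(f0)):
            jac[k][l] = (f_l[k] - f0[k]) / h_l
    return jac


def transport_covariance(jac, cov):
    """Linearized error propagation: F C F^T."""
    n_out, n_in = len(jac), len(cov)
    fc = [[sum(jac[a][m] * cov[m][b] for m in range(n_in)) for b in range(n_in)]
          for a in range(n_out)]
    return [[sum(fc[a][m] * jac[b][m] for m in range(n_in)) for b in range(n_out)]
            for a in range(n_out)]


def drift(q, length=2.0):
    """Toy propagator: position and slope after a field-free drift."""
    return [q[0] + length * q[1], q[1]]


jacobian = numerical_jacobian(drift, [0.1, 0.01], [1e-3, 1e-3])
covariance = transport_covariance(jacobian, [[4.0, 0.0], [0.0, 1.0]])
```

The choice of the variations \(h_l\) is a trade-off: too large and the non-linearity of the propagator biases the result, too small and rounding errors dominate.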

4.1 Homogeneous Magnetic Fields

The exposition in this section closely follows the treatment in [4]. In a homogeneous magnetic field, it is possible to obtain analytical formulas for the Jacobians defined in Eq. (4.22). The problem of calculating transport Jacobians from one plane of arbitrary orientation to another is naturally decomposed into three separate parts because the error propagation from one spatial location to another is performed most easily in a coordinate frame which moves along the track, i.e., the curvilinear frame introduced in Sect. 4.2. Therefore, the natural decomposition is first a transformation from a local coordinate system to the curvilinear frame at the initial surface, then a transport within the curvilinear frame to the destination surface, and finally a transformation from the curvilinear frame to a local frame at the destination surface. The total Jacobian is the matrix product of the three intermediate Jacobians.

The starting point for calculating the transport Jacobians is the set of differentials relating variations of position, direction, and momentum at the initial point (s = 0) to variations of the same quantities at any other point along the helix. These differentials are given by

$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{r}} & = \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial {\boldsymbol{r}}_0} \cdot \mathrm{d}{\boldsymbol{r}}_0 + \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial {\boldsymbol{t}}_0} \cdot \mathrm{d}{\boldsymbol{t}}_0 + \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial \psi_0} \cdot \mathrm{d}\psi_0 + \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial s} \cdot \mathrm{d}s, \end{aligned} $$
(4.26)
$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{t}} & = \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial {\boldsymbol{t}}_0} \cdot\mathrm{d}{\boldsymbol{t}}_0 + \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial \psi_0} \cdot \mathrm{d}\psi_0 + \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial s} \cdot \mathrm{d}s,\end{aligned} $$
(4.27)

where \(\mathrm{d}\psi_0\) is the variation of the signed inverse momentum at the initial point and ds is the change in path length of the helix due to the variations at the initial point. An illustration of this effect is shown in Fig. 4.6.

Fig. 4.6

A track and the displaced track due to a variation \(\mathrm{d}{\boldsymbol{r}}_0\) are shown. In the error propagation, the change ds of the path length has to be taken into account. In this specific case, dr is understood to be perpendicular to the track (see also Eq. (4.41))

The partial derivatives are obtained by direct differentiation of Eqs. (4.7) and (4.8). The results are:

$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{r}}}{\partial {\boldsymbol{r}}_0} \cdot\mathrm{d}{\boldsymbol{r}}_0 = \mathrm{d}{\boldsymbol{r}}_0, \end{aligned} $$
(4.28)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{r}}}{\partial {\boldsymbol{t}}_0} \cdot\mathrm{d}{\boldsymbol{t}}_0 = \frac{\theta - \sin \theta}{K} \cdot \left( {\boldsymbol{h}} \cdot\mathrm{d}{\boldsymbol{t}}_0 \right) \cdot {\boldsymbol{h}} + \frac{\sin \theta}{K} \cdot\mathrm{d}{\boldsymbol{t}}_0 + \frac{1 - \cos \theta}{K} \cdot \left( {\boldsymbol{h}} \times\mathrm{d}{\boldsymbol{t}}_0 \right), \end{aligned} $$
(4.29)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{r}}}{\partial \psi_0} \cdot \mathrm{d}\psi_0 = \frac 1{\psi} \left[ s \cdot {\boldsymbol{t}} + {\boldsymbol{r}}_0 - {\boldsymbol{r}} \right] \cdot \mathrm{d}\psi_0, \end{aligned} $$
(4.30)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{r}}}{\partial s} \cdot \mathrm{d}s = {\boldsymbol{t}} \cdot \mathrm{d}s, \end{aligned} $$
(4.31)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{t}}}{\partial {\boldsymbol{t}}_0} \cdot\mathrm{d}{\boldsymbol{t}}_0 = \cos \theta \cdot\mathrm{d}{\boldsymbol{t}}_0 + \left( 1 - \cos \theta \right) \cdot \left( {\boldsymbol{h}} \cdot\mathrm{d}{\boldsymbol{t}}_0 \right) \cdot {\boldsymbol{h}} + \sin \theta \cdot \left( {\boldsymbol{h}} \times\mathrm{d}{\boldsymbol{t}}_0 \right), \end{aligned} $$
(4.32)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{t}}}{\partial \psi_0} \cdot \mathrm{d}\psi_0 = \frac{\alpha K s}{\psi} \cdot {\boldsymbol{n}} \cdot \mathrm{d}\psi_0, \end{aligned} $$
(4.33)
$$\displaystyle \begin{aligned} &\displaystyle\frac{\partial {\boldsymbol{t}}}{\partial s} \cdot \mathrm{d}s = \alpha K \cdot {\boldsymbol{n}} \cdot \mathrm{d}s. \end{aligned} $$
(4.34)

4.1.1 Transformation from One Curvilinear Frame to Another

The curvilinear frame is uniquely defined at each point along the track by three orthogonal unit vectors u, v and t, defining a coordinate system \(\left ( x_\bot , y_\bot , z_\bot \right )\). The vector t has been defined above as the unit vector parallel to the track, and pointing in the particle direction. The two vectors u and v are defined by

$$\displaystyle \begin{gathered} {\boldsymbol{u}} = \frac{{\boldsymbol{z}} \times {\boldsymbol{t}}}{|\hspace{0.5pt}{\boldsymbol{z}} \times {\boldsymbol{t}}\hspace{0.5pt}|},\ \; {\boldsymbol{v}} = {\boldsymbol{t}} \times {\boldsymbol{u}}, \end{gathered} $$
(4.35)

where z is the unit vector pointing in the direction of the global z-axis. This means that the \(z_\bot\)-axis is pointing along the particle direction, the \(x_\bot\)-axis is parallel to the global (x, y)-plane, while the \(y_\bot\)-axis is given by the requirement that the three axes should form a Cartesian, right-handed coordinate system. The relations between the momentum components \(\left ( p_x, p_y, p_z \right )\) in the global Cartesian frame and the angles are:

$$\displaystyle \begin{gathered} \begin{aligned} p_x & = p \cos \lambda \cos \phi, \\ p_y & = p \cos \lambda \sin \phi, \\ p_z & = p \sin \lambda. \end{aligned} \end{gathered} $$
(4.36)
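The curvilinear unit vectors of Eq. (4.35) and the momentum components of Eq. (4.36) can be sketched as follows. The function names are hypothetical, and the explicit error for a track along the global z-axis is an added safeguard, since u is undefined there.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def momentum_components(p, phi, lam):
    """Global (px, py, pz) from magnitude, azimuth, and dip angle, Eq. (4.36)."""
    return (p * math.cos(lam) * math.cos(phi),
            p * math.cos(lam) * math.sin(phi),
            p * math.sin(lam))


def curvilinear_basis(phi, lam):
    """Return the unit vectors (u, v, t) of the curvilinear frame, Eq. (4.35)."""
    t = momentum_components(1.0, phi, lam)       # unit tangent
    z_cross_t = cross((0.0, 0.0, 1.0), t)
    norm = math.sqrt(sum(c * c for c in z_cross_t))
    if norm == 0.0:
        raise ValueError("track along the global z-axis: u is undefined")
    u = tuple(c / norm for c in z_cross_t)
    v = cross(t, u)
    return u, v, t
```

By construction u has no z-component, which is the statement that the \(x_\bot\)-axis is parallel to the global (x, y)-plane.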

The Jacobian of the transformation from a curvilinear frame \(\left ( \psi , \phi , \lambda , x_{\bot }, y_{\bot } \right )\) at \(s_0 = 0\) to the same set of parameters at path length s is then derived by forming the differentials dr and dt, introducing the specific constraints given by the curvilinear frames,

$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{r}}_0 & = {\boldsymbol{u}}_0 \cdot \mathrm{d}{}x_{\bot 0} + {\boldsymbol{v}}_0 \cdot \mathrm{d}{}y_{\bot 0}, \end{aligned} $$
(4.37)
$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{t}}_0 & = \displaystyle\frac{\partial {\boldsymbol{t}}_0}{\partial \phi_0} \cdot \mathrm{d}\phi_0+\displaystyle\frac{\partial {\boldsymbol{t}}_0}{\partial \lambda_0} \cdot \mathrm{d}\lambda_0 = \cos \lambda_0 \cdot {\boldsymbol{u}}_0 \cdot \mathrm{d}\phi_0 + {\boldsymbol{v}}_0 \cdot \mathrm{d}\lambda_0, \end{aligned} $$
(4.38)
$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{r}} & = {\boldsymbol{u}} \cdot \mathrm{d}{}x_\bot + {\boldsymbol{v}} \cdot \mathrm{d}{}y_\bot, \end{aligned} $$
(4.39)
$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{t}} & = \cos \lambda \cdot {\boldsymbol{u}} \cdot \mathrm{d}\phi + {\boldsymbol{v}} \cdot \mathrm{d}\lambda. \end{aligned} $$
(4.40)

Moreover, since dr is now defined to be a variation in a plane perpendicular to the track, the functional dependence of ds on the variations of position, direction, and momentum at the initial point can be evaluated by multiplying Eq. (4.26) with t and using the constraint dr ⋅t = 0. One obtains

$$\displaystyle \begin{gathered} {} \mathrm{d}s = - {\boldsymbol{t}} \cdot\mathrm{d}{\boldsymbol{r}}_0 - {\boldsymbol{t}} \cdot \left( \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial {\boldsymbol{t}}_0} \cdot\mathrm{d}{\boldsymbol{t}}_0 \right) - \left( {\boldsymbol{t}} \cdot \displaystyle\frac{\partial {\boldsymbol{r}}}{\partial \psi_0} \right) \cdot \mathrm{d}\psi_0. \end{gathered} $$
(4.41)

Inserting Eq. (4.41) and Eqs. (4.28)–(4.34) into Eqs. (4.26) and (4.27), making use of Eqs. (4.37)–(4.40), yields a set of equations relating variations of the parameters at the initial surface to the variations at the destination surface. These can then be manipulated in a straightforward manner to yield the differentials of the parameters at the destination surface. Since, for instance, the differential \(\mathrm{d}x_{\bot}\) is defined as

$$\displaystyle \begin{gathered} \mathrm{d}x_{\bot} = \displaystyle\frac{\partial x_{\bot}}{\partial \psi_0} \cdot \mathrm{d}\psi_0 + \displaystyle\frac{\partial x_{\bot}}{\partial \phi_0} \cdot \mathrm{d}\phi_0 + \displaystyle\frac{\partial x_{\bot}}{\partial \lambda_0} \cdot \mathrm{d}\lambda_0 + \displaystyle\frac{\partial x_{\bot}}{\partial x_{\bot 0}} \cdot \mathrm{d}x_{\bot 0} + \displaystyle\frac{\partial x_{\bot}}{\partial y_{\bot 0}} \cdot \mathrm{d}y_{\bot 0}, \end{gathered} $$
(4.42)

the different terms in the desired Jacobian can be identified as the multiplying factors of the variations at the initial surface. A complete list of these can be found in Appendix A. Since a perfectly helical track model is assumed, the momentum at the destination surface is equal to the momentum at the initial surface.

4.1.2 Transformations Between Curvilinear and Local Frames at a Fixed Point on the Particle Trajectory

Consider the transformation from a local plane—defined by a unit vector i normal to the plane and two, orthogonal unit vectors j and k inside the plane—to the curvilinear frame. These three unit vectors define the coordinate system \(\left ( u, v, w \right )\) introduced in Sect. 4.2. The aim is to derive the Jacobian of the transformation between \(\left ( \psi , v', w', v, w \right )\), where v′ = dv∕du and w′ = dw∕du, and \(\left ( \psi , \phi , \lambda , x_{\bot }, y_{\bot } \right )\) at a given point s on the particle trajectory. The relevant differentials are now

$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{r}} & = {\boldsymbol{u}} \cdot \mathrm{d}{}x_\bot + {\boldsymbol{v}} \cdot \mathrm{d}{}y_\bot = {\boldsymbol{j}} \cdot \mathrm{d} v + {\boldsymbol{k}} \cdot \mathrm{d} w + {\boldsymbol{t}} \cdot \mathrm{d}{}s, \end{aligned} $$
(4.43)
$$\displaystyle \begin{aligned} \mathrm{d}{\boldsymbol{t}} & = \cos \lambda \cdot {\boldsymbol{u}} \cdot \mathrm{d}\phi + {\boldsymbol{v}} \cdot \mathrm{d}\lambda = \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial v'} \cdot \mathrm{d} v' + \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial w'} \cdot \mathrm{d} w' + \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial s} \cdot \mathrm{d}s. \end{aligned} $$
(4.44)

Again dr is orthogonal to t, therefore ds can be calculated from Eq. (4.43), in a similar manner as before. The result is

$$\displaystyle \begin{gathered} {} \mathrm{d}s = - \left( {\boldsymbol{t}} \cdot {\boldsymbol{j}} \right) \cdot \mathrm{d} v - \left( {\boldsymbol{t}} \cdot {\boldsymbol{k}} \right) \cdot \mathrm{d} w. \end{gathered} $$
(4.45)

By inserting Eq. (4.45) into Eqs. (4.43) and (4.44) and following the same procedure as in Sect. 4.4.1.1, explicit expressions of the differentials can again be constructed. This also requires t and its derivatives, expressed in the local parameters:

$$\displaystyle \begin{aligned} {\boldsymbol{t}} & = \frac 1{\sqrt{1 + {v'}^2 + {w'}^2}} \left[ {\boldsymbol{i}} + v' {\boldsymbol{j}} + w' {\boldsymbol{k}}\right], \end{aligned} $$
(4.46)
$$\displaystyle \begin{aligned} \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial v'} & = \frac 1{\sqrt{1 + {v'}^2 + {w'}^2}} \left[ {\boldsymbol{j}} - \frac{v'}{\sqrt{1 + {v'}^2 + {w'}^2}} {\boldsymbol{t}} \right], \end{aligned} $$
(4.47)
$$\displaystyle \begin{aligned} \displaystyle\frac{\partial {\boldsymbol{t}}}{\partial w'} & = \frac 1{\sqrt{1 + {v'}^2 + {w'}^2}} \left[ {\boldsymbol{k}} - \frac{w'}{\sqrt{1 + {v'}^2 + {w'}^2}} {\boldsymbol{t}} \right]. \end{aligned} $$
(4.48)

The Jacobian of the transformation from the curvilinear to the local frame can be derived by inverting the Jacobian of the transformation from the local to the curvilinear frame. The expressions of both Jacobians can be found in Appendix A.
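As an illustration of Eqs. (4.46) to (4.48), the following minimal sketch evaluates t and its derivatives as components along (i, j, k) and compares them with central differences. The function names and the test point are assumptions made for this example only; the sign convention (t pointing along +i) is the one implied by Eq. (4.46).

```python
import math

def tangent_and_derivatives(vp, wp):
    """Return t, dt/dv' and dt/dw' as components along (i, j, k); Eqs. (4.46)-(4.48)."""
    norm = math.sqrt(1.0 + vp * vp + wp * wp)
    t = (1.0 / norm, vp / norm, wp / norm)
    j_axis, k_axis = (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    # dt/dv' = (j - (v'/norm) t) / norm, and likewise for w' with k
    dt_dvp = tuple((e - vp / norm * c) / norm for e, c in zip(j_axis, t))
    dt_dwp = tuple((e - wp / norm * c) / norm for e, c in zip(k_axis, t))
    return t, dt_dvp, dt_dwp

def central_difference(vp, wp, eps=1e-6):
    """Numerical derivatives of t, used only as a cross-check."""
    t_plus = tangent_and_derivatives(vp + eps, wp)[0]
    t_minus = tangent_and_derivatives(vp - eps, wp)[0]
    d_vp = tuple((a - b) / (2.0 * eps) for a, b in zip(t_plus, t_minus))
    t_plus = tangent_and_derivatives(vp, wp + eps)[0]
    t_minus = tangent_and_derivatives(vp, wp - eps)[0]
    d_wp = tuple((a - b) / (2.0 * eps) for a, b in zip(t_plus, t_minus))
    return d_vp, d_wp

# Hypothetical slopes chosen for illustration
t, dt_dvp, dt_dwp = tangent_and_derivatives(0.3, -0.2)
num_vp, num_wp = central_difference(0.3, -0.2)
```

Both derivatives are orthogonal to t, as expected for the derivative of a unit vector.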

4.1.3 Transformations Between Global Cartesian and Local Frames

The global Cartesian frame (p x, p y, p z, x, y, z) is useful for vertex reconstruction purposes as well as for track reconstruction in zero and inhomogeneous magnetic fields. Zero magnetic field is typical in, for instance, test-beam applications.

First, the transformation between a local frame and the global Cartesian frame is considered. The basic strategy is to go via an intermediate Cartesian frame aligned with the local frame under consideration. This Cartesian frame is related to the global Cartesian frame via a pure rotation. The intermediate Cartesian frame C′ will be denoted by primed quantities (\(p_x^{\prime },\ p_y^{\prime },\ p_z^{\prime },\ x',\ y',\ z'\)), where the x′- and y′-axes are parallel to the v- and w-axes of the local frame, respectively. The Jacobian R of the transformation from the global Cartesian frame to the intermediate frame C′ consists of two identical three-by-three diagonal blocks, each containing the relevant rotation matrix. The other entries of this matrix are zero. The Jacobian \({\boldsymbol {J}}_{C' \rightarrow L}\) of the transformation from the primed Cartesian frame C′ to the local frame L is constructed from the following formulas:

$$\displaystyle \begin{aligned} \frac{q}{p} & = \displaystyle\frac{q}{\sqrt{{p_x^{\prime}}^2 + {p_y^{\prime}}^2 + {p_z^{\prime}}^2}}, \end{aligned} $$
(4.49)
$$\displaystyle \begin{aligned} v' & = \displaystyle\frac{p_x^{\prime}}{p_z^{\prime}}, \end{aligned} $$
(4.50)
$$\displaystyle \begin{aligned} w' & = \displaystyle\frac{p_y^{\prime}}{p_z^{\prime}}. \end{aligned} $$
(4.51)

The total Jacobian is given by the product \({\boldsymbol {J}}_{C' \rightarrow L} \cdot {\boldsymbol {R}}\).

For the inverse transformation, the Jacobian \({\boldsymbol {J}}_{L \rightarrow C'}\) can be derived from the following relations:

$$\displaystyle \begin{aligned} p_x^{\prime} & = \displaystyle\frac{q}{\psi} \cdot \frac{s_z \cdot v'} {\sqrt{1 + v^{\prime2} + w^{\prime2}}}, \end{aligned} $$
(4.52)
$$\displaystyle \begin{aligned} p_y^{\prime} & = \displaystyle\frac{q}{\psi} \cdot \frac{s_z \cdot w'} {\sqrt{1 + v^{\prime2} + w^{\prime2}}}, \end{aligned} $$
(4.53)
$$\displaystyle \begin{aligned} p_z^{\prime} & = \displaystyle\frac{q}{\psi} \cdot \frac{s_z}{\sqrt{1 + v^{\prime2} + w^{\prime2}}}, \end{aligned} $$
(4.54)

where \(s_z= \operatorname {\mathrm {sign}}(p_z)\) is the sign of the z-component of the momentum vector in the local frame. It is needed in order to uniquely specify the state of the track in the local frame. The total Jacobian is in this case given by the product \({\boldsymbol {R}}^T \cdot {\boldsymbol {J}}_{L \rightarrow C'}\), using the same matrix R as above. The Jacobians \({\boldsymbol {J}}_{C' \rightarrow L}\) and \({\boldsymbol {J}}_{L \rightarrow C'}\) are shown in Appendix A.
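The relations in Eqs. (4.49) to (4.54) can be checked by a round trip between the two representations. The sketch below is a minimal illustration under the assumption that the momentum is already expressed in the primed frame C′ and that its z′-component is non-zero; the function names are hypothetical.

```python
import math

def cartesian_to_local(px, py, pz, q=1):
    """Eqs. (4.49)-(4.51): primed Cartesian momentum -> (psi, v', w', s_z)."""
    p = math.sqrt(px * px + py * py + pz * pz)
    psi = q / p
    s_z = 1.0 if pz > 0 else -1.0  # needed to recover the direction uniquely
    return psi, px / pz, py / pz, s_z

def local_to_cartesian(psi, vp, wp, s_z, q=1):
    """Eqs. (4.52)-(4.54): (psi, v', w', s_z) -> primed Cartesian momentum."""
    p = q / psi  # momentum magnitude; positive because psi carries the sign of q
    norm = math.sqrt(1.0 + vp * vp + wp * wp)
    return (p * s_z * vp / norm, p * s_z * wp / norm, p * s_z / norm)

# Hypothetical backward-going negative track, momentum in GeV/c
momentum = (0.4, -0.3, -1.2)
state = cartesian_to_local(*momentum, q=-1)
recovered = local_to_cartesian(*state, q=-1)
```

Without the sign s_z, the two momenta (p x′, p y′, p z′) and (−p x′, −p y′, −p z′) would map to the same slopes v′ and w′.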

4.2 Inhomogeneous Magnetic Fields

As mentioned above, the common way of obtaining the Jacobian matrices needed for the linearized error propagation is to expand the track propagator functions to first order in a Taylor series. As there are no analytical expressions for the track propagator in the case of inhomogeneous magnetic fields, this approach is unfortunately impossible. Another common technique is the error propagation by numerical derivatives described earlier. This method is slow but robust and accurate. A third way of obtaining the Jacobian matrices is to differentiate the recursion formulae of the numerical integration method directly. This is the essence of the so-called Bugge-Myrheim method [3].

In the parameter propagation, the propagated global track parameters:

$$\displaystyle \begin{gathered} {\boldsymbol{r}}= \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \ \; {\boldsymbol{\dot{{\boldsymbol{r}}}}}= {\boldsymbol{t}} = \begin{pmatrix} t_x \\ t_y \\ t_z \end{pmatrix}, {} \end{gathered} $$
(4.55)

are obtained by integrating the equations of motion, using some recursion formulas. With the Runge–Kutta–Nyström method, one step (numbered by n) becomes:

$$\displaystyle \begin{aligned} {\boldsymbol{r}}_{n+1} &= {\boldsymbol{r}}_n+h\hspace{0.5pt}{\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+\frac{h^2}{6}({\boldsymbol{k}}_1+{\boldsymbol{k}}_2+{\boldsymbol{k}}_3) = {\boldsymbol{F}}_n({\boldsymbol{r}}_n, {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n), \end{aligned} $$
(4.56)
$$\displaystyle \begin{aligned} {\boldsymbol{\dot{{\boldsymbol{r}}}}}_{n+1} &= {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n+\frac{h}{6}({\boldsymbol{k}}_1+2{\boldsymbol{k}}_2+2{\boldsymbol{k}}_3+{\boldsymbol{k}}_4) = {\boldsymbol{G}}_n({\boldsymbol{r}}_n, {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n). {} \end{aligned} $$
(4.57)
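A single step of Eqs. (4.56) and (4.57) can be sketched as follows. The stage evaluations used for k 1 to k 4 are the standard fourth-order Nyström choice and are an assumption of this example (they are not restated in this section); the field callback, the helper names and the test configuration are likewise hypothetical.

```python
import math

K = 0.29979  # unit factor k of Eq. (4.1), in GeV/c per (T m)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gamma(r, t, psi, field):
    """Right-hand side of Eq. (4.2): k * psi * (t x B(r))."""
    return tuple(K * psi * c for c in cross(t, field(r)))

def combine(*terms):
    """Linear combination of 3-vectors given as (scalar, vector) pairs."""
    return tuple(sum(s * v[i] for s, v in terms) for i in range(3))

def rkn_step(r, t, h, psi, field):
    """One Runge-Kutta-Nystrom step; returns (r_{n+1}, t_{n+1})."""
    k1 = gamma(r, t, psi, field)
    r_mid = combine((1.0, r), (h / 2, t), (h * h / 8, k1))
    k2 = gamma(r_mid, combine((1.0, t), (h / 2, k1)), psi, field)
    k3 = gamma(r_mid, combine((1.0, t), (h / 2, k2)), psi, field)
    r_end = combine((1.0, r), (h, t), (h * h / 2, k3))
    k4 = gamma(r_end, combine((1.0, t), (h, k3)), psi, field)
    r_new = combine((1.0, r), (h, t), (h * h / 6, k1), (h * h / 6, k2), (h * h / 6, k3))
    t_new = combine((1.0, t), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4))
    return r_new, t_new

# Cross-check in a homogeneous 2 T field along z, where the exact solution is a circle
def uniform_field(r):
    return (0.0, 0.0, 2.0)

omega = K * 1.0 * 2.0  # curvature in 1/m for psi = 1 c/GeV
r, t = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
for _ in range(10):
    r, t = rkn_step(r, t, 0.1, 1.0, uniform_field)
```

After ten steps of 0.1 m, the position and direction can be compared with the circular arc of path length 1 m.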

To obtain the Jacobian matrix of the propagated global track parameters with respect to the initial track parameters q i, the recursion formulae (Eqs. (4.56) and (4.57)) have to be differentiated with respect to q i, giving:

$$\displaystyle \begin{gathered} {{\boldsymbol{J}}}_{n+1} = \begin{pmatrix} \displaystyle\frac{\partial {\boldsymbol{r}}_{n+1}}{\partial {\boldsymbol{q}}_i}\\ {}\displaystyle\frac{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_{n+1}}{\partial {\boldsymbol{q}}_i} \end{pmatrix} = \begin{pmatrix} \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{q}}_i}\\ {}\displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{q}}_i} \end{pmatrix} = \begin{pmatrix} \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{r}}_n} & \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}\\ {}\displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{r}}_n} & \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} \end{pmatrix} \cdot \begin{pmatrix} \displaystyle\frac{\partial {\boldsymbol{r}}_n}{\partial {\boldsymbol{q}}_i}\\ {}\displaystyle\frac{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}{\partial {\boldsymbol{q}}_i} \end{pmatrix} = \boldsymbol{D}_n \cdot \boldsymbol{J}_n, {} \end{gathered} $$
(4.58)

where the derivatives \(\partial {\boldsymbol {r}}_n / \partial {\boldsymbol {q}}_i\) and \(\partial {\boldsymbol {\dot {{\boldsymbol {r}}}}}_n / \partial {\boldsymbol {q}}_i\) of the 6 × j Jacobian J n are given by the following 3 × j matrices:

$$\displaystyle \begin{gathered} \displaystyle\frac{\partial {\boldsymbol{r}}_n}{\partial {\boldsymbol{q}}_i} = \begin{pmatrix} \displaystyle\frac{\partial x_n}{\partial q_{i,1}} & \cdots & \displaystyle\frac{\partial x_n}{\partial q_{i,j}} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial z_n}{\partial q_{i,1}} & \cdots & \displaystyle\frac{\partial z_n}{\partial q_{i,j}} \end{pmatrix}, \ \; \displaystyle\frac{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}{\partial {\boldsymbol{q}}_i} = \begin{pmatrix} \displaystyle\frac{\partial t_{x,n}}{\partial q_{i,1}} & \cdots & \displaystyle\frac{\partial t_{x,n}}{\partial q_{i,j}} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial t_{z,n}}{\partial q_{i,1}} & \cdots & \displaystyle\frac{\partial t_{z,n}}{\partial q_{i,j}} \end{pmatrix}. {} \end{gathered} $$
(4.59)

D n is a 6 × 6 matrix containing the recursion formulae F n and G n differentiated with respect to the global track parameters:

$$\displaystyle \begin{gathered} \boldsymbol{D}_n = \displaystyle\frac{\partial ({\boldsymbol{F}}_n, {\boldsymbol{G}}_n)}{\partial ({\boldsymbol{r}}_n, {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n)} = \begin{pmatrix} \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{r}}_n} & \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}\\ {}\displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{r}}_n} & \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} \end{pmatrix}, {} \end{gathered} $$
(4.60)

giving the 3 × 3 matrices

$$\displaystyle \begin{gathered} \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{r}}_n} = \begin{pmatrix} \displaystyle\frac{\partial F_{n,1}}{\partial x_n} & \cdots & \displaystyle\frac{\partial F_{n,1}}{\partial z_n} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial F_{n,3}}{\partial x_n} & \cdots & \displaystyle\frac{\partial F_{n,3}}{\partial z_n} \end{pmatrix}, \ \; \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} = \begin{pmatrix} \displaystyle\frac{\partial F_{n,1}}{\partial t_{x,n}} & \cdots & \displaystyle\frac{\partial F_{n,1}}{\partial t_{z,n}} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial F_{n,3}}{\partial t_{x,n}} & \cdots & \displaystyle\frac{\partial F_{n,3}}{\partial t_{z,n}} \end{pmatrix}, {} \end{gathered} $$
(4.61)

and

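$$\displaystyle \begin{gathered} \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{r}}_n} = \begin{pmatrix} \displaystyle\frac{\partial G_{n,1}}{\partial x_n} & \cdots & \displaystyle\frac{\partial G_{n,1}}{\partial z_n} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial G_{n,3}}{\partial x_n} & \cdots & \displaystyle\frac{\partial G_{n,3}}{\partial z_n} \end{pmatrix}, \ \; \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} = \begin{pmatrix} \displaystyle\frac{\partial G_{n,1}}{\partial t_{x,n}} & \cdots & \displaystyle\frac{\partial G_{n,1}}{\partial t_{z,n}} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial G_{n,3}}{\partial t_{x,n}} & \cdots & \displaystyle\frac{\partial G_{n,3}}{\partial t_{z,n}} \end{pmatrix}. {} \end{gathered} $$
(4.62)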

By writing the recursion formulae of the derivatives as a product of D n and J n (Eq. (4.58)), the recursion formulae F n and G n can be differentiated with respect to the global track parameters r n and \({\boldsymbol {\dot {{\boldsymbol {r}}}}}_n\) instead of the initial track parameters q i. This greatly simplifies the differentiation, giving:

$$\displaystyle \begin{gathered} \begin{aligned} \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{r}}_n} & = 1 + \frac{h^2}{6}\left(\displaystyle\frac{\partial {\boldsymbol{k}}_1}{\partial {\boldsymbol{r}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_2}{\partial {\boldsymbol{r}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_3}{\partial {\boldsymbol{r}}_n}\right),\\ \displaystyle\frac{\partial {\boldsymbol{F}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} & = h + \frac{h^2}{6}\left(\displaystyle\frac{\partial {\boldsymbol{k}}_1}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_2}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_3}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}\right), \\ \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{r}}_n} & = \frac{h}{6}\left(\displaystyle\frac{\partial {\boldsymbol{k}}_1}{\partial {\boldsymbol{r}}_n}+2\displaystyle\frac{\partial {\boldsymbol{k}}_2}{\partial {\boldsymbol{r}}_n}+2\displaystyle\frac{\partial {\boldsymbol{k}}_3}{\partial {\boldsymbol{r}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_{4}}{\partial {\boldsymbol{r}}_n}\right), \\ \displaystyle\frac{\partial {\boldsymbol{G}}_n}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} & = 1 + \frac{h}{6}\left(\displaystyle\frac{\partial {\boldsymbol{k}}_1}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}+2\displaystyle\frac{\partial {\boldsymbol{k}}_2}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}+2\displaystyle\frac{\partial {\boldsymbol{k}}_3}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}+\displaystyle\frac{\partial {\boldsymbol{k}}_{4}}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}\right). {} \end{aligned} \end{gathered} $$
(4.63)

In order to calculate these derivatives explicitly, the individual stages of the Runge–Kutta–Nyström method have to be differentiated with respect to the global track parameters:

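$$\displaystyle \begin{gathered} \begin{aligned} \displaystyle\frac{\partial {\boldsymbol{k}}_l}{\partial {\boldsymbol{r}}_n} & = \displaystyle\frac{\partial {\boldsymbol{\varGamma}} ({\boldsymbol{r}}_l, {\boldsymbol{\dot{{\boldsymbol{r}}}}}_l)}{\partial {\boldsymbol{r}}_n} = \boldsymbol{A}_l \cdot \displaystyle\frac{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_l}{\partial {\boldsymbol{r}}_n} + \boldsymbol{C}_l \cdot \displaystyle\frac{\partial {\boldsymbol{r}}_l}{\partial {\boldsymbol{r}}_n}, \\ \displaystyle\frac{\partial {\boldsymbol{k}}_l}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} & = \displaystyle\frac{\partial {\boldsymbol{\varGamma}} ({\boldsymbol{r}}_l, {\boldsymbol{\dot{{\boldsymbol{r}}}}}_l)}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} = \boldsymbol{A}_l \cdot \displaystyle\frac{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_l}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n} + \boldsymbol{C}_l \cdot \displaystyle\frac{\partial {\boldsymbol{r}}_l}{\partial {\boldsymbol{\dot{{\boldsymbol{r}}}}}_n}, \end{aligned} {} \end{gathered} $$
(4.64)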

where l denotes the individual stages, and k l is given by the equations of motion of the global track parameters in Eq. (4.2):

$$\displaystyle \begin{gathered} \begin{aligned} \frac{\mathrm{d}^2 x}{\mathrm{d} s^2} &= x'' = \xi\hspace{0.5pt} (t_{y} B_{z} - t_{z} B_{y}), \\ \frac{\mathrm{d}^2 y}{\mathrm{d} s^2} &= y'' = \xi\hspace{0.5pt} (t_{z} B_{x} - t_{x} B_{z}), \\ \frac{\mathrm{d}^2 z}{\mathrm{d} s^2} &= z'' = \xi\hspace{0.5pt} (t_{x} B_{y} - t_{y} B_{x}), \end{aligned} {} \end{gathered} $$
(4.65)

where ξ ≡ k ψ. Writing the 3 × 3 matrices A l and C l in a general form yields:

$$\displaystyle \begin{gathered} \boldsymbol{A} = \begin{pmatrix} \displaystyle\frac{\partial x''}{\partial t_x} & \cdots & \displaystyle\frac{\partial x''}{\partial t_z} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial z''}{\partial t_x} & \cdots & \displaystyle\frac{\partial z''}{\partial t_z} \end{pmatrix} = \begin{pmatrix} 0 & \xi\hspace{0.5pt} B_{z} & -\xi\hspace{0.5pt} B_{y} \\ -\xi\hspace{0.5pt} B_{z} & 0 & \xi\hspace{0.5pt} B_{x} \\ \xi\hspace{0.5pt} B_{y} & -\xi\hspace{0.5pt} B_{x} & 0 \end{pmatrix} {} \end{gathered} $$
(4.66)

and

$$\displaystyle \begin{gathered} \begin{array}{l} {{\boldsymbol{C}}} {=}\! \begin{pmatrix} \displaystyle\frac{\partial x''}{\partial x} & \cdots & \displaystyle\frac{\partial x''}{\partial z} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial z''}{\partial x} & \cdots & \displaystyle\frac{\partial z''}{\partial z} \end{pmatrix}\! {=}\, \xi\! \begin{pmatrix} (t_{y}B_{z;x} - t_{z}B_{y;x}) & (t_{y}B_{z;y} - t_{z}B_{y;y}) & (t_{y}B_{z;z} - t_{z}B_{y;z}) \\ (t_{z}B_{x;x} - t_{x}B_{z;x}) & (t_{z}B_{x;y} - t_{x}B_{z;y}) & (t_{z}B_{x;z} - t_{x}B_{z;z}) \\ (t_{x}B_{y;x} - t_{y}B_{x;x}) & (t_{x}B_{y;y} - t_{y}B_{x;y}) & (t_{x}B_{y;z} - t_{y}B_{x;z}) \end{pmatrix}\!, \end{array} {} \end{gathered} $$
(4.67)

with

$$\displaystyle \begin{gathered} B_{u;v}=\displaystyle\frac{\partial B_{u}}{\partial v},\ \; u,v\in\{x,y,z\}. \end{gathered} $$
(4.68)
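The matrices A and C of Eqs. (4.66) and (4.67) are straightforward to code and to validate against numerical derivatives of Eq. (4.65). The following sketch does this for a toy quadrupole-like field; the field, its gradient strength and all function names are assumptions made purely for illustration.

```python
def a_matrix(xi, B):
    """Eq. (4.66): derivatives of (x'', y'', z'') with respect to (t_x, t_y, t_z)."""
    bx, by, bz = B
    return [[0.0, xi * bz, -xi * by],
            [-xi * bz, 0.0, xi * bx],
            [xi * by, -xi * bx, 0.0]]

def c_matrix(xi, t, dB):
    """Eq. (4.67); dB[u][v] holds B_{u;v} = dB_u/dv with u, v in (x, y, z)."""
    tx, ty, tz = t
    return [[xi * (ty * dB[2][v] - tz * dB[1][v]) for v in range(3)],
            [xi * (tz * dB[0][v] - tx * dB[2][v]) for v in range(3)],
            [xi * (tx * dB[1][v] - ty * dB[0][v]) for v in range(3)]]

def acceleration(xi, r, t, field):
    """Eq. (4.65): second derivative of the position with respect to s."""
    bx, by, bz = field(r)
    tx, ty, tz = t
    return (xi * (ty * bz - tz * by),
            xi * (tz * bx - tx * bz),
            xi * (tx * by - ty * bx))

G = 0.5  # hypothetical field gradient in T/m

def toy_field(r):
    """Toy field: 1 T along z plus a transverse quadrupole-like component."""
    x, y, z = r
    return (G * y, G * x, 1.0)

# Analytic gradient of the toy field, indexed as TOY_GRADIENT[u][v] = dB_u/dv
TOY_GRADIENT = [[0.0, G, 0.0], [G, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

In a realistic application the gradient would come from the field map, which is the costly part of the calculation.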

With the help of these matrices, the elements of the matrix D n in Eq. (4.60) are computed. D n is then multiplied by J n to produce the transported Jacobian J n+1; see Eq. (4.58). This procedure is repeated for every recursion step, transforming J along the way. If, for instance, a planar detector element is reached at the end of the propagation steps and a measurement is to be included in a track reconstruction algorithm, the Jacobian can be transported from the global parameters to a local set of parameters as described in Sect. 4.4.1.3.

When applied to real problems, the field gradients in C are usually quite costly to calculate. The calculations can be significantly sped up by setting elements of C with negligible influence on the Jacobians to zero. The sensitivity of the Jacobians to the field gradients can be checked by a simulation study.

5 Material Effects

5.1 Multiple Scattering

5.1.1 The Distribution of the Scattering Angle

Elastic Coulomb scattering of particles heavier than electrons, including the muon, is dominated by the atomic nucleus. The PDF of the single scattering angle θ has the following form [9]:

$$\displaystyle \begin{gathered}{} f(\theta)= \begin{cases}\displaystyle \frac{k\theta}{(\theta^{\,2}+a^2)^2} & \mathrm{if}\ \theta\leq b,\\ \qquad0 & \mathrm{otherwise,} \end{cases} \end{gathered} $$
(4.69)

with the normalization constant

$$\displaystyle \begin{gathered} k=2\hspace{0.5pt}{a}^2\left(1+{a}^2/{b}^2\right). \end{gathered} $$
(4.70)

The PDF in Eq. (4.69) is derived from the Rutherford scattering formula by setting \(\sin \theta \approx \theta \) and introducing a minimal and a maximal scattering angle, thereby avoiding the singularity at θ = 0. The minimal and maximal angles a, b depend on the nuclear charge Z and the atomic mass number A of the nucleus as well as on the momentum p of the scattered particle [9]:

$$\displaystyle \begin{gathered} a=\frac{2.66\cdot10^{-6}\cdot Z^{1/3}}{p},\ \; b=\frac{0.14}{A^{1/3}\cdot p}.{} \end{gathered} $$
(4.71)

Here and in the following, p is assumed to be given in units of GeV∕c. The ratio

$$\displaystyle \begin{gathered} \rho=b/a\approx{\left(\frac{204}{Z^{1/3}}\right)}^2{} \end{gathered} $$
(4.72)

depends only on the nuclear charge Z, with the exception of very heavy nuclei, where the approximation A ≈ 2 Z breaks down. The size of ρ is typically of the order of \(10^4\). The normalization constant k can therefore be approximated by \(k \approx 2a^2\). It follows that the expectations of θ and \(\theta^2\) are approximately given by:

$$\displaystyle \begin{gathered} {\mathsf{E}[\theta]}=\frac{4.18\cdot10^{-6}\cdot Z^{\hspace{0.5pt}1/3}}{p}, \ \; {\mathsf{E}[\theta^{\hspace{0.5pt}2}]}=\frac{2.84\cdot10^{-11}\cdot Z^{\hspace{0.5pt}2/3}\cdot\ln(159\,Z^{-1/3})}{p^2}. \end{gathered} $$
(4.73)
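The quality of the approximations in Eq. (4.73) can be checked against the closed-form moments of the PDF in Eq. (4.69). The sketch below does this for silicon; the closed-form integrals are elementary results worked out for this example, and the function names as well as the choice A = 28 are assumptions.

```python
import math

def scattering_limits(Z, A, p):
    """Minimal and maximal scattering angles a, b of Eq. (4.71); p in GeV/c."""
    a = 2.66e-6 * Z ** (1.0 / 3.0) / p
    b = 0.14 / (A ** (1.0 / 3.0) * p)
    return a, b

def normalization(a, b):
    """Normalization constant k of Eq. (4.70)."""
    return 2.0 * a * a * (1.0 + a * a / (b * b))

def exact_moments(a, b):
    """E[theta] and E[theta^2] from integrating theta * f and theta^2 * f over [0, b]."""
    k = normalization(a, b)
    mean = k * (math.atan(b / a) / (2.0 * a) - b / (2.0 * (a * a + b * b)))
    mean_sq = 0.5 * k * (math.log1p(b * b / (a * a)) + a * a / (a * a + b * b) - 1.0)
    return mean, mean_sq

def approximate_moments(Z, p):
    """Approximate expectations of Eq. (4.73)."""
    mean = 4.18e-6 * Z ** (1.0 / 3.0) / p
    mean_sq = 2.84e-11 * Z ** (2.0 / 3.0) * math.log(159.0 * Z ** (-1.0 / 3.0)) / p ** 2
    return mean, mean_sq

# Silicon at p = 1 GeV/c
a, b = scattering_limits(Z=14, A=28, p=1.0)
exact = exact_moments(a, b)
approx = approximate_moments(Z=14, p=1.0)
```

For silicon the two sets of moments should agree at the percent level or better, since A = 2Z holds almost exactly.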

In track reconstruction, the scattering angle θ is of little use, as it is more convenient to work with two projected scattering angles instead, for instance, \(\theta _x=\theta \,\cos \phi \) and \(\theta _y=\theta \,\sin \phi \), if the particle runs parallel to the z-axis. Under the assumption that the azimuthal angle ϕ is independent of θ and uniform in the interval [0, 2π], their joint PDF is given by [10]:

$$\displaystyle \begin{gathered} g(\theta_x,\theta_y)= \begin{cases}\displaystyle \frac 1{\pi}\cdot\frac{a^2}{(\theta_x^{\,2}+\theta_y^{\,2}+a^2)^2},&\ \;\mathrm{if}\ \; 0\leq \theta_x^{\,2}+\theta_y^{\,2} \leq b^2,\\ \,0, &\ \;\mathrm{otherwise}. \end{cases} \end{gathered} $$
(4.74)

The support of the joint PDF is a circle around the origin with radius b. The projected angles θ x and θ y are uncorrelated but not independent. The marginal PDF of the projected angle in the interval [−b, b] has the following form:

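$$\displaystyle \begin{gathered} f(\theta_{\hspace{0.5pt}\mathrm{p}})=\frac{a^2}{\pi}\left[\frac{\sqrt{b^2-\theta_{\hspace{0.5pt}\mathrm{p}}^{\,2}}}{(\theta_{\hspace{0.5pt}\mathrm{p}}^{\,2}+a^2)\,(a^2+b^2)}+\frac{1}{(\theta_{\hspace{0.5pt}\mathrm{p}}^{\,2}+a^2)^{3/2}}\,\arctan\sqrt{\frac{b^2-\theta_{\hspace{0.5pt}\mathrm{p}}^{\,2}}{\theta_{\hspace{0.5pt}\mathrm{p}}^{\,2}+a^2}}\,\right], {} \end{gathered} $$
(4.75)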

with θ p either θ x or θ y. The marginal PDF has zero mean, a single mode at θ p = 0 with f(0) ≈ 1∕(2a), and the following variance:

$$\displaystyle \begin{gathered} {\mathsf{var}\,[\theta_{\hspace{0.5pt}\mathrm{p}}]}={\mathsf{E}[\theta^{\,2}]}/2=\frac{1.42\cdot10^{-11}\cdot Z^{2/3}\cdot\ln(159\,Z^{-1/3})}{p^2}. \end{gathered} $$
(4.76)

The range of the distribution is very large. For silicon (Z = 14), it is about ± 2500 standard deviations.
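The numbers quoted for silicon can be reproduced with a few lines of code. In the sketch below, the closed form of the marginal PDF is obtained by integrating the joint PDF of Eq. (4.74) over one projected angle; this integration was carried out for the example and should be treated as an assumption, as should the function names and the choice A = 28.

```python
import math

def marginal_pdf(theta_p, a, b):
    """Marginal PDF of one projected angle, from integrating Eq. (4.74) over the other."""
    if abs(theta_p) > b:
        return 0.0
    s2 = theta_p * theta_p + a * a
    c = math.sqrt(b * b - theta_p * theta_p)  # integration limit of the other angle
    s = math.sqrt(s2)
    return a * a / math.pi * (c / (s2 * (a * a + b * b)) + math.atan2(c, s) / (s2 * s))

def projected_sigma(Z, p):
    """Standard deviation of the projected angle, Eq. (4.76); p in GeV/c."""
    return math.sqrt(1.42e-11 * Z ** (2.0 / 3.0) * math.log(159.0 * Z ** (-1.0 / 3.0))) / p

# Silicon at p = 1 GeV/c, with a and b from Eq. (4.71)
Z, A, p = 14, 28, 1.0
a = 2.66e-6 * Z ** (1.0 / 3.0) / p
b = 0.14 / (A ** (1.0 / 3.0) * p)
sigma = projected_sigma(Z, p)
range_in_sigmas = b / sigma
mode_height_times_2a = marginal_pdf(0.0, a, b) * 2.0 * a
```

The ratio b∕σ comes out close to 2500, and the height of the mode is very nearly 1∕(2a), in line with the statements above.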

If the projected scattering angle is still small after N scattering processes, it is approximately equal to the sum of the individual projected scattering angles. Its distribution can be obtained either by the convolution of N single scattering distributions [10] or by the Molière theory [11,12,13]. Figure 4.7 shows the two PDFs for a muon with p = 1 GeV∕c and four silicon scatterers. Their thickness d is given in fractions of a radiation length [14]. The corresponding numbers of scattering processes in the four scatterers have been chosen as \(N = 2^{10},\ 2^{13},\ 2^{16}\) and \(2^{19}\), respectively, spanning the range from about 0.2% to about 90% of a radiation length. There is excellent agreement for all scatterers but the thickest one; in this case, the tails are more persistent in the Molière PDF than in the PDF obtained by convolution. The discrepancy is at angles larger than the maximum scattering angle b, which is equal to about 46 mrad in silicon for p = 1 GeV∕c.

Fig. 4.7

Probability density functions of the projected multiple scattering angle for silicon targets obtained by convolution (solid lines) and frequency distributions obtained by simulation from the Molière densities (dots). The vertical dashed lines in figure (d) show the upper limit b ≈ 46 mrad in silicon

For the purpose of track reconstruction, the “exact” distribution of the projected scattering angle has to be approximated by a single Gaussian or a mixture of Gaussians; see Sect. 6.2.3. The most commonly used approximation by a single Gaussian is the Highland formula, proposed in [15] and modified in [16]. The most up-to-date version can be found in [14], which gives the standard deviation of the projected scattering angle as:

$$\displaystyle \begin{gathered} \sigma(\theta_{\hspace{0.5pt}\mathrm{p}})=\frac{0.0136}{\beta\hspace{0.5pt} p}\, z\, \sqrt{\frac{d}{X_0}}\, \left[1+0.038\ln\left(\frac{d\hspace{0.5pt} z^2}{X_0\hspace{0.5pt}\beta^2}\right)\right],{} \end{gathered} $$
(4.77)

where p, β and z are the momentum (in GeV∕c), velocity (in units of c), and charge number of the incident particle, and d∕X 0 is the thickness of the scatterer in units of the radiation length.
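A minimal sketch of the Highland formula in its current form [14] is given below, with p in GeV∕c so that the constant 13.6 MeV becomes 0.0136. The function name and the default arguments are assumptions made for this example.

```python
import math

def highland_sigma(p, d_over_x0, beta=1.0, z=1):
    """Standard deviation of the projected scattering angle in rad.

    p is the momentum in GeV/c, d_over_x0 the thickness in radiation lengths,
    beta the velocity in units of c and z the charge number.
    """
    if d_over_x0 <= 0.0:
        raise ValueError("thickness must be positive")
    log_term = math.log(d_over_x0 * z * z / (beta * beta))
    return 0.0136 / (beta * p) * abs(z) * math.sqrt(d_over_x0) * (1.0 + 0.038 * log_term)

# Hypothetical example: a 1 GeV/c muon (beta close to 1) crossing 1% of a radiation length
sigma_thin = highland_sigma(p=1.0, d_over_x0=0.01)
```

For one radiation length, β = 1 and z = 1 the logarithm vanishes and the formula reduces to 0.0136∕p.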

If the scatterer consists of a composite material with k components, its radiation length X 0 is given by:

$$\displaystyle \begin{gathered} \frac 1{X_0}=\sum_{i=1}^k \frac{f_i}{X_i}, \end{gathered} $$
(4.78)

where X i is the radiation length of component i in g∕cm² and f i is the mass fraction of component i [17]. In the following, it is more convenient to express d and X 0 in centimeters:

$$\displaystyle \begin{gathered} \frac{X_0}{\mathrm{cm}}=\frac{\mathrm{g/cm}^3}{\varrho}\frac{X_0}{\mathrm{g/cm}^2},{} \end{gathered} $$
(4.79)

where ϱ is the density of the material in units of g∕cm³.
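Equations (4.78) and (4.79) can be applied as in the following sketch, which uses water as an example. The radiation lengths of hydrogen and oxygen and the mass fractions are values as commonly tabulated and are included for illustration only; the function names are assumptions.

```python
def composite_radiation_length(components):
    """Eq. (4.78): components is a list of (mass fraction, X_i in g/cm^2) pairs."""
    total = sum(fraction for fraction, _ in components)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mass fractions must add up to 1")
    return 1.0 / sum(fraction / x_i for fraction, x_i in components)

def radiation_length_in_cm(x0_mass, density):
    """Eq. (4.79): convert X_0 from g/cm^2 to cm, with the density in g/cm^3."""
    return x0_mass / density

# Water: illustrative mass fractions and elemental radiation lengths (assumed values)
water = [(0.1119, 63.04),  # hydrogen
         (0.8881, 34.24)]  # oxygen
x0_water = composite_radiation_length(water)
x0_water_cm = radiation_length_in_cm(x0_water, density=1.0)
```

The result is dominated by the oxygen content, because the inverse radiation lengths are weighted by mass fraction.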

In a thin scatterer, a Gaussian distribution is but a poor approximation of the actual distribution of the multiple scattering angle, because of the latter’s large range. A better, though still far from perfect, approximation can be obtained using a normal mixture with two components, one modeling the “core”, the other modeling the “tails” [10, 18]. The standardized two-component mixture PDF of θ p with variance equal to 1 has the following form:

$$\displaystyle \begin{gathered} f(\theta_{\hspace{0.5pt}\mathrm{p}})=(1-\varepsilon)\cdot\varphi\hspace{0.5pt}(\theta_{\hspace{0.5pt}\mathrm{p}};0,\sigma_1^2)+\varepsilon\cdot\varphi\hspace{0.5pt}(\theta_{\hspace{0.5pt}\mathrm{p}};0,\sigma_2^2), {} \end{gathered} $$
(4.80)

where \(\sigma _1^2<\sigma _2^2,\ \varepsilon <1/2\) and \(\sigma _2^2=[1-(1-\varepsilon )\,\sigma _1^2]/\varepsilon \).

The core variance \(\sigma _1^2\) is parametrized in terms of the reduced thickness \(d^{\prime }_0=d/(\beta ^2 X_0)\), where X 0 is the radiation length of the scatterer:

$$\displaystyle \begin{gathered} \sigma_1^2=8.471\cdot 10^{-1}+3.347\cdot 10^{-2}\cdot\ln d^{\prime}_0-1.843\cdot 10^{-3}\cdot(\ln d^{\prime}_0)^2 \end{gathered} $$
(4.81)
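The constraint that fixes the tail variance, together with the core parametrization of Eq. (4.81), can be illustrated as follows. In this sketch the tail weight ε is simply passed in as an argument, and the numerical inputs and function names are hypothetical.

```python
import math

def core_variance(d, x0, beta=1.0):
    """Core variance of Eq. (4.81) as a function of the reduced thickness d/(beta^2 X_0)."""
    ln_d = math.log(d / (beta * beta * x0))
    return 8.471e-1 + 3.347e-2 * ln_d - 1.843e-3 * ln_d ** 2

def tail_variance(eps, core_var):
    """Tail variance that makes the total variance of the mixture equal to 1."""
    return (1.0 - (1.0 - eps) * core_var) / eps

def normal_pdf(x, var):
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(theta_p, eps, core_var):
    """Standardized two-component mixture of Eq. (4.80)."""
    tail_var = tail_variance(eps, core_var)
    return (1.0 - eps) * normal_pdf(theta_p, core_var) + eps * normal_pdf(theta_p, tail_var)

# Hypothetical inputs: 1% of a radiation length, beta = 1, tail weight 5%
eps = 0.05
s1 = core_variance(d=0.01, x0=1.0)
s2 = tail_variance(eps, s1)
total_variance = (1.0 - eps) * s1 + eps * s2
```

With these inputs the core is narrower than a unit Gaussian and the tail component is several times wider, while the total variance stays at 1 by construction.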

The tail weight ε is parametrized in terms of a modified reduced thickness \(d^{\prime \prime }_0=Z^{2/3}d/(\beta ^2 X_0)\):

(4.82)

Finally, the standardized PDF in Eq. (4.80) is scaled with the total standard deviation of the projected scattering angle without the logarithmic correction; see Eq. (4.77). This ensures that the variance of the scattering angle is strictly additive if the scatterer is divided into thinner slices. For a comparison of the mixture model with simulations by GEANT4 [19], see [18].

5.1.2 Multiple Scattering in Track Propagation

If a charged particle is propagated through material, the covariance matrix of the track parameters is augmented by the additional uncertainty on direction and possibly position caused by multiple scattering. The algorithmic treatment is different for “thin” and “thick” scatterers. By definition, in a thin scatterer the offset (the change in position of the passing particle) is negligible in relation to the spatial resolution of the surrounding detectors, so that only the direction is affected. For instance, in a silicon sensor with a typical thickness of 0.3 mm the standard deviation of the offset is less than 0.2 µm for momenta above 1 GeV∕c; see Eq. (4.107) below. As the spatial resolution is of the order of 10 µm, the offset can be safely neglected, and the sensor is considered a thin scatterer.

If, on the other hand, the scatterer is a 5 cm thick iron absorber in a muon spectrometer, the standard deviation of the offset is about 0.66 mm for a muon with a momentum of 1 GeV∕c, and the offset can be as large as 2 mm. This is no longer negligible in a muon spectrometer equipped with drift chambers having a typical spatial resolution of 150 µm. The absorber should therefore be treated as a thick scatterer, at least for low-momentum muons.
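The two offsets quoted above can be estimated from the position variance σ 0² L³∕3 of Eq. (4.107), where σ 0² is the variance of the projected scattering angle per unit length. The sketch below assumes radiation lengths of 9.37 cm for silicon and 1.757 cm for iron (commonly tabulated values), normal incidence, β = 1 and no energy loss; the function name is hypothetical.

```python
import math

def offset_sigma(length, x0, p, log_correction=True):
    """Standard deviation of the transverse offset, in the unit of length and x0.

    Uses var(offset) = sigma_0^2 * L^3 / 3 with sigma_0^2 = var(theta_p) / L,
    where var(theta_p) follows Eq. (4.86) evaluated for the full thickness.
    """
    t = length / x0
    var_theta = 1.85e-4 * t / (p * p)
    if log_correction:
        var_theta *= (1.0 + 0.038 * math.log(t)) ** 2
    return math.sqrt(var_theta * length * length / 3.0)

# 0.3 mm of silicon and 5 cm of iron, p = 1 GeV/c, lengths in cm
silicon_offset_um = offset_sigma(0.03, 9.37, 1.0) * 1.0e4
iron_offset_mm = offset_sigma(5.0, 1.757, 1.0, log_correction=False) * 10.0
```

The value of about 0.66 mm for iron is obtained without the logarithmic correction; including it raises the estimate by a few percent.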

Thin Scatterers

Consider a thin scatterer with nominal thickness d, radiation length X 0, and unit normal vector n at the point where the track crosses the scatterer. The track is specified by a vector q of track parameters and the associated covariance matrix C. In a thin scatterer, only the sub-matrix of C corresponding to the track direction is augmented. The details depend on the choice of the track parametrization; see Sect. 4.2.

  A. The track is parametrized as in Eq. (4.3):

    $$\displaystyle \begin{gathered} q_1 = \frac{q}{{p_{\mathrm{T}}}} ,\ q_2 = \phi ,\ q_3 = \tan \lambda ,\ q_4 = R \varPhi ,\ q_5 = z. \end{gathered} $$
    (4.83)
    1. Compute the unit direction vector a and the momentum p of the track:

      $$\displaystyle \begin{gathered} {\boldsymbol{a}}=\left(\cos\phi\cos\lambda,\sin\phi\cos\lambda,\sin\lambda\right){{}^{\mathsf{T}}},\ \;p=1/(|\,q_1|\cos\lambda). \end{gathered} $$
      (4.84)
    2. Compute the effective thickness t in units of \(X_0\):

      $$\displaystyle \begin{gathered} t=\frac{d\hspace{0.5pt}|\hspace{0.5pt}{\boldsymbol{a}}\hspace{0.5pt}|}{|\hspace{0.5pt}{\boldsymbol{a}}\cdot{\boldsymbol{n}}\hspace{0.5pt}|\,X_0}.{} \end{gathered} $$
      (4.85)
    3. Compute the variance of the projected multiple scattering angle, assuming β = 1 and q = 1 if unknown:

      $$\displaystyle \begin{gathered} {\mathsf{var}\,[\theta_{\hspace{0.5pt}\mathrm{p}}]}=\frac{1.85\cdot10^{-4}\cdot t}{p^2}\, \left(1+0.038\ln t\right)^2.{} \end{gathered} $$
      (4.86)
    4a. If \(\sin \phi =0\), compute the covariance matrix D of \((a_2, a_3)^{\mathsf{T}}\):

      $$\displaystyle \begin{aligned} {\boldsymbol{D}}&={\boldsymbol{J}}\cdot{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})\cdot{\boldsymbol{J}}{{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ {\boldsymbol{J}}&= \begingroup \begin{pmatrix} {\cos\phi\cos\lambda} &{\ } {0}\\ 0 &{\ } \cos^3\!\lambda \end{pmatrix}. \endgroup \end{aligned} $$
      (4.87)

      Augment D by the contribution of multiple scattering [1]:

      $$\displaystyle \begin{gathered} {\boldsymbol{D}}'={\boldsymbol{D}}+{\mathsf{var}\,[\theta_{\hspace{0.5pt}\mathrm{p}}]}\cdot{} \begingroup \begin{pmatrix} 1-a_2{}^2 &{\ } -a_2a_3\\ -a_2a_3 &{\ } 1-a_3^2 \end{pmatrix}. \endgroup \end{gathered} $$
      (4.88)

      Modify C:

      $$\displaystyle \begin{aligned} &{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})={\boldsymbol{J}}{{}^{-1}}\cdot{\boldsymbol{D}}'\cdot({\boldsymbol{J}}{{}^{-1}}){{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ &{\boldsymbol{J}}{{}^{-1}}= \begingroup \begin{pmatrix} {1}/{\cos\phi\cos\lambda} &{\ } {0}\\ 0 &{\ } {1}/{\cos^3\!\lambda} \end{pmatrix}. \endgroup \end{aligned} $$
      (4.89)
    4b. If \(\sin \phi \neq 0\), compute the covariance matrix D of \((a_1, a_3)^{\mathsf{T}}\):

      $$\displaystyle \begin{aligned} {\boldsymbol{D}}&={\boldsymbol{J}}\cdot{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})\cdot{\boldsymbol{J}}{{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ {\boldsymbol{J}}&= \begingroup \begin{pmatrix} {-\sin\phi\cos\lambda} &{\ } -{\cos\phi\sin\lambda\cos^2\!\lambda}\\ 0 &{\ } \cos^3\!\lambda \end{pmatrix}. \endgroup \end{aligned} $$
      (4.90)

      Augment D by the contribution of multiple scattering [1]:

      $$\displaystyle \begin{gathered} {\boldsymbol{D}}'={\boldsymbol{D}}+{\mathsf{var}\,[\theta_{\hspace{0.5pt}\mathrm{p}}]}\cdot{} \begingroup \begin{pmatrix} 1-a_1{}^2 & -a_1a_3\\ -a_1a_3 & 1-a_3^2 \end{pmatrix}. \endgroup \end{gathered} $$
      (4.91)

      Modify C:

      $$\displaystyle \begin{aligned} &{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})={\boldsymbol{J}}{{}^{-1}}\cdot{\boldsymbol{D}}'\cdot({\boldsymbol{J}}{{}^{-1}}){{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ &{\boldsymbol{J}}{{}^{-1}}= \begingroup \begin{pmatrix} -{1}/{\sin\phi\cos\lambda} &{\ } -{\cos\phi\sin\lambda}/{\sin\phi\cos^2\!\lambda}\\ 0 &{\ } {1}/{\cos^3\!\lambda} \end{pmatrix}. \endgroup \end{aligned} $$
      (4.92)
  B. The track is parametrized as in Eq. (4.4):

    $$\displaystyle \begin{gathered} q_1 = \psi ,\ q_2 = \mathrm{d} v/\mathrm{d} u ,\ q_3 = \mathrm{d} w/\mathrm{d} u ,\ q_4 = v ,\ q_5 = w. \end{gathered} $$
    (4.93)
    1. Compute the direction vector a and the momentum p of the track:

      $$\displaystyle \begin{gathered} {\boldsymbol{a}}=\left(q_2,q_3,1\right){{}^{\mathsf{T}}},\ \;p=1/|\,q_1|. \end{gathered} $$
      (4.94)
    2. Compute the effective thickness t and the variance of the projected multiple scattering angle as in Eqs. (4.85) and (4.86).

    3. Extract the covariance matrix D of \((q_2, q_3)^{\mathsf{T}}\):

      $$\displaystyle \begin{gathered} {\boldsymbol{D}}={\boldsymbol{C}_{}}\,(\mbox{2:3,2:3}). \end{gathered} $$
      (4.95)
    4. Augment D by the contribution of multiple scattering [1] and modify C:

      $$\displaystyle \begin{gathered} {\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})={\boldsymbol{D}}+{\mathsf{var}\,[\theta_{\hspace{0.5pt}\mathrm{p}}]}\cdot(1+q_2^2+q_3^2)\cdot \begingroup \begin{pmatrix} 1+q_2^2 &{\ } q_2 q_3 \\ q_2 q_3 &{\ } 1+q_3^2 \end{pmatrix}. \endgroup \end{gathered} $$
      (4.96)
  C. The track is parametrized as in Eq. (4.5):

    $$\displaystyle \begin{gathered} q_1 = \frac{q}{p} ,\ q_2 = \phi ,\ q_3 = \lambda ,\ q_4 = x_{\bot},\ q_5 = y_{\bot}.{} \end{gathered} $$
    (4.97)
    1. Compute the unit direction vector a and the momentum p of the track:

      $$\displaystyle \begin{gathered} {\boldsymbol{a}}=\left(\cos\phi\cos\lambda,\sin\phi\cos\lambda,\sin\lambda\right){{}^{\mathsf{T}}},\ \;p=1/|\,q_1|. \end{gathered} $$
      (4.98)
    2. Compute the effective thickness t and the variance of the projected multiple scattering angle as in Eqs. (4.85) and (4.86).

    3a. If \(\sin \phi =0\), compute the covariance matrix D of \((a_2, a_3)^{\mathsf{T}}\):

      $$\displaystyle \begin{aligned} {\boldsymbol{D}}&={\boldsymbol{J}}\cdot{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})\cdot{\boldsymbol{J}}{{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ {\boldsymbol{J}}&= \begingroup \begin{pmatrix} {\cos\phi\cos\lambda} &{\ } {0}\\ 0 &{\ } \cos\lambda \end{pmatrix}. \endgroup \end{aligned} $$
      (4.99)

      Augment D by the contribution of multiple scattering as in Eq. (4.88) and modify C:

      $$\displaystyle \begin{aligned} &{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})={\boldsymbol{J}}{{}^{-1}}\cdot{\boldsymbol{D}}'\cdot({\boldsymbol{J}}{{}^{-1}}){{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ &{\boldsymbol{J}}{{}^{-1}}= \begingroup \begin{pmatrix} {1}/{\cos\phi\cos\lambda} &{\ } 0\\ 0 &{\ } {1}/{\cos\lambda} \end{pmatrix}. \endgroup \end{aligned} $$
      (4.100)
    3b. If \(\sin \phi \neq 0\), compute the covariance matrix D of \((a_1, a_3)^{\mathsf{T}}\):

      $$\displaystyle \begin{aligned} {\boldsymbol{D}}&={\boldsymbol{J}}\cdot{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})\cdot{\boldsymbol{J}}{{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ {\boldsymbol{J}}&= \begingroup \begin{pmatrix} -{\sin\phi\cos\lambda} &{\ } -{\cos\phi\sin\lambda}\\ 0 &{\ } \cos\lambda \end{pmatrix}. \endgroup \end{aligned} $$
      (4.101)

      Augment D by the contribution of multiple scattering as in Eq. (4.91) and modify C:

      $$\displaystyle \begin{aligned} &{\boldsymbol{C}_{}}\,(\mbox{2:3,2:3})={\boldsymbol{J}}{{}^{-1}}\cdot{\boldsymbol{D}}'\cdot({\boldsymbol{J}}{{}^{-1}}){{}^{\mathsf{T}}},\ \;\mathrm{with}\ \;\\ &{\boldsymbol{J}}{{}^{-1}}= \begingroup \begin{pmatrix} -{1}/{\sin\phi\cos\lambda} &{\ } -{\cos\phi\sin\lambda}/{\sin\phi\cos^2\!\lambda}\\ 0 &{\ } {1}/{\cos\lambda} \end{pmatrix}. \endgroup \end{aligned} $$
      (4.102)
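The procedure for parametrization C can be sketched in a few lines of plain Python. The function and helper names, the representation of C as a nested list, and the test track are assumptions made for this example; the normal vector n is taken to be given in the global frame.

```python
import math

def matmul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose2(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def thin_scatterer_update(q, cov, d, x0, n):
    """Steps 1 to 3 for q = (q/p, phi, lambda, x_perp, y_perp); returns (new C, var[theta_p])."""
    phi, lam = q[1], q[2]
    sp, cp, sl, cl = math.sin(phi), math.cos(phi), math.sin(lam), math.cos(lam)
    a = (cp * cl, sp * cl, sl)                        # Eq. (4.98)
    p = 1.0 / abs(q[0])
    t = d / (abs(sum(ai * ni for ai, ni in zip(a, n))) * x0)  # Eq. (4.85), |a| = 1
    var = 1.85e-4 * t / (p * p) * (1.0 + 0.038 * math.log(t)) ** 2  # Eq. (4.86)
    if sp == 0.0:                                     # step 3a, Eqs. (4.99), (4.100)
        jac = [[cp * cl, 0.0], [0.0, cl]]
        jac_inv = [[1.0 / (cp * cl), 0.0], [0.0, 1.0 / cl]]
        a_i = a[1]
    else:                                             # step 3b, Eqs. (4.101), (4.102)
        jac = [[-sp * cl, -cp * sl], [0.0, cl]]
        jac_inv = [[-1.0 / (sp * cl), -cp * sl / (sp * cl * cl)], [0.0, 1.0 / cl]]
        a_i = a[0]
    sub = [[cov[1][1], cov[1][2]], [cov[2][1], cov[2][2]]]
    dmat = matmul2(matmul2(jac, sub), transpose2(jac))
    a_3 = a[2]
    d_aug = [[dmat[0][0] + var * (1.0 - a_i * a_i), dmat[0][1] - var * a_i * a_3],
             [dmat[1][0] - var * a_i * a_3, dmat[1][1] + var * (1.0 - a_3 * a_3)]]
    new_sub = matmul2(matmul2(jac_inv, d_aug), transpose2(jac_inv))
    new_cov = [row[:] for row in cov]                 # leave the input untouched
    for i in range(2):
        for j in range(2):
            new_cov[1 + i][1 + j] = new_sub[i][j]
    return new_cov, var

# Hypothetical track: p = 2 GeV/c crossing 0.3 mm of silicon (X_0 assumed to be 9.37 cm)
q = (0.5, 0.7, 0.3, 0.0, 0.0)
cov0 = [[1.0e-6 if i == j else 0.0 for j in range(5)] for i in range(5)]
cov1, var_theta = thin_scatterer_update(q, cov0, d=0.03, x0=9.37, n=(1.0, 0.0, 0.0))
```

Working through the algebra for both branches shows that the net effect is to add var[θ p]∕cos²λ to the variance of ϕ and var[θ p] to the variance of λ, leaving their covariance unchanged. The sketch does not guard against cos λ = 0 or a track parallel to the scatterer surface.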

Thick Scatterers

A thick scatterer can be treated in two ways: sliced into a number of thin scatterers or considered a single scatterer. The first approach gives more precise results, in particular when the scatterer is magnetized and the incoming particle is expected to be strongly deflected and suffer significant energy loss. A typical example for this situation is a low-momentum muon crossing the instrumented iron return yoke of the CMS experiment [20].

In the second approach, the uncertainties of both the direction and the position at the exit point have to be increased. In addition, energy loss may have to be considered, and an estimate of the actual track length L in the scatterer must be available. If no estimate of L is returned by the track propagation algorithm, L can be approximated by the distance between the entry and the exit point times a safety factor that depends on the momentum and can be tuned by simulation. Alternatively, L can be determined from the most probable trajectory in the scatterer [21].

Assume that the track parameters and their covariance matrix at the exit of the scatterer are given by q and C in the curvilinear parametrization; see Eq. (4.97).

  1. 1.

    Extract the entry and exit momenta \(p_1\) and \(p_2\). Under the assumption that the momentum p drops linearly between the entry and the exit, the average of \(1/p^2\) is \(1/(p_1 p_2)\).

  2. 2.

    Compute the variance per unit length of the projected multiple scattering angle:

    $$\displaystyle \begin{gathered} t=\frac{1}{X_0},\ \;\sigma_0^2=\frac{1.85\cdot10^{-4}\cdot t}{p_1 p_2}\, \left(1+0.038\ln t\right)^2.{} \end{gathered} $$
    (4.103)
  3. 3.

    Extract the covariance matrix D of (q 2, q 3, q 4, q 5)T:

    $$\displaystyle \begin{gathered} {\boldsymbol{D}}={\boldsymbol{C}_{}}\,(\mbox{2:5,2:5}). \end{gathered} $$
    (4.104)
  4. 4.

    Compute the unit direction vector a in the global coordinate system:

    $$\displaystyle \begin{gathered} {\boldsymbol{a}}=\left(\cos\phi\cos\lambda,\sin\phi\cos\lambda,\sin\lambda\right). \end{gathered} $$
    (4.105)
  5. 5.

    Transform a into the curvilinear system by rotating a into a′ = (0, 0, 1)T:

    (4.106)
  6. 6.

    Compute the joint covariance matrix E of \({\boldsymbol {q}_{}}'=(a_1^{\prime },a_2^{\prime },x_\bot ,y_\bot ){{ }^{\mathsf {T}}}\) [1]:

    $$\displaystyle \begin{gathered} {\boldsymbol{E}}=\sigma_0^2\cdot \begingroup \begin{pmatrix} L &{\ } 0 &{\ } L^2/2 &{\ } 0\\ 0 &{\ } L &{\ } 0 &{\ } L^2/2\\ L^2/2 &{\ } 0 &{\ } L^3/3 &{\ } 0\\ 0 &{\ } L^2/2 &{\ } 0 &{\ } L^3/3 \end{pmatrix}. \endgroup {} \end{gathered} $$
    (4.107)
  7. 7.

    Rotate the direction part of E back into the global system:

    $$\displaystyle \begin{gathered} {\boldsymbol{E}}'={\boldsymbol{T}}\cdot{\boldsymbol{E}}\cdot{\boldsymbol{T}}{{}^{\mathsf{T}}},\ \;\mathrm{with}\ \; {\boldsymbol{T}}= \begin{pmatrix} \displaystyle\frac{{a_1}^2a_3+{a_2}^2}{1-{a_3}^2} &{\ } -\displaystyle\frac{a_1\,a_2}{1+a_3} &{\ } 0 &{\ } 0 \\ {}-\displaystyle\frac{a_1\,a_2}{1+a_3} &{\ } \displaystyle\frac{{a_1}^2+{a_2}^2a_3}{1-{a_3}^2} &{\ } 0 &{\ } 0 \\ {}0&{\ } 0 &{\ } 1 &{\ } 0\\ 0&{\ } 0 &{\ } 0 &{\ } 1 \end{pmatrix}. \end{gathered} $$
    (4.108)
  8. 8.

    Transform the direction cosines a 1, a 2 back to ϕ, λ:

    (4.109)
    $$\displaystyle \begin{aligned} {\boldsymbol{J}}&= \begin{pmatrix} -\sin\phi &{\ } \cos\phi &{\ } 0 &{\ } 0\\ 0 &{\ } 0 &{\ } 0 &{\ } 0\\ 0&{\ } 0 &{\ } 1 &{\ } 0\\ 0&{\ } 0 &{\ } 0 &{\ } 1 \end{pmatrix},\ \;\mathrm{if}\ \; \lambda=0. \end{aligned} $$
    (4.110)
  9. 9.

    Augment D by the contribution of multiple scattering and modify C:

    $$\displaystyle \begin{gathered} {\boldsymbol{C}_{}}\,(\mbox{2:5,2:5})={\boldsymbol{D}}+{\boldsymbol{E}}''. \end{gathered} $$
    (4.111)
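Steps 1, 2 and 6 of the procedure above can be sketched as follows. The function names are hypothetical, and the sketch assumes that momenta are given in GeV∕c (so that the constant 1.85·10⁻⁴ ≈ 0.0136² applies) and that L and X 0 are expressed in the same length unit. The rotation and the transformation to ϕ, λ in the remaining steps are not covered.

```python
import math

def mean_inverse_p2_numeric(p1, p2, n=10000):
    """Average of 1/p^2 for a momentum dropping linearly from p1 to p2
    (midpoint rule); used only to check the closed form 1/(p1*p2)."""
    total = 0.0
    for i in range(n):
        frac = (i + 0.5) / n
        p = p1 + (p2 - p1) * frac
        total += 1.0 / (p * p)
    return total / n

def sigma0_sq(p1, p2, x0):
    """Variance per unit length of the projected scattering angle, Eq. (4.103)."""
    t = 1.0 / x0
    return 1.85e-4 * t / (p1 * p2) * (1.0 + 0.038 * math.log(t)) ** 2

def local_scattering_cov(s0_sq, length):
    """Joint covariance matrix E of (a1', a2', x_perp, y_perp), Eq. (4.107)."""
    l1 = length
    l2 = length ** 2 / 2.0
    l3 = length ** 3 / 3.0
    pattern = [[l1, 0.0, l2, 0.0],
               [0.0, l1, 0.0, l2],
               [l2, 0.0, l3, 0.0],
               [0.0, l2, 0.0, l3]]
    return [[s0_sq * v for v in row] for row in pattern]

# Illustrative numbers only: p1 = 2.0, p2 = 1.5 (GeV/c), X0 = 1.76, L = 20 (same unit)
E = local_scattering_cov(sigma0_sq(2.0, 1.5, 1.76), 20.0)
```

The off-diagonal terms L²∕2 express the correlation between the accumulated deflection angle and the accumulated lateral offset; each 2×2 angle-offset block has the positive determinant σ₀⁴L⁴∕12, so E is positive definite for L > 0.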

5.2 Energy Loss by Ionization

5.2.1 Mean Energy Loss

A heavy, charged particle passing through material suffers loss of energy due to ionization of the material. The mean energy loss of the particle is given by the Bethe–Bloch formula [14]:

$$\displaystyle \begin{gathered} \frac{\mathrm{d} E}{\mathrm{d} s} = - K z^2 \frac{Z \rho}{A\hspace{0.5pt} \beta^2} \left( \frac{1}{2} \ln \frac{2 m_e c^2 \beta^2 \gamma^2 W_{\mathrm{max}}}{I^2} - \beta^2 - \frac{\delta}{2} \right), \end{gathered} $$
(4.112)

where K is a constant, z is the charge number of the incoming particle, Z and A are the atomic number and atomic mass of the material, ρ is the density of the material, m e is the electron mass, W max is the maximum energy transfer in a single collision, I is the mean excitation energy, and δ is the density effect correction. For an incoming particle of mass M, W max is given by:

$$\displaystyle \begin{gathered} W_{\mathrm{max}} = \frac{2 m_e c^2 \beta^2 \gamma^2}{1 + 2 \gamma m_e/M + (m_e/M)^2}. \end{gathered} $$
(4.113)

For light particles such as electrons and positrons, the Bethe–Bloch formula needs some modifications. An alternative expression for light particles is [1]:

$$\displaystyle \begin{gathered} \frac{\mathrm{d} E}{\mathrm{d} s} = - K \hspace{0.5pt} \frac{Z\hspace{0.5pt} \rho}{A} \left( \ln \frac{2\hspace{0.5pt} m_e\hspace{0.5pt} c^2}{I} + 1.5 \ln \gamma - 0.975\right). \end{gathered} $$
(4.114)
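Equations (4.112)–(4.114) translate directly into code. The sketch below is illustrative only: the text does not give the value of K, so the commonly quoted value K ≈ 0.307075 MeV mol⁻¹ cm² is assumed, together with energies in MeV, A in g∕mol and ρ in g∕cm³. The density effect correction δ has to be supplied by the caller and defaults to zero. All function names are hypothetical.

```python
import math

K = 0.307075        # MeV mol^-1 cm^2; assumed value, not stated in the text
ME_C2 = 0.51099895  # electron rest energy in MeV

def w_max(beta_gamma, mass):
    """Maximum energy transfer in a single collision, Eq. (4.113).
    mass is the rest energy M c^2 of the incoming particle in MeV."""
    gamma = math.sqrt(1.0 + beta_gamma ** 2)
    r = ME_C2 / mass
    return 2.0 * ME_C2 * beta_gamma ** 2 / (1.0 + 2.0 * gamma * r + r * r)

def bethe_bloch(beta_gamma, mass, z, Z, A, rho, I, delta=0.0):
    """Mean dE/ds in MeV/cm for a heavy particle, Eq. (4.112); negative.
    I is the mean excitation energy in MeV."""
    beta2 = beta_gamma ** 2 / (1.0 + beta_gamma ** 2)
    log_arg = 2.0 * ME_C2 * beta_gamma ** 2 * w_max(beta_gamma, mass) / I ** 2
    bracket = 0.5 * math.log(log_arg) - beta2 - 0.5 * delta
    return -K * z * z * Z * rho / (A * beta2) * bracket

def light_particle_loss(gamma, Z, A, rho, I):
    """Mean dE/ds in MeV/cm for electrons and positrons, Eq. (4.114)."""
    bracket = math.log(2.0 * ME_C2 / I) + 1.5 * math.log(gamma) - 0.975
    return -K * Z * rho / A * bracket

# Example: muon (M c^2 = 105.658 MeV) with beta*gamma = 3.5 in silicon
# (Z = 14, A = 28.0855 g/mol, rho = 2.329 g/cm^3, I = 173 eV), delta neglected
loss_si = bethe_bloch(3.5, 105.658, 1, 14, 28.0855, 2.329, 173e-6)
```

Neglecting δ slightly overestimates the loss at high βγ; a realistic application would take δ from a parametrization appropriate for the material.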

5.2.2 Ionization Energy Loss in Track Propagation

In tracking detectors, material layers traversed during propagation are often considered as discrete. Knowing the thickness of the traversed layer, the Bethe–Bloch formula is used to modify the momentum part of the track parameter vector by the expected change before propagating to the next layer. Fluctuations in the ionization energy loss are often considered so small that they are neglected.
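As a minimal sketch of the discrete update, assume that the momentum part of the track parameter vector is ψ = q∕p with |q| = 1, and that the expected energy loss ΔE in the layer has already been obtained, for instance as the mean loss per unit length times the path length in the layer. The function name and the stopping criterion are illustrative choices, not prescribed by the text.

```python
import math

def apply_energy_loss(q_over_p, mass, delta_e):
    """Return q/p after subtracting the expected energy loss delta_e >= 0.

    Momentum, mass and delta_e must use the same energy unit (c = 1);
    |q| = 1 is assumed, so that p = 1/|q/p|.
    """
    p = 1.0 / abs(q_over_p)
    energy = math.hypot(p, mass)          # E = sqrt(p^2 + m^2)
    new_energy = energy - delta_e
    if new_energy <= mass:
        raise ValueError("particle is expected to stop in the layer")
    new_p = math.sqrt(new_energy ** 2 - mass ** 2)
    return math.copysign(1.0 / new_p, q_over_p)   # the charge sign is preserved

# Example: a negative muon with p = 1 GeV/c losing 10 MeV in a layer
psi_out = apply_energy_loss(-1.0, 0.10566, 0.010)
```

Because the update acts on the energy rather than directly on p, it remains valid at low momenta, where ΔE and the change in momentum differ noticeably.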

For detectors such as electromagnetic or hadronic calorimeters, however, ionization energy loss takes place continuously during propagation. In this case, it is possible to augment the vector of global track parameters in Eq. (4.55) with a parameter containing the track momentum [3, 22], e.g., ξ =  as defined in Sect. 4.4.2:

$$\displaystyle \begin{gathered} {\boldsymbol{u}}= \left(x,y,z,\varXi\right){{}^{\mathsf{T}}},\ \; {\boldsymbol{\dot{{\boldsymbol{u}}}}}= \left(t_x,t_y,t_z,\xi\right){{}^{\mathsf{T}}}, {} \end{gathered} $$
(4.115)

where Ξ is an auxiliary parameter corresponding to the integrated change in ξ. Track parameter and Jacobian matrix propagations can thereby be carried out in a way very similar to the one described in Sect. 4.4.2, effectively including the effects of energy loss continuously while traversing the detector.

5.3 Energy Loss by Bremsstrahlung

5.3.1 Mean and Distribution of the Energy Loss

High-energy electrons lose energy in a material mainly by bremsstrahlung [14]. The dependence on the material can be summarized in a characteristic length, called the radiation length X 0. It is defined as the average distance over which a high-energy electron loses 1 − 1∕e ≈ 63% of its energy. Although it is not strictly identical to the radiation length that is characteristic for multiple scattering (see Sect. 4.5.1), the same length is used in both contexts for the sake of convenience.

The rate of the mean energy loss by bremsstrahlung is nearly proportional to the energy:

$$\displaystyle \begin{gathered} {\mathsf{E}\left[-\frac{\mathrm{d}E}{\mathrm{d}s}\right]}\approx\frac{E}{X_0}, \end{gathered} $$
(4.116)

and on average, the energy decreases approximately exponentially as a function of the reduced path length \(t = s/X_0\):

$$\displaystyle \begin{gathered} {\mathsf{E}\left[E(t)\right]}=E_0\exp(-t), \end{gathered} $$
(4.117)

where E 0 is the initial energy. The energy loss is subject to large fluctuations, as a substantial part of the electron energy can be carried away by a single photon. A simplified PDF of E as a function of t has the following form, called the Bethe–Heitler model [23]:

$$\displaystyle \begin{gathered} h(E)=\frac{\left[\ln(E_0/E)\right]{}^{t/\ln2-1}}{E_0\,\varGamma\left(t/\ln2\right)}. \end{gathered} $$
(4.118)

The PDF can be rewritten in terms of the remaining energy fraction \(z = E/E_0\):

$$\displaystyle \begin{gathered} f(z)=\frac{\left(-\ln z\right){}^{t/\ln2-1}}{\varGamma\left(t/\ln2\right)}.{} \end{gathered} $$
(4.119)

It can be seen that \(-\ln z\) is Gamma-distributed. The expectation and the variance can be computed explicitly [24]:

$$\displaystyle \begin{gathered} {\mathsf{E}\left[z\right]}=\exp\left(-t\right),\ \; {\mathsf{var}\left[z\right]}=\exp\left(-t\ln3/\ln2\right)-\exp\left(-2\hspace{0.5pt} t\right). \end{gathered} $$
(4.120)
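The statement that \(-\ln z\) is Gamma-distributed gives a simple way to draw random energy fractions and to check Eq. (4.120) numerically. The sketch below uses only the Python standard library; the function names are hypothetical, and a shape parameter t∕ln 2 with unit scale is read off from Eq. (4.119).

```python
import math
import random

def bh_pdf(z, t):
    """Bethe-Heitler PDF of the energy fraction z, Eq. (4.119); valid for 0 < z < 1."""
    c = t / math.log(2.0)
    return (-math.log(z)) ** (c - 1.0) / math.gamma(c)

def bh_mean(t):
    """Expectation of z, Eq. (4.120)."""
    return math.exp(-t)

def bh_var(t):
    """Variance of z, Eq. (4.120)."""
    return math.exp(-t * math.log(3.0) / math.log(2.0)) - math.exp(-2.0 * t)

def bh_sample(t, rng):
    """Draw z = exp(-u), with u Gamma-distributed (shape t/ln 2, scale 1)."""
    return math.exp(-rng.gammavariate(t / math.log(2.0), 1.0))

# Monte Carlo check of the moments for t = 0.2
rng = random.Random(12345)
sample = [bh_sample(0.2, rng) for _ in range(100000)]
mc_mean = sum(sample) / len(sample)
mc_var = sum((z - mc_mean) ** 2 for z in sample) / len(sample)
```

For t = ln 2 the shape parameter equals one and the PDF reduces to the uniform density on (0, 1), which is a useful special case for testing.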

Figure 4.8 shows that the shape of the PDF is very far from being Gaussian; thus, the distribution cannot be adequately described by merely its mean and variance.

Fig. 4.8 Probability density function of the Bethe–Heitler model of bremsstrahlung in Eq. (4.119), with mean μ and standard deviation σ, for t = 0.2, 0.1, 0.05, 0.02

5.3.2 Approximation by Gaussian Mixtures

For electron reconstruction with the Gaussian-sum filter (GSF; see Sect. 6.2.3), the model PDF in Eq. (4.119) is approximated by a normal mixture PDF with n c components. The parameters of the mixture are determined by minimizing some measure of distance between the two distributions. In [25], two distances have been used: D KL, the Kullback–Leibler distance, and D CDF, the integral over the absolute difference of the respective cumulative distribution functions (CDFs):

$$\displaystyle \begin{aligned} {D_{\mathrm{KL}}}&=\int_{0}^{1} \ln[f(z)/g(z)]\,f(z)\,\mathrm{d} z,{} \end{aligned} $$
(4.121)
$$\displaystyle \begin{aligned} {D_{\mathrm{CDF}}}&=\int_{-\infty}^{\infty} |\hspace{0.5pt}F(z)-G(z)\hspace{0.5pt}|\,\mathrm{d} z,{} \end{aligned} $$
(4.122)

where f(z) and F(z) are the PDF and CDF of the model distribution, and g(z) and G(z) are the PDF and CDF of the normal mixture, respectively. Using D KL and n c = 1, the single Gaussian with the correct first two moments is recovered. In all other cases, the mixtures do not have the same moments as the model. The quality of the approximating mixtures has been investigated in detail in [25]. Software in the form of a Matlab ® function can be downloaded from the URL in [26].
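To illustrate how a distance such as D CDF in Eq. (4.122) can be evaluated for a given mixture, the following sketch computes it by numerical integration. It is not the procedure of [25] and is unrelated to the software in [26]; the function names, the series used for the model CDF, the truncated integration range and the midpoint rule are all assumptions of this example. D KL and the minimization over the mixture parameters are not covered.

```python
import math

def reg_lower_gamma(a, x, terms=200):
    """Regularized lower incomplete gamma function P(a, x), evaluated with
    the series x^a e^(-x) * sum_n x^n / Gamma(a + n + 1)."""
    if x <= 0.0:
        return 0.0
    if x > 60.0:
        return 1.0                      # series is truncated; P is 1 to double precision
    term = 1.0 / math.gamma(a + 1.0)    # n = 0 term
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return x ** a * math.exp(-x) * total

def model_cdf(z, t):
    """CDF F(z) of the Bethe-Heitler model: P(Z <= z) = 1 - P(c, -ln z)."""
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    return 1.0 - reg_lower_gamma(t / math.log(2.0), -math.log(z))

def mixture_cdf(z, weights, means, sigmas):
    """CDF G(z) of a normal mixture."""
    return sum(w * 0.5 * (1.0 + math.erf((z - m) / (s * math.sqrt(2.0))))
               for w, m, s in zip(weights, means, sigmas))

def d_cdf(t, weights, means, sigmas, lo=-1.0, hi=2.0, n=6000):
    """Midpoint-rule approximation of Eq. (4.122) on [lo, hi]; the tails
    outside this range are neglected."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += abs(model_cdf(z, t) - mixture_cdf(z, weights, means, sigmas))
    return h * total

# Single Gaussian with the mean and variance of the model, Eq. (4.120), for t = 0.1
t = 0.1
mu = math.exp(-t)
sigma = math.sqrt(math.exp(-t * math.log(3.0) / math.log(2.0)) - math.exp(-2.0 * t))
dist = d_cdf(t, [1.0], [mu], [sigma])
```

A function of this kind could serve as the objective of a general-purpose minimizer over the weights, means and standard deviations of the mixture components.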