1 Least-Squares Fitting

1.1 Straight Tracks

1.1.1 Exact Fit

Assume that there are n straight tracks that have to be fitted to a common vertex. Track i is given by a point r i, a unit direction vector a i, and the joint covariance matrix \({{{\boldsymbol {V}}}_{\hspace{-0.5pt} i}}\) of q i = (r i;a i), i = 1, …, n. The rank of \({{{\boldsymbol {V}}}_{\hspace{-0.5pt} i}}\) is usually equal to five. Here and in the entire chapter, it is assumed that there is no material between the vertex and the point or surface where the track parameters are defined.

The estimated common vertex is the point v that minimizes the sum of the weighted squared distances from the tracks. The squared distance D i of track i from the point v is given by:

$$\displaystyle \begin{gathered} D_i({{\boldsymbol{v}}})=\left[({{\boldsymbol{r}}}_i-{{\boldsymbol{v}}})\times{{\boldsymbol{a}}}_i\right]{}^2.{} \end{gathered} $$
(8.1)

For a given vertex v 0, the variance \(\sigma _i^2={\mathsf {var}\left [D_i\right ]}\) is computed by linearized error propagation. The Jacobian of D i with respect to q i is given by:

$$\displaystyle \begin{gathered} {{\boldsymbol{J}}}_i={\frac{\partial D_i}{\partial{{\boldsymbol{q}}}_i}}{{}^{\mathsf{T}}}=2\cdot\left( a_{i,2}\,\eta_{i,21}-a_{i,3}\,\eta_{i,13},\ \; a_{i,3}\,\eta_{i,32}-a_{i,1}\,\eta_{i,21},\ \; a_{i,1}\,\eta_{i,13}-a_{i,2}\,\eta_{i,32},\right.\\ \left. d_{i,3}\,\eta_{i,13}-d_{i,2}\,\eta_{i,21},\ \; d_{i,1}\,\eta_{i,21}-d_{i,3}\,\eta_{i,32},\ \; d_{i,2}\,\eta_{i,32}-d_{i,1}\,\eta_{i,13} \right){{}^{\mathsf{T}}},{} \end{gathered} $$
(8.2)

with the auxiliary variables

$$\displaystyle \begin{gathered} d_{i,k}=r_{i,k}-v_{0,k},\ \; \eta_{i,jk}=a_{i,j}\,d_{i,k}-a_{i,k}\,d_{i,j},\ \; j,k=1,2,3. \end{gathered} $$

It follows that

$$\displaystyle \begin{gathered} \sigma_i^2\approx{{\boldsymbol{J}}}_i{{}^{\mathsf{T}}} \cdot {{{\boldsymbol{V}}}_{\hspace{-0.5pt} i}} \cdot {{\boldsymbol{J}}}_i.{} \end{gathered} $$
(8.3)

Minimizing the sum of the squared distances gives the fitted vertex \({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}}\):

$$\displaystyle \begin{gathered} {{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}}=\arg_{{{\boldsymbol{v}}}} \min {\mathcal{S}}({{\boldsymbol{v}}}),\ \;\mathrm{with}\ \; {\mathcal{S}}({{\boldsymbol{v}}})=\sum_{i=1}^n \frac{D_i({{\boldsymbol{v}}})}{\sigma_i^2}.{} \end{gathered} $$
(8.4)

The minimization with the Newton–Raphson method proceeds iteratively:

  1.

    Let v 0 be an approximate initial vertex position. Compute D i(v 0), J i and \(\sigma ^2_i\) for i = 1, …, n, according to Eqs. (8.1)–(8.3).

  2.

    Compute the gradient of \({\mathcal {S}}\) with respect to v at v 0:

    $$\displaystyle \begin{gathered} \nabla {\mathcal{S}}=\sum_{i=1}^n \frac{1}{\sigma_i^2}\hspace{0.5pt}{{\boldsymbol{g}}}_i,\ \;\mathrm{with}\ \; {{\boldsymbol{g}}}_i= 2\cdot\begin{pmatrix} a_{i,3}\,\eta_{i,13}-a_{i,2}\,\eta_{i,21}\\ a_{i,1}\,\eta_{i,21}-a_{i,3}\,\eta_{i,32}\\ a_{i,2}\,\eta_{i,32}-a_{i,1}\,\eta_{i,13} \end{pmatrix}.{} \end{gathered} $$
    (8.5)
  3.

    Compute the Hessian matrix of \({\mathcal {S}}\) with respect to v at v 0:

    $$\displaystyle \begin{aligned} &{\nabla^2} {\mathcal{S}}=\sum_{i=1}^n \frac{1}{\sigma_i^2}\hspace{0.5pt}{{\boldsymbol{H}}}_i,\ \;\mathrm{with}\ \; {{\boldsymbol{H}}}_i= 2\cdot\begin{pmatrix} a_{i,2}^2+a_{i,3}^2 &{\ } -{a_{i,1}\,a_{i,2}} &{\ } -{a_{i,1}\,a_{i,3}}\\ -{a_{i,1}\,a_{i,2}} &{\ } {a_{i,1}^2+a_{i,3}^2} &{\ } -{a_{i,2}\,a_{i,3}}\\ -{a_{i,1}\,a_{i,3}} &{\ } -{a_{i,2}\,a_{i,3}} &{\ } {a_{i,1}^2+a_{i,2}^2} \end{pmatrix}.{} \end{aligned} $$
    (8.6)
  4.

    Compute the solution v 1 of \(\nabla {\mathcal {S}}=\mathbf {0}\):

    $$\displaystyle \begin{gathered} {{\boldsymbol{v}}}_1={{\boldsymbol{v}}}_0-({\nabla^2} {\mathcal{S}}){{}^{-1}}\cdot\nabla {\mathcal{S}}. \end{gathered} $$
    (8.7)
  5.

    Set v 0 equal to v 1 and repeat from step 2 until convergence.

The covariance matrix of the final estimate \({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}}\) is given by \(({\nabla ^2} {\mathcal {S}}){{ }^{-1}}\), and the χ 2-statistic of the fit is equal to \({\mathcal {S}}({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}})\). Its number of degrees of freedom is the sum of the ranks of all \({{{\boldsymbol {V}}}_{\hspace{-0.5pt} i}}\) minus three. Prior information on the vertex position that is independent of the track information can be included by an additional term in Eq. (8.4) or after the fit by a weighted mean.
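The Newton–Raphson iteration above can be sketched in a few lines. The following NumPy sketch is illustrative only (the function name and interface are not from the text); for simplicity it treats the variances σ i 2 as fixed per-track weights instead of recomputing them from Eqs. (8.2)–(8.3) at each iteration:

```python
import numpy as np

def fit_vertex_straight(points, dirs, sigma2, v0, n_iter=10):
    """Newton-Raphson vertex fit for straight tracks, Eqs. (8.1)-(8.7).

    points: (n, 3) reference points r_i; dirs: (n, 3) unit directions a_i;
    sigma2: per-track variances of D_i (kept fixed here for simplicity);
    v0: initial vertex position."""
    v = np.asarray(v0, dtype=float)
    for _ in range(n_iter):
        grad = np.zeros(3)
        hess = np.zeros((3, 3))
        for r, a, s2 in zip(points, dirs, sigma2):
            H = 2.0 * (np.eye(3) - np.outer(a, a))  # Hessian of D_i, Eq. (8.6)
            grad -= H @ (r - v) / s2                # gradient of D_i / sigma_i^2
            hess += H / s2
        v = v - np.linalg.solve(hess, grad)         # Newton step, Eq. (8.7)
    return v, np.linalg.inv(hess)                   # estimate and its covariance
```

With fixed weights, the objective is exactly quadratic in v for straight tracks, so a single Newton step already reaches the minimum.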

1.1.2 Simplified Fit

If the uncertainty of the direction vectors a i is neglected, the vertex fit can be further simplified [1]. Assume that track i is specified by a reference point r i = (x i, y i, z i)T in the vicinity of the vertex and a unit direction vector a i in spherical coordinates:

$$\displaystyle \begin{gathered} {{\boldsymbol{a}}}_i=\left( \cos\varphi_i\cos\lambda_i, \sin\varphi_i\cos\lambda_i, \sin\lambda_i \right){{}^{\mathsf{T}}}, \end{gathered} $$
(8.8)

where φ i is the azimuth and λ i = π∕2 − θ i the dip angle, i.e., the complement of the polar angle. For the purpose of the vertex fit, a convenient choice of the coordinate system for position is a system where the x′-axis is parallel to the track, the y′-axis is perpendicular to the x′-axis and to the z-axis, and the z′-axis forms a right-handed orthonormal system with x′ and y′. The coordinate transformation of the reference point r i to this track-based system is given by the following rotation:

$$\displaystyle \begin{gathered} {{\boldsymbol{r}}}_i^{\prime}={{\boldsymbol{R}}}_i\hspace{0.5pt}{{\boldsymbol{r}}}_i=\begin{pmatrix} \cos\varphi_i\cos\lambda_i & \sin\varphi_i\cos\lambda_i & \sin\lambda_i\\ -\sin\varphi_i & \cos\varphi_i & 0\\ -\cos\varphi_i\sin\lambda_i & -\sin\varphi_i\sin\lambda_i & \cos\lambda_i \end{pmatrix}\hspace{0.5pt}{{\boldsymbol{r}}}_i. \end{gathered} $$
(8.9)

The coordinates y′ and z′ are called the transverse and the longitudinal impact parameter, respectively. The fit described in [1] assumes that q i = (y′, z′)T has been estimated by the track fit, with the associated weight matrix G i, and that the direction errors are negligible. The transformation from r i to q i is given by the 2 × 3 matrix T i consisting of the second and third rows of R i.

The vertex v is estimated by minimizing the sum of the weighted distances between the reference points and the vertex, transformed to the corresponding track-based systems:

$$\displaystyle \begin{gathered} {\mathcal{S}}({{\boldsymbol{v}}})=\sum_{i=1}^n ({{\boldsymbol{r}}}_i-{{\boldsymbol{v}}}){{}^{\mathsf{T}}}{{\boldsymbol{T}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{T}}}_i\hspace{0.5pt}({{\boldsymbol{r}}}_i-{{\boldsymbol{v}}}). \end{gathered} $$
(8.10)

The estimated vertex and its covariance matrix C are therefore given by:

$$\displaystyle \begin{gathered} {{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}}={{\boldsymbol{C}}}\sum_{i=1}^n{{{\boldsymbol{W}}}_{i}}\hspace{0.5pt}{{\boldsymbol{r}}}_i,\ \;\mathrm{with}\ \; {{\boldsymbol{C}}}=\left(\sum_{i=1}^n{{{\boldsymbol{W}}}_{i}}\right)^{-1}\quad \mathrm{and}\quad{{{\boldsymbol{W}}}_{i}}={{\boldsymbol{T}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{T}}}_i,\ i=1,\ldots,n. \end{gathered} $$
(8.11)
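Equation (8.11) amounts to a weighted mean of the reference points, with weights built from the track directions. A minimal NumPy sketch (function and argument names are illustrative, not from the text):

```python
import numpy as np

def simplified_vertex_fit(refs, phis, lams, Gs):
    """Simplified LS vertex fit, Eq. (8.11): direction errors neglected.

    refs: (n, 3) reference points r_i; phis, lams: azimuth and dip angle
    of each track; Gs: 2x2 weight matrices G_i of the impact parameters."""
    W_sum = np.zeros((3, 3))
    rhs = np.zeros(3)
    for r, phi, lam, G in zip(refs, phis, lams, Gs):
        cp, sp = np.cos(phi), np.sin(phi)
        cl, sl = np.cos(lam), np.sin(lam)
        # T_i: second and third rows of the rotation R_i in Eq. (8.9)
        T = np.array([[-sp, cp, 0.0],
                      [-cp * sl, -sp * sl, cl]])
        W = T.T @ G @ T                  # W_i of Eq. (8.11)
        W_sum += W
        rhs += W @ np.asarray(r, dtype=float)
    C = np.linalg.inv(W_sum)
    return C @ rhs, C                    # vertex estimate and covariance
```

For tracks that pass exactly through a common point, the fit returns that point, since every term T i(r i − v) then vanishes.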

1.2 Curved Tracks

The fits described in the preceding subsection can also be used with locally straight tracks for which the change of curvature and direction in the vicinity of the vertex is negligible. If this is not the case, nonlinearities in the track model have to be taken into account.

1.2.1 Nonlinear Regression

The general vertex fit can be formulated as a nonlinear regression model [2]. Assume that there are n tracks to be fitted to a common vertex. The tracks are specified by the estimated track parameters q i and the associated covariance matrices V i, i = 1, …, n. The parameters to be estimated are the vertex position v and the momentum vectors p i of all tracks at the vertex, see Fig. 8.1.

Fig. 8.1
figure 1

A vertex fit with four tracks. The parameters of the fit are the vertex v and the momentum vectors p i; the observations are the estimated track parameters q i

The track parameters q i are nonlinear functions of the parameters:

$$\displaystyle \begin{gathered}{} {{\boldsymbol{q}}}_i={{{\boldsymbol{h}}}_{i}}({{\boldsymbol{v}}},{{\boldsymbol{p}}}_i),\ i=1,\ldots,n. \end{gathered} $$
(8.12)

The first-order Taylor expansion of h i at a suitable expansion point e 0 = (v 0, p i,0) gives the following approximate linear model:

$$\displaystyle \begin{gathered}{} {{\boldsymbol{q}}}_i\approx{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\boldsymbol{v}}}+{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{p}}}_i+{{\boldsymbol{c}}}_i,\ i=1,\ldots,n, \end{gathered} $$
(8.13)

with

$$\displaystyle \begin{gathered} {{{\boldsymbol{A}}}}_i=\left.{\frac{\partial{{{\boldsymbol{h}}}_{i}}}{\partial{{\boldsymbol{v}}}}}\right|{}_{{{\boldsymbol{e}}}_0},\ \; {{\boldsymbol{B}}}_i=\left.{\frac{\partial{{{\boldsymbol{h}}}_{i}}}{\partial{{\boldsymbol{p}}}_i}}\right|{}_{{{\boldsymbol{e}}}_0},\ \; {{\boldsymbol{c}}}_i={{{\boldsymbol{h}}}_{i}}({{\boldsymbol{v}}}_0,{{\boldsymbol{p}}}_{i,0})-{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\boldsymbol{v}}}_0-{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{p}}}_{i,0}. \end{gathered} $$
(8.14)

This can be written as:

$$\displaystyle \begin{gathered} \begin{pmatrix} {{\boldsymbol{q}}}_1\\ \vdots\\ {{\boldsymbol{q}}}_n \end{pmatrix}\approx \begin{pmatrix} {{{\boldsymbol{A}}}}_1 & {{\boldsymbol{B}}}_1 & & \\ \vdots & & \ddots & \\ {{{\boldsymbol{A}}}}_n & & & {{\boldsymbol{B}}}_n \end{pmatrix} \begin{pmatrix} {{\boldsymbol{v}}}\\ {{\boldsymbol{p}}}_1\\ \vdots\\ {{\boldsymbol{p}}}_n \end{pmatrix}+ \begin{pmatrix} {{\boldsymbol{c}}}_1\\ \vdots\\ {{\boldsymbol{c}}}_n \end{pmatrix}. \end{gathered} $$
(8.15)

The LS estimates \({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}}\) and \({{\hat {\boldsymbol {p}}}}_i\) are obtained by:

$$\displaystyle \begin{gathered} \begin{pmatrix} {{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}} \\ {{\hat{\boldsymbol{p}}}}_1 \\ \vdots\\ {{\hat{\boldsymbol{p}}}}_n \end{pmatrix}={{\boldsymbol{M}}}{{}^{-1}}\hspace{0.5pt}{{\boldsymbol{N}}}\hspace{0.5pt} \begin{pmatrix} {{\boldsymbol{q}}}_1-{{\boldsymbol{c}}}_1\\ \vdots\\ {{\boldsymbol{q}}}_n-{{\boldsymbol{c}}}_n \end{pmatrix},{} \end{gathered} $$
(8.16)

with

$$\displaystyle \begin{aligned} {{\boldsymbol{M}}}&=\begin{pmatrix} {{\boldsymbol{D}}}_0 & {{\boldsymbol{D}}}_1 & \cdots & {{\boldsymbol{D}}}_n\\ {{\boldsymbol{D}}}_1{{}^{\mathsf{T}}} & {{\boldsymbol{E}}}_1 & & \\ \vdots & & \ddots & \\ {{\boldsymbol{D}}}_n{{}^{\mathsf{T}}} & & & {{\boldsymbol{E}}}_n \end{pmatrix},{} \end{aligned} $$
(8.17)
$$\displaystyle \begin{aligned} {{\boldsymbol{N}}}&=\begin{pmatrix} {{{\boldsymbol{A}}}}_1{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{1}} & \cdots & {{{\boldsymbol{A}}}}_n{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{n}}\\ {{\boldsymbol{B}}}_1{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{1}} & & \\ & \ddots & \\ & & {{\boldsymbol{B}}}_n{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{n}} \end{pmatrix},{} \end{aligned} $$
(8.18)
$$\displaystyle \begin{aligned} {{\boldsymbol{D}}}_i&={{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i,\ \; {{\boldsymbol{E}}}_i={{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i={{\boldsymbol{W}}}_i{{}^{-1}},\ \;{{{\boldsymbol{G}}}_{i}}={{{\boldsymbol{V}}}_{\hspace{-0.5pt} i}}{{}^{-1}},\ \; i=1,\ldots,n, \end{aligned} $$
(8.19)
$$\displaystyle \begin{aligned} {{\boldsymbol{D}}}_0&=\sum_{i=1}^n {{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{{\boldsymbol{A}}}}_i. \end{aligned} $$
(8.20)

C = M −1 can be written as a block matrix with the blocks C ij, i, j = 0, …, n:

$$\displaystyle \begin{aligned} {{\boldsymbol{C}}}_{00}&=\left({{\boldsymbol{D}}}_0-\sum_{i=1}^n {{\boldsymbol{D}}}_i\hspace{0.5pt}{{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{D}}}_i{{}^{\mathsf{T}}}\right)^{-1},{} \end{aligned} $$
(8.21)
$$\displaystyle \begin{aligned} {{\boldsymbol{C}}}_{0j}&=-{{\boldsymbol{C}}}_{00}\hspace{0.5pt}{{\boldsymbol{D}}}_j\hspace{0.5pt}{{\boldsymbol{W}}}_j, \ \; {{\boldsymbol{C}}}_{j0}={{\boldsymbol{C}}}_{0j}{{}^{\mathsf{T}}},\ \; j>0{} \end{aligned} $$
(8.22)
$$\displaystyle \begin{aligned} {{\boldsymbol{C}}}_{ij}&=\delta_{ij}{{\boldsymbol{W}}}_i+{{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{D}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{C}}}_{00}\hspace{0.5pt}{{\boldsymbol{D}}}_j\hspace{0.5pt}{{\boldsymbol{W}}}_j= \delta_{ij}{{\boldsymbol{W}}}_i-{{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{D}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{C}}}_{0j},\ \; i,j>0.{} \end{aligned} $$
(8.23)

Substitution of Eqs. (8.21)–(8.23) into Eq. (8.16) gives the following expressions for the estimated parameters:

$$\displaystyle \begin{aligned} {{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}}&={{\boldsymbol{C}}}_{00}\hspace{0.5pt}\sum_{j=1}^n {{{\boldsymbol{A}}}}_j{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{j}}\hspace{0.5pt} ({{\boldsymbol{I}}}-{{\boldsymbol{B}}}_j\hspace{0.5pt}{{\boldsymbol{W}}}_j\hspace{0.5pt}{{\boldsymbol{B}}}_j{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{j}})\hspace{0.5pt}({{\boldsymbol{q}}}_j-{{\boldsymbol{c}}}_j),{} \end{aligned} $$
(8.24)
$$\displaystyle \begin{aligned} {{\hat{\boldsymbol{p}}}}_i&={{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i-{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}}),\ \; i=1,\ldots,n. \end{aligned} $$
(8.25)
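Given the Jacobians A i, B i and the constants c i at the current expansion point, Eqs. (8.19)–(8.25) can be evaluated directly. The following NumPy sketch of one linearized step is illustrative (the interface is not from the text); the full fit would wrap it in the re-expansion loop described below:

```python
import numpy as np

def vertex_fit_step(qs, As, Bs, cs, Gs):
    """One linearized step of the global vertex fit, Eqs. (8.19)-(8.25).

    qs: measured track parameter vectors; As, Bs, cs: Jacobians and
    constant terms of the linearized model, Eq. (8.13); Gs: weights G_i."""
    m = len(qs[0])
    D0 = sum(A.T @ G @ A for A, G in zip(As, Gs))               # Eq. (8.20)
    Ws = [np.linalg.inv(B.T @ G @ B) for B, G in zip(Bs, Gs)]   # Eq. (8.19)
    Ds = [A.T @ G @ B for A, G, B in zip(As, Gs, Bs)]
    # Schur complement giving the vertex covariance, Eq. (8.21)
    C00 = np.linalg.inv(D0 - sum(D @ W @ D.T for D, W in zip(Ds, Ws)))
    # vertex estimate, Eq. (8.24)
    v = C00 @ sum(A.T @ G @ (np.eye(m) - B @ W @ B.T @ G) @ (q - c)
                  for A, B, c, G, W, q in zip(As, Bs, cs, Gs, Ws, qs))
    # momentum estimates, Eq. (8.25)
    ps = [W @ B.T @ G @ (q - c - A @ v)
          for W, B, G, q, c, A in zip(Ws, Bs, Gs, qs, cs, As)]
    return v, ps, C00
```

If the measurements are exactly consistent with the linear model, the step recovers the true vertex and momenta, since the reduced weight G i − G i B i W i B i T G i annihilates the columns of B i.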

The functions h i are re-expanded at the new expansion point \({{\boldsymbol {e}}}_1=({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}},{{\hat {\boldsymbol {p}}}}_i),\ \; i=1,\ldots ,n\), and the fit is iterated until convergence. After convergence, the track parameters q i can be updated:

$$\displaystyle \begin{gathered} {{\hat{\boldsymbol{q}}}}_i={{{\boldsymbol{h}}}_{i}}({{\hat{\boldsymbol{{{\boldsymbol{v}}}}}}},{{\hat{\boldsymbol{p}}}}_i),\ i=1,\ldots,n.{} \end{gathered} $$
(8.26)

In the linear approximation, the joint covariance matrix of \({{\hat {\boldsymbol {{{\boldsymbol {v}}}}}}}\) and all \({{\hat {\boldsymbol {p}}}}_i\) is equal to C = M −1, from which the joint covariance matrix of all \({{\hat {\boldsymbol {q}}}}_i\) can be computed by linearized error propagation. The χ 2-statistic of the fit can be computed as follows:

$$\displaystyle \begin{gathered} \chi^2=\sum_{i=1}^n ({{\boldsymbol{q}}}_i-{{\hat{\boldsymbol{q}}}}_i){{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}({{\boldsymbol{q}}}_i-{{\hat{\boldsymbol{q}}}}_i). \end{gathered} $$
(8.27)

If the errors of the estimated track parameters can be assumed to be approximately Gaussian, the chi-square statistic is approximately χ 2-distributed with

$$\displaystyle \begin{gathered} {\mathrm{ndf}}=\sum_{i=1}^n {{\text{rank}({{{\boldsymbol{V}}}_{\hspace{-0.5pt} i}})}}-3\,(n+1) \end{gathered} $$
(8.28)

degrees of freedom.

1.2.2 Extended Kalman Filter

The nonlinear regression can be reformulated as an extended Kalman filter (see Sect. 6.1.2). Initially, the state vector consists only of the prior information about the vertex position v 0, and its covariance matrix C 0. In many instances, the prior information is given by the position and the size of the beam spot or the target. If no prior information is available, v 0 is a rough guess, and C 0 is set to a large diagonal matrix.

For each track i, i = 1, …, n, the state vector is augmented by the three-momentum vector at the vertex p i. The system equation is the identity:

$$\displaystyle \begin{gathered} {{\boldsymbol{v}}}_i={{\boldsymbol{v}}}_{i-1},\quad{{\boldsymbol{C}}}_i={{\boldsymbol{C}}}_{i-1}. \end{gathered} $$
(8.29)

The measurement equation and its linearized form are the same as in Eqs. (8.12)–(8.14). The update of the vertex position and the estimation of p i can now be written as:

$$\displaystyle \begin{aligned} {{\boldsymbol{v}}}_i&={{\boldsymbol{C}}}_i\left[{{\boldsymbol{C}}}_{i-1}{{}^{-1}}\hspace{0.5pt}{{\boldsymbol{v}}}_{i-1}+{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}\,^{\mathrm{B}}_{i}}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i)\right],{} \end{aligned} $$
(8.30)
$$\displaystyle \begin{aligned} {{\boldsymbol{p}}}_i&={{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i-{{{\boldsymbol{A}}}}_i{{\boldsymbol{v}}}_i),{} \end{aligned} $$
(8.31)

with W i = (B i T G i B i)−1 and \({{{\boldsymbol {G}}}^{\,\mathrm {B}}_{i}}={{{\boldsymbol {G}}}_{i}}-{{{\boldsymbol {G}}}_{i}}\hspace{0.5pt}{{\boldsymbol {B}}}_i\hspace{0.5pt}{{\boldsymbol {W}}}_i\hspace{0.5pt}{{\boldsymbol {B}}}_i{{ }^{\mathsf {T}}}{{{\boldsymbol {G}}}_{i}}\). The updated covariance and cross-covariance matrices are:

$$\displaystyle \begin{aligned} {\mathsf{Var}\left[{{\boldsymbol{v}}}_i\right]}&={{\boldsymbol{C}}}_i=\left({{\boldsymbol{C}}}_{i-1}{{}^{-1}}+{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}^{\,\mathrm{B}}_{i}}{{{\boldsymbol{A}}}}_i\right){{}^{-1}},{} \end{aligned} $$
(8.32)
$$\displaystyle \begin{aligned} {\mathsf{Var}\left[{{\boldsymbol{p}}}_i\right]}&={{\boldsymbol{W}}}_i+{{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\boldsymbol{C}}}_i\hspace{0.5pt}{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{W}}}_i,{} \end{aligned} $$
(8.33)
$$\displaystyle \begin{aligned} {\mathsf{Cov}\left[{{\boldsymbol{v}}}_i,{{\boldsymbol{p}}}_i\right]}&=-{{\boldsymbol{C}}}_i\hspace{0.5pt}{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{W}}}_i.{} \end{aligned} $$
(8.34)
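The update step of Eqs. (8.30)–(8.32) can be sketched compactly in NumPy. The sketch below is illustrative (names and interface are not from the text) and returns only the updated vertex, its covariance, and the track momentum; the remaining covariances of Eqs. (8.33)–(8.34) follow from the same ingredients:

```python
import numpy as np

def kf_vertex_update(v_prev, C_prev, q, A, B, c, G):
    """One extended-Kalman-filter vertex update, Eqs. (8.30)-(8.32).

    Adds track (q, G) with linearization (A, B, c) to the current
    vertex estimate (v_prev, C_prev)."""
    W = np.linalg.inv(B.T @ G @ B)
    GB = G - G @ B @ W @ B.T @ G                         # reduced weight G_i^B
    C_inv_prev = np.linalg.inv(C_prev)
    C = np.linalg.inv(C_inv_prev + A.T @ GB @ A)         # Eq. (8.32)
    v = C @ (C_inv_prev @ v_prev + A.T @ GB @ (q - c))   # Eq. (8.30)
    p = W @ B.T @ G @ (q - c - A @ v)                    # Eq. (8.31)
    return v, C, p
```

Because the update adds information matrices, processing the tracks sequentially gives the same final vertex as the global fit of the previous subsection.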

Each update step gives rise to residuals r i and a chi-square statistic \({\chi ^2}_i\):

$$\displaystyle \begin{aligned} {{\boldsymbol{r}}}_i&={{\boldsymbol{q}}}_i-{{{\boldsymbol{h}}}_{i}}({{\boldsymbol{v}}}_i,{{\boldsymbol{p}}}_i), \end{aligned} $$
(8.35)
$$\displaystyle \begin{aligned} {\chi^2}_i&={{\boldsymbol{r}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{r}}}_i+ ({{\boldsymbol{v}}}_i-{{\boldsymbol{v}}}_{i-1}){{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{C}}}_{i-1}{{}^{-1}}\hspace{0.5pt}({{\boldsymbol{v}}}_i-{{\boldsymbol{v}}}_{i-1}). \end{aligned} $$
(8.36)

The chi-square statistic has two degrees of freedom and can be used to test the compatibility of track i with the current fitted vertex. If no intermediate results are needed, the computation of the momentum vectors p i can be deferred to the smoother, and the final vertex v n and its covariance matrix C n can be computed directly, cf. Eqs. (8.21) and (8.24):

$$\displaystyle \begin{aligned} {{\boldsymbol{v}}}_n&={{\boldsymbol{C}}}_n\left[{{\boldsymbol{C}}}_0{{}^{-1}}{{\boldsymbol{v}}}_0+\sum_{i=1}^n {{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}^{\,\mathrm{B}}_{i}}\hspace{0.5pt}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i)\right], \end{aligned} $$
(8.37)
$$\displaystyle \begin{aligned} {{\boldsymbol{C}}}_n&=\left({{\boldsymbol{C}}}_0{{}^{-1}}+\sum_{i=1}^n{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}^{\,\mathrm{B}}_{i}}\hspace{0.5pt}{{{\boldsymbol{A}}}}_i\right)^{-1}. \end{aligned} $$
(8.38)

As there is no process noise in the system equation, the smoother is tantamount to recomputing the momentum vectors and the covariance matrices with the final vertex v n and its covariance matrix C n, see also Eqs. (8.31)–(8.34):

$$\displaystyle \begin{aligned} {{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}&={{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i-{{{\boldsymbol{A}}}}_i{{\boldsymbol{v}}}_n),{} \end{aligned} $$
(8.39)
$$\displaystyle \begin{aligned} {\mathsf{Var}\left[{{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}\right]}&={{\boldsymbol{W}}}_i+{{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\boldsymbol{C}}}_n\hspace{0.5pt}{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{W}}}_i,{} \end{aligned} $$
(8.40)
$$\displaystyle \begin{aligned} {\mathsf{Cov}\left[{{\boldsymbol{v}}}_n,{{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}\right]}&=-{{\boldsymbol{C}}}_n\hspace{0.5pt}{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{B}}}_i\hspace{0.5pt}{{\boldsymbol{W}}}_i,{} \end{aligned} $$
(8.41)
$$\displaystyle \begin{aligned} {\mathsf{Cov}\left[{{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n},{{\boldsymbol{p}}}_{j{\hspace{0.5pt}|\hspace{0.5pt}}{}n}\right]}&={{\boldsymbol{W}}}_i\hspace{0.5pt}{{\boldsymbol{B}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{{\boldsymbol{A}}}}_i\hspace{0.5pt}{{\boldsymbol{C}}}_n\hspace{0.5pt}{{{\boldsymbol{A}}}}_j{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{j}}\hspace{0.5pt}{{\boldsymbol{B}}}_j\hspace{0.5pt}{{\boldsymbol{W}}}_j.{} \end{aligned} $$
(8.42)

The update of the track parameters reads:

$$\displaystyle \begin{gathered} {{\hat{\boldsymbol{q}}}}_i={{{\boldsymbol{h}}}_{i}}({{\boldsymbol{v}}}_n,{{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}),\ i=1,\ldots,n. \end{gathered} $$
(8.43)

Their joint covariance matrix can be computed by linearized error propagation. Each track can be tested against the final vertex by computing the smoothed residuals and the corresponding chi-square statistic:

$$\displaystyle \begin{aligned} {{\boldsymbol{r}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}&={{\boldsymbol{q}}}_i-{{{\boldsymbol{h}}}_{i}}({{\boldsymbol{v}}}_n,{{\boldsymbol{p}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}), \end{aligned} $$
(8.44)
$$\displaystyle \begin{aligned} {\chi^2}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}&={{\boldsymbol{r}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{r}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}+ ({{\boldsymbol{v}}}_n-{{\boldsymbol{v}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}){{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{C}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}{{}^{-1}}\hspace{0.5pt}({{\boldsymbol{v}}}_n-{{\boldsymbol{v}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}), \end{aligned} $$
(8.45)

where v n | −i is the final vertex with track i removed, and C n | −i is its covariance matrix:

$$\displaystyle \begin{aligned} {{\boldsymbol{v}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}&={{\boldsymbol{C}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}\left[{{\boldsymbol{C}}}_{n}{{}^{-1}}{{\boldsymbol{v}}}_{n}-{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}^{\,\mathrm{B}}_{i}}({{\boldsymbol{q}}}_i-{{\boldsymbol{c}}}_i)\right],{} \end{aligned} $$
(8.46)
$$\displaystyle \begin{aligned} {\mathsf{Var}\left[{{\boldsymbol{v}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}\right]}&={{\boldsymbol{C}}}_{n{\hspace{0.5pt}|\hspace{0.5pt}}{}-i}=\left({{\boldsymbol{C}}}_{n}{{}^{-1}}-{{{\boldsymbol{A}}}}_i{{}^{\mathsf{T}}}{{{\boldsymbol{G}}}^{\,\mathrm{B}}_{i}}{{{\boldsymbol{A}}}}_i\right){{}^{-1}}.{} \end{aligned} $$
(8.47)

Searching for outliers in this way is, however, tedious and time-consuming. The adaptive vertex fit described in Sect. 8.2 is better suited to this task and more powerful, especially if there are several outliers.

1.2.3 Fit with Perigee Parameters

In many collider experiments, past and present, the magnetic field in the vicinity of the collision region is almost perfectly homogeneous, giving a helical track model. The “perigee” parametrization for such helical tracks was introduced in [3], with a correction in [4]. The track is parametrized around the point of closest approach, the perigee point v P, of the helix to the z-axis, which is also the direction of the beams and the magnetic field. The perigee point is expected to be close to the vertex. The five track parameters of the perigee parametrization are (see Fig. 8.2):

  1.

    The impact parameter 𝜖. By convention, the sign of 𝜖 is positive if the origin O lies to the left of the trajectory.

    Fig. 8.2
    figure 2

    A helical track in the projection to the (x, y)-plane. O: origin; C: circle center; P: perigee point; V: vertex; ρ: circle radius; t P: tangent at P; φ P: azimuth of track direction at P; t V: tangent at V; φ V: azimuth of track direction at V

  2.

    The azimuth φ P of the tangent to the track at P.

  3.

    The z-coordinate z P of the perigee point P.

  4.

    The polar angle 𝜗 of the helix with respect to the z-axis.

  5.

    The signed curvature κ. By convention, the sign of κ is positive if the trajectory is anti-clockwise.

With these definitions, the trajectory can be approximately parametrized in terms of a running parameter s, which is the distance from P along the projected helix:

$$\displaystyle \begin{aligned} x&\approx\epsilon\sin\varphi_{\mathrm{P}}+s\cos\varphi_{\mathrm{P}}-\frac{s^2\kappa}{2}\sin\varphi_{\mathrm{P}}, \end{aligned} $$
(8.48)
$$\displaystyle \begin{aligned} y&\approx-\epsilon\cos\varphi_{\mathrm{P}}+s\sin\varphi_{\mathrm{P}}+\frac{s^2\kappa}{2}\cos\varphi_{\mathrm{P}},\\ z&\approx z_{\mathrm{P}}+s\cot\vartheta. \end{aligned} $$
(8.49)

The track parameters q = (𝜖, φ P, 𝜗, z P, κ)T have to be expressed as a function of the coordinates v = (x V, y V, z V)T of the vertex V, and the track parameters p = (𝜗, φ V, κ)T at V. Note that 𝜗 and κ are invariant along the helix. As higher orders of κ can usually be neglected for tracks in collider experiments, the following functional dependence is obtained:

$$\displaystyle \begin{aligned} \epsilon&\approx-R-Q^2\kappa/2, \\ z_{\mathrm{P}}&\approx z_{\mathrm{V}}-Q(1-R\kappa)\cot\vartheta,\\ \varphi_{\mathrm{P}}&\approx\varphi_{\mathrm{V}}-Q\kappa, \end{aligned} $$
(8.50)

with

$$\displaystyle \begin{gathered} Q=x_{\mathrm{V}}\cos\varphi_{\mathrm{V}}+y_{\mathrm{V}}\sin\varphi_{\mathrm{V}},\ \; R=y_{\mathrm{V}}\cos\varphi_{\mathrm{V}}-x_{\mathrm{V}}\sin\varphi_{\mathrm{V}}. \end{gathered} $$
(8.51)

As 𝜗 and κ are invariant, their derivatives are trivial; the Jacobian of the remaining parameters (𝜖, z P, φ P) with respect to (v, p) at the lowest order is given, in transposed form, by:

$$\displaystyle \begin{gathered} {\frac{\partial(\epsilon,z_{\mathrm{P}},\varphi_{\mathrm{P}})}{\partial({{\boldsymbol{v}}},{{\boldsymbol{p}}})} }{{}^{\mathsf{T}}} = \begin{pmatrix} s & -tc & -\kappa c\\ -c & -ts & -\kappa s\\ 0 & 1 & 0 \\ 0 & Q(1+t^2) & 0\\ Q & -Rt & 1\\ -Q^2/2 & QRt & -Q \end{pmatrix}, \end{gathered} $$
(8.52)

with

$$\displaystyle \begin{gathered} c=\cos\varphi_{\mathrm{V}},\ \; s=\sin\varphi_{\mathrm{V}},\ \; t=\cot\vartheta. \end{gathered} $$
(8.53)

With these ingredients, the nonlinear regression fit (see Sect. 8.1.2.1) can be carried out.
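The mapping of Eqs. (8.50)–(8.51) from the vertex position and the direction angles at the vertex to the perigee parameters can be sketched as follows (names are illustrative; 𝜗 and κ are not returned, as they are unchanged along the helix):

```python
import numpy as np

def perigee_from_vertex(v, phi_V, theta, kappa):
    """Perigee parameters from vertex and momentum angles, Eqs. (8.50)-(8.51).

    Valid to first order in the curvature kappa."""
    x_V, y_V, z_V = v
    Q = x_V * np.cos(phi_V) + y_V * np.sin(phi_V)    # Eq. (8.51)
    R = y_V * np.cos(phi_V) - x_V * np.sin(phi_V)
    eps = -R - 0.5 * Q * Q * kappa                   # Eq. (8.50)
    z_P = z_V - Q * (1.0 - R * kappa) / np.tan(theta)
    phi_P = phi_V - Q * kappa
    return eps, z_P, phi_P
```

For a straight track (κ = 0) through (1, 0, z) heading in the +y direction, the origin lies to the left of the trajectory, so the impact parameter comes out positive, as required by the sign convention.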

2 Robust and Adaptive Vertex Fitting

2.1 Vertex Fit with M-Estimator

Estimators are called robust if they are insensitive to outlying observations [5, 6, 7]. The M-estimator, see Sect. 6.2.1, is a well-known robust estimator that can be implemented as an iteratively reweighted LS estimator that assigns smaller weights (larger uncertainties) to observations suspected to be outliers. One of the first attempts, if not the first, to make the vertex fit robust is the extension of the Kalman filter to an M-estimator [8] of the Huber type [5]. Before the fit, the parameters of each track are decorrelated by finding the orthogonal matrix U i that transforms the covariance matrix \({{{\boldsymbol {V}}}_{\hspace{-0.5pt} i}}\) to a diagonal matrix D i:

$$\displaystyle \begin{gathered} {{\boldsymbol{D}}}_i={{\boldsymbol{U}}}_i\hspace{0.5pt}{{{\boldsymbol{V}}}_{\hspace{-0.5pt} i}}\hspace{0.5pt}{{\boldsymbol{U}}}_i{{}^{\mathsf{T}}}. \end{gathered} $$
(8.54)

The measurement equation Eq. (8.13) is transformed by setting

$$\displaystyle \begin{gathered} {{{\boldsymbol{V}}}_{\hspace{-0.5pt} i}}\leftarrow{{\boldsymbol{D}}}_i,\ \; {{\boldsymbol{q}}}_i\leftarrow{{\boldsymbol{U}}}_i\hspace{0.5pt}{{\boldsymbol{q}}}_i,\ \; {{{\boldsymbol{A}}}}_i\leftarrow{{\boldsymbol{U}}}_i{{{\boldsymbol{A}}}}_i, \ \;{{\boldsymbol{B}}}_i\leftarrow{{\boldsymbol{U}}}_i{{\boldsymbol{B}}}_i,\ \; {{\boldsymbol{c}}}_i\leftarrow{{\boldsymbol{U}}}_i{{\boldsymbol{c}}}_i. \end{gathered} $$
(8.55)

The M-estimator is then computed by an iterated LS estimator, see Table 8.1. Vertex fits using M-estimators with other weight functions are described in [9, 10].
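The decorrelation of Eqs. (8.54)–(8.55) can be sketched with an eigendecomposition; since U i is orthogonal, the chi-square of any residual is unchanged by the transformation. A minimal NumPy illustration (interface not from the text):

```python
import numpy as np

def decorrelate_track(V, q, A, B, c):
    """Decorrelation step of the M-estimator fit, Eqs. (8.54)-(8.55).

    Finds the orthogonal matrix U that diagonalizes the track covariance
    V and applies it to the linearized measurement equation."""
    eigvals, eigvecs = np.linalg.eigh(V)  # V = eigvecs @ diag(eigvals) @ eigvecs.T
    U = eigvecs.T                         # U @ V @ U.T is diagonal
    D = np.diag(eigvals)
    return D, U @ q, U @ A, U @ B, U @ c
```

After the transformation, the measurement errors are uncorrelated, so the Huber-type reweighting can be applied component by component.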

Table 8.1 Algorithm: Vertex fit with M-estimator

2.2 Adaptive Vertex Fit with Annealing

The adaptive vertex fit (AVF) was introduced in [11] and further investigated in [12,13,14,15]. It can be interpreted either as an EM algorithm or as an M-estimator with a specific weight function [16, 17]; see also Table 6.1. In the AVF, tracks are down-weighted as a whole: the weight w i is a function of the distance of track i from the current vertex, measured by the chi-square statistic \({\chi ^2}_i\):

$$\displaystyle \begin{gathered}{} {\chi^2}_i={{\boldsymbol{r}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n}{{}^{\mathsf{T}}}\hspace{0.5pt}{{{\boldsymbol{G}}}_{i}}\hspace{0.5pt}{{\boldsymbol{r}}}_{i{\hspace{0.5pt}|\hspace{0.5pt}}{}n},\ \; w_i\left({\chi^2}_i\right)=\frac{\exp\left(-{\chi^2}_i/2T\right)}{\exp\left(-{\chi^2}_i/2T\right)+\exp\left(-{\chi^2_{\mathrm{c}}}/2T\right)}, \end{gathered} $$
(8.57)

where T is a temperature parameter. The weight w i can be interpreted as the probability that track i belongs to the vertex. The chi-square cut \({\chi ^2_{\mathrm {c}}}\) sets the threshold where the weight is equal to 0.5. Beyond this cut, a track is considered to be an outlier rather than an inlier.

The temperature T modifies the shape of the function in Eq. (8.57). At high temperature, the weight function varies very slowly; at low temperature, the weight is close to 1 for \({\chi ^2}_i\leq {\chi ^2_{\mathrm {c}}}\) and close to 0 for \({\chi ^2}_i>{\chi ^2_{\mathrm {c}}}\) (see Fig. 6.1c), with a sharp drop at \({\chi ^2}_i={\chi ^2_{\mathrm {c}}}\). The weights at low temperature can be used to identify secondary tracks in a fit of the primary vertex; see Sect. 7.3.3.
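The weight function of Eq. (8.57) can be sketched as follows; it is implemented here in the mathematically equivalent logistic form, which avoids underflow of both exponentials at low temperature (an implementation detail, not taken from the text):

```python
import numpy as np

def adaptive_weight(chi2, chi2_cut, T):
    """Adaptive track weight of Eq. (8.57).

    Equivalent logistic form: w = 1 / (1 + exp((chi2 - chi2_cut) / (2 T)))."""
    z = (np.asarray(chi2, dtype=float) - chi2_cut) / (2.0 * T)
    z = np.clip(z, -50.0, 50.0)  # guard against overflow at very low T
    return 1.0 / (1.0 + np.exp(z))
```

At the cut the weight is exactly 0.5; lowering T sharpens the transition towards a hard step, which is what the annealing schedule exploits.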

Similar to the M-estimator in Sect. 8.2.1, the adaptive vertex fit is implemented as an iterated LS estimator, which can be one of the methods described in Sect. 8.1. The temperature T can be used to employ an annealing procedure that helps to reach the globally optimal solution. The adaptive vertex fit is summarized in Table 8.2.

Table 8.2 Algorithm: Adaptive vertex fit with annealing

In order to get reasonable initial weights, the initial vertex v (0) has to be chosen carefully, preferably with a robust finder [18]. The annealing schedule and the constant \({\chi ^2_{\mathrm {c}}}\) have to be tuned on simulated data. A detailed study and useful hints can be found in [15].

2.3 Vertex Quality

The assessment of the quality of a fitted vertex is similar to the assessment of track quality; see Sect. 6.4. The primary criterion is the chi-square statistic of the vertex fit, or its p-value. If the p-value is unreasonably small, a search for outlying tracks can be started. There are several sources of outliers in the vertex fit:

  • Fake tracks and tracks that include unrecognized extraneous measurements or noise hits.

  • Tracks with an unrecognized interaction (kink) in the material.

  • Tracks with a covariance matrix that does not properly reflect the statistical uncertainty of the track parameters, for instance, because of an incorrect evaluation of material effects.

  • Tracks that belong to a different vertex, in particular to a nearby secondary vertex.

Just as in the case of the track fit, outliers can be detected by a test based on the residuals of the track with respect to the estimated vertex position. Outlying tracks that pass the track quality check are often candidates for inclusion in a secondary vertex; see Sects. 7.3.3 and 9.2.

3 Kinematic Fit

The vertex fit as described so far imposes purely geometrical constraints on the participating tracks, namely that they originate at the same point in space. Especially in the case of a secondary vertex, the vertex fit can be extended by imposing other geometrical or kinematical constraints. In the vertex fit of a photon conversion (see Sect. 9.4), the constraint can be imposed that the momentum vectors of the outgoing tracks are parallel. In the fit of a decay vertex (see Sects. 9.2 and 9.3), the laws of momentum and/or energy conservation can be imposed as constraints. In addition, assumptions about the mass of some or all of the participating particles can be included. The width of a resonance can be included by considering its mass as a virtual measurement with a standard error reflecting the width. Such fits are called kinematic fits.

In a kinematic fit, a track is most conveniently represented by a collection u of parameters that are physically meaningful and allow a simple formulation of the constraints [19]. If only kinematic constraints are present, u contains the energy and the Cartesian momentum components in the global coordinate system:

$$\displaystyle \begin{gathered} {{\boldsymbol{u}}}=(E,p_x,p_y,p_z){{}^{\mathsf{T}}}. \end{gathered} $$
(8.59)

For charged tracks, u is computed from the usual representation by a five-vector q, and the covariance matrix \({{{\boldsymbol {V}}}}={\mathsf {Var}\left [{{\boldsymbol {u}}}\right ]}\) is obtained by linearized error propagation. If the mass of the particle is assumed to be known and fixed, the rank of V is three, as the energy is a deterministic function of the momentum p and the mass m. If there is an independent measurement of the energy, for instance by a cluster in the calorimeter, the rank is four. For neutral tracks, u is computed from the calorimeter information. As there is no independent momentum measurement, the rank of V is three.
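For a track with fixed mass, the construction of u and the rank-deficient covariance described above can be sketched as follows (for brevity, the momentum is given directly in Cartesian components rather than via the five-vector q; names are illustrative):

```python
import numpy as np

def momentum_to_u(p, m, Vp):
    """Kinematic representation u = (E, px, py, pz), Eq. (8.59).

    Assumes a fixed mass m, so the energy is a deterministic function of
    the momentum and the 4x4 covariance of u has rank three."""
    p = np.asarray(p, dtype=float)
    E = np.sqrt(m * m + p @ p)
    u = np.concatenate(([E], p))
    J = np.vstack((p / E, np.eye(3)))  # Jacobian du/dp, shape (4, 3)
    Vu = J @ Vp @ J.T                  # linearized error propagation
    return u, Vu
```

The rank of the propagated covariance is three, in agreement with the statement in the text that the energy carries no independent information in this case.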

If in addition a vertex constraint is to be enforced, the location x, y, z in space, which can vary along the track, is appended to u, and the rank of V increases by two [19].

If n tracks with parameters u k and covariance matrices V k, k = 1, …, n, participate in the kinematic fit, their parameters are combined in a single column vector y:

$$\displaystyle \begin{gathered} {\boldsymbol{y}}=\begin{pmatrix}{{\boldsymbol{u}}}_1\\ \vdots\\ {{\boldsymbol{u}}}_n\end{pmatrix}. \end{gathered} $$
(8.60)

If the estimated u k are uncorrelated, their joint covariance matrix V is block-diagonal:

$$\displaystyle \begin{gathered} {\mathsf{Var}\left[{\boldsymbol{y}}\right]}={{{\boldsymbol{V}}}}={{\text{blkdiag}\hspace{0.5pt}({{{\boldsymbol{V}}}}_1,\ldots,{{{\boldsymbol{V}}}}_n)}}.{} \end{gathered} $$
(8.61)

If the momentum parts of the u k are the results of a preceding vertex fit, they are correlated, and their joint covariance matrix is given in Eqs. (8.40) and (8.42).
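Building y and the block-diagonal V of Eqs. (8.60) and (8.61) is straightforward; a Python sketch for the uncorrelated case (the helper name `stack_tracks` is an illustrative assumption):

```python
import numpy as np

def stack_tracks(us, Vs):
    """Stack the track parameter vectors u_1..u_n into y (Eq. 8.60)
    and their covariance matrices into the block-diagonal V (Eq. 8.61).
    Valid only if the estimated u_k are uncorrelated."""
    y = np.concatenate(us)
    dims = [Vk.shape[0] for Vk in Vs]
    V = np.zeros((sum(dims), sum(dims)))
    ofs = 0
    for Vk, d in zip(Vs, dims):
        V[ofs:ofs + d, ofs:ofs + d] = Vk
        ofs += d
    return y, V
```

If the momenta come from a preceding vertex fit, the off-diagonal blocks are not zero and would have to be filled in from Eqs. (8.40) and (8.42).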

The vector y is considered to be an unbiased observation of the vector α that satisfies r ≤ rank(V) kinematical or geometrical constraints:

$$\displaystyle \begin{gathered}{} {\boldsymbol{y}}={{\boldsymbol{\alpha}}}+{\boldsymbol{\varepsilon}},\ \; {\mathsf{E}\left[{\boldsymbol{\varepsilon}}\right]}=\mathbf{0},\ \;{\mathsf{Var}\left[{\boldsymbol{\varepsilon}}\right]}={{{\boldsymbol{V}}}}. \end{gathered} $$
(8.62)

The constraints are expressed by r equations h i(α) = 0, i = 1, …, r, which can be written compactly in the following vector form:

$$\displaystyle \begin{gathered} {{\boldsymbol{h}}}({{\boldsymbol{\alpha}}})=\mathbf{0},\ \;\mathrm{with}\ \;{{\boldsymbol{h}}}=\left(h_1({{\boldsymbol{\alpha}}}),\ldots,h_r({{\boldsymbol{\alpha}}})\right){{}^{\mathsf{T}}}. \end{gathered} $$
(8.63)

The function h is approximated by its first-order Taylor expansion at the expansion point α 0 = y, resulting in a set of linear constraints:

$$\displaystyle \begin{gathered} {{\boldsymbol{h}}}({{\boldsymbol{\alpha}}})\approx {{\boldsymbol{h}}}({{\boldsymbol{\alpha}}}_0)+{{\boldsymbol{H}}}({{\boldsymbol{\alpha}}}-{{\boldsymbol{\alpha}}}_0)={{\boldsymbol{h}}}({{\boldsymbol{\alpha}}}_0)+{{\boldsymbol{H}}}{{\boldsymbol{\alpha}}}-{{\boldsymbol{H}}}{{\boldsymbol{\alpha}}}_0 = {{\boldsymbol{H}}}{{\boldsymbol{\alpha}}}+{\boldsymbol{d}}=\mathbf{0}, \end{gathered} $$
(8.64)

with the Jacobian matrix H = ∂h∕∂α evaluated at α = α 0 and d = h(α 0) − H α 0. It is assumed that H has rank r in a sufficiently large neighbourhood of y.
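If an analytic Jacobian is inconvenient, H can be approximated by forward finite differences. This is a standard numerical technique, not part of the text; the helper name `numerical_jacobian` and the step size are assumptions.

```python
import numpy as np

def numerical_jacobian(h, alpha0, eps=1e-6):
    """Forward-difference approximation of H = dh/dalpha at alpha0.
    h maps an n-vector to the r-vector of constraint values."""
    alpha0 = np.asarray(alpha0, dtype=float)
    h0 = h(alpha0)
    H = np.zeros((h0.size, alpha0.size))
    for j in range(alpha0.size):
        step = np.zeros_like(alpha0)
        step[j] = eps
        H[:, j] = (h(alpha0 + step) - h0) / eps
    return H
```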

If V has full rank, the constrained LS estimate of α is obtained by minimizing the following objective function:

$$\displaystyle \begin{gathered} {\mathcal{S}}({{\boldsymbol{\alpha}}},{\boldsymbol{\lambda}})=\left({{\boldsymbol{\alpha}}}-{\boldsymbol{y}}\right){{}^{\mathsf{T}}}{{{\boldsymbol{G}}}}\left({{\boldsymbol{\alpha}}}-{\boldsymbol{y}}\right)+2\,{\boldsymbol{\lambda}}{{}^{\mathsf{T}}}\left({{\boldsymbol{H}}}{{\boldsymbol{\alpha}}}+{\boldsymbol{d}}\right), \end{gathered} $$
(8.65)

where G = V −1 and λ is the vector of Lagrange multipliers. Setting the gradient of \({\mathcal {S}}({{\boldsymbol {\alpha }}},{\boldsymbol {\lambda }})\) to zero results in the following system of linear equations:

$$\displaystyle \begin{aligned} {{{\boldsymbol{G}}}}\left({{\boldsymbol{\alpha}}}-{\boldsymbol{y}}\right)+{{\boldsymbol{H}}}{{}^{\mathsf{T}}}{\boldsymbol{\lambda}}&=\mathbf{0}, \end{aligned} $$
(8.66)
$$\displaystyle \begin{aligned} {{\boldsymbol{H}}}{{\boldsymbol{\alpha}}}+{\boldsymbol{d}}&=\mathbf{0}. \end{aligned} $$
(8.67)

The system can be explicitly solved for α, the solution being the estimate \({{\hat {\boldsymbol {\alpha }}}}\):

$$\displaystyle \begin{gathered} {{\hat{\boldsymbol{\alpha}}}}={\boldsymbol{y}}-{{{\boldsymbol{V}}}}\hspace{0.5pt}{{\boldsymbol{H}}}{{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{G}}}_{H}\left({{\boldsymbol{H}}}{\boldsymbol{y}}+{\boldsymbol{d}}\right),\ \;\mathrm{with}\ \;{{\boldsymbol{G}}}_{H}=\left({{\boldsymbol{H}}}\hspace{0.5pt}{{{\boldsymbol{V}}}}\hspace{0.5pt}{{\boldsymbol{H}}}{{}^{\mathsf{T}}}\right){{}^{-1}}.{} \end{gathered} $$
(8.68)

The solution is also valid for singular V as long as r ≤ rank(V), so that H V Hᵀ has full rank r and can be inverted. This can be proved by regularization of V; see Appendix B.

The constraints are re-expanded at the new expansion point \({{\boldsymbol {\alpha }}}_0={{\hat {\boldsymbol {\alpha }}}}\), and the fit is iterated until convergence. The covariance matrix \({{{\boldsymbol {V}}}_{{{\hat {\boldsymbol {\alpha }}}}}}\) of the final estimate \({{\hat {\boldsymbol {\alpha }}}}\) and the chi-square statistic of the fit are given by:

$$\displaystyle \begin{aligned} {{{\boldsymbol{V}}}_{{{\hat{\boldsymbol{\alpha}}}}}}&={{{\boldsymbol{V}}}}-{{{\boldsymbol{V}}}}\hspace{0.5pt}{{\boldsymbol{H}}}{{}^{\mathsf{T}}}\hspace{0.5pt}{{\boldsymbol{G}}}_{H}\hspace{0.5pt}{{\boldsymbol{H}}}\hspace{0.5pt}{{{\boldsymbol{V}}}}, \end{aligned} $$
(8.69)
$$\displaystyle \begin{aligned} {\chi^2}&=\left({{\boldsymbol{H}}}{\boldsymbol{y}}+{\boldsymbol{d}}\right){{}^{\mathsf{T}}}{{\boldsymbol{G}}}_{H}\left({{\boldsymbol{H}}}{\boldsymbol{y}}+{\boldsymbol{d}}\right). \end{aligned} $$
(8.70)

The χ 2 has r degrees of freedom and can be used to assess the goodness of the fit, provided that the linear approximation of the constraints is satisfactory and the distribution of the error term ε in Eq. (8.62) is close to normal.
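The iterated linearized fit of Eqs. (8.64)–(8.70) can be sketched compactly. This is a Python sketch under stated assumptions, not a production implementation: the function name `kinematic_fit`, the fixed iteration count, and the single invariant-mass constraint used in the test are illustrative choices; h and H must be supplied for the constraints at hand.

```python
import numpy as np

def kinematic_fit(y, V, h, H, n_iter=10):
    """Constrained LS fit: minimize (alpha - y)^T G (alpha - y) subject
    to h(alpha) = 0, via the iterated linearized solution of Eq. (8.68).
    h(alpha) returns the r constraint values, H(alpha) their Jacobian.
    Returns the estimate, its covariance (8.69), and the chi-square (8.70)."""
    alpha = y.copy()
    for _ in range(n_iter):
        Hm = H(alpha)                        # Jacobian at the expansion point
        d = h(alpha) - Hm @ alpha            # d = h(alpha_0) - H alpha_0
        G_H = np.linalg.inv(Hm @ V @ Hm.T)   # G_H = (H V H^T)^(-1)
        alpha = y - V @ Hm.T @ G_H @ (Hm @ y + d)   # Eq. (8.68), re-expanded
    V_alpha = V - V @ Hm.T @ G_H @ Hm @ V           # Eq. (8.69)
    chi2 = float((Hm @ y + d) @ G_H @ (Hm @ y + d)) # Eq. (8.70)
    return alpha, V_alpha, chi2
```

As a usage example, constraining a single u = (E, px, py, pz) to an assumed mass m amounts to h(α) = (E² − px² − py² − pz² − m²) with Jacobian H = (2E, −2px, −2py, −2pz); after a few iterations the fitted α satisfies the nonlinear constraint to high precision.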

A comprehensive software package for kinematic fitting, called KWFIT, can be found in [20]. In addition to stand-alone kinematic fits as described above, it allows fitting entire decay chains with multiple vertices. For a discussion of its features and of kinematic fitting in general, see [21]. More recent decay chain fitters are described in [22] and [23].