1 Introduction

Error propagation, as well as the averaging of results of individual measurements—at least in the context of strictly Gaussian errors, possibly with statistical or systematic correlations—is straightforward and covered in many textbooks,Footnote 1 and there seems to be no open issue, since all that is required is multivariate analysis applied to normal distributions. It is all the more surprising that—to the best of the authors' knowledge—no explicit analytical expression is available to compute, for a given set of measurements of some quantity with individual, generally correlated, errors of statistical and systematic nature, the statistical and systematic components of the uncertainty of the average.

That is to say, while in the Gaussian context it is clear how to obtain the average including its uncertainty, and that the total error ought to be the quadratic sum of the statistical and systematic (and perhaps other, such as theoretical) error components, formulae for these individual components and their derivation have not received much attention. But the proper disclosure of the statistical (random) error component relative to the systematic uncertainty can be important. For example, in the context of the design of future experimental facilities it is crucial to know how much precision (say, over the world average) can be gained by simply generating larger data samples, in contrast to possible technological or scientific breakthroughs. There are also more involved cases [3] of error propagation where the knowledge of the individual error components of intermediate averages is helpful, if not crucial. The systematic error is, of course, more troublesome, as it cannot be reducedFootnote 2 as straightforwardly by increasing the sample size N.

In the next Section, we review the standard procedure to average a number of experimental determinations of some observable quantity. We also mention an approximate method to obtain the statistical and systematic errors of an average of similar experiments, applicable when the ratios of the systematic to statistical components are comparable across measurements, or when the statistical errors are dominant.

Then we turn to the main point, the exact determination of the individual error components of an average. We show that in the absence of correlations the square of the statistical error (or of any other type of uncertainty) of the average is a weighted sum of the individual squares, with weights given by the inverse fourth powers of the individual total errors.

Then we turn to correlations, starting with the simplest case of two measurements for which we introduce the concepts of disparity and misalignment angles. Finally, we present exact relations for the case of more than two measurements, and address some problems that arise when new measurements are added to an existing average iteratively.

2 Simplified procedures

Suppose one is given a set of measurements of some quantity v, with central values \(v_i\), statistical (random) errors \(r_i\), and total systematic errors \(s_i\). For simplicity, we are going to assume that the \(r_i\) and \(s_i\) are Gaussian distributed (the generalization to other error distributions is straightforward), in which case the total errors of the individual measurements are given by

$$\begin{aligned} t_i = \sqrt{r_i^2 + s_i^2}\ . \end{aligned}$$
(1)

If we furthermore temporarily assume that the measurements are uncorrelated, then the central value \(\bar{v}\) of their combination is given by the precision weighted average,

$$\begin{aligned} \bar{v} = {\sum _i v_i t_i^{-2} \over \sum _i t_i^{-2}} \ , \end{aligned}$$
(2)

with total error

$$\begin{aligned} \bar{t} = {1 \over \sqrt{\sum _i t_i^{-2}}} \ . \end{aligned}$$
(3)

Similarly, the statistical component \(\bar{r}\) of \(\bar{t}\) can often be approximated by

$$\begin{aligned} \bar{r} = {1 \over \sqrt{\sum _i r_i^{-2}}} \ . \end{aligned}$$
(4)

The systematic component \(\bar{s}\) of \(\bar{t}\) is then obtained from

$$\begin{aligned} \bar{s}= \sqrt{\bar{t}^2 - \bar{r}^2} \ . \end{aligned}$$
(5)

For example, two measurements with

$$\begin{aligned} r_1 = s_2 = 30, \qquad r_2 = s_1 = 40, \qquad t_1 = t_2 = 50, \end{aligned}$$
(6)

would result in

$$\begin{aligned} \bar{r} = 24, \qquad \bar{s} = \sqrt{674} \approx 26 \approx \bar{r}. \end{aligned}$$
(7)

Notice that, while the individual errors in Eq. (6) are symmetric under the simultaneous exchange of the statistical and systematic errors (we recall that all \(r_i\) and \(s_i\) are assumed Gaussian) and of the labels of the two measurements, the result (7) does not exhibit the corresponding symmetry, which would imply \(\bar{r} = \bar{s}\) exactly. The exact procedure introduced in the next section does indeed yield \(\bar{r} = \bar{s}\) in this example.

Now consider the case where one of the systematic errors, say \(s_1\), grows larger and larger, with eventually \(s_1 \rightarrow \infty \). Then the weight of the first measurement approaches zero, and \(\bar{t} \rightarrow t_2\), as expected. However, one would also expect that \(\bar{r} \rightarrow r_2\) and \(\bar{s} \rightarrow s_2\), while instead \(\bar{r} < r_2\) remains constant and \(\bar{s} \rightarrow \sqrt{1924} \approx 44 > s_2\). Thus, one would face the unreasonable result that averaging some measurement with an irrelevant constraint (one with infinite uncertainty) decreases the statistical error component and increases the systematic one, leaving only the total error invariant. In other words, if in a set of measurements there is one with negligible statistical error, then the average would also have vanishing statistical error, regardless of how unimportant that one measurement is compared to the others. Clearly, Eq. (4) is then unsuitable even as an approximation.
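For concreteness, the following Python sketch (the function name and the placeholder central values are ours, purely illustrative) implements Eqs. (2)–(5), reproducing both the numbers in Eq. (7) and the pathological limit just described:

```python
import numpy as np

def simple_average(v, r, s):
    """Precision weighted average with the approximate error splitting
    of Eqs. (2)-(5); v, r, s are arrays of central values and errors."""
    v, r, s = (np.asarray(a, float) for a in (v, r, s))
    w = 1.0 / (r**2 + s**2)              # precisions t_i^{-2}, Eq. (1)
    vbar = np.sum(w * v) / np.sum(w)     # Eq. (2)
    tbar = 1.0 / np.sqrt(np.sum(w))      # Eq. (3)
    rbar = 1.0 / np.sqrt(np.sum(r**-2))  # Eq. (4), approximate
    sbar = np.sqrt(tbar**2 - rbar**2)    # Eq. (5)
    return vbar, tbar, rbar, sbar

# the example of Eq. (6): rbar = 24, sbar = sqrt(674) ~ 26
print(simple_average([0.0, 0.0], [30.0, 40.0], [40.0, 30.0]))

# the pathology: as s_1 grows, rbar stays at 24 < r_2 = 40,
# while sbar -> sqrt(1924) ~ 44 > s_2 = 30
print(simple_average([0.0, 0.0], [30.0, 40.0], [1.0e6, 30.0]))
```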

Table 1 Central value \(v_i\), random error \(r_i\), systematic uncertainty \(s_i\), total error \(t_i\), and uncorrelated error component \(u_i\), for the measurements of the quantity \({\mathcal A}_\tau \) by each of the four LEP collaborations
Table 2 Same as Table 1, but for the weak mixing angle determinations by ATLAS

One can easily extend these considerations to the case where the individual measurements share a common contribution c entering the systematic error. The precision weighted average (2) and total error (3) are then to be replaced by

$$\begin{aligned} \bar{v} = {\sum _i v_i u_i^{-2} \over \sum _i u_i^{-2}}\ , \qquad \bar{t} = \sqrt{{1 \over \sum _i u_i^{-2}} + c^2}, \end{aligned}$$
(8)

where the uncorrelated error components are given by

$$\begin{aligned} u_i = \sqrt{t_i^2 - c^2}, \qquad \bar{u} = \sqrt{\bar{t}^2 - c^2} = {1\over \sqrt{\sum _i u_i^{-2}}}\ , \end{aligned}$$
(9)

and where \(t_i^2 \ge 0\) requires \(c^2 \ge - u_i^2\) for all i.
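A minimal Python sketch of Eqs. (8) and (9) (the function name is ours) reads:

```python
import numpy as np

def average_with_common(v, t, c2):
    """Average when the only correlation is a common contribution c
    entering every systematic error, Eqs. (8) and (9); c2 = c**2 may
    be negative, provided c2 >= -u_i**2 for all i."""
    v, t = np.asarray(v, float), np.asarray(t, float)
    w = 1.0 / (t**2 - c2)                 # inverse squared u_i, Eq. (9)
    vbar = np.sum(w * v) / np.sum(w)      # Eq. (8)
    tbar = np.sqrt(1.0 / np.sum(w) + c2)  # Eq. (8)
    return vbar, tbar
```

Passing the squared quantity \(c^2\) rather than c keeps the anti-correlated case \(c^2 < 0\) accessible.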

The general case of correlated errors will be dealt with later, but we note that the case of two measurements with Pearson’s correlation coefficient \(\rho \) can always be brought to this form with \(c^2\) given by

$$\begin{aligned} c^2 = t_i^2 - u_i^2 = \rho \, t_1 t_2 = \rho \, \sqrt{u_1^2 + c^2} \sqrt{u_2^2 + c^2}. \end{aligned}$$
(10)

A proper (normalizable) probability distribution requires \(|\rho | \le 1\), so that from Eq. (10),

$$\begin{aligned} c^2 \ge - {u_1^2 u_2^2 \over {u_1^2} + {u_2^2}}\ , \end{aligned}$$
(11)

guaranteeing that \(\bar{t}\) is real. On the other hand, \(u_1\) or \(u_2\), as well as \(\bar{u}\), in Eq. (9) may become imaginary provided that

$$\begin{aligned} \rho > {t_1\over t_2}\qquad \ \mathrm{or} \qquad \rho > {t_2\over t_1} \end{aligned}$$
(12)

in which case the first or second measurement, respectively, contributes with negative weight, and \(\bar{v}\) no longer lies between \(v_1\) and \(v_2\). In this situation, one may rather (but equivalently) regard the measurement with negative weight as a measurement of some nuisance parameter related to c. Replacing the inequalities (12) by equalities gives rise to an infinite weight (one of the \(u_i = 0\)), as well as \(\bar{u} = 0\) and \(\bar{t} = c\).
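In code, the mapping of Eq. (10) and the negative-weight condition (12) might be checked as follows (a sketch with our own naming):

```python
def c2_from_rho(rho, t1, t2):
    """Common-contribution form of a two-measurement correlation,
    Eq. (10); warns when condition (12) signals a negative weight."""
    c2 = rho * t1 * t2
    if rho > min(t1, t2) / max(t1, t2):   # equivalent to Eq. (12)
        print("negative weight: the average leaves the interval [v1, v2]")
    return c2
```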

As a concrete example, each of the four experimental collaborations at the Z boson factory LEP 1 [4–7] has measured some quantity \({\mathcal A}_\tau \) (related to the polarization of final-state \(\tau \) leptons produced in Z decays) with the results shown in Table 1. A number of uncertainties affected the four measurements in a similar way, leading to relatively weak correlations [8] which, while not quite corresponding to the form (8), (9), can be well approximated by it when using the average of the square roots of the off-diagonal entries of the covariance matrix, \(c \approx 0.0016\).

The values in the last line are \(\bar{v}\), \(\bar{r}\), \(\bar{s}\), \(\bar{t}\) and \(\bar{u}\) as calculated from Eqs. (4), (5), (8) and (9). \(\bar{v}\), \(\bar{r}\) and \(\bar{s}\) agree with Table 4.3 and \(\bar{t}\) agrees with Eq. (4.9) of the LEP combination in Ref. [8].

Table 2 shows the more recent example of the determination of the weak mixing angle [9], which is based on purely central (CC) electron events, events with a forward electron (CF), as well as muon pairs. Here the average of the square roots of the off-diagonal entries of the covariance matrix amounts to \(c \approx 0.0010\). This is an example where the dominant uncertainty is from common systematics, namely from the imperfectly known parton distribution functions, which affect the three channels in very similar ways.

We will return to these examples after deriving exact alternatives to formula (4).

3 Derivatively weighted errors

Our starting point is the basic property of a statistical error to scale as \(N^{-1/2}\) with the sample size. To implement this, we rewrite Eq. (1) as

$$\begin{aligned} t_i = \sqrt{\epsilon ^2 r_i^2 + s_i^2} |_{\epsilon =1}. \end{aligned}$$
(13)

Thus, the statistical error satisfies the relation,

$$\begin{aligned} r_i = \sqrt{t_i {d t_i\over d\epsilon } |_{\epsilon =1}}. \end{aligned}$$
(14)

In the absence of correlations we can use Eq. (3), and demand that analogously,

$$\begin{aligned} \bar{r}^2 = \bar{t}\, {d \bar{t} \over d\epsilon } |_{\epsilon =1} = \bar{t}^{\, 4} \sum _i t_i^{-3} {d t_i\over d\epsilon } |_{\epsilon =1} = \sum _i r_i^2 \left( {\bar{t} \over t_i} \right) ^4. \end{aligned}$$
(15)

Notice that Eq. (4) can be recovered from Eq. (15) upon substituting \(t_i \rightarrow r_i\) and \(\bar{t} \rightarrow \bar{r}\). Eq. (15) means that the relative statistical error of the combination, \(\bar{x}\), is given by the precision weighted average

$$\begin{aligned} \bar{x}^2 = {\sum _i x_i^2 t_i^{-2} \over \sum _i t_i^{-2}}, \end{aligned}$$
(16)

where

$$\begin{aligned} x_i \equiv {r_i \over t_i}\ , \qquad \bar{x} \equiv {\bar{r} \over \bar{t}}. \end{aligned}$$
(17)

Furthermore, giving the systematic components a similar treatment, we find

$$\begin{aligned} \bar{s}^2 = \sum _i s_i^2 \left( {\bar{t} \over t_i} \right) ^4, \end{aligned}$$
(18)

so that the expected symmetry between the two types of uncertainty becomes manifest, and moreover, Eq. (5) now follows directly from Eqs. (15) and (18), rather than being enforced. The central result is that for uncorrelated errors, the squares of the statistical and systematic components (or those of any other type) of an average are the corresponding individual squares weighted by the inverse of the fourth power of the individual total errors, or equivalently, weighted by the square of the individual precisions \(t_i^{-2}\).
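The following sketch (ours, in the same illustrative spirit as before) implements the fourth-power weighting of Eqs. (15) and (18), and confirms that the problems encountered in Sect. 2 are cured:

```python
import numpy as np

def exact_components(r, s):
    """Exact error components of an uncorrelated average: the squared
    components are weighted by (tbar/t_i)**4, Eqs. (15) and (18)."""
    r, s = np.asarray(r, float), np.asarray(s, float)
    t2 = r**2 + s**2
    tbar2 = 1.0 / np.sum(1.0 / t2)        # Eq. (3), squared
    w4 = (tbar2 / t2)**2                  # the weights (tbar/t_i)**4
    rbar = np.sqrt(np.sum(w4 * r**2))     # Eq. (15)
    sbar = np.sqrt(np.sum(w4 * s**2))     # Eq. (18)
    return np.sqrt(tbar2), rbar, sbar

# the symmetric example of Eq. (6) now yields rbar = sbar = 25 exactly
print(exact_components([30.0, 40.0], [40.0, 30.0]))

# and s_1 -> infinity now gives tbar -> t_2, rbar -> r_2, sbar -> s_2
print(exact_components([30.0, 40.0], [1.0e6, 30.0]))
```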

Returning to the case where the only source of correlation is a common contribution \(c \ne 0\) equally affecting all measurements, we find from Eq. (8),

$$\begin{aligned} \bar{r}^2 = \sum _i r_i^2 \left( {\bar{u} \over u_i} \right) ^4, \qquad \bar{y}^2 = {\sum _i y_i^2 u_i^{-2} \over \sum _i u_i^{-2}}, \end{aligned}$$
(19)

where

$$\begin{aligned} y_i \equiv {r_i \over u_i}\ , \qquad \bar{y} \equiv {\bar{r} \over \bar{u}}. \end{aligned}$$
(20)
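A corresponding sketch of Eq. (19), with the common contribution again passed in squared form, reads:

```python
import numpy as np

def exact_components_common(r, t, c2):
    """Statistical component in the presence of a common contribution,
    Eq. (19): the r_i**2 are now weighted by (ubar/u_i)**4."""
    r, t = np.asarray(r, float), np.asarray(t, float)
    u2 = t**2 - c2
    ubar2 = 1.0 / np.sum(1.0 / u2)
    rbar = np.sqrt(np.sum((ubar2 / u2)**2 * r**2))  # Eq. (19)
    tbar = np.sqrt(ubar2 + c2)                      # Eq. (8)
    return rbar, np.sqrt(tbar**2 - rbar**2)         # sbar via Eq. (5)
```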

Applied to the case of \({\mathcal A}_\tau \) measurements we now find

$$\begin{aligned} \bar{r} = 0.0035, \qquad \bar{s} = \sqrt{\bar{t}^2 - \bar{r}^2} = 0.0026, \end{aligned}$$
(21)

which agree with the approximate numbers in Table 1, not exactly, but within round-off precision.

4 Bivariate error distributions

As a preparation for the most general case of N measurements with arbitrary correlation coefficients, we first discuss in some detail the case \(N = 2\). Recall that the covariance matrix in this case reads

$$\begin{aligned} T \equiv \begin{pmatrix} t_1^2 &{}\quad \rho t_1 t_2 \\ \rho t_1 t_2 &{}\quad t_2^2 \end{pmatrix} \equiv \begin{pmatrix} t_1^2 &{}\quad c^2 \\ c^2 &{}\quad t_2^2 \end{pmatrix}. \end{aligned}$$
(22)

The precision weighted average is given by the expression,

$$\begin{aligned} \bar{v} = \frac{v_1 t_2^2 + v_2 t_1^2 - (v_1 + v_2) c^2}{t_1^2 + t_2^2 - 2 c^2} = \frac{v_1 + \omega \, v_2}{1 + \omega }, \end{aligned}$$
(23)

where

$$\begin{aligned} \omega \equiv \frac{t_1^2 - c^2}{t_2^2 - c^2}\ , \qquad c^2 = \frac{t_1^2 - \omega t_2^2}{1 - \omega }, \end{aligned}$$
(24)

obtained by maximizing the likelihood, which follows a bivariate Gaussian distribution,

$$\begin{aligned} {\mathcal L} \propto e^{- \chi ^2/2}, \end{aligned}$$
(25)

where

$$\begin{aligned} \chi ^2 = \vec v^{\, T} T^{-1} \vec v\ , \qquad \vec v = \begin{pmatrix} v_1 - \bar{v} \\ v_2 - \bar{v} \end{pmatrix} . \end{aligned}$$
(26)

The one standard deviation total error \(\bar{t}\) is defined by

$$\begin{aligned} \Delta \chi ^2 \equiv \chi ^2(\bar{v} + \bar{t}) - \chi ^2(\bar{v}) \overset{!}{=} 1, \end{aligned}$$
(27)

which results in

$$\begin{aligned} \bar{t}= & {} \sqrt{\frac{1 - \rho ^2}{t_1^{-2} + t_2^{-2} - 2 \rho \, t_1^{-1} t_2^{-1}}}\nonumber \\= & {} \sqrt{\frac{t_1^2 t_2^2 - c^4}{t_1^2 + t_2^2 - 2 c^2}} = \sqrt{\frac{t_1^2 - \omega ^2 t_2^2}{1 - \omega ^2}}, \end{aligned}$$
(28)

or conversely,

$$\begin{aligned} \omega = \sqrt{\frac{t_1^2 - \bar{t}^2}{t_2^2 - \bar{t}^2}}, \qquad c^2 = \bar{t}^2 - \sqrt{(t_1^2 - \bar{t}^2)(t_2^2 - \bar{t}^2)}\ . \end{aligned}$$
(29)

Equation (29) is useful in practice when one needs to recover the effective correlation from a pair of measurement errors together with the error of their combination.
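As a sketch (assuming no negative weights, i.e., \(\bar{t} \le \min (t_1, t_2)\)), Eq. (29) might be coded as:

```python
import math

def recover_correlation(t1, t2, tbar):
    """Invert a published combination via Eq. (29): recover the weight
    factor omega and c**2 from the two total errors t1, t2 and the
    combined error tbar."""
    a, b = t1**2 - tbar**2, t2**2 - tbar**2
    omega = math.sqrt(a / b)            # Eq. (29), first relation
    c2 = tbar**2 - math.sqrt(a * b)     # Eq. (29), second relation
    return omega, c2

# two uncorrelated errors of 50 combine to 50/sqrt(2); c**2 = 0 is recovered
print(recover_correlation(50.0, 50.0, 50.0 / math.sqrt(2.0)))
```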

We now turn to the generalization of Eq. (15) in the presence of a systematic correlation. When applying our method of derivatively weighted errors to Eq. (28) it is important to keep \(c^2 = \rho t_1 t_2\) fixed (this would be different in the presence of a statistical correlation). Doing this, we obtain

$$\begin{aligned} \bar{r} = \frac{\sqrt{r_1^2 (t_2^2 - c^2)^2 + r_2^2 (t_1^2 - c^2)^2}}{t_1^2 + t_2^2 - 2 c^2} = \frac{\sqrt{r_1^2 + \omega ^2 r_2^2}}{1 + \omega }. \end{aligned}$$
(30)

For the systematic component we find

$$\begin{aligned} \bar{s} = \frac{\sqrt{s_1^2 + 2 \omega c^2 + \omega ^2 s_2^2}}{1 + \omega }, \end{aligned}$$
(31)

and we also note that

$$\begin{aligned} \bar{u} ^2 = \frac{\omega }{1 - \omega ^2} (t_2^2 - t_1^2). \end{aligned}$$
(32)

More generally, one can compute the error contribution \(\bar{q}\) of any individual source of uncertainty q to the total error as

$$\begin{aligned} \bar{q} = \frac{\sqrt{q_1^2 + 2 \omega c_q^2 + \omega ^2 q_2^2}}{1 + \omega }, \end{aligned}$$
(33)

where \(c_q^2\) is the contribution of q to \(c^2\) with the constraint

$$\begin{aligned} \sum _q c_q^2 = c^2. \end{aligned}$$
(34)

If the two uncertainties \(q_i\) are fully correlated or anti-correlated between the two measurements, then

$$\begin{aligned} c_q^2 = \pm q_1 q_2\ , \qquad \bar{q} = \frac{q_1 \pm \omega q_2}{1 + \omega }, \end{aligned}$$
(35)

where the minus sign corresponds to anti-correlation.
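The bivariate relations (24), (30), (31) and (33) are summarized in the following sketch (our naming conventions):

```python
import math

def bivariate_components(r, s, t, c2):
    """Error components of a two-measurement average with the
    systematic correlation c2 = c**2 held fixed; r = (r1, r2), etc."""
    omega = (t[0]**2 - c2) / (t[1]**2 - c2)                       # Eq. (24)
    rbar = math.sqrt(r[0]**2 + omega**2 * r[1]**2) / (1 + omega)  # Eq. (30)
    sbar = math.sqrt(s[0]**2 + 2.0 * omega * c2
                     + omega**2 * s[1]**2) / (1 + omega)          # Eq. (31)
    return omega, rbar, sbar

def source_component(q, omega, cq2):
    """Contribution of a single error source q to the average, Eq. (33);
    cq2 = +/- q[0]*q[1] for full (anti-)correlation, Eq. (35)."""
    return math.sqrt(q[0]**2 + 2.0 * omega * cq2
                     + omega**2 * q[1]**2) / (1 + omega)
```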

The formalism is now general enough to allow for statistical correlations as well. As we will illustrate later, knowing all the \(\bar{q}\) is particularly useful if one wishes to successively include additional measurements in a combination—one by one—rather than having to deal with a multi-dimensional covariance matrix. This situation frequently arises in historical contexts, when new measurements add information to a set of older ones rather than superseding them. But there is a problematic issue with this approach, which apparently is not widely appreciated.

5 Disparity and misalignment angles

Continuing with the case of two measurements, we can relate \(\rho \) to the rotation angle necessary to diagonalize the matrix T. If we define an angle \(\beta \) quantifying the disparity of the total errors of two measurements through

$$\begin{aligned} \tan \frac{\beta }{2} \equiv \frac{t_1 - t_2}{t_1 + t_2}, \end{aligned}$$
(36)

then the diagonal form of T is \(R T R^T\) with

$$\begin{aligned} R \equiv \begin{pmatrix} \cos \frac{\alpha }{2} &{}\quad \sin \frac{\alpha }{2} \\ - \sin \frac{\alpha }{2} &{}\quad \cos \frac{\alpha }{2} \end{pmatrix} \end{aligned}$$
(37)

and

$$\begin{aligned} \tan \alpha = \rho \cot \beta , \end{aligned}$$
(38)

where

$$\begin{aligned} - \frac{\pi }{2} \le \alpha \le \frac{\pi }{2}, \qquad - \frac{\pi }{2} \le \beta \le \frac{\pi }{2}. \end{aligned}$$
(39)

The angle \(\alpha \) may be interpreted as a measure of the misalignment of the two measurements with respect to the primary observable of interest v. Uncorrelated measurements of v are aligned (\(\rho = \alpha = 0\)), while the case \(|\rho | \gg |\tan \beta |\) reflects a high degree of misalignment. Indeed, in the extreme case where \(\beta = 0\) (\(|\alpha | = 90^\circ \)), two correlated measurements (\(\rho \ne 0\)) of the same quantity v are equivalent to two uncorrelated measurements, only one of which has any sensitivity to v at all. Reaching this decorrelated configuration involves subtle cancellations between correlations and anti-correlations of the statistical and systematic error components of the original measurements.

We can now express the weight factor \(\omega \) in terms of the disparity and misalignment angles \(\beta \) and \(\alpha \),

$$\begin{aligned} \omega = \frac{1 + \sin \beta (1 - \tan \alpha )}{1 - \sin \beta (1 + \tan \alpha )} = \frac{1 + \sin \beta - \rho \cos \beta }{1 - \sin \beta - \rho \cos \beta }. \end{aligned}$$
(40)

In the case \(\rho = \alpha = 0\) this reduces to

$$\begin{aligned} \omega = \tan ^2 \left( \frac{\beta }{2} + \frac{\pi }{4} \right) , \end{aligned}$$
(41)

and Eq. (23) now reads

$$\begin{aligned} \bar{v} = \frac{v_1 + v_2}{2} - \sin \beta \, \frac{v_1 - v_2}{2}. \end{aligned}$$
(42)

One can write equations of the form (41) and (42) for \(\rho \ne 0\) as well, with a shifted angle \(\bar{\beta }\) related to \(\beta \) by

$$\begin{aligned} \csc \bar{\beta }= \csc \beta - \tan \alpha . \end{aligned}$$
(43)

However, this ceases to work in the presence of a negative weight (\(\omega < 0\)), in which case one would need to replace the trigonometric functions by hyperbolic ones.
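The following sketch computes both angles and verifies Eq. (40) against the direct expression (24); the numerical inputs are illustrative:

```python
import math

def angles_and_weight(t1, t2, rho):
    """Disparity angle beta, Eq. (36), misalignment angle alpha,
    Eq. (38), and the weight factor omega of Eq. (40)."""
    beta = 2.0 * math.atan2(t1 - t2, t1 + t2)               # Eq. (36)
    if math.sin(beta) == 0.0:                               # equal total errors
        alpha = math.copysign(math.pi / 2.0, rho) if rho else 0.0
    else:
        alpha = math.atan(rho / math.tan(beta))             # Eq. (38)
    sb, cb = math.sin(beta), math.cos(beta)
    omega = (1.0 + sb - rho * cb) / (1.0 - sb - rho * cb)   # Eq. (40)
    return beta, alpha, omega

t1, t2, rho = 3.0, 4.0, 0.5
_, _, omega = angles_and_weight(t1, t2, rho)
c2 = rho * t1 * t2                                          # Eq. (10)
print(omega, (t1**2 - c2) / (t2**2 - c2))                   # both give 0.3
```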

6 Multivariate error distributions

To treat cases of more than two measurements with generic correlations, one can choose one of two strategies. Either one effectively reduces the procedure to cases of just two measurements (in general at the price of some precision loss) by iteratively including additional measurements, or one deals with a multi-dimensional covariance matrix.

Table 3 Central values and breakdown of uncertainties (\(\times 10^4\)) of the weak mixing angle determinations by ATLAS

We first discuss the latter approach, starting with the trivariate case where

$$\begin{aligned} T \equiv \begin{pmatrix} t_1^2 &{}\quad \rho _3 t_1 t_2 &{}\quad \rho _2 t_1 t_3 \\ \rho _3 t_1 t_2 &{}\quad t_2^2 &{}\quad \rho _1 t_2 t_3 \\ \rho _2 t_1 t_3 &{}\quad \rho _1 t_2 t_3 &{}\quad t_3^2 \end{pmatrix} \equiv \begin{pmatrix} t_1^2 &{}\quad c^2_3 &{}\quad c^2_2 \\ c^2_3 &{}\quad t_2^2 &{}\quad c^2_1 \\ c^2_2 &{}\quad c^2_1 &{}\quad t_3^2 \end{pmatrix}. \end{aligned}$$
(44)

The average can be written as

$$\begin{aligned} \bar{v} = \frac{\omega _1 v_1 + \omega _2 v_2 + \omega _3 v_3}{\omega _1 + \omega _2 + \omega _3} \end{aligned}$$
(45)

with

$$\begin{aligned} \omega _1 \equiv (t_2^2 - c_3^2)(t_3^2 - c_2^2) - (c_1^2 - c_2^2)(c_1^2 - c_3^2), \end{aligned}$$
(46)
$$\begin{aligned} \omega _2 \equiv (t_1^2 - c_3^2)(t_3^2 - c_1^2) - (c_2^2 - c_1^2)(c_2^2 - c_3^2), \end{aligned}$$
(47)
$$\begin{aligned} \omega _3 \equiv (t_1^2 - c_2^2)(t_2^2 - c_1^2) - (c_3^2 - c_1^2)(c_3^2 - c_2^2). \end{aligned}$$
(48)

The total error is given by

$$\begin{aligned} \bar{t} = \sqrt{\frac{\det T}{\omega _1 + \omega _2 + \omega _3}}, \end{aligned}$$
(49)

and for its statistical and systematic components we find (in the absence of statistical correlations),

$$\begin{aligned} \bar{r} = \frac{\sqrt{\omega _1^2 r_1^2 + \omega _2^2 r_2^2 + \omega _3^2 r_3^2}}{\omega _1 + \omega _2 + \omega _3}, \qquad \bar{s} = \frac{\sqrt{\sum _i \omega _i^2 s_i^2 + \sum _{i \ne j} \omega _i \omega _j T_{ij}}}{\omega _1 + \omega _2 + \omega _3}, \end{aligned}$$
(50)

respectively. The generalization of Eq. (33) is now also straightforward; e.g., in the case of 100 % correlation between the three measurements we have

$$\begin{aligned} \bar{q} = \frac{\omega _1 q_1 + \omega _2 q_2 + \omega _3 q_3}{\omega _1 + \omega _2 + \omega _3}. \end{aligned}$$
(51)
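For reference, a sketch of the full trivariate combination, Eqs. (45)–(50), might read as follows (statistical correlations assumed absent, as in the text):

```python
import numpy as np

def trivariate_average(v, r, s, T):
    """N = 3 combination from the covariance matrix T of Eq. (44),
    via Eqs. (45)-(50); statistical correlations are assumed absent."""
    v, r, s = (np.asarray(a, float) for a in (v, r, s))
    d = np.diag(T)                               # the t_i^2
    c1, c2, c3 = T[1, 2], T[0, 2], T[0, 1]       # the c_i^2 of Eq. (44)
    w = np.array([
        (d[1] - c3) * (d[2] - c2) - (c1 - c2) * (c1 - c3),   # Eq. (46)
        (d[0] - c3) * (d[2] - c1) - (c2 - c1) * (c2 - c3),   # Eq. (47)
        (d[0] - c2) * (d[1] - c1) - (c3 - c1) * (c3 - c2),   # Eq. (48)
    ])
    W = w.sum()
    vbar = np.dot(w, v) / W                      # Eq. (45)
    tbar = np.sqrt(np.linalg.det(T) / W)         # Eq. (49)
    rbar = np.sqrt(np.dot(w**2, r**2)) / W       # Eq. (50)
    off = np.dot(w, T @ w) - np.dot(w**2, d)     # sum_{i != j} w_i w_j T_ij
    sbar = np.sqrt(np.dot(w**2, s**2) + off) / W # Eq. (50)
    return vbar, tbar, rbar, sbar
```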

Analogous expressions hold for cases with \(N > 3\) measurements. For example, the covariance matrix for the case of \(N=4\) reads

$$\begin{aligned} T \equiv \begin{pmatrix} t_1^2 &{}\quad c_{12}^2 &{}\quad c_{13}^2 &{}\quad c_{14}^2 \\ c_{12}^2 &{}\quad t_2^2 &{}\quad c_{23}^2 &{}\quad c_{24}^2 \\ c_{13}^2 &{}\quad c_{23}^2 &{}\quad t_3^2 &{}\quad c_{34}^2 \\ c_{14}^2 &{}\quad c_{24}^2 &{}\quad c_{34}^2 &{}\quad t_4^2 \end{pmatrix}. \end{aligned}$$
(52)

All that remains to be computed are the weight factors \(\omega _i\). We found a convenient expression for them, e.g.,

$$\begin{aligned} \omega _1 = \begin{vmatrix} t_2^2 - c_{12}^2&\quad c_{23}^2 - c_{12}^2&\quad c_{24}^2 - c_{12}^2 \\ c_{23}^2 - c_{13}^2&\quad t_3^2 - c_{13}^2&\quad c_{34}^2 - c_{13}^2 \\ c_{24}^2 - c_{14}^2&\quad c_{34}^2 - c_{14}^2&\quad t_4^2 - c_{14}^2 \end{vmatrix}. \end{aligned}$$
(53)

Thus, the \(\omega _i\) can be obtained by computing the determinant of a matrix which is constructed by subtracting the ith column from each of the other columns (or the ith row from each of the other rows) and then removing the ith row and column. The reader is now equipped to handle cases of any N exactly.
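In code, this construction, together with a numerical cross-check against the standard weights (which are proportional to the row sums of \(T^{-1}\)), might look as follows; the covariance entries are illustrative, not taken from any experiment:

```python
import numpy as np

def weight(T, i):
    """Weight factor omega_i for an N x N covariance matrix T, following
    the construction around Eq. (53): subtract the ith column from every
    other column, delete the ith row and column, take the determinant."""
    M = T - T[:, [i]]                   # column subtraction
    M = np.delete(np.delete(M, i, axis=0), i, axis=1)
    return np.linalg.det(M)

T = np.array([[ 4.0, 1.0,  0.5,  0.2],
              [ 1.0, 9.0,  2.0,  0.8],
              [ 0.5, 2.0, 16.0,  1.5],
              [ 0.2, 0.8,  1.5, 25.0]])   # illustrative numbers
w = np.array([weight(T, i) for i in range(4)])
print(w / w.sum())
print(np.linalg.inv(T).sum(axis=1) / np.linalg.inv(T).sum())  # should agree
```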

The alternative strategy to compute averages is to add more measurements iteratively. We illustrate this using the example of the ATLAS results on the weak mixing angle (see Table 2). Besides the statistical error (there were no statistical correlations) there were seven sources of systematics, six of which were correlated between at least two channels. The breakdown of these uncertainties as quoted by the ATLAS Collaboration [9] is shown in Table 3.

Good agreement is observed with Ref. [9], where the small differences are consistent with precision loss due to rounding. Indeed, the fact that there are round-off issues can already be seen from Ref. [9], where the quoted total error of the CF electron channel is smaller than the sum (in quadrature) of the statistical, PDF, and other systematic errors. A similar issue can be observed regarding the quoted combined systematic error which is larger than the sum in quadrature of its components.

However, there are small differences between the results from the exact procedure using Eqs. (50) and (51) and the iterative strategy using Eqs. (30) and (33). The reason can be traced to the asymmetric way in which the error due to higher orders enters the two electron channels. This induces subtle dependences of all sources of uncertainty (even those that were initially uncorrelated) on the correlated ones. It even affects the uncertainty induced by the finite muon energy resolution, which does not enter the electron channels at all. To account for this one can introduce additional contributions \(\Delta c_q^2\) to the off-diagonal entry of the bivariate covariance matrix of the all-electron result and the muon channel. These \(\Delta c_q^2\) can be chosen to enforce the exact result, but it is impossible to compute them beforehand. In fact, they depend on the new measurement to be added (here the muon channel) and not just on the initial measurements (here the two electron channels). Moreover, the \(\Delta c_q^2\) necessary to enforce the correct average central value \(\bar{v}\) differs strongly from the \(\Delta c_q^2\) necessary to enforce the correct total error \(\bar{t}\). This observation is a reflection of the fact that the combination principle can be violated [12], which we state as the requirement that the combination of a number of measurements must not depend on the order in which they are added to the average.

Thus, the iterative procedure generally suffers from a loss of precision. In this example the procedure nevertheless gives an excellent approximation, because the uncertainty from higher order corrections (the origin of the asymmetric uncertainty) is itself very small. But there are cases in which the iterative procedure does not provide even a crude approximation, and where one should use—if possible—the exact method based on the full covariance matrix. Unfortunately, its construction is not always possible, e.g., due to incomplete documentation of past results. Recent discussions of related aspects of this conundrum can be found in Refs. [13, 14].

7 Summary and conclusions

In summary, we have developed a formalism (derivatively weighted errors) to derive formulas for the statistical (random) error component of an average, or for that of any other error type of uncorrelated Gaussian nature. We introduced what we call disparity and misalignment angles to describe the case of two measurements, and found their relation to the statistical weight factors. For cases of more than two measurements with known covariances, we derived explicit formulas in a form which (as far as we are aware) has not appeared before.

It is remarkable that even in the context of purely Gaussian errors and perfectly known correlations there are intractable problems at the most fundamental statistical level. Specifically, they may arise even when a number of observations of the same quantity is combined, the error sources are recorded, and the assumptions regarding their correlations are spelled out carefully. In statistical terms, one can conclude that such a combination—despite all its recorded details—represents an insufficient statistic of the available information. The inclusion of further observations of the same quantity is then, in general, ambiguous.

On the other hand, there is no ambiguity in the absence of correlations, or when any correlation is common to the set of observations to be combined. The fact that the ambiguities disappear in certain limits then reopens the possibility of useful approximations. For example, if an iterative procedure has to be chosen, one should first combine measurements whose dominant correlation is given approximately by a common contribution. Similarly, measurements with small or no correlation with the other ones are ideally kept for last.