1 Introduction

Measurement results are usually reported quoting not only the total uncertainty on the measured values but also their breakdown into uncertainty components—usually the statistical uncertainty and one or more components of systematic uncertainty. A consistent propagation of uncertainties is of the utmost importance for global analyses of measurement data, for example, for determining the anomalous magnetic moment of the muon [1] or the parton distribution functions of the proton [2], and for the measurement of Z boson properties at LEP1 [3], the top-quark mass [4], or the Higgs boson properties [5] at the LHC. In high-energy physics experiments, different techniques are used for obtaining this decomposition, depending on (but not fundamentally related to) the test statistic used to obtain the results.

The simplest statistical method consists in comparing a measured quantity or distribution to a model, parameterized only in terms of the physical constants to be determined. Auxiliary parameters (detector calibrations, theoretical predictions, etc.) on which the model depends are fixed to their best estimates. The measured values of the physical constants result from the maximization of the corresponding likelihood. The curvature of the likelihood around its maximum is determined only by the expected fluctuations of the data and yields the statistical uncertainty of the measurement. Systematic uncertainties are obtained by repeating the procedure with varied models, obtained from the variation of the auxiliary parameters within their uncertainty, one parameter at a time [6]. Each variation represents a given source of uncertainty. The corresponding uncertainties in the final result are usually uncorrelated by construction, and are summed in quadrature to obtain the total measurement uncertainty.

When using this method, different measurements of the same physical constants can be readily combined. When all uncertainties are Gaussian, the best linear unbiased estimate (BLUE) [7, 8] results from the analytical maximization of the joint likelihood of the input measurements, and unambiguously propagates the statistical and systematic uncertainties in the input measurements to the combined result.

An improved statistical method consists in parameterizing the model in terms of both the physical constants and the sources of uncertainty [9, 10], and has become standard in LHC analyses. In this case, the maximum of the likelihood represents a global optimum for the physical constants and the uncertainty parameters, and determines their best values simultaneously. The curvature of the likelihood at its maximum reflects the fluctuations of the data and of the other sources of uncertainty, therefore giving the total uncertainty in the final result.

The determination of the statistical and systematic uncertainty components in numerical profile likelihood fits is the subject of the present note. Current practice universally employs so-called impacts [11,12,13], obtained as the quadratic difference between the total uncertainties of fits including or excluding given sources of uncertainty. But while impacts quantify the increase in the total uncertainty when including new systematic sources in a measurement, they cannot be interpreted as the contribution of these sources to the total uncertainty in the complete measurement. Impacts do not add up to the total uncertainty, and do not match usual uncertainty decomposition formulas [8] even when they should, i.e., when all uncertainties are genuinely Gaussian.

These statements are illustrated with a simple example in Sect. 2. Sections 3 and 4 summarize parameter estimation in the Gaussian approximation. Sources of uncertainty can be entirely encoded in the covariance matrix of the measurements (the “covariance representation”), or parameterized using nuisance parameters (the “nuisance parameter representation”). The equivalence between the approaches is recalled, and a detailed discussion of the fit uncertainties and correlations is provided. A new and consistent method for the decomposition of uncertainties in profile likelihood fits is proposed in Sect. 5. The method is general, as it results from a Taylor expansion of the likelihood, and a proof that it yields consistent results in the Gaussian regime is given. The different approaches are illustrated in Sect. 6 with examples based on the Higgs and W-boson mass measurements and combinations, which are usually dominated by systematic uncertainties and where the present discussion is of particular relevance. Concluding remarks are presented in Sect. 7.

In the following, we understand the statistical uncertainty in its strict frequentist definition, i.e., the standard deviation of an estimator when the exact same experiment is repeated (with the same systematic uncertainties) on independent data samples of identical expected size. Similarly, a systematic uncertainty contribution should match the standard deviation of the estimator obtained under fluctuations of the corresponding source within its initial uncertainty. Measurements (physical parameters, cross sections, or bins of a measured distribution) and the corresponding predictions will be denoted as \(\vec {m}\) and \(\vec {t}\), respectively, and labeled using Roman indices \(i, j, k\). The predictions are functions of the physical constants to be determined, referred to as parameters of interest (POIs), denoted as \(\vec {\theta }\) and labeled \(p, q\). Sources of uncertainty are denoted as \(\vec {a}\), and their associated nuisance parameters (NPs), \(\vec {\alpha }\), are labeled \(r, s, t\).

2 Example: Higgs boson mass in the di-photon and four-lepton channels

Let us consider the first ATLAS Run 2 measurement of the Higgs boson mass, \(m_\text {H}\), in the \(H\rightarrow \gamma \gamma \) and \(H\rightarrow 4\ell \) final states [14]. The measurement results in the \(\gamma \gamma \) and \(4\ell \) channels have similar total uncertainties, but are unbalanced in the sense that the former benefits from a large data sample but has significant systematic uncertainties from the photon energy calibration, while the latter is limited to a smaller data sample but has very small calibration systematic uncertainties:

  • \(m_{\gamma \gamma } = 124.93 \pm 0.40 (\pm 0.21 \text { (stat) } \pm 0.34 \text { (syst)})\) GeV;

  • \(m_{4\ell } = 124.79 \pm 0.37 (\pm 0.36 \text { (stat) } \pm 0.09 \text { (syst)})\) GeV.

The uncertainties in the \(\gamma \gamma \) and \(4\ell \) measurements can be considered as entirely uncorrelated for this discussion. In the BLUE approach, the combined value and its uncertainty are then obtained considering the following log-likelihood:

$$\begin{aligned} -2 \ln {\mathscr {L}} = \sum _{i} \left( \frac{m_i - m_\text {H}}{\sigma _{i}}\right) ^2, \end{aligned}$$
(1)

where \(i=\gamma \gamma ,\,4\ell \), and \(\sigma _{\gamma \gamma }\) and \(\sigma _{4\ell }\) are the total uncertainties in the \(\gamma \gamma \) and \(4\ell \) channels, respectively. The combined value \(m_\text {cmb}\) and its total uncertainty \(\sigma _\text {cmb}\) are derived by solving

$$\begin{aligned} \left. \frac{\partial \ln {\mathscr {L}}}{\partial m_\text {H}}\right| _{m_\text {H} = m_\text {cmb}} = 0, \quad \frac{1}{\sigma _\text {cmb}^2} = -\left. \frac{\partial ^2\ln {\mathscr {L}}}{\partial m_\text {H}^2}\right| _{m_\text {H} = m_\text {cmb}}. \end{aligned}$$
(2)

The solutions can be written in terms of linear combinations of the input values and uncertainties:

$$\begin{aligned} m_\text {cmb} = \sum _i \lambda _i \, m_i, \quad \sigma _\text {cmb}^2 = \sum _i \lambda _i^2 \, \sigma _i^2 \end{aligned}$$
(3)

with

$$\begin{aligned} \lambda _i = \frac{1/\sigma _i^2}{1/\sigma _{\gamma \gamma }^2+1/\sigma _{4\ell }^2}, \quad \lambda _{\gamma \gamma } + \lambda _{4\ell } = 1. \end{aligned}$$
(4)

The weights \(\lambda _i\) minimize the variance of the combined result, accounting for all sources of uncertainty in the input measurements. Since the total uncertainties have statistical and systematic components, i.e., \(\sigma _i^2 = \sigma _{\text {stat},i}^2 + \sigma _{\text {syst},i}^2\), the corresponding contributions in the combined measurement are simply

$$\begin{aligned} \sigma _\text {stat,cmb}^2 = \sum _i \lambda _i^2 \, \sigma _{\text {stat},i}^2, \quad \sigma _\text {syst,cmb}^2 = \sum _i \lambda _i^2 \, \sigma _{\text {syst},i}^2. \end{aligned}$$
(5)

In the profile likelihood (PL) approach, or nuisance-parameter representation, the corresponding likelihood reads

$$\begin{aligned} -2\ln {\mathscr {L}}= & {} \sum _{i} \left( \frac{m_i+\sum _r (\alpha _r - a_r) \varGamma _{ir} - m_\text {H}}{\sigma _{\text {stat},i}}\right) ^2\nonumber \\{} & {} \quad + \sum _r (\alpha _r - a_r)^2, \end{aligned}$$
(6)

where \(\alpha _r\) is the nuisance parameter corresponding to the source of systematic uncertainty r, and \(\varGamma _{ir}\) its effect on the measurement in channel i. Knowledge of the systematic uncertainty r is obtained from an auxiliary measurement, of which the central value, sometimes called a global observable, is denoted as \(a_r\). The parameters \(\alpha _r\) and \(a_r\) are defined in units of the systematic uncertainty \(\sigma _{\text {syst},r}\), and \(a_r\) is often conventionally set to 0. In this example, since the \(\sigma _{\text {syst},r}\) are specific to each channel and do not generate correlations, \(\varGamma _{ir} = \sigma _{\text {syst},r} \, \delta _{ir}\). The combined value \(m_\text {cmb}\) and its total uncertainty are obtained from the absolute maximum and second derivative of \({\mathscr {L}}\) as above; in addition, the PL yields the estimated values of the \(\alpha _r\). One finds that \(m_\text {cmb}\) and \(\sigma _\text {cmb}\) exactly match their counterparts from Eq. (3) (see also the discussion in Sect. 4).

In PL practice, however, the statistical uncertainty is usually obtained by fixing all nuisance parameters to their best-fit values (maximum likelihood estimators) \(\hat{\alpha }_r\) and maximizing the likelihood only with respect to the parameter of interest. With fixed \(\alpha _r\), the second derivative of Eq. (6) becomes equivalent to that of Eq. (1), with \(\sigma _i\) replaced by \(\sigma _{\text {stat},i}\) in the denominator, giving

$$\begin{aligned} \sigma _\text {stat,cmb}^2 = \sum _i \lambda ^{\prime 2}_{i} \, \sigma _{\text {stat},i}^2, \quad \lambda ^\prime _{i} = \frac{1/\sigma _{\text {stat},i}^2}{1/\sigma _{\text {stat},\gamma \gamma }^2+1/\sigma _{\text {stat},4\ell }^2} \end{aligned}$$
(7)

which this time differ from Eqs. (4) and (5): here, the coefficients \(\lambda ^\prime \) are calculated from the statistical uncertainties only, and the combined uncertainty is optimized for this case. The statistical uncertainty is thus underestimated relative to Eq. (3). The systematic uncertainty, estimated by subtracting the statistical estimate in quadrature from the total uncertainty, is correspondingly overestimated.
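The arithmetic of Eqs. (3)–(5) and (7) is simple enough to be reproduced directly. The following Python sketch (assuming only numpy; variable names and output format are illustrative, not taken from any published code) computes both decompositions from the per-channel inputs quoted above, i.e., the same quantities tabulated in Table 1.

```python
# Minimal sketch of the BLUE decomposition (Eqs. (3)-(5)) and the impact-style
# decomposition (Eq. (7)) for the two-channel m_H example; values in GeV.
import numpy as np

stat = np.array([0.21, 0.36])   # gamma-gamma, 4-lepton
syst = np.array([0.34, 0.09])
tot2 = stat**2 + syst**2

# BLUE weights computed from the *total* uncertainties, Eq. (4)
lam = (1.0 / tot2) / np.sum(1.0 / tot2)
sigma_cmb = np.sqrt(np.sum(lam**2 * tot2))        # total uncertainty, Eq. (3)
stat_blue = np.sqrt(np.sum(lam**2 * stat**2))     # statistical component, Eq. (5)
syst_blue = np.sqrt(np.sum(lam**2 * syst**2))     # systematic component, Eq. (5)

# Impact prescription: weights from the statistical uncertainties only, Eq. (7)
lam_p = (1.0 / stat**2) / np.sum(1.0 / stat**2)
stat_imp = np.sqrt(np.sum(lam_p**2 * stat**2))
syst_imp = np.sqrt(sigma_cmb**2 - stat_imp**2)    # quadratic subtraction

print(f"total   : {sigma_cmb:.3f}")
print(f"BLUE    : stat {stat_blue:.3f}, syst {syst_blue:.3f}")
print(f"impacts : stat {stat_imp:.3f}, syst {syst_imp:.3f}")
```

As discussed above, the impact prescription returns a smaller statistical and a larger systematic component than the BLUE decomposition, while both give the same total uncertainty.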

Table 1 Uncertainty components of \(m_\text {H}\) in the \(\gamma \gamma \) and \(4\ell \) channels, and for the combined measurement. The combined uncertainties are given according to the BLUE result (Eq. (5)) and using impacts (Eq. (7))

For completeness, numerical values are given in Table 1. The “impact” of a systematic uncertainty on a measurement with only statistical uncertainties differs from the contribution of this systematic uncertainty to the complete measurement. In the impact procedure, the estimated measurement statistical uncertainty is actually the total uncertainty of a measurement without systematic uncertainties, i.e., of a different measurement. In other words, it does not match the standard deviation of results obtained by repeating the same measurement, including systematic uncertainties, on independent data sets of the same expected size.

Finally, extrapolating the \(\gamma \gamma \) and \(4\ell \) measurements to the large data sample limit, statistical uncertainties vanish, and the asymptotic combined uncertainty should intuitively be dominated by the \(4\ell \) channel and close to 0.09 GeV. A naive estimate based on impacts instead suggests an asymptotic uncertainty of 0.20 GeV.

We generalize this discussion in the following, and argue that a sensible uncertainty decomposition should match the one obtained from fits in the covariance representation, and can also be obtained simply in the context of the PL. The Higgs boson mass example is further discussed in Sect. 6.1.

3 Uncertainty decomposition in covariance representation

This section provides a short summary of standard results which can be found in the literature (see e.g. [15]). Gaussian uncertainties are assumed throughout this section. The general form of Eq. (1) in the presence of an arbitrary number of measurements \(m_i\) and POIs \(\vec {\theta }\) is

$$\begin{aligned} -2\ln {\mathscr {L}}_\text {cov}(\vec {\theta }) = \sum _{i,j} \left( m_i - t_i(\vec {\theta })\right) C_{ij}^{-1} \left( m_j - t_j(\vec {\theta })\right) , \nonumber \\ \end{aligned}$$
(8)

where \(t_i(\vec {\theta })\) are models for the \(m_i\), and C is the total covariance of the measurements:

$$\begin{aligned} C_{ij} = V_{ij} + \sum _{r} \varGamma _{ir} \varGamma _{jr}, \end{aligned}$$
(9)

where \(V_{ij}\) represents the statistical covariance, and the second term collects all sources of systematic uncertainties. In general, \(V_{ij}\) includes statistical correlations between the measurements, but is sometimes diagonal, in which case \(V_{ij} = \sigma _i^2 \delta _{ij}\). \(\varGamma _{ir}\) represents the effect of systematic source r on measurement i (see Eq. (6)), and the outer product gives the corresponding covariance.

Imposing the restriction that the models \(t_i\) are linear functions of the parameters of interest, i.e., \(t_i(\vec {\theta }) = t_{0,i} + \sum _p h_{ip}\theta _p\), according to the Gauss–Markov theorem (see e.g. Refs. [7, 16, 17]), the POI estimators with smallest variance are found by solving \(\left. \partial \ln {\mathscr {L}}_\text {cov}/\partial \theta _p\right| _{\vec {\theta } = \hat{\vec {\theta }}} = 0\), and their covariance is obtained by inverting the negative matrix of second derivatives, \(-\left. \partial ^2\ln {\mathscr {L}}_\text {cov}/\partial \theta _p\partial \theta _q\right| _{\vec {\theta } = \hat{\vec {\theta }}}\). The solutions are

$$\begin{aligned} \hat{\theta }_p= & {} \sum _i\lambda _{p i} (m_i - t_{0,i}), \end{aligned}$$
(10)
$$\begin{aligned} \text {cov}(\hat{\theta }_p, \hat{\theta }_q)= & {} \sum _{i,j} \lambda _{p i }C_{ij}\lambda _{q j}, \end{aligned}$$
(11)

where the weights \(\lambda _{p i}\) are given by

$$\begin{aligned} \lambda _{p i}= & {} \sum _q \left( h^{T}\cdot S\cdot h\right) _{pq}^{-1}\cdot \left( h^{T}\cdot S\right) _{q i}, \end{aligned}$$
(12)
$$\begin{aligned} S_{ij}= & {} \sum _k V_{ik}^{-1}\left( {\mathbb {I}} - \varGamma \cdot Q\right) _{k j},\end{aligned}$$
(13)
$$\begin{aligned} Q_{r i}= & {} \sum _s \left( {\mathbb {I}} + \varGamma ^{T}V^{-1}\varGamma \right) _{rs}^{-1}(\varGamma ^{T}V^{-1})_{si}. \end{aligned}$$
(14)

In particular, using Eq. (9), the contributions to the POI uncertainties from the statistical uncertainty in the measurements and from each systematic source r are given by

$$\begin{aligned} \text {cov}^{[\text {stat}]}(\hat{\theta }_p, \hat{\theta }_q)= & {} \sum _{i,j} \lambda _{p i }V_{ij}\lambda _{q j}, \end{aligned}$$
(15)
$$\begin{aligned} \text {cov}^{[r]}(\hat{\theta }_p, \hat{\theta }_q)= & {} \sum _{i,j} \lambda _{p i } \left( \varGamma _{ir} \varGamma _{jr}\right) \lambda _{q j}. \end{aligned}$$
(16)
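As an illustrative sketch (not the authors' code), the decomposition of Eqs. (10)–(16) can be written compactly with numpy. The function name and input layout are assumptions made here: measurements m, offsets t0, linear response h, statistical covariance V, and systematic effects Gamma with one column per source.

```python
# Sketch of the covariance-representation decomposition, Eqs. (10)-(16),
# for a linear model t(theta) = t0 + h.theta.
import numpy as np

def decompose(m, t0, h, V, Gamma):
    """Return (theta_hat, cov, cov_stat, cov_syst_per_source)."""
    Vinv = np.linalg.inv(V)
    M = np.eye(Gamma.shape[1]) + Gamma.T @ Vinv @ Gamma
    Q = np.linalg.solve(M, Gamma.T @ Vinv)               # Eq. (14)
    S = Vinv @ (np.eye(len(m)) - Gamma @ Q)              # Eq. (13); equals C^-1, Eq. (22)
    lam = np.linalg.solve(h.T @ S @ h, h.T @ S)          # Eq. (12)
    theta_hat = lam @ (m - t0)                           # Eq. (10)
    C = V + Gamma @ Gamma.T                              # Eq. (9)
    cov = lam @ C @ lam.T                                # Eq. (11)
    cov_stat = lam @ V @ lam.T                           # Eq. (15)
    cov_syst = [np.outer(lam @ Gamma[:, r], lam @ Gamma[:, r])
                for r in range(Gamma.shape[1])]          # Eq. (16), one matrix per source
    return theta_hat, cov, cov_stat, cov_syst

# Example: the two-channel m_H average of Sect. 2 (single POI, t0 = 0, h = 1)
stat = np.array([0.21, 0.36]); syst = np.array([0.34, 0.09])
m = np.array([124.93, 124.79]); t0 = np.zeros(2); h = np.ones((2, 1))
th, cov, cov_stat, cov_syst = decompose(m, t0, h, np.diag(stat**2), np.diag(syst))
```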

We note that the BLUE averaging procedure, i.e., the unbiased linear averaging of measurements of a common physical quantity, is just a special case of Eq. (8) where the measurements are direct estimators of the POIs. In the case of a single POI, \(t_i=\theta \) (\(t_{0,i}=0, h=1\)).

A detailed discussion of template fits and of the propagation of fit uncertainties was recently given in Ref. [20]. While the above summary is restricted to linear fits with constant uncertainties, Ref. [20] also addresses nonlinear effects and uncertainties that scale with the measured quantity, i.e., \(\varGamma _{ir}\propto m_i\).

4 Equivalence between the covariance and nuisance parameter representations

Similarly, still assuming Gaussian uncertainties, the general form of Eq. (6) is

$$\begin{aligned} -2\ln {\mathscr {L}}_\text {NP}(\vec {\theta }, \vec {\alpha })= & {} \sum _{i,j} \left( m_i - t_i(\vec {\theta }) - \sum _r \varGamma _{ir} (\alpha _r - a_r)\right) V_{ij}^{-1}\nonumber \\{} & {} \quad \times \left( m_j - t_j(\vec {\theta }) - \sum _s \varGamma _{js}(\alpha _s-a_s) \right) \nonumber \\{} & {} \quad + \sum _r (\alpha _r - a_r)^2. \end{aligned}$$
(17)

The optimum of \(\mathscr {L}_\text {NP}\) can be found by first minimizing Eq. (17) over \(\vec {\alpha }\), for fixed \(\vec {\theta }\) (i.e., profiling the nuisance parameters \(\vec {\alpha }\)); substituting the result into Eq. (17) (thus obtaining the profile likelihood \(\ln \mathscr {L}_\text {NP}(\vec {\theta }, \hat{\hat{\vec {\alpha }}} (\vec {\theta }))\)); and minimizing over \(\vec {\theta }\). The profiled nuisance parameters are given by

$$\begin{aligned} \hat{\hat{\alpha }}_r (\vec {\theta }) = \sum _i Q_{ri} \left( m_i - t_i(\vec {\theta })\right) + a_r , \end{aligned}$$
(18)

where \(Q_{ri}\) was defined in Eq. (14). The expression for the covariance is

$$\begin{aligned} \text {cov}(\hat{\hat{\alpha }}_r, \hat{\hat{\alpha }}_s) (\vec {\theta }) = \left( {\mathbb {I}} + \varGamma ^{T}V^{-1}\varGamma \right) _{rs}^{-1}. \end{aligned}$$
(19)

Substituting Eq. (18) back into Eq. (17), and after some algebra, the profile likelihood can be written as

$$\begin{aligned} -2\ln {{{\mathscr {L}}}}_\text {NP}\left( \vec {\theta }, \hat{\hat{\vec {\alpha }}} (\vec {\theta })\right) = \sum _{i,j} \left( m_i - t_i(\vec {\theta })\right) S_{ij} \left( m_j - t_j(\vec {\theta })\right) ,\nonumber \\ \end{aligned}$$
(20)

where \(S_{ij}\) was defined in Eq. (13). Moreover, it can be verified that

$$\begin{aligned} \sum _k V_{ik}^{-1}\left( {\mathbb {I}} - \varGamma \cdot Q\right) _{k j}= & {} \left( V_{ij} + \sum _{r}\varGamma _{ir}\varGamma _{jr} \right) ^{-1}, \text { i.e.,}\end{aligned}$$
(21)
$$\begin{aligned} S_{ij}= & {} C^{-1}_{ij}, \end{aligned}$$
(22)

so that Eqs. (20) and (8) are in fact identical. In other words, \({\mathscr {L}}_\text {cov}(\vec \theta )\), in covariance representation, can be seen as the result of maximizing \({\mathscr {L}}_\text {NP}(\vec \theta ,\vec \alpha )\) over \(\vec \alpha \) for fixed \(\vec \theta \): it is the profile likelihood. Consequently, the best values for the POIs are still given by Eq. (10) and their uncertainties by Eq. (11), and the error decomposition of Sect. 3 applies.

The observation above is not new, and has to the authors’ knowledge been discussed in Refs. [21,22,23,24,25,26,27] for diagonal statistical uncertainties, and in Refs. [28, 29] in the general case. It is also briefly mentioned in Ref. [20]. The equivalence between the covariance and nuisance parameter representations is recalled here to emphasize that profile likelihood fits should obey the uncertainty decomposition that is standard for fits in the covariance representation.

For any value of \(\vec \theta \), the estimators of the nuisance parameters and their covariance are given by Eqs. (18) and (19). The estimator \(\hat{\alpha }\) is given by the product of the differences between the measurements and the model, \(m_i - t_i(\vec \theta )\), and a factor Q determined only from the initial systematic and experimental uncertainties. This factor can be calculated from the basic inputs to the fit. Nuisance parameter pulls (\(\hat{\alpha }_r\)) and constraints (\(\sqrt{\text {cov}(\hat{\alpha }_r, \hat{\alpha }_r)}\)) can thus also be calculated a posteriori in the context of a POI-only fit in covariance representation, without explicitly introducing \(\vec \alpha \), \(\vec a\) in the expression of the likelihood, from the same inputs as those defining C.
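A minimal sketch of this a posteriori computation, under the same array conventions as the previous sketch and with \(\vec {a}=0\), might look as follows; the helper name is hypothetical.

```python
# Sketch: NP pulls and constraints computed a posteriori from the inputs that
# define C (statistical covariance V and systematic effects Gamma), following
# Eqs. (18)-(19); t holds the model values at the chosen theta.
import numpy as np

def pulls_and_constraints(m, t, V, Gamma):
    Vinv = np.linalg.inv(V)
    M = np.eye(Gamma.shape[1]) + Gamma.T @ Vinv @ Gamma
    Q = np.linalg.solve(M, Gamma.T @ Vinv)                 # Eq. (14)
    pulls = Q @ (m - t)                                    # Eq. (18), with a_r = 0
    constraints = np.sqrt(np.diag(np.linalg.inv(M)))       # Eq. (19)
    return pulls, constraints
```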

This procedure can be repeated with the roles of \(\vec {\theta }\) and \(\vec {\alpha }\) exchanged: first minimizing over \(\vec {\theta }\) for given \(\vec {\alpha }\), substituting the result into Eq. (17), and minimizing the result over the nuisance parameters \(\vec {\alpha }\). This yields the NP covariance matrix elements as

$$\begin{aligned} \text {cov}(\hat{\alpha }_r, \hat{\alpha }_s) = \left[ {\mathbb {I}} + (\zeta \cdot \varGamma )^{T}V_{}^{-1}(\zeta \cdot \varGamma ) \right] _{rs}^{-1}, \end{aligned}$$
(23)

with

$$\begin{aligned} \zeta _{i j}= & {} \sum _{p}h_{ip}\rho _{pj} - \delta _{ij}, \end{aligned}$$
(24)
$$\begin{aligned} \rho _{pj}= & {} \sum _{q}(h^T\cdot V^{-1}\cdot h)^{-1}_{pq}(h^T\cdot V^{-1})_{q j}, \end{aligned}$$
(25)

while the covariance between the NPs and POI is given by

$$\begin{aligned} \text {cov}\left( \hat{\alpha }_r, \hat{\theta }_p\right){} & {} = -\sum _s \left[ {\mathbb {I}} + (\zeta \cdot \varGamma )^{T}V_{}^{-1}(\zeta \cdot \varGamma ) \right] _{rs}^{-1}\nonumber \\{} & {} \quad \left( \rho \cdot \varGamma \right) _{ps}. \end{aligned}$$
(26)

Equations (11), (23), and (26) determine the full covariance matrix of the fitted parameters.

Importantly, Eq. (26) can be further simplified to

$$\begin{aligned} \text {cov}\left( \hat{\alpha }_r, \hat{\theta }_p\right) = -\sum _i \lambda _{pi}\varGamma _{i r}, \end{aligned}$$
(27)

which directly provides the systematic uncertainty decomposition. The inner product of Eq. (27) with itself gives the systematic covariance, Eq. (16), and the statistical uncertainty can be obtained by subtracting the result in quadrature from the total uncertainty in \(\hat{\theta }_p\). In other words, the contribution of every systematic source to the total uncertainty is directly given by the covariance between the corresponding NP and the POI.
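In practice this means the decomposition can be read off the post-fit covariance matrix of a PL fit. The sketch below assumes a single POI stored first in a covariance matrix over (POI, NP\(_1\), …, NP\(_R\)); this ordering is an assumption for illustration, not a property of any particular fitting tool.

```python
# Sketch: uncertainty decomposition extracted from a PL fit covariance matrix,
# as implied by Eq. (27). cov_full is the post-fit covariance over
# (POI, NP_1, ..., NP_R), with the POI in the first row/column.
import numpy as np

def decomposition_from_pl_covariance(cov_full):
    total2 = cov_full[0, 0]
    syst_components = cov_full[0, 1:]        # cov(alpha_r, theta) = -sum_i lam_i Gamma_ir
    syst2 = np.sum(syst_components**2)       # inner product of Eq. (27) with itself
    stat2 = total2 - syst2                   # quadratic subtraction
    return np.sqrt(stat2), np.abs(syst_components), np.sqrt(total2)
```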

5 Uncertainty decomposition from shifted observables

While the Gaussian approximation is common and often adequate, probability models are in general not based on Gaussian uncertainty distributions. Small samples are treated using the Poisson distribution, and the constraint terms associated to nuisance parameters can assume arbitrary forms. The best-fit values of the POIs are however always functions of the measurements and of the central values of the auxiliary measurements, i.e., \(\hat{\theta }_p = \hat{\theta }_p(\vec {m},\vec {a})\). Assuming no correlations between these observables, the uncertainty in \(\hat{\theta }_p\) then follows from linear error propagation:

$$\begin{aligned} \text {cov}(\hat{\theta }_p,\hat{\theta }_p) = \sum _i \left( \frac{\partial \hat{\theta }_p}{\partial m_i}\cdot \sigma _i\right) ^2 + \sum _r \left( \frac{\partial \hat{\theta }_p}{\partial a_r}\cdot 1 \right) ^2, \end{aligned}$$
(28)

where \(\sigma _i\) is the uncertainty in \(m_i\), the uncertainty in \(a_r\) is 1 by definition of \(a_r\) and \(\alpha _r\) (Sect. 2), and \(\frac{\partial \hat{\theta }_p}{\partial m_i}\), \(\frac{\partial \hat{\theta }_p}{\partial a_r}\) are the sensitivities of the fit result to these observables. The first sum in Eq. (28) reflects the fluctuations of the measurements, i.e., the statistical uncertainty (each term of the sum represents the contribution of a given \(m_i\), measurement, or bin), and the second sum collects the contributions of the systematic uncertainties.

The contribution of a given source of uncertainty can thus be assessed by varying the corresponding measurement or global observable by one standard deviation in the expression of the likelihood, and repeating the fit otherwise unchanged. The corresponding uncertainty is obtained from the difference between the values of \(\hat{\theta }_p\) in the varied and nominal fits.
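A sketch of this procedure for a generic fit is given below; `fit(m, a)` stands for any routine (for instance a numerical profile likelihood minimization) returning the best-fit value of a single POI for given measurements and global observables, and is a placeholder rather than an existing API. Correlated data shifts are implemented via a Cholesky factor, as formalized in Eqs. (29)–(30) below.

```python
# Sketch of the shifted-observable decomposition for a general fit.
import numpy as np

def one_sigma_shift(t, n):
    """Unit vector: a one-standard-deviation shift of global observable a_t."""
    e = np.zeros(n)
    e[t] = 1.0
    return e

def shifted_observable_decomposition(fit, m, a, V):
    """Statistical and per-source systematic components of a single POI."""
    theta0 = fit(m, a)
    L = np.linalg.cholesky(V)             # V = L L^T, correlated data shifts (Eq. (29))
    stat2 = sum((fit(m + L[:, k], a) - theta0) ** 2 for k in range(len(m)))
    syst = np.array([fit(m, a + one_sigma_shift(t, len(a))) - theta0
                     for t in range(len(a))])                # Eq. (32)
    return np.sqrt(stat2), np.abs(syst)
```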

This statement can be verified explicitly for the Gaussian, linear fits discussed in the previous section. Now allowing for correlations between the measurements, varying \(m_k\) within its uncertainty yields the following likelihood:

$$\begin{aligned}{} & {} -2\ln {{{\mathscr {L}}}}_{m_k}(\vec {\theta }, \vec {\alpha })\nonumber \\{} & {} = \sum _{i,j} \left( m_i + L_{ik}-t_i(\vec {\theta }) - \sum _r \varGamma _{r i} (\alpha _r-a_r) \right) V_{ij}^{-1} \nonumber \\{} & {} \quad \times \left( m_j + L_{jk} -t_j(\vec {\theta }) - \sum _s \varGamma _{s j} (\alpha _s - a_s)\right) \nonumber \\ {}{} & {} \quad + \sum _r (\alpha _r - a_r)^2, \end{aligned}$$
(29)

where L results from the Cholesky decomposition \(L L^T = V\) and represents the correlated effect on all measurements \(m_i\) of varying \(m_k\) within its uncertainty. In the case of uncorrelated measurements, \(L_{ik} = \sigma _i \delta _{ik}\), and only \(m_k\) is varied, as in Eq. (28). After minimization, the difference between the varied and nominal fit results is

$$\begin{aligned} \varDelta \hat{\theta }^{[m_k]}_{p}\equiv \hat{\theta }^{[m_k]}_{p} - \hat{\theta }_{p} = \sum _{i} \lambda _{p i}L_{ik}. \end{aligned}$$
(30)

Similarly, the uncertainty in \(a_t\) can be obtained from the following likelihood:

$$\begin{aligned}{} & {} -2\ln {{{\mathscr {L}}}}_{a_t}(\vec {\theta }, \vec {\alpha })\nonumber \\{} & {} = \sum _{i,j} \left( m_i - t_i(\vec {\theta }) - \sum _r \varGamma _{r i} (\alpha _r -a_r) \right) V_{ij}^{-1} \nonumber \\ {}{} & {} \quad \times \left( m_j - t_j(\vec {\theta }) - \sum _s \varGamma _{s j} (\alpha _s -a_s) \right) \nonumber \\{} & {} \quad + \sum _r (\alpha _r - a_r - \delta _{rt})^2, \end{aligned}$$
(31)

resulting in

$$\begin{aligned} \varDelta \hat{\theta }^{[a_t]}_{p}\equiv \hat{\theta }^{[a_t]}_{p} - \hat{\theta }_{p} = -\sum _i \lambda _{p i}\varGamma _{it}, \end{aligned}$$
(32)

as in Eq. (27). The differences between the varied and nominal values of \(\hat{\theta }_p\) match the expressions obtained above for the corresponding uncertainties. In particular,

$$\begin{aligned} \sum _k \varDelta \hat{\theta }^{[m_k]}_{p} \varDelta \hat{\theta }^{[m_k]}_{q} = \sum _{i,j} \lambda _{p i} V_{ij} \lambda _{q j} \end{aligned}$$
(33)

reproduces the total statistical covariance in Eq. (15), and

$$\begin{aligned} \varDelta \hat{\theta }^{[a_t]}_{p} \varDelta \hat{\theta }^{[a_t]}_{q} = \sum _{i,j} \lambda _{p i}\left( \varGamma _{it} \varGamma _{jt}\right) \lambda _{q j} \end{aligned}$$
(34)

is the contribution of systematic source t to the systematic covariance in Eq. (16).

As in Sect. 4, the total uncertainty in the NPs can be obtained by minimizing the likelihood with respect to \(\vec \theta \) for fixed \(\vec \alpha \), replacing \(\vec \theta \) by its expression, and minimizing the result with respect to \(\vec \alpha \). The contribution of the measurements to the uncertainty in \(\vec \alpha \) is

$$\begin{aligned} \varDelta \hat{\alpha }^{[m_k]}_r=\hat{\alpha }^{[m_k]}_r - \hat{\alpha }_r =\sum _i {\tilde{Q}}_{ri} L_{ik}, \end{aligned}$$
(35)

where

$$\begin{aligned} {\tilde{Q}}_{ri}{} & {} = -\sum _s \left[ {\mathbb {I}} + (\zeta \cdot \varGamma )^{T}V^{-1}(\zeta \cdot \varGamma ) \right] _{rs}^{-1}\nonumber \\{} & {} \quad \ \times \left[ (\zeta \cdot \varGamma )^T\cdot V^{-1}\right] _{si}; \end{aligned}$$
(36)

and the systematic contributions are given by

$$\begin{aligned} \varDelta \hat{\alpha }^{[a_t]}_r= \hat{\alpha }^{[a_t]}_r - \hat{\alpha }_r = \left[ {\mathbb {I}} + (\zeta \cdot \varGamma )^{T}V^{-1}(\zeta \cdot \varGamma ) \right] _{rt}^{-1}. \end{aligned}$$
(37)

Summing the products of the corresponding offsets, \(\sum _k \varDelta \hat{\alpha }^{[m_k]}_r \varDelta \hat{\alpha }^{[m_k]}_s + \sum _t \varDelta \hat{\alpha }^{[a_t]}_r \varDelta \hat{\alpha }^{[a_t]}_s\), recovers the total NP covariance matrix in Eq. (23), as expected.

Finally, the covariance between the NPs and POIs can be obtained analytically by summing the products of the corresponding offsets, obtained from statistical and systematic variations, that is,

$$\begin{aligned}{} & {} \sum _k \varDelta \alpha ^{[m_k]}_r \varDelta \theta ^{[m_k]}_p + \sum _t \varDelta \alpha ^{[a_t]}_r \varDelta \theta ^{[a_t]}_p\nonumber \\ {}{} & {} \quad = - \sum _s \left[ {\mathbb {I}} + (\zeta \cdot \varGamma )^{T}V^{-1}(\zeta \cdot \varGamma ) \right] _{rs}^{-1}\left( \rho \cdot \varGamma \right) _{ps}, \end{aligned}$$
(38)

which again matches the expression for \(\text {cov}(\hat{\alpha }_r,\hat{\theta }_p)\) in Eq. (26).

The identities (33), (34), (37), and (38) can be obtained analytically only for linear fits with Gaussian uncertainties, but the uncertainty decomposition through fits with shifted observables only assumes the Taylor expansion of Eq. (28) and is therefore general. The covariance and NP representations are equivalent for Gaussian fits, but this equivalence breaks down for fits with non-Gaussian uncertainty distributions, and curvatures at the maximum of the likelihood no longer provide reliable estimates for the variance of the parameters. Such fits can however still rely on Eq. (28) to obtain a consistent uncertainty decomposition where each component directly reflects the propagation of the uncertainty in the corresponding source. In this way, uncertainty components preserve a universal meaning, regardless of the statistical method used for a given measurement.

Fig. 1 Uncertainty decomposition as a function of a luminosity scaling factor, using CMS Run 2 results [31]. Left: size of the statistical (stat) and systematic (syst) uncertainties for \(\gamma \gamma \) and \(4\ell \). Right: decomposition of uncertainties on the combination using either the uncertainty decomposition or impacts approach

In practice, the uncertainty can be propagated using one-standard-deviation shifts in m and a as above, or using the Monte Carlo error propagation method, where m or a are randomized within their respective probability density functions, and the corresponding uncertainty in the measurement is determined from the variance of the fit results. The latter method makes the correspondence between uncertainty contributions and the effect of fluctuations of the corresponding sources (cf. Sect. 1) explicit. It is also more general, and gives more precise results in the case of significant asymmetries or tails in the uncertainty distributions. In addition, it can be more efficient when simultaneously estimating the variance contributed by a large group of sources of uncertainty. Similarly, the present method can be generalized to unbinned measurements using data resampling techniques for the extraction of statistical uncertainty components [30].
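A sketch of the Monte Carlo variant is given below, using the same placeholder `fit(m, a)` interface as in the previous sketch and assuming Gaussian throws for illustration (any probability density can be substituted).

```python
# Sketch of Monte Carlo (toy) error propagation: the spread of the fitted POI
# under toy fluctuations of the measurements or of the global observables.
import numpy as np

def toy_decomposition(fit, m, a, V, n_toys=1000, seed=1):
    rng = np.random.default_rng(seed)
    # statistical component: fluctuate the measurements only
    stat_toys = [fit(rng.multivariate_normal(m, V), a) for _ in range(n_toys)]
    # systematic components: fluctuate one global observable at a time
    syst = []
    for t in range(len(a)):
        toys = []
        for _ in range(n_toys):
            a_toy = np.array(a, dtype=float)
            a_toy[t] += rng.normal()      # unit width by construction of a_t
            toys.append(fit(m, a_toy))
        syst.append(np.std(toys))
    return np.std(stat_toys), np.array(syst)
```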

6 Examples

6.1 Combination of two measurements

Let us consider again the concrete case of the Higgs boson mass \(m_\text {H}\) described in Sect. 2, which will serve as a simple example with only one parameter of interest (\(m_\text {H}\)) and two measurements. We will further assume that both the statistical and systematic uncertainties are uncorrelated between the two channels, which is not unreasonable given that they correspond to different events and that the dominant sources of systematic uncertainty are indeed uncorrelated. We will take numerical values from the actual ATLAS [14] and CMS [31] Run 1 and Run 2 measurements, as well as from an imaginary case exaggerating the numeric features of the ATLAS Run 2 measurement.

Fig. 2 Uncertainty decomposition as a function of a luminosity scaling factor, using ATLAS Run 2 results [14]. Left: size of the statistical (stat) and systematic (syst) uncertainties for \(\gamma \gamma \) and \(4\ell \). Right: decomposition of uncertainties on the combination using either the uncertainty decomposition or impacts approach

For each case, the decomposition of uncertainties into statistical and systematic components is compared between the two approaches, uncertainty decomposition and impacts. In addition, this is done as a function of a luminosity factor k, which is used to scale the statistical uncertainty of the inputs by \(1/\sqrt{k}\) (while systematic uncertainties are kept unchanged). The published results in the example under consideration correspond to \(k=1\). Though not shown in the plots, we have also checked numerically that the uncertainty decomposition (as usually done in covariance representation methods or BLUE) can be reproduced from a profile likelihood fit with shifted observables (Sect. 5), while the impacts (as usually done in profile likelihood fits) can also be recovered from the BLUE approach, simply by using the statistical uncertainties alone to compute the combination weights \(\lambda ^\prime _i\) as in Eq. (7) (i.e., repeating the combination without systematic uncertainties). In addition, both approaches have been checked to yield the same total uncertainty in all cases.

CMS results We first study the combination of CMS Run 2 results [31]: \(\text {stat}_{\gamma \gamma } = 0.18\) GeV, \(\text {syst}_{\gamma \gamma } = 0.19\) GeV; \(\text {stat}_{4\ell } = 0.19\) GeV, \(\text {syst}_{4\ell } = 0.09\) GeV. The results of our toy combination are shown in Fig. 1. This figure, as well as the following ones, comprises two panels: the inputs to the combination on the left, and statistical and systematic uncertainties as obtained in either the uncertainty decomposition or impact approaches on the right. The actual published numbers [31] correspond to \(k=1\) (black vertical line).

With this first simple case, where the two measurements have relatively comparable uncertainties, little difference is found between the two approaches, though the uncertainty decomposition gives a larger statistical uncertainty than the impact one, as expected. The difference becomes larger for higher values of the luminosity factor.

ATLAS results We now consider the ATLAS Run 2 results [14]: \(\text {stat}_{\gamma \gamma } = 0.21\) GeV, \(\text {syst}_{\gamma \gamma } = 0.34\) GeV; \(\text {stat}_{4\ell } = 0.36\) GeV, \(\text {syst}_{4\ell } = 0.09\) GeV. As shown in Fig. 2, differences between the two uncertainty decompositions are now more evident, already at the nominal luminosity and even more so when extrapolating to larger luminosities (smaller statistical uncertainties). Again, the uncertainty decomposition gives a larger statistical uncertainty than the impact one.

Imaginary extreme case Finally, we consider an extreme case, such that \(\text {stat}_{\gamma \gamma } = 0.1\) GeV, \(\text {syst}_{\gamma \gamma } = 0.5\) GeV; \(\text {stat}_{4\ell } = 0.5\) GeV, \(\text {syst}_{4\ell } = 0.1\) GeV, exaggerating the features of the ATLAS combination (i.e., combining a statistically dominated measurement with a systematically limited one). Dramatic differences between the two approaches for uncertainty decomposition are observed in Fig. 3: for the nominal luminosity, while uncertainty decomposition reports equal statistical and systematic uncertainties, the impacts are dominated by the systematic uncertainty.

Fig. 3 Uncertainty decomposition as a function of a luminosity scaling factor, using \(\text {stat}_{\gamma \gamma } = 0.1\) GeV, \(\text {syst}_{\gamma \gamma } = 0.5\) GeV; \(\text {stat}_{4\ell } = 0.5\) GeV, \(\text {syst}_{4\ell } = 0.1\) GeV. Left: size of the statistical (stat) and systematic (syst) uncertainties for \(\gamma \gamma \) and \(4\ell \). Right: decomposition of uncertainties on the combination using the uncertainty decomposition or impact approach

6.2 W-boson mass fits

The uncertainty decomposition discussed above is further illustrated with a toy measurement of the W-boson mass using pseudo-data, where the results obtained from the profile likelihood fit and from the analytical calculation are compared. Since the W-boson mass measurement is a typical shape analysis, in which the fitted distributions are parameterized in terms of both the POI and the NPs, the conclusions drawn from this example can in principle be generalized to other shape analyses. While the effect of varying the W mass is parameterized by the POI, three representative systematic sources of a W mass measurement at hadron colliders [32,33,34,35] are parameterized by NPs in the probability model: the lepton momentum scale uncertainty, the hadronic recoil (HR) resolution uncertainty, and the \(p_\text {T}^W\) modeling uncertainty. The W mass is extracted from the \(p_\text {T}^\ell \) or \(m_\text {T}\) spectra, since measurements based on these two distributions have very different sensitivities to certain types of systematic uncertainties.

6.2.1 Simulation

The signal process under consideration is the charged-current Drell–Yan process [36] \(p p\rightarrow W^{-}\rightarrow \mu ^{-}\nu \) at a center-of-mass energy of \(\sqrt{s} = 13\) TeV, generated using Madgraph, with initial- and final-state corrections obtained using Pythia8 [37, 38]. Detailed information regarding the event generation is listed in Table 2.

Kinematic distributions for different values of the W mass are obtained in simulation via Breit–Wigner reweighting [39]. The systematic variations of \(p_\text {T}^W\) are implemented using a linear reweighting as a function of \(p_\text {T}^W\) before event selection, then taking only the shape effect on the underlying \(p_\text {T}^W\) spectrum.

At the reconstruction level, the \(p_\text {T}\) of the bare muon is smeared by 2% following a Gaussian distribution. A source of systematic uncertainty in the calibration of the muon momentum scale is considered. The hadronic recoil \(\vec {u}_\text {T}\) is taken to be the opposite of \(\vec {p}_\text {T}^W\) and smeared by a constant 6 GeV in both directions of the transverse plane. The second source of experimental systematic uncertainty is taken to be the uncertainty in the calibration of the hadronic recoil resolution. The information about the W mass templates and the systematic variation is summarized in Table 3.

Table 2 Madgraph+Pythia8 [37, 38] event generation for MC samples. Events with an off-shell boson are excluded in the event generation at the parton level, leading to a total cross section of 6543 pb
Table 3 W mass templates and systematic variations for the Madgraph+Pythia8 samples

Both the detector smearing and the event selections listed in Table 4 are chosen to be similar to those of a realistic W mass measurement. The reconstructed muon \(p_\text {T}\) and \(m_\text {T}\) spectra in the fit range after the event selection are shown in Fig. 4, along with the relevant templates and systematic variations.

Table 4 Detector smearing and event selection for Madgraph+Pythia8 samples. The cut-flow efficiency of the event selection is about 29%

6.2.2 Uncertainty decomposition

The profile likelihood fit is performed using HistFactory [40] and RooFit [41]. Its output includes the fitted central values and uncertainties for all the free parameters. The uncertainty components of the profile likelihood fit results are obtained by repeating the fit on bootstrap samples, obtained by resampling either the pseudo-data used to compute the results or the central values of the auxiliary measurements, and computing the spread of the resulting offsets in the POI; the analytical solution of the fit is calculated following the procedures of Sect. 5. For this exercise, the pseudo-data are chosen to be the nominal simulation, but with the statistical power of the data. The effect of changing the luminosity scale factor is emulated by repeating the fit with all the reconstructed distributions scaled by an overall factor. The setups of the fits for the validation are summarized in Table 5.
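The statistical part of this bootstrap can be sketched as follows; `fit_pl(counts)` is a placeholder for the profile likelihood fit of the \(m_W\) templates to a given binned spectrum, and Poisson resampling of the expected bin contents is one simple choice of resampling scheme, assumed here for illustration.

```python
# Sketch: spread of the refitted POI over resampled pseudo-data histograms.
import numpy as np

def bootstrap_stat_uncertainty(fit_pl, expected_counts, n_boot=200, seed=2):
    rng = np.random.default_rng(seed)
    results = [fit_pl(rng.poisson(expected_counts)) for _ in range(n_boot)]
    return np.std(results)
```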

Fig. 4 Reconstructed muon \(p_\text {T}\) and \(m_\text {T}\) distributions of the Madgraph+Pythia8 samples. Top: kinematic spectra. Bottom: the variation-to-nominal ratio with statistical uncertainty indicated by the error band

Table 5 Configuration of the \(m_W\) fits. The luminosity scale factor of 1.0 corresponds to 76.42 \(\text {pb}^{-1}\)

Figures 5 and 6 present the uncertainty decomposition as a function of a luminosity scale factor used to scale the statistical precision of the simulated sample. The error bars on the uncertainty decomposition for the profile likelihood fit reflect the limited number of toys. In general, the uncertainty components derived from the numerical profile likelihood fit and the analytical solution match each other within the error bars. The discrepancies at certain points can be attributed to the numerical stability of the PL fit, which shows up when the uncertainty components become too small (typically \(< 2\) MeV). The uncertainty decomposition is summarized in Table 6, where the total uncertainty is broken down into data statistical and total systematic uncertainties using the shifted observable method, and compared with the results using the conventional impact approach for the PL fit. With 10 times higher luminosity, the statistical uncertainty of the impact approach decreases by exactly a factor of \(\sqrt{10}\), while that of the shifted observable approach introduced in this study decreases more slowly.

Fig. 5 Uncertainty decomposition for the muon \(p_\text {T}\) fit compared between the numerical and the analytical PL fit. The total systematic uncertainty of the profile likelihood fit is the quadratic sum of the three components

Fig. 6 Uncertainty decomposition for the \(m_\text {T}\) fit compared between the numerical and the analytical PL fit. The total systematic uncertainty of the profile likelihood fit is the quadratic sum of the three components

Table 7 shows the analytical systematic uncertainty decomposition for the \(m_\text {T}\) and \(p_\text {T}^\ell \) fits with nominal luminosity, together with the NP-POI covariance matrix elements obtained from the numerical profile likelihood fit. This confirms that the systematic uncertainty components can be directly read from the PL fit covariance matrix, as discussed around Eq. (27). Finally, Fig. 7 compares the post-fit NP uncertainties between the numerical profile likelihood fit and the analytical calculation. The two methods agree at the 0.1 per-mil level.

6.3 Use of decomposed uncertainties in subsequent fits or combinations

Uncertainty decompositions obtained with the present method are meaningful only if the results can be used consistently in downstream applications, such as measurement combinations or interpretation fits in terms of specific physics models. In particular, uncertainty components that are common to several measurements generate correlations which should be properly evaluated. This happens when measurements are statistically correlated or when they are impacted by shared systematic uncertainties.

As a final validation of the proposed method, we test the combination of profile likelihood fits of the same observable. Such a combination can be performed either using the decomposed uncertainties, or in terms of the PL fit outputs, i.e., the fitted values of the POIs and NPs and their covariance matrix.

The combination is performed starting from Eq. (8), which as noted in Sect. 3 can be applied to linear measurement averaging by adapting the definition of \(t(\vec {\theta })\). In the case of a single combined parameter, \(t_i=\theta \), for a simultaneous combination of several parameters, \(t_i = \sum _{p}{U}_{ip}\theta _p\), where \({U}_{ip}\) is 1 when measurement i is an estimator of POI p, and 0 otherwise [8]. This gives

$$\begin{aligned} -2\ln {\mathscr {L}}_\text {cmb}(\vec {\theta })&= \sum _{i,j} \left( m_i - \sum _{p}{U}_{ip}\theta _p \right) C_{ij}^{-1} \nonumber \\ {}&\quad \times \left( m_j - \sum _{p}{U}_{jp}\theta _p \right) , \end{aligned}$$
(39)

which can be solved as in Sect. 3.

As an illustration, we use the \(m_W\) fits using the \(p_\text {T}^\ell \) and \(m_\text {T}\) distributions described in the previous section. In the case of a combination based on the uncertainty decomposition, there are two measurements (the POIs of the \(p_\text {T}^\ell \) and \(m_\text {T}\) fits), one combined value, and the covariance C is a \(2\times 2\) matrix constructed from the decomposed uncertainties using Eq. (9).
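A sketch of such a combination from decomposed uncertainties is given below, assuming uncorrelated statistical components and systematic sources that are either shared (fully correlated) between the inputs or specific to one of them; all names are illustrative.

```python
# Sketch of a BLUE-style combination of n determinations of one quantity,
# following Eq. (39) with a single combined parameter.
import numpy as np

def combine(values, stat, Gamma):
    """values, stat: length-n arrays of central values and statistical uncertainties.
    Gamma: (n, n_sources) matrix of systematic effects; a source shared by several
    inputs has non-zero entries in more than one row and generates correlations."""
    C = np.diag(stat ** 2) + Gamma @ Gamma.T          # Eq. (9)
    U = np.ones((len(values), 1))                     # every input estimates the same POI
    Cinv = np.linalg.inv(C)
    cov_cmb = np.linalg.inv(U.T @ Cinv @ U)           # combined variance
    theta_cmb = cov_cmb @ (U.T @ Cinv @ values)       # combined central value
    return theta_cmb.item(), np.sqrt(cov_cmb.item())
```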

Table 6 Uncertainty decomposition for the muon \(p_\text {T}^\ell \) and \(m_\text {T}\) fits, for two different values of the luminosity scale factor, using the shifted observable method and the impact method for the PL fit. The errors arise from the limited number of bootstrap toys. The baseline luminosity is 76.42 \(\text {pb}^{-1}\)

For a combination based on the PL fit outputs, there are in this example eight measurements (one POI and three NPs in each of the \(p_\text {T}^\ell \) and \(m_\text {T}\) fits), four combined parameters, and C is an \(8\times 8\) matrix. The diagonal \(4\times 4\) blocks are the post-fit covariance matrices of each fit (\(p_\text {T}^\ell \) and \(m_\text {T}\)). The off-diagonal blocks reflect systematic and/or statistical correlations between the \(p_\text {T}^\ell \) and \(m_\text {T}\) fits, and can be obtained analytically following the methods of Sect. 5. For two fits \(f_1\) and \(f_2\), the covariance matrix elements are

$$\begin{aligned} \text {cov}\left( \theta _p^{f_1}, \theta _q^{f_2}\right)= & {} \sum _k\varDelta \theta ^{[m_k], f_1}_p\,\varDelta \theta ^{[m_k], f_2}_q \nonumber \\{} & {} \quad \ + \sum _t\varDelta \theta ^{[a_t], f_1}_p \varDelta \theta ^{[a_t], f_2}_q\nonumber \\ \text {cov}\left( \alpha _r^{f_1}, \alpha _s^{f_2}\right)= & {} \sum _k\varDelta \alpha ^{[m_k],f_1}_r \varDelta \alpha ^{[m_k],f_2}_s \nonumber \\{} & {} \quad \ + \sum _t\varDelta \alpha ^{[a_t], f_1}_r \varDelta \alpha ^{[a_t], f_2}_s \nonumber \\ \text {cov}\left( \alpha _r^{f_1}, \theta _p^{f_2}\right)= & {} \sum _k\varDelta \alpha ^{[m_k], f_1}_r\varDelta \theta ^{[m_k], f_2}_p \nonumber \\{} & {} \quad \ + \sum _t\varDelta \alpha ^{[a_t], f_1}_r \varDelta \theta ^{[a_t], f_2}_p\nonumber \\ \text {cov}\left( \theta _p^{f_1}, \alpha _r^{f_2}\right)= & {} \sum _k\varDelta \theta ^{[m_k], f_1}_p \varDelta \alpha ^{[m_k], f_2}_r \nonumber \\{} & {} \quad \ + \sum _t\varDelta \theta ^{[a_t], f_1}_p \varDelta \alpha ^{[a_t], f_2}_r \end{aligned}$$
(40)

For each matrix element, the first sum is statistical and typically occurs when the fitted distributions are projections of the same data, as is the case for the \(p_\text {T}^\ell \) and \(m_\text {T}\) distributions in \(m_W\) fits. The second sum represents shared systematic sources of uncertainty.

Table 7 Left: list of systematic uncertainty contributions and the total uncertainty, in MeV, for the \(m_\text {T}\) and \(p_\text {T}^\ell \) fits performed in covariance representation. Center, right: post-fit covariance among the three NPs associated to these systematic uncertainties and the POI, for the profile likelihood fits to the \(m_\text {T}\) and \(p_\text {T}^\ell \) distribution, respectively
Fig. 7 Post-fit NP uncertainties at different values of the luminosity scale factor. The results of the numerical and the analytical PL fits are compared in the ratio panel

Results of this comparison are presented in Fig. 8 and Table 8, which summarize the fit precision as a function of the assumed luminosity. The uncertainty decomposition method and the combination of the PL fit results agree to better than 0.1 MeV. For completeness, the result of a direct joint fit to the two distributions is shown as well; slightly more precise results are obtained in this case, as expected, especially at high integrated luminosities where systematic uncertainties dominate.

Fig. 8 Summary of \(m_T\) and \(p_T^\ell \) PL fit results. Combinations are produced using the uncertainty decomposition method and using the covariance of the PL fit results

Table 8 Summary of \(m_T\) and \(p_T^\ell \) PL fit results. Combinations are produced using the uncertainty decomposition method, and using the covariance of the PL fit results

We note that a combination of PL fit results based on the nuisance parameter representation, Eq. (17), as proposed in Ref. [42], seems difficult to justify rigorously. The principal reason is that Eq. (17) explicitly relies on the absence of correlations, prior to the combination, between the sources of uncertainty encoded in the covariance matrix V and the uncertainties treated as nuisance parameters. Since the input measurements result from PL fits, the POI of each input measurement is in general correlated with the corresponding NPs. One possibility would be to add terms to Eq. (17) that describe these missing correlations. It could also be envisaged to diagonalize the covariance of the inputs and perform the fit in this new basis, but this would work only if all measurements can be diagonalized by the same linear transformation, which is generally not the case.

7 Conclusion

We have studied the decomposition of fit uncertainties in two commonly used statistical methods in high-energy physics, namely, fits in covariance representation and the profile likelihood. We recalled the equivalence between the two methods in the Gaussian limit and gave a complete set of expressions for the fit uncertainties in the parameters of interest, the nuisance parameters, and their correlations. A direct correspondence was established between the standard uncertainty decomposition in covariance representation and the (POI, NP) covariance matrix elements in nuisance parameter representation.

Numerical profile likelihood analyses generally define statistical and systematic uncertainty components from the results of fits with fixed nuisance parameters and from systematic impacts, but this identification does not hold. The uncertainty of such statistics-only fits underestimates the statistical uncertainty of fits including systematic effects, and systematic impacts correspondingly overestimate the genuine systematic uncertainty contributions. Impacts therefore cannot be used as inputs to subsequent measurement combinations or interpretation fits.

We have introduced a set of analytical and numerical methods to remove this shortcoming. In the Gaussian approximation, a consistent uncertainty decomposition can be directly extracted from the PL fit covariance matrix. For general (non-Gaussian or nonlinear) profile likelihood fits, a consistent uncertainty decomposition can be rigorously obtained from fits using shifted observables. We have illustrated these points by means of simple examples and have shown that profile likelihood fit results with properly decomposed uncertainties can be used consistently in downstream combinations or fits.