Uncertainty components in profile likelihood fits

When a measurement of a physical quantity is reported, the total uncertainty is usually decomposed into statistical and systematic uncertainties. This decomposition is not only useful to understand the contributions to the total uncertainty, but also required to propagate these contributions in subsequent analyses, such as combinations or interpretation fits including results from other measurements or experiments. In profile-likelihood fits, widely applied in high-energy physics analyses, contributions of systematic uncertainties are routinely quantified using "impacts", which are not adequate for such applications. We discuss the difference between impacts and actual uncertainty components, and establish methods to determine the latter in a wide range of statistical models.


Introduction
Measurement results are usually reported not only quoting the total uncertainty on the measured values, but also their breakdown into uncertainty components: usually the statistical uncertainty, and one or more components of systematic uncertainty. A consistent propagation of uncertainties is of utmost importance for global analyses of measurement data, such as, for example, the determination of the anomalous magnetic moment of the muon [1] and of the parton distribution functions of the proton [2], measurements of Z-boson properties at LEP1 [3], and measurements of the top-quark mass [4] and Higgs-boson properties [5] at the LHC. In high-energy physics experiments, different techniques are used for obtaining this decomposition, depending on (but not fundamentally related to) the test statistic used to obtain the results.
The simplest statistical method consists in comparing a measured quantity or distribution to a model parameterised only in terms of the physical constants to be determined. Auxiliary parameters (detector calibrations, theoretical predictions, etc.) on which the model depends are fixed to their best estimates. The measured values of the physical constants result from the maximization of the corresponding likelihood. The curvature of the likelihood around its maximum is determined only by the expected fluctuations of the data, and yields the statistical uncertainty of the measurement.¹ Systematic uncertainties are obtained by repeating the procedure with varied models, obtained from the variation of the auxiliary parameters within their uncertainty, one parameter at a time [6]. Each variation represents a given source of uncertainty. The corresponding uncertainties in the final result are usually uncorrelated by construction, and are summed in quadrature to obtain the total measurement uncertainty.
When using this method, different measurements of the same physical constants can be readily combined. When all uncertainties are Gaussian, the Best Linear Unbiased Estimate (BLUE) [7,8] results from the analytical maximization of the joint likelihood of the input measurements, and unambiguously propagates the statistical and systematic uncertainties in the input measurements to the combined result.
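As a concrete sketch of the BLUE procedure (the function name and numbers below are ours, purely for illustration), the combination weights and combined uncertainty follow directly from the inverse covariance of the inputs:

```python
import numpy as np

def blue_combine(values, cov):
    """BLUE of a single quantity from (possibly correlated) measurements:
    weights lambda = C^{-1} 1 / (1^T C^{-1} 1), variance 1 / (1^T C^{-1} 1)."""
    cinv = np.linalg.inv(cov)
    ones = np.ones(len(values))
    norm = ones @ cinv @ ones
    weights = cinv @ ones / norm
    combined = weights @ values
    return combined, np.sqrt(1.0 / norm), weights

# Two uncorrelated measurements of the same quantity (illustrative numbers):
vals = np.array([10.0, 11.0])
cov = np.diag([0.5**2, 1.0**2])
m, s, w = blue_combine(vals, cov)
# The more precise input gets the larger weight (here 0.8 vs 0.2)
```

The weights are inversely proportional to the input variances for uncorrelated inputs, as in the standard weighted average.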
An improved statistical method consists in parameterising the model in terms of both the physical constants and the sources of uncertainty [9,10], and has become a standard in LHC analyses. In this case, the maximum of the likelihood represents a global optimum for the physical constants and the uncertainty parameters, and determines their best values simultaneously. The curvature of the likelihood at its maximum reflects the fluctuations of the data and of the other sources of uncertainty, therefore giving the total uncertainty in the final result.
The determination of the statistical and systematic uncertainty components in numerical profile-likelihood fits is the subject of the present note. Current practice universally employs so-called impacts [11–13], obtained as the quadratic difference between the total uncertainties of fits including or excluding a given source of uncertainty. While impacts quantify the increase in total uncertainty when including a new systematic source in a measurement, they cannot be interpreted as the contribution of this source to the total uncertainty in the complete measurement. Impacts do not add up to the total uncertainty, and do not match the usual uncertainty decomposition formulas [8] even when they should, i.e. when all uncertainties are genuinely Gaussian.
These statements are illustrated with a simple example in Section 2. Sections 3 and 4 summarize parameter estimation in the Gaussian approximation. Sources of uncertainty can be entirely encoded in the covariance matrix of the measurements (the "covariance representation"), or parameterised using nuisance parameters (the "nuisance parameter representation"). The equivalence between the approaches is recalled, and a detailed discussion of the fit uncertainties and correlations is provided. A consistent method for the decomposition of uncertainties in profile-likelihood ratio fits is introduced in Section 5. The method is general, as it results from a Taylor expansion of the likelihood, and a proof that it yields consistent results in the Gaussian regime is given. The different approaches are illustrated in Section 6 with examples based on the Higgs- and W-boson mass measurements and combinations, which are usually dominated by systematic uncertainties and where the present discussion is of particular relevance. Concluding remarks are presented in Section 7.
In the following, we understand the statistical uncertainty in its strict frequentist definition, i.e. the standard deviation of an estimator when the exact same experiment is repeated (with the same systematic uncertainties) on independent data samples of identical expected size. Similarly, a systematic uncertainty contribution should match the standard deviation of the estimator obtained under fluctuations of the corresponding source within its initial uncertainty. Measurements (physical parameters, cross sections or bins of a measured distribution) and the corresponding predictions will be denoted $\vec m$ and $\vec t$, respectively, and labeled using roman indices $i, j, k$. The predictions are functions of the physical constants to be determined, referred to as parameters of interest (POIs), denoted $\vec\theta$ and labeled $p, q$. Sources of uncertainty are denoted $\vec a$ and their associated nuisance parameters (NPs), $\vec\alpha$, are labeled $r, s, t$.

Example: Higgs-boson mass in the di-photon and four-lepton channels
Let us consider the first ATLAS Run 2 measurement of the Higgs-boson mass, $m_H$, in the $H \to \gamma\gamma$ and $H \to 4\ell$ final states [14]. The measurement results in the γγ and 4ℓ channels have similar total uncertainty, but are unbalanced in the sense that the former benefits from a large data sample but has significant systematic uncertainties from the photon energy calibration, while the latter is limited to a smaller data sample but benefits from excellent calibration systematic uncertainties: $m_{\gamma\gamma} = 124.93 \pm 0.40\;(\pm 0.21\,\mathrm{(stat)} \pm 0.34\,\mathrm{(syst)})$ GeV; $m_{4\ell} = 124.79 \pm 0.37\;(\pm 0.36\,\mathrm{(stat)} \pm 0.09\,\mathrm{(syst)})$ GeV.
The uncertainties in the γγ and 4ℓ measurements can be considered as entirely uncorrelated for this discussion. In the BLUE approach, the combined value and its uncertainty are then obtained considering the following log-likelihood:

$$-\ln L(m_H) = \frac{1}{2}\sum_i \frac{(m_i - m_H)^2}{\sigma_i^2}, \tag{1}$$

where $i = \gamma\gamma, 4\ell$, and $\sigma_{\gamma\gamma}$ and $\sigma_{4\ell}$ are the total uncertainties in the γγ and 4ℓ channels, respectively. The combined value $m_{\mathrm{cmb}}$ and its total uncertainty $\sigma_{\mathrm{cmb}}$ are derived solving

$$\frac{\partial \ln L}{\partial m_H}\bigg|_{m_H = m_{\mathrm{cmb}}} = 0, \qquad \frac{1}{\sigma_{\mathrm{cmb}}^2} = -\frac{\partial^2 \ln L}{\partial m_H^2}\bigg|_{m_H = m_{\mathrm{cmb}}}. \tag{2}$$

The solutions can be written in terms of linear combinations of the input values and uncertainties:

$$m_{\mathrm{cmb}} = \sum_i \lambda_i\, m_i, \qquad \sigma_{\mathrm{cmb}}^2 = \sum_i \lambda_i^2\, \sigma_i^2, \tag{3}$$

with

$$\lambda_i = \frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2}, \tag{4}$$

and the weights $\lambda_i$ minimize the variance of the combined result, accounting for all sources of uncertainty in the input measurements. Since the total uncertainties have statistical and systematic components, i.e. $\sigma_i^2 = \sigma_{\mathrm{stat},i}^2 + \sigma_{\mathrm{syst},i}^2$, the corresponding contributions in the combined measurement are simply

$$\sigma_{\mathrm{stat,cmb}}^2 = \sum_i \lambda_i^2\, \sigma_{\mathrm{stat},i}^2, \qquad \sigma_{\mathrm{syst,cmb}}^2 = \sum_i \lambda_i^2\, \sigma_{\mathrm{syst},i}^2. \tag{5}$$

In the profile likelihood (PL) approach, or nuisance-parameter representation, the corresponding likelihood reads

$$-\ln L(m_H, \vec\alpha) = \frac{1}{2}\sum_i \frac{\big(m_i - m_H - \sum_r \Gamma_{ir}\,\alpha_r\big)^2}{\sigma_{\mathrm{stat},i}^2} + \frac{1}{2}\sum_r (\alpha_r - a_r)^2, \tag{6}$$

where $\alpha_r$ is the nuisance parameter corresponding to the source of systematic uncertainty $r$, and $\Gamma_{ir}$ its effect on the measurement in channel $i$. Knowledge of the systematic uncertainty $r$ is obtained from an auxiliary measurement, whose central value, sometimes called a global observable, is denoted as $a_r$. The parameters $\alpha_r$ and $a_r$ are defined in units of the systematic uncertainty $\sigma_{\mathrm{syst},r}$, and $a_r$ is often conventionally set to 0. In this example, since the $\sigma_{\mathrm{syst},r}$ are specific to each channel and do not generate correlations, $\Gamma_{ir} = \sigma_{\mathrm{syst},r}\,\delta_{ir}$. The combined value $m_{\mathrm{cmb}}$ and its total uncertainty are obtained from the absolute maximum and second derivative of $L$ as above; in addition, the PL yields the estimated value for $\alpha_r$. One finds that $m_{\mathrm{cmb}}$ and $\sigma_{\mathrm{cmb}}$ exactly match their counterparts from Eq. (3) (see also the discussion in Section 4).
In PL practice, however, the statistical uncertainty is usually obtained by fixing all nuisance parameters to their best-fit values (maximum-likelihood estimators) $\hat\alpha_r$, maximising the likelihood only with respect to the parameter of interest. With fixed $\alpha_r$, the second derivative of Eq. (6) becomes equivalent to that of Eq. (1), only changing $\sigma_i$ for $\sigma_{\mathrm{stat},i}$ in the denominator, giving

$$\sigma_{\mathrm{stat,cmb}}'^{\,2} = \Big(\sum_i \frac{1}{\sigma_{\mathrm{stat},i}^2}\Big)^{-1}, \qquad \lambda_i' = \frac{1/\sigma_{\mathrm{stat},i}^2}{\sum_j 1/\sigma_{\mathrm{stat},j}^2}, \tag{7}$$

which this time differs from Eqs. (4), (5): here, the coefficients $\lambda_i'$ are calculated from the statistical uncertainties only, and optimize the combined uncertainty for this case. The statistical uncertainty is thus underestimated compared to Eq. (3). The systematic uncertainty, estimated from the quadratic subtraction of the statistical estimate from the total uncertainty, is overestimated.
For completeness, numerical values are given in Table 1.
The "impact" of a systematic uncertainty quantifies its effect on a measurement with only statistical uncertainties, which differs from the contribution of this systematic uncertainty to the complete measurement. In the impact procedure, the estimated statistical uncertainty is actually the total uncertainty of a measurement without systematic uncertainties, i.e. of a different measurement. In other words, it does not match the standard deviation of results obtained by repeating the same measurement, including systematic uncertainties, on independent data sets of the same expected size.
Finally, extrapolating the γγ and 4ℓ measurements to the large data sample limit, statistical uncertainties vanish, and the asymptotic combined uncertainty should intuitively be dominated by the 4ℓ channel and close to 0.09 GeV. A naive estimate based on impacts instead suggests an asymptotic uncertainty of 0.20 GeV.
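The two decompositions can be checked numerically; the following numpy sketch (ours, using the ATLAS Run 2 inputs quoted above) reproduces both at the nominal luminosity:

```python
import numpy as np

# ATLAS Run 2 inputs from the text (GeV), uncorrelated between channels
stat = np.array([0.21, 0.36])    # gamma-gamma, 4l
syst = np.array([0.34, 0.09])
tot2 = stat**2 + syst**2

# Uncertainty decomposition (BLUE): weights from the total uncertainties
lam = (1 / tot2) / np.sum(1 / tot2)
sig_tot = np.sqrt(1 / np.sum(1 / tot2))           # ~0.27 GeV
sig_stat = np.sqrt(np.sum(lam**2 * stat**2))      # ~0.22 GeV
sig_syst = np.sqrt(np.sum(lam**2 * syst**2))      # ~0.16 GeV

# Impact approach: "statistical" uncertainty from a fit with fixed NPs,
# i.e. weights computed from the statistical uncertainties alone
sig_stat_imp = np.sqrt(1 / np.sum(1 / stat**2))   # ~0.18 GeV (underestimated)
sig_syst_imp = np.sqrt(sig_tot**2 - sig_stat_imp**2)  # ~0.20 GeV (overestimated)
```

The decomposed components sum in quadrature to the total uncertainty by construction, and the impact-based systematic estimate of about 0.20 GeV matches the naive asymptotic figure quoted above.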
We generalise this discussion in the following, and argue that a sensible uncertainty decomposition should match the one obtained from fits in the covariance representation, and can also be obtained simply in the context of the PL. The Higgs-boson mass example is further discussed in Section 6.1.

Uncertainty decomposition in covariance representation
This section provides a short summary of standard results which can be found in the literature (see e.g. [15]). Gaussian uncertainties are assumed throughout this section. The general form of Eq. (1), in the presence of an arbitrary number of measurements $m_i$ and POIs $\vec\theta$, is

$$-\ln L_{\mathrm{cov}}(\vec\theta) = \frac{1}{2}\sum_{i,j}\big(m_i - t_i(\vec\theta)\big)\,(C^{-1})_{ij}\,\big(m_j - t_j(\vec\theta)\big), \tag{8}$$

where $t_i(\vec\theta)$ are models for the $m_i$, and $C$ is the total covariance of the measurements:

$$C_{ij} = V_{ij} + \sum_r \Gamma_{ir}\,\Gamma_{jr}, \tag{9}$$

where $V_{ij}$ represents the statistical covariance, and the second term collects all sources of systematic uncertainty. In general, $V_{ij}$ includes statistical correlations between the measurements, but is sometimes diagonal, in which case $V_{ij} = \sigma_i^2\,\delta_{ij}$. $\Gamma_{ir}$ represents the effect of systematic source $r$ on measurement $i$ (see Eq. (6)), and the outer product gives the corresponding covariance.
Imposing the restriction that the models $t_i$ are linear functions of the parameters of interest, i.e. $t_i(\vec\theta) = t_{0,i} + \sum_p h_{ip}\theta_p$, according to the Gauss-Markov theorem (see e.g. Refs. [7,16,17]) the POI estimators with smallest variance are found by solving $\partial \ln L_{\mathrm{cov}}/\partial\theta_p\big|_{\vec\theta = \hat{\vec\theta}} = 0$, giving

$$\hat\theta_p = \sum_i \lambda_{pi}\,(m_i - t_{0,i}), \tag{10}$$

and the corresponding covariance is obtained from the matrix of second derivatives,

$$\mathrm{cov}(\hat\theta_p, \hat\theta_q) = \big[(h^T C^{-1} h)^{-1}\big]_{pq}, \tag{11}$$

where the weights $\lambda_{pi}$ are given by

$$\lambda_{pi} = \sum_q \big[(h^T C^{-1} h)^{-1}\big]_{pq}\,\big(h^T C^{-1}\big)_{qi}. \tag{12}$$

In particular, using Eq. (9), the contribution to the uncertainties in the POIs of the statistical uncertainty in the measurements, and of each systematic source $r$, is given by

$$\mathrm{cov}_{\mathrm{stat}}(\hat\theta_p, \hat\theta_q) = \sum_{i,j}\lambda_{pi}\,V_{ij}\,\lambda_{qj}, \tag{15}$$

$$\mathrm{cov}_{\mathrm{syst},r}(\hat\theta_p, \hat\theta_q) = \Big(\sum_i \lambda_{pi}\,\Gamma_{ir}\Big)\Big(\sum_j \lambda_{qj}\,\Gamma_{jr}\Big). \tag{16}$$

We note that the BLUE averaging procedure, i.e. the unbiased² linear averaging of measurements of a common physical quantity, is just a special case of Eq. (8), where the measurements are direct estimators of the POIs. In the case of a single POI, $t_i = \theta$ ($t_{0,i} = 0$, $h = 1$).
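The decomposition of Eqs. (15) and (16) can be illustrated with a short numpy sketch (the model, numbers and variable names below are ours, for illustration only):

```python
import numpy as np

# Linear model t(theta) = t0 + H theta: one POI measured four times,
# with statistical covariance V and one systematic source Gamma
H = np.ones((4, 1))                      # h_ip: direct measurements of one POI
V = np.diag([0.2, 0.3, 0.25, 0.4]) ** 2  # statistical covariance (diagonal here)
Gamma = np.array([[0.1], [0.1], [0.3], [0.05]])  # effect of source r on each m_i

C = V + Gamma @ Gamma.T                      # total covariance, as in Eq. (9)
Cinv = np.linalg.inv(C)
cov_theta = np.linalg.inv(H.T @ Cinv @ H)    # POI covariance, Eq. (11)
lam = cov_theta @ H.T @ Cinv                 # weights lambda_pi, Eq. (12)

cov_stat = lam @ V @ lam.T                   # statistical component, Eq. (15)
cov_syst = (lam @ Gamma) @ (lam @ Gamma).T   # systematic component, Eq. (16)
# The components sum to the total POI covariance:
assert np.allclose(cov_stat + cov_syst, cov_theta)
# For a single POI with t_i = theta, the weights sum to unity:
assert np.allclose(lam.sum(), 1.0)
```

The final assertions express the consistency of the decomposition: statistical and systematic components add up to the total covariance, with no quadratic-subtraction ambiguity.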
A detailed discussion of template fits and of the propagation of fit uncertainties has recently been given in Ref. [20].
While the above summary is restricted to linear fits with constant uncertainties, Ref. [20] also addresses non-linear effects and uncertainties that scale with the measured quantity, i.e. $\Gamma_{ir} \propto m_i$.

Equivalence between the covariance and nuisance parameter representations
Similarly, still assuming Gaussian uncertainties, the general form of Eq. (6) is

$$-\ln L_{\mathrm{NP}}(\vec\theta, \vec\alpha) = \frac{1}{2}\sum_{i,j}\Big(m_i - t_i(\vec\theta) - \sum_r \Gamma_{ir}\alpha_r\Big)(V^{-1})_{ij}\Big(m_j - t_j(\vec\theta) - \sum_s \Gamma_{js}\alpha_s\Big) + \frac{1}{2}\sum_r (\alpha_r - a_r)^2. \tag{17}$$

The optimum of $L_{\mathrm{NP}}$ can be found by first minimizing Eq. (17) over $\vec\alpha$ for fixed $\vec\theta$ (i.e. profiling the nuisance parameters $\vec\alpha$); substituting the result into Eq. (17) (thus obtaining the profile likelihood $\ln L_{\mathrm{NP}}(\vec\theta, \hat{\vec\alpha}(\vec\theta))$); and minimizing over $\vec\theta$. The profiled nuisance parameters $\hat\alpha_r(\vec\theta)$ are given by Eq. (18), where $Q_{ri}$ was defined in Eq. (14), and the expression for their covariance by Eq. (19). Substituting Eq. (18) back into Eq. (17), and after some algebra, the profile likelihood can be written in the form of Eq. (20), where $S_{ij}$ was defined in Eq. (13). Moreover, it can be verified that $S = C^{-1}$, so that Eqs. (20) and (8) are in fact identical. In other words, $L_{\mathrm{cov}}(\vec\theta)$, in covariance representation, can be seen as the result of maximizing $L_{\mathrm{NP}}(\vec\theta, \vec\alpha)$ over $\vec\alpha$, for fixed $\vec\theta$: it is the profile likelihood. Consequently, the best values for the POIs are still given by Eq. (10), their uncertainties by Eq. (11), and the error decomposition of Section 3 applies.
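Numerically, the identity $S = C^{-1}$ is the Woodbury matrix identity, $(V + \Gamma\Gamma^T)^{-1} = V^{-1} - V^{-1}\Gamma\,(1 + \Gamma^T V^{-1}\Gamma)^{-1}\,\Gamma^T V^{-1}$, which can be checked directly (the dimensions and random seed below are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2
A = rng.normal(size=(n, n))
V = A @ A.T + n * np.eye(n)         # a symmetric positive-definite stat covariance
Gamma = rng.normal(size=(n, r))     # systematic effects of r unit-Gaussian NPs

# Covariance representation: C = V + Gamma Gamma^T
C = V + Gamma @ Gamma.T

# Nuisance-parameter representation after profiling the alphas (Woodbury):
Vinv = np.linalg.inv(V)
S = Vinv - Vinv @ Gamma @ np.linalg.inv(
    np.eye(r) + Gamma.T @ Vinv @ Gamma) @ Gamma.T @ Vinv

assert np.allclose(S, np.linalg.inv(C))   # the two representations coincide
```

The left-hand side is the effective inverse covariance obtained by profiling the nuisance parameters; the right-hand side is the inverse of the total covariance used in the covariance representation.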
The observation above is not new, and has, to the knowledge of the authors, been discussed at least in Refs. [21–27] for diagonal statistical uncertainties, and in Refs. [28,29] in the general case. It is also briefly mentioned in Ref. [20]. The equivalence between the covariance and nuisance parameter representations is recalled here to emphasize that profile-likelihood fits should obey the uncertainty decomposition familiar from fits in the covariance representation.
For any value of $\vec\theta$, the estimators of the nuisance parameters and their covariance are given by Eqs. (18) and (19). The estimator $\hat{\vec\alpha}$ is given by the product of the differences between the measurements and the model, $m_i - t_i(\vec\theta)$, and a factor $Q$ determined only from the initial systematic and experimental uncertainties. This factor can be calculated from the basic inputs to the fit. Nuisance-parameter pulls ($\hat\alpha_r$) and constraints ($\mathrm{cov}(\hat\alpha_r, \hat\alpha_r)$) can thus also be calculated a posteriori in the context of a POI-only fit in covariance representation, from the same inputs as those defining $C$, without explicitly introducing $\vec\alpha$, $\vec a$ in the expression of the likelihood.
This procedure can be repeated by first minimizing over $\vec\theta$ for given $\vec\alpha$, substituting the result into Eq. (17), and minimising the result over the nuisance parameters $\vec\alpha$. This yields the NP covariance matrix elements $\mathrm{cov}(\hat\alpha_r, \hat\alpha_s)$ (Eq. (23)), while the covariance between the NPs and POIs, $\mathrm{cov}(\hat\alpha_r, \hat\theta_p)$, is given by Eq. (26). Equations (11), (23) and (26) determine the full covariance matrix of the fitted parameters.
Importantly, Eq. (26) can be further simplified to

$$\mathrm{cov}(\hat\alpha_r, \hat\theta_p) = -\sum_i \lambda_{pi}\,\Gamma_{ir}, \tag{27}$$

which directly provides the systematic uncertainty decomposition. The inner product of Eq. (27) with itself gives the systematic covariance, Eq. (16), and the statistical uncertainty can be obtained by subtracting the result in quadrature from the total uncertainty in $\hat\theta_p$. In other words, the contribution of every systematic source to the total uncertainty is directly given by the covariance between the corresponding NP and the POI.

Uncertainty decomposition from shifted observables
While it is a common and relevant approximation, probability models are in general not based on Gaussian uncertainty distributions. Small samples are treated using the Poisson distribution, and the constraint terms associated to nuisance parameters can assume arbitrary forms. The best-fit values of the POIs are however always functions of the measurements and of the central values of the auxiliary measurements, i.e. $\hat\theta_p = \hat\theta_p(\vec m, \vec a)$. Assuming no correlations between these observables, the uncertainty in $\hat\theta_p$ then follows from linear error propagation:

$$\sigma^2_{\hat\theta_p} = \sum_i \left(\frac{\partial \hat\theta_p}{\partial m_i}\,\sigma_i\right)^2 + \sum_r \left(\frac{\partial \hat\theta_p}{\partial a_r}\right)^2, \tag{28}$$

where $\sigma_i$ is the uncertainty in $m_i$, the uncertainty in $a_r$ is 1 by definition of $a_r$ and $\alpha_r$ (Section 2), and $\partial\hat\theta_p/\partial m_i$ and $\partial\hat\theta_p/\partial a_r$ are the sensitivities of the fit result to these observables. The first sum in Eq. (28) reflects the fluctuations of the measurements, i.e. the statistical uncertainty (each term of the sum represents the contribution of a given $m_i$, measurement or bin), and the second sum collects the contributions of the systematic uncertainties.
The contribution of a given source of uncertainty can thus be assessed by varying the corresponding measurement or global observable by one standard deviation in the expression of the likelihood, and repeating the fit otherwise unchanged. The corresponding uncertainty is obtained from the difference between the values of $\hat\theta_p$ in the varied and nominal fits. This statement can be verified explicitly for the Gaussian, linear fits discussed in the previous section. Now allowing for correlations between the measurements, varying $m_k$ within its uncertainty amounts to replacing $m_i \to m_i + L_{ik}$ in the likelihood (Eq. (29)), where $L$ results from the Cholesky decomposition $V = LL^T$, and column $k$ of $L$ represents the correlated effect on all measurements $m_i$ of varying $m_k$ within its uncertainty. In the case of uncorrelated measurements, $L_{ik} = \sigma_i\,\delta_{ik}$ and only $m_k$ is varied, as in Eq. (28). After minimization, the difference between the varied and nominal fit results is

$$\hat\theta_p^{(k)} - \hat\theta_p = \sum_i \lambda_{pi}\,L_{ik}. \tag{30}$$

Similarly, the uncertainty in $a_t$ can be obtained by shifting $a_t \to a_t + 1$ in the likelihood (Eq. (31)), resulting in

$$\hat\theta_p^{(t)} - \hat\theta_p = -\sum_i \lambda_{pi}\,\Gamma_{it}, \tag{32}$$

as in Eq. (27). The differences between the varied and nominal values of $\hat\theta_p$ match the expressions obtained above for the corresponding uncertainties. In particular,

$$\sum_k \big(\hat\theta_p^{(k)} - \hat\theta_p\big)\big(\hat\theta_q^{(k)} - \hat\theta_q\big) = \sum_{i,j}\lambda_{pi}\,V_{ij}\,\lambda_{qj} \tag{33}$$

reproduces the total statistical covariance in Eq. (15), and

$$\big(\hat\theta_p^{(t)} - \hat\theta_p\big)\big(\hat\theta_q^{(t)} - \hat\theta_q\big) = \Big(\sum_i \lambda_{pi}\,\Gamma_{it}\Big)\Big(\sum_j \lambda_{qj}\,\Gamma_{jt}\Big) \tag{34}$$

is the contribution of systematic source $t$ to the systematic covariance in Eq. (16).
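The shifted-observable prescription can be checked on a small linear-Gaussian toy model (the dimensions, numbers and the stacked-observable construction below are ours, for illustration): refitting with one global observable shifted by one standard deviation reproduces the weight-based systematic component.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, p = 4, 2, 1                       # measurements, NPs, POIs
H = np.ones((n, p))                     # linear model t = H theta
Gamma = rng.normal(scale=0.3, size=(n, r))   # systematic effects (unit NPs)
V = np.diag(rng.uniform(0.1, 0.5, size=n) ** 2)
m = rng.normal(10.0, 0.3, size=n)       # toy measurements

# Stack measurements and auxiliary observables, y = [m; a], with chi^2 weights W
X = np.block([[H, Gamma], [np.zeros((r, p)), np.eye(r)]])
W = np.block([[np.linalg.inv(V), np.zeros((n, r))],
              [np.zeros((r, n)), np.eye(r)]])

def fit(a):
    """Minimise (y - X beta)^T W (y - X beta) for beta = (theta, alpha)."""
    y = np.concatenate([m, a])
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

nominal = fit(np.zeros(r))
shifted = fit(np.eye(r)[0])             # shift global observable a_0 by +1 sigma
delta_theta = shifted[:p] - nominal[:p] # systematic component of source 0

# Analytical cross-check: the offset equals the weight-based component
C = V + Gamma @ Gamma.T
Cinv = np.linalg.inv(C)
lam = np.linalg.inv(H.T @ Cinv @ H) @ H.T @ Cinv
assert np.allclose(delta_theta, -(lam @ Gamma)[:, 0])
```

Only the difference between the varied and nominal fits matters: the actual values of the toy measurements cancel in the offset, as they do in the analytical derivation.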
As in Section 4, the total uncertainty in the NPs can be obtained by minimizing the likelihood with respect to $\vec\theta$ for fixed $\vec\alpha$, replacing $\vec\theta$ by its expression, and minimizing the result with respect to $\vec\alpha$. This yields the contribution of the measurements to the uncertainty in $\vec\alpha$ (Eq. (35)) and the systematic contributions (Eq. (37)). Summing Eqs. (35) and (37) in quadrature recovers the total NP covariance matrix in Eq. (23), as expected.
Finally, the covariance between the NPs and POIs can be obtained analytically by summing the products of the corresponding offsets, obtained from the statistical and systematic variations (Eq. (38)), which again matches the expression for $\mathrm{cov}(\hat\alpha_r, \hat\theta_p)$ in Eq. (26).
The identities in Eqs. (33), (34), (37) and (38) can be obtained analytically only in the context of Gaussian uncertainties, but the uncertainty decomposition through fits with shifted observables results from the Taylor expansion of Eq. (28) and is therefore general. While analytical fits can use the representation that is the most practical for their particular purpose, numerical profile likelihood fits, assuming Poisson statistics and/or non-Gaussian nuisance parameter distributions, can still rely on Eq. (28) to obtain a consistent uncertainty decomposition, where each component directly reflects the contribution of fluctuations in the corresponding source to the total variance of the measurement. In this way, uncertainty components preserve a universal meaning, regardless of the statistical method used for a given measurement.
In practice, the uncertainty can be propagated using one-standard-deviation shifts in $\vec m$ and $\vec a$ as above, or using the Monte Carlo error propagation method, where $\vec m$ or $\vec a$ are randomized within their respective probability density functions, and the corresponding uncertainty in the measurement is determined from the variance of the fit results.³ The latter method makes the correspondence between uncertainty contributions and the effect of fluctuations of the corresponding sources (cf. Section 1) explicit. It is also more general, and gives more precise results in case of significant asymmetries or tails in the uncertainty distributions. It can also be more efficient when simultaneously estimating the variance contributed by a large group of sources of uncertainty. Similarly, the present method can be generalized to unbinned measurements using data resampling techniques for the extraction of statistical uncertainty components [30].
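The Monte Carlo variant can be sketched as follows (a toy model with invented numbers; the global observables carry unit Gaussian constraints by construction): statistical toys fluctuate only the measurements, systematic toys only the global observables, and the variances of the refitted POI give the corresponding components.

```python
import numpy as np

rng = np.random.default_rng(3)
H = np.ones((3, 1))                         # three measurements of one POI
Gamma = np.array([[0.2], [0.1], [0.3]])     # one systematic source
V = np.diag([0.3, 0.4, 0.5]) ** 2           # statistical covariance
C = V + Gamma @ Gamma.T
Cinv = np.linalg.inv(C)
cov_theta = np.linalg.inv(H.T @ Cinv @ H)
lam = cov_theta @ H.T @ Cinv                # weights; theta_hat = lam (m - Gamma a)

m0 = np.full(3, 5.0)
ntoys = 200_000
Lchol = np.linalg.cholesky(V)               # V = L L^T

# Statistical toys: fluctuate only the measurements m
m_toys = m0 + (Lchol @ rng.normal(size=(3, ntoys))).T
theta_stat = (m_toys @ lam.T).ravel()

# Systematic toys: fluctuate only the global observables a (unit Gaussians)
a_toys = rng.normal(size=(ntoys, 1))
theta_syst = ((m0 @ lam.T) - (a_toys @ (lam @ Gamma).T)).ravel()

var_stat, var_syst = theta_stat.var(), theta_syst.var()
# var_stat ~ lam V lam^T and var_syst ~ (lam Gamma)^2, summing to the total
```

With a few hundred thousand toys the Monte Carlo variances agree with the analytical components at the sub-percent level; fluctuating $\vec m$ and $\vec a$ simultaneously would return the total variance directly.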

Combination of two measurements
Let us consider again the concrete case of the Higgs-boson mass $m_H$ described in Section 2, which will serve as a simple example with only one parameter of interest ($m_H$) and two measurements. We will further assume that both the statistical and systematic uncertainties are uncorrelated between the two channels, which is not unreasonable given that they correspond to different events and that the dominant sources of systematic uncertainty are indeed uncorrelated. We will take numerical values from the actual ATLAS [14] and CMS [31] Run 2 measurements, as well as from an imaginary case exaggerating the numerical features of the ATLAS Run 2 measurement.
For each case, the decomposition of uncertainties into statistical and systematic components will be compared between the two approaches: uncertainty decomposition and impacts. In addition, this is done as a function of a luminosity factor $k$, which is used to scale the statistical uncertainty of the inputs by $1/\sqrt{k}$ (while systematic uncertainties are kept unchanged). The published results in the example under consideration are for $k = 1$. Though not shown on the plots, we have also checked numerically that the uncertainty decomposition (as usually done in covariance representation methods or BLUE) can be reproduced from a profile likelihood fit with shifted observables (Section 5), while the impacts (as usually done in profile likelihood fits) can also be recovered from the BLUE approach, simply by using the statistical uncertainties alone to compute the combination weights $\lambda_i'$ as in Eq. (7) (i.e. repeating the combination without systematic uncertainties). In addition, both approaches have been checked to yield the same total uncertainty in all cases.

CMS results
We first study the combination of the CMS Run 2 results [31]: $\sigma_{\mathrm{stat}}^{\gamma\gamma} = 0.18$ GeV, $\sigma_{\mathrm{syst}}^{\gamma\gamma} = 0.19$ GeV; $\sigma_{\mathrm{stat}}^{4\ell} = 0.19$ GeV, $\sigma_{\mathrm{syst}}^{4\ell} = 0.09$ GeV. The results of our toy combination are shown in Fig. 1. This figure, as well as the following ones, comprises two panels: the inputs to the combination on the left, and the statistical and systematic uncertainties as obtained in either the uncertainty decomposition or impact approaches on the right. The actual published numbers [31] correspond to $k = 1$ (black vertical line).
With this first simple case, where the two measurements have relatively comparable uncertainties, little difference is found between the two approaches, though the uncertainty decomposition gives a larger statistical uncertainty than the impact one, as expected.The difference becomes larger for higher values of the luminosity factor.

ATLAS results
We now consider the ATLAS Run 2 results [14]: $\sigma_{\mathrm{stat}}^{\gamma\gamma} = 0.21$ GeV, $\sigma_{\mathrm{syst}}^{\gamma\gamma} = 0.34$ GeV; $\sigma_{\mathrm{stat}}^{4\ell} = 0.36$ GeV, $\sigma_{\mathrm{syst}}^{4\ell} = 0.09$ GeV. As shown in Fig. 2, differences between the two uncertainty decompositions are now more evident, already for the nominal uncertainty but even more so when extrapolating to larger luminosities (smaller statistical uncertainties). Again, the uncertainty decomposition gives a larger statistical uncertainty than the impact one.

Imaginary extreme case
Finally, we consider an extreme case, with $\sigma_{\mathrm{stat}}^{\gamma\gamma} = 0.1$ GeV, $\sigma_{\mathrm{syst}}^{\gamma\gamma} = 0.5$ GeV; $\sigma_{\mathrm{stat}}^{4\ell} = 0.5$ GeV, $\sigma_{\mathrm{syst}}^{4\ell} = 0.1$ GeV, exaggerating the features of the ATLAS combination (i.e., combining a statistically-dominated measurement with a systematically-limited one). Dramatic differences between the two approaches for uncertainty decomposition are observed in Fig. 3: for the nominal luminosity, while the uncertainty decomposition reports equal statistical and systematic uncertainties, the impacts are dominated by the systematic uncertainty.

W -boson mass fits
The uncertainty decomposition discussed above is further illustrated with a toy measurement of the W-boson mass using pseudo-data, where the results obtained from the profile likelihood fit and from the analytical calculation are compared. Since the measurement of the W mass is a typical shape analysis, in which the fit to the distributions is parameterized by both the POI and NPs, the conclusions drawn from this example can in principle be generalized to all kinds of shape analyses. While the effect of varying the W mass is parameterized by the POI, three representative systematic sources of a W mass measurement at hadron colliders [32–35] are parameterized by NPs in the probability model: the lepton momentum scale uncertainty, the hadronic recoil (HR) resolution uncertainty and the $p_T^W$ modelling uncertainty. The W mass is extracted from the $p_T^\ell$ or $m_T$ spectra, since measurements based on these two distributions have very different sensitivities to certain types of systematic uncertainties.

Simulation
The signal process under consideration is the charged-current Drell-Yan process [36] $pp \to W^- \to \mu^-\nu$ at a center-of-mass energy of $\sqrt{s} = 13$ TeV, generated using Madgraph, with initial- and final-state corrections obtained using Pythia8 [37,38]. Detailed information on the event generation is listed in Table 2.

Table 2: Madgraph+Pythia8 [37,38] event generation for MC samples. Events with an off-shell boson are excluded in the event generation at parton level, leading to a total cross-section of 6543 pb.
Kinematic distributions for different values of the W mass are obtained in simulation via Breit-Wigner reweighting [39]. The systematic variations of $p_T^W$ are implemented using a linear reweighting as a function of $p_T^W$ before event selection, then taking only the shape effect on the underlying $p_T^W$ spectrum.

Fig. 1: Uncertainty decomposition as a function of a luminosity scaling factor, using CMS Run 2 results [31]. Left: size of the statistical (stat) and systematic (syst) uncertainties for γγ and 4ℓ. Right: decomposition of uncertainties on the combination using either the uncertainty decomposition or impacts approach.

Fig. 2: Uncertainty decomposition as a function of a luminosity scaling factor, using ATLAS Run 2 results [14]. Left: size of the statistical (stat) and systematic (syst) uncertainties for γγ and 4ℓ. Right: decomposition of uncertainties on the combination using either the uncertainty decomposition or impacts approach.
At reconstruction level, the $p_T$ of the bare muon is smeared by 2% following a Gaussian distribution. A source of systematic uncertainty in the calibration of the muon momentum scale is considered. The hadronic recoil $\vec u_T$ is taken to be the opposite of $\vec p_T^{\,W}$ and smeared by a constant 6 GeV in both directions of the transverse plane. The second source of experimental systematic uncertainty is taken to be the uncertainty in the calibration of the hadronic recoil resolution. The information about the W mass templates and the systematic variations is summarized in Table 3.
The detector smearing, as well as the event selections listed in Table 4, are chosen to be similar to those of a realistic W mass measurement. The reconstructed muon $p_T$ and $m_T$ spectra in the fit range after the event selection are shown in Figure 4, along with the relevant templates and systematic variations.

Fig. 3: Uncertainty decomposition as a function of a luminosity scaling factor, using $\sigma_{\mathrm{stat}}^{\gamma\gamma} = 0.1$ GeV, $\sigma_{\mathrm{syst}}^{\gamma\gamma} = 0.5$ GeV; $\sigma_{\mathrm{stat}}^{4\ell} = 0.5$ GeV, $\sigma_{\mathrm{syst}}^{4\ell} = 0.1$ GeV. Left: size of the statistical (stat) and systematic (syst) uncertainties for γγ and 4ℓ. Right: decomposition of uncertainties on the combination using the uncertainty decomposition or impact approach.
Table 4: Detector smearing and event selection for Madgraph+Pythia8 samples. The cut-flow efficiency of the event selection is about 29%.

Uncertainty decomposition
The profile likelihood fit is performed using HistFactory [40] and RooFit [41]. Its output includes the fitted central values and uncertainties for all the free parameters. The uncertainty components of the profile likelihood fit results are obtained by repeating the fit on bootstrap samples, obtained by resampling the pseudo-data used to compute the results or the central values of the auxiliary measurements, and then computing the spread of offsets in the POI. The analytical solution of the fit can be calculated following the procedures in Section 5. For this exercise, the pseudo-data is chosen to be the nominal simulation, but with the statistical power of the data. The effect of changing the luminosity scale factor is emulated by repeating the fit with all the reconstructed distributions multiplied by an overall factor. The setups of the fits for the validation are summarized in Table 5.
Figures 5 and 6 present the uncertainty decomposition as a function of a luminosity scale factor used to scale the statistical precision of the simulated sample. The error bars of the uncertainty decomposition for the profile likelihood fit reflect the limited number of toys. In general, the uncertainty components derived from the numerical profile likelihood fit and the analytical solution match each other within the error bars. The discrepancies at certain points can be assigned to the numerical stability of the PL fit, which shows up when the uncertainty components become too small (typically < 2 MeV). The uncertainty decomposition is summarized in Table 6, where the total uncertainty is broken down into data statistical and total systematic uncertainties using the shifted-observable method, and compared with the results using the conventional impact approach for the PL fit. With 10 times higher luminosity, the statistical uncertainty of the impact approach decreases by exactly a factor of $\sqrt{10}$, while that of the shifted-observable approach introduced in this study decreases more slowly.
Table 7 shows the analytical systematic uncertainty decomposition for the $m_T$ and $p_T^\ell$ fits with nominal luminosity, together with the NP-POI covariance matrix elements obtained from the numerical profile-likelihood fit. This confirms that the systematic uncertainty components can be directly read from the PL fit covariance matrix, as discussed around Eq. (27). Finally, Figure 7 compares the post-fit NP uncertainties between the numerical profile likelihood fit and the analytical calculation. The two methods agree at the 0.1 per-mil level.

Use of decomposed uncertainties in subsequent fits or combinations
Uncertainty decompositions obtained with the present method are meaningful only if the results can be used consistently in downstream applications, such as measurement combinations or interpretation fits in terms of specific physics models. In particular, uncertainty components that are common to several measurements generate correlations which should be evaluated properly. This happens when measurements are statistically correlated or when they are impacted by shared systematic uncertainties.
As a final validation of the presented method, we test the combination of profile-likelihood fits of the same observable. Such a combination can be performed either using the decomposed uncertainties, or in terms of the PL fit outputs, i.e. the fitted values of the POIs and NPs and their covariance matrix.
The combination is performed starting from Eq. (8) which, as noted in Section 3, can be applied to linear measurement averaging by adapting the definition of $t(\vec\theta)$. In the case of a single combined parameter, $t_i = \theta$; for a simultaneous combination of several parameters, $t_i = \sum_p U_{ip}\theta_p$, where $U_{ip}$ is 1 when measurement $i$ is an estimator of POI $p$, and 0 otherwise [8]. This gives

$$-\ln L(\vec\theta) = \frac{1}{2}\sum_{i,j}\Big(m_i - \sum_p U_{ip}\theta_p\Big)(C^{-1})_{ij}\Big(m_j - \sum_q U_{jq}\theta_q\Big),$$

which can be solved as in Section 3.
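A minimal numpy sketch of such a combination (all numbers below are invented for illustration and are not the results of the text):

```python
import numpy as np

# Hypothetical example: two m_W results (GeV) from fits to two distributions
# of the same data, hence strongly correlated (values and correlation invented)
vals = np.array([80.360, 80.354])
rho = 0.8
s1, s2 = 0.016, 0.018
C = np.array([[s1**2, rho * s1 * s2],
              [rho * s1 * s2, s2**2]])   # total covariance of the two results

U = np.ones((2, 1))                      # both are estimators of the same POI
Cinv = np.linalg.inv(C)
cov_cmb = np.linalg.inv(U.T @ Cinv @ U)  # combined variance
lam = cov_cmb @ U.T @ Cinv               # combination weights
theta_cmb = (lam @ vals)[0]              # combined value
```

With a large positive correlation, the more precise input dominates the weights; the same machinery applies unchanged when the measurement vector also contains NPs and $U$ maps each entry to its combined parameter.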
As an illustration, we use the m_W fits to the p_T^ℓ and m_T distributions described in the previous section. For a combination based on the uncertainty decomposition, there are two measurements (the POIs of the p_T^ℓ and m_T fits), one combined value, and the covariance C is a 2×2 matrix constructed from the decomposed uncertainties using Eq. (9).
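The two-measurement case can be sketched numerically. All numbers below are invented placeholders (not the m_W fit results): each measurement carries a statistical component and components from two shared systematic sources, assumed fully correlated between the measurements, and the combination is the usual generalised least-squares average.

```python
import numpy as np

# Sketch of a two-measurement combination from decomposed uncertainties
# (all numbers invented for illustration, in MeV).
stat = np.array([12.0, 14.0])                 # statistical components
syst = np.array([[5.0, 3.0],                  # measurement 1: sources A, B
                 [4.0, 6.0]])                 # measurement 2: sources A, B

# Eq. (9)-style covariance: C_ij = delta_ij * stat_i^2 + sum_k s_ik * s_jk
C = np.diag(stat**2) + syst @ syst.T

# Both measurements estimate the same POI, so U is a column of ones
m = np.array([80370.0, 80385.0])              # invented central values
U = np.ones((2, 1))
Cinv = np.linalg.inv(C)
var_comb = np.linalg.inv(U.T @ Cinv @ U)      # combined variance
mu_comb = (var_comb @ U.T @ Cinv @ m).item()  # combined central value

print("combined value:", mu_comb)
print("combined uncertainty:", float(np.sqrt(var_comb)))
```

The combination weights sum to unity and the combined variance is never larger than that of either input, as required of a linear unbiased average.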
For a combination based on the PL fit outputs, there are in this example eight measurements (one POI and three NPs in each of the p_T^ℓ and m_T fits), four combined parameters, and C is an 8×8 matrix. The diagonal 4×4 blocks are the post-fit covariance matrices of the individual fits (p_T^ℓ and m_T). The off-diagonal blocks reflect systematic and/or statistical correlations between the p_T^ℓ and m_T fits, and can be obtained analytically following the methods of Section 5. For two fits f_1 and f_2, each off-diagonal covariance matrix element receives two contributions: a first, statistical sum, which typically occurs when the fitted distributions are projections of the same data, as is the case for the p_T^ℓ and m_T distributions in m_W fits; and a second sum representing shared systematic sources of uncertainty.
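This block structure can be sketched as follows. The sensitivity matrices A_f (outputs vs. shared data bins) and B_f (outputs vs. shared systematic sources) are invented placeholders; assuming each fit's outputs are linear in the shared inputs, the off-diagonal block is the sum of a statistical and a systematic term, and the eight outputs are then combined onto four parameters with a stacked-identity U matrix.

```python
import numpy as np

# Sketch of the covariance between two PL fits sharing data bins and
# systematic sources (all matrices invented for illustration).
# Assume each fit's output vector is linear in the shared inputs:
#   p_f = A_f @ m + B_f @ theta_aux, so that
#   Cov(p_1, p_2) = A_1 @ S @ A_2.T   (statistical term)
#                 + B_1 @ B_2.T       (shared systematics, unit constraints)
rng = np.random.default_rng(1)
n_bins, n_src, n_par = 6, 3, 4
S = np.diag(rng.uniform(0.5, 1.5, n_bins))    # shared data covariance
A1, A2 = rng.normal(size=(n_par, n_bins)), rng.normal(size=(n_par, n_bins))
B1, B2 = rng.normal(size=(n_par, n_src)), rng.normal(size=(n_par, n_src))

def block(Aa, Ab, Ba, Bb):
    """One covariance block: statistical sum plus shared-systematics sum."""
    return Aa @ S @ Ab.T + Ba @ Bb.T

# Assemble the full 8x8 covariance of the two fits' (POI, NP) outputs
C = np.block([[block(A1, A1, B1, B1), block(A1, A2, B1, B2)],
              [block(A2, A1, B2, B1), block(A2, A2, B2, B2)]])

# Combine the eight outputs onto four parameters: U stacks two identities
U = np.vstack([np.eye(n_par), np.eye(n_par)])
Cinv = np.linalg.inv(C)
cov_comb = np.linalg.inv(U.T @ Cinv @ U)
print("combined POI uncertainty:", np.sqrt(cov_comb[0, 0]))
```

Building C this way guarantees it is symmetric and positive semi-definite by construction, since it is the covariance of a single linear map applied to the shared inputs.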
Results of this comparison are presented in Figure 8 and Table 8, which summarize the fit precision as a function of the assumed luminosity. The uncertainty decomposition method and the combination of the PL fit results agree to better than 0.1 MeV. For completeness, the result of a direct joint fit to the two distributions is shown as well; slightly more precise results are obtained in this case, as expected, especially for high integrated luminosities where systematic uncertainties dominate.

Fig. 5: Uncertainty decomposition for the muon p_T fit, compared between the numerical and the analytical PL fit. The total systematic uncertainty of the profile-likelihood fit is the quadratic sum of the three components.
Fig. 6: Uncertainty decomposition for the m_T fit, compared between the numerical and the analytical PL fit. The total systematic uncertainty of the profile-likelihood fit is the quadratic sum of the three components.
Table 7: Left: list of systematic uncertainty contributions and the total uncertainty, in MeV, for the m_T and p_T^ℓ fits performed in covariance representation. Centre, right: post-fit covariance among the three NPs associated with these systematic uncertainties and the POI, for the profile-likelihood fits to the m_T and p_T^ℓ distributions, respectively.
We note that a combination of PL fit results based on the nuisance-parameter representation, Eq. (17), as proposed in Ref. [42], seems difficult to justify rigorously. The principal reason is that Eq. (17) explicitly relies on the absence of correlations, prior to the combination, between the sources of uncertainty encoded in the covariance matrix V and the uncertainties treated as nuisance parameters. Since the input measurements result from PL fits, the POI of each input measurement is in general correlated with the corresponding NPs. One possibility would be to add terms to Eq. (17) that describe these missing correlations. It could also be envisaged to diagonalize the covariance of the inputs and perform the fit in this new basis, but this would work only if all measurements can be diagonalized by the same linear transformation, which is in general not the case.

Table 8: Summary of m_T and p_T^ℓ PL fit results. Combinations are produced using the uncertainty decomposition method, and using the covariance of the PL fit results.

Conclusion
We have studied the decomposition of fit uncertainties in two statistical methods often used in high-energy physics, namely fits in covariance representation and profile-likelihood fits. The equivalence between the two methods in the Gaussian limit is recalled, and a complete set of expressions is given for the fit uncertainties in the parameters of interest, the nuisance parameters and their correlations. A direct correspondence is established between the standard uncertainty decomposition in covariance representation and the (POI, NP) covariance matrix elements in nuisance-parameter representation.
Numerical profile-likelihood analyses generally define statistical and systematic uncertainty components from the results of statistics-only fits and from systematic impacts, but this identification does not hold. The uncertainty of statistics-only fits underestimates the statistical uncertainty of fits including systematics, and systematic impacts correspondingly overestimate the genuine systematic uncertainty contributions. Impacts therefore cannot be used as inputs to subsequent measurement combinations or interpretation fits.
We introduce a set of analytical and numerical methods to remove this shortcoming. In the Gaussian approximation, a consistent uncertainty decomposition can be extracted directly from the PL fit covariance matrix. For general (non-Gaussian or non-linear) profile-likelihood fits, we propose a procedure based on shifted observables, from which a consistent uncertainty decomposition can be obtained rigorously. We illustrate these points by means of simple examples, and show that profile-likelihood fit results with properly decomposed uncertainties can be used consistently in downstream combinations or fits.

Fig. 4: Reconstructed muon p_T and m_T distributions of the Madgraph+Pythia8 samples. Top: kinematic spectra. Bottom: ratio of the variations to the nominal, with the statistical uncertainty indicated by the error band.
Breakdown of systematic uncertainties.
Breakdown of systematic uncertainties, m_T fit.

Fig. 7: Post-fit NP uncertainties at different values of the luminosity scale factor. The results of the numerical and the analytical PL fits are compared in the ratio panel.

Fig. 8: Summary of m_T and p_T^ℓ PL fit results. Combinations are produced using the uncertainty decomposition method, and using the covariance of the PL fit results.

Table 3: W mass templates and systematic variations for the Madgraph+Pythia8 samples.

Table 5: Configuration of the m_W fits. The luminosity scale factor of 1.0 corresponds to 76.42 pb⁻¹.

Table 6: Uncertainty decomposition for the muon p_T^ℓ and m_T fits, for two different values of the luminosity scale factor, using the shifted-observable method and the impact method for the PL fit. The errors arise from the limited number of bootstrap toys. The baseline luminosity is 76.42 pb⁻¹.