ReSyst: a novel technique to Reduce the Systematic uncertainty for precision measurements

We are in an era of precision measurements at the Large Hadron Collider. However, the precision that can be achieved on some of the measurements is limited by large systematic uncertainties. This paper introduces a new technique to reduce the systematic uncertainty by quantifying the systematic impact of single events and correlating it with event observables to identify parts of the phase space that are more sensitive to systematic effects. A proof of concept is presented by means of a simplified top quark mass estimator applied to simulated events. Even without a thorough optimization, it is shown that the total systematic uncertainty can be reduced by a factor of at least two.


Introduction
With the large amount of data collected each year, the Large Hadron Collider (LHC) at CERN has entered an era of precision measurements. For the measurement of some of the standard model parameters, the systematic uncertainties largely limit the achievable precision. This is for instance the case for the top quark mass measurement, where the systematic uncertainty is already eight times larger than the statistical uncertainty for the most precise measurement [1]. Over the next five years, the measurements at the LHC will be performed with up to ten times more data, reducing the statistical uncertainty on the most precise measurements to a negligible level. Therefore, the focus of the community is to reduce the systematic uncertainty for the precision measurement of the top quark mass. The ATLAS and CMS collaborations expect to reduce the dominant systematic uncertainties due to the jet energy scale, the b-quark fragmentation and the modelling of tt̄ events in the simulation [2], and therefore to achieve a precision of around 300 MeV on the top quark mass at the end of the LHC Run 3.
In this article, a novel technique is presented to trade statistical precision for a reduced systematic uncertainty. The ReSyst technique introduced here is complementary to existing efforts to reduce the total uncertainty. The method revolves around quantifying the systematic effect of each event in order to identify events that have a large impact on the total systematic uncertainty. The systematic effect of the event is then correlated with event observables to identify regions of the phase space containing events that induce a large systematic effect. The systematic uncertainty can then be reduced by rejecting events in specific parts of the phase space.
The concept is illustrated with a simplified top quark mass estimator using simulated tt̄ events. The muon+jets decay channel is used, i.e. with one of the W bosons from the top quark decaying to quarks (hadronic leg) and the other one to a muon and corresponding neutrino (leptonic leg). In total, 10 million of these events are generated in proton-proton collisions at a centre-of-mass energy of 13 TeV, corresponding to about 80 fb$^{-1}$, using the POWHEG v2 event generator [3][4][5][6] and with the top quark mass set to 172.5 GeV. The generated events are interfaced with PYTHIA 8.2 [7] for the parton showering, hadronization and particle decay using the underlying event tune CUETP8M2T4 [8] and further processed with DELPHES v3.4.2pre03 [9][10][11] to simulate the CMS detector response. The anti-kt jet clustering algorithm [14] in the FastJet package [12,13] is used to reconstruct jets when running DELPHES. The default DELPHES CMS parameter card is used with the exception of the b-tagging efficiency and misidentification probabilities, for which the parametrizations for the medium working point of the DeepCSV algorithm are used [15].

Top quark mass estimator
Events are selected if they have at least one muon with transverse momentum (p T ) above 25 GeV within the tracker acceptance (|η| < 2.4) and at least four jets with p T > 30 GeV, also within the CMS tracker acceptance. At least two of the jets passing the p T and η requirements should be identified as originating from b quarks. The selection efficiency for the signal tt̄ events in the muon+jets decay channel is around 15%. Other tt̄ decays or background processes are not considered in the analysis, which is motivated by the fact that the modelling of the background does not induce a dominant systematic uncertainty [1]. Moreover, the applied selection criteria reduce the fraction of background events to less than 10% of the number of selected events [1].
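The baseline selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Jet` and `Muon` containers and the function `passes_baseline` are hypothetical names, and the jet acceptance is assumed to be the same |η| < 2.4 window quoted for the muon.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Jet:
    pt: float        # transverse momentum in GeV
    eta: float       # pseudorapidity
    b_tagged: bool   # True if identified as originating from a b quark


@dataclass
class Muon:
    pt: float        # transverse momentum in GeV
    eta: float       # pseudorapidity


def passes_baseline(muons: List[Muon], jets: List[Jet],
                    mu_pt_min: float = 25.0, jet_pt_min: float = 30.0,
                    eta_max: float = 2.4) -> bool:
    """Return True if the event passes the baseline selection.

    Assumption: jets use the same |eta| < 2.4 acceptance as the muon.
    """
    good_muons = [m for m in muons if m.pt > mu_pt_min and abs(m.eta) < eta_max]
    good_jets = [j for j in jets if j.pt > jet_pt_min and abs(j.eta) < eta_max]
    # Only jets passing the pT and eta requirements count towards the b-tag requirement.
    n_b_tagged = sum(j.b_tagged for j in good_jets)
    return len(good_muons) >= 1 and len(good_jets) >= 4 and n_b_tagged >= 2
```

The additional requirement on the three-jet mass is not part of this sketch.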
The three highest-p T jets are used to reconstruct the top quark corresponding to the hadronic leg of the tt̄ decay and its mass. This three-jet mass, m jjj , is used to construct probability density functions for the correctly (wrongly) matched jet-quark combinations corresponding (or not) to the top quark decay. Only events which have an m jjj value between 130 and 220 GeV are considered. After this additional requirement, the total event selection efficiency drops to 1.8%. The probability density functions for the selected events are used to construct a likelihood:

$$L(m_t) = \prod_i \left[ f_{CM}\, P_{CM}(m_{jjj,i}\,|\,m_t) + (1 - f_{CM})\, P_{NM}(m_{jjj,i}) \right] \qquad (2.1)$$

where the product runs over the selected events and f CM is the fraction of correctly matched jet-quark combinations, corresponding to about 16% in the considered m jjj range. The functions P CM (m jjj,i |m t ) and P NM (m jjj,i ) are the probability density functions for correctly and wrongly matched jet-quark combinations, respectively. The left panel in figure 1 shows P CM (m jjj,i |m t ) for different m t values. To obtain samples with a different generated top quark mass, the events generated with m t = 172.5 GeV are reweighted. The estimated top quark mass is obtained by evaluating the likelihood for different values of the generated top quark mass m t and performing a maximum likelihood fit (or, equivalently, minimizing ∆χ² = −2 ln(L(m t ))). The ∆χ² distribution is shown in the right panel of figure 1. A deviation of about 0.2 GeV is found compared to the generated mass; as discussed at the end of this section, a calibration procedure is usually applied to correct such a bias.
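A minimal sketch of such a likelihood scan is given below, assuming the likelihood is the two-component mixture of correctly and wrongly matched combinations described above. The analysis uses probability density functions derived from simulation; here a truncated Gaussian for P CM and a flat P NM are purely hypothetical stand-ins, and the width of 15 GeV and all function names are illustrative choices.

```python
import math
import numpy as np

M_LO, M_HI = 130.0, 220.0  # m_jjj window quoted in the text (GeV)
F_CM = 0.16                # fraction of correctly matched combinations


def p_cm(m, m_t, width=15.0):
    """Toy P_CM: Gaussian centred on m_t, normalised to the m_jjj window (assumption)."""
    z = (np.asarray(m, dtype=float) - m_t) / width
    gauss = np.exp(-0.5 * z ** 2) / (width * math.sqrt(2.0 * math.pi))
    s = width * math.sqrt(2.0)
    norm = 0.5 * (math.erf((M_HI - m_t) / s) - math.erf((M_LO - m_t) / s))
    return gauss / norm


def p_nm(m):
    """Toy P_NM: flat in the m_jjj window, independent of m_t (assumption)."""
    return np.full(np.shape(m), 1.0 / (M_HI - M_LO))


def delta_chi2(m_jjj, m_t_grid):
    """-2 ln L on a grid of m_t hypotheses, shifted so that the minimum is zero."""
    m_jjj = np.asarray(m_jjj, dtype=float)
    nll2 = np.array([
        -2.0 * np.sum(np.log(F_CM * p_cm(m_jjj, m_t) + (1.0 - F_CM) * p_nm(m_jjj)))
        for m_t in m_t_grid
    ])
    return nll2 - nll2.min()


def fit_mass(m_jjj, m_t_grid):
    """Return the grid point that maximises the likelihood."""
    return float(m_t_grid[np.argmin(delta_chi2(m_jjj, m_t_grid))])
```

A grid scan is used here for transparency; a continuous minimiser would serve equally well.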
To illustrate the ReSyst technique for reducing the systematic uncertainties, the following systematic uncertainties are considered:
• b tagging efficiency and mistagging probability: The uncertainty in the b tagging efficiency is typically 2%, while the uncertainty in the mistagging probability is 5% for c jets and up to 15% for light-quark jets [15]. These uncertainties are taken into account by reweighting the events, using the true jet flavour, for an independent upward or downward variation of the b tagging efficiency and of the misidentification probability. The difference between the estimated top quark mass for these variations and the nominal estimated top quark mass is taken as the systematic uncertainty. The square root of the quadratic sum of the systematic uncertainties for the b tagging efficiency and mistagging probability is used as the total uncertainty due to b tagging.
• Jet energy scale: The jet energy scale is known at the 1-2% level. A variation of 1% upwards and downwards is used to assess the systematic uncertainty due to the jet energy scale. The jet four-momentum is accordingly rescaled prior to the event selection.
• Factorization and renormalization scales: The factorization and renormalization scales (Q 2 ) at the matrix-element level are varied independently upwards and downwards by a factor of two. This gives rise to eight possible variations. The two variations where the factorization and renormalization scales are varied in opposite directions are unphysical and are therefore not considered. For the six remaining variations, the envelope is calculated to obtain the size of the systematic effect.
• Matching between the matrix element and parton shower: The matching between the matrix-element level and the parton shower is controlled by the so-called h damp parameter. Radiated quarks and gluons are damped by a factor $h_{damp}^2/(p_T^2 + h_{damp}^2)$ [8]. The systematic uncertainty is the difference between the estimated top quark masses obtained when varying h damp by its upward and downward uncertainty.
• Top quark p T : The top quark p T spectrum in data is observed to be softer than in simulated tt̄ events [16][17][18][19]. Therefore, the systematic effect due to the softer top quark p T spectrum is taken into account by reweighting the p T spectra of the two top quarks in the simulation. The difference with the top quark mass measurement before reweighting is taken as the size of the systematic effect.
• b quark fragmentation: Another source of uncertainty is the modelling of the momentum transfer from the b quark to the B hadron during the parton shower. To assess the size of this systematic effect, the ratio of the p T of the generated B hadron and the p T of the b jet, p T (B)/p T (b jet), is varied by 2.5% upwards and downwards. The top quark mass is remeasured for these variations and the difference with the nominal estimated value is quoted as the uncertainty.
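The bookkeeping behind the sources listed above can be sketched as follows. The text specifies an envelope for the scale variations and a quadratic sum for the b tagging components; combining all sources in quadrature separately for upward and downward mass shifts is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def envelope(m_variations, m_nominal):
    """Largest upward and downward mass shift over a set of variations.

    Returns (up, down) with up >= 0 and down <= 0.
    """
    shifts = [m - m_nominal for m in m_variations]
    return max(0.0, max(shifts)), min(0.0, min(shifts))


def quadrature(values):
    """Square root of the quadratic sum of the given values."""
    return math.sqrt(sum(v * v for v in values))


def total_systematic(sources):
    """Combine per-source (up, down) shifts into a total +up / -down uncertainty.

    Assumption: upward and downward shifts are summed in quadrature separately.
    `sources` maps a source name to a tuple (up_shift, down_shift).
    """
    up = quadrature(u for u, _ in sources.values())
    down = quadrature(d for _, d in sources.values())
    return up, down
```

For example, `total_systematic({"a": (0.3, -0.4), "b": (0.4, -0.3)})` combines two made-up sources into a total of +0.5 and −0.5; these numbers are illustrative only and are not taken from the analysis.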
The sources of systematic uncertainties listed above are considered because they typically dominate the total uncertainty in top quark mass measurements. In addition, those uncertainties can be determined by reweighting events, avoiding the need to simulate additional event samples. The sources of systematic effects and their corresponding uncertainties are listed in table 1. The systematic uncertainties due to the jet energy scale and the b quark fragmentation clearly dominate. The size of the systematic uncertainties for the various sources is close to the values expected for the 1D measurement discussed in Ref. [1], with the exception of the b quark fragmentation modelling, which is substantially larger in this document. This is explained by the different assessment of the b quark fragmentation in both cases and the simplified top quark mass estimator used here. The reader should bear in mind that the numbers obtained in this analysis do not fully reflect the top quark mass measurements performed by the ATLAS and CMS collaborations, but are merely used to illustrate the ReSyst technique. The estimated mass is found to be $172.69 \pm 0.24\,(\mathrm{stat.})\,^{+1.26}_{-1.63}\,(\mathrm{syst.})$ GeV. While the estimated mass is slightly different from the expected mass of 172.5 GeV, both are consistent within the statistical uncertainty. Usually a calibration (or bias correction) is performed when estimating the top quark mass, e.g. as is done in Ref. [1], but in the context of this paper such a calibration is less relevant.

Systematic effect quantifier
The estimator presented in the previous section has a large systematic uncertainty, which is dominated by the uncertainty in the b quark fragmentation and the jet energy scale. If the impact of each event on the total systematic uncertainty could be quantified, we could reduce the systematic uncertainty by rejecting events inducing a large systematic effect. The impact of an event i on the total systematic uncertainty is denoted by R i and can be quantified as:

$$R_i = \frac{\sqrt{\sum_j \left( m_t^{+1\sigma_j} - m_t^{-1\sigma_j} \right)^2}}{\sqrt{\sum_j \left( m_{t(i)}^{+1\sigma_j} - m_{t(i)}^{-1\sigma_j} \right)^2}} \qquad (3.1)$$

In this expression, the sum runs over all systematic sources j, $m_t^{+1\sigma_j}$ ($m_t^{-1\sigma_j}$) is the estimated top quark mass for a +1σ (−1σ) variation of systematic source j, and $m_{t(i)}^{\pm 1\sigma_j}$ is the same but without considering event i. The difference $m_t^{+1\sigma_j} - m_t^{-1\sigma_j}$ quantifies the total effect of the upward and downward variation of the systematic source, while $m_{t(i)}^{+1\sigma_j} - m_{t(i)}^{-1\sigma_j}$ quantifies the effect without event i. In case the effect of a systematic source on the measurement is evaluated by a single variation, as is the case for the uncertainty on the generated top quark p T , the difference is taken between this single variation and the nominal estimated top quark mass value, i.e. $m_t^{1\sigma_j} - m_t^{\mathrm{nominal}}$. In equation 3.1, the numerator has the same value for all events. When the denominator is smaller than the numerator, it means that the total systematic uncertainty becomes smaller by removing event i. Hence, removing events with relatively low values of R i would be a good idea to reduce the total systematic uncertainty. However, R i is an event variable that is not observable. Therefore it cannot be used directly to reject events to reduce the systematic uncertainty. Instead, the correlation between R i and event observables can be investigated to identify regions of the observable phase space that are more likely to correspond with low R i values. Figure 2 shows the dependence of R i on the H T , defined as the sum of the p T of the jets in the event, and on the p T of the fourth jet when ranking the jets according to decreasing p T .
The mean of R i , denoted < R i >, is shown when considering all systematic sources and for the jet energy scale and b quark fragmentation uncertainty alone. It is clear that the value of < R i > is mostly driven by the impact of the jet energy scale, which can be explained from the definition of R i in equation 3.1 and because the jet energy scale dominates the total systematic uncertainty. From figure 2 one can also observe that < R i > can be above or below 1. This is related to the fact that the impact of the (individual) systematic effects on the top quark mass is not fully symmetric around the nominally estimated top quark mass. However, what is most important is the variation of < R i > across the observable range and the possibility to identify a region in the observable phase space corresponding to relatively lower values of < R i >. In figure 2, we see that < R i > is relatively lower at small values of the H T and the p T of the fourth jet compared to < R i > at higher values of those observables. Hence, we can remove those events with relatively low R i value by applying a lower threshold on the H T and the p T of the fourth jet.
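The computation of R i and of its mean as a function of an observable can be sketched as below. The sketch assumes that R i is the ratio of the quadrature-summed differences between the upward and downward mass estimates with all events to the same quantity without event i. The mean-based `toy_estimator` is a hypothetical stand-in for the likelihood fit, and the explicit leave-one-out loop is chosen for clarity rather than speed.

```python
import numpy as np


def toy_estimator(x, keep):
    """Hypothetical stand-in for the mass fit: mean of the kept events."""
    return x[keep].mean()


def total_spread(varied, keep):
    """Quadrature sum over sources of (estimate_up - estimate_down).

    `varied` maps a source name to a tuple (x_up, x_down) of per-event arrays.
    """
    return np.sqrt(sum(
        (toy_estimator(up, keep) - toy_estimator(down, keep)) ** 2
        for up, down in varied.values()
    ))


def resyst_r(varied, n_events):
    """Leave-one-out quantifier R_i for every event (assumed ratio form)."""
    all_events = np.ones(n_events, dtype=bool)
    numerator = total_spread(varied, all_events)  # same value for all events
    r = np.empty(n_events)
    for i in range(n_events):
        keep = all_events.copy()
        keep[i] = False                           # drop event i
        r[i] = numerator / total_spread(varied, keep)
    return r


def profile_mean(observable, r, bin_edges):
    """Mean of R_i in bins of an event observable; NaN for empty bins."""
    idx = np.digitize(observable, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    return np.array([
        r[idx == b].mean() if np.any(idx == b) else np.nan
        for b in range(n_bins)
    ])
```

In a real analysis each call to the estimator is a full fit, so repeating it once per event and per variation is expensive; the sketch ignores that cost.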

Reducing the total systematic uncertainty (ReSyst)
The ReSyst technique consists of applying additional event selection criteria in the event observable phase space to become less sensitive to systematic effects. Based on the behaviour of R i as a function of event observables, as for example shown in figure 2, additional selection criteria can be applied. To illustrate the power of ReSyst and motivated by the observations in figure 2, additional event selection criteria are defined: H T > 250 GeV and p T > 45 GeV for the fourth jet. The threshold on each of these observables is roughly chosen as the value where < R i > reaches a stable and relatively large value. Other observables were also studied, but these either did not exhibit enough variation in R i , as is the case for the p T of the muon, or their initial variation disappeared after applying the other additional selection criteria. After applying the additional selection criteria, the total selection efficiency becomes 0.4%, hence about a factor of four lower than before (cf. section 2). In contrast, the fraction of events for which the correct jet-quark combination is chosen rises from about 16% to about 22%. With this subset of events, the top quark mass is estimated to be $172.60 \pm 0.19\,(\mathrm{stat.})\,^{+0.67}_{-0.58}\,(\mathrm{syst.})$ GeV. The statistical uncertainty is smaller than in the estimation performed before the additional selection requirements. This can be understood as follows. Since the analysis is repeated, the probability density functions in equation 2.1 are also redetermined. These new probability density functions are more sensitive than the original ones, resulting in a smaller statistical uncertainty. This can be seen in figure 3, where the left-hand side of the m jjj distribution is more sensitive to top quark mass variations. Indeed, the requirements on the H T and the p T of the fourth jet affect the m jjj distributions for different top quark mass values differently. However, more important in the context of this paper is the reduction of the systematic uncertainty by a factor of at least two compared to the result obtained with the initial event selection discussed in section 2. Table 2 shows the systematic uncertainty associated with the different sources. The requirement on the H T and the p T of the fourth jet reduces the uncertainty due to the jet energy scale and b quark fragmentation, which are the two dominant uncertainties. Indeed, those uncertainties are expected to be reduced by the additional selection criteria since the number of events with at least four jets may vary when the jet energy is scaled up or down, especially in case the fourth jet has a p T close to 30 GeV, which is used as a threshold in the baseline selection. The number of jets has a linear correlation of 60% and 40% with the H T and the p T of the fourth jet, respectively. Moreover, for low-p T quarks and gluons there is a higher probability that particles produced during the hadronization are not clustered in the jets, resulting in a larger uncertainty on the energy scale for those jets. Similarly, if the initial b quark momentum is relatively low, the uncertainty on the momentum transferred to the reconstructed B hadron is larger than for quarks with a higher momentum. When applying the technique on data, it is relevant to verify that the distributions of the chosen observables are modelled well.
Table 2. The sources of systematic uncertainties considered in the analysis after applying the ReSyst technique and their impact on the top quark mass measurement. The uncertainty in the jet energy scale and the b quark fragmentation still dominate the total uncertainty, but their effect is reduced by a factor of about two and four with respect to table 1, respectively.
As a cross-check of the ReSyst technique, the top quark mass estimation was repeated using only the events failing the additional selection criteria.
In this case, the systematic uncertainties are so large that they fall outside the considered range of top quark input masses, which is 171-174 GeV. An increase of the total systematic uncertainty is indeed the behaviour we expect for the events that do not pass the additional selection criteria.
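Splitting the baseline sample into events passing and failing the additional criteria, as used for the main result and for the cross-check, can be sketched as follows. The thresholds are those quoted in the text; the function names and the array-based event representation are assumptions of this sketch.

```python
import numpy as np


def resyst_mask(ht, pt_jet4, ht_min=250.0, pt4_min=45.0):
    """Boolean mask of events passing H_T > 250 GeV and fourth-jet pT > 45 GeV."""
    return (np.asarray(ht) > ht_min) & (np.asarray(pt_jet4) > pt4_min)


def split_sample(ht, pt_jet4):
    """Return (passing mask, failing mask, fraction of events passing)."""
    passing = resyst_mask(ht, pt_jet4)
    return passing, ~passing, float(passing.mean())
```

The mass estimation, including the redetermination of the probability density functions, is then repeated separately on each subset.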
For some observables < R i > is found to be fairly stable across the observable range. This is the case for instance for the invariant mass of the muon and the fourth jet, m lj4 , as can be seen in figure 4. Therefore, the systematic uncertainty should not change when a cut is placed on this observable. The top quark mass estimation is repeated without additional selection requirements, except m lj4 < 250 GeV. About 36% of the events are rejected by this additional requirement and the top quark mass is estimated to be $172.74 \pm 0.30\,(\mathrm{stat.})\,^{+1.23}_{-1.64}\,(\mathrm{syst.})$ GeV. When comparing this result to the estimation before the application of the selection requirements, the two results are fully compatible, with the size of the systematic uncertainty varying by 3% or less. This demonstrates that the ReSyst technique works conceptually.
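The quoted stability of the systematic uncertainty can be checked with simple arithmetic on the numbers given in the text: +1.26/−1.63 GeV for the baseline selection and +1.23/−1.64 GeV with the additional m lj4 requirement. The helper name below is hypothetical.

```python
def relative_change(before, after):
    """Absolute relative change between two uncertainty values."""
    return abs(after - before) / abs(before)


# Systematic uncertainties in GeV, as quoted in the text.
baseline = {"up": 1.26, "down": 1.63}
with_mlj4_cut = {"up": 1.23, "down": 1.64}

changes = {key: relative_change(baseline[key], with_mlj4_cut[key]) for key in baseline}
for key, value in changes.items():
    print(f"{key}: {100.0 * value:.1f}%")
```

Both relative changes are below 3%, in line with the statement above.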

Conclusion and prospects
The ReSyst technique is presented as a novel technique to reduce the systematic uncertainty on precision measurements for which the statistical uncertainty is a small component in the total uncertainty. This technique allows for balancing statistical and systematic uncertainties by rejecting events which induce a large systematic uncertainty. A quantifier R i is introduced to assess the impact of an individual event on the systematic uncertainty. Correlating R i with event observables opens the possibility to reduce the total uncertainty. This concept is demonstrated using a simplified top quark mass estimator. For this estimator and for the considered systematic sources, the total systematic uncertainty is reduced by at least a factor of two. This factor of two improvement is obtained without optimizing the thresholds on the observables used to reduce the total systematic uncertainty. It should be noted that a number of systematic sources were not considered, such as the number of additional pileup collisions in the same or adjacent bunch crossings, effects from the modelling of colour reconnection, or from initial and final state radiation due to the uncertainty in the strong coupling α S in the parton shower. Clearly, when applying the technique on data these uncertainties should be taken into account to perform a complete top quark mass measurement.
The power of the ReSyst technique to define a quantifier for each event is at the same time also a limitation. The definition of R i assumes that systematic effects can be assessed by using the same events at the matrix-element level such that there is a one-to-one connection between the nominal event and the event processed with a different value of the parameter(s) simulating the systematic variation. For some systematic sources and generators this is not (yet) the case, e.g. for the uncertainty in the modelling of the colour reconnection in the parton shower, which is typically evaluated using an independent sample with a different generator seed.
Optimizations to the initial concept presented here are possible. For some analyses it may be better to define R i separately for the upward and downward variation of a systematic source, e.g. when the systematic uncertainty is highly asymmetric, or to define R i for a subset of systematic sources. To optimize the selection criteria, one could perform a scan to find the optimal thresholds on the most promising variables and/or combine variables in a multivariate analysis to find an optimal selection threshold. If a profile likelihood ratio fit is used to perform the measurement, R i could be exploited to identify control distributions that can be used in the fit to constrain the systematic uncertainty due to various sources using the data. Lastly, it may also be interesting to combine the statistical and systematic impact of an event and/or to explore the correlation between the statistical and systematic impact of events.