ReSyst: a novel technique to Reduce the Systematic uncertainty for precision measurements

We are in an era of precision measurements at the Large Hadron Collider. The precision that can be achieved on some of them is, however, limited by large systematic uncertainties. This paper introduces a new technique to reduce the total systematic uncertainty by quantifying the systematic impact of single events and correlating it with event observables, in order to identify classes of events that are more sensitive to systematic effects. A proof of concept is presented by means of a simplified top quark mass estimator applied to simulated events. Even without a thorough optimization, it is shown that the total systematic uncertainty can be reduced by at least 30%.


Introduction
With the large amount of data collected each year, the Large Hadron Collider (LHC) at CERN has entered an era of precision measurements. For the measurement of some of the standard model parameters, the systematic uncertainties largely limit the achievable precision. This is for instance the case for the most precise measurement of the top quark mass in a single decay channel, where the systematic uncertainty is already about eight times larger than the statistical uncertainty, i.e. $m_t = 172.25 \pm 0.08 \pm 0.62$ GeV [1]. Over the next five years, the measurements at the LHC will be performed with up to ten times more data, reducing the statistical uncertainty on the most precise measurements to a negligible level. Therefore, the focus of the community is on reducing the systematic uncertainty for the precision measurement of the top quark mass. The ATLAS and CMS collaborations expect to reduce the dominant systematic uncertainties due to the jet energy scale, the b-quark fragmentation and the modelling of $t\bar{t}$ events in the simulation [2], and therefore to achieve a precision of around 300 MeV on the top quark mass at the end of the LHC Run 3.
When measurements in high-energy physics are dominated by systematic uncertainties, diverse techniques can be developed to reduce them, e.g. by tuning the event or reconstruction requirements [3,4]. In the context of classification problems and machine learning, techniques have also been developed to reduce the impact of sources of systematic uncertainty [5][6][7]. In this article, a novel technique is presented to trade statistical precision for a reduced systematic uncertainty. The ReSyst technique introduced here can be considered complementary to existing efforts to reduce the total uncertainty. The ReSyst method revolves around the definition of a non-observable quantifier of the systematic impact of each event. Using simulation, this non-observable quantifier can be correlated with event observables in order to identify classes of events that have a relatively large impact on the total systematic uncertainty. The systematic uncertainty can then be reduced by rejecting events in specific parts of the phase space.
The concept is illustrated with a simplified top quark mass estimator using simulated $t\bar{t}$ events at the LHC. The muon+jets decay channel is used, i.e. with one of the W bosons from the top quarks decaying to quarks (hadronic leg) and the other one to a muon and corresponding neutrino (leptonic leg). In total, 10 million of these events are generated in proton-proton collisions at a centre-of-mass energy of 13 TeV, corresponding to about 80 fb$^{-1}$ of integrated luminosity, using the POWHEG v2 event generator [8][9][10][11] and with the top quark mass set to 172.5 GeV. The generated events are interfaced with PYTHIA 8.2 [12] for the parton showering, hadronization and particle decay using the underlying event tune CUETP8M2T4 [13], and further processed with DELPHES v3.4.2pre03 [14][15][16] to simulate the CMS detector response. The anti-$k_T$ jet clustering algorithm with a distance parameter of 0.5 [17] in the FastJet package [18,19] is used to reconstruct jets when running DELPHES. The default DELPHES CMS parameter card is used, with the exception of the b-tagging efficiency and misidentification probabilities, for which the parametrizations for the medium working point of the DeepCSV algorithm developed by the CMS Collaboration are used [20].
JHEP06(2019)132

Top quark mass estimator
Events are selected when they have at least one muon with transverse momentum ($p_T$) above 25 GeV within the CMS tracker acceptance ($|\eta| < 2.4$) and at least four jets with $p_T > 30$ GeV within the tracker acceptance. At least two of the jets passing the $p_T$ and $\eta$ requirements should be identified as originating from b quarks. The selection efficiency for the signal $t\bar{t}$ events in the muon+jets decay channel is around 15%. Other $t\bar{t}$ decays or background processes are not considered in the analysis, which is motivated by the fact that the modelling of the background does not induce a dominant systematic uncertainty on the top quark mass [1]. Moreover, the selection criteria which are applied reduce the fraction of background events to less than 10% of the number of selected events [1].
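As an illustration, the selection requirements above can be written as a short Python sketch. The `Muon` and `Jet` containers and the function name are hypothetical and are not part of the software used for this study; the same acceptance of $|\eta| < 2.4$ is assumed for jets and muons.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Muon:
    pt: float   # transverse momentum in GeV
    eta: float  # pseudorapidity


@dataclass
class Jet:
    pt: float       # transverse momentum in GeV
    eta: float      # pseudorapidity
    b_tagged: bool  # True if the jet is identified as originating from a b quark


def passes_selection(muons: List[Muon], jets: List[Jet]) -> bool:
    """Baseline muon+jets selection: >=1 muon, >=4 jets, >=2 b-tagged jets."""
    good_muons = [m for m in muons if m.pt > 25.0 and abs(m.eta) < 2.4]
    good_jets = [j for j in jets if j.pt > 30.0 and abs(j.eta) < 2.4]
    # b-tag counting only considers jets that already pass the pT and eta cuts
    n_b = sum(1 for j in good_jets if j.b_tagged)
    return len(good_muons) >= 1 and len(good_jets) >= 4 and n_b >= 2
```

The b-tag requirement is evaluated on the jets that pass the kinematic requirements, as stated in the text.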
The three highest-$p_T$ jets are used to reconstruct the top quark corresponding to the hadronic leg of the $t\bar{t}$ decay and its mass, $m_{jjj}$. To reduce the contribution of wrongly matched jet-quark combinations, i.e. those combinations not corresponding to the top quark decay, only events with an $m_{jjj}$ value between 130 and 200 GeV are considered. About 8% of the initially selected events survive this additional requirement, which means that the total event selection efficiency drops to 1.2%. To obtain samples with a different generated top quark mass, the events generated with $m_t = 172.5$ GeV are reweighted. The $m_{jjj}$ distributions for correctly and wrongly matched jet-quark combinations, shown in figure 1, are fitted with Gaussian and third-order polynomial functions, respectively, for different $m_t$ hypotheses. The dependence of the parameters in the fitted functions on $m_t$ is then fitted with first-order polynomial functions. The obtained $m_t$ dependence of the parameters is then used to construct the probability density functions $P_{CM}$ and $P_{NM}$ for the correctly and wrongly matched jet-quark combinations, respectively. This procedure reduces the impact of sample-specific statistical fluctuations in the $m_{jjj}$ distribution and of the chosen binning. The probability density functions for the selected events are used to construct a likelihood:

$$L(m_t) = \prod_i \left[ f_{CM}\, P_{CM}(m_{jjj,i}\,|\,m_t) + (1 - f_{CM})\, P_{NM}(m_{jjj,i}\,|\,m_t) \right] \qquad (2.1)$$

where the product runs over the selected events and $f_{CM}$ is the fraction of correctly matched jet-quark combinations, corresponding to about 20.5% in the considered $m_{jjj}$ range. The functions $P_{CM}(m_{jjj,i}|m_t)$ and $P_{NM}(m_{jjj,i}|m_t)$ are the probability density functions for correctly and wrongly matched jet-quark combinations, respectively. The upper right (lower left) panel in figure 1 shows $P_{CM}(m_{jjj,i}|m_t)$ ($P_{NM}(m_{jjj,i}|m_t)$) for different $m_t$ values.
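A minimal Python sketch of such a two-component likelihood is given below, assuming the usual mixture form in which each event contributes $f_{CM} P_{CM} + (1 - f_{CM}) P_{NM}$. The window and $f_{CM}$ are taken from the text. All shape parameters (Gaussian mean and width, polynomial coefficients and their slopes in $m_t$) are hypothetical placeholders, not the fitted values of this study.

```python
import numpy as np

M_LO, M_HI = 130.0, 200.0             # m_jjj window in GeV (from the text)
F_CM = 0.205                          # correctly matched fraction (from the text)
GRID = np.linspace(M_LO, M_HI, 1401)  # grid used only to normalise the pdfs


def _normalise(shape, m):
    """Return shape(m) divided by the integral of shape over the window (trapezoid rule)."""
    y = shape(GRID)
    norm = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(GRID))
    return shape(m) / norm


def p_cm(m, m_t):
    """Gaussian pdf; mean and width are linear in m_t (hypothetical parameter values)."""
    mean = 165.0 + 0.9 * (m_t - 172.5)
    width = 12.0 + 0.05 * (m_t - 172.5)
    return _normalise(lambda x: np.exp(-0.5 * ((x - mean) / width) ** 2), m)


def p_nm(m, m_t):
    """Third-order polynomial pdf; one coefficient linear in m_t (hypothetical values)."""
    x0 = 165.0
    c1 = -0.004 + 1e-4 * (m_t - 172.5)
    return _normalise(
        lambda x: 1.0 + c1 * (x - x0) - 1e-4 * (x - x0) ** 2 + 1e-7 * (x - x0) ** 3, m
    )


def neg2_log_likelihood(m_jjj, m_t, f_cm=F_CM):
    """-2 ln L(m_t) for an array of reconstructed m_jjj values inside the window."""
    m_jjj = np.asarray(m_jjj, dtype=float)
    per_event = f_cm * p_cm(m_jjj, m_t) + (1.0 - f_cm) * p_nm(m_jjj, m_t)
    return -2.0 * np.sum(np.log(per_event))
```

Both pdfs are normalised numerically over the 130 to 200 GeV window, so that the mixture remains a proper probability density for any $m_t$ hypothesis.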
The dependence of $P_{NM}$ on the generated mass, $m_t$, is negligible except at high $m_{jjj}$ values, which is an additional motivation to not consider events with $m_{jjj}$ values above 200 GeV. The estimated top quark mass is obtained by evaluating the likelihood for different values of the generated top quark mass $m_t$ and performing a maximum likelihood fit, or equivalently minimizing $\Delta\chi^2 = -2 \ln L(m_t)$. The $\Delta\chi^2$ distribution is shown in the lower right panel of figure 1. The minimum of the fitted function corresponds to the estimated top quark mass, and the intersections of the horizontal line with the fitted function correspond to the size of the statistical uncertainty on the estimation.
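The extraction of the mass and its statistical uncertainty from such a scan can be sketched as follows. The sketch assumes that the fitted function is a parabola and that the horizontal line sits one unit above the minimum, which is the usual convention for a one standard deviation interval; neither choice is stated explicitly in the text, and the function name `fit_mass` is hypothetical.

```python
import numpy as np


def fit_mass(m_t_values, neg2lnl_values):
    """Fit a parabola to -2 ln L versus m_t and return (m_hat, sigma_stat)."""
    m = np.asarray(m_t_values, dtype=float)
    y = np.asarray(neg2lnl_values, dtype=float)
    m_ref = m.mean()                       # centre the scan for numerical stability
    a, b, _ = np.polyfit(m - m_ref, y, 2)  # y = a x^2 + b x + c
    if a <= 0:
        raise ValueError("fitted curvature is not positive: no minimum found")
    m_hat = m_ref - b / (2.0 * a)          # position of the minimum
    sigma = 1.0 / np.sqrt(a)               # a * sigma^2 = 1, i.e. one unit above the minimum
    return m_hat, sigma
```

For a scan that is exactly parabolic the result is independent of the scan range; for a real likelihood the scan points should stay close to the minimum.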
To illustrate the ReSyst technique for reducing the systematic uncertainties, the following systematic uncertainties are considered:
• b tagging efficiency and mistagging probability: the uncertainty in the b tagging efficiency is typically 2%, while the uncertainty in the mistagging probability is 5% for c jets and up to 15% for light-quark jets [20]. These uncertainties are propagated by reweighting the events, with independent upward and downward variations of the b tagging efficiency and of the misidentification probability, using the true jet flavour. The difference between the estimated top quark mass for these variations and the nominal estimated top quark mass is taken as the systematic uncertainty. The square root of the quadratic sum of the systematic uncertainties for the b tagging efficiency and mistagging probability is used as the total uncertainty due to b tagging.
• Jet energy scale: the jet energy scale is varied upwards and downwards using a $p_T$- and $\eta$-dependent variation corresponding to the jet energy scale uncertainty in [21]. The jet four-momentum is rescaled accordingly prior to the event selection.
• Factorization and renormalization scales: the factorization and renormalization scales ($Q^2$) at the matrix-element level are varied independently upwards and downwards by a factor of two. This gives rise to eight possible variations. The two variations where the factorization and renormalization scales are varied in opposite directions are unphysical and are therefore not considered. For the six remaining variations, the envelope is calculated to obtain the size of the systematic effect.
• Matching between the matrix element and parton shower: the matching between the matrix-element level and the parton shower is controlled by the so-called $h_{damp}$ parameter. Radiated quarks and gluons are damped by a factor $h_{damp}^2 / (p_T^2 + h_{damp}^2)$. The parameter value was tuned to $1.581^{+0.658}_{-0.585} \times m_t$ [13]. The systematic uncertainties correspond to the difference between the estimated top quark mass when varying $h_{damp}$ by the upward and downward uncertainty with respect to the nominal estimated value.
• Top quark $p_T$: the top quark $p_T$ spectrum in data is observed to be softer than in simulated $t\bar{t}$ events [22][23][24][25]. Therefore, the systematic effect due to the softer top quark $p_T$ spectrum is taken into account by reweighting the $p_T$ spectra of the two top quarks in the simulation. The difference with the top quark mass measurement before reweighting is taken as the size of the systematic effect.
• b quark fragmentation: another source of uncertainty is the modelling of the momentum transfer from the b quark to the B hadron during the parton shower. To assess the size of this systematic effect, the ratio of the $p_T$ of the generated B hadron and the $p_T$ of the b jet, $p_T(B)/p_T(\text{b jet})$, is varied by 2.5% upwards and downwards. This number is motivated by the uncertainty on the $r_b$ parameter in the Lund-Bowler function, which has been measured using $e^+e^-$ collisions from LEP and SLC [26,27] and which results in a variation of around 2.5% in the $p_T(B)/p_T(\text{b jet})$ distribution. The top quark mass is remeasured for these variations and the difference with the nominal estimated value is quoted as the uncertainty.
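The combination of the individual sources into a total upward and downward systematic uncertainty can be sketched as below. The sketch assumes that positive and negative mass shifts are accumulated separately in quadrature, and that the envelope of a set of variations is given by its largest positive and most negative shift. This convention, the function names and the numbers in the usage example are illustrative assumptions and not values from this study.

```python
import math


def envelope(shifts):
    """Largest positive and most negative mass shift (GeV) among a set of variations."""
    up = max([s for s in shifts if s > 0], default=0.0)
    down = min([s for s in shifts if s < 0], default=0.0)
    return up, down


def total_systematic(sources):
    """Combine sources in quadrature.

    sources: dict mapping a source name to the mass shifts (varied minus nominal, in GeV)
             obtained from its variations.
    Returns the magnitudes (total_up, total_down).
    """
    up2 = down2 = 0.0
    for shifts in sources.values():
        up, down = envelope(shifts)
        up2 += up ** 2
        down2 += down ** 2
    return math.sqrt(up2), math.sqrt(down2)


# Usage with made-up shifts: two-sided sources and six scale variations
example = {
    "jet energy scale": (0.6, -0.7),
    "b fragmentation": (0.3, -0.4),
    "scales": (0.1, -0.05, 0.02, 0.08, -0.1, 0.0),
}
total_up, total_down = total_systematic(example)
```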
When assessing the systematic uncertainty induced by each source, the likelihood function in equation (2.1) [...] [1], with the exception of the b quark fragmentation modelling, which is larger in this document. However, in the 1D measurement an additional uncertainty is taken into account for the energy scale of b jets. The combination of the b jet energy scale uncertainty and the b quark fragmentation uncertainty in [1] is comparable in size to the uncertainty presented here under the label b quark fragmentation. The b jet energy scale uncertainty and the b quark fragmentation uncertainty both account for the possible uncertainty in the energy of the b jet. Therefore, it is concluded that the impact of the systematic uncertainties on the estimated top quark mass in this case study is reasonable compared to the case presented in [1]. Possible differences are related to the detector simulation and the choice of the top quark mass estimator. The estimated mass is found to be $172.80 \pm 0.16\,\text{(stat.)}\,^{+0.96}_{-0.97}\,\text{(syst.)}$ GeV, which differs by 0.3 GeV from the generated mass of 172.5 GeV. Usually a calibration (or bias correction) is performed when estimating the top quark mass, e.g. as is done in ref. [1], but in the context of this paper such a calibration is less relevant.

Systematic effect quantifier
The estimator presented in the previous section has a large systematic uncertainty, which is dominated by the uncertainties in the b quark fragmentation and the jet energy scale. If the impact of each event on the total systematic uncertainty could be quantified, the systematic uncertainty could be reduced by rejecting events that induce a large systematic effect. The ReSyst method revolves around quantifying the impact of an event $i$ on the total systematic uncertainty, denoted by $R_i$, as:

$$R_i = \sqrt{\frac{\sum_j \left( m_{t(i)}^{+1\sigma_j} - m_{t(i)}^{-1\sigma_j} \right)^2}{\sum_j \left( m_t^{+1\sigma_j} - m_t^{-1\sigma_j} \right)^2}} \qquad (3.1)$$

In this expression, the sum runs over all systematic sources $j$, $m_t^{+1\sigma_j}$ ($m_t^{-1\sigma_j}$) is the estimated top quark mass for a $+1\sigma$ ($-1\sigma$) variation of systematic source $j$, and $m_{t(i)}^{\pm 1\sigma_j}$ is the same but without considering event $i$. The difference $m_t^{+1\sigma_j} - m_t^{-1\sigma_j}$ quantifies the total effect of the upward and downward variation of the systematic source, while $m_{t(i)}^{+1\sigma_j} - m_{t(i)}^{-1\sigma_j}$ quantifies the effect without event $i$. In case the effect of a systematic source on the measurement is evaluated by a single variation, as is the case for the uncertainty on the generated top quark $p_T$, the difference is taken between this single variation and the nominal estimated top quark mass value, i.e. $m_t^{+1\sigma_j} - m_t^{\text{nominal}}$. The quantifier is inspired by the jackknife delete-1 resampling technique that can be used to estimate the variance and statistical bias of a measurement [28]. In the case of the quantifier $R_i$, the systematic impact of a specific event is estimated by removing that event from the sample and repeating the estimation of the systematic uncertainties.
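The delete-1 logic behind the quantifier can be sketched in a few lines of Python. The sketch assumes that $R_i$ is the ratio of the quadrature sum over sources of the up-minus-down mass differences evaluated without event $i$ to the same quantity evaluated with all events. The `estimator` callable stands in for the full mass estimation and is hypothetical, as is the function name.

```python
import numpy as np


def resyst_quantifier(estimator, variations):
    """Compute R_i for every event with a delete-1 (jackknife-like) loop.

    estimator : callable mapping a 1D array of per-event observables to a mass estimate
    variations: list of (up, down) pairs of 1D numpy arrays, one pair per systematic
                source; entry i of every array must refer to the same event
    """
    n = len(variations[0][0])
    # Denominator: total systematic effect with all events, identical for every i
    denom = np.sqrt(sum((estimator(up) - estimator(dn)) ** 2 for up, dn in variations))
    r = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # mask that drops event i
        num = sum((estimator(up[keep]) - estimator(dn[keep])) ** 2 for up, dn in variations)
        r[i] = np.sqrt(num) / denom
    return r


# Toy usage: the "mass" is the sample mean and one source scales each value by +-1%
x = np.array([100.0, 200.0, 300.0])
r_toy = resyst_quantifier(np.mean, [(1.01 * x, 0.99 * x)])
```

In this toy example the event with the largest value yields $R_i < 1$, meaning that the systematic effect shrinks when that event is removed. The loop re-evaluates the estimator once per event and per variation, so a realistic implementation would need a much cheaper update than a full refit.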
In equation (3.1), the denominator has the same value for all events. When the numerator is smaller than the denominator, the total systematic uncertainty becomes smaller by removing event $i$. Hence, removing events with relatively low values of $R_i$ reduces the total systematic uncertainty. However, $R_i$ is an event variable that is not observable. Therefore, it cannot be used directly to reject events. Instead, the correlation between $R_i$ and event observables can be investigated to identify regions of the observable phase space that are more likely to correspond to low $R_i$ values. Figure 2 shows the dependence of $R_i$ on the $H_T$ event observable, defined as the sum of the $p_T$ of the jets in the event, and on the maximum $\Delta R$ between the muon and any of the jets in the event with $p_T > 30$ GeV. The distance $\Delta R$ is defined as $\Delta R = \sqrt{(\eta_{jet} - \eta_{muon})^2 + (\phi_{jet} - \phi_{muon})^2}$, where $\eta_{jet}$ ($\eta_{muon}$) and $\phi_{jet}$ ($\phi_{muon}$) correspond respectively to the pseudorapidity and azimuthal angle of the jet (muon). The mean of $R_i$, denoted $\langle R_i \rangle$, is shown when considering all systematic sources together, and separately for the jet energy scale and b quark fragmentation uncertainties alone. It is clear that the value of $\langle R_i \rangle$ is mostly driven by the impact of the jet energy scale uncertainty, which can be explained from the definition of $R_i$ in equation (3.1) and because the jet energy scale uncertainty dominates the total systematic uncertainty. From figure 2 one can also observe that $\langle R_i \rangle$ can be above or below 1. This is related to the fact that the impact of the (individual) systematic effects on the top quark mass is not fully symmetric around the nominally estimated top quark mass. However, what is most important is the variation of $\langle R_i \rangle$ across the observable range and the possibility to identify a region in the observable phase space corresponding to relatively lower values of $\langle R_i \rangle$.
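The two event observables used here can be computed as in the following sketch. The function names and the tuple layout are hypothetical. Wrapping the azimuthal difference into $[-\pi, \pi)$ is the usual convention and is an assumption, since the text does not spell it out.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance between two objects, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def h_t(jet_pts):
    """Scalar sum of the jet transverse momenta in GeV."""
    return sum(jet_pts)


def delta_r_max(muon, jets, pt_min=30.0):
    """Maximum delta R between the muon and any jet above pt_min.

    muon: (eta, phi); jets: iterable of (pt, eta, phi).
    Raises ValueError if no jet passes the pT threshold.
    """
    return max(
        delta_r(eta, phi, muon[0], muon[1]) for pt, eta, phi in jets if pt > pt_min
    )
```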
In figure 2, we see that $\langle R_i \rangle$ is relatively lower at small values of $H_T$. Hence, this observation suggests removing those events with a lower value of $H_T$ in order to reduce the total systematic uncertainty on the top quark mass measurement.

Figure 2. $\langle R_i \rangle$ as a function of $H_T$ (left) and as a function of the maximum $\Delta R$ between any selected jet and the muon, $\Delta R_{max}(\text{jet}_i, \text{muon})$ (right). The grey horizontal line corresponds to the overall average value of $\langle R_i \rangle$. The value of $\langle R_i \rangle$ for all systematic uncertainties combined (black) is driven by the two dominant systematic uncertainties: the jet energy scale and the b quark fragmentation, for which the $\langle R_i \rangle$ values are shown separately in blue and red, respectively. The error bars reflect the bin-by-bin uncertainty on the mean of $R_i$.

Reducing the total Systematic uncertainty (ReSyst)
The ReSyst technique consists of applying additional event selection criteria in the event observable phase space to become less sensitive to systematic effects. These criteria are based on the behaviour of $R_i$ as a function of event observables, as shown for example in figure 2. To illustrate the power of ReSyst, and motivated by the observation in figure 2, an additional event selection requirement is defined: $H_T > 220$ GeV. The threshold on $H_T$ is roughly chosen as the value where $\langle R_i \rangle$ crosses the overall average $\langle R_i \rangle$ value, illustrated by the grey line. The other observable in figure 2 does not exhibit a significant variation in $R_i$. Other observables were also studied, but the initial dependence of $\langle R_i \rangle$ on them disappeared after applying the additional selection criterion of $H_T > 220$ GeV, as is expected for observables correlated with $H_T$.
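The rough choice of the threshold can be illustrated with the sketch below, which profiles the mean of $R_i$ in bins of an observable and returns the first bin centre where the binned mean reaches the overall mean. The function names, the binning and the toy numbers are hypothetical; the procedure used to pick 220 GeV in this study was a visual estimate from figure 2.

```python
import numpy as np


def profile_mean(x, r, edges):
    """Mean of r in bins of x; returns (bin_centres, means), with NaN for empty bins."""
    x, r, edges = np.asarray(x), np.asarray(r), np.asarray(edges)
    idx = np.digitize(x, edges) - 1  # bin index of each event, -1 or n_bins if outside
    centres = 0.5 * (edges[1:] + edges[:-1])
    means = np.full(len(centres), np.nan)
    for b in range(len(centres)):
        sel = idx == b
        if sel.any():
            means[b] = r[sel].mean()
    return centres, means


def crossing_threshold(centres, means, overall_mean):
    """First bin centre, scanning upwards, where the binned mean reaches the overall mean."""
    for c, m in zip(centres, means):
        if not np.isnan(m) and m >= overall_mean:
            return c
    return None


# Toy usage with made-up values of an H_T-like observable and of R_i
ht_toy = [100.0, 150.0, 210.0, 230.0, 300.0, 350.0]
r_toy = [0.7, 0.9, 1.0, 1.1, 1.1, 1.2]
centres, means = profile_mean(ht_toy, r_toy, [0.0, 200.0, 250.0, 400.0])
threshold = crossing_threshold(centres, means, np.mean(r_toy))
```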
After applying the additional selection requirement, an additional 31% of the previously selected events are rejected, resulting in a total selection efficiency of 0.8%. In contrast, the fraction of events for which the correct jet-quark combination is chosen rises from about 20.5% to about 23.3%. The probability density functions for the $m_{jjj}$ distribution are remade with this subset of events and the obtained likelihood is maximized. Figure 3 shows the probability density functions after the additional selection requirement.
The top quark mass is estimated to be $172.53 \pm 0.18\,\text{(stat.)}\,^{+0.67}_{-0.59}\,\text{(syst.)}$ GeV. As expected after applying an additional selection requirement, the statistical uncertainty slightly increases. However, more important in the context of this paper is the reduction of the systematic uncertainty by about 30% compared to the result obtained with the initial event selection discussed in section 2. Table 2 shows the systematic uncertainties associated with the different sources. The requirement on the $H_T$ observable reduces the uncertainty due to the jet energy scale and b quark fragmentation, which are the two dominant uncertainties. Indeed, those uncertainties are expected to be reduced by the additional selection requirement since also the number of events with at least four jets may [...]

For some observables the value of $\langle R_i \rangle$ is found to be fairly stable across the observable range. This is the case for instance for $\Delta R_{max}(\text{jet}_i, \text{muon})$, as can be seen in the right panel of figure 2. Therefore, the systematic uncertainty is expected not to change when an additional selection requirement is placed on this observable. As an illustration, the top quark mass estimation is repeated replacing the $H_T > 220$ GeV requirement with the requirement $\Delta R_{max}(\text{jet}_i, \text{muon}) > 3$. About 36% of the events are rejected by this additional requirement. To estimate the top quark mass, the $m_{jjj}$ probability density functions are remade, and the top quark mass is estimated to be $173.04 \pm 0.28\,\text{(stat.)}\,^{+0.81}_{-0.94}\,\text{(syst.)}$ GeV. Apart from small changes that are due to the new $m_{jjj}$ templates, the resulting systematic uncertainty is, as expected, equivalent to the one without the additional selection criterion. This demonstrates that the ReSyst technique works conceptually.
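The size of the reduction can be checked directly from the quoted numbers, as in the short sketch below. Combining the statistical and systematic uncertainties in quadrature to obtain a total uncertainty is an assumption made here for illustration only.

```python
import math

# Uncertainties quoted in the text, in GeV
before = {"stat": 0.16, "up": 0.96, "down": 0.97}  # initial selection
after = {"stat": 0.18, "up": 0.67, "down": 0.59}   # with H_T > 220 GeV


def relative_reduction(old, new):
    """Fractional reduction when going from old to new."""
    return 1.0 - new / old


red_up = relative_reduction(before["up"], after["up"])        # about 0.30
red_down = relative_reduction(before["down"], after["down"])  # about 0.39

# Total uncertainty (upward side), stat and syst combined in quadrature
total_before = math.hypot(before["stat"], before["up"])
total_after = math.hypot(after["stat"], after["up"])
```

The upward systematic uncertainty decreases by about 30% and the downward one by about 39%, while the small increase in statistical uncertainty has little effect on the total.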

Conclusion and prospects
The ReSyst technique is presented as a novel technique to guide the design of experimental analyses for precision measurements for which the statistical uncertainty is small compared to the systematic uncertainty. This technique allows for balancing statistical and systematic uncertainties by rejecting events which induce a large systematic uncertainty. A quantifier $R_i$ is introduced to assess the impact of an individual event on the systematic uncertainty. Correlating $R_i$ with event observables opens the possibility to reduce the total uncertainty. This concept is demonstrated using a simplified top quark mass estimator in the context of proton-proton collisions at the LHC. For this estimator and for the considered systematic sources, the total systematic uncertainty is reduced by at least 30%. This reduction is obtained without optimizing the thresholds on the observables used to reduce the total systematic uncertainty. It should be noted that a number of systematic sources were not considered, such as the number of additional pileup collisions in the same or adjacent bunch crossings, effects from the modelling of colour reconnection, or from initial and final state radiation due to the uncertainty in the strong coupling $\alpha_S$ in the parton shower. Clearly, when applying the technique to data these uncertainties should be taken into account to perform a complete top quark mass measurement. When applying additional selection requirements using the ReSyst technique, it is possible that additional systematic uncertainties need to be considered, e.g. to accommodate potential differences in the selection efficiency between data and simulation.
The power of the ReSyst technique to define a quantifier for each event is at the same time also a limitation. The definition of $R_i$ assumes that systematic effects can be assessed by using the same events at the matrix-element level, such that there is a one-to-one connection between the nominal event and the event processed with a different value of the parameter(s) simulating the systematic variation. For some systematic sources and generators this is not (yet) the case, e.g. for the uncertainty in the modelling of the colour reconnection in the parton shower, which is typically evaluated using an independent sample with a different generator seed.
Optimizations to the initial concept presented here are possible. For some analyses it may be better to define $R_i$ separately for the upward and downward variation of a systematic source, e.g. when the systematic uncertainty is highly asymmetric, or to define $R_i$ for a subset of systematic sources. To optimize the additional selection criteria proposed by the ReSyst method, one could perform a scan in the observable space to find the optimal thresholds on the most promising observables and/or combine observables in a multivariate analysis to find an optimal selection threshold. If a profile likelihood ratio fit is used to perform the measurement, $R_i$ could be exploited to identify control distributions and/or regions with an enhanced sensitivity to a specific systematic uncertainty source. These can be used in an overall fit to constrain the systematic uncertainties due to various sources using the data itself. In addition to quantifying the impact of one event on the systematic uncertainty of an estimator, the statistical impact can be quantified as well, for example by considering the derivative of the likelihood at the measured value. The ReSyst method can be extended to consider both in order to minimize the total uncertainty on the estimator.
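A threshold scan of the kind suggested above could be organised as in the following sketch. The `evaluate` callable is hypothetical: in a real analysis it would redo the templates, the fit and all systematic variations for a given threshold and return the resulting statistical and systematic uncertainties. Taking the quadrature sum as the figure of merit is an assumption.

```python
import math


def scan_thresholds(thresholds, evaluate):
    """Return (best_threshold, best_total) minimising sqrt(stat^2 + syst^2).

    evaluate: callable taking a threshold and returning a (stat, syst) pair in GeV.
    """
    best = None
    for t in thresholds:
        stat, syst = evaluate(t)
        total = math.hypot(stat, syst)
        if best is None or total < best[1]:
            best = (t, total)
    return best


# Toy usage with a made-up response that has its optimum at 220 GeV
def toy_evaluate(threshold):
    return abs(threshold - 220.0) / 100.0, 0.6


best_threshold, best_total = scan_thresholds([180.0, 200.0, 220.0, 240.0, 260.0], toy_evaluate)
```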