On the ATLAS Top Mass Measurements and the Potential for Stealth Stop Contamination

The discovery of the stop - the Supersymmetric partner of the top quark - is a key goal of the physics program enabled by the Large Hadron Collider. Although much of the accessible parameter space has already been probed, a narrow open region persists. This "stealth stop" regime is characterized by decay kinematics that force the final state top quark off its mass shell; such decays would contaminate the top mass measurements. We investigate the resulting bias imparted to the template-method-based ATLAS approach. A careful recasting of these results shows that the effect can be as large as 2.0 GeV, comparable to the current quoted error bar. Thus, a robust exploration of the stealth stop splinter requires the simultaneous consideration of the impact on the top mass. Additionally, we explore the robustness of the template technique, and point out a simple strategy for improving the methodology implemented for the semi-leptonic channel.


Introduction
The top quark plays a critical role in understanding the structure of the Standard Model (SM) and its extensions. The measured value of the top quark mass $m_t$ (and Yukawa coupling) is an important input for precision tests of the self consistency of the SM. If nature is Supersymmetric (SUSY), the top should have a partner (the stop) that tames the ultraviolet sensitivity implied by the coupling between the top quark and the Higgs boson. Since this is one of the most compelling ways to extend the SM, an extensive search program for the stop has been conducted by both ATLAS [1,2] and CMS [3,4], yielding an impressive exclusion covering stop masses as high as 1 TeV. However, a narrow "splinter" region still persists at low masses, namely when $m(\tilde t_1) \sim m_t$. The SUSY framework makes it manifest that as the mass of the stop becomes parametrically large with respect to the weak scale, the fundamental parameters become increasingly fine tuned in order to reproduce the measured Higgs vacuum expectation value. Thus, there remains significant interest in this inherently natural but notoriously difficult to explore "stealth" stop region of parameter space. The degeneracy of these mass parameters implies tight kinematic constraints such that the final state looks nearly identical to top pair production, albeit where the tops are off-shell. Thus, not only does the presence of copious SM top pair production obscure the presence of the stop, but if the stop exists with a mass in this regime, the precision measurements of the top mass itself would be biased due to the presence of stop decays.$^1$ In order to fully expose the subtle phenomenological signals of stealth stops, we explore the robustness of the top quark mass measurements in three channels (all-hadronic, semi-leptonic, and di-leptonic) and quantify the potential contamination of these measurements due to a stealth stop.
Furthermore, we propose an improvement in the methodology for measuring the top mass in the semi-leptonic channel that has the benefit of being motivated by the physical configuration of the final state.
To achieve this goal, we carefully recast precise measurements of the top quark mass, choosing the ATLAS Collaboration's template method for its straightforward response to stop signal contamination; we expect our results to be generally applicable regardless of the method used.$^2$ The ATLAS Collaboration characterizes its measurements by the top decay products considered: all-hadronic [11] (with no leptons in the final state), semi-leptonic [12] (where one top quark decays to jets and the other decays via an electron or muon), and di-leptonic [13] (where both top quarks decay via an electron or muon). These measurements are based on the $\sqrt{s} = 8$ TeV dataset and are summarized in the left panel of Fig. 1; we also provide a crude combination$^3$ assuming uncorrelated Gaussian error bars, which is consistent with the sophisticated combination performed by ATLAS that includes the $\sqrt{s} = 7$ TeV dataset. A general review of the template method together with an illustrative toy example is presented in Sec. 2. In Sec. 3, we present a detailed analysis of the semi-leptonic channel and propose an improved strategy that requires minor modifications to the current ATLAS approach.

$^1$ See [5,6] for early phenomenological studies that also explore this question.
$^2$ The top mass has also been precisely measured by the CDF [7] and D0 [8] Collaborations using the matrix element technique, and by the CMS Collaboration using the ideogram method [9,10].

Figure 1: Parameters which give the best fit for the top mass measurement by ATLAS by combining the three different channels at $\sqrt{s} = 8$ TeV. The ATLAS measurements of the three different channels are shown in the left panel: all-hadronic [11] ($m_t = 173.72 \pm 1.15$ GeV), semi-leptonic [12] ($m_t = 172.08 \pm 0.91$ GeV), and di-leptonic [13] ($m_t = 172.99 \pm 0.84$ GeV). Assuming the SM only, the green band shows our crude best-fit value of $m_t^{\rm comb}$ with its associated uncertainty and the black band shows the ATLAS combination $m_t^{\rm ATLAS}$ given in Ref. [12], taking into account 7 and 8 TeV data. The center and right panels illustrate the impact of stealth stop contamination, where we show the best fit point in the $m_t$-$m(\tilde t_1)$ plane (marked by the orange star), and the 1-σ confidence interval as shaded bands. When the stops must decay through an off-shell top, they shift the reconstructed template mass extraction to values that are smaller than the actual top mass chosen in the Monte Carlo event generation.
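The crude three-channel combination mentioned above can be illustrated with a short sketch. It assumes that "uncorrelated Gaussian error bars" means a plain inverse-variance weighted average of the three values quoted in the Fig. 1 caption; the function name `combine_uncorrelated` is ours, and this is not a reproduction of the ATLAS combination, which treats correlated systematics.

```python
import math

# Central values and total uncertainties (GeV) quoted in the Fig. 1 caption.
measurements = {
    "all-hadronic": (173.72, 1.15),
    "semi-leptonic": (172.08, 0.91),
    "di-leptonic": (172.99, 0.84),
}


def combine_uncorrelated(values):
    """Inverse-variance weighted mean, treating the errors as independent Gaussians."""
    weights = [1.0 / sigma**2 for _, sigma in values]
    total = sum(weights)
    mean = sum(w * m for w, (m, _) in zip(weights, values)) / total
    # The combined uncertainty is the inverse square root of the summed weights.
    return mean, 1.0 / math.sqrt(total)


m_comb, sigma_comb = combine_uncorrelated(list(measurements.values()))
print(f"m_t(comb) = {m_comb:.2f} +/- {sigma_comb:.2f} GeV")
```

Because correlations between channels are ignored, the resulting uncertainty should be read as optimistic rather than as a substitute for the official combination.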
The potential contamination from a stealth stop is modeled using the "stop-neutralino" Simplified Model [14][15][16], which is inspired by the "more minimal SUSY SM" [17,18]. Under the well-motivated assumption the lightest superpartner is a stable state, phenomenological viability requires that the particle be neutral, thereby providing a dark matter candidate [19,20], the so-called lightest neutralino $\tilde\chi_1^0$. The rate of direct stop pair production is fully specified by $m(\tilde t_1)$, and each stop subsequently decays to an on- or off-shell top quark and $\tilde\chi_1^0$, as illustrated in Fig. 2.$^4$ The stealth stop region of parameter space is thus more precisely defined by $m(\tilde t_1) - m_t \simeq m(\tilde\chi_1^0)$. The degeneracy of these mass parameters implies tight kinematic constraints such that the final state looks nearly identical to top pair production, perhaps with some additional missing energy due to the presence of the neutralino. There have been many phenomenological studies to constrain light or compressed stops, e.g. [14,15,. The stop is off-shell in much of this parameter space; a careful modeling of the angular distributions of the final state is needed since the kinematics can have a non-trivial impact on the resulting efficiencies. Therefore, one must abandon the narrow-width approximation [44] (depicted in the right panel of Fig. 2) and compute the full four-body kinematics (as illustrated in the left panel of Fig. 2). Here, we will follow the procedure developed in Ref. [42] for simulating events including these effects.

Figure 2: The left diagram illustrates the full process for the parameter space where $m(\tilde t_1) - m(\tilde\chi_1^0) < m_t$ including the off-shell propagators which encode the non-trivial kinematic correlations, and the right diagram illustrates the same process in the narrow width approximation. The green circles represent the full tree-level stop pair production matrix element, which is included in our simulations. The gray circles represent decays that do not include any matrix element information, i.e., the particles are decayed using phase space alone. The superscripts $(*)$ denote particles that can go off shell. This figure was adapted from diagrams given in Ref. [45].
The central and right panels of Fig. 1 provide a summary of our main results, which are described in detail in Sec. 4. We introduce stop contamination into the recast top mass measurements and provide a simple combination of the three channels for the 1-σ best-fit region in the $m(\tilde t_1)$-$m_t$ plane. The best fit point is shown as an orange star. Two neutralino mass points are shown: $m(\tilde\chi_1^0) = 1$ GeV (center) and $m(\tilde\chi_1^0) = 20$ GeV (right) for a range of stop and top masses. The maximum bias for each channel is also summarized in Table 1. The bias in the observed top mass depends on the mass of the contaminating stops, and it can be as large as 2.0 GeV in the di-leptonic channel.

Table 1: Summary of the maximum bias on the measured $m_t$ due to stop contamination in each channel, assuming $m(\tilde\chi_1^0) = 1$ GeV. The top row shows the mass of the stop that maximally biases the experimentally measured mass from the Monte Carlo truth mass. The size of the bias in the measurement for each channel is shown in the bottom row.
Our results have an important impact on interpretations of stop exclusion in the stealth stop region. Both precision measurements and direct searches have attempted to whittle away the apparent available parameter space to a mere "splinter." An early ATLAS approach to examining this region relied on precision measurements of the top cross section [46], although only results for $m(\tilde\chi_1^0) = 1$ GeV were presented. This motivated our previous study [42], where we performed a careful recasting of the ATLAS exclusion to extend it into the full stop-neutralino mass plane. More recently, both ATLAS [47] and CMS [48] have exploited the clean signature and angular distributions of $e\mu$ events, nearly excluding the narrow splinter-like region. However, Ref. [49] shows that observed limits on the stop mass using the $t\bar t$ cross section ratio at 7 and 8 TeV center-of-mass energy collisions at the LHC drop from around 180 GeV to 160 GeV if the top mass is changed from 172.5 to 175 GeV, indicating that $\mathcal{O}(1\ \mathrm{GeV})$ shifts in the top mass can have an appreciable impact on the stop limits. In this paper we demonstrate that stealth stops can contaminate the top mass measurement at this level, which would lead one to infer that the top mass is lighter than its true underlying value. To know if we have actually closed the window on light stops, the interplay between the measured top mass and the stealth stop exclusion limits must be rigorously explored.

... perturbation theory. Two common choices yield what is referred to as the "pole" mass or the "$\overline{\mathrm{MS}}$" mass. In the three measurements studied here, ATLAS avoids these issues and instead infers what is often called the "Monte Carlo" mass by comparing some observable that is sensitive to the top mass against Monte Carlo generator predictions as a function of the numerically implemented mass parameter. The MC top quark mass $m_{t,\mathrm{MC}}$ is related to the field-theoretic pole mass $m_{t,\mathrm{pole}}$ as

$$ m_{t,\mathrm{MC}} = m_{t,\mathrm{pole}} \pm \delta m_t . \tag{2.1} $$

In the discrepancy $\delta m_t \sim \mathcal{O}\big(Q_0\, \alpha_s(Q_0)\big)$, $Q_0$ corresponds to the scale of the shower cutoff [50][51][52][53][54], and $\alpha_s$ is the strong coupling. Other studies suggest the uncertainty in this conversion is on the order of the hadronization scale [55,56]; see Ref. [57] for a study on reducing this ambiguity by means of jet grooming. We conclude that the difference is generally on the order of a few hundred MeV, which is comparable to typical modern experimental precision [58]. From here forward, we will put these issues aside and focus on the methodology employed by ATLAS; we emphasize that $\delta m_t$ is another source of systematic uncertainty that must be tracked when comparing the value of $m_t$ measured by ATLAS to other approaches or when using it as an input to a theory calculation.

In order to compare Monte Carlo predictions to data, ATLAS relies on a template method. An observable $O$ is chosen such that it is sensitive to the top mass, and simulations are then used to compute distributions for multiple values of $m_t$. Clearly, the particular choice of $O$ depends on the channel under consideration; for example, in the all-hadronic channel [11], ATLAS constructs the ratio between the 3- and 2-jet invariant masses as this minimizes sensitivity to the jet energy scale uncertainty. Samples of the distributions for $O$ are generated over a range of values for the top mass (ATLAS does this for five $m_t$ values from 167.5 GeV to 177.5 GeV). A set of pre-selection cuts is then applied, and each resulting distribution is fit with the same parametric curve. The resulting best-fit parameter values are then assumed to be linear functions of $m_t$, and an interpolation is derived by fitting a line to each parameter as a function of $m_t$. The resulting object is the so-called template, which allows one to "predict" the shape of $O$ as a function of $m_t$.
To make this procedure more concrete, and to highlight some of its features, we work out a detailed toy example template in what follows.

A Toy Example
In this section, we present a toy model that illustrates how the template method works in practice. For now, we will assume that the distribution for the observable $O$ has a characteristic peak followed by an extended tail. For concreteness, we model such a shape using a Gaussian for the peak and a Landau function for the tail, where the latter is defined as

$$ P_{\rm Landau}(x; \mu, c) = \frac{1}{\pi c} \int_0^\infty dt\, e^{-t} \cos\!\left[ t\, \frac{x - \mu}{c} + \frac{2t}{\pi} \log\frac{t}{c} \right] , \tag{2.2} $$

where $\mu$ essentially controls the location of the peak and $c$ controls the width of the distribution. This toy model is described by six parameters:

$$ P(x; a, b, \chi, \sigma, \mu, c) = a\, P_{\rm Gaussian}(x; \chi, \sigma) + b\, P_{\rm Landau}(x; \mu, c) . \tag{2.3} $$

The parameters $a$ and $b$ control the relative normalizations of the Gaussian and the Landau components,$^5$ while $\chi$ and $\sigma$ are the mean and standard deviation of the Gaussian, respectively. We choose to use a Gaussian plus a Landau as our toy distribution since this is the shape used by ATLAS for the di-leptonic measurement. The other two channels are fit to similar distributions as discussed below.
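As a rough numerical illustration of the toy shape, the sketch below evaluates a Gaussian plus Landau mixture in pure Python. It assumes the standard integral representation of the Landau density with location $\mu$ and scale $c$, evaluated by Simpson's rule; the function names, the integration cutoff `t_max` and the step count are our own choices, not taken from the paper.

```python
import math


def landau_pdf(x, mu, c, t_max=40.0, n_steps=4000):
    """Landau density from its integral representation, via Simpson's rule."""
    h = t_max / n_steps  # n_steps must be even for Simpson's rule

    def integrand(t):
        if t == 0.0:
            return 1.0  # limit t -> 0 of exp(-t) * cos(phase)
        phase = t * (x - mu) / c + (2.0 * t / math.pi) * math.log(t / c)
        return math.exp(-t) * math.cos(phase)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / (math.pi * c)


def gaussian_pdf(x, chi, sigma):
    """Normalised Gaussian with mean chi and standard deviation sigma."""
    z = (x - chi) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def toy_pdf(x, a, b, chi, sigma, mu, c):
    """Six-parameter toy shape: a * Gaussian + b * Landau."""
    return a * gaussian_pdf(x, chi, sigma) + b * landau_pdf(x, mu, c)
```

A call such as `toy_pdf(x, a, b, chi, sigma, mu, c)` then gives the unnormalised mixture at a single point; a production fit would more likely rely on a vetted library implementation of the Landau function.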
The key to choosing a good observable is that its shape (and ideally the location of a peak) must change as a function of the underlying parameter of interest. For the measurements of interest below, this parameter is the top mass, while in the toy model studied in this section, we will call this $m_{\rm toy}$. We model the "truth-level" change in the underlying six parameters defined in Eq. (2.3) as linear functions of $m_{\rm toy}$, which are chosen to closely mimic those that ATLAS extracts from real data.
Once the observable $O$ and the parametric model are chosen, the next step is to construct the templates. The ATLAS approach relies on Monte Carlo simulations for different choices of $m_t$. For our toy example, we draw samples from the truth-level probability distributions at five values of $m_{\rm toy}$ using the Metropolis-Hastings Markov Chain Monte Carlo (MCMC) algorithm [59,60]. A dataset of 10,000 elements is constructed for each choice of $m_{\rm toy}$; these datasets are subsequently binned and normalized. We then fit the resulting histograms to the distribution given in Eq. (2.3). An example fit is shown in the left panel of Fig. 3, comparing the fitted distribution to the toy data, and the right panel displays the best-fit templates for three different values of $m_{\rm toy}$.
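The sampling step can be sketched with a minimal random-walk Metropolis-Hastings sampler. This is only an illustration: the target below is a stand-in (a narrow Gaussian peak plus a broad Gaussian "tail" rather than a Landau), and every numerical value, the proposal width and the burn-in length are assumptions chosen for the example.

```python
import math
import random


def metropolis_hastings(log_density, x0, step, n_samples, burn_in=1000, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, log_p = x0, log_density(x0)
    samples = []
    for i in range(burn_in + n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        diff = log_p_new - log_p
        # Accept with probability min(1, p_new / p_old).
        if diff >= 0.0 or rng.random() < math.exp(diff):
            x, log_p = proposal, log_p_new
        if i >= burn_in:
            samples.append(x)
    return samples


def log_toy_density(x, chi=170.0, sigma=10.0, tail_mean=190.0, tail_sigma=30.0, frac=0.7):
    """Stand-in target: Gaussian peak plus a broad Gaussian tail (not a Landau)."""
    peak = frac * math.exp(-0.5 * ((x - chi) / sigma) ** 2) / sigma
    tail = (1.0 - frac) * math.exp(-0.5 * ((x - tail_mean) / tail_sigma) ** 2) / tail_sigma
    density = peak + tail
    return math.log(density) if density > 0.0 else -math.inf


# One toy dataset of 10,000 elements, matching the sample size quoted in the text.
samples = metropolis_hastings(log_toy_density, x0=170.0, step=15.0, n_samples=10_000)
sample_mean = sum(samples) / len(samples)
```

Successive samples from such a chain are correlated, so in practice one might thin the chain or check the acceptance rate before binning.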
In order to account for the statistical noise due to finite sample sizes, we generate 100 independent data sets from the truth-level distribution, and find the best fit parameters for each. The mean and the standard deviation for each of the parameters are shown as the data points with error bars in Fig. 4, and the linear functions are indicated by the dashed red lines. It is not surprising to see that the largest range occurs for the variable $\chi$, since this determines the location of the peak. Additionally, this parameter $\chi$ has the smallest fractional uncertainty of ∼ 1%, while the other parameter error bars vary from ∼ 4% to as much as ∼ 8%, which can be traced back to its sensitivity to the position of the peak of the distribution. Each of these distributions is then fit to a line including the impact of the error bars on the fit, as shown by the blue lines in Fig. 4. The final step for constructing a template is to use these linear fits as a function of $m_{\rm toy}$ to convert the parametric model of Eq. (2.3) into a function of $m_{\rm toy}$ alone. Explicitly, the model becomes $P(x; a, b, \chi, \sigma, \mu, c) \to P(x; m_{\rm toy})$, where each of the original parameters is determined by the appropriate best fit linear function of $m_{\rm toy}$. Finally, one can use these templates to extract a mass measurement by fitting the template (which is now a function of the single parameter $m_{\rm toy}$) to the experimentally determined distribution.
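The linear interpolation of each shape parameter can be sketched as a weighted least-squares line fit. The closed-form solution below is standard; the mass grid, the input values for $\chi$ and the 1% error bars are made-up numbers for illustration, not the values shown in Fig. 4.

```python
def weighted_linear_fit(x, y, yerr):
    """Closed-form weighted least squares for y = intercept + slope * x."""
    w = [1.0 / e**2 for e in yerr]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    slope = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return intercept, slope


# Hypothetical inputs: five mass points and the mean fitted chi at each one.
m_points = [167.5, 170.0, 172.5, 175.0, 177.5]  # assumed equally spaced grid (GeV)
chi_mean = [0.5 * m + 80.0 for m in m_points]   # made-up, exactly linear values
chi_err = [0.01 * c for c in chi_mean]          # ~1% error bars, for illustration

intercept, slope = weighted_linear_fit(m_points, chi_mean, chi_err)


def chi_of_mass(m_toy):
    """Template value of chi at an arbitrary mass, from the fitted line."""
    return intercept + slope * m_toy
```

Repeating the fit for each of the six parameters would give the full set of lines needed to turn the parametric shape into a function of the mass alone.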
We identify two sources of uncertainty within the template method as implemented here: the first is the statistical uncertainty from using the derived template, and the second is the systematic uncertainty associated with deriving the template itself. We use a closure test to assess the size of these uncertainties. An extra 100 sets of samples for a given mass point are generated; for each we find the template mass that best fits the distribution (using the templates given by the blue lines of Fig. 4). The difference between the truth and extracted values is small, and the standard deviation gives us an estimate for the uncertainty of using the template, around 0.33 GeV for the toy model.
To measure the second source of uncertainty, which results from the assumption of the linear dependence of the template parameters, we repeat the closure test using the dotted lines of Fig. 4 denoting the uncertainty of the linear fits. Using the shifted template results in extracting $m_{\rm toy} \sim m_{\rm true} \pm 1.5$ GeV, depending on whether the upper or lower shift is considered. In the toy model, we find this systematic uncertainty from deriving the template is larger than the statistical uncertainty. When we perform the same tests for the ATLAS top mass measurements, we find that the two uncertainties are similar in size to each other and subdominant to other quoted experimental uncertainties, e.g. those that come from the parton distribution functions or the jet energy scale.

Dependence on the Choice of Fit Function
The last section addressed some of the uncertainty associated with constructing a template. However, in performing those tests we used a parametric fit function that has the exact same form as the true underlying distribution. This is in contrast with the fit functions utilized by ATLAS, which are not necessarily determined from the underlying physics. As we will show here, the template method is quite robust as long as the model parameters are linearly dependent on $m_{\rm toy}$, even if the model does not provide a particularly good fit to the distribution of the observable $O$.
To illustrate this point, we repeat the template analysis with the true data distributed according to the same toy model described by Fig. 4. We now fit the distributions by a Gaussian alone, which does not model the tail of the $O$ distribution. As shown in Fig. 5, the best-fit Gaussian tracks the location of the peak, which is highly correlated with the underlying $m_{\rm toy}$. We then generate a template in analogy with the above, and perform a closure test, which yields the left panel of Fig. 6. For comparison, we provide the closure test result from the truth template in the right panel. Surprisingly, the bias induced by this simple-yet-crude model for the shape of $O$ is smaller than when we used the full model, and a similar trend is observed for the standard deviation. We conclude that although there is no a priori way to determine what parametric shape to use, the template procedure is not particularly sensitive to this choice.$^6$ The fact that the template method does not require a model which accurately depicts the data can be seen as both a positive and a negative feature. On the positive side, it implies that one does not need to worry too much about the actual shape of the distribution when constructing a fit function, which is a plus since it is unknown how one might determine such shapes analytically (especially including the impact of pre-selection cuts). On the other hand, this opens the possibility that physically unmotivated observables can be used, as long as they are relatively correlated with the top mass. In fact, we will argue in the next section that the observable used for the semi-leptonic channel has the potential for extra accidental biases, and is additionally hard to interpret as a physical distribution. The fact that a good fit is not a necessary requirement for closure in the template approach implies that subtle effects could bias the final extracted value of the top mass without warning when

The Semi-leptonic Channel: A Modified Approach
Although the main focus of this study is to quantitatively investigate the impact that light stops could have on the measurement of the top quark mass, in this section we will critically evaluate the application of the template method to the semi-leptonic final state as currently implemented by ATLAS in Ref. [12]. In particular, we have identified a subtle issue with their procedure that can be corrected by a straightforward implementation of a two-dimensional template. The issue and its resolution are presented in what follows.

Pre-selection Cuts
The defining characteristic of the semi-leptonic channel is that one of the top decays involves a lepton and the other decays fully hadronically. The final state of interest is then two b-jets, two light flavor jets, one charged lepton, and missing energy from the neutrino. This is a powerful channel since the QCD background is reduced due to the lepton requirement.
We recast the ATLAS measurement [12] as closely as possible. However, we encountered a subtle issue as discussed in what follows, which motivates our modified approach. The parton level events are generated using Madgraph [61], and are subsequently showered and hadronized using Pythia 8 [62]. We use Delphes [63] to simulate detector effects, and we modified the Delphes detector card to match the b-tagging characteristics reported by ATLAS. More details regarding the event generation can be found in App. A, and additional details and validation results for the semi-leptonic channel are given in App. D.
In reconstructing objects for the analysis, we use the following set of definitions. Electron candidates are required to have a transverse momentum of $p_T > 25$ GeV and $|\eta| < 2.47$, excluding the range (1.37, 1.52) due to the mismatch between the barrel and the end cap at ATLAS. Muon candidates must satisfy $p_T > 25$ GeV and $|\eta| < 2.5$. Jet candidates are reconstructed with the anti-$k_t$ algorithm [64] with a radius of $R = 0.4$ and are required to satisfy $p_T > 25$ GeV and $|\eta| < 2.5$. Muons reconstructed within $\Delta R < 0.4$ of a jet candidate are considered to be part of the jet and are subsequently removed from the list of charged lepton candidates. Jet candidates are labeled as jets if they have $\Delta R > 0.2$ from all electron candidates, and otherwise they are removed. Finally, electron candidates within $\Delta R < 0.4$ of a valid jet are removed. We set a flat b-tagging efficiency of 0.7, and rejection factors of 5 and 140 for the charm quark and the light quarks, respectively.
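The object definitions and overlap removal described above can be sketched as follows. The representation of candidates as dictionaries with `pt`, `eta` and `phi` keys, and the function names, are assumptions made for this illustration; the thresholds are the ones quoted in the text.

```python
import math


def delta_r(a, b):
    """Angular distance in the (eta, phi) plane, with phi wrapped to [-pi, pi]."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)


def select_objects(electrons, muons, jets):
    """Apply the kinematic cuts, then the three overlap-removal steps in order."""
    electrons = [e for e in electrons
                 if e["pt"] > 25.0 and abs(e["eta"]) < 2.47
                 and not 1.37 < abs(e["eta"]) < 1.52]
    muons = [m for m in muons if m["pt"] > 25.0 and abs(m["eta"]) < 2.5]
    jets = [j for j in jets if j["pt"] > 25.0 and abs(j["eta"]) < 2.5]

    # Muons within dR < 0.4 of a jet candidate are treated as part of that jet.
    muons = [m for m in muons if all(delta_r(m, j) >= 0.4 for j in jets)]
    # Jet candidates survive only if dR > 0.2 from every electron candidate.
    jets = [j for j in jets if all(delta_r(j, e) > 0.2 for e in electrons)]
    # Electrons within dR < 0.4 of a surviving jet are removed.
    electrons = [e for e in electrons if all(delta_r(e, j) >= 0.4 for j in jets)]
    return electrons, muons, jets
```

The order of the three steps matters: muons are compared against jet candidates before any jet is dropped, while electrons are compared only against the jets that survive.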
To select events that are likely due to the semi-leptonic decay of a tt pair, the following pre-selection cuts are imposed:
• Exactly one charged lepton.
• The E T / and m W T cuts depend on the type of lepton (footnote 7):
  • µ channel: E T / > 20 GeV and E T / + m W T > 60 GeV.
  • e channel: E T / > 30 GeV and m W T > 30 GeV.
• Exactly two b-tagged jets.
Table 4 in App. D shows the number of events that survive each of these successive cuts as predicted by our simulation.
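The cuts listed above translate directly into a short selection function. The sketch below is illustrative only: the input layout is hypothetical, and the transverse mass is computed with the standard definition m W T = sqrt(2 p T,ℓ E T / (1 - cos ∆φ)), which is assumed here rather than taken from the text.

```python
import math


def w_transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Standard W transverse mass from the lepton and the missing momentum."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi)))


def passes_preselection(leptons, met, met_phi, n_btags):
    """Semi-leptonic pre-selection.

    leptons: list of (flavor, pt, phi) tuples with flavor 'e' or 'mu'
             (a hypothetical layout); energies and momenta in GeV.
    """
    # Exactly one charged lepton.
    if len(leptons) != 1:
        return False
    flavor, pt, phi = leptons[0]
    mtw = w_transverse_mass(pt, phi, met, met_phi)
    # Lepton-flavor-dependent cuts on the missing energy and transverse mass.
    if flavor == "mu":
        if not (met > 20.0 and met + mtw > 60.0):
            return False
    elif flavor == "e":
        if not (met > 30.0 and mtw > 30.0):
            return False
    else:
        return False
    # Exactly two b-tagged jets.
    return n_btags == 2
```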

A Likelihood Approach to Inferring the Neutrino Momentum
A likelihood-based method is used to determine the missing neutrino momentum and address the combinatoric backgrounds, as developed in Ref. [65]. This methodology is the basis of the template approach as a function of "m t,reco " developed by ATLAS in the semi-leptonic channel, as discussed in the next section. In order to recast this method, we build a likelihood function from Breit-Wigner (BW) distributions [66], defined in Eq. (3.2) for each event that passes the preselection cuts, where m is the particle mass and Γ is its width. The likelihood function, Eq. (3.3), is simply the product of four BWs, one for each of the two W bosons and one for each of the two top quarks, where p b 1,2 are the four momenta of the two b-jets, p q 1,2 are those of the untagged jets, and p ℓ is the lepton four momentum. ATLAS additionally includes the impact of transfer functions that model the mapping from detector to particle level momenta; we neglect this effect in our analysis as it would have a minimal impact on our results.

The inputs to Eq. (3.3) are the lepton momentum, the missing transverse momentum, and the momenta for up to six jets. The x and y components of the neutrino momentum are assumed to be equal to the missing energy components. The z component, p z,ν , is unmeasurable at the LHC, and is therefore treated as a free parameter when maximizing the likelihood function, where the initial value provided to the maximizer is derived from m 2 W = (p ℓ + p ν ) 2 . If the solutions for p z,ν are complex, then the initial guess for the maximization is set to p init z,ν = 0. If there are two real solutions, then the solution resulting in the largest likelihood is used. The likelihood is then maximized for all possible assignments of the b-tagged jets to the leptonic side of the event, and all choices of two out of the possible four un-tagged jets. The choice which maximizes the likelihood is then taken to determine the assignment of decay products for both hadronic and leptonic tops. We have additionally checked that this approach does a reasonable job of reproducing the truth level assignments of final states with the appropriate top, and that it tends to find a very good approximation for the z-component of the neutrino momentum, as expected.

Footnote 7: m W T is the transverse mass of the W, defined as $m_T^W = \sqrt{2\, p_T^{\ell}\, E_T^{\rm miss} (1 - \cos\Delta\phi)}$, where $E_T^{\rm miss}$ denotes E T / and ∆φ is the azimuthal angle between the lepton and the missing transverse momentum.
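The starting value for p z,ν follows from solving the W mass constraint as a quadratic equation. A minimal sketch is given below, assuming a massless lepton and neutrino; the function names and the default value of the W mass are assumptions made for illustration, not inputs quoted in the text.

```python
import math

M_W = 80.4  # GeV; assumed value for illustration


def neutrino_pz_candidates(lep, met_x, met_y, m_w=M_W):
    """Starting values for p_z of the neutrino from m_W^2 = (p_l + p_nu)^2.

    lep = (px, py, pz) of the charged lepton, treated as massless.
    Returns the two real solutions, or [0.0] when the solutions are complex.
    """
    px, py, pz = lep
    pt2 = px * px + py * py
    energy = math.sqrt(pt2 + pz * pz)
    mu = 0.5 * m_w ** 2 + px * met_x + py * met_y
    disc = mu * mu - pt2 * (met_x ** 2 + met_y ** 2)
    if disc < 0.0:
        # Complex solutions: fall back to p_z = 0 as the initial guess.
        return [0.0]
    root = math.sqrt(disc)
    return [(mu * pz - energy * root) / pt2, (mu * pz + energy * root) / pt2]


def choose_start(candidates, likelihood):
    """Pick the candidate that gives the largest likelihood value."""
    return max(candidates, key=likelihood)
```

In practice `likelihood` would be the event likelihood evaluated at the candidate p z,ν with the jet assignment held fixed; any callable of a single float can be passed here.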

The ATLAS Semi-leptonic Template
After selecting events using the preselection cuts described above in Sec. 3.1, ATLAS applies the likelihood method introduced in Sec. 3.2. This provides a systematic way of assigning final state objects to either of the two top candidates, which is then used to construct a three dimensional template as a function of m t , the jet energy scale (JES), and the b-JES. This is done by fitting to three observables O: m t,reco , m W,reco , and R bq , where

$$ R_{bq} = \frac{p_T^{b,\mathrm{had}} + p_T^{b,\mathrm{lep}}}{p_T^{q_1} + p_T^{q_2}} , $$

p b,had T and p b,lep T are the transverse momenta of the b-jets assigned to the hadronic and leptonic sides of the event, respectively, and q 1,2 are the light flavor jets that are associated with the decay of the W.
ATLAS finds that m W,reco largely constrains the JES, while R bq constrains the b-JES relative to the JES. Given that our analysis relies on a simple parametrized detector simulation, we are not equipped to perform a realistic study of the impact of varying the JES or b-JES. Critically, we find that our m W,reco and R bq distributions agree relatively well with those provided by ATLAS, see the left and center panels of Fig. 7. Therefore, we are confident that the JES and b-JES dependence will not have a significant impact on our interpretation of the semi-leptonic mass measurement. In contrast, our m t,reco distribution (which is critical to the extraction of m t ) does not agree with ATLAS, see the right panel of Fig. 7. Understanding the source of this mismatch is the subject of the next section.

Figure 7: Comparison of our simulated m W,reco , R bq , and m t,reco distributions with the ATLAS results [12]. We observe good agreement for the m W,reco and R bq distributions, but we are unable to reproduce the m t,reco distribution.

Why m t,reco Does Not Peak at the Top Mass
It is concerning that ATLAS finds that the distribution for m t,reco peaks below the actual top mass; this is in contrast with our implementation, which yields a peak closer to m t , see Fig. 7. It is important to emphasize that the location of the peak for ATLAS is not the extracted value, which comes from finding the best-fit template. We reiterate that, as shown in Sec. 2, as long as the shape of the template varies linearly with the generator mass, the template procedure will close and the extraction of the best fit is expected to be robust. Nevertheless, this section is devoted to explaining the mismatch between the two m t,reco distributions. Along the way, we will argue that m t,reco is not physically meaningful, which will motivate a physics-driven proposal for a modified approach presented in Sec. 3.5.
In order to generate their distribution, ATLAS populates a histogram using the value of m t,reco that maximizes the likelihood function in Eq. (3.3) for each event. The underlying assumption is that m t,reco captures the best fit top mass for the whole event. However, this is not the correct interpretation of this variable, as can be made clear by simply studying the form of the likelihood function. As discussed above, the likelihood-based approach provides a way to systematically assign the final state objects to the two top quarks in the event, while also solving for the z-component of the neutrino momentum. Assuming one has made all of these choices such that the maximum value for the likelihood can be achieved, we are left with a simple function of m t,reco , Eq. (3.5) (footnote 8), where k is the numerator factor given in Eq. (3.2), p lep and p had are the sums of the four momenta for the final states assigned to the leptonic and hadronic tops, respectively, and the width Γ is set to the PDG value, 1.41 GeV.

Figure 8: The double peak structure of the log likelihood function for 5 random events, using truth level neutrino four momenta and Delphes reconstructions otherwise. Note that the blue curve has a second peak at around 450 GeV. It is clear by eye that the right peak is always slightly higher than the left one.
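For orientation, one explicit form consistent with the description of Eqs. (3.2) and (3.5) is sketched below. It assumes the standard relativistic Breit-Wigner normalization, which may differ from the convention of Ref. [66], and it drops the two W-boson factors, which no longer depend on m t,reco once the assignments and p z,ν are fixed. The auxiliary symbol γ is introduced here for illustration only.

```latex
% Assumed form of the Breit-Wigner distribution, cf. Eq. (3.2)
\mathrm{BW}(E;\, m, \Gamma) = \frac{k}{\left(E^2 - m^2\right)^2 + m^2 \Gamma^2},
\qquad
k = \frac{2\sqrt{2}\, m\, \Gamma\, \gamma}{\pi \sqrt{m^2 + \gamma}},
\qquad
\gamma = \sqrt{m^2 \left(m^2 + \Gamma^2\right)} .

% Resulting dependence on m_{t,reco}, cf. Eq. (3.5), with k evaluated at m = m_{t,reco}
\mathcal{L}(m_{t,\mathrm{reco}}) \propto
\frac{k^2}
{\left[\left(p_{\mathrm{lep}}^2 - m_{t,\mathrm{reco}}^2\right)^2 + m_{t,\mathrm{reco}}^2 \Gamma^2\right]
 \left[\left(p_{\mathrm{had}}^2 - m_{t,\mathrm{reco}}^2\right)^2 + m_{t,\mathrm{reco}}^2 \Gamma^2\right]} .
```

With this normalization k grows approximately as m² for Γ ≪ m, in line with the scaling of the numerator factor quoted in the text.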
The choice to use a Breit-Wigner shape when constructing the likelihood function that peaks at the best fit mass of the top quark is clearly physically motivated. However, while the product form in Eq. (3.3) works very well as an approach to the combinatoric background and for determining p z for the neutrino, it does not return an event-level "best fit" for the top mass. In particular, using the simplified expression in Eq. (3.5), it is straightforward to see that the likelihood shape has two very sharp peaks, one for each choice of m 2 t,reco that equals p 2 had or p 2 lep . This point is clearly illustrated in Fig. 8, where we evaluate Eq. (3.5) as a function of m t,reco for five independent top pair production events. We use the detector-level reconstructions of the momenta for the visible objects, while the neutrino momentum is taken to be the true value. Note that the right peak will always be higher than the left one; this is clear from the fact that the Breit-Wigner numerator factor scales as $k \sim m^2$. Therefore, if one were to follow the ATLAS procedure to populate a histogram with the values of m t,reco that maximize the likelihood function, the resulting distribution would be systematically biased. There is an additional practical issue due to the need to use a numerical maximizer to find the value of m t,reco that yields the largest likelihood. Due to the sharply peaking nature of the Breit-Wigner functions, we found that the output of the numerical approach is very sensitive to the specific algorithm used as well as the initial guess, resulting in a method that is neither stable nor reproducible (footnote 9). Therefore, we believe these issues are the source of the difference between the ATLAS distribution and our attempt to recast it, as shown in the right panel of Fig. 7.
Finally, we note that the value of m t,reco that maximizes the likelihood corresponds to the peak associated with the hadronic top ∼ 57% of the time. This implies that the m t,reco distribution is a non-trivial mixture of hadronic and leptonic tops, with unknown implications for systematic effects on the shape of m t,reco . Furthermore, recall that the goal of this approach was to summarize the entire event into a single value of m t,reco . As we have now explained, this approach only captures the mass measurement of the side of the event that corresponds to the heavier top candidate, and discards information about the other side of the event. This motivates our proposal for a modified approach, which is presented in the next section.
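The two-peak behaviour described in this section can be checked numerically. The sketch below assumes the standard relativistic Breit-Wigner normalization as a stand-in for Eq. (3.2), and treats the likelihood, once all assignments are fixed, as the product of the two top-quark factors. The function names and the coarse grid scan are illustrative choices, not the maximizer used in the analysis.

```python
import math

GAMMA_T = 1.41  # GeV, the width quoted in the text


def rel_breit_wigner(e, m, gamma):
    """Relativistic Breit-Wigner (assumed normalization; numerator k ~ m^2)."""
    g = math.sqrt(m * m * (m * m + gamma * gamma))
    k = 2.0 * math.sqrt(2.0) * m * gamma * g / (math.pi * math.sqrt(m * m + g))
    return k / ((e * e - m * m) ** 2 + m * m * gamma * gamma)


def reduced_likelihood(m_reco, m_lep, m_had, gamma=GAMMA_T):
    """Product of the leptonic and hadronic top factors as a function of m_reco.

    m_lep and m_had are the invariant masses of p_lep and p_had.
    """
    return rel_breit_wigner(m_lep, m_reco, gamma) * rel_breit_wigner(m_had, m_reco, gamma)


def scan_maximum(m_lep, m_had, lo=100.0, hi=300.0, step=0.01):
    """Brute-force grid scan for the m_reco that maximizes the likelihood."""
    n_steps = int(round((hi - lo) / step))
    best_m, best_val = lo, -1.0
    for i in range(n_steps + 1):
        m = lo + i * step
        val = reduced_likelihood(m, m_lep, m_had)
        if val > best_val:
            best_m, best_val = m, val
    return best_m
```

For an event with, say, candidate masses of 165 GeV and 178 GeV, the scan lands close to the heavier candidate regardless of which side is hadronic, which is the behaviour argued for above.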

A Two-dimensional Mass Extraction Template
Instead of using a one-dimensional template for m t,reco , one would prefer an approach that takes advantage of the fact that there is both a leptonic and a hadronic top decay in each event. We propose a modified approach in this section, relying on the same combined likelihood given in Eq. (3.3) to control the combinatorics and to solve for the missing neutrino momentum. We use the configuration that maximizes the likelihood to generate the distribution shown in Fig. 9, where we give a two-dimensional density plot of the hadronic and leptonic top masses that result. One observes that the density is essentially symmetric about the diagonal m had t = m lep t , and that most of the time the values of the top masses from the two sides of the event are very similar. Along the diagonal, the density peaks near m t ∼ 170 GeV and then has an extended tail to larger masses. This 2D plane provides an excellent candidate for an improved observable O from which we construct a template. For our parametric model, we want a function with a peak and a tail along the diagonal. We chose this to be a Gaussian plus a Landau function, following the ATLAS approach used to fit a one-dimensional m t,reco distribution. Then we model the spread orthogonal to the diagonal using a second independent Gaussian. Concretely, the two-dimensional template is

$$ P(x, y;\, a, b, \chi_1, \sigma_1, \mu, c, \chi_2, \sigma_2) = \left[ a\, P_{\rm Gaussian}(x;\, \chi_1, \sigma_1) + b\, P_{\rm Landau}(x;\, \mu, c) \right] \times P_{\rm Gaussian}(y;\, \chi_2, \sigma_2) , $$

where x and y are the distance along the diagonal and the distance away from the diagonal, respectively. This is a relatively crude model for the distribution shown in Fig. 9, and it does not take into account how the spread away from the diagonal changes as a function of the distance from the origin. We tested that a more precise fitting function did not lead to improved extractions of the Monte Carlo top mass, while drastically increasing the computational time to perform the two-dimensional fit. This makes sense given the discussion in Sec. 2.2 of the sensitivity of the template approach to the shape of the fit function.

Figure 9: Two-dimensional density of the hadronic and leptonic top masses. In the majority of events, the two masses are highly correlated and lie along the diagonal centered around the truth value of m t . This shape motivates the form of the fitting function we are proposing that can be used to build a template for extracting the top mass in the semi-leptonic channel.

Footnote 9: Note that for our one-dimensional implementation of the template, once the other parameters (p z,ν and m W,reco ) are fixed, the value of m t,reco which maximizes the likelihood can be obtained analytically by simply taking the larger of the invariant masses of p lep and p had . However, this does not work for the more complicated three-dimensional template implemented by ATLAS, so we use the numerical maximizer in our study here to best mimic their approach.
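A minimal sketch of how such a two-dimensional template could be evaluated is given below. Several choices are assumptions made for illustration: the rotation that defines x and y, the use of the Moyal function as a closed-form stand-in for the Landau function of Eq. (2.2), and all function names.

```python
import math


def gaussian(x, mean, sigma):
    """Normalized Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def moyal(x, loc, scale):
    """Moyal density, used here as an assumed stand-in for the Landau function."""
    z = (x - loc) / scale
    if z < -30.0:
        # exp(-z) would overflow; the density is numerically zero here.
        return 0.0
    return math.exp(-0.5 * (z + math.exp(-z))) / (scale * math.sqrt(2.0 * math.pi))


def diagonal_coordinates(m_had, m_lep):
    """Distance along (x) and away from (y) the diagonal m_had = m_lep."""
    x = (m_had + m_lep) / math.sqrt(2.0)
    y = (m_had - m_lep) / math.sqrt(2.0)
    return x, y


def template_2d(m_had, m_lep, a, b, chi1, sigma1, mu, c, chi2, sigma2):
    """[a * Gaussian(x) + b * Landau-like(x)] * Gaussian(y)."""
    x, y = diagonal_coordinates(m_had, m_lep)
    along = a * gaussian(x, chi1, sigma1) + b * moyal(x, mu, c)
    return along * gaussian(y, chi2, sigma2)
```

With χ 2 = 0 the template is symmetric under the exchange of the hadronic and leptonic masses, matching the symmetry about the diagonal noted in the text.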
As in our toy model, we perform a closure test to validate the proposal of extracting the top mass from the two-dimensional template. Our new approach faithfully extracts the correct mass, and comes with a relatively small statistical uncertainty of ∼ 0.1 GeV. We additionally checked that varying the linear fit of the templates up and down by an amount determined by the covariances led to an uncertainty of similar size; see Sec. 2.1 for a discussion of this test. These uncertainties due to the template method are subdominant to the JES uncertainties provided by ATLAS in [12].
With this modified procedure in hand, we are now ready to assess the impact of stop contamination on the top mass measurement. As we will emphasize below, our results in the semi-leptonic channel use the two-dimensional template method discussed here. As such, the results in the semi-leptonic channel are not a recasting, but can instead be interpreted as an estimate for how much contamination one could expect for this final state.

The Impact of Light Stops
Now that we have explored the template method as it is used by ATLAS to extract the Monte Carlo mass of the top quark (along with our modified approach in the semi-leptonic channel), we will turn to the impact of stop contamination on the template mass extraction. This is important since attempts by ATLAS [46,47,68] to exclude the stealth stop region of parameter space utilizing properties of high purity tt samples assume the top mass is measured in an orthogonal channel. As we will show in this section, light stops can bias the extracted top mass by up to 2 GeV. This implies that any limit which claims to exclude stealth stops using aspects of the top pair kinematics must simultaneously account for the impact on the top mass measurement. While this may not seem like a major issue at first glance, we emphasize that the leading order cross section prediction for tt production at √s = 8 TeV drops from 160 pb for m t = 172 GeV to 150 pb for m t = 174 GeV. Since the production cross section for stealth stops is ∼ 10 pb, this could easily impact the boundaries of exclusion regions. This sensitivity to m t has been demonstrated by ATLAS in Ref. [49], where observed limits on the stop mass drop from around 180 GeV to 160 GeV if the top mass is changed from 172.5 to 175 GeV. Here, we will focus on demonstrating the quantitative impact of this contamination; assessing how this alters limits is left for future work.
For concreteness, we will work with the stop-neutralino Simplified Model framework. Given a choice of top mass, we then generate a suite of events for different values of the stop mass (and for two benchmark choices of the neutralino mass), including the full effects of the off-shell propagators following the procedure detailed in Ref. [42]; more details regarding the event generation can also be found in App. A.2. In particular, this approach self consistently computes the width of the stop and the top quark as the parameter space is varied. The pair production of stops is determined by its QCD interactions, and its subsequent decay t 1 → t (*) χ 0 1 yields a (potentially off-shell) top quark and missing energy. Intuitively, the biggest impact on the top mass measurement will occur in the parameter space where the top that results from the stop decay is off-shell, since the reconstructed "top" in such events will have a "mass" that is smaller than m t . We will see exactly this behavior in the quantitative results that follow. To get a sense of the impact that stealth stops can yield, Fig. 10 shows the shape of the potential stop contribution to the observable O used to generate the template for each channel from top pair production with m t = 172.5 GeV (blue solid), stop pair production (orange solid) with m(t 1 ) = 164 GeV, m(χ 0 1 ) = 1 GeV, and the combined distribution (green solid). Note that in the semi-leptonic channel, we use the two-dimensional observable introduced in Sec. 3.5, but plot the one-dimensional slice along the m diag t ≡ m had t = m lep t diagonal. Each of these distributions is normalized using the production cross section times the efficiency to pass the relevant pre-selection cuts, assuming an integrated luminosity of L = 20.2 fb −1 . While the stop contribution is clearly subdominant, it peaks at a slightly lower value in each observable than tt.
This has the effect of biasing the combined sample such that the extracted Monte Carlo top mass that best fits the combined distribution is lower than the true value of m t .
The results of our study for all three channels are presented in two different ways: the first representation is provided in Fig. 11, and the second is in Fig. 12 (footnote 10). The colored horizontal lines in Fig. 11 correspond to pure tt samples. The dotted diagonal line shows the kinematic boundary where m(t 1 ) = m(χ 0 1 ) + m t . Left of this dotted kinematic boundary, the tops are off-shell so the black lines are above the horizontal benchmark lines. This implies that the truth-level top mass (shown on the y-axis) is larger than the reconstructed value when using a SM only template. As the stops are taken to be heavier and cross the dotted line, two effects become important: the stops decay to on-shell tops, removing the off-shell effects, and the stop production cross section decreases. Together these explain why the results asymptote to the pure SM in this limit.
The top row of Fig. 11 shows the results for the all-hadronic channel (for more details on the recast procedure for this channel see App. B). For tt production, this channel does not result in any intrinsic missing energy, and the preselection cuts do not make any requirements on E T / . This implies that the distribution utilized for the all-hadronic channel is less sensitive to the presence of the additional E T / due to the final state neutralinos. Therefore, this channel is relatively insensitive to stop contamination; off-shell effects (left of the dotted line in the top left panel of Fig. 11) yield the dominant impact on the top mass extraction.
The middle row of Fig. 11 shows the results for the semi-leptonic channel. Due to our issues validating this channel as discussed in Sec. 3 above, we have performed this analysis using our proposed 2D template approach. As with the all-hadronic case, when the stops are lighter (to the left of the blue dashed line) they decay through an off-shell top quark, which biases the templates to extract lower masses. However, there is an additional important effect, which makes the results in this channel even more striking. The SM contribution contains a neutrino, and so the preselection cuts explicitly rely on E T / . Furthermore, the likelihood procedure utilized for addressing the combinatoric background and the missing z-component of the neutrino momentum assumes that the measured E T / corresponds to the transverse components of the neutrino momentum. This implies that the neutralinos in the final state will have a non-trivial impact on the shape of the observable used for the template method. From the figures, it is clear that the impact of stealth stop contamination on the semi-leptonic channel is more dramatic than in the all-hadronic channel, yielding a bias as large as ∼ 2 GeV. For the largest stop masses, the reconstructed top mass over-shoots the true value due to the effect of the neutralinos on the observable, but eventually asymptotes to the SM-only true value.
The di-leptonic channel is shown in the bottom row of Fig. 11. In this case, the SM final state contains two neutrinos, and so the preselection includes a cut on E T / . As opposed to the semi-leptonic case, the observable used in this channel is simply the invariant mass of the lepton and b-jet pairs, m ℓb , and so the distribution should not be impacted by new sources of E T / . However, m ℓb is not fixed by the mass of a parent particle, and so the resulting distribution is more sensitive to details such as the spin of the top quarks and the kinematics of the top pairs. This explains why the bias in the reconstructed top mass for this channel is the most dramatic of the three, including the fact that the result asymptotes to the SM value even more slowly as the stop mass increases.

Now that we have a sense of how large the bias from stealth stop contamination can be, Fig. 12 illustrates the consistency of the BSM parameter space with the observations performed by ATLAS. In this figure, the axes have been rotated with respect to Fig. 11, and we plot the truth-level Monte Carlo top mass used to generate events along the horizontal axis, while the input stop mass is on the vertical axis. For each point in the truth parameter space, we extract the reconstructed top mass. The black line at the center of the bands denotes the parameters that yield a reconstructed top mass which is equal to the value observed by ATLAS in each channel assuming the SM alone, while the green (yellow) bands are the 1-σ (2-σ) uncertainties taken directly from the ATLAS papers [11][12][13]. This allows one to visualize the non-trivial shapes that result from stealth stop contamination in each channel, and provides some insight into what parameter choices could yield the best consistency.
To quantitatively explore the consistency between the three channels, we performed a naive combination of these channels in the BSM parameter space; the methodology is described in App. E and the results are provided in the middle and right panels of Fig. 1. At each point in the m t -m(t 1 ) plane (keeping m(χ 0 1 ) fixed), we compute the χ 2 for the extracted template mass in each channel as compared to the observations. The orange stars in the middle and right panels of Fig. 1 show the best-fit point, and the shaded area shows the 1-σ region. For the model with m(χ 0 1 ) = 1 GeV, the best fit point is found to be m t = 174.0 GeV and m(t 1 ) = 160 GeV (footnote 11), and when m(χ 0 1 ) = 20 GeV, the best fit point is at m t = 173.7 GeV and m(t 1 ) = 162.0 GeV. Both panels show that the data fit best using lighter stops; the 1-σ uncertainty band for the right panel does not even extend to the top of the panel. In addition, the entire region results in masses larger than the SM-only assumption. If there are light stops, we may not know the mass of the top quark as accurately as we think we do.
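The structure of such a naive combination can be sketched as follows. This is an illustration only: it assumes an uncorrelated sum of per-channel χ 2 terms, which may differ from the methodology of App. E, and the data layout and function names are hypothetical.

```python
def chi2_point(extracted, observed):
    """Sum of per-channel chi^2 terms, treating the channels as uncorrelated.

    extracted: {channel: extracted template mass at this parameter point}
    observed:  {channel: (observed mass, uncertainty)}
    """
    return sum(((extracted[ch] - mass) / sigma) ** 2
               for ch, (mass, sigma) in observed.items())


def best_fit(grid, observed):
    """Return the (m_t, m_stop) grid point with the smallest combined chi^2.

    grid: {(m_t, m_stop): {channel: extracted template mass}}
    """
    return min(grid, key=lambda point: chi2_point(grid[point], observed))
```

Scanning `chi2_point` over the grid also gives the 1-σ region, for example as the set of points whose χ 2 lies within a fixed threshold of the minimum.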
As an amusement, we note that the all-hadronic and semi-leptonic uncertainty bands only slightly overlap in the SM alone assumption. With that, it may be possible that light stops could improve the consistency of the experimental results. In order to naively explore the extent to which the BSM model is a better fit than the SM alone, we compute the test statistic defined in Eq. (E.1) and the result is presented in Fig. 13. While there is no particular overall improvement in the fit for m(χ 0 1 ) = 1 GeV, the heavier choice m(χ 0 1 ) = 20 GeV does have a very mild preference for the BSM scenario. Although we simply take this to be a coincidence given the current state of the top mass measurement, it does demonstrate that if the top mass measurements became discrepant between the different channels, light stops could bias the mass measurements enough to provide a resolution.

Conclusions
In this work, we have investigated the stealth stop contamination of the tt sample that can potentially bias the measurements of the top mass at ATLAS by up to 2 GeV. Three decay channels are studied in detail: all-hadronic, di-leptonic, and semi-leptonic. The top mass measurement in the all-hadronic channel is the least sensitive to stop contamination, while the di-leptonic channel is the most sensitive. The combination of results suggests that the heavy neutralino case is slightly favored in terms of overall consistency among the three channels. Furthermore, we have examined the logic behind the ATLAS semi-leptonic top mass measurement and proposed a modified method to better address the particular systematics of this channel.
In conclusion, O(1 GeV) shifts in the top mass measurement due to stop contamination are possible and can have O(10 GeV) impacts on the stealth stop exclusion limits [49]. Thus, we advocate that the LHC experiments perform a detailed analysis of the full three-dimensional Simplified Model parameter space spanned by m(t 1 )-m t -m(χ 0 1 ) in order to make a definitive statement on the potential existence of stealth stops.

... Energy under grant numbers DE-SC0018191 and DE-SC0011640. This work utilized the University of Oregon Talapas high-performance computing cluster.

A.1 Top Event Generation
The 8 TeV tt sample is generated at the parton level using MadGraph5_aMC@NLO 2.6.1 [61], and is passed to Pythia 8.2 [62] for showering and hadronization. Detector effects are approximated using Delphes 3.4.1 [63], which relies on FastJet [69,70] to cluster the jets with the anti-k t algorithm [64]. We use the default Delphes ATLAS card, except that the b-tagging efficiency is set to 0.57 for the all-hadronic channel and 0.7 for the other two channels, in accordance with ATLAS [11][12][13]. We generated 5 million events for each of 5 top masses: 167.5 GeV, 170 GeV, 172.5 GeV, 175 GeV, and 177.5 GeV.

A.2 Stop Event Generation
We work with a stop-neutralino Simplified Model, where the stop has the couplings appropriate for being right-handed. To cover the stealth stop region, events are generated for two choices of m(χ 0 1 ): 1 GeV and 20 GeV, and for a range of stop masses: m(t 1 ) from 160 GeV to 180 GeV in steps of 2 GeV, and m(t 1 ) from 180 GeV to 200 GeV in steps of 5 GeV. At each parameter point, we use MadGraph5_aMC@NLO to calculate the stop decay width. One must be very careful to account for all finite width effects during the generation of events when the top can be off-shell, see [42] for a detailed discussion. To this end, we ensure that the top and W widths are defined consistently for the decay and production in MadGraph5_aMC@NLO. Given the appropriate widths, we again use MadGraph5_aMC@NLO to calculate the matrix elements and generate 500,000 events for stop production and subsequent decay to each final state. We emphasize that this approach does not require any particle to appear on-shell, and keeps track of all spin correlations and finite width effects.
To mix the stop and top samples so that we can investigate the impact of the stop contamination, we weight the events from the two samples according to their leading order cross sections, appropriately normalized by the total number of events generated. The stop production cross section is O(10%) of the top cross section when the stop mass is within the range we scan.
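The weighting can be made concrete with a short sketch. The function names are hypothetical, and the numbers in the usage note are the round values quoted in the main text, used here purely for illustration.

```python
def event_weight(xsec_pb, lumi_fb_inv, n_generated):
    """Per-event weight: expected events (cross section x luminosity) per generated event."""
    expected_events = xsec_pb * 1000.0 * lumi_fb_inv  # 1 pb = 1000 fb
    return expected_events / n_generated


def mix_histograms(h_top, h_stop, w_top, w_stop):
    """Bin-by-bin weighted sum of the top and stop histograms (raw counts)."""
    return [w_top * t + w_stop * s for t, s in zip(h_top, h_stop)]
```

For example, a 160 pb cross section with L = 20.2 fb −1 and 5 million generated events gives a weight of about 0.65 per event, while a 10 pb cross section with 500,000 generated events gives about 0.40. Because the histograms are filled after the pre-selection cuts, the selection efficiency of each sample is accounted for automatically.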

B The All-hadronic Channel
In the all-hadronic channel, the final state is characterized by two b-jets and four light-flavor jets. While this channel has the largest branching ratio (45.7%) of the three final states, it suffers from a large QCD multi-jet background and from large uncertainties in the JES. This channel is the most challenging to measure, which explains why it has the largest error bar.

B.1 Pre-selection Cuts
The following preselection cuts are required before applying the template procedures. Events with isolated e/µ are excluded. At least 6 jets with p T > 25 GeV and |η| < 2.5 are required, and at least 5 of these jets must have p T > 60 GeV. For any pair of jets, an isolation requirement is applied such that ∆R(j i , j k ) > 0.6, where ∆R is the angular distance between two objects. An event must contain at least 2 b-tagged jets, with an azimuthal separation of ∆φ(b i , b j ) > 1.5. To remove events with neutrinos, a missing transverse energy cut of E T / < 60 GeV is applied.

The all-hadronic channel has a large combinatoric background, due to the homogeneity of the final state. To associate the jets with a particular top decay, a minimum χ 2 approach is utilized. One keeps the permutation that gives the lowest χ 2 among all possible permutations of jets in an event. In the definition of the χ 2 , m MC W is taken to be 81.18 ± 0.04 GeV and the widths σ ∆m bjj and σ m MC W are taken from [11]: σ ∆m bjj = 21.60 ± 0.16 GeV and σ m MC W = 7.89 ± 0.05 GeV. The preselection then requires χ 2 < 11. Finally, a cut is applied to the azimuthal angle between b-jets and their associated W boson: the average of the two angular separations between the b and the W for each event must satisfy ∆φ(b, W ) < 2. For validation, we present Fig. 14, which gives distributions for the three and two jet invariant masses, after the pre-selection cuts are applied.
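For orientation, a χ 2 of the following form is consistent with the quantities named above. The explicit structure and the jet labels are assumptions made for illustration; the definitive expression is the one given in [11].

```latex
% Assumed structure: b_{1,2} are the b-tagged jets and (j_1 j_2), (j_3 j_4)
% are the light-jet pairs assigned to the two W candidates.
\chi^2 =
\frac{\left(m_{b_1 j_1 j_2} - m_{b_2 j_3 j_4}\right)^2}{\sigma_{\Delta m_{bjj}}^2}
+ \frac{\left(m_{j_1 j_2} - m_W^{\mathrm{MC}}\right)^2}{\sigma_{m_W^{\mathrm{MC}}}^2}
+ \frac{\left(m_{j_3 j_4} - m_W^{\mathrm{MC}}\right)^2}{\sigma_{m_W^{\mathrm{MC}}}^2} .
```

The first term favors permutations in which the two top candidates have similar masses, while the remaining two terms pull each di-jet mass towards m MC W .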

B.2 R 32 Templates
The observable R 32 is defined as the ratio of the three-jet mass to the di-jet mass, R 32 = m jjj /m jj , where the three-jet system is a proxy for the top decay and the di-jet system is associated with the W. In the parametric function used to fit the R 32 distribution, x p is the peak, σ E is the width, and η is the asymmetry tail factor. As described in Sec. 2.1 above, we account for the statistical uncertainties associated with having a finite data set by bootstrapping 100 samples, each taken to be 3/5 of the full data. We then repeat the R 32 fit for each of these datasets, and use these as input to generate a template as a function of m t . We then test that this procedure closes, which gives the histogram plotted in Fig. 15. Fitting these distributions to a Gaussian gives a quantitative measure of the quality of the closure in the form of the mean and standard deviation given in each panel.

C The Di-leptonic Channel
In the di-leptonic channel, each of the W bosons decays into a charged lepton and a neutrino. The final state is characterized by two b-jets, two leptons (e or µ), and E T / . One advantage of this channel is that the background is relatively low, especially in the eµ final state where there is no contribution from Z boson decays. Some drawbacks are that the branching ratio is only 10.5%, and that it is not possible to reconstruct the top mass directly since there are two neutrinos that contribute to the E T / .

C.1 Pre-selection Cuts
The physics object definitions are as follows. Electron candidates are required to have transverse momentum p T > 25 GeV and pseudorapidity |η| < 2.47, excluding the range 1.37 < |η| < 1.52. Muon candidates must satisfy p T > 25 GeV and |η| < 2.5. Muons must additionally satisfy an isolation requirement: muons within a ∆R = 0.4 cone about the axis of a jet that has p T > 25 GeV are not considered. Jets must satisfy an isolation requirement: events with jets that lie within a ∆R = 0.2 cone about the axis of an electron candidate are removed. Then, an electron isolation requirement discards events where electrons are found within a ∆R = 0.4 cone about any of the remaining jets. The b-tagging efficiency is set to 0.7, and rejection factors of 5 and 137 are taken for c and light-flavor quarks, respectively.

Having defined our objects, we now walk through the pre-selection requirements. Events are required to have a signal from the single-electron or single-muon trigger and at least one primary vertex with at least five associated tracks (we assume the trigger efficiency is 100% for events that have an isolated electron or muon). An event must have exactly two oppositely charged leptons, where at least one of them must match the object that fired the corresponding trigger. In the same lepton flavor channels, missing transverse energy E T miss > 60 GeV is required. The invariant mass of the lepton pair must satisfy m ℓℓ > 15 GeV, excluding a window within 10 GeV of the Z boson mass. In the different lepton flavor channels, the scalar sum of p T of the two selected leptons and all jets is required to be larger than 130 GeV. There must be at least two valid jets, and at least one of these jets must be b-tagged. Finally, a cut of p T,ℓb > 120 GeV is required. 12

Figure 17: Two validation plots are provided for the di-leptonic channel. The p T,ℓb distribution is given in the left panel, and the b-jet p T distribution is given on the right, assuming m t = 172.5 GeV. The blue distributions correspond to our simulation and the orange dots represent the ATLAS results [13], which have been rescaled so that the normalizations agree.
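The event-level requirements above can be summarized in a short sketch. This is only an illustration under stated assumptions, not the code used for the analysis: the event is a hypothetical dictionary whose keys (`leptons`, `jets`, `met`, `m_ll`, `pt_lb`) are invented here, the object-level isolation, trigger matching, and vertex requirements are assumed to have been applied upstream, and the dilepton mass cuts are assumed to apply only to the same-flavor channels.

```python
Z_MASS = 91.19  # GeV; approximate Z boson mass, an assumed input for this sketch


def passes_dilepton_preselection(event):
    """Apply the event-level di-leptonic cuts to a hypothetical event dict.

    Assumed layout (not from the paper):
      event["leptons"]: list of dicts with "flavor" ("e" or "mu"), "charge", "pt"
      event["jets"]:    list of dicts with "pt" and "btag" (bool)
      event["met"], event["m_ll"], event["pt_lb"]: floats in GeV
    """
    leptons = event["leptons"]
    jets = event["jets"]

    # Exactly two oppositely charged leptons.
    if len(leptons) != 2:
        return False
    if leptons[0]["charge"] * leptons[1]["charge"] >= 0:
        return False

    if leptons[0]["flavor"] == leptons[1]["flavor"]:
        # Same-flavor channels: missing energy cut, low-mass cut, and Z veto.
        if event["met"] <= 60.0:
            return False
        if event["m_ll"] <= 15.0 or abs(event["m_ll"] - Z_MASS) < 10.0:
            return False
    else:
        # Different-flavor channel: scalar sum of lepton and jet pT.
        ht = sum(l["pt"] for l in leptons) + sum(j["pt"] for j in jets)
        if ht <= 130.0:
            return False

    # At least two jets, at least one of them b-tagged.
    if len(jets) < 2 or not any(j["btag"] for j in jets):
        return False

    # Final cut of 120 GeV on the lepton-b-jet transverse momentum variable.
    return event["pt_lb"] > 120.0
```

Ordering the cheap multiplicity and charge checks first means the kinematic quantities are only read for events that could still pass.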

C.2 m ℓb Templates
In Fig. 17, we show two distributions computed using our samples after the pre-selection cuts have been applied, along with comparisons to those given by ATLAS. The observable m ℓb is used to generate templates, where the parametric fit is now chosen to be a Landau function, as defined in Eq. (2.2), and a Gaussian. As described in Sec. 2.1 above, we account for the statistical uncertainties associated with having a finite data set by bootstrapping 100 samples, each of which contains 3/5 of the full data. We then repeat the m ℓb fit for each of these datasets, and use the results as input to generate a template as a function of m t . We then test that this procedure closes by fitting the resulting histograms and comparing the fitted mean to the input truth value of m t , as shown in Fig. 18.

D More on the Semi-leptonic Channel
This section provides extra validation information for the semi-leptonic channel. In Sec. 3.1, we summarized the pre-selection cuts for the ATLAS semi-leptonic analysis. The detailed cutflow is given in Table 4.
To make the templates, we generate parton-level events with MadGraph5_aMC@NLO that are subsequently passed to Pythia8 for showering and hadronization, and then to Delphes to model detector effects. Five million events are generated for each of five choices for the top mass; the number of events that pass the pre-selection cuts is given in Table 4. For each event, we use the likelihood defined in

Table 4: Cutflow table for the semi-leptonic sample with m t = 172.5 GeV, separated by the identity of the lepton. The initial number of events is determined using the ATLAS integrated luminosity L = 20.2 fb −1 and the production cross section as calculated using MadGraph5_aMC@NLO.

In order to follow the procedure discussed in Sec. 2, we would like to have a set of ∼ 100 statistically independent samples to work with. However, it is computationally too expensive to re-generate the 5 million events many times. Therefore, we circumvent this issue using the statistical bootstrap, see e.g. [71]. Specifically, we randomly draw 3/5 of the 5 million events 100 times, allowing for replacement such that some events can be drawn more than once. We then find the best fit of our parametric function to each bootstrapped data set, providing us with an ensemble of 100 sets of best-fit parameters. The results are shown in Fig. 19, where the points and error bars show the mean and standard deviation, respectively, of the best-fit value for each parameter at each of the five top mass choices. Finally, we fit a line to each of these parameters as a function of the top mass, which is the input needed to define our two-dimensional template as a function of a single top mass parameter.
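The bootstrap and the subsequent linear parameterization can be sketched as follows. This is a minimal illustration rather than the analysis code: `fit_func` is a hypothetical stand-in for the parametric template fit and is assumed to return an array of best-fit parameters, and the line is fit without weights, which is an assumption since the text does not specify whether the error bars enter the fit.

```python
import numpy as np


def bootstrap_fit_parameters(events, fit_func, n_boot=100, fraction=0.6, seed=0):
    """Mean and standard deviation of fit parameters over bootstrap resamples.

    events:   1D array of per-event observables (or an array of event rows)
    fit_func: hypothetical callable mapping a resampled array to an array of
              best-fit parameters (stands in for the parametric fit)
    fraction: size of each resample relative to the full sample (3/5 here)
    """
    events = np.asarray(events)
    rng = np.random.default_rng(seed)
    n_draw = int(fraction * len(events))
    fits = []
    for _ in range(n_boot):
        # Draw indices with replacement, so events may appear more than once.
        idx = rng.integers(0, len(events), size=n_draw)
        fits.append(fit_func(events[idx]))
    fits = np.asarray(fits, dtype=float)
    return fits.mean(axis=0), fits.std(axis=0, ddof=1)


def fit_line_vs_mass(masses, param_means):
    """Unweighted straight-line fit of one template parameter versus top mass."""
    slope, intercept = np.polyfit(masses, param_means, deg=1)
    return slope, intercept
```

Calling `bootstrap_fit_parameters` once per generated top mass and passing the resulting means, one parameter at a time, to `fit_line_vs_mass` gives the slope and intercept that tie each template parameter to a single top mass value.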
The closure test is performed by taking another independent bootstrapped sample of 3 million events and fitting this data to our template to derive a best-fit value of m t . This is repeated 100 times for each of the five truth top mass choices. The results are shown in Fig. 20: the closure test extracts the correct mass with a relatively small statistical uncertainty of ∼ 0.1 GeV.

E Combining Measurements
Since we are interested in the global impact of stop contamination, it is useful to develop a simple framework for combining the measurements made in multiple independent channels. In particular, the results in Fig. 1 of the main text show the parameters that best fit the mass measurements when combining the three channels. Under the SM-only assumption (left panel of Fig. 1), the top mass that minimizes the χ 2 is m comb t = 172.83 GeV, and the uncertainty band is determined by finding the contour where the χ 2 is larger than the minimum by 1.0. We note that both the all-hadronic and the semi-leptonic central values lie outside this best-fit uncertainty band. The regions shown in the BSM parameter space are computed in a similar fashion, see the middle and right panels of Fig. 1.
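As an illustration of the SM-only combination just described, the following sketch scans a χ 2 built from independent measurements and reads off the band where it exceeds its minimum by 1.0. It assumes uncorrelated Gaussian uncertainties, which is our simplification for this example, and the function name and grid choice are hypothetical; the inputs should be supplied by the user and no measured values are hard-coded.

```python
import numpy as np


def combine_chi2(values, errors, n_grid=20001):
    """Scan chi^2(m) = sum_i ((m_i - m) / sigma_i)^2 over a mass grid.

    Assumes independent channels with Gaussian uncertainties.
    Returns (best_fit, lower_edge, upper_edge) of the Delta chi^2 <= 1 band.
    """
    values = np.asarray(values, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # Grid wide enough to contain the minimum and the Delta chi^2 = 1 band.
    pad = 5.0 * errors.max()
    grid = np.linspace(values.min() - pad, values.max() + pad, n_grid)

    # Vectorized chi^2: rows are grid points, columns are channels.
    chi2 = (((values[None, :] - grid[:, None]) / errors[None, :]) ** 2).sum(axis=1)

    best = grid[np.argmin(chi2)]
    inside = grid[chi2 <= chi2.min() + 1.0]
    return best, inside.min(), inside.max()
```

For uncorrelated Gaussian inputs the scan reproduces the inverse-variance weighted mean, with a half-width of the band equal to the inverse square root of the summed weights; the grid scan is kept because it carries over directly to cases where the χ 2 is not a simple parabola.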
These regions only show the best-fit, and in particular they do not tell us how good the fit is. Therefore, it is amusing to ask if light stops can actually improve the fit to distributions measured by ATLAS. To perform a quantitative test, we compute the likelihood ratio for observing the measured values in the three channels in the  is the value predicted in the BSM model for a given Monte Carlo m t , m(t 1 ), and m(χ 0 1 ). This test statistic is constructed so that when λ comb > 0 the SM is a better fit, while when λ comb < 0 the BSM scenario is preferred.
The values of λ comb computed in the m(t 1 ) vs. m t plane are shown in Fig. 13. The red regions correspond to values of λ comb > 0, indicating that the SM alone provides a better fit to the data. We gray out any parameter space with λ comb > 1 for simplicity, since this region has a much stronger preference for the SM alone (and is of course additionally constrained by direct searches for stops). The white regions have a similar fit between the models, giving λ comb = 0. Intriguingly, we find a small region in the m(χ 0 1 ) = 20 GeV panel with λ comb < 0. However, our analysis finds that this parameter point is a mere 0.1 σ more consistent with the data than the SM alone. We do not take this to be evidence for a BSM contribution to the top mass measurements.