Searching for supersymmetry scalelessly

In this paper we propose a scale invariant search strategy for hadronic top or bottom plus missing energy final states. We present a method which shows flat efficiencies and background rejection factors over broad ranges of parameters and masses. The resulting search can easily be recast into a limit on alternative models. We show the strength of the method in a natural SUSY setup where stop and sbottom squarks are pair produced and decay into hadronically decaying top quarks or bottom quarks and higgsinos.

The interpretation of a new physics search requires a model hypothesis against which a measurement can be tested. Lacking evidence for superpartners and clear guidance from theory, apart from naturalness considerations, it makes sense to employ search strategies that make as few assumptions on the model as possible. For most reconstruction strategies a trade-off has to be made between achieving a good statistical significance in separating signal from backgrounds and the applicability to large regions in the model's parameter space. Hence, experimental searches are in general tailored to achieving the best sensitivity possible for a specific particle or decay, leaving other degrees of freedom of interest unconsidered. This approach can lead to poor performance for complex models with many physical states and couplings, e.g. the MSSM and its extensions. Hence, a reconstruction that retains sensitivity over wide regions of the phase space, thereby allowing one to probe large parameter regions of complex UV models, is crucial during the current and upcoming LHC runs. However, since the number of possible realizations of high-scale models exceeds the number of analyses available at the LHC, tools like ATOM [56][57][58], CheckMate [59], MadAnalysis [60] or FastLim [58] and SModels [61] have been developed in recent years to recast existing limits on searches for new physics. A method that shows a flat reconstruction efficiency despite kinematic edges and population of exclusive phase space regions would be particularly powerful to set limits on complex models allowing for broad parameter scans.
a e-mail: Matthias.Schlaffer@weizmann.ac.il
In this paper we develop a reconstruction strategy for third-generation squarks that accumulates sensitivity from a wide range of different phase space regions and for a variety of signal processes.¹ The proposed reconstruction is therefore a first step toward a general interpretation of data, i.e. recasting. As a proof of concept, we study stop and sbottom pair production, followed by a direct decay into a hadronically decaying top or a bottom and a neutralino or chargino; see Fig. 1. In our analysis we use simplified topologies, including only sbottoms and stops as intermediate SUSY particles, and we focus on jets and missing energy as the final state signal. While this example might be oversimplifying, e.g. it might not capture long decay chains that arise if the mass spectrum is more elaborate, this setup is motivated by naturalness [64][65][66], i.e. it resembles minimal, natural spectra with light stops, higgsinos, and gluinos where all other SUSY particles are decoupled [67].
¹ We significantly expand on the proposals of [62,63]. In [62] a flat reconstruction efficiency was achieved over a wide range of the phase space for, however, only one process, $pp \to HH \to 4b$, and only a one-step resonance decay, i.e. $H \to b\bar b$. The authors of [63] showed that in scenarios with fermionic top partners fairly complex decay chains can be reconstructed including boosted and unboosted top quarks and electroweak gauge bosons. While several production modes were studied that can lead to the same final state, only one final state configuration was reconstructed. Focusing on a supersymmetric cascade decay, we will show that a flat reconstruction efficiency can be achieved over a wider region of the phase space, for a variety of production mechanisms and final state configurations.
Fig. 1 Generic Feynman diagram for stop production and decay. The initially produced squarks in our setup can be either of the two stops or the lighter sbottom. They subsequently decay into a top or a bottom and a higgsino, such that the electric charge is conserved.
To be more specific, within this simplified setup the shape of the event depends strongly on the mass difference between the initially produced squarks and the nearly mass-degenerate higgsinos in comparison to the top quark mass, $Q = (m_{\tilde q} - m_{\tilde h})/m_t$. We can identify three regions in the physical parameter space leading to distinctly different topologies.
1. $Q < 1$: The only accessible two-body decay of the produced squark is the decay into a bottom quark and a charged higgsino. Possible three-body decays contribute only in small areas of the parameter space.
2. $Q \gtrsim 1$: The decay into a top quark and a higgsino can become the main decay channel for the produced squark, depending on the squark and the parameter point. The top quark of this decay will get no boost or only a small boost from the decay, and thus its decay products might not be captured by a single fat jet. However, the intermediate W boson can lead to a two-prong fat jet that can be identified by the BDRSTagger [68].
3. $Q \gg 1$: If the squark decays into a top quark, the latter will be very boosted and its decay products can no longer be resolved by ordinary jets. Yet they can be captured by one fat jet and subsequently identified as a decaying top by the HEPTopTagger [69,70]. The HEPTopTagger was designed to reconstruct mildly to highly boosted top quarks in final states with many jets, as anticipated in the processes at hand. However, other taggers with good reconstruction efficiencies and low fake rates in the kinematic region of $Q \gg 1$ (see e.g. [71][72][73] and references therein) can give similar results.
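The three regions can be illustrated with a minimal Python sketch. The top mass value and the numerical boundary between the second and third region (`boosted_threshold`) are assumptions made here for illustration only; the text does not quantify where the boosted regime starts.

```python
TOP_MASS_GEV = 173.0  # assumed top-quark mass, not quoted in the text


def q_ratio(m_squark, m_higgsino, m_top=TOP_MASS_GEV):
    """Q = (m_squark - m_higgsino) / m_top, with all masses in GeV."""
    return (m_squark - m_higgsino) / m_top


def topology(q, boosted_threshold=3.0):
    """Map Q onto the three regions described above.

    boosted_threshold is a hypothetical stand-in for 'Q >> 1'.
    """
    if q < 1.0:
        return "bottom + charged higgsino"
    if q < boosted_threshold:
        return "unboosted top (W fat jet + b)"
    return "boosted top (single fat jet)"
```

For example, a 500 GeV squark with 300 GeV higgsinos gives Q of about 1.2 and falls into the second region under these assumptions.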
Because the value of Q is unknown and the event topology crucially depends on it, a generic reconstruction algorithm that is insensitive to details of the model needs to be scale invariant, i.e. independent of Q. Hence, it needs to be able to reconstruct individual particles from the unboosted to the very boosted regime.
Apart from scanning a large region of the parameter space, such an analysis has the advantage that it captures the final state particles from the three possible intermediate squark states $\tilde t_{1,2}$ and $\tilde b_1$, even if they have different masses. Therefore the effective signal cross section is increased compared to a search strategy which is only sensitive to specific processes and within a narrow mass range. In order to preserve this advantage we furthermore apply only cuts on variables that are independent of Q or the mass of one of the involved particles.
The remainder of this paper is organized as follows. In Sect. 2 we give details of the parameter space that we target and the signal and background event generation. Section 3 contains a thorough description of the reconstruction of the top quark candidates and the proposed cuts as well as the results of the analysis. We conclude in Sect. 4.

Signal sample and parameter space
As explained in the previous section, not only $\tilde t_1$ but also $\tilde t_2$ and $\tilde b_1$ production contributes to the signal. For all three production channels we consider the decay into a higgsino, $\tilde\chi^\pm_1$ or $\tilde\chi^0_{1,2}$, and a top or bottom quark. Since in our simplified topology setup we assume the higgsinos to be mass degenerate, we generate only the decay into the lightest neutralino, $\tilde q \to q + \tilde\chi^0_1$, and into the chargino, $\tilde q \to q' + \tilde\chi^\pm_1$, where $q, q'$ stand for t or b. The decay of the second-lightest neutralino, $\tilde\chi^0_2 \to \tilde\chi^{0,\pm}_1 + X$, and of the chargino to one of the neutralinos, $\tilde\chi^\pm_1 \to \tilde\chi^0_{1,2} + X$, does not leave any trace in the detector since the emitted particles X will be extremely soft. Thus, the event topologies for $\tilde t_1 \to t + \tilde\chi^0_{1,2}$ will be the same, and the cross section for this topology can be obtained by rescaling with the appropriate branching ratios.
We consider the following points in the MSSM parameter space. At fixed $A_t = 200$ GeV and $\tan\beta = 10$ we scan in steps of 50 GeV over a grid defined by $\mu \le m_{Q_3}, m_{u_3} \le 1400$ GeV for the two values of μ = 150, 300 GeV. The gaugino masses as well as the other squark mass parameters are set to 5 TeV while the remaining trilinear couplings are set to zero. For each grid point we calculate the spectrum and the branching ratios with SUSY-HIT [74]. Despite the specific choices for the parameters our results will be very generic. An increased $A_t$ would enhance the mixing between the left- and right-handed stops and thus render the branching ratios of the physical states into top and bottom quarks more equal. However, since the reconstruction efficiencies for both decay channels are similar, the final results would hardly change. The change due to a different mass of the physical states can be estimated from our final results. Similarly, a different choice for μ only shifts the allowed region in the parameter space and the area where the decay into a top quark opens up but does not affect the efficiencies.
Since the squark production cross section only depends on the squark mass and the known branching ratios, we can now determine which event topologies are dominant. In the left column of Fig. 2 we show for μ = 300 GeV the relative contribution to the total SUSY cross section, defined as the sum of the squark pair production cross sections, $\sigma_{\rm SUSY} \equiv \sum_{S=\tilde t_1,\tilde t_2,\tilde b_1} \sigma_{S\bar S}$. In the right panels we show in color code the coverage, defined as the sum of these relative contributions. The larger the coverage, the more of the signal cross section can be captured by looking into these channels. Clearly, considering only the decay of both squarks to top quarks and higgsinos is not enough, as the parameter space with Q < 0 is kinematically not accessible. Moreover, in the $m_{Q_3} > m_{u_3}$ half of the space this final state misses large parts of the signal since the lighter stop decays dominantly into bottom quarks and charginos.
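The coverage defined above can be sketched as follows. This is one possible reading, not the authors' code: it assumes that each squark of a pair decays independently, so that the tt, tb and bb final states receive the weights BR_t², 2·BR_t·BR_b and BR_b². The function names and dictionary keys are hypothetical.

```python
def relative_contributions(xsec, br_top, br_bottom):
    """Fraction of the total squark pair cross section per final state.

    xsec, br_top, br_bottom are dicts keyed by squark name. Independent
    decays of the two squarks in a pair are an assumption of this sketch.
    """
    total = sum(xsec.values())
    contrib = {"tt": 0.0, "tb": 0.0, "bb": 0.0}
    for squark, sigma in xsec.items():
        bt, bb = br_top[squark], br_bottom[squark]
        contrib["tt"] += sigma * bt * bt / total
        contrib["tb"] += sigma * 2.0 * bt * bb / total
        contrib["bb"] += sigma * bb * bb / total
    return contrib


def coverage(contrib, channels):
    """Sum of the relative contributions of the considered channels."""
    return sum(contrib[c] for c in channels)
```

With branching ratios that sum to one for every squark, the coverage of all three channels is one by construction; any decay mode outside top and bottom plus higgsino, such as the three-body decay, lowers it.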
In the case where all three final states are taken into account, the coverage is nearly 100 % throughout the parameter space, except along the line $m_{\tilde t_1} \approx m_t + m_{\tilde h}$ where the top decay channel opens up. There it drops to 70-80 %, because in this narrow region the direct decay to a W boson, $\tilde t_1 \to W + b + \tilde\chi^0$, also has a significant branching ratio.
For all parameter points we generate events for each of the up to nine signal processes using MadGraph5_aMC@NLO, version 2.1.1 [75], at a center-of-mass energy of $\sqrt{s} = 13$ TeV. No cuts are applied at the generator level. The matching up to two jets is done with the MLM method in the shower-$k_T$ scheme [76,77] with PYTHIA version 6.426 [78]. We set the matching and the matrix element cutoff scale to the same value of $m_S/6$, where $m_S$ is the mass of the produced squark. We checked and found that the differential jet distributions [79] are smooth with this scale choice. The cross section for the signal processes is eventually rescaled by the NLO QCD and NLL K-factors obtained from NLL-fast, version 3.0 [80][81][82].

Background sample
In our analysis we use top tagging methods based on jet substructure techniques. We therefore focus on the decay of the squarks into a neutralino and either a bottom quark or a hadronically decaying top quark. A hadronically decaying top quark generates between one and three distinct jets, while the neutralino generates missing energy. Our final state therefore consists of missing energy and up to six jets. As background we thus consider the following four processes, all generated with MadGraph5_aMC@NLO version 2.1.1 [75] and showered with PYTHIA version 6.426 [78].
• $Wj$: $pp \to W + (2 + X)$ jets, where we merge up to four jets in the five-flavor scheme and require that the W decays into leptons (including taus), such that the neutrino accounts for the missing energy.
• $Zj$: $pp \to Z_\nu + (2 + X)$ jets, where we merge up to four jets in the five-flavor scheme and the Z decays into two neutrinos and hence generates missing energy. In both channels $Wj$ and $Zj$ we require missing transverse energy of at least 70 GeV at the generator level.
• $Zt\bar t$: $pp \to Z_\nu + t + \bar t$, where both top quarks decay hadronically, faking the top quarks from the squark decay, and the Z decays again into two neutrinos to generate missing energy. This cross section is known at NLO QCD [83] and rescaled by the corresponding K-factor.
• $t\bar t$: $pp \to t\bar t$ + jets, where one top decays hadronically and the other one leptonically to emit a neutrino, which accounts for missing energy. The NNLO+NNLL QCD K-factor is obtained from Top++ version 2.0 [84] and multiplied with the cross section.

Reconstruction
For the reconstruction of the events we use ATOM [56], based on Rivet [85]. Electrons and muons are reconstructed if their transverse momentum is greater than 10 GeV and their pseudo-rapidity is within |η| < 2.47 for electrons and |η| < 2.4 for muons. Jets for the basic reconstruction are clustered with FastJet version 3.1.0 [86] with the anti-$k_t$ algorithm [87] and a jet radius of 0.4. Only jets with $p_T > 20$ GeV and with |η| < 2.5 are kept. For the overlap removal we first reject jets that are within ΔR = 0.2 of a reconstructed electron and then all leptons that are within ΔR = 0.4 of one of the remaining jets. All constituents of the clustered jets are used as input for the re-clustering described below.
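The object selection and the two-step overlap removal can be sketched in a few lines of Python. This is an illustration only and not the ATOM/Rivet implementation; the representation of objects as dictionaries with the keys `pt`, `eta` and `phi` is an assumption of this sketch.

```python
import math


def delta_r(a, b):
    """Angular distance in the (eta, phi) plane, with phi wrapped to [0, pi]."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["eta"] - b["eta"], dphi)


def select_objects(electrons, muons, jets):
    """Apply the kinematic selection, then the overlap removal (pt in GeV)."""
    electrons = [e for e in electrons if e["pt"] > 10 and abs(e["eta"]) < 2.47]
    muons = [m for m in muons if m["pt"] > 10 and abs(m["eta"]) < 2.4]
    jets = [j for j in jets if j["pt"] > 20 and abs(j["eta"]) < 2.5]
    # Step 1: reject jets within Delta R = 0.2 of a reconstructed electron.
    jets = [j for j in jets if all(delta_r(j, e) >= 0.2 for e in electrons)]
    # Step 2: reject leptons within Delta R = 0.4 of a remaining jet.
    electrons = [e for e in electrons if all(delta_r(e, j) >= 0.4 for j in jets)]
    muons = [m for m in muons if all(delta_r(m, j) >= 0.4 for j in jets)]
    return electrons, muons, jets
```

The order of the two steps matters: a jet that coincides with an electron is dropped first, so it can no longer cause the electron itself to be removed in the second step.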
The underlying idea behind the reconstruction described in the following is to cover a large range of possible boosts of the top quark. We therefore gradually increase the cluster radius and successively employ both the HEPTopTagger and the BDRSTagger. This allows us to reduce the background significantly while maintaining a high signal efficiency.
A flowchart for the reconstruction of the top candidates with the HEPTopTagger is shown in Fig. 3. First the cluster radius is set to R = 0.5 and the constituents of the initial anti-$k_t$ jets are re-clustered with the Cambridge-Aachen (C/A) algorithm [88,89]. Then for each of the jets obtained we check if its transverse momentum is greater than 200 GeV and if the HEPTopTagger tags it as a top. In this case we save it as a candidate for a signal final state and remove its constituents from the event before moving on to the next jet. Once all jets are analyzed as described above we increase the cluster radius by 0.1 and start over again with re-clustering the remaining constituents of the event. This loop continues until we exceed the maximal clustering radius of $R_{\max} = 1.5$.
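The radius scan can be sketched as a loop in Python. The callables `cluster` and `is_top` are hypothetical placeholders for the C/A re-clustering and the HEPTopTagger decision; they are not the FastJet or HEPTopTagger interfaces. Constituents are assumed to be hashable identifiers, and the toy stand-ins at the end exist only to make the sketch runnable.

```python
def scan_top_candidates(constituents, cluster, is_top,
                        r_min=0.5, r_max=1.5, r_step=0.1, pt_min=200.0):
    """Collect tagged tops while growing the cluster radius from r_min to r_max."""
    remaining = list(constituents)
    candidates = []
    # Integer step count avoids accumulating floating-point error in the radius.
    n_steps = int(round((r_max - r_min) / r_step))
    for i in range(n_steps + 1):
        radius = r_min + i * r_step
        for jet in cluster(remaining, radius):
            if jet["pt"] > pt_min and is_top(jet):
                candidates.append(jet)
                # Remove the tagged jet's constituents from the event.
                used = set(jet["constituents"])
                remaining = [c for c in remaining if c not in used]
    return candidates


# Toy stand-ins (hypothetical): constituents 1 and 2 merge once R > 0.75.
def toy_cluster(parts, radius):
    parts = list(parts)
    if radius > 0.75 and 1 in parts and 2 in parts:
        merged = [{"pt": 250.0, "constituents": [1, 2]}]
        rest = [p for p in parts if p not in (1, 2)]
    else:
        merged, rest = [], parts
    return merged + [{"pt": 100.0, "constituents": [p]} for p in rest]


def toy_is_top(jet):
    return len(jet["constituents"]) >= 2


toy_candidates = scan_top_candidates([1, 2, 3, 4], toy_cluster, toy_is_top)
```

Because tagged constituents are removed before the next, larger radius is tried, a top that is already captured at a small radius cannot be counted again later.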
After the reconstruction with the HEPTopTagger is finished we continue the reconstruction of the top candidates with the BDRSTagger as sketched in Fig. 4. We choose our initial cluster radius R = 0.6 and cluster the remaining constituents of the event with the Cambridge-Aachen algorithm.
Since we now only expect to find W candidates with the BDRSTagger and need to combine them with a b-jet to form top candidates, the order in which we analyze the jets is no longer arbitrary. Starting with the hardest C/A jet we check if its transverse momentum exceeds 200 GeV, its invariant mass is within 10 GeV of the W mass, and the BDRSTagger recognizes a mass drop. In the case that one of the above requirements fails we proceed with the next hardest C/A jet until either one jet fulfills them or we find all jets to fail them. In the latter case we increase the cluster radius by 0.1 and repeat the C/A clustering and analyzing of jets until the radius exceeds $R_{\max} = 1.5$. Once a jet fulfills all the previous criteria we need to find a b-jet to create a top candidate. To do this we recluster the constituents of the event that are not part of the given jet with the anti-$k_T$ algorithm and a cone radius of 0.4 and pass them on to the b-tagger.² Starting with the hardest b-jet we check if the combined invariant mass of the W candidate and the b-jet is within 25 GeV of the top quark mass. If such a combination is found it is saved as a candidate and its constituents are removed from the event. The remaining constituents of the event are reclustered with the C/A algorithm and the procedure repeats. Alternatively, if all b-jets fail to produce a suitable top candidate the next C/A jet is analyzed. Once the C/A cluster radius exceeds $R_{\max}$ the remaining constituents of the event are clustered with the anti-$k_T$ algorithm with radius 0.4 and passed on to the b-tagger. Those that get b-tagged are saved as candidates of the signal final state as well.
Fig. 3 Flowchart of the top reconstruction with the HEPTopTagger
² For the b-tagger we mimic a tagger with efficiency 0.7 and rejection 50. We check if a given jet contains a bottom quark in its history and tag it as a b-jet with a probability of 70 % if this is the case and with a probability of 2 % otherwise.
Since the same jet may be sent to the b-tagger at different stages of the reconstruction process we keep the results of the b-tagger in the memory and reuse them each time it gets a previously analyzed jet. This way we avoid assigning different tagging results to the same jet.
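A minimal sketch of such a mimicked b-tagger with cached decisions is given below. The class name and the use of a hashable `jet_key` (for instance a frozenset of constituent identifiers) are assumptions of this sketch, not part of the analysis code.

```python
import random


class MimicBTagger:
    """Random b-tagger with 70 % efficiency and 2 % mistag rate (rejection 50)."""

    def __init__(self, efficiency=0.7, mistag=0.02, seed=None):
        self.efficiency = efficiency
        self.mistag = mistag
        self._rng = random.Random(seed)
        self._cache = {}  # jet_key -> stored tagging decision

    def is_tagged(self, jet_key, has_b_in_history):
        """Return the stored decision for a known jet, else draw a new one."""
        if jet_key not in self._cache:
            prob = self.efficiency if has_b_in_history else self.mistag
            self._cache[jet_key] = self._rng.random() < prob
        return self._cache[jet_key]
```

Repeated calls with the same `jet_key` return the same answer, which mirrors the requirement that a jet analyzed at several stages of the reconstruction never receives conflicting tagging results.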

Analysis cuts
After having reconstructed the candidates for the hadronic final states of the signal (top candidates and b-tagged anti-$k_T$ jets) we proceed with the analysis cuts. As our premise is to make a scale invariant analysis, we must avoid introducing scales through the cuts. We propose the following cuts and show the respective distribution before each cut in Fig. 5.

1. Zero leptons: The leptons or other particles that are emitted by the decaying chargino or second-lightest neutralino are too soft to be seen by the detector. Moreover, since we focus on the hadronic decay modes, no leptons should be present in the signal events. In the background, however, they are produced in the leptonic decays which are necessary to generate missing energy. We therefore require zero reconstructed electrons or muons.
2. Exactly two candidates: The visible part of the signal process consists of two hadronic final states as defined above. An event is rejected in the rare case that it contains more, and in particular in the cases where it contains fewer, than these two candidates. This means that no b-jets beyond possible b candidates are allowed.
3. $\Delta\phi(\vec p_{T,c_1} + \vec p_{T,c_2}, \vec{\slashed{E}}_T) > 0.8\pi$: Since we cannot determine the two neutralino momenta individually, it is impossible to reconstruct the momenta of the initial squarks. Yet, we can make use of the total event shape. In the signal, the transverse missing energy is the combination of the two neutralino momenta and therefore balances the transverse momenta of the two candidates. Consequently the vectorial sum of the candidates' transverse momenta $\vec p_{T,c_i}$ has to point in the opposite direction of the missing energy.
4. $|\vec p_{T,c_1} + \vec p_{T,c_2} + \vec{\slashed{E}}_T| / \slashed{E}_T < 0.5$: This cut is based on the same reasoning as the previous one. The absolute …
5. $\Delta\phi(\vec p_{T,c_1}, \vec{\slashed{E}}_T) < 0.9\pi$: By this cut we require that the missing transverse energy and the transverse momentum of the harder of the two candidates are not back-to-back. Since the two produced squarks are of the same type and the higgsinos are mass degenerate, the recoil of the top or bottom quarks against the respective higgsino will be the same in the squark rest frame. Therefore, the two neutralinos should contribute about equally to the missing energy and spoil the back-to-back orientation that is present for each top-neutralino pair individually. We therefore reject events where one top candidate recoils against an invisible particle and the second candidate does not. Moreover, we can thus reject events where the missing energy comes from a mismeasurement of the jet momentum.
6. $\Delta\phi(\vec p_{T,c_2}, \vec{\slashed{E}}_T) < 0.8\pi$: This cut exploits the same reasoning as the previous one.
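The six cuts can be collected in a short Python sketch. It is an illustration under stated assumptions, not the analysis code: candidates and the missing transverse momentum are represented as hypothetical `(px, py)` tuples, and the two candidates are assumed to be ordered by decreasing transverse momentum.

```python
import math


def delta_phi(a, b):
    """Azimuthal separation in [0, pi] between two (px, py) vectors."""
    d = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    return 2.0 * math.pi - d if d > math.pi else d


def passes_cuts(n_leptons, candidates, met):
    """Apply the scale-free cuts; only angles and ratios enter."""
    if n_leptons != 0 or len(candidates) != 2:
        return False  # cuts 1 and 2
    c1, c2 = candidates
    vis = (c1[0] + c2[0], c1[1] + c2[1])
    met_abs = math.hypot(met[0], met[1])
    if met_abs == 0.0:
        return False  # the ratio below would be undefined
    if delta_phi(vis, met) <= 0.8 * math.pi:
        return False  # cut 3
    if math.hypot(vis[0] + met[0], vis[1] + met[1]) / met_abs >= 0.5:
        return False  # cut 4
    if delta_phi(c1, met) >= 0.9 * math.pi:
        return False  # cut 5
    if delta_phi(c2, met) >= 0.8 * math.pi:
        return False  # cut 6
    return True
```

None of the thresholds carries a dimension, so rescaling all momenta of an event by a common factor leaves the decision unchanged.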
In the left plot in Fig. 6 we show the relative contribution of the types of the two final state candidates after all cuts. In the samples with heavy stops and sbottoms the HEPTopTagger contributes between 30 and 40 % of the candidates. In the samples with lighter squarks and thus less boosted objects the HEPTopTagger finds fewer candidates and the BDRSTagger contributes up to about 15 % of the top quark candidates. However, most candidates besides the ones from the HEPTopTagger are b-jets.
In Fig. 7 we show the efficiency of each cut for the three possible final states as a function of the squark mass. As anticipated they show only a mild mass dependence. This is also reflected in the total efficiency which is very flat over the whole parameter space as can be seen in Fig. 8.

Results
To continue further, we consider the $m_{T2}$ [90,91] distribution that is shown in the right plot of Fig. 6 (normalized) and in Fig. 9 (stacked). $m_{T2}$ is designed to reconstruct the mass of the decaying particle and gives a lower bound on it. This is reflected in the plotted distribution, where the upper edge of the signal distribution is just at the actual squark mass. For the calculation of $m_{T2}$ we assume zero neutralino mass and use a code described in [92] and provided by the authors of this reference. Instead of imposing an explicit cut on $m_{T2}$ to improve S/B, which would mainly benefit the processes involving heavy squarks, we evaluate the statistical significance by applying a binned likelihood analysis using the $CL_s$ technique described in [93,94]. For the calculation we employ the code MCLimit [95]. We assume an uncertainty of 15 % on the background cross section and also include an error stemming from the finite size of the Monte Carlo sample. For the latter we need to combine the pure statistical uncertainty with the knowledge of a steeply falling background distribution. To do this we determine for each background process and each bin the statistical uncertainty $\omega\sqrt{N}$, where ω is the weight of one event and N is the number of events in the given bin. Conservatively, we assign N = 1 for those bins which do not contain any events of the given process. In the high-$m_{T2}$ region where no background events appear this method clearly overestimates the error on the background, which is steeply falling. In addition we therefore fit the slopes of the $m_{T2}$ distributions with an exponential function and use this function to extrapolate the background distribution to the high-$m_{T2}$ region. As uncertainty on the shape for a given background process we now take in each bin the minimum of $\omega\sqrt{N}$ and three times the fitted function. This way the error in the low-$m_{T2}$ range is determined by the statistical uncertainty, while the one in the high-$m_{T2}$ range is determined by the extrapolation. The combined error on the background in each bin is then obtained by summing the squared errors of each process and taking the square root.
Figure caption (fragment): In all three plots μ = 150 GeV, which corresponds roughly to the mass of the higgsinos.
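The per-bin shape uncertainty can be written down compactly. The following Python sketch assumes that the fitted exponential has already been evaluated in each bin and is passed in as `fit_values`; the fit itself and the function names are not taken from the analysis code.

```python
import math


def shape_uncertainty(weight, counts, fit_values):
    """Per-bin uncertainty for one background process.

    weight: weight of one Monte Carlo event; counts: raw event counts per bin;
    fit_values: fitted exponential evaluated per bin (assumed given).
    """
    errors = []
    for n, fit in zip(counts, fit_values):
        stat = weight * math.sqrt(max(n, 1))  # N = 1 is assigned to empty bins
        errors.append(min(stat, 3.0 * fit))
    return errors


def combined_uncertainty(per_process_errors):
    """Add the per-process errors in quadrature, bin by bin."""
    return [math.sqrt(sum(e * e for e in bin_errors))
            for bin_errors in zip(*per_process_errors)]
```

In well-populated bins the statistical term is the smaller one, while in empty high-$m_{T2}$ bins the extrapolated fit takes over, as described above.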

The results are shown in Fig. 10 (Fig. 11) for μ = 150 (300) GeV and integrated luminosities of 100, 300, and 1000 fb$^{-1}$. In Fig. 12 we show for $(m_{Q_3}, m_{u_3}) = (1500, 1500)$ GeV the $CL_s$ exclusion limit as a function of the integrated luminosity. Even for this parameter point, which is close to the predicted sensitivity reach of the LHC, our approach yields a 95 % CL exclusion with 600 fb$^{-1}$.

Final remarks
The main idea behind this analysis was to obtain a scale invariant setup. In the first step we achieved this by employing the HEPTopTagger and the BDRSTagger together with varying radii. Thereby we managed to pick the minimal content of a hadronically decaying top quark for a large range of top momenta. In the second step we avoided introducing scales in the cuts and only exploited the event properties that are independent of the mass spectrum. After this proof of concept it will now be interesting to apply this principle to other searches where top quarks with various boosts appear in the final state, as for example in little Higgs models with T-parity [96][97][98].