Naturalness Sum Rules and Their Collider Tests

We present the most general sum rules reflecting the cancellation of ultraviolet divergences in the Higgs potential in weakly-coupled, natural extensions of the Standard Model. There is a separate sum rule for the cancellation of the quadratic and logarithmic divergences, and their forms depend on whether the divergences are canceled by same-spin or opposite-spin partners. These sum rules can be applied to mass eigenstates and conveniently used for direct collider tests of naturalness. We study in detail the feasibility of testing these sum rules in the top sector at a future $100\TeV$ proton collider within two benchmark models, the Little Higgs (LH) and the Maximally Symmetric Composite Higgs (MSCH). We show how the two ingredients of the sum rules, the top partner masses and their Yukawa couplings to the Higgs, can be measured with sufficient accuracy to provide a highly non-trivial quantitative test of the sum rules. In particular, we study observables sensitive to the sign of the top partner Yukawa, which is crucial for verifying the sum rules but is notoriously difficult to measure. We demonstrate that in the benchmark models under study, a statistically significant discrimination between the two possible signs of each Yukawa will be feasible with a 30 ab$^{-1}$ data set at $100\TeV$.


Introduction
Naturalness of electroweak symmetry breaking has long been recognized as the key issue that motivates extending the Standard Model (SM), with additional particles and interactions required to appear around the TeV scale in a natural theory. Experimental data, starting with precision electroweak tests in the 1990s and culminating with the discovery of the Higgs boson in 2012, confirmed the SM picture of electroweak symmetry breaking, and provided a strong indication that the theory must remain weakly coupled at the TeV scale. To be natural, weakly coupled theories require the presence of one or more new particles at the TeV scale, whose quantum-mechanical contributions to the Higgs potential must cancel the dominant contributions from the SM. At the very least, a natural, weakly-coupled theory with a cutoff of at least around 10 TeV requires that one-loop quadratically divergent contributions to the Higgs mass parameter from the top and electroweak gauge sectors be cancelled. In theoretically motivated constructions, the cancellation is enforced by a symmetry structure of the theory. Particles responsible for canceling the top and gauge contributions to the Higgs mass are called the "top partners" and "gauge partners", and their properties are related by symmetry to those of their SM counterparts, the top quark and the electroweak gauge bosons. In particular, the couplings of the top/gauge partners to the Higgs boson are tightly constrained by the requirement of the divergence cancellation. The first goal of this paper is to provide a simple and general form of these constraints, which we call naturalness sum rules. Our derivation of the sum rules is very general and does not assume any specific form of physics Beyond the Standard Model (BSM). In addition to the "leading" sum rule reflecting the cancellation of quadratic divergences, we also provide a second, "sub-leading" sum rule which holds in theories where logarithmic divergences are also canceled.
Such theories, exemplified by the recently proposed Maximally Symmetric Composite Higgs (MSCH) model [1,2], are theoretically attractive, since the amount of fine-tuning in electroweak symmetry breaking is reduced compared to more traditional composite Higgs models (CHMs). The naturalness sum rules in the top sector are expressed in terms of the masses of the top quark and its partner(s), and their Yukawa couplings to the Higgs boson. If top partners are discovered, measuring their masses and Yukawa couplings can provide a direct experimental test of the sum rules. Such a test would establish whether the observed particles indeed restore naturalness, and whether the divergence cancellation occurs only at the quadratic level or at the level of log divergences as well. LHC searches for top partners have placed strong limits on these particles' masses, in all but rather exotic decay scenarios. While future LHC runs may well discover top partners, it is highly unlikely that they will collect samples of top partners large enough for a statistically significant test of the sum rules, given the current constraints. On the other hand, a 100 TeV proton-proton collider, such as the SppC [10] proposed in China, or CERN's FCC-hh [11], may provide an opportunity to explore the top partners in detail. In this paper, we demonstrate with a detailed Monte Carlo study that the naturalness sum rules can be tested at a 100 TeV collider, at a few-% level, for two representative benchmark models of new physics: a Little Higgs (LH) model [12][13][14], and the MSCH model [1]. Crucially, we show that the signs of the top partner Yukawa couplings, and not just their magnitude, can be determined

Naturalness Sum Rules
In this section we establish very general sum rules that any theory without quadratic (or log) divergences has to satisfy. These sum rules provide the essential underlying mechanism for the solution of the hierarchy problem: testing them in high energy collider experiments should be of highest priority. Our considerations in this section will be very general and encompass all regularly considered models in which top partners provide the cancellation of the quadratic divergences. We first focus on the case where the Higgs is a pseudo-Nambu-Goldstone boson (pNGB) of a global symmetry G spontaneously broken to its subgroup H, as happens in all CHMs and LH models. In these kinds of models, the quadratic UV sensitivity of the Higgs potential due to top or electroweak gauge boson loops is cancelled by the corresponding partners with the same spin. We show that if the Higgs potential from the gauge and top sectors is natural, the couplings of the Standard Model fields and their partners to the physical Higgs boson must satisfy certain sum rules. We first derive the most general sum rules for the Yukawa couplings of the top sector (including the top partners) and then briefly show the corresponding sum rules in the gauge sector. The detailed derivations for the gauge sum rules are presented in App. A. Then we present the general sum rules for SUSY-like models, in which the Higgs is elementary and its divergences from top and gauge boson loops are cancelled by partners with opposite spin; more details can be found in App. B.

Sum rules for the top sector: cancellation of quadratic divergences
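In unitary gauge, the mass matrix of the fermions in the top sector in the background of the Higgs H can be parametrized by a Higgs-dependent matrix $M_F(H)$. The quadratic divergence in the Coleman-Weinberg potential is proportional to $\mathrm{Tr}[M_F M_F^\dagger]\,\Lambda^2$, so the Higgs mass term is free of quadratic divergences when this trace carries no term quadratic in H. We can now turn this main condition into a sum rule involving the masses and the Yukawa couplings of the fermions. We will show that the cancellation of the quadratic divergences implies the vanishing of the tadpoles of the physical Higgs in the quadratically divergent part of the potential after electroweak symmetry breaking (EWSB). Assume the pNGB Higgs gets a VEV $\langle H\rangle = \tilde v$, and denote the fluctuating field above this VEV as h, so that $H = \tilde v + h$. Notice that the VEV of the pNGB Higgs is not exactly equal to the EWSB scale v = 246 GeV.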
To avoid confusion, in this paper we always follow the convention that the nonlinear sigma field U describing the pNGBs $\pi^{\hat a}$ in the coset space G/H is built from the Goldstone matrix $\pi^{\hat a} T^{\hat a}$, where $T^{\hat a}$ are the broken generators with normalization $\mathrm{Tr}[T^{\hat a} T^{\hat b}] = \delta^{\hat a \hat b}$. Using these conventions, the pNGB Higgs VEV and the physical EWSB scale are related by a nonlinear function of $\tilde v/f$. In the region of successful EWSB, $\tilde v/f \ll 1$, the EWSB scale is approximately equal to the pNGB Higgs VEV, $v \approx \tilde v + O(\tilde v^3/f^2)$. We will be neglecting other corrections of this order and hence we will not be making a distinction between $v$ and $\tilde v$.
We find that the Higgs boson h tadpole in V(H) after EWSB is proportional to the same mass terms that control the quadratic divergences before EWSB. In this derivation we first Taylor expand in H around $H = 0$, where each of the Taylor series coefficients is a constant (independent of H or h). The first term of the expansion is the H tadpole in V(H) and hence must vanish by SU(2) gauge invariance, while the second term is directly the coefficient of the quadratically divergent Higgs mass term. Thus we find that if the Higgs mass term is not quadratically divergent, a sum rule for the terms linear in h in the Higgs potential must be satisfied, up to higher-order corrections in $\tilde v/f$. This sum rule can now be easily converted into one that contains the physical masses and Yukawa couplings. We know that $M_F|_{h=0}$ is the fermion mass matrix (after EWSB) and $(\partial M_F/\partial h)|_{h=0}$ is the matrix of physical Yukawa couplings. We can diagonalize the fermion mass matrix up to order $O(\tilde v^2/f^2)$ with a bi-unitary transformation, where L and R are the rotations of the left- and right-handed fermions, $M_D$ is the diagonal mass matrix and $Y_M$ is the Yukawa matrix for the mass eigenstates. We can now apply the same rotations to our naturalness condition (2.9). If we assume that CP is conserved, then the vanishing of the quadratically divergent corrections to the Higgs mass term yields a simple sum rule for the physical Yukawa couplings and fermion masses, $\mathrm{Tr}[Y_M M_D] = 0$ (2.12).
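As a numerical illustration of the mass-basis form of the first sum rule, the following minimal sketch evaluates $\sum_i (Y_M)_{ii} (M_D)_{ii}$, i.e. the trace form $\mathrm{Tr}[Y_M M_D]$ obtained by rotating $\partial_h \mathrm{Tr}[M_F M_F^\dagger] = 0$ to the mass basis. The 2x2 matrix used here is a toy choice with a collective-breaking structure, introduced only for illustration; it is an assumption and not the mass matrix of either benchmark model. The parameter values and all function names are likewise hypothetical.

```python
import math

def mass_matrix(h, lam1, lam2, f):
    # Toy (assumed) mass matrix; rows (t_L, T_L), columns (u_R, T_R).
    return [[lam1 * f * math.sin(h / f), 0.0],
            [lam1 * f * math.cos(h / f), lam2 * f]]

def yukawa_matrix(h, lam1, lam2, f):
    # Y = dM/dh for the toy matrix; lam2 and f drop out of the derivative.
    return [[lam1 * math.cos(h / f), 0.0],
            [-lam1 * math.sin(h / f), 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def eigen_rotation(s):
    # Rotation diagonalizing a real symmetric 2x2 matrix,
    # with the larger eigenvalue in the first column.
    theta = 0.5 * math.atan2(2.0 * s[0][1], s[0][0] - s[1][1])
    c, sn = math.cos(theta), math.sin(theta)
    return [[c, -sn], [sn, c]]

def sum_rule_terms(h, lam1, lam2, f):
    # Returns (mass, Yukawa, Yukawa * mass) for each mass eigenstate.
    # The overall sign of a mass and of its Yukawa are correlated, so
    # only the product is meaningful.
    m = mass_matrix(h, lam1, lam2, f)
    y = yukawa_matrix(h, lam1, lam2, f)
    left = eigen_rotation(matmul(m, transpose(m)))
    right = eigen_rotation(matmul(transpose(m), m))
    m_d = matmul(matmul(transpose(left), m), right)
    y_m = matmul(matmul(transpose(left), y), right)
    return [(m_d[i][i], y_m[i][i], y_m[i][i] * m_d[i][i]) for i in range(2)]

# Hypothetical inputs in GeV: h = 246, f = 1000.
heavy, light = sum_rule_terms(h=246.0, lam1=1.0, lam2=1.2, f=1000.0)
print("heavy state: m = %.1f, Y = %.4f" % heavy[:2])
print("light state: m = %.1f, Y = %.4f" % light[:2])
print("Tr[Y_M M_D] = %.3e" % (heavy[2] + light[2]))
```

The two products are individually non-zero and cancel only in the sum, which is why both the magnitude and the sign of the partner Yukawa matter for the test.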

Sum rules for the top sector: cancellation of the log divergences
One can carry through the same analysis to obtain a sum rule resulting from the vanishing of the logarithmically divergent contributions to the Higgs mass. The log divergence in the Coleman-Weinberg potential is proportional to $\mathrm{Tr}[(M_F M_F^\dagger)^2]\log\Lambda^2$, so the condition for the absence of a logarithmic divergence in the Higgs mass is that this trace carries no term quadratic in H. Just as before, we can show that if the Higgs mass is not logarithmically divergent, and using that the tadpole of H is always zero, the h tadpole after EWSB must also be zero. Diagonalizing the mass matrix, the identity in Eq. (2.14) can be expressed in terms of $M_D$ and $Y_M$. If CP is conserved, we obtain the sum rule for the vanishing of the logarithmic divergence, $\mathrm{Tr}[Y_M M_D^3] = 0$.

Gauge sector sum rules
Following the same logic, we can easily derive the naturalness sum rules in the gauge sector. We obtain the following results (the details of the derivations are presented in App. A). The cancellation of the quadratic divergences implies the sum rule $\mathrm{Tr}[g_{VVh}] = 0$, where $g_{VVh}$ is the linear coupling of the physical Higgs boson to the physical (mass eigenstate) gauge bosons, which has mass dimension one. Similarly, the cancellation of the log divergences implies the sum rule $\mathrm{Tr}[g_{VVh} M_V^2] = 0$, where $M_V$ is the diagonal mass matrix for the physical vector bosons.

Sum rules for opposite spin partners
Our previous discussion of sum rules can be extended to the case where the divergences are cancelled by opposite spin partners, as in supersymmetric (SUSY) theories. For completeness we briefly list these sum rules (the more detailed derivation can be found in App. B). The general sum rule for the cancellation of the quadratic divergences via opposite spin partners involves, in addition to the fermion and vector terms, $g_{SSh}$, the matrix of trilinear couplings of the SM Higgs boson to the physical (mass eigenstate) additional scalar particles. If CP is conserved, this general sum rule simplifies further. Note that in SUSY theories one generically has to deal with multiple Higgs doublets leading to several physical neutral Higgs bosons. Assuming that the quadratic divergences for all Higgs doublets are cancelled, the above sum rules should apply for every physical neutral Higgs boson separately.
For example, in the case of SUSY theories this sum rule can be directly translated into separate sum rules for the top/stop and the gauge/gaugino/Higgs/Higgsino sectors (assuming that the cancellation of the quadratic divergences in the top/stop sector is independent of the other sector):
• Top/stop sector: here $g_{\tilde t_i \tilde t_i h}$ are the trilinear couplings of the physical Higgs to the physical stops $\tilde t_i$, and $y_t$ is the top Yukawa coupling.
• Gauge/gaugino/Higgs/Higgsino sector: generically the gauginos and Higgsinos will mix to form the neutralinos and charginos, and there will be a single sum rule for the entire sector. Here $y_{C^+_i C^-_i h}$ and $y_{N_i N_i h}$ are the physical Yukawa couplings of the charginos and neutralinos, $m_{C_i}$ and $m_{N_i}$ are the chargino and neutralino masses, and $g_{W^+W^-h}$ and $g_{ZZh}$ are the Higgs trilinear couplings to the W and Z gauge bosons. In the second line of the sum rule, the remaining couplings are the trilinear couplings of the physical Higgs to new neutral and charged scalars, and $g_{hhh}$ is the Higgs cubic self-coupling. If there is a separate cancellation in the charged and the neutral sectors, there will be two separate sum rules involving only the terms corresponding to the charged or neutral states. If there is no mixing between the Higgsinos and gauginos, the sum rule may also split into separate sum rules for the gauge/gaugino and the Higgs/Higgsino sectors, but generically there is just one overall sum rule that can be experimentally tested. It can be easily checked that all of these sum rules are satisfied within SUSY models.

Explicit examples for the sum rules and benchmark models
Below we present the two benchmark models that we will be examining, and show how the sum rules in the top sector are satisfied in each of them.

Little Higgs
Little Higgs (LH) models were the earliest successful pNGB Higgs models, and they predict the presence of same-spin top and gauge partners. In LH models the quadratic divergences in the Higgs potential are cancelled via the mechanism of collective symmetry breaking: the Higgs shift symmetry is explicitly broken only when two separate terms in the Lagrangian are simultaneously turned on. If one of these terms is turned off, the Higgs shift symmetry is restored and no potential can be generated. Hence the potential must be proportional to the product of these symmetry breaking couplings, which softens the divergence and eliminates the quadratic divergence (while the potential is generically still log divergent). Since the Higgs quadratic divergences are cancelled, our first sum rule must be satisfied (while the second one will not be satisfied in these models). We choose the top sector of the "simplest Little Higgs model" (SLH) as one of our benchmark models. Here we first explain how the sum rule is analytically satisfied in this particular model, while in the later part of the paper we show how this sum rule can be tested experimentally. The top sector of the SLH is quite general and representative of the top sector of many other LH models. In particular, it is identical to the top sector of the "littlest Higgs". The gauge group of the SLH model is SU(3), and there are two copies of the SU(3)/SU(2) coset, where the gauged SU(3) is identified with the diagonal SU(3). The Higgs will be part of the non-linear field U parametrized as in Eq. (2.5), where the broken generators $T^{\hat a}$ are in the direction SU(3)/SU(2), and the Goldstone matrix $\pi^{\hat a} T^{\hat a}$ contains the Higgs doublet (we ignore an additional singlet Goldstone that does not play a role in the cancellation of the quadratic divergences). The linearly transforming Higgs triplet is then given by Eq. (2.24), where $H = \tilde v + h$.
The top sector is obtained by extending the SM quark doublet to a full SU(3) triplet by introducing an additional electroweak singlet Dirac fermion T. The left-handed component $T_L$ will be part of the SU(3) triplet while $T_R$ is an SU(3) singlet. The Yukawa couplings are then written in terms of two couplings $\lambda_{1,2}$, where $u_{3R}$ is the usual RH top quark (which will mix with $T_R$ to form the actual SM $t_R$). It is easy to see that if we turn off one of the Yukawa couplings $\lambda_{1,2}$, the NGBs can always be removed from the fermion sector, hence implementing collective symmetry breaking. This also implies that the Higgs potential from the top sector should be proportional to $|\lambda_1 \lambda_2|^2 f^4$ and only log divergent. The quadratic divergences from the usual top quark loop are cancelled by the loops involving the Dirac fermion T, which can be easily checked directly. The fermion mass matrix is written in the electroweak (EW) basis $(t_{3L}, T_L)$ and $(u_{3R}, T_R)$ in the Higgs background. It is easy to see that the quadratically divergent term in the potential is independent of the Higgs field, leading to the first sum rule. We can also explicitly check the validity of the first sum rule after EWSB, using the physical top and top partner masses and their diagonal Yukawa couplings to the Higgs. Thus we can see that in this model the first sum rule, Eq. (2.31), is indeed satisfied.
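The collective-breaking argument can be illustrated numerically. The sketch below scans the Higgs background and checks whether the coefficients of the quadratic and logarithmic divergences, $\mathrm{Tr}[M_F M_F^\dagger]$ and $\mathrm{Tr}[(M_F M_F^\dagger)^2]$, depend on it. The 2x2 matrix is a toy with two couplings `lam1` and `lam2`, written in units of f; it is an assumed stand-in with the same collective structure, not the SLH mass matrix of the text, and the function names are hypothetical.

```python
import math

def mass_matrix(x, lam1, lam2):
    # Toy (assumed) mass matrix in units of f, with x = H/f.
    return [[lam1 * math.sin(x), 0.0],
            [lam1 * math.cos(x), lam2]]

def trace_mmt_power(m, power):
    # Tr[(M M^T)^power] for a real square matrix given as nested lists.
    n = len(m)
    mmt = [[sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = [[sum(result[i][k] * mmt[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return sum(result[i][i] for i in range(n))

def higgs_dependence(power, lam1, lam2, steps=50):
    # Spread of the trace over a scan 0 <= H/f <= 1.
    # A vanishing spread means the trace is independent of the Higgs.
    values = [trace_mmt_power(mass_matrix(i / steps, lam1, lam2), power)
              for i in range(steps + 1)]
    return max(values) - min(values)

print("quadratic coefficient spread:", higgs_dependence(1, 1.0, 1.0))
print("log coefficient spread:      ", higgs_dependence(2, 1.0, 1.0))
print("log spread with lam2 = 0:    ", higgs_dependence(2, 1.0, 0.0))
print("log spread with lam1 = 0:    ", higgs_dependence(2, 0.0, 1.0))
```

In this toy the quadratic coefficient is Higgs-independent for any couplings, while the logarithmic one depends on the Higgs only when both couplings are non-zero and scales as `lam1**2 * lam2**2`, mirroring the statement that the first sum rule holds and the second does not.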

Maximally Symmetric Composite Higgs
Here we briefly summarize the main general features of the MSCH model [1] and then discuss its top sector, and in particular the corresponding sum rules, in detail. The MSCH model is based on a symmetric coset space G/H with an unusual unbroken global symmetry of the composite fermion sector called maximal symmetry. This unbroken flavor symmetry is the subgroup of the enhanced chiral global symmetry $G_L \times G_R$ of the composite fermions: the combinations of the generators of $G_L \times G_R$ which leave the VEV V that breaks G to H invariant. In the SM fermion effective Lagrangian (obtained after integrating out all the heavy composites), this symmetry eliminates all the Higgs dependence in the effective kinetic terms and allows Higgs dependence only in the effective Yukawa couplings. As a consequence, in models where the SM fermion masses are produced collectively via partial compositeness, the Higgs potential from effective Yukawa couplings will be automatically finite. In this model, the Higgs potential is dominated by the top sector and has the minimal universal tuning for successful EWSB. Since the Higgs potential from the top sector is finite, both sum rules are satisfied, which we will explicitly display below. We choose the minimal SO(5)/SO(4) coset with custodial symmetry for the MSCH model. The EW gauge interaction is embedded in the $SO(4) \simeq SU(2)_L \times SU(2)_R$ subgroup. The NGBs from SO(5)/SO(4) are also described by the nonlinear sigma field U in Eq. (2.5), where $T^{\hat a}$ are the broken generators in the direction SO(5)/SO(4) and f is still the global symmetry breaking scale. A crucial element of the construction of the MSCH model is the presence of the Higgs-parity operator V = diag(1, 1, 1, 1, −1), which can also be identified with the Higgs VEV breaking SO(5) → SO(4).
In the MSCH model it is assumed that the composite fermionic operator O mixing with the SM top is in the fundamental representation of SO(5). Since the composite sector only has SO(4) symmetry, this composite operator can be decomposed into a fourplet and a singlet top partner under this subgroup (2.32) The explicit parametrization of these top partners is then: In order to mix with these composite fermions, the SM top doublet and singlet should also be embedded into the 5 representation with the explicit form To realize maximal symmetry, the elementary-composite mixing terms should be fully SO(5) invariant to ensure that these mixing terms are invariant under the fully enhanced chiral global symmetry SO(5) L × SO(5) R . The breaking of this symmetry appears only in the composite top partner mass term which is twisted by V . This structure results in the chiral global symmetry broken to the maximal symmetry SO(5) V : As explained before this breaking pattern will ensure the finiteness of the Higgs potential. The explicit form of the Lagrangian of the top sector satisfying the requirements of maximal symmetry is where E µ is the e symbol of CCWZ (but this is not important for the top sector contributions to the Higgs potential). Maximal symmetry guarantees that the Higgs shift symmetry is collectively broken by the couplings t,q and the mass M , hence the Higgs potential from the top sector loops must be proportional to ∼ ( t q ) 2 f 4 M 2 .
As expected, power counting shows that the degree of divergence of the Higgs potential is negative, so it is finite, free of quadratic and logarithmic divergences. Now we can check our sum rules in this model explicitly. In unitary gauge, the mass matrix for the top and the top partners in the background of the physical Higgs H is written in the basis $(t, T, X_{2/3}, T_1)$. Due to the off-diagonal Yukawas, a further rotation is necessary to diagonalize the mass matrix after EWSB. We label the final (post-EWSB) mass eigenstates as $t_{1,2,3}$. It is easy to confirm that the Higgs quadratic and logarithmic divergences vanish, since $\mathrm{Tr}[M_F M_F^\dagger]$ and $\mathrm{Tr}[(M_F M_F^\dagger)^2]$, respectively, are both independent of the Higgs background field H. After EWSB, the Yukawa coupling matrix Y is obtained from the derivative of the mass matrix with respect to h. We can then explicitly check that the first and second sum rules are both exactly satisfied with these mass ($M_F$) and Yukawa ($Y$) matrices.

Collider Tests of the Sum Rules: Outline
We have seen above that the cancellation of divergences via same-spin partners implies the presence of very simple and generic sum rules. The focus of the rest of the paper will be to establish how these sum rules for the top sector can be tested experimentally. Such a test requires measurements of the top partner mass $m_{t'}$, and its "diagonal" coupling to the Higgs, $Y_{t't'h}$. (In this section, $t'$ denotes a generic fermionic top partner. If multiple top partners contribute to divergence cancellation in a given model, the measurements outlined here must be performed separately for each one.) While the sum rules apply to any model with weakly-coupled top partners, details of the experimental strategies required to perform such measurements are model-dependent. To illustrate how such measurements can be performed in concrete, well-motivated scenarios, we choose two models with fermionic top partners, namely the LH and MSCH models introduced above. Collider phenomenology of fermionic top partners has been studied by many authors [20]. The dominant production processes at a hadron collider are pair-production, $pp \to t'\bar{t}'$, and electroweak single production, $pp \to t'q$. The produced top partners decay to the SM final states tZ, th, and bW. Currently, searches for such final states at the LHC place bounds on the top partner mass of about $m_{t'} \gtrsim 1.3$ TeV [22][23][24][25][26][27]. For our study, we choose benchmark points in the parameter spaces of the LH and MSCH models that are consistent with these constraints, with top partner masses around 1.5 TeV. The benchmark points will be defined in Section 4.1. For these benchmark points, it is expected that future runs of the LHC will discover the top partners. However, we find that detailed measurements of the top partner properties needed for checking the sum rules require significantly larger data sets than what can be collected at the (HL-)LHC.
Motivated by this, we study the potential of a future 100 TeV hadron collider, such as the SppC [10] proposed in China, or CERN's FCC-hh [11]. The first ingredient of the sum rules that will be measured is the top partner mass $m_{t'}$. This mass can be reconstructed using fully visible decay channels, for example tZ with a hadronic top (typically appearing as a top jet, given the high mass of the top partner) and a leptonic Z. In the case of the MSCH model, the masses of all relevant top partners can be measured using this technique, with similar uncertainties.
Measuring $Y_{t't'h}$ is more subtle. Ref. [19] proposed that this coupling can be determined from the rate of the process $pp \to t'\bar{t}'h$, which can be measurable at the 100 TeV collider. This method, however, has an important limitation: since the rate is proportional to $|Y_{t't'h}|^2$, it can only provide information about the absolute value, and not the sign, of the coupling. Relative signs between the terms in the sum rule are crucially important for cancellations to occur, and therefore it is necessary to test these signs experimentally. In this paper, we propose a novel observable that is sensitive to the sign of the top partner-Higgs coupling. The idea is simple. Consider a parton-level production process involving $t'$, such that the leading matrix element receives two contributions: one with an odd number of $Y_{t't'h}$ vertices (for example, one), and the other with an even number of $Y_{t't'h}$ vertices (for example, zero). The cross section of such a process includes the interference between the two contributions, which is sensitive to the sign of $Y_{t't'h}$. If the two contributions are roughly similar in size, the interference term can constitute a sizeable fraction of the total cross section. Assuming all other relevant parameters are known, measuring this cross section provides information about both the magnitude and the sign of $Y_{t't'h}$.
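The logic of the interference argument can be sketched with a toy cross section. If the amplitude has one piece without the coupling and one piece linear in it, the cross section is a quadratic polynomial $\sigma(Y) = a + bY + cY^2$, with the interference term $bY$ odd under $Y \to -Y$. The coefficients `a`, `b`, `c` and the numbers used below are purely illustrative assumptions, not values from the simulation in this paper.

```python
import math

def cross_section(y, a, b, c):
    # a: |A_0|^2 piece, b*y: interference, c*y^2: |Y A_1|^2 piece.
    return a + b * y + c * y * y

def coupling_solutions(sigma, a, b, c):
    # Real values of Y reproducing a measured cross section sigma.
    disc = b * b - 4.0 * c * (a - sigma)
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * c), (-b + root) / (2.0 * c)])

def pick_sign(sigma, abs_y, a, b, c):
    # Given |Y| from a rate proportional to |Y|^2, choose the sign whose
    # predicted interfering cross section is closer to the measurement.
    return min((abs_y, -abs_y),
               key=lambda y: abs(cross_section(y, a, b, c) - sigma))

# Without interference (b = 0) the two solutions differ only by sign.
print(coupling_solutions(10.5, a=10.0, b=0.0, c=2.0))
# With interference the degeneracy is lifted.
sigma_measured = cross_section(-0.5, a=10.0, b=4.0, c=2.0)
print(coupling_solutions(sigma_measured, a=10.0, b=4.0, c=2.0))
print(pick_sign(sigma_measured, 0.5, a=10.0, b=4.0, c=2.0))
```

In this toy, the interfering process alone still leaves a two-fold ambiguity, but the two solutions have different magnitudes, so combining it with an independent determination of the magnitude fixes the sign.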
The simplest example of a process sensitive to the sign of $Y_{t't'h}$ is single top partner production in association with the Higgs, $pp \to qht'$. Representative Feynman diagrams for this process are shown in Fig. 1. The diagram on the left changes sign when the sign of $Y_{t't'h}$ is flipped, while the other two diagrams do not. If the off-diagonal $tt'h$ coupling and the top-partner gauge couplings are measured independently, the total rate of this process can be used to infer $Y_{t't'h}$, including its sign. This approach is directly parallel to that proposed in Ref. [28] to determine the sign of the SM top Yukawa coupling at the LHC using the process $pp \to qht$.
Another interesting example is the process $pp \to t'thh$. The dominant contributions to the matrix element of this process are illustrated in Fig. 2. The first contribution (left panel) is from top partner pair-production $pp \to t'\bar{t}'$, followed by a three-body decay of one of the top partners, $t' \to thh$, via a diagram with an off-shell top. The second contribution (right panel) is from top partner pair-production in association with the Higgs, $pp \to t'\bar{t}'h$, followed by a two-body decay of one of the top partners, $t' \to th$. (The second top partner decays in the same two-body channel in both cases, and this decay does not affect the discussion.) If the sign of $Y_{t't'h}$ is flipped, the first contribution is unaffected, while the second one changes sign. Since the two contributions are of the same order in perturbation theory, their interference makes up a significant part of the cross section, making this process promising for determining the sign of $Y_{t't'h}$ despite its relatively small rate.

Collider Tests of the Sum Rules: Feasibility Study
To demonstrate that the proposed sum rules are useful in practice, we performed a detailed study of the relevant collider phenomenology. For this study, we chose two benchmark parameter points, one each in the LH and MSCH models. We then performed realistic Monte Carlo simulations and analysis of the relevant signal and background processes in proton-proton collisions at √ s = 100 TeV. The main results of this section are summarized in Table 3, which contains quantitative estimates of the precision with which ingredients of the sum rules can be measured.

Benchmark Points
Parameter spaces of both the LH and MSCH models are constrained by the LHC searches for charge-2/3 top partners [22][23][24][25][26][27]. To satisfy these constraints, we choose benchmark points (BPs) in both models with top partner masses around 1.5 TeV. The MSCH model also contains a charge-5/3 top partner, whose mass eigenstate is denoted by $x_5$, which can produce a same-sign dilepton final state via the decay chain $x_5 \to Wt \to \ell\nu\,\ell\nu\, b$. Stringent LHC limits [29,30]

To quantify the sensitivity of cross section measurements to the sign of $Y_{t't'h}$, we consider the flip rate F, defined as the fractional change in the cross section when the sign of $Y_{t't'h}$ is flipped, while all other couplings and masses are unchanged. Flip rates for relevant reactions at our BPs are listed in Tables 1 and 2. Note that flip rates vanish for the lightest MSCH top partner $t_1$, since $Y_{t_1 t_1 h} = 0$. For the other top partners, the flip rates can be used as a quick estimate of the accuracy of the cross section measurement required to gain sensitivity to the sign of $Y_{t't'h}$. For the reactions we consider, cross sections are in the 10-100 fb range and flip rates are in the 10-50% range, making it plausible that the sign of the coupling can be detected at a statistically significant level. The limiting factor will be separating the signals from SM backgrounds; this will be discussed in detail in the rest of this section.
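A back-of-the-envelope estimate shows how a flip rate translates into statistical sensitivity. The sketch below assumes the normalization $F = [\sigma(-Y) - \sigma(+Y)]/\sigma(+Y)$, which is one natural reading of "fractional change" and may differ from the exact convention used for Tables 1 and 2. The selection efficiency, the background count and the simple counting significance are hypothetical placeholders, not results of the analysis described in this section.

```python
import math

LUMI_FB = 30.0 * 1000.0  # 30 ab^-1 expressed in fb^-1

def flip_rate(sigma_nominal, sigma_flipped):
    # Fractional change of the cross section under Y -> -Y
    # (assumed normalization: relative to the nominal-sign value).
    return (sigma_flipped - sigma_nominal) / sigma_nominal

def sign_significance(sigma_nominal_fb, flip, efficiency,
                      background_events=0.0, lumi_fb=LUMI_FB):
    # Naive counting estimate: difference in expected signal events
    # between the two signs, in units of the Poisson error on the total.
    n_nominal = sigma_nominal_fb * lumi_fb * efficiency
    n_flipped = n_nominal * (1.0 + flip)
    return abs(n_flipped - n_nominal) / math.sqrt(n_nominal + background_events)

# Hypothetical example: 10 fb cross section, 10% flip rate, and an
# assumed overall efficiency (branching ratios times selection) of 1e-3.
f_rate = flip_rate(10.0, 11.0)
print("flip rate:", f_rate)
print("significance:", sign_significance(10.0, f_rate, efficiency=1e-3))
```

The estimate makes the role of the backgrounds explicit: for a fixed flip rate, the separation degrades as `background_events` grows, which is why background rejection is the limiting factor.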

Little Higgs
In this subsection, we discuss how all ingredients of the naturalness sum rule in the Little Higgs model, Eq. (2.31), can be measured experimentally at a 100 TeV collider. Here and in the rest of the paper, we assume an integrated luminosity of 30 ab$^{-1}$ [10,11,31]. Monte Carlo (MC) event simulation for signal and background is performed within the MadGraph5_aMC@NLO platform [32]. The FeynRules [33] package was used to generate a UFO output model [34]. Events were then generated using MadGraph 5 [35], the subsequent decays are handled by MadSpin [36], showering and hadronization are then modeled by Pythia8 [37], and finally detector effects are approximated by a parametrized detector simulation using Delphes3 [38].

Top Partner Mass and Branching Ratios
The simplest process involving top partners at a hadron collider is pair-production, $pp \to T\bar{T}$. A very large sample of T pairs will be collected at a 100 TeV collider, allowing for precise determination of the top partner mass and branching ratios. Both are important for testing the sum rules: $m_T$ enters the sum rule directly, while the branching ratios are needed to infer the coupling $Y_{TTh}$ (see below). The top partner mass can be fully reconstructed in the decay $T \to Zt$, followed by a hadronic top decay. For simplicity, we consider the situation when this decay occurs on both sides of the event. To reduce QCD backgrounds, we assume that both Z bosons decay leptonically. Since $m_T \gg m_t$, hadronic top decays would typically result in a single (fat) jet. We thus consider the final state with two lepton pairs, each with invariant mass close to $m_Z$, and two top-tagged fat jets. In the MC simulation, we apply the anti-$k_T$ reconstruction algorithm with cone size R = 1.0, and require the final state to contain at least two jets with $p_T > 400$ GeV. A Convolutional Neural Network (CNN) jet-tagger algorithm [39][40][41], which is described in detail in Appendix C, is then applied, and the events with exactly two top-tagged jets are selected. The principal SM backgrounds to this final state, $ZZt\bar{t}$ and $ZZjj$ (with two QCD jets misidentified as top jets), have also been simulated. The distribution of events in the invariant mass $m(\ell^+\ell^- j)$ (including incorrect groupings due to combinatorial uncertainties), as well as the corresponding likelihood profile for $m_T$, are shown in Fig. 3. We conclude that $m_T$ can be measured in this channel with a very small statistical error, approximately 1%. In practice, systematic errors such as the Jet Energy Scale (JES) uncertainty would likely dominate the measurement, probably giving an overall error of order a few %. This is sufficient for our purposes.
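The reconstruction step can be illustrated with a small sketch that forms the invariant mass of each Z candidate with each top-tagged jet and keeps both possible groupings, correct and incorrect, as in the distribution described above. Four-vectors are plain `(E, px, py, pz)` tuples in GeV; the function names are hypothetical, and the example event is an idealized parton-level configuration (both top partners at rest, no detector smearing), not output of the simulation chain used in this paper.

```python
import math
from itertools import permutations

def invariant_mass(*vectors):
    # Invariant mass of the sum of four-vectors (E, px, py, pz).
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def two_body_momentum(parent_mass, m1, m2):
    # Momentum of the daughters in the rest frame of the parent.
    term = ((parent_mass ** 2 - (m1 + m2) ** 2)
            * (parent_mass ** 2 - (m1 - m2) ** 2))
    return math.sqrt(term) / (2.0 * parent_mass)

def candidate_masses(z_candidates, top_jets):
    # One list of m(Z, jet) values per grouping of Z candidates with jets.
    return [[invariant_mass(z, j) for z, j in zip(z_candidates, ordering)]
            for ordering in permutations(top_jets)]

# Idealized event: two 1.5 TeV partners at rest, decaying along z and x.
m_partner, m_z, m_top = 1500.0, 91.19, 173.0
p = two_body_momentum(m_partner, m_z, m_top)
z1, t1 = (math.hypot(p, m_z), 0.0, 0.0, p), (math.hypot(p, m_top), 0.0, 0.0, -p)
z2, t2 = (math.hypot(p, m_z), p, 0.0, 0.0), (math.hypot(p, m_top), -p, 0.0, 0.0)
for grouping in candidate_masses([z1, z2], [t1, t2]):
    print(["%.1f" % m for m in grouping])
```

Only the correct grouping reconstructs the partner mass; the incorrect one populates a broader combinatorial component of the distribution.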
Since T pair-production is dominated by strong interactions, the mass $m_T$ is the only non-SM parameter that enters $\sigma(T\bar{T})$ at leading order. Once this mass is measured, the cross section is completely fixed from theory, up to pdf uncertainties. Measurements of event rates in specific final states can then be used to deduce the branching ratios of various T decay modes. For example, the event rate N(ZtZt) in the fully-reconstructed final state discussed above is proportional to $[\mathrm{Br}(T \to Zt)]^2$. Using the $Z_{\ell\ell} + Z_{jj} + 2j_t$ final state, we estimate that this rate can be measured with a statistical error of about 5%. The statistical error of the branching ratio measurement can be estimated by combining the uncertainty of the rate measurement with the uncertainty of the predicted cross section induced by the error on $m_T$. The derivative $\partial \log\sigma / \partial \log M_T$ was estimated numerically using MadGraph. In this case, we find that the rate measurement dominates the statistical uncertainty, resulting in a measurement of the branching ratio accurate to about 2.5%, see Table 3. In reality, the measurement will also be affected by systematic errors, including pdf uncertainties, uncertainties in jet tagging probabilities, etc. While a quantitative analysis of these sources of error is beyond the scope of this work, we expect that these errors can be controlled at a few-% level, sufficient for a highly non-trivial check of the sum rules.
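The error budget for the branching ratio can be sketched with standard error propagation. Since the event rate scales as $N \propto \sigma(m_T)\,[\mathrm{Br}]^2$, and assuming the rate and mass uncertainties are uncorrelated, one expects $\delta\mathrm{Br}/\mathrm{Br} = \tfrac{1}{2}\sqrt{(\delta N/N)^2 + (\partial\log\sigma/\partial\log M_T)^2 (\delta M_T/M_T)^2}$. This is a generic estimate written under those assumptions and may differ in detail from the expression used in the analysis; the value of the logarithmic derivative used in the second call below is a hypothetical placeholder, since the number obtained from MadGraph is not quoted in the text.

```python
import math

def branching_ratio_rel_error(rate_rel_error, mass_rel_error,
                              dlogsigma_dlogm, br_power=2):
    # Relative error on Br when the rate scales as sigma(M) * Br**br_power,
    # with uncorrelated rate and mass uncertainties.
    sigma_term = dlogsigma_dlogm * mass_rel_error
    return math.sqrt(rate_rel_error ** 2 + sigma_term ** 2) / br_power

# Rate-dominated limit: a 5% rate error gives a 2.5% error on Br.
print(branching_ratio_rel_error(0.05, 0.0, dlogsigma_dlogm=-3.0))

# Hypothetical case with a 1% mass error and an assumed slope of -3.
print(branching_ratio_rel_error(0.05, 0.01, dlogsigma_dlogm=-3.0))
```

The factor `1 / br_power` is what turns the 5% rate uncertainty into the 2.5% branching ratio uncertainty quoted above when the mass term is negligible.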
Similarly, the branching ratio Br(T → th) can be inferred from the rate of thth events. In this case, we consider the final state in which one of the Higgs bosons decays to γγ, while the other Higgs decays to bb. Each top is assumed to decay hadronically, typically producing a single fat jet. Likewise, the Higgs decaying to a bb pair is highly boosted in the lab frame, resulting in a single "Higgs jet". We thus require 2 photons and 3 jets tagged as massive (non-QCD) jets in the final state. Principal SM backgrounds to this final state include 3j + γγ (including both resonant diphotons from Higgs decays, and the non-resonant contribution) and ttj + γγ. Imposing a cut p T > 400 GeV on all jets, we find a large event sample with S/B ∼ 1, resulting in a measurement of the signal rate with a statistical error of about 5%. Finally, the channel bW bW can be used to measure the remaining branching ratio, Br(T → W b). To suppress QCD backgrounds, we require that both W bosons decay leptonically, resulting in a final state ℓ + ℓ − + 2b + E / T . The principal backgrounds are from tt and W W jj processes. We estimate that this signal rate can also be measured with a statistical error of about 5%. We conclude that with the assumed dataset, statistical errors on all three branching ratios will be about 2.5%. While systematic errors, such as pdf uncertainties, are likely to somewhat degrade the precision, obtaining a few-% measurement of the three branching ratios seems realistic.

Top Partner Yukawa Coupling: pp → qT h
The second ingredient of the sum rule is the top partner Yukawa coupling Y T T h . We propose to infer this coupling, including the sign, from a measurement of the cross section σ(pp → qT h). Before demonstrating that this cross section can be measured at the 100 TeV collider with sufficient precision, let us address the following important point. To infer the coupling Y T T h from a measured cross section, one needs to have a theoretical prediction of this cross section in which Y T T h is the only unknown parameter. In the LH model, the cross section depends on the top partner mass, m T , and the couplings g bT W and g tT h , in addition to Y T T h . (See the Feynman diagrams in Fig. 1.) The measurement of m T has already been discussed. The couplings g bT W and g tT h could potentially be inferred from the branching ratios Br(T → W b) and Br(T → th) respectively. However, doing so requires an independent measurement of the total width of the top partner, Γ T . Unfortunately, at our benchmark point, Γ T ≈ 40 GeV is too small to be directly measured: the width of the reconstructed top partner mass distribution in Fig. 3 is completely dominated by detector smearing effects. Instead, one can use a measurement of the single top-partner production cross section σ(pp → T q), which is proportional to g 2 bT W and does not depend on any other non-SM parameters except m T . To estimate the prospects for measuring this cross section, we simulated this process, with the decay channel T → Zt followed by a leptonic Z decay and a hadronic top decay forming a top-jet. The dominant backgrounds, Z+jets and ttZ, were also simulated. The selection criteria for this process are an opposite-sign lepton pair consistent with a Z decay, and at least one top-tagged jet with p T > 400 GeV. We estimate a statistical error in the signal rate measurement of about 1%.
This signal rate can be converted into a cross section measurement by dividing it by the branching ratio Br(T → Zt), which can be measured independently in the pair-production channel as discussed above. This method gives a statistical uncertainty on the measurement of g bT W of approximately 1.2%, dominated by the branching fraction measurement. Once g bT W is known, the coupling g tT h can be inferred from the ratio of the branching ratios, Br(T → th)/Br(T → W b), which is independent of the total width. We estimate that this method will yield a measurement of g tT h with a statistical error of about 2%.
Now, let us turn to the main subject of this section, the measurement of σ(pp → qT h). This process can produce a number of signatures, depending on the decay chains of T and h. Having studied a number of possible final states, we identified two channels where the signal can be separated from SM background and measured precisely.

Channel | Selection Cuts | Results
(γγ) + 2j + X | p T (j k ) ≥ 400 GeV, k = 1, 2; p T (γ 1 ) ≥ 100 GeV; p T (γ 2 ) ≥ 50 GeV; m γγ ∈ [120, 130] GeV; m jj ∈ [1.0, 2.0] TeV | Table 5
(ℓ + ℓ − ) + 2j + X | Pre-cuts: p T (j k ) ≥ 400 GeV, k = 1, 2; |

pp → qhT → (γγ) + 2j + X Channel. In this channel, the Higgs boson decays to two photons. The top partner can decay in any of the available channels (T → W b/Ht/Zt), but both of its decay products are required to decay hadronically. Given that M T = 1.5 TeV is well above the weak scale, the decay products of the T are highly boosted. Each of their hadronic decays is typically reconstructed as a single jet, resulting in a final state with two high-p T jets. (The jet from t-channel production is typically not as hard as those from heavy T decay, so we choose to require two hard jets in the final state.) The principal irreducible SM backgrounds in this channel include hjj, tth, and γγjj. We do not include reducible (instrumental) backgrounds in our study.
In the Monte Carlo study of this signature, we apply the anti-k T jet reconstruction algorithm with cone size R = 1.0, and require the final state to contain at least two jets with p T > 400 GeV. We require at least two photons, with the leading photon p T > 100 GeV and the second-leading p T > 50 GeV. We further perform two "pre-cuts", requiring the invariant mass of the two photons to be between 120 and 130 GeV, and the invariant mass of the two jets to be between 1.0 and 2.0 TeV. The cuts for this and other analyses of this section are summarized in Table 4. The event numbers in the signal and background samples that survive the cuts are listed in Table 5. To further separate signal and background, we performed an analysis based on the Random Forest (RF) technique [43], implemented in scikit-learn [44,45]. RF is a machine-learning algorithm that aims to find a non-linear combination of input variables that is optimal for discriminating signal and background. The details of the RF analysis are presented in Appendix D. The RF discriminator is provided with 19 input variables for each event. These include kinematic variables (such as H T , E / T , jet and photon p T 's, etc.), as well as jet substructure information. For the latter, we use the CNN jet-tagger algorithm described in Appendix C. The jet tagger uses the pattern of energy deposits in hadron calorimeter (HCAL) cells inside a jet to classify it according to its origin: light quark or gluon ("QCD jets"), bottom quark, W/Z, Higgs, or top. The output for each jet is a set of probabilities P i for it to belong to each of the classes. In this analysis, we use the probabilities P QCD and P t , for the leading (in order of decreasing p T ) and first subleading jet. These are useful because the dominant backgrounds contain QCD or top jets, while signal events contain a W , Z or H jet (see Fig. 4). These four probabilities are provided as inputs to the RF discriminator.
Combining kinematic and jet-substructure information, the RF algorithm outputs a single number optimized to distinguish between signal and background, ξ RF . The distribution of events in ξ RF can then be fit by a sum of signal and background templates, providing a measurement of the signal rate. In this study, we have roughly approximated such a fit by simply counting signal and background events in the two bins in ξ RF with the highest S/B ratios, and combining the statistical significance for each bin. The two bins are referred to as RF I and RF II in Table 5. For 30 ab −1 integrated luminosity, the "best" bin RF I contains about 1500 events and S/B ∼ 1 (see Table 5). We conclude that the signal can be discovered in this channel with a very high statistical significance. Taking also into account the sensitivity of the cross section to g bT W and g tT h and the uncertainties in these couplings discussed above, and combining all errors in quadrature, we estimate the statistical uncertainty of the Y T T h determination from this channel at about 3%.
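The bin-counting approximation described above can be illustrated with a short sketch. It assumes the standard Asimov formula for the median Poisson significance of s signal events on top of b background events, and a combination of independent bins in quadrature; whether this coincides with the exact prescription used in the analysis is an assumption. The bin contents are hypothetical: the first is loosely modelled on a bin with about 1500 events and S/B ∼ 1, and the second is invented for illustration.

```python
import math

def asimov_significance(s, b):
    """Median Poisson significance for s signal events over b background events."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

def combine_in_quadrature(significances):
    """Combine significances of statistically independent bins."""
    return math.sqrt(sum(z * z for z in significances))

# Hypothetical (signal, background) counts, NOT the entries of Table 5.
bins = [(750.0, 750.0), (400.0, 1200.0)]

z_bins = [asimov_significance(s, b) for s, b in bins]
z_total = combine_in_quadrature(z_bins)

for (s, b), z in zip(bins, z_bins):
    print(f"s = {s:.0f}, b = {b:.0f}: Z = {z:.2f}")
print(f"combined: Z = {z_total:.2f}")
```

For s much smaller than b the formula reduces to the familiar s/√b, which is a convenient sanity check on any implementation.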
pp → qhT → (ℓ + ℓ − ) + 2j + X Channel. In this case, the top partner decays to Zt. The Z then decays leptonically, while the top decays hadronically and produces a single top-jet. The Higgs boson is typically highly boosted in the lab frame, and we consider the decay to bb, producing a single Higgs-jet in the final state. The major irreducible SM backgrounds are t q t q Z and Z jj with mis-identified jets.
In the Monte Carlo study of this signature, we again apply the anti-k T jet reconstruction algorithm with cone size R = 1.0, and require the final state to contain at least two jets with p T > 400 GeV. We further require two opposite-charge, same-flavor leptons, with invariant mass between 80 and 100 GeV, consistent with Z decay. We reconstruct the Z transverse momentum p Z T = p 1 T + p 2 T , and require p Z T > 400 GeV. The jet tagger algorithm is then applied. We require that the event contain exactly one top-tagged and one Higgs-tagged jet. (We choose the tagger parameters corresponding to 50% efficiency to identify the top and the Higgs jet, respectively. For mistag rates and further details, see Appendix C.) The event numbers passing these pre-cuts in a 30 ab −1 data sample are listed in Table 6. The invariant mass of the jet identified as top and the two leptons from Z decay is expected to match the top partner mass in the signal, and we require that this invariant mass be between 1 and 2 TeV ("Cut I" in Table 6). At this point, S/B ∼ 1/10 is achieved.
To further discriminate between signal and background, we once again construct an RF discriminator. In this case, the RF inputs do not include jet substructure information, since it has already been used. The best bin in RF discriminator output ξ RF , labeled RF I in Table 6, contains about 1200 signal events for a 30 ab −1 data sample, with excellent S/B ∼ 5. The measurement of the signal rate in this channel on its own can discriminate between the "correct" (as required by the sum rule) and "wrong" sign of Y T T h with a statistical significance of about 4σ. Combining the two channels considered in this subsection provides a total significance of over 6σ.

Top Partner Yukawa Coupling: pp → T thh → (γγ) + 3j + X
The second interesting channel sensitive to the sign of Y T T h is pp → T thh, see Fig. 2. We considered the final state in which one of the Higgs bosons decays to a photon pair, while all other decays are hadronic. Since the top and at least one of the Higgs bosons in the final state predominantly come from T decays, they tend to be highly boosted in the lab frame, and their decay products are reconstructed as a single massive jet. Likewise, hadronic decays of the second T are typically reconstructed as a pair of massive jets (in the tZ and th decay channels) or a massive jet and a light jet (in the bW channel). Therefore we require that the final state contain 2 photons and at least three jets, tagged as massive (non-QCD) jets by the jet tagger. (In this case, we do not use the jet tagger to discriminate among various types of massive jets such as Higgs, top, etc.) Further, we impose the following pre-cuts:
• p T ≥ 400 GeV for each of the three massive jets;
• Two-photon invariant mass consistent with Higgs decay: m γγ ∈ [115, 135] GeV;
• At least one out of three pair-wise combinations of the three massive jets has invariant mass roughly consistent with T decay: m j i j k ∈ [1000, 2000] GeV.
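The pre-cuts listed above can be expressed as a simple event filter. The sketch below is illustrative only: it assumes that jets and photons are given as four-vectors (E, p x , p y , p z ) in GeV, that the jets passed in have already been tagged as massive, and that the three leading jets above threshold and the two leading photons are the ones tested. The function names are hypothetical.

```python
import math
from itertools import combinations

def invariant_mass(*p4s):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz)."""
    e, px, py, pz = (sum(c) for c in zip(*p4s))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pt(p4):
    """Transverse momentum of a four-vector (E, px, py, pz)."""
    return math.hypot(p4[1], p4[2])

def passes_precuts(massive_jets, photons):
    """Apply the three pre-cuts to lists of jet and photon four-vectors."""
    # Cut 1: three massive jets with pT >= 400 GeV (the three hardest are used).
    hard = sorted((j for j in massive_jets if pt(j) >= 400.0), key=pt, reverse=True)
    if len(hard) < 3 or len(photons) < 2:
        return False
    jets = hard[:3]
    # Cut 2: diphoton mass consistent with a Higgs decay.
    lead_photons = sorted(photons, key=pt, reverse=True)[:2]
    if not 115.0 <= invariant_mass(*lead_photons) <= 135.0:
        return False
    # Cut 3: at least one jet pair with mass roughly consistent with a T decay.
    return any(1000.0 <= invariant_mass(a, b) <= 2000.0
               for a, b in combinations(jets, 2))

# Hypothetical event: two back-to-back photons and three hard massless jets.
photons = [(62.5, 62.5, 0.0, 0.0), (62.5, -62.5, 0.0, 0.0)]
jets = [(800.0, 800.0, 0.0, 0.0), (800.0, -800.0, 0.0, 0.0), (500.0, 0.0, 500.0, 0.0)]
print(passes_precuts(jets, photons))
```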
With these pre-cuts, we find that the most significant surviving SM backgrounds are ttjh and ttV h where V = W/Z, decaying hadronically. For the LH benchmark point parameters, S/B ∼ 1 after the pre-cuts (see Table 7). To further improve background rejection, we again apply the Random Forest algorithm; see Appendix D for details. The resulting statistical significance of the signal observation in this channel is about 14σ, while the LH model with flipped sign of Y T T h can be distinguished from the "correct" model at the 2.4σ level. While not as significant as the qhT channel, this channel can provide valuable independent confirmation of the sign of Y T T h . This channel will have a more important role to play in the MSCH model.

Summary: Testing the Sum Rule in LH Model
The analysis of this section established that at the chosen benchmark point in the LH model, the top partner mass and Yukawa coupling will be measured with statistical uncertainty of approximately 1% and 3%, respectively. Including systematic errors will likely somewhat degrade the precision, but it seems very likely that a few-% measurement of these quantities will be achieved. Of course, the SM top parameters will be by then known with much better precision. We conclude that the 100 TeV collider program with 30 ab −1 integrated luminosity will provide a quantitative test of the sum rule, Eq. (2.31), including the relative signs of the two terms, at a few-% level.

Table 7. First two columns: Signal and background event numbers in the (γγ) + 3j + X channel in a 30 ab −1 sample at a 100 TeV pp collider. Third column: Poisson significance [42] for discriminating signal+background model from pure background. Last column: Poisson significance for discriminating signal+background models with "correct" and "wrong" signs of Y T T h . For MSCH, we list the number of signal/total background events in the RF signal bin.

Maximally Symmetric Composite Higgs
In the MSCH model, both quadratic and logarithmic divergences are canceled, and there are two sum rules, Eqs. (2.41). There are three top partner mass eigenstates, t 1 , t 2 and t 3 . The masses and Yukawa couplings of these states need to be measured to test the sum rules. Since the t 1 has vanishing Yukawa coupling, it does not contribute to the sum rules.
In this section, we will focus on the phenomenology of t 2 and t 3 . The set of measurements that need to be performed for each of these states is essentially the same as for the top partner in the LH model discussed above. The only new feature is that the contributions of each of the top partners to the rates need to be disentangled to extract their parameters individually.
To assess the feasibility of such measurements at the 100 TeV collider, we performed a detailed Monte Carlo study, parallel to the case of the LH model described above. The simulations are based on our implementation of the MSCH model in FeynRules. (The model files are publicly available [46].) In this section we briefly summarize the results of this study.
The mass of both t 2 and t 3 can be fully reconstructed in the Zt decay channel, with a leptonic Z and a hadronic top forming a single top-jet. This reconstruction was studied in the MC simulation, with the signal given by the pair-production and decay of t 2 and t 3 in the benchmark MSCH model, and the rest of the analysis identical to the LH case discussed in Sec. 4.2.1. Effects of detector resolution are included in the analysis, and dominate the widths of the reconstructed resonances. The widths are sufficiently small for the two resonances to be clearly separated. The likelihood profiles for the masses of t 2 and t 3 are shown in the right panel of Fig. 3. Statistical precision of the measurement of each mass is estimated to be close to 1%, see Table 3.
The next step is to measure branching ratios. The procedure is the same as for the LH top partner, and the only new issue is how to separate the two states. Branching ratio measurements in the tZ and th channels rely on final states where the decay is fully reconstructed, and therefore events with t 2 and t 3 production can be easily distinguished from each other by the reconstructed mass. To measure the bW branching ratio, the LH analysis used the 2(W b) final state with two leptonic W 's, making mass reconstruction impossible. To avoid this problem, in the MSCH case we used a different final state, W b + Zt, in which the two top partners can be distinguished by their mass. The projected statistical uncertainties on the branching ratio measurements are summarized in Table 3.
The final step is to measure the cross sections of pp → qht and pp → T thh processes, and to extract Y t t h . Again, the analysis of the LH model applies here as well, with the only new issue being the simultaneous presence of multiple top partners. In all final states considered in the LH analysis above, the top partner decay is fully reconstructed, either as a pair of jets, or as a leptonic Z in association with a (massive) jet. It should be straightforward to distinguish the events involving various top partners on an event-by-event basis. Moreover, the three top partner masses would be measured independently and with high precision using the pair-production process, and this information can be used in these channels, e.g. to reduce combinatoric errors in mass reconstruction. For the estimates in Table 3, we assume perfect separation of the events involving different top partners, and analyze each one independently, using the procedure outlined in Section 4.2. We find that the best measurement of the Yukawa coupling is provided by different channels: in the case of t 2 , the best channel is t 2 thh → (γγ) + 3j + X, see Table 7; while for the case of t 3 , the best channel is qht 3 → qhZt, see Table 6. The reason can be seen from the production cross sections and branching ratios in Table 2. The resulting statistical error in the determination of the top partner Yukawa is about 7% for both t 2 and t 3 . This error includes the propagated error from mass and branching ratio measurements. We expect that systematic errors can be controlled at a similar or lower level, so that the total uncertainty on the Yukawas would be at most of order 10%. Given that the top partner mass measurements will have much smaller uncertainty, we conclude that the set of measurements outlined here will result in a quantitative test of the sum rules, including signs, at a 10% level.

Discussion
Naturalness of EWSB has been the main theoretical motivation for new particles and interactions appearing around the TeV scale. The main role of such new particles is to cancel the large quantum corrections to the Higgs potential and render it UV insensitive. These cancelations are the essence of the new physics models, and manifest themselves in naturalness sum rules: relations among the couplings and masses of the SM particles and their new particle partners. We have derived these sum rules for the first time and presented them both for same-spin and opposite-spin partners. We find a separate sum rule for the cancelation of the quadratic and logarithmic divergences. Depending on the particular model one or both may hold, and usually the sum rules are separately satisfied in the gauge and the fermionic sectors for same-spin partners. We presented both the very general form of the sum rules as well as an explicit demonstration of how these sum rules are obeyed in two benchmark models: the Little Higgs and the Maximally Symmetric Composite Higgs. The LH model satisfies only the first sum rule for the top sector, ensuring the cancelation of the quadratic divergences, while in the MSCH both sum rules for the quadratic and logarithmic divergences hold.
If and when new particles are discovered, testing whether their masses and couplings satisfy our sum rules would provide the most direct test of naturalness. Thus it is essential to establish effective strategies to experimentally test the sum rules at future colliders. In the second half of this paper we have outlined how this goal may be achieved at the proposed 100 TeV pp collider, using fermionic (spin-1/2) top partners in LH and MSCH models as a concrete example. We focused on measurements that are sensitive to the sign of the top partner coupling to the Higgs boson, since determining this sign is crucial for testing the sum rule. We performed a detailed Monte Carlo study of the relevant collider phenomenology, including the dominant SM backgrounds. Using traditional rectangular cuts as well as more advanced machine learning techniques, we constructed analyses that are capable of measuring top partner masses and their Yukawa couplings to the Higgs with excellent precision. Our findings can be summarized as follows.
Little Higgs model: Given the chosen benchmark point, the mass of the top partner can be measured with statistical uncertainty of 1%, while the top partner Yukawa can be determined with a 3% statistical uncertainty.
• In the pp → qhT → (γγ) + 2j + X channel our analysis reaches a very high (> 5σ) significance for top partner discovery, and the statistical significance of 4.5σ for the Yukawa sign determination.
• In the pp → qhT → (ℓ + ℓ − ) + 2j + X channel the analysis achieves a high (> 5σ) significance for discovery, and a 4.2σ significance for Yukawa sign determination.
• In the pp → T thh → (γγ) + 3j + X channel we achieve a high (> 5σ) significance for discovery, and a 2.4σ significance for Yukawa sign determination.
Maximally Symmetric Composite Higgs model: In the MSCH model there are three top partner mass eigenstates, t 1 , t 2 and t 3 . The lightest mass eigenstate t 1 has a vanishing Yukawa, Y t 1 t 1 h = 0, and does not contribute to the sum rules, so we concentrate our analysis on t 2 and t 3 . We showed that the mass of each top partner can be reconstructed using the Zt decay channel. The widths are small enough for the two resonances to be clearly separated. The statistical uncertainty for the measurement of the top partner masses is about 1%, while their Yukawa couplings Y t t h can be determined with a 7% statistical uncertainty.
• In the pp → qht 3 → (ℓ + ℓ − ) + 2j + X channel the analysis achieves a high (> 5σ) significance for discovery of t 3 , and a 3.8σ statistical significance for t 3 Yukawa sign determination.
• In the pp → t 2 thh → (γγ) + 3j + X channel, we achieve a high (> 5σ) statistical significance for discovery of t 2 , as well as for the determination of the sign of its Yukawa coupling.
Taken together, our results indicate that, if one of the new physics models studied here is realized in nature, the 100 TeV collider will be able to provide a highly non-trivial quantitative test of the naturalness sum rules. Thus, the sum rules derived in this paper are not only theoretically interesting as a general, elegant consequence of naturalness of electroweak symmetry breaking, but may also provide a key experimental test of this concept.
Following the analysis we performed in the top sector, we again find that the Higgs h tadpole in V g after EWSB is proportional to (A.3) which is driving the H quadratically divergent Higgs mass term before EWSB. If the gauge contributions to the Higgs mass are not quadratically divergent, a sum rule for the terms linear in h in V g must be satisfied up to O(ṽ 2 /f 2 ) We know the term ∂M 2 V /∂h| h=0 is the Higgs linear coupling matrix to vector boson pairs in the EW eigenstate basis. We can rotate the vector bosons to mass eigenstates and the Higgs linear coupling matrix in mass eigenstates as where L g is the rotation matrix that diagonalizes the vector boson masses. Thus we obtain a sum rule for the linear couplings of the Higgs to the gauge bosons: The sum rule for the vanishing of the log divergent contributions from vector boson loops can be obtained in the same way. The condition for the vanishing of the log divergences is The Higgs tadpole after EWSB is again proportional to this term so the sum rule will be Rotating into the mass eigenstates, we can get the sum rule for physical states where M V is the mass matrix for physical vector bosons, very similar to that for the top sector.

B Derivation of Sum Rules with Opposite Spin Partners
In this Appendix we present the derivation of the sum rules if the cancellation is due to opposite-spin partners, as in SUSY models. Since SUSY models usually contain two Higgs doublets we will also assume this, but the final sum rule applies to models with an arbitrary number of Higgs doublets. For deriving our sum rule we will be treating the CP-even neutral component in each Higgs doublet as a background field. The quadratic divergences in the Coleman-Weinberg potential from loops of spin j particles can be parametrized as where M 2 S is the scalar mass matrix. If the mass matrix of the two Higgs doublets contains no quadratic divergence we must have

Table 8. Jet-finding efficiencies as a function of jet p T in the 6 SM samples used for training and validating the CNN tagger.

B.1 Top/stop sector
We assume that the top partners are scalars (stops) or vectors. The sum rule in terms of the physical couplings of the top and its partners to the Higgs is where g t̃ i t̃ i h and g Ṽ j Ṽ j h are the Higgs trilinear couplings to stops t̃ i and vector top partners Ṽ j . There would be a similar sum rule in other fermionic sectors.

B.2 Gauge/Gaugino/Higgsino sector
We assume that the partners of the gauge bosons and the Higgs are fermions (gauginos and Higgsinos) and they can mix with each other after EWSB. The sum rule including the effects of the neutralinos and charginos is where g S + S − h and g S 0 S 0 h are the Higgs trilinear coupling matrices to electrically charged and neutral scalars in mass eigenstates, Y C + C − h and Y N N h are the Higgs trilinear coupling matrices to charged and neutral fermions in mass eigenstates, M C and M N are the diagonal mass matrices of the physical charginos and neutralinos and g W + W − h and g ZZh are the Higgs trilinear couplings to the SM W ± and Z bosons. If there is a separate cancelation in the electrically charged and neutral sectors the two lines in (B.8) will separately vanish. Note that the g S 0 S 0 h term includes the Higgs self-coupling.

C Jet Tagging
Jet tagging plays a very important role in the collider analysis of Section 4. In this Appendix, we describe the jet-tagging algorithm that was constructed and used in that analysis. The algorithm is based on a Convolutional Neural Network (CNN), which combines calorimeter and tracker information for each jet to assign probabilities that the jet is due to a top, Higgs, or W/Z decay, a b quark, or is a "light" jet (originating from a u/d/s/c quark or a gluon). To train and test the jet tagger algorithm, we generated large event samples for SM pair-production processes, pp → XX with X = j, b, t, h, W, Z. The samples were generated by MadGraph (parton-level production and final-state decays), Pythia (showering and hadronization), and Delphes (fast detector simulation). In the case of t/h/W/Z final states, fully hadronic decays were simulated, and lower p T thresholds were chosen sufficiently high so that the decay products are typically collimated. The anti-k T jet algorithm [47], with jet clustering radius R = 1.0, is applied to these samples. We used the FastJet [48] implementation of the anti-k T algorithm. Jet-finding efficiencies for each event class are shown in Table 8. Two representative events from the tt sample are shown in Fig. 5.
To classify the jets, each of them is converted into a "tensor image", using the following procedure:
• Choose a coordinate system centered at the center of the jet, so that (η = 0, φ = 0).
• Consider a square region in the (η, φ) plane, of size 1.0×1.0, centered at the origin. By construction, the clustered jet is fully (or almost fully) contained within this region. Divide this image into 50 × 50 equal-sized pixels. (The angular size of each pixel roughly corresponds to the resolution expected for calorimeters at a future hadron collider.)
• For each pixel, record the total p T of particles that fall within that region, as well as the multiplicity, for both Track and Tower classes from Delphes, which correspond to the charged and neutral particles detected by the detector.
• Combine this information in a tensor with dimensions 50 × 50 × 4.
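The pixelization procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used in the analysis: constituents are assumed to be given as (η, φ, p T ) triples already measured relative to the jet axis, the channel ordering is an arbitrary choice, and the function name is hypothetical.

```python
def jet_image(tracks, towers, n_pix=50, half_width=0.5):
    """Build an n_pix x n_pix x 4 nested-list image of one jet.

    tracks, towers: iterables of (eta, phi, pt), relative to the jet axis.
    Channels (assumed ordering): 0 = track pT sum, 1 = track multiplicity,
    2 = tower pT sum, 3 = tower multiplicity.
    """
    image = [[[0.0] * 4 for _ in range(n_pix)] for _ in range(n_pix)]
    pixel = 2.0 * half_width / n_pix
    for offset, constituents in ((0, tracks), (2, towers)):
        for eta, phi, pt in constituents:
            inside = (-half_width <= eta < half_width
                      and -half_width <= phi < half_width)
            if not inside:
                continue  # constituent falls outside the 1.0 x 1.0 window
            # Clamp the index to guard against floating-point rounding at the edge.
            i = min(int((eta + half_width) / pixel), n_pix - 1)
            j = min(int((phi + half_width) / pixel), n_pix - 1)
            image[i][j][offset] += pt
            image[i][j][offset + 1] += 1.0
    return image

# Hypothetical constituents: two nearby tracks, one tower inside the window,
# and one tower outside it.
tracks = [(0.01, 0.01, 10.0), (0.01, 0.01, 5.0)]
towers = [(-0.49, 0.49, 20.0), (0.7, 0.0, 99.0)]
image = jet_image(tracks, towers)
print(len(image), len(image[0]), len(image[0][0]))
```

Storing p T sums and multiplicities as separate channels keeps the tracker and calorimeter information distinguishable, which is what allows a network to treat the image as a four-channel input.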
The tensor images of jets serve as inputs to the CNN constructed using the PyTorch [49] environment. The CNN consists of the following elements:
• 4 convolutional layers with a ReLU activation function;
• Two Maxpooling layers;
• Classification block layers, including two linear layers with Dropout of 50% and ReLU activation function. The final linear layer classifies the jet images into 6 categories: top fat jet, Higgs fat jet, W fat jet, Z fat jet, b fat jet and light jet.

Figure 6. Jet tagging performance, shown as ROC curves, for the first region selection 400 < p fatjet T < 800 GeV. The panels show the jet tagging efficiency for top (top left), Higgs (top right), Z (bottom left) and W (bottom right) jets. The y-axis shows 1 minus the mistag rate. The plot labels display the misidentification rates at the 50% and 80% benchmark points.
Further, each jet is assigned randomly to either the training sample or the validation sample. In each sample, jets are divided into three bins according to their p T : 200 GeV < p jet T < 400 GeV, 400 GeV < p jet T < 800 GeV and p jet T > 800 GeV. Jets with p jet T < 200 GeV are discarded, since they are not expected to exhibit relevant sub-structure. The CNN is trained using the training sample, separately for each p T bin. The performance is then tested using jets in the validation sample. The tagger performance can be characterized by the Receiver Operating Characteristic curve (ROC curve). For each pair of jet classes a and b, the ROC curve shows the "tag efficiency" (probability of correctly tagging a jet of class a as a) on the horizontal axis, and 1 − mistag rate (where the mistag rate is the probability of incorrectly tagging a jet of class b as a) on the vertical axis. Benchmark working points used in the collider analysis correspond to 50% and 80% efficiency for the relevant jet class (t, H, or W/Z, depending on the analysis). These benchmark points are labeled on each of the curves, and the corresponding mistag rates are listed in the plot legend.

Figure 8. Probability distribution to identify a jet as light, for 400 GeV < p jet T < 800 GeV (left) and 800 GeV < p jet T (right). The blue dashed line is the cut selected to reject 95% of the QCD jets (light jets plus bottom jets).
An important task for the tagger is to reduce QCD contamination in samples of jets from massive particle decays. The performance of the CNN tagger is illustrated in Fig. 8, which shows the distribution of probabilities that a given jet is "light", for jets in each of the six SM samples. The QCD contamination can be reduced by 95% with cuts shown by blue lines in the figure, while maintaining high efficiency to tag a massive-object jet in all cases.
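The notion of a working point used in this Appendix (a fixed tag efficiency and the corresponding mistag rate) can be made concrete with a short sketch. It assumes that each jet carries a single tagger score for the class of interest, with higher scores being more signal-like; the score lists and function names are hypothetical.

```python
def threshold_for_efficiency(signal_scores, efficiency):
    """Score cut that keeps approximately the requested fraction of signal jets."""
    ranked = sorted(signal_scores, reverse=True)
    n_keep = max(1, round(efficiency * len(ranked)))
    return ranked[n_keep - 1]

def pass_fraction(scores, cut):
    """Fraction of jets whose score is at or above the cut."""
    return sum(s >= cut for s in scores) / len(scores)

# Hypothetical tagger scores for true jets of class a and for jets of class b.
class_a_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
class_b_scores = [0.65, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.0, 0.0]

cut = threshold_for_efficiency(class_a_scores, 0.5)
efficiency = pass_fraction(class_a_scores, cut)
mistag_rate = pass_fraction(class_b_scores, cut)
print(f"cut = {cut}, efficiency = {efficiency}, mistag rate = {mistag_rate}")
```

Scanning the cut over all score values and plotting the efficiency against one minus the mistag rate traces out a ROC curve of the kind described above.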

D Random Forest Classifier
The key goal of the analyses described in Section 4 is to discriminate between signal events (due to top partner production and decays) and background events (due to SM processes). To maximize the use of kinematic information, we make use of a multivariable machine learning technique, the Random Forest (RF) Classifier [43]. RF Classifier algorithms are widely used in high-energy physics and other fields. Their well-known advantages include parallelization, which greatly reduces CPU time consumption, and robustness against overfitting. To construct the RF Classifier used in this paper, we used the Scikit-Learn [44] software framework. The RF was constructed to have 10,000 estimators, with the other hyper-parameters left at their default values since they do not have a strong influence on our analysis.
In the following, we will discuss the application of the RF algorithm to the (γγ)+2j+X channel in the LH model; applications to other channels in Section 4 are similar. For this channel, our dataset consists of 4 event samples, one signal and three background (hjj, tth, γγjj). We select 19 kinematic observables, listed below, as inputs to the RF algorithm. The RF combines the 19 inputs in an optimal way, determined via training on labeled signal and background samples, to return a single number between 0 and 1. RF output close to 1 corresponds to signal while RF output close to 0 corresponds to background.
Figure 9. RF output for the training (bars) and testing (points) samples. The dot-dashed green lines correspond to the "RF I" and "RF II" signal bins; see Table 5.
Table 9. Same as Table 5, but with a simpler RF with only 8 input observables.
The input observables chosen for this analysis are as follows:
1. [...]
12. $n_b$: Number of tagged b-jets;
13. $P^0_t$, $P^1_t$: Probabilities that the leading/second-leading fat jet is a top jet;
14. $P^0_j$, $P^1_j$: Probabilities that the leading/second-leading fat jet is a light jet.
The last four probabilities are provided by the CNN top tagger, see Appendix C. Note that not all inputs are linearly independent: for example, This does not cause problems, since at worst, the RF algorithm will ignore the extra inputs. However, we find that in practice providing such extra inputs explicitly leads to improved performance, presumably because the algorithm can simply choose the best linear combination in each case without having to "invent" it during training.
To demonstrate that the RF classifier is not overfitting, we divide the event samples into "training" and "testing" sets, and compare the RF output between the two sets. In Fig. 9, the red/blue bars represent the RF output on the training signal/background samples, while the red/blue dots correspond to the testing signal/background samples. The training and testing set RF outputs are in excellent agreement. We conclude that the predictions made by the RF classifier are consistent and the RF model is not overfitting.
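The comparison between training and testing outputs can be sketched as follows, again on Gaussian toy data. The data, the 50/50 split, the binning, and in particular the `min_samples_leaf=50` setting are assumptions of this toy example and are not settings taken from the analysis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy events (hypothetical): 5 observables, signal shifted w.r.t. background.
rng = np.random.default_rng(2)
n_events, n_features = 4000, 5
X = np.vstack([rng.normal(0.8, 1.0, size=(n_events, n_features)),
               rng.normal(0.0, 1.0, size=(n_events, n_features))])
y = np.concatenate([np.ones(n_events, dtype=int), np.zeros(n_events, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

# min_samples_leaf is a choice made for this toy only.
rf = RandomForestClassifier(n_estimators=200, min_samples_leaf=50,
                            n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)
out_train = rf.predict_proba(X_train)[:, 1]
out_test = rf.predict_proba(X_test)[:, 1]

bins = np.linspace(0.0, 1.0, 11)


def binned_fractions(values):
    """Fraction of events in each RF-output bin."""
    counts, _ = np.histogram(values, bins=bins)
    return counts / counts.sum()


# Largest bin-by-bin difference between training and testing distributions,
# separately for signal (label 1) and background (label 0).
max_diff = {}
for label, name in [(1, "signal"), (0, "background")]:
    f_train = binned_fractions(out_train[y_train == label])
    f_test = binned_fractions(out_test[y_test == label])
    max_diff[name] = float(np.max(np.abs(f_train - f_test)))

print(max_diff)
```

Small bin-by-bin differences indicate that the classifier responds to unseen events in the same way as to the events it was trained on.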
To gain some insight into the relative importance of the input observables for signal/background discrimination, we computed the Gini importance for each of the observables. The results are shown in Fig. 10. While no single observable dominates, we note that eight of the observables, shown explicitly in Fig. 10, together dominate the analysis. To confirm this, we repeated the significance analysis of Table 5, using only the eight leading observables as RF inputs. The results in Table 9 confirm that the significances of signal observation and sign flip discrimination are only modestly degraded compared to the full analysis with 19 inputs.
Figure 10. RF inputs classified by their Gini importance. The "others" slice includes the 11 variables not shown explicitly.
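In Scikit-Learn, the Gini importance of each input is available from a fitted forest through its `feature_importances_` attribute, which is normalized to sum to one. The sketch below ranks 19 toy inputs and sums the importance of the 11 lowest-ranked ones. The toy data, in which only the first eight inputs carry any separation, is an assumption of this example and does not reflect the actual observables.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy events (hypothetical): 19 observables, of which only the first 8
# differ between signal and background.
rng = np.random.default_rng(3)
n_events, n_features = 3000, 19
shifts = np.zeros(n_features)
shifts[:8] = 1.0
X = np.vstack([rng.normal(shifts, 1.0, size=(n_events, n_features)),
               rng.normal(0.0, 1.0, size=(n_events, n_features))])
y = np.concatenate([np.ones(n_events, dtype=int), np.zeros(n_events, dtype=int)])

rf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
rf.fit(X, y)

# Gini (mean decrease in impurity) importance, one value per input.
importances = rf.feature_importances_

# Rank inputs from most to least important and keep the eight leading ones.
order = np.argsort(importances)[::-1]
top8 = order[:8]
others_share = float(importances[order[8:]].sum())

for idx in top8:
    print(f"observable {idx}: {importances[idx]:.3f}")
print(f"others (11 inputs): {others_share:.3f}")
```

A reduced classifier of the kind used for the cross-check could then be trained on `X[:, top8]` alone.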