Heavy Higgs Bosons at 14 TeV and 100 TeV

Searching for Higgs bosons beyond the Standard Model (BSM) is one of the most important missions for hadron colliders. As a landmark of BSM physics, the MSSM Higgs sector at the LHC is expected to be tested up to the scale of the decoupling limit of O(1) TeV, except for a wedge region centered around $\tan\beta \sim 3 -10$, which is known to be difficult to probe. In this article, we present a dedicated study testing the decoupled MSSM Higgs sector at the LHC and a next-generation $pp$-collider. We propose to search in channels with associated Higgs production, with the neutral and charged Higgs further decaying into $tt$ and $tb$, respectively. In the case of neutral Higgs we are able to probe the so far uncovered wedge region via $pp\to bb H/A \to bbtt$. Additionally, we cover the high $\tan\beta$ range with $pp\to bb H/A \to bb\tau\tau$. The combination of these searches with channels dedicated to the low $\tan\beta$ region, such as $pp\to H/A \to tt$ and $pp\to tt H/A \to tttt$, potentially covers the full $\tan\beta$ range. The search for charged Higgs has a slightly smaller sensitivity for the moderate $\tan\beta$ region, but additionally probes the higher and lower $\tan\beta$ regions with even greater sensitivity, via $pp\to tb H^\pm \to tbtb$. While the LHC will be able to probe the whole $\tan\beta$ range for Higgs masses of O(1) TeV by combining these channels, we show that a future 100 TeV $pp$-collider has the potential to push the sensitivity reach up to $\sim \mathcal O(10)$ TeV. In order to deal with the novel kinematics of top quarks produced by heavy Higgs decays, the multivariate Boosted Decision Tree (BDT) method is applied in our collider analyses. The BDT-based tagging efficiencies of both hadronic and leptonic top-jets, and their mutual fake rates as well as the faking rates by other jets ($h$, $Z$, $W$, $b$, etc.) are also presented.


Introduction
With the discovery of a 125 GeV Higgs boson and rapid progress made at the Large Hadron Collider (LHC), it is time for the high-energy physics community to chart a road-map for the next decade or the next few decades. In addition to a high-luminosity LHC program, two preliminary proposals, the Future hadron-hadron Circular Collider (FCC-hh) program at CERN [1] and the Super-pp-Collider (SppC) [2] program in China, have been made, both of which involve construction of a 50-100 TeV pp collider [3] (below we will universally assume a 100 TeV machine). With more accumulated data or a higher energy scale, the high-luminosity LHC and the next-generation pp-colliders offer great opportunities for the search for physics up to and beyond the TeV scale, respectively, including additional Higgs bosons. (For recent studies on physics at a next-generation pp-collider, see e.g. [4][5][6][7][8][9][10][11][12][13][14][15].) An extended Higgs sector is common in physics beyond the Standard Model (BSM), such as supersymmetric theories [16] or composite Higgs models [17], because of requirements by either symmetry or phenomenology. The additional Higgs fields fill up singlet, doublet, triplet or some other representations of electroweak (EW) gauge symmetry, generically yielding interactions with the SM sector which are characterized by their electric charges and CP-structures. Searching for these new Higgs bosons, therefore, provides an unambiguous way to probe for new physics and is one of the top missions of hadron colliders.
JHEP11(2015)124
At the LHC, both the ATLAS and the CMS collaborations have started their searches for neutral, singly-charged or doubly-charged Higgs bosons, in various channels [18][19][20][21][22][23]. Within the next decade, tests up to O(1) TeV are expected in a general context. Additionally, the energy scale accessible to the next-generation pp-collider is several times higher than that of the LHC, which enables us to search for a much heavier Higgs sector. In both cases, some decay modes that are kinematically suppressed in the low mass domain are fully switched on. Concurrently, the kinematics of their decay products can change dramatically. A systematic study of decoupled or heavy Higgs sectors at hadron colliders is therefore essential for both the high-luminosity LHC program and the proposals for next-generation pp-colliders.
This motivates the studies in this article. We focus on searches for neutral and singly-charged heavy Higgs bosons, and analyze the sensitivity reach that might be achieved at both the (HL-)LHC and a 100 TeV pp collider, using the Minimal Supersymmetric Standard Model (MSSM) for illustration. In the MSSM, there are in total five Higgs bosons: three of them are neutral (two CP-even and one CP-odd, or three CP-mixing ones) and two of them are charged. Because of the limitation in the energy reach, the LHC searches mainly focus on a mass domain below the TeV scale. For constraints on the MSSM Higgs sector based on these searches see e.g. [24,25]. For searches including light gauginos and higgsinos see for example [26,27].
For neutral Higgs bosons, pp → bbH/A → bbττ in the large tan β region (tan β > 10), and pp → H → VV, hh, tt together with pp → A → hZ, tt in the small tan β region (tan β < 3), yield or might yield the best sensitivities. The moderate tan β region (tan β ∼ 3-10), in contrast, is difficult to probe, yielding a well-known untouched "wedge" region (for recent discussions, see e.g. [24,28]). As for charged Higgs bosons, pp → tbH± → tbτν plays a crucial role in probing the small tan β region with m_{H±} < m_t + m_b as well as the large tan β region, because the coupling of H± with τν is tan β-enhanced.
In the decoupling limit, the decays H/A → tt and H± → tb are fully switched on if the SUSY sector is decoupled as well, whereas the decays H/A → VV, hh, A → hZ, and H± → hW± are generically suppressed [29]. Br(H/A → tt) becomes sizable for moderate tan β and dominant for low tan β, and Br(H± → tb) becomes dominant for the whole tan β region. However, it is known that the signal in the channel pp → H/A → tt and the QCD tt background have strong interference effects [30,31]. The search in this channel is therefore extremely challenging. Instead, we propose in this article to test a heavy Higgs sector in channels with associated Higgs production, where the interference effects between the signal and the QCD background are much less severe than in pp → H/A → tt.
Table 1: Main channels to cover or potentially cover various tan β regions in the decoupling limit of the MSSM Higgs sector. The channels marked by "*" are not covered in collider analyses in this article.
Interestingly, the channels pp → bbH/A → bbtt and pp → ttH/A → tttt have a cross section maximized in the moderate tan β and low tan β regions, respectively. A combination of the channels pp → bbH/A → bbττ (or pp → bH/A → bbb [32]), pp → bbH/A → bbtt and pp → ttH/A → tttt or pp → H/A → tt (if the interference structure can be efficiently identified) thus may yield a full coverage of the tan β domain in searching for neutral Higgs bosons, if the moderate and low tan β regions can be probed efficiently in the two latter channels. For the charged Higgs boson searches, pp → tbH± → tbtb is the golden channel for both high and low tan β domains. The main channels for the MSSM Higgs searches, which will be explored in this paper (except the ones marked with "*"), are summarized in table 1. These channels are characterized by two classes of kinematics:
1. Kinematics related to the heaviness of the BSM Higgs bosons, such as highly boosted top quarks in Higgs decay. Looking into the internal structure of the boosted objects or defining boostness-based variables can efficiently suppress the related backgrounds.
2. Kinematics related to the particles accompanying Higgs production, e.g., the forwardness/backwardness of the two b-jets accompanying the Higgs production in the channels pp → bbH/A → bbtt and pp → tbH± → tbtb, and the reconstructed top pair accompanying the Higgs production in pp → ttH/A → tttt.
Meanwhile, the kinematic features of the particles accompanying Higgs production can efficiently help reveal the m_tt or m_tb peak of the signals.
To fully exploit the potential of a pp collider, particularly one at 100 TeV, in searching for the decoupled MSSM Higgs sector, we make use of the kinematic features discussed above by applying the multivariate Boosted Decision Tree (BDT) method to the channels involving top quarks. As a demonstration of the effectiveness of the BDT method, we present, in addition to the sensitivity reach of the pp colliders at 14 and 100 TeV, the tagging efficiencies of both hadronic and leptonic top-jets, and their mutual fake rates as well as the faking rates by other jets (h, Z, W, b, etc.). The code implementing our analyses, BoCA 0.1, is publicly available [33]. Although the studies focus on the MSSM Higgs sector for its theoretical motivation and experimental representativeness, the sensitivity reach can be projected straightforwardly to other contexts such as two Higgs doublet models and composite Higgs models. The strategies developed for these Higgs searches might be applied to other collider analyses as well.
We organize this article in the following way. We briefly review the MSSM Higgs sector in the decoupling limit in section 2, and introduce the strategies for constructing the BDT-based top-jet taggers (both hadronic and leptonic) and testing the decoupled MSSM Higgs sector at pp colliders in section 3. The discovery reaches and exclusion limits which might be achieved at 14 TeV and 100 TeV are presented in section 4. We summarize our studies and point out potential directions for exploration after this work in section 5. More details on the collider analyses and the construction of the BDT-based top-taggers are provided in the appendices.

The Higgs sector in the MSSM
In the MSSM, the Higgs mass spectrum and the Higgs couplings with the SM sector and among themselves depend, at tree level, on only two free parameters, often chosen to be the Higgs vacuum expectation value (VEV) alignment $\tan\beta = v_2/v_1$ and the mass of the CP-odd neutral Higgs boson m_A (or the mass of the charged Higgs boson m_{H±}) in the case with no CP-violation [29]. In comparison, the general type II 2HDM depends on more parameters at tree level [34,35]. The additional free parameters include the masses of the three other Higgs bosons, and the mixing angle α defined within the CP-even neutral sector, if CP-symmetry is conserved [35]. In the MSSM these parameters are correlated with each other. For example, the mixing angle α is fixed by tan β and m_A through the tree-level relation (see e.g. [36])
$$\tan 2\alpha = \tan 2\beta \, \frac{m_A^2 + m_Z^2}{m_A^2 - m_Z^2}.$$
In addition to coupling with the SM sector and with themselves, the MSSM Higgs bosons can couple with superparticles directly, which may significantly alter their production and decays. Instead of giving a full phenomenological consideration, we assume a decoupled SUSY sector [37]. We neglect supersymmetric corrections to conventional Higgs production and decays, tolerate a potential deviation of the SM-like Higgs mass from the observed value, and turn off the Higgs decays into superparticles. Such a treatment makes the analyses less model-dependent, and enables us to project the sensitivity reach to a specific model more easily.
In the decoupling limit, the couplings take the values of the 2HDM alignment limit, yielding a generic suppression in sensitivity for the search modes H → VV, A → hZ, and H± → hW± (see table 2). As for H → hh, though the involved coupling g_Hhh is not completely suppressed in the alignment limit, its decay width is inversely proportional to m_H, yielding a suppressed sensitivity for this mode in the case of large m_H (see e.g. [24]). In the decoupling limit, therefore, H/A mainly decay into bb and ττ for large and intermediate tan β and into tt for intermediate and low tan β. H± overwhelmingly decays into tb, except in the parameter region with relatively small m_{H±} and large tan β, where the branching ratio of H± into τν is not negligibly small. The total decay width of these Higgs bosons, in units of their mass, is at the percent level or even below.
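The approach to the alignment limit can be checked numerically. The following minimal sketch assumes the standard tree-level MSSM expressions for sin 2α and cos 2α in terms of tan β, m_A and m_Z (equivalent to tan 2α = tan 2β (m_A² + m_Z²)/(m_A² − m_Z²) with −π/2 < α < 0) and the H couplings listed in table 2, read as ratios to the SM Higgs couplings. The function names are illustrative and radiative corrections are ignored.

```python
import math

M_Z = 91.1876  # Z boson mass in GeV

def mixing_angle(tan_beta, m_a, m_z=M_Z):
    """Tree-level CP-even mixing angle alpha in the range (-pi/2, 0)."""
    beta = math.atan(tan_beta)
    # sin(2a) ~ -sin(2b) (mA^2 + mZ^2), cos(2a) ~ -cos(2b) (mA^2 - mZ^2);
    # the common positive factor 1/(mH^2 - mh^2) drops out of atan2.
    sin_2a = -math.sin(2 * beta) * (m_a**2 + m_z**2)
    cos_2a = -math.cos(2 * beta) * (m_a**2 - m_z**2)
    return 0.5 * math.atan2(sin_2a, cos_2a)

def heavy_higgs_couplings(tan_beta, m_a):
    """Couplings of H as in table 2 (assumed normalised to the SM values)."""
    beta = math.atan(tan_beta)
    alpha = mixing_angle(tan_beta, m_a)
    return {
        "g_HVV": math.cos(beta - alpha),
        "g_Htt": math.sin(alpha) / math.sin(beta),
        "g_Hbb": math.cos(alpha) / math.cos(beta),
    }

# Scan m_A at fixed tan(beta) = 5 to watch g_HVV vanish
for m_a in (300.0, 1000.0, 10000.0):
    c = heavy_higgs_couplings(5.0, m_a)
    print(m_a, {k: round(v, 4) for k, v in c.items()})
```

As m_A grows, g_HVV tends to zero while g_Htt and g_Hbb approach −cot β and tan β, which is the pattern behind the suppressed H → VV mode and the tan β dependence of the tt and bb modes.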

Couplings     MSSM
H   g_HVV     cos(β − α)
    g_Htt     sin α / sin β
    g_Hbb     cos α / cos β
    g_Hττ     cos α / cos β
The contours for the branching fractions of H/A → tt and H± → tb are shown in figure 1. The figure indicates a clear tan β suppression for these branching fractions. Compared to Br(H/A → tt), however, Br(H± → tb) is much less suppressed in the moderate and large tan β regions. This is mainly because the coupling g_{H±tb} is at tree level a linear combination of the t and b Yukawa couplings, with the latter yielding a tan β-enhanced contribution to the H± decay width. In spite of this, Br(H± → tb) is relatively small in the upper-left corner, due to additional phase-space suppression.
For similar reasons, the production of these BSM Higgs bosons via vector boson fusion or Higgs-strahlung is highly suppressed. Gluon fusion is the most important production mechanism of H/A in the low tan β region, and bb associated production becomes dominant in the moderate and large tan β regions (see figure 2). The mild enhancement of σ(pp → H/A) in the large tan β region is mainly caused by the contribution from the b quark loop. Naturally we expect that pp → H/A → tt and pp → bbH/A → bbττ, bbbb yield the best sensitivity in the low and large tan β regions, respectively. One subtlety, however, arises from the interference between the signal and the QCD background. It has been known for a while that the peak of the invariant mass m_tt distribution for pp → H/A → tt can be distorted into a rather complicated peak-dip structure or even be smoothed away, as m_A increases [30,31]. This potentially disables any resonance-reconstruction-based analysis, and necessitates the development of new strategies. Such an exploration is beyond the scope of this paper. Instead, we propose to search in channels with bb or tt associated Higgs production, where the interference effects are less severe.
A remarkable observation, however, is that the moderate tan β region can be probed more efficiently via the channel pp → bbH/A → bbtt than the large and small tan β regions. Explicitly, we present the contours for σ(pp → bbH/A)Br(H/A → tt) at a 100 TeV pp collider in figure 3. For reference, we also present the contours of σ(pp → H/A)Br(H/A → tt), with the interference effects neglected. Indeed, σ(pp → bbH/A)Br(H/A → tt) is maximized in the moderate tan β region. Though its largest value for m_A ∼ 1 TeV is one to two orders of magnitude smaller than that of σ(pp → H/A)Br(H/A → tt), they become comparable as m_A increases towards ∼ 10 TeV. Given that the channel pp → bbH/A → bbtt carries more kinematic features due to the two additional b quarks, we may well expect that this channel can yield a high sensitivity in probing the moderate tan β region. This is true, as will be illustrated below. A combination of these channels leaves no "wedge" open around moderate tan β, given an m_A within the energy reach of the LHC or the 100 TeV pp collider, if the low tan β region can be efficiently probed.
The H± production is dominated by pp → tbH± in both the large and small tan β regions (see figure 2). This is simply because the coupling g_{H±tb} receives both tan β-enhanced and tan β-suppressed contributions at tree level. Given that H± → tb is the main decay mode in the decoupling limit, the channel pp → tbH± → tbtb is expected to be the golden channel for the heavy H± searches. The contours for σ(pp → btH±)Br(H± → tb) at a 100 TeV pp collider are also presented in figure 3. Because of the effects discussed above, the 10 ab contour, for example, reaches m_{H±} ∼ 20 TeV for H±, in comparison to m_A ∼ 10 TeV. The contours for σ(pp → bbH/A)Br(H/A → tt) and σ(pp → H/A)Br(H/A → tt) and for σ(pp → btH±)Br(H± → tb) in the MSSM at the LHC are presented in figure 4. As a comparison, the 10 ab contours recede to m_A ∼ 2 TeV and m_{H±} ∼ 4 TeV, respectively. These contours help in understanding the LHC and HL-LHC sensitivities, which will be discussed below.
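The interplay of the tan β-enhanced and tan β-suppressed contributions can be illustrated with a simple estimate. The sketch below assumes the usual tree-level scaling of the squared H±tb coupling, m_t² cot²β + m_b² tan²β, with phase-space factors and the running of the quark masses ignored. The mass values and function names are illustrative assumptions, not inputs of the analysis.

```python
import math

M_TOP = 173.0     # GeV, illustrative pole-mass value
M_BOTTOM = 4.18   # GeV, illustrative; a running mass would shift the minimum

def coupling_squared(tan_beta, m_t=M_TOP, m_b=M_BOTTOM):
    """Squared H+- t b coupling up to a common factor."""
    return (m_t / tan_beta) ** 2 + (m_b * tan_beta) ** 2

def tan_beta_at_minimum(m_t=M_TOP, m_b=M_BOTTOM):
    """Setting the derivative to zero gives tan(beta)^4 = mt^2 / mb^2."""
    return math.sqrt(m_t / m_b)

t_min = tan_beta_at_minimum()
for tb in (1.0, 3.0, t_min, 10.0, 50.0):
    ratio = coupling_squared(tb) / coupling_squared(t_min)
    print(f"tan beta = {tb:5.2f}: coupling^2 / minimum = {ratio:7.2f}")
```

With these inputs the minimum lies at tan β ≈ 6.4, inside the moderate tan β region, and the coupling grows on either side of it.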

Collider analyses and BDT
In this article we mainly analyze the discovery reaches and exclusion limits of searches for both neutral and charged Higgs bosons at a 100 TeV pp-collider. We cover the channels summarized in table 1, except for the channel pp → bbH/A → bbbb, given that its coverage over parameter space largely overlaps with that of the channel pp → bbH/A → bbττ, and the channels pp → H/A → tt and pp → ttH/A → tttt, which are potentially sensitive in probing the low tan β region. The corresponding backgrounds are summarized in table 3. Because of their significance, we also present the pp → bbH/A → bbtt and pp → tbH± → tbtb sensitivities that might be achieved at the LHC. The analyses are largely based on two classes of kinematic features. The first one is related to the heaviness of Higgs bosons in the decoupling limit. In such a case, the decay products of the Higgs are strongly boosted, resulting e.g. for H/A → tt in two boosted tops and for H± → tb in one boosted top. In the analyses with intermediate top quarks, we assume that the more boosted one decays leptonically and optimize the analyses based on this assumption. The distributions of the hardest top quark and the hardest lepton are depicted in figures 5a and 5b, respectively. The decay products of these tops tend to lie within a single jet cone (cf. figure 5c), yielding a top jet. Additionally, the events have a large scalar sum of transverse momenta (H_T). These features become less prominent for smaller Higgs masses, especially for the mass range accessible to the LHC.
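Whether the decay products of a top quark end up in a single jet can be estimated with the common rule of thumb ΔR ≈ 2m/p_T for the angular spread of the decay products of a particle of mass m. The sketch below applies it to a jet radius of 0.5. The rule of thumb, the top mass value and the function names are assumptions for illustration and are not taken from the analysis itself.

```python
M_TOP = 173.0  # GeV, illustrative value

def decay_opening_angle(mass, pt):
    """Rule-of-thumb angular spread of the decay products: dR ~ 2 m / pT."""
    return 2.0 * mass / pt

def min_pt_for_single_jet(mass, jet_radius):
    """Transverse momentum above which the decay roughly fits in one jet cone."""
    return 2.0 * mass / jet_radius

def is_boosted_top(pt, jet_radius=0.5, mass=M_TOP):
    """True if the estimated opening angle is smaller than the jet radius."""
    return decay_opening_angle(mass, pt) < jet_radius

print(min_pt_for_single_jet(M_TOP, 0.5))               # 692.0 (GeV)
print(is_boosted_top(300.0), is_boosted_top(1000.0))   # False True
```

The resulting threshold of roughly 700 GeV indicates why tops from TeV-scale Higgs decays are naturally treated as top jets, while tops in the LHC mass range are often only partially contained.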
Another class of kinematic features is related to the particles accompanying the Higgs production. For example, in the case of bbH/A, the two b-jets accompanying the Higgs are less boosted, but tend to have a large rapidity. Hence, the difference in rapidity (Δη) is typically larger than that of b-jet pairs in the main backgrounds ttbb and ttcc. Therefore, Δη can serve as a strong discriminator if the two accompanying b-jets are identified. The story is similar in the case of tbH±, where the second b-jet is from top decay. However, the rapidity of many of these b-jets is larger than the coverage of the tracker of an LHC-like detector design (|η| < 2.5), as is indicated in figure 6. Hence it is difficult to identify these b-jets. Even if these b-jets are tagged, we are still confronted by a combinatorial background caused by b-jets from Higgs decays, which are typically central and hence easier to tag.
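The two ingredients of this discussion, the rapidity difference and the tracker acceptance, can be written down in a few lines. This is a minimal sketch: the coverage values are the ones quoted in the text (|η| < 2.5 for an LHC-like tracker, |η| < 3.5 assumed for the 100 TeV detector), while the function names and the example jet are hypothetical.

```python
import math

TRACKER_COVERAGE = {"LHC": 2.5, "100TeV": 3.5}  # |eta| limits quoted in the text

def pseudorapidity(px, py, pz):
    """eta = asinh(pz / pT) for a momentum vector (px, py, pz)."""
    pt = math.hypot(px, py)
    return math.asinh(pz / pt)

def delta_eta(eta1, eta2):
    """Absolute difference in pseudorapidity of two jets."""
    return abs(eta1 - eta2)

def is_b_taggable(eta, collider):
    """A jet can only be b-tagged inside the tracker acceptance."""
    return abs(eta) < TRACKER_COVERAGE[collider]

# Hypothetical forward/backward b-jet pair
eta_a, eta_b = 3.1, -2.2
print(delta_eta(eta_a, eta_b))
print(is_b_taggable(eta_a, "LHC"), is_b_taggable(eta_a, "100TeV"))
```

In this example the forward jet is lost for b-tagging with an LHC-like tracker but recovered with the extended coverage, which is the motivation for the wider acceptance assumed at 100 TeV.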
Table 3: Cross-sections and generated luminosities of the relevant background processes after pre-cut. All tt backgrounds are matched up to two jets including b-jets. The pre-cut p_T(t) has been applied to both top quarks, or to the leading top quark and the leading non-top quark, for neutral or charged Higgs, respectively. The pre-cut p_T(l) has been applied to all event leptons including τ-leptons. In our analysis we apply a k-factor of 1.2 to the LHC cross-sections [47].
Figure 6: Rapidity distributions of the b-quarks accompanying Higgs production (a) and the distribution of the difference in rapidity Δη between these two b-quarks (b). The graphs show background (ttbb) and signal (bbH/A, tbH±) for 14 TeV and 100 TeV. We choose the b-quarks accompanying the top production in ttbb for comparison. For 14 TeV and 100 TeV a cut on the jet transverse momentum of 20 GeV and 40 GeV has been applied, respectively. The vertical line at |η| = 3.5 in (a) represents the tracker coverage assumed for a 100 TeV pp-collider; for jets with larger rapidity b-tagging is not applicable. The areas under all curves are normalized.
Addressing these difficulties is not easy. However, it provides an opportunity to explore potential guidelines for optimizing the detector design of a 100 TeV pp-collider, as well as for searching for heavy resonances at such a collider, given the essential role played by an extended Higgs sector in particle physics. In this article, we assume a tracker coverage of |η| < 3.5 for future 100 TeV collider detectors. In response to the boostness kinematics and in order to suppress combinatorial background, we apply a multivariate approach using BDTs. More explicitly, we apply a BDT in each reconstruction step, aiming for a full event reconstruction. Below are the steps:
1. Construct bottom BDT. With a BDT method, we are able to define b-like jets (similar for t-like jets) which are characterized by their likelihood to be a b-jet.
2. Construct top BDTs: one hadronic and one leptonic. We build BDT-based jet-taggers for boosted tops, which mostly originate from the Higgs decays (cf. figure 5c). In the case of less boosted tops we cannot expect all top decay products to lie inside a jet cone, and we have to reconstruct the W-bosons before we can use these to reconstruct the tops.
3. Construct the Higgs BDT using the reconstructed tops or one reconstructed top and one b-like jet for neutral or charged Higgs, respectively.
4. Construct a BDT to demand two b-like jets (or one top-like and one bottom-like jet) with large ∆η which accompany the Higgs production. We name it "bottom-fusion" BDT, though such kinematics may originate from a process other than bottom quark fusion.
5. Construct the BDT for the whole event by properly combining the Higgs BDT and the bottom-fusion BDT.
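The idea behind each of these BDTs can be illustrated with a toy example. The sketch below is not the BoCA implementation: it trains a small AdaBoost ensemble of depth-one decision trees on a handful of invented (Δη, H_T) points, simply to show how several weak cuts combine into a single response. All numbers and names are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted error, feature index, threshold, sign) of the best single cut."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (+1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[j] > thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, x):
    """Classify one event with a single cut: +1 signal-like, -1 background-like."""
    _, j, thr, sign = stump
    return sign if x[j] > thr else -sign

def train_bdt(X, y, n_trees=3):
    """AdaBoost: reweight misclassified events before training the next stump."""
    w = [1.0 / len(X)] * len(X)
    forest = []
    for _ in range(n_trees):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        forest.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return forest

def bdt_response(forest, x):
    """Weighted vote of all stumps, normalised to the range [-1, 1]."""
    norm = sum(alpha for alpha, _ in forest)
    return sum(alpha * stump_predict(s, x) for alpha, s in forest) / norm

# Invented toy events: (delta eta of accompanying b-jets, H_T in TeV)
signal = [(3.5, 2.0), (4.2, 1.5), (2.8, 2.5), (5.0, 1.8)]
background = [(0.5, 0.6), (1.2, 0.9), (0.8, 1.6), (3.0, 0.5)]
X = signal + background
y = [+1] * len(signal) + [-1] * len(background)

forest = train_bdt(X, y)
for x, label in zip(X, y):
    print(x, label, round(bdt_response(forest, x), 2))
```

No single cut on Δη or H_T separates these toy samples, but the weighted combination of three cuts does. A staged reconstruction would feed responses like this one from the bottom and top BDTs into the Higgs, bottom-fusion and event BDTs as additional input variables.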
For illustration, the reconstruction steps undertaken for the pp → bbH/A → bbtt events with heavily boosted top-pair and unboosted top-pair are depicted in figure 7a and in figure 7b, respectively. The transition from boosted to unboosted events is gradual, which is taken into consideration by the top BDT.
In the large tan β region, the decays into τ-leptons contribute to the searches for both neutral and charged Higgs bosons. We require the two τ leptons in pp → bbH/A → bbττ to decay semi-leptonically. In pp → tbH± → tbτν, we let both the top quark and the τ decay hadronically. Then we can use hard leptons from the τ decay or large transverse mass to suppress backgrounds in these two cases, respectively. Given their relatively simple collider kinematics, we apply a cut-based method in these two cases.
Our analysis framework is defined in the following. We use SoftSusy 3.5.1 [48] as generator for the MSSM Higgs spectrum and simulate all events with MadGraph. The subsequent decays are performed by Pythia 6.426 [49]. We use Delphes 3.1.2 [50] to simulate a detector with CMS geometry (except using a tracker coverage |η| < 3.5 for the 100 TeV pp-collider), with pile-up turned off. For LHC simulations we keep the default values. For cut-based analyses we rely on Delphes jet-tagging, which we tune according to Drell-Yan samples with a transverse momentum cut of 40 GeV. The b-tagging efficiency is 70 % with 20 % misidentification of c-jets. The τ-tagging efficiency is 60 % and the fake-rate is 1 %. For BDT-based analyses we cluster jets based on Delphes energy flow observables with FastJet 3.0.6 [51], with a radius of 0.5. In this case we switch off lepton isolation requirements, and instead require that the leading lepton has a transverse momentum larger than 50 GeV and 100 GeV, and that the missing energy is larger than 30 GeV and 60 GeV, for the LHC and the 100 TeV collider, respectively. We demand that jets have a transverse momentum larger than 20 GeV and 40 GeV for the LHC and the 100 TeV collider, respectively, in all cases. In order to reduce the number of background events, we apply a pre-cut on the top or lepton transverse momenta or the missing transverse energy.
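The object-level thresholds quoted above can be collected in a small configuration. The following is a minimal sketch of how such a pre-selection could be encoded. The class, the function names and the event layout are hypothetical and do not reflect the Delphes or BoCA interfaces.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PreSelection:
    lepton_pt_min: float    # GeV, leading lepton
    missing_et_min: float   # GeV
    jet_pt_min: float       # GeV
    tracker_eta_max: float  # |eta| coverage available for b-tagging

# Values quoted in the text for the BDT-based analyses
LHC_14TEV = PreSelection(50.0, 30.0, 20.0, 2.5)
PP_100TEV = PreSelection(100.0, 60.0, 40.0, 3.5)

def select_jets(jets, cuts):
    """Keep jets above the transverse-momentum threshold; jets are (pt, eta) pairs."""
    return [(pt, eta) for pt, eta in jets if pt > cuts.jet_pt_min]

def passes_preselection(leading_lepton_pt, missing_et, cuts):
    """Apply the leading-lepton and missing-energy requirements."""
    return (leading_lepton_pt > cuts.lepton_pt_min
            and missing_et > cuts.missing_et_min)

# Hypothetical event
jets = [(250.0, 0.3), (35.0, 2.9), (15.0, -1.0)]
print(select_jets(jets, LHC_14TEV))
print(select_jets(jets, PP_100TEV))
print(passes_preselection(80.0, 45.0, LHC_14TEV),
      passes_preselection(80.0, 45.0, PP_100TEV))
```

Keeping the two working points side by side makes explicit that the 100 TeV selection doubles each threshold of the LHC selection.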

BDT-based top-jet taggers
Each top BDT comprises a boosted top-jet tagger and modules for the reconstruction of unboosted tops. We apply both techniques to each top and rely on the technique with the better result, hence we choose the top jet with the best BDT response. The construction of top-jet taggers plays a crucial role in exploring physics with boosted tops. Thus we introduce in this section how to construct the hadronic and leptonic top-jet taggers, using a BDT method. For heavily boosted hadronic top jets the detector resolution may not allow for using sub-jet information. Hence we rely solely on the jet mass and variables making use of the secondary vertex information (cf. appendix B.1). In order to suppress the fake rate of leptonically decaying tops in the hadronic top tagger, we veto against a hard lepton inside the jet cone. If the top-jets are not too heavily boosted, we make use of the kinematics of sub-jets [55], which are re-clustered using a k_T jet algorithm in an exclusive way. We consider the case where the top-jet consists of two sub-jets, which resemble a boosted W-jet and a b-jet respectively, and the case where the top-jet consists of three sub-jets. We assume that the cell resolution of a typical detector is around ΔR ≈ 0.1, which imposes an upper limit on the substructure resolution. We rely solely on traditional kinematic variables of the jets and sub-jets. This can be further improved with variables based on more advanced substructure information, such as pull [56] or di-polarity [57], which should help to suppress the fake rate of jets with a different color-flow compared to top jets.
In the case of leptonic top jets, it becomes crucial to identify the lepton together with the hadronic activity. As the lepton is not isolated, it has to serve as the foundation for a boosted leptonic top tagger [58], which poses more challenges in the case of electrons compared to muons. The development of the necessary non-isolated lepton taggers lies outside the scope of this paper. Therefore, we have assumed that it will be possible to identify leptons in a hadronic environment.
We have tested these top taggers on various backgrounds. The resulting misidentification rates as a function of the top-tagging rate are presented in figure 8. We present the misidentification rate in two bins, in which all jets are generated with a parton-level pre-cut of p_T^j > 500 GeV and p_T^j > 1000 GeV, respectively. Additionally, we require the jets to fall into a transverse momentum window of 500 (700) GeV < p_T^j < 1000 GeV and 1000 GeV < p_T^j < 1500 GeV, respectively. The lower edge of the first window differs slightly between the leptonic (500 GeV) and the hadronic (700 GeV) tagger, in order to compensate for the missing transverse momentum in the leptonic case. For the hadronic top tagger the most important background consists of Higgs jets followed by b, c and h jets. For the leptonic top tagger the most important backgrounds are hadronic top and b jets. We stress that the faking rates shown in figure 8 are defined inclusively, i.e. the BDT background is defined as a combination of all listed background jets. At an exclusive level, the fake rates can be further optimized.
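A curve of misidentification rate versus tagging rate is obtained by scanning the cut on the tagger response. The minimal sketch below shows the bookkeeping for an inclusive background sample. The response values are invented and the function names are hypothetical.

```python
def tagging_rates(top_scores, fake_scores, threshold):
    """Fraction of true top jets and of other jets with BDT response above threshold."""
    efficiency = sum(s > threshold for s in top_scores) / len(top_scores)
    fake_rate = sum(s > threshold for s in fake_scores) / len(fake_scores)
    return efficiency, fake_rate

def working_points(top_scores, fake_scores, thresholds):
    """Scan the cut on the BDT response to trace fake rate versus tagging rate."""
    return [(thr, *tagging_rates(top_scores, fake_scores, thr)) for thr in thresholds]

# Invented BDT responses in [-1, 1]
top_scores = [0.9, 0.8, 0.6, 0.4, 0.1, -0.2]
fake_scores = [0.7, 0.2, -0.1, -0.4, -0.6, -0.9, -0.3, -0.8]

for thr, eff, fake in working_points(top_scores, fake_scores, [-0.5, 0.0, 0.5]):
    print(f"cut {thr:+.1f}: tagging rate {eff:.2f}, fake rate {fake:.3f}")
```

An exclusive fake rate follows from the same function by passing only the responses of one background jet type as `fake_scores`.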

Prospects at 14 and 100 TeV pp colliders
The exclusion limits and discovery reaches which are yielded by these channels at the LHC and a 100 TeV pp collider with a CMS-like detector (a tracker coverage of |η| < 3.5 is assumed for the 100 TeV machine) are presented in figure 9 and figure 10. We provide limits for the LHC with an integrated luminosity of 0.3 ab⁻¹ and 3 ab⁻¹, which correspond to the expected data collected after the third run and after an upgrade to the HL-LHC, respectively. For the 100 TeV collider we show the contours for a luminosity of 3 ab⁻¹ and 30 ab⁻¹. Though systematic errors are not considered, we believe that incorporating them will not qualitatively change the conclusions reached in this paper.
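To relate these luminosities to the cross-section contours shown earlier, a raw event count can be estimated as N = σ × Br × L. The sketch below is a back-of-the-envelope estimate before top decay branching ratios, acceptance and selection efficiencies, which are lumped into a single hypothetical `efficiency` factor. The 10 ab value is the reference contour discussed in section 2.

```python
def expected_events(cross_section_ab, luminosity_inv_ab, efficiency=1.0):
    """N = sigma x integrated luminosity x efficiency, with sigma in ab and L in ab^-1."""
    return cross_section_ab * luminosity_inv_ab * efficiency

LUMINOSITIES = {          # integrated luminosities quoted in the text, in ab^-1
    "LHC": (0.3, 3.0),
    "100 TeV": (3.0, 30.0),
}

SIGMA_TIMES_BR = 10.0     # ab, reference contour of sigma x Br

for collider, lumis in LUMINOSITIES.items():
    for lumi in lumis:
        print(collider, lumi, expected_events(SIGMA_TIMES_BR, lumi))
```

On the 10 ab contour this gives a few tens of raw signal events at 3 ab⁻¹ and a few hundred at 30 ab⁻¹, which sets the scale for how much background rejection the subsequent analysis steps must deliver.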
To allow a straightforward interpretation of these results in other extended Higgs sectors, e.g., the 2HDM, we present the exclusion limits and discovery reaches of the channels bbH/A → t_h t_l and tbH± → t_h b t_l b at 14 TeV and 100 TeV in figure 9, in a model-independent way. A generic feature of these sensitivity reaches is that they tend to be higher for a larger m_A or m_{H±} within the range under exploration. This is simply because the boostness kinematics plays a crucial role in suppressing the background.
Figure 8: [...] We consider only the case of highly boosted top jets and do not use sub-jet information. Quark-jets (including t, b, c and q = u, d) are produced in a Drell-Yan process; g-jets are produced via QCD processes; the SM h-jets are produced via gg → hh, with Br(h → bb) = 100% assumed; and W- and Z-jets are produced via di-boson production, and decay in a standard way. All samples are subject to a pre-cut on their transverse momenta, and are generated in the same size. We present two choices of pre-cuts for each tagger. In the hadronic case we generate the jets with a parton-level pre-cut of 500 GeV and 1000 GeV and additionally require the jets to fall into the transverse momentum windows 700 GeV < p_T^j < 1000 GeV and 1000 GeV < p_T^j < 1500 GeV, respectively. In the leptonic case we generate the jets with a parton-level pre-cut of 500 GeV and 1000 GeV and additionally require the jets to fall into the transverse momentum windows 500 GeV < p_T^j < 1000 GeV and 1000 GeV < p_T^j < 1500 GeV, respectively. We do not consider the effects caused by pile-up.
The interpretation of these results in the MSSM is presented in figure 10. For moderate tan β, the cross-section for the associated charged Higgs channel (cf. figure 3c) [...] the analyses reach a roughly comparable sensitivity for this parameter point, given that they share similar backgrounds for the setup under consideration. In the case of charged Higgs the cross section and the sensitivity increase when tan β moves away from this point. In the case of neutral Higgs the cross-section and sensitivity decrease when tan β moves away from this point. For the energies reachable by the LHC and a luminosity of 3 ab⁻¹, we are able to exclude the moderate tan β region up to ∼ 1 TeV, via pp → bbH/A → bbtt for neutral Higgs bosons and via pp → tbH± → tbtb for charged Higgs bosons.
The search for low mass charged Higgs bosons in the moderate tan β region is strongly affected by the pre-cuts on lepton transverse momentum and missing transverse energy, which leaves a small unprobed area that we hope to cover with a slightly improved analysis. This is indicated in figure 10. Combining with pp → bbH/A → bbττ and pp → ttH/A → tttt (or pp → H/A → tt), a full coverage of tan β might be achievable for neutral Higgs searches, though a dedicated analysis is yet to be done for the latter. As for the charged Higgs searches, the pp → H±tb → ttbb channel additionally covers the lower and higher tan β regions up to more than 2 TeV. The discovery regions are more tightly constrained: at the HL-LHC the charged Higgs can be discovered in the associated channel for high and low tan β up to ∼ 2 TeV.

At the 100 TeV collider, a combination of pp → bbH/A → bbττ and pp → bbH/A → bbtt pushes the exclusion limit for the neutral Higgs searches up to m_A ∼ 10 TeV, except for the low tan β region. For pp → bbH/A → bbtt, the low mass region has a worse sensitivity than the intermediate mass region, due to the pre-cuts on lepton transverse momentum and the missing transverse energy. In figure 10, we also present the sensitivity reach of the t_h t_l resonance search for reference (dashed red curves), with the interference effect between signal and background ignored. We should not, however, interpret it as the real reach for the neutral Higgs searches in the channel pp → H/A → tt at a 100 TeV collider, since the interference effect can dramatically change the resonance structure. Concurrently, pp → tbH± → tbtb pushes the exclusion limit for the charged Higgs searches up to m_{H±} ∼ 10 TeV with a full coverage of tan β, with additional coverage up to m_{H±} ∼ 20 TeV for both high and low tan β regions. Discovery of neutral and charged Higgs bosons will be possible up to 10 TeV and 10-20 TeV, respectively. In summary, with the channels under study, the HL-LHC and the 100 TeV pp-collider have the potential to push the sensitivity reach for H/A and H± from a scale of O(1) TeV up to a scale of O(10) TeV, respectively.

Summary and outlook
The BSM Higgs sector is one of the most important physics targets at the LHC and at next-generation pp colliders. In this article, we present a systematic study testing the MSSM Higgs sector in the decoupling limit, at 14 TeV and 100 TeV machines. We propose that the "top" decay channels (H/A → tt and H± → tb) in associated Higgs production should play an essential role. These channels are typically characterized by kinematics with highly boosted Higgs decay products and by a large forwardness/backwardness of the particles accompanying the Higgs production (pp → ttH/A is an exception, where tt is less forward or backward). Aided by a BDT method, the LHC will be able to cover the "wedge" region for neutral Higgs searches and hence test the Higgs sector up to O(1) TeV for both the high and moderate tan β regions. We show that a future 100 TeV pp-collider has the potential to push the sensitivity reach up to O(10) TeV.
The analyses pursued in this article represent a preliminary effort in this regard. A more complete study is definitely necessary. Below are several directions that we find interesting to explore:
• Although the channel pp → bbH/A → bbtt enables us to cover the "wedge" or moderate tan β region for neutral Higgs searches in the decoupling limit, its sensitivity is reduced below or around the threshold m_H/m_A = 2m_t, where this channel either becomes kinematically forbidden at an on-shell level or yields soft decay products. However, we may take similar strategies, combining the bbH/A production with some other dominant decay modes in the low tan β region, such as H → hh and A → hZ. Then the unprobed "wedge" region might be covered by the channels pp → bbH → bbhh and pp → bbA → bbhZ.
• A realistic analysis for probing the low tan β region in the neutral Higgs searches is still absent, largely because of the large interference effect between the pp → H/A → tt signal and the QCD tt background. To address this question, developing new strategies is necessary and important. One possible way out is to search for pp → ttH/A → tttt. Although its cross section is relatively small compared to the other channels, the strategies developed in this article might be of help. Both topics are under exploration. The results will be presented in a future article [59].
• Although the studies in this article are focused on the MSSM Higgs sector, it is straightforward to project their sensitivity reach to some other BSM scenarios, such as the 2HDM. Additionally, the exploration can be generalized to exotic search channels. Such channels might be switched on in a 2HDM, but are generically suppressed in the MSSM [60][61][62][63]. Dedicated analyses are certainly necessary. We leave this exploration to future work.
Note added: while this article was being finalized, the papers [64,65] appeared, which partially overlap with this one in evaluating the LHC sensitivities in searching for pp → bbH/A → bbtt and pp → tbH± → tbtb. However, we notice a difference between the LHC sensitivities obtained in [65] and those obtained in our analyses; the latter enable us to conclude that the LHC has the potential to fill up the well-known "wedge" (that is, the region with moderate tan β) up to ∼ 1 TeV, with 3000 fb^−1, via such channels. Moreover, we focus more on testing an extended Higgs sector at a future 100 TeV pp-collider, exploring the collider kinematics involved and developing its (BDT-based) search strategies, which are not covered in [65].

A Background generation
All background cross sections are calculated with MadGraph and given to leading order. In order to reduce the number of background events that must be generated, we apply different pre-cuts depending on the Higgs mass. For the pp → bbH/A → bbtt and pp → tbH± → tbtb analyses the inclusive tt background is generated with a pre-cut on the top transverse momentum p_T(t) of 300, 1500 and 2500 GeV for Higgs masses equal to or larger than 1000, 4000 and 6000 GeV, respectively. For the LHC analyses we have applied a pre-cut of 250 GeV for masses equal to or larger than 1000 GeV.

In the pp → bbH/A → bbττ analysis the irreducible background bbZ/γ* → bbττ and the reducible backgrounds ccZ/γ* → ccττ, tt → bbττνν and tt → bblτνν are considered. We generate background samples with a pre-cut on the lepton transverse momentum of 150 GeV for Higgs masses equal to or larger than 1000 GeV, and 700 GeV for Higgs masses equal to or larger than 3000 GeV.
For the analysis pp → btH± → btτν, the irreducible background is tt → bbjjτν and the reducible backgrounds are tt → bbττνν and tt → bblτνν. The background tt → bbjjτν can be suppressed by requiring a large missing transverse energy together with a large transverse mass. Hence we generate tt → bbjjτν with missing transverse energy larger than 300 GeV for Higgs masses larger than 1 TeV; for Higgs masses larger than 3 TeV, the irreducible background can be well suppressed. For tt → bbττνν and tt → bblτνν, the missing transverse energy is required to be equal to or larger than 300 GeV and 600 GeV for Higgs masses larger than 1 TeV and 3 TeV, respectively. All background cross sections and the generated luminosities are collected in table 3.

B BDT-based analyses
The analyses for pp → bbH/A → bbtt and pp → tbH ± → tbtb are based on BDT.

B.1 Boosted Decision Trees (BDT)
A decision tree consists of a series of cuts classifying an event as either signal or background. Subsequent cuts of the decision tree are also applied to events previously classified as background, yielding shapes in the parameter space which approximate the signal region much better than the usual rectangular cuts. The decision tree is trained on a training sample with truth level information; in a second stage it is applied to the testing sample without truth level information. A decision tree can lead to a perfect separation between signal and background in the training sample. However, this comes with an over-training effect, where subsidiary cuts deal only with statistical fluctuations of the training sample. This can be seen by comparing the efficiency of the decision tree between the training sample and a test sample. An enhancement of the decision tree that deals with this problem is the boosted decision tree. While the basic decision tree algorithm stays the same, it is applied multiple times, on re-weighted samples. The weighting factors are calculated from the region in parameter space where the decision tree has the lowest discriminating power. BDT techniques were introduced to high energy physics in [66] and have since been widely adopted, especially in the experimental community, including [67][68][69][70][71]. For a short introduction see [72]. We make use of the TMVA package of the ROOT framework. We use the default AdaBoost algorithm and apply bagging in order to further reduce over-training. Our analysis code BoCA 0.1 is publicly available [33].
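To make the re-weighting idea concrete, the following is a minimal sketch of discrete AdaBoost with one-cut decision trees ("stumps") in plain Python. It is not the TMVA implementation used in the analysis and it omits bagging; the function names (`best_stump`, `train_adaboost`, `bdt_response`), the toy data and the number of boosting rounds are illustrative assumptions.

```python
import math

def best_stump(xs, ys, ws):
    """Return (weighted error, feature, threshold, sign) of the best single cut."""
    best = None
    for f in range(len(xs[0])):
        for thr in sorted({x[f] for x in xs}):
            for sign in (+1, -1):
                # The stump predicts `sign` at or above the threshold, `-sign` below.
                err = sum(w for x, y, w in zip(xs, ys, ws)
                          if (sign if x[f] >= thr else -sign) != y)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def train_adaboost(xs, ys, rounds=10):
    """Train `rounds` stumps on labels ys in {+1 (signal), -1 (background)}."""
    n = len(xs)
    ws = [1.0 / n] * n          # start from uniform event weights
    ensemble = []
    for _ in range(rounds):
        err, f, thr, sign = best_stump(xs, ys, ws)
        err = min(max(err, 1e-12), 1.0 - 1e-12)   # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)  # weight of this stump
        ensemble.append((alpha, f, thr, sign))
        # Misclassified events are boosted, correctly classified ones are damped.
        ws = [w * math.exp(-alpha * y * (sign if x[f] >= thr else -sign))
              for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return ensemble

def bdt_response(ensemble, x):
    """Weighted vote of all stumps, normalised to the interval [-1, 1]."""
    score = sum(a * (s if x[f] >= t else -s) for a, f, t, s in ensemble)
    return score / sum(a for a, _, _, _ in ensemble)

# Toy sample: the signal sits in a band that no single cut can isolate.
toy_x = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
toy_y = [-1, -1, +1, +1, -1, -1]
toy_bdt = train_adaboost(toy_x, toy_y, rounds=3)
```

In this toy sample a single stump always misclassifies at least two events, whereas three boosted stumps separate signal from background; a positive `bdt_response` corresponds to a signal-like event.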
In order to classify the events of the training sample as signal or background we identify jets with their nearest quark. Additionally, we define a set of observables according to which the BDT is trained. In the testing and application phase the BDT is applied to a sample without truth level information and returns the signal likeliness of each element in this sample. We combine multiple BDTs tailored to reconstruct single particles into a chain, which we apply to our events. In a first step, we construct a simple bottom jet BDT tagger.

For boosted leptonic and hadronic top jets we develop two BDT taggers. For non-boosted tops, on the other hand, we first reconstruct hadronic and leptonic W-bosons and subsequently combine these W-bosons with a b-jet. In the next step we reconstruct the neutral or charged Higgs from two tops or from one b-jet and one top, respectively. Additionally, we search for the bottom-fusion jet pairs with a large difference in rapidity. Finally, we combine the reconstructed Higgs and the bottom-fusion pair and distinguish signal and background events with a last BDT analysis.

Bottom Tagger
In order to distinguish bottom jets from light jets we make use of the finite lifetime of the intermediate bottom mesons. A bottom jet has multiple displaced vertices with a larger invariant mass and energy fraction compared to light jets. Therefore, we train the bottom jet BDT on
Figure 11 shows the BDT-response of the leptonic top tagger to different jets with the same boostness. The light-, W- and Z-jet fake rate is mostly suppressed by the displacement information. The track multiplicity and vertex mass reduce the fake rate for h-jets, as the main decay channel into b-quarks yields more secondary vertices compared to a leptonic top jet. The jet mass distributions of the heavy quarks are centered around the respective quark and boson masses, which reduces the fake rates of these jets further. On the other hand, the jet masses of the light quarks and gluons form a very broad slope, maximized at zero, which makes it hard to suppress the fake rate of these jets with the jet mass information.
In figure 12 we present the BDT-response of the hadronic top tagger to different jets with the same boostness. Also in this case the main discriminator against light-, W- and Z-jets is the presence of secondary vertices. Higgs jets are suppressed by the energy fraction between tracks with and without secondary vertices. The presence of a hard lepton suppresses the leptonic top. In the case of less boosted hadronic top jets we make use of the information provided by the decay products. Hence, we probe whether the jet can be re-clustered into two sub-jets to resolve the b and W components, or into three sub-jets to additionally resolve the decay products of the W.

Top reconstruction
For non-boosted top jets we have to assume that the W and the b are located in two or three different jets. Hence, we perform a W reconstruction whenever necessary, which combines two (sub-)jets and probes on
• Invariant mass of the reconstructed object
• Momentum-space position of the reconstructed object (p_T, η, φ)
• Angular difference between both elements of the reconstructed object (∆η, ∆φ, ∆R)
• Momentum difference ∆p_T
• BDT response of the preceding reconstruction step; in the case of hadronic W-reconstruction this implies a veto on b-jets.
In a second step we combine the reconstructed or tagged W's with a b-jet into a top and train the top BDT on the same variables as in the W-BDT.
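The kinematic observables used for the W and top reconstruction can be sketched as follows. This is only an illustration, not the BoCA code: the `Jet` class, the function `pair_observables`, the use of absolute differences, and the dictionary keys are assumptions, and the BDT response of the preceding step is not included.

```python
import math
from dataclasses import dataclass

@dataclass
class Jet:
    pt: float          # transverse momentum in GeV
    eta: float         # pseudorapidity
    phi: float         # azimuthal angle in radians
    mass: float = 0.0  # jet mass in GeV

    def p4(self):
        """Return the four-momentum as (E, px, py, pz)."""
        px = self.pt * math.cos(self.phi)
        py = self.pt * math.sin(self.phi)
        pz = self.pt * math.sinh(self.eta)
        e = math.sqrt(px * px + py * py + pz * pz + self.mass ** 2)
        return e, px, py, pz

def pair_observables(a, b):
    """Observables of the object obtained by combining two (sub-)jets."""
    e, px, py, pz = (x + y for x, y in zip(a.p4(), b.p4()))
    pt = math.hypot(px, py)
    m2 = e * e - px * px - py * py - pz * pz
    # Wrap the azimuthal difference into [-pi, pi).
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    deta = a.eta - b.eta
    return {
        "mass": math.sqrt(max(m2, 0.0)),
        "pt": pt,
        # The pseudorapidity diverges for a pair with vanishing transverse momentum.
        "eta": math.asinh(pz / pt) if pt > 0 else math.copysign(math.inf, pz),
        "phi": math.atan2(py, px),
        "delta_eta": abs(deta),
        "delta_phi": abs(dphi),
        "delta_r": math.hypot(deta, dphi),
        "delta_pt": abs(a.pt - b.pt),
    }
```

The same function can be applied at each stage of the chain, for example to two sub-jets forming a W candidate and then to the W candidate and a b-jet forming a top candidate.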
Finally, we have to take the case of non-boosted leptonic tops into account. We consider the hardest lepton as possibly originating from a W decay. In order to reconstruct these leptonically decaying W-bosons, we reconstruct the neutrino four-momentum from the lepton momentum and the missing energy. Whenever the resulting quadratic equation has complex solutions, we iteratively move the missing energy vector towards the lepton until the solutions become real. Of these two solutions we train the BDT on the one that coincides better with the truth level neutrino.
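The neutrino reconstruction can be illustrated with the standard W-mass constraint for a massless lepton and neutrino, which gives a quadratic equation for the longitudinal neutrino momentum. The sketch below is one possible reading of the procedure: the W mass value, the step size, the linear interpolation of the missing transverse momentum towards the lepton transverse momentum, and the function names are assumptions, not details taken from the analysis code.

```python
import math

M_W = 80.4  # GeV, assumed value of the W mass

def neutrino_pz(lepton, met, m_w=M_W, step=0.05, max_iter=1000):
    """Solve the W-mass constraint for the neutrino longitudinal momentum.

    lepton: (px, py, pz) of the (massless) lepton; met: (px, py).
    Returns ((pz_minus, pz_plus), (met_px, met_py)), where the returned
    missing momentum may have been shifted to make the solutions real.
    """
    lx, ly, lz = lepton
    pt2_l = lx * lx + ly * ly
    e_l = math.sqrt(pt2_l + lz * lz)
    mx, my = met
    for _ in range(max_iter):
        mu = 0.5 * m_w ** 2 + lx * mx + ly * my
        disc = mu * mu - pt2_l * (mx * mx + my * my)
        if disc >= 0.0:
            root = e_l * math.sqrt(disc)
            return ((mu * lz - root) / pt2_l, (mu * lz + root) / pt2_l), (mx, my)
        # Complex solutions: move the missing momentum a small step
        # towards the lepton transverse momentum and try again.
        mx += step * (lx - mx)
        my += step * (ly - my)
    raise RuntimeError("no real solution found")

def closest_to_truth(solutions, truth_pz):
    """Pick the solution nearest to the truth-level neutrino (training only)."""
    return min(solutions, key=lambda pz: abs(pz - truth_pz))
```

The discriminant is negative when the transverse mass of the lepton and the missing momentum exceeds the assumed W mass; in the limit where the missing momentum coincides with the lepton transverse momentum the discriminant is positive, so the iteration terminates.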

Heavy Higgs reconstruction
For the Higgs reconstruction we combine two tops and one top with one bottom for neutral and charged Higgs, respectively. We train the BDT on the same variables as in the top reconstruction (cf. step 3).

Bottom Fusion BDT
Additionally, we train a BDT on the bottom fusion pair accompanying the Higgs. The most important discriminating variables in this case are the large rapidity difference between these two jets and their b-(top-) likeliness. We demonstrate the performance of this tagger on a vector boson fusion sample in figure 13a.

C Cut-based analyses
We analyze the processes pp → bbH/A → bbτ_h τ_l and pp → btH± → btτ_h ν at 100 TeV with a cut-based approach.

C.1 Neutral Higgs
For the semi-leptonic channel H/A → τ_l τ_h with a large Higgs mass, the signal contains one hard lepton and one hard jet possibly tagged as a τ-jet. We present the analysis for m_H/m_A = 3 TeV for illustration. The cuts applied in our analysis are
• Cut 1: One lepton with p_T^l > 700 GeV and veto on a second lepton with p_T^l > 25 GeV
• Cut 2: Exactly one oppositely charged (Q(τ) = −Q(l)) jet with τ-tag and p_T^τ > 700 GeV
where φ_l,miss is the azimuthal angle between the lepton and the missing transverse momentum. The cut flows for signal and background are presented in table 4. The resulting significance for exclusion is 1.86 σ for a luminosity of 3000 fb^−1. The resulting exclusion region at a 100 TeV collider is shown in figure 10.
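As an illustration of how the first two cuts listed above could be applied, the following sketch computes a cut flow over a list of events. The event layout (dictionaries with `leptons` and `jets` entries carrying `pt`, `charge` and `tau_tag`), the function names and the reading of "exactly one" as a count of τ-tagged jets passing all requirements are assumptions; only the thresholds are taken from the text, and later cuts are not covered.

```python
LEPTON_PT_MIN = 700.0   # GeV, hard-lepton requirement of Cut 1
LEPTON_VETO_PT = 25.0   # GeV, threshold for the second-lepton veto of Cut 1
TAU_PT_MIN = 700.0      # GeV, tau-jet requirement of Cut 2

def passes_cut1(event):
    """Exactly one lepton above the veto threshold, and it must be hard."""
    candidates = [l for l in event["leptons"] if l["pt"] > LEPTON_VETO_PT]
    return len(candidates) == 1 and candidates[0]["pt"] > LEPTON_PT_MIN

def passes_cut2(event):
    """Exactly one hard tau-tagged jet with charge opposite to the lepton."""
    if not event["leptons"]:
        return False
    lepton = max(event["leptons"], key=lambda l: l["pt"])
    taus = [j for j in event["jets"]
            if j["tau_tag"] and j["pt"] > TAU_PT_MIN
            and j["charge"] == -lepton["charge"]]
    return len(taus) == 1

def cut_flow(events):
    """Number of events before any cut and after each successive cut."""
    counts = [len(events)]
    survivors = events
    for cut in (passes_cut1, passes_cut2):
        survivors = [e for e in survivors if cut(e)]
        counts.append(len(survivors))
    return counts
```

Multiplying the surviving counts by the cross section and luminosity scale factor of each sample would then give expected event yields of the kind collected in a cut-flow table.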
Table 5: Cut flows for the pp → tbH± → tbτν signal and the backgrounds tt → bbττνν and tt → bblτνν with m_H± = 3 TeV, where f_s is the scale factor for a luminosity of 3000 fb^−1 and total cross section σ = 0.507 fb, corresponding to tan β = 10. The signal and background samples are generated with τ decaying inclusively.
For m H ± = 3 TeV, the irreducible background tt → bbjjτ ν and multi-jet background are well suppressed.

C.2 Charged Higgs
The kinematics for the channel pp → tbH± → tbτ_h ν with a large Higgs mass is characterized by large missing energy, a hard τ_had and a large m_T(τ_had, E_T^miss). These features suppress the irreducible and the multi-jet background, particularly in the large Higgs mass domain.
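The transverse mass is not defined explicitly in this appendix; a common definition for a massless visible object, assumed here, is m_T = sqrt(2 p_T E_T^miss (1 − cos ∆φ)), with ∆φ the azimuthal angle between the hadronic τ and the missing transverse momentum. A minimal sketch with a hypothetical function name:

```python
import math

def transverse_mass(pt_vis, phi_vis, met, phi_met):
    """Transverse mass of a massless visible object and the missing momentum.

    pt_vis, met: transverse momenta in GeV; phi_vis, phi_met: azimuthal
    angles in radians. Uses m_T^2 = 2 * pt_vis * met * (1 - cos(delta_phi)).
    """
    delta_phi = phi_vis - phi_met
    m2 = 2.0 * pt_vis * met * (1.0 - math.cos(delta_phi))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding
```

With this definition m_T vanishes when the two transverse momenta are collinear and is largest when they are back to back, which is why a lower cut on m_T favours the decay of a heavy resonance.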
We present the analysis for m_H± = 3 TeV for illustration. Below are the analysis cuts applied:
• Cut 1: ≥ 4 jets, with exactly two b-tagged, ∆η_bb ≥ 2.5, and veto of leptons with p_T > 25 GeV.
We present the cut flows for signal and background in table 5. The significance for exclusion is 2.25 σ for a luminosity of 3000 fb^−1. The cuts on E_T^miss, p_T(τ_had) and m_T are adjusted to optimize the analysis as m_H± varies. The resulting exclusion region at a 100 TeV collider is shown in figure 10.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.