Spotting hidden sectors with Higgs binoculars

We explore signals of new physics with two Higgs bosons and large missing transverse energy at the LHC. Such a signature is characteristic of models for dark matter or other secluded particles that couple to the standard model through an extended scalar sector. Our goal is to provide search strategies and an interpretation framework for this new signature that are applicable to a large class of models. To this end, we define simplified models of hidden sectors leading to two different event topologies: symmetric decay, i.e., pair-produced mediators decaying each into a Higgs plus invisible final state; and di-Higgs resonance, i.e., resonant Higgs-pair production recoiling against a pair of invisible particles. For both scenarios, we optimize the discovery potential by performing a multivariate analysis of final states with four bottom quarks and missing energy, employing state-of-the art machine learning algorithms for signal-background discrimination. We determine the parameter space that the LHC can test in both scenarios, thus facilitating an interpretation of our results in terms of complete models. Di-Higgs production with missing energy is competitive with other missing energy searches and thus provides a new opportunity to find hidden particles at the LHC.


Introduction
Postulating a hidden sector that interacts primarily with the Higgs boson is tempting for good reasons. Higgs couplings to new scalar standard-model (SM) singlets are renormalizable and secluded from visible matter [1,2]. An extended scalar sector can thus serve as a portal to a hidden sector [3][4][5][6]. At the LHC, the Higgs interaction with a hidden sector is best probed in signatures with one or two Higgs bosons. Searches for invisibly decaying Higgs bosons or for mono-Higgs production in association with missing transverse energy / E T are well-established parts of the LHC program. Invisible Higgs decays probe hidden sectors with particles significantly lighter than the Higgs boson. Mono-Higgs signals are often predicted in models that can also be probed in other channels, such as mono-jet production, mono-Z production, or signatures with missing energy and several leptons and/or jets. For a review of missing energy searches at the LHC, we refer the reader to refs. [7,8] and references therein.
A signal of two Higgs bosons and missing energy is naturally predicted in the context of supersymmetry (SUSY), for instance from Goldstino production in models with gauge mediated SUSY breaking [9,10], or from chain decays of superpartners into Higgs bosons and neutralinos in the minimal supersymmetric standard model (MSSM) and its extensions [11][12][13]. More generally, di-Higgs plus missing energy is a signature of models with extra scalars [14], such as a pseudo-scalar portal to a dark sector [15], axion-like -1 -

JHEP04(2019)160
particles [16], massive right-handed neutrinos [11,17] or in the framework of Little Higgs scenarios [11,18]. Experimental searches for di-Higgs plus / E T production at the LHC have been performed for a signal of four bottom quarks and missing energy in the context of SUSY [19,20]. This search targets a signature of Higgsino pair production, followed by a decay chain with Higgs bosons and Goldstinos in the final state [10]. Since the analysis is optimized for very light Goldstinos produced via this specific decay chain, its reinterpretation in other scenarios is limited. A systematic exploration of the di-Higgs plus / E T channel at the LHC is still lacking.
Our goal is to provide a minimal, simple framework to exploit the full potential of the LHC to search for new hidden sectors with a di-Higgs plus / E T signature. As a matter of fact, the search strategy for this signature strongly depends on the masses and decays of the relevant particles. Based on two main decay topologies, we define simplified models for pp → hhχχ production, where h is the SM Higgs boson and χ is invisible and stable at detector scales. Each model involves two scalar mediators B and A, where B couples to gluons and is heavier than two A scalars. The first model, referred to as symmetric topology, is inspired by electroweakino production in the MSSM. A pair of on-shell scalars A is produced from the decay of B. Each A subsequently decays into a Higgs boson and an invisible scalar χ. The di-Higgs signature is thus generated by pp → B → AA → (hχ)(hχ).
(1. 1) In the second model, referred to as resonant topology, each of the pair-produced scalars A decays into either two Higgs bosons or invisibly. The corresponding production chain is Such a topology is typical in scalar portal models. Since the definition of the two simplified models is based solely on the kinematic properties of the final state, LHC searches for these simple topologies can easily be recasted in terms of concrete models. Our analysis focuses on the Higgs decay into bottom quarks, h → bb, which maximizes the event rates. The signal thus consists of four b-jets and a large amount of missing transverse energy. To reconstruct the two Higgs bosons from the four b-jets, we will make ample use of the mature analysis techniques for di-Higgs searches without associated missing energy. Due to its sensitivity to the Higgs self-interaction, Higgs pair production in the SM (see refs. [21,22] for the latest experimental prospects) and beyond (for a review see e.g. ref. [23] and references therein) is a key target of the LHC program and proposed future colliders [24]. The prospects to observe a signal of Higgs pairs has evolved from "seemingly impossible" [25] to a detailed investigation of the final states bb τ + τ − [26], bb W W * [27] and bb bb [28]. This tremendous progress was triggered by exploiting novel techniques such as jet substructure and shower deconstruction [29][30][31]. Today these techniques are applied by the ATLAS and CMS collaborations in experimental analyses of Higgs pair production [32,33]. In our search for hhχχ production with four b-jets and missing energy, we will combine jet substructure techniques with a state-of-the-art multivariate analysis to optimize the sensitivity to our signal. For Higgs pair production in the SM, the channel with four b-jets -2 -

JHEP04(2019)160
does not have the best performance, due to an immense multi-jet background. In contrast, due to the presence of large missing energy in our signal, the largest background arises from electroweak gauge bosons plus jets, which lies a few orders of magnitude below the QCD multi-jet processes that appear in SM di-Higgs searches. We therefore focus on the four-bottom final state, leaving other decay channels for future exploration.
Our article is organized as follows. In section 2, we introduce simplified models for hhχχ production. In section 3, we discuss the main features of the di-Higgs plus / E T signature in our simplified models and the challenges we face in reconstructing the four-bottom final state, as well as triggering and backgrounds. We attempt a cut-based analysis and investigate its discovery prospects at the High Luminosity LHC (HL-LHC). In section 4, we explain the details of our multi-variate analysis and demonstrate a large gain in sensitivity compared to the cut-based analysis. We stress that in the context of our simplified model, new physics could be discovered first in the di-Higgs plus / E T channel, while being consistent with all existing (and future HL-)LHC searches. This highlights the importance of carrying out the proposed analysis. In section 5, we explore the validity of the dark matter interpretation of our model. We defer our conclusions to section 6.

Simplified models of a hidden scalar sector
In this section, we provide details on the two simplified models that give rise to the di-Higgs plus / E T signature, but with different final-state topologies. In the symmetric topology, the two Higgs bosons stem from a chain decay within the hidden sector, while in the resonant topology they form a di-Higgs resonance. Since the two setups have some structural similarities, we first discuss their common features. We then move on to describe the specific ingredients of the models that lead to the different topologies.
Both models rely on an extended scalar sector with three new real scalar particles A, B, and χ that are singlets under the SM gauge group, with a mass hierarchy m B m A m χ . The models also feature a discrete Z 2 symmetry under which particles belonging to the hidden sector are odd, while new particles in the visible sector as well as the SM particles are even. For the sake of simplicity, we assume that none of the new scalars develops a vacuum expectation value (VEV).
The heaviest of the three new scalars, B, is produced via gluon fusion at the LHC and predominantly decays to AA pairs. The relevant interaction terms are Here we introduce an effective dimension-five interaction of the scalar B with gluons such that it is resonantly produced via gg → B, in analogy to the dominant Higgs production channel in the SM. We discuss a renormalizable UV completion for this interaction in appendix A. The triple scalar coupling m BAA induces the decay B → AA with a branching ratio near 100%, unless m BAA is very small compared with the Higgs VEV v. For values of C Bgg originating from perturbative physics around the TeV scale, the decay into dijets via -3 -JHEP04(2019)160 symmetric topology resonant topology Figure 1. Topologies for a scalar s-channel resonance B decaying into di-Higgs plus / E T via a pair of scalars AA. If A decays via A → hχ, then the symmetric topology shown on the left emerges. The decays A → hh and A → χχ, on the other hand, lead to the resonant topology on the right. B → gg then occurs only at the percent level. We also suppress the decays B → χχ, B → Aχ, and B → hh by assuming the relevant couplings to be small. Note that a Bhh coupling would induce B-h mixing, which is severely bounded by measurements of the Higgs coupling strength [34]. As B is produced in gluon fusion, it necessarily belongs to the visible sector, i. e., it is even under the Z 2 symmetry. The scalar χ belongs to the hidden sector and is thus taken to be Z 2 -odd. As it is the lightest hidden particle, it is stable and appears as missing energy in the LHC detectors.
Depending on the Z 2 parity of A, two different event topologies for di-Higgs plus / E T can be distinguished, • Symmetric topology. If A is part of the hidden sector, i.e., Z 2 -odd, it decays via A → hχ. Di-Higgs plus / E T arises from a symmetric event topology with chain decay in the hidden sector.
• Resonant topology. If A instead belongs to the visible sector, i.e., if it is Z 2 -even, it can decay via A → hh and A → χχ. The di-Higgs plus / E T signature then arises from an asymmetric event topology and features a di-Higgs resonance. Figure 1 shows the event topologies for these two cases, which we now discuss in more detail.

Symmetric topology
In this model, both A and χ are odd under the discrete Z 2 symmetry, i. e. they belong to the hidden sector. The interaction 1 induces the decay A → hχ with a branching ratio of 100%. Due to their different Z 2 parity, neither A nor χ can mix with the Higgs boson and thus remain pure singlets under the SM gauge group. In particular, they do not couple to electroweak gauge bosons. The coupling λ AχHH induces mixing of A and χ after electroweak symmetry breaking. However, also the mass term m 2 Aχ Aχ contributes to this mixing. We conclude that the A-χ mixing and 1 Here H denotes the SM Higgs doublet and gauge-invariant field contractions are assumed.

JHEP04(2019)160
the decay A → hχ are governed by two independent parameters, so that the mixing can be set to zero without affecting the A → hχ decay in our signature. Since B predominantly decays into AA, and A decays exclusively into hχ, the process is the main discovery channel for the symmetric topology at the LHC. This model encompasses the well studied case of electroweakino production in the MSSM, with A and χ corresponding to fermions (for instance, Higgsino production as in ref. [10]). A renormalizable coupling connecting A, χ and the SM Higgs implies that at least one of the fermions is non-trivially charged under the SU(2) electroweak group. Hence new states with electric charges appear, which often provide the leading collider signatures for these scenarios: multi-leptons plus / E T (see e.g. ref. [35]), mono-jet plus soft leptons (see e.g. refs. [36][37][38][39]), disappearing tracks (for recent work see e.g. refs. [40][41][42][43]).

Resonant topology
In this model, both A and B are even under the Z 2 parity, while χ is odd. The coupling λ AχHH is therefore forbidden and the decay A → hχ is absent. Instead we introduce the couplings Both of these couplings were forbidden with the symmetry assignment leading to the symmetric topology. The coupling term m AHH induces a mixing of A with h after electroweak symmetry breaking. Unlike A-χ mixing in the symmetric model, A-h mixing is unavoidable here: it is induced by the same parameter as the decay A → hh that is part of the di-Higgs plus / E T signature. As a consequence, A inherits the couplings of h, inducing A → W W and A → ZZ as relevant decay modes. In the limit m A m h , m W , m Z , the decay rates fulfill the simple relation (2.5) The decay A → χχ is instead induced by the coupling m Aχχ , so that its decay rate can be treated as an independent parameter. To maximize the significance of the signature we assume a branching ratio of B(A → χχ) = 0.5. In addition to the di-Higgs plus / E T signature, this model also gives rise to signatures with di-boson resonances and signatures with four electroweak bosons V, V = W, Z, h forming two di-boson resonances, These signatures and di-Higgs plus / E T typically occur at similar rates. They complement each other in the search for scalar hidden sectors of this kind.

JHEP04(2019)160
In summary, the two simplified models can be conveniently described by the interaction Lagrangian In the symmetric topology, the couplings m AHH and m Aχχ are equal to zero, due to the Z 2 parity under which both A and χ are odd. In the resonant topology, on the other hand, λ AχHH = 0, as only χ belongs to the hidden sector. Based on the field content and symmetries of both models, additional terms could be added to the respective Lagrangians. Along this work, we will only consider those that are relevant for the collider phenomenology of the di-Higgs plus / E T signature. 2 Using eq. (2.8), we have implemented both Lagrangians into FeynRules [44] and used the Universal FeynRules Output (UFO) [45] for event generation.
Throughout our analysis, we fix the parameters in the hidden sector as follows, The value of C Bgg is motivated by a UV completion with a vector-like quark with mass m Q = Λ = 1 TeV and scalar coupling y Q = 1, as discussed in appendix A. In the resonant model, the relation between m AHH and m Aχχ ensures a 50% decay of A → χχ in the limit m A m h , m χ . The magnitude of these couplings does not affect the signal rate. In the symmetric model, the branching ratio of A → hχ is 100%, regardless of the size of the coupling λ AχHH .
3 Higgs-pair production with missing energy at the LHC In order to develop a search strategy for di-Higgs plus / E T at the LHC, we first analyze the kinematic features of this signature and their parameter dependence in each simplified model. We assume that B is resonantly produced. The signal rate is then well approximated by The production cross section σ(pp → B) can be obtained by rescaling the SM Higgs production cross section, as described in appendix A. For our numerical analysis, we generate signal events at the parton level with MG5 aMC@NLO 2.6.1 [46], using the NNPDF30 lo as 0118 nf 4 [47] parton distribution functions implemented in the LHAPDF 6.1.6 [48] interface. We employ Pythia 8.2 [49] for parton showering and hadronization and Delphes 3.3.3 [50] for a basic detector simulation, using the default implementation of the ATLAS detector. Crucial inputs for our analysis are the b-tagging efficiency, b , and the light and charm jet mistag rates, l and c , which are given by the following p T -dependent functions Hence, for a jet transverse momentum of p T (j) = 50 (250) GeV, we find b , l , c = 67, 0.24, 0.26 (73, 0.3, 1.0)%.

Kinematics and benchmarks
The kinematics of our signature is driven by the available phase space in the B and A decays. We parametrize this in either model in terms of the mass differences of the involved particles, symmetric: We fix the scalar mass m B = {1000, 750, 500} GeV and scan ∆ BA in steps of 75 GeV over the kinematically accessible region. In the symmetric model, ∆ Ahχ is scanned in steps of 100 GeV. We require ∆ Ahχ > 10 GeV to prevent too strong a phase-space suppression in A decays. In the resonant model, we fix m χ = 25 GeV to satisfy the different kinematic boundaries. 3 The so-obtained benchmark scenarios for the symmetric (S) and resonant (R) model are labelled as where the masses of the scalars are given in units of GeV. They are shown for both models in table 1, where we also give the corresponding signal rates, σ S and σ R , as defined in eq. (3.1). We have verified that B(B → AA) 97 % in both topologies for all benchmarks.
In addition to the di-Higgs plus / E T signature, our models induce processes like pp → B → jj and pp → B → γγ, as well as mono-jet, di-jet plus / E T , and mono-Higgs signatures. The highest sensitivity to our scenarios is expected in searches for top squarks, bottom squarks and gluinos [51][52][53][54], which focus on large / E T together with b-jets and light jets. Using CheckMATE2 [55], we have verified that all benchmarks evade existing LHC searches for these and similar processes at √ s = 8 and 13 TeV. Current searches for 4b + / E T lack sensitivity to our models, because they focus on phase-space regions that are sparsely populated by our signal.
We have also verified that future searches at the HL-LHC with 3 ab −1 of data will not be competitive with our final state. Since the 14 TeV analyses available in CheckMATE2 are only a handful, we have estimated the reach of the remaining analyses by naively rescaling the expected reach of the current 13 TeV studies with the square root of the luminosity. We conclude that our benchmarks are not only viable today, but also they will not be probed by future LHC searches. Di-Higgs plus / E T can thus be considered the discovery channel for these scenarios. From the resonant topology, we predict additional signatures with diboson resonances and missing energy, like W W + / E T and ZZ + / E T , as well as signatures with four electroweak bosons forming two di-boson resonances, cf. section 2. Since these signatures are expected to occur with similar rates as di-Higgs plus / E T production, they can serve as complementary discovery channels for the resonant topology. To illustrate the phenomenology of our simplified models, we use the specific benchmarks The first two benchmarks correspond to scenarios with little phase space for the B → AA decay (compressed spectrum), the second two with large phase space (split spectrum). In our benchmarks, a compressed (split) spectrum is parametrized by a small (large) ∆ BA , which determines the boost of A. The kinematic differences between the two models originate in the respective A decays, which are imprinted on the / E T distribution. In the left panel of figure 2, we show the / E T distribution for the four benchmarks, illustrating the effects of a compressed and split spectrum in each topology. We also present the transverse momentum distributions of the Higgs bosons at parton level, p T (h), for a split spectrum in the symmetric topology (center panel) and the resonant topology (right panel).
In the symmetric topology, the two A particles are produced back-to-back in the centerof-mass frame and split their transverse momentum into χ and h. The vector sum of their transverse momenta is thus subject to cancellations. The peak position of the / E T distribution depends on the available phase space in both the B and A decays, For larger values of / E T , the distribution drops fast. In the resonant topology, the / E T distribution is equal to the transverse momentum distribution of A. The peak of the / E T distribution depends now only on ∆ BA = m B − 2m A , and the spectrum is harder at large / E T than for the symmetric topology. A trigger on missing energy will thus favor one or the other topology, depending on the position of the / E T cut. The transverse momentum distribution of the Higgs bosons peaks at lower momenta than the / E T distribution. As can be seen by inspecting the transverse momenta of the Higgs bosons, the b-jets from Higgs decays are less likely to pass the trigger requirements, which typically imply strong cuts on the jet transverse momentum [56]. Triggering on missing transverse energy rather than on the b-jets yields a more efficient signal selection.

Jet substructure technique
Depending on the model and the mediator spectrum, the b-jets from Higgs decays can be produced with a large boost. The b-jets are thus collimated and cannot be resolved as individual jets. To reconstruct the boosted h → bb decays, we crucially rely on jet substructure techniques. The current substructure module in Delphes, SoftDrop, is a modified version of the BDRS algorithm [29] that includes b-tagging and flavor tagging for fat jets. To make the tool applicable for our purposes, we have extended these functionalities to subjets. Based on SoftDrop, we have developed two new modules called JetFlavorAssociationSubjets and BTaggingSubjets. 4 These modules allow us to access the four-momenta and b-tags of each fat jet in the event, and also of each subjet associated to it in the Delphes output. We will speak of "x-y b-tags" to describe an event selection where one fat jet contains at least x b-tagged subjets and another fat jet contains at least y b-tagged subjets. The performance of our tagging technique depends on the fat-jet radius, R. We use R = 1.2 for the symmetric and R = 0.6 for the resonant topology. 5 Due to the limited b-tagging efficiency and rejection efficiency of light (i.e., non-btagged) jets, as well as the jet rapidity cut of |η b | < 2.5, not all of the four b-subjets in our signal will be tagged. To quantify this statement, we show in figure 3 the number of b-tagged subjets, N b , versus the number of light jets, N j , for an exemplary benchmark of the resonant topology, R 1000 275 25. Other resonant benchmarks show a similar pattern, and the behavior is similar for the symmetric topology. It is apparent that most of the time only three or fewer b-subjets are reconstructed. Note, however, that the amount of missing b-jets is larger than the naive estimation from the plain b-tagging efficiency. This loss is due to a significant number of reconstructed jets that are either soft, i.e., carry p T (j) < 20 GeV, 4 The corresponding code can be obtained from the authors upon request. 5 The choice of these values and their impact on the analysis will be explained at the end of this subsection.  Now we turn our attention to the ∆R min bb distribution. As expected, the parton shower barely changes the collinearity of the b-quarks. Therefore, if b-jets are not reconstructed as such, this is due to the characteristic mass spectrum of the model, rather than parton showering. The ∆R min bb distribution depends both on the boost of the Higgs bosons and on the event topology. In a split spectrum, the Higgs bosons are more boosted, so that b-quarks from the same Higgs decay are closer to each other. In a compressed spectrum, the Higgs bosons are softer and the b-quarks are emitted with a larger angular separation. Naively one might thus think that smaller values of ∆R min bb are preferred in the symmetric model, where the Higgs bosons carry larger transverse momenta. In figure 4, however, we observe the opposite behaviour. This is due to the different event topology: in the resonant model, the two Higgs bosons stem from the decay of one A boson and are thus much closer than in the symmetric model, where they originate from opposite sides of the event. Consequently in the resonant model all four b-jets tend to be collimated, while in the symmetric model the two pairs of b-jets are well separated. We hence conclude that only in the symmetric model ∆R min bb is a direct measure of the Higgs transverse momentum. In the resonant model, on the other hand, the closest b-jets do not always stem from the same Higgs decay, so that ∆R min bb is also sensitive to the boost of A. In the symmetric model, the parton level requirement ∆R bb > 0.4 only cuts away a few percent of the signal events. In contrast, in the resonant model this cut has an important impact on the signal. This loss of events, together with the lower total event rates discussed in section 2, suggests that the resonant topology is harder to find that the symmetric one for a given particle spectrum. The fact that the maximum of the ∆R min bb distribution in the resonant model lies at lower values than in the symmetric model motivates different choices of fat-jet radii. We use R = 1.2 for the symmetric and R = 0.6 for the resonant topology.

Backgrounds and cutflow analysis
The main SM backgrounds to our signal of 4b + / E T are due to W + jets, Z + jets, as well as top-antitop production with one leptonic and one hadronic decay. All backgrounds have been simulated using Sherpa 2.2.1 [57] at leading order (LO) in QCD, including parton shower and hadronization effects. We use the same setup as for the signal generation, as described at the end of section 3.1. Our analysis has been performed with ROOT [58].
The cutflow analysis is summarized for the symmetric benchmark S 750 350 25 in figure 5 and for the resonant benchmark R 750 350 25 in figure 6. In the right panel of each figure, we list the cross section for the dominant background processes for / E T > 200 GeV and after applying a lepton veto and requiring at least two b-tagged subjets from fat jets with radius R = 1.2 (symmetric topology) and R = 0.6 (resonant topology), respectively. We have checked that contributions from di-boson plus jets production are smaller. The latter will be neglected in our analysis.
To discriminate between our signal and the backgrounds, we apply a cut-and-count procedure. Throughout our analysis, we apply an initial cut of / E T > 200 GeV and a lepton veto. To study the impact of our b-tagging technique, we request various x-y b-tags with 0 ≤ x, y ≤ 2, one by one. 7 We furthermore apply an optional Higgs Mass Window (HMW) 7 The upper limit on x, y is due to the fact that we only consider the two hardest subjets within a given fat-jet. by requesting that the mass of each of the two identified fat jets lies within the window 75 GeV < m J < 175 GeV. In addition, we allow for a variable lower cut on / E T (in steps of 50 GeV). 8 To optimize the choice of the jet radius, we have carried out our analysis for four different fat-jet radii R = 0.6, 0.8, 1.0, 1.2. For the symmetric topology, the significance is maximized for R = 1.2, while the resonant model favors a smaller fat-jet radius of R = 0.6. 9 The impact of the various cuts on signal and background is shown for the symmetric model in the left panel of figure 5. We see that applying 2-2 b-tags plus a Higgs mass window leaves us with about 40 signal events, while the sum of backgrounds ranges around 700 events. In the cut-and-count analysis, we therefore do not achieve a signal-to-background ratio of O(1), so that the significance depends critically on systematic uncertainties that can affect the analysis. In the center panel, we illustrate this dependence by showing the significance defined as We assume a systematic uncertainty of β = 0, 1, 5, 10%, respectively. For a small uncertainty β = 1%, a maximum significance of about Σ = 2 can be reached for the considered benchmark. In the resonant model, shown in figure 6, the signal rate is significantly lower than in the symmetric model. With our basic cut-and-count analysis, we therefore do not achieve a noticeable sensitivity to our signal. The fact that the significance of our signal depends on a combination of various kinematic variables suggests to perform a multi-variate analysis to optimize the sensitivity.

Multi-variate analysis and results
In this section, we describe the strategy pursued in our multi-variate analysis (MVA) and present our results. Depending on the respective phase-space point, discriminating 8 For the symmetric topology, a looser / E T cut is preferred to optimize the significance. For the resonant topology, due to the harder / E T spectrum and the low number of signal events, the preferred cut lies at higher / E T . In order to establish a fair comparison of the remaining selection criteria in the two topologies, we discuss the cutflow for a fixed cut of / E T > 200 GeV. 9 For the resonant model, the sensitivity with R = 0.6 is a factor of 2 higher than for R = 1.2.

JHEP04(2019)160
between the di-Higgs plus / E T signal and the backgrounds can be very challenging. In order to maximize the sensitivity for each of our benchmarks in the two models, we perform a multi-variate analysis. In this study, we use the scikit-learn [59] implementation of AdaBoost [60], employing the SAMME.R algorithm to perform a Boosted Decision Tree (BDT) classification. As our best setup, we choose to train 70 trees with a maximal depth of 3, a learning rate of 0.5 and a minimum node size of 0.025 of the total weights.
Before running the BDT, we place basic kinematic selection cuts on the missing transverse energy and the jets. As in the cutflow analysis, we apply a / E T > 200 GeV cut and veto events containing isolated leptons. 10 The jets are defined as Cambridge-Aachen fat jets J with a jet radius of R = 1.2 (R = 0.6) for the symmetric (resonant) topology and transverse momentum p T (J) > 20 GeV. A fat jet J i is accepted if it contains two subjets j k i , i, k = {1, 2}, where at least one of them is b-tagged.
In our multi-variate analysis, we use kinematic information on the two hardest b-tagged fat jets, J 1 and J 2 and their corresponding subjets j k 1 and j k 2 . The complete set of variables used for our analysis can be classified in four categories: • Global variables: missing transverse energy, / E T ; H T (computed using the fat jets); total number of fat jets, N J ; total number of b-tagged fat jets, N Jb ; total number of b-tagged subjets within all fat jets, N jb ; • Single fat-jet variables: transverse momentum p T (J i ); pseudo-rapidity η(J i ); jet mass m J i ; azimuthal angular separation between fat jet and missing momenta, ∆φ(J i , / E T ); ratio of transverse momenta p T (J i )/ / E T ; • Two fat-jet variables: distance between two fat jets, ∆R(J 1 , J 2 ); invariant mass of two fat jets, m J 1 J 2 ; maximum jet mass ratio, max(m J 1 /m J 2 , m J 2 /m J 1 ); • Subjet variables: transverse momentum p T (j k i ); pseudo-rapidity η(j k i ); distance between subjets, ∆R(j k i , j l i ). We employ 80% of our events for training and 20% for evaluation purposes. The different backgrounds are weighted according to their relative cross section after applying the basic selection cuts. The BDT thus focuses on the dominant backgrounds when trained to avoid misidentification of the respective backgrounds as a signal. To make sure that the BDT will take equal effort in correctly classifying the overall number of signal events and background events, we scale the total weight of all signal events to match the total weight of all background events. This is especially important, since in the training we involve more Monte Carlo events for the background processes than for the signal. The BDT will assign a score (or threshold in machine learning (ML) terminology) to each event, which reflects the likelihood of it being signal.
In figure 7, we show a typical BDT result for the symmetric benchmark S 750 350 25. The three plots to the left show the training results, while the three plots to the right display 10 We require electrons (muons) to have p T lep > 10 (10) GeV and |η lep | < 2.5(2.7), and consider them isolated if p T lep p T had < 0.12 (0.25) within R = 0.5 (0.5) of the electron (muon) momentum. Looser lepton selection criteria might result in a better rejection of the W +jets background and therefore improve the significance of our analysis. However, in this work we consistently use our conservative lepton definition. the outcome of the evaluation. In the upper left plot of each panel, we present the Receiver Operating Characteristic (ROC) curve. Shown is the true positive rate (TPR, also referred to as recall in ML language) against the false positive rate (FPR, also referred to as fallout). 11 The most relevant information is the Area Under the Curve (AUC), which quantifies the BDT capability to discriminate between signal and background. By construction, the AUC ranges between 0.5 and 1. For all our benchmarks, the AUC is at or above 0.9, which proves that our signal/background classifier has an impressive performance. The upper right plot of each panel displays the precision (or positive predictive value) as a function of the recall. The precision is defined as the fraction of true signal events among those events the BDT classified as signal. This curve illustrates how reliable a classification as signal is, depending on which fraction of signal events are classified correctly. In the displayed curve, we used the adjusted signal weights, ensuring that signal and background are on equal footing (see the discussion above). Hence the minimum value for the precision is 0.5.
The lower plot in each panel shows the significance Σ defined in eq. (3.7) as a function of the expected number of signal events S that would be left after cutting on a given score. The prediction is made for the HL-LHC with an integrated luminosity of L = 3 ab −1 . We present the significance for this benchmark for three different assumptions of systematic uncertainties. It is apparent that the use of a BDT enhances the significance by about an order of magnitude compared to our basic cut-and-count analysis. 12 11 The TPR is the probability that a signal event gets tagged as signal, while the FPR is the probability that a background event gets tagged as signal. 12 The small spikes in the significance curve in the evaluation sample are due to a lack of Monte Carlo statistics in the W jjjj background for high BDT scores. We have simulated 10 8 Monte Carlo events for this background. Owing to the large number of colored final states, however, the event generation is computationally intense. Moreover, the lepton veto cannot be reliably implemented at parton level, which enhances the number of required events. Generating a larger sample of W jjjj events would soften the spikes, but is beyond the scope of our analysis. In order to determine the sensitivity to a given benchmark scenario, we cut the BDT score (in our case, the number of signal events) at the peak of the evaluation significance. In our example S 750 350 25 and assuming 5% systematics, this corresponds to a significance of Σ = 12 and S = 325 and B = 238 remaining signal and background events, compared to S = 42 and B = 713 in the basic cut-and-count approach. The increase in sensitivity is due to a better selection of signal events, while at the same time having a similar improvement in background rejection. All of our benchmarks in the symmetric model feature a signal-to-background ratio close to unity, which suggests good discovery prospects in the presence of systematic uncertainties. To take into account the statistical uncertainty on our W jjjj background simulation (see footnote 12), in what follows we take a conservative approach and claim a "discovery" at a significance of Σ = 7 (corresponding to 7 standard deviations from the standard-model hypothesis for Gaussian statistics), instead of the common 5σ threshold.
In figure 8, we present our results for the resonant benchmark R 750 350 25. Again, we see the excellent performance of our BDT classifier. However, in the resonant model the lower signal rate severely limits the sensitivity. While the BDT improves the sensitivity, we cannot reach the discovery level for our models. Still, a significance of Σ ≈ 2 can be achieved in benchmarks with a lighter scalar B, corresponding to a larger signal rate. The HL-LHC can thus test parts of the parameter space, but a more refined strategy (or a combination of multiple channels) would be required to reach a higher sensitivity. In summary, compared with the cut-and-count analysis the BDT enhances the sensitivity to the resonant topology by about a factor of 10 in most benchmarks. Still, the sensitivity is lower than for the symmetric topology, mostly due to the reduced signal rate.
To show the dependence of the signal sensitivity on the respective model parameters, we present our results in terms of the scalar masses m A and m χ . In figure 9, we display the luminosity required to discover the symmetric benchmark scenarios from table 1   HL-LHC, assuming a systematic uncertainty of 5%. The mass of the heavy scalar is fixed to m B = 500 GeV (left panel) and m B = 750 GeV (right panel). Apart from one benchmark, all scenarios are well within the reach of the HL-LHC. We also see that the significance is particularly high for benchmarks with a compressed spectrum. As anticipated from figure 2, for m B 2m A , the / E T spectrum is harder and the cut on missing energy is thus more efficient in rejecting the background. In figure 9, this effect can be seen by looking at fixed values of ∆ Ahχ and increasing m A (along the diagonal). For a more compressed spectrum, the required luminosity is drastically reduced by a factor of 10-20, depending on the actual value of m B . A similar but milder effect occurs if the decay A → hχ proceeds close to threshold. In this case, the hχ pair follows the direction of A, resulting again in a harder / E T distribution. In the plot, this corresponds to fixed values of m A and increasing m χ (along the vertical direction), resulting in a reduction of the required luminosity by a factor of about 2-3 at most. In summary, a di-Higgs signal could be discovered in an early phase of the HL-LHC, provided that the scalar resonance B is produced at a sizeable rate.
We present our results in a second way, which is particularly convenient for recasting purposes. In figure 10, we report the cross section that can be probed at the discovery level for a luminosity of 3 ab −1 . Again, we fix the heavy scalar mass at m B = 500 GeV (left panel) and m B = 750 GeV (right panel), respectively. We see that in the most difficult benchmark topology, namely in the benchmark that requires the largest discovery luminosity at the HL-LHC, S 750 275 50, a cross section of about 15 fb would be required to claim a discovery. It is interesting to compare this value with the latest di-Higgs predictions for the SM that require a total rate for hh → 4b of about 13 fb for discovery. In the standard model, the Higgs pair is not produced through a resonance, and furthermore the final state of four b-quarks without / E T is difficult to identify. In contrast, in our scenarios the scalar B is resonantly produced and the large / E T in the final state facilitates an efficient discrimination against the backgrounds. The fact that we can probe cross sections of a few to several femtobarns is an important result, which motivates a study of the di-Higgs plus / E T signature in the context of complete models. For the benchmarks with m B = 1 TeV, the signal rates are too low to claim a discovery at the HL-LHC with 3 ab −1 . However,  already with slightly more luminosity (or larger cross section) a discovery of these heavy scalar scenarios is possible.
In the resonant model, the planned HL-LHC luminosity is not sufficient to discover any of the benchmark scenarios, due to the lower production rates. We therefore confine ourselves to presenting the cross sections required to discover a particular resonant benchmark in figure 11. As explained in section 3, the mass of the lightest scalar, m χ , does not affect the sensitivity, since the boost of A does not depend on m χ or m h . We therefore present our results in terms of the heavier scalar masses m A and m B . From the figure, we see that we can only test cross sections in the femtobarn and sub-femtobarn regime. As in the symmetric model, the significance increases when the spectrum is compressed. Increasing m A for fixed m B = 1000 GeV lowers the testable cross section by a factor of 3. Using CheckMATE2 we have verified that even with the largest possible cross section displayed here, the search for di-Higgs plus / E T is still the most sensitive channel for the resonant model.
- 18 -In our models, the lightest scalar χ is automatically stable, due to the imposed Z 2 symmetry. Here we explore the hypothesis that χ is a dark matter candidate. We discuss the dark matter phenomenology for our simplified models, focusing on dark matter-nucleon scattering in direct detection experiments and thermal freeze-out in the early universe.
A contribution to spin-dependent nucleon scattering arises from the portal operator H † H χχ, which induces a hχχ interaction after electroweak symmetry breaking. The effect of this operator has been well studied elsewhere (see e.g. ref. [6] for a recent study). Since it is not relevant for our di-Higgs plus / E T final state, we will assume that it is absent. We follow the same philosophy for other operators, namely we only study the implications of operators that play a role in the collider phenomenology.
In the symmetric model, A and χ can mix through the coupling λ AχHH from eq. (2.2) upon electroweak symmetry breaking. This mixing induces a Higgs coupling to the lightest scalar, λ AχHH sin θ Aχ , where θ Aχ is the A-χ mixing angle. Direct detection experiments set a strong bound on this coupling. However, as we explained in section 2, neither λ AχHH nor the mixing affects the signal strength of di-Higgs plus / E T . We have therefore set A-χ mixing to zero in our analysis, θ Aχ = 0. The lightest scalar χ can be a viable dark matter candidate that leaves a di-Higgs plus / E T signature at the LHC while agreeing with the null results from direct detection experiments.
In the resonant model, A and h mix through the operator m AHH AH † H after electroweak symmetry breaking. For fixed m A and m χ , direct detection experiments set an upper bound on the product of the m Aχχ and m AHH couplings from mixing-induced nucleon scattering. Since the signal strength of di-Higgs plus / E T production depends on the relative size of m AHH and m Aχχ , but not on their overall magnitude (see section 2), we conclude that in the resonant model the lightest scalar χ is not ruled out as a dark matter candidate by direct detection experiments if m Aχχ m AHH is sufficiently small.
Assuming that our dark matter candidate is a relic from thermal freeze-out in the early universe sets additional constraints on the parameter space of our models. In the symmetric model, dark matter annihilation can be efficient in either of the following scenarios, 13 If dark matter is heavier than the Higgs boson, it can annihilate by t-channel A exchange, scaling as λ 4 AχHH . The observed dark matter abundance of Ω χ h 2 = 0.1199 [61] can be obtained with moderate couplings and mediator masses m A 1 TeV. If dark matter is lighter than the Higgs boson, it can only annihilate through A-χ mixing, which is strongly suppressed by the null results from direct detection experiments. It is still possible to satisfy the observed relic abundance for dark matter pair masses near the Higgs mass. In this case, 13 Additional annihilation processes such as χχ → B → gg via the CBgg coefficient are possible. However, such channels would also require a Bχχ interaction, which is irrelevant for the di-Higgs plus / E T signature. As in the case of direct detection, we will neglect these interactions. In the resonant model, the direct detection bounds on m Aχχ sin θ Ah suppress all interactions of dark matter. The observed relic abundance can only be obtained for dark matter pairs in the Higgs resonance region, Obtaining the observed relic abundance away from the Higgs resonance requires additional dark matter annihilation channels beyond what is predicted in our simplified model. In any case, the dark matter hypothesis should not constrain the search strategy for di-Higgs plus / E T at the LHC. For instance, a di-Higgs plus / E T signature could also arise in models with hidden sectors, where χ decays visibly at a later time and outside the detector, so that its decay products could be caught by dedicated detectors such as FASER [62,63] or MATHUSLA [64][65][66].

Conclusions
In this work we have developed a search strategy for hhχχ production at the LHC using the final state with four b-jets and missing transverse energy. For the purpose of our and future studies, we have built two simplified models that give rise to this final state in different kinematic topologies. Both models feature a hidden sector of three new scalar singlets A, B and χ, the latter being stable due to an imposed parity symmetry. Since the scalars A are pair-produced via a resonant scalar B, event rates in both models are sizeable at the HL-LHC. The decay of A depends on the properties of the particles in the hidden sector and determines the event topology. In our symmetric model, both A and χ belong to the dark sector, so that A → hχ decays occur on both sides of the decay chain. In the resonant model, only χ belongs to the hidden sector and A decays via A → hh or A → χχ with similar branching ratios. We stress that these simplified models can be embedded in more complete models featuring an enlarged scalar sector or other particles in the hidden sector.
To demonstrate the LHC potential to discover hidden sectors with the di-Higgs plus / E T signature, we have performed a full-fledged numerical analysis of the multi-b+ / E T final state, including a detailed study of SM backgrounds and detector effects. Dominant backgrounds are due to tt, as well as W jjjj and Zjjjj production. Employing the inclusive / E T trigger for event pre-selection, we have first carried out a cut-and-count analysis, followed by a multi-variate analysis based on a boosted decision tree. Both analyses rely on the use of jet-substructure techniques in a modified version of the BDRS algorithm. In our cutand-count analysis, we employ missing energy, jet and subjet variables, as well as flavor tags to efficiently discriminate between signal and background. With this approach, we have obtained significances of at most 2σ for the symmetric model, while in the resonant model our analysis turned out to be insensitive. It is thus necessary to optimize the signal-background discrimination by optimally exploiting all kinematic features using a multivariate analysis. In particular we have shown that the use of machine-learning tools is well suited for our analysis.

JHEP04(2019)160
In the symmetric model, our BDT analysis predicts a significance well above 5σ for most of our scanned points, thus opening the possibility of an (early) discovery at the HL-LHC. In the resonant model, the signal rates are significantly lower (in the sub-femtobarn regime), which reduces the sensitivity. In order to claim a discovery, an increase of our benchmark cross section by a factor of 2-10 would be needed. In any case, the enhancement from our simple cut-and-count to the multivariate analysis shows that the sensitivity to the di-Higgs plus / E T signal relies on a variety of kinematic features in both signal and background. The BDT is thus the appropriate approach to search for such a many-body final state in an environment with large SM backgrounds. While we have focused on the 4b + / E T channel with the largest event rates, additional final states like bb γγ + / E T , bb W W * + / E T , bb τ + τ − + / E T can contribute significantly to enlarge the search potential of di-Higgs plus / E T .
In the context of dark matter, it is interesting to investigate the interplay of this collider signature with searches at direct and indirect detection experiments in complete models where the relic abundance is satisfied. Within our simplified models, parts of the parameter region could provide us with a viable thermal dark matter relic. In general, potential bounds on viable dark matter models should not limit the scope in di-Higgs plus / E T searches at the LHC.
So far, the LHC collaborations have searched for the di-Higgs plus / E T signature in a specific scenario of supersymmetry with very light invisible particles in the final state. We encourage our experimental colleagues to use our simplified models and a search strategy similar to ours to fully exploit the discovery potential of di-Higgs plus / E T . The use of machine learning techniques is crucial to achieve a high significance, in our case an enhancement by an order of magnitude over a cut-and-count analysis. Searching for di-Higgs plus / E T links the efforts at the dark matter frontier with those on the di-Higgs frontier, which in the last few years have seen a spectacular development in both theory and experiment. The sensitivity to this and similar signatures will greatly benefit from merging the techniques developed for Higgs pair measurements in and beyond the standard model with missing energy searches.

JHEP04(2019)160
A UV completing the Sgg coupling In this appendix, we introduce a minimal perturbative UV completion for the effective coupling C Bgg in eq. (2.8). We add a heavy quark Q with mass m Q , odd Z 2 parity, and vector-like weak interactions to our model. We furthermore assume that Q couples to the heavy scalar B via L ⊃ −y Q BQQ . (A.1) For the sake of simplicity, we take Q to be an SU(2) singlet with hypercharge −1/3. Assuming that Q dominantly decays via Q → bχ, its mass is constrained by sbottom searches at the LHC [67,68], as well as by more inclusive searches for jets plus / E T [53,69]. We estimate the current bound to be at the level of 1 TeV, but leave a more thorough investigation for future work.
Following ref. [70], integrating out Q generates the following effective couplings to gluons and photons, 14 where g s is the coupling constant of QCD and C Bgg is defined in eq. (2.8). The effective coupling to photons is defined analogously by With m Q = 1 TeV and y Q = 1, we find 15 C Bgg = g 2 s 48π 2 2.1 · 10 −3 , (A. 4) and C Bγγ smaller by more than two orders of magnitude. Both C Bgg /Λ and C Bγγ /Λ scale as y Q /m Q , so that the effective couplings do not change when simultaneously increasing both m Q and y Q . The partial decay widths mediated by these couplings are [70] Γ(B → gg) = C 2 This should be compared with the decay width into AA pairs, (A.7) 14 If we had instead introduced Q as a vector-like top partner, the coupling CBγγ would be larger by a factor of four. 15 Notice that the naive dimensional analysis (NDA) estimate for CBgg, assuming a loop-induced perturbative UV completion at the scale Λ = 1 TeV, is larger by about a factor of three, CBgg ∼ 1/(16π 2 ).

JHEP04(2019)160
The production cross section σ(pp → B) can be estimated by making use of the results of the LHC Higgs cross-section working group [71], which provides the contribution of gluongluon fusion to the production cross section of a heavy scalarŜ with Higgs-like couplings to quarks. These numbers can then simply be rescaled to obtain the production cross section of B, (A.8) where m t and y t are the top-quark mass and Yukawa coupling.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.