Resonant Di-Higgs Production at Gravitational Wave Benchmarks: A Collider Study using Machine Learning

We perform a complementarity study of gravitational waves and colliders in the context of electroweak phase transitions choosing as our template the xSM model, which consists of the Standard Model augmented by a real scalar. We carefully analyze the gravitational wave signal at benchmark points compatible with a first order phase transition, taking into account subtle issues pertaining to the bubble wall velocity and the hydrodynamics of the plasma. In particular, we comment on the tension between requiring bubble wall velocities small enough to produce a net baryon number through the sphaleron process, and large enough to obtain appreciable gravitational wave production. For the most promising benchmark models, we study resonant di-Higgs production at the high-luminosity LHC using machine learning tools: a Gaussian process algorithm to jointly search for optimum cut thresholds and tuning hyperparameters, and a boosted decision trees algorithm to discriminate signal and background. The multivariate analysis on the collider side is able either to discover or provide strong statistical evidence of the benchmark points, opening the possibility for complementary searches for electroweak phase transitions in collider and gravitational wave experiments.


I. INTRODUCTION
Understanding the nature of the electroweak phase transition (EWPT) is a major goal in particle physics. A first order phase transition can be obtained by introducing new physics at the electroweak scale and this new physics can be explored at the high luminosity Large Hadron Collider (HL-LHC). On the other hand, a first order phase transition can generate gravitational waves that may be within the reach of future spacebased detectors. It becomes important to understand how this complementarity plays out in concrete models -for example, can one obtain regions of parameter space where all conditions -first order phase transition, detectable gravitational waves, and a strong enough signal at the HL-LHC -are met?
The simplest template for studying these questions is the xSM model [1][2][3], which consists of the Standard Model (SM) extended by a real scalar. We make no comments about the completion of this model in the UV, the naturalness conflicts associated with introducing yet another scalar in addition to the Higgs, etc. Rather, our philosophy is to use the xSM as the simplest extension of the Higgs sector in which a complementary gravitational wave and collider study can be performed.
The purpose of the current paper is to first carefully explore gravitational wave signatures associated with the EWPT, and then study resonant di-Higgs production at the HL-LHC in the same context.
The new features of our study are the following: (i) While the picture of complementarity presented above is appealing, making concrete connections from gravitational * Electronic address: aalves@unifesp.br † Electronic address: tghosh@hawaii.edu ‡ Electronic address: ghk@itp.ac.cn § Electronic address: kuver.sinha@ou.edu wave studies to particle physics at the electroweak scale faces many technical challenges in the calculations of electroweak baryogenesis (EWBG), EWPT and gravitational waves [4]. While we do not intend to target all these challenges in one strike, we initiate a process of making this connection more solid by presenting a careful treatment of the gravitational wave calculations. We address several subtle issues pertaining to the bubble wall velocity and the hydrodynamics of the plasma, in particular the tension between requiring bubble wall velocities small enough to produce a net baryon number through the sphaleron process, and large enough to obtain appreciable gravitational wave production. The velocity that enters the calculations of EWBG might not be the bubble wall velocity for plasma in the modes of deflagrations and supersonic deflagrations ahead of the bubble wall, as demonstrated by hydrodynamic analysis and simulations [5]. This has the consequence that for a large wall velocity, a much smaller velocity for EWBG can be obtained and EWPT can be accompanied by a strong gravitational wave signal [6]. Therefore in our analysis, we make a clear distinction between these two velocities and determine their relation from a hydrodynamic analysis of the fluid profiles.
For our benchmark models, we compute the gravitational wave energy spectra and signal-to-noise ratio for future spacebased gravitational wave experiments.
(ii) On the collider side, our objective is to apply the machine learning techniques initiated in [7] to resonant di-Higgs production, at benchmark points that are compatible with acceptable EWPT and that hold out the most optimistic prospects from gravitational wave observations. We conduct a di-Higgs study at the HL-LHC: pp → h 2 → h 1 h 1 → bbγγ, where h 1 denotes the SM Higgs. We carefully incorporate all relevant backgrounds in our study. In particular, we are careful to include contributions coming from jets being misidentified as photons, as well as light flavor jets or c-jets being misidentified as b-jets.
We utilize two recent advances in the machine learning literature for our collider study. Firstly, recent results [8] show that in terms of efficiency, Bayesian hyperparameter optimization of machine learning models tends to perform better than random, grid, or manual optimization. We use the Python library Hyperopt [8] to optimize cuts on kinematic variables in our study. The second tool from the machine learning community that we apply is XGBoost [9] (eXtreme Gradient Boosted Decision Trees), which has become increasingly popular among Kaggle competitors and data scientists in industry, especially since its winning performance in the HEP meets ML Kaggle challenge. Unlike a simple gradient boosting classifier, where classifiers (decision trees) are added sequentially, XGBoost is able to parallelize this task, leading to superior performance. Both cut thresholds and Boosted Decision Trees (BDT) hyperparameters are jointly optimized for maximum collider sensitivity.
Our paper is structured as follows. In Section II, we introduce the xSM model and settle on the benchmarks that allow a first order phase transition. In Section III, we calculate the gravitational wave energy spectra and signal-to-noise ratio for several benchmark models. In Section IV, we perform our collider analysis. We end with our Conclusions.

II. THE MODEL
The model "xSM" constitutes one of the simplest extentions of the SM where a real scalar gauge singlet S is added to the particle content. The potential for the "xSM" model is defined with the convention following Ref. [1][2][3]: Here is the SM Higgs doublet and S = v s + s defines the real scalar singlet. All the parameters appearing here are real. The minimization conditions of this potential at the vacuum (v, v s ) allows one to eliminate µ, b 2 by With these substitutions, the mass matrix for (h, s) is found to be: which can then be diagonalized by a rotation angle θ. This results in the physical scalars (h 1 , h 2 ) in terms of the gauge eigenstates (h, s): where h 1 is identified as the 125 GeV Higgs scalar and further m h2 > m h1 . Consequently, three of the potential parameters (λ, a 1 , a 2 ) can be replaced by three physical parameters m h1 , m h2 and θ: while keeping in mind that v can be solved from the Fermi constant and m h1 = 125GeV. With the model parameters fully specified, the cubic scalar couplings that are relevant for di-Higgs production are λ h1h1h1 and λ h2h1h1 , given by In the absence of mixing of the scalars when θ = 0, the cubic Higgs coupling reduces to its SM value iλ h1h1h1 = 3m 2 h1 /v while iλ h1h1h2 vanishes. For small θ as suggested by experimental measurements, the following approximation is obtained for the cubic couplings through a Taylor expansion: The gauge and Yukawa couplings of h 1 are reduced by a factor c θ and the couplings of h 2 are −s θ times the SM values, that is, where XX denotes W + W − , ZZ andf f . Since it modifies the Higgs couplings, the mixing angle is constrained by experiments to be small. Moreover, direct searches for a heavier SM-like Higgs by ATLAS and CMS as well as electroweak precision measurements further constrain the parameter space of (θ, m h2 ). Taking these phenomenological constraints into account, Ref. [3] considered 12 benchmark points with m h2 ∈ [250, 850] and studied the resonant di-Higgs production in the bbW W channel. Also imposed on these benchmarks is the strongly first order EWPT criterion, to be discussed in the next section. Several of these benchmarks are reproduced in the current work for gravitational wave and di-Higgs production studies. These are shown in Table. I [95] Parameters that are relevant for EWPT and gravitational waves are also tabulated for each benchmark. The last column is the signal-to-noise ratio which quantifies the gravitational wave discovery prospect at LISA. See text for more detailed explanation.

III. ELECTROWEAK PHASE TRANSITION AND GRAVITATIONAL WAVES
Ever since the first detection of gravitational waves from binary black hole mergers by the LIGO and Virgo collaborations [10], gravitational waves have become an increasingly important new tool for studying astronomy and cosmology in addition to testing the general relativity of gravity in the strong field regime. More importantly, future space-based interferometer gravitational wave detectors, such as the Laser Interferometer Space Antenna(LISA) [11], can probe gravitational waves at the milihertz level, which is right the frequency range of the gravitational waves resulting from a first order EWPT [12][13][14]. Thus gravitational wave studies present a new window for looking into details of the mechanism of electroweak symmetry breaking, complementary to direct searches at colliders and precision measurements at the low energy intensity frontier [15][16][17][18][19]. This complementarity between traditional particle physics techniques and gravitational wave detections can then provide a more complete picture to understanding the physical mechanism for baryon number generation and solving the long standing baryon asymmetry problem of the universe.

A. Electroweak Phase Transition
The starting point for analyzing the EWPT is the calculation of the finite temperature effective potential, which typically involves the inclusion of the tree level effective potential, the conventional one loop Coleman-Weinberg term [20], the one loop finite temperature corrections [21] and the daisy resummation [22,23]. It is known that there is a gauge parameter dependence in the effective potential thus calculated [24]. However a gauge invariant effective potential can be obtained by doing a high temperature expansion with the result equivalent to including only the thermal mass corrections [25]. Here the gauge invariant effective potential is found to be: with the thermal masses given by where we have written the gauge and Yukawa couplings in terms of the physical masses of W , Z and the t-quark. In the above effective potential, it is the cubic terms that allow the realization of a first order EWPT by providing a tree level barrier [96]. Among the physical parameters that characterize the dynamics of a first order EWPT, the following enter the calculation of the gravitational waves [12]: Here T c is the critical temperature at which the stable and metastable vacua become degenerate, T n is the nucleation temperature when a significant fraction of the space is filled with nucleated electroweak bubbles, α is the ratio between the released energy from the EWPT and the total radiation energy density at T n , β denotes approximately the inverse time duration of the EWPT and v w is the bubble wall velocity [26][27][28]. We use CosmoTransitions [29] to trace the evolution of the phases as temperature drops and solve the bounce solutions to determine T c , T n , α and β [97]. These results are added to Table. I for each benchmark. The following comments are important regarding these benchmarks: • To avoid washout of the generated baryons inside the electroweak bubbles, the strongly first order EWPT criterion v h (T n )/T n 1 [4,30] needs to be met, which effectively quenches the sphaleron process inside the bubbles. All the benchmarks presented in Table. I satisfy this condition.
• Currently there is large uncertaintity with the determination of the bubble wall velocity v w , so it is usually taken as a free parameter in the calculations of EWBG, EWPT and gravitational waves. It is however not entirely free as there are constraints from admitting consistent hydrodynamic solutions of the plasma at the time of phase transition, to be discussed in the following. • Very strong phase transitions are observed for BM5, BM7, BM8 and BM9 as their values of α are all larger than 1. A hydrodynamical analysis of the plasma surrounding the bubbles shows that the profiles of the plasma can be classified into three categories [5]: deflagrations, detonations and supersonic deflagrations (aka hybrid) [31], depending on the value of the bubble wall velocity v w . For v w smaller than the speed of sound in the plasma (c s = 1/ √ 3), the plasma takes the form of deflagrations with the following properties: (a) the plasma ahead of the phase front flows outward with non-zero velocity; (b) the plasma inside the bubbles are static. For c s < ξ J (α) < v w where ξ J as a function of α is the velocity corresponding to the Jouguet detonation [32], a detonation profile is obtained: (a) the plasma ahead of the wall is static; (b) the plasma inside the wall flows outward. For intermediate values of v w with c s < v w < ξ J (α), a supersonic deflagration mode is obtained with the feature that both the plasma ahead of and behind the wall flow outward. An important implication relevant for the analysis here is that there is a minumum value of v w when α > 1/3 for deflagration and hybrid modes [5], where v w smaller than this value gives no consistent solution. For benchmarks BM11 and BM12 both with α < 1/3, v w can take any value, while for BM5-10, there is a limited range for v w .
In the left panel of Fig. 1, we show on the plane of (v w , α), the resulting ranges of v w for BM6, BM7 and BM8, denoted by black horizontal lines that extend between the two gray region boundaries. We note that the value of α for BM10 is close to that of BM6, while the values of α for BM5 and BM9 are similar to BM8. We do not plot these cases to prevent the plot from being overcrowded. The left gray region is forbidden by the constraint mentioned above, while the right gray region gives a v w too fast for EWBG to work [98]. The allowed regions in this plot are the light green region for deflagration and the brown region for supersonic deflagration. We also show three representative fluid profiles in each of the modes in the right panel of Fig. 1.
• The usual consensus for EWBG calculations is that the bubble wall velocity needs to be sufficiently small to allow diffusion of particles ahead of the wall and to produce net baryon number through the sphaleron process, with a typical value of v w = 0.05 (see for example [33][34][35][36][37][38]). However such small velocities would weaken gravitational wave production. The story changes when the hydrodynamic properties of the plasma surrounding the bubble wall are taken into account, and the dilemma between successful baryon number generation and a strong gravitational signal may be avoided. The reason is that the plasma ahead of the wall can be stirred by the expanding wall and gain a velocity in the deflagration and hybrid modes. This has the consequence that in the wall frame the plasma would hit the wall with a velocity v + that is different from v w [5,6] and it is v + rather than v w that should enter the calculations of EWBG. While a definitive justification of this argument would require analyzing the transport behavior of the particle species surrounding the wall in the above picture, we assume tentatively that this is true in this work(see Ref. [39] for a similar discussion on this point in the same model). The contours for a subsonic v + with values of 0.3, 0.05 and 0.01 are shown in the left panel of Fig. 1. We can see that v + decreases as α increases for fixed v w , with the contour v + = 0 coinciding with the boundary of the left gray region. Assuming v + = 0.05 is used for EWBG calculations, we locate the value of v w , which corresponds to the intersection point of this contour with the horizon line of each benchmark, represented as a red point. The v w found in this way is used to calculate the gravitational wave energy spectrum.
With above problems properly taken care of, we can now calculate the gravitational waves resulting from the EWPT.

B. Gravitational Waves
A stochastic background of gravitational waves can be generated during a first order EWPT from mainly three sources: collisions of the electroweak bubbles [40][41][42][43][44][45], bulk motion of the plasma in the form of sound waves [46,47] and Magnetohydrodynamic (MHD) turbulence [48,49](see Ref. [12][13][14] for recent reviews). The total resulting energy spectrum can be written approximately as the sum of these contributions: While earlier studies of gravitational wave production from EWPT have focused on bubble collisions, recent advances in numerical simulations show that the long lasting sound waves during and after the EWPT give the dominant contribution to the gravitational wave production [46,47] and the contribution from bubble collision can be neglected [50]. From such numerical simulations, an analytical formula has been obtained for this kind of gravitational wave energy spectrum [47]: Here g * is the relativistic degrees of freedom in the plasma at the time of EWPT, H n is the hubble parameter at T n and f sw is the present peak frequency which is the redshifted value of the peak frequency at the time of EWPT(= 2β/( √ 3v w )): Hz. (14) The factor κ v is the fraction of latent heat that is transformed into the bulk motion of the fluid and can be calculated as a function of (α, v w ) by analyzing the energy budget during the EWPT [5]. We note that a more recent numerical simulation by the same group [51] obtained a slightly enhanced Ω sw h 2 and a slightly reduced peak frequency f sw .
Aside from the sound waves which give the dominant gravitational wave signals, the fully ionized plasma at the time of EWPT results in MHD turbulence, giving another source of gravitational waves. When a possible helical component [52] is neglected, the resulting gravitational wave energy spectrum can be modeled in a similar way [48,49], where the peak frequency f turb corresponding to MHD is given by: Hz.  Table 1 for proposed space-based gravitational wave detectors. Similar to κ v , here the factor κ turb is the fraction of latent heat that is transferred to MHD turbulence. A recent numerical simulation shows that when κ turb is parametrized as κ turb ≈ κ v , the numerical factor can vary roughly between 5 ∼ 10% [47]. Here we take tentatively = 0.1. As has been discussed in previous section, we take the value of v w such that they all yield v + = 0.05, a good choice for EWBG calculations.
Adding the results given in Eq. 13 and Eq. 15, we can then obtain the total gravitational wave energy density spectrum. For example, the resulting gravitational wave energy spectrum for BM5 is shown in Fig. 2. The blue dashed line denotes the gravitational wave signal from sound waves and the brown dotted line from MHD turbulence, while the total contribution is shown with the solid red line. The color-shaded regions on the top are the experimentally sensitive regions for several proposed space-based gravitational wave detectors: LISA introduced earlier, the Taiji [53] and TianQin [54] programs, Big Bang Observer (BBO), DECi-hertz Interferometer Gravitational wave Observatory (DECIGO) and Ultimate-DECIGO [55] [99].
To assess the discovery prospects of the generated gravitational waves, we calculate the signal-to-noise ratio with the definition adopted by Ref. [12]: where h 2 Ω exp (f ) is the experimental sensitivity for the proposed experiments listed above and T is the mission duration in years for each experiment. Here we assume T = 5. For the LISA configurations with four links, the suggested threshold SNR for discovery is 50 [12]. For the six link configurations as drawn here, the uncorrelated noise reduction technique can be used and the suggested SNR threshold is 10 [12]. We show the SNR for the benchmarks versus m h2 in Fig. 3. The SNR for LISA are also added in Table. I for each of the benchmarks, where it shows that BM5, BM6, BM7, BM8, BM9 all have SNR larger than 10. In particular the SNR for BM5, BM7, BM8, BM9 are all much larger than 10 and for each of these cases a very strong gravitational wave signal is expected. The last three benchmarks BM10-12 give gravitational wave signals too weak to be detected by LISA, Taiji and TianQin but some may be detected by other proposed detectors.
In this Section, we study the collider prospects of probing the benchmark points for which a large SNR for proposed gravitational wave detectors has been calculated in the previous Section. The xSM model predicts a resonant di-Higgs production pp → h 2 → h 1 h 1 → bbγγ which is the channel that we will explore. Double Higgs production occurs through the three contributions depicted in Fig. 4. The non-resonant component involves the box diagram and the diagram with the trilinear Higgs coupling, while the resonant contribution corresponds to the diagram with h 2 in the s-channel. The non-resonant production cross section is strongly dependent on the size of λ h1h1h1 , with a minimum at ∼ 0.31 due to destructive interference between the box and the triangle diagrams. The benchmark points considered in this work all exhibit values of λ h1h1h1 between the SM value of 0.13 and 0.2, and for these points the non-resonant production cross section is suppressed compared to the SM. This suppression is partly compensated by the resonant contribution. We checked that the interference between the resonant and non-resonant contributions is negligible, so the contributions can be added incoherently.
While the resonant di-Higgs production cross section drops rapidly as the mass of h 2 is increased, the resonance peak of the h 1 h 1 invariant mass becomes easier to identify in the tail of the background distribution as shown in Fig. 5. Taking this tradeoff into account, and noticing that BM5 and BM7 provide acceptable SNR in the gravitational waves calculation, we take these two benchmarks as the most promising ones to be probed at the HL-LHC.
We study the bbγγ channel, which is currently the most promising channel to study the double Higgs production in the SM [7, 60, 62-65, 76, 77]. Recently, the fully leptonic bbW + W − channel was studied in the context of the xSM [3]. This channel presents better prospects than bbτ + τ − and bbγγ for scalar masses greater than around 450 GeV. However, the signal-to-background ratio for the BM5 and BM7 points is ∼ 0.1 which may be an issue if the systematic uncertainties in tt backgrounds are not very well controlled. Moreover, the presence of two neutrinos precludes the reconstruction of the scalar resonance. The bbγγ channel, on the other hand, is cleaner and permits the reconstruction of the Higgses, while its cross section is much smaller than the bbW + W − and bbτ + τ − channels.
In Ref. [7], we found that the challenge of controlling the systematic uncertainties can be addressed by judiciously adjusting the selection criteria in order to raise the signal-tobackground ratio. A full comparative study across different channels using our methods would be interesting, and is left for future study.
Inclusive di-Higgs production was simulated with MadGraph5 aMC [78] at √ s = 14 TeV and NN23LO1 PDFs [79]. We multiply the non-resonant LO rates by the NNLO QCD K-factor of 2.27 [80], the resonant one by the NNLL QCD K-factor of 2.5 [81] and add them together to get the total cross section. This is justifiable once the contributions do not interfere. Besides the fact that the K-factors for the two contributions are similar, the kinematic cuts enhance the resonant contribution to eliminate backgrounds more efficiently. The total di-Higgs production cross section is thus approximated as described, and our signal events are weighted accordingly.
The signal cross sections are displayed in Table II. The Higgs bosons are decayed into bottom quarks and photons with the MadSpin module of MadGraph5. We pass our simulated events to Pythia8 [82] for hadronization and showering of jets. FastJet [83] is employed for clustering of jets and Delphes [84] for detector effects.
The backgrounds were also simulated within the same framework [100] and their total yield is shown in Table II. The backgrounds accounted for include bbγγ, Zh (Z → bb and h → γγ), bbh (h → γγ), tth → bb + γγ + X, jjγγ (the light-jets jj are mistaken for b-jets), bbjj (the light-jets jj are mistaken for photons), ccγγ (a c-jet is mistagged as a b-jet), bbγj (the light-jet is mistaken for a photon), and ccγj (the cjets are mistagged as b-jets and the light-jet as a photon), nine in total.
The first four backgrounds are generated with one extra parton radiation to better simulate the kinematic distributions, and MLM scheme [85] of jet-parton matching is used to avoid double counting. Their cross section normalizations were taken from Ref. [65]. All the other five backgrounds are normalized by their NLO QCD rates from [78] but their simulation do not involve extra jets. The probability of a light-jet to be mistagged as a photon is taken to be 1.2 × 10 −4 , although this may be an underestimate if pileup is taken into account.
We note that several previous studies underestimated the background, and/or did not take into account light flavor jets or c−jets being misidentified as b-jets, or jets being misidentified as photons. We correctly take into account bbγj, ccγγ and ccγj backgrounds in our work. We assume a 70% b-tagging efficiency for jet p T > 100 GeV, a photon efficiency of 90% and a 20(5)% mistagging factor for c(j)-jets. We refer to [7] for further details of the background simulation and normalization. The basic event selection requirements are two b-tagged jets with p T (b) > 30 GeV, and two photons with p T (γ) > 20 GeV, all within |η| < 2.5. Bottom jets and photons pairs are further required to reconstruct a 125 GeV Higgs boson with |M bb(γγ) − 125| < 25 GeV, and all identified particles are isolated from any other reconstructed object within a cone of ∆R = 0.4 around the particle's 3-momentum.
In order to improve the statistical significance of the signal hypothesis against the background hypothesis, we used machine learning tools. First, we used an algorithm to learn the best kinematic cut thresholds in order to maximize the significance metric. This algorithm was shown to increase the significance of the non-resonant di-Higgs study in the bbγγ channel up to 50% without relying on any other multivariate analysis [7]. It is based on a Gaussian process algorithm built upon the backend program Hyperopt [8]. We refer to [7] for a detailed description of the algorithm and its usefulness in increasing the signal significance. Many other multivariate tools can be used to improve the classification of collision events as, for example, those employed in Refs. [86][87][88][89].
The kinematic variables chosen are: (1) the transverse momentum of the two leading bottom jets and the two leading photons, (2) the γγ invariant mass, (3) ∆R(γ, γ), the distance between the two leading photons, and (4) the bbγγ invariant mass, totaling seven kinematic variables. The peak in the bbγγ mass is helpful in isolating the signal events, and, in contrast to the standard analysis of non-resonant SM double Higgs production in this channel, makes the search efficient without many more variables. One interesting kinematic feature helps to explain the larger efficiency of the BM5 point. The heavy Higgs mass is right on the bulk of the non-resonant h 1 h 1 → bbγγ invariant mass after the basic selections, around 450 GeV, but for the BM7 point, it is displaced to 563 GeV as we can see in Fig. 5. Requiring a cut around the mass peak TABLE II: The signal benchmarks, BM5 and BM7 are displayed at the first two columns and the total background is displayed in the last column. In the first row, we show the cross sections, in fb, after the basic selection discussed in the text.The second and third rows show the cut efficiencies and the number of events after optimization assuming 3 ab −1 . The numbers in parenthesis in the last column represents the backgrounds for the cuts that maximize the BM7 point.
thus retains more non-resonant di-Higgs events for the BM5 point, raising its cut efficiency compared to BM7. The cuts that maximize the signal significance for BM5 and BM7 benchmark points, assuming 3 ab −1 of integrated luminosity and 10% systematic error in the total background rates, are the following The cut selections for other systematics are similar. Because BM7 has a smaller production rate, the cuts learned by the algorithm were harder than the BM5 case in order to raise the significance. The cut efficiencies for the signals are almost three orders of magnitude larger than the backgrounds, as we can see in Table II, reaching a signal to background ratio slightly larger than 1 for both signal points.
The signal significance, assuming a 10% systematic error in the total background rate is 3.2σ for the BM5, and 1.8σ for BM7, respectively, as shown in the third column of Table III where results for 5% and 15% systematics are also shown. Note that S/B, displayed in the fourth column of Table III, increases to soften the degradation of significance with the systematics in BM5 and is kept constant for BM7. This is the job of the cut optimization program [90]. A public code of the algorithm used in this work to learn the cuts and run our multivariate analysis in an automatized way will be released in the future [90].
Further improvement of the study was achieved by training boosted decision trees with XGBoost [9] using the full representation of the events which comprise 28 kinematic variables: the transverse momentum of the two leading bottom jets p T (bb) and two leading photons p T (γγ), the ∆R distance, the invariant masses and the Barr variable [91,92] of all combinations of two particles, the bbγγ mass, the azimuthal angle between the leptons and the bottoms pairs ∆φ(bb, ), and the missing transverse energy of the event. In order to tag tth events we also used the number of leptons of the event. We averaged the results of a 10-fold cross validation to assess the robustness of our BDT training procedures. In order to obtain  The signal significance and the signal-to-background ratio of BM5 and BM7 benchmark points for three systematic uncertainties in the background total rates scenarios -5, 10 and 15%. The results of the optimized cut-and-count and the corresponding S/B achieved are displayed in the third and fourth columns, and the BDT analysis in the last column. Also, in the last column, we show in parenthesis the results for the joint BDT+cuts optimization.
the best result possible, we tuned the BDT hyperparameters and the cut thresholds jointly. By doing this, we find the best compromise between cut-and-count and the multivariate analysis.
The BDT classification increases the signal significance for both benchmark scenarios, predicting discovery for the BM5 and evidence for BM7 for systematics ranging from 5 to 15%. In the case of BM5, a 5σ discovery might be possible for around 2 ab −1 in the bbγγ channel.
In the last column of Table III we display two significances: one by cutting on the BDT scores distributions of signal and background after tuning only the kinematic cuts but keeping BDT hyperparameters fixed, reaching 6.3σ and 3.3σ for BM5 and BM7, respectively, in the 5% systematics scenario. The other number, in parenthesis, represents the significance achieved by jointly optimizing cuts and BDT hyperparameters. In this case, the significances increase slightly for all systematics but the joint optimization algorithm learns to soften S/B even further, making the significance prospects insensitive to systematic uncertainties in the background rates. The final cut on the BDT scores shown in Fig. 6 is also optimized in order to get the maximum significance possible. The typical best BDT score cut is around 0.7 which corresponds approximately to a 80% efficiency for signals and 80% rejection for backgrounds resulting in around 12(4) signal events against 3(1) expected background events for the BM5(BM7) point assuming 3 ab −1 . We use the profile likelihood formula of Ref. [93] which approximates well the true Poissonian statistics and embodies systematic uncertainties in the background rates to compute our signal significances.

V. CONCLUSIONS
Understanding the EWPT is an important goal of current and future experiments. We have explored the complementarity of the HL-LHC and proposed space-based gravitational wave detectors in achieving this goal.
We have taken the simplest template where this complementarity can be probed -the xSM model -and studied several benchmarks that are compatible with a first order EWPT.
We first calculated their gravitational wave energy spectra and signal-to-noise ratio for proposed experiments, being careful about subtle issues pertaining to the bubble wall velocity and the hydrodynamics of the plasma. Then, we took the most optimistic benchmarks and performed a collider study of double Higgs production using machine learning tools for two learning tasks: (1) to search for optimum cut thresholds and BDT hyperparameters, and (2) discriminate signal and background events with BDTs. Our results show that state-of-the-art machine learning tools can be quite powerful in probing these processes, even assuming substantial systematic uncertainties. There are several future directions. The tension between requiring bubble wall velocities small enough to produce a net baryon number through the sphaleron process, and large enough to obtain appreciable gravitational wave production, merits further study and a more comprehensive understanding of the parameter space in concrete models. A deeper understanding of the mechanism of gravitational wave production will be needed to obtain more realistic benchmark models. On the collider side, other final states of di-Higgs, such as bbW + W − , can be studied at these realistic benchmarks using the multivariate tools we have discussed.

VI. ACKNOWLEDGMENTS
A. Alves thanks Conselho Nacional de Desenvolvimento Científico (CNPq) for its financial support, grant 307265/2017-0. K. Sinha is supported by the U. S. Department of Energy grant de-sc0009956. T. Ghosh is supported by U. S. Department of Energy grant de-sc0010504 and in part by U. S. National Science Foundation grant PHY-125057. HG would like to thank Hao-Lin Li for helpful discussions.