Higgs pair production: choosing benchmarks with cluster analysis

New physics theories often depend on a large number of free parameters. The phenomenology they predict for fundamental physics processes is in some cases drastically affected by the precise value of those free parameters, while in other cases is left basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics predicted by different models; a clustering algorithm using that metric may allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmarks are then guaranteed to be sensitive to a large area of the parameter space. In this document we show a practical implementation of the above strategy for the study of non-resonant production of Higgs boson pairs in the context of extensions of the standard model with anomalous couplings of the Higgs bosons. A non-standard value of those couplings may significantly enhance the Higgs boson pair-production cross section, such that the process could be detectable with the data that the LHC will collect in Run 2.


Introduction
After the Run 1 discovery of a new scalar particle at 125 GeV [1,2] the LHC experiments are now looking forward to the data they are collecting in Run 2 and in the higher-luminosity phases that will follow. New discoveries are possible with the significantly increased centreof-mass energy of proton-proton (pp) collisions and the foreseen integrated luminosity. The new data will also enable a deep investigation of the 125 GeV particle. To test if the latter can be identified with the Higgs boson predicted by the Standard Model (SM), 1 and very generally to probe the mechanism of electroweak symmetry breaking (EWSB), it is of the utmost importance to measure the scalar potential.
In the SM Lagrangian the Higgs potential contains a quartic self interaction of the Higgs doublet. The interplay of this term with the negative-sign mass term −µ 2 drives electroweak symmetry breaking (EWSB) (see however [3]). One physical state, the Higgs boson h, remains in the theory, with cubic and quartic self-couplings resulting from the JHEP04(2016)126 quartic interaction of the scalar doublet. The measurement of these self-couplings, possible with the study of multi-Higgs production, allows us to gain information on the scalar potential. In the SM scenario the strength of all Higgs boson couplings is precisely predicted; deviations from those predictions would thus imply the existence of beyond-Standard-Model (BSM) physics. A high-statistics study of the couplings of the newly discovered boson may therefore reveal whether we are in the presence of the last building block of the SM, or rather of the first one of a new physics sector.
The idea of probing BSM physics scenarios (especially the Higgs trilinear coupling) in non-resonant Higgs boson pair production at proton-proton colliders dates far before the top quark discovery and the LHC design [4]; a large number of studies have been performed since then. After the Higgs boson discovery, many authors have investigated different phenomenological aspects of the topic (see for example [5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][19][20][21][22][23][24][25][26][27]); most of the phenomenology-driven works have focused on the effects of a variation of the Higgs boson trilinear coupling λ. In the present work we consider that any kind of coupling deviation from the SM Higgs sector is a proof that the SM is not complete; therefore all possible Higgs boson couplings should be considered as ingredients of a BSM extension of the Standard Model. 2 The parameter space resulting from that interpretation is multi-dimensional; its systematic study calls for a principled approach, which we aim to provide here.
In this paper we will focus on gluon-gluon fusion (GF) production of Higgs boson pairs, which is the simplest process available to probe the Higgs boson self-coupling at the LHC. The possible BSM deviations in inclusive di-Higgs production arising in other production modes, such as vector-boson fusion [28,29] or associated production with top quarks (see for example [30,31]) or vector bosons, probe a different set of parameters in the context of an effective-field theory (EFT) as the one we are considering. The interpretation of the results of those channels should be studied separately.
Measurements performed by the ATLAS and CMS collaborations with Run 1 LHC data have already started to constrain the value of some of the Higgs boson couplings [32,33]. Due to interference effects, even small modifications of some of the couplings within the constraints posed by measurements may change drastically the di-Higgs signal topology, and enhance the production cross section enough to make the process accessible with Run 2 data. For that to happen, the observable features of the final state need to be exploited in an optimized way, given the huge cross section of physical processes yielding irreducible backgrounds. This implies making full use of the distinguishing characteristics of signal events in the multi-dimensional space of their observable features. What is needed is therefore the identification of a manageably small set of benchmark points which are maximally representative of the largest possible volume of the unexplored parameter space. The investigation of those points in as much detail as possible will effectively provide information on the full model space.
We employ a very general parametrization to sample the di-Higgs signal topology, assuming the absence of new heavy particles accessible at the LHC energy, which can for JHEP04(2016)126 example describe the effects of strongly-coupled BSM theories (without being limited to those); we provide it in section 2. The rest of this work is organized as follows. In section 3 we describe in detail the technique we devised to determine the similarity of final state densities in the space of observable kinematics, and the clustering procedure which uses that measure of similarity to identify homogeneous regions in the parameter space. In section 4 we describe the application of the technique to determine optimal benchmarks for the study of anomalous di-Higgs boson production, and we discuss the special features of the resulting partition of the parameter space. We finally draw some conclusions in section 5. In the appendix we also provide the coefficients of our parametrization of the di-Higgs production cross section.

Sampling signal kinematics in the Higgs couplings basis
In the SM Higgs boson pair production occurs predominantly by gluon-gluon fusion (GF) via an internal fermion loop, where the top quark contribution is dominant. This is because the Higgs boson couplings are exclusively controlled by the particle masses; couplings to light quarks are negligible. The extension of the latter feature as an assumption for BSM theories is well motivated if the Higgs sector is minimal (see also [34]). In the absence of new light states, the GF Higgs boson pair production at the LHC can then be generally described (to leading approximation) by five parameters controlling the tree-level interactions of the Higgs boson. These five parameters, which will be discussed in detail in the following, are κ λ , κ t , c g , c 2g , and c 2 . The Higgs boson trilinear coupling and the top Yukawa interaction do exist in the SM Lagrangian, where the former is given by λ SM = m 2 h /2v 2 , with v the vacuum-expectation value of the Higgs field. Deviations from SM values are parametrized with the multiplicative factors κ λ and κ t , respectively. The contact interactions of the Higgs boson with gluons and those coupling two Higgs bosons with two gluons or a topantitop quark pair, which could arise through the mediation of very heavy new states, are instead genuinely not predicted by the SM; they can be parametrized by the absolute couplings c g , c 2g , and c 2 . The relevant part of the Lagrangian then takes the form In fact, this Lagrangian follows from extending the SM with operators of mass dimension 4 < D ≤ 6 in the framework of an effective field theory (EFT), encoding the effects of new heavy states currently beyond experimental reach. In the case of a linear realization of EWSB, one obtains the EFT relation c 2g = −c g [35][36][37][38]. 3 In eq. (2.1) we have assumed the absence of any other light state in addition to the SM particles. In the presence of JHEP04(2016)126 such states, the kinematic structures will in general be further modified [5]. 4 In addition, we do not consider CP-violating BSM effects. Finally, we recall that the bottom quark Yukawa coupling κ b in the EFT is already constrained within 0.75 < κ b < 1.25 by LHC data [43], excluding large enhancements and justifying its omission in our framework as a good approximation. The different Feynman diagrams contributing to a di-Higgs boson signal in pp collisions at leading order (LO) are shown in figure 1. In eq. (2.1) we included operators with higher orders of the Higgs boson fields, which are for example a common feature of models where the Higgs doublet is a pseudo-Goldstone boson of a new strong (broken) symmetry and its effective interactions come from field expansions [35,44]. The translation of our parametrization to the flavour-diagonal Higgs basis (see [45,46]), which has been endorsed by the LHCHXSWG document [47] as a general EFT basis to be used to derive experimental results, is trivial; for simplicity we prefer to keep the notation of eq. (2.1). Any dimension-6 EFT basis is related to the Higgs basis by analytical relations among the coefficients; the automation of basis conversions is under development [48].
The differential cross section of the full process under consideration is proportional to the matrix element squared. We may write the square of the full matrix element (ME) at LO as In models with an extended (non-decoupled) Higgs sector, the coupling of the Higgs boson to bottom quarks might also be strongly enhanced in the limit of large scalar mixing. The topology of double Higgs boson production would consequently be modified: besides the presence of a component of the signal initiated by bottom fusion, the gluon-fusion topology would be modified since loop factors would now contain a non-negligible component with a low-mass quark [39]. An enhanced Higgs boson coupling to bottom quarks is not the only physical effect that could arise in Supersymmetric (SUSY) theories, where additional SUSY scalars are predicted (see for example [40][41][42]). In the context of SUSY scenarios nonresonant di-Higgs boson production is a wide topic, and we do not discuss it further in this document, although our treatment may still be useful in the decoupling limit, where all new states are heavy.

JHEP04(2016)126
where the various terms in the matrix element expression contribute differently in different regions of the kinematic space. Above, M j identifies the matrix element piece where the parameter j is entering at LO, and M corresponds to the box diagram. Ideally, the bulk of the higher-order corrections do not spoil this structure to reasonable approximation.

Cross section
The full cross section of GF Higgs boson pair production can be expressed by a polynomial in terms of all the model parameters as In proton-proton collisions at 13 TeV the SM prediction is σ SM hh = 34.3 fb ± 9% (scale) ± 2% (PDF), while at 8 TeV it is σ SM hh = 9.96 fb ± 10% (scale) ± 2% (PDF). 5 Those values are based on recent studies [52,53] which use the CT10 PDF set [54] and employ as input the mass values m h = 126 GeV, m t = 173.18 GeV, and m b = 4.75 GeV. At LO the scattering amplitude for the gg → hh process contains terms with different loop structures, corresponding to the different operators. The real emissions for the gg → hh process are not trivial to compute; the corresponding diagrams would contain up to pentagons to be matched with parton showers. Different groups of phenomenologists are progressing in the calculation of (N)NLO predictions matched to shower-level effects for the GF di-Higgs boson production process, especially for the SM case; see for example [50,[55][56][57][58].
The simulation setup used in this paper was produced by the authors of [59]. The LO process is already at one-loop level; in the approach followed in [59], loop factors are calculated on an event-by-event basis with a Fortran routine on top of an aM C@N LO [60,61] effective model; the NN23LO1 PDF set [62] is used. Those simulations represent the stateof-the art in the description of BSM di-Higgs boson production. We simplify the task of mapping the five-dimensional parameter space of BSM theories in the limit of the present computational capability by assuming that each of the matrix element pieces of eq. (2.2) gets corrected by an overall k-factor, as written in the first equality of eq. (2.4). As a second step we make the stronger assumption that the k-factors related to the different ME pieces are equal, and that they may be taken as a k-factor derived for the SM case, leading to the second equality in eq. (2.4): The above approximations are expected to be good for QCD-like radiative corrections when quoting the total cross section. Indeed, the enhancement in the total cross section from QCD NLO corrections is mainly due to soft gluon radiation from the initial state [58]. For a characterization of the differential distributions, on the other hand, the description outlined above might not be entirely satisfactory. Bearing in mind that potential caveat, JHEP04(2016)126 we decided to use it for this study as it facilitates the mapping of experimental results derived with LO simulations to the results of a radiative corrected calculation. Using eq. (2.4) it is possible to calculate the coefficients of the polynomial (2.3) by evaluating the results of LO computations in different points of the five-dimensional parameter space. For each considered point, using the setup mentioned above, we generate 20,000 pp collision events at 13 TeV centre of mass energy, producing a final state of two Higgs bosons. The resulting cross sections are then fit with a maximum likelihood technique to the polynomial (2.3). In order to ensure a stable fit we inspect six orthogonal two-dimensional planes in the five-dimensional parameter space that all contain the point corresponding to the SM. The procedure used to derive the coefficients of the polynomial and the numerical results for the fitted parameters is detailed in [63]. Figure 2 shows the resulting cross section in the two-dimensional planes mentioned above. The range of parameters considered in our study is discussed below.

Parameter space study
In order to carry out a phenomenological study of Higgs boson pair production in BSM theories we need to first define the range of parameter variations we are willing to consider. Some of the parameters relevant to the production phase are basically unconstrained by single Higgs boson measurements: among these are the triple Higgs coupling and the di-Higgs boson contact interactions with top quarks and gluons. Others, such as the top Yukawa coupling, are already constrained by experimental results [32,33]. A precise

JHEP04(2016)126
interpretation of all experimental bounds as constraints on the effective operators and their effect on the considered process is not trivial, as the parameters do not only affect the predicted rate of di-Higgs boson production, but also the kinematics of the final state.
Measurements of single Higgs boson production performed so far at the LHC already constrain the κ t and c g parameters. The combination of those results using the κ formalism [64] shows that, by marginalising over all other Higgs boson couplings, the allowed values of κ t are constrained at 95% C.L. in the region between 0.5 and 2.5. A Bayesian analysis of BSM operators based on available measurements of Higgs boson properties constrains the c g parameter to be at most at the O(1) level [43,[65][66][67][68]. The remaining parameters are constrained only by absolute cross section limits on inclusive di-Higgs boson production [69] as explained below.
The current experimental limits on non-resonant Higgs boson pair production in the SM come from the study of the γγ bb and 4b final states at 8 TeV by ATLAS [69,70]. Those limits are respectively σ SM hh→γγbb < 5.72 fb (220 times the SM value) and σ SM hh→4b < 202 fb (56 times the SM value). Both results above were derived by counting experiments and they involve no strong assumptions on the signal topology in the final state other than the presence of two Higgs bosons. Considering the ATLAS results extended to all the parameter space, we find the |κ λ |-only variation to be constrained at 95% C.L. in the region |κ λ | 15. 6 Following a similar approach the c 2 parameter can be constrained to |c 2 | < 5 at 95% CL when κ λ = 1 and κ t ⊂ [0.5 , 2.5].
A cursory look at the kinematics of the final state, as described in any of the already cited phenomenological studies, suggests that different choices of the coupling parameters give rise to striking differences in the density functions of the kinematic observables. This convinces us of the need of a systematic approach to characterize the signal topology.
In order to retain generality of the results of our study for any final state of di-Higgs boson production, and invariance to further analysis cuts and/or analysis techniques, we study the event topology as it results from the production of the two Higgs bosons free from initial-state radiation effects and before the subsequent decays and final-state radiation effects. The study is performed with an extended sampling of parameter space points with respect to the ones used to calculate the total cross section in last section; again, no generation cut is applied to the processes. For each studied point of the parameter space we generate 20,000 events of pp collisions at 13 TeV centre of mass energy. These are sufficient for the task of understanding how the event kinematics varies as a function of the model parameters.
We are considering a 2 → 2 process at leading order. The two Higgs bosons are produced with identical transverse momenta (p h T ), and they are back-to-back in azimuth at this order (before a parton shower). The final state can then be completely defined by

JHEP04(2016)126
three kinematic variables, if we ignore the irrelevant azimuthal angle of emission of the bosons. Furthermore, one of the three remaining variables can be used to isolate all the information related to the PDF of the colliding partons, which is also irrelevant to the physics of the production process once one focuses on a specific initial state (the gluongluon fusion process). The variable factorizing out the PDF modelling can be taken as the magnitude of the boost of the centre of mass frame as seen in the laboratory frame.
The two remaining variables, which provide direct information on the physics of GF di-Higgs boson production, can be chosen to be the invariant mass of the di-Higgs system (m hh ) and the modulus of the cosine of the polar angle of one Higgs boson with respect to the beam axis (| cos θ * |). Since we are using parton-level information, this last variable is equivalent to the polar angle in the Collins-Soper frame (| cos θ * CS |) [71], which is commonly used in experimental analysis. The variables m hh and | cos θ * | can thus be used to fully characterize the final state kinematics produced by different choices of the value of anomalous Higgs boson (self-) coupling parameters.

Classification of final state kinematics
The choice of benchmarks for the study of a new physics model is usually a difficult task, as it obliges to several partly conflicting desires. While the collection of benchmarks should in principle offer an exhaustive representation of the varied final state composition and topologies that the new physics model may give rise to, one's choice of the specific values of the model parameters to study in more detail often falls on those which are within the sensitivity reach of a specific amount of data collected by a given experiment at a given time. In that case the focus is usually on the cross section of the new physics signal, which is identified as the most important factor. As it happens with the drunkard who lost his watch in the dark and only searches it under the street lamps, this approach is guaranteed the highest short-term impact but may not be the most principled if one takes a long-term perspective.
The case of Higgs boson pair production at the LHC offers a peculiar situation, as in the short term we will be unable to achieve experimental sensitivity to the largest part of the BSM parameter space. Furthermore, anomalous Higgs boson pair production processes are characterized by a final state which is homogeneous in its composition, as opposed to, e.g., SUSY production processes, which give rise to a quite rich and diverse set of final states depending on the exact choice of theory parameters. Within that homogeneous final state, anomalous di-Higgs boson production offers quite varied kinematics as a function of the model parameters. This makes it an ideal ground for a principled and quantitative approach to the choice of benchmark points.
In light of the above considerations we take the problem from the side of shape information rather than normalization. By identifying sets of parameters which yield similar final state kinematics we simplify the problem of investigating a large and unconstrained model space. The resulting partition of the space will remain useful as the integrated luminosity collected by the LHC experiments grows from tens to hundreds of inverse femtobarns.

JHEP04(2016)126
The task of partitioning the parameter space into homogeneous regions can be performed with cluster analysis techniques. These allow the grouping of elements of a set into subsets in such a way that members of each subset are mutually more similar to one another than are elements belonging to different subsets. The similarity will, in our case, be described by an ordering parameter which is constructed with the event kinematics.

Two-sample tests
In order to define a metric to classify physics models based on the similarity of the event kinematics they describe in the feature space, we need to choose a general statistical framework as well as a suitable two-sample test statistic. At first, we might consider the problem as one of hypothesis testing. Accordingly, we would define a test size α and a null hypothesis H ij 0 for each pair of parameter space points i and j, the null hypothesis being that the corresponding data samples S i and S j share the same parent probability density function -or, in other words, that models i and j describe the same physics. The choice of a test statistic and its evaluation on all pairs of samples would then allow us to populate a matrix describing the mutual compatibility of the samples, in the form of a set of pass/fail bits. Clearly, such a result would not be practical for the task of grouping samples into subsets of similar characteristics. Furthermore, it must be noted that as we start with samples which do originate from distinguishable parameter space points and which yield different density functions, it is only the lack of infinite statistics what might prevent us from calling two samples passing the test as "different" from an experimental point of view.
We may turn the limited statistics of the datasets to our advantage if we realize that what we need is an analog answer rather than a digital one: a degree of similarity between each pair of samples must take the place of the yes/no answer of the hypothesis test. A test statistic (TS) such as a χ 2 probability or a likelihood value may be used to determine which samples should be grouped into homogeneous subsets.
There exists a large variety of two-sample tests suitable for the task at hand. To name a few, one may use the Anderson-Darling test [72], the Kolmogorov-Smirnov test, the χ 2 test, the T test, or others. The ones mentioned above are usually single-dimensional tests, in the sense that they are meant to compare two single-dimensional distributions; their extension to multi-dimensional data is not always straightforward, as it is subject to implementation choices that call for detailed power studies. 7 In a multi-dimensional setup possible choices also include the Energy test [73] or nearest-neighbour-based metrics. Such unbinned multi-dimensional TS may be the right choice in situations when the statistics of the samples to be compared are very small, or when the dimensionality of the problem is large. In the specific case of non-resonant di-Higgs boson production, however, we found that the TS with highest power to detect localized differences in the kinematic distributions is a likelihood ratio based on Poisson counts in a set of two-dimensional bins. That is the solution we investigate and discuss in this work.

The likelihood ratio test statistic
In the specific application described here, the numerousness of our generated datasets (20,000 events per sample) and the small dimensionality of the feature space that completely defines the final state of the process (two variables) allow us to employ as test statistic a binned likelihood ratio. The number of bins in m hh and | cos θ * | are chosen such that the main kinematic features of the distributions are properly modelled while retaining sufficiently populated bins. We found appropriate for our application to have fifty 30-GeVwide bins in m hh in the range from zero to 1500 GeV and five 0.2-wide bins in | cos θ * | from zero to one.
To define our test statistic let us first consider the hypothetical case in which the two samples under test share the same parent distribution. The corresponding likelihood function is the product over the bins of the probability to observe n i,1 and n i,2 event counts in bin i from the two samples S 1 and S 2 . This probability is given by the product of two Poisson distributions Pois(n i,1 |μ i ) × Pois(n i,2 |μ i ) whereμ i = (n i,1 + n i,2 )/2 is the maximum likelihood estimate for the expected contents in bin i. However it can be shown that Pois(n i,1 ) × Pois(n i,2 ) = Pois(n i,1 + n i,2 ) × Binomial(n i,1 /(n i,1 + n i,2 )).
( 3.1) It is clear that the first term in the right-hand side of the decomposition does not contain any information about the differences between the density functions of the two samples; it only contains information on the precision of the test. This is what is called, in statistics literature, an ancillary statistic which can be advantageously neglected, as we do in the following. The retained binomial term is explicitly Binomial(n i,1 /(n i,1 + n i,2 )) = (n i,1 + n i,2 )! n i,1 !n i,2 ! 1 2 Now, to obtain a likelihood ratio we consider the case in which the two samples are equal, the so-called saturated hypothesis [74]. The appropriate single-bin-content probability can be obtained from eq. (3.2) by imposing n i,1 = n i,2 =μ i , yielding Calling L the likelihood obtained from the distribution in eq. (3.2) and L S the one from eq. (3.3) we define the log-likelihood ratio which, up to a minus sign, is "χ 2 distributed" [74,75]. Thanks to this property this TS can be directly used as an ordering parameter to perform a cluster analysis. In other words, the values T S ij and T S kl obtained respectively by testing the compatibility of samples ij

JHEP04(2016)126
and kl are suitable to determine if samples S i and S j are more similar to each other than are samples S k and S l : this is the case if T S ij > T S kl . 8 In addition to its distribution-independence the TS of eq. (3.4) is particularly sensitive to small-scale features of the distributions under test, and is thus well suited to our task as we are confronted with samples exhibiting bi-modal structures in the studied spectra (see for instance figure 5). In contrast, TS which are more sensitive to large-scale structure may give precedence to it when used as an ordering parameter in a clustering procedure: we have observed that such behaviour gives rise to unwanted results, whereby bimodal and single-modal distributions are clustered together.

The clustering technique
The clustering procedure must produce a grouping of the parameter space points based on the kinematical distributions of the corresponding final states. Such a task can be performed in a number of ways, yielding in general different results. The algorithm we chose matches our desire to create homogeneous regions of parameter space based on the TS metric, and it allows to univocally identify the sample in each cluster which is the most representative of the set -what we call a benchmark. The benchmark is chosen as the sample which is the most similar to all the other samples associated to the same cluster.
The sample comparisons are pairwise, therefore from N sample tested points we can form N sample (N sample − 1)/2 two-sample test results with the procedure described in section 3.2. We define the following procedure to group samples into a given number of clusters (N clus ): 1. Start by identifying each of the N sample elements as one-element clusters.  decreases, more and more discrepant elements are clustered together; accordingly, the benchmark becomes less and less representative on the whole of the subset that contains it. It is easy to see how the technique outlined above possesses some attractive features for our application. There is always a well-defined benchmark in each cluster, and the criterion by which points are clustered together privileges a maximum intra-cluster uniformity over an average one. In the next section we apply the method to the parameter space points describing BSM di-Higgs boson production, which allows us to show what those properties mean in practice.

Application to Higgs pair production
In this section we discuss the application of the procedure described in section 3 to GF di-Higgs boson production at the LHC. The first step is to identify the set of parameter space points on which we wish to run the cluster analysis. Ideally one would like to start with a regular and homogeneous grid in the five-dimensional parameter space of anomalous couplings described in section 2; however any meaningfully-spaced regular grid would require a prohibitive number of simulated data samples. Instead of using a regularly spaced grid, we focus primarily on the regions of parameter space where the probability densities of the final state observables exhibit the fastest variability with parameter variation. These regions coincide with local minima of the production cross section, as explained below (section 4.3). The resulting population of the five-dimensional grid is admittedly arbitrary; however it is seen a posteriori to be able to picture reasonably well the varied spectrum of topologies of GF di-Higgs boson production. It includes N sample = 1507 points of the five-dimensional parameter space, composed of the following three subsets: • We start with a geometrically well-spaced grid in the slices of

JHEP04(2016)126
• In some regions of parameter space (especially those with c 2 = 0.5 and c 2g = O(1)) there is a strong cancellation between the different operators in the threshold m hh region. This leads to topologies where the distribution of m hh exhibits a long tail to high values. 9 In order to have a better kinematic description of this topology (and as well of the cancellation pattern between operators) we add to the grid one slice of parameter space with c 2 = 0.5 and κ λ = κ t = 1, maintaining the previous binning in the c g − c 2g plane.
• Finally, we also consider a three-dimensional grid of points described by the parameters κ λ , κ t , and c 2 in the hyperplane defined by c g = c 2g = 0.

Choice of the number of clusters
The total number of required clusters (N clus ), and therefore the total number of regions into which the parameter space is divided, is the only free parameter in the clustering procedure described in section 3. The uniformity of the kinematical distributions within each cluster is a qualitative criterion which can be used to choose the target value of N clus . A large number of clusters provides a fine sub-division of the parameter space and guarantees a better uniformity of the kinematical distributions within each cluster. However, a too large number of benchmarks puts a heavy load on the experimental treatment of the data needed to probe the full parameter space. On the other hand, a too small N clus may produce marked differences in the samples grouped together, such that the corresponding benchmark does not appear suitable to accurately represent the behaviour of the subset. In our specific application we have observed that strong discrepancies within the clusters appear when N clus becomes smaller than 12, while for N clus > 12 the differences between the kinematical distributions of the samples included in different clusters are small enough that they should have a limited impact on the extrapolation of results obtained for the benchmark point. Figure 4 shows the m hh distribution for the two clusters that are merged when the number of clusters is reduced by one unity from N clus = 13 to 9. It is evident that when reducing N clus from 13 to 12 there is no significant worsening of the uniformity of the merged cluster, while the same cannot be said in further reducing N clus . Given the good uniformity of the distributions in all clusters, N clus = 12 is the value chosen for the cluster analysis of the 1507 samples of di-Higgs boson production model points. We consider this a reasonable trade-off between homogeneity and numerousness of the clusters.

Kinematical sampling with N clus = 12
The parameter space values of the benchmarks obtained with N clus = 12 are listed in table 1. 10 The benchmarks distribute fairly evenly in the space of model parameters, without concentrations in specific corners of phase space; furthermore, both samples with and without Higgs-gluon contact interactions are represented in the set. Figure 5 shows the m hh and | cos θ * | distributions for all the samples considered in the five-dimensional parameter space, grouped into twelve clusters by the procedure described in section 3. Cluster 3 includes the SM point while cluster 4 includes the sample with unique contribution from the box diagram (κ λ = 0.0, κ t = 1.0, and c 2 = c g = c 2g = 0). Cluster 8, which presents the characteristic doubly-peaked m hh distribution, includes the sample with the maximal interference between the box and triangle contributions in the SM couplings scenario, i.e. the point defined by (κ λ = 2.4, κ t = 1.0, and c 2 = c g = c 2g = 0).  The clustering is clearly driven by the m hh variable. The impact of anomalous physics in | cos θ * | is expected to be small because all the different operators in our parametrization are predominantly s-wave (see for example [39]). This is evident in figure 5, where only few samples exhibit a non-flat structure in | cos θ * |; these correspond to points of parameter space where there is a maximal interference between different terms, as in cluster 8. All spin 2 structures (at the level of D ≤ 6 operators, i.e. to leading approximation) come just from the box diagram. The study of the γγ bb final state of hh decay is expected to be the most sensitive probe to local changes in the m hh spectrum; however other decay channels, such as the W W bb or the bb bb one, could also in principle be sensitive to small shape variations in different regions of hard sub-process energy, especially when multi-variate analysis techniques are implemented. With increased statistics of the available data, fine structures in the kinematics -in particular in the m hh distribution, e.g. in clusters 2, 5, and 8-will become more interesting and may call for a more specific study of the corresponding regions of parameter space.

JHEP04(2016)126
In figure 6 we show the distribution of the Higgs boson p h T (which is the same for both Higgs bosons at generator level) and the longitudinal momentum of the Higgs boson with the highest energy in the laboratory frame, |p h z |. Figures 5 and 6 visually confirm that m hh and | cos θ * | constitute a robust choice of variables to fully describe the salient features of the 2 → 2 process. The features shown in figure 6 are more directly connected with experimental selections and acceptance cuts, and to the Higgs boson reconstruction techniques. In particular, the Higgs boson transverse momentum distributions allow one to gauge how the different clusters will be affected by baseline selections in the analyses targeting the corresponding benchmarks. The |p h z | variable is highly homogeneous within each cluster, as a result of the good properties of the clustering performed using the m hh variable.
It is important to point out that the clustering procedure applies no special treatment to any of the parameter space points; yet one is especially interested in the point corre-JHEP04(2016)126   Figure 7. Distribution of points in the κ λ × κ t plane that contains the SM point. Downwardpointing triangles symbolize clusters where the benchmark has Higgs boson p T peaking at around 50 GeV or at a smaller value. Circles describe clusters whose benchmark has Higgs boson p T peaking around 100 GeV. Upward-pointing triangles describe clusters where the benchmark has Higgs boson p T peaking around 150 GeV or more. Finally, crosses describe clusters that show a double peaking structure in the p h T distribution.
sponding to the Standard Model prediction. In our clustering with N clus = 12 the SM point is included in cluster 3 and it is well represented by the relative benchmark. An experimental study of the twelve benchmarks should of course be complemented by the study of the SM case; results derived for the latter are likely to be compatible with the ones for the benchmark of cluster 3.

Maps of the clusters in the parameter space
In this section we attempt a direct mapping of the partition of the parameter space into the twelve regions found to produce homogeneous kinematical densities, using the choice of N clus = 12. We organize our results in slices of parameter space, plotting the distribution of the clusters in each of them. There is no logical ordering in the numbering of the clusters; we choose markers of different shape and colour to describe how clusters spread along the different parameter space regions. Figure 7 shows the clusters distribution in the κ t × κ λ plane, which we will call SM-like plane. The iso-contours of constant cross section σ hh as computed in section 2.1 are shown by gray lines. We point out how the parameter space region around the SM benchmark in the SM-like plane is especially interesting. At LO, changes in the top Yukawa parameter as small as 30% and/or in the Higgs trilinear coupling of O(1) times the SM drive modifications of the differential cross section in p h T from single-peaked structures to more complex two-peaked shapes where the peaks are separated by O(100) GeV. As a logical corollary of what is noted above, however, one should expect the kinematical behaviour of the SM benchmark to be quite sensitive to the accuracy of the theoretical calculation. This accidental sen- sitivity of the kinematic behaviour to parameter values is due to the fact that the SM point is located near a cross section minimum, where there are fine cancellations between triangle and box diagrams. Figure 8 shows the clusters in the plane κ t × c 2 for different values of κ λ , when c g = c 2g = 0. We observe that outside the SM-like plane there is no clear asymptotic behaviour of the event kinematics with |κ λ | 1. This confirms that asymptotic approximations of the different pieces of eq. (2.2) are not useful for a deep parameter space scan. Figure 9 shows the map of clusters in various slices of the five-dimensional parameters space, the same used in section 2.1 for the calculation of the cross section modifications. Figures 7 and 8 suggest that the largest modifications in the final state kinematics are tightly related to the minima of the cross section. Since all the matrix-element pieces not related with interferences (|M i | 2 ) are positive-definite, the minima of the cross section in each slice of parameter space are partially a reflection of regions where the interferences between the different processes are large in comparison with the other ME pieces. The correspondence however is not bilateral: the balance between the non-interference terms can also drive visible changes in shapes while not affecting much the total cross section. As an additional qualitative proof of the close correspondence between the cross section minima and the regions of largest variation of the density of the kinematical distributions in the final state, we show in figure 10 a few maps of the cross section of di-Higgs boson production in two-dimensional subspaces of the five model parameters, with overlaid colour maps describing the magnitude of the point-to-point variations in the value of the loglikelihood test statistic defined in section 3. The latter describe the speed with which the m hh and | cos θ * | distributions vary, highlighting the effect of the cancellation of diagram contributions mentioned above.

Conclusions
The study of Higgs pair production processes at the LHC may evidence the existence of BSM physics in the form of anomalous (self-)couplings of the Higgs boson. While both the total Higgs pair-production cross section and the kinematics of the final state depend on those couplings, it is the latter that impact the most the design of experimental techniques aimed at measuring the process. In this article we described a procedure to define suitable benchmark points in the multi-dimensional parameter space spanning the possible value of the anomalous couplings. The procedure optimally chooses the benchmarks such that the study of the resulting physics has the largest impact on the exploration of the parameter space.
The technique we propose is based on the definition of a test statistic measuring the similarity of the kinematics of the Higgs boson pairs in the final state resulting from differ-  Figure 10. Superposition of isolines of cross section and colour maps of the speed at which the likelihood test statistic described in section 3 varies as one moves around in three selected twodimensional surfaces of the five-dimensional parameter space of BSM theories. The cross section decreases in the direction where the density of isolines decreases. Blue and red tones in the colour maps indicate the highest variation in the T S values; the colour scale is arbitrary. The behaviour observed in the graphs is common to all investigated two-dimensional planes. ent parameter space points, and a suitable clustering procedure to group parameter space points together. Although it finds a very profitable application to the case of Higgs boson pair production at the LHC, the technique is quite general and may successfully be employed in other physics studies.
We study gluon-fusion-initiated di-Higgs boson production in 13 TeV proton-proton collisions and examine an extensive but not exhaustive set of subspaces of the fivedimensional space of anomalous couplings. We find that twelve benchmarks are sufficient to describe to a reasonable level of approximation the possible different kinematic densities that may arise from arbitrary combinations of the parameters. We argue that an experimental study which focuses on those twelve scenarios should maximize the impact on the exploration of the parameter space, without leaving unexplored "holes".
The grouping of parameter space points is also meant to allow one to extrapolate the results of an experimental search performed on a benchmark point to all other points of the cluster which contains the benchmark. Whether such an appealing plan is feasible remains to be proven, as it depends on the homogeneity of the intra-cluster kinematics as well as on the statistical power of the experimental data. A detailed study of the degree of validity of such extrapolations will be the subject of future investigations.

A Cross section coefficients
In this appendix we summarize the procedure we followed to fit the coefficients A i in the cross section ratio, eq. (2.3). In principle to fix the fifteen coefficients in a recursive way one needs to calculate the total cross section only for fifteen selected points in parameter space. The cross section estimates are however obtained with a Monte Carlo generator and they may contain intrinsic errors, related for example to the finite statistics in phase space integration. Moreover, the final result also includes uncertainties due to PDF errors and missing higher orders, captured by scale variations.  Table 2. Coefficients of our fit to the cross section modifications of double Higgs production in proton-proton collisions with respect to the SM benchmark (eq. (2.3)). See [63] for the relative theory uncertainties.
In order to properly account for the effects mentioned above, and in particular to judge the associated stability of the fitted coefficients of eq. (2.3), the fit must be performed using a large data sample. Such a study is described in [63], and the result is reported in table 2.
In addition to the coefficients for the cross sections at 13 TeV, for completeness we also provide the values of the cross section coefficients for pp collisions at 7, 8, 14 and 100 TeV centre-of-mass energy. The theory uncertainties on the cross sections defined by eq. (2.3) and table 2 are also evaluated and can be found in [63].