Exploring anomalous couplings in Higgs boson pair production through shape analysis

We classify shapes of Higgs boson pair invariant mass distributions mhh, calculated at NLO with full top quark mass dependence, and visualise how distinct classes of shapes relate to the underlying coupling parameter space. Our study is based on a five-dimensional parameter space relevant for Higgs boson pair production in a non-linear Effective Field Theory framework. We use two approaches: an analysis based on predefined shape types and a classification into shape clusters based on unsupervised learning. We find that our method based on unsupervised learning is able to capture shape features very well and therefore allows a more detailed study of the impact of anomalous couplings on the mhh shape compared to more conventional approaches to a shape analysis.


Introduction
The Higgs sector as we see it today is probably just a glimpse of an underlying more general structure still awaiting to be explored. Manifestations of new physics at higher scales would lead to operators which on one hand introduce new, effective couplings and on the other hand also modify interactions known in the Standard Model (SM). Therefore it is a primary goal for collider physics in the next decades to constrain the couplings, in particular in the Higgs sector, to an unprecedented precision. This is particularly true for the Higgs boson self couplings, in order to find out whether the Higgs potential is indeed of the form assumed by the SM. Deviations from this form could provide strong hints about how to extend the SM.
The trilinear Higgs boson coupling λ can be constrained by measurements of Higgs boson pair production [1,2], where the gluon fusion channel yields the largest cross section, and the most stringent 95% CL limits on the total gg → HH cross section at √ s = 13 TeV are currently σ HH max = 6.9 × σ SM , constraining trilinear coupling modifications to the range −5.0 ≤ λ/λ SM ≤ 12.0 [1].
The trilinear Higgs couplings can also be constrained in an indirect way, through measurements of processes which are sensitive to these couplings via electroweak corrections [3][4][5][6][7][8][9][10][11][12][13][14]. Such processes offer important complementary information, however they are susceptible to other BSM couplings entering the loop corrections at the same level, and therefore the limits on c hhh = λ/λ SM extracted this way may be more model dependent than the ones extracted from the direct production of Higgs boson pairs. A corresponding experimental analysis based on single Higgs boson production processes has JHEP03(2020)091 been performed [15], and recently combined constraints from single and double Higgs boson production became available [16]. The idea of indirect constraints through loop corrections also has been employed trying to constrain the quartic Higgs boson self-coupling from (partial) EW corrections to Higgs boson pair production [17,18].
Theoretical constraints on c hhh are rather loose if derived in a largely model independent way. Recent work based on general concepts like vacuum stability and perturbative unitarity suggests that |c hhh | 4 for a new physics scale in the few TeV range [19][20][21][22]. More specific models can lead to more stringent bounds, see e.g. refs. [23][24][25][26][27]. Recent phenomenological studies about the precision that could be reached for the trilinear coupling at the (HL-)LHC and future hadron colliders are summarised in refs. [28][29][30].
Higgs boson pair production in gluon fusion in the SM has been calculated at leading order in refs. [31][32][33], and at NLO in the m t → ∞ limit, rescaled with the full Born matrix element, in ref. [34]. ref. [35] contains the full top quark mass dependence in the real radiation, while the virtual part is calculated in the heavy top limit. The NLO QCD corrections with full top quark mass dependence became available more recently [36][37][38]. Implementations of the full NLO QCD corrections in parton shower Monte Carlo programs are also available [39][40][41].
NNLO QCD corrections have been computed in the m t → ∞ limit in refs. [42][43][44][45][46]. The calculation of ref. [46] has been combined with results including the top quark mass dependence as far as available in ref. [47], and the latter has been supplemented by soft gluon resummation in ref. [48].
The scale uncertainties at NLO are still at the 10% level, while they are decreased to about 5% when including the NNLO corrections. The uncertainties due to the chosen top mass scheme have been assessed in ref. [38].
Analytic approximations for the top quark mass dependence of the two-loop amplitudes in the NLO calculation have been studied in refs. [49][50][51][52][53]. Complete analytic results in the high energy limit have been presented in ref. [54]; the latter have been combined with the full NLO results in the regions where they are more appropriate in ref. [55].
It is well known that Higgs boson pair production in gluon fusion is a process where delicate cancellations occur between contributions containing the trilinear Higgs coupling and box-type contributions not containing the trilinear coupling. While the destructive interference between these contributions is usually seen as a curse leading to small cross sections, it can be turned into a virtue when analyzing the shapes of distributions, as for example the di-Higgs invariant mass distribution m hh , because even small anomalous couplings can lead to characteristic shape changes. Therefore it is important to investigate in which way the shapes are influenced by a certain configuration in the coupling parameter space.

JHEP03(2020)091
The idea of a shape analysis has been pursued already in various ways based on LO studies, see e.g. refs. [58-61, 71, 72]. In ref. [61], a cluster analysis is proposed to define 12 benchmark points in a 5-dimensional non-linear EFT parameter space which result from clusters of "similar" shapes. The similarity measure in this case is based on a binned likelihood ratio using LO predictions for the observables m hh , cos θ * and p T,h . In ref. [70] it was analyzed how the m hh and p T,h distributions change when going from LO to NLO for the benchmark points defined in ref. [61].
As a function of the 5-dimensional coupling parameter space, the m hh distribution can have a few characterising features, such as an enhanced low-m hh region, a double peak, a single peak or an enhanced tail. Some of these features can be attributed rather easily to a certain anomalous coupling, for example, an enhanced low-m hh region is naturally produced by large values of |c hhh |. Other features of the m hh -shape, like a double peak or a SM-like distribution, are harder to attribute to a certain coupling configuration, as there are a multitude of configurations leading to such shapes. This is also reflected in the cluster analysis proposed in ref. [71], where (a) very different coupling configurations can end up in the same cluster, and (b) the same cluster can contain shapes which "by eye" look quite different (for example "double peak" and "single peak").
Therefore it is desirable to seek for alternative methods to extract information about the underlying parameter space from the shape of distributions in Higgs boson pair production. In this work we first classify the shapes of Higgs boson pair invariant mass distributions, calculated at full NLO, into four characteristic types. We visualise the underlying 5-dimensional EFT parameter space producing these shape types, projecting onto 2-dimensional subspaces. We also comment on the shape of the p T,h distribution. Then we refine the shape analysis, applying an unsupervised learning algorithm based on an autoencoder to identify patterns in the shapes of the m hh distribution. We use the KMeans clustering algorithm from scikit-learn [73] and ask for a classification of the shapes into a certain number of clusters. One aim of this study is to offer an alternative to the cluster analysis proposed in refs. [61,71,72] and earlier work, another aim is to provide an analysis based on full NLO results. The present study allows us to associate certain shapes more globally with distinct regions in the parameter space, in this sense going beyond a benchmark point analysis. Nonetheless, to facilitate a future more quantitative analysis, for example a profile likelihood study, we identify new benchmark points, based on the cluster centers given by our procedure.
The remainder of this work is structured as follows: in section 2 we explain the framework our data samples are based on. We define four different shape types for the m hh distribution and visualise the parameter space underlying the predefined shape types. In section 3 we describe our cluster analysis based on unsupervised learning and show how JHEP03(2020)091 the clusters found by this procedure relate to the underlying parameter space. We also definine seven new benchmark points, before we conclude.
2 Classification through predefined shape types 2.1 Parametrisation of anomalous couplings in the Higgs sector As a starting point we use the effective Lagrangian in a non-linear Effective Field Theory ("Higgs Effective Field Theory, HEFT") relevant for Higgs boson pair production, assuming CP conservation, up to order 4 in the chiral expansion [70,100]: In the SM c t = c hhh = 1 and c tt = c ggh = c gghh = 0. The chromomagnetic operator is absent in (2.1) because it contributes to gg → hh only at higher order in the chiral counting. The coefficients c ggh and c gghh are related in SMEFT ("SM Effective Field Theory") [30,101,102], however in HEFT there is not necessarily a relation between the two parameters. To clarify the relation to the widely used SMEFT operators, we briefly comment on the SMEFT Lagrangian here. The dimension-6 terms relevant for gg → hh can be written as [59,65] Relating the coefficientsc i in eq. (2.2) to the couplings of the physical Higgs field h and comparing with the corresponding parameters of the chiral Lagrangian eq. (2.1), one finds, after a field redefinition of h to eliminatec H from the kinetic term, Note that, assuming an underlying weakly coupled gauge theory, dimension-6 operators involving field-strength tensors can only be generated through loop diagrams [103]. Their coefficients then come with an extra factor of 1/16π 2 . In this case, the coefficientsc ug andc g in eq. (2.2) are counted as order (1/16π 2 )(v 2 /Λ 2 ), whilec H ,c u andc 6 are still of order v 2 /Λ 2 . For more details about the difference between HEFT and SMEFT we refer to refs. [30,70].
We produce our data using the differential distributions calculated in ref. [70], parametrised in terms of coefficients A i for each coupling combination occurring in the JHEP03(2020)091 (differential) NLO cross section, which allows for a fast evaluation: The coefficients A i are evaluated in bins of width 20 GeV from 250 GeV to 1050 GeV, i.e. for 40 bins. They are available with ref. [70] as .csv tables, in units of fb/GeV, for √ s = 13, 14 and 27 TeV. The median of the statistical uncertainties of the differential coefficients A i does not exceed 3%, however in the bins beyond m hh 650 GeV some A i coefficients have uncertainties in the 20-30% range.

Definition of shape types
We distinguish four types of characteristic shapes for the Higgs boson invariant mass distribution m hh : 1. Enhanced low m hh region, constantly falling distribution as m hh increases.
2. Double peak with peaks separated by more than 100 GeV.
3. Single peak near the tt production threshold at m hh ∼ 346 GeV.

Double peak with peaks separated by less than 100 GeV.
Examples of the four shape types are shown in figure 1. According to our classification the Standard Model shape is contained in distributions of kind 3. Certainly there is some arbitrariness in the definition of these shapes. For example, shapes of kind 4 would move to kind 1 or 3 for bin widths ≥ 100 GeV. However, the other three shape types are quite robust and would be clearly distinguishable experimentally.
Based on the parametrisation in eq. (2.5), the normalised differential cross section is computed for a 5-dimensional grid in the coupling parameter space and according to its behaviour is classified into one of the four shape types. For this purpose we wrote a function, called "analyzer" in the following, that checks the slopes of the distribution and puts it into the corresponding class. At this stage the shape classes are mutually exclusive. For each point in the coupling parameter space, we also consider the variations of the result in each bin due to inclusion of the statistical uncertainties on the coefficients A i . If the shape obtained after these variations belongs to a different kind, we exclude that point from the data set. We find that for shape type 4 about 20% of points fall into this category and are therefore excluded, while for shape type 2 it is about 8%, and for types 1 and 3 it is less than 5%. Scale variations have not been included, as they tend to be rather uniform over the whole m hh range [41,70] and therefore would not significantly modify our shape analysis. JHEP03(2020)091

Classification of m hh distributions
Our results for the gg → hh cross sections at NLO are produced for a centre-of-mass energy of √ s = 13 TeV, using PDF4LHC15 nlo 100 pdfas [104] parton distribution functions interfaced via LHAPDF, along with the corresponding value for α s . The masses have been set to m h = 125 GeV, m t = 173 GeV and the top quark width has been set to zero.
We study the differential cross section as a function of five anomalous couplings, varying them in the ranges specified below, (2.6) The ranges are motivated by current experimental constraints. For c hhh we use a smaller range than suggested by experiment in order to focus more on the range where interesting shape features are present. In order to visualise the results, we project out 2-dimensional slices of the 5-dimensional parameter space, fixing the other three couplings to their SM values. This leads to a total of ten configurations. For each of these ten projections we generated a set of 10 6 parameter pairs. Feeding them through our analyzer we obtain the shape type produced by the given point in the coupling parameter space. The results are shown in figures 2-6. The white diamonds denote the Standard Model point in the parameter space. Scale variations are not included, as they are rather uniform over the whole m hh range [41,70] and therefore would not modify our shape analysis significantly.
JHEP03(2020)091 In figure 2 we display variations of the top quark Yukawa coupling c t versus the trilinear Higgs coupling c hhh (left) and the effective gluon-Higgs couplings, c ggh versus c gghh (right). In all the figures where two couplings are varied, the other three couplings are set to their SM values. It can be clearly seen that the shapes of kind 1, i.e. shapes with an enhanced low m hh region (marked in black), are resulting from larger c hhh values. The total cross section as a function of c hhh is a parabola with a minimum around c hhh ≈ 2.4, while for |c hhh | 3 and c t = 1 the distribution is enhanced in the low m hh region, where the triangletype contributions dominate. Larger/smaller values of c t shift this behaviour towards larger/smaller values of c hhh because they enhance/decrease the box-type contributions. For shapes of kind 2, i.e double peaks with a separation of more than 100 GeV (green), we find that such a shape can be produced for coupling values which are rather close to the SM values. Shapes of kind 3 (red) are SM-like. They only cover about one quarter of the c t − c hhh plane. Shapes of kind 4 (blue) have a double peak separated by less than 100 GeV. For c tt = c ggh = c gghh = 0, such structures only occur for negative values of c hhh , over the whole allowed c t range.
Considering variations of c ggh versus c gghh , shown in the right-hand panel of figure 2, we find only shapes of kind 2 (green) and SM-like shapes (red). The existence of kind 2 shapes means that a double peak structure could be produced solely by effective Higgsgluon couplings, while keeping c hhh , c t and c tt at their SM values. However, for the more likely case that c ggh deviates only slightly from zero [105], and so does c gghh , these couplings do not distort the SM shape significantly.
Variations of c t versus c tt and c hhh versus c tt are shown in figure 3. Varying only c t and c tt , the shapes remain mainly SM-like. A small area in the c t − c tt plane however contains doubly peaked m hh distributions, which thus can originate from anomalous top-Higgs couplings only, while the trilinear Higgs coupling remains fixed at its SM value.
Turning to c hhh versus c tt , displayed in the right-hand panel of figure 3, we find that for kind 1 and kind 4 shapes the parameter regions are split into two disconnected parts.  While shapes of kind 1 are favoured by large values of c hhh , it becomes clear that large values of c tt , also related to triangle-type diagrams, can counterbalance this effect, because the top right corner is not a parameter region producing shapes of kind 1. If both c hhh and c tt are large, it is more likely to produce a double peak structure with close-by peaks (kind 4, blue). Further we see that shapes of kind 2 (well separated double peak structure, green) can be produced by values of c tt and c hhh which are rather close to the SM values. Figure 4 shows variations of c t versus c ggh (left) and c t versus c gghh (right). The parameter space is dominated by SM-like shapes (red), however double peaks can occur as well (green). We also see that c gghh acts similarly to c ggh in what concerns the shape.

JHEP03(2020)091
For variations of c hhh versus c ggh , shown in figure 5 (left), all four shape types can occur. The parameter region related to kind 1 (enhanced low m hh , black) is at high values of c hhh as expected, and the kind 2 shapes (well separated double peak, green) can be seen as a transition from kind 1 to kind 3. Close-by double peaks (kind 4, blue) however are mostly associated to negative c hhh values. Note that a similar pattern can be found in figure 2 Figure 6 shows variations of c tt versus c ggh (left) and c tt versus c gghh (right). We observe that SM-like shapes (red) are preferred. However, doubly peaked structures are also possible for c tt values not too far from the SM value (c tt = 0). We also notice the similarity to figure 3 (left). The behaviour with respect to c gghh is again similar.
Note that in SMEFT, c ggh and c gghh are related, so this behaviour would necessarily be the case. However we will see later that a shape classification algorithm based on unsupervised learning is able to detect shape differences which distinguish effects of c ggh and c gghh . An interesting feature is also that kind 1 (black) and kind 4 (blue) shapes appear only when we modify the value of c hhh : for c hhh = 1 shapes of kind 1 never occur, and shapes of kind 4 are very unlikely. Further, the kind 4 shapes tend to point to (moderately) negative values of c hhh as long as c tt is close to zero, as can be seen from figures 2, 3 and 5.

Classification of p T ,h distributions
So far we have studied m hh distributions, assuming that they are very well suited to study the sensitivity to shape changes induced by anomalous couplings. In order to verify that JHEP03(2020)091  we do not miss out interesting features in the transverse momentum distributions, we also present a study of the p T,h distributions, but only at LO, to assess the salient features. The main difference with respect to the m hh case is that in the p T,h analysis we could identify only two kinds of clearly distinct shapes: single peak (SM-like, which we denote as 'p T,h kind 3') and double peak, with peaks separated by at least 30 GeV (denoted as 'p T,h kind 2'). Examples of such p T,h shapes are shown in figure 7.
The parameter spaces leading to singly or doubly peaked shapes are shown in figure 8 for the c t − c hhh and c hhh − c tt configurations. The parameter region related to shapes with a well separated double peak (green) is similar to the m hh case, as one can see comparing with figures 2 and 3. This indicates that the underlying parameter space leads to similar characteristics for the distributions differential in p T,h and m hh , however the p T,h distribution is less sensitive than the m hh distribution.

Unsupervised learning procedure
To assess the bias introduced by the definition of the four shape types, and to find a more flexible classification which can be extended easily to more than four shape types, we JHEP03(2020)091 approach the classification problem using unsupervised learning techniques. We construct a classification of the shapes of the m hh distribution into distinct types, where we do not predefine what the types should look like. For this purpose we use an autoencoder to find common patterns in the data and thus achieve a compressed representation. The setup is implemented using Keras [106] and TensorFlow [107]. As input data we use 30 bins of width 20 GeV for the normalised m hh distributions. We train the network based on a set of 10 5 distributions, retaining 10% for the validation. The encoder architecture, i.e. the part compressing the array information, is composed of two dense layers with 20 nodes and a middle layer with 4 nodes, the latter defining the length of the array containing the compressed information. The decoder architecture, which reconstructs the original array from the compressed one, is composed of two dense layers of 20 nodes and an output layer of the length of the input array. We have also tried other encoder architectures, varying the number of nodes in the layers as well as the number of layers, and found that the results deteriorate for less than three layers. Adding more nodes had the tendency to lead to overfitting.
To test how stable our results are against variations of the training data set and the encoding procedure, and to reduce uncertainties, for example due to overfitting, we produced ten different autoencoder models. For each model we picked 10 4 random points from the training set for validation to start from different training and validation sets and a different initialization of the weights. We trained the autoencoder for each model over 10000 epochs using Adam [108] as optimizer and the root mean square error to define the loss function. Based on the trained autoencoder we applied the encoder models to the training and validation data to obtain two sets of compressed arrays for each of the ten models. The ten different encoded training data sets are then fed to a classification algorithm, where we employed the KMeans clustering algorithm from scikit-learn [73], asking for a classification into a given number of clusters. We tested classifications into four to eight clusters.
Asking the KMeans algorithm to find four clusters yielded the shape types shown in figure 9, the result of asking for seven clusters is shown in figure 10. The curves denote the cluster centres determined by the KMeans algorithm, for each of the ten encoder models, with a colour code as defined in table 1.
One can see from figures 9 and 10 that in the case of clustering into four shape types, cluster 2 contains shapes which vary substantially. In contrast, for seven shape types, the cluster centers obtained from the ten different encoder models are quite similar. Asking for 5-8 clusters we found that seven clusters seemed to be the optimal number to capture distinct shape features, while defining eight clusters did not lead to useful additional features but rather to the tendency to focus on local minima in the clustering space, while neglecting more gobal shape features.
The four clusters shown in figure 9 do only partly coincide with the ones defined in section 2.2. Shapes of kind 1, showing an enhanced m hh region, as well as shapes of kind 3 (SM-like), were clearly identified. Shapes having a double peak were clustered together with shapes showing a shoulder. However, a cluster was formed which was not considered in the predefined types, containing shapes with an enhanced tail. JHEP03(2020)091 Figure 9. The clusters obtained by asking for a classification into four shape types. We show the cluster centres obtained from 10 different encoder models, in the colour code defined in table 1.
To combine the results from the ten clustering procedures, we adopted the "majority vote" method, i.e. for each of the ten clustering procedures a given point in the coupling parameter space gets a label ("vote") corresponding to the cluster it belongs to. The final cluster assigned to that point is the one which collected the largest number of votes.

Parameter space underlying the clusters
In this section we show how the parameter space relates to the clusters if we ask for four or seven clusters. For each parameter configuration of our 5-dimensional grid, we plot the corresponding cluster type in figure 11 and figures 13 to 16. The colour codes are shown in figures 9 and 10, and are also listed in table 1. For clusters which are similar to the shape types defined in section 2.2, we should also find patterns similar to the ones shown in figures 2 to 6.
Comparing figure 11 (top row) with figure 2 (left), both showing variations of c t versus c hhh , we see that kind 1 shapes (black) are clearly identified. However, for both four and seven clusters the area for SM-like shapes got smaller, as the clustering algorithm also identifies features which were not considered in the predefined shapes. For example, the clustering into four clusters identifies shapes which are almost SM-like but have an enhanced tail (magenta), and the clustering into seven clusters in addition identifies shapes which are almost SM-like but have a shoulder (blue). Certainly we could have defined such features in our analyzer as well, but it is not that easy to define where the tail starts and what JHEP03(2020)091   exactly should be considered as "enhanced". Further, the figure clearly shows that small variations of c hhh can easily distort the SM-like shape, while the shape is more robust against variations of c t . Figure 11 (bottom row) shows c t versus c tt . We again see that variations c t and c tt mostly produce SM-like shapes. Why this is so can be understood from the behaviour of the coefficients A i in eq. (2.5) which are relevant in these cases. For figure 11 (top row), only the coefficients A 1 , A 3 and A 7 are relevant. As A 1 and A 7 have opposite signs and a different peak location, this can generate a rich shape structure. For figure 11 (bottom row), the coefficients A 2 , A 6 and A 8 are relevant in addition to A 1 , A 3 and A 7 . A 2 being the coefficient of c 2 tt , it is dominant except for very small values of c tt and leads to a SM-like shape. We also observe that c tt has the tendency to enhance the total cross section, such that only a relatively small slice in c tt is left after considering the bounds on the total cross section.  algorithm with seven clusters distinguishes four shape types, showing that large values of c ggh and c gghh favour shapes with an enhanced tail (magenta) or/and a double peak (green), while negative values favour a shoulder on the left of the peak (blue). The limits on the total cross section do not exclude any parameter range in this panel.
A behaviour similar to the one in figure 11 can be seen in figure 14: as c hhh varies the disribution goes through various shape types, while variations of c ggh and c gghh affect the shapes to less extent. Figure 14 also shows that a positive c gghh value has the tendency to enhance the tail of the distribution. Figure 15 shows c tt versus c ggh (top) and c tt versus c gghh (bottom). Compared to figure 6, the clustering into both four and seven clusters shows a better discrimination power between SM-like shapes and small deviations, for example due to an enhanced tail. We again see that c tt has a larger impact on the total cross section than c ggh or c gghh . Figure 16, showing the c t − c ggh and c t − c gghh parameter planes, can be compared to figure 4. Again, both the case of four and of seven clusters indicates that the unsupervised learning algorithm is able to distinguish better subtle influences on the shape than our method based on humanly classified shapes. In figure 17, we compare results of our shape analysis produced with LO and NLO input data. We observe that NLO corrections can change the shape considerably and therefore are important for a shape analysis.

JHEP03(2020)091
The results above have shown that the parameters c hhh and c tt have the largest influence on the shape. In SMEFT, c tt is suppressed compared to c t by one order of the large new physics scale [30]. Furthermore, SMEFT imposes the relation (2.4) between c ggh and c gghh . Using this relation and imposing that c tt amounts to 5% of c t , we obtain a 3-dimensional parameter space simulating the SMEFT situation, which is visualized in figure 18.

Identification of benchmark points
The determination of the cluster centres allows to identify NLO benchmark points which should be representative for each characteristic shape. As the cluster centres determined by the KMeans algorithm do not necessarily correspond to grid points of our input grids in parameter space, and as we work with normalised distributions to find the cluster centres, we determine the benchmark points based on the following procedure JHEP03(2020)091  1. We identify the grid point in parameter space corresponding to a curve which is closest to the cluster centre, where the distance measure is the bin-wise geometric distance to the cluster center of the encoded distribution.
2. If the corresponding total cross section exceeds the limit of 6.9 × σ SM [1], we proceed to the curve which is the next-closest to the cluster center.
3. If several curves determined this way have an identical distance measure, we choose the one where the value of c t is closer to the SM value (anticipating that the top-Higgs Yukawa coupling will be constrained increasingly well from other processes).
4. If after this procedure there are still several curves satisfying these criteria, from these curves we pick the one closest to the cluster center according to the Kullback-Leibler [109] distance measure, applied to the normalised distributions.
Following this procedure we find the benchmark points listed in table 3.3. In figure 19 we show the m hh distributions corresponding to these benchmark points, at LO as well as at NLO.

Conclusions
The aim of this work was to provide more insight how certain configurations of anomalous couplings in the Higgs sector lead to a corresponding characteristic shape of the Higgs boson pair invariant mass distribution. For this purpose we employed the Lagrangian relevant to Higgs boson pair production as given in a non-linear Effective Field Theory framework, which contains five (potentially) anomalous couplings [70].  Table 2. NLO benchmark points derived from the cluster centers as described in the text. parameters by finite steps, thus producing a dense grid of data. Then we defined four characteristic shape types for the m hh distribution and visualised the parameter space leading to these shape types. To this aim we projected onto all possible two-dimensional slices of the parameter space, keeping the remaining parameters at their Standard Model values. We also considered p T,h distributions for a shape analysis, however we found that the m hh distribution is more sensitive to shape changes induced by anomalous couplings.

JHEP03(2020)091
Further, we tested an unsupervised learning approach to classify shapes. We produced 10 5 distributions, trained a neural network based on an autoencoder to extract common shape features and tried to find the number of shape clusters which optimally catches different shape characteristics. Our study demonstrated that some shape features, like an enhanced tail or a shoulder in the m hh distribution, were caught very well by this procedure, and provided more insight about the underlying parameter space leading to such features than the analysis based on predefined shape classes. While machine learning is not essential to define shape clusters, it has the advantage of being easily extendible to a different number of shape types, different binnings or other observables, and of minimising the human bias compared to other shape analysis methods.
The shape analysis revealed that the Standard-Model-like shape is quite stable against variations of c t , c ggh and c gghh , as long as c hhh = 1, while deviations of c hhh from the SM value show a rich shape changing pattern. We also found that small deviations of c tt from zero are very likely to produce a doubly peaked structure in the m hh distribution, while SM-like shapes dominate again as c tt moves further away from zero. However, as c tt leads to a rather fast increase of the total cross section, the shape analysis in combination with the limits on the total cross section allows to put constraints on c tt . This is an interesting feature because, in contrast to c t and c ggh , c tt cannot be constrained directly from single Higgs boson processes. Further, an enhanced tail or a shoulder of the m hh distribution are likely to be produced by nonzero values of c gghh , however the influence of the effective Higgs-gluon couplings on the shape is milder than the one of c hhh and c tt .
We also provide seven benchmark points, based on the full NLO calculation and taking into account current experimental constraints, which lead to the characteristic shapes as represented by our cluster centers.
The method can also be applied to other processes where anomalous couplings introduce characteristic shape changes to differential cross sections, and it can be extended to consider more than one distribution simultaneously.