Probing Higgs couplings to light quarks via Higgs pair production

We consider the potential of the Higgs boson pair production process to probe the light quark Yukawa couplings. We show within an effective theory description that the prospects of constraining enhanced first generation light quark Yukawa couplings in Higgs pair production are similar to other methods and channels, due to a coupling of two Higgs bosons to two fermions. Higgs pair production can hence also probe if the Higgs sector couples non-linearly to the light quark generations. For the second generation, we show that by employing charm tagging for the Higgs boson pair decaying to $c\bar{c}\gamma\gamma$, we can obtain similarly good prospects for measuring the charm Yukawa coupling as in other direct probes.


Introduction
After the Higgs boson discovery the era of precision measurements of Higgs properties has begun. While the Higgs boson couplings to vector bosons and third generation fermions have been measured at the LHC and agree with their Standard Model (SM) prediction at the level of 10% -20% [1], the situation for the Higgs self-couplings and couplings to first and second generation fermions is quite different. Current bounds on the trilinear Higgs self-coupling range from −5.0 < λ hhh /λ SM hhh < 12.0 [2] and are still above the limits of perturbative unitarity [3] or vacuum stability [4]. The quartic Higgs self-coupling is out of reach of the high-luminosity-LHC (HL-LHC) [5,6]. Upper limits on the Higgs boson decays to muons are g hµµ /g SM hµµ < 1. 53 [1], while current bounds on the Higgs coupling to electrons, g hee /g SM hee < 611, are far away from the SM [7]. For the Yukawa couplings to the first and second generation quarks, henceforth denoted as light quark Yukawa couplings, the current best limits are obtained from a global fit to Higgs data [8,9]. For instance for the HL-LHC, ref. [10] obtained for a projection on the coupling strength modification, κ i = g hq i q i /g SM hq i q i , where g hq i q i denotes the i = u, d, s, c Higgs Yukawa coupling to quarks, in a global fit |κ u | < 570 , |κ d | < 270 , |κ s | < 13 , |κ c | < 1.2 . (1) The determination of the light quark Yukawa couplings in a global fit is plagued by the fact that the Higgs boson width can only be measured at the LHC under certain assumptions. 1 The global fit can therefore not be considered to be completely model-independent. A more direct way of constraining the light Yukawa couplings is hence welcome. Searches for exclusive decays of the Higgs boson to a vector meson and a photon h → Xγ with X = ρ, ω, φ, J/ψ, 2 as a probe of light Yukawa couplings have been proposed in [16] and can be even used to probe flavour-off-diagonal Yukawa couplings [8] for instance in Higgs boson rare decays such as h → M W ± or h → M Z, with M denoting generically a scalar or pseudoscalar vector meson. From the experimental side, ATLAS and CMS have reported upper bounds on the decays h → ργ, h → φγ in [17] and to h → J/ψγ in [18,19]. The charm Yukawa coupling can also be constrained to a factor of a few times its SM value at the HL-LHC making use of charm tagging in pp → W/Zh with subsequent decay of the Higgs boson to cc [20] (see [21,22] for first experimental results) or in pp → hc [23].
Another possibility for constraining the light quark Yukawa couplings is from Higgs kinematics. If the Higgs boson is produced with an associated jet, the transverse momentum distribution changes with respect to the SM one in the presence of enhanced quark Yukawa couplings of the second and first generation. For the second generation quarks, the main effect stems from log-enhanced contributions due to interference between top and light quark loop diagrams. This allows to set a bound on κ c ∈ [−0.6, 3.0] at 95% C.L. at the HL-LHC [24]. Instead in the presence of significantly enhanced first generation quark Yukawa couplings the Higgs boson can be directly produced from initial state quarks, which again would alter the Higgs p T -distribution [25]. For non-collider probes of the light Yukawa couplings see ref. [26].
As for Higgs plus jet production, we can make use of kinematical information in Higgs pair production. We will mainly consider the case in which the modifications of the light Yukawa couplings can be described by a dimension six effective operator, denoted schematically by ( The left-handed quark SU (2) doublet has been denoted by Q L ,the right-handed quark SU (2) singlet by q R , while φ is the scalar Higgs doublet field. In the presence of such an operator, both a shift in the Yukawa coupling to one Higgs boson as well as a new coupling of two Higgs bosons to two fermions modifies the Higgs pair production cross section. In the case of the top quark it was shown that such a new coupling can lead to large enhancements of the double Higgs production process [43][44][45][46]. For the light quark Yukawa couplings this was shown in [47] under the assumption of universally enhanced light Yukawa couplings. We will consider more general scenarios and will show that indeed such an operator can also be constrained in di-Higgs production for the light generations of quarks. Under the assumption of linearly realised electroweak symmetry breaking we can then obtain a bound on the light quark Yukawa couplings which is competitive with the above mentioned ways of constraining them. We will also investigate how our bounds are modified if we allow for a modification of the trilinear Higgs self-coupling. Furthermore, we will discuss the possibility of charm tagging for di-Higgs final states, which will allow us to set bounds on the second generation quark Yukawa couplings. The paper is structured as follows: in sect. 2 we will introduce our notation and point out under which circumstances scenarios considered in our analysis can be realised. In sect. 3 we present how the di-Higgs production process and the Higgs boson decays are modified in the presence of enhanced light quark Yukawa couplings. In sect. 4 we present the results of our analysis both in the presence of enhanced first and second generation Yukawa couplings. We also consider the potential reach of the HL-LHC by employing charm tagging. We conclude in sect. 5.

Effective Field Theory of light Yukawa couplings
Within the SM, the Higgs couplings to quarks are described by the Lagrangian withφ = iσ 2 φ * , σ 2 is the second Pauli matrix, φ denotes the Higgs doublet, Q i L the lefthanded SU (2) quark doublet of the i-th generation and u j R and d j R the right-handed up-and down-type fields of the j-th generation, respectively. Modifications of the SM from high-scale new physics can be described in a model-independent way by means of the SM effective field theory (SMEFT), in terms of higher dimensional operators. In particular, the couplings of the quarks to the fermions are modified by the operator where Λ denotes the cut-off of the effective field theory (EFT). The mass matrices of the up-type and down-type quarks are They can be diagonalised by means of a bi-unitary transformation while the CKM matrix is defined as we can write the couplings of one and two Higgs boson to fermions with as In the following, we will also use for the diagonal couplings alternatively the notation in a slight abuse of language of the κ-framework used often in experimental analyses. Flavour-changing Higgs couplings are strongly constrained from low-energy flavour observables, such as meson-antimeson mixing. The bounds are of order |c uc/ds | 10 −5 Λ 2 /v 2 and |c db/sb | 10 −4 Λ 2 /v 2 [48]. Given that, a common assumption for the Wilson coefficients in eq. (4) is that of minimal flavour violation (MFV) [49], where with flavour universal c u and c d . Hence, under the assumption of MFV the Yukawa matrices y u (y d ) and the Wilson coefficients c u (c d ) are simultaneously diagonalisable and no flavour changing Higgs interactions with quarks exist. We refrain though from making the assumption of MFV, due to the reason that with the Wilson coefficients being proportional to the Yukawa couplings, we introduce a strong hierarchy into the Higgs couplings to quarks. Since we want to describe modifications of the order of the ones in eq. (1) we would need to assume very low values of the new physics scale Λ and/or large Wilson coefficients, rendering the validity of the EFT questionable and in potentially conflict with measurements of the third generation couplings to the Higgs boson. Instead, we will consider the case in which thec q ij are diagonal, though not proportional to the Yukawa matrices. This can be realised by appropriate choice of the parameters. For instance, V u L/R = 1 , V d R = 1 , and V d L = V CKM , which keepsc u flavour-diagonal if c u is chosen flavour-diagonal. Flavour violation then originates only from the CKM matrix. We will refer to this as flavour alignment. However, from a UV-perspective there is no obvious symmetry argument to enforce this at low-energy.
A possible way of keepingc flavour-diagonal with symmetry arguments could be realised for flavour universal c u/d and a left-right symmetry rendering V L = V R . Then by setting universalc u/d /Λ 2 ≈ 1/(3 TeV) 2 we get for instance a modification of the up-quark coupling to the Higgs boson of a factor of 500, but only a modification of the top Yukawa coupling by 1%, which is still consistent with the current limits on the top Yukawa coupling [50,51]. Note that doing so for the down-type quarks would of course be more difficult, as it would imply a larger deviation in the bottom quark Yukawa coupling, due to its smaller mass. Alternatively, one can chosec f flavour-diagonal (or with strongly suppressed flavour-off-diagonal elements) by choosing horizontal symmetries. We refer to [47] for a model with vector-like quarks and strongly enhanced light quark Yukawa couplings. Another realisation of large first and second generation Yukawa couplings without tree-level flavour-changing neutral currents (FCNCs) has been discussed in [52], and is referred to as spontaneous flavour violation. The basic idea is to achieve this by breaking the quark family number symmetry via the RH up-type or down-type quark wave function renormalisation, leading to either enhanced up-or downtype quark Yukawa couplings. A concrete realisation of this idea for a two-Higgs doublet model was discussed in [53].
We would also like to stress that from a UV perspective it makes sense to assume that if there is a modification in the light quark Yukawa couplings with respect to the SM, deviations in the di-Higgs production process can be expected, which in the limit of heavy new physics can be traced back to a coupling of two Higgs boson to two fermions. We show this schematically in fig. 1 for a heavy new scalar and a heavy new vector-like fermion. The coupling of the SM-like Higgs boson in the models extended by a heavy new Higgs boson or a heavy new vector-like quark as shown in fig. 1 is modified due to a mixing with either the new Higgs boson, if it acquires a vacuum expectation value (VEV), or by the mixing between the quark and the new vector-like fermion. For the case of the heavy new scalar, the effective coupling of two SM-like Higgs bosons to fermions in the limit of m H E, with E denoting the energy scale of the process and m H the Higgs mass of the heavy Higgs boson, can be written as A coupling g Hhh always exist, if both of the Higgs fields acquire a VEV, since a portal term in the Lagrangian, (φ † φ)(Φ † Φ), is always allowed by the symmetries. We denoted here the new Higgs multiplet by Φ with neutral component H.
In the presence of new vector-like quarks that mix with the SM quarks, the coupling of Figure 1: Examples of potential concrete models leading to a hhqq coupling. The left Feynman diagram shows a heavy Higgs H, the right diagram a vector-like quark Q.
two Higgs bosons to two fermions comes fromt/û channel diagrams. If the mass of the new vector-like quark m Q is m Q E one obtains for the coupling 3 A more explicit consideration of models that realise large light Yukawa couplings is beyond the scope of this paper and we refer to existing work [47,53]. We finally note that an alternative way of describing model-independent deviations from the SM Higgs couplings is by a non-linear effective Lagrangian (alternatively referred to as electroweak chiral Lagrangian) [54,55]. While in SMEFT the Higgs boson is assumed to be part of an SU (2) doublet and the expansion is organised in terms of dimensionality of the operator, in the chiral Lagrangian the Higgs boson is assumed to be a singlet and the expansion is organised in terms of chiral dimension, where bosonic fields are assigned chiral dimension 0 and derivatives and fermion bi-linears chiral dimension 1. The Lagrangian responsible for a potential modification of the Yukawa couplings can be written as [56] in terms of the Pauli matrices σ a and the Goldstone bosons π a with a = 1, 2, 3. The field Σ transforms linearly under the custodial symmetry SU (2) L × SU (2) R . We note again as for the SMEFT that off-diagonal elements of k q are strongly constrained. Compared to SMEFT the couplings of one or two Higgs boson to fermions are now uncorrelated, leading to different coefficients k q and k 2q . In principle, the coefficients of the light fermion couplings to two Higgs bosons are yet unconstrained and di-Higgs production is the place to test if there exists a correlation among those and hence whether a linear or non-linear EFT prescription is to be preferred. While in the following we will mainly concentrate on the case of SMEFT we shall shortly comment also on the case of non-linear EFT. 3 Higgs pair production and Higgs decays with modified light Yukawa couplings In this section we will describe how the Higgs pair production process for modified light quark Yukawa couplings is affected. While in the SM Higgs pair production is dominantly mediated by gluons fusing into a heavy quark loop coupling to the Higgs boson, for large first and second generation quark Yukawa couplings also quark annihilation becomes relevant. For a phenomenological analysis we also need to take into account the Higgs boson decays, which we describe in the last part of the section.

Higgs pair production via gluon fusion
The dominant process for Higgs pair production at the LHC in the SM is the gluon fusion process (ggF) via a heavy quark loop Q, where Q stands mainly for the top quark. The bottom quark contributes with less than 1%. We show the Feynman diagrams for the process in fig. 2. The process has been known since long at leading order (LO) in full mass dependence [57][58][59][60]. The next-to-leading order (NLO) in the strong coupling constant was initially computed using the infinite top mass limit (m t → ∞) and reweighted with the full LO results [61]. However, this approximation is only valid up to the top quark threshold. More recently, the NLO QCD corrections have been computed in full top mass dependence, showing that the infinite top mass limit overestimates the full result by 14% [62][63][64]. 4 For distributions, the approximation of infinite top mass is even worse. At next-to-next-to leading order (NNLO) results are available in the infinite top mass limit [68,69] and by including top mass effects for the double real radiation [70]. First steps towards an inclusion of top mass effects for the virtual corrections (for the triangle only) have been made in [71,72] and for the light fermion triangle contributions the NNLO has been computed in [73]. For our analysis, we have calculated the √ s = 14 TeV LO ggF inclusive cross section and distributions with modified light Yukawa couplings by including the light quark loops and the coupling hhqq shown in fig. 3. The calculation was carried out using a private FORTRAN implementation of the LO cross section utilising the VEGAS integration algorithm, and NNPDF30 parton distribution functions (PDF's) [74] implemented via the LHAPDF-6 package [75]. For the one-loop integrals appearing in the form factors of the box and triangle diagrams, we have used the Collier library [76] to ensure numerical stability of the loop integral calculation for massless quarks inside the loops. A K-factor for the NNLO correction was used following the recommendations by the Higgs cross section working group [77] For differential distributions in the invariant mass of the Higgs boson pair, M hh , we extract a differential K-factor from [70]. As a reference cross section at NNLO [70] for the analysis in sect. 4 we use The uncertainty stems from the scale choice, the PDF+α s error and the uncertainty associated to the usage of the infinite top mass limit in parts of the calculation. Since we found that the cross section does not change much once the effects of the modified light Yukawa couplings are included, we use the same NNLO K-factor for all values of the scalings. The renormalisation, µ R , and factorisation scales, µ F , are set to µ 0 = M hh /2 as has been pointed out as an optimal choice in ref. [78], and α s (M Z ) = 0.118.

Results
For comparison of the results with modified Yukawa couplings with the SM results, we define as a benchmark point the case where all first and second generation quark Yukawa couplings are scaled to the SM bottom Yukawa coupling, which we will refer to in plots and tables as g hqq = g SM hbb . This means we scale the Yukawa couplings by κ q = g hqq /g SM hqq with and use only flavour-diagonal modifications of the quark Yukawa couplings. This benchmark is inspired by ref. [47]. Figure 4 shows the di-Higgs invariant mass M hh -and the p T,h -distributions for the computed LO process. From the distributions it is evident, that the change of the ggF process in the presence of enhanced light Yukawa couplings is quite small. The reason is that the box contribution which is the major part of the cross section has two fermion coupling insertions and hence is strongly suppressed for all the light quarks with respect to the top quark loop diagrams. The bottom quark contribution to the ggF process in the SM is less than 1% and comes mainly from the triangle diagram, so adding several contributions from similar size does not change the cross section by much. Also the new diagrams (cf. fig. 3) are suppressed  The di-Higgs invariant mass differential cross section dσ/dM hh for the SM at LO and the benchmark point toy. The error boxes denote the total scale, PDF and α s uncertainties. Right: The same but for the Higgs transverse momentum p T,h distribution. compared to the box diagrams of the top quark. In the presence of enhanced light quark Yukawa couplings the Higgs boson pair can though be directly produced by quark annihilation. We turn to discuss this process in the next part. In the meanwhile we can conclude that for the ggF process we can improve on the LO predictions by using SM K-factors and that the effects of light Yukawa coupling modifications for the ggF process are small for the still allowed modifications.

Higgs pair production via quark anti-quark annihilation
If the Yukawa couplings of the light quark generations are sufficiently increased, the Higgs bosons will be produced directly from the constituents of the proton with a sizeable rate. The quark anti-quark annihilation (qqA) process becomes then relevant for Higgs pair production.
The qqA process has four Feynman diagrams shown in the fig. 5. The differential cross section given by We neglect here thet andû channel diagrams, as their contribution is ∼ 0.1% of the total cross section, as they are suppressed by g 4 hq i q j and only interfere with each other. The hadronic cross section is then obtained by with τ 0 = 4 m 2 h /s,ŝ = τ s and The parton luminosity is given by We neglected all the kinematical masses in accordance with the 5-flavour scheme of the PDFs while the coupling of the Higgs boson to the light quarks (for flavour diagonal couplings) is and analogously for the g hhq i q j coupling. 5

NLO QCD correction
Since NLO QCD corrections are sizeable, we will take them into account in our analysis. For this purpose, we will detail here how we obtained them. Since thet andû channel diagrams are strongly suppressed we can take the NLO QCD corrections over from bb → h in the 5-flavour scheme [79][80][81] 6 by some adjustments taking into account the modified LO cross section and the different kinematics of the process. The Feynman diagrams at NLO QCD are shown in fig. 6. For convenience and for making our adjustments explicit we report here the formulae from [84]  andσ (21), and the ω factors are given by with ζ 2 = π 2 6 . The Altarelli Parisi splitting functions P qq (z) and P qg (z) [85][86][87] are given by and the 'plus' distribution is We have chosen the renormalisation scale µ R = M hh and the factorisation scale µ F = M hh /4, as central values. We define the NLO K-factor as with the first error denoting the uncertainty from varying the various κ q and the second error is propagated from LO and NLO scale and PDFs+α s uncertainty. The K-factor does not depend on the scaling of the couplings, nor the flavour of the initial qq since the LO cross section factors out (with exception of the different integration in the real contributions). We finally note that at NNLO the qqA process interferes with diagrams with top quark loops, which contribute to ggF also at NLO. These contributions can in the SM limit be rather large, i.e. of similar order or even larger than the tree-level qqA process depending on the flavour considered. 7 Due to the fact that the modifications of the Yukawa couplings that we consider in our analysis are rather large and that we are mostly interested in the case where qqA is of similar size or large than the ggF process we can safely neglect these contributions.

Results
While in the SM, the contribution from quark annihilation to a Higgs boson pair is below 0.11 fb at NLO, it scales like ∼ κ 2 q m 2 q /v 4 , dominated by the hhqq diagram as can be seen from eq. (20), hence showing significant enhancement for enhanced Yukawa couplings. For our benchmark scenario (g hqq = g SM hbb ) we find for the cross section and therefore a significantly larger cross section as for the ggF process. In fig. 7 we compare the ggF process (black line) for rescaled charm coupling to the Higgs boson(s) with the qqA process for different scalings of the light quark Yukawa couplings (different coloured, dashed, dotted solid and dashed dotted lines). We find that for sufficiently large scaling of the Yukawa couplings still allowed by current data, qqA can be even the dominant di-Higgs production channel. Note that in the figure we scale the Yukawa couplings for the different quark mass eigenstates differently. For the up and down quark Yukawa coupling the scaling is the same, hence the effect from rescaling the down Yukawa coupling is larger even though the up quark is more abundant in the proton. The plot shows nicely for which values of the coupling modifications the qqA process surpasses ggF. We would also like to give a qualitative argument for the dominance of qqA for large κ q . The dominant term for the qqA comes from the hhqq vertex diagram, such that the qqA cross section behaves for large values of κ as (assuming that σ qqA The ggF cross section instead gets contributions from light quark loops from the diagram in fig. 3 interfering with top quark loops in the triangle SM diagram, leading to a scaling of Taking the ratio we get This ratio approaches one (neglecting effects from different PDFs) when In fig. 8 we show the di-Higgs invariant mass normalised differential cross section distributions for the g hqq = g SM hbb benchmark point at NLO compared to the NNLO SM ggF cross section extracted from [70]. We notice a considerable shape difference, with shifted peak to the left, and a larger tail. This will allow us later on to use kinematical information to extract the light quark Yukawa couplings.

Higgs decays
The light fermion decay channels will no longer be negligible for enhanced light Yukawa couplings. The decay channels h → gg, h → γγ and h → Zγ containing fermion loops will get modified, but similarly to the production, the modification is ∼ 2 κ q (m 2 q /m 2 h ) ln 2 (m q /m h ).
Thus, the main effect on the Higgs boson branching ratios and width is the 'opening' of the new light fermion channels. In order to compute the Higgs partial widths and branching ratios (BR) at higher orders in QCD, we have modified the FORTRAN programme HDECAY [88,89] to include the light fermion decay channels and loops in the above-mentioned decays. In the SM, light fermion BRs are of order O(10 −4 ) for h → cc, O(10 −6 ) for h → ss and < O(10 −9 ) for the first generation quarks [77]. In our benchmark point (g hqq = g SM hbb ) these would increase to ∼ 18%. Correspondingly, the BRs for h → bb/V V /τ + τ − decrease due to the increased Higgs width in the model.
In fig. 9 we show the BRs, denoted by B in the following, of the Higgs boson pair with the best prospects for discovering Higgs pair production, hh → bbbb, hh → bbγγ and hh → bbτ + τ − [2], and in addition we show for later purpose also hh → ccγγ. Once we increase the light quark Yukawa couplings (shown for the different quarks by the different coloured lines) the BRs to bbbb, bbγγ and bbτ + τ − decrease due to the increased Higgs width. Instead the B(hh → ccγγ) first increases with increasing κ c , but starts decreasing after reaching a maximum around κ c ≈ 8, where the B(h → cc) asymptotically reaches 1 while the B(h → γγ) continues decreasing.
In fig. 10 we show the signal strength modifier defined here as      channels for di-Higgs production. We shall notice however, that while the increase of the cross section comes mainly from the qqhh vertex diagram, the decrease of the BRs stems from the increased width which would be in good approximation (for flavour-diagonal couplings) where Γ q stands generically for the partial width of the Higgs boson decaying to light quarks. In a non-linear EFT as briefly discussed in sect. 2, the couplings of one Higgs boson to quarks and two Higgs bosons to quarks are uncorrelated. So an increase of the cross section for hh production in the presence of modified light quark Yukawa couplings does not need to go hand in hand with a decrease of the BRs in the final states with bottom quarks (or at least the decrease could be in-proportional).

Phenomenological analysis
In this section we will investigate whether enhanced light quark Yukawa couplings can be measured in Higgs pair production. As we have seen in the previous section, we can get an enhancement in the signal strengths for first generation quarks from the enhanced cross sections while BRs in the standard di-Higgs search channels decrease. We have also seen that final states with charm quarks might be worth studying further for enhanced second generation Yukawa couplings. Here in this section, we will perform a phenomenological analysis to see if the HL-LHC has potential to constrain the light quark Yukawa couplings in di-Higgs channels. The first part of the section is devoted to the analysis strategy, before we discuss the bounds from final states with bottom quarks. We will be focussing in particular on the bbγγ final state as it holds promising prospects [27][28][29][30][31]33] despite the low BR of 0.27% in the SM for the Higgs boson pair. At the end of the section we take a closer look at the ccγγ final state, which is in particular interesting for enhanced charm Yukawa couplings. For our phenomenological analysis we do not assume that the efficiency is constant for the new physics hypothesis with respect to the SM efficiency. Hence, we use the full definition of the signal strength µ as the ratio of the number of events measured or expected given the new physics hypothesis over the number of events expected by the SM (null) hypothesis The number of expected events N expec at a hadron collider with integrated luminosity L and selection efficiency SEL in the narrow width approximation for a process pp → R with subsequent decay of R → X is given by the formula The selection efficiency can be written in terms of several factors by with Acc being the detector acceptance efficiency, Rec the efficiency from reconstruction, Trig the trigger efficiency and cut the efficiency obtained from the applied kinematical cuts on the signal. For the ATLAS and CMS experiments, the acceptance for the Higgs pair production is close to 100% due to the complete coverage of the pseudorapidity range of 2.5 < |η| < 5, so we use Acc = 1. The other efficiencies will be discussed in more detail in subsect. 4.2.

Event generation
The parton showering and hadronisation of the process pp → hh → bbγγ has been simulated using Pythia 6.4 [90] with the settings detailed in appendix A. The cross section of the Higgs pair production (ggF and qqA both at LO multiplied by a K-factor as described in subsect. 3.1 and 3.2.1) is fed to Pythia which decays the two Higgs bosons and then performs the parton showering. We have accounted for the correct BRs by using the values obtained as described in subsect. 3.3 from HDECAY. We have turned on initial and final state QCD and QED radiation and multiple interactions. The generated events were written to a ROOT file via RootTuple tool [91] for further analysis.

Analysis strategy
The analysis strategy follows the one performed in [29] allowing us to use their backgrounds. Note that the analysis was based on the SM simulated events, meaning that the significances could be potentially improved performing a dedicated new physics analysis. In order to satisfy the minimal reconstruction requirements of the LHC we select only events with Moreover, we veto events with hard leptons corresponding of an expected Trig = 0.9. Jets were clustered using fastjet [92] with the anti-kt algorithm with a radius parameter of R = 0.5. We have used a b-tagging efficiency of b = 0.7 8 . The contamination probability of j→b < 1% is found to be consistent with ATLAS and CMS performance [94][95][96]. For the photon reconstruction efficiency we used γ = 0.8 as reported by ATLAS and CMS in [96,97]. The selection cuts we used are the same ones as in [29], starting with the cuts of the transverse momentum p T of the photons and b-tagged jets. The two hardest photons/b-tagged jets, with transverse momentum p T > , and the softer ones with p T < are selected to satisfy In order to ensure well-separation of the photons and b-jets, we require the following cuts on the jet radius, ∆R(b, b) < 2, ∆R(γ, γ) < 2, ∆R(b, γ) > 1.5 .
While the majority of the signal lies within this region, these cuts significantly reduce the backgrounds.
We choose a wide m γγ window (see eq. (45)) corresponding to 2-3 times the photon resolution of ATLAS and CMS [96,97] which does not cause any significant loss. As for the Higgs mass window reconstructed from 2 b-jets m bb , the mass window chosen in eq. (45) corresponds to the given b-tagging efficiency. The mass windows used are then 105 GeV < m bb < 145 GeV, 123 GeV < m γγ < 130 GeV .
The selection cuts are summarised in table 1 with their corresponding efficiency. In table 2 we summarise all the efficiencies used in the analysis. The major backgrounds for the considered final state are the bbγγ continuum background, γγjj with two mistagged jets, tth, Zh and bbh in the order of importance after the cuts in eq. (43). The number of background events (surviving the cuts) is taken from [29]. The backgrounds are illustrated in the fig. 11 in which we show the number of events for the SM Higgs pair signal in light blue and the most relevant backgrounds in other colors. It should be noted that the background h(→ γγ)Z(→ bb) is modified in the presence of enhanced light quark Yukawa couplings. We checked though explicitly that scaling the Yukawa couplings to the values of our benchmark point only changes the NLO cross section by less than 1%, making this effect negligible.     [29].
The analysis was carried out for varying values of κ f for the different flavours. Due to the change in the kinematical distributions (cf. fig. 12) resulting from the PDFs of the different flavours, the efficiencies depend on the flavour of the quarks. For κ f 1 the κ f dependence factors out of the cross section such that for the values considered in the analysis of the distributions no dependence on the concrete value of κ f is seen. The flavour-specific efficiency ratio f is given by with σ ggF being the gluon fusion cross section, σ qq the quark annihilation cross section and ggF = 0.044. We give the values for the qqA efficiency qq in table 3.
In fig. 12 we show for the SM and for our benchmark point g hqq = g SM hbb the M hh distribution. The lower panels in the plot show the efficiencies. These plots illustrate how the efficiency depends on the shape of the distribution, and hence the flavour f that is scaled by κ f .

Statistical analysis
We have used the likelihood ratio test statistic q µ in order to estimate the HL-LHC sensitivity, and set projected limits on the scalings of the light Yukawa couplings. A (log)-likelihood was constructed from the signal and background events in each bin of the histogram in fig. 11, with N bi and N si being the number of background and signal events in the ith M hh distribution, respectively. In order to include the theoretical uncertainties on the expected number of signal events, the above likelihood was extended by a gaussian distribution for N si in which the mean equals to the central value of the bin values and standard deviation σ equals to its theoretical uncertainty. The signal strength µ was then estimated by minimising − ln L (µ) to obtain the estimator forμ by injecting SM signal + background events n i . The test statistic is then given by q µ = 2(ln L (µ) − ln L (μ)), following the procedure described in [98]. Table 3: The dependence of qq on the flavour of the Yukawa couplings' scalings. In the lower part of the plot we show the ratio between truth-level and reconstructed distribution, which is equivalent to the efficiency.
In order to set bounds on the scalings, we have fitted the signal strength inclusively by a function depending on the scaling of the Yukawa couplings with and m q1 and m q2 denoting the MS masses of the quarks. Taking M hh ≈ 300 GeV, we could perform a fit for the signal strength for each of the quark generations scalings separately. Note that one could of course also extend the model to include the dependence of the signal strength on four Yukawa coupling modifications, taking into account the correlation between them when fitting the likelihood in eq. (47).
Note that these bounds are not directly comparable to the standard κ formalism bounds since we relate with κ the Yukawa couplings g hqq and the new coupling g hhqq . For the second generation quarks we were not able to obtain similar bounds due to the reduction of µ/µ SM with increasing κ s and κ c away from the SM, which stems from the decrease of the branching ratio B(hh → bbγγ) as new decay channels open, while the cross section is not as much enhanced as for up and down quarks due to the charm and strange quark being less abundant in the proton. This leads to signal strength modifiers µ/µ SM < 1 (cf. fig. 10). We will analyse the second generation Yukawa couplings instead for the final state hh → ccγγ, in which we observe significant enhancement of the relative signal strength modifier µ/µ SM (cf. fig. 10). Before turning to a different final state though, we will reanalyse the bbγγ final state under the point of view of a non-linear effective field theory, hence leaving the couplings g hqq and g hhqq independent.

Results for non-linear EFT
We will consider in this part a non-linear EFT as introduced in eq. (15). By expanding in the chiral modes, taking the 0th mode and the flavour diagonal terms, we get where we rescaled the coefficients k q and k 2q of eq. (15) as k q,ii = √ 2c q m q /v and k 2q,ii = √ 2c qq m q /v 2 . Unlike the linear EFT, the Wilson coefficients c q and c qq are independent of each other leading to the coupling constants We can observe that compared to SMEFT (see eq. (10)) the interaction hhqq becomes independent of the Yukawa coupling hqq, with the first contributing to the contact interaction diagram and the latter to theŝ channel Higgs exchange diagram and thet andû channel diagrams as shown in fig. 5. As we found already for SMEFT, the ggF process depends only very little on the modifications of the light quark coupling to the Higgs boson, hence barely changes for the considered values of the coefficients c q and c qq . The Higgs boson decays are only affected by a variation of c q but not c qq , as the latter does not contribute to single Higgs interactions. We have observed that the shape of the differential hh production distribution is dominated by a change of the hhqq coupling, hence the efficiency changed in a similar way to the linear EFT when changing c qq and remained almost constant when changing c q alone. Unlike the linear EFT case, we have two parameters to vary independently per quark flavour, making a total of eight Wilson coefficients when restricting ourselves to the first and second generation. The analysis used is identical to the one of the linear EFT, with the same statistical technique, except here we have used spline functions to fit the signal strength µ, as it yielded a better fit result than the simple model of eq. (49), though the same test statistics was utilised as for the SMEFT case. The thus obtained sensitivity bounds are given in fig. 14. We observe that without the hhqq interaction, one cannot set bounds on any of the light Yukawa couplings from Higgs pair production. We remark though that in case any deviation in the light Yukawa couplings is observed, the di-Higgs channel can distinguish whether electroweak symmetry breaking is realised linearly or non-linearly.

Charm-tagging and second generation bounds
In order to set bounds on the second generation Yukawa couplings, we use the method developed in [9,99] that re-analyses final states with b-quarks based on the mistagging of c-jets as b-jets in associated V H production. The analysis relies on the current CMS [100] and ATLAS [101] working points for b-tagging, as illustrated in the table 4. The signal strength estimator when considering the mistagging probability of b-jets to c-jets (i.e. c-jet contamination of b-tagged jets) b→c iŝ with f being the efficiency ratio in eq. (46). The above expression simplifies tô are needed. This is illustrated in fig. 15, where the working points fitting contours are combined using Fisher's method [102]. We thus obtain an upper projected limit on the charm final state signal strength after Tight-Tight 5.9 · 10 −3 Table 4: The b-tagging working points used in the analysis, for CMS [100] and ATLAS [101].
However, the obtained sensitivity is not sufficient to set any better limits at 95% CL than the existing ones (or projected ones in other channels) for the Yukawa coupling modifiers κ c , and κ s . Instead, we can improve on them by introducing c-tagging working points ( c-tag mixed with the b-tagging ones. We denoted the contamination of c-jets with b-jets by c→b .  [105,106] 50% 20% 3.8 Table 5: The c-tagging working points with the expected 95% CL upper limit (sensitivity) of µ c obtained after profiling over µ b .
For mixed tagging, the signal strength estimator becomeŝ where now b is either b or c→b and c either c or b→c . This simplifies tô The working point 2 c/b could be the b-tagging or c-tagging working point. Assuming that c-tagging and b-tagging are uncorrelated, and working with the methods discussed in [9,20], i.e. combining the ATLAS medium cuts (med.) for b-tagging with the c-tagging working points in order to break the degeneracy, we could improve the 95% CL sensitivity on µ c . We start by the c-tagging working point used by the ATLAS collaboration in Run I searches for top squarks decays to charm and neutralino [103,104], which we refer to as c-tagging I. Further c-tagging working points from the HL-LHC upgrade are used: with the expected insertable B-layer (IBL) sub-detector that is to be installed during the ATLAS HL-LHC upgrade [105,106], the new c-tagging II and III points, as illustrated in table 5, can be identified. In fig. 16 we used them to obtain in combination with the ATLAS med b-tagging expected 95% CL upper limits on µ c for the HL-LHC from an analysis of the final state bbγγ.
Fitting signal strengths with varying κ c , κ s for charm and bottom final states (cf. eq. (49)) for constructing the likelihood L (κ c , κ s ), we can set limits from the anticipated charm tagging working points as shown in fig. 17. These projected limits are an improvement compared to the current direct bound and prospects for HL-LHC, particularly for charm quark Yukawa modifications [9,20]. Again, it should be kept in mind that the bounds on κ q do not just correspond to the scaling of the Yukawa coupling, but also to the new coupling g hhqq arising in SMEFT.

Bounds with trilinear coupling scaling
Since we expect that most of the UV-complete models will modify the trilinear Higgs coupling by a scaling κ λ = λ hhh /λ SM hhh , we have investigated the light quark Yukawa bounds along with a modified trilinear Higgs self-coupling.
The likelihood contours obtained in fig. 18 assume that a single flavour coupling modifier κ f is not correlated to the others, nor with the trilinear coupling scaling. The correlated case in terms of a two Higgs doublet model has been discussed in ref. [107]. The modification of κ λ enhances the contributions to the s-channel qqA and triangle ggF diagrams, where the first interferes destructively with the hhqq diagram, and the latter with the spin-0 box form factor for κ λ > 0. Therefore, we observe that if we let κ λ be in the range of κ λ ∈ [1, 4] the sensitivity for κ q becomes worse.

Conclusion
The couplings to the first and second generation fermions remain among the less well measured couplings of the Higgs boson. In this paper we investigated the possibility of measuring light quark Yukawa couplings in Higgs pair production. For enhanced Yukawa couplings of  Figure 17: The expected sensitivity likelihood contours at 68% CL and 95% CL for an integrated luminosity L = 3000 fb −1 for modified second generation quark Yukawa couplings, using the c-tagging I (upper pannel, left), II (upper pannel, right) and III (lower pannel ) working points.
the first generation quarks, we found that limits can be set when considering quark annihilation with subsequent decay of the Higgs boson pair to bbγγ. In an effective theory description with dimension 6 operators that modify the quark Yukawa couplings, there exists also a coupling of two Higgs bosons to two fermions. This coupling increases the Higgs pair production cross section and hence allows to set bounds on the light quark Yukawa coupling modifications. For the HL-LHC we found a sensitivity of |κ u | 1170 and |κ d | 850, cf. fig. 13, which is comparable to the sensitivity of other channels that can directly probe the light quark Yukawa couplings though being weaker than the results from a global fit. Further improvements could be possible with a more dedicated analysis. We note though that the bounds we find stem mostly due to the diagram involving the coupling of two Higgs bosons to two quarks, as we showed explicitly also by considering a non-linear effective theory in which the coupling of one and two Higgs boson to fermions are uncorrelated. This channel can pp → hh → bbγγ 95% CL contours κ c (c-tag III ) κ s /10 κ u /120 κ d /100 Figure 18: The expected sensitivity likelihood contours at 95% CL for an integrated luminosity L = 3000 fb −1 for modified Higgs trilinear coupling κ λ vs the light quark Yukawa couplings scalings κ q .
hence also be used to distinguish between a linear vs non-linear Higgs EFT hypothesis in the light quark sector. The LHC experiments should hence consider the Higgs pair production process in addition to other channels for probing the light quark Yukawa couplings. For the second generation quarks we found that at the HL-LHC in the di-Higgs channel we will be able to set competitive bounds on the charm Yukawa coupling if final states with tagged charm quarks are considered. We were in particularly considering the final state ccγγ, in which we found a sensitivity of |κ c | 5 and |κ s | 100, cf. fig. 17, where the first prospective limit is comparable to the prospects from charm tagging in the V h channel [20].

A Parameter values as used in the analysis
In this appendix, we give the input parameters for masses, widths, and couplings as used in the Pythia simulation, see table 6. The collider input is given in 3.894 × 10 11 fb GeV −2 conversion from GeV 2 to fb     1 Top quark decay before fragmentation. MSTP (61) 1 Turn on initial state radiation. MSTP (71) 1 Turn on final state radiation. MSTJ (41) 1 Turn off QED bremsstrahlung. MSTP (81) 1 Multiple interaction on. MSTP(111) 2 Allow fragmentation and decay. MSTP (42) 0 Turn off shell boson production.