1 Introduction

Since the dawn of the standard model (SM), the vector boson scattering (VBS) process has been heralded as a cornerstone to test the high-energy behaviour of the electroweak sector. Such importance originated in calculations of scattering amplitudes involving longitudinally polarised vector bosons which, in the absence of a Higgs boson, were shown to grow quadratically with energy and eventually violate unitarity bounds [1,2,3,4,5,6]. The ability to fully scrutinise the VBS process was therefore one of the motivations to project the ill-fated superconducting super collider (SSC) with a center of mass energy of \(\sqrt{s}=40\) TeV [7]. If the Higgs boson were not responsible for electroweak symmetry breaking, the SSC might have been able to discover new resonances in the high-energy tail of VBS events.

While we know now that the Higgs boson, following its discovery in 2012 [8, 9], unitarises the VBS cross-sections, such processes still provide unique sensitivity to deformations of the SM at high energies, such as those parametrised by the standard model effective field theory (SMEFT) [10,11,12]. VBS therefore provides a fully complementary probe to investigate the electroweak sector of the SMEFT compared to processes such as on-shell Higgs production or gauge-boson pair production, both in terms of covering a different energy regime (up to the TeV scale) and by its contributions from different EFT operator combinations. A particularly attractive feature of VBS in this context is the appearance of quartic gauge couplings (QGCs), which have often led to a theoretical interpretation of VBS data in terms of anomalous QGCs (aQGCs).

One significant challenge in studying the VBS process at the LHC is the rather small signal-to-noise ratios due to its electroweak nature, with backgrounds being dominated by QCD-induced diboson production. Fortunately, VBS also benefits from a characteristic signature that allows for a relatively clean isolation, defined by two energetic jets in the forward region and a large rapidity gap between them that contains reduced hadronic activity.Footnote 1 The combination of this characteristic topology together with the improved analysis of the high statistics delivered during Run II of the LHC (\({\mathcal {L}}=140\) \(\hbox {fb}^{-1}\) at \(\sqrt{s}=13\) TeV) has made possible not only the identification of VBS events with reasonable statistical significance, but also the measurement of the associated unfolded cross-sections and differential distributions in the fiducial region [15,16,17,18,19,20,21,22]. In particular, VBS measurements from ATLAS and CMS based on the full Run II dataset have recently been presented for different final states, from \(W^{\pm }W^{\pm }jj\) and \(ZW^{\pm }jj\) [15] to ZZjj [18, 19], including one analysis targeting polarized \(W^{\pm }W^{\pm }\) scattering [22].

In the past, searches for new physics using VBS processes have either been based on unitarisation techniques [23,24,25,26] or interpreted in terms of anomalous gauge couplings, where the SM couplings are rescaled by phenomenological parameters fitted from the data [19, 27,28,29]. However, this approach is only beneficial for bookkeeping purposes since, among other limitations, it violates gauge invariance. For this reason, different strategies based on effective field theories have been advocated [30,31,32,33] to interpret multi-boson and VBS measurements. These EFT-based approaches have numerous advantages over the previous phenomenological approaches: they respect the fundamental symmetries of the SM, are systematically improvable in perturbation theory, allow the correlation of eventual deviations between different processes, and can accommodate a meaningful quantification of theoretical uncertainties. We note that, beyond the SMEFT, other effective theory interpretations of VBS data have been considered such as those based on the Electroweak Chiral Lagrangian [34,35,36,37], where the Higgs boson is not necessarily part of an SU(2) doublet.

With this motivation, VBS measurements have often been interpreted in the SMEFT framework to identify, parametrise, and correlate possible deviations in the structure of the electroweak gauge couplings compared to the SM predictions. However, these studies have so far [38,39,40,41] been mostly restricted to a selection of dimension-eight operators [31, 42], in particular those that induce aQGCs without modifying the triple gauge couplings (TGCs). As emphasized in Ref. [43], it is theoretically inconsistent to derive bounds on aQGCs from VBS data accounting for dimension-eight operators while neglecting the dimension-six ones, which also modifying the electroweak interactions that enter the same observables. The fact that available EFT interpretations of VBS processes ignore the contribution from dimension-six operators casts doubts on the robustness of the obtained aQGCs bounds.

While several works have investigated the effects of dimension-six operators on diboson production [44,45,46], including the impact of QCD corrections to the EFT cross-sections [47,48,49], much less attention has been devoted to the corresponding effects on VBS processes [43, 50,51,52]. In this work, we present for the first time a systematic interpretation of VBS fiducial cross-sections and unfolded differential distributions from the LHC in the framework of the dimension-6 SMEFT at linear order, \({\mathcal {O}}\left( \varLambda ^{-2}\right) \), in the effective theory expansion. Our study is carried out within the SMEFiT framework, a toolbox for global EFT interpretations of experimental data which has been deployed to characterise the top-quark sector [53] and is currently being updated to perform a combined EFT analysis of Higgs boson, top-quark, and diboson measurements from LEP and the LHC in Ref. [54].

In the present study, we consider all available VBS measurements of fiducial cross-sections and distributions, in most cases based on the full Run II integrated luminosity. These are complemented by the most updated QCD-induced diboson production datasets from ATLAS and CMS [55,56,57,58,59], which are interpreted simultaneously within the same EFT theoretical framework as the VBS measurements. We demonstrate how the VBS measurements provide complementary information on several operators relevant for the description the electroweak sector of the SMEFT, in particular those modifying the triple and quartic gauge couplings. In addition, we quantify the impact of the VBS data by direct fits and by using statistical metrics such as information geometry and principal component analysis. We also highlight the consistency between the constraints separately provided by the VBS and diboson data on the dimension-six operators considered, representing a non-trivial stress-test of the gauge sector of the SMEFT. Overall, our analysis motivates the systematic inclusion of VBS data in global SMEFT interpretations [60,61,62,63,64,65,66,67,68,69].

While we have now the first VBS unfolded measurements of cross-sections and differential distributions, they are limited by statistics. Accessing the full physics potential associated to VBS processes will only be achieved with the analysis of the complete dataset from the High Luminosity LHC [70, 71]. In particular, the HL-LHC will provide access to the high energy region of \(VV'\rightarrow VV'\) scattering and has the potential to disentangle contributions from \(V_LV_L'\) polarised scattering [72,73,74,75]. To quantify this impact, we present projections for the reach in the EFT parameter space of the VBS measurements expected at the HL-LHC, which demonstrate a significant increase in sensitivity compared to current measurements.

The structure of this paper is as follows. First, we present the theoretical framework of the analysis in Sect. 2, in particular our definition of the dimension-six operator basis and the flavour assumptions. In Sect. 3 we describe the VBS and diboson data used as input for our EFT fit, outline the details of the corresponding SM theoretical calculations, and present different measures of the expected operator sensitivity. The main results of this work are then presented in Sect. 4, where we derive bounds on the relevant operators and discuss the interplay between the various data sets. Finally, we study in Sect. 5 the impact that future measurements of VBS processes at the HL-LHC will have on the EFT parameter space, followed by a summary and indication of possible future developments in Sect. 6.

2 Theoretical framework

In this section we introduce the dimension-six SMEFT operators that will be considered for the interpretation of the vector boson scattering and diboson measurements at the LHC. Restricting ourselves to dimension-six operators, we can express the SMEFT Lagrangian as,

$$\begin{aligned} {\mathcal {L}}_{\mathrm{SMEFT}} = {\mathcal {L}}_{\mathrm{SM}} + \sum _{i=1}^{n_{\mathrm{op}}} \frac{c_{i}}{\varLambda ^2} {\mathcal {O}}_i^{(6)}, \end{aligned}$$
(2.1)

where the \({\mathcal {O}}_i^{(6)}\) represent a complete basis of operators built upon the SM fields with mass dimension equal to six, and \(c_i\) are their corresponding Wilson coefficients. These operators respect the fundamental symmetries of the SM such as gauge and Lorentz invariance. In Eq. (2.10), \(\varLambda \) indicates the energy scale that determines the regime of validity of the EFT approximation. For instance, \(\varLambda \) can be interpreted as the typical mass of the new heavy particles that arise in the ultraviolet (UV) completion of the SM. Note that, from a bottom-up phenomenological analysis, only the ratio \(c_i/\varLambda ^2\) can be determined, rather than the two parameters separately.

In this work, we will focus on those operators that modify the interactions of the electroweak gauge bosons. These will involve the weak gauge field strength tensors

$$\begin{aligned} W^{I}_{\mu \nu }= & {} \partial _\mu W^{I}_\nu - \partial _\nu W^I_\mu - g_2 \epsilon ^{IJK} W^J_\mu W^K_\nu , \end{aligned}$$
(2.2)
$$\begin{aligned} B_{\mu \nu }= & {} \partial _\mu B_\nu - \partial _\nu B_\mu , \end{aligned}$$
(2.3)

as well as the SM covariant derivative, given by

$$\begin{aligned} D_\mu = \partial _\mu + i g_2 \frac{\sigma ^{I}}{2} W^{I}_\mu + i g_{1} Y_f B_\mu , \end{aligned}$$
(2.4)

where \(g_1, g_2\) are the weak couplings, \(\sigma ^{I}\) are the Pauli matrices (SU(2)\(_L\) generators), and \(Y_f\) is the fermionic hypercharge. Here we neglect strong interaction effects, which play a limited role in the description of the VBS process, and set to zero the masses of all leptons and quarks except for the top quark.

Some of the relevant dimension-six operators for this analysis will also involve the Higgs doublet field, defined in the unitary gauge by

$$\begin{aligned} \varphi = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h \end{pmatrix} , \end{aligned}$$
(2.5)

with \(v=246\) GeV being the Higgs vacuum expectation value (vev) and h represents the \(m_h=125\) GeV Higgs boson. Here we will also consider CP-odd operators, which are constructed in terms of the dual field strength tensors, defined by

$$\begin{aligned} {\widetilde{X}}_{\mu \nu } = \frac{1}{2} \epsilon _{\mu \nu \rho \sigma } X^{\rho \sigma } , \end{aligned}$$
(2.6)

and whose presence leads to CP-violating effects which are potentially observable in the electroweak sector [76,77,78,79,80].

There exist several bases that span the SMEFT operator space at dimension-six. In this work we adopt the Warsaw basis [81], which contains 59 operators for one fermion generation, and consider only those operators that contain at least one electroweak gauge field. This means, in particular, that we neglect the contributions from four-fermion operators as well as from those that modify the Yukawa interactions and the Higgs self-coupling.

Flavour assumptions In this work, we will assume that the operator structure is the same across the three fermionic families, the so-called SU(3)\(^5\)-symmetric model. In other words, we assume flavour universality of the UV-complete theory. In practice, this means that all Warsaw basis operators that contain fermion generation indices will be understood as diagonal and summed over generations, e.g.,

$$\begin{aligned}&[c_{\varphi f}]_{ij} ( \varphi ^{\dagger }\overleftrightarrow {D}_{\mu }\varphi )(\bar{f}_i \gamma ^\mu f_j) \nonumber \\&\quad \longrightarrow c_{\varphi f} \sum _{i=1}^3 ( \varphi ^{\dagger }\overleftrightarrow {D}_{\mu }\varphi )(\bar{f}_i \gamma ^\mu f_i) . \end{aligned}$$
(2.7)

Note that, as a consequence of this SU(3)\(^5\) symmetric flavour structure, when comparing with constraints obtained in EFT fits based on more general flavour specific operators, such as those that single out the top quark, the value of our coefficient will be the average of the flavour-dependent coefficients in that analysis.

Purely bosonic operators To begin, we define the purely bosonic operators that modify the gauge structure of the theory as compared to the SM. In Table 1 we list the dimension-six operators constructed from bosonic fields that modify the interactions of the electroweak gauge bosons and which are considered in this work. For each operator, we indicate its definition in terms of the SM fields and also the notation conventions adopted both for the operator and for the Wilson coefficient. Note that, as mentioned above, we consider both CP-even and CP-odd operators.

The only CP-even modifications of the triple and quartic gauge couplings arise from \({\mathcal {O}}_{W}\). In addition, we account for possible CP-odd contributions to the aTGC and aQGC from the \(\lbrace {\mathcal {O}}_{{\widetilde{W}}} , {\mathcal {O}}_{ \varphi {\widetilde{W}}} , {\mathcal {O}}_{ \varphi {\widetilde{B}}} , {\mathcal {O}}_{ \varphi {\widetilde{W}}B} \rbrace \) operators. The remaining operators in this category modify the Higgs-gauge (hVV and hhVV) vertices. They appear in the processes either by means of Higgs decays (through the interference of \(gg \rightarrow h \rightarrow 4 \ell / 2 \ell 2\nu \) with diboson production), or through the t-channel Higgs exchange contributions to the VBS cross-sections. Furthermore, the operators \({\mathcal {O}}_{ \varphi WB}\) and \({\mathcal {O}}_{\varphi D}\) also enter the definitions of the gauge masses and mixing angle in the SMEFT Lagrangian, and are hence both dependent of our scheme choice.

Table 1 Dimension-six purely bosonic operators that modify the interactions of the electroweak gauge bosons. For each operator, we indicate its definition in terms of the SM fields, and the notational conventions that will be used both for the operator and for the Wilson coefficient

Two-fermion operators Another relevant class of dimension-six operators that modify the interactions of the electroweak gauge bosons are those composed by two fermion fields and two Higgs fields, where the gauge bosons enter via the covariant derivative. These operators describe new contact interactions involving fermions with gauge and Higgs bosons which are unrelated to the Yukawa couplings. They generate corrections to the \(V\ell \ell \) and \(V q {\bar{q}}\) vertices and can be constrained, among other processes, from the electroweak precision observables (EWPOs) measured by LEP [82]. They also generate contact interactions of the form \(h V f {\bar{f}}\) which affect specific Higgs boson production and decay processes. The two-fermion operators that will be considered in this work are listed in Table 2, and consist of seven CP-even operators containing each two Higgs doublets, a covariant derivative, and two fermionic fields. In the definition of these operators, we have introduced

$$\begin{aligned} \overleftrightarrow {D}_\mu \equiv ( D_\mu + \overleftarrow{D}_\mu ), \end{aligned}$$
(2.8)

which is required to ensure that operators with fermionic neutral currents are Hermitian. All the operators listed in Table 2 are CP-even.

Table 2 Dimension-six operators that modify the interactions of the electroweak gauge bosons and that are composed by two fermion fields, two Higgs fields, and one covariant derivative. The sum over the index j runs over the three SM generations, which are treated symmetrically in this study

Dipole operators These operators involve the direct interactions between gauge bosons and fermions, rather than the indirect ones that proceed via the covariant derivative such as the operators listed in Table 2. They have a special Lorentz structure connecting same-helicity fermions. In general, they do not interfere with the SM, except for a few cases where the light Yukawa couplings are taken to be nonzero. Since our analysis is restricted to the \({\mathcal {O}}\left( \varLambda ^{-2}\right) \) corrections to the VBS and diboson cross-sections and we neglect quark masses, we do not need to consider these operators here.

Parameter shifts and EWPOs Some of the dimension-six SMEFT operators generate a contribution to the relevant electroweak parameters,

$$\begin{aligned} m_Z, m_W, G_F, \sin ^2 \theta _W , \alpha _{\mathrm{EW}} , \end{aligned}$$
(2.9)

and depending on which input parameter scheme (IPS) one adopts, the expressions for \(\lbrace g_1 , g_2 , v \rbrace \), and hence for the resulting SM Lagrangian and Feynman rules, will be different. The operators affecting these electroweak input parameters are closely connected with the EWPOs and are thus significantly constrained by the former. In particular, the \(c_{\varphi l}^{(3)}\) and \(c_{l l}\) coefficients modify the definition of Fermi’s constant \(G_F\), while \(c_{\varphi W B}\) and \( c_{\varphi D}\) enter the Z mass and mixing angle. They can be well constrained through the measurement of the muon lifetime and of the EW oblique parameter respectively [83]: \(c_{\varphi W B}\) affects directly the value of the S parameter, also known as \(\rho \) [84], whereas \(c_{\varphi D}\) contributes to the T parameter.

Several BSM and EFT fits of these EWPO have been performed in recent years [61, 85, 86], and furthermore various LHC analyses tackle the extraction of the same EWPOs from LHC data [87,88,89,90,91,92], mostly relying on Drell-Yan production and related processes. Here we choose not to account for these constraints in our study, and constrain the coefficients of the operators listed in Tables 1 and 2 solely from the VBS and diboson measurements.

In the future, once the VBS measurements are integrated in the global EFT analysis, one will be able to constrain these electroweak parameter shifts by including both the LEP’s EWPOs, the LHC Drell-Yan data directly [93,94,95,96,97,98], and all other measurements (e.g. Higgs production) sensitive to them.

Overview of fitted degrees of freedom We summarise in Table 3 the degrees of freedom considered in the present work, categorised into purely bosonic and two-fermion operators. We also indicate the notation that will be used in some of the plots and tables of the following sections. We end up with \(n_{\mathrm{op}}=16\) independent coefficients, of which 9 are purely bosonic and 7 are two-fermion operators. Of the purely bosonic operators, 5 are CP-even and 4 are CP-odd. Recall that we use symmetric flavour assumptions and thus the operators involving quarks or leptons are summed over the three SM generations.

Table 3 Overview of the degrees of freedom considered in this analysis, separated into purely bosonic (upper) and two-fermion (lower part) operators. We also indicate the notation that will be used in some of the plots and tables of the following sections See Tables 1 and 2 for their definitions in terms of the SM fields

Amplitudes and cross-sections The dimension-six operators that compose the SMEFT Lagrangian Eq. (2.1) modify a generic SM cross-section to be,

$$\begin{aligned} \sigma _{\mathrm{SMEFT}} = \sigma _{\mathrm{SM}} + \sum _i^{n_{\mathrm{op}}} \frac{c_i}{\varLambda ^2}\sigma ^{(\mathrm eft)}_i + \sum _{i,j}^{n_{\mathrm{op}}} \frac{c_i c_j}{\varLambda ^4} {\widetilde{\sigma }}^{(\mathrm eft)}_{ij} , \end{aligned}$$
(2.10)

where \(\sigma _{\mathrm{SM}}\) indicates the SM prediction and the Wilson coefficients are assumed to be real. The \({\mathcal {O}}(\varLambda ^{-2})\) terms arise from EFT operators interfering with the SM amplitude and in most cases correspond to the dominant correction. For this reason, the cross-sections \(\sigma ^{(\mathrm eft)}_i\) are usually denoted as the SMEFT linear interference terms.

The third term in the RHS of Eq. (2.10) contains the quadratic contribution arising from the square of the amplitudes involving dimension-six operators, and scales as \({\mathcal {O}}(\varLambda ^{-4})\). These quadratic terms are of the same order of the dimension-eight operators that interfere with the SM amplitudes and that modify the TGCs and QGCs. Given that we consider here only dimension-six operators, the consistent inclusion of \({\mathcal {O}}(\varLambda ^{-4})\) corrections to VBS processes is left for future work and we restrict ourselves to the linear approximation.

We note that linear EFT interference effects due to CP-odd operators remain CP-odd, while squared CP-odd terms become CP-even and thus are difficult to disentangle from their CP-even counterparts [99]. For this reason, it is interesting to study CP-odd operators in processes for which the linear EFT terms are dominant, such as the high energy bins of differential distributions, or by looking at specific observables such as asymmetries. Separating the impact of CP-even and CP-odd operators has been studied mostly in the context of EFT analysis of the Higgs sector [78, 99,100,101,102].

The SMEFT is defined to be valid for energies satisfying \(E \ll \varLambda \). A lower bound on the value of \(\varLambda \) is given by the highest energy scale of the data included in our fit, which as discussed in Sect. 3 turns out to be around \(E\simeq 3\) TeV. An upper bound on \(\varLambda \) cannot be set from first principles and requires the observation of a hypothetical heavy resonance. In the rest of this paper, we will assume for simplicity \(\varLambda =1\) TeV, with the caveat that results for any other values of \(\varLambda \) can be obtained by a trivial re-scaling.

Fig. 1
figure 1

EFT corrections modifying the quartic (left panel) and triple (middle panel) gauge couplings in vector-boson scattering, as well as the the t-channel Higgs exchange contribution (right panel) and the \(V f {\bar{f}}\) interaction vertices. In this work we consider only final states where the gauge bosons decay leptonically

Fig. 2
figure 2

Same as Fig. 1 for two representative EFT diagrams contributing to diboson production: a pure diboson diagram (left) and another for which diboson production interferes with the \(h\rightarrow VV\) process (right)

Interplay between VBS and diboson production Gauge boson pair production has been extensively studied as a precision probe of the electroweak sector of the SM and its various extensions, first in the context of precision SM electroweak tests at LEP and more recently in the EFT framework and accounting for the corresponding LHC measurements [44,45,46,47,48,49]. Since diboson production is a relatively clean process with large cross-sections [103], fiducial cross-sections and differential distributions have been measured with high precision by ATLAS and CMS.

Most of the dimension-six operators listed in Table 3 modify also the theoretical calculation of diboson cross-sections, and thus it would seem that VBS data might be redundant for EFT studies. While indeed dimension-six EFT effects can be well constrained by diboson production at the LHC [44, 104], here we will show that VBS measurement provide non-trivial, complementary information for many of these operators. Furthermore, the role of VBS measurememnts is only bound to increase as more data is accumulated, in particular at the HL-LHC.

In VBS, only one CP-even operator in the Warsaw basis affects directly the triple and quartic gauge couplings, with three more operators contributing once CP-odd effects are allowed. Beyond these modifications of the TGCs and QGCs, the VBS process is also sensitive to several other dimension-six operators, given the large amount of vertices and topologies contributing to the definition of the its final state. This is illustrated in Fig. 1, where we show representative diagrams for EFT corrections to quartic and triple gauge couplings as well as the the t-channel Higgs exchange contribution.

In the case of WW diboson production at LEP, the process is sensitive to the triple gauge couplings ZWW and \(\gamma WW\) at leading order in the EFT expansion, and thus the corresponding EFT parametrisation will include the modification of the TGC (through \(c_W\)). It will also modify the \(e {\bar{e}} Z\) vertex and the corresponding IPS dependence, which could include \(c_{\varphi WB},c_{\varphi D}\) and \(c_{\varphi l}^{(3)}\), and even some contact term of the form \(e {\bar{e}} W W\), generally not interfering with the SM. Similar considerations apply for diboson production at hadron colliders, although now a new feature appears, namely the interference with Higgs production in gluon fusion followed by the \(h \rightarrow VV\) decay. This correction induces a non-negligible sensitivity to the \(c_{\varphi B}\) and \(c_{\varphi W}\) coefficients in gauge boson pair production at the LHC. These features are illustrated in Fig. 2.

3 Experimental data and theoretical calculations

In this section we describe the experimental data sets that will be used in the present analysis as well as the corresponding theoretical predictions both in the SM and at the EFT level. We also quantify the sensitivity that each of the VBS and diboson data have on the coefficients associated to the dimension-six operators introduced in Sect. 2.

3.1 Vector boson scattering

At hadron colliders, vector boson scattering occurs when two vector bosons are radiated off incoming quark lines and scatter into another pair of vector bosons, \(VV'\rightarrow VV'\). The latter decay either leptonically or hadronically, and thus the VBS amplitude will be proportional to \(\alpha _{\mathrm{EW}}^6\). Fig. 3 displays representative Feynman diagrams associated to vector boson scattering at the LHC for the ZZjj channel. The sensitivity to quartic gauge couplings is a unique feature of this process, and in particular the longitudinally polarised scattering amplitude \(V_LV'_L\rightarrow V_LV_L'\) provides a direct probe of the high-energy behaviour of the theory. We emphasize again that QGCs represent only a fraction of the VBS events, and thus a complete description of the process requires accounting for EFT effects in all possible topologies, as discussed in Sect. 2.

Fig. 3
figure 3

Representative Feynman diagrams for vector boson scattering in the ZZjj final state (top row) and its main background, the QCD-induced diboson production (bottom row)

The characteristic VBS topology is defined by two energetic jets with moderate transverse momenta, \(p_T\sim M_V/2\), which therefore are produced relatively close to the beam pipe and appear predominantly in the forward region of the detectors. The specific final-state signature that we will focus on in this work is thus composed by four leptons (either charged or neutral) and two jets in the forward region exhibiting a large invariant mass \(m_{jj}\) and wide rapidity separation \(\varDelta y_{jj}\). Furthermore, being a purely electroweak process, there is no color flow between the two incoming quark lines. This implies that the central rapidity region between the two tagging jets will have a reduced amount of hadronic activity, known as the “rapidity gap”.

As highlighted by the bottom diagrams of Fig. 3, the vector boson scattering process is affected by large backgrounds from QCD-induced diboson production processes with similar topology, with amplitudes proportional instead to \(\alpha _{\mathrm{EW}}^4 \alpha _s^2\). The interference terms between the diboson and VBS processes are usually small and therefore will be neglected in this analysis. Beyond diboson production, other sources of background to VBS include \(t+V\), \(t{\bar{t}}\), V+jets and QCD multijet production and are generally small. While the diboson inclusive cross-section is much larger than the VBS one, provided the statistics are large enough, one can efficiently disentangle the two processes by focusing on the large \(m_{jj}\) and \(\varDelta y_{jj}\) region (or related kinematic variables) where the VBS processes dominates.

Theoretical simulations In order to evaluate the expected cross-sections and differential distribution for the VBS (and the diboson) processes, we use two Monte Carlo generators, MG5_aMC@NLO [105] and POWHEG-box [106,107,108], to generate NLO QCD matrix elements. QCD corrections represent up to a \({\mathcal {O}}(100\%)\) effect for diboson processes [57, 58, 109], while in VBS, a purely electroweak process, they amount to a few percent [110,111,112,113,114,115,116], although they can modify the shape of distributions. Here we adopt the NNPDF3.1NNLO no-top PDF set [117].

The fixed-order NLO events are then showered with Pythia8 [118,119,120]. Accounting for parton shower effects is especially relevant for the modelling of additional soft QCD radiation in diboson production. It is also convenient to facilitate the matching between the theoretical predictions with the experimental analyses. However, since we restrict ourselves to fully leptonic final states, both hadronisation, underlying event, and multiple parton interactions are switched off in the Pythia8 simulation. The showered events are further processed with Rivet [121], a crucial step to reproduce the experimental selection requirements and acceptance cuts, given that only a subset of these can be implemented at the generation level. Moreover, this allows us to compare directly with the datasets published in HEPData [122].

Table 4 The settings of the theoretical calculations used for the description of the LHC cross-sections included in the present analysis. We indicate, for both the SM and the EFT contributions, the perturbative accuracy and the codes used to produce the corresponding predictions. All the simulations are first generated at fixed-order and then matched to a parton shower using Pythia8

Bottom quarks are always included in the initial state (\(n_f=5\) scheme) and sometimes also in the definition of the final state jets, following the prescription in the associated experimental analysis.

The signal to background ratio in VBS is generally small, and for this reason most VBS differential results are only available as a sum of EW- and QCD-induced processes, which can only be disentangled at the level of fiducial cross-sections. To account for this, in the simulation of VBS processes we generate MC events corresponding to both the EW-induced contributions (signal) and the QCD-induced contributions (background), with EFT corrections included only in the former.Footnote 2

The evaluation of the linear EFT cross-sections, \(\sigma ^{(\mathrm eft)}_i\) in Eq. (2.10), is carried out with MG5_aMC@NLOinterfaced with SMEFTsim [123], in its \(\lbrace m_W , m_Z , G_F \rbrace \) IPS implementation. Specifically, we compute the linear EFT cross-sections at LO in the SMEFT, and then calculate an NLO/LO K-factor assuming that the QCD corrections to the SM cross-sections factorise such that they can be assumed to be the same in the EFT. Nevertheless, we found the impact of this assumption to be rather small at the level of our fit results. In future work, it would be advisable to use exact NLO QCD calculations for the EFT cross-sections, such as the ones presented in [47,48,49], or by using for example SMEFT@NLO [124].

In Table 4 we summarize the settings of the SM and EFT theoretical calculations used to evaluate the LHC VBS and diboson cross-sections included in the fit. The perturbative accuracy and the codes used to produce the corresponding predictions for both the SM and the EFT contributions are also given.

Same sign \({{W}}^{\pm }{{W}}^{\pm }{{jj}}\) production In this category we consider two data sets, one from ATLAS [16, 128] based on \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\) and another from CMS based on the full Run II luminosity [15, 129], \({\mathcal {L}}=137\) \(\hbox {fb}^{-1}\). Theoretical predictions are evaluated using MG5_aMC@NLO  and then showered with Pythia8. Only the fiducial cross section measurement from ATLAS is used in the fit, since no differential distributions are available. Concerning the CMS measurement, the input to the fit is the differential distribution in the mass of the charged lepton pair \(m_{ll'}\), which includes the sum of VBS (EW-induced) and diboson (QCD-induced) contributions. In addition, we include the VBS-only fiducial cross section measurement. To avoid double counting, we remove one bin of the aforementioned distribution. Figure 4 displays the CMS \(m_{ll'}\) measurement together with the corresponding EW+QCD-induced theoretical predictions, finding good agreement. These theoretical predictions also agree with those presented in the original CMS publication [15].

Fig. 4
figure 4

The dilepton invariant mass distribution \(m_{ll'}\) from the CMS measurements of same-sign \(W^{\pm }W^{\pm }jj\) production [15], compared with the corresponding diboson (QCD-induced) plus the VBS (EW-induced) theoretical predictions. The error bar on the CMS data points indicates the total experimental uncertainty. In the lower panel, we display the ratio of theory over data.

Fig. 5
figure 5

The \(W^{\pm }Zjj\) production measurements from ATLAS [17] (left) and CMS [15] (right). In both cases the EW-induced induced contributions, which are being added to the QCD-induced ones, are separated into \(W^{+}Zjj\) and \(W^{-}Zjj\)

\({{W}}^{\pm }{{Zjj}}\) production In this category we include the \(m_{T}^{WZ}\) differential distribution from the ATLAS measurement [17, 130] based on \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\), which consists again on the sum of VBS signal and diboson background. For this dataset the full bin-by-bin correlation matrix is available and is accounted for in the fit. We also include the (signal plus background) differential distribution in the dijet invariant mass from CMS, \(d\sigma / d m_{jj}\), based on the full Run II dataset luminosity of \({\mathcal {L}}=137\) \(\hbox {fb}^{-1}\) [15, 129]. Again, we include the EW-only fiducial cross-section from CMS in addition to the differential distribution, and remove a bin from the latter to avoid double counting. Theoretical predictions for this process are evaluated at NLO with POWHEG-boxfor the EW component and at LO for the QCD diboson background with MG5_aMC@NLO.

Figure 5 displays a comparison between our theoretical predictions and the \(W^{\pm }Zjj\) production measurements from ATLAS [17] (absolute \(m^{WZ}_T\) distribution) and from CMS [15] (absolute \(m_{jj}\) distribution). For completeness, the EW-induced induced contributions, which are being added to the QCD ones, have been separated into \(W^{+}Zjj\) and \(W^{-}Zjj\). In the case of the CMS \(m_{jj}\) measurement, there is good agreement between data and theory, and one can observe how the VBS contribution clearly dominates over the QCD-induced processes at large dijet invariant masses \(m_{jj}\). For the ATLAS measurement, we observe some tension on the second bin in \(m_{T}^{WZ}\) where the theory undershoots the data, a behaviour that was also observed in the original analysis [17]. Both the ATLAS and CMS \(W^{\pm }Zjj\) measurements benefit from sensitivity to the high-energy region, covering kinematics of up to \(m_{T}^{WZ}\simeq 1\) TeV for ATLAS and \(m_{jj}=3\) TeV for CMS, which highlights their potential for constraining EFT operators that modify the VBS process.

ZZjj production Here we consider two recently released measurements from ATLAS [18, 131] and CMS [19, 132] based on the full Run II luminosity of \({\mathcal {L}} \approx 140\) \(\hbox {fb}^{-1}\). The ATLAS analysis represents their first VBS measurement in the ZZjj final state, while the CMS one updates a previous study of the same final state [133]. In the ATLAS case, we include the fiducial VBS cross section, which accounts for both EW- and QCD-induced contributions, while from CMS we include the EW-induced fiducial cross section together with the detector-level differential distribution in \(m_{ZZ}\) for the sum of the EW and QCD-induced contributions. Since the latter is not unfolded, it requires some modelling of detector effects. For this reason, our baseline dataset used in the fit will include only unfolded measurements, with the detector-level ones used as an additional cross-checkFootnote 3

The theoretical calculation for the ZZjj process for the signal (EW-induced) events is simulated at NLO using POWHEG-box[50] and at LO with MG5_aMC@NLO for the QCD-induced background. As discussed in Sect. 3.3, the ZZjj final state exhibits a large sensitivity to the EFT operators considered in this work, but their practical impact in the fit is moderate due to the large experimental uncertainties. In Fig. 6 we compare the number of events per \(m_{ZZ}\) bin between the theoretical predictions and the detector-level experimental data from CMS in the ZZjj final state based on the full Run II luminosity. In this comparison, our simulations account for the QCD- and EW-induced ZZjj contributions, while the other sources of background are taken from the original publication [19]. Note that the error band on the data points includes only the statistical uncertainty, which is dominant. The overall detector selection efficiency is modelled here by comparing the theory prediction for the fiducial cross-section with the expected yields in the folded distribution. In general, we observe a fair agreement between the theory simulations and the experimental data once the experimental uncertainties are accounted for.

Fig. 6
figure 6

The CMS detector-level VBS measurement in the ZZjj final state based on the full Run II luminosity [19]. Here we compare the number of events per \(m_{ZZ}\) bin between the theoretical predictions and the experimental data. The error bars display only the statistical uncertainty

\({{\gamma }} {{Zjj}}\) production Finally, we consider the rare VBS final state composed by a photon \(\gamma \) and a Z boson which subsequently decays leptonically. In this case, we have available two fiducial cross-section measurements for the electroweak production of a \(Z\gamma \) pair in association with two jets from ATLAS [20] and CMS [21, 134] based on the 2016 dataset with \({\mathcal {L}}\simeq 36\) \(\hbox {fb}^{-1}\). As for the ZZjj final state, we will consider here one detector-level distribution from ATLAS as a consistency check. Our theoretical predictions for this channel are evaluated at LO with MG5_aMC@NLO  and are found to be in good agreement with the data. This channel is interesting for our study both because of its sensitivity to neutral Higgs couplings as well as its ability to break degenerate solutions in the EFT parameter space. Moreover, we found that ATLAS and CMS have taken very different approaches to the definition of the phase space, which is already useful at the level of the cross-section and would mean an increased EFT sensitivity if unfolded distributions were also available. In Fig. 7 we report the reconstructed differential distribution. Our theoretical simulation includes only the EW signal, while other sources of background (QCD-induced \(\gamma Z\), \(Z+\mathrm{jets}\), and \(t{\bar{t}}\gamma \)) are taken from [20]. For this process, EW-induced VBS contributes only to \(\sim 10 \%\) of the total events, thus the impact of this distribution to the EFT fit is expected to be moderate.

Fig. 7
figure 7

Comparison between data and theory predictions for the ATLAS measurement of VBS in the \(\gamma Zjj\) final state [20]. Here we compare the number of events per \(p_{T}^{\ell \ell \gamma }\) bin between our predictions and the experimental data

Overview of VBS measurements A summary of the VBS datasets to be considered in our EFT interpretation is collected in Table 5. For each dataset, we indicate the final state, the selection criteria (e.g. EW-only versus EW+QCD contributions), the experimental observable, the number of data points \(n_{\mathrm{dat}}\) and integrated luminosity \({\mathcal {L}}\), as well as the dataset label and the original reference. In the data labelled with \(^{(*)}\), one bin from the differential distribution has been traded by the associated fiducial cross section to avoid double counting. In those cases, the latter corresponds to the EW-only component and thus exhibits increased sensitivity to the EFT operators, and \(n_{\mathrm{dat}}\) indicates the actual number of fitted data points. In this overview we separate the unfolded from the folded, detector-level data, since only the former will be part of the baseline dataset. Overall, we end up with \(n_{\mathrm{dat}}=18\) unfolded VBS cross-sections and \(n_{\mathrm{dat}}=15\) bins for the detector-level distributions, giving a total of \(n_{\mathrm{dat}}=33\) fitted data points. As will be shown in Sect. 4, the addition of the detector-level distributions has a significant impact in a VBS-only EFT fit, but only a marginal effect in the joint VBS+diboson analysis.

Table 5 Overview of the VBS measurements considered in this EFT analysis. We indicate the final state, the selection criteria, the experimental observable, the number of data points \(n_{\mathrm{dat}}\) and integrated luminosity \({\mathcal {L}}\). In the datasets labelled with \(^{(*)}\), one bin from the differential distribution has been traded for the fiducial cross section. We separate the unfolded (baseline) from the detector-level (used for cross-checks) datasets
Fig. 8
figure 8

Representative Feynman diagrams for opposite-sign \(W^\pm W^{\mp }\) diboson production, where the first two diagrams correspond to leading order processes while other two to gluon-initiated loop-induced contributions

3.2 Diboson production

In this work, gauge boson pair production is defined as the process whereby, at leading order, two vector bosons are produced on shell and then decay. This implies that the tree-level scattering amplitude will be proportional to \(\alpha _{\mathrm{EW}}^4\). Higher-order QCD corrections will lead to additional hard radiation and thus the QCD-induced \(VV'jj\) final state becomes a background to the VBS processes. This final state scales as \(\alpha _\mathrm{EW}^4\alpha _s^2\), and therefore in general will dominate over the EW-induced diagrams except in regions of the phase space where the VBS topology is enhanced.

Figure 8 displays representative Feynman diagrams for opposite sign \(W^\pm W^{\mp }\) production, a typical example of a diboson process. One can observe how diboson production is sensitive to the TGCs at the Born level and that the QGCs do not enter the theoretical description of this process. The gluon-gluon-initiated contributions are usually quite suppressed in VBS-like analysis, since their topology does not have the characteristic forward tagging jets. In this work, we will focus on the diboson production data with leptonic final states, in correspondence with the VBS case.

Fig. 9
figure 9

The \(m_{e\mu }\) differential distributions in opposite-sign \(W^\pm W^{\mp }\) diboson production at \(\sqrt{s}=13\) TeV from ATLAS (left) and CMS (right panel). The legend indicates the values of the \(\chi ^2\) per data point associated to different theoretical predictions: \(q{\bar{q}}\)-initiated at LO and NLO, and the latter plus gg-initiated at LO

The standard experimental selection cuts for diboson processes are \(p_T\) cuts in the leading and subleading charged leptons, leptonic rapidities being restricted to the central region, and in the presence of W bosons, a cut on the missing transverse energy, \(E_{T}^\mathrm{miss} \gtrsim 30 \) GeV. Furthermore, additional cuts on the transverse masses of the reconstructed leptons around \(m_W\) and \(m_Z\) are required to minimise the contribution from Higgs s-channel production. The resulting fiducial cross-sections are relatively large, and already at \({\mathcal {L}}\simeq 36\) \(\hbox {fb}^{-1}\) they become limited by systematic uncertainties. These large cross-sections explain why unfolded differential cross-sections for different kinematic variables have been available for some time already.

Opposite-sign \({{W^\pm W^{\mp }}}\) production This channel has been measured by ATLAS based on the \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\) [55, 135] data in the \(e \mu \) final state. Several differential distributions are available with their corresponding bin-by-bin correlation matrices. From CMS, we include their recent measurement [56, 136] based on the same luminosity, where events containing two oppositely charged leptons (electrons or muons) are selected. In our EFT analysis, we will include the same differential distribution, \(m_{\mu e}\), from both ATLAS and CMS consisting of \(n_{\mathrm{dat}}=13\) data points in each case. While the ATLAS distribution is provided as an absolute distribution, the CMS is normalised to the fiducial cross-section. Since the EFT total cross-section is different to the SM one, we revert this normalisation to maximise our EFT sensitivity.

Figure 9 displays a comparison between our theory predictions and the experimental data. The measurement extends up to values of the dilepton invariant mass of \(m_{e\mu }\simeq 1.5\) TeV. Here one can observe that the inclusion of higher-order QCD and gluon-initiated contributions is essential to achieve a good agreement with experimental data, which turns out to be similarly good for the two data sets. Furthermore, the effect of NLO QCD corrections is seen to be smaller for the normalised distribution than the absolute one, indicating that the NLO K-factor depends only mildly on the value of the invariant mass \(m_{e\mu }\).

\({{W}}^\pm {{Z}}\) production In this channel, we consider the ATLAS [57, 137] and CMS [58] measurements at 13 TeV based on \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\). In particular we chose the \(e \mu \mu \) final state as a benchmark, although other combinations are available. The ATLAS and CMS \(p_T^Z\) distributions contain \(n_\mathrm{dat}=7\) and 11 data points and their kinematic reach is \(p_T^Z\sim 1\) TeV and 300 GeV, respectively. For the ATLAS measurement, the information on the bin-by-bin correlated systematic uncertainties is made available and therefore are included. Moreover, we note that an EFT interpretation in terms of a subset of dimension-six operators has been presented in the CMS analysis of Ref. [58].

We display in Fig. 10 the comparison to our theoretical predictions at LO and at NLO. The latter in particular provides an excellent description to the experimental data. Here the effects of the NLO QCD corrections are reduced in the normalised distributions as was the case in \(W^\pm W^{\mp }\) production. Finally, as we will show in Sect. 4, this channel provides the strongest bounds on the TGC/QGC operator \({\mathcal {O}}_W\).

Fig. 10
figure 10

The Z boson transverse momentum distribution, \(p_T^Z\), as measured in \(W^\pm Z\) production from ATLAS [57] and CMS [58] at 13 TeV based on \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\). Note that while ATLAS provides an absolute distribution, the CMS one is instead normalised

ZZ production For this channel, we use the recent CMS measurements based on \({\mathcal {L}}=137\) \(\hbox {fb}^{-1}\) corresponding to the four-lepton final state [59], which supersedes a previous publication based on 36 \(\hbox {fb}^{-1}\) [138, 139]. For the theoretical predictions, the \(qq \rightarrow ZZ\) and \(gg \rightarrow ZZ\) contributions are simulated with POWHEG-boxat NLO and with MG5_aMC@NLOat LO, respectively. Figure 11 displays the normalized \(d \sigma / d m_{ZZ}\) distribution in the fiducial phase space from this CMS \(ZZ\rightarrow 4\ell \) measurement, which contains \(n_{\mathrm{dat}}=8\) data points. We find that the agreement with the normalised distribution at LO is good, and that the contribution from the gluon-dominated diagrams is quite small. The most updated ATLAS analysis related to the ZZ final state is the measurement of the four-lepton invariant mass spectrum at 13 TeV based on \({\mathcal {L}}=36\) \(\hbox {fb}^{-1}\) [140], which receives contributions also from single-Z and from Higgs production (via \(h\rightarrow ZZ^*\) decays) and therefore is not considered further here.

Fig. 11
figure 11

Normalized \(d \sigma / d m_{ZZ}\) distribution in the fiducial phase space from the CMS measurement based on \({\mathcal {L}}=137\) \(\hbox {fb}^{-1}\)

Overview of diboson measurements The diboson measurements that will be considered in this analysis are summarised in Table 6. In total we have \(n_\mathrm{dat}=52\) diboson cross-sections from the \(W^\pm W^{\mp }\), \(W^\pm Z\), and ZZ channels, three times more data points than the corresponding VBS unfolded cross-sections. In Sect. 4 we will compare the impact in the EFT parameter space between these two families of measurements.

Table 6 Overview of the diboson measurements considered in this work

3.3 Sensitivity on the dimension-six EFT operators

Quantifying the sensitivity of each VBS and diboson data set to the various dimension-six EFT operators is an important step towards understanding the fit results. It is also relevant to understand if there are flat directions in our fit basis, and identify which data sets will provide the dominant constraints in the parameter space. In the following, we summarise the dependence of each process to the EFT operators considered and determine their relative sensitivity by means of the Fisher information. We also apply a principal component analysis (PCA) to identify the hierarchy of directions in the parameter space and assess the possible presence of flat directions.

General discussion In Table 7 we list the contributions of the dimension-six EFT operators that constitute our fitting basis to the various VBS and diboson processes. Overall the complementarity between the diboson and VBS can be seen, with VBS providing direct access to the \({\mathcal {O}}_{\varphi B}\) and \({\mathcal {O}}_{\varphi W}\) operators (and their CP-odd counterparts) which are essentially unconstrained from diboson-only data.

Table 7 We indicate which of the \(n_{\mathrm{op}}=16\) dimension-six EFT operators considered in this analysis contributes to which of the experimental datasets. Furthermore, with \((\checkmark )\) we denote those datasets where the corresponding operator sensitivity arises via the gluon fusion contributions, \(gg \rightarrow h \rightarrow VV\), which is known to be a very small effect

The operators \({\mathcal {O}}_W\) and \({\mathcal {O}}_{{\widetilde{W}}}\) modify both the TGCs and the QGCs, and thus are not relevant for the description of the diboson production in the ZZ channel. The operators \({\mathcal {O}}_{\varphi D}\) and \({\mathcal {O}}_{\varphi WB}\) contribute to all the diboson and VBS channels, since they lead to modifications of the SM parameters as discussed in Sect. 2. Given that \({\mathcal {O}}_{\varphi B}\) modifies only couplings involving the Higgs boson and/or Z and \(\gamma \), it will be unconstrained from the WW and WZ diboson channels as well as from the WWjj and WZjj processes. The operator \({\mathcal {O}}_{\varphi W}\) induces additional modifications compared to \({\mathcal {O}}_{\varphi B}\), contributing to diboson processes by means of the hZZ and hWW vertices.

The two-fermion interaction vertices \(\gamma {\bar{\psi }} \psi \) and \(Z{\bar{\psi }} \psi \) are modified by some of the two-fermion operators, specifically by \({\mathcal {O}}_{\varphi l}^{(1)}\), \({\mathcal {O}}_{\varphi e}\), \({\mathcal {O}}_{\varphi l}^{(3)}\), \({\mathcal {O}}_{\varphi q}^{(3)}\), \({\mathcal {O}}_{\varphi q}^{(1)}\), \({\mathcal {O}}_{\varphi d}\), and \({\mathcal {O}}_{\varphi u}\), while the \(W{\bar{\psi }} \psi \) vertex will be affected by \({\mathcal {O}}_{\varphi l}^{(3)}\) and \({\mathcal {O}}_{\varphi q}^{(3)}\). Furthermore, the \(pp \rightarrow VV \rightarrow 4\ell \) and \(pp \rightarrow VVjj \rightarrow 4\ell jj\) processes provide sensitivity to two-fermion operators of the form \(\varphi D \psi ^2\) in all channels except for the ones with two WW bosons. Moreover, since the experimental phase space selection in the diboson production is designed to be orthogonal to the Higgs production, we expect that the WW and ZZ channel will be less sensitive to these operators compared to VBS. This justifies why the contributions from \({\mathcal {O}}_{\varphi B}\) and \({\mathcal {O}}_{\varphi W}\) (and their corresponding CP-odd counterparts) are negligible in this channel.

The Fisher information matrix While certainly informative, Table 7 does not allow one to compare the sensitivity brought in by different data sets on a given EFT degree of freedom. In particular, we would like to quantify the relative impact that the diboson and VBS observables have for each coefficient. To achieve this, it is convenient to resort to the Fisher information matrix [63, 141] which, when restricted to linear contributions only, is given by

$$\begin{aligned} I_{ij} = \sum _{m=1}^{n_{\mathrm{dat}}} \frac{\sigma ^\mathrm{(eft)}_{m,i}\sigma ^\mathrm{(eft)}_{m,j}}{\delta _{\mathrm{exp},m}^2} ,\quad i,j=1,\ldots ,n_{\mathrm{op}} , \end{aligned}$$
(3.1)

where the EFT coefficients are defined in Eq. (2.10) and where \(\delta _{\mathrm{exp},m}\) stands for the total experimental error associated to the mth data point. In Eq. (3.1), the sum extends over all the data points that belong to a given data set or family of processes. While the absolute values of the entries of the Fisher matrix \(I_{ij}\) are not physically meaningful (since the overall normalisation of the EFT operators is arbitrary), the ratios of the diagonal entries \(I_{ii}\) for the i-th degree of freedom between two different groups of process is well-defined, since there the operator normalizations cancel out.

The diagonal entries of the Fisher information matrix evaluated for each of the degrees of freedom that form our basis are displayed in Fig. 12. Its entries have been normalised such that the sum over the elements of a given row adds up to 100. We show results both for the individual groups of processes as well as the comparison between the overall impact of the VBS and the diboson datasets. For those entries greater than 10%, we also indicate its numerical value in the heat map.

Fig. 12
figure 12

The diagonal entries of the Fisher information matrix, \(I_{ii}\), evaluated for each of the coefficients that form our fitting basis. We display results separately for each channel (left) and when clustering all VBS and diboson datasets together (right panel). For those entries greater than 10%, we also indicate the numerical value in the heat map

Fig. 13
figure 13

Theoretical predictions for the VBS signal (EW-induced component only) for different final states at \(\sqrt{s}=13\) TeV. We show the dilepton \(p_{T_{\gamma \ell \ell }}\) distributions for the \(\gamma Zjj\) (left) and \(m_{\ell \ell }\) for \(W^\pm W^\pm jj\) (right) final states based on the selection cuts of the corresponding ATLAS measurements. In each case, we compare the SM predictions with three EFT benchmark points in terms of the dimensionless quantities \({\bar{c}} = c v^2 / \varLambda ^2 \). Either \({\bar{c}}_{W}\), \({\bar{c}}_{\varphi W}\), or \({\bar{c}}_{\varphi B}\) are set to 0.5 and the other coefficients to zero. In the upper panels, only the EFT prediction with \({\bar{c}}_{W}=0.5\) are shown to improve readability

Fig. 14
figure 14

The \(m_{ZZ}\) and \(m_T^{WZ}\) distributions in the ZZjj and \(W^\pm Zjj\) final states, based on the same selection cuts as the corresponding CMS measurements

One can observe from Fig. 12 that the VBS data provide the dominant sensitivity for several of the operators considered in this analysis, in particular for three of the CP-odd ones. In general, we find that VBS process can provide complementary information on the EFT parameter space compared to the diboson data. Specifically, one finds that VBS measurements provide the dominant sensitivity (more than 50% of the Fisher information) for \(c_{\varphi B}\) and \(c_{\varphi W}\) (and their CP-odd versions) as well as for \(c_{\varphi {\widetilde{W}}B}\). Moreover, they provide a competitive sensitivity (defined as more than 20%) for \(c_{\varphi l}^{(3)}\), \(c_{\varphi d}\), \(c_{\varphi D}\) and for the triple gauge operator \(c_{W}\). The latter result illustrates how VBS measurements, while still providing less information that diboson measurements to constrain modifications of the TGCs, do indeed provide useful information. In the case of the triple gauge operator \(c_{W}\), we also note that the WZ diboson final state dominates the sensitivity, with the contribution from the WW one being negligible. In terms of identifying which VBS final states lead to higher relative sensitivities, we observe that ZZjj provides most of the information for \(c_{\varphi B}\) and \(c_{\varphi {\widetilde{B}}}\), \(W^\pm W^{\mp } jj\) dominates for \(c_{\varphi W}\), and WZjj leads in constraining the CP-odd operators \(c_{\varphi {\widetilde{W}}}\) and \(c_{\varphi {\widetilde{W}}B}\).

EFT benchmark points Another strategy to quantify the sensitivity to the different Wilson coefficients is to compare the size of the SM and EFT cross-sections for representative benchmark points in the parameter space. Here we present only representative results for these comparisons, since compatible information is found for the complete set of final states and EFT operators. In Fig. 13 we display the theoretical predictions for the VBS signal (EW-induced component only) at \(\sqrt{s}=13\) TeV. We show the differential distributions for the \(\gamma Zjj\) and \(W^\pm W^\pm jj\) final states based on the selection cuts of the corresponding ATLAS reference measurements. In each case, we compare the SM predictions with three EFT benchmark points, in which either of the dimensionless quantities \(c_{W} v^2 / \varLambda ^2 \), \(c_{\varphi W} v^2 / \varLambda ^2\), or \(c_{\varphi B} v^2 / \varLambda ^2\) are set to 0.5, and the rest are set to zero. In the upper panels, only the EFT prediction for \(c_{W} v^2 / \varLambda ^2 \) is shown, to improve readability. We also display in Fig. 14 the corresponding comparisons for the \(m_{ZZ}\) and \(m_T^{WZ}\) distributions in the ZZjj and \(W^\pm Zjj\) final states based on the same selection cuts as in the associated CMS measurements.

From the comparisons in Figs. 13 and 14, one can observe a distinct variation in the EFT sensitivity across the specific final state and differential distribution being considered. In the case of the \(\gamma Zjj\) and \(W^\pm W^\pm jj\) final states, there is good sensitivity to \(c_{W}\) but rather less for \(c_{\varphi W}\) and \(c_{\varphi B}\) assuming the same value for each coefficient. Interestingly, the sensitivity to \(c_{W}\) can arise both from the low energy region as well as from the high energy tail of the distributions. The situation concerning \(c_{W}\) is similar for the ZZjj and \(W^\pm Z jj\) final states, with the difference being that now one becomes also sensitive to \(c_{\varphi B}\), which suppresses the cross-section compared to the SM expectation in a manner more or less independent from the kinematics. In the case of the \(c_{\varphi W}\) coefficient, the only distribution with comparable sensitivity to the other benchmark points is \(m_T^{WZ}\) in the \(W^\pm Z jj\) final state.

Principal component analysis (PCA) Lastly, we use PCA in this section to identify the combinations of Wilson coefficients which exhibit the largest and the smallest variabilities and determine the possible presence of flat directions. While PCA is primarily used as a dimensionality reduction tool by removing principal components with the lowest variance, here we use its core steps based on singular value decomposition (SVD) only for diagnosis purposes, and the EFT fitting basis remains the same as that defined in Sect. 2. More specifically, we utilize PCA to identify the possible presence of flat directions, assess whether there is a large gap in the variability between the principal components, and to determine the matching between the physical fitting basis and the principal components.

The starting point of the principal component analysis is the matrix K of dimensions \(n_{\mathrm{dat}}\times ~n_{\mathrm{op}}\) and (dimensionless) components \(K_{mi}=\sigma ^{(\mathrm eft)}_{m,i}/\delta _{\mathrm{exp},m}\), where \(\delta _{\mathrm{exp},m}\) is the same total experimental error that appears in the evaluation of the Fisher information matrix. Using singular value decomposition (SVD) we can write \(K = U W V^\dagger \), where U (V) is a \(n_\mathrm{dat} \times n_{\mathrm{dat}}\) (\(n_{\mathrm{op}} \times n_{\mathrm{op}}\)) unitary matrix and W is an \(n_{\mathrm{dat}}\times n_{\mathrm{op}}\) diagonal matrix with semi-positive real entries, called the singular values, which are ordered by decreasing magnitude. The larger a singular value, the higher the variability of the associated principal component. The elements V contain the (normalised) principal components associated to each of the singular values, which can be expressed as a superposition of the original coefficients,

$$\begin{aligned} \mathrm{PC}_k = \sum _{i=1}^{n_{\mathrm{op}}} a_{ki}c_i \, , \quad k=1,\ldots ,n_{\mathrm{op}} ,\qquad \left( ~ \sum _{i=1}^{n_{\mathrm{op}}} a_{ki}^2=1\,~\forall k \right) \nonumber \\ \end{aligned}$$
(3.2)

where the larger the value of the coefficient \(a_{kl}\), the larger the relative weight of the associated Wilson in this specific principal component.

Fig. 15
figure 15

Results of the principal component analysis applied to the baseline VBS+diboson dataset. The upper panel shows the distribution of singular values, while the lower one displays the squared values for the coefficients \(a_{ki}\) of the principal components in Eq. (3.2)

The upper panel of Fig. 15 displays the distribution of singular values for the \(n_{\mathrm{op}}=16\) principal components associated to the fitting basis described in Sect. 2 with the baseline VBS+diboson dataset. This analysis confirms that there are no flat directions in our parameter space, which would appear as a principal component with a vanishing singular value. Furthermore, we do not observe large hierarchies in the distribution of singular values, indicating that the physical dimensionality of our problem coincides with that of the adopted fitting basis.

The lower panel of Fig. 15 displays a heat map indicating the values of the (squared) coefficients \(a_{ki}^2\) that relate the original fitting basis to the principal components via the rotation in Eq. (3.2), and whose associated eigenvalues are displayed in the upper panel. For entries with \(a^2_{ki}\ge 0.1\), we also indicate the numerical value in the corresponding entry. The principal component associated with the highest singular value can be attributed to the two-fermion coefficient \(c_{\varphi q}^{(3)}\), which therefore is expected to be well constrained from the fit (as anticipated in Sect. 2). Other principal components which coincide with the coefficients of our fitting basis are \(c_{W}\) and \(c_{\varphi {\widetilde{B}}}\). In general, the majority of principal components involve a superposition of several basis coefficients \(c_i\), for example in \(\hbox {PC}_k\) with \(k=2,7,8\) or 10, none of the squared coefficients \(a_{ki}^2\) is larger than 0.3.

EFT effects in the VBS backgrounds As discussed in Sects. 2 and 3, the main background process associated to the VBS measurements is QCD-induced diboson production with extra jets. In principle, a consistent interpretation of the available VBS data would require evaluating the EFT corrections to our theory predictions for the QCD-induced diboson backgrounds with VBS cuts, while here instead we only evaluate these cross-sections using only SM theory.

However, taking into account these effects presents important challenges of both experimental and computational origin. In the former case, unfolded cross-sections and differential distributions are often provided directly at the level of the signal (VBS) process, with the background contributions already subtracted. In the latter case, their theoretical modelling requires generating a very large matched sample of QCD-induced diboson production in association with up to two hard jets with the EFT operators activated, to ensure that sufficient statistics remain once the stringent VBS cuts are applied.

Nevertheless, accounting for EFT effects in the VBS background processes is unlikely to significantly modify the interpretation of the VBS data presented in this work. The reason is that our baseline results in Sect. 4 are based on the simultaneous interpretation of (inclusive, QCD-induced) diboson production and of VBS measurements. Hence, EFT effects that modify the backgrounds of the latter (the QCD-induced diboson process with VBS cuts) are already constrained, at least partly, by the inclusive diboson data used in the fit. Indeed, higher-order QCD corrections to diboson production do not introduce qualitatively new EFT sensitivity once VBS cuts are applied to them. Furthermore, we will show how, for the considered EFT operators, the diboson measurements exhibit superior constrained power as compared to the VBS cross-sections. All in all, within a joint EFT fit of diboson and VBS cross-sections, neglecting EFT effects in the backgrounds of the latter appears to be a well-motivated approximation.

4 Results and discussion

In this section, we present the main results of this work, namely the dimension-six EFT interpretation of the VBS and diboson datasets from the LHC Run II. We first briefly summarise the fitting strategy adopted in this analysis and then present the fit quality by comparing the best-fit results with the corresponding experimental measurements. We then present the fit results for the baseline dataset, determine the 95% CL intervals for the \(n_{\mathrm{op}}=16\) operators considered, and study the dependence of our results with respect to variations of the input data, in particular with fits based only on VBS measurements.

4.1 Fitting strategy

The EFT analyses carried out in this work are based on the SMEFiT global fitting framework presented in [53, 54]. Two options to constrain the EFT parameter are available in this framework: the Monte Carlo replica fit method (MCfit) and Nested Sampling (NS) via MultiNest [142]. In this work we adopt the latter technique. The end result of SMEFiT is a representation of the probability density in the space of Wilson coefficients spanned by \(N_{\mathrm{spl}}\) samples, \(\{ c_i^{(k)}\}\), which allows the evaluation of statistical estimators such as mean values and standard deviations, e.g.,

$$\begin{aligned} \left\langle c_i\right\rangle= & {} \frac{1}{N_\mathrm{rep}}\sum _{k=1}^{N_{\mathrm{rep}}} c_i^{(k)} , \quad i=1,\ldots ,n_{\mathrm{op}} , \end{aligned}$$
(4.1)
$$\begin{aligned} \delta c_i= & {} \left( \frac{1}{N_{\mathrm{rep}}-1}\sum _{k=1}^{N_{\mathrm{rep}}} \left( c_i^{(k)} -\left\langle c_i\right\rangle \right) ^2\right) ^{1/2} , \quad i=1,\ldots ,n_{\mathrm{op}} ,\nonumber \\ \end{aligned}$$
(4.2)

and likewise for other estimators such as the correlation coefficients. Since the present analysis is carried out at the linear level in the EFT expansion, and there are no flat directions for the baseline dataset (see Sect. 3.3), the probability distributions associated to the coefficients \({\varvec{c}}\) are expected to be Gaussian. For this reason, it is not necessary to go beyond the first two moments of the posterior distributions in \({\varvec{c}}\).

The overall fit quality is assessed by means of the \(\chi ^2\) figure of merit, defined as

$$\begin{aligned} \chi ^2\left( {{\varvec{c}}} \right)\equiv & {} \frac{1}{n_\mathrm{dat}}\sum _{i,j=1}^{n_{\mathrm{dat}}}\left( \sigma ^{(\mathrm th)}_i\left( {{\varvec{c}}} \right) -\sigma ^{(\mathrm exp)}_i\right) (\mathrm{cov}^{-1})_{ij} \nonumber \\&\times \left( \sigma ^{(\mathrm th)}_j\left( {\varvec{c}}\right) -\sigma ^{(\mathrm exp)}_j\right) \; , \end{aligned}$$
(4.3)

where \(\sigma _i^\mathrm{(exp)}\) corresponds to the central experimental data point and \(\sigma ^{(\mathrm th)}_i\left( {{\varvec{c}}}\right) \) is the associated theoretical prediction, Eq. (2.10), for the ith cross-section. The covariance matrix, \(\mathrm{cov}\), is constructed from all available sources of uncorrelated and correlated experimental uncertainties, with the ‘\(t_0\)’ definition [143] used for the fit and the standard experimental covariance used to quote the resulting \(\chi ^2\) values. Whenever appropriate, we also add to the covariance matrix estimates of theoretical uncertainties coming from the input proton PDFs, as well as the MC theory calculations. The post-fit \(\chi ^2\) values are then evaluated using the best-fit estimate (mean) of the Wilson coefficients, Eq. (4.1), computed from the resulting MC samples obtained by NS.

4.2 Fit quality and comparison with data

In Table 8 we display the values of the \(\chi ^2/n_{\mathrm{dat}}\), Eq. (4.3), for each of the data sets contained in our baseline fit, as well as the total values associated to the diboson and VBS categories. We also indicate the \(\chi ^2\) values corresponding to the Standard Model predictions (pre-fit) together with the values obtained once the EFT corrections are accounted for (post-fit). Note that our baseline dataset does not contain any detector-level folded distributions. The graphical representation of these \(\chi ^2\) values is also displayed in Fig. 16.

Table 8 The values of the \(\chi ^2 /n_{\mathrm{dat}}\) for each dataset considered in the fit, as well as the totals in each category. We indicate the SM (pre-fit) results as well as the best-fit results once EFT effects are accounted for, and separate the diboson (upper) from the VBS (bottom) datasets
Fig. 16
figure 16

Graphical representation of the \(\chi ^2\) values reported in Table 8

From Table 8 one can observe that for the diboson data, a \(\chi ^2\) of around one per data point is obtained. Moreover, the total \(\chi ^2/n_{\mathrm{dat}}=1.17\) found at the level of SM calculations is reduced to 0.97 once EFT effects are included in the fit. Concerning the VBS dataset, there is a higher spread in the \(\chi ^2/n_{\mathrm{dat}}\) values, which is explained by the fact that each data set is composed of either a single or a few cross-section measurements. Taking into account the 18 independent cross-section measurements that we includein the fit, the SM value of \(\chi ^2/n_{\mathrm{dat}}=0.83\) is reduced to 0.75 at the post-fit level. Overall, the combination of the diboson and VBS measurements adds up to \(n_{\mathrm{dat}}=70\) data points for which a pre-fit value of \(\chi ^2/n_{\mathrm{dat}}=1.08\) based on the SM predictions is reduced to 0.92 after the EFT fit.

Figure 17 displays a comparison between experimental data and best-fit EFT theory predictions for the LHC diboson distributions considered in the present analysis. We show the results for the \(W^\pm Z\), \(W^\pm W^{\mp }\) and ZZ final states from CMS in the upper panels and the corresponding \(W^\pm Z\) and \(W^\pm W^{\mp }\) distributions from ATLAS in the lower panels. Both the data and the EFT fit results are normalised to the central value of the SM prediction. The experimental data is presented as both unshifted in central values (where the error band represents the total error) and with the best-fit systematic shifts having been subtracted (so that the error band contains only the statistical component). The band in the EFT prediction indicates the post-fit 95% CL uncertainty. For the datasets in which the information on correlated systematics is not available, only the unshifted data is shown. In Fig. 18, we show a similar comparison as that of Fig. 17 but now for the VBS measurements. In all cases a fair agreement is observed between experimental data and SM and EFT theory predictions, consistent with the \(\chi ^2\) values reported in Table 8.

Fig. 17
figure 17

Comparison between experimental data and best-fit EFT theory predictions for the LHC diboson distributions considered in the present analysis. Both the data and the EFT fit results are normalised to the central value of the SM prediction. The band in the EFT prediction indicates the post-fit 95% CL uncertainty

Fig. 18
figure 18

Comparison for the VBS measurements, both for the unfolded differential distributions (top panel), and for the EW-only fiducial cross-sections (bottom panel)

4.3 Constraints on the EFT parameter space

We now present the constraints on the coefficients of the dimension-six EFT operators used to interpret the VBS and diboson cross-sections listed in Table 8. In Fig. 19, we display the posterior probability distributions associated to each of the 16 coefficients that are constrained in this analysis for the baseline dataset. In all cases, we can see that these are approximately Gaussian, as expected for a linear EFT fit without flat directions. The latter result is consistent with observations derived from the PCA in Fig. 15, and confirm that the input dataset is sufficient to constrain all 16 independent directions in the EFT parameter space.

Fig. 19
figure 19

The posterior probability distributions associated to each of the \(n_{\mathrm{op}}=16\) coefficients that are constrained in this analysis for the baseline dataset. Note that the x-axis ranges are different for each coefficient

Table 9 The 95% confidence level intervals associated to the 16 dimension-six EFT degrees of freedom considered in the present analysis. We compare the results of fits based on the baseline VBS + diboson dataset both at the global (marginalised) and the individual levels,  as well as with those of a fit based only on the diboson cross-sections. Results shown here correspond to the dimensionless quantities \({\bar{c}} = c \cdot v^2/ \varLambda ^2 \) and can be rescaled for any value of \(\varLambda \), or presented as dimensioful quantities by extracting the powers of the vev

From these posterior probability distributions, the 95% confidence level intervals associated to each of the fit coefficients can be evaluated. Table 9 displays these 95% CL intervals associated to all 16 degrees of freedom. Moreover, a comparison is made between the results of the baseline VBS+diboson fit performed at the global (marginalised) and individual levels, as well as with a fit based only on the diboson cross-sections. In the fourth column (individual fits), only one coefficient is varied at a time while all others are set to their SM values. The results of Table 9 are also graphically represented in Fig. 20, which displays the absolute value (upper) and the magnitude (bottom panel) of these 95 \(\%\) CL intervals.

Fig. 20
figure 20

Graphical representation of the results of Table 9, displaying the absolute value (upper) and the magnitude (bottom panel) of the 95% CL intervals divided by two (that is, the 2\(\sigma \) error) associated to each of the 16 EFT operators considered here. We compare the marginalised results of a diboson-only fit (blue) with the same fit once VBS data is added (orange) in both cases when all coefficients are fitted simultaneously. For reference, we also show the results of the individual VBS+diboson fits, where only one operators is varied at the time and the rest are fixed to their SM value

From the comparison between the 95% CL intervals in Table 9 and Fig. 20, several interesting observations can be made. First, in comparing the results of the combined VBS+diboson fit with the diboson-only analysis, the VBS measurements are seen to improve the bounds provided by the diboson data in a pattern consistent with the Fisher information matrix displayed in Fig. 12. For instance, the bounds on \({\bar{c}}_{\varphi W}\) improve from \(\left[ -0.97,+2.1\right] \) to \(\left[ -0.55,+1.4\right] \), while those on the CP-even (odd) triple gauge operator \({\bar{c}}_{W}\) (\({\bar{c}}_{{\widetilde{W}}}\)) are reduced from \(\left[ -0.20,+0.11\right] \) (\(\left[ -0.63,+0.85\right] \)) down to \(\left[ -0.13,+0.14\right] \) (\(\left[ -0.35,+0.57\right] \)). In all cases, the VBS data improve the bounds on the EFT coefficients obtained from the diboson-only fit, highlighting the consistency and complementarity between the two families of processes. This result applies both to the CP-even as well as the CP-odd operators.

Another relevant observation from Table 9 concerns the differences between the marginalised and individual fits in the case of the combined VBS+diboson analysis, which illustrates the role of the correlations between the operators that modify these two processes. In the individual fits, one finds more stringent bounds by artificially setting all other EFT operators to zero, and this distorts the physical interpretation of the results. For several operators, the individual bounds underestimate the results of the 16-dimensional fit by an order of magnitude or more. This highlights the importance of accounting for all relevant EFT operators that contribute to a given process rather than just selecting a subset of them, as has often been the case in the interpretation of VBS measurements.

Figure 21 then displays the values of the correlation coefficient between the operators considered in the fit to the baseline dataset. For some pair-wise combination of operators we observe strong (anti-)correlations between the fit coefficients, for example \(c_{\varphi B}\) and \(c_{\varphi {\widetilde{B}}}\) are strongly anticorrelated, and the same holds for \(c_{\varphi D}\) and \(c_{\varphi W B}\). However, in most cases, these correlations turn out to be quite small, confirming that our choice of fitting basis is suitable to describe efficiently the available dataset in consistency with the PCA results.

Fig. 21
figure 21

The values of the correlation coefficients between the operators considered in the fit to the baseline dataset. These are categorised as positively correlated (\(\rho \ge 0.50\ (0.75)\), in red (orange)), negatively correlated (\(\rho \le 0.50\ (0.75)\), in green (blue)), and uncorrelated (\(|\rho | \le 0.5\), in grey)

Finally, in Fig. 22 we display the 95% CL lower bounds on the value of \(\varLambda /(v \, \sqrt{c_i})\). These bounds can be interpreted as the lower bounds derived from the EFT fit on the scale of new physics \(\varLambda \) in UV-completions where the corresponding Wilson coefficients are \(c_i={\mathcal {O}}\left( 1\right) \). They are again presented as dimensionless quantities, measured in units of vev. This interpretation can be adjusted to other BSM scenarios, for example in the case of strongly coupled theories where one expects \(c_i={\mathcal {O}}\left( 4\pi \right) \). For several operators, the combined VBS+diboson analysis results in values above 1 TeV for the new physics scale \(\varLambda \), for example the triple gauge operator \(c_W\) which has \((\varLambda /\sqrt{c_i})\gtrsim 3 v\) at the 95% CL.

Fig. 22
figure 22

Same as the lower panel of Fig. 20 now representing the 95% CL bounds on \(\varLambda /(v \sqrt{c_i})\)

Comparison with other EFT analyses Figure 23 displays a comparison between the individual bounds obtained in this work, based on the VBS+diboson dataset and shown in Fig. 20, with the corresponding individual bounds obtained in the BDHLL20 [48] and EMMSY20 [68] EFT analyses. The BDHLL20 fit includes data on diboson cross-sections from the LHC together with information from the associated production of a Higgs with a vector boson, hW and hZ. EMMSY20 is instead a global EFT interpretation that includes Higgs and top production data together with the EWPOs from LEP and some diboson cross-sections. For the three sets of results shown in Fig. 23, only the linear terms in the EFT expansion are being included and the EFT cross-sections are evaluated at leading order.Footnote 4 Given that these three analyses are based on different subsets of dimension-six operators, a comparison at the level of individual constraints is the most direct way of interpreting similarities or differences. We also note that CP-odd operators are only considered in this analysis.

For the majority of operators, the global study of EMMSY20 exhibits the superior sensitivity. Our good determination of \(c_{W}\) can be traced back to the inclusion of the WZ differential distributions from ATLAS and CMS, which are also included in BDHLL20, but absent in EMMSY20, where Zjj is included instead. This fact hints that a combined analysis of WZ and Zjj might shed more light on the purely gauge operator. The results of the global EFT fit lead to more stringent bounds as compared to those from this work and from BDHLL20, especially for the purely bosonic operators \(c_{\varphi B}\), \(c_{\varphi W}\) and \(c_{\varphi B W}\), which are significantly constrained both by the EWPOs from LEP as well as Higgs measurements. For most coefficients, our individual results and those of BDHLL20 are in good agreement, in particular for bosonic operators \(c_{\varphi D}\), \(c_{\varphi B W}\), and \(c_{\varphi W}\), \(c_{W}\). This is what we would expect, given the datasets chosen.

The comparison of the three works shows that Higgs. LEP and EWPD measurements represent the leading contributions to the parametrisation of BSM effects. There are also enough hints that a global interpretation of the LHC data, independent of older measurements is also a feasible way to go further on the road to the most accurate EFT interpretation.

Fig. 23
figure 23

Comparison of the individual bounds obtained in this work from the VBS+diboson dataset (shown in Fig. 20) with the corresponding individual bounds obtained in the BDHLL20 [48] and EMMSY20 [68] and EFT analyses, see text. In the three cases, only the linear terms in the EFT expansion are being included and the EFT cross-sections are evaluated at leading order

4.4 Dataset dependence

Until now, we have focused only on the analysis of the EFT fit results for the baseline dataset listed in Table 8. In the following, we assess the dependence of these results with respect to variations in the input data and theory settings by performing VBS-only fits and studying the impact of the VBS detector-level distributions when added to the VBS-only and to the baseline VBS+diboson fits. We also present fits where the CP-odd operators are set to zero and only the CP-even ones remain.

VBS-only fits First of all, we have verified through a dedicated PCA that flat directions in the EFT parameter space are absent also in the case of a VBS-only fit . However, the same analysis also reveals that some combinations of coefficients will be poorly constrained. The latter result is not unexpected, given that for a VBS-only dataset we have \(n_{\mathrm{op}}=16\) parameters to fit with only \(n_{\mathrm{dat}}=18\) data points. We display in Fig. 24 the same 95% CL intervals as in the lower panel of Fig. 20, but now comparing the results of our baseline fit with those obtained from the marginalised and individual VBS-only fits. By comparing the VBS+diboson with the VBS-only fits, we see that the obtained bounds in the latter case are much looser by a factor between 10 and 100 for most operators. These findings are consistent with our previous observations that current VBS data provides only a moderate pull when added together with the diboson cross-sections.

Fig. 24
figure 24

Comparison of the 95% CL intervals in the baseline fit with those resulting from marginalised and individual VBS-only fits. Only the unfolded VBS cross-section measurements listed in Table 5 are being included in the fits

Fig. 25
figure 25

Posterior distributions associated to the VBS-only fits that include only unfolded cross-sections (blue) and also the detector-level distributions (orange)

However, we would like to emphasize that this result does not imply that VBS-only fits cannot provide competitive sensitivity in a EFT analysis, but rather that the available VBS measurements are still scarce and limited by statistics. In fact, if one compares the results of the marginalised with the individual VBS-only fits, one can see that the individual bounds are notably reduced and become similar, or even better, than in the baseline VBS+diboson analysis. This implies that VBS processes are endowed with a unique potential to constrain the dimension-six operators of the SMEFT, but only once sufficient data has been collected to pin down the effects of the individual operators separately. We will verify this expectation in Sect. 5 through EFT fits based on dedicated HL-LHC projections.

The impact of the VBS detector-level measurements As was discussed in Sect. 3, one can in principle use detector-level measurements in the EFT fit in addition to the unfolded VBS cross-sections and distributions measured by ATLAS and CMS. Here we consider the \(m_{ZZ}\) and \(p_T^{\ell \ell \gamma }\) distributions from CMS and ATLAS in the ZZjj and \(\gamma Z jj\) final states respectively, which consist of 15 data points that can be included together with the unfolded VBS cross-section measurements. Given that our modelling of the detector response is basically reduced to a flat acceptance correction, we have chosen to remove these data points from the baseline results presented in the previous section. We would therefore like to illustrate how these detector-level distributions contain valuable information and are particularly instrumental to realise a reliable VBS-only EFT dimension-six analysis.

Figure 25 displays the same posterior probability distributions as in Fig. 19 but now corresponding to the VBS-only fits. We compare the results of the analysis based only on unfolded cross-sections with that in which the two detector-level distributions mentioned above are also included. While the VBS-only fit based on unfolded cross-sections does not exhibit genuine flat directions, several coefficients end up poorly constrained. The situation is different once the detector-level distributions are added to the fit: here the posterior distributions become Gaussian-like, and their width is markedly reduced compared to the previous case. In particular, the inclusion of the \(m_{ZZ}\) and \(p_T^{\ell \ell \gamma }\) detector-level distributions is particularly helpful in strengthening the VBS-only bounds on \(c_{\varphi B}\) and its CP-odd counterpart.

The 95% CL intervals associated to the posterior probability distributions of Fig. 25 are then represented in Fig. 26, where for reference we also display the results of the baseline VBS+diboson fit. We find that by adding the detector-level distributions, there is a noticeable improvement in the result of the VBS-only fit, with bounds being reduced by a factor between two and ten depending on the specific operator. In the case of \(c_{\varphi B}\), the resulting bound becomes comparable to that obtained in the VBS+diboson fit, though in general the VBS-only fit cannot compete with the combined VBS+diboson results even after the addition of the folded data. These results motivate the release of all available VBS measurements in terms of unfolded distributions. We have verified that in the case of the combined VBS+diboson fit, adding the detector-level measurements leaves the results essentially unaffected, providing a further justification of our choice of removing them from the baseline dataset.

Fig. 26
figure 26

Comparison of the results of the VBS-only fit based only on unfolded cross-sections with those of the same fit where in addition one includes the detector-level distributions. For reference, we also display the results for baseline VBS+diboson dataset

The impact of CP-odd operators Finally, we assess how the EFT fit results are modified once only CP-conserving operators are considered. Figure 27 compares the results of the baseline VBS+diboson fit with those of the same fit where the CP-odd operators have been set to zero, such that only the CP-even ones remain. In general the differences are quite small, and as expected the fit without CP-violating operators leads to somewhat more stringent bounds. The only operator for which removing the CP-odd operators has a significant effect is \(c_{\varphi B}\), where a difference of an order of magnitude in the 95% CL bound is observed. The reason for this behaviour is that, as indicated in the correlation heat map of Fig. 21, \(c_{\varphi B}\) and \(c_{\varphi {\widetilde{B}}}\) are strongly anti-correlated and thus in general it is rather challenging to disentangle them.

Fig. 27
figure 27

Comparing the results of the baseline fit with those of the same fit where the CP-odd operators have been set to zero, such that only the CP-even ones remain

5 Vector boson scattering at the HL-LHC

While the results presented in the previous section indicate the potential of VBS measurements for dimension-6 EFT analyses, their impact is currently limited by statistics. The ultimate LHC sensitivity required to constrain the coefficients of these dimension-6 operators from VBS data will only be achieved by legacy measurements based on the full HL-LHC luminosity of \({\mathcal {L}}\simeq 3\) \(\hbox {ab}^{-1}\) per experiment. With this motivation, we generate HL-LHC pseudo-data for EW-induced vector boson scattering processes and quantify their impact on the EFT fit by comparing the results to those presented in Sect. 4.3. The strategy adopted here is the same as the one used for the HL-LHC PDF projections in Refs. [144, 145], which were subsequently used in the studies presented in the corresponding Yellow Reports [70, 71].

In order to generate the HL-LHC pseudo-data, we select reference measurements out of the VBS datasets presented in Sect. 3. Table 10 presents the overview of the HL-LHC projections considered in this analysis, which include only EW-induced VBS processes since we assume that the QCD-induced backgrounds can be removed at the analysis level.

We consider the following differential distributions for each final state: \(m_{\ell \ell }\) for \(W^{\pm }W^{\pm }jj\), \(p_{T}^{\ell \ell \ell }\) and \(m_{T}^{WZ}\) in \(ZW^{\pm }jj\), \(m_{ZZ}\) for ZZjj, and then \( p_{T}^{\gamma \ell \ell }\) and \(m_{\gamma Z}\) in the \(\gamma Zjj\) final state, yielding a total of \(n_{\mathrm{dat}}=61\) datapoints. The theoretical predictions for these observables are generated as in Sect. 3 with the same selection and acceptance cuts, except that they are rescaled to account for the increase in the center of mass energy from \(\sqrt{s}=13\) TeV to \(\sqrt{s}=14\) TeV. We note that the actual HL-LHC analysis are expected to contain a larger number of bins, as well as a higher reach in energy, however for simplicity we maintain here the current binning. The theoretical calculations are generated for the null hypothesis (\({\varvec{c}}= {\varvec{0}}\)), with the caveat that better sensitivities would be obtained in the case of an EFT signal.

The statistical and systematic uncertainties associated to the HL-LHC pseudo-data are evaluated as follows. First, we denote \(\sigma ^\mathrm{th}_{i}\) as the theoretical prediction for the EW-induced VBS cross-section in the ith bin of a given differential distribution. This cross-section includes all relevant selection and acceptance cuts, as well as the leptonic branching fractions. The expected number of events in this bin and the associated (relative) statistical uncertainty \(\delta _i^\mathrm{stat}\) are then given by,

$$\begin{aligned} N_{i}^\mathrm{th}= \sigma ^\mathrm{th}_{i}\times {\mathcal {L}} \,, \quad \delta _i^\mathrm{stat}\equiv \frac{\delta N_{i}^\mathrm{stat} }{N_{i}^\mathrm{th}} = \frac{1}{\sqrt{N_{i}^\mathrm{th}}}. \end{aligned}$$
(5.1)

Note that the relative statistical uncertainty for the number of events and for the cross-sections will be the same, either in the fiducial region or extrapolated to the full phase space. Here we take the luminosity to be \({\mathcal {L}}=3\) \(\hbox {ab}^{-1}\) and generate two differential distributions per final state, one from ATLAS and the other from CMS, as indicated in Table 10.

Table 10 Overview of the (EW-induced) VBS HL-LHC projections considered in this analysis

Concerning the systematic uncertainties, these are also taken from the reference measurements as follows. If \(\delta _{i,j}^\mathrm{sys}\) denotes the jth relative systematic uncertainty associated to the ith bin of the reference measurement, we assume that the same systematic error at the HL-LHC will be given by \(f_{\mathrm{red},j}\delta _{i,j}^\mathrm{sys}\), where \(f_{\mathrm{red},j}\simeq 1/2\) is the expected reduction in systematic errors, in agreement with available projections [72,73,74,75]. Adding in quadrature all systematic uncertainties with the statistical error, the total relative uncertainty for the ith bin of our HL-LHC projections will be given by

$$\begin{aligned} \delta _{\mathrm{tot},i}^\mathrm{exp} = \left( \left( \delta _i^\mathrm{stat}\right) ^2 + \sum _{j=1}^{n_{\mathrm{sys}}} \left( f_{\mathrm{red},j}\delta _{i,j}^\mathrm{sys} \right) ^2\right) ^{1/2} , \end{aligned}$$
(5.2)

where \(n_{\mathrm{sys}}\) indicates the number of systematic error sources. Finally, we generate the central values for the HL-LHC pseudo-data projections by fluctuating the theory prediction by the expected total experimental uncertainty, namely

$$\begin{aligned} \sigma ^\mathrm{hllhc}_{i} \equiv \sigma ^\mathrm{th}_{i} \left( 1+ r_i\delta _{\mathrm{tot},i}^\mathrm{exp} \right) , \quad i=1,\ldots ,n_{\mathrm{bin}} , \end{aligned}$$
(5.3)

where \(r_i\) are univariate Gaussian random numbers. By construction, one expects that the EFT fit quality to the HL-LHC pseudo-data to be \(\chi ^2/n_{\mathrm{bin}} \simeq 1\) for a sufficiently large number of bins.

Fig. 28
figure 28

Comparison of the 95% CL intervals for the EFT coefficients between three related analyses: the VBS-only and a the combined diboson+VBS fits based on current data, and the VBS-only fit based on the HL-LHC projections listed in Table 10

Figure 28 displays the comparison of the obtained 95% CL intervals for the 16 EFT coefficients considered here between three related analyses. In particular, EFT fits based on the current measurements, both for a VBS-only and for a combined diboson+VBS dataset, are compared with the corresponding results from the VBS-only fit based on the HL-LHC projections listed in Table 10. Here we find that the HL-LHC measurements lead to a significant impact at the level of the VBS-only fit, where the current best bounds are improved by up to three orders of magnitude depending on the specific coefficient. It is also interesting to note that a VBS-only fit from HL-LHC measurements would even have a superior sensitivity compared to the combined diboson+VBS analysis, especially for the purely bosonic operators where at least a factor of 10 improvement over the current bounds is expected.

The results presented here further highlight the capability of VBS measurements for dimension-six EFT studies and the relevance of their integration in the global EFT fit, especially as more luminosity is accumulated. While our projections are based on optimistic assumptions such as a clean separation between the EW- and QCD-induced components of the measurement, the outstanding performance of the LHC experiments so far is rather encouraging.

6 Summary and outlook

In this work, we have presented an exhaustive investigation of effects from dimension-six SMEFT operators in the theoretical modelling of vector boson scattering processes. By exploiting information provided by the most updated VBS measurements from ATLAS and CMS, several of which are based on the full Run II data, we have obtained bounds on the relevant SMEFT operators that contribute to this process. We have demonstrated the overall consistency of the constraints provided by VBS with those from diboson production, and have highlighted how VBS measurements provide a useful addition to global EFT interpretations of LHC data. Using tailored projections, we have also estimated the improvements in the bounds on these dimension-six operators that can be expected from the VBS process with the legacy measurements of the HL-LHC, finding that these measurements will provide a remarkable sensitivity to several directions in the EFT parameter space.

We emphasize that the goal of this work was not to achieve state-of-the-art bounds on all the dimension-six operators that modify VBS observables. Such ambition can only be achieved within a dedicated global EFT fit that includes all relevant sensitive observables. These analyses must include, among others, Higgs production and decay measurements from the LHC and electroweak precision observables from electron-positron colliders, which by virtue of the electroweak gauge symmetry, constrain several of the same dimension-six operators that enter the description of VBS observables, as well as Drell-Yan distributions. For such an effort, some improvements in the theory calculations compared to this work will be required, in particular the use of exact, rather than approximate, NLO QCD effects in the EFT cross-sections using SMEFT@NLO as well as accounting for the quadratic corrections in the EFT expansion.

Most of the previous EFT interpretations of VBS observables from the LHC have focused on dimension-eight operators, with the argument that these can modify the quartic gauge couplings while leaving unaffected the triple ones that are purportedly well constrained by other processes. It would therefore be important to revisit these studies within a consistent EFT analysis that includes the effects of both dimension-six and dimension-eight operators up to \({\mathcal {O}}\left( \varLambda ^{-4}\right) \). For instance, it would be important to quantify how the current bounds on dimension-8 operators are modified with the inclusion of the dimension-six ones. Since there is no cross-talk between the dim-6 and dim-8 operators at this order in the EFT expansion, it would be possible to extend the present analysis by adding the various sources of quadratic contributions separately. Such a fully consistent \({\mathcal {O}}\left( \varLambda ^{-4}\right) \) analysis, combined with future measurements from Run III and the HL-LHC, would unlock the ultimate potential of EFT interpretations of VBS data and represent one of the key legacy results from the LHC.

Additional avenues for future research include the EFT interpretation of novel VBS observables, such as polarised scattering, as well as going beyond the SMEFT by considering other effective theories such as the HEFT or the Electroweak Chiral Lagrangian. In this respect, we point out that the fitting framework used in this work can be straightforwardly extended to other EFTs, and a fully general dependence of the theory predictions with the EFT coefficients is allowed.

The first measurements of unfolded VBS cross-sections and differential distributions discussed in this work undoubtedly represent a milestone in the LHC program, with profound implications for our understanding of the gauge sector in the SM and its extensions. While current VBS measurements are still statistics-dominated and, for the time being, provide only a moderate pull in the EFT fit, we have demonstrated that they provide complementary information as compared to the more traditional diboson processes. VBS is therefore poised to play a growing role in global EFT interpretations in the coming years, especially once high-statistics measurements become available.