1 Introduction

Supersymmetry (SUSY) provides well-justified extensions of the Standard Model (SM) of particle physics that can stabilise the electroweak scale against quantum corrections [1,2,3,4,5,6], radiatively break electroweak symmetry [7,8,9,10] and provide a dark matter (DM) candidate with the right abundance [11, 12]. Supersymmetric models are, however, increasingly challenged by null observations at a number of experiments, including searches for supersymmetric particles (sparticles) in proton–proton collisions at the Large Hadron Collider (LHC), and direct and indirect searches for DM.

In the minimal supersymmetric standard model (MSSM), the superpartners of the electroweak gauge and Higgs bosons mix to form electroweakinos. These consist of four Majorana fermions (neutralinos \({\tilde{\chi }}_i^0\), with \(i=1,2,3,4\) in order of increasing mass), and two Dirac fermions (charginos \({\tilde{\chi }}_i^\pm \), with \(i=1,2\)). The two mass matrices that mix these states contain only four parameters: the soft-breaking bino mass, \(M_1\), the soft-breaking wino mass, \(M_2\), the Higgsino superpotential mass parameter, \(\mu \), and the ratio of the two Higgs vacuum expectation values, \(\tan \beta \).

Although the masses of the neutralinos and charginos are unknown, there are theoretical reasons to expect them to be light. The \(\mu \) parameter, which governs Higgsino masses, enters tadpole cancellations required for electroweak symmetry breaking. Were \(\mu \) significantly greater than the weak scale, other parameters would need to be fine-tuned in order to satisfy these relations. Indeed, according to some measures of fine tuning presented in the literature, it is possible to have low fine tuning when the sfermions and gluino are heavy, provided that the Higgsinos (and therefore \(\mu \)) remain light [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34]. SUSY models with electroweakino states significantly lighter than the other SUSY states have been presented as natural SUSY [35,36,37,38,39,40,41,42] and in models where naturalness has been abandoned as a guiding principle [43,44,45,46,47,48,49,50,51]. In the latter, other motivations such as DM, where the lightest neutralino may play the role of DM even if the rest of the SUSY spectrum is heavy, are used as the guiding principles.Footnote 1

In this paper we take an agnostic approach to the questions of fine tuning and whether or not the neutralino plays the role of DM. Instead, we attempt to present a precise picture of current experimental knowledge of the electroweakino sector (which we call the EWMSSM) from direct collider searches for sparticles.

Constraints on electroweakinos have commonly been calculated under restrictive assumptions about their masses or compositions, or only over restricted slices of parameter space. For example, lower limits on the mass of the lightest neutralino from LEP [52, 53] are based on assumptions about the unification of gaugino masses at high scales. The purpose of this work is to determine whether the current suite of direct searches allows some range of the electroweakino masses (and/or couplings) to be robustly excluded – or alternatively, preferred.

Previous studies have investigated the combined impacts of various DM and collider constraints on the electroweakino sector, in the limit that other sparticles are decoupled [54,55,56,57,58,59,60,61,62,63,64,65,66,67,68]. Here, we carry out a more detailed, model-independent study, performing a global fit of the EWMSSM using only collider constraints from LEP, ATLAS and CMS arising either from direct searches for electroweakinos, or SM particle decays into them. Having a complete picture of the constraints on this sector from LEP and the LHC, independent of any assumptions about DM or Higgs physics, is of great interest. It may be the case, for example, that R-parity violation renders the \({\tilde{\chi }}_1^0\) metastable, or that the true Higgs sector is far more complex than that of the MSSM.

Electroweakino constraints from the LHC were first considered in detail in Ref. [69], which our study extends in a number of ways. First, we consider LEP searches in detail, plus constraints arising from measurement of the Z and h invisible widths. Second, we perform a convergent global statistical fit of the parameter space, with Monte Carlo (MC) event generation for LHC processes at each sampled parameter point, rather than simply performing a rectangular grid scan of the parameter space (and we generate at least twice as many MC events per parameter point as the previous study). Our statistical treatment is also superior, as we recreate the ATLAS and CMS limit-setting procedures for each analysis rather than comparing the predicted number of signal events to the ATLAS and CMS 95% CL exclusions on the numbers of signal events. This allow us to combine continuous likelihood terms from each analysis, and thus explore possible tensions between analyses in a rigorous fashion. Most significantly, Ref. [69] is based on searches for electroweakinos using the 8 TeV proton–proton collision dataset of the LHC, which have been all but superseded by new ATLAS and CMS searches that use 36 \(\hbox {fb}^{-1}\) of 13 TeV proton–proton collision data. This dramatically extends the possible discovery and exclusion reach of the LHC searches.

We begin in Sect. 2 by introducing the model and parameters over which we scan, followed by our sampling methodology, adopted priors and statistical framework. In Sect. 3, we then give a brief summary of the observables and likelihoods that we employ. We present our main results in Sect. 4 and briefly consider the implications for DM in Sect. 5 before presenting final conclusions in Sect. 6. Appendix A provides additional details for the interested reader on the impact of 8 TeV data on our results, and Appendix B provides best-fit signal predictions for all signal regions of all analyses that we consider. All GAMBIT input files, generated likelihood samples and best-fit benchmarks for this paper are publicly available online through Zenodo [70].

2 Model and fitting framework

2.1 Model definition

In this study we investigate the electroweakino sector of the MSSM. This sector is composed of Higgsinos (\({\tilde{H}}_u^0, {\tilde{H}}_u^+, {\tilde{H}}_d^-, {\tilde{H}}_d^0 \)) and electroweak gauginos: the bino (\({\tilde{B}}\)) and winos (\({\tilde{W}}^0,{\tilde{W}}^+,{\tilde{W}}^- \)). The neutral states mix together to form neutralinos, while the charged states mix to form charginos. The Lagrangian density therefore includes

$$\begin{aligned} {\mathcal {L}}_{\text {EWino}} = -\frac{1}{2} (\psi ^0)^T M_N \psi ^0 - \frac{1}{2} (\psi ^\pm )^T M_C \psi ^\pm + c.c. \end{aligned}$$
(1)

where

$$\begin{aligned} \psi ^0 = ({\tilde{B}},{\tilde{W}}^0,\tilde{H_d^0},\tilde{H_u^0} ),~ \psi ^\pm = ({\tilde{W}}^+,{\tilde{H}}_u^+,{\tilde{W}}^-,{\tilde{H}}_d^- ), \end{aligned}$$
(2)

and the neutralino mass matrix is

$$\begin{aligned} M_N = \begin{pmatrix} M_1 &{} 0 &{} -\frac{1}{2} g' v c_\beta &{} \frac{1}{2} g' v s_\beta \\ 0 &{} M_2 &{} \frac{1}{2} g v c_\beta &{} -\frac{1}{2} g v s_\beta \\ -\frac{1}{2} g' v c_\beta &{} \frac{1}{2} g v c_\beta &{} 0 &{} -\mu \\ \frac{1}{2} g' v s_\beta &{} -\frac{1}{2} g v s_\beta &{} -\mu &{} 0 \end{pmatrix}. \end{aligned}$$
(3)

Here \(s_\beta = \sin \beta \) and \(c_\beta = \cos \beta \), and the SU(2) and \(U(1)_Y\) gauge couplings, g and \(g'\), and the electroweak VEV, v are fixed from data while the ratio \(\tan \beta = v_u / v_d\) is a free parameter.

Similarly, the chargino mass matrix may be written as

$$\begin{aligned} M_C = \begin{pmatrix} 0 &{} X^T\\ X &{} 0 \end{pmatrix}, \;\; \text {where} \;\; X = \begin{pmatrix} M_2 &{} \frac{g v s_\beta }{\sqrt{2}} \\ \frac{g v c_\beta }{\sqrt{2}} &{} \mu \end{pmatrix}. \end{aligned}$$
(4)

Therefore the electroweakinos can be described using just the four electroweakino parameters mentioned in the introduction: \(M_1\), \(M_2\), \(\mu \) and \(\tan \beta \).

An electroweakino effective field theory (EFT) can be constructed by including additional light states, namely the SM fermions, gauge bosons and a SM-like Higgs boson. As with g and \(g^\prime \), the SU(3) gauge coupling and SM Yukawa couplings can be fixed from data. The Higgs potential parameters can be fixed by imposing the minimisation condition and requiring that the Higgs mass is fixed to its measured value \(m_h = 125.09\,\text {GeV}\) [71].

Note that in the MSSM, the quartic couplings in the Higgs potential are fixed by SM gauge couplings, allowing the Higgs mass to be calculated given a value of \(\tan \beta \). To find \(m_h \simeq 125\,\text {GeV}\) over a range of input \(\tan \beta \), one would then have to vary additional MSSM parameters. We choose to instead fix the Higgs mass, in the spirit of interpreting the results in an electroweakino EFT rather than any specific MSSM ultraviolet completion. This avoids introducing additional degrees of freedom that are not part of the electroweakino sector.

In principle it is possible to perform all calculations in such an electroweakino EFT. In practise, it is simpler to use an MSSM model where the rest of the states are heavy and decoupled, and make use of existing MSSM tools for computing e.g. electroweakino decays. We implement this model within the GAMBIT MSSM model hierarchy, in which the user may define child models of more general scenarios. The GAMBIT SUSY models include a chain of scenarios in which the MSSM soft SUSY-breaking Lagrangian parameters are defined at some scale Q, which one typically sets to be near the weak scale. The most general model has 63 free parameters: the gaugino masses \(M_1\), \(M_2\), and \(M_3\), the trilinear coupling matrices \({\mathbf{A}}_u, {\mathbf {A}}_d\) and \({\mathbf {A}}_e\) (9 parameters each), the squared soft sfermion mass matrices \({\mathbf {m}}^{\mathbf {2}}_Q\), \({\mathbf {m}}^{\mathbf {2}}_u\), \({\mathbf {m}}^{\mathbf {2}}_d\), \({\mathbf {m}}^{\mathbf {2}}_L\) and \({\mathbf {m}}^{\mathbf {2}}_e\) (6 parameters each), and three additional parameters describing the Higgs sector.

In this work we define the dimensionful parameters at the SUSY scale \(Q = M_{\text {SUSY}} = 3\) TeV. We set all trilinear couplings to zero. We take all diagonal entries of the squared soft sfermion mass matrices to be \(M^2_{\text {SUSY}}\), and all off-diagonal entries to be zero. We adopt a value of 5 TeV for both the pseudo-scalar Higgs mass \(m_A\) and the gluino mass parameter \(M_3\). We choose these values in order to effectively decouple all sparticles except for the electroweakinos. Their precise values are not significant, and simply serve to push the model into the decoupling regime. In this way, we fix all MSSM parameters except the four free parameters of the EWMSSM given in Table 1.

In this model we also assume that R-parity is either conserved or broken sufficiently weakly that the lightest supersymmetric particle (LSP) is metastable on detector timescales; we thus discard all parameter combinations where the LSP is not a neutralino.

Table 1 Parameters, ranges and priors adopted in the scans of this paper. The “hybrid” prior is flat where \(|x| < 10\,\mathrm {GeV}\), and logarithmic elsewhere. All other soft SUSY-breaking parameters are decoupled; see the text for details

2.2 Global fitting framework

The fits that we present in this paper are done with GAMBIT [72,73,74,75,76,77] 1.2.0. The LHC and LEP constraints that we apply come from ColliderBit [73] and the invisible width constraints are from DecayBit [76]. Both rely on spectrum calculations carried out with SpecBit [76]. All sampling is driven by ScannerBit [77]. We later explore DM implications (Sect. 5) with DarkBit [74].

Compared to GAMBIT 1.1, version 1.2 offers a number of new features. Those of most relevance for this study are updates to DecayBit to include the invisible Z width and theory errors on the invisible Higgs width (Sect. 3.1), and to ColliderBit to include

  • many new 13 TeV analyses

  • a LEP search for degenerate chargino–neutralino pairs (Sect. 3.3.1),

  • the ability to account for background correlations in different signal regions via simplified likelihoods (Sect. 3.3.3),

  • a dynamic convergence test of LHC Monte Carlo simulations designed to achieve a specific fractional signal uncertainty,

  • explicit output of individual LHC likelihood components, and

  • the ability to simultaneously include likelihood components from multiple uncorrelated signal regions in a single analysis.

Other updates include

  • the ability to call backends written in Python,

  • an interface to the polychord sampler [78],

  • improved parallelism and shutdown handling in the hdf5 printers and the T-Walk sampler,

  • a standalone hdf5 combination utility,

  • a new cout printer that sends outputs directly to the system standard output,

  • support for DM semi-annihilation processes and related models in DarkBit and SpecBit [79],

  • a wider range of Higgs portal models [79, 80],

  • a number of new MSSM parameterisations (using \(\mu \) and \(m_A\) instead of \(m_{H_u}^2\) and \(m_{H_d}^2\)), and

  • support for a number of new and updated external packages, including FlexibleSUSY 2.0 [81], nulike 1.0.6 [82, 83], DDCalc 2.0.0 [80], Capt’n General 1.0.0 [80] and fjcore 3.2.0 [84].

2.3 Parameters and priors

Table 1 summarises the ranges over which we scan the EWMSSM parameters, along with the priors that we assume.Footnote 2 Except for \(\tan \beta \), which we sample using a flat prior, our main scan employs a “hybrid” prior on each of the parameters x, which is flat where \(|x| < 10\,\mathrm {GeV}\), and logarithmic elsewhere.

To ensure that we include all possible mass hierarchies, we allow the three dimensionful parameters \(M_1\), \(M_2\) and \(\mu \) to vary up to a magnitude of 2 TeV. This is well beyond the LHC reach for electroweak states. Without loss of generality, we restrict \(M_2\) to positive values, as is commonly done in the literature (see e.g. [85, 86]), while allowing both positive and negative signs for both \(\mu \) and \(M_1\). Although we do not expect our results to be very sensitive to \(\tan \beta \), we consider a large range of possible values for this parameter (1–70), as previous work [87, 88] has shown a preference for large \(\tan \beta \).

For the purposes of mapping the profile likelihood, we sample the parameter space of the EWMSSM using the differential evolution sampler Diver 1.0.4 [77], employing the self-adaptive jDE version of the algorithm [89]. We set the population size to 18 700, and the convergence threshold to \(10^{-3}\). In order to sample the final high-likelihood region more efficiently, we performed two additional targeted scans, one for \(|\mu | < 500\) GeV and another for \(M_2 < 500\) GeV, using flat priors for the dimensionful parameters and the same Diver settings as the full-range scan.

A critical factor in the scanning strategy is the number of MC events generated per point to determine the LHC likelihood. This is particularly important for electroweak supersymmetry searches, since the acceptance of the analyses is often very small, due to very stringent kinematic selections that are designed to reject SM backgrounds that otherwise swamp the tiny SUSY signal. This problem is made worse by the necessity for some analyses of pre-selecting the signal region to use for a given parameter point, according to which of the available signal regions is expected to have the best sensitivity to the model. As the MC statistics are increased, the signal region with the best expected sensitivity to a given parameter point may change abruptly. When the level of agreement between data and background expectations differs notably between signal regions, a switch in which signal region is pre-selected can cause a large change in the likelihood assigned to the parameter point.

To combat this we perform our initial scan of the full parameter space with 100 000 generated events per parameter point, and the targeted scans with 500 000 events. We then carry out a sequential post-processing of the scan results to increase the MC statistics for points within the parameter regions preferred by our fit. Through this post-processing we ensure a minimum of 4 million generated events for all points in the \(2\sigma \) region, 16 million events for all points inside the \(1\sigma \) region, and 64 million events for the 500 points with highest likelihood. In total, we process \(2.4\times 10^5\) of the original scan samples with at least 4 million MC events. All the results that we present in this paper are based on this set of post-processed samples, unless otherwise stated.

2.4 Electroweakino spectrum and decays

In the course of our scans, model parameter values are sampled by ScannerBit and passed to an MSSM FlexibleSUSY [81, 90] spectrum generator,Footnote 3 which determines \(\overline{\text {DR}}\) couplings and computes the predicted electroweakino masses and mixings. It computes neutralino and chargino masses at the full one-loop level, performing a fixed-order calculation at the SUSY scale \(Q = 3\) TeV. The separation of scales implies somewhat large fractional corrections to the masses: \(\sim g^2/(4\pi )^2 \ln (M^2_{\text {SUSY}}/m^2_Z)\) for gaugino-like states and \(\sim y_t^2/(4\pi )^2 \ln (M^2_{\text {SUSY}}/m^2_t)\) for Higgsinos.

A more precise calculation of the masses could be achieved by using effective field theory techniques of matching and running to resum logs, or by including two-loop dominant \({\mathcal {O}}(\alpha _t \alpha _s)\) and \({\mathcal {O}}(\alpha \alpha _s)\) corrections to the neutralino and chargino masses [95, 96]. However, such improvements would not have a significant impact on our conclusions about the implications of experimental searches for the electroweakino sector.

We extract the electroweak gauge couplings at one-loop level using a fixed-order calculation at scale \(m_Z\), and thus these also receive electroweak corrections with logarithms between the SUSY scale and \(m_Z\).Footnote 4

We calculate neutralino and chargino decay branching fractions with SUSY-HIT 1.5 [97], which incorporates HDECAY [98] and SDECAY [99]. The resulting total widths and branching ratios are passed to the Pythia 8 event generator [100, 101] which performs the decays. Since Pythia 8 is in most instances limited to phase space decays, the kinematics of three-body decays of electroweakinos through off-shell gauge bosons, \({{\tilde{\chi }}}_1^\pm \rightarrow {{\tilde{\chi }}}_1^0 W^*\) and \({{\tilde{\chi }}}_2^0\rightarrow {{\tilde{\chi }}}_1^0 Z^*\), is not perfectly described. This is a limitation inherent to our fast simulation of LHC events. As will be clear below, the problematic region of the parameter space is not preferred by our scans.

3 Observables and likelihoods

Having chosen to investigate only the constraints provided by collider data on the electroweakino sector, our study includes a variety of direct searches for charginos and neutralinos from the OPAL and L3 experiments at the LEP collider, and the ATLAS and CMS experiments at the LHC, plus constraints on the invisible widths of the Z and Higgs bosons.

3.1 Higgs and Z boson invisible width

We calculate the Z boson decay width to neutrinos \(\varGamma (Z\rightarrow \nu \nu )\) at two loops in terms of SM nuisance parameters, using a parametric formula from Ref. [102]. To calculate the invisible width, we add this width to the tree-level decay width to the LSP, \(\varGamma (Z\rightarrow {{\tilde{\chi }}}^0_1 {{\tilde{\chi }}}^0_1)\). Indirect LEP measurements [103] require that the invisible width,

$$\begin{aligned} \varGamma (Z\rightarrow \text {inv.}) = 499.0 \pm 1.5 \,\text {MeV}. \end{aligned}$$
(5)

We use a Gaussian likelihood for this measurement, including in quadrature a \(10\%\) theoretical error in \(\varGamma (Z\rightarrow {{\tilde{\chi }}}^0_1 {{\tilde{\chi }}}^0_1)\) and an error of \(0.048 \,\text {MeV}\) accounting for missing higher-order corrections in \(\varGamma (Z\rightarrow \nu \nu )\) [102]. This indirect measurement is stronger than constraints from monophoton searches at LEP near the Z pole [104,105,106,107], which we did not include.Footnote 5

Higgs measurements at ATLAS, CMS and the Tevatron constrain the invisible branching fraction of the Higgs, \(\text {BF}(h\rightarrow \text {inv.})\). Assuming SM-like couplings for the Higgs, Ref. [108] found that a combination of such measurements requires

$$\begin{aligned} \text {BF}(h\rightarrow \text {inv.}) \le 0.19, \end{aligned}$$
(6)

at \(95\%\) confidence. This combined limit remains stronger than more recent single-experiment limits (e.g. [109]). More recent combinations (e.g. [110]) do not assume SM-like couplings, allowing all Higgs couplings to vary freely in their fits. We employ the likelihood for the Higgs invisible branching fraction described in Ref. [76], based on the chi-squared as a function of invisible branching fraction extracted from Ref. [108]. Here we apply this likelihood to Higgs decays to the LSP, \(\text {BF}(h\rightarrow {{\tilde{\chi }}}^0_1{{\tilde{\chi }}}^0_1)\), bearing in mind that heavier neutralinos are unstable and therefore not invisible.

We calculate the decay widths to (all) charginos and neutralinos at tree level [111], and then add them to the decay width in the SM [112] to estimate the total width of the Higgs in our simplified electroweakino scenario. Because we consider such a simplified scenario, we do not include one-loop corrections to the decay widths to charginos or neutralinos. We therefore include a conservative \(50\%\) log-normal theory uncertainty on our prediction of the invisible branching fraction, based on findings from one-loop calculations in Ref. [113].

3.2 LEP searches for electroweakino production

Electroweakino production provides an excellent example of a case where limits from the LEP experiment remain competitive with LHC searches, particularly for light, degenerate spectra. The ColliderBit module of GAMBIT includes individual cross-section limits on the pair production of neutralinos and charginos from the L3 and OPAL experiments, expressed as a function of the sparticle masses. For each point in the EWMSSM parameter space, we calculate the LEP pair-production cross-sections for the processes given in Table 2, and calculate the product of the cross-section and branching fraction for each process (using the DecayBit interface to SUSY-HIT 1.5). These are then compared to digitised, and interpolated, LEP cross-section limits from the analyses listed in Table 2 to form a Gaussian likelihood term, as described in [73, 87]. The likelihoods from each channel and experiment are multiplied, on the assumption that they are independent measurements.

Table 2 Results from LEP on sparticle pair production used in the scans

The selection of searches originally included in the ColliderBit module are only sensitive down to electroweakino mass differences of 3 GeV. We have therefore also included the OPAL search for a degenerate chargino–neutralino pair [115] in ColliderBit. This is sensitive to mass differences from 320 MeV to 5 GeV, and is important for constraining wino and Higgsino LSP scenarios from 45 GeV up to the kinematic limit of 95 GeV.

The implementation follows that of the other electroweakino searches from LEP: the pair-production cross-section of the (lightest) chargino is calculated and compared to the digitised OPAL limit in the plane of chargino mass versus chargino–neutralino mass difference to find the likelihood contribution. This particular search does not rely on the decay of the chargino, because it is based on missing energy plus the emission of a photon as initial state radiation (ISR).

3.3 LHC searches for electroweakino production

3.3.1 Analyses

There is a long list of searches for supersymmetry from the ATLAS and CMS experiments of the LHC, conducted using proton–proton collision data taken at \(\sqrt{s}=7\), 8 and 13 TeV. Searches for strongly-coupled supersymmetric particles are conducted in final states with jets (including b-jets), missing transverse energy \(E_T^{\text {miss}}\) and/or some number of leptons, and are specifically optimised on simplified models of gluino and squark production. This includes dedicated searches for third generation squark production. Models involving only chargino and neutralino production are not expected to pass the stringent multiplicity and kinematic selections required by these analyses.

Fig. 1
figure 1

The simplified model used for the optimisation of ATLAS and CMS searches targeting on-shell W and Z production. Neutralinos and charginos are pair-produced, resulting in final states with leptons, jets and missing energy

Searches for weakly-produced sparticles are generally challenging due to the small production cross-sections, and the dominant constraints come from final states rich in leptons, but relatively poor in jets. Searches are typically optimised on simplified models, with the most relevant model for our work shown in Fig. 1. This model assumes that \({\tilde{\chi }}_1^+ {\tilde{\chi }}_1^-\) and \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) production are the only available SUSY production processes at the LHC, and that the decay of the electroweakinos involves on-shell W and Z production. It is further assumed that the \({\tilde{\chi }}_2^0\) and \({\tilde{\chi }}_1^\pm \) are degenerate in mass and are wino-dominated, and that the \({\tilde{\chi }}_1^0\) is bino-dominated. This sets the production cross-sections for these processes, whilst ensuring that there are only two parameters remaining in the simplified model: \(m_{{\tilde{\chi }}_1^\pm }\) (or, equivalently, \(m_{{\tilde{\chi }}_2^0}\)) and \(m_{{\tilde{\chi }}_1^0}\). Each analysis that uses the simplified model is then optimised by generating a grid of simulated signal events in the \((m_{{\tilde{\chi }}_1^\pm },m_{{\tilde{\chi }}_1^0})\) mass plane, and defining a number of signal regions that exploit differences between the expected kinematics of the signal and the expected kinematics of the dominant SM background processes for each region of the mass plane. Null observations are interpreted in terms of a 95% confidence-level exclusion contour in the plane.

Many of the signal regions that we use in this paper are optimised for finding the simplified model of Fig. 1. However, we also make use of signal regions optimised on (and interpreted using) an extension containing additional intermediate sleptons. Despite not being obviously relevant to a model with decoupled sleptons, it is still possible for these regions to have some sensitivity to the EWMSSM. We also use analyses that have been optimised on a number of other models, e.g. general gauge mediation, in case they have sensitivity to our model of interest; we explain below why one might expect this to be the case.

In the main section of this paper, we include only LHC analyses based on the full 36 \(\hbox {fb}^{-1}\) of data from Run II at \(\sqrt{s}=13\) TeV. These are discussed below, and are far more sensitive than the earlier 7 or 8 TeV results. For the sake of completeness, in Appendix A we also consider the relatively small impact of also including 8 TeV data.

The ATLAS search for chargino and neutralino production in two- and three-lepton final states [116]: This has search regions optimised in three channels. The two-lepton and zero-jets channel targets \({\tilde{\chi }}^+_1 {\tilde{\chi }}^-_1\) production and \({\tilde{\ell }}{\tilde{\ell }}\) production in signal regions with no jets, optimised using the dilepton invariant mass \(m_{\ell \ell }\) and the “stransverse mass” \(m_{T2}\) (see Table 1 of [116], and note that we use the inclusive signal region definitions). The two-lepton and jets channel targets \({\tilde{\chi }}^+_1 {\tilde{\chi }}^0_2\) production with decays via gauge bosons into two same-flavour, opposite-sign leptons (assumed to come from an on-shell Z boson), and at least two jets (assumed to come from an on-shell W boson). The signal regions in this case are split into dedicated categories for high, intermediate and low \({\tilde{\chi }}^+_1/{\tilde{\chi }}^0_2\)\({\tilde{\chi }}^0_1\) mass differences, and use a variety of variables including the dilepton invariant mass \(m_{\ell \ell }\), the dijet invariant mass \(m_{jj}\), the missing transverse energy, a list of angular distances, the W and Z boson transverse momenta and \(m_{T2}\) (see Table 2 of Ref. [116]). Finally, the three-lepton channel targets \({\tilde{\chi }}^+_1 {\tilde{\chi }}^0_2\) production with decays via intermediate \({\tilde{\ell }}\) or gauge bosons into final states with three leptons. The signal regions use the invariant mass of the same-flavour, opposite-sign lepton pair in the events, the missing transverse energy, the \(p_T\) of the third lepton, the number of jets, the transverse mass, the \(p_T\) of the three-lepton system, and the \(p_T\) of the leading jet (see Table 4 of [116]). It is important to note in particular that the jet multiplicity in this analysis splits the \(3\ell \) regions targeting on-shell W and Z production into a region with no jets, and a region with at least one jet. No significant excess was reported in any signal region, although there are modest excesses in some regions. The most significant of these has a local significance of 1.8\(\sigma \), occurring in a region, SR3_WZ_1Jc, that requires three leptons and at least one jet, along with a same-flavour, opposite-sign lepton pair with an invariant mass consistent with a Z boson, \(E_T^\text {miss}>200\) GeV, and other kinematic cuts on the lepton and jet systems. Taken as whole, this analysis should be very sensitive to parts of the EWMSSM parameter space, with the most sensitivity occurring in the regions targeting W and Z production.

The ATLAS search for chargino and neutralino production using recursive jigsaw reconstruction in final states with two or three leptons [117]: This analysis has four signal regions dedicated to high, intermediate, and low mass splittings, along with an ISR-initiated search region, in both the two- and three-lepton final states. The two-lepton and three-lepton regions select leptonic Z-boson decays, with hadronic W-boson decays being chosen for the former (via a cut on the dijet mass) and leptonic W-boson decays for the latter (via a transverse mass selection). Only minimal event selection is applied on object momenta and multiplicity criteria, with variables arising from the application of the recursive jigsaw reconstruction technique used instead [118]. This provides so-called hemisphere variables that test the scale and balance of events using a specific decay tree formulation designed to test whether a given event looks more signal- or background-like. The signal regions are constructed such that the low-mass and ISR regions in the two-lepton and three-lepton searches are non-overlapping. The ISR regions use a specific formulation of the recursive jigsaw method outlined in Ref. [119], which requires at least one hadronic jet associated with a strong ISR system, making the ISR regions orthogonal to the low-mass regions. The results were compatible with the SM background expectation in all signal regions targeting large and intermediate \({\tilde{\chi }}^\pm _1/{\tilde{\chi }}^0_2-{\tilde{\chi }}^0_1\) mass splittings (leading to the best exclusion limits to date in that mass range), but revealed excesses in four signal regions targeting low mass splittings, with local significances of 2.0, 3.0, 1.4 and 2.1\(\sigma \).

The ATLAS search for pair production of Higgsinos in the hh final state [120]: This consists of two separate analyses with 24.3 and , focused on light and heavy Higgsinos, respectively. The signature is in both cases four jets that are kinematically consistent with two SM Higgs boson candidates, with three or four b-jet tags present. This is sensitive to the pair production of two Higgsinos – charged or neutral – where any charged Higgsino decays to the neutral with very soft SM decay products, and the resulting pair of neutral Higgsinos each decay to a Higgs boson and a light neutral sparticle. The search is motivated by gauge-mediated supersymmetry-breaking scenarios, where the light sparticle is a gravitino. We include this search here because the light sparticle may just as well be a lighter neutralino. Each analysis has a large number of signal regions. For the low-mass search, ATLAS set exclusion limits on the basis of the two-dimensional distribution of events in a histogram with bins of missing energy \(E_T^\text {miss}\) and effective mass \(m_\text {eff}\). We use all 42 bins from the original analysis as signal regions. Similarly, the high-mass search uses seven orthogonal signal regions, optimised for exclusion sensitivity. In addition to these exclusion-optimised signal regions, two discovery regions were defined for each analysis. Because of overlaps between the low-mass and high-mass signal regions, we have chosen to use only the low-mass signal regions in this study, so as to maximise the exclusion power in the most interesting (i.e. low-mass) region.

The ATLAS search for supersymmetry in final states with four or more leptons [121]: This examined final states with four or more leptons, including up to two hadronically decaying taus. The search was optimised on simplified models of General Gauge-Mediated (GGM) SUSY breaking with R-parity conservation, and on simplified models with R-parity violation. However, the model dependence of the search was reduced by making the requirements on the effective mass and transverse missing momentum in the selected events fairly loose; these were applied along with a requirement of the presence or absence of a Z-boson candidate. This search should be sensitive to certain EWMSSM models through the production of multi-gauge-boson final states, which are capable of producing events with four leptons. Note that we here only include the search regions with at least four light leptons. The ATLAS results showed no significant excess in any of the signal regions, except for a modest one (2.3\(\sigma \) local) in SR0D, which required two Z boson candidates and \(E_T^\text {miss}>100\) GeV.

The CMS search for chargino and neutralino production in the Wh final state [122]: This was optimised on a simplified model that assumed \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) production, followed by the decays \({\tilde{\chi }}^\pm _1\rightarrow W^\pm {\tilde{\chi }}^0_1\) and \({\tilde{\chi }}^0_2\rightarrow h {\tilde{\chi }}^0_1\). Events were selected to have \(E_T^\text {miss}>125\) GeV, two b-jets with an invariant mass close to the Higgs boson mass, a transverse mass of the lepton-\(E_T^\text {miss}\) system greater than 150 GeV, and a “contranverse mass” \(M_{CT}>170\) GeV [123, 124]. No significant excess was reported in the two signal regions, which were defined using different bins of \(E_T^\text {miss}\). This analysis should be sensitive to the EWMSSM, which is more than capable of producing Wh final states.

The CMS search for degenerate charginos and neutralinos in final states with two low-momentum opposite-sign leptons [125]: This search targets \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) production with a mass-degenerate \({\tilde{\chi }}_1^\pm \) and \({\tilde{\chi }}_2^0\) that are assumed to decay to the \({\tilde{\chi }}_1^0\) via virtual W and Z bosons (note that there are also search regions defined for stop squark pair production, which we ignore). The results were optimised on and interpreted in two variants of the \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) simplified model, in which the \({\tilde{\chi }}_1^\pm \) and \({\tilde{\chi }}_2^0\) are either wino-dominated or Higgsino-dominated. A second Higgsino model considers \({\tilde{\chi }}_1^0 {\tilde{\chi }}_2^0\) production, where the mass of the chargino is set to \(m_{{\tilde{\chi }}_1^\pm }=(m_{{\tilde{\chi }}_2^0}+m_{{\tilde{\chi }}_1^0})/2\). The selected events have two opposite-sign leptons and at least one jet. A pre-selection includes requirements that the transverse mass of both lepton-\(E_T^\text {miss}\) combinations is less than 70 GeV, that the \(E_T^\text {miss}\) is greater than 125 GeV, that the dilepton invariant mass must be less than 50 GeV, and that the lepton transverse momenta must be less than 30 GeV. Thus, this analysis would be sensitive to off-shell gauge boson production in the EWMSSM in cases of compressed mass spectra, but would rapidly lose sensitivity to on-shell production. Signal regions are defined in bins of \(E_T^\text {miss}\) and \(m_{\ell \ell }\), and we use the simplified composite likelihood treatment to combine the bins as described in Sect. 3.3.3.

The CMS search in states with jets and two opposite-sign same-flavour leptons [126]: This analysis uses the invariant mass of the lepton pair, searching for a kinematic edge or a resonant-like excess compatible with the Z-boson mass. We deal with the latter search only, since the former is designed to target strong sparticle production. The electroweakino search was optimised on the wino-dominated \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) production model shown in Fig. 1, and a second model based on gauge-mediated SUSY breaking. In the electroweakino search, selected events are required to have a dilepton invariant mass close to the Z-boson mass, at least two jets, and a missing transverse energy in excess of 100 GeV. Multiple signal regions are defined with bins of the dijet mass, \(M_{T2}\) and \(E_T^\text {miss}\). Regions with two b-jets are also defined, in order to target hZ final states. We use the simplified composite likelihood treatment to combine the bins as described in Sect. 3.3.3. This search should be very sensitive to models in the EWMSSM.

The CMS search for chargino and neutralino production in final states with two or three leptons [127]: This search targeted various scenarios of direct \({\tilde{\chi }}_1^\pm {\tilde{\chi }}_2^0\) production, with a wino-dominated \({\tilde{\chi }}_1^\pm \) and \({\tilde{\chi }}_2^0\). One set of simplified models included light sleptons, whilst the other was essentially that shown in Fig. 1, but with an extra model in which the \({\tilde{\chi }}_2^0\) produces an h boson rather than a Z boson. CMS searched events with two same-sign light leptons, in which they binned the events in the transverse mass, the transverse momentum of the dilepton system, and the \(E_T^\text {miss}\), for a total of 30 bins. They also performed a three-lepton search using bins of the transverse mass, \(E_T^\text {miss}\), and the dilepton invariant mass, with 44 bins defined for the case where two of the leptons form an opposite-sign, same-flavour pair, and six additional regions defined for the opposite case. Further regions were defined for the case where there was at least one hadronically-decaying tau. To facilitate reinterpretation of the results, they defined aggregated signal regions (i.e. signal regions with a wider selection on the kinematic properties than the single bins), most of which require a missing transverse energy of at least 200 GeV. We provide a thorough discussion of the difference between using the aggregated signal regions and the full set of bins below.

Additionally, in test scans, we investigated the impact of the CMS monojet analysis, which may be sensitive to \({\tilde{\chi }}_1^0{\tilde{\chi }}_1^0\) production [128]. We found that this had no sensitivity in any region of the parameter space. This matches the naive expectation based on the literature, so we exclude this analysis from our final results.

Table 3 Labels for the independent likelihood terms included in our LHC likelihood, along with the analyses from which they are derived

A typical LHC search includes quantifying the impact of a long list of systematic uncertainties, including those related to the jet energy scale and resolution, lepton identification and reconstruction, trigger efficiency, b-tagging, MC modelling (such as the choice of renormalisation and factorisation scales, plus uncertainties related to the choice of parton distribution function), pileup modelling, and particle production cross-sections. These are often correlated across signal regions, and this must be taken into account in determining the likelihood of a SUSY model given the observed data and expected SM background contribution. In addition, for searches with non-orthogonal signal region selections, there will be a correlated number of events in overlapping regions.

For most of the analyses that we use, no detailed information is provided by the experiments regarding the correlation of event numbers and uncertainties between the different signal regions (the exceptions will be discussed below). Best practise in this case is to take the signal region expected to give the highest exclusion power for a given point in the SUSY parameter space, and use that region to calculate a likelihood contribution using the observed LHC data. In previous GAMBIT studies [73, 87, 88], our approach has been to select a single such “best expected” signal region across those contained in a given paper, for each point in the SUSY parameter space. However, the division of experimental results into different papers does not always make this a sensible procedure, given that several papers summarise the results of multiple analyses that are thematically similar, but actually orthogonal from the point of view of selecting events. Therefore, in this study, we instead divide the signal regions by final state, and assume that the “best expected” region in each final state can be used to obtain a likelihood contribution independently of other final states (and, of course, a final state in the ATLAS data yields an independent likelihood term from the same final state in the CMS data). This gives a series of independent likelihood terms whose origin is summarised in Table 3.

A possible flaw in this approach is the inclusion of two recent ATLAS searches for two- and three-lepton final states ([116, 117]) as independent contributions in our scan likelihood function. In this case, however, ATLAS have published plots showing that the overlap in the selected events for the two analyses is small, and we have performed our own checks that our final conclusions do not change substantially when the ATLAS recursive jigsaw electroweak (EW) analysis is supplemented by the earlier analysis that uses conventional variables.

We have added all the above searches to the ColliderBit module in GAMBIT. ColliderBit implements LHC constraints by performing a Monte Carlo (MC) simulation of sparticle production at the 13 TeV LHC for each point in the parameter space (using the Pythia 8 MC generator [100, 101]), before passing the events through a custom fast detector parameterisation of the ATLAS and CMS detectors, and an implementation of the relevant analysis cuts. This gives the expected yield of signal events in each analysis which, for most analyses, is used to define a Poisson likelihood term marginalised over statistical and systematic uncertainties, based on the signal region with the best expected exclusion power. Further details can be found in [73, 87]. The likelihoods for different analyses are treated as independent, and are multiplied together. In the above analyses, we have implemented new efficiencies for leptons and b-jets in certain analyses, in order to better reproduce the published cutflows.

A potential weakness in our approach is that we use leading order (LO) cross-sections plus leading logarithmic (LL) corrections from Pythia, due to the prohibitive computational cost of next-to-leading order (NLO) and next-to-leading logarithmic (NLL) calculations. We return to the expected effect of this approximation in our results discussion.

3.3.2 Validation

Example cut-flows are shown in Tables 4, 5 and 6, for the ATLAS search for two Higgs bosons and \(E_T^{\text {miss}}\) [120], the CMS two low-momentum opposite-sign leptons and \(E_T^{\text {miss}}\) search [125], and the CMS two opposite-sign same-flavour leptons and \(E_T^{\text {miss}}\) search [126]. The agreement is in general good, rising to a maximum discrepancy of \(\sim \)40% in the worst case.

Table 4 Example comparison of GAMBIT and ATLAS [120] cutflows for two signal regions targeting low-mass Higgsinos in a search for new physics in events with two Higgs bosons decaying into \({{\bar{b}}} b\). Shown are the numbers of events expected in 24.3 \(\text {fb} ^{-1}\) of 13 T\(\text {e}\text {V}\) ATLAS data for Higgsino pair production with a signal cross-section of 0.577 pb, \(m_{{\tilde{H}}}=250\) GeV and a massless gravitino, assuming 100% branching fraction for \({{\tilde{H}}}\rightarrow h{{\tilde{G}}}\)
Table 5 Comparison of the GAMBIT and CMS [125] cutflows for a WZ signal model (\(m_{{\tilde{\chi }}_1^{\pm }}=150\) GeV, \(m_{{\tilde{\chi }}_1^{0}}=130\) GeV) in a search for new physics in events with two low-momentum opposite-sign leptons and missing transverse momentum. Shown are the numbers of events expected in 33.2 \(\text {fb} ^{-1}\) of 13 T\(\text {e}\text {V}\) CMS data for a signal cross-section of 5.18 pb [129]. Both the CMS cutflow and GAMBIT cutflow are generated for production of \({\tilde{\chi }}_1^{\pm }{\tilde{\chi }}_2^{0}\) in a simplified model with decays via off-shell W / Z
Table 6 Comparison of the GAMBIT and published CMS cutflows [126] in four signal regions of a search for new physics in events with two opposite-charge same-flavor leptons and missing transverse momentum, for a WZ signal model (\(m_{{\tilde{\chi }}_1^{\pm }}=550\) GeV, \(m_{{\tilde{\chi }}_1^{0}}=200\) GeV). Shown are the numbers of events expected in 35.9 \(\text {fb} ^{-1}\) of 13 T\(\text {e}\text {V}\) CMS data, and the ratio of the GAMBIT and CMS numbers. Note that the CMS cutflow is generated for a \({\tilde{\chi }}_1^{\pm }{\tilde{\chi }}_2^{0}\) simplified model decaying via W / Z where the Z boson decays leptonically, while the GAMBIT cutflow is generated without specifying Z boson decay mode. This explains the discrepancy at the “All events” cut

To provide further validation, Fig. 2 displays a GAMBIT version of the exclusion limit in the \((m_{{\tilde{\chi }}_1^\pm },m_{{\tilde{\chi }}_1^0})\) mass plane arising from the conventional ATLAS multilepton analysis [116], and the ATLAS RJ analysis [117], for a simplified model in which production of the wino-dominated \({\tilde{\chi }}_1^\pm \) and \({\tilde{\chi }}_1^0\) is followed by decays to W and Z gauge bosons and neutralinos. For these reproductions we have scaled the signal predictions from GAMBIT using the NLO+NLL cross-sections for wino pair production taken from [130]. We see that the overall agreement is good, particularly at low masses. Some differences exist for heavy \({\tilde{\chi }}_2^0\) (and \({\tilde{\chi }}_1^\pm \)) in the two-lepton searches, however, this is not so surprising given the low number of signal events in this area, which makes the exclusion limit very sensitive to small details of the analysis. Despite this, the agreement indicates that our implementations of these particular analyses are capable of supplying a similar exclusion to that reported by ATLAS when used on the same simplified model.

We note that it is difficult to reproduce the reported exclusion from the equivalent CMS multilepton analysis in the simplified model defined in that analysis [127], as that limit is obtained using a combination of many bins for which covariance information is not supplied. For this analysis, we use the aggregated signal regions defined in the original version of the analysis in Ref. [127]. These are recommended for reinterpretation purposes by the CMS collaboration, on the grounds that the aggregated region with the best-expected exclusion should be more constraining than the single bin with the best expected exclusion in the multibin analysis, i.e. the extra power of the multibin analysis comes from the combination of bins, not the individual bins.

Another reason for making this choice is that taking the single bin with the best expected sensitivity is not very robust against statistical fluctuations, both in the original data and MC fluctuations in the signal evaluation. This is because in the full combination of bins, a bad fit to the data in one bin can be compensated for by a sufficiently good fit to the data in other bins.

We have compared the result obtained with the aggregated regions to a naive sum of the log-likelihoods for all bins used in the CMS exclusion limit derivation for this analysis, and find that we get very large differences for simplified model points that are well within the CMS exclusion contour. Whilst these differences may be mitigated by the use of the relevant covariance information, it is impossible to quantify the size of this effect without access to that information. This is therefore a case where best practise does not allow us to fully estimate the likelihood of the CMS search, and we will revisit this point in the final presentation of our results.

A common theme in the included electroweak searches is the requirement of one or more ISR jets to isolate the signal. Given that our simulation with Pythia, unlike the signal description in the original ATLAS and CMS analysis, does not include extra hard jets in the matrix element description, the efficiency of the signal in our simulation should be smaller. This is to some degree borne out in the cut-flow shown in Table 6, but not in Table 5.

Fig. 2
figure 2

GAMBIT reproductions of 95% CL ATLAS exclusion limits for a simplified model of wino production. The results for the “conventional” multilepton analysis [116] and the recursive jigsaw analysis [117] are shown in the top and bottom rows, respectively. In both cases results are given separately for 2-lepton (left) and 3-lepton (right) signal regions. The ATLAS observed (light blue) and expected (dashed, dark blue) limits, along with the \(\pm 1\sigma \) uncertainty band (hatched, yellow) on the expected limit, are obtained from the published auxiliary materials [131, 132]. The signal predictions from GAMBIT have been scaled to the NLO+NLL cross-sections for wino pair production [130]. The underlying heatmap depicts the full log-likelihood function obtained from the GAMBIT simulations

In Fig. 3 (left) we show the \(p_T\) distributions for the three hardest jets in signal events simulated in GAMBIT with Pythia 8.212, compared to a simulation using MadGraph5_aMC@NLO [133, 134] and Pythia with a matching procedure including up to two extra hard jets in the matrix element, which copies the signal simulation used by the experiments. The chosen benchmark point features production of wino-dominated \({\tilde{\chi }}_2^0{\tilde{\chi }}_1^{\pm }\) pairs with \(m_{{\tilde{\chi }}_2^0,{\tilde{\chi }}_1^{\pm }}=200\) GeV, which decay into a bino \({\tilde{\chi }}_1^0\) with \(m_{{\tilde{\chi }}_1^{0}}=100\) GeV and vector bosons with 100% branching fraction. The vector bosons in turn decay leptonically. The latter choice maximises any difference between the simulations as there are no extra jets from hadronic decays of the vector bosons. In both cases, jets are reconstructed using the anti-\(k_T\) algorithm with \(R=0.4\) [135], as implemented in FastJet [84]. For the GAMBIT sample, we reconstruct jets and apply jet energy smearing and lepton isolation criteria using the BuckFast [73] detector output. For the MadGraph sample, we use the Delphes [136] simulation package.

The relatively small differences between the jet spectra in Fig. 3, in particular for the hardest jet, show that our simulation of signal events provides reasonable fidelity. Together with the existence of extra jets in hadronic vector boson decays this also explains the small or absent decrease in efficiency observed for the jet cut in Tables 5 and 6. While this result may seem somewhat surprising, it has been noted before that the ISR shower together with the implemented matrix element corrections in Pythia do quite well up to \(p_T^\text {jet}\sim \mu _F/2\), where \(\mu _F=\sqrt{p_T^2+{{{\hat{m}}}}^2}\) is the factorization scale used (given in terms of the \(p_T\) of the produced sparticles and their average mass \({{\hat{m}}}\) [137]).

Fig. 3
figure 3

\(p_T\) distributions (left) for the three hardest jets in a benchmark model with production of chargino–neutralino pairs with \(m_{{\tilde{\chi }}_2^0,{\tilde{\chi }}_1^{\pm }}=200\) GeV, as well as the corresponding cut-efficiencies (right)

In the final results this lower efficiency should result in a small systematic shift of the highest likelihoods towards lower masses with higher cross-sections in order to compensate. In Sect. 4.3 we include this effect in the kinematical distributions for the best-fit point by running the same simulation as above with up to two extra hard jets in the matrix element.

3.3.3 Simplified likelihoods

Without correlation information, the conservative approach to likelihood construction from multiple signal regions is to choose the single signal region with the highest expected signal significance for each model point. This is the approach that we took in earlier GAMBIT papers [73, 87, 88], and is discussed above in Sect. 3.3.1.

As a result, what we refer to as the likelihood from a given LHC analysis, \({\mathcal {L}}_i\), is in fact a ratio between the signal-plus-background and the background-only likelihoods,

$$\begin{aligned} {\mathcal {L}}_i = \frac{{\mathcal {L}}_\text {marg}(n_i|s_i+b_i)}{{\mathcal {L}}_\text {marg}(n_i|b_i)}, \end{aligned}$$
(7)

where \(n_i\), \(s_i\) and \(b_i\) respectively refer to the number of events measured, predicted due to signal, and predicted due to background, in this expected best signal region. We divide the signal-plus-background likelihood by the background likelihood in order to avoid the large likelihood normalization changes from point to point in parameter space that would otherwise occur when switching between signal regions. The total LHC likelihood from ColliderBit is then the direct product of these individual analysis likelihoods. Here the numerator and denominator of Eq. 7 are Poisson likelihoods, marginalised over a log-normally distributed nuisance parameter \(\xi \), which accounts for fractional background and (where relevant) signal uncertainties characterised by \(\sigma _\xi \)

$$\begin{aligned} {\mathcal {L}}_\text {marg}(n|p)= & {} \int _0^\infty \frac{\left[ \xi p\right] ^{n}e^{-\xi p }}{n!} \nonumber \\&\times \frac{1}{\sqrt{2\pi }\sigma _\xi } \frac{1}{\xi }\exp \left[ -\frac{1}{2}\left( \frac{\ln \xi }{\sigma _\xi }\right) ^2\right] \mathrm {d}\xi \,. \end{aligned}$$
(8)

Further details on this one-dimensional marginalised likelihood can be found in Refs. [73, 138, 139].

A new feature now available in the ColliderBit code is the ability to construct a “simplified” composite likelihood [140], when the relevant information about background correlations in different signal regions is available. The simplified likelihood formalism steers a course between the pessimistic approach of taking only one signal region, and the unavailable full experimental likelihood. The latter typically makes use of interpolations between template yield histograms representing the effects of each elementary systematic uncertainty, and hence requires substantially more information to be published than just expected yields and uncertainties. Simplified likelihoods replace this detailed likelihood with a standard convolved Poisson-Gaussian form, in which the systematic uncertainties on expected background yields are treated as Gaussian distributions, with correlations encoded via a covariance matrix \(\varSigma \):

$$\begin{aligned} {\mathcal {L}}(\varvec{s}, \varvec{\gamma })= & {} \prod _i^{N_\text {bin}} \left[ \frac{(s_i + b_i + \gamma _i)^{n_i} e^{-(s_i + b_i + \gamma _i)}}{n_i!} \right] \nonumber \\&\times \frac{1}{\sqrt{\det 2\pi \varSigma }} e^{-\frac{1}{2} \varvec{\gamma }^T \varvec{\varSigma ^{-1}} \varvec{\gamma }} \, . \end{aligned}$$
(9)

Here, \(n_i\), \(s_i\), and \(b_i\) are respectively the observed yield and the nominal expected signal and background yields in signal region i, and \(\gamma _i\) is the background deviation from nominal due to systematic uncertainties.Footnote 6

In ColliderBit analyses where the simplified-likelihood correlation/covariance matrices are published – currently limited to some publications by the CMS experiment – the full set of \(N_\text {bin}\) signal regions is used to construct the composite likelihood. This is currently evaluated by marginalising the likelihood over the background uncertainties \(\gamma _i\), distributed as the \(N_\text {bin}\)-dimensional Gaussian \(G(\varvec{0}, \varvec{\varSigma })\):

$$\begin{aligned} {\mathcal {L}}(\varvec{s})\equiv & {} \int \mathrm {d}\varvec{\gamma } {\mathcal {L}}(\varvec{s}, \varvec{\gamma }) \nonumber \\= & {} \int \mathrm {d}\varvec{\gamma } \, p(\varvec{n}|\varvec{s}, \varvec{b}, \varvec{\gamma }) \times G(\varvec{\gamma }|\varvec{0}, \varvec{\varSigma }). \end{aligned}$$
(10)

In practice this marginalisation is performed by sampling \(\varvec{\gamma }\) vectors from the Gaussian, calculating the Poisson \(p(\varvec{n}|\varvec{s},\varvec{b},\varvec{\gamma })\) for each, and averaging over the set of samples. For computational speed, ColliderBit performs this sampling in parallel using OpenMP, and skips it entirely if the signal prediction from the event generator run is exactly zero in all signal regions. Numerical convergence of the sampling is ensured by iterative doubling of the number of samples \(N_\text {samp}\), starting from \(10^5\), until the marginalised likelihood estimator is stable within 5%, or the absolute variation in the likelihood estimate drops below 0.05. In this study we use the simplified likelihood approach for the likelihood contributions from the CMS two-lepton searches in Refs. [125, 126].

3.4 \(p\)-value calculations

To quantify the significance of deviations from the SM across multiple LHC and LEP searches for sparticles, as well as to quantify the absolute goodness-of-fit of our EWMSSM best-fit point, we compute \(p\)-values via likelihood-ratio tests. These computations are performed by dedicated Monte-Carlo simulations outside of the main GAMBIT software framework. The ‘local significance’ test and the ‘goodness-of-fit’ test each use a different form of likelihood ratio, so we describe them separately below.

3.4.1 Local significance

Computing the significance of any excesses in the data is done by attempting to exclude the background-only hypothesis across all analyses simultaneously. We construct this test by assigning a single “signal strength” parameter \(\mu \) across all analyses,Footnote 7 where the nominal (\(\mu =1\)) signal is obtained via the predictions of the best-fit point found in our scan. We then attempt to exclude the \(\mu =0\) null hypothesis.

For example, consider the simplified likelihood of Eq. 9. The signal predictions for each analysis bin \(s_i\) become \(\mu s_i\), and this scaling is applied consistently across all components of the joint likelihood. By setting \(\mu =1\) we obtain a ‘nominal’ signal hypothesis for a given parameter point, whilst \(\mu =0\) retrieves the joint background-only hypothesis.

The test statistic we construct is then

$$\begin{aligned} q_\mathrm {LS} = -2 \log \frac{{\mathcal {L}}_\mathrm {joint}(\mu =1,{\hat{\eta }})}{{\mathcal {L}}_\mathrm {joint}(\mu =0,\hat{{\hat{\eta }}})}, \end{aligned}$$
(11)

where \({\mathcal {L}}_\mathrm {joint}\) is the joint likelihood for all analyses (with \(\mu =0\) setting the signal to zero in the denominator case), and \({\hat{\eta }}\) and \(\hat{{\hat{\eta }}}\) are the best-fit (i.e. profiled) values of nuisance parameters under each hypothesis (for example the \(\gamma _i\) in Eq. 10). When the null hypothesis \(\mu =0\) is true, this test statistic is (asymptotically) distributed as a Gaussian [141, Sec. 3.8]. However, because some analyses involve few events and may jeopardise the asymptotic assumptions, we determine the test statistic distribution by Monte Carlo simulation.

For the LHC analyses, \(\eta \) represents nuisance parameters that characterise uncertainties in the background estimates. In our scan we marginalised over these (see Sect. 3.3.3) due to better numerical stability, however, for our p-value calculations we have chosen to profile them so that our Monte Carlo output could be validated by comparison with the predictions of asymptotic theory, and to maintain a frequentist intepretation of the resulting p-values.

It is of great importance to note that this test performs only a local significance test at chosen parameter points. In principle a “trial” correction should be computed, as choosing to test the best-fit EWMSSM point after analysing the data constitutes a form of “cherry-picking”. This problem is also known as the “look-elsewhere effect”, or, in statistics, the “problem of multiple comparisons”.

Unfortunately, it is incredibly computationally demanding to correct for this in parameter spaces larger than one or two dimensions, and is beyond our means at present.Footnote 8 We nevertheless can get some idea of a ‘global’ significance by computing the goodness-of-fit of the background-only (SM) hypothesis in a test against a fully general signal hypothesis. We discuss this further in Sect. 3.4.2. The results of applying this test to each analysis individually, and to their combination, are listed in Table 8.

3.4.2 Goodness-of-fit

Aside from the joint significance of excesses, we are interested in quantifying the absolute goodness-of-fit of points in the EWMSSM. Profile likelihood contours do not have the power to exclude the best-fit point in a global fit, as they are computed based on likelihood ratios relative to the best fit. Their stated coverage is also often somewhat incorrect, as they are computed based on asymptotic theory relying on Wilks’ theorem, whose regularity assumptions are often violated in complicated parameter spaces such as the EWMSSM [148,149,150].

To formulate this test, we take the predictions of the best-fit point of our scan and embed them in a larger “proxy” hypothesis space, where the possible signals are allowed to vary in a more general way. For example in the likelihood of Eq. 9 we simply take the signal predictions \(s_i\) in each bin as independent free parameters. We can thus test the goodness-of-fit of any EWMSSM point by seeing whether a sufficiently better-fitting point can be found in the more general hypothesis space. The method is similar to a common chi-squared test used to measure goodness-of-fit in histograms [151], as each of our signal regions may be thought of as one bin in a histogram. Such a test has much less statistical power to detect signals than a more targeted test like the one we use to compute local significances. However, its false positive rate is better controlled, because it is less susceptible to the look-elsewhere effect.Footnote 9

For a more explicit example, let us consider the LHC analyses for which we have no correlation information. In these analyses we pre-select the signal region with the best expected sensitivity to the signal predictions of the parameter point of interest (see Sect. 3.3.1). The simplified pdfs for these analyses then reduce to a single Poisson distribution times a Gaussian constraint on a nuisance parameter:

$$\begin{aligned} \mathrm {Pr}(n,{\hat{\gamma }}) = \mathrm {Poisson} (n | s+b+\gamma ) \cdot \mathrm {Normal}({\hat{\gamma }}|\gamma ), \end{aligned}$$
(12)

where n is the number of events observed in that signal region, and \({\hat{\gamma }}\) is the maximum likelihood estimator for \(\gamma \) obtained from control measurements. For the observed data \({\hat{\gamma }}\) is zero by definition, however, it is a random variable from the point of view of pseudodata generation, as we keep b fixed. As in the case of Eq. 10, the signal expectation s is allowed to vary freely (over both positive and negative values), for each pre-selected signal region in every analysis.

When correlation information is available (the Eq. 9 case), a free signal parameter is assigned to every signal region in the analysis. In the case of Gaussian likelihoods (the Higgs and Z invisible width likelihoods), the expected value is allowed to vary as a free parameter.

We then construct the test statistic

$$\begin{aligned} q_\mathrm {GOF} = -2 \log \frac{{\mathcal {L}}_\mathrm {joint}({\mathbf {s}}(\theta ),{\hat{\eta }})}{{\mathcal {L}}_\mathrm {joint}(\hat{\hat{{\mathbf {s}}}},\hat{{\hat{\eta }}})}, \end{aligned}$$
(13)

where \({\mathbf {s}}(\theta )\) are the predictions of EWMSSM point \(\theta \) (or SM) and form the null hypothesis, whilst \(\hat{\hat{{\mathbf {s}}}}\) are the global best-fit values of the parameters \({\mathbf {s}}\) in the free-signal parameter space. \({\hat{\eta }}\) and \(\hat{{\hat{\eta }}}\) likewise represent vectors of nuisance parameters fit to the null hypothesis and free-signal, respectively.

When data is generated under \({\mathbf {s}}(\theta )\) this test statistic is asymptotically distributed as a \(\chi ^2\) variable, whose degrees of freedom are equal to the dimension of \({\mathbf {s}}\). This parameter space has good regularity properties so Wilks’ theorem applies well, meaning the theoretical distribution should be quite reliable. However, discretisation and boundary effects can still enter for signal regions with low expected count numbers, so we also compute these distributions via Monte Carlo simulation.

We use this test to assess the goodness-of-fit of both the SM and our best-fit EWMSSM point to the data observed in each analysis individually, as well as jointly. The results are given in Table 8.

Fig. 4
figure 4

Profile likelihood in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane (upper left), the \((m_{{\tilde{\chi }}_2^0},m_{{\tilde{\chi }}_1^{\pm }})\) plane (upper right), the \((m_{{\tilde{\chi }}_2^0},m_{{\tilde{\chi }}_3^0})\) plane (lower left) and the \((m_{{\tilde{\chi }}_4^0},m_{{\tilde{\chi }}_3^{0}})\) plane (lower right). The contour lines show the \(1\sigma \) and \(2\sigma \) confidence regions. The best-fit point is marked by the white star

4 Results

4.1 Profile likelihood maps

Figure 4 shows our results for the profile likelihood in various electroweakino mass planes. There is a clear preference for a mass scale in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane (top-left), centered on \(m_{{\tilde{\chi }}_1^{\pm }}\approx \) 150 GeV, and \(m_{{\tilde{\chi }}_1^{0}}\approx \) 50 GeV. We also find that \(m_{{\tilde{\chi }}_1^{\pm }} \lesssim 300\) GeV and \(m_{{\tilde{\chi }}_1^{0}}\lesssim 200 \) GeV at the \(2 \sigma \) level. This preference is driven by the small number of coincident excesses in a variety of ATLAS and CMS searches, which we discuss in detail below. One can also see that the best-fitting solutions lie far from the line \(m_{{\tilde{\chi }}_1^{\pm }}=m_{{\tilde{\chi }}_1^{0}}\), indicating a preference for a predominantly bino LSP.

The top right panel of Fig. 4 shows results in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_2^0})\) plane and indicates that the best-fitting solutions exhibit an approximate degeneracy between the \({\tilde{\chi }}_1^{\pm }\) and \({\tilde{\chi }}_2^{0}\) masses, such as would be expected if they were dominantly composed of Higgsinos, winos or a mixture of the two. As such we also find \(m_{{\tilde{\chi }}_2^{0}} \lesssim 300\) GeV within the \(2 \sigma \) contours.

In the bottom left panel of Fig. 4 the results are displayed in the \((m_{{\tilde{\chi }}_2^0}, m_{{\tilde{\chi }}_3^0})\) plane, which clearly shows that \(m_{{\tilde{\chi }}_2^0} \lesssim 300\) GeV, as in the top right panel, and \(m_{{\tilde{\chi }}_3^0} \lesssim 700\) GeV within the \(2\sigma \) region. There is a slight preference for \(m_{{\tilde{\chi }}_2^0} \ll m_{{\tilde{\chi }}_3^0}\), as represented by the best-fit point, corresponding to the scenario where winos are lighter than Higgsinos. The opposite scenario, where Higgsinos are lighter than winos and \(m_{{\tilde{\chi }}_2^0} \sim m_{{\tilde{\chi }}_3^0}\), is also present within \(1\sigma \) of the best fit, albeit for somewhat higher \({\tilde{\chi }}_2^0\) masses.

The bottom right panel shows results in the mass planes of the heaviest neutralinos, \({\tilde{\chi }}_3^{0}\) and \({\tilde{\chi }}_4^0\). Within the \(2 \sigma \) contours the masses of these states are bounded by \(m_{{\tilde{\chi }}_3^{0}} \lesssim 700\) GeV and \(m_{{\tilde{\chi }}_4^{0}} \lesssim 700\) GeV. For even heavier \(\chi ^0_3\) or \(\chi ^0_4\), the profile likelihood function flattens out beyond the \(2\sigma \) contour and becomes indifferent to the specific mass. One therefore obtains a better fit to the LHC data when the entire neutralino and chargino spectrum is light, but the heavier electroweakinos are not constrained at the 3\(\sigma \) level. We do not show results for \(m_{{\tilde{\chi }}_2^\pm }\), as our results indicate that it is nearly degenerate in mass with \(m_{{\tilde{\chi }}_4^0}\) for the full \(2\sigma \) region.

Our findings for the electroweakino masses are neatly summarised in Fig. 5, where we show the \(1\sigma \), \(2\sigma \) and \(3\sigma \) bands for each electroweakino mass.Footnote 10 This shows that we find \(3\sigma \) upper limits on the masses of the two lightest neutralinos and the lightest chargino. At this confidence level, the heavier neutralino and chargino masses saturate the upper limits set by the allowed range for the input parameters.

Fig. 5
figure 5

Summary of the one-dimensional \(1\sigma \), \(2\sigma \) and \(3\sigma \) confidence intervals for the neutralino and chargino masses. The orange lines mark the best-fit values. For \(m_{{\tilde{\chi }}_3^0}\), \(m_{{\tilde{\chi }}_4^0}\) and \(m_{{\tilde{\chi }}_2^\pm }\), the \(3\sigma \) confidence intervals extend up to the 2 TeV upper limit on the mass parameters in our scan

Let us first assume that this pattern of excesses arises from statistical fluctuations, and that there is no production of electroweakinos (or any other sparticle) at the LHC. Under this assumption, it is interesting to determine what limits the present data from LHC direct SUSY searches put on charginos and neutralinos in the EWMSSM. A simple way to do this is to consider a capped version of our LHC likelihood,

$$\begin{aligned} {\mathcal {L}}_\text {cap} = \min [{\mathcal {L}}_\text {LHC}(\varvec{s}+\varvec{b}), {\mathcal {L}}_\text {LHC}(\varvec{b})], \end{aligned}$$
(14)

where \({\mathcal {L}}_\text {LHC}\) is the combined likelihood from all the simulated 13 TeV SUSY searches. This construction ensures that no EWMSSM parameter point can achieve a likelihood higher than the background-only expectation. This makes it only possible to exclude EWMSSM models. Note that the ‘capping’ in Eq. 14 is done on the final composite likelihood for all analyses, not on the likelihood contribution from each analysis individually.Footnote 11 The profile likelihood ratio of the capped likelihood to its best possible value, \({\mathcal {L}}_\text {cap}/{\mathcal {L}}_\text {cap,max} \equiv {\mathcal {L}}_\text {cap}/{\mathcal {L}}_\text {LHC}(\varvec{b})\) is thus a measure of how much worse a given EWMSSM parameter point does in fitting the data than the SM does. A likelihood ratio of 1 means that the EWMSSM does at least as well as the SM, whereas a ratio of less than 1 means that the SM fits the data better than the EWMSSM point. To obtain a ratio of 1 for a given point in the EWMSSM parameter space, it must either be the case that no analysis is sensitive to the given parameter point (e.g. \(\varvec{s} = \varvec{0}\)), or that a bad fit to some of the analyses is completely offset by a sufficiently good fit to other analyses.

Fig. 6
figure 6

Capped profile likelihood in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane. The capped likelihood function (Eq. 14) is based solely on the joint likelihood for the 13 TeV LHC direct SUSY searches. The contour lines show the \(1\sigma \) and \(2\sigma \) confidence regions

In Fig. 6 we plot this profile likelihood ratio in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane. The result shows little variation across the entire mass plane, indicating that the combined results from the 13 TeV LHC direct searches in fact do not produce any significant general constraint on the masses of neutralinos or charginos. Naively, this conclusion would seem to be in conflict with published ATLAS and CMS results. However, the ATLAS and CMS analyses are all optimised and interpreted in terms of simplified models. The full electroweakino sector of the MSSM has a far richer phenomenology than the simplified models. When the likelihoods from this multi-dimensional space are profiled onto the neutralino-chargino plane, there is only a very weak constraint remaining, on some isolated islands in the mass plane.

Such a lack of exclusion has been noted before [152], and can be understood physically. For example, non-wino dominated \({\tilde{\chi }}^{\pm }_1\) and \({\tilde{\chi }}^0_2\) pairs have a lower production cross-section compared to a scenario with pure winos. Also, the prevalence of other production and decay modes changes the typical final states, so that for a given EWMSSM parameter point the signal regions with the best expected sensitivity may differ from the signal regions with best sensitivity to a simplified model with similar masses for the light electroweakinos. We emphasise that, in a frequentist approach, this lack of exclusion must be interpreted literally. In a Bayesian framework, one could instead marginalise over the dimensions not appearing on the axes of each plane, to determine the posterior mass in excluded scenarios; we leave such an analysis for future work.

We note that in order to obtain a large enough dataset to produce Fig. 6, we include all parameter samples with at least \(500\,000\) MC events in the LHC likelihood calculation. This should be contrasted with the other results in this paper, where only samples with at least 4 million MC events are used. Because the profile likelihood picks out the least constrained parameter sample for every point in the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane, this larger MC uncertainty implies that the result in Fig. 6 should be viewed as a somewhat conservative estimate of the constraining power of the combined data.

Let us now remove the assumption that there are no sparticles within reach of the LHC, and return to a consideration of the complete, uncapped profile likelihood. In this case, the observed results are not surprising in light of the ATLAS recursive jigsaw (RJ) search described in Sect. 3.3.1, which saw excesses in four signal regions targeting chargino plus neutralino production, with decays to W and Z bosons and lightest neutralinos.

Fig. 7
figure 7

Profile likelihood of the bino (left), wino (middle) and Higgsino (right) content of the four neutralinos (starting from the lightest in the top row), plotted against the mass of the respective neutralino. Contour lines show the \(1\sigma \) and \(2\sigma \) confidence regions. The best-fit point is marked by the white star

Note that an excess in a search for electroweakinos that is optimised for on-shell W and Z production effectively sets one chargino-neutralino mass difference to be somewhere near the W mass, whilst also setting a neutralino-neutralino mass difference to be at least equal to the Z mass, after which the overall mass scale is forced to the value with a cross-section that is able to reproduce the size of the excess. In the simplified model approach, these mass differences would be defined between the \({\tilde{\chi }}_1^{\pm }\) and \({\tilde{\chi }}_1^0\), and between the \({\tilde{\chi }}_2^0\) and \({\tilde{\chi }}_1^0\), but we see departures from this behaviour due to the fact that other electroweakino production and decay processes are able to produce on-shell W and Z bosons. Nonetheless, in Fig. 4 we still see a mild preference for a mass difference of around 100 GeV between \({\tilde{\chi }}_1^{\pm }\) and \({\tilde{\chi }}_1^0\).

It is also true that the gaugino contents are heavily constrained by the observation of the W and Z decay modes, which can provide more information about the electroweakino sector than would have been possible given an excess in another channel. In Fig. 7, we show plots of the fraction of bino, wino and Higgsino in each neutralino, plotted against the mass of that neutralino, and with the profile likelihood shown as a colour contour. The first row confirms the previous hint that the best-fitting points have a predominantly bino LSP, with a small admixture of Higgsino and/or wino. The maximum allowed Higgsino contribution exceeds the maximum allowed wino contribution.

Figure 7 also shows that the data has little preference between wino, Higgsino or mixed scenarios for the \({\tilde{\chi }}_2^0\) and \({\tilde{\chi }}_4^0\), though due to the mass relations between Higgsinos there is a preference for \({\tilde{\chi }}_3^0\) to be Higgsino at the \(2 \sigma \) level. As expected, when the heavier neutralinos \({\tilde{\chi }}_{3,4}^0\) are pushed up in mass they tend to be pure gauge eigenstates.

Fig. 8
figure 8

The \(1\sigma \), \(2\sigma \) and \(3\sigma \) regions (orange lines) preferred by our combination of searches in the \((m_{{\tilde{\chi }}_1^0}, m_{{\tilde{\chi }}_1^{\pm }})\) plane. For each of the twelve panels, the colors (where present) show the contribution to the total log-likelihood from a different search (white text). Blue indicates that the signal improves the fit to that search and red that it worsens it

Fig. 9
figure 9

The \(1\sigma \), \(2\sigma \) and \(3\sigma \) regions (orange lines) preferred by our combination of searches in the \({\tilde{\chi }}^\pm _1\) versus \({\tilde{\chi }}^0_1\) (left), \({\tilde{\chi }}^0_2\) versus \({\tilde{\chi }}^0_3\) (middle) and \({\tilde{\chi }}^0_4\) versus \({\tilde{\chi }}^0_3\) (right) mass planes. The colors (where present) show the contribution to the total log-likelihood from the ATLAS_4lep (top), ATLAS_MultiLEP_2lep_2jet (second row), ATLAS_MultiLEP_3lep (third row), and ATLAS_RJ_3lep (bottom) searches. Blue indicates that the signal improves the fit to that search and red that it worsens the fit

There is no preference in the data for the content of the charginos, which may be wino-like, Higgsino-like or a mixture of the two. This is to be expected, given that the data likewise allow any wino-Higgsino admixture for the \({\tilde{\chi }}^0_2\), and prefer solutions where \({\tilde{\chi }}^0_2\) and \({\tilde{\chi }}^\pm _1\) are essentially degenerate in mass. The only exception is that we again see a tendency for pure states to arise at high masses, as can be deduced from the corresponding pure neutralino states. We therefore omit plots of the chargino composition.

4.2 Discussion of excesses

An important question is how the pattern of these excesses can be consistent with other published searches, which are a mix of null results, and modest excesses that were not previously thought to be significant. In this section, we investigate whether the different LHC results are consistent with each other for our best-fit models, or whether there are tensions between different analyses.

First we show in Fig. 8 the contribution of each analysis to the total combined likelihood, inside the interesting \(1\sigma \), \(2\sigma \) and \(3\sigma \) preferred regions, which are bounded by orange contour lines. These plots show the relative contribution to the best likelihood in each bin of the 2D profile likelihood map when all analyses are included. The log-likelihood on the z-axis favours a signal if it is greater than zero. Thus, blue regions indicate analyses that contribute positively to the combined likelihood, white regions indicate that the analyses have no sensitivity, and red regions indicate tension with the model with the highest likelihood in each bin. We can thus divide the analyses into the following categories:

  • Favours background only (red): A mild tension results from the CMS multilepton analysis (in the three-soft-lepton signal region, CMS_MultiLep_3lep), which persists across most of our best-fit region. The fact that this search has exclusionary power for our best-fit region makes sense on the grounds that, of all the signal regions in that paper, it is the most sensitive to on-shell W and Z production. As noted earlier, the likelihood of this analysis is hard to estimate given the lack of published covariance information for the multibin analysis, and we are forced to use aggregated signal regions that might not have similar exclusion power. Our results suggest that there is a mild tension between this analysis and the other analyses in our combined likelihood, but it is impossible to quantify the effect precisely, and we will therefore leave this as an open question. Note that we also see a stronger tension in a small region beyond our \(2\sigma \) contour in the three lepton signal region of the conventional ATLAS multilepton analysis (ATLAS_MultiLep_3lep), which is important in shaping our final 2\(\sigma \) contour.

  • No sensitivity (white): The ATLAS_4b analysis has no sensitivity to our best-fit models, which is to be expected given that it is optimised for scenarios with two on-shell Higgs bosons present in the final state. Although our best-fit models will include some Higgs production, they must feature copious production of W and Z bosons in order to fit the observed excesses in the searches targeted at on-shell W and Z production. The CMS one lepton plus two b-jet analysis, targeting Wh final states, and the two same-sign lepton regions of the CMS multilepton analysis also show no sensitivity to our highest-likelihood models. This makes sense given that the ATLAS excesses require an on-shell Z boson to be produced most of the time in our models. An alternative option is that there are in fact hh and Wh final states produced relatively often in our models, but the kinematics of the final state particles differ from those on which the CMS analyses were optimised. This makes our benchmark points, provided in Sect. 4.3 below, particularly interesting for the optimisation of future searches. We note with interest that the two-lepton zero-jet region of the conventional ATLAS multilepton analysis appears white in these plots at the best-fit region, indicating no tension with the analyses that show positive log-likelihood contributions. This indicates that there is no tension between the analyses containing excesses and these signal regions. We expand on this point below.

  • Favours signal (blue): The strongest positive contributions to our log-likelihood come from the conventional ATLAS multilepton analyses (in the four-or-more-lepton, three-lepton and two-lepton plus jets final states, i.e., ATLAS_4lep, ATLAS_MultiLep_3lep and ATLAS_MultiLep_2lep_jet), and the ATLAS recursive jigsaw analysis (ATLAS_RJ_3lep and ATLAS_RJ_2lep_2jet). A weaker positive contribution near the best-fit region is evident in the CMS two soft lepton analysis (CMS_2lep_soft) and the CMS two opposite-sign lepton analysis (CMS_2OSlep).

The fact that the conventional ATLAS multilepton analysis shows evidence of an excess is naively in conflict with the published exclusion limits. However, we have already shown (left panel of Fig. 2) that our ColliderBit treatment of this analysis can reproduce the exclusion in the same simplified model (which assumes \({\tilde{\chi }}_2^0{\tilde{\chi }}_1^\pm \) production and subsequent decay to W and Z bosons). This analysis prefers a signal in our results instead of an exclusion because our electroweakino model differs from the ATLAS simplified model.

To further understand the interplay between the analyses driving our fit result, we show, in Fig. 9, log-likelihood contributions for selected analyses in the mass planes \((m_{{\tilde{\chi }}_2^{0}},m_{{\tilde{\chi }}_3^0})\) and \((m_{{\tilde{\chi }}_3^{0}},m_{{\tilde{\chi }}_4^0})\), and for easy comparison, show again the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane alongside. As in the previous plot, the log-likelihood shown on the z-axis favours a signal if it is greater than zero and has some exclusionary power if it is less than zero. These plots show that the contribution from an analysis may depend very strongly on whether there are additional light electroweakinos beyond the set of states included in the simplified model, as shown by the dependence on the mass of \({\tilde{\chi }}^0_3\) in the middle panels and the dependence on the \({\tilde{\chi }}^0_4\) mass in the lower part of the right-hand panel for ATLAS_MultiLEP_3lep (third row of third column).

Before discussing Fig. 9 in more detail, we note that some care must be taken in interpreting these plots. First, as described earlier, there can be large errors from Monte Carlo statistics. To reduce statistical fluctuations, we post-processed our results to have more Monte Carlo events in higher-likelihood regions. In particular, we ensured that at least 4 million events were generated for all points within the \(2 \sigma \) and \(3 \sigma \) regions, at least 16 million events for points in the \(1 \sigma \) region and 64 million events for the 500 highest-likelihood points. Second, our parameter scans have sampled the interesting \(2\sigma \) region – where the total likelihood function is peaked – more densely than the very broad \(3\sigma \) region. The combination of the comparatively low MC statistics and less dense sampling implies that there are significantly larger uncertainties on the profile likelihood in the \(3 \sigma \) region, compared to the \(2 \sigma \) and \(1\sigma \) regions. When we profile over the dimensions not shown in the plane, we are selecting points with the highest total likelihood, which biases the results towards larger values for the log-likelihood contribution shown on the z-axis of this plot, i.e. the MC and sampling uncertainties tend to lead to an overestimate of the log-likelihood contribution when profiling. For instance, one should interpret the small patches of blue (i.e. positive log-likelihood) that appear for higher values of \(m_{{\tilde{\chi }}_3^0}\) in the three lepton signal region (third row of second column) with care, due to the larger statistical fluctuations in the \(3 \sigma \) region and the bias towards positive values from profiling. Negative log-likelihood contributions between the \(2 \sigma \) and \(3 \sigma \) contours should therefore be considered somewhat more robust than positive ones.

We focus first on the plots for the ATLAS search in the four or more lepton final state (ATLAS_4lep), shown in the top row of Fig. 9. The positive log-likelihood contribution that our best-fit region gets from this search originates from the modest excess seen in the SR0D signal region (described in Sect. 3.3.1) with two reconstructed Z-boson candidates and missing energy. This can clearly not come from \({\tilde{\chi }}_1^+{\tilde{\chi }}_1^-\) or \({\tilde{\chi }}_2^0{\tilde{\chi }}_1^\pm \) production in a simplified model, but relies on the production of heavier neutralinos with decays \({\tilde{\chi }}_i^0\rightarrow Z{\tilde{\chi }}_{1,2}^0\). We see that the search prefers a \({\tilde{\chi }}_3^0\) lighter than around 500–600 GeV, but does not significantly constrain the \({\tilde{\chi }}_4^0\) when this is a bino or a wino (the horizontal band at high \(m_{{\tilde{\chi }}_4^0}\) in the right-hand plot).

Moving on to the plots for ATLAS_MultiLEP_2lep_jet, ATLAS_MultiLEP_3lep and ATLAS_RJ_3lep in rows two, three and four, respectively, we notice that a preference for \(m_{{\tilde{\chi }}_3^0} < 600\) GeV is also seen in ATLAS_MultiLEP_3lep, in addition to a clear preference for \(m_{{\tilde{\chi }}_4^0} < 700\) GeV when \(m_{{\tilde{\chi }}_3^0} < 250\) GeV. This suggests that additional light neutralinos play a very important role in evading the limits placed by ATLAS_MultiLEP_3lep on the ATLAS simplified model (which we reproduced in the top right panel of Fig. 2). In contrast, both ATLAS_MultiLEP_2lep_jet and ATLAS_RJ_3lep can provide positive log-likelihood contributions in the limit of decoupling \({\tilde{\chi }}_3^0\) and \({\tilde{\chi }}_4^0\) (both Higgsino), as long as the mass-splitting between the wino pair \({\tilde{\chi }}_2^0\)/\({\tilde{\chi }}_1^\pm \) and the bino \({\tilde{\chi }}_1^0\) is larger than, but close to, \(m_Z\). This was already evident in the simplified model likelihood maps for these analyses in Fig. 2 (top left and bottom row panels). In Fig. 9 this manifests as the positive log-likelihood contribution along the high-mass diagonals in the right-hand plots for ATLAS_MultiLEP_2lep_jet and ATLAS_RJ_3lep.

The sharp changes in likelihood contribution visible in several of the plots are due to sudden changes in what scenarios are being picked out by the profiling, and consequently, which signal regions get to determine the analysis likelihoods. One example of this can be seen in the horizontal band of high \({\tilde{\chi }}_4^0\) mass in the right-hand plots for ATLAS_MultiLEP_2lep_jet and ATLAS_MultiLEP_3lep. In the region with \(m_{{\tilde{\chi }}_3^0} > 250\) GeV, the scenarios that are picked out by the profiling have a large \({\tilde{\chi }}_3^0\)\({\tilde{\chi }}_2^0\) mass splitting and a \(\sim 30\) GeV \({\tilde{\chi }}_2^0\)\({\tilde{\chi }}_1^0\) splitting, suggesting that the \({\tilde{\chi }}_2^0\) and \({\tilde{\chi }}_1^0\) are predominantly Higgsino in these scenarios. For \(m_{{\tilde{\chi }}_3^0} < 250\) GeV, however, the profiling selects scenarios where the \({\tilde{\chi }}_2^0\) and \({\tilde{\chi }}_3^0\) are a Higgsino pair of similar mass, with a large mass gap down to a \({\tilde{\chi }}_1^0\) below 100 GeV. The scenario in this region looks a lot like the simplified model from ATLAS, except that there is production of three Higgsinos (\({\tilde{\chi }}_3^0\), \({\tilde{\chi }}_2^0\), \({\tilde{\chi }}_1^\pm \)) instead of two winos (\({\tilde{\chi }}_2^0\), \({\tilde{\chi }}_1^\pm \)). An important reason for the similarity with the simplified model is that the Higgsino nature of the produced sparticles ensures large branching ratios for decays to on-shell Z bosons, even when the \({\tilde{\chi }}_{3,2}^0\)\({\tilde{\chi }}_1^0\) mass splittings are larger than \(m_h\). This matches the assumption \(BR({\tilde{\chi }}_2^0 \rightarrow {\tilde{\chi }}_1^0 Z) = 100\%\) that is commonly employed for simplified models.Footnote 12 Therefore, the tension between the likelihood contributions from ATLAS_MultiLEP_3lep and ATLAS_RJ_3lep that can be seen in this low-\(m_{{\tilde{\chi }}_3^0}\), high-\(m_{{\tilde{\chi }}_4^0}\) region is a manifestation of the tension one naively would expect based on the corresponding simplified model results.

The overall result is that models in the vicinity of our best fit have appreciable amounts of \({\tilde{\chi }}_3^{0}\), \({\tilde{\chi }}_4^{0}\) and \({\tilde{\chi }}_2^{\pm }\) production in addition to \({\tilde{\chi }}_2^{0}\) and \({\tilde{\chi }}_1^{\pm }\) production, whilst the ATLAS simplified model only includes \({\tilde{\chi }}_2^{0}\) and \({\tilde{\chi }}_1^{\pm }\) production. Our models can thus produce richer final states than the ATLAS simplified model, typically generating more gauge bosons, which in turn produce leptons that allow the events to pass a three-lepton selection whilst also producing additional jets.

Examining the event record for the MC simulation of our best-fit point shows a variety of extra processes that will lead either to a higher multiplicity or a change in the missing \(E_T\) distribution. These include:

  • \({\tilde{\chi }}_2^{0}{\tilde{\chi }}_3^{0}\) production, with e.g. \({\tilde{\chi }}_2^{0}\rightarrow Z + {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_3^{0}\rightarrow W^- + {\tilde{\chi }}_1^{+} \rightarrow W^- + W^+ + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_2^{\mp }\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow W^\pm + {\tilde{\chi }}_2^{0} \rightarrow W^\pm + Z + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_3^{0}\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow W^\pm + {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_3^{0}\rightarrow Z + {\tilde{\chi }}_2^{0} \rightarrow Z + Z + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_3^{0}\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow W^\pm + {\tilde{\chi }}_2^{0}\rightarrow W^\pm + Z+ {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_3^{0}\rightarrow W^- + {\tilde{\chi }}_1^{+} \rightarrow W^- + W^+ + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_4^{0}\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow W^\pm + {\tilde{\chi }}_2^{0} \rightarrow W^\pm +Z+ {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_4^{0}\rightarrow Z + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_2^{0}\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow h + {\tilde{\chi }}_1^{\pm } \rightarrow h+W^\pm + {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_2^{0}\rightarrow Z + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_1^{\pm }{\tilde{\chi }}_3^{0}\) production, with e.g. \({\tilde{\chi }}_1^{\pm }\rightarrow W^\pm + {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_3^{0}\rightarrow W^- + {\tilde{\chi }}_1^{+}\rightarrow W^+ + W^- + {\tilde{\chi }}_1^{0}\)

  • \({\tilde{\chi }}_2^{\pm }{\tilde{\chi }}_4^{0}\) production, with e.g. \({\tilde{\chi }}_2^{\pm }\rightarrow Z + {\tilde{\chi }}_1^{\pm } \rightarrow Z+ W^\pm + {\tilde{\chi }}_1^{0}\), \({\tilde{\chi }}_4^{0}\rightarrow h + {\tilde{\chi }}_2^{0}\rightarrow h + Z+ {\tilde{\chi }}_1^{0}\)

Note that it is quite common to have four gauge bosons produced for our best-fit model. For the best fit point the \({\tilde{\chi }}_1^{\pm }\) and \({\tilde{\chi }}_2^{0}\) have a \(\sim 20\%\) Higgsino component, resulting in a smaller \({\tilde{\chi }}_1^{\pm }{\tilde{\chi }}_2^{0}\) production cross-section compared to a scenario with pure wino \({\tilde{\chi }}_1^{\pm }\) and \({\tilde{\chi }}_2^{0}\). In the conventional 3-lepton ATLAS analysis (ATLAS_MultiLep_3lep), this has the effect of reducing the predicted signal in the SR3_WZ_0Ja and SR3_WZ_0Jb signal regions. At the same time, the additional production processes made relevant by the relatively light \({\tilde{\chi }}_3^{0}\), \({\tilde{\chi }}_4^{0}\) and \({\tilde{\chi }}_2^{\pm }\) increase the signal prediction for the SR3_WZ_1Jc signal region, which in contrast to SR3_WZ_0Ja and SR3_WZ_0Jb requires \(\ge 1\) jet, with a leading jet \(p_T\) of at least 70 GeV. The ATLAS results for these signal regions are:

  • SR3_WZ_0Ja: expected background \(21.7 \pm 2.9\), observed 21.

  • SR3_WZ_0Jb: expected background \(2.7 \pm 0.5\), observed 1.

  • SR3_WZ_1Jc: expected background \(1.3 \pm 0.3\), observed 4.

Thus a reduction in the SR3_WZ_0Ja and SR3_WZ_0Jb predicted signal yields, plus a simultaneous increase in the predicted SR3_WZ_1Jc yield, clearly helps a model fit the data better. This change in what is the most sensitive signal region is responsible for the switch from negative to positive log-likelihood contribution from ATLAS_MultiLep_3lep in the \(m_{{\tilde{\chi }}_3^0}<250\) GeV region when \(m_{{\tilde{\chi }}_4^0}\) lowered below \(\sim 700\) GeV. In this case, the same light electroweakinos are also able to provide a good fit to the results from ATLAS_4lep, ATLAS_MultiLep_2lep_jet and ATLAS_RJ_3lep, allowing all analyses to contribute positively to the combined log-likelihood.

As this work was nearing completion, a new CMS search for chargino pair production was made public, in which evidence for chargino production and decay to either W bosons or intermediate sleptons is searched for in events with two opposite sign leptons, for different jet and b-jet multiplicities and lepton flavour configurations [153]. A large number of bins in \(p_T^\text {miss}\) and \(M_{T2}\) are used to determine exclusion limits on a variety of simplified model scenarios. When interpreted in the context of a model with decoupled sleptons, the observed exclusion limit on the cross-section \(\sigma (pp\rightarrow {\tilde{\chi }}_1^+{\tilde{\chi }}_1^-)\) is weaker than the median expected limit at the \(2\sigma \) level. While it is tempting to speculate about the connection between this and the pattern of excesses presented in this paper, a detailed treatment of this analysis is beyond the scope of the present work.

4.3 Benchmark points

Table 7 Parameter values and sparticle masses for a variety of benchmark points. Point #1 is our best-fit model, for which the Higgsinos are heavier than the winos. Point #2 is a solution with the winos heavier than the Higgsinos with similar likelihood. Point #3 is the point within the \(1\sigma \) region with the highest LSP mass. Point #4 has the best combined DM and collider likelihood

In Table 7, we show the parameter values and electroweakino masses for a number of benchmark points. The first of these corresponds to our best-fit point. As can be seen in Figs. 4 and 7, there are many points that give likelihoods that are very close to the best-fit value. In particular, these include models where winos are lighter than Higgsinos (as occurs at our best-fit point), models where they have similar masses, and models where the winos are heavier. We show a second benchmark in Table 7 where the latter is true, with a slightly smaller likelihood than the best fit. In both benchmarks, all electroweakinos have masses less than about 300 GeV. It is worth noting that this second benchmark point is also the highest likelihood point with negative \(\mu \), often a preferred scenario for improving \(g-2\) for the muon, in spite of this feature not being included in the analysis. We show mass spectra for both these possibilities in Fig. 10.

Fig. 10
figure 10

The electroweakino mass spectra [154] for our best-fit point (benchmark point #1; left) and another point with a similar likelihood but with heavier winos (benchmark point #2; right). The benchmarks are defined in Table 7

Table 7 also shows a third benchmark point: that with the highest LSP mass within the \(1\sigma \) region. This point features a bino LSP, as our best-fit point does, and all electroweakino masses are below 350 GeV.

Lastly, benchmark scenario 4 in Table 7 is the point with the best combined DM and collider likelihood. The value of the likelihood shown in Table 7 corresponds to the collider likelihood, clearly within \(1\sigma \) of the best-fit point. The combined DM likelihood for this point constitutes essentially a perfect fit, showing no tension with direct nor indirect detection, and a relic density well below the observed value. With a lightest neutralino mass of \(m_{{\tilde{\chi }}^0_1} = 45.1\) GeV, the LSP at this point falls right on the Z funnel, explaining how it is able to avoid saturating the DM relic density despite being predominantly bino.

In Table 8, we give the local \(p\)-value estimates for each analysis separately, confirming the picture presented in Fig. 8. We also present generalised goodness-of-fit estimates for both the background-only hypothesis and our best-fit signal. The total significance is dominated by contributions from the ATLAS_4lep and ATLAS_RJ_3lep analyses, whilst other analyses do not disfavour this point.

Table 8 Combined significance of excesses in all analyses considered in this paper, along with significances considering each analysis individually. Local significances are computed with respect to the nominal signal predictions of our EWMSSM best-fit point via the method described in Sect. 3.4.1. The SM and EWMSSM (using the best-fit point found by our scan) goodness-of-fit are computed via the method described in Sect. 3.4.2. Significances displayed as ‘0’ correspond to \(p\)-values greater than 0.5. The “Best expected SRs” columns show significances computed using only the expected most sensitive signal region in each analysis, except when correlation information is available, as is done during our scan (this procedure is discussed in Sect. 3.3.1). The “All SRs; neglect correlations” columns show significances computed using the same tests as described above, but assuming that no correlations exist between signal regions within the analyses for which correlation information is unavailable. We compute these to check that our best-fit signal predictions are not in obvious tension with the observations in the signal regions that were not pre-selected for analysis at our best-fit point

We estimate the combined local \(p\)-value to be \(3.3\sigma \), but urge caution in its interpretation as it neglects necessary look-elsewhere corrections. For this reason, we have also performed goodness-of-fit tests constructed with less a priori information about our best-fit signal (see Sect. 3.4.2). This test has much less statistical power for discovery, due to its greater number of degrees of freedom compared to our local \(p\)-value test; there is only about a \(20\%\) probability to observe a \(2\sigma \) or greater excess under our best-fit model in this test, as opposed to over \(95\%\) probability to observe a \(2 \sigma \) or greater excess in the local significance test.Footnote 13 However, our goodness-of-fit test has a false positive rate much closer to the stated significance, due to a reduced look-elsewhere effect. Under this test, we estimate the significance with which the background-only hypothesis is excluded to be \(1.4\sigma \).

Performing the same test for the best-fit signal hypothesis, we see that our best-fit EWMSSM model is indeed a good fit to the data, showing only a \(0.2\sigma \) significance in Table 8, with no significant tension evident between analyses. The worst fit to our best-fit signal in an individual LHC analysis occurs in the ATLAS_RJ_ 3lep analysis, at \(1.1\sigma \); this is because our EWMSSM best-fit point slightly under-predicts the excess observed in this analysis.

A limitation of our significance estimates is that we can only perform rigorous tests using the selected signal regions at our best-fit point (as described in Sect. 3.3.1). This is due to a lack of information about correlations between signal regions for many analyses. One may therefore be concerned that our conclusions could be significantly altered by the observations in signal regions that were deemed less sensitive to our best-fit point by Monte Carlo simulation, and therefore not included in our likelihood calculation for this point. To address this concern, we have also computed the results that would be obtained using all signal regions in all analyses, assuming them to be independent where no correlation information is available. These are shown in the righthand side of Table 8. Neglecting unknown correlations is of course not fully correct, however it is sufficient to detect large tensions between signal regions that we might have missed in our main analysis. The results do not indicate any large qualitative difference to the main results; the local combined significance is mildly increased (from \(3.3\sigma \) to \(4.1\sigma \)), whilst the EWMSSM and SM goodness-of-fits are mildly improved (\(0.2\sigma \) to \(0\sigma \), and \(1.4\sigma \) to \(1.2\sigma \) respectively). Note, however, that the goodness-of-fit tests have decreased statistical power due to the increased number of degrees of freedom that result from combining more signal regions; this is the main reason for the improved goodness-of-fit in both cases.

With regards to other features in Table 8, one may notice that the Z invisible width measurement has zero local significance, but nevertheless shows a \(1.3\sigma \) tension with the background-only hypothesis. This is because our EWMSSM best fit predicts zero new physics contribution to the Z invisible width, meaning it has a trivial likelihood ratio and zero significance with respect to the SM. However, the LEP measurement is slightly above the SM prediction, which means that the completely free hypothesis in our goodness-of-fit test can improve upon the SM by a small amount.

Our other LEP likelihoods are not included in this combination because we implement them in our scan via interpolated limits, rather than fully simulating the analyses as we do for the LHC likelihoods, and these limits are not sufficient to reconstruct pdfs that can be used to produce pseudo-data. However, our best-fit point predicts zero contribution from these likelihoods due to the fact that all the electroweakinos except the \({{\tilde{\chi }}}_1^0\) are out of the kinematic reach of LEP, so we do not expect them to have much effect on the \(p\)-values presented here.

Fig. 11
figure 11

Distribution of missing transverse energy in the 2 lepton plus jets signal regions of the traditional ATLAS multilepton analysis, after applying all selection requirements. The grey bars show the total SM background (taken from Ref. [116]) and the stacked blue bars show the signal for our best-fit point. The hatched red bands show the \(1\sigma \) uncertainty on the total number of expected events, found by summing in quadrature the background uncertainty and the signal statistical uncertainty for our best-fit point. The black points show the ATLAS data

Fig. 12
figure 12

Distribution of missing transverse energy in the 3 lepton signal regions of the traditional ATLAS multilepton analysis, after applying all selection requirements. The grey bars show the total SM background (taken from Ref. [116]) and the stacked blue bars show the signal for our best-fit point. The hatched red bands show the \(1\sigma \) uncertainty on the total number of expected events, found by summing in quadrature the background uncertainty and the signal statistical uncertainty for our best-fit point. The black points show the ATLAS data

Fig. 13
figure 13

Distribution of kinematic variables in the 2 lepton signal regions for the ATLAS RJ analysis, after applying all selection requirements. The grey bars show the total SM background (taken from Ref. [117]) and the stacked blue bars show the signal for our best-fit point. The hatched red bands show the \(1\sigma \) uncertainty on the total number of expected events, found by summing in quadrature the background uncertainty and the signal statistical uncertainty for our best-fit point. The black points show the ATLAS data

Fig. 14
figure 14

Distribution of kinematic variables in the 3 lepton signal regions for the ATLAS RJ analysis, after applying all selection requirements. The grey bars show the total SM background (taken from Ref. [117]) and the stacked blue bars show the signal for our best-fit point. The hatched red bands show the \(1\sigma \) uncertainty on the total number of expected events, found by summing in quadrature the background uncertainty and the signal statistical uncertainty for our best-fit point. The black points show the ATLAS data

As made clear in Sect. 3.3.1, we use LO cross sections for our event generation. The increase in cross-sections going to NLO (and beyond) would result in a shift in the best-fit masses that give the same number of events. Calculating cross-sections at LO and NLO for our best-fit point using Prospino  2.1 [155, 156], and ignoring changes in efficiency – which would be reasonably small when all the electroweakino masses are changed by the same amount, giving very similar decay kinematics – this corresponds to shifting all the neutralino and chargino masses upward by 7.0 GeV. It is interesting to observe that this brings the mass up to slightly below half the Higgs boson mass. Whatever the continuing status of the small excesses in various signal regions, it is interesting to note that such a light spectrum with an LSP of less than 60 GeV is not particularly constrained by current 13 TeV LHC searches.

To finish this section, we compare the shapes of the signal variable distributions in the ATLAS multilepton analyses for our best-fit point with those published by the ATLAS experiment. The ATLAS simulation uses NLO cross-sections for normalization, and a full matching procedure including up to two extra hard jets in the matrix element. We have checked the missing energy distributions for the two lepton plus jets and three lepton signal regions of the traditional multilepton analysis (Figs. 11 and 12, respectively), as well as the distributions of several variables for the two lepton and three lepton signal regions of the recursive jigsaw analysis (Figs. 13 and 14, respectively). We see that the shape of the total expected contribution to data (SM background plus SUSY signal) is well compatible with the observed data in all cases. This was, however, almost inevitable given the limited numbers of events in these signal regions. Clearly, the shapes of these distributions will offer a powerful test of our best-fit hypothesis as more LHC data are collected.

4.4 Extraction of underlying parameters

The neutralino and chargino masses are fixed by a set of four parameters: \(\{M_1, M_2, \mu , \tan \beta \}\). If the excess is a real signal of new physics it will be very important to extract the underlying parameters from the data. In Fig. 15 we show the profile likelihood of each parameter individually. For the three dimensionful parameters, \(M_1\) (top left panel), \(M_2\) (top right panel) and \(\mu \) (bottom left panel) a preference for a low mass scale can be seen, as one would expect from the fact that we have already seen a preference for all electroweakinos being light, with the current level of resolution similar to that with which we are able to determine the electroweakino pole masses. We do not observe any lower bound on \(M_1\), allowing an extremely light bino, while \(M_2\) and \(\mu \) must be heavier than about 100 GeV. We see no strong preference for any particular choice of \(\tan \beta \) (bottom right panel) in the data, with the entire range from 1 to 70 permitted at \(2\sigma \). In Fig. 16, we also show the profile likelihood in planes of the underlying dimensionful parameters, \((M_1,M_2)\) and \((M_2,\mu )\). We see that within the \(2\sigma \) contours, all three parameters are light. This implies that two types of neutralino (wino, bino or Higgsino) are light. This was already suggested by Fig. 7, which shows the mixing of the four neutralinos. Given collider constraints on the gluino mass, it appears that the excess is not compatible with high-scale unification of the gaugino masses, as assumed in e.g. constrained MSSM/mSUGRA scenarios.

Fig. 15
figure 15

One-dimensional profile likelihood for the electroweakino sector parameters: \(M_1\) (top left), \(M_2\) (top right), \(\mu \) (bottom left) and \(\tan \beta \) (bottom right). The dashed black lines show the \(1\sigma \) and \(2\sigma \) confidence limit and the black star marks the best-fit point

Fig. 16
figure 16

Profile likelihood in the \((M_1,M_2)\) plane (left) and the \((\mu ,M_2)\) plane (right). The contour lines show the \(1\sigma \) and \(2\sigma \) confidence regions. The best-fit point is marked by the white star

5 Implications for dark matter

With an indication that relatively light LSPs may have been produced at the LHC, it becomes very interesting to consider the possibility that they may constitute DM. The most important observables to check in this context are the thermal relic density of DM, constraints on the interaction of the LSP with nuclei from direct detection experiments and neutrino telescope observations of the Sun, and limits from indirect searches for DM annihilation.

Fig. 17
figure 17

Combined collider and DM profile likelihood of the relic density of DM (left) and spin-independent direct detection cross-section (right), both plotted against the DM candidate mass. The contours show the \(1\sigma \) and \(2\sigma \) regions. The white star marks the point with the highest combined collider-DM likelihood, whereas the grey star marks our collider-only best-fit point. For comparison, we show the latest limits from PandaX [162] and XENON1T [163], along with the projected sensitivity of the LZ experiment [170]

In this paper we have analysed electroweakino searches specifically in an effective framework where the sfermions are decoupled, in order to fully understand the implications of electroweak LHC searches for the electroweakino sector. Whilst this framework fully captures all phenomenology relevant for those particular searches, it does not cover all possible implications of the same electroweakinos for DM. Light sfermions and/or non-SM Higgs bosons can provide (co-)annihilation channels able to deplete the relic density of the LSP and boost late-time annihilation signals, as well as impact nuclear scattering rates. Modifications to the expansion history during freeze-out could also significantly dilute the final relic density, as could decay of the lightest neutralino to gravitino DM. A full exploration of possible DM scenarios involving the neutralinos and charginos involved in the putative LHC signal is therefore beyond the scope of the current paper. However, we can at least consider a standard cosmological history along with the possible annihilation and scattering processes that involve only the electroweakinos and SM particles, in order to see if they might be able to explain DM alone.

The three relevant DM annihilation processes in the early Universe in this context are efficient annihilation of Higgsino DM (potentially also involving co-annihilation with similar-mass Higgsino charginos and next-to-lightest neutralinos) or wino DM (potentially involving co-annihilation with similar-mass wino lightest charginos), or resonant annihilation of binos via the SM Higgs or Z boson. Whilst all of these processes have been shown to be effective in the relevant mass range in recent studies [88, 157, 158], the detailed mixture of the LSP plays a significant role in determining whether the resulting relic density of DM is equal to the full cosmological abundance, or some fraction of it. At a mass of a few tens or hundreds of GeV, pure winos and Higgsinos annihilate too efficiently to produce the full relic density. On the other hand, annihilation of pure binos is too inefficient to bring the relic density down to the observed value, unless assisted by a resonance. Solutions that produce the full relic density of DM must therefore either be predominantly bino with an LSP mass of \(m_h/2\) or \(m_Z/2\), in order to trigger the resonance mechanism, or a mixture featuring a significant bino component plus some Higgsino and/or wino contribution(s).

In order to examine the potential of the models preferred by LHC electroweakino searches to explain DM, we postprocessed all points found in our scans to apply a series of additional DM likelihoods. These likelihoods are based on the relic density measured by Planck [159] (applied as an upper limit), constraints on the DM-nucleon scattering rate from LUX [160], PandaX [161, 162], XENON1T [163], CDMSlite [164], CRESST-II [165], PICO-60 [166], DarkSide-50 [167] and IceCube [83, 168], as well as gamma-ray limits from observations of 15 Milky Way dwarf spheroidal galaxies by the Fermi Large Area Telescope (LAT; [169]). More details of these observable calculations and likelihood functions can be found in Refs. [74, 80].

In the left panel of Fig. 17, we show the relic density of the models found in our scans, coloured according to their combined collider-DM profile likelihood. The two disconnected regions are those where the relic density is brought down to (or below) the observed value by resonant annihilation via either the Z (left region) or Higgs boson (right region). The Z funnel in particular is mapped out quite clearly in our results, with both sides of the resonance clearly visible around \(m_{{\tilde{\chi }}_1^0} = m_Z/2\). Whilst the total likelihood is highest when the lightest neutralino makes up only \(\sim \)10% of the DM, it is interesting to see the possibility that the LSP makes up all of the DM is well within the 2\(\sigma \) region. The small preference for the lower relic density is driven by the direct detection likelihoods, in particular those of PandaX, Xenon1T, and PICO-60.

The right panel of Fig. 17 shows the spin-independent nuclear scattering cross-sections of the models found in our scans, compared to the latest limits from two leading experiments included in our likelihood (PandaX [162] and XENON1T [163]), as well as the expected sensitivity of the LZ experiment [170]. We account for the fraction of the observed DM in neutralinos at each point in the scan by rescaling the cross-sections by \(f=\varOmega _{{\tilde{\chi }}_1^0} / \varOmega _\mathrm {DM}\), so as to compare fairly with the experimental limits (which assume \(f=1\)). We see that the h-funnel region already sits at the edge of the current experimental sensitivity, and will be probed in its entirety in the next generation of experiments. A substantial part of the Z-funnel region will also be tested by LZ and similar experiments, but this does not include the best-fit point.

We do not show plots relevant for indirect searches for DM, as the preferred annihilation cross-sections in the EWMSSM (after the application of the collider and DM likelihoods) all lie at . This is significantly below the sensitivity of any planned future indirect detection experiment. Although the preferred masses (around 45 or 62 GeV) and dominant DM annihilation final states (mostly \({\bar{b}}b\)) of our best-fit models are strikingly similar to those preferred in DM fits to the Galactic Centre gamma-ray excess (e.g. [171]), the annihilation cross-sections are too low to explain the excess without the presence of e.g. an additional light CP-odd Higgs boson to mediate additional late-time annihilation to \({\bar{b}}b\) [172, 173].

We summarise our result for the joint collider and DM likelihood in Fig. 18, where we show the one-dimensional \(1\sigma \) and \(2\sigma \) confidence intervals for the six electroweakino masses. This is compared to the \(2\sigma \) confidence intervals resulting from the collider likelihood alone, i.e. the \(2\sigma \) intervals from Fig. 5. The restriction of the EWMSSM to the Z-funnel and h-funnel regions by the DM likelihood not only restricts the LSP mass to the relevant resonance, but also significantly contracts the range of allowed \({\tilde{\chi }}^\pm _1\) and \({\tilde{\chi }}^0_2\) masses. This is to be expected from the strong correlation between all three of \(m_{{\tilde{\chi }}^0_1}\), \(m_{{\tilde{\chi }}^0_2}\) and \(m_{{\tilde{\chi }}^\pm _1}\) in Fig. 4.

Fig. 18
figure 18

Summary of the one-dimensional \(1\sigma \) (yellow) and \(2\sigma \) (cyan) confidence intervals for the neutralino and chargino masses, resulting from the joint collider and dark matter likelihood. For comparison, the \(2\sigma \) confidence intervals from the collider likelihood alone are shown in grey (hatched). The orange lines mark the best-fit values

Finally, we emphasise that these results are based on a scan that searched for regions of parameter space that could provide the best fit to the LHC likelihood. As a result, the parts of parameter space preferred at the \(2\sigma \) level by the combined DM and LHC likelihood are rather sparsely sampled, with only 541 points. Therefore, these results should be taken as a rough first check of the DM properties of the EWMSSM models consistent with the excesses seen at the LHC, and it should be kept in mind that a scan that seeks to map the combined likelihood could ultimately reveal more viable regions of parameter space.

6 Conclusions

We are in a very interesting period in the hunt for electroweakinos at the Large Hadron Collider, as the large accumulated datasets at a centre of mass energy of 13 TeV offer real potential for discovering weakly-produced sparticles. In this paper, we have performed a comprehensive global statistical fit of a 4D MSSM model in which \(M_1\), \(M_2\), \(\mu \) and \(\tan \beta \) are varied, whilst other MSSM parameters are held at fixed values in order to decouple all sparticles except the electroweakinos. In interpreting the results, we have considered both the case where one assumes that supersymmetry is not realised at a scale accessible by the LHC (in which case we are testing the exclusion power of current LHC searches), and the case where one allows the presence of a possible signal in the LHC data.

In the case where we assume that the data are consistent with SM backgrounds only, we find that current LHC searches offer little power to exclude any point on the \((m_{{\tilde{\chi }}_1^{\pm }},m_{{\tilde{\chi }}_1^0})\) plane. This is due to the differences between the simplified SUSY models used for optimisation and interpretation at the LHC, and the more realistic model that we employ here. This model allows for richer final states, plus a much wider variation in the assumed electroweakino contents. Our results interpreted in this fashion can be used to generate insights into how to better optimise the LHC’s ability to exclude sparticles.

In the case of a possible signal, we find that a series of excesses in the LHC data point towards a model with neutralino masses of \((m_{{\tilde{\chi }}_1^0}, m_{{\tilde{\chi }}_2^0}, m_{{\tilde{\chi }}_3^0}, m_{{\tilde{\chi }}_4^0})\) = (8–155, 103–260, 130–473, 219–502) GeV, and chargino masses of \((m_{{\tilde{\chi }}_1^{\pm }}, m_{{\tilde{\chi }}_2^{\pm }})\) = (104–259, 224–507) GeV at the 95.4% confidence level. The LSP is predominantly bino in our best-fit region, and the models are otherwise split into those that have the winos lighter than the Higgsinos, and those that have the Higgsinos lighter than the winos. Intriguingly, having all of the electroweakino spectrum light not only helps our best-fit model evade some LHC searches, but it also highlights a series of excesses that all contribute positively to our best-fit log-likelihood in the same mass region. Even if one does not take the pattern of current excesses seriously, this suggests that, at the very least, optimising analyses on simple one-step decay chains resulting from NLSP pair-production is not a good way to probe light electroweakino spectra.

Our best-fit point has neutralino masses of \((m_{{\tilde{\chi }}_1^0}, m_{{\tilde{\chi }}_2^0}, m_{{\tilde{\chi }}_3^0}, m_{{\tilde{\chi }}_4^0})\approx (49.4, 141.6, 270.3, 290.2)\) GeV, and chargino masses of \((m_{{\tilde{\chi }}_1^{\pm }}, m_{{\tilde{\chi }}_2^{\pm }}) \approx (142.1, 293.9)\) GeV. We find a local significance of \(3.3\sigma \) for this excess. If there is indeed a supersymmetric signal resembling these properties the ATLAS and CMS experiments should be sensitive to it using the full LHC Run 2 dataset. If one includes LHC searches for charginos and neutralinos conducted with proton–proton collision data collected at a centre of mass energy of 8 TeV, the local significance reduces to \(2.9\sigma \), but the general details of our best fit region apparently remain intact.

Analysis of the DM implications of our points is complicated by the fact that the particular values of the MSSM parameters that are held fixed in our analysis might influence the ability of our models to generate the correct relic density. Nevertheless, we find that a subset of the area around our best-fit point is very much consistent with both the observed relic density and constraints from direct and indirect searches for DM – even assuming that only electroweakinos and SM particles play a role in setting the relic density. Excellent fits to both the DM and collider likelihoods are possible in the so-called Z- and h-funnel regions, where the lightest neutralino has a mass equal to approximately half the mass of either the Z or Higgs boson. Many of these models will be accessible to the next generation of direct detection experiments, raising the possibility of a simultaneous confirmation of the putative LHC signal in future datasets from both the LHC and dark matter experiments.