ColliderBit: a GAMBIT module for the calculation of high-energy collider observables and likelihoods

We describe ColliderBit, a new code for the calculation of high energy collider observables in theories of physics beyond the Standard Model (BSM). ColliderBit features a generic interface to BSM models, a unique parallelised Monte Carlo event generation scheme suitable for large-scale supercomputer applications, and a number of LHC analyses, covering a reasonable range of the BSM signatures currently sought by ATLAS and CMS. ColliderBit also calculates likelihoods for Higgs sector observables, and LEP searches for BSM particles. These features are provided by a combination of new code unique to ColliderBit, and interfaces to existing state-of-the-art public codes. ColliderBit is both an important part of the GAMBIT framework for BSM inference, and a standalone tool for efficiently applying collider constraints to theories of new physics.


Introduction
Despite decades of searches for physics beyond the Standard Model (BSM), we still lack an unambiguous discovery of such physics. The many null results from the Large Hadron Collider (LHC) and other experiments allow us to constrain, to various degrees, the parameter spaces of many extensions of the Standard Model (SM). These include effective theories and simplified models of dark matter, supersymmetric theories, theories with extra space dimensions and composite Higgs models. Because even the most minimal realistic theories of BSM physics have observable consequences in multiple experiments, it is particularly important to combine collider exclusions with other experiments in a statistically rigorous way if one is to draw sound conclusions on the viability of a theory. Rigorously taking into account the sum of data relevant to a given model from the many disparate experimental sources has become a challenging task. This problem is addressed in GAMBIT (the Global And Modular Beyond-the-Standard-Model Inference Tool) [1], which combines calculations of observables and likelihoods in collider, flavour, dark matter and precision physics with a model database, a flexible system for interfacing to external codes, and a wide selection of different statistical methods and parameter scanning algorithms that can be applied to the models [1]. In this paper, we introduce ColliderBit, a GAMBIT module for the application of high-energy collider constraints to BSM physics theories.
The ATLAS and CMS experiments [2,3] have made great progress in the search for evidence of BSM physics at high energies, but applying these constraints to a generic theory of such physics remains challenging. Searches for new particles at the LHC are typically presented either in specific planes of a restrictive high scale physics hypothesis, e.g. the constrained minimal supersymmetric model (CMSSM), or in simplified models that strictly apply only to a very small volume of the total allowed space of particle masses and branching ratios. The computational expense of simulating signal processes for hundreds of thousands of points in a candidate model prevents an extended treatment by the experiments. In addition, some LEP results remain useful [4][5][6][7][8][9][10][11][12][13][14][15], and are not always rigorously applied in the literature.
Finally, the discovery of an SM-like Higgs boson [16,17] by the ATLAS and CMS experiments in 2012 -and the subsequent measurement of its properties -provides tight constraints on variations in the Higgs branching ratios, which must be included in any thorough exploration of a BSM physics model. Given the ever-growing list of constraints on BSM physics from experiments at the LHC, the need to rigorously test those limits against various models is ever more pressing.
Partial solutions to each of these issues exist, but there is as yet no comprehensive tool that tackles all of them. The package SModelS applies constraints to supersymmetric (SUSY) models based on a combination of simplified model results [18]. FastLim provides similar functionality for SUSY models, but is extendible (in principle) to non-SUSY models through the use of user-supplied efficiency tables [19]. Both of these tools will provide limits that are much more conservative than a more rigorous calculation, due to the limitations of simplified models. SUSY-AI [20] provides a random forest classifier for SUSY models based on LHC exclusions, but as seen in earlier applications of machine learning to this problem [21][22][23][24][25], accuracy concerns exist when applying the method to large-volume parameter spaces, due to the relative sparsity of the training data [26] in the model parameter space. Other approaches to SUSY model exclusion based on machine learning can be found in [27,28]. CheckMATE provides a customised version of the Delphes detector simulation, an event analysis framework and a list of ATLAS and CMS analyses that can be used to apply LHC limits, and includes an interface to MadGraph5_aMC@NLO for event generation [29][30][31][32]. However, the time required to run a single BSM parameter combination through CheckMATE makes large-scale parameter scans a difficult prospect, and integration with a global fitting framework is not within the scope of the package. To the best of our knowledge, no general purpose tool exists to apply LEP BSM search limits, although many theorists have implemented their own local codes over the years. Packages such as HiggsBounds [33][34][35][36], HiggsSignals [37] and Lilith [38] allow the user to apply constraints on Higgs physics.
As ColliderBit is designed within the GAMBIT framework [1], it offers seamless integration with modules that provide statistical fitting [1,39], the ability to impose constraints from electroweak precision data [40], flavour physics [41] and a large range of astrophysical observations [42]. For LHC physics, we use a combination of parallelised Monte Carlo (MC) simulation and fast detector simulation to recast LHC limits without the approximations of the simplified model approach. The first release of the code comes with a list of ATLAS and CMS analyses that collectively present strong constraints on supersymmetry and dark matter scenarios [43][44][45][46][47][48][49][50][51][52][53]. It contains interfaces to the Pythia 8 MC event gener-ator [54,55], to the Delphes detector simulation [30,31], and a customised detector simulation based on four-vector smearing (BuckFast). In this paper we show that that Buck-Fast gives comparable results to Delphes, but at a dramatically lower CPU cost. We also supply custom routines for re-evaluating LEP limits on supersymmetric particle production, and include interfaces to HiggsBounds and HiggsSignals for calculating Higgs observables. ColliderBit follows the modular design of GAMBIT, thus enabling the user to easily swap components (e.g. choose a different detector simulation without affecting the LHC analysis framework), add new collider analyses, or provide interfaces to standard particle physics tools. This paper serves as both a description of the physics and design strategy of ColliderBit, and a user manual for the first code release. In Appendix A, we provide a quick start guide for users keen to compile and use the software out of the box. Sect. 2 describes the physics and implementation of the Col-liderBit software. The ColliderBit user interface is outlined in Sect. 3. In Sect. 4 we cover two use cases: first, we point to an annotated GAMBIT input file that details the application of collider constraints in a scan of the constrained minimal supersymmetric standard model (CMSSM). Second, we provide a detailed example of how the user can add their own model to the ColliderBit code. The second of these examples shows the flexibility of ColliderBit in tackling generic theories supplied by the user, using existing codes for automatic generation of matrix elements. After summarising in Sect. 5, we also provide Appendices B and C, where we detail the C ++ classes defined by ColliderBit, and a glossary of common GAMBIT terms, respectively.
ColliderBit is released under the terms of the 3-clause BSD license, 1 and can be obtained from gambit.hepforge.org.

Physics and implementation
To perform any calculations, ColliderBit requires numerical values for the free parameters of a theory for new physics. If ColliderBit is run with other GAMBIT modules, these will come from a scanning algorithm implemented in the Scan-nerBit [39] module, and other GAMBIT modules will then perform the necessary spectrum generation and decay rate calculations. The user may also run ColliderBit as a standalone code, in which case the parameters can be supplied via a model description, such as an SLHA file for supersym-metric models [59,60]. In this case, the user must supply spectrum and/or decay calculations as appropriate. The Col-liderBit output is a series of signal event rate predictions and likelihood terms derived from BSM searches at the LHC, as well as likelihood terms from SUSY searches at LEP and Higgs searches at LEP, the Tevatron and the LHC. The terms may then be combined according to the user's request, to form a composite likelihood. Here we describe the strategy for calculating each individual likelihood term, along with the code implementation.

Overview of LHC constraints included in ColliderBit
As the flagship collider at the energy frontier, the LHC provides the most stringent constraints on BSM physics models in the majority of cases. The search groups of the ATLAS and CMS experiments provide long lists of results using data from LHC proton-proton collisions taken at √ s = 7, 8 and 13 TeV, including searches for specific particles encountered in BSM physics models, and generic resonances in a multitude of final states [61][62][63][64].
Implementing the full list of LHC constraints is a daunting task. The initial approach taken in ColliderBit is to provide a representative set of searches that run out-of-the-box, supplemented by a framework that makes it easy to add new LHC analyses. ColliderBit includes a selection of Run I and Run II LHC analyses, chosen for their relevance to supersymmetry and dark matter simplified model applications.
The Run I analyses included are: -ATLAS 0-lepton supersymmetry search This targets squark and gluino production, and is the most constraining single ATLAS SUSY analysis in cases where the gluino, some or all squarks are expected to be light. , which targets direct sbottom production. All of these searches are also expected to strongly constrain simplified dark matter models with a mediator that couples preferentially to third generation fermions. These models, which have gained popularity as explanations of the Fermi-LAT Galactic Centre excess [65], give rise to similar final states and, indeed, the CMS searches that we implement have already been used in a non-supersymmetric dark matter context [66]. , which provides a constraint on various dark matter scenarios, in addition to supersymmetric scenarios with compressed spectra. Some caution must be taken when applying this to e.g. dark matter effective field theories. In cases where NLO QCD effects are significant, the user will need to interface GAMBIT to a suitable Monte Carlo generator capable of modelling these effects.
We also provide the ATLAS and CMS Run II (13 TeV) 0-lepton supersymmetry searches, based on 13 fb −1 of analysed data [67,68]. More Run II analyses will be added to ColliderBit in the near future, including searches sensitive to R-parity violating supersymmetry such as Ref. [69].
We consider this a reasonable minimum of LHC searches for covering a wide range of LHC phenomenology, but the average user will no doubt be keen to expand the collection. New analyses will be continuously added to the code repository, and information on how the user can add a new LHC analysis to ColliderBit is given in Sect. 3.1.1. It is worth noting, however, that the general treatment of the LHC analyses in ColliderBit means that even the LHC Run I results can provide previously unavailable insights when used to constrain models with large parameter spaces. We also emphasise that the above list does not contain searches for SUSY scenarios with compressed sparticle spectra. Since we use the LO Pythia generator in the current ColliderBit release, we would obtain less precise results than the ATLAS and CMS publications that use MadGraph5_aMC@NLO to explicitly model initial state jet radiation through the addition of the relevant diagrams to the tree-level sparticle production process.
In the rest of this section, we describe the process by which LHC analysis constraints are derived without employing any model-dependent assumptions, following a full simulation of proton-proton collisions, including detector effects and an approximation of the ATLAS and CMS statistical procedures.

Strategy for applying LHC constraints without model-dependent assumptions
A parameter point of a specified BSM model can in principle be expected to show up in a variety of LHC BSM searches. For counting analyses, the relevant data to model are the number of events that pass kinematic selection criteria (for brevity referred to in what follows as 'cuts') imposed in each analysis. If a model predicts that s signal events will pass the cuts for a given signal region, and b background events are expected from known SM processes, the likelihood of observing n events is given by the standard Poisson formula, For now, we neglect effects of systematic uncertainties in the signal and background yields -but we return to this point in Sect. 2.1.8. LHC BSM search papers provide details of b and n for each signal region, along with the background uncertainty, and some estimate of the signal uncertainty for representative models. Calculating the likelihood for a given model thus requires an accurate estimate of s. This is given by where σ , and A are the process-specific production crosssection, detector efficiency and acceptance, respectively. L is the integrated luminosity of data used in the search.
The rigorous way to calculate s is to perform a crosssection calculation at the highest practically achievable level of accuracy in perturbation theory, before evaluating the acceptance and efficiency via a Monte Carlo simulation of the LHC collisions. This is usually augmented by simulating the reconstructed signatures of the Monte Carlo events in the relevant detector -ATLAS or CMS in the case of direct BSM searches at the LHC. One can then apply the analysis cuts for a given LHC search to the results of the detector simulation. An approach using look-up tables for efficiencies and extrapolations from simplified models removes the need for time consuming simulation, but tends to give very conservative results as a consequence. This is because the approach misses models that do not resemble the simplified models under consideration, but still have some acceptance to the analysis cuts that are used to generate the simplified model results. Furthermore, generating look-up tables must be repeated for the parameter space of every physics model of interest, making it hard to produce a generic code for the application of LHC constraints.
The core strategy of the ColliderBit LHC module is instead to make each step of the simulation chain faster, using a combination of custom speed increases and parallel computing. The package thus performs a cross-section calculation, generates Monte Carlo events, performs an LHC detector simulation and then applies the analysis cuts for a range of LHC analyses, using a custom event analysis framework. The user can then utilise the GAMBIT statistical routines to return LHC likelihoods. The basic processing chain is illustrated in Fig. 1. The code is designed so that the user can choose, among available options, which software performs each step of this process, or, as an alternative, add an interface to an external code in place of an implemented option. Nevertheless, ColliderBit has a default chain implemented, and the current version contains the elements summarised in the following subsections.

Cross-section calculations
ColliderBit uses the LO + LL cross-sections calculated numerically by the Pythia 8 event generator [54,55]. For many models, these are the state-of-the-art. For models where an NLO (or better) calculation exists, e.g the MSSM, this is a conservative approximation, as the K -factors are predominantly greater than one. The LO + LL MSSM cross-sections are considerably quicker to evaluate than the full NLO results obtained using e.g. Prospino [70][71][72]. A single evaluation of just the strong production cross-sections for a CMSSM benchmark point, with all relevant processes kinematically available, takes around 15 minutes of CPU time on a modern processor using Prospino 2.1 (Intel Core i5 at 2.6 GHz). This is clearly unusable in a scan where the evaluation of a single parameter point must be done in times on the order of a few seconds. For strong production there exist pre-computed grids of NLO cross-sections with added (N)NLL corrections, which in combination with fast interpolation routines allow accurate cross-sections to be obtained within fractions of a second [73][74][75][76][77][78][79]. However, these interpolations are limited to models where all squarks except the stops are mass degenerate. While this approximation is suitable for many lowerdimensional parametrisations of the MSSM, it is not sufficiently general to serve as a default MSSM cross-section calculator for ColliderBit.
With the improvement to NLO + NLL, the error from the factorisation and renormalisation scales has been shown to be as low as 10% [76] for a wide range of processes and masses; however, PDF and α s uncertainties must be included in the total error budget. These increase with the sparticle masses because the PDFs are most poorly constrained at large scales and at large parton x. As an example, at 8 TeV NLL-fast 2.1 [75][76][77][78][79] gives errors of (+24.3, −22.2%) and (+8.3, −7.3%), for the PDF and α s , respectively, using the MSTW2008 NLO PDF set [80], with gluino and squark masses set to 1.5 TeV. Because 1.5 TeV is at the edge of the LHC reach at that energy, the total error budget here will not drop much below 25% even with NLO + NLL crosssections. 2 In light of the above, we take the conservative path of calculating likelihoods with the LO Pythia 8 cross-sections for the LHC. Assigning errors to these cross-sections is rather meaningless, considering the monotonic nature of LO scaledependence, and the fact that the LO cross-sections in BSM models are known to almost always lie significantly below the NLO and higher order cross-section, sometimes by as Table 1 Time taken for the ColliderBit LHC likelihood calculation as a function of the number of cores, for 100,000 SUSY events at the SPS1a parameter point [83,84], including all sub-processes. The processes were run on a single computer node, with ISR, FSR, and full hadronisation enabled, but multiple parton interactions and tau decay spin correlations disabled. GAMBIT was compiled with full optimisation settings (cf. Sect. 11  much as a factor of two. 3 The LO cross-sections are hence nearly always more conservative than the lower edge of the most pessimistic NLO uncertainty band due to renormalisation scale systematics. Accordingly, we do not apply any systematic theory error to our cross-sections, as any error due to finite statistics in the event generation is dwarfed by the systematic underestimation of the cross-sections due to the LO approximation. We have verified that these choices, combined with the approximations used in the event and detector simulation, result in limits equal to or more conservative than those in the included ATLAS and CMS analyses (see Sect. 2.1.7).
In future releases, we will allow the user to supply crosssections and associated uncertainties as input to the LHC likelihood calculation, making it possible to calculate them using any preferred choice of external code (known in GAM-BIT as a backend).

Monte Carlo event generation
For the ColliderBit event generation, we supply an interface to the Pythia 8 [54,55] event generator, alongside custom code that parallelises the main event loop of Pythia using OpenMP. 4 This substantially reduces the runtime, as seen in Table 1. In a parameter scan with GAMBIT the parameter sampling is parallelised using MPI. The additional OpenMP parallelisation of the LHC likelihood calculation in Collid-erBit, along with similarly parallelised calculations in other GAMBIT modules, helps GAMBIT's overall scan performance to scale beyond the number of cores that the sampling algorithm alone can make efficient use of.
For the purposes of BSM searches, many time-consuming generator components also add little to the quality of relevant physics modelling, and can therefore be safely disabled. The single-threaded timing effects of sequentially disabling Table 2 Single-thread CPU effects of sequentially disabling event simulation components, for 100,000 SUSY events at the SPS1a parameter point [83,84], including all sub-processes. The disabled components have a major effect on CPU, and a minor (sometimes even positive) effect on physics performance. The third row corresponds to the first row in  [85] that removes the effects of pile-up and MPI on average, so disabling MPI is actually a more appropriate physical configuration than enabling it -and delivers a 60% CPU cost saving to boot. The choices for FSR and hadronisation are less clear: these are responsible for production of realistic track and cluster multiplicities and energies on which detector simulation can be run. Completely disabling FSR -which mainly produces internal jet structure, not relevant to most BSM analyses -and all hadron-level processes including both hadronisation and decays, are both rather drastic options. In practice there are intermediate alternatives, such as raising the lowp T cutoff of FSR evolution to balance CPU cost against physical accuracy, or to produce physical primary hadrons but elide simulation of their decays.
By default ColliderBit runs in the mode with MPI and "sophisticated" tau decays disabled; there is potential for further significant speed-up if the hadron-level or FSR simulation steps can be reduced, perhaps by use of specialised detector smearing to compensate for the biased final state particle distributions.
This combination of multi-threading and reduced generator functionality allows generation of 20,000 all-subprocess SUSY events in about 7 s on an Intel Core i7 processor using 8 cores, provided that the compilation makes use of the gcc option --ffast-math, or a suitable equivalent. Generating 100,000 events with the same settings and number of cores takes 19 s. When including FSR and hadronisation, as per the ColliderBit default, the time required to generate 20,000 and 100,000 events increase to 17 and 67 s, respectively.
In the above examples a factor 5 increase in the number of generated events only lead to a factor 2.5-4 increase in the evaluation time. This illustrates that when the number of generated events is fairly low, other parts of the calculation besides the event loop itself account for a noticable fraction of the total evaluation time. The most important contribution comes from the initialisation of Pythia 8. While this step has not been parallelised, we have optimised the ColliderBit Pythia 8 initialisation so that per-thread copies of the Pythia objects are only constructed at the beginning of a GAM-BIT sampling run, only requiring re-initialisation of processspecific physics components for each new model point. This produces a further speed increase in realistic applications.
In addition, in a GAMBIT-driven global fit, the event generation for a point can be skipped on the basis of the initial estimated maximum cross-section. If this is already too low to lead to observable consequences at the LHC, running the event generator is pointless, so skipping that step for some fraction of parameter points gives a further average speed increase. Event generation is also aborted if Pythia returns an error from the pythia.next() call. In both cases the contribution to the log likelihood, see Sect. 2.1.8, is set to zero.
Taken together, these routines make it computationally tractable to run a full Monte Carlo simulation in a global fit.
The choice of the Pythia generator is an acceptable compromise between generality and ease of use for the first Col-liderBit release. It is sufficient for many BSM models, and is easily extendable with matrix elements for new models via the existing MadGraph5_aMC@NLO interface [32]. For an example, see Sect. 4.2. Pythia will prove insufficient, however, in cases where NLO corrections are significantfor example in the accurate treatment of some effective field theories of dark matter, where top quark loops become important [66]. These deficiencies can be addressed in the current release via a user-supplied interface to an appropriate Monte Carlo tool, and such interfaces will be supplied in future Col-liderBit releases.

Event record
ColliderBit provides a custom set of event record classes that are independent of the particular choice of event generator or detector simulation. These are: a P4 momentum 4-vector; Particle and Jet, which respectively add particle ID and flavour-tagging information to P4; and an Event container. The latter is used to store the particles in discrete categories of photons, electrons, muons, taus and invisibles, as well as a jet container and a missing momentum vector.
These event objects should be populated by conversion routines attached to the interface to each MC generator, allowing the different event structure conventions of each generator to be treated correctly. The conversion may be done either at parton or particle level. Parton-level conversion is primarily intended for speed, as it allows the most CPUintensive parts of the event generation to be skipped, at some cost to physical accuracy. The description below concerns the complete particle-level variant, but the parton-level version only differs from it in a few minor details.
First, ColliderBit loops over the contents of each event, looking separately for decayed and stable particles. The former are only used to find b quarks 5 and hadronic taus for later jet tagging; following the established Rivet MC analysis system [87], only stable particles are used for constructing the kinematics of truth-level events, making the detector simulation and analysis more robust.
We require identified final-state leptons and photons to be "prompt", i.e. their ancestry is recursively checked to ensure that they have not been produced (even indirectly) in hadron decays. All final-state particles other than muons and invisibles are used as inputs to jet finding, which is performed using the FastJet [56] implementation of the anti-k t jet algorithm [88]. We set the anti-k t R parameter to 0.4 for Run I ATLAS BSM searches, 0.5 for corresponding CMS analyses, and 0.4 for both ATLAS and CMS Run II analyses. We use ΔR matching between jets and the unstable tagging objects to set appropriate jet attributes. ColliderBit computes missing momentum from the vector sum of the momenta of the invisible final-state particles within a geometric acceptance of |η| < 5.
The resulting Event is then passed on down the ColliderBit chain: first for modification by detector simulation, and then in read-only form to the analysis routines.

Detector simulation
ColliderBit is structured so that the detector simulation is run during the main parallelised event loop, implicitly speeding up the simulation step. The user has several options for this step.

No detector simulation
The user can choose not to perform any detector simulation, in which case the truth-level MC events described above are passed directly to the event analysis framework without modification. Jets may be defined directly at the parton level, or at the hadron level. The former is only really sufficient for analyses in which leptons are the main species of interest, in which case turning off hadronisation can lead to a large speed increase, as seen in Table 2.
Delphes We provide an interface to the Delphes 3.1.2 detector simulation [30,31], which provides simulations of the ATLAS and CMS detectors. Delphes includes a simulation of track propagation in the magnetic field of an LHC detector, along with a simulation of the electron and hadron calorimeters, and the muon chambers. The user can configure the parameters of the simulation using the normal Delphes mechanism, but it should be noted that b-and τ -tagging, and the ATLAS lepton ID selection efficiencies ("medium", "tight", etc.), are controlled explicitly within the Collider-Bit event analysis codes, to allow different analyses to use different calibration settings. Delphes has been interfaced with ColliderBit such that it can be passed single events via memory, rather than performing several passes over a large sample of MC events in pre-produced files, as in its usual mode of operation.
BuckFast For most purposes, a more approximate approach based on four-vector smearing is sufficient. We supply an internal ColliderBit detector simulation, BuckFast, which uses particle and jet resolution and efficiency functions based on those in Delphes, plus parametrised ATLAS electron identification efficiencies. New parametrisations are being added as Run 2 performance data becomes public.
The components of the BuckFast simulation are: Electrons: We apply the Delphes functions for electron tracking efficiency, electron energy resolution and electron reconstruction efficiency (in that order) to the truthlevel electron four-vectors. In the analysis step, we apply parametrisations of the ATLAS electron identification efficiencies as appropriate, taken from Refs. [89,90]. Muons: We apply the Delphes functions for the muon tracking efficiency, the muon momentum resolution and the muon reconstruction efficiency (in that order) to the truth-level muon four-vectors. Taus: Hadronic taus are identified at truth level. Leptonic taus are discarded. For both ATLAS and CMS the hadronic tau momentum is smeared by a 3% Gaussian resolution. Tau tags are applied to jets found within ΔR < 0.5 of the true hadronic taus. Flat efficiencies are applied to tau tagging in the analysis code to allow use of different tagging configurations within the analyses of each experiment. Jets: Jets are reconstructed at hadron level using the anti-k t algorithm, implemented in the FastJet package. All fiducial final-state particles other than invisibles and muons are used in jet finding, mimicking typical LHC jet calibration. For both ATLAS and CMS the jet momentum is smeared by a 3% Gaussian resolution chosen for compatibility with Delphes' constituent-level smearing. b-jets: Truth-level jet tags are obtained by matching jets to final b-partons for ΔR < 0.4; a more robust approach using final b-hadrons is also available, but by construction agrees less well with the parton-based Delphes and LHC Run 1 tagging calibrations. As for taus, tagging efficiencies and mistag rates parametrised in ηp T are applied in each analysis code to allow the use of different tagger configurations in different analyses.

Missing energy (MET): MET is constructed at generator
level by summing the transverse momenta of invisible particles within the acceptance of the detector, and all particles outside the acceptance. No "soft-term" MET smearing is currently applied, since for events with real hard-process invisible particles the ATLAS reconstruction of E miss T is within a few percent of the true value at all scales, and within 1% above 70 GeV [91]. The same approach is taken to define the "truth MET" in the "no simulation" mode. Figure 2 shows example performance of BuckFast, with comparisons to Delphes and no-simulation processing. For this example, we choose a CMSSM point close to the current ATLAS 95% exclusion contour, consistent with the measured Higgs boson mass: The major effects of detector simulation are seen to be due to lepton efficiencies, with explicit resolution modelling producing relatively minor effects. BuckFast and Delphes typically agree to within a few percent for leptons, but some larger differences remain for b-jets (due to truth-tag definition) and missing E T . The latter is currently unsmeared in BuckFast, but the origin of the deviation at high-E miss T is unclear since the reconstructed ATLAS E miss T closely matches the truth value in the BSM search region above 70 GeV [91]. The impact of these discrepancies on an ATLAS analysis dominated by b jets and E miss T is shown in Table 4. BuckFast is significantly faster than Delphes. One reason for this is that the operations it performs are computationally simpler, and should complete in fractions of a second. The other, more signficant, reason is that the ROOT framework on which Delphes is based is not thread-safe, so must be run serially within an OpenMP critical block. In contrast, BuckFast can be run in parallel along with our parallelised version of Pythia 8 (cf. Sect. 2.1.4), as it has no dependence on ROOT.

LHC event analysis framework
ColliderBit provides a simple analysis framework, built on the event record classes described in Sect. 2.1.5. Each analysis routine is a C ++ class derived from the BaseAnalysis class, which provides the usual interface of a pre-run init method and an in-run analyze method to be called on each event. The user can choose which analyses to run in a given scan directly from the GAMBIT configuration file. Using the generic Col-liderBit event record classes means that the analyses can be automatically run on either unsmeared truth records or ones to which detector effects (other than jet tagging rates) have been applied. The result of an analysis is a set of SignalRegionData objects. Each of these encodes the predicted event counts in a particular signal region of the analysis, from both signal and background processes. The signal numbers are obtained by normalising the yields of simulated events to the integrated luminosity of the original experimental data analysis. The BaseAnalysis class provides additional methods for statistically combining analyses (either equivalent or orthogonal), and for specifying the effective luminosity simulated in the Monte Carlo step.

LHC statistics calculations
To determine the basic likelihood of observing n events in a certain signal region, given a signal prediction s, we use the marginalised form of Eq. (1) [92][93][94]. This allows us to incorporate systematic uncertainties on the signal prediction (σ s ) 6 and background estimate (σ b ) into the calculation, by marginalising over the probability distribution of a rescaling parameter ξ : Note that the use of a single rescaling parameter is an approximation to avoid the need for a time-consuming 2D integration. The probability distribution for ξ is peaked at ξ = 1, and has a width characterised by σ 2 ξ = σ 2 s +σ 2 b . The user can choose whether to assume a Gaussian form for this function, or a log-normal form, The ColliderBit default is to use the log-normal version. This is slower but more correct, as it does not permit a finite probability for ξ = 0. In the limit of small σ ξ , both likelihoods give extremely similar results. We use the highly optimised implementations of these functions contained in nulike [94,95]. The steps we have described so far allow ColliderBit to calculate the predicted number of events in any given signal region, defined by a specific set of observables and kinematic cuts, and to compute the likelihood for that region. However, certain ATLAS and CMS analyses make use of multiple signal regions, allowing analysis cuts to be optimised according to the specific characteristics of each model being tested. These signal regions may overlap, and so contain events in common. The likelihood functions from overlapping signal regions are therefore not independent. Ideally, information would be available from the experiments about the degree to which this overlap occurs, which would allow GAMBIT analyses to include all signal regions and their correlations in the final likelihood for a given analysis.
As this information is not presently available for most analyses, ColliderBit computes the likelihood for a given analysis on the basis of the signal region expected to give the strongest limit. It does this individually for each model, by calculating the expected number of events for every possible signal region considered in the the original ATLAS or CMS analysis. It then chooses the signal region with the maximally negative log-likelihood difference This difference is the log of the likelihood ratio between the signal plus background and background-only predictions, assuming that the observed counts match the background expectation. To calculate the likelihood for the analysis in question, ColliderBit then computes the likelihood of the actual data in the chosen signal region, and takes the difference with respect to the background-only expectation in that region, giving an effective log-likelihood It is necessary to define the effective log-likelihood in this way because of the selection step between different signal regions. Signal regions can in principle differ markedly in their number of analysis bins and expected numbers of events, leading to very different effective likelihood normalisations. Because of this, choosing the signal region on a per-model basis and then adopting the raw log-likelihood from the selected signal region would introduce erroneous model-to-model likelihood fluctuations. Taking the difference with respect to the background prediction not only removes the differing (but model-independent) offsets to the log-likelihood from the different signal regions' typical count rates, but also reduces the effective degrees of freedom of the resulting likelihood, from N (the number of analysis bins) to just one. This puts effective likelihoods from all signal regions on the same footing, and allows them to be compared correctly across different points in parameter space. This is a conservative approach, but it is the best possible treatment when one lacks sufficient information to handle correlated data and systematic uncertainties. When such Table 3 The published ATLAS cutflow for the 2 jt signal region taken from Ref. [49], which searched for squarks and gluinos in events with jets and missing transverse momentum. The cutflow is generated for a squark pair-production simplified model (in which a pair of squarks is produced with direct decay to a quark and a lightest neutralino, all other sparticles being decoupled), with mq = 1000 GeV and mχ0 1 = 100 GeV. This is compared with the GAMBIT cutflow obtained using Pythia 8 and BuckFast. Shown are the efficiencies for passing each cut (second and third columns), and the ratio of efficiencies (fourth column). For reference, we also show the cutflow results obtained using the Check-MATE package (taken from their public validation results) [29] Cut information is made available (ideally in a standardised format), ColliderBit will use it for a more complete likelihood calculation. Refs. [96,97] are indeed very welcome recent steps in this direction.
To construct a compound likelihood from different analyses, we assume that all analyses have been chosen to be orthogonal, in the sense that they have disjoint selection criteria and no single event could contribute to the signal region counts of multiple analyses. This means that their effective log-likelihoods can be straightforwardly summed; Collider-Bit does this for all analyses selected by the user, and returns the result to the GAMBIT Core as a final, combined LHC log-likelihood. It is the responsibility of the user to ensure that they only select mutually orthogonal analyses for combination in a ColliderBit run.

Validation of ColliderBit LHC constraints
We verified the ColliderBit LHC simulation and analysis chain by comparing cutflows for representative model parameter points against those published by the LHC experiments. Note that we use Pythia 8 and BuckFast for these compar-isons, so we expect to see the agreement degrade in cases where effects not included in this chain become important, e.g. for compressed spectra, where a more appropriate treatment of initial state radiation is important.
Three sample cutflows are presented in Tables 3, 4 and 5, for a jets + MET search, 2b + MET search and a dilepton + MET search, respectively. These show close agreement for most signal regions, rising to no more than ∼ 50% discrepancy in the worst case. These are a representative choice of sample cutflows for all signal regions considered. For reference, we also show the publically available Check-MATE cutflows where these are available. These confirm the expectation that BuckFast gives respectable performance, but does not match the ATLAS cutflows as closely as the CheckMATE package, which runs a heavily tuned version of the Delphes detector simulation. The compromise in performance in BuckFast is of course compensated for by the two-fold increase in speed, resulting from the quicker simulation step, plus the fact that it can be parallelised since it does not rely on the ROOT framework.
To illustrate the effect of changing the Pythia 8 settings on the physics performance of ColliderBit, we show the ATLAS Table 5 The published ATLAS cutflow for Model 1 in Ref.
[46], a search for new physics in events with two leptons and missing transverse momentum. This is compared with the GAMBIT cutflow obtained using Pythia 8 and BuckFast. Shown are the numbers of events expected in 20.1 fb −1 of 8 TeV ATLAS data, and the ratio of the GAMBIT and ATLAS numbers. Note that for the GAMBIT numbers, we used the same value of the SUSY production cross-section as that assumed in the ATLAS cutflow (and thus our cutflow does not include the effect of the LO cross-section that we use in our SUSY scans)  Table 6. This should be the most strongly affected cutflow, since the settings are all relevant for jet physics. In fact, we observe only a slight degradation of the cutflow performance as various approximations (e.g. removing hadro-nisation and FSR) are made, which validates the removal of certain Pythia 8 features in the interests of speed. We caution that for models with compressed particle decays where the effects of final state radiation may become more important, this conclusion is not expected to hold, but a thorough investigation is clearly physics-dependent and beyond the scope of this paper. In Fig. 3, we compare the observed ATLAS Run 1 zero lepton CMSSM 95% CL exclusion limit in the m 0 -m 1/2 plane from [49] with a GAMBIT ColliderBit scan performed with the same model. Here m 0 and m 1/2 are free parameters, tan β = 30, A 0 = −2m 0 and μ > 0. Since there are only two free parameters we perform a simple grid scan with 50 ×50 grid points. The ColliderBit likelihood includes only the LHC likelihood contribution, which in turn uses only the ATLAS zero lepton analysis, with 20,000 MC events generated per parameter point. The white solid line show the 95% CL exclusion contour, defined by the likelihood ratio L/L max = 0.05. For comparison, the observed limit from the ATLAS analysis is plotted as a solid blue line, with dashed blue lines showing the reported ±1σ theoretical uncertainty on this limit.
The ColliderBit exclusion limit is more conservative than the ATLAS result, as expected from the different crosssections used (LO for ColliderBit, NLO + NLL for the ATLAS result). We have checked how our limit would change with NLO + NLL cross-sections from NLL-fast 2.1 for a number of points close to the observed ATLAS limit. In the region where m 0 m 1/2 we find close agreement with the ATLAS limit; the rescaled ColliderBit limit ends within the uncertainty band of the ATLAS limit. In the low-m 0 part of the plane, where m 0 ∼ m 1/2 , we see a somewhat larger discrepancy with the observed ATLAS limit also after rescaling our results with NLO + NLL cross-sections. The reason for this discrepancy is that ColliderBit here differ from the ATLAS analysis in what signal region is predicted to have the best expected sensitivity. ATLAS uses the 4-jet region 4jt while ColliderBit chooses the 3-jet region 3j. Coinciden- Table 6 Reproduction of the same ATLAS 0 lepton cutflow as Table 3, with each column representing different Pythia 8 settings. The baseline in the final "None" column has tau spin correlations turned off (since they have no effect for SUSY models in any case), ISR turned on, and hadronisation (HAD), FSR and multiple parton interactions (MPI) turned off. Pythia 8 is configured to produce light squark pairs only, and the parton-level events are reconstructed with the parton BuckFast settings. The first three columns add hadronization, FSR and multiple parton interactions. It is worth noting that none of these configurations match the cutflow configuration in Table 3  tally, the 4jt region observes a small downwards fluctuation relative to the background expectation, leading ATLAS to set stronger limits than expected, while at the same time there is a small upwards fluctuation in the event count for the region 3j, giving a weaker ColliderBit limit than expected. In this part of parameter space the squarks and gluinos are rather close in mass, implying that some of the jets will be soft. The choice between a 3-jet and a 4-jet signal region is therefore likely to be sensitive to the details of jet handling in the different event generators used (HERWIG++ 2.5.2 for the ATLAS result and Pythia 8.212 for ColliderBit). We note that if we by hand force ColliderBit to use the 4jt region, our limit again agrees nicely with the ATLAS limit after NLO + NLL scaling. For instance, our rescaled limit at m 0 = 800 GeV moves up to m 1/2 = 780 GeV.
In the above 50 × 50 grid scan of the m 0 -m 1/2 plane we only generated 20,000 MC events per parameter point. This scan completed in less than 80 minutes using 48 CPUs. Clearly, for a low-dimensional scan like this one can afford a much higher number of events per point to reduce the MC uncertainty. But for large global fits in many-dimensional parameter spaces 20,000 events may be a realistic trade-off between speed and accuracy. In Fig. 4 we show a colour map of the relative MC uncertainty, √ n s /n s , across the m 0m 1/2 plane for the ColliderBit simulation of the ATLAS zero lepton search using 20,000 events per point. Here n s is the number of accepted MC events for the signal region chosen by ColliderBit for the given parameter point. As in Fig. 3, the white line depicts the 95% CL exclusion contour obtained with 20,000 generated events. For comparison the cyan line shows the limit obtained when generating 100,000 events per point. We see that for the signal regions used to calculate the likelihood for this analysis, the relative MC uncertainty stays below 13% in the parameter regions around the exclusion limit. At large m 0 and m 1/2 the uncertainty increases as we approach the production threshold. (The apparent cut-off at √ n s /n s ∼ 0.25 is due to the grid step size). In contrast to the simple grid scan presented in Figs. 3 and 4, a large global fit will typically involve sampling millions of parameter points. Limited MC event statistics then increases the chance of a few points ending up with a spurious good likelihood due to MC fluctuations. This may in particular affect the result of a frequentist analysis where the preferred parameter regions are determined relative to the best-fit point, as with the likelihood ratio L/L max used above. A spurious good likelihood L max for the best-fit point will result in a falsly strong constraint on the preferred parameter regions. One simple way to ensure conservative and more stable limits in a large scan is to cap the ColliderBit effective log-likehood in Eq. (7) at the value given by the backgroundonly expectation (s = 0), i.e. to force ln L eff ≤ 0. This then becomes an "exclusion only" likelihood, as all points where s > 0 gives an improved fit to the data are assigned the likelihood corresponding to s = 0. Of course, this method is not appropriate if the aim of the parameter scan is to fit the model to a potential new signal in the data. The approach of capping the effective likelihood was used for the result shown in Fig. 3. We also apply it in the analysis shown in Fig. 5. This gives an example of a typical use-case for ColliderBit, in which multiple LHC searches are included in the combined LHC likelihood and a larger parameter space is scanned. Here we show the CMSSM 95% CL exclusion limit in the m 0 -m 1/2 plane, following a scan of m 0 , m 1/2 , tan β, and A 0 using Diver [39] with GAMBIT production settings (convthresh = 10 −5 and NP = 19200), with the SM parameters held to their default values. All of the LHC Run I analyses listed above are included in the LHC combined likelihood, and no other likelihoods are used. One obtains an exclusion contour of similar shape to the ATLAS zero lepton limit for fixed A 0 and tan β, but it is shifted to lower values of m 1/2 .

LEP likelihood calculation
Despite the huge improvement in lower limits provided by high-energy LHC data, limits from direct searches at the LEP experiments are still important for some BSM models. This is true in particular for SUSY models that only have significant production of slepton or neutralino/chargino pairs with masses below half the maximum LEP centre-of-mass energy. In most available codes, LEP limits from direct searches take the form of hard lower limits on sparticle masses, at e.g. 95% CL. This is how such limits are implemented in Dark-SUSY [98] and micrOMEGAs [99], for example. Such lim-its generally rely on model-dependent assumptions, which are not always clearly stated.
As an example, in  [100]. However, if one looks closely at the details of these limits there are indeed remaining model assumptions, e.g. the ALEPH experiment assumes μ = −200 GeV and tan β = 2 for the production cross-section, and a branching ratio BR(ẽ R → eχ 0 1 ) = 1. In contrast, in an analysis of the same data using a more general MSSM parameter space (but still assuming gaugino mass unification, scalar mass unification, no slepton mixing, and negligible squark mixing), the selectron mass limit becomes 73 GeV [9]. The weakening of this limit is due to possibile cascade decays of the selectron.
In ColliderBit we take a different approach, which is free from model-dependent assumptions, using the direct crosssection limits for sparticle pair production of sleptons, neutralinos and charginos at LEP. Our approach includes not only model-dependent effects in the cross section, but also in the decay rates, where we make no assumptions on the branching ratios, relying instead on an explicit calculation.
Continuing the example of the selectron case, we now discuss how we model the cross-section limit for slepton pair production and decay into the lightest neutralino from the L3 experiment. This has been given as a function of the selectron and neutralino mass in Fig. 2a of Ref. [6], which we reproduce here in Fig. 6 for demonstration. Corresponding results for smuons and staus are used in the same manner. These results cover slepton masses from 45 GeV up to the kinematic limit of 104 GeV, with neutralino masses from zero up to the slepton mass. For a particular model point the theoretical slepton pair production cross-section is calculated in a separate routine. This uses leading order results on the crosssection taken from Refs. [101,102], which includes t-channel contributions from neutralinos. We treat contributions to a possible signal cross-section fromẽ * LẽL andẽ * RẽR pair production separately, taking into account the relevant branching ratios for the decay to the lightest neutralino using Decay-Bit [40], which can be interfaced to, e.g., SUSY-HIT [103]. Hereafter, we refer to this cross-section times branching ratio as σ × BR.
We estimate the dominant theoretical uncertainty on σ × BR using the mass uncertainties of the sleptons, as reported by SpecBit [40]. For slepton mass values of m 1 ± δ 1 and m 2 ± δ 2 , we calculate the central value of σ × BR for m 1 and m 2 . Then, we recalculate σ × BR with the upper and lower mass values and use the maximum and minimum of these as estimates for the overall σ × BR uncertainty.
Once the σ ×BR has been calculated this way, we can look up the appropriate limit with which to compare from Fig. 6.  Fig. 6 Example of the limit interpolation process, based upon Fig. 2a of Ref. [6]. The line segment OP is used to find the intersection points A 1 , A 2 , and A 3 , which determine that P is within the 0.06 pb limit. Then, for each angle α ∈ [0, 2π ], the line segments PQ 1 and PQ 2 contribute to the weighted average limit at the point P. More details of this procedure are described in the main body of text We do this by digitising each cross-section limit contour, and using inverse distance-weighted interpolation [104] to estimate the cross-section limits in regions between contours. The weighted averaging prevents the noisiness of the LEP limit curves from strongly influencing the interpolant (an advantage over e.g. spline or bilinear interpolation), whilst at the same time forcing the interpolant to exactly reproduce the published cross-section contours (an advantage over e.g. data smoothing algorithms).
Our algorithm works as follows. Given a point on the mẽ-mχ0 1 plane, such as point P in Fig. 6, we first determine which contours contain this point. This can be achieved by drawing a line segment from this point to any point O outside of the plot. Then, for each contour, if this line segment OP intersects the contour an odd number of times, say at A 1 , A 2 , and A 3 , then the point lies within the contour. Using this method, we find the two limit contours, 0.06 pb and 0.03 pb, between which the point P lies. Next, for a large number of angles α ∈ [0, 2π ], we draw a line segment PQ 2 from P to where it intersects the outer limit of 0.06 pb. We also note if this line segment intersects the 0.03 pb contour, such as at Q 1 . If the point lies directly on top of one of the contours, we simply take that contour as the correct limit. If not, we calculate the limit as a weighted average of all bounding cross-section limits over all angles. We weight each cross-section limit sample by PQ − p , where PQ is the length of the line segment PQ, and p is the so-called 'power' parameter of the inverse distanceweighted interpolation algorithm. We choose p = 0.5, to avoid artificially endowing the interpolating functions with local minima and maxima around the sample points, a known shortcoming of the algorithm for choices of p greater than or equal to 1. The results of this interpolation proceedure can be seen in Fig. 7, which shows the interpolated 95% confidence limits for all of the results from the L3 experiment that we use here. In particular the top left plot can be compared directly to Fig. 6.
Comparing the values of σ × BR to the 95% confidence interpolated limit drawn from the digitised limit plot, we can now calculate the likelihood using the error function and the estimated theoretical uncertainty on σ × BR.
To increase the constraining power of the direct LEP searches, we also use the corresponding cross-section limits set by the ALEPH experiment, calculating a second likelihood in the same manner. For this, we consider the searches for scalar leptons in the same mass range, using the modelindependent results of Fig. 3 in Ref. [8]. We treat the data of the two experiments as independent. 7 In Fig. 8 we show the ColliderBit exclusion limits from the combination of ALEPH and L3 searches for slepton pair production, in the CMSSM (m 0 , m 1/2 )-mass plane for two different values of tan β. The results in Fig. 8 can be compared to the corresponding CMSSM exclusion limits from ALEPH alone given in [9] (dashed lines). We have checked that the observed differences are mainly due to the higher constraining power of the two experiments combined, with some remaining unavoidable differences caused by the RGE codes used for the spectrum generation.
We take similar limits for the neutralino and chargino pair production cross-sections, with decays into the lightest neutralino, from searches by the OPAL and L3 experiments. The corresponding theoretical leading-order cross-sections are from Refs. [105] and [106], again taking into account the relevant branching ratio for each model point. For neutralino pairs, the limits are set onχ 0 2χ 0 1 production with subsequent decay of theχ 0 2 . We take OPAL results from Fig. 9 in Ref. [12], which applies to hadronic decays, giving bounds for mχ0 2 from 100 GeV to the kinematic limit of 204 GeV, while mχ0 1 ranges from zero to mχ0 as a function of the produced sparticle mass and the mass of the lightest neutralino. This interpolation is based on results by the L3 experiment at LEP [4,6] For chargino pair production, the OPAL experiment [12] sets limits on hadronic, semi-leptonic and leptonic decays separately in Figs. 5, 6 and 7 of that article. The limits are set from a chargino mass of 75 GeV up to the kinematical limit of 104 GeV, and for neutralino masses from zero up to the chargino mass. For each channel we take into account the branching ratios BR(χ ± 1 → qq χ 0 1 ) and BR(χ ± 1 → lνχ 0 1 ) of the model point. We use an older, compatible, limit from the L3 experiment on fully leptonic decays taken from Fig. 2b of Ref. [4]. This extends from 45 GeV chargino masses up to a kinematic limit of 94.5 GeV. Unfortunately, the L3 experiment does not give separate model-independent cross-section limits for the other two channels. Our interpolation from the L3 results is again shown in Fig. 7 (bottom left).
We also include results on chargino and neutralino pair production from both OPAL (Fig. 8 in Ref. [12]) and L3 (Figs. 2a and 3a in Ref. [4]), where the limits assume that the fermions inχ ± 1 → f f χ 0 1 andχ 0 2 → ffχ 0 1 are represented as per the normal W and Z branching ratios into two fermions. This must be used with some care, as light sfermions may affect the assumption.
We note that while the experimental limits have been set onχ 0 2χ 0 1 andχ + 1χ − 1 production, the ColliderBit likelihood can be calculated for production of anyχ 0 iχ 0 1 andχ + iχ − i , as long as we consider the same experimental signature in the decay. Again, we show the resulting exclusion limits in the CMSSM in Fig. 8. Here we observe very good agreement with earlier ALEPH results on chargino pair production [9], and we have checked that the difference is dominated by differences in the RGE codes used. GAMBIT relies on Flex-ibleSUSY [57], while the ALEPH analysis was carried out with ISASUSY 7.51 [107].

Higgs likelihood calculation
ColliderBit includes likelihoods relating to constraints on extended Higgs sectors from collider experiments, and to measurements of the SM-like Higgs mass and production cross-sections at the LHC. These likelihoods are provided through an interface to HiggsBounds [33,34] and Hig-gsSignals [35].
Although constructing a likelihood from null search results at colliders generally requires event simulation, the information provided by the combined LEP Higgs search results [108] allows for the construction of an approximate likelihood for neutral Higgs bosons. HiggsBounds interpolates the full CL s+b distribution from the combined modelindependent LEP searches, for all Higgs mass combinations, With the direct observation of an SM-like Higgs boson [16,17], measurements of the new particle's mass, production cross section, and branching ratios can be used to constrain the neutral Higgs sector of BSM models. In channels where measurements of the neutral boson's mass are available, Hig-gsSignals calculates contributions to the mass likelihood as a χ 2 , taking into account both experimental and theoretical uncertainties. For each channel, it minimises the χ 2 independently over the possibility of each neutral state in the Higgs sector being responsible for the signal, including the simultaneous appearance of multiple resonances if they are nearly degenerate in mass. For signal strengths, it uses measurements over all available channels to construct a single χ 2 , using the associated N meas -dimensional covariance matrix to account for reported experimental uncertainties, including correlations due to common channels between experiments and the uncertainty in the integrated luminosity. It then combines this signal-strength likelihood with the mass likelihood to form a combined LHC neutral Higgs sector likelihood.
For both the LEP and LHC likelihoods implemented in ColliderBit, the theoretical masses (with uncertainties, when available), couplings and branching ratios come from other GAMBIT modules, namely DecayBit and SpecBit [40]. In particular, we use the Higgs couplings provided via HiggsCouplingsTable objects from SpecBit to estimate the neutral Higgs boson production cross sections. We calculate the ratios of the production cross-sections for each Higgs in a given BSM theory to an SM Higgs of the same mass, assuming them to be given by the ratio of squared couplings for the relevant processes.

User interface
The GAMBIT code consists of a series of separate code modules that calculate likelihoods for new physics models using data from flavour physics [41], astrophysics [42], electroweak precision physics [40] and collider physics (the present paper). These modules can be used as standalone tools (using a custom C++ driving code), or they can be used via the GAMBIT core framework that resolves dependencies between calculations, and steers scans with the aid of a dedicated scanning and statistics module [39]. The advantage of using the latter is that it is by far the easiest way to define models, calculate spectra and perform decay width calculations.
There are thus two ways to take advantage of the highenergy collider likelihoods provided by ColliderBit: either via the GAMBIT framework or by interfacing to ColliderBit as a standalone tool. Here we describe each in turn.

GAMBIT interface
The GAMBIT framework [1] defines two sorts of function that can be used by each module within the framework: In GAMBIT, each module function is given a tag, called a capability, that describes what it can calculate, be it an observable, e.g. the number of events expected at the LHC, or a likelihood, e.g. the combined likelihood of a set of LHC searches. Module functions may also have dependencies on other module functions -which may live either in the same module or in another GAMBIT module -or backend requirements that are satisfied by backend functions or backend variables. A concrete example from ColliderBit is the capability of the combined LHC likelihood calculation (Table 7), which has dependencies on the numbers of events expected in CMS and ATLAS searches, and has a backend requirement relating to the functional form of the likelihood required.
ColliderBit interfaces with the GAMBIT Core to communicate its capabilities, dependencies, and backend requirements. The Core then runs its dependency resolution routine to connect and execute the module functions in the order that fulfils all the dependencies. As most of this machinery is described in the main GAMBIT paper [1], in this section we shall simply describe each of the ColliderBit capabilities and as their uses.

LHC simulation capabilities
These capabilities are grouped within ColliderBit into three categories, which correspond to the three main steps of simulation: collider, detector, and analysis. There are also two additional capabilities needed to complete a collider simulation. One is a capability meant simply to control the parallelisation and execution of the event generation loop (ColliderOperator), and the other calculates the likelihood as a final result (LHC_Combined_LogLike). These two capabilities are shown in Table 7. Tables 8, 9, and 10 show the collider, detector, and analysis capabilities, respectively.
Since there can be a variety of configurations for collider simulation and a variety of experimental analyses, we designed these components so new configurations and analyses could be easily added.
For instance, a user who wishes to add a new Pythia 8 configuration must only complete the following steps:

Create a SpecializablePythia initialisation function.
These functions are defined in the file colliders/Speci alizablePythia.cpp. 9 Each such function must have its own namespace and a call signature of void init (SpecializablePythia* specializeMe).
Within the init function, settings can be sent to Pythia 8 as strings using the SpecializablePythia::addToSet tings function. For example, the following would be a valid init function: 9 Within this section, all header files (*.hpp) mentioned are found in the ColliderBit/include/gambit/ColliderBit directory, while source files (*.cpp) are found in the ColliderBit/src directory.  (Table 10) nulike Table 8 The collider capabilities provided by ColliderBit. In addition to the dependencies shown above, all of these functions also depend on the ColliderOperator capability in Table 7 (Table 7). The SLHA_filenames option is a list of the SLHA files that the user wants to run using getPythiaFileReader.
Finally, the xsec_vetos option specifies limits on the maximum total cross-section (in fb), as estimated by Pythia at the beginning of a run, below which the simulation should be skipped. One cross-section limit can be set per Pythia  Table 9 The detector capabilities provided by ColliderBit. In addition to the dependencies shown above, all of these functions also depend on the ColliderOperator capability in Table 7

Table 10
The analysis capabilities provided by ColliderBit. In addition to the dependencies shown above, all of these functions depend on the ColliderOperator capability in Table 7,  provides a list of CMS analyses within a container that is ready to apply them to an event HardScatteringSim ( provides a list of "identity" analyses (no detector smearing) within a container that is ready to apply them to an event  (Table 8) IdentityAnalysis-Container namespace Pythia_ttbar_LHC_13TeV { void init(SpecializablePythia* specializeMe) { specializeMe->addToSettings( "Beams:eCM = 13000"); specializeMe->addToSettings( "Top:qqbar2ttbar = on"); specializeMe->addToSettings( "Top:gg2ttbar = on"); } } This would initialize a Pythia 8 configuration named Pythia_ttbar_LHC_13TeV to simulate tt production through qq → tt and gg → tt at 13 TeV. Init functions may also "inherit" from other existing init functions by explicitly calling them. For instance: TeV, which sets the energy to 13 TeV and turns on the qq → tt and gg → tt processes. Then follows four calls to addToSettings that switch on the simulation of the additional production processes qq → tq (t-channel W exchange), ff → tt (s-channel γ /Z exchange), ff → tq (s-channel W exchange) and γ γ → tt. 2. Add the initialisation function namespace within Speci alizablePythia::resetSpecialization. This function is located at the end of colliders/Specializable Pythia.cpp, and it allows for runtime selection of the Pythia 8 specialisation via a std::string. The namespaces for the new init functions must be included here using the IF_X_SPECIALIZEX macro. Thus, for our example top production init functions above, we would add: For example, to activate our "alltop" Pythia 8 configuration, this would be the YAML entry: The last of these options removes much of the output from the Pythia event generator. We may also supply our chosen configuration with additional options right in the YAML file. We do this in the rules for the getPythia module function: We recommend registering commonly-used Pythia 8 configurations in colliders/SpecializablePythia.cpp as described above. However, it is also possible to set up custom Pythia configurations directly in the YAML file by specifying all relevant Pythia options there. In this case the configuration name given in pythiaNames must not match any of the registered init functions in colliders/SpecializablePythia.cpp.
For each Pythia configuration, the choice of analyses and detector simulations can be varied. Many of the options detailed in Tables 8, 9 and 10 are therefore vectors expecting one element per Pythia configuration, in the same order as the configurations in pythiaNames.
Adding a new analysis is nearly as simple as adding a Pythia 8 configuration. An annotated minimal example is given in analyses/Analysis_Minimum.cpp. This contains the minimum required to print out the number of jets, b-jets and leptons, and the missing energy in each event, plus pass an arbitrary set of signal region cuts. To add a new analysis: Although the current framework only supports cut-andcount analyses, the user could easily add more complicated likelihoods by adding new module functions.

LEP supersymmetry limit capabilities
ColliderBit contains functions that calculate the crosssection for various SUSY particle productions within the context of the LEP collider. The capabilities for these functions are described in Table 11. Using these functions along with SUSY particle decay information, we calculate the crosssection times branching ratio for each production mechanism associated with LEP model-independent limits. The capabilities and functions that compare this calculation with each LEP limit are described in Tables 12 and 13.

Higgs likelihood capabilities
ColliderBit provides likelihoods from experimental searches for Higgs bosons at LEP and the LHC, through interfaces to HiggsBounds and HiggsSignals. The capability LEP_Higgs_LogLike is provided by the function calc_HB_LEP _LogLike, which uses HiggsBounds to calculate an approximate likelihood constructed from the results from searches for neutral and charged Higgs bosons at LEP. Similarly, capability LHC_Higgs_LogLike is provided by function calc_HS_LHC _LogLike, which employs HiggsSignals to compute a likelihood including constraints from measurements of the Higgs boson production rates and mass at the LHC. These functions are detailed in Table 14, along with their dependencies.
Both functions depend on being provided with a Hig-gsBounds/Signals-specific data object containing all the input parameters needed to run either of these two external codes. ColliderBit constructs one of these objects from the Higgs_Couplings provided by SpecBit. There are separate functions to do this for a pure SM Higgs, an MSSM Higgs sector with three neutral and one charged Higgs, and a Higgs sector containing just one SM-like Higgs and possible invisible states for it to decay to, as in the scalar singlet and other such singlet Higgs portal models (e.g. [109,110]).

Standalone interface
As described in [1], GAMBIT routines can be called in a standalone code provided that the code specifies the module functions and backend functions that the user requires, along with any necessary options. In addition, the user must resolve the dependencies of each module function "by hand".
An annotated example program for running ColliderBit independently of the GAMBIT framework can be found in ColliderBit/examples/ColliderBit_standalone_ example.cpp. This example uses ColliderBit with a custom version of Pythia (8.212.EM) to calculate the LHC likelihood for a simple BSM model, with the required couplings, masses and branching ratios input via an SLHA file. The details of how to connect the custom Pythia version to ColliderBit and run the standalone are given in Sect. 4.2. Here we go through the structure of the code in ColliderBit_standalone_example.cpp.
The program consists of three main parts: dependency resolution, configuration of ColliderBit and Pythia, and execution of the simulation loop plus calculation of the LHC loglikelihood. It is the second part that the user typically will want to edit, as this is where the settings for event generation and detector simulation are specified, along with which LHC analyses to include.

Table 11
The capabilities provided by ColliderBit that calculate SUSY particle production cross-sections within the context of the LEP collider. All of these functions return a triplet of doubles.
These correspond to the maximum, central, and minimum cross-sections calculated while varying the SUSY particle masses according to their estimated uncertainties. All of these functions depend on GAMBIT's MSSM30atMGUT  Then the getPythiaFileReader function can be configured with vectors of settings for the two Pythia configurations. For In ColliderBit_standalone_example.cpp the path to a single input SLHA file is taken as a command line argument and stored in a variable inputFileName, which can then be passed to getPythiaFileReader: getPythiaFileReader.setOption<vstr>( "SLHA_filenames", vstr {inputFileName}); Finally, we choose which detector simulators and LHC analyses to include. To use the ATLAS configuration of BuckFast with both "Pythia_EM_8Tev" and "Pythia_EM_13 Table 13 The gaugino LEP limit capabilities provided by ColliderBit. In addition to the dependencies shown above, all of these functions also depend on GAMBIT's MSSM30atMGUT [1] model parameters and SpecBit's MSSM_spectrum [40] capability. Each of the decay_rates can be provided by DecayBit [40]. These functions have no options to be specified in the YAML file. Note that the All_Channels likelihoods assume that the neutralino or chargino decay to fermions follows the same branching pattern as the corresponding on-shell gauge boson, and should be used with care The names of the ATLAS analyses to include are then passed to the function getATLASAnalysisContainer. Here we include the 0-lepton searches at 8 and 13 TeV: Provides estimated MSSM Higgs production cross sections through an interface to FeynHiggs

FeynHiggs
Note that the two analyses are given in separate subvectors, one for each Pythia configuration (see Table 10). CMS analyses are similarly included by configuring getBuckFastCMS and getCMSAnalysisContainer. In our example, we only use a Run I CMS analysis, which is therefore only applied to the 8 TeV Pythia configuration. The full LHC simulation loop and likelihood calculation is run in the third part of the main program, by executing the ColliderBit functions operateLHCLoop and calc_LHC_LogLike: operateLHCLoop.reset_and_calculate(); calc_LHC_LogLike.reset_and_calculate();

CMSSM example
An annotated example of a YAML file for scanning the CMSSM with GAMBIT using only functions from Collider-Bit is provided in yaml_files/ColliderBit_CMSSM.yaml. The file demonstrates how to specify the model parameters (and priors), choose and configure a sampler, choose a printer (either hdf5 or ascii), run the LHC and LEP collider likelihoods, run the HiggsBounds and HiggsSignals Higgs likelihoods, and configure details of the detector simulation and Monte Carlo event generator.

Generic Pythia model example
The recommended method of using ColliderBit with a new model is to define and run the model within the full GAM-BIT framework, allowing access to the model declaration and scanning routines, in addition to non-collider likelihood functions should these be of interest. However, if the user only wants to check single parameter points with Collider-Bit, the standalone interface described in the previous section presents a more minimal alternative. Regardless of which interface is used, ColliderBit must be set up to work with a version of Pythia that can generate events for the new model. Here we go through an example of how to achieve this.
Our physics model example consists of the SM augmented by a new scalar singlet field φ 1 and a new, coloured Dirac fermion U . The model is a stripped down version of that featured in [111], which contains a complete tutorial for how to implement the model in Monte Carlo generators. The new particles have the following mass terms: The new fermion interacts with the new scalar via the Lagrangian term where u is the SM up-quark field. We will simulate the process pp →ŪU where the U subsequently decays via U → uφ 1 . To use this model with ColliderBit, we make use of the MadGraph5_aMC@NLO-Pythia 8 interface to generate matrix element code that can be used to supplement the internal processes in Pythia. Sample Mathematica notebook and Feynrules model files for generating UFO output are provided in ColliderBit/data/ExternalModel. The MadGraph5_aMC@NLO commands for generating matrix element code for coloured fermion production in proton collisions are as follows, assuming that the UFO model has been placed in the MadGraph5_aMC@NLO models directory: import model GambitDemo_UFO generate p p > uv uv ∼ output pythia8 The resulting C ++ code can be found in the src and include subdirectories of This instructs Pythia to produceŪU pairs in proton collisions, but they will not decay unless instructed to do so via the input SLHA file. An example SLHA file generated with MadGraph5_aMC@NLO is provided in ColliderBit/data/ExternalModel_point.slha. This file contains a decay table for the U particle with a 100% branching ratio to an up quark and a φ 1 . We remind the reader that the standalone example is only intended as a minimal way of running single points of a new model through ColliderBit. For a comprehensive study, including scanning over model parameters, the user should add the model in the GAMBIT model database and implement spectrum and decay calculations through the GAMBIT modules SpecBit and DecayBit as required.
Finally, there is an important subtlety regarding invisible particles. At the time of writing, the default PDG ID codes of new particles in Feynrules do not always correspond to those of invisible, uncharged particles. In the ColliderBit simulation chain, this means that the particles will not appear as missing energy in the detector simulation. According to the PDG ID code standard, invisible particles may have a PID of 12, 14 or 16 (SM neutrinos), 1000022 (lightest neutralino in a superysmmetric model), or 50-60 (for generic new BSM particles). The user can thus obtain correct behaviour for an invisible species by including the PDG code definition in the Feynrules field definition as in the following example:

Conclusions
ColliderBit is a new modular software code for the application of high-energy collider constraints to generic BSM physics models, written in the GAMBIT framework. This paper serves as an introduction to the code, and as a reference manual for users wishing to add new analyses or features.
The code provides a rigorous and fast implementation of LHC constraints through a parallelised Monte Carlo simulation interfaced with several detector simulation options, including a new simulation based on four-vector smearing. A custom event analysis class allows the user to apply the same LHC analysis code to any level of detector simulation, and we supply likelihood routines capable of reproducing LHC cut and count searches, or binned shape fits. An interface to the Pythia 8 event generator allows the user to add matrix elements for new models.
LEP constraints are handled via a new code based on a sophisticated interpolation of the cross-section limits on slepton, neutralino and chargino pair production. Higgs limits, for both LEP and the LHC, are currently handled via an interface to the HiggsSignals and HiggsBounds packages, but there exists scope to provide and interface new likelihood calculations in future ColliderBit releases.
The code can function either as a standalone tool for quick checks of specific model points, or it can be run within the GAMBIT framework to provide a complete tool for BSM inference from high energy collider data.  The addition of custom detectors to ColliderBit involves subclasses of the BaseDetector class, which is templated on both the type of event that it can accept for simulation (EventIn), and the type of event that it will return after detector simulation (EventOut). In a similar way as described above for colliders, a user may add a fully custom detector by creating a subclass of BaseDetector<EventIn, EventOut> and writing overrides for the virtual functions, as described in Table 16.
The analysis base class, BaseAnalysis, is templated on the type of event that it can analyze (EventT). Subclasses of BaseAnalysis<EventT> inherit the functions as described in Table 17, some of which must be overridden by the subclass author. Existing analyses provide examples of how to do this.
The addition of custom limits and limit curve interpolation to ColliderBit requires that the user declare new module functions in ColliderBit_rollcall.hpp and define them in ColliderBit.cpp. However, if the user wishes to use ColliderBit's limit interpolation system (as described in Sect. 2.2), they can create a subclass of the BaseLimitContainer class and override the functions shown in Table 19. Functions that are virtual and marked with a perform only simple operations on variables in the base class, unless they are overridden by the subclass author. Functions that are virtual and marked with b must be overridden by the subclass author. Non-virtual functions are not intended to be overridden. Analysis results are contained within each subclass in a SignalRegionData instance, which is described in Table 18 C++ function signature Improve the stored cross section by averaging it with the given one, and recompute the uncertainty Table 18 Member variables of the SignalRegionData struct, which is used in the BaseAnalysis class as a container for the analysis results of each signal region. The BaseAnalysis class is described in Table 17 C++ Member Variable Intended purpose std::string analysis_name The name of the analysis that contains this signal region The absolute systematic error of n_background Functions that are virtual and marked with b must be overridden by the subclass author. Non-virtual functions are not intended to be overridden. Note that P2 is a simple class that represents a point found on the limit plane and is defined in the header file Creates a file, filename, containing the limit contour data as a series of points, using nperLine points for each line of the contour