Measurement of the $t\bar{t}t\bar{t}$ production cross section in $pp$ collisions at $\sqrt{s}$=13 TeV with the ATLAS detector

A measurement of four-top-quark production using proton-proton collision data at a centre-of-mass energy of 13 TeV collected by the ATLAS detector at the Large Hadron Collider corresponding to an integrated luminosity of 139 fb$^{-1}$ is presented. Events are selected if they contain a single lepton (electron or muon) or an opposite-sign lepton pair, in association with multiple jets. The events are categorised according to the number of jets and how likely these are to contain $b$-hadrons. A multivariate technique is then used to discriminate between signal and background events. The measured four-top-quark production cross section is found to be 26$^{+17}_{-15}$ fb, with a corresponding observed (expected) significance of 1.9 (1.0) standard deviations over the background-only hypothesis. The result is combined with the previous measurement performed by the ATLAS Collaboration in the multilepton final state. The combined four-top-quark production cross section is measured to be 24$^{+7}_{-6}$ fb, with a corresponding observed (expected) signal significance of 4.7 (2.6) standard deviations over the background-only predictions. It is consistent within 2.0 standard deviations with the Standard Model expectation of 12.0$\pm$2.4 fb.


Introduction
Being the heaviest known elementary particle of the Standard Model (SM), the top quark has a large coupling to the SM Higgs boson and is predicted to have large couplings to hypothetical new particles in many models of physics beyond the Standard Model. For this reason, it is particularly important to study rare processes involving the top quark. The production of four top quarks,¯¯, is one of these processes which has not yet been observed. The¯¯cross section could be enhanced, for instance, by gluino pair production as in supersymmetric theories [1,2], by pair production of scalar gluons [3,4], or by the production of a heavy scalar or pseudoscalar boson in association with a top-quark pair (tt) in type-II two-Higgs-doublet models [5][6][7]. The¯¯cross section is also sensitive to both the magnitude and the charge conjugation and parity properties of the Yukawa coupling of the top quark to the Higgs boson [8,9], as well as to various four-fermion couplings in the context of the effective field theory framework [10,11]. Within the SM, the¯¯cross section in proton-proton (pp) collisions at a centre-of-mass energy of √ = 13 TeV is predicted to be¯¯= 12.0 fb at next-to-leading order (NLO) in QCD including NLO electroweak corrections [12], with a relative combined uncertainty of 20% dominated by the missing higher order QCD correction evaluated by varying the renormalisation and factorisation scale choices. The uncertainty from the PDF and s choice in the¯¯cross section was estimated to be about 6.3% using the PDF4LHC prescription. An example of a Feynman diagram for SM¯¯QCD production is shown in Figure 1 (left). The electroweak¯¯contribution is illustrated in Figure 1 (middle) with an example of a Feynman diagram where a Higgs boson acts as an off-shell mediator.
The¯¯events can give rise to several different final states depending on the hadronic or semileptonic decay mode of each of the top quarks. The final states can be grouped according to the number of electrons or muons from the semileptonic top-quark decays, including those from subsequent leptonic decays. The final state with two leptons 1 of the same electric charge or with more than two leptons is referred to as the 2LSS/3L final state. This final state contains 13% of all produced¯¯events and features a low level of background contamination. The final state with one lepton or two oppositely charged leptons (referred to as the 1L/2LOS final state) accounts for a larger fraction, capturing 57% of all produced¯¯events. However, this final state suffers from a large irreducible background that is mostly composed of tt production in association with additional jets (¯+jets). It is complementary to the 2LSS/3L final state and poses different challenges. The main challenge lies in the proper evaluation of the dominant background from¯¯events with additional jets, which has significant theoretical uncertainty. An example Feynman diagram for this background is shown in Figure 1 (right). ATLAS and CMS have already searched for¯¯production in 13 TeV pp collisions at the Large Hadron Collider (LHC). The most recent ATLAS result focused on the 2LSS/3L final state using 139 fb −1 of data and led to the first evidence for this process with an observed (expected) significance of 4.3 (2.4) standard deviations and a measured cross section of¯¯= 24 +7 −5 fb [13]. The previous ATLAS search in the 1L/2LOS final state using 36.1 fb −1 set an observed (expected) 95% confidence level (CL) upper limit on¯¯of 47 fb (33 fb) [14]. CMS also set a 95% CL upper limit on the¯¯production cross section in this final state of 48 fb using a 35.8 fb −1 data set [15]. The latest CMS search with 137 fb −1 of data in the 2LSS/3L final state lead to an observed (expected) significance for a¯¯signal of 2.6 (2.7) standard deviations and a measured cross section of¯¯= 12.6 +5.8 −5.2 fb [16].
This article presents a search for¯¯production in the 1L/2LOS final state using the full data set of pp collision data at √ = 13 TeV corresponding to 139 fb −1 . The result is then combined with that obtained in the 2LSS/3L final state using the same data set [13]. 1 In the rest of this article, 'lepton' refers exclusively to an electron or muon.

The ATLAS detector
The ATLAS experiment [17] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4 coverage in solid angle. 2 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic (EM) and hadron calorimeters, and a muon spectrometer (MS). The inner tracking detector covers the pseudorapidity range | | < 2.5. It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. Lead/liquid-argon (LAr) sampling calorimeters provide EM energy measurements with high granularity. A steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range (| | < 1.7). The endcap and forward regions are instrumented with LAr calorimeters for EM and hadronic energy measurements up to | | = 4.9. The MS surrounds the calorimeters and is based on three large air-core toroidal superconducting magnets with eight coils each. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. The MS includes a system of precision tracking chambers and fast detectors for triggering. A two-level trigger system is used to select events. The first-level trigger is implemented in hardware and uses a subset of the detector information to accept events at a maximum rate of nearly 100 kHz. This is followed by a software-based trigger that reduces the accepted event rate to 1 kHz, on average, depending on the data-taking conditions [18]. An extensive software suite [19] is used for real and simulated data reconstruction and analysis, for operation and in the trigger and data acquisition systems of the experiment.

Object and event selection
Events are selected from √ = 13 TeV pp collision data collected by the ATLAS detector in the period between 2015 and 2018. Only events for which all detector subsystems were operational are considered. The data set corresponds to an integrated luminosity of 139 fb −1 [20,21]. Events were required to fire single-electron or single-muon triggers, with minimum T thresholds varying from 20 to 26 GeV depending on the lepton flavour and the data-taking period. Triggers with minimum T thresholds include isolation requirements [22,23].
Events are required to have at least one vertex reconstructed from at least two ID tracks with transverse momenta T > 0.4 GeV. The primary vertex for each event is defined as the vertex with the highest sum of 2 T over all associated ID tracks [24]. Object reconstruction closely follows that of Ref. [13] and is briefly summarised in the following. Electron candidates are reconstructed from energy deposits in the EM calorimeter associated with ID tracks [25]. The pseudorapidity of the calorimeter energy cluster, cluster , must satisfy | cluster | < 2.47, excluding the transition region between the barrel and the endcap calorimeters (| cluster | ∉ [1.37, 1.52]). Muon candidates are reconstructed by combining tracks reconstructed in both the ID and the MS [26] and are required to have | | < 2.5. Both the electron and muon candidates are required to have T > 10 GeV. The transverse impact parameter divided by its estimated uncertainty, | 0 |/ ( 0 ), is required to be lower than five (three) for electron (muon) candidates. The longitudinal impact parameter 0 must satisfy | 0 sin( )| < 0.5 mm for 2 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the -axis along the beam pipe. The -axis points from the IP to the centre of the LHC ring, and the -axis points upwards. Cylindrical coordinates ( , ) are used in the transverse plane, being the azimuthal angle around the -axis. The pseudorapidity is defined in terms of the polar angle as = − ln tan( /2). Angular distance is measured in units of both lepton flavours. Electrons are required to satisfy the 'Tight' likelihood-based identification criterion and to be isolated according to the 'FixedCutTight' selection criterion [25]. Muons must satisfy the 'Medium' cut-based identification criterion and be isolated according to the 'FixedCutTightTrackOnly' selection criterion [27].
Jets are reconstructed from topological clusters [28] of energy deposits in the calorimeters using the antialgorithm [29, 30] with a radius parameter of = 0.4 and calibrated as described in Ref. [31]. They are referred to as 'small-jets'. These jets are required to have T > 25 GeV and | | < 2.5. To suppress the effect of additional pp collisions in the same or a nearby bunch crossing, collectively referred to as pile-up, the jets with T < 120 GeV and | | < 2.4 must satisfy a pile-up suppression requirement based on the output of a multivariate classifier called the jet-vertex-tagger (JVT) [32]. Events are required to pass a set of quality criteria to suppress those containing any jets arising from non-collision sources or detector noise [33]. The MV2c10 multivariate algorithm [34] is used to identify jets containing -hadrons. Each jet is given a score representing the likelihood of the jet to contain a -hadron. A jet is -tagged if the score passes a certain threshold, referred to as an operating point (OP). Four OPs are defined with average expected efficiencies for -jets of 60%, 70%, 77% and 85%. A pseudo-continuous score is assigned to each jet passing these OPs, with an integer value ranging from five for jets that pass the 60% OP to two for jets passing only the 85% OP. A score of one is assigned if the jet does not pass any of the OPs.
The selected and calibrated small-jets are used as inputs for jet reclustering [35] using the antialgorithm with a radius parameter of = 1.0. These reclustered jets are referred to as 'large-jets'. The calibration corrections and uncertainties for the reclustered large-jets are inherited from the smalljets [35]. A trimming procedure is applied to the reclustered jets to remove all the associated smalljets that have T below 5% of the T of the reclustered jet [36]. The reclustered jets are required to have T > 200 GeV and | | < 2.0. A sequential overlap removal procedure defined in Ref. [13] is applied to ensure that the same calorimeter energy deposit or the same track is not associated with two or more different reconstructed objects.
The missing transverse momentum in the event, whose magnitude is denoted by miss T , is defined as the negative vector sum of the T of reconstructed and calibrated objects in the event [37]. This sum includes the momenta of the ID tracks matched to the primary vertex but not associated with any other objects.
The events are required to have either exactly one lepton satisfying T > 28 GeV and at least seven jets (1L channel) or exactly two oppositely charged leptons with T > 28 GeV for the leading lepton and T > 10 GeV for the subleading lepton and at least five jets (2LOS channel). In the 2LOS channel, the events with two same-flavour leptons must have a dilepton invariant mass above 15 GeV and outside the -boson mass window of 83-99 GeV. Each event is required to have at least one reconstructed lepton that matches the lepton that fired the trigger and must contain at least two -tagged jets passing the 70% OP.

Monte Carlo samples
Monte Carlo (MC) samples of simulated events were produced to model the SM¯¯signal and background processes. The nominal samples are identical to those used in Ref. [13] and are normalised using the best theory predictions available. The NNPDF3.0NLO [38] PDF set was used in all matrix element (ME) calculations if not stated otherwise. The top-quark mass top was set to 172.5 GeV in all relevant samples.
The nominal¯¯signal sample was generated using the M G 5_aMC@NLO v2.6.2 [39] generator at NLO in the strong coupling constant s with the NNPDF3.1NLO [38] PDF set. The functional form of the renormalisation and factorisation scales were set to r = f = T /4, where T is defined as the scalar sum of the transverse masses √︃ 2 + 2 T of the particles generated from the ME calculation, following Ref. [12]. An additional sample with settings similar to the nominal was generated using a leading-order (LO) ME, which makes more efficient use of simulation resources as unlike the nominal it does not contain negative weight events. The additional sample is only used for the training of the multivariate discriminant to separate signal from background.
The nominal MC sample for tt background modelling was produced with the HVQ program [40,41] in the P B v2 [40,[42][43][44] generator at NLO in QCD in the five-flavour scheme (5FS). The ℎ damp parameter, which controls the T of the first additional emission beyond the Born configuration, was set to 1.5 top [45,46]. A dedicated¯¯sample with the highest available precision, and with the additional -quarks coming from the ME, was produced at NLO QCD accuracy in the four-flavour scheme (4FS) with the P B R [47] generator and OpenLoops [48,49], with the NNPDF3.0nlonf4 [38] PDF set. For this sample, a pre-release of the implementation of this process in P B R was provided by the authors [50]. The factorisation scale was set to 1 2 Σ = ,¯, ,¯, T, (where stands for extra partons), the renormalisation scale was set to 4 √︁ = ,¯, ,¯T, , and the ℎ damp parameter was set to 1 2 Σ = ,¯, ,¯T, where T, is the transverse mass of a given parton.
Single-top-quark production processes, i.e. associated production, t-channel and s-channel production, were modelled using the P B v2 [42][43][44][51][52][53] generator at NLO in QCD. The t-channel process was generated in the 4FS with the NNPDF3.0NLOnf4 [38] PDF set, while for and s-channel processes the 5FS was used. The diagram removal scheme [54] was employed to handle the interference between and tt production [46].
The¯events were generated using the S 2.2.1 [55] generator. The ME was calculated for up to one additional parton at NLO in QCD and up to two partons at LO QCD precision using the Comix [55] and OpenLoops [48,49] libraries and merged with the S parton shower [56] using the MEPS@NLO prescription [57][58][59][60] and a merging scale of 30 GeV. The production of¯events was modelled using the P B generator at NLO in QCD. The¯and events were generated with M G 5_aMC@NLO v2.3.3 at NLO in QCD. The other rare top-quark processes, namely ,ā nd¯production, were all modelled using the M G 5_aMC@NLO generator at LO in QCD. To assess the uncertainty due to the choice of generator, the tt, single-top-quark,¯and¯samples produced with the nominal generator set-ups are compared with alternative samples generated with M G 5_aMC@NLO for the calculation of the hard-scattering, interfaced with P 8. An alternative sample of¯events was generated with NLO MEs using S 2.2.1. All alternative samples were generated using the same PDF in the ME as in the nominal sample.
The effects of pile-up were modelled by overlaying minimum-bias events, simulated using the soft QCD processes of P 8.186 with the A3 tune [66], on events from hard processes. For all samples of simulated events, except those generated using S , the E G 1.2.0 program [67] was used to describe the decays of bottom and charm hadrons.
The non-tt samples were processed through the simulation [68] of the ATLAS detector geometry and response using G 4 [69]. A fast simulation (AtlFast 2) of the ATLAS detector relying on parameterised showers in the calorimeter [70] was used for the¯¯and tt samples. All samples were reconstructed using the same software used for collider data. Corrections were applied to the simulated events so that the physics objects' selection efficiencies, energy scales and energy resolutions match those determined from data control samples.

Analysis strategy
In the 1L/2LOS final state, the parton-level¯¯topology is characterised by four -quarks resulting from the decay of the four top quarks and by either six or four other quarks coming from the hadronic -boson decays. Consequently, the signature of the events from¯¯production in the ATLAS detector features a high number of jets ( jets ), with 10 (8) jets in the 1L (2LOS) channel, among which four contain a -hadron. This phase-space region is contaminated by a large number of background events coming almost exclusively from¯+jets production. After the preselection described in Section 3, the contribution from other backgrounds is below 8% and arises mainly from¯,¯,¯, single top-quark or / boson production in association with jets. Diboson production and rare processes including , ,¯and contribute less than 1% of all events. All small non-tt backgrounds are estimated from simulations. The contribution from the background arising from misreconstructed or non-prompt leptons is negligible.
The nominal tt sample was generated at NLO as described in Section 4, with up to one additional parton from the ME calculation. All additional jets, including the heavy-flavour ones, are produced in the parton shower. The background predictions in high jets regions were found to be unreliable. Reference [71] demonstrates clear mismodelling of the number of jets not from the tt decay and the scalar sum of the transverse momenta of the reconstructed hadronically and leptonically decaying top quarks, tt T . In addition, the rate of tt production in association with -jets was observed to be underestimated by the current MC simulations [72,73]. Therefore, the following strategies were developed in this analysis to obtain a reliable estimate of the¯+jets background by reweighting the tt MC samples using data.
Firstly, the¯+jets MC events are classified according to the flavour of the particle jets that are not from the decay of the tt system, using the same procedure as described in Ref. [74]. The particle jets are reconstructed from the simulated stable particles using the anti-algorithm with a radius parameter = 0.4, and are required to have T > 15 GeV and | | < 2.5. Events are labelled as¯+≥1 if at least one particle jet is matched within Δ < 0.4 to a -hadron with T > 5 GeV that did not originate from the decay of a top quark. Similarly, events not categorised as¯+≥1 , and where at least one particle jet is matched within Δ < 0.4 to a -hadron with T > 5 GeV that did not originate from the decay of a top quark, are labelled as¯+≥1 . The remaining events are labelled as¯+light.
The selected events are categorised into different regions according to the lepton and jet multiplicities and different -tagging requirements. The details of the event categorisation can be found in Section 6. These regions vary in their signal-to-background ratio and the flavour composition of¯+jets. Corrections to the normalisation of¯+light,¯+≥1 and¯+≥1 and their kinematic distributions are derived from regions with low signal contamination using data, as described in Section 7. These corrections are used to mitigate the mismodelling of¯+jets MC events in regions with a high signal-to-background ratio, where a binned profile likelihood fit is performed to extract the¯¯signal strength. In the regions most sensitive to¯p roduction, a multivariate discriminant is used to improve the sensitivity by distinguishing between signal and background events, as discussed in Section 8. Large-jets are used as proxies for hadronically decaying top quarks with high T , improving the discrimination between¯¯signal and¯+jets background.
The different¯+jets components after the corrections are further adjusted and constrained in the profile likelihood fit. The¯+≥1 events are divided into further categories to define their systematic uncertainties. Events with one particle jet matched to a single -hadron are labelled as¯+ , those with one particle jet matched to at least two -hadrons are labelled as¯+ , and those with two particle jets, each matched to one -hadron are labeled as¯+¯. The rest of the¯+≥1 events are labelled as¯+≥3 . Section 9 documents all systematic uncertainties considered in this analysis. Details of the scheme used for thē +jets modelling systematic uncertainties are presented in Section 9.3.

Event categorisation
Selected events in both channels are categorised into several regions as illustrated in Figure 2. The resulting regions are used for different purposes. Signal-depleted and signal-enriched regions are defined as control regions and signal regions, respectively. They are used in the profile likelihood fit to extract the signal cross section and to constrain the overall background model using the assigned systematic uncertainties. Thē +jets kinematic reweighting regions are used to extract correction factors for the¯+jets MC prediction. The validation regions are defined in order to check that the background prediction is able to describe data in regions with background composition similar to that in the signal regions.
To take advantage of the high jet multiplicity in the¯¯signal, events in each lepton channel are first categorised according to their number of jets, from 7 (5) jets to at least 10 (8) jets in the 1L (2LOS) channel. These events are then further categorised according to the -tagging requirements summarised in Table 1. The requirements were chosen to provide good separation between the different flavour components of the associated jets in the¯+jets background. The 2b, 4b and ≥5b regions are defined by respectively requiring the presence of 2, 4 and at least 5 jets -tagged at the 70% OP. The ≥5b regions in the 1L channel help constrain the modelling of¯+≥3 . In the 2LOS channel, the 4b and ≥5b regions are merged into ≥4b regions because fewer events are expected. The events with 3 jets -tagged at the 70% OP are further assigned to the 3bL, 3bH and 3bV regions using requirements on the number of jets -tagged at the 60% and 85% OPs. The 3bL (3bH) regions are defined to have relatively lower (higher) purity of MC 'truth' -jets amongst the three jets tagged at the 70% OP. As a result, the 3bH regions contain a larger fraction of + and¯+ events, whereas the 3bL regions are more populated by¯+≥1 and¯+light, where the third jet -tagged at the 70% OP is a mis-tagged -jet or light-flavour jet. The 3bV regions are defined to be orthogonal to 3bL and 3bH regions and are used to validate the background modelling in events enriched in¯+≥1 .
A total of 21 regions are used in the profile likelihood fit, with 12 regions in the 1L channel and 9 regions in the 2LOS channel. They are defined by considering the regions with at least 8 (6) jets in the 1L (2LOS) Table 1: Summary of the -tagging requirements for the event categorisation. Events in each category must satisfy all requirements listed in columns. 60% , 70% and 85% are defined as the numbers of -tagged jets obtained using -tagging operating points with average expected efficiencies of 60%, 70% and 85%, respectively. The 3bL (3bH) requirement selects events with lower (higher) purity of MC 'truth' -jets amongst the three jets tagged at the 70% OP. The 3bV requirement is used to define the validation regions. The symbol '-' indicates that no requirement is applied.  Figure 2: Schematic view of the event categories used to select analysis regions (signal, control, validation and tt+jets reweighting regions) in the 1L channel (left) and 2LOS channel (right). The axes represent the jet multiplicity and -tagging requirements defined in Table 1. The 3bL (3bH) -tagging requirement selects events with lower (higher) purity of MC 'truth' -jets amongst the three jets tagged at the 70% OP. The 3bV -tagging requirement is used to define the validation regions. The regions in grey are not used in the analysis. channel and satisfying the 3bL, 3bH or ≥4b requirements. Among these regions, the ones that have at least 10 (8) jets or have 9 (7) jets and satisfy the ≥4b requirement in the 1L (2LOS) channel are defined as the signal regions. The rest of the fitted regions are defined as the control regions. A total of 6 validation regions are also defined by considering the regions with at least 8 (6) jets in the 1L (2LOS) channel and satisfying the 3bV requirement. The validation regions are not used in the fit and hence do not contribute to the signal extraction. The largest signal contamination in the validation regions is expected to be 4.4% in the (≥8j, 3bV) region in the 2LOS channel. A test found that including these regions in the fit would increase the sensitivity to the signal by 5%. However, this small gain in sensitivity is relinquished to ensure reliable background modelling. The¯+jets kinematic reweighting regions are defined by considering all the regions that satisfy the 2b requirement. Figure 3 shows the background composition in each signal, control and validation region, as expected from the nominal¯+jets MC simulation after applying the corrections described in Section 7. The contribution from the signal is included, assuming the cross section predicted by the SM. The control regions have a signal contamination of no more than 1%. The largest signal-to-background ratio evaluated from the inclusive yields in the signal regions is 6.1%. Figure 3: Relative contribution from the signal and backgrounds in all signal, control and validation regions in the 1L channel (left) and 2LOS channel (right). The 3bL (3bH) -tagging requirement selects events with lower (higher) purity of MC 'truth' -jets amongst the three jets tagged at the 70% OP. The 3bV -tagging requirement is used to define the validation regions. For the¯+jets background, the fraction is shown for each component with the finer classification. The¯¯signal is normalised to the SM cross-section prediction.

Modelling of¯+jets background
This section describes the corrections applied to both the nominal¯+jets prediction and to the associated systematic uncertainties. The corrections applied to the¯+jets MC sample include a rescaling of thē +jets flavour components and a sequential kinematic reweighting. Any possible residual mismodelling is accounted for by the systematic uncertainties in the profile likelihood fit used for the extraction of the signal strength.

7.1¯+jets flavour rescaling
The¯+jets flavour rescaling adjusts the overall yields of the¯+light,¯+≥1 and¯+≥1 categories. The rescaling factors are derived from a dedicated profile likelihood fit to data using the event yields in the regions defined by various -tagging requirements. Events with ≥ 8j in the 1L channel and ≥ 6j in the 2LOS channel are assigned to 2b, 3bL, 3bH and ≥4b regions, using the same criteria as defined in Table 1. The flavour rescaling fit exploits the different¯+jets flavour fractions in the eight fitted regions. These regions have the same jets requirements as the regions used in the profile likelihood fit to extract the signal. The largest signal-to-background ratio in these regions is 2.5%, estimated from MC simulation prior to the fit. Systematic uncertainties due to the tagging efficiency of -jets and the mis-tag rate of -jets and light-flavour jets are considered as nuisance parameters. The measured rescaling factors for¯+light,¯+≥1 and¯+≥1 are 0.99 ± 0.05, 1.58 ± 0.18 and 1.33 ± 0.06, respectively, where the quoted uncertainties are from the statistical uncertainty of the data and from uncertainties in the -tagging calibration.
The measured rescaling factors are applied to the¯+jets MC sample entering the profile likelihood fit to extract the signal; however, their measured uncertainties are not used in the fit. Instead, large global normalisation uncertainties are assigned to¯+HF events, as discussed in Section 9.3. However,¯+jets modelling uncertainties due to the choice of generator and scale also affect the flavour composition of the radiated jets. To reduce the correlation among these modelling uncertainties and the dedicated¯+HF normalisation uncertainties, the¯+jets flavour fraction in all modelling systematic variations is rescaled to match the nominal prediction after the flavour rescaling within the acceptance of the analysis, i.e. events in the 1L (2LOS) channel with jets ≥ 8 (6) and 70% ≥ 2. The¯+≥1 events in the¯¯4FS sample are rescaled to have the same yield as the nominal¯+≥1 events within the acceptance. Such treatment ensures that the overall production rate of¯+HF events is controlled by the global normalisation uncertainties, whilst the modelling systematic uncertainties affect only the relative changes in the yield across different analysis regions and the kinematic distributions within each region.

Sequential kinematic reweighting
Following the flavour rescaling, a sequential reweighting is used to mitigate the kinematic mismodelling observed in¯+jets MC simulation. The reweighting corrects the distributions of jets , the number of large-jets ( LR-jets ), the scalar sum of all jet and lepton T in the event ( all T ), and the average Δ between any two jets (Δ jj avg. ). These variables are related to the overall jet activity in the events and are observed to be mismodelled, especially the jets and all T spectra. These variables capture the most representative global kinematics of the events, as well as kinematic properties of the individual jets such as their T and angular distributions.
The¯+jets events in ≥3b regions are reweighted so the overall MC prediction matches the data in the 2b regions. In this procedure it is assumed that the radiation modelling deficiency in the parton shower is independent of the flavour of the radiated jets. Systematic variations of the¯+jets modelling cover possible deviations from this assumption.
The reweighting is performed in three steps, separately for the 1L and 2LOS channels. In the first step, the two-dimensional distribution of the numbers of jets and large-jets, ( jets , LR-jets ), is reweighted. The jets spectrum is corrected up to ≥ 13 (≥ 11) jets, while the LR-jets spectrum is corrected up to ≥ 2 (≥ 1) large-jets in the 1L (2LOS) channel. The reweighting factors are derived for each ( jets , LR-jets ) bin. The second step aims to correct for the mismodelling in the all T spectrum. The reweighting is performed on the reduced all T variable, defined as all,red. T = all T − ( jets − min ) × 90 GeV, where min = 7 (5) in the 1L (2LOS) channel. The value of 90 GeV corresponds to the average T of each additional jet, estimated as the difference between the most probable values of all T in consecutive jets bins. The term subtracted from all T therefore removes the dependence of this variable on jets and minimises the correlation between the first and the second reweighting steps. The reweighting factors are therefore derived in an inclusive region with jets ≥ 7 (5) in the 1L (2LOS) channel, but separately for each LR-jets multiplicity bin, from 0 to ≥ 2 (≥ 1) large-jets. The events with 7 (5) jets in the 1L (2LOS) channel are included in this step to increase the statistical precision of the reweighting factors in the highest LR-jets multiplicity bin. These events are not used elsewhere in the analysis. To further reduce statistical fluctuations in the binned reweighting factors of all,red. T , a three-parameter exponential function, ( all,red. , is used to fit the factors in each LR-jets multiplicity bin. The final step corrects for the residual mismodelling in the angular distributions by reweighting the binned distribution of Δ jj avg. , the average Δ between any two jets. This is performed in each ( jets , LR-jets ) bin, up to ≥ 10 (≥ 8) jets and ≥ 2 (≥ 1) large-jets in the 1L (2LOS) channel.
Uncertainties in the derived reweighting factors are propagated as systematic uncertainties in the¯+jets background. The uncertainty sources include the limited numbers of data and MC events and the cross-section uncertainties of the non-tt processes. An additional uncertainty is considered for the all,red. T reweighting step by replacing the exponential function with a reciprocal function. A more detailed discussion of these uncertainties is presented in Section 9.3.
The reweighting factors are also derived for each systematic uncertainty affecting the¯+jets prediction, such that in the 2b regions all systematic variations match the nominal prediction and thus the data. After applying the reweighting, the overall systematic uncertainty in the ≥3b regions is reduced. Figure 4 illustrates the combined effect of the flavour rescaling and the sequential reweighting. The jets and all T distributions for events with ≥ 8 jets and ≥ 3 -jets in the 1L channel are shown as examples. Significant mismodelling is present in both distributions before the corrections. A large improvement in the level of agreement between data and MC simulation is achieved for both the overall normalisation and the kinematic shape. The flavour rescaling factors increase the overall normalisation of¯+≥1 and¯+≥1 , which dominate the deficiency of MC events in the ≥3b regions. The sequential reweighting increases the jet multiplicity while decreasing the total energy of the event and the average distance between the jets predicted by the¯+jets simulation. The total uncertainty in the MC prediction is also reduced after the sequential reweighting is applied coherently to the¯+jets systematic variations. The level of agreement after the reweighting of Δ jj avg. is equally good, though the effect of the correction is smaller.

Signal discrimination
A multivariate analysis is performed in the signal regions to discriminate¯¯signal from the large background. It uses boosted decision trees (BDTs) [75] combining several input observables to build an output score maximising the separation between signal and the total background. The LO¯¯simulated signal sample is used in the training, while the BDT performance is evaluated on the NLO¯¯sample. For the distributions of kinematic variables used by the multivariate discriminant, good agreement was observed when comparing them in LO and NLO simulation. The background sample is composed of the simulated¯+jets events after the data-driven corrections are applied and the non-tt backgrounds as predicted by the simulation.
The BDTs are trained in six regions using fourteen variables. The six regions are defined by events with at least three -tagged jets and 8 (6), 9 (7) and at least 10 (8) jets in the 1L (2LOS) channel. The BDTs use global event variables, the kinematic properties of reconstructed objects and pairs of objects, jet -tagging information, the multiplicity and substructure of large-jets, and the miss T . The most powerful discriminating variable in all regions is the sum of the pseudo-continuous -tagging score over the six jets with the highest score in the event. The definition of the pseudo-continuous -tagging score can be found in Section 3. In the highest jets regions in the two channels, where an inclusive requirement on the number of jets is applied, the jet count is an important discriminating variable. The minimum distance Δ among all pairs of -tagged jets also provides good discrimination in the majority of regions since      the spatial separation of -jets in the¯¯signal is larger on average than in the main¯+≥1 background. Other variables include all T calculated using all reconstructed jets and leptons in the event, centrality T / , where the sum runs over all reconstructed jets and leptons in the event, leading jet T , minimum Δ among all pairs of -tagged jets and leptons, average Δ between all pairs of jets, the invariant mass of the triplet of jets that has the smallest Δ , 3 the miss T , and the transverse mass of the boson, T (ℓ, miss T ), in the single-lepton channel. 4 Additional variables are related to large-jets: the number of large-jets with a mass above 100 GeV, the sum of the first splitting scale 12 of all largejets, 5 and the sum of the second splitting scale 23 of all large-jets.
The modelling of the input variables was checked in the control and validation regions before and after the fit by propagating the fitted parameters obtained from the fit in the control and signal regions to the validation regions. Figure 5 shows the modelling of the sum of the pseudo-continuous -tagging score in each lepton channel before performing the fit. The distribution in each lepton channel is shown for an inclusive selection of at least three -tagged jets and at least 9 (7) jets in the 1L (2LOS) channel. These regions include events from the validation, control and signal regions where the BDT score distribution is used. Taking into account all uncertainties, no significant discrepancy between data and the predicted background was found.
Uncertainty *: normalised to tot. bkg. Figure 5: Pre-fit comparison between data and prediction for the distributions of the sum of the pseudo-continuous -tagging score over the six jets with the highest score in the event for the 1L channel (left) and the 2LOS channel (right) in the regions with ≥ 3 -jets and ≥ 9 (7) jets in the 1L (2LOS) channel. The¯+jets background is corrected using data. The band includes the total uncertainty of the pre-fit computation. The dashed red line shows the signal distribution normalised to the background yield. The ratio of the data to the total pre-fit expectation is shown in the lower panel. The last bin contains overflow events. 3 The Δ of a triplet of jets is defined as Δ = √︃ Δ 2 + Δ 2 + Δ 2 , where , , are the indices of the three jets. 4 The transverse mass is defined as where Δ is the azimuthal angle between the lepton and miss T .

5
The splitting scale is defined as the recombination distance between the jet constituents from a algorithm with radius parameter : = min( T 2 , T 2 ) × Δ 2 / 2 .

Systematic uncertainties
Various sources of systematic uncertainty affect the estimated signal and background rates, including those related to the luminosity, the identification and reconstruction of the physics objects, referred to as experimental uncertainties, and the modelling of the signal and background processes. In the following, a brief description of the sources of systematic uncertainty is provided. Particular emphasis is placed on the uncertainties related to the tt background prediction, which has the largest impact on the sensitivity of the measurement.

Experimental uncertainties
The experimental uncertainties are the same as in Ref. [13] and are briefly summarised here. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [20], obtained using the LUCID-2 detector [21] for the primary luminosity measurements. An uncertainty in the pile-up simulation is derived from the uncertainty in the scale factors used to adjust the MC pile-up to the data pile-up profile. Uncertainties in the modelling of electrons and muons in MC simulation arise from their momentum scale calibration and resolution, as well as trigger, reconstruction, identification, and isolation efficiencies. Uncertainties in the modelling of jets are primarily related to their energy scale (JES) and resolution (JER). The JES uncertainty is decomposed into a set of 30 uncorrelated components referred to as eigenvectors (EV), with contributions from pile-up, jet flavour composition and single-particle response [76]. The JER uncertainty is represented by nine components [77]. An uncertainty in the efficiency to pass the JVT requirement for suppressing pile-up jets is also considered [32]. The -tagging efficiencies and mis-tagging rates are measured in data using the same methods as those described in Refs. [34,78,79]. The uncertainties in the -tagging calibration are determined separately for -jets, -jets and light-flavour jets. They account for differences between data and simulation, depending on T for -and -jets, and on T

Signal modelling uncertainties
Various sources of modelling uncertainty are considered for the¯¯signal. An uncertainty related to the missing higher-order QCD corrections is estimated by varying the renormalisation ( r ) and the factorisation ( f ) scales simultaneously by factors of 2.0 and 0.5 relative to the nominal value. The effect of the PDF variations on the signal MC prediction was evaluated following the PDF4LHC prescription [80] and found to be negligible. The uncertainty due to the choice of parton shower and hadronisation model is estimated by comparing the nominal¯¯MC sample with the alternative that uses H 7 to simulate the PS, as described in Section 4.

Uncertainties in the¯+jets background
The systematic uncertainties affecting the¯+jets background modelling are summarised in Table 2. They are applied to each flavour component of the tt background separately, i.e. treated as uncorrelated, to account for the variation in flavour composition of the regions included in the fit and for possible differences in the modelling of the¯+≥1 ,¯+≥1 , and¯+light processes.
An uncertainty of 50% in the normalisation of¯+≥1 events as well as of the different subcategories of the¯+≥1 events (¯+ ,¯+¯,¯+ , and¯+≥3 ) is applied [72]. The uncertainties due to the choice of generator and PS model used to simulate the inclusive tt sample are evaluated by comparing the nominal tt sample with the alternative tt samples, detailed in Section 4. They are split into shape and migration components, with the former affecting only the shape of the distributions in each fit region and the latter changing the yields in different regions. These uncertainties are applied to each of the¯+≥1 subcategories as detailed in Table 2.
Uncertainties due to missing higher-order QCD corrections are estimated by separately varying the renormalisation and the factorisation scales by factors of 2.0 and 0.5 in the nominal tt sample. Additionally, uncertainties in the amounts of initial-and final-state radiation (ISR and FSR) from the PS are assessed by respectively varying the corresponding parameter of the A14 PS tune 6 and by varying the FSR renormalisation scale by factors of 2.0 and 0.5. An additional uncertainty in the main background¯+≥1 process is evaluated by comparing its modelling by the nominal sample, where -quarks are produced in the parton shower, with the prediction by the P B R generator which calculates the¯+p rocess in the matrix element at NLO in QCD with massive -quarks in the 4FS. A group of uncertainties arise from the kinematic reweighting procedure applied to correct for the mismodelling of the¯+jets background. The uncertainties due to the limited numbers of data and MC events in each of the reweighted bins of the ( jets , LR-jets ) and Δ jj avg. variables result in 120 systematic uncertainties. Their effect on all T and the BDT discriminants is less than 1% in any bin of the corresponding distribution.
The uncertainty in the all,red. T reweighting is evaluated using a 68% confidence interval contour of the nominal fit to the exponential function, resulting in five systematic uncertainties corresponding to the five fitted functions in each of the LR-jets multiplicity bins in the two channels. Five additional systematic uncertainties are obtained by replacing the exponential function with a three-parameter reciprocal function,  . The difference compared to the exponential function is smaller than 20% in any fitted regions, with a larger deviation at high or low all,red. T . The net effect of the systematic uncertainties in the all,red. T reweighting is below 5% across the all T and BDT distributions. The reweighting factors depend on the contribution from the non-tt processes in the 2b regions which is subtracted before deriving the reweighting. The cross sections of these processes are varied by ±50% simultaneously, and the reweighting factors are rederived. The resulting difference is taken as an additional systematic uncertainty, ranging from 3% to 6%, in the reweighting factors.

Modelling uncertainties in non-tt backgrounds
An uncertainty of 5.5% in the total cross section of the three single-top-quark production modes is included [81][82][83]. Similar to the tt background, uncertainties associated with the PS and hadronisation model and the generator choice are evaluated by comparing the nominal P + P 8 sample for each process with alternative samples produced with P + H 7 and M G 5_aMC@NLO + P 8. The uncertainty due to the treatment of the interference between and tt production at NLO is evaluated by comparing the nominal sample produced using the diagram removal scheme with an alternative sample produced with the same generator but using the diagram subtraction scheme [54].
Modelling uncertainties in the¯,¯, and¯processes are evaluated in a similar way. Uncertainties of 60%, 15%, and 20% are applied to the¯,¯, and¯cross sections, respectively [13,84,85]. The PDF uncertainty in the¯,¯and¯processes is negligible compared to the uncertainty applied to their cross section. The uncertainty due to missing higher-order QCD corrections is determined independently for each of these processes by varying simultaneously the renormalisation and factorisation scales by factors of 2.0 and 0.5 relative to the nominal value. The uncertainty due to the choice of generator and PS is assessed by comparing the nominal simulation of each of these processes with the alternative one described in Section 4. Additional uncertainties of 10%, 20% and 30% for events with respectively 9 (7), 10 (8) and at least 11 (9) jets are applied to account for the production of additional jets in these processes in the 1L (2LOS) channel. These uncertainty values are based on the level of mismodelling of the¯+jets jets distribution observed in the 2b region. They are also consistent with the modelling uncertainties evaluated in the 2LSS/3L final state for these backgrounds using a full set of MC modelling uncertainties [13].
An uncertainty of 60% is assumed for the +jets production cross section. It is estimated by adding a 24% uncertainty in quadrature for each additional jet based on a comparison among different algorithms for merging LO matrix elements and parton showers [86].
For each of the other small background processes a conservative cross-section uncertainty of 50% is applied.

Results
The analysis uses a binned profile likelihood fit to extract the¯¯signal strength with systematic uncertainties parameterised as constrained nuisance parameters. The regions used in the fit are described in Section 6. In all control regions, the binned all T distribution is used as the input to the profile likelihood fit, except for the (8j, ≥5b) region in the 1L channel where the event count is used due to the small number of events and the low signal-to-background ratio in this region. To improve the sensitivity to the¯¯signal, the distribution of the BDT score described in Section 8 is used in the signal regions. Each bin in these regions is represented by a Poisson probability term for the observed data, with the expectation value provided by the MC prediction with the corrections to¯+jets. The expected yield depends on the signal strength, , defined as the ratio of the measured¯¯production cross section to that predicted by the SM,¯¯/ SM¯, and the systematic uncertainties, . A binned likelihood function, L ( , ), is constructed as the product of the Poisson probability terms of all fitted bins. Systematic uncertainties are taken into account in the likelihood function, each as a nuisance parameter constrained by a Gaussian prior. The uncertainties due to the limited number of generated MC events, also referred to as the MC statistical uncertainties, are implemented as additional nuisance parameters with Poisson priors, one for each fitted bin. The MC prediction is fitted to the data by adjusting and simultaneously to maximise the likelihood function. The statistical inferences of the fit results are derived using the likelihood ratio, ( ) = L ( ,ˆ)/L (ˆ,ˆ), whereˆandˆare the values of and that maximise the likelihood function andˆrepresents the values that maximise the likelihood function given a specific [87]. The uncertainty in the best-fit value is determined by finding the values that correspond to −2 ln ( ) = 1. The test statistic 0 = −2 ln (0) is used to check the compatibility of the background-only hypothesis ( = 0) with the observed data, and consequently the significance of the fitted signal. The statistical analysis is performed using the R S framework [88,89]. Figure 6 shows the observed event yield compared with the prediction in each control and signal region in each lepton channel before the fit ('pre-fit') and after the fit ('post-fit'). The level of agreement between data and the prediction is improved post-fit due to the adjustment of the nuisance parameters. The overall uncertainty in the prediction is also significantly reduced as a result of the constraints from the data and the correlations among the nuisance parameters. Figures 7 and 8 show the distribution of the BDT score in the signal regions in both lepton channels. Good agreement is observed between data and the post-fit prediction in all fitted regions. The goodness of fit was evaluated using a likelihood-ratio test, comparing the likelihood value from the nominal fit with the one obtained from a saturated model, built with one free-floating normalisation factor for each fitted bin [90]. The probability that the fit model is compatible with the observed data is 59%. The post-fit background modelling is further checked in validation regions that are not used in the fit. The definition of these regions is described in Section 6. Figure 9 shows the distribution of the BDT score in four of those regions.
The best-fit value for the signal strength is which takes into account the additional 20% uncertainty in SM¯i n the denominator of the parameter. In order to check the compatibility of the 1L and 2LOS channels, an alternative fit is performed with two independent signal-strength parameters, one for each of the two channels. The best-fit values of the two signal strengths are 2.9 +1.8 −1.5 for the 1L channel and 1.3 +1.7 −1.5 for the 2LOS channel. The probability of obtaining a discrepancy between these two signal-strength parameters equal to or larger than the one observed is 39%.
The statistical uncertainty is obtained from a fit where all nuisance parameters are fixed to their post-fit values. The systematic uncertainty is determined by subtracting in quadrature the statistical uncertainty from the total uncertainty. The measured cross section is consistent with the SM predicted cross section of Uncertainty *: normalised to tot. bkg. Figure 6: Comparison of predicted and observed event yields in each control and signal region in the 1L channel (top) and in the 2LOS channel (bottom) before the fit (left) and after the fit (right). The¯+jets background is corrected at the pre-fit level using data. The band includes the total pre-or post-fit uncertainties. The dashed red line shows the signal distribution normalised to the total background yield. The ratio of the data to the total pre-or post-fit prediction is shown in the lower panel. SM¯= 12.0 ± 2.4 fb computed at NLO in QCD including NLO electroweak corrections [12]. The observed significance of¯¯production relative to the background-only prediction is 1.9 standard deviations, while 1.0 standard deviations is expected. The observed (expected) -value is 3% (17%), corresponding to the probability of obtaining a result at least as signal-like as observed if no signal is present. Figure 10 shows the event yield in data compared with the post-fit signal ( ) and total background ( ) prediction. It is ordered by the signal-to-background ratio of the discriminant bins from all fitted regions. The predictions are shown for the best-fit signal strength and for the SM prediction. The observed data shows a signal-like excess over the background, with a higher probability of compatibility with the best-fit signal strength of = 2.2 than with the SM prediction of = 1.0 in the high log 10 ( / ) region. Table 3 lists the contributions from the various groups of systematic uncertainties to the fitted uncertainty in the measured¯¯production cross section, together with the total systematic uncertainty, the statistical uncertainty and the total uncertainty. The impacts of the individual uncertainties on the best-fit value of are presented in Figure 11, ranked according to their post-fit impacts. The 20 systematic uncertainties with the largest impact are shown. The impacts of the individual and grouped systematic uncertainties are Uncertainty *: normalised to tot. bkg.
Uncertainty *: normalised to tot. bkg. evaluated using different methods, as described in the corresponding captions.
Apart from the uncertainties in the¯¯signal, the largest contribution is from¯+≥1 modelling uncertainties. The major uncertainties in this category are due to the choice of flavour scheme and MC generator used to model the¯+¯background, which is dominant in all signal regions. This reflects the large difference between the different predictions, which is expected given our limited knowledge of the phase space being probed. The contribution from¯+≥1 modelling uncertainties is dominated by the¯+≥1 cross-section Uncertainty *: normalised to tot. bkg.
Uncertainty *: normalised to tot. bkg. uncertainty. Although the fraction of¯+≥1 events is less than 10% in the signal regions, this uncertainty acquires a large anti-correlation with the¯+¯cross-section uncertainty because of the interplay between +≥1 and¯+¯in the 3b regions. The¯+≥1 cross-section uncertainty is constrained by 50% due to the substantial impact of this uncertainty on the regions with loose -tagging requirements, where abundant data is available to provide the constraint. The¯+¯cross-section uncertainty does not have a large post-fit impact on by itself. This is because it is constrained by 70% relative to its pre-fit uncertainty due to the high purity of¯+¯and the sufficient amount of data in the ≥ 4b regions. The uncertainties in the jet energy scale and resolution also contribute substantially to the overall uncertainty. This is due to the high jets multiplicity feature of the¯¯signal. The uncertainties in the -tagging efficiencies are also an important source. The most important contribution is from 'light jets mis-tag rates EV0', which is the leading component in the 20 uncertainties related to the mis-tag rates of light-flavour jets. This component mainly affects the migration of the different¯+jets components across the regions defined using different -tagging requirements, subsequently affecting the¯+¯yields in the signal regions, which are highly correlated with the fitted signal strength. Figure 11 also illustrates the best-fit values of the nuisance parameters corresponding to the 20 most important systematic uncertainties. A number of nuisance parameters are pulled from their nominal value in the likelihood fit. However, all observed pulls are smaller than 50% of the corresponding prior uncertainty. The pull on the '¯¯PS choice' is driven by the discrepancy between data and MC simulation in the 2LOS channel, (≥8j, ≥4b) region. The slight pull on this nuisance parameter reduces the¯¯yield in this region by 3%. The effects in other fitted regions are either comparable or smaller. The 'light jets mis-tag rates EV0' nuisance parameter is pulled to mitigate the discrepancy between data and prediction by adjusting the relative normalisation of the different¯+jets components across the fitted regions. All other pulls, including those not shown in Figure 11, are mainly on¯+jets modelling uncertainties. The combination of these pulls is found to correct for the mismodelling in the shape and normalisation of thē +jets background prediction. Similar pulls are observed in a background-only fit when removing the bins with a signal-to-background ratio larger than 5%. Additional tests were performed with pseudo-data sets constructed using various alternative¯+jets predictions and injected signal. No bias in the extracted was seen in the fits to these pseudo-data sets.

Combination with the same-sign dilepton and multilepton final state
The measurement of¯¯production in the 1L/2LOS final state presented in this article is combined with the ATLAS measurement in the 2LSS/3L final state [13] using the same data set. The combination is performed via a simultaneous profile likelihood fit in all regions of both analyses, with all systematic uncertainties included, and the combined¯¯production cross section is extracted. The events in the two analyses are statistically independent by construction due to the different lepton selection criteria.
The systematic uncertainties have a significant impact on the sensitivity of both the 1L/2LOS and 2LSS/3L analyses. However, because the most relevant systematic uncertainties in these final states are different, the assumptions about the correlations of the uncertainties have a negligible effect on the combined result.
The uncertainties related to the modelling of the¯+jets background are treated as uncorrelated between the 1L/2LOS and 2LSS/3L final states. The contribution of¯+jets events in the 2LSS/3L final state is much smaller than that in the 1L/2LOS final state, and arises from misreconstructed or non-prompt leptons, while it is the dominant irreducible background in the 1L/2LOS final state. Its estimation and the treatment of relevant uncertainties is therefore different in the two final states. The experimental uncertainties are treated as fully correlated between the two final states since both analyses use the same reconstructed objects and data set. Theoretical modelling uncertainties in the non-tt backgrounds and the¯¯signal are also fully correlated except for the normalisation of the¯background, which is treated differently in the 1L/2LOS and 2LSS/3L final states. In the latter,¯is the most important background and its normalisation is a free parameter of the fit, while in the former it is a small background and its normalisation is taken from MC simulation with a large uncertainty motivated by the normalisation factor obtained in the 2LSS/3L analysis. For this reason, the¯normalisation is uncorrelated between the two final states. The systematic Table 3: The contribution from different systematic uncertainties to the measured¯¯production cross section,¯, grouped into categories. For each uncertainty source, the fit is repeated with the corresponding group of nuisance parameters fixed to their best-fit values,ˆ. The contribution from each source, Δ¯¯, is then evaluated by subtracting in quadrature the uncertainty in¯¯obtained in this fit from that of the full fit. The contributions from individual groups are compared with the total systematic uncertainty and the statistical uncertainty. The total systematic uncertainty is different from the sum in quadrature of the different groups due to correlations among nuisance parameters in the fit. The¯¯signal strength from the combination of the two final states is measured to be The signal is shown for both the best-fit signal strength, = 2.2, and the SM prediction, = 1.0. The lower panel shows the ratio of the data to the post-fit background prediction, compared with the signal-plus-background prediction with the best-fit signal strength and the SM prediction. The shaded band represents the total uncertainty in the background prediction.
Nuis. Param. Pull ATLAS -1 = 13 TeV, 139 fb s Figure 11: The nuisance parameters ranked according to their post-fit impacts on the best-fit value of . Only the 20 nuisance parameters with the largest impacts are shown. The empty (solid) blue rectangles illustrate the pre-fit (post-fit) impacts on the parameter of interest , corresponding to the top axis. The pre-fit (post-fit) impact of each nuisance parameter, Δ , is calculated as the difference in the fitted value of between the nominal fit and the fit when fixing the corresponding nuisance parameter toˆ± Δ (ˆ± Δˆ), whereˆis the best-fit value of the nuisance parameter and Δ (Δˆ) is its pre-fit (post-fit) uncertainty. The black points show the best-fit values of the nuisance parameters, with the error bars representing the post-fit uncertainties. Each nuisance parameter is shown relative to its nominal value, 0 , and in units of its pre-fit uncertainty.

Conclusion
A measurement of four-top-quark production using 139 fb −1 of √ = 13 TeV proton-proton collision data collected by the ATLAS detector at the Large Hadron Collider is presented.
The selection in the 1L/2LOS final state targets events with a single lepton or an opposite-sign lepton pair accompanied by multiple jets. The measured four-top-quark production cross section is found to be 26 ± 8 (stat.) +15 −13 (syst.) fb, corresponding to an observed (expected) signal significance of 1.9 (1.0) standard deviations over the background-only hypothesis. The measurement uncertainty is dominated by the systematic uncertainties, particularly the modelling uncertainty of the¯¯signal and the¯+HF background.
The result in the 1L/2LOS final state is combined with the previous measurement performed by the ATLAS Collaboration in the multilepton final state with the same data set. The combined four-top-quark production cross section is measured to be¯¯= 24 ± 4 (stat.    [37] ATLAS Collaboration, Performance of missing transverse momentum reconstruction with the ATLAS detector using proton-proton collisions at