Search for black holes and sphalerons in high-multiplicity final states in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search in energetic, high-multiplicity final states for evidence of physics beyond the standard model, such as black holes, string balls, and electroweak sphalerons, is presented. The data sample corresponds to an integrated luminosity of 35.9 fb$^{-1}$ collected with the CMS experiment at the LHC in proton-proton collisions at a center-of-mass energy of 13 TeV in 2016. Standard model backgrounds, dominated by multijet production, are determined from control regions in data without any reliance on simulation. No evidence for excesses above the predicted background is observed. Model-independent 95% confidence level upper limits on the cross section of beyond the standard model signals in these final states are set and further interpreted in terms of limits on semiclassical black hole, string ball, and sphaleron production. In the context of models with large extra dimensions, semiclassical black holes with minimum masses as high as 10.1 TeV and string balls with masses as high as 9.5 TeV are excluded by this search. Results of the first dedicated search for electroweak sphalerons are presented. An upper limit of 0.021 is set on the fraction of all quark-quark interactions above the nominal threshold energy of 9 TeV resulting in the sphaleron transition.


Introduction
Many theoretical models of physics beyond the standard model (SM) [1][2][3] predict strong production of particles decaying into high-multiplicity final states. Among these models are supersymmetry [4][5][6][7][8][9][10][11], with or without R-parity violation [12], and models with low-scale quantum gravity [13][14][15][16][17], strong dynamics, or other nonperturbative physics phenomena. While the final states predicted in these models differ significantly in the type of particles produced, their multiplicity, and the transverse momentum imbalance, they share the common feature of a large number of energetic objects (jets, leptons, and/or photons) in the final state. The search described in this paper targets these models of beyond-the-SM (BSM) physics by looking for final states of various inclusive multiplicities featuring energetic objects. Furthermore, since such final states can be used to test a large variety of models, we provide model-independent exclusions on hypothetical signal cross sections. Considering concrete examples of such models, we interpret the results of the search explicitly in models with microscopic semiclassical black holes (BHs) and string balls (SBs), as well as in models with electroweak (EW) sphalerons. These examples are discussed in detail in the rest of this section.

Microscopic black holes
In our universe, gravity is the weakest of all known forces. Indeed, the Newton constant, ∼10 −38 GeV −2 , which governs the strength of gravity, is much smaller than the Fermi constant, ∼10 −4 GeV −2 , which characterizes the strength of EW interactions. Consequently, the Planck scale M Pl ∼ 10 19 GeV, i.e., the energy at which gravity is expected to become strong, is 17 orders of magnitude higher than the EW scale of ∼100 GeV. With the discovery of the Higgs boson [18-20] with a mass [21,22] at the EW scale, the large difference between the two scales poses what is known as the hierarchy problem [23]. This is because in the SM, the Higgs boson mass is not protected against quadratically divergent quantum corrections and-in the absence of fine tuning-is expected to be naturally at the largest energy scale of the theory: the Planck scale. A number of theoretical models have been proposed that attempt to solve the hierarchy problem, such as supersymmetry, technicolor [24], and, more recently, theoretical frameworks based on extra dimensions in space: the Arkani-Hamed, Dimopoulos, and Dvali (ADD) [13][14][15] model and the Randall-Sundrum model [16,17].
In this paper, we look for the manifestation of the ADD model that postulates the existence of n ED ≥ 2 "large" (compared to the inverse of the EW energy scale) extra spatial dimensions, compactified on a sphere or a torus, in which only gravity can propagate. This framework allows one to elude the hierarchy problem by explaining the apparent weakness of gravity in the three-dimensional space via the suppression of the fundamentally strong gravitational interaction by the large volume of the extra space. As a result, the fundamental Planck scale, M D , in 3 + n ED dimensions is related to the apparent Planck scale in 3 dimensions via Gauss's law as: M Pl 2 ∼ M D n ED +2 R n ED , where R is the radius of extra dimensions. Since M D could be as low as a few TeV, i.e., relatively close to the EW scale, the hierarchy problem would be alleviated.
At high-energy colliders, one of the possible manifestations of the ADD model is the formation of microscopic BHs [25,26] with a production cross section proportional to the squared Schwarzschild radius, given as: scenario, the cross section is given by the area of a disk of radius R S , i.e., σ ≈ πR S 2 [25,26]. In more complicated production scenarios, e.g., a scenario with energy loss during the formation of the BH horizon, the cross section is modified from this "black disk" approximation by a factor of order one.
As BH production is a threshold phenomenon, we search for BHs above a certain minimum mass M min BH ≥ M D . In the absence of signal, we will express the results of the search as limits on M min BH . In the semiclassical case (strictly valid for M BH M D ), the BH quickly evaporates via Hawking radiation [27] into a large number of energetic particles, such as gluons, quarks, leptons, photons, etc.. The relative abundance of various particles produced in the process of BH evaporation is expected to follow the number of degrees of freedom per particle in the SM. Thus, about 75% of particles produced are expected to be quarks and gluons, because they come in three or eight color combinations, respectively. A significant amount of missing transverse momentum may be also produced in the process of BH evaporation via production of neutrinos, which constitute ∼5% of the products of a semiclassical BH decay, W and Z boson decays, heavy-flavor quark decays, gravitons, or noninteracting stable BH remnants.
If the mass of a BH is close to M D , it is expected to exhibit quantum features, which can modify the characteristics of its decay products. For example, quantum BHs [28][29][30] are expected to decay before they thermalize, resulting in low-multiplicity final states. Another model of semiclassical BH precursors is the SB model [31], which predicts the formation of a long, jagged string excitation, folded into a "ball". The evaporation of an SB is similar to that of a semiclassical BH, except that it takes place at a fixed Hagedorn temperature [32], which depends only on the string scale M S . The formation of an SB occurs once the mass of the object exceeds M S /g S , where g S is the string coupling. As the mass of the SB grows, eventually it will transform into a semiclassical BH, once its mass exceeds M S /g S 2 > M D .
A number of searches for both semiclassical and quantum BHs, as well as for SBs have been performed at the CERN LHC using the Run 1 ( √ s = 7 and 8 TeV) and Run 2 ( √ s = 13 TeV) data. An extensive review of Run 1 searches can be found in Ref. [33].

Sphalerons
The Lagrangian of the EW sector of the SM has a possible nonperturbative solution, which includes a vacuum transition known as a "sphaleron". This class of solutions to gauge field theories was first proposed in 1976 by 't Hooft [43]. The particular sphaleron solution of the SM was first described by Klinkhamer and Manton in 1984 [44]. It is also a critical piece of EW baryogenesis theory [45], which explains the matter-antimatter asymmetry of the universe by such processes. The crucial feature of the sphaleron, which allows such claims to be made, is the violation of baryon (B) and lepton (L) numbers, while preserving B − L.
Within the framework of perturbative SM physics, there are twelve globally conserved currents, one for each of the 12 fundamental fermions: J µ = ψ L γ µ ψ L . An anomaly breaks this conservation, in particular ∂ µ J µ = [g 2 /(16π 2 )]Tr[F µν F µν ]. This is because the integral of this term, known as a Chern-Simons (or winding) number N CS [46], is nonzero. The anomaly exists for each fermion doublet. This means that the lepton number changes by 3N CS , since each of three leptons produced has absolute lepton number of 1. The baryon number will also change by 3N CS because each quark has an absolute baryon number of 1/3 and there are three colors and three generations of quarks produced. This results in two important relations, which are essential to the phenomenology of sphalerons: ∆(B + L) = 6N CS and ∆(B − L) = 0. The anomaly only exists if there is enough energy to overcome the potential in N CS , which is fixed by the values of the EW couplings. Assuming the state at 125 GeV to be the SM Higgs boson, the precise measurement of its mass [21,22] allowed the determination of these couplings, giving an estimate of the energy required for the sphaleron transitions of E sph ≈ 9 TeV [44,47].
While the E sph threshold is within the reach of the LHC, it was originally thought that the sphaleron transition probability would be significantly suppressed by a large potential barrier. However, in a recent work [47] it has been suggested that the periodic nature of the Chern-Simons potential reduces this suppression at collision energies √ŝ < E sph , removing it completely for √ŝ ≥ E sph . This argument opens up the possibility of observing an EW sphaleron transition in proton-proton (pp) collisions at the LHC via processes such as: u + u → e + µ + τ + t t b c c s d + X. Fundamentally, the N CS = +1 (−1) sphaleron transitions involve 12 (anti)fermions: three (anti)leptons, one from each generation, and nine (anti)quarks, corresponding to three colors and three generations, with the total electric charge and weak isospin of zero. Nevertheless, at the LHC, we consider signatures with 14, 12, or 10 particles produced, that arise from a q + q → q + q + sphaleron process, where 0, 1, or 2 of the 12 fermions corresponding to the sphaleron transition may "cancel" the q or q inherited from the initial state [48,49]. Since between zero and three of the produced particles are neutrinos, and also between zero and three are top quarks, which further decay, the actual multiplicity of the visible final-state particles may vary between 7 and 20 or more. Some of the final-state particles may also be gluons from either initial-or final-state radiation. While the large number of allowed combinations of the 12 (anti)fermions results in over a million unique transitions [50], many of the final states resulting from these transitions would look identical in a typical collider experiment, as no distinction is made between quarks of the first two generations, leading to only a few dozen phenomenologically unique transitions, determined by the charges and types of leptons and the third-generation quarks in the final state. These transitions would lead to characteristic collider signatures, which would have many energetic jets and charged leptons, as well as large missing transverse momentum due to undetected neutrinos.
A phenomenological reinterpretation in terms of limits on the EW sphaleron production of an ATLAS search for microscopic BHs in the multijet final states at √ s = 13 TeV [34], comparable to an earlier CMS analysis [36], was recently performed in Ref. [48]. In the present paper, we describe the first dedicated experimental search for EW sphaleron transitions.

The CMS detector and the data sample
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
In the region |η| < 1.74, the HCAL cells have widths of 0.087 in pseudorapidity and 0.087 in azimuth (φ). In the η − φ plane, and for |η| < 1.48, the HCAL cells map on to 5 × 5 arrays of ECAL crystals to form calorimeter towers projecting radially outwards from close to the nominal interaction point. For |η| > 1.74, the coverage of the towers increases progressively to a maximum of 0.174 in ∆η and ∆φ. Within each tower, the energy deposits in ECAL and HCAL cells are summed to define the calorimeter tower energies, subsequently used to provide the energies and directions of hadronic jets.
Events of interest are selected using a two-tiered trigger system [51]. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a time interval of less than 4 µs. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [52].
The analysis is based on a data sample recorded with the CMS detector in pp collisions at a center-of-mass energy of 13 TeV in 2016, corresponding to an integrated luminosity of 35.9 fb −1 . Since typical signal events may contain jets, leptons, photons, and missing transverse momentum from undetected particles, to be as inclusive as possible we employ a trigger based on the H T variable, defined as the scalar sum of the p T of all jets in an event reconstructed at the HLT. We require H T > 800 GeV for the earlier portion of the data taking and H T > 900 GeV for the later one; the increase in the threshold value being driven by the need to keep the trigger rate within an acceptable limit as the LHC luminosity steadily increased. For the later portion of the run, to recover some of the residual inefficiency of the H T trigger, we also used a logical OR with several single-jet triggers with the E T thresholds of 450-500 GeV. The resulting trigger selection is fully efficient for events that subsequently satisfy the offline requirements used in the analysis.

Event reconstruction
The particle-flow (PF) algorithm [53] aims to reconstruct and identify each individual particle in an event with an optimized combination of information from the various elements of the CMS detector. The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.
The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects are the jets, clustered using the anti-k T jet finding algorithm [54, 55] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. Events are required to have at least one reconstructed vertex within 24 (2) cm of the nominal collision point in the direction parallel (perpendicular) to the beams.
For each event, hadronic jets are clustered from the PF candidates using the anti-k T algorithm with a distance parameter of 0.4. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance. Additional pp interactions within the same or neighboring bunch crossings (pileup) can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, tracks originating from pileup vertices are discarded and an offset correction is applied to correct for the remaining contributions. Jet energy corrections are derived from simulation, to bring the measured response of jets to that of particle-level jets on average. In situ measurements of the momentum balance in dijet, multijet, γ+jet, and leptonically decaying Z+jet events are used to account for any residual differences in the jet energy scales in data and simulation [56]. The jet energy resolution amounts typically to 15% at a jet p T of 10 GeV, 8% at 100 GeV, and 4% at 1 TeV. Additional selection criteria are applied to each jet to remove those potentially dominated by anomalous contributions from various subdetector components or reconstruction failures. All jets are required to have p T > 70 GeV and be within |η| < 5. For the leading p T jet in each event, the energy fraction carried by muon candidates failing the standard identification [57] is required to be less than 80%. This requirement removes events where a low-momentum muon is misreconstructed with very high momentum and misidentified as a high-energy jet. We further require the leading jet in an event to have a charged-hadron fraction of less than 0.99 if this jet is found within |η| < 2.4 [58].
The missing transverse momentum, p miss T , is defined as the magnitude of the vectorial sum of transverse momenta of all PF candidates in an event. The jet energy corrections are further propagated to the p miss T calculation.
Details of muon reconstruction can be found in Ref.
[57]. The muon candidate is required to have at least one matching energy deposit in the pixel tracker and at least six deposits in the silicon strip tracker, as well as at least two track segments in the muon detector. The transverse impact parameter and the longitudinal distance of the track associated with the muon with respect to the primary vertex are required to be less than 2 and 5 mm, respectively, to reduce contamination from cosmic ray muons. The global track fit to the tracker trajectory and to the muon detector segments must have a χ 2 per degree of freedom of less than 10. Muon candidates are required to have p T > 70 GeV and to be within |η| < 2.4.

Details of electron and photon reconstruction can be found in Refs.
[59] and [60], respectively. Electron and photon candidates are required to have p T > 70 GeV and |η| < 2.5, excluding the 1.44 < |η| < 1.57 transition region between the ECAL barrel and endcap detectors where the reconstruction is suboptimal. We use standard identification criteria, corresponding to an average efficiency of 80% per electron or photon. The identification criteria include a requirement that the transverse size of the electromagnetic cluster be compatible with the one expected from a genuine electron or photon, and that the ratio of the HCAL to ECAL energies be less then 0.25 (0.09) for electrons and less than 0.0396 (0.0219) for photons in the barrel (endcap). In addition, photon candidates are required to pass the conversion-safe electron veto requirements [60], which disambiguates them from electron candidates.
Muons, electrons, and photons are required to be isolated from other energy deposits in the tracker and the calorimeters. The isolation I is defined as the ratio of the p T sum of various types of additional PF candidates in a cone of radius ∆R = √ (∆η) 2 + (∆φ) 2 of 0.4 (muons) or 0.3 (electrons and photons), centered on the lepton or photon candidate, to the candidate's p T . For muons, the numerator of the ratio is corrected for the contribution of neutral particles due to pileup, using one half of the p T carried by the charged hadrons originating from pileup vertices. For electrons and photons, an average area method [61]], as estimated with FAST-JET [55], is used. The details of the isolation requirements are the same as used in an earlier 13 TeV analysis [36], except that for electrons we use a slightly tighter isolation requirement of I < 0.07.
To avoid double counting, we remove jets that are found within a radius of ∆R = 0.3 from a muon, electron, or photon, if the latter object contributes more than 80, 70, or 50% of the jet p T , respectively.

Analysis strategy
We follow closely the approach for semiclassical BH searches originally developed by CMS for Run 1 analyses [62-64] and subsequently used in the studies of early Run 2 [36] data. This approach is based on an inclusive search for BH decays to all possible final states, dominated by the high-multiplicity multijet ones in the semiclassical BH case. This type of analysis is less sensitive to the details of BH evaporation and the relative abundance of various particles produced, as it considers all types of particles in the final state. We use a single discriminating variable S T , defined as the scalar sum of p T of all energetic objects in an event (which we define as jets, electrons, muons, and photons with p T above a certain threshold, of which there are N), plus p miss T in the event, if it exceeds the same threshold: This definition of S T is robust against variations in the BH evaporation model, and is also sensitive to the cases when there is large p miss T due to enhanced emission of gravitons or to models in which a massive, weakly interacting remnant of a BH is formed at the terminal stage of Hawking evaporation, with a mass below M D . It is equally applicable to sphaleron searches, given the expected energetic, high-multiplicity final states, possibly with large p miss T . The S T distributions are then considered separately for various inclusive object multiplicities (i.e., N ≥ N min = 3, . . . , 11). The background is dominated by SM QCD multijet production and is estimated exclusively from control samples in data. The observed number of events with S T values above a certain threshold is compared with the background and signal+background predictions to either establish a signal or to set limits on the signal production. This approach does not rely on the Monte Carlo (MC) simulation of the backgrounds, and it also has higher sensitivity than exclusive searches in specific final states, e.g., lepton+jets [65,66].
The main challenge of the search is to describe the inclusive multijet background in a robust way, as both BH and sphaleron signals correspond to a broad enhancement in the high tail of the S T distribution, rather than to a narrow peak. Since these signals are expected to involve a high multiplicity of final-state particles, one has to reliably describe the background for large jet multiplicities, which is quite challenging theoretically as higher-order calculations that fully describe multijet production do not exist. Thus, one cannot rely on simulation to reproduce the S T spectrum for large N correctly.
To overcome this problem, a dedicated method of predicting the QCD multijet background directly from collision data has been developed for the original Run 1 analysis [62] and used in the subsequent Run 1 [63, 64] and Run 2 [36] searches. It has been found empirically, first via simulation-based studies, and then from the analysis of data at low jet multiplicities, that the shape of the S T distribution for the dominant QCD multijet background does not depend on the multiplicity of the final state, above a certain turn-on threshold. This observation reflects the way a parton shower develops via nearly collinear emission, which conserves S T . It allows one to predict the S T spectrum of a multijet final state using low-multiplicity QCD events, e.g., dijet or trijet events. This "S T invariance" provides a powerful method of predicting the dominant background for BH production by taking the S T shape from low-multiplicity events, for which the signal contamination is expected to be negligible, and normalizing it to the observed spectrum at high multiplicities at the low end of the S T distribution, where signal contamination is negligible even for large multiplicities of the final-state objects. The method has been also used for other CMS searches, e.g., a search for stealth supersymmetry [67] and a search for multijet resonances [68].

Black hole and string ball signal samples
Signal simulation is performed using the BLACKMAX v2.02.0 [69] (semiclassical BHs) and CHARYBDIS 2 v1.003 [70,71] (semiclassical BHs and SBs) generators. The generator settings of each model are listed in Tables 1 and 2.   C1  TRUE  FALSE  FALSE  FALSE  TRUE  TRUE  FALSE  FALSE  C2  FALSE  FALSE  FALSE  FALSE  TRUE  TRUE  FALSE  FALSE  C3  TRUE  FALSE  FALSE  TRUE  FALSE  FALSE  FALSE  FALSE  C4  TRUE  TRUE  TRUE  FALSE  TRUE  TRUE  FALSE  FALSE  C5  TRUE  TRUE  TRUE  FALSE  FALSE  FALSE  TRUE  FALSE  C6  TRUE  TRUE  TRUE  FALSE  FALSE  FALSE  FALSE  TRUE For semiclassical BH signals, we explore different aspects of BH production and decay by simulating various scenarios, including nonrotating BHs (B1,C2), rotating BHs (B2,C1), rotating BHs with mass loss (B3), and rotating BHs with Yoshino-Rychkov bounds [72] (C4). Models C3, C5, and C6 explore the termination phase of the BH with different object multiplicities from the BH remnant, varying from 2-body decaying remnant (C3), stable remnant (C5, for which additionally the generator parameter NBODY was changed from its default value of 2 to 0), and "boiling" remnant (C6), where the remnant continues to evaporate until a maximum Hawking temperature equal to M D is reached. For each model, the fundamental Planck scale M D is varied within 2-9 TeV in 1 TeV steps, each with n ED = 2, 4, 6. The minimum black hole mass M min BH is varied between M D + 1 TeV and 11 TeV in 1 TeV steps. For SB signals, two sets of benchmark points are generated with CHARYBDIS 2, such that different regimes of the SB production can be explored. For a constant string coupling value g S = 0.2 the string scale M S is varied from 2 to 4 TeV, while at constant M S = 3.6 TeV, g S is varied from 0.2 to 0.4. For all SB samples, n ED = 6 is used. The SB dynamics below the first transition (M S /g S ), where the SB production cross section scales with g S 2 /M S 4 , are probed with the constant g S = 0.2 and low M S values as well as with the constant M S scan. The saturation regime (M S /g S < M SB < M S /g S 2 ), where the SB production cross section no longer depends on g S , is probed by the higher M S points of the constant g S benchmark. For each benchmark point, the scale M D is chosen such that the cross section at the SB-BH transition (M S /g S 2 ) is continuous.

Sphaleron signal samples
The electroweak sphaleron processes are generated at LO with the BARYOGEN v1.0 generator [49], capable of simulating various final states described in Section 1.2. We simulate the sphaleron signal for three values of the transition energy E sph = 8, 9, and 10 TeV. The partonlevel simulation is done with the CT10 LO PDF set [76]. In the process of studying various PDF sets, we found that the NNPDF3.0 yields a significantly larger fraction of sea quarks in the kinematic region of interest than all other modern PDFs. While the uncertainty in this fraction is close to 100%, we chose the CT10 set, for which this fraction is close to the median of the various PDF sets we studied. The PDF uncertainties discussed in Section 7 cover the variation in the signal acceptance between various PDFs due to this effect.  The cross section for sphaleron production is given by [48]: σ = PEF σ 0 , where σ 0 = 121, 10.1, and 0.51 fb for E sph = 8, 9, and 10 TeV, respectively, and PEF is the pre-exponential factor, defined as the fraction of all quark-quark interactions above the sphaleron energy threshold E sph that undergo the sphaleron transition.

Background samples
In addition, we use simulated samples of W+jets, Z+jets, γ+jets, tt, and QCD multijet events for auxiliary studies. These events are generated with the MADGRAPH5 aMC@NLO v2.2.2 [77] event generator at LO or next-to-LO, with the NNPDF3.0 PDF set of a matching order.
The fragmentation and hadronization of parton-level signal and background samples is done with PYTHIA v8.205 [78], using the underlying event tune CUETP8M1 [79]. All signal and background samples are reconstructed with the detailed simulation of the CMS detector via GEANT4 [80]. The effect of pileup interactions is simulated by superimposing simulated minimum bias events on the hard-scattering interaction, with the multiplicity distribution chosen to match the one observed in data.
6 Background estimate

Background composition
The main backgrounds in the analyzed multi-object final states are: QCD multijet, V+jets (where V = W, Z), γ+jets, and tt production, with the QCD multijet background being by far the most dominant. Figure 2 illustrates the relative importance of these backgrounds for the inclusive multiplicity N ≥ 3 and 6 cases, based on simulated background samples. To reach the overall agreement with the data, all simulated backgrounds except for the QCD multijets are normalized to the most accurate theoretical predictions available, while the QCD multijet background is normalized so that the total number of background events matches that in data. While we do not use simulated backgrounds to obtain the main results in this analysis, Fig. 2 illustrates an important point: not only is the QCD multijet background at least an order of magnitude more important than other backgrounds, for both low-and high-multiplicity cases, but also the shape of the S T distributions for all major backgrounds is very similar, so the method we use to estimate the multijet background, discussed below, provides an acceptable means of predicting the overall background as well. Figure 2: The S T distribution in data for inclusive multiplicities of (left) N ≥ 3 and (right) N ≥ 6, compared with the normalized background prediction from simulation, illustrating the relative contributions of major backgrounds.

Background shape determination
The background prediction method used in the analysis follows closely that in previous similar CMS searches [36, 62-64]. As discussed in Section 4, the central idea of this method is that the shape of the S T distribution for the dominant multijet background is invariant with respect to the final-state object multiplicity N. Consequently, the background shape can be extracted from low-multiplicity spectra and used to describe the background at high multiplicities. The S T value is preserved by the final-state radiation, which is the dominant source of extra jets beyond LO 2 → 2 QCD processes, as long as the additional jets are above the p T threshold used in the definition of S T . At the same time, jets from initial-state radiation (ISR) change the S T value, but they contribute a relatively small fraction of S T and also typically change the multiplicity N by just one unit. Consequently, we extract the background shape from the N = 3 S T spectrum, which already has a contribution from ISR jets, and therefore reproduces the S T shape at higher multiplicities better than the N = 2 spectrum used in earlier analyses. To estimate any residual noninvariance in the S T distribution, the N = 4 S T spectrum, normalized to the N = 3 spectrum in terms of the total number of events, is also used as an additional component of the background shape uncertainty. Furthermore, to be less sensitive to the higher instantaneous luminosity delivered by the LHC in 2016, which resulted in a higher pileup, and to further reduce the effect of ISR, the p T threshold for all objects was raised to 70 GeV, compared to 50 GeV used in earlier analyses. The reoptimization that has resulted in the choice of a new exclusive multiplicity to be used for the baseline QCD multijet background prediction and a higher minimum p T threshold for the objects counted toward S T was based on extensive studies of MC samples and low-S T events in data.
In order to obtain the background template, we use a set of 16 functions employed in earlier searches for BSM physics in dijets, VV events, and multijet events at various colliders. These functions typically have an exponential or power-law behavior with S T , and are described by 3-5 free parameters. Some of the functions are monotonously falling with S T by construction; however, some of them contain polynomial terms, such that they are not constrained to have a monotonic behavior. In order to determine the background shape, we fit the N = 3 S T distribution or the N = 4 S T distribution, normalized to the same total event count as the N = 3 distribution, in the range of 2.5-4.3 TeV, where any sizable contributions from BSM physics have been ruled out by earlier versions of this analysis, with all 16 functional forms. The lowest masses of the signal models considered, which have not been excluded by the previous analysis [36], contribute less than 2% to the total number of events within the fit range. Any functional form observed not to be monotonically decreasing up to S T = 13 TeV after the fit to both multiplicities is discarded. The largest spread among all the accepted functions in the N = 3 and N = 4 fits is used as an envelope of the systematic uncertainty in the background template. The use of both N = 3 and N = 4 distributions to construct the envelope allows one to take into account any residual S T noninvariance in the systematic uncertainty in the background prediction. We observe a good closure of the method to predict the background distributions in simulated QCD multijet events.
The best fits (taking into account the F-test criterion [81] within each set of nested functions) to the N = 3 and N = 4 distributions in data, along with the corresponding uncertainty envelopes, are shown in the two panels of Fig. 3. In both cases, the best fit function is f (x) = p 0 (1 − x 1/3 ) p 1 /(x p 2 +p 3 log 2 (x) ), where x = S T / √ s = S T /(13 TeV) and p i are the four free parameters of the fit. The envelope of the predictions at large S T (S T > 5.5 TeV, most relevant for the present search) is given by the fit with the following 5-parameter function: φ(x) = p 0 (1 − x) p 1 /(x p 2 +p 3 log(x)+p 4 log 2 (x) ) to the N = 4 (upper edge of the envelope) or N = 3 (lower edge of the envelope) distributions. For S T values below 5.5 TeV the envelope is built piecewise from other template functions fitted to either the N = 3 or N = 4 distribution.

Background normalization
The next step in the background estimation for various inclusive multiplicities is to normalize the template and the uncertainty envelope, obtained as described above, to low-S T data for various inclusive multiplicities. This has to be done with care, as the S T invariance is only expected to be observed above a certain threshold, which depends on the inclusive multiplicity requirement. Indeed, since there is a p T threshold on the objects whose transverse energies count toward the S T value, the minimum possible S T value depends on the number of objects in the final state, and therefore the shape invariance for an S T spectrum with N ≥ N min is only observed above a certain S T threshold, which increases with N min . In order to determine the minimum value of S T for which this invariance holds, we find a plateau in the ratio of the S T spectrum for each inclusive multiplicity to that for N = 3 in simulated multijet events. The plateau for each multiplicity is found by fitting the ratio with a sigmoid function. The lower bound of the normalization region (NR) is chosen to be above the 99% point of the corresponding sigmoid function. The upper bound of each NR is chosen to be 0.4 TeV above the corresponding lower bound to ensure sufficient event count in the NR. Since the size of the simulated QCD multijet background sample is not sufficient to reliably extract the turn-on threshold for inclusive multiplicities of N ≥ 9-11, for these multiplicities we use the same NR as for the N ≥ 8 distribution. Table 3 summarizes the turn-on thresholds and the NR boundaries obtained for each inclusive multiplicity.
The normalization scale factors are calculated as the ratio of the number of events in each NR for the inclusive multiplicities of N ≥ 3, . . . , 11 to that for the exclusive multiplicity of N = 3 in data, and are listed in Table 3. The relative scale factor uncertainties are derived from the number of events in each NR, as 1/ √ N NR , where N NR is the number of events in the corresponding NR.

Comparison with data
The results of the background prediction and their comparison with the observed data are shown in Figs. 4 and 5 for inclusive multiplicities N ≥ 3, . . . , 11. The data are consistent with the background predictions in the entire S T range probed, for all inclusive multiplicities.

Systematic uncertainties
There are several sources of systematic uncertainty in this analysis. Since the background estimation is based on control samples in data, the only uncertainties affecting the background predictions are the modeling of the background shape via template functions and the normalization of the chosen function to data at low S T , as described in Section 6. They are found to be 1-130% and 0.7-50%, depending on the values of S T and N min , respectively.
For the signal, we consider the uncertainties in the PDFs, jet energy scale (JES), and the integrated luminosity. For the PDF uncertainty, we only consider the effect on the signal acceptance, while the PDF uncertainty in the signal cross section is treated as a part of the theoretical uncertainty and therefore is not propagated in the experimental cross section limit. The uncertainty in the signal acceptance is calculated using PDF4LHC recommendations [82,83] based on the quadratic sum of variations from the MSTW2008 uncertainty set (≈0.5%), as well as the variations obtained by using three different PDF sets: MSTW2008, CTEQ6.1 [84], and NNPDF2.3 [75] (up to 6% based on the difference between the default and CTEQ6.1 sets) for one of the benchmark models (nonrotating BH with M D = 3 TeV, M BH = 5.5 TeV, and n = 2, as generated by BLACKMAX); the size of the effect for other benchmark points is similar. To be conservative, we assign a systematic uncertainty of 6% due to the choice of PDFs for all signal samples. The JES uncertainty affects the signal acceptance because of the kinematic requirements on the objects and the fraction of signal events passing a certain S min T threshold used for limit setting, as described in Section 8. In order to account for these effects, the jet fourmomenta are simultaneously shifted up or down by the JES uncertainty, which is a function of the jet p T and η, and the largest of the two differences with respect to the use of the nominal JES is assigned as the uncertainty. The uncertainty due to JES depends on M BH and varies between <1 and 5%; we conservatively assign a constant value of 5% as the signal acceptance uncertainty due to JES. Finally, the integrated luminosity is measured with an uncertainty of 2.5% [85]. Effects of all other uncertainties on the signal acceptance are negligible.
The values of systematic uncertainties that are used in this analysis are summarized in Table 4.

Results
As shown in Figs Figure 4: The background predictions after the normalization for inclusive multiplicities N ≥ 3, . . . , 6 (left to right, upper to lower). The gray band shows the background shape uncertainty alone and the red lines also include the normalization uncertainty. The bottom panels show the difference between the data and the background prediction from the fit, divided by the overall uncertainty, which includes the statistical uncertainty of data as well as the shape and normalization uncertainties in the background prediction, added in quadrature. model-independent limits on BSM physics in energetic, multiparticle final states, and as modelspecific limits for a set of semiclassical BH and SB scenarios, as well as for EW sphalerons.
Limits are set using the CL s method [86][87][88] with log-normal priors in the likelihood to constrain the nuisance parameters near their best estimated values. We do not use an asymptotic approximation of the CL s method [89], as for most of the models the optimal search region corresponds to a very low background expectation, in which case the asymptotic approximation is known to overestimate the search sensitivity.

Model-independent limits
The main result of this analysis is a set of model-independent upper limits on the product of signal cross section and acceptance (σ A) in inclusive N ≥ N min final states, as a function of the minimum S T requirement, S min T , obtained from a simple counting experiment for S T > S min T . These limits can then be translated into limits on the M min BH in a variety of models, or on any other signals resulting in an energetic, multi-object final state. We start with the limits for the inclusive multiplicities N ≥ 3, 4, which can be used to constrain models resulting in lower  Figure 5: The background predictions after normalization for inclusive multiplicities of N ≥ 7, . . . , 11 (left to right, upper to lower). The gray band shows the shape uncertainty and the red lines also include the normalization uncertainty. The bottom panels show the difference between the data and the background prediction from the fit, divided by the overall uncertainty, which includes the statistical uncertainty of data as well as the shape and normalization uncertainties in the background prediction, added in quadrature. The N ≥ 7 (N ≥ 8, . . . , 11) distributions also show contributions from benchmark BH (sphaleron) signals added to the expected background.      ±2.5% -Shape modeling -±(1-130)%, depending on S T Normalization -±(0.7-50)%, depending on N min multiplicities of the final-state objects. Since part of the data entering these distributions are used to determine the background shape and its uncertainties, the limits are set only for S min T values above the background fit region, i.e., for S T > 4.5 TeV. For other multiplicities, the limits are shown for S T values above the NRs listed in Table 3. These limits at 95% confidence level (CL) are shown in Figs. 6 and 7. When computing the limits, we use systematic uncertainties in the signal acceptance applicable to the specific models discussed in this paper, as documented in Section 7. It is reasonable to expect these limits to apply to a large variety of models resulting in multi-object final states dominated by jets. The limits on the product of the cross section and acceptance approach 0.08 fb at high values of S min T .

Model-specific limits
To determine the optimal point of S min T and the minimum multiplicity of the final-state objects N min for setting an exclusion limit for a particular model, we calculate the acceptance and the expected limit on the cross section for a given model for each point of the model-independent limit curves, for all inclusive multiplicities. The optimal point of (N min , S min T ) is chosen as the point that gives the lowest expected cross section limit. In most of the cases this point also maximizes the significance of an observation, for the case of a nonzero signal present in data [36].
An example of a model-specific limit is given in Fig. 8 for a BLACKMAX benchmark point B1 (nonrotating semiclassical BH) with M D = 4 TeV, n ED = 6, and M min BH between 5 and 11 TeV. In this case, the optimal inclusive multiplicity N min starts at 7 for the lowest M min BH value of 5 TeV, with the corresponding S min T = 5 TeV. As M min BH increases, the optimal point shifts to lower inclusive multiplicities and the corresponding S min T increases, reaching (3, 7.6 TeV) for M min BH = 11 TeV. The corresponding 95% CL upper limit curve and the theoretical cross section for the chosen benchmark point is shown in Fig. 8. The observed (expected) 95% CL lower limit on M min BH in this benchmark model can be read from this plot as the intersection of the theoretical curve with the observed (expected) 95% CL upper limit on the cross section, and is found to be 9.7 (9.7) TeV.
We repeat the above procedure for all chosen benchmark scenarios of semiclassical BHs, listed in Tables 1 and 2. The resulting observed limits on the M min BH are shown in Figs. 9 and 10, for the BLACKMAX and CHARYBDIS 2 benchmarks, respectively. We also obtain similar limits on the SB mass for the set of the SB model parameters we scanned. These limits are shown in Fig. 11 for a fixed string scale M S = 3.6 TeV, as a function of the string coupling g S (left plot) and for a fixed string coupling g S = 0.2 as a function of the string scale M S (right plot). The search excludes SB masses below 7.1-9.4 TeV, depending on the values of the string scale and coupling.
For the sphaleron signal, the optimal (N min , S min T ) point is also chosen by scanning for the lowest expected limit and is found to be (8, 6.2 TeV) for E sph = 9 and 10 TeV, and (9, 5.6 TeV) for The 95% CL upper exclusion limit on the signal cross section for each M min BH value is obtained at the optimal (N min , S min T ) point, which ranges from (7, 5.0 TeV) for M min BH = 5 TeV to (3, 7.6 TeV) for M min BH = 11 TeV. Also shown with a dashed line are the theoretical cross sections corresponding to these optimal points. The green (yellow) band represents the ±1 (±2) standard deviation uncertainty in the expected limit.   E sph = 8 TeV. Consequently, the exclusion limit on the sphaleron cross section can be converted into a limit on the PEF, defined in Section 5.2. Following Ref. [48] we calculate the PEF limits for the nominal E sph = 9 TeV, as well as for the modified values of E sph = 8 and 10 TeV. The observed and expected 95% CL upper limits on the PEF are shown in Fig. 12. The observed (expected) limit obtained for the nominal E sph = 9 TeV is 0.021 (0.012), which is an order of magnitude more stringent than the limit obtained in Ref. [ Figure 12: Observed (solid curve) and expected (dashed black curve) 95% CL upper limit on the pre-exponential factor PEF of the sphaleron production as a function of E sph . The green (yellow) band represents the ±1 (±2) standard deviation uncertainty in the expected limit. The area above the solid curve is excluded by this search.

Summary
A search has been presented for generic signals of beyond the standard model physics resulting in energetic multi-object final states, such as would be produced by semiclassical black holes, string balls, and electroweak sphalerons. The search was based on proton-proton collision data at a center-of-mass energy of 13 TeV, collected with the CMS detector in 2016 and corresponding to an integrated luminosity of 35.9 fb −1 . The background, dominated by QCD multijet production, is determined solely from low-multiplicity samples in data. Comparing the distribution of the total transverse momentum S T of the final-state objects in data with that expected from the backgrounds, we set 95% confidence level model-independent upper limits on the product of the production cross section and acceptance for such final states, as a function of the minimum S T for minimum final-state multiplicities between 3 and 11. These limits reach 0.08 fb at high S T thresholds. By calculating the acceptance values for benchmark black hole, string ball, and sphaleron signal models, we convert these model-independent limits into lower limits on the minimum semiclassical black hole mass and string ball mass. The limits extend as high as 10.1 TeV, thus improving significantly on previous results. We have also set the first experimental upper limit on the electroweak sphaleron pre-exponential factor of 0.021 for the sphaleron transition energy of 9 TeV.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus   [23] R. Barbieri and G. F. Giudice, "Upper bounds on supersymmetric particle masses", Nucl. Phys. B 306 (1988) 63, doi:10.1016/0550-3213(88)90171-X.