How to discover QCD Instantons at the LHC

The Standard Model of particle physics predicts the existence of quantum tunnelling processes across topologically inequivalent vacua, known as Instantons. In the electroweak sector, Instantons provide a source of baryon asymmetry within the Standard Model. In Quantum Chromodynamics they are linked to chiral symmetry breaking and confinement. The direct experimental observation of Instanton-induced processes would therefore be a breakthrough in modern particle physics. Recently, new calculations for QCD Instanton processes in proton-proton collisions have become available, suggesting sizable cross sections as well as promising experimental signatures at the LHC. In this work, we study possible analysis strategies to discover QCD Instanton-induced processes at the LHC and derive a first limit based on existing Minimum Bias data.


Introduction
Yang-Mills theories [1], embedded in the Standard Model (SM) of particle physics, form the basis of our understanding of the strong and electroweak interactions. The beauty and success of the SM lies in its predictive power, which is however only achieved in the weakly coupled regime. Perturbation theory, developed in order to describe hadron collisions at high energies, relies on the smallness of the strong coupling at high momentum transfers and short distances. The study of Quantum Chromodynamics (QCD) in this perturbative regime has seen tremendous advancements in the last decades. Hard processes have been calculated up to the third order in the strong coupling [2,3], and perturbative QCD predictions have been verified to incredible precision over many orders of magnitude of momentum transfer at high energy colliders [4]. At the same time, a fundamental understanding of Yang-Mills theories in the strongly coupled limit is still lacking, and remains one of the biggest challenges for particle physics to date.
Unlike the abelian case, Yang-Mills theories exhibit a rich and non-trivial vacuum structure. In particular, they admit semi-classical solutions corresponding to fluctuations of the gauge fields across topologically non-equivalent vacua, the Instantons [5]. These inherently non-perturbative phenomena are of great theoretical interest (introductory reviews on the physics of Instantons can be found in [6][7][8][9]).
The vacuum structure of a Yang-Mills theory is depicted in Figure 1, which shows the energy density of the gauge field as a function of the Chern-Simons (or winding) number, $N_{CS}$, characterising the topological charge of a system. Instantons describe tunnelling transitions in Minkowski spacetime between classically degenerate vacua whose winding numbers differ by one unit, i.e. $\Delta N_{CS} = 1$. Instanton solutions are localised not only in time but also in space, i.e. they have a certain spatial extension. There also exists a second class of classical solutions, known as Sphalerons, corresponding to a transition from one vacuum by a half-integer winding number to the top of the energy barrier (also shown in Figure 1); their static energy corresponds to the barrier height. These tunnelling solutions differ significantly from the solutions known from ordinary perturbation theory, where only those field configurations are accessible that correspond to small changes of the vacuum field at $N_{CS} = 0$, while other minima, which are not accessible by continuous transformations of the gauge field, are ignored. Clearly, this approximation holds only as long as the energy barrier between the vacua is sufficiently large. Instanton and Sphaleron solutions provide crucial ingredients for an understanding of a number of non-perturbative issues in the SM. In the electroweak theory, Instanton and Sphaleron transitions are associated with B + L violation. These become highly relevant at high temperatures [7,10,11] and have a crucial impact on the evolution of the baryon and lepton asymmetries of the universe (see also Ref. [12] for a review). In QCD, these topological solutions have been argued to play an important role in various long-distance aspects of the theory. They provide a possible solution to the axial U(1) problem [13] and are associated with chiral symmetry breaking [14][15][16].
The height of the energy barrier between two vacua, called the Sphaleron mass $M_{Sp}$, is in the electroweak theory of the order of $M_{Sp} \sim \frac{\pi}{\alpha_W \rho_{eff}} \sim \frac{\pi M_W}{\alpha_W} \sim 10$ TeV [17], where $\alpha_W$ is the weak coupling constant and $\rho_{eff}$ the effective Instanton size. As the energy barrier is lower than the LHC center-of-mass energy, one might think that electroweak Sphalerons should be produced and could be observed at the LHC. However, it was shown that the difficulty of obtaining a coherent state makes these processes likely to remain unobservably small at current and future colliders [18,19]. The situation is different for QCD Instanton processes, for which the energy barrier, $M_{Sp} \sim \frac{3\pi}{4 \alpha_s \rho_{eff}} \sim Q$ [20], with $\alpha_s$ the strong coupling and the parameter $Q$ related to the energy scale of the underlying process, can be as low as a few GeV. Searches for Instanton processes have been performed in Deep Inelastic Scattering at the HERA collider [21][22][23], already excluding the lower range of the predicted cross sections. It is therefore interesting to understand whether these processes could also be measured at the LHC. Recent works have provided first calculations of LHC cross sections [24] and some discussion of the expected phenomenology [25]. In this work we explore in further detail suitable analysis strategies at the LHC, in particular the (relatively) small-size regime with Instanton masses of a few tens of GeV, where the cross section is the highest. In this regime the challenge lies in finding suitable observables that, while retaining sensitivity to the soft decay products of the Instanton, can also be described to an acceptable level of accuracy by the non-perturbative models of soft QCD activity.
The paper is structured as follows: In section 2 we briefly review Instanton processes at the LHC, covering their expected production cross sections and experimental signature. This is followed by an overview of the Monte Carlo samples used in section 3. Possible search strategies and the optimisation of the event selection are described in section 4. The expected sensitivity of the proposed analysis, as well as a first limit on QCD Instanton processes, are presented in section 5. The paper concludes in section 6.

Production of the Instanton Pseudo-Particle in Proton-Proton Collisions
The expected inclusive cross sections of QCD Instanton-induced processes in Instanton perturbation theory [20,26] can schematically be written as
$\sigma \sim \int d\rho \int d\bar\rho \; D(\rho)\, D(\bar\rho)\; e^{-\frac{4\pi}{\alpha}\,\Omega(E/M_I)} \cdot (\text{further terms})$, (2.1)
where $D(\rho)$ and $D(\bar\rho)$ denote the Instanton and anti-Instanton size distributions, $E$ the available energy of the process, and $\Omega$ describes the Instanton anti-Instanton interaction, with $\Omega(x) = 1$ for $x \to 0$ and $\Omega(x) = 0$ for $x \to \infty$. The Instanton size distribution behaves as $D(\rho) \sim \rho^{\,11 - \frac{2}{3} n_f - 5}$ [14,20,27,28,29], thus an integral over $\rho$ would diverge. However, it was shown that the additional term $e^{-(\rho + \bar\rho) Q}$ has to be taken into account [20], where $Q$ describes a generic hard scale of the Instanton process. This form factor effect renders the $\rho$ integration convergent. In order to make reliable calculations of cross sections in QCD, Instanton perturbation theory [20] has to be applied. This requires the validity of the dilute gas approximation [20], i.e. it requires that the extensions of Instantons and anti-Instantons do not overlap. Therefore, the validity of Instanton perturbation theory requires Instantons to be sufficiently localized in space-time. In QCD, a generic hard scale $Q$ of the underlying process can be defined which reduces the Instanton size, justifying the dilute gas approximation and enabling the Instanton perturbation theory approach. Here, the cross section can become sizeable at high energies. The reason for the increasing cross section can be intuitively understood [30] by changing the picture from a tunnelling between vacua at $E = 0$ to that of the actual creation of a Sphaleron-like configuration [17] on top of the potential barrier of height $M_{Sp}$. Therefore, in a naive (but not fully correct) picture, the Instanton process can be interpreted as the creation and decay of a Sphaleron pseudo-particle, where the mass of the pseudo-particle depends directly on the height of the potential barrier.
[Figure caption: Depiction of a QCD Instanton process in proton-proton collisions without the requirement of an external scale parameter Q.]
For deep inelastic scattering processes, the necessary hard scale $Q$ was defined by a highly virtual momentum transfer by a photon, emitted by the incoming electron, yielding a high-energy final-state quark in addition to the Instanton process, as shown in Figure 2. This concept can be easily transferred to proton-proton collisions, where the photon exchange is simply replaced by a gluon, as shown in Figure 2. A first calculation of the latter processes recently became public [25], suggesting negligible cross sections at typical LHC energies.
However, an alternative strategy to calculate finite cross sections was also recently published [24]. Here, no second independent kinematic scale, such as the highly virtual momentum scale $Q$ in DIS, is required (Fig. 3). In this approach, only small Instantons contribute to the scattering processes in QCD, and potentially problematic contributions of Instantons with large size are automatically cut off by the inclusion of quantum effects due to interactions of the hard initial states, which generate the factor $e^{-\alpha_s \rho^2 s \log s}$. This argument was already put forward in 1991 [31,32] and provides a dynamical solution to the problem of IR divergences arising from Instantons of large scale-sizes in QCD. Basically, the scale invariance of the classical Yang-Mills theories is broken by those quantum effects, which lead to a suppression of all but small Instantons with sizes $\rho \lesssim (10 - 30)/\sqrt{s}$. The characteristic QCD Instanton size is therefore inversely proportional to the centre-of-mass (CoM) energy of the two colliding partons, $\sqrt{s}$. Table 1 shows the proton-proton cross sections for Instanton processes at a center-of-mass energy of 13 TeV for various choices of minimal values of $s_{\min}$ as calculated in [24]. The cross section at $s_{\min} = 20$ GeV already contributes several percent to the total proton-proton cross section and would saturate it at even lower values. This implies that the calculation breaks down at some small value of $s_{\min}$.
In the following it is assumed that the cross section estimate is reliable for $s_{\min} > 20$ GeV, keeping in mind that the uncertainty on this prediction could be of several orders of magnitude. The cross section dependence can be interpolated by a phenomenological formula $\sigma \sim e^{a \cdot e^{b \cdot x} + c \cdot x^2 + d \cdot x + e}$, implying an exponential decrease vs. $s_{\min}$. In the pseudo-particle picture of the Instanton process, the available energy $s_{\min}$ can be interpreted as the Instanton mass $m_I$. The production of Instantons is therefore not a resonant but a continuous process, with large production rates expected for Instantons of low mass and small production rates for Instantons of high mass.

Decay of the Instanton Pseudo-Particle and Experimental Observables
The question whether manifestations of such topological fluctuations, i.e. Instanton processes, can be directly observed in high-energy experiments was already raised in the 1980s in the context of the electroweak sector [33][34][35][36]. A QCD Instanton tunnelling process between $\Delta N_{CS} = 1$ vacua leads to the creation of a quark-antiquark pair of different chirality for each flavour, $N_f$, in association with a number $n_g$ of additional gluons. Following the approach of [24] we only consider the dominant contribution from the gluon-gluon transition, which proceeds through the process
$g + g \to n_g \times g + \sum_{f=1}^{N_f} (q_{Rf} + \bar q_{Lf})$. (2.2)
In the pseudo-particle picture, this can be interpreted as a decay process of an Instanton pseudo-particle with a mass $m_I$. For low Instanton masses, e.g. in the 50 GeV range, we therefore expect an isotropic decay into up to 5 quarks, 5 anti-quarks and 5-10 gluons. The number of gluons is assumed to be Poisson distributed around $n_g$, which has been calculated in [24]; it depends in turn on $m_I$ and varies between 5 and 13 over the broad range 10 GeV $< m_I <$ 4 TeV. As a consequence, QCD Instanton-induced scattering processes produce soft bombs: very high-multiplicity, spherically symmetric distributions of relatively soft particles [37]. These generic properties of QCD Instanton decays are reflected in the experimental observables that can be used to search for these processes. Most importantly, the Instanton cross section falls rapidly with increasing mass of the Instanton pseudo-particle, $m_I$, or equivalently with the center-of-mass energy at parton level, $\sqrt{s}$, of the Instanton process. The cross section dependence on $\sqrt{s}$ is expected to be very different from that of other SM processes. Hence different regimes of $m_I$ are expected to have, on the one hand, different signal-to-background ratios and, on the other hand, different processes contributing to the background.
Table 1. σ(pp → I → X) [pb]: $1.7 \cdot 10^{11}$, $6.3 \cdot 10^{9}$, $4.1 \cdot 10^{7}$, $8.0 \cdot 10^{4}$, $1.1 \cdot 10^{2}$, $3.5 \cdot 10^{-3}$
Experimentally, $m_I$ can be approximated by the invariant mass of the 4-vector sum of all charged particles above a certain minimal transverse momentum.
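As an illustration of this mass estimate, the following minimal Python sketch sums massless track 4-vectors built from (p_T, η, φ) and returns their invariant mass. The function names and the default threshold `pt_min=0.5` (in GeV) are assumptions made for this example and are not taken from the analysis code.

```python
import math


def track_four_vector(pt, eta, phi):
    """Return (E, px, py, pz) of a massless track given pT [GeV], eta and phi."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = pt * math.cosh(eta)  # |p| for a massless particle
    return energy, px, py, pz


def reconstructed_mass(tracks, pt_min=0.5):
    """Invariant mass of the 4-vector sum of all tracks with pT > pt_min.

    `tracks` is an iterable of (pt, eta, phi) tuples; pt_min is an
    illustrative threshold (assumption), not a value fixed by this sketch.
    """
    e_sum = px_sum = py_sum = pz_sum = 0.0
    for pt, eta, phi in tracks:
        if pt <= pt_min:
            continue  # drop tracks below the transverse momentum threshold
        e, px, py, pz = track_four_vector(pt, eta, phi)
        e_sum += e
        px_sum += px
        py_sum += py
        pz_sum += pz
    m2 = e_sum**2 - px_sum**2 - py_sum**2 - pz_sum**2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


# Two back-to-back central tracks of 10 GeV each plus one soft track
example = [(10.0, 0.0, 0.0), (10.0, 0.0, math.pi), (0.3, 1.0, 0.5)]
mass = reconstructed_mass(example)
```

Since only charged particles above threshold enter the sum, the value obtained this way is a proxy for the true Instanton mass rather than a full reconstruction.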

MC Samples and Detector Simulation
While the decay of low-mass Instantons, e.g. $m_I = 30$ GeV, results in events with a high multiplicity of low-energy charged particle tracks, the decay of Instantons with masses larger than 200 GeV results in numerous reconstructed particle jets. Depending on the range of Instanton masses considered, different SM processes can thus act as background. In the low mass regime, softQCD events originating from inelastic, non-diffractive processes play the dominant role due to their large production cross section. For higher Instanton masses, high-$p_T$ jet production processes (hardQCD), as well as vector-boson and top anti-top quark pair production in the hadronic decay channels, contribute.
The different signal and background samples used in this study are summarized in Table 2 and discussed in the following sections. A typical detector response for the different generated samples has been simulated through the Delphes framework [38] with settings corresponding to the ATLAS experiment, and without considering additional pile-up interactions. Pile-up activity could become an important source of background when selecting events with high multiplicity final states, and will require dedicated studies in LHC analyses that cannot be performed with parametrised simulations.
Several observables sensitive to Instanton processes have been defined at detector level. Basic reconstructed quantities are the 4-vectors of charged particles reconstructed as particle tracks, as well as particle jets, reconstructed using an anti-$k_T$ algorithm with a radius parameter of 0.4. Tracks are assumed to be massless and hence their 4-vector is defined by their transverse momentum, $p_T$, in the x-y plane, the polar angle θ measured from the positive z axis, and the azimuthal angle φ in the x-y plane. The polar angle is mostly expressed in terms of the pseudorapidity η, defined by $\eta = -\ln \tan(\theta/2)$. We require transverse momenta greater than 500 MeV for the reconstructed tracks, as well as a maximal absolute pseudorapidity of 2.5. Particle jets are required to have a transverse momentum of at least 20 GeV and $|\eta| < 2.5$.
First experimental observables for the selection of Instanton processes are the number of reconstructed tracks, $N_{Trk}$, and the number of reconstructed jets, $N_{jet}$. Since many charged decay particles are expected from the decay of an Instanton pseudo-particle with a given mass, the ratio of the mass and the number of tracks, $m_I / N_{Trk}$, is also of interest. Similarly, the scalar sum of the transverse momenta $p_T$ of all charged tracks (or particle jets), $S_T = \sum_i |p_T^i|$, is studied in dependence of $m_I$. Isotropic decays of resonances are expected to have more central than forward activity, i.e. the pseudorapidity distribution of all charged tracks, $\eta_{Trk}$, as well as the average pseudorapidity of charged tracks per event, $\langle \eta_{Trk} \rangle$, are expected to be sensitive. Due to the presence of c- and b-quarks as decay products of the Instanton and the relatively long lifetimes of the corresponding hadrons, one might expect a higher number of charged particles from displaced vertices compared to other Standard Model processes. The number of reconstructed charged particle tracks with a production vertex that has a distance in the transverse plane of more than 0.02 mm to the primary vertex of the collision, $N_{Displaced}$, is therefore also studied.
The observables discussed previously are not directly related to the expected isotropy of Instanton decays. One variable that targets the isotropy is the event sphericity, defined via the tensor
$S^{\alpha\beta} = \frac{\sum_i p_i^\alpha p_i^\beta}{\sum_i |\vec p_i|^2}$,
where the indices α and β denote the x, y and z components of the momentum of particle i in the rest frame of the event. The sphericity of the event is then constructed using the two smallest eigenvalues of this tensor, $\lambda_2$ and $\lambda_3$, i.e. $S = \frac{3}{2} (\lambda_2 + \lambda_3)$, and takes values between 0 and 1. A fully balanced dijet event leads to a sphericity of $S = 0$, while a fully isotropic event has a sphericity of $S = 1$. A similar event shape variable is thrust, defined as
$T = \max_{\vec n} \frac{\sum_i |\vec p_i \cdot \vec n|}{\sum_i |\vec p_i|}$,
where $\vec n$ is a unit vector, with $\tau = 1 - T$. Fully spherically symmetric events yield $\tau = 0.5$, while fully balanced dijet events have $\tau = 0$. The definition of thrust also defines the thrust axis, i.e. the vector $\vec n$ that maximizes the value of $T$. The thrust axis defines a left (L) and right (R) hemisphere for each event, which can be used to define the jet broadening of an event. The left and right broadening is defined as
$B_{L,R} = \frac{\sum_{i \in L,R} |\vec p_i \times \vec n|}{2 \sum_i |\vec p_i|}$. (3.2)
The total jet broadening $B$ is then defined as $B = B_L + B_R$ and behaves similarly to $\tau$, i.e. it is 0 and 0.5 for dijet and spherically symmetric events, respectively.
The sphericity $S$, the thrust $T$ and the total jet broadening $B$ are calculated using all reconstructed tracks, as well as using all reconstructed jets, per event. The calculation is based on the code provided in [39]. It should be noted that these three observables are significantly correlated. An additional correlation is observed between $S$ and the $\eta_{Trk}$ distribution, as events with large values of $S$ tend to have an enhanced number of tracks in the central region, i.e. with $|\eta| < 1.0$.
Table 2 (recovered rows). Process | Generator | Settings | Events
softQCD | Pythia8 | SoftQCD:all = on, A14 Tune | 1,000,000
softQCD | Sherpa | | 1,000,000
softQCD | Herwig | | 1,000,000
qq → X, qg → X, gg → X (hardQCD) | Pythia8 | HardQCD:all = on, PhaseSpace:pTHatMin = 5. | 1,000,000
qq → X, qg → X, gg → X (hardQCD) | Pythia8 | HardQCDAll=on | 1,000,000
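The sphericity described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration using the standard momentum-tensor definition, normalised to the sum of squared momenta; it is not the code of [39], and the function name and input layout are assumptions made for this example.

```python
import numpy as np


def sphericity(momenta):
    """Event sphericity S = 3/2 (lambda_2 + lambda_3).

    `momenta` is a sequence of (px, py, pz) 3-momenta, assumed to be
    given in the rest frame of the event.
    """
    p = np.asarray(momenta, dtype=float)
    # Normalised momentum tensor: sum_i p_i^a p_i^b / sum_i |p_i|^2
    tensor = p.T @ p / np.sum(p * p)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending
    # order, so the first two entries are the two smallest eigenvalues.
    eigenvalues = np.linalg.eigvalsh(tensor)
    return 1.5 * (eigenvalues[0] + eigenvalues[1])


# Limiting cases: a back-to-back pair and momenta along all six axis directions
dijet_like = [(0.0, 0.0, 10.0), (0.0, 0.0, -10.0)]
isotropic_like = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
s_dijet = sphericity(dijet_like)
s_iso = sphericity(isotropic_like)
```

The two example configurations reproduce the limiting values quoted in the text, S = 0 for a balanced dijet topology and S = 1 for a fully isotropic one.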

Soft QCD Processes
Due to their non-perturbative nature, softQCD processes are described by phenomenological models, which have been tuned to data using a wide variety of reference measurements. Within this study, the softQCD process implementation in the Pythia8 [40] generator is used as baseline.
In total one million events have been generated for proton-proton collisions at a center of mass energy of 13 TeV using the NNPDF23lo PDF set [41].
We also consider softQCD production in Pythia8 using a different tune, A14 [42], as well as the predictions of softQCD processes from the Herwig7 [43] and Sherpa [44,45] event generators. The comparison of normalised distributions for the observables of interest for Instanton processes is shown in Figure 4 for events with an invariant mass of reconstructed tracks between 20 and 40 GeV. Good agreement is observed between the different predictions for most distributions, with the possible exception of the $N_{Displaced}$ distribution, where the Sherpa prediction differs by about 20% from the other generators. In the following, the maximal difference between the various softQCD samples is taken as systematic uncertainty on the nominal softQCD prediction from Pythia8.
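A minimal sketch of such an envelope uncertainty is given below. It assumes that the difference is evaluated bin by bin between the nominal prediction and each alternative sample; whether the analysis instead takes the largest pairwise difference among all samples is not specified here. All names and the example numbers are hypothetical.

```python
def envelope_uncertainty(nominal, alternatives):
    """Per-bin systematic uncertainty from the largest deviation.

    `nominal` is a list of bin contents of the baseline prediction and
    `alternatives` a list of equally binned alternative predictions.
    Assumption: the uncertainty is the maximal absolute difference
    between the nominal and any alternative in each bin.
    """
    uncertainties = []
    for i, nominal_value in enumerate(nominal):
        deviations = [abs(alt[i] - nominal_value) for alt in alternatives]
        uncertainties.append(max(deviations))
    return uncertainties


# Hypothetical normalised three-bin distributions (illustrative numbers only)
pythia_nominal = [0.50, 0.30, 0.20]
other_samples = [
    [0.45, 0.35, 0.20],  # e.g. a different tune
    [0.50, 0.25, 0.25],  # e.g. a different generator
]
systematic = envelope_uncertainty(pythia_nominal, other_samples)
```

Taking the envelope rather than a quadratic sum avoids counting correlated modelling differences several times, at the price of being driven by the single most discrepant sample.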

Hard QCD and Further Processes
High-$p_T$ jet production processes can be predicted with high accuracy in perturbation theory. Pythia8 with the NNPDF23lo PDF set was used to simulate hardQCD multi-jet final states at a center-of-mass energy of 13 TeV. The transition between the softQCD and hardQCD processes is not well defined. In our study, we use softQCD samples for all events that have no jet at particle level with a transverse momentum above 20 GeV, while the simulation of hardQCD processes is used for all other events. In addition to multi-jet final states, the production of top-quark pairs, of W and Z bosons and of di-bosons can also lead to high-multiplicity final states, in particular in their fully hadronic decay channels. These processes are also simulated with Pythia8 at leading order in $\alpha_s$ with the NNPDF23lo PDF set. An uncertainty of 10% on all predictions of multi-jet, $t\bar{t}$ and vector-boson processes is assumed in the following, to account for theoretical uncertainties from missing higher orders. However, it should be noted that it is possible to simulate these processes at higher perturbative accuracy. In addition, a realistic data analysis could use the leptonic decay channels of vector-boson and $t\bar{t}$ production to validate the theoretical predictions in dedicated control regions, thus significantly reducing the model uncertainties.

Signal Samples
The Instanton signal samples have been produced with a modified version of the Sherpa event generator [24,44,45]. The predicted cross sections of Instanton-induced processes for different Instanton masses $m_I$ have been implemented based on the calculations in [24] and are shown in Figure 5. The figure also shows the predicted dependence of the Instanton cross section on the center-of-mass energy of the proton-proton collisions for two different values of $s_{\min}$, together with the cross sections for soft and hard QCD processes for comparison. It is interesting to note that the Instanton cross sections exhibit a different dependence on $\sqrt{s}$ than softQCD and hardQCD processes. The decay of the Instanton pseudo-particle in the Sherpa implementation proceeds as follows [24]: first, the particle content of the final state is determined, where quark-antiquark pairs $q\bar{q}$, starting from the lowest mass, are added as long as the quark mass $m_q$ is smaller than a kinematics-dependent threshold $\mu_q$, i.e. $m_q < \mu_q$, and as long as the combined mass of all pair-produced quarks is smaller than the Instanton mass. In a second step, the number of additional gluons is determined according to a Poisson distribution with mean $n_g$. The Rambo algorithm [46] is then used to distribute the momenta isotropically among all decay products in the rest frame of the Instanton pseudo-particle, which are then boosted back to the lab frame. The subsequent showering and hadronization is based on the standard Sherpa implementation. The shapes of selected experimental observables for Instanton processes are shown in Figure 6 for a mass range of 500 GeV to 800 GeV. For comparison, hardQCD processes with a minimal energy $s$ of 500 GeV are also shown. Several things have to be noted: first, the implementation of the Instanton production used here yields significantly too few events for Instanton masses between 20 and 50 GeV, as well as some outliers at very high Instanton masses.
To correct for this behaviour, all generated Instanton events are reweighted to the cross section prediction shown in Figure 5, based on the reconstructed Instanton mass on an event-by-event level. The potential bias introduced by this reweighting procedure is expected to be small compared to the theoretical uncertainties on the calculation. Secondly, it is not clear at which value of $s_{\min}$ the Instanton cross section calculation breaks down. Hence a search for Instantons over the full mass range is necessary. In total six Instanton signal samples have been produced, each covering an exclusive mass range between 20 and 600 GeV. An overview is given in Table 2.
In order to validate the main properties of the Instanton decay in the Sherpa implementation, additional samples of a similar decay have been produced with the Herwig7 generator [47], where isotropic decays of pseudo-particles with masses between 500 GeV and 800 GeV into $2 \cdot N_f$ quarks and $n_g$ gluons have been simulated. Reasonable agreement has been observed.
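The event-by-event reweighting described above can be sketched as follows. The sketch assumes a simple binned approach in which each event receives the ratio of the target yield to the generated event count in its mass bin; the actual procedure may use a finer or unbinned scheme. The bin edges, target yields and function name are hypothetical.

```python
from bisect import bisect_right


def mass_reweighting(masses, bin_edges, target_per_bin):
    """Return one weight per event so that the weighted mass spectrum
    matches `target_per_bin` (e.g. cross section times luminosity per bin).

    Events outside the binned range receive weight 0. The upper edge of
    the last bin is exclusive.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    bin_index = []
    for m in masses:
        i = bisect_right(bin_edges, m) - 1
        if 0 <= i < n_bins:
            counts[i] += 1
            bin_index.append(i)
        else:
            bin_index.append(None)  # outside the binned mass range
    return [
        target_per_bin[i] / counts[i] if i is not None else 0.0
        for i in bin_index
    ]


# Hypothetical reconstructed masses [GeV], bin edges and target yields
reco_masses = [25.0, 30.0, 45.0, 100.0]
edges = [20.0, 40.0, 80.0]
targets = [10.0, 3.0]
weights = mass_reweighting(reco_masses, edges, targets)
```

By construction the weights in each populated bin sum to the target yield of that bin, so only the shape within a bin is inherited from the generator.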
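The decay procedure described above, a Poisson-distributed number of gluons followed by an isotropic phase-space distribution of the momenta, can be illustrated with the following NumPy sketch. It is a strongly simplified stand-in and not the Sherpa or Herwig7 implementation: all partons are treated as massless, the number of quark flavours is passed in by hand instead of being derived from the thresholds $\mu_q$, no boost to the lab frame is applied, and showering and hadronization are omitted. The function names are assumptions made for this example; the momentum generation follows the massless RAMBO construction.

```python
import numpy as np


def rambo_massless(n, mass, rng):
    """Generate n >= 2 massless 4-momenta (E, px, py, pz) that sum to
    (mass, 0, 0, 0), following the massless RAMBO construction."""
    r = rng.random((n, 4))
    cos_theta = 2.0 * r[:, 0] - 1.0
    sin_theta = np.sqrt(1.0 - cos_theta**2)
    phi = 2.0 * np.pi * r[:, 1]
    # 1 - r lies in (0, 1], which avoids log(0)
    q0 = -np.log((1.0 - r[:, 2]) * (1.0 - r[:, 3]))
    q = np.column_stack([
        q0,
        q0 * sin_theta * np.cos(phi),
        q0 * sin_theta * np.sin(phi),
        q0 * cos_theta,
    ])
    # Conformal transformation: boost to the rest frame of the sum and rescale
    total = q.sum(axis=0)
    m_total = np.sqrt(total[0] ** 2 - total[1:] @ total[1:])
    b = -total[1:] / m_total
    x = mass / m_total
    gamma = total[0] / m_total
    a = 1.0 / (1.0 + gamma)
    bq = q[:, 1:] @ b
    p0 = x * (gamma * q[:, 0] + bq)
    pvec = x * (q[:, 1:] + np.outer(q[:, 0] + a * bq, b))
    return np.column_stack([p0, pvec])


def sketch_instanton_decay(mass, n_flavours, mean_gluons, seed=0):
    """Toy decay: one q-qbar pair per flavour plus Poisson-distributed gluons,
    all massless (assumption), distributed isotropically in the rest frame."""
    rng = np.random.default_rng(seed)
    n_gluons = int(rng.poisson(mean_gluons))
    labels = ["q", "qbar"] * n_flavours + ["g"] * n_gluons
    momenta = rambo_massless(len(labels), mass, rng)
    return labels, momenta


labels, momenta = sketch_instanton_decay(mass=50.0, n_flavours=4, mean_gluons=7.0, seed=1)
```

For massless particles the RAMBO weight depends only on the multiplicity and the total energy, so the generated configurations are uniformly distributed in phase space without further unweighting.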

Search Strategies
In contrast to most searches for new particles, no resonance behaviour is expected for Instanton-induced processes, but rather a continuous, rapidly falling spectrum of the invariant mass of all hadronic final state objects. This poses significant challenges for the search for Instanton-induced processes. While sizeable cross sections are expected for small Instanton masses, the experimental signatures in this energy regime might be difficult to distinguish from soft QCD activity. At high luminosities, the large amount of pile-up events further complicates such a search. In the high energy regime, the experimental signatures of Instanton-induced processes are striking; their cross sections are however highly suppressed and hence difficult to observe in the first place.
The expected invariant mass distribution of reconstructed tracks is shown in Figure 7 for SM background and Instanton processes, scaled to the expected event yields for an integrated luminosity of $\int L\,dt = 1$ pb$^{-1}$. At low invariant masses, softQCD processes dominate, while for high invariant masses hardQCD processes as well as top-quark pair and electroweak boson production become relevant. The signal over background ratio falls rapidly with increasing mass, suggesting a higher chance of observing Instanton processes in the low mass regime. As a high-multiplicity final state is expected for the Instanton processes, the number of reconstructed charged particle tracks or the number of reconstructed jets in the event can be used as a powerful discriminant against backgrounds. Figure 7 shows the invariant mass distribution for signal and background processes when requiring at least eight reconstructed jets with a minimal momentum of 20 GeV. The expected signal to background ratio increases by several orders of magnitude but remains below $10^{-4}$, with the expectation of 1 Instanton process at an integrated luminosity of $\int L\,dt \approx 0.1$ fb$^{-1}$, showing the challenge of observing Instanton processes for masses of several hundred GeV.
Figure 7. Invariant mass distribution of all reconstructed tracks for standard model background processes and Instanton processes (left) as well as the same distribution for events with at least ten reconstructed particle jets with $p_T > 20$ GeV (right). The events correspond to an integrated luminosity of $L = 1$ pb$^{-1}$ and the distributions from all processes except the Instanton process are stacked. The model uncertainties are indicated as bands. The lower plots show the signal over background ratio corresponding to the upper row.
One possible avenue is to exploit the different energy dependence of the production cross section of the Instanton processes and of the background SM processes. A simultaneous search at the different center-of-mass energies available at the LHC might help to strengthen a potential observation.
In the following, we present and discuss possible analysis strategies. Our analysis is performed in four different mass ranges: $20 < \sqrt{s_{\min}} < 40$ GeV and $40 < \sqrt{s_{\min}} < 80$ GeV for the low mass regime, where softQCD processes dominate, $200 < \sqrt{s_{\min}} < 300$ GeV for the medium mass regime, where hardQCD processes dominate, and $300 < \sqrt{s_{\min}} < 500$ GeV for the high mass regime, where top-quark pair production also becomes relevant. Different signal selections have been studied and optimised with respect to the signal to background ratio. In addition, at least two control regions are defined for each mass range. These regions are designed to have only a small signal contribution and can therefore be used for the validation of the modelling of background processes. In fact, the control regions can also be used for an ABCD-based background estimation technique, i.e. to determine the background contribution in the signal regions in a fully data-driven way.
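For orientation, the generic ABCD technique can be sketched as below. The sketch assumes the textbook form of the method, with three background-dominated regions defined by inverting two selection variables that are approximately uncorrelated for the background; how the control regions of this study would be mapped onto these regions is not specified here. The function name, argument names and example yields are hypothetical.

```python
import math


def abcd_estimate(n_cr_var1, n_cr_var2, n_cr_both):
    """Data-driven background estimate in the signal region.

    n_cr_var1: events in the region with only the first requirement inverted
    n_cr_var2: events in the region with only the second requirement inverted
    n_cr_both: events in the region with both requirements inverted

    Assumes the two variables are uncorrelated for the background, so that
    N_SR = n_cr_var1 * n_cr_var2 / n_cr_both. The returned uncertainty is
    the Poisson statistical uncertainty propagated from the three yields.
    """
    estimate = n_cr_var1 * n_cr_var2 / n_cr_both
    relative_uncertainty = math.sqrt(
        1.0 / n_cr_var1 + 1.0 / n_cr_var2 + 1.0 / n_cr_both
    )
    return estimate, estimate * relative_uncertainty


# Hypothetical control-region yields (illustrative numbers only)
background, background_unc = abcd_estimate(200, 300, 600)
```

Any residual correlation between the two variables, or signal contamination in the control regions, biases this estimate and would have to be covered by an additional systematic uncertainty.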

Very Low Instanton Masses: The Soft QCD Regime
The very low Instanton mass regime is defined by two regions: the first requires the invariant mass of reconstructed tracks, $m_I$, to be between 20 GeV and 40 GeV, the second between 40 GeV and 80 GeV. The average $m_I$ values for Instantons in the two regions are 24 GeV and 46 GeV, respectively. A veto on 20 GeV jets at reconstruction level is applied for both mass ranges. This requirement keeps these regions orthogonal to the regions where hardQCD processes dominate.
The lower mass region is discussed first: Figure 8 shows the predicted distributions of the event sphericity, $S$, and the pseudorapidity, $\eta_{Trk}$, of reconstructed charged particles for the various processes considered. The distributions are scaled to the expected event yields for an integrated luminosity of $L = 1$ pb$^{-1}$. While softQCD processes dominate the background, the Instanton signal is enhanced at large values of the sphericity and predicts a more central $\eta_{Trk}$ distribution. In order to determine possible observables that allow for a distinction between signal and background processes, it is illustrative to compare the shapes of various observables. An overview of eight relevant observables, previously introduced, is shown in Figure 9. Instanton processes are expected to have larger track multiplicities and hence smaller values of $m_I / N_{Trk}$. As expected, the observables related to the event topology indicate more spherical events compared to the background processes. Highly interesting is the distribution of $N_{Displaced}$, i.e. the number of tracks with a displaced origin, as it differs significantly between soft and hard QCD processes and the signal process. This behaviour might be explained by the fact that more heavy quarks are expected in the final state of the Instanton decays, which typically hadronize to long(er)-lived mesons and hadrons. Based on these distributions, a few signal selection scenarios have been developed and are summarized in Table 3. No requirements are placed on the event sphericity or the pseudorapidity of tracks. The idea behind this approach is that these distributions could then be used in a combined fit of signal and background templates to data in order to extract a limit on the Instanton signal strength. The standard signal selection applies requirements on the $N_{Trk}$, $m_I / N_{Trk}$ and $N_{Jets}$ distributions, where the latter is required to be 0 to reject hardQCD processes.
The resulting sphericity and track η distributions for the signal and background processes are shown in Figure 10. An improvement by a factor of two in the signal over background ratio becomes visible after these selections are applied. In particular, the expected number of Instanton events becomes larger than the total SM background for events with sphericities $S > 0.85$, which is used to define the signal region selections in the following. The event-shape signal selection adds requirements on $B$ and $T$, hence also affecting the $S$ distribution. The observables after this selection are shown in Figure 11. The signal over background ratio improves further when a requirement of $N_{Displaced} > 6$ is also applied, which defines our tight signal selection (Table 3). The resulting sphericity distribution for the tight signal selection is shown in Figure 11. A very clean Instanton signal is expected for this tight selection.
Two possible definitions of control regions, called A and B, are summarized in Table  3. Both exhibit a signal contamination smaller than 10%. The N Displaced distribution for control region A, as well as the sphericity distribution for control region B, are shown in Figure 12.
We also study a higher range of Instanton masses, 40 < m I < 80 GeV, where we expect the cross section predictions for Instanton processes should be more reliable. In this regime, the softQCD background is still the dominant one in most regions of the phase space. Analogously to the previous case, three different signal selection scenarios and two control regions are defined, which is summarized in Table 4.
The sphericity distributions for the signal and background processes for an inclusive selection in this mass range, as well as for the three signal region definitions (standard, event-shape, tight), are shown in Figure 13, together with the expected signal and background events in the signal region in Table 4. The standard and event-shape selections yield signal to background ratios below 1 and are dominated by softQCD processes. The tight selection significantly enhances the signal and would allow for a clear observation. In this selection the dominant background contribution in the signal region comes from hardQCD processes. We note however that the negligible contribution from softQCD could just be a consequence of the limited statistics employed in this study, and it is possible that the actual background from softQCD processes is larger. The interplay between softQCD and hardQCD processes could be studied experimentally by applying various requirements on the number of reconstructed jets and thus defining different control regions.

Table 3 (excerpt). Expected events for $\int L\,dt = 1\,\mathrm{pb}^{-1}$ in the signal region ($S > 0.85$), in the column order of the table.
$N_{\mathrm{Signal}}$: $1.1 \cdot 10^{7}$, $8.9 \cdot 10^{6}$, $5.9 \cdot 10^{6}$, $<1$, $6.8 \cdot 10^{5}$
$N_{\mathrm{Background}}$: $6.2 \cdot 10^{6}$, $4.3 \cdot 10^{6}$, $1.8 \cdot 10^{5}$, $3 \cdot 10^{5}$, $3.3 \cdot 10^{6}$

Figure 13. Predicted distributions of the event sphericity for various processes, weighted by their predicted cross sections for an integrated luminosity of $L = 1\,\mathrm{pb}^{-1}$, for an inclusive selection (upper left), the nominal selection (upper right), the event-shape based selection (lower left) and the tight selection (lower right). The invariant mass of all reconstructed tracks is required to be between 40 GeV and 80 GeV. The distributions from all processes except the Instanton process are stacked. The model uncertainties are indicated as bands.

Medium Instanton Masses: The Hard QCD Regime
As one explores higher Instanton invariant masses, one enters the regime of perturbative QCD, and the background prediction becomes less uncertain. For this, we studied a medium mass range of $200 < m_I < 300$ GeV, with an average Instanton mass of 220 GeV. The $S_T^{\mathrm{tracks}}$ distribution and the number of reconstructed jets with $p_T > 20$ GeV in this mass range are shown in Figure 14 for the signal and background processes. The Instanton processes are expected to peak for $3 \le N_{\mathrm{jet}} \le 6$, and events with $S_T > 150$ GeV have only a negligible contribution from softQCD processes. A standard signal selection imposes requirements on the $N_{\mathrm{Trk}}$, $m_I/N_{\mathrm{Trk}}$ and $N_{\mathrm{Jets}}$ distributions. The event-shape selection applies in addition a minimum requirement on $B$ and $T$. Similarly to the low mass scenarios, a requirement on $N_{\mathrm{Displaced}}$ is made for the tight signal selection. All cuts for the signal selection as well as the definitions of the two control regions are summarized in Table 5. The event-shape selection yields 1 signal event and approximately 10 background events for an integrated luminosity of $\int L\,dt = 1\,\mathrm{pb}^{-1}$. For the tight selection 0.5 signal events and 0.6 background events are expected. An observation would therefore only be possible with an integrated luminosity of about $\int L\,dt \sim 10\,\mathrm{pb}^{-1}$.
In this context the ability of the LHC experiments to trigger on these event topologies becomes relevant. For the studies of softQCD processes at the LHC, special triggers are used, which record collision events even with limited activity in the detector, i.e. they are nearly free of any bias towards a certain physics signature. Given the enormous rates of such minimum bias triggers, only a small fraction of these events can actually be stored on tape. However, the published softQCD analyses at the LHC indicate that sufficient statistics have already been collected to allow for Instanton searches in the low mass regime. In the medium and high Instanton mass regimes the available statistics pose a larger challenge, as the required integrated luminosity increases significantly. Typically, multi-jet triggers require minimal transverse jet energies of 50 GeV or more and hence are of no use. It might therefore be important to develop new trigger strategies for the upcoming LHC runs to be able to record sufficient statistics, as was already pointed out in [25]. The predicted distributions of the event sphericity for the signal and background processes, scaled to the expected event yields for an integrated luminosity of $L = 1\,\mathrm{pb}^{-1}$, are shown in Figure 15 for the event-shape and the tight selection. The dominant background is multijet events from hardQCD processes. Figure 16 shows the predicted number of displaced tracks in control region A as well as the event sphericity in control region B, which again can be used to validate the modelling of the background processes.

Table 5. Overview of the standard and tight signal selection as well as the definition of two control regions aiming at medium Instanton masses ($200\,\mathrm{GeV} < m_I < 300\,\mathrm{GeV}$).

High Instanton Masses: The Top Quark Regime
The hardQCD multi-jet background is also dominant in the $300 < m_I < 500$ GeV mass range. This can be seen in Figure 14, which shows the event sphericity and the number of reconstructed particle jets with $p_T > 20$ GeV for signal and background events. However, it might be more interesting to apply a dedicated event selection that promotes another process to the dominant background, one which does not suffer from the same model uncertainties as the background in the medium mass regime. Hence we focus our selection here on top-quark pair events. While an obvious way to enhance the top-quark (as well as the Instanton) contribution is to require reconstructed jets that are tagged as stemming from b-hadrons, this was not required within this study. Our results on the expected signal over background ratio are therefore conservative.
The large track multiplicity in events within $300 < m_I < 500$ GeV no longer allows for a clean separation between signal and background processes. Hence no signal selection cut involving $N_{\mathrm{Trk}}$ is applied. The standard signal selection therefore requires only $m_I/N_{\mathrm{Trk}} < 3.0$ as well as more than seven reconstructed jets with $p_T > 20$ GeV. While the latter requirement is not optimal for the enhancement of the overall signal over background ratio, it enhances the top-quark background contribution. The event-shape and tight signal selections follow the same lines as for the lower mass ranges, i.e. they impose cuts on the event topology as well as on $N_{\mathrm{Displaced}}$. A summary of all signal selection criteria is given in Table 6. The corresponding event sphericity distributions for the event-shape and the tight selection are shown in Figure 18, where the top-quark background starts to dominate at large values of $S$. A further enhancement of the top-quark contribution can be achieved by an additional b-tagging requirement without significantly impacting the signal yield.
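The standard selection for this mass range can be written as a simple event filter. The sketch below only encodes the requirements quoted in the text (mass window, $m_I/N_{\mathrm{Trk}} < 3.0$, more than seven jets with $p_T > 20$ GeV); the function name, the argument layout and the assumption that all inputs are given in GeV are illustrative and not taken from the analysis code.

```python
def passes_standard_high_mass(m_inst, n_trk, jet_pts):
    """Standard signal selection for 300 < m_I < 500 GeV, as quoted in the text.

    m_inst:  invariant mass of all reconstructed tracks (assumed in GeV)
    n_trk:   number of reconstructed tracks
    jet_pts: iterable of reconstructed jet transverse momenta (assumed in GeV)
    """
    # Mass window of the high-mass regime
    if not (300.0 < m_inst < 500.0):
        return False
    # Guard against events without tracks, where m_I / N_Trk is undefined
    if n_trk <= 0:
        return False
    # Count jets above the 20 GeV threshold
    n_jets = sum(1 for pt in jet_pts if pt > 20.0)
    # m_I / N_Trk < 3.0 and more than seven jets
    return (m_inst / n_trk) < 3.0 and n_jets > 7


# Hypothetical toy events
event_pass = dict(m_inst=400.0, n_trk=200, jet_pts=[30.0] * 8)
event_few_jets = dict(m_inst=400.0, n_trk=200, jet_pts=[30.0] * 7)
event_low_multiplicity = dict(m_inst=400.0, n_trk=100, jet_pts=[30.0] * 8)
```

The event-shape and tight selections would add further requirements on top of this filter, with the thresholds listed in Table 6.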
The advantage of this signal selection relies on the experimentally well understood top-quark pair production. In addition to control regions similar to those in the low mass ranges, a top-quark specific selection can also be envisioned, e.g. by requiring one additional reconstructed lepton (electron or muon) in the event, which stems from the leptonic decay of one top-quark (see the definition of Control Region C in Table 6). This ensures strong experimental constraints on the background uncertainties in the signal region. However, the signal over background ratio is only of the order of 30% and only 2 signal events are expected for an integrated luminosity of $\int L\,dt = 1\,\mathrm{fb}^{-1}$, implying that an observation with a $5\sigma$ significance would require an integrated luminosity of more than $80\,\mathrm{fb}^{-1}$ based on purely statistical considerations. Such large luminosities might require either dedicated triggers for high multiplicity jet events during Run-3 of the LHC, or a long data-taking period with pre-scaled jet triggers during the high luminosity phase of the LHC.

Table 6. Overview of the standard and tight signal selection as well as the definition of two control regions aiming at high Instanton masses ($300\,\mathrm{GeV} < m_I < 500\,\mathrm{GeV}$).

Figure 18. Predicted distributions of the event sphericity for various processes, weighted by their predicted cross sections for an integrated luminosity of $L = 1\,\mathrm{pb}^{-1}$, for the event-shape based selection (left) and the tight selection (right). The invariant mass of all reconstructed tracks is required to be between 300 GeV and 500 GeV (high mass regime). The distributions from all processes except the Instanton process are stacked. The model uncertainties are indicated as bands.

How to mimic QCD Instanton Signatures
As shown in the previous sections, the most promising mass range for the observation of Instanton induced processes at the LHC is below 100 GeV, where the softQCD background contribution dominates. It is therefore crucial to understand whether the softQCD phenomenological models have enough freedom to mimic the QCD Instanton signatures. A first indication that this might not be easily achieved comes from the observation that the softQCD predictions from the Pythia, Sherpa and Herwig7 generators, which implement different models, are remarkably consistent for the observables considered in this study. However, as the softQCD models in event generators contain many parameters optimised using data, it remains possible that, with a suitable parameter choice, the softQCD predictions can be made more similar to the Instanton signal.
In this context it is interesting to note that an existing ATLAS measurement of charged particle event shapes [48] in Minimum Bias events showed slight discrepancies between the data and different generator predictions, indicating that data events were slightly more spherical than expected. As a proof-of-principle demonstration, we have tested whether the softQCD Pythia predictions can be made to yield significantly more spherical events, even beyond what the data indicate. Starting with the baseline Monash tune [49] of Pythia8, we found that increasing MultipartonInteractions:alphaSvalue to 0.150 in softQCD events does produce more spherical events, as seen in Figure 19.
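For orientation, the modified tune could be expressed as a list of Pythia8 settings strings, as sketched below. Only the Monash baseline and the `MultipartonInteractions:alphaSvalue = 0.150` value come from the text; the use of `SoftQCD:all`, the proton beam settings and the helper name `soft_qcd_settings` are assumptions made for illustration, and the centre-of-mass energy is left as a parameter.

```python
def soft_qcd_settings(ecm_gev, mpi_alpha_s=0.150):
    """Return Pythia8 settings strings for a softQCD sample with a modified MPI alpha_s.

    ecm_gev:     proton-proton centre-of-mass energy in GeV (chosen by the user)
    mpi_alpha_s: value of MultipartonInteractions:alphaSvalue (0.150 in the text)
    """
    return [
        "Beams:idA = 2212",               # proton beam (assumed)
        "Beams:idB = 2212",               # proton beam (assumed)
        f"Beams:eCM = {float(ecm_gev)}",
        "SoftQCD:all = on",               # assumed choice of softQCD processes
        "Tune:pp = 14",                   # Monash 2013 tune as baseline
        # The override must come after Tune:pp, which resets the tune parameters
        f"MultipartonInteractions:alphaSvalue = {mpi_alpha_s:.3f}",
    ]


settings = soft_qcd_settings(13000)
```

Each string can be passed to `Pythia::readString` or collected in a command file. The ordering matters: selecting the tune first and overriding the parameter afterwards ensures that the modified value is not reset.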
However, such tunes would also alter many other event shape distributions, such as the number of charged particles vs. $\eta$, in ways which are not supported by data, as seen for example in Figure 20. While the latter is based on $\sqrt{s} = 7$ TeV data, the same conclusions hold for $\sqrt{s} = 13$ TeV data as well. Even more importantly, a tune of multiple parton interactions would not impact certain distributions, such as the number of tracks with displaced vertices, $N_{\mathrm{Displaced}}$, which could be used to isolate Instanton events. As the cross section dependence on $m_I$ of QCD Instanton and softQCD processes is very different, it is also non-trivial to tune QCD Instanton sensitive distributions for different $m_I$ regions. This can be taken as a further motivation for studying Instanton production at low mass in different mass ranges, i.e. $20 < m_I < 40$ GeV and $40 < m_I < 80$ GeV.

LHC Projections
Having defined suitable signal regions, we evaluate here the expected 95% confidence level (CL) upper limits on the Instanton cross section. For each range of Instanton mass, the respective tight signal region selection is applied. The systematic uncertainty on the background is estimated as described in the previous sections, and ranges from about 20% at low invariant masses to about 50% at high invariant masses. The limits are then computed as a counting experiment with the pyhf package [51]. The results are shown in Figure 21 for assumed integrated luminosities of $1\,\mathrm{pb}^{-1}$, $100\,\mathrm{pb}^{-1}$ and $10\,\mathrm{fb}^{-1}$. Even with only $1\,\mathrm{pb}^{-1}$, the predicted Instanton cross sections can be excluded for masses up to about 150 GeV, with cross sections ten times smaller than those predicted by [24] excluded at low masses. Increasing the collected luminosity to $100\,\mathrm{pb}^{-1}$ would extend the limit at large invariant masses to 300 GeV, with a negligible improvement of the limit at low masses. A further increase of the integrated luminosity by an additional factor of 100 would push the limit to the 400 GeV mass point.

Figure 21. Expected exclusion limits at 95% CL on the cross section for Instanton induced processes for three different assumed integrated luminosities. Conservative systematic uncertainties on the background modelling have been assumed.

Figure 22. Primary charged-particle multiplicities as a function of $\eta$ for events with at least two primary charged particles with $p_T > 100$ MeV and $|\eta| < 2.5$, each with a lifetime $\tau > 300$ ps. The black dots represent the measurement of ATLAS [50], while the softQCD prediction is produced with Pythia8. Also shown is a prediction which includes a contribution of 9% from Instanton induced processes. The model uncertainties are indicated by the bands.
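The logic of a counting-experiment limit can be illustrated with a deliberately simplified stand-in. The sketch below is not the pyhf calculation used for Figure 21: it ignores the background systematic uncertainty and uses a plain single-bin Poisson $CL_s$ ratio, solved by bisection. All function names are hypothetical, and the implementation is only intended for small event counts.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(s, b, n_obs):
    """CLs = P(N <= n_obs | s + b) / P(N <= n_obs | b) for one counting bin."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def signal_upper_limit(b, n_obs, cl=0.95, s_max=100.0, tol=1e-6):
    """Signal yield s at which CLs falls to 1 - cl, found by bisection."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(mid, b, n_obs) > 1.0 - cl:
            lo = mid   # not yet excluded: the limit lies at larger s
        else:
            hi = mid   # excluded: the limit lies at smaller s
    return 0.5 * (lo + hi)


def cross_section_limit(s_up, lumi, efficiency=1.0):
    """Convert a limit on the signal yield into a cross-section limit."""
    return s_up / (lumi * efficiency)


# Background of 0.6 events (tight medium-mass selection at 1 pb^-1, from the text);
# the observed count of zero is a hypothetical choice for illustration.
s_up = signal_upper_limit(b=0.6, n_obs=0)
```

For zero observed events the $CL_s$ ratio reduces to $e^{-s}$, so the limit on the signal yield is $-\ln 0.05 \approx 3$ events regardless of the background. A full treatment with systematic uncertainties, as done with pyhf in the text, gives weaker limits.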

First limit on Instanton Processes from proton-proton collision data
As already indicated in Figure 8, the distribution of charged particles vs. their pseudorapidity is a sensitive observable for QCD Instanton processes in the low mass regime, and it is in fact a standard distribution measured in the context of softQCD studies [50,[52][53][54]. This distribution has therefore been measured previously by the ATLAS Collaboration [50] for events with at least two primary charged particles with $p_T > 100$ MeV and $|\eta| < 2.5$, each with a lifetime $\tau > 300$ ps. We studied to what extent these measurements can be used to constrain QCD Instanton production. The Rivet routine for the ATLAS analysis was applied to the simulated softQCD sample based on Pythia8 as well as to the Instanton signal sample with $s_{\min} > 25$ GeV. A modelling uncertainty on the softQCD prediction was estimated by considering the envelope of the Pythia, Herwig7 and Sherpa samples. Since no second generator for Instanton processes in this low mass regime is currently available, the differences in the $\eta_{\mathrm{Trk}}$ distribution at $m_I \approx 500$ GeV between the Sherpa and Herwig7 predictions have been taken as an approximation. The resulting signal and background uncertainties have been treated once as fully bin-to-bin correlated and once as fully bin-to-bin uncorrelated.
The Instanton distribution has been added with a scaling factor $\alpha$ to the predicted softQCD distribution, normalized accordingly and then fitted via a $\chi^2$ minimization approach to the data distribution, shown in Figure 22. The fit yields maximal values of $\alpha = 0.09$ and $\alpha = 0.03$ at 95% CL, assuming bin-to-bin correlated and bin-to-bin uncorrelated uncertainties on the predictions, respectively. The expected shape for an Instanton contribution of 9% to the standard softQCD processes is also indicated in Figure 22. The fiducial cross section defined by the selection $p_T > 100$ MeV and $|\eta| < 2.5$, each with a lifetime $\tau > 300$ ps, can be estimated to be $\sigma = 71$ mb, when taking the integrated luminosity of $\int L\,dt = 151\,\mu\mathrm{b}^{-1}$, the number of selected events $N = 9.3 \cdot 10^{6}$ and assuming a detector efficiency of $\epsilon = 0.87$. Hence an upper limit on Instanton induced processes with $s_{\min} > 25$ GeV can be placed between 2.1 and 6.4 mb, depending on the correlation scenario assumed. In principle, further measured distributions could be used to derive more stringent limits. However, we think a dedicated analysis effort by the LHC collaborations would be the right next step to shed light on QCD Instanton processes.
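The arithmetic behind these numbers can be checked in a few lines. The sketch below assumes that the fiducial cross section follows from $\sigma = N / (\epsilon \int L\,dt)$ and that the quoted limits correspond to $\alpha \cdot \sigma$; both relations are inferred from the numbers in the text rather than stated explicitly, and the function name is illustrative.

```python
def fiducial_cross_section_mb(n_events, lumi_inv_ub, efficiency):
    """Fiducial cross section in mb from sigma = N / (efficiency * integrated luminosity).

    n_events:    number of selected events
    lumi_inv_ub: integrated luminosity in inverse microbarn
    efficiency:  detector efficiency
    """
    sigma_ub = n_events / (efficiency * lumi_inv_ub)  # result in microbarn
    return sigma_ub / 1000.0                          # convert microbarn to mb


# Inputs quoted in the text
sigma_fid = fiducial_cross_section_mb(n_events=9.3e6, lumi_inv_ub=151.0, efficiency=0.87)

# Maximal Instanton fractions at 95% CL for the two correlation scenarios
alpha_correlated = 0.09
alpha_uncorrelated = 0.03

limit_correlated_mb = alpha_correlated * sigma_fid
limit_uncorrelated_mb = alpha_uncorrelated * sigma_fid

print(f"sigma_fid = {sigma_fid:.1f} mb")
print(f"limits: {limit_uncorrelated_mb:.1f} mb to {limit_correlated_mb:.1f} mb")
```

Under these assumptions the inputs reproduce the fiducial cross section of about 71 mb and the limit range of 2.1 to 6.4 mb quoted above.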

Conclusion
In this paper we presented detailed studies of possible analysis strategies to observe Instanton induced processes in proton-proton collisions at the Large Hadron Collider. Several observables have been identified which allow an effective separation of signal and background events. In order to study Instanton processes at higher energies, special triggers might have to be implemented for the upcoming LHC runs. However, the situation is different at low energies, where the cross sections are significantly larger and the relevant events should in principle already have been recorded. We therefore conclude that the most promising phase-space region for an early observation is at low energies, for Instanton masses below 100 GeV, where the cross section is very high. Since the dominant background in this energy regime is from softQCD processes, several methods to constrain and validate softQCD models in dedicated control regions have been discussed. We find that with just $1\,\mathrm{pb}^{-1}$ of integrated luminosity, the LHC can already probe this low mass Instanton regime. With $10\,\mathrm{fb}^{-1}$ it would be possible to probe Instanton masses of up to 0.5 TeV. In addition, available measurements of Minimum Bias data have been used to derive a first upper limit on the cross section of Instanton processes, yielding an upper bound of 6.4 mb for Instanton masses above 25 GeV.
The methods described in this paper will hopefully boost dedicated search efforts at the LHC over the full Instanton mass range by several experiments, leading to a robust result based on different strategies.