1 Introduction

The dynamic aperture (DA) is the amplitude of the phase space region where stable, i.e. bounded, motion occurs. The DA is one of the key quantities for the design of modern colliders based on superconducting magnets, which feature unavoidable non-linear field errors; examples include the Tevatron [1,2,3], HERA [4,5,6,7], RHIC [8], the Superconducting Super Collider (SSC) [9, 10], and the LHC (see, e.g. Ref. [11] for a detailed overview).

The study of transport in the phase space of non-integrable Hamiltonian systems is a very difficult problem due to the coexistence of weakly chaotic regions and invariant Kolmogorov–Arnold–Moser (KAM) tori [12], which implies a sensitive dependence of the orbit evolution on the initial conditions. The relevance to applications of Arnold diffusion [13], a generic phenomenon in Hamiltonian systems with two or more degrees of freedom, is still debated.

Macroscopic physical systems cannot realise the symplectic character of the dynamics at arbitrary spatial and time scales. Nonetheless, some results of Hamiltonian perturbation theory turn out to be robust with respect to the details of the considered system and they can provide effective laws for the study of stability and diffusion problems of the orbits. The Nekhoroshev theorem [14] is an excellent example of such a result, the corresponding estimate of the orbit stability time being applied in several fields ranging from celestial mechanics to accelerator physics, where, in recent years, a connection between the Nekhoroshev theorem and the time variation of the dynamic aperture has been established [15].

In a mathematical sense, the stability property requires an arbitrarily large time scale. In a physical context, however, particle stability can be linked to a maximum number of turns \(N_{\mathrm{max}}\) that is determined on the basis of the specific application. Let \((x, y)\) be the transverse spatial coordinates describing the betatronic motion in a collider. If an ensemble of initial conditions defined on a polar grid (\(x = r \, \cos \theta\), \(y = r \, \sin \theta\), \(0 \le \theta \le \pi /2\), where x, y are expressed in units \(\sigma _x, \sigma _y\) of the beam dimension) is tracked for up to \(N_{\mathrm{max}}\) turns, then a measure of the DA can be defined as [15]:

$$\begin{aligned} D(N) = \frac{2}{\pi } \int _0^{\pi /2} r(\theta ;N) \, \mathrm{{d}} \theta \equiv \langle r(\theta ; N) \rangle \end{aligned}$$
(1)

where \(r(\theta ; N)\) stands for the last stable amplitude in the direction \(\theta \) for up to N turns. Note that if the stable phase space region consists of disconnected parts, only the area surrounding the origin is retained in these computations. In this way, the DA can be considered a function of N, with an asymptotic value, when it exists, representing the DA for an arbitrarily large time.
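
For illustration, a minimal Python sketch of how Eq. (1) can be evaluated from tracking output is given below; the array contents are purely hypothetical and the quadrature is a simple rectangle rule over a uniform angular grid.

```python
import numpy as np

def dynamic_aperture(r_last, theta):
    """Evaluate Eq. (1): the angular average of the last stable amplitude.

    r_last : last stable amplitudes r(theta_k; N), in units of beam sigma
    theta  : corresponding angles theta_k, uniformly spaced in [0, pi/2]
    """
    # for a uniform angular grid, (2/pi) * integral r dtheta reduces to the
    # mean of r over the grid (rectangle-rule quadrature)
    return np.mean(r_last)

# toy usage with purely illustrative numbers
theta = np.linspace(0.0, np.pi / 2, 59)
r_last = 12.0 - 1.5 * np.sin(2.0 * theta)    # hypothetical r(theta; N), in sigma
print(f"D(N) = {dynamic_aperture(r_last, theta):.2f} sigma")
```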

An accurate numerical computation of the DA, as well as a good estimate of the error associated with the numerical protocol used, is of paramount importance to ensure the reliability of the DA as a figure of merit for assessing synchrotron performance. A general discussion of the DA definition, its computation, and accuracy can be found, e.g. in Ref. [15].

DA computation requires the determination of the evolution of a large number of initial conditions, distributed to provide good coverage of the phase space under study, to probe whether their motion remains bounded over the selected time interval. While the computational burden of a large set of initial conditions can be easily mitigated by means of parallelism [16], it is not possible to mitigate the heavy CPU power needed for long-term simulations. Hence, studies have explored the possibility of describing the DA dependence on the number of turns using simple models [17, 18]. The underlying idea is that the long-term behaviour of the DA can be extrapolated using knowledge from numerical simulations performed over a smaller number of turns. Additionally, a more efficient estimate of the long-term behaviour of the DA would expedite the analysis of several configurations of the circular accelerator, which is sometimes mandatory to gain insight into the deeper nature of the beam dynamics.

The Nekhoroshev theorem [14] suggests an answer to the quest for modelling the time evolution of the DA. In fact, according to the results of Refs. [17, 18], the following scaling law holds:

$$\begin{aligned} D(N) = D_{\infty } + \frac{b}{ \left( \log N \right) ^{\kappa }} \, , \end{aligned}$$
(2)

where \(D_{\infty }\) represents the asymptotic value of the amplitude of the stability domain, b and \(\kappa \) being additional parameters.

The model (2) gives the following rough description of the transverse phase space, in which three macroscopic regions can be distinguished: an inner central core around the origin, \(r < D_{\infty }\), where the measure of the KAM [12] invariant tori is large, thus producing a stable behaviour apart from a set of very small measure where Arnold diffusion can take place; a surrounding region, with \(r > D_{\infty }\), where weak chaos is present and the escape rate is reproduced by a Nekhoroshev-like estimate [14, 19, 20]; and an outer region where most orbits escape quickly towards infinity. In the region \(r > D_{\infty }\), the model (2) provides an estimate of the stability time as a function of the amplitude r of the form:

$$\begin{aligned} N(r) \, = \, N_0 \, \exp \left[ {\left( \frac{r_*}{r} \right) ^{1/\kappa }} \right] \end{aligned}$$
(3)

where N(r) is the estimated number of turns over which particles with initial amplitude smaller than r remain stable, and \(r_*\) is a positive parameter.
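
As an illustration of how the scaling law (2) can be confronted with tracking data, the following Python sketch fits \(D_{\infty }\), b and \(\kappa \) to a set of D(N) values; the data points, noise level, and starting values are purely synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit

def da_model(N, D_inf, b, kappa):
    """Scaling law (2): D(N) = D_inf + b / (log N)**kappa."""
    return D_inf + b / np.log(N) ** kappa

# hypothetical D(N) values from tracking (amplitudes in beam sigma)
rng = np.random.default_rng(1)
N_turns = np.logspace(3, 6, 20)
D_data = da_model(N_turns, 10.0, 18.0, 1.2) + rng.normal(0.0, 0.05, N_turns.size)

popt, pcov = curve_fit(da_model, N_turns, D_data, p0=[8.0, 10.0, 1.0])
D_inf, b, kappa = popt
print(f"D_inf = {D_inf:.2f} sigma, b = {b:.2f}, kappa = {kappa:.2f}")
```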

Based on the scaling law (2), a model for the evolution of beam intensity in a hadron synchrotron has been proposed [22], which is the basis of the novel experimental method used to probe the DA.

In this paper, we use a diffusive approach to reproduce the experimental results from the recent DA experiment at top energy in the LHC. The beam dynamics in the weakly chaotic region is governed by a stochastically perturbed Hamiltonian system, which in turn is described by means of a Fokker–Planck (FP) equation [21], whose solution represents the average evolution of the beam distribution, including also absorbing boundary conditions. This approach allows the diffusion process for the particle distribution to be simulated, providing a natural description of the beam dynamics in the presence of a collimation system, which is a typical situation in colliders based on superconducting magnets.

The novelty of the proposed approach consists of using the remainder estimate of the perturbative series from the Nekhoroshev theorem as the functional form for the diffusion coefficient of the FP equation. Indeed, the Nekhoroshev approach to the optimal estimate of the remainder of perturbative series [20] represents the link between the analysis of the beam dynamics based on the scaling law of the DA and that based on a diffusion equation: for the former, the theorem provides the form of the scaling law of D(N), while for the latter it provides the functional form of the diffusion coefficient.

The plan of the paper is the following: in Sect. 2 the main aspects of the theory of diffusion processes in stochastically perturbed Hamiltonian systems are reviewed, while in Sect. 3 the experimental technique is described. The main results of our analysis are presented and discussed in Sect. 4, where a detailed comparison between the theoretical approach based on the diffusion equation and the experimental measurements is presented. In Sect. 5 the phase space of the system under consideration is studied by means of symplectic-tracking simulations to provide a confirmation of the assumptions used for the diffusive approach. Some conclusions are drawn in Sect. 6, whereas the mathematical details of the proposed approach are presented in Appendices A–C.

2 Theoretical background

The results of perturbation theory of Hamiltonian systems imply that when the set of invariant KAM tori in phase space has a large measure, the orbits’ diffusion is possible only for a set of initial conditions of extremely small measure [23]. Therefore, the existence of macroscopic diffusion phenomena in phase space has to be related to the presence of weak chaotic regions of large measure in which the large majority of KAM tori are broken [24]. Note that in realistic models of betatron motion, slow modulation of the strength of lattice elements, transverse tune ripple induced by synchrotron motion, or weak stochastic effects, such as noise in active devices, may lead to the appearance of such regions.

Nekhoroshev's theorem provides optimal estimates for the remainders of the asymptotic perturbative series for Hamiltonian flows; however, also in the case of a symplectic map in the neighbourhood of an elliptic fixed point, it is possible to provide an optimal estimate for the Birkhoff normal form series [19, 20].

Let I be the unperturbed action; then there exists an optimal perturbation order of the Birkhoff expansion at which the remainder is estimated according to:

$$\begin{aligned} \Vert R \Vert = A \exp \left[ -\left( \frac{I_*}{I}\right) ^{1/(2\kappa )} \right] \end{aligned}$$
(4)

where \(I_*\) represents an apparent radius of convergence of the perturbative series and the exponent \(\kappa \) depends on the number of degrees of freedom of the system under consideration. We conjecture that, under generic conditions, the functional form of the Nekhoroshev estimate (4) can be applied to measure the strength of the chaotic component of the dynamics in the weakly chaotic region. Assuming a diffusive approach for the evolution of the action distribution (see Appendices for the mathematical details) in the one-dimensional case, the following Fokker–Planck equation holds:

$$\begin{aligned} \frac{\partial \rho }{\partial t}= \frac{\varepsilon ^2}{2}\frac{\partial }{\partial I}{\mathcal {D}}(I) \frac{\partial }{\partial I}\rho (I,t), \end{aligned}$$
(5)

where \(\varepsilon \) is a scaling factor related to the perturbation amplitude. Nekhoroshev's estimate suggests that the following functional form for the action-diffusion coefficient

$$\begin{aligned} {\mathcal {D}}(I)=c \, \exp \left[ -2 \left( \frac{I_*}{I}\right) ^{1/(2\kappa )}\right] \end{aligned}$$
(6)

is suitable to simulate the action diffusion under the previous assumptions. Note that the exponent in Eq. (6) is twice that of the remainder estimate (4), consistent with a diffusion coefficient that scales as the square of the perturbing term. The constant c is computed by normalising the diffusion coefficient according to:

$$\begin{aligned} c \, \int _0^{I_{\mathrm{abs}}} \exp \left[ -2 \left( \frac{I_*}{I}\right) ^{1/(2\kappa )}\right] \mathrm{{d}}I=1, \end{aligned}$$
(7)

where \(I_{\mathrm{abs}}\) represents the position of the absorbing boundary condition. The physical meaning of the parameters \((\varepsilon , \kappa , I_*)\) that characterise the diffusion model (5) and (6) is readily derived from Nekhoroshev's theorem: (1) \(\varepsilon \) is a dimensionless quantity that measures the strength of the non-linear effects acting on the beam; (2) the exponent \(\kappa \) emerges from the analytic structure of the perturbative series and it mainly depends on the phase space dimensionality and on the nature of the non-linear terms that occur in the perturbative series, independently of their strength; (3) \(I_*\) reveals the asymptotic character of the perturbative series and it is related to the strength of the non-linear terms. Usually, the region in phase space corresponding to \(I\simeq I_*\) is beyond the short-term dynamic aperture, where our approximation is no longer valid. It is worth mentioning that the parameters \(\varepsilon \) and \(I_*\) are in principle correlated, since a scaling in the action changes the strength of the perturbation. However, the position of the absorbing barrier is invariant with respect to the global time scaling \(\varepsilon \), whereas it depends on the action scaling.
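
To make the use of Eqs. (5)–(7) concrete, the following Python sketch integrates the FP equation with the Nekhoroshev-like diffusion coefficient using an explicit, conservative finite-difference scheme, with a zero-flux condition at \(I=0\) and an absorbing boundary at \(I=I_{\mathrm{abs}}\). The values of \(\kappa \) and \(I_*\) are those quoted in the caption of Fig. 1, whereas \(\varepsilon \), \(I_{\mathrm{abs}}\), the grid, and the number of steps are illustrative choices and not those used in the paper.

```python
import numpy as np

# kappa and I_star as in Fig. 1; epsilon and the absorbing-boundary position
# I_abs are hypothetical values chosen for illustration only
kappa, I_star = 0.33, 21.5
epsilon, I_abs = 0.5, 6.0

n_cells = 400
I_edges = np.linspace(0.0, I_abs, n_cells + 1)      # cell edges
I = 0.5 * (I_edges[:-1] + I_edges[1:])              # cell centres
dI = I[1] - I[0]

def D_raw(I):
    """Nekhoroshev-like diffusion coefficient, Eq. (6), before normalisation."""
    return np.exp(-2.0 * (I_star / I) ** (1.0 / (2.0 * kappa)))

# normalisation constant c from Eq. (7), midpoint quadrature
c = 1.0 / np.sum(D_raw(I) * dI)
D_edge = c * D_raw(np.maximum(I_edges, 1e-6))       # avoid division by zero at I = 0

# exponential initial distribution, Eq. (8), with sigma = 1
rho = np.exp(-I)
rho /= np.sum(rho * dI)

# explicit, conservative finite-difference integration of Eq. (5):
# zero flux at I = 0, absorbing boundary (rho = 0) at I = I_abs;
# time units are arbitrary here
dt = 0.4 * dI**2 / (0.5 * epsilon**2 * D_edge.max())
n_steps = 20_000
for _ in range(n_steps):
    flux = np.zeros(n_cells + 1)
    flux[1:-1] = D_edge[1:-1] * (rho[1:] - rho[:-1]) / dI
    flux[-1] = D_edge[-1] * (0.0 - rho[-1]) / (0.5 * dI)
    rho += 0.5 * epsilon**2 * dt * (flux[1:] - flux[:-1]) / dI

surviving = np.sum(rho * dI)
print(f"fraction of beam lost after {n_steps} steps: {1.0 - surviving:.3e}")
```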

In Fig. 1, we plot the behaviour of the diffusion coefficient (6) (upper) using parameter values relevant for the comparison with experimental data (see Sect. 4) and we show an example of the numerical solution of the FP equation (5) (lower) with exponential initial distribution and an absorbing boundary condition.

Fig. 1

Upper: plot of the Nekhoroshev diffusion coefficient (6) (red curve) as a function of I for \(\kappa =0.33\) and \(I_*=21.5\). To highlight the features of the function (6), the exponential \(\exp (-I)\) (blue curve) is also shown. Lower: evolution of the 1D transverse beam distribution using the FP equation (5) with the Nekhoroshev diffusion coefficient of the upper plot. The parameter values used in the simulations correspond to the experimental setup for Beam 2 with V blow-up in the 'with correctors' configuration (see Table 1)

One clearly sees the effects of the functional form of the Nekhoroshev diffusion coefficient: after a rather fast initial diffusion, the evolution of the beam distribution slows down. Moreover, the changes to the initial shape of the distribution are limited to its tails. This is the reason for the existence of a stable region of finite extent in phase space for finite time, which would give rise to a finite dynamic aperture.

3 LHC dynamic aperture experiment at top energy

With the advent of the LHC and the approval of its high-luminosity upgrade [25], the topic of measuring the DA by means of beam experiments has regained interest, after a pause between the design phase of the LHC (see, e.g. Refs. [27,28,29,30] for a review of the comparison between measurements and simulations) and its commissioning and subsequent operation periods.

DA measurements at the LHC (see Fig. 2, upper, for a layout of the LHC ring) have already been carried out at injection energy [26, 31, 32] using different approaches, i.e. the standard kick method [26] or the new approach of Refs. [31, 32]. In the latter, the technique consists of blowing up both the horizontal and vertical emittances until beam losses can be detected. The beam intensity as a function of time is then recorded, fitted, and compared with the results of numerical simulations, usually showing very good agreement [33]. During these measurements, the strength of non-linear elements located in the regular cell of the accelerator (see Fig. 2, middle, for a layout of the LHC cell, including also the non-linear correctors used) is varied to provide several machine configurations to be studied by means of numerical simulations. The encouraging results obtained at injection energy suggested pursuing the DA measurement at flat-top energy.

Fig. 2

Upper: layout of the LHC (from Ref. [11]). The ring eightfold symmetry is visible, together with the arcs and the long straight sections. Middle: layout of the LHC regular cell (from Ref. [11]). Six dipoles and two quadrupoles with the dipole, quadrupole, sextupole, and octupole magnets (for closed orbit, tune, chromaticity correction and beam stabilisation, respectively) are shown. The spool pieces used to compensate the systematic \(b_3\) component (MCS), \(b_4\) and \(b_5\) components (MCDO nested magnets) are also shown. Bottom: sketch of the layout of the inner triplets and the non-linear correctors used in the experimental tests reported in this paper. The field imperfections of LHC magnets are represented as \(B_y + i \, B_x = B_{\mathrm{ref}} \sum _{n=1}^{M} \left( b_n + i \, a_n\right) \left( \frac{x + i \, y}{R_{\mathrm{r}}} \right) ^{n-1}\) where \(R_{\mathrm{r}}=17\) mm

The goals of the DA measurements performed at 6.5 TeV in the LHC were manifold. The use of squeezed optics allows probing the impact on beam dynamics of the non-linear field errors stemming from the quadrupoles in the high-luminosity insertions. Thus, one could examine and quantify the influence on beam loss and lifetime of changes in the strength of the normal dodecapole correctors (see Fig. 2, bottom, for a sketch of the high-luminosity insertions, whose magnets were used during the experiment) in the ATLAS and CMS interaction regions (IR) 1 and 5, respectively. This aspect is particularly relevant in view of the future High Luminosity LHC project [25], for which the operational strategy to set the non-linear correctors in the high-luminosity IRs is still to be studied. Moreover, in previous studies, the beam emittance was heated to large values in both horizontal and vertical planes. While this can be considered a benefit of the method insofar as it gives a measure of changes to the average DA over all angles in the xy plane, it may also be regarded as a limitation, since with such an approach it is not possible to distinguish between changes of the horizontal and vertical dynamic aperture. To address this aspect, it was decided to measure the dynamic aperture simultaneously for three bunches in a single beam. One bunch was heated horizontally (H blow-up, in the following), one vertically (V blow-up, in the following), and one in both planes (HV blow-up, in the following). Note that a witness bunch of small transverse emittance provides a reference case. The key objective of these measurements was related to the time scale achieved. Typical DA simulations are performed over \(10^{5}\)–\(10^{6}\) turns (\(\sim 8\)–88 s of LHC operation) and previous measurements have been performed on the 5–10 min time scale [31, 32]. Operational time scales at top energy in the LHC, by contrast, are of the order of \(\sim 12\) h. To justify the extrapolation of simulation results, which can be viably obtained numerically only over much shorter intervals, to orders of magnitude longer times, it is also necessary to establish whether the analytical scaling laws hold over these time scales. Thus, the final objective of this novel measurement campaign was to perform dedicated dynamic aperture measurements on the time scale of an hour, significantly longer than any previous measurement in the LHC.

The experiment was performed using both the clockwise beam (Beam 1) and its counter-clockwise partner (Beam 2). The former consisted of a single bunch blown up in both the horizontal and vertical planes, while the latter comprised four bunches with different emittance blow-up, as mentioned earlier. The transverse damper was used to provide a dipolar excitation, which blows up the transverse emittance by means of band-limited white noise injected into the transverse damper feedback loop [34].

The value of \(\beta ^{*}\) in the IR1 and IR5 experimental insertions was 0.4 m. The primary collimators were set at \(\sim 9\,{\sigma _{\mathrm{nom}}}\), while the tertiary collimators were positioned at \(\ge 15\,{\sigma _{\mathrm{nom}}}\), i.e. significantly in excess of the aperture defined by the horizontal and vertical primaries. The value of \({\sigma _{\mathrm{nom}}}\) is computed assuming the nominal value of the rms normalised emittance, namely \(\epsilon ^{*}_{\mathrm{nom}}=3.75\,{\upmu \mathrm{{m}}}\).
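
For orientation, \(\sigma _{\mathrm{nom}}\) follows from the nominal normalised emittance and the local optics via \(\sigma = \sqrt{\beta \, \epsilon ^{*}_{\mathrm{nom}}/(\beta _{\mathrm{rel}}\gamma _{\mathrm{rel}})}\); the short sketch below evaluates it at 6.5 TeV for an illustrative \(\beta \) value, which is not necessarily the value at the LHC primary collimators.

```python
import math

E = 6.5e12              # beam energy [eV]
m_p = 938.272e6         # proton rest energy [eV]
gamma_rel = E / m_p     # relativistic gamma, ~6.9e3 at 6.5 TeV
eps_norm = 3.75e-6      # nominal normalised rms emittance [m rad]
beta_loc = 150.0        # beta function at the collimator [m]; illustrative value only

eps_geom = eps_norm / gamma_rel                 # geometric emittance (beta_rel ~ 1)
sigma_nom = math.sqrt(beta_loc * eps_geom)      # nominal rms beam size [m]
print(f"sigma_nom = {1e6 * sigma_nom:.0f} um, 9 sigma_nom = {1e3 * 9 * sigma_nom:.2f} mm")
```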

After removing large orbit bumps in the experimental insertions, the fractional tunes were re-corrected to (0.31, 0.32) and the chromaticity was set to \(Q'_{x,y}=3.0\) units for both planes and beams. Linear coupling was trimmed down to a value of \(|C^{-}|\approx 0.001\), which is at the limit of the measurement resolution. Having established the baseline conditions for the study, DA measurements were first performed by aggressively blowing up the Beam 2 bunches using the transverse damper, up to very large emittances of \(\sim \,25\,{\upmu \mathrm{{m}}}\). Large dodecapole sources were introduced by powering the IR-\(b_{6}\) correctors left and right of the interaction points (IP) 1 and 5 uniformly to their maximum current. Then, the single bunch in Beam 1 was also blown up in the horizontal and vertical planes, thus allowing DA measurements for both beams. Approximately 1 h of intensity data were recorded in this configuration. Finally, the IR non-linear corrections for normal and skew sextupole and normal and skew octupole errors, which had been commissioned at the start of 2017, were collectively removed, and approximately 30 min of intensity data were recorded for this final configuration. Additional details regarding the experimental session and the LHC setup can be found in Ref. [35]. Note that the accuracy of the beam intensity measurement is at the level of \(10^{-3}\).

The reported experimental procedure was carefully prepared to avoid, as much as possible, disruptive effects on the results. The intensity of each individual bunch was of the order of 7–\(8\times 10^9\) protons (note that the nominal bunch intensity for the LHC is \(1.15\times 10^{11}\)), to prevent any collective effect from impacting the measurements. The large transverse emittance prevented any brightness-related effects. Synchrotron radiation damping times are of the order of 12 h and 24 h for the longitudinal and transverse emittance, respectively, which implies that no impact is to be expected on the time scale of our measurements. Finally, the beam lifetime from residual-gas scattering is estimated to be 100 h, hence its effect is completely negligible.

The summary plots from the experimental session are shown in Fig. 3, where the evolution of the relative strength of the non-linear correctors and the bunch intensity are visible. The two machine configurations are characterised by different levels of beam losses, depending on the transverse emittances. A careful inspection of the summary plots for Beam 2 leads to the conclusion that the beam losses occur preferentially in the vertical plane. It is also worth stressing that the two beams are coupled by the single-aperture magnets in the experimental IRs; this is, e.g., the case for the non-linear correctors used in this experiment. The remaining parts of the two rings are instead different, which implies that a different behaviour of Beam 1 and Beam 2 under similar conditions of emittance blow-up should not come as a surprise.

Figure 4 shows the measured transverse profiles of the two beams after the emittance blow-up at the beginning and at the end of the loss measurement reported in Fig. 3c (note that the following considerations hold true for all experimental configurations presented here). The profiles have been obtained by means of the synchrotron light monitor, and the slight left–right asymmetry of the horizontal profile of Beam 2 is an artefact of the instrument and should be neglected [36]. The values of the \(\sigma \) of the two distributions are the same at the percent level and, in general, the two profiles match each other very well. The measurements are dominated by noise whenever the transverse amplitude exceeds \(\approx 2.5~\sigma \), while below this value the transverse profiles prove to be Gaussian. The initial Gaussian distribution and the final one, as obtained from numerical simulations (see Sect. 4), are also shown. It is clearly seen that the diffusion mechanism acts on the initial distribution by changing only the tails beyond \(\approx 2.5~\sigma \). The typical losses observed are at the level of 1–2% of the bunch intensity, which agrees with the tail content of a Gaussian beyond \(\approx 2.5~\sigma \). Therefore, these observations suggest that a Gaussian initial distribution is an appropriate choice, although the synchrotron radiation monitor does not provide any direct quantitative measurement of the actual tails of the beam distribution.
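
As a back-of-the-envelope check of the quoted loss level, the tail content of a 1D Gaussian profile beyond \(2.5~\sigma \) can be computed directly; this is an illustrative calculation and not part of the analysis presented here.

```python
from math import erfc, sqrt

# fraction of a 1D Gaussian beam profile lying beyond 2.5 sigma (both tails)
tail = erfc(2.5 / sqrt(2.0))
print(f"tail content beyond 2.5 sigma: {100.0 * tail:.2f} %")   # about 1.24 %
```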

Note that in the rest of the paper the configuration in which all IR correctors are powered will be indicated as ‘with correctors’, while that with the dodecapolar corrector only as ‘no correctors’.

Fig. 3

Summary plots of the DA measurements performed at 6.5 TeV. The time variation (note the different origins of the time scale for Beam 1 and 2) of the relative strength of the non-linear circuits used for the tests is shown, together with the bunch intensity evolution. The strength of the \(b_4\) corrector is given as an example of the time variation for the \(a_3, b_3, a_4\) correctors. The blue and red regions of the intensity curves are used in the numerical simulations, while the grey regions are discarded as they correspond to the transient state during the strength variation of the non-linear circuits. The results are: (a) HV blow-up, Beam 1; (b) HV blow-up, Beam 2; (c) V blow-up, Beam 2; (d) H blow-up, Beam 2

Fig. 4

Example of transverse beam profiles after blow up at the beginning and at the end of the loss measurements shown in Fig. 3c. The initial and final Gaussians agree very well. The initial and final beam distributions from the numerical simulations for the corresponding case are also shown

4 Modelling the experimental results with a diffusion equation

The experimental results presented in Sect. 3 have been analysed by means of the 1D FP equation (5) with a Nekhoroshev-like form of the diffusion coefficient as in Eq. (6) (parenthetically, the data from the measurement campaign performed at injection energy in the LHC have been re-analysed using the diffusive approach and the discussion of the results can be found in Ref. [37]). The use of the model requires an efficient method to determine its three parameters. The first step is to constrain the model to agree with the measured intensity curve at the end of the experimental time window, and this fixes \(\varepsilon \). As a second step, the FP equation has been used to reproduce the various experimental cases, determining the values of \(\kappa \) and \(I_*\) by minimising the \(L^2\) norm of the difference between the solution of the FP equation with an absorbing boundary condition at \(I=I_{\mathrm{abs}}\) and the corresponding measured intensity curve. This provided pairs of values \(\kappa _j, {I_*}_j\) for each experimental configuration. The main observation is that \(\kappa \) depends only very mildly on the configuration, which is in agreement with the fact that it should be linked with the number of degrees of freedom of the system under consideration. Therefore, the average of the \(\kappa _j\) has been used as an estimate of \(\kappa \) for all data sets. The third and last step has been the computation of the solution of the FP equation for the various configurations, using the only remaining free parameter \(I_*\) to minimise the \(L^2\) norm as done for the second step. It is worth pointing out that, to remove the noise affecting the beam intensity measurement, which is visible in the two rightmost plots of Fig. 3, the intensity data have been filtered with a 50-point moving average.
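
The three-step procedure can be summarised, in a simplified form, by the following Python sketch; the function fp_loss_curve, which returns the simulated relative intensity loss from a numerical solution of Eq. (5), is a hypothetical helper (e.g. a wrapper around a solver like the one sketched in Sect. 2), and the averaging of \(\kappa \) across configurations is not shown.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

def fit_fp_parameters(fp_loss_curve, t_meas, loss_meas, x0=(0.3, 20.0)):
    """Fit (kappa, I_*) of the diffusion model to a measured relative-loss curve.

    fp_loss_curve(kappa, I_star, epsilon, t) is assumed to return the simulated
    relative intensity loss at the times t from a numerical solution of Eq. (5).
    """
    def l2_residual(params):
        kappa, I_star = params
        # step 1: fix epsilon so that the simulated total loss matches the
        # measured loss at the end of the experimental time window
        eps = minimize_scalar(
            lambda e: (fp_loss_curve(kappa, I_star, e, t_meas[-1:])[-1]
                       - loss_meas[-1]) ** 2,
            bounds=(1e-4, 10.0), method="bounded").x
        # step 2: L2 norm of the difference between simulation and measurement
        sim = fp_loss_curve(kappa, I_star, eps, t_meas)
        return np.sqrt(np.sum((sim - loss_meas) ** 2))

    # fit kappa and I_* (in the paper kappa is then averaged over the
    # configurations and the fit repeated with I_* as the only free parameter)
    return minimize(l2_residual, x0=np.asarray(x0), method="Nelder-Mead")
```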

The procedure described above aims to highlight the functional form of the beam losses produced by the Nekhoroshev diffusion coefficient: we tried to avoid tuning the model parameters to obtain the best agreement with each individual measurement, seeking instead a global agreement between the numerical solutions and the measurement results.

Since the action variable I represents the non-linear invariant of the system, we have chosen as initial condition an exponential distribution

$$\begin{aligned} \rho _0(I)=\sigma ^{-2}\, \exp \left( -\frac{I}{\sigma ^2}\right) \end{aligned}$$
(8)

where \(\sigma ^2\) stands for the measured beam emittance, to reproduce the measured beam profile as shown in Fig. 4. Moreover, by scaling the action variable \(I\rightarrow I/\sigma ^2\) we can set \(\sigma =1\) in the simulations without affecting the beam loss rate. Finally, the position \(I_{\mathrm{abs}}\) of the absorbing boundary is computed from the position of the collimator, expressed in units of beam emittance, considering the physical plane where the beam diffusion is expected to be more relevant.
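
One possible bookkeeping of this scaling is sketched below, under the convention that the action corresponding to an amplitude of r beam sigmas is \(I=r^2/2\) in emittance units; both this convention and the numerical inputs are assumptions made for illustration and may differ from the values actually used in the analysis.

```python
import math

coll_setting = 9.0       # primary collimator setting [sigma_nom]
eps_nom = 3.75e-6        # nominal normalised emittance [m rad]
eps_blowup = 25.0e-6     # normalised emittance after blow-up [m rad]; illustrative value

# collimator position expressed in units of the blown-up beam sigma
n_sigma = coll_setting * math.sqrt(eps_nom / eps_blowup)
# absorbing-boundary position in action units, under the convention I = r**2 / 2
I_abs = n_sigma**2 / 2.0
print(f"collimator at {n_sigma:.2f} sigma of the blown-up beam -> I_abs = {I_abs:.2f}")
```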

In Table 1 we report the model parameters obtained by applying the procedure described above, i.e. from the numerical evaluation of the relative intensity losses at the absorbing barrier with the FP equation (5). It is worth noting that for Beam 2 the case with H blow-up and 'with correctors' features no appreciable beam losses and, therefore, no attempt to derive the model parameters has been made.

Table 1 Summary of the model parameters obtained with the numerical simulations of the measured beam losses, using the approach described in the main text. In the case of Beam 2 with horizontal blow-up, the losses for the configuration 'with correctors' are not high enough to attempt any meaningful modelling. The plane where the absorbing boundary is set is specified in parentheses for the cases with HV blow-up. There, the boundary condition is set in the plane, and at the value, corresponding to the minimum amplitude between the boundary conditions in the horizontal and vertical planes. The \(L^2\) norm is also given, which is to be considered relative to the total beam losses measured for each configuration

The values of the \(L^2\) norm for the final numerical results are also listed in Table 1. The norm provides a cumulative measure of the deviation between the measured and the simulated beam loss curves, and the values are relative to the total beam loss measured for each configuration. The order of magnitude is a few percent, which corresponds to about a few \(10^{-4}\) of the absolute intensity loss. Note that the precision with which the beam intensity is measured is below the percent level. Therefore, the overall agreement can be considered excellent.
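
As a reference, one plausible reading of this normalisation is sketched below; the exact definition used for Table 1 may differ.

```python
import numpy as np

def relative_l2(loss_sim, loss_meas):
    """L2 norm of the simulated-minus-measured loss curve, normalised to the
    total measured loss (one plausible reading of the normalisation in Table 1)."""
    return np.linalg.norm(loss_sim - loss_meas) / abs(loss_meas[-1])
```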

Figure 5 shows the results of the numerical simulations together with the experimental data for the complete Beam 1 data set.

Fig. 5

Measured and simulated intensity loss for the Beam 1 data set with HV blow-up using the 1D FP equation (5) for configuration ‘no correctors’ (upper) and ‘with correctors’ (lower). The initial distribution is exponential and the values of the model parameters are those reported in Table 1

The agreement between the measured data and the simulation results is excellent, as indicated by the values of the \(L^2\) norm in Table 1. Figure 6 shows the results of our analysis for the Beam 2 data set, in which the case with H blow-up has been discarded due to the insufficient level of beam losses.

Fig. 6

Measured and simulated intensity loss for the Beam 2 data sets: (a) HV blow-up, 'with correctors'; (b) HV blow-up, 'no correctors'; (c) V blow-up, 'with correctors'; (d) V blow-up, 'no correctors'. The initial distribution is exponential and the values of the model parameters are those reported in Table 1

Also in this case, the agreement between experimental observations and numerical simulations is striking. It is also worth noting that the time span of the various data sets covers a rather wide range of turn numbers and the agreement does not depend on the duration of the measurements.

According to the physical interpretation of the diffusive model parameters, the exponent \(\kappa \) plays a fundamental role in determining the shape of the beam loss curve. The second model parameter, \(I_*\), defines a transition threshold in action space from fast to slow diffusion, and it changes the shape of the curve when its value is comparable with the position of the absorbing barrier.

To illustrate the sensitivity of the simulated beam intensity to the values of \(\kappa \) and \(I_*\), in Fig. 7 we compare the beam loss curves computed with the diffusion model when the two parameters are varied, one at a time, with respect to their optimal values (top and centre plots). In the bottom plot, the relative difference between the curve reproducing the experimental data and those with varied model parameters is shown.
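
A sketch of the parameter scan underlying this comparison is given below; it relies on the same hypothetical fp_loss_curve helper introduced above and, for simplicity, keeps \(\varepsilon \) fixed instead of re-adjusting it to match the total losses as done for Fig. 7.

```python
import numpy as np

def sensitivity_scan(fp_loss_curve, t, kappa0, I_star0, eps0, rel_step=0.03):
    """Vary kappa and I_star by a few percent around their optimal values.

    Returns the difference of each varied loss curve with respect to the
    reference one, normalised to the final reference loss (epsilon is kept
    fixed here, unlike in Fig. 7 where it is re-adjusted)."""
    ref = fp_loss_curve(kappa0, I_star0, eps0, t)
    scan = {
        "kappa +": (kappa0 * (1 + rel_step), I_star0),
        "kappa -": (kappa0 * (1 - rel_step), I_star0),
        "I_* +": (kappa0, I_star0 * (1 + rel_step)),
        "I_* -": (kappa0, I_star0 * (1 - rel_step)),
    }
    return {label: (fp_loss_curve(k, istar, eps0, t) - ref) / abs(ref[-1])
            for label, (k, istar) in scan.items()}
```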

Fig. 7

Loss curves for Beam 2 with V blow up for configuration ‘with correctors’ showing also the results for models in which the parameter \(\kappa \) (upper) or \(I_*\) (lower) is varied around the optimal value. In both cases the parameter \(\varepsilon \) is adapted to fix the total losses, so that all curves intersect at the end of the time interval. In the bottom plot the relative difference between the curve reproducing the experimental data and those with varied model parameters is shown. The variation of \(\kappa \) produces the largest change in the loss curve

The value of \(\kappa \), when varied by a few percent, substantially influences the shape of the initial part of the interpolating curve, right after the initial fast transient of beam losses. This observation supports the assumption that the constancy of the exponent \(\kappa \) across the different cases considered can be attributed to an intrinsic property of the observed diffusion process.

In perturbation theory, the \(I_*\) parameter is interpreted as a global scaling for the perturbative series related to the nature of the non-linear terms present in the system, although it is not directly linked to their magnitude. The effect of the value of \(I_*\) on the shape of the beam loss curve depends on the ratio \(I_{\mathrm{abs}}/I_*\), which sets the relative position of the absorbing barrier: a larger value of \(I_*\) reduces the beam halo and consequently the beam losses at the position of the absorbing barrier, whereas the opposite effect occurs for smaller values of \(I_*\).

In summary, Fig. 7 indicates that the proposed approach is sensitive to changes in \(\kappa \) and \(I_*\) at the level of a few percent, which provides very strong support for the robustness of the proposed model against variations of its parameters. In turn, this means that the differences between the values obtained from the numerical simulations could reflect actual differences in the dynamics occurring in the weakly chaotic regions where the diffusion phenomena take place.

5 Symplectic-tracking checks of the experimental observations

The analysis presented in this paper does not rely on anything other than the numerical solution of the FP equation. Nevertheless, some tracking simulations have been performed to assess the choice of the boundary conditions and of the plane of losses, as well as some of the assumptions needed for the diffusive approach to be a valid option.

The ring model is the most accurate description of the LHC lattice, including the measured field errors (see [38] for more detail) together with the operational configuration of the various correction circuits. The numerical protocol used envisages the generation of sixty realisations of the magnetic errors to take into account the measurement uncertainties; moreover, a polar grid of initial conditions in xy space is defined and its evolution is computed for up to \(10^6\) turns. The polar grid of initial conditions is obtained by dividing the first quadrant of the xy space into 59 angles, and along each direction 30 initial conditions are uniformly distributed over intervals of \(2\sigma \).
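
A minimal sketch of such a polar grid is shown below; the placement of the \(2\sigma \) window (r_min) and the inclusion of the axes among the 59 angles are illustrative assumptions, since the exact choices are not specified here.

```python
import numpy as np

n_angles, n_amp = 59, 30
theta = np.linspace(0.0, np.pi / 2, n_angles)              # angles covering the first quadrant
r_min = 8.0                                                # start of the 2-sigma window (illustrative)
r = r_min + np.linspace(0.0, 2.0, n_amp, endpoint=False)   # 30 amplitudes over a 2-sigma interval

x0 = np.outer(r, np.cos(theta))   # initial horizontal amplitudes [sigma]
y0 = np.outer(r, np.sin(theta))   # initial vertical amplitudes [sigma]
# each (x0[i, j], y0[i, j]) pair is then tracked, e.g. with SixTrack, for up to 1e6 turns
```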

The evolution of the initial conditions through the LHC lattice is computed using the SixTrack code [39], which implements a second-order symplectic integration method. The loss time, i.e. the time at which an orbit associated with a given initial condition reaches a pre-defined amplitude, is recorded and associated with each initial condition. The outcome of these simulations is shown in Fig. 8, where the stable region is shown for Beam 1 (upper row) and Beam 2 (lower row) and for each of the two configurations used in the experiment ('with correctors' in the left column, 'no correctors' in the right one) for the first realisation of the magnetic errors.

Fig. 8

Plots of the stable region in xy space for Beam 1 (upper row) and Beam 2 (lower row) for the first realisation of the magnetic errors. The configuration ‘with correctors’ is shown in the left column, while that with ‘no correctors’ in the right one. The various colours indicate different stability time \(N_{\mathrm{stab}}\) and initial conditions that are not stable for at least \(10^5\) turns are represented by a marker whose size is proportional to the stability time. The white lines represent the \(3\sigma \) level lines of the beam distribution for the three types of blow up, namely H, V, and HV

The different colours are used to identify various stability times \(N_{\mathrm{stab}}\), i.e. dark-blue markers indicate particles with \(N_{\mathrm{stab}} < 10^5\) and the marker size is proportional to the value of \(N_{\mathrm{stab}}\). Yellow markers indicate a region for which \(N_{\mathrm{stab}} > 10^5\), while for green markers \(N_{\mathrm{stab}} > 10^6\). The shrinking of the extent of the stable region for increasing values of \(N_{\mathrm{stab}}\) is clearly visible. Moreover, the border of stability is almost circular for Beam 1, whereas it is much more irregular for Beam 2.

Figure 8 shows also three white curves: they represent the \(3\sigma \) level lines of the beam distribution for the three types of blow up applied during the experiment, namely H, V, or HV. For Beam 1, the beam distribution is rather close to the stability boundary and sizeable losses are to be expected. For Beam 2, it is worth noting that the irregular shape of the stability border implies that the \(3\sigma \) edge of the beam distribution is relatively far from the border itself, which is in agreement with the low beam losses measured for the H blow up case. Both the HV and V blow up cases feature the edge of the beam distribution close to the stability border in the vicinity of the y axis. This explains qualitatively the higher losses observed for these cases and, of course, also the fact that the case with HV blow up generates even higher beam losses.

These plots, however, provide only static information about the extent of the stable region of phase space. The time dependence can be reconstructed by means of Eq. (1) and it is shown in Fig. 9, where the Beam 1 (upper row) and the Beam 2 (lower row) cases are shown. The configuration ‘with correctors’ is reported in the left column, whereas that with ‘no correctors’ in the right one.

Each plot features three sets of curves: one representing the DA averaged over all angles in the xy space according to Eq. (1) (hence representing a situation relevant for the HV blow-up case); one representing the DA averaged over only ten angles in the vicinity of the x axis (hence representing a situation relevant for the H blow-up case); and the last one representing the DA averaged over only ten angles in the vicinity of the y axis (hence representing a situation relevant for the V blow-up case). Each set of curves includes data from all sixty realisations of the LHC lattice.
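
The three sets of curves can be obtained from the tracking output with a few lines of code; the sketch below assumes a (hypothetical) table of last stable amplitudes per probed turn number and per angle.

```python
import numpy as np

def da_curves(r_last, thetas, n_edge=10):
    """Angular averages of the last stable amplitude r(theta; N).

    r_last : array of shape (n_turn_values, n_angles), last stable amplitude
             for each probed value of N and each angle (hypothetical input)
    thetas : angles of the polar grid, from 0 (x axis) to pi/2 (y axis)
    """
    order = np.argsort(thetas)
    r = np.asarray(r_last)[:, order]
    da_all = r.mean(axis=1)              # average over all angles, Eq. (1)
    da_x = r[:, :n_edge].mean(axis=1)    # ten angles closest to the x axis (H blow-up)
    da_y = r[:, -n_edge:].mean(axis=1)   # ten angles closest to the y axis (V blow-up)
    return da_all, da_x, da_y
```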

The Beam 1 case features a much smaller spread among the sixty realisations than the Beam 2 case. Apart from this, however, the results for the two beams share a number of common features: the curves representing the average close to the x axis have the mildest dependence on time, while the curves representing the average close to the y axis and the global average have a very similar shape, featuring an almost constant shift between them.

Fig. 9

Plots of the DA evolution with turn number for Beam 1 (upper row) and Beam 2 (lower row). The configuration ‘with correctors’ is shown in the left column, while that with ‘no correctors’ in the right one. The results for all 60 realisations of the LHC lattice are reported. The three sets of curves refer to the DA averaged over all angles [see Eq. (1)] or averaged over ten angles close to the x or y axis, respectively

The numerical model can also provide quantitative information on other lattice properties, such as the detuning with amplitude. An important condition for the applicability of the approach based on the diffusion equation is that \(\partial \varOmega /\partial I\) is not too small for the cases under consideration. Indeed, the value of \(\partial \varOmega /\partial I\) has been evaluated and turned out to be O(1), as required for the validity of Eq. (12).

In summary, the results of the symplectic-tracking simulations confirm that the assumptions needed to apply a diffusive approach to the description of beam losses in the LHC are fulfilled.

6 Conclusions

In this paper, a novel diffusive model capable of excellent agreement with the results of the recent dynamic aperture experiment at the CERN LHC has been presented. The model is inspired by the optimal estimate of the remainder of the perturbative series provided by the Nekhoroshev theorem, which we propose as the functional form of the diffusion coefficient. The model features three parameters that characterise the diffusion equation, and the physical meaning of these parameters has been highlighted and discussed in detail. The model has been successfully applied to the description of the beam loss data sets, which originated from measurements performed at the CERN LHC at 6.5 TeV. The various data sets represent a number of different configurations in terms of non-linear effects in the beam dynamics, which suggests that the excellent agreement obtained between measurements and the model is a generic feature. One of the model parameters, \(\kappa \), is kept the same for all data sets, which is perfectly in line with its physical interpretation. The deviation of the obtained value of \(\kappa \) from the theoretical estimate is being further investigated. Given that a 1D formalism provides a good description of the experimental measurements, as shown in the previous sections, the theoretical estimate would give \(\kappa =1\). As a matter of fact, the theoretical estimates consider a local transport in the action, whereas we are applying the proposed model to a global action diffusion. Therefore, one could expect that in our case \(\kappa \) is the result of an effective description of the dynamics on a large spatial scale. Furthermore, it is worth stressing that the results of the numerical simulations depend critically on the values of the model parameters, i.e. variations at the level of a few percent of these parameters strongly change the functional form of the simulated beam losses. This is a very reassuring observation, indicating that the determination of the model parameters is very robust.

In all the considered cases, a 1D approach has been applied, and this assumption has been probed by means of symplectic-tracking simulations, which fully supported the choice made. The interesting question of whether the diffusion model can be justified on the basis of tracking simulations requires a detailed study of the phase space structure to detect the existence of weakly chaotic regions and to understand the effect of the external random perturbations that are unavoidable in real accelerators. We also plan to investigate an approach based on the 2D Fokker–Planck equation and to examine in depth the relation between the diffusion coefficient and the stability times of the orbits in phase space computed by means of the Nekhoroshev estimate [17].

It is also worth stressing that the proposed approach can be used to predict the beam losses as well as the evolution of the beam distribution, extrapolating experimental measurements to long time scales that are well beyond the reach of present symplectic-tracking codes. The optimal approach would be to couple symplectic tracking with the Fokker–Planck equation: the former, performed over a limited number of turns but with a detailed exploration of phase space, would provide information about the functional form of the diffusion coefficient; the latter would provide information on the long-term dynamics, such as beam losses and transverse distributions, possibly covering realistic time scales for accelerator physics applications. In this way, the outlined approach could be applied as a truly predictive technique for future high-energy colliders.