3.1 Introduction

Non-linear effects in accelerator physics are important both during the design stage and for the successful operation of accelerators. Since both of these aspects are closely related, they will be treated together in this overview. Some of the most important aspects are well described by methods established in other areas of physics and mathematics. Given the scope of this handbook, the treatment will be focused on the problems in accelerators used for particle physics experiments. Although the main emphasis will be on accelerator physics issues, some aspects of more general interest will be discussed, in particular to demonstrate that in recent years a framework has been built to handle the complex problems in a consistent form, technically superior to and conceptually simpler than the traditional techniques. The need to understand the stability of particle beams has substantially contributed to the development of new techniques and is an important source of examples which can be verified experimentally. Unfortunately the documentation of these developments is often poor or even unpublished, in many cases only available as lectures or conference proceedings.

This article is neither rigorous nor a complete treatment of the topic, but rather an introduction to a limited set of contemporary tools and methods we consider useful in accelerator theory.

3.1.1 Motivation

The most reliable tools to study a machine (i.e. a given description of it) are simulations, e.g. tracking codes.

  • Particle Tracking is a numerical solution of the (nonlinear) Initial Value Problem. It is an “integrator” of the equations of motion, and a vast number of tracking codes are available, together with analysis tools (examples: Lyapunov exponents, Chirikov criterion, chaos detection, frequency analysis, …)

  • It is unfortunate that theoretical and computational tools exist side by side without an understanding of how they can be integrated.

  • An approach should be found to link simulations with theoretical analysis; this would allow a better understanding of the physics in realistic machines.

  • A particularly promising approach is based on finite maps [1].

3.1.2 Single Particle Dynamics

The concepts developed here are used to describe single particle transverse dynamics in rings, i.e. circular accelerators or storage rings. This is not a restriction for the application of the presented tools and methods. In the case of linear betatron motion the theory is rather complete and the standard treatment [2] suffices to describe the dynamics. In this theory, well known concepts such as the closed orbit and the Twiss parameters emerge automatically from the Courant-Snyder formalism [2]. The formalism and applications are found in many textbooks (e.g. [3,4,5]).

In many new accelerators or storage rings (e.g. LHC) the description of the machine with a linear formalism becomes insufficient and the linear theory must be extended to treat non-linear effects. The stability and confinement of the particles is not given a priori and should rather emerge from the analysis. Non-linear effects are a main source of performance limitations in such machines. A reliable treatment is required, and the progress in recent years allows one to evaluate the consequences. A very useful overview and further details can be found in [6,7,8].

3.1.3 Layout of the Treatment

Following a summary of the sources of non-linearities in circular machines, the basic methods to evaluate the consequences of non-linear behaviour are discussed. Since the traditional approach has caused misconceptions and its simplifications have led to wrong conclusions, more recent and contemporary tools are introduced to treat these problems. An attempt is made to provide the physical picture behind these tools rather than a rigorous mathematical description, and we shall show how the new concepts are a natural extension of the Courant-Snyder formalism to non-linear dynamics. An extensive treatment of these tools and many examples can be found in [7]. In the last part we summarize the most important physical phenomena caused by the non-linearities in an accelerator.

3.2 Variables

For what follows one should always use canonical variables!

In Cartesian coordinates:

$$\displaystyle \begin{aligned} {{R~=~(X, P_{X}, Y, P_{Y}, Z, P_{Z}, t)}} \end{aligned} $$
(3.1)

If the energy is constant (i.e. $P_{Z}$ = const.), we use:

$$\displaystyle \begin{aligned} {{(X, P_{X}, Y, P_{Y}, Z, t)}} \end{aligned} $$
(3.2)

This system is rather inconvenient; what we want is a description of the particle in the neighbourhood of the reference orbit/trajectory:

$$\displaystyle \begin{aligned} {{R_{d}~=~(X, P_{X}, Y, P_{Y}, Z, t)}} \end{aligned} $$
(3.3)

which are now considered the deviations from the reference and which are zero for a particle on the reference trajectory.

It is very important that it is the reference not the design trajectory!

(so far it is a straight line along the Z-direction)

3.2.1 Trace Space and Phase Space

A confusion often arises about the terms Phase Space $(x, p_{x}, \ldots)$ and Trace Space $(x, x', \ldots)$.

It is neither laziness nor stupidity to use one or the other:

  • Beam dynamics is strictly correct only with $(x, p_{x}, \ldots)$ (see later chapter), but in general these quantities cannot be measured easily

  • Beam dynamics with (x, x′, …) needs special precaution, but quantities based on these coordinates are much easier to measure

  • Some quantities are different (e.g. emittance)

This comes back to a remark made at the beginning, i.e. that we use rings for our arguments. In single pass machines, e.g. linacs, beam lines or spectrometers, the beam does not circulate over many turns and several hours, and therefore there is no interest in stability issues. For most of these applications what counts are the coordinates and angles at a given position (x, x′, y, y′), e.g. at the end of a beam line or at a small spot on an electron microscope. When “accelerator physicists” talk about concepts such as tune, resonances, β-functions, equilibrium emittances etc., all these are irrelevant for single pass machines; there is no need to study iterated systems. In these cases the use of the trace space is fully adequate, in fact preferred because the quantities can be measured. In the end, the mathematical tools are very different from the ones discussed in this article.

3.2.2 Curved Coordinate System

For a “curved” trajectory, in general not circular, with a local radius of curvature ρ(s) in the horizontal (XZ) plane, we have to transform to a new coordinate system (x, y, s) (co-moving frame) with:

$$\displaystyle \begin{aligned} \begin{array}{ll} X~=~&(x + \rho) \cos\left( \frac{s}{\rho}\right)~-~\rho~~~~\\ Y~=~&y\\ Z~=~&(x + \rho) \sin\left( \frac{s}{\rho}\right) \end{array} \end{aligned} $$
(3.4)

The new canonical momenta become:

$$\displaystyle \begin{aligned} \begin{array}{ll} p_{x}~=~&P_{X}\cos\left( \frac{s}{\rho}\right)~+~P_{Z}\sin\left( \frac{s}{\rho}\right)\\ p_{y}~=~&P_{Y}\\ p_{s}~=~&P_{Z}\left(1~+~\frac{x}{\rho}\right)\cos\left( \frac{s}{\rho}\right)~-~P_{X}\left(1~+~\frac{x}{\rho}\right)\sin\left( \frac{s}{\rho}\right) \end{array} \end{aligned} $$
(3.5)
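As a small illustration, the transformation (3.4) can be coded directly. The sketch below (the function name and the bending radius are purely illustrative) checks that a particle on the reference orbit (x = 0, y = 0) stays on the circle of radius ρ in the XZ plane:

```python
import math

def to_cartesian(x, y, s, rho):
    """Transform co-moving coordinates (x, y, s) to Cartesian (X, Y, Z)
    for a circular reference orbit of radius rho, Eq. (3.4)."""
    X = (x + rho) * math.cos(s / rho) - rho
    Y = y
    Z = (x + rho) * math.sin(s / rho)
    return X, Y, Z

# A particle on the reference orbit (x = 0) stays on the circle of radius rho:
rho = 100.0
X, Y, Z = to_cartesian(0.0, 0.0, 25.0, rho)
assert abs((X + rho)**2 + Z**2 - rho**2) < 1e-9
```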

3.3 Sources of Non-linearities

Any object creating non-linear electromagnetic fields on the trajectory of the beam can strongly influence the beam dynamics. They can be generated by the environment or by the beam itself.

3.3.1 Non-linear Machine Elements

Non-linear elements can be introduced into the machine on purpose or can be the result of field imperfections. Both types can have adverse effects on the beam stability and must be taken into account.

3.3.1.1 Unwanted Non-linear Machine Elements

The largest fraction of machine elements are either dipole or quadrupole magnets. In the ideal case, these types of magnets have pure dipolar or quadrupolar fields and behave approximately as linear machine elements. Any systematic or random deviation from this linear field introduces non-linear fields into the machine lattice. These effects can dominate the aperture required and limit the stable region of the beam. The definition of tolerances on these imperfections is an important part of any accelerator design.

Normally magnets are long enough that a 2-dimensional field representation is sufficient. The components of the magnetic field can be derived from the potential and in cylindrical coordinates (r,  Θ, s = 0) can be written as:

$$\displaystyle \begin{aligned} B_{r}(r, \Theta) = \sum_{n=1}^{\infty} (B_{n} {\mathrm{\sin}}(n\Theta) + A_{n} {\mathrm{\cos}}(n\Theta)) \left(\frac{r}{R_{ref}}\right)^{n-1}, {} \end{aligned} $$
(3.6)
$$\displaystyle \begin{aligned} B_{\Theta}(r, \Theta) = \sum_{n=1}^{\infty} (B_{n} {\mathrm{\cos}}(n\Theta) - A_{n} {\mathrm{\sin}}(n\Theta)) \left(\frac{r}{R_{ref}}\right)^{n-1}, {} \end{aligned} $$
(3.7)

where $R_{ref}$ is a reference radius and the $B_{n}$ and $A_{n}$ are constants. Written in Cartesian coordinates we have:

$$\displaystyle \begin{aligned} B(z) = B_{y} + i B_{x} = \sum_{n=1}^{\infty} (B_{n} + i A_{n}) \left(\frac{z}{R_{ref}}\right)^{n-1} {} \end{aligned} $$
(3.8)

where $z = x + iy = re^{i\Theta}$. The terms of index n correspond to 2n-pole magnets and the $B_{n}$ and $A_{n}$ are the normal and skew multipole coefficients. The beam dynamics sets limits on the allowed multipole components of the installed magnets.
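The polar components (3.6)-(3.7) and the complex notation are two views of the same field: with $B_y + iB_x = \sum_n (B_n + iA_n)(z/R_{ref})^{n-1}$ one can rotate back to $B_r$ and $B_\Theta$. A minimal sketch verifying the equivalence numerically (the coefficient values are illustrative, not from any real magnet):

```python
import cmath, math

def field_polar(r, theta, Bn, An, Rref):
    """B_r and B_Theta from the multipole sums, Eqs. (3.6)-(3.7).
    Bn, An are lists indexed from n = 1."""
    Br = sum((b * math.sin(n * theta) + a * math.cos(n * theta)) * (r / Rref)**(n - 1)
             for n, (b, a) in enumerate(zip(Bn, An), start=1))
    Bt = sum((b * math.cos(n * theta) - a * math.sin(n * theta)) * (r / Rref)**(n - 1)
             for n, (b, a) in enumerate(zip(Bn, An), start=1))
    return Br, Bt

def field_complex(r, theta, Bn, An, Rref):
    """Same field from B_y + i B_x = sum (B_n + i A_n) (z/Rref)^(n-1)."""
    z = r * cmath.exp(1j * theta)
    B = sum((b + 1j * a) * (z / Rref)**(n - 1)
            for n, (b, a) in enumerate(zip(Bn, An), start=1))
    Bx, By = B.imag, B.real
    # Rotate Cartesian components into the cylindrical frame:
    Br = Bx * math.cos(theta) + By * math.sin(theta)
    Bt = -Bx * math.sin(theta) + By * math.cos(theta)
    return Br, Bt

Bn, An = [1.0, 0.5, 0.1], [0.0, 0.02, 0.01]   # dipole + quadrupole + sextupole
p = field_polar(0.02, 0.7, Bn, An, Rref=0.017)
c = field_complex(0.02, 0.7, Bn, An, Rref=0.017)
assert abs(p[0] - c[0]) < 1e-12 and abs(p[1] - c[1]) < 1e-12
```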

3.3.1.2 Wanted Non-linear Machine Elements

In most accelerators the momentum dependent focusing of the lattice (chromaticity) needs to be corrected with sextupoles [3, 4]. Sextupoles introduce non-linear fields into the lattice that are larger than the intrinsic non-linearities of the so-called linear elements (dipoles and quadrupoles). In a strictly periodic machine the correction can be done close to the origin and the required sextupole strengths can be kept small. For colliding beam accelerators, special insertions are usually foreseen to host the experiments, where the dispersion is kept small and the β-function is reduced to a minimum. The required sextupole correction is strong and can lead to a reduction of the dynamic aperture, i.e. the region of stability of the beam. In most accelerators the sextupoles are the dominant source of non-linearity, and minimizing this effect is an important issue in the design of any accelerator.

Another source of non-linearities can be octupoles used to generate amplitude dependent detuning to provide Landau damping in case of instabilities.

3.3.2 Beam–Beam Effects and Space Charge

A strong source of non-linearities are the fields generated by the beam itself. They can cause significant perturbations on the same beam (space charge effects) or on the opposing beam (beam-beam effects) in the case of a colliding beam facility.

As an example, for the simplest case of round beams with the line density n and the beam size σ the field components can be written as:

$$\displaystyle \begin{aligned} E_{r} = -\frac{n e}{4 \pi \epsilon_{0}} \cdot \frac{\partial}{\partial r} \int_{0}^{\infty} \frac{\mathrm{exp}{\textstyle{(-\frac{r^{2}}{(2 \sigma^{2} + q)})}}}{(2 \sigma^{2} + q)} {\mathrm{d}}q, \end{aligned} $$
(3.9)

and

$$\displaystyle \begin{aligned} B_{\Phi} = -\frac{n e \beta c \mu_{0}}{4 \pi } \cdot \frac{\partial}{\partial r} \int_{0}^{\infty} \frac{\mathrm{exp}{\textstyle{(-\frac{r^{2}}{(2 \sigma^{2} + q)})}}}{(2 \sigma^{2} + q)} {\mathrm{d}}q. \end{aligned} $$
(3.10)

In colliding beams with high density and small beam sizes these fields are the dominant source of non-linearities. The full treatment of beam-beam effects is complicated due to the mutual interaction between the two beams, and a self-consistent treatment, in the presence of all other magnets in the ring, is required.
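The radial profile in Eqs. (3.9)-(3.10) can be evaluated numerically: differentiating under the integral and substituting $t = 1/(2\sigma^2+q)$ turns it into a finite integral, which must agree with the well known closed form $\propto (1 - \exp(-r^2/2\sigma^2))\,2/r$ for a round Gaussian beam. A sketch (constant prefactors dropped, r and σ are hypothetical values):

```python
import math

def radial_profile_numeric(r, sigma, steps=20000):
    """Radial field profile  -d/dr ∫_0^∞ exp(-r²/(2σ²+q))/(2σ²+q) dq,
    evaluated by differentiating under the integral and substituting
    t = 1/(2σ² + q), which maps q ∈ [0, ∞) onto t ∈ (0, 1/(2σ²)]."""
    t_max = 1.0 / (2.0 * sigma**2)
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h              # midpoint rule
        total += 2.0 * r * math.exp(-r * r * t)   # integrand after substitution
    return total * h

def radial_profile_closed(r, sigma):
    """Known closed form for a round Gaussian beam: (2/r)(1 - exp(-r²/2σ²))."""
    return (2.0 / r) * (1.0 - math.exp(-r * r / (2.0 * sigma**2)))

r, sigma = 1.5e-3, 1.0e-3
num, ref = radial_profile_numeric(r, sigma), radial_profile_closed(r, sigma)
assert abs(num - ref) < 1e-6 * ref
```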

3.4 Map Based Techniques

In the standard approach to single particle dynamics in rings, the equations of motion are introduced together with an ansatz to solve these equations. In the case of linear motion, this ansatz is due to Courant-Snyder [2]. However, this treatment must assume that the motion of a particle in the ring is stable and confined. For a non-linear system this is a priori not known and the attempt to find a complete description of the particle motion must fail.

The starting point for the treatment of the linear dynamics in synchrotrons is based on solving a linear differential equation of the Hill type.

$$\displaystyle \begin{aligned} \frac{d^{2} x(s)}{d s^{2}} + \underbrace{\left( a_{0} + 2\sum_{n=1}^{\infty} a_{n}\cdot \cos{}(2 n s)\right)}_{K(s)} x(s) = 0~. \end{aligned}$$

Each element at position s acts as a source of forces, i.e. we must write for the forces K → K(s), which is assumed to be a periodic function, i.e. K(s + C) = K(s), where C is the circumference of the ring.

The solution of this Boundary Value Problem must be periodic too!

It is therefore not applicable in the general case (e.g. linacs, beam lines, FFAGs, recirculators, …); it is much better to treat the problem as an Initial Value Problem.

In a more useful approach we do not attempt to solve such an overall equation but rather consider the fundamental objects of an accelerator, i.e. the machine elements themselves. These elements, e.g. magnets or other beam elements, are the basic building blocks of the machine. All elements have a well defined action on a particle which can be described independently of other elements or of concepts such as closed orbit or β-functions. Mathematically, they provide a “map” from one face of a building block to the other, i.e. a description of how the particles move inside and between elements. In this context, a map can be anything from linear matrices to high order integration routines.

A map based technique is also the basis for the treatment of particle dynamics as an Initial Value Problem (IVP).

It follows immediately that for a linear system of first order equations of the type

$$\displaystyle \begin{aligned} \frac{d \vec{z}(s)}{d s}~=~{{K}}(s)~\vec{z}(s),~~~~\vec{z}~=~(x, x')^{T}~~~~~~~~({\mathsf{and~initial~values~at}}~~~s_{0}) \end{aligned}$$

the solution can always be written as:

$$\displaystyle \begin{aligned} \begin{array}{c} x(s)~=~a\cdot x(s_{0})~+~ b\cdot x'(s_{0}) \\ x'(s)~=~c\cdot x(s_{0})~+~ d\cdot x'(s_{0}) \\ \end{array} \Longrightarrow \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{s} ~~=~~ {{ {\overbrace{ \left( \begin{array}{c} a~~b \\ c~~d \\ \end{array}\right) }^{\textstyle{{\mathsf{A}}}}} \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{s_{0}} }} \end{aligned}$$

where the function K(s) does not have to be periodic. Furthermore, the determinant of the matrix A is always 1. It is therefore an advantage to use maps (matrices) for linear systems from the start, without trying to solve a differential equation.
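A minimal numerical illustration: composing the matrices of simple linear elements (a drift and a thin quadrupole, with made-up parameter values) preserves the unit determinant:

```python
import numpy as np

# Element maps as 2x2 matrices (1D): a drift of length L and a thin
# quadrupole of integrated strength kL (hypothetical example values).
def drift(L):
    return np.array([[1.0, L], [0.0, 1.0]])

def thin_quad(kL):
    return np.array([[1.0, 0.0], [-kL, 1.0]])

# Composition of maps = matrix product (rightmost factor acts first):
M = drift(0.5) @ thin_quad(0.8) @ drift(0.5)
assert abs(np.linalg.det(M) - 1.0) < 1e-12   # det = 1 for each factor
```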

The collection of all machine elements makes up the ring or beam line, and it is the combination of the associated maps which is necessary for the description and analysis of the physical phenomena in the accelerator ring or beam line.

For a circular machine the most interesting map is the one which describes the motion once around the machine, the so-called One-Turn-Map. It contains all necessary information on stability, existence of closed orbit, and optical parameters. The reader is assumed to be familiar with this concept in the case of linear beam dynamics (Chap. 2) where all maps are matrices and the Courant-Snyder analysis of the corresponding one-turn-map produces the desired information such as e.g. closed orbit or Twiss parameters.

It should therefore be the goal to generalize this concept to non-linear dynamics. The computation of a reliable one-turn-map and the analysis of its properties will provide all relevant information.

Given that the non-linear maps can be rather complex objects, the analysis of the one-turn-map should be separated from the calculation of the map itself.

3.5 Linear Normal Forms

3.5.1 Sequence of Maps

Starting from a position s 0 and combining all matrices to get the matrix to position s 0 + L (shown for 1D only):

$$\displaystyle \begin{aligned} \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}~+~L}}} =~~ \underbrace{{{{M}}_{N}} ~~\circ~~ {{{M}}_{N-1}} ~~\circ~~ \ldots ~~\circ~~ {{{M}}_{1}}}_{{{{M}}(s_{0}, L)}} ~~\circ~~ \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}}}} \end{aligned} $$
(3.11)

For a ring with circumference C one obtains the One-Turn-Matrix (OTM) at s 0

$$\displaystyle \begin{aligned} \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}~+~C}}} =~~ \underbrace{{{\left( \begin{array}{cc} m_{11} &m_{12} \\ m_{21} &m_{22} \\ \end{array}\right)}}}_{{{{M}}_{OTM}}} ~~\circ~~ \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}}}} \end{aligned} $$
(3.12)

Without proof, the scalar product:

$$\displaystyle \begin{aligned} \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}}}} \cdot {{{{M}}_{OTM}}} \left( \begin{array}{c} x \\ x' \\ \end{array}\right)_{{{s_{0}}}} ~~=~~{\mathsf{const.~~=~~J}} \end{aligned} $$
(3.13)

is a constant of the motion, i.e. an invariant of the One Turn Map.

With this approach we have a strong argument: the construction of the One Turn Map is based on the properties of each element in the machine. It is entirely independent of the purpose of the machine and its global properties, and it is not restricted to rings or to circular machines in general.

Once the One Turn Map is constructed, it can be analysed, but this analysis does not depend on how it was constructed.

As a paradigm: the construction of a map (be it for a circular machine or not) and its analysis are conceptually and computationally separate undertakings.

3.5.2 Analysis of the One Turn Map

The key to the analysis is that matrices can be transformed into Normal Forms. Starting with the One-Turn-Matrix M, we try to find an (invertible) transformation A such that:

$$\displaystyle \begin{aligned} {{A}}{{{{M}}}}{{A}}^{{-1}}~=~{{{{R}}}} ~~~~~~~~~{\mathsf{(or:}}~~~~~~~~~ {{A}}^{{-1}}{{{{R}}}}{{A}}~=~{{{{M}}}}{\mathsf{)}} \end{aligned}$$
  • The matrix R is:

    • A “Normal Form”, (or at least a very simplified form of the matrix)

    • For example (most important case): R becomes a pure rotation

  • The matrix R describes the same dynamics as M, but:

    • All coordinates are transformed by A

    • This transformation A “analyses” the complexity of the motion, it contains the structure of the phase space

$$\displaystyle \begin{aligned} {{M}} = {{A}} \circ {{{{R}}}} \circ {{A}}^{-1} ~~~~{\mathrm{or:}}~~~~{{{R}}} = {{A}}^{-1} \circ {{M}} \circ {{A}} \end{aligned}$$

The motion on an ellipse becomes a motion on a circle (i.e. a rotation): R is the simple part of the map, and the shape of the ellipse is dumped into the matrix A. R can be obtained by the evaluation of the eigenvectors and eigenvalues.

One finds for the two components of the original map:

$$\displaystyle \begin{aligned} {{A}} = \left( \begin{array}{cc} \sqrt{\beta(s)} &0 \\ {} -{\textstyle{\frac{\textstyle{\alpha(s)}}{\textstyle{\sqrt{\beta(s)}}}}} &\frac{1}{\textstyle{\sqrt{\beta(s)}}} \\ {} \end{array}\right) ~~~~~{\mathsf{and}}~~~~~{{R}} = \left( \begin{array}{cc} \cos{}(\Delta\mu) &\sin{}(\Delta\mu) \\ {} -\sin{}(\Delta\mu) &\cos{}(\Delta\mu) \\ {} \end{array}\right) {} \end{aligned} $$
(3.14)

Please note that the normal form analysis gives the eigenvectors (3.14) without any physical picture related to their interpretation. The formulation using α and β is due to Courant and Snyder. Amongst other advantages it can be used to “normalise” the position x: the normalised position $x_{n}$ is the non-normalised position divided by $\sqrt{\beta}$. The variation of the normalised position $x_{n}$ is then smaller than in the non-normalised case. This is also better suited for analytical calculations, e.g. involving perturbation theory.

The Normal Form transformation together with this choice gives the required information:

  • $\mu_{x}$ is the “tune” times 2π, i.e. $\mu_{x} = Q_{x} \cdot 2\pi$ (now we can talk about phase advance!)

  • β, α, … are the optical parameters and describe the ellipse

  • The closed orbit (an invariant, identical coordinates after one turn!): $M_{OTM} \circ (x, x')_{co} \equiv (x, x')_{co}$
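The decomposition $M = A \circ R \circ A^{-1}$ with the choice (3.14) can be checked numerically: it must reproduce the familiar Courant-Snyder form of the one-turn matrix. A sketch with hypothetical values of α, β and Δμ:

```python
import numpy as np

def A_matrix(alpha, beta):
    """The transformation A of Eq. (3.14)."""
    sb = np.sqrt(beta)
    return np.array([[sb, 0.0], [-alpha / sb, 1.0 / sb]])

def rotation(mu):
    """The pure rotation R of Eq. (3.14)."""
    c, s = np.cos(mu), np.sin(mu)
    return np.array([[c, s], [-s, c]])

alpha, beta, mu = 1.2, 25.0, 2 * np.pi * 0.31   # hypothetical Twiss values
gamma = (1 + alpha**2) / beta
M = A_matrix(alpha, beta) @ rotation(mu) @ np.linalg.inv(A_matrix(alpha, beta))

# M should be the familiar Courant-Snyder one-turn matrix:
c, s = np.cos(mu), np.sin(mu)
M_cs = np.array([[c + alpha * s, beta * s], [-gamma * s, c - alpha * s]])
assert np.allclose(M, M_cs)
```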

3.5.3 Action-Angle Variables

More appropriate for studies of beam dynamics is the use of Action-Angle variables.

Once the particles “travel” on a circle, the motion is better described by the canonical variables action $J_{x}$ and angle $\Psi_{x}$:

with the following definitions, using the choice (3.14):

$$\displaystyle \begin{aligned} \begin{array}{ll} {\textstyle{x~= \sqrt{2 {{J_{x}}} \beta_{x}}~~\cos{}({{\Psi_{x}}})}}\\ {} {{p_{x}~= {\textstyle{-\sqrt{\frac{2 {{J_{x}}}}{\beta_{x}}}}}~~( \sin{}({{\Psi_{x}}}) + \alpha_{x} \cos{}({{\Psi_{x}}}))}}\\ {} {{J_{x}}}~= \frac{1}{2} (\gamma_{x} x^{2}~+~2 \alpha_{x} x p_{x}~+~\beta_{x} p_{x}^{2})\\ \end{array} \end{aligned} $$
(3.15)
  • the angular position along the ring Ψ becomes the independent variable!

  • The trajectory of a particle is now independent of the position s!

  • The constant radius of the circle \(\sqrt {2 J}\) defines the action J (invariant of motion)
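A quick round-trip check of (3.15): choose an action J and several angles Ψ (with hypothetical α and β values), compute x and $p_x$, and recover the same J from the quadratic form:

```python
import math

def xp_from_action_angle(J, psi, alpha, beta):
    """Coordinates from action-angle variables, Eq. (3.15)."""
    x = math.sqrt(2 * J * beta) * math.cos(psi)
    px = -math.sqrt(2 * J / beta) * (math.sin(psi) + alpha * math.cos(psi))
    return x, px

def action(x, px, alpha, beta):
    """Courant-Snyder invariant J = (γx² + 2αx·pₓ + βpₓ²)/2."""
    gamma = (1 + alpha**2) / beta
    return 0.5 * (gamma * x**2 + 2 * alpha * x * px + beta * px**2)

alpha, beta = -0.7, 12.0            # hypothetical Twiss values
J0 = 2.5e-6
for psi in (0.1, 1.0, 2.5, 4.0):    # J must come back for any angle
    x, px = xp_from_action_angle(J0, psi, alpha, beta)
    assert abs(action(x, px, alpha, beta) - J0) < 1e-12
```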

3.5.4 Beam Emittance

A sad and dismal story in accelerator physics is the definition of the emittance. Most foolish in this context is to relate the emittance to single particles. This is true in particular when we have a beam line which is not periodic. In that case the Courant-Snyder parameters can be determined from the beam itself. These parameters are related to the moments of the beam, e.g. the beam size is directly related to the second order moment $<x^{2}>$. Using the expressions above for the action and angle, we can write:

$$\displaystyle \begin{aligned} <x^{2}>~~=~~<{2 {{J_{x}}} \beta_{x}}\cdot \cos^{2}({{\Psi_{x}}})>~~=~~2\beta_{x} < J_{x}\cdot \cos^{2}({{\Psi_{x}}})>. {} \end{aligned} $$
(3.16)

The average of \(\cos ^{2}\) can immediately be evaluated as 0.5 and defining the emittance as:

$$\displaystyle \begin{aligned} \epsilon_{x}~~=~~<J_{x}>, {} \end{aligned} $$
(3.17)

we write

$$\displaystyle \begin{aligned} <x^{2}>~~=~~\beta_{x}\cdot \epsilon_{x}. {} \end{aligned} $$
(3.18)

Using a similar procedure (details and derivation in e.g. [3], and to a much lesser extent in [1]) one can determine the moments

$$\displaystyle \begin{aligned} <p_{x}^{2}>~~=~~\gamma_{x}\cdot \epsilon_{x}, {} \end{aligned} $$
(3.19)

and

$$\displaystyle \begin{aligned} <x\cdot p_{x}>~~=~~-\alpha_{x}\cdot \epsilon_{x}. {} \end{aligned} $$
(3.20)

Using these expressions, the emittance becomes readily

$$\displaystyle \begin{aligned} \epsilon_{x}~~=~~\sqrt{<x^{2}> <p_{x}^{2}>~~-~~<x\cdot p_{x}>^{2}} {} \end{aligned} $$
(3.21)

Therefore, once the second order moments are measured, the emittance follows from Eq. (3.21) and the Courant-Snyder parameters are determined by Eqs. (3.18), (3.19), and (3.20).
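Equation (3.21) can be tried on a toy distribution: N particles sharing the same action J, equally spaced in angle, give $<J> = J$, so the statistical emittance must come out exactly as J. The Twiss values and J below are hypothetical:

```python
import math

def rms_emittance(xs, pxs):
    """Statistical emittance from second moments, Eq. (3.21)
    (particles assumed centred, <x> = <px> = 0)."""
    n = len(xs)
    x2 = sum(x * x for x in xs) / n
    p2 = sum(p * p for p in pxs) / n
    xp = sum(x * p for x, p in zip(xs, pxs)) / n
    return math.sqrt(x2 * p2 - xp * xp)

# Toy beam: N particles with the same action J, equally spaced in angle.
alpha, beta, J = 0.5, 8.0, 1.0e-6   # hypothetical values
N = 360
xs, pxs = [], []
for k in range(N):
    psi = 2 * math.pi * k / N
    xs.append(math.sqrt(2 * J * beta) * math.cos(psi))
    pxs.append(-math.sqrt(2 * J / beta) * (math.sin(psi) + alpha * math.cos(psi)))

# For this distribution <J> = J, so the rms emittance equals J:
assert abs(rms_emittance(xs, pxs) - J) < 1e-9
```

For the same toy beam one can also verify Eq. (3.18): the second moment $<x^2>$ comes out as βJ.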

Since other definitions often refer to the treatment by Courant and Snyder, here a quote from Courant himself in [9]:

Interlude 1

The invariant J is simply related to the area enclosed by the ellipse:

$$\displaystyle \begin{aligned} {\mathsf{Area~enclosed}}~~~=~~2\pi J. {} \end{aligned} $$
(3.22)

In accelerator and storage ring terminology there is a quantity called the emittance which is closely related to this invariant. The emittance, however, is a property of a distribution of particles, not a single particle. Consider a Gaussian distribution in amplitudes. Then the (rms) emittance, 𝜖, is given by:

$$\displaystyle \begin{aligned} (x_{rms})^{2}~~=~~\beta_{x}(s)\cdot \epsilon_{x}. {} \end{aligned} $$
(3.23)

In terms of the action variable, J, this can be rewritten

$$\displaystyle \begin{aligned} \epsilon_{x}~~=~~<J>. {} \end{aligned} $$
(3.24)

where the bracket indicates an average over the distribution in J.

Other definitions, based on handwaving arguments or only approximately valid in special cases, should be discarded, in particular those relying on presumed distributions, e.g. Gaussian.

3.6 Techniques and Tools to Evaluate and Correct Non-linear Effects

The key to the more modern approach shown in this section is to avoid prejudices about the stability and other properties of the ring. Instead, we must describe the machine in terms of the objects it consists of with all their properties, including the non-linear elements. The analysis will then reveal the properties of the particles, e.g. their stability. In the simplest case, the ring is made of individual machine elements such as magnets which have an existence of their own, i.e. the interaction of a particle with a given element is independent of the motion in the rest of the machine. Also for the study of non-linear effects, the description of elements should be independent of concepts such as tune, chromaticity and closed orbit. To successfully study single particle dynamics, one must be able to describe both the machine elements themselves and their action on the particle.

3.6.1 Particle Tracking

The ring being a collection of maps, a particle tracking code, i.e. an integrator of the equations of motion, provides the most reliable map for the analysis of the machine. Of course, this requires an appropriate description of the non-linear maps in the code. It is not the purpose of this article to describe the details of tracking codes and the underlying philosophy; such details can be found in the literature (see e.g. [6]). Here we review and demonstrate the basic principles and analysis techniques.

3.6.1.1 Symplecticity

If we define a map through \({\vec {z_{2}}}~=~{{M}}_{12}(\vec {z_{1}})\) as a propagator from a location “1” to a location “2” in the ring, we have to consider that not all possible maps are allowed. The required property of the map is called “symplecticity” and in the simplest case where M 12 is a matrix, the symplecticity condition can be written as:

$$\displaystyle \begin{aligned} {{M}} \Rightarrow {{M}}^{T} \cdot {S} \cdot {{M}} = {S} {~~~~\mathrm{where}~~~~} {S} = \left( \begin{array}{cccc} 0 &1 &0 &0 \\ -1 &0 &0 &0 \\ 0 &0 &0 &1 \\ 0 &0 &-1 &0 \\ \end{array}\right) \end{aligned} $$
(3.25)

The physical meaning of this condition is that the map is area preserving in the phase space. The condition can easily be derived from a Hamiltonian treatment, closely related to Liouville’s theorem.
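The condition (3.25) is easy to test numerically. The sketch below checks it for a thin-quadrupole matrix (with a made-up strength) and shows that a non-Hamiltonian, damped map fails it:

```python
import numpy as np

# The matrix S of Eq. (3.25):
S = np.array([[0, 1, 0, 0],
              [-1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, -1, 0]], dtype=float)

def is_symplectic(M, tol=1e-12):
    """Check the condition M^T S M = S of Eq. (3.25)."""
    return np.allclose(M.T @ S @ M, S, atol=tol)

# 4D map of a thin quadrupole (focusing in x, defocusing in y), kL hypothetical:
kL = 0.3
M_quad = np.eye(4)
M_quad[1, 0] = -kL
M_quad[3, 2] = +kL
assert is_symplectic(M_quad)

# A (non-physical) uniformly damped map violates the condition:
assert not is_symplectic(0.99 * np.eye(4))
```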

3.6.2 Approximations and Tools

The concept of symplecticity is vital for the treatment of Hamiltonian systems. This is true in particular when the stability of a system is investigated using particle tracking. However, in practice it is difficult to accomplish for a given exact problem: as an example, we may have the exact fields and potentials of the electromagnetic elements, but no exactly symplectic integrator for them. For a single pass system a (slightly) non-symplectic integrator may be sufficient, but for an iterated system the results are meaningless.

To track particles using the exact model may result in a non-symplectic tracking, i.e. the underlying model is correct, but the resulting physics is wrong.

It is much better to approximate the model to the extent that the tracking is symplectic. One might compromise on the exactness of the final result, but the correct physics is ensured.

As a typical example, one might observe possibly chaotic motion during the tracking procedure. However, there is always a non-negligible probability that this interpretation of the results is wrong. To conclude that it is not a consequence of the non-symplecticity of the procedure or a numerical artifact, it is necessary to identify the physical mechanism leading to this observation.

This may not be possible using the exact model as input to a (possibly) non-symplectic procedure. Introducing approximations into the definition of the problem should reveal the correct physics at the expense of a (hopefully) small error. Staying exact, the physics may be wrong.

As a result, care must be taken to positively identify the underlying process.

This procedure should be based on approximations as close as possible to the exact problem, but allowing a symplectic evaluation.

An example for this will be shown in Sect. 3.6.3.4.

3.6.3 Taylor and Power Maps

A non-linear element cannot be represented in the form of a linear matrix and more complicated maps have to be introduced [5]. In principle, any well behaved, non-linear function can be developed as a Taylor series. This expansion can be truncated at the desired precision.

Another option is the representation as Lie transformations [8, 10]. Both types are discussed in this section.

3.6.3.1 Taylor Maps

A Taylor map can be written using higher order matrices and in the case of two dimensions we have:

$$\displaystyle \begin{aligned} z_{j}(s_{2})~=~\sum_{k=1}^{4} {{R_{jk}}} z_{k}(s_{1}) ~+\sum_{k=1}^{4}\sum_{l=1}^{4} {{T_{jkl}}} z_{k}(s_{1})z_{l}(s_{1}) \end{aligned} $$
(3.26)

(where $z_{j}$, j = 1, …, 4, stand for x, x′, y, y′). Let us call the collection $A_{2} = (R, T)$ the second order map. Higher orders can be defined as needed, e.g. for the 3rd order map $A_{3} = (R, T, U)$ we add the third order term:

$$\displaystyle \begin{aligned} +~~~\sum_{k=1}^{4}~\sum_{l=1}^{4}~\sum_{m=1}^{4}~ {{U_{jklm}}} z_{k}(s_{1})z_{l}(s_{1})z_{m}(s_{1}) \end{aligned} $$
(3.27)

Since Taylor expansions are not matrices, to provide a symplectic map, it is the associated Jacobian matrix J which must fulfill the symplecticity condition:

$$\displaystyle \begin{aligned} {{{J}}_{ik}}~=~\frac{\partial z_{i}(s_{2})}{\partial z_{k}(s_{1})}~~~~{\mathrm{and}}~~~~{{{{{J}}}}}~~~{\mathrm{must~fulfill:}}~~~{{{{J}}^{t}}} \cdot {{S}} \cdot {{J}} = {{S}} \end{aligned} $$
(3.28)

However, in general $J_{ik} \neq$ const and for a truncated Taylor map it can be difficult to fulfill this condition for all z. As a consequence, the number of independent coefficients in the Taylor expansion is reduced, and a complete, symplectic Taylor map requires more coefficients than a truncation retains [7].

The explicit map for a sextupole is:

$$\displaystyle \begin{aligned} \begin{array}{lll} x_{2} &= {\textstyle{x_{1} + L x_{1}^{\prime}}} &-~k_{2} \left(\frac{L^{2}}{4}(x_{1}^{2} - y_{1}^{2}) + \frac{L^{3}}{12}(x_{1}x_{1}^{\prime} - y_{1}y_{1}^{\prime}) + \frac{L^{4}}{24}(x_{1}^{\prime 2} - y_{1}^{\prime 2}) \right)\\ {} x_{2}^{\prime} &= x_{1}^{\prime} &-~k_{2} \left(\frac{L}{2}(x_{1}^{2} - y_{1}^{2}) + \frac{L^{2}}{4}(x_{1}x_{1}^{\prime} - y_{1}y_{1}^{\prime}) + \frac{L^{3}}{6}(x_{1}^{\prime 2} - y_{1}^{\prime 2}) \right)\\ {} y_{2} &= y_{1} + L y_{1}^{\prime} &+~k_{2} \left(\frac{L^{2}}{4}x_{1}y_{1} + \frac{L^{3}}{12}(x_{1}y_{1}^{\prime} + y_{1}x_{1}^{\prime}) + \frac{L^{4}}{24}(x_{1}^{\prime}y_{1}^{\prime}) \right)\\ {} y_{2}^{\prime} &= y_{1}^{\prime} &+~k_{2} \left(\frac{L}{2}x_{1}y_{1} + \frac{L^{2}}{4}(x_{1}y_{1}^{\prime} + y_{1}x_{1}^{\prime}) + \frac{L^{3}}{6}(x_{1}^{\prime}y_{1}^{\prime}) \right)\\ \end{array} \end{aligned} $$
(3.29)

Writing the explicit form of the Jacobian matrix:

$$\displaystyle \begin{aligned} {{ {{{J}}_{ik}}~=~ \left( \begin{array}{cccc} \frac{\partial x_{2}}{\partial x_{1}} &\frac{\partial x_{2}}{\partial x^{\prime}_{1}} &\frac{\partial x_{2}}{\partial y_{1}} &\frac{\partial x_{2}}{\partial y^{\prime}_{1}}\\ {} \frac{\partial x^{\prime}_{2}}{\partial x_{1}} &\frac{\partial x^{\prime}_{2}}{\partial x^{\prime}_{1}} &\frac{\partial x^{\prime}_{2}}{\partial y_{1}} &\frac{\partial x^{\prime}_{2}}{\partial y^{\prime}_{1}}\\ {} \frac{\partial y_{2}}{\partial x_{1}} &\frac{\partial y_{2}}{\partial x^{\prime}_{1}} &{{\frac{\partial y_{2}}{\partial y_{1}}}} &\frac{\partial y_{2}}{\partial y^{\prime}_{1}}\\ {} \frac{\partial y^{\prime}_{2}}{\partial x_{1}} &\frac{\partial y^{\prime}_{2}}{\partial x^{\prime}_{1}} &\frac{\partial y^{\prime}_{2}}{\partial y_{1}} &\frac{\partial y^{\prime}_{2}}{\partial y^{\prime}_{1}}\\ {} \end{array}\right) {{~~~~~{{\rightarrow {k_{2} = 0}}}~~~~~ \left( \begin{array}{cccc} 1 &L &0 &0\\ {} 0 &1 &0 &0\\ {} 0 &0 &{{1}} &L\\ {} 0 &0 &0 &1\\ {} \end{array}\right)}} }} \end{aligned} $$
(3.30)

For k 2  ≠  0 coefficients depend on initial values, e.g.:

$$\displaystyle \begin{aligned} {\frac{\partial y_{2}}{\partial y_{1}}}~=~1+k_{2}\left(\frac{L^{2}}{4}x_{1}+\frac{L^{3}}{12}x_{1}^{\prime} \right)~~\rightarrow~~ {\mathsf{Truncated~power~series~are~not~symplectic~and~cannot~be~used~directly}} \end{aligned}$$

Symplecticity is recovered in the case of elements with L = 0. The non-symplecticity becomes small (probably small enough) when the length is small.

As a result, the model is only slightly approximated, but the symplecticity (and therefore the physics) is ensured. An exact model with a compromised integration can fabricate non-existent features and conceal important underlying physics.
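This loss of symplecticity can be checked numerically. The following Python sketch (function names and numbers are ours, not from the text) implements the truncated map (3.29) and its thin-lens limit, and compares the Jacobian determinants; a symplectic map must have det J = 1. The thin kick uses the L → 0 limit of (3.29) with the integrated strength k₂L kept fixed.

```python
def truncated_sextupole(z, k2, L):
    """Truncated thick-sextupole map (3.29); z = (x, x', y, y')."""
    x, xp, y, yp = z
    return [
        x + L*xp - k2*(L**2/4*(x*x - y*y) + L**3/12*(x*xp - y*yp) + L**4/24*(xp*xp - yp*yp)),
        xp       - k2*(L/2*(x*x - y*y)   + L**2/4*(x*xp - y*yp)  + L**3/6*(xp*xp - yp*yp)),
        y + L*yp + k2*(L**2/4*x*y + L**3/12*(x*yp + y*xp) + L**4/24*xp*yp),
        yp       + k2*(L/2*x*y    + L**2/4*(x*yp + y*xp)  + L**3/6*xp*yp),
    ]

def thin_kick(z, k2L):
    """L -> 0 limit of the map above, keeping the integrated strength k2*L."""
    x, xp, y, yp = z
    return [x, xp - k2L/2*(x*x - y*y), y, yp + k2L/2*x*y]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    d, n = 1.0, len(M)
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        if piv != i:
            M[i], M[piv] = M[piv], M[i]
            d = -d
        d *= M[i][i]
        for r in range(i + 1, n):
            fac = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= fac * M[i][c]
    return d

def jac_det(m, z, eps=1e-6):
    # central differences are exact here since the map is quadratic in z
    cols = []
    for k in range(4):
        zp, zm = list(z), list(z)
        zp[k] += eps
        zm[k] -= eps
        fp, fm = m(zp), m(zm)
        cols.append([(fp[i] - fm[i]) / (2*eps) for i in range(4)])
    return det([[cols[k][i] for k in range(4)] for i in range(4)])

z = [0.01, 0.005, 0.02, -0.003]
d_thick = jac_det(lambda w: truncated_sextupole(w, 10.0, 1.0), z)
d_thin  = jac_det(lambda w: thin_kick(w, 10.0), z)
```

The thick, truncated map has det J ≠ 1 (it depends on the initial coordinates), while the thin kick preserves det J = 1 exactly.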

The situation is rather different in the case of single pass machines. The long term stability (and therefore symplecticity) is not an issue and the Taylor expansion around the closed orbit is what is really needed. Techniques like the one described in Sect. 3.7.6 provide exactly this in an advanced and flexible formalism.

3.6.3.2 Thick and Thin Lenses

All elements in a ring have a finite length and therefore should be treated as “thick lenses”. However, in general a solution for the motion in a thick element does not exist. It has become a standard technique to avoid using approximate formulae to track through thick lenses and rather perform exact tracking through thin lenses. This approximation is improved by breaking the thick element into several thin elements which is equivalent to a numerical integration. A major advantage of this technique is that “thin lens tracking” is automatically symplectic. In this context it becomes important to understand the implied approximations and how they influence the desired results. We proceed by an analysis of these approximations and show how “symplectic integration” techniques can be applied to this problem.

We demonstrate the approximation using a quadrupole. Although an exact solution of the motion through a quadrupole exists, it is a useful demonstration since it can be shown that all concepts developed here apply also to arbitrary non-linear elements.

Let us assume the transfer map (matrix) for a thick, linearized quadrupole of length L and strength K:

$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= {{ \left( \begin{array}{cc} {\mathrm{cos}}(L\cdot \sqrt{K}) &\frac{1}{\sqrt{K}}\cdot {\mathrm{sin}}(L\cdot {\sqrt{K}}) \\ -\sqrt{K}\cdot {\mathrm{sin}}(L\cdot {\sqrt{K}}) &{\mathrm{cos}}(L\cdot {\sqrt{K}}) \\ \end{array}\right) }} \end{aligned} $$
(3.31)

This map is exact and can be expanded as a Taylor series for a “small” length L:

$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= {{L^{0}}} \cdot \left( \begin{array}{cc} 1 &0\\ 0 &1\\ \end{array}\right) + {{L^{1}}} \cdot \left( \begin{array}{cc} 0 &1\\ -K &0\\ \end{array}\right) + {{L^{2}}} \cdot \left( \begin{array}{cc} -\frac{1}{2}{K} &0\\ 0 &-\frac{1}{2}{K}\\ \end{array}\right) + \ldots {} \end{aligned} $$
(3.32)

If we keep only terms up to first order in L we get:

$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= {{L^{0}}} \cdot \left( \begin{array}{cc} 1 &0\\ 0 &1\\ \end{array}\right) + {{L^{1}}} \cdot \left( \begin{array}{cc} 0 &1\\ -K &0\\ \end{array}\right) + {{O}}(L^{2}) \end{aligned} $$
(3.33)
$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= {{ \left( \begin{array}{cc} 1 &L \\ -K\cdot L &1 \\ \end{array}\right) + {{O}}(L^{2}) }} {} \end{aligned} $$
(3.34)

This map is precise to order O(L 1), but since we have det M ≠ 1, this truncated expansion is not symplectic.

3.6.3.3 Symplectic Matrices and Symplectic Integration

However, the map (3.34) can be made symplectic by adding a term −KL 2 in the lower right element. This term is of order O(L 2), i.e. it does not deteriorate the approximation because the inaccuracy is already of that order.

$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= { \left( \begin{array}{cc} 1 &L \\ -K\cdot L &1 \mathbf{{-KL^{2}}} \\ \end{array}\right) } {} \end{aligned} $$
(3.35)

Following the same procedure we can compute a symplectic approximation precise to order O(L 2) from (3.32) using:

$$\displaystyle \begin{aligned} {{M}}_{s \rightarrow s + L}= { \left( \begin{array}{cc} 1 - \frac{1}{2}KL^{2} &L \\ -K\cdot L &1 - \frac{1}{2}KL^{2} \\ \end{array}\right) ~~\Rightarrow~~ \left( \begin{array}{cc} 1 - \frac{1}{2}KL^{2} &L \mathbf{{- \frac{1}{4}KL^{3}}} \\ -K\cdot L &1 -\frac{1}{2}KL^{2} \\ \end{array}\right) } \end{aligned} $$
(3.36)

It can be shown that this “symplectification” corresponds to the approximation of a quadrupole by a single kick in the centre between two drift spaces of length L∕2:

$$\displaystyle \begin{aligned} \left( \begin{array}{cc} 1 &\frac{1}{2}L \\ 0 &1 \\ \end{array}\right) \left( \begin{array}{cc} 1 &0 \\ -K\cdot L &1 \\ \end{array}\right) \left( \begin{array}{cc} 1 &\frac{1}{2}L \\ 0 &1 \\ \end{array}\right) = \left( \begin{array}{cc} 1 - \frac{1}{2}KL^{2} &L - \frac{1}{4}KL^{3} \\ -K\cdot L &1 -\frac{1}{2}KL^{2} \\ \end{array}\right) \end{aligned} $$
(3.37)

It may be mentioned that the previous approximation to 1st order corresponds to a kick at the end of a quadrupole, preceded by a drift space of length L. Both cases are illustrated in Fig. 3.1.
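The matrix identity (3.37) is easy to verify numerically. A small Python sketch (the helper `mmul` and the values K = 0.8, L = 0.1 are our arbitrary choices) multiplies the three matrices and compares the result with the closed form and with the exact map (3.31):

```python
import math

def mmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

K, L = 0.8, 0.1

drift = [[1.0, L/2], [0.0, 1.0]]       # drift of length L/2
kick  = [[1.0, 0.0], [-K*L, 1.0]]      # thin kick of integrated strength K*L
M = mmul(drift, mmul(kick, drift))     # drift-kick-drift, left side of (3.37)

closed = [[1 - K*L**2/2, L - K*L**3/4],
          [-K*L, 1 - K*L**2/2]]        # right hand side of (3.37)

w = math.sqrt(K)*L
exact = [[math.cos(w), math.sin(w)/math.sqrt(K)],
         [-math.sqrt(K)*math.sin(w), math.cos(w)]]   # exact map (3.31)

det_M = M[0][0]*M[1][1] - M[0][1]*M[1][0]
err_exact = max(abs(M[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The product reproduces (3.37) to machine precision, has determinant 1, and differs from the exact map only at higher order in L.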

Fig. 3.1

Schematic representation of a symplectic kick of first order (left) and second order (right)

One can try to further improve the approximation by adding 3 kicks like in Fig. 3.2 where the distance between kicks and the kick strengths are optimized to obtain the highest order. The thin lens approximation in Fig. 3.2 with the constants:

$$\displaystyle \begin{aligned} a~\approx~0.675602, b~\approx~-0.175602, \alpha~\approx~1.351204, \beta~\approx~-1.702410 {} \end{aligned} $$
(3.38)

provides an O(L 4) integrator [11].

Fig. 3.2

Schematic representation of a symplectic integration with thin lenses of fourth order. The figure shows the size of drifts and thin lens kicks

This process is a Symplectic Integration [12] and is a formal procedure to construct higher order integrators from lower order ones. From a 2nd order scheme (1 kick) S 2(t) we construct a 4th order scheme (3 kicks = 3 × 1 kick) like: S 4(t) = S 2(x 1 t) ∘ S 2(x 0 t) ∘ S 2(x 1 t) with:

$$\displaystyle \begin{aligned} x_{0}~=~\frac{-2^{1/3}}{2 - 2^{1/3}}~\approx~-1.702410~~~~~x_{1}~=~\frac{1}{2 - 2^{1/3}}~\approx~1.351204 \end{aligned} $$
(3.39)

In general: If S 2k(t) is a symmetric integrator of order 2k, then we obtain a symmetric integrator of order 2k + 2 by: S 2k+2(t) = S 2k(x 1 t) ∘ S 2k(x 0 t) ∘ S 2k(x 1 t) with:

$$\displaystyle \begin{aligned} x_{0}~=~\frac{-\sqrt[2k+1]{2}}{2 - \sqrt[2k+1]{2}}~~~~~~x_{1}~=~\frac{1}{2 - \sqrt[2k+1]{2}} \end{aligned} $$
(3.40)

Higher order integrators can be obtained in a similar way in an iterative procedure. A very explicit example of the iterative construction of a higher order map from a lower order can be found in [7].
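The recursion (3.39)–(3.40) can be exercised numerically. The sketch below (our own code, with an arbitrary quadrupole strength) builds S 4 from the drift-kick-drift step S 2 and checks that the single-step error with respect to the exact matrix (3.31) drops by roughly 2⁵ = 32 when the step is halved, as expected for a locally O(L⁵) accurate, globally 4th order integrator:

```python
import math

def mmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def s2(K, t):
    """2nd order drift-kick-drift step for a quadrupole of strength K, Eq. (3.37)."""
    h = [[1.0, t/2], [0.0, 1.0]]
    kick = [[1.0, 0.0], [-K*t, 1.0]]
    return mmul(h, mmul(kick, h))

def s4(K, t):
    """Yoshida composition S4(t) = S2(x1 t) S2(x0 t) S2(x1 t), Eq. (3.39)."""
    x1 = 1.0/(2 - 2**(1/3))
    x0 = -2**(1/3)/(2 - 2**(1/3))
    return mmul(s2(K, x1*t), mmul(s2(K, x0*t), s2(K, x1*t)))

def exact(K, t):
    w = math.sqrt(K)*t
    return [[math.cos(w), math.sin(w)/math.sqrt(K)],
            [-math.sqrt(K)*math.sin(w), math.cos(w)]]

def err(K, t):
    A, B = s4(K, t), exact(K, t)
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

K = 1.0
ratio = err(K, 0.05) / err(K, 0.025)    # expect roughly 2**5 = 32

S = s4(K, 0.05)
det_S4 = S[0][0]*S[1][1] - S[0][1]*S[1][0]
```

The composition stays exactly symplectic (det = 1) regardless of the step size; only the accuracy changes.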

This method can be applied to any other non-linear map and we obtain the same integrators. The proof of this statement and the systematic extension can be done in the form of Lie operators [12].

It should be noted that higher order integrators require maps which drift backwards, see the negative coefficients in (3.38) and Fig. 3.2 (right). This has two profound consequences. First, a straightforward “physical” interpretation of thin lens models representing drifts and individual small “magnets” (à la MAD) makes no sense and prohibits the use of high order integrators. Secondly, models which require self-consistent time tracking or s tracking (e.g. space charge calculations) must use integrators for which s(t) is monotonic in the magnets.

3.6.3.4 Comparison Symplectic Versus Non-symplectic Integration

A demonstration of non-symplectic tracking is shown in Fig. 3.3. A particle is tracked through a quadrupole and the Poincaré section is shown. A quadrupole is chosen because it allows a comparison with the exact solution. The non-symplecticity causes the particle to spiral outwards; the exact tracking is shown for comparison. In Fig. 3.3 (right) the symplectic integrators of order 1 and 2 as derived above are used instead. The trajectory is now bounded and the difference to the exact solution is small. Although the model is approximate but symplectic, the underlying physics (i.e. constant energy in this case) is correct, at the expense of a small discrepancy with respect to the exact solution.
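The behaviour in Fig. 3.3 can be reproduced with a few lines of Python (strengths, turn numbers and starting point are our arbitrary choices): iterating the truncated map (3.34) inflates the amplitude, while iterating the symplectified map (3.35) keeps it bounded:

```python
K, L = 1.0, 0.1

taylor = [[1.0, L], [-K*L, 1.0]]             # truncated map (3.34): det = 1 + K L^2
sympl  = [[1.0, L], [-K*L, 1.0 - K*L**2]]    # symplectified map (3.35): det = 1

def max_radius(M, z0, turns):
    """Track and record the largest phase-space radius reached."""
    x, p = z0
    r = 0.0
    for _ in range(turns):
        x, p = M[0][0]*x + M[0][1]*p, M[1][0]*x + M[1][1]*p
        r = max(r, (x*x + p*p)**0.5)
    return r

r_taylor = max_radius(taylor, (0.01, 0.0), 2000)   # spirals outwards
r_sympl  = max_radius(sympl,  (0.01, 0.0), 2000)   # stays on a closed curve
```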

Fig. 3.3

Poincaré section for tracking through a quadrupole. Comparison between exact solution, non-symplectic (left) and symplectic (right) tracking. Shown are symplectic integrators of order 1 and 2

3.7 Hamiltonian Treatment of Electro-Magnetic Fields

A frequently asked question is why one should not just use Newton’s laws and the Lorentz force. Some of the main reasons are:

  • Newton requires rectangular coordinates and time; trajectories with e.g. “curvature” or “torsion” need the introduction of “reaction forces”. (For example: the LHC has locally non-planar (cork-screw) “design” orbits!).

  • For linear dynamics this is done by an ad hoc introduction of a new coordinate frame.

  • With a Hamiltonian this comes for free: the formalism is “coordinate invariant”, i.e. the equations have the same form in every coordinate system.

  • The basic equations ensure that the phase space volume is conserved

3.7.1 Lagrangian of Electro-Magnetic Fields

3.7.1.1 Lagrangian and Hamiltonian

It is common practice to use q for the coordinates when Hamiltonian and Lagrangian formalisms are used. This is deplorable because q is also used for particle charge.

The motion of a particle is usually described in classical mechanics using the Lagrange function:

$$\displaystyle \begin{aligned} L(~q_{1}(t),\ldots q_{n}(t),~ \dot{q_{1}}(t),\ldots \dot{q_{n}}(t), t~)~~~~ ~{\mathsf{short:}}~~~~~L(q_{i},\dot{q_{i}},t) {} \end{aligned} $$
(3.41)

where q 1(t), …q n(t) are generalized coordinates and \(\dot {q_{1}}(t),\ldots \dot {q_{n}}(t)\) the corresponding generalized velocities. Here q i can stand for any coordinate and any particle, and n can be a very large number.

The integral

$$\displaystyle \begin{aligned} S = \int L(~q_{i}(t), \dot{q_{i}}(t), t~)~{\mathrm{d}}t. {} \end{aligned} $$
(3.42)

defines the action S.

The action S is used with Hamilton's principle: a system moves along a path such that the action S becomes stationary, i.e. δS  =  0.

This is fulfilled when:

$$\displaystyle \begin{aligned} \frac{d}{dt}\frac{\partial L}{\partial \dot{q_{i}}} - \frac{\partial L}{\partial q_{i}} = 0~~~{\mathsf{(Euler~-~Lagrange~equation)}} {} \end{aligned} $$
(3.43)

It is unfortunate that the term action is used in different contexts and must not be confused with the action-angle variables defined earlier. The action above is a functional rather than a variable.

Without proof or derivation it should be stated that L = T − V = kinetic energy −potential energy.

Given the Lagrangian, the Hamiltonian can be derived as:

$$\displaystyle \begin{aligned} H(\vec{q}, \vec{p}, t)~=~\sum_{i}[p_{i} \dot{q}_{i} ~-~L(\vec{q}, \vec{\dot{q}}, t)]. {} \end{aligned} $$
(3.44)

The coordinates q i are identical to those in the Lagrangian (3.41), whereas the conjugate momenta p i are derived from L as:

$$\displaystyle \begin{aligned} p_{i}~~=~~\frac{\partial{L}}{\partial{\dot{q_{i}}}}. {} \end{aligned} $$
(3.45)
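The Legendre transform (3.44)–(3.45) can be tried out with a computer algebra system. A minimal sketch, assuming sympy is available; the harmonic oscillator is our illustrative choice, not taken from the text:

```python
import sympy as sp

q, qd, p, m, k = sp.symbols('q qdot p m k', positive=True)

L = m*qd**2/2 - k*q**2/2               # L = T - V for a harmonic oscillator
p_expr = sp.diff(L, qd)                 # conjugate momentum, Eq. (3.45)
qd_sol = sp.solve(sp.Eq(p, p_expr), qd)[0]   # invert p = dL/dqdot
H = sp.simplify((p*qd - L).subs(qd, qd_sol)) # Legendre transform, Eq. (3.44)
```

The result is the familiar H = p²/(2m) + kq²/2, expressed in the conjugate pair (q, p) rather than (q, q̇).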

3.7.2 Hamiltonian with Electro-Magnetic Fields

Readers only interested in the final result can skip Eqs. (3.46)–(3.54).

A key for the correct Hamiltonian is the relativistic treatment. An intuitive derivation is presented here; a simpler and more elegant derivation can be based on 4-vectors [13]. The action S must be a relativistic invariant and becomes (now using coordinates x and velocities v):

$$\displaystyle \begin{aligned} S = \int L(~x_{i}(t), {v_{i}}(t), t~)~\gamma\cdot{\mathrm{d}}\tau. {} \end{aligned} $$
(3.46)

since the proper time τ is Lorentz invariant, and therefore also γ ⋅ L.

The Lagrangian for a free particle is usually a function of the velocity (see the classical formula for the kinetic term), but cannot depend on its position.

The only Lorentz invariant with the velocity is [13]:

$$\displaystyle \begin{aligned} U^{\mu}U_{\mu}~~=~~c^{2} {} \end{aligned} $$
(3.47)

where U is the four-velocity.

For the Lagrangian of a (relativistic) free particle we must write

$$\displaystyle \begin{aligned} L_{free}~=~-mc^{2}\sqrt{1~-~\beta_{r}^{2}}~=~-mc^{2}\sqrt{1~-~(\frac{v}{c})^{2}}~=~-\frac{m c^{2}}{\gamma} {} \end{aligned} $$
(3.48)

Using for the electromagnetic Lagrangian a form (without derivation, any textbook):

$$\displaystyle \begin{aligned} L~=~\frac{e}{c}~\vec{v}\cdot\vec{A}~-~e\phi {} \end{aligned} $$
(3.49)

Combining (3.48) and (3.49) we obtain the complete Lagrangian:

$$\displaystyle \begin{aligned} L~=~-\frac{mc^{2}}{\gamma}~+~\frac{e}{c}\cdot\vec{v}\cdot\vec{A}~-~e\cdot\phi {} \end{aligned} $$
(3.50)

thus the conjugate momentum is derived as:

$$\displaystyle \begin{aligned} \vec{P}~~=~~\frac{\partial{L}}{\partial{{v_{i}}}}~~=~~\vec{p}~+~\frac{e}{c} \vec{A} ~~~~~~~~~~\left ( {\mathsf{or}}~~~~~~\vec{P}~~=~~\vec{p}~-~\frac{q}{c} \vec{A} \right) {} \end{aligned} $$
(3.51)

where \(\vec {p}\) is the ordinary kinetic momentum.

A consequence is that the canonical momentum cannot be written as:

$$\displaystyle \begin{aligned} P_{x}~=~m c \gamma \beta_{x} {} \end{aligned} $$
(3.52)

Using the conjugate momentum the Hamiltonian takes the simple form:

$$\displaystyle \begin{aligned} H~~=~~\vec{P}\cdot\vec{v}~-~L {} \end{aligned} $$
(3.53)

The Hamiltonian must be a function of the conjugate variables P and x and after a bit of algebra one can eliminate \(\vec {v}\) using:

$$\displaystyle \begin{aligned} \vec{v}~~=~~\frac{c\vec{P}~-~e\vec{A}}{\sqrt{(\vec{P}~-~\frac{e\vec{A}}{c})^{2}~+~m^{2}c^{2}}} {} \end{aligned} $$
(3.54)

With (3.50) and (3.54) the Hamiltonian for a relativistic particle in an electro-magnetic field is given by:

$$\displaystyle \begin{aligned} {{H}}(\vec{x},\vec{P}, t) = c \sqrt{(\vec{P} - e \vec{A}(\vec{x}, t) )^{2} + m^{2}c^{2}} + e \Phi(\vec{x}, t)~~ {} \end{aligned} $$
(3.55)

where \(\vec {A}(\vec {x}, t)\), \(\Phi (\vec {x}, t)\) are the vector and scalar potentials.

Interlude 2

A short interlude; one may want to skip to Eq. (3.60)

Equation (3.55) is the total energy E of the particle, where the difference to the free particle is the potential energy and the new conjugate momentum \(\vec {P}~=~(\vec {p}~+~\frac {e}{c}\vec {A})\), replacing \(\vec {p}\).

From the classical expression

$$\displaystyle \begin{aligned} E^{2}~~=~~p^{2}c^{2}~~+~~(m c^{2})^{2} \end{aligned} $$
(3.56)

one can re-write

$$\displaystyle \begin{aligned} (W~-~e\phi)^{2}~~-~~(c\vec{P}~-~e\vec{A})^{2}~~=~~(m c^{2})^{2} \end{aligned} $$
(3.57)

The expression (mc 2)2 is the invariant mass squared [13], i.e.

$$\displaystyle \begin{aligned} p_{\mu}p^{\mu}~~=~~(m c)^{2} {} \end{aligned} $$
(3.58)

with the 4-vector for the momentum [13]:

$$\displaystyle \begin{aligned} p^{\mu}~~=~~(\frac{E}{c}, \vec{p})~~=~~\left( \frac{1}{c}(W~-~e\phi),~~ \vec{P}~-~(\frac{e}{c}\vec{A}) \right) {} \end{aligned} $$
(3.59)

The changes are a consequence of using 4-vectors in the presence of electromagnetic fields (potentials).

An interesting consequence of (3.51) is that the momentum is linked to the fields (\(\vec {A}\)) and the angle x′ cannot easily be derived from the total momentum and the conjugate momentum. That is, using (x, x′) as coordinates is strictly speaking not valid in the presence of electromagnetic fields.

In this context using (x, x′) or (x, p x) is not equivalent, and a general, unqualified statement that (x, x′) is used in accelerator physics is at best misleading.

3.7.3 Hamiltonian Used for Accelerator Physics

In a more convenient (and useful) form, using canonical variables x and p x, p y and the design path length s as independent variable (bending field B 0 in y-plane) and no electric fields (for details of the derivation see [14]):

$$\displaystyle \begin{aligned} {{H}}~=~{\overbrace{-\left(1 + \frac{x}{\rho}\right)}^{\mathsf{due~to~t~\rightarrow~s}}} \cdot {\overbrace{\sqrt{(1 + \delta)^{2} - p_{x}^{2} - p_{y}^{2}}}^{kinematic}} + {\overbrace{\frac{x}{\rho} + \frac{x^{2}}{2\rho^{2}}}^{{\mathsf{due~to~t}~\rightarrow~s}}} - {\overbrace{\frac{A_{s}(x, y)}{B_{0} \rho}}^{normalized}} {} \end{aligned} $$
(3.60)

where \(p~=~\sqrt {E^{2}/c^{2}~-~m^{2}c^{2}}\) is the total momentum, δ = (p − p 0)∕p 0 the relative momentum deviation and A s(x, y) the (normalized) longitudinal (along s) component of the vector potential. Only transverse fields and no electric fields are considered.

After square root expansion and sorting the A s contributions:

$$\displaystyle \begin{aligned} {{H}} = \overbrace{\frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}^{kinematic} - \overbrace{\underbrace{\frac{x\delta}{\rho}}_{bending} + \underbrace{\frac{x^{2}}{2\rho^{2}}}_{focusing}}^{dipole} + \overbrace{\frac{k_{1}}{2}(x^{2} - y^{2})}^{quadrupole} + \overbrace{\frac{k_{2}}{6}(x^{3} - 3xy^{2})}^{sextupole}~+~~\ldots {} \end{aligned} $$
(3.61)
$$\displaystyle \begin{aligned} {\mathsf{using:}}~~~~~~k_{n}~=~k^{(n)}_{n} = \frac{1}{B\rho}\frac{\partial^{n}B_{y}}{\partial x^{n}} ~~~~~~~~~\left(k^{(s)}_{n} = \frac{1}{B\rho}\frac{\partial^{n}B_{x}}{\partial x^{n}}~\right) \end{aligned} $$
(3.62)
  • The Hamiltonian describes the motion of a particle through an element

  • Each element has a component in the Hamiltonian

  • Basis to extend the linear to a nonlinear formalism

A short list of Hamiltonians of some machine elements (3D)

In general for multipoles of order n:

$$\displaystyle \begin{aligned} H_{n}~=~\frac{1}{1+n} \mathrm{Re}\left[(k_{n} +{\mathrm{i}}k^{(s)}_{n})(x +{\mathrm{i}}y)^{n+1}\right]~+~\frac{p_{x}^{2}~+~p_{y}^{2}}{2(1+\delta)} {} \end{aligned} $$
(3.63)

We get for some important types (normal components k n only):

$$\displaystyle \begin{aligned}{{\mathsf{drift~space:}}~~H = -\sqrt{(1 + \delta)^{2}~-~p_{x}^{2}~-~p_{y}^{2}}~~\approx~~\frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}\end{aligned} $$
(3.64)
$$\displaystyle \begin{aligned}{{\mathsf{dipole}:}~~H = -\frac{x \delta}{\rho} + \frac{x^{2}}{2\rho^{2}} + \frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}\end{aligned} $$
(3.65)
$$\displaystyle \begin{aligned}{{\mathsf{quadrupole}:}~~H = \frac{1}{2}k_{1}(x^{2} - y^{2}) + \frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}\end{aligned} $$
(3.66)
$$\displaystyle \begin{aligned}{{\mathsf{sextupole}:}~~H = \frac{1}{3}k_{2}(x^{3} - 3 x y^{2}) + \frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}\end{aligned} $$
(3.67)
$$\displaystyle \begin{aligned}{{\mathsf{octupole}:}~~H = \frac{1}{4}k_{3}(x^{4} - 6 x^{2} y^{2} + y^{4}) + \frac{p_{x}^{2} + p_{y}^{2}}{2(1 + \delta)}}\end{aligned} $$
(3.68)
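As a small illustration (our own helper, using sympy), the thin-lens kicks follow from these Hamiltonians by differentiation, Δp = −L ∂H/∂x, with the positions frozen over a short length L. With the factor conventions of (3.66)–(3.67) this reproduces the familiar quadrupole and sextupole kicks:

```python
import sympy as sp

x, y, k1, k2, L = sp.symbols('x y k_1 k_2 L', real=True)

# position-dependent (potential) parts of the element Hamiltonians (3.66), (3.67)
H_quad = k1/2*(x**2 - y**2)
H_sext = k2/3*(x**3 - 3*x*y**2)

def thin_kick(H):
    """Hamilton's equations dp_x/ds = -dH/dx, dp_y/ds = -dH/dy,
    integrated over a short length L with frozen positions."""
    return -L*sp.diff(H, x), -L*sp.diff(H, y)

kx_quad, ky_quad = thin_kick(H_quad)   # -L k1 x,  +L k1 y
kx_sext, ky_sext = thin_kick(H_sext)   # -L k2 (x^2 - y^2),  +2 L k2 x y
```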

Interlude 3

A few remarks are required after this list of Hamiltonians for particular elements.

  • Contrary to what is said in many introductory textbooks and lectures, a multipole of order n is not required to drive an nth order resonance—nothing could be more wrong!!

  • In leading order perturbation theory, only elements with an even order (and larger than 2) in the Hamiltonian can produce an amplitude dependent tune shift and tune spread.

3.7.3.1 Lie Maps and Transformations

In this section we would like to introduce Lie algebraic tools and Lie transformations [15,16,17]. We use the symbol z i  =  (x i, p i) where x and p stand for canonically conjugate position and momentum. We let f(z) and g(z) be any functions of x, p and define the Poisson bracket as a differential operator [18]:

$$\displaystyle \begin{aligned}{}[f, g] = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_{i}}\frac{\partial g}{\partial p_{i}} - \frac{\partial f}{\partial p_{i}}\frac{\partial g}{\partial x_{i}}\right) \end{aligned} $$
(3.69)

Assuming that the motion of a dynamic system is defined by a Hamiltonian H, we can now write for the equations of motion [18]:

$$\displaystyle \begin{aligned}{}[x_{i}, H] = \frac{\partial H}{\partial p_{i}} = \frac{d x_{i}}{dt} \end{aligned} $$
(3.70)
$$\displaystyle \begin{aligned}{}[p_{i}, H] = -\frac{\partial H}{\partial x_{i}} = \frac{d p_{i}}{dt} \end{aligned} $$
(3.71)

If H does not explicitly depend on time then:

$$\displaystyle \begin{aligned}{}[f, H] = 0 \end{aligned} $$
(3.72)

implies that f is an invariant of the motion. To proceed, we can define a Lie operator : f :  via the notation:

$$\displaystyle \begin{aligned} :f:g = [f, g] \end{aligned} $$
(3.73)

where : f :  is an operator acting on the function g.

We can define powers as:

$$\displaystyle \begin{aligned} (:f:)^{2}g = :f:(:f:g) = [f,[f, g]] ~~~~{\mathrm{etc.}} \end{aligned} $$
(3.74)

One can collect a set of useful formulae for calculations:

Some common special (very useful) cases for f:

$$\displaystyle \begin{aligned} \begin{array}{ll} :x:~=~\frac{\partial}{\partial p} &~~~~~:p:~=~-\frac{\partial}{\partial x}\\ {} :x:^{2}~=~\overbrace{:x::x:}^{\mathsf{applied~twice}}~=~\frac{\partial^{2}}{\partial p^{2}} &~~~~~:p:^{2}~=~\overbrace{:p::p:}^{\mathsf{applied~twice}}~=~\frac{\partial^{2}}{\partial x^{2}}\\ {} :xp:~=~p\frac{\partial}{\partial p} - x\frac{\partial}{\partial x} &~~~~~:x::p:~=~:p::x:~=~-\frac{\partial^{2}}{\partial x \partial p}\\ {} :x^{2}:~=~2x\frac{\partial}{\partial p} &~~~~~:p^{2}:~=~-2p\frac{\partial}{\partial x}\\ {} :x^{n}:~=~n\cdot x^{n-1}\frac{\partial}{\partial p} &~~~~~:p^{n}:~=~-n\cdot p^{n-1}\frac{\partial}{\partial x}\\ \end{array} {} \end{aligned} $$
(3.75)

Once powers of the Lie operators are defined, they can be used to formulate an exponential form:

$$\displaystyle \begin{aligned} e^{:f:} = \sum_{i=0}^{\infty} \frac{1}{i!}(:f:)^{i} \end{aligned} $$
(3.76)

This expression is called a “Lie transformation”.

Given the Hamiltonian H of an element, the generator f is this Hamiltonian multiplied by the length L of the element.

To evaluate a simple example, for the case H = −p 2∕2 using the exponential form and (3.75):

$$\displaystyle \begin{aligned} \begin{array}{rcl} {{e^{\textstyle{:-Lp^{2}/2:}}}}x &\displaystyle = &\displaystyle x - \frac{1}{2}L:p^{2}:x + \frac{1}{8}L^{2}(:p^{2}:)^{2}x + .. \\ &\displaystyle = &\displaystyle x~ +~ Lp \end{array} \end{aligned} $$
(3.77)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {{e^{\textstyle{:-Lp^{2}/2:}}}}p &\displaystyle = &\displaystyle p - \frac{1}{2}L:p^{2}:p + \ldots \\ &\displaystyle = &\displaystyle p {} \end{array} \end{aligned} $$
(3.78)

One can easily verify that for 1D and δ  =  0 this is the transformation of a drift space of length L (if p  ≈ x′) as introduced previously. The function f(x, p) = −Lp 2∕2 is the generator of this transformation.
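This small calculation can be automated with a computer algebra system. A sketch (function names are ours), assuming sympy is available, that reproduces (3.77)–(3.78); for this generator the series terminates after a few terms:

```python
import sympy as sp

x, p, L = sp.symbols('x p L', real=True)

def poisson(f, g):
    """Poisson bracket (3.69) in one degree of freedom."""
    return sp.diff(f, x)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, x)

def lie_transform(f, g, order=6):
    """e^{:f:} g = sum_i (:f:)^i g / i!, Eq. (3.76), truncated at 'order'."""
    total, term = g, g
    for i in range(1, order):
        term = poisson(f, term)
        total = total + term/sp.factorial(i)
    return sp.expand(total)

f = -L*p**2/2                  # generator of a drift of length L
x_new = lie_transform(f, x)    # expect x + L p, Eq. (3.77)
p_new = lie_transform(f, p)    # expect p,       Eq. (3.78)
```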

Interlude 4

The exact Hamiltonian in two transverse dimensions and with a relative momentum deviation δ is (full Hamiltonian with \(\vec {A}(\vec {x}, t)\) = 0):

$$\displaystyle \begin{aligned} H = -\sqrt{(1 + \delta)^{2}~-~p_{x}^{2}~-~p_{y}^{2}}~~~~\longrightarrow~~~~f_{drift}~=~L\cdot H \end{aligned}$$

The exact map for a drift space is now:

$$\displaystyle \begin{aligned} \begin{array}{rcl} x^{new} &\displaystyle = &\displaystyle x + L\cdot \frac{p_{x}}{\sqrt{(1 + \delta)^{2}~-~p_{x}^{2}~-~p_{y}^{2}}}\\ p_{x}^{new} &\displaystyle = &\displaystyle p_{x}\\ y^{new} &\displaystyle = &\displaystyle y + L\cdot \frac{p_{y}}{\sqrt{(1 + \delta)^{2}~-~p_{x}^{2}~-~p_{y}^{2}}}\\ p_{y}^{new} &\displaystyle = &\displaystyle p_{y} \end{array} \end{aligned} $$

In 2D and with δ  ≠  0 it is more complicated than Eq. (3.78). In practice the map can (often) be simplified to the well known form.

More general, acting on the phase space coordinates:

$$\displaystyle \begin{aligned} {{e^{:f:}}} (x, p)_{1} = (x, p)_{2} \end{aligned} $$
(3.79)

is the Lie transformation which describes how to go from one point to another.

While a Lie operator propagates variables over an infinitesimal distance, the Lie transformation propagates over a finite distance.

To illustrate this technique with some simple examples, it can be shown easily, using the formulae above, that the transformation:

$$\displaystyle \begin{aligned} e^{:\textstyle{-\frac{1}{2f}}x^{2}:} \end{aligned} $$
(3.80)

corresponds to the map of a thin quadrupole with focusing length f, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} x_{2} &\displaystyle = &\displaystyle x_{1} \\ p_{2} &\displaystyle = &\displaystyle p_{1} - \frac{1}{f} x_{1} \end{array} \end{aligned} $$

A transformation of the form:

$$\displaystyle \begin{aligned} e^{\textstyle{:-\frac{1}{2}L(k^{2}x^{2} + p^{2}):}} \end{aligned} $$
(3.81)

corresponds to the map of a thick quadrupole with length L and strength k:

$$\displaystyle \begin{aligned} \begin{array}{rcl} x_{2} &\displaystyle = &\displaystyle x_{1} {\mathrm{\cos}}(kL) + \frac{p_{1}}{k}{\mathrm{\sin}}(kL) \end{array} \end{aligned} $$
(3.82)
$$\displaystyle \begin{aligned} \begin{array}{rcl} p_{2} &\displaystyle = &\displaystyle -k x_{1}{\mathrm{\sin}}(kL) + p_{1} {\mathrm{\cos}}(kL) \end{array} \end{aligned} $$
(3.83)
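Since this generator is quadratic, :f: maps linear functions of (x, p) to linear functions, and the exponential series can be summed as a 2×2 matrix. A numeric sketch (our own code, arbitrary values of k and L) recovering (3.82)–(3.83): with f = −L/2 (k²x² + p²) one finds :f:x = Lp and :f:p = −k²Lx, so the operator acts as the matrix A below.

```python
import math

def lie_series_quad(k, L, terms=30):
    """Sum e^{:f:} = sum A^n/n! for f = -L/2 (k^2 x^2 + p^2)."""
    A = [[0.0, L], [-k*k*L, 0.0]]      # action of :f: on (x, p)
    M = [[1.0, 0.0], [0.0, 1.0]]       # accumulated series
    T = [[1.0, 0.0], [0.0, 1.0]]       # current term A^n / n!
    for n in range(1, terms):
        T = [[sum(T[i][r]*A[r][j] for r in range(2))/n for j in range(2)] for i in range(2)]
        M = [[M[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return M

k, L = 1.3, 0.7
M = lie_series_quad(k, L)
c, s = math.cos(k*L), math.sin(k*L)    # thick-quadrupole entries, (3.82)-(3.83)
```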

The linear map using Twiss parameters in Lie representation (we shall call it : f 2 :  from now on) is always of the form:

$$\displaystyle \begin{aligned} e^{\textstyle{:f_{2}:}}~~~{\mathrm{with:}}~~~f_{2}(x) = -\frac{\mu}{2}(\gamma x^{2} + 2 \alpha x p + \beta p^{2}) \end{aligned} $$
(3.84)

In case of a general non-linear function f(x), i.e. with a (thin lens) kick like:

$$\displaystyle \begin{aligned} \begin{array}{rcl} x_{2} &\displaystyle = &\displaystyle x_{1} \end{array} \end{aligned} $$
(3.85)
$$\displaystyle \begin{aligned} \begin{array}{rcl} p_{2} &\displaystyle = &\displaystyle p_{1} + f(x_{1}) {} \end{array} \end{aligned} $$
(3.86)

the corresponding Lie operator can be written as:

$$\displaystyle \begin{aligned} e^{\textstyle{:h:}} = e^{\textstyle{:\int_{0}^{x} f(u) {\mathrm{d}}u:}} {~~~\mathrm{or}~~~} e^{\textstyle{:F:}} {~~~\mathrm{with}~~~} F = \int_{0}^{x} f(u) {\mathrm{d}}u. {} \end{aligned} $$
(3.87)

An important property of the Lie transformation is that the one turn map is the exponential of the effective Hamiltonian and the circumference C:

$$\displaystyle \begin{aligned} {{M}}_{ring} =e^{:-C {{H}}_{eff}:}. \end{aligned} $$
(3.88)

The main advantages of Lie transformations are that the exponential form is always symplectic and that a formalism exists for the concatenation of transformations. An overview of this formalism and many examples can be found in [7]. As for the Lie operator, one can collect a set of useful formulae:

With a constant a and f, g, h arbitrary functions:

$$\displaystyle \begin{aligned} :a:~=~0~~~~~\longrightarrow~~~~~e^{\textstyle{:a:}}~=~1 \end{aligned}$$
$$\displaystyle \begin{aligned} :f:a~=~0~~~~~\longrightarrow~~~~~e^{\textstyle{:f:}}a~=~a \end{aligned}$$
$$\displaystyle \begin{aligned} e^{\textstyle{:f:}}~[g, h] = [e^{\textstyle{:f:}}g,~e^{\textstyle{:f:}}h] \end{aligned}$$
$$\displaystyle \begin{aligned} e^{\textstyle{:f:}}~(g\cdot h) = e^{\textstyle{:f:}}g~\cdot~ e^{\textstyle{:f:}}h \end{aligned}$$

and very important:

$$\displaystyle \begin{aligned} {{M}}~g(x)~~=~~e^{\textstyle{:f:}}~g(x)~=~g(e^{\textstyle{:f:}}~x)~~~~~~~~~~{\mathsf{e.g.}}~~~~~e^{\textstyle{:f:}}~ x^{2}~=~(e^{\textstyle{:f:}}~ x)^{2} \end{aligned}$$
$$\displaystyle \begin{aligned} {{M}}^{-1}~g(x)~~=~~(e^{\textstyle{:f:}})^{-1}~g(x)~=~e^{\textstyle{-:f:}}~g(x)~~~~~~~~~{\mathsf{note:}}~~~\frac{1}{e^{\textstyle{:f:}}}~~\neq~~(e^{\textstyle{:f:}})^{-1} \end{aligned} $$
(3.89)

3.7.3.2 Concatenation of Lie Transformations

The concatenation is very easy when f and g commute (i.e. [f, g]  =  [g, f]  =  0) and we have:

$$\displaystyle \begin{aligned} e^{\textstyle{{:h:}}}~=~e^{\textstyle{{:f:}}}~e^{\textstyle{{:g:}}} = e^{\textstyle{:f + g:}} \end{aligned} $$
(3.90)

The generators of the transformations can just be added.

To combine two transformations in the general case (i.e. [f, g]  ≠  0) we can use the Baker–Campbell–Hausdorff formula (BCH) which in our convention can be written as:

$$\displaystyle \begin{aligned} \begin{array}{ll} h = f &~+~ g ~+~ \frac{1}{2}:f:g ~+~ {\textstyle{\frac{1}{12}}}:f:^{2}g ~+~ \frac{1}{12}:g:^{2}f \\ {} &~+~ \frac{1}{24}:f::g:^{2}f ~-~ \frac{1}{720}:g:^{4}f \\ {} &~-~\frac{1}{720}:f:^{4}g ~+~ \frac{1}{360}:g::f:^{3}g ~+~ \ldots \end{array} {} \end{aligned} $$
(3.91)

In many practical cases, non-linear perturbations are localized and small compared to the rest of the (often linear) ring, i.e. one of f or g is much smaller, e.g. f corresponds to one turn, g to a small, local distortion.

In that case we can sum up the BCH formula to first order in the perturbation g and get:

$$\displaystyle \begin{aligned} \begin{array}{rcl} ~e^{\textstyle{{:h:}}}~=~e^{\textstyle{{:f:}}}~e^{\textstyle{{:g:}}}~=~\mathrm{exp}\left[:f + \left( \frac{:f:}{1 - e^{-:f:}}\right) g + {{O}}(g^{2}): \right] {} \end{array} \end{aligned} $$
(3.92)

When g is small compared to f, the first order is a good approximation.
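The truncated series (3.91) can be sanity-checked numerically for quadratic generators, where every Lie transformation acts linearly on x and p. In the sketch below (the coefficient-triple representation and the drift-like f and kick-like g values are our own choices, not from the text) adding successive BCH terms reduces the disagreement between e^{:h:} and the product e^{:f:}e^{:g:}:

```python
# quadratic generators represented as triples (A, B, C) for A x^2 + B xp + C p^2
def pb(h1, h2):
    """Poisson bracket of two quadratic forms (again quadratic)."""
    A1, B1, C1 = h1
    A2, B2, C2 = h2
    return (2*(A1*B2 - A2*B1), 4*(A1*C2 - A2*C1), 2*(B1*C2 - B2*C1))

def lie_matrix(h):
    """Action of :h: on linear functions a*x + b*p, as a 2x2 matrix on (a, b)."""
    A, B, C = h
    return [[-B, 2*A], [-2*C, B]]

def mmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expm(M, terms=40):
    """2x2 matrix exponential by its power series."""
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        T = [[sum(T[i][r]*M[r][j] for r in range(2))/n for j in range(2)] for i in range(2)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

def add(*hs):
    return tuple(sum(v) for v in zip(*hs))

def scale(s, h):
    return tuple(s*v for v in h)

f = (0.0, 0.0, -0.1)     # f = -L p^2/2 with L = 0.2 (drift-like)
g = (-0.15, 0.0, 0.0)    # g = -x^2/(2F) with 1/F = 0.3 (kick-like)

lhs = mmul(expm(lie_matrix(f)), expm(lie_matrix(g)))    # e^{:f:} e^{:g:}

fg   = pb(f, g)
ffg  = pb(f, fg)
ggf  = pb(g, pb(g, f))
fggf = pb(f, ggf)

h2 = add(f, g, scale(0.5, fg))                                  # through 1/2 :f:g
h4 = add(h2, scale(1/12, ffg), scale(1/12, ggf), scale(1/24, fggf))  # through deg. 4

def err(h):
    E = expm(lie_matrix(h))
    return max(abs(E[i][j] - lhs[i][j]) for i in range(2) for j in range(2))

e0, e2, e4 = err(add(f, g)), err(h2), err(h4)   # decreasing errors
```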

For example, we may have a full ring \(e^{:f_{2}:}\) with a small (local) distortion, e.g. a multipole e :g: with g = kx n then the expression:

$$\displaystyle \begin{aligned} e^{\textstyle{:h:}} = e^{\textstyle{:f_{2}:}} e^{\textstyle{:k x^{n}:}}, \end{aligned} $$
(3.93)

allows the evaluation of the invariant h for a single multipole of order n in this case.

In the case that f 2, f 3, f 4 are 2nd, 3rd, 4th order polynomials, the map can be factorized as (Dragt-Finn factorization [19]):

$$\displaystyle \begin{aligned} e^{\textstyle{:f:}} = e^{\textstyle{:f_{2}:}} e^{\textstyle{:f_{3}:}} e^{\textstyle{:f_{4}:}}, \end{aligned} $$
(3.94)

each term is symplectic and the truncation at any order does not violate symplecticity.

One may argue that this method is clumsy when we do the analysis of a linear system. The reader is invited to prove this by concatenating by hand a drift space and a thin quadrupole lens. However, the central point of this method is that the technique works whether we do linear or non-linear beam dynamics and provides a formal procedure. Lie transformations are the natural extension of the linear matrix formalism to a non-linear formalism. There is no need to move from one method to another as required in the traditional treatment.

In the case an element is described by a Hamiltonian H, the Lie map of an element of length L is:

$$\displaystyle \begin{aligned} e^{\textstyle{-L:H:}} = \sum_{i=0}^{\infty} \frac{1}{i!}(-L:H:)^{\textstyle{i}} \end{aligned} $$
(3.95)

For example, the Hamiltonian for a thick sextupole is:

$$\displaystyle \begin{aligned} H = \frac{1}{3}k(x^{3} - 3xy^{2}) + \frac{1}{2}(p_{x}^{2} + p_{y}^{2}) \end{aligned} $$
(3.96)

To find the transformation we search for:

$$\displaystyle \begin{aligned} {{e^{\textstyle{-L:H:}}}}x~~~\mathrm{and}~~~{{e^{\textstyle{-L:H:}}}}p_{x}~~~~\mathrm{i.e.~for} \end{aligned} $$
(3.97)
$$\displaystyle \begin{aligned} {{e^{\textstyle{-L:H:}}}}x = \sum_{i=0}^{\infty} \frac{(-L)^{\textstyle{i}}}{i!}(:H:)^{\textstyle{i}}x \end{aligned} $$
(3.98)

We can compute:

$$\displaystyle \begin{aligned} {{:H:^{\textstyle{i}}}}x~~~{\mathsf{for~each}}~i \end{aligned} $$
(3.99)

to get:

$$\displaystyle \begin{aligned} {{:H:^{1}}}x = -p_{x}, \end{aligned} $$
(3.100)
$$\displaystyle \begin{aligned} {{:H:^{2}}}x = -k(x^{2} - y^{2}), \end{aligned} $$
(3.101)
$$\displaystyle \begin{aligned} {{:H:^{3}}}x = 2k(x p_{x} - y p_{y}), \end{aligned} $$
(3.102)
$$\displaystyle \begin{aligned} \ldots. \end{aligned} $$
(3.103)

Putting the terms together one obtains:

$$\displaystyle \begin{aligned} {{e^{\textstyle{-L:H:}}}}x = x + p_{x}L - \frac{1}{2}kL^{2}(x^{2} - y^{2}) - \frac{1}{3}kL^{3}(x p_{x} - y p_{y}) + \ldots \end{aligned} $$
(3.104)
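The brackets (3.100)–(3.102) are mechanical to verify with a computer algebra system; a sketch assuming sympy, with the Poisson bracket taken over both transverse planes:

```python
import sympy as sp

x, y, px, py, k = sp.symbols('x y p_x p_y k', real=True)
H = k/3*(x**3 - 3*x*y**2) + (px**2 + py**2)/2    # thick sextupole, Eq. (3.96)

def pb(f, g):
    """Poisson bracket in both transverse planes."""
    return (sp.diff(f, x)*sp.diff(g, px) - sp.diff(f, px)*sp.diff(g, x)
          + sp.diff(f, y)*sp.diff(g, py) - sp.diff(f, py)*sp.diff(g, y))

t1 = pb(H, x)        # :H:^1 x, expect -p_x            (3.100)
t2 = pb(H, t1)       # :H:^2 x, expect -k (x^2 - y^2)  (3.101)
t3 = pb(H, t2)       # :H:^3 x, expect 2k (x p_x - y p_y)  (3.102)
```

Summing the terms with the weights (−L)^i/i! then reproduces (3.104).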

3.7.4 Analysis Techniques: Poincaré Surface of Section

Under normal circumstances it is not required to examine the complete time development of a particle trajectory around the machine. Given the experimental fact that the trajectory can be measured only at a finite number of positions around the machine, it is more useful to sample the trajectory periodically at a fixed position. The plot of the phase space variables at the beginning (or end) of each period is the appropriate method, also known as the Poincaré Surface of Section [20]. An example of such a plot is shown in Fig. 3.4 where the one-dimensional phase space is plotted for a completely linear machine (Fig. 3.4, left) and close to a 5th order resonance in the presence of a single non-linear element (in this case a sextupole) in the machine (Fig. 3.4, right).

Fig. 3.4

Poincaré surface of section of a particle near the 5th order resonance. Left: without non-linear elements; right: with one sextupole

It shows very clearly the distortion of the phase space due to the non-linearity, the appearance of resonance islands and chaotic behaviour between the islands. From this plot it is immediately clear that the region of stability is strongly reduced in the presence of the non-linear element. The main features we can observe in Fig. 3.4 are that particles can:

  • Move on closed curves

  • Lie on islands, i.e. jump from one island to the next from turn to turn

  • Move on chaotic trajectories

The introduction of these techniques by Poincaré marks a paradigm shift from the old classical treatment to a more modern approach. The question of long term stability of a dynamical system is not answered by solving the differential equation of motion, but by determining the properties of the surface onto which the motion is mapped. Independent of how this surface of section is obtained, i.e. by analytical or numerical methods, its analysis is the key to understanding the stability.
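A surface of section like Fig. 3.4 (right) can be produced with a few lines of tracking: a linear one-turn rotation followed by a thin sextupole kick. The sketch below is a minimal Python version; the tune 0.2033 (close to the 5th order resonance) and the kick strength are hypothetical illustration values.

```python
import numpy as np

def poincare_section(x, px, tune, k2, nturns):
    """Sample (x, p_x) once per turn: rotation by 2*pi*tune plus a thin sextupole kick."""
    mu = 2*np.pi*tune
    c, s = np.cos(mu), np.sin(mu)
    pts = np.empty((nturns, 2))
    for i in range(nturns):
        pts[i] = x, px
        x, px = c*x + s*px, -s*x + c*px    # linear one-turn map
        px += k2*x**2                      # thin sextupole kick
    return pts

# a few starting amplitudes near the 5th order resonance
orbits = [poincare_section(x0, 0.0, 0.2033, 1.0, 1000) for x0 in (0.01, 0.05, 0.10)]
```

Plotting the returned points reproduces the closed curves, islands and chaotic layers discussed above.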

3.7.5 Analysis Techniques: Normal Forms

The idea behind this technique is that maps can be transformed into Normal Forms. This tool can be used to:

  • Study invariants of the motion and the effective Hamiltonian

  • Extract non-linear tune shifts (detuning)

  • Perform resonance analysis

In the following we demonstrate the use of normal forms away from resonances. The treatment of the beam dynamics close to resonances is beyond the scope of this review and can be found in the literature (see e.g. [6, 7]).

3.7.5.1 Normal Form Transformation: Linear Case

The strategy is to make a transformation that brings the map M into a simpler form, e.g. a pure rotation R(Δμ), as schematically shown in Fig. 3.5, using a transformation like:

$$\displaystyle \begin{aligned} {M} = {{U}} \circ {{R}}(\Delta\mu) \circ {{U}}^{-1} ~~~~{\mathrm{or:}}~~~~ {{R}}(\Delta\mu) = {{U}}^{-1} \circ {M} \circ {{U}} \end{aligned} $$
(3.105)

with

$$\displaystyle \begin{aligned} {{U}} = \left( \begin{array}{cc} \sqrt{\beta(s)} &0 \\ {} -{\textstyle{\frac{\textstyle{\alpha(s)}}{\textstyle{\sqrt{\beta(s)}}}}} &\frac{1}{\textstyle{\sqrt{\beta(s)}}} \\ {} \end{array}\right) ~~~~~{\mathsf{and}}~~~~~{{R}} = \left( \begin{array}{cc} \cos{}(\Delta\mu) &\sin{}(\Delta\mu) \\ {} -\sin{}(\Delta\mu) &\cos{}(\Delta\mu) \\ {} \end{array}\right) \end{aligned} $$
(3.106)

This transformation corresponds to the Courant-Snyder analysis in the linear case and directly provides the phase advance and optical parameters. The optical parameters emerge automatically from the normal form analysis of the one-turn-map.

Fig. 3.5

Normal form transformation in the linear case, related to the Courant-Snyder analysis

Although not required in the linear case, we demonstrate how this normal form transformation is performed using the Lie formalism. Starting from the general expression:

$$\displaystyle \begin{aligned} {{{R}}(\Delta\mu)} = {{U}}^{-1} \circ {M} \circ {{U}} \end{aligned} $$
(3.107)

we know that a linear map M in Lie representation can always be written:

$$\displaystyle \begin{aligned} e^{\textstyle{:f_{2}:}}~~~{\mathsf{with:}}~~~f_{2} = -\frac{\mu}{2}(\gamma x^{2} + 2 \alpha x p_{x} + \beta p_{x}^{2}) \end{aligned} $$
(3.108)

therefore:

$$\displaystyle \begin{aligned} \begin{array}{rcl} {{{R}}(\Delta\mu)} &\displaystyle = {{U}}^{-1}~\circ~{e^{\textstyle{:f_{2}(x):}}}~\circ~{{U}}\\ ~~~ &\displaystyle = e^{\textstyle{U^{-1}:f_{2}:U}}~=~e^{\textstyle{:U^{-1} f_{2}:}} \end{array} \end{aligned} $$
(3.109)

and with U −1 f 2, i.e. f 2 expressed in the new variables X, P x, it assumes the form:

$$\displaystyle \begin{aligned} f_{2} = -\frac{\mu}{2}(X^{2} + P_{x}^{2}) ~~~~{\mathsf{because:}}~~~~ \left( \begin{array}{c} X \\ P_{x} \\ \end{array}\right) = {{U}}^{-1} \left( \begin{array}{c} x \\ p_{x} \\ \end{array}\right) \end{aligned} $$
(3.110)

i.e. with the transformation U −1 the map generated by f 2 becomes a pure rotation, i.e. circles, in the transformed coordinates. We transform to action and angle variables J and Φ, related to the variables X and P x through the transformations:

$$\displaystyle \begin{aligned} \begin{array}{rcl} X = \sqrt{2J} \sin\Phi, ~~~~~P_{x} = \sqrt{2J}\cos\Phi \end{array} \end{aligned} $$
(3.111)

With this transformation we get a simple representation for the linear transfer map f 2:

$$\displaystyle \begin{aligned} \begin{array}{rcl} f_{2} = -\mu J~~~{\mathsf{and:}}~~~{{{R}}(\Delta\mu)}~=~e^{\textstyle{:-\Delta\mu J:}} \end{array} \end{aligned} $$
(3.112)
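Numerically, the transformation (3.105)–(3.106) can be checked in a few lines; the one-turn matrix below is a hypothetical stable matrix with det M = 1, from which β, α and the phase advance are extracted and the pure rotation recovered.

```python
import numpy as np

M = np.array([[0.8, 1.2],
              [-0.3, 0.8]])              # hypothetical one-turn matrix, det = 1

# Courant-Snyder parameters from the one-turn matrix
cos_mu = 0.5*np.trace(M)
sin_mu = np.sign(M[0, 1])*np.sqrt(1.0 - cos_mu**2)
beta = M[0, 1]/sin_mu
alpha = (M[0, 0] - M[1, 1])/(2.0*sin_mu)

# the transformation U of Eq. (3.106)
U = np.array([[np.sqrt(beta), 0.0],
              [-alpha/np.sqrt(beta), 1.0/np.sqrt(beta)]])

R = np.linalg.inv(U) @ M @ U             # Eq. (3.105): a pure rotation by mu
```

The recovered R is the rotation matrix of (3.106), and U R U⁻¹ reproduces M.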

3.7.5.2 Normal Form Transformation: Non-linear Case

In the more general, non-linear case the transformation is more complicated and one must expect that the rotation angle becomes amplitude dependent (see e.g. [6]). A schematic view of this scheme is shown in Fig. 3.6 where the transformation leads to the desired rotation, however the rotation frequency (phase advance) is now amplitude dependent.

Fig. 3.6

Normal form transformation in the non-linear case, leading to amplitude dependent phase advance. The transformation was done for non-resonant amplitudes

We demonstrate the power of this technique with a simple example in one dimension, but the treatment is similar for more complex cases. In particular, it demonstrates that this analysis using the algorithm based on Lie transforms leads easily to the desired result. A very detailed discussion of this method is found in [6].

From the general map we have made a transformation such that the transformed map can be expressed in the form \(e^{:h_{2}:}\) where the function h 2 is now a function only of J x, J y, and δ and it is the effective Hamiltonian.

In the non-linear case and away from resonances we can get the map in a similar form:

$$\displaystyle \begin{aligned} N = e^{\textstyle{:h_{eff}(J_{x}, J_{y},\delta):}} \end{aligned} $$
(3.113)

where the effective Hamiltonian h eff depends only on J x, J y, and δ.

If the map for h eff corresponds to a one-turn-map, we can write for the tunes:

$$\displaystyle \begin{aligned} Q_{x}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{x}} {} \end{aligned} $$
(3.114)
$$\displaystyle \begin{aligned} Q_{y}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{y}} {} \end{aligned} $$
(3.115)

and the change of path length:

$$\displaystyle \begin{aligned} \Delta s = -\frac{\partial h_{eff}}{\partial \delta} = \alpha_{c} \delta \end{aligned} $$
(3.116)

In the non-linear case, particles with different J x, J y, δ have different tunes. The dependence on J x, J y is the amplitude detuning; the dependence on δ gives the chromaticities.

The effective Hamiltonian can always be written (here to 3rd order) in a form:

$$\displaystyle \begin{aligned} \begin{array}{rcl} h_{eff}~=~&\displaystyle + &\displaystyle {{\mu_{x}J_{x} + \mu_{y}J_{y} + \frac{1}{2}\alpha_{c}\delta^{2}}} \end{array} \end{aligned} $$
(3.117)
$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle + &\displaystyle {{c_{x1} J_{x}\delta + c_{y1} J_{y}\delta + c_{3}\delta^{3}}} \end{array} \end{aligned} $$
(3.118)
$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle + &\displaystyle {{c_{xx} J_{x}^{2} + c_{xy} J_{x}J_{y} + c_{yy} J_{y}^{2} + c_{x2} J_{x} \delta^{2} + c_{y2} J_{y} \delta^{2} + c_{4} \delta^{4}}}\\ \end{array} \end{aligned} $$
(3.119)

and the tunes then depend on the actions J and the momentum deviation δ:

$$\displaystyle \begin{aligned} \begin{array}{rcl} Q_{x}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{x}} = \frac{1}{2\pi}\left( {{\mu_{x}}} + \overbrace{{2c_{xx} J_{x} + c_{xy} J_{y}}}^{{detuning}} + \overbrace{{c_{x1} \delta + c_{x2} \delta^{2}}}^{{chromaticity}} \right) \end{array} \end{aligned} $$
(3.120)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Q_{y}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{y}} = \frac{1}{2\pi}\left( {{\mu_{y}}} + \overbrace{{2c_{yy} J_{y} + c_{xy} J_{x}}}^{{detuning}} + \overbrace{{c_{y1} \delta + c_{y2} \delta^{2}}}^{{chromaticity}} \right) \end{array} \end{aligned} $$
(3.121)

The meaning of the different contributions is:

  • μ x, μ y: linear phase advances, i.e. 2π times the tunes for rings

  • \(\frac {1}{2}\alpha _{c}, c_{3}, c_{4}\): linear and nonlinear “momentum compaction”

  • c x1, c y1: first order chromaticities

  • c x2, c y2: second order chromaticities

  • c xx, c xy, c yy: detuning with amplitude

The coefficients are the various aberrations of the optics.
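The differentiations leading from (3.117)–(3.119) to (3.120)–(3.121) are mechanical and can be checked symbolically, e.g. with sympy (a sketch; the symbol names simply mirror the coefficients above):

```python
import sympy as sp

Jx, Jy, d = sp.symbols('J_x J_y delta')
(mux, muy, ac, cx1, cy1, c3,
 cxx, cxy, cyy, cx2, cy2, c4) = sp.symbols(
    'mu_x mu_y alpha_c c_x1 c_y1 c_3 c_xx c_xy c_yy c_x2 c_y2 c_4')

# effective Hamiltonian to 3rd order, Eqs. (3.117)-(3.119)
h_eff = (mux*Jx + muy*Jy + sp.Rational(1, 2)*ac*d**2
         + cx1*Jx*d + cy1*Jy*d + c3*d**3
         + cxx*Jx**2 + cxy*Jx*Jy + cyy*Jy**2
         + cx2*Jx*d**2 + cy2*Jy*d**2 + c4*d**4)

Qx = sp.diff(h_eff, Jx)/(2*sp.pi)     # Eq. (3.120)
Qy = sp.diff(h_eff, Jy)/(2*sp.pi)     # Eq. (3.121)
```

The derivatives contain exactly the detuning and chromaticity terms identified in (3.120) and (3.121).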

As a first example one can look at the effect of a single (thin) sextupole. The map is:

$$\displaystyle \begin{aligned} \begin{array}{ll} {\mathcal{M}} = &e^{\textstyle{-:\mu_{x} J_{x}~+~\mu_{y} J_{y}~+~\frac{1}{2}\alpha_{c}\delta^{2}:}}~{e^{\textstyle{:k (x^{3}~-~3xy^{2})~+~\frac{p_{x}^{2} + p_{y}^{2}}{2(1+\delta)}:}}} \\ \end{array} \end{aligned} $$
(3.122)

we get for h eff (see e.g. [6, 7]):

$$\displaystyle \begin{aligned} \begin{array}{ll} h_{eff} = &\mu_{x} J_{x} + \mu_{y} J_{y} + \frac{1}{2} \alpha_{c}\delta^{2} - k D^{3}\delta^{3} - 3 k \beta_{x} J_{x} D \delta + 3 k \beta_{y} J_{y} D \delta \end{array} \end{aligned}$$

Then it follows:

$$\displaystyle \begin{aligned} Q_{x}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{x}} = \frac{1}{2\pi} ({{\mu_{x}}} - {{3 k \beta_{x} D \delta}}) \end{aligned} $$
(3.123)
$$\displaystyle \begin{aligned} Q_{y}(J_{x}, J_{y},\delta) = \frac{1}{2\pi}\frac{\partial h_{eff}}{\partial J_{y}} = \frac{1}{2\pi} ({{\mu_{y}}} + {{3 k \beta_{y} D \delta}}) \end{aligned} $$
(3.124)

Since the map was expanded to first order only, there is no non-linear detuning with amplitude.

As a second example one can use a linear rotation followed by an octupole; the Hamiltonian is:

$$\displaystyle \begin{aligned} {{H}} = \frac{\mu}{2}(x^{2} + p_{x}^{2}) + \delta(s - s_{0})\frac{x^{4}}{4} = \mu J + \delta(s - s_{0})\frac{x^{4}}{4}~~~~~~{\mathrm{with:}}~~J = \frac{(x^{2} + p_{x}^{2})}{2} {} \end{aligned} $$
(3.125)

The first part of the Hamiltonian corresponds to the generator of a linear rotation and the second part to the localized octupole.

The map, written in Lie representation becomes:

$$\displaystyle \begin{aligned} M = e^{\textstyle{:-\frac{\mu}{2}(x^{2} + p_{x}^{2}):}}~ e^{\textstyle{:\frac{x^{4}}{4}:}} = e^{\textstyle{:-\mu J:}}~ e^{\textstyle{:\frac{x^{4}}{4}:}}~ =~ R~ e^{\textstyle{:\frac{x^{4}}{4}:}} \end{aligned} $$
(3.126)

The purpose is now to find a generator F for a transformation

$$\displaystyle \begin{aligned} e^{\textstyle{-:F:}}~ M~ e^{\textstyle{:F:}}~~=~~e^{\textstyle{-:F:}}~ e^{\textstyle{:\frac{x^{4}}{4}:}}~ e^{\textstyle{:F:}} \end{aligned} $$
(3.127)

such that the exponents of the map depend only on J and not on x.

Without going through the algebra (advanced tools exist for this purpose, see e.g. [6]) we quote the result: with

$$\displaystyle \begin{aligned} F = -\frac{1}{64}\{-5x^{4} {+} 3p_{x}^{4} {+} 6x^{2}p_{x}^{2} {+} x^{3}p_{x}(8\cot{}(\mu) \,{+}\, 4\cot{}(2\mu)) \,{+}\, xp_{x}^{3}(8\cot{}(\mu) {-} 4\cot{}(2\mu))\} \end{aligned} $$
(3.128)

we can write the map:

$$\displaystyle \begin{aligned} M = e^{\textstyle{-:F:}}~~e^{\textstyle{:-\mu J + {{\frac{3}{8} J^{2}}}:}}~~e^{\textstyle{:F:}} {} \end{aligned} $$
(3.129)

The term \({{\frac {3}{8} J^{2}}}\) implies a tune shift with amplitude for an octupole.
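The detuning predicted by (3.129), a phase advance per turn of μ − (3/4)J for the map (3.126), can be verified by tracking the kick map directly. A minimal Python sketch (the tune 0.31 and the starting amplitude are hypothetical illustration values):

```python
import numpy as np

mu = 2*np.pi*0.31                          # linear phase advance (hypothetical tune)

def track(x, px, nturns):
    """Rotation by mu, then the octupole kick p_x -> p_x + x^3 generated by e^{:x^4/4:}."""
    z = np.empty(nturns, dtype=complex)
    for i in range(nturns):
        z[i] = x - 1j*px                   # this variable rotates by e^{i mu} per linear turn
        x, px = np.cos(mu)*x + np.sin(mu)*px, -np.sin(mu)*x + np.cos(mu)*px
        px += x**3                         # thin octupole kick
    return z

z = track(0.1, 0.0, 4000)                  # J = (x^2 + p_x^2)/2 = 0.005
dphi = np.angle(z[1:]/z[:-1])              # per-turn phase advance
detuning = np.mean(dphi) - mu              # should be close to -(3/4) J
```

The measured average phase advance shifts by approximately −(3/4)J per turn, in agreement with the normal form result (3.129).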

3.7.6 Truncated Power Series Algebra Based on Automatic Differentiation

It was argued that an appropriate technique to evaluate the behaviour of complex, non-linear systems is to numerically track particles through the individual elements. Schematically this is shown in Fig. 3.7: the tracking through a complicated system relates the output numerically to the input. When the algorithm depicted in Fig. 3.7 represents the full turn in a ring, we obtain the most reliable one-turn-map through this tracking procedure, assuming we have chosen an appropriate representation of the maps for the individual elements.

Fig. 3.7

Schematic view of tracking through a complex element

3.7.6.1 Automatic Differentiation: Concept

This procedure may not be fully satisfactory in all cases and one might like to get an analytical expression for the one-turn-map or equivalent. Could we imagine something that relates the output algebraically to the input? This might for example be a Taylor series of the type:

$$\displaystyle \begin{aligned} \begin{array}{rcl} z_{2} = \sum_{j} C_{j} z_{1}^{j}~~~~{\mathsf{with~coefficients}}~C_{j}~{\mathsf{built~from~the~derivatives}}~f^{(j)} \end{array} \end{aligned} $$
(3.130)

Then we have an analytic map (for all z 1).

To understand why this could be useful, we can study the paraxial behaviour. In Fig. 3.8 we show schematically the trajectories of particles close to the ideal orbit. The red line refers to the ideal trajectory while the other lines show the motion of individual particles with small deviations from the ideal path. The idea is that if we understand how small deviations behave, we understand the system much better.

Fig. 3.8

Pictorial view of a paraxial analysis. Red line represents the ideal trajectory

If we now remember the definition of the Taylor series:

$$\displaystyle \begin{aligned} f(x + \Delta x) = f(x) + \sum_{n=1}^{\infty} \frac{\Delta x^{n}}{n!} f^{(n)}(x) \end{aligned} $$
(3.131)

we immediately realize that the coefficients determine the behaviour of small deviations Δx from the ideal orbit x. The Taylor expansion therefore performs a paraxial analysis of the system, and the main question is how to get these coefficients without extra work.

The problem is getting the derivatives f (n)(a) of f(x) at a:

$$\displaystyle \begin{aligned} f'(a) = \lim_{\epsilon \rightarrow 0}\frac{f(a + \epsilon) - f(a)}{\epsilon} \end{aligned} $$
(3.132)

Numerically this corresponds to subtracting almost equal numbers and dividing by a small number. For higher orders f″, f‴, …, one must expect numerical problems. An elegant solution to this problem is the use of Differential Algebra (DA) [21].

3.7.6.2 Automatic Differentiation: The Algebra

Here we demonstrate the concept, for more details the literature should be consulted [6, 7, 21].

  1.

    Define a pair (q 0, q 1), where q 0, q 1 are real numbers

  2.

    Define operations on such pairs like:

    $$\displaystyle \begin{aligned}~~~(q_{0}, q_{1})~{{+}}~(r_{0}, r_{1}) = (q_{0} + r_{0}, q_{1} + r_{1})\end{aligned} $$
    (3.133)
    $$\displaystyle \begin{aligned}~~~c~{{\cdot}}~(q_{0}, q_{1}) = (c\cdot q_{0}, c \cdot q_{1})\end{aligned} $$
    (3.134)
    $$\displaystyle \begin{aligned}~~~(q_{0}, q_{1})~{{\cdot}}~(r_{0}, r_{1}) = ( q_{0}\cdot r_{0}, q_{0}\cdot r_{1} + q_{1}\cdot r_{0})\end{aligned} $$
    (3.135)
  3.

    We define the ordering like:

    $$\displaystyle \begin{aligned}~~~(q_{0},q_{1}) < (r_{0},r_{1})~~~{\mathsf{if}}~~~ q_{0} < r_{0}~~~{\mathsf{or}}~~~(q_{0} = r_{0} ~~~{\mathsf{and}}~~~ q_{1} < r_{1})\end{aligned} $$
    (3.136)
    $$\displaystyle \begin{aligned}~~~(q_{0},q_{1}) > (r_{0},r_{1})~~~{\mathsf{if}}~~~ q_{0} > r_{0}~~~{\mathsf{or}}~~~(q_{0} = r_{0} ~~~{\mathsf{and}}~~~ q_{1} > r_{1})\end{aligned} $$
    (3.137)
  4.

    This implies that:

    $$\displaystyle \begin{aligned}~~~(0,0) < (0,1) < (r,0) ~~~({\mathsf{for~any~positive}}~r)\end{aligned} $$
    (3.138)

This means that (0,1) lies between 0 and ANY positive real number, i.e. it is infinitely small, corresponding to the “𝜖” in standard calculus.

Therefore we call this special pair the “differential unit” d = (0, 1).

With our rules we can further see that:

$$\displaystyle \begin{aligned} (1,0) \cdot (q_{0},q_{1}) = (q_{0},q_{1}) {~~~~~\mathsf{and}~~~~~} (q_{0},q_{1})~^{-1}~=~\left(\frac{1}{q_{0}}, -\frac{q_{1}}{q_{0}^{2}} \right) \end{aligned} $$
(3.139)

In general the inverse of a function f(q 0, q 1) can be derived from:

$$\displaystyle \begin{aligned} f((q_{0}, q_{1}))\cdot (r_{0}, r_{1})~~=~~(1, 0) {} \end{aligned} $$
(3.140)

using the multiplication rules. The inverse is then (r 0, r 1). For example:

$$\displaystyle \begin{aligned} (q_{0}, q_{1})^{2}\cdot (r_{0}, r_{1})~~=~~(1, 0) {} \end{aligned} $$
(3.141)

gives for the inverse:

$$\displaystyle \begin{aligned} (r_{0}, r_{1})~~=~~\left(\frac{1}{q_{0}^{2}},~~\frac{-2q_{1}}{q_{0}^{3}} \right) {} \end{aligned} $$
(3.142)

3.7.6.3 Automatic Differentiation: The Application

Of course (q, 0) is just the real number q and we define the “real” and the “differential part”:

$$\displaystyle \begin{aligned}~~~q_{0} = {{R}}(q_{0},q_{1})~~~{\mathsf{and}}~~~q_{1} = {{D}}(q_{0},q_{1})\end{aligned} $$
(3.143)

For a function f(x) we have (without proof, see e.g. [21]):

$$\displaystyle \begin{aligned}~~~{{D}}[f(x + d)] = {{D}}[f((x, 0) + (0, 1))] = f'(x) \end{aligned} $$
(3.144)

Instead we demonstrate this with an example, using the function:

$$\displaystyle \begin{aligned} f(x) = x^{2} + \frac{1}{x} {} \end{aligned} $$
(3.145)

Using school calculus we have for the derivative:

$$\displaystyle \begin{aligned} f'(x) = 2x - \frac{1}{x^{2}}~~~{\mathsf{and~for~x~=~2~we~get:}}~~~ f(2) = \frac{9}{2},~~~ f'(2) = \frac{15}{4} \end{aligned} $$
(3.146)

We now apply Automatic Differentiation instead. For the variable x in (3.145) we substitute x → (x, 1) = (2, 1) and, using our rules:

$$\displaystyle \begin{aligned} \begin{array}{rcl} f[(2,1)] &\displaystyle = &\displaystyle (x,1)^{2} + (x,1)^{-1}~~=~~(2,1)^{2} + (2,1)^{-1}\\ &\displaystyle = &\displaystyle (4,4) + (\frac{1}{2}, -\frac{1}{4})~=~(\frac{9}{2},~\frac{15}{4})~=~(f(2),~f'(2))~ \end{array} \end{aligned} $$

we arrive at a vector containing the function value and its derivative at x = 2. The computation of derivatives becomes an algebraic problem without the need for small numbers: no numerical difficulties arise and the differential is exact.
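The rules (3.133)–(3.135) and the inverse (3.139) amount to a few lines of operator overloading; a minimal Python sketch reproducing the example above (class and method names are ours):

```python
class Dual:
    """A pair (q0, q1) obeying Eqs. (3.133)-(3.135); q1 carries the derivative."""
    def __init__(self, q0, q1=0.0):
        self.q0, self.q1 = q0, q1

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.q0 + other.q0, self.q1 + other.q1)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.q0*other.q0, self.q0*other.q1 + self.q1*other.q0)
    __rmul__ = __mul__

    def inv(self):
        # Eq. (3.139): (q0, q1)^{-1} = (1/q0, -q1/q0^2)
        return Dual(1.0/self.q0, -self.q1/self.q0**2)

x = Dual(2.0, 1.0)        # substitute x -> (x, 1) = (2, 1)
f = x*x + x.inv()         # f(x) = x^2 + 1/x
# f.q0 = f(2) = 4.5 and f.q1 = f'(2) = 15/4 = 3.75, as in the text
```

Every arithmetic operation carries the derivative along exactly, which is the essence of automatic differentiation.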

3.7.6.4 Automatic Differentiation: Higher Orders

To obtain higher orders, we need higher derivatives, i.e. a larger dimension for our vectors:

  1.

    The pair (q 0, q 1) becomes a vector of length N + 1, with equivalent rules:

    $$\displaystyle \begin{aligned}(q_{0}, 1)~~\Rightarrow~~(q_{0}, 1, 0, 0, \ldots,0)\end{aligned} $$
    (3.147)
    $$\displaystyle \begin{aligned}(q_{0}, q_{1}, q_{2}, \ldots q_{N})~{{+}}~(r_{0}, r_{1}, r_{2}, \ldots r_{N}) = (s_{0}, s_{1}, s_{2}, \ldots s_{N})\end{aligned} $$
    (3.148)
    $$\displaystyle \begin{aligned}c~{{\cdot}}~(q_{0}, q_{1}, q_{2}, \ldots q_{N})~=~(c\cdot q_{0},c\cdot q_{1},c\cdot q_{2}, \ldots c\cdot q_{N})\end{aligned} $$
    (3.149)
    $$\displaystyle \begin{aligned}(q_{0}, q_{1}, q_{2}, \ldots q_{N})~{{\cdot}}~(r_{0}, r_{1}, r_{2}, \ldots r_{N})~=~(s_{0}, s_{1}, s_{2}, \ldots s_{N})\end{aligned} $$
    (3.150)

    with:

    $$\displaystyle \begin{aligned}s_{i} = \sum_{k=0}^{i} \frac{i!}{k! (i-k)!} q_{k} r_{i-k}\end{aligned} $$
    (3.151)

    If we had started with:

    $$\displaystyle \begin{aligned}~~~x = (a,1,0,0,0\ldots ) \end{aligned} $$
    (3.152)

    we would get:

    $$\displaystyle \begin{aligned}~~~f(x) = (~f(a),~ f^{\prime}(a),~ f^{\prime\prime}(a),~ f^{\prime\prime\prime}(a), \ldots~ f^{(n)}(a)~) \end{aligned} $$
    (3.153)

    Some special cases are:

    $$\displaystyle \begin{aligned} (x, 0, 0, 0, ..)^{n}~=~(x^{n}, 0, 0, 0, ..) {} \end{aligned} $$
    (3.154)
    $$\displaystyle \begin{aligned} (0, 1, 0, 0, ..)^{n}~=~(0, 0, 0, ..,{\overbrace{n!}^{n+1}}, 0, 0, ..) {} \end{aligned} $$
    (3.155)
    $$\displaystyle \begin{aligned} (x, 1, 0, 0, ..)^{2}~=~(x^{2}, 2x, 2, 0, ..) {} \end{aligned} $$
    (3.156)
    $$\displaystyle \begin{aligned} (x, 1, 0, 0, ..)^{3}~=~(x^{3}, 3x^{2}, 6x, 6, ..) {} \end{aligned} $$
    (3.157)

    As another exercise one can consider the function f(x)  =  x −3

    \(f(x)~~\rightarrow~~f(x, 1, 0, 0, ..)~~=~~(x, 1, 0, 0, ..)^{-3}~~=~~(f_{0}, f^{\prime}, f^{\prime\prime}, f^{\prime\prime\prime}, ..)\)

    Multiplying both sides with (x, 1, 0, 0, ..)³ gives:

    (1, 0, 0, ..)  =  (x, 1, 0, 0, ..)³ ⋅ (f₀, f′, f″, f‴, …)

    (1, 0, 0, ..)  =  (x³, 3x², 6x, 6, ..) ⋅ (f₀, f′, f″, f‴, …) using (3.157)

    This can easily be solved by forward substitution:

    $$\displaystyle \begin{aligned} \begin{array}{lll} 1~=~&x^{3}\cdot f_{0}~~~~~&\rightarrow~~~f_{0}~=~x^{-3} \\ 0~=~&3x^{2}\cdot f_{0}~~+~~x^{3}\cdot f^{\prime}~~~~~&\rightarrow~~~f^{\prime}~=~-3x^{-4}\\ 0~=~&6x\cdot f_{0}~~+~~2\cdot 3x^{2}\cdot f^{\prime}~~+~~{x^{3}}\cdot f^{\prime\prime}~~~~~&\rightarrow~~~f^{\prime\prime}~=~{12}{x^{-5}}\\ .... \end{array} \end{aligned} $$
    (3.158)

    Using the same procedure for f(x)  =  x −1 one obtains:

    $$\displaystyle \begin{aligned} (x, 1, 0, 0, ..)^{-1}~=~(\frac{1}{x}, -\frac{1}{x^{2}}, \frac{2}{x^{3}}, \ldots ) {} \end{aligned} $$
    (3.159)

    For the function we have used before (3.145):

    $$\displaystyle \begin{aligned} f(x) = x^{2} + \frac{1}{x} \end{aligned} $$
    (3.160)

    and using (adding!) the expressions (3.156) and (3.159) one has:

    $$\displaystyle \begin{aligned} (f_{0}, f^{\prime}, f^{\prime\prime}, f^{\prime\prime\prime})~~=~~(x^{2}~+~\frac{1}{x},~~~2x~-~\frac{1}{x^{2}},~~ 2~+~\frac{2}{x^{3}},~~ ..) \end{aligned} $$
    (3.161)
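The higher-order rules can be implemented just as directly: the sketch below stores the vector (f, f′, f″, f‴), multiplies with the binomial rule (3.151) and obtains reciprocals by the forward substitution of (3.158); the helper names are ours.

```python
from math import comb

def mul(q, r):
    """Product of two derivative vectors, Eq. (3.151)."""
    return [sum(comb(i, k)*q[k]*r[i - k] for k in range(i + 1))
            for i in range(len(q))]

def recip(q):
    """Solve q * r = (1, 0, 0, ...) by forward substitution, cf. Eq. (3.158)."""
    n = len(q)
    r = [0.0]*n
    r[0] = 1.0/q[0]
    for i in range(1, n):
        r[i] = -sum(comb(i, k)*q[k]*r[i - k] for k in range(1, i + 1))/q[0]
    return r

x = [2.0, 1.0, 0.0, 0.0]                            # the variable x at a = 2
f = [a + b for a, b in zip(mul(x, x), recip(x))]    # f(x) = x^2 + 1/x
# f = [f(2), f'(2), f''(2), f'''(2)] = [4.5, 3.75, 2.25, -0.375]
```

The result contains the derivatives of (3.161) evaluated at x = 2, obtained purely algebraically.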

3.7.6.5 Automatic Differentiation: More Variables

The scheme can be extended to more variables x, y and a function f(x, y):

$$\displaystyle \begin{aligned}~~~x = (a,1,0,0,0\ldots ) \end{aligned} $$
(3.162)
$$\displaystyle \begin{aligned}~~~y = (b,0,1,0,0\ldots ) \end{aligned} $$
(3.163)

and get (with more complicated multiplication rules):

$$\displaystyle \begin{aligned}~~f(x+dx,~ y+dy) = \left(f,~ \frac{\partial f}{\partial x},~ \frac{\partial f}{\partial y},~ \frac{\partial^{2} f}{\partial x^{2}},~ \frac{\partial^{2} f}{\partial x \partial y},\ldots \right)(x, y) \end{aligned} $$
(3.164)

3.7.6.6 Differential Algebra: Applications to Accelerators

Of course it is not the purpose of these tools to compute analytical expressions for the derivatives; the examples were only used to demonstrate the techniques. The application of these techniques (i.e. Truncated Power Series Algebra [6, 21]) is schematically shown in Fig. 3.9. Given an algorithm, which may be a complex simulation program with several thousands of lines of code, we can use these techniques to “teach” the code how to compute the derivatives automatically.

Fig. 3.9

Schematic view of application of Truncated Power Series Algebra

When we push x = (a, 1, 0, 0, 0…) through the algorithm f, using our rules, we get all derivatives around a, i.e. we get the Taylor coefficients and can construct the map!

What is needed is to replace the standard operations performed by the computer on real numbers by the algebra defined above. The maps are provided with the desired accuracy and to any order.

Given a Taylor series to high accuracy, the desired information about the stability of the system, its global behaviour and the optical parameters can be derived more easily. It should be stressed again that the origin is the underlying tracking code, just acting on different data types with different operations.

3.7.6.7 Differential Algebra: Simple Example

A simple example is shown below where the original “tracking code” is shown in the left column (DATEST1) and the corresponding modified code in the right column (DATEST2). The operation is rather trivial to demonstrate the procedure more easily. The code is written in standard FORTRAN 95, which allows operator overloading, but an object oriented language such as C++ or Python is obviously well suited for this purpose. Standard FORTRAN 95 is however more flexible in overloading arbitrary operations. The DA-package used for demonstration only is loaded by the command use my_own_da in the code. To make the program perform the wanted operation we have to make two small modifications:

  1.

    Replace the types real by the type my_taylor (defined in the package).

  2.

    Add the “differential unit” (0, 1), i.e. the monomial in this implementation, to the variable.

Running these two programs we get the results in the two columns below. In the left column we get the expected result from the real calculation of the expression \(\sin {}(\pi /6)\) = 0.5, while in the right column we get additional numbers, sorted according to the array index.

The inspection shows that these numbers are the coefficients of the Taylor expansion of \(\sin {}(x)\) around x  =  π∕6:

$$\displaystyle \begin{aligned} \sin{}(\frac{\pi}{6} + \Delta x) = {{\sin{}(\frac{\pi}{6})}} + {{\cos{}(\frac{\pi}{6})}}\Delta x^{1} - {{\frac{1}{2}\sin{}(\frac{\pi}{6})}}\Delta x^{2} - {{\frac{1}{6}\cos{}(\frac{\pi}{6})}}\Delta x^{3} \end{aligned} $$
(3.165)

We have indeed obtained the derivatives of our “algorithm” through the tracking code.
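The result (3.165) can be reproduced with the same machinery: applying sin to the DA vector of the variable at π∕6 yields its derivatives, and after division by n! the Taylor coefficients. A self-contained Python sketch (composing sin through its own Taylor series around the real part is an implementation choice of ours):

```python
import math
from math import comb

def mul(q, r):
    """Product of two derivative vectors, Eq. (3.151)."""
    return [sum(comb(i, k)*q[k]*r[i - k] for k in range(i + 1))
            for i in range(len(q))]

def da_sin(g):
    """sin of a DA vector, composed via the Taylor series of sin around g[0]."""
    n, a = len(g), g[0]
    dg = [0.0] + list(g[1:])                 # g minus its real part
    derivs = [math.sin(a), math.cos(a), -math.sin(a), -math.cos(a)]
    out = [0.0]*n
    term = [1.0] + [0.0]*(n - 1)             # (g - g0)^k, starting at k = 0
    for k in range(n):
        out = [o + derivs[k % 4]/math.factorial(k)*t for o, t in zip(out, term)]
        term = mul(term, dg)
    return out

x = [math.pi/6, 1.0, 0.0, 0.0]               # the DA variable at pi/6
s = da_sin(x)                                # (sin, cos, -sin, -cos) at pi/6
```

Dividing s[n] by n! gives exactly the coefficients appearing in (3.165).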

Some examples related to the analysis of accelerator physics lattices follow.

In example 1 a lattice with 8 FODO cells is constructed and the quadrupole is implemented as a thin lens “kick” in the center of the element. Note that the example is implemented in the horizontal and the longitudinal planes. For the second example an octupole kick is added to demonstrate the correct computation of the non-linear effect, i.e. the detuning with amplitude.

The procedure is:

  1.

    Track through the lattice and get Taylor coefficients

  2.

    Produce a map from the coefficients

  3.

    Perform a Normal Form Analysis on the map


Defined assignments:

  • M = z, constructs a map M using the coefficients z

  • NORMAL = M, computes a normal form NORMAL using the map M

  • In FORTRAN 95 derived types play the role of “structures” in C, and NORMAL contains:

  • NORMAL%tune is the tune Q

  • NORMAL%dtune_da is the detuning with amplitude \(\frac {dQ}{da}\)

  • NORMAL%R, NORMAL%A, NORMAL%A**-1 are the matrices

  • such that: M = A R A−1

  • from the normal form transformation one obtains α, β, γ

A comparison is shown in Fig. 3.10 between the lattice function (e.g. β) obtained with this procedure and the optical functions from the corresponding MAD-X output.

Fig. 3.10

Comparison: β-function from the model and the corresponding result from MAD-X

One finds β max  ≈ 300 m and β min  ≈ 170 m, in perfect agreement.

3.8 Beam Dynamics with Non-linearities

Following the overview of the evaluation and analysis tools, it is now possible to analyse and classify the behaviour of particles in the presence of non-linearities. The tools presented beforehand allow a better physical insight into the mechanisms leading to the various phenomena, the most important ones being:

  • Amplitude detuning

  • Excitation of non-linear resonances

  • Reduction of dynamic aperture and chaotic behaviour.

This list is necessarily incomplete but will serve to demonstrate the most important aspects.

To demonstrate these aspects, we take a realistic case and show how the effects emerge automatically.

3.8.1 Amplitude Detuning

It was discussed in Sect. 3.7.5 that the one-turn-map can be transformed into a simpler map where the rotation is separated. A consequence of the non-linearities is that, in order to perform this transformation, the rotation frequency must become amplitude dependent. The amplitude detuning is therefore directly obtained from this normal form transformation.

3.8.1.1 Amplitude Detuning due to Non-linearities in Machine Elements

Non-linear elements cause an amplitude dependent phase advance. The computational procedure to derive this detuning was demonstrated in the discussion of normal form transformations in the case of an octupole, Eqs. (3.125) and (3.129). This formalism is valid for any non-linear element.

Numerous other examples can be found in [6] and [5].

3.8.1.2 Amplitude Detuning due to Beam–Beam Effects

For the demonstration we use the example of a beam-beam interaction because it is a very complex non-linear problem and of large practical importance [7, 22].

In this simplest case of one beam-beam interaction we can factorize the machine into a linear transfer map \(e^{:f_{2}:}\) and the beam-beam interaction e :F:, i.e.:

$$\displaystyle \begin{aligned} \begin{array}{rcl} e^{\textstyle{:f_{2}:}}~\cdot~e^{\textstyle{:F:}}~=~e^{\textstyle{:{{h}}:}} \end{array} \end{aligned} $$
(3.166)

with

$$\displaystyle \begin{aligned} \begin{array}{rcl} f_{2}~=~ -\frac{\mu}{2} ( \frac{x^{2}}{\beta} + \beta p^{2}_{x}) \end{array} \end{aligned} $$
(3.167)

where μ is the overall phase, i.e. the tune Q multiplied by 2π, and β is the β-function at the interaction point. We assume the waist of the β-function at the collision point (α = 0). The function F(x) corresponds to the beam-beam potential (3.87):

$$\displaystyle \begin{aligned} \begin{array}{rcl} F(x) = \displaystyle{\int_{0}^{x}} f(u) {\mathrm{d}}u \end{array} \end{aligned} $$
(3.168)

For a round Gaussian beam we use for f(x) the well known expression:

$$\displaystyle \begin{aligned} \begin{array}{rcl} f(x) = \frac{2 N r_{0}}{\gamma x} ( 1 - e^{\textstyle{\frac{-x^{2}}{2\sigma^{2}}}}) \end{array} \end{aligned} $$
(3.169)

Here N is the number of particles per bunch, r 0 the classical particle radius, γ the relativistic parameter and σ the transverse beam size.

For the analysis we examine the invariant h which determines the one-turn-map (OTM) written as a Lie transformation e :h:. The invariant h is the effective Hamiltonian for this problem.

As usual we transform to action and angle variables J and Φ, related to the variables x and p x through the transformations:

$$\displaystyle \begin{aligned} \begin{array}{rcl} x = \sqrt{2J\beta} \sin\Phi, ~~~~~p_{x} = \sqrt{\frac{2J}{\beta}}\cos\Phi \end{array} \end{aligned} $$
(3.170)

With this transformation we get a simple representation for the linear transfer map f 2:

$$\displaystyle \begin{aligned} \begin{array}{rcl} f_{2} = -\mu J \end{array} \end{aligned} $$
(3.171)

The function F(x) we write as Fourier series:

$$\displaystyle \begin{aligned} \begin{array}{rcl} F(x) \Rightarrow \sum_{n=-\infty}^{\infty} c_{n}(J) e^{\textstyle{in\Phi}}~~ {\mathsf{with}}~~ c_n(J)~=~\frac{1}{2\pi}\int_{0}^{2\pi} e^{\textstyle{-in\Phi}} F(x) {\mathrm{d}}\Phi\\ {} \end{array} \end{aligned} $$
(3.172)

For the evaluation of (3.172) see [7]. We take some useful properties of Lie operators (e.g. [6, 7]):

$$\displaystyle \begin{aligned} \begin{array}{rcl} :f_{2}:g(J) = 0,~~~~~:f_{2}:e^{\textstyle{in\Phi}} = in \mu e^{\textstyle{in\Phi}},~~~~~g(:f_{2}:)e^{\textstyle{in \Phi}} = g(in \mu) e^{in\Phi}\\ \end{array} \end{aligned} $$
(3.173)

and the CBH-formula for the concatenation of the maps (3.92):

$$\displaystyle \begin{aligned} \begin{array}{rcl} e^{\textstyle{:f_{2}:}}~e^{\textstyle{:F:}}~=~e^{\textstyle{:h:}}~=~\mathrm{exp}\left[:f_{2} + \left( \frac{:f_{2}:}{1 - e^{\textstyle{-:f_{2}:}}}\right) F + {{O}}(F^{2}): \right]\\ \end{array} \end{aligned} $$
(3.174)

which gives immediately for h:

$$\displaystyle \begin{aligned} \begin{array}{rcl} h = -\mu J + \sum_{n} c_{n}(J) \frac{i n \mu}{1 - e^{\textstyle{-i n \mu}}} e^{\textstyle{in\Phi}} = -\mu J + \sum_{n} c_{n}(J) \frac{n \mu}{2 \sin{}(\frac{n \mu}{2})} e^{\textstyle{(in\Phi + i\frac{n \mu}{2})}}\\ {} \end{array} \end{aligned} $$
(3.175)

Equation (3.175) is the beam-beam perturbed invariant to first order in the perturbation using (3.92).

From (3.175) we observe that for \(\textstyle {\nu ~= \frac {\mu }{2\pi } = \frac {p}{n}}\) resonances appear for all integers p and n when c n(J) ≠ 0.

Away from resonances a normal form transformation gives:

$$\displaystyle \begin{aligned} \begin{array}{rcl} h~=~-\mu J + c_{0}(J)~=~\mathrm{const.} \end{array} \end{aligned} $$
(3.176)

and the oscillating term disappears. The first term is the linear rotation and the second term gives the amplitude dependent tune shift (see (3.114)):

$$\displaystyle \begin{aligned} \begin{array}{rcl} \Delta \mu(J) = -\frac{1}{2\pi}\frac{dc_{0}(J)}{dJ} \end{array} \end{aligned} $$
(3.177)

The computation of this tuneshift from the equation above can be found in the literature [7, 23].
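For a quick numerical look, one can differentiate (3.172) for n = 0 under the integral, dc 0∕dJ = ⟨f(x) x∕(2J)⟩, and evaluate (3.177) directly for the round-beam kick (3.169). In the sketch below all parameter values are hypothetical illustration numbers; with the sign conventions used here the small-amplitude limit of the detuning is minus the linear beam-beam parameter ξ.

```python
import numpy as np

# hypothetical round-beam parameters (illustration only)
N, r0, beta, gamma, sigma = 1.0e11, 1.535e-18, 1.0, 7000.0, 1.0e-3
xi = N*r0*beta/(4*np.pi*gamma*sigma**2)    # linear beam-beam parameter

def f(x):
    """Round-beam beam-beam kick, Eq. (3.169)."""
    return 2*N*r0/(gamma*x)*(1 - np.exp(-x**2/(2*sigma**2)))

def dQ(J):
    """Detuning (3.177), with dc0/dJ evaluated as the angle average of f(x) x/(2J)."""
    phi = np.linspace(0.0, 2*np.pi, 4001)[:-1] + 1e-6   # small offset avoids x = 0
    x = np.sqrt(2*J*beta)*np.sin(phi)
    return -np.mean(f(x)*x/(2*J))/(2*np.pi)
```

At small amplitude dQ approaches −ξ, while at large amplitude the detuning falls off, as expected for the beam-beam force.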

3.8.1.3 Phase Space Structure

To demonstrate how this technique can be used to reconstruct the phase space structure in the presence of non-linearities, we continue with the very non-linear problem of the beam-beam interaction treated above. To test our result, we compare the invariant h to the results of a particle tracking program.

The model we use in the program is rather simple:

  • linear transfer between interactions

  • beam-beam kick for round beams

  • compute action \(\textstyle {J~=~\frac {\beta ^{*}}{2 \sigma ^{2}}(\frac {x^{2}}{\beta ^{*}} + p_{x}^{2}\beta ^{*})}\)

  • compute phase \(\Phi ~=~{\mathsf {arctan}}(\frac {p_{x}}{x})\)

  • compare J with h as a function of the phase Φ

The evaluation of the invariant (3.175) is done numerically with Mathematica. The comparison between the tracking results and the invariant h from the analytical calculation is shown in Fig. 3.11 in the (J,Φ) space. One interaction point is used in this comparison and the particles are tracked for 1024 turns. The symbols are the results from the tracking and the solid lines are the invariants computed as above. The two figures are computed for amplitudes of 5 σ and 10 σ. The agreement between the models is excellent. The analytic calculation was done up to the order N = 40. With a lower order, the analytic model reproduces the envelope of the tracking results, but not the details. The results can easily be generalized to more interaction points [22]. Close to resonances these tools can reproduce the envelope of the phase space structure [22].
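The tracking model listed above fits in a few lines. The following is a hypothetical 1D sketch (normalised coordinates x in units of σ, assumed beam-beam parameter XI and tune), not the actual code behind Fig. 3.11:

```python
import numpy as np

XI = 0.005              # beam-beam parameter (assumed value)
MU = 2 * np.pi * 0.31   # linear phase advance per turn (assumed tune 0.31)

def beam_beam_kick(x):
    """Round-beam kick in normalised units: dp = -8*pi*XI*(1 - exp(-x^2/2))/x."""
    if abs(x) < 1e-12:
        return 0.0
    return -8.0 * np.pi * XI * (1.0 - np.exp(-x**2 / 2.0)) / x

def track(x0, px0, n_turns=1024):
    """One linear rotation through MU followed by the beam-beam kick per turn.
    Records action J = (x^2 + px^2)/2 and phase Phi = arctan(px/x) each turn."""
    c, s = np.cos(MU), np.sin(MU)
    x, px = x0, px0
    J, Phi = [], []
    for _ in range(n_turns):
        x, px = c * x + s * px, -s * x + c * px   # linear transfer
        px += beam_beam_kick(x)                   # thin-lens beam-beam kick
        J.append(0.5 * (x**2 + px**2))
        Phi.append(np.arctan2(px, x))
    return np.array(J), np.array(Phi)
```

Plotting the returned (Φ, J) pairs against the analytic invariant h then reproduces a comparison of the type shown in Fig. 3.11.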

Fig. 3.11

Comparison of the numerical and the analytical model for one interaction point, shown for \(5\sigma_{x}\) (left) and \(10\sigma_{x}\) (right). Full symbols: numerical model; solid lines: invariant (3.175)

3.8.2 Non-linear Resonances

Non-linear resonances can be excited in the presence of non-linear fields and play a vital role for the long term stability of the particles.

3.8.2.1 Resonance Condition in One Dimension

For the special case of the beam-beam perturbed invariant (3.175) we have seen that the expansion (3.175) diverges when the resonance condition for the phase advance is fulfilled, i.e.:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \nu~= \frac{\mu}{2\pi} = \frac{p}{n} \end{array} \end{aligned} $$
(3.178)
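The divergence can be seen directly in the resonance denominator of (3.175), \(n\mu/(2\sin(n\mu/2))\), which blows up as the tune approaches p/n. A small numerical check (the tunes used are illustrative only):

```python
import numpy as np

def resonance_factor(n, nu):
    """Amplitude factor n*mu/(2 sin(n*mu/2)) from (3.175), with mu = 2*pi*nu.
    Diverges when nu approaches a rational p/n (resonance condition 3.178)."""
    mu = 2.0 * np.pi * nu
    return n * mu / (2.0 * np.sin(n * mu / 2.0))
```

For n = 3 the factor is modest at ν = 0.31 but grows without bound as ν → 1∕3.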

The formal treatment would require using the n-turn map with the n-turn effective Hamiltonian, or other techniques; this is beyond the scope of this handbook and can be found in the literature [6, 7]. In this section we discuss the consequences of resonant behaviour and possible applications.

3.8.2.2 Driving Terms

The treatment of the resonance map is still not fully understood, and a standard treatment using first order perturbation theory leads to some wrong conclusions. In particular, it is often believed that a resonance cannot be excited unless a driving term for the resonance is explicitly present in the Hamiltonian, i.e. that the related map must contain the term for a resonance in leading order to reproduce the resonance. This regularly leads to the conclusion that 3rd order resonances are driven by sextupoles, 4th order resonances by octupoles, etc. This is only an artefact of perturbation theory that is not carried beyond leading order: a sextupole, for example, can potentially drive resonances of any order. Such a leading-order treatment is valid only for special operational conditions, such as resonant extraction, where strong resonant effects are well described by perturbation theory. A detailed discussion of this misconception is given in [6]. A correct evaluation must be carried out to the necessary orders, and the tools presented here make such a treatment easier.

3.8.3 Chromaticity and Chromaticity Correction

For reasons explained earlier, sextupoles are required to correct the chromaticities. In large machines and in particular in colliders with insertions, these sextupoles dominate over the non-linear effects of so-called linear elements.

3.8.4 Dynamic Aperture

Often in the context of the discussion of non-linear resonance phenomena the concept of dynamic aperture is introduced. It is the maximum stable oscillation amplitude in the transverse (x, y)-space in the presence of non-linear fields. It must be distinguished from the physical aperture of the vacuum chamber or other physical restrictions such as collimators.

One of the most important tasks in the analysis of non-linear effects is to address two problems:

  • Determination of the dynamic aperture

  • Maximisation of the dynamic aperture

The computation of the dynamic aperture is a very difficult task since, except for trivial cases, no mathematical methods are available to calculate it analytically. Following the concepts described earlier, the simulation tools are far more mature than the analytical theory. The standard approach is therefore to compute the dynamic aperture by numerical tracking of particles.

The same techniques can be employed to maximise the dynamic aperture, in the ideal case beyond the limits of the physical aperture. Usually one can define tolerances for the allowed multipole components of the magnets or the optimized parameters for colliding beams when the dominant non-linear effect comes from beam-beam interactions.

3.8.4.1 Long Term Stability and Chaotic Behaviour

In accelerators such as particle colliders, the beams have to remain stable for many hours and we may be asked to answer the question of stability for as many as \(10^{9}\) turns in the machine. This important question cannot be answered by perturbative techniques. In the discussion of the Poincaré surface-of-section we have seen the complexity of the phase space topology, and the final question is whether particles eventually reach the entire region of the available phase space.

It was proven by Kolmogorov, Arnol’d and Moser (KAM theorem) that for weakly perturbed systems invariant surfaces exist in the neighbourhood of integrable ones. Poincaré gave a first hint that stochastic behaviour may be generated in non-linear systems. In fact, higher order resonances change the topology of the phase space and lead to the formation of island chains on an increasingly fine scale. Satisfactory insight into the fine structure of the phase space can only be gained with numerical computation. Although the motion near resonances may be stochastic, the trajectories are constrained by nearby KAM surfaces (at least in one degree of freedom) and the motion remains confined.

3.8.4.2 Practical Implications

In numerical simulations where particles are tracked for millions of turns we would like to determine the region of stability, i.e. the dynamic aperture. Since we cannot track ad infinitum, we have to specify criteria to decide whether a particle is stable or not. A straightforward method is to test the particle amplitudes against well-defined apertures and declare a particle lost when the aperture is reached. A sufficient number of turns, usually determined by careful testing, is required with this method.

Usually this means finding the particle survival time as a function of the initial amplitude. In general the survival time decreases as the amplitude increases and should reach an asymptotic value at some amplitude, which can be identified as the dynamic aperture.
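Such a survival scan can be sketched with a toy model. The version below uses the 2D Hénon map (linear rotation plus a single sextupole kick), a standard test model rather than a real lattice; tune, loss boundary and turn budget are illustrative assumptions:

```python
import numpy as np

MU = 2 * np.pi * 0.2064   # assumed tune
APERTURE = 10.0           # loss boundary in normalised units (assumed)
MAX_TURNS = 10_000        # turn budget for the scan (assumed)

def survival_turns(x0):
    """Track the Henon map (sextupole kick + rotation) until the particle
    exceeds the aperture or survives the full turn budget."""
    c, s = np.cos(MU), np.sin(MU)
    x, p = x0, 0.0
    for turn in range(MAX_TURNS):
        x, p = c * x + s * (p + x * x), -s * x + c * (p + x * x)
        if x * x + p * p > APERTURE**2:
            return turn          # particle lost
    return MAX_TURNS             # survived the full turn budget

def dynamic_aperture(amplitudes):
    """Largest scanned initial amplitude surviving the full turn budget."""
    stable = [a for a in amplitudes if survival_turns(a) == MAX_TURNS]
    return max(stable) if stable else 0.0
```

In practice the turn budget and the amplitude grid must be refined until the estimated aperture no longer changes, in line with the careful testing mentioned above.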

Other methods rely on the assumption that a particle that is unstable in the long term exhibits features such as a certain amount of chaotic motion.

Typical methods to detect and quantify chaotic motion are:

  • Frequency Map Analysis [24, 25].

  • Lyapunov exponent [26].

  • Chirikov criterion [27].
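
As an example of the second item, the maximal Lyapunov exponent can be estimated with the twin-particle method: track two neighbouring initial conditions, periodically renormalise their separation, and average the logarithmic growth rate. The sketch below uses a Hénon-type toy map and illustrative parameters, not a specific implementation from [26]:

```python
import numpy as np

MU = 2 * np.pi * 0.2064   # assumed tune of the toy map

def step(x, p):
    """One turn of a Henon-type map: sextupole kick followed by rotation."""
    c, s = np.cos(MU), np.sin(MU)
    return c * x + s * (p + x * x), -s * x + c * (p + x * x)

def lyapunov(x0, p0, n_turns=5000, d0=1e-8):
    """Twin-particle estimate of the maximal Lyapunov exponent (per turn)."""
    x, p = x0, p0
    y, q = x0 + d0, p0          # companion particle, displaced by d0
    log_sum = 0.0
    for _ in range(n_turns):
        x, p = step(x, p)
        y, q = step(y, q)
        d = np.hypot(y - x, q - p)
        log_sum += np.log(d / d0)
        # renormalise the companion back to distance d0 along the separation
        y, q = x + (y - x) * d0 / d, p + (q - p) * d0 / d
    return log_sum / n_turns
```

For regular motion the estimate tends to zero with increasing turn number, while a persistently positive value signals chaotic motion; the finite-precision caveat below applies directly to the repeated renormalisation.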

In all cases care must be taken to avoid numerical problems due to the computation techniques when a simulation over many turns is performed.