6.1 Beam Optics and Lattice Design in High Energy Particle Accelerators

Lattice design, in the context we will describe it here, is the design and optimization of the principal elements—the lattice cells—of a circular accelerator, and it includes the dedicated variation of the accelerator elements (such as position and strength of the magnets in the machine) to obtain well defined and predictable parameters of the stored particle beam. It is therefore closely related to the theory of linear beam optics that has been described in Chap. 2 [1].

6.1.1 Geometry of the Ring

Both for the bending and for the focusing of a particle beam, magnetic fields are applied in an accelerator. In principle, electrostatic fields would also be possible, but at high momenta (i.e. if the particle velocity is close to the speed of light) the use of magnetic fields is much more efficient. In its most general form, the force acting on the particles is given by the Lorentz force

$$ \boldsymbol{F}=q\left(\boldsymbol{E}+\boldsymbol{v}\times \boldsymbol{B}\right) $$
(6.1)

In high energy accelerators, the velocity v is close to the speed of light and so represents a nice amplification factor whenever we apply a magnetic field. As a consequence, it is much more convenient to use magnetic fields for bending and focusing the particles. Neglecting therefore the E term in Eq. (6.1), the condition for a circular orbit is defined as the equality of the Lorentz force and the centrifugal force:

$$ qvB=\frac{m{v}^2}{\varrho } $$
(6.2)

In a constant transverse magnetic field B, the particle will see a constant deflecting force and the trajectory will be a part of a circle, whose bending radius ρ is determined by the particle momentum p = mv and the external B field.

$$ \uprho =\frac{p}{qB} $$
(6.3)

The product Bρ = p/q is called the beam rigidity. Inside each dipole magnet in a storage ring the bending angle—sketched out in Fig. 6.1—is given by the integrated field strength via

$$ \alpha =\frac{\int Bds}{B\uprho} $$
(6.4)

Requiring a bending angle of 2π for a full circle, we get the condition for the magnetic dipole fields in the ring. In the case of the LHC, for example, a momentum of p = 7000 GeV/c requires 1232 dipole magnets, each ~15 m long with a B-field of 8.3 T. As a general rule in high energy rings, about 66% (2/3) of the circumference of the machine should be reserved for dipole magnets, as they define the maximum particle momentum that can be carried by the machine. This basic dipole structure is completed with focusing elements, beam diagnostic tools etc. and forms the arcs of the ring. They are connected by long straight sections, so-called insertions, where the optics is modified to establish the conditions needed e.g. for particle injection or extraction and for the installation of the radio-frequency resonators for the particle acceleration. In the case of collider rings so-called mini-beta insertions are included, where the beam dimensions are reduced considerably to increase the particle collision rate and where space is needed for the installation of the particle detectors.
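
As a quick numerical cross-check of Eqs. (6.3) and (6.4), the short sketch below estimates the dipole field needed to keep a 7000 GeV/c proton beam on the LHC ring. The magnetic length of 14.3 m per dipole is an assumed, illustrative value (the ~15 m quoted above is the overall magnet length); this is a sketch, not a design calculation.

```python
# Rough estimate of the LHC arc dipole field from Eqs. (6.3)/(6.4).
# Assumed illustrative numbers: 1232 dipoles, 14.3 m magnetic length each.
import math

p_GeV = 7000.0          # proton momentum [GeV/c]
n_dip = 1232            # number of main dipoles
l_dip = 14.3            # assumed magnetic length per dipole [m]

# Beam rigidity B*rho [T m]; for a singly charged particle B*rho = p[GeV/c]/0.2998
Brho = p_GeV / 0.2998

# All dipoles together must bend by 2*pi, so the effective bending radius is
rho = n_dip * l_dip / (2.0 * math.pi)

B = Brho / rho
print(f"bending radius rho ~ {rho:.0f} m, dipole field B ~ {B:.2f} T")
# -> roughly 2800 m and 8.3 T, in line with the numbers quoted in the text
```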

Fig. 6.1
figure 1

B-field in a storage ring dipole magnet and schematic particle orbit

The lattice, and correspondingly the beam optics, is therefore split into characteristic parts: the arc structures, which guide the particle beam, define the geometry of the ring and establish a regular pattern of focusing elements; and the straight sections, which are optimised for the installation of a variety of technical devices, including the high-energy physics detectors.

Fig. 6.2
figure 2

Lattice of a high energy storage ring with periodic arc structures and straight sections

6.1.2 Lattice Design

An example of such a high-energy lattice and the corresponding beam optics is shown in Fig. 6.2. In the upper part of the figure the regular pattern of the beta function is plotted in red and green for the two transverse planes. As a consequence of the periodic structure of the lattice, the beta function—and so the beam size—reaches a maximum value in the centre of the focusing, and a minimum in the centre of the defocusing quadrupoles. The lower part of the figure shows the horizontal and vertical dispersion function. The lattice of the complete machine is designed on the basis of small periodic lattice structures—called cells—that repeat many times in the ring. One of the most widespread lattice cells used in high-energy rings is the so-called FODO cell: A magnet structure consisting of focusing and defocusing quadrupole lenses in alternating order. In between the focusing elements the dipole magnets are located and any other machine elements like orbit corrector dipoles, multipole correction coils or diagnostic instruments can be installed.

In Fig. 6.3 the optical solution of such a FODO cell is plotted: The graph shows the β-function in the two transverse planes (red curve for the horizontal, green curve for the vertical plane). In the lower part of the plot the position of the magnet lenses, the lattice, is indicated schematically. In first order the optical properties of such a lattice are determined only by the parameters of the focusing (F) and de-focusing (D) quadrupole lenses. In between these two quadrupole magnets only lattice elements are installed that have zero (“O”) or negligible influence on the transverse particle dynamics. Hence the acronym FODO for such a structure. Due to the symmetry of the cell the solution for the β function is periodic (in general such a FODO cell is the smallest periodic structure in a storage ring) and it reaches its extreme values at the position of the quadrupole lenses. As a consequence, at these locations in the arc, the beam will reach its maximum dimension \( \sigma =\sqrt{\varepsilon \beta} \), and the aperture need will be highest.

Fig. 6.3
figure 3

Basic element of a high-energy storage ring: the FODO cell

Accordingly, the “Twiss” parameter α, which is proportional to the derivative of β (α = −β′/2), is zero in the middle of the FODO quadrupoles, where β reaches its extreme values. Based on the thin lens approximation a number of scaling laws and rules can be established to understand the properties of such a FODO structure [2]: How do we arrange the strength and position of the quadrupole lenses in the lattice to obtain a certain beta-function? How does the cell length influence the phase advance of the particle trajectories? How do we guarantee that, turn by turn, a stable particle oscillation is obtained?

In the following we briefly summarise these rules.

  • Stability of the motion: the strengths of the focusing (and defocusing) elements in the lattice have to be such that the particle oscillation does not increase. This condition—the stability criterion for a periodic structure in a lattice—is obtained in a FODO if the focal length of the magnets is larger than a quarter of the cell length:

    $$ f=\frac{1}{kl}>\frac{L_{cell}}{4}. $$
    (6.5)
  • The beta function—and so the beam size—is determined by the phase advance of the cell and its length (a short numerical sketch of Eqs. (6.5)–(6.7) follows this list):

    $$ {\beta}_{\mathit{\max},\mathit{\min}}=\frac{1\pm \sin \left({\varphi}_{cell}/2\right)}{\sin {\varphi}_{cell}}{L}_{cell}. \vspace*{3pt}$$
    (6.6)
  • A similar scaling law is obtained for the dispersion:

    $$ {D}_{\mathit{\max},\mathit{\min}}=\frac{L_{cell}^2}{4\rho }\ \frac{1\pm \frac{1}{2}\sin \left({\varphi}_{cell}/2\right)}{\sin^2\left({\varphi}_{cell}/2\right)}. \vspace*{3pt}$$
    (6.7)
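
As a numerical illustration of the thin-lens scaling laws of Eqs. (6.5)–(6.7), the sketch below evaluates them for an LHC-like arc cell; the cell length, phase advance and bending radius are assumed, illustrative values and not taken from a design report.

```python
# Thin-lens FODO scaling laws, Eqs. (6.5)-(6.7), for an LHC-like arc cell
# (illustrative, assumed numbers: L_cell ~ 107 m, 90 deg per cell, rho ~ 2800 m).
import math

L_cell = 106.9                 # cell length [m] (assumed)
phi    = math.radians(90.0)    # phase advance per cell
rho    = 2804.0                # bending radius [m] (assumed)

# stability, Eq. (6.5): the quadrupole focal length must exceed L_cell/4
print(f"focal length must exceed {L_cell / 4.0:.1f} m for stable motion")

# Eq. (6.6): extreme values of the beta function
beta_max = (1 + math.sin(phi / 2)) / math.sin(phi) * L_cell
beta_min = (1 - math.sin(phi / 2)) / math.sin(phi) * L_cell

# Eq. (6.7): extreme values of the dispersion
D_max = L_cell**2 / (4 * rho) * (1 + 0.5 * math.sin(phi / 2)) / math.sin(phi / 2)**2
D_min = L_cell**2 / (4 * rho) * (1 - 0.5 * math.sin(phi / 2)) / math.sin(phi / 2)**2

print(f"beta_max ~ {beta_max:.0f} m, beta_min ~ {beta_min:.0f} m")
print(f"D_max    ~ {D_max:.2f} m, D_min    ~ {D_min:.2f} m")
```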

In general, small values for the β functions as well as for the dispersion are desired. It will be the intention of the lattice designer to minimise the beam size, and so to optimise the aperture need of the beam. In addition the β-function indicates the sensitivity of the beam with respect to external fields and field errors. A change in a quadrupole field e.g. will shift the tune of the beam by

$$ \varDelta Q=\frac{1}{4\pi}\int \varDelta k(s)\beta (s) ds. \vspace*{3pt}$$
(6.8)

The effect is proportional to the size of the applied change in quadrupole field, Δk, but also to the value of the beta function at this position. Therefore, the phase advance of the FODO cell has to be chosen to obtain smallest values for β in both transverse planes, which leads in the case of protons or heavy ions to an optimum phase advance of 90° per cell. It will be no surprise that the focusing structures of typical high energy proton rings like the SPS, Tevatron, HERA-p and LHC were optimised for this value.
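
A minimal numerical sketch of Eq. (6.8): for a single short gradient error the integral reduces to ΔQ ≈ Δk·β·l/(4π). All numbers below are assumed, purely illustrative values.

```python
# Tune shift from a single localized gradient error, Eq. (6.8).
# For a short quadrupole the integral reduces to dQ = dk * beta * l / (4*pi).
import math

dk   = 1.0e-4    # gradient error [1/m^2] (assumed)
beta = 180.0     # beta function at the error [m] (assumed)
l    = 3.0       # magnetic length of the quadrupole [m] (assumed)

dQ = dk * beta * l / (4.0 * math.pi)
print(f"tune shift dQ ~ {dQ:.1e}")   # ~4e-3: larger beta -> larger sensitivity
```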

In addition to the main building blocks, the dipoles and quadrupole magnets, the FODO will be equipped with a number of correction magnets for orbit correction, compensation of higher harmonic field errors of the main magnets, and sextupoles for chromaticity compensation of the machine. The FODO cell of LHC, including these corrector magnets is illustrated in Fig. 6.4.

Fig. 6.4
figure 4

FODO cell of LHC. In addition to the two main quadrupoles and six dipole magnets, diagnostic instruments and multipole compensation coils are included in the arc lattice

Six dipoles and two main quadrupoles form the basic structure of the cell; they are complemented by orbit correction dipoles, trim quadrupoles that are used for fine tuning of the working point, and multipole correction coils to compensate higher order field distortions up to 12-pole [3].

Among the higher order correction coils mentioned above the sextupoles play the most critical role in the arc structure, as they are indispensable to compensate the chromatic errors in the lattice. Chromaticity is an optical error that describes the distortion of the focusing properties in a lattice in the presence of momentum spread in the particle beam. In general a sextupole magnet will be installed to support each quadrupole in the arc. At least two sextupole families are required, one for each transverse plane. In some cases several families per plane are installed to improve the region of stability in the transverse plane (the so-called dynamic aperture of the storage ring). They have to be strong enough to correct the chromaticity created in the arc cells as well as in the insertion sections. The mechanism of chromaticity correction is based on the combination of the dispersion function, which sorts the particles according to their momentum, and the nonlinear field of a sextupole magnet:

$$ {B}_z=\frac{1}{2}\ \overset{\sim }{g}\left({x}^2-{z}^2\right), $$
(6.9)

where

$$ \overset{\sim }{g}=\frac{d^2{B}_z}{d{x}^2} $$
(6.10)

describes the sextupole “gradient”.

Normalizing the sextupole field to the beam rigidity we write the contribution of each sextupole to the chromaticity as

$$ \varDelta Q=\frac{1}{4\pi}\int {k}_{sext} D\beta dl $$
(6.11)

and it indeed depends on the values of both the beta function and the dispersion. Therefore the sextupole magnets that are needed to compensate the natural chromaticity in the ring will be located in the lattice at places where both the dispersion and the beta function are large, i.e. close to the corresponding quadrupole lenses.
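
Following the form of Eq. (6.11), the sketch below evaluates the chromatic contribution of a single thin sextupole; the normalized strength, dispersion, beta value and length are assumed, illustrative numbers only.

```python
# Chromatic contribution of one (thin) sextupole, following Eq. (6.11):
# dQ ~ k_sext * D * beta * l / (4*pi), per unit relative momentum deviation.
import math

k_sext = 0.10    # normalized sextupole strength [1/m^3] (assumed)
D      = 2.0     # dispersion at the sextupole [m] (assumed)
beta   = 180.0   # beta function at the sextupole [m] (assumed)
l      = 0.5     # sextupole length [m] (assumed)

dQ = k_sext * D * beta * l / (4.0 * math.pi)
print(f"chromatic contribution ~ {dQ:.2f} per sextupole (per unit dp/p)")
# placing the sextupoles where both D and beta are large maximizes the effect
# per unit strength, which is why they sit next to the arc quadrupoles
```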

6.2 Lattice Insertions

The arc structure of a storage ring is usually built out of regular patterns like FODO cells that are repeated periodically and determine the geometry of the machine. Straight sections are inserted to combine these arcs and provide the space required for beam injection, extraction, or dispersion free lattice parts to install e.g. RF systems. Finally space is needed to establish the conditions that are required for the collisions of the two counter rotating beams. As an example of the general layout of a storage ring we refer again to the LHC lattice. Eight straight sections connect eight arcs: four of them are used for the RF system, beam extraction (dump) and collimation, while the remaining four are optimised to house the high-energy detectors (IR1, 5, 2, 8 in Fig. 6.5), with beam injection located in IR2 and IR8. Here the storage ring lattice has to provide the free space needed for the installation of a large modern particle detector and the beam optics has to be modified to provide the strong focusing needed at the collision point.

Fig. 6.5
figure 5

Lattice geometry of the LHC

6.2.1 Low Beta Insertions

The most important “insertion” for a particle collider ring is the so-called mini beta structure: The key issue of a collider is its luminosity [4] that defines the rate of produced collision events (particles or particle reactions of interest) in the machine. Its value is defined by the machine lattice and under the assumption of equal beam properties in the two colliding beams it is given by the stored currents in the two beams, I p1, I p2, the revolution frequency f 0, the number of stored bunches, n b, and most of all by the transverse size of the two beams, \( {\sigma}_x^{\ast } \) and \( {\sigma}_y^{\ast } \). In the simplest case we get:

$$ L=\frac{1}{4\pi {e}^2{f}_0{n}_b}\ \frac{I_{p1}{I}_{p2}}{\sigma_x^{\ast }{\sigma}_y^{\ast }}. $$
(6.12)

A more general formula that includes geometric and optical reduction factors is presented in Sect. 6.4 [4]. At the interaction point “IP”, the intention of the lattice designer will be to reduce the beta function as much as possible in order to obtain the smallest possible beam. The main limiting factor comes from a basic principle which is valid for any system of particles under the influence of conservative forces (“Liouville’s Theorem”): under conservative forces, the phase space density of the particles is constant. Applied to a particle beam in an accelerator, this means that the beam dimension and divergence are not independent of each other. In particular, for a symmetric drift space in a storage ring we can deduce a rule for the beta function: starting from a waist (α = 0 at the collision point) the beta function develops as

$$ \beta (s)={\beta}^{\ast }+\frac{s^2}{\beta^{\ast }}. $$
(6.13)

The star refers to the value at the waist (e.g. the interaction point “IP”). This relation is a direct consequence of Liouville’s theorem and therefore of fundamental nature. As a consequence the behaviour of β in a symmetric drift cannot be changed, and it has a strong impact on the design of a storage ring: small beta functions at the collision point and a large distance to the first focusing element lead to high values of the beta function and correspondingly to large beam dimensions at the first focusing elements before and after the IP.
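
A small numerical illustration of Eq. (6.13), with assumed LHC-like values (β∗ = 0.55 m, first quadrupole at roughly 23 m from the IP, geometric emittance of about 5×10⁻¹⁰ m at 7 TeV); the numbers are illustrative assumptions, not taken from the text.

```python
# Growth of the beta function in the drift around the IP, Eq. (6.13),
# and the corresponding beam sizes sigma = sqrt(eps * beta).
import math

beta_star = 0.55       # beta function at the IP [m] (assumed)
s_quad    = 23.0       # distance from the IP to the first quadrupole [m] (assumed)
emit      = 5.0e-10    # geometric emittance [m rad] (assumed)

beta_quad  = beta_star + s_quad**2 / beta_star
sigma_ip   = math.sqrt(emit * beta_star)
sigma_quad = math.sqrt(emit * beta_quad)

print(f"beta at first quadrupole ~ {beta_quad:.0f} m")
print(f"beam size: {sigma_ip*1e6:.0f} um at the IP -> {sigma_quad*1e3:.2f} mm at the quadrupole")
```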

The preparation of the beam optics for the installation of modern high-energy detectors therefore needs special treatment in the lattice design to provide the large space needed for the detector hardware. An illustrative example is shown in Fig. 6.6: a long symmetric drift space that holds the experiment is centred around the interaction point of the colliding beams. Depending on the respective value of beta at the IP the beta functions increase in the horizontal (red) and vertical (green) plane and are focused back using a couple of strong, large aperture and high quality quadrupole lenses. Depending on the particular situation (namely the ratio of the two β values in the two planes) a quadrupole doublet or triplet arrangement will be the adequate choice for these mini beta quadrupoles. Additional independent quadrupole magnets (i.e. individually powered magnets) will be needed to create a smooth transition of the optics from the IP to the periodic solution of the FODO cells in the arc. In general eight parameters have to be optimised: the β and α values in the two planes, the dispersion and its derivative and the phase advance of the complete mini beta system. As a consequence such a mini beta insertion will have to be equipped with at least eight individually powered quadrupole magnets to fulfil this requirement.

Fig. 6.6
figure 6

Layout of a mini beta insertion scheme. The example shows a low beta insertion based on a quadrupole doublet. The vertical beta function (green line) starting with smaller values at the IP shows a stronger increase than the beta in the horizontal plane. Accordingly the doublet quadrupoles are powered in QD-QF polarity

It has been pointed out in the previous chapter that the emittance of a particle beam is not constant during acceleration but depends on the energy of the particle beam. In the case of a proton or ion beam the adiabatic shrinking is the dominant effect and the emittance follows the rule ε ∝ 1/βγ where β and γ are the relativistic parameters. As a consequence the emittance in a proton storage ring is highest at injection energy and the beam optics has to be optimised to limit the beta function at any place in the machine to values that guarantee sufficient aperture. At high energy (the so-called flat-top) the emittance is small enough that the mini beta concept can be used to full extent, and only here can β∗ be reduced to the small values that are required to deliver the design luminosity. The lattice of the mini beta insertion therefore has to be optimised in such a way that two quite different beam optics can be established by corresponding adjustment of the quadrupole gradients: a low energy optics for injection and the early steps of the acceleration, and a true mini beta optics that will be used for the collider run at high energy.
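
A short check of the ε ∝ 1/(βγ) scaling between the 450 GeV injection energy and the 7 TeV flat-top quoted in the text (proton mass taken as 0.938 GeV/c²; a minimal sketch, not a design calculation):

```python
# Adiabatic emittance shrinking, eps ~ 1/(beta*gamma), between the
# 450 GeV injection and the 7 TeV flat-top of the LHC.
m_p = 0.938272          # proton mass [GeV/c^2]

def beta_gamma(E_GeV):
    """Relativistic beta*gamma for a proton of total energy E."""
    gamma = E_GeV / m_p
    return (gamma**2 - 1.0) ** 0.5

shrink = beta_gamma(7000.0) / beta_gamma(450.0)
print(f"emittance shrinks by ~{shrink:.1f}, beam size by ~{shrink**0.5:.1f}")
# -> roughly a factor 15.6 in emittance, ~4 in beam size, which is why the
#    strong beta squeeze is only performed at flat-top energy
```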

The procedure to pass from the injection optics to the luminosity case is often called “beta squeeze” and is a critical operation, as optics, orbits and global beam parameters like tune and chromaticity have to be kept constant and well controlled while the quadrupole settings are changed. Several intermediate steps might be needed to guarantee a smooth transition between the two operation modes. In the case of the LHC the 450 GeV injection case and the 7 TeV luminosity optics are compared in Fig. 6.7.

Fig. 6.7
figure 7

(Left) Beam optics for the LHC: 450 GeV injection optics optimised for small values of beta to gain highest aperture in the machine. (Right) Low beta optics for the LHC luminosity operation: due to the small values at the IP the beta function reaches large values in the low beta quadrupole lenses. (Note the different scale of the vertical axis)

6.2.2 Injection and Extraction Insertions

In addition to the mini beta insertions where the beams are optimised for highest collision rates, additional insertions are needed in the storage ring for beam injection and extraction. In these cases the same rules apply as for the mini beta insertions, but in general the requirements are more relaxed. The additional hardware that has to be installed for the injection process (fast kicker magnets and septum dipoles to inject the new beam) is much smaller than the detectors at the collision points. Still, some modifications of the lattice will be needed and the optics will have to be re-matched to establish the required space. A special additional feature should be mentioned here: the new beam that is being injected has to match perfectly in energy and in phase space to the optical parameters of the storage ring or synchrotron. At the end of the beam transfer line as well as in the storage ring the focusing fields have to be optimised to obtain the same values of the Twiss functions α and β in both transverse planes. As in the case of the mini beta insertions, additional individually powered quadrupole magnets are needed. As an example the beam optics of the SPS-LHC transfer line is plotted in Fig. 6.8. At the beginning and the end of the lattice structure—indicated by red markers in the figure—the beta function is modified to match the optics from the SPS to the FODO channel of the transfer line and from the FODO to the LHC insertions at IR2 and IR8 where the injection elements are located.

Fig. 6.8
figure 8

Transfer line between the SPS and the LHC. Two matching sections have to be introduced to adapt the beam optics from the SPS to the transfer line and to the LHC

6.2.3 Dispersion Suppressors

The dispersion function D(s) has already been introduced in Sects. 2.4 and 6.1. It describes the trajectory in the case of a momentum deviation of the particle and is the consequence of the corresponding error in the bending strength of the dipole magnets. In the arc structure with its regular pattern of dipole magnets, dispersive effects cannot be avoided (but they should be minimised) and the additional amplitude due to the dispersion has to be considered if we are talking about particle trajectories or beam sizes. In linear approximation and for a small momentum spread Δp/p in the beam, the amplitude of a particle oscillation is obtained by

$$ x(s)={x}_{\beta }(s)+D(s)\frac{\Delta p}{p_0}, $$
(6.14)

where x β describes the solution of the homogeneous differential equation (the usual betatron oscillations of the particle) and the second term—the dispersion term—corresponds to the additional oscillation amplitude for particles with a relative momentum error Δp/p 0. At the interaction point, where the smallest beam sizes are required to obtain the highest luminosity, we intend to suppress the dispersion, and as the collision point is generally located in a straight section of the accelerator, techniques have been developed to obtain dispersion free sections inside the lattice. The insertions that are used to reduce the dispersion function from its periodic value in the arc to zero are called dispersion suppressors [2, 5, 6].
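
As a rough numerical illustration of Eq. (6.14), the sketch below compares the betatron and dispersive contributions to the beam size; all numbers are assumed, roughly injection-like arc values, chosen only to show that the dispersive term cannot be neglected.

```python
# Contribution of dispersion to the beam size, following Eq. (6.14).
import math

eps  = 7.5e-9    # geometric emittance [m rad] (assumed)
beta = 180.0     # beta function [m] (assumed)
D    = 2.0       # dispersion [m] (assumed)
dp_p = 3.0e-4    # rms relative momentum spread (assumed)

sigma_beta = math.sqrt(eps * beta)       # betatron beam size
sigma_disp = D * dp_p                    # dispersive contribution
sigma_tot  = math.hypot(sigma_beta, sigma_disp)

print(f"betatron: {sigma_beta*1e3:.2f} mm, dispersive: {sigma_disp*1e3:.2f} mm, "
      f"total: {sigma_tot*1e3:.2f} mm")
```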

It has to be mentioned in this context that especially in the case of synchrotron light sources a variety of lattice types has been developed with the goal of achieving small or even zero dispersion in the ring or in parts of it. However, these lattices are optimised for the production of highly brilliant synchrotron radiation and are not ideal for high-energy particle accelerators, where FODO cells are usually the most appropriate choice.

Referring to high energy colliders we will therefore concentrate on the interaction region, i.e. a straight section of a ring where two counter rotating beams collide in a dispersion free part of the storage ring. A non-vanishing dispersion dilutes the luminosity of the machine and leads to additional stop bands in the working diagram of the accelerator (“synchro-betatron resonances”), which are driven by the beam-beam interaction. Therefore sections are inserted in our magnet lattice that are designed to reduce the function D(s) to zero. Three main techniques are widely used: the quadrupole based dispersion suppressor, the missing bend scheme and the half bend scheme. We will not present all of them in detail but instead restrict ourselves to the basic idea behind them.

Fig. 6.9
figure 9

Periodic FODO and horizontal dispersion function in a regular FODO structure

6.2.3.1 The “Straightforward” Way: Dispersion Suppression Using Quadrupole Magnets

Let us assume here that a periodic lattice is given in the arc (see Fig. 6.2) and that this FODO structure simply is continued through the straight section—but with vanishing dispersion. Given an optical solution in the arc cells, as for example shown in Fig. 6.9, we have to guarantee that starting from the periodic solution of the optical parameters α(s), β(s) and D(s) we obtain a situation at the end of the suppressor where we get D(s) = D'(s) = 0 and the values for α and β unchanged.

The boundary conditions after the suppressor section

$$ D(s)={D}^{\prime }(s)=0, $$
$$ {\beta}_x(s)={\beta}_{x\ \mathrm{arc}},\kern1em {\alpha}_x(s)={\alpha}_{x\ \mathrm{arc}}, $$
(6.15)
$$ {\beta}_y(s)={\beta}_{y\ \mathrm{arc}},\kern1em {\alpha}_y(s)={\alpha}_{y\ \mathrm{arc}}, $$

can be fulfilled by introducing six additional quadrupole lenses whose strengths have to be matched individually in an adequate way. This can be done by using one of the beam optics codes that are available today in every accelerator laboratory. An example is shown in Fig. 6.10, starting from a FODO structure with a phase advance of φ ≈ 70° per cell.

Fig. 6.10
figure 10

Beam optics and horizontal dispersion function of a dispersion suppressor scheme based on individually powered quadrupole lenses

The advantages of this scheme are:

  • it works for any phase advance of the arc structure;

  • matching works also for different optical parameters α and β before and after the dispersion suppressor as—within a certain range—the quadrupoles can be used to match the Twiss functions to different values;

  • the ring geometry is unchanged as the number and location of dipole magnets in the ring is unchanged.

On the other hand there are a number of disadvantages that have to be mentioned:

  • as the strengths of the additional quadrupole magnets have to be matched individually, the scheme needs additional power supplies and quadrupole magnet types, which can be an expensive requirement;

  • the required quadrupole fields are in general stronger than in the arc;

  • the β function reaches higher values (sometimes really high values) which leads to higher beam sensitivity and larger aperture needs.

There are alternative ways to suppress the dispersion, which do not need individually powered quadrupole lenses but instead change the strength of the dipole magnets at the end of the arc structure.

6.2.3.2 The “Clever” Way: Half Bend Schemes

This dispersion suppressing scheme is made up of n additional FODO cells that are added to the periodic arc structure but where the bending strength of the dipole magnets is reduced. As before we split the lattice into three parts: the periodic structure of the FODO cells in the arc, the lattice insertion where the dispersion is suppressed, followed by a dispersion free section which can be another FODO structure without bending magnets or a mini beta insertion.

Starting from the dispersion free straight section, the basic idea of this scheme is to create—with a special arrangement of dipole magnets inside the dispersion suppressor—exactly the dispersion that corresponds to the periodic solution of the arc FODO cells. The solution will depend on the phase advance of the cells as well as on the strength of the bending magnets inside the suppressor cells.

As explained before in the beam optics chapter, the matrix for a periodic part of the lattice (namely one single cell in our case) can be expressed as

$$ {M}_{cell}=\left(\begin{array}{ccc}C& S& D\\ {}{C}^{\prime }& {S}^{\prime }& {D}^{\prime}\\ {}0& 0& 1\end{array}\right)=\left(\begin{array}{ccc}\cos {\phi}_c& {\beta}_c\sin {\phi}_c& D\\ {}-\frac{1}{\beta_c}\sin {\phi}_c& \cos {\phi}_c& {D}^{\prime}\\ {}0& 0& 1\end{array}\right), $$
(6.16)

where the index “c” reflects the solution of a cell, ϕ c denotes the phase advance for a single cell and the elements D and D' correspond to its periodic dispersion.

As usual the dispersion elements are obtained by

$$ D(l)=S(l)\underset{0}{\overset{l}{\int }}\frac{C\left(\overset{\sim }{s}\right)}{\varrho \left(\overset{\sim }{s}\right)}d\overset{\sim }{s}-C(l)\underset{0}{\overset{l}{\int }}\frac{S\left(\overset{\sim }{s}\right)}{\varrho \left(\overset{\sim }{s}\right)}d\overset{\sim }{s}. $$
(6.17)

The functions C(s) and S(s) are the cosine and sine like matrix elements of the lattice element in the sense that e.g. C(s) = M[1,1], and the integral is executed over one complete cell.

In the dispersion suppressor section, the dispersion D(s) starts with the value D 0 at the end of the arc cell and is reduced to zero. Or, turning it around and thinking from right to left: the dispersion has to be created inside the suppressor part by a proper arrangement of the dipole magnets, starting from D = D′ = 0 in the straight section, to reach the values that correspond to the periodic dispersion of the arc cells. Solving the equation above by integrating over a certain number of cells will determine the bending strength 1/ρ and the number n of cells in the suppressor part that are needed to fulfill the boundary condition and obtain the values of the dispersion in the following periodic arc cell.

For a given phase advance φ c per cell two conditions for the dispersion matching are obtained that combine the number of suppressor cells, n, and the strength of the suppressor dipoles, δ supr:

$$ \left.\begin{array}{c}2{\delta}_{\mathrm{supr}}{\sin}^2\left(\frac{n{\phi}_c}{2}\right)={\delta}_{\mathrm{arc}}\ \\ {}\sin \left(n{\phi}_c\right)=0\end{array}\right\}\ {\delta}_{\mathrm{supr}}=\frac{1}{2}{\delta}_{\mathrm{arc}}. $$
(6.18)

If the phase advance per cell in the arc fulfills the condition sin(nϕ c) = 0, the strength of the dipoles in the suppressor region is just half the strength of the arc dipoles. In other words the phase advance has to fulfill the condition

$$ n{\phi}_c= k\pi, \kern0.75em k=1,3,\dots . $$
(6.19)

There are a number of possible phase advances that fulfill this relation, but clearly not every arbitrary phase is allowed. Possible constellations would be, for example, ϕ c = 90° with n = 2 cells, or ϕ c = 60° with n = 3 cells in the suppressor.
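
A tiny sketch checking the half-bend condition of Eq. (6.19) for a few candidate phase advances (purely illustrative; the candidate values are assumptions, not taken from the text):

```python
# Check which FODO phase advances allow a half-bend dispersion suppressor:
# Eq. (6.19) requires n*phi_c = k*pi with k odd (k = 1, 3, ...).
for phi_deg in (45.0, 60.0, 72.0, 90.0):
    solutions = []
    for n in range(1, 7):
        ratio = n * phi_deg / 180.0          # n*phi_c in units of pi
        k = round(ratio)
        if abs(ratio - k) < 1e-9 and k % 2 == 1:
            solutions.append(n)
    if solutions:
        print(f"phi_c = {phi_deg:5.1f} deg -> n = {solutions[0]} half-bend cells")
    else:
        print(f"phi_c = {phi_deg:5.1f} deg -> no solution with n <= 6")
```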

Figure 6.11 shows such a half bend dispersion suppressor, starting from a FODO structure with 60° phase advance per cell. The focusing strengths of the FODO cells before and after the suppressor are identical, with the exception that—clearly—the FODO cells on the right are “empty”, i.e. they have no bending magnets.

Fig. 6.11
figure 11

Dispersion suppressor based on the half bend scheme

It is evident that, unlike in the suppressor scheme with quadrupole lenses, the beta function is now unchanged in the suppressor region.

Again this scheme has advantages:

  • no additional quadrupole lenses are needed and no individual power supplies;

  • in first order the β functions are unchanged; aperture needs and beam sensitivity are not increased;

and disadvantages:

  • it works only for certain values of the phase advance in the structure and therefore restricts the free choice of the optics in the arc;

  • special dipole magnets are needed (having half the strength of the arc types);

  • the geometry of the ring is changed.

It has to be mentioned here that in these equations the phase advance of the suppressor part is taken to be equal to that of the arc structure—which is not completely true, as the weak focusing term 1/ϱ² in the arc FODO differs from the term 1/(2ϱ)² in the half bend scheme. As, however, the impact of the weak focusing on the beam optics can be neglected in many practical cases, Eq. (6.18) is nearly correct.

The application of such a scheme is very elegant, but as it has a strong impact on the beam optics and geometry it has to be embedded in the accelerator design at an early stage.

6.2.3.3 The “Missing Bend” Dispersion Suppressor Scheme

A similar approach is used in the case of the missing bend dispersion suppressor: It consists of a number of n cells without dipole magnets at the end of the arc, followed by m cells that are identical to the arc cells. The matching condition for this missing bend scheme with respect to the phase advance is

$$ \frac{2n+m}{2}{\phi}_c=\left(2k+1\right)\frac{\pi }{2}. $$
(6.20)

For the number m of the required cells after the empty cells we get:

$$ \sin \frac{m{\phi}_c}{2}=\frac{1}{2},\kern1.5em k=0,2,\dots, \kern0.75em \mathrm{or}\kern0.75em \sin \frac{m{\phi}_c}{2}=-\frac{1}{2},\kern1em k=1,3,\dots . $$
(6.21)

The following example is based on ϕ c = 60°, where the conditions above are fulfilled for m = n = 1, Fig. 6.12.
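
A quick numerical check of the missing-bend matching conditions, Eqs. (6.20) and (6.21), for the ϕ c = 60° example quoted above (a minimal sketch):

```python
# Verify the missing-bend matching conditions, Eqs. (6.20) and (6.21),
# for phi_c = 60 deg with m = n = 1 and k = 0 (the example in the text).
import math

phi_c = math.radians(60.0)
n, m, k = 1, 1, 0        # empty cells, full-bend cells, resonance integer

cond_phase = math.isclose((2 * n + m) / 2.0 * phi_c, (2 * k + 1) * math.pi / 2.0)
cond_sin   = math.isclose(math.sin(m * phi_c / 2.0), 0.5)   # +1/2 branch, k even

print(f"Eq. (6.20) fulfilled: {cond_phase}, Eq. (6.21) fulfilled: {cond_sin}")
```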

Fig. 6.12
figure 12

Dispersion suppressor based on the missing bend scheme. The FODO cell without dipoles and the following standard cell are indicated by blue and green markers in the plot

There are more scenarios for a variety of phase relations in the arc and the corresponding bending strength needed to reduce D(s), see [2, 3]. In general, one will combine one of the two schemes (missing or half bend suppressor) with a certain number of individual quadrupole lenses to guarantee the flexibility of the system with respect to phase changes in the lattice and to keep the size of the beta-function moderate.

6.3 Injection and Extraction Techniques

Transfer of a beam between accelerators or onto external dumps, targets and measurement devices is a specialized topic and requires dedicated systems for injection and extraction [7], as well as beam transfer lines. Injection is the final process of the transfer of beam between one accelerator and another, either from a linear to a circular accelerator or between circular accelerators. Extraction is the removal of beam from an accelerator, either for the transfer to another accelerator or to deposit the beam on a target, dump or measurement system. Both injection and extraction systems need to be designed to transfer beam with minimum beam loss, to achieve the desired beam parameters and usually to minimize the dilution of the beam emittance.

Single-turn injection and extraction methods are rather straightforward for both lepton and hadron machines. They generally involve a kicker system to deflect the beam onto or away from the closed orbit, a septum (or series of septa for higher energy beams) to deflect the beam into or out of the accelerator aperture, and frequently also a closed orbit bump to approach the septum and reduce the required kicker strength. For these single-turn methods, the beam losses can be very low, and the emittance dilution associated with the injection or extraction can be very small, defined by the delivery precision, the optics mismatch, the kicker flat top ripple and septum stability. For both injection and extraction, the circulating beam can be adversely affected by septum stray fields penetrating into the circulating beam region and by the kicker field rise time which can overlap temporally with circulating bunches. Injecting a bunched beam into another accelerator also requires that the momentum spread and phase be matched to the RF bucket, and that the RF system can accept the transient beam loading which arises from the sudden change in beam intensity.

Multiple-turn injection is used to fill the circumference of a receiving accelerator and to accumulate bunch intensity. A wide variety of multiple-turn injection and extraction schemes exist, and these can be very different for lepton and hadron machines. Lepton injection schemes can take advantage of synchrotron radiation damping to achieve high beam brightness, while for hadron machines space charge effects dominate, especially at low energy. High brightness proton injection can make use of phase-space “painting” to precisely tailor the transverse and longitudinal distributions, particularly with H⁻ charge exchange injection or slip stacking; while resonant multiple-turn extraction schemes have been developed to provide quasi-continuous particle fluxes for periods which range from milliseconds to hours. The additional hardware systems required for these more advanced injection and extraction techniques include multiple RF systems, programmed fast closed-orbit bumps, stripping foils and non-linear lattice elements.

Overall, injection and extraction techniques share many similarities and hardware requirements [8]: one important difference between them is that extraction is usually at higher beam rigidity, which implies less effect from space charge and also stronger and hence longer deflecting systems, which can have a significant effect on lattice and insertion design [9,10,11].

6.3.1 Fast Injection

Fast injection [12,13,14] is typically used to fill another machine with bunch-to-bucket transfer, or to fill a collider over several injections with ‘boxcar’ stacking, where bunches or trains of bunches are added sequentially like boxcars (wagons) to a train. The system design depends critically on the aperture needed for the beam, and the kicker rise time, fall time and flat top duration. Very fast kicker rise times are often required to maximize the amount of beam which can be injected, especially in machines with small circumferences, since the kicker rise and fall times must be significantly shorter than the revolution time.

6.3.2 Slip-Stacking Injection

In slip-stacking [15], two trains of bunches are merged to increase the bunch intensity, using separate RF systems. A first train of bunches is injected on the closed orbit and captured by the first RF system. This train of bunches is then decelerated, and as a result circulates on a different orbit. A second batch is then injected on the closed orbit and captured by the second RF system. The two trains of bunches have slightly different energies and can be made to move relative to each other in phase. When the phase difference reaches zero, both sets of bunches are captured together and merged, by a rapid change of the RF frequency. The accelerator needs enough momentum aperture to accept both beams, and sophisticated RF control to make the manipulations. The final longitudinal emittance is the sum of the two individual emittances multiplied by an unavoidable blowup factor, typically around 1.5.

6.3.3 H⁻ Charge-Exchange Injection

High brightness, low energy proton machines frequently make use of H⁻ charge exchange injection [16]. In this technique, a linac accelerates H⁻ ions which are then merged with the circulating proton beam in a dipole magnet, Fig. 6.13, before the two loosely-attached electrons are stripped away in a foil which is almost transparent to the circulating beam.

Fig. 6.13
figure 13

Merging H⁻ and p⁺ beams in H⁻ charge exchange injection

This technique allows the accumulation of high brightness beams, since unlike other methods it allows injection into the already occupied phase space area. Transverse particle distributions can be controlled using phase space painting, to ameliorate space charge effects, reduce beam losses and increase accumulated intensity. The stripping is achieved with thin foils of carbon or diamond-like carbon, with a thickness typically in the micron range, which is a compromise between obtaining high stripping efficiency and minimizing the beam losses and emittance growth from scattering.

Fast painting bumpers or kickers in both planes are used to displace the circulating beam with respect to the foil, and the waveform of the bumper field can be varied to achieve the desired phase space density distribution. This is the process of phase space painting, where the small emittance linac beam is the brush and the large acceptance of the receiving machine is the canvas.

In addition to beam loss from scattering at the foil, another significant source of beam loss can be the field-stripping of excited H⁰ in the third chicane magnet. In ISIS [17] the injection is made on the ramp, and the dispersion at the foil provides some of the transverse phase space painting. For SNS [18], where the average beam power is over 1 MW, the uncontrolled beam losses must be kept extremely low and the accumulation is made over 1160 turns.

Fig. 6.14
figure 14

Betatron injection. The injected beam is mismatched and performs betatron oscillations until damped by emission of synchrotron radiation

The use of stripping foils is disadvantageous for several reasons, in particular the associated uncontrolled beam losses, but also due to the mechanical and radiological difficulties of handling such fragile objects. A foil-free method of H⁻ stripping, using a high-powered laser to resonantly excite neutral H⁰ before field stripping in a dipole, has been proposed and demonstrated in principle, and is promising for very high energy H⁻ injection systems [19].

6.3.4 Lepton Accumulation Injection

Injection of leptons can take advantage of the strong damping which is present from synchrotron radiation to accumulate intensity. This is very commonly used at Synchrotron Radiation rings, where top-up operation [20] consists of frequently injecting small amounts of beam to replace beam losses and keep the beam and synchrotron radiation intensities stable in a very small range.

In betatron injection, Fig. 6.14, the new bunch or train is injected with an orbit offset with respect to the circulating beam, which is moved towards the injection septum with a fast closed-orbit bump. The offset between the injected beam and the circulating beam must be large enough to accommodate the injection septum. The particles of the newly-injected bunches then perform damped betatron oscillations around the closed orbit, until they merge with the already circulating beam. This technique has the disadvantage that the betatron amplitude may be large in regions of the accelerator where the β-function is large. In the alternative synchrotron injection [21], Fig. 6.15, the new particles are injected with a momentum offset δp and a position offset X into a region with dispersion D, such that X = δp × D. The particles are injected onto the matched betatron orbit for their momentum, and thus only perform synchrotron oscillations around the stored particles, with the transverse offsets following the dispersion function. For LEP a combination of betatron and synchrotron injection was preferred [22], since the dispersion in the long straight sections was very small and the background to the experiments could be significantly improved.

Fig. 6.15
figure 15

Synchrotron injection. The injected beam has a momentum offset, and the injection trajectory is matched to the local dispersion orbit. The beam then performs oscillations about the closed orbit determined by the dispersion function, as the momentum changes with the synchrotron oscillations

6.3.5 Fast Extraction

Fast extraction is typically used to provide beam to a higher energy machine with bunch-to-bucket transfer. As for fast injection, the system design depends critically on the aperture needed for the beam, and the kicker rise time, fall time and flat top duration. Achieving fast kicker rise time with sufficient deflection angle at high beam rigidity is a common challenge, as is the design of the extraction insertion where the septum strength must be sufficient to provide enough clearance at the next downstream accelerator element. As beam energies increase, protection of the extraction septum and of other accelerator components from mis-steered beam becomes important; for the LHC beam extraction system at 7 TeV [23], the synchronization of the kicker system and the protection from asynchronous kicker firing are critical system design features. Closed orbit bumps can be used to move the beam closer to the septum, to reduce the required kick strength, Fig. 6.16.

Fig. 6.16
figure 16

Schematic of fast extraction system with kicker, septum and orbit bumpers. For higher energy machines, protection devices to intercept and dilute any mis-kicked extracted beam are placed in front of the septum and downstream QF quadrupole

6.3.6 Resonant Extraction

Many rate-limited applications such as physics experiments, test beams or medical treatment beams require a slow flux of particles with as uniform a time structure as possible. Resonant extraction using the third integer is the most common method of providing such uniform spills. In this ‘slow’ extraction [24], a triangular stable area in phase space (usually horizontal) is defined by exciting sextupole elements, and by moving the machine tune close to the third integer resonance. Before the start of the extraction process, particles remain stable if their single-particle emittances are smaller than the area of the stable triangle.

The beam is extracted by driving some particles unstable in a controlled way. The unstable particle amplitudes increase rapidly, following the outward-going separatrix every three turns, and the particles eventually move into the high-field region of a very thin electrostatic septum and are extracted, Fig. 6.17. The rate of extraction is controlled either by modulating the excitation process or by controlling the stable area. Several techniques for driving the particles unstable are possible:

  (i) the stable area can be reduced by increasing the resonance (sextupole) strength or by moving the tune closer to the third integer. Increasing the resonance strength reduces the stable area, but the smallest amplitude particles cannot be extracted, and changing the resonance affects the machine optics. Crossing the resonant tune offers the advantage that all of the beam can be extracted; however, the optics is still perturbed and in addition the position of the extracted beam in phase space changes as particles are extracted;

  (ii) the particle amplitudes can be increased by use of a transverse excitation. The stable area is kept fixed and the particle amplitudes increased, as in RF-knockout [25] where a high-frequency damper is used near the betatron resonance frequency to excite the beam. The machine optics is not changed and this method allows very fine control of the spill flux, suitable for medical machines;

  (iii) the particles can be accelerated into the resonance where the chromaticity couples the momentum and the tune. A betatron core can be used [26] to accelerate the beam smoothly through the resonance. As the momentum of the beam changes, this is coupled via the chromaticity into a tune change. This method provides stability and insensitivity to power supply ripple. An alternative method (Constant Optics Slow Extraction) is to change the strengths of all machine elements to achieve the same effect, where the beam momentum remains fixed but the accelerator momentum changes [27];

  (iv) RF noise can be applied to gradually diffuse particles longitudinally, which through the chromaticity are brought into resonance. This stochastic extraction [28] allows extremely long and uniform spills, and again has the advantage of leaving the machine lattice functions unchanged.

Fig. 6.17
figure 17

Resonant extraction in normalised phase space. The amplitudes of particles outside the stable area grow rapidly, following the outward-going separatrix lines every three turns until they reach the electrostatic septum

It should be noted that extraction can also be made using the second order resonance, where octupole fields are used to define a stable area in phase space. The amplitude growth with time is much faster, and the beam can be extracted in several hundred turns.

The use of a physical septum means that losses and activation are key performance aspects for slow extraction. Several interesting techniques exist to reduce beam losses at extraction [29], including the use of scatterers to reduce the particle density at the septum, multipoles to manipulate the separatrix density and techniques to reduce the angular spread of the beam and reduce the effective septum width.

6.3.7 Continuous Transfer Extraction

A frequent requirement in an accelerator complex is to fill a large circumference machine with the contents of a smaller machine. One way of doing this is boxcar stacking; another technique is continuous transfer [30], where the beam in the first machine is extracted over a number of turns, like peeling the skin from an orange in a continuous strip. The machine tune is brought close to the appropriate fractional value 1/n, such that the beam is extracted in n + 1 turns. A fast closed bump is then applied to the circulating beam with kickers to move the beam partly across a septum, such that a fraction of the beam is cut and extracted. The machine tune rotates the beam in phase space such that subsequent slices are extracted—when the nth turn is extracted, the bump amplitude is increased to extract the remaining central part. This process is of use where the injector can service other machines or experiments while the receiving machine is accelerating the beam, since it minimises the time spent filling. The disadvantage of the technique is that large beam losses occur at the septum, with the transfer efficiency typically 85%. The transfer can be made with a bunched beam, leaving space for the kicker rise time, but this means that the receiving machine will need to capture a beam with strong intensity modulation. Another feature of this extraction is that the extracted slices all have different emittances, as the slices in phase space are all different.

6.3.8 Resonant Continuous Transfer Extraction

To reduce the beam losses from continuous transfer, a hybrid technique has been developed and deployed called Multi-Turn Extraction [31] where non-linear resonances are excited which define stable areas in phase space. These are populated by the controlled crossing of a resonance, and the islands are then separated by varying the multipole strength to provide a physical separation at the septum, to reduce or avoid transverse losses. The beam needs to be bunched with a gap to avoid losses during the kicker rise time. In addition to the lower losses, another advantage of this technique is that the extracted islands all have the same emittance.

6.3.9 Other Injection and Extraction Techniques

More exotic injection and extraction techniques also exist as working systems or concepts. These include radio-frequency stacking [32], pion-decay injection into muon storage rings [33] and combined cooling and stacking [34]. Charge exchange extraction [35] is used in cyclotrons, with a stripping foil, to convert for example H⁻ to p⁺, or \( {H}_2^{+} \) to \( {H}_2^{2+} \) so that the beam is then deflected out of the accelerator. Finally, very high energy particle extraction can be envisaged with a bent crystal replacing the septum [36].

6.4 Concept of Luminosity

6.4.1 Introduction

In particle physics experiments the energy available for the production of new effects is the most important parameter. Besides the energy, the number of useful interactions (events) is important. The quantity that measures the ability of a particle accelerator to produce the required number of interactions is called the luminosity (see Chap. 2) and is the proportionality factor between the number of events per second dR/dt and the cross section σ p:

$$ \frac{dR}{dt}\kern0.5em =\kern0.5em \mathrm{\mathcal{L}}\cdot {\sigma}_p \vspace*{-3pt}$$
(6.22)

The unit of the luminosity is therefore cm⁻² s⁻¹.

Here we will derive a general expression for the luminosity and give formulae for basic cases. Additional complications such as crossing angle and offset collisions are added to the calculation. Other effects such as the hourglass effect are estimated from the generalized expression.

In the final section we will discuss the measurement and calibration of the luminosity for both e⁺e⁻ as well as hadron colliders.

6.4.2 Computation of Luminosity

In the case of two colliding bunches, both serve as “target” as well as “incoming” beam at the same time. A schematic picture is shown in Fig. 6.18. The overlap integral which is proportional to the luminosity L can be written as [37]:

$$ \mathrm{\mathcal{L}}\propto K{N}_1{N}_2\cdot \int \int \int {\int}_{-\infty}^{+\infty }{\rho}_1\left(x,y,s,-{s}_0\right){\rho}_2\left(x,y,s,{s}_0\right)\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}s\,\mathrm{d}{s}_0 $$
(6.23)
Fig. 6.18
figure 18

Schematic view of a colliding beam interaction

Here ρ 1(x, y, s, s 0) and ρ 2(x, y, s, s 0) are the time dependent beam density distribution functions and N 1 and N 2 the number of particles per bunch. We assume that the two bunches meet at s 0 = 0, and s 0 = c ⋅ t is used as the “time” variable. Because the beams are moving against each other, we have to introduce the kinematic factor [38]:

$$ K=\sqrt{{\left({\overrightarrow{\nu}}_1-{\overrightarrow{\nu}}_2\right)}^2-{\left({\overrightarrow{\nu}}_1\times {\overrightarrow{\nu}}_2\right)}^2/{c}^2} $$
(6.24)

This factor is needed to make the luminosity and therefore the cross section relativistically invariant.

For the calculation we assume Gaussian profiles in all dimensions of the form:

$$ \kern1em {\rho}_{iz}(z)=\frac{1}{\sigma_z\sqrt{2\pi }}\exp \left(-\frac{z^2}{2{\sigma}_z^2}\right)\kern0.5em \mathrm{where}\kern1em i=1,2,\kern1em z=x,y \vspace*{-3pt}$$
(6.25)

in the transverse planes and

$$ \kern1em {\rho}_s\left(s\pm {s}_0\right)=\frac{1}{\sigma_s\sqrt{2\pi }}\exp \left(-\frac{{\left(s\pm {s}_0\right)}^2}{2{\sigma}_s^2}\right) \vspace*{-3pt}$$
(6.26)

in the longitudinal plane.

We further assume that the distributions are independent in the three coordinates and can be factorized. The integral (6.23) can then be evaluated. For the general case σ 1x ≠ σ 2x, σ 1y ≠ σ 2y, but assuming approximately equal bunch lengths σ 1s ≈ σ 2s, we get the formula:

$$ \mathrm{\mathcal{L}}=\frac{N_1{N}_2{f}_c}{2\pi \sqrt{\sigma_{1x}^2+{\sigma}_{2x}^2}\sqrt{\sigma_{1y}^2+{\sigma}_{2y}^2}} $$
(6.27)

where N 1 and N 2 are the bunch intensities and f c the repetition rate. In the case of a circular collider with N b bunches and a revolution frequency of f rev, we have f c = f rev ⋅ N b.
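
A numerical sketch of Eq. (6.27) for equal, round Gaussian beams, using assumed LHC-like design values (bunch intensity 1.15×10¹¹, 2808 bunches, revolution frequency ≈ 11.245 kHz, 16.7 µm beams at the IP); the numbers are illustrative assumptions.

```python
# Head-on luminosity from Eq. (6.27), assuming equal, round Gaussian beams.
import math

N1 = N2 = 1.15e11        # protons per bunch (assumed)
n_b     = 2808           # number of bunches (assumed)
f_rev   = 11245.0        # revolution frequency [Hz] (assumed)
sigma   = 16.7e-6        # rms transverse beam size at the IP [m] (assumed)

f_c = f_rev * n_b        # bunch collision rate, f_c = f_rev * N_b
L = N1 * N2 * f_c / (2 * math.pi
                     * math.sqrt(sigma**2 + sigma**2)
                     * math.sqrt(sigma**2 + sigma**2))

print(f"L ~ {L * 1e-4:.2e} cm^-2 s^-1")   # m^-2 -> cm^-2; roughly 1e34
```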

6.4.3 Luminosity with Correction Factors

Eq. (6.27) requires correction factors when the beams do not fully overlap (crossing angle and offset), when the beam size varies in the longitudinal direction (hourglass effect) or in the case of non-Gaussian beams.

6.4.3.1 Effect of Crossing Angle and Transverse Offset

Here we give the correction to the luminosity calculation in the case where two bunches do not collide exactly head-on, but with a crossing angle and/or transverse offset. In that case the luminosity is reduced and we must apply a correction factor to compute the correct value. For simplicity we assume crossing angle and offset in the horizontal (x) plane, but this is not a restriction. The integration (6.23) can be carried out by rotating the coordinate systems of the two beams each by half the crossing angle [37] and can be simplified introducing the factors:

$$ \begin{array}{l}A=\frac{\sin^2\frac{\phi }{2}}{\sigma_x^2}+\frac{\cos^2\frac{\phi }{2}}{\sigma_s^2},\kern1em B=\frac{\left({d}_2-{d}_1\right)\sin \left(\phi /2\right)}{2{\sigma}_x^2},\kern1em W={e}^{-\frac{1}{4{\sigma}_x^2}{\left({d}_2-{d}_1\right)}^2}\end{array} \vspace*{-18pt}$$
(6.28)
$$ S=\frac{1}{\sqrt{1+{\left(\frac{\sigma_s}{\sigma_x}\tan \frac{\phi }{2}\right)}^2}}\approx \frac{1}{\sqrt{1+{\left(\frac{\sigma_s}{\sigma_x}\frac{\phi }{2}\right)}^2}}\vspace*{-3pt} $$
(6.29)

where ϕ/2 is half the crossing angle and d 1 and d 2 are the transverse offsets of the two beams (Fig. 6.19).

Fig. 6.19
figure 19

Schematic view of a colliding beam interaction at a crossing angle

We can re-write the luminosity with three correction factors:

$$ \mathrm{\mathcal{L}}=\frac{N_1{N}_2{fN}_b}{4{\pi \sigma}_x{\sigma}_y}\cdot W\cdot {e}^{\frac{B^2}{A}}\cdot S $$
(6.30)

This factorization highlights the different contributions and allows straightforward calculations. The last factor S is the luminosity reduction factor for a crossing angle. The factor W reduces the luminosity in the presence of beam offsets, and the factor \( {e}^{\frac{B^2}{A}} \) is only present when we have a crossing angle and offsets simultaneously in the same plane. The formulae for the luminosity under very general conditions can be found in [39]. A popular interpretation of this result is to consider it a correction to the beam size and to introduce an “effective beam size” like:

$$ {\sigma}_{eff}=\sigma /\sqrt{1+{\left(\frac{\sigma_s}{\sigma}\frac{\phi }{2}\right)}^2} \vspace*{-3pt}$$
(6.31)

This equation is valid when σ s ≫ σ. The effective beam size can then be used in the standard luminosity formula, replacing the beam size in the crossing plane. This concept of an effective beam size is interesting because it also applies to the calculation of beam-beam effects of bunched beams with a crossing angle [40, 41].
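
As a numerical illustration of the reduction factor S of Eq. (6.29), the sketch below uses assumed LHC-like values (full crossing angle ~285 µrad, σ s ≈ 7.55 cm, σ x ≈ 16.7 µm); these are illustrative assumptions, not values taken from the text.

```python
# Geometric luminosity reduction factor S, Eq. (6.29), for a crossing angle
# in the horizontal plane.
import math

phi     = 285e-6      # full crossing angle [rad] (assumed)
sigma_s = 0.0755      # rms bunch length [m] (assumed)
sigma_x = 16.7e-6     # rms horizontal beam size at the IP [m] (assumed)

S = 1.0 / math.sqrt(1.0 + (sigma_s / sigma_x * math.tan(phi / 2.0))**2)
print(f"reduction factor S ~ {S:.2f}")   # ~0.84, i.e. roughly 16% luminosity loss
```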

In the case of flat beams (i.e. σ y ≪ σ x) a more general expression has to be used, see e.g. [39].

To avoid this loss of luminosity, the use of crab cavities is an option, where the bunches are deflected transversely before and after the collision, Fig. 6.20.

Fig. 6.20
figure 20

Scheme of crab crossing with transversely deflecting cavities

6.4.3.2 Hour Glass Effect

In a low-β region the β-function varies with the distance s to the minimum like:

$$ \beta (s)={\beta}^{\ast}\left(1+{\left(\frac{s}{\beta^{\ast }}\right)}^2\right) $$
(6.32)

For very small β∗, comparable to the bunch length, the β-function is no longer constant along the longitudinal extent of the bunch and cannot be considered a constant in Eq. (6.23): it follows the parabola of Eq. (6.32), rises very fast and can become very large for small β∗.

In our formulae we have to replace σ by σ(s) and get a more general expression for the luminosity (assuming equal parameters in both beams, the most general expression can be found in [39]):

$$ \kern1em \frac{\mathcal{L}\left({\sigma}_s\right)}{\mathcal{L}(0)}={\int}_{-\infty}^{+\infty}\frac{1}{\sqrt{\pi }}\frac{e^{-{u}^2}}{\sqrt{\left[1+{\left(\frac{u}{u_x}\right)}^2\right]\cdot \left[1+{\left(\frac{u}{u_y}\right)}^2\right]}}\mathrm{d}u $$
(6.33)

where we have used the abbreviations \( {u}_x={\beta}_x^{\ast }/{\sigma}_s \) and \( {u}_y={\beta}_y^{\ast }/{\sigma}_s \).

For the case of round beams it can be simplified and the integral becomes:

$$ \kern1em \frac{\mathcal{L}\left({\sigma}_s\right)}{\mathcal{L}(0)}={\int}_{-\infty}^{+\infty}\frac{1}{\sqrt{\pi }}\frac{e^{-{u}^2}}{\left[1+{\left(\frac{u}{u_x}\right)}^2\right]} du=\sqrt{\pi}\cdot {u}_x\cdot {e}^{u_x^2}\cdot \operatorname{erfc}\left({u}_x\right) $$
(6.34)

Here erfc(u) is the complementary error function. The hourglass effect depends strongly on the ratio of β* to the bunch length σs. For small β* the effect becomes relevant because the beam size varies rapidly along the longitudinal bunch direction, i.e. when s in Eq. (6.32) becomes comparable to the bunch length. A loss of luminosity according to Eq. (6.34) is the consequence.
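The reduction factors of Eqs. (6.33) and (6.34) are easily evaluated numerically; the short Python sketch below does this with SciPy (the β* and bunch length values are assumptions chosen only for illustration).

import numpy as np
from scipy.integrate import quad
from scipy.special import erfcx

def hourglass_general(beta_x, beta_y, sigma_s):
    """Luminosity reduction L(sigma_s)/L(0) of Eq. (6.33) by numerical integration."""
    ux, uy = beta_x / sigma_s, beta_y / sigma_s
    integrand = lambda u: np.exp(-u**2) / np.sqrt((1 + (u/ux)**2) * (1 + (u/uy)**2))
    value, _ = quad(integrand, -np.inf, np.inf)
    return value / np.sqrt(np.pi)

def hourglass_round(beta, sigma_s):
    """Closed form of Eq. (6.34) for round beams; erfcx(u) = exp(u^2) erfc(u) avoids overflow."""
    u = beta / sigma_s
    return np.sqrt(np.pi) * u * erfcx(u)

# Assumed example: beta* = 0.55 m, bunch length 7.55 cm
print(hourglass_general(0.55, 0.55, 0.0755), hourglass_round(0.55, 0.0755))

For β* much larger than σs both expressions return a value close to one; the reduction only becomes significant once β* approaches the bunch length.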

6.4.3.3 Crabbed Waist Scheme

In the case of a large crossing angle, the collision point of particles is displaced. Schematically this is shown in Figs. 6.21 and 6.22.

Fig. 6.21
figure 21

Collision with large crossing angle and longitudinally displaced collision point

Fig. 6.22
figure 22

Collision with large crossing angle and longitudinally displaced collision point, shown for three particles with different amplitudes

One possible consequence is a coupling between the transverse and longitudinal planes. Such a coupling is particularly harmful for flat beams since the vertical beam size will increase significantly.

In Fig. 6.23 the vertical β-function is indicated and the result of this effect is that the particles collide at positions with different vertical β-functions.

Fig. 6.23
figure 23

Collisions with different vertical β-functions

This can be mitigated [42] by making the position of the vertical waist (\( {\beta}_y^{min} \)) dependent on the horizontal amplitude (Fig. 6.24). All particles then collide at the minimum of the vertical β-function.

Fig. 6.24
figure 24

All collisions at a minimum of the vertical β-functions using a crabbed waist scheme

It should be emphasized that the main purpose of such a scheme is not to reduce a geometrical loss but to reduce the coupling. Therefore it is of interest only for flat beams.

This scheme is established using two sextupoles.

6.4.4 Integrated Luminosity and Event Pile Up

The maximum luminosity, and therefore the instantaneous number of interactions per second, is very important, but the final figure of merit is the so-called integrated luminosity:

$$ {\mathcal{L}}_{\mathrm{int}}={\int}_0^T\mathcal{L}\left({t}^{\prime}\right)d{t}^{\prime } $$
(6.35)

because it directly relates to the number of observed events:

$$ {\mathcal{L}}_{\mathrm{int}}\cdot {\sigma}_p=\mathrm{number}\ \mathrm{of}\ \mathrm{events}\ \mathrm{of}\ \mathrm{interest} $$
(6.36)

The integral is taken over the sensitive time, i.e. excluding possible dead time. The unit of the integrated luminosity is cm⁻² and it is often expressed in inverse barn (1 barn⁻¹ = 10²⁴ cm⁻²).

Another important parameter for bunched beams colliding at high luminosity is the number of collisions per bunch crossing, the so-called pile up. In particular for processes with a large cross section this can become a problem. In the case of the LHC, bunch crossings occur every 25 ns and the expected pile up is more than 20 for proton-proton collisions. The challenge is to maximise the useful luminosity while keeping the pile up at a level that can be handled by the particle detectors.
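As an illustration of the numbers involved, the following short Python estimate relates peak luminosity, cross section and bunch structure to the average pile up; all parameter values are LHC-like assumptions, not data quoted in the text.

# Average pile up per bunch crossing: mu = L * sigma_inel / (n_b * f_rev)
L_peak     = 1.0e34      # peak luminosity [cm^-2 s^-1] (assumed)
sigma_inel = 80e-27      # inelastic pp cross section, ~80 mb in cm^2 (assumed)
n_bunches  = 2808        # colliding bunches (assumed)
f_rev      = 11245.0     # revolution frequency [Hz] (assumed)

mu = L_peak * sigma_inel / (n_bunches * f_rev)

# Integrated luminosity for a 10 h fill at constant luminosity (crude upper bound)
L_int_fb = L_peak * 10 * 3600 / 1e39     # 1 fb^-1 = 1e39 cm^-2

print(f"average pile up = {mu:.1f}, integrated luminosity = {L_int_fb:.2f} fb^-1")

With these assumptions one obtains an average pile up of about 25 events per crossing, consistent with the statement above.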

6.4.5 Measurement and Calibration of Luminosity

To obtain the exact integrated luminosity, it has to be recorded continuously. It is rather straightforward to obtain a counting rate directly proportional to the total interaction rate dR/dt. This relative signal has to be calibrated to deliver the absolute luminosity. We have already seen some effects that affect the absolute luminosity and therefore to a large extent the luminosity measurement. In particular the crossing angle and the luminous region are of importance since they have immediate implications for the geometrical acceptance of the instruments.

In principle one can determine the absolute luminosity when all relevant beam parameters are known, i.e. the bunch intensities, the beam sizes (r.m.s. values in case of unknown beam profiles) and the exact geometry. However, the precise measurement of beam sizes is a challenge, in particular for hadron colliders where a non-destructive measurement is required. When the energy spread in the beams is large (e.g. in some e+e− colliders), a residual dispersion at the interaction point significantly increases the beam size and must be included.

There exist other methods which relate the counting rate to well-known processes that can be used for calibration. We shall discuss several methods for both lepton and hadron colliders.

6.4.6 Absolute Luminosity: Lepton Colliders

Once the relative luminosity is known, a very precise method is to compare the counting rate to well known and calculable processes. In case of e+e colliders these are electromagnetic processes such as elastic scattering (Bhabha scattering). The principle is shown in Fig. 6.25. Particle detectors are used to measure the trajectories at very small angles and with a coincidence of particles on both sides of the interaction point. For a precise measurement one has to go to very small angles since the elastic cross section σ el has a strong dependence on the scattering angle (σ el ∝ Θ−3).

Fig. 6.25
figure 25

Principle of luminosity measurement using Bhabha scattering for e+e colliders

Furthermore, the cross section diminishes rapidly with increasing energy (\( {\sigma}_{el}\propto \frac{1}{E^2} \)), which may result in small counting rates. At LEP energies with \( \mathcal{L} \) = 10³⁰ cm⁻² s⁻¹ one can expect a counting rate of only about 25 Hz. Background from other processes can become problematic when the signal is small.

6.4.7 Absolute Luminosity: Hadron Colliders

For hadron colliders two types of calibration have become part of regular operation, the measurement of the beam size by scanning the beam and the calibration with the cross section for small angle scattering. The determination of the bunch intensities is usually easier, although non-trivial in the case of a collider with several thousand bunches.

6.4.7.1 Measurement by Profile Monitors and Beam Displacement

Typical profile measurement devices are wire scanners where a thin wire is moved through the beam and the interaction of the beam with the wire gives the signal. For high intensity hadron beams this has however limitations. Non-destructive devices such as synchrotron light monitors are available but the emitted light from hadrons is often not sufficient for a precise measurement.

An alternative is to measure the beam size by displacing the two beams against each other. The relative luminosity reduction due to this offset can be measured and is described by the factor W of Eq. (6.28) developed earlier:

$$ \mathcal{L}(d)/{\mathcal{L}}_0=W={e}^{-\frac{d^2}{4{\sigma}^2}} $$
(6.37)

where d is the separation between the beams and the measurement of the luminosity ratio is a direct measurement of W. This method was already used in the CERN Intersection Storage Rings (ISR) and known as “van der Meer scan”.

The expected counting rate of such a scan is shown in Fig. 6.26. A fit to the above formula gives the beam size. A drawback of this method is the distortion of the beam optics in case of very strong beam-beam interactions [40]. This effect has to be evaluated carefully.
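A minimal sketch of such a scan analysis is given below: synthetic counting rates are generated and the width is extracted by fitting Eq. (6.37). All numbers (separation range, beam size, noise level) are assumptions for illustration only.

import numpy as np
from scipy.optimize import curve_fit

# Relative luminosity versus separation d, Eq. (6.37): L(d)/L0 = exp(-d^2 / (4 sigma^2))
def vdm_curve(d, L0, sigma):
    return L0 * np.exp(-d**2 / (4.0 * sigma**2))

# Synthetic scan data (assumed beam size of 100 um, 2% noise)
rng = np.random.default_rng(1)
d_points = np.linspace(-0.3e-3, 0.3e-3, 13)                      # separation [m]
rates = vdm_curve(d_points, 1.0, 100e-6) * (1 + 0.02 * rng.standard_normal(d_points.size))

popt, _ = curve_fit(vdm_curve, d_points, rates, p0=[1.0, 50e-6])
print(f"fitted beam size sigma = {popt[1] * 1e6:.1f} um")

In a real scan the fitted width, combined with the measured bunch intensities, gives the absolute luminosity calibration.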

Fig. 6.26
figure 26

Principle of luminosity measurement using transverse beam displacement

6.4.7.2 Absolute Measurement with Optical Theorem

This method is similar to the measurement of Bhabha scattering for e+e colliders but requires dedicated experiments and often special machine conditions.

The total elastic and inelastic counting rate is related to the luminosity and the total cross section (elastic and inelastic) by the expression:

$$ {\sigma}_{tot}\cdot \mathcal{L}={N}_{inel}+{N}_{el}\kern0.5em \left(\mathrm{Total}\ \mathrm{counting}\ \mathrm{rate}\right) $$
(6.38)

The key to this method is that the total cross section is related to the elastic cross section for small values of the momentum transfer t by the so-called optical theorem [43]:

$$ \underset{t\to 0}{\lim}\frac{d{\sigma}_{el}}{dt}=\left(1+{\rho}^2\right)\frac{\sigma_{tot}^2}{16\pi }=\frac{1}{\mathcal{L}}{\left.\frac{dN_{el}}{dt}\right|}_{t=0} $$
(6.39)

Therefore the luminosity can in principle be calculated directly from experimental rates through:

$$ \mathcal{L}=\frac{\left(1+{\rho}^2\right)}{16\pi}\frac{{\left({N}_{inel}+{N}_{el}\right)}^2}{{\left({dN}_{el}/ dt\right)}_{t=0}} $$
(6.40)

All counting rates, i.e. the total number of events Ninel + Nel and the differential elastic counting rate dNel/dt at small t, have to be measured with high precision. This requires very good detector coverage of the full solid angle (4π) for the inelastic rate and the possibility to measure down to very small values of t.

A slightly modified version of the above uses the Coulomb scattering amplitude which can be precisely calculated. The elastic scattering amplitude is a superposition of the strong (f s) and Coulomb (f c) amplitudes, the latter dominates at small t. We can re-write the differential elastic cross section \( \frac{d{\sigma}_{el}}{dt} \):

$$ \underset{t\to 0}{\lim}\frac{d{\sigma}_{el}}{dt}=\frac{1}{\mathcal{L}}{\left.\frac{dN_{el}}{dt}\right|}_{t=0}=\pi {\left|{f}_c+{f}_s\right|}^2\simeq \pi {\left|\frac{2{\alpha}_{em}}{-t}+\frac{\sigma_{tot}}{4\pi}\left(\rho +i\right){e}^{B\frac{t}{2}}\right|}^2\underset{\left|t\right|\to 0}{\simeq}\frac{4{\pi \alpha}_{em}^2}{t^2} $$
(6.41)

If the differential cross section is measured over a large enough range, the unknown parameters σtot, ρ, B and \( \mathcal{L} \) can be determined by a fit. A measurement [44,45,46] together with some crude fits is shown in Fig. 6.27 to demonstrate the principle. The advantage of this method is that it can be performed measuring only elastic scattering, without the need for the full coverage required to measure Ninel. It is therefore a good way to measure the luminosity (and at the same time the total cross section σtot and the interference parameter ρ), although the previous method is of more practical importance for regular use.
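To illustrate how Eq. (6.40) is used in practice, the sketch below evaluates the luminosity from a set of assumed, purely illustrative rates; the conversion from GeV⁻² to cm² uses (ℏc)² ≈ 0.389 GeV² mb.

import math

# Illustrative (assumed) measured quantities for a special low-luminosity run:
N_inel     = 7.5e4     # inelastic rate [s^-1]
N_el       = 2.5e4     # elastic rate [s^-1]
dNel_dt_t0 = 5.2e5     # dN_el/dt extrapolated to t = 0 [s^-1 GeV^-2]
rho        = 0.14      # ratio of real to imaginary part of the forward amplitude

# Eq. (6.40): luminosity in GeV^2 s^-1
L_gev2 = (1 + rho**2) / (16 * math.pi) * (N_inel + N_el)**2 / dNel_dt_t0

GEV2_TO_CM_MINUS2 = 1.0 / 0.3894e-27      # 1 GeV^-2 = 0.3894 mb = 0.3894e-27 cm^2
print(f"L = {L_gev2 * GEV2_TO_CM_MINUS2:.2e} cm^-2 s^-1")

With these assumed rates the result is of order 10^30 cm⁻² s⁻¹, typical of the special high-β* conditions under which such measurements are performed.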

Fig. 6.27
figure 27

Principle of luminosity measurement using optical theorem in proton proton (antiproton) collisions

The measurement of the Coulomb amplitude usually requires dedicated experiments with detectors very close to the beam (e.g. with so-called Roman Pots) and therefore special machine parameters such as reduced intensity and zero crossing angle. Furthermore, in order to measure very small angle scattering, one has to reduce the divergence of the beam itself (\( {\sigma}^{\prime }=\sqrt{\upepsilon /\beta } \)). For that purpose special running conditions with a high β* at the collision point are often needed (β* > 1000 m) [45]. The precision of such a measurement can nevertheless reach the level of a few percent.

6.4.8 Luminosity in Linear Colliders

In linear colliders the beams collide only once and to get a high luminosity a very small beam size and therefore small β at the collision point are required.

This implies additional effects such as beam disruption and an enhanced luminosity due to the so-called pinch effect.

Due to the very strong fields of the final-focus quadrupoles, significant synchrotron radiation is produced.

6.4.8.1 Disruption and Luminosity Enhancement Factor

The basic formula for the luminosity of a linear collider is shown in Eq. (6.42).

$$ \mathcal{L}=\frac{N^2\ {f}_{rep}\ {n}_b}{4\pi \overline{\sigma_x}\ \overline{\sigma_y}}\kern0.5em \to \kern0.5em \mathcal{L}=\frac{H_D\cdot {N}^2\ {f}_{rep}\ {n}_b}{4{\pi \sigma}_x\ {\sigma}_y} $$
(6.42)

The revolution frequency has to be replaced by the repetition rate f rep of the colliding bunches.

The luminosity is increased by the enhancement factor H D, which takes into account the reduction of the nominal beam size by the disruptive fields (pinch effect).

This enhancement factor is related to the beam disruption parameter

$$ {D}_{x,y}=\frac{2{r}_eN{\sigma}_z}{{\gamma \sigma}_{x,y}\left({\sigma}_x+{\sigma}_y\right)} $$
(6.43)

For weak disruption (D≪ 1) and round beams the enhancement factor can be written as:

$$ {H}_D=1+\frac{2}{3\sqrt{\pi }}D+\mathcal{O}\left({\mathcal{D}}^2\right) $$
(6.44)

When the disruption is strong or for flat beams, computer simulations are necessary.
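The disruption parameter and the weak-disruption enhancement are simple to estimate. The sketch below evaluates Eqs. (6.43) and (6.44) for a set of assumed, roughly ILC-like parameters; all numbers are illustrative and not taken from the text.

import math

r_e = 2.8179403262e-15          # classical electron radius [m]

# Illustrative (assumed) linear collider parameters:
N       = 2.0e10                # particles per bunch
sigma_x = 500e-9                # horizontal IP beam size [m]
sigma_y = 6e-9                  # vertical IP beam size [m]
sigma_z = 300e-6                # bunch length [m]
gamma   = 250e9 / 0.511e6       # Lorentz factor for a 250 GeV beam

def disruption(sigma_uv):
    """Disruption parameter of Eq. (6.43) in one transverse plane."""
    return 2 * r_e * N * sigma_z / (gamma * sigma_uv * (sigma_x + sigma_y))

D_x, D_y = disruption(sigma_x), disruption(sigma_y)

# Weak-disruption enhancement, Eq. (6.44); valid only for D << 1 and round beams
H_D_weak = 1 + 2 / (3 * math.sqrt(math.pi)) * D_x

print(f"D_x = {D_x:.2f}, D_y = {D_y:.1f}, H_D (weak-disruption estimate) = {H_D_weak:.2f}")

In the vertical plane D_y comes out of order 20 for such flat beams, far outside the weak-disruption regime, which is precisely why simulations are needed there.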

6.4.8.2 Beamstrahlung

The strong synchrotron radiation (beamstrahlung) has two main effects:

  • Spread of the centre of mass energy.

  • Pair creation and background in the detectors.

It is parametrized by the parameter Y which can be written as the mean field strength in the rest frame, normalized to the critical field B c:

$$ Y=\frac{<E+B>}{B_c}\approx \frac{5}{6}\frac{r_e^2\gamma N}{{\alpha \sigma}_z\left({\sigma}_x+{\sigma}_y\right)} $$
(6.45)
$$ {B}_c=\frac{m^2{c}^3}{e\mathrm{\hslash}}\approx 4.4\times {10}^{13}G $$
(6.46)

6.5 Synchrotron Radiation and Damping

6.5.1 Basic Properties of Synchrotron Radiation

Charged particles radiate when they are deflected in a magnetic field [47] (transverse acceleration); see also Sect. 11.1 for a more detailed treatment. In the ultra-relativistic case, when the particle speed is very close to the speed of light, v ≈ c, most of the radiation is emitted in the forward direction [48] into a cone centred on the tangent to the trajectory and with an opening angle of 1/γ, where γ is the Lorentz factor (for a few GeV electron or a few TeV proton, γ ≈ 1000, so the photon emission angles are within a milliradian of the tangent to the trajectory).

The power emitted by a particle is proportional to the square of its energy E and to the square of the deflecting magnetic field B:

$$ {P}_{SR}\propto {E}^2{B}^2, $$
(6.47)

and in terms of Lorentz factor γ and the local bending radius ρ can be written as follows:

$$ {P}_{SR}=\frac{2}{3}\alpha\ \mathrm{\hslash}{c}^2\ \frac{\gamma^4}{\rho^2}, $$
(6.48)

where α is the fine-structure constant and Planck's constant enters via the convenient conversion constant ℏc:

$$ \alpha =\frac{1}{137}\kern0.75em \mathrm{and}\kern0.75em \mathrm{\hslash}c=197\ \mathrm{MeV}\ \mathrm{fm}. $$
(6.49)

The emitted power is a very steep function of both the particle energy and particle mass, being proportional to the fourth power of γ.

Integrating the above expression around the machine we obtain the amount of energy lost per turn:

$$ {U}_0=\frac{4\pi }{3}\alpha\ \mathrm{\hslash}c\ \frac{\gamma^4}{\rho }. $$
(6.50)

The emitted radiation spectrum consists of harmonics of the revolution frequency and peaks near the so-called critical frequency or critical photon energy. It is defined such that exactly half of the radiated power is emitted below it:

$$ {\varepsilon}_c=\frac{3}{2}\ \mathrm{\hslash}c\ \frac{\gamma^3}{\rho }. $$
(6.51)

On the average a particle then emits n c ≈ 2παγ photons per turn.

6.5.2 Radiation Damping

In a storage ring the steady loss of energy to synchrotron radiation is compensated in the RF cavities, where the particle receives each turn the average amount of energy lost. The energy lost per turn is normally a small fraction of the total particle energy, typically of the order of one part per thousand.

Transverse Oscillations

Since the radiation is emitted along the tangent to the trajectory, only the magnitude of the momentum changes, not its direction. As the RF cavities restore only the longitudinal component of the momentum, the transverse oscillation amplitude is damped exponentially, with a damping rate of the order of U0/E per revolution time. A typical transverse damping time therefore corresponds simply to the number of turns it would take to lose an amount of energy equal to the particle energy. The damping times are very fast; in the case of a few GeV electron ring they are of the order of a few milliseconds.

$$ {A}_{\perp }={A}_0{e}^{-\frac{t}{\tau }},\mathrm{where}\ \frac{1}{\tau }=\frac{U_0}{2E{T}_0}. $$
(6.52)

In a given storage ring the damping time is inversely proportional to the cube of the particle energy.
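Equations (6.50) and (6.52) already allow a useful numerical estimate. The Python sketch below computes the energy loss per turn and the transverse damping time for assumed LEP-like parameters; beam energy, bending radius and revolution time are illustrative values, not quoted from the text.

import math

alpha  = 1.0 / 137.035999    # fine-structure constant
hbar_c = 197.327e-15         # hbar*c = 197.3 MeV fm, expressed in MeV*m

# Illustrative (assumed) LEP-like parameters:
E_MeV   = 100e3              # beam energy: 100 GeV in MeV
m_e_MeV = 0.511              # electron rest energy [MeV]
rho     = 3096.0             # bending radius [m]
T0      = 88.9e-6            # revolution time [s]

gamma = E_MeV / m_e_MeV

U0  = 4 * math.pi / 3 * alpha * hbar_c * gamma**4 / rho    # energy loss per turn, Eq. (6.50) [MeV]
tau = 2 * E_MeV * T0 / U0                                  # transverse damping time, Eq. (6.52) [s]

print(f"U0 = {U0 / 1e3:.2f} GeV per turn, transverse damping time = {tau * 1e3:.1f} ms")

With these assumptions the energy loss per turn comes out close to 3 GeV and the damping time at a few milliseconds, in line with the statements above.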

Longitudinal or Synchrotron Oscillations

Synchrotron oscillations are damped because the energy loss per turn is a quadratic function of the particle’s energy. The damping rate is typically twice the rate for transverse oscillations.

Damping Partition Numbers and Robinson Theorem

For particles that emit synchrotron radiation the dynamics is characterized by the damping of particle oscillations in all three degrees of freedom. In fact, the total amount of damping (Robinson theorem [49]), i.e. the sum of the damping decrements depends only on the particle energy and the emitted synchrotron radiation power:

$$ \frac{1}{\tau_x}+\frac{1}{\tau_y}+\frac{1}{\tau_{\varepsilon }}=\frac{2{U}_0}{E{T}_0}=\frac{U_0}{2E{T}_0}\left({J}_x+{J}_y+{J}_{\varepsilon}\right) $$
(6.53)

where we have introduced the usual notation of damping partition numbers that show how the total amount of damping in the system is distributed among the three degrees of freedom. A typical set of the damping partition numbers is (1,1,2) and their sum is, according to the Robinson theorem, a constant.

$$ {J}_x+{J}_y+{J}_{\varepsilon }=4. $$
(6.54)

Adjustment of Damping Rates

The partition numbers can differ from the above values, while their sum remains a constant. In fact, under certain circumstances the motion can become "anti-damped", i.e. the damping time becomes negative, leading to an exponential growth of the oscillation amplitudes. From a more detailed analysis of the damping rates [50] the damping times can be written as

$$ \begin{array}{l}\frac{1}{\tau_{\varepsilon }}=\frac{U_0}{2E{T}_0}\left(2+\mathcal{D}\right),\kern0.5em \mathrm{and}\kern0.5em \frac{1}{\tau_x}=\frac{U_0}{2E{T}_0}\left(1-\mathcal{D}\right),\kern0.75em \mathrm{where}\kern0.5em \mathcal{D}\equiv \frac{\oint \frac{D}{\rho}\left(2k+\frac{1}{\rho^2}\right) ds}{\oint \frac{ds}{\rho^2}}.\end{array} $$
(6.55)

The constant \( \mathcal{D} \) introduced above is an integral of the dispersion function D and the magnetic guide field functions, i.e. bending radius and gradient, around the ring, and it is independent of the particle energy. It deviates substantially from zero only when a particle encounters combined function elements, i.e. where the product of the field gradient and the curvature is non-zero. The damping partition numbers then are:

$$ {J}_x=1-\mathcal{D},{J}_{\varepsilon }=2+\mathcal{D},{J}_x+{J}_{\varepsilon }=3. $$
(6.56)

The vertical damping partition number is usually unchanged as the vertical dispersion is zero in storage rings that are built in one (horizontal) plane.

The amount of damping can be repartitioned between the horizontal and energy-time oscillations by altering the value of the \( \mathcal{D} \) constant [50]. This can be achieved by either using combined function magnetic elements in the lattice, or by introducing a special combined function wiggler magnet (so-called Robinson wiggler). Values of horizontal partition number as high as 2.5 have been obtained that way. Values of \( \mathcal{D} \) > 1 lead to anti-damping of horizontal betatron oscillations, while for \( \mathcal{D} \) < −2 the synchrotron oscillations become unstable.

6.6 Computer Codes for Beam Dynamics

6.6.1 Introduction

The design and operation of an accelerator today is unthinkable without the help of computer codes, the reason being either large, complex structures (as in the case of big accelerators and colliders, e.g. the LHC) or complicated beam dynamics in small or special purpose machines (e.g. FFAGs), whose complexity does not allow computation with pencil and paper. Here we address only codes for beam dynamics; special codes for the design of accelerator components such as magnets or RF equipment will not be treated but can be found in the literature. The main fields where beam dynamics codes are essential are:

  • Determination of parameters and the design of beam lines and accelerators

  • Evaluation of performance

  • Control, machine protection and operation

Different classes of codes are used in these fields which also resemble the life cycle of an accelerator.

Given the scope of this handbook and the rapid evolution of computer codes and software techniques, we do not attempt to provide a list of existing codes, but rather describe the main features, techniques and applications of the different types of codes. Details and access to existing codes can be found in computer code libraries on the internet. A supported library is provided by the Los Alamos Accelerator Code Group (LAACG) [51], another one by ASTeC (UK) [52]. They contain links to popular and frequently used codes from many laboratories and institutions.

6.6.2 Classes of Beam Dynamics Codes

The different classes of codes can be divided according to their application:

  • General purpose optics codes

  • Beam dynamics of single particles

  • Beam dynamics of multi particles

Optics codes are used mainly in the initial design phase of an accelerator, rings as well as beam lines and linear accelerators. The evaluation of the performance (stability etc.) is done using codes to simulate the beam dynamics of single particles as well as ensembles of particles and their interaction with the environment or other particles in the beam(s).

6.6.3 Optics Codes

A large group of computer codes for beam dynamics is used to design the lattice of an accelerator or beam line and to compute and optimize the optical parameters. The range of available codes extends from small codes for pedagogical purposes to large general purpose programs; such codes can easily have 100,000 lines of code or more. The accelerator physics is described in the existing literature [53] and in this handbook. The main applications of general purpose optics codes are:

  • Determination of main parameters and the computation of the linear and non-linear optics. This implies finding periodic solutions for the optical parameters and the closed orbit.

  • Parameter matching (optical/geometrical) and lattice optimization, i.e. the properties of elements are varied until the optical functions assume their desired values.

  • Simulation of imperfections and algorithms for their corrections.

  • Simulation of synchrotron radiation and evaluation of radiation integrals to derive estimates for parameters (e.g. equilibrium emittances) in lepton machines.

The result should be a consistent set of parameters fulfilling the design requirements. They are the basis for the design of machine elements.

Depending on the complexity of the problem, different techniques are in use for optics codes. The majority of these codes rely on the description of machine elements by maps, which can be of higher order for non-linear elements. In the simplest case, for the description of linear machines, the maps become matrices, and such programs are therefore often referred to as "matrix codes" [54]. The concatenation of the matrices provides a matrix for the entire ring, and its analysis gives the optical parameters, closed orbit etc.
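As a minimal illustration of the matrix-code approach, the Python sketch below concatenates thin-lens matrices for a simple FODO cell and extracts the phase advance and the β-function from the periodic cell matrix; the drift length and focal length are arbitrary assumed values.

import numpy as np

def drift(L):
    return np.array([[1.0, L], [0.0, 1.0]])

def thin_quad(f):
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

# Assumed symmetric FODO cell: half F quad, drift, D quad, drift, half F quad
L_drift, f = 5.0, 10.0          # drift length [m], quadrupole focal length [m]
M = thin_quad(2 * f) @ drift(L_drift) @ thin_quad(-f) @ drift(L_drift) @ thin_quad(2 * f)

# Optics from the periodic matrix: cos(mu) = Tr(M)/2, beta = |M12| / sin(mu)
mu   = np.arccos(0.5 * np.trace(M))
beta = abs(M[0, 1]) / np.sin(mu)

print(f"phase advance per cell = {np.degrees(mu):.1f} deg, beta at the cell boundary = {beta:.1f} m")

Real optics codes do exactly this, element by element and in all phase-space dimensions, and add on top the matching, correction and radiation features listed above.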

Another technique is to follow the particles through the accelerator, i.e. integrating the equation of motion in the electromagnetic fields of the machine elements. The analysis of the results of these “tracking programs” provides the required parameters and information about the stability of the machine (for some details see [54]).

Dealing with complex machines, other considerations may become important such as e.g.:

  • Definition of an input language which can be used by other programs. This input language defines the sequence of elements, i.e. the ring or a beam line, as well as the properties of the elements such as e.g. their types (dipole, quadrupole,..), lengths and strengths.

  • For large machines with a large number of elements the interface to a data base may be required. Large machines such as the LHC or future colliders have several thousand elements.

  • An interface to the control system for on-line modelling is desirable

6.6.4 Single Particle Tracking Codes

To evaluate the performance of accelerators, in particular multi-pass, i.e. circular, machines, one has to deal with complex iterative processes. The standard perturbation theories can fail to correctly describe the behaviour beyond the leading orders. Single particle tracking codes are successfully used when analytical methods fail to describe the effect of non-linear forces on the stability of the particles. Many tracking codes have been developed together with the necessary tools to analyse the results, and from the simulation point of view the treatment of non-linear effects is well established. Conceptually, in a tracking code the equation of motion of a particle in an accelerator element is solved and the phase space coordinates of the particle are followed through all elements of the accelerator or beam line. To obtain the desired information, it may be necessary to repeat this process for up to 10⁷ turns, which requires appropriate algorithms and techniques to avoid numerical problems. Similar problems exist in celestial mechanics, and some of these techniques have been developed there. In order to draw conclusions from the tracking data it is necessary to provide tools that allow a qualitative and quantitative understanding of the results [55]. The outcome of the analysis answers the most important questions for the design of a machine, such as:

  • Stability of particle motion

  • Dynamic aperture

  • Specifications for the properties of machine elements

  • Optimization of the particle stability

In general the results of these studies are used in an iterative procedure to improve and optimize the design of the machine.

6.6.4.1 Techniques

A requirement for all techniques employed for particle tracking is that the associated maps must be symplectic. To solve the equation of motion, most programs use explicit canonical integration techniques, e.g.:

  • Thin lens tracking (most common since thin-lens kick maps are automatically symplectic and fast; a minimal sketch is given after this list)

  • Ray tracing (accuracy by slicing into large number of steps, but time consuming)

  • Symplectic integration (see [54] and references therein)
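The following minimal Python sketch illustrates thin-lens kick tracking with a Hénon-type one-turn map (linear rotation plus a thin sextupole kick); the tune, kick strength and starting amplitude are arbitrary assumed values.

import numpy as np

def rotation(x, px, mu):
    """Linear one-turn map in normalized coordinates: a pure phase-space rotation."""
    c, s = np.cos(mu), np.sin(mu)
    return c * x + s * px, -s * x + c * px

def sextupole_kick(x, px, k):
    """Thin sextupole kick: only the momentum changes, so the map is exactly symplectic."""
    return x, px + k * x**2

mu, k = 2 * np.pi * 0.254, 1.0      # assumed phase advance per turn and integrated kick strength
x, px = 0.1, 0.0                    # assumed initial amplitude

for turn in range(100000):
    x, px = rotation(x, px, mu)
    x, px = sextupole_kick(x, px, k)
    if abs(x) > 10.0:               # crude loss criterion, mimicking a dynamic aperture limit
        print(f"particle lost after {turn} turns")
        break
else:
    print(f"particle stable for 100000 turns, final x = {x:.3f}")

Repeating such runs for many initial amplitudes and turn numbers is essentially how the dynamic aperture mentioned above is determined.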

6.6.4.2 Analysis of Tracking Data

Some of the analysis techniques are discussed in the chapter on non-linear dynamics in this handbook in more detail and some are mentioned here for completeness:

  • Taylor maps using Truncated Power Series Algebra (TPSA, [54])

  • Lie algebraic maps [54, 56]

  • Normal form analysis

The results of the analysis include non-linear resonances and distortions, the non-linear tune shift with amplitude and an evaluation of the long term stability. In all cases the interpretation of the results requires a careful analysis of the range where the data are meaningful, in order to avoid wrong conclusions. Typical problems are numerical effects which can lead to unphysical features.

6.6.5 Multi Particle Tracking Codes

Multi particle tracking codes are used when we are concerned with the behaviour of an ensemble of particles. The calculations largely rely on techniques developed for single particle dynamics. Typical applications are the simulation of:

  • Space charge effects, mutual interaction of particles within the same beam.

  • Collective instabilities and interaction with environment (impedance)

  • Beam-beam effects in case of particle colliders, i.e. the interactions with the fields produced by the counter-rotating beam.

  • Electron cloud effects, i.e. secondary electron production by synchrotron radiation

A key issue for multi particle simulation codes is the evaluation of the electromagnetic fields produced by the beams or the environment. New techniques and the availability of parallel computing facilities have allowed vast progress in this field in the last 20 years.

6.6.6 Machine Protection

For large energy and high intensity machines the protection of the machine elements becomes an important part of the design. Simulation codes have to include the interaction of particles with matter.

6.7 Electron-Positron Circular Colliders

Electron-positron (e+e) collider rings have been a mainstay of both discovery and precision physics for half a century: discovery, since the simple initial state can create any particle coupled to the electromagnetic field; precision, from the combination of high luminosity and large cross-sections at a rich spectrum of resonances up to \( \sqrt{s}\simeq 200\kern0.1em \mathrm{GeV} \). While the fundamentals of these machines have remained in essence the same, the technology has matured to the point where luminosities of the latest “factories” exceed what was thought possible in the 1970s and early 1980s by 2–3 orders of magnitude.

These colliders are based on the principle of the synchrotron (Sect. 1.2.6) although the name is barely appropriate for those which enjoy the advantage of full-energy injection. Beams are necessarily bunched by an RF system, which must provide sufficient voltage to compensate the energy lost by synchrotron radiation.

6.7.1 Physics of Electron-Positron Rings

Consider an ideal storage ring constructed with bending and focussing magnets such that a particle of charge e and constant momentum p 0 could circulate on a stable closed orbit, O xy, in transverse phase space (x, p x, y, p y), with local radius of curvature ρ(s). The orbits and optical functions (β x, y(s), dispersion D x, y(s), etc., Chap. 2) of such hypothetical, non-radiating particles are a construct useful in the description of e± dynamics. Real, radiating e+ of energy \( E=\sqrt{p^2{c}^2+{m}^2{c}^4}\simeq pc\simeq {p}_0c \) can circulate in a phase-space neighbourhood of O xy provided RF cavities of a proper frequency and sufficient voltage are added to compensate the average radiative energy loss and provide longitudinal phase stability (e− can circulate in the opposite direction). In a semi-classical picture [49, 53, 57, 58], e± emit photons at random times according to the classical synchrotron radiation spectrum [47, 48] and make stochastic transitions between betatron trajectories corresponding to their instantaneous momenta. This picture can be understood [59] by recognising that a storage ring differs from an atom in that changes, Δn = n u/E, in orbital quantum number, n, corresponding to typical photon emissions of energy u, satisfy n ≫ Δn ≫ 1.

There is no deterministic closed orbit but the full 6D central orbit, O xyz of a bunch of many electrons normally coincides with the attractive stable orbit calculated by averaging over photon emissions to include only the classical deterministic part of the synchrotron radiation (this includes the stable phase with respect to the RF system). If the domain of attraction of this orbit is large enough, the beam can have a good lifetime (Eq. 6.60 below). Because of the energy variation round the ring (localised RF cavities giving “energy-sawtooth”), the transverse projection of O xyz does not coincide with O xy. Figure 6.28 shows an example.

Fig. 6.28
figure 28

The cumulative RF voltage (black dashed line) around the ring and four components of the ideal six-dimensional closed orbit of the e+ beam in CERN’s LEP collider, at a central beam energy of 94 GeV in an optics used in 1998. The fractional deviation of the beam energy on the closed orbit p tc (red) exhibits the “energy sawtooth” due to the energy lost by synchrotron radiation in the eight arcs and its replenishment by RF systems located around the experimental interaction points. The conjugate time-lag coordinate t c (green) reflects the corresponding path-length changes. The horizontal closed orbit x c = D x p tc + x cB (blue) is a combination of the local dispersion orbit and a forced betatron oscillation and its conjugate p xc. Without radiation, these four orbit components would be zero. LEP was a single ring collider with e+ and e beams of similar intensity circulating in the same beam pipe. In normal operation, the average x c for the two beams, which were approximately equal and opposite, \( {x}_c^{+}\simeq -{x}_c^{-} \), was measured and corrected to the central trajectory. At higher energies, these orbits could be separated by a few cm near the RF systems

Neglecting intensity-dependent phenomena, the equilibrium dimensions of the beam are macroscopic quantum effects determined by the balance between radiation damping (the dependence of the classical radiation lost in magnetic fields on the energy, [49, 57, 58] and Sect. 6.5), and the quantum fluctuations (discrete photon nature) of the synchrotron radiation [57, 58]. Generally, the effects are linear enough that the core of the distribution is gaussian in each normal mode coordinate.

The mean-square fractional energy spread in the beam is

$$ \frac{\sigma_E^2}{E^2}=\frac{55}{32\sqrt{3}}\frac{\mathrm{\hslash}}{mc}{\left(\frac{E_0}{mc^2}\right)}^2\frac{\oint \left|{G}^3\right| ds}{J_z\oint {G}^2 ds}\simeq \frac{1}{2}{\gamma}^2\frac{\lambda_e}{\rho_0}, $$
(6.57)

where G = eB/p 0 c = ρ −1 is the inverse of the local bending radius of O xy, ∮⋯ds denotes an integral around O xy, J z is the longitudinal damping partition number (Sect. 6.5), λ e = ℏ/mc is the reduced Compton wavelength of the electron and the last equality holds to the extent that G(s) is zero or has a constant value 1/ρ 0 (isomagnetic ring).

Economic arguments, balancing construction cost against power consumption, are sometimes invoked to derive a scaling of radius with energy squared, but this only applies for the highest energy rings with a few bunches (see [60] for the scaling of design parameters). More generally, the chromaticity correction and dynamic aperture constraints (Sect. 3.4) in collider rings require 6σE/E ≲ 1%, imposing a minimum radius ρ/m ≈ 0.26 (E0/GeV)². The spread in centre-of-mass energies of collisions \( {\sigma}_{\sqrt{s}}=\sqrt{2}{\sigma}_E \) (if D x = 0 at the collision point) should also be kept small.

The equilibrium horizontal emittance for flat rings without betatron coupling is

$$ {\varepsilon}_x=\frac{55}{32\sqrt{3}}\frac{\mathrm{\hslash}}{mc}{\left(\frac{E_0}{mc^2}\right)}^2\frac{\oint \left|\mathcal{H}{G}^3\right| ds}{J_x\oint {G}^2 ds}, $$
(6.58)

where \( \mathcal{H}={\gamma}_x{D_x}^2+2{\alpha}_x{D}_x{D}_x^{\prime }+{\beta}_x{D^{\prime}}_x^2 \) is a quadratic form constructed from the dispersion and the betatron matrix (Sect. 2.1). Together, Eqs. (6.57) and (6.58) give the mean-square equilibrium beam size at any point in the ring

$$ {\sigma}_x^2=\left\langle {\left({x}_{\beta }+{D}_x\left(E-{E}_0\right)/{E}_0\right)}^2\right\rangle ={\beta}_x{\varepsilon}_x+{D}_x^2{\left({\sigma}_E/E\right)}^2. $$
(6.59)

The vertical emittance is usually smaller and due to some coupling of horizontal betatron motion into the vertical and vertical dispersion from orbit errors or other vertical bends. More general formalisms [53, 61] describe the radiation-generated emittances for the eigenmodes of linear oscillations about general six-dimensional central orbits.

A true equilibrium (strictly, stationary) state does not exist because the quantum fluctuations lead to loss from the tails of the beam with lifetimes for the three modes given by

$$ {\tau}_{q,u}=\frac{1}{2}{\tau}_u\frac{e^{\xi_u}}{\xi_u},\kern1em \mathrm{where}\kern1em {\xi}_u=\frac{A_u^2}{2{\sigma}_u^2},\kern1em \mathrm{for}\kern1em u=x,y,z,\kern1em {\xi}_u\gtrsim 20, $$
(6.60)

where the τ u are the radiation damping times and A u are appropriate acceptances [53, 57, 58]. For the synchrotron mode, A z is the RF bucket half-height

$$ {\left(\frac{A_z}{E_0}\right)}^2=\frac{2{U}_0}{\pi \left|\eta \right|{h}_{\mathrm{RF}}{E}_0}\left[\sqrt{{\left({eV}_{\mathrm{RF}}/{U}_0\right)}^2-1}-\operatorname{arccos}\left({U}_0/{eV}_{\mathrm{RF}}\right)\right]. $$
(6.61)

For adequate lifetime at small intensity, the mechanical and dynamic apertures and RF voltage must be large enough.

The bunch length is given by σ z = c|η|σ E/(ω s E 0) where ω s is the angular synchrotron frequency and η ≃ α c the frequency slip factor ([53], Sects. 2.5.2 and 2.5.3).

6.7.2 Design of Colliders

Colliders are designed from the interaction point outwards. The classical design is based on head-on collisions of flat beams. However a number of other configurations have been explored and the most promising among them is described in the following section. In the classical scheme, luminosity (Sect. 6.4) is maximised by achieving very flat beams, κ = ε y/ε x ≪ 1; we consider only beams of equal energy, size and single bunch population, N b, colliding head-on, with \( {\sigma}_y^{\ast}\ll {\beta}_y^{\ast } \); for generalisations see [53]. The beam-beam effect (Sect. 4.6.1) generally imposes maximum attainable values on the horizontal and vertical beam-beam parameters

$$ {\xi}_{x,y}=\frac{r_e{N}_b{\beta}_{x,y}^{\ast }}{2\pi \kern0.1em \left({E}_0/{mc}^2\right)\kern0.1em {\sigma}_{x,y}^{\ast}\left({\sigma}_x^{\ast }+{\sigma}_y^{\ast}\right)}, $$
(6.62)

where r e is the electron classical radius, \( {\beta}_{x,y}^{\ast } \) and \( {\sigma}_{x,y,z}^{\ast } \) are the optical functions and beam sizes at the collision point. Typically, one finds maxξ y = 0.03 − 0.1 with the highest values attained when the machine is very well corrected (favourable tunes, central orbits close to design, minimised vertical dispersion) and when radiation damping is strong. Then the luminosity (Sect. 6.4) can be expressed as

$$ L=\frac{f_c{N}_b}{2{r}_e}\left(\frac{E_0}{mc^2}\right)\frac{\left(1+\kappa \right){\xi}_y}{\beta_y^{\ast }}, $$
(6.63)

where f c is the frequency at which identical bunches collide; in the simplest case f c = k b f 0 where k b is the number of bunches per beam.
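For orientation, the sketch below evaluates Eq. (6.63) for an assumed set of B-factory-like parameters; all values are illustrative assumptions, not design numbers quoted in the text.

import math

r_e  = 2.8179403262e-15      # classical electron radius [m]
mc2  = 0.511e6               # electron rest energy [eV]

# Illustrative (assumed) parameters:
E0     = 9.0e9               # beam energy [eV]
f_c    = 2.3e8               # bunch collision frequency [Hz]
N_b    = 4.6e10              # particles per bunch
xi_y   = 0.05                # vertical beam-beam parameter
beta_y = 0.012               # beta_y* at the IP [m]
kappa  = 0.03                # emittance ratio e_y / e_x

# Eq. (6.63); with beta_y* in metres the result is in m^-2 s^-1
L = f_c * N_b / (2 * r_e) * (E0 / mc2) * (1 + kappa) * xi_y / beta_y
print(f"L = {L * 1e-4:.2e} cm^-2 s^-1")

The result, of order 10^34 cm⁻² s⁻¹, shows how a modest ξ_y combined with a centimetre-level β_y* already yields factory-class luminosity.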

The number of bunches in a single-ring collider is limited by the possibilities for separating the opposing beams at unwanted encounters, e.g., by local or long-range (“pretzel scheme”) electrostatic orbit bumps [53]. Collective effects limiting the single-bunch intensity (bunch-lengthening, transverse mode-coupling, see Chap. 4) are a major concern. In recent double-ring colliders, many more bunches can be stored. A crossing angle at the collision point separates the beams at encounters in the adjacent common section of the beam pipe. In recent years, the highest luminosity collider designs have adopted a new scheme described in the following section.

Multi-bunch collective effects and other limits related to total beam current (e.g., component heating by wakefields or synchrotron radiation, beam-loading, electron-cloud, ion-trapping [53]) tend to dominate. The impedance and surface properties of the vacuum chamber are critical.

Integrated luminosity can be further maximised in moderate energy rings for which a full-energy injector is available by topping up the intensity of the stored beam rather than dumping and refilling. The static magnetic configuration (no ramp and squeeze cycle) simplifies operation dramatically.

The arcs of collider rings are usually composed of FODO cells whose length and phase advance determine the emittance through Eq. (6.58). To minimise radiation power, the bending magnets are made as long as possible. In the highest energy rings, the quadrupoles must also be lengthened.

Low-β insertions (Sect. 6.2.1) provide small values of \( {\beta}_y^{\ast } \) at the interaction point(s) of the experiment(s) in long straight sections. These can also accommodate the accelerating cavities of the RF system, beam instrumentation and wiggler magnets and are connected to the arcs via dispersion suppressors.

Wiggler magnets modify the radiation damping, bunch length and/or emittance by contributing additional terms [53] with large |G| to the integrals in Eqs. (6.57) and (6.58), so providing additional flexibility to maximise performance (e.g., at lower energy).

Sextupoles incorporated in the arcs must correct the large chromatic aberrations generated in the low-β quadrupoles while preserving adequate dynamic aperture (Sect. 3.4.4).

Many variations on this classical e+e collider design are possible with new interaction region concepts showing promise (Sect. 6.4) in overcoming the need for ever-increasing beam current and ever-shorter bunches.

At higher intensities, phenomena such as the Touschek effect and intra-beam scattering [53], sometimes in combination with non-linear single particle dynamics or beam-beam effects, can reduce the lifetime below the values implied by Eq. (6.60); see Chap. 3 and Sect. 4.6.

6.7.3 Large Piwinski Angle and Crab Waist Collision Scheme

The need for precision measurements of rare decay modes with small cross sections at e+e− factories has driven requirements on peak luminosity to unprecedented levels. Conventional collision schemes, see Eq. (6.63), are based on pushing up the beam currents, lowering \( {\beta}_y^{\ast } \), and increasing the beam emittance so as not to exceed the beam-beam tune-shift limits. Passing from single to double ring colliders allowed the number of bunches to be increased considerably. In order to avoid luminosity reduction due to parasitic (or long-range) bunch encounters near the collision point, however, the beams had to be collided with a small horizontal crossing angle rather than head-on. This approach has nevertheless come to a dead end, since high currents result in high power losses, beam instabilities and increased power consumption.

Because of the parabolic variation of \( {\beta}_y(s)={\beta}_y^{\ast }+{s}^2/{\beta}_y^{\ast } \) in the vicinity of the interaction point (IP), the longitudinal region in which individual particle collisions occur will include places where the effective β y(s) is much larger than \( {\beta}_y^{\ast } \) at the IP, and these places will therefore contribute less to the luminosity. This so-called hour-glass effect imposes a condition on the bunch length: \( {\sigma}_z\lesssim {\beta}_y^{\ast } \). Unfortunately, shortening the bunch length is costly, since it requires a high voltage in the RF cavities, can excite collective instabilities, induce higher-order mode (HOM) heating in the beam pipe, and lead to coherent synchrotron radiation emission, which in turn deteriorates the bunch shape. On the other hand, increasing the bunch current leads to coupled bunch instabilities, HOM heating of the beam pipe, and higher wall-plug power.

A solution to these problems came with the idea of a new collision scheme, called “Large Piwinski Angle and Crab Waist Sextupoles” (LPA&CW), by P. Raimondi in 2006 [62]. This scheme has two main ingredients:

  1. A large horizontal crossing angle at the IP, combined with very small horizontal beam size, resulting in a large Piwinski angle;

  2. a pair of sextupoles, each placed on one side of the IP at a specific betatron phase from it.

The Piwinski angle is defined as:

$$ \varPhi =\frac{\sigma_z \ tan \left(\theta \right)}{\sigma_x}\approx \theta \frac{\sigma_z}{\sigma_x}. $$
(6.64)

Consider two bunches with RMS beam size σ x and bunch length σ z, colliding at a horizontal crossing angle 2θ. For flat beams colliding at a small crossing angle θ ≪ 1 and large Piwinski angle Φ ≫ 1, the luminosity L and the tune-shifts scale as [63]:

$$ L\kern1em \propto \kern1em \frac{N{\xi}_y}{\beta_y^{\ast }} $$
(6.65)
$$ {\xi}_y\kern1em \propto \kern1em \frac{N}{2{\theta \sigma}_z}\sqrt{\beta_y^{\ast }/{\upepsilon}_y} $$
(6.66)
$$ {\xi}_x\kern1em \propto \kern1em \frac{N}{{\left(2{\theta \sigma}_z\right)}^2} $$
(6.67)

In the LPA scheme the Piwinski angle is increased by decreasing σ x and increasing θ. The most relevant consequence is that the overlap area of the two colliding beams is reduced, since it is proportional to σ x/θ. In addition, as can be seen from Eq. (6.67), the horizontal tune shift drops as 1/(2θσ z)², so the beam-beam interaction can be considered as one-dimensional and only the vertical plane is relevant.

Now the vertical β-function at the IP, \( {\beta}_y^{\ast } \), can be decreased, as far as the focussing magnet technology allows, down to the size of the overlap area, which in this case is much smaller than the bunch length. This relaxes the problems of HOM heating, coherent synchrotron radiation and excessive power consumption:

$$ {\beta}_y^{\ast}\approx \frac{\sigma_x}{2\theta}\ll {\sigma}_z $$
(6.68)
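A short numerical sketch of Eqs. (6.64) and (6.68) makes the scales explicit; the beam size, bunch length and crossing angle below are assumed, SuperKEKB-like values chosen only for illustration.

import math

# Illustrative (assumed) large-Piwinski-angle parameters:
sigma_x = 10e-6       # horizontal IP beam size [m]
sigma_z = 6e-3        # bunch length [m]
theta   = 41.5e-3     # half crossing angle [rad] (full crossing angle 2*theta = 83 mrad)

Phi = sigma_z * math.tan(theta) / sigma_x       # Piwinski angle, Eq. (6.64)
beta_y_star = sigma_x / (2 * theta)             # beta_y* allowed by the overlap length, Eq. (6.68)

print(f"Piwinski angle = {Phi:.1f}")
print(f"beta_y* ~ {beta_y_star * 1e3:.2f} mm, to be compared with sigma_z = {sigma_z * 1e3:.0f} mm")

With these assumptions the Piwinski angle is about 25, and the overlap length, hence the useful β_y*, is a small fraction of a millimetre, i.e. far below the bunch length.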

This scheme has several advantages:

  • a smaller spot size at the IP, leading to higher luminosity,

  • a reduction of the vertical tune-shift parameter,

  • the mitigation of synchro-betatron resonances.

Long range beam-beam interactions no longer limit the maximum achievable luminosity when the distance between bunches is short: these parasitic crossings become negligible because of the larger crossing angle and the smaller horizontal beam size, so that the separation at each encounter is larger in terms of σ x.

However, the large Piwinski angle itself may introduce new beam-beam resonances which can limit the maximum achievable tune shifts. The second ingredient of the LPA&CW scheme, the pair of Crab Waist sextupoles, is designed to solve this problem. The CW transformation suppresses the betatron and synchro-betatron resonances that arise from the modulation of the vertical motion by the horizontal oscillations. The CW scheme is realised by installing a pair of sextupole magnets on the two sides of the IP, preferably in a high-β and zero-dispersion region. To provide the exact compensation, the sextupoles must be placed at a horizontal betatron phase advance of π and a vertical phase advance of π/2 from the IP.

The CW transformation can be described by the Hamiltonian:

$$ H={H}_0+\frac{1}{2\theta }{xp}_y^2 $$
(6.69)

where H 0 is the Hamiltonian of the particle’s motion without the CW, x is the horizontal particle coordinate and p y the vertical momentum. The effect of the CW transformation is a vertical betatron function twist according to:

$$ {\beta}_y={\beta}_y^{\ast }+\frac{{\left(s-x/2\theta \right)}^2}{\beta_y^{\ast }} $$
(6.70)

In this case, the β y waist of one beam is twisted to be oriented along the central trajectory of the other beam. As a consequence, all particles, independently of their x position, collide at the minimum β y spot of both beams, with an increase of a few percent in the geometric luminosity due to the β y redistribution along the overlapping beams area. A sketch of this is shown in Fig. 6.29.

Fig. 6.29
figure 29

Crab Waist collision scheme

However the main CW effect is to suppress the betatron and synchro-betatron resonances which would arise due to the vertical motion modulation induced by the horizontal oscillations. This increases the space for the working betatron tunes of the collider. Moreover beam-beam simulations showed that beam tails are very much reduced and the beam-beam blow-up is also suppressed.

The CW sextupole strength should satisfy the following condition:

$$ K=\frac{1}{2\theta}\frac{1}{\beta_y^{\ast }{\beta}_y}\sqrt{\frac{\beta_x^{\ast }}{\beta_x}} $$
(6.71)

where starred β values are those at the IP and the others are at the sextupole location. The CW sextupoles can reduce the dynamic aperture if there are other non-linearities between them. For this reason they should ideally be installed before the chromaticity correction sextupoles.

The LPA&CW collision scheme was first tested at the DAΦNE Φ-Factory in Frascati (Italy) in 2008 [64], by modifying the interaction region to increase the crossing angle, decrease both β* values and allocate space for the sextupoles. The result was a boost in luminosity by about a factor of 4, and measurements of the beam profile showed that the bunches kept their Gaussian shape. This scheme has since been adopted by all new circular collider designs worldwide (SuperKEKB, FCC-ee, various τ-charm factory proposals). The SuperKEKB collider in Japan, however, has adopted the LPA scheme (which they call "nano-beams") without the CW sextupoles because of the lack of space in the interaction region.

6.8 Hadron Colliders and Electron-Proton Colliders

6.8.1 Principles of Hadron Colliders

Hadron colliders are discovery machines which provide high centre-of-mass energy and cover a wide energy range. Contrary to electron-positron colliders, where the energy and quantum state of the initial particles are precisely known, the input conditions are less well defined in the case of proton-proton or proton-antiproton collisions. In fact, such collisions are, unlike e+e− annihilation, collisions of quarks and antiquarks whose momenta are distributed according to the structure functions of the hadrons and are hence not precisely defined. As far as the analysis of the events is concerned, hadronic collisions result in a much larger number of tracks in the detector than in the case of e+e− annihilation, providing an additional challenge. From a machine physics point of view hadron machines have the enormous advantage that the particle beam energy is not limited by synchrotron radiation. This is because the proton mass is about 2000 times larger than that of the electron, and the energy loss due to synchrotron radiation scales with m⁻⁴. What limits the achievable beam energy in a hadron collider is the magnetic field that the dipole magnets can provide in order to bend the particle beam on a circular trajectory. The radius of curvature of a particle in a dipole field is given by

$$ \frac{1}{\rho}\left[{\mathrm{m}}^{-1}\right]=\frac{eB}{p}=0.2998\frac{B\kern0.5em \left[\mathrm{T}\right]}{p\left[\mathrm{GeV}/c\right]} $$

where ρ is the radius of curvature, B is the magnetic field, e is the elementary charge and p is the particle momentum. The quantity Bρ is called the rigidity [65, 66]. The beam rigidity determines the B-field required to bend the beam on a circular trajectory with given bending radius. The maximum achievable B-field being limited to about 2 Tesla for normal conducting magnets, today’s high-energy hadron colliders use superconducting bending magnets.
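The relation above is easily evaluated; the short sketch below computes the dipole field needed for an assumed LHC-like momentum and bending radius (both numbers are illustrative assumptions).

# B [T] = p [GeV/c] / (0.2998 * rho [m])
p_GeV = 7000.0        # design momentum [GeV/c] (assumed LHC-like value)
rho_m = 2803.95       # effective bending radius inside the dipoles [m] (assumed)

B = p_GeV / (0.2998 * rho_m)
print(f"required dipole field: {B:.2f} T")

The result, about 8.3 T, is far beyond the roughly 2 T reach of normal conducting magnets, which is why superconducting dipoles are mandatory.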

The quasi absence of synchrotron radiation leads to another feature of hadron colliders: there is no synchrotron radiation damping, and the transverse emittance is hence determined in the injector chain and has to be preserved throughout. Emittance blow up, e.g. via injection mismatch, is therefore critical.

6.8.2 Proton-Antiproton Colliders

A machine with a single vacuum chamber, e.g. the Super Proton Synchrotron SPS (called the SppS in this operation mode) in the 1980s or the Tevatron, can accommodate the acceleration of both protons and antiprotons.

During the years 1981–1987, the CERN SPS was operated as a proton-antiproton collider, providing high energy collisions for two major experiments located in adjacent sextants of the accelerator. Operation started with three dense bunches of protons colliding with three rather weak bunches of antiprotons, with no separation of the beams at the unused crossing points. After increasing the antiproton production rate, six bunches per beam were used. The SPS has normal conducting bending magnets and a circumference of 6.9 km. The beam energy provided by the SPS as a proton-antiproton collider was 315 GeV [67, 68]. The SppS was the first hadron collider operating with bunched beams. Before the commissioning of the machine it was debated whether it would be possible to collide proton and antiproton bunches, or whether the beams would become unstable due to the beam-beam interaction in the absence of the damping present in e+e− colliders. Its success demonstrated the feasibility of high energy hadron colliders [69].

Higher energy was achieved by the Tevatron, which used superconducting magnets with a maximum B field of 4.5 T. The circumference of the machine was 6.28 km, comparable to that of the SPS. In the final stage of operations ("Run II"), beams were injected at 150 GeV and accelerated to 980 GeV. The bunches (36+36) circulated in the same aperture, the protons clockwise and the antiprotons anticlockwise. The machine had a lattice with four dipoles followed by a quadrupole, with a total of 772+2 dipoles and 90+90 quadrupoles, plus a number of corrector magnets.

6.8.3 Proton-Proton Colliders

Proton-proton colliders require a dedicated magnet design with two separate vacuum chambers for the two equally charged beams. The first proton-proton collider was the ISR (Intersecting Storage Rings) at CERN. It consisted of two rings of 943 m circumference which intersected at eight points; six of these intersection points were used for experiments. The ISR was operated between 1970 and 1984. The top momentum achieved for protons was 31.4 GeV/c. The ISR not only provided proton-proton collisions, but later also stored and collided deuterons, alpha particles and antiprotons. The ISR pioneered a number of techniques which paved the way for future high energy colliders like the SppS.

The proton-proton collider with the highest energy ever built is the Large Hadron Collider (LHC) at CERN [70]. It uses superconducting magnets with two separate vacuum chambers for the two equally charged beams. The design field is 8.33 T and the machine circumference is 26.659 km, which yields a design beam energy of 7 TeV. Higher field levels are being studied in the frame of possible further energy upgrades.

A machine with an even higher beam energy of up to 50 TeV is presently being studied by an international collaboration. The Future Circular Collider (FCC) has a hadron-hadron option (FCC-hh) with a beam energy of 50 TeV [71]. The latest design features a machine circumference of 97.75 km with a maximum dipole field of 15.7 T. The size of the machine is a compromise between civil engineering constraints and dipole magnet feasibility.

6.8.4 Electron-Proton Colliders

Collisions between electrons and protons are used to study the inner structure of the proton, e.g. the quark and gluon distributions underneath the valence quarks. The electrons are used as a point-like probe to determine the inner structure of the target. These deep inelastic scattering studies were initially performed using an accelerated electron beam on a fixed target. For kinematic reasons, however, a much higher resolution is obtained if two accelerated beams are brought into collision.

Due to the different nature and beam dynamics of the two particle species, an electron-proton collider cannot be built as a single ring machine: it consists of two storage rings of equal circumference, one optimised for the acceleration and storage of electrons, the other for a high energy proton beam. The designs of these two rings look quite different, and completely different effects determine the performance limitations of the two rings. Figure 6.30 shows the two storage rings of the HERA collider:

Fig. 6.30
figure 30

View of the two independent storage rings for electron and proton acceleration in HERA. The super conducting proton lattice is placed on top of the conventional electron ring

HERA was built as a 6.3 km long double ring collider with a beam energy of 27.5 GeV for the electron beam and 920 GeV for the proton beam [72]. The fundamental layout was based on four arcs and four straight sections where the high-energy detectors were located. The proton machine was designed with a superconducting magnet lattice in the arcs to achieve the highest possible beam rigidity (or particle energy). The electron storage ring was built with conventional magnet technology: here the limiting factor was the synchrotron radiation emitted by the electrons, which was too strong to justify superconducting magnet technology. The basic limits for the achievable beam energy therefore were, in the case of the protons, the magnetic field of the bending magnets (B = 5.1 T) and, for the electron ring, the available RF power needed to compensate the synchrotron radiation losses. Both rings were built on top of each other to guarantee an equal revolution time of the circulating particle bunches.

The interaction region of such a two ring collider deserves special attention: While the two beams are brought into collision in a common vacuum system and magnet lattice, they have to be separated after the IP and guided into their respective magnet lattices. Especially in the case of the electron beams the separation has to be performed fast enough, as the strong focusing fields of proton mini beta magnets can only be applied after a full separation of the beams.

Two mini beta insertions therefore have to be installed and combined with an effective beam separation scheme. In the case of HERA the separation was achieved by using the different momenta of the beams: the mini beta quadrupoles of the electron beam were placed offset with respect to their magnetic axis and acted as combined function magnets. Consequently the electron beam, due to its smaller beam rigidity, was bent to the inner side of the ring, and at a distance of s = 20 m the first proton magnet could be installed. A schematic view of this nested interaction region is shown in Fig. 6.31.

Fig. 6.31
figure 31

Layout of the HERA interaction region: The inner triplet of the electron lattice is combined with the beam separation scheme and embedded inside the doublet quadrupoles of the proton mini beta insertion

The advantage of this scheme is its compactness, as beam separation and focusing are obtained at the same time. Special care however is needed, as the electron quadrupoles of the mini beta section have an effect on the proton beam that depends on the momentary electron beam energy, i.e. on the excitation of these magnets. This dynamic influence on the optics and orbit of the protons therefore has to be compensated during the acceleration of the electrons as well as during the beta squeeze.

The luminosity formula for such a double ring collider is given by

$$ L=\frac{1}{2\pi {e}^2{f}_0}\ast \frac{\sum \limits_i\left({I}_{pi}\ast {I}_{ei}\right)}{\sqrt{\sigma_{xp}^2+{\sigma}_{xe}^2}\ast \sqrt{\sigma_{yp}^2+{\sigma}_{ye}^2}} $$

It depends on the product of the single bunch currents Ipi and Iei, summed over the number i of colliding bunch pairs in the rings. As the beams are guided in different magnet lattices, the beam sizes σ are independent of each other. Nevertheless the beams have to be matched, i.e. the beam sizes of the two beams at the IP have to be equal in both planes: σxp = σxe and σyp = σye. This condition deserves special attention as the beam emittances of protons and electrons are quite different and independent beam optics have to be established to achieve matched beam sizes.
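As an illustration of how this expression is evaluated in practice, the sketch below inserts assumed, HERA-like placeholder values (bunch currents, number of bunch pairs, matched beam sizes); none of these numbers are taken from the HERA parameter table.

```python
import math

# Illustrative evaluation of the e-p luminosity formula above.
# All numbers are assumed placeholders, not HERA design values.
e = 1.602e-19          # elementary charge [C]
f0 = 47.3e3            # revolution frequency [Hz] (assumed)
n_pairs = 180          # number of colliding bunch pairs (assumed)
I_p = 0.5e-3           # single-bunch proton current [A] (assumed)
I_e = 0.3e-3           # single-bunch electron current [A] (assumed)
# matched r.m.s. beam sizes at the IP [m] (assumed)
sigma_xp = sigma_xe = 110e-6
sigma_yp = sigma_ye = 30e-6

L = (1.0 / (2.0 * math.pi * e**2 * f0)
     * n_pairs * I_p * I_e
     / (math.sqrt(sigma_xp**2 + sigma_xe**2)
        * math.sqrt(sigma_yp**2 + sigma_ye**2)))
print(f"L ~ {L * 1e-4:.1e} cm^-2 s^-1")   # conversion from m^-2 s^-1
```

With equal single-bunch currents in every colliding pair, the sum over i reduces to a simple multiplication by the number of bunch pairs, as done here.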

Another special feature of an electron proton collider is the synchrotron radiation emitted by the electron beam. Usually this effect is confined to the arc structure where the dipole fields bend the beam onto the design orbit. Due to the separation fields needed in the interaction region, however, synchrotron radiation is also emitted close to the IP and special care is needed to shield the high energy detector from the emitted photons. For the HERA collider this problem has been studied in detail (see Fig. 6.32) and a combination of absorbers and movable collimation masks has been used to avoid hits from direct or back scattered photons in the detector parts.

Fig. 6.32
figure 32

Synchrotron radiation emitted during beam separation in the HERA interaction region. The plot shows schematically the interaction region lattice, the beam dimension and the direction and density of the synchrotron light

The performance limitations of such an e-p collider are given by the single bunch intensity of the protons (limited by the particle source), the overall current of the electrons (limited by RF power or beam instabilities), the number of bunches that can be stored in the machine (limited by technical constraints of the injection elements) and the usual limits of the mini beta insertions that have been mentioned above.

The main parameters of HERA are summarised in Table 6.1:

Table 6.1 Summary of the main parameters of HERA

6.9 Ion Colliders

This section has been authored by Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

Ion colliders are research tools for high-energy nuclear physics. The collisions of fully stripped high-energy ions, that is, atomic nuclei, create matter of a temperature and density that existed in the first microseconds after the Big Bang. The matter created in these high-energy ion collisions is known as the Quark Gluon Plasma (QGP), and the interactions between the quarks and gluons are the subject of the theory of quantum chromodynamics (QCD). The basic interactions are studied in simpler collisions such as e+e− or pp, but heavy-ion collisions allow the study of more complex collective phenomena in QCD. The collisions in ion colliders can create hadronic matter at much higher densities and temperatures than fixed target experiments, although at a much lower luminosity.

The collisions of heavy ions in RHIC and the LHC have yielded a number of new results and revealed phenomena that were unexpected on the basis of previous theoretical understanding. The QGP generated in the heavy ion collisions in RHIC was expected to be weakly interacting, but was found to be strongly interacting, like an almost perfect liquid [73, 74]. Hadronic jets created in the collisions have a rather short mean free path in the QGP, leading to a phenomenon termed “jet quenching” [74], and the largest vorticity ever measured was seen in heavy ion collisions [75]. The collisions also created the heaviest artificially made antimatter nuclei, anti-helium-4 [76, 77]. The higher energies in the LHC create many more hard probes and heavy bound states such as charmonium (J/ψ) or bottomonium (ϒ). Z and W bosons, particles that do not interact with the QGP via the strong interaction, were observed for the first time in heavy ion collisions. The ALICE experiment also reported the highest temperatures directly measured in the laboratory [78].

The colliding nuclei also have high electric charges (Z ~ 80). Together with the powerful Lorentz contraction at high energies, these generate enormous electromagnetic fields outside the nuclear radius. As first shown by Fermi, Weizsäcker and Williams, these fields can be represented as a beam of high energy quasi-real photons, leading to so-called ultraperipheral photonuclear and photon-photon collisions. Besides their intrinsic interest, the high cross-sections for these processes have consequences for the operation of the collider. The ATLAS experiment at the LHC has published the first evidence for light-by-light elastic scattering, a long-predicted fundamental process of nonlinear quantum electrodynamics that goes beyond classical Maxwell theory.

The first ion collider was the CERN Intersecting Storage Rings (ISR), which briefly collided light ions [79, 80] in the late 1970s. The BNL Relativistic Heavy Ion Collider (RHIC) has been in operation since 2000 and collided a number of species at numerous energies. The CERN Large Hadron Collider (LHC) started its Run 1 heavy ion program in 2010 and has provided mainly p-p, p-Pb and Pb-Pb at increasing luminosity with a substantial increase in energy in Run 2 (2015–2018). Both RHIC and the LHC have an expected operating time exceeding 20 years. Further upgrades to the LHC, its injector complex and its experiments, foreseen in the shutdown after Run 2, should allow the integrated luminosity in Runs 3 and 4 (up to 2029) to exceed Runs 1 and 2 by an order of magnitude. Table 6.2 shows all species combinations and energy ranges demonstrated to date for the ISR, RHIC and LHC. All three machines also collide protons. In RHIC the protons are spin-polarized, making the machine the only collider of spin-polarized protons ever built. The LHC is the highest energy proton-proton and heavy-ion collider ever built. Critically, proton-proton collisions at the same energy per nucleon provide reference data for heavy ion collisions. In the following, we will limit our comments to the ion operation in RHIC and the LHC.

Table 6.2 Ion species and energies achieved in ISR, RHIC and LHC as of 2017

Ion colliders differ from proton or antiproton colliders in a number of ways: the preparation of the ions in the source and the pre-injector chain is limited by other effects than for protons; frequent changes in the collision energy and particle species, including asymmetric species, are typical; and the interaction of ions with each other and accelerator components is different from protons. This has implications for collision products, collimation, the beam dump, and intercepting instrumentation devices such as profile monitors. Thus, the performance limitations of heavy-ion colliders are also different from proton-proton colliders.

Figure 6.33 shows the achieved nucleon-pair luminosities L NN, averaged over a store, for all species combinations and energies in RHIC. The plot demonstrates the flexibility of RHIC in colliding different species combinations (all of them at or near the top center-of-mass energy of √s NN = 200 GeV, i.e. 100 GeV/nucleon per beam), energy scans for a number of species combinations (Au+Au, Cu+Cu, d+Au, p↑+p↑), and a luminosity that decreases strongly with decreasing collision energy.

Fig. 6.33
figure 33

Achieved nucleon-pair luminosity L NN, averaged over a store, for all species combinations and energies in RHIC

In the preparation for the collider use, the charge state Z of the ions is successively increased. A high charge state Z increases the bending and acceleration efficiency, but also increases the effects of space charge and intrabeam scattering (IBS). The direct space charge tune shift ΔQ, typically limited to values of less than 0.5, is given by [81]

$$ \varDelta Q=-\frac{\lambda R}{2{\varepsilon}_n\beta {\gamma}^2}\frac{r_0{Z}^2}{A}, $$

where λ is the particle line density, R the machine circumference, ε n the normalized emittance, β and γ the relativistic factors, r 0 the classical proton radius, and A the mass number. IBS growth rates 1/T x,y,s scale like [81]

$$ \frac{1}{T_{x,y,s}}\propto \frac{Z^4}{A^2}\frac{N_b}{\gamma {\varepsilon}_x{\varepsilon}_y{\varepsilon}_s} $$

where N b is the bunch intensity, and ε x,y,s are the normalized emittances. High charge states also reduce the electron stripping probability, and electron stripping at higher energies is generally more efficient.
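To make the Z and A scaling of these two effects concrete, the short sketch below compares only the species-dependent factors Z2/A (space charge) and Z4/A2 (IBS) for protons and fully stripped Au and Pb ions at otherwise equal beam parameters:

```python
# Relative strength of space charge (Z^2/A) and IBS (Z^4/A^2) for equal bunch
# intensity, emittance and energy -- only the species scaling factors of the
# two expressions above are compared (illustrative).
species = {
    "proton": (1, 1),
    "Au79+":  (79, 197),
    "Pb82+":  (82, 208),
}

for name, (Z, A) in species.items():
    sc = Z**2 / A          # scales the space-charge tune shift
    ibs = Z**4 / A**2      # scales the IBS growth rates
    print(f"{name:7s}  Z^2/A = {sc:6.1f}   Z^4/A^2 = {ibs:8.1f}")
# Fully stripped heavy ions are ~30 times worse for space charge and
# ~1000 times worse for IBS than protons of the same bunch intensity.
```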

Table 6.3 shows the charge states and energies in the RHIC and LHC injector chains for Au and Pb respectively, the heavy ion species most often used in these machines. For RHIC, singly charged ions are generated in a hollow cathode or laser ion source (LION) [82] and transferred into an Electron Beam Ion Source (EBIS) [83]. With EBIS, beams of almost any element can be prepared for RHIC, including uranium and spin-polarized 3He. After increasing the charge state to Z = +32 the ions are accelerated through an RFQ and a short linac, and injected into the Booster. After acceleration in the Booster, all but two electrons are stripped before injection into the AGS, and the ions are further accelerated. To increase the intensity of the ion bunches, bunches are merged in both the Booster and the AGS. The last two electrons are stripped in the transfer line from the AGS to RHIC. In RHIC all ions except protons have to cross the transition energy, where bunches become short and peak currents high. In addition, the longitudinal motion is frozen for a short period, and the short bunches can trigger the creation of an electron cloud [84]. This situation makes the beams vulnerable to instabilities [85], which limited the bunch intensity for a number of years [84].

Table 6.3 Preparation of the heavy ions for RHIC and LHC

At CERN an ECR ion source is used, followed by an RFQ and the heavy ion linac LINAC3 [86, 87]. After passing a carbon foil that strips electrons, the ions are accumulated in the Low Energy Ion Ring (LEIR) [88]. During the 71-turn injection and before acceleration the ions are cooled with an electron beam, with a transverse cooling time of 0.2 s. To minimize dynamic vacuum effects from charge-change processes the LEIR vacuum system is designed for a dynamic pressure of less than 10−12 mbar. From LEIR the ions are injected into the PS where the bunches are split to obtain the bunch spacing needed for the LHC. After acceleration in the PS the last remaining electrons are stripped before injection into the SPS. At injection into the SPS, space charge and intrabeam scattering were a concern, and an emittance growth of about 20% is observed at injection. Acceleration in the SPS requires a special fixed frequency acceleration scheme since the main 200 MHz RF system does not have the frequency range required to accelerate heavy ions with a constant harmonic number. The SPS acceleration scheme takes advantage of the fact that the ion bunch train only fills a fraction of the circumference, allowing for an adjustment of the RF phase during the time without beam [89].

The luminosity is given by

$$ L=\left(\beta \gamma \right)\frac{f_{rev}}{4\pi }{k}_c\frac{N_{b1}{N}_{b2}}{\varepsilon_n{\beta}^{\ast }}H $$

where f rev is the revolution frequency, k c the number of bunch-bunch collisions per turn, N b1 and N b2 the bunch intensities in the two beams respectively, ε n the normalized emittance and β∗ the lattice beta function at the interaction point. The factor H accounts for the hourglass effect and crossing angles, and is smaller than, but of order, 1. The luminosity is limited by different effects in RHIC and the LHC.
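For orientation, the sketch below evaluates this expression with assumed, RHIC-like Au–Au placeholder values; the hourglass/crossing factor H is simply set to a guessed number and none of the inputs are taken from Table 6.4.

```python
import math

# Illustrative evaluation of the ion-collider luminosity formula above.
# All values are assumed, RHIC-like placeholders, not official parameters.
gamma = 107.4          # Lorentz factor of the ions (assumed)
beta = math.sqrt(1.0 - 1.0 / gamma**2)
f_rev = 78e3           # revolution frequency [Hz] (assumed)
k_c = 111              # bunch-bunch collisions per turn at one IP (assumed)
N_b = 1.6e9            # ions per bunch, equal in both beams (assumed)
eps_n = 2.5e-6         # normalized r.m.s. emittance [m rad] (assumed)
beta_star = 0.7        # beta function at the IP [m] (assumed)
H = 0.8                # hourglass / crossing-angle reduction factor (guessed)

L = beta * gamma * f_rev / (4.0 * math.pi) * k_c * N_b**2 / (eps_n * beta_star) * H
print(f"L ~ {L * 1e-4:.1e} cm^-2 s^-1")   # conversion from m^-2 s^-1
```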

In RHIC, bunches of fully stripped heavy ions like Au79+ carrying the same total charge as proton bunches have IBS growth rates an order of magnitude larger. In RHIC at injection IBS leads to bunch lengthening, and at store to particle loss out of the RF buckets and to an increase in the transverse emittance. Longitudinal and transverse bunched beam stochastic cooling at store has been implemented [90] to counteract IBS. This, together with an increase in the bunch intensity, has significantly increased the average store luminosity (Fig. 6.34). Table 6.4 shows the latest RHIC parameters for Au–Au operation.

Fig. 6.34
figure 34

RHIC instantaneous Au+Au luminosity in 2007 with longitudinal stochastic cooling in the Yellow ring only, and in 2014 with 3D cooling in both rings. The increase in the initial luminosity is due to an increase in the bunch intensity

Table 6.4 Main operating parameters achieved for the most commonly used heavy ions in RHIC and LHC as of 2017

Other effects that have limited the heavy ion performance in the past include: the availability of high intensity bunches from the injector chain, instabilities at transition [91] driven by the machine impedance and electron clouds (RHIC is the only superconducting accelerator that crosses the transition energy), dynamic pressure increases including pressure instabilities caused by electron clouds [84], beam loading in the storage RF system (bunches are accelerated with h = 360 and transferred into an h = 7×360 system at store), and chromatic lattice aberrations at β∗ < 70 cm.

The LHC heavy ion operation started in 2010 and the luminosity is principally limited by two effects [92, 93]. Firstly, secondary beams generated in collision and having a Z/A ratio different from the primary beam will be lost in the dispersion suppressor, a location with superconducting magnets with a limited ability to absorb heat [94]. Secondly, the collimation efficiency for ions is lower than for protons leading again to losses in uncontrolled regions [95]. Expected LHC ion parameters are shown in Table 6.4. The two most important processes for the generation of secondary beams in collisions are Bound-Free Pair Production (BFPP),

$$ {}^{208}{\mathrm{Pb}}^{82+}+{}^{208}{\mathrm{Pb}}^{82+}\ \overset{\gamma }{\to }\ {}^{208}{\mathrm{Pb}}^{82+}+{}^{208}{\mathrm{Pb}}^{81+}+{\mathrm{e}}^{+}, $$
(6.72)

with a cross section of 281 barn; and Electromagnetic Dissociation (EMD) with a total cross section of 226 barn, about half of which is from the 1-neutron reaction

$$ {}^{208}{\mathrm{Pb}}^{82+}+{}^{208}{\mathrm{Pb}}^{82+}\ \overset{\gamma }{\to }\ {}^{208}{\mathrm{Pb}}^{82+}+{}^{207}{\mathrm{Pb}}^{81+}+\mathrm{n}. $$
(6.73)

Beam losses due to BFPP were observed in RHIC with 63Cu29+ ions [96] and effective mitigation measures have now been implemented at the LHC [97, 98]. These have allowed Pb-Pb luminosities far beyond the design value from 2015 onwards. Collimation of heavy ions is fundamentally different from protons. Protons are scattered at a primary collimator and collected at a secondary collimator. Heavy ions undergo nuclear fragmentation and electromagnetic dissociation in the primary collimator. The fragments created have a wide range of Z/A ratios that are not collected by the secondary collimators. Measurements of collimation efficiency were done in the CERN SPS and compared with detailed simulations to obtain reliable estimates of the heavy ion collimation efficiency in the LHC [95].
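The impact of the BFPP process on the superconducting magnets can be illustrated with a small estimate of the power carried by the 208Pb81+ secondary beam emerging from one interaction point; the luminosity and beam energy used below are assumed values for illustration only.

```python
# Power carried by the BFPP secondary beam (208Pb81+) emerging from one IP.
# Illustrative estimate; luminosity and beam energy are assumed values.
sigma_bfpp_barn = 281.0          # BFPP cross section quoted above [barn]
sigma_bfpp_cm2 = sigma_bfpp_barn * 1e-24
L_cm2s = 1.0e27                  # assumed Pb-Pb luminosity [cm^-2 s^-1]

rate = sigma_bfpp_cm2 * L_cm2s   # Pb81+ ions per second leaving the IP
E_per_ion_TeV = 82 * 7.0         # beam energy per Pb ion at an assumed 7 Z TeV
E_per_ion_J = E_per_ion_TeV * 1e12 * 1.602e-19

print(f"rate ~ {rate:.2e} ions/s, power ~ {rate * E_per_ion_J:.1f} W")
# A localized heat load of this order, deposited in a dispersion-suppressor
# magnet, is what drives the mitigation measures discussed in the text.
```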

The LHC also collided Xe nuclei in 2017 [98] and may collide other species in future, generally with a view to increasing the nucleon-nucleon luminosity.

In the collision of asymmetric species the 2-in-1 magnet design of the LHC requires that the magnetic fields in two rings are the same (the two RHIC rings are independent and can have different fields). For p–Pb operation it is then necessary to have different revolution frequencies at injection and during the energy ramp [99]. Lead beams in the LHC at design energy have noticeable synchrotron radiation damping times (6 h and 13 h longitudinally and transversally) that are of the same order as the IBS emittance growth times (8 h and 13 h longitudinally and transversally) [92].
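The need for unequal revolution frequencies can be made quantitative with a small calculation: at equal magnetic rigidity (imposed by the common 2-in-1 magnets) the proton and Pb beams have slightly different velocities at injection and hence slightly different revolution and RF frequencies. The values below (450 Z GeV injection rigidity, ring circumference, RF frequency) are assumed for illustration.

```python
import math

# Revolution/RF frequency difference between p and Pb beams at equal magnetic
# rigidity (forced by the 2-in-1 magnets). All values assumed for illustration.
c = 2.998e8                  # speed of light [m/s]
circumference = 26_659.0     # ring circumference [m] (assumed)
f_rf = 400.79e6              # RF frequency [Hz] (assumed)

def beta_of(p_GeV, m_GeV):
    """Relativistic beta for total momentum p and rest mass m (GeV units)."""
    return p_GeV / math.sqrt(p_GeV**2 + m_GeV**2)

# Equal rigidity at injection: momentum scales with the charge Z
beta_p  = beta_of(450.0,      0.938)    # proton at 450 GeV/c
beta_pb = beta_of(450.0 * 82, 193.75)   # 208Pb82+ (nuclear mass ~ 193.75 GeV)

df_rev = (beta_p - beta_pb) * c / circumference   # revolution frequency difference
h = round(f_rf * circumference / c)               # approximate RF harmonic number
print(f"revolution frequency difference ~ {df_rev:.2f} Hz")
print(f"RF frequency difference ~ {h * df_rev / 1e3:.1f} kHz")
```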

6.10 Beam Cooling

6.10.1 Introduction

Beam cooling aims at reducing the size and the energy spread of a particle beam circulating in a storage ring or in an ion trap. This reduction of size should not be accompanied by beam loss; the goal is to increase the particle density [100]. Since the beam size varies with the focusing properties of the storage ring, it is useful to introduce normalized measures of size and density. Such quantities are the (horizontal, vertical and longitudinal) emittances and the phase-space density. For our present purpose they may be regarded as the (squares of the) horizontal and vertical beam diameters, the energy spread, and the density, normalized by the focusing strength and the size of the ring to make them independent of the storage ring properties. Phase-space density is then a general figure of merit of a particle beam, and cooling improves this figure of merit. The terms beam temperature and beam cooling have been taken over from the kinetic theory of gases. For visualization one may imagine a beam of particles going around in a storage ring. Particles will oscillate around the beam centre in much the same way that particles of a hot gas bounce back and forth between the walls of a container. The larger the mean square of the velocity of these oscillations in a beam, the larger the beam size. The mean square velocity spread is used to define the beam temperature in analogy to the temperature of the gas which is determined by the kinetic energy of the molecules.

There are several basic motivations for the application and development of different beam cooling techniques:

  • Collection and accumulation of rare particles, e.g. antiprotons or short lived particles such as muons.

  • Improvement of interaction rate and resolution, e.g. collision experiments with antiprotons or with ions; increase in luminosity. For fixed target experiments: sharply collimated and/or highly mono-energetic beams for precision experiments.

  • Preservation of beam quality, mitigation and suppression of beam blow-up.

  • Preparation of crystalline beams.

Several cooling techniques are operational or have been discussed:

  • Radiation cooling (often referred to as radiation damping); linked to energy loss of particles via synchrotron radiation (used in virtually all modern electron synchrotrons).

  • Stochastic cooling (works well for “hot” beams to get them “tempered”).

  • Electron cooling (most suitable for “tempered” beams to get them “cold”).

  • Laser cooling (essentially for ions where two level transitions of electrons can be excited).

  • Ionization- and friction-cooling (mainly discussed in the context of muon cooling).

  • Resistive cooling; used to cool charged particles in a trap where the kinetic energy of the particle is dissipated in the resistive losses of a resonant circuit.

  • Coherent electron cooling, a kind of blend of stochastic cooling at very high frequencies and electron cooling (under development at BNL these days).

The use of the terms cooling and damping is not always well distinguished and unambiguous in the literature. Even in the context of stochastic cooling the authors were using the term damping in the early days. A similar observation can be made for radiation damping and cooling. One may consider defining any action on individual particles as “cooling” and any action on groups of particles as damping. Examples are the feedback systems in circular machines which are commonly referred to as dampers and which prevent emittance blow up, while a very similar feedback system just having a much higher electronic gain can work as (stochastic) cooler and reduce the emittance. However typically such damper systems have a much smaller bandwidth and lower operation frequency as compared to stochastic cooling hardware.

6.10.2 Beam Cooling Techniques

6.10.2.1 Radiation Cooling

Back in 1956, A.A. Kolomenski and A.N. Lebedev [101] pointed out that the ‘synchrotron light’ emitted by an electron moving on a curved orbit can have a damping effect on the motion of the particle. This is because the radiation is sharply peaked in the forward direction. The continuous emission of synchrotron radiation leads to a friction force opposite to the direction of the motion. For a particle moving on the design orbit, the energy loss is restored and the friction force is on average compensated by the RF-system. For a real particle the residual friction force tends to damp the deviation from the design orbit (Fig. 6.35). This cooling force is counteracted by the ‘radiation excitation’: synchrotron light is really emitted in discrete quanta and these many small kicks tend to heat the particle. The final emittances result from the equilibrium of radiation damping and excitation. We will see that a similar interplay between a specific cooling and heating mechanism is characteristic also for the other cooling methods.

Fig. 6.35
figure 35

The principle of transverse cooling by synchrotron radiation (transverse velocities exaggerated)

The theory of cooling by synchrotron radiation is in a mature state. Following up on Sands’ classical treatment of “the physics of electron storage rings”, radiation cooling has found its place in text books. The immense success of modern electron–positron machines, both ‘synchrotron light facilities’ (e.g. ESRF, ALS, APS, BESSY, SPRING8) and colliders (e.g. LEP, PEP II, KEKB), would not have been possible without the full understanding of radiation effects. Virtually all these machines depend critically on radiation cooling to attain the minute emittances necessary in their application. Linear e+e− collider schemes (like CLIC, TESLA, NLC, JLC), too, have to rely on ‘damping rings’ in their injector chain to produce the ultra-high phase-space density required. For historical reasons the reduction of beam emittance due to the emission of synchrotron radiation (typically from leptons) is usually referred to as radiation damping, although the term “cooling” might be more consistent.

The cooling rates as well as the final beam size and momentum spread depend on the lattice functions in regions where the orbit is curved. The art is then to ‘arrange’ these functions such that the desired beam property results. The strategy for ‘low emittance lattices’ is well developed and ‘third-generation machines’ providing beams of extremely high brightness have come into operation. To enhance the cooling, wiggler magnets are used, producing a succession of left and right bends. This increases the radiation and thereby the damping rates. The heating can be kept small by placing the wiggler at locations where the focusing functions of the ring are appropriate to make the particle motion insensitive to kicks. More details for cooling by synchrotron radiation are given in Sect. 6.5.

Radiation cooling and lattice properties of the storage ring are thus intimately linked and by smart design, orders of magnitude in the equilibrium emittances have been gained. This may serve as example for other cooling techniques for which the art of ‘low emittance lattices’ is only now emerging.

6.10.2.2 Microwave Stochastic Cooling

For (anti-)protons and heavier ions, radiation damping is almost negligible at the energies currently accessible in accelerators except for the LHC. One of the ‘artificial’ cooling methods devised for these heavy particles is stochastic cooling by a broadband feedback system (Fig. 6.36). The name “stochastic damping” was coined by Simon van der Meer, who invented this method in 1968 (first published in 1972) [102], to underline the statistical basis of the method. First successful tests and observations were done at the CERN ISR (Intersecting Storage Rings) [102], followed by a dedicated “Initial Cooling Experiment” ICE [103]. In 1984 Simon van der Meer shared the Nobel Prize in physics [104] with Carlo Rubbia for his contribution to the observation of the intermediate vector bosons. Microwave stochastic cooling was considered a key ingredient for reaching sufficient phase space density of the precious and rare antiprotons to produce a small number of W and Z bosons in the CERN Super Proton-Antiproton Synchrotron (Spp̄S) experiment in 1982. At the core of stochastic cooling is the observation that the phase-space density can be increased by a system that acts to reduce the deviation of small sections, called samples, of the beam. By measuring and correcting the statistical fluctuations (‘Schottky noise’) of the sample averages, the spreads in the corresponding beam properties are gradually reduced. Stochastic cooling may thus be viewed as a ‘sampling procedure’ where samples are continuously taken from the beam and the average of each sample is corrected. The basic principle of (transverse) stochastic cooling is sketched in Fig. 6.36.

Fig. 6.36
figure 36

The basic set up for (horizontal) stochastic cooling

A somewhat different picture is based on the behavior of a test particle. At each passage it receives its own ‘coherent’ kick plus the ‘incoherent’ random kicks due to all other sample members. The sample length T s (response time) is given by the bandwidth W of the system through T s ≈ 1/(2W), and the number N s of particles per sample is proportional to T s. Hence a large bandwidth is important to work with small samples. Present day cooling rings have a revolution time between a fraction of a μs (e.g. CERN AD) and about 20 μs (Relativistic Heavy Ion Collider (RHIC), Fermilab bunched beam cooling systems). The sample length T s usually amounts to less than 1 ns, which in this case corresponds to a cooling system bandwidth of 500 MHz, assuming the generalized Nyquist criterion for band-limited signals under ideal assumptions. Thus each sample contains only a small fraction of the total beam population circulating in the machine. Another important ingredient is ‘mixing’, i.e. the renewal of the sample population due to the spread of the particle revolution frequencies.
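A quick numerical illustration of the sampling picture, with assumed beam and ring parameters (not referring to any particular machine):

```python
# Number of particles per sample for an assumed coasting beam (illustrative).
N = 1.0e9           # particles in the ring (assumed)
T_rev = 2.0e-6      # revolution time [s] (assumed, small cooler ring)
W = 0.5e9           # cooling system bandwidth [Hz] (500 MHz, as in the text)

T_s = 1.0 / (2.0 * W)        # sample length (response time) [s]
N_s = N * T_s / T_rev        # particles per sample
print(f"T_s = {T_s * 1e9:.1f} ns, N_s ~ {N_s:.1e} particles per sample")
# Each correction kick thus acts on only a small fraction (~5e-4 here) of the beam.
```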

Based on the ‘sampling’ and/or the ‘test particle’ picture one derives in a few steps [105] a simplified relation for the cooling rate 1/τ of the transverse emittance ε (1/τ = (1/ε) dε/dt), or, for the longitudinal phase space, of the momentum deviation (1/τ = (1/Δp) dΔp/dt):

$$ \frac{1}{\tau }=\frac{W}{N}\Big[\underbrace{2g\left(1-{\tilde{M}}^{-2}\right)}_{\mathrm{coherent\ effect}}-\underbrace{{g}^2\left(M+U/{Z}^2\right)}_{\mathrm{incoherent\ effect}}\Big]. $$
(6.74)

The parameters appearing in Eq. (6.74) have the following significance:

  • N: number of particles in the coasting beam

  • W: cooling system bandwidth

  • g: gain parameter, i.e. the fraction of the sample error corrected per turn (g < 1)

  • M: desired mixing factor (mixing on the way kicker–pick-up = good mixing) (M > 1)

  • \( {\tilde{M}} \): undesired mixing factor (slippage on the way pick-up–kicker = bad mixing) (\( {\tilde{M}} \) > 1)

  • U: noise to signal power ratio (for singly charged particles) (U > 0)

  • Z: charge number of the beam particles (≤ atomic number of the ion) (Z ≥ 1)

There is an optimum value of g for which Eq. (6.74) has a maximum. As to the other parameters, N and Z are properties of the beam, W is a property of the cooling system, and M, \( {\tilde{M}} \) and U depend on the interplay of cooling system, beam and storage ring characteristics. The term in the bracket can at best be 1 but is more like 1/10 to 1/100 in real systems, depending on how well the mixing and noise problems are solved. The ideal cooling rate W/N can be interpreted as the maximum rate at which information on single particles can be acquired. Note that the gain parameter g (fractional sample error correction) should not be confused with the electronic gain of the cooling system, which is typically 120 dB, i.e. 12 orders of magnitude in power.

Lattice parameters are especially important for the achievement of ‘good’ values of M, \( {\tilde{M}} \) and U, maximising the bracket in Eq. (6.74). In addition to the struggle for large bandwidth, the advance in stochastic cooling is intimately linked to progress in dealing with the noise and mixing factors. In summary it can be said that present-day systems are working with a bandwidth of around 1 GHz for an individual cooling system with the possibility of extensions up to nearly 10 GHz by using several cooling bands in the same ring. Limitations on W are discussed in [106].

Turning to the mixing dilemma discussed at length in [107], we note that stochastic cooling only works if after each correction the samples (at least partly) re-randomise (desired mixing), and at the same time a particle on its way from pick-up to kicker does not slip too much with respect to its own signal (undesired mixing). The mixing rates 1/M and \( 1/{\tilde{M}} \) are related to the fraction of the sample length by which a particle with the typical momentum deviation slips with respect to the nominal particle. Here M refers to the way from kicker to pick-up (‘K to P’), and \( {\tilde{M}} \) to the way pick-up to kicker (‘P to K’). Both depend on the flight-time dispersion which in turn is given by the local ‘off-momentum factors’,

$$ {\eta}_{\mathrm{kp}}={\left(\frac{dT}{T}/\frac{dp}{p}\right)}_{\mathrm{kp}}, $$
(6.75)

and the similar quantity η pk respectively. For a regular lattice the beam paths ‘K to P’ and ‘P to K’ consist of a number of identical cells and one has

$$ {\eta}_{\mathrm{kp}}\approx {\eta}_{\mathrm{pk}}\approx \eta =\left|{\gamma}_{\mathrm{tr}}^{-2}-{\gamma}^{-2}\right|, $$
(6.76)

i.e. the local η-factors are close to the off-momentum factor for the whole ring. In this situation the ratio \( {\tilde{M}}/M \) is simply given by the ratio of the corresponding path lengths (T pk/T kp). Then, e.g. in the case of the CERN AD (Antiproton Decelerator), where the cooling loop cuts diagonally across the ring, one has \( {\tilde{M}}\approx M \) instead of the desired \( {\tilde{M}}\gg 1 \), M = 1. The usual compromise is to accept imperfect mixing, letting both \( {\tilde{M}} \) and M be in the range of 3–5, say. The price to pay is a slower cooling rate, for example 1/τ ≤ 0.28W/N in the case of \( {\tilde{M}}=M \) instead of 1/τ ≤ W/N for perfect mixing.
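The numbers quoted in this paragraph follow directly from Eq. (6.74). A minimal sketch, assuming negligible noise (U = 0): optimising the gain g gives a maximum of the bracket of (1 − \( {\tilde{M}} \)−2)2/M, and optimising in addition over M for the case \( {\tilde{M}} \) = M reproduces the quoted factor of roughly 0.28:

```python
# Optimum of Eq. (6.74) for negligible noise (U = 0): the bracket
#   2*g*(1 - 1/Mt**2) - g**2 * M
# is maximised at g_opt = (1 - 1/Mt**2)/M, giving (1 - 1/Mt**2)**2 / M.
def bracket_opt(M, Mt):
    return (1.0 - 1.0 / Mt**2) ** 2 / M

# Perfect mixing (M = 1, Mt -> infinity) recovers the ideal rate W/N:
print(f"perfect mixing: 1/tau <= {bracket_opt(1.0, 1e9):.2f} * W/N")

# Equal desired and undesired mixing (Mt = M): scan M for the best value
candidates = [1.0 + 0.001 * i for i in range(1, 20000)]
best_M = max(candidates, key=lambda M: bracket_opt(M, M))
print(f"Mt = M: best M ~ {best_M:.2f}, "
      f"1/tau <= {bracket_opt(best_M, best_M):.3f} * W/N")
# -> best M ~ sqrt(5) ~ 2.24, i.e. roughly the 0.28 W/N quoted above
```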

‘Optimum mixing lattices’ (also referred to as ‘split ring designs’) have been proposed for the 10 GeV ‘SuperLEAR’ ring [108] (which was, however, never built). The idea is to make the path P to K isochronous (η pk = 0) and the path K to P strongly flight-time dispersive (η kp ≫ 0). These lattice properties have to be reconciled with the many other requirements of the storage ring. The next generation of stochastic cooling rings will use such split ring lattices. They were discussed for RIKEN in Japan [109] and are under construction for GSI and FAIR in Germany [110, 111]. It should be mentioned that the condition η pk = 0, η kp ≫ 0 can increase the cooling rate both for transverse cooling and for longitudinal ‘Palmer-Hereward’ cooling, where the transverse displacement, which reflects both the betatron amplitude and the momentum error of the particles, is used. For momentum cooling by the filter (‘Thorndahl’) method, the split ring design brings less improvement since here the time of flight over a full revolution is used as a measure of momentum. A storage ring with η pk = 0 and η ≈ 1–2% is under construction for GSI and FAIR in Germany, meeting the best conditions for both transverse cooling and filter momentum cooling of antiprotons [112].

Regarding the situation at GSI it should be mentioned that a first successful experiment was performed at the ESR to measure the nuclear radius of the radioactive nucleus 56Ni. To this purpose stochastic precooling and subsequent electron cooling were used in order to accumulate enough intensity for a sufficient S/N in a scattering experiment with an internal hydrogen target [113].

This is not the end of the mixing dilemma: during momentum cooling, as Δp/p decreases, the M-factors increase (c.f. Eq. 6.75) and the mixing situation tends to degrade. One can in principle stay close to the optimum by changing η (‘dynamic transition tuning’) as cooling proceeds. Similar considerations hold for machines with variable working energy where, through a change of η, good mixing can be maintained. Again these improvements might be incorporated in the next generation of cooling rings (e.g. at FAIR [112]).

As for the noise, from Eq. (6.74) it is clear that a balanced design aims at U/Z 2 ≪ M. The noise to signal (power-)ratio depends on the technology of the pre-amplifier and other ‘low level components’ on the one hand and on the sensitivity of the pick-up device on the other hand. There has been great progress in the design of the pick-up and kicker structures and the other components of the cooling loop. These components developed in different labs (e.g. BNL [114], CERN [107], Fermilab [115], Forschungszentrum Jülich (FZJ) [116], GSI [112, 117]) are in fact formidable ‘high-fidelity (HiFi) systems’ with an unprecedented combination of high sensitivity, low noise, great bandwidth, large amplification, very linear phase response, and excellent compatibility with the ultra-high vacuum of the storage ring.

A more detailed discussion of stochastic cooling hardware progress over the last 30 years can be found in [118]. Regarding pick-up and kicker structures, printed versions of the classical λ/4 strip-line couplers arrived in the late 1980s; they are referred to as printed loop or printed slotline couplers and are normally used as “phased arrays” [119, 120]. As for travelling wave structures, starting from the TEM-type slotted line of Faltin [121], McGinnis developed a related device [122] not based on a TEM line but essentially a waveguide directional coupler with slot masks for the coupling. These waveguide-type slot array couplers have the advantage (in contrast to the Faltin version) that they can operate efficiently also for highly relativistic beams, and they exhibit a very high longitudinal and transverse sensitivity over a bandwidth of several 100 MHz in the GHz region. As a particular development, the kicker structure for the BNL RHIC bunched beam stochastic cooling system [123,124,125] is worth mentioning. It consists of an array of cavities which are cut in length and can be opened by a mechanical plunging mechanism in order to let the injected beam pass without aperture limitations.

Another travelling wave structure is the perforated structure which was originally proposed in 2011 [126] and later developed for HIRFL-CSRe stochastic cooling. A large number of small slots in the electrode provides distributed inductive loading, slowing down the phase velocity of the travelling wave structure for low beta beams. This device is very broadband and operates from low frequencies onwards as a forward coupler. Even for the 2.76 m long electrodes used in HIRFL-CSRe, it can be used from a few MHz to 1.2 GHz [127].

Another very promising recent development for pick-ups and kickers are “slot ring” structures [128]. These structures were originally developed for the High Energy Storage Ring (HESR) of the FAIR project at GSI, Germany and successfully tested at the Nuclotron (JINR, Russia) for longitudinal cooling and at COSY (FZJ) for longitudinal and transverse cooling. Slot ring couplers have a fixed aperture and can be used for all three cooling planes simultaneously [128].

In CERN’s Antiproton Decelerator (AD) stochastic cooling is employed at 3.57 GeV/c and 2 GeV/c in both transverse planes and in the longitudinal plane (filter cooling) [129]. The current system uses a set of two kickers and two pick-ups, each combining one transverse plane and the longitudinal cooling, with a total of 4.8 kW installed power. It is undergoing a consolidation and upgrade [130] which includes a notch filter with optical delay lines. Cooling times of 15–20 s reduce the transverse emittances to 3–4 π mm mrad and Δp/p to ±0.3 × 10−3 at 3.57 GeV/c and to ±0.08 × 10−3 at 2 GeV/c at intensities of 5 × 107 antiprotons. The system uses a bandwidth of one octave between 850 MHz and 1.7 GHz. This is the status as of early 2019.

In parallel, class A solid state amplifiers [131] (kicker drivers) gradually took over from TWT (travelling wave tube) units, although TWTs are still in operation for stochastic cooling, e.g. at Fermilab, where they work reliably. Notch filters, required for Thorndahl-type longitudinal cooling (filter cooling), have been implemented since about 1990 with good success in optical fibre technology [125, 132].

Optical signal transmission across the ring (Fermilab debuncher) has been realized with a laser beam in an evacuated metal pipe (no signal fluctuations from temperature and humidity effects of the air on the laser beam). The driving force to select this method of signal transmission was the very tight requirement in terms of transmission delay and delay stability: anything slower than the speed of light would not have permitted timely arrival of the correction signal at the kicker. In 2017 very fast (around 99% of the speed of light) hollow-core optical fibres were applied successfully for analog wideband signal transmission across the ring at COSY (FZ-Jülich, Germany) [133].

Front end amplifiers showed slow but steady progress and these days we can easily get an uncooled 1–2 GHz or 2–4 GHz device with a noise temperature of 30 K.

Examples of remarkable recent progress in the field of microwave stochastic cooling are

  • bunched beam stochastic cooling at RHIC and Fermilab [134,135,136,137],

  • the impressive improvements of the performance and the interplay of all stochastic cooling and stacking systems at Fermilab together with elaborate beam handling methods such as “slip stacking” [135, 136].

It should be noted that there have been unsuccessful attempts to get bunched beam stochastic cooling operational in large machines, despite the fact that one of the first demonstrations of stochastic cooling, in ICE [137], already worked with a bunched beam; however, the bunch length was very large. Attempts which failed were made in the frame of the SPS p-pbar program at CERN [124] and later (around 1990) also in the Tevatron [138, 139]. Bunched beam cooling is of course hampered by the higher particle density in the bunch. In fact, in Eq. (6.74) the number N for the coasting beam has to be replaced by N b/B f = N b∙1.4∙2πR/l b (with R the machine radius and l b the bunch length) for a rough estimate [137, 140]. In addition to these expected effects, the direct (coherent) bunch signal (proportional to N at low frequencies) tends to mask the very weak Schottky signals required for cooling [137, 140]. This is one of the reasons to place the cooling bands towards high frequencies. In addition, a subtle but important difficulty is related to the presence of unexpected and rather strong coherent signals in the bunched beam spectrum which lead to saturation of the front end amplifiers via intermodulation [139, 140]. A theoretical treatment of these persisting “turbulence islands” in the bunch was given by Blaskiewicz [141]. Only in recent years has this problem been mastered, at BNL (gold ions) [135] and also in the Fermilab recycler [140]. Bunched beam stochastic cooling has, however, always been working reasonably well in small machines like the CERN AC [107], LEAR [142], the Fermilab debuncher and accumulator [143] and others, since the relative intensity of those coherent signals was less violent compared to large machines. For the small machines, on the other hand, there was little interest in bunched beam stochastic cooling.

New applications of stochastic cooling may include:

  • fast cooling and stacking of low intensity radioactive ion beams with cooling times of 100 ms or less as discussed for RIKEN [109] and under construction at GSI [110],

  • fast optical stochastic cooling [144,145,146] (e.g. of intense muon beams but also for bunched beam cooling in large rings) for which a bandwidth of 1012–1013 Hz and a new pick-up, kicker and amplifier technology, and new lattice designs have been contemplated.

Let us have a quick look at these developments. The challenge of fast low-intensity cooling can be discussed with reference to Fig. 6.37 [108], which illustrates the optimum cooling time vs. intensity N. For large N the cooling time increases linearly with N with the slope \( 1/W\left[M/{\left(1-{{\tilde{M}}}^{-2}\right)}^2\right] \). This is the mixing and bandwidth limit. For small N, cooling time levels off to a constant ‘noise limited’ value reached for U/Z 2 ≫ M (note that U ∝ 1/N).

Fig. 6.37
figure 37

Sketch of cooling time vs. intensity (for the ‘mixing limit’ τ = 10 N/W is taken in the figure)

The art is to shift the levelling off to small N by improving the signal to noise ratio. Theoretically, short cooling times are then possible (e.g. 10 ms for N = 105 Sn50+ ions and a few 100 MHz bandwidth as discussed for RIKEN). However, other difficulties like the broadband power needed for such a rapid emittance decrease, and the residual RF-structure after debunching may pose new problems for fast cooling and stacking.

Optical stochastic cooling (OSC), proposed by Mikhailichenko, Zholents and Zolotorev [144, 145] in 1993, is an extension of certain basic concepts of microwave stochastic cooling into the optical frequency range, using different pick-up and kicker mechanisms and structures. It is a potentially very promising technique but has not been tested in practice so far. Challenges include, amongst other items, the stability and linearity of the optical signal transmission chain as well as of the circulating hadron beam. Maybe we shall soon see important steps towards this technology at BNL in a forthcoming “coherent electron cooling experiment”.

At BNL stochastic cooling has been implemented at top energy in the RHIC [147]. The bunch cores have full length 5 ns and are spaced by 100 ns. The root mean square Schottky voltage is typically 10% of the coherent voltage generated by the average bunch shape and multi-kilovolt kicker voltages are required for optimal longitudinal cooling. Several novel technologies were required to meet the challenges. The kicker voltage was obtained by periodically extending the pickup signal, passing it through narrow band filters spaced by 200 MHz (1/5 ns) and driving individual cavities. Taming coherent lines while meeting timing requirements was a serious challenge [148]. When all was said and done the cooling system increased the integrated luminosity of uranium-uranium collisions by a factor of five and typically doubled the gold-gold luminosity.

At Fermilab, OSC is being pursued at the IOTA facility [149]. In OSC a particle emits electromagnetic radiation in the first (pick-up) wiggler. The radiation, amplified in an optical amplifier (OA), then gives a longitudinal kick to the same particle in the second (kicker) wiggler. A magnetic chicane is used to make space for the OA and to delay the particle so as to compensate for the delay of its radiation in the OA, resulting in the simultaneous arrival of the particle and its amplified radiation at the kicker wiggler. The chosen optical wavelength is 800 nm, resulting in bandwidths approaching 1014 Hz. In the proposed test, the use of 100-MeV (γ = 200) electrons instead of protons greatly reduces the cost of the experiment but does not limit its generality and applicability to hadron colliders. The conceptual design of the system is complete, with engineering design of the wiggler and optical hardware underway.

Already in the late 1970s the need for “stochastic stacking” was realized [150]. In the “old” CERN AA (Antiproton Accumulator) [151] early stacking methods were tested and applied in routine operation. In the CERN AAC (Antiproton Accumulator Complex) [152] the antiprotons (p̄) coming from the AC (collector ring) were transferred to the inner ring (AA = accumulator). There, dedicated stack tail and stack core systems took over the antiprotons after they had passed a pre-cooling system in the AA and had been transferred to another orbit by means of RF manipulations. At Fermilab [138] stacking is done in the accumulator ring and later also in the recycler. For the future, stacking with stochastic cooling is planned in the frame of the FAIR project [110]. Stochastic stacking of rare radioactive ions was considered during the planning phase of the RIKEN [110] upgrades between 1990 and about 2000, and for FAIR [110].

At Fermilab huge progress has been made since the year 2000 [153]; this includes stacking with stochastic cooling in three separate machines. The debuncher [154], which accepted ~1.5 × 108 antiprotons every 2.1 s, used the McGinnis waveguide directional couplers in eight bands over the frequency range 4–8 GHz for a factor of 10 reduction in longitudinal and transverse size. A key piece was the implementation of ramping the amplifier gain down during the cycle, to counteract the noise-to-signal degradation for the momentum bands in the notch filters. A 6 dB decrease in gain resulted in a 12% decrease in the 95% momentum width after 2 s of cooling. The Accumulator [154] accepted the same ~1.5 × 108 antiprotons and used the Palmer method to build a ‘stack’. Peak performance reached 2.6 × 1011 antiprotons in an hour, with regular transfers to the Recycler to mitigate the known decrease in performance with larger stacks. The Recycler, using a combination of stochastic and electron cooling [155], regularly reached intensities of greater than 4 × 1012, with a peak intensity of 6.1 × 1012, and delivered over 4 × 1013 antiprotons per week to the collider program.

6.10.2.3 Electron Cooling

The concept of cooling a “hot” beam of ions by mixing it over a short distance in a circular machine with a cold electron beam was developed by Budker [156] in 1966. It was first tested in 1974 with 68 MeV protons at the NAP-M storage ring in Novosibirsk. The notions of ‘beam temperature’ and ‘beam cooling’ were introduced and become particularly lucid in the context of electron cooling, which is readily viewed as temperature relaxation in the mixture of a hot ion beam with a co-moving cold electron ‘fluid’. The equilibrium emittances, obtainable when other ‘heating mechanisms’ are negligible, can easily be estimated from this analogy, assuming equalisation of the temperatures ((MΔv2)ion → (mΔv2)electron). For a simple estimate of the cooling time, another resemblance, namely the analogy with the slowing down of swift particles in matter, can be helpful. A nice presentation of this subject is given in Jackson’s book [157]: the energy loss in matter is due to the interaction with the shell electrons, and in a first approximation these electrons are regarded as free rather than bound. Results for this case can be directly applied to the ‘stopping of the heavy particles in the co-moving electron plasma’. The calculations are performed assuming ‘binary collisions’ involving only one ion and one electron at a time.

Using this approximation the cooling time can be written as

$$ \frac{1}{\tau}\approx \frac{1}{k}\frac{q^2}{A}{\eta}_c{L}_C{r}_e{r}_p\frac{j}{e}\frac{1}{\beta^4{\gamma}^5{\theta}^3}, $$
(6.77)

where

  • k = 0.6: for a Gaussian distribution (not realistic),

  • k = 0.16: for a flattened distribution,

  • q: ion charge number,

  • A: ion mass number,

  • η c: length of cooling section/circumference,

  • L C ≈ 10: Coulomb logarithm (log of max/min impact parameter),

  • r e ≈ 2.8 × 10−13 cm: classical electron radius,

  • r p ≈ 1.5 × 10−16 cm: classical proton radius,

  • j (A/cm2): electron beam current density,

  • e ≈ 1.6 × 10−19 C: elementary charge

  • \( \theta ={\left({\theta}_e^2+{\theta}_i^2\right)}^{1/2}={\left(\frac{T_e}{m_e{c}^2}+\frac{T_i}{m_i{c}^2}\right)}^{1/2} \): r.m.s. angle between electron and ion beams,

  • β, γ: relativistic factors.

The cooling rate (1/τ) thus obtained exhibits the dependence on the main beam and storage ring parameters [158]. Notable is the dependence on both the electron and the ion (both longitudinal and transverse) velocity spreads: \( \tau \propto {\theta}^3\propto \left({\left|\Delta {\boldsymbol{v}}_{e_{\mathrm{rms}}}\right|}^3+{\left|\Delta {\boldsymbol{v}}_{i_{\mathrm{rms}}}\right|}^3\right) \). This indicates an ‘ion spread dominated regime’, where cooling gets faster as the ions cool down until it saturates for \( \left|\Delta {\boldsymbol{v}}_{i_{\mathrm{rms}}}\right|<\left|\Delta {\boldsymbol{v}}_{e_{\mathrm{rms}}}\right| \) (‘electron dominated regime’). Remarkable also is the strong energy dependence predicted in this model: τ ∝ β 4 γ 5, with all other parameters (including the electron current density j) kept constant [159].
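For orientation, Eq. (6.77) can be evaluated numerically. The sketch below uses assumed, illustrative values (roughly of the order encountered in a small low-energy cooler ring) and is only meant to show the order of magnitude and the strong dependence on θ and energy:

```python
# Order-of-magnitude evaluation of the electron cooling time, Eq. (6.77).
# All beam parameters are assumed, illustrative values.
k = 0.16                 # flattened electron velocity distribution
q, A = 1, 1              # protons
eta_c = 0.02             # cooling section length / circumference (assumed)
L_C = 10.0               # Coulomb logarithm
r_e = 2.8e-13            # classical electron radius [cm]
r_p = 1.5e-16            # classical proton radius [cm]
j = 0.1                  # electron beam current density [A/cm^2] (assumed)
e = 1.6e-19              # elementary charge [C]
theta = 3e-3             # r.m.s. angle between electron and ion beams (assumed)
beta, gamma = 0.3, 1.05  # roughly a 50 MeV proton beam (assumed)

inv_tau = (1.0 / k) * (q**2 / A) * eta_c * L_C * r_e * r_p * (j / e) \
          / (beta**4 * gamma**5 * theta**3)
print(f"cooling time ~ {1.0 / inv_tau:.1f} s")
```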

Neglected in the simple theory are the ‘flattened distribution’, the ‘magnetisation’ and the ‘electron space-charge’ effects, all three (also) discovered and explained at Novosibirsk [159, 160]. In essence the flattened distribution effect takes into account that (due to the acceleration) the electron velocity spread is not isotropic but contracted (by [E cathode/E final]1/2) in the longitudinal direction. The magnetisation effect is due to the spiraling (Larmor-) motion of the electrons in the magnetic field of the solenoid that is used to guide the electron beam. Then for electron-ion encounters with long ‘collision times’ (impact parameter ≫ Larmor radius), the transverse electron velocity spread averages to zero. Finally the electron space-charge induces a potential that leads to a parabolic velocity profile v(r) over the beam whereas the ions exhibit a linear dependence v(x) and v(y) given by the storage ring lattice. Hence the difficulty arises to match the ion and electron velocities. Flattening and magnetisation can have a beneficial outcome, whereas space-charge has a hampering influence on the cooling process. All three effects complicate the theory, spoil the hope for simple analytical formulae and obscure the comparison between measurements at different machines, and even different situations at the same cooler. As an example the cooling assembly used in the low energy antiproton ring (LEAR) is sketched in Fig. 6.38.

Fig. 6.38
figure 38

An electron cooling assembly (LEAR electron cooler) from [158]

The electrons are produced in a gun and directed into the cooling region where they overlap the ion beam over a length of 1 m. At the end of the cooling section the electrons are steered away from the ions into a collector where their energy is recuperated. On their whole way from the cathode of the gun to the collector the electrons are usually immersed in a longitudinal magnetic guiding field. This field is either constant over the full length or stronger in the gun region. In the latter case the transverse electron temperature in the overlap region decreases (due to “magnetic expansion”) at the expense of the longitudinal temperature. This can reduce the cooling time in situations where the electron temperature dominates.

An important technical problem is the electron beam power consumption. To reduce direct losses of the beam power the recuperation (energy recovery) method is used. It relies on biasing the collector to a negative potential slightly above the cathode potential. The power consumption is then defined mainly by the product of the beam current and the difference between the collector and gun potentials.

In the two toroidal sections adjacent to the overlap region, the solenoid creating the longitudinal field is curved so as to guide the electron beam parallel to the ions at the entrance and away from them at the exit. Also, in the toroidal regions the solenoid has a larger diameter to permit the passage of the ion beam.

Many papers deal with the ‘exact and general theory’ [161] and computer programs like BETACOOL [162] try to include all the subtle effects. Numerous also are the experimental results from 11 (or so) present and past cooling rings. It is not easy to compare the data from different experiments because the cooling in each plane depends in a complicated way on the emittances in all three directions both of the ion and the electron beam. Moreover different quantities are used to measure/define ‘cooling strength’ (examples: cooling of a large injected beam, response of a cold beam to a ‘kick’ or to a transverse or an energy displacement, equilibrium with heating by noise).

In the context of the accumulation of lead ions for the future Large Hadron Collider (LHC) [163], a program of experiments [164] was performed at the LEAR ring to determine optimum lattice functions [165]. Results indicate rather small optimum betatron functions (3–5 m instead of the expected 10 m) and large dispersion (D = 2–3 m instead of the expected 0–1 m). The dependence on dispersion is not fully reproduced by simple analytical formulae. There are other old questions: e.g. the (dis)advantage of magnetic expansion, the dependence of the cooling time on the charge of the ion, the (dis)advantage of neutralising the electron beam, the enigma of the stability of the cooled beam [166], the puzzle of the anomalously fast recombination of certain ions with cooling electrons [167], the (dis)advantage of a hollow electron beam [168].

Considerations so far concern electron cooling at ‘low energies’ (T e = 2–300 keV) where cooling rings have flourished since the 1980s. More recently medium energy cooling (T e = 1–10 MeV) has re-gained a lot of interest [167,168,169]. Clearly the higher energy requires new technology and extrapolation to a new range of parameters. At Fermilab high energy e-cooling (with up to 5 MeV electrons using electrostatic acceleration for cooling of 8 GeV bunched antiprotons in the recycler) has been successfully developed and implemented. The generation and recirculation of the 4.3 MeV and 0.5 Ampere electron beam and its adaptation to the antiproton beam over a cooling length of 20 m are remarkable achievements. Finally the idea of ‘very high energy electron cooling’ (T e ≥ 50 MeV) has been revived as this might improve the luminosity of RHIC [168, 170]. At this energy the electron beam could circulate in a small ‘low-emittance storage ring’ with strong radiation damping. An attractive alternative is a scheme [170], in which the low-emittance beam after acceleration is re-decelerated after the passage through the cooling section to recuperate its energy.

In summary: 45 years after its invention, the field of electron cooling continues to expand, with exciting old and new questions to be answered. Bunched beam cooling is no longer a magic barrier, and even a merger of electron cooling and stochastic cooling, i.e. “coherent electron cooling” [171], appears on the horizon. In the concept of coherent electron cooling, information on the particle distribution of the hadron beam to be cooled is sampled by an electron beam, amplified, and further downstream fed back onto the hadron beam.

6.10.2.4 Laser Cooling

Due to the pioneering work of the Heidelberg (TSR) [172] and Aarhus (ASTRID) [173] groups in the 1990s, laser cooling in storage rings has evolved into a very powerful technique. Longitudinal cooling times as short as a few milliseconds and momentum spreads as small as 10−6 are reported. These bright perspectives are somewhat tempered by two specific attributes [174]: laser cooling takes place (mainly) in the longitudinal plane and it works (only) for special ions that have a closed transition between a stable (or meta-stable) lower state and a short-lived higher state. The transition is excited by laser light, and the return to the lower state occurs through spontaneous re-emission (Fig. 6.39). ‘Unclosed’ transitions, where the de-excitation to more than one level is possible, are not suited because ions decaying to the ‘wrong’ states are lost for further cooling cycles. This limits the number of ion candidates (although extended schemes with additional lasers to ‘pump back’ from the unwanted states could enlarge the number of ion species susceptible to cooling). Up to now, a few singly charged ions (like Li1+, Be1+ or Mg1+) have been used with ‘normal’ transitions accessible to laser frequencies. Transitions between fine structure, or even hyperfine, levels of highly-charged heavy ions have also been considered, but in that case the cooling force is less pronounced and not so much superior to the electron cooling force, which increases with charge (like Q1.5 or even Q2).

Fig. 6.39
figure 39

Sketch of Laser–ion interaction

The laser irradiates the circulating ions co-linearly over the length of a straight section of the storage ring [174]. The absorption is very sharply resonant at the transition frequency. The Doppler shift (ω = (1 ± v/c)γω_laser) seen by the ion then makes the interaction strongly dependent on its velocity. This leads to a sharp resonance of the absorption as a function of the velocity (Fig. 6.40). The corresponding recoil (friction) force accelerates/decelerates the ions with a maximum rate at the resonant momentum. To obtain cooling to a fixed momentum, a second force f(v) is necessary. It can be provided by a second (counter-propagating) laser, by a betatron core or by an RF-cavity, which decelerates the ions ‘towards the resonance of the first laser’ (Fig. 6.40). The interaction with the laser photons (and hence the cooling) takes place in the direction of the laser beam (longitudinal plane of the ions). De-excitation proceeds by re-emission of photons in all directions, and this leads to heating of the ions in all three planes.

Fig. 6.40
figure 40

Force F(v) due to a single laser and different schemes for cooling to a fixed velocity
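
To make the shape of the friction force in Fig. 6.40 concrete, the following minimal Python sketch evaluates the standard two-level scattering-force expression as a function of ion velocity. The wavelength, linewidth and saturation parameter are illustrative assumptions (not values from the text), the Doppler shift is taken in its non-relativistic limit, and the second, counter-acting force is modelled here as a counter-propagating laser; a betatron core or RF-cavity would enter differently.

```python
import numpy as np

# Sketch of the velocity-dependent light force F(v) behind Fig. 6.40, assuming a
# two-level ion and the standard resonant scattering-force expression.
# All numerical values are illustrative assumptions, not taken from the text.

hbar  = 1.054571817e-34      # J s
lam   = 280e-9               # laser wavelength (assumed, Mg+-like transition)
k     = 2 * np.pi / lam      # photon wavenumber
Gamma = 2 * np.pi * 43e6     # natural linewidth of the transition (assumed)
s0    = 1.0                  # saturation parameter of the laser (assumed)

def scattering_force(v, detuning, direction=+1):
    """Radiation-pressure force on an ion with velocity v (non-relativistic Doppler
    limit); direction = +1 for a co-propagating, -1 for a counter-propagating laser."""
    delta_eff = detuning - direction * k * v            # Doppler-shifted detuning
    rate = 0.5 * Gamma * s0 / (1 + s0 + (2 * delta_eff / Gamma) ** 2)
    return direction * hbar * k * rate                  # momentum transfer per unit time

# One scheme of Fig. 6.40: a red-detuned laser plus a counter-acting force, modelled
# here as a second, counter-propagating, red-detuned laser.
v = np.linspace(-60, 60, 7)                             # velocity relative to resonance (m/s)
F = scattering_force(v, -0.5 * Gamma, +1) + scattering_force(v, -0.5 * Gamma, -1)
for vi, fi in zip(v, F):
    print(f"v = {vi:+6.1f} m/s   F = {fi:+.3e} N")
```

The printout shows the friction-like behaviour sketched in Fig. 6.40: the net force vanishes at the reference velocity and pushes ions on either side back towards it.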

Through transverse-to-longitudinal coupling, part of the cooling can be transferred to the horizontal and vertical planes. Intra-beam scattering [175], dispersion [176] and special coupling cavities [177] have been considered for this purpose. Transfer by scattering and by dispersion has been demonstrated at the cooling rings, although the transverse cooling thus obtained was weak, a fact explainable by the weakness of the coupling.

The main motivation for laser cooling has been the goal of achieving ultra-cold crystalline beams [178], where the ions are held in place because the Coulomb repulsion overrides the energy of their thermal motion. A second application, cooling of low-charge states of heavy ions, was proposed [179] in order to prepare high-density drive beams for inertial confinement fusion. Several years ago a study [180] on the use of laser cooling of ions for the LHC was published. All these applications for the moment meet with difficulties: crystallisation, in full three-dimensional beauty, is hampered by the lattice properties of (present) storage rings and by the relative weakness of transverse cooling. Cooling for fusion is not fast enough [181] to ‘compress’ the high-intensity, large-momentum-spread beam during the few-millisecond lifetime given by intra-beam charge exchange between the ions. And, finally, laser cooling of highly charged ions for colliders meets with the competition of electron cooling and also with the restrictions on the choice of suitable ion species and states [180]. The investigations on laser cooling to obtain crystalline beams continue [182], and a special storage ring (S-LSR) with lattice properties apt to reach this goal [183, 184] has been built at Kyoto University.

In conclusion: laser cooling in storage rings has led to very interesting and important results concerning the physics of cooling and cooling rings and also atomic and laser physics. However, ‘accelerator applications’ of the kind established for electron or stochastic cooling are not realistic in the near future. The goal of obtaining crystalline beams in special storage rings is under intense investigation.

6.10.2.5 Ionisation Cooling

Excellent reviews of ionisation cooling are given in papers by Skrinsky [185] and Neuffer [186]. The basic setup (Fig. 6.41) consists of a block of material (absorber) in which the particles lose energy, followed by an accelerating gap (RF-cavity) where the energy loss is restored. Losses in the absorber reduce both the longitudinal and the transverse momentum of the particle. The RF-cavity (ideally) only restores the longitudinal component and the net result is transverse cooling (Fig. 6.41). There is an obvious resemblance to radiation damping (Fig. 6.35), in which energy loss by synchrotron radiation followed by RF-acceleration results in cooling. Longitudinal ionisation cooling is also possible, especially in the range where the loss increases with energy (i.e. above the energy where the minimum of dE/ds occurs). At the expense of horizontal cooling, the longitudinal effect can be enhanced by using a wedge-shaped absorber in a region where the orbits exhibit dispersion with energy.

Fig. 6.41
figure 41

Basic setup for (transverse) ionisation cooling (adapted from [186])

The statistical fluctuations (‘straggling’) of the loss and the angular (multiple) scattering introduce heating of the longitudinal and transverse emittances, respectively. The ratio of ionisation loss to angular scattering favours light absorber materials. Equilibrium emittances depend strongly on the lattice functions at the position of the absorber and the cavity. As in the case of radiation damping, the sum of the cooling rates (also in the case of a wedge absorber) is invariant, with a value J_x + J_y + J_E ≈ 2 + J_E ≈ 2 for the ‘damping partition numbers’, instead of J_x + J_y + J_E = 4 for radiation damping. The quantity J_E depends on the slope of the dE/ds vs. E curve; it is roughly constant and equal to about 0.12 for light materials above the minimum of dE/ds, but strongly negative below. In terms of the partition numbers, the three emittance damping rates can be expressed through the energy loss ΔE_μ of the muons in the absorber and the length Δs of the basic cell (Fig. 6.41) as:

$$ \frac{1}{\varepsilon_i}\frac{d{\varepsilon}_i}{ds}={J}_i\frac{1}{E_{\mu }}\frac{\Delta {E}_{\mu }}{\Delta s}. $$
(6.78)

A large number of cells or traversals through a cell is necessary to obtain appreciable emittance reduction.
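
As a rough numerical illustration of Eq. (6.78) and of the need for many cells, the following Python sketch applies the per-cell emittance reduction cell by cell, neglecting the heating terms (straggling, multiple scattering) so that no equilibrium emittance appears. The partition numbers follow the values quoted above; the muon energy, energy loss per absorber and cell length are assumed, generic numbers, not those of a specific design.

```python
import math

# Cell-by-cell integration of Eq. (6.78): (1/eps_i) d(eps_i)/ds = J_i (1/E_mu) dE_mu/ds,
# with dE_mu counted as the energy *lost* in the absorber (and restored by the RF),
# so each cell shrinks the emittance by exp(-J_i * dE_mu / E_mu).
# Heating by straggling and multiple scattering is ignored in this sketch.

E_mu    = 200.0                 # muon energy in MeV (assumed)
dE_cell = 10.0                  # energy loss per absorber, restored by the cavity (MeV, assumed)
ds_cell = 1.5                   # length of one basic cell in metres (assumed)
J       = {"x": 1.0, "y": 1.0, "E": 0.12}   # damping partition numbers (J_x + J_y + J_E ~ 2)

eps = {plane: 1.0 for plane in J}           # emittances normalised to their initial values

n_cells = 50
for _ in range(n_cells):
    for plane, Ji in J.items():
        eps[plane] *= math.exp(-Ji * dE_cell / E_mu)

print(f"after {n_cells} cells ({n_cells * ds_cell:.0f} m of channel):")
for plane in ("x", "y", "E"):
    print(f"  eps_{plane} reduced to {eps[plane]:.2f} of its initial value")
```

With these assumed numbers the transverse emittances shrink by roughly an order of magnitude over 50 cells while the longitudinal one changes only weakly, which illustrates why long channels (and wedge absorbers for the longitudinal plane) are needed.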

Almost by a miracle, the muon mass falls into a narrow ‘window’ where ionisation cooling within the short life of the particle looks possible (although not easy). For electrons as well as for protons and heavier particles, the method is not practical, because the effects of bremsstrahlung (for electrons) and of non-elastic processes in the absorber (for protons) lead to unacceptable losses.

With the revival of interest in muon colliders and, related to that, neutrino factories [187], a large collaboration (including more than 15 institutes [188]) is undertaking a demonstration experiment, MICE. The ISIS accelerator at the Rutherford Appleton Laboratory has been chosen for this task. Neutrino factory and muon collider proposals rely critically on muon cooling: typically 50 m to several 100 m long channels with solenoidal focussing (superconducting solenoids) are foreseen to reduce the phase-space of the muons emerging from pion decay. Liquid hydrogen absorbers, each 0.5–1 m in length, alternate with high-field accelerating cavities.

The variant selected by MICE is a ‘single particle experiment’ where one muon at a time is traced. Fast spectrometers, capable of resolving one muon per 25 ns, record and compare the three position coordinates and the three velocity components of the muon at the entrance and the exit of a short cooling section. Typically such a test section should lead to a 10% emittance reduction. The emittance pattern is ‘painted’ by a scatterer or by steering magnets, changing the entrance conditions of the particle at random (scatterer) or in a programmed manner (magnets). A large number of muons is necessary to establish the six-dimensional phase-space reduction with sufficient statistics.
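
To illustrate how such single-particle records translate into a statistical statement about emittance reduction, the toy Python sketch below builds an RMS emittance from the covariance matrix of the recorded coordinates at entrance and exit. The Gaussian sample, the 4D transverse emittance definition used and the assumed 10% shrink of the momentum coordinates are illustrative choices, not the actual MICE reconstruction or analysis chain.

```python
import numpy as np

# Toy statistical sketch: reconstruct an RMS emittance from single-particle records
# (here the transverse coordinates x, px, y, py) at the entrance and exit of a
# cooling section and compare the two. Not the real MICE analysis.

rng = np.random.default_rng(0)

def rms_emittance_4d(samples):
    """4D transverse RMS emittance of an (N, 4) array of (x, px, y, py) samples,
    defined as the fourth root of the determinant of their covariance matrix."""
    return np.linalg.det(np.cov(samples, rowvar=False)) ** 0.25

n_muons  = 100_000
entrance = rng.multivariate_normal(np.zeros(4), np.eye(4), n_muons)   # assumed Gaussian beam
exit_    = entrance.copy()
exit_[:, [1, 3]] *= 0.90        # pretend cooling shrinks px, py by 10% (illustrative)

eps_in, eps_out = rms_emittance_4d(entrance), rms_emittance_4d(exit_)
print(f"entrance emittance : {eps_in:.3f} (arbitrary units)")
print(f"exit emittance     : {eps_out:.3f}")
print(f"relative reduction : {100 * (1 - eps_out / eps_in):.1f} %")
```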

Apart from the spectrometers, other challenges can be identified: long-term mechanical stability, muon decay and birth, contamination with other particles, and non-linearities in the focussing which deform the emittance pattern. In the coming years we will see a large effort on muon cooling scenarios and tests.

6.10.2.6 Cooling of Particles in Traps

In many experiments utilizing ion traps, the ions must first be cooled in order to perform high precision measurements. Cooling refers here to the reduction of the kinetic energy of the confined particles. A detailed review of cooling in traps is given in [189], and the implementation of several cooling methods in a large project is described in [190].

With adequate modifications, most of the methods discussed above for storage rings (stochastic [191], electron [192] or laser cooling [193]) can also be applied to traps. Some (like stochastic cooling) are more difficult, others (e.g. laser cooling) are much more powerful in traps.

There are, however, several techniques which are especially adapted to, or even only work in, the environment of particles confined in a trap, which is frequently cryogenically cooled. A gross classification divides them into lossy and lossless methods in terms of the conservation of the number of particles. A lossy technique (which in the strict definition of [100] would not be classified as “phase density cooling”) is evaporative cooling. In this case, just as in the evaporation of water, the most energetic particles leave the trap and the temperature of the remaining ensemble is thereby strongly reduced. Experiments at the forefront of physics, producing Bose-Einstein condensates at temperatures of a tiny fraction of a degree, have thus become possible [194].

An example of a widely used lossless process is resistive cooling: the trap electrodes are connected to an external circuit to dissipate energy from the ions through induced currents [189, 195] (Fig. 6.42).

Fig. 6.42
figure 42

Principle of resistive cooling of a trapped ion

In other words, the particle’s kinetic energy is damped by I^2R losses in a resistive circuit [195, 196]. Ideally, the resistor, or the losses in a resonant circuit, absorb the particle’s energy and bring it into thermal equilibrium when no other heating source is involved. Since the resistor has a specific physical temperature, it generates Johnson noise that in turn stochastically drives the trapped particles. Resistive cooling was first applied by H. Dehmelt and collaborators in 1975 [195].

To estimate the cooling time, a simple single-particle model is used, whereby the particle is harmonically bound between two capacitor plates [195]. In this model, the energy is damped with a time constant τ given by:

$$ \tau =\frac{4m{z}_0^2}{q^2R}. $$
(6.79)

Here 2z_0 is the separation of the capacitor plates (the electrodes of the trap), R stands for the real part of the impedance of the attached external circuit, q is the charge and m the mass of the trapped particle. From Eq. (6.79) [189] one can easily conclude that light, highly charged particles are cooled efficiently. The cooling rate can be further improved by providing a high resistance in the external circuit.
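
As a quick numerical illustration of Eq. (6.79), and of the statement that light, highly charged particles cool fastest, the sketch below evaluates the time constant for a few particle species. The electrode half-separation z_0 and the circuit resistance R are assumed, generic values, not those of any particular trap.

```python
import scipy.constants as const

# Illustrative evaluation of Eq. (6.79), tau = 4 m z0^2 / (q^2 R).
# z0 and R are assumed values; the particle data come from scipy.constants.

z0 = 5e-3      # half the electrode separation, i.e. 2*z0 = 10 mm (assumed)
R  = 1e5       # real part of the external circuit impedance, in ohm (assumed)

def cooling_time(mass_kg, charge_C):
    """Resistive-cooling time constant of Eq. (6.79)."""
    return 4 * mass_kg * z0**2 / (charge_C**2 * R)

cases = {
    "electron":    (const.m_e,     const.e),
    "proton":      (const.m_p,     const.e),
    "Pb ion, 54+": (208 * const.u, 54 * const.e),
}
for name, (m, q) in cases.items():
    print(f"{name:12s} tau = {cooling_time(m, q):.2e} s")
```

The scaling τ ∝ m/q^2 makes the charge-state dependence explicit: the heavy ion, despite its mass, cools faster than the proton because of its high charge.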

In general, the external circuit, which is often in vacuum and at cryogenic temperature, includes a low-noise amplifier to bring the induced current signal to room temperature and thus enable plasma diagnostics. The input noise temperature of the amplifier is, depending on its coupling to the resonant circuit, closely related to the minimum achievable temperature of the particles in the trap. Most frequently, the impedance Z (with real part R) shown in Fig. 6.42 is implemented as an inductance L, so that the circuit becomes resonant at the oscillation frequency of the ions; this tunes out (compensates at resonance) the parasitic capacitance of the electrodes. The inductance may be realised as a discrete solenoid coil made of copper or superconducting wire. The quality factor Q = R/ωL of the tuned circuit has to be large to guarantee efficient resistive cooling; a high Q in turn requires a low-loss network. As a caveat it should be mentioned that extremely high Q values (above, say, 10^5) may be problematic if the bandwidth of the resonance becomes smaller than the width of the particle spectrum. For further reading we refer to [197], where Shockley presents the basic equations for trapped charged particles in a Penning trap.
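
To make these tuned-circuit considerations tangible, the sketch below takes an assumed axial ion frequency and parasitic electrode capacitance, computes the inductance that brings the circuit to resonance (ω = 1/√(LC)), derives the effective parallel resistance on resonance from Q = R/(ωL), and feeds that resistance back into Eq. (6.79). All numbers are illustrative assumptions.

```python
import math

# Illustrative tuned-circuit estimate for resistive cooling. The axial frequency,
# parasitic capacitance, quality factor and trap size are assumed values.

f_ion = 700e3            # axial oscillation frequency of the ion (Hz, assumed)
C_par = 20e-12           # parasitic capacitance of electrodes plus wiring (F, assumed)
Q     = 2e4              # quality factor of the resonant circuit (assumed)

omega = 2 * math.pi * f_ion
L     = 1 / (omega**2 * C_par)   # inductance that tunes out C_par at the ion frequency
R_eff = Q * omega * L            # effective parallel resistance on resonance, from Q = R/(omega L)

print(f"required inductance  L     = {L * 1e3:.2f} mH")
print(f"effective resistance R_eff = {R_eff / 1e6:.1f} MOhm")

# insert R_eff into Eq. (6.79) for, e.g., a single trapped proton (z0 assumed)
m_p, e, z0 = 1.6726e-27, 1.6022e-19, 5e-3
tau = 4 * m_p * z0**2 / (e**2 * R_eff)
print(f"cooling time constant tau  = {tau * 1e3:.0f} ms")
```

The example shows why a high-Q (e.g. superconducting) resonator is attractive: the cooling time scales inversely with the effective resistance, i.e. with Q, as long as the resonance bandwidth still covers the particle spectrum.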

There exist a large number of other cooling techniques used in traps, such as collisional cooling, RF and optical sideband cooling (resolved and unresolved sideband methods) and sympathetic cooling. The list of examples given here is certainly not exhaustive, and detailed descriptions can be found in review articles [189]. As for the term RF cooling (which may be confused with RF-sideband cooling) [189], it should be pointed out that this refers to the reduction of the temperature or vibration amplitude of a microscopic cantilevered bar in vacuum using an RF resonant circuit which is fed externally with a few milliwatts of RF power [198].