1.1 An Overview of the Fundamental Interactions

A possible goal of fundamental physics is to reduce all natural phenomena to a set of basic laws and theories which, at least in principle, can quantitatively reproduce and predict experimental observations. At the microscopic level all the phenomenology of matter and radiation, including molecular, atomic, nuclear, and subnuclear physics, can be understood in terms of three classes of fundamental interactions: strong, electromagnetic, and weak interactions. For all material bodies on the Earth and in all geological, astrophysical, and cosmological phenomena, a fourth interaction, the gravitational force, plays a dominant role, but this remains negligible in atomic and nuclear physics. In atoms, the electrons are bound to nuclei by electromagnetic forces, and the properties of electron clouds explain the complex phenomenology of atoms and molecules. Light is a particular vibration of electric and magnetic fields (an electromagnetic wave). Strong interactions bind the protons and neutrons together in nuclei, being so strongly attractive at short distances that they prevail over the electric repulsion due to the like charges of protons. Protons and neutrons, in turn, are composites of three quarks held together by the strong interactions that occur between quarks and gluons (hence these particles are called “hadrons” from the Greek word for “strong”). The weak interactions are responsible for the beta radioactivity that makes some nuclei unstable, as well as the nuclear reactions that produce the enormous energy radiated by the stars, and in particular by our Sun. The weak interactions also cause the disintegration of the neutron, the charged pions, and the lightest hadronic particles with strangeness, charm, and beauty (which are “flavour” quantum numbers), as well as the decay of the top quark and the heavy charged leptons (the muon \(\mu^{-}\) and the tau \(\tau^{-}\)). In addition, all observed neutrino interactions are due to these weak forces.

All these interactions (with the possible exception of gravity) are described within the framework of quantum mechanics and relativity, more precisely by a local relativistic quantum field theory. To each particle, treated as pointlike, is associated a field with suitable (depending on the particle spin) transformation properties under the Lorentz group (the relativistic spacetime coordinate transformations). It is remarkable that the description of all these particle interactions is based on a common principle: “gauge” invariance. A “gauge” symmetry is invariance under transformations that rotate the basic internal degrees of freedom, but with rotation angles that depend on the spacetime point. At the classical level, gauge invariance is a property of the Maxwell equations of electrodynamics, and it is in this context that the notion and the name of gauge invariance were introduced. The prototype of all quantum gauge field theories, with a single gauged charge, is quantum electrodynamics (QED), developed in the years from 1926 until about 1950, which is indeed the quantum version of Maxwell’s theory. Theories with gauge symmetry in four spacetime dimensions are renormalizable and are completely determined given the symmetry group and the representations of the interacting fields. The whole set of strong, electromagnetic, and weak interactions is described by a gauge theory with 12 gauged non-commuting charges. This is called the “Standard Model” of particle interactions (SM). Actually, only a subgroup of the SM symmetry is directly reflected in the spectrum of physical states. A part of the electroweak symmetry is hidden by the Higgs mechanism for spontaneous symmetry breaking of the gauge symmetry.

The theory of general relativity is a classical description of gravity (in the sense that it is non-quantum mechanical). It goes beyond the static approximation described by Newton’s law and includes dynamical phenomena like, for example, gravitational waves. The problem of formulating a quantum theory of gravitational interactions is one of the central challenges of contemporary theoretical physics. But quantum effects in gravity only become important for energy concentrations in spacetime which are not in practice accessible to experimentation in the laboratory. Thus the search for the correct theory can only be done by a purely speculative approach. All attempts at a description of quantum gravity in terms of a well defined and computable local field theory along similar lines to those used for the SM have so far failed to lead to a satisfactory framework. Rather, at present, the most complete and plausible description of quantum gravity is a theory formulated in terms of non-pointlike basic objects, the so-called “strings”, extended over much shorter distances than those experimentally accessible and which live in a spacetime with 10 or 11 dimensions. The additional dimensions beyond the familiar 4 are, typically, compactified, which means that they are curled up with a curvature radius of the order of the string dimensions. Present string theory is an all-comprehensive framework that suggests a unified description of all interactions including gravity, in which the SM would be only a low energy or large distance approximation.

A fundamental principle of quantum mechanics, the Heisenberg uncertainty principle, implies that, when studying particles with spatial dimensions of order \(\Delta x\) or interactions taking place at distances of order \(\Delta x\), one needs as a probe a beam of particles (typically produced by an accelerator) with momentum \(p\gtrsim \hslash /\Delta x\), where \(\hslash\) is the reduced Planck constant (\(\hslash = h/2\pi\)). Accelerators presently in operation, like the Large Hadron Collider (LHC) at CERN near Geneva, allow us to study collisions between two particles with total center of mass energy up to \(2E \sim 2pc\lesssim 7\)–14 TeV. These machines can, in principle, study physics down to distances \(\Delta x\gtrsim 10^{-18}\) cm. Thus, on the basis of results from experiments at existing accelerators, we can indeed confirm that, down to distances of that order of magnitude, electrons, quarks, and all the fundamental SM particles do not show an appreciable internal structure, and look elementary and pointlike. We certainly expect quantum effects in gravity to become important at distances \(\Delta x \leq 10^{-33}\) cm, corresponding to energies up to \(E \sim M_{\mathrm{Planck}}c^{2} \sim 10^{19}\) GeV, where \(M_{\mathrm{Planck}}\) is the Planck mass, related to Newton’s gravitational constant by \(G_{N} = \hslash c/M_{\mathrm{Planck}}^{2}\). At such short distances the particles that so far appeared as pointlike may well reveal an extended structure, as would strings, and they may be described by a more detailed theoretical framework for which the local quantum field theory description of the SM would be just a low energy/large distance limit.
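The orders of magnitude quoted above can be checked with a short calculation. The sketch below assumes an ultra-relativistic probe (so that \(pc \sim E\)) and uses the standard value of \(\hslash c\); the function name `probed_distance_cm` is purely illustrative and not part of the text.

```python
# Order-of-magnitude estimate of the distance probed at a given energy,
# using Delta x ~ hbar*c / E (ultra-relativistic probe, p*c ~ E).
HBAR_C_GEV_CM = 1.973269804e-14  # hbar*c in GeV*cm (i.e. 197.327 MeV*fm)


def probed_distance_cm(energy_gev):
    """Return hbar*c / E in cm for an energy given in GeV."""
    if energy_gev <= 0:
        raise ValueError("energy must be positive")
    return HBAR_C_GEV_CM / energy_gev


lhc = probed_distance_cm(14.0e3)     # 14 TeV centre-of-mass energy
planck = probed_distance_cm(1.0e19)  # Planck scale, ~1e19 GeV

print(f"LHC (14 TeV): {lhc:.1e} cm")
print(f"Planck scale: {planck:.1e} cm")
```

Both numbers are rough estimates only; they reproduce the \(10^{-18}\) cm and \(10^{-33}\) cm scales up to factors of order one.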

From the first few moments of the Universe, just after the Big Bang, the temperature of the cosmic background gradually went down, starting from \(kT \sim M_{\mathrm{Planck}}c^{2}\), where \(k = 8.617 \times 10^{-5}\ \mathrm{eV\,K^{-1}}\) is the Boltzmann constant, down to the present situation where \(T \sim 2.725\) K. Then all stages of high energy physics from string theory, which is a purely speculative framework, down to the SM phenomenology, which is directly accessible to experiment and well tested, are essential for the reconstruction of the evolution of the Universe starting from the Big Bang. This is the basis for the ever increasing connection between high energy physics and cosmology.

1.2 The Architecture of the Standard Model

The Standard Model (SM) is a gauge field theory based on the symmetry group \(SU(3)\bigotimes SU(2)\bigotimes U(1)\). The transformations of the group act on the basic fields. This group has 8 + 3 + 1 = 12 generators with a nontrivial commutator algebra (if all generators commute, the gauge theory is said to be “Abelian”, while the SM is a “non-Abelian” gauge theory). \(SU(2)\bigotimes U(1)\) describes the electroweak (EW) interactions [225, 316, 359] and the electric charge \(Q\), the generator of the QED gauge group \(U(1)_{Q}\), is the sum of \(T^{3}\), one of the \(SU(2)\) generators, and of \(Y/2\), where \(Y\) is the \(U(1)\) generator: \(Q = T^{3} + Y/2\). \(SU(3)\) is the “colour” group of the theory of strong interactions (quantum chromodynamics QCD [215, 234, 360]).

In a gauge theory, associated with each generator \(T\) is a vector boson (also called a gauge boson) with the same quantum numbers as \(T\), and if the gauge symmetry is unbroken, this boson is of vanishing mass. These vector bosons (i.e., of spin 1) act as mediators of the corresponding interactions. For example, in QED the vector boson associated with the generator \(Q\) is the photon \(\gamma\). The interaction between two charged particles in QED, for example two electrons, is mediated by the exchange of one (or occasionally more than one) photon emitted by one electron and reabsorbed by the other. Similarly, in the SM there are 8 gluons associated with the \(SU(3)\) colour generators, while for \(SU(2)\bigotimes U(1)\) there are four gauge bosons \(W^{+}\), \(W^{-}\), \(Z^{0}\), and \(\gamma\). Of these, only the gluons and the photon \(\gamma\) are massless, because the symmetry induced by the other three generators is actually spontaneously broken. The masses of \(W^{+}\), \(W^{-}\), and \(Z^{0}\) are very large indeed on the scale of elementary particles, with values \(m_{W} \sim 80.4\) GeV and \(m_{Z} \sim 91.2\) GeV, whence they are as heavy as atoms of intermediate size, like rubidium and molybdenum, respectively.
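The comparison with atomic masses can be made quantitative in a few lines. This is a minimal sketch: the atomic mass unit and the standard atomic weights of rubidium and molybdenum are reference values supplied here as assumptions, not numbers taken from the text.

```python
# Compare the W and Z masses with the masses of rubidium and molybdenum atoms.
U_IN_GEV = 0.93149410  # one atomic mass unit in GeV/c^2 (assumed reference value)

M_W_GEV = 80.4  # W mass quoted in the text
M_Z_GEV = 91.2  # Z mass quoted in the text

# Standard atomic weights in atomic mass units (assumed reference values).
RUBIDIUM_U = 85.468
MOLYBDENUM_U = 95.95

rubidium_gev = RUBIDIUM_U * U_IN_GEV
molybdenum_gev = MOLYBDENUM_U * U_IN_GEV

ratio_w = M_W_GEV / rubidium_gev
ratio_z = M_Z_GEV / molybdenum_gev

print(f"m_W / m(Rb) = {ratio_w:.3f}")
print(f"m_Z / m(Mo) = {ratio_z:.3f}")
```

Both ratios come out within a few percent of unity, which is the sense in which the weak bosons are "as heavy as" these atoms.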

In the electroweak theory, the breaking of the symmetry is of a particular type, referred to as spontaneous symmetry breaking. In this case, charges and currents are as dictated by the symmetry, but the fundamental state of minimum energy, the vacuum, is not unique and there is a continuum of degenerate states that all respect the symmetry (in the sense that the whole vacuum orbit is spanned by applying the symmetry transformations). The symmetry breaking is due to the fact that the system (with infinite volume and an infinite number of degrees of freedom) is found in one particular vacuum state, and this choice, which for the SM occurred in the first instants of the life of the Universe, means that the symmetry is violated in the spectrum of states. In a gauge theory like the SM, the spontaneous symmetry breaking is realized by the Higgs mechanism [189, 236, 243, 261] (described in detail in Sect. 1.7): there are a number of scalar (i.e., zero spin) Higgs bosons with a potential that produces an orbit of degenerate vacuum states. One or more of these scalar Higgs particles must necessarily be present in the spectrum of physical states with masses very close to the range so far explored. The Higgs particle has now been found at the LHC with \(m_{H} \sim 126\) GeV [341, 345], thus making a big step towards completing the experimental verification of the SM. The Higgs boson acts as the mediator of a new class of interactions which, at the tree level, are coupled in proportion to the particle masses and thus have a very different strength for, say, an electron and a top quark.

The fermionic matter fields of the SM are quarks and leptons (all of spin 1/2). Each type of quark is a colour triplet (i.e., each quark flavour comes in three colours) and also carries electroweak charges, in particular electric charges \(+2/3\) for up-type quarks and \(-1/3\) for down-type quarks. So quarks are subject to all SM interactions. Leptons are colourless and thus do not interact strongly (they are not hadrons) but have electroweak charges, in particular electric charges \(-1\) for charged leptons (\(e^{-}\), \(\mu^{-}\), and \(\tau^{-}\)) and charge 0 for neutrinos (\(\nu_{e}\), \(\nu_{\mu}\), and \(\nu_{\tau}\)). Quarks and leptons are grouped in 3 “families” or “generations” with equal quantum numbers but different masses. At present we do not have an explanation for this triple repetition of fermion families:

$$\displaystyle{ \left [\,\begin{array}{*{10}c} u\,&\,u\,&\,u\,&\,\upnu _{e} \\ d\,&\,d\,&\,d\,&\,e\\ \end{array} \,\right ]\;,\qquad \left [\,\begin{array}{*{10}c} c\,&\,c\,&\,c\,&\,\upnu _{\upmu }\\ s\, &\,s\, &\,s\, & \,\upmu \\ \end{array} \,\right ]\;,\qquad \left [\,\begin{array}{*{10}c} t\,&\,t\,&\,t\,&\,\upnu _{\uptau } \\ b\,&\,b\,&\,b\,&\,\uptau \\ \end{array} \,\right ]\;. }$$
(1.1)

The QCD sector of the SM (see Chap. 2) has a simple structure but a very rich dynamical content, including the observed complex spectroscopy with a large number of hadrons. The most prominent properties of QCD are asymptotic freedom and confinement. In field theory, the effective coupling of a given interaction vertex is modified by the interaction. As a result, the measured intensity of the force depends on the square \(Q^{2}\) of the four-momentum \(Q\) transferred among the participants. In QCD the relevant coupling parameter that appears in physical processes is \(\alpha_{\mathrm{s}} = e_{\mathrm{s}}^{2}/4\pi\), where \(e_{\mathrm{s}}\) is the coupling constant of the basic interaction vertices of quarks and gluons: \(qqg\) or \(ggg\) [see (1.28)–(1.31)].

Asymptotic freedom means that the effective coupling becomes a function of \(Q^{2}\), and in fact \(\alpha_{\mathrm{s}}(Q^{2})\) decreases for increasing \(Q^{2}\) and vanishes asymptotically. Thus, the QCD interaction becomes very weak in processes with large \(Q^{2}\), called hard processes or deep inelastic processes (i.e., with a final state distribution of momenta and a particle content very different from those in the initial state). One can prove that in four spacetime dimensions all pure gauge theories based on a non-commuting symmetry group are asymptotically free, and conversely. The effective coupling decreases very slowly at large momenta, going as the reciprocal logarithm of \(Q^{2}\), i.e., \(\alpha_{\mathrm{s}}(Q^{2}) = 1/[b\log(Q^{2}/\varLambda^{2})]\), where \(b\) is a known constant and \(\varLambda\) is an energy of order a few hundred MeV. Since in quantum mechanics large momenta imply short wavelengths, the result is that at short distances (or \(Q >\varLambda\)) the potential between two colour charges is similar to the Coulomb potential, i.e., proportional to \(\alpha_{\mathrm{s}}(r)/r\), with an effective colour charge which is small at short distances.
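The slow logarithmic decrease can be illustrated numerically. The sketch below evaluates the leading-logarithm formula quoted above, taking for \(b\) the standard one-loop value \(b = (33 - 2n_{\mathrm{f}})/12\pi\); the choices \(\varLambda = 0.2\) GeV and \(n_{\mathrm{f}} = 5\) are illustrative assumptions, and the function name `alpha_s` is not from the text.

```python
import math


def alpha_s(q_gev, lambda_gev=0.2, n_f=5):
    """Leading-log running coupling: 1 / (b * ln(Q^2 / Lambda^2)).

    b = (33 - 2*n_f) / (12*pi) is the one-loop coefficient for SU(3)
    with n_f quark flavours. Only meaningful for Q well above Lambda.
    """
    if q_gev <= lambda_gev:
        raise ValueError("formula only applies for Q > Lambda")
    b = (33 - 2 * n_f) / (12 * math.pi)
    return 1.0 / (b * math.log(q_gev**2 / lambda_gev**2))


for q in (2.0, 10.0, 100.0, 1000.0):
    print(f"Q = {q:7.1f} GeV  ->  alpha_s = {alpha_s(q):.3f}")
```

The coupling falls by only a modest factor over nearly three orders of magnitude in \(Q\), which is the sense in which the approach to asymptotic freedom is slow. The numbers are indicative only, since higher-order corrections and flavour thresholds are ignored.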

In contrast, the interaction strength becomes large at large distances or small transferred momenta, of order Q < Λ. In fact, all observed hadrons are tightly bound composite states of quarks (baryons are made of qqq and mesons of \(q\bar{q}\)), with compensating colour charges so that they are overall neutral in colour. In fact, the property of confinement is the impossibility of separating colour charges, like individual quarks and gluons or any other coloured state. This is because in QCD the interaction potential between colour charges increases linearly in r at long distances. When we try to separate a quark and an antiquark that form a colour neutral meson, the interaction energy grows until pairs of quarks and antiquarks are created from the vacuum. New neutral mesons then coalesce and are observed in the final state, instead of free quarks. For example, consider the process \(e^{+}e^{-}\rightarrow q\bar{q}\) at large center-of-mass energies. The final state quark and antiquark have high energies, so they move apart very fast. But the colour confinement forces create new pairs between them. What is observed is two back-to-back jets of colourless hadrons with a number of slow pions that make the exact separation of the two jets impossible. In some cases, a third, well separated jet of hadrons is also observed: these events correspond to the radiation of an energetic gluon from the parent quark–antiquark pair.

In the EW sector, the SM (see Chap. 3) inherits the phenomenological successes of the old \((V - A) \otimes (V - A)\) four-fermion low-energy description of weak interactions, and provides a well-defined and consistent theoretical framework that includes weak interactions and quantum electrodynamics in a unified picture. The weak interactions derive their name from their strength. At low energy, the strength of the effective four-fermion interaction of charged currents is determined by the Fermi coupling constant \(G_{\mathrm{F}}\). For example, the effective interaction for muon decay is given by

$$\displaystyle{ \mathcal{L}_{\mathrm{eff}} = \frac{G_{\mathrm{F}}} {\sqrt{2}}\big[\bar{\nu }_{\mu }\gamma _{\alpha }(1 -\gamma _{5})\mu \big]\big[\bar{e}\gamma ^{\alpha }(1 -\gamma _{5})\nu _{e}\big]\;, }$$
(1.2)

with [307]

$$\displaystyle{ G_{\mathrm{F}} = 1.166\,378\,7(6) \times 10^{-5}\ \mathrm{GeV}^{-2}\;. }$$
(1.3)

In natural units \(\hslash = c = 1\) (which we most often use in this work), \(G_{\mathrm{F}}\) has dimensions of (mass)\(^{-2}\). As a result, the strength of weak interactions at low energy is characterized by \(G_{\mathrm{F}}E^{2}\), where \(E\) is the energy scale for a given process (\(E \approx m_{\mu}\) for muon decay). Since

$$\displaystyle{ G_{\mathrm{F}}E^{2} = G_{\mathrm{ F}}m_{\mathrm{p}}^{2}(E/m_{\mathrm{ p}})^{2} \approx 10^{-5}(E/m_{\mathrm{ p}})^{2}\;, }$$
(1.4)

where \(m_{\mathrm{p}}\) is the proton mass, the weak interactions are indeed weak at low energies (up to energies of order a few tens of GeV). Effective four-fermion couplings for neutral current interactions have comparable intensity and energy behaviour. The quadratic increase with energy cannot continue for ever, because it would lead to a violation of unitarity. In fact, at high energies, propagator effects can no longer be neglected, and the current–current interaction is resolved into current–\(W\) gauge boson vertices connected by a \(W\) propagator. The strength of the weak interactions at high energies is then measured by \(g_{W}\), the \(W\)–\(\mu\)–\(\nu_{\mu}\) coupling, or even better, by \(\alpha_{W} = g_{W}^{2}/4\pi\), analogous to the fine-structure constant \(\alpha\) of QED (in Chap. 3, \(g_{W}\) is simply denoted by \(g\) or \(g_{2}\)). In the standard EW theory, we have

$$\displaystyle{ \alpha _{W} = \sqrt{2}G_{\mathrm{F}}m_{W}^{2}/\pi \approx 1/30\;. }$$
(1.5)

That is, at high energies the weak interactions are no longer so weak.
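The two estimates (1.4) and (1.5) are easy to reproduce. The following sketch uses the value of \(G_{\mathrm{F}}\) from (1.3) and \(m_{W} \approx 80.4\) GeV from the text; the proton mass is a standard reference value supplied here as an assumption, and the function name `weak_strength` is illustrative.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2, from (1.3)
M_P = 0.938272      # proton mass in GeV (assumed reference value)
M_W = 80.4          # W mass in GeV, as quoted in the text


def weak_strength(energy_gev):
    """Dimensionless low-energy strength G_F * E^2 (natural units)."""
    return G_F * energy_gev**2


# Prefactor of (1.4): G_F * m_p^2, expected to be of order 1e-5.
gf_mp2 = weak_strength(M_P)

# Equation (1.5): alpha_W = sqrt(2) * G_F * m_W^2 / pi.
alpha_w = math.sqrt(2) * G_F * M_W**2 / math.pi

print(f"G_F * m_p^2 = {gf_mp2:.3e}")
print(f"alpha_W     = {alpha_w:.4f}  (1/alpha_W = {1 / alpha_w:.1f})")
```

The low-energy formula is used here only below the weak scale; as the text explains, it cannot be extrapolated to \(E \gtrsim m_{W}\), where propagator effects take over.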

The range \(r_{W}\) of weak interactions is very short: it was only with the experimental discovery of the \(W\) and \(Z\) gauge bosons that it could be demonstrated that \(r_{W}\) is non-vanishing. Now we know that

$$\displaystyle{ r_{W} = \frac{\hslash } {m_{W}c} \approx 2.5 \times 10^{-16}\ \mathrm{cm}\;, }$$
(1.6)

corresponding to \(m_{W} \approx 80.4\) GeV. This very high value for the \(W\) (or the \(Z\)) mass makes a drastic difference, compared with the massless photon and the infinite range of the QED force. The direct experimental limit on the photon mass is [307] \(m_{\gamma} < 10^{-18}\) eV. Thus, on the one hand, there is very good evidence that the photon is massless, and on the other, the weak bosons are very heavy. A unified theory of EW interactions has to face this striking difference.
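To make the contrast concrete, the range formula of (1.6) can be evaluated for both the \(W\) mass and the photon mass limit. This is a minimal sketch: the value of \(\hslash c\) is a standard constant, the function name `range_cm` is illustrative, and the photon figure is simply the lower bound on the range implied by the quoted mass limit, not an independent measurement.

```python
# Range of a force mediated by a boson of mass m: r = hbar / (m * c),
# evaluated as hbar*c / (m * c^2) with the rest energy given in eV.
HBAR_C_EV_CM = 1.973269804e-5  # hbar*c in eV*cm


def range_cm(rest_energy_ev):
    """Return hbar*c / (m c^2) in cm for a rest energy given in eV."""
    if rest_energy_ev <= 0:
        raise ValueError("rest energy must be positive")
    return HBAR_C_EV_CM / rest_energy_ev


r_w = range_cm(80.4e9)                 # W boson, m_W ~ 80.4 GeV
photon_lower_bound = range_cm(1.0e-18)  # from m_gamma < 1e-18 eV

print(f"r_W                 = {r_w:.2e} cm")
print(f"photon range bound  > {photon_lower_bound:.1e} cm")
```

The two ranges differ by almost thirty orders of magnitude, which is the disparity a unified EW theory has to accommodate.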

Another apparent obstacle in the way of EW unification is the chiral structure of weak interactions: in the massless limit for fermions, only left-handed quarks and leptons (and right-handed antiquarks and antileptons) are coupled to W particles. This clearly implies parity and charge-conjugation violation in weak interactions.

The universality of weak interactions and the algebraic properties of the electromagnetic and weak currents [conservation of vector currents (CVC), partial conservation of axial currents (PCAC), the algebra of currents, etc.] were crucial in pointing to the symmetric role of electromagnetism and weak interactions at a more fundamental level. The old Cabibbo universality [120] for the weak charged current, viz.,

$$\displaystyle\begin{array}{rcl} J_{\alpha }^{\mathrm{weak}}& =& \bar{\nu }_{\mu }\gamma _{\alpha }(1 -\gamma _{ 5})\mu +\bar{\nu } _{e}\gamma _{\alpha }(1 -\gamma _{5})e +\cos \theta _{\mathrm{c}}\bar{u}\gamma _{\alpha }(1 -\gamma _{5})d \\ & & +\sin \theta _{\mathrm{c}}\bar{u}\gamma _{\alpha }(1 -\gamma _{5})s + \cdots \;, {}\end{array}$$
(1.7)

suitably extended, is naturally implied by the standard EW theory. In this theory the weak gauge bosons couple to all particles with couplings that are proportional to their weak charges, in the same way as the photon couples to all particles in proportion to their electric charges. In (1.7), \(d' = d\cos\theta_{\mathrm{c}} + s\sin\theta_{\mathrm{c}}\) is the weak isospin partner of \(u\) in a doublet. The \((u, d')\) doublet has the same couplings as the \((\nu_{e}, e)\) and \((\nu_{\mu}, \mu)\) doublets.

Another crucial feature is that the charged weak interactions are the only known interactions that can change flavour: charged leptons into neutrinos or up-type quarks into down-type quarks. On the other hand, there are no flavour-changing neutral currents at tree level. This is a remarkable property of the weak neutral current, which is explained by the introduction of the Glashow–Iliopoulos–Maiani (GIM) mechanism [226] and led to the successful prediction of charm.

The natural suppression of flavour-changing neutral currents, the separate conservation of e, μ, and τ leptonic flavours that is only broken by the small neutrino masses, the mechanism of CP violation through the phase in the quark-mixing matrix [269], are all crucial features of the SM. Many examples of new physics tend to break the selection rules of the standard theory. Thus the experimental study of rare flavour-changing transitions is an important window on possible new physics.

The SM is a renormalizable field theory, which means that the ultraviolet divergences that appear in loop diagrams can be eliminated by a suitable redefinition of the parameters already appearing in the bare Lagrangian: masses, couplings, and field normalizations. As will be discussed later, a necessary condition for a theory to be renormalizable is that only operator vertices of dimension not greater than 4 (that is \(m^{4}\), where \(m\) is some mass scale) appear in the Lagrangian density \(\mathcal{L}\) (itself of dimension 4, because the action \(S\) is given by the integral of \(\mathcal{L}\) over \(\mathrm{d}^{4}x\) and is dimensionless in natural units such that \(\hslash = c = 1\)). Once this condition is added to the specification of a gauge group and of the matter field content, the gauge theory Lagrangian density is completely specified. We shall see the precise rules for writing down the Lagrangian of a gauge theory in the next section.

1.3 The Formalism of Gauge Theories

In this section we summarize the definition and the structure of a Yang–Mills gauge theory [371]. We will list here the general rules for constructing such a theory. Then these results will be applied to the SM.

Consider a Lagrangian density \(\mathcal{L}[\phi,\partial _{\mu }\phi ]\) which is invariant under a \(D\)-dimensional continuous group \(\varGamma\) of transformations:

$$\displaystyle{ \phi ^{{\prime}}(x) = U(\theta ^{A})\phi (x)\quad \quad (A = 1,2,\ldots,D)\;, }$$
(1.8)

with

$$\displaystyle{ U(\theta ^{A}) =\exp \bigg[\mathrm{i}g\sum _{ A}\theta ^{A}T^{A}\bigg] \sim 1 + \mathrm{i}g\sum _{ A}\theta ^{A}T^{A} + \cdots \;. }$$
(1.9)

The quantities \(\theta^{A}\) are numerical parameters, like angles in the particular case of a rotation group in some internal space. The approximate expression on the right is valid for \(\theta^{A}\) infinitesimal. Then, \(g\) is the coupling constant and \(T^{A}\) are the generators of the group \(\varGamma\) of transformations (1.8) in the (in general reducible) representation of the fields \(\phi\). Here we restrict ourselves to the case of internal symmetries, so the \(T^{A}\) are matrices that are independent of the spacetime coordinates, and the arguments of the fields \(\phi\) and \(\phi'\) in (1.8) are the same.

If \(U\) is unitary, then the generators \(T^{A}\) are Hermitian, but this need not be the case in general (although it is true for the SM). Similarly, if the \(U\) are matrices with unit determinant, then the traces of the \(T^{A}\) vanish, i.e., \(\mathrm{tr}(T^{A}) = 0\). In general, the generators satisfy the commutation relations

$$\displaystyle{ [T^{A},T^{B}] = \mathrm{i}C_{ ABC}T^{C}\;. }$$
(1.10)

For \(A, B, C, \ldots\), up or down indices make no difference, i.e., \(T_{A} = T^{A}\), etc. The structure constants \(C_{ABC}\) are completely antisymmetric in their indices, as can be easily seen. Recall that if all generators commute, the gauge theory is said to be “Abelian” (in this case all the structure constants \(C_{ABC}\) vanish), while the SM is a “non-Abelian” gauge theory.

We choose to normalize the generators \(T^{A}\) in such a way that, for the lowest dimensional non-trivial representation of the group \(\varGamma\) (we use \(t^{A}\) to denote the generators in this particular representation), we have

$$\displaystyle{ \mathrm{tr}\big(t^{A}t^{B}\big) = \frac{1} {2}\delta ^{AB}\;. }$$
(1.11)
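As a concrete instance of (1.10) and (1.11), one can take \(SU(2)\) with \(t^{A} =\sigma^{A}/2\), where \(\sigma^{A}\) are the Pauli matrices and the structure constants are \(C_{ABC} =\epsilon_{ABC}\). The choice of \(SU(2)\) is an illustrative assumption made here, and all helper names in the sketch below are hypothetical; it simply verifies both relations numerically using plain nested lists.

```python
# Check [t^A, t^B] = i * eps_ABC * t^C and tr(t^A t^B) = delta_AB / 2
# for SU(2) generators t^A = sigma^A / 2 (indices run over 0, 1, 2).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[x / 2 for x in row] for row in s] for s in SIGMA]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def check_normalization(tol=1e-12):
    return all(
        abs(trace(matmul(T[a], T[b])) - (0.5 if a == b else 0.0)) < tol
        for a in range(3) for b in range(3)
    )


def check_algebra(tol=1e-12):
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * epsilon(a, b, c) * T[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True


print("normalization (1.11) holds:", check_normalization())
print("algebra (1.10) holds:      ", check_algebra())
```

With this normalization the structure constants come out as \(\epsilon_{ABC}\) with unit magnitude, which illustrates how fixing \(\mathrm{tr}(t^{A}t^{B})\) also fixes the scale of \(C_{ABC}\) and of the coupling \(g\).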

A normalization convention is needed to fix the normalization of the coupling \(g\) and the structure constants \(C_{ABC}\). In the following, for each quantity \(f^{A}\), we define

$$\displaystyle{ \mathbf{f} =\sum _{A}T^{A}f^{A}\;. }$$
(1.12)

For example, we can rewrite (1.9) in the form

$$\displaystyle{ U(\theta ^{A}) =\exp (\mathrm{i}g\boldsymbol{\theta }) \sim 1 + \mathrm{i}g\boldsymbol{\theta } + \cdots \;. }$$
(1.13)

If we now make the parameters \(\theta^{A}\) depend on the spacetime coordinates, whence \(\theta^{A} =\theta^{A}(x_{\mu})\), then \(\mathcal{L}[\phi,\partial _{\mu }\phi ]\) is in general no longer invariant under the gauge transformations \(U[\theta^{A}(x_{\mu})]\), because of the derivative terms. Indeed, we then have \(\partial_{\mu}\phi' = \partial_{\mu}(U\phi) \neq U\partial_{\mu}\phi\). Gauge invariance is recovered if the ordinary derivative is replaced by the covariant derivative

$$\displaystyle{ D_{\mu } = \partial _{\mu } + ig\mathbf{V}_{\mu }\;, }$$
(1.14)

where \(V_{\mu}^{A}\) are a set of \(D\) gauge vector fields (in one-to-one correspondence with the group generators), with the transformation law

$$\displaystyle{ \mathbf{V}_{\mu }^{{\prime}} = U\mathbf{V}_{\mu }U^{-1} - \frac{1} {\mathrm{i}g}(\partial _{\mu }U)U^{-1}. }$$
(1.15)

For constant \(\theta^{A}\), \(\mathbf{V}_{\mu}\) reduces to a tensor of the adjoint (or regular) representation of the group:

$$\displaystyle{ \mathbf{V}_{\mu }^{{\prime}} = U\mathbf{V}_{\mu }U^{-1} \approx \mathbf{V}_{\mu } + \mathrm{i}g[\boldsymbol{\theta },\mathbf{V}_{\mu }] + \cdots \;, }$$
(1.16)

which implies that

$$\displaystyle{ V _{\mu }^{{\prime}C} = V _{\mu }^{C} - gC_{ ABC}\theta ^{A}V _{\mu }^{B} + \cdots \;, }$$
(1.17)

where repeated indices are summed over.
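The step from (1.16) to (1.17) uses only the definition (1.12) and the commutation relations (1.10). Written out as a short worked derivation (no new assumptions beyond those two equations):

```latex
\begin{align*}
\mathrm{i}g\,[\boldsymbol{\theta},\mathbf{V}_{\mu}]
  &= \mathrm{i}g\,\theta^{A}V_{\mu}^{B}\,[T^{A},T^{B}] \\
  &= \mathrm{i}g\,\theta^{A}V_{\mu}^{B}\,\mathrm{i}\,C_{ABC}\,T^{C} \\
  &= -g\,C_{ABC}\,\theta^{A}V_{\mu}^{B}\,T^{C}\;.
\end{align*}
```

Since \(\mathbf{V}_{\mu}' = V_{\mu}^{\prime C}T^{C}\) and \(\mathbf{V}_{\mu} = V_{\mu}^{C}T^{C}\), comparing the coefficients of \(T^{C}\) on both sides of (1.16) gives (1.17).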

As a consequence of (1.14) and (1.15), \(D_{\mu}\phi\) has the same transformation properties as \(\phi\):

$$\displaystyle{ (D_{\mu }\phi )^{{\prime}} = U(D_{\mu }\phi )\;. }$$
(1.18)

In fact,

$$\displaystyle\begin{array}{rcl} (D_{\mu }\phi )^{{\prime}}& =& (\partial _{\mu } + ig\mathbf{V}^{{\prime}}_{ \mu })\phi ^{{\prime}} \\ & =& (\partial _{\mu }U)\phi + U\partial _{\mu }\phi + igU\mathbf{V}_{\mu }\phi - (\partial _{\mu }U)\phi = U(D_{\mu }\phi )\;.{}\end{array}$$
(1.19)

Thus \(\mathcal{L}[\phi,D_{\mu }\phi ]\) is indeed invariant under gauge transformations. But at this stage the gauge fields \(V_{\mu}^{A}\) appear as external fields that do not propagate. In order to construct a gauge invariant kinetic energy term for the gauge fields \(V_{\mu}^{A}\), we consider

$$\displaystyle{ [D_{\mu },D_{\nu }]\phi = \mathrm{i}g\big\{\partial _{\mu }\mathbf{V}_{\nu } - \partial _{\nu }\mathbf{V}_{\mu } + \mathrm{i}g[\mathbf{V}_{\mu },\mathbf{V}_{\nu }]\big\}\phi \equiv \mathrm{i}g\mathbf{F}_{\mu \nu }\phi \;, }$$
(1.20)

which is equivalent to

$$\displaystyle{ F_{\mu \nu }^{A} = \partial _{\mu }V _{\nu }^{A} - \partial _{\nu }V _{\mu }^{A} - gC_{ ABC}V _{\mu }^{B}V _{\nu }^{C}\;. }$$
(1.21)
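The component form (1.21) follows from (1.20) by the same manipulation of the commutator, together with the total antisymmetry of the structure constants. As a brief worked step:

```latex
\begin{align*}
\mathrm{i}g\,[\mathbf{V}_{\mu},\mathbf{V}_{\nu}]
  &= \mathrm{i}g\,V_{\mu}^{B}V_{\nu}^{C}\,[T^{B},T^{C}] \\
  &= \mathrm{i}g\,V_{\mu}^{B}V_{\nu}^{C}\,\mathrm{i}\,C_{BCA}\,T^{A} \\
  &= -g\,C_{ABC}\,V_{\mu}^{B}V_{\nu}^{C}\,T^{A}\;,
\end{align*}
```

where the last line uses \(C_{BCA} = C_{ABC}\), a cyclic permutation of a completely antisymmetric symbol. Writing \(\mathbf{F}_{\mu\nu} = F_{\mu\nu}^{A}T^{A}\) and reading off the coefficient of \(T^{A}\) in (1.20) then reproduces (1.21).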

From (1.8), (1.18), and (1.20), it follows that the transformation properties of \(F_{\mu\nu}^{A}\) are those of a tensor of the adjoint representation:

$$\displaystyle{ \mathbf{F}_{\mu \nu }^{{\prime}} = U\mathbf{F}_{\mu \nu }U^{-1}\;. }$$
(1.22)

The complete Yang–Mills Lagrangian, which is invariant under gauge transformations, can be written in the form

$$\displaystyle{ \mathcal{L}_{\mathrm{YM}} = -\frac{1} {2}Tr\mathbf{F}_{\mu \nu }\mathbf{F}^{\mu \nu } + \mathcal{L}[\phi,D_{\mu }\phi ] = -\frac{1} {4}\sum _{A}F_{\mu \nu }^{A}F^{A\mu \nu } + \mathcal{L}[\phi,D_{\mu }\phi ]\;. }$$
(1.23)

Note that the kinetic energy term is an operator of dimension 4. Thus if \(\mathcal{L}\) is renormalizable, so also is \(\mathcal{L}_{\mathrm{YM}}\). If we give up renormalizability, then more gauge invariant higher dimensional terms could be added. It is already clear at this stage that no mass term for gauge bosons of the form \(m^{2}V_{\mu}V^{\mu}\) is allowed by gauge invariance.

1.4 Application to QED and QCD

For an Abelian theory like QED, the gauge transformation reduces to U[θ(x)] = exp[ieQθ(x)], where Q is the charge generator (for more commuting generators, one simply has a product of similar factors). According to (1.15), the associated gauge field (the photon) transforms as

$$\displaystyle{ V _{\mu }^{{\prime}} = V _{\mu } - \partial _{\mu }\theta (x)\;, }$$
(1.24)

and the familiar gauge transformation is recovered, with addition of a 4-gradient of a scalar function. The QED Lagrangian density is given by

$$\displaystyle{ \mathcal{L} = -\frac{1} {4}F^{\mu \nu }F_{\mu \nu } +\sum _{\psi }\bar{\psi }(\mathrm{i}D/ - m_{\psi })\psi \;. }$$
(1.25)

Here \(D/ = D_{\mu}\gamma^{\mu}\), where \(\gamma^{\mu}\) are the Dirac matrices and the covariant derivative is given in terms of the photon field \(A_{\mu}\) and the charge operator \(Q\) by

$$\displaystyle{ D_{\mu } = \partial _{\mu } + \mathrm{i}eA_{\mu }Q }$$
(1.26)

and

$$\displaystyle{ F_{\mu \nu } = \partial _{\mu }A_{\nu } - \partial _{\nu }A_{\mu }\;. }$$
(1.27)

Note that in QED one usually takes \(e^{-}\) to be the particle, so that \(Q = -1\) and the covariant derivative is \(D_{\mu} = \partial_{\mu} -\mathrm{i}eA_{\mu}\) when acting on the electron field. In the Abelian case, the \(F_{\mu\nu}\) tensor is linear in the gauge field \(V_{\mu}\), so that in the absence of matter fields the theory is free. On the other hand, in the non-Abelian case, the \(F_{\mu\nu}^{A}\) tensor contains both linear and quadratic terms in \(V_{\mu}^{A}\), so the theory is non-trivial even in the absence of matter fields.
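As a consistency check, the Abelian transformation law (1.24) can be read off from the general rule (1.15). The sketch below assumes the identifications \(g = e\) and \(\mathbf{V}_{\mu} = QV_{\mu}\), which match the covariant derivative (1.26) with \(V_{\mu} = A_{\mu}\):

```latex
\begin{align*}
U &= \exp[\mathrm{i}eQ\theta(x)]
  \quad\Longrightarrow\quad
  (\partial_{\mu}U)U^{-1} = \mathrm{i}eQ\,\partial_{\mu}\theta(x)\;, \\
QV_{\mu}' &= U\,QV_{\mu}\,U^{-1} - \frac{1}{\mathrm{i}e}\,(\partial_{\mu}U)U^{-1}
  = Q\big[V_{\mu} - \partial_{\mu}\theta(x)\big]\;.
\end{align*}
```

The first term is unchanged because \(U\) commutes with \(Q\), so only the inhomogeneous term survives and the field shifts by a 4-gradient, as in (1.24).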

According to the formalism of the previous section, the statement that QCD is a renormalizable gauge theory based on the group SU(3) with colour triplet quark matter fields fixes the QCD Lagrangian density to be

$$\displaystyle{ \mathcal{L} = -\frac{1} {4}\sum _{A=1}^{8}F^{A\mu \nu }F_{\mu \nu }^{A} +\sum _{ j=1}^{n_{\mathrm{f}} }\bar{q}_{j}(iD/ - m_{j})q_{j}\;. }$$
(1.28)

Here \(q_{j}\) are the quark fields with \(n_{\mathrm{f}}\) different flavours and mass \(m_{j}\), and \(D_{\mu}\) is the covariant derivative of the form

$$\displaystyle{ D_{\mu } = \partial _{\mu } + \mathrm{i}e_{\mathrm{s}}\mathbf{g}_{\boldsymbol{\mu }}\;, }$$
(1.29)

with gauge coupling \(e_{\mathrm{s}}\). Later, in analogy with QED, we will mostly use

$$\displaystyle{ \alpha _{\mathrm{s}} = \frac{e_{\mathrm{s}}^{2}} {4\pi } \;. }$$
(1.30)

In addition, \(\mathbf{g}_{\boldsymbol{\mu }} =\sum _{A}t^{A}g_{\mu }^{A}\), where \(g_{\mu}^{A}\), \(A = 1,\ldots,8\), are the gluon fields and \(t^{A}\) are the \(SU(3)\) group generators in the triplet representation of the quarks (i.e., \(t^{A}\) are 3 × 3 matrices acting on \(q\)). The generators obey the commutation relations \([t^{A},t^{B}] = \mathrm{i}C_{ABC}t^{C}\), where \(C_{ABC}\) are the completely antisymmetric structure constants of \(SU(3)\). The normalizations of \(C_{ABC}\) and \(e_{\mathrm{s}}\) are specified by those of the generators \(t^{A}\), i.e., \(\mathrm{Tr}[t^{A}t^{B}] =\delta ^{AB}/2\) [see (1.11)]. Finally, we have

$$\displaystyle{ F_{\mu \nu }^{A} = \partial _{\mu }g_{\nu }^{A} - \partial _{\nu }g_{\mu }^{A} - e_{\mathrm{ s}}C_{ABC}g_{\mu }^{B}g_{\nu }^{C}\;. }$$
(1.31)

Chapter 2 is devoted to a detailed description of QCD as the theory of strong interactions. The physical vertices in QCD include the gluon–quark–antiquark vertex, analogous to the QED photon–fermion–antifermion coupling, but also the 3-gluon and 4-gluon vertices, of order \(e_{\mathrm{s}}\) and \(e_{\mathrm{s}}^{2}\) respectively, which have no analogue in an Abelian theory like QED. In QED the photon is coupled to all electrically charged particles, but is itself neutral. In QCD the gluons are coloured, hence self-coupled. This is reflected by the fact that, in QED, \(F_{\mu\nu}\) is linear in the gauge field, so that the term \(F_{\mu\nu}^{2}\) in the Lagrangian is a pure kinetic term, while in QCD, \(F_{\mu\nu}^{A}\) also contains a term quadratic in the gauge field, so that in \((F_{\mu\nu}^{A})^{2}\) we find cubic and quartic vertices beyond the kinetic term. It is also instructive to consider a scalar version of QED:

$$\displaystyle{ \mathcal{L} = -\frac{1} {4}F^{\mu \nu }F_{\mu \nu } + (D_{\mu }\phi )^{\dag }(D^{\mu }\phi ) - m^{2}(\phi ^{\dag }\phi )\;. }$$
(1.32)

For Q = 1, we have

$$\displaystyle{ (D_{\mu }\phi )^{\dag }(D^{\mu }\phi ) = (\partial _{\mu }\phi )^{\dag }(\partial ^{\mu }\phi ) + \mathrm{i}eA_{\mu }\big[(\partial ^{\mu }\phi )^{\dag }\phi -\phi ^{\dag }(\partial ^{\mu }\phi )\big] + e^{2}A_{\mu }A^{\mu }\phi ^{\dag }\phi \;. }$$
(1.33)

We see that for a charged boson in QED, given that the kinetic term for bosons is quadratic in the derivative, there is a gauge–gauge–scalar–scalar vertex of order \(e^{2}\). We understand that in QCD the 3-gluon vertex is there because the gluon is coloured, and the 4-gluon vertex because the gluon is a boson.

1.5 Chirality

We recall here the notion of chirality and related issues which are crucial for the formulation of the EW Theory. The fermion fields can be described through their right-handed (RH) (chirality + 1) and left-handed (LH) (chirality − 1) components:

$$\displaystyle{ \psi _{\mathrm{L,R}} = [(1 \mp \gamma _{5})/2]\psi \;,\quad \bar{\psi }_{\mathrm{L,R}} =\bar{\psi } [(1 \pm \gamma _{5})/2]\;, }$$
(1.34)

where \(\gamma _{5}\) and the other Dirac matrices are defined as in the book by Bjorken and Drell [102]. In particular, \(\gamma _{5}^{2} = 1\), \(\gamma _{5}^{\dag } =\gamma _{5}\). Note that (1.34) implies

$$\displaystyle{\bar{\psi }_{\mathrm{L}} =\psi _{ \mathrm{L}}^{\dag }\gamma _{ 0} =\psi ^{\dag }[(1 -\gamma _{ 5})/2]\gamma _{0} =\bar{\psi }\gamma _{0}[(1 -\gamma _{5})/2]\gamma _{0} =\bar{\psi } [(1 +\gamma _{5})/2]\;.}$$

The matrices \(P_{\pm } = (1 \pm \gamma _{5})/2\) are projectors. They satisfy the relations \(P_{\pm }P_{\pm } = P_{\pm }\), \(P_{\pm }P_{\mp } = 0\), \(P_{+} + P_{-} = 1\). They project onto fermions of definite chirality. For massless particles, chirality coincides with helicity. For massive particles, a chirality + 1 state only coincides with a + 1 helicity state up to terms suppressed by powers of \(m/E\).
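The projector relations can be verified with an explicit matrix. This is a small sketch, assuming the Dirac representation in which \(\gamma _{5}\) is block off-diagonal; any representation with \(\gamma _{5}^{2} = 1\) would serve equally well, and the names `GAMMA5`, `projector`, `P_PLUS` and `P_MINUS` are illustrative only.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

IDENTITY = [[1 if r == c else 0 for c in range(4)] for r in range(4)]

# gamma_5 in the Dirac representation (assumed): 2x2 unit blocks off the diagonal
GAMMA5 = [[0, 0, 1, 0],
          [0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0]]

def projector(sign):
    """P_± = (1 ± gamma_5) / 2 for sign = +1 or -1."""
    return [[(IDENTITY[r][c] + sign * GAMMA5[r][c]) / 2 for c in range(4)]
            for r in range(4)]

P_PLUS, P_MINUS = projector(+1), projector(-1)
```

Multiplying `P_PLUS` by itself returns `P_PLUS`, the mixed product with `P_MINUS` is the zero matrix, and the two projectors add up to the identity.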

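The 16 linearly independent Dirac matrices (\(\varGamma\)) can be divided into \(\gamma _{5}\)-even (\(\varGamma _{\mathrm{E}}\)) and \(\gamma _{5}\)-odd (\(\varGamma _{\mathrm{O}}\)) according to whether they commute or anticommute with \(\gamma _{5}\). For the \(\gamma _{5}\)-even, we have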

$$\displaystyle{ \bar{\psi }\varGamma _{\mathrm{E}}\psi =\bar{\psi } _{\mathrm{L}}\varGamma _{\mathrm{E}}\psi _{\mathrm{R}} +\bar{\psi } _{\mathrm{R}}\varGamma _{\mathrm{E}}\psi _{\mathrm{L}}\quad \quad (\varGamma _{\mathrm{E}} \equiv 1,\mathrm{i}\gamma _{5},\sigma _{\mu \nu })\;, }$$
(1.35)

whilst for the \(\gamma _{5}\)-odd,

$$\displaystyle{ \bar{\psi }\varGamma _{\mathrm{O}}\psi =\bar{\psi } _{\mathrm{L}}\varGamma _{\mathrm{O}}\psi _{\mathrm{L}} +\bar{\psi } _{\mathrm{R}}\varGamma _{\mathrm{O}}\psi _{\mathrm{R}}\quad \quad (\varGamma _{\mathrm{O}} \equiv \gamma _{\mu },\gamma _{\mu }\gamma _{5}). }$$
(1.36)

We see that in a gauge Lagrangian, fermion kinetic terms and interactions of gauge bosons with vector and axial vector fermion currents all conserve chirality, while fermion mass terms flip chirality. For example, in QED, if an electron emits a photon, the electron chirality is unchanged. In the ultrarelativistic limit, when the electron mass can be neglected, chirality and helicity are approximately the same and we can state that the helicity of the electron is unchanged by the photon emission. In a massless gauge theory, the LH and the RH fermion components are uncoupled and can be transformed separately. If in a gauge theory the LH and RH components transform as different representations of the gauge group, one speaks of a chiral gauge theory, while if they have the same gauge transformations, one has a vector gauge theory. Thus, QED and QCD are vector gauge theories because, for each given fermion, \(\psi _{\mathrm{L}}\) and \(\psi _{\mathrm{R}}\) have the same electric charge and the same colour. Instead, the standard EW theory is a chiral theory, in the sense that \(\psi _{\mathrm{L}}\) and \(\psi _{\mathrm{R}}\) behave differently under the gauge group (so that parity and charge conjugation non-conservation are made possible in principle). Thus, mass terms for fermions (of the form \(\bar{\psi }_{\mathrm{L}}\psi _{\mathrm{R}}\) + h.c.) are forbidden in the EW gauge-symmetric limit. In particular, in the Minimal Standard Model (MSM), i.e., the model that only includes all observed particles plus a single Higgs doublet, all \(\psi _{\mathrm{L}}\) are SU(2) doublets, while all \(\psi _{\mathrm{R}}\) are singlets.

1.6 Quantization of a Gauge Theory

The Lagrangian density \(\mathcal{L}_{\mathrm{YM}}\) in (1.23) fully describes the theory at the classical level. The formulation of the theory at the quantum level requires us to specify procedures of quantization, regularization and, finally, renormalization. To start with, the formulation of Feynman rules is not straightforward. A first problem, common to all gauge theories, including the Abelian case of QED, can be realized by observing that the free equations of motion for \(V _{\mu }^{A}\), as obtained from (1.21) and (1.23), are given by

$$\displaystyle{ \big(\partial ^{2}g_{\mu \nu } - \partial _{\mu }\partial _{\nu }\big)V ^{A\nu } = 0\;. }$$
(1.37)

Normally the propagator of the gauge field should be determined by the inverse of the operator \(\partial ^{2}g_{\mu \nu } - \partial _{\mu }\partial _{\nu }\). However, it has no inverse, being a projector over the transverse gauge vector states. This difficulty is removed by fixing a particular gauge. If one chooses a covariant gauge condition \(\partial ^{\mu }V _{\mu }^{A} = 0\), then a gauge fixing term of the form

$$\displaystyle{ \Delta \mathcal{L}_{\mathrm{GF}} = -\frac{1} {2\lambda }\sum _{A}\vert \partial ^{\mu }V _{\mu }^{A}\vert ^{2} }$$
(1.38)

has to be added to the Lagrangian (\(1/\lambda\) acts as a Lagrange multiplier). The free equations of motion are then modified as follows:

$$\displaystyle{ \big[\partial ^{2}g_{\mu \nu } - (1 - 1/\lambda )\partial _{\mu }\partial _{\nu }\big]V ^{A\nu } = 0\;. }$$
(1.39)

This operator now has an inverse whose Fourier transform is given by

$$\displaystyle{ D_{\mu \nu }^{AB}(q) = \frac{\mathrm{i}} {q^{2} + \mathrm{i}\epsilon }\left [-g_{\mu \nu } + (1-\lambda ) \frac{q_{\mu }q_{\nu }} {q^{2} + \mathrm{i}\epsilon }\right ]\delta ^{AB}\;, }$$
(1.40)

which is the propagator in this class of gauges. The parameter λ can take any value and it disappears from the final expression of any gauge invariant, physical quantity. Commonly used particular cases are λ = 1 (Feynman gauge) and λ = 0 (Landau gauge).
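That the propagator (1.40) inverts the gauge-fixed operator of (1.39) can be checked numerically in momentum space. The sketch below assumes the replacement \(\partial _{\mu }\rightarrow -\mathrm{i}q_{\mu }\), the metric signature \((+,-,-,-)\), and drops the \(\mathrm{i}\epsilon\) prescription and the colour factor \(\delta ^{AB}\); the function names are illustrative. Both objects are stored with one upper and one lower index, so that ordinary matrix multiplication performs the Lorentz contraction.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g_{mu nu}, signature assumed

def lower(q):
    """Covariant components q_mu from contravariant q^mu."""
    return [METRIC[m] * q[m] for m in range(4)]

def kinetic_operator(q, lam):
    """Operator of (1.39) with d_mu -> -i q_mu, as K^mu_nu (requires lam != 0)."""
    q_low = lower(q)
    q2 = sum(q[m] * q_low[m] for m in range(4))
    return [[-q2 * (m == n) + (1 - 1 / lam) * q[m] * q_low[n]
             for n in range(4)] for m in range(4)]

def propagator(q, lam):
    """Propagator of (1.40) as D^mu_nu, without i*epsilon and delta^AB."""
    q_low = lower(q)
    q2 = sum(q[m] * q_low[m] for m in range(4))
    return [[(1j / q2) * (-(m == n) + (1 - lam) * q[m] * q_low[n] / q2)
             for n in range(4)] for m in range(4)]

def matmul(a, b):
    """Matrix product; contracts the lower index of a with the upper index of b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For any off-shell momentum and any non-zero \(\lambda\), the product `matmul(kinetic_operator(q, lam), propagator(q, lam))` is \(\mathrm{i}\) times the unit matrix, so the \(\lambda\)-dependent pieces cancel exactly as the text requires. The Landau gauge \(\lambda = 0\) is only reached as a limit of the propagator, since the operator itself contains \(1/\lambda\).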

While in an Abelian theory the gauge fixing term is all that is needed for a correct quantization, in a non-Abelian theory the formulation of complete Feynman rules involves a further subtlety. This is formally taken into account by introducing a set of \(D\) fictitious ghost fields that must be included as internal lines in closed loops (Faddeev–Popov ghosts [197]). Given that gauge fields connected by a gauge transformation describe the same physics, there are clearly fewer physical degrees of freedom than gauge field components. Ghosts appear, in the form of a transformation Jacobian in the functional integral, in the process of elimination of the redundant variables associated with fields on the same gauge orbit [14]. By performing some path integral acrobatics, the correct ghost contributions can be translated into an additional term in the Lagrangian density. For each choice of the gauge fixing term, the ghost Lagrangian is obtained by considering the effect of an infinitesimal gauge transformation \(V _{\mu }^{{\prime}C} = V _{\mu }^{C} - gC_{ABC}\theta ^{A}V _{\mu }^{B} - \partial _{\mu }\theta ^{C}\) on the gauge fixing condition. For \(\partial ^{\mu }V _{\mu }^{C} = 0\), one obtains

$$\displaystyle{ \partial ^{\mu }V _{\mu }^{{\prime}C} = \partial ^{\mu }V _{\mu }^{C} - gC_{ ABC}\partial ^{\mu }(\theta ^{A}V _{\mu }^{B}) - \partial ^{2}\theta ^{C} = -\big[\partial ^{2}\delta _{ AC} + gC_{ABC}V _{\mu }^{B}\partial ^{\mu }\big]\theta ^{A}\;, }$$
(1.41)

where the gauge condition \(\partial ^{\mu }V _{\mu }^{C} = 0\) has been taken into account in the last step. The ghost Lagrangian is then given by

$$\displaystyle{ \Delta \mathcal{L}_{\mathrm{Ghost}} =\bar{\eta } ^{C}\big[\partial ^{2}\delta _{ AC} + gC_{ABC}V _{\mu }^{B}\partial ^{\mu }\big]\eta ^{A}\;, }$$
(1.42)

where \(\eta ^{A}\) is the ghost field (one for each index \(A\)) which has to be treated as a scalar field, except that a factor − 1 has to be included for each closed loop, as for fermion fields.

Starting from non-covariant gauges, one can construct ghost-free gauges. An example, also important in other respects, is provided by the set of “axial” gauges \(n^{\mu }V _{\mu }^{A} = 0\), where \(n_{\mu }\) is a fixed reference 4-vector (actually, for \(n_{\mu }\) spacelike, one has an axial gauge proper, for \(n^{2} = 0\), one speaks of a light-like gauge, and for \(n_{\mu }\) timelike, one has a Coulomb or temporal gauge). The gauge fixing term is of the form

$$\displaystyle{ \Delta \mathcal{L}_{\mathrm{GF}} = -\frac{1} {2\lambda }\sum _{A}\vert n^{\mu }V _{\mu }^{A}\vert ^{2}\;. }$$
(1.43)

With a procedure that can be found in QED textbooks [102], the corresponding propagator in Fourier space is found to be

$$\displaystyle{ D_{\mu \nu }^{AB}(q) = \frac{\mathrm{i}} {q^{2} + \mathrm{i}\epsilon }\left [-g_{\mu \nu } + \frac{n_{\mu }q_{\nu } + n_{\nu }q_{\mu }} {(nq)} - \frac{n^{2}q_{\mu }q_{\nu }} {(nq)^{2}}\right ]\delta ^{AB}\;. }$$
(1.44)

In this case there are no ghost interactions because \(n^{\mu }V _{\mu }^{{\prime}A}\), obtained by a gauge transformation from \(n^{\mu }V _{\mu }^{A}\), contains no gauge fields, once the gauge condition \(n^{\mu }V _{\mu }^{A} = 0\) has been taken into account. Thus the ghosts are decoupled and can be ignored.

The introduction of a suitable regularization method that preserves gauge invariance is essential for the definition and the calculation of loop diagrams and for the renormalization programme of the theory. The method that is currently adopted is dimensional regularization [334], which consists in the formulation of the theory in \(n\) dimensions. All loop integrals have an analytic expression that is actually valid also for non-integer values of \(n\). Writing the results for \(n = 4-\epsilon\), the loops are ultraviolet finite for \(\epsilon > 0\) and the divergences reappear in the form of poles at \(\epsilon = 0\).

1.7 Spontaneous Symmetry Breaking in Gauge Theories

The gauge symmetry of the SM was difficult to discover because it is well hidden in nature. The only observed gauge boson that is massless is the photon. The gluons are presumed massless but cannot be directly observed because of confinement, and the W and Z weak bosons carry a heavy mass. Indeed a major difficulty in unifying the weak and electromagnetic interactions was the fact that electromagnetic interactions have infinite range (\(m_{\gamma } = 0\)), whilst the weak forces have a very short range, owing to \(m_{W,Z}\neq 0\). The solution to this problem lies in the concept of spontaneous symmetry breaking, which was borrowed from condensed matter physics.

Consider a ferromagnet at zero magnetic field in the Landau–Ginzburg approximation. The free energy in terms of the temperature T and the magnetization M can be written as

$$\displaystyle{ F(\mathbf{M},T) \simeq F_{0}(T) + \frac{1} {2}\mu ^{2}(T)\mathbf{M}^{2} + \frac{1} {4}\lambda (T)(\mathbf{M}^{2})^{2} + \cdots \;. }$$
(1.45)

This is an expansion which is valid at small magnetization. The neglect of terms of higher order in \(\mathbf{M}^{2}\) is the analogue in this context of the renormalizability criterion. Furthermore, \(\lambda (T) > 0\) is assumed for stability, and \(F\) is invariant under rotations, i.e., all directions of \(\mathbf{M}\) in space are equivalent. The minimum condition for \(F\) reads

$$\displaystyle{ \partial F/\partial M_{i} = 0\;,\quad \big[\mu ^{2}(T) +\lambda (T)\mathbf{M}^{2}\big]\mathbf{M} = 0\;. }$$
(1.46)

There are two cases, shown in Fig. 1.1. If \(\mu ^{2}\geq 0\), then the only solution is \(\mathbf{M} = 0\), there is no magnetization, and the rotation symmetry is respected. In this case the lowest energy state (in a quantum theory the vacuum) is unique and invariant under rotations. If \(\mu ^{2} < 0\), then another solution appears, which is

$$\displaystyle{ \vert \mathbf{M}_{0}\vert ^{2} = -\mu ^{2}/\lambda \;. }$$
(1.47)

In this case there is a continuous orbit of lowest energy states, all with the same value of \(\vert \mathbf{M}\vert\), but different orientations. A particular direction chosen by the vector \(\mathbf{M}_{0}\) leads to a breaking of the rotation symmetry.
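The two cases can be made concrete with a short numerical sketch of the truncated free energy (1.45), written as a function of \(\vert \mathbf{M}\vert\) with the constant \(F_{0}\) dropped. The function names and the sample parameter values used with them are illustrative assumptions.

```python
import math

def free_energy(m, mu2, lam):
    """F - F_0 from (1.45), truncated at the quartic term, for |M| = m."""
    return 0.5 * mu2 * m**2 + 0.25 * lam * m**4

def magnetization(mu2, lam):
    """|M_0| at the minimum: zero for mu2 >= 0, sqrt(-mu2/lam) from (1.47) otherwise."""
    if mu2 >= 0:
        return 0.0
    return math.sqrt(-mu2 / lam)
```

For example, with `mu2 = -2.0` and `lam = 0.5` the minimum sits at \(\vert \mathbf{M}_{0}\vert = 2\), where the free energy lies below its value at \(\mathbf{M} = 0\); for positive `mu2` the function returns zero magnetization.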

Fig. 1.1 The potential \(V =\mu ^{2}\mathbf{M}^{2}/2 +\lambda (\mathbf{M}^{2})^{2}/4\) for positive (a) or negative (b) \(\mu ^{2}\) (for simplicity, \(\mathbf{M}\) is a 2-dimensional vector). The small sphere indicates a possible choice for the direction of \(\mathbf{M}\)

For a piece of iron we can imagine bringing it to high temperature and letting it melt in an external magnetic field \(\mathbf{B}\). The presence of \(\mathbf{B}\) is an explicit breaking of the rotational symmetry and it induces a nonzero magnetization \(\mathbf{M}\) along its direction. Now we lower the temperature while keeping \(\mathbf{B}\) fixed. Both \(\lambda\) and \(\mu ^{2}\) depend on the temperature. With lowering \(T\), \(\mu ^{2}\) goes from positive to negative values. The critical temperature \(T_{\mathrm{crit}}\) (Curie temperature) is where \(\mu ^{2}(T)\) changes sign, i.e., \(\mu ^{2}(T_{\mathrm{crit}}) = 0\). For pure iron, \(T_{\mathrm{crit}}\) is below the melting temperature. So at \(T = T_{\mathrm{crit}}\) iron is a solid. Below \(T_{\mathrm{crit}}\) we remove the magnetic field. In a solid the mobility of the magnetic domains is limited and a non-vanishing \(\mathbf{M}_{0}\) remains. The form of the free energy is again rotationally invariant as in (1.45). But now the system allows a minimum energy state with non-vanishing \(\mathbf{M}\) in the direction of \(\mathbf{B}\). As a consequence the symmetry is broken by this choice of one particular vacuum state out of a continuum of them.

We now prove the Goldstone theorem [228]. It states that when spontaneous symmetry breaking takes place, there is always a zero-mass mode in the spectrum. In a classical context this can be proven as follows. Consider a Lagrangian

$$\displaystyle{ \mathcal{L} = \frac{1} {2}\vert \partial _{\mu }\phi \vert ^{2} - V (\phi )\;. }$$
(1.48)

The potential V (ϕ) can be kept generic at this stage, but in the following we will be mostly interested in a renormalizable potential of the form

$$\displaystyle{ V (\phi ) = -\frac{1} {2}\mu ^{2}\phi ^{2} + \frac{1} {4}\lambda \phi ^{4}\;, }$$
(1.49)

with no more than quartic terms. Here by \(\phi\) we mean a column vector with real components \(\phi _{i}\) (\(i = 1,2,\ldots ,N\)) (complex fields can always be decomposed into a pair of real fields), so that, for example, \(\phi ^{2} =\sum _{i}\phi _{i}^{2}\). This particular potential is symmetric under an N × N orthogonal matrix rotation \(\phi ^{{\prime}} = O\phi\), where \(O\) is an SO(N) transformation. For simplicity, we have omitted odd powers of \(\phi\), which means that we have assumed an extra discrete symmetry under \(\phi \leftrightarrow -\phi\). Note that, for positive \(\mu ^{2}\), the mass term in the potential has the “wrong” sign: according to the previous discussion this is the condition for the existence of a non-unique lowest energy state. Further, we only assume here that the potential is symmetric under the infinitesimal transformations

$$\displaystyle{ \phi \rightarrow \phi ^{{\prime}} =\phi +\updelta \phi \;,\quad \updelta \phi _{ i} = \mathrm{i}\updelta \theta ^{A}t_{ ij}^{\,A}\phi _{ j}\;, }$$
(1.50)

where \(\updelta \theta ^{A}\) are infinitesimal parameters and \(t_{ij}^{\,A}\) are the matrices that represent the symmetry group on the representation carried by the fields \(\phi _{i}\) (a sum over \(A\) is understood). The minimum condition on \(V\) that identifies the equilibrium position (or the vacuum state in quantum field theory language) is

$$\displaystyle{ \frac{\partial V } {\partial \phi _{i}} (\phi _{i} =\phi _{ i}^{0}) = 0\;. }$$
(1.51)

The symmetry of V implies that

$$\displaystyle{ \updelta V = \frac{\partial V } {\partial \phi _{i}} \updelta \phi _{i} = \mathrm{i}\updelta \theta ^{A}\frac{\partial V } {\partial \phi _{i}} t_{ij}^{\,A}\phi _{ j} = 0\;. }$$
(1.52)

By taking a second derivative at the minimum \(\phi _{i} =\phi _{ i}^{0}\) of both sides of the previous equation, we obtain that, for each \(A\),

$$\displaystyle{ \frac{\partial ^{2}V } {\partial \phi _{k}\partial \phi _{i}}(\phi _{i} =\phi _{ i}^{0})t_{ ij}^{\,A}\phi _{ j}^{0} + \frac{\partial V } {\partial \phi _{i}} (\phi _{i} =\phi _{ i}^{0})t_{ ik}^{\,A} = 0. }$$
(1.53)

The second term vanishes owing to the minimum condition (1.51). We then find

$$\displaystyle{ \frac{\partial ^{2}V } {\partial \phi _{k}\partial \phi _{i}}(\phi _{i} =\phi _{ i}^{0})t_{ ij}^{\,A}\phi _{ j}^{0} = 0\;. }$$
(1.54)

The second derivatives \(M_{ki}^{2} = (\partial ^{2}V/\partial \phi _{k}\partial \phi _{i})(\phi _{i} =\phi _{ i}^{0})\) define the squared mass matrix. Thus the above equation in matrix notation can be written as

$$\displaystyle{ M^{2}t^{A}\phi ^{0} = 0\;. }$$
(1.55)

In the case of no spontaneous symmetry breaking, the ground state is unique, and all symmetry transformations leave it invariant, so that, for all \(A\), \(t^{A}\phi ^{0} = 0\). On the other hand, if for some values of \(A\) the vectors (\(t^{A}\phi ^{0}\)) are non-vanishing, i.e., there is some generator that shifts the ground state into some other state with the same energy (whence the vacuum is not unique), then each \(t^{A}\phi ^{0}\neq 0\) is an eigenstate of the squared mass matrix with zero eigenvalue. Therefore, a massless mode is associated with each broken generator. The charges of the massless modes (their quantum numbers in quantum language) differ from those of the vacuum (usually all taken as zero) by the values of the \(t^{A}\) charges: one says that the massless modes have the same quantum numbers as the broken generators, i.e., those that do not annihilate the vacuum.
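As an illustration of (1.55), the following sketch works out the case \(N = 3\) for the potential (1.49). It assumes the vacuum is chosen along the third axis, \(\phi ^{0} = (0,0,v)\) with \(v^{2} =\mu ^{2}/\lambda\), and takes the SO(3) generators as \((t^{A})_{ij} = -\mathrm{i}\epsilon _{Aij}\); these choices and all function names are illustrative assumptions rather than notation from the text.

```python
import math

def hessian(phi, mu2, lam):
    """Second derivatives of V = -mu2*phi^2/2 + lam*(phi^2)^2/4 at the point phi."""
    phi2 = sum(x * x for x in phi)
    n = len(phi)
    return [[(-mu2 + lam * phi2) * (k == i) + 2 * lam * phi[k] * phi[i]
             for i in range(n)] for k in range(n)]

def epsilon(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2

def so3_generator(a):
    """(t^A)_ij = -i eps_Aij, so that i*dtheta^A t^A phi is a real rotation."""
    return [[-1j * epsilon(a, i, j) for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    """Matrix acting on a column vector."""
    return [sum(row[c] * vector[c] for c in range(len(vector))) for row in matrix]

def goldstone_check(mu2, lam):
    """Return M^2 at the vacuum and, per generator, (is_broken, max|M^2 t^A phi0|)."""
    v = math.sqrt(mu2 / lam)
    phi0 = [0.0, 0.0, v]
    mass2 = hessian(phi0, mu2, lam)
    results = []
    for a in range(3):
        shifted = apply(so3_generator(a), phi0)
        broken = any(abs(x) > 1e-12 for x in shifted)
        residual = max(abs(x) for x in apply(mass2, shifted))
        results.append((broken, residual))
    return mass2, results
```

In this example two generators shift the vacuum and one (the rotation about the vacuum direction) leaves it invariant. The squared mass matrix has two vanishing eigenvalues, one per broken generator, and a single massive mode with squared mass \(2\mu ^{2}\) along \(\phi ^{0}\).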

The previous proof of the Goldstone theorem has been given for the classical case. In the quantum case, the classical potential corresponds to the tree level approximation of the quantum potential. Higher order diagrams with loops introduce quantum corrections. The functional integral formulation of quantum field theory [14, 250] is the most appropriate framework to define and compute, in a loop expansion, the quantum potential which specifies the vacuum properties of the quantum theory in exactly the way described above. If the theory is weakly coupled, e.g., if λ is small, the tree level expression for the potential is not too far from the truth, and the classical situation is a good approximation. We shall see that this is the situation that occurs in the electroweak theory with a moderately light Higgs particle (see Sect. 3.5).

We note that for a quantum system with a finite number of degrees of freedom, for example, one described by the Schrödinger equation, there are no degenerate vacua: the vacuum is always unique. For example, in the one-dimensional Schrödinger problem with a potential

$$\displaystyle{ V (x) = -\frac{\mu ^{2}} {2}x^{2} + \frac{\lambda } {4}x^{4}\;, }$$
(1.56)

there are two degenerate minima at \(x = \pm x_{0} = \pm (\mu ^{2}/\lambda )^{1/2}\), which we denote by \(\vert +\rangle\) and \(\vert -\rangle\). But the potential is not diagonal in this basis: the off-diagonal matrix elements

$$\displaystyle{ \langle +\vert V \vert -\rangle =\langle -\vert V \vert +\rangle \sim \exp (-khd) =\delta }$$
(1.57)

are different from zero due to the non-vanishing amplitude for a tunnel effect between the two vacua given in (1.57), proportional to the exponential of minus the product of the distance \(d\) between the vacua and the height \(h\) of the barrier, with \(k\) a constant (see Fig. 1.2). After diagonalization the eigenvectors are \((\vert +\rangle +\vert -\rangle )/\sqrt{2}\) and \((\vert +\rangle -\vert -\rangle )/\sqrt{2}\), with different energies (the difference being proportional to \(\delta\)). Suppose now that we have a sum of \(n\) equal terms in the potential, i.e., \(V =\sum _{i}V (x_{i})\). Then the transition amplitude would be proportional to \(\delta ^{n}\) and would vanish for infinite \(n\): the probability that all degrees of freedom together jump over the barrier vanishes. In this example there is a discrete number of minimum points. The case of a continuum of minima is obtained, still in the Schrödinger context, if we take

$$\displaystyle{ V = \frac{1} {2}\mu ^{2}\mathbf{r}^{2} + \frac{1} {4}\lambda (\mathbf{r}^{2})^{2}\;, }$$
(1.58)

with \(\mathbf{r} = (x,y,z)\) and, in this sign convention, \(\mu ^{2} < 0\) as in (1.45), so that the minima lie on the sphere \(\mathbf{r}^{2} = -\mu ^{2}/\lambda\). The ground state is also unique in this case: it is given by a state with total orbital angular momentum zero, i.e., an s-wave state, whose wave function only depends on \(\vert \mathbf{r}\vert\), i.e., it is independent of all angles. This is a superposition of all directions with the same weight, analogous to what happened in the discrete case. But again, if we replace a single vector \(\mathbf{r}\) with a vector field \(\mathbf{M}(x)\), that is, a different vector at each point in space, the amplitude to go from a minimum state in one direction to another in a different direction goes to zero in the limit of infinite volume. Put simply, the vectors at all points in space have a vanishingly small amplitude to make a common rotation, all together at the same time. In the infinite volume limit, all vacua along each direction have the same energy, and spontaneous symmetry breaking can occur.

Fig. 1.2 A Schrödinger potential \(V (x)\) analogous to the Higgs potential

A massless Goldstone boson corresponds to a long range force. Unless the massless particles are confined, as for the gluons in QCD, these long range forces would be easily detectable. Thus, in the construction of the EW theory, we cannot accept physical massless scalar particles. Fortunately, when spontaneous symmetry breaking takes place in a gauge theory, the massless Goldstone modes exist, but they are unphysical and disappear from the spectrum. In fact, each of them becomes the third helicity state of a gauge boson that takes mass. This is the Higgs mechanism [189, 236, 243, 261] (it should be called the Englert–Brout–Higgs mechanism, because of the simultaneous paper by Englert and Brout). Consider, for example, the simplest Higgs model described by the Lagrangian [243, 261]

$$\displaystyle{ \mathcal{L} = -\frac{1} {4}F_{\mu \nu }^{2} +\big \vert (\partial _{\mu } + \mathrm{i}eA_{\mu }Q)\phi \big\vert ^{2} +\mu ^{2}\phi ^{{\ast}}\phi - \frac{\lambda } {2}(\phi ^{{\ast}}\phi )^{2}\;. }$$
(1.59)

Note the “wrong” sign in front of the mass term for the scalar field ϕ, which is necessary for the spontaneous symmetry breaking to take place. The above Lagrangian is invariant under the U(1) gauge symmetry

$$\displaystyle{ A_{\mu } \rightarrow A_{\mu }^{{\prime}} = A_{\mu } - \partial _{\mu }\theta (x)\;,\qquad \phi \rightarrow \phi ^{{\prime}} = \mathrm{exp}\big[\mathrm{i}eQ\theta (x)\big]\phi \;. }$$
(1.60)

For the U(1) charge \(Q\), we take \(Q\phi = -\phi\), as in QED, where the particle is \(e^{-}\). Let \(\phi ^{0} = v\neq 0\), with \(v\) real, be the ground state that minimizes the potential and induces the spontaneous symmetry breaking. In our case \(v\) is given by \(v^{2} =\mu ^{2}/\lambda\). Exploiting gauge invariance, we make the change of variables

$$\displaystyle\begin{array}{rcl} \phi (x)& \rightarrow & \left [v + \frac{h(x)} {\sqrt{2}} \right ]\mathrm{exp}\left [-\mathrm{i} \frac{\zeta (x)} {v\sqrt{2}}\right ], \\ A_{\mu }(x)& \rightarrow & A_{\mu } - \partial _{\mu } \frac{\zeta (x)} {ev\sqrt{2}}. {}\end{array}$$
(1.61)

Then the position of the minimum at \(\phi ^{0} = v\) corresponds to \(h = 0\), and the Lagrangian becomes

$$\displaystyle{ \mathcal{L} = -\frac{1} {4}F_{\mu \nu }^{2} + e^{2}v^{2}A_{\mu }^{2} + \frac{1} {2}e^{2}h^{2}A_{\mu }^{2} + \sqrt{2}e^{2}hvA_{\mu }^{2} + \mathcal{L}(h)\;. }$$
(1.62)

The field \(\zeta (x)\) is the would-be Goldstone boson, as can be seen by considering only the \(\phi\) terms in the Lagrangian, i.e., setting \(A_{\mu } = 0\) in (1.59). In fact, in this limit the kinetic term \(\partial _{\mu }\zeta \partial ^{\mu }\zeta\) remains but with no \(\zeta ^{2}\) mass term. Instead, in the gauge case of (1.59), after changing variables in the Lagrangian, the field \(\zeta (x)\) completely disappears (not even the kinetic term remains), whilst the mass term \(e^{2}v^{2}A_{\mu }^{2}\) for \(A_{\mu }\) is now present: the gauge boson mass is \(M = \sqrt{2}ev\). The field \(h\) describes the massive Higgs particle. Leaving a constant term aside, the last term in (1.62) is now

$$\displaystyle{ \mathcal{L}(h) = \frac{1} {2}\partial _{\mu }h\partial ^{\mu }h - h^{2}\mu ^{2} + \cdots \;, }$$
(1.63)

where the dots stand for cubic and quartic terms in \(h\). We see that the \(h\) mass term has the “right” sign, due to the combination of the quadratic terms in \(h\) which, after the shift, arise from the quadratic and quartic terms in \(\phi\). The \(h\) mass is given by \(m_{h}^{2} = 2\mu ^{2}\).
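The mass relations of this simple Higgs model can be cross-checked numerically. The sketch below reads the potential \(V (\phi ) = -\mu ^{2}\phi ^{2} + (\lambda /2)\phi ^{4}\) off (1.59) for real \(\phi\), substitutes \(\phi = v + h/\sqrt{2}\), and estimates the second derivative with respect to \(h\) at \(h = 0\) by a central difference; the function names and the step size are illustrative assumptions.

```python
import math

def higgs_model_spectrum(e, mu2, lam):
    """Return (v, M, m_h) using v^2 = mu2/lam, M = sqrt(2)*e*v, m_h^2 = 2*mu2."""
    v = math.sqrt(mu2 / lam)
    gauge_mass = math.sqrt(2) * e * v
    higgs_mass = math.sqrt(2 * mu2)
    return v, gauge_mass, higgs_mass

def potential(phi, mu2, lam):
    """V(phi) = -mu2*phi^2 + (lam/2)*phi^4 for real phi, as read off from (1.59)."""
    return -mu2 * phi**2 + 0.5 * lam * phi**4

def higgs_mass_squared_numeric(mu2, lam, step=1e-4):
    """Central-difference estimate of d^2 V / dh^2 at h = 0, with phi = v + h/sqrt(2)."""
    v = math.sqrt(mu2 / lam)

    def shifted(h):
        return potential(v + h / math.sqrt(2), mu2, lam)

    return (shifted(step) - 2 * shifted(0.0) + shifted(-step)) / step**2
```

Since a canonically normalized field has the mass term \(-m_{h}^{2}h^{2}/2\) in the Lagrangian, the numerical second derivative should agree with \(2\mu ^{2}\) up to discretization and rounding errors.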

The Higgs mechanism is realized in well-known physical situations. It was actually discovered in condensed matter physics by Anderson [58]. For a superconductor in the Landau–Ginzburg approximation, the free energy can be written as

$$\displaystyle{ F = F_{0} + \frac{1} {2}\mathbf{B}^{2} + \frac{1} {4m}\big\vert (\boldsymbol{\nabla }- 2\mathrm{i}e\mathbf{A})\phi \big\vert ^{2} -\alpha \vert \phi \vert ^{2} +\beta \vert \phi \vert ^{4}\;. }$$
(1.64)

Here \(\mathbf{B}\) is the magnetic field, \(\vert \phi \vert ^{2}\) is the Cooper pair (\(e^{-}e^{-}\)) density, and \(2e\) and \(2m\) are the charge and mass of the Cooper pair. The “wrong” sign of \(\alpha\) leads to \(\phi \neq 0\) at the minimum. This is precisely the non-relativistic analogue of the Higgs model of the previous example. The Higgs mechanism implies the absence of propagation of massless phonons (states with dispersion relation \(\omega = kv\), with constant \(v\)). Moreover, the mass term for \(\mathbf{A}\) is manifested by the exponential decrease of \(\mathbf{B}\) inside the superconductor (Meissner effect). However, in condensed matter examples, the Higgs field is not elementary, but rather a condensate of elementary fields (like for the Cooper pairs).

1.8 Quantization of Spontaneously Broken Gauge Theories: \(R_{\xi }\) Gauges

In Sect. 1.6 we discussed the problems arising in the quantization of a gauge theory and in the formulation of the correct Feynman rules (gauge fixing terms, ghosts, etc.). Here we give a concise account of the corresponding results for spontaneously broken gauge theories. In particular, we describe the \(R_{\xi }\) gauge formalism [14, 207, 250]: in this formalism the interplay of transverse and longitudinal gauge boson degrees of freedom is made explicit and their combination leads to the cancellation of the gauge parameter \(\xi\) from physical quantities. We work out in detail an Abelian example that will be easy to generalize later to the non-Abelian case.

We go back to the Abelian model of (1.59) (with \(Q = -1\)). In the treatment presented there, the would-be Goldstone boson \(\zeta (x)\) was completely eliminated from the Lagrangian by a nonlinear field transformation formally identical to a gauge transformation corresponding to the U(1) symmetry of the Lagrangian. In that description, in the new variables, we eventually obtain a theory with only physical fields: a massive gauge boson \(A_{\mu }\) with mass \(M = \sqrt{2}ev\) and a Higgs particle \(h\) with mass \(m_{h} = \sqrt{2}\mu\). This is called a “unitary” gauge, because only physical fields appear. But if we work out the propagator of the massive gauge boson, viz.,

$$\displaystyle{ \mathrm{i}D_{\mu \nu }(k) = -\mathrm{i} \frac{g_{\mu \nu } - k_{\mu }k_{\nu }/M^{2}} {k^{2} - M^{2} + \mathrm{i}\epsilon }\;, }$$
(1.65)

we find that it has a bad ultraviolet behaviour due to the second term in the numerator. This choice does not prove to be the most convenient for a discussion of the ultraviolet behaviour of the theory. Alternatively, one can go to a different formulation where the would-be Goldstone boson remains in the Lagrangian, but the complication of keeping spurious degrees of freedom is compensated by having all propagators with good ultraviolet behaviour (“renormalizable” gauges). To this end we replace the nonlinear transformation for ϕ in (1.61) by its linear equivalent (after all, perturbation theory deals with small oscillations around the minimum):

$$\displaystyle{ \phi (x) \rightarrow \left [v + \frac{h(x)} {\sqrt{2}} \right ]\mathrm{exp}\left [-\mathrm{i} \frac{\zeta (x)} {v\sqrt{2}}\right ] \sim \left [v + \frac{h(x)} {\sqrt{2}} -\mathrm{i}\frac{\zeta (x)} {\sqrt{2}}\right ]\;. }$$
(1.66)

Here we have only applied a shift by the amount \(v\) and separated the real and imaginary components of the resulting field with vanishing vacuum expectation value. If we leave \(A_{\mu }\) as it is and simply replace the linearized expression for \(\phi\), we obtain the following quadratic terms (those important for propagators):

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{\mathrm{quad}}& =& -\frac{1} {4}\sum _{A}F_{\mu \nu }^{A}F^{A\mu \nu } + \frac{1} {2}M^{2}A_{\mu }A^{\mu } \\ & & +\frac{1} {2}(\partial _{\mu }\zeta )^{2} + MA_{\mu }\partial ^{\mu }\zeta + \frac{1} {2}(\partial _{\mu }h)^{2} - h^{2}\mu ^{2}\;.{}\end{array}$$
(1.67)

The mixing term between \(A_{\mu }\) and \(\partial _{\mu }\zeta\) does not allow us to write diagonal mass matrices directly. But this mixing term can be eliminated by an appropriate modification of the covariant gauge fixing term given in (1.38) for the unbroken theory. We now take

$$\displaystyle{ \Delta \mathcal{L}_{\mathrm{GF}} = -\frac{1} {2\xi }(\partial ^{\mu }A_{\mu } -\xi M\zeta )^{2}\;. }$$
(1.68)

By adding \(\Delta \mathcal{L}_{\mathrm{GF}}\) to the quadratic terms in (1.67), the mixing term cancels (apart from a total derivative that can be omitted) and we have

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{\mathrm{quad}}& =& -\frac{1} {4}\sum _{A}F_{\mu \nu }^{A}F^{A\mu \nu } + \frac{1} {2}M^{2}A_{\mu }A^{\mu } -\frac{1} {2\xi }(\partial ^{\mu }A_{\mu })^{2} \\ & & +\frac{1} {2}(\partial _{\mu }\zeta )^{2} - \frac{\xi } {2}M^{2}\zeta ^{2} + \frac{1} {2}(\partial _{\mu }h)^{2} - h^{2}\mu ^{2}\;.{}\end{array}$$
(1.69)

We see that the ζ field appears with a mass \(\sqrt{\xi }M\) and its propagator is

$$\displaystyle{ \mathrm{i}D_{\zeta } = \frac{\mathrm{i}} {k^{2} -\xi M^{2} + \mathrm{i}\epsilon }\;. }$$
(1.70)

The propagators of the Higgs field h and of gauge field A μ are

$$\displaystyle{ \mathrm{i}D_{h} = \frac{\mathrm{i}} {k^{2} - 2\mu ^{2} + \mathrm{i}\epsilon }\;, }$$
(1.71)
$$\displaystyle{ \mathrm{i}D_{\mu \nu }(k) = \frac{-\mathrm{i}} {k^{2} - M^{2} + \mathrm{i}\epsilon }\left [g_{\mu \nu } - (1-\xi ) \frac{k_{\mu }k_{\nu }} {k^{2} -\xi M^{2}}\right ]\;. }$$
(1.72)

As anticipated, all propagators have good behaviour at large \(k^{2}\). This class of gauges is called “\(R_{\xi }\) gauges” [207]. Note that for \(\xi = 1\) we have a sort of generalization of the Feynman gauge with a Goldstone boson of mass \(M\) and a gauge propagator:

$$\displaystyle{ \mathrm{i}D_{\mu \nu }(k) = \frac{-\mathrm{i}g_{\mu \nu }} {k^{2} - M^{2} + \mathrm{i}\epsilon }\;. }$$
(1.73)

Furthermore, for \(\xi \rightarrow \infty\) the unitary gauge description is recovered, since the would-be Goldstone propagator vanishes and the gauge propagator reproduces that of the unitary gauge in (1.65). All \(\xi\) dependence present in individual Feynman diagrams, including the unphysical singularities of the \(\zeta\) and \(A_{\mu }\) propagators at \(k^{2} =\xi M^{2}\), must cancel in the sum of all contributions to any physical quantity.
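The limiting cases of the gauge propagator (1.72) can be illustrated numerically. In the sketch below the bracket of (1.72) is written as \(g_{\mu \nu } - c(\xi )\,k_{\mu }k_{\nu }\), and only the scalar coefficient \(c(\xi )\) and the would-be Goldstone denominator are evaluated, with the \(\mathrm{i}\epsilon\) term dropped; the function names and sample values are illustrative assumptions.

```python
def longitudinal_coefficient(k2, mass, xi):
    """c(xi) = (1 - xi) / (k^2 - xi M^2), multiplying k_mu k_nu in the bracket of (1.72)."""
    return (1 - xi) / (k2 - xi * mass**2)

def goldstone_propagator(k2, mass, xi):
    """Magnitude factor 1 / (k^2 - xi M^2) of the would-be Goldstone propagator (1.70)."""
    return 1 / (k2 - xi * mass**2)
```

At \(\xi = 1\) the coefficient vanishes and only the \(g_{\mu \nu }\) term of (1.73) survives. For very large \(\xi\) the coefficient tends to \(1/M^{2}\), which is the \(k_{\mu }k_{\nu }/M^{2}\) term of the unitary gauge propagator (1.65), while the would-be Goldstone propagator tends to zero.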

An additional complication is that a Faddeev–Popov ghost is also present in \(R_{\xi }\) gauges (while it is absent in an unbroken Abelian gauge theory). In fact, under an infinitesimal gauge transformation with parameter \(\theta (x)\), we have the transformations

$$\displaystyle\begin{array}{rcl} A_{\mu }& \rightarrow & A_{\mu } - \partial _{\mu }\theta, {}\\ \phi & \rightarrow & (1 -\mathrm{i}e\theta )\left [v + \frac{h(x)} {\sqrt{2}} -\mathrm{i}\frac{\zeta (x)} {\sqrt{2}}\right ], {}\\ \end{array}$$

so that

$$\displaystyle{ \updelta A_{\mu } = -\partial _{\mu }\theta \;,\quad \updelta h = -e\zeta \theta \;,\quad \updelta \zeta = e\theta \sqrt{2}\left (v + \frac{h} {\sqrt{2}}\right )\;. }$$
(1.74)

The gauge fixing condition \(\partial _{\mu }A^{\mu } -\xi M\zeta = 0\) undergoes the variation

$$\displaystyle{ \partial _{\mu }A^{\mu } -\xi M\zeta \; \rightarrow \; \partial _{\mu }A^{\mu } -\xi M\zeta -\left [\partial ^{2} +\xi M^{2}\left (1 + \frac{h} {v\sqrt{2}}\right )\right ]\theta \;, }$$
(1.75)

where we have used \(M = \sqrt{2}ev\). From this, recalling the discussion in Sect. 1.6, we see that the ghost is not coupled to the gauge boson (as usual for an Abelian gauge theory), but has a coupling to the Higgs field h. The ghost Lagrangian is

$$\displaystyle{ \varDelta \mathcal{L}_{\mathrm{Ghost}} =\bar{\eta } \left [\partial ^{2} +\xi M^{2}\left (1 + \frac{h} {v\sqrt{2}}\right )\right ]\eta \;. }$$
(1.76)

The ghost mass is seen to be \(m_{\mathrm{gh}} = \sqrt{\xi }M\) and its propagator is

$$\displaystyle{ \mathrm{i}D_{\mathrm{gh}} = \frac{\mathrm{i}} {k^{2} -\xi M^{2} + \mathrm{i}\epsilon }\;. }$$
(1.77)

The detailed Feynman rules follow for all the basic vertices involving the gauge boson, the Higgs, the would-be Goldstone boson, and the ghost, and are easily derived, with some algebra, from the total Lagrangian including the gauge fixing and ghost additions. The generalization to the non-Abelian case is in principle straightforward, with some formal complications involving the projectors over the space of the would-be Goldstone bosons and over the orthogonal space of the Higgs particles. But for each gauge boson that takes mass \(M_{a}\), we still have a corresponding would-be Goldstone boson and a ghost with mass \(\sqrt{\xi }M_{a}\). The Feynman diagrams, both for the Abelian and the non-Abelian case, are listed explicitly, for example, in the textbook by Cheng and Li [250].

We conclude by recalling that the renormalizability of non-Abelian gauge theories, also in the presence of spontaneous symmetry breaking, was proven in the fundamental work by 't Hooft and Veltman [358], and is discussed in detail in [278].