2.1 Introduction to Chaps. 2, 3 and 4

Stefano Forte

The presentation of the Standard Model in Chaps. 2, 3 and 4 was originally written by Guido Altarelli in 2007. In this introduction we provide a brief update (with references), and a discussion of the main developments which have taken place since the time of writing.

Chapter 2 presents the architecture of the Standard Model, the way symmetries are realized and the way this is described at the quantum level. The structure of the Standard Model has been well established for half a century or so. The presentation in this chapter highlights the experimental (and thus, to a certain extent, historical) origin of the main structural aspects of the theory. The only aspects of the presentation which require (minimal) updating are the numerical values given for parameters, such as the Fermi coupling constant \(G_F\), see Eq. (2.3). All of these parameters have been known quite accurately since the early 2000s (with the exception of neutrino masses, see Sect. 3.7 of Chap. 3), and thus their values are quite stable. The numbers given below are taken from the then-current edition of the Particle Data Book (PDG) [7]. At any given time, in order to have the most recent and accurate values, the reader should consult the most recent edition of the PDG [25], preferably using the web-based version [26], which is constantly updated.

Chapter 3 presents the Electroweak sector of the Standard Model, which was established as a successful theory by extensive experimentation at the LEP electron-positron collider of CERN in the last decade of the past century, including some aspects of the theory, such as the CKM mechanism for mass mixing (see Sect. 3.6) which were originally often considered to be only approximate. The discovery, at the turn of the century, of neutrino mixing, and thus non-vanishing neutrino masses (see Sect. 3.7) has been the only significant addition to the minimal version of the electroweak theory as formulated in the sixties and seventies of the past century. The general understanding of electroweak interactions was thus essentially settled at the time of the writing of this chapter.

From the experimental point of view, the main development since then is the successful completion of the first two runs of the LHC, which have provided further confirmation of the standard Electroweak theory (see Ref. [27] for a recent review). From a theoretical point of view, the main surprise (from the LHC, but also a number of other experiments) is that there have been no surprises.

First and foremost, the Higgs sector of the Standard Model: after discovery of the Higgs boson in 2012 [28, 29] the Higgs sector has turned out so far to be in agreement with the minimal one-doublet structure presented in Sect. 3.5. The discussion presented there, as well as the phenomenology of the Standard Model Higgs of Sect. 3.13, remain thus essentially unchanged by the Higgs discovery. A theoretical introduction with more specific reference to the LHC can be found in Ref. [30], while the current experimental status of Higgs properties can be found in the continually updated pages of the CERN Higgs cross-section working group [31]. Perhaps, the only real surprise in the Higgs sector of the Standard Model is the extreme closeness of the measured Higgs mass to the critical value required for vacuum stability (see Sect. 3.13.1 below)—a fact with interesting cosmological implications [32]. The discovery of the Higgs has changed somewhat the nature of global fits of Standard Model parameters discussed in Sect. 3.12: with the value of the Higgs mass known, the fit is over-constrained—though the conclusion of global consistency remains unchanged. An updated discussion is given in Ref. [27], as well as in the review on the Electroweak Model by Erler and Freitas in the PDG [26].

Besides the Higgs discovery, the general trend of the last several years has been that of the gradual disappearance of all anomalies (instances of discrepancy between Standard Model predictions and the data), either due to more accurate theory calculations (or even the correction of errors: see Sect. 3.9), or to more precise measurements. A case in point is that of the measurements of the electroweak mixing angle, discussed in Sect. 3.12: the tensions or signals of disagreement which are discussed there have all but disappeared, mostly thanks to more accurate theoretical calculations. Another case in which the agreement between Standard Model and experiment is improving (albeit perhaps more slowly) is that of lepton anomalous magnetic moments, discussed in Sect. 3.9. In both cases, updates on the current situation can again be found in Ref. [27], and in the aforementioned PDG review by Erler and Freitas.

Finally, there are a number of cases in which data from LHC experiments (as well as other experiments, specifically in the fields of flavor physics and neutrino physics) have brought more accuracy and more stringent tests, without changing the overall picture. These include gauge boson couplings, discussed in Sects. 3.3 and 3.4, for which we refer to Ref. [27]; the CKM matrix and flavor physics, discussed in Sect. 3.6, for which we refer to the review by Ceccucci, Ligeti and Sakai in the PDG [26]; neutrino masses and mixings, discussed in Sect. 3.7, for which we refer to the PDG review by Gonzalez-Garcia and Yokoyama [26].

This perhaps unexpected success of the Standard Model, and the failure to find any evidence so far of new physics (and in particular supersymmetry) at the LHC, has somewhat modified the perspective on the limitations of the Standard Model discussed in Sect. 3.14. Specifically, the significance of the hierarchy problem (the so-called “naturalness” issue) must be questioned, given that it entails new physics which has not been found: a suggestive discussion of this shift in perspective is in Ref. [33]. Yet, the classification of possible new physics scenarios of Sect. 3.14 remains essentially valid: recent updates are in Ref. [34] for supersymmetric models, and in Ref. [35] for non-supersymmetric ones. Consequently, looking for new physics has now become a precision exercise, and this has provided a formidable stimulus to the study of Electroweak radiative corrections, which has been the subject of very intense activity beyond the classic results discussed in Sect. 3.10: a recent detailed review is in Ref. [36].

Chapter 4 is devoted to the theory of strong interactions, Quantum Chromodynamics (QCD). This theory has not changed since its original formulation in the second half of the past century. Specifically, its application to hard processes, which allows for the use of perturbative methods, is firmly rooted in the set of classic results and techniques discussed in Sect. 4.5 below. What did slowly change over the years is the experimental status of QCD. What used to be, in the past century, a theory established qualitatively, has gradually turned into a theory firmly established experimentally, although, at the time this chapter was written, not quite tested to the same precision as the electroweak theory (see Sect. 4.7). Now, after the first two runs of the LHC, it can be stated that the whole of the Standard Model, QCD and the Electroweak theory, is tested to the same very high level of accuracy and precision, typically at the percent or sub-percent level.

Turning QCD into a precision theory has been a pre-requisite for successful physics at the LHC, a hadron collider in which every physical process necessarily involves the strong interaction, since the colliding objects are protons (or nuclei). This has grown into a pressing need as the lack of discovery of new particles or major deviations from Standard Model predictions has turned the search for new physics signals into a precision exercise: it has turned the LHC from an “energy frontier” to a “rarity/accuracy frontier” machine—something that was deemed inconceivable just before the start of its operation [37].

This rapid progress has happened thanks to an ever-increasing set of computational techniques, which, building upon the classic results presented in this chapter, has allowed for an enormous expansion of the set of perturbative computations of processes at colliders which are introduced in Sect. 4.5.4, and discussed in more detail in the context of LHC (and specifically Higgs) physics in Ref. [30].

To begin with, basic quantities such as the running of the coupling, discussed in Sect. 4.4, and \(R_{e^+e^-}\), discussed in Sect. 4.5.1, are now known to one extra perturbative order (see the QCD review of the PDG [26] for the current state of the art and full references). These are five-loop perturbative calculations, now made possible thanks to the availability of powerful computing resources. Furthermore, the set of processes discussed in Sect. 4.5.4 has now been extended to include essentially all relevant hadron collider processes, which have been routinely computed to third perturbative order, while the first fourth-order calculations have just started appearing. Again, the QCD review of the PDG [26] provides a useful status update, including a comparison between computation and experiment for cross-sections which span about ten orders of magnitude in size.

This progress has been happening thanks to the development of a vast new set of computational techniques, which, rooted in perturbative QCD, have now spawned a dedicated research field: that of amplitudes [38], which relates phenomenology, quantum field theory, and mathematics. The classic set of methods for “resummation” (the sum of infinite classes of perturbative contributions), discussed specifically in Sect. 4.5.3.1 for deep-inelastic scattering, has been extended well beyond the processes and accuracy discussed in Sect. 4.5.4; an up-to-date list is in the QCD review of the PDG [26]. Moreover, an entirely new set of resummation techniques has been developed, using the methodology of effective field theories: the so-called soft-collinear effective theory (SCET), which provides an extra tool in the resummation box [39]. One remarkable consequence of all these developments is that it is now possible to understand in detail the structure of pure strong interaction events, in which jets of hadrons are produced in the final state, by looking inside these events and tracing their structure in terms of the fundamental fields of QCD: quarks and gluons [40].

One topic in which things have changed rather less is the determination of the strong coupling, discussed in Sect. 4.7. Whereas the agreement between predicted and observed scaling violations discussed in Sect. 4.6.3 is ever more impressive (see the review on structure functions of the PDG [26]) the accuracy on the determination of the strong coupling itself has not improved much. Updated discussions can be found in the QCD review of the PDG, as well as in Ref. [41]. Progress is likely to come from future, more accurate LHC data, as well as from non-perturbative calculations [42] (not discussed here) soon expected to become competitive.

All in all, the dozen or so years since the original writing of these chapters have seen a full vindication of the Standard Model as a correct and accurate theory, and have stimulated a vast number of highly sophisticated experimental and theoretical results which build upon the treatment presented below.

2.2 Introduction

The ultimate goal of fundamental physics is to reduce all natural phenomena to a set of basic laws and theories that, at least in principle, can quantitatively reproduce and predict the experimental observations. At the microscopic level all the phenomenology of matter and radiation, including molecular, atomic, nuclear and subnuclear physics, can be understood in terms of three classes of fundamental interactions: strong, electromagnetic and weak interactions. In atoms the electrons are bound to nuclei by electromagnetic forces and the properties of electron clouds explain the complex phenomenology of atoms and molecules. Light is a particular vibration of electric and magnetic fields (an electromagnetic wave). Strong interactions bind the protons and neutrons together in nuclei, being so intensely attractive at short distances that they prevail over the electric repulsion due to the equal-sign charges of protons. Protons and neutrons, in turn, are composites of three quarks held together by strong interactions, to which quarks and gluons are subject (hence these particles are called “hadrons” from the Greek word for “strong”). The weak interactions are responsible for the beta radioactivity that makes some nuclei unstable, as well as for the nuclear reactions that produce the enormous energy radiated by the stars and by our Sun in particular. The weak interactions also cause the disintegration of the neutron, the charged pions, the lightest hadronic particles with strangeness, charm, and beauty (which are “flavour” quantum numbers) as well as the decay of the top quark and of the heavy charged leptons (the muon μ and the tau τ). In addition all observed neutrino interactions are due to weak forces.

All these interactions are described within the framework of quantum mechanics and relativity, more precisely by a local relativistic quantum field theory. To each particle, described as pointlike, is associated a field with suitable (depending on the particle spin) transformation properties under the Lorentz group (the relativistic space-time coordinate transformations). It is remarkable that the description of all these particle interactions is based on a common principle: “gauge” invariance. A “gauge” symmetry is invariance under transformations that rotate the basic internal degrees of freedom but with rotation angles that depend on the space-time point. At the classical level gauge invariance is a property of the Maxwell equations of electrodynamics and it is in this context that the notion and the name of gauge invariance were introduced. The prototype of all quantum gauge field theories, with a single gauged charge, is QED, Quantum Electro-Dynamics, developed in the years from 1926 until about 1950, which indeed is the quantum version of Maxwell theory. Theories with gauge symmetry, at the renormalizable level, are completely determined given the symmetry group and the representations of the interacting fields. The whole set of strong, electromagnetic and weak interactions is described by a gauge theory, with 12 gauged non-commuting charges, which is called “the Standard Model” of particle interactions (SM). Actually only a subgroup of the SM symmetry is directly reflected in the spectrum of physical states. A part of the electroweak symmetry is hidden by the Higgs mechanism for the spontaneous symmetry breaking of a gauge symmetry.

For all material bodies on the Earth and in all geological, astrophysical and cosmological phenomena a fourth interaction, the gravitational force, plays a dominant role, while it is instead negligible in atomic and nuclear physics. The theory of general relativity is a classical (in the sense of non quantum mechanical) description of gravitation that goes beyond the static approximation described by Newton's law and includes dynamical phenomena like, for example, gravitational waves. The problem of the formulation of a quantum theory of gravitational interactions is one of the central problems of contemporary theoretical physics. But quantum effects in gravity become important only for energy concentrations in space-time which are not in practice accessible to experimentation in the laboratory. Thus the search for the correct theory can only be done by a purely speculative approach. All attempts at a description of quantum gravity in terms of a well defined and computable local field theory along similar lines as for the SM have so far failed to lead to a satisfactory framework. Rather, at present the most complete and plausible description of quantum gravity is a theory formulated in terms of non pointlike basic objects, the so called “strings”, extended over distances much shorter than those experimentally accessible, that live in a space-time with 10 or 11 dimensions. The additional dimensions beyond the familiar 4 are, typically, compactified, which means that they are curled up with a curvature radius of the order of the string dimensions. Present string theory is an all-comprehensive framework that suggests a unified description of all interactions together with gravity, of which the SM would be only a low energy or large distance approximation.

A fundamental principle of quantum mechanics, the Heisenberg uncertainty principle, implies that, for studying particles with spatial dimensions of order \(\Delta x\) or interactions taking place at distances of order \(\Delta x\), one needs as a probe a beam of particles (typically produced by an accelerator) with momentum \(p \gtrsim \hbar/\Delta x\), where \(\hbar\) is the reduced Planck constant (\(\hbar = h/2\pi\)). Accelerators presently in operation or available in the near future, like the Large Hadron Collider at CERN near Geneva, allow the study of collisions between two particles with total center of mass energy up to \(2E\sim 2pc\lesssim 14\) TeV. These machines can, in principle, probe physics down to distances \(\Delta x \gtrsim 10^{-18}\) cm. Thus, on the basis of results from experiments at existing accelerators, we can confirm that, down to distances of that order of magnitude, indeed electrons, quarks and all the fundamental SM particles do not show an appreciable internal structure and look elementary and pointlike. We expect that quantum effects in gravity will certainly become important at distances \(\Delta x \lesssim 10^{-33}\) cm, corresponding to energies up to \(E \sim M_{Pl}c^2 \sim 10^{19}\) GeV, where \(M_{Pl}\) is the Planck mass, related to Newton's constant by \(G_N = \hbar c/M_{Pl}^2\). At such short distances the particles that so far appeared as pointlike could well reveal an extended structure, as for strings, and be described by a more detailed theoretical framework of which the local quantum field theory description of the SM would be just a low energy/large distance limit.
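As a rough numerical check of the two distance scales quoted above, the following sketch evaluates \(\Delta x \sim \hbar c/(pc)\) for one LHC beam and for the Planck energy. The function name `resolvable_distance_cm` and the rounded value of \(\hbar c\) are illustrative choices that do not appear in the original text, and the result is an order-of-magnitude estimate only.

```python
# Order-of-magnitude sketch of Delta x ~ hbar*c / (p*c).
# Assumption: hbar*c = 197.327 MeV*fm, written here in GeV*cm.
HBAR_C_GEV_CM = 1.97327e-14


def resolvable_distance_cm(pc_gev):
    """Shortest distance (in cm) probed by a particle with p*c given in GeV."""
    return HBAR_C_GEV_CM / pc_gev


# One beam of a collider with 2E ~ 2pc ~ 14 TeV carries p*c ~ 7 TeV.
lhc_dx = resolvable_distance_cm(7.0e3)
# Planck energy as quoted in the text: M_Pl c^2 ~ 1e19 GeV.
planck_dx = resolvable_distance_cm(1.0e19)

print(f"LHC beam:     {lhc_dx:.1e} cm")
print(f"Planck scale: {planck_dx:.1e} cm")
```

Both numbers land on the orders of magnitude given in the text, \(10^{-18}\) cm and \(10^{-33}\) cm, which are separated by about fifteen orders of magnitude.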

From the first few moments of the Universe, after the Big Bang, the temperature of the cosmic background went down gradually, starting from \(kT \sim M_{Pl}c^2\), where \(k = 8.617\ldots \times 10^{-5}\) eV K\(^{-1}\) is the Boltzmann constant, down to the present situation where \(T \sim 2.725\) K. Then all stages of high energy physics from string theory, which is a purely speculative framework, down to the SM phenomenology, which is directly accessible to experiment and well tested, are essential for the reconstruction of the evolution of the Universe starting from the Big Bang. This is the basis for the ever increasing relation between high energy physics and cosmology.
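To give a feeling for the range of energies involved, the sketch below converts the two temperatures mentioned above into energies using the quoted Boltzmann constant. The variable names are illustrative, the Boltzmann constant is used in the truncated form given in the text, and \(M_{Pl}c^2\) is taken as the round value \(10^{19}\) GeV.

```python
# Thermal energy scales, using the truncated constants quoted in the text.
K_BOLTZMANN_EV_PER_K = 8.617e-5   # eV/K (truncated, as in the text)
T_CMB_K = 2.725                   # present cosmic background temperature, K
M_PLANCK_EV = 1.0e19 * 1.0e9      # M_Pl c^2 ~ 1e19 GeV, expressed in eV

# Thermal energy kT of the cosmic background today.
kT_today_ev = K_BOLTZMANN_EV_PER_K * T_CMB_K

# Temperature at which kT equals the Planck energy.
planck_temperature_k = M_PLANCK_EV / K_BOLTZMANN_EV_PER_K

print(f"kT today:           {kT_today_ev:.3e} eV")
print(f"Planck temperature: {planck_temperature_k:.2e} K")
```

With these inputs the thermal energy spans roughly 32 orders of magnitude between the Planck epoch and today.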

2.3 Overview of the Standard Model

The SM is a gauge field theory based on the symmetry group SU(3) ⊗ SU(2) ⊗ U(1). The transformations of the group act on the basic fields. This group has 8 + 3 + 1 = 12 generators with a non trivial commutator algebra (if all generators commute the gauge theory is said to be “abelian”, while the SM is a “non abelian” gauge theory). SU(3) is the “colour” group of the theory of strong interactions (QCD: Quantum Chromo-Dynamics [1,2,3]). SU(2) ⊗ U(1) describes the electroweak (EW) interactions [4,5,6] and the electric charge Q, the generator of the QED gauge group \(U(1)_Q\), is the sum of \(T^3\), one of the SU(2) generators, and of \(Y/2\), where Y is the U(1) generator: \(Q = T^3 + Y/2\).
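The generator counting and the charge relation can be checked with a few lines of exact arithmetic. This is a minimal sketch: the formula \(N^2 - 1\) for the number of SU(N) generators and the hypercharge values \(Y = -1\) (lepton doublet) and \(Y = 1/3\) (quark doublet) are the standard assignments in the convention \(Q = T^3 + Y/2\), but they are not stated in the text above and are assumptions of this example, as are all function and variable names.

```python
from fractions import Fraction


def su_n_generators(n):
    """Number of generators of SU(N): N^2 - 1."""
    return n * n - 1


# SU(3) x SU(2) x U(1): 8 + 3 + 1 generators.
total_generators = su_n_generators(3) + su_n_generators(2) + 1


def electric_charge(t3, y):
    """Q = T^3 + Y/2, evaluated with exact fractions."""
    return t3 + y / 2


half = Fraction(1, 2)
# (T^3, Y) for left-handed doublet members.
# Assumption: standard hypercharges, not given in the text above.
left_handed_doublets = {
    "nu_e": (half, Fraction(-1)),
    "e": (-half, Fraction(-1)),
    "u": (half, Fraction(1, 3)),
    "d": (-half, Fraction(1, 3)),
}
charges = {
    name: electric_charge(t3, y)
    for name, (t3, y) in left_handed_doublets.items()
}

print("generators:", total_generators)
for name, q in charges.items():
    print(f"Q({name}) = {q}")
```

Under these assumptions the familiar charges 0, −1, +2/3 and −1/3 come out exactly.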

In a gauge theory, to each generator T is associated a vector boson (also called a gauge boson) with the same quantum numbers as T, and, if the gauge symmetry is unbroken, this boson has vanishing mass. These vector (i.e. spin 1) bosons act as mediators of the corresponding interactions. For example, in QED the vector boson associated to the generator Q is the photon γ. The interaction between two charged particles in QED, for example two electrons, is mediated by the exchange of one (or, more rarely, more than one) photon emitted by one electron and reabsorbed by the other. Similarly in the SM there are 8 massless gluons associated to the SU(3) colour generators, while for SU(2) ⊗ U(1) there are 4 gauge bosons \(W^+\), \(W^-\), \(Z^0\) and γ. Of these, only the photon γ is massless because the symmetry induced by the other 3 generators is actually spontaneously broken. The masses of \(W^+\), \(W^-\) and \(Z^0\) are quite large indeed on the scale of elementary particles: with \(m_W \sim 80.4\) GeV and \(m_Z \sim 91.2\) GeV, they are as heavy as atoms of intermediate size like rubidium and molybdenum, respectively. In the electroweak theory the breaking of the symmetry is of a particular type, denoted as spontaneous symmetry breaking. In this case charges and currents are as dictated by the symmetry but the fundamental state of minimum energy, the vacuum, is not unique and there is a continuum of degenerate states that all together respect the symmetry (in the sense that the whole vacuum orbit is spanned by applying the symmetry transformations). The symmetry breaking is due to the fact that the system (with infinite volume and infinite number of degrees of freedom) is found in one particular vacuum state, and this choice, which for the SM occurred in the first instants of the life of the Universe, makes the symmetry violated in the spectrum of states. In a gauge theory like the SM the spontaneous symmetry breaking is realized by the Higgs mechanism (described in detail in Sect. 2.7): there are a number of scalar (i.e. of zero spin) Higgs bosons with a potential that produces an orbit of degenerate vacuum states. One or more of these scalar Higgs particles must necessarily be present in the spectrum of physical states with masses very close to the range so far explored. It is expected that the Higgs particle(s) will be found at the LHC thus completing the experimental verification of the SM.
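The comparison of the weak boson masses with atomic masses can be made quantitative by converting GeV to atomic mass units. In the sketch below the conversion factor and the rounded standard atomic weights of rubidium and molybdenum are inputs that do not appear in the text and are quoted only for comparison; the names are illustrative.

```python
# Compare W and Z masses with atomic masses.
# Assumption: 1 atomic mass unit (u) corresponds to about 0.931494 GeV/c^2.
GEV_PER_ATOMIC_MASS_UNIT = 0.931494


def mass_in_u(mass_gev):
    """Convert a mass from GeV/c^2 to atomic mass units."""
    return mass_gev / GEV_PER_ATOMIC_MASS_UNIT


m_w_u = mass_in_u(80.4)   # W mass from the text
m_z_u = mass_in_u(91.2)   # Z mass from the text

# Rounded standard atomic weights, quoted for comparison only.
RUBIDIUM_U = 85.47
MOLYBDENUM_U = 95.95

print(f"m_W = {m_w_u:.1f} u  (rubidium:   {RUBIDIUM_U} u)")
print(f"m_Z = {m_z_u:.1f} u  (molybdenum: {MOLYBDENUM_U} u)")
```

Each boson comes out within about two atomic mass units of the corresponding atom.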

The fermionic (all of spin 1/2) matter fields of the SM are quarks and leptons. Each type of quark is a colour triplet (i.e. each quark flavour comes in three colours) and also carries electroweak charges, in particular electric charges +2/3 for up-type quarks and −1/3 for down-type quarks. So quarks are subject to all SM interactions. Leptons are colourless and thus do not interact strongly (they are not hadrons) but have electroweak charges, in particular electric charge −1 for charged leptons (\(e^-\), \(\mu^-\) and \(\tau^-\)) and 0 for neutrinos (\(\nu_e\), \(\nu_\mu\) and \(\nu_\tau\)). Quarks and leptons are grouped in 3 “families” or “generations” with equal quantum numbers but different masses. At present we do not have an explanation for this triple repetition of fermion families:

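$$\displaystyle \begin{aligned} \left[\begin{array}{cccc} u & u & u & \nu_e \\ d & d & d & e \end{array}\right], \quad \left[\begin{array}{cccc} c & c & c & \nu_{\mu} \\ s & s & s & \mu \end{array}\right], \quad \left[\begin{array}{cccc} t & t & t & \nu_{\tau} \\ b & b & b & \tau \end{array}\right]~. {} \end{aligned} $$
(2.1)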

The QCD sector of the SM has a simple structure but a very rich dynamical content, including the observed complex spectroscopy with a large number of hadrons. The most prominent properties of QCD are asymptotic freedom and confinement. In field theory the effective coupling of a given interaction vertex is modified by the interaction. As a result, the measured intensity of the force depends on the transferred (four)momentum squared, \(Q^2\), among the participants. In QCD the relevant coupling parameter that appears in physical processes is \(\alpha _s=e_s^2/4\pi \) where \(e_s\) is the coupling constant of the basic interaction vertices of quarks and gluons: qqg or ggg (see Eq. (2.30)). Asymptotic freedom means that the effective coupling becomes a function of \(Q^2\): \(\alpha_s(Q^2)\) decreases for increasing \(Q^2\) and vanishes asymptotically. Thus, the QCD interaction becomes very weak in processes with large \(Q^2\), called hard processes or deep inelastic processes (i.e. with a final state distribution of momenta and a particle content very different from those in the initial state). One can prove that in 4 space-time dimensions all pure-gauge theories based on a non commuting group of symmetry are asymptotically free and conversely. The effective coupling decreases very slowly at large momenta with the inverse logarithm of \(Q^2\): \(\alpha _s(Q^2)=1/\left[b\log (Q^2/\Lambda ^2)\right]\) where b is a known constant and Λ is an energy of order a few hundred MeV. Since in quantum mechanics large momenta imply short wavelengths, the result is that at short distances the potential between two colour charges is similar to the Coulomb potential, i.e. proportional to \(\alpha_s(r)/r\), with an effective colour charge which is small at short distances. On the contrary the interaction strength becomes large at large distances or small transferred momenta, of order \(Q\lesssim \Lambda \).
In fact all observed hadrons are tightly bound composite states of quarks (baryons are made of qqq and mesons of \(q\bar q\)), with compensating colour charges so that they are overall neutral in colour. The property of confinement is the impossibility of separating colour charges, like individual quarks and gluons or any other coloured state. This is because in QCD the interaction potential between colour charges increases at long distances linearly in r. When we try to separate the quark and the antiquark that form a colour neutral meson the interaction energy grows until pairs of quarks and antiquarks are created from the vacuum, and new colour neutral mesons are formed and observed in the final state instead of free quarks. For example, consider the process \(e^+e^- \rightarrow q\bar {q}\) at large center of mass energies. The final state quark and antiquark have large energies, so they separate in opposite directions very fast. But the colour confinement forces create new pairs in between them. What is observed is two back-to-back jets of colourless hadrons with a number of slow pions that make the exact separation of the two jets impossible. In some cases a third well separated jet of hadrons is also observed: these events correspond to the radiation of an energetic gluon from the parent quark-antiquark pair.
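The slow logarithmic decrease of the coupling can be illustrated numerically with the leading-logarithm formula \(\alpha_s(Q^2) = 1/[b\log(Q^2/\Lambda^2)]\). The text only says that b is a known constant; the sketch below assumes the standard one-loop value \(b = (11 N_c - 2 n_f)/12\pi\), a fixed number of flavours \(n_f = 5\) at all scales, and \(\Lambda = 0.2\) GeV. These inputs and the function names are illustrative assumptions, so the numbers are indicative only.

```python
import math


def one_loop_b(n_flavours, n_colours=3):
    """Assumed one-loop coefficient b = (11*N_c - 2*n_f) / (12*pi)."""
    return (11 * n_colours - 2 * n_flavours) / (12 * math.pi)


def alpha_s(q_gev, lambda_gev=0.2, n_flavours=5):
    """Leading-log coupling 1 / (b * log(Q^2 / Lambda^2)), valid for Q > Lambda."""
    if q_gev <= lambda_gev:
        raise ValueError("the formula only makes sense for Q > Lambda")
    return 1.0 / (one_loop_b(n_flavours) * math.log(q_gev**2 / lambda_gev**2))


scales = [2.0, 10.0, 91.2, 1000.0]           # Q in GeV
couplings = [alpha_s(q) for q in scales]

for q, a in zip(scales, couplings):
    print(f"Q = {q:7.1f} GeV   alpha_s = {a:.3f}")
```

Increasing Q by almost three orders of magnitude reduces the coupling only by a factor of a few, which is the slow decrease described above. The formula diverges as Q approaches Λ, signalling the breakdown of perturbation theory at low momenta.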

In the EW sector the SM inherits the phenomenological successes of the old (V − A) ⊗ (V − A) four-fermion low-energy description of weak interactions, and provides a well-defined and consistent theoretical framework including weak interactions and quantum electrodynamics in a unified picture. The weak interactions derive their name from their intensity. At low energy the strength of the effective four-fermion interaction of charged currents is determined by the Fermi coupling constant \(G_F\). For example, the effective interaction for muon decay is given by

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}} = (G_F/\sqrt 2) \left[ \bar \nu_{\mu}\gamma_{\alpha}(1-\gamma_5)\mu \right] \left[ \bar e\gamma^{\alpha}(1-\gamma_5)\nu_e \right]~, {} \end{aligned} $$
(2.2)

with [7]

$$\displaystyle \begin{aligned} G_F = 1.16639(1) \times 10^{-5}~\mathrm{GeV}^{-2}~. {} \end{aligned} $$
(2.3)

In natural units \(\hbar = c = 1\), \(G_F\) has dimensions of (mass)\(^{-2}\). As a result, the intensity of weak interactions at low energy is characterized by \(G_F E^2\), where E is the energy scale for a given process (\(E \approx m_\mu\) for muon decay). Since

$$\displaystyle \begin{aligned} G_FE^2 = G_Fm^2_p(E/m_p)^2 \simeq 10^{-5}(E/m_p)^2~, {} \end{aligned} $$
(2.4)

where \(m_p\) is the proton mass, the weak interactions are indeed weak at low energies (up to energies of order a few tens of GeV). Effective four fermion couplings for neutral current interactions have comparable intensity and energy behaviour. The quadratic increase with energy cannot continue forever, because it would lead to a violation of unitarity. In fact, at large energies the propagator effects can no longer be neglected, and the current–current interaction is resolved into current–W gauge boson vertices connected by a W propagator. The strength of the weak interactions at high energies is then measured by \(g_W\), the \(W\)-\(\mu\)-\(\nu_\mu\) coupling, or, even better, by \(\alpha _W = g^2_W/4\pi \), analogous to the fine-structure constant α of QED (in Chap. 3, \(g_W\) is simply denoted by \(g\) or \(g_2\)). In the standard EW theory, we have

$$\displaystyle \begin{aligned} \alpha_W = \sqrt 2~G_F~m^2_W/\pi \cong 1/30~. {} \end{aligned} $$
(2.5)

That is, at high energies the weak interactions are no longer so weak.
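The two estimates in Eqs. (2.4) and (2.5) are easy to reproduce numerically. The sketch below uses \(G_F\) from Eq. (2.3) and \(m_W\) from the text; the proton mass value and all names are illustrative inputs not given in this section.

```python
import math

G_F = 1.16639e-5        # Fermi constant in GeV^-2, Eq. (2.3)
M_W = 80.4              # W mass in GeV, as quoted in the text
M_PROTON = 0.938272     # proton mass in GeV (assumption: rounded value)


def weak_strength(energy_gev):
    """Dimensionless low-energy strength G_F * E^2 of Eq. (2.4)."""
    return G_F * energy_gev**2


# Eq. (2.4): G_F * m_p^2 should be close to 1e-5.
gf_mp2 = weak_strength(M_PROTON)

# Eq. (2.5): alpha_W = sqrt(2) * G_F * m_W^2 / pi.
alpha_w = math.sqrt(2) * G_F * M_W**2 / math.pi

print(f"G_F m_p^2 = {gf_mp2:.3e}")
print(f"alpha_W   = {alpha_w:.4f}  (1/alpha_W = {1 / alpha_w:.1f})")
```

With these inputs \(1/\alpha_W\) comes out slightly below 30, to be compared with \(1/\alpha \simeq 137\) for QED at low energy.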

The range \(r_W\) of weak interactions is very short: it is only with the experimental discovery of the W and Z gauge bosons that it could be demonstrated that \(r_W\) is non-vanishing. Now we know that

$$\displaystyle \begin{aligned} r_W = \frac{\hbar}{m_Wc} \simeq 2.5 \times 10^{-16}~\mathrm{cm}, {} \end{aligned} $$
(2.6)

corresponding to \(m_W \simeq 80.4\) GeV. This very large value for the W (or the Z) mass makes a drastic difference, compared with the massless photon and the infinite range of the QED force. The direct experimental limit on the photon mass is [7] \(m_\gamma < 6 \times 10^{-17}\) eV. Thus, on the one hand, there is very good evidence that the photon is massless. On the other hand, the weak bosons are very heavy. A unified theory of EW interactions has to face this striking difference.
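The contrast between the two ranges can be made explicit by applying \(r = \hbar/(mc)\) both to the W mass and to the photon mass bound. This is a rough sketch: reading the photon mass limit as a lower bound on the range of the electromagnetic force is an interpretation added here, and the rounded value of \(\hbar c\) and the names are illustrative.

```python
# Range r = hbar / (m c) = (hbar c) / (m c^2).
# Assumption: hbar*c = 197.327 MeV*fm, written here in eV*cm.
HBAR_C_EV_CM = 1.97327e-5


def range_cm(mass_ev):
    """Range (reduced Compton wavelength) in cm for a mediator of mass m c^2 in eV."""
    return HBAR_C_EV_CM / mass_ev


r_w = range_cm(80.4e9)                     # m_W = 80.4 GeV, Eq. (2.6)
photon_range_lower_bound = range_cm(6e-17) # from m_gamma < 6e-17 eV

print(f"r_W          = {r_w:.2e} cm")
print(f"photon range > {photon_range_lower_bound:.1e} cm")
```

The first number reproduces Eq. (2.6); the second is larger by about 27 orders of magnitude.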

Another apparent obstacle in the way of EW unification is the chiral structure of weak interactions: in the massless limit for fermions, only left-handed quarks and leptons (and right-handed antiquarks and antileptons) are coupled to W’s. This clearly implies parity and charge-conjugation violation in weak interactions.

The universality of weak interactions and the algebraic properties of the electromagnetic and weak currents [the conservation of vector currents (CVC), the partial conservation of axial currents (PCAC), the algebra of currents, etc.] have been crucial in pointing to a symmetric role of electromagnetism and weak interactions at a more fundamental level. The old Cabibbo universality [8] for the weak charged current:

$$\displaystyle \begin{aligned} \begin{array}{rcl} J^{\mathrm{weak}}_{\alpha} &\displaystyle =&\displaystyle \bar \nu_{\mu}\gamma_{\alpha} (1-\gamma_5)\mu + \bar \nu_e\gamma_{\alpha}(1-\gamma_5) e + \cos\theta_c~\bar u \gamma_{\alpha}(1-\gamma_5)d + \\ &\displaystyle +&\displaystyle \sin \theta_c~\bar u \gamma_{\alpha}(1-\gamma_5)s + \ldots~, {} \end{array} \end{aligned} $$
(2.7)

suitably extended, is naturally implied by the standard EW theory. In this theory the weak gauge bosons couple to all particles with couplings that are proportional to their weak charges, in the same way as the photon couples to all particles in proportion to their electric charges [in Eq. (2.7), \(d' = \cos \theta _c~d + \sin \theta _c~s\) is the weak-isospin partner of u in a doublet. The (\(u\), \(d'\)) doublet has the same couplings as the (\(\nu_e\), \(e\)) and (\(\nu_\mu\), \(\mu\)) doublets].

Another crucial feature is that the charged weak interactions are the only known interactions that can change flavour: charged leptons into neutrinos or up-type quarks into down-type quarks. On the contrary, there are no flavour-changing neutral currents at tree level. This is a remarkable property of the weak neutral current, which is explained by the introduction of the Glashow-Iliopoulos-Maiani (GIM) mechanism [9] and has led to the successful prediction of charm.

The natural suppression of flavour-changing neutral currents, the separate conservation of e, μ and τ leptonic flavours that is only broken by the small neutrino masses, the mechanism of CP violation through the phase in the quark-mixing matrix [10], are all crucial features of the SM. Many examples of new physics tend to break the selection rules of the standard theory. Thus the experimental study of rare flavour-changing transitions is an important window on possible new physics.

The SM is a renormalizable field theory, which means that the ultra-violet divergences that appear in loop diagrams can be eliminated by a suitable redefinition of the parameters already appearing in the bare lagrangian: masses, couplings and field normalizations. As will be discussed later, a necessary condition for a theory to be renormalizable is that only operator vertices of dimension not larger than 4 (that is \(m^4\), where \(m\) is some mass scale) appear in the lagrangian density \(\mathcal {L}\) (itself of dimension 4, because the action S is given by the integral of \(\mathcal {L}\) over \(d^4x\) and is dimensionless in natural units: \(\hbar = c = 1\)). Once this condition is added to the specification of a gauge group and of the matter field content, the gauge theory lagrangian density is completely specified. We shall see the precise rules to write down the lagrangian of a gauge theory in the next Section.

2.4 The Formalism of Gauge Theories

In this Section we summarize the definition and the structure of a gauge Yang–Mills theory [11]. We will list here the general rules for constructing such a theory. Then these results will be applied to the SM.

Consider a lagrangian density \(\mathcal {L}[\phi ,\partial _{\mu }\phi ]\) which is invariant under a D-dimensional continuous group of transformations:

$$\displaystyle \begin{aligned} \phi'(x) = U(\theta^A)\phi(x)\quad \quad (A = 1, 2, \ldots, D)~. {} \end{aligned} $$
(2.8)

with:

$$\displaystyle \begin{aligned} U(\theta^A) = \exp{[ig \sum_A~\theta^AT^A]} \sim~1 + ig \sum_A~\theta^AT^A~+\dots, {} \end{aligned} $$
(2.9)

The quantities \(\theta^A\) are numerical parameters, like angles in the particular case of a rotation group in some internal space. The approximate expression on the right is valid for infinitesimal \(\theta^A\). Then, g is the coupling constant and \(T^A\) are the generators of the group Γ of transformations (2.8) in the (in general reducible) representation of the fields ϕ. Here we restrict ourselves to the case of internal symmetries, so that \(T^A\) are matrices that are independent of the space-time coordinates and the arguments of the fields ϕ and ϕ′ in Eq. (2.8) are the same. If U is unitary, then the generators \(T^A\) are Hermitian, but this need not be the case in general (though it is true for the SM). Similarly, if U is a group of matrices with unit determinant, then the traces of \(T^A\) vanish: \(\mathrm{tr}(T^A) = 0\). The generators \(T^A\) are normalized in such a way that for the lowest dimensional non-trivial representation of the group Γ (we use \(t^A\) to denote the generators in this particular representation) we have

$$\displaystyle \begin{aligned} \mathrm{tr}(t^At^B) = \frac{1}{2} \delta^{AB}~. {} \end{aligned} $$
(2.10)

The generators satisfy the commutation relations

$$\displaystyle \begin{aligned}{}[T^A,T^B] = iC_{ABC}T^C~. {} \end{aligned} $$
(2.11)
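As a concrete illustration of Eqs. (2.10) and (2.11), the following minimal sketch checks both relations numerically. It assumes the group SU(2) with \(t^A = \sigma^A/2\) (Pauli matrices) and \(C_{ABC} = \epsilon_{ABC}\); this choice of group and all function names are illustrative assumptions, not taken from the text.

```python
from itertools import product

# Pauli matrices; t^A = sigma^A / 2 is the assumed example representation.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def levi_civita(a, b, c):
    """epsilon_{abc} for indices 0, 1, 2 (the SU(2) structure constants)."""
    return (a - b) * (b - c) * (c - a) / 2


def normalization_ok(tol=1e-12):
    """Check tr(t^A t^B) = delta^{AB} / 2, as in Eq. (2.10)."""
    return all(
        abs(trace(matmul(T[a], T[b])) - (0.5 if a == b else 0.0)) < tol
        for a, b in product(range(3), repeat=2)
    )


def commutators_ok(tol=1e-12):
    """Check [t^A, t^B] = i epsilon_{ABC} t^C, as in Eq. (2.11)."""
    for a, b in product(range(3), repeat=2):
        ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
        for i, j in product(range(2), repeat=2):
            rhs = sum(1j * levi_civita(a, b, c) * T[c][i][j] for c in range(3))
            if abs(ab[i][j] - ba[i][j] - rhs) > tol:
                return False
    return True


print(normalization_ok(), commutators_ok())
```

The factor 1/2 in \(t^A = \sigma^A/2\) is what makes the normalization of Eq. (2.10) hold; with unnormalized Pauli matrices the trace would be \(2\delta^{AB}\) instead.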

For A, B, C, … up or down indices make no difference: \(T_A = T^A\) etc. The structure constants \(C_{ABC}\) are completely antisymmetric in their indices, as can be easily seen. In the following, for each quantity \(f^A\) we define

$$\displaystyle \begin{aligned} \mathbf{f} = \sum_A~T^Af^A~. {} \end{aligned} $$
(2.12)

For example, we can rewrite Eq. (2.9) in the form:

$$\displaystyle \begin{aligned} U(\theta^A) = \exp{[ig\boldsymbol{\theta}]} \sim~1 + ig\boldsymbol{\theta}~+\dots, {} \end{aligned} $$
(2.13)

If we now make the parameters \(\theta^A\) depend on the space–time coordinates, \(\theta^A = \theta^A(x_\mu)\), then \(\mathcal {L}[\phi ,\partial _{\mu }\phi ]\) is in general no longer invariant under the gauge transformations \(U[\theta^A(x_\mu)]\), because of the derivative terms: indeed \(\partial_\mu\phi' = \partial_\mu(U\phi) \neq U\partial_\mu\phi\). Gauge invariance is recovered if the ordinary derivative is replaced by the covariant derivative:

$$\displaystyle \begin{aligned} D_{\mu} = \partial_{\mu} + ig{\mathbf{V}}_{\mu}~, {} \end{aligned} $$
(2.14)

where \(V^A_{\mu }\) are a set of D gauge vector fields (in one-to-one correspondence with the group generators) with the transformation law

$$\displaystyle \begin{aligned} {\mathbf{V}}^{\prime}_{\mu} = U{\mathbf{V}}_{\mu}U^{-1} - (1/ig)(\partial_{\mu}U)U^{-1}~. {} \end{aligned} $$
(2.15)

For constant \(\theta^A\), V reduces to a tensor of the adjoint (or regular) representation of the group:

$$\displaystyle \begin{aligned} {\mathbf{V}}^{\prime}_{\mu} = U{\mathbf{V}}_{\mu}U^{-1} \simeq {\mathbf{V}}_{\mu} + ig[\boldsymbol{\theta}, {\mathbf{V}}_{\mu}]~\dots, {} \end{aligned} $$
(2.16)

which implies that

$$\displaystyle \begin{aligned} V^{\prime C}_{\mu} = V^C_{\mu} - gC_{ABC}\theta^AV^B_{\mu}~\dots, {} \end{aligned} $$
(2.17)

where repeated indices are summed up.

As a consequence of Eqs. (2.14) and (2.15), \(D_\mu\phi\) has the same transformation properties as ϕ:

$$\displaystyle \begin{aligned} (D_{\mu}\phi)' = U(D_{\mu}\phi)~. {} \end{aligned} $$
(2.18)

In fact

$$\displaystyle \begin{aligned} (D_{\mu}\phi)' = (\partial_{\mu} + ig{\mathbf{V}}^{\prime}_{\mu})\phi' = (\partial_{\mu}U) \phi+U\partial_{\mu}\phi+igU{\mathbf{V}}_{\mu} \phi-(\partial_{\mu}U)\phi = U(D_{\mu}\phi)~. {} \end{aligned} $$
(2.19)

Thus \(\mathcal {L}[\phi ,D_{\mu }\phi ]\) is indeed invariant under gauge transformations. But, at this stage, the gauge fields \(V_{\mu }^A\) appear as external fields that do not propagate. In order to construct a gauge-invariant kinetic energy term for the gauge fields \(V_{\mu }^A\), we consider

$$\displaystyle \begin{aligned}{}[D_{\mu},D_{\nu}] \phi = ig\{\partial_{\mu}{\mathbf{V}}_{\nu} - \partial_{\nu}{\mathbf{V}}_{\mu} + ig[\mathbf{ V}_{\mu},{\mathbf{V}}_{\nu}]\}\phi \equiv ig {\mathbf{F}}_{\mu\nu}\phi~, {} \end{aligned} $$
(2.20)

which is equivalent to

$$\displaystyle \begin{aligned} F^A_{\mu\nu} = \partial_{\mu}V^A_{\nu} - \partial_{\nu}V^A_{\mu} - gC_{ABC}V^B_{\mu}V^C_{\nu}~. {} \end{aligned} $$
(2.21)
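The step from the matrix form (2.20) to the component form (2.21) can be cross-checked numerically. The sketch below is an illustration under stated assumptions: it takes SU(2) with \(t^A = \sigma^A/2\) and \(C_{ABC} = \epsilon_{ABC}\), treats the components \(V^A_\mu\), \(V^A_\nu\) at a fixed point as plain numbers, and compares \(ig[{\mathbf{V}}_\mu, {\mathbf{V}}_\nu]\) with \(\sum_A t^A(-gC_{ABC}V^B_\mu V^C_\nu)\). All names and numerical inputs are hypothetical.

```python
# Assumed example: SU(2) generators t^A = sigma^A / 2.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]


def levi_civita(a, b, c):
    """epsilon_{abc} for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bold(components):
    """Eq. (2.12): the matrix sum_A t^A f^A for three components f^A."""
    return [[sum(T[a][i][j] * components[a] for a in range(3))
             for j in range(2)] for i in range(2)]


def commutator_term_matrix(g, v_mu, v_nu):
    """The term ig [V_mu, V_nu] of Eq. (2.20), as a 2x2 matrix."""
    m, n = bold(v_mu), bold(v_nu)
    mn, nm = matmul(m, n), matmul(n, m)
    return [[1j * g * (mn[i][j] - nm[i][j]) for j in range(2)]
            for i in range(2)]


def commutator_term_components(g, v_mu, v_nu):
    """The term -g C_ABC V^B_mu V^C_nu of Eq. (2.21), for A = 0, 1, 2."""
    return [
        -g * sum(levi_civita(a, b, c) * v_mu[b] * v_nu[c]
                 for b in range(3) for c in range(3))
        for a in range(3)
    ]


def max_mismatch(g, v_mu, v_nu):
    """Largest entry-wise difference between the two forms."""
    lhs = commutator_term_matrix(g, v_mu, v_nu)
    rhs = bold(commutator_term_components(g, v_mu, v_nu))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


print(max_mismatch(0.7, [0.3, -1.1, 0.5], [0.9, 0.2, -0.4]))
```

The derivative terms of Eq. (2.21) are linear in the fields and carry over component by component, so only the commutator term needs this check.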

From Eqs. (2.8), (2.18) and (2.20) it follows that the transformation properties of \(F^A_{\mu \nu }\) are those of a tensor of the adjoint representation

$$\displaystyle \begin{aligned} {\mathbf{F}}^{\prime}_{\mu\nu} = U{\mathbf{F}}_{\mu\nu}U^{-1}~. {} \end{aligned} $$
(2.22)

The complete Yang–Mills lagrangian, which is invariant under gauge transformations, can be written in the form

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{YM}} = - \frac{1}{2}Tr{\mathbf{F}}_{\mu\nu}{\mathbf{F}}^{\mu\nu}+ \mathcal{L} [\phi,D_{\mu}\phi]~ = - \frac{1}{4} \sum_A F^A_{\mu\nu}F^{A\mu\nu} + \mathcal{L} [\phi,D_{\mu}\phi]~. {} \end{aligned} $$
(2.23)

Note that the kinetic energy term is an operator of dimension 4. Thus if \(\mathcal {L}\) is renormalizable, \(\mathcal {L}_{\mathrm {YM}}\) is also renormalizable. In fact it is the most general gauge invariant and renormalizable lagrangian density. If we give up renormalizability then more gauge invariant higher dimension terms could be added. It is already clear at this stage that no mass term for gauge bosons of the form \(m^2V_{\mu}V^{\mu}\) is allowed by gauge invariance.

For an abelian theory, as for example QED, the gauge transformation reduces to U[θ(x)] = exp[ieQθ(x)], where Q is the charge generator. The associated gauge field (the photon), according to Eq. (2.15), transforms as

$$\displaystyle \begin{aligned} V^{\prime}_{\mu} = V_{\mu} - \partial_{\mu}\theta(x)~. {} \end{aligned} $$
(2.24)

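and the familiar gauge transformation by addition of a 4-gradient of a scalar function is recovered. The QED lagrangian density is given by:

$$\displaystyle \begin{aligned} \mathcal{L}~=~-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}~+~\sum_{\psi}\bar\psi(i{\not {\!D}}-m_{\psi})\psi~. {} \end{aligned} $$
(2.25)

Here \({\not {\!D}} = D_{\mu}\gamma^{\mu}\), where \(\gamma^{\mu}\) are the Dirac matrices and the covariant derivative is given in terms of the photon field \(A_{\mu}\) and the charge operator Q by: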

$$\displaystyle \begin{aligned} D_{\mu}=\partial_{\mu}+ieA_{\mu}Q{} \end{aligned} $$
(2.26)

and

$$\displaystyle \begin{aligned} F_{\mu\nu}~=~\partial_{\mu} A_{\nu}-\partial_{\nu} A_{\mu}{} \end{aligned} $$
(2.27)

Note that in QED one usually takes the \(e^-\) to be the particle, so that Q = −1 and the covariant derivative is \(D_{\mu} = \partial_{\mu} - ieA_{\mu}\) when acting on the electron field. In the abelian case, the \(F_{\mu\nu}\) tensor is linear in the gauge field \(V_{\mu}\) so that in the absence of matter fields the theory is free. On the other hand, in the non abelian case the \(F^A_{\mu \nu }\) tensor contains both linear and quadratic terms in \(V^A_{\mu }\), so that the theory is non-trivial even in the absence of matter fields.

2.5 Application to QCD

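According to the formalism of the previous section, the statement that QCD is a renormalizable gauge theory based on the group SU(3) with colour triplet quark matter fields fixes the QCD lagrangian density to be

$$\displaystyle \begin{aligned} \mathcal{L}~=~-\frac{1}{4}\sum_{A=1}^{8}F^{A\mu\nu}F^A_{\mu\nu}~+~\sum_{j=1}^{n_f}\bar q_j(i{\not {\!D}}-m_j)q_j~. {} \end{aligned} $$
(2.28)

Here \(q_j\) are the quark fields (of \(n_f\) different flavours) with mass \(m_j\) and \(D_{\mu}\) is the covariant derivative: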

$$\displaystyle \begin{aligned} D_{\mu}=\partial_{\mu}+ie_s\mathbf{g_{\mu}};{} \end{aligned} $$
(2.29)

\(e_s\) is the gauge coupling and later we will mostly use, in analogy with QED,

$$\displaystyle \begin{aligned} \alpha_s=\frac{e_s^2}{4\pi}.{} \end{aligned} $$
(2.30)

Also, \(\mathbf {g_{\mu }}= \sum _A~t^Ag_{\mu }^A~\) where \(g_{\mu }^A\), A = 1, …, 8, are the gluon fields and \(t^A\) are the SU(3) group generators in the triplet representation of quarks (i.e. \(t^A\) are 3 × 3 matrices acting on q); the generators obey the commutation relations \([t^A, t^B] = iC_{ABC}t^C\), where \(C_{ABC}\) are the completely antisymmetric structure constants of SU(3) (the normalisation of \(C_{ABC}\) and of \(e_s\) is specified by \(\mathrm{Tr}[t^At^B] = \delta^{AB}/2\));

$$\displaystyle \begin{aligned} F^A_{\mu\nu}~=~\partial_{\mu} g^A_{\nu}-\partial_{\nu} g^A_{\mu}~-~e_sC_{ABC}g^B_{\mu}g^C_{\nu}{} \end{aligned} $$
(2.31)

Chapter 4 is devoted to a detailed description of QCD as the theory of strong interactions. The physical vertices in QCD include the gluon-quark-antiquark vertex, analogous to the QED photon-fermion-antifermion coupling, but also the 3-gluon and 4-gluon vertices, of order \(e_s\) and \(e_s^2\) respectively, which have no analogue in an abelian theory like QED. In QED the photon is coupled to all electrically charged particles but is itself neutral. In QCD the gluons are coloured, hence self-coupled. This is reflected in the fact that in QED \(F_{\mu\nu}\) is linear in the gauge field, so that the term \(F_{\mu \nu }^2\) in the lagrangian is a pure kinetic term, while in QCD \(F^A_{\mu \nu }\) also contains a term quadratic in the gauge field, so that in \(F^{A2}_{\mu \nu }\) we find cubic and quartic vertices beyond the kinetic term. It is also instructive to consider the case of scalar QED:

$$\displaystyle \begin{aligned} \mathcal{L}~=~-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}~+~(D_{\mu}\phi)^\dagger (D^{\mu}\phi)-m^2(\phi^\dagger \phi){} \end{aligned} $$
(2.32)

For Q = 1 we have:

$$\displaystyle \begin{aligned} (D_{\mu}\phi)^\dagger (D^{\mu}\phi)~=~(\partial_{\mu}\phi)^\dagger (\partial^{\mu}\phi)~+~ieA_{\mu}[(\partial^{\mu}\phi)^\dagger \phi ~-~\phi^\dagger (\partial^{\mu}\phi)]~+~e^2A_{\mu}A^{\mu}\phi^\dagger \phi {} \end{aligned} $$
(2.33)

We see that for a charged boson in QED, given that the kinetic term for bosons is quadratic in the derivative, there is a two-gauge vertex of order \(e^2\). Thus in QCD the 3-gluon vertex is there because the gluon is coloured and the 4-gluon vertex because the gluon is a boson.

2.6 Quantization of a Gauge Theory

The lagrangian density \(\mathcal {L}_{YM}\) in Eq. (2.23) fully describes the theory at the classical level. The formulation of the theory at the quantum level requires that a procedure of quantization, of regularization and, finally, of renormalization is also specified. To start with, the formulation of Feynman rules is not straightforward. A first problem, common to all gauge theories, including the abelian case of QED, can be realized by observing that the free equation of motion for \(V_{\mu }^A\), as obtained from Eqs. (2.21) and (2.23), is given by

$$\displaystyle \begin{aligned}{}[\partial^2 g_{\mu\nu}-\partial_\mu \partial_\nu]V^{A\nu}=0 {} \end{aligned} $$
(2.34)

Normally the propagator of the gauge field should be determined by the inverse of the operator \([\partial^2 g_{\mu\nu}-\partial_\mu \partial_\nu]\) which, however, has no inverse, being a projector over the transverse gauge vector states. This difficulty is removed by fixing a particular gauge. If one chooses a covariant gauge condition \(\partial ^\mu V_{\mu }^A=0\) then a gauge fixing term of the form

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{GF}=-\frac{1}{2\lambda}\sum_A |\partial^\mu V_{\mu}^A|{}^2 {} \end{aligned} $$
(2.35)

has to be added to the lagrangian (1∕λ acts as a Lagrange multiplier). The free equations of motion are now modified as follows:

$$\displaystyle \begin{aligned}{}[\partial^2 g_{\mu\nu}-(1-1/\lambda)\partial_\mu \partial_\nu]V^{A\nu}=0. {} \end{aligned} $$
(2.36)

This operator now has an inverse whose Fourier transform is given by:

$$\displaystyle \begin{aligned} D_{\mu\nu}^{AB}(q)= \frac{i}{q^2+i\epsilon}~[- g_{\mu\nu}+(1-\lambda)\frac{q_\mu q_\nu}{q^2+i\epsilon}]~\delta^{AB} {} \end{aligned} $$
(2.37)

which is the propagator in this class of gauges. The parameter λ can take any value and it disappears from the final expression of any gauge invariant, physical quantity. Commonly used particular cases are λ = 1 (Feynman gauge) and λ = 0 (Landau gauge).
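A quick numerical check makes the role of the gauge fixing term concrete. The sketch below assumes the metric signature (+,−,−,−), replaces \(\partial_\mu \rightarrow -iq_\mu\) in the operator of Eq. (2.36), drops the \(i\epsilon\) and the colour factor \(\delta^{AB}\), and verifies that the product of the operator with the propagator of Eq. (2.37) equals \(i g_{\mu\nu}\). It also verifies that without gauge fixing the operator annihilates \(q^\nu\), so that no inverse exists. The function names and the test momentum are illustrative assumptions.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}; signature assumed


def lower(q):
    """Lower the index of a four-vector with the diagonal metric."""
    return [METRIC[m] * q[m] for m in range(4)]


def square(q):
    return sum(METRIC[m] * q[m] * q[m] for m in range(4))


def delta_metric(m, n):
    return METRIC[m] if m == n else 0.0


def kinetic_operator(q, lam):
    """K_{mu nu} = -q^2 g_{mu nu} + (1 - 1/lam) q_mu q_nu, from Eq. (2.36)."""
    q_low, q2 = lower(q), square(q)
    return [[-q2 * delta_metric(m, n) + (1.0 - 1.0 / lam) * q_low[m] * q_low[n]
             for n in range(4)] for m in range(4)]


def propagator(q, lam):
    """D_{mu nu} of Eq. (2.37), without delta^{AB} and i*epsilon."""
    q_low, q2 = lower(q), square(q)
    return [[(1j / q2) * (-delta_metric(m, n)
                          + (1.0 - lam) * q_low[m] * q_low[n] / q2)
             for n in range(4)] for m in range(4)]


def inverse_mismatch(q, lam):
    """Largest deviation of K_{mu rho} g^{rho sigma} D_{sigma nu} from i g_{mu nu}."""
    k, d = kinetic_operator(q, lam), propagator(q, lam)
    return max(
        abs(sum(k[m][r] * METRIC[r] * d[r][n] for r in range(4))
            - 1j * delta_metric(m, n))
        for m in range(4) for n in range(4)
    )


def zero_mode_residual(q):
    """Without gauge fixing (1/lam -> 0), K_{mu nu} q^nu should vanish."""
    k = kinetic_operator(q, float("inf"))
    return max(abs(sum(k[m][n] * q[n] for n in range(4))) for m in range(4))


Q_TEST = [1.3, 0.4, -0.7, 0.2]  # arbitrary off-shell momentum
print(inverse_mismatch(Q_TEST, 1.0), zero_mode_residual(Q_TEST))
```

The Landau gauge λ = 0 cannot be fed to `kinetic_operator` directly, since 1∕λ diverges there; it should be read as a limit of the propagator.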

While in an abelian theory the gauge fixing term is all that is needed for a correct quantization, in a non abelian theory the formulation of complete Feynman rules involves a further subtlety. This is formally taken into account by introducing a set of D fictitious ghost fields that must be included as internal lines in closed loops (Faddeev-Popov ghosts [12]). Given that gauge fields connected by a gauge transformation describe the same physics, clearly there are fewer physical degrees of freedom than gauge field components. Ghosts appear, in the form of a transformation Jacobian in the functional integral, in the process of elimination of the redundant variables associated with fields on the same gauge orbit [13]. The correct ghost contributions can be obtained from an additional term in the lagrangian density. For each choice of the gauge fixing term the ghost lagrangian is obtained by considering the effect of an infinitesimal gauge transformation \(V_{\mu }^{'C}=V_{\mu }^C-gC_{ABC}\theta ^AV_{\mu }^B-\partial _\mu \theta ^C\) on the gauge fixing condition. For \(\partial ^\mu V_{\mu }^C=0\) one obtains:

$$\displaystyle \begin{aligned} \partial^\mu V_{\mu}^{'C} =\partial^\mu V_{\mu}^C-gC_{ABC}\partial^\mu (\theta^A V_{\mu}^B)-\partial^2 \theta^C~=~ -[\partial^2 \delta_{AC} +g C_{ABC} V_\mu^B\partial^\mu]\theta^A~{} \end{aligned} $$
(2.38)

where the gauge condition \(\partial ^\mu V_{\mu }^C=0\) has been taken into account in the last step. The ghost lagrangian is then given by:

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{Ghost}= \bar \eta^C[\partial^2 \delta_{AC}+gC_{ABC}V_\mu^B\partial^\mu ]\eta^A~{} \end{aligned} $$
(2.39)

where \(\eta^A\) is the ghost field (one for each index A) which has to be treated as a scalar field except that a factor (−1) for each closed loop has to be included as for fermion fields.

Starting from non covariant gauges one can construct ghost-free gauges. An example, also important in other respects, is provided by the set of “axial” gauges: \(n^\mu V_\mu ^A=0\) where \(n_\mu\) is a fixed reference 4-vector (actually for \(n_\mu\) spacelike one has an axial gauge proper, for \(n^2 = 0\) one speaks of a light-like gauge and for \(n_\mu\) timelike one has a Coulomb or temporal gauge). The gauge fixing term is of the form:

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{GF}=-\frac{1}{2\lambda}\sum_A |n^\mu V_{\mu}^A|{}^2 {} \end{aligned} $$
(2.40)

With a procedure that can be found in QED textbooks [14] the corresponding propagator, in Fourier space, is found to be:

$$\displaystyle \begin{aligned} D_{\mu\nu}^{AB}(q)= \frac{i}{q^2+i\epsilon}~[- g_{\mu\nu}+\frac{n_\mu q_\nu+n_\nu q_\mu}{(nq)}-\frac{n^2 q_\mu q_\nu}{(nq)^2}]~\delta^{AB} {} \end{aligned} $$
(2.41)

In this case there are no ghost interactions because \(n^\mu V_\mu ^{'A}\), obtained by a gauge transformation from \(n^\mu V_\mu ^A\), contains no gauge fields, once the gauge condition \(n^\mu V_\mu ^A=0\) has been taken into account. Thus the ghosts are decoupled and can be ignored.
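A simple consistency check on the axial-gauge propagator is that it is orthogonal to the reference vector, \(n^\mu D_{\mu\nu}^{AB}(q) = 0\), as the gauge condition requires. The sketch below verifies this numerically for the tensor \(-g_{\mu\nu} + (n_\mu q_\nu + n_\nu q_\mu)/(nq) - n^2 q_\mu q_\nu/(nq)^2\), assuming the signature (+,−,−,−); the overall factor \(i/(q^2+i\epsilon)\) and \(\delta^{AB}\) are dropped, and the names and sample vectors are hypothetical.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}; signature assumed


def lower(p):
    return [METRIC[m] * p[m] for m in range(4)]


def dot(a, b):
    """Minkowski product a^mu b_mu."""
    return sum(METRIC[m] * a[m] * b[m] for m in range(4))


def axial_tensor(n, q):
    """Bracketed tensor of the axial-gauge propagator, with lower indices."""
    n_low, q_low = lower(n), lower(q)
    nq, n2 = dot(n, q), dot(n, n)
    return [[-(METRIC[m] if m == k else 0.0)
             + (n_low[m] * q_low[k] + n_low[k] * q_low[m]) / nq
             - n2 * q_low[m] * q_low[k] / nq ** 2
             for k in range(4)] for m in range(4)]


def gauge_condition_residual(n, q):
    """Largest component of n^mu N_{mu nu}; it should vanish."""
    tensor = axial_tensor(n, q)
    return max(abs(sum(n[m] * tensor[m][k] for m in range(4)))
               for k in range(4))


Q_TEST = [1.3, 0.4, -0.7, 0.2]
for n_vec in ([0.0, 0.0, 0.0, 1.0],   # spacelike: axial gauge proper
              [1.0, 0.0, 0.0, 1.0],   # n^2 = 0: light-like gauge
              [1.0, 0.0, 0.0, 0.0]):  # timelike: temporal gauge
    print(gauge_condition_residual(n_vec, Q_TEST))
```

The check requires \((nq) \neq 0\); the treatment of the singularity at \((nq) = 0\) is a separate issue that this sketch does not address.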

The introduction of a suitable regularization method that preserves gauge invariance is essential for the definition and the calculation of loop diagrams and for the renormalization programme of the theory. The method that is now commonly adopted is dimensional regularization [15], which consists in the formulation of the theory in n dimensions. All loop integrals have an analytic expression that is actually valid also for non integer values of n. Writing the results for n = 4 − 𝜖, the loops are ultraviolet finite for 𝜖 > 0 and the divergences reappear in the form of poles at 𝜖 = 0.

2.7 Spontaneous Symmetry Breaking in Gauge Theories

The gauge symmetry of the SM was difficult to discover because it is well hidden in nature. The only observed gauge boson that is massless is the photon. The gluons are presumed massless but cannot be directly observed because of confinement, and the W and Z weak bosons carry a heavy mass. Indeed a major difficulty in unifying the weak and electromagnetic interactions was the fact that e.m. interactions have infinite range (\(m_\gamma = 0\)), whilst the weak forces have a very short range, owing to \(m_{W,Z} \neq 0\).

The solution of this problem is in the concept of spontaneous symmetry breaking, which was borrowed from statistical mechanics.

Consider a ferromagnet at zero magnetic field in the Landau–Ginzburg approximation. The free energy in terms of the temperature T and the magnetization M can be written as

$$\displaystyle \begin{aligned} F(\mathbf{M}, T) \simeq F_0(T) + 1/2~\mu^2(T){\mathbf{M}}^2 + 1/4~\lambda(T)({\mathbf{M}}^2)^2 + \ldots~. {} \end{aligned} $$
(2.42)

This is an expansion which is valid at small magnetization. The neglect of terms of higher order in \({\mathbf{M}}^2\) is the analogue in this context of the renormalizability criterion. Also, λ(T) > 0 is assumed for stability; F is invariant under rotations, i.e. all directions of M in space are equivalent. The minimum condition for F reads

$$\displaystyle \begin{aligned} \partial F/\partial M_i = 0, \quad [\mu^2(T) + \lambda(T){\mathbf{M}}^2]\mathbf{M} = 0~. {} \end{aligned} $$
(2.43)

There are two cases, shown in Fig. 2.1. If \(\mu^2 \geq 0\), then the only solution is M = 0, there is no magnetization, and the rotation symmetry is respected. In this case the lowest energy state (in a quantum theory the vacuum) is unique and invariant under rotations. If \(\mu^2 < 0\), then another solution appears, which is

$$\displaystyle \begin{aligned} |{\mathbf{M}}_0|{}^2 = -\mu^2/\lambda~. {} \end{aligned} $$
(2.44)

In this case there is a continuous orbit of lowest energy states, all with the same value of |M| but different orientations. A particular direction chosen by the vector \({\mathbf{M}}_0\) leads to a breaking of the rotation symmetry.
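The two cases can be illustrated with a few lines of code. The sketch below treats the free energy of Eq. (2.42), truncated at quartic order, as a function of |M| only (which is legitimate because F is rotationally invariant), returns the equilibrium value of |M| from Eq. (2.44), and cross-checks it with a brute-force scan. The numerical values of \(\mu^2\) and λ, and all function names, are illustrative assumptions.

```python
import math


def free_energy(m, mu2, lam):
    """F - F_0 of Eq. (2.42) as a function of m = |M|, truncated at M^4."""
    return 0.5 * mu2 * m ** 2 + 0.25 * lam * m ** 4


def equilibrium_magnetization(mu2, lam):
    """|M_0| at the minimum: 0 for mu2 >= 0, sqrt(-mu2/lam) for mu2 < 0."""
    if lam <= 0:
        raise ValueError("lam > 0 is required for stability")
    return 0.0 if mu2 >= 0 else math.sqrt(-mu2 / lam)


def scan_minimum(mu2, lam, m_max=5.0, steps=5000):
    """Brute-force location of the minimum on a grid in |M|."""
    grid = [m_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda m: free_energy(m, mu2, lam))


# Symmetric phase (mu2 > 0) and broken phase (mu2 < 0).
print(equilibrium_magnetization(1.0, 2.0), scan_minimum(1.0, 2.0))
print(equilibrium_magnetization(-2.0, 0.5), scan_minimum(-2.0, 0.5))
```

Only the modulus |M| is fixed by the minimum condition; the direction of \({\mathbf{M}}_0\) is not, which is the degeneracy responsible for the symmetry breaking.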

Fig. 2.1

The potential \(V = 1/2~\mu^2{\mathbf{M}}^2 + 1/4~\lambda({\mathbf{M}}^2)^2\) for positive (a) or negative \(\mu^2\) (b) (for simplicity, M is a 2-dimensional vector). The small sphere indicates a possible choice for the direction of M

For a piece of iron we can imagine bringing it to high temperature and letting it melt in an external magnetic field B. The presence of B is an explicit breaking of the rotational symmetry and it induces a non zero magnetization M along its direction. Now we lower the temperature while keeping B fixed. The critical temperature \(T_{\mathrm{crit}}\) (Curie temperature) is where \(\mu^2(T)\) changes sign: \(\mu^2(T_{\mathrm{crit}}) = 0\). For pure iron \(T_{\mathrm{crit}}\) is below the melting temperature. So at \(T = T_{\mathrm{crit}}\) iron is a solid. Below \(T_{\mathrm{crit}}\) we remove the magnetic field. In a solid the mobility of the magnetic domains is limited and a non vanishing \({\mathbf{M}}_0\) remains. The form of the free energy becomes rotationally invariant again, as in Eq. (2.42). But now the system allows a minimum energy state with non vanishing M in the direction where B was. As a consequence the symmetry is broken by this choice of one particular vacuum state out of a continuum of them.

We now prove the Goldstone theorem [16]. It states that when spontaneous symmetry breaking takes place, there is always a zero-mass mode in the spectrum. In a classical context this can be proven as follows. Consider a lagrangian

$$\displaystyle \begin{aligned} \mathcal{L} =\frac{1}{2} |\partial_{\mu}\phi|{}^2 - V(\phi). {} \end{aligned} $$
(2.45)

The potential V (ϕ) can be kept generic at this stage but, in the following, we will be mostly interested in a renormalizable potential of the form (with no more than quartic terms):

$$\displaystyle \begin{aligned} V(\phi)=-\frac{1}{2} \mu^2~ \phi^2+\frac{1}{4}\lambda ~\phi^4. {} \end{aligned} $$
(2.46)

Here by ϕ we mean a column vector with real components \(\phi_i\) (i = 1, 2, …, N) (complex fields can always be decomposed into a pair of real fields), so that, for example, \(\phi ^2=\sum _i\phi _i^2\). This particular potential is symmetric under an N × N orthogonal matrix rotation ϕ′ = Oϕ, where O is an SO(N) transformation. For simplicity, we have omitted odd powers of ϕ, which means that we assumed an extra discrete symmetry under ϕ ↔−ϕ. Note that, for positive \(\mu^2\), the mass term in the potential has the “wrong” sign: according to the previous discussion this is the condition for the existence of a non unique lowest energy state. More generally, we only assume here that the potential is symmetric under the infinitesimal transformations

$$\displaystyle \begin{aligned} \phi \rightarrow \phi' = \phi + \delta \phi, \quad \delta \phi_i = i \delta\theta^A t_{ij}^A\phi_j~. {} \end{aligned} $$
(2.47)

where \(\delta\theta^A\) are infinitesimal parameters and \(t_{ij}^A\) are the matrices that represent the symmetry group on the representation of the fields \(\phi_i\) (a sum over A is understood). The minimum condition on V that identifies the equilibrium position (or the vacuum state in quantum field theory language) is

$$\displaystyle \begin{aligned} (\partial V/\partial \phi_i)(\phi_i = \phi^0_i) = 0~. {} \end{aligned} $$
(2.48)

The symmetry of V  implies that

$$\displaystyle \begin{aligned} \delta V = (\partial V/\partial \phi_i)\delta \phi_i = i \delta \theta^A(\partial V/\partial \phi_i)t_{ij}^A\phi_j = 0~. {} \end{aligned} $$
(2.49)

By taking a further derivative of the previous equation with respect to \(\phi_k\) and evaluating it at the minimum \(\phi _i = \phi ^0_i\), we obtain that, for each A:

$$\displaystyle \begin{aligned} \frac{\partial^2V}{\partial \phi_k\partial \phi_i} (\phi_i = \phi^0_i)t_{ij}^A\phi^0_j + \frac{\partial V}{\partial \phi_i} (\phi_i = \phi^0_i)t_{ik}^A = 0~. {} \end{aligned} $$
(2.50)

The second term vanishes owing to the minimum condition, Eq. (2.48). We then find

$$\displaystyle \begin{aligned} \frac{\partial^2V}{\partial \phi_k\partial \phi_i} (\phi_i = \phi^0_i)t_{ij}^A\phi^0_j = 0~. {} \end{aligned} $$
(2.51)

The second derivatives \(M^2_{ki} = (\partial ^2V/\partial \phi _k \partial \phi _i)(\phi _i = \phi ^0_i)\) define the squared mass matrix. Thus the above equation in matrix notation can be written as

$$\displaystyle \begin{aligned} M^2 t^A \phi^0 = 0~. {} \end{aligned} $$
(2.52)

In the case of no spontaneous symmetry breaking the ground state is unique, all symmetry transformations leave it invariant, so that, for all A, \(t^A\phi^0 = 0\). On the contrary, if, for some values of A, the vectors \((t^A\phi^0)\) are non-vanishing, i.e. there is some generator that shifts the ground state into some other state with the same energy (hence the vacuum is not unique), then each \(t^A\phi^0 \neq 0\) is an eigenstate of the squared mass matrix with zero eigenvalue. Therefore, a massless mode is associated with each broken generator. The charges of the massless modes (their quantum numbers in quantum language) differ from those of the vacuum (usually taken as all zero) by the values of the \(t^A\) charges: one says that the massless modes have the same quantum numbers as the broken generators, i.e. those that do not annihilate the vacuum.
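The content of Eq. (2.52) can be made explicit in a small example. The sketch below assumes N = 3 with the potential of Eq. (2.46), the vacuum chosen along the third axis, \(\phi^0 = (0, 0, v)\) with \(v^2 = \mu^2/\lambda\), and the real rotation matrices \((L^A)_{ij} = \epsilon_{Aij}\), which are proportional to the \(t^A\) of Eq. (2.47) (\(L^A = i t^A\)), so that \(M^2 t^A\phi^0 = 0\) is equivalent to \(M^2 L^A\phi^0 = 0\). These choices and all names are illustrative, not taken from the text.

```python
import math


def levi_civita(a, b, c):
    """epsilon_{abc} for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def vacuum(mu2, lam):
    """Assumed vacuum direction: phi^0 = (0, 0, v), with v^2 = mu2 / lam."""
    return [0.0, 0.0, math.sqrt(mu2 / lam)]


def mass_matrix(phi, mu2, lam):
    """Second derivatives of V = -mu2 phi^2 / 2 + lam phi^4 / 4 at phi."""
    phi2 = sum(x * x for x in phi)
    return [[(-mu2 + lam * phi2) * (1.0 if k == i else 0.0)
             + 2.0 * lam * phi[k] * phi[i]
             for i in range(3)] for k in range(3)]


def rotate(a, phi):
    """Action of the real generator (L^a)_{ij} = epsilon_{aij} on phi."""
    return [sum(levi_civita(a, i, j) * phi[j] for j in range(3))
            for i in range(3)]


def goldstone_check(mu2, lam, tol=1e-12):
    """Return (broken generators, largest component of M^2 L^A phi^0)."""
    phi0 = vacuum(mu2, lam)
    m2 = mass_matrix(phi0, mu2, lam)
    broken, residual = [], 0.0
    for a in range(3):
        shifted = rotate(a, phi0)
        if any(abs(x) > tol for x in shifted):
            broken.append(a)  # this generator does not annihilate the vacuum
        image = [sum(m2[k][i] * shifted[i] for i in range(3)) for k in range(3)]
        residual = max(residual, max(abs(x) for x in image))
    return broken, residual


print(goldstone_check(1.0, 0.5))
print(mass_matrix(vacuum(1.0, 0.5), 1.0, 0.5)[2][2])  # radial mode, 2 * mu2
```

In this example two of the three generators are broken and the rotation about the vacuum direction is unbroken, so the spectrum contains two massless modes and one massive radial mode with squared mass \(2\mu^2\).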

The previous proof of the Goldstone theorem has been given in the classical case. In the quantum case the classical potential corresponds to the tree-level approximation of the quantum potential. Higher order diagrams with loops introduce quantum corrections. The functional integral formulation of quantum field theory [13, 17] is the most appropriate framework to define and compute, in a loop expansion, the quantum potential which specifies, exactly as described above, the vacuum properties of the quantum theory. If the theory is weakly coupled, e.g. if λ is small, the tree level expression for the potential is not too far from the truth, and the classical situation is a good approximation. We shall see that this is the situation that occurs in the electroweak theory if the Higgs is moderately light (see Chap. 3, Sect. 3.13.1).

We note that for a quantum system with a finite number of degrees of freedom, for example one described by the Schrödinger equation, there are no degenerate vacua: the vacuum is always unique. For example, in the one dimensional Schrödinger problem with a potential:

$$\displaystyle \begin{aligned} V(x)=- \mu^2/2~ x^2+\lambda ~x^4/4~, {} \end{aligned} $$
(2.53)

there are two degenerate minima at \(x=\pm x_0=\pm \sqrt {\mu ^2/\lambda }\) which we denote by |+〉 and |−〉. But the potential is not diagonal in this basis: the off diagonal matrix elements:

$$\displaystyle \begin{aligned} \langle+|V|-\rangle=\langle-|V|+\rangle \sim \exp{(-khd)}= \delta {} \end{aligned} $$
(2.54)

are different from zero due to the non vanishing amplitude for a tunnel effect between the two vacua, proportional to the exponential of the product of the distance d between the vacua and the height h of the barrier with k a constant (see Fig. 2.2). After diagonalization the eigenvectors are \((|+\rangle +|-\rangle )/\sqrt {2}\) and \((|+\rangle -|-\rangle )/\sqrt {2}\), with different energies (the difference being proportional to δ). Suppose now that you have a sum of n equal terms in the potential, \(V = \sum_i V(x_i)\). Then the transition amplitude would be proportional to \(\delta^n\) and, since δ < 1, would vanish for infinite n: the probability that all degrees of freedom together jump over the barrier vanishes. In this example there is a discrete number of minimum points. The case of a continuum of minima is obtained, always in the Schrödinger context, if we take

$$\displaystyle \begin{aligned} V=1/2~\mu^2{\mathbf{r}}^2 + 1/4~\lambda({\mathbf{r}}^2)^2~, {} \end{aligned} $$
(2.55)

with r = (x, y, z) and, in the sign convention of Eq. (2.42), \(\mu^2 < 0\). Also in this case the ground state is unique: it is given by a state with total orbital angular momentum zero, an s-wave state, whose wave function only depends on |r|, independent of all angles. This is a superposition of all directions with the same weight, analogous to what happened in the discrete case. But again, if we replace a single vector r, with a vector field M(x), that is a different vector at each point in space, the amplitude to go from a minimum state in one direction to another in a different direction goes to zero in the limit of infinite volume. In simple words, the vectors at all points in space have a vanishingly small amplitude to make a common rotation, all together at the same time. In the infinite volume limit all vacua along each direction have the same energy and spontaneous symmetry breaking can occur.

Fig. 2.2

A Schrödinger potential V (x) analogous to the Higgs potential

The massless Goldstone bosons correspond to a long range force. Unless the massless particles are confined, as for the gluons in QCD, these long range forces would be easily detectable. Thus, in the construction of the EW theory we cannot accept massless physical scalar bosons. Fortunately, when spontaneous symmetry breaking takes place in a gauge theory, the massless Goldstone modes exist, but they are unphysical and disappear from the spectrum. Each of them becomes, in fact, the third helicity state of a gauge boson that takes mass. This is the Higgs mechanism (it should be called the Englert-Brout-Higgs mechanism [18], because equal merit should be credited to the simultaneous paper by Englert and Brout). Consider, for example, the simplest Higgs model described by the lagrangian

$$\displaystyle \begin{aligned} \mathcal{L} = -\frac{1}{4}~F^2_{\mu\nu} + |(\partial_{\mu} + ieA_{\mu}Q)\phi|{}^2 + \mu^2 \phi^*\phi - \frac{\lambda}{2}(\phi^*\phi)^2~. {} \end{aligned} $$
(2.56)

Note the ‘wrong’ sign in front of the mass term for the scalar field ϕ, which is necessary for the spontaneous symmetry breaking to take place. The above lagrangian is invariant under the U(1) gauge symmetry

$$\displaystyle \begin{aligned} A_{\mu} \rightarrow A^{\prime}_{\mu} = A_{\mu} - \partial_{\mu}\theta(x), \quad \phi \rightarrow \phi' = \mathrm{exp}[ieQ\theta(x)]~\phi . {} \end{aligned} $$
(2.57)

For the U(1) charge Q we take Qϕ = −ϕ, like in QED, where the particle is \(e^-\). Let \(\phi^0 = v \neq 0\), with v real, be the ground state that minimizes the potential and induces the spontaneous symmetry breaking. In our case v is given by \(v^2 = \mu^2/\lambda\). Making use of gauge invariance, we can do the change of variables

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \phi(x) \rightarrow[v+h(x)/\sqrt 2]~\mathrm{exp}[-i \zeta(x)/v\sqrt 2]~, \\ &\displaystyle &\displaystyle A_{\mu}(x) \rightarrow A_{\mu} - \partial_{\mu} \zeta(x)/ev\sqrt 2. {} \end{array} \end{aligned} $$
(2.58)

Then h = 0 is the position of the minimum, and the lagrangian becomes

$$\displaystyle \begin{aligned} \mathcal{L} = -\frac{1}{4}F^2_{\mu\nu} +e^2v^2A^2_{\mu} + \frac{1}{2} e^2h^2A^2_{\mu} + \sqrt 2e^2h vA^2_{\mu} + \mathcal{L}(h)~. {} \end{aligned} $$
(2.59)

The field ζ(x) is the would-be Goldstone boson, as can be seen by considering only the ϕ terms in the lagrangian, i.e. setting \(A_\mu = 0\) in Eq. (2.56). In fact in this limit the kinetic term \(\partial_\mu\zeta\partial^\mu\zeta\) remains but with no \(\zeta^2\) mass term. Instead, in the gauge case of Eq. (2.56), after changing variables in the lagrangian, the field ζ(x) completely disappears (not even the kinetic term remains), whilst the mass term \(e^2v^2A^2_{\mu }\) for \(A_\mu\) is now present: the gauge boson mass is \(M=\sqrt {2}ev\). The field h describes the massive Higgs particle. Leaving a constant term aside, the last term in Eq. (2.59) is given by:

$$\displaystyle \begin{aligned} \mathcal{L}(h)=\frac{1}{2} \partial_\mu h \partial^\mu h -h^2 \mu^2+\ldots . {} \end{aligned} $$
(2.60)

where the dots stand for cubic and quartic terms in h. We see that the h mass term has the “right” sign, due to the combination of the quadratic terms in h that, after the shift, arise from the quadratic and quartic terms in ϕ. The h mass is given by \(m^2_h=2\mu ^2\).
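The origin of the “right” sign can be traced with a short polynomial expansion. The sketch below assumes the potential read off from Eq. (2.56), \(V = -\mu^2|\phi|^2 + (\lambda/2)|\phi|^4\), sets \(|\phi| = v + h/\sqrt 2\) with \(v^2 = \mu^2/\lambda\), and collects the coefficients of each power of h. The helper names and the numerical inputs are illustrative assumptions.

```python
import math


def poly_mul(a, b):
    """Multiply two polynomials in h given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def potential_coefficients(mu2, lam):
    """Coefficients of h^0..h^4 in V = -mu2 |phi|^2 + (lam/2) |phi|^4."""
    v = math.sqrt(mu2 / lam)
    rho = [v, 1.0 / math.sqrt(2.0)]  # |phi| = v + h / sqrt(2)
    rho2 = poly_mul(rho, rho)
    rho4 = poly_mul(rho2, rho2)
    rho2_padded = rho2 + [0.0] * (len(rho4) - len(rho2))
    return [-mu2 * rho2_padded[k] + 0.5 * lam * rho4[k]
            for k in range(len(rho4))]


def masses(e, mu2, lam):
    """Higgs and gauge boson masses of the abelian model."""
    coeffs = potential_coefficients(mu2, lam)
    m_h = math.sqrt(2.0 * coeffs[2])  # V contains (1/2) m_h^2 h^2
    m_gauge = math.sqrt(2.0) * e * math.sqrt(mu2 / lam)  # e^2 v^2 = M^2 / 2
    return m_h, m_gauge


print(potential_coefficients(1.0, 0.5))
print(masses(0.3, 1.0, 0.5))
```

The term linear in h cancels because v sits at the minimum, while the \(h^2\) coefficient of V is \(-\mu^2/2\) from the quadratic term in ϕ plus \(3\mu^2/2\) from the quartic term, giving \(+\mu^2\) and hence \(m_h^2 = 2\mu^2\).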

The Higgs mechanism is realized in well-known physical situations. It was actually discovered in condensed matter physics by Anderson [19]. For a superconductor in the Landau–Ginzburg approximation the free energy can be written as

$$\displaystyle \begin{aligned} F = F_0 + \frac{1}{2}{\mathbf{B}}^2 + |(\boldsymbol{\nabla} - 2ie\mathbf{A})\phi|{}^2/4m - \alpha|\phi|{}^2 + \beta|\phi|{}^4~. {} \end{aligned} $$
(2.61)

Here B is the magnetic field, \(|\phi|^2\) is the Cooper pair (\(e^-e^-\)) density, 2e and 2m are the charge and mass of the Cooper pair. The ‘wrong’ sign of α leads to ϕ ≠ 0 at the minimum. This is precisely the non-relativistic analogue of the Higgs model of the previous example. The Higgs mechanism implies the absence of propagation of massless phonons (states with dispersion relation ω = kv with constant v). Also the mass term for A is manifested by the exponential decrease of B inside the superconductor (Meissner effect).

2.8 Quantization of Spontaneously Broken Gauge Theories: Rξ Gauges

We have discussed in Sect. 2.6 the problems arising in the quantization of a gauge theory and in the formulation of the correct Feynman rules (gauge fixing terms, ghosts etc.). Here we give a concise account of the corresponding results for spontaneously broken gauge theories. In particular we describe the Rξ gauge formalism [13, 17, 20]: in this formalism the interplay of transverse and longitudinal gauge boson degrees of freedom is made explicit and their combination leads to the cancellation from physical quantities of the gauge parameter ξ. We work out in detail an abelian example that later will be easy to generalize to the non abelian case.

We restart from the abelian model of Eq. (2.56) (with Q = −1). In the treatment presented there the would-be Goldstone boson ζ(x) was completely eliminated from the lagrangian by a non linear field transformation formally identical to a gauge transformation corresponding to the U(1) symmetry of the lagrangian. In that description, in the new variables we eventually obtain a theory with only physical fields: a massive gauge boson \(A_\mu\) with mass \(M=\sqrt {2} e v\) and a Higgs particle h with mass \(m_h=\sqrt {2}\mu \). This is called a “unitary” gauge, because only physical fields appear. But for a massive gauge boson the propagator:

$$\displaystyle \begin{aligned} iD_{\mu\nu}(k)=-i\frac{g_{\mu\nu}-k_\mu k_\nu/M^2}{k^2-M^2+i\epsilon}~, {} \end{aligned} $$
(2.62)

has a bad ultraviolet behaviour due to the second term in the numerator, which does not fall off at large k. The unitary gauge is therefore not the most convenient choice for a discussion of the ultraviolet behaviour of the theory. Alternatively, one can adopt a formulation in which the would-be Goldstone boson remains in the lagrangian: the complication of keeping spurious degrees of freedom is compensated by having all propagators with good ultraviolet behaviour (“renormalizable” gauges). To this end we replace the non linear transformation for ϕ in Eq. (2.58) with its linear equivalent (after all, perturbation theory deals with small oscillations around the minimum):

$$\displaystyle \begin{aligned} \phi(x) \rightarrow[v+h(x)/\sqrt 2]~\mathrm{exp}[-i \zeta(x)/v\sqrt 2]~\sim~[v+h(x)/\sqrt 2-i \zeta(x)/\sqrt 2]~. {} \end{aligned} $$
(2.63)

Here we have only applied a shift by the amount v and separated the real and imaginary components of the resulting field with vanishing vacuum expectation value. If we leave Aμ as it is and simply substitute the linearized expression for ϕ, we obtain the following quadratic terms (those important for propagators):

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathcal{L}_{\mathrm{quad}} &\displaystyle =&\displaystyle - \frac{1}{4} \sum_A F^A_{\mu\nu}F^{A\mu\nu} + \frac{1}{2} M^2A_\mu A^\mu+ \\ &\displaystyle +&\displaystyle \frac{1}{2}(\partial_\mu \zeta)^2 + MA_\mu \partial^\mu\zeta+\frac{1}{2}(\partial_\mu h)^2 -h^2 \mu^2 {} \end{array} \end{aligned} $$
(2.64)

The mixing term between Aμ and ∂μζ prevents us from directly reading off diagonal propagators for the two fields. But this mixing term can be eliminated by an appropriate modification of the covariant gauge fixing term given in Eq. (2.35) for the unbroken theory. We now take:

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{GF}=-\frac{1}{2\xi}(\partial^\mu A_{\mu}-\xi M\zeta)^2~. {} \end{aligned} $$
(2.65)

By adding \(\Delta \mathcal {L}_{GF}\) to the quadratic terms in Eq. (2.64) the mixing term cancels (apart from a total derivative that can be omitted) and we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathcal{L}_{\mathrm{quad}} &\displaystyle =&\displaystyle - \frac{1}{4} \sum_A F^A_{\mu\nu}F^{A\mu\nu} + \frac{1}{2} M^2A_\mu A^\mu-\frac{1}{2\xi}(\partial^\mu A_{\mu})^2+ \\ &\displaystyle +&\displaystyle \frac{1}{2}(\partial_\mu \zeta)^2 -\frac{\xi}{2}M^2\zeta^2+ \frac{1}{2}(\partial_\mu h)^2 -h^2 \mu^2 {} \end{array} \end{aligned} $$
(2.66)
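
The cancellation can be checked term by term; what follows is only a rewriting of Eqs. (2.64) and (2.65), with no new input. Expanding the gauge fixing term gives:

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{GF}=-\frac{1}{2\xi}(\partial^\mu A_{\mu})^2+M\zeta\,\partial^\mu A_{\mu}-\frac{\xi}{2}M^2\zeta^2~. \end{aligned} $$

The cross term combines with the mixing term of Eq. (2.64) into a total derivative:

$$\displaystyle \begin{aligned} MA_\mu \partial^\mu\zeta+M\zeta\,\partial^\mu A_{\mu}=M\,\partial^\mu(A_\mu\zeta)~, \end{aligned} $$

which does not contribute to the action. The two remaining terms are the ξ dependent pieces that appear in Eq. (2.66).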

We see that the ζ field appears with a mass \(\sqrt {\xi }M\) and its propagator is:

$$\displaystyle \begin{aligned} iD_\zeta=\frac{i}{k^2-\xi M^2+i\epsilon}. {} \end{aligned} $$
(2.67)

The propagators of the Higgs field h and of gauge field Aμ are:

$$\displaystyle \begin{aligned} iD_h=\frac{i}{k^2-2\mu^2+i\epsilon}~, {} \end{aligned} $$
(2.68)
$$\displaystyle \begin{aligned} iD_{\mu\nu}(k)=\frac{-i}{k^2-M^2+i\epsilon}(g_{\mu\nu}-(1-\xi)\frac{k_\mu k_\nu}{k^2-\xi M^2})~. {} \end{aligned} $$
(2.69)

As anticipated, all propagators have a good behaviour at large k². This class of gauges is called “Rξ gauges” [20]. Note that for ξ = 1 we have a sort of generalization of the Feynman gauge with a Goldstone boson of mass M and a gauge propagator:

$$\displaystyle \begin{aligned} iD_{\mu\nu}(k)=\frac{-ig_{\mu\nu}}{k^2-M^2+i\epsilon}~. {} \end{aligned} $$
(2.70)

Also for ξ → ∞ the unitary gauge description is recovered, in that the Goldstone propagator vanishes and the gauge propagator reproduces that of the unitary gauge in Eq. (2.62). All ξ dependence, including the unphysical singularities of the ζ and Aμ propagators at k² = ξM², present in individual Feynman diagrams, must cancel in the sum of all contributions to any physical quantity.
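
The statements about the gauge propagator can be checked with a few lines of exact arithmetic. The sketch below is an illustration, not part of the original derivation. It assumes that the Aμ terms of Eq. (2.66) are written in momentum space as ½ Aμ K^{μν} Aν with K^{μν} = −(k² − M²) g^{μν} + (1 − 1/ξ) k^μ k^ν (the overall sign and normalization of K are an assumption, chosen to match Eq. (2.69)), and it drops the overall factor i and the iε prescription. All function names and the sample values of k, M² and ξ are hypothetical.

```python
from fractions import Fraction as F

# Metric diag(+,-,-,-); its entries are the same with upper or lower indices.
SIGNS = (1, -1, -1, -1)
G = [[F(SIGNS[m]) if m == n else F(0) for n in range(4)] for m in range(4)]
IDENTITY = [[F(int(m == n)) for n in range(4)] for m in range(4)]


def lower(k_up):
    """Lower the index of a four-vector: k_mu = g_{mu nu} k^nu."""
    return [SIGNS[m] * k_up[m] for m in range(4)]


def square(k_up):
    """Minkowski square k^2 = k^mu k_mu."""
    return sum(a * b for a, b in zip(k_up, lower(k_up)))


def quadratic_operator(k_up, M2, xi):
    """K^{mu nu} read off from Eq. (2.66); sign convention is an assumption."""
    k2 = square(k_up)
    return [[-(k2 - M2) * G[m][n] + (1 - 1 / xi) * k_up[m] * k_up[n]
             for n in range(4)] for m in range(4)]


def d_rxi(k_up, M2, xi):
    """D_{mu nu} of Eq. (2.69), without the factor i and the i*epsilon."""
    k2, k_lo = square(k_up), lower(k_up)
    return [[-(G[m][n] - (1 - xi) * k_lo[m] * k_lo[n] / (k2 - xi * M2)) / (k2 - M2)
             for n in range(4)] for m in range(4)]


def d_unitary(k_up, M2):
    """D_{mu nu} of Eq. (2.62), without the factor i and the i*epsilon."""
    k2, k_lo = square(k_up), lower(k_up)
    return [[-(G[m][n] - k_lo[m] * k_lo[n] / M2) / (k2 - M2)
             for n in range(4)] for m in range(4)]


def matmul(A, B):
    """Contract the second index of A with the first index of B."""
    return [[sum(A[m][l] * B[l][n] for l in range(4)) for n in range(4)]
            for m in range(4)]


# Hypothetical off-shell sample point (k^2 differs from M^2 and from xi*M^2).
k = [F(3), F(1), F(-2), F(1, 2)]
M2 = F(5)
xi = F(7, 3)

# 1) Eq. (2.69) inverts the quadratic operator: K^{mu nu} D_{nu rho} = delta^mu_rho.
inverts = matmul(quadratic_operator(k, M2, xi), d_rxi(k, M2, xi)) == IDENTITY

# 2) xi = 1 gives the Feynman-like form of Eq. (2.70).
feynman_ok = d_rxi(k, M2, F(1)) == [[-G[m][n] / (square(k) - M2)
                                     for n in range(4)] for m in range(4)]

# 3) Very large xi approaches the unitary gauge propagator of Eq. (2.62).
big, uni = d_rxi(k, M2, F(10**12)), d_unitary(k, M2)
unitary_gap = max(abs(big[m][n] - uni[m][n]) for m in range(4) for n in range(4))

print(inverts, feynman_ok, float(unitary_gap))
```

Exact rational arithmetic is used so that the first two checks are equalities rather than floating point comparisons. The third quantity is the largest componentwise difference between the two propagators, which shrinks like 1/ξ as ξ grows.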

An additional complication is that a Faddeev-Popov ghost is also present in Rξ gauges (while it is absent in an unbroken abelian gauge theory). In fact under an infinitesimal gauge transformation with parameter θ(x):

$$\displaystyle \begin{aligned} \begin{array}{rcl} A_\mu &\displaystyle \rightarrow&\displaystyle A_\mu-\partial_\mu \theta \\ \phi &\displaystyle \rightarrow&\displaystyle (1-ie\theta)[v+h(x)/\sqrt{2}-i \zeta(x)/\sqrt{2}]~, {} \end{array} \end{aligned} $$
(2.71)

so that:

$$\displaystyle \begin{aligned} \delta A_\mu=-\partial_\mu \theta,~~~~\delta h=-e\zeta \theta,~~~~\delta \zeta= e \theta \sqrt{2}(v+h/\sqrt{2})~. {} \end{aligned} $$
(2.72)

The gauge fixing condition \(\partial_\mu A^\mu-\xi M\zeta=0\) undergoes the variation:

$$\displaystyle \begin{aligned} \partial_\mu A^\mu-\xi M\zeta \rightarrow \partial_\mu A^\mu-\xi M\zeta -[\partial^2 +\xi M^2(1+h/v\sqrt{2})]\theta~, {} \end{aligned} $$
(2.73)

where we used \(M=\sqrt {2}ev\). From this, recalling the discussion in Sect. 2.6, we see that the ghost is not coupled to the gauge boson (as usual for an abelian gauge theory) but has a coupling to the Higgs field h. The ghost lagrangian is:

$$\displaystyle \begin{aligned} \Delta \mathcal{L}_{Ghost}= \bar \eta[\partial^2 +\xi M^2(1+h/v\sqrt{2})]\eta~. {} \end{aligned} $$
(2.74)

The ghost mass is seen to be \(m_{gh} =\sqrt {\xi } M\) and its propagator is:

$$\displaystyle \begin{aligned} iD_{gh}=\frac{i}{k^2-\xi M^2+i\epsilon}. {} \end{aligned} $$
(2.75)
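
The steps leading from Eq. (2.71) to Eq. (2.73) can be spelled out as follows; this is only a rewriting of the formulas above. Multiplying out the transformed field in Eq. (2.71) gives:

$$\displaystyle \begin{aligned} (1-ie\theta)\left[v+\frac{h}{\sqrt{2}}-\frac{i\zeta}{\sqrt{2}}\right]=v+\frac{1}{\sqrt{2}}\left(h-e\zeta\theta\right)-\frac{i}{\sqrt{2}}\left[\zeta+e\theta\sqrt{2}\left(v+\frac{h}{\sqrt{2}}\right)\right]~. \end{aligned} $$

Comparing the real and imaginary parts with the untransformed field yields δh and δζ as in Eq. (2.72). Inserting these variations, together with δAμ, into the gauge fixing condition gives:

$$\displaystyle \begin{aligned} \delta(\partial_\mu A^\mu-\xi M\zeta)&=-\partial^2\theta-\xi M\,e\sqrt{2}\left(v+\frac{h}{\sqrt{2}}\right)\theta \\ &=-\left[\partial^2+\xi M^2\left(1+\frac{h}{v\sqrt{2}}\right)\right]\theta~, \end{aligned} $$

where the second line uses \(M=\sqrt {2}ev\). The part of the operator in square brackets that does not depend on h is \(\partial^2+\xi M^2\), which in momentum space vanishes at \(k^2=\xi M^2\); this is the origin of the pole in Eq. (2.75).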

The detailed Feynman rules follow from all the basic vertices involving the gauge boson, the Higgs, the would-be Goldstone boson and the ghost, and can be easily derived, with some algebra, from the total lagrangian including the gauge fixing and ghost additions. The generalization to the non abelian case is in principle straightforward, with some formal complications involving the projectors over the space of the would-be Goldstone bosons and over the orthogonal space of the Higgs particles. But for each gauge boson that acquires a mass \(M_a\) we still have a corresponding would-be Goldstone boson and a ghost with mass \(\sqrt {\xi }M_a\). The Feynman diagrams, both for the abelian and the non abelian case, are listed explicitly, for example, in the Cheng and Li textbook, Ref. [17].

We conclude by recalling that the renormalizability of non abelian gauge theories, also in the presence of spontaneous symmetry breaking, was proven in the fundamental works of ’t Hooft and Veltman [21] and is discussed in detail in [22].