5.1 From Bristol to Liverpool

In 1948, at Chadwick’s instigation, and after long negotiations with the Vice Chancellor to ensure that he would have an effectively independent research institute with the promise of a well-funded theoretical physics library [Footnote 1] under his own control, and minimal undergraduate teaching responsibilities, Fröhlich moved to Liverpool, to become, at the age of 42, the first Professor of Theoretical Physics (Fig. 5.1), a position for which, incidentally, Schrödinger had proposed himself in 1946. [Footnote 2]

Fig. 5.1 Fröhlich at the time he was elected Fellow of the Royal Society, March 1951

The Liverpool Chair had first been advertised by the university in May 1947, and applications were received from Benham, Jaeger, Jahn and Temperley. Although only Jahn and Temperley were invited for interview, Jaeger’s application did receive serious consideration, partly on account of his work being known to the External Advisors (Mott and Peierls) and to certain members of the Selection Committee. Whilst it was agreed that all three merited careful consideration, it was finally concluded that none of them was likely to fully meet the needs of the subject in Liverpool. When the External Advisors were asked whether an approach should be made to someone who had not applied for the Chair, they suggested that consideration be given to Fröhlich, who was already well-known to them; indeed, Mott, in his report to the Selection Committee, revealed that Fröhlich was likely soon [Footnote 3] to be elected Fellow of the Royal Society. The Committee welcomed this proposal, and Fröhlich was duly invited to Liverpool for preliminary talks with the Vice Chancellor and Chadwick. On 20 February 1948, in the absence of the External Advisors, he met the Selection Committee, several of whose members already knew him, and they were all convinced that, in view of his high standing, he would make a valuable and much needed contribution to the work of the university. Accordingly, the Committee unanimously recommended that Fröhlich be appointed to the Chair of Theoretical Physics as from 1 October 1948; the appointment was reported in Nature on 8 May that year (Fig. 5.2).

Fig. 5.2 Announcement in Nature (8 May, 1948) of Fröhlich’s appointment to the Liverpool Chair of Theoretical Physics—Reproduced with the permission of Nature

From Bristol and the ERA, he brought with him a substantial number of co-workers and research students, including H. Pelzer (a member of the Senior Research Staff of the ERA, and well-known as the co-author with Wigner of a seminal paper on the rates of chemical reactions), B. Szigeti (a Research Fellow supported by the ERA) who was highly experienced in dielectrics and crystal dynamics, A.B. Bhatia (1851 Exhibition Scholar), and Kun Huang (an ICI Fellow) who went to Edinburgh during the university vacations to work with Max Born on what was to become the well-known treatise ‘Dynamical Theory of Crystal Lattices’, which was first published in 1954 (Born and Huang 1954). In addition there was S. Zienau, a research student working on electrons in polar crystals, and later (in 1950—vide Sect. 5.3), co-author, together with Fröhlich and Pelzer, of the famous paper [F72] in which the theory of what became known as the ‘large’ polaron was first formulated. The only staff member besides Fröhlich was his assistant lecturer, R. Huby. The first research students were Miss S.N. Ruddlesden and Mr A.C. Clark, both of whom had recently graduated from Liverpool (Huby 1950, 1988).

A distinguishing characteristic of his department, which, until 1959, occupied an elegant Georgian house at 6 Abercromby Square—and which was rather separate from the rest of the university—was the constant stream of eminent visitors from abroad, such as Bardeen, Bethe, Heisenberg, Heitler, Onsager and Prigogine, which made it a truly international centre of excellence in theoretical physics. To some in Liverpool, this was a source of disquiet in the immediate post-war years when the visitors included both German and Japanese physicists, such as Hermann Haken and Sadao Nakajima. [Footnote 4] To Fröhlich, however, racial, political and religious differences were no impediment to sincere scientific discourse, and in any case, as he pointed out, some of the younger people were still school-children during the war; appreciating how difficult it was for them, he was only too happy to be able to help. It was, politically, racially and religiously, a very heterogeneous department, with Arabs, Catholics, Communists, Hindus and Jews, but Fröhlich did his best to ensure that their irreconcilable differences were not a source of disharmony, actually, on occasion, forbidding discussion at coffee or tea of particularly sensitive issues, such as Arab-Israeli conflicts.

Indeed, morning coffee and afternoon tea were undoubtedly the most significant events of the day, at which important ideas often emerged from extended and animated discussions, to which the congenial atmosphere of the house was so conducive; apparently, there was a reluctance to use reference books to settle a point, and to go and fetch one was highly unpopular because it broke up discussions! Despite this congeniality, surnames were always used, however—even Fröhlich’s wife often referring to him simply as ‘Fröhlich’; indeed, few knew what his initial H stood for: perhaps they thought it was Hamiltonian, which would not have been inappropriate (Powles 1973)! Departmental outings continued, as in Bristol, but now the preferred destination was usually North Wales.

In 1959, the department was relocated nearby in the new Chadwick Laboratory, on the top 3 floors [Footnote 5] of its 8-storey tower, the stairs of which Fröhlich daily athletically climbed, two-at-a-time, until he broke a thigh-bone in the 1970s; Theoretical Physics became an independent department of the university three years later in 1962.

As far as his research students were concerned, Fröhlich’s typical method of teaching was to toss out a few ideas, leaving the recipients essentially to sort things out for themselves, but giving the occasional pertinent, but often cryptic, suggestion as to how they might proceed—an approach to which the author can personally bear witness. He drew people into discussion by his enthusiasm for the subject, cultivating creative argument with colleagues—something that is rapidly disappearing today—and exercising his razor-sharp mind in constructive criticism, particularly on the occasion of the weekly theoretical physics seminars that were held every Thursday afternoon at 3.30 p.m. during the university terms. At the same time, he was always very approachable; he was recently described by a former colleague in Germany as ‘friendly, but very cunning!’

5.2 Theory of Dielectrics

Once settled in Liverpool, he completed his second book, Theory of Dielectrics [F(ii)], which was published by Oxford University Press in 1949, and which soon became the definitive text on the theory of dielectrics. The book was expressly written for use by applied scientists, including not only physicists but also chemists and biologists, in an attempt to fill a long-felt need for an up-to-date, authoritative, and systematic account of the theory of the dielectric constant and dielectric loss (but excluding dielectric breakdown). In addition, he hoped that it would act as a stimulus for further research, both theoretical and experimental (Fig. 5.3).

Fig. 5.3 Fröhlich’s 2nd book, Theory of Dielectrics, first published 1949—Reproduced with the permission of Oxford University Press

After a short introduction to the basic concepts and definitions used in macroscopic dielectric theory for both static and time-dependent fields, and the energy loss associated with the latter, together with some important remarks relating to the temperature dependence of the static dielectric constant, which follow from some quite general thermodynamical considerations (Chap. 1), the subsequent subject matter of the book divides naturally into detailed consideration of static and dynamic properties. The main topic of the former is the theory of the static dielectric constant (Chap. 2), whilst that of the latter concerns dielectric loss (Chap. 3). These are followed by a long chapter devoted to applications to a variety of dielectric materials, including liquids and gases. The book was republished in 1957 with an additional Appendix (B) dealing with the general theory of the dielectric constant, and contained some theoretical work on dielectric loss done subsequent to the 1st Edition. A survey of the theoretical situation at this time was given by Fröhlich in [F103].

Basic to the considerations of Chap. 2 is a recognition of the distinction between two quite different types of interaction in a dielectric, namely those of short range (associated with chemical bonds, van der Waals attraction, and various repulsive forces), and the long-range forces of a dipolar nature, which means that they must be taken into account even at macroscopic distances. The latter feature led Fröhlich to base his work on the Lorentz-Kirkwood method of treating dipolar interaction, namely: ‘......from a macroscopic specimen select a microscopic spherical region which is sufficiently large to have the same dielectric properties as a macroscopic specimen. The interaction between dipoles inside the spherical region will then be calculated in an exact way, but for the calculation of their interaction with the rest of the specimen the latter is considered as a continuous medium’ [F(ii), p. 22]. The most important section of Chap. 2 is Sect. 2.7 (based on work published the preceding year [F66]), in which the following expressions are derived for the static dielectric constant, \( \varepsilon_{\text{s}} \), which hold quite generally for any dielectric material that is not permanently polarized:

$$ \varepsilon_{\text{s}} - 1 = \frac{4\pi}{3V}\,\frac{3\varepsilon_{\text{s}}}{2\varepsilon_{\text{s}} + 1}\,\frac{\left\langle M^{2} \right\rangle}{kT} = \frac{4\pi}{3V}\,\frac{\varepsilon_{\text{s}} + 2}{3}\,\frac{\left\langle M^{2}_{\text{vac}} \right\rangle}{kT} $$
(5.2.1)

where \( \left\langle {M^{2} } \right\rangle \) is the macroscopic fluctuation of the spontaneous [Footnote 6] dipole moment of a sufficiently large sphere of dielectric material of volume V embedded in its own medium with dielectric constant \( \varepsilon_{\text{s}} \) [Footnote 7]; \( \left\langle {M^{2}_{\text{vac}} } \right\rangle \) is the corresponding fluctuation for the same sphere in vacuum.

It was shown that when all short-range forces are neglected, and the molecules are assumed to be spherical, Eq. 5.2.1 yields either the Clausius-Mossotti or the Onsager formula, according as whether the molecules are non-polar or polar, respectively. Including short-range forces between nearest neighbours only, Eq. 5.2.1 was then used to obtain a generalisation of Kirkwood’s formula for a polar liquid, which contains, for the case of spherical molecules, a corresponding generalisation of the Onsager formula. In turn, these generalisations permitted determination of the domain of validity of the original formulae (in which no account is taken of short-range forces), which have often been used indiscriminately. It was found, for example, that for dipolar liquids the Onsager formula can be expected to hold only asymptotically at high temperatures.

Another topic treated in Chap. 2 was the temperature dependence of the static dielectric constant of a dipolar solid in the vicinity of an order-disorder transition, in which attention was drawn to the similarity in the dielectric behaviour of disordered solids and liquids.

The other main topic treated in the book (Chap. 3), which involved dynamic properties of the materials, was dielectric loss—i.e. the energy loss from an A.C. electric field due to the heating that occurs in a dielectric material. It was shown, in particular, that the familiar Debye equations for the real and imaginary parts of the complex, frequency-dependent dielectric constant, \( \varepsilon_{ 1} (\omega )\;{\text{and}}\;\varepsilon_{ 2} (\omega ) \), respectively, follow under the assumption that, in an external electric field, equilibrium is attained exponentially with time—i.e. \({\text{e}}^{{ - t/\tau }}\), where the relaxation time, \( \tau \), in general depends on temperature; the associated dielectric loss is given in terms of \( \varepsilon_{ 1} \) and \( \varepsilon_{ 2} \) by \( \tan \phi = \varepsilon_{ 2} /\varepsilon_{ 1} \). Again, Fröhlich stressed that the Debye equations ‘have been applied to many substances, not always, unfortunately, with the necessary discrimination with respect to the intended range of validity’, and proceeded to obtain expressions for the relaxation time for dipolar solids and liquids, taking care to specify the conditions necessary for their validity.
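The behaviour of the Debye equations mentioned above can be illustrated numerically. The following is a minimal sketch, assuming the standard single-relaxation-time form \( \varepsilon_{1} = \varepsilon_{\infty} + (\varepsilon_{\text{s}} - \varepsilon_{\infty})/(1 + \omega^{2}\tau^{2}) \), \( \varepsilon_{2} = (\varepsilon_{\text{s}} - \varepsilon_{\infty})\,\omega\tau/(1 + \omega^{2}\tau^{2}) \); the name `eps_inf` (the high-frequency dielectric constant) and all numerical values are illustrative assumptions and are not taken from the book.

```python
def debye_permittivity(omega, eps_s, eps_inf, tau):
    """Real and imaginary parts (eps1, eps2) for a single relaxation time tau."""
    x = omega * tau
    eps1 = eps_inf + (eps_s - eps_inf) / (1.0 + x * x)
    eps2 = (eps_s - eps_inf) * x / (1.0 + x * x)
    return eps1, eps2


def loss_tangent(omega, eps_s, eps_inf, tau):
    """Loss tangent tan(phi) = eps2 / eps1."""
    eps1, eps2 = debye_permittivity(omega, eps_s, eps_inf, tau)
    return eps2 / eps1


if __name__ == "__main__":
    # Illustrative (hypothetical) parameters: eps_s, eps_inf, tau in seconds.
    eps_s, eps_inf, tau = 10.0, 2.0, 1.0e-9
    for x in (0.1, 1.0, 10.0):
        omega = x / tau
        eps1, eps2 = debye_permittivity(omega, eps_s, eps_inf, tau)
        tan_phi = loss_tangent(omega, eps_s, eps_inf, tau)
        print(f"omega*tau = {x:5.1f}  eps1 = {eps1:6.3f}  "
              f"eps2 = {eps2:6.3f}  tan(phi) = {tan_phi:6.3f}")
```

In this form \( \varepsilon_{2} \) is largest at \( \omega\tau = 1 \), so a temperature-dependent relaxation time shifts the frequency of maximum Debye absorption with temperature.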

In the case of a dilute solution of dipolar molecules in a liquid or an amorphous solid, generalisations of the Debye equations were derived covering a range of relaxation times, and the associated absorption discussed. Finally, the case of resonance absorption was treated, in which the loss is due to displacements of charges bound elastically to an equilibrium position about which they oscillate with a frequency \( \omega_{\text{o}} \); near this resonance frequency, the power loss (and \( \varepsilon_{ 2} \)) is a maximum. In deriving the associated expressions for \( \varepsilon_{ 1} \) and \( \varepsilon_{ 2} \), it was assumed that equilibrium is approached through exponentially damped oscillations—i.e. \( {\text{e}}^{{{-}t/{\tau }}} \) was replaced by \( {\text{e}}^{{{-}t/{\tau }}} \cos \left( {\omega_{o} t + \theta } \right) \), where the phase \( \theta \) was shown to be given by \( \tan \theta = - \omega_{\text{o}} \tau \). It was found, in contrast to the case of Debye absorption, that the frequency of maximum absorption is here independent of temperature, whilst the resonance peak becomes narrower and higher with decreasing temperature. As noted in Sect. 4.6, Fröhlich had first considered dielectric loss as early as 1941 [F36], and returned to it in 1942 [F39] and 1945 [F51 Addendum], the last of these being elaborated in detail in [F56]. In [F56], he showed, for the case of a rigid dipole oscillating about an equilibrium position, that the usual Lorentz formula for the shape of absorption lines is incorrect except near resonance, and derived a generally valid expression for the loss angle, which must be used at ultra-high frequencies, and which takes the form of the familiar Debye expression at frequencies well above resonance. The new formula agreed with that derived independently by van Vleck and Weisskopf for a system of harmonic oscillators, using a different method. In [F57], Fröhlich applied his own method to van Vleck and Weisskopf’s system, reproducing their result.

The final chapter (Chap. 4) was devoted to applications of the results of Chaps. 1 and 2 to a variety of dielectric materials, such as non-polar solids, dipolar liquids and, in particular, ionic crystals, about which two extremely important points were made, which subsequently developed into independent subjects. The first (on pages 153–155), the origins of which can be traced to a footnote (vide Sect. 4.3) in a paper with Mott of 1939 [F23], and which proved to be of particular importance much later on in the case of biological systems (vide Sect. 6.3), was the fact that, because the long wavelength polarization waves in an ionic crystal give rise to dipole-dipole forces of long range, these vibrations must depend on the size and shape of the sample. This dependence has the strongest repercussions in the optical properties of polar specimens whose dimensions are of the order of, or smaller than, the wavelength of the polarization waves, for which the polarization is effectively uniform throughout the sample, and the distinction between longitudinal and transverse modes vanishes. Here, neglecting retardation effects, he showed that for a spherical specimen of a diatomic polar lattice the lowest energy mode is a triply degenerate one in which the polarization is uniform throughout [Footnote 8] its volume (and is thus infrared active). The frequency, \( \omega_{\text{s}} \), of this mode was found to be intermediate between the longitudinal \( (\omega_{\text{L}} ) \) and transverse \( (\omega_{\text{T}} ) \) frequencies of the corresponding bulk crystal—i.e. a sample of the same material sufficiently large that its dimensions exceed the wavelength, and which absorbs light only at the frequency \( \omega_{\text{T}} \). The increase of the absorption frequency to the value \( \omega_{\text{s}} \) in the case of a ‘small’ sample is due to the polarization charge at its surface; \( \omega_{\text{s}} \) is given in terms of the bulk transverse frequency \( \omega_{\text{T}} \) by:

$$ \omega_{\text{S}}^{2} = \omega_{\text{T}}^{2} \frac{{(\varepsilon_{\text{s}} + \, 2)}}{{\left( {n^{2} + 2} \right)}} $$
(5.2.2)

In NaCl, for example, where \( \varepsilon_{\text{s}} = 5.9 \) and \( n^{2} = 2.25 \), \( \omega_{\text{s}} = 1.36\,\omega_{\text{T}} \) (cf. \( \omega_{\text{L}} = 1.62\,\omega_{\text{T}} \)).
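The NaCl figures quoted here can be checked directly from Eq. 5.2.2. The sketch below evaluates that formula and, as an assumption not stated in the text, uses the Lyddane-Sachs-Teller relation \( \omega_{\text{L}}^{2}/\omega_{\text{T}}^{2} = \varepsilon_{\text{s}}/n^{2} \) for the longitudinal frequency; the function names are illustrative.

```python
import math


def sphere_mode_ratio(eps_s, n2):
    """omega_s / omega_T for a small sphere in vacuum, from Eq. 5.2.2."""
    return math.sqrt((eps_s + 2.0) / (n2 + 2.0))


def longitudinal_ratio(eps_s, n2):
    """omega_L / omega_T from the Lyddane-Sachs-Teller relation (assumed here)."""
    return math.sqrt(eps_s / n2)


if __name__ == "__main__":
    eps_s, n2 = 5.9, 2.25  # NaCl values quoted in the text
    print(f"omega_s / omega_T = {sphere_mode_ratio(eps_s, n2):.2f}")
    print(f"omega_L / omega_T = {longitudinal_ratio(eps_s, n2):.2f}")
```

With the quoted values the two ratios round to 1.36 and 1.62, so the sphere mode does lie between the bulk transverse and longitudinal frequencies.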

This important work by Fröhlich is interesting historically in that it complements the work of Mie, in whose institute in Freiburg Fröhlich was first employed, 1932–33—see Sect. 2.2. Mie (1908) showed that in the case of spherical metallic samples that are small compared to the wavelength of the light, the light absorption peak occurs at a frequency between zero and the plasma frequency, in contrast to the situation in a large metallic crystal where the peak absorption occurs at zero frequency. For both metallic and ionic spheres, \( \omega_{\text{s}} \) is defined by \( \varepsilon (\omega_{\text{S}} ) = - 2\varepsilon_{\text{M}} \), where \( \varepsilon (\omega_{\text{S}} ) \) is the appropriate dielectric function of the spherical sample and \( \varepsilon_{\text{M}} \) is the dielectric constant of the surrounding medium.
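The condition \( \varepsilon (\omega_{\text{S}} ) = - 2\varepsilon_{\text{M}} \) can be solved numerically for both kinds of sphere. The sketch below is only illustrative: it assumes an undamped free-electron form \( \varepsilon = 1 - \omega_{\text{p}}^{2}/\omega^{2} \) for the metal and an undamped single-oscillator form \( \varepsilon = n^{2} + (\varepsilon_{\text{s}} - n^{2})\,\omega_{\text{T}}^{2}/(\omega_{\text{T}}^{2} - \omega^{2}) \) for the ionic crystal, neither of which is written out in the text, and it measures frequencies in units of \( \omega_{\text{p}} \) and \( \omega_{\text{T}} \) respectively.

```python
import math


def bisect_root(func, lo, hi, iterations=200):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eps_metal(omega, omega_p=1.0):
    """Undamped free-electron dielectric function (assumed model)."""
    return 1.0 - (omega_p / omega) ** 2


def eps_ionic(omega, eps_s, n2, omega_t=1.0):
    """Undamped single-oscillator dielectric function (assumed model)."""
    return n2 + (eps_s - n2) * omega_t ** 2 / (omega_t ** 2 - omega ** 2)


def sphere_resonance(eps_func, lo, hi, eps_m=1.0):
    """Frequency in (lo, hi) at which eps(omega) = -2 * eps_m."""
    return bisect_root(lambda w: eps_func(w) + 2.0 * eps_m, lo, hi)


# Metal sphere in vacuum: the root lies between zero and the plasma frequency.
metal_root = sphere_resonance(eps_metal, 1.0e-6, 1.0)

# Ionic sphere in vacuum (NaCl values from the text): search between
# omega_T and omega_L, where the model dielectric function is negative.
omega_l = math.sqrt(5.9 / 2.25)
ionic_root = sphere_resonance(lambda w: eps_ionic(w, 5.9, 2.25),
                              1.0 + 1.0e-9, omega_l)

if __name__ == "__main__":
    print(f"metal sphere: omega_s / omega_p = {metal_root:.4f}")
    print(f"ionic sphere: omega_s / omega_T = {ionic_root:.4f}")
```

Under these assumed models the metal root is \( \omega_{\text{p}}/\sqrt{3} \) and the ionic root coincides with the value given by Eq. 5.2.2.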

Apart from some early clarification and extension by Szigeti in an ERA Report (Szigeti 1951), this important work lay dormant theoretically for nearly 20 years before being taken up by others (Engelman and Ruppin 1968) who showed that, although not localised at the surface, Fröhlich’s mode is actually the lowest frequency mode of a series of surface modes; in addition, they generalised the above to ionic samples of arbitrary shape, and investigated the effect of retardation [Footnote 9] in diatomic crystals, making contact with polaritons—i.e. collective (transverse) excitations having both phonon and photon character, which result from solving the equations of motion for the lattice together with the Maxwell equations (Ruppin and Engelman 1968, 1970). A few years later, Genzel and Martin treated the case of small spherical samples embedded in a non-absorbing medium with dielectric constant \( \varepsilon_{\text{M}} \), for which they derived (Genzel and Martin 1972, 1973) the following generalisation of Eq. 5.2.2:

$$ \omega_{\text{S}}^{2} = \omega_{\text{T}}^{2} \frac{{[\varepsilon_{\text{s}} \left( {1{-}f} \right) + \varepsilon_{\text{M}} \left( {2 + f} \right)]}}{{n^{2} \left( {1{-}f} \right) + \varepsilon_{\text{M}} (2 \, + f)}} $$
(5.2.3)

where f is the fraction of the total sample volume occupied by the small spheres.
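The limiting cases of Eq. 5.2.3 are easy to verify. The following sketch simply evaluates the formula as printed; the function name and default arguments are illustrative choices.

```python
import math


def small_sphere_ratio(eps_s, n2, eps_m=1.0, f=0.0):
    """omega_S / omega_T from Eq. 5.2.3.

    eps_m: dielectric constant of the embedding medium.
    f:     fraction of the total sample volume occupied by the spheres.
    """
    numerator = eps_s * (1.0 - f) + eps_m * (2.0 + f)
    denominator = n2 * (1.0 - f) + eps_m * (2.0 + f)
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    eps_s, n2 = 5.9, 2.25  # NaCl values quoted in the text
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        ratio = small_sphere_ratio(eps_s, n2, eps_m=1.0, f=f)
        print(f"f = {f:4.2f}  omega_S / omega_T = {ratio:.3f}")
```

For \( f = 0 \) and \( \varepsilon_{\text{M}} = 1 \) the expression reduces to Eq. 5.2.2, while for \( f = 1 \) the numerator and denominator coincide and the bulk value \( \omega_{\text{T}} \) is recovered.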

Experimentally, attention seems first to have been drawn to this frequency shift in 1966, in studies of the infrared dielectric dispersion of \( \text{UO}_{2} \) and \( \text{ThO}_{2} \) by Axe and Pettit (1966), who analysed their data using an expression for \( \omega_{\text{s}} \) that anticipated Eq. 5.2.3 for the case f = 0. For further discussion of the theoretical-experimental situation, vide Genzel and Martin (1972, 1973).

The second topic, which proved to be well ahead of its time, and which was not pursued by others until some years later, [Footnote 10] appeared on the final page of the book. It referred to ‘the possibility of a permanent polarization of ionic crystals’, associated with the static dielectric constant, \( \varepsilon_{\text{s}} \), diverging to infinity as the transverse optic frequency, \( \omega_{\text{T}} \), tends to zero. This behaviour is implied by the following expression for \( \varepsilon_{\text{s}} \) in terms of \( n^{2} \) and other ionic parameters, which Fröhlich derived for the case of a diatomic polar lattice:

$$ \varepsilon_{\text{s}} \,{-}\,n^{2} = 4\pi [(n^{2} + 2)/3]^{2} \frac{{{(e^{*})^2} N_{\text{o}} }}{{M_{\text{red}} \;\omega_{\text{T}}^{2} }} $$
(5.2.4)

where the reduced ion mass \( M_{\text{red}} \) is given by \( 1/M_{\text{red}} = 1/M_{+} + 1/M_{-} \), \( N_{\text{o}} \) is the number of unit cells per unit volume and \( e^{*} \) is an effective ionic charge. He concluded the book with the following sentence: ‘Investigations on these lines should be of importance in view of the properties of crystals like barium titanate. They have not been developed far enough, however, to be included in the present book’. This proved to be particularly prescient, since the first conclusive experimental evidence of ferroelectric soft modes was not obtained until some 13 years later, using infra-red spectroscopy (Barker and Tinkham 1962), (Spitzer et al. 1962), and neutron spectroscopy (Cowley 1962) on strontium titanate.
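The divergence implied by Eq. 5.2.4 can be made concrete with a short calculation. The sketch below solves Eq. 5.2.4 for \( \varepsilon_{\text{s}} \) in Gaussian units; every numerical parameter is a rough, hypothetical order-of-magnitude value chosen only for illustration and is not taken from the book.

```python
import math

E_CHARGE = 4.80320471e-10  # electron charge in esu (statcoulomb)


def static_dielectric_constant(n2, e_star, n_cells, m_red, omega_t):
    """Eq. 5.2.4 solved for eps_s (Gaussian units).

    e_star:  effective ionic charge in esu
    n_cells: unit cells per cm^3
    m_red:   reduced ion mass in g
    omega_t: transverse optic (angular) frequency in s^-1
    """
    local_field_factor = ((n2 + 2.0) / 3.0) ** 2
    return n2 + (4.0 * math.pi * local_field_factor
                 * e_star ** 2 * n_cells / (m_red * omega_t ** 2))


if __name__ == "__main__":
    # Hypothetical parameters for a generic diatomic ionic crystal.
    n2 = 2.25
    e_star = 0.75 * E_CHARGE
    n_cells = 2.0e22
    m_red = 2.3e-23
    for omega_t in (3.0e13, 1.5e13, 7.5e12, 3.75e12):
        eps_s = static_dielectric_constant(n2, e_star, n_cells, m_red, omega_t)
        print(f"omega_T = {omega_t:.2e} s^-1  ->  eps_s = {eps_s:8.2f}")
```

Each halving of \( \omega_{\text{T}} \) quadruples \( \varepsilon_{\text{s}} - n^{2} \), which is the soft-mode behaviour described in the text.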

With these confirmations of his prediction, Fröhlich returned to this topic in 1967 in his contribution [F127] to Debye’s Festschrift, in which he also drew attention to another possibility associated with the vanishing of a transverse frequency—this time an electronic one—namely, the establishment of a metallic state. In his contribution [F126] to J.C. Slater’s Festschrift, he had shown how this could occur as an excited state of a system whose ground-state is a Mott insulator—an insulator characterised not by a filled valence band of electrons, but by electrons that are localised in order to minimize their mutual Coulomb repulsion. Essentially, the increase in the number of (delocalised) conduction electrons produced by raising the temperature acts to screen the interaction of the remaining localised electrons whose binding energy is thereby reduced, making it progressively easier to create more carriers, so that the process becomes a cooperative one. Eventually the screening becomes so strong that the remaining localised electrons are delocalised—i.e. their transverse frequency vanishes—and the system undergoes a first order phase transition to a metallic state. The novel feature of the model, which became the subject of the author’s Ph.D. thesis in 1965 (Hyland 1968), was that Coulomb correlations are responsible for both the Mott-insulating ground-state and, via screening effects, for its eventual transformation into a metallic state.

So useful did his Theory of Dielectrics prove to be that it was subsequently translated into Russian and Japanese, and was reissued as a paperback by OUP in 1986, when Physics Today commented: ‘The presentation is admirable for its clarity…. All in all, it remains a masterly treatment of an engrossing subject’. Indeed, such was the significance of Fröhlich’s many contributions in the field of dielectrics that these alone would have been sufficient to secure his international reputation, for in this field he was the undisputed master—so much so that to the ERA he became known as the ‘wizard’! His first honorary degree, conferred by the University of Rennes (France) in 1955, was in recognition of his monumental work in dielectric theory.

During a visit to Zurich to speak on dielectrics, at the invitation of Scherrer who at the time was working on ferroelectrics, Fröhlich took the opportunity to look up Pauli, whom he had known since the time of his pre-war collaboration with Heitler and Kemmer on nuclear forces, when they were in regular correspondence. To his dismay, Pauli told him that the subject of his lecture was a trivial one, and consequently he would not be attending. To this Fröhlich replied that, in his view, it was far from trivial, and that more mistakes had been made in this field than in any other; he proceeded to remind Pauli that the Lorentz-Lorenz relation, for example, is often wrongly used, as he was well aware. After this, Pauli became more enthusiastic, and did in fact attend Fröhlich’s talk that afternoon. At the end of the lecture, Scherrer, who was acting as chairman, asked Pauli if he would like to open the discussion, to which he replied:

There have been more mistakes in this area than in any other theory. I myself already made one this morning, and have no intention of making another! (Fröhlich - personal communication 1983)

Another topic in this field, which was addressed some time later, concerned the energy loss of charged particles moving through a dielectric. In a joint paper [F86] with R.L. Platzman in 1953, attention was drawn to the close relation that exists between this and dielectric loss in an insulator, and a formula was derived for the rate of energy loss due to dielectric relaxation; this formula indicated that, for electrons with kinetic energy below the lowest electronic excitation potential, dielectric relaxation makes a substantial contribution to the total energy loss, which is comparable to that arising from transfer of vibrational quanta, a finding that has implications in radiobiology. The same methodology was later used [F96] to obtain a formula for the yield of secondary electrons from that of photo-electrons, and was developed further in his contribution [F105] to the Max Planck Festschrift in 1958, in which he gave a phenomenological theory of the energy loss of fast charged particles moving with an assumed constant velocity through a medium whose properties are described in terms of a complex dielectric function, \( \varepsilon \), which, in general, depends on both frequency and wave-vector. The rate of energy loss was found to be proportional to the imaginary part of the inverse of \( \varepsilon \), in contrast to the case of optical excitation where it is proportional to the imaginary part of \( \varepsilon \); the two cases differ in respect of the fact that in the latter the exciting electric field is transverse, whilst in the case of a charged particle it is longitudinal. It was shown that energy loss for certain discrete values of frequency, \( \omega \), requires only that \( \varepsilon^{ - 1} \) has the form of a δ-function, or equivalently, that \( \varepsilon (\omega ) = 0 \). 
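The contrast between the two kinds of loss can be seen in a simple model. The sketch below assumes a damped free-electron dielectric function \( \varepsilon (\omega ) = 1 - \omega_{\text{p}}^{2}/[\omega (\omega + i\gamma )] \), which is not a form given in the text, with frequencies in units of \( \omega_{\text{p}} \) and a hypothetical damping \( \gamma \); it compares \( \operatorname{Im}\varepsilon \) (optical excitation) with \( \operatorname{Im}(-1/\varepsilon ) \) (loss of a fast charged particle).

```python
def eps_model(omega, omega_p=1.0, gamma=0.05):
    """Damped free-electron dielectric function (assumed illustrative model)."""
    return 1.0 - omega_p ** 2 / (omega * (omega + 1j * gamma))


def optical_absorption(omega):
    """Im(eps): governs absorption of a transverse (optical) field."""
    return eps_model(omega).imag


def energy_loss_function(omega):
    """Im(-1/eps): governs the loss of a charged particle (longitudinal field)."""
    return (-1.0 / eps_model(omega)).imag


# Scan a frequency grid from 0.1 to 2.0 (in units of omega_p).
omegas = [0.001 * i for i in range(100, 2001)]
loss_peak = max(omegas, key=energy_loss_function)

if __name__ == "__main__":
    print(f"Im(-1/eps) peaks near omega = {loss_peak:.3f} omega_p")
    print(f"|eps| at that frequency      = {abs(eps_model(loss_peak)):.3f}")
    print(f"Im(eps) at 0.1 and 1.0       = "
          f"{optical_absorption(0.1):.3f}, {optical_absorption(1.0):.3f}")
```

In this model the particle loss is sharply peaked where \( \varepsilon \) is close to zero, whereas \( \operatorname{Im}\varepsilon \) shows no feature there, which illustrates the longitudinal versus transverse distinction drawn above.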
Previous work [F94, F95] with Pelzer [Footnote 11] had shown that this condition is satisfied by longitudinal electric oscillations in solids, which include the case of plasma oscillations. The influence of plasma oscillations of the electron gas in a non-degenerate semiconductor on conduction was briefly considered in a short note [F100] published with Doniach in 1956, in which an estimate was given for the minimum electron density, \( n_{\text{P}} \), at which the conduction electrons (moving with thermal velocities) can support plasma oscillations, above which an associated increase in electron mobility, μ, must be anticipated, as well as a departure from the usual \( \mu \sim T^{-3/2}\) law for electron-lattice scattering. They noted that if the plasma were to describe the electronic motion completely, then there would be no ordinary scattering at all. Their estimate of \( n_{\text{P}} \) took the following form (wherein ε is a background dielectric constant):

$$ {n_\text{P}} \approx (\varepsilon kT/4 \pi {e^2})^3 \text{cm}^{-3} $$
(5.2.5)

The density \( n_{\text{P}} \) can easily satisfy \( n_{\text{P}} \ll n_{\text{D}} \), where \( n_{\text{D}} \) is the density at which the electron gas becomes degenerate; in the degenerate case, only relatively few electrons in a range of order kT about the Fermi surface undergo scattering by lattice vibrations, on which plasma oscillations then have little direct influence.
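To give a feeling for the magnitude of Eq. 5.2.5, the estimate can be evaluated in Gaussian units. This is a minimal sketch: the background dielectric constant and temperature used below are hypothetical example values, not figures from [F100].

```python
import math

K_BOLTZMANN = 1.380649e-16   # erg / K
E_CHARGE = 4.80320471e-10    # esu (statcoulomb)


def plasma_threshold_density(eps, temperature):
    """n_P from Eq. 5.2.5 in cm^-3, with eps the background dielectric constant."""
    inverse_length = eps * K_BOLTZMANN * temperature / (4.0 * math.pi * E_CHARGE ** 2)
    return inverse_length ** 3


if __name__ == "__main__":
    eps = 16.0  # hypothetical background dielectric constant
    for temperature in (100.0, 300.0, 600.0):
        n_p = plasma_threshold_density(eps, temperature)
        print(f"T = {temperature:5.0f} K  ->  n_P ~ {n_p:.2e} cm^-3")
```

Because the estimate scales as \( (\varepsilon T)^{3} \), it falls rapidly on cooling, so the threshold is reached at much lower carrier densities at low temperature.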

In passing, it is of interest to record the disappointment felt by some in the Liverpool physics department that, having been brought there by Chadwick—presumably on the strength of the significance of his contribution to the meson theory of nuclear forces—Fröhlich should now be preoccupying himself with something as non-relativistic and seemingly mundane as dielectric theory. It was, however, far from mundane, as he showed on many occasions; indeed, it actually spawned the so-called ‘dielectric’ approach to the many-body problem some 10 years later, towards the end of the 1950s and early 1960s, whilst the fluctuation-dissipation theorem (Callen and Welton 1951) was to a certain extent prefigured in his treatment of the static dielectric constant. Furthermore, in the light of Fröhlich’s later concern with the connection between micro and macrophysics (vide Sect. 6.2), this early interest in dielectric theory is particularly pertinent since the dielectric constant affords an example of a macroscopic quantity that itself straddles, so to speak, the gap between micro and macrophysics.

5.3 Polaron Theory

Around 1949, Fröhlich returned to consider an aspect of the interaction of a slowly moving conduction electron with the polarization field it gives rise to in an ionic (polar) crystal, which had been neglected in his 1939 paper with Mott [F23] referred to in Sect. 4.3, namely, the effect of the reaction of this polarization back on the electron. [Footnote 12] The need for a theory that takes this into account had first become apparent in the course of his work on dielectric breakdown in amorphous materials [F49, F60], and from the following related consideration, which had been alluded to as early as 1939 in his critique [F24] of Seeger and Teller’s formulation of von Hippel’s criterion for breakdown (vide Sect. 4.1): in the case of a moving electron, only those parts of the lattice that are sufficiently far from the electron (so that the latter sweeps out only a small angle during the course of a lattice vibration) will see the electron as an effectively static charge. The angle swept out, as seen from a point of the lattice at a distance d, by an electron moving with velocity v in a time \( 1/\omega \) (~ the period of the lattice vibration) is \( v/\omega d \); if this angle is to be small, then \( v/\omega d \ll 1 \), or equivalently \( d \gg v/\omega \)—i.e. the slower the electron, the shorter the distance beyond which the lattice will perceive it as stationary. The limiting value of d for which the electron appears static (and the polarization thus proportional to the Coulomb field of the electron) may thus be taken to be of the order of \( v/\omega \). Below this distance, the electron passes so quickly that its effect on an ion can be considered as a shock, exciting oscillation after it has passed.

As noted in Sects. 4.3 and 4.7, a slow (thermal) electron interacts most strongly with longitudinal polarization waves of long wavelength, \( \lambda \gg \) the lattice constant, whilst its de Broglie wavelength similarly exceeds the lattice constant. Accordingly, the lattice can here be modelled as a dielectric continuum, described by high and low frequency dielectric constants \( \varepsilon_{\infty } \), and \( \varepsilon_{\text{s}} \), and, in the simplest case, by a single longitudinal frequency \( \omega_{\text{L}} \).

The electric potential, \( \varPhi \left( r \right) \), in the dielectric continuum at a distance r from a stationary electron of charge e is given by:

$$ \varPhi \left( \varvec{r} \right) = - (1/\varepsilon_{\text{s}} )e/r $$
(5.3.1)

The static dielectric constant \( \varepsilon_{\text{s}} \) includes contributions not only from the displacement of ions from their equilibrium positions but also from their electronic polarizability. [Footnote 13]

In the case of a slowly moving electron, the above considerations indicate that Eq. 5.3.1 holds only at r > d. Replacing r by the distance \( v/\omega_{\text L} \) yields the following net [Footnote 14] energy of interaction, W, between the electron and the polarization field of the lattice, where the factor \( \left( { 1/\varepsilon_{\infty } - 1/\varepsilon_{\text{s}} } \right) \) ensures that only the inertial polarization associated with ‘slow’ ion displacements is included; the polarization associated with deformation of the electron shells of the ions (the resonance frequency of which is far above that which characterises the ion displacements—i.e. \( \omega_{\textrm L} \)) does not influence the energy difference between states of different itinerant electron velocity since, for sufficiently low velocities, it is always excited to its full value:

$$ W \sim - \frac{1}{2} (1 / {\varepsilon}_{\infty} - 1/ {\varepsilon}_{\text{s}}) e^{2} {\omega}_{\textrm L}/ v $$
(5.3.2)

For small velocities, this potential energy term can exceed the electron’s kinetic energy \( \tfrac{1}{2}mv^{2} \), as noted by Fröhlich in the series of lectures [F55] entitled Theoretical Physics in Industry, which he gave at the Royal Institution in 1946, commenting: ‘No detailed study of the influence of polarization on the motion of electrons has yet been made’.

In consequence of the quantum mechanical uncertainty principle, the above classical picture must be modified, however, as follows: associated with the electron’s velocity, v, is an uncertainty in position, Δx, given by ΔxΔp ≥ ħ/2, where p is the electron’s linear momentum (mv); thus Δx ≥ ħ/2mv. This positional uncertainty must not exceed the distance d introduced above, for otherwise there would be no meaning to the idea of ions beyond this distance ‘seeing’ the electron effectively as stationary. For the limiting case \( \varDelta x = d \), i.e. \( \hbar/2mv = v/\omega_{\text{L}} \), one finds \( v = (\hbar \omega_{\text{L}} / 2m)^{1/2} \), whence:

$$ d = d_{\text{o}} \equiv (\hbar / 2m\omega_{\text{L}} )^{1/2} $$
(5.3.3)

and Eq. 5.3.2 must be replaced by:

$$ W\sim - \frac{1}{2} \, (1/\varepsilon_{\infty } - \, 1/\varepsilon_{\text{s}} )e^{2} /d_{\text{o}} $$
(5.3.4)

Thus \( d_{\text{o}} \) is effectively a measure of the ‘size’ that the (original point) electron assumes in the dielectric in consequence of its interaction with the polarization field of the material. Clearly, the validity of the adopted continuum representation of the lattice requires that \( d_{\text{o}} \gg a \), where a is the lattice constant, a condition that is usually fulfilled; in NaCl, for example, where \( a \sim 5.6 \times 10^{-8} \) cm and \( \omega_{\text{L}} = 7 \times 10^{12} \) Hz, \( d_{\text{o}} = 2.8 \times 10^{-7} \) cm. This ‘expanded’ electron constitutes the ‘polaron’, and since, for the consistency of the approach, its size \( d_{\text{o}} \) must exceed the lattice constant, it is called a ‘large’ polaron. At \( r \gg d_{\text{o}} \), the polarization field, which is here proportional to the Coulomb field of the electron, can follow the electron’s peregrinations through the dielectric continuum; this results in an increase in the mass of the electron to a value m**, which is larger than the Bloch effective mass, m*, it would otherwise have.Footnote 15 At \( r < d_{\text{o}} \), however, the polarization is less than the classical value, and the response of the polarization field to the passing electron and, in turn, its self-energy are here governed by the dynamical properties of the polarization field.
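The NaCl estimate can be checked numerically. The following is a minimal sketch, assuming that the quoted value of \( \omega_{\text{L}} \) is to be read as an angular frequency in rad/s and that the free-electron mass may stand in for the band mass; the function name and constant names are illustrative only.

```python
import math

HBAR_ERG_S = 1.054571817e-27   # reduced Planck constant in erg s (CGS units)
M_E_G = 9.1093837e-28          # free-electron mass in g (assumed in place of the band mass)

def polaron_size_cm(omega_l, mass_g=M_E_G):
    """Return d_o = (hbar / 2 m omega_L)^(1/2) in cm, as in Eq. 5.3.3."""
    return math.sqrt(HBAR_ERG_S / (2.0 * mass_g * omega_l))

omega_l_nacl = 7e12    # value quoted in the text, read here as rad/s (assumption)
a_nacl_cm = 5.6e-8     # lattice constant quoted in the text

d_o = polaron_size_cm(omega_l_nacl)
print(f"d_o = {d_o:.2e} cm, d_o / a = {d_o / a_nacl_cm:.1f}")
```

Under these assumptions the result lies close to the quoted \( 2.8 \times 10^{-7} \) cm, i.e. about five lattice constants, so the continuum condition is met, though not by a very wide margin.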

Fig. 5.4 Fröhlich, Pelzer and Zienau’s polaron paper [F72], 1950. Reproduced with the permission of Taylor & Francis

In collaboration with Pelzer and Zienau, Fröhlich (Fig. 5.4) realised that this was the place for a field theory devoid of the singularities that had earlier plagued meson theory:

I decided that electrons in ionic crystals was the place for a singularity-free field theory, and this led to the theory of what is now known as the ‘large’ polaron [F185].

For, as seen from the above semi-classical considerations, the nature of the interaction itself provides a natural cut-off—unlike the situation in the meson-nucleon case, where it had to be imposed artificially, thereby destroying the relativistic invariance of the theory.

The term, \( H_{\text{int}} \), in the Hamiltonian describing the interaction of the slow electron with the polarization field of the lattice is essentially the same as that on which his earlier theory of dielectric breakdown had been based, but appropriately modified in respect of the treatment of the crystal as a dielectric continuum,Footnote 16 characterised by a single longitudinal optic phonon frequency, \( \omega_{\text{L}} \), whilst to ensure that only the ‘inertial’ polarization associated with ion displacements is included, the effective (inverse) dielectric constant \( \left( { 1/\varepsilon_{\infty } - 1/\varepsilon_{\text{s}} } \right) \) of Eq. 5.3.2 is introduced. Equation 4.1.8 is thus modified as follows:

$$ H_{\text{int}} = - 4{\pi } ie \left[\frac{\hbar \omega_{\text{L}}} {8{\pi V}}(1/\varepsilon_{\infty } - 1/\varepsilon_{\text{s}} )\right]^{1/2} \sum\limits_{{\mathbf{w}}} w^{ - 1} (b_{\text{w}} {\text{e}}^{{{\mathbf{iw}} \cdot {\mathbf{r}}}} - b_{\text{w}}^{*} {\text{e}}^{{{\mathbf{ - iw}} \cdot {\mathbf{r}}}} ) $$
(5.3.5)

where \( b_{\mathbf{w}} \) (\( b_{\mathbf{w}}^{*} \)) are related to the polarization, P, by

$$ \varvec{P} = \left[\frac{\hbar \omega_{\text{L}}}{8 \pi V} (1/\varepsilon_{\infty } - 1/ \varepsilon_{\text{s}} )\right]^{1/2} \sum\limits_{{\mathbf{w}}} \frac{\varvec{w}}{w} \left( {{b}_{\text{w}} {\text{e}}^{{{\mathbf{iw}}\cdot{\mathbf{r}}}} + b_{{\mathbf{w}}}^{*}{\text{e}}^{{ - {\mathbf{iw}}\cdot{\mathbf{r}}}} } \right), $$
(5.3.6)

V is the volume of interest, e is the electron’s charge, and r its position vector.

The prefactor of the summation in Eq. 5.3.5 can be rewritten in terms of a dimensionless coupling constant \( \alpha\,( \equiv |W|/\hbar \omega_{\text{L}} ) \) as \( -i\,\hbar \omega_{\text{L}}\, d_{\text{o}}^{\,1/2} (4\pi \alpha /V)^{1/2} \), where \( d_{\text{o}} \) is the ‘size’ of the polaron, as given by Eq. 5.3.3. With Eq. 5.3.4, α is given by

$$ \alpha \equiv \frac {e^{ 2}} {2\hbar} \left( { 1/\varepsilon_{\infty } - { 1}/\varepsilon_{\text{s}} } \right)\left( { \frac{2m^{*}}{\hbar \omega_{\text{L}} }} \right)^{{1/2}} $$
(5.3.7)

where m* is the Bloch effective mass, which may be greater or less than m. α is analogous to the Sommerfeld fine-structure constant in quantum electrodynamics,Footnote 17 except that the magnitude of α is here (via \( \varepsilon_{\infty } ,\varepsilon_{\text{s}} \;{\text{and}}\;\omega_{\text{L}} \)) material dependent, entailing the possibility of both α < 1 and α > 1.
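The rewriting of the prefactor of Eq. 5.3.5 in terms of α can be verified in a few lines. The shorthand \( \kappa \equiv 1/\varepsilon_{\infty } - 1/\varepsilon_{\text{s}} \) is introduced here purely for brevity and is not part of the original notation. Squaring the modulus of the prefactor gives

$$ \left\{ 4\pi e\left[ \frac{\hbar \omega_{\text{L}} }{8\pi V}\kappa \right]^{1/2} \right\}^{2} = \frac{2\pi e^{2} \kappa \,\hbar \omega_{\text{L}} }{V} $$

whilst the definition \( \alpha \equiv |W|/\hbar \omega_{\text{L}} \), taken together with Eq. 5.3.4, gives

$$ \alpha \hbar \omega_{\text{L}} = \frac{1}{2}\kappa e^{2} /d_{\text{o}} \quad \Rightarrow \quad e^{2} \kappa = 2\alpha \hbar \omega_{\text{L}} d_{\text{o}} $$

Substituting the second relation into the first yields

$$ \frac{2\pi e^{2} \kappa \,\hbar \omega_{\text{L}} }{V} = \frac{4\pi \alpha }{V}(\hbar \omega_{\text{L}} )^{2} d_{\text{o}} $$

so that the prefactor is \( -i\,\hbar \omega_{\text{L}}\, d_{\text{o}}^{\,1/2} (4\pi \alpha /V)^{1/2} \), with \( d_{\text{o}} \) entering to the power 1/2, as dimensional consistency also requires.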

The ground-state of the Hamiltonian, H, of the coupled electron-lattice system,

$$ H = p^{2} /2m + H_{\text{latt}} + H_{\text{int}} $$
(5.3.8)

was first studied in the weak coupling limit (α ≪ 1) in [F72] using 2nd order perturbation theory,Footnote 18 in the context of which the polarization induced in the dielectric continuum by the electron and its reaction back on the electron are described, respectively, in terms of the virtual emissionFootnote 19 and re-absorption by the electron of longitudinal polarization quanta; the parallel with the meson-nucleon system treated earlier [F20] is clear.

As anticipated in the above semi-classical considerations, these virtual processes endow the electron with a finite self-energy, \( E_{\text{o}} \), and increase its Bloch effective mass, m*, to the value m**. Assuming the coupling is so weak that there is never more than one virtual quantum excited at any one time, perturbation theory yields the following results:

$$ E_{\text{o}} = - \alpha \hbar \omega_{\text{L}} $$
(5.3.9)
$$ m^{**} /m^{*} = 1/\left( {1 - \alpha /6} \right) $$
(5.3.10)

Unfortunately, however, most polar crystals do not satisfy α ≪ 1; in NaCl, for example, α ≈ 6. Accordingly, it was necessary to develop methods that could deal with the case of stronger coupling, where an arbitrary number of quanta are excited. The first to be tried, by M. GurariFootnote 20 of Fröhlich’s department (Gurari 1953), and independently by others (Lee et al. 1953), was a variational method, the validity of which required that correlations between successively emitted quanta induced by the electron’s recoil could be neglected; this neglect is permissible provided the number of quanta (~½α) is not too large, which imposes an upper limit on α, i.e. the method is geared to the case of so-called ‘intermediate’ coupling. The result for \( E_{\text{o}} \) is here found to be identical to that obtained in the weak coupling limit (but is now no longer restricted to α ≪ 1), whilst the form of m**/m* is identical to that obtained by expanding the weak coupling result, \( 1/(1 - \alpha /6) = 1 + \alpha /6 + \ldots \), to lowest order in α, i.e.

$$ m^{**} = m^{*} \left( {1 + \alpha /6} \right) $$
(5.3.11)

The extreme case of strong coupling (α ≫ 1) was treated by Fröhlich himself [F90] using a variational trial wave-function of a form suggested by the work of PekarFootnote 21 (1949). The resulting expression for the self-energy was found to be:

$$ E_{\rm o} \sim - 0.1\alpha^{2} \hbar \omega_{\text{L}} $$
(5.3.12)

It should be noted that this expression is actually independent of frequency, \( \omega_{\text{L}} \), because of the \( \omega_{\text{L}} \)-dependence of α (Eq. 5.3.7). The associated result for the effective mass was:

$$ m^{**} /m^{*} = (1 + 0.02\alpha^{4} ) $$
(5.3.13)
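The frequency independence of the strong-coupling self-energy, Eq. 5.3.12, can be made explicit by inserting Eq. 5.3.7. With the shorthand \( \kappa \equiv 1/\varepsilon_{\infty } - 1/\varepsilon_{\text{s}} \) (introduced here only for brevity),

$$ \alpha^{2} \hbar \omega_{\text{L}} = \frac{e^{4} \kappa^{2} }{4\hbar^{2} }\left( \frac{2m^{*} }{\hbar \omega_{\text{L}} } \right)\hbar \omega_{\text{L}} = \frac{m^{*} e^{4} \kappa^{2} }{2\hbar^{2} } $$

so that

$$ E_{\text{o}} \sim - 0.1\,\alpha^{2} \hbar \omega_{\text{L}} = - 0.05\,\frac{m^{*} e^{4} \kappa^{2} }{\hbar^{2} } $$

in which \( \omega_{\text{L}} \) no longer appears; the numerical coefficient is only as accurate as the rounded factor 0.1 in Eq. 5.3.12.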

Thus, contrary to Landau’s earlier prediction that an electron becomes self-trapped in the polarization field it itself produces—i.e. ‘digs its own hole’ (Landau 1933)—Fröhlich’s strong-coupling analysis shows that this does not happen: m**/m* is finite, and the electron remains mobile (Fig. 5.5).

Fig. 5.5 The paper ‘Electrons in Lattice Fields’ [F90], 1954. Reproduced with the permission of Taylor & Francis

For NaCl, in which α ≈ 6, Gurari’s result gives the lower self-energy, and is thus the better solution; indeed, as noted by Fröhlich ‘it is not easy to find actual substances for which α is so large as to make [his] the better solution’ [F90].

In the strong coupling limit (α ≫ 1), the number of virtual quanta (~α²) is large, and correlations between successively emitted quanta predominate to such an extent that they result in the formation of a polarization potential well of radius ~\( d_{\text{o}}/\alpha \) around the electron. This entails not only an enhanced effective mass (Eq. 5.3.13), but also an internal structure arising from the possibility of electronic excitation within the well. In this limit, the electron follows the zero-point fluctuations of the polarization field, in contrast with the situation in the weak coupling limit (α ≪ 1), where there is never more than one quantum excited at any one time, and where it is the lattice that follows the motion of the electron. Thus in both limits, in consequence of the recoil of the electron, its point charge is effectively spread over a finite volume, of radius \( d_{\text{o}} \) for weak coupling, and of radius \( d_{\text{o}}/\alpha \) for strong coupling. It is this that removes the necessity of introducing a cut-off, and automatically ensures a non-divergent self-energy. In both limits, consistency with the adopted continuum treatment of the crystalline lattice requires that the distances \( d_{\text{o}} \) and \( d_{\text{o}}/\alpha \) be much larger than the lattice constant, which is why such polarons are often called ‘large’ polarons, as already mentioned in the semi-classical preamble.

Towards the end of [F90], Fröhlich pointed out that although the intermediate- and strong-coupling expressions for \( E_{\text{o}} \) yield the same self-energy near α ≈ 10, they do not match smoothly there, whilst the values of the polaron effective mass, m**, differ by a factor of 100; in other words, near α ≈ 10, neither approach affords a realistic description of the situation,Footnote 22 and he concluded that ‘… it seems desirable to develop a method which leads to a continuous transition between the results of the two methods’. He believed that such a method would be useful not only in dealing with the polaron, but also possibly in connection with the more important problem of superconductivity. In connection with the latter, Fröhlich believed it to be particularly significant that, for α = 6, intermediate-coupling theory (Eq. 5.3.11) predicted an increase in m* by only a factor of 2, and that it would be worthwhile to investigate the extent to which this might continue to be the case when correlations between successively emitted quanta were included in the theory. It is of considerable interest to note, therefore, that Schrieffer told Fröhlich [F185] many years later that the form of the Ansatz for his many-electron extension of Cooper’s pair wave-function, in which all pairs have the same total momentum (vide Sect. 5.4) and which was basic to the eventual solution of the problem by Bardeen et al. (1957), had actually been motivated by the structure of the variational wave-function of the large polaron in the intermediate-coupling regime, as given by Lee, Low and Pines in 1953. An approach to finding superconducting wave-functions starting from the strong-coupling limit, on the other hand, was considered to be less appropriate since in this limit the dynamics of the lattice, which by this time were known to be involved in superconductivity in an essential way, are effectively suppressed, so that there is no isotope effect (vide Sect. 5.4).
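The mismatch between the two approximations can be made concrete by evaluating Eqs. 5.3.9, 5.3.11, 5.3.12 and 5.3.13 side by side. The sketch below works in units of \( \hbar \omega_{\text{L}} \) and uses the rounded coefficients 0.1 and 0.02 exactly as quoted; the function names are illustrative, and the chosen values of α are simply those mentioned in the text.

```python
import math

def energy_intermediate(alpha):
    """Self-energy in units of hbar*omega_L, Eq. 5.3.9 (also the intermediate-coupling result)."""
    return -alpha

def energy_strong(alpha):
    """Strong-coupling self-energy in units of hbar*omega_L, Eq. 5.3.12."""
    return -0.1 * alpha**2

def mass_ratio_intermediate(alpha):
    """m**/m* from intermediate-coupling theory, Eq. 5.3.11."""
    return 1.0 + alpha / 6.0

def mass_ratio_strong(alpha):
    """m**/m* from strong-coupling theory, Eq. 5.3.13."""
    return 1.0 + 0.02 * alpha**4

for alpha in (6.0, 10.0):
    print(
        f"alpha = {alpha:4.1f}: "
        f"E_int = {energy_intermediate(alpha):6.2f}, "
        f"E_strong = {energy_strong(alpha):6.2f}, "
        f"m_int = {mass_ratio_intermediate(alpha):5.2f}, "
        f"m_strong = {mass_ratio_strong(alpha):7.2f}"
    )

# The two energies coincide where alpha = 0.1 * alpha^2, i.e. at alpha = 10.
crossover_alpha = 1.0 / 0.1
```

With these rounded coefficients the energies cross at α = 10, the intermediate-coupling mass enhancement at α = 6 is exactly 2, and at the crossing the two mass ratios differ by several tens, which is the order of magnitude of the discrepancy described above.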

It was Fröhlich’s cryptic remarks about superconductivity at the end of [F90] that attractedFootnote 23 the attention of Feynman, who took up Fröhlich’s challenge to develop a theory that permits the transition from weak to strong coupling to be achieved in a continuous way. This he did by adapting the Lagrangian path-integral (variational) method that he had earlier used successfully in quantum electrodynamics,Footnote 24 which here permitted complete elimination of the phonons (Feynman 1955). The effect of this elimination was to replace the instantaneous form of the electron-phonon interaction given by Fröhlich’s interaction Hamiltonian (vide Eq. 5.4.6) by a retarded interaction in which the electron interacts only with itself, in a way that is inversely proportional to the distance travelled from previous times, bearing with it the memory of its history; i.e. the electron behaves as though it were in a potential resulting from its electrostatic interaction with the average charge density at its previous locations. Unfortunately, however, the associated path integral could not be evaluated in closed form, and was thus approximated by one that could be so evaluated. Basic to the implementation of Feynman’s programme was his modelling of the electron’s interaction with this potential as an electron bound harmonically to a fictitious particle of finite mass that now mimicked the essentials of the electron’s interaction with the lattice polarization.

In the extreme weak and strong coupling limits (α → 0, α → ∞), the following expressions were obtained for the self-energies:

$$ E_{\rm o} \le - \hbar \omega_{\rm L} [\alpha + 0.0123\,\alpha^{2} + \text{O} (\alpha^{3} )],\quad \alpha \to 0 $$
(5.3.14)
$$ E_{\rm o} \le - \hbar \omega_{\rm L} [\alpha^{2} /3\pi + 3\ln 2 + 3/4 + \text{O} (1/\alpha^{2} )],\quad \alpha \to \infty $$
(5.3.15)

which, as he commented, are ‘at least as accurate as previously known results’ (Feynman 1955), namely those given by the earlier weak and strong coupling theories (Eqs. 5.3.9 and 5.3.12). In the intermediate coupling regime, however, numerical methods had to be used to evaluate the integrals yielded by Feynman’s approach. This was done later by Feynman’s collaborator, T.D. Schultz, for α = 3, 5, 7, 9, 11 (Schultz 1959), and revealed the superiority of Feynman’s self-energy values over the entire range of coupling.
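Feynman’s remark can be illustrated by comparing the leading terms of Eqs. 5.3.14 and 5.3.15 with the earlier results, Eqs. 5.3.9 and 5.3.12. The sketch below drops the unspecified higher-order terms, so it is meaningful only well inside each limit; energies are in units of \( \hbar \omega_{\text{L}} \), and the function names and sample values of α are illustrative.

```python
import math

def feynman_weak(alpha):
    """Upper bound of Eq. 5.3.14 with terms of order alpha^3 dropped."""
    return -(alpha + 0.0123 * alpha**2)

def feynman_strong(alpha):
    """Upper bound of Eq. 5.3.15 with terms of order 1/alpha^2 dropped."""
    return -(alpha**2 / (3.0 * math.pi) + 3.0 * math.log(2.0) + 0.75)

def earlier_weak(alpha):
    """Perturbation-theory result, Eq. 5.3.9."""
    return -alpha

def earlier_strong(alpha):
    """Strong-coupling result with the rounded coefficient of Eq. 5.3.12."""
    return -0.1 * alpha**2

print(f"alpha = 1 : Feynman {feynman_weak(1.0):.4f}, earlier {earlier_weak(1.0):.4f}")
print(f"alpha = 20: Feynman {feynman_strong(20.0):.2f}, earlier {earlier_strong(20.0):.2f}")
```

Since a variational upper bound is better the lower it lies, the more negative Feynman values in both limits are consistent with his comment. Note also that \( 1/3\pi \approx 0.106 \), in line with the coefficient 0.1 of Eq. 5.3.12.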

The final section of his paper was devoted to consideration of the effective mass, m**, of the polaron. Here, however, Feynman was unable to find an appropriate extension of his variational principle, which minimised the energy for finite polaron momentum, and which at the same time conserved momentum, and had to content himself with a non-rigorous treatment of the bound two-body system in terms of which he modelled the polaron; for low velocities, V, the self-energy was found to be augmented by a kinetic contribution \( \tfrac{1}{2}m^{**}V^{2} \), via which m** was defined. As in the case of the self-energy, an expression for m**/m* was obtained, which varied continuously with the coupling constant, α; since it was not based on a variational principle, however, Feynman admitted that it was difficult to judge the accuracy of the derived values, especially for large α (i.e. strong coupling).

How Feynman came to get involved with polaron theory is an amusing story that is told in a letter he sent to Fröhlich, which is reproduced, together with the latter’s reply, in Appendix 1 at the end of this chapter. Apart from the possibility of adapting his approach to the polaron problem to the nucleon-meson case (which proved to be possible only for the unrealistic case of spinless nucleons and mesons (Mano 1955)), Feynman’s real interest was, as just mentioned, in how his treatment of the polaron problem might suggest a strategy whereby the problem of superconductivity might be attacked; note the penultimate paragraph of his letter wherein he asks: ‘What do we have to do to understand superconductivity?’

A valuable review of polaron calculational methods by G.R. Allcock, also of Fröhlich’s department, appeared in 1956, in which Feynman’s solution and other contemporary approaches to Fröhlich’s challenge were considered in detail (Allcock 1956).

Powerful as Feynman’s approach was in permitting the transition from weak to strong coupling to be treated in a continuous way, its utility was physically undermined by the progressive breakdown, with increasing α, in the validity of the underlying treatment of the ionic lattice as a dielectric continuum: a new approach was called for. In the West,Footnote 25 this was again initiated by Fröhlich in 1957, this time from considerations of Debye dielectric loss in ionic solids associated with trapped electrons [F101, see also F114, F118]. The naive interpretation of the observed loss is to assume that an electron can sit on positive ions in the neighbourhood of a trapping centre (effectively positively charged) thus forming an electric dipole with a number of possible directions of equal energy between which it can make transitions, thus giving rise to the loss. He pointed out that this interpretation is flawed because the various possible sites of the electron will combine to form various quantum states of which the ground-state is, in general, non-degenerate, thus making Debye loss impossible. He noted, however, that the naive interpretation can be retained if the energy levels of these electronic states have, at temperature T, a spread considerably less than kT (for then they could be superposed so as again to form localised states in a quasi-classical way); this energy spread is governed by the overlap of the electronic wave-functions on neighbouring ions. Fröhlich now crucially realised that when the displacements of the ions in the vicinity of the positive ion on which an electron is assumed to be localised are taken into account, the overlap is reduced from the value (≫kT) appropriate to a rigid lattice by a factor \( \exp ( - \varDelta^{2} /x_{\text{o}}^{2} ) \), where Δ is the ion displacement and \( x_{\text{o}} \) its zero-point amplitude; this factor is a very sensitive function of \( \varDelta^{2} /x_{\text{o}}^{2} \), and can take very small values, thus reducing the overlap to less than kT.
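The sensitivity of the reduction factor to the ratio of the ion displacement to its zero-point amplitude is easily seen numerically. The sketch below simply tabulates \( \exp ( - \varDelta^{2} /x_{\text{o}}^{2} ) \); the sample ratios are hypothetical and are not taken from Fröhlich’s papers.

```python
import math

def overlap_reduction(delta_over_x0):
    """Return exp(-Delta^2 / x_o^2) for a given ratio Delta / x_o."""
    return math.exp(-(delta_over_x0**2))

# Hypothetical ratios of ion displacement to zero-point amplitude.
for ratio in (1.0, 2.0, 3.0, 4.0):
    print(f"Delta/x_o = {ratio:.0f}: reduction factor = {overlap_reduction(ratio):.1e}")
```

A modest change in the ratio, from 1 to 4, already lowers the factor by more than six orders of magnitude, which is how an overlap much larger than kT in the rigid lattice can fall below kT.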

These ideas were subsequently developed by G.L. Sewell (1958), for the case of a diatomic polar lattice, into what is now known as ‘small’ polaron theory, in which the interaction of an electron with the ionic displacements is again built in from the start. The electron is treated in the tight-binding approximation (as is appropriate in the case of narrow bands, such as d-bands, for example), the eigenstates of the system being constructed as linear combinations of localised states in which the electron is bound to one of the positive ions, into which are now included the displacement of neighbouring ions caused by the electron.Footnote 26 The energy levels corresponding to these eigenstates form a band, the so-called ‘small’ polaron band, whose width is determined by the overlap integral governing the transfer of the electron, together with its accompanying lattice displacements, between neighbouring positive ions. These accompanying lattice displacements make the motion of the electron more sluggish, which is reflected in an exponential increase in the (rigid lattice) Bloch effective mass, m*, to the small polaron effective mass, m**, or, equivalently, in an exponential decrease in the Bloch bandwidth, \( W_{\text{o}} \), to the small polaron bandwidth, W, given by:

$$ m^{**} = m^{*} \exp (\gamma ) $$
(5.3.16)
$$ W = W_{\text{o}} \exp ( - \gamma ) $$
(5.3.17)

The factors \( { \exp }( \pm \gamma ) \) arise as a result of the transfer from ion to ion of the lattice displacements that accompany the electron; γ can be written as:

$$ \gamma = f\left( T \right)\gamma^{\text{o}} $$
(5.3.18)

where f(T) is an increasing function of temperature, reflecting the fact that the random thermal motion of the ions opposes the transfer of their displacements, and \( \gamma^{\text{o}} \) is the value of γ at T = 0 K; for the simple model used, which is based on localised s-states and an optic mode frequency that is independent of wavelength, \( \gamma^{\text{o}} \) is typically about 15, when screening effects arising from the electronic polarizability of the ions are taken into account in terms of the high-frequency dielectric constant \( \varepsilon_{\infty } \).
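To give a feeling for the size of the effect described by Eqs. 5.3.16 to 5.3.18, the sketch below evaluates the narrowing factor exp(−γ). The text does not specify f(T); the form coth(ħω/2kT) used here is an assumption, chosen only because it increases with temperature and equals 1 at T = 0 K, and the phonon energy of 0.03 eV is likewise a hypothetical input.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K

def gamma(temp_k, gamma0=15.0, phonon_energy_ev=0.03):
    """gamma = f(T) * gamma0, with the ASSUMED form f(T) = coth(hbar*omega / 2kT)."""
    if temp_k <= 0.0:
        return gamma0   # f(0) = 1
    return gamma0 / math.tanh(phonon_energy_ev / (2.0 * K_B_EV * temp_k))

def narrowing_factor(temp_k, gamma0=15.0, phonon_energy_ev=0.03):
    """W / W_o = m* / m** = exp(-gamma), from Eqs. 5.3.16 and 5.3.17."""
    return math.exp(-gamma(temp_k, gamma0, phonon_energy_ev))

for temp_k in (0.0, 100.0, 300.0):
    print(f"T = {temp_k:5.1f} K: W/W_o = {narrowing_factor(temp_k):.1e}")
```

Even at T = 0 K, the quoted value of about 15 for the zero-temperature exponent corresponds to a band narrower than the rigid-lattice band by a factor of order \( 10^{6} \) to \( 10^{7} \), and under the assumed f(T) the narrowing becomes more severe as the temperature rises.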

The width of the small polaron band thus decreases with increasing temperature, and once it becomes less than kT extended states can no longer be defined. The electron then becomes localized in a self-trapped state, as originally envisaged by Landau (1933), but now with the important difference that this occurs only at finite temperatures. Since this localized state is stabilised by displacement of the surrounding ions, movement of the electron to a neighbouring site by tunnelling is possible only if the same lattice displacements can somehow be realised there so that this site is rendered degenerate with that at which the electron is initially localised. This can be achieved with the help of thermal vibrations (in particular, those belonging to the acoustic mode), and such thermally assisted motion (a multi-phonon process) is known as ‘hopping’ or ‘jumping’. This kind of transport is characterised by an extremely low mobility, μ, that is activated, i.e.

$$ \mu = \mu_\text{o} e^{ - E_{a}/kT} , $$
(5.3.19)

where the activation energy, \( E_{a} \), is the energy required to recreate at the neighbouring site the lattice deformation that traps the electron initially.
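The activated form of Eq. 5.3.19 implies that ln μ falls linearly with 1/T, with slope \( -E_{a}/k \). A minimal sketch follows; the prefactor and the activation energy of 0.2 eV are hypothetical values chosen only for illustration.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K

def hopping_mobility(temp_k, mu0, e_a_ev):
    """Activated mobility mu = mu0 * exp(-E_a / kT), Eq. 5.3.19."""
    return mu0 * math.exp(-e_a_ev / (K_B_EV * temp_k))

mu0 = 1.0      # hypothetical prefactor, arbitrary units
e_a_ev = 0.2   # hypothetical activation energy in eV

for temp_k in (300.0, 600.0, 900.0):
    print(f"T = {temp_k:5.1f} K: mu/mu0 = {hopping_mobility(temp_k, mu0, e_a_ev):.2e}")
```

In contrast to band conduction limited by phonon scattering, the mobility here rises steeply with temperature, which is the experimental signature usually looked for.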

It must be stressed that because of certain simplifying assumptions made (such as the band being based on localised orbitals that are s-states), the results are not quantitatively applicable to real narrow band materials in which the orbitals of relevance (such as d- or f-states) are highly anisotropic and in general degenerate; qualitatively, however, the band narrowing predicted on the simple model used can still be anticipated. The subject of the hopping mobility of small polarons was subsequently investigated in more detail by Lang and Firsov (1963) in a form that can be applied to real materials, such as the technologically important nuclear fuel \( \text{UO}_{2+x} \), in which the carriers are (5f) holes in the Mott-insulating ground-state of the stoichiometric material (Casado et al. 1994).

The implications of the uncertainty principle for the mobility of carriers in narrow bands, such as those associated with small polarons, and, in particular, for its temperature dependence, were the subject of a short note with Sewell in 1959 [F104]; it was estimated that the transition to hopping occurs at mobilities less than about 0.1 \( \text{cm}^{2}/\text{V s} \). Sewell revisited the problem in a more extensive article published a few years later (Sewell 1963), as did Fröhlich himself in his contribution to the Festschrift for G. Busch [F135] in which he drew attention to some frequently occurring mistakes in the (then) current literature on small polaron transport. It should be noted that Fröhlich’s initial work in this area predated Holstein’s formulation of small polarons for the case of a molecular lattice where the electron-lattice interaction is of short range (Holstein 1959), which became better known than Sewell’s treatment for a polar lattice, published the previous year.

Somewhat later, Fröhlich developed a better way of proceeding in the case of low mobility, narrow band materials by constructing wave-functions in which the electron is based on the oscillating position of a positive ion, rather than on a (fixed) lattice point, as had been done previously, since, in this way, the electron’s movement is facilitated [F117, 118]. By following the ion’s oscillation, not only is a considerable amount of the electron-ion interaction already included in lowest order—thereby leaving a relatively small residual electron-phonon interaction to be dealt with—but also the necessity of including inter-band transitions is avoided [F125].

Out of this evolved an associated modification of the usual tight-binding method in which an extended Bloch state is based on a superposition of localised wave-functions. This was elaborated in considerable detail by one of Fröhlich’s doctoral students, T.K. Mitra , who showed, using localised wave-functions of the above kind in which the electron follows the ion motion adiabatically, that the electron-phonon matrix element governing intra-band transitions is proportional to the bandwidth, and is thus small in the case of narrow bands (Mitra 1969, 1978). The implications of this finding for superconductivity in materials with incomplete narrow energy bands will be considered in the next chapter in Sect. 6.1.

5.4 Theory of Superconductivity, the Introduction of Quantum Field Theory into Solid-State Physics, and Marriage

Important as polaron theory was in its own right, it was, for Fröhlich, simply a testing ground for the non-relativistic application of field-theoretical methods, prior to bringing them to bear on the problem of superconductivity in metals, where, of course, many electrons have to be considered, and where the relevant lattice vibrations are acoustic, rather than optical. It must be remembered that, in 1950, superconductivity was the central problem in solid-state physics, a problem that had continued to defy solution since the advent of quantum theory, despite the efforts of the best minds, including several Nobel Laureates, such as Einstein, von Laue and Heisenberg. At that time, it was generally considered that the phenomenon involved some yet-to-be-discovered, novel collective feature of the Coulomb interaction between the conduction electrons. Such a consideration was not unreasonable, given that the success of the free-electron model of metals, in which the Coulomb interaction between electrons is neglected, was not then understood. Indeed, it did not become so until the work of Bohm and Pines, which showed that the Coulomb interaction between electrons can be divided into two parts: a long-range part, whose effect is described by longitudinal plasma oscillations, and a residual short-range ‘screened’ interaction, whose range is about 1 Å (Bohm and Pines 1953). Owing to the large excitation energy of the plasma oscillations, the long-range part can be ignored in many calculations, which explains the success of the free electron model in the case of manyFootnote 27 metals.

To Fröhlich, however, an approach to the problem of superconductivity in terms of the Coulomb interaction seemed ill-founded, given the extremely small experimental energy difference, δE, between the superconducting and normal states, of the order of \( 10^{-3} \) to \( 10^{-4} \) eV per atom, which is minute in comparison with typical Coulomb energies, which are of the order of 1 eV. Accordingly, he formed the opinion that the main problem in developing a theory of the superconducting state was first to find an interaction of the correct magnitude. The fact that such a dramatic effect as superconductivity is characterised by such a small energy difference is surely indicative that the difference between the normal and superconductive states is a highly subtle one, and that any understanding of it must be expected to require considerable ingenuity. In this connection, the small energy \( ms^{2} \) (where m is the mass of an electron and s the speed of sound in a metal), which he knew [F81] played a role in some aspects of the theory of the electrical conductivity of normal metals, suggested itself in consequence of its small size, of the order of \( 10^{-5} \) eV. The involvement of the speed of sound focussed attention on the dynamics of the metal lattice, since the speed of sound is proportional to \( M^{-1/2} \), where M is the mass of an ion; furthermore, \( ms^{2} \) is the product of an electron parameter (its mass, m) and a lattice characteristic (the speed of sound, s), suggesting an underlying electron-lattice interaction, a suggestion that is consistent with the empirical finding that it is poor conductors that become superconductors. For from the work of Bloch, it was known that electrical resistivity is due to the scattering of conduction electrons by thermally excited lattice vibrations, or phonons, i.e. is due to the electron-phonon interaction. Field-theoretically, this scattering expresses itself as absorption/emission of such real phonons by the electron, which give non-zero contributions in 1st order of perturbation theory.
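The smallness of the energy \( ms^{2} \) can be checked directly. The sketch below assumes sound speeds of a few km/s, which are illustrative round numbers rather than values taken from Fröhlich’s papers, and uses the free-electron mass.

```python
M_E_KG = 9.1093837e-31     # free-electron mass in kg
EV_IN_J = 1.602176634e-19  # one electron volt in joules

def ms2_ev(sound_speed_m_s):
    """Return the energy m * s^2 in eV for a given speed of sound in m/s."""
    return M_E_KG * sound_speed_m_s**2 / EV_IN_J

# Illustrative sound speeds (assumed values), in m/s.
for speed in (2000.0, 3000.0, 5000.0):
    print(f"s = {speed:6.0f} m/s: m s^2 = {ms2_ev(speed):.1e} eV")
```

For these inputs the energy comes out in the range of roughly \( 10^{-5} \) to \( 10^{-4} \) eV, several orders of magnitude below typical Coulomb energies, which is the point of the argument.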

From his experience with field theoretical methods, both relativistic and non-relativistic, Fröhlich realised that the same interaction entails, in addition, other possibilities connected with the virtual Footnote 28 emission and absorption by an electron of phonons. These processes arise in 2nd order perturbation theory, and, in the case of a single electron, had been the basis of polaron theory in the weak coupling limit. Similarly, in the case of a metal at absolute zero where there are no thermal lattice vibrations (real phonons) to scatter an electron, the Coulomb field of an electron can again itself cause a dynamic disturbance in the lattice by attracting positive ions in its vicinity, thereby creating a local increase in the density of positive charge. This, in turn, reacts back on the electron, lowering its energy—i.e. the electron acquires a polaron-like self-energy . In addition, however, and in contrast to the polaron case, there is here another effect arising from an influence of the lattice distortion on the motion of other conduction electrons in its vicinity—i.e. there is a dynamic interaction between electrons, which is transmitted by the dynamics of the lattice. Fröhlich’s novel hypothesis was that it is this interaction that perhaps underlies the superconducting state—an hypothesis based on his realisation that Bloch’s theory of electrical conductivity (Bloch 1928) was simply one aspect of a field theory; another, hitherto unrecognised complementary aspect was that the same electron-lattice interaction entails a novel interaction between electrons, mediated by the zero point dynamics of the lattice.

Given the success of the free electron model in describing the properties of the non-superconducting state of a metal, Fröhlich assumed that this continues to be the case as far as the direct Coulomb interaction between electrons is concerned, and thus focussed his attention solely on the electron-phonon interaction.

His first attempt to study this interaction used perturbation theory to calculate the associated energy change, using Bloch’s matrix elements appropriate to absolute zero where there are only zero-point phonons. To first order, there is no energy change on account of the absence of thermal phonons, the first non-zero contribution, ΔE, arising in second order from virtual processes of emission and absorption of zero-point phonons by the electronsFootnote 29:

$$ \Delta E = - 2\sum\limits_{\text{k}} {\sum\limits_{\text{w}} {\frac{{|M_{\text{w}} |^{2} f_{\text{k}} \left( {1 - f_{\text{q}} } \right)}}{{\varepsilon_{\text{q}} - \varepsilon_{\text{k}} + \hbar sw}}} } $$
(5.4.1)

where \( M_{\text{w}} \) is the matrix element for emission of a vibrational quantum of wave-vector w by an electron of wave-vector k, which (to conserve momentum) then makes a transition to an intermediate state with wave-vector q (= k − w) from which, on re-absorbing the quantum, it returns to the original state. To satisfy the Pauli principle, the contribution of each such transition must be proportional to the probability \( f_{\text{k}} \) (≤1) that the state k is occupied, and to the probability \( (1 - f_{\text{q}}) \) that the state q is empty. At T = 0 K, where there are no thermal (real) phonons, \( |M_{\text{w}} |^{2} \) is given by Bloch (1928):

$$ |M_{\text{w}} |^{2} = 4C^{2} \hbar sw/9nVMs^{2} $$
(5.4.2)

where V is the volume, M is the ion mass, n is the number of ions per unit volume, s is the speed of sound, and C is an interaction constant having the dimension of energy (of the order of 10 eV, similar in magnitude to the Fermi energy, \( E_{\text{F}} \)). It was found convenient to introduce an associated dimensionless constant, F, of the order of magnitude of unity, defined by:

$$ F = C^{ 2} / 3E_{\text{F}} Ms^{ 2} $$
(5.4.3)
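As a rough order-of-magnitude check on Eq. 5.4.3, the following Python sketch evaluates F for one set of assumed inputs. Only the scale C ≈ 10 eV comes from the text; the Fermi energy, ion mass and sound speed used here are illustrative assumptions, and the function name `coupling_constant` is hypothetical.

```python
# Order-of-magnitude evaluation of F = C^2 / (3 E_F M s^2), Eq. 5.4.3.
# The numerical inputs are illustrative assumptions, not values from the text
# (apart from C ~ 10 eV, which the text quotes as an order of magnitude).

EV = 1.602176634e-19      # joules per electron-volt
AMU = 1.66053906660e-27   # kilograms per atomic mass unit


def coupling_constant(c_ev, fermi_energy_ev, ion_mass_amu, sound_speed_m_s):
    """Return the dimensionless constant F of Eq. 5.4.3."""
    ms2_ev = ion_mass_amu * AMU * sound_speed_m_s ** 2 / EV  # M s^2 in eV
    return c_ev ** 2 / (3.0 * fermi_energy_ev * ms2_ev)


F = coupling_constant(c_ev=10.0, fermi_energy_ev=10.0,
                      ion_mass_amu=100.0, sound_speed_m_s=3000.0)
print(f"F = {F:.2f}")
```

With these assumed inputs F comes out below, but of the order of, unity, in line with the statement in the text.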

Equation 5.4.1 contains the two effects described in the above semi-classical picture, namely a polaron-like self-energy, \( \Delta E_{1} \), arising from the term linear in f, whilst the term bilinear in f can formally be interpreted as an interaction between two points in momentum space occupied with densities \( f_{\text{k}} \) and \( f_{\text{q}} \) (Fig. 5.6).

Fig. 5.6 Fröhlich’s first paper on superconductivity. Reprinted with permission from H. Fröhlich, Physical Review Vol. 79, 845 (1950). Copyright 1950 by the American Physical Society

It should be appreciated, however, that perturbation theory treats the electron-phonon interaction as instantaneous, which it is not. Since the exchanged phonon moves with the speed of sound, s, which is much less than the Fermi velocity, \( v_{\text{F}} \), of a conduction electron, the phonons trail behind the electrons: thus a phonon absorbed by an electron at position r at time t will have been emitted by another electron at a different location, r′, at the earlier time t′, given by:

$$ t^{\prime} = t\,{-}\,\left| {\varvec{r^{\prime}} -\, \varvec{r}} \right|/s $$
(5.4.4)

The interaction thus depends not only on the inter-electronic separation \( \left| {\varvec{r^{\prime}} - \varvec{r}} \right| \), but also on \( \left( {t - t^{\prime}} \right) \)—i.e. it is a retarded interaction.Footnote 30
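To give a feeling for the size of the retardation in Eq. 5.4.4, the Python sketch below computes the emission time for one pair of positions and compares the delay with the distance an electron at the Fermi velocity covers in that time. The sound speed, Fermi velocity and separation are assumed round numbers chosen only for illustration, and `emission_time` is a hypothetical helper name.

```python
import math


def emission_time(t, r, r_prime, s):
    """Eq. 5.4.4: emission time t' of a phonon absorbed at position r at time t."""
    return t - math.dist(r_prime, r) / s


# Assumed illustrative values (SI units); none of them is taken from the text.
SOUND_SPEED = 3.0e3       # s, metres per second
FERMI_VELOCITY = 1.0e6    # v_F, metres per second
r = (0.0, 0.0, 0.0)
r_prime = (1.0e-8, 0.0, 0.0)

separation = math.dist(r_prime, r)
delay = 0.0 - emission_time(0.0, r, r_prime, SOUND_SPEED)
electron_path = FERMI_VELOCITY * delay  # distance an electron covers during the delay

print(f"delay = {delay:.2e} s")
print(f"electron path / separation = {electron_path / separation:.0f}")
```

The ratio printed in the last line is simply the ratio of the Fermi velocity to the speed of sound, which is why the phonon lags so far behind the electrons.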

Whilst the parallel with quantum electrodynamics is clear, the two cases differ in that, in vacuo, the latter yields an effectively instantaneous interactionFootnote 31 that is always repulsive, unlike the present case, where the algebraic form of the energy change due to the phonon-mediated electron-electron interaction given by second-order perturbation theory indicates that the interaction is actually attractive only for electrons whose energy difference is less than the energy of the exchanged lattice quantum, under which condition the lattice follows the electrons. The attractive interaction is thus of a highly dynamic nature, depending on the momentaFootnote 32 of the two electrons and on the ability of the lattice to dynamically respond to their Coulomb field. For electrons whose energy differs by more than the energy of a lattice quantum, which is the case for the majority of electrons, the interaction is repulsive, and the electrons follow the lattice almost instantaneously, as expected intuitively in consequence of their low mass compared with that of an ion, m ≪ M.

The effect of this interaction, Fröhlich noted, is to shift electrons from the top of the usual spherically symmetric Fermi distribution, \( f_{\text{o}} \), appropriate in the case of no interaction, to higher energies, resulting in a new distribution \( f_{1} \), which he obtained from \( f_{\text{o}} \) by displacing a thin concentric shell of width ms (containing a relatively small number of electrons) from the region of the Fermi surface of \( f_{\text{o}} \), thereby creating a gap between two occupied regions of momentum space: the higher energy shell and the inner Fermi sea. This shift of a fraction of the electrons to higher energies is, however, opposed by the increase in kinetic energy that must accompany such a shift. Taking this into account, it was found that \( f_{1} \) yields the lower energy provided the dimensionless coupling constant, F, exceeds a certain critical value \( F_{\text{o}} \) that depends on the number of electrons per ion.

It is possible to express this criterion for the occurrence of superconductivity in terms of the resistivity at 273 K, and it was found to be obeyed by a range of superconductors, and not to be obeyed by many metals that do not become superconductors. Thus, for \( F > F_{\text{o}} \), the usual Fermi distribution \( f_{\text{o}} \) becomes unstable, the shell distribution \( f_{1} \) yielding a lower energy. This requirement of strong electron-lattice interaction is, of course, consistent with the empirical fact that the metals that exhibit a transition to the superconducting state are those with poor electrical conductivity, arising from the electrons being strongly scattered by thermal phonons.

The energy difference \( \delta E \equiv \varDelta E\left( {f_{\text{o}} } \right) - \varDelta E\left( {f_{ 1} } \right) \) between the two distributions (or equivalently, the energy difference between the normal and superconducting states at absolute zero) was found, using Eq. 5.4.1, to be proportional to \( Fms^{2} \); i.e. the difference increases linearly (via F) with the strength of the electron-lattice interaction, and involves the small energy \( ms^{2} \), with which Fröhlich began his deliberations. Experimentally, δE is given by

$$ \delta E = H_{\text{c}}^{2} / 8\uppi $$
(5.4.5)

where \( H_{\text{c}} \) is the critical magnetic field above which a superconductor reverts to its normal metallic state. Despite the calculated values being larger than typical experimental values by a factor of between 10 and 100, and the very severe difficulties known to be associated with the use of perturbation theory, Fröhlich was convinced that his phonon-mediated electron-electron attraction was the key to understanding the phenomenon of superconductivity, and tentatively identified the shell distribution, \( f_{1} \), with the superconducting state. To establish this convincingly, it was necessary to show that \( f_{1} \) leads to the correct thermal and electromagnetic properties of superconductors. However, apart from showing that a finite energy was necessary to deform the \( f_{1} \) distribution (which thus exhibits a certain rigidity reminiscent of that which London had shown phenomenologically to lead to the Meissner effect (London 1950)), his own efforts [F76, F78] based on the \( f_{1} \) distribution were not otherwise successful, and his identification of \( f_{1} \) with the superconducting state proved somewhat premature. Notwithstanding this, his identification of a phonon-mediated electron-electron interaction proved to be the key that broke the deadlock in understanding superconductivity, as will become apparent.
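The experimental side of this comparison, Eq. 5.4.5, is easy to evaluate. The Python sketch below does so in Gaussian units for an assumed critical field of 300 gauss; that value is chosen only as a round illustrative number, and the function name is hypothetical.

```python
import math


def condensation_energy_density(h_c_gauss):
    """Eq. 5.4.5 in Gaussian units: a field in gauss gives an energy density in erg/cm^3."""
    return h_c_gauss ** 2 / (8.0 * math.pi)


# Assumed illustrative critical field; not a value quoted in the text.
H_C_ASSUMED = 300.0  # gauss
delta_e = condensation_energy_density(H_C_ASSUMED)
print(f"delta E = {delta_e:.0f} erg per cubic centimetre")
```

For a field of this size the energy difference between the normal and superconducting states is a few thousand erg per cubic centimetre, which gives a sense of how small a quantity the theory was being asked to reproduce.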

Fröhlich finalised his initial calculations during a visit to Purdue where he spent the Spring Semester of 1950, and he submitted his paper [F76] to the Physical Review on 16 May 1950, before leaving. The referee was Bardeen, who immediately recommended its publication. Fröhlich then spent a few days at Princeton, and it was there, at his breakfast table one morning, that he opened the current copy of Physical Review, dated 15 May, to find two Letters (Maxwell 1950; Reynolds et al. 1950), independently reporting that for mercury the superconducting transition temperature, \( T_{\text{c}} \), depended on the isotopic mass, M, in such a way that the product \( T_{\text{c}} M^{1/2} \) is approximately constant:

Upon checking, I found my mass-dependence confirmed, and on 19 May sent a letter [F77] to the Proceedings of the Physical Society to claim confirmation of the basic idea—the electron-phonon interaction [F185] (Fig. 5.7).

Fig. 5.7 Letter concerning the isotope effect [F77], 1950. Reproduced with the permission of the Institute of Physics

This result was effectively already contained in Eq. 6.9 of [F76] (which appeared on 1 September 1950), namely \( H_{\text{c}}^{2} \sim m/M \), although the isotopic implications were not noted initially, but only later in a Note added in proof, in connection with the discovery that \( T_{\text{c}} M^{1/2} \) is approximately constant, a result that follows from \( H_{\text{c}} \propto M^{ -{1/2}} \) by applying the law of corresponding states. It should be emphasised that not only did the theory predict an isotope effect, but that the actual M-dependence was precisely as found experimentally.Footnote 33
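The scaling implied by these relations can be made concrete with a short Python sketch. It assumes only that \( T_{\text{c}} M^{1/2} \) and \( H_{\text{c}} M^{1/2} \) stay constant between isotopes; the reference temperature and the two isotopic masses are hypothetical round numbers, not measured values from the cited Letters.

```python
import math


def scaled_transition_temperature(tc_ref, mass_ref, mass_new):
    """Return T_c at mass_new, assuming T_c * M**0.5 is constant."""
    return tc_ref * math.sqrt(mass_ref / mass_new)


def scaled_critical_field(hc_ref, mass_ref, mass_new):
    """Return H_c at mass_new, assuming H_c is proportional to M**-0.5."""
    return hc_ref * math.sqrt(mass_ref / mass_new)


# Hypothetical inputs chosen for illustration only.
TC_REF = 4.20      # kelvin, assumed reference transition temperature
MASS_REF = 199.0   # assumed lighter isotopic mass (atomic mass units)
MASS_NEW = 204.0   # assumed heavier isotopic mass (atomic mass units)

tc_new = scaled_transition_temperature(TC_REF, MASS_REF, MASS_NEW)
print(f"T_c shifts from {TC_REF:.3f} K to {tc_new:.3f} K")
```

A mass change of a few per cent therefore shifts the transition temperature by only about one per cent, which indicates the precision the isotope measurements required.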

It is amusing to note that one of the discoverers of the isotope effect, E. Maxwell, told Fröhlich [F185] that, whilst still a young research worker, he was explicitly forbidden to investigate the isotopic dependence of the superconducting transition temperature, on the grounds that any suggestion of ever obtaining a positive result was laughable, in keeping with the prevailing belief that the ions, because of their large mass, could not possibly play any role in the phenomenon of superconductivity. Not being so dissuaded, however, he worked secretly by night, and found the effect.

It must be stressed that, contrary to accounts in many textbooksFootnote 34 and elsewhere, Fröhlich’s work predated the publication of the positive results of the isotope investigations: it was not motivated by them, unlike the contemporaneous work of Bardeen (1950). Although the isotope results had been announced, prior to their publication, at a meeting sponsored by the US Office of Naval Research in March, Bardeen was ignorant of them until early in May 1950, when he received a telephone call from Serin (one of the discoverers) telling him about his results. ‘I immediately thought that the electron-phonon interaction must be involved and attempted to construct a theory on this basis’ (Bardeen 1973b, p. 64).

The importance of Fröhlich’s realisation of a phonon-mediated electron-electron interaction, and his conviction that this interaction was basic to the understanding of superconductivity, cannot be overestimated. More than half a century later, it is now difficult to appreciate just how avant garde his claim appeared at the time, when the received wisdom ‘knew’, not only that electrons repel one another, but also that the ions, in consequence of their large mass, could play no role in the phenomenon of superconductivity. From his experience with the polaron, however, Fröhlich suspected that this was not necessarily the case, and his thesis proved crazy enough to be correct. Its gradual acceptance was due not so much to the vindicating discovery of the isotope effect, but rather to the fact that a new type of electronic interaction had been found within the existing semi-empirical free electron model, without having to admit any new ad hoc hypothesis; this was the view of Pauli, for example, who subsequently directed his pupil Schafroth towards superconductivity.

Typical of the desperate situation with respect to superconductivity, which existed prior to Fröhlich’s work, were Pauli’s dictum: ‘theories of superconductivity are wrong’, and Felix Bloch’s claim: ‘theories of superconductivity can be disproved’ [F109]. In this connection, however, the following quotation from a letter from Fritz London to Laszlo Tisza (dated 2 November 1950) is of interest:

I was in some correspondence with Fröhlich. I am quite sure that his new interaction is the thing needed for superconductivity, but his first attempt to devise the electrodynamics seemed to me not quite to the point. Just today he writes to me that he has got it, and I am convinced that he, if anybody, is the man to do it.

Whilst in Princeton, Fröhlich visited Bell Labs where Bardeen was then working, and they compared their approaches to the problem of superconductivity:

Although our approaches were different, mine using a variational method and his perturbation theory, both theories were based on the self-energy of the electrons in the phonon field rather than a true interaction between electrons (Bardeen 1973b).

At Bell Labs, Fröhlich also met Shockley, with whom he discussed his 1947 work [F60, F61] on dielectric breakdown, which was based on the introduction of an electronic temperature that is higher than that of the lattice; the following year, Shockley coined the expression ‘hot’ electrons (Shockley 1951) to describe this situation (vide Sect. 4.7). Around this time, and probably not unconnected with his contributions to both these fields of mutual interest, an approach was made by Bell Labs to entice Fröhlich away from Liverpool to become their ‘specially endowed’ professor at Princeton University. This did not materialise, for not only was he unwilling to relinquish his effectively pure research post in Liverpool for one that would undoubtedly have entailed undergraduate teaching,Footnote 35 but also he had recently married a young American philosophy student, Fanchon Aungst, who did not wish to return to America at that time. She had only recently come to England from Chicago, where she had been a pupil of Rudolf Carnap,Footnote 36 in order to read Philosophy in Oxford under Peter Strawson at Somerville College. In Oxford, the legendary ‘Miss Anscombe’ (Elizabeth AnscombeFootnote 37), the analytical philosopher and authority on Ludwig Wittgenstein, made a lasting impression on her. Having landed in Liverpool in 1949, on her way to Oxford, Fanchon attended a meeting of the local German Circle, which at the time was frequented by many European intellectuals, such as Baroness Rausch von Traubenberg (Marie-Hilde Rosenfeld,Footnote 38 1889–1964) and Baroness Erisso (Eva von Sacher-Masoch,Footnote 39 1911–91), who was then a Ph.D. student in Liverpool, and it was there, in a house in Gambier Terrace, that she was introduced to Fröhlich by a mutual friend, Erika Wirtz, a lecturer in German at the university.
They were married the following year on Monday 26 June 1950, immediately after his return from America, and while she was still studying in OxfordFootnote 40; he was then 44 years old and she, 22. That morning, Fröhlich did not appear at coffee, which was unusual since it was known that he was not away on a trip; neither did he appear at afternoon tea, by which time he was known to be in the department. In response to queries by those present, Szigeti, who also had been absent at morning coffee, explained that Fröhlich had got married that morning, and was now catching up with his work! (Powles 1973). Another contributory factor in his not wanting to leave Liverpool was his love of Chinese food, since, at that time, Liverpool had what he considered to be the best Chinese restaurants in Europe!

Back from the USA, Fröhlich presented a report of his theory of superconductivity at the Summer Provincial Meeting of the Physical Society, which that year was held in Liverpool, 7–8 July 1950. His report concluded with a plea for experiments using more isotopes, and in the discussions that followed he was informed by Pippard that Shoenberg’s Cambridge group was keen to undertake the necessary measurements, if isotopes could be made available, which W.D. Allen of the UK Atomic Energy Authority’s Harwell Laboratory immediately offered to arrange. Apparently, however, Allen had earlier sent isotopes to Mendelssohn’s group at Oxford University’s Clarendon Laboratory, but the samples had remained on Mendelssohn’s desk uninvestigated until he heard about positive results obtained by Shoenberg’s group. The latter’s results showed that the critical magnetic field in tin decreases with increasing isotopic mass according to \( M^{-1/2} \) when the temperature remains constant (Allen et al. 1950), as originally predicted by Fröhlich’s theory [F76, Eq. 6.9]; it was this finding, incidentally, that finally persuaded Bohr that Fröhlich was on the right track. Fröhlich attributed Mendelssohn’s initial reluctance to investigate isotopes to the then prevailing belief that the ions play no role in the phenomenon of superconductivity, a belief that was possibly fuelled by the fact that Kamerlingh Onnes had failed to find an isotope effect in lead, as early as 1922 (Kamerlingh 1922). For further insights into the isotope affair, the account by Dahl should be consulted (Dahl 1992).

Prior to the publication of the eventual solution of the problem of superconductivity in 1957 (Bardeen et al. 1957), Fröhlich published 3 papers of singular elegance and importance, the last of which, in 1954, was a review of polaron theory entitled Electrons in Lattice Fields [F90], which, as mentioned in Sect. 5.3, was what prompted Feynman to apply his path integral approach to the problem; the other two papers dealt with superconductivity.

The first of these [F84], published in 1952, is arguably Fröhlich’s most influential contribution to physics, and marked the start of a new era with its introduction of the methods and concepts of quantum field-theory into non-relativistic condensed matter physics. His introduction of creation and annihilation operators for both electronsFootnote 41 and phonons permitted the derivation of what is now known as the ‘Fröhlich Hamiltonian’, \( H_{\text{F}} \): ‘A definite Hamiltonian stood where before there was emptiness; a definite mathematical problem was posed’ [F125]; ‘.......from a state of impotence the efforts of physicists were now channelled towards a definite task’ [F136].

\( H_{\text{F}} \) contains 3 terms: (i) the kinetic energy of the electrons, whose mutual Coulomb interaction is neglected, in keeping with the success of the free electron theory in describing the non-superconducting state: this energy is expressed in terms of anti-commuting, fermion operators, \( a_{\text{k}} \) and \( a^{\dag }_{\text{k}} \); (ii) the energy of the lattice vibrations expressed in terms of commuting, boson operators, \( b_{\text{w}} \) and \( b^{\dag }_{\text{w}} \); (iii) the electron-lattice interaction energy, which involves products of the type \( b_{\text{w}} \,a^{\dag }_{\text{k}} \,a_{{{\text{k}} - {\text{w}}}} \) whose non-linearity means that each system is coupled to itself (i.e. a change in the electron distribution has an effect on the phonons, which, in turn, affects the electrons); thus:

$$ H_{\text{F}} = \sum\limits_{{\mathbf{k}}} {\varepsilon_{{\mathbf{k}}} a^{\dag }_{{\mathbf{k}}} a_{{\mathbf{k}}} } + \sum\limits_{{\mathbf{w}}} {\hbar s^{\prime}w\left( {b^{\dag }_{{\mathbf{w}}} b_{{\mathbf{w}}} + \,{\frac{1}{2}}} \right)} + \, i\sum\limits_{{{\mathbf{k}},{\mathbf{w}}}} {D_{{\mathbf{w}}} \left( {b_{{\mathbf{w}}} a^{\dag }_{{\mathbf{k}}} a_{{{\mathbf{k}} - {\mathbf{w}}}} - b^{\dag }_{{\mathbf{w}}} \;a^{\dag }_{{{\mathbf{k}} - {\mathbf{w}}}} a_{{\mathbf{k}}} } \right)} $$
(5.4.6)

where \( \varepsilon_{\text{k}} \) is the one-electron energy, \( \hbar^{ 2} k^{ 2} / 2 m \), s′ is the speed of sound in the absence of the electron-phonon interaction, and the real quantity D w is defined by:

$$ D_{\text{w}}^{2} = C^{2} \hbar s^{\prime}w/ \, 2nVM(s^{\prime})^{2} $$
(5.4.7)

where C is a constant with the dimensions of energy, V is the total volume, M is the ion mass, and n is the number of ions per unit volume. \( D_{\text{w}}^{2} \) can be expressed in terms of a dimensionless coupling constant F′ and the Fermi energy \( E_{\text{F}} \), according to:

$$ D_{\text{w}}^{ 2} = { 4}F^{\prime}E_{\text{F}} \;\hbar s^{\prime}w/ 3nV $$
(5.4.8)

where F′ is defined by

$$ F^{\prime} = 3C^{ 2} / 8E_{\text{F}} M_{{}} (s^{\prime})^{ 2} $$
(5.4.9)

which is clearly closely related to F defined by Eq. 5.4.3.
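That Eqs. 5.4.7, 5.4.8 and 5.4.9 are mutually consistent can be confirmed numerically. The Python sketch below evaluates \( D_{\text{w}}^{2} \) once directly from C (Eq. 5.4.7) and once via F′ (Eqs. 5.4.8 and 5.4.9) for arbitrary positive inputs; the function names and the test values are hypothetical and carry no physical meaning.

```python
import math


def d2_from_c(c, hbar, s_bare, w, n, volume, ion_mass):
    """Eq. 5.4.7: D_w^2 expressed through the interaction constant C."""
    return c ** 2 * hbar * s_bare * w / (2.0 * n * volume * ion_mass * s_bare ** 2)


def f_bare(c, fermi_energy, ion_mass, s_bare):
    """Eq. 5.4.9: the unrenormalised dimensionless coupling constant F'."""
    return 3.0 * c ** 2 / (8.0 * fermi_energy * ion_mass * s_bare ** 2)


def d2_from_f(f_prime, fermi_energy, hbar, s_bare, w, n, volume):
    """Eq. 5.4.8: D_w^2 expressed through F' and the Fermi energy."""
    return 4.0 * f_prime * fermi_energy * hbar * s_bare * w / (3.0 * n * volume)


# Arbitrary positive numbers in arbitrary units, used only to compare the two forms.
c, hbar, s_bare, w = 1.7, 0.9, 2.3, 0.4
n, volume, ion_mass, fermi_energy = 5.0, 3.0, 11.0, 6.1

direct = d2_from_c(c, hbar, s_bare, w, n, volume, ion_mass)
via_f = d2_from_f(f_bare(c, fermi_energy, ion_mass, s_bare),
                  fermi_energy, hbar, s_bare, w, n, volume)
print(math.isclose(direct, via_f))
```

The agreement does not depend on the particular numbers chosen, because the Fermi energy cancels when Eq. 5.4.9 is substituted into Eq. 5.4.8.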

He now used a canonical transformation to eliminate, as far as possible, the electron-phonon interaction from the Hamiltonian of Eq. 5.4.6, in the process of which retardation was again neglected. The aim was to focus attention on the (phonon-mediated) electron-electron interaction, rather than on the very much larger polaron-like self-energy, which in perturbation theory was given by \( \Delta E_{1} \). This yielded the following instantaneous electron-electron interaction, \( H_{\text{int}} \), which is attractive between electrons near the Fermi surface whose energies differ by less than that of the exchanged lattice quantum of wave-vector w, in agreement with the original approach using perturbation theory:

$$ H_{\text{int}} \sim F\sum\limits_{{{\mathbf{k,}}\,{\mathbf{q,}}\,{\mathbf{w}}}} \frac{(\hbar sw)^{2}} {[(\varepsilon_{{{\mathbf{q}} - {\mathbf{w}}}} - \varepsilon_{{\mathbf{q}}} )^{2} \,{-}\,(\hbar sw)^{2} ]} {a^{\dag }_{{\mathbf{q}}} a_{{{\mathbf{q - w}}}} a^{\dag }_{{{\mathbf{k}} - {\mathbf{w}}}} a_{{\mathbf{k}}} } $$
(5.4.10)

F and s are renormalised values of F′ and s′, which are given by:

$$ F = F^{\prime}s^{\prime}/s\quad \quad s = s^{\prime}( 1- 2\nu F^{\prime}) $$
(5.4.11)

where ν is the number of electrons per ion.
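The sign change described above can be read off from the energy factor in Eq. 5.4.10. The Python sketch below evaluates that factor, \( (\hbar sw)^{2}/[(\varepsilon_{{\mathbf{q}}-{\mathbf{w}}} - \varepsilon_{{\mathbf{q}}})^{2} - (\hbar sw)^{2}] \), for energy differences below and above the phonon energy. The function name `g_factor` and the sample energies (in arbitrary units) are assumptions made for illustration.

```python
def g_factor(delta_eps, phonon_energy):
    """Energy-dependent factor of Eq. 5.4.10.

    delta_eps is the electron energy difference and phonon_energy is hbar*s*w.
    A negative value corresponds to attraction, a positive value to repulsion.
    """
    return phonon_energy ** 2 / (delta_eps ** 2 - phonon_energy ** 2)


PHONON_ENERGY = 1.0  # arbitrary units
for delta_eps in (0.0, 0.5, 2.0, 10.0):
    g = g_factor(delta_eps, PHONON_ENERGY)
    kind = "attractive" if g < 0 else "repulsive"
    print(f"energy difference {delta_eps:4.1f}: factor {g:+.3f} ({kind})")
```

The factor equals −1 for electrons of equal energy, diverges as the energy difference approaches the phonon energy, and becomes small and positive for large energy differences.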

The renormalisation of the speed of sound can be considered as a kind of ‘inverse’ of polaron formation, in which instead of an electron carrying with it some lattice deformation, the ions carry with them oscillations in the electron density, which increase their inertia, resulting in a decrease in the speed of sound. The existence of this renormalisation disposedFootnote 42 of Wentzel’s criticism that the magnitude of the parameter F, necessary for superconductivity to occur in Fröhlich’s original theory of 1950, had to be so large that it entailed an instability of the lattice (Wentzel 1951). For Eq. 5.4.11 shows that to ensure that the renormalised speed of sound is positive (the condition for lattice stability), it is the unrenormalised quantity F′ that must be below a certain value; F itself can be arbitrarily large (Fig. 5.8).
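The point made against Wentzel’s criticism follows directly from Eq. 5.4.11, as the Python sketch below illustrates: the renormalised F grows without bound as F′ approaches 1/(2ν), while the renormalised speed of sound stays positive. The function name and the sample values of F′ are hypothetical, and s′ is set to 1 in arbitrary units.

```python
def renormalised(f_bare, nu, s_bare=1.0):
    """Apply Eq. 5.4.11 and return the pair (F, s).

    Raises ValueError when 1 - 2*nu*F' is not positive, i.e. when the
    renormalised speed of sound would vanish or change sign.
    """
    factor = 1.0 - 2.0 * nu * f_bare
    if factor <= 0.0:
        raise ValueError("lattice unstable: requires F' < 1/(2*nu)")
    s = s_bare * factor
    return f_bare * s_bare / s, s


NU = 1.0  # assumed: one conduction electron per ion
for f_bare in (0.10, 0.25, 0.40, 0.49):
    f, s = renormalised(f_bare, NU)
    print(f"F' = {f_bare:.2f}  ->  F = {f:6.2f},  s/s' = {s:.2f}")
```

With one electron per ion the stability limit is F′ < 1/2, yet F already exceeds 20 at F′ = 0.49.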

Fig. 5.8 Fröhlich’s most influential paper introducing quantum field theory into solid-state physics. Reproduced with the permission of the Royal Society

It should be appreciated that Fröhlich’s Hamiltonian permitted, for the first time, systematic investigation not only of the phenomenon of superconductivity, in particular, but also of the electron-phonon interaction in metals, in general, such as its effect on the density of electronic energy levels; the associated impact on specific heat, \( C_{v} \), was considered by Buckingham (1951). The possibility that a residual attractive interaction exists in all metals, which at low temperatures has an ordering effect on the electron Fermi gas such that its specific heat is reduced, was the subject of a short note in 1963 [F119] in which Fröhlich argued that such behaviour would entail a modification of the Third Law of thermodynamics such that for systems in equilibrium, \( \partial C_{v} /\partial T \to 0 \) as \( T \to 0 \); see also [F125].

In passing, it should be recorded that Fröhlich’s introduction of field-theoretic developments into solid state physics was later reciprocated by a gradual flow of concepts from condensed matter physics back into nuclear and particle physics, such as collective motion in nuclei, and quark/gluon condensates.

In the early 1950s, Fröhlich spoke on his phonon-mediated electron-electron interaction mechanism at a number of international conferences, including the NBS Low Temperature Physics Conference in Washington in 1951 [F81], the Lorentz-Kamerlingh Onnes Centenary Conference on Electron Physics in Leiden (Netherlands) in 1953 [F85], the International Conference of Theoretical Physics in Japan in 1953 [F88] (at which he spoke also on the polaron problem [F87]), and at the 10th Solvay Conference in Bruxelles in 1954, where he first reported [F93] his solution for a one-dimensional model of a superconductor (vide infra). It is clear from the discussions at these meetings, however, that his ideas were by no means unanimously accepted, although the great importance of his phonon-mediated electron-electron interaction was acknowledged by Bohr and Heisenberg at the Leiden Conference [F85]:

I most thoroughly appreciate the great importance of Fröhlich’s contribution to our understanding of the interaction between the electrons through their coupling with the ion lattice (Bohr 1953).

I completely agree with Prof. Fröhlich that the isotope effect in superconductors suggests very strongly the predominance of an interaction of the kind produced by the zero point lattice vibrations, as he has discussed. Coulomb interaction seems to be less important (Heisenberg 1953).

Further progress towards obtaining superconductive solutions of Fröhlich’s Hamiltonian was, however, thwarted by difficulties connected with the use of perturbation theory, noted above, and by the absence of any other systematic method: a further idea seemed to be lacking, as noted, in particular, again by Bohr and Heisenberg in their discussion of Fröhlich’s Leiden paper:

In fact, as is generally recognised, we have to do in the superconductive phase with a state of the electrons which, although differing very little in energy from the normal one, exhibits a high degree of order… (Bohr 1953)

It will be reasonable to picture the state with current as a ‘solid body’ of electrons moving through the ionic latticeFootnote 43 (Heisenberg 1953)

In the absence of any such idea, and to refute a criticism from van Vleck and Slater at the NBS Meeting in Washington in 1951 that his Hamiltonian could not yield a phase transition, Fröhlich considered the one-dimensional model mentioned above, which, provided the interaction of an electron with the lattice is so strong that the recoil of the lattice when an electron is scattered can be neglected, he was able to solve non-perturbatively, using a Hartree self-consistent field approximation [F89]. A single lattice mode becomes strongly excited, and acts as a spatially periodic field on the electrons, which produces a gap in the single electron energy spectrum, à la Peierls (Peierls 1930); the gap was found to be proportional to exp(−3/2F). This particular form of the dependence of the gap on F cannot be expanded as a power series about F = 0 (the function has an essential singularity at F = 0), which indicates the impossibility of ever obtaining such a gap in any order of perturbation theory,Footnote 44 as was later rigorously shown by Migdal (1958).
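The statement that exp(−3/2F) has no power-series expansion about F = 0 can be illustrated numerically: as F tends to zero the function falls below every power of F, so every coefficient of a would-be Taylor series vanishes. The Python sketch below tabulates the ratio exp(−3/2F)/Fⁿ for a few powers n; the function names and the sample values of F are arbitrary choices made for illustration.

```python
import math


def gap_factor(f):
    """exp(-3/(2F)), the F-dependence of the gap in the one-dimensional model."""
    return math.exp(-3.0 / (2.0 * f))


def ratio_to_power(f, n):
    """Ratio of the gap factor to F**n; it tends to zero as F -> 0 for every n."""
    return gap_factor(f) / f ** n


F_VALUES = (0.1, 0.05, 0.02)  # decreasing coupling strengths
for n in (1, 5, 10):
    ratios = [ratio_to_power(f, n) for f in F_VALUES]
    print(n, ["%.2e" % r for r in ratios])
```

However large n is chosen, the ratios eventually decrease towards zero as F is reduced, which is the numerical signature of the essential singularity.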

Despite this gap, the system is not an insulator at absolute zero because the periodic variation in electron density is tied, not to the lattice itself but rather, to the lattice modes that are here strongly excited, and which it enforces in a self-consistent way. The energy difference, \( \delta E \), per atom between this state and that in the absence of any electron-lattice coupling was found to be given by \( E_{\text{F}} \) exp(−3/F). The appearance of the Fermi energy, \( E_{\text{F}} \), in place of the energy \( ms^{2} \), which characterised the perturbation result, means that there is here no isotope effect; its absence originates in the ‘over-strong’ electron-lattice interaction assumed ab initio, via which the dynamical properties of the lattice are effectively suppressed. Furthermore, the presence of the large energy \( E_{\text{F}} \) outweighs the reduction arising from the replacement of F by exp(−3/F), resulting in an unrealistically large energy difference. It may be noted, however, that the rapid variation of the function exp(−1/F) with F is consistent with the very sensitive empirical dependence of the superconducting transition temperature on pressure, despite F itself varying little under compression.

The possibility that a gap characterisesFootnote 45 the superconductive state has a long history in the phenomenological development of the subject, first appearing in the work of F. and H. London in 1935 (London and London 1935). Three years later it was invoked by Welker in an abortive theoretical attempt (Welker 1938) to understand the Meissner-Ochsenfeld effect, namely the expulsion of a magnetic field applied above the superconducting transition temperature, \( T_{\text{c}} \), upon cooling through \( T_{\text{c}} \), so that, in the superconductive state, the system behaves as a perfect diamagnet;Footnote 46 such perfect diamagnetism had, incidentally, been anticipated by Frenkel the year before it was discovered, in the paper to which reference was made in Chap. 2 (Frenkel 1933). The idea of a gap resurfaced in the experimental work of Daunt and Mendelssohn on persistent currents in 1946, where the existence of a gap ‘protects’ the system against dissipation (Daunt and Mendelssohn 1946). The essential singularity that characterises the gap in Fröhlich’s one-dimensional model (which was the first to be derived theoretically, and in an exact way) turned out not to be peculiar to the low-dimensionality of the system considered, but was shared by the later work of Cooper in 1956 (vide infra), and by the full three-dimensional solution of Bardeen, Cooper and Schrieffer (BCS) the following year (vide infra). Prior to this, Bardeen had shown (Bardeen 1955) that a gap would entail a non-local relation between current density and magnetic field of the form that had been suggested by Pippard from his microwave measurements, which showed that the depth to which a magnetic field penetrates a superconductor is essentially independent of the strength of the field. The non-locality was found to be characterised by a macroscopic distance, the so-called ‘coherence length’, of the order of \( 10^{-4} \) cm (Faber and Pippard 1955), a quantity that was later to feature prominently in the BCS theory.
The existence of a gap was first confirmed by measurement of the electronic specific heatFootnote 47 in vanadium, which revealed a temperature dependence proportional to exp(−Δ/2kT), where the gap, Δ, was of the order of \( kT_{\text{c}} \) (Corak et al. 1954), and later by measurements of heat conductivity and optical absorption.

In the meanwhile, the thermodynamic properties of this one-dimensional model were elaborated the following year by C.G. Kuper, then a Research Fellow in Fröhlich’s department, who showed that the model does indeed exhibit a second-order phase transition (Kuper 1955) in which the gap acts as a temperature-dependent ‘order parameter’. With increasing temperature, excitation of electrons across the gap reduces the periodic variation in electron density, which, in turn, reduces the amplitude of the resonant lattice modes, thereby narrowing the gap, which eventually vanishes above a certain temperature (the transition temperature), \( T_{\text{c}} \), given by \( kT_{\rm c} \approx E_{\rm F} \exp( - 3/2 F) \). The electronic specific heat was found to be exponential near T = 0 K and to exhibit a discontinuity at \( T_{\text{c}} \).

Despite some unrealistic features, such as the transition temperature being large compared to the Debye temperature, a condition that is not fulfilled in real superconductorsFootnote 48 (and which here arises from the presence of the large Fermi energy, \( E_{\text{F}} \), in the prefactor of the exponential term in the expression for \( T_{\text{c}} \)), it is interesting that, some 20 years later, this one-dimensional model was to find application (Bardeen 1973a, b) to the so-called paraconductivity attributed to sliding modes in quasi one-dimensional organic and other low dimensional systems; see also [F173].
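The size of the transition temperature in the one-dimensional model can be gauged from \( kT_{\rm c} \approx E_{\rm F} \exp( - 3/2 F) \). The Python sketch below evaluates this expression for an assumed Fermi energy of 5 eV and an assumed coupling F = 0.3; both inputs are illustrative guesses rather than values from the text, and the function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def transition_temperature(fermi_energy_ev, f):
    """T_c in kelvin from k*T_c = E_F * exp(-3/(2F)) for the one-dimensional model."""
    return fermi_energy_ev * math.exp(-3.0 / (2.0 * f)) / K_B_EV_PER_K


# Assumed illustrative inputs.
t_c = transition_temperature(fermi_energy_ev=5.0, f=0.3)
print(f"T_c is roughly {t_c:.0f} K")
```

With these assumed inputs the estimate comes out at several hundred kelvin, far above the transition temperatures of the elemental superconductors, which illustrates the effect of the large prefactor.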

Crucial to the eventual BCS solution of the problem of superconductivity in 1957 was Leon Cooper’s demonstration in 1956 (Cooper 1956) that, in the case of just two electrons above the Fermi sea, Fröhlich’s phonon-mediated attractive interaction results in a single bound-state of zeroFootnote 49 centre-of-mass momentum, no matter how weak the attraction—essentially because the Pauli principle blocks any possible decay channels; the energy was again found to exhibit an essential singularity in the coupling constant.

The extension of Cooper’s work to many electrons proved, however, to be quite complicated because of the Pauli principle, which prevents the pairing off of all the electrons in Cooper pairs in any straightforward way, except when all the pairs have the same net momentum, which in the ground-state must be zero.

This pairing, which proved to be the concept that had earlier been missing,Footnote 50 had, incidentally, been independently anticipated qualitatively somewhat earlier by M. Schafroth (1954), following his work in Liverpool, to where, in 1952, he had been sent by Pauli to work with Fröhlich. The very first mention of pairing, however, seems to have been made 8 years earlier by Ogg (1946) in connection with superconductivity in metal-ammonia solutions, which he attributed to Bose-Einstein condensation of pairs of electrons, a suggestion reiteratedFootnote 51 by Onsager in 1951.

The work of BCS was based on an approximate form of Fröhlich’s electron-electron interaction (Eq. 5.4.10) in which its momentum dependence (as contained in the factor \( G(\varepsilon_{{{\mathbf{q}} - {\mathbf{w}}}} - \varepsilon_{{\mathbf{q}}} ) \equiv (\hbar sw)^{ 2} [(\varepsilon_{{{\mathbf{q}} - {\mathbf{w}}}} - \varepsilon_{{\mathbf{q}}} )^{2} - (\hbar sw)^{ 2} ]^{ - 1} \)) is neglected, and is replaced by a negative constant within a shell of width \( \hbar \omega_{\rm D} \) centred on the Fermi surface, where \( \omega_{\rm D} \) is the Debye frequency, and by zero elsewhere, i.e.

$$ G \rightarrow \left\{ {\begin{array}{*{20}l} { - 1,} \hfill & {{\text{for}}\;|\varepsilon_{{{\mathbf{q}} - {\mathbf{w}}}} - \varepsilon_{{\mathbf{q}}} |\; < \;\hbar \omega_{\rm D} } \hfill \\ {\;\;0,} \hfill & {{\text{for}}\;|\varepsilon_{{{\mathbf{q}} - {\mathbf{w}}}} - \varepsilon_{{\mathbf{q}}} |\; > \;\hbar \omega_{\rm D} }\hfill \\ \end{array} }\right. $$
(5.4.12)

In addition, they included the short-range repulsion, \( V_{\text{c}}(\mathbf{w}) \), that remains after the long-range part of the inter-electronic Coulomb interaction has been taken care of in terms of plasma oscillations. Their electron-electron interaction Hamiltonian is thus based on terms of the form:

$$ h_{\text{int}} \sim [ - F + V_{\text{c}} \left( {\mathbf{w}} \right)]a^{\dag }_{{\mathbf{q}}} a_{{{\mathbf{q}} \, - \, {\mathbf{w}}}} a^{\dag }_{{{\mathbf{k}} \, - \, {\mathbf{w}}}} a_{{\mathbf{k}}} $$
(5.4.13)

It can be seen from Eq. 5.4.10 that the attractive interaction is strongest when the two electrons have the same energy, which will be the case if they have equal and opposite momenta \( (\mathbf{k}, -\mathbf{k}) \), so that the pair has zero centre-of-mass momentum, i.e. their centre of gravity is at rest. BCS chose their variational ground-state wave-function to ensure that the maximum number of such pairs of electrons (with zeroFootnote 52 centre-of-mass momentum) take advantage of the attraction, and only such pairs (in which the electrons also have anti-parallel spin) are considered. The resulting many-pair state is a single quantum state, which is cooperatively produced and exhibits coherenceFootnote 53 with an associated coherence length of the same order of magnitude as that predicted by Pippard, namely, \( 10^{-4} \) cm, mentioned above. It should be noted, however, that in contrast to the earlier work of Schafroth, the many-electron wave-function does not describe bosonic pairs of electrons, but rather correlations between pairs of electrons separated by a distance of the order of \( 10^{-4} \) cm; since this far exceeds the average inter-electronic separation, electrons belonging to millions of other pairs will be found within this distance. Accordingly, the pairs cannot be considered to be independent entities; instead, they are spatially interlocked in a highly intricate way that guarantees their collective coherence, and in consequence, the pairs do not satisfy Bose commutation relations. As Bardeen stressed in his contribution to Fröhlich’s 1973 Festschrift ‘.....the key thing is pairing, not pairs. Although often used, the concept of (boson) pairs is misleading; they are not stable above the transition temperature, they overlap strongly and would not exist but for their interaction.’ (Bardeen 1973b); i.e. pairing is inherently a cooperative effect, which is best treated in momentum space, rather than in position space where correlations between more than just two electrons would have to be considered.

The energy difference, \( \delta E \), between the ground-state of this system and that in the absence of any interaction was now found to be given essentially by \( ms^{2}\exp(-2/F) \); this value is much smaller than that given by Fröhlich’s original perturbative calculation, where the energy difference was proportional to \( ms^{2}F \), because F now enters via the factor \( \exp(-2/F) \), which is much smaller than F. It should be noted that the BCS result is a synthesis of the energy \( ms^{2} \) that characterised Fröhlich’s original perturbative calculation, and the essential singularity factor of his one-dimensional model; the desirability of developing a method that ‘forms a link between the two methods discussed so far’ had already been stressed by Fröhlich 3 years earlier [F93]. As with the case of Cooper’s single-pair calculation, the many-pair ground-state is found to be stable for all positive values of the coupling constant F, however small. It follows from Eq. 5.4.13, however, that the effect of including the short-range part of the direct inter-electronic Coulomb repulsionFootnote 54 is to reduce F to a value F*, so that the criterion for superconductivity becomes F* > 0, i.e. the phonon-mediated electron-electron attraction must be strong enough to dominate the short-range part of the direct Coulomb repulsion.
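The contrast between the two coupling dependences can be made concrete numerically. The following is a minimal Python sketch, not part of the original calculations: the function name and the sample values of F are illustrative assumptions, and both factors are expressed in units of ms².

```python
import math


def bcs_factor(F: float) -> float:
    """Non-perturbative factor exp(-2/F) that multiplies m*s^2 in the BCS result."""
    if F <= 0:
        raise ValueError("the coupling constant F must be positive")
    return math.exp(-2.0 / F)


# Illustrative (assumed) values of the dimensionless coupling constant F.
for F in (0.1, 0.2, 0.3, 0.5):
    # The perturbative estimate scales as F, the BCS estimate as exp(-2/F).
    print(f"F = {F:.1f}: perturbative factor {F:.3e}, BCS factor {bcs_factor(F):.3e}")
```

For weak coupling the exponential factor falls below F by several orders of magnitude, and because exp(−2/F) has no power-series expansion about F = 0, no finite order of perturbation theory in F can reproduce it.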

In the case when each pair has the same non-zero total (centre-of-mass) momentum, the pairs move cooperatively together (coherence), again as a single quantum state,Footnote 55 effectively realising the ‘solid body’ envisaged by Heisenberg, thereby ensuring that the flow is stable against dissipation; for the break-up of any particular pair necessarily involves all other pairs, which would thus require an enormous expenditure of energy.

Although the assumption of an associated ‘rigidity’ in the superconducting many-electron wave-function had been phenomenologically shown to lead to the Meissner effect (London 1950), as already noted, this effect cannot be satisfactorily derived within the BCS theory because of problems of gauge invariance arising from the approximate form of the electron-electron interaction used (Schafroth 1958). As will be noted in the next chapter (Sect. 6.2), however, an exact, model-independent derivation was eventually given by Sewell in 1990, in terms of the macroscopic wave-functions introduced by Fröhlich in the late 1960s, following the work of Yang.

At finite temperatures, some Cooper pairs will be broken up, so that a superconductor contains individual (unpaired) electrons in addition to Cooper pairs, the former increasing in number with increasing temperature, according to exp(−Δ/2kT), where Δ is the energy gap in a superconductor, i.e. the energy required to unbind a Cooper pair, thereby creating two separate, unpaired, electrons. It is these unpaired electrons that are responsible for the electronic specific heat of a superconductor, which must, accordingly, be expected to be proportional to exp(−Δ/2kT); as already noted, this was precisely as found by experiment (Corak et al. 1954). Unlike in the early approach of Welker, but in common with Fröhlich’s one-dimensional model, the gap in the BCS theory is a decreasing function of temperature, and vanishes at the transition temperature, in accordance with the 2nd order nature of the superconducting-to-normal phase transition.

It is of interest to record (as already mentioned in Sect. 5.3) that, many years later, Schrieffer told Fröhlich [F185] that the form of his many-electron extension of Cooper’s pair wave-function, in which all pairs have the same total (zero) momentum, was actually motivated by the structure of the variational wave-function of the large polaron in the so-called ‘intermediate-coupling’ regime, first considered by Fröhlich in collaboration with Gurari in 1953 (Gurari 1953), and, independently, by others the same year (Lee et al. 1953).

It was later realised by Yang (vide Sect. 6.2), however, that the pairing correlations between electrons near the Fermi surface with equal and opposite spin and momenta, which characterise the many-pair BCS wave-function, can be collectively expressed spatially in terms of a two-point macroscopic wave-function, \( \varPhi_{2}(\mathbf{x}, \mathbf{y}) \), having the form of a bound state, i.e. \( \varPhi_{2}(\mathbf{x}, \mathbf{y}) \) is large only when \( \left| {\mathbf{x} - \mathbf{y}} \right| \) is below a length (the coherence length, ~\( 10^{-4} \) cm) characteristic of a particular material. It is in terms of \( \varPhi_{2}(\mathbf{x}, \mathbf{y}) \) that the existence of zero-momentum pairs with anti-parallel spin is reflected macroscopically as a pair condensate. \( \left| {\varPhi_{ 2} } \right|^{ 2} \) is thus in the nature of an ‘order parameter’, the search for which Fröhlich had advocated already in 1953, the year before his one-dimensional model calculation was published, commenting:

Apart from the order of magnitude of the energy we have not yet, however, been able to derive any further properties of superconductors. ……I think, however, that the solution will not come from mathematical considerations only, but will require a new physical concept which will help us find an appropriate approximationFootnote 56. This should involve an ‘order’ parameter similar to the case of second order [phase] transitions. In superconductivity we have so far not been able to find such a parameter, and I think our efforts should be directed along such lines [F88].

Notwithstanding significant differences between the eventual BCS theory and Fröhlich’s earlier work (in particular, the former’s treatment of pairing correlations, which turned out to be intimately connected with the required new physical concept sought by Fröhlich, namely coherence), it remains a scandalous mystery to those properly acquainted with the history of the subject why Fröhlich was not included in the Nobel Citation shared by Bardeen, Cooper and Schrieffer in 1972 (although the number of people who can share the same Nobel prize is, admittedly, limited to three). For several essential features of their work derived from his, not least the Hamiltonian used, which, apart from the inclusion of a direct, short-ranged screened Coulomb repulsion between electrons, was a simplification of the actual phonon-mediated electron–electron interaction derived by Fröhlich 5 years earlier using a canonical transformation. In addition, the form of their many-pair wave-function was actually motivated by the work of Fröhlich’s collaborators in Liverpool, whilst the BCS expression for the energy difference between the superconducting and normal states is a synthesis of the results of his earlier work [F76, F89]. Fröhlich’s pivotal contribution to the theory of superconductivity was, however, fulsomely acknowledged by Bardeen in a letter to Fröhlich, dated 22 July 1960, wherein he wrote:

The introduction of this interaction by Fröhlich in 1950 and the simultaneous verification of its importance by the discovery of the isotope effect gave the break-through that pointed the way towards the development of a successful theory of superconductivity [F136].

Fascinating reviews by Fröhlich of the state of superconductivity at various epochs (some of which contain valuable historical insights) can be found in [F109, F125, F136, F146, F185]. Of particular interest is [F109], which contains an insightful section entitled The psychology of Superconductivity.

5.5 Return to Particle Physics

Lest the impression be gained from the earlier parts of this chapter that Fröhlich’s interest in particle physics ceased with his disillusionment over the divergences that plagued further development of his pre-war work on the meson theory of nuclear forces, recounted in Chap. 4, it must be emphatically stated that this was not the case; it was a subject in which he maintained a profound interest throughout his life, and one to which he persistently returned, right up to the end, publishing some 11 papers between 1958 and 1985.

His post-war interest in particle physics was reawakened in the late 1950s with the discoveries of parity violation and CP invariance, the former ‘striking a chord’ with his early pioneering attempt with Heitler [F19] to understand the so-called anomalous magnetic moments of the neutron and proton in terms of scalar meson theory, which had been criticized by Kemmer precisely because their admitted spin-spin interaction violated parity! (vide Sect. 4.4 and Kemmer 1965). Neither did the associated discovery at the time of CP invariance come as too much of a surprise to Fröhlich. For this invariance indicates some deep connection between electric charge, generally considered to be an internal property of a particle, and the structure of the external space-time ‘occupied’ by the particle. Indeed, he had long suspected from the universality of electric charge that electric charge was more a property of electrodynamics than of the particular particles that ‘carry’ it, in which case, he believed, electrodynamics in its present form would need to be extended. To him, these experimental discoveries of the late 1950s were catalytic, and he accordingly embarked on what he later described [F121] as an ambitious programme ‘......whose aim is the derivation of the properties of particles and fields from geometrical considerations.’

Arguing, in 1960, that the conventional (passive) treatment of reflections in terms of point transformations is unphysical, he developed a novel approach involving the introduction of a new angular space in terms of which space reflections could be considered actively as special cases of continuous transformations [F106, F108]. He introduced his approach as follows:

The new treatment is based on a simple but important difference between reflexions and continuous rotation. I shall illustrate this first for an ordinary two-dimensional Euclidean space. Assume a coordinate system in which the x-direction runs from left to right, say, and the y-direction is obtained from it by a 90° anti-clockwise rotation. Consider an irregular triangle in this two-dimensional space described by the coordinates of three points \( (x_{1}, y_{1}) \), \( (x_{2}, y_{2}) \), \( (x_{3}, y_{3}) \). A rotation replaces these three by three different coordinate pairs \( (x_{k}, y_{k}) \). As is well known in geometry there are two ways of interpreting the new set of co-ordinates: (i) the triangle has not been moved, but the co-ordinate frame has been rotated, (ii) the co-ordinate frame remains the same, but the triangle has been moved appropriately. The first interpretation has no physical (geometrical) meaning in terms of the figure. A co-ordinate system is quite an arbitrary device; completely different types of co-ordinates might be introduced without changing the geometrical properties. The second interpretation has, however, a very definite geometrical meaning: it tells us, for instance, that the angles of the triangle are unchanged by the ‘rotation’. To have a closer analogy with field equations we replace the triangles by the three straight lines forming it. They are described by three equations between y and x, \( a_{k} x + b_{k} y + c_{k} = 0 \). Rotation of the coordinate frame by an angle θ (first interpretation) corresponds to a replacement of (x, y) by (x′, y′), say, \( x = x^\prime \cos \theta - y^\prime \sin \theta ;y = x^\prime \sin \theta + y^\prime \cos \theta \). The appropriate motion of the triangle on the other hand corresponds to a replacement of \( a_{k} , \, b_{k} \;{\text{by}}\;a^{\prime}_{k} = a_{k} \cos \theta + b_{k} \sin \theta \); \( b^{\prime}_{k} = - a_{k} \sin \theta + b_{k} \cos \theta \) (second interpretation). Carried out together the two transformations leave the form of the equations invariant.

Consider now a reflexion in which the value of each x-coordinate is replaced by its negative; \( x^{\prime}_{1} = -x_{1} \), etc. The first interpretation says simply that the frame has been replaced by another one in which the x-axis runs from right to left, instead of from left to right. The second interpretation, however, can no longer be offered in terms of continuous displacements and rotations. It would require the triangle to be turned inside out (or rather its two-dimensional analogue). Thus if we decide to avoid this latter action, then no physical interpretation can be given to reflexions.

The possibility of a physical interpretation of these reflexions in a two-dimensional system can be regained, however, if use is made of a third dimension which permits rotation around the y-axis. This third dimension then represents an angle ϕ which for ϕ = 0, say, yields the first frame and for ϕ = π the reflected one.

Interpretation of the angle ϕ in terms of the x-y plane suggests a formal connection with Pauli spin, which has definite values (± ½) only for two directions, say ϕ = 0 and ϕ = π. The interpretation of intermediate angles would then have to be given in terms of a mixture of the original triangle, and of the reflected one. Thus right from the beginning, the present description considers the possibility of both these triangles. The co-ordinate ϕ decides which one is realized. Invariance of the above three equations under reflexion thus involves (i) replacement of the coordinate frame (x, y) by (−x, y) and (ii) rotation of the triangle around the y axis by 180°, leading to the replacement of \( a_{k} \) by \( -a_{k} \).

The case of Lorentz transformations is, of course, more difficult than the above case, but it offers a similar distinction between continuous transformations and reflexions. The former always offer a physical interpretation, either as (three-dimensional) rotations, or as relative motion. The latter would require actions like turning a body inside out, which would require internal degrees of freedom. Our programme must then be the development of a description which right from the beginning permits the treatment of certain properties, say momentum \( (p_{k}, p_{0}) \), together with all the reflected ones, \( (-p_{k}, p_{0}) \); \( (p_{k}, -p_{0}) \); \( (-p_{k}, -p_{0}) \) (k = 1, 2, 3 denotes the spatial, 0 the time part). This should be expected to involve the use of new angular co-ordinates.

Following the above discussion I feel that point transformations other than mere coordinate replacements should be considered as unphysical and should be replaced by continuous transformations through introduction of new angular coordinates [F106, F108, adapted].
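The two-dimensional argument in the passage quoted above can be checked numerically. The following minimal Python sketch is an illustration only (the function names and sample numbers are assumptions, not taken from [F106, F108]): it confirms that rotating the frame and moving the line together leave the form ax + by + c = 0 unchanged, and that the same holds for the reflexion x → −x combined with a → −a, although the reflexion has determinant −1 and so cannot be reached by any continuous rotation within the plane.

```python
import math


def old_coords(x_new, y_new, theta):
    """First interpretation: old coordinates (x, y) expressed through the rotated ones."""
    x = x_new * math.cos(theta) - y_new * math.sin(theta)
    y = x_new * math.sin(theta) + y_new * math.cos(theta)
    return x, y


def moved_line(a, b, theta):
    """Second interpretation: transformed line coefficients (a', b')."""
    a_new = a * math.cos(theta) + b * math.sin(theta)
    b_new = -a * math.sin(theta) + b * math.cos(theta)
    return a_new, b_new


def line_value(a, b, c, x, y):
    """Left-hand side of the line equation a*x + b*y + c = 0."""
    return a * x + b * y + c


# Arbitrary illustrative numbers (assumptions, not data from the text).
a, b, c = 2.0, -1.0, 0.5
theta = 0.7
x_new, y_new = 1.3, -0.4

x, y = old_coords(x_new, y_new, theta)
a_new, b_new = moved_line(a, b, theta)
rotation_gap = abs(line_value(a, b, c, x, y)
                   - line_value(a_new, b_new, c, x_new, y_new))

# Reflexion x -> -x combined with a -> -a also preserves the form.
reflexion_gap = abs(line_value(a, b, c, x, y) - line_value(-a, b, c, -x, y))

# Determinants: +1 for any rotation, -1 for the reflexion diag(-1, 1).
det_rotation = math.cos(theta) ** 2 + math.sin(theta) ** 2
det_reflexion = (-1.0) * 1.0

print(rotation_gap < 1e-12, reflexion_gap < 1e-12, det_rotation, det_reflexion)
```

The sign of the determinant is what separates the two cases: it cannot change continuously from +1 to −1 within the plane, which is why an additional angular coordinate is needed if reflexions are to be treated as continuous transformations.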

This new angular space permitted the definition of dynamical variables that could be interpreted in terms of isobaric spin, electric charge and mass, and led to a wave-equationFootnote 57 of the following form, whose solutions described bosons:Footnote 58

$$ (B_{\upmu} \partial_{\upmu} + M)\,\varPsi = 0 $$
(5.5.1)

where M is a mass operator,Footnote 59 and the \( B_{\upmu} \) are the counterparts of Kemmer’s β-matrices in the new angular space.

Crucial to the programme was the representation of the \( \beta_{\upmu} \)-matrices in terms of direct products of four sets of Pauli matrices (ρ, ρ′), (σ, σ′) by:

$$ \beta_{\text{k}} = \, \frac{1}{2} \, (\rho_{1} \sigma_{\text{k}} + \rho^{\prime}_{1} \sigma^{\prime}_{\text{k}} ),\quad \quad \beta_{4} = \frac{1}{2} \, (\rho_{2} + \rho^{\prime}_{2} ), $$
(5.5.2)

which permitted a classification of wave-functions, \( \varPsi \), in terms of spin-pair functions referring to (ρ, ρ′), (σ, σ′); this yielded 4 different wave-functions, \( \varPsi_{M} \), \( \varPsi_{\pi} \), \( \varPsi_{K} \) and \( \varPsi_{\nu} \). The angular space defined by (σ, σ′) is the angular part of ordinary \( x_{k} \)-space, since according to Eq. 5.5.1, rotation of the \( x_{k} \)-space is identical to a unitary transformation of the (σ, σ′). On the other hand, the (ρ, ρ′) require the definition of a new angular space, which was considered to be an internal space of the bosons; the existence of this space permitted treatment of any unitary transformation of the (ρ, ρ′) as a particular case of a continuous transformation, in conformity with the new advocated approach to reflections.
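That matrices built as in Eq. 5.5.2 do behave like Kemmer’s β-matrices can be checked by brute force. The following pure-Python sketch rests on stated assumptions rather than on Fröhlich’s own presentation: the four two-dimensional spaces are combined in the order ρ ⊗ σ ⊗ ρ′ ⊗ σ′ (giving 16 × 16 matrices), the standard Pauli matrices are used, and the Kemmer relation is taken in its Euclidean-metric form, \( \beta_{\mu}\beta_{\nu}\beta_{\lambda} + \beta_{\lambda}\beta_{\nu}\beta_{\mu} = \delta_{\mu\nu}\beta_{\lambda} + \delta_{\nu\lambda}\beta_{\mu} \). All function names are illustrative.

```python
from itertools import product

# 2x2 identity and the standard Pauli matrices, as nested lists.
I2 = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}


def kron(a, b):
    """Kronecker (direct) product of two square matrices."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def kron4(rho, sigma, rho_p, sigma_p):
    """Direct product in the ASSUMED order rho x sigma x rho' x sigma'."""
    return kron(kron(kron(rho, sigma), rho_p), sigma_p)


def matmul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, ca=1.0, cb=1.0):
    """Entry-wise linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def beta_matrices():
    """The four 16x16 matrices of Eq. 5.5.2, indexed 1..4."""
    beta = {}
    for k in (1, 2, 3):
        beta[k] = combine(kron4(PAULI[1], PAULI[k], I2, I2),
                          kron4(I2, I2, PAULI[1], PAULI[k]), 0.5, 0.5)
    beta[4] = combine(kron4(PAULI[2], I2, I2, I2),
                      kron4(I2, I2, PAULI[2], I2), 0.5, 0.5)
    return beta


def satisfies_kemmer_algebra(beta, tol=1e-12):
    """Test the Kemmer relation for all 64 index triples (mu, nu, lam)."""
    idx = (1, 2, 3, 4)
    size = len(beta[1])
    zero = [[0.0] * size for _ in range(size)]
    pair = {(m, n): matmul(beta[m], beta[n]) for m, n in product(idx, idx)}
    for m, n, l in product(idx, idx, idx):
        lhs = combine(matmul(pair[m, n], beta[l]), matmul(pair[l, n], beta[m]))
        rhs = combine(beta[l] if m == n else zero, beta[m] if n == l else zero)
        if any(abs(x - y) > tol
               for row_l, row_r in zip(lhs, rhs) for x, y in zip(row_l, row_r)):
            return False
    return True


print(satisfies_kemmer_algebra(beta_matrices()))
```

The check works because \( \rho_{1}\sigma_{k} \) and \( \rho_{2} \) anticommute among themselves and square to unity, so each of the primed and unprimed sets behaves like a set of Dirac matrices, and half the sum of two commuting such sets is known to obey the Kemmer relation.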

The mass operator, M, in Eq. 5.5.1 must commute with the \( B_{\upmu} \), and with \( T^{2} \), \( T_{3} \) and \( Q_{3} \), where T and \( Q_{3} \) are, respectively, the operators for isospin and electric charge, but these requirements do not completely determine its general form. Originally [F106], the following form was used:

$$ M = C (1 + I^{2})^{2}\; (T^{2} + I^{2}),$$
(5.5.3)

where C is a constant, and I is a kind of momentum operator that connects the constituent dashed and undashed spaces of the new angular space. In conjunction with the wave-equation Eq. 5.5.1, it was found that:

i) \( \varPsi_{\pi} \) represents an isobaric spin triplet, the 3 particles having electric charges (1, 0, −1) and zero mechanical spin, which were tentatively identified with the π-mesons; their mass is 2C.

ii) \( \varPsi_{K} \) represents 2 isobaric spin doublets, the 4 particles having electric charges (1, 0, −1, 0), 3rd component of isobaric spin (½, −½, −½, ½), and zero mechanical spin, which were tentatively identified with K-mesons; their mass is 7C.

A check on these identifications was provided by the predicted K:π mass ratio of 7:2, which is very close to the experimental value of 966:273; this permitted determination of the constant C whose value turned out to be 137 electron masses.
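The arithmetic behind this check is easily reproduced. The short Python sketch below assumes that the figures 966 and 273 quoted above are the K- and π-meson masses in units of the electron mass (which the quoted value of C implies); the variable names are illustrative.

```python
from fractions import Fraction

# Masses in units of the electron mass, as quoted in the text (assumed reading).
MASS_K = 966
MASS_PI = 273

predicted_ratio = Fraction(7, 2)           # m_K : m_pi = 7C : 2C
measured_ratio = Fraction(MASS_K, MASS_PI)

# Relative deviation of the predicted ratio from the measured one.
relative_deviation = abs(measured_ratio - predicted_ratio) / measured_ratio

# Two independent estimates of C, from m_pi = 2C and from m_K = 7C.
C_from_pi = Fraction(MASS_PI, 2)
C_from_K = Fraction(MASS_K, 7)

print(float(measured_ratio), float(relative_deviation))
print(float(C_from_pi), float(C_from_K))
```

The two estimates of C bracket the value of about 137 electron masses given in the text, and the predicted and measured ratios differ by roughly one per cent.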

With light quanta in mind (as the only other boson known at the time), he noted that while the above expression for M allowed for a field with zero isobaric spin and mass, it was not sufficient to determine the external space-time parts of the associated wave-function. To remedy this, the following term was subsequently added, which had no effect on the above predictions:

$$ c[\frac{1}{2} \, (\rho_{3} + \rho^{\prime}_{3} )]^{2} \sum\limits_{\text{k}} {[\frac{1}{2} \, (\sigma_{\text{k}} + \sigma^{\prime}_{\text{k}} )]^{2} } $$
(5.5.4)

where c is a constant [F107].

It was then found that:

iii) \( \varPsi_{M} \) describes an isospin singlet, and has 10 components in external space-time, which satisfy the Maxwell equations, i.e. it describes light quanta of spin 1. It should be noted that the ‘mass’ constant c here simply gives a measure for the electromagnetic vector potential.

iv) \( \varPsi_{\nu} \) describes 2 isobaric doublets, with the same electric charge and isospin properties as does \( \varPsi_{K} \) (i.e. the K-mesons), but now with mechanical spin 1; their mass is given by:

$$ m_{\nu } = \left[ {7C\left( {7C + \, 2c} \right)} \right]^{1/2} $$
(5.5.5)

where C is the constant that parametrises the expression for M corresponding to π- and K-mesons (Eq. 5.5.3); Fröhlich called the 4 new bosons ν-mesons. The appearance of the constant c in the expression for their mass, \( m_{\nu} \), is perhaps unexpected, given that it parametrised the operator (Eq. 5.5.4) that yielded the Maxwell equations. Provided C and c have equal signs, \( m_{\nu} \) exceeds the mass of the K-mesons. Light quanta and the ν-mesons require the use of the 10-dimensional representation of the \( \beta_{\upmu} \), while the π- and K-mesons require the 5-dimensional one.
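The dependence of the ν-meson mass on the two constants in Eq. 5.5.5 can be explored with a few lines of Python. In this sketch C = 137 electron masses is the value quoted earlier in the text, whereas the values of c are purely hypothetical, since the text does not fix this constant; the function name is likewise illustrative.

```python
import math


def nu_meson_mass(C: float, c: float) -> float:
    """Evaluate m_nu = [7C(7C + 2c)]^(1/2) of Eq. 5.5.5."""
    radicand = 7.0 * C * (7.0 * C + 2.0 * c)
    if radicand < 0:
        raise ValueError("7C(7C + 2c) must be non-negative for a real mass")
    return math.sqrt(radicand)


C = 137.0                 # electron masses, as quoted in the text
kaon_mass = 7.0 * C       # K-meson mass 7C from item ii)

# HYPOTHETICAL values of c, chosen only to illustrate the trend.
for c in (0.0, 50.0, 200.0):
    print(f"c = {c:6.1f}: m_nu = {nu_meson_mass(C, c):8.1f}, m_K = {kaon_mass:.1f}")
```

For c = 0 the ν-meson mass coincides with the K-meson mass 7C, and it rises above it as soon as c takes the same sign as C.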

Thus not only did the wave-equation (Eq. 5.5.1) describe all bosons known at the time (1960), but it predicted the existence of a further 4 vector mesons (ν-mesons).

This prediction followed simply from the symmetries of the wave-equation (Eq. 5.5.1), without any consideration of interactions. Fröhlich thus considered it to be particularly significant when he learned that, precisely from considerations based on the empirical properties of the weak interaction, Lee and Yang had been led, the previous year, to the conclusion that 4 new bosons should exist with properties identical to those he had predicted (Lee and Yang 1960). Lee and Yang had called the new particles ‘schizons’, but, in a short note [F111], Fröhlich proposed that they be named \( \varphi \alpha \nu \chi \omega \nu\,({\text{from}}\;\varphi \alpha \nu o\sigma \;{\text{and}}\;\chi \omega \nu ) \) after his wife, Fanchon (Fig. 5.9). The same year (1961), K* vector meson resonancesFootnote 60 having precisely these properties were experimentally detected (Alston et al. 1961), but for some reason this was never alluded to by Fröhlich, perhaps because he was well aware of the unsatisfactory nature of his ad hoc determination of the form of the mass operator, which he admitted ‘....... should be replaced by a compelling derivation’, but which ‘.....will require deeper insight than has been achieved so far into properties of the new angular spaces.’ [F107].

Fig. 5.9

Prediction of new mesons, 1961 [F111], which he proposed to name after his wife—Reproduced with permission of the Institute of Physics

After 1961, many other mesons were discovered, which are not predicted by the above theory as it stands. In 1964, however, the whole direction of theoretical research in this area was dramatically altered with the advent of the quark model.

Prior to this, however, in his contribution [F108] to the Pauli memorial issue of the journal Helvetica Physica Acta, after presenting a more systematic treatment of his earlier work [F106, F107], he went on to consider, in more detail than previously, the implications of his treatment of reflections in terms of continuous transformations, noting that allowing for all possible combinations of space and time reflections quadruples the number of wave-equations. The other new element, however, was a ‘Note added in proof’, in which he reported that replacing the Kemmer \( \beta_{\upmu} \) by the corresponding Dirac matrices leads, under certain conditions, to a wave-equation for the electron-neutrino field. This was the starting point of the subsequent work of Fröhlich’s assistant in Liverpool, Ch. Terreaux, who showed that the decomposition of the restricted Lorentz group into a direct product of two-dimensional unimodular transformations permits Fröhlich’s quartet of fermion wave-equations to be obtained in a systematic way (Terreaux 1962); Terreaux went on to show that the existence of spin-½ particles entails the existence of just two kinds of electric charges in Minkowski space. Furthermore, his equations of motion for leptons exhaust only half of the total number of fermion equations, inviting speculation as to what the remaining half might describe.

In a long paper [F116] published the following year (in 1963) in Nuclear Physics (which was prefaced by a quotation adapted from the alchemical axiom of Maria Prophetissa:Footnote 61 ‘One becomes two, two becomes three, and out of the three comes the one as the fourth’), he went beyond his earlier phenomenological treatment, showing that continuous space reflections could be represented by non-linear transformations of a triad of 3-dimensional vectors, and that these transformations were equivalent to rotations of an associated tetrad in a 4-dimensional space.Footnote 62 It was found that the Lorentz invariance of the definitions of an unreflected and a fully reflected triad could only be maintained if, from a relativistic point of view, the triad has axial symmetry (Fig. 5.10).

Fig. 5.10

Paper on isobaric spin space [F116]—Reproduced with the permission of Elsevier

Further study in this paper of the angular structure of the new space revealed that its properties could be interpreted solely in terms of homogeneous Lorentz transformations. Correspondingly, it was found that the structure was such as to permit the introduction of wave-equations not only for the light quanta treated previously, but also (quite remarkably) for neutrinos as well. The equation for the latter turned out to be identical to that which he had first mentioned in the ‘Note added in proof’ to [F108], and then obtained in 1961 from consideration of the implications of his continuous treatment of reflections on the structure of momentum space [F110]. This earlier work had yielded a completely geometrical interpretation of the neutrino, in which neutrino charge found interpretation as a coordinate in terms of which the distinction between left and right-handedness could be described, quantization of its field according to the Pauli principle following as a necessary consequence (Fig. 5.11).Footnote 63

Fig. 5.11

Paper on the structure of momentum space, the neutrino and the Pauli principle [F110]—Reproduced with the permission of Elsevier

The possibility of a similar geometrical interpretation of the wave-equations for massive particles was considered to require, however, an essential extension to the purely angular structure admitted hitherto, involving the introduction of a length, consistent with the close connection between wave-equations and translations—i.e. inhomogeneous Lorentz transformations.

Four years later, hoping that the necessary geometrical concepts might already be available in electrodynamics, he showed that all 10 operators required for the definition of local generators of inhomogeneous Lorentz transformations exist in quantum (but not in classical) electrodynamics, since the required operators, which are based in a non-local way on the vector potential, do not commute with the fields they rotate and translate; consequently, field quantization no longer needed to be considered as a purely empirical feature but was seen, for the first time, to be actually imposed by geometrical requirements [F121]. In turn, the Maxwell equations acquired a corresponding geometrical significance, expressing in the local limit the invariance of the current density \( J_{\upmu} \) under translations; \( J_{\upmu} \) itself was expressed as the local limit of the derivative (with respect to an appropriate non-local coordinate) of a non-local scalar field that vanishes locally. Accordingly, the Maxwell equations themselves had now to be considered as the local limit of a bilocal theory.Footnote 64 He later noted at the end of [F124] that the same conclusion must indeed be drawn from the usual presentation of electric current densities in particle physics. For these densities are quadratic in the particle field operators, \( \psi \left( x \right) \), whence, in consequence of the singularities involved in their commutators, expressions like \( \psi^{\dag } \left( x \right)Q\,\psi \left( x \right) \) (where Q is an operator) have no meaning except in terms of the limit of \( \psi^{\dag } (x^\prime )Q\,\psi \left( x \right)\;{\text{as}}\;x \to x^\prime \).

Attributing particular significance to uniform dilations, which change the metric but leave Maxwell’s equations invariant, he concluded [F121] with the following profound statement, of particular relevance to his more general programme of geometrisation, and which reveals his concern about the origin of frames of reference:

Such invariance must also be demanded of any basic theory of particles; for establishment of a metric would require the existence of measuring instruments of length. They would consist of particles, whose existence cannot be postulated in a theory whose aim would be to derive this existence as one of its main consequences [F121].

Further consideration of these bilocal aspects of electrodynamics subsequently led to the introduction [F124] of a non-local generator of dual transformations under which the Maxwell equations with sources, \( J_{\upmu} \), remain invariant, in contrast to the usual local treatment where \( J_{\upmu} = 0 \) is necessary for such invariance. The new non-local generator involved a 4-dimensional integral over an infinitesimal region of a quantity closely connected with the scalar product \( \mathbf{E} \cdot \mathbf{B} \) of the electric and magnetic fields; it had integer eigenvalues and represented a new quantized property of the electromagnetic field, the precise nature of which remains to be established.

In conclusion, it must be acknowledged that, unlike the situation with Fröhlich’s contributions in many other areas of theoretical physics, references to these works in the literature are conspicuous by their absence, from which it can only be concluded that, despite their predictions and undoubted ingenuity, they had little or no impact or influence on future developments in this field; whether they were too radical, too little understood, or simply too far removed from contemporary fashions to merit serious attention must remain the subject of speculation. Many years later, however, considerable interest was expressed by Russian physicists, some of whom (including L.B. Okun) Fröhlich met during a visit to Moscow in 1983.