1 Introduction

1.1 Preface

This historical review of classical unified field theories consists of two parts. In the first, the development of unified field theory between 1914 and 1933, i.e., during the years EinsteinFootnote 1 lived and worked in Berlin, will be covered. In the second, the very active period after 1933 until the 1960s to 1970s will be reviewed. In the first version of Part I presented here, in view of the immense amount of material, neither all shades of unified field theory nor all the contributions from the various scientific schools will be discussed with the same intensity; I apologise for the shortcoming and promise to improve on it with the next version. At least, even if I do not discuss them all in detail, as many references as are necessary for a first acquaintance with the field are listed here; completeness may be reached only (if at all) by later updates. Although I also tried to take into account the published correspondence between the main figures, my presentation, again, is far from exhaustive in this context. Eventually, unpublished correspondence will have to be worked in, and this may change some of the conclusions. Purposely I included mathematicians and also theoretical physicists of lesser rank than those who are known to be responsible for big advances. My aim is to describe the field in its full variety as it presented itself to the reader at the time.

The review is written such that physicists should be able to follow the technical aspects of the papers (cf. Section 2), while historians of science without prior knowledge of the mathematics of general relativity at least might gain an insight into the development of concepts, methods, and scientific communities involved. I should hope that readers find more than one opportunity for further in-depth studies concerning the many questions left open.

I profited from earlier reviews of the field, or of parts of it, by PauliFootnote 2 ([246], Section V); Ludwig [212]; Whittaker ([414], pp. 188–196); Lichnerowicz [209]; Tonnelat ([356], pp. 1–14); Jordan ([176], Section III); Schmutzer ([290], Section X); Treder ([183], pp. 30–43); Bergmann ([12], pp. 62–73); Straumann [334, 335]; Vizgin [384, 385]Footnote 3; Bergia [11]; Goldstein and Ritter [146]; Straumann and O’Raifeartaigh [240]; Scholz [292]; and Stachel [330]. The section on Einstein’s unified field theories in Pais’ otherwise superb book presents the matter neither with the needed historical correctness nor with enough technical precision [241]. A recent contribution of van Dongen, focussing on Einstein’s methodology, was also helpful [371]. As will be seen, with regard to interpretations and conclusions, my views are different in some instances. In Einstein biographies, the subject of “unified field theories”, although it kept Einstein busy for the second half of his life, has been dealt with only in passing, e.g., in the book of Jordan [177], and in an unsatisfying way in excellent books by Fölsing [136] and by Hermann [159]. This situation is understandable; for to describe a genius stubbornly clinging to a set of ideas, sterile for physics in comparison with quantum mechanics, over a period of more than 30 years, is not very rewarding. For the short biographical notes, various editions of J. C. Poggendorff’s Biographisch-Literarischem Handwörterbuch and internet sources have been used (in particular [1]).

If not indicated otherwise, all non-English quotations have been translated by the author; the original text of quotations is given in footnotes.

1.2 Introduction to part I

Past experience has shown that formerly unrelated parts of physics could be fused into one single conceptual formalism by a new theoretical perspective: electricity and magnetism, optics and electromagnetism, thermodynamics and statistical mechanics, inertial and gravitational forces. In the second half of the 20th century, the electromagnetic and weak nuclear forces have been bound together as an electroweak force; a powerful scheme was devised to also include the strong interaction (chromodynamics), and led to the standard model of elementary particle physics. Unification with the fourth fundamental interaction, gravitation, is in the focus of much present research in classical general relativity, supergravity, superstring, and supermembrane theory but has not yet met with success. These types of “unifications” have increased the explanatory power of present day physical theories and must be considered as highlights of physical research.

In the historical development of the idea of unification, i.e., the joining of previously separated areas of physical investigation within one conceptual and formal framework, two closely linked yet conceptually somewhat different approaches may be recognised. In the first, the focus is on unification of representations of physical fields. An example is given by special relativity which, as a framework, must encompass all phenomena dealing with velocities close to the velocity of light in vacuum. The theory thus is said to provide “a synthesis of the laws of mechanics and of electromagnetism” ([16], p. 132). Einstein’s attempts at the inclusion of the quantum area into his classical field theories belong to this path. Nowadays, quantum field theory is such a unifying representationFootnote 4. In the second approach, predominantly the unification of the dynamics of physical fields is aimed at, i.e., a unification of the fundamental interactions. Maxwell’s theory might be taken as an example, unifying the electrical and the magnetic field once believed to be dynamically different. Most of the unified theories described in this review belong here: Gravitational and electromagnetic fields are to be joined into a new field. Obviously, this second line of thought cannot do without the first: A new representation of fields is always necessary.

In all the attempts at unification we encounter two distinct methodological approaches: a deductive-hypothetical and an empirical-inductive method. As Dirac pointed out, however,

“The successful development of science requires a proper balance between the method of building up from observations and the method of deducing by pure reasoning from speculative assumptions, […].” ([233], p. 1001)

In an unsuccessful hunt for progress with the deductive-hypothetical method alone, Einstein spent decades of his life on the unification of the gravitational with the electromagnetic and, possibly, other fields. Others joined him in such an endeavour, or even preceded him, including Mie, Hilbert, Ishiwara, Nordström, and othersFootnote 5. At the time, another road was impossible because of the lack of empirical basis due to the weakness of the gravitational interaction. A similar situation obtains even today within the attempts for reaching a common representation of all four fundamental interactions. Nevertheless, in terms of mathematical and physical concepts, a lot has been learned even from failed attempts at unification, vid. the gauge idea, or dimensional reduction (Kaluza-Klein), and much still might be learned in the future.

In the following I shall sketch, more or less chronologically, and by trailing Einstein’s path, the history of attempts at unifying what are now called the fundamental interactions during the period from about 1914 to 1933. Until the end of the thirties, the only accepted fundamental interactions were the electromagnetic and the gravitational, plus, tentatively, something like the “mesonic” or “nuclear” interaction. The physical fields considered in the framework of “unified field theory” including, after the advent of quantum (wave-) mechanics, the wave function satisfying either Schrödinger’s or Dirac’s equation, were all assumed to be classical fields. The quantum mechanical wave function was taken to represent the field of the electron, i.e., a matter field. In spite of this, the construction of quantum field theory had begun already around 1927 [52, 174, 178, 175, 179]. For the early history and the conceptual development of quantum field theory, cf. Section 1 of Schweber [322], or Section 7.2 of Cao [28]; for Dirac’s contributions, cf. [190]. Nowadays, it seems mandatory to approach unification in the framework of quantum field theory.

General relativity’s doing away with forces in exchange for a richer (and more complicated) geometry of space and time than the Euclidean remained the guiding principle throughout most of the attempts at unification discussed here. In view of this geometrization, Einstein considered the role of the stress-energy tensor \({T_{ik}}\) (the source term of his field equations \({G_{ik}} = - \kappa {T_{ik}}\)) a weak spot of the theory because it is a field devoid of any geometrical significance.

Therefore, the various proposals for a unified field theory, in the period considered here, included two different aspects:

  • An inclusion of matter in the sense of a desired replacement, in Einstein’s equations and their generalisation, of the energy-momentum tensor of matter by intrinsic geometrical structures, and, likewise, the removal of the electric current density vector as a non-geometrical source term in Maxwell’s equations.

  • The development of a unified field theory more geometrico for electromagnetism and gravitation, and in addition, later, of the “field of the electron” as a classical field of “de Broglie waves” without explicitly taking into account further matter sourcesFootnote 6.

In a very Cartesian spirit, Tonnelat ([356], p. 5) gives a definition of a unified field theory as “a theory joining the gravitational and the electromagnetic field into one single hyperfield whose equations represent the conditions imposed on the geometrical structure of the universe.” No material source terms are taken into accountFootnote 7. If, however, in this context, matter terms appear in the field equations of unified field theory, they are treated in the same way as the stress-energy tensor is in Einstein’s theory of gravitation: They remain alien elements.

For the theories discussed, the representation of matter oscillated between the point-particle concept, in which particles are considered as singularities of a field, and the concept of particles as everywhere regular field configurations of a solitonic character. In a theory for continuous fields as in general relativity, the concept of point-particle is somewhat amiss. Nevertheless, geodesics of the Riemannian geometry underlying Einstein’s theory of gravitation are identified with the worldlines of freely moving point-particles. The field at the location of a point-particle becomes unbounded, or “singular”, such that the derivation of equations of motion from the field equations is a non-trivial affair. The competing paradigm of a particle as a particular field configuration of the electromagnetic and gravitational fields later has been pursued by J. A. Wheeler under the names “geon” and “geometrodynamics” in both the classical and the quantum realm [412]. In our time, gravitational solitonic solutions also have been found [235, 26].

Even before the advent of quantum mechanics proper, in 1925–26, Einstein raised his expectations with regard to unified field theory considerably; he wanted to bridge the gap between classical field theory and quantum theory, preferably by deriving quantum theory as a consequence of unified field theory. He even seemed to have believed that the quantum mechanical properties of particles would follow as a fringe benefit from his unified field theory; in connection with his classical teleparallel theory it is reported that Einstein, in an address at the University of Nottingham, said that he

“is in no way taking notice of the results of quantum calculation because he believes that by dealing with microscopic phenomena these will come out by themselves. Otherwise he would not support the theory.” ([91], p. 610)

However, in connection with one of his moves, i.e., the 5-vector version of Kaluza’sFootnote 8 theory (cf. Sections 4.2, 6.3), which for him provided “a logical unity of the gravitational and the electromagnetic fields”, he regretfully acknowledged:

“But one hope did not get fulfilled. I thought that upon succeeding to find this law, it would form a useful theory of quanta and of matter. But, this is not the case. It seems that the problem of matter and quanta makes the construction fall apart.”Footnote 9 ([96], p. 442)

Thus, unfortunately, also the hopes of the eminent mathematician SchoutenFootnote 10, who knew some physics, were unfulfilled:

“[…] collections of positive and negative electricity which we are finding in the positive nuclei of hydrogen and in the negative electrons. The older Maxwell theory does not explain these collections, but also by the newer endeavours it has not been possible to recognise these collections as immediate consequences of the fundamental differential equations studied. However, if such an explanation should be found, we may perhaps also hope that new light is shed on the […] mysterious quantum orbits.”Footnote 11 ([301], p. 39)

In this context, through all the years, Einstein vainly tried to derive, from the field equations of his successive unified field theories, the existence of elementary particles with opposite though otherwise equal electric charge but unequal mass. In correspondence with the state of empirical knowledge at the time (i.e., before the positron was found in 1932/33), but despite theoretical hints pointing in a different direction to be found in Dirac’s papers, he always paired electron and protonFootnote 12.

Of course, by quantum field theory the dichotomy between matter and fields in the sense of a dualism is minimised as every field carries its particle-like quanta. Today’s unified field theories appear in the form of gauge theories; matter is represented by operator valued spin-half quantum fields (fermions) while the “forces” mediated by “exchange particles” are embodied in gauge fields, i.e., quantum fields of integer spin (bosons). The space-time geometry used is rigidly fixed, and usually taken to be Minkowski space or, within string and membrane theory, some higher-dimensional manifold also loosely called “space-time”, although its signature might not be Lorentzian and its dimension might be 10, 11, 26, or some other number larger than four. A satisfactory inclusion of gravitation into the scheme of quantum field theory still remains to be achieved.

In the period considered, mutual reservations may have existed between the followers of the new quantum mechanics and those joining Einstein in the extension of his general relativity. The latter might have been puzzled by the seeming relapse of quantum mechanics from general covariance to a mere Galilei- or Lorentz-invariance, and by the statistical interpretation of the Schrödinger wave function. LanczosFootnote 13, in 1929, was well aware of his being out of tune with the adherents of quantum mechanics:

“I therefore believe that between the ‘reactionary point of view’ represented here, aiming at a complete field-theoretic description based on the usual space-time structure and the probabilistic (statistical) point of view, a compromise […] no longer is possible.”Footnote 14 ([198], p. 486, footnote)

On the other hand, those working in quantum theory may have frowned upon the wealth of objects within unified field theories uncorrelated to a convincing physical interpretation and thus, in principle, unrelated to observation. In fact, until the 1930s, attempts still were made to “geometrize” wave mechanics while, roughly at the same time, quantisation of the gravitational field had also been tried [284]. Einstein belonged to those who regarded the idea of unification as more fundamental than the idea of field quantisation [95]. His thinking is reflected very well in a remark made by Lanczos at the end of a paper in which he tried to combine Maxwell’s and Dirac’s equations:

“If the possibilities anticipated here prove to be viable, quantum mechanics would cease to be an independent discipline. It would melt into a deepened ‘theory of matter’ which would have to be built up from regular solutions of non-linear differential equations, — in an ultimate relationship it would dissolve in the ‘world equations’ of the Universe. Then, the dualism ‘matter-field’ would have been overcome as well as the dualism ‘corpuscle-wave’.”Footnote 15 ([198], p. 493)

Lanczos’ work shows that there has been also a smaller subprogram of unification as described before, i.e., the view that somehow the electron and the photon might have to be treated together. Therefore, a common representation of Maxwell’s equations and the Dirac equation was looked for (cf. Section 7.1).

During the time span considered here, there also were those whose work did not help the idea of unification. For example, van DantzigFootnote 16 wrote a series of papers, in the first of which he stated:

“It is remarkable that not only no fundamental tensor [first fundamental form] or tensor-density, but also no connection, neither Riemannian nor projective, nor conformal, is needed for writing down the [Maxwell] equations. Matter is characterised by a bivectordensity […].” ([367], p. 422, and also [363, 364, 365, 366])

If one of the fields to be united asks for less “geometry”, why mount all the effort needed for generalising Riemannian geometry?

A methodological weak point in the process of the establishment of field equations for unified field theory was the constructive weakness of alternate physical limits to be taken:

  • no electromagnetic field → Einstein’s equations in empty space;

  • no gravitational field → Maxwell’s equations;

  • “weak” gravitational and electromagnetic fields → Einstein-Maxwell equations;

  • no gravitational field but a “strong” electromagnetic field → some sort of non-linear electrodynamics.

A similar weakness occurred for the equations of motion; about the only limiting equation to be reproduced was Newton’s equation augmented by the Lorentz force. Later, attempts were made to replace the relationship “geodesics → freely falling point particles” by more general assumptions for charged or electrically neutral point particles — depending on the more general (non-Riemannian) connections introducedFootnote 17. A main hindrance for an eventual empirical check of unified field theory was the persistent lack of a worked out example leading to a new gravito-electromagnetic effect.

In the following Section 2, a multitude of geometrical concepts (affine, conformal, projective spaces, etc.) available for unified field theories, on the one hand, and their use as tools for a description of the dynamics of the electromagnetic and gravitational field, on the other, will be sketched. Then, we look at the very first steps towards a unified field theory taken by ReichenbächerFootnote 18, Förster (alias Bach), WeylFootnote 19, EddingtonFootnote 20, and Einstein (see Section 3.1). In Section 4, the main ideas are developed. They include Weyl’s generalization of Riemannian geometry by the addition of a linear form (see Section 4.1) and the reaction to this approach. To this, Kaluza’s idea concerning a geometrization of the electromagnetic and gravitational fields within a five-dimensional space will be added (see Section 4.2) as well as the subsequent extensions of Riemannian to affine geometry by Schouten, Eddington, Einstein, and others (see Section 4.3). After a short excursion to the world of mathematicians working on differential geometry (see Section 5), the research of Einstein and his assistants is studied (see Section 6). Kaluza’s theory received a great deal of attention after O. Klein’sFootnote 21 intervention and extension of Kaluza’s paper (see Section 6.3.2). Einstein’s treatment of a special case of a metric-affine geometry, i.e., “distant parallelism”, set off an avalanche of research papers (see Section 6.4.4), the more so as, at the same time, the covariant formulation of Dirac’s equation was a hot topic. The appearance of spinors in a geometrical setting, and endeavours to link quantum physics and geometry (in particular, the attempt to geometrize wave mechanics) are also discussed (see Section 7). We have included this topic although, strictly speaking, it only touches the fringes of unified field theory.

In Section 9, particular attention is given to the mutual influence exerted on each other by the Princeton (EisenhartFootnote 22, VeblenFootnote 23), French (CartanFootnote 24), and the Dutch (Schouten, StruikFootnote 25) schools of mathematicians, and the work of physicists such as Eddington, Einstein, their collaborators, and others. In Section 10, the reception of unified field theory at the time is briefly discussed.

2 The Possibilities of Generalizing General Relativity: A Brief Overview

As a rule, the point of departure for unified field theory was general relativity. The additional task then was to ‘geometrize’ the electromagnetic field. In this review, we will encounter essentially five different ways to include the electromagnetic field into a geometric setting:

  • by connecting an additional linear form to the metric through the concept of “gauging” (Weyl);

  • by introducing an additional space dimension (Kaluza);

  • by choosing an asymmetric Ricci tensor (Eddington);

  • by adding an antisymmetric tensor to the metric (Bach, Einstein);

  • by replacing the metric by a 4-bein field (Einstein).

In order to bring some order into the wealth of these attempts towards “unified field theory,” I shall distinguish four main avenues extending general relativity, according to their mathematical direction: generalisation of

  • geometry,

  • dynamics (Lagrangians, field equations),

  • number field, and

  • dimension of space,

as well as their possible combinations. In the period considered, all four directions were followed, as well as combinations of them, e.g., five-dimensional theories with quadratic curvature terms in the Lagrangian. Nevertheless, we will almost exclusively be dealing with the extension of geometry and of the number of space dimensions.

2.1 Geometry

It is very easy to get lost in the many constructive possibilities underlying the geometry of unified field theories. We briefly describe the mathematical objects occurring in an order that goes from the less structured to the more structured cases. In the following, only local differential geometry is taken into accountFootnote 26.

The space of physical events will be described by a real, smooth manifold \({M_D}\) of dimension D, coordinatised by local coordinates \({x^i}\), and provided with smooth vector fields X, Y, … with components \({X^i},{Y^i}, \ldots\) and linear forms ω, ν, … with components \({\omega _i},{\nu _i}, \ldots\) in the local coordinate system, as well as further geometrical objects such as tensors, spinors, and connectionsFootnote 27. At each point, D linearly independent vectors (linear forms) span a linear space, the tangent space (cotangent space) of \({M_D}\). We will assume that the manifold \({M_D}\) is space- and time-orientable. On it, two independent fundamental structural objects will now be introduced.

2.2 Metrical structure

The first is a prescription for the definition of the distance ds between two infinitesimally close points on \({M_D}\), eventually corresponding to temporal and spatial distances in the external world. For ds, we need positivity, symmetry in the two points, and the validity of the triangle inequality. We know that ds must be homogeneous of degree one in the coordinate differentials \(d{x^i}\) connecting the points. This condition is not very restrictive; it still includes Finsler geometry [281, 126, 224], to be briefly touched upon below.

In the following, ds is linked to a non-degenerate bilinear form g(X, Y), called the first fundamental form; the corresponding quadratic form defines a tensor field, the metrical tensor, with \({D^2}\) components \({g_{ij}}\) such that

$$ ds = \sqrt {{g_{ij}}d{x^i}d{x^j}} , $$

where the neighbouring points are labeled by \({x^i}\) and \({x^i} + d{x^i}\), respectivelyFootnote 28. Besides the norm of a vector \(\left| X \right|: = \sqrt {{g_{ij}}{X^i}{X^j}}\), the “angle” between directions X, Y can be defined with the help of the metric:

$$ \cos (\angle (X,Y)): = \frac{{{g_{ij}}{X^i}{Y^j}}}{{\left| X \right|\left| Y \right|}}. $$

From this we note that an antisymmetric part of the metrical tensor does not influence distances and norms but angles.
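
As a brief check of this remark (a sketch, assuming the metric is split into its symmetric part \({g_{(ij)}}\) and its antisymmetric part \({g_{[ij]}}\)): the product \({X^i}{X^j}\) is symmetric in i and j, so its contraction with \({g_{[ij]}}\) vanishes, while the mixed product \({X^i}{Y^j}\) has no such symmetry:

$$ {g_{ij}}{X^i}{X^j} = {g_{(ij)}}{X^i}{X^j},\qquad {g_{ij}}{X^i}{Y^j} = {g_{(ij)}}{X^i}{Y^j} + {g_{[ij]}}{X^i}{Y^j}. $$

Hence norms see only the symmetric part, whereas the numerator in the expression for the angle generally picks up a contribution from the antisymmetric part.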

With the metric tensor having full rank, its inverse \({g^{ik}}\) is defined throughFootnote 29

$$ {g_{mi}}{g^{mj}} = \delta _i^j $$

We are used to g being a symmetric tensor field, i.e., with \({g_{ik}} = {g_{(ik)}}\) and with only D(D+1)/2 components; in this case the metric is called Riemannian if it is positive (negative) definite and Lorentzian if its signature is ±(D−2)Footnote 30. In the following this need not hold, so that the decomposition obtainsFootnote 31:

$$ {g_{ik}} = {\gamma _{(ik)}} + {\phi _{\left[ {ik} \right]}}. $$

An asymmetric metric was considered in one of the first attempts at unifying gravitation and electromagnetism after the advent of general relativity.

For an asymmetric metric, the inverse

$$ {g^{ik}} = {h^{(ik)}} + {f^{[ik]}} = {h^{ik}} + {f^{ik}} $$

is determined by the relations

$$ \begin{array}{*{20}{l}} {{\gamma _{ij}}{\gamma ^{ik}} = \delta _j^k,}&{\;\;\;{\phi _{ij}}{\phi ^{ik}} = \delta _j^k,\;\;\;}&{{h_{ij}}{h^{ik}} = \delta _j^k,\;\;\;}&{{f_{ij}}{f^{ik}} = \delta _j^k,} \end{array} $$

and turns out to be [356]

$$ {h^{(ik)}} = \frac{\gamma }{g}{\gamma ^{ik}} + \frac{\phi }{g}{\phi ^{im}}{\phi ^{kn}}{\gamma _{mn}}, $$
$$ {f^{[ik]}} = \frac{\phi }{g}{\phi ^{ik}} + \frac{\gamma }{g}{\gamma ^{im}}{\gamma ^{kn}}{\phi _{mn}}, $$

where g, φ, and γ are the determinants of the corresponding tensors \({g_{ik}}\), \({\phi _{ik}}\), and \({\gamma _{ik}}\). We also note that

$$ g = \gamma + \phi + \frac{\gamma }{2}{\gamma ^{kl}}{\gamma ^{mn}}{\phi _{km}}{\phi _{ln}}, $$

where \(g: = \det {g_{ik}}\), \(\phi : = \det {\phi _{ik}}\), and \(\gamma : = \det {\gamma _{ik}}\). The results (6, 7, 8) were obtained already by Reichenbächer ([273], pp. 223–224)Footnote 32 and also by Schrödinger [320]. Eddington also calculated Equation (8); in his expression the term ∼φik*φik is missing (cf. [59], p. 233).
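
The determinant relation can be checked in a simple special case. The following choice is purely illustrative and is not taken from the works cited: let D=4, \({\gamma _{ik}} = {\delta _{ik}}\), and let the only non-vanishing components of the antisymmetric part be \({\phi _{12}} = - {\phi _{21}} = a\) and \({\phi _{34}} = - {\phi _{43}} = b\). Then \({g_{ik}}\) is block diagonal, and

$$ g = (1 + {a^2})(1 + {b^2}),\qquad \gamma = 1,\qquad \phi = {a^2}{b^2},\qquad \frac{\gamma }{2}{\gamma ^{kl}}{\gamma ^{mn}}{\phi _{km}}{\phi _{ln}} = \frac{1}{2}\sum\limits_{k,m} {{{({\phi _{km}})}^2}} = {a^2} + {b^2}, $$

so that

$$ \gamma + \phi + \frac{\gamma }{2}{\gamma ^{kl}}{\gamma ^{mn}}{\phi _{km}}{\phi _{ln}} = 1 + {a^2}{b^2} + {a^2} + {b^2} = g. $$

In this example, dropping the term \(\phi = {a^2}{b^2}\) would spoil the equality whenever both a and b are non-zero.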

The manifold is called space-time if D=4 and the metric is symmetric and Lorentzian, i.e., symmetric and with signature sig g=±2. Nevertheless, sloppy contemporaneous usage of the term “space-time” includes arbitrary dimension, and sometimes is applied even to metrics with arbitrary signature.

In a manifold with Lorentzian metric, a non-trivial real conformal structure always exists; from the equation

$$g(X,X) = 0$$

results an equivalence class of metrics {λg} with λ being an arbitrary smooth function. In view of the physical interpretation of the light cone as the locus of light signals, a causal structure is provided by the equivalence class of metrics [67]. For an asymmetric metric, this structure can exist as well; it then is determined by the symmetric part \({\gamma _{ik}} = {\gamma _{(ik)}}\) of the metric alone, taken to be Lorentzian.
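
Why only the equivalence class matters can be seen in one line (a sketch, assuming that the function λ vanishes nowhere). For a rescaled metric \(\tilde g = \lambda g\),

$$ \tilde g(X,X) = \lambda \,g(X,X) = 0\quad \Longleftrightarrow \quad g(X,X) = 0, $$

so that g and λg share the same null directions and hence the same light cones. For an asymmetric metric, \(g(X,X) = {\gamma _{ik}}{X^i}{X^k}\) because the antisymmetric part drops out of the quadratic form; this is the reason why the symmetric part alone fixes the structure.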

A special case of a space with a Lorentzian metric is Minkowski space, whose metrical components, in Cartesian coordinates, are given by

$$ {\eta _{ik}} = {\delta _i}^0{\delta _k}^0 - {\delta _i}^1{\delta _k}^1 - {\delta _i}^2{\delta _k}^2 - {\delta _i}^3{\delta _k}^3. $$

A geometrical characterization of Minkowski space as an uncurved, flat space is given below. Let \({{\mathcal L}_X}\) be the Lie derivative with respect to the tangent vector XFootnote 33; then \({{\mathcal L}_{{X_p}}}{\eta _{ik}} = 0\) holds for the generators \({X_p}\) of the Lorentz group.

The metric tensor g may also be defined indirectly through D vector fields forming an orthonormal D-leg (-bein) \(h_{\hat \imath }^k\) with

$$ {g_{lm}} = {h_{l\hat \jmath }}{h_{m\hat k}}{\eta ^{\hat \jmath \hat k}}, $$

where the hatted indices (“bein-indices”) count the number of legs spanning the tangent space at each point (ĵ=1, 2, … , D) and are moved with the Minkowski metricFootnote 34. From the geometrical point of view, this can always be done (cf. theories with distant parallelism). By introducing 1-forms \({\theta ^{\hat k}}: = h_l^{\hat k}d{x^l}\), Equation (11) may be brought into the form \(d{s^2} = {\theta ^{\hat \imath}}{\theta ^{\hat k}}{\eta _{\hat \imath\hat k}}\).
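
A minimal illustration may help. The two-dimensional line element below is a hypothetical example chosen for simplicity, with the bein-indices labelled \(\hat 0,\hat 1\) and \({\eta _{\hat \imath \hat k}} = {\rm{diag}}(1, - 1)\). For coordinates (t, x) and a smooth, nowhere vanishing function a(t), take the 1-forms

$$ {\theta ^{\hat 0}} = dt,\qquad {\theta ^{\hat 1}} = a(t)\,dx. $$

Then

$$ d{s^2} = {\theta ^{\hat \imath }}{\theta ^{\hat k}}{\eta _{\hat \imath \hat k}} = d{t^2} - a{(t)^2}d{x^2}, $$

i.e., \({g_{tt}} = 1\), \({g_{xx}} = - {a^2}\), \({g_{tx}} = 0\). Any other bein obtained from this one by a local Lorentz transformation acting on the hatted indices reproduces the same metric.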

A new physical aspect will come in if the \(h_{\hat \imath }^k\) are considered to be the basic geometric variables satisfying field equations, not the metric. Such tetrad-theories (for the case D=4) are described well by the concept of fibre bundle. The fibre at each point of the manifold contains, in the case of an orthonormal D-bein (tetrad), all D-beins (tetrads) related to each other by transformations of the group O(D), or the Lorentz group, and so on.

In Finsler geometry, the line element depends not only on the coordinates \({x^i}\) of a point on the manifold, but also on the infinitesimal elements of direction \(d{x^i}\) between neighbouring points:

$$ d{s^2} = {g_{ij}}({x^n},d{x^m})d{x^i}d{x^j}. $$

Again, ds is required to be homogeneous of degree 1 in the \(d{x^i}\), which means that \({g_{ij}}\) is homogeneous of degree 0 in them.

2.2.1 Affine structure

The second structure to be introduced is a linear connection L with \({D^3}\) components \({L_{ij}}^k\); it is a geometrical object but not a tensor field, and its components change inhomogeneously under local coordinate transformationsFootnote 35. The connection is a device introduced for establishing a comparison of vectors at different points of the manifold. With its help, a tensorial derivative ∇, called the covariant derivative, is constructed. For each vector field and each tangent vector it provides another unique vector field. On the components of vector fields X and linear forms ω it is defined by

$$\begin{array}{*{20}{c}} {{{\mathop \nabla \limits^ + }_k}{X^i} = \frac{{\partial {X^i}}}{{\partial {x^k}}} + {L_{kj}}^i{X^j},\;\;\;}&{{{\mathop \nabla \limits^ + }_k}{\omega _i} = \frac{{\partial {\omega _i}}}{{\partial {x^k}}} - {L_{ki}}^j{\omega _j}.} \end{array}$$

The expressions \({\mathop \nabla \limits^ + _k}{X^i}\) and \(\tfrac{{\partial {X^i}}}{{\partial {x^k}}}\) are abbreviated by \({X^i}_{\left\| k \right.}\) and \({X^i}_{,k}\), respectively, while for a scalar f the covariant and partial derivatives coincide: \({\nabla _i}f = {\tfrac{{\partial f}}{{\partial {x^i}}}} \equiv {\partial _i}f \equiv {f_{,i}}\).

We have adopted the notational convention used by Schouten [300, 310, 389]. Eisenhart and others [121, 234] change the order of indices of the components of the connection:

$$ \begin{array}{*{20}{c}} {{{\mathop \nabla \limits^ - }_k}{X^i} = \frac{{\partial {X^i}}}{{\partial {x^k}}} + {L_{jk}}^i{X^j},\;\;\;}&{{{\mathop \nabla \limits^ - }_k}{\omega _i} = \frac{{\partial {\omega _i}}}{{\partial {x^k}}} - {L_{ik}}^j{\omega _j}.} \end{array} $$

As long as the connection is symmetric, this does not make any difference as \({\mathop \nabla \limits^ + _k}{X^i} - {\mathop \nabla \limits^ - _k}{X^i} = 2{L_{[kj]}}^i{X^j}\). For both kinds of derivatives we have:

$$ \begin{array}{*{20}{c}} {{{\mathop \nabla \limits^ + }_k}({v^l}{w_l}) = \frac{{\partial ({v^l}{w_l})}}{{\partial {x^k}}},\;\;\;}&{{{\mathop \nabla \limits^ - }_k}({v^l}{w_l}) = \frac{{\partial ({v^l}{w_l})}}{{\partial {x^k}}}} \end{array} $$
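
The first of these relations follows from the definitions of the covariant derivative of \({X^i}\) and \({\omega _i}\) given above together with the product rule (a short verification; the second relation is obtained in the same way with the order of the lower indices of the connection exchanged):

$$ {\mathop \nabla \limits^ + }_k({v^l}{w_l}) = \left( {{\partial _k}{v^l} + {L_{kj}}^l{v^j}} \right){w_l} + {v^l}\left( {{\partial _k}{w_l} - {L_{kl}}^j{w_j}} \right) = {\partial _k}({v^l}{w_l}) + {L_{kj}}^l{v^j}{w_l} - {L_{kl}}^j{v^l}{w_j} = {\partial _k}({v^l}{w_l}), $$

since the two connection terms cancel after the summation indices j and l have been renamed in one of them.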

Both derivatives are used in versions of unified field theory by Einstein and othersFootnote 36.

A manifold provided with only a linear connection L is called affine space. From the point of view of group theory, the affine group (linear inhomogeneous coordinate transformations) plays a special role: With regard to it the connection transforms as a tensor (cf. Section 2.1.5).

For a vector density \(\hat X\) (cf. Section 2.1.5), the covariant derivative contains one more term:

$$ \begin{array}{*{20}{c}} {{{\mathop \nabla \limits^ + }_k}{{\hat X}^i} = \frac{{\partial {{\hat X}^i}}}{{\partial {x^k}}} + {L_{kj}}^i{{\hat X}^j} - {L_{kr}}^r{{\hat X}^i},\;\;\;}&{{{\mathop \nabla \limits^ - }_k}{{\hat X}^i} = \frac{{\partial {{\hat X}^i}}}{{\partial {x^k}}} + {L_{jk}}^i{{\hat X}^j} - {L_{rk}}^r{{\hat X}^i}.} \end{array} $$
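
As an illustration of the role of the extra term (a sketch based only on the first formula displayed above), contracting the indices i and k gives the divergence of the vector density:

$$ {\mathop \nabla \limits^ + }_k{\hat X^k} = \frac{{\partial {{\hat X}^k}}}{{\partial {x^k}}} + {L_{kj}}^k{\hat X^j} - {L_{kr}}^r{\hat X^k} = \frac{{\partial {{\hat X}^k}}}{{\partial {x^k}}} + 2{L_{[kj]}}^k{\hat X^j}. $$

For a symmetric connection the connection terms cancel, and the covariant divergence of a vector density reduces to the ordinary divergence.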

A smooth vector field Y is said to be parallelly transported along a parametrised curve λ(u) with tangent vector X if for its components \({Y^i}_{\left\| k \right.}{X^k}(u) = 0\) holds along the curve. A curve is called an autoparallel if its tangent vector is parallelly transported along it at each pointFootnote 37:

$$ {X^i}_{\left\| k \right.}{X^k}(u) = \sigma (u){X^i}. $$

By a particular choice of the curve’s parameter, σ=0 may be imposed.

A transformation mapping autoparallels to autoparallels is given by:

$$ {L_{ik}}^j \to {L_{ik}}^j + {\delta ^j}_{(i}{\omega _{k)}}. $$

The equivalence class of autoparallels defined by Equation (18) defines a projective structure on \({M_D}\) [404, 403].

The particular set of connections

$$ _{(p)}{L_{ij}}^k: = {L_{ij}}^k - \frac{2}{{D + 1}}{\delta ^k}_{(i}{L_{j)}} $$

with \({L_j}: = {L_{jm}}^m\) is mapped into itself by the transformation (18) [348].
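That the combination above is left unchanged by the transformation of the connection can be checked with exact rational arithmetic. The following is a minimal sketch, not a proof: the dimension D = 4, the random integer components and the function names (`random_connection`, `projective_part`, `shift`) are illustrative assumptions, and the trace is taken as \({L_j} = {L_{jm}}^m\).

```python
from fractions import Fraction
import random

D = 4  # dimension of the manifold (illustrative choice)


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def random_connection(rng):
    """Random components L[i][j][k] standing for L_ij^k; no symmetry assumed."""
    return [[[Fraction(rng.randint(-5, 5)) for _ in range(D)]
             for _ in range(D)] for _ in range(D)]


def trace(L):
    """L_j := L_jm^m (sum over m)."""
    return [sum(L[j][m][m] for m in range(D)) for j in range(D)]


def projective_part(L):
    """(p)L_ij^k := L_ij^k - 2/(D+1) * delta^k_(i L_j)."""
    Lj = trace(L)
    return [[[L[i][j][k]
              - Fraction(2, D + 1) * Fraction(1, 2)
              * (delta(k, i) * Lj[j] + delta(k, j) * Lj[i])
              for k in range(D)] for j in range(D)] for i in range(D)]


def shift(L, omega):
    """L_ik^j -> L_ik^j + delta^j_(i omega_k)."""
    return [[[L[i][k][j]
              + Fraction(1, 2) * (delta(j, i) * omega[k] + delta(j, k) * omega[i])
              for j in range(D)] for k in range(D)] for i in range(D)]


rng = random.Random(0)
L = random_connection(rng)
omega = [Fraction(n) for n in (1, -2, 3, 5)]  # arbitrary covector omega_k
print(projective_part(shift(L, omega)) == projective_part(L))
```

The cancellation rests on the trace shifting by \(\tfrac{{D + 1}}{2}{\omega _j}\), which the prefactor \(\tfrac{2}{{D + 1}}\) compensates.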

In Part II of this article, we shall find the set of transformations \({L_{ik}}^j \to {L_{ik}}^j + {\delta ^j}_i\tfrac{{\partial \omega }}{{\partial {x^k}}}\) playing a role in versions of Einstein’s unified field theory.

From the connection \({L_{ij}}^k\) further connections may be constructed by adding an arbitrary tensor field T to its symmetrised partFootnote 38:

$$ {\bar L_{ij}}^k = {L_{(ij)}}^k + {T_{ij}}^k = {\Gamma _{ij}}^k + {T_{ij}}^k. $$

By special choice of T we can regain all connections used in work on unified field theories. We will encounter examples in later sections. The antisymmetric part of the connection, i.e.,

$$ {S_{ij}}^k = {L_{[ij]}}^k = {T_{[ij]}}^k $$

is called torsion; it is a tensor field. The trace of the torsion tensor \({S_i}: = {S_{il}}^l\) is called torsion vector; it connects to the two traces of the affine connection \({L_i}: = {L_{il}}^l;{{\tilde L}_j}: = {L_{lj}}^l\) as \({S_i} = \tfrac{1}{2}({L_i} - {\tilde L_i})\).

2.2.2 Different types of geometry

Affine geometry

Various subcases of affine spaces will occur, depending on whether the connection is asymmetric or symmetric, i.e., with \({L_{ij}}^k = {\Gamma _{ij}}^k\). In physical applications, a metric always seems to be needed; hence in affine geometry it must be derived solely with the help of the connection or, rather, of tensorial objects constructed from it. This is in stark contrast to Riemannian geometry where, vice versa, the connection is derived from the metric. Such tensorial objects are the two affine curvature tensors defined byFootnote 39

$$ \mathop {{\rm{ }}K}\limits^ + {\;^i}_{jkl}\; = {\partial _k}L_{lj}^{\;\;\;i} - {\partial _l}L_{kj}^{\;\;\;i} + L_{km}^{\;\;\;i}L_{lj}^{\;\;\;m} - L_{lm}^{\;\;\;i}L_{kj}^{\;\;\;m}, $$
$$ \mathop {{\rm{ }}K}\limits^ - {\;^i}_{jkl}\; = {\partial _k}L_{jl}^{\;\;\;i} - {\partial _l}L_{jk}^{\;\;\;i} + L_{mk}^{\;\;\;i}L_{jl}^{\;\;\;m} - L_{ml}^{\;\;\;i}L_{jk}^{\;\;\;m}, $$

respectively. In a geometry with symmetric affine connection both tensors coincide because of

$$ \frac{1}{2}(\mathop {{\rm{ }}K}\limits^ + \;_{jkl}^i - \mathop {{\rm{ }}K}\limits^ - \;_{jkl}^i) = {\partial _{[k}}{S_{l]j}}^i + 2{S_{j[k}}^mS_{l]m}^{\;\;\;i} + L_{m[k}^{\;\;\;i}{S_{l]j}}^m - L_{j[k}^{\;\;\;\;m}{S_{l]m}}^i. $$

In particular, in Riemannian geometry, both affine curvature tensors reduce to the one and only Riemann curvature tensor.

The curvature tensors arise because the covariant derivative is not commutative and obeys the Ricci identity:

$$ \mathop {{\rm{ }}\nabla }\limits^ + {\,_{[j}}\mathop \nabla \limits^ + {\,_{k]}}{A^i} = \frac{1}{2}\mathop {{\rm{ }}K}\limits^ + {\,^i}_{rjk}{A^r} - {S_{jk}}^r\mathop {{\rm{ }}\nabla }\limits^ + {\,_r}{A^i} $$
$$ \mathop {{\rm{ }}\nabla }\limits^ - {\,_{[j}}\mathop \nabla \limits^ - {\,_{k]}}{A^i} = \frac{1}{2}\mathop {{\rm{ }}K}\limits^ - {\,^i}_{rjk}{A^r} - {S_{jk}}^r\mathop {{\rm{ }}\nabla }\limits^ - {\,_r}{A^i} $$

For a vector density, the identity is given by

$$ \mathop {{\rm{ }}\nabla }\limits^ + {\,_{[j}}\mathop \nabla \limits^ + {\,_{\,k]}}{{\hat A}^i} = \frac{1}{2}\mathop {{\rm{ }}K}\limits^ + {\,^i}_{rjk}{{\hat A}^r} - {S_{jk}}^r\mathop \nabla \limits^ + {\,_{\,r}}{{\hat A}^i} + \frac{1}{2}{V_{jk}}{{\hat A}^i} $$

with the homothetic curvature \({V_{jk}}\) to be defined below in Equation (31).

The curvature tensor (22) satisfies two algebraic identities:

$$ {\mathop {{\rm{ }}K}\limits^ + {\;^i}_{j(kl)}\; = 0,} $$
$$ {\mathop {{\rm{ }}K}\limits^ + {\;^i}_{\{ jkl\} }\; = 2{\nabla _{\{ j}}{S_{kl\} }}^i + 4{S_{m\{ j}}^i{S_{kl\} }}^m,} $$

where the curly bracket denotes cyclic permutation:

$$ {K^i}_{\{ jkl\} }: = {K^i}_{jkl} + {K^i}_{ljk} + {K^i}_{klj}. $$

These identities can be found in Schouten’s book of 1924 ([300], pp. 88, 91) as well as the additional single integrability condition, called the Bianchi identity:

$$ \mathop {{\rm{ }}K}\limits^ + {\;^i}_{j\{ kl\left\| {m\} } \right.}\; = 2{K^i}_{r\{ kl}{S_{m\} j}}^r. $$

A corresponding condition obtains for the curvature tensor from Equation (23).

From both affine curvature tensors we may form two different tensorial traces each. In the first case \({V_{kl}}: = {K^i}_{ikl} = {V_{[kl]}}\), and \({K_{jk}}: = {K^i}_{jki}\). \({V_{kl}}\) is called homothetic curvature, while \({K_{jk}}\) is the first of the two affine generalisations from \(\mathop {{\rm{ }}K}\limits^ +\) and \(\mathop {{\rm{ }}K}\limits^ -\) of the Ricci tensor in Riemannian geometry. We getFootnote 40

$$ {V_{kl}} = {\partial _k}{L_l} - {\partial _l}{L_k}, $$

and the following identities hold:

$$ {V_{kl}} + 2{K_{[kl]}} = 4{\nabla _{[k}}{S_{l]}} + 8{S_{kl}}^m{S_m} + 2{\nabla _m}{S_{kl}}^m, $$
$$ \mathop V\limits^ - {\,_{kl}} + 2\mathop K\limits^ - {\,_{[kl]}} = - 4\mathop \nabla \limits^ - {\,_{[k}}{S_{l]}} + 8{S_{kl}}^m{S_m} + 2{\nabla _m}{S_{kl}}^m, $$

where \({S_k}: = {S_{kl}}^l\). While \({V_{kl}}\) is antisymmetric, \({K_{jk}}\) has both tensorial symmetric and antisymmetric parts:

$$ {K_{[kl]}} = - {\partial _{[k}}{{\tilde L}_{l]}} + {\nabla _m}{S_{kl}}^m + {L_m}{S_{kl}}^m + 2{L_{[l|r|}}^m{S_{m|k]}}^r, $$
$$ {K_{(kl)}} = {\partial _{(k}}{{\tilde L}_{l)}} - {\partial _m}{L_{(kl)}}^m - {{\tilde L}_m}{L_{(kl)}}^m + {L_{(k|m|}}^n{L_{|n|l)}}^m. $$

We use the notation \({A_{(i\left| k \right|l)}}\) in order to exclude the index k from the symmetrisation bracketFootnote 41.

In order to shorten the presentation of affine geometry, we refrain from listing the corresponding set of equations for the other affine curvature tensor (cf., however, [356]).
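The expression for the homothetic curvature given above follows directly by contracting the definition of \(\mathop {{\rm{ }}K}\limits^ +\) over its first two indices. As a short worked step, using only the definition of the curvature tensor and \({L_l}: = {L_{li}}^i\):

$$ {V_{kl}} = \mathop {{\rm{ }}K}\limits^ + {\;^i}_{ikl} = {\partial _k}{L_{li}}^i - {\partial _l}{L_{ki}}^i + {L_{km}}^i{L_{li}}^m - {L_{lm}}^i{L_{ki}}^m = {\partial _k}{L_l} - {\partial _l}{L_k}. $$

The two quadratic terms cancel after interchanging the summation indices i and m in the last one, so that the homothetic curvature is the curl of the trace \({L_l}\) and is manifestly antisymmetric.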

For a symmetric affine connection, the preceding results reduce considerably due to \({S_{kl}}^m = 0\). From Equations (29,30,32) we obtain the identities:

$$ {K^i}_{\{ jkl\} } = 0, $$
$$ {K^i}_{j\{ kl\left\| {m\} } \right.} = 0, $$
$${V_{kl}} + 2{K_{[kl]}} = 0,$$

i.e., only one independent trace tensor of the affine curvature tensor exists. For the antisymmetric part of the Ricci tensor \({K_{[kl]}} = - {\partial _{[k}}{{\tilde L}_{l]}}\) holds. This equation will be important for the physical interpretation of affine geometry.

In affine geometry, the simplest way to define a fundamental tensor is to set \({g_{ij}}: = \alpha {K_{(ij)}}\), or \({g_{ij}}: = \alpha {\bar K_{(ij)}}\). It may be desirable to derive the metric from a Lagrangian; then the simplest scalar density that could be used as such is given by \(\det ({K_{ij}})\)Footnote 42.

As a final result in this section, we give the curvature tensor calculated from the connection \({{\bar L}_{ij}}^k = {\Gamma _{ij}}^k + {T_{ij}}^k\) (cf. Equation (20)), expressed by the curvature tensor of \({\Gamma _{ij}}^k\) and by the tensor \({T_{ij}}^k\):

$$ {K^i}_{jkl}(\bar L) = {K^i}_{jkl}(\Gamma ) + 2\,{}^{(\Gamma )}{\nabla _{[k}}{T_{l]j}}^i - 2{T_{[k\left| j \right|}}^m{T_{l]m}}^i + 2{S_{kl}}^m{T_{mj}}^i, $$

where \({}^{(\Gamma )}\nabla\) is the covariant derivative formed with the connection \({\Gamma _{ij}}^k\) (cf. also [310], p. 141).

Mixed geometry

A manifold carrying both structural elements, i.e., metric and connection, is called a metric-affine space. If the first fundamental form is taken to be asymmetric, i.e., to contain an antisymmetric part \({g_{[ik]}}: = \tfrac{1}{2}({g_{ik}} - {g_{ki}})\), we speak of a mixed geometry. In principle, both metric-affine space and mixed geometry may always be re-interpreted as Riemannian geometry with additional geometric objects: the 2-form field φ(f) (symplectic form), the torsion S, and the non-metricity Q (cf. Equation (41)). It depends on the physical interpretation, i.e., the assumed relation between mathematical objects and physical observables, which geometry is the most suitable.

From the symmetric part of the first fundamental form \({\gamma _{ij}} = {g_{(ij)}}\), a connection may be constructed, often named after Levi-CivitaFootnote 43 [204],

$$ \{\, _{ij}^k\} : = \frac{1}{2}{\gamma ^{kl}}({\gamma _{li,j}} + {\gamma _{lj,i}} - {\gamma _{ij,l}}), $$

and from it the Riemannian curvature tensor defined as in Equation (22) with \({L_{ij}}^k = \{ \,_{ij}^k\}\) (cf. Section 2.1.3); \(\{ \,_{ij}^k\}\) is called the Christoffel symbol. Thus, in metric-affine and in mixed geometry, two different connections arise in a natural way. In the remaining part of this section we will deal with a symmetric fundamental form \({\gamma _{ij}}\) only, and denote it by \({g_{ij}}\).

With the help of the symmetric affine connection, we may define the tensor of non-metricity \({Q_{ij}}^k\) byFootnote 44

$$ {Q_{ij}}^k: = {g^{kl}}{\nabla _l}{g_{ij}}. $$

Then the following identity holds:

$$ {\Gamma _{ij}}^k = \{ {\mkern 1mu} _{ij}^k{\mkern 1mu} \} + {K_{ij}}^k + \frac{1}{2}({Q^k}_{ij} + {Q_{ji}}^k - {Q_{j}^{k}}_{i}), $$

where the contorsion tensor \({K_{ij}}^k\), a linear combination of torsion \({S_{ij}}^k\), is defined byFootnote 45

$$ {K_{ij}}^k: = {S^k}_{ji} + {S^k}_{ij} - {S_{ij}}^k = - {K_i\,^k_j}. $$

The inner product of two tangent vectors \({A^i}\), \({B^k}\) is not conserved under parallel transport of the vectors along \({X^l}\) if the non-metricity tensor does not vanish:

$$ {X^k}{\mathop \nabla \limits^ +} _k({A^n}{B^m}{g_{nm}}) = {Q_{nml}}{A^n}{B^m}{X^l} \ne 0. $$

A connection for which the non-metricity tensor vanishes, i.e.,

$$ {\mathop \nabla \limits^ +} _k{g_{ij}} = 0 $$

holds, is called metric-compatibleFootnote 46.

J. M. ThomasFootnote 47 introduced a combination of the terms appearing in \(\mathop \nabla \limits^ +\) and \(\mathop \nabla \limits^ -\) to define a covariant derivative for the metric ([346], p. 188),

$$ {g_{ik/l}}: = {g_{ik,l}} - {g_{rk}}{\Gamma _{il}}^{r} - {g_{ir}}{\Gamma _{lk}}^r, $$

and extended it for tensors of arbitrary rank ≥ 3.

Einstein later used as a constraint on the metrical tensor

$$ 0 = \mathop {{g_{ik\left\| l \right.}}}\limits_{ + - } : = {g_{ik,l}} - {g_{rk}}{\Gamma _{il}}^r - {g_{ir}}{\Gamma _{lk}}^r, $$

a condition that cannot easily be interpreted geometrically [97]. We will have to deal with Equation (47) in Section 6.1 and, more intensively, in Part II of this review.

Connections that are not metric-compatible have been used in unified field theory right from the beginning. Thus, in Weyl’s theory [397, 395] we have

$${Q_{ijk}} = {Q_k}{g_{ij}}.$$

In case of such a relationship, the geometry is called semi-metrical [300, 310]. According to Equation (44), in Weyl’s theory the inner product multiplies by a scalar factor under parallel transport:

$$ {X^k}{\mathop \nabla \limits^ +} _k({A^n}{B^m}{g_{nm}}) = ({Q_l}{X^l}){A^n}{B^m}{g_{nm}}. $$

This means that the light cone is preserved by parallel transport.
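The step from the scaling law to the preservation of the light cone may be spelled out. As a short sketch, set A = B and write \(N(u): = {g_{nm}}{A^n}{A^m}\) for the norm of a vector transported along the curve with tangent \({X^k} = \tfrac{{d{x^k}}}{{du}}\); the symbol N(u) is introduced here for illustration only. The relation above then becomes a linear homogeneous differential equation:

$$ \frac{{dN}}{{du}} = ({Q_l}{X^l})\,N,\qquad N(u) = N(0)\exp \left( {\int_0^u {{Q_l}{X^l}\,du'} } \right). $$

Hence N(0) = 0 implies N(u) = 0 along the whole curve: a null vector remains null, while the norm of a non-null vector is rescaled by a positive factor and keeps its sign.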

We may also abbreviate the last term in the identity (42) by introducing

$$ {X_{ij}}^k: = {Q^k}_{ij} + {Q_{ji}}^k - {{Q_{j}}^k}_{i}. $$

Then, from Equation (39), the curvature tensor of a torsionless affine space is given by

$$ {K^i}_{jkl}(\bar \Gamma ) = {K^i}_{jkl}(\{ \;_{nm}^{\;\;r}\;\} ) + 2\,{}^{(\{ \,_{jk}^i\} )}{\nabla _{[k}}{X_{l]j}}^i - 2{X_{[k\left| j \right|}}^m{X_{l]m}}^i, $$

where \({}^{(\{ \,_{jk}^i\} )}\nabla\) is the covariant derivative formed with the Christoffel symbol.

Riemann-Cartan geometry is the subcase of a metric-affine geometry in which the metric-compatible connection contains torsion, i.e., an antisymmetric part \({L_{[ij]}}^k\); torsion is a tensor field to be linked to physical observables. A linear connection whose antisymmetric part \({S_{ij}}^k\) has the form

$$ {S_{ij}}^k = {S_{[i}}{\delta ^k}_{j]} $$

is called semi-symmetric [300].

Riemannian geometry is the further subcase with vanishing torsion of a metric-affine geometry with metric-compatible connection. In this case, the connection is derived from the metric: \({\Gamma _{ij}}^k = \{ \,_{ij}^k\,\}\), where \(\{ \,_{ij}^k\,\}\) is the usual Christoffel symbol (40). The covariant derivative of A with respect to the Levi-Civita connection \(\mathop \nabla \limits^{\{ \,_{ij}^k\,\} }\) is abbreviated by \({A_{;k}}\). The Riemann curvature tensor is denoted

$$ R_{jkl}^i = {\partial _k}\{ \,_{lj}^i\,\} - {\partial _l}\{ \,_{kj}^i\,\} + \{ \,_{km}^{\;\;i}\,\} \{ \,_{lj}^m\,\} - \{ \,_{lm}^{\;\;i}\,\} \{ \,_{kj}^m\,\} . $$

An especially simple case of a Riemannian space is Minkowski space, the curvature of which vanishes:

$$ R_{jkl}^i(\eta ) = 0. $$

This is an invariant characterisation irrespective of whether the Minkowski metric η is given in Cartesian coordinates as in Equation (10), or in an arbitrary coordinate system. We also have \({{\mathcal L}_X}{R_{ijkl}} = 0\) where \({\mathcal L}\) is the Lie-derivative (see below under “symmetries”), and X stands for the generators of the Lorentz group.

In Riemannian geometry, the so-called geodesic equation,

$$ {X^i}_{;k}{X^k}(u) = \sigma (u){X^i}, $$

determines the shortest and the straightest curve between two infinitesimally close points. However, in metric-affine and in mixed geometry, geodesic and autoparallel curves will have to be distinguished.

A conformal transformation of the metric,

$$ {g_{ik}} \to {g'_{ik}} = \lambda {g_{ik}}, $$

with a smooth function λ changes the components of the non-metricity tensor,

$$ {Q_{ij}}^k \to {Q_{ij}}^k + {g_{ij}}{g^{kl}}{\partial _l}\sigma , $$

as well as the Levi-Civita connection,

$$ \{ \,_{ij}^k\,\} \to \{ \,_{ij}^k\,\} + \frac{1}{2}({\sigma _i}\delta _j^k + {\sigma _j}\delta _i^k - {g_{ij}}{g^{kl}}{\sigma _l}), $$

with \(\sigma : = \ln \lambda\) and \({\sigma _i}: = {\partial _i}\sigma = {\lambda ^{ - 1}}{\partial _i}\lambda\). As a consequence, the Riemann curvature tensor \({R^i}_{jkl}\) is also changed; if, however, \({{R'}^i}_{jkl} = 0\) can be reached by a conformal transformation, then the corresponding spacetime is called conformally flat. In \({M_D}\), for D>3, the vanishing of the Weyl curvature tensor

$$ C_{jkl}^i: = R_{jkl}^i + \frac{2}{{D - 2}}(\delta _{[k}^i{R_{l]j}} + {g_{j[l}}{R^i}_{k]}) + \frac{{2R}}{{(D - 1)(D - 2)}}{\delta ^i}_{[l}{g_{k]j}} $$

is a necessary and sufficient condition for \({M_D}\) to be conformally flat ([397], p. 404, [300], p. 170).
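The change of the non-metricity tensor under a conformal transformation, quoted above, can be verified directly from its definition. The sketch assumes that the connection is held fixed, that λ is positive, and that \(\sigma = \ln \lambda\); with \({g'_{ij}} = \lambda {g_{ij}}\) and \({g'^{kl}} = {\lambda ^{ - 1}}{g^{kl}}\) one finds

$$ {Q'_{ij}}^k = {g'^{kl}}{\nabla _l}{g'_{ij}} = {\lambda ^{ - 1}}{g^{kl}}\left( {\lambda {\nabla _l}{g_{ij}} + {g_{ij}}{\partial _l}\lambda } \right) = {Q_{ij}}^k + {g_{ij}}{g^{kl}}{\lambda ^{ - 1}}{\partial _l}\lambda = {Q_{ij}}^k + {g_{ij}}{g^{kl}}{\partial _l}\sigma . $$

Only the gradient of λ enters; a constant rescaling of the metric leaves the non-metricity tensor unchanged.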

Even before Weyl, the question had been asked (and answered) as to what extent the conformal and the projective structures were determining the geometry: According to Kretschmann (and then to Weyl) they fix the metric up to a constant factor ([196]; see also [401], Appendix 1; for a modern approach, cf. [67]).

The geometry needed for the pre- and non-relativistic approaches to unified field theory will have to be dealt with separately. There, the metric tensor of space is Euclidean and not of full rank; time is described with the help of a linear form (Newton-Cartan geometry, cf. [65, 66]). In the following we shall deal only with relativistic unified field theories.

Projective geometry

Projective geometry is a generalisation of Riemannian geometry in the following sense: Instead of tangent spaces with the light cone \({\eta _{ik}}d{x^i}d{x^k} = 0\), where η is the Minkowski metric, at each event a tangent space with a general, non-degenerate surface of second order γ is now introduced. This leads to a tangential cone \({g_{ik}}d{x^i}d{x^k} = 0\) in the origin (cf. Equation (9)), and to a hyperplane, the polar plane, formed by the contact points of the tangential cone and the surface γ. In place of the D inhomogeneous coordinates \({x^i}\) of \({M_D}\), D + 1 homogeneous coordinates \({X^\alpha }(\alpha = 0,1,2, \ldots ,D)\) are definedFootnote 48 such that they transform as homogeneous functions of first degree:

$$ {X^\mu }\frac{{\partial {{X'}^\nu }}}{{\partial {X^\mu }}} = {X'^\nu }. $$

The connection to the inhomogeneous coordinates \({x^i}\) is given by homogeneous functions of degree zero, e.g., by \({x^i} = \tfrac{{{X^i}}}{{{\phi _\alpha }{X^\alpha }}}\)Footnote 49. Thus, the \({X^\alpha }\) themselves form the components of a tangent vector. Furthermore, the quadratic form \({g_{\alpha \beta }}{X^\alpha }{X^\beta } = \epsilon = \pm 1\) is adopted with \({g_{\alpha \beta }}\) being a homogeneous function of degree -2. A tensor field \(T_{\;{n_1}\;{n_2}\;{n_3} \ldots }^{{m_1}{m_2}{m_3} \ldots }\) (cf. Section 2.1.5) depending on the homogeneous coordinates \({X^\mu }\) with u contravariant (upper) and l covariant (lower) indices is required to be a homogeneous function of degree \(r: = u - l\).

If we define \({\gamma _\mu }^i: = \tfrac{{\partial {x^i}}}{{\partial {X^\mu }}}\), with \({\gamma _\mu }^i{X^\mu } = 0\), then \({\gamma _\mu }^i\) transforms like a tangent vector under point transformations of the \({x^i}\), and as a covariant vector under homogeneous transformations of the \({X^\alpha }\). The \({\gamma _\mu }^i\) may be used to relate covariant vectors \({a_i}\) and \({A_\mu }\) by \({A_\mu } = {\gamma _\mu }^i{a_i}\). Thus, the metric tensor in the space of homogeneous coordinates \({g_{\alpha \beta }}\) and the metric tensor \({g_{ik}}\) of \({M_D}\) are related by \({g_{ik}} = {\gamma _i}^\alpha {\gamma _k}^\beta {g_{\alpha \beta }}\) with \({\gamma _\mu }^i{\gamma _k}^\mu = \delta _k^i\). The inverse relationship is given by \({g_{\alpha \beta }} = {\gamma _\alpha }^i{\gamma _\beta }^k{g_{ik}} + \epsilon {X_\alpha }{X_\beta }\) with \({X_\alpha } = {g_{\alpha \beta }}{X^\beta }\). The covariant derivative for tensor fields in the space of homogeneous coordinates is defined as before (cf. Section 2.1.2):

$$ {\nabla _\alpha }{A^\beta }(X) = \frac{{\partial {A^\beta }(X)}}{{\partial {X^\alpha }}} + {\Gamma _{\alpha \nu }}^\beta (X){A^\nu }(X). $$

The covariant derivative of the quantity \({\gamma _k}^\mu\) interconnecting both spaces is given by

$$ {\nabla _\rho }{\gamma _k}^\mu = \frac{{\partial {\gamma _k}^\mu }}{{\partial {X^\rho }}} + {\Gamma _{\rho \sigma }}^\mu {\gamma _k}^\sigma - \{ \,_{kl}^m\,\} {\gamma _\rho }^l{\gamma _m}^\mu . $$
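The two homogeneity properties used in this subsection, namely that the inhomogeneous coordinates are functions of degree zero and that \({\gamma _\mu }^i{X^\mu } = 0\), can be illustrated numerically. The following is a minimal sketch under stated assumptions: the map is taken to be \({x^i} = {X^i}/({\phi _\alpha }{X^\alpha })\) for i = 1, ..., D with D = 4, the covector `PHI` and the sample point are arbitrary, and \({\gamma _\mu }^i\) is approximated by central differences rather than computed exactly.

```python
# Hypothetical covector phi_alpha with D + 1 = 5 components; any choice with
# phi_alpha X^alpha != 0 at the sample point would serve equally well.
PHI = [1.0, 0.5, -0.25, 2.0, 0.75]


def inhomogeneous(X):
    """x^i = X^i / (phi_alpha X^alpha), i = 1..D: homogeneous of degree zero in X."""
    denom = sum(p * c for p, c in zip(PHI, X))
    return [X[i] / denom for i in range(1, len(X))]


def gamma(X, h=1e-6):
    """Approximate gamma_mu^i = dx^i/dX^mu by central differences; result[mu][i]."""
    rows = []
    for mu in range(len(X)):
        up = list(X)
        dn = list(X)
        up[mu] += h
        dn[mu] -= h
        xu, xd = inhomogeneous(up), inhomogeneous(dn)
        rows.append([(a - b) / (2 * h) for a, b in zip(xu, xd)])
    return rows


X = [1.0, 2.0, -1.0, 0.5, 3.0]          # sample homogeneous coordinates
g = gamma(X)
# Contraction gamma_mu^i X^mu, expected to vanish up to discretisation error.
contraction = [sum(g[mu][i] * X[mu] for mu in range(len(X)))
               for i in range(len(X) - 1)]
print(contraction)
print(inhomogeneous(X), inhomogeneous([3.0 * c for c in X]))
```

The vanishing contraction is Euler's theorem for functions of degree zero: rescaling all \({X^\alpha }\) by a common factor does not move the point \({x^i}\).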

2.2.3 Cartan’s method

In this section, we briefly present Cartan’s one-form formalism in order to make part of the literature accessible. Cartan introduces one-forms \({\theta ^{\hat a}}(\hat a = 1, \ldots ,4)\) by \({\theta ^{\hat a}}: = h_l^{\hat a}d{x^l}\). The reciprocal basis in tangent space is given by \({e_{\hat \jmath}} = h_{\hat \jmath}^l\tfrac{\partial }{{\partial {x^l}}}\). Thus, \({\theta ^{\hat a}}({e_{\hat \jmath}}) = \delta _{\hat \jmath}^{\hat a}\). The metric is then given by \({\eta _{\hat \imath\hat k}}{\theta ^{\hat \imath}} \otimes {\theta ^{\hat k}}\). The covariant exterior derivative D is introduced via Cartan’s first structure equations,

$$ {\Theta ^{\hat \imath}}: = D{\theta ^{\hat \imath}} = d{\theta ^{\hat \imath}} + {\omega ^{\hat \imath}}_{\hat l} \wedge {\theta ^{\hat l}}, $$

where \({\omega ^{\hat \imath }}_{\hat k}\) is the connection-1-form, and \({\Theta ^{\hat \imath}}\) is the torsion-2-form, \({\Theta ^{\hat \imath }} = - {S_{\hat l\hat m}}^{\hat \imath}{\theta ^{\hat l}} \wedge {\theta ^{\hat m}}\). We have \({\omega _{\hat \imath\hat k}} = - {\omega _{\hat k\hat \imath}}\). The link to the components \({L_{[ij]}}^k\) of the affine connection is given by \({\omega ^{\hat \imath}}_{\hat k} = h_l^{\hat \imath}h_{\hat k}^m{L_{\hat rm}}^l{\theta ^{\hat r}}\)Footnote 50. The covariant derivative of a tangent vector with bein-components \({X^{\hat k}}\) then is

$$ D{X^{\hat k}}: = d{X^{\hat k}} + {\omega ^{\hat k}}_{\hat l}{X^{\hat l}}. $$

By further exterior differentiationFootnote 51 of Θ we arrive at the second structure relation of Cartan,

$$ D{\Theta ^{\hat k}} = {\Omega ^{\hat k}}_{\hat l} \wedge {\theta ^{\hat l}}. $$

In Equation (65) the curvature-2-form \({\Omega ^{\hat k}}_{\hat l} = \tfrac{1}{2}{R^{\hat k}}_{\hat l\hat m\hat n}{\theta ^{\hat m}} \wedge {\theta ^{\hat n}}\) appears, which is given by

$$ {\Omega ^{\hat k}}_{\hat l} = d{\omega ^{\hat k}}_{\hat l} + {\omega ^{\hat k}}_{\hat m} \wedge {\omega ^{\hat m}}_{\hat l}. $$

\({\Omega ^{\hat k}}_{\hat k}\) is the homothetic curvature.

2.2.4 Tensors, spinors, symmetries

Tensors

Up to here, no definitions of a tensor and a tensor field were given: A tensor \({T_p}({M_D})\) of type (r, s) at a point p on the manifold \({M_D}\) is a multi-linear function on the Cartesian product of r cotangent- and s tangent spaces in p. A tensor field is the assignment of a tensor to each point of \({M_D}\). Usually, this definition is stated as a linear, homogeneous transformation law for the tensor components in local coordinates:

$$T_{{{l'}_1}\;{{l'}_2}\;{{l'}_3} \cdots }^{{{k'}_1}{{k'}_2}{{k'}_3} \cdots } = T_{{n_1}\;{n_2}\;{n_3} \cdots }^{{m_1}{m_2}{m_3} \cdots }\frac{{\partial {x^{{n_1}}}}}{{\partial {x^{{{l'}_1}}}}}\frac{{\partial {x^{{n_2}}}}}{{\partial {x^{{{l'}_2}}}}}\frac{{\partial {x^{{n_3}}}}}{{\partial {x^{{{l'}_3}}}}}\frac{{\partial {x^{{{k'}_1}}}}}{{\partial {x^{{m_1}}}}}\frac{{\partial {x^{{{k'}_2}}}}}{{\partial {x^{{m_2}}}}}\frac{{\partial {x^{{{k'}_3}}}}}{{\partial {x^{{m_3}}}}} \ldots $$

where the \({x^{k'}} = {x^{k'}}({x^i})\), with smooth functions on the r.h.s., are taken from the set (“group”) of coordinate transformations (diffeomorphisms). Strictly speaking, tensors are representations of the abstract group at a point on the manifoldFootnote 52.

A relative tensor \({T_p}({M_D})\) of type (r, s) and of weight ω at a point p on the manifold \({M_D}\) transforms like

$$T_{{{l'}_1}\;{{l'}_2}\;{{l'}_3} \cdots }^{{{k'}_1}{{k'}_2}{{k'}_3} \cdots } = {\left[ {\det \left( {\frac{{\partial {x^s}}}{{\partial {{x'}^r}}}} \right)} \right]^\omega }T_{{n_1}\;{n_2}\;{n_3} \cdots }^{{m_1}{m_2}{m_3} \cdots }\frac{{\partial {x^{{n_1}}}}}{{\partial {x^{{{l'}_1}}}}}\frac{{\partial {x^{{n_2}}}}}{{\partial {x^{{{l'}_2}}}}}\frac{{\partial {x^{{n_3}}}}}{{\partial {x^{{{l'}_3}}}}}\frac{{\partial {x^{{{k'}_1}}}}}{{\partial {x^{{m_1}}}}}\frac{{\partial {x^{{{k'}_2}}}}}{{\partial {x^{{m_2}}}}}\frac{{\partial {x^{{{k'}_3}}}}}{{\partial {x^{{m_3}}}}} \ldots $$

An example is given by the totally antisymmetric object \({\epsilon _{ijkl}}\) with \({\epsilon _{ijkl}} = \pm 1\), or \({\epsilon _{ijkl}} = 0\), depending on whether (ijkl) is an even or odd permutation of (0123), or whether two indices are alike. ω=-1 for \({\epsilon _{ijkl}}\); in this case, the relative tensor is called a tensor density. We can form a tensor from \({\epsilon _{ijkl}}\) by introducing \({\eta _{ijkl}}: = \sqrt { - g} { \epsilon _{ijkl}}\), where \({g_{ik}}\) is a Lorentz-metric with determinant g. Note that \({\eta ^{ijkl}}: = \tfrac{1}{{\sqrt { - g} }}{ \epsilon ^{ijkl}}\). The dual to a 2-form (skew-symmetric tensor) then is defined by \(*{F^{ij}} = \tfrac{1}{2}{\eta ^{ijkl}}{F_{kl}}\).
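The totally antisymmetric symbol and the dual of a 2-form are easy to tabulate. The sketch below is an illustration only: it works in Minkowski space with Cartesian coordinates, where \(\sqrt { - g} = 1\), and it treats the upper-index symbol as numerically equal to the lower-index one, which is a convention assumed here and not taken from the text. The function names `epsilon` and `dual` are hypothetical.

```python
def epsilon(*idx):
    """Permutation symbol: +1 (even), -1 (odd permutation of 0..n-1), 0 if indices repeat."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    # The parity of a permutation equals the parity of its number of inversions.
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            if idx[a] > idx[b]:
                sign = -sign
    return sign


def dual(F):
    """*F^{ij} = 1/2 eta^{ijkl} F_kl, with sqrt(-g) = 1 (assumption stated above)."""
    n = 4
    return [[sum(epsilon(i, j, k, l) * F[k][l]
                 for k in range(n) for l in range(n)) / 2
             for j in range(n)] for i in range(n)]


# A 2-form with the single independent component F_01 = 1.
F = [[0] * 4 for _ in range(4)]
F[0][1], F[1][0] = 1, -1
for row in dual(F):
    print(row)
```

With this input the only non-vanishing components of the dual are the (2,3) and (3,2) entries, i.e., the dual exchanges the pair of indices (01) with the complementary pair (23).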

In connection with conformal transformations \(g \to \lambda g\), the concept of the gauge-weight of a tensor is introduced. A tensor \(T_{\;\;\;\;...}^{...}\) is said to be of gauge weight q if, under the transformation of Equation (56), it changes as

$$T_{\;\;\;...}^{...} \to {\lambda ^q}T_{\;\;\;...}^{...}.$$

Objects that transform as in Equation (67) but with respect to a subgroup, e.g., the linear group, affine group G(D), orthogonal group O(D), or the Lorentz group \({\mathcal L}\), are tensors in a restricted sense; sometimes they are named affine or Cartesian tensors. All the subgroups mentioned are Lie-groups, i.e., continuous groups with a finite number of parameters. In general relativity, both the “group” of general coordinate transformations and the Lorentz group are present. The concept of tensors used in Special Relativity is restricted to a representation of the Lorentz group; however, as soon as the theory is to be given a coordinate-independent (“generally covariant”) form, then the full tensor concept comes into play.

Spinors

Spinors are representations of the Lorentz group only; as such they are related strictly to the tangent space of the space-time manifold. To see how spinor representations can be obtained, we must use the 2-1 homomorphism of the group SL(2,C) and the proper orthochronous Lorentz group, a subgroup of the full Lorentz groupFootnote 53. Let \(\mathbf{A} \in SL(2,C)\); then A is a complex (2-by-2)-matrix with det A=1. We pick the special Hermitian matrix

$$ \mathbf{S} = {x^0}\mathbf{1} + \sum\limits_p {{\sigma _p}{x^p},} $$

where 1 is the (2 by 2)-unit matrix and \({\sigma _p}\) are the Pauli matrices satisfying

$$ {\sigma _i}{\sigma _k} + {\sigma _k}{\sigma _i} = 2{\delta _{ik}}. $$

A transformation A from SL(2,C) then maps S to

$$ \mathbf{S}' = \mathbf{AS}{\mathbf{A}^ + }, $$

where \({\mathbf{A}^ + }\) is the Hermitian conjugate matrixFootnote 54. Moreover, det S=det S′ which, according to Equation (70), expresses the invariance of the space-time distance to the origin:

$$ {(x{\,^{0'}})^2} - {(x{\,^{1'}})^2} - {(x{\,^{2'}})^2} - {(x{\,^{3'}})^2} = {(x{\,^0})^2} - {(x{\,^1})^2} - {(x{\,^2})^2} - {(x{\,^3})^2}. $$

The link between the representation of a Lorentz transformation Lik in space-time and the unimodular matrix A mapping spin space (cf. below) is given by

$$ L{(\mathbf{A})_{ik}} = \frac{1}{2}{\rm{tr(}}{\sigma _i}\mathbf{A}{\sigma _k}{\mathbf{A}^ + }). $$

Thus, the map is two to one: +A and -A give the same \({L_{ik}}\).
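The two-to-one character of the map can be checked numerically. The sketch below evaluates the trace formula for \(L{(\mathbf{A})_{ik}}\) in plain Python, with \({\sigma _0}\) taken to be the unit matrix. The helper names (`matmul`, `dagger`, `lorentz`, `preserves_metric`) and the sample matrix, a boost along the 3-axis with rapidity 0.7, are illustrative assumptions and not part of the formalism quoted above.

```python
import math

# sigma_0 = unit matrix, sigma_1..3 = Pauli matrices
SIGMA = [
    [[1 + 0j, 0j], [0j, 1 + 0j]],
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric


def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def lorentz(A):
    """L(A)_ik = 1/2 tr(sigma_i A sigma_k A^+); the trace is real for Hermitian sigma."""
    Ad = dagger(A)
    L = []
    for i in range(4):
        row = []
        for k in range(4):
            m = matmul(matmul(SIGMA[i], A), matmul(SIGMA[k], Ad))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        L.append(row)
    return L


def preserves_metric(L, tol=1e-12):
    """Check sum_ab L_ai eta_ab L_bj = eta_ij."""
    for i in range(4):
        for j in range(4):
            s = sum(L[a][i] * ETA[a] * L[a][j] for a in range(4))
            target = ETA[i] if i == j else 0.0
            if abs(s - target) > tol:
                return False
    return True


phi = 0.7  # rapidity of the sample boost (arbitrary)
A = [[complex(math.exp(phi / 2)), 0j], [0j, complex(math.exp(-phi / 2))]]
minus_A = [[-z for z in row] for row in A]

print(lorentz(A) == lorentz(minus_A))   # both signs give the same L
print(preserves_metric(lorentz(A)))     # L leaves the Minkowski form invariant
print(lorentz(A)[0][0], math.cosh(phi))
```

The sign of A drops out because A and its Hermitian conjugate each enter the trace exactly once.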

Now, contravariant 2-spinors \({\xi ^A}\;(A = 1,2)\) are the elements of a complex linear space, spinor space, on which the matrices A are actingFootnote 55. The spinor is called elementary if it transforms under a Lorentz-transformation as

$$ {\xi ^{A'}} = \pm {\mathbf{A}^A}_C{\xi ^C}. $$

Likewise, contravariant dotted spinors \({\zeta ^{\dot A}}\) are those transforming with the complex-conjugate matrix \(\bar{\mathbf{A}}\):

$$ {\zeta ^{\dot B'}} = \pm {{\bar{\mathbf{A}}}^{\dot B}}_{\,\dot D}{\zeta ^{\dot D}}. $$

Covariant and covariant dotted 2-spinors correspondingly transform with the inverse matrices,

$$ {\xi _{B'}} = \pm {({\mathbf{A}^{ - 1}})^C}_B{\xi _C}, $$


$$ {\xi _{\dot B'}} = \pm {({{{\bf{\bar A}}}^{ - 1}})^{\dot C}}_{\dot B}{\xi _{\dot C}}. $$

The space of 2-spinors can be used as a representation space for the (proper, orthochronous) Lorentz group, with the 2-spinors being the elements of the simplest representation \({D^{(1/2,0)}}\).

Higher-order spinors with dotted and undotted indices \(S_{\,\,\,\,C \ldots \dot D \ldots }^{A \ldots \dot B \ldots }\) transform correspondingly. For the raising and lowering of indices now a real, antisymmetric (2×2)-matrix with components \({ \epsilon ^{AB}} = \delta _1^A\delta _B^2 - \delta _2^A\delta _B^1 = { \epsilon _{AB}}\) is needed, such that

$$ \begin{array}{*{20}{c}} {{\xi ^A} = { \epsilon ^{AB}}{\xi _B},\;\;\;\;\;}&{{\xi _A} = {\xi ^B}{ \epsilon _{BA}}.} \end{array} $$

Next to a spinor, bispinors of the form \({\zeta ^{AB}},\;{\xi ^{A\dot B}}\), etc. are the simplest quantities (spinors of 2nd order). A vector \({X^k}\) can be represented by a bispinor \({X^{A\dot B}}\),

$$ {X^{A\dot B}} = \sigma _k^{A\dot B}{X^k}, $$

where \(\sigma _k^{A\dot B}\) (k=0,…3) is a quantity linking the tangent space of space-time and spinor space. If k enumerates the matrices and A, Ḃ designate rows and columns, then we can choose \(\sigma _0^{A\dot B}\) to be the unit matrix, while for the other three indices the \(\sigma _j^{A\dot B}\) are taken to be the Pauli matrices. Often the quantity \(s_k^{A\dot B} = \tfrac{1}{{\sqrt 2 }}\sigma _k^{A\dot B}\) is introduced. The reciprocal matrix \(s_{A\dot B}^k\) is defined by

$$ s_j^{A\dot B}s_{A\dot B}^k = \delta _j^k, $$


$$ s_j^{A\dot B}{s^{j\;\;C\dot D}} = { \epsilon ^{AC}}{ \epsilon ^{\dot B\dot D}}. $$

In order to write down spinorial field equations, we need a spinorial derivative,

$$ {\partial _{A\dot B}} = s_{A\dot B}^k{\partial _k} $$

with \({\partial _{A\dot B}}{\partial ^{A\dot B}} = {\partial _k}{\partial ^k}\). The simplest spinorial equation is the Weyl equation:

$$ \begin{array}{*{20}{c}} {{\partial _{A\dot B}}{\psi ^A} = 0,\;\;\;\;\;}&{\dot B = 1,2.} \end{array} $$

The next simplest spinor equation for two spinors \({\chi ^{\dot B}}\), \({\psi _A}\) would be

$$ \begin{array}{*{20}{c}} {{\partial _{A\dot C}}{\chi ^{\dot C}} = - \frac{{2\pi }}{{\sqrt 2 h}}m{\psi _A};\;\;\;\;\;}&{{\partial ^{C\dot B}}{\psi _C} = \frac{{2\pi }}{{\sqrt 2 h}}m{\chi ^{\dot B}},} \end{array} $$

where m is a mass. Equation (85) is the 2-spinor version of Dirac’s equation.

Dirac- or 4-spinors with 4 components \({\psi _k}\), k=1,…, 4, may be constructed from 2-spinors as a direct sum of contravariant undotted and covariant dotted spinors \({\psi ^A}\) and \({\varphi _{\dot B}}\): For k=1, 2, we enter \({\psi ^1}\) and \({\psi ^2}\); for k=3, 4, we enter \({\varphi _{\dot 1}}\) and \({\varphi _{\dot 2}}\). In connection with Dirac spinors, instead of the Pauli-matrices the Dirac γ-matrices (4×4-matrices) appear; they satisfy

$$ {\gamma ^i}{\gamma ^k} + {\gamma ^k}{\gamma ^i} = 2{\eta ^{ik}}\mathbf{1}. $$
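The anticommutation relation can be verified for an explicit set of matrices. The sketch below uses the standard Dirac representation, \({\gamma ^0} = {\rm{diag}}(\mathbf{1}, - \mathbf{1})\) and \({\gamma ^m}\) with off-diagonal blocks \({\sigma _m}\) and \(- {\sigma _m}\), together with the signature (+,-,-,-). This representation is an assumption chosen for illustration; it is not the one written in terms of the α- and β-matrices in the text, and the helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1, -1, -1, -1]  # diagonal of eta^{ik}


def neg(m):
    """Negative of a matrix."""
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from four 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# gamma^0 followed by gamma^1, gamma^2, gamma^3 (Dirac representation)
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]


def clifford_holds():
    """True if gamma^i gamma^k + gamma^k gamma^i = 2 eta^{ik} 1 for all i, k."""
    for i in range(4):
        for k in range(4):
            ab = matmul(GAMMA[i], GAMMA[k])
            ba = matmul(GAMMA[k], GAMMA[i])
            for r in range(4):
                for c in range(4):
                    expected = 2 * ETA[i] if (i == k and r == c) else 0
                    if ab[r][c] + ba[r][c] != expected:
                        return False
    return True


print(clifford_holds())
```

All entries are integers or multiples of the imaginary unit, so the comparison is exact and no numerical tolerance is needed.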

In 4-spinor formalism, the Dirac equation reads [53, 54]:

$$ \left( {i{\gamma ^l}\frac{\partial }{{\partial {x^l}}} + \kappa } \right)\chi = 0, $$

with the 4-component Dirac spinor χ. In the first version of Dirac’s equation, α- and β-matrices were used, related to the γ’s by

$$ \begin{array}{*{20}{c}} {{\gamma ^0} = \beta ,\;\;\;\;\;}&{{\gamma ^m} = \beta {\alpha ^m},\;\;\;\;\;}&{m = 1,2,3,} \end{array} $$

where the matrices \({\alpha ^m}\) and β are given by \( \left( {\begin{array}{*{20}{c}} 0&{ - {\sigma _m}}\\ {{\sigma _m}}&0 \end{array}} \right),\;\;\left( {\begin{array}{*{20}{c}} 0&{ - 1}\\ 1&0 \end{array}} \right) \), respectively.

The generally-covariant formulation of spinor equations necessitates the use of n-beins \(h_{\hat \imath}^k\), whose internal “rotation” group, operating on the “hatted” indices, is the Lorentz group. The group of coordinate transformations acts on the Latin indices. In Cartan’s one-form formalism (cf. Section 2.1.4), the covariant derivative of a 4-spinor is defined by

$$ D\psi = d\psi + \frac{1}{4}{\omega _{\hat \imath\hat k}}{\sigma ^{\hat \imath\hat k}}\psi , $$

where \({\sigma ^{\hat \imath\hat k}}: = \tfrac{1}{2}\left[ {{\gamma ^{\hat \imath}},{\gamma ^{\hat k}}} \right]\).

Equation (89) is a special case of the general formula for the covariant derivative of a tensorial form ψ, i.e., a vector in some vector space V, whose components are differential forms,

$$ D\psi = d\psi + {\omega ^{\hat \imath\hat k}}{\rho _{\hat \imath\hat k}}({e_\alpha })\psi , $$

where \(\rho ({e_\alpha })\) is a particular representation of the corresponding Lie algebra in V with basis vectors \({e_\alpha }\). For the example of the Dirac spinor, the adjoint representation of the Lorentz group must be usedFootnote 56.

Symmetries

In Section 2.1.1 we briefly met the Lie derivative \({{\mathcal L}_X}\) of a vector field Y with respect to the tangent vector X, defined by \({({{\mathcal L}_X}Y)^k}: = {([X,Y])^k} = {X^i}{\partial _i}{Y^k} - {Y^i}{\partial _i}{X^k}\). With its help we may formulate the concept of isometries of a manifold, i.e., special mappings, also called “motions”, locally generated by vector fields X satisfying

$$ {{\mathcal L}_X}\;{g_{ij}}: = {\partial _k}({g_{ij}}){X^k} + {g_{lj}}{\partial _i}{X^l} + {g_{il}}{\partial _j}{X^l} = 0. $$

The generators X solving Equation (91) for a given metric generate a Lie group \({G_r}\), the group of motions of \({M_D}\). If a group \({G_r}\) is prescribed, e.g., the group of spatial rotations O(3), then from Equation (91) the functional form of the metric tensor having O(3) as a symmetry group follows.

A Riemannian space is called (locally) stationary if it admits a timelike Killing vector; it is called (locally) static if this Killing vector is hypersurface orthogonal. Thus if, in a special coordinate system, we take \({X^i} = \delta _0^i\), then from Equation (91) we conclude that stationarity reduces to the condition \({\partial _0}{g_{ik}} = 0\). If we take X to be the tangent vector field to the congruence of curves \({x^i} = {x^i}(u)\), i.e., if \({X^k} = \tfrac{{d{x^k}}}{{du}}\), then a necessary and sufficient condition for hypersurface-orthogonality is \({\epsilon^{ijkl}}{X_j}{X_{[k,l]}} = 0\).

A generalisation of Killing vectors are conformal Killing vectors for which \({{\mathcal L}_X}\;{g_{ik}} = \Phi {g_{ik}}\) with an arbitrary smooth function Φ holds. In purely affine spaces, another type of symmetry may be defined: \({{\mathcal L}_X}\;{\Gamma _{ik}}^l = 0\); they are called affine motions [425].
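For a constant metric and a vector field that is linear in the coordinates, the Killing condition reduces to a purely algebraic one. The sketch below illustrates this for Minkowski space in Cartesian coordinates, under the assumption \({X^l} = {M^l}_m{x^m}\) with a constant matrix M, so that \({\partial _i}{X^l} = {M^l}_i\) and the term with the derivative of the metric drops out. The names `lie_derivative_of_metric` and `generator` are hypothetical helpers.

```python
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]  # Minkowski metric, signature (+,-,-,-)


def generator(entries, n=4):
    """Matrix M[l][m] = M^l_m with the given non-zero entries {(l, m): value}."""
    M = [[0] * n for _ in range(n)]
    for (l, m), value in entries.items():
        M[l][m] = value
    return M


def lie_derivative_of_metric(M, g=ETA):
    """(L_X g)_ij = g_lj M^l_i + g_il M^l_j for X^l = M^l_m x^m and constant g."""
    n = len(g)
    return [[sum(g[l][j] * M[l][i] + g[i][l] * M[l][j] for l in range(n))
             for j in range(n)] for i in range(n)]


rotation = generator({(1, 2): -1, (2, 1): 1})   # X = (0, -x^2, x^1, 0)
boost = generator({(0, 1): 1, (1, 0): 1})       # X = (x^1, x^0, 0, 0)
dilation = generator({(k, k): 1 for k in range(4)})  # X^k = x^k

print(lie_derivative_of_metric(rotation))   # all zeros: Killing vector
print(lie_derivative_of_metric(boost))      # all zeros: Killing vector
print(lie_derivative_of_metric(dilation))   # equals 2 * ETA: conformal Killing vector
```

The dilation is not a motion, but it satisfies the conformal Killing condition with the constant function Φ = 2.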

2.3 Dynamics

Within a particular geometry, usually various options for the dynamics of the fields (field equations, in particular as following from a Lagrangian) exist as well as different possibilities for the identification of physical observables with the mathematical objects of the formalism. Thus, in general relativity, the field equations are derived from the Lagrangian

$$ {\mathcal L} = \sqrt { - g} (R + 2\Lambda - 2\kappa {L_{\rm{M}}}), $$

where \(R({g_{ik}})\) is the Ricci scalar, \(g: = \det {g_{ik}}\), Λ the cosmological constant, and \({L_{\rm{M}}}\) the matter Lagrangian depending on the metric, its first derivatives, and the matter variables. This Lagrangian leads to the well-known field equations of general relativity,

$$ {R^{ik}} - \frac{1}{2}R{g^{ik}} = - \kappa {T^{ik}}, $$

with the energy-momentum(-stress) tensor of matter

$$ {T^{ik}}: = \frac{2}{{\sqrt { - g} }}\frac{{\delta (\sqrt { - g} {L_{\rm{M}}})}}{{\delta {g_{ik}}}} $$

and \(\kappa = \tfrac{{8\pi G}}{{{c^4}}}\), where G is Newton’s gravitational constant. \({G^{ik}}: = {R^{ik}} - \tfrac{1}{2}R{g^{ik}}\) is called the Einstein tensor. In empty space, i.e., for \({T^{ik}} = 0\), Equation (92) reduces to

$$ {R^{ik}} = 0. $$
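
The step from the full field equations to the vacuum equations can be made explicit by taking the trace. This is a short sketch, assuming D = 4 (so that \({g_{ik}}{g^{ik}} = 4\)), the field equations without cosmological term as displayed above, and the abbreviation \(T: = {g_{ik}}{T^{ik}}\), which is introduced here only for illustration:

$$ {g_{ik}}\left( {{R^{ik}} - \frac{1}{2}R{g^{ik}}} \right) = - R = - \kappa T\;\; \Rightarrow \;\;{R^{ik}} = - \kappa \left( {{T^{ik}} - \frac{1}{2}T{g^{ik}}} \right). $$

In this trace-reversed form it is evident that \({T^{ik}} = 0\) implies \({R^{ik}} = 0\), and hence also R = 0.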

If only an electromagnetic field \({F_{ik}} = \frac{{\partial {A_k}}}{{\partial {x^i}}} - \frac{{\partial {A_i}}}{{\partial {x^k}}}\) derived from the 4-vector potential \({A_k}\) is present in the energy-momentum tensor, then the Einstein–Maxwell equations follow:

$$ \begin{array}{*{20}{c}} {{R_{ik}} - \frac{1}{2}R{g_{ik}} = - \kappa \left( {{F_{il}}{F^l}_k + \frac{1}{4}{g_{ik}}{F_{lm}}{F^{lm}}} \right),\;\;\;\;\;}&{{\nabla _l}{F^{il}} = 0.} \end{array} $$
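
A useful property of these equations can be checked in one line. As a sketch, assuming D = 4 and the antisymmetry \({F_{ik}} = - {F_{ki}}\), the trace of the electromagnetic source term vanishes:

$$ {g^{ik}}\left( {{F_{il}}{F^l}_k + \frac{1}{4}{g_{ik}}{F_{lm}}{F^{lm}}} \right) = - {F_{il}}{F^{il}} + {F_{lm}}{F^{lm}} = 0. $$

Taking the trace of the gravitational equations then gives R = 0, so that for a pure electromagnetic field they reduce to \({R_{ik}} = - \kappa \left( {{F_{il}}{F^l}_k + \tfrac{1}{4}{g_{ik}}{F_{lm}}{F^{lm}}} \right)\).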

The components of the metrical tensor are identified with gravitational potentials. Consequently, the components of the (Levi-Civita) connection correspond to the gravitational “field strength”, and the components of the curvature tensor to the gradients of the gravitational field. The equations of motion of material particles should follow, in principle, from Equation (92) through the relation

$$ {\nabla _l}{T^{il}} = 0 $$

implied by itFootnote 57. For point particles, due to the singularities appearing, in general this is a tricky task, up to now solved only approximately. However, the world lines for point particles falling freely in the gravitational field are, by definition, the geodesics of the Riemannian metric. This definition is consistent with the rigorous derivation of the geodesic equation for non-interacting dust particles in a fluid matter description. It is also consistent with all observations.

For most of the unified field theories to be discussed in the following, such identifications were made for internal, structural reasons, as no link-up to empirical data was possible. Due to the inherent wealth of constructive possibilities, unified field theory would never have properly got off the ground as a physical theory even if all the necessary formal requirements could have been satisfied. As an example, we take the identification of the electromagnetic field tensor with either the skew part of the metric, in a “mixed geometry” with metric compatible connection, or the skew part of the Ricci tensor in metric-affine theory, to list only two possibilities. The latter choice obtains likewise in a purely affine theory in which the metric is a derived secondary concept. In this case, among the many possible choices for the metric, one may take it proportional to the variational derivative of the Lagrangian with respect to the symmetric part of the Ricci tensor. This guarantees neither the proper signature of the metric nor its full rank. Several identifications for the electromagnetic 4-potential and the electric current vector density have also been suggested (cf. below and [143]).

2.4 Number field

Complex fields may also be introduced on a real manifold. Such fields have also been used for the construction of unified field theories, although mostly after the period dealt with here (cf. Part II, in preparation). In particular, manifolds with a complex fundamental form were studied, e.g., with \({g_{ik}} = {s_{ik}} + i{a_{ik}}\), where \(i = \sqrt { - 1}\) [97]. Also, geometries based on Hermitian forms were studied [313]. In later periods, hypercomplex numbers, quaternions, and octonions also were used as basic number fields for gravitational or unified theories (cf. Part II, forthcoming).

In place of the real numbers, by which the concept of manifold has been defined so far, we could take other number fields and thus arrive, e.g., at complex manifolds and so on. In this part of the article we do not need to take into account this generalisation.

2.5 Dimension

Since the suggestions by Nordström and Kaluza [238, 181], manifolds with D>4 have been used for unified field theories. In most of the cases, the additional dimensions were taken to be spacelike; nevertheless, manifolds with more than one direction of time also have been studied.

3 Early Attempts at a Unified Field Theory

3.1 First steps in the development of unified field theories

Even before (or simultaneously with) the introduction and generalisation of the concept of parallel transport and covariant derivative by Hessenberg (1916/17) [160], Levi-Civita (1917) [204], Schouten (1918) [294], Weyl (1918) [397], and König (1919) [193], the introduction of an asymmetric metric was suggested by Rudolf FörsterFootnote 58 in 1917. In his letter to Einstein of 11 November 1917, he writes ([321], Doc. 398, p. 552):

“Perhaps, there exists a covariant 6-vector by which the appearance of electricity is explained and which springs lightly from the \({g_{\mu \nu }}\), not forced into it as an alien element.”Footnote 59

Einstein replied:

“The aim of dealing with gravitation and electricity on the same footing by reducing both groups of phenomena to \({g_{\mu \nu }}\) has already caused me many disappointments. Perhaps, you are luckier in the search. I am totally convinced that in the end all field quantities will look alike in essence. But it is easier to suspect something than to discover it.”Footnote 60 (16 November 1917 [321], Vol. 8A, Doc. 400, p. 557)

In his next letter, Förster gave results of his calculations with an asymmetric \({g_{\mu \nu }} = {s_{\mu \nu }} + {a_{\mu \nu }}\), introduced an asymmetric “three-index-symbol” and a possible generalisation of the Riemannian curvature tensor as well as tentative Maxwell’s equations and interpretations for the 4-potential \({A_\mu }\), and special solutions (28 December 1917) ([321], Volume 8A, Document 420, pp. 581–587). Einstein’s next letter of 17 January 1918 is skeptical:

“Since long, I also was busy by starting from a non-symmetric \({g_{\mu \nu }}\); however, I lost hope to get behind the secret of unity (gravitation, electromagnetism) in this way. Various reasons instilled in me strong reservations: […] your other remarks are interesting in themselves and new to me.”Footnote 61 ([321], Volume 8B, Document 439, pp. 610–611)

Einstein’s remarks concerning his previous efforts must be seen under the aspect of some attempts at formulating a unified field theory of matter by G. Mie [229, 230, 231]Footnote 62, J. Ishiwara, and G. Nordström, and in view of the unified field theory of gravitation and electromagnetism proposed by David Hilbert.

“According to a general mathematical theorem, the electromagnetic equations (generalized Maxwell equations) appear as a consequence of the gravitational equations, such that gravitation and electrodynamics are not really different.”Footnote 63 (letter of Hilbert to Einstein of 13 November 1915 [162])

The result is contained in (Hilbert 1915, p. 397)Footnote 64.

Einstein’s answer to Hilbert on 15 November 1915 shows that he had also been busy along such lines:

“Your investigation is of great interest to me because I have often tortured my mind in order to bridge the gap between gravitation and electromagnetism. The hints dropped by you on your postcards bring me to expect the greatest.”Footnote 65 [101]

Even before Förster alias Bach corresponded with Einstein, a very early bird in the attempt at unifying gravitation and electromagnetism had published two papers in 1917, Reichenbächer [270, 269]. His paper amounts to a scalar theory of gravitation with field equation \(R = 0\) instead of Einstein’s \({R_{ab}} = 0\) outside the electrons. The electron is considered as an extended body in the sense of Lorentz-Poincaré, and described by a metric joined continuously to the outside metricFootnote 66:

$$ d{s^2} = d{r^2} + {r^2}d{\phi ^2} + {r^2}{\cos ^2}\phi d{\psi ^2} + {\left( {1 - \frac{\alpha }{r}} \right)^2}dx_0^2. $$

Reichenbächer, at this point, seems to have had a limited understanding of general relativity: He thinks in terms of a variable velocity of light; he equates coordinate systems and reference systems, and apparently considers the transition from the Minkowskian to a non-flat metric as achieved by a coordinate rotation, a “Drehung gegen den Normalzustand” (“rotation with respect to the normal state”) ([270], p. 137). According to him, the deviation from the Minkowski metric is due to the electromagnetic field tensor:

“The disturbance, which is generated by the electrons and which forces us to adopt a coordinate system different from the usual one, is interpreted as the electromagnetic six-vector, as is known.”Footnote 67 ([270], p. 136)

By his “coordinate rotation”, or, as he calls it in ([269], p. 174), “electromagnetic rotation”, he tries to geometrize the electromagnetic field. As Weyl’s remark in Raum-Zeit-Materie ([398], p. 267, footnote 30) shows, he did not grasp Reichenbächer’s reasoning; I have not yet understood it either. Apparently, for Reichenbächer the metric deviation from Minkowski space is due solely to the electromagnetic field, whereas gravitation comes in by a single scalar potential connected to the velocity of light. He claims to obtain the same value for the perihelion shift of Mercury as Einstein ([269], p. 177). Reichenbächer was slow to fully accept general relativity; as late as in 1920 he had an exchange with Einstein on the foundations of general relativity [271, 71].

After Reichenbächer had submitted his paper to Annalen der Physik and seemingly referred to Einstein,

“Planck was uncertain to which of Einstein’s papers Reichenbächer appealed. He urged that Reichenbächer speak with Einstein and so dissolve their differences. The meeting was amicable. Reichenbächer’s paper appeared in 1917 as the first attempt at a unified field theory in the wake of Einstein’s covariant field equations.” ([262], p. 208)

In this context, we must also keep in mind that the generalisation of the metric tensor toward asymmetry or complex values was more or less synchronous with the development of Finsler geometry [126]. Although Finsler himself did not apply his geometry to physics, it was soon used in attempts at the unification of gravitation and electromagnetism [274].

3.2 Early disagreement about how to explain elementary particles by field theory

In his book on Einstein’s relativity, Max Born, in 1920, had asked about the forces preventing “an electron or an atom” from disintegrating.

“Now, these objects are tremendous concentrations of energy in the smallest place; therefore, they will house huge curvatures of space or, in other words, gravitational fields. The idea that they keep together the dispersing electrical charges lies close at hand.”Footnote 68 ([19], p. 235)

Thus, the idea of a program for building the extended constituents of matter from the fields of which they are the source was very much alive around 1920. However, Pauli’s remark after Weyl’s lecture in Bad Nauheim (86. Naturforscherversammlung, 19–25 September 1920) [245] showed that not everybody was a believer in it. He claimed that in bodies smaller than those carrying the elementary charge (electrons), an electric field could not be measured. There was no point in creating the “interior” of such bodies with the help of an electric field. Pauli:

“None of the present theories of the electron, also not Einstein’s (Einstein 1919 [70]), up to now did achieve solving satisfactorily the problem of the electrical elementary quanta; it seems obvious to look for a deeper reason for this failure. I wish to see this reason in the fact that it is altogether not permitted to describe the electromagnetic field in the interior of an electron as a continuous space function. The electrical field is defined as the force on a charged test particle, and if no smaller test particles exist than the electron (vice versa the nucleus), the concept of electrical field at a certain point in the interior of the electron — with which all continuum theories are working — seems to be an empty fiction, because there are no arbitrarily small measures. Therefore, I’d like to ask Mr. Einstein whether he approves of the opinion that a solution of the problem of matter may be expected only from a modification of our perception of space (perhaps also of time) and of electricity in the sense of atomism, or whether he thinks that the mentioned reservations are unconvincing and is of the opinion that the fundaments of continuum theory must be upheld.”Footnote 69

Pauli referred to Einstein’s paper about elementary particles and field theory in which he had exchanged his famous field equations for traceless equations with the electromagnetic field tensor as a source. Einstein’s answer is tentative and evasive: We just don’t know yetFootnote 70.

“With the progressing refinement of scientific concepts, the manner by which concepts are related to (physical) events becomes ever more complicated. If, in a certain stage of scientific investigation, it is seen that a concept can no longer be linked with a certain event, there is a choice to let the concept go, or to keep it; in the latter case, we are forced to replace the system of relations among concepts and events by a more complicated one. The same alternative obtains with respect to the concepts of time- and space-distances. In my opinion, an answer can be given only under the aspect of feasibility; the outcome appears dubious to me.”Footnote 71

In the same discussion, Gustav Mie came back to Förster’s idea of an asymmetric metric but did not like it:

“[…] that an antisymmetric tensor was added to the symmetric tensor of the gravitational potential, which represented the six-vector of the electromagnetic field. But a more precise reasoning shows that in this way no reasonable world function is obtained.”Footnote 72

It is to be noted that Weyl, at the end of 1920, already had given up on a possible field theory of matter:

“Finally I cut loose firmly from Mie’s theory and arrived at another position with regard to the problem of matter. To me, field physics no longer appears as the key to reality; in contrary, the field, the ether, for me simply is the totally powerless transmitter of causations, yet matter is a reality beyond the field and causes its states.”Footnote 73 (letter of Weyl to F. Klein on 28 December 1920, see [293], p. 83)

In the next year, Einstein had partially absorbed Pauli’s view but still thought it to be useful to apply field theory to the constituents of matter:

“The physical interpretation of geometry (theory of the continuum) presented here, fails in its direct application to spaces of submolecular scale. Yet it retains part of its meaning also with regard to questions concerning the constitution of elementary particles. Because one may try to ascribe to these field concepts […] a physical meaning even if a description of the electrical elementary particles which constitute matter is to be made. Only success can decide whether such a procedure finds its justification […].”Footnote 74 [72]

During the twenties Einstein changed his mind and looked for solutions of his field equations which were everywhere regular to represent matter particles:

“In the program, Mr. Einstein expressed during his two talks given in November 1929 at the Institut Henri Poincaré, he wished to search for the physical laws in solutions of his equations without singularities — with matter and the electromagnetic field thus being continuous. Let us move into the field chosen by him without too much surprise to see him apparently follow a road opposed to the one successfully walked by the contemporary physicists.”Footnote 75 ([36], p. 17 (1178))

4 The Main Ideas for Unification between about 1918 and 1923

After 1915, Einstein first was busy with extracting mathematical and physical consequences from general relativity (Hamiltonian, exact solutions, the energy conservation law, cosmology, gravitational waves). Although he kept thinking about how to find elementary particles in a field theory [70] and looked closer into Weyl’s theory [72], at first he only reacted to the new ideas concerning unified field theory as advanced by others. The first such idea after Förster’s, of course, was Hermann Weyl’s gauge approach to gravitation and electromagnetism, unacceptable to Einstein and to Pauli for physical reasons [246, 292].

Next came Kaluza’s five-dimensional unification of gravitation and electromagnetism, and Eddington’s affine geometry.

4.1 Weyl’s theory

4.1.1 The geometry

Weyl’s fundamental idea for generalising Riemannian geometry was to note that, unlike for the comparison of vectors at different points of the manifold, for the comparison of scalars the existence of a connection is not required. Thus, while lengths of vectors at different points can be compared without a connection, directions cannot. This seemed too special an assumption to Weyl for a genuine infinitesimal geometry:

“If we make no further assumption, the points of a manifold remain totally isolated from each other with regard to metrical structure. A metrical relationship from point to point will only then be infused into [the manifold] if a principle for carrying the unit of length from one point to its infinitesimal neighbours is given.” Footnote 76

In contrast to this, Riemann made the much stronger assumption that line elements may be compared not only at the same place but also at two arbitrary places at a finite distance.

“However, the possibility of such a comparison ‘at a distance’ in no way can be admitted in a pure infinitesimal geometry.” Footnote 77 ([397], p. 397)

In order to invent a purely “infinitesimal” geometry, Weyl introduced the 1-dimensional, Abelian group of gauge transformations,

$$ g \to \bar g: = \lambda g, $$

besides the diffeomorphism group (coordinate transformations). At a point, Equation (98) induces a local recalibration of lengths l while preserving angles, i.e., \(\delta l \sim l\). If the non-metricity tensor is assumed to have the special form \({Q_{ijk}} = {Q_k}{g_{ij}}\), with an arbitrary vector field \({Q_k}\), then as we know from Equation (57), with regard to these gauge transformations

$$ {Q_k} \to {Q_k} + {\partial _k}\sigma . $$

We see a striking resemblance with the electromagnetic gauge transformations for the vector potential in Maxwell’s theory. If, as Weyl does, the connection is assumed to be symmetric (i.e., with vanishing torsion), then from Equation (42) we get

$$ {\Gamma _{ij}}^k = \{ \,_{ij}^k\,\} + \frac{1}{2}(\delta _i^k{Q_j} + \delta _j^k{Q_i} - {g_{ij}}{g^{kl}}{Q_l}). $$

Thus, unlike in Riemannian geometry, the connection is not fully determined by the metric but depends also on the arbitrary vector function \({Q_i}\), which Weyl wrote as a linear form \(dQ = {Q_i}d{x^i}\). With regard to the gauge transformations (98), \({\Gamma _{ij}}^k\) remains invariant. From the 1-form dQ, by exterior derivation a gauge-invariant 2-form \(F = {F_{ij}}d{x^i} \wedge d{x^j}\) with \({F_{ij}} = {Q_{i,j}} - {Q_{j,i}}\) follows. It is named “Streckenkrümmung” (“line curvature”) by Weyl, and, by identifying Q with the electromagnetic 4-potential, he arrived at the electromagnetic field tensor F.
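
The invariance of the connection can be verified directly. The following is a sketch, assuming that the Christoffel symbol is \(\{ \,_{ij}^k\,\} = \tfrac{1}{2}{g^{kl}}({\partial _i}{g_{jl}} + {\partial _j}{g_{il}} - {\partial _l}{g_{ij}})\) and that λ is a positive function. Under \({g_{ij}} \to \lambda {g_{ij}}\) one has \({g^{kl}} \to {\lambda ^{ - 1}}{g^{kl}}\), hence

$$ \{ \,_{ij}^k\,\} \to \{ \,_{ij}^k\,\} + \frac{1}{2}\left( {\delta _i^k{\partial _j}\ln \lambda + \delta _j^k{\partial _i}\ln \lambda - {g_{ij}}{g^{kl}}{\partial _l}\ln \lambda } \right). $$

The extra term has exactly the structure of the Q-dependent part of the connection, in which the combination \({g_{ij}}{g^{kl}}\) is itself unchanged. The connection is therefore invariant if \({Q_k} \to {Q_k} - {\partial _k}\ln \lambda \), which corresponds to \(\sigma = - \ln \lambda \) in the transformation of \({Q_k}\) given above. The sign of σ relative to ln λ depends on the conventions adopted for Q; the one stated here is what the connection as written above implies.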

Let us now look at what happens to parallel transport of a length, e.g., the norm |X| of a tangent vector along a particular curve C with parameter u to a different (but infinitesimally neighbouring) point:

$$ \frac{{d\left| X \right|}}{{du}} = {\left| X \right|_{\left\| k \right.}}{X^k} = (\sigma - \frac{1}{2}{Q_k}{X^k})\left| X \right|. $$

By a proper choice of the curve’s parameter, we may write (101) in the form \(d\left| X \right| = - {Q_k}{X^k}\left| X \right|du\) and integrate along C to obtain \(|X| = |X{|_0}\exp ( - \int_C {{Q_k}{X^k}du} )\), where \(|X{|_0}\) is the value at the starting point. If X is taken to be tangent to C, i.e., \({X^k} = \tfrac{{d{x^k}}}{{{du}}}\), then

$$ \left| {\frac{{d{x^k}}}{{du}}} \right| = {\left| {\frac{{d{x^k}}}{{du}}} \right|_0}\exp \left( { - \int_C {{Q_k}(x)d{x^k}} } \right), $$

i.e., the length of a vector is not integrable; its value generally depends on the curve along which it is parallelly transported. The same holds for the angle between two tangent vectors at a point (cf. Equation (44)). For a vanishing electromagnetic field, the 4-potential becomes a gradient (“pure gauge”), such that \({Q_k}d{x^k} = \tfrac{{\partial \omega }}{{\partial {x^k}}}d{x^k} = d\omega\), and the integral becomes independent of the curve.
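
The path dependence can be quantified for a closed curve. This is a sketch, assuming that C bounds a two-dimensional surface S, that the comma denotes partial differentiation, \({Q_{i,j}} = {\partial _j}{Q_i}\), and that \({F_{ij}} = {Q_{i,j}} - {Q_{j,i}}\). Stokes’ theorem then gives

$$ \oint_C {{Q_k}d{x^k}} = \int_S {\frac{1}{2}({\partial _i}{Q_j} - {\partial _j}{Q_i})\,d{x^i} \wedge d{x^j}} = - \frac{1}{2}\int_S {{F_{ij}}\,d{x^i} \wedge d{x^j}} . $$

With the length change governed by \(d\left| X \right| = - {Q_k}{X^k}\left| X \right|du\), a length transported once around C changes by the factor \(\exp (\tfrac{1}{2}\int_S {{F_{ij}}\,d{x^i} \wedge d{x^j}} )\), the overall sign depending on the orientation chosen for S. The factor equals one for all such curves only if \({F_{ij}} = 0\), which is the case of a vanishing electromagnetic field.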

Thus, in Weyl’s connection (100), both the gravitational and the electromagnetic fields, represented by the metrical field g and the vector field Q, are intertwined. Perhaps, having in mind Mie’s ideas of an electromagnetic world view and Hilbert’s approach to unification, in the first edition of his book, Weyl remained reserved:

“Again physics, now the physics of fields, is on the way to reduce the whole of natural phenomena to one single law of nature, a goal to which physics already once seemed close when the mechanics of mass-points based on Newton’s Principia did triumph. Yet, also today, the circumstances are such that our trees do not grow into the sky.”Footnote 78 ([396], p. 170; preface dated “Easter 1918”)

However, a little later, in his paper accepted on 8 June 1918, Weyl boldly claimed:

“I am bold enough to believe that the whole of physical phenomena may be derived from one single universal world-law of greatest mathematical simplicity.”Footnote 79 ([397], p. 385, footnote 4)

The adverse circumstances alluded to in the first quotation might be linked to the difficulties of finding a satisfactory Lagrangian from which the field equations of Weyl’s theory can be derived. Due to the additional group of gauge transformations, it is useful to introduce the new concept of gauge-weight within tensor calculus as in Section 2.1.5Footnote 80. As the Lagrangian \({\mathcal L} = \sqrt { - g} L\) must have gauge-weight w=0, we are looking for a scalar L of gauge-weight -2. WeitzenböckFootnote 81 has shown that the only possibilities quadratic in the curvature tensor and the line curvature are given by the four expressions [391]

$$ \begin{array}{*{20}{c}} {{{({K_{ij}}\;{g^{ij}})}^2},\;\;\;\;\;}&{{K_{ij}}{K_{kl}}{g^{ik}}{g^{jl}},\;\;\;\;\;}&{{K^i}_{jkl}{K^j}_{imn}{g^{km}}{g^{ln}},\;\;\;\;\;}&{{F_{ij}}{F_{kl}}{g^{ik}}{g^{jl}}.} \end{array} $$

While the last invariant would lead to Maxwell’s equations, from the invariants quadratic in curvature, in general field equations of fourth order result.
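
That each of the four expressions has the required gauge-weight can be checked by counting metric factors. This is a sketch, assuming D = 4, the convention that a quantity of gauge-weight w is multiplied by \({\lambda ^w}\) under \({g_{ij}} \to \lambda {g_{ij}}\), and that \({K_{ij}}\) is a contraction of \({K^i}_{jkl}\) formed without the metric. The connection is gauge invariant, so that \({K^i}_{jkl}\), \({K_{ij}}\), and \({F_{ij}}\) all have weight 0, while \({g^{ij}}\) has weight −1 and \(\sqrt { - g} \) has weight D/2 = 2. Each expression contains exactly two factors of the inverse metric, e.g.,

$$ {F_{ij}}{F_{kl}}{g^{ik}}{g^{jl}} \to {\lambda ^{ - 2}}{F_{ij}}{F_{kl}}{g^{ik}}{g^{jl}},\;\;\;\;\;\sqrt { - g} \to {\lambda ^2}\sqrt { - g} , $$

so that the product \(\sqrt { - g} L\) has weight 0 for any linear combination L of the four expressions.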

Weyl did calculate the curvature tensor formed from his connection (100) but did not get the correct resultFootnote 82; it is given by Schouten ([310], p. 142) and follows from Equation (51):

$$ {K^i}_{jkl} = {R^i}_{jkl} + {Q_{j;[k}}{\delta _{l]}}^i + {\delta _j}^i{Q_{[l;k]}} - {Q^i}_{;[k}{g_{l]j}} + \frac{1}{2}\left( { - {\delta _{[l}}^i{Q_{k]}}{Q_j} + {Q_{[k}}{g_{l]j}}{Q^i} - {\delta _{[k}}^i{g_{l]j}}{Q_r}{Q^r}} \right). $$

If the metric field g and the 4-potential \({Q_i} \sim {A_i}\) are varied independently, from each of the curvature-dependent scalar invariants we do get contributions to Maxwell’s equations.

Perhaps Bach (alias Förster) was also dissatisfied with Weyl’s calculations: He went through the entire mathematics of Weyl’s theory, curvature tensor, quadratic Lagrangian field equations and all; he even discussed exact solutions. His Lagrangian is given by \({\mathcal L} = \sqrt g (3{W_4} - 6{W_3} + {W_2})\), where the invariants are defined by

$$ \begin{array}{*{20}{c}} {{W_4}: = {S_{pqik}}{S^{pqik}},\;\;\;\;\;}&{{W_3}: = {g^{ik}}{g^{lm}}{F^p}_{ikq}{F^q}_{lmp},\;\;\;\;\;}&{{W_2}: = {g^{ik}}{F^p}_{ikp}} \end{array} $$


$$ \begin{array}{*{20}{l}} {{S_{[pq][ik]}}: = \frac{1}{4}({F_{pqik}} - {F_{qpik}} + {F_{ikpq}} - {F_{kipq}}),}\\ {\;\;\;{F_{pqik}} = {R_{pqik}} + \frac{1}{2}({g_{pq}}{f_{ki}} + {g_{pk}}{f_{q;i}} + {g_{qi}}{f_{p;k}} - {g_{pi}}{f_{q;k}} - {g_{qk}}{f_{p;i}})}\\ {\;\;\;\;\;\;\;\;\;\;\;\; + \frac{1}{2}({f_q}{g_{p;[k}}{f_{i]}} + {f_p}{g_{q;[k}}{f_{i]}} + {g_{p[i}}{g_{k]q}}{f_r}{f^r}),} \end{array} $$

where \({R_{pqik}}\) is the Riemannian curvature tensor, \({f_{ik}} = {f_{i,k}} - {f_{k,i}}\), and \({f_i}\) is the electromagnetic 4-potential [4].

4.1.2 Physics

While Weyl’s unification of electromagnetism and gravitation looked splendid from the mathematical point of view, its physical consequences were dire: In general relativity, the line element ds had been identified with space and time intervals measurable by real clocks and real measuring rods. Now, only the equivalence class \(\{ \lambda {g_{ik}}\,|\,\lambda \;{\rm{arbitrary}}\} \) was supposed to have a physical meaning: It was as if clocks and rulers could be arbitrarily “regauged” in each event, whereas in Einstein’s theory the same clocks and rulers had to be used everywhere. Einstein, being the first expert who could keep an eye on Weyl’s theory, immediately objected, as we infer from his correspondence with Weyl.

In spring 1918, the first edition of Weyl’s famous book on differential geometry, special and general relativity Raum-Zeit-Materie appeared, based on his course in Zürich during the summer term of 1917 [396]. Weyl had arranged that the page proofs be sent to Einstein. In communicating this on 1 March 1918, he also stated that

“As I believe, during these days I succeeded in deriving electricity and gravitation from the same source. There is a fully determined action principle, which, in the case of vanishing electricity, leads to your gravitational equations while, without gravity, it coincides with Maxwell’s equations in first order. In the most general case, the equations will be of 4th order, though.”Footnote 83

He then asked whether Einstein would be willing to communicate a paper on this new unified theory to the Berlin Academy ([321], Volume 8B, Document 472, pp. 663–664). At the end of March, Weyl visited Einstein in Berlin, and finally, on 5 April 1918, he mailed his note to him for the Berlin Academy. Einstein was impressed: In April 1918, he wrote four letters and two postcards to Weyl on his new unified field theory — with a tone varying between praise and criticism. His first response of 6 April 1918 on a postcard was enthusiastic:

“Your note has arrived. It is a stroke of genius of first rank. Nevertheless, up to now I was not able to do away with my objection concerning the scale.”Footnote 84 ([321], Volume 8B, Document 498, 710)

Einstein’s “objection” is formulated in his “Addendum” (“Nachtrag”) to Weyl’s paper in the reports of the Academy, because Nernst had insisted on such a postscript. There, Einstein argued that if light rays were the only available means for the determination of metrical relations near a point, then Weyl’s gauge would make sense. However, as long as measurements are made with (infinitesimally small) rigid rulers and clocks, there is no indeterminacy in the metric (as Weyl would have it): Proper time can be measured. It follows that if, in nature, length and time depended on the pre-history of the measuring instrument, then no uniquely defined frequencies of the spectral lines of a chemical element could exist, i.e., the frequencies would depend on the location of the emitter. He concluded with the words

“Regrettably, the basic hypothesis of the theory seems unacceptable to me, [of a theory] the depth and audacity of which must fill every reader with admiration.”Footnote 85 ([395], Addendum, p. 478)

Einstein’s remark concerning the path-dependence of the frequencies of spectral lines stems from the path-dependency of the integral (102) given above. Only for a vanishing electromagnetic field does this objection not hold.

Weyl answered Einstein’s comment to his paper in a “reply of the author” affixed to it. He doubted that it had been shown that a clock, if violently moved around, measures proper time ∫ ds. Only in a static gravitational field, and in the absence of electromagnetic fields, does this hold:

“The most plausible assumption that can be made for a clock resting in a static field is this: that it measure the integral of the ds normed in this way [i.e., as in Einstein’s theory]; the task remains, in my theory as well as in Einstein’s, to derive this fact by a dynamics carried through explicitly.”Footnote 86 ([395], p. 479)

Einstein saw the problem, then unsolved within his general relativity, that Weyl alluded to, i.e., to give a theory of clocks and rulers within general relativity. Presumably, such a theory would have to include microphysics. In a letter to his former student Walter Dällenbach, he wrote (after 15 June 1918):

“[Weyl] would say that clocks and rulers must appear as solutions; they do not occur in the foundation of the theory. But I find: If the ds, as measured by a clock (or a ruler), is something independent of pre-history, construction and the material, then this invariant as such must also play a fundamental role in theory. Yet, if the manner in which nature really behaves would be otherwise, then spectral lines and well-defined chemical elements would not exist. […] In any case, I am as convinced as Weyl that gravitation and electricity must let themselves be bound together to one and the same; I only believe that the right union has not yet been found.”Footnote 87 ([321], Volume 8B, Document 565, 803)

Another famous theoretician who could not side with Weyl was H. A. Lorentz; in a paper on the measurement of lengths and time intervals in general relativity and its generalisations, he contradicted Weyl’s statement that the world-lines of light-signals would suffice to determine the gravitational potentials [211].

However, Weyl still believed in the physical value of his theory. As further “extraordinarily strong support for our hypothesis of the essence of electricity” he considered the fact that he had obtained the conservation of electric charge from gauge-invariance in the same way as he had earlier linked coordinate-invariance with what at the time was considered to be “conservation of energy and momentum”, where a non-tensorial object stood in for the energy-momentum density of the gravitational field ([398], pp. 252–253).

Moreover, Weyl had some doubts about the general validity of Einstein’s theory which he derived from the discrepancy in value by 20 orders of magnitude between the classical electron radius and the gravitational radius corresponding to the electron’s mass ([397], p. 476; [152]).

4.1.3 Reactions to Weyl’s theory I: Einstein and Weyl

There exists an intensive correspondence between Einstein and Weyl, now completely available in volume 8 of the Collected Papers of Einstein [321]. We summarise some of the relevant discussions. Even before Weyl’s note was published by the Berlin Academy on 6 June 1918, many exchanges had taken place between him and Einstein.

On a postcard to Weyl on 8 April 1918, Einstein reaffirmed his admiration for Weyl’s theory, but remained firm in denying its applicability to nature. Weyl had given an argument for dimension 4 of space-time that Einstein liked: As the Lagrangian for the electromagnetic field \({F_{ik}}{F^{ik}}\) is of gauge-weight −2 and \(\sqrt { - g}\) has gauge-weight D/2 in an \({M_D}\), the integrand in the Hamiltonian principle \(\sqrt { - g} {F_{ik}}{F^{ik}}\) can have weight zero only for D = 4: “Apart from the [lacking] agreement with reality it is in any case a grandiose intellectual performance”Footnote 88 ([321], Vol. 8B, Doc. 499, 711). Weyl did not give in:

“Your rejection of the theory for me is weighty; […] But my own brain still keeps believing in it. And as a mathematician I must by all means hold to [the fact] that my geometry is the true geometry ‘in the near’, that Riemann happened to come to the special case \({F_{ik}} = 0\) is due only to historical reasons (its origin is the theory of surfaces), not to such that matter.”Footnote 89 ([321], Volume 8B, Document 544, 767)
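
The weight counting behind Weyl’s argument for D=4, mentioned above, can be spelled out in a short sketch. It assumes that a gauge transformation rescales the metric as \({g_{ik}} \to \lambda {g_{ik}}\) (gauge-weight 1) and leaves \({F_{ik}}\) unchanged; the symbol λ denotes here the gauge factor only:

$$ \sqrt { - g} \to {\lambda ^{D/2}}\sqrt { - g} ,\;\;\;\;\;{F^{ik}} = {g^{il}}{g^{km}}{F_{lm}} \to {\lambda ^{ - 2}}{F^{ik}},\;\;\;\;\;\sqrt { - g} {F_{ik}}{F^{ik}} \to {\lambda ^{D/2 - 2}}\sqrt { - g} {F_{ik}}{F^{ik}}. $$

The exponent D/2 - 2 vanishes only for D=4.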

After Weyl’s next paper on “pure infinitesimal geometry” had been submitted, Einstein put forward further arguments against Weyl’s theory. The first was that Weyl’s theory preserves the similarity of geometric figures under parallel transport, and that this would not be the most general situation (cf. Equation (49)). Einstein then suggested the affine group as the more general setting for a generalisation of Riemannian geometry ([321], Vol. 8B, Doc. 551, 777). He repeated this argument in a letter to his friend Michele Besso from his vacations at the Baltic Sea on 20 August 1918, in which he summed up his position with regard to Weyl’s theory:

“[Weyl’s] theoretical attempt does not fit to the fact that two originally congruent rigid bodies remain congruent independent of their respective histories. In particular, it is unimportant which value of the integral \(\int {\phi _\nu }d{x_\nu }\) is assigned to their world line. Otherwise, sodium atoms and electrons of all sizes would exist. But if the relative size of rigid bodies does not depend on past history, then a measurable distance between two (neighbouring) world-points exists. Then, Weyl’s fundamental hypothesis is incorrect on the molecular level, anyway. As far as I can see, there is not a single physical reason for it being valid for the gravitational field. The gravitational field equations will be of fourth order, against which speaks all experience until now […].”Footnote 90 ([99], p. 133)

Einstein’s remark concerning “affine geometry” is referring to the affine geometry in the sense it was introduced by Weyl in the 1st and 2nd edition of his book [396], i.e., through the affine group and not as a suggestion of an affine connexion.

From Einstein’s viewpoint, in Weyl’s theory the line element ds is no longer a measurable quantity; the electromagnetic 4-potential never had been one. Writing from his vacations on 18 September 1918, Weyl presented a new argument in order to circumvent Einstein’s objections. The quadratic form \(R{g_{ik}}d{x^i}d{x^k}\) is an absolute invariant, i.e., also with regard to gauge transformations (gauge weight 0). If this expression were taken as the measurable distance in place of ds, then

“[…] by the prefixing of this factor, so to speak, the absolute norming of the unit of length is accomplished after all”Footnote 91 ([321], Volume 8B, Document 619, 877–879)
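
Why the prefactor R produces gauge weight 0 can be seen from a one-line count. The sketch assumes that the Ricci tensor \({R_{ik}}\) is built from Weyl’s gauge-invariant connection and is therefore unchanged under \({g_{ik}} \to \lambda {g_{ik}}\), with λ again denoting the gauge factor:

$$ R = {g^{ik}}{R_{ik}} \to {\lambda ^{ - 1}}R,\;\;\;\;\;R{g_{ik}}d{x^i}d{x^k} \to {\lambda ^{ - 1}}R \cdot \lambda {g_{ik}}d{x^i}d{x^k} = R{g_{ik}}d{x^i}d{x^k}. $$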

Einstein was unimpressed:

“But the expression \(R{g_{ik}}d{x^i}d{x^k}\) for the measured length is not at all acceptable in my opinion because R is very dependent on the matter density. A very small change of the measuring path would strongly influence the integral of the square root of this quantity.”Footnote 92

Einstein’s argument is not very convincing: \({g_{ik}}\) itself is influenced by matter through his field equations; it is only that now R is algebraically connected to the matter tensor. In view of the more general quadratic Lagrangian needed in Weyl’s theory, the connection between R and the matter tensor again might become less direct. Einstein added:

“Of course I know that the state of the theory as I presented it is not satisfactory, not to speak of the fact that matter remains unexplained. The unconnected juxtaposition of the gravitational terms, the electromagnetic terms, and the λ-terms undeniably is a result of resignation.[…] In the end, things must arrange themselves such that action-densities need not be glued together additively.”Footnote 93 ([321], Volume 8B, Document 626, 893–894)

The last remarks are interesting for the way in which Einstein imagined a successful unified field theory.

4.1.4 Reactions to Weyl’s theory II: Schouten, Pauli, Eddington, and others

Sommerfeld seems to have been convinced by Weyl’s theory, as his letter to Weyl on 3 June 1918 shows:

“What you say here is really marvelous. In the same way in which Mie glued to his consequential electrodynamics a gravitation which was not organically linked to it, Einstein glued to his consequential gravitation an electrodynamics (i.e., the usual electrodynamics) which had not much to do with it. You establish a real unity.”Footnote 94 [327]

Schouten, in his attempt in 1919 to replace the presentation of the geometrical objects used in general relativity in local coordinates by a “direct analysis”, also had noticed Weyl’s theory. In his “addendum concerning the newest theory of Weyl”, he came as far as to show that Weyl’s connection is gauge invariant, and to point to the identification of the electromagnetic 4-potential. Understandably, no comments about the physics are given ([295], pp. 89–91).

In the section on Weyl’s theory in his article for the Encyclopedia of Mathematical Sciences, Pauli described the basic elements of the geometry, the loss of the line-element ds as a physical variable, the convincing derivation of the conservation law for the electric charge, and the too many possibilities for a Lagrangian inherent in a homogeneous function of degree 1 of the invariants (103). As compared to his criticism with respect to Eddington’s and Einstein’s later unified field theories, he is speaking softly, here. Of course, as he noted, no progress had been made with regard to the explanation of the constituents of matter; on the one hand because the differential equations were too complicated to be solved, on the other because the observed mass difference between the elementary particles with positive and negative electrical charge remained unexplained. In his general remarks about this problem at the very end of his article, Pauli points to a link of the asymmetry with time-reflection symmetry (see [246], pp. 774–775; [244]). For Einstein, this criticism was not only directed against Weyl’s theory

“but also against every continuum-theory, also one which treats the electron as a singularity. Now as before I believe that one must look for such an overdetermination by differential equations that the solutions no longer have the character of a continuum. But how?” ([103], p. 43)

In a letter to Besso on 26 July 1920, Einstein repeated an argument against Weyl’s theory which had been removed by Weyl — if only by a trick to be described below; Einstein thus said:

“One must pass to tensors of fourth order rather than only to those of second order, which carries with it a vast indeterminacy, because, first, there exist many more equations to be taken into account, second, because the solutions contain more arbitrary constants.”Footnote 95 ([99], p. 153)

In his book “Space, Time, and Gravitation”, Eddington gave a non-technical introduction into Weyl’s “welding together of electricity and gravitation into one geometry”. The idea of gauging lengths independently at different events was the central theme. He pointed out that while the fourfold freedom in the choice of coordinates had led to the conservation laws for energy and momentum, “in the new geometry is a fifth arbitrariness, namely that of the selected gauge-system. This must also give rise to an identity; and it is found that the new identity expresses the law of conservation of electric charge.” One natural gauge was formed by the “radius of curvature of the world”; “the electron could not know how large it ought to be, unless it had something to measure itself against” ([57], pp. 174, 173, 177).

As Eddington distinguished natural geometry and actual space from world geometry and conceptual space serving for a graphical representation of relationships among physical observables, he presented Weyl’s theory in his monograph “The mathematical theory of relativity”

“from the wrong end — as its author might consider; but I trust that my treatment has not unduly obscured the brilliance of what is unquestionably the greatest advance in the relativity theory after Einstein’s work.” ([59], p. 198)

Of course, “wrong end” meant that Eddington took Weyl’s theory such

“that his non-Riemannian geometry is not to be applied to actual space-time; it refers to a graphical representation of that relation-structure which is the basis of all physics, and both electromagnetic and metrical variables appear in it as interrelated.” ([59], p. 197)

Again, Eddington liked Weyl’s natural gauge encountered in Section 4.1.5, which made the curvature scalar a constant, i.e., K=4λ; it became a consequence of Eddington’s own natural gauge in his affine theory, \({K_{ij}} = \lambda {g_{ij}}\) (cf. Section 4.3). For Eddington, Weyl’s theory of gauge-transformation was a hybrid:

“He admits the physical comparison of length by optical methods […]; but he does not recognise physical comparison of length by material transfer, and consequently he takes λ to be a function fixed by arbitrary convention and not necessarily a constant.” ([59], pp. 220–221)

In the depth of his heart Weyl must have kept a fondness for his idea of “gauging” a field all during the decade between 1918 and 1928. As he had abandoned the idea of describing matter as a classical field theory since 1920, the linking of the electromagnetic field via the gauge idea could only be done through the matter variables. As soon as the new spinorial wave function (“matter wave”) in Schrödinger’s and Dirac’s equations emerged, he adapted his idea and linked the electromagnetic field to the gauging of the quantum mechanical wave function [407, 408]. In October 1950, in the preface for the first American printing of the English translation of the fourth edition of his book Space, Time, Matter from 1922, Weyl clearly expressed that he had given up only the particular idea of a link between the electromagnetic field and the local calibration of length:

“While it was not difficult to adapt also Maxwell’s equations of the electromagnetic field to this principle [of general relativity], it proved insufficient to reach the goal at which classical field physics is aiming: a unified field theory deriving all forces of nature from one common structure of the world and one uniquely determined law of action.[…] My book describes an attempt to attain this goal by a new principle which I called gauge invariance. (Eichinvarianz). This attempt has failed.” ([410], p. V)

4.1.5 Reactions to Weyl’s theory III: Further research

Pauli, still a student, and with his article for the Encyclopedia in front of him, pragmatically looked into the gravitational effects in the planetary system, which, as a consequence of Einstein’s field equations, had helped Einstein to his fame. He showed that Weyl’s theory had, for the static case, as a possible solution a constant Ricci scalar; thus it also admitted the Schwarzschild solution and could reproduce all desired effects [244, 243].

Weyl himself continued to develop the dynamics of his theory. In the third edition of his Space-Time-Matter [398], at the Naturforscherversammlung in Bad Nauheim in 1920 [399], and in his paper on “the foundations of the extended relativity theory” in 1921 [402], he returned to his new idea of gauging length by setting R=λ= const. (cf. Section 4.1.3); he interpreted λ to be the “radius of curvature” of the world. In 1919, Weyl’s Lagrangian originally was \({\mathcal L} = \tfrac{1}{2}\sqrt g {K^2} + \beta {F_{ik}}{F^{ik}}\) together with the constraint K=2λ, with constant λ ([398], p. 253). As an equivalent Lagrangian Weyl gave, up to a divergenceFootnote 96

$$ {\mathcal L} = \sqrt g \left( {R + \alpha {F_{ik}}{F^{ik}} + \frac{1}{4}(2\lambda - 3{\phi _l}{\phi ^l})} \right), $$

with the 4-potential \({\phi _l}\) and the electromagnetic field \({F_{ik}}\). Due to his constraint, Weyl had navigated around another problem, i.e., the formulation of the Cauchy initial value problem for field equations of fourth order: Now he had arrived at second order field equations. In the paper in 1921, he changed his Lagrangian slightly into

$$ {\mathcal L} = \sqrt g \left( {R + \alpha {F_{ik}}{F^{ik}} + \frac{{{ \epsilon ^2}}}{4}(1 - 3{\phi _l}{\phi ^l})} \right), $$

with a factor ϵ in Weyl’s connection (100),

$$ {\Gamma _{ij}}^k = \{ \,_{ij}^k\,\} + \frac{ \epsilon }{2}(\delta _i^k{\phi _j} + \delta _j^k{\phi _i} - {g_{ij}}{g^{kl}}{\phi _l}). $$
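
That a connection of this form is gauge invariant can be checked directly. The following sketch writes ϵ for the factor multiplying the bracket, takes the sign convention of the connection exactly as displayed, and assumes a rescaling \({g_{ij}} \to \lambda {g_{ij}}\) with a positive function λ(x) (the gauge factor, not the constant λ of the Lagrangian). The transformation law for \({\phi _i}\) is derived from these assumptions and is not quoted from Weyl’s text. The Christoffel symbol changes by

$$ \{ \,_{ij}^k\,\} \to \{ \,_{ij}^k\,\} + \frac{1}{2}(\delta _i^k{\partial _j}\ln \lambda + \delta _j^k{\partial _i}\ln \lambda - {g_{ij}}{g^{kl}}{\partial _l}\ln \lambda ), $$

while the combination \({g_{ij}}{g^{kl}}\) is unaffected, so that \({\Gamma _{ij}}^k\) remains unchanged provided

$$ {\phi _i} \to {\phi _i} - \frac{1}{ \epsilon }{\partial _i}\ln \lambda . $$

Being a shift by a gradient, this leaves the antisymmetrised derivative of \({\phi _i}\), and hence \({F_{ik}}\), unchanged.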

In both presentations, he considered as an advantage of his theory:

“Moreover, this theory leads to the cosmological term in a uniform and forceful manner, [a term] which in Einstein’s theory was introduced ad hoc”Footnote 97 ([402], p. 474)

Reichenbächer seemingly was unhappy about Weyl’s taking the curvature scalar to be a constant before the variation; in the discussion after Weyl’s talk in 1920, he inquired whether one could not introduce Weyl’s “natural gauge” after the variation of the Lagrangian such that the field equations would show their gauge invariance first ([399], p. 651). Eddington criticised Weyl’s choice of a Lagrangian as speculative:

“At the most we can only regard the assumed form of action […] as a step towards some more natural combination of electromagnetic and gravitational variables.” ([59], p. 212)

The changes, which Weyl had introduced in the 4th edition of his book [401], and which, according to him, were of fundamental importance for the understanding of relativity theory, were discussed by him in a further paper [400]. In connection with the question of whether, in general relativity, a formulation might be possible such that “matter whose characteristic traits are charge, mass, and motion generates the field”, a question which was considered as unanswered by Weyl, he also mentioned a publication of Reichenbächer [272]. For Weyl, knowledge of the charge and mass of each particle, and of the extension of their “world-channels”, was insufficient to determine the field uniquely. Weyl’s hint at a solution remains obscure; nevertheless, for him it meant

“to reconcile Reichenbächer’s idea: matter causes a ‘deformation’ of the metrical field and Einstein’s idea: inertia and gravitation are one.” ([400], p. 561, footnote)

Although Einstein could not accept Weyl’s theory as a physical theory, he cherished “its courageous mathematical construction” and thought intensively about its conceptual foundation: This becomes clear from his paper “On a complement at hand of the bases of general relativity” of 1921 [73]. In it, he raised the question whether it would be possible to generate a geometry just from the conformal invariance of Equation (9) without use of the conception “distance”, i.e., without using rulers and clocks. He then embarked on conformal invariants and tensors of gauge-weight 0, and gave the one formed from the square of Weyl’s conformal curvature tensor (59), i.e. \({C^i}_{jkl}C_i^{jkl}\). His colleague in Vienna, WirtingerFootnote 98, had helped him in thisFootnote 99. Einstein’s conclusion was that, by writing down a metric with gauge-weight 0, it was possible to form a theory depending only on the quotient of the metrical components. If J has gauge-weight -1, then \(J{g_{ik}}\) is such a metric. In order to reduce the new theory to general relativity, in addition only the differential equation

$$ J = {J_0} = {\rm{const}}{\rm{.}} $$

would have to be solved.

Eisenhart wished to partially reinterpret Weyl’s theory: In place of putting the vector potential equal to Weyl’s gauge vector, he suggested identifying it with \(\tfrac{{ - F_k^i{J^k}}}{\mu }\), where \({J^i}\) is the electrical 4-current vector (-density) and μ the mass density. He referred to Weyl, Eddington’s book, and to Pauli’s article in the Encyclopedia of Mathematical Sciences [116].

Einstein’s rejection of the physical value of Weyl’s theory was seconded by DienesFootnote 100, if only with a not very helpful argument. He demanded that the connection remain metric-compatible, from which, trivially, Weyl’s gauge-vector must vanish. Dienes applied the same argument to Eddington’s generalisation of Weyl’s theory [51]. Other mathematicians took Weyl’s theory at its face value and drew consequences; thus M. Juvet calculated Frenet’s formulas for an “n-èdre” in Weyl’s geometry by generalising a result of Blaschke for Riemannian geometry [180]. More important, however, for later work was the gauge invariant tensor calculus by a fellow of St. John’s College in Cambridge, M. H. A. Newman [237]. In this calculus, tensor equations preserve their form both under a change of coordinates and a change of gauge. Newman applied his scheme to a variational principle with Lagrangian \({K^2}\) and concluded:

“The part independent of the ‘electrical’ vector \({\phi _i}\) is found to be \({K_{ij}} - \tfrac{1}{4}K{g_{ij}}\), a tensor which has been considered by Einstein from time to time in connection with the theory of gravitation.” ([237], p. 623)

After the Second World War, research following Weyl’s classical geometrical approach with his original 1-dimensional Abelian gauge-group was resumed. The more important development, however, was the extension to non-Abelian gauge-groups and the combination with Kaluza’s idea. We shall discuss these topics in Part II of this article. The shift in Weyl’s interpretation of the role of the gauging from the link between gravitation and electromagnetism to a link between the quantum mechanical state function and electromagnetism is touched on in Section 7.

4.2 Kaluza’s five-dimensional unification

What is now called Kaluza-Klein theory in the physics community is a mixture of quite different contributions by both scientistsFootnote 101. Kaluza’s idea of looking at four spatial and one time dimension originated in or before 1919; by then he had communicated it to Einstein:

“The idea of achieving [a unified field theory] by means of a five-dimensional cylinder world never dawned on me. [–] At first glance I like your idea enormously.” (letter of Einstein to Kaluza of 21 April 1919)

This remark is surprising because Nordström had suggested a five-dimensional unification of his scalar gravitational theory with electromagnetism five years earlier [238], by embedding space-time into a five-dimensional world in quite the same way as Kaluza did. In principle, Einstein could have known Nordström’s work. In the same year 1914, he and Fokker had given a covariant formulation of Nordström’s pure (scalar) theory of gravitation [104]. In a subsequent letter to Kaluza of 5 May 1919 Einstein still was impressed: “The formal unity of your theory is startling.” However, on 29 May 1919, Einstein became somewhat reservedFootnote 102:

“I respect greatly the beauty and boldness of your idea. But you understand that, in view of the existing factual concerns, I cannot take sides as planned originally.”Footnote 103

Kaluza’s paper was communicated by Einstein to the Academy, but for reasons unknown was published only in 1921 [181]. Kaluza’s idea was to write down the Einstein field equations for empty space in a five-dimensional Riemannian manifold with metric \({g_{\alpha \beta }}\), i.e., \({R_{\alpha \beta }} = 0\), \(\alpha ,\beta = 1, \ldots ,5\), where \({R_{\alpha \beta }}\) is the Ricci tensor of \({M_5}\), and to look at small deviations γ from Minkowski space: \({g_{\alpha \beta }} = - {\delta _{\alpha \beta }} + {\gamma _{\alpha \beta }}\)Footnote 104. In order to obtain a theory in space-time, he assumed the so-called “cylinder condition”

$$ {g_{\alpha \beta ,5}} = 0, $$

equivalent to the existence of a spacelike translational symmetry (Killing vector). Equation (109) is used for all “functions of state” (Zustandsgrössen), i.e., also for the matter variables. Kaluza did not normalize the Killing vector to a constant, i.e., he kept

$$ {g_{55}} \ne {\rm{const}}{\rm{.}} $$

Equation (110) is called the “sharpened cylinder condition” by some authors including Einstein. Of the 15 components of \({g_{\alpha \beta }}\), five had to get a new physical interpretation, i.e. \({g_{\alpha 5}}\) and \({g_{55}}\); the components \({g_{ik}}\), \(i,k = 1, \ldots ,4\), were to describe the gravitational field as before; Kaluza took \({g_{i5}}\) proportional to the electromagnetic vector potential \({A_i}\). The component \({g_{55}}\) turned out to be a (scalar) gravitational potential which, in the static case, satisfies the equation

$$ {\nabla ^2}{g_{55}} = - \kappa {\mu _0}, $$

with the constant matter density μ0.

Kaluza also showed that the geodesics of the five-dimensional space reduce to the equations of motion for a charged point particle in space-time, if a weakness assumption is made for the components of the 5-velocity \({u^\alpha }\): \({u^1},{u^2},{u^3},{u^5} \ll 1\), \({u^4} \simeq 1\). The Lorentz force appears augmented by an additional term containing \({g_{55}}\) of the order \({\left( {\tfrac{u}{c}} \right)^2}\) which thus may be neglected. From the fifth equation of motion Kaluza concluded that the fifth component of momentum \({p_5} \sim e\), with e being the particle’s electric charge (up to a constant of proportionality). From the equations of motion, charge conservation also followed in Kaluza’s linear approximation. Kaluza was well aware that his theory broke down if applied to elementary particles like electrons or protons, and speculated about an escape in which gravitation had to be considered as some “difference effect”, and the gravitational constant given “a statistical meaning”. For him, any theory claiming universal validity was endangered by quantum theory, anyway.

From the cylinder condition, a grave objection to Kaluza’s approach results: Covariance with regard to the diffeomorphism group of \({M_5}\) is destroyed. The remaining covariance group \({G_5}\) is given by

$$ \begin{array}{*{20}{c}} {{x^{5'}} = {x^5} + f({x^k}),\;\;\;\;\;}&{{x^{l'}} = {x^l}({x^m}),\;\;\;\;\;}&{k,l,m = 1, \ldots ,4.} \end{array} $$

The objects transforming properly under (112) are: the scalar \({g_{5'5'}} = {g_{55}}\), the vector-potential \({g_{5'k'}} = {g_{5l}}\frac{{\partial {x^l}}}{{\partial {x^{k'}}}} + {g_{55}}\frac{{\partial {x^5}}}{{\partial {x^{k'}}}}\), and the projected metric

$$ {g_{i'k'}} - \frac{{{g_{i'5'}}{g_{k'5'}}}}{{{g_{5'5'}}}} = \left( {{g_{lm}} - \frac{{{g_{l5}}{g_{m5}}}}{{{g_{55}}}}} \right)\frac{{\partial {x^l}}}{{\partial {x^{i'}}}}\frac{{\partial {x^m}}}{{\partial {x^{k'}}}}. $$
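
The transformation laws just listed follow from the ordinary tensor law for \({g_{\alpha \beta }}\) restricted to (112). A brief sketch, using only the inverse transformation \({x^5} = {x^{5'}} - f\) and the fact that the \({x^l}\) do not depend on \({x^{5'}}\):

$$ \frac{{\partial {x^5}}}{{\partial {x^{5'}}}} = 1,\;\;\;\;\;\frac{{\partial {x^l}}}{{\partial {x^{5'}}}} = 0,\;\;\;\;\;\frac{{\partial {x^5}}}{{\partial {x^{k'}}}} = - \frac{{\partial f}}{{\partial {x^{k'}}}}, $$

$$ {g_{5'5'}} = {g_{\alpha \beta }}\frac{{\partial {x^\alpha }}}{{\partial {x^{5'}}}}\frac{{\partial {x^\beta }}}{{\partial {x^{5'}}}} = {g_{55}},\;\;\;\;\;{g_{5'k'}} = {g_{5l}}\frac{{\partial {x^l}}}{{\partial {x^{k'}}}} - {g_{55}}\frac{{\partial f}}{{\partial {x^{k'}}}}. $$

The shift of \({x^5}\) by f thus acts on \({g_{5k}}\) like a gauge transformation of a vector potential, with \({g_{55}}\) as a factor.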

Klein identified the group; however, he did not comment on the fact that now further invariants are available for a Lagrangian, but started right away from the Ricci scalar of \({M_5}\) [185]. The group \({G_5}\) is isomorphic to the group \({H_5}\) of transformations for five homogeneous coordinates \({X^{\mu '}} = {f^\mu }({X^\nu })\) with \({f^\nu }\) homogeneous functions of degree 1. Here, contact is made to the projective formulation of Kaluza’s theory (cf. “projective geometry” in Sections 2.1.3 and 6.3.2).

While towards the end of May 1919 Einstein had not yet fully supported the publication of Kaluza’s manuscript, on 14 October 1921 he thought differently:

“I am having second thoughts about having kept you from the publication of your idea on the unification of gravitation and electricity two years ago. I value your approach more than the one followed by H. Weyl. If you wish, I will present your paper to the Academy after all.”Footnote 105 (letter from Einstein to Kaluza reprinted in [49], p. 454)

It seems that at some point Einstein had set his calculational aide GrommerFootnote 106 to work on regular spherically symmetric solutions of Kaluza’s theory. This led to a joint publication which was submitted just one month after Einstein had finally presented a rewritten manuscript of Kaluza’s to the Berlin Academy [105]. The negative result of his own paper, i.e., that no non-singular, static, spherically symmetric exact solution exists, did not please Einstein. He also thought that Kaluza’s assumption of general covariance in the five-dimensional manifold had no support from physics; he disliked the preference of the fifth coordinate due to Equation (109) which seemed to contradict the equivalence of all five coordinates used by Kaluza in the construction of the field equations [105]. In any case, apart from an encouraging letter to Kaluza in 1925 in which he called Kaluza’s idea the only serious attempt at unified field theory besides the Weyl-Eddington approach, Einstein kept silent on the five-dimensional theory until 1926.

4.3 Eddington’s affine theory

4.3.1 Eddington’s paper

The third main idea that emerged was Eddington’s suggestion to forego the metric as a fundamental concept and start right away with a (general) connection, which he then restricted to a symmetric one Γ in order to avoid an “infinitely crinkled” world [58]. His motivation went beyond the unification of gravitation and electromagnetism:

“In passing beyond Euclidean geometry, gravitation makes its appearance; in passing beyond Riemannian geometry, electromagnetic force appears; what remains to be gained by further generalisation? Clearly, the non-Maxwellian binding forces which hold together an electron. But the problem of the electron must be difficult, and I cannot say whether the present generalisation succeeds in providing the material for its solution” ([58], p. 104)

In the first, shorter, part of two, Eddington describes affine geometry; in the second he relates mathematical objects to physical variables. He distinguishes the affine geometry as the “geometry of the world-structure” from Riemannian geometry as “the natural geometry of the world”. He starts by calculating both the curvature and Ricci tensors from the symmetric connection according to Equation (39). The Ricci tensor \({K_{ij}}(\Gamma ): = {}^*{G_{ij}}\) is asymmetricFootnote 107,

$$ {}^*{G_{kl}} = {R_{kl}} + {F_{kl}}, $$

with \({R_{kl}}(\Gamma )\) being the symmetric and \({F_{kl}}(\Gamma )\) the antisymmetric part. According to Equation (31) \({F_{kl}}\) derives from a “vector potential”, i.e., \({F_{kl}} = {\partial _k}{\Gamma _l} - {\partial _l}{\Gamma _k}\) with \({\Gamma _l}\) formed from the contraction \({\Gamma _{lr}}^r\), such that an immediate physical identification of \({F_{kl}}\) with the electromagnetic field tensor is at hand. With half of Maxwell’s equations being satisfied automatically, the other half is used to define the electric charge current \({j^k}\) by \({j^l}: = {F^{lk}}_{\left\| k \right.}\). By this, Eddington claims to guarantee charge conservation:

“The divergence of \({j^k}\) will vanish identically if \({j^k}\) is itself the divergence of any antisymmetrical contravariant tensor.” ([64], p. 223; cf. also [58], p. 113)

Now, by Equation (25),

$$ {F^{lk}}_{\left\| {[l} \right\|k]} = {K_{[rk]}}{F^{rk}} + {S_{jk}}^s{\nabla _s}{F^{jk}}. $$

For a symmetric connection thus, unlike in Riemannian geometry,

$$ {j^k}_{\left\| k \right.} = {F^{lk}}_{\left\| {[l} \right\|k]} = {F_{rk}}{F^{rk}} \ne 0. $$

However, for a tensor density, due to Equation (16) we obtain

$$ {{\hat \jmath}^k}_{\left\| k \right.} = {{\hat F}^{lk}}_{\left\| {[l} \right\|k]} = \frac{1}{2}({V_{rk}} + 2{K_{[rk]}}){{\hat F}^{rk}} + {S_{jk}}^s{\nabla _s}{{\hat F}^{jk}}, $$

and thus for a torsionless connection (cf. Equation (38)) \({{\hat \jmath}^k}_{\left\| k \right.} = 0.\)
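
The vanishing of the divergence for the density can also be seen without the curvature identities. A sketch, assuming a symmetric connection and an antisymmetric contravariant tensor density \({{\hat F}^{lk}}\) of weight 1, for which the connection terms in the covariant divergence cancel:

$$ {{\hat F}^{lk}}_{\left\| k \right.} = {\partial _k}{{\hat F}^{lk}},\;\;\;\;\;{{\hat \jmath}^l}_{\left\| l \right.} = {\partial _l}{{\hat \jmath}^l} = {\partial _l}{\partial _k}{{\hat F}^{lk}} = 0, $$

the last step following from the antisymmetry of \({{\hat F}^{lk}}\) against the symmetry of the second partial derivatives.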

Eddington introduces the metrical tensor by the definition

$$ \lambda {g_{kl}} = {R_{kl}}, $$

“introducing a universal constant λ, for convenience, in order to remain free to use the centimetre instead of the natural unit of length”. This is called “Einstein’s gauge” by Eddington; he is delighted that

“Our gauging-equation is therefore certainly true wherever light is propagated, i.e., everywhere inside the electron. Who shall say what is the ordinary gauge inside the electron?” ([58], p. 114)

While this remark certainly is true, there is no guarantee in Eddington’s approach that \({g_{kl}}\) thus defined is a Lorentzian metric, i.e., that it could describe light propagation at all. Only connections leading to a Lorentz metric can be used if a physical interpretation is wanted. Note also that the interpretation of \({R_{kl}}\) as the metric implies that \(\det {R_{kl}} \ne 0\).

We must read Equation (118) as giving \({g_{kl}}(\Gamma )\) if the only basic variable in affine geometry, i.e., the connection \({\Gamma _{ij}}^k\), has been determined with the help of some field equations. Thus, in general, \({g_{kl}}\) is not metric-compatible; in order to make it such, we are led to the differential equations \({R_{ij\left\| k \right.}} = 0\) for \({\Gamma _{ij}}^k\), an equation not considered by Eddington. In the absence of an electromagnetic field, Equation (118) looks like Einstein’s vacuum field equation with cosmological constant. In principle, now a fictitious “Riemannian” connection (the Christoffel symbol) can be written down which, however, is a horribly complicated function of the affine connection, the only fundamental geometrical quantity available. This is due to the expression for the inverse of the metric, a function cubic in \({R_{kl}}\). Eddington’s affine theory thus can also be seen as a bi-connection theory. Note also that Eddington does not explicitly say how to obtain the contravariant form of the electromagnetic field \({F^{ij}}\) from \({F_{ij}}\); we must assume that he thought of raising indices with the complicated inverse metric tensor.
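
The comparison with Einstein’s vacuum field equation with cosmological constant can be made explicit in a short sketch. It assumes a vanishing electromagnetic field, a metric-compatible connection, four dimensions, and the sign convention in which the Einstein tensor reads \({R_{kl}} - \tfrac{1}{2}R{g_{kl}}\); other conventions change the sign of the λ-term:

$$ {R_{kl}} = \lambda {g_{kl}}\;\; \Rightarrow \;\;R = {g^{kl}}{R_{kl}} = 4\lambda ,\;\;\;\;\;{R_{kl}} - \frac{1}{2}R{g_{kl}} + \lambda {g_{kl}} = \lambda {g_{kl}} - 2\lambda {g_{kl}} + \lambda {g_{kl}} = 0. $$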

In connection with cosmological considerations, Eddington cherished the λ-term in Equation (118):

“I would as soon think of reverting to Newtonian theory as of dropping the cosmic constant.” ([63], p. 35)

Now, Eddington was able to identify the energy-momentum tensor \({T^{ik}}\) of the electromagnetic field by decomposing the Ricci tensor \({K_{ij}}\) formed from Equation (51) into a metric part \({R_{ik}}\) and the rest. The energy-momentum tensor \({T^{ik}}\) of the electromagnetic field is then defined by Einstein’s field equations with a fictitious cosmological constant: \(\kappa {T^{ik}}: = {G^{ik}} - \tfrac{1}{2}{g^{ik}}(G - 2\lambda )\).

Although Eddington’s interest did not rest on finding a proper set of field equations, he nevertheless discussed the Lagrangian \({\mathcal L} = \sqrt { - g} *{G^{ik}}*{G_{ik}}\), and showed that a variation with regard to gik did not lead to an acceptable field equation.

Eddington’s main goal in this paper was to include matter as an inherent geometrical structure:

“What we have sought is not the geometry of actual space and time, but the geometry of the world-structure which is the common basis of space and time and things.” ([58], p. 121)

By “things” he meant

  (1) the energy-momentum tensor of matter, i.e., of the electromagnetic field,

  (2) the tensor of the electromagnetic field, and

  (3) the electric charge-and-current vector.

His aim was reached in the sense that all three quantities were fixed entirely by the connection; they could no longer be given from the outside. As to the question of the electron, it is seen as “a region of abnormal world-curvature”, i.e., of abnormally large curvature.

While Pauli liked Eddington’s distinction between “natural geometry” and “world geometry” (with the latter being only “a graphical representation” of reality), he was not sure at all whether “a point of view could be taken from which the gravitational and electromagnetic fields appear as a union”. If so, then it must be a purely phenomenological one without any recourse to the nature of the charged elementary particles (cf. his letter to Eddington quoted below).

Lorentz did not like the large number of variables in Eddington’s theory; there were 4 components of the electromagnetic potential, 10 components of the metric and 40 components of the connection:

“It may well be asked whether after all it would not be preferable simply to introduce the functions that are necessary for characterising the electromagnetic and gravitational fields, without encumbering the theory with so great a number of superfluous quantities.” ([211], p. 382)

4.3.2 Einstein’s reaction and publications

Eddington’s publication early in 1921, generalising Einstein’s and Weyl’s theories, started a new direction of research both in physics and mathematics. At first, Einstein seems to have been reserved (cf. his letters to Weyl in June and September 1921 quoted by Stachel in his article on Eddington and Einstein ([330], pp. 453–475; here p. 466)), but one and a half years later he became attracted by Eddington’s idea. To Bohr, Einstein wrote from Singapore on 11 January 1923:

“I believe I have finally understood the connection between electricity and gravitation. Eddington has come closer to the truth than Weyl.” ([139], p. 274)

He now tried to make Eddington’s theory work as a physical theory; Eddington had not given field equations:

“I must absolutely publish since Eddington’s idea must be thought through to the end.” (letter of Einstein to Weyl of 23 May 1923; cf. [241], p. 343)

And a few days later, he was still intrigued by this sort of unified field theory, in particular by its elusiveness: “[…] Over it lingers the marble smile of inexorable nature, which has bestowed on us more longing than brains.”Footnote 108 (letter of Einstein to Weyl of 26 May 1923; cf. [241], p. 343) And indeed Einstein published fast, even while still on the steamer returning from Japan through Palestine and Spain: the paper of February 1923 in the reports of the Berlin Academy carries, as location of the sender, the ship “Haruna Maru” of the Japanese Nippon Yusen Kaisha lineFootnote 109 [77].

“In past years, the wish to understand the gravitational and electromagnetic field as one in essence has dominated the endeavours of theoreticians. […] From a purely logical point of view only the connection should be used as a fundamental quantity, and the metric as a quantity derived thereof […] Eddington has done this.”Footnote 110 ([77], p. 32)

Like Eddington, Einstein used a symmetric connection and wrote down the equationFootnote 111

$$ {\lambda ^2}{K_{kl}} = {g_{kl}} + {\phi _{kl}}, $$

where \({g_{kl}} = {g_{(kl)}}\) and \({\phi _{kl}} = {\phi _{[kl]}}\), and λ is a “large number”. By this, the metric was defined as the symmetric part of the Ricci tensor. Due to

$$ {\phi _{kl}} = \frac{1}{2}\left( {\frac{{\partial {\Gamma _{kj}}^j}}{{\partial {x^l}}} - \frac{{\partial {\Gamma _{lj}}^j}}{{\partial {x^k}}}} \right) $$

one half of Maxwell’s equations is satisfied if φkl is taken to be the electromagnetic field tensor. Let us note, however, that while \({{\Gamma _{kj}}^j}\) transforms inhomogeneously, its transformation law

$$ {\Gamma _{k'j'}}^{j'} = {\Gamma _{lm}}^m\frac{{\partial {x^l}}}{{\partial {x^{k'}}}} + \frac{{{\partial ^2}{x^l}}}{{\partial {x^{k'}}\partial {x^{m'}}}} \cdot \frac{{\partial {x^{m'}}}}{{\partial {x^l}}} $$

is not exactly the same as that of the electric 4-potential under gauge transformations.
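The comparison with a gauge transformation can be made explicit in a short worked computation. What follows is only a sketch: the abbreviations \({\Gamma _k}: = {\Gamma _{kj}}^j\) and \(J: = \det (\partial {x^l}/\partial {x^{m'}})\), as well as the gauge function χ, are introduced here for illustration and do not appear in Einstein’s paper. Using Jacobi’s formula for the derivative of a determinant, the inhomogeneous term is a gradient:

$$ \frac{{{\partial ^2}{x^l}}}{{\partial {x^{k'}}\partial {x^{m'}}}}\frac{{\partial {x^{m'}}}}{{\partial {x^l}}} = \frac{\partial }{{\partial {x^{k'}}}}\log \left| J \right|\quad \Rightarrow \quad {\Gamma _{k'}} = {\Gamma _l}\frac{{\partial {x^l}}}{{\partial {x^{k'}}}} + \frac{{\partial \log \left| J \right|}}{{\partial {x^{k'}}}}. $$

As with the gauge transformation \({A_k} \to {A_k} + {\partial _k}\chi\) of the 4-potential, a gradient is added, so that the curl \({\phi _{kl}}\) transforms as a tensor and satisfies the cyclic identity

$$ \frac{{\partial {\phi _{kl}}}}{{\partial {x^m}}} + \frac{{\partial {\phi _{lm}}}}{{\partial {x^k}}} + \frac{{\partial {\phi _{mk}}}}{{\partial {x^l}}} = 0, $$

which is the homogeneous half of Maxwell’s equations. The difference is that \(\log \left| J \right|\) is fixed by the coordinate transformation, whereas χ may be chosen freely and independently of the coordinates.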

For a Lagrangian, Einstein used \({\mathcal L} = 2\sqrt { - \det \;{K_{ij}}}\); he claimed that for vanishing electromagnetic field the vacuum field equations of general relativity, with the cosmological term included, hold. Einstein varied with regard to \({g_{kl}}\) and \({\phi _{kl}}\), not, as one might have expected, with regard to the connection \({\Gamma _{kj}}^j\). If \({\hat f^{kl}}: = \tfrac{{\delta {\mathcal L}}}{{\delta {\phi _{kl}}}}\), then the electric current density \({{\hat \jmath}^l}\) is defined by \({{\hat \jmath}^l}: = \tfrac{{\partial {{\hat f}^{kl}}}}{{\partial {x^k}}}\). \({\hat f^{kl}}\) is interpreted as “the contravariant tensor of the electromagnetic field”.

The field equations are obtained from the Lagrangian by variation with regard to the connection \({\Gamma _{kj}}^l\) and are (Einstein worked in space-time)

$$ \begin{array}{*{20}{c}} {{{\hat s}^{kl}}_{\;\;\;\| m} + \frac{1}{3}\delta _m^k\,{{\hat \jmath}^l} + \frac{1}{3}\delta _m^l\,{{\hat \jmath}^k} = 0,\;\;\;\;\;}&{3{{\hat s}^{kl}}_{\;\;\;\| l} + 5\,{{\hat \jmath}^k} = 0,} \end{array} $$

with the definition of the current density \({{\hat \jmath}^k}\) given before, and \({\hat s^{kl}} = \tfrac{{\delta {\cal L}}}{{\delta {g_{kl}}}}\). Besides \({{\hat s}^{kl}}\), Einstein also uses \({s_{kl}}\) introduced by

$$ \begin{array}{*{20}{c}} {{{\hat s}^{kl}} = {s^{kl}}\sqrt { - \det \;{s_{kl}}} ,\;\;\;\;\;}&{{s^{kl}}{s_{ml}} = \delta _m^k.} \end{array} $$
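That the second of the field equations (121) is not independent of the first can be checked directly. A minimal sketch, assuming four space-time dimensions (so that \(\delta _l^l = 4\)) and that the divergence is formed by contracting m with l in the first equation:

$$ {{\hat s}^{kl}}_{\;\;\;\| l} + \frac{1}{3}\delta _l^k\,{{\hat \jmath}^l} + \frac{1}{3}\delta _l^l\,{{\hat \jmath}^k} = {{\hat s}^{kl}}_{\;\;\;\| l} + \left( {\frac{1}{3} + \frac{4}{3}} \right){{\hat \jmath}^k} = 0\quad \Rightarrow \quad 3{{\hat s}^{kl}}_{\;\;\;\| l} + 5\,{{\hat \jmath}^k} = 0. $$

The factor 5 is thus tied to the dimension of space-time; in n dimensions the same contraction would give \(3{{\hat s}^{kl}}_{\;\;\;\| l} + (n + 1)\,{{\hat \jmath}^k} = 0\).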

From Equation (121) the connection can be obtained. If \({{\hat \jmath}^l} = \sqrt { - \det \;{s_{kl}}} \,{j^l}\), and \({j_k} = {s_{kl}}{j^l}\), then the affine connection may formally be expressed by

$$ {\Gamma _{kj}}^l = \frac{1}{2}{s^{lr}}\left( {\frac{{\partial {s_{kr}}}}{{\partial {x^j}}} + \frac{{\partial {s_{jr}}}}{{\partial {x^k}}} - \frac{{\partial {s_{kj}}}}{{\partial {x^r}}}} \right) - \frac{1}{2}{s_{kj}}\,{j^l} + \frac{1}{3}{\delta ^l}_{(k}\,{j_{j)}} $$

This equation is an identity if a solution of the field equations (121) is inserted. From Equation (123),

$$ {\Gamma _{kj}}^j = \frac{\partial }{{\partial {x^k}}}\log \sqrt { - \det \;{s_{lr}}} + \frac{1}{3}{j_k}. $$

If no electromagnetic field is present, \({\hat s^{kl}}\) reduces to \({\hat s^{kl}} = {g^{kl}}\sqrt { - \det \;{g_{kl}}}\); the definition of the metric \({g_{ij}}\) in Equation (119) is reinterpreted by Einstein as giving his vacuum field equation with cosmological constant \({\lambda ^{ - 2}}\). In order that this makes sense, the identifications in Equation (119) are always to be made after the variation of the Lagrangian is performed.

For non-vanishing electromagnetic field, due to Equation (124) the Equation (120) now becomes

$$ {\hat \phi _{kl}} = \frac{1}{6}\left( {\frac{{\partial {j_k}}}{{\partial {x^l}}} - \frac{{\partial {j_l}}}{{\partial {x^k}}}} \right), $$

which means that for vanishing current density no electromagnetic field is possible. Einstein concluded:

“But the extraordinary smallness of \(1/{\lambda ^2}\) implies that finite \({\phi _{kl}}\) are possible only for tiny, almost vanishing current density. Except for singular positions, the current density is practically vanishing.”Footnote 112
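The chain leading from the connection (123) to this statement about the current density can be followed in two short steps. This is a sketch under stated assumptions: four space-time dimensions, round brackets denoting symmetrisation with the factor 1/2, a symmetric \({s_{kl}}\), and σ as shorthand (introduced here) for the scalar whose gradient forms the first term of Equation (124). Contracting l with j in the two current-dependent terms of the connection gives

$$ - \frac{1}{2}{s_{kj}}\,{j^j} + \frac{1}{6}\left( {\delta _k^j\,{j_j} + \delta _j^j\,{j_k}} \right) = \left( { - \frac{1}{2} + \frac{1}{6} + \frac{4}{6}} \right){j_k} = \frac{1}{3}{j_k}, $$

and inserting \({\Gamma _{kj}}^j = {\partial _k}\sigma + \tfrac{1}{3}{j_k}\) into Equation (120) removes the gradient term, because partial derivatives commute:

$$ \frac{1}{2}\left( {\frac{{\partial {\Gamma _{kj}}^j}}{{\partial {x^l}}} - \frac{{\partial {\Gamma _{lj}}^j}}{{\partial {x^k}}}} \right) = \frac{1}{6}\left( {\frac{{\partial {j_k}}}{{\partial {x^l}}} - \frac{{\partial {j_l}}}{{\partial {x^k}}}} \right). $$

The field tensor is therefore the curl of the current alone, which is why it must vanish wherever the current density does.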

Einstein went on to show that Maxwell’s vacuum equations hold in first-order approximation. Up to the same order, \({\hat f^{kl}} \simeq {\hat \phi _{kl}}\). In general, however, \({{\hat \phi }_{kl}} \ne {s_{km}}{s_{ln}}{{\hat f}^{nm}}\). Also, the geometrical theory presented here is energetically closed, i.e., the current density \({{\hat \jmath}^l}\) cannot be given arbitrarily as in the usual Maxwell theory with external sources.

Einstein was not sure whether “electrical elementary elements”, i.e., nonsingular electrons, are possible in this theory; they might be. He found it remarkable “[…] that, according to this theory, positive and negative electricity cannot differ just in sign”Footnote 113 ([77], p. 38). His final conclusion was:

“that EDDINGTON’S general idea in context with the Hamiltonian principle leads to a theory almost free of ambiguities; it does justice to our present knowledge about gravitation and electricity and unifies both kinds of fields in a truly accomplished manner.”Footnote 114 ([77], p. 38)

Until the end of May 1923, two further publications followed in which Einstein elaborated on the theory. In the second paper, he exchanged the Lagrangian \({\mathcal L} = 2\sqrt { - \det \;{K_{ij}}}\) for a new one, i.e., for \({\mathcal L} = - 2\sqrt { - \det \;{K_{ij}}} + \hat R - \tfrac{1}{6}{{\hat s}^{lm}}{i_l}{i_m}\), where \({{\hat \imath}^k} = {i^k}\sqrt {\det \;{s_{kl}}}\). \({\mathcal L}\) is to be varied with respect to \({{\hat s}^{kl}}\) and \({{\hat f}^{kl}}\). The resulting equations for the gravitational and electromagnetic fields are the symmetric and skew-symmetric part, respectively, of

$$ {K_{jk}} = {R_{jk}} + \frac{1}{6}\left( {\frac{{\partial {i_j}}}{{\partial {x^k}}} - \frac{{\partial {i_k}}}{{\partial {x^j}}} + {i_j}{i_k}} \right). $$

Although the theory offered, for every solution with positive charge, also a solution with negative charge, the masses in the two cases were the same. However, the only known particle with positive charge at the time (what is now called the proton) had a mass greatly different from the particle with negative charge, the electron. Einstein noted:

“Therefore, the theory may not account for the difference in mass of positive and negative electrons.”Footnote 115 ([74], p. 77)

In the third paper [76], apart from changing notationsFootnote 116, Einstein set λ=1. He also dropped the assumption (119) and replaced it by allowing his Lagrangian (Hamiltonian) Ĥ to be a function of the two independent variables,

$$ \begin{array}{*{20}{c}} {{\gamma _{ij}} = {K_{(ij)}},\;\;\;\;\;}&{{\phi _{ij}} = {K_{[ij]}}.} \end{array} $$

The logic of the subsequent derivations in his paper is quite involved. The first step consisted in the definition of tensor densities

$$ \begin{array}{*{20}{c}} {{{\hat g}^{kl}}: = \frac{{\delta \hat H}}{{\delta {\gamma _{kl}}}},\;\;\;\;\;}&{{{\hat f}^{kl}}: = \frac{{\delta \hat H}}{{\delta {\phi _{kl}}}}.} \end{array} $$

In the second step, the variations \(\delta {\gamma _{kl}}\) and \(\delta {\phi _{kl}}\) were expressed by \(\delta {\Gamma _{ik}}^l\) via (127) and inserted into \(\delta \hat H = 0\). The ensuing equation could be solved for \({\Gamma _{ik}}^l\) and led to Equation (123). In the third step, the Lagrangian \({\hat H^*}\) is taken as a functional of the variables introduced in the first step, i.e., of \({{\hat g}^{kl}}\) and \({{\hat f}^{kl}}\), such that in place of Equation (128) the relations

$$ \begin{array}{*{20}{c}} {{\gamma _{kl}}: = \frac{{\delta {{\hat H}^*}}}{{\delta {{\hat g}^{kl}}}},\;\;\;\;\;}&{{\phi _{kl}}: = \frac{{\delta {{\hat H}^*}}}{{\delta {{\hat f}^{kl}}}}} \end{array} $$

hold. Einstein then took “the expression most natural vis-a-vis our present knowledge”, i.e., \({\hat H^*} = 2\alpha \sqrt { - g} - \tfrac{\beta }{2}{f_{ik}}{{\hat f}^{ik}}\). By using both Equation (127) and Equation (129), Einstein obtained the Einstein-Maxwell equations augmented by a term \( - \tfrac{1}{6}{i_k}{i_l}\) on the side of the energy-momentum tensor of the electromagnetic field, and Equation (125) with a changed l.h.s. now reading \( - \beta {f_{ik}}\).

After a field rescaling, he then took a third expression to become his Lagrangian

$$ \bar H = \sqrt { - g} \left[ {R - 2\alpha + \kappa \left( {\frac{1}{2}{f_{ij}}{f^{ij}}} \right) - \frac{1}{\beta }{i_l}{i^l}} \right], $$

where α and β are arbitrary constants, and κ is the gravitational constant. \({i_l}\) is defined to be proportional to the electromagnetic 4-potential \({f_l}\), i.e., \(\tfrac{1}{\beta }{i_l} = - {f_l}\); \({f_{ij}}\) corresponds to \({\phi _{ij}}\), and \(\sqrt { - g} {f^{ij}}\) to \({{\hat f}^{ij}}\). After the field equations had been obtained by this long-winded procedure, it became obvious that they could also be derived from \(\bar H\) taken as an “effective” Lagrangian varied with respect to \({g_{ik}}\) and \({f_{ik}}\). In Einstein’s words: “R is the Riemannian curvature scalar formed from \({g_{ij}}\)”Footnote 117.

In the third paper as well, Einstein’s desire to create a unified field theory satisfying all his criteria was still not fulfilled: his equations, again, did not give a singularity-free electron. In a paper on Hilbert’s vision of a unified science, Sauer and Majer recently found, from lectures Hilbert gave in Hamburg and Zürich in 1923, that Hilbert considered Einstein’s work in affine theory a return to his own results of 1915 by “[…] a colossal detour via Levi-Civita, Weyl, Schouten, Eddington […]” [215]. It seems that, in this evaluation, Hilbert was influenced by Einstein’s proportionality between the 4-potential and the electrical current, which Hilbert had assumed as early as 1915 [161]Footnote 118.

4.3.3 Comments by Einstein’s colleagues

While, in the meantime, mathematicians had taken over the conceptual development of affine theory, some other physicists, including the perpetual pièce de résistance Pauli, kept a negative attitude:

“[…] I now do not at all believe that the problem of elementary particles can be solved by any theory applying the concept of continuously varying field strengths which satisfy certain differential equations to regions in the interior of elementary particles. […] The quantities \(\Gamma _{\nu \alpha }^\mu\) cannot be measured directly, but must be obtained from the directly measured quantities by complicated calculational operations. Nobody can determine empirically an affine connection for vectors at neighbouring points if he has not obtained the line element before. Therefore, unlike you and Einstein, I deem the mathematician’s discovery of the possibility to found a geometry on an affine connection without a metric as meaningless for physics, in the first place.”Footnote 119 (Pauli to Eddington on 20 September 1923; [251], pp. 115–119)

Also Weyl, in the 5th edition of Raum-Zeit-Materie ([398], Appendix 4), in discussing “world-geometric extensions of Einstein’s theory”, found Eddington’s theory not convincing. He criticised a theory that keeps only the connection as a fundamental building block for its lack of a guarantee that it would also house the conformal structure (light cone structure). This is needed for special relativity to be incorporated in some sense, and thus must be an independent fundamental input [405].

Likewise, Eddington himself did not much appreciate Einstein’s followership. In Note 14, § 100 appended to the second edition of his book, he laid out Einstein’s theory but not without first having warned the reader:

“The theory is intensely formal as indeed all such action-theories must be, and I cannot avoid the suspicion that the mathematical elegance is obtained by a short cut which does not lead along the direct route of real physical progress. From a recent conversation with Einstein I learn that he is of much the same opinion.” ([64], pp. 257–261)

In fact, when Eddington’s book was translated into German in 1925 [60], Einstein wrote an appendix to it in which he repeated, with minor changes, the results of his last paper on the affine theory. His outlook on the state of the theory now was rather bleak:

“For me, the final result of this consideration regrettably consists in the impression that the deepening of the geometrical foundations by Weyl-Eddington is unable to bring progress for our physical understanding; hopefully, future developments will show that this pessimistic opinion has been unjustified.”Footnote 120 ([60], p. 371)

An echo of this can be found in Einstein’s letter to Besso of 5 June 1925:

“I am firmly convinced that the entire chain of thought Weyl-Eddington-Schouten does not lead to something useful in physics, and I now have found another, physically better founded approach. To me, the quantum-problem seems to require something like a special scalar, for the introduction of which I have found a plausible way.”Footnote 121 ([99], p. 204)

This remark shows that Einstein must have taken some notice of Schouten’s work in affine geometry. What the “special scalar” was, remains an open question.

4.3.4 Overdetermination of partial differential equations and elementary particles

Einstein spent much time in thinking about the “quantum problem”, as he confessed to Born:

“I do not believe that the theory will be able to dispense with the continuum. But I fail to succeed in giving my pet idea a tangible form: to understand the quantum-structure through an overdetermination by differential equations.”Footnote 122 ([103], pp. 48–49)

In a paper from December 1923, Einstein not only stated clearly the necessary conditions for a unified field theory to be acceptable to him, but also expressed his hope that this technique of “overdetermination” of systems of differential equations could solve the “quantum problem”.

“According to the theories known until now the initial state of a system may be chosen freely; the differential equations then give the evolution in time. From our knowledge about quantum states, in particular as it developed in the wake of Bohr’s theory during the past decade, this characteristic feature of theory does not correspond to reality. The initial state of an electron moving around a hydrogen nucleus cannot be chosen freely; its choice must correspond to the quantum conditions. In general: not only the evolution in time but also the initial state obey laws.”Footnote 123 ([75], pp. 360–361)

He then ventured the hope that a system of overdetermined differential equations is able to determine

“also the mechanical behaviour of singular points (electrons) in such a way that the initial states of the field and of the singular points are subjected to constraints as well. […] If it is possible at all to solve the quantum problem by differential equations, we may hope to reach the goal in this direction.”

We note here Einstein’s emphasis on the very special problem of the quantum nature of elementary particles like the electron, as compared to the general problem of embedding matter fields into a geometrical setting.

One of the crucial tests for an acceptable unified field theory for him now was:

“The system of differential equations to be found, and which overdetermines the field, in any case must admit this static, spherically symmetric solution which describes, respectively, the positive and negative electron according to the equations given above [i.e., the Einstein-Maxwell equations].”Footnote 124

This attitude can also be found in a letter to M. Besso from 5 January 1924:

“The idea I am wrestling with concerns the understanding of the quantum facts; it is: overdetermination of the laws by more field equations than field variables. In such a way, the un-ambiguity of the initial conditions ought to be understood without leaving field theory. […] The equations of motion of material points (electrons) will be given up totally; their motion ought to be co-determined by the field laws.”Footnote 125 ([99], p. 197)

In his answer, Besso asked for more information concerning the quantum aspect of the concept of “overdetermination”, because:

“On the one hand, this seems to be connected only formally with a field theory; on the other, it has not yet dawned on me how in this manner something corresponding to the discrete quantum orbits may be reached.”Footnote 126 ([99], p. 199)

5 Differential Geometry’s High Tide

In the introduction to his book, Struik distinguished three directions in the development of the theory of linear connections [337]:

  1. The generalisation of parallel transport in the sense of Levi-Civita and Weyl. Schouten is the leading figure in this approach [300].

  2. The “geometry of paths” considering the lines of constant direction for a connection, with the proponents Veblen, Eisenhart [122, 114, 115, 373], J. M. Thomas [348], and T. Y. ThomasFootnote 127 [349, 347]. Here, only symmetric connections can appear.

  3. The idea of mapping a manifold at one point to a manifold at a neighbouring point is central (affine, conformal, projective mappings). The names of König [192] and Cartan [29, 302] are connected with this program.

In his assessment, Eisenhart [121] adds to this all the geometries whose metric is

“based upon an integral whose integrand is homogeneous of the first degree in the differentials. Developments of this theory have been made by Finsler, Berwald, Synge, and J. H. Taylor. In this geometry the paths are the shortest lines, and in that sense are a generalisation of geodesics. Affine properties of these spaces are obtained from a natural generalisation of the definition of Levi-Civita for Riemannian spaces.” ([121], p. V)

In fact, already in May 1921 Jan Arnoldus Schouten in Delft had submitted two papers classifying all possible connections [297, 296]. In the first he wrote:

“Motivated by relativity theory, differential geometry received a totally novel, simple and satisfying foundation; I just refer to G. Hessenberg’s ‘Vectorial foundation…’, Math. Ann. 78, 1917, S. 187–217 and H. Weyl, Raum-Zeit-Materie, 2. Section, Leipzig 1918 (3. Aufl. Berlin 1920) as well as ‘Reine Infinitesimalgeometrie’ etc.Footnote 128. […] In the present investigation all 18 different linear connections are listed and determined in an invariant manner. The most general connection is characterised by two fields of third degree, one tensor field of second degree, and a vector field […].”Footnote 129 ([297], p. 57)

The fields referred to are the torsion tensor \({S_{ij}}^k\), the tensor of non-metricity \({Q_{ij}}^k\), the metric \({g_{ij}}\), and the tensor \(C_{ij}^k\) which, in unified field theory, was rarely used. It arose because Schouten introduced different linear connections for tangent vectors and linear forms. He defined the covariant derivative of a 1-form not by the connection \({L_{ij}}^k\) in Equation (13), but by

$$ \mathop \nabla \limits^ + {}_k\,{\omega _i} = \frac{{\partial {\omega _i}}}{{\partial {x^k}}} - {}'{L_{ki}}^j{\omega _j}, $$

with \({}'L \ne L\). In fact

$$ C_{ij}^k: = L_{ij}^k - {}'L_{ij}^k. $$

In the first paper, Schouten had considered only the special case \(C_{ij}^k = {C_i}\,\delta _j^k\)Footnote 130.

Furthermore, on p. 57 of [297] we read:

“The general connection for n=4 at least theoretically opens the door for an extension of Weyl’s theory. For such an extension an invariant fixing of the connection is needed, because a physical phenomenon can correspond only to an invariant expression.”Footnote 131

Through footnote 5 on the same page we learn the pedagogical reason why Schouten did not use the ‘direct’ method [294, 336] in his presentation, but rather a coordinate dependent formalismFootnote 132:

“As the results of the present investigation might be of interest for a wider circle of mathematicians, and also for a number of physicists […].”Footnote 133

At the end of the first paper we can find a section “Eventual importance of the present investigation for physics” (pp. 79–81) and the confirmation that during the proofreading Schouten received Eddington’s paper ([58], accepted 19 February 1921). Thus, while Einstein and Weyl influenced Eddington, Schouten apparently did his research without knowing of Eddington’s idea. Einstein, perhaps, got to know Schouten’s work only later through the German translation of Eddington’s book where it is mentioned ([60], p. 319), and to which he wrote an addendum, or, more directly, through Schouten’s book on the Ricci calculus, Die Grundlehren der Mathematischen Wissenschaften in Einzeldarstellungen, in the same famous yellow series of Springer Verlag [300]. On the other hand, Einstein’s papers following Eddington’s [77, 74] inspired Schouten to publish on a theory with vector torsion that tried to remedy a problem Einstein had noted in his papers, i.e., that no electromagnetic field could be present in regions of vanishing electric current density. According to Schouten

“[…] we see that the electromagnetic field only depends on the curl of the electric current vector, so that the difficulty arises that the electromagnetic field cannot exist in a place with vanishing current density. In the following pages will be shown that this difficulty disappears when the more general supposition is made that the original deplacement is not necessarily symmetrical.” ([300], p. 850)

Schouten criticised Einstein’s argument for using a symmetric connectionFootnote 134 as unfounded (cf. Equation (15)). He then restricted the generality of his approach; in modern parlance, he allowed for vector torsion only:

“We will not consider the most general case, but the semi-symmetric case in which the alternating part of the parameters has the form:

$$ 1/2({{\Gamma '}_\mu }^\nu _{\,\,\,\,\,\lambda } - {{\Gamma '}_\lambda }^\nu _{\,\,\,\,\,\mu }) = 1/2({S_\lambda }\;\delta _\mu ^\nu - {S_\mu }\;\delta _\lambda ^\nu ), $$

in which Sλ is a general covariant vector.” ([298], p. 851)

The affine connection Γ′ can then be decomposed as follows:

$$ {\Gamma '_{jk}}^l = {\Lambda _{jk}}^l + {S_{[j}}\;\delta _{k]}^l. $$
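What “vector torsion” amounts to in this decomposition can be seen from a short check. This is a sketch under assumptions not spelled out in the text: \({\Lambda _{jk}}^l\) is taken to be symmetric in j and k, square brackets denote antisymmetrisation with the factor 1/2, and space-time is four-dimensional. Then

$$ {\Gamma '_{[jk]}}^l = {S_{[j}}\;\delta _{k]}^l = \frac{1}{2}\left( {{S_j}\;\delta _k^l - {S_k}\;\delta _j^l} \right),\qquad {\Gamma '_{[jl]}}^l = \frac{1}{2}\left( {4{S_j} - {S_j}} \right) = \frac{3}{2}{S_j}. $$

The antisymmetric part of the connection is thus fixed entirely by its trace, i.e., by the four components of \({S_j}\), instead of the 24 independent components a general torsion tensor has in four dimensions.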

Hence, besides the covariant derivative ∇′ following from use of \({\Gamma '_{jk}}^l\), in his calculations Schouten also introduced a covariant derivative ∇* formed with \({\Lambda _{jk}}^l\). Schouten’s point of departure for the field equations is Einstein’s first Lagrangian \({\mathcal L} = \sqrt {\det \;{K_{ij}}}\) and, consequently, his field equations were the same as Einstein’s apart from additional terms in vector torsion. Also, Schouten’s definition of some of the observables is different; for example, the electromagnetic field tensor, unlike in Equation (125), is now

$$ {\hat F_{kl}} = \frac{1}{6}\left( {\frac{{\partial {i_k}}}{{\partial {x^l}}} - \frac{{\partial {i_l}}}{{\partial {x^k}}}} \right) - \left( {\frac{{\partial {{\hat S}_k}}}{{\partial {x^l}}} - \frac{{\partial {{\hat S}_l}}}{{\partial {x^k}}}} \right), $$

where \({i^k}: = \nabla _l^*{f^{kl}} - {P_l}{f^{kl}}\), and \({P_k}: = - \tfrac{\partial }{{\partial {x^k}}}(\log \sqrt { - \det \;{K_{ij}}} ) + {\Lambda _{lk}}^l\). On the same topic, Schouten wrote a paper with Friedman in Leningrad [142]. A similar, but less detailed, classification of connections than Schouten’s has also been given by Cartan. He relied on the curvature, torsion and homothetic curvature 2-forms ([32], Section III; cf. also Section 2.1.4). In 1925, EyraudFootnote 135 came back to Schouten’s paper [298] and proved that his connection can be mapped projectively and conformally on a Riemannian space [124, 123].

Other mathematicians were also stimulated by Einstein’s use of differential geometry in his general relativity and, particularly, by the idea of unified field theory. Examples are Eisenhart and Veblen, both in Princeton, who developed the “geometry of paths”Footnote 136 under the influence of papers by Weyl, Eddington, and Einstein [122, 117, 383]. In Eisenhart’s paper, we may read that

“Einstein has said (in Meaning of Relativity) that ‘a theory of relativity in which the gravitational field and the electromagnetic field enter as an essential unity’ is desirable and recently has proposed such a theory.” ([117], pp. 367–368)


“His geometry also is included in the one now proposed and it may be that the latter, because of its greater generality and adaptability will serve better as the basis for the mathematical formulation of the results of physical experiments.” ([117], p. 369)

The spreading of knowledge about properties of differential geometric objects like connection and curvature took time, however, even in Leningrad. Seven years after Schouten’s classification of connections, Fréedericksz of Leningrad — known better for his contributions to the physics of liquid crystals — put forward a classification of his own by using both the connection and the curvature tensor [138].

6 The Pursuit of Unified Field Theory by Einstein and His Collaborators

6.1 Affine and mixed geometry

Already in July 1925 Einstein had laid aside his doubts concerning “the deepening of the geometric foundations”. He modified Eddington’s approach to the extent that he now took both a non-symmetric connection and a non-symmetric metric, i.e., dealt with a mixed geometry (metric-affine theory):

“[…] Also, my opinion about my paper which appeared in these reports [i.e., Sitzungsberichte of the Prussian Academy, Nr. 17, p. 137, 1923], and which was based on Eddington’s fundamental idea, is such that it does not present the true solution of the problem. After an uninterrupted search during the past two years I now believe to have found the true solution.”Footnote 137 ([78], p. 414)

As in general relativity, he started from the Lagrangian \({\mathcal L} = {\hat g^{ik}}{R_{ik}}\), but now with \({\hat g^{ik}}\) and the connection \({\Gamma _{kj}}^l\) being varied separately as independent variables. After some manipulations, the variation with regard to the metric and to the connection led to the following equations:

$$ \begin{array}{*{20}{c}} { - \frac{{\partial {g_{ik}}}}{{\partial {x^l}}} + {g_{rk}}{\Gamma _{il}}^r + {g_{ir}}{\Gamma _{lk}}^r + {g_{ik}}{\phi _l} + {g_{il}}{\phi _k} = 0,\;\;\;\;\;}&{{R_{ik}} = 0,} \end{array} $$

i.e., 64+16 equations for the same number of variables. \({\phi _k}\) is an arbitrary covariant vector. The asymmetric \({g_{ik}}\) is related to \({\hat g^{lm}}\) by

$$ \begin{array}{*{20}{c}} {{{\hat g}_{ir}}{{\hat g}^{jr}} = {{\hat g}_{ri}}{{\hat g}^{rj}} = {\delta _i}^j,\;\;\;\;\;}&{{{\hat g}_{ik}} = \frac{{{g_{ik}}}}{{\sqrt { - g} }}.} \end{array} $$
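The count of “64+16” given for Equation (134) can be reproduced by simple index counting. A sketch, assuming four space-time dimensions and no symmetry of either the connection or the metric:

$$ \underbrace {{4^3}}_{{\Gamma _{kj}}^l} + \underbrace {{4^2}}_{{g_{ik}}} = 64 + 16 = 80,\qquad 16 = \underbrace {10}_{{g_{(ik)}}} + \underbrace {6}_{{g_{[ik]}}}. $$

The first equation of (134) carries three free indices and so supplies 64 relations, while \({R_{ik}} = 0\) supplies 16. For comparison, a symmetric connection has only \(4 \cdot 10 = 40\) components and a symmetric metric only 10.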

The three equations (134) and

$$ \begin{array}{*{20}{c}} {\frac{{\partial {{\hat g}^{ik}}}}{{\partial {x^k}}} - \frac{{\partial {{\hat g}^{ki}}}}{{\partial {x^k}}} = 0,\;\;\;\;\;}&{{R_{ik}} = 0,} \end{array} $$

were the result of the variation. In order to be able to interpret the symmetric part of \({g_{ik}}\) as metrical tensor and its anti(skew)-symmetric part as the electromagnetic field tensor, Einstein put \({\phi _k} = 0\), i.e., overdetermined his system of partial differential equations. However, he cautioned:

“However, for later investigations (e.g., the problem of the electron) it is to be kept in mind that the HAMILTONian principle does not provide an argument for putting \({\phi _k}\) equal to zero.”Footnote 138

In comparing Equation (134) with \({\phi _k} = 0\) and Equation (47), we note that the expression does not seem to correspond to a covariant derivative due to the + sign where a − sign is required. But this must be due either to a calculational error or to a printer’s typo: in a paper following Einstein’s by six months, J. M. Thomas showed that Einstein’s “new equations can be obtained by direct generalisation of the equations of the gravitational field previously given by him. The process of generalisation consists in abandoning assumptions of symmetry and in adopting a definition of covariant differentiation which is not the usual one, but which reduces to the usual one in case the connection is symmetric.” ([346], p. 187), and wrote Einstein’s Equation (134) in the form

$$ \begin{array}{*{20}{c}} {{g_{ik/l}} = {g_{ik}}{\phi _l} + {g_{il}}{\phi _k},\;\;\;\;\;}&{{\phi _l} = - \frac{2}{{n - 1}}{\Omega _{rl}}^r,} \end{array} $$

with Ω being the skew-symmetric part of the asymmetric connection \({H_{ij}}^k = \Gamma _{(ij)}^{\;\;\;k} + \Omega _{[ij]}^{\;\;\;\;k}\), and gij being the symmetric part of the asymmetric metric \({h_{ij}} = {g_{(ij)}} + {\omega _{[ij]}}\). The two covariant derivatives introduced by J. M. Thomas are \({g_{ij,k}} = \tfrac{{\partial {g_{ij}}}}{{\partial {x^k}}} - {g_{rj}}{\Gamma _{ik}}^r - {g_{ir}}{\Gamma _{jk}}^r\) and \({h_{ij/k}} = \tfrac{{\partial {h_{ij}}}}{{\partial {x^k}}} - {h_{rj}}{H_{ik}}^r - {h_{ir}}{H_{jk}}^r\). J. M. Thomas then could reformulate Equation (137) in the form

$$ {g_{ij/l}} = {g_{[ri]}}{\Omega _{jl}}^r + {g_{[ir]}}{\Omega _{lj}}^r, $$

and derive the result

$$ {g_{ij,l}} + {g_{jl,i}} + {g_{li,j}} = 0 $$

(see [346], p. 189).

After having shown that his new theory contains the vacuum field equations of general relativity for vanishing electromagnetic field, Einstein then proved that, in a first-order approximation, Maxwell’s field equations result cum grano salis: Instead of \({F_{ik,l}} + {F_{li,k}} + {F_{kl,i}} = 0\) he only obtained \(\Sigma \tfrac{\partial }{{\partial {x^l}}}({F_{ik,l}} + {F_{li,k}} + {F_{kl,i}}) = 0\).

This was commented on in a paper by Eisenhart who showed “more particularly what kind of linear connection Einstein has employed” and who obtained “in tensor form the equations which in this theory should replace Maxwell’s equations.” He then pointed to some difficulty in Einstein’s theory: When identification of the components of the antisymmetric part φij of the metric \({a_{ij}} = {g_{ij}} + {\phi _{ij}}\) with the electromagnetic field is made in first order,

“they are not the components of the curl of a vector as in the classical theory, unless an additional condition is added.” ([120], p. 129)

Toward the end of the paper Einstein discussed time-reversal; according to him, by it the sign of the magnetic field is changed, while the sign of the electric field vector is left unchangedFootnote 139. As he wanted to obtain charge-symmetric solutions from his equations, Einstein now proposed to change the roles of the magnetic fields and the electric fields in the electromagnetic field tensor. In fact, the substitutions \(\mathbf{\tilde E} \to \mathbf{\tilde B}\) and \({\mathbf{\tilde B}} \to - {\mathbf{\tilde E}}\) leave invariant Maxwell’s vacuum field equations (duality transformations)Footnote 140. Already Pauli had pointed to time-reflection symmetry in relation with the problem of having elementary particles with charge ±e and unequal mass ([246], p. 774).
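The invariance under the duality substitution can be verified line by line. A sketch, assuming Maxwell’s vacuum equations in units with c = 1 and dropping the tildes used in the text:

$$ \begin{array}{*{20}{c}} {\nabla \cdot \mathbf{E} = 0,\;\;\;\;\;}&{\nabla \times \mathbf{E} = - \frac{{\partial \mathbf{B}}}{{\partial t}},} \\ {\nabla \cdot \mathbf{B} = 0,\;\;\;\;\;}&{\nabla \times \mathbf{B} = \frac{{\partial \mathbf{E}}}{{\partial t}}.} \end{array} $$

Replacing \(\mathbf{E} \to \mathbf{B}\) and \(\mathbf{B} \to - \mathbf{E}\) turns the first row into \(\nabla \cdot \mathbf{B} = 0\) and \(\nabla \times \mathbf{B} = \partial \mathbf{E}/\partial t\), and the second row into \( - \nabla \cdot \mathbf{E} = 0\) and \( - \nabla \times \mathbf{E} = \partial \mathbf{B}/\partial t\). The two rows are merely exchanged, so the set of equations is mapped onto itself.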

At first, Einstein seems to have been proud about his new version of unified field theory; he wrote to Besso on 28 July 1925 that he would have liked to present him “orally, the egg laid recently, but now I do it in writing”, and then explained the independence of metric and connection in his mixed geometry. He went on to say:

“If the assumption of symmetryFootnote 141 is dropped, the laws of gravitation and Maxwell’s field laws for empty space are obtained in first approximation; the antisymmetric part of ĝik is the electromagnetic field. This is surely a magnificent possibility which likely corresponds to reality. The question now is whether this field theory is consistent with the existence of quanta and atoms. In the macroscopic realm, I do not doubt its correctness.”Footnote 142 ([99], p. 209) We have noted before that a similar suggestion within a theory with a geometry built from an asymmetric metric had been made, in 1917, by Bach alias Förster.

Yet, in the end, this novel approach did not convince Einstein either. Soon after the publication discussed, he found his argument concerning charge-symmetric solutions not to be helpful. The link between the occurrence of solutions with both signs of the charge and the time-symmetry of the field equations induced him to doubt, if only for a moment, whether the endeavour of unifying electricity and gravitation made sense at all:

“To me, the insight seems to be important that an explanation of the dissimilarity of the two electricities is possible only if time is given a preferred direction, and if this is taken into account in the definition of the decisive physical quantities. In this, electrodynamics is basically different from gravitation; therefore, the endeavour to melt electrodynamics with the law of gravitation into one unity, to me no longer seems to be justified.”Footnote 143 [79]

In a paper dealing with the field equations

$$ {R_{ik}} - \frac{R}{4}{g_{ik}} = - \kappa {T_{ik}}, $$

which had been discussed earlier by Einstein [70], and to which he came back now after RainichFootnote 144’s insightful paper into the algebraic properties of both the curvature tensor and the electromagnetic field tensor ([263, 264, 265, 266]), Einstein indicated that he had lost hope in the extension of Eddington’s affine theory:

“That the equations (140) have received only little attention is due to two circumstances. First, the attempts of all of us were directed to arrive, along the path taken by Weyl and Eddington or a similar one, at a theory melting into a formal unity the gravitational and electromagnetic fields; but by lasting failure I now have laboured to convince myself that truth cannot be approached along this path.”Footnote 145 (Einstein’s italics; [80], p. 100)

The new field equation was picked up by N. R. Sen of Calcutta, who calculated “the energy of an electric particle” according to it [323].

In the same spirit as the one of his paper, Einstein said goodbye to his theory in a letter to Besso on Christmas 1925 in words similar to those in his letter in June:

“Regrettably, I had to throw away my work in the spirit of Eddington. Anyway, I now am convinced that, unfortunately, nothing can be made with the complex of ideas by Weyl-Eddington. The equations

$$ {R_{ik}} - \frac{1}{4}R\;{g_{ik}} = - \kappa {T_{ik}}\;\;\;\;\;{\rm{electromagnetic}} $$

I take as the best we have nowadays. They are 9 equations for the 14 variables gik and γik. New calculations seem to show that these equations yield the motion of the electrons. But it appears doubtful whether there is room in them for the quanta.”Footnote 146 ([99], p. 216)

According to the commenting note by Tonnelat, the 14 variables are given by the 10 components of the symmetric part g(ik)Footnote 147 of the metric and the 4 components of the electromagnetic vector potential “the rotation of which are formed by the γ[ik]”Footnote 148.
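
The count of 9 equations can be made plausible by a short calculation, given here as a sketch for four space-time dimensions, where \({g^{ik}}{g_{ik}} = 4\). Contracting the left-hand side of the field equations with \({g^{ik}}\) gives

$$ {g^{ik}}\left( {{R_{ik}} - \frac{1}{4}R\,{g_{ik}}} \right) = R - \frac{1}{4} \cdot 4R = 0, $$

so the trace of the equations is satisfied identically, provided that \({g^{ik}}{T_{ik}} = 0\), as is the case for the energy-momentum tensor of the Maxwell field. Of the 10 components of a symmetric tensor equation, only 10 − 1 = 9 are therefore independent.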

But even “the best we have nowadays” did not satisfy Einstein; half a year later, he expressed his opinion in a letter to Besso:

“Also, the equation put forward by myselfFootnote 149,

$$ {R_{ik}} = {g_{ik}}{f_{lm}}{f^{lm}} - \frac{1}{2}{f_{il}}{f_{km}}{g^{lm}} $$

gives me little satisfaction. It does not allow for electrical masses free from singularities. Moreover, I cannot bring myself to gluing together two items (as the l.h.s. and the r.h.s. of an equation) which from a logical-mathematical point of view have nothing to do with each other.” Footnote 150 ([99], p. 230)

6.2 Further work on (metric-) affine and mixed geometry

Research on affine geometry as a frame for unified field theory was also carried on by mathematicians of the Princeton school. Thus J. M. Thomas, after having given a review of Weyl’s, Einstein’s, and Schouten’s approaches, said about his own work:

“I show in the present paper that his [Einstein’s] new equations can be obtained by a direct generalisation of the equations of the gravitational field previously given by him [gij;k=0; Rij=0]. […] In the final section I show that the adoption of the ordinary definition of covariant differentiation leads to a geometry which includes as a special case that proposed by Weyl as a basis for the electric theory; further that the asymmetric connection for this special case is of the type adopted by Schouten for the geometry at the basis of his electric theory.” ([346], p. 187)

We met J. M. Thomas’ paper before in Section 6.1.

During the period considered here, a few physicists followed the path of Eddington and Einstein. One who had absorbed Eddington’s and Einstein’s theories a bit later was InfeldFootnote 151 of WarsawFootnote 152. In January 1928, he followed Einstein by using an asymmetric metric the symmetric part γik of which stood for the gravitational potential, the skew-symmetric part φik for the electromagnetic field. However, he set the non-metricity tensor (of the symmetric part γ of the metric) \({Q_{ij}}^k = 0\), and assumed for the skew-symmetric part φ,

$$ {\nabla _l}{\phi _{ij}} = {J_{ijl}}, $$

with an arbitrary tensor Jijl. The electric current vector then is defined by \({J^i} = {J^{il}}_l\) where the indices, as I assume, are moved with γik. In a weak-field approximation for the metric, Infeld’s connection turned out to be \({L_{ik}}^l = \{ \,_{ik}^l\,\} + \tfrac{1}{2}({\phi _{i,}}^l_k + {\phi _{k,}}^l_i + {\delta ^{ls}}{\phi _{ik,s}})\). For field equations Infeld postulated the (generalised) Einstein field equations in empty space, Kij=0. He showed that, in first approximation, he got what is wanted, i.e., Einstein’s and Maxwell’s equations [166].

Three months later, Infeld published a note in Comptes Rendus of the Parisian Academy in which he now presented the exact connection as

$$ {L_{ik}}^l = \{ \,_{ik}^l\,\} + \frac{\alpha }{2}({\phi _{i,}}^l_k + {\phi _{k,}}^l_i + {g^{ls}}{\phi _{ik,s}}), $$

where α is “an extremely small numerical factor”. By neglecting terms \(\sim {\alpha ^2}\) he could gain both Einstein’s field equation in empty space (94) and Maxwell’s equation, if the electric current vector is identified with \({\alpha ^{ - 1}}({L_{il}}^l - {L_{li}}^l)\). Thus, he is back at vector torsion treated before by Schouten [298].

The Japanese physicist Hattori embarked on a metric-affine geometry derived purely from an asymmetric metrical tensor \({h_{ik}} = {g_{(ik)}} + {f_{[ik]}}\). He defined an affine connection

$$ {L_{ik}}^j = \{ \,_{ik}^j\,\} + {g^{jl}}({f_{li,k}} + {f_{li,k}} - {f_{ik,l}}), $$

where \({g^{il}}{g_{lk}} = {\delta ^i}_k\), and the Christoffel symbol is formed from g. The electromagnetic field was not identified with fik by Hattori, but with the skew-symmetric part of the (generalised) Ricci tensor formed from \({L_{ik}}^j\). By introducing the tensor \({f_{ijk}}: = \tfrac{{\partial {f_{ij}}}}{{\partial {x^k}}} + \tfrac{{\partial {f_{jk}}}}{{\partial {x^i}}} + \tfrac{{\partial {f_{ki}}}}{{\partial {x^j}}}\), he could write the (generalised) Ricci tensor as

$$ {K_{ik}} = {R_{ik}} + \frac{1}{4}{f_{im}}^n{f_{kn}}^m - {\nabla _l}{f_{ik}}^l, $$

where the covariant derivative ∇ is formed with the Levi-Civita connection of gij. The electromagnetic field tensor Fik now is introduced through a tensor potential by \({F_{ik}}: = {\nabla _l}{f_{ik}}^l\) and leads to half of “Maxwell’s” equations. In the sequel, Hattori started from a Lagrangian \( {\mathcal L} = ({g^{ik}} + {\alpha ^2}{F^{ik}}){K_{ik}} \) with the constant \({\alpha ^2}\) and varied, alternatively, with respect to gij and fij. He could write the field equations in the form of Einstein’s, with the energy-momentum tensor of the electromagnetic field Fik and a “matter” tensor Mik on the r.h.s., Mik being a complicated, purely geometrical quantity depending on Kik, K, fikl, and Fikl. Fikl is formed from Fik as fikl from fik. From the variation with regard to fik, in addition to Maxwell’s equation, a further field equation resulted, which could be brought into the form

$$ {F^{ik}} = \frac{2}{3}{\nabla _l}{F^{ikl}}, $$

i.e., \({f_{ikl}} \sim {F_{ikl}}\). Hattori’s conclusion was:

“The preceding equation shows that electrical charge and electrical current are distributed wherever an electromagnetic field exists.”Footnote 153

Thus, the same problem obtained as in Einstein’s theory: A field without electric current or charge density could not exist [155]Footnote 154.

Infeld quickly reacted to Hattori’s paper by noting that Hattori’s voluminous calculations could be simplified by use of Schouten’s Equation (39) of Section 2.1.2. As in Hattori’s theory two connections are used, Infeld criticised that Hattori had not explained what his fundamental geometry should be: Riemannian or non-Riemannian? He then gave another example for a theory allowing the identification of the electromagnetic field tensor with the antisymmetric part of the Ricci tensor: He displayed again the well-known connection with vector torsion used by Schouten [298] without referring to Schouten’s paper [165]. He also claimed that Hattori’s Equation (145) is the same as the one that had been deduced from Eddington’s theory by Einstein in the Appendix to the German translation of Eddington’s book ([60], p. 367). All in all, Infeld’s critique tended to deny that Hattori’s theory was more general than Einstein’s, and to point out

“that the problem of generalising the theory of relativity cannot be solved along a purely formal way. At first, one does not see how a choice can be made among the various non-Riemannian geometries providing us with the gravitational and Maxwell’s equations. The proper world geometry which ought to lead to a unified theory of gravitation and electricity can only be found by an investigation of its physical content.”Footnote 155 ([165], p. 811)

Infeld could as well have applied this admonishment to his own unified field theory discussed above. Perhaps, he became irritated by comparing his expression for the connection (142) with Hattori’s (145).

In June 1931, von Laue submitted a paper of the Genoese mathematical physicist Paolo Straneo to the Berlin Academy [331]. In it Straneo took note of Einstein’s teleparallel geometry, but decided to take another route within mixed geometry; he started with a symmetric metric and the asymmetric connection

$$ {L_{ik}}^j = \{ \,_{ik}^j\,\} + 2{\delta ^j}_i{\psi _k} $$

with both non-vanishing curvature tensor \({K^i}_{jkl} = {R^i}_{jkl} + 2\delta _j^i(\tfrac{{\partial {\psi _l}}}{{\partial {x^k}}} - \tfrac{{\partial {\psi _k}}}{{\partial {x^l}}})\) and torsion \({S_{ik}}^j = 2{\delta ^j}_{[i}{\psi _{k]}}\). Thus, Straneo suggested a unified field theory with only vector torsion, as Schouten had done 8 years earlier ([298, 142]), without referring to him. The field equations Straneo wrote down, i.e.,

$$ {K_{ik}} - \frac{1}{2}K{g_{ik}} = - \kappa {T_{ik}} + {\Psi _{ik}}, $$

where Tik is the symmetric and Ψik the antisymmetric part of the l.h.s., do not fulfill Einstein’s conception of unification: Straneo kept the energy-momentum tensor of matter as an extraneous object (including the electromagnetic field) as well as the electric current vector. The antisymmetric part of (147) just is \({\Psi _{ik}} = (\tfrac{{\partial {\psi _i}}}{{\partial {x^k}}} - \tfrac{{\partial {\psi _k}}}{{\partial {x^i}}})\); thus Ψik is identified with the electromagnetic field tensor, and the electric current vector Ji defined by \({\Psi ^{il}}_l = {J^i}\). Straneo wrote further papers on the subject [332, 333].
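
How the curl of \({\psi _k}\) enters can be seen from the connection itself. The following is a sketch only: the contraction convention \({K_{jl}}: = {K^i}_{jil}\) and the use of brackets with a factor 1/2 are assumptions made here, and Straneo’s own normalisation and sign may differ. The torsion is the antisymmetric part of the connection,

$$ {S_{ik}}^j = {L_{[ik]}}^j = \frac{1}{2}\left( {2{\delta ^j}_i{\psi _k} - 2{\delta ^j}_k{\psi _i}} \right) = 2{\delta ^j}_{[i}{\psi _{k]}}, $$

since the Christoffel symbol is symmetric. Contracting the curvature tensor quoted above yields

$$ {K_{jl}} = {R_{jl}} + 2\left( {\frac{{\partial {\psi _l}}}{{\partial {x^j}}} - \frac{{\partial {\psi _j}}}{{\partial {x^l}}}} \right). $$

As the Ricci tensor \({R_{jl}}\) of the Levi-Civita connection is symmetric, the antisymmetric part of \({K_{jl}}\) is, up to a constant factor, the curl of \({\psi _k}\), which is why it can play the role of an electromagnetic field tensor derived from a vector potential.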

By a remark of Straneo, that auto-parallels and geodesics have to be distinguished in an affine geometry, the Indian mathematician KosambiFootnote 156 felt motivated to approach affine geometry from the system of curves solving \({\ddot x^i} + {\alpha ^i}(x,\dot x,t) = 0\) with an arbitrary parameter t. He then defined two covariant “vector-derivations” along an arbitrary curve and arrived at an (asymmetric) affine connection. By this, he claimed to have made superfluous the five-vectors of Einstein and MayerFootnote 157 [107]. This must be read in the sense that he could obtain the Einstein-Mayer equations from his formalism without introducing a connecting quantity leading from the space of 5-vectors to space-time [195].

Einstein, in his papers, did not comment on the missing metric compatibility in his theory and its physical meaning. Owing to this complication (for example, even a condition of metric compatibility would not have the physical meaning of the conservation of the norm of vectors, or of the angle between them, under parallel transport) and to the further difficulty that much of the formalism was very clumsy to manipulate, essential work along this line was done only much later, in the 1940s and 1950s (Einstein, Einstein and Strauss, Schrödinger, Lichnerowicz, Hlavaty, Tonnelat, and many others). In this work a generalisation of the equation for metric compatibility, i.e., Equation (47), will play a central role. The continuation of this research line will be presented in Part II of this article.

6.3 Kaluza’s idea taken up again

6.3.1 Kaluza: Act I

Einstein became interested in Kaluza’s theory again due to O. Klein’s paper concerning a relation between “quantum theory and relativity in five dimensions” (see Klein 1926 [185], received by the journal on 28 April 1926). Einstein wrote to his friend and colleague Paul Ehrenfest on 23 August 1926: “Subject Kaluza, Schroedinger, general relativity”, and, again on 3 September 1926: “Klein’s paper is beautiful and impressive, but I find Kaluza’s principle too unnatural.” However, less than half a year later he had completely reversed his opinion:

“It appears that the union of gravitation and Maxwell’s theory is achieved in a completely satisfactory way by the five-dimensional theory (Kaluza-Klein-Fock).” (Einstein to H. A. Lorentz, 16 February 1927)

On the next day (17 February 1927), and again ten days later, Einstein was to give papers of his own in front of the Prussian Academy in which he pointed out the gauge-group, wrote down the geodesic equation, and derived exactly the Einstein-Maxwell equations, not just in first order as Kaluza had done [81, 82]. He came too late: Klein had already shown the same before [185]. Einstein himself acknowledged indirectly that his two notes in the report of the Berlin Academy did not contain any new material. In his second communication, he added a postscript:

“Mr. Mandel brings to my attention that the results reported by me here are not new. The entire content can be found in the paper by O. Klein.”Footnote 158

He then referred to the papers of Klein [185, 186] and to “Fochs Arbeit” which is a paper by FockFootnote 159 1926 [130], submitted three months later than Klein’s paper. That Klein had published another important clarifying note in Nature, in which he closed the fifth dimension, seems to have escaped EinsteinFootnote 160 [184]. Unlike in his paper with Grommer, but as in Klein’s, Einstein, in his notes, applied the “sharpened cylinder condition”, i.e., dropped the scalar field. Thus, the three of them had no chance to find out that Kaluza had made a mistake: For g55≠const., even in first approximation the new field will appear in the four-dimensional Einstein-Maxwell equations ([145], p. 5).

MandelFootnote 161 of Leningrad was not given credit by Einstein although he also had rediscovered by a different method some of O. Klein’s results [216]. In a footnote, Mandel stated that he had learned of Kaluza’s (whom he spelled “Kalusa”) paper only through Klein’s article. He started by embedding space-time as a hypersurface x5=const. into M5, and derived the field equations in space-time by assuming that the five-dimensional curvature tensor vanishes; by this procedure he obtained also a matter-energy tensor “closely linked to the second fundamental form of this hypersurface”. From the geodesics in M5 he derived the equations of motion of a charged point particle. One of the two additional terms appearing besides the Lorentz force could be removed by a weakness assumption; as to the second, Mandel opined

“that the experimental discovery of the second term appears difficult, yet perhaps not entirely impossible.” ([216], p. 145)

As to Fock’s paper, it is remarkable because it contains, in nuce, the coupling of the Schrödinger wave function ψ and the electromagnetic potential by the gauge transformation \(\psi = {\psi _0}{e^{ip/h}}\), where h is Planck’s constant and p “a new parameter with the unit of the quantum of action” [130]. In Fock’s words:

“The importance of the additional coordinate parameter p seems to lie in the fact that it causes the invariance of the equations [i.e., the relativistic wave equations] with respect to addition of an arbitrary gradient to the 4-potential.”Footnote 162 ([130], p. 228)
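
The mechanism Fock describes can be illustrated by a short calculation. This is a sketch under assumptions: the function f, the charge e, the speed of light c, the operator \({D_i}\), and all signs and numerical factors are conventions chosen here for illustration and are not taken from Fock’s paper; h is used as in the transformation quoted above. Let the potential and the parameter p change together,

$$ {A_i} \to {A_i} + \frac{{\partial f}}{{\partial {x^i}}},\;\;\;\;\;p \to p - \frac{e}{c}f,\;\;\;\;\;{\psi _0} \to {\psi _0}\,{e^{ief/(hc)}}. $$

Then the full wave function \(\psi = {\psi _0}{e^{ip/h}}\) is unchanged, and the combination \({D_i}{\psi _0}: = (\tfrac{\partial }{{\partial {x^i}}} - \tfrac{{ie}}{{hc}}{A_i}){\psi _0}\) only acquires the same phase factor,

$$ {D_i}{\psi _0} \to {e^{ief/(hc)}}\left( {\frac{{\partial {\psi _0}}}{{\partial {x^i}}} + \frac{{ie}}{{hc}}\frac{{\partial f}}{{\partial {x^i}}}{\psi _0} - \frac{{ie}}{{hc}}{A_i}{\psi _0} - \frac{{ie}}{{hc}}\frac{{\partial f}}{{\partial {x^i}}}{\psi _0}} \right) = {e^{ief/(hc)}}{D_i}{\psi _0}. $$

In this reading, a shift of the additional coordinate absorbs the gradient added to the 4-potential, which is the invariance referred to in the quotation.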

Fock derived the general relativistic wave equation and the equations of motion of a charged point particle; the latter are identified with the null geodesics of M5. Neither Mandel nor Fock used the “sharpened cylinder condition” (110).

A main motivation for Klein was to relate the fifth dimension with quantum physics. From a postulated five-dimensional wave equation

$$ \begin{array}{*{20}{c}} {{a^{ik}}\left( {\frac{{{\partial ^2}U}}{{\partial {x^i}\partial {x^k}}} - \{ \,_{\;r}^{ik}\,\} \frac{{\partial U}}{{\partial {x^r}}}} \right) = 0,\;\;\;\;\;}&{i,k = 1, \ldots ,5} \end{array} $$

and by neglecting the gravitational field, he arrived at the four-dimensional Schrödinger equation after insertion of the quantum mechanical differential operators \(- \tfrac{{ih}}{{2\pi }}\tfrac{\partial }{{\partial {x^i}}}\). It was Klein’s papers and the magical lure of a link between classical field theory and quantum theory that raised interest in Kaluza’s idea — seven years after Kaluza had sent his manuscript to Einstein. Klein acknowledged Mandel’s contribution in his second paper received on 22 October 1927, where he also gave further references on work done in the meantime, but remained silent about Einstein’s papers [189]. Likewise, Einstein did not comment on Klein’s new idea of “dimensional reduction” as it is now called and which justifies Klein’s name in the “Kaluza-Klein” theories of our time. By this, the reduction of five-dimensional equations (as e.g., the five-dimensional wave equation) to four-dimensional equations by Fourier decomposition with respect to the new 5th spacelike coordinate x5, taken as periodic with period L, is understood:

$$ \psi (x,{x^5}) = \frac{1}{{\sqrt L }}\sum\limits_n {{\psi _n}(x)\,{e^{in{x^5}/{R^5}}}} $$

with an integer n. Klein had only the lowest term in the series. The 5th dimension is assumed to be a circle, topologically, and thus gets a finite linear scale: This is at the base of what now is called “compactification”. By adding to this picture the idea of de Broglie waves, Klein brought in Planck’s constant and determined the linear scale of x5 to be unmeasurably small (\(\sim {10^{ - 30}}\)). From this arose the possibility of “forgetting” the fifth dimension, which up to now has not been observed.
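
The effect of the Fourier decomposition can be sketched in the simplest setting. The assumptions made here, none of which is taken from Klein’s papers, are: a flat five-dimensional metric with signature (−, +, +, +, +), vanishing electromagnetic field, and a radius \(\rho \) of the fifth dimension (a symbol introduced here) such that the period is \(L = 2\pi \rho \). The five-dimensional wave equation then reads

$$ \left( {{\eta ^{ik}}\frac{{{\partial ^2}}}{{\partial {x^i}\partial {x^k}}} + \frac{{{\partial ^2}}}{{\partial {{({x^5})}^2}}}} \right)\psi (x,{x^5}) = 0,\;\;\;\;\;i,k = 1, \ldots ,4. $$

Inserting a single mode \({\psi _n}(x)\,{e^{in{x^5}/\rho }}\), which is periodic in \({x^5}\) with period L for every integer n, gives

$$ \left( {{\eta ^{ik}}\frac{{{\partial ^2}}}{{\partial {x^i}\partial {x^k}}} - \frac{{{n^2}}}{{{\rho ^2}}}} \right){\psi _n}(x) = 0. $$

This is a four-dimensional Klein-Gordon equation with mass \({m_n} = \tfrac{{|n|h}}{{2\pi \rho c}}\): each Fourier mode appears in space-time as a field whose mass is fixed by the size of the fifth dimension, and the mode n = 0 is massless. For a very small radius, all modes with n ≠ 0 become very heavy, which is one way of reading the possibility of disregarding the fifth dimension.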

In his papers, Einstein took over Klein’s condition g55=1, which removed the additional scalar field admitted by the theory. It was Reichenbächer who apparently first tried to perform the projection into space-time of the most general five-dimensional metric, and without using the cylinder condition (109):

“Now, a rather laborious calculation of the five-dimensional curvature quantities in terms of a four-dimensional submanifold contained in it has shown to me also in the general case (g55≠const., dependence of the components of the f u n d a m e n t a l [tensor] of x5 is admitted) that the c h a r a c t e r i s t i c properties of the field equations are then conserved as well, i.e., they keep the form

$$ \begin{array}{*{20}{c}} {{R^{ik}} - \frac{1}{2}{g^{ik}}R = {T^{ik}},\;\;\;\;\;}&{\frac{{\partial \sqrt g {F^{ik}}}}{{\partial {x^k}}} = {s^i},} \end{array} $$

only the Tik contain further terms besides the electromagnetic energy tensor Sik, and the quantities collected in si do not vanish. […] The appearance of the new terms on the right hand sides could even be welcomed in the sense that now the field equations are obtained not only for a field point free of matter and chargeFootnote 163.” Footnote 164 ([276], p. 426) Here, in nuce, is already contained what more than a decade later Einstein and Bergmann worked out in detail [102].

It is likely that Reichenbächer had been led to this excursion into five-dimensional space, an idea which he had rejected before as unphysical, because his attempt to build a unified field theory in space-time through the ansatz for the metric \({\gamma _{ik}} = {g_{ik}} - { \epsilon ^2}{\phi _i}{\phi _k}\) with φk the electromagnetic 4-potential, had failed. Beyond incredibly complicated field equations nothing much had been gained [275]. Reichenbächer’s ansatz is well founded: As we have seen in Section 4.2, due to the violation of covariance in M5, γik transforms as a tensor under the reduced covariance group.

Even L. de Broglie became interested in Kaluza’s “bold but very beautiful theory” and rederived Klein’s results his way [46], but not without getting into a squabble with Klein, who felt misunderstood [188, 47]. He also suggested that one should not accept the cylinder condition, a suggestion looked into by Darrieus who introduced an electrical 5-potential and 5-current, and deduced Maxwell’s equations from the five-dimensional homogeneous wave equation and the five-dimensional equation of continuity [43].

In 1929 Mandel tried to “axiomatise” the five-dimensional theory: His two axioms were the cylinder condition (109) and its sharpening, Equation (110). He then weakened the second assumption by assuming that “an objective meaning does not rest in the gik proper, but only in their quotients”, an idea he ascribed to O. Klein and Einstein. He then discussed conformally invariant field equations, and tried to relate them to equations of wave mechanics [220].

Klein’s lure lasted for some years. In 1930, N. R. Sen claimed to have investigated the “Kepler-problem for the five-dimensional wave equation of Klein”. What he did was to calculate the energy levels of the hydrogen atom (as a one-particle system) with the general relativistic wave equation in space-time (148) with aik=γik, where \({\gamma _{ik}} = {g_{ik}} + {\alpha ^2}{\gamma _{55}}\) is the metric on space-time following from the 5-metric \({\gamma _{\alpha \beta }}\) by dx5=0. For gik he took the Reissner-Nordström solution and did not obtain a discrete spectrum [324]. He continued his approach by trying to solve Schrödinger’s wave equation [325]

$$ {g^{ik}}\left( {\frac{{{\partial ^2}u}}{{\partial {x^i}\partial {x^k}}} - \{ \,_{\;r}^{ik}\,\} \frac{{\partial u}}{{\partial {x^r}}}} \right) = - \frac{{4{\pi ^2}}}{{{h^2}}}m_0^2{c^2}u. $$

Presently, the different contributions of Kaluza and O. Klein are lumped together by most physicists into what is called “Kaluza-Klein theory”. An early criticism of this unhistorical attitude has been voiced in [210].

6.3.2 Kaluza: Act II

Four years later, Einstein returned to Kaluza’s idea. Perhaps, he had since absorbed Mandel’s ideas which included a projection formalism from the five-dimensional space to space-time [216, 217, 218, 219].

In a paper with his assistant Mayer, Einstein now presented Kaluza’s approach in the form of an implicit projective four-dimensional theory, although he did not mention the word “projective” [107]:

“Psychologically, the theory presented here connects to Kaluza’s well-known theory; however, it avoids extending the physical continuum to one of five dimensions.”Footnote 165

In the eyes of Einstein, by avoiding the artificial cylinder condition (109), the new method removed a serious objection to Kaluza’s theory.

Another motivation is also put forward: The linearity of Maxwell’s equations “may not correspond to reality”; thus, for strong electromagnetic fields, Einstein expected deviations from Maxwell’s equations. After a listing of all the shortcomings of Kaluza’s theory, the new approach is introduced: At every event a five-dimensional vector space V5 is affixed to space-time V4, and “mixed” tensors \({\gamma _\iota }^{\;k}\) are defined linking the tangent space of space-time V4 with a V5 such that

$$ {g_{\iota \kappa }}{\gamma ^\iota }_l{\gamma ^\kappa }_m = {g_{lm}}, $$

where glm is the metric tensor of V4, and \({g_{\iota \kappa }}\) a non-singular, symmetric tensor on V5 with ι, κ=1, … , 5, and k, l=1, … , 4Footnote 166. Indices are raised and lowered with the metrics of V5 or V4, respectively. There exists a “preferred direction of V5” defined by \({\gamma _\iota }^{\;k}{A^\iota } = 0\), and which is the normal to a “preferred plane” \({\gamma _\iota }^{\;k}{\omega _k} = 0\)Footnote 167. A consequence then is

$$ {\gamma _{\sigma k}}{\gamma ^{\tau k}} = {\gamma _\sigma }^k{\gamma _k}^\tau = {\delta _\sigma }^\tau - {A_\sigma }{A^\tau }. $$

A covariant derivative for five-vectors in V4 is defined with a “three-index-symbol” \({\Gamma ^\iota }_{\pi l}\) with two indices in V5, and one in V4 standing in for the connection coefficients:

$$ {\mathop \nabla \limits^ + _l}{X^\iota } = \frac{{\partial {X^\iota }}}{{\partial {x^l}}} + {\Gamma ^\iota }_{\pi l}{X^\pi }. $$

The covariant derivative of 4-vectors is defined as usual,

$$ {\nabla _l}{X^i} = \frac{{\partial {X^i}}}{{\partial {x^l}}} + \{ \,_{jl}^i\,\} {X^j}, $$

where \(\{ \,_{jl}^i\,\} \) is calculated from the metric of V4 as given in Equation (149). Both covariant derivatives are abbreviated by the same symbol A;k. The covariant derivative of tensors with both indices referring to V5 and those referring to V4, is formed correspondingly. In this context, Einstein and Mayer mention an extension of absolute differential calculus by “WAERDEN and BARTOLOTTI” without giving any reference to their respective papers. They may have had in mind van der Waerden’s [368] and BortolottiFootnote 168’s [24] papers. The autoparallels of V5 lead to the exact equations of motion of a charged particle, not the geodesics of V4.

Einstein and Mayer made three basic assumptions:

$$ \begin{array}{*{20}{l}} {{g_{\iota \kappa ;\;l}} = 0,}\\ {{\gamma _\iota }^{\;k}{}_{;\;l} = {A_\iota }{F^k}_l,}\\ {{F_{kl}} = - {F_{lk}},} \end{array} $$

where \({A^\iota }\) is the preferred direction and Fkl an arbitrary 2-form, later to be interpreted as the electromagnetic field tensor. From them \({A_{\sigma ;l}} = {\gamma _\sigma }^k{F_{lk}}\) follows. They also noted that a symmetric tensor Fkl could have been interpreted as the second fundamental form, and the formalism would then be the same as local isometric embedding of V4 into V5.
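
The step leading to \({A_{\sigma ;l}} = {\gamma _\sigma }^k{F_{lk}}\) can be sketched as follows. The sketch reads the second assumption with the index positions \({\gamma _\iota }^{\;k}{}_{;\;l} = {A_\iota }{F^k}_l\), which is an assumption made here for index consistency; the signs depend on this reading. First, contracting the relation \({\gamma _\sigma }^k{\gamma _k}^\tau = {\delta _\sigma }^\tau - {A_\sigma }{A^\tau }\) with \({A^\sigma }\) and using \({\gamma _\sigma }^{\;k}{A^\sigma } = 0\) shows that the preferred direction is normalised:

$$ 0 = {A^\tau }(1 - {A_\sigma }{A^\sigma })\;\;\; \Rightarrow \;\;\;{A_\sigma }{A^\sigma } = 1. $$

Differentiating this normalisation and the condition \({\gamma _\iota }^{\;k}{A^\iota } = 0\), with \({g_{\iota \kappa ;\;l}} = 0\), gives

$$ {A^\iota }{A_{\iota ;l}} = 0,\;\;\;\;\;{\gamma _\iota }^{\;k}{A^\iota }_{;l} = - {A^\iota }{A_\iota }{F^k}_l = - {F^k}_l. $$

Decomposing \({A_{\sigma ;l}}\) with the same relation then yields

$$ {A_{\sigma ;l}} = ({\gamma _\sigma }^k{\gamma _k}^\tau + {A_\sigma }{A^\tau }){A_{\tau ;l}} = - {\gamma _\sigma }^k{F_{kl}} = {\gamma _\sigma }^k{F_{lk}}, $$

where the antisymmetry of \({F_{kl}}\) has been used in the last step.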

Einstein and Mayer introduced what they called “Fünferkrümmung” (5-curvature) via the three-index symbol given above by

$$ {P^\sigma }_{\iota kl} = {\partial _k}{\Gamma _{\iota l}}^\sigma - {\partial _l}{\Gamma _{\iota k}}^\sigma + {\Gamma _{\tau k}}^\sigma {\Gamma _{\iota l}}^\tau - {\Gamma _{\tau l}}^\sigma {\Gamma _{\iota k}}^\tau . $$

It is related to the Riemannian curvature \({R^r}_{mlk}\) of V4 by

$$ {P^\sigma }_{\iota kl}\;{\gamma _{\sigma m}} = {A_\iota }({F_{mk;l}} - {F_{ml;k}}) + {\gamma _{\iota r}}({R^r}_{mlk} + {F_{mk}}{F_l}^{\;r} - {F_{ml}}{F_k}^r), $$


$$ {P^\sigma }_{\iota kl}{A_\sigma } = {\gamma _{\iota r}}(F_{k\;;\;l}^{\;r} - F_{l\;;\;k}^{\;r}). $$

From (154), by transvection with \({\gamma ^{\tau k}}\), the 5-curvature itself appears:

$$ {P^\tau }_{\iota kl} = {\gamma _{\iota r}}{A^\tau }(F_{k;l}^{\;r} - F_{l;k}^{\;r}) + {\gamma ^{\tau r}}{A_\iota }({F_{rk;l}} - {F_{rl;k}}) + {\gamma _{\iota r}}{\gamma ^{\tau s}}({R^r}_{slk} + {F_{sk}}{F_l}^{\;r} - {F_{sl}}{F_k}^{\;r}). $$

By contraction, \({P_{\iota k}}: = {\gamma _\tau }^r{P^\tau }_{\iota rk}\) and \(P: = {\gamma ^{\iota k}}{P_{\iota k}}\). Two new quantities are introduced:

  1. \({U_{\iota k}}: = {P_{\iota k}} - \tfrac{1}{4}{\gamma _{\iota k}}(P + R)\), where R is the Ricci scalar of the Riemannian curvature tensor of V4, and

  2. the tensor \({N_{klm}}: = {F_{\{ kl;m\} }}\)Footnote 169.

It turns out that \(P = R - {F_{kp}}{F^{kp}}\).

The field equations put forward in the paper by Einstein and Mayer now are

$$ \begin{array}{*{20}{c}} {{U_{\iota k}} = 0,\;\;\;\;\;}&{{N_{klm}} = 0,} \end{array} $$

and turn out to be exactly the Einstein-Maxwell vacuum field equations. Thus, by another formalism, Einstein and Mayer rederived what Klein had obtained in his first paper on Kaluza’s theory [185].

The authors’ conclusion is:

“From the theory presented here, the equations for the gravitational and the electromagnetic fields follow effortlessly by a unifying method; however, up to now, [the theory] does not bring any understanding for the way corpuscles are built, nor for the facts comprised by quantum theory.”Footnote 170 ([107], p. 19)

After this paper Einstein wrote to Ehrenfest in a letter of 17 September 1931 that this theory “in my opinion definitively solves the problem in the macroscopic domain” ([241], p. 333). Also, in a lecture given on 14 October 1931 in the Physics Institute of the University of Vienna, he still was proud of the 5-vector approach. In talking about the failed endeavours to reconcile classical field theory and quantum theory (“a cemetery of buried hopes”) he is reported to have said:

“Since 1928 I also tried to find a bridge, yet left that road again. However, following an idea half of which came from myself and half from my collaborator, Prof. Dr. Mayer, a startlingly simple construction became successful. […] According to my and Mayer’s opinion, the fifth dimension will not show up. […] according to which relationships between a hypothetical five-dimensional space and the four-dimensional can be obtained. In this way, we succeeded to recognise the gravitational and electromagnetic fields as a logical unity.”Footnote 171 [96]

In his letter to Besso of 30 October 1931, Einstein seemed intrigued by the mathematics used in his paper with Mayer, but not enthusiastic about the physical content of this projective formulation of Kaluza’s unitary field theory:

“The only result of our investigation is the unification of gravitation and electricity, whereby the equations for the latter are just Maxwell’s equations for empty space. Hence, no physical progress is made, [if at all] at most only in the sense that one can see that Maxwell’s equations are not just first approximations but appear on as good a rational foundation as the gravitational equations of empty space. Electrical and mass-density are non-existent; here, splendour ends; perhaps this already belongs to the quantum problem, which up to now is unattainable from the point of view of field [theory] (in the same way as relativity is from the point of view of quantum mechanics). The witty point is the introduction of 5-vectors \({a^\sigma }\) in four-dimensional space, which are bound to space by a linear mechanism. Let \({a^s}\) be the 4-vector belonging to \({a^\sigma }\); then such a relation \({a^s} = \gamma _\sigma ^s{a^\sigma }\) obtains. In the theory equations are meaningful which hold independently of the special relationship generated by \(\gamma _\sigma ^s\). Infinitesimal transport of \({a^\sigma }\) in four-dimensional space is defined, likewise the corresponding 5-curvature from which spring the field equations.”Footnote 172 ([99], pp. 274–25)

In his report for the Macy-Foundation, which appeared in Science on the very same day in October 1931, Einstein had to be more optimistic:

“This theory does not yet contain the conclusions of the quantum theory. It furnishes, however, clues to a natural development, from which we may anticipate further developments in this direction. In any event, the results thus far obtained represent a definite advance in knowledge of the structure of physical space.” ([94], p. 439)

Unfortunately, as in the case of his previous papers on Kaluza’s theory, Einstein came in only second: Veblen had already worked on projective geometry and projective connections for a couple of years [374, 376, 375]. One year prior to Einstein’s and Mayer’s publication, with his student HoffmannFootnote 173, he had suggested an application to physics equivalent to the Kaluza-Klein theory [381, 163]. However, according to Pauli, Veblen and Hoffmann had spoiled the advantage of projective theory:

“But these authors choose a formulation that, due to an unnecessary specialisation of the coordinate system, prefers the fifth coordinate relative to the remaining [coordinates] in much the same way as this had happened in Kaluza-Klein theory by means of the cylinder condition […].”Footnote 174 ([249], p. 307)

By using the idea that an affine (n+1)-space can be represented by a projective n-space [413], Veblen and Hoffmann avoided the five dimensions of Kaluza: There is a one-to-one correspondence between the points of space-time and a certain congruence of curves in a five-dimensional space for which the fifth coordinate is the curves’ parameter, while the coordinates of space-time are fixed. The five-dimensional space is just a mathematical device to represent the events (points) of space-time by these curves. Geometrically, the theory of Veblen and Hoffmann is more transparent and also more general than Einstein and Mayer’s: It can house the additional scalar field inherent in Kaluza’s original approach. Thus, Veblen and Hoffmann also obtained the Klein-Gordon equation in curved space, i.e., an equation with the Ricci scalar R appearing besides its mass term. Interestingly, the curvature term reads as \(\tfrac{5}{27}R\) ([381], p. 821). In his note, Hoffmann generalised the formalism such as to include Dirac’s equations (without gravitation), although some technical difficulties remained. Nevertheless, Hoffmann remained optimistic:

“There is thus a possibility that the complete system will constitute an improved unification within the relativity theory of the gravitational, electromagnetic and quantum aspects of the field.” ([163], p. 89)

In his book, Veblen emphasised

“[…] that our theory starts from a physical and geometrical point of view totally different from KALUZA’s. In particular, we do not demand a relationship between electrical charge and a fifth coordinate; our theory is strictly four-dimensional.”Footnote 175 [379]

Shortly after Einstein’s and Mayer’s paper had appeared, Schouten and van Dantzig also proved that the 5-vector formalism of this paper can be brought into a projective form [314].

In a second note, Einstein and Mayer extended the 5-vector-formalism to include Maxwell’s equations with a non-vanishing current density [109]. Of the three basic assumptions of the previous paper, the second had to be given up. The expression in the middle of Equation (153) is replaced by

$$ {\gamma ^\iota }_{k;l} = {A^\iota }{F_{kl}} + {\gamma ^{\iota r}}{V_{rlk}}, $$

where, again, \({F_{kl}} = - {F_{lk}}\), and the new \({V_{rlk}} = {V_{rkl}}\) are arbitrary tensors. The field equations were set up according to the method of the first paper; now the 5-curvature scalar was \(P = R - {F_{kp}}{F^{kp}} - {V_{rqp}}{V^{rpq}}\). It also turned out that \({V^{rpq}} = { \epsilon ^{lrpq}}{\phi _l}\) with \({\phi _l} = \tfrac{{\partial \phi }}{{\partial {x^l}}}\), i.e., that the introduction of \({V_{rpq}}\) brought only one additional variable. The electric current density became \(\sim {V^{prq}}{F_{rq}}\).

In the last paragraph, the compatibility of the equations was proven, and at the end Cartan was acknowledged:

“We note that Mr. Cartan, in a general and very illuminating investigation, has analysed more deeply the property of systems of differential equations that has been termed by us ‘compatibility’ in this paper and in previous papers.”Footnote 176 [37]

At about the same time as Einstein and Mayer wrote their second note, van Dantzig continued his work on projective geometry [361, 362, 360]. He used homogeneous coordinates \({X^\alpha }\), with α= 1, …, 5, and the invariant \({g_{\alpha \beta }}{X^\alpha }{X^\beta }\), and introduced projectors and covariant differentiation (cf. Section 2.1.3). Together with him, Schouten wrote a series of papers on projective geometry as the basis of a unified field theory [316, 317, 315, 318]Footnote 177, which, according to Pauli, combine

“all advantages of the formulations of Kaluza-Klein and Einstein-Mayer while avoiding all their disadvantages.” ([249], p. 307)

Both the Einstein-Mayer theory and Veblen and Hoffmann’s approach turned out to be subcases of the more general scheme of Schouten and van Dantzig intending

“to give a unification of general relativity not only with Maxwell’s electromagnetic theory but also with Schrödinger’s and Dirac’s theory of material waves.” ([318], p. 271)

In this paper ([318], p. 311, Figure 2), we find an early graphical representation of the parametrised set of all possible theories of a kindFootnote 178. The formalism of Schouten and van Dantzig allows for taking the additional dimension to be timelike; in their physical applications the metric of spacetime is taken as a Lorentz metric; torsion is also included in their geometry.

Pauli, with his student J. SolomonFootnote 179, generalised the work of Klein and of Einstein and Mayer by allowing for an arbitrary signature in an investigation concerning “the form that take Dirac’s equations in the unitary theory of Einstein and Mayer”Footnote 180 [253]. In a note added after proofreading, the authors indicated that they had taken note of Schouten and van Dantzig’s papers [316, 317]. The authors pointed out that

“[…] even in the absence of gravitation we must pay attention to a difference between Dirac’s equation in the theory of Einstein and Mayer, and Dirac’s equation as it is written out, usually.”Footnote 181 ([253], p. 458)

The second-order wave equation iterated from their form of Dirac’s equation contained, besides the spin term, a curvature term \(- \tfrac{1}{4}R\), with a numerical factor different from Veblen’s and Hoffmann’s. In a sequel to this publication, Pauli and Solomon corrected an error:

“We examine from a general point of view the theory of spinors in a five-dimensional space. Then we discuss the form of the energy-momentum tensor and of the current vector in the theory of Einstein-Mayer.[…] Unfortunately, it turned out that the considerations of §in the first part are marred by a calculational error… This has made it necessary to introduce a new expression for the energy-momentum tensor and […] likewise for the current vector […].”Footnote 182 ([254], p. 582)

At the California Institute of Technology, Einstein’s and Mayer’s new mathematical technique found an attentive reader as well; A. D. Michal and his co-author generalised the Einstein-Mayer 5-vector-formalism:

“The geometry considered by Einstein and Mayer in their ‘Unified field theory’ leads to the consideration of an n-dimensional Riemannian space \({V_n}\) with a metric tensor \({g_{ij}}\), to each point of which is associated an m-dimensional linear vector space \({V_m}\), (m > n), for which vector spaces a general linear connection is defined. For the general case (m − n ≠ 1) we find that the calculation of the m − n ‘exceptional directions’ is not unique, and that an additional postulate on the linear connection is necessary. Several of the new theorems give new results even for n = 4, m = 5, the Einstein-Mayer case.” [228]

Michal had come from Cartan and Schouten’s papers on group manifolds and the distant parallelisms defined on them [227]. H. P. Robertson found a new way of applying distant parallelism: He studied groups of motion admitted by such spaces, e.g., by Einstein’s and Mayer’s spherically symmetric exact solution [282] (cf. Section 6.4.3).

Cartan wrote a paper on the Einstein-Mayer theory as well ([39], an article published only posthumously) in which he showed that this could be interpreted as a five-dimensional flat geometry with torsion, in which space-time is embedded as a totally geodesic subspace.

6.4 Distant parallelism

The next geometry Einstein took as a foundation for unified field theory was a geometry with Riemannian metric, vanishing curvature, and non-vanishing torsion, named “absolute parallelism”, “distant parallelism”, “teleparallelism”, or “Fernparallelismus”. The contributions from the Levi-Civita connection and from contorsionFootnote 183 in the curvature tensor cancel. In place of the metric, tetrads are introduced as the basic variables. As in Euclidean space, in the new geometry these 4-beins can be parallelly translated to retain the same fixed directions everywhere. Thus, again, a degree of absoluteness is re-introduced into geometry in contrast to Weyl’s first attempt at unification which tried to soften the “rigidity” of Riemannian geometry.

The geometric concept of “fields of parallel vectors” had been introduced on the level of advanced textbooks by Eisenhart as early as 1925–1927 [119, 121] without use of the concept of a metric. In particular, the vanishing of the (affine) curvature tensor was given as a necessary and sufficient condition for the existence of D linearly independent fields of parallel vectors in a D-dimensional affine space ([121], p. 19).

6.4.1 Cartan and Einstein

The geometry of “Fernparallelism” is a special case of a space with Euclidean connection introduced by Cartan in 1922/23 [31, 30, 32]. Pais lets Einstein “invent” and “discover” distant parallelism, and he states that Einstein “did not know that Cartan was already aware of this geometry” ([241], pp. 344–345). However, when Einstein published his contributions in June 1928 [84, 83], Cartan had to remind him that a paper of his introducing the concept of torsion had

“appeared at the moment at which you gave your talks at the Collège de France. I even remember having tried, at Hadamard’s place, to give you the most simple example of a Riemannian space with Fernparallelismus by taking a sphere and by treating as parallels two vectors forming the same angle with the meridians going through their two origins: the corresponding geodesics are the rhumb lines.”Footnote 184 (letter of Cartan to Einstein on 8 May 1929; cf. [50], p. 4)

This remark refers to Einstein’s visit to Paris in March/April 1922. Einstein had believed that he had found the idea of distant parallelism by himself. In this regard, Pais may be correct. Every researcher knows how an idea, heard or read someplace, can subconsciously work for years and then surface all of a sudden as his or her own new idea without the slightest remembrance as to where it came from. It seems that this also happened to Einstein. It is quite understandable that he did not remember what had happened six years earlier; perhaps he had not even fully followed then what Cartan wanted to explain to him. In any case, Einstein’s motivation came from the wish to generalise Riemannian geometry such that the electromagnetic field could be geometrized:

“Therefore, the endeavour of the theoreticians is directed toward finding natural generalisations of, or supplements to, Riemannian geometry in the hope of reaching a logical building in which all physical field concepts are unified by one single viewpoint.”Footnote 185 ([84], p. 217)

In an investigation concerning spaces with simply transitive continuous groups, Eisenhart already in 1925 had found the connection for a manifold with distant parallelism given 3 years later by Einstein [118]. He also had taken up Cartan’s idea and, in 1926, produced a joint paper with Cartan on “Riemannian geometries admitting absolute parallelism” [40], and Cartan also had written about absolute parallelism in Riemannian spaces [33]. Einstein, of course, could not have been expected to react to these and other purely mathematical papers by Cartan and Schouten, focussed on group manifolds as spaces with torsion and vanishing curvature ([41, 34], pp. 50–54). No physical application had been envisaged by these two mathematicians.

Nevertheless, this story of distant parallelism raises the question of whether Einstein kept up on mathematical developments himself, or whether, at the least, he demanded of his assistants to read the mathematical literature. Against his familiarity with mathematical papers speaks the fact that he did not use the name “torsion” in his publications to be described in the following section. In the area of unified field theory including spinor theory, Einstein just loved to do the mathematics himself, irrespective of whether others had done it before — and done so even better (cf. Section 7.3).

Anyhow, in his response (Einstein to Cartan on 10 May 1929, [50], p. 10), Einstein admitted Cartan’s priority and referred also to Eisenhart’s book of 1927 and to Weitzenböck’s paper [393]. He excused himself by pointing out that Weitzenböck had likewise omitted Cartan’s papers among his 14 references. In his answer, Cartan found it curious that Weitzenböck was silent because

“[…] he indicates in his bibliography a note by Bortolotti in which he several times refers to my papers.”Footnote 186 (Cartan to Einstein on 15 May 1929; [50], p. 14)

The embarrassing situation was resolved when Einstein explained that he had submitted a comprehensive paper on the subject to Zeitschrift für Physik, and invited Cartan to add his description of the historical record in another paper (Einstein to Cartan on 10 May 1929). After Cartan had sent his historical review to Einstein on 24 May 1929, the latter answered three months later:

“I am now writing up the work for the Mathematische Annalen and should like to add yours […]. The publication should appear in the Mathematische Annalen because, at present, only the mathematical implications are explored and not their applications to physics.”Footnote 187 (letter of Einstein to Cartan on 25 August 1929 [50, 35, 89])

In his article, Cartan made it very clear that it was not Weitzenböck who had introduced the concept of distant parallelism, as valuable as his results were after the concept had become known. Also, he took Einstein’s treatment of Fernparallelism as a special case of his more general considerations. Interestingly, he permitted himself to interpret the physical meaning of geometrical structuresFootnote 188:

“Let us say simply that mechanical phenomena are of a purely affine nature whereas electromagnetic phenomena are essentially metric; therefore it is rather natural to try to represent the electromagnetic potential by a not purely affine vector.”Footnote 189 ([35], p. 703)

Einstein explained:

“In particular, I learned from Mr. Weitzenböck and Mr. Cartan that the treatment of continua of the species which is of import here, is not really new.[…] In any case, what is most important in the paper, and new in any case, is the discovery of the simplest field laws that can be imposed on a Riemannian manifold with Fernparallelismus.”Footnote 190 ([89], p. 685)

Einstein described the attraction of his theory as follows:

“For me, the great attraction of the theory presented here lies in its unity and in the allowed highly overdetermined field variables. I also could show that the field equations, in first approximation, lead to equations that correspond to the Newton-Poisson theory of gravitation and to Maxwell’s theory. Nevertheless, I still am far from being able to claim that the derived equations have a physical meaning. The reason is that I could not derive the equations of motion for the corpuscles.”Footnote 191 ([89], p. 697)

The split, in first approximation, of the tetrad field \({h_{ab}}\) according to \({h_{ab}} = {\eta _{ab}} + {\bar h_{ab}}\) led to homogeneous wave equations and divergence relations for both the symmetric and the antisymmetric part, identified as metric and electromagnetic field tensors, respectively.
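The split into a symmetric and an antisymmetric part can be made concrete with a small sketch. The Python below is only an illustration and not Einstein’s calculation: the function names `split_perturbation` and `independent_components` are hypothetical, the sample numbers are arbitrary, and the deviation \({\bar h_{ab}}\) is treated as a plain 4×4 array with both indices lowered. It shows that the 16 components of the deviation divide into 10 symmetric ones (as many as a metric perturbation has) and 6 antisymmetric ones (as many as an electromagnetic field tensor has).

```python
def split_perturbation(hbar):
    """Split a square array hbar[a][b] into symmetric and antisymmetric parts."""
    n = len(hbar)
    sym = [[0.5 * (hbar[a][b] + hbar[b][a]) for b in range(n)] for a in range(n)]
    anti = [[0.5 * (hbar[a][b] - hbar[b][a]) for b in range(n)] for a in range(n)]
    return sym, anti


def independent_components(n):
    """Count independent entries of a symmetric and of an antisymmetric n x n array."""
    return n * (n + 1) // 2, n * (n - 1) // 2


# Arbitrary sample deviation, chosen only to exercise the functions above.
sample = [[float(4 * a + b) for b in range(4)] for a in range(4)]
sym_part, anti_part = split_perturbation(sample)
print(independent_components(4))
```

The two parts add up to the original array again, so nothing is lost in the split; only the physical identification of the parts belongs to the theory.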

6.4.2 How the word spread

Einstein in 1929 really seemed to have believed that he was on a good track because, in this and the following year, he published at least 9 articles on distant parallelism and unified field theory before switching off his interest. The press did its best to spread the word: On 2 February 1929, in its column News and Views, the respected British science journal Nature reported:

“For some time it has been rumoured that Prof. Einstein has been about to publish the results of a protracted investigation into the possibility of generalising the theory of relativity so as to include the phenomena of electromagnetism. It is now announced that he has submitted to the Prussian Academy of Sciences a short paper in which the laws of gravitation and of electromagnetism are expressed in a single statement.”

Nature then went on to quote from an interview of Einstein of 26 January 1929 in a newspaper, the Daily Chronicle. According to the newspaper, among other statements Einstein made, in his wonderful language, was the following:

“Now, but only now, we know that the force which moves electrons in their ellipses about the nuclei of atoms is the same force which moves our earth in its annual course about the sun, and it is the same force which brings to us the rays of light and heat which make life possible upon this planet.” [2]

Whether Einstein used this as metaphorical language, or whether he at this time still believed that the system “nucleus and electrons” is dominated by the electromagnetic force, remains open.

The paper announced by Nature is Einstein’s “Zur einheitlichen Feldtheorie”, published by the Prussian Academy on 30 January 1929 [88]. A thousand copies of this paper had been sold within 3 days, so the presiding secretary of the Academy ordered the printing of a second thousand. Normally, only a hundred copies were printed ([183], Dokument Nr. 49, p. 136). On 4 February 1929, The Times (of London) published the translation of an article by Einstein, “written as an explanation of his thesis for readers who do not possess an expert knowledge of mathematics”. This article was then reprinted in March by the British astronomy journal The Observatory [86]. In it, Einstein first gave a historical sketch leading up to the introduction of relativity theory, and then described the method that guided him to the new theory of distant parallelism. In fact, the only formulas appearing are the line elements for two-dimensional Riemannian and Euclidean space. At the end, by one figure, Einstein tried to convey to the reader what consequence a Euclidean geometry with torsion would have, without using that name. His closing sentences areFootnote 192:

“Which are the simplest and most natural conditions to which a continuum of this kind can be subjected? The answer to this question which I have attempted to give in a new paper yields unitary field laws for gravitation and electromagnetism.” ([86], p. 118)

A few months later in that year, again in Nature, the mathematician H. T. H. Piaggio gave an exposition for the general reader of “Einstein’s and other Unitary Field Theories”. He was a bit more explicit than Einstein in his article for the educated general reader. However, he was careful to end it with a warning:

“Of course the ultimate test of the theory must be by experiment. It may succeed in predicting some interaction between gravitation and electromagnetism which can be confirmed by observation. On the other hand, it may be only a ‘graph’ and so outside the ken of the ordinary physicist.” ([258], p. 879)

The use of the concept “graph” had its origin in Eddington’s interpretation of his and other people’s unified field theories as being only graphs of the world; the true geometry remained the Riemannian geometry underlying Einstein’s general relativity.

Even the French-Belgian writer and poet Maurice Maeterlinck had heard of Einstein’s latest achievement in the area of unified field theory. In his poetic presentation of the universe “La grande féerie”Footnote 193 we find his remark:

“Einstein, in his last publications comments to which are still to appear, again brings us mathematical formulae which are applicable to both gravitation and electricity, as if these two forces seemingly governing the universe were identical and subject to the same law. If this were true it would be impossible to calculate the consequences.”Footnote 194 ([214], p. 68)

6.4.3 Einstein’s research papers

We are dealing here with Einstein’s, and Einstein and Mayer’s joint papers on distant parallelism in the reports of the Berlin Academy and Mathematische Annalen, which were taken as the starting point by other researchers following suit with further calculations. Indeed, there was a lot of work to do, only in part because Einstein, from one paper to the next, had changed his field equationsFootnote 195.

In his first note [84], dynamics was absent; Einstein made geometrical considerations his main theme: Introduction of a local “n-bein-field” \(h_{\hat \imath}^k\) at every point of a differentiable manifold and the related object \({h_{k\hat \imath}}\) defined as the collection of the “normed subdeterminants of the \(h_{\hat \imath}^k\)”Footnote 196 such that \(h_{i\hat l}h_{\hat l}^k = \delta _i^k\). As we have seen before, the components of the metric tensor are defined by

$$ {g_{lm}} = {h_{l\hat \jmath}}{h_{m\hat \jmath}}, $$

where summation over \(\hat \jmath = 1, \ldots ,n\) is assumedFootnote 197.

“Fernparallelism” now means that if the components referred to the local n-bein of a vector \({A^{\hat k}} = h_l^{\hat k}{A^l}\) at a point p, and of a vector \({B^{\hat k}}\) at a different point q are the same, i.e., \({A^{\hat k}} = {B^{\hat k}}\), then the vectors are to be considered as “parallel”. There is an underlying symmetry, called “rotational invariance” by Einstein: joint rotations of each n-bein by the same angle. All relations with a physical meaning must be “rotationally invariant”. Of course, in space-time with a Lorentz metric, the 4-bein-transformations do form the proper Lorentz group.
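The definition of the metric by the n-bein, the criterion for distant parallelism, and the “rotational invariance” can be checked numerically in a toy case. The following Python sketch is an illustration under stated assumptions, not part of Einstein’s note: it uses a two-dimensional Euclidean bein index, the function names are hypothetical, and `sphere_bein` is an arbitrarily chosen sample 2-bein on the unit sphere with coordinates (θ, φ).

```python
import math


def metric_from_bein(h):
    """Return g[l][m] = sum_j h[l][j] * h[m][j]; h[l][j] stands for h_{l j-hat}."""
    n = len(h)
    return [[sum(h[l][j] * h[m][j] for j in range(n)) for m in range(n)]
            for l in range(n)]


def rotate_bein(h, angle):
    """Rotate the two legs (hatted index) of a 2-bein by the same angle."""
    c, s = math.cos(angle), math.sin(angle)
    rot = [[c, -s], [s, c]]
    return [[sum(h[l][j] * rot[j][i] for j in range(2)) for i in range(2)]
            for l in range(2)]


def bein_components(h, vec):
    """Components A^{k-hat} = h_{l k-hat} A^l of a tangent vector in the local bein."""
    n = len(h)
    return [sum(h[l][k] * vec[l] for l in range(n)) for k in range(n)]


def are_parallel(h_p, vec_p, h_q, vec_q, tol=1e-12):
    """Distant parallelism: equal bein components at two different points."""
    a = bein_components(h_p, vec_p)
    b = bein_components(h_q, vec_q)
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def sphere_bein(theta):
    """Sample 2-bein on the unit sphere (an assumption made for illustration)."""
    return [[1.0, 0.0], [0.0, math.sin(theta)]]
```

Rotating the beins at both points by the same angle changes neither the metric computed by `metric_from_bein` nor the verdict of `are_parallel`, which is the content of the joint-rotation symmetry described above.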

If parallel transport of a tangent vector A is defined as usual by \(d{A^k} = - {\Delta _{lm}}^k{A^l}d{x^m}\), then the connection components turn out to be

$$ {\Delta _{lm}}^k = {h^{k\hat \jmath}}\frac{{\partial {h_{l\hat \jmath}}}}{{\partial {x^m}}}. $$

An immediate consequence is that the covariant derivative of each bein-vector vanishes,

$$ h_{\hat \jmath;l}^k: = h_{\hat \jmath,l}^k + {\Delta _{sl}}^kh_{\hat \jmath}^s = 0, $$

by use of Equation (161). Also, the metric is covariantly constant

$${g_{ik;l}} = 0.$$

Neither fact is mentioned in Einstein’s note. Also, no reference is given to Eisenhart’s paper of 1925 [118], in which the connection (161) had been given (Equation (3.5) on p. 248 of [118]), as noted above, its metric-compatibility shown, and the vanishing of the curvature tensor concluded.

The (Riemannian) curvature tensor calculated from Equation (161) turns out to vanish. As Einstein noted, the usual Riemannian connection \({\Gamma _{lm}}^k(g)\) may also be formed from the \({g_{ij}}\) of Equation (160). Moreover, \({Y_{lm}}^k: = {\Gamma _{lm}}^k(g) - {\Delta _{lm}}^k\) is a tensor that could be used for building invariants. In principle, distant parallelism is a particular bi-connection theory. The connection \({\Gamma _{lm}}^k(g)\) does not play a role in the following (cf., however, de DonderFootnote 198’s paper [48]).

From Equation (161), obviously the torsion tensor \({S_{lm}}^k = \tfrac{1}{2}({\Delta _{lm}}^k - {\Delta _{ml}}^k) \ne 0\) follows (cf. Equation (21)). Einstein denoted it by \({\Lambda _{lm}}^k\) and, in comparison with the curvature tensor, considered it as the “formally simplest” tensor of the theory for building invariants by help of the linear form \({\Lambda _{lj}}^jd{x^l}\) and of the scalars \({g^{ij}}{\Lambda _{im}}^l{\Lambda _{jl}}^m\) and \({g_{ij}}\;{g^{lr}}{g^{ms}}{\Lambda _{lm}}^i{\Lambda _{rs}}^j\). He indicated how a Lagrangian could be built and the 16 field equations for the field variables \({h_{l\hat \jmath}}\) obtained.
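A worked example may help to fix the index conventions. The following sketch takes up Cartan’s sphere example quoted in Section 6.4.1; the choice of coordinates \(({x^1},{x^2}) = (\theta ,\varphi )\) and of a 2-bein pointing along the meridians and the circles of latitude is an assumption made here for illustration, and the formulas used are the definitions of the metric, of the connection \({\Delta _{lm}}^k\), and of the torsion tensor given above. The non-vanishing bein components and the resulting metric are

$$ {h_{\theta \hat 1}} = 1,\;\;\;\;{h_{\varphi \hat 2}} = \sin \theta ,\;\;\;\;{h^{\theta \hat 1}} = 1,\;\;\;\;{h^{\varphi \hat 2}} = \frac{1}{{\sin \theta }},\;\;\;\;{g_{\theta \theta }} = 1,\;\;\;\;{g_{\varphi \varphi }} = {\sin ^2}\theta . $$

The only non-vanishing connection component is

$$ {\Delta _{\varphi \theta }}^\varphi = {h^{\varphi \hat 2}}\frac{{\partial {h_{\varphi \hat 2}}}}{{\partial \theta }} = \cot \theta , $$

so that the torsion does not vanish, while the metric is covariantly constant:

$$ {S_{\varphi \theta }}^\varphi = - {S_{\theta \varphi }}^\varphi = \tfrac{1}{2}\cot \theta ,\;\;\;\;\;{g_{\varphi \varphi ;\theta }} = \frac{{\partial {g_{\varphi \varphi }}}}{{\partial \theta }} - 2{\Delta _{\varphi \theta }}^\varphi {g_{\varphi \varphi }} = 2\sin \theta \cos \theta - 2\cot \theta \,{\sin ^2}\theta = 0. $$

In this example, vectors with equal bein components make the same angle with the meridians, in line with Cartan’s description; the 2-bein is not defined at the poles.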

At the end of the note Einstein compared his new approach to Weyl’s and Riemann’s:

  • WEYL: Comparison at a distance neither of lengths nor of directions;

  • RIEMANN: Comparison at a distance of lengths but not of directions;

  • Present theory: Comparison at a distance of both lengths and directions.

In his second note [83], Einstein started from the Lagrangian \({\mathcal L} = h\;{g^{ij}}{\Lambda _{im}}^l{\Lambda _{jl}}^m\), i.e., a scalar density corresponding to the first scalar invariant of his previous noteFootnote 199. He introduced \({\phi _k}: = {\Lambda _{kl}}^l\), and took the case \({\phi _k} = 0\) to describe a “purely gravitational field”. However, as he added in a footnote, pure gravitation could have been characterised by \(\tfrac{{\partial {\phi _i}}}{{\partial {x^k}}} - \tfrac{{\partial {\phi _k}}}{{\partial {x^i}}}\) as well. In his first paper on distant parallelism, Einstein did not use the names “electrical potential” or “electrical field”. He then showed that in a first-order approximation starting from \({h_{i\hat \jmath}} = {\delta _{i\hat \jmath}} + {k_{i\hat \jmath}} + \ldots\), both the Einstein vacuum field equations and Maxwell’s equations emerge. To do so he replaced \({h_{i\hat \jmath }}\) by \({g_{ij}} = {\delta _{ij}} + 2{k_{(ij)}}\) and introduced \({\phi _k}: = \tfrac{1}{2}\left( {\tfrac{{\partial {k_{kj}}}}{{\partial {x^j}}} - \tfrac{{\partial {k_{jj}}}}{{\partial {x^k}}}} \right)\). Einstein concluded that

“The separation of the gravitational and the electromagnetic field appears artificial in this theory. […] Furthermore, it is remarkable that, according to this theory, the electromagnetic field does not enter the field equations quadratically.”Footnote 200 ([83], p. 6)
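The first-order step preceding this quotation, i.e., the passage from the n-bein to the metric, follows in one line from the definition of the metric as the sum over products of bein components. As a sketch, with the first-order deviation of the n-bein from the unit matrix called \({k_{i\hat \jmath}}\) and terms quadratic in it dropped:

$$ {g_{ij}} = {h_{i\hat \jmath}}{h_{j\hat \jmath}} = ({\delta _{i\hat \jmath}} + {k_{i\hat \jmath}})({\delta _{j\hat \jmath}} + {k_{j\hat \jmath}}) = {\delta _{ij}} + {k_{ij}} + {k_{ji}} + O({k^2}) = {\delta _{ij}} + 2{k_{(ij)}} + O({k^2}). $$

Only the symmetric part of the deviation enters the metric at this order; the antisymmetric part \({k_{[ij]}}\) drops out of \({g_{ij}}\) and can appear in the field equations only through the n-bein itself.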

In a postscript, Einstein noted that he could have obtained similar results by using the second scalar invariant of his previous note, and that there was a certain indeterminacy as to the choice of the Lagrangian.

This shows clearly that the ambiguity in the choice of a Lagrangian had bothered Einstein. Thus, in his third note, he looked for a more reassuring way of deriving field equations [88]. He left aside the Hamiltonian principle and started from identities for the torsion tensor, following from the vanishing of the curvature tensorFootnote 201. He thus arrived at the identity given by Equation (29), i.e., (Einstein’s equation (3), p. 5; his convention is \({\Lambda _{\hat k\hat l}}\,^{\hat \imath}: = 2{S_{\hat k\hat l}}\,^{\hat \imath}\))

$$ 0 = 2{\nabla _{\{ \hat \jmath}}{S_{\hat k\hat l\} }}^{\hat \imath} + 4{S_{\hat m\{ \hat \jmath}}^{\hat \imath}{S_{\hat k\hat l\} }}^{\hat m}. $$

By defining \({\phi _{\hat k}}: = {\Lambda _{\hat k\hat l}}^{\;\hat l} = 2\;{S_{\hat k\hat l}}^{\;\hat l}\), and contracting equation (164), Einstein obtained another identityFootnote 202:

$$ \nabla { * _j}{\hat V_{\hat k\hat l}}^j = 0, $$

where the covariant divergence refers to the connection components \({\Delta _{\hat m\hat l}}^{\hat k}\), and the tensor density \({\hat V_{\hat k\hat l}}^{\hat \jmath}\) is given by

$$ {\hat V_{\hat k\hat l}}^{\hat \jmath }: = 2h({S_{\hat k\hat l}}^{\,\hat \jmath} + {\phi _{[\hat l}}{\delta _{\hat k]}}^{\hat \jmath}). $$

For the proof, he used the formula for the covariant vector density given in Equation (16), which, for the divergence, reduces to \({\mathop \nabla \limits^ + _i}{X^i} = \tfrac{{\partial {{\hat X}^i}}}{{\partial {x^i}}} - 2{S_j}{\hat X^j}\).

The second identity used by Einstein follows with the help of Equation (27) for vanishing curvature (Einstein’s equation (5), p. 5):

$$ {\mathop \nabla \limits^ + _{[j}}{\mathop \nabla \limits^ + _{k]}}\;{\hat A^{jk}} = - {S_{jk}}^r{\mathop \nabla \limits^ + _r}{\hat A^{jk}}. $$

As we have seen in Section 2, if he had read it, Einstein could have taken these identities from Schouten’s book of 1924 [300].

By replacing \({\hat A^{jk}}\) by \({\hat V^{\hat k\hat lj}}\) and using Equation (165), the final form of the second identity now is

$$ \nabla { * _j}({\nabla _{\hat l}}{\hat V^{\hat k\hat lj}} - 2{\hat V^{\hat klr}}{S_{lr}}^j) = 0. $$

Einstein first wrote down a preliminary set of field equations from which, in first approximation, both the gravitational vacuum field equations (in the limit \( \epsilon = 0\), cf. below) and Maxwell’s equations follow:

$$ \begin{array}{*{20}{c}} {\nabla { * _j}{{\hat U}_{\hat k\hat l}}^j = 0,\;\;\;\;\;}&{\nabla { * _r}\nabla { * _l}{{\hat V}_{\hat k}}^{lr} = 0.} \end{array} $$


Here,

$$ {\hat U_{\hat k\hat l}}^{\hat \jmath}: = {\hat V_{\hat k\hat l}}^{\hat \jmath} - 2 \epsilon h\;{\phi _{[\hat l}}\;\delta _{\hat k]}^{\hat \jmath} $$

replaces \({\hat V_{\hat k\hat l}}^{\hat \jmath}\) such that the necessary number of equations is obtained. With this first approximation as a hint, Einstein, after some manipulations, postulated the 20 exact field equations:

$$ \begin{array}{*{20}{c}} {\nabla { * _{\hat l}}{{\hat V}^{\hat k\hat lr}} - 2{{\hat V}^{\hat krs}}{S_{sr}}^{\;l} = 0,\;\;\;\;\;}&{\nabla { * _j}[h\;{\phi ^{[k;j]}}] = 0,} \end{array} $$

among which 8 identities hold.

Einstein seems to have sensed that the average reader might be able to follow his path to the postulated field equations only with difficulty. Therefore, in a postscript, he tried to clear up his motivation:

“The field equations suggested in this paper may be characterised with regard to other such possible ones in the following way. By staying close to the identity (167), it has been accomplished that not only 16, but 20 independent equations can be imposed on the 16 quantities \(h_i^{\hat k}\). By ‘independent’ we understand that none of these equations can be derived from the remaining ones, even if there exist 8 identical (differential) relations among them.”Footnote 203 ([88], p. 8)

He still was not entirely sure that the theory was physically acceptable:

“A deeper investigation of the consequences of the field equations (170) will have to show whether the Riemannian metric, together with distant parallelism, really gives an adequate representation of the physical qualities of space.”Footnote 204

In his second paper of 1929, the fourth in the series in the Berlin Academy, Einstein returned to the Hamiltonian principle because his collaborators Lanczos and MüntzFootnote 205 had doubted the validity of the field equations of his previous publication [88] on grounds of their unproven compatibility. In the meantime, however, he had found a Lagrangian such that the compatibility-problem disappeared. He restricted the many constructive possibilities for \({\mathcal L} = {\mathcal L}(h_i^{\hat k},{\partial _l}h_i^{\hat k})\) by asking for a Lagrangian containing torsion at most quadratically. His Lagrangian is a particular linear combination of the three possible scalar densities, as follows:

  1. \(\hat H = {\tfrac{1}{2}}{{\hat \jmath }_1} + {\tfrac{1}{4}}{{\hat \jmath }_2} - {{\hat \jmath }_3},\)

  2. \(\hat H* = {\tfrac{1}{2}}{\hat \jmath _1} - {\tfrac{1}{4}}{\hat \jmath _2},\)

  3. \(\hat H** = {{\hat \jmath }_3},\)

with \({{\hat \jmath }_1}: = h\;{S_{kl}}^m{S^k}{_m^l}\), \({{\hat \jmath }_2}: = h\;{S_{kl}}^m{S^{kl}}_m\), and \({{\hat \jmath }_3}: = h\;{S_j}{S^j}\). If \({ \epsilon _1}\), \({ \epsilon _2}\) are small parameters, then his final Lagrangian is \({\mathcal L} = \hat H + { \epsilon _1}\hat H* + { \epsilon _2}\hat H**\). In order to prove that Maxwell’s equations follow from his Lagrangian, Einstein had to perform the limit \(\sigma : = \tfrac{{{ \epsilon _1}}}{{{ \epsilon _2}}} \to 0\) in an expression termed \(\hat G*{\;^{\;ik}}\), which he assumed to depend homogeneously and quadratically on a linear combination of torsionFootnote 206.
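The index pattern of the three scalar densities and of the linear combination forming the Lagrangian can be illustrated by a short computation. The Python sketch below is a toy evaluation only: it assumes a flat Euclidean metric with h = 1, so that raising and lowering indices is trivial, it takes \({S_j}\) to be the contraction \({S_{jl}}^l\), and the function names as well as the sample torsion are hypothetical choices made for this illustration.

```python
from itertools import product


def torsion_invariants(S):
    """Evaluate (j1, j2, j3) for S[k][l][m] = S_{kl}^m, flat Euclidean metric, h = 1."""
    n = len(S)
    idx = range(n)
    # j1 = S_{kl}^m S^k_m^l : second and third index exchanged in the second factor
    j1 = sum(S[k][l][m] * S[k][m][l] for k, l, m in product(idx, repeat=3))
    # j2 = S_{kl}^m S^{kl}_m : sum of squares of all components
    j2 = sum(S[k][l][m] ** 2 for k, l, m in product(idx, repeat=3))
    # j3 = S_j S^j with the assumed torsion vector S_j = S_{jl}^l
    s_vec = [sum(S[j][l][l] for l in idx) for j in idx]
    j3 = sum(s * s for s in s_vec)
    return j1, j2, j3


def lagrangian(S, eps1, eps2):
    """Combine the invariants as H + eps1 * H* + eps2 * H** from the list above."""
    j1, j2, j3 = torsion_invariants(S)
    h_main = 0.5 * j1 + 0.25 * j2 - j3
    h_star = 0.5 * j1 - 0.25 * j2
    h_2star = j3
    return h_main + eps1 * h_star + eps2 * h_2star


def sample_torsion():
    """Arbitrary two-dimensional torsion with S_{12}^1 = -S_{21}^1 = 1."""
    n = 2
    S = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    S[0][1][0] = 1.0
    S[1][0][0] = -1.0
    return S
```

Calling `lagrangian(sample_torsion(), eps1, eps2)` with small values of the two parameters shows how each of them switches on one of the additional combinations; the physical limit discussed in the text is not reproduced by this toy case.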

In a Festschrift for his former teacher and colleague in Zürich, A. Stodola, Einstein summed up what he had reachedFootnote 207. He exchanged the definition of the invariants named \({{\hat \jmath}_2}\) and \({{\hat \jmath}_3}\), and stated that a choice of \(A = - B\), \(C = 0\) in the Lagrangian \(\hat \jmath = A{{\hat \jmath }_1} + B{{\hat \jmath }_2} + C{{\hat \jmath }_3}\) would give field equations

“[…] that coincide in first approximation with the known laws for the gravitational and electromagnetic field […]”Footnote 208

with the proviso that the specialisation of the constants A, B, C must be made only after the variation of the Lagrangian, not before. Also, together with Müntz, he had shown that for an uncharged mass point the Schwarzschild solution again obtained [87].

Einstein’s next publication was the note preceding Cartan’s paper in Mathematische Annalen [89]. He presented it as an introduction suited for anyone who knew general relativity. It is here that he first mentioned Equations (162) and (163). Most importantly, he gave a new set of field equations not derived from a variational principle; they areFootnote 209:

$${G^{ik}}: = S{_{\;.\;\;.}^{il\;k}{_{\left\| l \right.}}} - S_{.\;\;.}^{imn}S_{nm}^{\;\;\;k} = 0,$$
$$ {F_{ik}}: = \;\;\;\;\;S_{ik}^{..}\;^l\,_{\left\| l \right.}\;\;\;\;\; = 0, $$

where \({S^{ikl}} = {g^{im}}{g^{kn}}{S_{mn}}^l\). There exist 4 identities among the 16+6 field equations

$$ {G^{il}}_{\left\| l \right.} - {F^{il}}_{\left\| l \right.} + S_{.\;\;.}^{imn}{F_{nm}} = 0. $$

As Cartan remarked, Equation (172) expresses conservation of torsion under parallel transport:

“In fact, in the new theory of Mr. Einstein, it is natural to call a universe homogeneous if the torsion vectors that are associated to two parallel surface elements are parallel themselves; this means that parallel transport conserves torsion.”Footnote 210 ([35], p. 703)

From Equation (173), with the help of Equations (171) and (172), Einstein wrote down two more identities. One of them he had obtained from Cartan:

“But I am very grateful to you for the identity

$$ {G^{ik}}_{\left\| i \right.} - {S_{lm}}^{\;k}{G^{lm}} = 0, $$

which, astonishingly, had escaped me. […] In a new presentation in the Sitzungsberichten, I used this identity while taking the liberty of pointing to you as its source.” Footnote 211 (letter of Einstein to Cartan from 18 December 1929, Document X of [50], p. 72)

In order to show that his field equations were compatible he counted the number of equations, identities, and field quantities (in n-dimensional space) to find, in the end, \({n^2} + n\) equations for the same number of variables. To do so, he had to introduce an additional variable ψ via \({F_k} = {\phi _k} - \tfrac{{\partial \log \psi }}{{\partial {x^k}}}\). Here, \({F_k}\) is introduced by \({F_{ik}} = {\partial _k}{F_i} - {\partial _i}{F_k} = {\partial _k}{\phi _i} - {\partial _i}{\phi _k}\). Einstein then showed that \({\partial _l}{F_{ik}} + {\partial _k}{F_{li}} + {\partial _i}{F_{kl}} = 0\).
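The two statements at the end of the preceding paragraph can be checked in a few lines. The following is a sketch using only the definitions just given and the commutativity of partial derivatives; it is not a reproduction of Einstein's own calculation. First, the additional variable ψ drops out of \({F_{ik}}\) because the curl of a gradient vanishes:

$$ {\partial _k}{F_i} - {\partial _i}{F_k} = {\partial _k}{\phi _i} - {\partial _i}{\phi _k} - ({\partial _k}{\partial _i} - {\partial _i}{\partial _k})\log \psi = {\partial _k}{\phi _i} - {\partial _i}{\phi _k}. $$

Second, inserting \({F_{ik}} = {\partial _k}{\phi _i} - {\partial _i}{\phi _k}\) into the cyclic sum, the six terms cancel in pairs:

$$ {\partial _l}{F_{ik}} + {\partial _k}{F_{li}} + {\partial _i}{F_{kl}} = ({\partial _l}{\partial _k} - {\partial _k}{\partial _l}){\phi _i} + ({\partial _i}{\partial _l} - {\partial _l}{\partial _i}){\phi _k} + ({\partial _k}{\partial _i} - {\partial _i}{\partial _k}){\phi _l} = 0. $$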

The changes Einstein continuously made in his approach must have been hard on those who tried to follow him in their scientific work. One of them, ZaycoffFootnote 212, tried to make the best of them:

“Recently, A. Einstein ([89]), following investigations by E. Cartan ([35]), has considerably modified his teleparallelism theory such that former shortcomings (connected only to the physical identifications) vanish by themselves.”Footnote 213 ([433], p. 410)

In November 1929, Einstein gave two lectures at the Institut Henri Poincaré in Paris, which had been opened one year earlier in order to strengthen theoretical physics in France ([14], pp. 263–272). They were published in 1930 as the first article in the new journal of this institute [92]. In 23 pages he clearly and leisurely outlined his theory of distant parallelism and the progress he had made. As to references given, first Cartan’s name is mentioned in the text:

“It is not for the first time that such spaces are envisaged. From a purely mathematical point of view they were studied previously. M. Cartan was so amiable as to write a note for the Mathematische Annalen exposing the various phases in the formal development of these concepts.”Footnote 214 ([92], p. 4)

Note that Einstein does not say that it was Cartan who first “envisaged” these spaces. Later in the paper, he comes closer to the point:

“This type of space had been envisaged before me by mathematicians, notably by WEITZENBÖCK, EISENHART and CARTAN […].”Footnote 215 [92]

Again, he held back in his support of Cartan’s priority claim.

Some of the material in the paper overlaps with results from other publications [85, 90, 93]. The counting of independent variables, field equations, and identities is repeated from Einstein’s paper in Mathematische Annalen [89]. For n=4, there were 20 field equations (\({G^{ik}} = 0\), \({F_l} = 0\)) for 16+1 variables \(h_{\hat \imath}^k\) and ψ, four of which were arbitrary (coordinate choice). Hence 7 identities should exist, four of which Einstein had found previously. He now presented a derivation of the remaining three identities by a calculation of two pages’ length. The field equations are the same as in [89]; the proof of their compatibility takes up, in a slightly modified form, the one communicated by Einstein to Cartan in a letter of 18 December 1929 ([92], p. 20). It is reproduced also in [90].
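The bookkeeping behind the figure of 7 identities can be spelled out as a small calculation. The sketch below is illustrative only: the function name is hypothetical, and the generalisation from n=4 to arbitrary n (n² equations of the first kind, n of the second, n² tetrad components plus ψ, n arbitrary functions) is an assumption read off from the n=4 numbers quoted above, not a formula taken from Einstein's paper.

```python
def count_identities(n: int) -> int:
    """Identities required so that the field equations do not overdetermine the fields.

    Assumed bookkeeping (generalised from the n = 4 case in the text):
      - n*n equations of the first kind plus n equations of the second kind,
      - n*n tetrad components plus the single scalar psi as variables,
      - n functions left arbitrary by the choice of coordinates.
    """
    equations = n * n + n
    variables = n * n + 1
    arbitrary = n
    determined = variables - arbitrary  # functions the equations must actually fix
    return equations - determined       # surplus equations = required identities


print(count_identities(4))  # 20 equations - (17 - 4) determined variables -> 7
```

Under these assumptions the count is 2n − 1 for general n, which gives the 7 identities mentioned in the text for n=4.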

Interestingly, right after Einstein’s article in the institute’s journal, a paper of C. G. Darwin, “On the wave theory of matter”, is printed, and, in the same first volume, a report of Max Born on “Some problems in Quantum Mechanics.” Thus, French readers were kept up-to-date on progress made by both parties — whether they worked on classical field theory or quantum theory [45, 21].

A. Proca, who had attended Einstein’s lectures, gave an exposition of them in a journal of his native Romania. He was quite enthusiastic about Einstein’s new theory:

“A great step forward has been made in the pursuit of this total synthesis of phenomena which is, right or wrong, the ideal of physicists. […] the splendid effort brought about by Einstein permits us to hope that the last theoretical difficulties will be vanquished, and that we soon will compare the consequences of the theory with [our] experience, the great stepping stone of all creations of the mind.”Footnote 216 [260, 261]

Einstein’s next paper in the Berlin Academy, in which he reverts to his original notation h l , consisted of a brief critical summary of the formalism used in his previous papers, and the announcement of a serious mistake in his first note in 1930, which made invalid the derivation of the field equations for the electromagnetic field ([90], p. 18). The mistake was the assumption on the kind of dependence on torsion of the quantity \(\hat G*{\;^{\;ik}}\), which was mentioned above. Also, Einstein now found it better “to keep the concept of divergence, defined by contraction of the extension of a tensor” and not use the covariant derivative \(\nabla { * _l}\) introduced by him in his third paper in the Berlin Academy [88].

Then Einstein presented the same field equations as in his paper in Mathematische Annalen, which he demanded to be

  1. (1)

    covariant,

  2. (2)

    of second order, and

  3. (3)

    linear in the second derivatives of the field variable \(h_i^k\).

While these demands had been sufficient to uniquely lead to the gravitational field equations (with cosmological constant) of general relativity, in the teleparallelism theory a great deal of ambiguity remained. Sixteen field equations were needed which, due to covariance, induced four identities.

“Therefore equations must be postulated among which identical relations are holding. The higher the number of equations (and consequently also the number of identities among them), the more precise and stronger than mere determinism is the content; accordingly, the theory is the more valuable, if it is also consistent with the empirical facts.”Footnote 217 ([90], p. 21)

He then gave a proof of the compatibility of his field equations:

“The proof of the compatibility, as given in my paper in the Mathematische Annalen, has been somewhat simplified due to a communication which I owe to a letter of Mr. CARTAN (cf. §3, [16]).”Footnote 218

The reader had to make out for himself what Cartan’s contribution really was.

In linear approximation, i.e., for \({h_{ik}} = {\delta _{ik}} + {\bar h_{ik}}\), Einstein obtained d’Alembert’s equation for both the symmetric and the antisymmetric part of \({\bar h_{ik}}\), identified with the gravitational and the electromagnetic field, respectively.
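The split of the perturbation into a symmetric and an antisymmetric part, and the resulting count of 10 + 6 = 16 components, can be illustrated as follows. This is a minimal sketch in plain Python; the function names and the sample numbers are hypothetical and carry no physical meaning.

```python
def split_parts(h):
    """Split a square matrix (list of rows) into symmetric and antisymmetric parts."""
    n = len(h)
    sym = [[0.5 * (h[i][k] + h[k][i]) for k in range(n)] for i in range(n)]
    anti = [[0.5 * (h[i][k] - h[k][i]) for k in range(n)] for i in range(n)]
    return sym, anti


def independent_components(n):
    """Independent entries of a symmetric and of an antisymmetric n x n matrix."""
    return n * (n + 1) // 2, n * (n - 1) // 2


# Hypothetical small perturbation with arbitrary sample entries.
h_bar = [
    [0.00, 0.25, 0.00, 0.00],
    [0.75, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.50, 0.00],
    [0.00, 0.00, 0.00, 0.00],
]

sym, anti = split_parts(h_bar)
print(sym[0][1], anti[0][1])      # 0.5 -0.25
print(independent_components(4))  # (10, 6)
```

In the identification described above, the 10 symmetric components would carry the gravitational field and the 6 antisymmetric ones the electromagnetic field.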

Einstein’s next note of one and a half pages contained a mathematical result within teleparallelism theory: From any tensor with an antisymmetric pair of indices a vector with vanishing divergence can be derived [93].

In order to test the field equations by exhibiting an exact solution, a simple case would be to take a spherically symmetric, asymptotically (Minkowskian) 4-bein. This is what Einstein and Mayer did, except with the additional assumption of space-reflection symmetry [106]. Then the 4-bein contains three arbitrary functions of one parameter s:

$$ \begin{array}{*{20}{c}} {h_{\hat \imath}^\alpha = \lambda (s)\delta _{\hat \imath}^\alpha ,\;\;\;}&{h_{\hat 4}^\alpha = \tau (s){x_\alpha },\;\;\;}&{h_{\hat \imath}^4 = 0,\;\;\;}&{h_{\hat 4}^{\hat 4} = u(s),} \end{array} $$

where α, \(\hat \imath = 1,2,3\). As an exact solution of the field equations (171, 172), Einstein and Mayer obtained \(\lambda = \tau = e{s^{ - 3}}{(1 - {e^2}/{s^4})^{ - \tfrac{1}{4}}}\) and \(u = 1 + m\;\int {ds\;{s^{ - 2}}{{(1 - {e^2}/{s^4})}^{ - \tfrac{1}{4}}}} \). The constants e and m were interpreted as electric charge and “ponderomotive mass,” respectively. A further exact solution for uncharged point particles was also derived; it is static and corresponds to “two or more unconnected electrically neutral masses which can stay at rest at arbitrary distances”. Einstein and Mayer do not take this physically unacceptable situation as an argument against the theory, because the equations of motion for such singularities could not be derived from the field equations as in general relativity. Again, the continuing wish to describe elementary particles by singularity-free exact solutions is stressed.

Possibly, W. F. G. Swan of the Bartol Research Foundation in Swarthmore had this paper in mind when he, in April 1930, in a brief description of Einstein’s latest publications, told the readers of Science:

“It now appears that Einstein has succeeded in working out the consequences of his general law of gravity and electromagnetism for two special cases just as Newton succeeded in working out the consequences of his law for several special cases. […] It is hoped that the present solutions obtained by Einstein, or if not these, then others which may later evolve, will suggest some experiments by which the theory may be tested.” ([339], p. 391)

Two days before the paper by Einstein and Mayer was published by the Berlin Academy, Einstein wrote to his friend Solovine:

“My field theory is progressing well. Cartan has already worked with it. I myself work with a mathematician (S. MayerFootnote 219 from Vienna), a marvelous chap […].”Footnote 220 ([98], p. 56)

The mentioning of Cartan resulted from the intensive correspondence of both scientists between December 1929 and February 1930: About a dozen letters were exchanged which, sometimes, contained long calculations [50] (cf. Section 6.4.6). In an address given at the University of Nottingham, England, on 6 June 1930, Einstein also must have commented on the exact solutions found and on his program concerning the elementary particles. A report of this address stated about Einstein’s program:

“The problem is nearly solved; and to the first approximations he gets laws of gravitation and electro-magnetics. He does not, however, regard this as sufficient, though those laws may come out. He still wants to have the motions of ordinary particles to come out quite naturally. [The program] has been solved for what he calls the ‘quasistatical motions’, but he also wants to derive elements of matter (electrons and protons) out of the metric structure of space.” ([91], p. 610)

With his “assistant” Walther Mayer, Einstein then embarked on a very technical, systematic study of compatible field equations for distant parallelism [108]. In addition to the assumptions (1), (2), (3) for allowable field equations given above, further restrictions were made:

  1. (4)

    the field equations must contain the first derivatives of the field variable \(h_i^k\) only quadratically;

  2. (5)

    the identities for the left hand sides \({G^{ik}}\) of the field equations must be linear in \({G^{ik}}\) and contain only their first derivatives;

  3. (6)

    torsion must occur only linearly in \({G^{ik}}\).

For the field equations, the following ansatz was made:

$$ 0 = {G^{ik}} = pS_{.\;\;.\;\;\left\| l \right.}^{il\;\;k} + qS_{.\;\;.\;\;\left\| l \right.}^{kl\;\;i} + {a_1}\phi _{.\;\;.}^{i\left\| k \right.} + {a_2}\phi _{.\;\;.}^{k\left\| i \right.} + {a_3}g_{.\;\;.}^{ik}\phi _{\left\| l \right.}^l + {R^{ik}}, $$

where \({R^{ik}}\) is a collection of terms quadratic in torsion S, and \(p\), \(q\), \({a_1}\), \({a_2}\), \({a_3}\) are constants. They must be determined in such a way that the “divergence-identity”

$$ {G^{ik}}_{;i} + {G^{ki}}_{;i} + {G^{lm}}({c_1}{S_{lm}}^k + {c_2}{S_l}^{\; \cdot m}\,_k + {c_3}{S_m}^{\; \cdot l}\,_k) + {c_4}{G^{kl}}{\phi _l} + {c_5}{G^{lk}}{\phi _l} + {c_6}G_l^ \cdot \phi _ \cdot ^k + B{G^{l \cdot }}\,_{l\left\| \cdot \right.}\,^k = 0 $$

is satisfied. Here, 8 new constants \(A\), \(B\), \({c_r}\) with r=1, …, 6 to be fixed in the process also appear. After inserting Equation (175) into Equation (176), Einstein and Mayer reduced the problem to the determination of 10 constants by 20 algebraic equations by a lengthy calculation. In the end, four different types of compatible field equations for the teleparallelism theory remained:

“Two of these are (non-trivial) generalisations of the original gravitational field equations, one of them being known already as a consequence of the Hamiltonian principle. The remaining two types are denoted in the paper by […].”Footnote 221

With no further restraining principles at hand, this ambiguity in the choice of field equations must have convinced Einstein that the theory of distant parallelism could no longer be upheld as a good candidate for the unified field theory he was looking for, irrespective of the possible physical contentFootnote 222. Once again, he dropped the subject and moved on to the next. While aboard a ship back to Europe from the United States, Einstein, on 21 March 1932, wrote to Cartan:

“[…] In any case, I have now completely given up the method of distant parallelism. It seems that this structure has nothing to do with the true character of space […].” ([50], p. 209)

What Cartan might have felt, after investing the forty odd pages of his calculations printed in Debever’s book, is unknown. However, the correspondence on the subject came to an end in May 1930 with a last letter by CartanFootnote 223.

6.4.4 Reactions I: Mostly critical

About half a year after Einstein’s two papers on distant parallelism of 1928 had appeared, ReichenbachFootnote 224, who always tended to defend Einstein against criticism, classified the new theory [268] according to the lines set out in his book [267] as “having already its precisely fixed logical position in the edifice of Weyl-Eddington geometry” ([267], p. 683). He mentioned as a possible generalization an idea of Einstein’s, in which the operation of parallel transport might be taken as integrable not with regard to length but with regard to direction: “a generalisation which already has been conceived by Einstein as I learned from him” ([267], p. 687)Footnote 225.

As concerns parallelism at a distance, Reichenbach was not enthusiastic about Einstein’s new approach:

“[…] it is the aim of Einstein’s new theory to find such an entanglement between gravitation and electricity that it splits into the separate equations of the existing theory only in first approximation; in higher approximation, however, a mutual influence of both fields is brought in, which, possibly, leads to an understanding of questions unanswered up to now as [is the case] for the quantum riddle. But this aim seems to be in reach only if a direct physical interpretation of the operation of transport, even of the immediate field quantities, is given up. From the geometrical point of view, such a path [of approach] must seem very unsatisfactory; its justifications will only be reached if the mentioned link does encompass more physical facts than have been brought into it for building it up.”Footnote 226 ([267], p. 689)

A first reaction from a competing colleague came from Eddington, who, on 23 February 1929, gave a cautious but distinct review of Einstein’s first three publications on distant parallelism [84, 83, 88] in Nature. After having explained the theory and having pointed out the differences to his own affine unified field theory of 1921, he confessed:

“For my own part I cannot readily give up the affine picture, where gravitational and electric quantities supplement one another as belonging respectively to the symmetrical and antisymmetrical features of world measurement; it is difficult to imagine a neater kind of dovetailing. Perhaps one who believes that Weyl’s theory and its affine generalisation afford considerable enlightenment, may be excused for doubting whether the new theory offers sufficient inducement to make an exchange.” [62]

Weyl was the next unhappy colleague; in connection with the redefinition of his gauge idea he remarked (in April/May 1929):

“[…] my approach is radically different, because I reject distant parallelism and keep to Einstein’s general relativity. […] Various reasons hold me back from believing in parallelism at a distance. First, my mathematical intuition a priori resists to accept such an artificial geometry; I have difficulties to understand the might who has frozen into rigid togetherness the local frames in different events in their twisted positions. Two weighty physical arguments join in […] only by this loosening [of the relationship between the local frames] the existing gauge-invariance becomes intelligible. Second, the possibility to rotate the frames independently, in the different events, […] is equivalent to the symmetry of the energy-momentum tensor, or to the validity of the conservation law for angular momentum.”Footnote 227 ([407], pp. 330–332)

As usual, Pauli was less than enthusiastic; he expressed his discontent in a letter to Hermann Weyl of 26 August 1929:

“First let me emphasize that side of the matter about which I fully agree with you: Your approach for incorporating gravitation into Dirac’s theory of the spinning electron […] I am as averse with regard to Fernparallelismus as you are […] (And here I must do justice to your work in physics. When you made your theory with \({g'_{ik}} = \lambda {g_{ik}}\) this was pure mathematics and unphysical; Einstein rightly criticised and scolded you. Now the hour of revenge has come for you, now Einstein has made the blunder of distant parallelism which is nothing but mathematics unrelated to physics, now you may scold [him].)”Footnote 228 ([251], pp. 518–519)

Another confession of Pauli’s went to Paul Ehrenfest:

“By the way, I now no longer believe in one syllable of teleparallelism; Einstein seems to have been abandoned by the dear Lord.”Footnote 229 (Pauli to Ehrenfest 29 September 1929; [251], p. 524)

Pauli’s remark shows the importance of ideology in this field: As long as no empirical basis exists, beliefs, hopes, expectations, and rationally guided guesses abound. Pauli’s letter to Weyl from 1 July 1929 used non-standard language (in terms of science):

“I share completely your skeptical position with regard to Einstein’s 4-bein geometry. During the Easter holidays I have visited Einstein in Berlin and found his opinion on modern quantum theory reactionary.”Footnote 230 ([251], p. 506)

While the wealth of empirical data supporting Heisenberg’s and Schrödinger’s quantum theory would have justified the use of a word like “uninformed” or even “not up to date” for the description of Einstein’s position, use of “reactionary” meant a definite devaluation.

Einstein had sent a further exposition of his new theory to the Mathematische Annalen in August 1928. When he received its proof sheets from Einstein, Pauli had no reservations about criticising him directly and bluntly:

“I thank you so much for letting be sent to me your new paper from the Mathematische Annalen [89], which gives such a comfortable and beautiful review of the mathematical properties of a continuum with Riemannian metric and distant parallelism […]. Unlike what I told you in spring, from the point of view of quantum theory, now an argument in favour of distant parallelism can no longer be put forward […]. It just remains […] to congratulate you (or should I rather say condole you?) that you have passed over to the mathematicians. Also, I am not so naive as to believe that you would change your opinion because of whatever criticism. But I would bet with you that, at the latest after one year, you will have given up the entire distant parallelism in the same way as you have given up the affine theory earlier. And, I do not wish to provoke you to contradict me by continuing this letter, because I do not want to delay the approach of this natural end of the theory of distant parallelism.”Footnote 231 (letter to Einstein of 19 December 1929; [251], 526–527)

Einstein answered on 24 December 1929:

“Your letter is quite amusing, but your statement seems rather superficial to me. Only someone who is certain of seeing through the unity of natural forces in the right way ought to write in this way. Before the mathematical consequences have been thought through properly, it is not at all justified to make a negative judgement. […] That the system of equations established by myself forms a consequential relationship with the space structure taken, you would probably accept by a deeper study — more so because, in the meantime, the proof of the compatibility of the equations could be simplified.”Footnote 232 ([251], p. 582)

Before he had written to Einstein, Pauli, with fewer reservations, complained vis-à-vis Jordan:

“Einstein is said to have poured out, at the Berlin colloquium, horrible nonsense about new parallelism at a distance. The mere fact that his equations are not in the least similar to Maxwell’s theory is employed by him as an argument that they are somehow related to quantum theory. With such rubbish he may impress only American journalists, not even American physicists, not to speak of European physicists.”Footnote 233 (letter of 30 November 1929, [251], p. 525)

Of course, Pauli’s spells of rudeness are well known; in this particular case they might have been induced by Einstein’s unfounded hopes for eventually replacing the Schrödinger-Heisenberg-Dirac quantum mechanics by one of his unified field theories.

The question of the compatibility of the field equations played a very important role because Einstein hoped to gain, eventually, the quantum laws from the extra equations (cf. his extended correspondence on the subject with Cartan [50] and Section 6.4.6).

That Pauli had been right (except for the time span envisaged by him) was expressly admitted by Einstein when he had given up his unified field theory based on distant parallelism in 1931 (see letter of Einstein to Pauli on 22 January 1932; cf. [241], p. 347).

Born’s voice was the lonely approving one (Born to Einstein on 23 September 1929)Footnote 234:

“Your report on progress in the theory of Fernparallelism did interest me very much, particularly because the new field equations are of unique simplicity. Until now, I had been uncomfortable with the fact that, aside from the tremendously simple and transparent geometry, the field theory did look so very involved.”Footnote 235 ([154], p. 307)

Born, however, was not yet a player in unified field theory, and it turned out that Einstein’s theory of distant parallelism became as involved as the previous ones.

Einstein’s collaborator Lanczos even wrote a review article about distant parallelism with the title “The new field theory of Einstein” [201]. In it, Lanczos cautiously offers some criticism after having made enough bows before Einstein:

“To be critical with regard to the creation of a man who has long since obtained a place in eternity does not suit us and is far from us. Not as a criticism but only as an impression do we point out why the new field theory does not house the same degree of conviction, nor the amount of inner consistency and suggestive necessity in which the former theory excelled.[…] The metric is a sufficient basis for the construction of geometry, and perhaps the idea of complementing RIEMANNian geometry by distant parallelism would not occur if there were the wish to implant something new into RIEMANNian geometry in order to geometrically interpret electromagnetism.”Footnote 236 ([201], p. 126)

When Pauli reviewed this review, he started with the scathing remark:

“It is indeed a courageous deed of the editors to accept an essay on a new field theory of Einstein for the ‘Results in the Exact Sciences’ [literal translation of the journal’s title]. His never-ending gift for invention, his persistent energy in the pursuit of a fixed aim in recent years surprise us with, on the average, one such theory per year. Psychologically interesting is that the author normally considers his actual theory for a while as the ‘definite solution’. Hence, […] one could cry out: ’Einstein’s new field theory is dead. Long live Einstein’s new field theory!’”Footnote 237 ([248], p. 186)

For the remainder, Pauli engaged in a discussion with the philosophical background of Lanczos and criticised his support for Mie’s theory of matter of 1913 according to which

“the atomism of electricity and matter, fully separated from the existence of the quantum of action, is to be reduced to the properties of (singularity-free) eigen-solutions of still-to-be-found nonlinear differential equations for the field variables.”Footnote 238

Thus, Pauli lightly pushed aside as untenable one of Einstein’s repeated motivations and hoped-for tests for his unified field theories.

Lanczos, being dissatisfied with Einstein’s distant parallelism, then tried to explain “electromagnetism as a natural property of Riemannian geometry” by starting from the Lagrangian quadratic in the components of the Ricci tensor: \({\mathcal L} = {R_{ik}}{R_{lm}}{g^{il}}{g^{km}} + C{({R_{ik}}{g^{ik}})^2}\) with an arbitrary constant C. He varied \({g_{ik}}\) and \({R_{ik}}\) independently [202]. (For Lanczos see J. Stachel’s essay “Lanczos’ early contributions to relativity and his relation to Einstein” in [330], pp. 499–518.)
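To make the index structure of this Lagrangian explicit, the scalar expression can be evaluated for given arrays. The sketch below is only an illustration: the function name is hypothetical, the sample input (a Ricci tensor proportional to a flat metric of signature +,−,−,−) is an assumption chosen for easy checking, and any density factor is left out.

```python
def lanczos_lagrangian(ricci, g_inv, c):
    """Evaluate R_ik R_lm g^il g^km + C (R_ik g^ik)^2 for given component arrays.

    ricci : covariant components R_ik (list of rows)
    g_inv : contravariant metric components g^ik (list of rows)
    c     : the arbitrary constant C
    """
    rng = range(len(ricci))
    quadratic = sum(
        ricci[i][k] * ricci[l][m] * g_inv[i][l] * g_inv[k][m]
        for i in rng for k in rng for l in rng for m in rng
    )
    scalar = sum(ricci[i][k] * g_inv[i][k] for i in rng for k in rng)
    return quadratic + c * scalar ** 2


# Assumed sample data: flat metric of signature (+,-,-,-), with R_ik = g_ik.
eta = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

print(lanczos_lagrangian(eta, eta, 0))      # 4
print(lanczos_lagrangian(eta, eta, -0.25))  # 0.0
```

For this sample input the quadratic term equals 4 and the squared trace equals 16, so that the expression vanishes for C = −1/4.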

6.4.5 Reactions II: Further research on distant parallelism

The first reactions to Einstein’s papers came quickly. On 29 October 1928, de Donder suggested a generalisation by using two metric tensors, a space-time metric \({g_{ik}}\), and a bein-metric \({g_{ \star \,ik}}\), connected to the 4-bein components \({h_{l\hat \jmath }}\) by

$$ {g_{lm}} = g_\star^{\hat \jmath \hat k}{h_{l\hat \jmath}}{h_{m\hat k}}. $$

In place of Einstein’s connection (161), defined through the 4-bein only, he took:

$$ {\Delta _{lm}}^k = \frac{1}{2}{h^{k\hat \jmath}}({h_{l\hat \jmath \cdot m}} - {h_{m\hat \jmath \cdot l}}), $$

where the dot-symbol denotes covariant derivation by help of the Levi-Civita connection derived from \({g_{ \star \,ik}}\). If the Minkowski metric is used as a bein metric \({g_ \star }\), then the dot derivative reduces to partial derivation, and Einstein’s original connection is obtained [48].
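De Donder's relation between the space-time metric, the bein-metric, and the 4-bein components is a plain double contraction, which the following sketch spells out. The function name and the sample tetrads are hypothetical; the Minkowski bein-metric with signature (+,−,−,−) is an assumption made for the example.

```python
def metric_from_bein(h, g_star_inv):
    """Space-time metric g_lm = g_star^{jk} h_{lj} h_{mk}.

    h          : 4-bein components h_{l j}; first index space-time, second bein
    g_star_inv : contravariant components of the bein-metric
    """
    rng = range(len(h))
    return [
        [sum(g_star_inv[j][k] * h[l][j] * h[m][k] for j in rng for k in rng)
         for m in rng]
        for l in rng
    ]


# Assumed bein-metric: Minkowski, which is its own inverse.
minkowski = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# Trivial 4-bein: the space-time metric is then Minkowskian as well.
identity = [[1 if l == j else 0 for j in range(4)] for l in range(4)]
print(metric_from_bein(identity, minkowski) == minkowski)  # True

# A 4-bein stretched along the first axis changes only the first diagonal entry.
stretched = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(metric_from_bein(stretched, minkowski)[0][0])  # 4
```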

Another application of Einstein’s new theory came from Eugen Wigner in Berlin, whose paper showing that the tetrads in distant-parallelism theory permitted a generally covariant formulation of “Dirac’s equation for the spinning electron” was received by Zeitschrift für Physik on 29 December 1928 [419]. He did point out that “up to now, grave difficulties stood in the way of a general relativistic generalisation of Dirac’s theory” and referred to a paper of Tetrode [344]. Tetrode, about a week after Einstein’s first paper on distant parallelism had appeared on 14 June 1928, had given just such a generally relativistic formulation of Dirac’s equation through coordinate dependent Gamma-matricesFootnote 239; he also wrote down a (symmetric) energy-momentum tensor for the Dirac field and the conservation laws. However, he had kept the metric \({g_{ik}}\) introduced into the formalism by

$$ {\gamma _i}{\gamma _k} + {\gamma _k}{\gamma _i} = 2{g_{ik}} $$

to be conformally flat. For the matrix-valued 4-vector \({\gamma _i}\) he prescribed the condition of vanishing divergence. Wigner did not fully accept Tetrode’s derivations because there, implicitly and erroneously, it had been assumed that the two-dimensional representation of the Lorentz group (2-spinors) could be extended to a representation of the affine group. Wigner stated that such difficulties would disappear if Einstein’s teleparallelism theory were used. Nevertheless, nowhere did he claim that the Dirac equation could only be formulated covariantly with the help of Einstein’s new theory.
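The defining relation for the Gamma-matrices quoted above can be checked numerically in the simplest setting. The sketch below is not Tetrode's construction: it assumes the standard Dirac representation of the constant flat-space matrices with signature (+,−,−,−), and then rescales them by a constant factor (called `omega` here, a hypothetical name) as a stand-in for a conformally flat metric at one point. All helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def off_diagonal(sigma):
    """4x4 matrix [[0, sigma], [-sigma, 0]] built from a 2x2 block."""
    top = [[0, 0] + sigma[r] for r in range(2)]
    bottom = [[-x for x in sigma[r]] + [0, 0] for r in range(2)]
    return top + bottom


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

# Dirac representation (an assumption of this sketch).
gamma = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    off_diagonal(sigma_x),
    off_diagonal(sigma_y),
    off_diagonal(sigma_z),
]
eta_diag = [1, -1, -1, -1]


def satisfies_clifford(gammas, metric_diag, tol=1e-12):
    """Check gamma_i gamma_k + gamma_k gamma_i = 2 g_ik for a diagonal metric."""
    for i in range(4):
        for k in range(4):
            ab, ba = matmul(gammas[i], gammas[k]), matmul(gammas[k], gammas[i])
            g_ik = metric_diag[i] if i == k else 0
            for r in range(4):
                for c in range(4):
                    expected = 2 * g_ik if r == c else 0
                    if abs(ab[r][c] + ba[r][c] - expected) > tol:
                        return False
    return True


print(satisfies_clifford(gamma, eta_diag))  # True

# Rescaling all matrices by omega rescales the metric by omega squared.
omega = 2.0
scaled = [[[omega * x for x in row] for row in g] for g in gamma]
print(satisfies_clifford(scaled, [omega ** 2 * e for e in eta_diag]))  # True
```

The second check mirrors the conformally flat case only pointwise; a coordinate dependent factor, as in Tetrode's formulation, would in addition enter the derivative terms of the Dirac equation, which this sketch does not touch.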

Zaycoff of the Physics Institute of the University in Sofia also followed Einstein’s work closely. Half a year after Einstein’s first two notes on distant parallelism had appeared [84, 83], i.e., shortly before Christmas 1928, Zaycoff sent off his first paper on the subject, whose arrival in Berlin was acknowledged only after the holidays on 13 January 1929 [429]. In it he described the mathematical formalism of distant parallelism theory, gave the identity (42), and calculated the new curvature scalar in terms of the Ricci scalar and of torsion. He then took a more general Lagrangian than Einstein and obtained the variational derivatives in linear and, in a simple example, also in second approximation. In his presentation, he used both the teleparallel and the Levi-Civita connections. His second and third papers came quickly after Einstein’s third note of January 1929 [88], and thus had to take into account that Einstein had dropped derivation of the field equations from a variational principle. In his second paper, Zaycoff followed Einstein’s method and gave a somewhat simpler derivation of the field equations. An exact, complicated wave equation followed:

$$ {D_l}{D^l}{S_k} - {F_{kj}}{S^j} - {X_k} = 0, $$

where \({X_k}: = \tfrac{1}{2}{D_k}({V^{lmn}}{S_{lnm}}) + {S_{lkm}}{D_r}{S^{lrm}} - \tfrac{1}{2}{S_k}{V^{lmn}}{S_{lnm}} + {V^{lmn}}{S_{mn}}^r{S_{lkr}} + {S_{rkm}}{S^r}{S^m}\) with torsion \({S_{mn}}^r\) and the torsion vector \({S_k}\), and the covariant derivative \({D_l}: = {\nabla _l} - {S_l}\), \({\nabla _l}\) being the teleparallel connection (161). In linear approximation, the Einstein vacuum and the vacuum Maxwell equations are obtained, supplemented by the homogeneous wave equation for a vector field [431]. In his third note, Zaycoff criticised Einstein “for not having shown, in his most recent publication, whether his constraints on the world metric be permissible.” He then derived additional exact compatibility conditions for Einstein’s field equations to hold; according to him, their effect would show up only in second approximation [430]. In his fourth publication Zaycoff came back to Einstein’s Hamiltonian principle and rederived for himself Einstein’s results. He also defended Einstein against critical remarks by Eddington [62] and Schouten [304], although Schouten, in his paper, had mentioned neither Einstein nor his teleparallelism theory, but only gave a geometrical interpretation of the torsion vector in a geometry with semi-symmetric connection. Zaycoff praised Einstein’s teleparallelism theory in words reminding me of the creation of the world as described in Genesis:

“We may say that A. Einstein built a plane world which is no longer waste like the Euclidean space-time-world of H. Minkowski, but, on the contrary, contains in it all that we usually call physical reality.”Footnote 240 ([428], p. 724)

A conference on theoretical physics at the Ukrainian Physical-Technical Institute in Kharkov in May 1929 brought together many German and Russian physicists. Unified field theory, quantum mechanics, and the new quantum field theory were all discussed. Einstein’s former calculational assistant Grommer, now on his own in Minsk, in a brief contribution stressed Einstein’s path for getting an overdetermined system of differential equations: Vary with regard to the 16 bein-quantities but consider only the 10 metrical components as relevant. He claimed that Einstein had used only the antisymmetric part of the tensor \({P_{lm}}^k: = \Gamma {(g)_{lm}}^k - {\Delta _{lm}}^k\), where both \(\Gamma {(g)_{lm}}^k\) and \({\Delta _{lm}}^k\) were mentioned above (in Einstein’s first note), although Einstein never used \(\Gamma {(g)_{lm}}^k\). According to Grommer the anti-symmetry of P is needed, because its contraction leads to the electromagnetic 4-potential and because the symmetric part can be expressed by the antisymmetric part and the metrical tensor. He also played the true voice of his (former) master by repeating Einstein’s program of deriving the equations of motion from the overdetermined system:

“If the law of motion of elementary particles could be derived from the overdetermined field equation, one could imagine that this law of motion permit only discrete orbits, in the sense of quantum theory.”Footnote 241 ([153], p. 646)

Levi-Civita also had sent a paper on distant parallelism to Einstein, who had it appear in the reports of the Berlin Academy [207]. Levi-Civita introduced a set of four congruences of curves that intersect each other at right angles, called their tangents \(\lambda _{\hat \imath}^k\) and used Equation (160) in the form:

$$ {\delta _{lm}} = {h_{l\hat \jmath}}{h_{m\hat \jmath}}. $$

He also employed the Ricci rotation coefficients defined by \({\gamma _{\hat \imath \hat k\hat l}} = {\nabla _\sigma }h_{\hat \imath}^\beta {h_{\hat k\beta }}h_{\hat l}^\sigma\), where the hatted indices are “bein”-indices; the Greek letters denote coordinates. They obey

$$ {\gamma _{\hat \imath \hat k\hat l}} + {\gamma _{\hat k\hat \imath \hat l}} = 0. $$
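
This antisymmetry in the first two indices follows in one line from the orthonormality of the bein-vectors. As a sketch only, assume (neither point is spelled out above) that ∇ is the metric-compatible Levi-Civita derivative and that orthonormality is written as \(h_{\hat \imath}^\beta {h_{\hat k\beta }} = {\delta _{\hat \imath \hat k}}\), in line with the Euclidean form of Equation (160) used by Levi-Civita. Differentiating the orthonormality condition gives

$$ 0 = {\nabla _\sigma }\left( {h_{\hat \imath}^\beta {h_{\hat k\beta }}} \right) = \left( {{\nabla _\sigma }h_{\hat \imath}^\beta } \right){h_{\hat k\beta }} + \left( {{\nabla _\sigma }h_{\hat k}^\beta } \right){h_{\hat \imath \beta }}, $$

and contraction with \(h_{\hat l}^\sigma\) turns the first term into \({\gamma _{\hat \imath \hat k\hat l}}\) and the second into \({\gamma _{\hat k\hat \imath \hat l}}\). With an indefinite signature the same argument would go through with \({\delta _{\hat \imath \hat k}}\) replaced by the constant Minkowski metric.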

The electromagnetic field tensor Fik was introduced via

$$ {F_{ik}}\lambda _{\hat l}^i\lambda _{\hat m}^k = \lambda _{\hat s}^r\frac{\partial }{{\partial {x^r}}}{\gamma _{\hat l\hat m}}^{\hat s}. $$

Levi-Civita chose as his field equations the Einstein-Maxwell equations projected on a rigidly fixed “world-lattice” of 4-beins. He used the time until the printing was done to give a short preview of his paper in Nature [206]. About a month before Levi-Civita’s paper was issued by the Berlin Academy, Fock and Ivanenko [135] had had the same idea and compared Einstein’s notation and the one used by Levi-Civita in his monograph on the absolute differential calculus [205]:

“Einstein’s new gravitational theory is intimately linked to the known theory of the orthogonal congruences of curves due to Ricci. In order to ease a comparison between both theories, we may bring together here the notations of R i c c i and L e v i - C i v i t a […] with those of Einstein.”Footnote 242

A little after the publication of Levi-Civita’s papers, Heinrich Mandel embarked on an application of Kaluza’s five-dimensional approach to Einstein’s theory of distant parallelism [218]. Einstein had sent him the corrected proof sheets of his fourth paper [85]. The basic idea was to consider the points of M4 as equivalent to the ensemble of congruences with tangent vector \(X_5^i\) in M5 (with cylindricity condition)Footnote 243. The space-time interval is defined as the distance of two lines of the congruence on \({M_5}:d{\tau ^2} = ({\gamma _{il}} - {X_{5i}}\;{X_{5l}})(\delta _k^l - {X_5}^l{X_{5k}})d{x^i}d{x^k}\). Mandel did not identify the torsion vector with the electromagnetic 4-potential, but introduced the covariant derivative \({\Delta _k}{A^i}: = \tfrac{{\partial {A^i}}}{{\partial {x^k}}} + \{ _{kj}^i\} {A^j} + \tfrac{e}{c}{X_{5k}}{\mathcal M}_l^i{A^l}\), where the tensor \({\mathcal M}\) is skew-symmetric. We may look at this paper also as a forerunner of some sort to the Einstein-Mayer 5-vector formalism (cf. Section 6.3.2).

Before Einstein dropped the subject of distant parallelism, many more papers were written by a baker’s dozen of physicists. Some were more interested in the geometrical foundations, in exact solutions to the field equations, or in the variational principle.

One of those hunting for exact solutions was G. C. McVittie who referred to Einstein’s paper [88]:

“[…] we test whether the new equations proposed by Einstein are satisfied. It is shown that the new equations are satisfied to the first order but not exactly.”

He then went on to find a rigorous solution and obtained the metric \(d{s^2} = {e^{a{x_1}}}dx_4^2 - {e^{ - 2a{x_1}}}dx_1^2 - {e^{ - a{x_1}}}dx_2^2 - {e^{ - a{x_1}}}dx_3^2\) and the 4-potential \({\phi _4} = \tfrac{1}{{2\sqrt \pi }}{e^{\tfrac{1}{2}a{x_1}}}\) [225]. He also wrote a paper on exact axially symmetric solutions of Einstein’s teleparallelism theory [226].

Tamm and Leontowich treated the field equations given in Einstein’s fourth paper on distant parallelism [85]. They found that these field equations did not have a spherically symmetric solution corresponding to a charged point particle at restFootnote 244. The corresponding solution for the uncharged particle was the same as in general relativity, i.e., Schwarzschild’s solution. Tamm and Leontowich therefore guessed that a charged point particle at rest would lead to an axially-symmetric solution and pointed to the spin for support of this hypothesis [342, 342].

WienerFootnote 245 and VallartaFootnote 246 were after particular exact solutions of Einstein’s field equations in the teleparallelism theory. By referring to Einstein’s first two papers concerning distant parallelism, they set out to show that the

“[…] electromagnetic field is incompatible in the new Einstein theory with the assumption of static spherical symmetry and symmetry of the past and the future. […] the new Einstein theory lacks at present all experimental confirmation.”

In footnote 4, they added:

“Since writing this paper the authors have learned from Dr. H. Müntz that the new Einstein field equations of the 1929 paper do not yield the vanishing of the gravitational field in the case of spherical symmetry and time symmetry. In this case he has been able to obtain results checking the observed perihelion of mercury” ([416], p. 356)

Müntz is mentioned in [88, 85].

In his paper “On unified field theory” of January 1929, Einstein acknowledges work of a Mr. Müntz:

“I am pleased to dutifully thank Mr. Dr. H. Müntz for the laborious exact calculation of the centrally-symmetric problem based on the Hamiltonian principle; by the results of this investigation I was led to the discovery of the road following here.”Footnote 247

Again, two months later in his next paper, “Unified field theory and Hamiltonian principle”, Einstein remarks:

“Mr. Lanczos and Müntz have raised doubt about the compatibility of the field equations obtained in the previous paper […].”

and, by deriving field equations from a Lagrangian, shows that the objection can be overcome. In his paper of July 1929, the physicist Zaycoff had some details:

“Solutions of the field equations on the basis of the original formulation of unified field theory to first approximation for the spherically symmetric case were already obtained by Müntz.”

In the same paper, he states: “I did not see the papers of Lanczos and Müntz.” Even before this, in the same year, in a footnote to the paper of Wiener and Vallarta, we read:

“Since writing this paper the authors have learned from Dr. H. Müntz that the new Einstein field equations of the 1929 paper do not yield the vanishing of the gravitational field in the case of spherical symmetry and time symmetry. In this case he has been able to obtain results checking the observed perihelion of mercury.”

The latter remark refers to a constant query Pauli had about what would happen, within unified field theory, to the gravitational effects in the planetary system, described so well by general relativityFootnote 248.

Unfortunately, as noted by Meyer Salkover of the Mathematics Department in Cincinnati, the calculations by Wiener and Vallarta were erroneous; once they are corrected, the Schwarzschild metric is indeed found to be a solution of Einstein’s field equations. In the second of his two brief notes, Salkover succeeded in obtaining the most general spherically symmetric solution [288, 287]. Wiener and Vallarta admitted this in their second paper, in which they presented a new calculation:

“In a previous paper the authors of the present note have treated the case of a spherically symmetrical statical field, and stated the conclusions: first, that under Einstein’s definition of the electromagnetic potential an electromagnetic field is incompatible with the assumption of static spherical symmetry and symmetry of the past and future; second, that if one uses the Hamiltonian suggested in Einstein’s second 1928 paper, the electromagnetic potential vanishes and the gravitational field also vanishes.”

And they hasten to reassure the reader:

“None of the conclusions of the previous paper are vitiated by this investigation, although some of the final formulas are supplemented by an additional term.” ([417], p. 802)

Vallarta also wrote a paper by himself ([358], p. 784) whose abstract reads:

“In recent papers Wiener and the author have determined the tensors \(^s{h_\lambda }\) of Einstein’s unified theory of electricity and gravitation under the assumption of static spherical symmetry and of symmetry of past and future. It was there shown that the field equations suggested in Einstein’s second 1928 paper [83] lead in this case to a vanishing gravitational field. The purpose of this paper is to investigate, for the same case, the nature of the gravitational field obtained from the field equations suggested by Einstein in his first 1929 paper [88].”

He also claims

“that Wiener has shown in a paper to be published elsewhere soon that the Schwarzschild solution satisfies exactly the field equations suggested by Einstein in his second 1929 paper ([85]).”

Finally, Rosen and Vallarta [283] got together for a systematic investigation of the spherically symmetric, static field in Einstein’s unified field theory of the electromagnetic and gravitational fields [93].

Further papers on Einstein’s teleparallelism theory were written in Italy by Bortolotti in Cagliari [22, 23, 25, 24] and by Palatini [242].

In Princeton, people did not sleep either. In 1930 and 1931, T. Y. Thomas wrote a series of six papers on distant parallelism and unified field theory. He followed Einstein’s example by also changing his field equations from the first to the second publication. After that, he concentrated on more mathematical problems, such as proving an existence theorem of the Cauchy-Kowalewsky type for the equations of unified field theory, and studying the characteristics and bi-characteristics, the characteristic Cauchy problem, and Huygens’ principle. T. Y. Thomas described the contents of his first paper as follows:

“In a number of notes in the Berlin Sitzungsberichte followed by a revised account in the Mathematische Annalen, Einstein has attempted to develop a unified theory of the gravitational and electromagnetic field by introducing into the scheme of Riemannian geometry the possibility of distant parallelism. […] we are led to the construction of a system of wave equations as the equations of the combined gravitational and electromagnetic field. This system is composed of 16 equations for the determination of the 16 quantities h i and is closely analogous to the system of 10 equations for the determination of the 10 components gik in the original theory of gravitation. It is an interesting fact that the covariant components h i of the fundamental vectors, when considered as electromagnetic potential vectors, satisfy in the local coordinate system the universally recognised laws of Maxwell for the electromagnetic field in free space, as a consequence of the field equations.” [350]

This looks as if he had introduced four vector potentials for the electromagnetic field, and this, in fact, T. Y. Thomas does: “the components h i will play the role of electromagnetic potentials in the present theory.” The field equations are just the four wave equations \(\sum {{e_{\hat k}}h_{\hat \jmath,\hat k\hat k}^{\hat \imath}}\) where the summation extends over \(\hat k\), with \(\hat k\) = 1,…,4, and the comma denotes an absolute derivative he has introduced. The gravitational potentials are still gik. In his next note, T. Y. Thomas changed his field equations on the grounds that he wanted them to give a conservation law.

“This latter point of view is made the basis for the construction of a system of field equations in the present note — and the equations so obtained differ from those of note I only by the appearance of terms quadratic in the quantities \(h_{j,k}^{\hat \imath}\). It would thus appear that we can carry over the interpretation of the h i as electromagnetic potentials; doing this, we can say that Maxwell’s equations hold approximately in the local coordinate system in the presence of weak electromagnetic fields.” [351]

The third paper contains a remark as to the content of the concept “unified field theory”:

“It is the objective of the present note to deduce the general existence theorem of the Cauchy-Kowalewsky type for the system of field equations of the unified field theory. […] Einstein (Sitzber. 1930, 18–23) has pointed out that the vanishing of the invariant \(h_{j,k}^{\hat \imath }\) is the condition for the four-dimensional world to be Euclidean, or more properly, pseudo-Euclidean. From the point of view of our previous notes this fact has its interpretation in the statement that the world will be pseudo-Euclidean only in the absence of electric and magnetic forces. This means that gravitational and electromagnetic phenomena must be intimately related since the existence of gravitation becomes dependent on the electromagnetic field. Thus we secure a real physical unification of gravitation and electricity in the sense that these concepts become but different manifestations of the same fundamental entity — provided, of course, that the theory shows itself to be tenable as a theory in agreement with experience.” [352].

In his three further installments, T. Y. Thomas moved away from unified field theory to the discussion of mathematical details of the theory he had advanced [353, 354, 355].

Unhindered by constraints from physical experience, mathematicians try to play with possibilities. Thus, it was only logical that Valentin Bargmann in Berlin, after Riemann and Weyl, now engaged in looking at a geometry allowing a comparison “at a distance” of directions but not of lengths, i.e., only of the quotients of vector components, Ai/Ak [5]. In the framework of a purely affine theory he obtained a necessary and sufficient condition for this geometry,

$$ R_{jkl}^i(\Gamma ) = \frac{1}{D}{\delta ^i}_j{V_{kl}}, $$

with the homothetic curvature Vkl from Equation (31). Then Bargmann linked his approach to Einstein’s first note on distant parallelism [84, 89], introduced a D-bein \(h_i^k\), and determined his connection such that the quotients Ai/Ak of vector components with regard to the D-bein remained invariant under parallel transport. The resulting connection is given by

$$ {\Gamma _{lm}}^k = h_j^k\frac{{\partial h_l^j}}{{\partial {x^m}}} - \delta _l^k{\psi _m}, $$

where ψm corresponds to \({\Gamma _{lm}}^k\).
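
That this connection leaves the quotients of the bein components unchanged can be checked directly. The following is only a sketch under assumptions that are mine and not taken from Bargmann’s paper: parallel transport is written as \(d{A^k} = - {\Gamma _{lm}}^k{A^l}d{x^m}\), the quantities \(h_l^j\) and \(h_j^k\) are taken to be inverse to each other, \(h_k^jh_i^k = \delta _i^j\), and \({\bar A^j}: = h_l^j{A^l}\) denotes the components with regard to the D-bein. Then

$$ d{\bar A^j} = \frac{{\partial h_l^j}}{{\partial {x^m}}}{A^l}d{x^m} + h_k^jd{A^k} = \frac{{\partial h_l^j}}{{\partial {x^m}}}{A^l}d{x^m} - h_k^j\left( {h_i^k\frac{{\partial h_l^i}}{{\partial {x^m}}} - \delta _l^k{\psi _m}} \right){A^l}d{x^m} = {\psi _m}d{x^m}\,{\bar A^j}. $$

The two derivative terms cancel, and all bein components are multiplied by the same factor \(1 + {\psi _m}d{x^m}\). Their quotients are therefore preserved to first order, while the components themselves are preserved only if ψm vanishes, in which case the expression reduces to a connection of the teleparallel type.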

Schouten and van Dantzig also used a geometry built on complex numbers, and on Hermitian forms:

“[…] we were able to show that the metric geometry used by Einstein in his most recent approach to relativity theory [84, 83] coincides with the geometry of a Hermitian tensor of highest rank, which is real on the real axis and satisfies certain differential equations.” ([313], p. 319)

The Hermitian tensor referred to leads to a linear integrable connection that, in the special case that it “is real in the real”, coincides with Einstein’s teleparallel connection.

Distant parallelism was revived four decades later within the framework of Poincaré gauge theory; the corresponding theories will be treated in the second part of this review.

6.4.6 Overdetermination and compatibility of systems of differential equations

In the course of Einstein’s thinking about distant parallelism, his ideas about overdetermined systems of differential equations gradually changed. At first, the possibility of gaining hold on the paths of elementary particles — described as singular worldlines of point particles — was central. He combined this with the idea of quantisation, although Planck’s constant h could not possibly surface by such an approach. But somehow, for Einstein, discretisation and quantisation must have been too close to bother about a fundamental constant.

Then, after the richer constructive possibilities (e.g., for a Lagrangian) became obvious, a principle for finding the correct field equations was needed. As such, “overdetermination” was brought into the game by Einstein:

“The demand for the existence of an ‘overdetermined’ system of equations does provide us with the means for the discovery of the field equations”Footnote 249 ([90], p. 21)

It seems that Einstein, during his visit to Paris in November 1929, had talked to Cartan about his problem of finding the right field equations and proving their compatibility. Starting in December of 1929 and extending over the next year, an intensive correspondence on this subject was carried on by both men [50]. On 3 December 1929, Cartan sent Einstein a letter of five pages with a mathematical note of 12 pages appended. In it he referred to his theory of partial differential equations, deterministic and “in involution,” which covered the type of field equations Einstein was using and put forward a further field equation. He clarified the mathematical point of view but used concepts such as “degree of generality” and “generality index” not familiar to EinsteinFootnote 250. Cartan admittedFootnote 251:

“I was not able to completely solve the problem of determining if there are systems of 22 equations other than yours and the one I just indicated […] and it still astonishes me that you managed to find your 22 equations! There are other possibilities giving rise to richer geometrical schemes while remaining deterministic. First, one can take a system of 15 equations […]. Finally, maybe there are also solutions with 16 equations; but the study of this case leads to calculations as complicated as in the case of 22 equations, and I was not fortunate enough to come across a possible system […].” ([50], pp. 25–26)

Einstein’s rapid answer of 9 December 1929 referred to the letter only; he had not been able to study Cartan’s note. As the further correspondence shows, he had difficulties in following Cartan:

“For you have exactly that which I lack: an enviable facility in mathematics. Your explanation of the indice de généralité I have not yet fully understood, at least not the proof. I beg you to send me those of your papers from which I can properly study the theory.” ([50], p. 73)

It would be a task of its own to closely study this correspondence; in our context, it suffices to note that Cartan wrote a special noteFootnote 252

“[…] edited such that I took the point of view of systems of partial differential equations and not, as in my papers, the point of view of systems of equations for total differentials […]”Footnote 253

which was better suited to physicists. Through this note, Einstein came to understand Cartan’s theory of systems in involution:

“I have read your manuscript, and this enthusiastically. Now, everything is clear to me. Previously, my assistant Prof. Müntz and I had sought something similar — but we were unsuccessful.”Footnote 254 ([50], pp. 87, 94)

In the correspondence, Einstein made it very clear that he considered Maxwell’s equations only as an approximation for weak fields, because they did not allow for non-singular exact solutions approaching zero at spacelike infinity.

“It now is my conviction that for rigorous field theories to be taken seriously everywhere a complete absence of singularities of the field must be demanded. This probably will restrict the free choice of solutions in a region in a far-reaching way — more strongly than the restrictions corresponding to your degrees of determination.”Footnote 255 ([50], p. 92)

Although Einstein was grateful for Cartan’s help, he abandoned the geometry with distant parallelism.

7 Geometrization of the Electron Field as an Additional Element of Unified Field Theory

After the advent of Schrödinger’s and Dirac’s equations describing the electron non-relativistically and relativistically, a unification of only the electromagnetic and gravitational fields was considered unconvincing by many theoretical physicists. Hence, in the period 1927–1933, quite a few attempts were made to include Schrödinger’s, Dirac’s, or the Klein-Gordon equation as a classical one-particle equation into a geometrical framework by relating the quantum mechanical wave function with some geometrical object. Such an approach then was believed to constitute a unification, up to a degree, of gravitation and/or electricity and quantum theory. In this section, we loosely collect some of these approaches.

The mathematicians Struik and Wiener found the task of an amalgamation of relativity and quantum theory (wave mechanics) attractive:

“It is the purpose of the present paper to develop a form of the theory of relativity which shall contain the theory of quanta, as embodied in Schrödinger’s wave mechanics, not merely as an afterthought, but as an essential and intrinsic part.” [338]

A further example for the new program is given by J. M. Whittaker at the University of Edinburgh [415] who wished to introduce the wave function via the matter terms:

“In addition to the wave equations a complete scheme must include electromagnetic and gravitational equations. These will differ from the equations of Maxwell and Einstein in having ‘wave’ terms instead of ‘particle’ terms for the current vector and material energy tensor. The object of the present paper is to find these equations […].” ([415], p. 543)

Zaycoff, from the point of view of distant parallelism, found the following objection to unified field theory as the only valid one:

“It neglects the existence of wave-mechanical phenomena. By the work of Dirac, wave mechanics has reached an independent status; the only attempt to bring together this new group of phenomena with the other two is J. M. Whittaker’s theory [415].”Footnote 256 [428]

In fact, in a short note, Zaycoff presented his version of Whittaker’s theory with 8 coupled second-order linear field equations for two “wave vectors” that, in a suitable combination, were to represent “the Dirac’s wave equations”; they contain the Ricci tensor and both the electromagnetic 4-potential and field [432]. Thus, what is described is Dirac’s equation in external gravitational and electromagnetic fields, not a unified field theory. Whittaker had expressed himself more clearly:

“Eight wave functions are employed instead of Dirac’s four. These are grouped together to form two four-vectors and satisfy wave equations of the second order. It is shown […] that these eight wave equations can be reduced, by addition and subtraction, to the four second order equations satisfied by Dirac’s functions taken twice over; and that, in a sense, the present theory includes Dirac’s.” ([415], p. 543)

Whittaker also had written down a variational principle by which the gravitational and electromagnetic field equations were also gained. However, as the terms for the various fields were just added up in his Lagrangian, the theory would not have qualified as a genuine unified field theory in the spirit of Einstein.

What fancy, if only short-lived, flowers sprang from the mixing of geometry and wave mechanics is shown by the example of H. Jehle’s “[…] path leading, on the one hand, to electrical elementary particles and, on the other, to the explanation of cosmological problems by quantum theory.” [171] His ad-hoc modification of Einstein’s equations was:

$$ \begin{array}{*{20}{c}} {{G_{ik}}\;{\psi ^2} - {\sigma ^2}R\frac{{\partial \psi }}{{\partial {x^i}}}\frac{{\partial \psi }}{{\partial {x^k}}} = 0,\;\;\;\;\;}&{{G_{ik}}\;\psi \bar \psi - {\sigma ^2}R\frac{{\partial \psi }}{{\partial {x^i}}}\frac{{\partial \bar \psi }}{{\partial {x^k}}} = 0,} \end{array} $$

where ψ is the quantum mechanical state function. Although, two years later, Jehle withdrew his claims concerning elementary particles, he continued to apply

“wave-mechanical methods to gravitational phenomena, by which the curious structure of the spiral nebulae and spherical star systems may be readily understood.” [172]

An eminent voice was Weyl’s:

“It seems to me that it is now hopeless to seek a unification of gravitation and electricity without taking material waves into account.” ([408], p. 325)

Now, this posed a problem because for the representation of the electrons in the form of Dirac’s equation, the elements of spin space, i.e., spinors, had to be used. How to combine them with the vectors and tensors appearing in electromagnetic and gravitational theories? As the spinor representation is the simplest representation of the Lorentz group, everything may be traced back to spin space. At the time, this was being done in different ways, in part by the use of number fields with which physicists were unacquainted such as quaternions and sedenions (cf. Schouten [315]). Others, such as Einstein and Mayer, liked vectors better and introduced so-called semi-vectors. Still others tried to write Dirac’s equations in a vectorial form and took into account the doubling of variables and equations [213, 415]. Some less experienced authors, e.g., “Exhibition Research Student” G. Temple, even claimed that Dirac’s theory had to be given a tensorial form if it was to remain compatible with relativity:

“It is an admitted fact that Dirac’s wave functions are not the components of a tensor and that his wave equations are not in tensorial form. It is contended here that therefore his theory cannot be upheld without abandoning the theory of relativity.” ([343], p. 352)

While this story about geometrizing wave mechanics might not be a genuine part of unified field theory at the time, it seems interesting to follow it as a last attempt at binding together classical field theory and quantum physics. Even Einstein was lured into thinking about spinors by Dirac’s equation; this equation promised more success for his program concerning elementary particles as solutions of differential equations (cf. Section 7.3).

7.1 Unification of Maxwell’s and Dirac’s equations, of electrons and light

The appearance of these so-called “wave equations” for some seemed to show a kinship between the photon and the electron; this led to attempts to obtain a common representation for both kinds of particles (waves) [199, 200, 198, 286]. Some of the motivation for these papers came from formal considerations, i.e., the wish to replace the not yet well-understood new spinor representation of the Lorentz group by the old tensor representation; a working knowledge about non-commuting objects like matrices was not yet available to everyone:

“There are probably readers who will share the present writer’s feeling that the methods of non-commutative algebra are harder to follow, and certainly much more difficult to invent, than are operations of types long familiar to analysis.” ([44], p. 654)

More interesting is Frenkel’s remark about Darwin’s presentation of Dirac’s equations in a form analogous to Maxwell’s equations [44]:

“This relation between the wave-mechanical equations of a ‘quantum of electricity’ and the electromagnetic field equations, which may be looked at as wave-mechanical equations for photons, ought to have a fundamental physical meaning. Therefore, I do not think it is superfluous to win the wave equation of the electron as a generalisation of M a x w e l l’s equations.”Footnote 257 ([140], p. 357)

H. T. Flint of King’s College in London aimed at describing the electron in a Maxwell-like way within a five-dimensional approach. He saw two “unsatisfactory points” in Dirac’s approach, the introduction of the operator \(\tfrac{h}{{2\pi i}}\tfrac{\partial }{{\partial {x^k}}} - e{\phi _k}\), and the mass term mc. In order to mend these thin spots he wrote down two Maxwell-type equations,

$$ \begin{array}{*{20}{c}} {{\nabla _\mu }{F^{\lambda \mu }} = {J^\lambda },\;\;\;\;\;}&{{\nabla _\mu }{G^{\lambda \mu }} = {L^\lambda },} \end{array} $$

where \({{F^{\lambda \mu }}}\) and \({{G^{\lambda \mu }}}\) are two asymmetric field tensors, and \({J_\lambda } = \tfrac{{\partial \psi }}{{\partial {x^\lambda }}}\) and \({L_\lambda } = \tfrac{{\partial \theta }}{{\partial {x^\lambda }}}\) are two current vectors (μ, λ, ν = 1, 2, …, 5). ψ is the electron’s wave function; although not provided by Flint, the interpretation of θ points to a second kind of wave function. Despite the plenitude of variables introduced, Flint’s result was meagre; his second order wave equation contained the correct mass term and two new terms; he did not write up all the equations occurring [129]. His approach for embedding wave mechanics into a Maxwell-like framework was continued in further papers, in part in collaboration with J. W. Fisher; to them it appeared

“unnecessary to introduce in any arbitrary way terms and operators to account for quantum phenomena.” ([128], p. 653; [127])

By adding four spinor equations at his choosing to Dirac’s equation, Wisniewski in Poland arrived at a “system of equations similar to Maxwell’s”. His conclusion sounds a bit strange:

“These equations may be interpreted as equations for the electromagnetic field in an electron gas whose elements are electric and magnetic dipoles.” [388]

In this context, another unorthodox suggestion was put forward by A. Anderson who saw matter and radiation as two phases of the same substrate:

“We conclude that, under sufficiently large pressure, even at absolute zero normal matter and black-body radiation (gas of light quanta) become identical in every sense. Electrons and protons cannot be distinguished from quanta of light, gas pressure not from radiation pressure.”Footnote 258

Anderson somehow sensed that charge conservation was in his way; he muddled through by either assuming neutral matter, i.e., a mixture of electrons and protons, or by raising doubt as to “whether the usual quanta of light are strictly electrically neutral” ([3], p. 441).

One of the German theorists trying to keep up with wave mechanics was Gustav Mie. He tried to reformulate electrodynamics into a Schrödinger-type equation and arrived at a linear, homogeneous wave equation of the Klein-Gordon type for the ψ-function on the continuum of the components of the electromagnetic vector potential [232]. Heisenberg and Pauli, in their paper on the quantum dynamics of wave fields, although acknowledging Mie’s theory as an attempt to establish the classical side for the application of the correspondence principle, criticised it as a formal scheme not yet practically applicable [158].

7.2 Dirac’s electron with spin, Einstein’s teleparallelism, and Kaluza’s fifth dimension

In the same year 1928 in which Einstein published his theory of distant parallelism, Dirac presented his relativistic, spinorial wave equation for the electron with spin. This event gave new hope to those trying to include the electron field into a unified field theory; it induced a flood of papers in 1929 such that this year became the zenith for publications on unified field theory. Although we will first look at papers which gave a general relativistic formulation of Dirac’s equation without having recourse to a geometry with distant parallelism, Tetrode’s paper seems to be the only one not influenced by Einstein’s work with this geometry (cf. Section 6.4.5). Although, as we noted in Section 6.4.1, the technique of using n-beins (tetrads) had been developed by mathematicians before Einstein applied it, it may well have been that it became known to physicists through his work. Both Kaluza’s five-dimensional space and four-dimensional projective geometry were also applied in the general relativistic formulation of Dirac’s equation.

7.2.1 Spinors

This is a very sketchy outline with a focus on the relationship to unified field theories. An interesting study into the details of the introduction of local spinor structures by Weyl and Fock and of the early history of the general relativistic Dirac equation was given recently by Scholz [291].

For some time, the new concept of spinorial wave function stayed unfamiliar to many physicists deeply entrenched in the customary tensorial formulation of their equationsFootnote 259. For example, J. M. Whittaker was convinced that Dirac’s theory for the electron

“has been brilliantly successful in accounting for the ‘duplexity’ phenomena of the atom, but has the defect that the wave equations are unsymmetrical and have not the tensor form.” ([415], p. 543)

Some early nomenclature reflects this unfamiliarity with spinors. For the 4-component spinors or Dirac-spinors (cf. Section 2.1.5) the name “half-vectors” coined by Landau was in useFootnote 260. Podolsky even purported to show that it was unnecessary to employ this concept of “half-vector” if general curvilinear coordinates are used [259]. Although van der Waerden had written on spinor analysis as early as 1929 [368] and Weyl’s [407, 408], Fock’s [133, 131], and Schouten’s [306] treatments in the context of the general relativistic Dirac equation were available, it seems that only with van der Waerden’s book [369], Schrödinger’s and Bargmann’s papers of 1932 [319, 6], and the publication of Infeld and van der Waerden one year later [167] a better knowledge of the new representations of the Lorentz group spread out. Ehrenfest, in 1932, still complainedFootnote 261:

“Yet still a thin booklet is missing from which one could leisurely learn spinor- and tensor-calculus combined.”Footnote 262 ([68], p. 558)

In 1933, three publications of the mathematician Veblen in Princeton on spinors added to the development. He considered his first note on 2-spinors “a sort of geometric commentary on the paper of Weyl” [378]. Veblen had studied Weyl’s, Fock’s, and Schouten’s papers, and now introduced a “spinor connection of the first kind” \(\Lambda _{\beta \alpha }^A\), α=1,…,4, with the usual transformation law under the linear transformation \({\bar \psi ^A} = T_D^A{\psi ^D}\;(A,D = 1,2)\) changing the spin frame:

$$ \bar \Lambda _{D\alpha }^C = \left( {\Lambda _{B\alpha }^At_D^B + \frac{{\partial t_D^A}}{{\partial {x^\alpha }}}} \right)T_A^C, $$

where \(t_D^B\) is the inverse matrix \(T^{-1}\). \(T_D^A\) corresponds to \(A_B^A\) of Equation (75); however, the transformation need not be unimodular. Thus, Veblen took up Schouten’s concept of “spin density” [306] by considering quantities transforming like \({\bar \psi ^A} = {t^N}T_D^A{\psi ^D}\) (A, D=1, 2), with \(t := \det t_D^B\). Then, the covariant derivative of a spinor \(\psi^A\) of weight N is considered; the expressionFootnote 263

$$ \frac{{\partial {\psi ^A}}}{{\partial {x^\alpha }}} + \Lambda _{B\alpha }^A{\psi ^B} - N\Lambda _{B\alpha }^B{\psi ^A} = {\nabla _\alpha }{\psi ^A} $$

gives “the components of a geometric object which transforms like those of a spinor of weight N with respect to the index A and like those of a projective tensor with respect to α”. In his first paper [378], a further generalisation is introduced including “gauge-transformations” in an additional variable \(x^0\): \({\bar \psi ^A} = {e^{k{x^0}}}{f^A}\) (A, D = 1, 2), with the gauge transformations

$$ \begin{array}{*{20}{c}} {{x^0} = {x^{0'}} - \log \;\rho ({x^{k'}}),\;\;\;\;\;}&{{x^k} = {x^k}({x^{l'}}),} \end{array} $$

k is called the “index” of the spinor. In order to deal with 4-spinors Veblen considered a complex projective 3-space and defined 6 real homogeneous coordinates \({X^\sigma }\), with σ=0, …, 5, through Hermitian forms of the 4-spinor components. The subspace \(X^0 = 0\), \(X^5 = 0\) of the quadric \({({X^0})^2} + {({X^1})^2} + {({X^2})^2} + {({X^3})^2} - {({X^4})^2} - {({X^5})^2} = 0\) is then tangent to the Minkowski light cone ([377], p. 515).

Veblen imbedded spinors into his projective geometry [380]:

“[…] The components of still other objects, the spinors, remain partially indeterminate after coordinates and gauge are fixed and become completely determinate only when the spin frame is specified. There are several ways of embodying this invariant theory in a formal calculus. The one which is here employed has its antecedents chiefly in the work of Weyl, van der Waerden, Fock, and Schouten. It differs from the calculus arrived at by Schouten chiefly in the treatment of gauge invariance, Schouten (in collaboration with van Dantzig) having preferred to rewrite the projective relativity in a formalism obtainable from the original one by a sort of coordinate transformation, whereas I think the original form fits in better with the classical notations of relativity theory. […] The theory of spinors is more general than the projective relativity and is reduced to the latter by the specification of certain fundamental spinors. These spinors have been recognised by several students (Pauli and Solomon, Fock) of the subject but their role has probably not been fully understood since it has quite recently been thought necessary to give special proofs of invariance.” [380]

The transformation law for spinors is the same as beforeFootnote 264:

$$ \begin{array}{*{20}{c}} {{{\bar \psi }^A} = {e^{k{x^0}}}{t^N}{T^A}_B{\psi ^B},\;\;\;\;\;}&{A,B = 1, \ldots 4.} \end{array} $$

In part, he also takes over van der Waerden’s notation (dotted indices). As to Veblen’s papers on 2- and 4-spinors, my impression is that, beyond a more detailed presentation, alas with a less transparent notation, they do not really bring a pronounced advance with regard to Weyl’s, Fock’s, van der Waerden’s, and Pauli’s publications (cf. Sections 7.2.2 and 7.2.3). Veblen himself had a different opinion; for him the homogeneous coordinates used by Pauli seemed “to make things more complicated” (cf. the paragraphs on projective geometry in Section 2.1.3). Veblen’s inhomogeneous coordinates \(x^i\) (i = 1, 2, 3, 4) and the homogeneous coordinates \({X^\mu }\), with μ=0, …, 4, are connected by

$$ \begin{array}{*{20}{c}} {{X^0} = \exp ({x^0}),\;\;\;\;\;}&{{X^i} = \exp ({x^0})\,{x^i}.} \end{array} $$

According to Veblen,

“In a five-dimensional representation the use of the homogeneous coordinates \((X^0, \ldots, X^4)\) amounts to representing the points of space-time by the straight lines through the origin, whereas the use of \(x^1, \ldots, x^4\), and the gauge variable amounts to using the system of straight lines parallel to the \(x^0\)-axis for the same purpose. The transformation (192) given above carries the system of lines into the other.” [382]
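
Veblen’s remark can be made concrete with a short calculation. The following is only a sketch: it assumes that the relation between the two coordinate systems is read as \(X^0 = e^{x^0}\), \(X^i = e^{x^0}x^i\), and that a gauge transformation shifts \(x^0\) by \(\log \rho\) while leaving the \(x^i\) unchanged (a special case of the gauge transformations quoted earlier). The inverse relation then is

$$ x^0 = \log X^0, \;\;\;\;\; x^i = \frac{X^i}{X^0}, $$

and the gauge shift \(x^{0'} = x^0 + \log \rho\) becomes

$$ X^{0'} = \rho X^0, \;\;\;\;\; X^{i'} = \rho X^i . $$

Under these assumptions all five homogeneous coordinates are rescaled by the common factor ρ, so that the point stays on the same straight line through the origin, while in the inhomogeneous description it merely slides along a line parallel to the \(x^0\)-axis.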

7.2.2 General relativistic Dirac equation and unified field theory

After Tetrode and Wigner, whose contributions were mentioned in Section 6.4.5, Weyl also gave a general relativistic formulation of Dirac’s equation. He gave up his original idea of coupling electromagnetism to gravitation and transferred it to the coupling of the electromagnetic field to the matter (electron-) field: In order to keep quantum mechanical equations like Dirac’s gauge invariant, the wave function had to be multiplied by a phase factor [407, 408]. Actually, Weyl had expressed the change in his outlook, so important for the idea of gauge-symmetry in modern physics ([424], pp. 13–19), already in 1928 in his book on group theory and quantum mechanics ([406], pp. 87–88). We have noted before his refutation of distant parallelism (cf. Section 6.4.4). In his papers, Weyl used a 2-spinor formalism and a tetrad notation different from Einstein’s and Levi-Civita’s: He wrote ep() in place of h p , and o(l; kj) for the Ricci rotation coefficients γjkl; this did not ease the reading of his paper. He partly agreed with what Einstein imagined:

“It is natural to expect that one of the two pairs of components of Dirac’s quantity belongs to the electron, the other to the proton.”Footnote 265

In contrast to Einstein, Weyl did not expect to find the electron as a solution of “classical” spinorial equations:

“For every attempt at establishing the quantum-theoretical field equations, one must not lose sight [of the fact] that they cannot be tested empirically, but that they provide, only after their quantization, the basis for statistical assertions concerning the behaviour of material particles and light quanta.”Footnote 266 ([407], p. 332)

For many years, Weyl had given the statistical approach in the formulation of physical laws an important role. He therefore could adapt easily to the Born-Jordan-Heisenberg statistical interpretation of the quantum state. For Weyl and statistics, cf. Section V of Sigurdsson’s dissertation ([326], pp. 180–192).

At about the same time, Fock in May 1929 and later in the year wrote several papers on the subject of “geometrizing” Dirac’s equation:

“In the past two decades, endeavours have been made repeatedly to connect physical laws with geometrical concepts. In the field of gravitation and of classical mechanics, such endeavours have found their fullest accomplishment in Einstein’s general relativity. Up to now, quantum mechanics has not found its place in this geometrical picture; attempts in this direction (Klein, Fock) were unsuccessful. Only after Dirac had constructed his equations for the electron, the ground seems to have been prepared for further work in this direction.”Footnote 267 ([135], p. 798)

In another paper [134], Fock and Ivanenko took a first step towards showing that Dirac’s equation can also be written in a generally covariant form. To this end, the matrix-valued linear form \(ds = \gamma_k dx^k\) (summation over k=1, …, 4) was introduced and interpreted as the distance between two points “in a space with four continuous and one discontinuous dimensions”; the discrete variable took only the integer values 1, 2, 3, and 4. Then the operator-valued vectorial quantity \(\gamma_k u^k\) with the vectorial operator \(u^k\) and its derivative \(\tfrac{{ds}}{{d\tau }} = {\gamma _k}{v^k}\) immediately led to Dirac’s equation by replacing \(v^k\) by \(\tfrac{1}{m}\left( {\tfrac{h}{{2\pi i}}\tfrac{\partial }{{\partial {x^k}}} + \tfrac{e}{c}{A_k}} \right)\), where \(A_k\) is the electromagnetic 4-potential, by also assuming the velocity of light c to be the classical average of the “4-velocity” \(v^k\), and by applying the operator to the wave function ψ. In the next step, instead of the Dirac γ-matrices with constant entries \(\gamma_l^{(0)}\), the coordinate-dependent bein-components \({\gamma _{\hat k}} = {h_{\hat k}}^l\gamma _l^{(0)}\) are defined; \(ds^2\) then gives the orthonormality relations of the 4-beins.
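
The step from the matrix-valued linear form to a quadratic form can be sketched as follows. The anticommutation relation used here is the standard Clifford-algebra assumption and is not spelled out in the text above; the symbol \(g_{kl}\) for the resulting coefficients is likewise introduced only for illustration. Since the differentials commute, only the symmetric part of the product of two γ-matrices contributes:

$$ d{s^2} = ({\gamma _k}d{x^k})({\gamma _l}d{x^l}) = \tfrac{1}{2}({\gamma _k}{\gamma _l} + {\gamma _l}{\gamma _k})\,d{x^k}d{x^l} = {g_{kl}}\,d{x^k}d{x^l}\,{\bf 1}, \;\;\;\;\; {\rm if} \;\;\; {\gamma _k}{\gamma _l} + {\gamma _l}{\gamma _k} = 2{g_{kl}}\,{\bf 1}. $$

For constant matrices the coefficients \(g_{kl}\) are constant; for coordinate-dependent γ-matrices built from the bein-components, the same relation expresses these coefficients through products of the beins, which is presumably what is meant by the orthonormality relations.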

In a subsequent note in the Reports of the Parisian Academy, Fock and Ivanenko introduced Dirac’s 4-spinors under Landau’s name “half vector” and defined their parallel transport with the help of Ricci’s coefficients. In modern parlance, by introducing a covariant derivative for the spinors, they in fact already obtained the “gauge-covariant” derivative \({\nabla _k}\psi : = (\tfrac{\partial }{{\partial {x^k}}} - \tfrac{{2\pi i}}{h}\tfrac{e}{c}{A_k})\psi\). Thus \(\delta \psi = \tfrac{{2\pi i}}{h}\tfrac{e}{c}{A_k}d{x^k}\psi\) is interpreted in the sense of Weyl:

“Thus, it is in the law for the transport of a half-vector that Weyl’s differential linear form must appear.”Footnote 268 ([134], p. 1469)

In order that gauge-invariance results, ψ must transform with a factor of norm 1, innocuous for observation, i.e., \(\psi \to \exp (i\tfrac{{2\pi }}{h}\tfrac{e}{c}\sigma )\psi\) if \({A_k} \to {A_k} + \tfrac{{\partial \sigma }}{{\partial {x^k}}}\). Another note and extended presentations in both a French and a German physics journal by Fock alone followed suit [133, 131, 132]. In the first paper Fock defined an asymmetric matter tensor for the spinor field,

$$ {T^j}_k = \frac{{ch}}{{2\pi i}}\left[ {\bar \psi {\gamma ^j}\left( {\frac{{\partial \psi }}{{\partial {x^k}}} - {\Gamma _k}\psi } \right) - \frac{1}{2}{\nabla _k}\left( {\bar \psi {\gamma ^j}\psi } \right)} \right], $$

where \({\Gamma _k} = \Sigma _{\hat l}{e_{\hat l}}{\alpha _{\hat l}}{h_{k\hat l}}{C_{\hat l}}\) is related to the matrix-valued spin connection in the expression for the parallel transport of a half-vector ψ:

$$ \delta \psi = \sum\limits_{\hat l} {{e_{\hat l}}{C_{\hat l}}d{s_{\hat l}}\psi .} $$

The covariant derivative then is \({D_k} = \tfrac{\partial }{{\partial {x^k}}} - {\Gamma _k}\). Fock made clear that the covariant formulation of Dirac’s equation did not need the special geometry of Einstein’s theory of distant parallelism:

“By help of the concept of parallel transport of a half-vector, Dirac’s equations will be written in a generally invariant form. […] The appearance of the 4-potential \(\phi_l\) besides the Ricci-coefficients \(\gamma_{ikl}\) in the expression for parallel transport, on the one hand provides a simple reason for the emergence of the term \({p_l} - \tfrac{e}{c}{\phi _l}\) in the wave equation and, on the other, shows that the potentials \(\phi_l\) have a place of their own in the geometrical world-view, contrary to Einstein’s opinion; they need not be functions of the \(\gamma_{ikl}\).”Footnote 269 ([131], p. 261, Abstract)

For his calculations, Fock used Eisenhart’s book [119] and “the excellent collection of the most important formulas and facts in the paper of Levi-Civita” [207]. Again, Weyl’s “principle of gauge invariance” as formulated in Weyl’s book of 1928 [406] is mentioned, and Fock stressed that he had found this principle independently and earlierFootnote 270:

“The appearance of Weyl’s differential form in the law for parallel transport of a half vector connects intimately to the fact, observed by the author and also by Weyl (l.c.), that addition of a gradient to the 4-potential corresponds to multiplication of the ψ-function with a factor of modulus 1.”Footnote 271 ([130], p. 266)
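
The correspondence described in this quotation can be checked directly for the gauge-covariant derivative given above. As a sketch, write \(q := \tfrac{2\pi}{h}\tfrac{e}{c}\) (an abbreviation introduced here only for brevity), so that \({\nabla _k}\psi = (\tfrac{\partial }{{\partial {x^k}}} - iq{A_k})\psi\). Adding a gradient to the potential, \({A_k} \to {A_k} + \tfrac{{\partial \sigma }}{{\partial {x^k}}}\), and multiplying ψ by \(e^{iq\sigma}\) gives

$$ \left( {\frac{\partial }{{\partial {x^k}}} - iq\left( {{A_k} + \frac{{\partial \sigma }}{{\partial {x^k}}}} \right)} \right)\left( {{e^{iq\sigma }}\psi } \right) = {e^{iq\sigma }}\left( {\frac{{\partial \psi }}{{\partial {x^k}}} + iq\frac{{\partial \sigma }}{{\partial {x^k}}}\psi - iq{A_k}\psi - iq\frac{{\partial \sigma }}{{\partial {x^k}}}\psi } \right) = {e^{iq\sigma }}\,{\nabla _k}\psi . $$

The two gradient terms cancel, so that \({\nabla _k}\psi\) picks up the same factor of modulus 1 as ψ itself.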

The divergence of the complex energy-momentum tensor \(W_k^i = T_k^i + iU_k^i\) satisfies

$$ \begin{array}{*{20}{c}} {{\nabla _j}{T^j}_k = e{J^l}{F_{lk}},\;\;\;\;\;}&{{\nabla _j}{U^j}_k = \frac{{hc}}{{4\pi }}{J^l}{R_{lk}},} \end{array} $$

with the electromagnetic field tensor \(F_{ik}\), the Ricci tensor \(R_{ik}\), and the Dirac current \(J^k\). The French version of the paper preceded the German “completed presentation”; in it Fock had noted:

“The 4-potential finds its place in Riemannian geometry, and there exists no reason for generalising it (Weyl, 1918), or for introducing distant parallelism (Einstein 1928). In this point, our theory, developed independently, agrees with the new theory by H. Weyl expounded in his memoir ‘gravitation and the electron’.”Footnote 272 ([132], p. 405)

In both of his papers, Fock thus stressed that Einstein’s teleparallel theory was not needed for the generally covariant formulation of Dirac’s equation. In this regard he found himself in accord with Weyl, whose approach to the Dirac equation he nevertheless criticised:

“The main subject of this paper is ‘Dirac’s difficulty’Footnote 273. Nevertheless, it seems to us that the theory suggested by Weyl for solving this problem is open to grave objections; a criticism of this theory is given in our article.”Footnote 274

Weyl’s paper is seminal for the further development of the gauge idea [407].

Although Fock had cleared up the generally covariant formulation of Dirac’s equation, and had tried to propagate his results by reporting on them at the conference in Charkow in May 1929Footnote 275 [169], further papers were written. Thus, Reichenbächer, in two papers on “a wave-mechanical 2-component theory”, believed that he had found a method different from Weyl’s for obtaining Dirac’s equation in a gravitational field. As was often the case with Reichenbächer’s work, after long-winded calculations a less than transparent result emerged. His mass term contained a square root, i.e., a ± two-valuedness, which, in principle, might have been instrumental for helping to explain the mass difference of proton and electron. As he remarked, the chances for this were minimal, however [277, 278].

In two papers, Zaycoff (of Sofia) presented a unified field theory of gravitation, electromagnetism and the Dirac field for which he left behind the framework of a theory with distant parallelism used by him in other papers. By varying his Lagrangian with respect to the 4-beins, the electromagnetic potential, the Dirac wave function and its complex-conjugate, he obtained the 20 field equations for gravitation (of second order in the 4-bein variables, assuming the role of the gravitational potentials) and the electromagnetic field (of second order in the 4-potential), and 8 equations of first order in the Dirac wave function and the electromagnetic 4-potential, corresponding to the generalised Dirac equation and its complex conjugate [426, 427].

In another paper, Zaycoff wanted to build a theory explaining the “equilibrium of the electron”. This means that he considered the electron as extended. On this occasion, he wrestled with himself about the admissibility of the Kaluza-Klein approach:

“Recently, repeated attempts have been made to raise the number of dimensions of the world in order to explain its strange lawfulness (H. Mandel, G. Rumer, the author et al.). No doubt, there are weighty reasons for such a seemingly paradoxical view. For it is impossible to represent Poincaré’s pressure of the electron within the normal spacetime scheme. However, the introduction of such metaphysical elements is in gross contradiction with space-time causality, although we may doubt in causality in the usual sense due to Heisenberg’s uncertainty relations. A multi-dimensional causality cannot be understood as long as we are unable to give the extra dimensions an intuitive meaning.”Footnote 276 [433]

Rumer’s paper is [285] (cf. Section 8). In the paper, Zaycoff introduced a six-dimensional manifold with local coordinates \(x^0, \ldots, x^5\), where \(x^0\) and \(x^5\) belong to the additional dimensions. His local 6-bein comprises, besides the 4-bein, four electromagnetic potentials and a further one called “eigenpotential” of the electromagnetic field. As he used a “sharpened cylinder condition,” no further scalar field is taken into account. For \(x^0\) to \(x^4\) he used the subgroup of coordinate transformations given in Klein’s approach, augmented by \(x^{5'} = x^5\).

Schouten seemingly became interested in Dirac’s equation through Weyl’s publications. He wrote two papers, one concerned with the four-dimensional and a second one with the five-dimensional approach [306, 307]. They resulted from lectures Schouten had given at the Massachusetts Institute of Technology from October to December 1930 and at Princeton University from January to March 1931; Weyl’s paper referred to is in Zeitschrift für Physik [407]. Schouten relied on his particular representation of the Lorentz group in a complex space, which later attracted Schrödinger’s criticism [305]. His comment on Fock’s paper [131] isFootnote 277:

“Fock has tried to make use of the indetermination of the displacement of spin-vectors to introduce the electromagnetic vector potential. However the displacement of contravariant tensor-densities of weight +1/2 being wholly determined and only these vector-densities playing a role, the idea of Weyl of replacing the potential vector by pseudovectors of class +1 and −1 seems much better.” ([306], p. 261, footnote 19)

Schouten wrote down Dirac’s equation in a space with torsion; his iterated wave equation, besides the mass term, contains a contribution \(\sim -\tfrac{1}{4}R\) if torsion is set equal to zero. Whether Schouten could fully appreciate the importance of Weyl’s new idea of gauging remains open. For him an important conclusion is that

“by the influence of a gravitational field the components of the potential vector change from ordinary numbers into Dirac-numbers.” ([306], p. 265)

Two years later, Schrödinger as well became interested in Dirac’s equation. We reproduce a remark from his publication [319]:

“The joining of Dirac’s theory of the electron with general relativity has been undertaken repeatedly, such as by Wigner [419], Tetrode [344], Fock [131], Weyl [407, 408], Zaycoff [434], Podolsky [259]. Most authors introduce an orthogonal frame of axes at every event, and, relative to it, numerically specialised Dirac-matrices. This procedure makes it a little difficult to find out whether Einstein’s idea concerning teleparallelism, to which [authors] sometimes refer, really plays a role, or whether there is no dependence on it. To me, a fundamental advantage seems to be that the entire formalism can be built up by pure operator calculus, without consideration of the ψ-function.”Footnote 278 ([319], p. 105)

The γ-matrices were taken by Schrödinger such that their covariant derivative vanished, i.e., \({\gamma _{l\left\| m \right.}} = \tfrac{{\partial {\gamma _l}}}{{\partial {x^m}}} - \Gamma _{lm}^r(g){\gamma _r} + {\gamma _l}{\Gamma _m} - {\gamma _m}{\Gamma _l} = 0\), where \(\Gamma_l\) is the spin-connection introduced by \({\psi _{\left\| l \right.}} = \tfrac{{\partial \psi }}{{\partial {x^l}}} - {\Gamma _l}\psi\). Schrödinger took \(\gamma_0\), \(\gamma_i\), with i = 1, 2, 3, as Hermitian matrices. He introduced tensor-operators \(T_{lm}^{ik}\) such that the inner product \(\psi *{\gamma _0}T_{lm}^{ik}\psi\) instead of \(\psi *T_{lm}^{ik}\psi\) stayed real under a “complemented point-substitution”.

In the course of his calculations, Schrödinger obtained the wave equation

$$ \frac{1}{{\sqrt g }}{\nabla _k}\sqrt g {g^{kl}}{\nabla _l} - \frac{R}{4} - \frac{1}{2}{f_{kl}}{s^{kl}} = {\mu ^2}, $$

where μ=2πmc/h, \(f_{kl}\) is the electromagnetic field tensor, and \({s^{kl}}: = \tfrac{1}{2}{\gamma ^{[k}}{\gamma ^{l]}}\) with the γ-matrices \(\gamma^k\), i.e., the spin tensor. As to the term with the curvature scalar R, Schrödinger was startled:

“To me, the second term seems to be of considerable theoretical interest. To be sure, it is much too small by many powers of ten in order to replace, say, the term on the r.h.s. For μ is the reciprocal Compton length, about \(10^{11}\;{\rm cm}^{-1}\). Yet it appears important that in the generalised theory a term is encountered at all which is equivalent to the enigmatic mass term.”Footnote 279 ([319], p. 128)
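
The orders of magnitude behind this remark can be made explicit. The following estimate is only illustrative: it uses rounded modern values of the constants in cgs units and, as an assumption not made in the text, takes for the curvature scalar the magnitude \(|R| = 8\pi G\rho /{c^2}\) which Einstein’s field equations give for pressureless matter of density ρ, evaluated for the mean density of the Earth, \(\rho \approx 5.5\;{\rm g\,cm}^{-3}\). With μ=2πmc/h one finds

$$ \mu \approx \frac{{2\pi \times 9.11 \times {{10}^{ - 28}} \times 3.00 \times {{10}^{10}}}}{{6.63 \times {{10}^{ - 27}}}}\;{\rm cm}^{-1} \approx 2.6 \times {10^{10}}\;{\rm cm}^{-1}, \;\;\;\;\; {\mu ^2} \approx 6.7 \times {10^{20}}\;{\rm cm}^{-2}, $$

whereas

$$ \frac{{|R|}}{4} = \frac{{2\pi G\rho }}{{{c^2}}} \approx \frac{{2\pi \times 6.67 \times {{10}^{ - 8}} \times 5.5}}{{9.0 \times {{10}^{20}}}}\;{\rm cm}^{-2} \approx 2.6 \times {10^{ - 27}}\;{\rm cm}^{-2}. $$

Under these assumptions the curvature term is smaller than the mass term by some 47 orders of magnitude, which illustrates the “many powers of ten” in the quotation.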

The coefficient -1/4 in front of the Ricci scalar in Schrödinger’s (Klein-Gordon) wave equation differs from the 1/6 needed for a conformally invariant version of the scalar wave equationFootnote 280 (cf. [257], p. 395).

Bargmann in his approach, unlike Schrödinger, did not couple “point-substitutions [linear coordinate transformations] and similarity transformations [in spin space]” [6]. He introduced a matrix α with \(\alpha + {\alpha ^\dag } = 0\) such that \({(\alpha {\gamma ^l})^\dag } = (\alpha {\gamma ^l})\), with l = 0, … , 3.

Levi-Civita wrote a letter to Schrödinger in the form of a scientific paper, excerpts of which were published by the Berlin Academy:

“Your fundamental memoir induced me to develop the calculational details for obtaining, from Dirac’s equations in a general gravitational field, the modified form of your four equations of second order and thus make certain the corresponding additional terms. These additional terms do depend in an essential way on the choice of the orthogonal tetrad in the space-time manifold: It seems that without such a tetrad one cannot obtain Dirac’s equation.”Footnote 281 [208]

The last, erroneous, sentence must have made Pauli irate. On this paper, he pronounced his anathema (in a letter to Ehrenfest with the appeal “Please, copy and distribute!”):

“The heap of corpses, behind which quite a lot of bums look for cover, has got an increment. Beware of the paper by Levi-Civita: Dirac- and Schrödinger-type equations, in the Berlin Reports 1933. Everybody should be kept from reading this paper, or from even trying to understand it. Moreover, all articles referred to on p. 241 of this paper belong to the heap of corpses.”Footnote 282 ([252], p. 170)

Pauli really must have been enraged: Among the publications banned by him is also Weyl’s well-known article on the electron and gravitation of 1929 [407].

Schrödinger’s paper was criticised by Infeld and van der Waerden on the ground that his calculational apparatus was unnecessarily complicated. They promised to do better and referred to a paper of Schouten’s [306]:

“In the end, Schouten arrives at almost the same formalism developed in this paper; only that he uses without need n-bein components and theorems on sedenionsFootnote 283, while afterwards the formalism is still burdened with auxiliary variables and pseudo-quantities. We have taken over the introduction of ‘spin densities’ by Schouten.”Footnote 284 ([168], p. 4)

Unlike Schrödinger’s, the wave equation derived from Dirac’s equation by Infeld and van der Waerden contained a term +1/4R, with R the Ricci scalar.

It is left to an in-depth investigation how this discussion concerning teleparallelism and Dirac’s equation involving Tetrode, Wigner, Fock, Pauli, London, Schrödinger, Infeld and van der Waerden, Zaycoff, and many others influenced the acceptance of the most important result, i.e., Weyl’s transfer of the gauge idea from classical gravitational theory to quantum theory in 1929 [407, 408].

7.2.3 Parallelism at a distance and electron spin

Einstein’s papers on distant parallelism had a strong but shortlived impact on theoretical physicists, in particular in connection with the discussion of Dirac’s equation for the electron,

$$ \left( {i{\gamma ^k}\frac{\partial }{{\partial {x^k}}} + \mu } \right)\psi = 0, $$

where the 4-spinor ψ and the γ-matrices are used. At the time, there existed some hope that a unified field theory for gravitation, electromagnetism, and the “electron field” was within reach. This may have been caused by a poor understanding of the new quantum theory in Schrödinger’s version: The new complex wave function obeying Schrödinger’s, and, more interestingly for relativists, Dirac’s equation or the ensuing Klein-Gordon wave equation, was interpreted in the spirit of de Broglie’s “onde pilote”, i.e., as a classical matter wave, and not, as it should have been, as a probability amplitude for an ensemble of indistinguishable electrons. One of the essential features of quantum mechanics, the non-commutativity of conjugated observables like position and momentum, nowhere entered the approaches aiming at a geometrization of wave mechanics.
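
The “ensuing” Klein-Gordon equation is obtained by iterating the first-order equation. A minimal sketch, assuming constant γ-matrices with \({\gamma ^k}{\gamma ^l} + {\gamma ^l}{\gamma ^k} = 2{\eta ^{kl}}\) and the signature (+, −, −, −) for the Minkowski metric \(\eta^{kl}\) (conventions that are not fixed in the text above): applying the operator \(i{\gamma ^l}\tfrac{\partial }{{\partial {x^l}}} - \mu\) to Dirac’s equation, the mixed terms cancel and only the symmetric part of \({\gamma ^l}{\gamma ^k}\) survives,

$$ \left( {i{\gamma ^l}\frac{\partial }{{\partial {x^l}}} - \mu } \right)\left( {i{\gamma ^k}\frac{\partial }{{\partial {x^k}}} + \mu } \right)\psi = - \left( {{\gamma ^l}{\gamma ^k}\frac{{{\partial ^2}}}{{\partial {x^l}\partial {x^k}}} + {\mu ^2}} \right)\psi = - \left( {{\eta ^{kl}}\frac{{{\partial ^2}}}{{\partial {x^k}\partial {x^l}}} + {\mu ^2}} \right)\psi = 0, $$

so that, under these assumptions, every component of ψ separately obeys the Klein-Gordon equation.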

Einstein was one of those clinging to the picture of the wave function as a real phenomenon in space-time. Although he knew well that already for two particles the wave function no longer “lived” in space-time but in 7-dimensional configuration space, he tried to escape its statistical interpretation. On 5 May 1927, Einstein presented a paper to the Academy of Sciences in Berlin with the title “Does Schrödinger’s wave mechanics determine the motion of a system completely or only in the statistical sense?”. It should have become a 4-page publication in the Sitzungsberichte. As he wrote to Max Born:

“Last week I presented a short paper to the Academy in which I showed that one can ascribe fully determined motions to Schrödinger’s wave mechanics without any statistical interpretation. Will appear soon in Sitz.-Ber. [Reports of the Berlin Academy].”Footnote 285 ([103], p. 136)

However, he quickly must have found a flaw in his argumentation: He telephoned to stop the printing after less than a page had been typeset. He also wanted that, in the Academy’s protocol, the announcement of this paper be erased. This did not happen; thus we know of his failed attempt, and we can read how his line of thought began ([183], pp. 134–135).

Each month during 1929, papers appeared in which a link between Einstein’s teleparallelism theory and quantum physics was foreseen. Thus, in February 1929, Wiener and Vallarta stressed that

“the quantities \({}^s{h_\lambda }\)Footnote 286 of Einstein seem to have one foot in the macro-mechanical world formally described by Einstein’s gravitational potentials and characterised by the index λ, and the other foot in a Minkowskian world of micro-mechanics characterised by the index s. That the micro-mechanical world of the electron is Minkowskian is shown by the theory of Dirac, in which the electron spin appears as a consequence of the fact that the world of the electron is not Euclidean, but Minkowskian. This seems to us the most important aspect of Einstein’s recent work, and by far the most hopeful portent for a unification of the divergent theories of quanta and gravitational relativity.” [418]

The correction of this misjudgement of Wiener and Vallarta by Fock and Ivanenko began only one month later [134], and was complete in the summer of 1929 [134, 133, 131, 132].

In March, Tamm tried to show

“that for the new field theory of Einstein [84, 88] certain quantum-mechanical features are characteristic, and that we may hope that the theory will enable one to seize the quantum laws of the microcosm.”Footnote 287 ([341], p. 288)

Tamm added a torsion term \(i\hbar \sqrt {({S_i}{S^i})} \chi\) to the Dirac equation (197) and derived from it a general relativistic (Schrödinger) wave equation in an external electromagnetic field with a contribution from the spin tensor coupled to a torsion termFootnote 288 \({\alpha ^{[i}}{\alpha ^{k]}}{S_{ik}}^l\). As Tamm assumed for the torsion vector \({S_k} = \pm \tfrac{{ie}}{{\hbar c}}\;{\phi _k}\), his tetrads had to be complex, with the imaginary part containing the electromagnetic 4-potential φk. This induced him to see another link to quantum physics; by returning to the first of Einstein’s field equations (170) and replacing in Equation (169) by \(i\tfrac{e}{c}\hbar\) in the limit ħ→0, he obtained the laws of electricity and gravitation, separately. From this he conjectured that, for finite h, Einstein’s field equations might correctly reproduce the quantum features of “the microcosm” ([341], p. 291); cf. also [340].

What remained after all the attempts at geometrizing the matter field for the electron was the conviction that the quantum mechanical “wave equations” could be brought into a covariant form, i.e., could be dealt with in the presence of a gravitational field, but that quantum mechanics, spin, and gravitation were independent subjects as seen from the goal of reaching unified field theory.

7.2.4 Kaluza’s theory and wave mechanics

For some, Kaluza’s introduction of a fifth, spacelike dimension seemed to provide a link to quantum theory in the form of wave mechanics. Although he did not appreciate Kaluza’s approach, Reichenbächer listed various possibilities: With the fifth dimension, Kaluza and Klein had connected electrical charge, Fock the electromagnetic potential, and London the spin of the electron [276]. Also, the idea of relating Schrödinger’s matter wave function with the new metrical component \(g_{55}\) was put to work. GonsethFootnote 289 and Juvet, in the first of four consecutive notes submitted in August 1927 [150, 148, 149, 147], stated:

“The objective of this note is to formulate a five-dimensional relativity whose equations will give the laws for the gravitational field, the electromagnetic field, the laws of motion of a charged material point, and the wave equation of Mr. Schrödinger. Thus, we will have a frame in which to take the gravitational and electromagnetic laws, and in which it will be possible also for quantum theory to be included.”Footnote 290 ([150], p. 543)

It turned out that from the \(R_{55}\)-component of the Einstein vacuum equations \({R_{\alpha \beta }} = 0\), α, β=1, … , 5, with the identification \(g_{55} = \psi\) made, and the assumption that ψ, \(\tfrac{{\partial \psi }}{{\partial {x^i}}}\) be “very small”, while ψ, \(\tfrac{{\partial \psi }}{{\partial {x^5}}}\) be “even smaller”, the covariant d’Alembert equation followed, an equation that was identified by the authors with Schrödinger’s equation. Their further comment is:

“We thus can see that the fiction of a five-dimensional universe provides a deep reason for Schrödinger’s equation. Obviously, this artifice will be needed if some phenomenon would force the physicists to believe in a variability of the [electric] charge.”Footnote 291 ([149], p. 450)

In the last note, with the changed identification \(g_{55} = \psi^2\) and slightly altered weakness assumptions, Gonseth and Juvet gained the relativistic wave equation with a non-linear mass term.

Interestingly, a couple of months later, O. Klein had the same idea about a link between the \(g_{55}\)-component of the metric and the wave function for matter in the sense of de Broglie and Schrödinger. However, as he remarked, his hopes had been shattered [189]. Klein’s papers were of importance: Remember that Kaluza had identified the fifth component of momentum with electrical charge [181], and five years later, in his papers of 1926 [185, 184], Klein had set out to quantise charge. One of his arguments for the unmeasurability of the fifth dimension rested on Heisenberg’s uncertainty relation for position and momentum applied to the fifth components. If the elementary charge of an electron has been measured precisely, then the fifth coordinate is as uncertain as can be. However, Klein’s argument is fallacious: He had compactified the fifth dimension. Consequently, the variance of position could not become larger than the compactification length \(l \sim 10^{-30}\) cm, and the charge of the electron thus could not have the precise value it has. In another paper, Klein suggested the idea that the physical laws in space-time might be implied by equations in five-dimensional space when suitably averaged over the fifth variable. He tried to produce wave-mechanical interference terms from this approach [187]. A little more than one year after his first paper on Kaluza’s idea, in which he had hoped to gain some hold on quantum mechanics, Klein wrote:

“Particularly, I no longer think it to be possible to do justice to the deviations from the classical description of space and time necessitated by quantum theory through the introduction of a fifth dimension.” ([189], p. 191, footnote)

At about the same time, W. Wilson of the University of London rederived the Schrödinger equation in the spirit of O. Klein and noted:

“Dr. H. T. Flint has drawn my attention to a recent paper by O. Klein [189] in which an extension to five dimensions similar to that given in the present paper is described. The corresponding part of the paper was written some time ago and without any knowledge of Klein’s work […].” ([420], p. 441)

Even Eddington ventured into the fifth dimension in an attempt to reformulate Dirac’s equation for more than one electron; he used matrix algebra extensively:

“The matrix theory leads to a very simple derivation of the first order wave equation, equivalent to Dirac’s but expressed in symmetrical form. It leads also to a wave equation which we can identify as relating to a system containing electrons with opposite spin. […] It is interesting to note the way in which the existence of electrons with opposite spins locks the ‘fifth dimension,’ so that it cannot come into play and introduce the absolute into a world of relation. The domain of either electron alone might be rotated in a fifth dimension and we could not observe any difference.” ([61], pp. 524, 542)

Eddington’s “pentads” built up from sedenions later were generalised by Schouten [307].

J. W. Fisher of King’s College re-interpreted Kaluza-Klein theory as presented in Klein’s third paper [187]. He proceeded from the special relativistic homogeneous wave equation in five-dimensional space and, after dimensional reduction, compared it to the Klein-Gordon equation for a charged particle. By making a choice different from Klein’s for a constant he rederived the result of de Broglie and others that null geodesics in five-dimensional space generate the geodesics of massive and massless particles in space-time [127].

Mandel of Petersburg/Leningrad believed that

“a consideration in five dimensions has proven to be well suited for the geometrical interpretation of macroscopic electrodynamics.” ([221], p. 567)

He now posed the question whether this would be the same for Dirac’s theory. Seemingly, he also believed that a tensorial formulation of Dirac’s equation was handy for answering this question and availed himself of “the tensorial form given by W. Gordon [151], and by J. Frenkel”Footnote 292 [140]. Mandel used, in five-dimensional space, the complex-valued tensorial wave function \({\Psi ^{ik}} = \psi {\gamma ^{ik}} + {\Psi ^{[ik]}}\) with a 5-scalar ψ. Here, he had taken up a suggestion J. Frenkel had developed during his attempt to describe the “rotating electron,” i.e., Frenkel’s introduction of a skew-symmetric wave function proportional to the “tensor of magneto-electric moment” \(m_{ik}\) of the electron by \(m_{ik}\psi = m_0\psi_{ik}\) [141, 140]. \(\Psi^{ik}\) may depend on \(x^5\); by taking Ψ periodic in \(x^5\), Mandel derived a wave equation “which can be understood as a generalisation of the Klein-Fock five-dimensional wave equation […].” He also claimed that the vanishing of ψ made \(M_5\) cylindrical (in the sense of Equation (109) [221]). As he had taken notice of a paper of Jordan [173] that spoke of the electromagnetic field as describing a probability amplitude for polarised photons, Mandel concluded that the amplitude of his Ψ-field might then represent polarised electrons as its quanta. However, he restricted himself to the consideration of classical one-particle wave equations because

“in some cases one can properly speak of a quasi-macroscopical one-body problem; think of a beam of monochromatic cathode-rays in an arbitrary external force-field.”Footnote 293

In a later paper, Mandel came back to his wave equation with a skew-symmetric part and gave it a different interpretation [222].

Unlike Klein, Mandel tried to interpret the wave function as a new discrete coordinate, an idea going back to Pauli [247]. He took “Dirac’s spin variable” ζ and the spatial coordinate \(x^5\) as a pair of canonically-conjugate operator-valued variables; ζ is linked to positive and negative elementary charge (of proton and electron) as its eigenvalues. In Mandel’s five-dimensional space, the fifth coordinate, as a “charge” coordinate, thus assumed only 2 discrete values ±e.

“This completely corresponds to the procedure of the Dirac theory, with the only difference that for Dirac the coordinate ζ could assume not 2 but 4 values; from our point of view this remains unintelligible.”Footnote 294 ([222], p. 785)

In following Klein, Mandel concluded from the Heisenberg uncertainty relations that

“[…] all possible values of this quantity [\(x^5\)] still remain completely undetermined such that all its possible values from −∞ to +∞ are of equal probability.”

This made sense because, unlike Klein, Mandel had not compactified the fifth dimension. His understanding of quantum mechanics must have been limited, though: Only two pages later he claimed that the canonical commutation relations \([\mathbf{p},\mathbf{q}] = \tfrac{\hbar }{i}\mathbf{1}\) could not be applied to his pair of variables due to the discrete spectrum of eigenvalues. He then essentially went over to the Weyl form of the operators p, q in order to “save” his argument [222].
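The step from the canonical commutation relations to a Weyl-type relation can be illustrated with a small numerical sketch. It assumes, purely for illustration, that a two-valued variable is realised by 2×2 matrices; the matrices `U`, `V`, `P`, `Q` and the helper names are hypothetical and not taken from [222]. For finite matrices the trace of any commutator vanishes, so \([\mathbf{p},\mathbf{q}] = \tfrac{\hbar }{i}\mathbf{1}\) cannot hold, whereas an exchange relation of the form UV = ωVU can, here with ω = −1.

```python
# Sketch: a two-valued "coordinate" cannot obey [p, q] = (hbar/i) 1,
# while a Weyl-type relation U V = omega V U can be satisfied.
# The 2x2 matrices below are illustrative assumptions, not Mandel's operators.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Return a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

# "Clock" matrix: diagonal, with the two eigenvalues +1 and -1.
U = [[1, 0], [0, -1]]
# "Shift" matrix: exchanges the two eigenvectors of U.
V = [[0, 1], [1, 0]]

# Weyl-type relation with omega = exp(2 pi i / 2) = -1.
UV, VU = matmul(U, V), matmul(V, U)
weyl_holds = all(UV[i][j] == -VU[i][j] for i in range(2) for j in range(2))

# The trace of a commutator of finite matrices is zero, whereas the trace
# of (hbar/i) times the 2x2 unit matrix is 2 hbar / i, which is not zero.
P = [[2, 3j], [-3j, 5]]          # arbitrary Hermitian test matrix
Q = [[1, 1 + 2j], [1 - 2j, -4]]  # arbitrary Hermitian test matrix
trace_of_commutator = trace(commutator(P, Q))

print(weyl_holds, trace_of_commutator)
```

The choice of Pauli-type matrices is only the simplest two-level realisation; any pair of finite matrices would show the same vanishing trace.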

Another one of the many versions of “Dirac’s equation” was presented, in December 1930, by Zaycoff who worked both in the framework of Einstein’s teleparallel theory and of Kaluza’s five-dimensional space. His Lagrangian is complicatedFootnote 295,

$$ M = - i\tilde \psi {\gamma ^\rho }\frac{{\partial \psi }}{{\partial {x^\rho }}} + \frac{i}{2}{S_m}{J_m} + \frac{1}{{24}}{S_{klm}}{J_{klm}} + a{f_m}{J_m} + \frac{1}{8}{F_{km}}{J_{km}} + \mu {J_0}, $$

where summation is implied and \({S_{kl}}^m\) is the torsion tensor, \(f_m\) the electromagnetic vector potential, and \(F_{ik}\) the electromagnetic field. Note that the Dirac current \({J_m}: = \tilde \psi {\gamma _m}\psi\) couples to both the torsion vector and the 4-potential. The remaining variables in (198) are \({J_{kl}}: = i\tilde \psi {\gamma _k}{\gamma _l}^\dag {\gamma _0}\psi\), with k ≠ l, and \({J_{klm}}: = i\tilde \psi {\gamma _k}{\gamma _l}^\dag {\gamma _m}\psi\), with k ≠ l ≠ m [434].

While Zaycoff submitted his paper, Schouten lectured at MIT and, among other things, showed “how the mass-term in the Dirac equations comes in automatically if we start with a five-dimensional instead of a four-dimensional Riemannian manifold” ([306], p. 272). He proved a theorem:

The Dirac equations for Riemannian space-time with electromagnetic field and mass can be written in the form of equations without field or mass \({\alpha ^b}{\nabla _b}\psi = 0\) in an \(R_5\).

Here \(\alpha^b\) is the set of Dirac numbers defined by \(\alpha_{(a}\alpha_{b)} = g_{ab}\), \((\alpha_a\alpha_b)\alpha_c = \alpha_a(\alpha_b\alpha_c)\) with a, b, c = 0, … , 4, and \(\nabla_b\) the covariant spinor derivative defined by him.

As we mentioned above (cf. Section 6.3.2), another approach to the matter within projective geometry was taken by Pauli with his student J. Solomon [253]. After these two joint publications, marred by a calculational error, Pauli himself laid out his version of the projective theory in two installments with the first, as a service to the community, being a pedagogical presentation of the formalism connected with projective geometry [249]. The second paper, again, has the application to Dirac’s equation as a prime motivation:

“The following deductions are intended to show […] that the unifying combination of the gravitational and the electromagnetic fields, by projective differential geometry with the aid of five homogeneous coordinates, is a general method whose range reaches beyond classical field-physics and into quantum theory. Perhaps, the hope is not unjustified that the method will stand the test as a general framework for the laws of physics also with regard to a future physical and conceptual improvement of the foundations of Dirac’s theory.”Footnote 296 ([250], pp. 837–838)

Pauli started with the observation that the group of orthogonal transformations in five-dimensional space had an irreducible, four-dimensional matrix representation satisfying

$$ \begin{array}{*{20}{c}} {{\alpha _\mu }{\alpha _\nu } + {\alpha _\nu }{\alpha _\mu } = 2{g_{\mu \nu }} \cdot 1,\;\;\;\;\;}&{\mu ,\nu = 1, \ldots ,5,} \end{array} $$

where \({\alpha _\mu }\) are 4×4 matrices given at the end of Section 2.1.5 in a different representation \( {\alpha _\mu } = \left( {\begin{array}{*{20}{c}} {{\sigma _\mu }}&0\\ 0&{ - {\sigma _\mu }} \end{array}} \right) \) with μ = 1, 2, 3, \( {\alpha _4} = \left( {\begin{array}{*{20}{c}} 1&0\\ 0&1 \end{array}} \right) \), and augmented by \( {\alpha _5} = \left( {\begin{array}{*{20}{c}} 0&1\\ 1&0 \end{array}} \right) \). This had been known also to Eddington [61] and Schouten [303]. He then introduced projective spinors depending on five homogeneous coordinates without using bein-quantities. He followed the methods of Schrödinger and Bargmann [319, 6], i.e., used the existence of a matrix A such that \(\mathbf{A}{\alpha _\mu }\) is Hermitian. The transformation laws of 4-spinors Ψ and matrices \({\alpha _\mu }\) are coupled:

$$ \Psi ' = {S^{ - 1}}\Psi , $$
$$ {\alpha '_\mu } = {S^{ - 1}}{\alpha _\mu }S, $$
$$ A' = {S^\dag }AS. $$

For transformations in the space of homogeneous coordinates \({X'^\mu } = {a^\mu }_\nu {X^\nu }\) such that \({\alpha _\nu } = {a^\mu }_\nu {\alpha '_\mu }\), the quantity \({a_\mu }: = {\Psi ^\star}A{\alpha _\mu }\Psi\) transforms, for fixed \({\alpha _\mu }\), under changes of the spin frame, Equation (200), as a covariant vectorFootnote 297.
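The anticommutation relation for the five matrices \({\alpha _\mu }\) can be checked numerically. The following is a minimal sketch that assumes a flat metric \(g_{\mu\nu} = \delta_{\mu\nu}\) and one concrete representation built from Pauli matrices. In particular, the off-diagonal block form used for the fourth matrix is an assumption made here so that all five matrices mutually anticommute, and the helper names are hypothetical.

```python
# Sketch: five 4x4 matrices with alpha_mu alpha_nu + alpha_nu alpha_mu = 2 delta_mu_nu * 1.
# Assumptions (not from the source): flat metric g = delta, and the specific
# representation below, including the block form chosen for the fourth matrix.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product: the (i, j) block of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

ALPHAS = [
    kron(S3, S1),  # alpha_1 = diag(sigma_1, -sigma_1)
    kron(S3, S2),  # alpha_2 = diag(sigma_2, -sigma_2)
    kron(S3, S3),  # alpha_3 = diag(sigma_3, -sigma_3)
    kron(S2, I2),  # alpha_4: assumed form, blocks -i*1 and +i*1 off the diagonal
    kron(S1, I2),  # alpha_5: unit blocks off the diagonal
]

def clifford_defect(alphas):
    """Largest |entry| of {alpha_mu, alpha_nu} - 2 delta_mu_nu * 1 over all pairs."""
    worst = 0.0
    for m, a in enumerate(alphas):
        for n, b in enumerate(alphas):
            ab, ba = matmul(a, b), matmul(b, a)
            for i in range(4):
                for j in range(4):
                    target = 2 if (m == n and i == j) else 0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - target))
    return worst

print(clifford_defect(ALPHAS))
```

A vanishing defect means that the relation holds for every pair of indices; any matrix that commutes with the others, such as the unit matrix, would produce a non-zero defect in this check.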

Pauli criticised an analogous attempt at formulating Dirac’s equation with the help of five homogeneous coordinates by Schouten and van Dantzig [317, 308, 309] as being “difficult to understand and less than transparent”Footnote 298. A projective spinor is defined via

$$\Psi = \psi {F^l},$$

where ψ is a normal (“affine”) spinor (degree of homogeneity 0) and F a real scalar of (homogeneity) degree 1, i.e., \(F = {X^\mu }\tfrac{{\partial F}}{{\partial {X^\mu }}}\). There exist two (related) spin-connections: \(\Lambda_k\) for projective spinors Ψ and \({\mathop \Lambda \limits^{\rm{R}} _k}\) for spinors ψ.
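The homogeneity degrees entering this definition can be illustrated by a scaling check: if ψ has degree 0 and F degree 1, then \(\Psi = \psi F^l\) satisfies \(\Psi(tX) = t^l\,\Psi(X)\) for t > 0. The sketch below assumes hypothetical choices for F and ψ (a single complex-valued component stands in for the spinor) and an arbitrary purely imaginary exponent l; none of these choices is taken from Pauli’s paper.

```python
import math

def F(X):
    """A real scalar of homogeneity degree 1 (illustrative choice: Euclidean norm)."""
    return math.sqrt(sum(x * x for x in X))

def psi(X):
    """A quantity of homogeneity degree 0 (illustrative choice: a coordinate divided by F)."""
    return X[0] / F(X)

def Psi(X, l):
    """Psi = psi * F**l, with a possibly complex exponent l."""
    return psi(X) * F(X) ** l

def scaling_defect(X, t, l):
    """|Psi(tX) - t**l * Psi(X)|, which vanishes if Psi is homogeneous of degree l."""
    tX = [t * x for x in X]
    return abs(Psi(tX, l) - t ** l * Psi(X, l))

# Five homogeneous coordinates, chosen arbitrarily for the check.
X0 = [1.0, -2.0, 0.5, 3.0, 1.5]
print(scaling_defect(X0, t=2.5, l=0.7j))
```

The defect should vanish up to rounding errors; Euler’s relation \(X^\mu \partial \Psi / \partial X^\mu = l\,\Psi\) is the differential form of the same statement.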

Pauli’s Dirac equation, derived from a Lagrangian, looked in five dimensions like

$${\alpha ^\mu }({\Psi _{;\mu }} + k{X_\mu }\Psi ) = 0$$

with \(k = - \tfrac{{imc}}{h} - \tfrac{{ie}}{{hc}}\tfrac{c}{{\sqrt \kappa }}\tfrac{1}{r}\), and \(l = + \tfrac{{ie}}{{hc}}\tfrac{c}{{\sqrt \kappa }}\tfrac{1}{r}\). The covariant derivative is formed with the spin connection \(\Lambda_k\). An involved calculation leads to Dirac’s equation in four dimensions:

$$ {\alpha ^k}\left( {\frac{{\partial \psi }}{{\partial {x^k}}} + {{\mathop \Lambda \limits^{\rm{R}} }_k}\psi - \frac{{ie}}{{hc}}{\Phi _k}\psi } \right) - i\frac{{mc}}{h}{\alpha _0}\psi + \frac{r}{8}\frac{{\sqrt \kappa }}{c}{F_{kl}}\;{\alpha _0}\;{\alpha ^{[kl]}}\psi = 0, $$

with the numerical factor r, the electrical 4-potential \(\Phi_k\), and the electromagnetic field tensor \(F_{kl}\); furthermore, \({\alpha ^{[kl]}} = {\alpha ^{[k}}{\alpha ^{l]}}\).

Pauli succeeded also in formulating a five-dimensional energy-momentum tensor containing, besides the four-dimensional energy-momentum tensor, the four-dimensional Dirac current vector. At the end of his paper Pauli stressed the

“more provisional character of his 5-dimensional-projective form of Dirac’s theory. […] In contrast to the joinder of the gravitational and electromagnetic fields, a direct logical coupling of the matter-wave-field with these has not been achieved in the form of the theory developed here.”

7.3 Einstein, spinors, and semi-vectors

Ehrenfest, even after van der Waerden’s paper on spinor analysis [368], in 1932 pressed Einstein to think about a simple geometric interpretation of spinors. Einstein responded, together with his assistant Mayer, by introducing the concept of semi-vector, seemingly more natural to him than a spinor, and also more general:

“In spite of the great importance which the spinor concept, as introduced by P a u l i and D i r a c, has obtained in molecular physics, one cannot claim that the analysis of this concept up to now satisfies all justified demands. Our efforts have led to a derivation corresponding, according to our opinion, to all demands for clarity and naturalness and avoiding completely any not so transparent artifice. Thereby, […] the introduction of novel quantities was shown to be necessary, the ‘semi-vectors’, which include spinors but possess a clearly more transparent transformation character than spinors.”Footnote 299 ([110], p. 522)

In this first publication on the subject, Einstein and Mayer explicitly referred to the paper by Infeld and van der Waerden, of which they had received a copy several months before publication ([167], and [110], p. 25, footnote). Apparently, Einstein found the reconstruction of the spinor concept in his paper more “clear and natural” than Infeld and van der Waerden’s. Nevertheless, the approach and notation of Infeld and van der Waerden became the accepted one by physicists.

About three months before the first paper on semi-vectors was published, Einstein wrote to Besso:

“I work with my Dr. Mayer on the theory of spinors. We already could clear up the mathematical relations. A grasp on the physics is far away, farther than one thinks at present. In particular, I still am convinced that the attempt at an essentially statistical theory will fail.”Footnote 300 ([99], p. 291)

Besides the aspired-to clarity and simplicity, Einstein’s main hope was that, with his semi-vector system of equations replacing Dirac’s equation, it might be possible to explain the existence of elementary particles with opposite charge and unlike mass, i.e., of electron and proton. As noted before, he had not been able to solve this problem by his affine field theory of 1923 (cf. Section 4.3.2), nor by the approaches to unified field theory that followed. As it turned out, the positron was discovered at about the same time, and the problem dissolved while Einstein and Mayer began to reformulate the spinor concept. Einstein again seems to have been fully convinced that his new concept of “semi-vector” was superior to the spinor concept. On 7 May 1933, he wrote to De Haas:

“Scientifically Mayer and I have found one very natural generalisation of Dirac’s equation which makes it comprehensible, that there are two understandable elementary masses, while there is only one electric charge.” [69]

In the first paper in the reports of the Berlin Academy, the mathematical foundations of the semi-vector formalism are developed [110]. The basic idea of Einstein and Mayer is the possibility of a decomposition of any (proper) Lorentz transformation described by a real matrix D into a product BC of a pair of complex-conjugate, commuting matricesFootnote 301 B and C. The transformations represented by B or C form a group isomorphic to the Lorentz group. In terms of infinitesimal Lorentz transformations given by an antisymmetric tensor \(\omega_{ik}\), this amounts to the decomposition into a self-dual and an anti-self-dual part: \({\omega _{ik}} = \tfrac{1}{2}\left( {{\omega _{ik}} + i\omega _{ik}^*} \right) + \tfrac{1}{2}\left( {{\omega _{ik}} - i\omega _{ik}^*} \right)\), with the dualFootnote 302 defined by \(\omega _{ik}^*: = \tfrac{1}{2}\sqrt g { \epsilon _{ikjl}}\;{\omega ^{jl}}\).
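The decomposition of an antisymmetric tensor into two parts that are eigen-tensors of the duality operation can be verified numerically. The sketch below assumes a flat Minkowski metric diag(1, −1, −1, −1), so that the square root of |g| equals 1, and takes the dual as \(\omega^*_{ik} = \tfrac{1}{2}\epsilon_{ikjl}\,\omega^{jl}\); the test tensor `omega` and all helper names are hypothetical. Under these assumptions the double dual returns \(-\omega_{ik}\), and the two parts \(\tfrac{1}{2}(\omega_{ik} \pm i\omega^*_{ik})\) are reproduced by the duality operation up to the factors ∓i.

```python
from itertools import permutations

ETA = [1, -1, -1, -1]  # assumed flat Minkowski metric, so sqrt|g| = 1

def parity(p):
    """Sign of a permutation of (0, 1, 2, 3), obtained by counting inversions."""
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if p[a] > p[b])
    return -1 if inversions % 2 else 1

# Non-vanishing components of the Levi-Civita symbol eps_{ikjl}.
EPS = {p: parity(p) for p in permutations(range(4))}

def dual(w):
    """(w*)_{ik} = 1/2 eps_{ikjl} w^{jl}, with indices raised by ETA."""
    out = [[0j] * 4 for _ in range(4)]
    for (i, k, j, l), sign in EPS.items():
        out[i][k] += 0.5 * sign * ETA[j] * ETA[l] * w[j][l]
    return out

def combine(a, ca, b, cb):
    """Return the tensor ca * a + cb * b."""
    return [[ca * a[i][k] + cb * b[i][k] for k in range(4)] for i in range(4)]

def max_abs_diff(a, b):
    """Largest absolute difference between corresponding components."""
    return max(abs(a[i][k] - b[i][k]) for i in range(4) for k in range(4))

# Arbitrary real antisymmetric test tensor omega_{ik}.
omega = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
star = dual(omega)
plus = combine(omega, 0.5, star, 0.5j)    # 1/2 (omega + i omega*)
minus = combine(omega, 0.5, star, -0.5j)  # 1/2 (omega - i omega*)

print(max_abs_diff(dual(star), combine(omega, -1, omega, 0)))   # double dual vs -omega
print(max_abs_diff(dual(plus), combine(plus, -1j, plus, 0)))    # dual(plus) vs -i * plus
print(max_abs_diff(dual(minus), combine(minus, 1j, minus, 0)))  # dual(minus) vs +i * minus
print(max_abs_diff(combine(plus, 1, minus, 1), omega))          # plus + minus vs omega
```

Which of the two parts is called self-dual depends on the sign conventions for the metric and the Levi-Civita symbol; the check only confirms that each part is an eigen-tensor of the duality operation and that the two parts add up to the original tensor.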

Contravariant semi-vectors of the first and second kind now are defined by their transformation laws: \({\rho ^{i'}} = {b^i}_k{\rho ^k}\) and \({\sigma ^{i'}} = {c^i}_k{\sigma ^k}\), where \({b^i}_k\), \({c^i}_k\) are the components of B, C. For a real Lorentz transformation D, \({\bar b^i}_{\;\;k} = {c^i}_k\) must hold. As B, C are both also Lorentz transformations,

“the metric tensor \(g_{ik}\) is also a semi-vector of 1st kind (and of 2nd kind) with transformation-invariant components.”

Thus it can be used for raising and lowering indices of semi-vectors ([110], p. 535).

The system of equations intended as a replacement of the Dirac equation appears only in the second publication [111]. A Lagrange function for the semi-vector is found and the generalised Dirac equations for the semi-vectors ψ, χ look likeFootnote 303:

$$ \begin{array}{*{20}{c}} {{E^{r\sigma \tau }}\left( {\frac{{\partial {\psi _\sigma }}}{{\partial {x^r}}} - i \epsilon {\psi _\sigma }{\phi _r}} \right) = {{\bar C}^{\tau \rho }}{\chi _\rho },\;\;\;\;\;}&{{