The CPT theorem is a well-known and well-established fundamental result in relativistic quantum field theory (QFT), stating that any such theory will be invariant under a sequence of transformations consisting of time reversal (T), spatial inversion (P for parity), and finally particle–antiparticle exchange (C for charge conjugation; it is not clear which of the two words the C stands for, if any). This paper attempts to answer a very simple question: How did they come up with that?

One might suspect, given the fact that (a) the sequence of transformations appears rather unintuitive at first glance and (b) the CPT theorem represents one of the few exact results in quantum field theory, that the CPT theorem emerged from the attempts at constructing a rigorous, formal mathematical formulation of QFT (axiomatic QFT) and that CPT was only imbued with physical significance through its appearance as an invariant transformation in the context of the theorem. And indeed, this is how it was soon presented. When Jost [18] gave the first axiomatic proof in 1957, he called it a “strange theorem,” whose connection to the foundations of QFT needed to be clarified. The appropriation of the CPT theorem by axiomatic field theory was fully achieved through Raymond Streater and Arthur Wightman’s textbook “PCT, Spin and Statistics, and All That” [38], where already in the title the CPT theorem was framed as one of the central successes of the axiomatic approach to field theory.

In this paper, we want to trace the roots of the CPT theorem, which lie outside the axiomatic approach and culminated with the first explicit formulations of the theorem in 1955. It is hoped that this reconstruction will provide a somewhat different, and less abstract, perspective on the CPT theorem. Indeed, we will see not only less formalistic proofs, but also how the notion of CPT invariance arose from physical and phenomenological considerations already before the establishment of the theorem. These considerations centered around one question: how to formulate time reversal in relativistic QFT, or rather, which transformation should be identified as time reversal. This was important for several reasons: Time reversal invariance was important in the context of thermodynamics and also played a role as a constraint in particle-physics model building. But most prominently the question of time reversal invariance arose in the search for the physical principles underlying the connection between spin and statistics.

This paper can thus be considered a sequel of sorts to a paper on the history of the spin-statistics theorem that one of the authors (AB) published in this journal several years ago [4]. This paper consequently starts where that paper ends, with the first proofs of the spin-statistics theorem in 1940. In Sect. 1, we will discuss Belinfante’s proof of the spin-statistics theorem, which takes as its premise invariance under charge conjugation C, contrasting with Pauli’s better-known proof, which made use of discrete transformations (in this case PT) in a rather different manner. In Sect. 2, we discuss Schwinger’s derivation of the spin-statistics theorem, which did not use charge conjugation invariance as its premise, but rather time reversal invariance. In Sect. 3, we then observe, in the work of Gerhart Lüders, for the first time a reversal of the logic, with discrete symmetries (in this case CT) being derived from the spin-statistics theorem, for the purpose of using the symmetries to constrain particle-physics model building. We then return to Pauli in Sect. 4, who took Lüders’s results and turned them into the first mature version of the CPT theorem. The reader will note that the story leading up to this formulation has a rather anecdotal feel, as the results related to the CPT theorem arose as incidental spin-offs from other research programs. The central role of the CPT theorem was only realized after the discovery of parity violation in 1957, and we conclude with an outlook on this transformation in Sect. 5.

1 The first proofs of the spin-statistics theorem

Concluding a decade-long development [4], the end of the 1930s saw the first general proofs of the spin-statistics theorem by a handful of physicists: the most famous one by Wolfgang Pauli, a more formalistic proof by his assistant Markus Fierz, and two proofs by PhD students, Fredrik Belinfante at Utrecht University and Jacobus Stephanus de Wet at Princeton [10]. Of these, the proofs by Pauli and Belinfante are of central importance to our story, because they invoked discrete symmetries related to the CPT theorem. While they have this feature in common, the two proofs were also essentially different in their ambitions, making comparisons between them, which were performed both by the historical actors themselves [33] and in the secondary literature ([10], pp. 306–309), rather problematic.

Pauli started his proof [31] by observing a manifest discrete symmetry of all (non-interacting) classical relativistic field equations. From the existence of this symmetry, he had concluded that there was a general doubling of the solutions of these equations, where in all cases these doubled solutions were physically problematic, either because they had a negative charge-current density (integer spin) or because they had a negative-energy density (half-integer spin). He then presented field quantization as a solution to these anomalies, and the physical principles he invoked in this argument (microcausality, positive-definiteness of the energy) were supposed to derive the commutation relations in their totality (including their singularity structure), not just answer the question whether one should use commutators or anti-commutators.

Belinfante [2] in turn had postulated a discrete symmetry for any interacting quantum field theory. From the demand that all observables (in particular the charge-current density) be invariant under this symmetry, he had been able to derive one very specific feature of the field-theoretic commutation relations, namely whether the field operators involved should commute or anti-commute when taken at the same point in space-time. From this, he could deduce whether they should be quantized using commutators or anti-commutators, but the other features of the commutators, in particular their causal properties and singularity structure, were relegated to a more general discussion of the properties of relativistic quantum field theory and did not enter his discussion of the spin-statistics connection.

The discrete symmetry that Pauli had used was the product of P and T. This product itself had not been extensively invoked before, but Pauli considered it the natural relativistic extension of the spatial symmetry P and the temporal symmetry T, both of which had of course a long tradition. Indeed, it seems safe to say that most physicists at the time would have unquestioningly assumed that both were good symmetries of microscopic physics. Their formulation for tensorial quantities was unproblematic, following as it did from the transformation of the basic coordinate four-vector, where P changed the sign of the three spatial components, while T changed the sign of the temporal component. The formulation of P and T for half-integer spin fields had been pioneered by Pauli himself in March 1935, in the context of a lecture series he held at the Institut Henri Poincaré in Paris on the foundations of relativistic quantum theory.Footnote 1 He had derived the P and T transformations for Dirac spinors by demanding that the Dirac equation be invariant under these transformations; in particular, this meant demanding that the derivative term \(\gamma _{\mu } \partial ^{\mu }\) transform as a scalar. This in turn meant that under the transformation \(\psi \rightarrow \Lambda \psi \), where \(\Lambda \) is the Dirac matrix representing P or T, the Dirac matrices transform as a vector, i.e., that

$$\begin{aligned} \Lambda ^{-1} \gamma _0 \Lambda= & {} - \gamma _0 \nonumber \\ \Lambda ^{-1} \gamma _i \Lambda= & {} \gamma _i \end{aligned}$$
(1.1)

for T and

$$\begin{aligned} \Lambda ^{-1} \gamma _0 \Lambda= & {} \gamma _0 \nonumber \\ \Lambda ^{-1} \gamma _i \Lambda= & {} - \gamma _i \end{aligned}$$
(1.2)

for P. From this condition, Pauli obtained \(\Lambda =\gamma _0\) for P and \(\Lambda = \gamma _5 \gamma _0\) for T, up to an undetermined phase, which at the time he had simply set to 1. Giulio Racah [35] had then demonstrated that in order for complex conjugate spinors to also transform correctly under P and T, a factor of i needs to be included in P. Pauli’s PT for Dirac spinors was thus equivalent to the matrix transformation \(i \gamma _5\) and could easily be extended to arbitrary classical half-integer spin fields. It was then a manifest symmetry of the non-interacting relativistic field equations, on which Pauli could build his arguments, as the PT transformation changed the sign of the energy-momentum tensor and thus revealed the presence of negative-energy solutions in the classical field theory. Once the PT transformation had done its work in demonstrating the general presence of such solutions, and thus the necessity of (ultimately fermionic) field quantization, Pauli did not need to invoke it any further. In particular, the persistence of this symmetry in the quantum theory was neither postulated nor proven, as it was irrelevant to Pauli’s further argument concerning the exact form of the commutation relations.
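As a consistency check, the matrices quoted above can be inserted into conditions (1.1) and (1.2). The sketch below assumes the normalization \(\gamma _0^2 = \gamma _5^2 = 1\) and that \(\gamma _5\) anticommutes with every \(\gamma _{\mu }\); these conventions are assumptions of the sketch and are not spelled out in the text.

$$\begin{aligned} \text{P:}\quad \gamma _0^{-1} \gamma _0 \gamma _0&= \gamma _0,&\gamma _0^{-1} \gamma _i \gamma _0&= - \gamma _i \gamma _0^{-1} \gamma _0 = - \gamma _i, \\ \text{T:}\quad (\gamma _5 \gamma _0)^{-1} \gamma _0 (\gamma _5 \gamma _0)&= \gamma _0 \gamma _5 \gamma _0 \gamma _5 \gamma _0 = - \gamma _0,&(\gamma _5 \gamma _0)^{-1} \gamma _i (\gamma _5 \gamma _0)&= \gamma _0 \gamma _5 \gamma _i \gamma _5 \gamma _0 = \gamma _i. \end{aligned}$$

Under the same assumptions, applying P (with the additional factor of i) and then T gives \(\gamma _5 \gamma _0 \cdot i \gamma _0 = i \gamma _5\), the combined matrix named above.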

The role of discrete symmetries in Belinfante’s proof was quite different. For starters, the symmetry he invoked, charge conjugation C, was by no means a well-established symmetry of microscopic physics with a pedigree in classical physics. It had been formulated only in 1937, by Belinfante’s supervisor Hans Kramers.Footnote 2 Kramers, in turn, had developed the notion of charge conjugation in order to shed light on a paper by Majorana [27]: Majorana had studied the Dirac equation for an uncharged particle in a basis where the Dirac operator is real; in this basis, the Dirac equation had purely real (and purely imaginary) solutions, as opposed to Dirac’s basis, where all solutions are necessarily complex. Majorana could thus impose a reality condition on the Dirac field operator (which is a solution of the Dirac equation), which implied that the operator creating a particle with positive energy E and some momentum p was identical with the operator annihilating a particle with negative energy \(-E\) and momentum \(-p\). Since the latter operator was (in hole theory) identified with the creation of a positive-energy antiparticle, this could be read as identifying a particle and its antiparticle; for Majorana, it meant the elimination of the entire concept of antiparticle, since the negative-energy states, which had necessitated the introduction of that concept, could be entirely removed from the theory, at least for neutral particles.

Kramers now aimed to understand how Majorana’s reality condition could be extended to an arbitrary basis of the Dirac matrices. To this end, he invented the charge conjugation transformation, which consists of the action of a specific Dirac matrix C (which differs from basis to basis) and the complex conjugation of the spinor. With this transformation, Majorana’s condition in an arbitrary basis could be written as

$$\begin{aligned} \psi = \psi ^C = C \psi ^{*} \end{aligned}$$
(1.3)

i.e., the Dirac spinor was to be equal to its charge conjugate. The name charge conjugation derived from the fact that when applied to the Dirac wave function of a charged particle it gave a solution to the Dirac equation for a particle with opposite electric charge. This is ensured by the matrix C obeying the relation

$$\begin{aligned} C \gamma _{\mu }^{*} = - \gamma _{\mu } C, \end{aligned}$$
(1.4)

since then complex conjugation and left-multiplication with C transform the Dirac equation into a Dirac equation for \(\psi ^C\) with the charge reversed. A matrix with this property had been introduced by Pauli in his already mentioned 1935 Paris lectures, which were the foundational treatise on the \(\gamma \) matrix calculus. He had introduced three matrices A, B, and C which, when commuted with the gamma matrices turned them into their hermitian conjugate, transpose, and complex conjugate, respectively (with some signs depending on conventions for the Minkowski metric).Footnote 3 From the interrelations between these various transformations, Pauli had deduced a further essential property of the matrix C ([30], p. 130):

$$\begin{aligned} C C^{*} = 1. \end{aligned}$$
(1.5)
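The way in which relation (1.4) produces the charge-reversed equation can be followed step by step. The sketch below assumes the Dirac equation in the form shown in the first line, with real \(e\), \(m\), and \(A^{\mu }\); the sign and metric conventions are chosen here for illustration and may differ from those used by Kramers and Pauli.

$$\begin{aligned} \left[ i \gamma _{\mu } (\partial ^{\mu } + i e A^{\mu }) - m \right] \psi&= 0, \\ \left[ - i \gamma _{\mu }^{*} (\partial ^{\mu } - i e A^{\mu }) - m \right] \psi ^{*}&= 0 \quad \text {(complex conjugation)}, \\ \left[ i \gamma _{\mu } (\partial ^{\mu } - i e A^{\mu }) - m \right] C \psi ^{*}&= 0 \quad \text {(left-multiplication by } C \text {, using Eq. 1.4)}. \end{aligned}$$

The last line is the original equation with \(e \rightarrow -e\) and \(\psi \rightarrow \psi ^C\). Property (1.5) then makes charge conjugation an involution, since \((\psi ^C)^C = C (C \psi ^{*})^{*} = C C^{*} \psi = \psi \).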

In order to turn charge conjugation from an operation defined for Dirac spinors into a central element of quantum field theory, Belinfante in his thesis chose to reconstruct relativistic field theory not with vector (Lorentz) indices, nor with Weyl spinor indices, but with Dirac spinor indices. All objects constructed in this manner would then have well-defined transformation properties under charge conjugation (and also parity). Duck and Sudarshan [10] heavily criticize Belinfante for this idiosyncratic approach, which he named “undor calculus” because not vectors or (Weyl) spinors are used as constitutive elements, but Dirac wave (Latin: unda)Footnote 4 functions.Footnote 5 But it did allow him to define for the first time the charge conjugation of a vector.

In Belinfante’s formalism, a four-vector \(V_{\mu }\) arises as part of the reducible “undor of the second rank,” which contains in addition a scalar, a pseudoscalar, a pseudovector, and an antisymmetric tensor, i.e., the usual covariant quantities one can construct from two Dirac spinors. The four-vector is \({\overline{\psi }} \gamma _{\mu } \psi \) and under charge conjugation consequently transforms into

$$\begin{aligned} V^C_{\mu } = \psi ^T C^{\dagger } \gamma _0 \gamma _{\mu } C \psi ^{*} = \psi ^T \gamma _0^{*} \gamma _{\mu }^{*} \psi ^{*} = V^{*}_{\mu }. \end{aligned}$$
(1.6)

In fact, Belinfante obtained the result that for all components of the undor of the second rank (and thus for all bosons), charge conjugation was simply equal to complex conjugation. With charge conjugation now a transformation applicable in all relativistic field theories, Belinfante postulated a novel invariance property for quantum field theory in general: reverse the sign of all charges and replace all fields by their charge conjugate:

By way of hypothesis one might assume that such a symmetry is a fundamental property of nature. We shall call this possible property the “charge-invariance” of the physical world... ([2], p. 34)

From this postulate, he could then deduce that field operators at the same space-time point must commute or anti-commute, depending on their spin. Take the interaction of a Dirac spinor field with the (real, i.e., charge conjugation invariant) electromagnetic field, for example,

$$\begin{aligned} e {\overline{\psi }} \gamma _{\mu } \psi A^{\mu }. \end{aligned}$$
(1.7)

Under Belinfante’s transformation, this goes into

$$\begin{aligned} (-e) \overline{\psi ^C} \gamma _{\mu } \psi ^C A^{\mu }. \end{aligned}$$
(1.8)

The two expressions are equal only if the \(\psi \) operators anti-commute, for then:

$$\begin{aligned} (-e) \overline{\psi ^C} \gamma _{\mu } \psi ^C A^{\mu }= & {} (-e) \psi ^T C^{\dagger } \gamma _0 \gamma _{\mu } C \psi ^{*} A^{\mu } \nonumber \\= & {} e \psi ^{\dagger } C^T \gamma _{\mu }^T \gamma _0^T C^{*} \psi A^{\mu }. \end{aligned}$$
(1.9)

Using Eqs. 1.4 and 1.5, as well as the standard hermiticity normalization of the Dirac matrices,Footnote 6 the matrix reduces to \(\gamma _0 \gamma _{\mu }\) as it should.
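One possible route to this reduction is sketched here. It assumes, in addition to Eq. 1.4, that \(C\) is unitary and that the hermiticity normalization takes the form \(\gamma _{\mu }^{\dagger } = \gamma _0 \gamma _{\mu } \gamma _0\) with \(\gamma _0^2 = 1\); both are assumptions of this sketch rather than statements taken from Belinfante.

$$\begin{aligned} C^T \gamma _{\mu }^T&= - \gamma _{\mu }^{\dagger } C^T \quad \text {(transpose of Eq. 1.4)}, \\ C^T \gamma _{\mu }^T \gamma _0^T C^{*}&= - \gamma _{\mu }^{\dagger } C^T \gamma _0^T C^{*} = \gamma _{\mu }^{\dagger } \gamma _0^{\dagger } C^T C^{*} = \gamma _{\mu }^{\dagger } \gamma _0^{\dagger }, \\ \gamma _{\mu }^{\dagger } \gamma _0^{\dagger }&= \gamma _0 \gamma _{\mu } \gamma _0 \gamma _0 = \gamma _0 \gamma _{\mu }. \end{aligned}$$

The second line uses \(C^T C^{*} = (C^{\dagger } C)^T = 1\), which is where unitarity enters.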

Pauli agreed with Belinfante’s proof in general, but in their joint publication emphasized the superiority of his proof. After all, from his far more plausible premises the entire commutation relations could be deduced; his proof thus required far less of the wider framework of quantum field theory to support it.Footnote 7 But Belinfante had established that the elementary spin-statistics connection could be deduced from the novel demand of charge invariance. It was thus quite striking when, in 1951, Schwinger claimed to be able to deduce the spin-statistics connection from a far better established discrete symmetry, namely time reversal invariance.

2 Schwinger and time reversal in QFT

In the early 1950s, in the wake of the renormalization of QED, Julian Schwinger undertook a total reformulation of quantum field theory, which he presented in a series of papers, beginning with [37]. Schwinger’s reformulation of QFT has some similarities to the later axiomatic program in that it sought to reconstruct QFT from the ground up. As opposed to axiomatic QFT, however, Schwinger’s program was not characterized by mathematical rigor or great abstraction; it was rather intended as a systematization (and differential formulation) of Feynman’s path integrals,Footnote 8 which had been successful precisely for their computational power in calculating scattering amplitudes.

Feynman’s method involved no formal quantization procedure. Pushing this idea further, Schwinger’s entire program was built on the idea of eliminating the two-step process of setting up a classical (field) theory and then (canonically) quantizing it. He wanted to replace this process with a coherent framework for field theory that was quantum from the outset. It was thus essential to Schwinger’s program to derive the commutation relations within his framework, rather than postulate them. Pauli’s two-step approach of proving the spin-statistics theorem, beginning with the shortcomings of the classical theory, was thus out of the question for Schwinger, and the proof he ended up giving was far closer to Belinfante’s version. However, the invariance property he took as his starting point was not framed as a novel charge invariance, but rather as the established property of time reversal invariance.Footnote 9

Schwinger was the first physicist to investigate how to implement time reversal in a theory that was both quantum—unlike Pauli—and a field theory—unlike Eugene Wigner, who had written a foundational paper on time reversal in non-relativistic quantum mechanics [48]. Wigner had shown that time reversal in a quantum theory involves more than just the reflection of the time coordinate. This is easily seen by considering the Heisenberg equation of motion for some operator A:

$$\begin{aligned} \left[ H, A \right] = - i \hbar \frac{\partial A}{\partial t}. \end{aligned}$$
(2.1)

Under reversal of coordinate time, the right-hand side acquires a minus sign from the time derivative, while the left-hand side does not. There must thus be a more sophisticated time reversal operator T that introduces an additional minus on either side of the equation. Wigner’s solution was to have T involve complex conjugation. Complex conjugation is a somewhat thorny concept for abstract Hilbert space operators, but it can always be implemented once a certain representation is chosen. The important thing here is that complex conjugation leaves the Hamiltonian invariant (the Hamiltonian is a “real” operator in Wigner’s terminology) but changes the sign of the imaginary unit on the right-hand side of the Heisenberg equation of motion. The operator A is assumed to be either real (invariant under T) or imaginary (picking up a minus sign under T); in any case, the change of A under time reflection cancels, as do the signs from flipping t and conjugating the imaginary unit, leaving the equations of motion invariant.
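The sign bookkeeping of Wigner’s solution can be made explicit. In the sketch below, \(\varepsilon = \pm 1\) is a label introduced here (it is not Wigner’s notation) that records whether \(A\) is real or imaginary.

$$\begin{aligned} \left[ H, A \right]&= - i \hbar \frac{\partial A}{\partial t} \\ \xrightarrow {\; t \rightarrow -t,\; i \rightarrow -i,\; A \rightarrow \varepsilon A \;} \quad \left[ H, \varepsilon A \right]&= - (-i) \hbar \frac{\partial (\varepsilon A)}{\partial (-t)} = - i \hbar \frac{\partial (\varepsilon A)}{\partial t}. \end{aligned}$$

Dividing by \(\varepsilon \) returns Eq. 2.1, whereas reversing \(t\) alone would have produced \(+ i \hbar \, \partial A / \partial t\) on the right-hand side.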

Schwinger pointed out that there was another option, namely that the Hamiltonian change sign under time reversal. This was of course in keeping with Pauli’s PT transformation mapping positive-energy solutions of half-integer spin field theories into negative-energy solutions. But the problem of negative-energy states was supposed to be solved in the quantum theory, through fermionic quantization and the Dirac sea, so that Schwinger discarded this option ([37], footnote 8). In the quantum theory, the energy of spinor fields should transform just like the energy of tensorial fields. Indeed, it should transform in precise analogy to the transformation of the coordinate vectors: Since the energy density is the time–time (zero–zero) component of the energy-momentum tensor \(T_{\mu \nu }\), one would expect it not to change sign under a time reversal. Conversely, the charge density, which was the time component of the charge-current vector \(j_{\mu }\), should change sign under time reversal. And this latter demand, though Schwinger did not state it explicitly, actually ruled out Wigner’s solution. For consider the time component of the conserved current for a scalar field \(\phi \):

$$\begin{aligned} j_0 = \phi \partial _0 \phi ^{*} - \phi ^{*} \partial _0 \phi . \end{aligned}$$
(2.2)

This does not change sign under time coordinate reversal cum complex conjugation (the two minus signs cancel), as Schwinger would have wanted it to. He was thus forced to adopt a third solution to the problem of time reversal in quantum theory: Time reversal would flip the order of operators in a product. This would lead to a change of sign for the commutator on the left-hand side of the Heisenberg equations, again making it invariant. Schwinger motivated this change in ordering by considering time reversal as a “transposition” of the operators, a notion that was as problematic for abstract operators as complex conjugation, but which Schwinger made more palatable by interpreting it as an operation that mapped the Hilbert space of states to its dual, i.e., bras to kets (and vice versa). The re-ordering induced by this time reversal transformation led to a shuffling of the Lagrangian, but Schwinger could show that it could be unshuffled and returned to its original form (i.e., could be shown to be invariant under time reversal) if the fields obeyed the correct commutation relations as implied by the spin-statistics theorem.
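Both statements can be checked in one line each. The sketch below treats \(\phi \) as a commuting field and suppresses the reflected time argument; these simplifications are assumptions made for illustration.

$$\begin{aligned} \text {Wigner:}\quad j_0 = \phi \partial _0 \phi ^{*} - \phi ^{*} \partial _0 \phi \;&\rightarrow \; \phi ^{*} (- \partial _0) \phi - \phi (- \partial _0) \phi ^{*} = j_0, \\ \text {Schwinger:}\quad \left[ H, A \right] = - i \hbar \frac{\partial A}{\partial t} \;&\rightarrow \; \left[ A, H \right] = - i \hbar \frac{\partial A}{\partial (-t)} \;\Longleftrightarrow \; \left[ H, A \right] = - i \hbar \frac{\partial A}{\partial t}. \end{aligned}$$

In the first line the sign from \(\partial _0 \rightarrow - \partial _0\) is cancelled by the exchange \(\phi \leftrightarrow \phi ^{*}\), so the charge density does not flip. In the second line the reversed operator order supplies the minus sign that complex conjugation supplied in Wigner’s scheme.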

The proof was analogous to Belinfante’s in that the spin-statistics relation had to be invoked in order to undo the re-ordering caused by the transformation (through complex conjugation or transposition). Schwinger’s proof purported to be far more general than Belinfante’s, for it dealt not just with a specific term in the Lagrangian (interaction of fermionic current with a vector field), but with an arbitrary Lagrangian (more precisely, an arbitrary Lagrangian invariant under \(\mathrm{P}\), an assumption that Schwinger did not make explicit). For purposes of comparison, we show how Schwinger’s proof worked for the same kind of term considered by Belinfante. The interaction term of Eq. 1.7 is first time-reversed, giving

$$\begin{aligned} e \overline{\gamma _5 \gamma _0 \psi } \gamma _{\mu } \gamma _5 \gamma _0 \psi A^{\mu } (-1)^{\delta _{\mu 0}}= & {} e \psi ^{\dagger } \gamma _0^{\dagger } \gamma _5^{\dagger } \gamma _0 \gamma _{\mu } \gamma _5 \gamma _0 \psi A^{\mu } (-1)^{\delta _{\mu 0}} \nonumber \\= & {} e \psi ^{\dagger } \gamma _{\mu } \gamma _{0} \psi A^{\mu } (-1)^{\delta _{\mu 0}} \nonumber \\= & {} - e \psi ^{\dagger } \gamma _{0} \gamma _{\mu } \psi A^{\mu } \end{aligned}$$
(2.3)

using the anti-commutation relations of the Dirac matrices and the usual hermiticity normalization. (Note that the electromagnetic potential, as a four-vector, gets a minus sign in its zero component under time reversal.) The final expression is then transposed, and the operators brought back into the correct order by commuting; if the right spin-statistics connection is used (the electromagnetic field of course is taken to commute with the spinors), this gives precisely the additional minus sign needed to make the expression invariant. Schwinger could thus claim to have derived the spin-statistics theorem from the time reversal invariance of the quantum theory itself, rather than having to invoke the discrete reflection properties of the classical theory, as Pauli had done.
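The individual signs in this argument can be traced as follows. The sketch assumes hermitian \(\gamma _0\) and \(\gamma _5\) with \(\gamma _0^2 = \gamma _5^2 = 1\) and \(\gamma _5\) anticommuting with all \(\gamma _{\mu }\); the spinor indices \(a, b\) and the shorthand \(M = \gamma _0 \gamma _{\mu }\) are introduced here for illustration, and “transposition” is read as a reversal of operator order with the matrix elements treated as numbers.

$$\begin{aligned} \gamma _0 \gamma _5 \gamma _0 \gamma _{\mu } \gamma _5 \gamma _0&= - \gamma _0 \gamma _5 \gamma _0 \gamma _5 \gamma _{\mu } \gamma _0 = \gamma _0 \gamma _0 \gamma _5 \gamma _5 \gamma _{\mu } \gamma _0 = \gamma _{\mu } \gamma _0, \\ \gamma _{\mu } \gamma _0&= (-1)^{1 - \delta _{\mu 0}} \gamma _0 \gamma _{\mu }, \\ \left( \psi ^{\dagger }_a M_{ab} \psi _b \right) ^T = \psi _b M_{ab} \psi ^{\dagger }_a&= - \psi ^{\dagger }_a M_{ab} \psi _b + \text {c-number}. \end{aligned}$$

The factor in the second line combines with the sign \((-1)^{\delta _{\mu 0}}\) of the potential (a minus sign in the zero component only) to an overall \(-1\). The third line shows that anticommuting spinor operators supply the compensating minus sign, up to a c-number term that is dropped in this sketch; commuting operators would not.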

Unsurprisingly, Schwinger’s paper found an attentive reader in Pauli, who was interested in this novel demonstration of his theorem. And Pauli was quick to realize that Schwinger’s claim to have derived the spin-statistics theorem from time reversal invariance was problematic. For while Schwinger had dismissed a Wigner-type implementation of “time reversal as complex conjugation” for QFT, one could actually construct a perfectly well-defined transformation in this way. This transformation flipped the time coordinate, as it should, but left the charge invariant, in contrast to Schwinger’s transformation. There appeared to be two time reversals in QFT, and only one of them (Schwinger’s variant, which flipped the charge) led to the spin-statistics theorem. Pauli concluded:Footnote 10

Physically the connection between spin and statistics should thus be more closely connected with the transformation of the sign of the charge [...] than with time reversibility.Footnote 11

Pauli thus believed that Schwinger had essentially reproduced Belinfante’s proof, but had obscured this fact by combining Belinfante’s charge reversal with time reversal, calling the combined transformation “time reversal,” when in fact it was the reversal of operator order (which brought about the change in sign of the charge) that ensured the spin-statistics connection, not the time reversal itself.

Pauli continued to be interested in the question whether Schwinger’s transformation was really a legitimate formulation of time reversal or merely a charge conjugation with time reversal grafted on for argumentative purposes. In 1952, he gave a course on time reversal in quantum theory at the Les Houches Summer School. He began the notesFootnote 12 by rejecting the notion of a standalone time reversal in relativistic theories, focusing instead, as already in 1940, solely on the combination PT:

For simplicity we shall consider the transformation in which all four coordinates x change sign and call it reflexion [sic]. The transformation in which the spatial coordinates change sign is not relevant to time reversal and can be studied otherwise. The transformation in which all four coordinates change signe [sic] requires less writing than the transformation in which \(x_4\) [time] alone changes sign because it commutes with any Lorentz transformation.

He then gave two distinct explicit expressions for such a “reflexion,” one that changed the sign of the charge density (“Schwinger transformation”) and one that did not (“Wigner transformation”). The latter was explicitly given in the context of QFT for the first time; we will discuss it in more detail later on, when comparing it to Lüders’s formulation. These two transformations were presented as essentially equivalent possibilities. In fact, Pauli even allowed for a certain symmetry between the two: The Schwinger transformation was the product of a Wigner transformation and charge conjugation, but the Wigner transformation was also the product of a Schwinger transformation and charge conjugation.

The Les Houches notes are, to our knowledge, the first mention of a CPT transformation: By combining Schwinger’s combination of time reversal and charge conjugation with his own predilection for always combining spatial and temporal reflexions, Pauli had constructed a new discrete symmetry transformation, which he called “Schwinger transformation” and which we would now identify as CPT. But he still emphasized that it was specifically charge conjugation that was related to the spin-statistics connection:

Schwinger transformation. It is the product of the Wigner transformation and of the well-known charge conjugation. The last one depends on statistics.

The reader should notice the change of emphasis in the last sentence: With a plethora of symmetries to choose from (PT, CPT, C in Pauli’s lectures alone), it was now the spin-statistics connection that appeared as the solid foundation, and one needed to invoke it in order to prove that certain symmetries were actually realized. (Pauli also remarked that “[i]t must be emphasized that ‘Wigner transformation’ holds for all statistics.”) The discrete symmetries were now the contested question and were slowly moving from premise to conclusion, a possibility that had been opened up by Schwinger’s very general (though not very explicit) proof of the spin-statistics theorem.

3 Lüders and motion reversal

A contemporaneous (and independent) attempt at formulating time reversal in QFT came from Lüders [23], a postdoc with Werner Heisenberg in Göttingen. Like Pauli, Lüders had read the work of Schwinger and WatanabeFootnote 13 and was interested in the connection between time reversal and the spin-statistics theorem. He was also interested in another nascent debate, in thermodynamics, which we shall sketch in the following. Indeed, Pauli had also shown interest in this debate, so it may well have played a role in his renewed focus on time reversal in QFT as well.Footnote 14

The debate concerned the principle of detailed balance, the thermodynamic principle that at equilibrium each reaction occurs at the same rate as its inverse. It had first been used by Ludwig Boltzmann in his derivation of the H-theorem [15] and had received its quantum implementation in 1916/17, when Albert Einstein based his derivation of Planck’s law on the assumption that transition amplitudes for inverse processes were supposed to be equal ([11, 12]). In this form, the principle also figured centrally in quantum derivations of Boltzmann’s H-theorem ([29], Eq. 21). The issue was re-examined when Walter Heitler’s damping theory,Footnote 15 a proposed modification of quantum field theory introduced in the 1940s in a (pre-renormalization) attempt to remove the divergences, appeared to lead to violations of detailed balance [16]. But Jim Hamilton and Huan Wu Peng, who had discovered the failure of the principle in damping theory, also pointed out that, in the cases they were considering, reaction balance could be restored by averaging over spins in the initial and final states. Coester [8] then showed that this weaker form of detailed balance was a generic result of the time reversal invariance of a quantum field theory, if the time reversal operation flipped the sign of both momenta and spins. This provided further impetus for Lüders to examine the question of time reversal in QFT.

In his paper, Lüders attempted to construct a formulation of time reversal in QFT from a handful of more physical principles. These did not, however, include the commutation relations and thus the spin-statistics connection. Lüders performed his study of time reversal in the Schrödinger representation, an extravagance he would later apologize to Pauli for.Footnote 16 This was, however, deeply connected to his physical reading of time reversal, which Lüders preferred to describe as “motion reversal” (Bewegungsumkehr), since actual time reversal was out of the question due to the time asymmetry involved in quantum measurements, as emphasized by Watanabe. He thus wanted to implement time reversal as a reversal of all motions in the initial conditions, where “motion” in quantum theory meant motion in Hilbert space: Time reversal was thus better understood as flipping the sign of the time derivative of the wave function, rather than flipping the sign of the time coordinate.

Lüders’s main goal was then to show that this reversal of abstract motion in Hilbert space implied a reversal of both momenta and spins, as demanded by Coester’s theorem. This was indeed the case; what Lüders had constructed from his rather involved Schrödinger picture setup was a Wigner time reversal in Pauli’s sense. But Pauli’s and Lüders’s transformations, the first two (and mutually independent) formulations of Wigner time reversal in QFT, differed in their specific formulation, beyond the fact that Pauli’s transformation also included a space reflection, while Lüders’s did not. Pauli had written his Wigner transformation (for comparability we drop a factor of \(\gamma _0\) in Pauli’s transformation, which causes the spatial reflection) as a transformation of the spinors:Footnote 17

$$\begin{aligned} \psi ({\mathbf {x}}, t) \rightarrow U^{\dagger } \psi ^{*} ({\mathbf {x}}, - t). \end{aligned}$$
(3.1)

This was followed, as in Schwinger’s approach, by a transposition of the entire Lagrangian. Lüders’s transformation, on the other hand, involved neither a complex conjugation of the spinor, nor a transposition of the Lagrangian. Instead, it consisted of a simple unitary transformation of the spinors

$$\begin{aligned} \psi ({\mathbf {x}}, t) \rightarrow U \psi ({\mathbf {x}}, - t) \end{aligned}$$
(3.2)

along with a complex conjugation of all numerical factors in the Lagrangian, in particular of the Dirac matrices.

What both transformations had in common was that they could no longer, as Schwinger had, use an explicit expression for U in terms of Dirac matrices. The reason is that Wigner transformations invoke complex conjugation (be it of the spinors, as in Pauli’s case, or of the Dirac matrices, as in Lüders’s case), and the behavior of the Dirac matrices under complex conjugation is heavily basis dependent, even after the usual hermiticity normalization is imposed. The matrix U was thus defined only through its commutation relations with the Dirac matrices;Footnote 18 in fact it was equivalent to the matrix B that Pauli had introduced already in [30] (along with the matrix C) as the matrix that transposes the Dirac matrices, according to the relation:

$$\begin{aligned} U \gamma _{\mu } U^{\dagger } = \gamma _{\mu }^T. \end{aligned}$$
(3.3)

The equivalence between Pauli’s and Lüders’s transformation can then be studied by our usual example, the interaction term of QED of Eq. 1.7. For a Wigner time reversal (or motion reversal), the spatial components of the four-potential change sign, while the temporal component remains invariant. For an invariant interaction, the fermionic charge-current density vector should show the same behavior, as it does both for the Pauli transformation:

$$\begin{aligned} \left[ \overline{U^{\dagger } \psi ^{*}} \gamma _{\mu } U^{\dagger }\psi ^{*} \right] ^T= & {} \left[ \psi ^T U \gamma _0 \gamma _{\mu } U^{\dagger }\psi ^{*} \right] ^T \nonumber \\= & {} \left[ \psi ^T \gamma _0^T \gamma _{\mu }^T \psi ^{*} \right] ^T \nonumber \\= & {} \psi ^{\dagger } \gamma _{\mu } \gamma _0 \psi \nonumber \\= & {} \psi ^{\dagger } \gamma _0 \gamma _{\mu } \psi (-1)^{1-\delta _{\mu 0}} \end{aligned}$$
(3.4)

and for the Lüders transformation (where the asterisk on the square bracket implies Lüders’s complex conjugation of only the Dirac matrices, not the field operators, nor, of course, the transformation matrix U):

$$\begin{aligned} \left[ \overline{U \psi } \gamma _{\mu } U \psi \right] ^{*}= & {} \left[ \psi ^{\dagger } U^{\dagger } \gamma _0 \gamma _{\mu } U\psi \right] ^{*} \nonumber \\= & {} \psi ^{\dagger } U^{\dagger } \gamma _0^{*} \gamma _{\mu }^{*} U\psi \nonumber \\= & {} \psi ^{\dagger } U^{\dagger } \gamma _0^T \gamma _{\mu }^T U\psi (-1)^{1-\delta _{\mu 0}} \nonumber \\= & {} \psi ^{\dagger } \gamma _0 \gamma _{\mu } \psi (-1)^{1-\delta _{\mu 0}}. \end{aligned}$$
(3.5)

The essential difference between the two is that in the Pauli case the minus sign arises from the anticommutation relations of the Dirac matrices, while in the Lüders case it arises one step earlier, when the hermiticity normalization is invoked. What distinguishes both from the case of Schwinger (TC) and Belinfante (C) is that invariance under this T transformation does not need to invoke the spin-statistics connection, as Pauli had stressed.
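The sign manipulations in Eqs. 3.3 to 3.5 can be checked numerically. The following is only a sketch under stated assumptions that are not part of the original derivation: it uses the Dirac representation, picks the explicit matrix U = γ₁γ₃ (one choice that satisfies Eq. 3.3 in that representation), and replaces the field operators by commuting c-number test spinors, for which the overall transposition in Eq. 3.4 acts trivially.

```python
# Numerical sanity check of Eqs. 3.3-3.5 (illustrative sketch, pure Python).
# Assumptions, not taken from the text: Dirac representation, U = gamma_1 gamma_3,
# and c-number (commuting) test spinors in place of field operators.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]

def dagger(a):
    return transpose(conj(a))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Dirac matrices: gamma_0 hermitian, gamma_1..3 anti-hermitian.
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
gammas = [g0, g1, g2, g3]

U = matmul(g1, g3)  # assumed explicit choice, valid in this representation only

# Eq. 3.3: U gamma_mu U^dagger = gamma_mu^T
eq33_holds = all(close(matmul(U, matmul(g, dagger(U))), transpose(g))
                 for g in gammas)

def sign(mu):
    """The factor (-1)^(1 - delta_{mu 0}): +1 for mu = 0, -1 for mu = 1, 2, 3."""
    return 1 if mu == 0 else -1

def bilinear(spinor, m0, m):
    """Return spinor^dagger m0 m spinor as a number."""
    return matmul(dagger(spinor), matmul(m0, matmul(m, spinor)))[0][0]

psi = [[1 + 2j], [0.5 - 1j], [-0.3 + 0.7j], [2 - 0.1j]]  # arbitrary test spinor

# Pauli (Eq. 3.4): psi -> U^dagger psi^*, Dirac matrices untouched.
chi = matmul(dagger(U), conj(psi))
pauli_ok = all(
    abs(bilinear(chi, g0, g) - sign(mu) * bilinear(psi, g0, g)) < 1e-12
    for mu, g in enumerate(gammas))

# Lueders (Eq. 3.5): psi -> U psi, Dirac matrices complex conjugated.
xi = matmul(U, psi)
g0c = conj(g0)
luders_ok = all(
    abs(bilinear(xi, g0c, conj(g)) - sign(mu) * bilinear(psi, g0, g)) < 1e-12
    for mu, g in enumerate(gammas))

print(eq33_holds, pauli_ok, luders_ok)
```

Both prescriptions reproduce the same sign pattern for the current, which is the sense in which the two formulations are equivalent; the check says nothing about operator ordering, which is where the spin-statistics connection would enter.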

Looking beyond these parallelisms, Lüders went a significant step further than Pauli in the last section of his paper, where he broached a novel question:

The problem arises whether the laws of nature are in fact invariant with respect to motion reversal as formulated here. This invariance would constitute a further (and not trivially understandable) principle in addition to the principle of Lorentz invariance.Footnote 19

In Wigner’s original study of time reversal in quantum mechanics, there had of course been systems that were not invariant, those involving external magnetic fields. But this anomaly disappeared in the fully dynamical theory of QED, which was, as we have seen, invariant under Wigner time reversal, Schwinger time reversal, and charge conjugation. But Lüders now found that there were other Lorentz-invariant QFTs that were not invariant under his Wigner-type time reversal. In particular, Lüders pointed to the following interaction Lagrangian density for the interaction between a spin 1/2 fermion \(\psi \) and a scalar field \(\phi \)

$$\begin{aligned} {\mathcal {L}} = g{\bar{\psi }}\psi \phi + g'{\bar{\psi }}\gamma _{\mu }\psi \partial _{\mu }\phi . \end{aligned}$$
(3.6)

Each summand could be invariant under T by itself: For the first summand, \(\phi \) has to be even under time reversal; for the second summand, it has to be odd (since the derivative has its timelike component change sign under T, while the spatial components stay invariant, in contrast to the transformation properties of the spinorial current discussed above). If both terms contained the same scalar field, they could thus never be made invariant under T simultaneously. This may seem a rather contrived example. After all, we now know far more interesting and physical QFTs that are not invariant under T. However, these QFTs also violate P, and this was not yet a viable option for Lüders in 1952.
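The sign bookkeeping behind this argument can be sketched as follows. This is a minimal illustration, assuming real couplings \(g\) and \(g'\), suppressing the arguments of the fields, and writing the time-reversal parity of the scalar field as \(\eta _T = \pm 1\), a symbol introduced here for illustration only. For each value of \(\mu \), the current contributes the factor \((-1)^{1-\delta _{\mu 0}}\) and the derivative the opposite factor, so that

$$\begin{aligned} g{\bar{\psi }}\psi \phi&\rightarrow \eta _T \, g{\bar{\psi }}\psi \phi , \\ g'{\bar{\psi }}\gamma _{\mu }\psi \partial _{\mu }\phi&\rightarrow (-1)^{1-\delta _{\mu 0}} \left[ -(-1)^{1-\delta _{\mu 0}} \right] \eta _T \, g'{\bar{\psi }}\gamma _{\mu }\psi \partial _{\mu }\phi = - \eta _T \, g'{\bar{\psi }}\gamma _{\mu }\psi \partial _{\mu }\phi . \end{aligned}$$

Invariance of the first term requires \(\eta _T = +1\), invariance of the second \(\eta _T = -1\), so no single choice of \(\eta _T\) leaves both terms invariant.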

We are indeed now so used to non-invariance under T that it is hard to grasp how radical a thought this must have been for Lüders. He thus included a discussion at the end of his paper why the consequences might not be as catastrophic as they seem at first glance, arguing that the principle of detailed balance might not be as essential to thermodynamic equilibrium as usually assumed. But more important was another argument: Perhaps, there would be no QFTs not invariant under time reversal after all, because such theories were already ruled out by the demand of charge conjugation invariance.

Indeed, in a review article on meson theory, submitted several months earlier (20 December 1951) together with his fellow postdocs at the Heisenberg Institute, Walter Thirring and Reinhard Oehme, Lüders had excluded the very interaction Lagrangian above on the basis of charge conjugation invariance [14]. Charge conjugation invariance had, since being cautiously proposed by Belinfante more than a decade earlier, established itself as a highly plausible invariance property of fundamental field theories; Lüders, Thirring and Oehme referred to it as “invariance under the exchange of particles and anti-particles” (p. 216). As opposed to time reversal, the mathematical formulation of charge conjugation was rather unambiguous. But with his explicit formulation of time reversal (which was, as we have seen, equivalent to the Wigner-type transformation constructed in parallel by Pauli), Lüders was now able to compare the implications of C and T invariance.

Lüders returned to this question in 1953, during a stay with the CERN theory group in Copenhagen, completing a paper on “the Equivalence of Invariance under Time Reversal and under Particle-Antiparticle Conjugation for Relativistic Field Theories.”Footnote 20 In investigating this question, Lüders performed another step in inverting premises and conclusions: The invariance properties under the discrete transformations C and T were now the conclusions that Lüders’s analysis aimed for, not the premises being used. Specifically, Lüders aimed to show that all relativistic quantum field theories were invariant under TC (i.e., a Schwinger-type time reflection), which would directly explain the equivalence of imposing C and T invariance separately, an idea for which Lüders credited Bruno Zumino.

Now, general invariance of relativistic field theories under TC transformations was something that Schwinger had already stated in 1951, though he had of course called his TC transformations simply time reversal. But Lüders’s proof was substantially more explicit. For one, Lüders expressly pointed out that he was considering only parity-invariant theories, an important step toward CPT. Also, Lüders explicitly considered the most general possible relativistically invariant Lagrangian, though this also meant limiting himself to spins no greater than 1. In general, however, his proof of invariance under TC was not much different from Schwinger’s, which we referred to earlier, giving the example of the interaction term between electrons and photons.

What it gained in explicitness, Lüders’s proof lost by being overly convoluted. The reason for this convolution is that Lüders did not like the method, pioneered by Schwinger and endorsed by Pauli, of implementing the anti-unitarity of T by transposing the Lagrangian. Transposition did not go well with his use of the Schrödinger representation, as it implied turning bras into kets; as Lüders put it, Schwinger’s formulation of TC (which Lüders called “formal reflection in time”) “cannot be applied to state vectors in a simple manner” (p. 12). So while he could demonstrate the invariance of the Lagrangian under Schwinger’s TC quite easily, Lüders had to go a step further and prove invariance under his own TC, which arises as the product of his formulation of Wigner time reversal (Eq. 3.2) and charge conjugation. Explicitly, Lüders’s TC is thus:Footnote 21

$$\begin{aligned} \psi \rightarrow U C \psi ^{*} \end{aligned}$$
(3.7)

along with the complex conjugation of all c-numbers. In order to make sense of this, Lüders observed that (up to an arbitrary phase)

$$\begin{aligned} C = \gamma _0 \gamma _5 U^{\dagger } \end{aligned}$$
(3.8)

by combining the defining equations of C and U (Eqs. 1.4 and 3.3). He further observed that one could view his TC as the product of Schwinger’s TC and a hermitian conjugation: The hermitian conjugation would undo the transposition in the Schwinger transformation and instead complex conjugate the entire Lagrangian. The resulting transformation would entail a complex conjugation of the c-numbers (as in Lüders’s TC) and also introduce complex conjugation of the field operators into Schwinger’s transformation of the fields, which then read

$$\begin{aligned} \psi\rightarrow & {} \left( \gamma _5 \gamma _0 \psi \right) ^{*} \nonumber \\= & {} \gamma _5^T \gamma _0^T \psi ^{*} \nonumber \\= & {} U \gamma _5 \gamma _0 U^{\dagger } \psi ^{*} \nonumber \\= & {} - UC \psi ^{*}. \end{aligned}$$
(3.9)

This also coincided with Lüders’s substitution up to an inessential minus sign. Lüders could thus show invariance under his TC by invoking first invariance under Schwinger’s TC and then under hermitian conjugation. While this did shed some light on the relation between Pauli’s and Lüders’s formulations of Wigner transformations, it made for a very unwieldy proof, all the more as hermiticity is somewhat more elusive in the Schrödinger representation. In particular, the role of the correct spin-statistics relation as a key assumption for deriving TC invariance was completely marginalized, compared to the drawn-out discussions of how to factorize TC into a Schwinger transformation and a hermitian conjugation.
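The chain of equalities in Eq. 3.9 and the relation in Eq. 3.8 can likewise be checked in a concrete representation. The sketch below rests on assumptions that are not in the original: the Dirac representation, U = γ₁γ₃, the convention γ₅ = iγ₀γ₁γ₂γ₃, and the arbitrary phase in Eq. 3.8 set to 1. Since Eq. 1.4 is not reproduced in this section, the last check uses the standard condition C γ_μ* C⁻¹ = −γ_μ for a charge conjugation of the form ψ → Cψ* as a stand-in for the defining property of C.

```python
# Consistency check of Eqs. 3.8 and 3.9 (illustrative sketch, pure Python).
# Assumptions, not taken from the text: Dirac representation, U = gamma_1 gamma_3,
# gamma_5 = i gamma_0 gamma_1 gamma_2 gamma_3, phase in Eq. 3.8 set to 1.

def mm(*ms):
    """Multiply any number of 4x4 matrices from left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def tr(a):
    return [list(r) for r in zip(*a)]

def cc(a):
    return [[complex(x).conjugate() for x in r] for r in a]

def dag(a):
    return tr(cc(a))

def neg(a):
    return [[-x for x in r] for r in a]

def same(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
gammas = [g0, g1, g2, g3]
g5 = [[1j * x for x in r] for r in mm(g0, g1, g2, g3)]

U = mm(g1, g3)            # assumed explicit choice satisfying Eq. 3.3
C = mm(g0, g5, dag(U))    # Eq. 3.8 with the phase set to 1

# Eq. 3.9, line by line (acting on the matrix multiplying psi^*):
line1 = cc(mm(g5, g0))            # (gamma_5 gamma_0)^*
line2 = mm(tr(g5), tr(g0))        # gamma_5^T gamma_0^T
line3 = mm(U, g5, g0, dag(U))     # U gamma_5 gamma_0 U^dagger
line4 = neg(mm(U, C))             # -U C
eq39_holds = same(line1, line2) and same(line2, line3) and same(line3, line4)

# Stand-in for Eq. 1.4: C gamma_mu^* C^{-1} = -gamma_mu (C is unitary here).
charge_conj_ok = all(same(mm(C, cc(g), dag(C)), neg(g)) for g in gammas)

print(eq39_holds, charge_conj_ok)
```

In this representation the matrix C obtained from Eq. 3.8 equals γ₂ up to a phase, the familiar form of the charge conjugation matrix, which is at least consistent with the claim that Eq. 3.8 follows from the defining equations of C and U.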

In any case, the proof was completed. Lüders had shown that C and T invariance were equivalent for relativistic, parity-invariant QFTs. As Lüders remarked, “[i]t seems to be a matter of taste which of these two postulates is considered the more fundamental one” (p. 5). Lüders submitted his paper on 23 October 1953 after having returned to Göttingen from Copenhagen. His next stop was Switzerland, where he joined the experimental CERN group in Geneva, aiding in the calculation of particle trajectories for the new synchrotron [25].

4 Pauli and the CPT theorem

It was only when Lüders returned to Göttingen again that he received a response to his paper on time reversal, a response from Pauli. The letter is not extant; indeed, there may not even have been a letter. All we know from Lüders’s response is that Pauli had sent him a copy of the Les Houches notes, and in the copy Pauli had marked the section in which he explained the connection between Wigner time reversal, Schwinger time reversal, and charge conjugation. In this section, Pauli had written that invariance under charge conjugation implied the equivalence of Wigner and Schwinger invariance. Lüders consequently replied (26 April 1954, [42]) that he no longer needed to assume charge conjugation invariance, as he had shown in general that relativistic field theories were TC invariant. But Lüders clearly did not invest all that much in clarifying the relation between his and Pauli’s work, erroneously identifying Pauli’s Wigner transformation with his own formal reflection in time (when the latter is really to be identified with Pauli’s Schwinger transformation).

Things might have stayed at that, if Pauli had not needed to write an article for the volume celebrating Niels Bohr’s 70th birthday. The volume’s editor, Léon Rosenfeld, had hoped that Pauli would write on the history of the exclusion principle. Pauli was not enthused, as he wrote to Heisenberg on 13 May 1954:

We do not have to stick too closely to the article titles given in Rosenfeld’s first circular. I am not at all happy with the title suggested to me, “Atomic constitution and the exclusion principle” [...] Do you have an idea for something I could write about newer physics? (I can’t think of anything that would be suitable for this Festschrift.)Footnote 22

Perhaps inspired by the brief exchange with Lüders, Pauli began thinking about publishing his work on time reversal, and on 21 May 1954, he suggested this to Rosenfeld, who, in turn, was not entirely happy with the idea:

As you have seen from my circular (I do not believe for a moment that you have not read it seriously), each contributor has the duty of discussing in his article one aspect of Bohr’s scientific interest with which he has been associated. That your share in this scheme should be the exclusion principle is obvious and it would be a great pity if you really would not accept it as no one else could do it as well as you. All you are asked to do is to recall Bohr’s reactions to the problems of classification of spectra and so on, and to give a few personal recollections of that period. This need not be long, only a few pages. After that you can develop your ideas in any way you chose on any modern subject connected with the exclusion principle. It seems to me that the topic you mention: Time reversal and charge conjugation, can be reasonably considered as falling within that category. One might think that it would be more naturally linked up with the topic of Dirac’s equation, but this one is allotted to [Oskar] Klein, who has already accepted. To sum up, what I would suggest is therefore that you do write, as you wish, on time reversal and charge conjugation, but that you add the above mentioned few pages of historical considerations on the theme of the exclusion principle.Footnote 23

But Pauli would not have it:

It is my suspicion that you ask me to write about Bohr’s historical connection with the exclusion principle because you do not know this connection and you are curious to hear it. So I answer with giving you some personal informations about it together with the sincere assertion that I really wonder whether it is a nice thing to excavate a person’s errors at his 70th birthday. But an historical article of me would reveal it relentlessly!Footnote 24

This was followed by a long discussion of Bohr’s mistaken hypotheses of the early 1920s on the orbital structure of many-electron atoms. The letter concluded:

[W]hy tell all this [sic] old errors again on a 70th birthday? Why not let Bohr’s paper of 1922 stay asleep? Why not to look forward? Good old Salvation Army, why not to make the time reversal in a more abstract way?

What do you think now about the theme of my article? Do you still have such a curiosity about old times?

Pauli thus began preparing an article on time reversal and engaged with Lüders’s paper more thoroughly, resulting in a long letter on 16 June 1954 [42]. He had reached the conclusion that Lüders had really hit upon something worthwhile, but brought up some predictable ideas for improvement, in particular the use of PT instead of T (as had been Pauli’s practice from the start), and the use of Schwinger’s transposition operation, which Pauli had already endorsed in the Les Houches lectures.

Most importantly, Pauli suggested improvements in constructing the most general Lorentz-invariant Lagrangian, a problem that Schwinger (and we with him) had treated only cursorily and which had taken up the bulk of Lüders’s cumbersome proof. Pauli was able to treat the possible invariants in a more compact manner by using the van der Waerden spinor notation, which he had already employed in his proof of the spin-statistics theorem. This possibility was another advantage of not demanding parity invariance (which greatly complicates matters by interchanging the fundamental Weyl spinors) from the outset. It also allowed Pauli, as he highlighted, to treat particles with arbitrarily high (though still finite) spin and thus the most general Lorentz-invariant field theories. This generality substantially removed the connection to QED and the notion of charge, conclusively transforming the operation C from a specific transformation that flips the sign of the electric charge into a more general transformation that exchanges particles and anti-particles, a transformation that Pauli habitually referred to as particle-antiparticle conjugation. This transformation always flipped the sign of a generalized charge, namely the number of particles minus the number of anti-particles.

These improvements and re-interpretations were complemented by a deep conceptual insight. Pauli now realized that Lüders’s result “has a very close relation to my old result,” the spin-statistics theorem. Very soon, he had settled on a title for the paper that would also satisfy Rosenfeld: “Exclusion principle, Lorentz-group and reflections of space-time and charge” (Letter to Oskar Klein, 1 July 1954). Indeed, Pauli began to view the new results as an extension of his spin-statistics work that now also related to interacting theories, not just free ones, as had been the case for his 1940 result. By October, he had completed the paper, as he wrote to Res Jost (6 October 1954):

It has turned into a sign orgy of about 30 pages (raising the psychological question: is that masochism? [...]) But I also have to say that in the course of this work I have lost a lot of my fear of the signs: the main point, namely the derivation of the “strong reflexion”,—as I now call it, abbreviating it by SR (“strong reflexion” is the one where the space-time-coordinates and the charge change sign) from the proper-continuous Lorentz group [...] and the quantization of the fermion fields with anti-commutators, the boson fields with commutators—that is now so simple that any child can understand it.Footnote 25

Pauli had thus now settled on the reading that Lüders’s central result had been to derive CPT invariance from Lorentz invariance and the spin-statistics connection. Simplifying Lüders’s calculation using Schwinger’s transposition convention and the van der Waerden spinor notation, Pauli now found the proof of CPT invariance so transparent (and it is indeed not significantly more complicated than the specific examples given in this paper) that he began to suspect that Schwinger might have already had this result in 1951. This is indeed an open question: Schwinger had been very brief concerning the general class of Lagrangians his result applied to and he had only derived the spin-statistics connection from discrete invariance properties, not vice versa. Pauli thus remained agnostic, as he wrote to Rosenfeld:

Writing down my ideas I feel “His Majesty” = Julian Schwinger should know everything what is in this paper. But he will never tell it, so this will be forever “unobservable.”Footnote 26

One way in which Pauli’s work definitely went beyond Schwinger (and Lüders) was by including parity not as a premise, but as part of the invariance property to be proven, establishing CPT as the fundamental symmetry of relativistic QFTs. Schwinger would later recall that he had protested to Pauli:

Actually I protested explicitly to Pauli. [...] When I saw him somewhere. I’ve forgotten where. [...] Pauli was saying that what I had done there had nothing to do with Luders and all that. [...] I pointed out that it’s just running the argument backwards. [...] I couldn’t persuade him. I remember though pointing to [...] a simple example of it in one of the papers, Theory of Quantized Fields four or five, wherever it talks about spin a half, indicating that in fact I had already made use of the combination of those transformations and that was also a natural thing to do and this was certainly pre-Luders and so forth. Then I also felt very much that I’d been gypped on that. That TCP was my old theorem stood on its head.Footnote 27

Schwinger was referring to a number of passages in later installments of his “The Theory of Quantized Fields” papers (as listed in a letter from Viktor Weisskopf to Pauli, 21 March 1955 [43]). But these passages either refer to Schwinger’s CT transformation or, where they conceivably make reference to CPT, this transformation appears in a very different context, having in particular nothing to do with field commutation relations. Schwinger’s complaint thus seems to have been somewhat overstated, though he unquestionably played an important role in the genesis of the CPT theorem, as our historical reconstruction has hopefully made clear. Pauli himself was eventually convinced that Schwinger had already known the CPT theorem by 1953/54, even if he had not made it explicit in his publications (Pauli to Fierz, Letter of 24 January 1957).

As we are now speaking of priority issues, we need to also mention a further proof of a CPT-like theorem at around the same time, by Bell [3].Footnote 28 Bell’s proof (submitted 12 April 1955) definitely postdates Lüders’s, and probably also Pauli’s, and Bell himself certainly made no priority claims. Bell’s proof also does not add any substantially new aspects, though the presentation is more transparent, especially for the contemporary reader. In the introduction to his proof, Bell states that his result “differs from his [Lüders’s] mainly in stressing the simple classical analogue [...] working mostly in the Heisenberg rather than the interaction representation” [3]. Indeed, we have followed this choice of Bell’s in our presentation of the equations in this paper.

That is not to say that there might not be a fascinating parallel story concerning how Bell arrived at the theorem. But we have not been able to reconstruct it, due to the archival situation: Bell’s papers were in the possession of his widow Mary, but their current whereabouts are unknown.Footnote 29 There is, however, one aspect of Bell’s work that we can and should briefly discuss. Many years after Bell’s proof, Bell’s thesis advisor, Rudolph Peierls, recalled—commemorating Bell after his death in 1990—how Bell had come to work on the topic:

At the time we heard of experiments which seemed to reveal evidence for a negatively charged particle which was stable, but with a mass less than that of the proton. The experimenters asked us whether this could possibly be the anti-proton. This seemed unlikely, but could it be firmly ruled out? Everybody expected particle and antiparticle to have the same mass, but was this strictly necessary?

This was a problem after his [Bell’s] heart. He did not like to take commonly held views for granted, but tended to ask “How do you know?”. In due course he came up with the “CPT theorem”, that the results of any field theory must remain unchanged if one reverses the sign of space coordinates and of time, and interchanges particles and anti-particles. (He said cautiously, “in any theory of the present form”, but nobody has yet given an example of a sensible theory in which the theorem would not hold). The theorem ensures, in particular, that any particle and antiparticle must have the same mass. Any evidence contradicting the theorem would be very hard to reconcile with our present basic physics; so far, no such evidence has been found. Indeed, the experiment which had raised the question was not confirmed.

The proof of the theorem formed the basis of John’s Ph.D. thesis...[34]

Equal masses for particles and anti-particles is nowadays frequently cited as the main implication of the CPT theorem. Unfortunately, however, we have been unable to find any contemporary evidence for Peierls’s recollections: We have been unable to find references to the experiment mentioned, not even in Peierls’s correspondence, and no mention is made of mass equality in Bell’s paper/thesis. So what then were viewed as the implications of the CPT theorem in 1954/1955?

The clearest vision for the significance of the CPT theorem, though also without reference to particle-antiparticle mass equality, was given by Pauli.Footnote 30 In Schwinger’s work, and before, elements of the CPT theorem had shown up in the context of proving the spin-statistics theorem. It was Lüders who had first inverted this logic. His aim had been to provide a general proof that imposing charge conjugation invariance on a QFT was equivalent to imposing time reversal invariance. This was a theorem to be used for efficient model building in the context of nuclear interactions. The puzzle that Pauli believed to have solved was one that he had already addressed in his Les Houches lectures: how to think about the two formulations of time reversal in QFT that were on the market, namely Wigner transformations (PT) and Schwinger transformations (CPT). This was a question that had also been brought up by others, e.g., by [39].Footnote 31 For Pauli, the CPT theorem now provided these two transformations with distinct roles, as outlined in a letter to Weisskopf of 12 October 1954. CPT was an automatic result of Lorentz invariance and spin-statistics, one got it “for free” (geschenkt). PT, on the other hand, was not automatic. Demanding invariance under PT (or equivalently under C) could give restrictions on possible interactions (as Lüders had done). PT was also the form of time reversal that was relevant to thermodynamics, i.e., for detailed balance. The CPT theorem thus established the role(s) of time reversal in relativistic quantum field theory.

5 Outlook and conclusions

The story might end here, and for the purposes of this paper, it more or less does. But we would be remiss not to at least mention the spectacularly new significance that the CPT theorem obtained just a few years later with the experimental discovery of parity violation. After all, it was with these developments that the CPT theorem obtained the standing that made it attractive to mathematical physicists and ultimately also to historians such as the authors of this paper.

In 1949, British physicist C. F. Powell observed two new cosmic ray particles, the \(\tau \) and the \(\theta \) mesons, with the same mass and lifetime, differing only in their mode of decay. This raised the question of whether the \(\tau \) and the \(\theta \) were the same or different particles, with the former proposition implying parity non-conservation in the weak interaction responsible for their decay. Powell’s discovery prompted theorists T.D. Lee and C.N. Yang to consider the bearing that experiments involving the weak interaction had on this question. In 1956 they published their results, concluding that existing experimental data confirmed parity conservation in strong and electromagnetic interactions, but were inconclusive regarding the weak interaction. They proposed several experiments to determine whether parity was violated in the weak interaction, one of which would be taken up by C.S. Wu at Columbia University, and her collaborators from the National Bureau of Standards, in the same year.Footnote 32

The experiment consisted in probing the angular distribution of electrons resulting from the beta decay of polarized \(\text {Co}^{60}\) nuclei. By measuring the emission rate of electrons with momentum parallel and antiparallel to the nuclear spin, it could be determined whether weak interactions “differentiate the right from the left,” as Lee and Yang ([22], p. 54) put it. The experimental setup consisted essentially of a cobalt sample cooled to very low temperatures to allow for a high degree of polarization along an applied magnetic field direction, coupled with a system to count beta particle (electron) emission along a given direction, and scintillators to determine the anisotropy of the gamma ray emission associated with this beta decay. The gamma ray anisotropy served as a measure of the degree of polarization of the radioactive sample. By flipping the direction of the polarizing magnetic field and comparing the electron emission rates along the two directions, Wu and her NBS collaborators observed a significant anisotropy in the emitted electron momentum, with emission antiparallel to the nuclear spin being favored. These results proved that parity is not conserved in the weak interaction and prompted additional experimental work to confirm their findings. The confirmation came swiftly, from Wu’s Columbia colleague Leon Lederman and collaborators who validated her findings with an experiment studying the magnetic moment of free muons, which had also been proposed by Lee and Yang as a way to determine parity (non)conservation. These findings came as a shock to many in the physics community, notably to Pauli who would go so far as to write an obituary to “our dear friend parity.”Footnote 33

With parity violation, the CPT theorem really came into its own. From it one could infer that at least one other discrete symmetry, C or T, had to be violated as well. There was already a viable proposal for a theoretical framework that combined the observed parity violation with a lack of charge conjugation invariance (which was of course much harder to observe, given a lack of cobalt anti-nuclei), the two-component neutrino theory, first proposed by Salam [36]. Using the CPT theorem, Pauli was then further able to conclude that, apart from CPT, CP and T remained the only intact discrete symmetries (Letter to Heisenberg, 19 January 1957).Footnote 34 But he emphasized that “[w]hen I considered such formal possibilities in my paper in the Bohr-Festival Volume (1955), I did not think that this could have something to do with Nature. I considered it merely as a mathematical play...” (Letter to Wu of 19 January 1957). Soon after, Pauli received a preprint by Lee, Yang and Reinhard Oehme in which similar considerations were made. In the preprint, the authors appear to have referred to the theorem as the “Lüders–Pauli theorem,” which got Pauli to thinking about a more adequate name. He pitched the name SLP theorem (Letter to Telegdi of 22 January 1957) for Schwinger–Lüders–Pauli to satisfy the American predilection for “strange abbreviations” and took great joy in the fact that the “P” in this abbreviation did “not mean parity.” Another option brought forward by Pauli was the more “objective” name “strong reflexion theorem” (letter to Weisskopf of 27 January 1957). In his letter to Yang (12 February 1957), Pauli then also suggested the name “CPT theorem” and this was the name that Yang, Lee, and Oehme ended up using in the published paper [21].
While the use of those three letters quickly caught on, there was quite some variety as to the order: Lüders (Lüders to Pauli, 6 February 1957) favored “TCP,” which is the abbreviation for Tricresyl Phosphate, a gasoline additive heavily marketed by Shell in the 1950s (“A new day in fuel for the American motorist is here!”, Life Magazine, 27 July 1953, p. 34). Res Jost, who was the first, already in 1957, to try and integrate the theorem into the emerging framework of axiomatic QFT (“the foundations of quantized field theory,” as he referred to it) preferred CTP.Footnote 35 As witnessed by the title of the 1964 axiomatic QFT textbook by Wightman and Streater already mentioned in the introduction, “PCT, Spin and Statistics, and All That,” it took a while for physicists to agree on the canonical order first proposed by Pauli.

Within a few months of the discovery of parity violation, the CPT theorem had been fully immersed in a new context. It was a theorem that needed a name of its own, it was deeply connected to the physics of the weak interactions, and it was being analyzed on a new level of mathematical rigor. Its physical implications were also being re-evaluated: Only now did the primary consequence of the CPT theorem come to be appreciated, namely, that particles and anti-particles necessarily had the same masses and lifetimes [21, 26].Footnote 36 Already, it started to seem surprising in hindsight that Pauli had put all that work into deriving the CPT theorem two years before it became central to physics. When asked, Pauli dutifully sketched the story of Wigner, Schwinger and Lüders that we have reconstructed in this paper (Letter to Fierz of 24 January 1957). But he also reported “parapsychological” dreams that he had had when writing the paper for the Bohr volume.

For Pauli, his dreams were of central importance, and he discussed and interpreted them at great length in his correspondence with the psychiatrist C.G. Jung [28]. In 1957, Pauli recalled that he had had some significant dreams around the time of the establishment of the CPT theorem, dreams that centrally involved reflections, as he wrote to Jung on 22 March 1957:

Mr Fierz asked me the question what had given me the idea to work on the mathematics of reflections in 1954/55; there must have been a psychological background. I replied that I also found this highly probable. For one, the events within physics from 1952 (when I started working on reflections again) to 1956 gave no real occasion to focus on this particular subject. And I also remembered a very impressive dream, which occurred after completing my paper, which on a conscious level had seemed to me a very sober piece of work:

Dream of 27 November 1954

I am with the “dark woman” in a room where experiments are being conducted. These experiments consist in the appearance of “reflexes.” The other people in the room think the reflexes are “real objects”, while the dark woman and I know that they are “just reflections.” This creates a secret, which separates us from the other people. This secret fills us with fear.

Later, the dark woman and I go down a steep mountain by ourselves.

This was preceded by dreams with a connection to biology; afterward (January 1955) there followed a dream in which “the Chinese woman” had a child,Footnote 37 but “the people” did not want to acknowledge it.

Pauli concluded that he had a “mirror complex,” which he did not yet fully understand, but which he believed had subconsciously motivated him to pursue the CPT theorem before it became relevant to physics. He spent the last year of his life intensively exploring this mirror complex (cf. [6], pp. 40–41), but without resolving it, leaving an element of mystery in the story of the genesis of the CPT theorem.