1 Introduction

Referring to Pascual Jordan, in a letter to Niels Bohr dated December 1926, Max Born wrote [1]:

as far as practical results are concerned, perhaps he is not so productive but [he is] philosophically very gifted and above all interested in fundamental questions.

It is quite natural that Jordan’s interest in philosophy could not find space in the articles in which, with Born and Werner Heisenberg, he laid down the first complete formulation of quantum mechanics.

Nonetheless, Jordan found a way to express his beliefs about the foundations of the new mechanics in some articles he published in 1927. He did so in a very concise form while presenting the first system of axioms for quantum theory [2] and, less concisely, in three other semi-popular writings [3,4,5]. In these works Jordan remarked on the connection between probability theory and quantum mechanics and recognized that quantum probability constitutes a generalization of classical probability apt to describe the events of the new mechanics. Unfortunately, this view of the “new” probability was later forgotten and, as far as we know, Jordan himself never mentioned it again.

In what follows, we try, firstly, to answer the question of how Jordan arrived at this generalization of classical probability and, secondly, to identify the path that, starting from matrix multiplication—the great discovery of Heisenberg [6] recognized by Born [7]—could have led Jordan to the interference law of probability amplitudes by way of the theory of probability. Before doing this, it is necessary to recall the atmosphere reigning in Göttingen in the mid-twenties of the last century.

2 Göttingen

As full professor and director of the Physics Institute of the University of Göttingen, Born worked with a group of young and talented collaborators, Wolfgang Pauli and Heisenberg just to cite two names. Among them Jordan, born in 1902, was the youngest. Recalling those years, Born writes that Rudolf Ladenburg:

found that the “strength” of a spectral line [...] is proportional to the transition probability between the two stationary states involved, a quantity introduced before by Einstein in the theory of heat radiation [8], p. 215.

It is quite natural that Jordan, a student at the University of Göttingen and then an assistant to Richard Courant and Born, placed transition probabilities at the center of his interests.

During the summer of 1925, Heisenberg presented to Born the manuscript of the article Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen [6], where he proposed the equality:

$$\begin{aligned} {\mathfrak {B}}(n,n-\beta )\exp (i\omega (n,n-\beta )t) = \sum _{\alpha =-\infty }^{+\infty } {\mathfrak {A}}(n,n-\alpha ){\mathfrak {A}}(n-\alpha ,n-\beta )\exp (i\omega (n,n-\beta )t) \end{aligned}$$
(1)

which he described as a “kind of composition” (Art der Zusammensetzung) imposed “almost by force by the combination relation of the frequencies” (nahezu zwangläufig aus der Kombinationsrelation der Frequenzen) of the Rydberg–Ritz principle.

As argued by Bartel van der Waerden [9], p. 31, the meaning of (1) is not very clear due to the quantity \({\mathfrak {A}}(n,n-\alpha )\), which Heisenberg speaks of as a complex vector. In this respect, Jagdish Mehra and Helmut Rechenberg [1] state that (1) can be written as the expression:

$$\begin{aligned} C(n,n-\beta )=\sum _{\alpha =-\infty }^{\alpha =+\infty }A(n,n-\alpha )B(n-\alpha ,n-\beta ), \end{aligned}$$
(2)

which allows one to determine the amplitude \(C(n,n-\beta )\) by adding, for all the possible values of \(\alpha \), the products of the amplitudes \(A(n,n-\alpha )\) and \(B(n-\alpha ,n-\beta )\). In other words, according to (2), the “transition amplitudes” between two stationary states are calculated by means of a sum of “symbolic multiplications” which combines the “transition amplitudes” through all possible intermediate states. Terms in quotation marks are from Born [8], p. 216.
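
To make the composition rule (2) concrete, the following sketch, with invented complex amplitudes (the arrays A and B are illustrative toy data, not values taken from Heisenberg’s paper), verifies numerically that summing the products of amplitudes over all intermediate states is nothing but a row-by-column matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "transition amplitudes" between four stationary states (invented complex data).
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# Eq. (2): C(n, n-beta) = sum over alpha of A(n, n-alpha) * B(n-alpha, n-beta);
# here the states are simply labelled by indices j (initial), m (intermediate), k (final).
C = np.zeros((4, 4), dtype=complex)
for j in range(4):
    for k in range(4):
        C[j, k] = sum(A[j, m] * B[m, k] for m in range(4))

# The explicit sum over intermediate states coincides with the matrix product A @ B.
assert np.allclose(C, A @ B)
```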

Reflecting on Heisenberg’s manuscript, Born realized that (1) is a multiplication between matrices, but he did not go further, as he failed to prove that the off-diagonal terms of the matrix \(pq-qp\) were null. At first, Born invited Pauli to collaborate on the matrix interpretation, but Pauli scornfully declined the invitation. Following this refusal Born turned to Jordan who, in a couple of days, overcame the obstacles that had stopped his teacher. In a few weeks of intense work, Born and Jordan developed the mechanics of matrices, which they published with the meaningful title Zur Quantenmechanik [10]. As a consequence, Jordan found himself caught between matrix multiplication and transition probabilities, two apparently distant concepts that nevertheless had to be somehow connected.

Indeed, some kind of connection already existed and in Göttingen it was beginning to be seen. In his memoirs, Born writes:

We translated Planck’s calculation into the language of quantum theory, introducing ‘transition quantities’ instead of the corresponding classical quantities [...]. We were struck by the fact that the ‘transition quantities’ appearing in our formula always corresponded to squares of amplitudes of vibrations in classical theory [8], p. 216.

These ‘transition quantities’ are the number of atoms that passed from one stationary state to another, through what would later be called a quantum jump. In other words, they were the fractions of the population of atoms that, starting from an initial state and possibly passing through intermediate states, ended up in a final state. The number of these atoms, divided by that of the total population of the starting state, constitutes the relative frequency of the atoms making the jump. Given the enormous number of atoms involved in these jumps, these relative frequencies are essentially probabilities, namely transition probabilities of the same type that Albert Einstein had studied in his work on the quantum theory of radiation [11].

The connection between products of matrices and transition probabilities was, and in some ways still is, far from immediate, and we believe that Jordan soon became convinced of the impossibility of directly connecting the transition probability to the matrix multiplication: as we shall see, the reason is that the usual probability is in contrast with the laws of optics. To describe the micro-physical phenomena, it was necessary to introduce new ideas and concepts. This is what Jordan did by placing the notion of probability amplitude at the basis of the first system of axioms for quantum mechanics.

Although he attributed to Pauli the merit of having identified this notion, it is in Jordan’s work [2] that the basic principles of quantum mechanics are first formulated by means of probability amplitudes and that the probabilities—conditional probabilities (relative Wahrscheinlichkeit) and not absolute probabilities, as Jordan repeatedly specifies—are obtained as the squared modulus of the amplitudes. In what follows, when referring to the usual conditional probability based on Rényi’s axioms [12], we will denote it by P(H|E), in which H is the hypothesis and E is the evidence [13]. On the other hand, when referring to a probability amplitude and to a conditional probability in quantum mechanics, we will denote them by \(\phi (A|S)\) and \(|\phi (A|S)|^2 = P(A|S)\), respectively, in which A is the assertion and S the supposition that von Neumann called, respectively, Behauptung and Voraussetzung [14].

3 The postulates of Jordan

David Hilbert was teaching at the University of Göttingen in the period we are considering and he had a great interest in the formal structure of quantum mechanics. It is therefore natural that, after having formulated matrix mechanics with Heisenberg and Born, Jordan tried to identify a system of axioms that could contain “in itself as very special cases all the [hitherto known] four formulations [of quantum mechanics]” (alle vier Formulierungen in sich als sehr spezielle Fälle) [2], specifically: matrix mechanics, Born and Norbert Wiener’s operational calculus, Erwin Schrödinger’s wave mechanics and Paul Dirac’s q-number theory. The result of his work, entitled “On a new foundation of quantum mechanics” (Über eine neue Begründung der Quantenmechanik), was submitted to the Zeitschrift für Physik toward the end of 1926 and published at the beginning of the following year. This work of Jordan is in many respects lacking, as Duncan and Janssen have shown [15]. Nevertheless, Jordan’s axiomatic system was immediately reworked by Hilbert, John von Neumann and Lothar Nordheim [16] with some important changes that, however, did not modify the most significant innovation introduced by Jordan.

Jordan’s axiomatic system consists of two main axioms, Postulat I and Postulat II, and four auxiliary ones, Postulat A, Postulat B, Postulat C and Postulat D. A few months later, Jordan formulated another system [17] in which he took into account the improvements suggested by Hilbert and co-authors; for example, he used the Hilbert space formalism and dropped the supplementary amplitude (Ergänzungsamplitude). Anyone interested in Jordan’s second axiomatization can find it precisely described and carefully analyzed by Anthony Duncan and Michel Janssen [15, 18]. In the following, we will deal only with Postulat II and Postulat C, which we consider the fundamental ones.

In the introductory section (Einleitung), Jordan statesFootnote 1:

Postulate II: Let \(\psi (Q_0,q_0)\) be the probability amplitude that quantity Q takes the value \(Q_0\), given that \(q=q_0\), then the probability amplitude \(\Phi (Q_0,\beta _0)\) of \(Q_0\) given \(\beta _0\) will be equal to

$$\begin{aligned} \Phi (Q_0,\beta _0)=\int \psi (Q_0,q)\varphi (q,\beta )\mathrm{{d}}q \end{aligned}$$
(3)

where the integration must be extended to all values of q.

where Q, q and \(\beta \) are continuous Hermitian quantum mechanical quantities.

In the original expression of (3), as reported here, there is a misprint: \(\beta \) must be replaced by \(\beta _0\) in the amplitude \(\varphi (q,\beta )\) on the right-hand side of the equality, so (3) should be read as: \(\Phi (Q_0,\beta _0)=\int \psi (Q_0,q)\varphi (q,\beta _0)\mathrm{{d}}q\).

Immediately after the formulation of this postulate, Jordan adds that it seems appropriate to qualify Postulat II as “interference of probabilities” (Interferenz der Wahrscheinlichkeiten) since it imposes that “not the probabilities themselves but their amplitudes follow the usual law of combination of the probability calculus” (nicht die Wahrscheinlichkeiten selbst, sondern ihre Amplituden dem gewöhnlichen Kombinationsgesetz der Wahrscheinlichkeitsrechnung folgen).

The “usual law of combination” that Jordan refers to is known as the law of total probability and states that when propositions \(A_{1},\ldots ,A_{j},\ldots ,A_{n}\) are such that \(\bigcup \limits _{j=1}^n A_j\) is logically true while the conjunction \(A_j,A_h\) is logically false when \(j\ne h\), then the conditional probability of H, given E, is:

$$\begin{aligned} P(H|E)=\sum _{j=1}^n P(H|E,A_j) P(A_j|E). \end{aligned}$$
(4)

Equation (4) can be very easily proved by applying the sum and product rules of probability theory.Footnote 2 Even neglecting the difference between integral and summation, the arguments of (4) are different from those of (3). In fact, E appears in the evidence of \(P(H|E,A_j)\) of (4), while \(\beta _0\) does not appear in the supposition of \(\psi (Q_0,q)\) of (3). This difference is a consequence of the Markov property, normally assumed in quantum mechanics. Jordan assumes this property without explicitly mentioning it.
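
The law of total probability (4) can be checked with a minimal numerical sketch. The joint distribution below is invented purely for illustration; since the propositions \(A_j\) form a partition by construction, identity (4) must hold exactly, which the final assertion verifies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented joint distribution over (H, E, A_j), with H and E binary and A_j ranging
# over three mutually exclusive and exhaustive alternatives (a partition by construction).
joint = rng.random(size=(2, 2, 3))
joint /= joint.sum()

def prob(h=None, e=None, a=None):
    """Joint/marginal probability; unspecified arguments are summed out."""
    idx = (slice(None) if h is None else h,
           slice(None) if e is None else e,
           slice(None) if a is None else a)
    return joint[idx].sum()

h, e = 1, 0
lhs = prob(h=h, e=e) / prob(e=e)                      # P(H | E)
rhs = sum((prob(h=h, e=e, a=a) / prob(e=e, a=a))      # P(H | E, A_j)
          * (prob(e=e, a=a) / prob(e=e))              # P(A_j | E)
          for a in range(3))
assert np.isclose(lhs, rhs)                           # Eq. (4)
```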

In paragraph 2, entitled “Statistical Foundation of Quantum Mechanics” (Statistische Begründung der Quantenmechanik), where “statistical” can be read as “probabilistic”, Jordan formulatesFootnote 3:

Postulate C: Probabilities combine by interfering. Let \(F_1,F_2\) be two facts for which there are the amplitudes \(\varphi _1, \varphi _2\). When \(F_1, F_2\) are mutually exclusive,

$$\begin{aligned} \varphi _1 + \varphi _2 \end{aligned}$$
(5)

is the amplitude for the fact “\(F_1\ \text {or}\ F_2\)”; when \(F_1, F_2\) are independent

$$\begin{aligned} \varphi _1 \varphi _2 \end{aligned}$$
(6)

is the amplitude for the fact “\(F_1\ \text {and}\ F_2\)”.

Immediately after, Jordan notes that “As a first consequence we get that” (Als erste Folgerung ergibt sich)Footnote 4:

Let \(\varphi (x,y)\) be the amplitude for a value x of q, given \(\beta =y\) and \(\chi (x,y)\) the amplitude for \(Q=x\), given \(q=y\); then

$$\begin{aligned} \Phi (x,y)=\int \chi (x,z)\varphi (z,y) \mathrm{{d}}z \end{aligned}$$
(7)

is the amplitude for \(Q=x\) given \(\beta =y\).

The way in which this claim is formulated may generate some perplexity since in the premise the value of q is set first equal to x and then to y, while, in the thesis, these are the values of Q and \(\beta \), with z being the value of q. The fact remains that (3) is a rewriting of (7), and thus this last claim shows that Postulat II is a consequence of Postulat C.

Jordan enunciates this consequence without proving it. The first thing that comes to mind is that he omitted the proof because he considered it very simple and similar to that of (4). This interpretation appears similar to that of Duncan and Janssen, who see Postulat C as a means used by Jordan [15]:

to capture the striking features in his quantum formalism that the probability amplitudes rather than the probabilities themselves follow the usual composition rules for probabilities.

This viewpoint—the fact that Postulat II is a consequence of Postulat C—provides a kind of justification of Postulat II in the sense that, by applying the basic rules of the theory of probability to probability amplitudes, it is possible to obtain the interference law. But is this really the case? Are the two rules of Postulat C really those of sum and product in probability theory transposed into quantum mechanical probability amplitudes?

Before going into details, we note that Jordan advances two justifications of Postulat II. One of these is formal, as just seen; the other is informal but more intuitive. We aim to analyze these justifications. Anticipating the results of our analysis, we say that while the formal justification does not seem to be acceptable, the informal one is much more satisfactory.

Starting with the formal justification, the first thing to do is to clarify the meaning of the facts (Tatsachen) \(F_1\) and \(F_2\) for which, as Jordan says, “there are the amplitudes \(\varphi _1\) and \(\varphi _2\)”. If \(F_i\) is a fact for which there is the amplitude \(\phi _i\), then \(F_i\) is a conditional fact. By conditional fact we mean a fact whose occurrence is strictly linked to another fact which may or may not occur. In other words, a conditional fact takes place only in case another fact takes place. As stressed by Bruno de Finetti [19], a conditional event H|E has three truth values: true, when E is true and H is true as well; false, when E is true but H is false; void, that is neither true nor false, when E is false. The same holds for the facts considered by Jordan. When \(F_i\) is a fact conditional on another fact \(A_i\), \(F_i\) occurs when, \(A_i\) having occurred, \(F_i\) occurs as well; \(F_i\) does not occur when, \(A_i\) having occurred, \(F_i\) does not occur; finally, \(F_i\) is void when \(A_i\) does not occur. Taking the roulette game as an example, a conditional fact could be the occurrence of “number 21” when a “red number” occurs. As a conditional event, this conditional fact may be denoted by \(21 \vert \text {red}\), where “21” is the assertion and “red” the supposition. A bet on the occurrence of \(21\vert \text {red}\) wins if the ball ends in pocket 21 and loses if the ball ends in any one of the other 17 red pockets. The bet is canceled, that is, it is considered as not having been made, if the ball ends in one of the 19 pockets with a color other than red.

Assuming that x, y and z are values for Q, \(\beta \) and q respectively, let us write in full the amplitudes on the right-hand side of (7), namely \(\chi (Q=x\,|\,q=z)\) and \(\varphi (q=z\,|\,\beta =y)\). Now we can ask ourselves about the meaning of the premises “when \(F_1, F_2\) are mutually exclusive” and “when \(F_1, F_2\) are independent”.

Starting from the first, we consider two conditional facts \(F_n\) and \(F_m\). We suppose that these facts involve four quantities A, B, C and D whose values are a, b, c and d, respectively. Thus, we can write them as:

$$\begin{aligned} F_n\equiv (A=a\,|\,B=b)\quad \text {and}\quad F_m\equiv (C=c\,|\,D=d) \end{aligned}$$
(8)

Now we can ask how \(F_n\) and \(F_m\) can be mutually exclusive. If, as in (8), we are dealing with conditional facts without any mutual connection, this question makes no sense. Returning to the roulette example, does it make sense to ask whether the outcome of the number “21” on the first wheel of the San Remo Casino and the outcome of the number “25” on the first wheel of the Monte Carlo Casino are mutually exclusive? The only possibility that we can imagine for this exclusion is the phantasmagorical assumption that the croupiers turning the wheels in San Remo and Monte Carlo somehow arrange for this double outcome never to occur.

Evidently Jordan was not considering facts that are completely disconnected like those we have imagined for the two roulettes, but rather conditional facts that are connected in the sense that the supposition of one becomes the assertion of the other. To understand what sort of exclusion Jordan was considering, let us imagine a traveler who, coming from city \(A=a\), is going to city \(C=c\) and, during the trip, comes to a crossroad B from which three roads \(B=b_1\), \(B=b_2\) and \(B=b_3\) branch off, each one leading to C. Let us also suppose that the traveler intends to take only one of the possible alternative roads at B. As a consequence, the three conditional journeys, mutually exclusive, are

$$\begin{aligned} B = b_j\,|\,A=a \quad \text {and}\quad C=c\,|\,B = b_j\qquad \text {with}\;j=1,2,3. \end{aligned}$$

This is what (7) expresses when the three roads become a continuous infinity.

Now let us move to Jordan’s rule of independence. The first observation that must be made is that in probability theory the product rule has nothing to do with independence. One of the axioms of the theory of conditional probability [12] is the product rule, according to which the conditional probability of the conjunction of hypotheses A and B, given C, is

$$\begin{aligned} P(A,B\,|\,C) = P(A\,|\,C) P(B\,|\,C,A). \end{aligned}$$

In this equality there is no clue about stochastic independence. That is introduced when

$$\begin{aligned} P(A,B\,|\,C)=P(A\,|\,C)P(B\,|\,C), \end{aligned}$$

that is, when

$$\begin{aligned} P(B\,|\,C,A) = P(B\,|\,C). \end{aligned}$$

Stochastic independence is a property of the probability function and not of hypotheses, events or facts. It is worth noting that since stochastic independence is not an axiom, the theory of probability can be fully developed without ever involving independence. For example, Pierre–Simon Laplace himself did not use stochastic independence in one of the earliest applications of the theory of probability to inferential statistics [20].
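
The distinction just drawn can be illustrated with a small numerical sketch (the joint distribution of the three binary variables A, B, C is invented for the purpose): the product rule holds for any distribution whatsoever, whereas the independence condition \(P(B\,|\,C,A) = P(B\,|\,C)\) is an extra constraint that a generic distribution does not satisfy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented joint distribution P(A, B, C), all variables binary.
joint = rng.random(size=(2, 2, 2))
joint /= joint.sum()

def prob(**fixed):
    """Probability of an event fixing some of A, B, C; the rest are summed out."""
    idx = tuple(fixed.get(name, slice(None)) for name in ("A", "B", "C"))
    return joint[idx].sum()

a, b, c = 1, 0, 1

# Product rule (an axiom): P(A, B | C) = P(A | C) * P(B | C, A) -- holds identically.
lhs = prob(A=a, B=b, C=c) / prob(C=c)
rhs = (prob(A=a, C=c) / prob(C=c)) * (prob(A=a, B=b, C=c) / prob(A=a, C=c))
assert np.isclose(lhs, rhs)

# Stochastic independence would further require P(B | C, A) == P(B | C);
# for this generic distribution the two numbers differ.
print(prob(A=a, B=b, C=c) / prob(A=a, C=c), prob(B=b, C=c) / prob(C=c))
```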

Jordan reverses and modifies things in the sense that he first speaks of the independence of facts, and then, as a consequence of this, introduces the product of the probability amplitudes. But, since Jordan’s independence is not stochastic independence, it is important to understand what he means when he assumes that \(F_1\) and \(F_2\) are independent. In this regard, it is worth reading what, twenty years later, Richard Feynman wrote about a problem very similar to Jordan’s.

3.1 Feynman’s measurements

In the second paragraph of the paper where he proposed his famous paths, Feynman considers [21]:

an imaginary experiment in which we can make three measurements successive in time: first of a quantity A, then of B, and then of C. There is really no need for these to be of different quantities, and it will do just as well if the example of three successive position measurements is kept in mind. Suppose that a is one of a number of possible results which could come from measurement A, b is a result that could arise from B, and c is a result possible from the third measurement C. [...] We define \(P_{ab}\) as the probability that if measurement A gave the result a, then measurement B will give the result b. Similarly, \(P_{bc}\) is the probability that if measurement B gives the result b, then measurement C gives c. Further, let \(P_{ac}\) be the chance that if A gives a, then C gives c. Finally, denote by \(P_{abc}\) the probability of all three, i.e., if A gives a, then B gives b, and C gives c. If the events between a and b are independent of those between b and c, then

$$\begin{aligned} P_{abc} =P_{ab} P_{bc}. \end{aligned}$$
(9)

[...] In any event, we expect the relation

$$\begin{aligned} P_{ac} =\sum _{b} P_{abc} \end{aligned}$$
(10)

This is because, if initially measurement A gives a and the system is later found to give the result c to measurement C, the quantity B must have had some value at the time intermediate to A and C. The probability that it was b is \(P_{abc}\). [...] The classical law obtained by combining (1) and (2)Footnote 5, [is]

$$\begin{aligned} P_{ac} =\sum _{b} P_{ab} P_{bc} \end{aligned}$$
(11)

Feynman’s probabilities are clearly conditional probabilities. By writing them in full, \(P_{ab}\) becomes \(P(B=b\,|\,A=a)\); \(P_{bc}\) becomes \(P(C=c\,|\,B=b)\); \(P_{ac}\) becomes \(P(C=c\,|\,A=a)\) and \(P_{abc}\) becomes \(P(B=b,C=c\,|\,A=a)\). Consequently, (9) becomes

$$\begin{aligned} P(B=b,C=c \,|\, A=a) = P(B=b \,|\, A=a) P(C=c\,|\,B=b). \end{aligned}$$
(12)

It is clear that (12) is not a consequence of stochastic independence since, by the product rule,

$$\begin{aligned} P(B=b,C=c \,|\, A=a)\; =\; P(B=b \,|\, A=a) P(C=c \,|\, A=a, B=b) \end{aligned}$$
(13)

and, if stochastic independence holds, (13) becomes

$$\begin{aligned} P(B=b,C=c \,|\, A=a) \;=\; P(B=b \,|\, A=a) P(C=c \,|\, A=a). \end{aligned}$$
(14)

Therefore, when Feynman refers to independence he does not mean stochastic independence; rather, he has in mind a stochastic dependence weakened by the Markov property. Feynman, in the quotation above, refers to probabilities and not to amplitudes, but his argument is analogous to Jordan’s, who considers facts while having in mind conditional events and conditional probabilities, as suggested by the opening of Postulat C. Following Feynman, let us try to understand what Jordan means when writing “\(F_1, F_2\) are independent”.
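
A small sketch may help here. Assuming, purely for illustration, that the three successive measurements form a Markov chain with invented transition matrices, the code below reproduces Feynman’s relations (9)–(11) and shows at the same time that the stochastic-independence condition (14) generally fails: what holds is Markovian dependence, not independence.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_stochastic(n):
    """An n x n matrix of invented conditional probabilities; each row sums to 1."""
    m = rng.random(size=(n, n))
    return m / m.sum(axis=1, keepdims=True)

T_ab = random_stochastic(3)   # T_ab[a, b] = P(B = b | A = a)
T_bc = random_stochastic(3)   # T_bc[b, c] = P(C = c | B = b)

a, b, c = 0, 1, 2

# Eq. (9)/(12): Markov composition P(B=b, C=c | A=a) = P(B=b | A=a) * P(C=c | B=b).
P_abc = T_ab[a, b] * T_bc[b, c]

# Eqs. (10)-(11): P(C=c | A=a) = sum over b of P(B=b | A=a) * P(C=c | B=b).
P_ac = sum(T_ab[a, k] * T_bc[k, c] for k in range(3))
assert np.isclose(P_ac, (T_ab @ T_bc)[a, c])

# Stochastic independence, eq. (14), would require P(C=c | A=a, B=b) = P(C=c | A=a);
# under the Markov property P(C=c | A=a, B=b) = P(C=c | B=b), which generically differs from P_ac.
print(T_bc[b, c], P_ac)
```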

In order to obtain Eq. (12), Feynman also speaks of independence but, unlike Jordan, he says something more, namely he assumes that the measurements of A and B do not influence what happens between the measurements of B and C. This is a way of expressing the absence of any physical influence between the measurements. In other words, Feynman assumes that the material situation in which the three successive measurements are performed does not change. But the fact that all measurements take place under the same material conditions does not imply that the outcome of one measurement cannot change the probability of the outcome of the subsequent measurement. Physical independence is not stochastic independence and, above all, the former does not imply the latter since, as shown, Feynman’s independence is Markovian stochastic dependenceFootnote 6. Returning to Jordan, since he refers to results of observations, we speculate that, in order to demonstrate that Postulat II follows from Postulat C, he must have used arguments similar to those used by Feynman. That is, he must have thought, first, that \(F_1\) and \(F_2\) are related in such a way that in \(F_1\equiv q = z | \beta = y\), \(q=z\) is an assertion, while in \(F_2\equiv Q = x | q=z\), \(q=z\) is a supposition. Second, that independence ensures Markovian stochastic dependence for probability amplitudes. Third, that at any intermediate time between the determinations of the value of the quantities \(\beta \) and Q, the quantity q must have one and only one value.

We cannot know whether Jordan intended to deduce Postulat II from Postulat C because he was convinced that this axiom transforms the sum and product rules of probability theory into similar rules for amplitudes in quantum mechanics. Perhaps Jordan believed he was closer to probability theory by deducing Postulat II from Postulat C. In view of the title of paragraph 2, this deduction would have made the interference law of probability amplitudes almost a consequence of the principles of probability. But the rules of exclusion and of independence of Postulat C are not the sum rule and the product rule of probability theory rephrased for probability amplitudes. It follows that Postulat C neither explains nor justifies the interference law. Nevertheless, we can ask ourselves how Jordan arrived at Postulat II, notably how he could get there starting from matrix multiplication. Before discussing this, however, we must talk about the transformation that the concept of probability underwent in Göttingen.

4 From absolute to conditional probability

In order to fully understand the generalization Jordan made of the notion of probability amplitude, let us see how, between 1926 and 1927, the notion of probability changed in Göttingen. For this purpose it is appropriate to mention a similar transformation that took place in probability theory.

In the mid-nineteenth century, after Laplace, Carl Friedrich Gauss and Siméon Poisson, the classical theory of probability made use of two notions of probability, one absolute and the other conditional. The former was used in the theory of direct probabilities, the latter in that of inverse probabilities, that is, the field that today we call Bayesian statistics. The statistical inference that Laplace published in 1774 [20] is the first case where conditional probability is used to estimate unknown quantities. The contributions of James Clerk Maxwell and Ludwig Boltzmann, based on absolute probabilities, while giving great impetus to the frequency interpretation of probability, did not change the situation. A deep modification occurred only in 1921 with the publication of A treatise on probability by John Maynard Keynes, in which we find the statement: “No proposition is in itself probable or improbable, just as no place is intrinsically distant.” [22], (p. 7, edition 1957). A nice way to say that all probabilities are conditional. Keynes’s thesis deeply changed the understanding of probability theory, as research by Harold Jeffreys, Bruno de Finetti, Leonard J. Savage and Edwin T. Jaynes confirmed years later. Obviously, these contributions were not known to the Göttingen physicists; nevertheless, their view of probability was changing as well.

In a preliminary communication (Vorläufige Mitteilung) published in 1926 [7], Born argued that in a collision between an electron and an atom, the states of both systems remain coupled in an intricate way (in verwickelter Weise). However, when referring to the electrons that, after the collisions, scatter in various directions, he writesFootnote 7 (note \(^1\) in the text below will be explained after the quotation):

If we now want to reinterpret this result corpuscularly, then only one interpretation is possible: \(\Phi _{\overset{nm}{\tau }} (\alpha , \beta , \gamma )\) determines the probability\(^1\) that the electron coming from the z direction is launched [...] in a given direction \(\alpha , \beta , \gamma \).

Note \(^1\) above is the celebrated note, added during proofreading, which specifies: More careful reflection shows that the probability is proportional to the square of the magnitude \(\Phi _{\overset{nm}{\tau }}(\alpha , \beta , \gamma )\)Footnote 8. Born’s probability concerns only the hypothesis, that is, the direction \(\alpha , \beta , \gamma \) in which the electron can be scattered. On the other hand, immediately after this passage, Born addsFootnote 9:

You get no answer to the question “what is the state after the collision” but only to the question “how likely is a certain effect of the collision”

which reiterates that probability deals only with the effect of the collision. The “square of the magnitude \(\Phi _{\overset{nm}{\tau }}(\alpha , \beta , \gamma )\)” is an absolute probability.

Having only one argument, Born’s probability is in perfect harmony with the classical probability of Laplace. Jordan keenly grasped this feature of Born’s probability. In fact, in the introductory section of [17] he observes that the only argument of Born’s probability is the state of the system n, that is, Born’s probability is an absolute probability.

Pauli, in an article published a year later, generalized Born’s proposal by assuming that probability is a two-argument function. In a note, considering a system of N particles with position coordinates \(q_1 \ldots q_f\) such that each quantum state of the system corresponds to a function \(\psi (q_1 \ldots q_f)\), he definesFootnote 10:

\(|\psi (q_1 \ldots q_f)|^2 \mathrm{{d}}q_1 \ldots \mathrm{{d}}q_f\) is the probability that in the concerned quantum state of the system of interest these coordinates are located simultaneously in the concerned volume element \(\mathrm{{d}}q_1 \ldots \mathrm{{d}}q_f\).

Pauli’s probability concerns an assertion, the coordinate of the system, and a supposition, the state of the system.

Once again, Jordan captures the novelty inherent in Pauli’s probability. Unlike Born’s probability, Pauli’s probability is a two-argument function: an assertion and a supposition. Jordan grasps one of the generalizations introduced by Pauli but misses, or at least does not explicitly mention, the other crucial change introduced by Pauli’s conditional probability. In truth, in addition to Jordan, this peculiarity of probability in quantum mechanics has escaped many physicists. As Bernard O. Koopman argued [24], in the usual conditional probability the hypothesis and the evidence refer to distinct outcomes of the same trial, while in quantum probability the assertion and the supposition refer to the outcomes of different trials. In probabilistic terms, in the former the hypothesis and the evidence refer to the same sample space, while in quantum probability the assertion and the supposition refer to different sample spaces. This is precisely what makes Einstein’s probability and Pauli’s probability deeply different. In Einstein’s transition probability, hypothesis and evidence refer to the same quantity, the energy level. In Pauli’s probability the assertion refers to a value of a quantity, the coordinate, while the supposition refers to a different quantity, the state.

In the introductory paragraph, Jordan credits Pauli with the generalization he is about to suggest. He writes, “Pauli has drawn attention to the following generalization” (Pauli hat folgende Verallgemeinerung ins Auge gefaßt). A brief comment on Jordan’s keenness on attributing to others the innovations he gradually introduces seems appropriate here, since the reason could be simple. When Jordan writes this article, he is very young (24 years old) and has just attended the lectures and seminars where the issues he is dealing with have been discussed. Therefore, he probably feels somehow obliged to support his ideas with the authority of his teachers, in particular to seek the support of Pauli, whose surly and arrogant character he knew well.

Immediately afterward, dropping the hydrogen atom, Jordan completes the decisive step for the introduction of the interference law of probability amplitude. The notion of probability that Jordan introduces immediately before formulating Postulat I and Postulat II represents a fundamental improvement since he puts aside the energy levels and the coordinates of the electron of the hydrogen atom and employs any two quantitiesFootnote 11:

Let \(q,\beta \) be two [continuous] quantum mechanical Hermitian quantities [...], then there is always a function \(\varphi (q,\beta )\) such that

$$\begin{aligned} \left| \varphi (q_0, \beta _0)\right| ^2 \mathrm{{d}}q \end{aligned}$$
(18)

measures the (relative) probability that, given the value \(\beta _0\) of \(\beta \), the quantity q has a value in the interval \(q_0, q_0+\mathrm{{d}}q\).

It is worth noting that Jordan is the first among the Göttingen researchers to note explicitly that (18) is a conditional probability. Moreover, we recall that, as observed by Duncan and Janssen, the assumption of continuity, introduced by Jordan “for convenience” (der Bequemlichkeit halber), raises serious technical difficulties, later overcome by von Neumann [15]. But this is not what interests us; we are interested in the deep change, almost a revolution, that (18) introduces into the theory of probability, which in Göttingen, where Keynes’ book was unknown, was still considered to be the classical probability of Laplace.

We have seen that for Born probability is an absolute notion. This is also the way some modern authors interpret probability, by referring just to the hypothesis without reference to any evidence. Pauli generalizes Born’s idea in the sense that the arguments of his probability are two, the assertion—that is, the position of the system is in a given interval—and the supposition—that is, the system is in a given state. This means that if the system were in a different state, then the probability of being in that interval might be different. It is worth noting that while the assertion talks about the location of the system, the supposition talks about its energy, so the quantity considered in the assertion is different from that considered in the supposition. Nonetheless, Pauli’s proposal remains linked to Heisenberg’s idea since the amplitude and the probability still refer to the position and energy of the electron of the hydrogen atom. Jordan’s conclusive generalization allows us to determine the probability of the assertion “the quantity q has the value \(q_0\)” when we know the supposition “the quantity \(\beta \) has the value \(\beta _0\)”, with q and \(\beta \) being any possible quantities. Of course these quantities can be the energy level and the position of the electron of a hydrogen atom, but they can also be any other possible quantities. This is Jordan’s significant new generalization: knowing the value of a quantity, (18) allows us to determine the probability that another quantity has a given value. And it is precisely with reference to this notion of conditional probability that Jordan will justify his Postulat II.

We might now be asked: “why do you always talk about probabilities while Postulat II refers to amplitudes?” Our answer is quite simple. Jordan’s most convincing justification for Postulat II is based on a contrast between probability and probability amplitude. We are convinced that the reasoning that led Jordan from matrix multiplication to the interference law of probability amplitudes was essentially probabilistic. In other words, by considering the peculiar characteristics of conditional probability, Jordan figured out those of probability amplitudes. It is no coincidence that in the introductory Paragraph 1 Jordan states that his axiomatization is intended “to ground quantum laws on some simple statistical assumptions [2]” (die quantenmechanischen Gesetze als Folgerungen einiger einfacher statistischer Annahmen zu begründen). As we shall see in the next section, following this analogy he found an insurmountable obstacle: the law of total probability is in open contrast to the behavior of light. But to see this contrast, we must look at the intuitive justification that Jordan gave to Postulat II.

5 Jordan’s intuitive justification of Postulate II

Born and Jordan considered the “law of multiplication between quantum quantities” (Multiplikationsgesetz der quantentheoretischen Größen), that is (1), the “symbolic multiplication” as Born called it, as “The mathematical foundation of Heisenberg’s consideration [10]” (Die mathematische Grundlage der Heisenbergschen Betrachtung). It is therefore quite natural that Jordan turned his attention to this multiplication and considered it as a starting point for the reflections which led him to Postulat II. In (1) there is a summation while in Postulat II there is an integral, but this does not matter. The real difference between Heisenberg’s multiplication and Postulat II is that the latter composes two amplitudes of any two quantities whatsoever. That Postulat II is a “rule of composition of two amplitudes [16]” (Kompositionregel zweier Amplituden) was maintained by Hilbert, von Neumann and Nordheim since for them this rule “has clear analogies with the theorem of addition and the theorem of multiplication of the usual probability calculus [16]” (stellt offenbar ein Analogon zu dem Additions- und Multiplicationstheorem der gewöhnlichen Wahrscheinlichkeitsrechnung dar). We consider these observations as a precise indication of the reasoning followed by Jordan to arrive at the interference law of probability amplitudes. We believe it is important to follow this route again in order to show that Jordan saw quantum mechanics as a generalization of the theory of probability. For Jordan, at least at that time, “the quantum mechanics is ultimately a set of rules for conditional probability” [15]. In other words, Jordan understood quantum mechanics as a probability theory that ceased to be an “elementary calculus of probability [4]” (elementaren Wahrscheinlichkeitsrechnung).

No matter what Jordan thought, Postulat C can in no way justify Postulat II. It therefore seems natural to ask whether there was any other justification for this Postulat in Jordan’s works. A very convincing justification of Postulat II can be found in two works, halfway between popularization and epistemology, published in Die Naturwissenschaften between the end of July and the beginning of August 1927 [4, 5]. It is noteworthy that these papers were conceived while Jordan was developing his axiomatic system.

We focus our attention on a passage found in the eighth and last paragraph of Die Entwicklung der neuen Quantenmechanik (Schluß) [4], whose title is “The generalized theory” (Die verallgemeinerte Theorie), where ‘theory’ refers to quantum theory. In this paragraph Jordan leaves popularization for epistemology. After arguing that all laws “in the sphere of atoms and quanta” (im Gebiete der Atome und der Quanten) are not causal but probabilistic, Jordan writesFootnote 12:

In general, the probability that q is in the interval \(q_0, q_0+\mathrm{{d}}q\) while another mechanical quantity has the value \(\beta _0\) can be represented in the form

$$\begin{aligned} \left| \varphi (\beta _0,q_0)\right| ^2 \mathrm{{d}}q. \end{aligned}$$
(22)

(Actually the matter is a little more complex; however, this complication is not very essential.) Now we want to think that \(\left| \psi (q_0, Q_0) \right| ^2 \mathrm{{d}}Q\) is the probability that another quantity Q has a value in the interval \(Q_0, Q_0+\mathrm{{d}}Q\), when q has the value \(q_0\); if for this quantum mechanical probability the usual probability calculus were valid, it is evident that then one could argue: the probability that Q is in the interval \(Q_0, Q_0+\mathrm{{d}}Q\), when the quantity \(\beta \) has the value \(\beta _0\), is given by

$$\begin{aligned} \mathrm{{d}}Q \cdot \int \left| \varphi (\beta _0, q)\right| ^{2}\cdot \left| \psi (q, Q_0)\right| ^{2} \mathrm{{d}}q \end{aligned}$$
(23)

But quantum mechanical probabilities do not combine in the way corresponding to elementary probability calculus. We know very well from optics that light plus light does not always have to yield strengthened light—but that there are interferences; and this for the theory of light quanta means that the probabilities of the occurrence of light quanta do not simply add up, but that the ‘amplitudes of probabilities’ add up. Now, as Pauli surmised, a probability interference of this kind occurs quite generally for quantum mechanical probabilities. Mathematically, this is expressed as follows: the probability that Q has a value in the interval \(Q_0, Q_0+\mathrm{{d}}Q\) is not given by the formula just seen [(23)] (which should hold for the ordinary probabilities that do not interfere), but by means of \(\mathrm{{d}}Q\cdot \left| \Phi (\beta _0, Q_0)\right| ^2\), and therefore we have

$$\begin{aligned} \Phi (\beta _0, Q_0) = \int \varphi (\beta _0, q) \psi (q,Q_0)\mathrm{{d}}q. \end{aligned}$$
(24)

Thus, in general it is not the probabilities themselves that combine but rather their amplitudes in the manner in which we are accustomed from the elementary probability calculus.

For clarity we add that the only difference between (24) and Postulat II is the following: In \(\Phi (\beta _0, Q_0)\) of (24) the first argument \(\beta _0\) is the supposition and the second argument \(Q_0\) the assertion (the order of arguments follows the symbol \(\beta _0 \rightarrow Q_0\) used in stochastic processes, where \(\beta _0\) is the present state and \(Q_0\) the future state). In \(\Phi (Q_0, \beta _0)\) of Postulat II, \(Q_0\) is the assertion and \(\beta _0\) the supposition (here reference is made to the symbol \(Q_0 \,|\, \beta _0\), denoting a conditional assertion \(Q_0\) that is evaluated only when the supposition \(\beta _0\) is true). The same considerations apply to the other amplitudes of (24) and of Postulat II.

The quotation above is about a contrast opposing the law of total probability to the interference law of probability amplitudes. The assertion “it is not the probabilities themselves that combine but rather their amplitudes” means that the law of total probability holds in the “elementary probability calculus”, the usual theory of probability, but not in quantum mechanics. Among the principles of quantum probability there is no place for (23), and the law of total probability must be replaced. We are convinced that for Jordan the open contrast between this law and the interference of light is the justification of Postulat II.

Of course, this contrast could not be imagined if one had not first considered (23). Only after having considered this formula does Jordan realize its inadequacy. So it is advisable to clarify the “we know very well from optics” which certifies the inadequacy of (23). This will provide us with the opportunity to lay out Jordan’s intuitive justification of Postulat II.

5.1 Landé’s example

The contrast that Jordan first pointed out was proposed again, without mentioning its discoverer, by Heisenberg, Feynman, Hans Reichenbach, Alfred Landé and many others. As we have seen, Jordan is indeed very concise, which is why we believe it is appropriate to return to the phrase “We know very well from optics that light plus light does not always have to yield strengthened light”. With this sentence Jordan asserts that the probabilities of quantum mechanics do not combine according to the law of total probability. However, he does not say why this happens. To see the matter more closely, let us read what Landé wrote when, a quarter of a century later, he took up Jordan’s argument in a beautiful introductory text written with the aim of putting “more emphasis on the physical background of quantum mechanics” [25].

Landé considers a monochromatic light train which issues with transverse polarization in the \(a' \)-direction from a polarizer a which absorbs the \(a''\)-polarization. A crystal plate b, placed in the path of the \(a'\)-ray, splits this ray into two transverse components \(b'\) and \(b''\). An analyzer c, placed after the crystal plate, passes only light of linear \(c'\)-polarization. Finally, a screen placed behind c makes it possible to detect the intensity \(J_{a'c'}\) which, Landé points out, is a fraction of the initial \(a'\)-intensity. If the light leaving the polarizer a were composed of photons polarized in the direction \(a'\), then ( [25], pp. 107-108):

polarization effects might be explained by photons depicted as transverse two-way arrows. [...] the photons issuing from the polarizer have their double arrows in the \(\pm a'\)-direction. When entering the crystal b, the fraction \(J_{a'b'} = \cos ^2(a'\cdot b')\) of the \(a'\) photons turn to the \(b'\)-orientation, and a corresponding fraction to the \(b''\)-orientation. Next, the fraction \(J_{b'c'} = \cos ^2(b'\cdot c')\) of the b-photons turn to the \(c'\)-orientation, and so on, so that the final intensity passed by the analyzer \(c'\) would be

$$\begin{aligned} J_{a'c'} = J_{a'b'} J_{b'c'} + J_{a'b''} J_{b''c'} \end{aligned}$$
(25)

as a sum of products, the first product indicating the probability of a photon turning from the orientation \(a'\) to \(c'\) via \(b'\), the second via \(b''\). However, the result (25) is wrong. It does not account for the phase shifts in b. The correct result is obtained by the following method, known to every student of optics.

Replace Eq. (25) by the sum of products of complex amplitudes

$$\begin{aligned} \Psi _{a'c'} = \Psi _{a'b'} \Psi _{b'c'} + \Psi _{a'b''} \Psi _{b''c'} \end{aligned}$$
(26)

where \(\Psi _{a'b'} = \cos (a' \cdot b')\exp {(i\varphi _{a'b'})}\), and put

$$\begin{aligned} J_{a'c'} = \left| \Psi _{a'c'}\right| ^2 = \big |\sum _b \Psi _{a'b}\Psi _{bc'}\big |^2 \end{aligned}$$
(27)

The contribution \(\Psi _{a'b}\Psi _{bc'}\) is a product of two ordinary component factors, \(\cos (a',b')\,\cos (b',c')\), multiplied by a product of two phase factors, representing the whole phase shift \((\varphi _{a'b'} + \varphi _{b'c'})\) on the path from \(a'\) to \(c'\) via \(b'\).

Landé, taking up Jordan’s contrast, compares a formula of probability theory with a formula of quantum mechanics. The starting quantity is the polarization given by the polarizer a, the intermediate one is the transverse component given by the crystal plate b, while the arrival quantity is the polarization established by the analyzer c. The difference between Jordan’s and Landé’s arguments is simply in the number of values. If we take this difference into account, (25) can be written as (23) and (27) can be written as (24). Like Jordan, Landé observes that the resulting intensity \(J_{a'c'}\) is not the addition of two squares as in (25) but the square of the absolute value of an addition as in (27).

A light train is a population of photons and its intensity is the number of photons that compose the train. Due to the enormous number of photons involved in light, the relative frequency of photons in a fraction of a light train is a probability or, better, the conditional probability of an assertion given a supposition. To be convinced of this, it suffices to note that the intensity \(J_{a'c'}\) is the number of photons which, having been emitted by a with polarization \(a'\), after having passed through b and c, arrived on the screen with polarization \(c'\). As its subscript indicates, \(J_{a'c'}\) is the number of \(c'\) relative to that of \(a'\). Landé, at least at the beginning, does not speak of probability but of intensity; however, when he illustrates the addends of (25), he specifies that they are probabilities of photons. If we read (25) carefully, \(J_{a'c'}\) is not the probability that a photon manifests the \(c'\) polarization but the probability that a photon, originally with polarization \(a'\), manifests the polarization \(c'\). It is therefore the conditional probability \(P(c = c' \,|\, a = a')\), and (25) can be written

$$\begin{aligned} P(c = c' \,|\, a = a') = \sum _{j=b',b''} P(b = j \,|\, a = a') P(c = c' \,|\, b = j). \end{aligned}$$
(28)

For Jordan, assertion and supposition are different from those of Landé, but the reasoning is the same, since both (23) and (25) would be valid if, like the ordinary one, the probability of quantum mechanics satisfied both the law of total probability and the Markov property. The probability of quantum mechanics must describe the behavior of light, and experience shows that this behavior does not conform to the laws of probability. As Jordan remarks, there are cases in which “the probabilities of the occurrence of light quanta do not simply add up”, that is, an increase in photons does not imply an increase in light and therefore in probability. To obtain the correct light intensities, one must add not products of probabilities but products of probability amplitudes. The behavior of light can only be described if, instead of adding products of squares, which are positive, one takes the squared absolute value of an algebraic sum. Such a square is able to account for the reduction of light and therefore for interference. Jordan proposes Postulat II only after having ascertained that “quantum-mechanical probabilities do not combine in this way”, that is, after having established that microphysics cannot be described by ordinary probability.
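
Landé’s two-path computation can be sketched numerically. The angles and the relative phase introduced by the crystal plate below are invented illustrative values; the sketch contrasts the “classical” composition (25) of intensities with the composition (26)–(27) of complex amplitudes, and the two results differ.

```python
import numpy as np

theta_b = np.deg2rad(30.0)   # angle between a' and b' (illustrative value)
theta_c = np.deg2rad(70.0)   # angle between a' and c' (illustrative value)
delta   = np.deg2rad(40.0)   # relative phase shift acquired in the crystal plate b (illustrative)

# Amplitudes cos(angle between directions) for each step; b'' is orthogonal to b'.
psi_ab1, psi_b1c = np.cos(theta_b), np.cos(theta_c - theta_b)
psi_ab2, psi_b2c = np.cos(theta_b + np.pi / 2), np.cos(theta_c - theta_b - np.pi / 2)

# Eq. (25): sum of products of intensities (the law of total probability).
J_classical = psi_ab1**2 * psi_b1c**2 + psi_ab2**2 * psi_b2c**2

# Eqs. (26)-(27): sum of products of complex amplitudes, then squared modulus.
J_quantum = abs(psi_ab1 * psi_b1c + psi_ab2 * psi_b2c * np.exp(1j * delta))**2

print(J_classical, J_quantum)   # the two composition laws give different intensities
```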

6 Heisenberg and Jordan

The history of quantum mechanics we have followed so far was aimed at guiding our attempt to speculate about the intellectual path that led Jordan from matrix multiplication to the law of interference of probability amplitudes by way of probability theory. However, there is still one part missing in this history. The missing part deals with the deep difference between the context in which Heisenberg elaborated the “quantum-theoretical reinterpretation” (quantentheoretische Umdeutung) and the one in which Jordan established the “new foundation of quantum mechanics” (neue Begründung der Quantenmechanik).

Following Ernst Mach, Heisenberg starts from the assumption that the natural sciences must be based on facts and not on entities that cannot be observed. The description of the hydrogen atom cannot be made using the orbits of its electron, which cannot be observed. The hydrogen atom must be described by something observable, in particular by its spectral lines. Hence, Heisenberg replaces the position of the electron q(t) with a double-entry table \(q_{mn}(t)\) and sets \(q_{mn}(t)=q_{nm}^*(t)\), since physical quantities have real values. Then he assumes that each element of the table has a time factor \(\exp (i\omega _{mn}t)\), where \(\omega _{mn}\) is the spectral frequency, and he asks himself: if a quantum quantity has taken the place of the classical x(t), which quantum quantity must take the place of \(x(t)^2\)? (welche quantentheoretische Größe tritt dann an Stelle von \(x(t)^2\)?). The answer, as he saw it, is a “sort of combination” (Art der Zusammensetzung) that is an “almost inevitable consequence of the rule of combining frequencies” (nahezu zwangläufig aus der Kombinationsrelation der Frequenzen). When two complex exponentials are multiplied, the exponent of the product is obtained by adding the exponents of the factors, in the same way as when one multiplies two matrices row by column. Heisenberg does not mention probability amplitudes. As the title of his paper clearly says, he aimed at constructing quantum mechanics by reinterpreting classical relations [6].

By reading Heisenberg’s “quantentheoretische Umdeutung” one clearly gets the impression that the author has no inkling of the probabilistic turn which is about to revolutionize Physics while, by proposing Postulat II, Jordan was fully aware that quantum mechanics was imposing on Physics an irreversible change. To understand Jordan’s awareness of this change it is enough to read the semi-popular articles [3, 5]. To give just an example, in Section 5 there is an explanation of the three particle statistics worked out by means of two cells and two systems, molecules and quanta, for classical and non-classical particles. This explanation is truly remarkable in two respects: on one side, it is the first exemplification of particle statistics made since Enrico Fermi’s article [26, 27], published in 1926; on the other, no reference is made either to distinguishability for classical particles or to indistinguishability for quantum particles but only to the number of systems and the number of cells.

While working on his axiomatic system, Jordan knew that the “Art der Zusammensetzung” is a matrix multiplication whose elements are “transition quantities” ( [8], p. 216). He knew as well that probability results from the square of the modulus of a probability amplitude. Moreover, he was aware that Einstein’s transition probabilities can be replaced by a more general two-argument function giving the conditional probability of the value of one quantity given the value of another quantity. We stress this main feature of Jordan’s approach, which would be adopted by Hilbert and co-workers, notably by von Neumann. Above all, Jordan knew that “transition quantities” must be combined in a way analogous to transition probabilities, that is, in order to obtain the amplitude of the value of one quantity given the value of another quantity, it is necessary to ‘go through’ all the possible values of an intermediate quantity. Now a question arises spontaneously: why does Jordan not use Born’s absolute probability, employing instead a conditional probability function that allows for different quantities in the assertion and the supposition? The answer is given by Jordan himself when he says he followed a suggestion from Pauli. We have already commented on Jordan’s repeated references to his teachers, and we believe that Jordan arrived at conditional probabilities independently of Pauli. In support of this claim, we refer to the dissertation that Jordan defended at the University of Göttingen in 1926 in order to obtain the Habilitation. The dissertation was first published in Die Naturwissenschaften [3] with the title “Causality and statistics in modern physics” (Kausalität und Statistik in der modernen Physik) and then translated by Robert Oppenheimer and published in Nature with the title “Philosophical foundations of quantum theory” [28]. In this article Jordan discusses causality in classical mechanics and the a-causality of the new mechanics.

Without following him in the development of his arguments, let us consider the probability Jordan used in an example in which he analyzes a system of two mass points. He supposes that the orthogonal coordinates of these points, from \(x_1\) to \(z_2\), are known but not their momenta, noting that, when this is the case, one must resort to statistical mechanics. After recalling that, according to quantum mechanics, the position and momentum of a system cannot be known simultaneously, Jordan formulates a question which, in his opinion, “is very closely related to that [of statistical mechanics] just discussed” (die mit der eben erörterten eng verwandt ist)Footnote 13:

When we know: the system has a certain probability of being in the first quantum state; and a certain probability of being in the second quantum state; and so on. How large then is the probability that the representative point of the system is located at a certain place in the rectangle from \(x_1\) to \(z_2\) of the coordinate space? This question is immediately answered when the Schrödinger wave function in the coordinate space is known.

We are not interested in the answer but in the premise of the first sentence of this passage. A probability amplitude consists of an assertion and a supposition, and it can only be determined if the supposition is known or supposed to be known. To obtain a probability starting from an amplitude, it is necessary to know the supposition. In the example considered by Jordan, the assertion is the place, say L, while the supposition is the quantum state, say \(S_j\), of the system. Therefore, in order to use quantum mechanics for determining the probability of L, it is necessary to assume that \(S_j\) is known. But Jordan does not assume that he knows the quantum state of the system; he speaks of the probabilities, for the system, of being in the first state, in the second state and so on. In other terms, he assumes that he knows the probability distribution over all possible suppositions. If the quantum state of the system is not known but the probabilities \(P(S_j),\; j=1,2, \ldots ,d \), are known, then, to express the probability of L, it is necessary to combine the probabilities of the conditional assertions \(L \,|\, S_j,\; j=1,2, \ldots ,d\). This amounts to using the law of total probability, that is:

$$\begin{aligned} P(L) = \sum _{j=1}^d P(S_j) P(L \,|\, S_j). \end{aligned}$$
(29)

This is the probabilistic approach to the question, and it is certainly not the way quantum mechanics suggests. The approach based on the Schrödinger wave function, like the matrix approach, does not presuppose, in order to determine the coordinate of the system, knowledge of the probability distribution over the possible states of the system. So the probability distribution over the states in which the system could be is useless. If the state of the system is not known, the probability of the place where the system is located cannot be determined.

We do not dwell on this question here, but rather note how deeply probabilistic the way in which Jordan poses it is. In (29), which summarizes Jordan’s way of reasoning, in addition to the absolute probabilities \(P(S_j)\) of the possible states, there appears the factor \(P(L \,|\, S_j)\), that is, the probability of the conditional assertion \(L \,|\, S_j\). Jordan was fully aware of the role played by conditional probabilities, and it is extremely plausible that this awareness first guided him in proposing the generalization (18) and then led him to combine the probability amplitudes of any two quantities.

Unlike Heisenberg, Jordan was fully aware of the probabilistic turn given to physics by the research to which he had greatly contributed. This awareness is the keystone supporting our thesis about the intellectual path which led Jordan to Postulat II. In the next section, we leave history and, taking some ingredients from probability and philosophy, we try to envisage Jordan’s intellectual path.

7 The road Jordan could have followed

A clear indication about this path comes from Jordan himself who, immediately after the passage in which he opposes (24) to (23), adds (where the “above formula” is (24))Footnote 14:

Probability amplitudes themselves can be thought of as more general matrices—the above formulas for interferential combination of probabilities is mathematically just the same as the formula for matrix multiplication.

Following Jordan’s suggestion, we start with matrix multiplication and, for simplicity, we consider a \(5\times 5\) matrix M whose elements are indicated as (j, k) for \(j, k = 1, \ldots , 5\). Referring to intensities, which Jordan saw as probabilities, the elements below the main diagonal stand for the intensities of emissions, those above the diagonal for the intensities of absorptions. So, the off-diagonal elements correspond to transitions between states (transition quantities for Born), while the diagonal elements correspond to the states.

Working with Born, Jordan had dealt with “transition quantities” connected to the fraction of atoms passing from one state to another. He was therefore dealing with the relative frequencies of the atoms making a transition; in substance, he was dealing with transition probabilities. The element \((j, k), \; j \ne k\), of matrix M can thus be connected to the transition probability from the j-th to the k-th state. We can imagine that Jordan saw M as a probability matrix, at first with reference to Einstein’s transition probabilities and later with reference to more general conditional probabilities, where the assertion is the coordinate of the system and the supposition is the state in which it is found. Having in mind Heisenberg’s “symbolic multiplication”, he might have posed the question: “If the two factor matrices are lists of conditional probabilities, what kind of conditional probabilities result from the product of two such matrices?”

To find an answer to this question, we consider a row-by-column multiplication. In such a multiplication, if \((j,k)^2\) denotes the generic element of the product of M with itself, then

$$\begin{aligned} (j,k)^2 = j,1 \cdot 1,k + j,2 \cdot 2,k + j,3 \cdot 3,k + j,4 \cdot 4,k + j,5 \cdot 5,k. \end{aligned}$$
(30)

So, with reference to the energy levels of the hydrogen atom, Jordan may have imagined the element \((j, k)^2\) of (30) as the result of the transition between two stationary states and the terms on the right as all the possible transitions leading from the j-th to the k-th state through an intermediate state. One immediately realizes that the analogy does not hold for products such as \((4,2) \cdot (2,2)\) or \((4,4) \cdot (4,2)\), since the elements (2, 2) and (4, 4) cannot be considered transitions.
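
For readers who prefer to see the row-by-column rule of (30) at work, the following short Python sketch checks it numerically; the matrix entries are random placeholders and carry no physical meaning.

```python
import numpy as np

# Numerical check of (30): the (j, k) element of the product M·M equals the sum,
# over the intermediate index alpha, of the products (j, alpha)·(alpha, k).
rng = np.random.default_rng(0)
M = rng.random((5, 5))        # placeholder 5x5 matrix, no physical meaning

j, k = 3, 1                   # 0-based indices of a generic element (j, k)
explicit_sum = sum(M[j, a] * M[a, k] for a in range(5))
# Note: the terms with a == j or a == k involve the diagonal elements M[j, j]
# and M[k, k], the ones for which the transition analogy breaks down.

print(np.isclose((M @ M)[j, k], explicit_sum))   # True
```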

Jordan knew Heisenberg’s manuscript, and it is precisely this work that mentions the “symbolic multiplication” and the composition (Zusammensetzung) of three states shown by (2). This composition allows, starting from a state n, to arrive at the state \(n-\beta \) passing through an intermediate state \(n-\alpha \). That is, the passage is partitioned into two stages, the first from n to \(n-\alpha \), the second from \(n-\alpha \) to \(n-\beta \). Of course, all possible values of the intermediate quantity \(\alpha \) must be taken into account. This way of reading Heisenberg’s work could have led Jordan to consider, as in (23), in addition to the two quantities Q and \(\beta \), a third quantity q and consequently the probability \(|\varphi (\beta _0, z)|^2\) of passing from the value \(\beta _0\) of \(\beta \) to a value z of the intermediate quantity q, and then the probability \(|\varphi (z, Q_0)|^2\) of passing from this value z of q to the value \(Q_0\) of Q. We cannot know whether things went exactly as we imagine but, be that as it may, when, in addition to the starting quantity and the arrival quantity, an intermediate quantity is considered, it becomes simple to interpret the product of a row-by-column multiplication when the factors are read as conditional probabilities.

In order to clarify what we mean, and again for the sake of simplicity, let us consider the row-by-column multiplication of two matrices A and B, of the same form as the matrix M above, with elements \(A_{i,j}\) and \(B_{i,j}\) respectively, whose product is the matrix C with elements \(C_{i,j}\). For the generic element \(C_{i,k}\) we have

$$\begin{aligned} C_{i,k} = A_{i,1} \cdot B_{1,k} + A_{i,2} \cdot B_{2,k} + A_{i,3}\cdot B_{3,k} + A_{i,4} \cdot B_{4,k} + A_{i,5} \cdot B_{5,k}. \end{aligned}$$
(31)

If the three matrices are interpreted as three quantities and each pair of indices of (31) as a conditional proposition, for instance i,k as \(k\,|\,i\), that is, the value of C given that of A, we have the equivalence

$$\begin{aligned} k \,|\, i \equiv 1 \,|\, i \cdot k \,|\, 1 + 2 \,|\, i \cdot k \,|\, 2 + 3 \,|\, i \cdot k \,|\, 3 + 4 \,|\, i \cdot k \,|\, 4 + 5 \,|\, i \cdot k \,|\, 5 \end{aligned}$$
(32)

if \(\cdot \) and \(+\) are read as the symbols of conjunction and disjunction. This means breaking up the passage from the value i of A to the value k of C into many passages, each identified by a value j of B, which make it possible to “start” from i and “arrive” at k “passing through” all possible values of j.
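
The conditional-probability reading of (31) and (32) can be sketched in Python: if the rows of A and B are conditional probability distributions, the rows of the product are again conditional probability distributions, each element combining the two conditionals through all values of the intermediate quantity. The matrices below are random placeholders chosen only to illustrate the identity.

```python
import numpy as np

# Probabilistic reading of (31)-(32): if row i of A lists P(B = j | A = i) and
# row j of B lists P(C = k | B = j), then row i of the product C = A·B lists
# P(C = k | A = i).  All numbers are illustrative placeholders.
rng = np.random.default_rng(1)

def random_stochastic(n):
    """Return an n x n matrix whose rows are probability distributions."""
    m = rng.random((n, n))
    return m / m.sum(axis=1, keepdims=True)

A = random_stochastic(5)   # A[i, j] plays the role of P(B = j | A = i)
B = random_stochastic(5)   # B[j, k] plays the role of P(C = k | B = j)
C = A @ B                  # C[i, k] then plays the role of P(C = k | A = i)

i, k = 3, 1
print(np.allclose(C.sum(axis=1), 1.0))                                # rows of C are distributions
print(np.isclose(C[i, k], sum(A[i, j] * B[j, k] for j in range(5))))  # the sum in (31)
```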

The reading of (31) that we have just seen is not necessarily connected to physical problems but is commonly used in probability and statistics, for instance for anthropometric characters such as weight, chest (circumference) and height. In this case, once interpreted as in (32), the row-by-column product provides, for each weight-height pair, the combination that can be accomplished with all possible chest values. This combination first subordinates the value j of the chest to the value i of the weight and then the value k of the height to the value j of the chest. As an example, let us consider two matrices like M above, representing 5 intervals of weight, chest and height. We also assume that in the first factor of the multiplication the rows list the weights and the columns the chests, in the second factor the rows list the chests and the columns the heights, and in the product the rows list the weights and the columns the heights. For element (4, 2) of the product, that is, the one concerning people with weight 4 and height 2, the equivalence

$$\begin{aligned} 2|4 \equiv 1\,|\,4 \cdot 2 \,|\,1 + 2\,|\, 4 \cdot 2 \,|\,2 + 3\,|\, 4 \cdot 2 \,|\,3 + 4\,|\, 4 \cdot 2\,|\,4 + 5\,|\,4 \cdot 2\,|\,5 \end{aligned}$$

lists all the chest values that a person who weighs 4 and is 2 tall might have. Since this can be repeated for all elements of the matrix product, each element of this product can be understood as listing all the possible chest values for the corresponding weight-height pairFootnote 15.

Considering three measurements, Feynman [21] does exactly what we have just done with weight, chest and height. Indeed, if A, B and C are three measurements successive in time, it is not really difficult to transform the equivalence (32) into a probability equality. Let us first rewrite (32)

$$\begin{aligned} C = k \,|\, A = i \quad \equiv \quad C = k,\bigcup \limits _{j=1}^5 B = j \,|\, A = i. \end{aligned}$$
(34)

Due to (34) we have

$$\begin{aligned} P(C = k \,|\, A = i) = P(C=k, \bigcup \limits _j B = j \,|\, A = i). \end{aligned}$$

Finally the product and sum rules and Markov’s property give

$$\begin{aligned} P(C = k \,|\, A = i) = \sum \limits _{j=1}^5 P(B = j \,|\, A = i) P( C = k \,|\, B = j). \end{aligned}$$
(35)

If we think of q in (23) as a quantity taking only finitely many values, so that the sum replaces the integral sign, then (35) is Jordan’s (23).
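
The step from the product and sum rules and the Markov property to (35) can also be checked numerically. The following Python sketch builds a joint distribution of three successive measurements satisfying the Markov property and verifies the identity (35); all the distributions involved are arbitrary placeholders of ours.

```python
import numpy as np

# Check of (35): build a joint distribution of A, B, C with the Markov property
# P(C | A, B) = P(C | B), and verify that
# P(C = k | A = i) = sum_j P(B = j | A = i) P(C = k | B = j).
rng = np.random.default_rng(2)
n = 5

pA = rng.random(n); pA /= pA.sum()                    # P(A = i)
pB_given_A = rng.random((n, n))
pB_given_A /= pB_given_A.sum(axis=1, keepdims=True)   # P(B = j | A = i)
pC_given_B = rng.random((n, n))
pC_given_B /= pC_given_B.sum(axis=1, keepdims=True)   # P(C = k | B = j)

# Joint distribution built with the product rule and the Markov property.
joint = pA[:, None, None] * pB_given_A[:, :, None] * pC_given_B[None, :, :]

# Left-hand side of (35): sum rule over B, then conditioning on A.
pAC = joint.sum(axis=1)          # P(A = i, C = k)
lhs = pAC / pA[:, None]          # P(C = k | A = i)

# Right-hand side of (35): combination through the intermediate measurement B.
rhs = pB_given_A @ pC_given_B

print(np.allclose(lhs, rhs))     # True
```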

8 Conclusion

Heisenberg’s “quantum reinterpretation” (quantentheoretische Umdeutung) has nothing to do with probability; his “kind of composition” bears no relation to probability. For Jordan’s “new foundation” (neue Begründung), things went differently. Following probabilistic arguments, he arrived from matrix multiplication at the law of total probability. Reaching this law was essential for Jordan: only after arriving there could he oppose it to the laws of optics. Through the assertion “We well know from optics that light plus light does not always have to give strengthened light”, Jordan made explicit the open contrast between the behavior of light and the principles of “the usual calculus of probabilities”. This contrast shows that the ordinary theory of probability is unable to describe the “interference of probabilities” (Interferenz der Wahrscheinlichkeiten), hence the behavior of light. It justifies, in a sense it imposes, the Postulat II that marks the birth of a probability theory capable of describing the microcosm.

According to Jordan, quantum theory is a set of rules able to deal with a new sort of conditional probabilities. These probabilities, however, obey the rules of optics. With reference to these rules, Jordan completed the route that, from Laplace’s principles, which would soon become Kolmogorov’s axioms, led him to the interference law of probability amplitude. With this law, probability theory ceases to be “elementary” but still remains a probability theory.