Foundations of Physics, Volume 46, Issue 11, pp 1471–1494

Special States Demand a Force for the Observer

  • L. S. Schulman
Open Access


The “special state” understanding of the measurement process is presented, namely there is no “measurement process,” only unitary time evolution. However, in contrast to the many worlds interpretation, there is only one world. How this can be accomplished and how statistical mechanics is changed as a result are also discussed. The focus though is on experimental tests of this theory and the in-principle realization that this can give rise to feasible experimental tests. Those tests rely on the particular feature of having only one world, so that any change in the wave function must have a proximate cause, and it is the detection of that cause that constitutes the test. In a companion article there is further exploration concerning the details of the test. In addition, in the present article, the special state theory is extended theoretically through evidence of the uniqueness of the Cauchy distribution as well as explicit recognition of the role of entanglement.


Keywords: Special states · Arrow of time · Foundational experiments

1 Introduction

Many interpretations of quantum mechanics exist, all of which agree on experimental results. This is why they are called interpretations. Other understandings of the measurement process may differ and sometimes offer the possibility of an experimental check. In this article I describe a theory that differs in its experimental predictions from all others that I know of. Like the Many Worlds Interpretation it allows only unitary time evolution, but unlike that theory, has only one world. This suffices to create an experimental test.

The experimental prediction to which I refer is the absence of source free changes in angular momentum. Thus in either the many worlds interpretation (MWI) or the Copenhagen interpretation (in which I include the variety of ensemble theories) it is permitted for a spin prepared (say) in an eigenstate of \(J_x\) to be measured in an eigenstate of \(J_z\). Your idea of which way the angular momentum vector is pointing has changed, but there is no need for any force to have been applied (I go into this in detail below). For the special state theory, there must be a force applied to the spin to reorient it. Observing or not observing that force is the test.

In Sect. 2 I discuss quantum aspects, how it is possible to have only pure unitary (time) evolution and nevertheless have but a single “world.” Following that, Sect. 3 deals with related issues of statistical mechanics. This is relevant, since it is clear that to have both pure unitary evolution and only one world something wildly un-intuitive must be happening. This strangeness will be seen to be in the realm of statistical mechanics. Some of these topics are taken up at greater length in [1].

Finally in Sect. 5 I show that for the special state theory the changes in direction of one’s perception of angular momentum require forces, while for the MWI (etc.), they do not. Sect. 6 is a discussion.

The appendices are mainly concerned with technical matters. Of note is Appendix 2, where I present strong evidence for the appearance of the Cauchy distribution in the theory. (See Sect. 4 for the significance of this distribution.) The material in Appendix 2 represents an important step beyond what appears in [1]. In that reference I found that the Cauchy distribution provided a solution, but there was no proof that the second moment needed to be infinite. In the new work the drop-off of the Cauchy distribution stands out as superior to other options including other long-tailed distributions that lack a second moment.

2 Quantum Aspects

According to the special state theory, the only thing that ever happens is pure unitary quantum time evolution. If one has a wave function \(\psi \) at time-0, and if H is its Hamiltonian, then at time-t the wave function is \(\exp (-iHt/\hbar )\psi \). “Measurement” involves no other dynamics. So far this sounds like the many-worlds interpretation. But I inject another feature: there is only one world. How can this be? Let us look at a very small world. Suppose there is a single 2-state system in contact with a heat bath of bosons [2]. Initially the system is in its excited state and the bosons serve two purposes: their coupling induces the decay and they constitute the measurement device that notes the decay. This model of decay and measurement misses much of the real world, like “registering” the measurement, i.e., making sure the system does not return to its excited state (irreversibility). These features are assumed to result from parts of the system that I do not model. I use the spin boson model with a single boson, for which I can take the Hamiltonian1 to be
$$\begin{aligned} H=\frac{\varepsilon }{2} \left( 1+\sigma _z \right) + \omega a^\dagger a +\beta \sigma _x (a^\dagger + a) . \end{aligned}$$
The Pauli spin matrices are the operators for the 2-state (spin) system, a and \(a^\dagger \) are the boson operators and \(\varepsilon \), \(\beta \) and \(\omega \) are parameters.
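A minimal numerical sketch of Eq. (1) follows. It assumes units with \(\hbar =1\), an oscillator truncated to `n_max` levels, and illustrative parameter values; the values actually used for the figures are those of footnote 2 and are not reproduced here. The function names are hypothetical, and the time evolution uses a plain Taylor series for \(\exp (-iHt)\) acting on the state, which is adequate only when \(\Vert Ht\Vert \) is modest.

```python
import math

def spin_boson_hamiltonian(eps, omega, beta, n_max):
    """Matrix of Eq. (1) in the basis |n, s>, index = 2*n + s (s=0 up, s=1 down)."""
    dim = 2 * n_max
    H = [[0.0] * dim for _ in range(dim)]
    for n in range(n_max):
        H[2 * n][2 * n] = eps + omega * n          # (eps/2)(1 + sigma_z) = eps for spin up
        H[2 * n + 1][2 * n + 1] = omega * n        # ... and 0 for spin down
        if n + 1 < n_max:
            c = beta * math.sqrt(n + 1)            # <n+1| a^dagger |n> = sqrt(n+1)
            for s in (0, 1):                       # sigma_x flips the spin
                i, j = 2 * n + s, 2 * (n + 1) + (1 - s)
                H[i][j] = H[j][i] = c
    return H

def evolve(H, psi, t, terms=60):
    """Apply exp(-i H t) to psi via a truncated Taylor series (hbar = 1)."""
    dim = len(psi)
    result = list(psi)
    term = list(psi)
    for k in range(1, terms + 1):
        # term_k = (-i t H / k) term_{k-1}
        term = [(-1j * t / k) * sum(H[i][j] * term[j] for j in range(dim))
                for i in range(dim)]
        result = [r + x for r, x in zip(result, term)]
    return result

def up_probability(psi):
    """Probability that the spin is up, summed over oscillator levels."""
    return sum(abs(psi[i]) ** 2 for i in range(0, len(psi), 2))

if __name__ == "__main__":
    # Illustrative (assumed) parameters, not those of footnote 2.
    H = spin_boson_hamiltonian(eps=1.0, omega=1.0, beta=0.5, n_max=20)
    psi0 = [0j] * 40
    psi0[0] = 1.0 + 0j                             # spin up, oscillator in |0>
    print(up_probability(evolve(H, psi0, 0.15)))
```

Starting the oscillator in \(|0\rangle \) is only one choice; the special states discussed below correspond to particular superpositions of oscillator levels, found by the method of Appendix 1.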

Suppose this system is started with spin up, \(\psi _\mathrm{spin}=\left( {\begin{array}{c} 1 \\ 0 \end{array}}\right) \), with the oscillator state unspecified. For the parameters given in footnote 2,2 at time-0.15 there is about a 50 % probability of decay, namely if one would trace out over the oscillator states the density matrix for the spin would be half-half. However, there are two states that I wish to focus on. These are particular initial states of the oscillator, and their probability distributions are given in Fig. 1. The state whose probability distribution is shown in Fig. 1a (and whose phases are also fixed, but not shown) has the property that at time-0.15 it has not decayed at all, despite the fact that a random or average state would, at that time, be in a superposition of up and down spin states, with about half its probability in each state. Similarly, the state whose probabilities are shown in Fig. 1b has decayed (almost)3 completely at this time, again despite the behavior of a typical or average state.

This does not mean that the non-decay state must never change its spin-occupation values and certainly the decay state had to change, since at time-0 it had no amplitude in the decayed state. For reference, the probability that each of these states is in the “up” state is shown as a function of time in Fig. 2. As is clear, the decay or non-decay criterion is only applied at time-0.15. At other times the states can be anything.
Fig. 1

Image (a) shows the probability of excitation of the various oscillator states that contribute to the non-decay state. Only even oscillator states are shown, as the total amplitude for the odd states in this case is computed to be about \(10^{-27}\) and is due to numerical error.4 Phases of the states are not shown, but are also fixed by the non-decay condition. Image (b) shows the probabilities for the state that decays; here too only even oscillator states are shown (again, the wrong-parity states are due to numerical error). As in image (a), the phases, though not shown, are crucial to the “special” nature of the state. Both images represent the probabilities at time-0. Note that although in this case both special states only involve even states, in general special states can be of either parity. [On the other hand, for other Hamiltonians or projections, the special states need not be eigenstates of H (see footnote 4)].

Fig. 2

Survival history of the non-decay and decay states, where “survival” is the probability that the system is in the “up” state at the indicated time (horizontal axis). Image (a) shows this probability for the non-decay state, which for this example was a “non-decay” state at all intermediate times, but in general need not be. Similarly, image (b) shows the corresponding probability for the state found to be (essentially) fully decayed at time-0.15. At both earlier and later times it need not be fully decayed.

What good are these states and how did I find them? The second question is addressed in Appendix 1 and is technical. The answer to the first question is the main idea for avoiding many worlds while holding on to unitary time evolution.

Suppose our Schrödinger cat is placed in a chamber with the usual vial of poison whose dispersal is governed by the spin state of a spin-boson system of the sort just discussed. The latter system is also in the chamber and the entire setup isolated. It is opened—its isolation ceases and the irreversible “registration” is due to the observer—at time-0.15 and one looks to see if the cat is dead or alive. The usual problem is that there is positive probability for both options. However, there is no problem if the initial state of the oscillator is one of the “special” ones I have been describing. For the non-decay state there is a living cat, for the decay state it is dead. This is accomplished with no black magic; it is the result of unitary time evolution from the specified initial conditions.

That is the main idea of the special state theory: no macroscopic superpositions because the initial conditions are special—always.5 I mention also that there is no entanglement. At time-0.15 the spin state is wholly in one state or the other and a trace over its coordinates would leave the oscillator state unchanged; and vice-versa.

The next—obvious—question is, why should Nature arrange to have a “special” state as the initial conditions for every situation where a potential split into many worlds occurs? For this I do not have an answer, except to say that it is the conclusion I am driven to by insisting that no magic dynamics occurs in the measurement process and there is only one world. What I can offer though is perspective. How strange is it for there to be particular, non-random, initial conditions? For this I turn to the next Section.

3 Boundary Conditions and the Arrow of Time

When making predictions I assume the initial conditions are random. As discussed in [1] this is equivalent to the usual arrow of time. But this assumption has never been verified experimentally; its main virtue is that answers computed with this assumption agree with experiment—no mean feat, but not a definitive proof, as I will shortly demonstrate.

This assumption is closely related to another that occurs in statistical mechanics, the ergodic hypothesis. As discussed in [3] this lies at the foundations of statistical mechanics but has never been established6 and is very likely false. Textbook authors (as described in [3]) have struggled with this problem in arguments leading to the justification, for example, of adopting the thermal state (\(\rho \sim \exp (-\beta H)\)). But the hypothesis is generally accepted for the reason mentioned earlier: agreement with experience—so far.

To show that the assumption of random initial conditions is unnecessary, I use the cat map [4], which is known to be mixing (and thus ergodic). This is a map of the unit square (with coordinates x, y) into itself
$$\begin{aligned} \left. \begin{array}{cl} x' &{}\equiv x+y \\ y' &{}\equiv x+2y \end{array}\right\} \!\!\!\mod 1 \qquad \hbox {or} \ \left( {\begin{array}{c} x' \\ y' \end{array}}\right) =M\left( {\begin{array}{c} x \\ y \end{array}}\right) \!\!\! \mod 1 \ \hbox {with}\ M\equiv \left( \begin{array}{cc} 1&{}1\\ 1&{}2\end{array}\right) . \end{aligned}$$
Our system is an ideal gas consisting of N points in the unit square, each moving in discrete time under the cat map. For example, in Fig. 3 I show what happens with a collection of \(N=500\) points initially satisfying \(0.5\le x\le 0.6\) and \(0.5\le y\le 0.6\). Clearly this mechanical system is headed for chaos. To get a quantitative idea of “chaos” I define an information entropy using a coarse graining. As grains I take the 100 squares of size 1/10 by 1/10 contained in the unit square, and only count the number of points in each grain. In effect I assume the observer can only determine which square a point is in, not the point’s exact coordinates. With this coarse graining the entropy is
$$\begin{aligned} S=-\sum p_k \log p_k \;,\ p_k\equiv n_k/N , \end{aligned}$$
where k runs over the coarse grains and \(n_k\) is the number of points in grain-k. The behavior of the entropy for the points in Fig. 3 is shown in Fig. 4.
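The cat map of Eq. (2) and the coarse-grained entropy of Eq. (3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the figures: the function names, the random seed and the number of time steps are all hypothetical choices.

```python
import math
import random

def cat_map(x, y):
    """One step of the cat map, Eq. (2): (x, y) -> (x + y, x + 2y) mod 1."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

def coarse_entropy(points, grains_per_side=10):
    """Entropy of Eq. (3): S = -sum_k p_k log p_k, with p_k = n_k / N."""
    counts = {}
    for x, y in points:
        # Grain indices; min() guards against a coordinate rounding up to the edge.
        i = min(int(x * grains_per_side), grains_per_side - 1)
        j = min(int(y * grains_per_side), grains_per_side - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    n = len(points)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def entropy_history(n_points=500, steps=12, seed=0):
    """Start N random points in 0.5 <= x, y < 0.6 and record S at each time step."""
    rng = random.Random(seed)
    pts = [(0.5 + 0.1 * rng.random(), 0.5 + 0.1 * rng.random())
           for _ in range(n_points)]
    history = [coarse_entropy(pts)]
    for _ in range(steps):
        pts = [cat_map(x, y) for x, y in pts]
        history.append(coarse_entropy(pts))
    return history

if __name__ == "__main__":
    for t, s in enumerate(entropy_history()):
        print(t, round(s, 3))
```

Because all points start in a single grain, the entropy starts at zero; it cannot exceed \(\log 100\) for this coarse graining.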
Fig. 3

500 points are started in the square \(0.5\le x\le 0.6\) and \(0.5\le y\le 0.6\), but are otherwise randomly selected. After one time step under the cat map, Eq. (2), they have become the parallelogram in the next figure. They have stretched in the direction of the eigenvector with larger eigenvalue of M (Eq. (2)) and correspondingly shrunk along the other eigenvector. The product of the eigenvalues is 1 because \(\det M=1\). The dynamics thus satisfies Liouville’s theorem, in which area is preserved in phase space. By time 3 (the next figure) the mod 1 action coupled with the stretching has begun to pull the points apart and by time 7 (the last figure) nothing is recognizable.

Fig. 4

Entropy, as defined in Eq. (3), as a function of time for the simulation of Fig. 3. Note that equilibration sets in at about time-5. The dashed line is the maximum entropy for this coarse graining, namely \(\log 100\). It is not attained because of fluctuations due to the finite number of points.

Fig. 5

The times are as follows: row-1: [0, 1, 2, 4]; row-2: [6, 7, 8, 9]; row-3: [10, 11, 12, 14]; row-4: [15, 16, 17, 18]. These points all evolve under the cat map and satisfy boundary conditions at times 0, 9 and 18. Size of the point position marker varies with the image, for better visibility.

This is what one might call normal behavior: the initial points were selected randomly within the given grain and the entropy increases monotonically until close to equilibrium, after which it fluctuates in a predictable way.7 But let me display another simulation. As I will explain in a moment, the initial points were not selected randomly, but for the first time steps it certainly will look that way. The simulation runs for 18 time steps away from the initial square and is shown in Fig. 5. The sequence of images should be read left-to-right and row-by-row. There are 4000 points and most time steps are illustrated. I stress, every single point in this simulation evolved by pure cat map dynamics. So how did I get it to give me these strange images? Actually it was easy. I randomly occupied the first little square (\(0.6\le x\le 0.8\) and \(0.4\le y\le 0.6\)) at time-0 with about 30 \(\times \) 4000 points. Then I imposed two conditions on the points. First, at time-18 they needed to occupy a different little square (but of the same size as the original). This cut the number of acceptable points by a factor of 25. Then I also required that at time-9 the points arranged themselves in the figure of a cat. This cut things down a bit more, so I was left with 4000 points that satisfied all three conditions. This solved a 3-time boundary value problem. It was trivial in this case since the points were non-interacting; simply removing a point did not affect the others. It should be pointed out that solving a multiple time boundary value problem for interacting particles, even two of them, can be quite challenging. But for the point I wish to make, an ideal gas does the job.
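The selection procedure just described can be sketched as follows. This is a simplified illustration, not the author's code: it imposes only the time-0 and time-18 conditions (the cat-shaped condition at time-9 is omitted), and the target square, the number of trial points, the seed and all function names are assumptions.

```python
import math
import random

def cat_map(x, y):
    """One step of the cat map, Eq. (2)."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

def evolve(point, steps):
    """Apply the cat map `steps` times to a single point."""
    x, y = point
    for _ in range(steps):
        x, y = cat_map(x, y)
    return x, y

def in_square(point, corner, side):
    """True if the point lies in the square with the given lower-left corner."""
    return (corner[0] <= point[0] < corner[0] + side and
            corner[1] <= point[1] < corner[1] + side)

def select_constrained(n_trial, steps, start, target, side=0.2, seed=1):
    """Keep the initial points in `start` that land in `target` after `steps`."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_trial):
        p = (start[0] + side * rng.random(), start[1] + side * rng.random())
        if in_square(evolve(p, steps), target, side):
            kept.append(p)
    return kept

def grain_entropy(points, grains_per_side=5):
    """Coarse-grained entropy of Eq. (3) with 1/5 by 1/5 grains."""
    counts = {}
    for x, y in points:
        key = (min(int(x * grains_per_side), grains_per_side - 1),
               min(int(y * grains_per_side), grains_per_side - 1))
        counts[key] = counts.get(key, 0) + 1
    n = len(points)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

if __name__ == "__main__":
    # Hypothetical target square; the text does not say which one was used.
    kept = select_constrained(20000, 18, start=(0.6, 0.4), target=(0.2, 0.2))
    print(len(kept), "of 20000 trial points satisfy both conditions")
    for t in (0, 3, 9, 15, 18):
        print(t, round(grain_entropy([evolve(p, t) for p in kept]), 3))
```

Since the points do not interact, discarding the ones that miss the target leaves the surviving trajectories unchanged, which is why the boundary value problem is trivial here.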

To get a message from this demonstration I turn to entropy. The coarse graining is a bit more coarse than before: grain sizes are 1/5 by 1/5. For each of the configurations shown the entropy was computed and is graphed as the circles in Fig. 6. A second curve, with star markers, is also shown in that figure. It is the entropy for 4000 points initially starting in the same coarse grain, but having no additional requirements. (Like Fig. 4, it corresponds to “normal” initial conditions.) Compare these two curves. The point I wish to make is that [statement rendered as an image in the source; text not recoverable].
Fig. 6

The circles and solid line represent the entropy as a function of time for the simulation shown in Fig. 5. Note that it drops a bit at time-9 and then at time-18 plummets back to 0, since all points are again in a single grain. The other curve, marked by stars, is the entropy as a function of time for 4000 points having the same initial condition, but with no other constraints.

The entropy graph makes the point quantitatively, but qualitatively, if you compare Figs. 3 and 5, it is clear that aside from slightly different initial conditions, there is a great similarity in the initial relaxation. One can be quantitative in other ways also, discussing relaxation times and such, but the issue to be stressed is that with fewer than 4 % of available points the relaxation, almost to the time of the constraint, looks entirely normal.

My message is simple: I may have future constraints, but I would not know about it. The arrow of time does not (necessarily) point as fixedly as one might have supposed.

Another way to say this is to observe that the points in the simulation that yielded Fig. 5 were not random but had a cryptic constraint, that is, a constraint that was difficult (or for the macroscopic world, impossible) to discern, but which nevertheless plays an important role as the dynamics unfolds.

My objective in this presentation is to pave the way for the idea that there could be other cryptic constraints in the world. In particular, not every imaginable state occurs in Nature; only those which, in my terminology, are special. This is surely a severe restriction, but I have made the point that the restriction may be invisible. The kind of restriction that I find most palatable is a two-time boundary condition. This surely contradicts the usual arrow of time, but as demonstrated, its effect may not be noticed except close to the boundaries, which may be well-separated.8

What kind of two-time boundary condition could select special states? First consider initial conditions. As Wald [5] has pointed out, in the early universe the entropy was low, for unknown reasons. I will go a step further: perhaps the von Neumann entropy was also low, in other words there was little or no entanglement. In this speculative mode I will further imagine that our entire cosmology is roughly time symmetric. This idea is not popular today due to the discovery of accelerating expansion. But the phenomenon is poorly understood and there have been suggestions of a periodic cosmology despite the acceleration (a far from comprehensive sample is [6, 7]), so I will take liberties in my speculation. One additional component enters this line of thought, namely the connection first suggested by Gold [8], relating the arrow of time to an expanding universe. One then expects that under contraction the arrow will be reversed. Recalling the connection between boundary conditions and the arrow of time, one can now enunciate a possible boundary condition that would demand special states: no entanglement at the beginning, no entanglement at the end.9

Consider then a Schrödinger cat. At the end of the experiment, and again demanding only unitary time evolution, in the MWI there is a portion of the wave function with a dead cat and a portion of the wave function having support on a macroscopic state recognized as a living cat. The dead one is buried, the living one perhaps becomes involved in another experiment, sending cats to Mars. But how can these portions of the wave function be recombined coherently, as they would need to be if there is a no-entanglement demand in our future? It would take tremendous coordination to accomplish this. Having a special state is also an unlikely way to avoid entanglement, but it is much less unlikely than recombining after a superposition of macroscopically distinct states has formed.

There are many caveats in the above argument. The first is that maybe the need for special states has nothing to do with cosmology. It may be that special states are indeed the way to reconcile quantum and statistical mechanics, but the argument just given is wrong or irrelevant. Next, it is possible that future boundary conditions might obtain even in an ever-expanding universe. Then there are technical matters. What is the role of identical particles? Perhaps you do not need to recombine the cat: since its electrons and other constituents are all identical, you may be able to recombine local portions of the wave function with particles nearby. For these and other questions, the short answer is, I don’t know.

But with the two-time boundary rationale one can approach another issue, namely, what about the small amount of leftover wave function? As I state in footnote 3, the special state for decay has a small but non-zero probability of non-decay (in the example given it is \(8.1\times 10^{-5}\)); similarly for the non-decay special state. In the context of a boundary value problem one does not need perfection. The measure of possible error in “specializing” is given by the tolerance of that boundary value problem.

There are several comments to be made that reflect on the plausibility of these ideas. First there are the numbers. In my computer modeling I found errors of order \(8.1\times 10^{-5}\) using a Hilbert space of dimension 500. The actual dimension of physical spaces boggles the imagination. Using known formulas for entropy, one mole of neon in 1 cubic meter at room temperature and pressure possesses [3] on the order of \(\mathcal{M}=\exp (S/k_B)\simeq 10^{1.3\,\times \,10^{25}}\) states, i.e., dimensions in Hilbert space. So for any macroscopic apparatus one can expect far smaller errors than I obtained in my example. In a similar vein, reflecting on the total number of states in the universe, it becomes less implausible that only a tiny subset (the special states) can nevertheless contain many states.10 Another point, which may be surprising, is that for the right kind of wave functions, there is a cessation of entanglement in their collisions [9]. If you scatter particles of unequal mass having Gaussian wave packets off one another they rapidly adjust the packet width so that there is no further entanglement.
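The scale of \(S/k_B\) quoted above can be checked with a short calculation. The text does not say which “known formulas” were used; the sketch below assumes the Sackur-Tetrode entropy of a monatomic ideal gas, a temperature of 300 K, a volume of 1 cubic meter and a neon atomic mass of 20.18 u. Under these assumptions \(S/k_B\) comes out at roughly \(1.3\times 10^{25}\).

```python
import math

# CODATA values (SI units).
H_PLANCK = 6.62607015e-34          # J s
K_B = 1.380649e-23                 # J / K
N_AVOGADRO = 6.02214076e23         # 1 / mol
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg

def sackur_tetrode_entropy_over_k(n_atoms, volume_m3, temperature_k, mass_kg):
    """S / k_B for a monatomic ideal gas: N [ ln(V / (N lambda^3)) + 5/2 ]."""
    # Thermal de Broglie wavelength.
    lam = H_PLANCK / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)
    return n_atoms * (math.log(volume_m3 / (n_atoms * lam ** 3)) + 2.5)

# Assumed conditions: one mole of neon, 1 m^3, 300 K.
S_OVER_K = sackur_tetrode_entropy_over_k(
    N_AVOGADRO, 1.0, 300.0, 20.18 * ATOMIC_MASS_UNIT)

if __name__ == "__main__":
    print(f"S/k_B = {S_OVER_K:.3e}")
```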

Finally there is the issue of recovering the Born probabilities. If every experiment involves interaction with apparatus and special states, why is it that probabilities can be calculated using only the wave function of the system being studied? This is our next subject.

4 Recovering Standard Probabilities

A fair coin has a 50 % chance of landing heads, 50 % tails. In effect, the phase space of your body and the motions needed to flip the coin are divided into two sets, one of which gives heads, one tails. These should be closely interwoven so that you have difficulty controlling the outcome. So is this a property of the coin or of you?

Something along these lines is what I claim occurs during a quantum measurement. The Born rule says, look only at the wave function. But I am saying that the space of special states of the apparatus breaks up in the way the phase space of the coin flipper does. The special states are a small subset (subspace, actually) of the entire Hilbert space, but their relative size (dimension) is proportional to the probabilities of the various outcomes. In other words, quantum probability is like its classical counterpart: in each instance the result is determined, but one would need microscopic precision to know the outcome. So one uses probability. The claim therefore is that the dimension of the space of special states for each outcome is proportional to the square of the wave function amplitude for that outcome.

This is an assertion that I do not know how to check. The few cases where I have an analytic handle on special states (see [1]) I do not consider typical, and where I have numerical results I believe the spaces are much too small. As a result, I have taken a defensive position on this point, to see if the assertion can be disproved.

I focus on the simplest quantum measurement problem, determining the state of a two-level system. I consider a Stern-Gerlach apparatus for which the initial spin state is
$$\begin{aligned} u_\theta = e^{i\theta \sigma _x/2} {1\atopwithdelims ()0} . \end{aligned}$$
There are of course many other degrees of freedom in this problem. Foremost is the position of the spin (riding perhaps on a K atom); then there is the macroscopic magnet, macroscopic screen, and much, much more. Consistent with the desire to show impossibility I grant that there are special states to do the job, different ones for different angles, \(\theta \), of the spin orientation. I can make a guess at the nature of the special state—if it exists—by looking for the least unlikely way to get the usual Stern-Gerlach result. The special state might unite the spatial wave function of the K with diverging position coordinates after it has emerged from the inhomogeneous magnetic field. This seems to me less likely—involving more degrees of freedom doing atypical things—than to have \(\theta \) turned to 0 or \(\pi \) before it enters the field.11 So we’ll assume that’s what happens; perhaps what one takes to be a stray magnetic field provides just the right force to bring the spin to 0 or \(\pi \) prior to its entry into the inhomogeneous field, the function of that field being to make the position coordinate dependent on the original spin.
Let us call this apparently (but not really) random change in \(\theta \) a “kick.” As \(\theta \) varies, the number of kicks to one or the other special value also varies. Let \(f(\theta )\) be the probability of obtaining a kick of size \(\theta \). Of course for any one experiment only kicks of size \(-\theta +n\pi \) (with n an integer) enter, but it is reasonable to assume that as \(\theta \) varies there is some well-defined distribution, manifested in our situation as a probability. Thus to get spin up one must have a kick of size \(-\theta \) or, if one allows for larger kicks, \(2n\pi -\theta \). Similarly, to get spin down one requires a kick of size \((2n+1)\pi -\theta \) (with n an integer). Thus the probability of spin up is \(g(\theta )=\sum _{n=-\infty }^{n=\infty } f(-\theta +2n\pi )\). Similarly for spin down one adds \(\pi \) to each summand in the argument of f. On the other hand, standard quantum mechanics, i.e., the Born rule, dictates that the ratio of down to up is \(\tan ^2(\theta /2)\). Therefore our requirement on f (hence on g) is
$$\begin{aligned} \tan ^2\left( \frac{\theta }{2}\right) =\frac{g(\theta +\pi )}{g(\theta )} ,\quad \hbox {with}\quad g(\theta )\equiv \sum _{n=-\infty }^{n=\infty } f(\theta +2n\pi ) . \end{aligned}$$
(In Eq. (5), in the definition of g, use has been made of f’s \(\theta \rightarrow -\theta \) symmetry as well as the fact that n is a dummy variable.) There is an explicit solution to Eq. (5), namely \(f_0(\theta )=1/\theta ^2\). Unfortunately this solution is not normalizable, as a probability should be. (In fact there is no normalizable solution to Eq. (5); see App. 2.) However, for \(\theta \) close to 0 it is possible to cut off the function and eliminate the singularity without experimental implications. A convenient cutoff makes use of the Cauchy distribution
$$\begin{aligned} C_a(\phi )=\frac{a/\pi }{a^2+\phi ^2} , \end{aligned}$$
which for small enough a changes Eq. (5) very little (see App. 3). The deviations from standard probabilities are largest for \(\theta \sim 0\) and are of order \(a^2\); since a is unknown, one does not have an experimental test. The distribution \(C_a\) does not have a second moment (it is a Lévy distribution), and this may be necessary for any function f satisfying Eq. (5). In App. 2 I report partial proofs, but I emphasize that for the purposes of the experimental tests described below and in [10], the Cauchy distribution is not required.
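Both statements can be checked numerically. For \(f_0\) the sum in Eq. (5) is \(g(\theta )=1/[4\sin ^2(\theta /2)]\), which gives the ratio \(\tan ^2(\theta /2)\) exactly. For the Cauchy cutoff the sum has a standard closed form (the wrapped Cauchy distribution), \(g_a(\theta )=\sinh a/[2\pi (\cosh a-\cos \theta )]\), so the down-to-up ratio is \((\cosh a-\cos \theta )/(\cosh a+\cos \theta )\) and the spin-down probability differs from \(\sin ^2(\theta /2)\) by \((\cos \theta /2)(1-1/\cosh a)\approx a^2\cos \theta /4\). The sketch below compares truncated sums with these closed forms; the function names, the truncation and the sample values of \(\theta \) and a are assumptions made for illustration.

```python
import math

def wrapped_sum(f, theta, n_terms=20000):
    """Truncated version of g(theta) = sum_n f(theta + 2 n pi), |n| <= n_terms."""
    total = f(theta)
    for n in range(1, n_terms + 1):
        total += f(theta + 2.0 * math.pi * n) + f(theta - 2.0 * math.pi * n)
    return total

def down_up_ratio(f, theta, n_terms=20000):
    """Right-hand side of Eq. (5): g(theta + pi) / g(theta)."""
    return wrapped_sum(f, theta + math.pi, n_terms) / wrapped_sum(f, theta, n_terms)

def f0(phi):
    """The non-normalizable solution f_0 = 1 / phi^2."""
    return 1.0 / (phi * phi)

def cauchy(a):
    """Cauchy distribution C_a of Eq. (6), returned as a function of phi."""
    return lambda phi: (a / math.pi) / (a * a + phi * phi)

def cauchy_ratio_closed_form(a, theta):
    """Ratio obtained from the wrapped Cauchy closed form."""
    return (math.cosh(a) - math.cos(theta)) / (math.cosh(a) + math.cos(theta))

if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.5):
        print(theta,
              down_up_ratio(f0, theta),
              down_up_ratio(cauchy(0.05), theta),
              math.tan(theta / 2.0) ** 2)
```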

Before going to what I consider a true experimental test I will discuss two possibilities, one of which does not constitute a test, while the other may, but I don’t know how to estimate outcomes.

The first—the non-test—deals with fluctuations. Perhaps with a Cauchy distribution at the heart of the special states, fluctuations would be exceptionally large. In other words, you send in N atoms with initial wave function given by Eq. (4) and the average number of spin down outcomes would be \(N\sin ^2(\theta /2)\); nevertheless, the fluctuations about that average would be larger than the expected \(\sqrt{N}\). For better or for worse, this is not true, and a mathematical demonstration is given in App. 4.

A second test deals with runs, i.e., are the successive (in time) spin values, as detected by the (Stern-Gerlach) screen, independent of one another, or do they tend to have many up, followed by many down, etc., keeping the average correct? Why should this be? If the special state depends on fluctuations in the magnetic fields, it is plausible that fewer unusual field values would be needed if successive “kicks” were correlated. This may indeed be happening, but I have no way to estimate it.
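Although the size of any such correlation is not estimated here, the bookkeeping for a runs analysis is standard. The sketch below uses the Wald-Wolfowitz runs test, which is my choice of statistic and is not taken from the text: for \(n_1\) ups and \(n_2\) downs in independent order, the expected number of runs is \(1+2n_1n_2/n\) with variance \(2n_1n_2(2n_1n_2-n)/[n^2(n-1)]\), where \(n=n_1+n_2\). The function names and the encoding of outcomes as "U" and "D" are hypothetical.

```python
import math

def count_runs(outcomes):
    """Number of maximal blocks of identical consecutive outcomes."""
    if not outcomes:
        return 0
    return 1 + sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)

def runs_z_score(outcomes, up="U"):
    """Wald-Wolfowitz z-score: negative means fewer runs (more clustering)
    than expected for independent outcomes; requires both outcomes to occur."""
    n1 = sum(1 for o in outcomes if o == up)
    n2 = len(outcomes) - n1
    n = n1 + n2
    mean = 1.0 + 2.0 * n1 * n2 / n
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1))
    return (count_runs(outcomes) - mean) / math.sqrt(var)

if __name__ == "__main__":
    print(runs_z_score("UUUUUDDDDD"))   # clustered sequence
    print(runs_z_score("UDUDUDUDUD"))   # alternating sequence
```

A strongly negative score for a long record of Stern-Gerlach outcomes would indicate the kind of clustering described above.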

5 Force-Free Rotation?

In the MWI there is no need for any force to be applied in order to go from (say) an eigenfunction of \(J_x\) to an eigenfunction of \(J_z\). The same is true for the Copenhagen interpretation.

What is amusing is that individual observers do have a change in their perceptions of the value of the angular momentum, and this occurs because of a change in the overall wave function, but there is no need for angular momentum non-conservation. The context is the Stern-Gerlach experiment measuring the z component of angular momentum. Imagine an (already) up spin sent into this apparatus. There is no transfer of \(J_z\), although there is a very small transfer of linear momentum, since the atom (carrying the spin) is deflected. Similarly a down spin induces no \(J_z\) transfer. In equations
$$\begin{aligned} |\psi \rangle _+= & {} |\!\!\uparrow \rangle \otimes \Omega _0\otimes |\hbox {observer}\rangle \rightarrow U | \psi \rangle _+ = |\!\!\uparrow \rangle \otimes \Omega _+\otimes |\hbox {observer sees}~+\rangle \nonumber \\ |\psi \rangle _-= & {} |\!\!\downarrow \rangle \otimes \Omega _0\otimes |\hbox {observer}\rangle \rightarrow U | \psi \rangle _- = |\!\!\downarrow \rangle \otimes \Omega _-\otimes |\hbox {observer sees}~-\rangle .\quad \end{aligned}$$
In these equations \(U=\exp (-iHt/\hbar )\) is the time evolution operator and \(\Omega \) represents anything not the spin or the observer, in particular the magnets and the atom’s translational degrees of freedom. I emphasize: although there is a small transfer of linear momentum, there is no transfer of the z-component of angular momentum in either case. But now consider an initial state whose spin is not oriented along the z-axis,
$$\begin{aligned} \psi _\mathrm{initial}=\left( \alpha |\!\!\uparrow \rangle + \beta |\!\!\downarrow \rangle \right) \otimes \Omega _0\otimes |\hbox {observer}\rangle . \end{aligned}$$
By the superposition principle this yields one observer who had prepared the state at a non-trivial angle to the z-axis, but found at the end a spin pointing along that axis. The same is true of the other observer. Each has seen a change in the perception of the z-component of angular momentum. On the other hand, since the wave function in Eq. (8) is a superposition of the initial wave functions of Eq. (7), the dynamics can be separately considered for each, and there is no transfer of z-component of angular momentum at any stage. How can this be? The (version of the) observer who saw “up” will say, “Oh, I was on the ‘\(\alpha \)’ (in our notation) component of the wave function,” while the other (version of the) observer would make a similar statement, with \(\beta \) replacing \(\alpha \).
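Written out, the linearity step reads as follows. This is a sketch that uses only Eqs. (7) and (8), together with two assumptions not stated explicitly above: the normalization \(|\alpha |^2+|\beta |^2=1\) and \(J_z=(\hbar /2)\sigma _z\) for the spin.
$$\begin{aligned} U\psi _\mathrm{initial}=\alpha \,|\!\!\uparrow \rangle \otimes \Omega _+\otimes |\hbox {observer sees}~+\rangle +\beta \,|\!\!\downarrow \rangle \otimes \Omega _-\otimes |\hbox {observer sees}~-\rangle . \end{aligned}$$
Since \(\sigma _z\) is diagonal in the up-down basis, the expectation of the spin's \(J_z\) is the same before and after,
$$\begin{aligned} \langle J_z\rangle _\mathrm{before}=\langle J_z\rangle _\mathrm{after}=\frac{\hbar }{2}\left( |\alpha |^2-|\beta |^2\right) , \end{aligned}$$
even though each branch, taken alone, has the spin along the z-axis.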

One should not find this shocking. Despite the observer’s possible perplexity, there is conservation of angular momentum. The total Hamiltonian commutes with the (total) \(J_z\) operator; it’s just that each observer, decohered from the other, sees a peculiarity.

The explanation would be slightly different for (my understanding of) the Copenhagen interpretation. Until you actually measure \(J_z\) it has no value, since \(J_z\) does not commute with the projector for the spin state in Eq. (8).

However, with only one world—the contention of the special state theory—there can be no change in the wave function without a proximate cause. If a quantity is changed the single observer can, if it is physically possible, determine what caused the change. This proximate cause lies in the special state itself. If \(\langle |J_z|\rangle \) (of the spin) changes its value, something else has to pick it up.12 This “something” can only be due to the peculiarities of the special state, what has been called a kick earlier. (Recall, the kick is not a deviation from the laws of nature, but like the cat at time-9 in the progression of Fig. 5, it is the result of exact obedience to the rules, but happening because of unusual initial conditions.)13

In a companion article [10] we give concrete suggestions for detecting the ostensibly random cause in some particular experiments. The general idea is to find where the unusual state (giving rise to the “kick”) is least unlikely, and to attempt to detect it there.

6 Discussion

After giving background on the special state theory I arrived at a potential experimental test. The many worlds and Copenhagen interpretations predict observations of source-free changes of the observer’s perception of the angular momentum. The special state theory, on the other hand, does not: something must push the spin to its new orientation. The proposed experimental test makes use of this distinction.

In the appendices special topics related to the main text are taken up. In particular there is a proof that no probability distribution can exactly satisfy Eq. (5) and a demonstration of the preferred role of the Cauchy distribution—preferred even over other long-tailed distributions.


  1.

    The full spin boson model is generally taken to have Hamiltonian \(H=(\varepsilon /2)(1+\sigma _z)+\Delta \sigma _x+\sum _k \omega _k a_k^\dagger a_k +\sigma _z \sum _k \beta _k (a_k^\dagger + a_k) +\sum _k \beta _k^2/\omega _k\), but we will take a much simpler version for our example, as given in Eq. (1).

  2.

    The parameters for the spin boson model of Eq. (1) are \(\epsilon =0.5\), \(\omega =0.1\), and \(\beta =0.6\). The oscillator was cut off and altogether \(250\) states were considered. This led to an error in the commutator of a and \(a^\dagger \) in the \(250^\mathrm{th}\) diagonal term, but not elsewhere. To reduce the cut off effect, only states with relatively small probability in the highest levels were considered.

  3.

    Neither the non-decay nor the decay is perfect. For the case shown, the probability of decay for the “non-decay” state is about \(6.1\times 10^{-4}\), while the probability of non-decay of the “decay” state is about \(8.1\times 10^{-5}\). I comment on this near the end of Sect. 3.

  4.

    For the Hamiltonian of Eq. (1) there is a constant of the motion, conventionally called “parity.” For the single oscillator this is parity\(\,\equiv \Pi =(-1)^{a^\dagger a}\sigma _z\), and \([H,\Pi ]=0\). (See Appendix 1 for notation in the following discussion.) Our “B” also commutes with \(\Pi \) because the projections involved (P and Q) themselves commute with \(\Pi \), since they are functions of the operator \(\sigma _z\). Hence the eigenstates of B also can be sorted by their parity. In general, if the projections do not commute with particular symmetries of the Hamiltonian there would be no need for B and H to have a common spectrum.

  5.

    Note also that this enforces a degree of determinism that may elicit extreme discomfort. For example, my special state for the spin-boson system put into the cat’s chamber is special for time 0.15, not for another time. It is coordinated with the fact that the observer actually opens the chamber at that time. This observer may think the opening time is arbitrary, but it is already built into the state of the universe that the chamber will be opened at that time. For some interpretations of the concept of free will this would deny that possibility (but there are many interpretations). This level of predictability and determinism is natural with the two-time boundary conditions that are introduced in the next section, but for some, the conclusions may be shocking.

  6.

    Proofs of ergodicity in the realm of mathematics of course do exist, but they are of little relevance for the uses to which physicists put this hypothesis. The dynamics used in those demonstrations is artificial and even more important, the time scales for true multiparticle systems are enormous, well beyond the lifetime of the universe.

  7.

    Finding the average deviation from maximum entropy is straightforward for given parameters (number of grains, number of points), although the calculation involves non-uniform asymptotics. See [1], p. 40 and Exercise 2.2.2 (but beware of typos).

  8.

    Thomas Gold introduced the idea that the thermodynamic arrow of time is a consequence of the expanding universe [8]. Some reviled this idea claiming the absurdity of an opposite arrow if our universe has a big crunch in its future. Even some respected scientists failed to appreciate how a two-time boundary value problem deals with this issue. See also [12].

  9.

    Of course there is entanglement every time a bound state forms. The assumption is that this is also accomplished by means of special states. The electron and proton (say) definitely come together or definitely do not. As usual this can only be accomplished with the aid of other degrees of freedom.

  10.

    The ideas of determinism (see footnote 5) and two-time boundary conditions can have implications outside of the quantum issues we have devoted this article to. As is well-known, the evidence for the existence of black holes, while convincing (e.g., [13]), does not exclude the possibility that there are compact objects, highly dense, but that they simply have not passed to the black hole stage. Why not? No reason has been given and it would seem to be highly contrived to assume that peculiar dynamical features have just managed to prevent black hole formation under accepted notions of general relativity. However, once you have black holes in the universe other problems arise, information issues and firewalls [14]. Given the speculative nature of the boundary conditions I have proposed (“no entanglement at the beginning, no entanglement at the end”) one can continue the speculation and demand that the boundary conditions also require that no black holes form, despite their dynamical possibility (just as I am claiming that macroscopic superpositions, Schrödinger cats, do not form, despite their dynamical possibility). This may also have bearing on the question that Wald [5] poses: why did the universe not begin in a maximum entropy state, i.e., a black hole? On the other hand, my proposal can be viewed simply as a rephrasing of his question: why have these boundary conditions? It is questions like these that suggest humility for homo sapiens.

  11.

    In the companion article [10] we discuss this issue more thoroughly.

  12.

    The fact that \([H,J_z]=0\) guarantees that the expectation of the operator does not change. In MWI this allows the two versions of the observer to balance \(\langle |J_z|\rangle \), but this is not possible in the special state theory—there is only one observer at the end of the experiment.

  13.

    I mention here initial conditions, but the conditions could be specified at any time, since the dynamics, both for the cat and for quantum mechanics, are completely deterministic.

  14.

    It is also true that the symmetry of h about 0 (following from that of g) implies that for small \(\theta \), \(h(\theta )=\hbox {O}(\theta ^2)\), but we will not use this.

  15.

    “Small kicks” would need to be separated by the physical correlation time in order to be considered independent.

  16.

    There is little loss of generality in using this form for \(f(\theta )\). For all cases, we expect a to be small, so that the tail of the distribution function has the correct properties. For \(\ell <3/2\) the asymptotic behavior in \(\theta \) is \(f\sim 1/\theta ^{2\ell }\) so that the only difference from the Lévy distribution is the small \(\theta \) behavior. (For \(\ell =1\) this is precisely the Cauchy distribution.) For \(\ell \ge 3/2\) one can use the central limit theorem and one goes back to the earlier discussion in the text. The only case not considered is large a and \(\ell \ge 3/2\). Numerical investigation of this case found a minimum in size for \(s(\theta )\) at \(a=1\), for a variety of \(\ell \) values. This minimum was about \(e^{-4}\) in sharp contrast to the much smaller values for the Cauchy distribution using small a.

  17.

    To prepare the atoms in the state shown in Eq. (4) one could use half the output of another Stern-Gerlach setup oriented at an appropriate angle. Something like this was done by Stern himself in the early 1930’s. The problem he and his collaborators came up against was that if there was a non-zero magnetic field along the entire path between the two magnets the spins would follow the field. Stern et al. summarized their work in [15], but it is interesting to read their tribulations in the course of these efforts [16, 17]. The analysis of their work was done analytically by Majorana [18]. I have translated both the Frisch-Segre article (from the German) and the Majorana article (from the Italian) (however, imperfectly) and would be happy to send this to anyone who asks (and I’d be grateful for corrections). The Majorana calculation was a tour de force and was sufficient to allow Frisch and Segre to conclude that theory and experiment were consistent. It is amusing that in the 21\(^\mathrm{st}\) century I could use a computer to take into account effects that Majorana neglected and found even better agreement of theory and experiment. My translations and computations are unpublished.
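
The cutoff effect mentioned in footnote 2 can be illustrated with a minimal sketch. It assumes the standard number-basis matrix for the annihilation operator, truncated to 250 levels; the function names are hypothetical and NumPy is used only for illustration, so this is not the code behind the reported numbers.

```python
import numpy as np


def truncated_annihilation(n_states: int) -> np.ndarray:
    """Annihilation operator a in the number basis, cut off at n_states levels.

    Standard construction: <n-1| a |n> = sqrt(n), for n = 1 .. n_states-1.
    """
    return np.diag(np.sqrt(np.arange(1, n_states)), k=1)


def commutator_diagonal(n_states: int) -> np.ndarray:
    """Diagonal of [a, a^dagger] for the truncated oscillator."""
    a = truncated_annihilation(n_states)
    adag = a.T  # a is real, so the adjoint is just the transpose
    comm = a @ adag - adag @ a
    # The commutator is diagonal; all off-diagonal entries vanish.
    assert np.allclose(comm, np.diag(np.diag(comm)))
    return np.diag(comm)


N_STATES = 250  # number of oscillator states quoted in footnote 2
diag = commutator_diagonal(N_STATES)
print("first entries:", diag[:3])
print("last entry:", diag[-1])
```

Every diagonal entry equals 1, as for the untruncated oscillator, except the last, which is \(-(N-1)=-249\) up to rounding. This is why states with appreciable weight in the highest levels are best avoided.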



I am grateful to Predrag Cvitanovic, Marcos da Luz, Bernard Gaveau, Amos Ori and Shina Tan for helpful discussions. Open access funding has been provided by the Max Planck Society.


  1. Schulman, L.S.: Time’s Arrows and Quantum Measurement. Cambridge University Press, New York (1997)
  2. Schulman, L.S.: Special states in the Spin-Boson model. J. Stat. Phys. 77, 931–944 (1994)
  3. Gaveau, B., Schulman, L.S.: Is ergodicity a reasonable hypothesis? Eur. Phys. J. Spec. Top. 224, 891–904 (2015)
  4. Arnol’d, V.I., Avez, A.: Ergodic Problems of Classical Mechanics. Benjamin, New York (1968)
  5. Wald, R.M.: The arrow of time and the initial conditions of the universe. arXiv:gr-qc/0507094v1 (2005)
  6. Khoury, J., Steinhardt, P.J., Turok, N.: Designing cyclic universe models. Phys. Rev. Lett. 92, 031302 (2004)
  7. Steinhardt, P.J., Turok, N.: Cosmic evolution in a cyclic universe. Phys. Rev. D. 65, 126003 (2002)
  8. Gold, T.: The arrow of time. Am. J. Phys. 30, 403–410 (1962)
  9. Schulman, L.S.: Evolution of wave-packet spread under sequential scattering of particles of unequal mass. Phys. Rev. Lett. 92, 210404 (2004)
  10. Schulman, L.S., da Luz, M.G.E.: Looking for the source of change. Found. Phys. (2016). doi: 10.1007/s10701-016-0031-x
  11. Hille, E.: Analytic Function Theory. Ginn and Company, New York (1959)
  12. Schulman, L.S.: Models for intermediate time dynamics with two-time boundary conditions. Phys. A. 177, 373–380 (1991)
  13. Broderick, A.E., Narayan, R., Kormendy, J., Perlman, E.S., Rieke, M.J., Doeleman, S.S.: The event horizon of M87. Astrophys. J. 805, 179 (2015). doi: 10.1088/0004-637X/805/2/179
  14. Almheiri, A., Marolf, D., Polchinski, J., Sully, J.: Black holes: complementarity or firewalls? J. High. Energ. Phys. 02, 062 (2013)
  15. Frisch, R., Phipps, T.E., Segre, E., Stern, O.: Process of space quantisation. Nature. 130, 892–893 (1932)
  16. Phipps, T.E., Stern, O.: Über die Einstellung der Richtungsquantelung. Z. Phys. 73, 185–191 (1932)
  17. Frisch, R., Segre, E.: On the process of space quantization. II. Z. Phys. 80, 610–616 (1933)
  18. Majorana, E.: Atoms in an oriented, variable magnetic field. Nuov. Cim. 2, 43–50 (1932)

Copyright information

© The Author(s) 2016

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
  2. Physics Department, Clarkson University, Potsdam, USA
